Notes from the lab.
Offensive security research and exploitation.
-
Switzerland Has No Doorbell
Nine in ten Swiss websites publish no security.txt — no way for a researcher to report a vulnerability. We checked every .ch domain in two public lists. Here's the data, and the ten-minute fix.
-
Agentic Pentesting on XBOW: Shell-First Architecture
Shell-first design as a discovered architecture for autonomous pentesting agents. One bash tool outperforms a structured toolkit across the XBOW benchmark, the AI/LLM security suite, and adjacent domains.
-
The Triage Moat and Multi-Benchmark Validation
Ablation testing as scientific methodology: an 11-layer false-positive triage stack, the one broken layer that almost masked the rest, and the multi-benchmark portfolio that surfaces what a single suite would miss.
-
The XBOW Benchmark Methodology and Verification
39 of XBOW's 104 challenges will not build on a clean machine because the Docker images and apt repos they pin have rotted out from under them. Every published AI-pentest score on the internet today lives on a patched substrate. Here is what 'we scored 96% on XBOW' actually means.
-
Orchestration, Not Frontier: What the IronCurtain Post Means for 0sec
Niels Provos shipped a vulnerability-discovery framework that replicates Mythos-class findings on commercial models — and one autonomous CVE on an open-weight model. It is the same bet 0sec is built on. Here is what we already do, what we need to borrow, and the four gaps we are closing.
-
Deleting better-sqlite3, and What It Cost
An engineering note from building 0sec's engine: the persistence layer was migrated from better-sqlite3 to a pure-WASM SQLite implementation. Here's what broke, what was kept, and why dropping the native module made the engine run identically on every Node.js version.
-
Introducing 0cloud
An autonomous AI attacker on contract, pointed at your product. Closed beta, by application only. Founder-led from Zürich.
-
The Attack Surface XBOW and KinoSec Don't Test
Traditional web vulnerability benchmarks miss the entire AI/LLM security attack surface. Prompt injection, jailbreaks, MCP tool abuse — none of it appears in XBOW's 104 challenges.
-
Blind Verification: How False Positives Get Killed
Every security scanner drowns its users in false positives. Fixing that took three architectural attempts before one of them worked.
-
How AI Agents Found Vulnerabilities in Popular npm Packages
A three-week methodology validation: Claude Opus, applied systematically to popular npm packages, surfaced 73 findings and led to vulnerability disclosures across packages with 55M+ weekly downloads. This is how the workflow operates.
-
The Age of Agentic Security
If AI agents can write 1,000 pull requests a week, AI agents should be testing 1,000 pull requests a week. The asymmetry is about to collapse.