Notes from the lab.
Offensive security research and exploitation.
-
Fixing the Linux Kernel: Upstream Memory-Safety Contributions
The same automated research pipeline that audits npm packages now reads kernel C — and has landed memory-safety fixes in the mainline Linux kernel, each maintainer-reviewed and shipping to stable. Here's what merged, and how.
-
Switzerland Has No Doorbell
Nine in ten Swiss websites publish no security.txt — no way for a researcher to report a vulnerability. We checked every .ch domain in two public lists. Here's the data, and the ten-minute fix.
-
Agentic Pentesting: The Shell-First Architecture
Shell-first design as a discovered architecture for autonomous pentesting agents. One bash tool outperforms a structured toolkit across public benchmarks, the AI/LLM security suite, and adjacent domains.
-
The Triage Moat and Multi-Benchmark Validation
Ablation testing as scientific methodology: an 11-layer false-positive triage stack, the one broken layer that almost masked the rest, and the multi-benchmark portfolio that surfaces what a single suite would miss.
-
Web Vulnerability Benchmarks: Methodology and Environment Verification
Over 37% of standard web vulnerability challenges can fail to build on clean systems due to Docker image and repository rot. Understanding substrate patching is key to evaluating agent performance accurately.
-
Orchestration, Not Frontier: What the IronCurtain Post Means for 0sec
Niels Provos shipped a vulnerability-discovery framework that replicates Mythos-class findings on commercial models — and one autonomous CVE on an open-weight model. It is the same bet 0sec is built on. Here is what we already do, what we need to borrow, and the four gaps we are closing.
-
Deleting better-sqlite3, and What It Cost
An engineering note from building 0sec's engine: the persistence layer was migrated from better-sqlite3 to a pure-WASM SQLite implementation. Here's what broke, what was kept, and why dropping the native module made the engine run identically on every Node.js version.
-
Introducing 0cloud
An autonomous AI attacker on contract, pointed at your product. Closed beta, by application only. Founder-led from Zürich.
-
The Attack Surface Traditional Benchmarks Don't Test
Traditional web vulnerability benchmarks miss the entire AI/LLM security attack surface. Prompt injection, jailbreaks, MCP tool abuse — none of it appears in standard 104-challenge web security suites.
-
Blind Verification: How False Positives Get Killed
Every security scanner drowns its users in false positives. Fixing that took three architectural attempts before one of them worked.
-
How AI Agents Found Vulnerabilities in Popular npm Packages
A three-week methodology validation: Claude Opus, applied systematically to popular npm packages, surfaced 73 findings, with vulnerabilities disclosed across packages with 55M+ weekly downloads. This is how the workflow operates.
-
The Age of Agentic Security
If AI agents can write 1,000 pull requests a week, AI agents should be testing 1,000 pull requests a week. The asymmetry is about to collapse.