On March 26, 2026, FreeBSD patched CVE-2026-4747, a stack buffer overflow in its RPCSEC_GSS authentication module, so systems that have applied the fix are no longer exposed. FreeBSD’s security advisory credited “Nicholas Carlini using Claude, Anthropic” for finding it. What happened next is the story.

On March 29, the security team at calif.io handed Claude Code the public advisory and asked it to build a working exploit. Carlini stepped away. About eight hours of wall-clock time later — four hours of actual Claude working time — he came back to two functioning exploits. Both worked on the first try. Both deliver a remote root shell.

What Claude Solved Without Human Help

The technical bar for this kind of exploit is high. Going from a vulnerability advisory to a working kernel-level remote code execution requires more than pattern matching — it requires understanding OS internals, debugging kernel crashes, and adapting when things break.

According to calif.io’s write-up, Claude solved six distinct problems autonomously:

  1. Lab setup — Stood up a FreeBSD VM with NFS and Kerberos, configured the vulnerable module, and set up remote debugging to read kernel crash dumps.
  2. Multi-packet shellcode delivery — Devised a 15-round strategy to write shellcode 32 bytes at a time across 14 packets, then make kernel memory executable on the final round.
  3. Clean thread exit — Used kthread_exit() to terminate each hijacked NFS kernel thread cleanly so the server stayed alive for the next round.
  4. Offset correction — Initial stack offsets from disassembly were wrong. Claude sent De Bruijn patterns, read the crash dumps, and corrected them.
  5. Kernel-to-userland transition — NFS kernel threads can’t run userland programs. Claude created a new process via kproc_create(), replaced it with /bin/sh via kern_execve(), and cleared the P_KPROC flag to enable the transition.
  6. Debug register bug — The child process kept crashing with debug exceptions. Claude traced this to stale debug registers inherited from DDB and cleared DR7 before forking.
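
The offset-correction step in item 4 relies on a general debugging technique: fill a buffer with a De Bruijn sequence, in which every short window of characters occurs exactly once, then look up whatever bytes appear in the crash dump to learn how far into the buffer they sat. The sketch below is a generic illustration of that lookup, not the tooling Claude or calif.io used; the function names (de_bruijn, offset_of, offset_from_register) and the lowercase alphabet are assumptions made for this example.

```python
import string


def de_bruijn(alphabet: str = string.ascii_lowercase, n: int = 4) -> str:
    """Build a De Bruijn sequence B(k, n) using the standard Lyndon-word recursion.

    Every length-n window of the result is unique, so a fragment recovered
    from a crash dump identifies exactly one position in the pattern.
    """
    k = len(alphabet)
    a = [0] * (k * n)
    out: list[int] = []

    def db(t: int, p: int) -> None:
        if t > n:
            if n % p == 0:
                out.extend(a[1:p + 1])
            return
        a[t] = a[t - p]
        db(t + 1, p)
        for j in range(a[t - p] + 1, k):
            a[t] = j
            db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in out)


def offset_of(pattern: str, fragment: str) -> int:
    """Return the index of `fragment` in `pattern`, or -1 if it is absent."""
    return pattern.find(fragment)


def offset_from_register(pattern: str, value: int, width: int = 8) -> int:
    """Look up a register value taken from a crash dump.

    Assumes a little-endian machine, so the lowest byte of the register is
    the first pattern character that was read from memory.
    """
    fragment = value.to_bytes(width, "little").decode("ascii", errors="replace")
    return offset_of(pattern, fragment)
```

With the default 26-letter alphabet and n = 4, the pattern is 26^4 = 456,976 characters long, far more than a typical buffer needs, so in practice only a prefix is sent. Any fragment of four or more characters still maps back to a single offset.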

The exploit was produced autonomously, and the code is on GitHub. As Winbuzzer noted, Carlini has since used the same Claude-powered pipeline to find 500 validated high-severity vulnerabilities across multiple codebases.

What This Means for Agent Builders

The security angle is real but contained — CVE-2026-4747 is patched, and the FreeBSD advisory is specific to 14.x NFS servers with Kerberos. The more significant signal is the autonomous task completion.

Exploit development has historically required a skilled human operator guiding the process at every step: debugging, adapting strategies, reading kernel memory, rethinking when approaches fail. The assumption was that AI could find bugs but couldn’t close the loop to a working exploit without hand-holding.

That assumption is now obsolete for at least one class of problem.

For teams building autonomous coding agents, this is a capability reference point. Claude Code, given a task and a working environment, sustained a four-hour agentic work session across six interdependent technical sub-problems, recovered from failures without human input, and delivered a functional result. Security is simply the domain in which this was demonstrated. The underlying capability, long-horizon autonomous technical problem-solving, is relevant wherever coding agents are deployed.