An open-source security testing framework called Pentest Agent Suite has shipped 50 specialized security agents, 26 slash commands, 19 CLI tools, and a cross-IDE installer covering seven major AI coding platforms. The project, published on GitHub by researcher H-mmer, targets autonomous vulnerability discovery inside agentic coding workflows.

Platform Coverage and Installation

The framework generates native configuration formats for Claude Code, OpenAI Codex, Google Gemini, Cursor, Windsurf, VS Code Copilot, and OpenClaw, according to Cyber Security News. A Python installer (python3 -m tools.installer) writes the appropriate files to each IDE’s directory structure. Platforms without native subagent support (Cursor, Windsurf, and OpenClaw) receive translated skill files and rules with Claude-specific prose stripped and path variables rewritten to absolute references.
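The translation step for platforms without native subagents can be pictured with a short sketch. This is not the project's installer code: the `${SUITE_ROOT}` placeholder, the function names, and the rule of dropping any line that mentions "Claude" are all assumptions made for illustration.

```python
from pathlib import Path

# Hypothetical placeholder; the real installer's path variable names
# are not documented in this article.
PLACEHOLDER = "${SUITE_ROOT}"


def rewrite_paths(text: str, root: Path) -> str:
    """Replace the placeholder with an absolute path to the install root."""
    return text.replace(PLACEHOLDER, str(root.resolve()))


def translate_skill(text: str, root: Path, markers: tuple = ("Claude",)) -> str:
    """Drop lines mentioning a platform-specific marker, then rewrite paths.

    Line-level filtering is a simplification chosen for this sketch.
    """
    kept = [
        line for line in text.splitlines()
        if not any(marker in line for marker in markers)
    ]
    return rewrite_paths("\n".join(kept), root)
```

Calling `translate_skill(skill_text, Path("/opt/suite"))` would return the skill text with platform-specific lines removed and every placeholder expanded to an absolute path.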

The 7-Question Validation Gate

Every finding passes through a validation pipeline before it can reach submission. The validator agent runs a 7-question gate on each discovery. The first “NO” answer triggers an automatic KILL, DOWNGRADE, or CHAIN REQUIRED verdict. No finding can reach the /submit command without a /validate PASS and a /quality score of 7 or higher, enforced by hard gates in the reporting pipeline.
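A first-"NO"-wins gate of this kind is simple to model. The sketch below is a guess at the shape of the logic, not the validator's actual code: the seven questions are placeholders (the article does not list them), and the mapping from each question to a verdict is invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    PASS = "PASS"
    KILL = "KILL"
    DOWNGRADE = "DOWNGRADE"
    CHAIN_REQUIRED = "CHAIN REQUIRED"


@dataclass(frozen=True)
class Question:
    text: str
    on_no: Verdict  # verdict issued when this question is answered "NO"


# Placeholder questions and verdict mapping; both are assumptions.
_ON_NO = [
    Verdict.KILL, Verdict.KILL, Verdict.KILL, Verdict.DOWNGRADE,
    Verdict.DOWNGRADE, Verdict.CHAIN_REQUIRED, Verdict.CHAIN_REQUIRED,
]
GATE = [Question(f"placeholder question {i}", v) for i, v in enumerate(_ON_NO, 1)]


def run_gate(answers: list) -> Verdict:
    """Return the verdict of the first "NO" answer, or PASS if all are "YES"."""
    if len(answers) != len(GATE):
        raise ValueError(f"expected {len(GATE)} answers, got {len(answers)}")
    for question, answer in zip(GATE, answers):
        if not answer:
            return question.on_no
    return Verdict.PASS


def can_submit(verdict: Verdict, quality: int) -> bool:
    """Hard gate: a PASS verdict and a quality score of 7 or higher."""
    return verdict is Verdict.PASS and quality >= 7
```

Under these assumptions, a finding that answers "YES" to all seven questions but scores 6 on quality still cannot be submitted, which matches the two-gate rule described above.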

50 Agents Across Five Tracks

The agent roster spans 19 HackerOne weakness specialists (covering XSS, SQLi, SSRF, RCE, OAuth, and LLM/AI attack patterns), an 8-agent SAST pipeline, infrastructure and recon agents (cloud-recon, JS-analyzer, GraphQL-audit, WAF-profiler), and a web3-auditor for Solidity and DeFi patterns, as reported by Cyber Security News. Five deep methodology skills accompany the hunters, each distilled from hundreds of real paid bug bounty reports.

Bug Bounty Platform Integration

A dual-server MCP (Model Context Protocol) infrastructure connects the framework to live bug bounty programs. The bounty-platforms MCP server integrates 16 platforms including HackerOne (full API), Bugcrowd, Intigriti, Immunefi, and YesWeHack, exposing tools for listing platforms, syncing program scope, drafting reports, and submitting findings. A separate writeup-search MCP server provides FAISS semantic search across a bundled payload library of 2,605 lines covering XSS, SSRF, SQLi, IDOR, OAuth, SSTI, JWT, LFI, prototype pollution, NoSQLi, and DeFi attack patterns.

Operational Controls

The framework includes a persistent memory system (brain.py) that tracks every endpoint per target, enforces circuit-breaker logic (five consecutive 403/429 responses trigger a 60-second auto-backoff), and syncs cross-engagement knowledge via incremental hash-based diffing. Cost tracking fires on every subagent stop event, logging agent name and session cost to a JSON file. A scope hook matches every Bash command against a YAML configuration, blocking out-of-scope execution before the tool call fires.
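Two of these controls, the circuit breaker and the scope hook, lend themselves to a compact sketch. Only the five-response threshold, the 403/429 status codes, and the 60-second backoff come from the description above; the class and function names, the `in_scope` key, and the choice to inspect only URL arguments are assumptions. The scope is shown as an already-parsed dict rather than a YAML file to keep the example free of third-party dependencies.

```python
import shlex
import time
from urllib.parse import urlparse


class CircuitBreaker:
    """Pause requests after N consecutive 403/429 responses."""

    def __init__(self, threshold=5, backoff_seconds=60.0, clock=time.monotonic):
        self.threshold = threshold
        self.backoff_seconds = backoff_seconds
        self.clock = clock  # injectable so the logic can be tested without sleeping
        self.consecutive_blocks = 0
        self.open_until = float("-inf")

    def record(self, status: int) -> None:
        """Track a response; trip the breaker when the streak hits the threshold."""
        if status in (403, 429):
            self.consecutive_blocks += 1
            if self.consecutive_blocks >= self.threshold:
                self.open_until = self.clock() + self.backoff_seconds
                self.consecutive_blocks = 0
        else:
            self.consecutive_blocks = 0  # any other status breaks the streak

    def allow_request(self) -> bool:
        return self.clock() >= self.open_until


def command_in_scope(command: str, scope: dict) -> bool:
    """Return False if the command contains a URL whose host is out of scope.

    Simplification: only URL-shaped arguments are checked.
    """
    allowed = scope.get("in_scope", [])  # hypothetical key name
    for token in shlex.split(command):
        if "://" not in token:
            continue
        host = urlparse(token).hostname
        if host and not any(host == d or host.endswith("." + d) for d in allowed):
            return False
    return True
```

A hook built along these lines would call `command_in_scope` before the tool call runs and refuse the command on `False`, while the HTTP layer would check `allow_request()` before each request and feed every status code to `record()`.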

Security Testing at the Agent Layer

The release represents one of the first production-scale attempts to bring autonomous security testing directly into the agentic coding workflow. Instead of running security scans as a separate step, engineering teams can embed vulnerability discovery inside the same AI coding tools they already use for development. The framework requires Python 3.10+ and standard reconnaissance tooling including nmap, httpx, subfinder, nuclei, ffuf, katana, and sqlmap. All destructive operations are gated behind an explicit --execute flag.
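An opt-in flag like --execute is a common dry-run-by-default pattern. The sketch below shows how such a gate might look using Python's standard argparse module. Only the flag name comes from the article; the tool name, the positional `target` argument, and the output strings are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # "example-tool" and "target" are illustrative names, not the suite's CLI.
    parser = argparse.ArgumentParser(prog="example-tool")
    parser.add_argument("target", help="host or URL to act on")
    parser.add_argument(
        "--execute",
        action="store_true",
        help="actually perform the destructive action (default: dry run)",
    )
    return parser


def run(argv: list) -> str:
    """Describe the action unless --execute was passed explicitly."""
    args = build_parser().parse_args(argv)
    if not args.execute:
        return f"DRY RUN: would act on {args.target}"
    return f"EXECUTING against {args.target}"
```

Because `store_true` defaults to `False`, forgetting the flag always produces the safe behavior.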