Chudi Nnorukam

Originally published at chudi.dev

Semi-Autonomous Bug Bounty System


Reconnaissance takes forever. You spend hours on subdomain enumeration, tech stack fingerprinting, and endpoint discovery—only to find the same vulnerabilities you've tested before.

I built BugBountyBot to automate the tedious 80% while keeping humans in the loop for the decisions that matter.

The Problem with Manual Hunting

Traditional bug bounty hunting breaks down like this:

Phase          | Time Spent | Value Added
Reconnaissance | 40%        | Low (repetitive)
Testing        | 30%        | Medium (pattern-based)
Validation     | 15%        | High (requires judgment)
Reporting      | 15%        | High (requires clarity)

Reconnaissance and testing together take about 70% of a hunter's time, and most of that work could be automated. The high-value phases, validation and reporting, get squeezed because you're exhausted from the grind.

The Multi-Agent Architecture

BugBountyBot uses four specialized agents, each optimized for its phase: Recon, Testing, Validator, and Reporter.

Why Four Agents Instead of One?

A single agent trying to do everything suffers from context dilution. The prompt space needed for effective reconnaissance is completely different from vulnerability testing.

Specialized agents can:

  • Use phase-specific prompts without compromise
  • Maintain focused context windows
  • Be tuned independently based on performance
  • Fail in isolation without breaking the pipeline
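
To make the last point concrete, here is a minimal sketch of a pipeline runner that gives each phase its own error boundary. It is an illustration, not BugBountyBot's actual code: `PhaseAgent`, `PhaseResult`, and `runPipeline` are hypothetical names, and agents are modelled as synchronous functions over string lists to keep it short.

```typescript
// Hypothetical sketch: every name here is illustrative, not BugBountyBot's API.
type PhaseName = 'recon' | 'testing' | 'validation' | 'reporting';

interface PhaseAgent {
  phase: PhaseName;
  // Takes the previous phase's output and returns its own.
  run(input: string[]): string[];
}

interface PhaseResult {
  phase: PhaseName;
  ok: boolean;
  output: string[];
  error?: string;
}

// Runs agents in order. A failing agent is recorded as a failed result and
// the run stops at that phase, keeping everything earlier phases produced.
function runPipeline(agents: PhaseAgent[], seed: string[]): PhaseResult[] {
  const results: PhaseResult[] = [];
  let input = seed;
  for (const agent of agents) {
    try {
      const output = agent.run(input);
      results.push({ phase: agent.phase, ok: true, output });
      input = output;
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      results.push({ phase: agent.phase, ok: false, output: [], error: message });
      break; // later phases depend on this output, so stop here
    }
  }
  return results;
}
```

The caller gets a per-phase record instead of an exception, so a crash in one agent leaves the earlier output intact and the failed phase can be retried on its own.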

Evidence-Gated Progression

The biggest risk in automated hunting is false positives. Submit garbage, and your reputation tanks. Platforms flag your account. Programs stop accepting your reports.

BugBountyBot uses a 0.85 confidence threshold before any finding advances:

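// Placeholder types, assumed for illustration; the real definitions are richer.
type VulnerabilityType = string;
interface Evidence {
  description: string;
}

interface Finding {
  vulnerability: VulnerabilityType;
  evidence: Evidence[];
  confidence: number; // 0.0 - 1.0
  status: 'pending' | 'validated' | 'rejected';
}

const CONFIDENCE_THRESHOLD = 0.85;

function shouldAdvance(finding: Finding): boolean {
  // Only findings at or above the threshold advance to human review
  return finding.confidence >= CONFIDENCE_THRESHOLD;
}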

Findings below 0.85 aren't discarded—they're logged with full context for the RAG database. The system learns why they failed validation, preventing similar false positives in future hunts.

What Builds Confidence?

The Validator Agent runs multiple checks:

  1. PoC Execution - Does the exploit actually work?
  2. Response Diff Analysis - Is the behavior change meaningful?
  3. False Positive Signatures - Does this match known FP patterns?
  4. Evidence Hashing - Is the evidence reproducible?

Each check contributes to the confidence score. Only when all checks align does a finding hit the 0.85 threshold.
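
One way to combine the four checks into a single score is a weighted sum. The sketch below is an assumption about how that could look, not the real scoring code: the weights are invented for illustration, as are the `CheckResults` shape and the `scoreConfidence` name. The weights are chosen so that a finding that fails any one check outright tops out at 0.8, below the 0.85 threshold.

```typescript
// Hypothetical sketch: weights and names are assumptions, not the values
// BugBountyBot actually uses.
interface CheckResults {
  pocExecution: number;          // 0.0 - 1.0: did the exploit work?
  responseDiff: number;          // 0.0 - 1.0: is the behavior change meaningful?
  notKnownFalsePositive: number; // 0.0 - 1.0: 1 means no FP signature matched
  evidenceReproducible: number;  // 0.0 - 1.0: does the evidence reproduce?
}

// Weights sum to 1.0; the smallest is 0.2, so a zero on any check caps the
// total at 0.8.
const CHECK_WEIGHTS: CheckResults = {
  pocExecution: 0.4,
  responseDiff: 0.2,
  notKnownFalsePositive: 0.2,
  evidenceReproducible: 0.2,
};

function clamp01(x: number): number {
  return Math.min(1, Math.max(0, x));
}

// Weighted sum of the individual check scores, clamped to [0, 1].
function scoreConfidence(checks: CheckResults): number {
  const keys = Object.keys(CHECK_WEIGHTS) as (keyof CheckResults)[];
  let total = 0;
  for (const key of keys) {
    total += CHECK_WEIGHTS[key] * clamp01(checks[key]);
  }
  return clamp01(total);
}
```

A weighted sum keeps the score easy to explain during review: you can see which check pulled a finding below the line.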

The RAG Database

SQLite stores everything the system learns:

-- Knowledge that improves over time
CREATE TABLE knowledge_base (
  pattern TEXT,           -- What worked
  context TEXT,           -- Where it worked
  success_rate REAL,      -- How often it works
  last_used TIMESTAMP
);

CREATE TABLE failure_patterns (
  approach TEXT,          -- What failed
  reason TEXT,            -- Why it failed
  program_id TEXT,        -- Program-specific context
  created_at TIMESTAMP
);

CREATE TABLE false_positive_signatures (
  signature TEXT,         -- What to avoid
  occurrences INTEGER,    -- How often we see it
  last_seen TIMESTAMP
);

Every hunt session adds knowledge:

  • Successful patterns get reinforced
  • Failures get logged with reasons
  • False positives become signatures to filter

After 50 hunts, the system knows which approaches work on which program types. It stops repeating mistakes that wasted your time six months ago.
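
The reinforcement step can be sketched as a running success rate per pattern and context. This in-memory version stands in for the SQLite `knowledge_base` table. The `KnowledgeStore` class, its `attempts` counter, and the method names are assumptions for illustration, not the project's schema or API.

```typescript
// Hypothetical sketch: an in-memory stand-in for the knowledge_base table.
interface KnowledgeEntry {
  pattern: string;
  context: string;
  successRate: number; // mirrors success_rate
  attempts: number;    // assumed extra counter, needed to update the average
  lastUsed: Date;      // mirrors last_used
}

class KnowledgeStore {
  private readonly entries = new Map<string, KnowledgeEntry>();

  // Records one outcome and returns the updated entry.
  record(pattern: string, context: string, succeeded: boolean, now: Date = new Date()): KnowledgeEntry {
    const key = `${pattern}::${context}`;
    const prev = this.entries.get(key);
    const prevAttempts = prev ? prev.attempts : 0;
    // Recover the success count from the stored rate; round to undo float drift.
    const prevSuccesses = prev ? Math.round(prev.successRate * prev.attempts) : 0;
    const attempts = prevAttempts + 1;
    const successRate = (prevSuccesses + (succeeded ? 1 : 0)) / attempts;
    const entry: KnowledgeEntry = { pattern, context, successRate, attempts, lastUsed: now };
    this.entries.set(key, entry);
    return entry;
  }

  // Patterns tried in a given context, highest success rate first.
  rank(context: string): KnowledgeEntry[] {
    return Array.from(this.entries.values())
      .filter((e) => e.context === context)
      .sort((a, b) => b.successRate - a.successRate);
  }
}
```

Before a hunt, something like `rank('rest-api')` would tell the Testing agent which approaches to try first on that kind of target.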

Safety Mechanisms

Automated hunting without safety is a fast path to bans. BugBountyBot includes:

Rate Limiting

Each target gets its own token bucket with a configurable burst size and refill rate. The system slows down automatically as it approaches the limit.
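
For readers who haven't met the algorithm, here is a minimal token bucket sketch. It is not BugBountyBot's implementation: the `TokenBucket` class, its method names, the `bucketFor` helper, and the default values are all assumptions. The clock is injectable so the behavior can be tested without waiting.

```typescript
// Hypothetical sketch of a per-target token bucket.
class TokenBucket {
  private readonly burstSize: number;
  private readonly refillPerSecond: number;
  private readonly now: () => number;
  private tokens: number;
  private lastRefillMs: number;

  constructor(burstSize: number, refillPerSecond: number, now: () => number = () => Date.now()) {
    this.burstSize = burstSize;
    this.refillPerSecond = refillPerSecond;
    this.now = now;
    this.tokens = burstSize; // start full so an initial burst is allowed
    this.lastRefillMs = now();
  }

  // Adds tokens for the time elapsed since the last call, capped at burstSize.
  private refill(): void {
    const nowMs = this.now();
    const elapsedSec = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.burstSize, this.tokens + elapsedSec * this.refillPerSecond);
    this.lastRefillMs = nowMs;
  }

  // Returns true and consumes a token if a request may be sent now.
  tryRemove(): boolean {
    this.refill();
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }

  // Milliseconds until the next token is available (0 if one is ready).
  msUntilNextToken(): number {
    this.refill();
    if (this.tokens >= 1) return 0;
    return Math.ceil(((1 - this.tokens) / this.refillPerSecond) * 1000);
  }
}

// One bucket per target host; the defaults here are illustrative only.
const bucketsByHost = new Map<string, TokenBucket>();

function bucketFor(host: string, burstSize: number = 5, refillPerSecond: number = 1): TokenBucket {
  let bucket = bucketsByHost.get(host);
  if (!bucket) {
    bucket = new TokenBucket(burstSize, refillPerSecond);
    bucketsByHost.set(host, bucket);
  }
  return bucket;
}
```

A request loop would call `tryRemove()` before each request and sleep for `msUntilNextToken()` when it returns false, which gives the gradual slowdown rather than a hard stop.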

Scope Validation

Every request validates against program scope before execution. Out-of-scope domains are hard-blocked, not just warned.
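
A hard block can be sketched as a guard that throws before any request is built. The `ProgramScope` shape, the wildcard syntax, and the function names below are assumptions for illustration. The sketch works on hostnames, so the caller is expected to extract the hostname from the request URL first.

```typescript
// Hypothetical sketch of strict scope validation.
interface ProgramScope {
  inScope: string[];    // exact hosts or wildcards such as "*.example.com"
  outOfScope: string[]; // same syntax; exclusions always win
}

function hostMatches(host: string, pattern: string): boolean {
  const h = host.toLowerCase();
  const p = pattern.toLowerCase();
  if (p.startsWith('*.')) {
    // "*.example.com" matches subdomains only: not "example.com" itself and
    // not look-alikes such as "evilexample.com".
    const suffix = p.slice(1); // ".example.com"
    return h.length > suffix.length && h.endsWith(suffix);
  }
  return h === p;
}

function isInScope(host: string, scope: ProgramScope): boolean {
  if (scope.outOfScope.some((p) => hostMatches(host, p))) return false;
  return scope.inScope.some((p) => hostMatches(host, p));
}

// Hard block: throws instead of warning, so no request can be sent.
function guardRequest(host: string, scope: ProgramScope): void {
  if (!isInScope(host, scope)) {
    throw new Error(`Blocked out-of-scope host: ${host}`);
  }
}
```

Checking exclusions first matters: programs often list a wildcard as in scope and then carve out specific hosts, and the carve-out has to win.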

Ban Detection

BugBountyBot monitors for consecutive failures, response time changes, and error patterns that indicate blocking, then triggers an automatic cooldown before you get banned.
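
As a rough sketch of how those signals could be combined: count consecutive suspicious responses and ask for a cooldown once the count hits a limit. Everything here is an assumption for illustration, including the `BanDetector` name, treating HTTP 403 and 429 as blocking signals, and the 3x latency factor.

```typescript
// Hypothetical sketch of ban detection from status codes and latency.
interface ResponseSample {
  statusCode: number;
  latencyMs: number;
}

class BanDetector {
  private readonly maxConsecutive: number;
  private readonly latencyFactor: number;
  private consecutiveSuspicious = 0;
  private baselineLatencyMs: number | null = null;

  constructor(maxConsecutive: number, latencyFactor: number = 3) {
    this.maxConsecutive = maxConsecutive;
    this.latencyFactor = latencyFactor;
  }

  // Feeds one response in; returns true when the caller should cool down.
  observe(sample: ResponseSample): boolean {
    const blockedStatus = sample.statusCode === 403 || sample.statusCode === 429;
    const slow =
      this.baselineLatencyMs !== null &&
      sample.latencyMs > this.baselineLatencyMs * this.latencyFactor;

    if (blockedStatus || slow) {
      this.consecutiveSuspicious += 1;
    } else {
      this.consecutiveSuspicious = 0;
      // Only healthy responses update the baseline (exponential moving average).
      this.baselineLatencyMs =
        this.baselineLatencyMs === null
          ? sample.latencyMs
          : 0.8 * this.baselineLatencyMs + 0.2 * sample.latencyMs;
    }
    return this.consecutiveSuspicious >= this.maxConsecutive;
  }
}
```

Requiring several suspicious responses in a row avoids pausing a hunt over a single slow page or a one-off 403 on a protected endpoint.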

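interface SafetyConfig {
  maxRequestsPerMinute: number;          // sustained request rate per target
  burstSize: number;                     // token bucket capacity
  cooldownOnConsecutiveFailures: number; // presumably the failure count that triggers a cooldown
  scopeValidation: 'strict' | 'permissive';
}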

Human-in-the-Loop

The major bug bounty platforms require human oversight for submissions. This isn't a limitation to work around; it's a feature to design for.

BugBountyBot's workflow:

  1. Automated phases (Recon → Testing → Validation) run without intervention
  2. Findings scoring 0.85 or higher queue for human review with full evidence
  3. Human approves specific findings for submission
  4. Reporter Agent formats and submits approved findings

You spend your time reviewing validated findings with evidence, not grinding through reconnaissance. The ratio flips: 20% of your time on tedious work, 80% on high-value decisions.

HackerOne, Intigriti, and Bugcrowd all have Terms of Service that require human oversight for automated tools. Fully autonomous submission isn't just risky—it can get you permanently banned.
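
The approval gate in this workflow can be sketched as a queue whose only exit is an explicit human decision. This is an illustration under assumptions, not the real code: `ReviewQueue`, `QueuedFinding`, and the method names are hypothetical.

```typescript
// Hypothetical sketch of a human approval gate.
type ReviewStatus = 'queued' | 'approved' | 'rejected';

interface QueuedFinding {
  id: string;
  title: string;
  confidence: number;
  status: ReviewStatus;
}

const REVIEW_THRESHOLD = 0.85;

class ReviewQueue {
  private readonly items = new Map<string, QueuedFinding>();

  // Only findings at or above the threshold ever reach the reviewer.
  enqueue(id: string, title: string, confidence: number): boolean {
    if (confidence < REVIEW_THRESHOLD) return false;
    this.items.set(id, { id, title, confidence, status: 'queued' });
    return true;
  }

  // Called from the reviewer's UI or CLI, never by an agent.
  decide(id: string, approved: boolean): void {
    const item = this.items.get(id);
    if (!item) throw new Error(`Unknown finding: ${id}`);
    item.status = approved ? 'approved' : 'rejected';
  }

  // The reporter can only take findings a human approved, and each one
  // is removed as it is taken so it cannot be submitted twice.
  takeApproved(): QueuedFinding[] {
    const approved = Array.from(this.items.values()).filter((f) => f.status === 'approved');
    for (const f of approved) this.items.delete(f.id);
    return approved;
  }
}
```

The point of the design is that the Reporter Agent has no code path that reads `queued` findings, so skipping review would take a code change, not a config flag.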

Checkpoint System

Hunt sessions can span days or weeks. The checkpoint system saves state:

interface Checkpoint {
  sessionId: string;
  phase: 'recon' | 'testing' | 'validation' | 'reporting';
  progress: PhaseProgress;
  findings: Finding[];
  timestamp: Date;
}

Resume any session exactly where you left off. No lost context, no repeated work.
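
Saving and resuming can be sketched as a JSON round trip. The one wrinkle is that a `Date` comes back from JSON as a string and has to be rebuilt. The names below (`CheckpointSketch`, `serializeCheckpoint`, `restoreCheckpoint`, `remainingPhases`) are hypothetical, the `PhaseProgress` shape is an assumption since the post doesn't define it, and findings are reduced to IDs to keep the example short.

```typescript
// Hypothetical sketch of checkpoint persistence.
type Phase = 'recon' | 'testing' | 'validation' | 'reporting';

const PHASE_ORDER: Phase[] = ['recon', 'testing', 'validation', 'reporting'];

// Assumed shape; the original type is not shown in the post.
interface PhaseProgress {
  completedSteps: number;
  totalSteps: number;
}

interface CheckpointSketch {
  sessionId: string;
  phase: Phase;
  progress: PhaseProgress;
  findingIds: string[]; // simplified stand-in for Finding[]
  timestamp: Date;
}

function serializeCheckpoint(cp: CheckpointSketch): string {
  return JSON.stringify({ ...cp, timestamp: cp.timestamp.toISOString() });
}

function restoreCheckpoint(raw: string): CheckpointSketch {
  const data = JSON.parse(raw);
  const timestamp = new Date(data.timestamp);
  const knownPhase = PHASE_ORDER.indexOf(data.phase) !== -1;
  if (typeof data.sessionId !== 'string' || !knownPhase || isNaN(timestamp.getTime())) {
    throw new Error('Corrupt checkpoint');
  }
  return {
    sessionId: data.sessionId,
    phase: data.phase,
    progress: data.progress,
    findingIds: data.findingIds,
    timestamp,
  };
}

// Phases still to run on resume, starting with the one that was interrupted.
function remainingPhases(cp: CheckpointSketch): Phase[] {
  return PHASE_ORDER.slice(PHASE_ORDER.indexOf(cp.phase));
}
```

On resume, the pipeline would restore the checkpoint and run only `remainingPhases(checkpoint)`, using `progress` to skip steps already done inside the interrupted phase.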

Results

After building and running BugBountyBot:

Metric                | Before   | After
Time on recon         | 4+ hours | 30 mins (review)
False positive rate   | ~30%     | Under 5%
Findings per session  | 2-3      | 8-12 (validated)
Time to first finding | 2 days   | 4 hours

The system doesn't replace skill; it multiplies it. Your expertise in validation and reporting gets applied to roughly 4x as many findings.

Getting Started

BugBountyBot is built with TypeScript, SQLite, and Claude Code integration. The core architecture:

/src
  /agents
    recon.ts          # Passive enumeration
    testing.ts        # Vulnerability detection
    validator.ts      # PoC verification
    reporter.ts       # Report generation
  /database
    rag.ts            # Knowledge storage
    checkpoints.ts    # Session persistence
  /safety
    rate-limit.ts     # Request throttling
    scope.ts          # Scope validation
    ban-detect.ts     # Blocking detection

Start with a single program. Let the RAG database learn. Expand scope as confidence grows.

What's Next

BugBountyBot v2.0 is in development with methodology-driven hunting:

  • 6-8 week structured hunt phases
  • Feature mapping before testing
  • Scope change monitoring
  • JavaScript file change detection

The goal is a shift from "run and hope" to a systematic, elite-hunter methodology.


Related: Why Human-in-the-Loop Beats Full Automation | Portfolio: BugBountyBot
