This post is my submission for DEV Education Track: Build Apps with Google AI Studio.
Three years ago, everyone wouldn’t stop talking about “the best JavaScript framework of the year.”
Today, most of the noise sounds more like:
- “LLMs are going to replace you”
- “Learn AI or die”
- “Hiring freeze, layoffs everywhere”
While I watched friends and teammates get laid off, I started asking myself an uncomfortable question:
“If I lose my job tomorrow, what do I need to know to stay useful and valuable in tech?”
For me, the answer ended up being a strange mix: security + AI + developer tooling, not just “pretty apps.”
That’s how this experiment was born: Gemini Bug Hunter, a CLI that uses Gemini to hunt vulnerabilities in your own code from the terminal.
It’s not an “enterprise product,” and it’s not an academic paper. It’s a project built in weird late nights, between burnout, fear of layoffs, and the feeling that the ecosystem changed overnight.
In this article I’ll walk you through:
- What I actually built
- How it looks in action
- What I got out of the experience (and whether it makes sense for you to dive into something similar… or not)
What I Built
On a practical level, Gemini Bug Hunter is a Node.js CLI that does something very simple to describe and very hard to do well:
It scans your project, sends code snippets to Gemini 2.5 Flash, and gives you back a report with vulnerabilities, risk, and recommendations, all designed for everyday developer use.
The idea is to treat Gemini not as a “friendly assistant”, but as the brain of the system. The README makes it very clear:
“Gemini 3 is not an assistant — it is the brain of the system.”
The Concept
I wanted a flow like this:
- You’re inside your repo.
- You run:
gbh scan
- The CLI:
  - Walks through your codebase (using `glob` for file patterns).
  - Cleans up sensitive data (secrets, keys, tokens).
  - Sends code chunks to Gemini with a very specific prompt: act as a professional ethical hacker, focus on OWASP Top 10, avoid false positives, return JSON.
  - Receives the response, parses it, and:
    - Calculates an overall project risk score (0–100).
    - Lists vulnerabilities with file, line, severity, impact, recommendation, and a secure code example.
  - Displays everything with a pleasant terminal experience, not like a 2009 CI log dump.
The Security Nerd Part
Instead of stopping at “there’s an XSS here,” I wanted the model to return something structured, like:
{
"projectRiskScore": 81,
"riskLevel": "HIGH",
"summary": "Found 3 vulnerabilities including 1 CRITICAL issue requiring immediate attention",
"vulnerabilities": [
{
"id": "user-query-sqli",
"title": "SQL Injection in User Query",
"severity": "CRITICAL",
"confidence": 0.95,
"category": "SQL Injection",
"file": "src/users.js",
"line": 42,
"description": "User input is directly concatenated into SQL query without sanitization.",
"impact": "Attackers can extract or manipulate database data.",
"exploitationScenario": "An attacker can craft a malicious query via the `search` parameter.",
"recommendation": "Use parameterized queries and validate input.",
"secureCodeExample": "SELECT * FROM users WHERE name = ?",
"autoFixSafe": true
}
]
}
With that structure, I built a risk calculator that weighs:
- Severity (40%)
- Model confidence (30%)
- Exploitability (20%)
- Impact (10%)
Is it perfect? No. Is it better than “I trust I have no bugs because the app compiles”? Absolutely.
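Those weights translate into a small scoring function. This is a sketch of one way to combine them, not the actual `engine/risk/calculator.js`; the report schema above has no numeric exploitability or impact fields, so I default those to mid-range values here.

```javascript
// One possible weighting matching the percentages above
// (severity-to-number mapping and defaults are my own assumptions).
const SEVERITY = { CRITICAL: 100, HIGH: 75, MEDIUM: 50, LOW: 25 };

function vulnerabilityScore(v) {
  return (
    0.4 * (SEVERITY[v.severity] ?? 0) +   // Severity (40%)
    0.3 * (v.confidence * 100) +          // Model confidence (30%)
    0.2 * (v.exploitability ?? 50) +      // Exploitability (20%), default mid-range
    0.1 * (v.impactScore ?? 50)           // Impact (10%), default mid-range
  );
}

// Project score: the worst finding dominates, rounded to 0–100.
function projectRiskScore(vulns) {
  if (vulns.length === 0) return 0;
  return Math.round(Math.max(...vulns.map(vulnerabilityScore)));
}
```

Taking the maximum (rather than an average) means one CRITICAL finding cannot be diluted by a pile of LOW ones.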
Stack and Technical Choices
I stayed in a very familiar, non-exotic stack:
- Node.js 18+
- JavaScript (ES2022+)
- CLI framework: `commander`
- Styling: `chalk`, `cli-table3`, `boxen`
- Config with `.env` and a `config/default.js`
- Gemini client in a separate module (`engine/gemini/client.js`)
- Scanning logic in `engine/scanner/scanner.js`
- Risk scoring in `engine/risk/calculator.js`
- Console reporting in `reporter/console.js`
I wanted any dev with basic Node experience to be able to read the code without suffering. No TypeScript, no decorators, no magic.
Demo
The whole project is designed so that the “hello world” is something a tired dev can do on a Friday afternoon without hating me.
Quick Local Setup
git clone https://github.com/holasoymalva/gemini-bug-hunter.git
cd gemini-bug-hunter
npm install
cp .env.example .env
# Then edit .env and add your GEMINI_API_KEY
npm start doctor
The doctor command basically checks:
- Do you have Node?
- Do I have your API key?
- Can I talk to Gemini without exploding?
If that passes, you’re ready to play.
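In code, that kind of preflight check can be tiny. This is an illustrative sketch, not the real `doctor` implementation; a real version would also make a minimal test call to the Gemini API.

```javascript
// Sketch of a preflight "doctor" command (illustrative names and checks).
function doctor(env = process.env) {
  const checks = [
    ["Node.js >= 18", parseInt(process.versions.node, 10) >= 18],
    ["GEMINI_API_KEY set", Boolean(env.GEMINI_API_KEY)],
    // A real doctor would also ping Gemini with a tiny request here.
  ];
  for (const [name, ok] of checks) {
    console.log(`${ok ? "OK " : "FAIL"} ${name}`);
  }
  return checks.every(([, ok]) => ok);
}
```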
Using It as a Global CLI
The idea is to be able to do:
npm install -g gemini-bug-hunter
gbh config set-key <YOUR_GEMINI_API_KEY>
gbh scan
From there:
- Scan the current directory:
  gbh scan
- Scan only `src`:
  gbh scan ./src
- Get JSON output for CI integration:
  gbh scan --json
  gbh scan --output report.json
- Enable interactive auto-fix mode:
  gbh scan --fix
In --fix mode, the CLI does not go rogue and rewrite your repo without asking. For each vulnerability marked as autoFixSafe: true, it asks if you want to apply the change. You stay in control.
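The consent loop is the important part, so here is roughly its shape. Names are illustrative: in a real CLI, `ask` would wrap readline and `applyFix` would patch the file on disk.

```javascript
// Sketch of the --fix consent loop: only findings flagged autoFixSafe
// are offered, and nothing is applied without an explicit "y".
async function interactiveFix(vulnerabilities, ask, applyFix) {
  for (const v of vulnerabilities) {
    if (!v.autoFixSafe) continue; // risky fixes are never auto-applied
    const answer = await ask(`Apply fix for "${v.title}" in ${v.file}? [y/N] `);
    if (answer.trim().toLowerCase() === "y") await applyFix(v);
  }
}
```

Defaulting to "No" and skipping anything not marked `autoFixSafe` keeps the human in the loop.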
How the Report Looks
I was aiming for something like this:
🛡️ GEMINI BUG HUNTER REPORT
📊 Risk Assessment
Risk Score: 81% ████████████████████
Risk Level: HIGH
Summary: Found 3 vulnerabilities including 1 CRITICAL issue requiring immediate attention
🎯 Severity Breakdown
● CRITICAL: 1
● HIGH: 1
● MEDIUM: 1
🔍 Detected Vulnerabilities
🔴 [1] SQL Injection in User Query
File: src/users.js:42
Category: SQL Injection
Severity: CRITICAL | Confidence: 95%
User input is directly concatenated into SQL query without sanitization.
⚠️ Impact: Attackers can extract or manipulate database data.
✓ Fix: Use parameterized queries and input validation.
✨ Auto-fix available
Could we build a beautiful web dashboard on top of this? Sure.
Does it make sense if the goal is to live in the developer’s daily workflow? Not really. That’s why I stayed CLI-first.
My Experience
Building Gemini Bug Hunter wasn’t just “another side project.” It was more like a way to process what’s been happening in the ecosystem.
The Last 3 Years in Tech Have Been Weird
In a very short time we’ve seen:
- Massive hiring followed by massive layoffs.
- Ultra-senior people competing for mid-level roles.
- Startups going from “we’ll hire 50 devs” to “we froze everything and we’re in survival mode.”
- And suddenly, generative AI everywhere, from designing landing pages to doing code review.
Watching colleagues lose their jobs hits differently when you know them; they’re not just numbers in a LinkedIn post. Some were deeply specialized in a specific stack, others only did feature work without touching security, performance, or DX.
All of that pushed me to ask two questions:
- What would make me harder to replace with a “vanilla” LLM?
- What kinds of problems do I actually want to spend my time on, beyond building yet another CRUD with the framework of the month?
Why Build a Vulnerability Hunter with AI
I chose to blend three things:
- Security: not as glamorous as a shiny frontend, but always relevant.
- Developer tooling: tools that developers use, not end users.
- AI as the analysis engine, not just as a fancy autocomplete.
Instead of fighting against models, I preferred to stand on top of them. Gemini becomes the one:
- Spotting insecure patterns.
- Correlating signals.
- Explaining impact.
And I focus on:
- Designing the prompt and the response schema.
- Setting boundaries (secret redaction, consent, etc.).
- Turning the output into something humans can act on.
The Good Stuff I’m Taking Away
- I learned a lot about practical security: OWASP Top 10 stops being a poster and becomes “this endpoint you wrote is a potential XSS.”
- It forced me to think about privacy and ethics: you can’t just send arbitrary code to an AI API without care. That’s why the project includes:
- Explicit consent before sending code.
- Automatic secret redaction (tokens, API keys).
- No remote storage of source code.
- It gave me back a bit of a sense of control: instead of just consuming new tools, I built something on top of them.
The Not-So-Pretty Side
- There is ecosystem fatigue in AI too. Every week there’s a new “state-of-the-art” model, a new framework, a new SDK. It’s easy to feel like you’re always behind.
- Integrating with AI APIs is convenient, but it creates strong dependency on the provider. Models change, features move, pricing shifts. You have to accept that fragility.
- And yes, there’s a lot of noise and hype. Lots of shiny demos, not that many solid tools. That’s why I forced myself to make this project:
- Usable from the terminal.
- Installable with `npm`.
- Understandable from the README alone, without a 30-minute video.
Should You Learn This Stuff… or Maybe Not?
I’m not going to tell you the usual “if you don’t learn AI today, you’ll be irrelevant tomorrow.” Reality is more nuanced.
Why It’s Worth It
- It pushes you to think about more interesting problems than “just another login form.” Security and tooling make you look at the whole system.
- It forces you out of the comfort zone of just following framework tutorials.
- You learn to collaborate with models, not only use them as luxury autocomplete.
- You get concrete stories to tell in interviews beyond “I built a to-do app with X.”
Why It Might Not Be for You (At Least Not Yet)
- If you’re still fighting with fundamentals (basic algorithms, HTTP, SQL, Git), jumping into LLMs + security at the same time can be too much noise.
- If you hate the terminal and prefer 100% UI and design, a security CLI isn’t going to be your dream project.
- If you’re deep in burnout, you don’t need another side project to “justify your professional existence.” Sometimes the healthiest move is to rest.
...Sooo?
Gemini Bug Hunter is not going to save the industry, prevent the next layoff wave, or replace an AppSec team.
But for me, it was:
- A way to process how the ecosystem has changed in the last few years.
- A reminder that I can still build useful things, even when the market feels hostile.
- Proof that AI + security + developer experience is a very interesting combo where there’s still a lot of room to build.
If any of this resonates with you, maybe your next project doesn’t need to be “yet another Reddit clone with the framework of the month,” but something that mixes:
- A real problem you care about.
- The new tools (Gemini, other LLMs, etc.).
- And your own experience from the trenches.
And if one day you run:
gbh scan
against your own code and avoid a nasty security surprise, then this whole experiment will have been even more worth it.