This story was shared by a fellow developer on DEV who asked to remain anonymous. If you've got a story to tell — come find me. Your name won't appear anywhere.
Based on real microservice security design patterns. About an engineer whose PR got blocked by an AI security system — he thought he was fixing a vulnerability. Turns out, someone had a vested interest in that vulnerability staying open.
1. $1,400,000
All-hands meeting. CTO James stood at the front, a number on the screen:
$1,400,000
"This is what we're spending on security this year." He pointed at the number. "The biggest piece — right here."
He clicked the remote. VoidSentinel's architecture topology appeared on screen.
"VoidSentinel — an AI security platform. Integrated into our CI/CD pipeline. Starting today, every PR involving internal service-to-service calls — it reviews them automatically."
The CEO didn't show up today. James didn't mention it. He looked straight at Mark — VP of Security.
Mark took the mic. "VoidSentinel has been running in our pre-production environment for three weeks. It's caught 47 high-risk patterns. Zero false positives."
He paused.
"Of course, some people might feel uncomfortable when their PR gets blocked. But this isn't personal. This is the security standard."
He wasn't looking at me. But I knew who he was talking about.
2. High Risk. Denied.
The story started three weeks earlier.
We had a payment service and a user service that talked to each other internally. They shared an old API key — one key across thirty-plus services, unchanged for five years.
It wasn't that nobody knew. It just never made it to the top of the backlog.
On Day 1, I opened a PR: add independent service-to-service auth between the payment and user services. Not much code — a new token exchange module, three call sites modified.
Five minutes later, VoidSentinel's automated comment hit:
"High-risk alert: Unauthorized internal access pattern change detected. This PR has been automatically rejected. Contact the security team."
I stared at that comment for a long time.
This isn't "high risk." This is adding independent auth between two services. This is fixing a hole that's been open for five years.
I replied under the PR: "This PR fixes a shared credential vulnerability between services. It is not introducing a new attack surface. Please escalate for manual review."
The next day, the reply came:
"VoidSentinel's verdict is final. Please reference the system's suggested modifications to adjust your code."
The system's "suggested modification" was: don't change anything.
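For readers who want something concrete: the Day 1 PR itself isn't reproduced here, but the idea behind it (replace one shared key with per-link, short-lived tokens) can be sketched roughly like this. Everything in the block is an assumption made for illustration. The PAIR_KEYS table, the issue_token and verify_token names, and the HMAC-signed token format are all hypothetical, and a real deployment would more likely lean on an established scheme such as mutual TLS or tokens signed by an identity service.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Hypothetical per-link secrets: each caller -> audience pair has its own key,
# so one leaked key exposes one link instead of every service at once.
PAIR_KEYS = {
    ("payment", "user"): b"demo-key-payment-to-user",
    ("user", "payment"): b"demo-key-user-to-payment",
}


def _sign(key: bytes, payload: str) -> str:
    """HMAC-SHA256 over the encoded claims, returned as URL-safe base64."""
    digest = hmac.new(key, payload.encode(), hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).decode()


def issue_token(caller: str, audience: str, ttl_seconds: int = 300,
                now: Optional[float] = None) -> str:
    """Mint a short-lived token valid only for the caller -> audience link."""
    key = PAIR_KEYS[(caller, audience)]
    issued_at = time.time() if now is None else now
    claims = {"sub": caller, "aud": audience, "exp": issued_at + ttl_seconds}
    payload = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode()
    return f"{payload}.{_sign(key, payload)}"


def verify_token(token: str, caller: str, audience: str,
                 now: Optional[float] = None) -> bool:
    """Accept the token only if signature, caller, audience, and expiry all check out."""
    key = PAIR_KEYS.get((caller, audience))
    if key is None:
        return False
    payload, sep, signature = token.partition(".")
    if not sep:
        return False
    # Constant-time comparison; the payload is only parsed after this passes.
    if not hmac.compare_digest(signature.encode(), _sign(key, payload).encode()):
        return False
    claims = json.loads(base64.urlsafe_b64decode(payload))
    current = time.time() if now is None else now
    return (claims["sub"] == caller
            and claims["aud"] == audience
            and claims["exp"] > current)
```

The point of the sketch is the blast radius: a token minted for the payment-to-user link fails verification on any other link, and it expires on its own instead of living for five years.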
3. This Is Final
I walked to Mark's office.
"That PR — it's not an attack. It's a fix. The payment service and user service are still sharing a five-year-old API key. If any one service gets compromised, the other thirty are exposed."
Mark didn't look up.
"VoidSentinel's model analyzed your code and flagged it as high risk. I trust its judgment."
"Its model can't read intent. It just sees someone modifying service-to-service auth — which looks exactly like what an attacker would do."
Mark finally looked up.
"So what's your suggestion? Everyone can just modify internal auth whenever they want, because 'I had good intentions'?"
" — Add a manual review gate. High-risk PRs go through AI first, then human review if rejected."
"I am the human review. The system flagged high risk, and I agree. PR denied."
I went back to my desk.
I tried a different approach: instead of modifying the auth layer, add a call allowlist at the gateway level. That wouldn't trigger VoidSentinel's "service auth change" detection pattern.
Submitted the PR.
VoidSentinel verdict: High risk. Denied.
— It recognized I'd changed the approach. The model's coverage was deeper than I thought.
But it still couldn't cover the real vulnerability.
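The second attempt, a call allowlist at the gateway, is simple enough to sketch. This is not the PR from the story, just a minimal illustration under stated assumptions: the ALLOWED_CALLS table, the service names, the paths, and the check_call function are all hypothetical, and the sketch assumes the gateway has already authenticated the caller and normalized the path.

```python
from dataclasses import dataclass

# Hypothetical allowlist: (caller, target) -> path prefixes that link may use.
# Anything not listed is denied by default.
ALLOWED_CALLS = {
    ("payment", "user"): ("/internal/users/",),
    ("user", "payment"): ("/internal/payments/status/",),
}


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str


def check_call(caller: str, target: str, path: str) -> Decision:
    """Decide whether caller may reach path on target.

    Assumes `caller` is already authenticated and `path` is already
    normalized (no "..", no duplicate slashes).
    """
    prefixes = ALLOWED_CALLS.get((caller, target))
    if prefixes is None:
        return Decision(False, f"{caller} -> {target} is not an allowlisted link")
    # str.startswith accepts a tuple and matches if any prefix matches.
    if not path.startswith(prefixes):
        return Decision(False, f"{path} is outside the paths allowed for {caller} -> {target}")
    return Decision(True, "allowlisted")
```

An allowlist like this narrows which links exist at all. It does not replace per-link credentials: a caller holding the shared key can still impersonate any allowlisted service.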
4. PIP
Day 3. I got the PIP notice.
Subject: "Assessment of Alignment with Company Security Protocols"
Mark's office. He slid the PIP notice across the desk.
"Two attempts to bypass VoidSentinel's security review. This isn't a technical issue — it's an attitude problem."
"I submitted fixes. Not bypasses."
"From the system's perspective, you modified internal auth paths. The system flagged high risk. You changed the approach and submitted again. That's a bypass."
I looked at the PIP notice.
"30-day improvement plan. Success criteria: zero compliance violations. Fail — you're out."
I signed it.
Walking out of Mark's office, I ran into Jay from ops in the hallway. He lowered his voice:
"— You know Mark's a board appointment, right? The CEO's been trying to move him for almost a year. Watch your back. Also — VoidSentinel? He picked it. Sold the board on it for six months before they approved it. You tell him he picked wrong — he's not going to admit it."
I nodded.
Back at my desk, I opened VoidSentinel's architecture documentation.
5. Nobody Knew
For the next 24 days, I did three things.
One: went through the PIP motions during the day.
No more touching service-to-service auth in any PR. VoidSentinel stopped flagging my code. Mark sent PIP progress emails every Friday. I replied every Friday with "on track."
Two: built a proof of concept at night.
VoidSentinel's core blind spot — I confirmed it — wasn't that it was too sensitive. It was that it applied the exact same detection rules to fixes and attacks. The code that fixes a vulnerability and the code that exploits one — to VoidSentinel's model, they're indistinguishable.
That means: it can't tell who's fixing something and who's breaking something. It only knows someone's modifying service-to-service auth. Doesn't matter who you are, or why you're doing it.
Three: set up a monitor.
It didn't block anything. It just logged. Logged every "high-risk" PR VoidSentinel intercepted. Logged the PR content. The submitter. The resolution.
I piped it to a server I controlled. Auth call logs from 138 service nodes, synced in real time.
Mark didn't know.
Jay didn't know.
Nobody knew.
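A log-only monitor of the kind described above can be very small. What follows is a sketch under assumptions, not the monitor from the story: the Interception record, its field names, and the record and tally_resolutions helpers are hypothetical, and the in-memory sink stands in for whatever file or remote store a real setup would append to.

```python
import io
import json
from collections import Counter
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional, TextIO


@dataclass(frozen=True)
class Interception:
    """One PR the gate intercepted. All field names are illustrative."""
    pr_id: str
    submitter: str
    summary: str
    verdict: str      # what the gate said, e.g. "high-risk"
    resolution: str   # what happened next, e.g. "rejected"


def record(event: Interception, sink: TextIO, now: Optional[datetime] = None) -> None:
    """Append one JSON line per interception. It observes only and never blocks anything."""
    logged_at = (now or datetime.now(timezone.utc)).isoformat()
    sink.write(json.dumps({"logged_at": logged_at, **asdict(event)}) + "\n")


def tally_resolutions(lines: Iterable[str]) -> Counter:
    """Count logged interceptions by how they were resolved."""
    return Counter(json.loads(line)["resolution"] for line in lines if line.strip())


# Usage with an in-memory sink and made-up entries.
log = io.StringIO()
record(Interception("pr-001", "alex", "per-service auth for payment/user",
                    "high-risk", "rejected"), log)
record(Interception("pr-002", "sam", "rotate a leaked test key",
                    "high-risk", "rejected"), log)
print(tally_resolutions(log.getvalue().splitlines()))
```

Keeping the log append-only and outside the gate is the design choice that matters: the record does not depend on the system it is recording.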
6. Zero Incidents
Day 10. Mark sent a company-wide email:
"VoidSentinel has been live for two weeks. 217 high-risk PRs intercepted. Zero security incidents. This is the security standard we need."
I read that email. Then I opened my monitor.
Out of 217 interceptions, 43 were security fixes that got falsely blocked. And 3 of those fixes were for vulnerabilities that — if exploited — VoidSentinel couldn't detect at all. Because those vulnerabilities weren't in north-south traffic. They were in the gaps between services.
VoidSentinel can't see what it isn't deployed to look at.
7. No
Day 22.
Mark's office. "You're more than halfway through the PIP. How are things?"
"On track."
"Anything you want to talk about?"
"No."
He looked at me.
"Alex. Your technical skills are not in question. If you're willing to put something in writing — acknowledging VoidSentinel's security review process — the PIP can end early."
I stood up.
"VoidSentinel's security review process: 43 false positives, every one of them a real fix. 3 of those fixes were for holes it can't even see. That's the only thing I can put in writing."
I didn't wait for his response.
8. $4,200,000
Day 27. 2:47 AM.
My phone lit up. Not a VoidSentinel alert — my own monitor.
"Anomaly detected: service-to-service auth call — credential KT-9f4 — flagged as 'high risk' by VoidSentinel, then automatically cleared three minutes later."
I sat up. Opened my laptop.
At 1:12 AM, someone had initiated an auth call on the payment service's dev interface. Credential KT-9f4 — a service account belonging to an employee who left four months ago.
VoidSentinel flagged it. Then automatically cleared it.
Reason: "Credential is valid. Call frequency normal. Classified as normal operation."
— The attacker used a legitimate credential. Normal frequency. Normal path. They just accessed a service they shouldn't.
VoidSentinel recognized "someone's modifying service-to-service auth" — but after evaluating the call frequency and credential validity, its model decided this was a routine operation.
It did nothing.
I opened VoidSentinel's dashboard. Threat score: 0.02. All green.
Then I opened the Slack channel.
"Payment reconciliation is off by $4.2 million. Anyone looking?"
I took three screenshots:
- My monitor log — credential KT-9f4 at 1:12 AM, cross-service call
- VoidSentinel's audit log — "high risk" → cleared three minutes later → "normal"
- The Slack message — $4.2 million missing
Then I opened the PR I'd submitted on Day 1.
VoidSentinel's verdict: High risk. Denied.
Fix content: independent auth between payment service and user service.
— The exact same link that was exploited at 1:12 AM.
I sent the three screenshots and that PR to CTO James. CC'd the CEO.
One line in the body:
"Day 27. The thing I tried to fix on Day 1 — someone used it tonight. VoidSentinel flagged it for three minutes, then let it through. PIP has 3 days left. Your call."
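The rule behind that 2:47 AM alert is easy to state: a credential is flagged high risk, and the flag is cleared automatically a few minutes later. A rough sketch of such a rule follows. It is an illustration under assumptions, not the actual monitor: the AuditEvent shape, the verdict strings, the five-minute window, and the find_flag_then_clear name are all hypothetical, and the list of inactive credentials is assumed to come from somewhere like an HR or identity system.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Set


@dataclass(frozen=True)
class AuditEvent:
    at: datetime
    credential: str
    verdict: str  # "high-risk" or "cleared" (illustrative values)


def find_flag_then_clear(events: Iterable[AuditEvent],
                         inactive_credentials: Set[str],
                         window: timedelta = timedelta(minutes=5)) -> List[str]:
    """Report credentials that were flagged high-risk and then cleared within `window`."""
    last_flag: Dict[str, datetime] = {}
    alerts: List[str] = []
    for event in sorted(events, key=lambda e: e.at):
        if event.verdict == "high-risk":
            last_flag[event.credential] = event.at
        elif event.verdict == "cleared" and event.credential in last_flag:
            flagged_at = last_flag.pop(event.credential)
            if event.at - flagged_at <= window:
                note = (" (credential owner is no longer active)"
                        if event.credential in inactive_credentials else "")
                alerts.append(
                    f"{event.credential}: high-risk at {flagged_at:%H:%M}, "
                    f"auto-cleared at {event.at:%H:%M}{note}"
                )
    return alerts
```

The rule never tries to judge intent. It only surfaces the moments where the gate changed its own mind quickly, and leaves the judgment to a person.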
9. RCA
CTO James. VP Mark. VP of Finance. VP of Legal. The CEO — everyone in the room. One person I didn't recognize sat in the corner. Someone called him a "board observer."
James ran through the incident timeline first. When he said "the attack vector exploited an internal auth gap," the CEO raised a hand and stopped him.
"Alex. When did you submit your first PR?"
"Day 1."
"What was it for?"
"Independent auth between the payment and user services."
"What was VoidSentinel's verdict?"
"High risk. Denied."
"And then?"
I didn't answer. I opened the PR.
I flipped to the first page — the code diff. A new token exchange module. Three call sites changed.
"This was a fix. VoidSentinel flagged it as high risk."
Then I flipped to the 1:12 AM attack log.
"This was an attack. VoidSentinel flagged it as high risk — then automatically cleared it. Reason: 'credential valid, frequency normal.' Same system. Same vulnerability. Same detection model. One blocked. One let through."
I closed the laptop.
"— Because VoidSentinel can't tell the difference between a fix and an attack. It only knows someone's modifying things. Whether that person is patching or exploiting — it doesn't know. It can't."
The room went quiet for a few seconds.
The CEO didn't look at Mark. He looked at me.
"When were you put on PIP?"
"Day 3."
"What was the reason?"
"Two attempts to bypass VoidSentinel's security review."
The CEO didn't say anything. He turned to Mark and asked quietly:
"He submitted a fix. You gave him a PIP. Is that right?"
James's eyebrow twitched.
"Mark. The report you gave me said 'two attempts to bypass security review.' You didn't mention he was submitting a fix."
Mark opened his mouth. "His submission method — "
"He proposed independent auth. You said no. He tried a gateway-level approach. You said no. The vulnerability he reported — someone exploited it tonight. Your system — flagged it for three minutes, then let it through."
The CEO's voice was flat. Flat enough that the silence after it felt heavier than any shout.
Mark couldn't find words.
The CEO stood up.
"Alex. My office."
I stood up. In my peripheral vision — Mark stayed seated. He didn't follow.
10. Eleven Months
The CEO walked fast down the hall. I followed. He didn't turn around.
"PIP rescinded. Effective immediately."
"Okay."
"Your fix gets deployed today."
" — VoidSentinel will block it."
"VoidSentinel gets reconfigured this afternoon. Your PR gets re-reviewed after the config update."
"Okay."
He stopped.
"That monitor you built. How long has it been running?"
"Since Day 3."
"The day you got the PIP?"
"The day I got the PIP."
He looked at me. Maybe five seconds.
"You know what it means — being on PIP and still building something that caught an attack before a $1.4M system did?"
"It means those 24 days on PIP weren't wasted."
He smiled. Not a "good job" smile. A different kind.
" — You know how Mark got that seat?"
I paused.
"Board appointment, I heard."
The CEO nodded.
"I waited eleven months for a reason."
He didn't finish the sentence. He didn't need to.
I went back to my desk. Mark's desk was empty. HR had completed everything by 3 PM.
I never replied to Mark's last PIP progress email. The last one sat at Day 27.
Subject line: "On track."
Later, I reopened that Day 1 PR. VoidSentinel ran it again — this time it passed, with a note: "Low risk. Recommended to merge."
I didn't feel good about it.
I just remembered what Jay said to me in the hallway that day.
— "The CEO's been trying to move him for almost a year. Watch your back."
The reason he was waiting for was never about me.
It was about someone finally handing him a board he didn't have to wait on anymore.
Folks, when you submit a fix that gets blocked —
are you fixing a bug, or finishing someone else's chess move?
👇
To the person fixing vulnerabilities at 3 AM: this one's for you.
Top comments (34)
the CI/CD gatekeeper scenario without the conspiracy is just as frustrating - overfitted model, every auth fix blocked because auth modules touched historical vulns. no villain, just weeks of exception tickets.
You're right. The version without a villain is actually harder to deal with — at least with a villain you know who to yell at. An overfitted model blocking your auth fixes, and there's no one to complain to. Just a queue of exception tickets. And the worst part? You never know whether the next ticket will go through or get blocked. Its decision logic might as well be a coin flip.
yeah and the coin flip breaks planning more than the actual blocks do. people start batching auth changes just in case, the backlog fills with workarounds. it's not the blocked tickets that kill you, it's the process debt that forms around the unpredictability.
"Process debt" — that's the phrase I was missing. The blocks are surface-level. What actually compounds is the organizational scar tissue: people pre-batching, padding timelines, building workarounds for a system that doesn't know it's unpredictable. You stop planning around what's right and start planning around what might get through.
process debt is the right frame but it's actually worse - the debt compounds because the workarounds outlive the original unpredictability. team rewires around the bad behavior, system gets patched, but the pre-batching habits and padded timelines stay. you end up carrying the tax without the reason for it.
"Carrying the tax without the reason" — that's a brutal way to frame it. What makes it worse is eventually nobody even remembers where the habits came from. A new joiner inherits a "batch auth changes" rule from a wiki page written by someone who left two years ago. Ask them why — "that's just how we do it here." The original unpredictability is long gone, but the process lives on as culture. At that point, the debt has become identity.
Really appreciate you diggin
honestly the inherited-from-leaver case is the easier fix - the author's gone so the rule's legitimacy is already in question. harder is when the person who wrote it is still on the team but also can't reconstruct why. then you get the same tribal-knowledge shrug with the original author nodding along, and nothing has permission to be questioned
That's the part that stuck with me too. The "everyone shrugs together" version is almost harder to fix than the abandoned rule — because with the leaver's rule you at least know you're dealing with a ghost. When the author's in the room nodding along, you're fighting the whole team's shared memory loss. No villain, just entropy.
entropy that looks like consensus is the hardest kind -- nodding doesn't mean understanding, it means everyone stopped asking. the tell is whether the author can write down the original reasoning now. if they can't, the rule is running on muscle memory with no owner.
"A rule with no owner" — that's the sharpest framing yet. But there's a more insidious version: the rule that was never written down, yet governed more decisions than any documented policy ever could. In the story from that article, "AI will verify itself" was never a written strategy. It was a default assumption nobody questioned. It had no owner and no document, but it drove architecture decisions, testing strategy, and even personnel moves. When a rule that doesn't exist and has no owner influences a $2.8M outcome, either you don't know it's there until it's too late — or worse, you're the one who was supposed to question it and didn't. The most dangerous rules I've seen aren't the outdated ones. They're the ones nobody ever wrote down.
that unwritten-rule layer is where agent governance collapses fastest. you can audit every policy file and still miss the defaults nobody questioned because nobody wrote the question. 'AI will verify itself' survived because it was assumption-shaped, not rule-shaped — it lived in the space between decisions, not inside them.
"Assumption-shaped vs rule-shaped" is such a sharp distinction. In testing, code coverage is the textbook case of an assumption cosplaying as a rule. 95% coverage reads like a hard requirement, but what it really is is an assumption nobody questioned — "we tested enough." Nobody asks what the uncovered 5% is: dead code, error paths nobody handles, or the exact line that'll take down production next month?
The thing about rule-shaped assumptions is at least you know where they live. You can pull them up, challenge them, change them. Assumption-shaped ones live in green dashboard numbers. In "this is how we've always done it." In the empty space where nobody ever wrote down "why." "AI will verify itself" didn't survive because someone made the case for it. It survived because nobody was ever asked to.
yeah, coverage percentage is the cleanest example of that — it passed through PR review so many times the number calcified into policy. same thing happens with agent timeout values, retry limits, scope flags. someone picked something that worked in staging, it never got challenged, and now it is the rule. the audit question has to be whether the constraint is principled or just sediment nobody thought to question.
And the thing that makes it worse — sediment is contagious. Once 90% coverage is the rule, the next person writes tests to hit 90%, not to test what actually needs testing. Once a 30-second timeout is the rule, nobody asks whether that endpoint genuinely needs 30 seconds or it's just what worked on staging two years ago. The sediment becomes the new baseline, and the baseline grows new sediment on top of it. The person who asks "why" becomes the person slowing things down.
yeah, and once the timeout outlives everyone who set it, it becomes load-bearing myth - nobody touches it because nobody remembers what broke at 10 seconds. the rule protects itself by erasing its own origin.
And here's the part that makes it worse — the act of fixing a load-bearing myth is itself a risk. You dial that 30-second timeout down and something breaks, it's on you. You dial it up and the system gets sluggish, also on you. The person who never touched it made the right call — because they never made a call at all. The sediment protects itself not because it's correct, but because questioning it carries all the risk, and defaulting to it carries none.
that asymmetry is what makes it a myth and not just tech debt. tech debt you can justify touching. the myth punishes anyone who moves, in either direction. so over time, not touching it IS the architecture.
the "fix and exploit look identical to a model" framing hits the same wall we run into with MCP tool authorization.
call_payment_service(user_id=X) looks identical whether it's a legitimate agent step or prompt injection from upstream: the model sees code pattern, not intent.
the separate monitor was the right call. we ended up with an audit stream for every MCP tool call, keyed by request ID, completely outside the approval gate. gate can fail either way; the log does not care.
worst part: the gate was also the only escalation path. when the system that blocks you is also the appeal mechanism, there's nowhere to stand.
how are you handling where a human can actually override without going through the system that said no?
This is exactly why if you trust AI, you should trust a human more. The Sentinel was right, but it was too stupid to identify why it was right. Technical debt is where AI fails every time, because what looks right might have been wrong from the start.
"too stupid to identify why it was right" — that's the sharpest take on VoidSentinel I've read. It flagged the attack. Then cleared it three minutes later. Same model, same path, two different verdicts.
The "trust AI → trust human" part I'm less sure about. Mark was the human in this story, and he was the one using the AI as a shield. I think the scarier combination is: an AI that can't explain itself, plus a human who won't question it.
As for technical debt — VoidSentinel didn't fail because of code rot. It failed because its world only has one dimension: who's changing what. The real fight happens in another dimension: why are they changing it. That dimension doesn't exist in its model. Billions of training tokens can't buy you a plane you don't know is missing.
Exactly. But the problem is Mark trusted the AI over a human expert saying "This was a fix, not a flaw." Sure, trust the AI to flag suspicious code, but if you ask the implementer and they say it wasn't an accident, it was a security risk being patched, then at the very least it warrants looking into. Two portals sharing one API key from years ago is the worst kind of code rot, namely permissions code rot, because back in the day attacks weren't sophisticated enough to find the loophole. Half the reason I built V.A.L.I.D. is as an easy way to upgrade legacy systems to a secure standard (that happens to be AI native and much faster). You can train a model, but a model can't think outside its parameters, especially if it's a smaller model (sub 1T) with gaps in its knowledge (the missing dimension). When that's the case, always trust the human enough to double check it.
That's the real punchline — the AI got to be wrong twice (flag it, then clear the flag), but the human only got to be right once. The fix was intentional. The model couldn't tell the difference between "looks suspicious at first glance" and "prove it's wrong with evidence." So when both are in the loop and the system gives the model the final say, the implementer's context dies with their keystrokes.
V.A.L.I.D. sounds like exactly the kind of tool that shouldn't need to exist in a healthy engineering culture — but absolutely does in this landscape.
Exactly. If it's the AI vs the Expert, the back and forth should stop when the Expert provides proof. Not get shot down by the AI for 'it's not my policy'.
V.A.L.I.D. is an interesting one. I designed it because I hated CSLA's code bloat, so I decided to write a framework that fixes Blazor (it's also React compatible). It generates 85% of the code for you: you just create your DTO and mark items as ValidObjects, with constraints and rules. Then you write a simple markup of your UI and it fills in the blanks, including unit tests and an MCP to make whatever you build webMCP compliant, or, how I use it, as a great way to add an actually useful AI chatbot. It's written in F# and uses Roslyn for the generation. You can write the backend in C# or F# and it'll convert it to the most efficient one for the task; you can write the UI in Blazor or JS and it'll wire it up properly. It flags errors at compile time, not run time. It also has fuzzer functionality, so if you don't want to use the simple-scripting UI tests, you can just run the fuzzer and it'll blast a BO's objects and visually show you the affected properties. It ditches the WASM VDOM for unmanaged slabs and uses a 128-bit mask for the mutations. The result is that Blazor hits around 600 mutations per second and V.A.L.I.D. hits 3600+ while logging state, so if a user hits an error, you can replay their exact state on your end, with a timeline. It natively supports PROPER undo, not that stupid implementation in CSLA. So it does a bit more than provide a skills file and MCP so a model can write it faster without breaking your old code 😅
If you're into writing .NET, give it a try. I'm currently using it for my automated accounting suite, which is sitting at around 600k LOC, but I only had to write less than 1/5th of it by hand.
The scariest part isn't the AI – it's the person who hides behind it
This story gave me chills. Not because of the $4.2M loss (though that's painful), but because of how perfectly it captures the real danger of AI gatekeeping: when someone uses "the AI said no" as an unassailable shield for their own ego or agenda.
The technical lesson is clear – VoidSentinel couldn't distinguish a fix from an exploit because that's not a solvable problem for a model trained purely on code patterns. Intent isn't in the diff. But the human lesson is even bigger.
Mark didn't block the PR because the AI was right. He blocked it because admitting the AI was wrong would mean admitting he was wrong – about the tool he sold to the board, about the PIP he issued, about his own judgment. The AI wasn't the decision‑maker. It was the excuse.
What strikes me is that Alex didn't need to build a better AI. He built a monitor – a separate, independent observer that didn't try to judge fixes vs attacks, just logged what happened. That's often more valuable than another "smart" system.
Two quick takeaways from this (for anyone building or buying AI security tools):
Always have a human override with teeth. If a developer with domain expertise says "this is a fix, not an attack," there needs to be an escalation path that doesn't end at the same VP who bought the system.
Independence in monitoring is non‑negotiable. Alex's monitor ran outside VoidSentinel. It didn't share the same blind spots. In safety‑critical systems, that's called redundancy. In AI governance, it's called not letting the fox guard the henhouse.
Thanks for sharing this (and to the anonymous engineer who lived it). It's a reminder that the most dangerous line in any meeting is: "The system said no."
Cheers,
Jack
DEV.to/ggle.in
That line — "the AI wasn't the decision-maker, it was the excuse" — cuts to the deepest layer of the whole story.
But there's a quieter damage I keep coming back to. Once "the system said no" becomes a conversation ender, it doesn't just protect someone's ego. It kills the organization's ability to learn. Mark doesn't have to explain why he rejected the PR — "the gate didn't pass." Alex doesn't have to ask why it didn't pass, because he already knows the answer. Next time, the person after him won't ask "is this gate even right?" — they'll just route around it, or stop fixing things altogether.
The gate didn't just block that one fix. It blocked the conversation about why that fix was right in the first place. And a conversation that never happens is a loss no system can compensate for.
This hits close to home. When integrating AI APIs into automated security workflows, I have seen confident false negatives that would have been caught by a human reviewer. A simple secondary validation step using a different model or rule-based check has saved me from similar disasters. The key is never trusting a single AI call for critical decisions.
Nail on the head. The ironic part? VoidSentinel wasn't even wrong — it did detect someone modifying an auth path. It just couldn't tell if you're fixing a hole or punching a new one 😂 Same input, same pattern, opposite intent — system had no idea.
That line "never trust a single AI call for critical decisions" — I'm framing it
Btw, ever had the secondary validation also fall for it? First model was so confident it dragged the second one down the wrong path too? 🤣
AI security gates are useful, but “verdict is final” is where things break. A fix and an exploit can look almost identical to a model. Without human review, ownership context, and audit trails, you are not reducing risk. You are just automating blind spots.
That 'fix vs exploit look identical to a model' line cuts right to what I was trying to show. The team that built the gate assumed the model could tell the difference. It couldn't. And without the audit trail to prove the fix was legitimate, there was no way to overrule the gate — because the process to overrule it had been automated too. The gate didn't just block the fix. It blocked the conversation about whether the fix was right in the first place.
Security by AI alone is a costly bet; if it says no and the breach costs millions, bring humans back in the loop before the next incident.
Exactly. "Bring humans back in the loop before the next incident" — that line hits. The scary part is most orgs wait for the incident to happen before they do it.