DEV Community

AI didn’t invent code drift. It made it harder to ignore.

Scarab Systems on June 06, 2026

I started noticing this while building a real frontend/backend system with AI assistance. The problem was not that the AI could not code. It could...
 
xulingfeng

This resonates hard. "A fix would target the nearest visible symptom instead of the deeper boundary that had failed" — that's the sentence that sums up the whole class of failures I keep running into. The symptom looks fixed, the build passes, but the system is quietly less coherent than it was before.

I've been calling it "test coverage theater" on my end — where the AI patches the visible gap, and the pass/fail surface still looks green, but the invariants between layers have silently drifted. Curious: does Scarab detect drift by comparing against a stored baseline, or does it infer the expected boundaries from the codebase structure itself?

Scarab Systems

That’s a really important distinction, and I should be precise here.

Scarab is not meant to “guess” what the repo should be.

The stronger version of the model is deterministic: the team provides, or the repo already contains, an agreed baseline — governance docs, accepted scripts, framework conventions, architecture notes, test expectations, deployment assumptions, etc. Scarab then uses those as declared truth surfaces and checks whether the repo still aligns with them.

So in a product/team setting, I would not want Scarab silently deciding intent on behalf of the team. The team still owns the definition of what the system is supposed to be.

Where it gets interesting is when the repo contains contradictions.

A config says one thing, the runtime path expects another.

A test passes, but browser behavior fails.

A build path and dev path enforce the same contract differently.

A generated artifact starts acting like source truth.

A final-output constraint starts governing an intermediate step.

That is where Scarab becomes useful: it does not need to invent intent; it can surface places where the repo’s own declared or operational truths no longer agree.

So I’d say: baseline comparison matters, but the deeper value is evidence-backed contradiction detection across boundaries.

The team decides what “fixed” means. Scarab helps show where the current system is no longer proving what the team thinks it is proving.
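To make "contradiction detection across boundaries" a bit more concrete, here is a minimal sketch. It is purely illustrative and is not Scarab's actual implementation or API: the `Claim` type, the `find_contradictions` helper, the surface names, and every path and value in the sample data are hypothetical. The idea is only that each truth surface states what it believes about some key, and the check reports keys where the surfaces disagree, without picking a winner.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    surface: str   # where the claim comes from, e.g. "config", "runtime", "test"
    key: str       # what is being claimed, e.g. "api.base_path"
    value: str     # the claimed value
    evidence: str  # hypothetical file or observation backing the claim


def find_contradictions(claims):
    """Group claims by key and return only the keys whose surfaces disagree."""
    by_key = defaultdict(list)
    for claim in claims:
        by_key[claim.key].append(claim)
    return {
        key: group
        for key, group in by_key.items()
        if len({c.value for c in group}) > 1
    }


# Hypothetical sample data: the config and the test agree, the runtime does not.
claims = [
    Claim("config", "api.base_path", "/api/v2", "config/app.yaml"),
    Claim("runtime", "api.base_path", "/api/v1", "request observed on dev server"),
    Claim("test", "api.base_path", "/api/v2", "tests/test_routes.py"),
    Claim("config", "session.timeout_s", "900", "config/app.yaml"),
    Claim("runtime", "session.timeout_s", "900", "observed cookie max-age"),
]

conflicts = find_contradictions(claims)
for key, group in conflicts.items():
    print(f"contradiction on {key}:")
    for c in group:
        print(f"  {c.surface}: {c.value} ({c.evidence})")
```

Note that the sketch never decides which surface is right. It only shows that the config and the test are proving something the runtime path no longer does.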

Scarab Systems • Edited

I must tell you that I'm actually a physicist at my core so my approach to this entire space has been from a completely different viewpoint than those who generally work in it.

I think this is why I always looked at the issue from the perspective of how a holistic system is intended to operate cleanly rather than the more scoped approach to diagnosing an issue.

xulingfeng

The physicist perspective reminds me of someone I wrote about once. Same energy — "I built this system, I know it's solid." Formal verification all green, every boundary covered. Three months in, the model walked itself right out of those boundaries. He didn't skip installing a door. It just never occurred to him that he needed one.
Not saying you're the same person. But there's a shared vibe — that "I see things from a different angle than everyone else" thing. He used math to prove safety. You use physics to reason about code consistency. Same underlying question though: when the system isn't lying to you, you trust it. When it starts lying — how do you know?
One thing I keep coming back to: baselines drift too. Docs go stale. Architecture decisions become assumptions nobody checks. Test assertions test things the codebase no longer assumes. When Scarab finds a contradiction — does it resolve it, or just put it on the table?
I wrote about that CTO here — you two would probably get along 😄
dev.to/xulingfeng/our-cto-built-an...
Maybe you two should grab a coffee. Or we should take this somewhere else — this comment thread is starting to look like a miniseries 🍿

Scarab Systems • Edited

Love it! The thing is, that's really the only way I personally can see it. I'm not a programmer or a software developer in the common sense, so I never really had any other perspective when I started to run into problems.

That is why I value this conversation so much: honestly, I need your perspective. I know there are aspects I am not seeing. That objectively has to be true, but I don't know what I don't know, so please keep your thoughts and questions coming.

Scarab Systems

That’s exactly the hard part.

Scarab should not automatically resolve that contradiction, because resolving it means deciding which truth source has authority — and that still belongs to the repo owner/team.

If docs say one thing, tests say another, runtime behavior says another, and the architecture has quietly moved on, Scarab’s job is to put that conflict on the table with evidence.

So rather than saying “the docs are wrong” or “the test is wrong,” the useful diagnostic output is more like:

  • this baseline is declared here
  • this behavior contradicts it here
  • this test is proving a different assumption
  • this runtime path appears to have become the actual operating truth
  • these are the repair/update lanes depending on which authority the team chooses
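As a rough illustration, a diagnostic record of that shape might be sketched as below. This is a hypothetical structure, not Scarab's real output format: the `Evidence` and `Contradiction` types, the `repair_lanes` method, and the upload-size example with its paths and values are all assumptions made up for the sketch. The point is that it lists one lane per candidate authority and leaves the choice to the team.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    source: str    # which truth source speaks, e.g. "docs", "test", "runtime"
    location: str  # hypothetical path or observation where it was found
    says: str      # what that source asserts


@dataclass
class Contradiction:
    subject: str
    declared: Evidence   # the declared baseline
    conflicting: list    # Evidence items that disagree with the baseline

    def repair_lanes(self):
        """Return one lane per candidate authority; nothing is chosen automatically."""
        sources = [self.declared] + self.conflicting
        lanes = {}
        for authority in sources:
            others = [e.source for e in sources if e is not authority]
            lanes[authority.source] = (
                f"treat {authority.source} ({authority.says}) as authoritative; "
                f"update {', '.join(others)}"
            )
        return lanes


# Hypothetical example: docs, a test, and runtime behavior each claim a different limit.
report = Contradiction(
    subject="max upload size",
    declared=Evidence("docs", "docs/architecture.md", "10 MB"),
    conflicting=[
        Evidence("test", "tests/test_upload.py", "5 MB"),
        Evidence("runtime", "observed server response", "25 MB"),
    ],
)

lanes = report.repair_lanes()
print(f"contradiction: {report.subject}")
for authority, action in lanes.items():
    print(f"  lane [{authority}]: {action}")
```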

That distinction matters because stale baselines are one of the easiest ways for a repair to become dangerous. The agent can “fix” the code to match an outdated document, or update a test to match broken runtime behavior, and both can look green while making the repo less coherent.

So Scarab does not decide the truth for the team.

It surfaces the contradiction clearly enough that the team can decide which truth should become authoritative again.

UnitBuilds

Yip. Feature creep, code drift, etc. all existed well before AI. E.g. when I started at a new company, I had enthusiasm and wanted to rewrite everything, so I took their simple dashboard and turned it into a massive sprawling ecosystem... They didn't want that. So a month of my time was wasted, because I created something extraordinary that they never wanted. They just wanted drag-and-drop capability; I gave them JS/Blazor swapping, highly detailed graphs, drilldowns, drag and drop, variable resizing, auto-arranging, permissions management, etc. They just wanted drag and drop... A bad starting point and a poorly defined scope lead to code drift and feature creep. That's why now I don't bother doing more than I need to: you want an apple, I make an apple, I don't even offer a fruit salad.

Suny Choudhary

AI did not create code drift. It just made drift cheaper and faster. That is the real risk. Small generated changes can look fine in isolation, but over time they quietly break the architecture, naming logic, boundaries, and assumptions the team relied on.