Harsh

The Bug That Took 10 Minutes to Fix and 3 Days to Find

Hidden cost of unvetted AI logic

The fix was one line.

if not items: return []

That's it. Four words, a colon, and a pair of brackets. It took me 10 seconds to type, 10 minutes to test and verify, and 10 seconds to deploy.

It took me 3 days to find.

Three days of staring at logs that said nothing. Adding print statements that confirmed nothing. Rewriting code that wasn't broken. Blaming the framework. Blaming the database. Blaming the network. Eventually, quietly, blaming myself.

The bug wasn't complicated. It wasn't deep. It was hiding in plain sight, in a place I hadn't thought to look because I hadn't thought to ask the right question.

The AI had assumed a list would never be empty. I had assumed the AI was right. Neither of us checked.

This is the story of how I spent 3 days debugging a bug that took 10 minutes to fix, and what I learned about assumptions, silent failures, and the expensive gap between "it works" and "it always works."

How It Started

The code was simple: a function that processed a list of user inputs and returned a summary. A small feature, maybe two hundred lines total. Nothing that should have taken more than an afternoon.

I'd used AI to write the core logic. The prompt was clear. The output looked clean - readable, well-structured, sensible variable names. I reviewed it. The tests passed. The PR was approved in the next morning's standup. I shipped it on Tuesday.

The feature worked fine for two days. Users were using it. Logs were quiet. I moved on to the next ticket.

Then Thursday happened.

A user with an empty list hit the endpoint. No data - just an empty state, the kind that every real application encounters eventually. The function received the empty list, processed it, and returned... nothing. Not an error. Not an empty list. Not a helpful message. Just silence.

The UI froze, waiting for a response that wasn't coming. The user was confused. Support flagged it. I was pulled back into code I thought was done.

The function worked 99% of the time. The 1% was invisible. And in production, invisible is expensive.

Day 1: The Spiral

I started where anyone would: the error logs. Nothing. No exceptions, no warnings, no trace. The function had been called. It had returned. The logs had nothing to say about what happened in between.

I added print statements. Ran the code locally. It worked perfectly. Of course it did - I had test data, which meant I had a non-empty list, which meant I never triggered the bug.

I checked the database. The data was there. The function was definitely being called - I could see the request in the logs. It returned something. The UI just couldn't do anything with it.

I blamed the framework. Maybe it's a caching issue. Maybe the response is getting intercepted somewhere. Cleared caches. Nothing changed.

I blamed the network. Maybe the request is timing out before the response arrives. Checked latency. Everything was fine.

I blamed the AI-generated code. Maybe the logic is wrong in a subtle way I missed in review. Rewrote the core function by hand, line by line. Same behavior.

By 6 PM I had rewritten three functions, restarted the server twice, added eleven print statements, and learned absolutely nothing.

I closed my laptop. The bug was still there.

So was I.

Day 2: The Desperation

I came back Friday with fresh eyes and no new ideas - the worst combination.

I traced the execution path more carefully this time. Line by line, watching the data flow through the function. The input was received. The processing ran. The output was generated. Everything looked right.

Except the output was wrong.

I started questioning things I hadn't questioned in years. Did I actually understand the data structure I was working with? Was Python doing something I wasn't expecting with list references? Was there a mutation happening somewhere that I wasn't seeing?

I added a check to log the exact value of the input list before processing. Items were there. I added a check after processing. The result was empty. The logic in between looked correct.

I posted on Stack Overflow. No answers for six hours.

I asked an AI assistant. It suggested the same approaches I'd already tried, phrased slightly differently.

I pulled a colleague into a Zoom call. They looked at the code for ten minutes and said, "It looks fine to me."

That was the worst moment of the three days. Not the frustration, not the wasted hours - the moment when someone else looked at it and confirmed that nothing was obviously wrong. Because that meant either I was missing something fundamental, or the bug was somewhere I hadn't looked yet.

By Friday night I had genuinely started to wonder if I was going to find it at all.

Day 3: The Breakthrough

Saturday morning. Fresh coffee. No notifications. I opened the function again with no particular plan - just to read it one more time, slowly, with no assumptions about where the problem was.

Same code. Same behavior. But I added one more log I hadn't thought to add before:

print(f"items before processing: {items}")
print(f"items type: {type(items)}, length: {len(items)}")

The list had items. Three of them. Good — confirmed the input was right.

print(f"processed result: {processed}")
print(f"processed length: {len(processed) if processed else 'empty/None'}")

The processed result was empty.
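
One caveat about a log line like the one above: `if processed` is falsy for both `None` and `[]`, so the two cases print the same text. A minimal sketch of a logging helper that keeps them apart; the name `describe` is hypothetical and not part of the original code, and it assumes the value is either `None` or something with a length:

```python
def describe(value):
    """Return a short description that distinguishes None from an empty container."""
    if value is None:
        return "None"
    # repr() prints [] for an empty list, so it cannot be mistaken for None.
    return f"{type(value).__name__} of length {len(value)}: {value!r}"


print(f"processed result: {describe(None)}")
print(f"processed result: {describe([])}")
print(f"processed result: {describe(['a', 'b'])}")
```

With this, "the function returned nothing" and "the function returned an empty list" show up as two different log lines.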

I stared at the screen for a moment. Input: three items. Output: empty. The logic in between: apparently correct.

Then I looked at something I had looked at a dozen times before but never really seen. The function I was testing was calling a helper function. I had reviewed that helper function. I had read it carefully. But I had read it assuming the input would always have items - because in my testing, it always did.

The helper function wasn't checking. It was written assuming the list would have at least one item. When it did, it worked correctly and returned results. When it didn't, when the list was empty, it entered a code path that silently returned nothing instead of an empty list.

No exception. No warning. No log entry. Just nothing, wrapped up neatly and returned as if nothing was wrong.
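
The original helper isn't shown here, so what follows is only a hypothetical reconstruction of how this can happen in Python: a function that falls off the end without reaching a `return` statement gives back `None`, and no exception is raised. The name `summarize_items` and the processing it does are assumptions for illustration.

```python
def summarize_items(items):
    """Hypothetical helper: builds a summary only when there is something to summarize."""
    if items:
        return [item.strip().lower() for item in items]
    # No else branch: an empty list falls through, and Python returns None implicitly.


print(summarize_items(["  Apple ", "Pear"]))  # ['apple', 'pear']
print(summarize_items([]))                    # None
```

Nothing in this shape looks wrong in review if you only picture it with a non-empty list.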

if not items: return []

I added the line. Ran the test with an empty list. The function returned an empty list.

I ran the full test suite. Everything passed.

I deployed. The UI loaded. The user with the empty list finally saw something on their screen: an empty state message, exactly what they should have seen three days ago.

The fix took 10 seconds to write and 10 minutes to verify.

Finding it took 3 days.

What the Bug Taught Me

The 99% trap is real. Code that works most of the time is significantly harder to debug than code that fails loudly and immediately. The silent failure is the expensive one, because it doesn't announce itself and because you'll keep testing with the cases that work and never see the case that doesn't.

Assumptions are invisible debt. The AI assumed a list would never be empty, perhaps because most of the examples it learned from involved lists that had items. I assumed the AI had handled the edge cases because the code looked complete. Neither assumption was wrong on its own; they were just unchecked. And unchecked assumptions in production are loans you'll repay with interest.

"Works on my machine" is a specific kind of lie. It works on my machine because I test with data. The bug lived in the absence of data. The happy path works; the happy path always works. The skill is in finding the unhappy path before your users do.

The fix is almost never the hard part. Finding is hard. Fixing is easy. I spent 3 days finding a single unhandled edge case. I spent 10 minutes fixing it. The value in software development isn't in writing code; it's in knowing where to look when the code is wrong.

I should have asked "what happens when this gets nothing?" before shipping. Before the PR was approved. Before the tests ran. That question takes 30 seconds to ask and answer. It would have saved three days.
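
One way to turn a silent failure into a loud one is to check the result at the boundary where it leaves your code. A minimal sketch, assuming a hypothetical handler named `build_response` and a hypothetical `broken_helper` that has the same kind of bug; neither name comes from the original code:

```python
def build_response(items, helper):
    """Call helper(items) and refuse to pass along anything that is not a list."""
    result = helper(items)
    if not isinstance(result, list):
        # Fail here, with context, instead of letting None travel on to the UI.
        raise TypeError(
            f"{helper.__name__} returned {type(result).__name__} "
            f"for an input of length {len(items)}; expected a list"
        )
    return {"summary": result, "count": len(result)}


def broken_helper(items):
    """Hypothetical helper with the silent-None bug."""
    if items:
        return list(items)


print(build_response(["a", "b"], broken_helper))
try:
    build_response([], broken_helper)
except TypeError as exc:
    print(f"caught: {exc}")
```

The check doesn't fix the helper. It only makes sure the first empty list produces an exception with a name in it, which is the kind of thing that shows up in error logs.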

What I'm Doing Differently

Now I ask one question before I ship any function - before review, before testing, before deployment:

What happens when this gets nothing?

Empty list. Null input. Missing field. Zero results. The case where the happy path assumption is wrong.

I don't trust AI-generated code to ask that question for me. I don't trust myself to remember to ask it spontaneously. So I made it a rule, part of my personal pre-ship checklist, right after "does the happy path work?"

It takes about 30 seconds to answer. It would have saved me 3 days.

That's a trade-off I'll take every single time.
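
The "what happens when this gets nothing?" question can also be written down as a small, repeatable check. A sketch using only the standard library, assuming a hypothetical `summarize` function whose contract is "always return a list"; the function and the set of "nothing" inputs are illustrative, not the original code:

```python
def summarize(items):
    """Hypothetical function under test: always returns a list."""
    if not items:  # covers None and any empty sequence
        return []
    return [str(item).strip() for item in items if item is not None]


# Inputs that represent "nothing" in different ways.
NOTHING_CASES = [[], None, (), [None], [""]]


def check_nothing_cases(func):
    """Return the inputs for which func does not give back a list."""
    return [case for case in NOTHING_CASES if not isinstance(func(case), list)]


print(check_nothing_cases(summarize))  # []
```

An empty result means every "nothing" input still produced a list. Which inputs belong in `NOTHING_CASES` depends on the function, so treat the list above as a starting point.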


One Question

What's the longest you've spent debugging a bug that turned out to be a one-line fix?

Days? Hours? A week you'd rather forget?

I'll go first in the comments: 3 days, one missing edge case, one line of code I still think about every time I ship a new function.

Your turn. 👇

Top comments (30)

Ranjan Dailata

Coding is an art. Great programmers go beyond and submerge themselves inside the code, that's how they get the full insights and can potentially troubleshoot and solve impossible things.

Remember, everything is "Code". We live within this universe where you, me and everyone are basically a code. Well, the universe itself is a giant computer per se.

Harsh

Ranjan, "submerge themselves inside the code" - that's exactly what those 3 days felt like. Not debugging from the outside. Living inside the problem until the shape of it became visible.

The universe-as-code framing is beautiful. Bugs are just inconsistencies in the simulation? 😄

Thank you for the poetic perspective. 🙌

Ranjan Dailata

I am glad that you came out of a recursive exit. I can understand how hard it could be. At times, a line of code looks really simple and easy to go, but it can also create many wonders 😆

Harsh

"A line of code that looks simple can also create wonders" - that's the other side of the coin. The same line that took 3 days to find also made the feature work for every user after. Worth it.

Thanks for the beautiful conversation, Ranjan. 🙌

BridgeXAPI

One thing production systems teach very quickly:

A system isn't defined by what happens when everything is present.

It's defined by what happens when something is missing.

Empty lists.
Missing fields.
Zero liquidity.
No response.
No data.

Most incidents I've seen started with an assumption that one of those states would never happen.

Harsh

BridgeXAPI, "a system isn't defined by what happens when everything is present. It's defined by what happens when something is missing" - that's the line. That's the whole article in one sentence. We spend so much time testing the happy path.

The empty list, the missing field, the null response: those are the moments that define whether a system is actually ready. "Most incidents started with an assumption that one of those states would never happen." Yes. The assumption is always "this won't happen." And then it does.

Thank you for this. It's going on my wall. 🙌

BridgeXAPI

Haha, I think every engineer has at least one story like this.

You spend days looking for something complicated and in the end it's a tiny assumption hiding in plain sight.

The bug gets fixed in minutes.

The lesson sticks around for years.

Harsh

"Bug fixed in minutes. Lesson sticks for years." That's the real ROI of the 3 days. Worth it.

Thank you for the wisdom, BridgeXAPI. 🙌

BridgeXAPI

Appreciate that. 🙌

mote

The "99% trap" framing is accurate but I'd push the lesson further: the real problem isn't missing empty-list checks, it's that the function's contract was never written down. If you had typed the signature as fn summarize(items: &[Item]) -> Vec<Summary> with a doc comment stating "returns empty vec for empty input", the AI would have generated the guard, and any reviewer would have caught the gap.

Type-driven development catches these before they become three-day debugging sessions. Same principle applies to property-based testing — instead of asking "what happens with nothing?", you write a property that must hold for all inputs and let the tool find the counterexample.

The checklist habit is good, but IMO it's a workaround for not having the contract explicit.

Andrii Krugliak

The killer line is 'I had assumed the AI was right.' That empty-list assumption is exactly the failure that passes every test and ships clean, because the confident wrong answer looks identical to the correct one until production hands it the edge case. I now treat 'where did the agent assume instead of check' as the first place I look.

Harsh

Andrii, "where did the agent assume instead of check" - that's the one question that would have saved me 3 days. Not "is the code wrong?" Not "did the AI make a mistake?" Just "where did it assume something instead of checking?"

The wrong answer looks identical to the right one. That's what makes these bugs invisible. No error message, no crash, no red flag. Just plausible wrongness. Treat "where did the agent assume instead of check" as the first place to look.

This is going into my debugging checklist. Thank you. 🙌

Andrii Krugliak

Glad it stuck. The assume-vs-check question is the first one I run now too, because plausible-wrong never throws an error you can grep for. What helped was making the agent log its assumptions out loud, so the silent guess turns into a line you can see.

E Lion Reigns

This resonates hard — the fix is often tiny but the search space is huge. I have been logging webhook + PHP integration bugs the same way (timestamped CSV + replay notes) so the next 3am me does not repeat the hunt.

If you are ever debugging production glue (auth headers, SMTP, CORS), happy to swap war stories. I am Eric — building solo on elionmusic.com / prayerauthority.com and looking for more dev friends who get this grind.

Harsh

Eric, "the fix is tiny but the search space is huge" - that's the whole article in one sentence. Timestamped CSV + replay notes is smart. The 3am version of you will be grateful. Production glue bugs (auth headers, CORS, webhooks) are a special kind of nightmare: everything looks fine, nothing works, logs are silent. Would love to swap war stories sometime.

Always good to connect with folks who get the solo dev grind. Followed. 🙌

E Lion Reigns

Appreciate the follow, Harsh — war stories swap sounds good. My last "silent logs" win was a webhook that returned 200 but never hit our handler: turned out to be a trailing slash + reverse-proxy strip. Timestamped CSV saved me twice since. I'll DM you a link to the friends thread if you want to compare notes on production glue — always down to learn from someone who debugs for real. 🤙

Harsh

Eric, trailing slash + reverse-proxy strip is the kind of bug that makes you question reality. Everything looks right. Nothing works. Respect. Timestamped CSV saving you twice - that's the evidence. The system works.

Would love to see the friends thread. Always down to learn from someone who's been in the trenches.

Talk soon. 🙌

Cartone

hahah I wish I had read this 3 months ago, before starting vibe coding and finding out only after 90 sessions that "AI always tends to overcomplicate everything, to look for convoluted solutions, adding unnecessary layers of complexity on top of relatively simple reasoning" and that the best approach is "keep asking instead of nodding along. The idiot's question is the only weapon I've got."🤣

Harsh

Cartone, "the idiot's question is the only weapon I've got" - that's the real debugging tool 😂 Not the debugger. Not the logs. The willingness to ask "wait, why did it do that?" even when it sounds like a stupid question. 90 sessions to learn that lesson. Could be worse. Could be 900.

Keep asking the idiot's question. It's never actually idiotic. 🙌

Nimesh Kulkarni

This hits so hard. It’s wild how enterprise budgets will happily throw half a million dollars at a flashy vendor platform to tick a box, but deny an engineer $500 for an independent validation pipeline that actually keeps the system honest. Brilliant write-up.

Harsh

Nimesh, "half a million on a flashy vendor platform to tick a box, but deny $500 for an independent validation pipeline" - that's the quiet absurdity of enterprise budgets. The flashy thing gets approved. The boring thing that actually prevents 3-day debugging sessions? Denied. Every time.

The empty list bug didn't need a fancy solution. It needed a simple check. Someone to ask "what happens when this gets nothing?" But that question doesn't come in a shiny box with a sales deck.

Thanks for naming the budget irony. So real. 🙌

Nimesh Kulkarni

This captures the current era of engineering perfectly. When both the AI assistant and the human developer assume the input state is a given, we just end up automating invisible technical debt. Code that fails silently is infinitely more expensive than code that throws a loud exception. Great write-up!

Harsh

Nimesh, "automating invisible technical debt" - the quiet horror. Code works. Tests pass. Debt grows silently. No ticket tracks it. "Code that fails silently is infinitely more expensive than a loud exception." A loud crash gets fixed. Silent wrongness gets shipped.

Thank you. 🙌

Andy Stewart

"The devil is in the details." AI-generated logic is notorious for missing these defensive edge cases. As a 20-year Linux veteran, my biggest takeaway is that you should never fear a loud crash; fear the silent failure that swallows data without a trace. Always asking "what if it gets nothing" saves days of debugging.

Collapse
 
harsh2644 profile image
Harsh

Andy, "never fear a loud crash; fear the silent failure that swallows data without a trace" - that's the line. A loud crash announces itself. A silent failure just sits there. Data disappears. No alert. Invisible. 20 years of Linux experience in one sentence. Always asking "what if it gets nothing" saves days. Yes.

Thank you for the wisdom. 🙌

urmila sharma

I felt this in my bones. The smaller the bug, the harder it is to find. Thanks for sharing, Harsh. It makes the rest of us feel less alone.

Harsh

Urmila, "the smaller the bug, the harder it is to find" - exactly why it took 3 days. Nothing dramatic. Just invisible.

Glad it made you feel less alone. That's why I wrote it. 🙌
