Sahil Singh

Why Copilot Doesn't Work on Your Hardest Tickets

Copilot is great at autocompleting a for loop. It's terrible at understanding why your payment service has three different provider selection strategies depending on the customer's region, account age, and subscription tier.

This isn't a Copilot problem. It's a context problem. And it affects every AI coding tool — Cursor, Cline, Claude Code — the same way.

Where AI Coding Tools Actually Break Down

1. Cross-File Changes

Copilot sees the file you're in. Maybe a few neighbors if you're using Cursor with codebase indexing. But a typical feature ticket touches 5-15 files across 3-4 directories. The AI can't see the full picture.

Try this experiment: give Copilot a ticket that requires changes to an API endpoint, a database migration, a service layer function, a React component, and a test file. It'll give you reasonable suggestions for each file in isolation. But the suggestions won't be coordinated — the API response shape won't match the component's expectations, the migration won't match the service's queries.
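Here's a sketch of that failure mode (file names and shapes are hypothetical, but the mismatch is typical):

    // api/invoices.ts -- suggested in isolation, returns snake_case fields
    export async function getInvoice(id: string) {
      return { invoice_id: id, amount_cents: 4200, due_date: '2025-07-01' };
    }

    // components/InvoiceRow.tsx -- suggested in isolation, expects camelCase
    import React from 'react';

    type Invoice = { invoiceId: string; amountCents: number; dueDate: string };

    export function InvoiceRow({ invoice }: { invoice: Invoice }) {
      // Each suggestion is reasonable on its own. Together they break:
      // the API sends invoice_id and amount_cents, so these render undefined.
      return <span>{invoice.invoiceId}: {invoice.amountCents / 100}</span>;
    }

Nothing flags the mismatch, because no single file contains it.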

2. Legacy Code With Tribal Knowledge

Your auth system works the way it does because of a production incident 2 years ago. The workaround is documented in a PR description that nobody reads. When Copilot suggests a "cleaner" auth implementation, it doesn't know about that incident. The "improvement" reintroduces the bug.
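In code, tribal knowledge usually looks like a "redundant" check. A hypothetical reconstruction of that kind of workaround:

    // auth/session.ts (hypothetical)
    type Token = { userId: string };

    // Placeholder stubs standing in for the real gateway and revocation store.
    const gatewaySaysValid = (_token: Token): boolean => true;
    const isRevoked = (_userId: string): boolean => false;

    export function isSessionValid(token: Token): boolean {
      if (!gatewaySaysValid(token)) return false;
      // Looks redundant -- the gateway already validated. It isn't: two
      // years ago the gateway cached stale revocations and deleted accounts
      // stayed logged in. The explanation lives in an old PR description.
      return !isRevoked(token.userId);
    }

The second check is exactly what a "cleaner" AI rewrite deletes first.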

AI tools optimize for code quality in isolation. They don't know your code's history — the regressions, the intentional workarounds, the business rules encoded in seemingly arbitrary conditions.

3. Business Logic Encoded in Code

if (customer.created_at < '2024-01-01' && customer.plan !== 'enterprise') — this isn't a code pattern. It's a business rule. Copilot has no idea why that date matters. A developer who's been on the team for a year might not know either. But that line exists because of a pricing migration that grandfathered certain customers.
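Written out with the context it actually depends on (the comment is a hypothetical reconstruction; the condition is from the snippet above):

    // billing/pricing.ts (hypothetical)
    type Customer = { created_at: string; plan: 'free' | 'pro' | 'enterprise' };

    // Customers created before 2024-01-01 were grandfathered into legacy
    // pricing during the Jan 2024 migration; enterprise plans were carved
    // out separately. Delete this branch and grandfathered customers get
    // billed at the new rates.
    function usesLegacyPricing(customer: Customer): boolean {
      return customer.created_at < '2024-01-01' && customer.plan !== 'enterprise';
    }

Without that comment, the condition looks like dead code waiting to be cleaned up.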

AI tools can complete code patterns. They can't understand business context.

4. Architecture Decisions

"Should this be a new microservice or a module in the existing service?" "Should we use WebSockets or SSE for this real-time feature?" "Should we add a caching layer here?"

These questions require understanding the full system: current load patterns, team expertise, operational complexity, future roadmap. No AI coding tool has this context.

The Context Hierarchy

AI coding tools operate at different levels of context:

Level 1: Line-level (Copilot inline completions)
Context: the current line and a few surrounding lines.
Good for: syntax completion, common patterns.
Fails at: anything requiring reasoning beyond that small window.

Level 2: File-level (Copilot chat, basic Cursor)
Context: the current file.
Good for: function implementations, refactoring within a file.
Fails at: cross-file changes, understanding imports.

Level 3: Project-level (Cursor with indexing, Sourcegraph Cody)
Context: the full repository, indexed for search.
Good for: finding related code, understanding imports.
Fails at: understanding WHY code is structured a certain way, historical context.

Level 4: Knowledge-level (what's missing)
Context: full repository + git history + feature boundaries + tribal knowledge + dependency graphs.
Good for: understanding what to build, not just how to write it.
This is the level that doesn't exist in current tools — and it's why they fail on hard tickets.

The Complementary Approach

The fix isn't better AI completion. It's giving AI tools better context.

When a developer works on a complex ticket with Glue:

  1. Glue maps the ticket to code — which files, functions, and features are affected
  2. Glue surfaces the context — who last changed this code, what past issues occurred, what dependencies exist
  3. The developer feeds this context to their AI tool — now Cursor/Copilot has codebase-level understanding
  4. The AI generates better suggestions — because it's working with full context, not just the current file

This is the stack that works: understanding layer (Glue) → reasoning layer (Claude Code) → generation layer (Cursor/Copilot). Each layer feeds the next.
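In practice, step 3 can be as simple as prepending the surfaced context to your prompt. A minimal sketch, assuming a plain-object shape for that context (illustrative only — this is not Glue's actual output format):

    // Hypothetical shape for the context surfaced in step 2.
    type TicketContext = {
      affectedFiles: string[];
      recentAuthors: string[];
      pastIncidents: string[];
      dependencies: string[];
    };

    // Step 3: turn the surfaced context into a preamble for Cursor/Copilot.
    function buildPrompt(ticket: string, ctx: TicketContext): string {
      return [
        `Ticket: ${ticket}`,
        `Affected files: ${ctx.affectedFiles.join(', ')}`,
        `Recent authors: ${ctx.recentAuthors.join(', ')}`,
        `Known incidents/workarounds: ${ctx.pastIncidents.join('; ')}`,
        `Dependencies to respect: ${ctx.dependencies.join(', ')}`,
        'Propose coordinated changes across ALL affected files.',
      ].join('\n');
    }

The exact format matters less than the fact that the AI now sees history and boundaries it could never infer from the open file.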

The Bottom Line

Your AI coding tools aren't underperforming because the AI is bad. They're underperforming because they don't have the context they need. The hardest tickets require knowledge that lives outside any single file — in git history, in feature boundaries, in the team's collective memory.

Give them that context, and they transform from autocomplete toys into genuine productivity multipliers.


Originally published on glue.tools. Glue is the pre-code intelligence platform — paste a ticket, get a battle plan.
