Kaushikcoderpy

Posted on May 30

🧠 NeuroDoc: From Broken Prototype to Production-Ready Async AI Documentation Engine

#devchallenge #githubchallenge #python #ai

This is a submission for the GitHub Finish-Up-A-Thon Challenge

I abandoned this project. Then I resurrected it. Here's how a fragile CLI script became a full-stack async web dashboard with RAG capabilities.

System Execution Timeline

Time Interval	Activity Description
00:00 - 00:15	Code Runner Demo execution
00:15 - 00:35	System command execution sequence
00:35 - 00:45	Dependency analysis performed on `aiohttp`
00:45 - 00:55	RAG (Retrieval-Augmented Generation) task initiated
00:55	System telemetry dashboard display
00:55 - 01:20	Waiting for RAG results
01:21 - 01:28	Backend log visualization
01:29 - 01:41	RAG query results display

🧨 The Problem — Why It Was Abandoned

NeuroDoc started as an ambitious idea: a single tool to fetch, scrape, process, and summarize documentation across Python, scikit-learn, PyTorch, and TensorFlow — powered by NLP and multi-core processing.

But it hit a wall fast.

# The villain: a blocking synchronous loop that froze everything
while True:
    query = input("Enter query: ")  # 🚫 BLOCKS the main thread
    result = fetch_docs(query)      # 🚫 BLOCKS background workers
    print(result)

The original prototype had three fatal flaws:

Problem	Impact
`input()` loop on main thread	Blocked all background scraping workers
In-memory task queue	All pending jobs vanished on crash
Brittle core resolver	Failed silently on dynamic imports

Long-running doc crawls would stall. A single crash wiped the entire task queue. It was a house of cards — impressive from a distance, terrifying up close.

So I shelved it.

💡 The Comeback — What Changed

Months later, I came back with a clear head and a plan. The rewrite wasn't incremental — it was architectural. Three shifts made everything click:

1. 🔄 Full Async Rewrite with `asyncio` + `aiohttp`

Out went the blocking loop. In came a proper async event loop that lets scraping, processing, and serving happen concurrently without stepping on each other.

import asyncio

import aiohttp

# DocResult and process_content are not shown here.
async def fetch_documentation(url: str, session: aiohttp.ClientSession) -> DocResult:
    async with session.get(url, timeout=aiohttp.ClientTimeout(total=30)) as response:
        content = await response.text()
        return await process_content(content)

async def run_pipeline(urls: list[str]) -> list[DocResult | BaseException]:
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_documentation(url, session) for url in urls]
        # return_exceptions=True: a failed fetch comes back as an exception
        # object in the result list instead of propagating out of gather()
        return await asyncio.gather(*tasks, return_exceptions=True)

No more frozen terminals. No more stalled workers.
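One consequence of `return_exceptions=True` is that the result list mixes values and exception objects, so the caller has to sort them out. Here is a minimal sketch of that step; `fake_fetch` and `split_results` are hypothetical names used for illustration and are not part of NeuroDoc.

```python
import asyncio


async def fake_fetch(name: str) -> str:
    """Hypothetical stand-in for fetch_documentation(); fails for one input."""
    await asyncio.sleep(0)  # yield to the event loop, like a real network wait
    if name == "broken":
        raise ValueError(f"could not fetch {name}")
    return f"docs for {name}"


def split_results(names, results):
    """Separate successful values from exceptions returned by gather()."""
    ok, failed = {}, {}
    for name, result in zip(names, results):
        if isinstance(result, BaseException):
            failed[name] = result
        else:
            ok[name] = result
    return ok, failed


async def main():
    names = ["os", "broken", "json"]
    # gather() preserves input order, so zip(names, results) lines up
    results = await asyncio.gather(
        *(fake_fetch(n) for n in names), return_exceptions=True
    )
    return split_results(names, results)


ok, failed = asyncio.run(main())
print(ok)
print(failed)
```

One bad URL then shows up as an entry in `failed` while the other fetches still deliver their results.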

2. 🗄️ Database-Backed Task Queue (Goodbye, Volatile Memory)

The in-memory queue was replaced with a persistent, database-backed task queue using PostgreSQL and asyncpg. Now if the server crashes at 3 AM while crawling PyTorch docs, no work is lost. Tasks resume exactly where they left off.

class TaskQueue:
    async def enqueue(self, task: DocumentationTask) -> str:
        task_id = str(uuid.uuid4())
        await self.db.execute(
            "INSERT INTO tasks (id, status, payload, created_at) VALUES ($1, $2, $3, $4)",
            # asyncpg takes query arguments positionally, not as a tuple
            task_id, TaskStatus.PENDING, task.to_json(), datetime.utcnow(),
        )
        return task_id

    async def get_next(self) -> DocumentationTask | None:
        row = await self.db.fetchrow(  # asyncpg's single-row fetch is fetchrow()
            "SELECT * FROM tasks WHERE status = 'pending' ORDER BY created_at LIMIT 1"
        )
        return DocumentationTask.from_row(row) if row else None

3. 🔍 RAG — Retrieval-Augmented Generation Layer

This is where NeuroDoc levels up from "scraper" to "intelligent documentation assistant."

Instead of returning raw docs, it:

Chunks scraped content into semantic segments
Embeds them into a vector store
Retrieves the most relevant chunks for a query
Generates a focused, context-aware summary
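To make the chunking step concrete, here is a rough sketch that splits text into overlapping word windows. This is a deliberately crude substitute for real semantic segmentation: the function name `chunk_text` and its `max_words` and `overlap` parameters are assumptions for illustration, not NeuroDoc's actual chunker.

```python
def chunk_text(text: str, max_words: int = 50, overlap: int = 10) -> list[str]:
    """Split text into overlapping word windows (a crude stand-in for semantic chunking)."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reached the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(12))
chunks = chunk_text(sample, max_words=5, overlap=2)
for chunk in chunks:
    print(chunk)
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of some duplicated text in the store.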

class RAGPipeline:
    async def query(self, user_query: str) -> RAGResponse:
        # Step 1: Embed the query
        query_embedding = await self.embedder.embed(user_query)

        # Step 2: Retrieve top-k relevant chunks
        relevant_chunks = await self.vector_store.similarity_search(
            query_embedding, top_k=5
        )

        # Step 3: Generate grounded summary
        context = "\n\n".join(chunk.text for chunk in relevant_chunks)
        summary = await self.llm.generate(
            prompt=f"Answer based on this documentation:\n{context}\n\nQuery: {user_query}"
        )

        return RAGResponse(summary=summary, sources=relevant_chunks)
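The retrieval step is the part that tends to feel like magic, so here is a toy version of what a `similarity_search` call might do under the hood: score every stored embedding against the query by cosine similarity and keep the best `top_k`. `InMemoryVectorStore` and the three-dimensional embeddings are invented for the example, and this sketch is synchronous where the pipeline above awaits its store.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryVectorStore:
    """Hypothetical toy store; a real one would use an index, not a full scan."""

    def __init__(self):
        self._items = []  # list of (text, embedding) pairs

    def add(self, text: str, embedding: list[float]) -> None:
        self._items.append((text, embedding))

    def similarity_search(self, query_embedding: list[float], top_k: int = 5) -> list[str]:
        scored = [
            (cosine_similarity(query_embedding, emb), text)
            for text, emb in self._items
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:top_k]]


store = InMemoryVectorStore()
store.add("asyncio event loop", [1.0, 0.0, 0.0])
store.add("tensor autograd", [0.0, 1.0, 0.0])
store.add("aiohttp client sessions", [0.9, 0.1, 0.0])
top = store.similarity_search([1.0, 0.0, 0.0], top_k=2)
print(top)
```

A linear scan like this is fine for a few thousand chunks; beyond that, an approximate nearest-neighbour index is the usual next step.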

🏗️ Architecture Overview

┌─────────────────────────────────────────────────────┐
│                   Web Dashboard (FastAPI)            │
│              ┌──────────┬──────────────┐            │
│              │  Submit  │   Results    │            │
│              │  Query   │   Viewer     │            │
│              └────┬─────┴──────┬───────┘            │
└───────────────────┼────────────┼────────────────────┘
                    │            │
          ┌─────────▼────────────▼──────────┐
          │       Async Task Dispatcher      │
          │    (asyncio + DB task queue)     │
          └──────┬──────────────────┬────────┘
                 │                  │
    ┌────────────▼────┐    ┌────────▼────────────┐
    │  Multi-core     │    │   RAG Pipeline       │
    │  Doc Scraper    │    │  (Embed → Retrieve   │
    │  (aiohttp)      │    │   → Generate)        │
    └────────┬────────┘    └────────┬─────────────┘
             │                      │
    ┌────────▼──────────────────────▼─────────────┐
    │                PostgreSQL DB                 │
    │   (tasks · chunks · embeddings · results)    │
    └──────────────────────────────────────────────┘

📦 Supported Documentation Sources

Library	Sections Scraped	NLP Processing
🐍 Python	stdlib, builtins, language ref	Code extraction, summaries
🤖 scikit-learn	API reference, user guide	Table parsing, param docs
🔥 PyTorch	Tensor ops, nn, autograd	Code snippets, examples
🌊 TensorFlow	Keras, tf.data, layers	API signatures, guides

🚀 Getting Started

# Clone the repo
git clone https://github.com/kaushikcoderpy1/neurodoc
cd neurodoc

# Install dependencies
pip install -r requirements.txt

# Initialize the database
python -m neurodoc.db init

# Start the async dashboard
uvicorn neurodoc.app:app --reload --port 8000

Then open http://localhost:8000 and start querying.

🧪 Key Technical Decisions — And Why

Why asyncio over threading?
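Doc scraping is I/O-bound (waiting on HTTP). asyncio can keep thousands of requests in flight on a single thread, so there is no GIL contention, and because task switches happen only at await points, most of the race conditions that threads invite simply cannot occur.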
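Here is a small sketch of the single-thread model, assuming a semaphore is used to cap how many requests run at once. `fake_request`, the limit of 10, and the `stats` dictionary are illustrative choices, not NeuroDoc's real code.

```python
import asyncio
import threading


async def fake_request(i, limiter, stats):
    """Hypothetical request: asyncio.sleep stands in for waiting on HTTP."""
    async with limiter:
        # No lock needed: nothing else runs between these lines because
        # there is no await until the sleep below.
        stats["in_flight"] += 1
        stats["peak"] = max(stats["peak"], stats["in_flight"])
        stats["threads"].add(threading.get_ident())
        await asyncio.sleep(0.01)
        stats["in_flight"] -= 1
    return i


async def main(total=50, limit=10):
    limiter = asyncio.Semaphore(limit)  # at most `limit` requests in flight
    stats = {"in_flight": 0, "peak": 0, "threads": set()}
    results = await asyncio.gather(
        *(fake_request(i, limiter, stats) for i in range(total))
    )
    return results, stats


results, stats = asyncio.run(main())
print(len(results), "requests, peak concurrency", stats["peak"],
      "on", len(stats["threads"]), "thread")
```

The shared counters are updated without a lock, which is safe here only because no `await` sits between the read and the write. Put an `await` in the middle of such an update and the race conditions come back.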

Why SQLite for the task queue instead of Redis?
Zero infrastructure. NeuroDoc is a dev tool — adding a Redis dependency just to persist a queue adds friction. SQLite WAL mode handles concurrent reads/writes cleanly for this use case.
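For the SQLite variant described above, a crash-tolerant queue could look roughly like this. The schema mirrors the `tasks` table from earlier, but the `running` status, the recovery step in `open_queue`, and the function names are my assumptions for illustration. It also assumes a single worker process, since the select-then-update in `claim_next` is not atomic across processes.

```python
import os
import sqlite3
import tempfile
import uuid


def open_queue(path):
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")  # readers no longer block the writer
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        "id TEXT PRIMARY KEY, status TEXT NOT NULL, payload TEXT NOT NULL, "
        "created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )
    # Recovery: anything a crashed process left 'running' becomes 'pending' again
    conn.execute("UPDATE tasks SET status = 'pending' WHERE status = 'running'")
    conn.commit()
    return conn


def enqueue(conn, payload):
    task_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO tasks (id, status, payload) VALUES (?, 'pending', ?)",
        (task_id, payload),
    )
    conn.commit()
    return task_id


def claim_next(conn):
    # rowid breaks ties because CURRENT_TIMESTAMP only has one-second resolution
    row = conn.execute(
        "SELECT id, payload FROM tasks WHERE status = 'pending' "
        "ORDER BY created_at, rowid LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    conn.execute("UPDATE tasks SET status = 'running' WHERE id = ?", (row[0],))
    conn.commit()
    return row


db_path = os.path.join(tempfile.mkdtemp(), "queue.db")
conn = open_queue(db_path)
first = enqueue(conn, "crawl torch.nn")
enqueue(conn, "crawl tf.data")
claimed = claim_next(conn)  # first task is now 'running'
conn.close()                # simulate a crash before the task finished

conn = open_queue(db_path)  # restart: recovery puts the task back
recovered = claim_next(conn)
mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
conn.close()
print(recovered, mode)
```

After the simulated restart the same task is handed out again, which is the "no work is lost" property, at the price that a task may run twice and so should be safe to repeat.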

Why RAG over fine-tuning?
Documentation changes constantly. RAG retrieves from live-scraped content. A fine-tuned model would be stale in weeks.

🤖 How GitHub Copilot Saved NeuroDoc — 4 Critical Bugs It Helped Crush

This section is the heart of the comeback story. NeuroDoc didn't just get rewritten — it got debugged at a deep architectural level with Copilot as a true pair programmer. Here are four real, production-blocking bugs it helped resolve.

🐛 Bug 1: Async Database Connection Pool Leaks Under Multi-Core Batches

The failure: Under high-concurrency loads via asyncio.gather, edge-case exceptions inside sub-coroutines bypassed connection release hooks — leaving asyncpg pool sockets exhausted and the app hanging silently.

Standard try/finally cleanup blocks failed because they referenced stale async contexts. The pool hit max capacity and froze.

How Copilot helped:

Copilot introduced a strict connection acquisition pattern bound directly to local transaction lifecycles, with absolute timeout guards:

# Copilot-suggested acquisition pattern
async with pool.acquire() as connection:
    async with connection.transaction():
        result = await asyncio.wait_for(
            connection.fetch(query, *args),
            timeout=5.0  # Hard boundary — no silent hangs
        )

It also added global exception wrappers that translate raw driver errors into clean structured responses — guaranteeing connection cleanup even if the downstream scraping pipeline crashed.
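To check that this shape really releases connections on every exit path, here is a self-contained sketch with a toy pool. `ToyPool` is a hypothetical stand-in built on a semaphore and is not asyncpg's API; the point is only that `async with` plus `wait_for` returns the slot on success, on an exception, and on a timeout.

```python
import asyncio
from contextlib import asynccontextmanager


class ToyPool:
    """Hypothetical stand-in for a connection pool; not asyncpg's API."""

    def __init__(self, size):
        self._slots = asyncio.Semaphore(size)
        self.in_use = 0

    @asynccontextmanager
    async def acquire(self):
        await self._slots.acquire()
        self.in_use += 1
        try:
            yield self
        finally:
            # Runs on success, exception, and cancellation alike
            self.in_use -= 1
            self._slots.release()


async def run_query(pool, delay, fail=False):
    async with pool.acquire():
        if fail:
            raise RuntimeError("driver error")
        # wait_for cancels the inner await once the timeout elapses
        await asyncio.wait_for(asyncio.sleep(delay), timeout=0.05)
        return "ok"


async def main():
    pool = ToyPool(size=2)
    results = await asyncio.gather(
        run_query(pool, 0.0),             # succeeds
        run_query(pool, 0.0, fail=True),  # raises inside the block
        run_query(pool, 1.0),             # exceeds the timeout
        return_exceptions=True,
    )
    return results, pool.in_use


results, in_use = asyncio.run(main())
print(results, "connections still held:", in_use)
```

If `in_use` ever ends above zero in a test like this, some code path is skipping the release, which is the symptom the pool exhaustion started from.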

🐛 Bug 2: `SpecifierSet .contains()` AttributeError Across Packaging Versions

The failure: formatter.py runs dependency diagnostics via DependencyAnalyzer. On environments with older packaging library versions, calling .contains() on a SpecifierSet threw:

AttributeError: 'SpecifierSet' object has no attribute 'contains'

This crashed the entire diagnostic panel before it could render — silently breaking environment validation for a large chunk of users.

How Copilot helped:

Copilot suggested that .contains() could not be relied on in every installed packaging version, and that the native in operator, which SpecifierSet supports through __contains__, is the more portable way to test membership:

# ❌ Old failing code
elif not raw_spec.contains(local):

# ✅ Copilot's robust fix — works on every packaging version
elif local not in raw_spec:

One operator swap. Zero crashes across all environments.

🐛 Bug 3: Implicit String Mappings Breaking Single-Core CLI Dispatch

The failure: In neurodoc.py, CLI input like neurodoc fetch os passed the core ID "1" as a raw string into isinstance(core, Core1PythonBasics) checks. Since "1" is a string, every check silently fell through with:

Unknown core type for str

Worse — the topic "os" was passed into the batch resolver without list wrapping, so it iterated over the characters 'o' and 's' separately instead of treating "os" as a unified module name.

How Copilot helped:

Copilot introduced dynamic string dereferencing that maps string IDs back to their live handler instances, plus list-wrapping for topic encapsulation:

# Dynamic dereference — string → live core handler
if isinstance(core, str):
    core = self.command_handler.available_cores.get(core)

# Topic wrapped as list — no more character iteration
return await self.call_backend("core1", topics=[topic_f], flags=flags)
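Both halves of this bug are easy to reproduce in isolation. In the sketch below, `AVAILABLE_CORES`, `resolve_core`, and `normalize_topics` are hypothetical helpers written for the example, and `Core1PythonBasics` is an empty placeholder for the real handler class. Unlike the `.get()` call above, this version raises on an unknown ID rather than returning `None`.

```python
class Core1PythonBasics:
    """Hypothetical placeholder for the handler class named in the article."""


AVAILABLE_CORES = {"1": Core1PythonBasics()}


def resolve_core(core):
    """Map a string ID from the CLI to its live handler instance."""
    if isinstance(core, str):
        resolved = AVAILABLE_CORES.get(core)
        if resolved is None:
            raise KeyError(f"Unknown core id: {core!r}")
        return resolved
    return core  # already a handler instance


def normalize_topics(topics):
    """Wrap a bare string so callers iterate over topics, not characters."""
    if isinstance(topics, str):
        return [topics]
    return list(topics)


print(list("os"))              # what the resolver saw before the fix
print(normalize_topics("os"))  # what it sees after wrapping
print(isinstance(resolve_core("1"), Core1PythonBasics))
```

A string is itself an iterable of one-character strings in Python, which is why the unwrapped topic produced no error and quietly went wrong.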

🐛 Bug 4: NLP Tensor Dimension Mismatch in Cross-Encoder Similarity

The failure: nlp_with_cos.py calculates semantic similarity across documentation topics using PyTorch/TensorFlow models. Queries of varying lengths produced tensors with mismatched dimensions, throwing:

RuntimeError: Tensors must be of the same shape

This crashed deep multi-core fetches completely — the most expensive operation in the entire pipeline.

How Copilot helped:

Copilot suggested a preprocessing step that pads and truncates every input to a fixed length, so all token tensors share one shape before the cosine similarity matrix is computed:

# Copilot's shape-alignment fix
inputs = tokenizer(
    text,
    padding="max_length",
    truncation=True,
    max_length=512,
    return_tensors="pt"
)

All tensors now enter the similarity layer at identical dimensions — no shape mismatches, no crashes.
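Stripped of the tokenizer, the idea behind `padding="max_length"` and `truncation=True` fits in a few lines. This is a simplified sketch: `pad_or_truncate` is a hypothetical helper, the pad ID of 0 is an assumption (real tokenizers define their own), and real truncation also takes care to keep special tokens in place.

```python
PAD_ID = 0  # assumption: real tokenizers define their own pad token id


def pad_or_truncate(token_ids, max_length, pad_id=PAD_ID):
    """Return (ids, attention_mask), both exactly max_length long."""
    clipped = list(token_ids[:max_length])  # truncate long inputs
    padding = max_length - len(clipped)     # 0 when nothing needs padding
    attention_mask = [1] * len(clipped) + [0] * padding
    return clipped + [pad_id] * padding, attention_mask


short_ids, short_mask = pad_or_truncate([101, 7, 8, 102], max_length=6)
long_ids, long_mask = pad_or_truncate(list(range(1, 10)), max_length=6)
print(short_ids, short_mask)
print(long_ids, long_mask)
```

The attention mask matters as much as the padding. If embeddings are pooled without it, the pad positions leak into the vector and skew the cosine scores.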

💬 What Copilot Actually Felt Like as a Pair Programmer

These weren't simple autocomplete suggestions. Copilot reasoned about async lifecycle boundaries, cross-version API compatibility, type system edge cases, and linear algebra constraints — the kind of bugs that take hours of debugging to even locate, let alone fix.

The biggest unlock: it didn't just fix the symptom. For each bug, it explained why the original approach was fragile and offered a pattern that would hold up under production conditions.

That's the difference between a tool and a collaborator.

🔭 What's Next

[ ] Browser extension for one-click doc lookup
[ ] Streaming responses via WebSockets
[ ] Support for Hugging Face, Pandas, NumPy docs
[ ] Self-hosted embedding model (no API key required)
[ ] Export summaries as Jupyter notebooks

🔗 Links

📁 GitHub: github.com/kaushikcoderpy1/neurodoc
🎬 Demo Video: YouTube Preview

Built for the DEV.to hackathon. Powered by stubbornness, async Python, and too much coffee.
