This is a submission for the Algolia Agent Studio Challenge: Consumer-Facing Conversational Experiences
💡 What I Built
I built the Awesome Copilot Guide: think of it as your friendly AI sidekick that helps you navigate the massive Awesome GitHub Copilot ecosystem without losing your mind.
Here's the thing: there are hundreds of agents, skills, and prompts out there. Finding the right one for your specific workflow? It's like trying to find that one specific LEGO piece in a giant bin.
The Awesome Copilot Guide cuts through the noise. Just tell it what you're working on (your tech stack, your current task, whatever) and it'll hook you up with a personalized, curated list of recommendations. No more endless scrolling through markdown lists!
Tech Stack:
- Next.js 16 with TypeScript
- Tailwind CSS and Shadcn UI
- Algolia Agent Studio powered by Google Gemini
🎬 Demo
Live Demo: https://mouadbourbian.github.io/Awesome-Copilot-Guide
GitHub Link: https://github.com/MouadBourbian/Awesome-Copilot-Guide
🛠️ How I Used Algolia Agent Studio
I used Algolia Agent Studio to ground the conversational AI in real-time, searchable data. Here's how everything comes together:
1. 📊 Indexing the Data
The magic starts with good data. I built a custom Python script (algolia_generate_index.py) that fetches the raw llms.txt file from the official Awesome Copilot GitHub Pages. The script parses the markdown structure and transforms each entry into a structured record:
- Name: What the resource is called
- Type: Whether it's an Agent, Skill, or Prompt
- Description: What it actually does (the value prop)
- URL: Direct link to dive deeper
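To make the shape concrete, a single parsed entry ends up looking roughly like this (the field values here are made up for illustration; real records come from llms.txt):

```json
{
  "objectID": "agents-nextjs-expert",
  "name": "Next.js Expert",
  "type": "Agent",
  "description": "Helps scaffold and review Next.js applications.",
  "url": "https://github.com/github/awesome-copilot/tree/main/agents/nextjs-expert.md"
}
```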
algolia_generate_index.py:
"""
This module provides functionality to fetch and parse an 'llms.txt' file
from a remote URL and convert it into a JSON index.
"""
import re
import json
import urllib.request
from urllib.error import URLError, HTTPError
def fetch_and_parse_llms_txt(url, output_file):
"""
Fetches the llms.txt content from the given URL, parses the markdown links,
and saves the structured data to a JSON file.
Args:
url (str): The URL of the llms.txt file.
output_file (str): The path where the JSON output will be saved.
"""
records = []
current_section = "General"
# Matches: - [Name](URL): Description
pattern = r'- \[(.*?)\]\((.*?)\): (.*)'
print(f"Fetching content from: {url}")
try:
# 1. Fetch the content from the web
with urllib.request.urlopen(url) as response:
content = response.read().decode('utf-8')
lines = content.splitlines()
for line in lines:
line = line.strip()
if not line:
continue
# Detect Section (e.g., ## Agents)
if line.startswith('## '):
current_section = line.replace('## ', '').strip()
continue
# Extract Item Details
match = re.search(pattern, line)
if match:
name = match.group(1).strip()
full_url = match.group(2).strip()
description = match.group(3).strip()
# 2. Convert Raw URL to GitHub Viewer URL
github_url = full_url.replace(
"https://raw.githubusercontent.com/github/awesome-copilot/main/",
"https://github.com/github/awesome-copilot/tree/main/"
)
# Clean ID
safe_name = re.sub(r'[^a-zA-Z0-9]', '-', name.lower())
safe_name = re.sub(r'-+', '-', safe_name).strip('-')
object_id = f"{current_section.lower().split()[0]}-{safe_name}"
type_singular = (
current_section[:-1]
if current_section.endswith('s')
else current_section
)
# Create Record
record = {
"objectID": object_id,
"name": name,
"type": type_singular,
"description": description,
"url": github_url
}
records.append(record)
# 3. Save to File
with open(output_file, 'w', encoding='utf-8') as f:
json.dump(records, f, indent=2)
print(f"Success! Processed {len(records)} items.")
print(f"Saved to: {output_file}")
except HTTPError as e:
print(f"HTTP Error {e.code}: {e.reason}")
except URLError as e:
print(f"URL Error: {e.reason}")
except (OSError, IOError) as e:
print(f"File system error: {e}")
if __name__ == "__main__":
URL = 'https://github.github.io/awesome-copilot/llms.txt'
OUTPUT_FILE = 'awesome-copilot.json'
fetch_and_parse_llms_txt(URL, OUTPUT_FILE)
All of this gets structured into a clean JSON index (awesome-copilot.json) and uploaded directly through the Algolia Dashboard. To make filtering work smoothly, I configured the type attribute as a facet in the index settings, so the agent can quickly narrow down results by resource type.
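If you'd rather script the upload and facet configuration instead of clicking through the Dashboard, a minimal sketch with the official algoliasearch Python client (v3-style API) could look like this. The index name and the exact settings are my assumptions here, not taken from the project:

```python
import json

# Settings sketch: expose `type` as a facet so the agent can narrow
# results to Agents, Skills, or Prompts at query time.
INDEX_SETTINGS = {
    "attributesForFaceting": ["searchable(type)"],
    "searchableAttributes": ["name", "description", "type"],
}


def push_index(app_id, admin_key, index_name="awesome-copilot"):
    """Upload the generated records and apply the facet settings.

    Requires `pip install algoliasearch` and a valid admin API key;
    nothing is sent anywhere until this function is actually called.
    """
    from algoliasearch.search_client import SearchClient  # v3 client

    client = SearchClient.create(app_id, admin_key)
    index = client.init_index(index_name)

    with open("awesome-copilot.json", encoding="utf-8") as f:
        records = json.load(f)

    index.save_objects(records)  # objectID is already set per record
    index.set_settings(INDEX_SETTINGS)
```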
You can check out the detailed setup steps I followed here: Algolia Configuration Guide
2. 💬 The Conversational Experience
On the frontend, I'm using the AI SDK (@ai-sdk/react) to connect directly to the Algolia Agent API. Rather than dropping in a standard widget, I wrapped the powerful Algolia retrieval engine inside a fully custom Shadcn UI interface. The AlgoliaChat component streams responses in real time, creating that smooth, natural chat experience where the AI feels like a helpful sidekick on your team.
algolia-chat.tsx:
```tsx
const { messages, sendMessage, status } = useChat({
  transport: new DefaultChatTransport({
    api: `https://${ALGOLIA_APP_ID.toLowerCase()}.algolia.net/agent-studio/1/agents/${AGENT_ID}/completions?stream=true&compatibilityMode=ai-sdk-5`,
    headers: {
      "x-algolia-application-id": ALGOLIA_APP_ID,
      "x-algolia-api-key": ALGOLIA_API_KEY,
    },
  }),
});
```
3. 🎯 Targeted Prompting
Quality control was key here. I engineered a two-layer defense against hallucinations:
Layer 1: Index Configuration
I configured the index description in Agent Studio to explicitly state:
"A catalog of GitHub Copilot agents, skills, instructions, prompts, and documentation for developers. Must be searched before any recommendation. This is the sole source of truth for all suggestions."
Layer 2: Mandatory Search Protocol
I crafted a strict system prompt that forbids the model from guessing.
[...]
## Mandatory Search Protocol
**YOU ARE STRICTLY FORBIDDEN from providing final recommendations or answers without first searching the catalog.**
**Before every recommendation:**
1. **MUST search the catalog first:** No exceptions, even if you think you know the answer
2. **MUST base all recommendations on actual search results:** Never rely on assumptions or memory
3. **MUST verify resources exist and are current:** Links and details must come from search results
4. **If search returns no results:** Try broader terms before concluding nothing exists
**Violations of this rule are unacceptable.** If you cannot search, you must inform the user that you cannot provide recommendations without search capability.
[...]
This forces the Agent to query the Algolia index before answering anything, which means it can't just make stuff up. Every recommendation is grounded in real, verified resources.
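The spirit of the protocol can be sketched as a simple guard: the answer step is only reachable once a search has returned grounded results, and an empty result triggers one broader retry. This is an illustration of the policy, not the actual Agent Studio internals; the function names and records are hypothetical:

```python
def recommend(query, search_fn):
    """Return recommendations only when the catalog search succeeds.

    `search_fn` stands in for the Algolia retrieval tool; it should
    return a list of records like {"name": ..., "url": ...}.
    """
    hits = search_fn(query)
    if not hits:
        # Retry with a broader term before giving up (rule 4).
        broader = query.split()[0] if query.split() else query
        hits = search_fn(broader)
    if not hits:
        return "No matching resources found in the catalog."
    # Every recommendation is backed by a real record from the index.
    return [f"{h['name']} -> {h['url']}" for h in hits]


# Tiny in-memory stand-in for the index:
catalog = [{"name": "Next.js Agent", "url": "https://example.com/nextjs"}]


def fake_search(q):
    return [r for r in catalog if "next" in q.lower()]


print(recommend("Next.js testing tools", fake_search))
# -> ['Next.js Agent -> https://example.com/nextjs']
```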
⚡ Why Fast Retrieval Matters
For a directory assistant like this, speed and accuracy aren't just nice-to-haves. They're everything. Here's why:
🎭 Eliminating Hallucinations
By using Algolia's retrieval as the foundation, the AI only recommends resources that actually exist in the catalog. It can't invent a link to a resource because it has to find it first. No more phantom recommendations!
🔍 Contextual Relevance
Algolia's lightning-fast search filters through hundreds of items in milliseconds. When someone asks for "Next.js tools," the system instantly narrows down the context, letting the LLM focus only on relevant results instead of fumbling through its general knowledge.
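Conceptually, the `type` facet acts as a pre-filter over the records before ranking. A local analogy (with made-up records) of what a filter like `facetFilters=["type:Agent"]` achieves:

```python
records = [
    {"name": "Next.js Agent", "type": "Agent"},
    {"name": "SQL Prompt", "type": "Prompt"},
    {"name": "Testing Skill", "type": "Skill"},
]


def facet_filter(items, facet, value):
    """Keep only records whose facet attribute matches the value,
    analogous to Algolia's facetFilters=["type:Agent"]."""
    return [r for r in items if r.get(facet) == value]


agents = facet_filter(records, "type", "Agent")
print([r["name"] for r in agents])  # -> ['Next.js Agent']
```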
🏃 User Flow
The combination of Algolia's speed and streaming responses means users get actionable answers immediately. They spend less time searching and more time building. That's the whole point, right?
