Ananya S

πŸš€ Stop Hallucinating! Build a RAG Chatbot in 5 Minutes with LangChain

Ever asked an AI about something that happened yesterday, only for it to confidently lie to your face? That’s because LLMs are frozen in timeβ€”limited by their training data.

Enter RAG (Retrieval-Augmented Generation). It’s like giving your AI an open-book exam. Instead of guessing, it looks up the answer in your documents first.

In this post, we’re building a simple RAG pipeline using LangChain. Let’s dive in! πŸŠβ€β™‚οΈ

πŸ”₯ The "Big Idea"

RAG works in three simple steps:

Index: Chop your documents into small "chunks" and turn each one into a numerical vector (an embedding).

Retrieve: When a user asks a question, find the chunks whose vectors best match it.

Augment: Stuff those chunks into the prompt so the model answers from your data instead of guessing. (See the pseudocode sketch right below.)
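
In (hypothetical) pseudocode, where split, embed, store, and llm are placeholders for the LangChain pieces we'll wire up below:

# INDEX (once, offline): embed each chunk and store it
for chunk in split(documents):
    store.add(embed(chunk), chunk)

# RETRIEVE (per question): find the closest chunks
top_chunks = store.search(embed(question), k=4)

# AUGMENT: stuff the chunks into the prompt
answer = llm(f"Context:\n{top_chunks}\n\nQuestion: {question}")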

πŸ› οΈThe Setup
You'll need a few libraries. Open your terminal and run:

pip install langchain langchain-openai langchain-community langchain-text-splitters chromadb pypdf

πŸ’» The Code
Here is a complete, minimal script to chat with a PDF. Replace the "sk-..." placeholder with your actual OpenAI API key.

import os
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import Chroma
from langchain.chains import RetrievalQA

# 1. Set your API Key
os.environ["OPENAI_API_KEY"] = "sk-..."

# 2. Load your data (Change this to your PDF path!)
loader = PyPDFLoader("my_awesome_doc.pdf")
data = loader.load()

# 3. Chop it up! (Chunking)
# We split text so the AI doesn't get overwhelmed.
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = text_splitter.split_documents(data)

# 4. Create the "Brain" (Vector Store)
# This turns text into vectors and stores them locally.
vectorstore = Chroma.from_documents(
    documents=chunks, 
    embedding=OpenAIEmbeddings()
)

# 5. Build the RAG Chain
llm = ChatOpenAI(model="gpt-4o", temperature=0)
rag_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff", # "Stuff" all chunks into the prompt
    retriever=vectorstore.as_retriever()
)

# 6. Ask away!
question = "What is the main conclusion of this document?"
response = rag_chain.invoke({"query": question})

print(f"πŸ€– AI: {response['result']}")

πŸ€” Why did we do that?
RecursiveCharacterTextSplitter: Why not just feed the whole PDF? Because LLMs have a limited "context window," and retrieval works best on small, focused pieces. Chunking keeps the info bite-sized and relevant.
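
Want to see the splitter (and the overlap) in action? Run it on a plain string with smaller numbers so the effect is obvious:

from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(chunk_size=60, chunk_overlap=15)
text = "RAG has three steps. First we index the documents. Then we retrieve matching chunks. Finally we augment the prompt."
for chunk in splitter.split_text(text):
    print(repr(chunk))  # each chunk is at most 60 chars; neighbors share up to 15 chars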

ChromaDB: This is our vector database (in-memory by default, so it vanishes when the script ends). It stores the embedding of each chunk so we can search by meaning, not just keywords.
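
Don't take retrieval on faith: you can query the vector store directly and inspect what the retriever would hand to the LLM (reusing vectorstore from the script above):

docs = vectorstore.similarity_search("What is the main conclusion?", k=3)
for doc in docs:
    print(doc.page_content[:100])  # first 100 characters of each matching chunk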

chain_type="stuff": This is the funniest name in LangChain. It literally means "stuff all the retrieved documents into the prompt."

🌟 Pro-Tips for the Road
Overlap matters: Notice chunk_overlap=100? If an idea gets split at a chunk boundary, the overlap keeps the surrounding context alive in both chunks.

Local Models: Don't want to pay for OpenAI? Swap ChatOpenAI (and the embeddings!) for Ollama and run 100% locally. A minimal sketch below.
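
A minimal sketch, assuming you have Ollama running, the langchain-ollama package installed, and the named models pulled (the model names here are just examples):

from langchain_ollama import ChatOllama, OllamaEmbeddings

llm = ChatOllama(model="llama3.1", temperature=0)        # local chat model
embeddings = OllamaEmbeddings(model="nomic-embed-text")  # local embedding model

Pass embeddings to Chroma.from_documents in place of OpenAIEmbeddings(), or your vectors will still go through OpenAI's API.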

Garbage In, Garbage Out: If your PDF is a messy scan, your RAG will be messy too. Clean your data!

🎁 Wrapping Up

You just built the core logic loop behind most RAG products. RAG is the backbone of countless AI tools today. Whether it's a legal bot, a medical assistant, or a "Chat with your Resume" tool, you now have the blueprint.

What are you planning to build with RAG? Let me know in the comments! πŸ‘‡
