A step-by-step guide to building a local AI document processor that makes zero external network calls — useful for processing NDA-bound contracts, confidential reports, or any document you can't upload to ChatGPT.
Architecture overview
PDF/DOCX file
↓
pdfplumber / python-docx (text extraction)
↓
System prompt + document text
↓
Ollama API (localhost:11434)
↓
Gradio UI (localhost:7860)
↓
Summary / Q&A / entities
Everything runs on localhost. Zero cloud dependencies at runtime.
Prerequisites
- Python 3.11+
- Ollama installed and running
- 8GB+ RAM (16GB recommended)
# Install Ollama (Windows)
winget install Ollama.Ollama
# Pull a model
ollama pull llama3.1:8b
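Before handling private files it can help to confirm that Ollama is reachable and that the model is already on disk, so nothing has to be downloaded mid-session. A rough sketch, assuming Ollama's /api/tags endpoint (which lists locally available models) on the default port. The helper names model_names and model_is_available are illustrative, and the sketch uses only the standard library:

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default port on this machine.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"


def model_names(tags_payload: dict) -> set[str]:
    """Collect model names from the body of an /api/tags response."""
    return {m["name"] for m in tags_payload.get("models", []) if "name" in m}


def model_is_available(model: str, timeout: float = 5.0) -> bool:
    """Return True if Ollama answers locally and `model` is already pulled."""
    try:
        with urllib.request.urlopen(OLLAMA_TAGS_URL, timeout=timeout) as resp:
            payload = json.load(resp)
    except (OSError, ValueError):
        # OSError covers connection failures; ValueError covers invalid JSON.
        return False
    return model in model_names(payload)
```

Calling model_is_available("llama3.1:8b") at startup and refusing to continue when it returns False is one way to surface a missing model early.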
Core dependencies
pip install gradio pdfplumber python-docx requests
Step 1: Text extraction
import pdfplumber
import docx
from pathlib import Path
def extract_text(file_path: str) -> str:
    """Return the plain text of a PDF or DOCX file."""
    path = Path(file_path)
    suffix = path.suffix.lower()
    if suffix == ".pdf":
        with pdfplumber.open(file_path) as pdf:
            # Pages without extractable text contribute an empty string
            return "\n\n".join(
                page.extract_text() or "" for page in pdf.pages
            )
    elif suffix == ".docx":
        # python-docx reads .docx only; legacy .doc files are not supported
        doc = docx.Document(file_path)
        return "\n".join(p.text for p in doc.paragraphs if p.text.strip())
    raise ValueError(f"Unsupported file type: {path.suffix}")
Step 2: Ollama integration
import requests
OLLAMA_URL = "http://localhost:11434/api/generate"
def query_ollama(prompt: str, model: str = "llama3.1:8b") -> str:
    response = requests.post(OLLAMA_URL, json={
        "model": model,
        "prompt": prompt,
        "stream": False,
    }, timeout=120)
    response.raise_for_status()
    return response.json()["response"]
Note: http://localhost:11434 is Ollama's default local endpoint, not a cloud API, so no API key or authentication is needed.
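The same endpoint also accepts sampling settings and can stream its output. A minimal sketch, assuming Ollama's documented behaviour that settings such as temperature sit under an "options" object in the request body and that streamed output arrives as one JSON object per line. The names build_payload and iter_response_text are illustrative, not part of the code above:

```python
import json
from typing import Iterable, Iterator


def build_payload(prompt: str, model: str = "llama3.1:8b",
                  temperature: float = 0.1, stream: bool = True) -> dict:
    """Build an /api/generate request body with sampling options."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": stream,
        "options": {"temperature": temperature},  # low value for extraction tasks
    }


def iter_response_text(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from streamed output (one JSON object per line)."""
    for raw in lines:
        if not raw.strip():
            continue  # skip blank lines between chunks
        chunk = json.loads(raw)
        if chunk.get("response"):
            yield chunk["response"]
        if chunk.get("done"):
            break  # the final chunk carries "done": true
```

With requests, passing stream=True to requests.post and feeding response.iter_lines() into iter_response_text should produce fragments that a UI can append as they arrive.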
Step 3: Domain-specific system prompts
Generic prompts give generic results. The prompts below are tuned to specific document types:
DOMAIN_PROMPTS = {
    "legal": (
        "You are a legal document analyst. Extract and structure the following "
        "from the document:\n"
        "1. PARTIES: All named parties and their roles\n"
        "2. KEY DATES: Effective date, termination, deadlines\n"
        "3. OBLIGATIONS: Each party's obligations\n"
        "4. PAYMENT TERMS: Amounts, schedules, conditions\n"
        "5. UNUSUAL CLAUSES: Non-standard or notable provisions\n"
        "6. GOVERNING LAW: Jurisdiction and dispute resolution\n"
        "Be factual and precise. Do not interpret or give legal advice."
    ),
    "financial": (
        "You are a financial document analyst. Extract:\n"
        "1. AMOUNTS: All monetary values with context\n"
        "2. DATES: Payment dates, fiscal periods, deadlines\n"
        "3. PARTIES: Vendors, clients, counterparties\n"
        "4. TERMS: Payment terms, penalties, conditions\n"
        "5. KEY METRICS: Revenue, costs, margins if present"
    ),
}

def process_document(file_path: str, domain: str, model: str) -> str:
    text = extract_text(file_path)
    system = DOMAIN_PROMPTS.get(domain, "Summarize the key points of this document.")
    prompt = f"{system}\n\nDOCUMENT:\n{text[:12000]}"  # ~12k char limit
    return query_ollama(prompt, model)
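The 12,000-character slice silently drops everything after that point. One possible alternative, not part of the original design, is to split a long document into overlapping windows and process each one in turn. A sketch, where split_into_chunks and the 500-character overlap are assumptions chosen for illustration:

```python
def split_into_chunks(text: str, max_chars: int = 12000, overlap: int = 500) -> list[str]:
    """Split text into windows of at most max_chars that overlap by `overlap` chars."""
    if max_chars <= overlap:
        raise ValueError("max_chars must be larger than overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # this window reached the end of the document
        start += max_chars - overlap  # step forward, keeping some shared context
    return chunks
```

Each chunk could then be sent with the same domain prompt and the per-chunk results concatenated. The overlap reduces the chance that a clause is cut in half at a boundary, at the cost of some duplicated output.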
Step 4: Privacy-safe Gradio UI
import gradio as gr
def build_ui():
    # analytics_enabled is an argument of gr.Blocks, not of launch()
    with gr.Blocks(title="Local Document Processor", analytics_enabled=False) as app:
        gr.Markdown("## Local Document Processor\n*All processing on your machine, no cloud*")
        with gr.Row():
            file_input = gr.File(label="Upload PDF or DOCX", file_types=[".pdf", ".docx"])
            domain = gr.Dropdown(
                choices=list(DOMAIN_PROMPTS.keys()),
                value="legal",
                label="Domain"
            )
        process_btn = gr.Button("Process Document", variant="primary")
        output = gr.Textbox(label="Result", lines=20)
        process_btn.click(
            fn=lambda f, d: process_document(f.name, d, "llama3.1:8b"),
            inputs=[file_input, domain],
            outputs=output,
        )
    return app

if __name__ == "__main__":
    app = build_ui()
    app.launch(
        server_name="127.0.0.1",  # localhost only
        share=False,              # no Gradio tunnel
    )
Step 5: Batch processing
For processing entire folders:
import zipfile
import tempfile
from pathlib import Path
def batch_process(folder_path: str, domain: str, model: str) -> str:
    results = {}
    for file in Path(folder_path).glob("*"):
        if file.suffix.lower() in (".pdf", ".docx"):
            try:
                results[file.name] = process_document(str(file), domain, model)
            except Exception as e:
                results[file.name] = f"ERROR: {e}"
    # Package results as ZIP
    with tempfile.NamedTemporaryFile(suffix=".zip", delete=False) as tmp:
        with zipfile.ZipFile(tmp.name, "w") as zf:
            for filename, content in results.items():
                zf.writestr(f"{filename}.txt", content)
    return tmp.name
Performance tips
- Context window: Truncate documents to ~12,000 characters for reliable results with 8b models
- Temperature: Set "temperature": 0.1 inside the request's "options" object for factual extraction (less hallucination)
- Streaming: Use "stream": True for better UX on long documents, updating the UI in real time
- Model selection: qwen2.5:3b for speed, llama3.1:8b for quality, llama3.1:70b for accuracy
Verification
Run Wireshark with the capture filter not host 127.0.0.1 while processing a document. With other network applications closed, you should see no packets attributable to the processor, confirming that no document data leaves your machine.
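As a cheaper first check before a packet capture, the configured endpoints can be tested for loopback addresses. A small sketch using only the standard library. The helper is_loopback_url is hypothetical, not part of the original code, and it complements the capture rather than replacing it:

```python
import ipaddress
from urllib.parse import urlparse


def is_loopback_url(url: str) -> bool:
    """Return True if the URL's host is 'localhost' or a loopback IP literal."""
    host = urlparse(url).hostname
    if host is None:
        return False  # no host component (for example, a missing scheme)
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # any other hostname is treated as non-local
```

Asserting is_loopback_url(OLLAMA_URL) at startup would catch a configuration that accidentally points at a remote Ollama host.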
Full product
The complete version (batch mode, 10 domain types, hardware detection, Windows installer, 12 use-case recipes) is available at https://journeyer376.gumroad.com/l/ussytd for $39.
The architecture above is the core of what it does — the product adds packaging, documentation, and domain prompt iteration aimed at non-developers.
Questions about the architecture or model benchmarks? Happy to answer in the comments.
Top comments (3)
Useful walkthrough. The extra detail I’d add is that “localhost” is a good starting point, but not the whole privacy story.
For sensitive docs I’d want the checklist to include: no Gradio share tunnel, analytics disabled, model already downloaded before handling private files, no crash/error reporting, temp files cleaned up, and a clear note that the extracted plaintext may be more sensitive than the original PDF/DOCX.
The Wireshark verification section is the strongest part because it turns the privacy claim into something people can test. I’d probably move that higher in the article.
Running Qwen locally, we found that cleaning up temporary files is crucial for maintaining privacy. Your checklist covers this well! Have you considered integrating a logging mechanism to ensure no sensitive data is logged inadvertently during processing?
This looks like a great approach for ensuring data privacy! Have you encountered any challenges with integrating Gradio for the UI?