Victory Lucky

From RAG to Agents: How TiDB Powers Modern AI Applications

Building AI applications today feels different than it did even a year ago. We've gone from simple chatbot wrappers around LLM APIs to building systems that need memory, context, and increasingly complex data patterns. The problem? Most of us end up juggling a vector database for embeddings, a relational database for structured data, maybe a cache layer for performance, and a bunch of ETL pipelines trying to keep everything synchronized. It works, but it's a lot.

TiDB offers a different approach: a distributed SQL database with native vector search and HTAP (Hybrid Transactional/Analytical Processing) capabilities that can serve as the foundation for modern AI applications. In this post, we'll explore how TiDB addresses the unique challenges of building AI systems in 2026, from RAG pipelines to autonomous agents.

The AI Data Problem: Why Your Stack Keeps Growing

If you're building AI apps right now, your architecture probably looks something like this:

  • Vector database (Pinecone, Weaviate, Qdrant) for semantic search
  • Relational database (PostgreSQL, MySQL) for structured data
  • Analytics database (Snowflake, ClickHouse) for insights
  • Cache (Redis) for performance
  • ETL pipelines trying to keep it all synchronized

Each tool is good at what it does. But the integration work? That's where things get messy. Take a customer support AI as an example. It needs to:

  1. Find semantically similar support tickets (vector search)
  2. Filter by customer tier and product version (relational)
  3. Analyze resolution patterns in real-time (analytics)
  4. Serve everything with sub-second latency

The traditional approach means data living in multiple places, syncing constantly, with consistency issues and latency at every boundary. This is the problem TiDB's architecture solves.

TiDB's Architecture: Not Built for AI, But Perfect for It

TiDB wasn't originally designed for the AI boom - it's been around since before "RAG" was even a term. But its architecture happens to map really well to what modern AI applications need:

1. MySQL Compatibility + Horizontal Scaling

TiDB speaks the MySQL protocol. Your ORMs work. Your existing tools work. Your team's MySQL knowledge transfers directly. The difference? It scales horizontally without you having to manually shard anything.

-- Standard MySQL syntax works
CREATE TABLE conversations (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    user_id BIGINT NOT NULL,
    session_id VARCHAR(255),
    message TEXT,
    response TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    INDEX idx_user_session (user_id, session_id)
);

As your AI application grows from thousands to millions of users, you add nodes rather than rewrite your schema.

2. Native Vector Search (No Separate Database Needed)

In recent releases (first on TiDB Cloud Serverless, and in self-managed TiDB from v8.4), TiDB has native vector support. You can store embeddings right alongside your regular data and search them with SQL. No separate vector database to keep in sync.

-- Create a table with embeddings
CREATE TABLE documents (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    content TEXT,
    embedding VECTOR(1536),  -- OpenAI ada-002 dimension
    category VARCHAR(100),
    created_at TIMESTAMP,
    VECTOR INDEX idx_embedding (embedding)
);

-- Insert a document with its embedding
INSERT INTO documents (content, embedding, category) 
VALUES (
    'TiDB supports HTAP workloads...',
    '[0.123, -0.456, 0.789, ...]',  -- 1536-dim vector
    'documentation'
);

-- Semantic search with structured filters
SELECT id, content, category,
       VEC_COSINE_DISTANCE(embedding, '[0.1, -0.2, ...]') AS distance
FROM documents
WHERE category = 'documentation'
  AND created_at > NOW() - INTERVAL 30 DAY
ORDER BY distance ASC  -- smaller cosine distance = more similar
LIMIT 10;

Compare this to managing Pinecone + PostgreSQL separately:

# Traditional approach: juggling two systems
# Step 1: Query Pinecone for vectors
vector_results = pinecone_index.query(
    vector=embedding,
    top_k=100,  # Have to over-fetch since we'll filter later
    include_metadata=True
)

# Step 2: Get IDs and query Postgres for filtered results  
ids = [r.id for r in vector_results]
filtered_results = db.execute("""
    SELECT * FROM documents
    WHERE id = ANY(%s)
      AND category = 'documentation'
      AND created_at > NOW() - INTERVAL '30 days'
    ORDER BY ARRAY_POSITION(%s, id)
""", (ids, ids))

# Hope nothing changed between the two queries...

With TiDB, it's one query. No sync issues. No race conditions.
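Here's a minimal sketch of that single-query version from Python, using a standard MySQL driver. The table and column names match the earlier examples; the connection itself is left out, so `retrieve` takes any DB-API cursor. TiDB accepts vectors as literal strings like `'[0.1,-0.2]'`, which the helper below produces.

```python
def to_vector_literal(embedding):
    """Serialize a list of floats into TiDB's vector literal syntax,
    e.g. [0.1, -0.2] -> '[0.1,-0.2]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"

RETRIEVAL_SQL = """
    SELECT id, content,
           VEC_COSINE_DISTANCE(embedding, %s) AS distance
    FROM documents
    WHERE category = 'documentation'
      AND created_at > NOW() - INTERVAL 30 DAY
    ORDER BY distance ASC
    LIMIT 10
"""

def retrieve(cursor, embedding):
    # One round trip: vector search and structured filters run in the same
    # query, so there is no window for data to drift between two systems.
    cursor.execute(RETRIEVAL_SQL, (to_vector_literal(embedding),))
    return cursor.fetchall()
```

Because the filter runs inside the same query as the vector search, there's also no need to over-fetch 100 candidates "just in case" the filter discards most of them.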

3. HTAP: Run Analytics on Live Data (No ETL)

This is where TiDB gets interesting. It has two storage engines working together:

  • TiKV (row-based) for transactions
  • TiFlash (columnar) for analytics

You write to TiKV, query analytics from TiFlash. They stay in sync automatically. No ETL pipeline, no separate data warehouse, no stale data.

-- Analyze AI agent performance in real-time
SELECT 
    agent_type,
    DATE(created_at) as date,
    COUNT(*) as total_interactions,
    AVG(response_time_ms) as avg_response_time,
    AVG(user_satisfaction_score) as avg_satisfaction,
    APPROX_PERCENTILE(response_time_ms, 95) as p95_latency
FROM agent_interactions
WHERE created_at > NOW() - INTERVAL 7 DAY
GROUP BY agent_type, DATE(created_at)
ORDER BY date DESC;

This query runs on TiFlash (the columnar engine) while your app keeps writing to TiKV (the row engine). They don't interfere with each other. No need for Snowflake or ClickHouse on the side.

Real-World AI Use Cases with TiDB

RAG Applications (The Most Common Use Case)

RAG is basically the standard way to build with LLMs now. And RAG needs both semantic search and structured filtering - which is exactly what TiDB does well:

-- Knowledge base table with vectors and metadata
CREATE TABLE knowledge_base (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    title VARCHAR(500),
    content TEXT,
    embedding VECTOR(1536),
    source VARCHAR(255),
    last_updated TIMESTAMP,
    access_level ENUM('public', 'internal', 'restricted'),
    VECTOR INDEX idx_embedding (embedding)
);

-- RAG retrieval query: semantic + structured filters
SELECT 
    id, 
    title, 
    content,
    VEC_COSINE_DISTANCE(embedding, :query_embedding) AS distance
FROM knowledge_base
WHERE access_level IN ('public', 'internal')
  AND last_updated > :cutoff_date
ORDER BY distance ASC  -- smaller distance = more relevant
LIMIT 5;

Postgres with pgvector can run this query too, but scaling it is harder: its vector indexes (IVFFlat, HNSW) live on a single node, so past a certain dataset size you're sharding by hand. TiDB's distributed architecture means your knowledge base can grow toward billions of documents without a rewrite.
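Once the rows come back, the remaining RAG step is stuffing them into a prompt without blowing the context window. A minimal sketch, using a character budget as a stand-in for a real token budget (the prompt wording and budget are illustrative, not part of any TiDB API):

```python
def build_prompt(question, rows, max_context_chars=2000):
    """rows: list of (id, title, content) tuples, already sorted by distance,
    best match first. Greedily packs snippets until the budget runs out."""
    parts, used = [], 0
    for _id, title, content in rows:
        snippet = f"[{title}]\n{content}"
        if used + len(snippet) > max_context_chars:
            break  # stay inside the model's context window
        parts.append(snippet)
        used += len(snippet)
    context = "\n\n".join(parts)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

Because the SQL already returned results sorted by distance, the greedy loop keeps the most relevant snippets and drops the tail when the budget is exhausted.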

AI Agent Memory (The Future of AI Apps)

If you're building agents that run for more than one interaction, they need memory. Not just short-term chat history - actual long-term memory about users, facts, past decisions, all of it. This breaks down into:

  • Episodic memory: What happened in past conversations
  • Semantic memory: Facts learned about the user or context
  • Working memory: Current task state

-- Agent memory schema
CREATE TABLE agent_memory (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    agent_id VARCHAR(255),
    user_id BIGINT,
    memory_type ENUM('episodic', 'semantic', 'working'),
    content TEXT,
    embedding VECTOR(1536),
    importance_score FLOAT,
    created_at TIMESTAMP,
    accessed_count INT DEFAULT 0,
    last_accessed TIMESTAMP,
    VECTOR INDEX idx_memory_embedding (embedding),
    INDEX idx_agent_user (agent_id, user_id)
);

-- Retrieve relevant memories for context
SELECT content, memory_type, importance_score
FROM agent_memory
WHERE agent_id = :agent_id 
  AND user_id = :user_id
  AND (
    -- Semantic search for relevant context
    VEC_COSINE_DISTANCE(embedding, :current_context_embedding) < 0.3
    OR 
    -- Recent working memory
    (memory_type = 'working' AND created_at > NOW() - INTERVAL 1 HOUR)
  )
ORDER BY importance_score DESC, created_at DESC
LIMIT 20;

As agents interact with thousands of users over months, memory tables grow large. TiDB's horizontal scaling ensures performance remains consistent.
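On top of the SQL retrieval above, agent frameworks often re-rank memories client-side by blending importance, recency, and similarity (in the style of the "generative agents" scoring). A sketch with illustrative, untuned weights and half-life; this is application logic, not a TiDB feature:

```python
import math
import time

def memory_score(importance, last_accessed_ts, distance,
                 now=None, half_life_hours=48.0):
    """Blend importance, recency, and semantic similarity into one score.

    importance: the importance_score column (0..1)
    last_accessed_ts: unix timestamp of last access
    distance: VEC_COSINE_DISTANCE to the current context embedding
    """
    now = time.time() if now is None else now
    age_hours = max(0.0, (now - last_accessed_ts) / 3600.0)
    recency = 0.5 ** (age_hours / half_life_hours)   # exponential decay
    similarity = 1.0 - distance                      # cosine distance -> similarity
    return 0.3 * importance + 0.3 * recency + 0.4 * similarity
```

A memory touched an hour ago outranks an identical one from last month; bumping `accessed_count` and `last_accessed` on each retrieval keeps frequently used memories alive.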

Real-Time Feature Stores for ML

Training and serving ML models requires consistent features across environments. TiDB can serve as a unified feature store:

-- User features for a recommendation model
CREATE TABLE user_features (
    user_id BIGINT PRIMARY KEY,
    embedding VECTOR(128),
    total_purchases INT,
    avg_order_value DECIMAL(10,2),
    category_preferences JSON,
    last_purchase_date TIMESTAMP,
    churn_risk_score FLOAT,
    updated_at TIMESTAMP
);

-- Batch feature computation (training)
INSERT INTO user_features 
SELECT 
    user_id,
    -- Aggregate features from transaction history
    -- (runs on TiFlash for analytical performance)
FROM transactions
GROUP BY user_id;

-- Point lookup for inference (serving)
SELECT embedding, total_purchases, avg_order_value, category_preferences
FROM user_features
WHERE user_id = :user_id;

The HTAP architecture means batch feature computation (analytical) and real-time serving (transactional) use the same database without contention.
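On the serving side, the point lookup above returns a row (or nothing, for a brand-new user) that has to become a fixed-order numeric vector for the model. A sketch of that glue code; the feature names mirror the table, but the defaults and input layout are hypothetical:

```python
# Defaults cover users with no feature row yet (cold start).
FEATURE_DEFAULTS = {
    "total_purchases": 0,
    "avg_order_value": 0.0,
    "churn_risk_score": 0.5,  # unknown users get a neutral prior
}

def to_model_input(row):
    """row: dict from the user_features point lookup, or None if the user
    has no row yet. Returns features in a stable order for the model."""
    row = row or {}
    return [float(row.get(name, default))
            for name, default in FEATURE_DEFAULTS.items()]
```

Keeping the same `FEATURE_DEFAULTS` mapping in the batch-computation job and the serving path is what prevents training/serving skew.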

Multimodal AI Applications

Modern AI apps process text, images, audio, and video. Embeddings from different modalities need to coexist:

CREATE TABLE media_library (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    media_type ENUM('text', 'image', 'audio', 'video'),
    content_url VARCHAR(1000),
    text_embedding VECTOR(1536),      -- CLIP text encoder
    image_embedding VECTOR(512),      -- CLIP image encoder
    audio_embedding VECTOR(768),      -- Wav2Vec
    metadata JSON,
    tags VARCHAR(500),
    created_at TIMESTAMP,
    VECTOR INDEX idx_text (text_embedding),
    VECTOR INDEX idx_image (image_embedding),
    VECTOR INDEX idx_audio (audio_embedding)
);

-- Cross-modal search: find images similar to text query
SELECT id, content_url, metadata,
       VEC_COSINE_DISTANCE(image_embedding, :text_query_embedding) AS similarity
FROM media_library
WHERE media_type = 'image'
  AND tags LIKE '%product%'
ORDER BY similarity ASC
LIMIT 20;

Managing multiple vector indexes for different modalities in separate databases becomes cumbersome quickly. TiDB consolidates this complexity.
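For intuition, `VEC_COSINE_DISTANCE` computes 1 minus the cosine similarity, and the cross-modal query above only makes sense because CLIP's text and image encoders were trained to embed into the same space. The same computation in plain Python:

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity, the same quantity VEC_COSINE_DISTANCE returns."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm
```

Vectors pointing the same way have distance 0, orthogonal ones have distance 1 — which is why the queries throughout this post sort ascending.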

TiDB Works at Any Scale

One common misconception is that distributed databases are "overkill" for smaller projects. Here's the reality: TiDB scales down as well as it scales up.

For Small Projects and Startups:

  • Start with a single-node TiDB Cloud Starter cluster (free tier available)
  • MySQL compatibility means your existing skills and tools work immediately
  • No upfront complexity - you write standard SQL
  • When you grow, scaling is just adding nodes - no application rewrites

For Growing Applications:

  • Seamless transition from prototype to production
  • No "rewrite day" when you outgrow your database
  • The same codebase works at 100 users or 100 million users

For Enterprise Scale:

  • Multi-region deployments with strong consistency
  • Handles petabyte-scale datasets
  • HTAP means no separate analytics infrastructure

The beauty is that you don't choose between "simple database for MVP" vs. "complex distributed system for scale." You get both. Start simple, scale when needed, never rewrite.

Vector Search Performance

Vector search performance depends on dataset size and index configuration. TiDB's HNSW-based vector index is declared over a specific distance function, so queries must use that same function to hit the index:

-- Two equivalent ways to add an HNSW vector index (use one or the other)
CREATE VECTOR INDEX idx_embedding ON foo ((VEC_COSINE_DISTANCE(embedding))) USING HNSW;
ALTER TABLE foo ADD VECTOR INDEX idx_embedding ((VEC_COSINE_DISTANCE(embedding))) USING HNSW;

Operational Benefits

Simplified Stack

Instead of managing:

  • Pinecone (vector search)
  • PostgreSQL (relational data)
  • Airflow + dbt (ETL pipelines)
  • Redis (caching)

You operate:

  • TiDB (handles all of the above)

This reduces:

  • Synchronization bugs: No eventual consistency issues between systems
  • Operational overhead: One database to monitor, backup, scale
  • Cost: Consolidated infrastructure, no data transfer fees between services

MySQL Ecosystem Compatibility

Your team's existing MySQL knowledge transfers directly:

# Standard MySQL drivers work
import mysql.connector

conn = mysql.connector.connect(
    host='tidb.cluster.example.com',
    port=4000,
    user='root',
    password='your_password',
    database='ai_app'
)

cursor = conn.cursor()
cursor.execute("""
    SELECT id, content,
           VEC_COSINE_DISTANCE(embedding, %s) AS distance
    FROM documents
    ORDER BY distance ASC
    LIMIT 10
""", (query_embedding,))  # pass the embedding as a vector literal string, e.g. '[0.1, -0.2, ...]'

ORMs like SQLAlchemy, Django ORM, and Prisma work without modification. This lowers the learning curve significantly.

Getting Started with PyTiDB: The Modern Way

While you can use TiDB with any MySQL-compatible driver, the PyTiDB SDK makes building AI applications significantly easier. It handles embedding generation, automatic vectorization, and provides a clean ORM-style interface.

Installation

pip install pytidb

# Optional: include built-in embedding models
pip install "pytidb[models]"

Quick Start: Building a RAG Application

Here's a complete RAG implementation using PyTiDB's auto-embedding features:

import os
from pytidb import TiDBClient
from pytidb.schema import TableModel, Field, FullTextField
from pytidb.embeddings import EmbeddingFunction

# Connect to TiDB
tidb_client = TiDBClient.connect(
    host=os.getenv("TIDB_HOST"),
    port=int(os.getenv("TIDB_PORT")),
    username=os.getenv("TIDB_USERNAME"),
    password=os.getenv("TIDB_PASSWORD"),
    database=os.getenv("TIDB_DATABASE"),
)

# Configure embedding provider (supports OpenAI, JinaAI, etc.)
tidb_client.configure_embedding_provider(
    "openai",
    api_key=os.getenv("OPENAI_API_KEY")
)

# Define your schema with auto-embedding
class Document(TableModel):
    __tablename__ = "documents"

    id: int = Field(primary_key=True)
    content: str = FullTextField()  # Full-text searchable

    # Auto-embeds the 'content' field using OpenAI
    content_vec: list[float] = EmbeddingFunction(
        "openai/text-embedding-3-small"
    ).VectorField(source_field="content")

    category: str = Field(max_length=100)
    created_at: str = Field()

# Create table
table = tidb_client.create_table(schema=Document, if_exists="skip")

# Insert documents - embeddings generated automatically!
table.bulk_insert([
    Document(
        id=1,
        content="TiDB supports HTAP workloads with TiFlash for real-time analytics.",
        category="features",
        created_at="2024-01-15"
    ),
    Document(
        id=2,
        content="PyTiDB is the official Python SDK for building AI apps with TiDB.",
        category="tools",
        created_at="2024-02-01"
    ),
    Document(
        id=3,
        content="Vector search in TiDB enables semantic similarity queries at scale.",
        category="features",
        created_at="2024-02-10"
    ),
])

# Semantic search - query is auto-embedded too!
results = (
    table.search("Python library for AI applications")
    .limit(3)
    .to_list()
)

for doc in results:
    print(f"Score: {doc['_score']:.3f}")
    print(f"Content: {doc['content']}")
    print(f"Category: {doc['category']}\n")

Output:

Score: 0.892
Content: PyTiDB is the official Python SDK for building AI apps with TiDB.
Category: tools

Score: 0.756
Content: Vector search in TiDB enables semantic similarity queries at scale.
Category: features

Score: 0.623
Content: TiDB supports HTAP workloads with TiFlash for real-time analytics.
Category: features

Hybrid Search: Combining Semantic and Keyword Search

PyTiDB supports hybrid search out of the box - combining vector similarity with full-text matching:

# Hybrid search: semantic + keyword matching
results = (
    table.search(
        "database for AI",
        search_type="hybrid"  # Combines vector + full-text
    )
    .limit(5)
    .to_pandas()  # Returns pandas DataFrame
)

print(results[['content', '_score', '_distance']])

Hybrid search delivers more accurate results by combining semantic understanding (vector search) with exact-term matching (full-text search).
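To see what "combining" means concretely, one common fusion method is reciprocal rank fusion (RRF): each result list contributes a score based on rank position, and documents appearing in both lists float to the top. This is a conceptual sketch, not necessarily the exact fusion PyTiDB applies internally:

```python
def rrf_fuse(vector_ids, fulltext_ids, k=60):
    """Merge two ranked lists of doc ids (best first) with reciprocal
    rank fusion. k dampens the influence of top ranks; 60 is a common
    default from the RRF literature."""
    scores = {}
    for ranking in (vector_ids, fulltext_ids):
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)
```

A document ranked modestly by both vector and full-text search can outrank one that only a single method liked, which is exactly the behavior that makes hybrid search feel more accurate.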

Advanced Filtering with Vector Search

One of PyTiDB's strengths is combining vector search with SQL-style filtering:

from pytidb.sql import and_, or_

# Find semantically similar docs, but only from specific categories
results = (
    table.search("analytics and real-time processing")
    .filter(
        and_(
            Document.category == "features",
            Document.created_at > "2024-01-01"
        )
    )
    .distance_threshold(0.8)  # Only keep results with cosine distance below 0.8
    .limit(5)
    .to_list()
)

Image Search with Multi-Modal Embeddings

PyTiDB also supports multi-modal embeddings for image search:

from PIL import Image
from pytidb.embeddings import EmbeddingFunction

# Multi-modal embedding model (handles both text and images)
jina_embed = EmbeddingFunction("jina_ai/jina-embeddings-v4")

class Product(TableModel):
    __tablename__ = "products"

    id: int = Field(primary_key=True)
    name: str = Field()
    image_path: str = Field()

    # Automatically embeds images from the image_path
    image_vec: list[float] = jina_embed.VectorField(
        source_field="image_path",
        source_type="image"
    )

table = tidb_client.create_table(schema=Product, if_exists="skip")

# Insert products with images
table.insert(Product(
    id=1,
    name="Blue Sneakers",
    image_path="products/sneakers_blue.jpg"
))

# Search using natural language
results = table.search("running shoes").limit(5).to_list()

# Or search using another image
query_image = Image.open("query_shoe.jpg")
results = table.search(query_image).limit(5).to_list()

Why PyTiDB Simplifies Development

Compare this to the traditional approach:

Without PyTiDB:

  1. Manually call OpenAI API to generate embeddings
  2. Write raw SQL to create tables with VECTOR columns
  3. Handle embedding generation for every insert
  4. Write complex SQL queries for vector search
  5. Manually combine results with metadata

With PyTiDB:

  1. Define your schema with TableModel
  2. Mark fields for auto-embedding
  3. Insert data - embeddings happen automatically
  4. Search with .search() - query embedding automatic
  5. Filter and combine with SQL-like syntax

The SDK handles all the embedding plumbing, letting you focus on your application logic.

The Path Forward

Three years ago, vector databases as we know them today didn't really exist. Now they're everywhere. The stack is still figuring itself out - what used to require three different databases might soon need just one. Or maybe five specialized ones will win. Nobody knows for sure yet.

What's clear is that building AI applications at any meaningful scale means dealing with both structured and unstructured data, both transactions and analytics. You can stitch together multiple systems and deal with the synchronization headaches, or you can use something that handles it all in one place.

TiDB makes that bet: that consolidation beats fragmentation. That a single, scalable database beats orchestrating multiple specialized ones. For applications that need structured data and vector search, transactional consistency and analytical insights, that bet is looking pretty good.

Yes, distributed databases have complexity. But when the alternative is keeping three databases in sync while your app is down because of a race condition between systems, that complexity starts looking reasonable.

Try TiDB for Your AI Application

Ready to consolidate your AI data stack?

Next steps:

  1. Migrate your vector data from Pinecone/Weaviate to TiDB
  2. Consolidate your feature store and transactional database
  3. Build a RAG application using the example above
  4. Join the TiDB Community Discord to share your experience

The AI era demands databases that can keep up. TiDB is ready.


Have you built AI applications on TiDB? Share your experience in the comments or reach out on Twitter/X @PingCAP.

If you find this article useful, share it and give me a follow on Twitter/X @codewithveek
