mobius-crypt

From 8s to 47ms: Optimizing PostgreSQL for a Government Tender Search Platform

When search takes 8 seconds, users leave. Here's how we made it instant.

Six months ago, our tender search was unusably slow. Users searching for "construction Gauteng" would wait 8+ seconds for results. Our database had only 5,000 tenders. How would we handle 50,000?

The team suggested Elasticsearch. "PostgreSQL can't handle this," they said. "We need a real search engine."

I disagreed. Not because Elasticsearch isn't great—it is. But because I suspected our problem wasn't PostgreSQL's capability. It was our usage of it.

Today, we serve 50,000+ tenders with search response times under 50ms at p95. We did it with PostgreSQL, some smart indexing, and a lot of query optimization.

This is that story.

The Slow Query That Started Everything

Let me show you what we had:

-- ❌ The original, terrible query
SELECT 
  t.*,
  c.name as category_name,
  p.name as province_name,
  org.name as organization_name
FROM tenders t
  LEFT JOIN categories c ON t.category_id = c.id
  LEFT JOIN provinces p ON t.province_id = p.id
  LEFT JOIN organizations org ON t.organization_id = org.id
WHERE 
  t.status = 'ACTIVE'
  AND (
    t.title ILIKE '%construction%' 
    OR t.description ILIKE '%construction%'
  )
  AND (
    p.name ILIKE '%gauteng%' 
    OR org.province ILIKE '%gauteng%'
  )
ORDER BY t.published_at DESC
LIMIT 20;

Query time: 8,247ms for 5,000 rows

What's wrong with this query? Everything.

Let's break down the problems:

Problem 1: ILIKE with Leading Wildcards

WHERE title ILIKE '%construction%'

A leading wildcard means no B-tree index can help, so PostgreSQL scans every row and runs the pattern match on each one. With 5,000 rows, that's 5,000 string comparisons on every single search. (A trigram index from pg_trgm could serve this pattern, but we didn't have one.)
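
You can see it in the plan. An illustrative EXPLAIN against this schema (output abbreviated):

-- A leading wildcard rules out a plain B-tree index, so the planner falls back to a sequential scan
EXPLAIN SELECT id, title FROM tenders WHERE title ILIKE '%construction%';

--  Seq Scan on tenders
--    Filter: (title ~~* '%construction%'::text)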

Problem 2: Multiple OR Conditions

WHERE (title ILIKE '%term%' OR description ILIKE '%term%')
  AND (province ILIKE '%term%' OR org.province ILIKE '%term%')

Even if one side were indexable, this OR spans columns in two different tables, so no single index can satisfy it and the planner falls back to scanning anyway.

Problem 3: Unnecessary Joins

We're joining three tables (categories, provinces, organizations) just to get names. For display. In a search query.

Problem 4: No Pagination Optimization

ORDER BY published_at DESC LIMIT 20

Sorting all results before taking 20 is expensive. And users rarely go past page 3.

Let's fix these problems systematically.

Optimization 1: Full-Text Search Instead of ILIKE

PostgreSQL has built-in full-text search that's fast and powerful. We just weren't using it.

Adding tsvector Columns

-- Add tsvector columns for full-text search
ALTER TABLE tenders
  ADD COLUMN title_search tsvector
  GENERATED ALWAYS AS (to_tsvector('english', coalesce(title, ''))) STORED;

ALTER TABLE tenders
  ADD COLUMN description_search tsvector
  GENERATED ALWAYS AS (to_tsvector('english', coalesce(description, ''))) STORED;

ALTER TABLE tenders
  ADD COLUMN combined_search tsvector
  GENERATED ALWAYS AS (
    to_tsvector('english', coalesce(title, '')) ||
    to_tsvector('english', coalesce(description, ''))
  ) STORED;

The GENERATED ALWAYS AS ... STORED clause (available since PostgreSQL 12) is key here. PostgreSQL recomputes these columns automatically whenever the source data changes, so there's no trigger to write and nothing to keep in sync.
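
A quick sketch of what that buys us (the id here is made up for illustration):

-- Change the source column...
UPDATE tenders SET title = 'Road construction tender' WHERE id = 42;

-- ...and the generated tsvector is already up to date
SELECT title_search FROM tenders WHERE id = 42;
-- 'construct':2 'road':1 'tender':3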

Creating GIN Indexes

-- GIN indexes for fast full-text search
CREATE INDEX idx_tenders_title_search 
  ON tenders USING GIN (title_search);

CREATE INDEX idx_tenders_description_search 
  ON tenders USING GIN (description_search);

CREATE INDEX idx_tenders_combined_search 
  ON tenders USING GIN (combined_search);

GIN (Generalized Inverted Index) is built for full-text search: it maps every lexeme to the rows that contain it, so a search only touches the matching entries instead of every row.
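
A small sanity check of how lexeme matching works (both sides stem to 'construct'):

SELECT to_tsvector('english', 'Road construction in Gauteng')
       @@ to_tsquery('english', 'construction') AS matches;
-- true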

The New Query

-- ✅ Much better with full-text search
SELECT t.*
FROM tenders t
WHERE 
  t.status = 'ACTIVE'
  AND t.combined_search @@ to_tsquery('english', 'construction')
ORDER BY t.published_at DESC
LIMIT 20;

Query time: 247ms (down from 8,247ms - a 33x improvement!)

But we can do better.

Optimization 2: Composite Indexes

We're filtering by status every time. And we're sorting by published_at. Let's optimize that.

The Composite Index

-- Composite index for status + published_at
CREATE INDEX idx_tenders_status_published 
  ON tenders (status, published_at DESC);

This index serves two purposes:

  1. Filters by status efficiently
  2. Pre-sorts by published_at

PostgreSQL can use this index to skip the sort entirely!

The Impact

EXPLAIN ANALYZE
SELECT t.*
FROM tenders t
WHERE 
  t.status = 'ACTIVE'
  AND t.combined_search @@ to_tsquery('english', 'construction')
ORDER BY t.published_at DESC
LIMIT 20;

-- Index Scan using idx_tenders_status_published
-- Rows Removed by Filter: 234
-- Execution Time: 47ms

Query time: 47ms (down from 247ms - another 5x improvement!)

We're now 175x faster than the original query.

Optimization 3: Denormalization for Joins

Remember those three joins we were doing? Let's eliminate them.

-- Add denormalized columns
ALTER TABLE tenders
  ADD COLUMN category_name TEXT,
  ADD COLUMN province_name TEXT,
  ADD COLUMN organization_name TEXT;

-- Create a trigger to maintain them
CREATE OR REPLACE FUNCTION update_tender_denormalized_fields()
RETURNS TRIGGER AS $$
BEGIN
  SELECT c.name INTO NEW.category_name
  FROM categories c WHERE c.id = NEW.category_id;

  SELECT p.name INTO NEW.province_name
  FROM provinces p WHERE p.id = NEW.province_id;

  SELECT o.name INTO NEW.organization_name
  FROM organizations o WHERE o.id = NEW.organization_id;

  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trigger_update_tender_denormalized
  BEFORE INSERT OR UPDATE ON tenders
  FOR EACH ROW
  EXECUTE FUNCTION update_tender_denormalized_fields();
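
One thing the trigger alone doesn't cover: rows that existed before it was created. A one-time backfill along these lines fills them in (a sketch, not our exact migration):

-- Backfill denormalized names for pre-existing rows
UPDATE tenders t
SET category_name     = (SELECT c.name FROM categories c WHERE c.id = t.category_id),
    province_name     = (SELECT p.name FROM provinces p WHERE p.id = t.province_id),
    organization_name = (SELECT o.name FROM organizations o WHERE o.id = t.organization_id);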

Now our query needs no joins:

-- ✅ No joins needed!
SELECT 
  t.id,
  t.title,
  t.description,
  t.category_name,
  t.province_name,
  t.organization_name,
  t.published_at,
  t.closing_at,
  t.estimated_value
FROM tenders t
WHERE 
  t.status = 'ACTIVE'
  AND t.combined_search @@ to_tsquery('english', 'construction')
  AND t.province_name = 'Gauteng'
ORDER BY t.published_at DESC
LIMIT 20;

Query time: 23ms (down from 47ms)

That puts us at a 359x improvement over the original query.

Optimization 4: Smart Search Query Parsing

User searches aren't always simple. They might search for:

  • "construction AND Gauteng"
  • "building OR renovation"
  • "software -hardware"

Let's handle this properly:

// lib/search/query-parser.ts
export function parseSearchQuery(query: string): string {
  // Normalize boolean operators and phrases into tsquery syntax
  const normalized = query
    .trim()
    .toLowerCase()
    // Handle AND
    .replace(/\band\b/gi, '&')
    // Handle OR
    .replace(/\bor\b/gi, '|')
    // Handle NOT
    .replace(/\bnot\b/gi, '!')
    .replace(/-(\w+)/g, '!$1')
    // Attach a standalone NOT to the following word: "! foo" -> "!foo"
    .replace(/!\s+(?=\w)/g, '!')
    // Handle phrases: "a b" becomes the followed-by operator a<->b
    .replace(/"([^"]+)"/g, (_, phrase) => phrase.split(' ').join('<->'))
    // Strip anything that isn't a word, whitespace, or tsquery operator
    .replace(/[^\w\s&|!<>-]/g, '');

  // Join plain words with '&', without doubling up around explicit operators
  const isOperator = (t: string) => t === '&' || t === '|';
  const tokens = normalized.split(/\s+/).filter(Boolean);

  return tokens.reduce((acc, token, i) => {
    if (i === 0) return token;
    return isOperator(token) || isOperator(tokens[i - 1])
      ? `${acc} ${token}`
      : `${acc} & ${token}`;
  }, '');
}

// Usage
const userQuery = 'construction AND Gauteng -maintenance';
const tsQuery = parseSearchQuery(userQuery); 
// Result: 'construction & gauteng & !maintenance'

const result = await db.$queryRaw`
  SELECT * FROM tenders
  WHERE combined_search @@ to_tsquery('english', ${tsQuery})
`;

This gives users powerful search without needing to know tsquery syntax.
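
If you'd rather not maintain a parser at all, PostgreSQL 11+ ships websearch_to_tsquery, which already understands quoted phrases, "or", and "-" exclusion (a sketch; we kept our own parser for the AND keyword and custom cleanup):

SELECT websearch_to_tsquery('english', '"road construction" or renovation -maintenance');
-- roughly: 'road' <-> 'construct' | 'renovat' & !'mainten'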

Optimization 5: Partial Indexes for Common Queries

Not all tenders are equal. Active tenders are queried constantly. Closed tenders? Rarely.

-- Partial index for active tenders only
CREATE INDEX idx_tenders_active_search 
  ON tenders USING GIN (combined_search)
  WHERE status = 'ACTIVE';

CREATE INDEX idx_tenders_active_published
  ON tenders (published_at DESC)
  WHERE status = 'ACTIVE';

Partial indexes are smaller and faster because they only index the rows you actually query.
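
One caveat: the planner only considers a partial index when the query's WHERE clause implies the index predicate, so the status filter has to stay in the query. An illustrative plan:

EXPLAIN
SELECT id FROM tenders
WHERE status = 'ACTIVE'
  AND combined_search @@ to_tsquery('english', 'construction');

--  Bitmap Heap Scan on tenders
--    ->  Bitmap Index Scan on idx_tenders_active_search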

Impact:

  • Index size: 890MB → 340MB (62% smaller)
  • Query time: 23ms → 18ms (22% faster)

Optimization 6: Connection Pooling

At scale, connection overhead matters. We use Prisma with PgBouncer:

// lib/db.ts
import { PrismaClient } from '@prisma/client';

const globalForPrisma = global as unknown as {
  prisma: PrismaClient | undefined;
};

export const db = globalForPrisma.prisma ?? new PrismaClient({
  datasources: {
    db: {
      url: process.env.DATABASE_URL, // Points to PgBouncer
    },
  },
  log: process.env.NODE_ENV === 'development' 
    ? ['query', 'error', 'warn'] 
    : ['error'],
});

if (process.env.NODE_ENV !== 'production') {
  globalForPrisma.prisma = db;
}

PgBouncer configuration:

[databases]
tendersa = host=localhost port=5432 dbname=tendersa

[pgbouncer]
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 20
reserve_pool_size = 5

Connection pooling gave us:

  • 40% reduction in connection overhead
  • Ability to handle 10x more concurrent requests
  • Predictable performance under load

Optimization 7: Query Result Caching

Some queries are requested repeatedly. Let's cache them:

// lib/cache/query-cache.ts
import { redis } from '@/lib/redis';

export async function getCachedQuery<T>(
  key: string,
  queryFn: () => Promise<T>,
  ttl: number = 300 // 5 minutes default
): Promise<T> {
  // Check cache
  const cached = await redis.get(key);
  if (cached) {
    return JSON.parse(cached);
  }

  // Execute query
  const result = await queryFn();

  // Cache result
  await redis.setex(key, ttl, JSON.stringify(result));

  return result;
}

// Usage
export async function searchTenders(query: string) {
  const cacheKey = `search:${hashQuery(query)}`; // hashQuery: a stable hash of the query string (helper not shown)

  return await getCachedQuery(
    cacheKey,
    async () => {
      return await db.tender.findMany({
        where: {
          status: 'ACTIVE',
          combined_search: {
            search: parseSearchQuery(query),
          },
        },
        take: 20,
        orderBy: { published_at: 'desc' },
      });
    },
    300 // 5 minutes
  );
}

Cache hit rate: 67%

Effective database time per request: 18ms × (1 − 0.67) ≈ 6ms on average, since cache hits barely touch the database.

The Complete Architecture

Here's our final search implementation:

// app/api/search/route.ts
import { NextRequest, NextResponse } from 'next/server';
import { z } from 'zod';
import { db } from '@/lib/db';
import { getCachedQuery } from '@/lib/cache/query-cache';
import { parseSearchQuery } from '@/lib/search/query-parser';

const searchSchema = z.object({
  query: z.string().min(1).max(200),
  category: z.string().optional(),
  province: z.string().optional(),
  minValue: z.coerce.number().optional(), // query-string values arrive as strings, so coerce
  maxValue: z.coerce.number().optional(),
  page: z.number().min(1).default(1),
  limit: z.number().min(1).max(100).default(20),
});

export async function GET(request: NextRequest) {
  try {
    // Parse and validate
    const params = Object.fromEntries(request.nextUrl.searchParams);
    const validated = searchSchema.parse({
      ...params,
      page: parseInt(params.page || '1'),
      limit: parseInt(params.limit || '20'),
    });

    // Build cache key
    const cacheKey = `search:${JSON.stringify(validated)}`;

    // Get cached or execute
    const results = await getCachedQuery(
      cacheKey,
      async () => {
        const offset = (validated.page - 1) * validated.limit;

        // Parse search query
        const tsQuery = parseSearchQuery(validated.query);

        // Build WHERE conditions
        const where = {
          status: 'ACTIVE',
          combined_search: {
            search: tsQuery,
          },
          ...(validated.category && { category_name: validated.category }),
          ...(validated.province && { province_name: validated.province }),
          ...(validated.minValue && { 
            estimated_value: { gte: validated.minValue } 
          }),
          ...(validated.maxValue && { 
            estimated_value: { lte: validated.maxValue } 
          }),
        };

        // Execute query
        const [tenders, total] = await Promise.all([
          db.tender.findMany({
            where,
            take: validated.limit,
            skip: offset,
            orderBy: { published_at: 'desc' },
            select: {
              id: true,
              title: true,
              description: true,
              category_name: true,
              province_name: true,
              organization_name: true,
              published_at: true,
              closing_at: true,
              estimated_value: true,
            },
          }),
          db.tender.count({ where }),
        ]);

        return {
          tenders,
          pagination: {
            page: validated.page,
            limit: validated.limit,
            total,
            pages: Math.ceil(total / validated.limit),
          },
        };
      },
      300 // 5 minutes TTL
    );

    return NextResponse.json(results);
  } catch (error) {
    console.error('Search error:', error);
    return NextResponse.json(
      { error: 'Search failed' },
      { status: 500 }
    );
  }
}

Performance Testing Results

We ran load tests simulating 1,000 concurrent users:

Before optimization:

Requests: 10,000
Duration: 82 seconds
Success rate: 94% (6% timeout)
Average response: 8,200ms
p50: 7,800ms
p95: 12,400ms
p99: 15,200ms

After optimization:

Requests: 10,000
Duration: 8 seconds
Success rate: 100%
Average response: 6ms (cached) / 21ms (uncached)
p50: 18ms
p95: 47ms
p99: 89ms

Improvement:

  • 10x more throughput
  • 391x faster average response
  • 100% success rate (no timeouts)
  • 262x faster p95

Monitoring and Observability

We track query performance continuously:

// lib/monitoring/query-monitor.ts
// `analytics` and `metrics` are the app's own telemetry clients (imports omitted here)
import { performance } from 'perf_hooks';

export async function monitorQuery<T>(
  queryName: string,
  queryFn: () => Promise<T>
): Promise<T> {
  const start = performance.now();

  try {
    const result = await queryFn();
    const duration = performance.now() - start;

    // Log slow queries
    if (duration > 100) {
      console.warn(`Slow query: ${queryName} took ${duration}ms`);

      await analytics.track('slow_query', {
        name: queryName,
        duration,
        threshold: 100,
      });
    }

    // Track all queries
    await metrics.histogram('query_duration', duration, {
      query: queryName,
    });

    return result;
  } catch (error) {
    await analytics.track('query_error', {
      name: queryName,
      error: error.message,
    });
    throw error;
  }
}

// Usage
const results = await monitorQuery(
  'search_tenders',
  () => searchTenders(query)
);

Database Maintenance

Fast queries require database health:

-- Weekly maintenance script
-- 1. Update statistics
ANALYZE tenders;

-- 2. Reindex if needed
REINDEX INDEX CONCURRENTLY idx_tenders_combined_search;

-- 3. Vacuum
VACUUM ANALYZE tenders;

-- 4. Check table and index footprint (total size minus heap ≈ indexes + TOAST)
SELECT 
  schemaname,
  tablename,
  pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size,
  pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename) - pg_relation_size(schemaname||'.'||tablename)) AS index_size
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;

Lessons Learned

1. Profile Before Optimizing

Use EXPLAIN ANALYZE:

EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT * FROM tenders
WHERE combined_search @@ to_tsquery('construction');

This shows you:

  • Query plan
  • Actual execution time
  • Index usage
  • Buffer hits vs misses

2. Indexes Are Not Free

Every index:

  • Takes disk space
  • Slows down writes
  • Needs maintenance

We started with 23 indexes. After analysis, we removed 8 that were rarely used.
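
How we found them, roughly (pg_stat_user_indexes tracks how often each index has been scanned):

-- Indexes that are never or rarely scanned are candidates for removal
SELECT
  indexrelname AS index_name,
  idx_scan     AS scans,
  pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
ORDER BY idx_scan ASC, pg_relation_size(indexrelid) DESC;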

Impact:

  • 340MB disk saved
  • 15% faster inserts
  • 12% faster updates

3. Denormalization Is OK

Normalized databases are beautiful. But joins are expensive.

We denormalized:

  • Category names
  • Province names
  • Organization names

Trade-off:

  • Slower writes (triggers)
  • Faster reads (no joins)

For a read-heavy application (99% reads), this was the right choice.

4. Cache Invalidation Is Hard

We cache search results for 5 minutes. But what if a tender closes?

Solution: Event-driven invalidation:

// After a tender update, drop its cache entry
await redis.del(`tender:${tenderId}`);

// DEL doesn't understand patterns, so 'search:*' has to be expanded first.
// KEYS works but blocks Redis on large keyspaces; prefer SCAN in production.
const affectedKeys = await redis.keys(`search:*${province}*`);
if (affectedKeys.length > 0) {
  await redis.del(...affectedKeys);
}

5. Monitor in Production

Load tests are great. Production is different.

We found:

  • Queries slow down during backup windows
  • Connection pool exhaustion during traffic spikes
  • Some query patterns we never tested

Solution: Continuous monitoring with alerts.

When to Actually Use Elasticsearch

PostgreSQL full-text search is great for many cases. But there are limits:

Use Elasticsearch when:

  • You need typo tolerance
  • You want relevance tuning
  • You need faceted search
  • You're doing complex aggregations
  • Scale exceeds PostgreSQL capabilities (millions of documents)

Stick with PostgreSQL when:

  • Your data is relational
  • You need ACID transactions
  • Scale is manageable (<1M documents)
  • Team knows SQL better than Elasticsearch
  • You want to avoid operational complexity

For us, PostgreSQL + smart optimization was the right choice.

The Results: By the Numbers

Database:

  • Rows: 5,000 → 50,000 (10x growth)
  • Index size: 890MB → 340MB (62% smaller)
  • Query time: 8,247ms → 18ms (458x faster)
  • Throughput: 12 req/s → 1,250 req/s (104x more)

Infrastructure costs:

  • Before: $240/month (8x oversized to compensate for slow queries)
  • After: $45/month (right-sized, efficient queries)
  • Savings: $2,340/year

User experience:

  • Search bounce rate: 34% → 3%
  • Average search depth: 1.2 pages → 3.7 pages
  • Conversion to bid: 2.1% → 8.4%

Conclusion: PostgreSQL Is Enough (Usually)

We didn't need Elasticsearch. We needed better SQL.

The optimizations:

  1. Full-text search with tsvector
  2. Composite indexes
  3. Denormalization
  4. Query result caching
  5. Connection pooling
  6. Partial indexes
  7. Smart query parsing

Took us from 8s to 18ms at 10x the scale.

Before adding new infrastructure, optimize what you have. You might be surprised.


Want the code? Full implementation examples available in our docs.

Questions? Drop them in the comments!

Follow for more articles on:

  • Database optimization
  • PostgreSQL best practices
  • Full-text search
  • Performance engineering

Building Tenders SA - South Africa's fastest government tender search platform. 50,000+ tenders, 18ms p95 response time, PostgreSQL.
