Mohamed Hussain S

Posted on Feb 17

Two ClickHouse Internals That Change How You Write Queries

#clickhouse #database #backend #performance

Most ClickHouse mistakes don’t come from SQL syntax.

They come from using the wrong mental model.

ClickHouse looks like a familiar analytical SQL database, but internally it behaves very differently from traditional row-based systems. If you approach it with PostgreSQL or MySQL instincts, you’ll eventually hit confusing behavior: wrong results, strange duplicates, unexpected merges, or queries that don’t scale the way you expected.

Two internals in particular completely changed how I design tables and write queries:

AggregatingMergeTree stores aggregation states, not final values
argMax solves group-wise max problems the way ClickHouse expects you to

These aren’t tricks. They’re core to how the engine works.

1. AggregatingMergeTree Is About States, Not Values

One of the most misunderstood engines in ClickHouse is AggregatingMergeTree.

At first glance, it looks like a way to store pre-aggregated results. So it’s tempting to do something like this:

CREATE TABLE daily_metrics
(
    day Date,
    avg_delay Float32
)
ENGINE = AggregatingMergeTree()
ORDER BY day;

Then insert data like:

SELECT
    toDate(event_time) AS day,
    avg(arrival_delay) AS avg_delay
FROM flights
GROUP BY day;

And later query it with:

SELECT day, avg_delay FROM daily_metrics;

This looks reasonable. But conceptually, it’s wrong.

Why This Fails

AggregatingMergeTree does not store final aggregated values. It stores aggregation states - intermediate, mergeable representations of computations.

ClickHouse merges parts in the background. As new data arrives:

Rows are appended to new parts
Background merges combine parts asynchronously
Aggregation states are merged during these part merges

If you store plain values like avg() or sum(), ClickHouse has nothing mergeable to work with. You’ve effectively disabled the engine’s intended behavior.

The Correct Mental Model

Think in states, not values.

Instead of storing avg(), you store avgState().

Instead of storing sum(), you store sumState().

Instead of storing uniq(), you store uniqState().

You define your table like this:

CREATE TABLE daily_metrics
(
    day Date,
    avg_delay AggregateFunction(avg, Float32)
)
ENGINE = AggregatingMergeTree()
ORDER BY day;

And insert data like this:

SELECT
    toDate(event_time) AS day,
    avgState(arrival_delay) AS avg_delay
FROM flights
GROUP BY day;

Now you are storing mergeable states.

When querying, you finalize them:

SELECT
    day,
    avgMerge(avg_delay) AS avg_delay
FROM daily_metrics
GROUP BY day;

Why GROUP BY Is Still Required

Another common confusion is:

“If I already aggregated before inserting, why do I still need GROUP BY when querying?”

Because merges are asynchronous.

You may have multiple parts containing states for the same day. The final aggregation only happens when you explicitly merge those states using functions like avgMerge().

GROUP BY ensures that all states for the same key are combined correctly at query time.

Once you understand that AggregatingMergeTree is designed for incremental, mergeable aggregation, everything becomes predictable:

Materialized views make sense
Backfills behave correctly
Background merges are no longer mysterious

The engine isn’t broken. The mental model was.

2. Solving Group-Wise Max Correctly with argMax

Another common analytical requirement looks simple on the surface:

“For each group, return the row with the maximum value.”

For example:

“For each airline, return the flight with the worst arrival delay.”

The naive approach might look like this:

SELECT
    airline,
    flight_number,
    max(arrival_delay)
FROM flights
GROUP BY airline, flight_number;

This computes the maximum delay per (airline, flight_number) combination. That’s not the same as “the worst flight per airline.”

It also produces multiple rows per airline, which means extra filtering or subqueries later.

ClickHouse provides a much cleaner way.

The ClickHouse-Native Solution

SELECT
    airline,
    argMax(flight_number, arrival_delay) AS flight_number,
    max(arrival_delay) AS max_delay
FROM flights
GROUP BY airline;

argMax(value, weight) means:

“Return the value associated with the maximum weight.”

In this case:

Track the maximum arrival_delay per airline
Return the flight_number corresponding to that maximum
Produce exactly one row per airline

No subqueries. No joins. No row explosion.

Why This Matters

argMax aligns with how ClickHouse performs aggregation internally. It combines value selection and aggregation into a single pass.

Instead of thinking in terms of relational rewrites, you’re using an aggregate function that expresses the intent directly.

This is both faster and clearer.

Once you start using argMax, many complex “best per group” queries collapse into a single SELECT.

The Bigger Pattern

Both of these examples point to the same lesson:

ClickHouse is not just “Postgres, but faster.”

It is designed around:

Immutable parts
Background merges
Mergeable aggregation states
Columnar execution
Specialized aggregate functions

If you treat it like a traditional row-based database, you’ll keep fighting it.

If you align your thinking with how it stores and merges data, entire classes of bugs and inefficiencies disappear.

Final Thoughts

Understanding AggregatingMergeTree as a state-based engine - and using functions like argMax for group-wise logic - changes how you model data in ClickHouse.

These aren’t edge cases. They show up in real production systems:

Pre-aggregated materialized views
Streaming ingestion pipelines
Analytical dashboards
Per-tenant metrics
Best/worst reporting

ClickHouse rewards engineers who understand its internals.

Once you stop thinking in values and start thinking in states - and once you use the aggregate functions the engine was built around - query design becomes simpler and more predictable.

And performance follows naturally.

DEV Community