
Matthieu

I don’t hate SQL. I hate metadata friction.

I don’t struggle with writing SQL. What I struggle with is everything around it. You know the drill:

  1. Open BigQuery console.
  2. Write a simple query on top of INFORMATION_SCHEMA.
  3. Realize you forgot the partition column.
  4. Copy an old query.
  5. Tweak it.
  6. Realize you need another column.
  7. Open docs.
  8. Switch to Airflow to check if you can get the answer faster.
  9. Go back to the BigQuery console.
  10. Forget what you were checking.

None of this is hard. It’s just constant, low-level friction repeated every other day.

So I built something small to reduce that friction.

The problem

I regularly ask questions like:

  • Which tables changed schema recently?
  • What jobs are consuming the most slots?
  • Why was BigQuery slow yesterday?
  • Why are BigQuery costs skyrocketing?

BigQuery exposes many answers through INFORMATION_SCHEMA.

But the queries are rarely trivial. They’re long. They require joins. You need to remember field names. And I constantly find myself going back to the docs for small details.

So I started modeling the metadata in DBT.

I created a few summarized tables for jobs, storage, schema changes, dependencies, and slot usage.

Nothing revolutionary. Just structured the data the way I actually use it. That helped, but I was still writing the same kinds of queries repeatedly.
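To give a flavor of what these models wrap: a minimal sketch of a "slot usage by user" query over INFORMATION_SCHEMA (the region default, column choices, and function name here are illustrative assumptions, not the actual models):

```python
# Sketch of the kind of INFORMATION_SCHEMA query a summarized "jobs"
# model wraps. Region and column choices are assumptions for illustration.
def daily_slot_usage_sql(region: str = "region-us") -> str:
    return f"""
    SELECT
      DATE(creation_time)                   AS job_date,
      user_email,
      SUM(total_slot_ms) / 1000 / 3600      AS slot_hours,
      COUNTIF(error_result IS NOT NULL)     AS failed_jobs
    FROM `{region}`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
    WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
    GROUP BY job_date, user_email
    ORDER BY slot_hours DESC
    """
```

Nothing exotic, but exactly the kind of query I kept retyping from memory (and getting the field names wrong).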

So I threw an LLM at it

Once I had clean DBT models with decent descriptions, the next step felt obvious.

Instead of writing SQL, what if I just asked questions? Most BI tools advertise text-to-SQL for business users. Why wouldn’t it work for engineers too?

Here’s what the current setup looks like:

  1. DBT models summarize BigQuery metadata.
  2. The data is synced into DuckDB (to avoid scanning BigQuery every time, and because I don't trust the LLMs).
  3. I pass the schema + column descriptions to an LLM.
  4. The LLM generates SQL.
  5. The SQL runs against DuckDB.

Results are shown in a small terminal UI. That’s it.
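The loop above can be sketched in a few lines. This is a hypothetical illustration, not the real code: the LLM is stubbed as a callable, the schema doc and table names are made up, and sqlite3 stands in for DuckDB (both speak SQL through the same DB-API shape):

```python
import sqlite3  # stand-in for DuckDB in this sketch; the real setup uses duckdb

# The model only ever sees the curated schema + descriptions, never the warehouse.
SCHEMA_DOC = """
table jobs_summary:
  job_date (DATE): day the job ran
  slot_hours (FLOAT): total slot-hours consumed that day
"""

def build_prompt(question: str) -> str:
    return (
        "You write SQL for the following tables:\n"
        f"{SCHEMA_DOC}\n"
        f"Question: {question}\n"
        "Answer with a single SQL query."
    )

def ask(question: str, llm, conn) -> list:
    sql = llm(build_prompt(question))    # llm: callable, prompt -> SQL string
    return conn.execute(sql).fetchall()  # rows get rendered in the TUI

# Usage with a stubbed "LLM":
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs_summary (job_date TEXT, slot_hours REAL)")
conn.execute("INSERT INTO jobs_summary VALUES ('2024-05-01', 12.5)")
rows = ask("total slot hours?", lambda p: "SELECT SUM(slot_hours) FROM jobs_summary", conn)
```

The key design choice is that generated SQL only ever touches the local mirror, so a bad query costs nothing.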

[Screenshot: the TUI console]

It’s basically a TUI where I can “chat” with my metadata.

A real example

One week BigQuery was painfully slow. Slot contention seemed likely; a new transformation had probably been deployed earlier that week.

Instead of digging manually, I tried:

why was bigquery slow this week?

It analyzed slot usage and long-running jobs and identified a new transformation model that was timing out after 6 hours.

I checked with the team responsible. They said they’d fix it.

A few days later, I asked:

Are the slot timeouts gone?

It confirmed that the long-running jobs had disappeared.

That’s when I realized this was actually useful for first-pass investigations.

[Screenshot: Analytics mode]

It’s not perfect

There were issues immediately.

First, long answers. Even simple questions triggered overly detailed reports. So I split it into two modes: Fast and Analytics.

But the biggest challenge was hallucinations.

BigQuery job IDs are UUID-like strings. Even when the query result was correct, the LLM sometimes invented job IDs in its explanation.

That’s obviously not acceptable.

I tried adding validation steps, but they quickly became too expensive in terms of tokens. Right now, I return raw query results along with summaries.

[Screenshot: expanding raw results from queries]
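One cheap guard that costs no extra LLM tokens: check that every job ID the summary mentions actually appears somewhere in the returned rows. A sketch, with a deliberately simplified ID pattern (real BigQuery job IDs are messier than this regex assumes):

```python
import re

# Simplified assumption: job IDs look like "project:location.long-hex-suffix".
JOB_ID_RE = re.compile(r"\b[\w-]+:[\w-]+\.[0-9a-f-]{20,}\b")

def ungrounded_ids(summary: str, rows: list) -> set:
    """Return job IDs mentioned in the summary that never appear in the results."""
    seen = {cell for row in rows for cell in row if isinstance(cell, str)}
    mentioned = set(JOB_ID_RE.findall(summary))
    return mentioned - seen

# Usage: flag the second ID, which the query never returned.
rows = [("proj:us.aaaaaaaaaaaaaaaaaaaa",)]
summary = ("Job proj:us.aaaaaaaaaaaaaaaaaaaa timed out; "
           "proj:us.bbbbbbbbbbbbbbbbbbbb looks fine.")
bad = ungrounded_ids(summary, rows)
```

A set-difference like this can't prove the explanation is right, but it catches the specific failure mode of invented identifiers.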

I also hit an infinite loop where the LLM queried the data, interpreted it, then queried again, and so on. Luckily, I only had $5 in credits. I added strict usage limits after that.
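The limits are nothing fancy: conceptually just a hard cap on round trips and tokens per question, so a runaway loop fails fast instead of burning credits. A sketch with made-up numbers:

```python
class BudgetExceeded(RuntimeError):
    pass

class UsageGuard:
    """Hard per-question caps on LLM calls and tokens (numbers are illustrative)."""
    def __init__(self, max_calls: int = 5, max_tokens: int = 20_000):
        self.max_calls, self.max_tokens = max_calls, max_tokens
        self.calls = self.tokens = 0

    def charge(self, tokens: int) -> None:
        # Called before every LLM round trip; raises once either cap is blown.
        self.calls += 1
        self.tokens += tokens
        if self.calls > self.max_calls or self.tokens > self.max_tokens:
            raise BudgetExceeded(f"stopped after {self.calls} calls / {self.tokens} tokens")
```

Crude, but it turns an infinite loop into a loud, bounded failure.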

Why DuckDB?

Mainly cost and speed. I didn’t want accidental expensive scans or slow iteration loops.

So I sync the relevant metadata into DuckDB and query that instead.
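The sync itself is just an incremental append keyed on a date column. A sketch over a generic DB-API connection (table and column names are assumptions; sqlite3 stands in for `duckdb.connect()` here, and DuckDB's Python API accepts the same `execute`/`executemany` calls):

```python
import sqlite3  # stand-in for duckdb.connect() in this sketch

def sync_rows(dest, table: str, rows: list) -> int:
    """Append only rows newer than the newest job_date already mirrored locally."""
    last = dest.execute(f"SELECT MAX(job_date) FROM {table}").fetchone()[0]
    fresh = [r for r in rows if last is None or r[0] > last]
    dest.executemany(f"INSERT INTO {table} VALUES (?, ?)", fresh)
    return len(fresh)

# Usage: only the 2024-05-02 row is new, so one row is appended.
dest = sqlite3.connect(":memory:")
dest.execute("CREATE TABLE jobs_summary (job_date TEXT, slot_hours REAL)")
dest.execute("INSERT INTO jobs_summary VALUES ('2024-05-01', 3.0)")
added = sync_rows(dest, "jobs_summary", [("2024-05-01", 3.0), ("2024-05-02", 5.0)])
```

Each sync then scans only the new slice of BigQuery metadata rather than the whole history.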

[Diagram: TUI architecture]

It works well at this scale, though obviously you can’t mirror unlimited history.

What this tool actually is

This is not a BI or an observability tool. It’s just a faster way to ask operational questions without writing repetitive metadata queries.

It can be used for:

  • First-level investigations
  • Sanity checks
  • Avoiding copy-paste SQL

That’s it. I use it when I need quick answers.

It doesn’t replace proper analysis. But it lowers the activation energy. And that alone makes it useful.

Open-sourcing?

I’m considering open-sourcing it. Right now it works with BigQuery metadata, but technically you could plug in any BigQuery dataset with proper table definitions.

But before I invest more time to support more data engineering ops metadata, I’d love to know:

  • Would you use something like this?
  • Does this solve a real annoyance for you?
  • What would immediately make it unusable?
  • What would make it indispensable?

If there’s interest, I’ll clean up the repo and open-source it.

I’m trying to validate whether this is just my personal itch or something broader.

Honest feedback (including brutal feedback) is welcome.
