DEV Community

Kiploks Robustness Engine


Why 90% of Backtests Lie: Introducing Kiploks Data Quality Guard (DQG)

Recently, I had an interesting conversation with an old acquaintance of mine, Kaspars. To be precise, I once worked for him as a developer.

At the moment, no one in my close circle really understands what kind of project I’m building, so I decided to ask for feedback from someone with strong startup experience. That conversation turned out to be extremely valuable - not just conceptually, but practically. I immediately started implementing several ideas that came out of it.

One of the key topics we discussed was integrating Kiploks Robustness Engine with third-party backtesting systems to perform focused, strategy-level analysis.

This post builds on Part 2, where I explained why most strategies should fail robustness checks. Here, I focus on what comes even earlier: data quality.

We quickly agreed that the first integration should be with Freqtrade, an open-source trading framework that supports backtesting, bots, and live trading. I already run several bots on Freqtrade myself, so this integration was a natural starting point.

The integration tests are currently in full swing, and the results look very promising. I genuinely believe that Freqtrade users will benefit from this work - saving both weeks of strategy testing and real money by avoiding weak or misleading strategies early.


The Unexpected Discovery: Data Quality Comes First

While working on the integration, I realized something important.

My analysis pipeline already contained a set of checks that didn’t really belong to performance metrics, risk metrics, or robustness metrics. These checks were answering a more fundamental question:

Can we trust the data at all?

That’s how a new analytical block was born:

Data Quality Guard (DQG).

DQG acts as Stage 0 of the entire analysis pipeline.
Before we evaluate alpha, Sharpe, or robustness - we verify whether the data itself is valid enough to support any conclusions.


Kill-Switch Logic: One Zero Invalidates Everything

Technically, DQG is built using a multiplicative scoring model - meaning that a single critical failure reduces the entire score to zero.

This is not a controversial idea in professional risk management.
In the industry, this approach is commonly referred to as Kill-Switch Logic.

If a strategy fails a fundamental data integrity check, no amount of profitability can justify deployment.
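
A minimal sketch of what a multiplicative score looks like in practice. This is an illustration, not the actual Kiploks implementation: the function name `dqg_score`, the check names, and the convention that each check returns a value between 0 and 1 are all assumptions made for the example.

```python
from math import prod


def dqg_score(checks: dict[str, float]) -> float:
    """Multiply per-check scores in [0, 1]; a single 0.0 zeroes the result.

    Hypothetical helper: the check names and the [0, 1] scale are
    assumptions for this sketch, not the real DQG interface.
    """
    for name, value in checks.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return prod(checks.values())


healthy = {"continuity": 1.0, "lookahead": 1.0, "sample_size": 0.8}
broken = {"continuity": 1.0, "lookahead": 0.0, "sample_size": 0.8}

print(dqg_score(healthy))  # 0.8
print(dqg_score(broken))   # 0.0
```

With an additive or averaged score, the `broken` case would still score 0.6. The product is what turns one failed check into a hard stop.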

To make it clear:
DQG is not an opinion.
It is an automation of well-known quantitative research standards.

Below are the core concepts DQG is based on.


1. Garbage In, Garbage Out (GIGO)

🔗 https://en.wikipedia.org/wiki/Garbage_in,_garbage_out

This is the foundation.

In trading, GIGO means that even the most advanced model will produce meaningless results if the input price data is broken, incomplete, or biased.

What DQG does:
It automatically filters out invalid datasets before the researcher wastes time optimizing noise.


2. Look-Ahead Bias (Data Snooping)

🔗 https://en.wikipedia.org/wiki/Look-ahead_bias
🔗 https://en.wikipedia.org/wiki/Data_snooping

This is the most critical failure mode.

Look-ahead bias occurs when a strategy uses information that was not available at the time of decision-making - even indirectly.

In academic literature, it is often discussed alongside selection bias and data snooping, which inflate backtest results in a related but distinct way.

If DQG detects look-ahead bias, the strategy is instantly rejected.
No exceptions.
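
One common way to test for look-ahead bias is to recompute the signal on truncated history and compare it with the signal computed on the full dataset. If the value at bar `i` changes once later bars are visible, the signal depends on the future. The sketch below illustrates that idea only. It does not show how DQG performs the check, and `has_lookahead`, `causal_signal`, and `leaky_signal` are hypothetical names.

```python
from typing import Callable, Sequence

SignalFn = Callable[[Sequence[float]], list[int]]


def has_lookahead(signal_fn: SignalFn, prices: Sequence[float]) -> bool:
    """Return True if any signal value changes when future bars are hidden."""
    full = signal_fn(prices)
    for i in range(len(prices)):
        truncated = signal_fn(prices[: i + 1])  # only bars 0..i are visible
        if truncated[i] != full[i]:
            return True
    return False


def causal_signal(prices: Sequence[float]) -> list[int]:
    """1 when the current bar closed above the previous one (past data only)."""
    return [
        1 if i > 0 and prices[i] > prices[i - 1] else 0
        for i in range(len(prices))
    ]


def leaky_signal(prices: Sequence[float]) -> list[int]:
    """Deliberately broken: peeks at the NEXT bar to decide the current one."""
    n = len(prices)
    return [
        1 if i + 1 < n and prices[i + 1] > prices[i] else 0
        for i in range(n)
    ]


prices = [10.0, 11.0, 10.5, 12.0, 11.8]
print(has_lookahead(causal_signal, prices))  # False
print(has_lookahead(leaky_signal, prices))   # True
```

The loop recomputes the signal once per bar, so the cost grows quadratically with the number of bars. On long histories, checking a sample of bars is a reasonable compromise.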


3. Data Integrity & Stationarity

🔗 https://en.wikipedia.org/wiki/Survivorship_bias
🔗 https://en.wikipedia.org/wiki/Outlier#In_statistics

Markets are continuous time series.
Missing candles, corrupted ticks, or discontinuities break indicator calculations like MA, RSI, or ATR and generate artificial signals.

DQG checks for:

  • Missing bars
  • Broken continuity
  • Survivorship bias
  • Price integrity issues

A dataset with gaps is not “slightly worse”.
It is invalid.
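
The missing-bar check above can be sketched as a scan over candle timestamps. This is an illustrative example under stated assumptions: timestamps are sorted, unique, and aligned to a fixed timeframe, and `find_gaps` is a hypothetical helper rather than part of DQG or Freqtrade.

```python
from datetime import datetime, timedelta, timezone


def find_gaps(
    timestamps: list[datetime], timeframe: timedelta
) -> list[tuple[datetime, datetime, int]]:
    """Return (previous_bar, next_bar, missing_count) for each gap.

    Assumes timestamps are sorted and unique; duplicates and
    out-of-order bars would need a separate check.
    """
    gaps = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        delta = cur - prev
        if delta > timeframe:
            missing = delta // timeframe - 1  # whole bars skipped
            gaps.append((prev, cur, missing))
    return gaps


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
hour = timedelta(hours=1)
# Hourly candles at 00:00, 01:00, 02:00, then 05:00 and 06:00.
candles = [start + hour * h for h in (0, 1, 2, 5, 6)]

gaps = find_gaps(candles, hour)
for prev, cur, missing in gaps:
    print(f"{missing} missing bar(s) between {prev:%H:%M} and {cur:%H:%M}")
# 2 missing bar(s) between 02:00 and 05:00
```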


4. Law of Large Numbers & Degrees of Freedom

🔗 https://en.wikipedia.org/wiki/Law_of_large_numbers
🔗 https://en.wikipedia.org/wiki/Overfitting
🔗 https://en.wikipedia.org/wiki/P-hacking

This is DQG’s protection against overfitting.

If a strategy has:

  • 10 optimized parameters
  • and only 30 trades total

Then the result is statistically meaningless: that is only 3 trades per optimized parameter.

A common rule of thumb among quantitative researchers is to require 10–20 trades per optimized parameter before treating results as credible.

Anything below that is curve-fitting.
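
The arithmetic behind this rule is simple enough to show directly. The sketch below assumes the lower bound of the 10–20 range as the default threshold. The function names and the `min_ratio` parameter are hypothetical, chosen for illustration.

```python
def trades_per_parameter(n_trades: int, n_params: int) -> float:
    """Ratio of closed trades to optimized parameters."""
    if n_params <= 0:
        raise ValueError("n_params must be positive")
    return n_trades / n_params


def sample_size_ok(n_trades: int, n_params: int, min_ratio: float = 10.0) -> bool:
    """True when the backtest has at least `min_ratio` trades per parameter."""
    return trades_per_parameter(n_trades, n_params) >= min_ratio


print(trades_per_parameter(30, 10))  # 3.0
print(sample_size_ok(30, 10))        # False
print(sample_size_ok(200, 10))       # True
```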


5. Outlier Dominance & Fat Tails

🔗 https://en.wikipedia.org/wiki/Fat-tailed_distribution
🔗 https://en.wikipedia.org/wiki/Black_swan_theory

If most of a strategy’s profit comes from:

  • a single trade
  • a rare price spike
  • or a bad tick

Then the strategy is not reproducible.

DQG flags cases where one trade dominates total PnL, indicating fat-tail dependency or data anomalies.
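
A rough way to quantify this is the share of net profit contributed by the single best trade. The sketch below is an assumption-laden illustration: the 50% threshold, the function names, and the choice to measure against net PnL are not taken from DQG.

```python
def top_trade_share(pnls: list[float]) -> float:
    """Fraction of net PnL contributed by the single best trade.

    Returns 0.0 when there is no net profit to be dominated.
    The share can exceed 1.0 when losses offset the other winners.
    """
    total = sum(pnls)
    if total <= 0:
        return 0.0
    return max(pnls) / total


def is_outlier_dominated(pnls: list[float], max_share: float = 0.5) -> bool:
    """True when one trade accounts for more than `max_share` of net PnL."""
    return top_trade_share(pnls) > max_share


spiky = [5.0, -2.0, 3.0, 94.0]            # one trade carries the result
balanced = [10.0, 12.0, -4.0, 9.0, 13.0]  # profit spread across trades

print(is_outlier_dominated(spiky))     # True
print(is_outlier_dominated(balanced))  # False
```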


How DQG Fits Into Kiploks Robustness Engine

DQG is not a standalone metric.
It directly feeds into the Investability Grade of a strategy.

A strategy can show 1000% annual return - but if DQG detects look-ahead bias or outlier dominance, its grade instantly drops to F (Non-Investable).

In Kiploks, Data Quality Guard accounts for 40% of the final decision weight.

Because without trustworthy data, everything else is just a story.


Kiploks Robustness Score Is Now Data-Aware

With the introduction of Data Quality Guard, the Robustness Score became data-aware.

If critical data checks fail, robustness metrics are invalidated and the final score is forced to Fail - no performance metric can override bad data.


Final Thought

Most traders start by asking:

“How profitable is this strategy?”

DQG forces a different question:

“Is this result even real?”

And surprisingly often, the answer is no.


I’m Radiks Alijevs, lead developer of Kiploks Robustness Engine.
I’m building tools to bring institutional-grade rigor into retail algorithmic trading.

Follow me if you want to see how I integrated Kiploks with Freqtrade, and how professional validation, data-quality gates, and kill-switch logic can be applied to real open-source trading systems.
