📊 Dataset facts (refreshed 2026-06-11): The collection pipeline now holds 18.2M+ price snapshots (18,229,206) across 18,272 markets plus 1,795,838 orderbook-depth snapshots — verify live anytime at api.protodex.io/stats (updates every 15 minutes). The analysis below was run on the first ~8.9M-point cut; the edges have held as the archive has more than doubled. Downloadable SQLite at gumroad.com/l/agyjd.
Everyone says prediction markets are efficient. I spent months collecting data to test that claim.
The result: 18.2 million price snapshots across 18,272 markets — and the data tells a story most traders miss completely.
The Setup
I built an automated collector that snapshots every active Polymarket market every 15 minutes. Not just BTC or the US election — every market. Politics, sports, crypto, geopolitics, economics, entertainment, weather, science. All of it.
After 75 days of continuous collection (2026-03-28 → 2026-06-11):
| Metric | Value |
|---|---|
| Markets tracked | 18,272 |
| Price snapshots | 18,229,206 |
| Orderbook depth snapshots | 1,795,838 |
| Categories | 10 |
| Update frequency | every 15 min |
Most Polymarket datasets you'll find cover a single market or a single event. This covers the entire platform simultaneously — which lets you see patterns that single-market analysis can't.
Finding #1: Markets Are Not Efficient After Crashes
Everyone assumes prediction markets instantly price in new information. The data says otherwise.
I measured what happens after a price drops more than 20% between consecutive snapshots. Here's what the 5,629 crash events show:
| Time After Crash | Average Return | Events Measured |
|---|---|---|
| +15 min | +6.6% | 5,629 |
| +30 min | +8.8% | 5,629 |
| +45 min | +10.3% | 5,629 |
| +1 hour | +11.0% | 5,629 |
After a >20% crash, prices bounce back an average of 6.6% within 15 minutes.
This is classic mean reversion, and it's massive. For comparison, the S&P 500's average annual return is about 10%. Here the average post-crash bounce reaches 11.0% within an hour.
The reverse is also true. After a >10% pump:
| Time After Pump | Average Return |
|---|---|
| +15 min | -2.9% |
| +30 min | -3.7% |
Prices that spike tend to give it back. Markets overreact in both directions.
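A minimal sketch of how the crash and pump measurements above could be reproduced for a single market. It assumes the move is a relative change between consecutive 15-minute snapshots and that forward returns are measured from the post-event price; the function name and parameters are illustrative, not taken from the pipeline.

```python
from statistics import mean

def forward_returns(prices, threshold=-0.20, horizons=(1, 2, 3, 4)):
    """Average forward return after large snapshot-to-snapshot moves.

    prices:    one market's prices in time order, one per 15-minute snapshot.
    threshold: -0.20 flags drops of more than 20%; +0.10 flags pumps of more than 10%.
    horizons:  look-ahead in snapshots (1 = +15 min, 4 = +1 hour).
    Returns {horizon: mean return measured from the post-event price}.
    """
    collected = {h: [] for h in horizons}
    for i in range(1, len(prices)):
        prev, cur = prices[i - 1], prices[i]
        if prev <= 0 or cur <= 0:
            continue  # skip degenerate prices to avoid division by zero
        move = cur / prev - 1.0
        is_event = move < threshold if threshold < 0 else move > threshold
        if not is_event:
            continue
        for h in horizons:
            if i + h < len(prices):  # only count horizons that were actually observed
                collected[h].append(prices[i + h] / cur - 1.0)
    return {h: mean(v) for h, v in collected.items() if v}
```

Calling `forward_returns(prices)` gives the crash view, and `forward_returns(prices, threshold=0.10, horizons=(1, 2))` gives the pump view. To get platform-wide averages, pool the per-event returns across markets before averaging rather than averaging the per-market means.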
Finding #2: Hold Time Matters More Than Entry Price
I simulated the obvious strategy — buy the crash, sell the recovery — across the dataset. Here's how hold time affects the result:
| Max Hold | Trades | Win Rate | Total P&L |
|---|---|---|---|
| 2 hours | 10,204 | 54% | $87 |
| 6 hours | 8,324 | 64% | $108 |
| 12 hours | 7,295 | 70% | $121 |
| 24 hours | 6,225 | 75% | $135 |
| 48 hours | 5,352 | 81% | $142 |
The sweet spot is 12 hours. Going from 12h to 48h only adds $21 to total P&L but locks your capital 4x longer. Most of the money is made in the first few hours.
This surprised me. I expected entry price to be the key variable. It's not:
| Entry Price Range | Win Rate | Avg P&L Per Trade |
|---|---|---|
| Under $0.10 | 74% | $0.014 |
| $0.10 — $0.30 | 74% | $0.018 |
| $0.30 — $0.50 | 79% | $0.032 |
| Above $0.50 | 86% | $0.022 |
Higher-priced markets actually have better win rates, though average P&L per trade peaks in the $0.30 to $0.50 band ($0.032). The cheap ones look tempting, but they include more dust trades that go nowhere.
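The hold-time comparison can be sketched roughly as follows. The exit rule here is an assumption, since "sell the recovery" is not spelled out above: the sketch buys one share at the post-crash price, sells at the first snapshot that reaches a hypothetical 10% take-profit target, and otherwise sells when the max-hold window expires. It does not model capital constraints or overlapping positions.

```python
SNAPSHOTS_PER_HOUR = 4  # one snapshot every 15 minutes

def simulate_trade(prices, entry_idx, max_hold_snapshots, take_profit=0.10):
    """P&L of one share bought at prices[entry_idx].

    Exits at the first snapshot at or above the take-profit target,
    otherwise at the max-hold deadline (or the last available price).
    """
    entry = prices[entry_idx]
    target = entry * (1.0 + take_profit)
    last_idx = min(entry_idx + max_hold_snapshots, len(prices) - 1)
    for i in range(entry_idx + 1, last_idx + 1):
        if prices[i] >= target:
            return prices[i] - entry
    return prices[last_idx] - entry

def summarize(pnls):
    """Trade count, win rate, and total P&L for a list of per-trade results."""
    wins = sum(1 for p in pnls if p > 0)
    return {
        "trades": len(pnls),
        "win_rate": wins / len(pnls) if pnls else 0.0,
        "total_pnl": sum(pnls),
    }

def compare_hold_times(entries, hold_hours=(2, 6, 12, 24, 48)):
    """entries: list of (prices, entry_idx) pairs, one per detected crash."""
    return {
        hours: summarize(
            [simulate_trade(p, i, hours * SNAPSHOTS_PER_HOUR) for p, i in entries]
        )
        for hours in hold_hours
    }
```

Bucketing the same per-trade results by entry price instead of by hold window gives the second table's view.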
Finding #3: Category Is Your Edge Selector
Not all Polymarket categories behave the same:
| Category | Trades | Win Rate | P&L Per Trade | Verdict |
|---|---|---|---|---|
| Crypto | 646 | 78% | $0.030 | Best per-trade |
| Sports | 1,050 | 79% | $0.027 | Most consistent |
| Other | 1,872 | 75% | $0.024 | Most volume |
| Politics | 1,362 | 76% | $0.018 | Decent |
| Geopolitics | 890 | 71% | $0.016 | Below average |
| Economics | 101 | 69% | $0.008 | Avoid |
| Weather | 21 | 57% | Negative | Avoid |
Crypto and sports markets have the strongest mean reversion. Economics and weather markets are traps — they crash and stay crashed.
Why? Sports and crypto have event-driven resolution (the game happens, the price discovers). Economics markets depend on slow-moving indicators — when they crash, it's often because the fundamentals actually changed.
By raw market count, the platform today skews to "other" (10,571), sports (2,624), and crypto (2,345), so there's no shortage of markets in the categories that mean-revert best.
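A small sketch of the per-category breakdown, assuming each simulated trade has already been reduced to a `(category, pnl)` pair. The function name and input shape are illustrative.

```python
from collections import defaultdict

def stats_by_category(trades):
    """Group (category, pnl) pairs and rank categories by average P&L per trade."""
    buckets = defaultdict(list)
    for category, pnl in trades:
        buckets[category].append(pnl)
    rows = []
    for category, pnls in buckets.items():
        rows.append({
            "category": category,
            "trades": len(pnls),
            "win_rate": sum(1 for p in pnls if p > 0) / len(pnls),
            "avg_pnl": sum(pnls) / len(pnls),
        })
    # Best average P&L first, matching the ordering of the table above
    return sorted(rows, key=lambda r: r["avg_pnl"], reverse=True)
```

Keeping the trade count next to the win rate matters here: a category with 21 trades, like weather, gives a much noisier estimate than one with more than a thousand.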
Finding #4: The "Always Bet No" Strategy Is Overhyped
You may have seen the "Nothing Ever Happens" bot that bets NO on everything. The claim: 73% of Polymarket resolves NO.
I checked with 4,763 resolved binary markets from the API:
- All markets: 52.3% resolve NO (not 73%)
- Non-sports: 57%
- "Will X happen?" framing: 59.3%
The 73% figure comes from a heavily filtered subset. Across all markets, the NO edge is barely there — and at typical NO prices ($0.65-0.85), the math doesn't work.
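To make "the math doesn't work" concrete: a NO share pays $1 if the market resolves NO, so the expected profit per share is the NO probability minus the price paid. The sketch below plugs in the resolution rates above as that probability. This is a simplification, since it applies a platform-wide base rate to every market regardless of its price, and it ignores fees and spread.

```python
def no_bet_ev(no_rate, no_price):
    """Expected profit per NO share (pays $1 on NO) bought at no_price and held to resolution."""
    return no_rate * 1.0 - no_price

# Resolution rates from the list above, at both ends of the typical NO price range
for label, rate in (("all markets", 0.523), ("non-sports", 0.57), ("'Will X happen?'", 0.593)):
    for price in (0.65, 0.85):
        print(f"{label}: NO rate {rate:.1%}, price ${price:.2f}, EV {no_bet_ev(rate, price):+.3f}")
```

The break-even price equals the NO rate itself, so under these assumptions a blanket NO bet only pays when NO shares cost less than about $0.52 to $0.59, well below the typical range.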
The Dataset Is Free to Explore
I'm releasing the data across multiple platforms:
Free:
- Live API: 100 requests/day, no key required; hit /markets, /crashes, /stats directly
- Kaggle: markets.csv + price preview + SQLite DB
- HuggingFace: same files, HF ecosystem integration
- GitHub: browse the data, star if useful
Full historical archive ($9):
- Gumroad — the complete SQLite DB with orderbook depth
The pipeline keeps running every 15 minutes. If you want to reproduce any of these findings, everything is there.
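A minimal sketch for calling the live API from Python with only the standard library. The host and endpoint names come from this post; the `https` scheme, the JSON response format, and the helper names are assumptions, so check the actual response before relying on any field.

```python
import json
from urllib.request import urlopen

# Host as given in this post; the https scheme is an assumption
BASE_URL = "https://api.protodex.io"

def endpoint_url(name):
    """Build the URL for an endpoint such as 'stats', 'markets', or 'crashes'."""
    return f"{BASE_URL}/{name.strip('/')}"

def fetch_json(name, timeout=10):
    """GET an endpoint and decode the body as JSON (assumed response format)."""
    with urlopen(endpoint_url(name), timeout=timeout) as resp:
        return json.load(resp)
```

For example, `fetch_json("stats")` should return the current dataset counts. With a limit of 100 requests/day, cache responses locally rather than polling on every run.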
What I'd Build Next
If I were starting a Polymarket quant project today, I'd focus on:
- Real-time crash detection — the 6.6% bounce after crashes is the clearest edge
- Category rotation — crypto and sports, skip economics and weather
- 12-hour max hold — the data is unambiguous on this
- Cross-market signals — does a crash in one political market predict crashes in related ones?
The prediction market space is where crypto was in 2017 — growing fast, most participants losing money, and the edge goes to people with data infrastructure.
The data is collected from Polymarket's Gamma API and CLOB API using an automated pipeline. I also maintain protodex.io, a security-scored index of MCP servers.
Questions or want custom data cuts? LuciferForge@proton.me