What is Aggregation?
Aggregation is the process of collecting, normalizing, and unifying data from multiple prediction market platforms into a single coherent view. Think of Matchr as the Google of prediction markets: we index everything so you don’t have to.
The Data Pipeline
Our aggregation pipeline runs continuously, processing data from multiple sources.
Sources We Aggregate
Polymarket
~7,000 markets tracked
- Gamma API for events & metadata
- CLOB API for orderbook & prices
- WebSocket for real-time updates
Kalshi
~3,400 markets tracked
- Elections API for event data
- Trade API for market details
- REST polling for price updates
What We Collect
For each market across all platforms, we aggregate:
Event Data
| Field | Description |
|---|---|
| title | Event name/question |
| description | Detailed resolution criteria |
| category | Politics, Sports, Crypto, etc. |
| end_date | When the market resolves |
| image | Event thumbnail |
Market Data
| Field | Description |
|---|---|
| outcomes | YES/NO or multiple choice options |
| prices | Current bid/ask for each outcome |
| volume | Total trading volume |
| liquidity | Available orderbook depth |
Real-time Data
| Field | Description |
|---|---|
| best_bid | Highest buy price |
| best_ask | Lowest sell price |
| last_price | Most recent trade |
| price_history | Historical price data |
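As a small illustration of how the real-time fields relate, the sketch below derives a midpoint and a spread from `best_bid` and `best_ask`. It assumes prices are expressed as probabilities between 0 and 1, which this page does not state; the function name `quote_summary` is hypothetical and not part of any Matchr API.

```python
def quote_summary(best_bid: float, best_ask: float) -> dict[str, float]:
    """Derive midpoint and spread from a top-of-book quote.

    Assumption: prices are probabilities in [0, 1].
    """
    if not 0.0 <= best_bid <= best_ask <= 1.0:
        raise ValueError("expected 0 <= best_bid <= best_ask <= 1")
    return {
        "mid": (best_bid + best_ask) / 2,   # halfway between bid and ask
        "spread": best_ask - best_bid,      # cost of crossing the book
    }
```

A narrow spread generally indicates a more liquid market, so a value like this could sit alongside `liquidity` when comparing venues.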
Data Normalization
Different platforms structure data differently. We normalize everything:
- Polymarket Format
- Kalshi Format
- Matchr Unified
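A minimal sketch of what such a normalization step might look like is shown below. The input field names (`question`, `yes_bid`, `close_time`, and so on), the cents-versus-decimal price convention, and the `UnifiedMarket` record are all illustrative assumptions; they are not taken from the platforms' documented schemas or from Matchr's actual unified format.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class UnifiedMarket:
    """Hypothetical unified record; field names follow the tables above."""
    platform: str
    title: str
    outcomes: tuple[str, ...]
    best_bid: float  # assumed to be a probability in [0, 1]
    best_ask: float
    volume: float
    end_date: str


def from_polymarket(raw: dict[str, Any]) -> UnifiedMarket:
    # Assumed input shape: prices already quoted as 0-1 decimals.
    return UnifiedMarket(
        platform="polymarket",
        title=raw["question"].strip(),
        outcomes=tuple(raw["outcomes"]),
        best_bid=float(raw["best_bid"]),
        best_ask=float(raw["best_ask"]),
        volume=float(raw["volume"]),
        end_date=raw["end_date"],
    )


def from_kalshi(raw: dict[str, Any]) -> UnifiedMarket:
    # Assumed input shape: prices quoted in integer cents (0-100).
    return UnifiedMarket(
        platform="kalshi",
        title=raw["title"].strip(),
        outcomes=("YES", "NO"),
        best_bid=raw["yes_bid"] / 100,
        best_ask=raw["yes_ask"] / 100,
        volume=float(raw["volume"]),
        end_date=raw["close_time"],
    )
```

Keeping one adapter function per platform means that downstream code only ever sees `UnifiedMarket`, so adding a new source would not require changes elsewhere.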
Update Frequency
Market Refresh
Every 5 minutes
New markets, metadata changes
Price Updates
Real-time
WebSocket for Polymarket, 30s polling for Kalshi
Matching Engine
Continuous
New matches detected as markets appear
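The polled cadences above could be driven by a simple interval check, sketched below. The intervals come from this page (5 minutes for market refresh, 30 seconds for Kalshi price polling); the task names and the `due_tasks` helper are hypothetical. Polymarket prices are omitted because they arrive over WebSocket rather than by polling.

```python
# Intervals taken from the cadences above; the task names are hypothetical.
INTERVALS_S = {
    "market_refresh": 300,     # every 5 minutes
    "kalshi_price_poll": 30,   # 30 s REST polling
}


def due_tasks(now: float, last_run: dict[str, float]) -> list[str]:
    """Return the tasks whose interval has elapsed since their last run.

    A task that has never run (missing from last_run) is always due.
    """
    return [
        name
        for name, interval in INTERVALS_S.items()
        if now - last_run.get(name, float("-inf")) >= interval
    ]
```

In a real loop, `now` would typically come from `time.monotonic()`, and each returned task would be recorded in `last_run` once it completes.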
Embeddings & AI
Every market gets an AI embedding for semantic search and matching:
- Semantic search: Find markets by meaning, not just keywords
- Market matching: Identify equivalent markets across platforms
- Similarity scoring: Rank match confidence
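One common way to score similarity between embeddings is cosine similarity, sketched below. This page does not say which embedding model or distance metric Matchr uses, so treat the metric, the function names, and the toy vectors as assumptions for illustration only.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction; treat it as unrelated
    return dot / (norm_a * norm_b)


def rank_matches(
    query: list[float], candidates: dict[str, list[float]]
) -> list[tuple[str, float]]:
    """Rank candidate market IDs by similarity to the query, best first."""
    scored = [(market_id, cosine_similarity(query, vec))
              for market_id, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

For example, ranking a Polymarket market's embedding against a set of Kalshi embeddings would surface the most likely equivalent market first, and the score could serve as a match-confidence value.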
Database Architecture
We use a multi-layer data model:
Raw Layer
Complete API responses stored as JSONB. Preserves all original data. Tables:
raw.polymarket_events, raw.kalshi_events, raw.polymarket_prices
Core Layer
Normalized, unified schema. Platform-agnostic. Tables:
core.events, core.markets, core.price_snapshots
Performance
| Metric | Value |
|---|---|
| Total markets tracked | 10,000+ |
| Price update latency | Under 500ms |
| API response time | Under 200ms (p95) |
| Data freshness | Under 5 minutes |
Next Steps
Matched Markets
Learn how we identify equivalent markets across platforms.
Price Discovery
Understand how prices converge across venues.
