Matched Markets

The Matching Challenge

The same real-world event appears on different platforms with different names:

Platform	Market Title
Polymarket	"Will Trump win the 2024 election?"
Kalshi	"Winner of 2024 Presidential Election: Trump"
PredictIt	"Donald Trump wins presidency 2024"

Same event. Different phrasing. Different prices. Simple string matching doesn’t work, so we match on meaning using text embeddings.
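To see why literal comparison falls short, the sketch below compares two of the titles above using exact equality and a simple token-overlap (Jaccard) score. The `tokenize` and `jaccard` helpers are illustrative only and are not part of the matching engine.

```typescript
// Illustrative helpers only; not part of the matching engine.
function tokenize(title: string): string[] {
  // Lowercase and keep runs of letters/digits as tokens.
  const tokens: string[] = title.toLowerCase().match(/[a-z0-9]+/g) || [];
  // Deduplicate while preserving order.
  return tokens.filter((t, i) => tokens.indexOf(t) === i);
}

// Jaccard score: shared tokens divided by total distinct tokens.
function jaccard(a: string[], b: string[]): number {
  const shared = a.filter((t) => b.indexOf(t) !== -1).length;
  const union = a.length + b.length - shared;
  return union === 0 ? 0 : shared / union;
}

const polymarketTitle = "Will Trump win the 2024 election?";
const kalshiTitle = "Winner of 2024 Presidential Election: Trump";

const exactMatch = polymarketTitle === kalshiTitle;
const overlap = jaccard(tokenize(polymarketTitle), tokenize(kalshiTitle));
console.log(exactMatch, overlap.toFixed(2));
```

Only three of the nine distinct tokens are shared ("trump", "2024", "election"), so a token-overlap threshold strict enough to avoid false matches would likely reject this pair.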

How Matching Works

Our matching engine uses a three-stage process:

Embedding Generation

Every market gets a 1536-dimensional vector embedding using OpenAI’s text-embedding-3-small model.

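import OpenAI from "openai";
const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
const response = await openai.embeddings.create({
  model: "text-embedding-3-small",
  input: market.title + " " + market.description
});
const embedding = response.data[0].embedding; // number[] with 1536 entries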

Similarity Search

We use pgvector with HNSW indexing for fast similarity search across 10,000+ markets.

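-- <=> is pgvector's cosine distance, so 1 - distance is cosine similarity
SELECT *, 1 - (p.embedding <=> k.embedding) AS similarity
FROM polymarket_events p
CROSS JOIN kalshi_events k
WHERE 1 - (p.embedding <=> k.embedding) > 0.70
ORDER BY similarity DESC;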
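In pgvector, `<=>` is the cosine distance operator, so `1 - (a <=> b)` is cosine similarity. The sketch below computes the same quantity in application code, which can help when spot-checking a single pair. The function name and the toy three-dimensional vectors are illustrative.

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Cosine similarity is undefined for zero vectors; this sketch returns 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy vectors (real embeddings here have 1536 dimensions).
const sameDirection = cosineSimilarity([1, 2, 3], [2, 4, 6]);
const orthogonal = cosineSimilarity([1, 0, 0], [0, 1, 0]);
console.log(sameDirection.toFixed(2), orthogonal.toFixed(2));
```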

Smart Filtering

High-similarity matches go through additional validation:

Temporal alignment: Do end dates match within 7 days?
Outcome mapping: Can YES/NO outcomes be paired?
Entity matching: Are key entities (people, companies) the same?
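The three checks above could be sketched as a single validation function. The `CandidateMarket` shape, its field names, and the rule that both sides must list the same normalized entities and outcome labels are assumptions for illustration. Only the 7-day window comes from the text.

```typescript
// Hypothetical market shape for illustration.
interface CandidateMarket {
  endDate: Date;
  outcomes: string[]; // e.g. ["YES", "NO"]
  entities: string[]; // e.g. ["Donald Trump"]
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Normalize a list of labels into a comparable, order-independent key.
function normalizedKey(values: string[]): string {
  return values.map((v) => v.trim().toLowerCase()).sort().join("|");
}

// Temporal alignment: end dates within `maxDays` of each other.
function endDatesAlign(a: CandidateMarket, b: CandidateMarket, maxDays = 7): boolean {
  return Math.abs(a.endDate.getTime() - b.endDate.getTime()) <= maxDays * DAY_MS;
}

// Outcome mapping (assumed rule): both markets expose the same outcome labels.
function outcomesPair(a: CandidateMarket, b: CandidateMarket): boolean {
  return normalizedKey(a.outcomes) === normalizedKey(b.outcomes);
}

// Entity matching (assumed rule): same non-empty entity set on both sides.
function entitiesMatch(a: CandidateMarket, b: CandidateMarket): boolean {
  return a.entities.length > 0 && normalizedKey(a.entities) === normalizedKey(b.entities);
}

function passesSmartFilter(a: CandidateMarket, b: CandidateMarket): boolean {
  return endDatesAlign(a, b) && outcomesPair(a, b) && entitiesMatch(a, b);
}

// Made-up example data.
const polyMarket: CandidateMarket = {
  endDate: new Date("2024-11-05T00:00:00Z"),
  outcomes: ["Yes", "No"],
  entities: ["Donald Trump"],
};
const kalshiMarket: CandidateMarket = {
  endDate: new Date("2024-11-08T00:00:00Z"),
  outcomes: ["YES", "NO"],
  entities: ["donald trump"],
};
console.log(passesSmartFilter(polyMarket, kalshiMarket));
```

A production entity check would likely need alias handling (for example "Trump" versus "Donald Trump"), which this sketch does not attempt.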

Confidence Tiers

Matches are classified by confidence score:

High Confidence (≥85%)
Medium Confidence (70-84%)
Low Confidence (<70%)

High Confidence: Auto-confirmed. These matches go live immediately.

Characteristics:

Near-identical phrasing
Same resolution criteria
Matching time boundaries

Example:

Polymarket: "Will Bitcoin reach $100k in 2024?"
Kalshi:     "Bitcoin at or above $100k by Dec 31, 2024"
Score:      0.92 ✓

Medium Confidence: Needs review. Flagged for human verification.

Characteristics:

Similar topic, different framing
Possible resolution differences
Ambiguous time boundaries

Example:

Polymarket: "Biden approval rating above 50%?"
Kalshi:     "Biden approval ≥50% in Gallup poll"
Score:      0.78 ⚠️
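The tier boundaries can be expressed as a small classifier. The thresholds come from the tiers listed above; the function and type names are illustrative, and treating everything from 0.70 up to (but not including) 0.85 as medium is an assumption about how the "70-84%" band is rounded.

```typescript
type ConfidenceTier = "high" | "medium" | "low";

// >= 0.85 is high, 0.70 up to 0.85 is medium, below 0.70 is low.
function classifyTier(score: number): ConfidenceTier {
  if (isNaN(score) || score < 0 || score > 1) {
    throw new RangeError("score must be between 0.0 and 1.0");
  }
  if (score >= 0.85) return "high";
  if (score >= 0.7) return "medium";
  return "low";
}

console.log(classifyTier(0.92)); // "high"   (Bitcoin example)
console.log(classifyTier(0.78)); // "medium" (Biden approval example)
```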

Match Signals

Beyond embedding similarity, we extract additional signals:

interface MatchSignals {
  similarity_score: number;      // 0.0 - 1.0
  temporal_overlap: boolean;     // End dates within 7 days?
  category_match: boolean;       // Same category?
  entity_match: boolean;         // Key entities align?
  volume_ratio: number;          // Relative trading activity
  price_spread: number;          // Current price difference
}
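The two numeric signals could be derived as in the sketch below. The `MarketSnapshot` shape is hypothetical, the sample volumes are made up, and the direction of `volume_ratio` (larger volume over smaller) is an assumption, since the interface does not specify it.

```typescript
// Hypothetical per-platform snapshot.
interface MarketSnapshot {
  price: number;  // YES price, 0.0 - 1.0
  volume: number; // traded volume in a common unit
}

// Absolute difference between the two current prices.
function priceSpread(a: MarketSnapshot, b: MarketSnapshot): number {
  return Math.abs(a.price - b.price);
}

// Assumed convention: larger volume divided by smaller, so the ratio is >= 1.
// Returns Infinity when the smaller side has no volume.
function volumeRatio(a: MarketSnapshot, b: MarketSnapshot): number {
  const smaller = Math.min(a.volume, b.volume);
  const larger = Math.max(a.volume, b.volume);
  return smaller > 0 ? larger / smaller : Infinity;
}

// Made-up example data.
const polymarketSnapshot: MarketSnapshot = { price: 0.52, volume: 230000 };
const kalshiSnapshot: MarketSnapshot = { price: 0.55, volume: 100000 };
console.log(
  priceSpread(polymarketSnapshot, kalshiSnapshot).toFixed(3),
  volumeRatio(polymarketSnapshot, kalshiSnapshot).toFixed(1)
);
```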

Example Match Analysis

{
  "polymarket_title": "Trump wins 2024 election?",
  "kalshi_title": "Donald Trump wins 2024 Presidential Election",
  "similarity_score": 0.94,
  "signals": {
    "temporal_overlap": true,
    "category_match": true,
    "entity_match": true,
    "volume_ratio": 2.3,
    "price_spread": 0.028
  },
  "confidence_tier": "high",
  "status": "auto_confirmed"
}

Current Statistics

Metric	Value
Matched Pairs	900+
Accuracy (High Confidence)	95%
Average Spread	3-7%
False Positive Rate	<5%

Common Match Categories

Category	Matched Pairs	Avg Spread
Politics	312	2.8%
Sports	245	4.1%
Crypto	178	3.5%
Economics	134	2.2%
Entertainment	87	5.6%

Handling Edge Cases

Time Period Mismatches

Polymarket: "Bitcoin hits $100k in 2024"
Kalshi:     "Bitcoin hits $100k in 2025"

→ NOT matched (different time periods)

Opposite Meanings

Polymarket: "Trump wins election"
Kalshi:     "Trump loses election"

→ NOT matched (opposite outcomes, handled separately)

Ambiguous Resolution

Polymarket: "Fed cuts rates before July"
Kalshi:     "Fed cuts rates in Q2"

→ FLAGGED for review (overlapping but not identical)
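One way to separate the time-related cases is to compare resolution windows as date intervals: identical windows can match, disjoint windows are rejected, and partially overlapping windows go to review. This is a sketch under the assumption that each title has already been parsed into a start and end date. The `ResolutionWindow` type, the parsed dates, and the year used for the Fed example are illustrative. It does not cover opposite-meaning pairs, which need a separate check.

```typescript
interface ResolutionWindow {
  start: Date;
  end: Date; // inclusive
}

type WindowDecision = "match" | "review" | "reject";

function compareWindows(a: ResolutionWindow, b: ResolutionWindow): WindowDecision {
  const sameStart = a.start.getTime() === b.start.getTime();
  const sameEnd = a.end.getTime() === b.end.getTime();
  if (sameStart && sameEnd) return "match";
  // Two inclusive intervals overlap when each starts no later than the other ends.
  const overlaps =
    a.start.getTime() <= b.end.getTime() && b.start.getTime() <= a.end.getTime();
  return overlaps ? "review" : "reject";
}

// Build a UTC date from a 1-based month.
const utc = (y: number, m: number, d: number): Date => new Date(Date.UTC(y, m - 1, d));

// "in 2024" vs "in 2025": disjoint windows.
const bitcoinDecision = compareWindows(
  { start: utc(2024, 1, 1), end: utc(2024, 12, 31) },
  { start: utc(2025, 1, 1), end: utc(2025, 12, 31) }
);

// "before July" (assumed to start Jan 1) vs "in Q2": overlapping, not identical.
const fedDecision = compareWindows(
  { start: utc(2024, 1, 1), end: utc(2024, 6, 30) },
  { start: utc(2024, 4, 1), end: utc(2024, 6, 30) }
);

console.log(bitcoinDecision, fedDecision); // "reject" "review"
```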

Using Matched Markets

In the UI

Look for the green “Matched” badge on market cards. Click it to expand and see:

Prices on each platform
Current spread
Historical convergence

Via API

// Get confirmed matched markets with a spread of at least 3%
const matches = await matchr.getMatchedMarkets({
  minSpread: 0.03,
  status: 'confirmed'
});

// Returns
[
  {
    polymarket: { id: '123', price: 0.52 },
    kalshi: { id: 'PRES-24-DT', price: 0.55 },
    spread: 0.03,
    confidence: 0.94
  }
]
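Once the response is in hand, a common next step is to rank pairs by spread. The sketch below assumes the response shape shown above; `MatchedPair`, `rankBySpread`, and `cheaperSide` are illustrative names and not part of the client, and the sample pairs are made up.

```typescript
interface PlatformQuote {
  id: string;
  price: number;
}

interface MatchedPair {
  polymarket: PlatformQuote;
  kalshi: PlatformQuote;
  spread: number;
  confidence: number;
}

// Keep pairs at or above a confidence floor, widest spread first.
// filter() returns a new array, so the input is not mutated by sort().
function rankBySpread(pairs: MatchedPair[], minConfidence = 0.85): MatchedPair[] {
  return pairs
    .filter((p) => p.confidence >= minConfidence)
    .sort((a, b) => b.spread - a.spread);
}

// Which platform currently quotes the lower price for a pair.
function cheaperSide(pair: MatchedPair): "polymarket" | "kalshi" {
  return pair.polymarket.price <= pair.kalshi.price ? "polymarket" : "kalshi";
}

// Made-up example data.
const samplePairs: MatchedPair[] = [
  { polymarket: { id: "a1", price: 0.52 }, kalshi: { id: "A-1", price: 0.55 }, spread: 0.03, confidence: 0.94 },
  { polymarket: { id: "b2", price: 0.4 }, kalshi: { id: "B-2", price: 0.33 }, spread: 0.07, confidence: 0.9 },
  { polymarket: { id: "c3", price: 0.2 }, kalshi: { id: "C-3", price: 0.3 }, spread: 0.1, confidence: 0.78 },
];

const ranked = rankBySpread(samplePairs);
console.log(ranked.map((p) => p.polymarket.id + ":" + cheaperSide(p)));
```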

Improving Match Quality

We continuously improve matching through:

Human Review

Medium-confidence matches are reviewed by our team. Decisions feed back into the algorithm.

User Feedback

Report incorrect matches directly in the UI. We investigate every report.

Model Updates

We evaluate newer embedding models (like text-embedding-3-large) and fine-tune thresholds.
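Threshold tuning can be framed as picking the lowest score cutoff that still meets a target precision on human-reviewed pairs. The sketch below is one simple way to do that, not a description of the production process. `ReviewedPair`, `precisionAt`, and `lowestThresholdFor` are hypothetical names, and the sample labels are made up.

```typescript
interface ReviewedPair {
  score: number;      // embedding similarity, 0.0 - 1.0
  isCorrect: boolean; // reviewer's decision
}

// Precision among pairs scoring at or above `threshold`; null if none qualify.
function precisionAt(pairs: ReviewedPair[], threshold: number): number | null {
  const accepted = pairs.filter((p) => p.score >= threshold);
  if (accepted.length === 0) return null;
  return accepted.filter((p) => p.isCorrect).length / accepted.length;
}

// Lowest candidate threshold whose precision meets the target, or null if none does.
function lowestThresholdFor(
  pairs: ReviewedPair[],
  candidates: number[],
  target: number
): number | null {
  const ascending = candidates.slice().sort((a, b) => a - b);
  for (const threshold of ascending) {
    const precision = precisionAt(pairs, threshold);
    if (precision !== null && precision >= target) return threshold;
  }
  return null;
}

// Made-up review decisions.
const reviewed: ReviewedPair[] = [
  { score: 0.95, isCorrect: true },
  { score: 0.9, isCorrect: true },
  { score: 0.86, isCorrect: true },
  { score: 0.8, isCorrect: false },
  { score: 0.74, isCorrect: true },
];

const chosen = lowestThresholdFor(reviewed, [0.7, 0.8, 0.85, 0.9], 0.95);
console.log(chosen);
```

With this sample, cutoffs of 0.70 and 0.80 admit the one incorrect pair, so the first cutoff that reaches 95% precision is 0.85.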

Next Steps

Price Discovery

Cross-Platform Routing
