WC 2026 Forecaster

EDA: data exploration

Patterns in the 49,256-match international dataset that informed the model. This is what the training data actually looks like before we Bayesian it.

Goals per match distribution

Most international matches finish with 2-3 goals total. The distribution has a heavy right tail thanks to occasional blowouts of 7+ goals.
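A minimal sketch of how a total-goals distribution like this can be tabulated. The function name `total_goals_distribution` and the input shape (a list of `(home_goals, away_goals)` pairs) are assumptions for illustration, and the sample scores are toy values, not rows from the real dataset.

```python
from collections import Counter


def total_goals_distribution(matches):
    """Return {total_goals: share_of_matches} for (home_goals, away_goals) pairs."""
    counts = Counter(home + away for home, away in matches)
    n_matches = sum(counts.values())
    return {goals: count / n_matches for goals, count in sorted(counts.items())}


# Toy scores for illustration only (hypothetical, not from the dataset).
sample = [(1, 0), (2, 1), (1, 1), (0, 0), (3, 0), (7, 1), (2, 0), (1, 2)]
dist = total_goals_distribution(sample)

# Share of matches in the 7+ goal tail.
tail_share = sum(share for goals, share in dist.items() if goals >= 7)
```

Summing the shares for totals of 7 or more gives a quick read on how much mass sits in the blowout tail.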

Home advantage at non-neutral venues

Home teams score ~0.4 more goals per match. The single biggest reason WC venue assignment matters.
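One way to measure a goal edge like this is the mean goal difference (home minus away) over matches that were not played at a neutral venue. The sketch below assumes each match is a dict with hypothetical keys `home_goals`, `away_goals`, and `neutral`; the real dataset's column names may differ, and the sample rows are made up.

```python
def home_goal_edge(matches):
    """Mean (home_goals - away_goals) over matches not played at a neutral venue."""
    diffs = [
        m["home_goals"] - m["away_goals"]
        for m in matches
        if not m["neutral"]
    ]
    if not diffs:
        raise ValueError("no non-neutral matches to average over")
    return sum(diffs) / len(diffs)


# Toy rows for illustration only (hypothetical field names and values).
sample = [
    {"home_goals": 2, "away_goals": 1, "neutral": False},
    {"home_goals": 1, "away_goals": 1, "neutral": False},
    {"home_goals": 0, "away_goals": 1, "neutral": False},
    {"home_goals": 3, "away_goals": 1, "neutral": False},
    {"home_goals": 0, "away_goals": 4, "neutral": True},  # excluded: neutral venue
]
edge = home_goal_edge(sample)
```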

Goals per match by year

Slight downward trend through the 2010s; stabilizing in the 2020s as elite defensive blocks become universal.

Result distribution: home advantage visible

Home teams win ~52% at non-neutral venues, ~37% at neutral. Draw rate is roughly stable around 25%.
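The split between non-neutral and neutral venues can be sketched as a grouped outcome tally. The input shape (tuples of `home_goals, away_goals, neutral`) and the function name `result_shares` are assumptions for illustration, and the sample rows are toy values.

```python
from collections import Counter, defaultdict

OUTCOMES = ("home_win", "draw", "away_win")


def result_shares(matches):
    """Return {neutral_flag: {outcome: share}} for (home_goals, away_goals, neutral) rows."""
    counts = defaultdict(Counter)
    for home_goals, away_goals, neutral in matches:
        if home_goals > away_goals:
            outcome = "home_win"
        elif home_goals == away_goals:
            outcome = "draw"
        else:
            outcome = "away_win"
        counts[neutral][outcome] += 1
    return {
        neutral: {o: tally[o] / sum(tally.values()) for o in OUTCOMES}
        for neutral, tally in counts.items()
    }


# Toy rows for illustration only (hypothetical values).
sample = [
    (2, 0, False), (1, 1, False), (3, 1, False), (0, 1, False),
    (1, 0, True), (0, 0, True),
]
shares = result_shares(sample)
```

At a neutral venue the "home" side is only the nominally listed first team, so comparing `shares[False]` with `shares[True]` isolates the effect of actually playing at home.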

Elo of all 48 WC2026 teams

Spain, Argentina, France lead. Curaçao and Cape Verde are in tournament-debut territory.

Elo by confederation

UEFA's median is highest, but CONMEBOL teams (only 6 qualifying) are much more concentrated at the top.

Most-played international rivalries since 1990

Long-running CONMEBOL pairings dominate. These 10 fixtures are responsible for ~5% of all training data.
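A sketch of how the most-played pairings and their share of the data could be counted. It treats A vs B and B vs A as the same fixture by sorting the two team names. The function name `top_pairings`, the `(year, home, away)` row shape, and whether the share is taken over matches since 1990 or over the whole dataset are assumptions here; the sample rows are made up.

```python
from collections import Counter


def top_pairings(matches, k=10, since=1990):
    """Return (top_k_pairings, share) for (year, home, away) rows from `since` onward.

    A pairing is order-independent: ("Uruguay", "Argentina") and
    ("Argentina", "Uruguay") count as the same fixture.
    """
    pairs = Counter(
        tuple(sorted((home, away)))
        for year, home, away in matches
        if year >= since
    )
    total = sum(pairs.values())
    top = pairs.most_common(k)
    share = sum(count for _, count in top) / total if total else 0.0
    return top, share


# Toy rows for illustration only (hypothetical values).
sample = [
    (1995, "Argentina", "Uruguay"),
    (2001, "Uruguay", "Argentina"),
    (2010, "Brazil", "Chile"),
    (1985, "Argentina", "Uruguay"),  # excluded: before 1990
    (2015, "Argentina", "Uruguay"),
]
top, share = top_pairings(sample, k=1)
```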

Where the matches come from

Friendlies dominate volume, which is why we down-weight them with `MATCH_WEIGHTS` in the model.
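To illustrate what down-weighting does, here is a minimal sketch of a weighted average over match types. The contents of `MATCH_WEIGHTS` are not shown on this page, so the mapping below (`EXAMPLE_MATCH_WEIGHTS`), its keys, its values, and the fallback `DEFAULT_WEIGHT` are all hypothetical placeholders, not the model's actual configuration.

```python
# Hypothetical stand-in: the real MATCH_WEIGHTS keys and values are not shown here.
EXAMPLE_MATCH_WEIGHTS = {
    "World Cup": 1.0,
    "Qualifier": 0.8,
    "Friendly": 0.4,
}
DEFAULT_WEIGHT = 0.6  # assumed fallback for match types not listed above


def weighted_mean_goals(matches, weights=EXAMPLE_MATCH_WEIGHTS):
    """Weighted mean of total goals for (match_type, total_goals) rows."""
    weighted_sum = 0.0
    weight_total = 0.0
    for match_type, total_goals in matches:
        w = weights.get(match_type, DEFAULT_WEIGHT)
        weighted_sum += w * total_goals
        weight_total += w
    if weight_total == 0:
        raise ValueError("no matches with positive weight")
    return weighted_sum / weight_total


# Toy rows for illustration only (hypothetical values).
sample = [("Friendly", 4), ("Friendly", 4), ("World Cup", 2)]
unweighted = sum(goals for _, goals in sample) / len(sample)
weighted = weighted_mean_goals(sample)
```

With these placeholder weights, the two high-scoring friendlies pull the estimate up less than they would in a plain average, which is the intended effect of down-weighting.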

Friendlies are higher-scoring

Managers experiment and defensive intensity is lower, which supports the calibration choice to weight friendlies less.

Goal-scoring and draw rates by decade

Modern football is slightly less goal-rich than in the 1990s, but the draw rate has barely budged.
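A per-decade summary like this can be sketched by bucketing matches on `year // 10 * 10`. The function name `decade_summary` and the `(year, home_goals, away_goals)` row shape are assumptions for illustration, and the sample rows are toy values.

```python
from collections import defaultdict


def decade_summary(matches):
    """Return {decade: {"goals_per_match", "draw_rate"}} for (year, home_goals, away_goals) rows."""
    buckets = defaultdict(list)
    for year, home_goals, away_goals in matches:
        buckets[year // 10 * 10].append((home_goals, away_goals))
    return {
        decade: {
            "goals_per_match": sum(h + a for h, a in rows) / len(rows),
            "draw_rate": sum(h == a for h, a in rows) / len(rows),
        }
        for decade, rows in sorted(buckets.items())
    }


# Toy rows for illustration only (hypothetical values).
sample = [(1994, 2, 1), (1998, 1, 1), (2014, 1, 0), (2018, 0, 0)]
summary = decade_summary(sample)
```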