Poisson Distribution for Soccer Betting
The working bettor's guide to Poisson models for football. Derive attack and defense strengths from league data, compute expected goals, build a full 6x6 scoreline probability matrix, extract 1X2 and Over/Under prices, and apply the Dixon-Coles correction that top quants swear by.
Soccer is one of the cleanest real-world domains for the Poisson distribution. Goals are rare events — about 2.7 per match across Europe's top five leagues — and they arrive at a roughly constant rate through a 90-minute window. That combination is exactly the setup Poisson was built for. If you know the expected number of goals a team will score in a particular match (the parameter lambda), you can price every scoreline from 0-0 to 9-9 in a single closed-form calculation.
Building a usable model takes three steps: estimate team-level strengths from a historical sample, translate those strengths into match-specific lambdas, and turn the two lambdas into a scoreline probability matrix. Every market price then falls out of the matrix as a sum of cells: 1X2, Over/Under, Asian handicap, correct score and both teams to score are all subset sums over the same 100-cell grid (0-9 goals per side).
1. The Core Poisson Formula
# Probability a team scores exactly k goals with expected goals lambda
P(X = k) = (lambda^k * e^(-lambda)) / k!
# For a whole match the two teams score independently (baseline assumption):
P(home=h, away=a) = P_home(h) * P_away(a)
# Lambda per match
lambda_home = AttackHome * DefenseAway * AvgHomeGoals
lambda_away = AttackAway * DefenseHome * AvgAwayGoals
Where:
AttackHome = team home goals/game / league avg home goals/game
DefenseAway = team away goals conceded/game / league avg away conceded/game
AttackAway = team away goals/game / league avg away goals/game
DefenseHome = team home goals conceded/game / league avg home conceded/game
AvgHomeGoals = league home goals per match (e.g. 1.52 EPL 2025-26)
AvgAwayGoals = league away goals per match (e.g. 1.18 EPL 2025-26)
Two nuances. First, attack and defense coefficients must be computed separately for home and away splits — teams behave very differently in front of their own crowd and the home-field effect is real and stable. Second, the league averages act as calibrators: multiplying attack by defense alone gives a dimensionless ratio, and you need the league mean to scale it back into "goals per match" units.
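As a minimal sketch, the two formulas above translate into a few lines of Python. The function names `poisson_pmf` and `match_lambdas` are illustrative choices, not part of any library.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson variable with mean lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

def match_lambdas(attack_home, defense_away, attack_away, defense_home,
                  avg_home_goals, avg_away_goals):
    """Expected goals per side: strength ratios scaled back by the league means."""
    lam_home = attack_home * defense_away * avg_home_goals
    lam_away = attack_away * defense_home * avg_away_goals
    return lam_home, lam_away

# An exactly average home side against an exactly average away side
# (all four ratios equal to 1) simply recovers the league means.
print(match_lambdas(1.0, 1.0, 1.0, 1.0, 1.52, 1.18))  # (1.52, 1.18)
```

The sanity check at the bottom illustrates the calibration point: with all ratios at 1, the lambdas collapse to the league averages.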
2. Worked Example — Manchester City vs Arsenal
Imagine a Premier League matchup late in the 2025-26 season. The league-level numbers for the season so far are 1.52 home goals and 1.18 away goals per match. The two teams have posted the following splits over roughly 15 home and 15 away games each:
| Team | Home Scored / Game | Home Conceded / Game | Away Scored / Game | Away Conceded / Game |
|---|---|---|---|---|
| Manchester City | 2.40 | 0.80 | 2.00 | 1.10 |
| Arsenal | 2.20 | 0.70 | 1.70 | 0.95 |
| League Average | 1.52 | 1.18 | 1.18 | 1.52 |
# Step 1 — strengths
Attack_Home(MCI) = 2.40 / 1.52 = 1.579
Defense_Home(MCI) = 0.80 / 1.18 = 0.678
Attack_Away(ARS) = 1.70 / 1.18 = 1.441
Defense_Away(ARS) = 0.95 / 1.52 = 0.625
# Step 2 — lambdas
lambda_home = Attack_Home(MCI) * Defense_Away(ARS) * AvgHome
= 1.579 * 0.625 * 1.52 = 1.500 expected goals
lambda_away = Attack_Away(ARS) * Defense_Home(MCI) * AvgAway
= 1.441 * 0.678 * 1.18 = 1.153 expected goals
# Step 3 — marginal Poisson P(X=k)
k P(City = k) P(Arsenal = k)
0 0.2231 0.3157
1 0.3347 0.3640
2 0.2510 0.2099
3 0.1255 0.0807
4 0.0471 0.0233
5 0.0141 0.0054
Manchester City are modelled to score 1.500 goals, Arsenal 1.153. Those numbers feel right: City are the stronger home side, Arsenal still a top-four defense. Now we combine the two marginals into a joint matrix by multiplying P_home(h) times P_away(a) for every pair (h, a). The table below shows the resulting probabilities rounded to four decimals.
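The strength and lambda arithmetic for this fixture can be reproduced with a short script. This is a sketch using the table's per-game splits; the variable names are illustrative. It carries unrounded intermediates, so the fourth decimal of a marginal can differ slightly from the hand-rounded figures.

```python
import math

LEAGUE_HOME, LEAGUE_AWAY = 1.52, 1.18   # league goals per match (home, away)

# Per-game splits from the table
city_home_scored, city_home_conceded = 2.40, 0.80
arsenal_away_scored, arsenal_away_conceded = 1.70, 0.95

attack_home = city_home_scored / LEAGUE_HOME         # City attack at home
defense_home = city_home_conceded / LEAGUE_AWAY      # City defense at home
attack_away = arsenal_away_scored / LEAGUE_AWAY      # Arsenal attack away
defense_away = arsenal_away_conceded / LEAGUE_HOME   # Arsenal defense away

lam_home = attack_home * defense_away * LEAGUE_HOME
lam_away = attack_away * defense_home * LEAGUE_AWAY

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

print(f"lambda_home={lam_home:.3f}  lambda_away={lam_away:.3f}")
for k in range(6):
    print(k, f"{poisson_pmf(k, lam_home):.4f}", f"{poisson_pmf(k, lam_away):.4f}")
```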
3. The Full Scoreline Matrix (0-0 through 5-5)
| City \ Arsenal | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| 0 | 0.0705 | 0.0812 | 0.0468 | 0.0180 | 0.0052 | 0.0012 |
| 1 | 0.1057 | 0.1219 | 0.0702 | 0.0270 | 0.0078 | 0.0018 |
| 2 | 0.0793 | 0.0914 | 0.0527 | 0.0203 | 0.0058 | 0.0014 |
| 3 | 0.0396 | 0.0457 | 0.0263 | 0.0101 | 0.0029 | 0.0007 |
| 4 | 0.0149 | 0.0172 | 0.0099 | 0.0038 | 0.0011 | 0.0003 |
| 5 | 0.0045 | 0.0051 | 0.0030 | 0.0011 | 0.0003 | 0.0001 |
The cells sum to approximately 0.995; the missing 0.5% or so sits in the 6+ goal tails, which we truncated. The single most likely scoreline is 1-1 at 12.19%, followed by 1-0 at 10.57% and 2-1 at 9.14%. Notice that 0-0 (7.05%) is slightly under-represented relative to the real Premier League frequency of low-score draws. This is exactly the flaw the Dixon-Coles correction addresses later.
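Building the grid is just an outer product of the two marginals. The sketch below assumes independent scoring and the example's lambdas; `score_matrix` is a hypothetical helper name. It also reports how much probability mass the truncated grid covers.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

def score_matrix(lam_home, lam_away, max_goals=5):
    """matrix[h][a] = P(home scores h) * P(away scores a), independence assumed."""
    home = [poisson_pmf(h, lam_home) for h in range(max_goals + 1)]
    away = [poisson_pmf(a, lam_away) for a in range(max_goals + 1)]
    return [[ph * pa for pa in away] for ph in home]

m = score_matrix(1.500, 1.153)
covered = sum(map(sum, m))   # mass inside the 0-5 grid; the rest is the 6+ tail
h, a = max(((h, a) for h in range(6) for a in range(6)),
           key=lambda s: m[s[0]][s[1]])
print(f"mass inside 6x6 grid: {covered:.4f}")
print(f"most likely score: {h}-{a}")
```

Raising `max_goals` to 9 shrinks the uncovered tail to a negligible amount, which is why a 10x10 grid is a common working size.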
4. Extracting 1X2 and Over/Under 2.5
# Home win = sum of cells where h > a
P(City win) = 0.1057 + 0.0793 + 0.0914 + 0.0396 + 0.0457 + ...
= 0.4478 inside the 6x6 grid, 0.4522 (45.22%) including the 6+ goal tail
# Draw = diagonal
P(Draw) = 0.0705 + 0.1219 + 0.0527 + 0.0101 + 0.0011 + 0.0001
= 0.2564 (25.64%)
# Away win = sum of cells where a > h
P(Arsenal win) = 1 - 0.4522 - 0.2564
= 0.2914 (29.14%)
# Over/Under 2.5
P(Under 2.5) = P(0-0)+P(1-0)+P(0-1)+P(2-0)+P(1-1)+P(0-2)
= 0.0705 + 0.1057 + 0.0812 + 0.0793 + 0.1219 + 0.0468
= 0.5054 (50.54%)
P(Over 2.5) = 1 - 0.5054 = 0.4946 (49.46%)
# Fair no-vig decimal odds
City win fair = 1 / 0.4522 = 2.211
Draw fair = 1 / 0.2564 = 3.900
Arsenal fair = 1 / 0.2914 = 3.432
Under 2.5 = 1 / 0.5054 = 1.979
Over 2.5 = 1 / 0.4946 = 2.022
If a bookmaker is pricing Manchester City at 2.32 decimal, your model's no-vig fair price is 2.211, so the book's price offers roughly a 4.9% edge. Plug that into an EV calculator and you get EV = (0.4522 × 1.32) − 0.5478 = +0.0491, or +4.91% ROI. Under Kelly sizing, that edge justifies roughly 3.7% of bankroll on full Kelly, or 0.93% on quarter-Kelly.
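The subset sums are easy to automate. The sketch below sums a 0-9 grid (so the tail is negligible) and renormalises, then adds EV and Kelly helpers. All function names and the bookmaker price are hypothetical. Because it does not round intermediate cells, expect small differences from hand-summed figures.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

def score_matrix(lam_home, lam_away, max_goals=9):
    home = [poisson_pmf(h, lam_home) for h in range(max_goals + 1)]
    away = [poisson_pmf(a, lam_away) for a in range(max_goals + 1)]
    return [[ph * pa for pa in away] for ph in home]

def market_probs(m):
    """Sum matrix cells into 1X2 and Over/Under 2.5 probabilities."""
    n = len(m)
    cells = [(h, a, m[h][a]) for h in range(n) for a in range(n)]
    total = sum(p for _, _, p in cells)   # slightly below 1 because of truncation

    def pick(cond):
        return sum(p for h, a, p in cells if cond(h, a)) / total

    return {
        "home": pick(lambda h, a: h > a),
        "draw": pick(lambda h, a: h == a),
        "away": pick(lambda h, a: h < a),
        "under_2.5": pick(lambda h, a: h + a <= 2),
        "over_2.5": pick(lambda h, a: h + a >= 3),
    }

def expected_value(p, decimal_odds):
    """EV per unit staked: win (odds - 1) with probability p, lose 1 otherwise."""
    return p * (decimal_odds - 1) - (1 - p)

def kelly_fraction(p, decimal_odds):
    """Full-Kelly stake as a fraction of bankroll; 0 when there is no edge."""
    return max(0.0, expected_value(p, decimal_odds) / (decimal_odds - 1))

probs = market_probs(score_matrix(1.500, 1.153))
for name, p in probs.items():
    print(f"{name:9s} p={p:.4f} fair={1 / p:.3f}")

book_price = 2.40   # hypothetical bookmaker price on the home win
print(f"EV={expected_value(probs['home'], book_price):+.4f} "
      f"quarter-Kelly={kelly_fraction(probs['home'], book_price) / 4:.4f}")
```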
5. Dixon-Coles Low-Score Correction
Pure Poisson has a well-known calibration flaw: it slightly under-predicts 0-0, 1-1 and (to a lesser degree) 2-2 draws, and slightly over-predicts 1-0 and 0-1 results. Dixon and Coles's 1997 paper introduced a four-cell correction function tau(h, a) that multiplies those specific low-score cells by small adjustment factors governed by a single nuisance parameter rho (typically estimated in the range -0.15 to -0.05 from historical data).
# Dixon-Coles adjustment (only affects four cells)
tau(0, 0) = 1 - lambda_h * lambda_a * rho
tau(0, 1) = 1 + lambda_h * rho
tau(1, 0) = 1 + lambda_a * rho
tau(1, 1) = 1 - rho
tau(h, a) = 1 otherwise
# Adjusted joint probability
P'(h, a) = P(h, a) * tau(h, a)
# Typical rho values
rho ~ -0.12 for top-tier European leagues
rho ~ -0.08 for higher-scoring leagues (Bundesliga, Eredivisie)
# Effect on our example with rho = -0.12 (lambda_h = 1.500, lambda_a = 1.153)
P(0-0) : 0.0705 -> 0.0851 (+1.46 pts)
P(1-1) : 0.1219 -> 0.1365 (+1.46 pts)
P(1-0) : 0.1057 -> 0.0911 (-1.46 pts)
P(0-1) : 0.0812 -> 0.0666 (-1.46 pts)
Draw probability rises from 25.64% to about 28.6%, with the extra mass taken equally from the home-win and away-win totals. The four shifts cancel, so the matrix still sums to the same total.
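A minimal sketch of the tau correction as stated above, applied cell by cell to an independent Poisson grid. The helper names are illustrative, and rho is passed in rather than fitted. Setting rho to 0 recovers the plain model, which makes a before-and-after comparison easy.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

def tau(h, a, lam_h, lam_a, rho):
    """Dixon-Coles factor; differs from 1 only on the four low-score cells."""
    if h == 0 and a == 0:
        return 1 - lam_h * lam_a * rho
    if h == 0 and a == 1:
        return 1 + lam_h * rho
    if h == 1 and a == 0:
        return 1 + lam_a * rho
    if h == 1 and a == 1:
        return 1 - rho
    return 1.0

def dixon_coles_matrix(lam_h, lam_a, rho, max_goals=9):
    """Independent Poisson grid with each cell multiplied by tau(h, a)."""
    return [[poisson_pmf(h, lam_h) * poisson_pmf(a, lam_a) * tau(h, a, lam_h, lam_a, rho)
             for a in range(max_goals + 1)]
            for h in range(max_goals + 1)]

lam_h, lam_a, rho = 1.500, 1.153, -0.12
plain = dixon_coles_matrix(lam_h, lam_a, 0.0)   # rho = 0: no adjustment
adj = dixon_coles_matrix(lam_h, lam_a, rho)
for h, a in [(0, 0), (1, 1), (1, 0), (0, 1)]:
    print(f"P({h}-{a}): {plain[h][a]:.4f} -> {adj[h][a]:.4f}")
```

The four adjustments offset each other exactly, so no renormalisation is needed after applying tau.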
6. Known Limitations
Pure Poisson assumes the two teams score independently. In reality a team leading 2-0 typically slows down while the trailing side pushes forward, so each side's scoring rate depends on the current score. A bivariate Poisson model relaxes the independence assumption at the cost of one extra parameter.
The per-minute goal rate is not actually constant. It rises in the final 15 minutes as teams chase or protect leads, and it falls after red cards for the side with fewer players. For pre-match betting the bias is small; for in-play it is critical.
Attack and defense strengths drift game to game with injuries, transfers and tactical change. Naive season-average coefficients lag reality; weight recent games exponentially (half-life 6-10 matches) or use xG-based inputs for cleaner signal.
With at most 19 games per split in a single season (and far fewer early on), team coefficients are noisy. Shrink toward the league mean (Bayesian prior) or use a multi-season blended sample to stabilize estimates before you start staking.
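Two of these fixes, recency weighting and shrinkage, can be sketched in a few lines. Both functions are illustrative: `prior_games` is a hypothetical tuning constant (how many league-average pseudo-matches to blend in), and the goal sequence is made up.

```python
def weighted_rate(goals, half_life=8.0):
    """Exponentially weighted goals per game; goals[-1] is the most recent match."""
    n = len(goals)
    # A match that is `half_life` games old gets half the weight of the latest one.
    weights = [0.5 ** ((n - 1 - i) / half_life) for i in range(n)]
    return sum(w * g for w, g in zip(weights, goals)) / sum(weights)

def shrink(team_rate, n_games, league_rate, prior_games=10.0):
    """Pull a noisy team rate toward the league mean.

    Equivalent to adding `prior_games` matches played at exactly the league rate.
    """
    return (n_games * team_rate + prior_games * league_rate) / (n_games + prior_games)

recent_home_goals = [1, 0, 2, 1, 3, 2, 4, 3]   # hypothetical, oldest first
print(round(weighted_rate(recent_home_goals), 3))
print(round(shrink(2.40, 15, 1.52), 3))         # 2.048
```

With 15 games behind it, a 2.40 scoring rate is pulled about 40% of the way back toward the 1.52 league mean under this assumed prior strength; the pull fades as the sample grows.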
7. Practical Workflow
- Pull last two seasons of match results for the league you are modelling (approximately 760 matches for an EPL two-season sample).
- Compute league-average home goals and away goals per match.
- For each team compute home-attack, home-defense, away-attack, away-defense with exponential weighting (half-life 8 matches).
- For the target fixture derive lambda_home and lambda_away.
- Build the 10x10 scoreline matrix, apply Dixon-Coles tau correction.
- Extract 1X2, Over/Under, BTTS, and correct-score probabilities.
- Convert to no-vig fair decimal odds. Compare against bookmaker prices; bet only where your probability exceeds the market's no-vig implied probability by at least 2%.
- Size using fractional Kelly on the edge.
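The comparison step in the workflow above needs the bookmaker's prices converted to no-vig probabilities first. The sketch below uses proportional normalisation, the simplest de-vig method (others exist and can differ on long shots), and reads the 2% threshold as absolute percentage points, which is an assumption. The prices are hypothetical.

```python
def no_vig_probs(decimal_odds):
    """Strip the bookmaker margin by scaling raw implied probabilities to sum to 1."""
    raw = [1 / o for o in decimal_odds]
    overround = sum(raw)              # > 1 when the book carries a margin
    return [r / overround for r in raw]

def has_edge(model_p, market_p, min_edge=0.02):
    """True when the model beats the no-vig market by at least min_edge (absolute)."""
    return model_p - market_p >= min_edge

book_1x2 = [2.10, 3.60, 3.40]         # hypothetical home / draw / away prices
market = no_vig_probs(book_1x2)
print([round(p, 4) for p in market])
print(has_edge(0.50, market[0]))      # model says 50% on the home win
```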
8. Frequently Asked Questions
Is the Poisson distribution actually accurate for soccer?
It is the best single-parameter starting point, but it needs corrections. Raw Poisson under-predicts 0-0, 1-1 and 2-2 draws. The Dixon-Coles correction fixes the four worst cells and brings calibration close to bookmaker accuracy.
How many historical matches do I need to fit attack/defense coefficients?
At least one full season (~380 matches for EPL) for stable estimates, ideally a blended two-season sample with more weight on recent games. Fewer than 150 matches per league produces unstable, over-fit coefficients.
Can I use xG instead of raw goals scored?
Yes, and you probably should. Replace 'goals scored per game' with 'expected goals (xG) per game' in the attack/defense calculation. xG is a less noisy input, especially over small samples, and typically improves Brier-score calibration by 5-10%.
How do I model in-play Poisson for live betting?
Reset lambda_remaining = lambda_full * (minutes_remaining / 90) each time you want a price, then factor in current score-state using score-effect multipliers (teams trailing late increase their lambda by about 10-20%, leading teams decrease theirs by a similar amount).
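A rough sketch of that in-play scaling follows. The function names and the 15% score-state multipliers are assumptions for illustration. The linear time scaling treats the goal rate as constant across the match, so it will understate late-game scoring somewhat.

```python
import math

def remaining_lambda(lam_full, minutes_remaining, score_multiplier=1.0):
    """Scale a pre-match lambda to the time left, then apply a score-state factor."""
    return lam_full * (minutes_remaining / 90) * score_multiplier

def prob_no_more_goals(lam_home_rem, lam_away_rem):
    """P(neither side scores again) under independent Poisson scoring."""
    return math.exp(-(lam_home_rem + lam_away_rem))

# Hypothetical state: 60th minute, home side leading, so 30 minutes remain.
lam_h = remaining_lambda(1.500, 30, score_multiplier=0.85)   # leader eases off (assumed)
lam_a = remaining_lambda(1.153, 30, score_multiplier=1.15)   # chaser pushes on (assumed)
print(round(lam_h, 3), round(lam_a, 3),
      round(prob_no_more_goals(lam_h, lam_a), 3))
```

The remaining-time lambdas feed the same scoreline matrix as before; the final score is the current score plus the goals drawn from that matrix.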
What rho value should I use for Dixon-Coles?
Estimate rho from historical data using maximum likelihood. For the Premier League typical estimates are -0.10 to -0.14. Higher-scoring leagues like the Bundesliga sit around -0.06 to -0.09. Using rho = -0.12 as a default for EPL is reasonable if you can't fit it yourself.
Does Poisson work for Asian handicaps and correct-score markets?
Yes. Asian handicap probabilities are subsets of the same 10x10 matrix — simply sum the cells that satisfy the handicap condition. Correct score markets read directly off individual cells. This is where Poisson shines: one model, dozens of derived prices.
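As an illustration, here is a sketch of the handicap sum for whole and half lines on the home side. `handicap_probs` is a hypothetical helper. Quarter lines such as -0.75 are not handled; they are usually settled as two half-stakes on the neighbouring lines.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

def handicap_probs(lam_home, lam_away, line, max_goals=9):
    """Win / push / lose probabilities for the HOME side at a whole or half line.

    line = -1.0 means the home side gives up one goal.
    """
    win = push = lose = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, lam_home) * poisson_pmf(a, lam_away)
            margin = h - a + line
            if margin > 0:
                win += p
            elif margin == 0:
                push += p        # stake refunded on whole lines
            else:
                lose += p
    return win, push, lose

win, push, lose = handicap_probs(1.500, 1.153, -1.0)
fair_odds = 1 + lose / win       # pushes refund the stake, so they drop out
print(f"win={win:.4f} push={push:.4f} lose={lose:.4f} fair={fair_odds:.3f}")
```

At a line of -0.5 the push probability is zero and the win probability equals the ordinary home-win probability, which is a handy consistency check.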
Responsible gambling notice. This article is educational. Statistical models do not guarantee profit — every wager carries real financial risk and soccer results include substantial unpredictable variance. Stake only what you can afford to lose. For support with problem gambling visit BeGambleAware.org or call 1-800-GAMBLER (US). Must be of legal betting age in your jurisdiction.