Forecasting Models
This package implements four different election forecasting models, each with different approaches to handling polling data.
Model Overview
Model |
Approach |
Complexity |
Performance |
|---|---|---|---|
Poll Average |
Weighted average of recent polls |
Low |
Baseline (yet second best) |
Kalman Diffusion |
Brownian motion with Kalman filter |
Medium |
Good |
Improved Kalman |
Kalman filter with drift and regularization |
Medium |
Better |
Hierarchical Bayes |
Bayesian ensemble with bias correction |
Highest |
Best |
1. Poll Average Model
The simplest baseline model that computes a weighted average of recent polls.
Key Features:
Uses 14-day polling window
Weights polls by sample size
Empirical uncertainty estimation
Simple horizon adjustment for days until election (decrease uncertainty as we approach election)
File: src/models/poll_average.py
Use Case: Quick baseline forecast, minimal computational cost
2. Kalman Diffusion Model
Implements a Brownian motion model using Kalman filter with Rauch-Tung-Striebel (RTS) smoother.
Key Features:
Brownian motion with drift dynamics
Forward Kalman filter + backward RTS smoother
EM algorithm for parameter estimation
Handles irregular time spacing in polls
Mathematical Model:
where \(\mu\) is drift, \(\sigma^2\) is diffusion variance, and \(R_t\) is observation noise.
File: src/models/kalman_diffusion.py
3. Improved Kalman Model
Enhanced version of Kalman Diffusion with stronger regularization and better uncertainty quantification.
Key Features:
Increased minimum diffusion variance
Regularized pollster bias estimates
Conservative probability clipping
Reduced forecast horizon uncertainty
Improvements over Basic Kalman:
Minimum diffusion variance: \(\sigma^2_{\min} = 0.003^2\)
Pollster bias regularization parameter: \(\lambda = 5.0\)
Probability clipping: [0.02, 0.98] instead of [0.01, 0.99]
File: src/models/improved_kalman.py
4. Hierarchical Bayes Model (Best)
The most sophisticated model combining multiple information sources with systematic bias correction.
Key Features:
Fundamentals Prior: Weighted average of 2012 (70%) and 2008 (30%) election results
Kalman-Filtered Polls: RTS smoother on recent polling data
House Effects: Hierarchical shrinkage estimation of pollster biases
Systematic Bias Correction: Adaptive correction for polling errors
Proper Uncertainty Quantification: Combines multiple uncertainty sources
Model Components:
House Effects Estimation:
where \(h_p\) is house effect for pollster \(p\), \(n_p\) is number of polls, \(\lambda\) is shrinkage parameter, and \(\bar{r}_p\) is mean residual. Essentially, if a pollster tends to have very different results compared to other pollsters in a similar time period, shrink their effect.
File: src/models/hierarchical_bayes.py
Adding New Models
We have made all scripts scalable so that all one needs to do to add a new forecasting mode is:
Create new file in
src/models/Inherit from
ElectionForecastModelbase classImplement the
fit_and_forecast()method, which should return:win_probability: float in [0,1]predicted_margin: float (Democratic margin)margin_std: float (standard deviation)
Example:
from src.models.base_model import ElectionForecastModel
class NewModel(ElectionForecastModel):
def __init__(self):
super().__init__("new_model")
def fit_and_forecast(self, state_polls, forecast_date,
election_date, actual_margin):
# Your forecasting logic here
return {
"win_probability": p,
"predicted_margin": m,
"margin_std": s,
}