MLB Predictor
Baseball model with a public track record.
problem
Sports models are easy to make overconfident. The useful work is less about the prediction itself and more about calibration, data hygiene, and learning when a model is simply guessing loudly.
context
This started as a weekend project and stays deliberately low-stakes, but it now runs daily in season. Baseball is a convenient sandbox because the data is rich, the season is long, and small edges tend to disappear fast.
what I built
I built the modeling workflow on public baseball data. It produces win probabilities, first-inning run predictions, and prop-style edges, and it feeds a dashboard that publishes the track record and calibration plots. Recent work added Beta calibration, multi-book line aggregation for fair-price comparison, and production monitoring so silent failures surface.
- Predicts win probability, first-inning runs, and a few prop-style outcomes
- Publishes a public track record with calibration checks rather than vibes
- Compares lines across books to keep the edge math honest
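As a rough illustration of the fair-price comparison, here is a minimal sketch of one common approach: convert each book's American moneyline to an implied probability, remove the vig proportionally, and take the median across books as a consensus fair probability. The function names, the book names, and the odds are all hypothetical, and proportional vig removal plus a median is an assumption for this sketch, not necessarily what the project does.

```python
from statistics import median


def american_to_implied(odds: int) -> float:
    """Convert American odds to an implied probability (vig still included)."""
    if odds < 0:
        return -odds / (-odds + 100)
    return 100 / (odds + 100)


def devig_two_way(home_odds: int, away_odds: int) -> tuple[float, float]:
    """Remove the vig proportionally so the two sides sum to 1."""
    p_home = american_to_implied(home_odds)
    p_away = american_to_implied(away_odds)
    total = p_home + p_away  # exceeds 1 by the book's margin
    return p_home / total, p_away / total


def consensus_fair_home_prob(lines: dict[str, tuple[int, int]]) -> float:
    """Median de-vigged home probability across books (assumed aggregation)."""
    return median(devig_two_way(home, away)[0] for home, away in lines.values())


def edge(model_prob: float, fair_prob: float) -> float:
    """Model probability minus the consensus fair probability."""
    return model_prob - fair_prob


# Hypothetical lines for one game: book name -> (home odds, away odds)
lines = {
    "book_a": (-120, 110),
    "book_b": (-125, 105),
    "book_c": (-118, 108),
}
fair_home = consensus_fair_home_prob(lines)
print(f"fair home prob: {fair_home:.3f}, edge at model 0.56: {edge(0.56, fair_home):+.3f}")
```

Comparing against a de-vigged consensus rather than a single book's raw price keeps one book's margin or one stale line from looking like an edge.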
what I learned
Most of the work is data plumbing. The model is only interesting after the inputs, evaluation, and calibration story are honest. That is why the track record is public.
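For a sense of what an honest calibration check involves, here is a minimal sketch with two standard tools: the Brier score and a binned reliability table that compares the mean predicted probability to the observed hit rate. The function names and the sample data are made up for illustration, and the equal-width binning is an assumption, not a description of the dashboard's actual checks.

```python
def brier_score(probs: list[float], outcomes: list[int]) -> float:
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def reliability_table(probs: list[float], outcomes: list[int], n_bins: int = 10):
    """Group predictions into equal-width bins and compare prediction to reality.

    Returns one row per non-empty bin: (bin index, count, mean prediction, hit rate).
    """
    bins: list[list[tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))

    rows = []
    for idx, items in enumerate(bins):
        if not items:
            continue
        mean_pred = sum(p for p, _ in items) / len(items)
        hit_rate = sum(y for _, y in items) / len(items)
        rows.append((idx, len(items), mean_pred, hit_rate))
    return rows


# Hypothetical predictions and results (1 = the predicted side won)
probs = [0.52, 0.58, 0.61, 0.64, 0.55, 0.67, 0.49, 0.71]
outcomes = [1, 0, 1, 1, 0, 1, 0, 1]

print(f"Brier score: {brier_score(probs, outcomes):.4f}")
for idx, count, mean_pred, hit_rate in reliability_table(probs, outcomes, n_bins=5):
    print(f"bin {idx}: n={count}, predicted={mean_pred:.2f}, observed={hit_rate:.2f}")
```

A model that is guessing loudly shows up here as bins where the predicted column sits well away from the observed column. With samples this small the rows are noise, which is one reason a long season helps.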
status
Runs daily in season with a public track-record dashboard. Still calibration practice, not betting advice.