Maintain a scored prediction log

Record predictions with explicit probabilities and score them when they resolve.

Why it works

Calibration requires a feedback loop: you state a probability, the outcome happens, and you learn whether your stated confidence matched how often predictions at that confidence level came true. Without recording and scoring, confidence remains a feeling rather than a measurable quantity. The Brier score is the standard metric: lower is better, and because the penalty is the squared error, confident correct predictions score near zero while confident wrong ones cost far more than cautious misses.
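
To make the shape of the penalty concrete, here is a minimal sketch that scores a cautious and a confident prediction both ways. The function name brier is an illustrative choice, not part of any library.

```python
def brier(probability: float, outcome: int) -> float:
    """Squared error between a stated probability and a 0/1 outcome."""
    return (probability - outcome) ** 2

# Compare a cautious (60%) and a confident (90%) prediction,
# each scored once as correct (outcome 1) and once as wrong (outcome 0).
for p in (0.6, 0.9):
    print(f"p={p}: right -> {brier(p, 1):.2f}, wrong -> {brier(p, 0):.2f}")
# p=0.6: right -> 0.16, wrong -> 0.36
# p=0.9: right -> 0.01, wrong -> 0.81
```

Moving from 60% to 90% saves 0.15 when you are right but costs an extra 0.45 when you are wrong, which is why overconfidence shows up quickly in the score. Always answering 50% yields 0.25 on every question, a useful baseline to beat.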

How to do it

Set up a simple log: date, question, probability estimate, outcome (when resolved).
Record at least five to ten predictions per month to generate enough data for calibration analysis.
Score each resolved prediction using the Brier score formula: (probability − outcome)², where outcome is 0 or 1.
Plot your average Brier score over time; a sustained trend toward lower scores suggests real improvement, provided the questions have not simply become easier.
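
The steps above can be sketched in a few lines of Python. This is one possible layout under stated assumptions, not a prescribed format: the Prediction class, its field names, the monthly grouping, and the sample questions are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class Prediction:
    made_on: date
    question: str
    probability: float              # stated chance the question resolves "yes"
    outcome: Optional[int] = None   # 1 = yes, 0 = no, None = not yet resolved

    def __post_init__(self):
        # Reject entries that cannot be scored as probabilities.
        if not 0.0 <= self.probability <= 1.0:
            raise ValueError("probability must be between 0 and 1")

def brier_score(p: Prediction) -> float:
    """(probability - outcome)^2 for a resolved prediction."""
    return (p.probability - p.outcome) ** 2

def monthly_mean_brier(log):
    """Average Brier score of resolved predictions, keyed by (year, month)."""
    by_month = defaultdict(list)
    for p in log:
        if p.outcome is not None:   # skip unresolved predictions
            by_month[(p.made_on.year, p.made_on.month)].append(brier_score(p))
    return {month: sum(s) / len(s) for month, s in sorted(by_month.items())}

# Hypothetical sample entries, for illustration only.
log = [
    Prediction(date(2024, 1, 5), "Project ships by end of January?", 0.8, 1),
    Prediction(date(2024, 1, 12), "Candidate accepts the offer?", 0.7, 0),
    Prediction(date(2024, 2, 3), "Flight departs on time?", 0.6, 1),
    Prediction(date(2024, 2, 20), "Vendor renews the contract?", 0.9, None),
]

for month, score in monthly_mean_brier(log).items():
    print(month, round(score, 3))
```

Grouping by the month a prediction was made is only one option; grouping by resolution date works equally well as long as you stay consistent. The monthly averages are the values to plot.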

Evidence

Tetlock’s forecasting tournaments used Brier scoring throughout; forecasters who were tracked and scored improved significantly more than those who forecast without feedback. Scoring is how calibration feedback is put into practice, and it is central to the good-judgment research program. Mellers et al. (2015) traced that improvement to specific, trackable drivers: repeated practice and belief updating raised accuracy across successive forecasting rounds. The score itself goes back to Brier (1950), who introduced it as a way to verify probabilistic forecasts. (observational)

Personal prediction logs require discipline to maintain and only become statistically meaningful after sufficient volume — typically hundreds of predictions for fine-grained calibration.
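
As a rough sketch of what that calibration analysis can look like, the function below groups resolved predictions by stated probability (rounded to the nearest 10%) and reports how often each group came true. The function name, the rounding choice, and the sample data are assumptions made for illustration.

```python
from collections import defaultdict

def calibration_table(resolved):
    """Map each stated probability (rounded to 0.1) to (observed hit rate, count).

    `resolved` is an iterable of (probability, outcome) pairs, outcome 0 or 1.
    """
    groups = defaultdict(list)
    for probability, outcome in resolved:
        groups[round(probability, 1)].append(outcome)
    return {
        stated: (sum(outcomes) / len(outcomes), len(outcomes))
        for stated, outcomes in sorted(groups.items())
    }

# Hypothetical resolved predictions, for illustration only.
resolved = [(0.7, 1), (0.7, 1), (0.7, 0), (0.7, 1), (0.9, 1), (0.9, 0)]

for stated, (hit_rate, n) in calibration_table(resolved).items():
    print(f"said {stated:.0%}: came true {hit_rate:.0%} of the time (n={n})")
```

A well-calibrated forecaster's hit rate in each group sits close to the stated probability. With only a handful of entries per group, as in this toy data, the gaps are mostly noise, which is why the count is reported alongside each rate.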

Common mistake

Recording predictions in vague language ("I think it’ll work out") rather than numerical probabilities, which makes scoring impossible and calibration unmeasurable.
