Interactive explainer — adjust sliders and click cells to explore how metrics relate to each other.
Every prediction falls into one of four buckets: true positive (TP), false positive (FP), true negative (TN), or false negative (FN). True/False says whether the model was right; Positive/Negative says what it predicted. Click any cell or metric below to see how they connect.
Precision: use when false positives are costly (e.g. spam filters, fraud flags). Formula: TP / (TP + FP).
Recall: use when false negatives are costly (e.g. cancer screening, fault detection). Formula: TP / (TP + FN).
F1: harmonic mean of precision and recall. Useful when both matter and classes are imbalanced. Formula: 2 × Precision × Recall / (Precision + Recall). If either is 0, F1 is 0.
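Accuracy: share of all predictions that are correct. Rely on it only when classes are roughly balanced. Formula: (TP + TN) / (TP + FP + TN + FN). On a 95/5 class split, always predicting the majority class gives 95% accuracy with 0% recall on the minority (positive) class.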
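
The definitions above can be checked with a few lines of Python. This is a minimal sketch, not a reference implementation: the helper names `safe_div` and `metrics` are hypothetical, and returning 0.0 when a denominator is zero is an assumed convention (some libraries warn or return NaN instead). It reproduces the 95/5 example with 100 samples, where the minority class is treated as positive.

```python
def safe_div(num, den):
    """Return num / den, or 0.0 when den is zero (assumed convention)."""
    return num / den if den else 0.0


def metrics(tp, fp, tn, fn):
    """Compute the four metrics from confusion-matrix counts."""
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    # Harmonic mean of precision and recall; 0.0 if both are zero.
    f1 = safe_div(2 * precision * recall, precision + recall)
    accuracy = safe_div(tp + tn, tp + fp + tn + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Always predict the majority (negative) class on a 95/5 split:
# all 95 negatives are correct, all 5 positives are missed.
baseline = metrics(tp=0, fp=0, tn=95, fn=5)
print(baseline)
```

The baseline scores 0.95 accuracy while recall and F1 are both 0.0, which is why accuracy alone says little on an imbalanced dataset.
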
| What you observe | What it means |
|---|---|
| High accuracy, low recall | Model is hiding behind class imbalance. Don't ship it as-is. |
| High precision, low recall | Very conservative. Safe for low-stakes alerts; dangerous when missing cases matters. |
| High recall, low precision | Catches nearly everything but triggers many false alarms. |
| High F1 | Precision and recall are both high. More trustworthy than accuracy on imbalanced datasets. |
| F1 = 0 | The model produced no true positives, so precision and recall are both zero or undefined. It is not detecting the positive class. |
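
The table can be turned into a rough automated check. The sketch below is illustrative only: the function name `diagnose` is hypothetical, and the `high=0.8` and `low=0.5` cutoffs are assumptions chosen for the example, since the table does not define what counts as "high" or "low".

```python
def diagnose(precision, recall, accuracy, high=0.8, low=0.5):
    """Map metric values to the table's readings.

    The high/low thresholds are illustrative assumptions, not standard values.
    """
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0

    notes = []
    if f1 == 0:
        notes.append("F1 = 0: no true positives")
    if accuracy >= high and recall <= low:
        notes.append("High accuracy, low recall: possible class imbalance")
    if precision >= high and recall <= low:
        notes.append("High precision, low recall: very conservative")
    if recall >= high and precision <= low:
        notes.append("High recall, low precision: many false alarms")
    if f1 >= high:
        notes.append("High F1")
    return notes


# Majority-class baseline on a 95/5 split.
for note in diagnose(precision=0.0, recall=0.0, accuracy=0.95):
    print(note)
```

Several rows can apply at once, so the function returns a list rather than a single label. Sensible thresholds depend on the cost of each error type in the application.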