Model evaluation metrics provide quantitative measures to assess the quality of predictions. Choosing the right metric depends on the task type and the problem's specific demands. Metrics for classification focus on measuring correctness and errors in predicted classes, while regression metrics measure differences between predicted continuous values and actual outcomes.
Classification Metrics
Reliable model assessment depends on using the right set of metrics tailored to the dataset and business goal. Outlined here are the key evaluation tools used to gauge classification performance.
1. Accuracy: The proportion of correct predictions made by the model out of all predictions.
Formula: Accuracy = (Number of correct predictions) / (Total number of predictions)
When to Use: In balanced datasets where classes are approximately equally represented.
Limitation: Can be misleading in imbalanced datasets (e.g., rare event detection).
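Accuracy is straightforward to compute by hand. A minimal sketch in pure Python (the function name is illustrative):

```python
def accuracy(y_true, y_pred):
    # Proportion of predictions that exactly match the true labels.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]
print(accuracy(y_true, y_pred))  # 5 of 6 correct
```

Note how a trivial "always predict the majority class" model would score highly on an imbalanced version of this data, which is exactly the limitation described above.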
2. Precision: The proportion of true positive predictions among all positive predictions.
Formula: Precision = TP / (TP + FP) (where TP = true positives, FP = false positives)
Use Case: Important when the cost of false positives is high (e.g., spam detection).
Interpretation: High precision means few false alarms.
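The same counting approach extends directly to precision. A minimal sketch in pure Python (assumes binary 0/1 labels and at least one predicted positive):

```python
def precision(y_true, y_pred):
    # TP: predicted positive and actually positive.
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    # FP: predicted positive but actually negative (a "false alarm").
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    return tp / (tp + fp)

y_true = [1, 0, 1, 1, 0]
y_pred = [1, 1, 0, 1, 0]
print(precision(y_true, y_pred))  # 2 true positives, 1 false positive
```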
3. Recall (Sensitivity): The proportion of actual positives correctly identified by the model.
Formula: Recall = TP / (TP + FN) (where FN = false negatives)
Use Case: Crucial when missing positive cases is costly (e.g., disease diagnosis).
Interpretation: High recall means few misses.
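Recall mirrors precision but counts misses instead of false alarms. A minimal sketch in pure Python (assumes binary 0/1 labels and at least one actual positive):

```python
def recall(y_true, y_pred):
    # TP: predicted positive and actually positive.
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    # FN: actually positive but predicted negative (a "miss").
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp / (tp + fn)

y_true = [1, 0, 1, 1, 0]
y_pred = [1, 1, 0, 1, 0]
print(recall(y_true, y_pred))  # 2 true positives, 1 false negative
```

Comparing this with the precision sketch makes the trade-off concrete: lowering the decision threshold typically raises recall (fewer misses) at the cost of precision (more false alarms).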
4. Area Under the ROC Curve (AUC-ROC): Measures the model’s ability to distinguish between classes across different classification thresholds.
Interpretation: AUC ranges from 0 to 1. A value of 0.5 corresponds to random guessing, 1.0 to perfect separation, and values below 0.5 to a model that ranks worse than chance.
Use Case: Useful in comparing classifiers when data is imbalanced or threshold selection varies.
Description: The ROC curve plots the true positive rate against the false positive rate at various threshold settings; the AUC summarises this curve in a single number.
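AUC-ROC has a useful probabilistic reading: it equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one (ties counted as half). A minimal sketch in pure Python based on that pairwise definition (the function name is illustrative; for large datasets a rank-based implementation would be more efficient):

```python
def auc_roc(y_true, scores):
    # Scores assigned to actual positives and actual negatives.
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    # Count positive/negative pairs where the positive is ranked
    # higher; ties contribute 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y_true = [1, 1, 0, 0]
scores = [0.9, 0.2, 0.3, 0.1]
print(auc_roc(y_true, scores))  # 3 of 4 pairs ranked correctly -> 0.75
```

Because the computation depends only on the ranking of scores, not on any single threshold, it suits the use case above: comparing classifiers when threshold selection varies.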
Regression Metrics
Regression metrics help determine how closely model predictions align with actual outcomes across a dataset. Here is the primary evaluation approach used to measure predictive error and model fit.
Root Mean Squared Error (RMSE): The square root of the average squared differences between predicted and actual values.
Formula: RMSE = √( (1/n) · Σ (yᵢ − ŷᵢ)² ), where yᵢ is the actual value, ŷᵢ the predicted value, and n the number of observations.
Use Case: Provides intuitive error magnitude in the same units as the target variable.
Advantages: Penalises larger errors more than smaller ones, useful when large errors are particularly undesirable.
Limitation: Sensitive to outliers.
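The definition translates directly into code. A minimal sketch in pure Python (the function name is illustrative):

```python
import math

def rmse(y_true, y_pred):
    # Mean of the squared prediction errors, then square root
    # to return the error in the units of the target variable.
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)

y_true = [3.0, 5.0, 2.5]
y_pred = [2.5, 5.0, 4.0]
print(rmse(y_true, y_pred))
```

The squaring step is what makes RMSE both its strength and its weakness: a single large residual (an outlier) dominates the sum, which is desirable when large errors are costly but misleading when the outlier is noise.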