Python Machine Learning – AUC – ROC Curve
The AUC – ROC Curve (Area Under the Receiver Operating Characteristic Curve) is an important metric used to evaluate the performance of a classification model, particularly for binary classification tasks. It helps to determine how well the model distinguishes between the two classes (positive and negative).
1. Understanding ROC Curve
The Receiver Operating Characteristic (ROC) curve is a graphical plot that shows the trade-off between the True Positive Rate (TPR) and False Positive Rate (FPR) at different classification thresholds.
- True Positive Rate (TPR) or Recall: The proportion of actual positives correctly identified by the model.
- False Positive Rate (FPR): The proportion of actual negatives that were incorrectly classified as positive.
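Each point on the ROC curve corresponds to one classification threshold. For a model that outputs probabilities, the threshold ranges from 0 to 1: lowering it labels more samples as positive, which raises both TPR and FPR.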
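To make the two rates concrete, here is a minimal sketch that computes TPR and FPR at a few thresholds. The helper name tpr_fpr_at_threshold and the four-sample toy data are assumptions made for illustration; they are not part of scikit-learn.

```python
import numpy as np

def tpr_fpr_at_threshold(y_true, y_score, threshold):
    """Return (TPR, FPR) when samples scoring >= threshold are labelled positive.

    Hypothetical helper for illustration only.
    """
    y_true = np.asarray(y_true)
    y_pred = (np.asarray(y_score) >= threshold).astype(int)

    tp = np.sum((y_pred == 1) & (y_true == 1))  # positives correctly flagged
    fn = np.sum((y_pred == 0) & (y_true == 1))  # positives missed
    fp = np.sum((y_pred == 1) & (y_true == 0))  # negatives wrongly flagged
    tn = np.sum((y_pred == 0) & (y_true == 0))  # negatives correctly rejected

    return tp / (tp + fn), fp / (fp + tn)

# Toy data (assumed): two negatives followed by two positives
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

for threshold in (0.2, 0.5, 0.9):
    tpr, fpr = tpr_fpr_at_threshold(y_true, y_score, threshold)
    print(f"threshold={threshold:.1f}  TPR={tpr:.2f}  FPR={fpr:.2f}")
```

Raising the threshold moves the operating point toward the bottom-left of the plot (fewer positives predicted), and lowering it moves the point toward the top-right.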
2. AUC (Area Under the Curve)
The Area Under the Curve (AUC) is a single number that summarizes the overall performance of the model. It represents the area under the ROC curve, ranging from 0 to 1:
- AUC = 1: Perfect classifier (ideal case).
- AUC = 0.5: Random classifier (no predictive ability).
- AUC < 0.5: Worse than random (inverted predictions).
A higher AUC value indicates that the model is better at distinguishing between the positive and negative classes.
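One way to read the AUC value is as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below checks this on toy data; pairwise_auc is a hypothetical helper written for this illustration, and the data is assumed.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def pairwise_auc(y_true, y_score):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5.

    Hypothetical helper for illustration; O(n_pos * n_neg) memory.
    """
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    diff = pos[:, None] - neg[None, :]  # every positive vs. every negative
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return wins / (len(pos) * len(neg))

y_true = [0, 0, 1, 1]

partly_right = [0.1, 0.4, 0.35, 0.8]  # one positive ranked below a negative
perfect = [0.1, 0.2, 0.8, 0.9]        # every positive above every negative
inverted = [0.9, 0.8, 0.2, 0.1]       # every positive below every negative

for name, scores in [("partly right", partly_right),
                     ("perfect", perfect),
                     ("inverted", inverted)]:
    print(f"{name:12s} pairwise={pairwise_auc(y_true, scores):.2f} "
          f"sklearn={roc_auc_score(y_true, scores):.2f}")
```

The pairwise count and roc_auc_score agree on each of these inputs, which is why AUC is often described as a ranking metric rather than a measure of calibrated probabilities.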
3. How to Interpret the ROC Curve and AUC
- Ideal ROC Curve: Approaches the top-left corner of the plot, indicating a high TPR and a low FPR.
- AUC close to 1: Indicates that the model is excellent at distinguishing between classes.
- AUC close to 0.5: Indicates that the model performs similarly to random guessing.
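The reference points above can be checked empirically. The following sketch uses synthetic labels and scores (all assumed for illustration): scores unrelated to the labels should land near 0.5, and negating an informative score should mirror its AUC around 0.5.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n_samples = 5000

# Synthetic binary labels (assumed data, roughly balanced)
y_true = rng.integers(0, 2, size=n_samples)

# Scores that carry no information about the labels
random_scores = rng.random(n_samples)
auc_random = roc_auc_score(y_true, random_scores)

# Scores that are noisy but correlated with the labels
informative_scores = y_true + rng.normal(0.0, 1.0, size=n_samples)
auc_informative = roc_auc_score(y_true, informative_scores)

# Negating the scores reverses the ranking, so the AUC becomes 1 - AUC
auc_flipped = roc_auc_score(y_true, -informative_scores)

print(f"Random scores:      {auc_random:.3f}")
print(f"Informative scores: {auc_informative:.3f}")
print(f"Flipped scores:     {auc_flipped:.3f}")
```

An AUC well below 0.5 therefore usually signals that labels or scores are reversed somewhere in the pipeline, not that the model has learned nothing.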
4. Implementing ROC Curve and AUC in Python
Scikit-learn provides easy-to-use functions for generating the ROC curve and calculating AUC. Below is an example of how to implement this.
Example: ROC Curve and AUC Calculation
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score
import matplotlib.pyplot as plt
# Generate a sample dataset for binary classification
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Create and train a Logistic Regression model
model = LogisticRegression()
model.fit(X_train, y_train)
# Get predicted probabilities for the test set
y_probs = model.predict_proba(X_test)[:, 1] # Get probabilities for the positive class
# Compute the ROC curve
fpr, tpr, thresholds = roc_curve(y_test, y_probs)
# Compute AUC score
auc_score = roc_auc_score(y_test, y_probs)
print(f"AUC Score: {auc_score:.4f}")
# Plot the ROC curve
plt.figure()
plt.plot(fpr, tpr, label=f"ROC curve (AUC = {auc_score:.2f})", color='blue')
plt.plot([0, 1], [0, 1], color='gray', linestyle='--') # Diagonal line for random performance
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic (ROC) Curve')
plt.legend(loc="lower right")
plt.show()
Explanation:
- We generate a synthetic binary classification dataset using make_classification.
- The data is split into training and testing sets.
- We train a Logistic Regression model.
- The predicted probabilities (predict_proba) are used to calculate the ROC curve and AUC score using roc_curve and roc_auc_score.
- The ROC curve is plotted, showing the trade-off between the True Positive Rate (Recall) and the False Positive Rate.
5. Key Functions in Scikit-learn
- roc_curve(y_true, y_score): Computes the ROC curve by returning the FPR, TPR, and thresholds.
- roc_auc_score(y_true, y_score): Computes the AUC score for the model based on the ROC curve.
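To see what these functions return, the sketch below runs them on a four-sample toy input (assumed data). It also uses sklearn.metrics.auc, which integrates any curve with the trapezoidal rule, to show that the area under the returned points matches roc_auc_score.

```python
import numpy as np
from sklearn.metrics import auc, roc_auc_score, roc_curve

# Toy data (assumed): two negatives followed by two positives
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])

fpr, tpr, thresholds = roc_curve(y_true, y_score)

# The three arrays are aligned: element i gives the (FPR, TPR) point reached
# when samples scoring >= thresholds[i] are labelled positive.
# The first threshold is a sentinel above every score (its exact value
# depends on the scikit-learn version), so the curve starts at (0, 0).
for f, t, th in zip(fpr, tpr, thresholds):
    print(f"threshold={th:>5.2f}  FPR={f:.2f}  TPR={t:.2f}")

# Area under the returned points via the trapezoidal rule
area_from_curve = auc(fpr, tpr)
area_direct = roc_auc_score(y_true, y_score)
print(f"auc(fpr, tpr)  = {area_from_curve:.2f}")
print(f"roc_auc_score  = {area_direct:.2f}")
```

In practice roc_auc_score is the shorter route when only the number is needed, while roc_curve is what you call when you want to plot the curve or pick an operating threshold.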
6. ROC Curve for Multiclass Classification
For multiclass classification, you can either:
- Compute the ROC curve for each class separately (One-vs-Rest approach).
- Use the macro or weighted average of the AUC scores across all classes.
Here’s an example for multiclass classification:
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import label_binarize
from sklearn.multiclass import OneVsRestClassifier
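# Generate a multiclass dataset
# n_informative is raised to 4 because make_classification requires
# n_classes * n_clusters_per_class <= 2 ** n_informative (the default of 2 fails for 3 classes)
X, y = make_classification(n_samples=1000, n_features=20, n_informative=4, n_classes=3, random_state=42)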
# Binarize the output labels for each class
y_bin = label_binarize(y, classes=[0, 1, 2])
# Split the dataset
X_train, X_test, y_train, y_test = train_test_split(X, y_bin, test_size=0.3, random_state=42)
# Use OneVsRestClassifier for multiclass classification
model = OneVsRestClassifier(LogisticRegression())
model.fit(X_train, y_train)
# Get predicted probabilities for each class
y_probs = model.predict_proba(X_test)
# Compute the macro-averaged One-vs-Rest AUC (the unweighted mean of the per-class AUC scores)
auc_score = roc_auc_score(y_test, y_probs, average="macro", multi_class="ovr")
print(f"Multiclass AUC Score: {auc_score:.4f}")
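The example above reports a single averaged number. If the per-class scores are also of interest, a sketch along the following lines may help. It assumes the same kind of synthetic dataset, fits a plain multiclass LogisticRegression instead of the OneVsRestClassifier wrapper, and raises max_iter as a precaution against convergence warnings; these choices are illustrative, not the only way to do it.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize

# Synthetic 3-class dataset (assumed parameters)
X, y = make_classification(n_samples=1000, n_features=20, n_informative=4,
                           n_classes=3, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
y_probs = model.predict_proba(X_test)  # shape (n_samples, 3), rows sum to 1

# One binary column per class: column k is 1 where the true label is k
y_test_bin = label_binarize(y_test, classes=[0, 1, 2])

# average=None returns one One-vs-Rest AUC per class instead of a mean
per_class_auc = roc_auc_score(y_test_bin, y_probs, average=None)
for class_index, score in enumerate(per_class_auc):
    print(f"Class {class_index} vs rest AUC: {score:.4f}")

# The macro average is the unweighted mean of the per-class values
macro_auc = roc_auc_score(y_test, y_probs, multi_class="ovr", average="macro")
print(f"Macro-averaged AUC: {macro_auc:.4f}")
```

Looking at the per-class values can reveal a class the model separates poorly even when the averaged score looks acceptable.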
7. Conclusion
The ROC curve and AUC score are powerful tools for evaluating binary (and multiclass) classification models. A good model has a ROC curve that bends toward the top-left corner and an AUC score close to 1, indicating strong discriminative power between classes. Because TPR and FPR are each computed within a single class, these metrics do not depend on the class ratio, which makes them convenient for comparing models on imbalanced datasets. When the positive class is very rare, however, a low FPR can still correspond to many false positives in absolute terms, so it is worth checking precision as well.