| Category | Model | Pros | Cons |
| --- | --- | --- | --- |
| Traditional Regression Models | Logistic Regression / Probit Regression | - Transparent: coefficients provide clear insight into the risk drivers<br>- Well-understood statistical properties; easy to validate and explain to stakeholders<br>- Widely accepted by regulators | - Assumes a linear relationship between predictors and the transformed default probability (log-odds or probit); limited ability to capture complex interactions<br>- Requires a sufficient number of default observations, which can be challenging for low-default portfolios<br>- May underperform machine learning models in predictive accuracy |
| Traditional Regression Models | Cox Proportional Hazards Model | - Naturally handles time-to-default dynamics<br>- Incorporates censored observations effectively<br>- Accommodates varying observation periods | - Less common in PD modeling<br>- The proportional hazards assumption may not hold<br>- Requires sufficient default events for stable estimation |
| Machine Learning Models | Random Forest | - Captures non-linearities and interactions effectively<br>- Robust to outliers; handles missing data well in many implementations<br>- Less prone to overfitting: averaging across trees reduces variance | - Black-box nature can challenge regulatory approval and internal validation<br>- Less stable: results may vary across runs due to bootstrapping unless the random seed is fixed |
| Machine Learning Models | Gradient Boosting | - High predictive power: often outperforms traditional models and random forests<br>- Captures non-linearities and interactions effectively<br>- Robust to outliers; handles missing data natively in common implementations | - Even harder to explain than random forests due to the additive tree structure<br>- High risk of overfitting, requiring careful tuning<br>- Needs relatively large datasets for optimal performance |
| Machine Learning Models | Neural Network | - High performance: with enough data, can outperform other models<br>- Highly flexible: learns complex interactions automatically<br>- Feature learning: deep architectures can extract latent features | - Extremely hard to explain to regulators and stakeholders<br>- Requires large amounts of data to train effectively and avoid overfitting<br>- Highly complex: requires significant expertise to design the network architecture (layers, nodes, activation functions) and tune the training process |
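As a minimal illustration of the transparency-versus-power trade-off in the table, the sketch below (an assumption for illustration: scikit-learn and a synthetic, imbalanced dataset standing in for a real default portfolio) fits a logistic regression benchmark and a gradient boosting challenger, then compares their discriminatory power via AUC.

```python
# Illustrative sketch only: synthetic data, not a real PD portfolio.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic portfolio: 5,000 obligors, 10 risk drivers, ~5% default rate.
X, y = make_classification(
    n_samples=5000, n_features=10, n_informative=6,
    weights=[0.95], random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42,
)

# Transparent benchmark: coefficients map directly to risk drivers.
logit = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Flexible challenger: captures non-linearities, but harder to explain.
gbm = GradientBoostingClassifier(random_state=42).fit(X_train, y_train)

for name, model in [("logistic", logit), ("gradient boosting", gbm)]:
    pd_hat = model.predict_proba(X_test)[:, 1]  # estimated PDs
    print(f"{name}: AUC = {roc_auc_score(y_test, pd_hat):.3f}")
```

Fixing `random_state` addresses the run-to-run instability noted for the tree-based models; in practice the challenger's AUC gain would be weighed against the validation and regulatory-approval burden of a black-box model.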