Modeling and Analytics Strategy

Balanced Approach

  • Balance complexity and simplicity
    • Simplicity serves the user experience and operational efficiency
  • Balance data and expert judgment
  • Overrides are not necessarily an indicator that the modeling and analytics have failed; they are an indicator of the complexity of the subject
  • At critical decision points, provide 2–3 alternatives and make a recommendation
  • Meet both the business requirements and the regulatory requirements
  • Research: understand the complexities
  • Design: make them simple
  • Quantitative – primary support: data; secondary support: experts
  • Qualitative – primary support: experts; secondary support: data
  • Override: account for the complexities

Modeling and Analytics Partnership

Communication and Documentation

  • Partners: modelers, business and credit users, and model validators
  • Use the model documentation template as a guideline
  • Meet the operational requirements

PD Models

Target Variable Approaches

Binary default event (default / non-default)

  Description: models the "ground truth" – the objective, observable event of default. The model predicts a PD, which is then mapped to an ORR based on the master scale (see the mapping sketch below).

  Pros:
    • Conceptual soundness and objectivity: the model is based on actual default events rather than subjective rating assignments, creating a direct, unbroken chain of logic from borrower characteristics to default risk
    • Binary PD models are standard in regulatory frameworks (e.g., Basel)
    • Simpler modeling approach

  Cons:
    • Significant information loss in the target variable
    • Calibration challenges: the PD model must be well calibrated to avoid distorted ratings
    • Significant challenges for low-default portfolios
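
The PD-to-ORR mapping step is mechanical once the master scale is fixed. Below is a minimal Python sketch, assuming a hypothetical seven-grade master scale defined by upper PD bounds; the grade labels and band boundaries are illustrative only, not an actual master scale.

```python
import bisect

# Hypothetical master scale: upper PD bound per grade (illustrative values only).
MASTER_SCALE = [
    ("ORR 1", 0.0005),
    ("ORR 2", 0.0015),
    ("ORR 3", 0.0050),
    ("ORR 4", 0.0150),
    ("ORR 5", 0.0500),
    ("ORR 6", 0.1500),
    ("ORR 7", 1.0000),
]
UPPER_BOUNDS = [ub for _, ub in MASTER_SCALE]

def pd_to_orr(pd_estimate: float) -> str:
    """Assign the first grade whose upper PD bound covers the model PD."""
    idx = min(bisect.bisect_left(UPPER_BOUNDS, pd_estimate), len(MASTER_SCALE) - 1)
    return MASTER_SCALE[idx][0]

print(pd_to_orr(0.0032))  # -> ORR 3
print(pd_to_orr(0.0800))  # -> ORR 6
```
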
Obligor risk rating (ORR)

  Description: models the "human process" – replicating the expert judgment that goes into assigning a rating. The model predicts an ORR, which is then mapped to a PD based on the master scale (see the lookup sketch below).

  Pros:
    • Richer information in the target variable
    • Captures embedded expert judgment: the model can learn the experts' collective wisdom and align with internal practices
    • Historical consistency: maintains better continuity with existing rating methodologies
    • Works better for low-default portfolios

  Cons:
    • Subjectivity and circularity: the model's validity depends entirely on the historical quality and consistency of the very rating process it is trying to replicate
    • May face higher regulatory scrutiny due to its subjective foundation
    • Increased model complexity
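
In this direction the model output is a rating grade (typically from an ordinal or multiclass classifier), and the PD follows from a master-scale lookup. A minimal sketch, assuming hypothetical grade-level anchor PDs:

```python
import pandas as pd

# Hypothetical master-scale anchor PD per grade (illustrative values only).
GRADE_PD = {
    "ORR 1": 0.0003, "ORR 2": 0.0010, "ORR 3": 0.0030, "ORR 4": 0.0100,
    "ORR 5": 0.0300, "ORR 6": 0.1000, "ORR 7": 0.2500,
}

# Predicted grades would normally come from the rating model;
# they are hard-coded here for illustration.
predicted_orr = pd.Series(["ORR 3", "ORR 5", "ORR 2"])
assigned_pd = predicted_orr.map(GRADE_PD)
print(assigned_pd.tolist())  # -> [0.003, 0.03, 0.001]
```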

Modeling Approaches

Traditional Regression Models

Logistic Regression / Probit Regression
  Pros:
    • Transparent: coefficients provide clear insight into the risk drivers
    • Well-understood statistical properties; easy to validate and explain to stakeholders
    • Widely accepted by regulators
  Cons:
    • Assumes a linear relationship between the risk drivers and the log-odds (or probit) of default; limited ability to capture complex interactions
    • Requires a sufficient number of default observations, which can be challenging for low-default portfolios
    • May underperform compared to machine learning models
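
As a concrete illustration, the sketch below fits a logistic-regression PD model on synthetic data with statsmodels; the risk drivers, coefficients, and roughly 5% base default rate are assumptions made up for the example.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 2000
X = pd.DataFrame({
    "leverage": rng.normal(size=n),           # hypothetical risk driver
    "interest_coverage": rng.normal(size=n),  # hypothetical risk driver
})
# Synthetic default flag with roughly a 5% base rate.
log_odds = -3.0 + 0.9 * X["leverage"] - 0.6 * X["interest_coverage"]
default_flag = rng.binomial(1, 1.0 / (1.0 + np.exp(-log_odds)))

model = sm.Logit(default_flag, sm.add_constant(X)).fit(disp=False)
print(model.summary())                            # coefficients are directly interpretable
pd_estimates = model.predict(sm.add_constant(X))  # in-sample PD estimates
```
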
Cox Proportional Hazards Model
  Pros:
    • Naturally handles time-to-default dynamics
    • Incorporates censored observations effectively
    • Handles varying observation periods effectively
  Cons:
    • Less common in PD modeling
    • The proportional hazards assumption may not hold
    • Requires sufficient default events for stable estimation
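
A minimal time-to-default sketch using the lifelines implementation of the Cox proportional-hazards model; the synthetic durations, the 36-month censoring window, and the risk drivers are illustrative assumptions.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(7)
n = 1000
df = pd.DataFrame({
    "leverage": rng.normal(size=n),
    "coverage": rng.normal(size=n),
})
# Synthetic time to default (months), censored at a 36-month observation window.
latent_time = rng.exponential(scale=60, size=n) * np.exp(-0.5 * df["leverage"] + 0.3 * df["coverage"])
df["duration"] = np.minimum(latent_time, 36)
df["default_observed"] = (latent_time <= 36).astype(int)

cph = CoxPHFitter()
cph.fit(df, duration_col="duration", event_col="default_observed")
cph.print_summary()  # hazard ratios per risk driver
```
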
Machine Learning Models

Random Forest
  Pros:
    • Captures non-linearity and interactions effectively
    • Handles outliers and missing data well
    • Less prone to overfitting: averaging across trees reduces variance
  Cons:
    • Black-box nature can challenge regulatory approval and internal validation
    • Less stable: model results may vary across runs due to bootstrapping
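
A minimal random-forest PD sketch with scikit-learn; the synthetic data, the roughly 5% default rate, and the hyperparameters are illustrative assumptions, and feature importances are printed as one common, though limited, transparency aid.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced data standing in for an obligor-level default dataset.
X, y = make_classification(n_samples=5000, n_features=10, weights=[0.95], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

rf = RandomForestClassifier(n_estimators=500, max_depth=6, random_state=0)
rf.fit(X_train, y_train)
pd_estimates = rf.predict_proba(X_test)[:, 1]
print("AUC:", roc_auc_score(y_test, pd_estimates))
print("Feature importances:", rf.feature_importances_.round(3))
```
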
Gradient Boosting
  Pros:
    • High predictive power: often outperforms traditional models and random forests
    • Captures non-linearity and interactions effectively
    • Handles outliers and missing data well
  Cons:
    • Even harder to explain than random forests due to the additive tree structure
    • High risk of overfitting, requiring careful tuning
    • Requires large datasets for optimal performance
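
A minimal gradient-boosting sketch using scikit-learn's HistGradientBoostingClassifier; the data and hyperparameters are illustrative, with early stopping used as one guard against the overfitting risk noted above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_features=10, weights=[0.95], random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# Shallow trees, a small learning rate, and early stopping on an internal
# validation split to limit overfitting.
gb = HistGradientBoostingClassifier(learning_rate=0.05, max_depth=3,
                                    early_stopping=True, random_state=1)
gb.fit(X_train, y_train)
pd_estimates = gb.predict_proba(X_test)[:, 1]
print("AUC:", roc_auc_score(y_test, pd_estimates))
```
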
Neural Network
  Pros:
    • High performance: with enough data, it can outperform other models
    • Highly flexible: learns complex interactions automatically
    • Feature learning: deep learning can extract latent features
  Cons:
    • Extremely hard to explain to regulators and stakeholders
    • Requires large amounts of data to train effectively and avoid overfitting
    • Highly complex: requires significant expertise to design the network architecture (layers, nodes, activation functions) and tune the training process
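
A minimal feed-forward neural-network sketch using scikit-learn's MLPClassifier; the two-hidden-layer architecture, the scaling step, and the synthetic data are illustrative assumptions, not a recommended design.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=5000, n_features=10, weights=[0.95], random_state=2)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=2)

# Feature scaling matters for gradient-based training of the network.
nn = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500,
                  early_stopping=True, random_state=2),
)
nn.fit(X_train, y_train)
pd_estimates = nn.predict_proba(X_test)[:, 1]
print("AUC:", roc_auc_score(y_test, pd_estimates))
```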