Modeling and Analytics Strategy

Balanced Approach

  • Balance complexity and simplicity
    • Simplicity serves the user experience and operational efficiency
  • Balance the data and the experts
  • Overrides are not necessarily an indicator that the modeling and analytics have failed; they are an indicator of the complexity of the subject.
  • At critical decision points, provide 2–3 alternatives and make a recommendation
  • Meet both business requirements and regulatory requirements
The workflow pairs each stage with a guiding principle:

  • Research: understand the complexities
  • Design: make them simple
  • Quantitative model: primary support from data, secondary support from experts
  • Qualitative approach: primary support from experts, secondary support from data
  • Override: account for the complexities

Modeling and Analytics Partnership

Communication and Documentation

Communication and documentation connect three parties: modelers, business & credit users, and model validators.
  • Use the model documentation template as a guideline
  • Meet the operational requirements

PD Models

Target Variable Approaches

Binary event (default, non-default)

Modeling the "ground truth": the objective, observable event of default. The model predicts a PD, which is then mapped to an ORR based on the master scale (a minimal mapping sketch follows this section).

Pros:
  • Conceptual soundness and objectivity: the model is based on actual default events rather than subjective rating assignments, creating a direct, unbroken chain of logic from borrower characteristics to default risk
  • Binary PD models are standard in regulatory frameworks (e.g., Basel)
  • Simpler modeling approach
Cons:
  • Significant information loss in the target variable
  • Calibration challenges: the PD model must be well calibrated to avoid distorted ratings
  • Significant challenges for low-default portfolios

Obligor Risk Ratings (ORR)

Modeling the "human process": replicating the expert judgment that goes into assigning a rating. The model predicts an ORR, which is then mapped to a PD based on the master scale.

Pros:
  • Richer information in the target variable
  • Captures embedded expert judgment: the model can learn the experts' collective wisdom and align with internal practices
  • Historical consistency: maintains better continuity with existing rating methodologies
  • Works better for low-default portfolios
Cons:
  • Subjectivity and circularity: the model's validity depends entirely on the historical quality and consistency of the very rating process it is trying to replicate
  • May attract higher regulatory scrutiny due to its subjective foundation
  • Increased model complexity
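To make the PD-to-ORR direction concrete, here is a minimal sketch of a master-scale lookup. The grade boundaries below are invented for illustration only; an actual master scale is institution-specific.

```python
import bisect

# Hypothetical master scale: upper PD bound for each ORR grade.
# These cut-offs are invented; real master scales are bank-specific.
MASTER_SCALE = [
    (0.0005, "ORR 1"),   # PD <= 0.05%
    (0.0025, "ORR 2"),   # PD <= 0.25%
    (0.0100, "ORR 3"),   # PD <= 1.00%
    (0.0500, "ORR 4"),   # PD <= 5.00%
    (1.0000, "ORR 5"),   # PD <= 100%
]

def pd_to_orr(pd_estimate: float) -> str:
    """Map a model PD to a rating grade via the master scale."""
    bounds = [upper for upper, _ in MASTER_SCALE]
    idx = bisect.bisect_left(bounds, pd_estimate)
    return MASTER_SCALE[min(idx, len(MASTER_SCALE) - 1)][1]

print(pd_to_orr(0.003))  # -> "ORR 3"
```

The ORR-to-PD direction simply reverses the lookup, assigning each grade a representative PD (e.g., the midpoint or long-run average of its bucket).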

Modeling Approaches

Traditional regression models

Logistic regression / probit regression (a minimal sketch follows this section)
Pros:
  • Transparent: coefficients provide clear insight into the risk drivers
  • Well-understood statistical properties; easy to validate and explain to stakeholders
  • Widely accepted by regulators
Cons:
  • Assumes a linear relationship between the predictors and the log-odds of default; limited ability to capture complex interactions
  • Requires a sufficient number of default observations, which can be challenging for low-default portfolios
  • May underperform compared to machine learning models

Cox proportional hazards model
Pros:
  • Naturally handles time-to-default dynamics
  • Incorporates censored observations effectively
  • Handles varying observation periods effectively
Cons:
  • Less common in PD modeling
  • The proportional hazards assumption may not hold
  • Requires sufficient default events for stable estimation

Machine learning models

Random forest
Pros:
  • Captures non-linearities and interactions effectively
  • Handles outliers and missing data well
  • Less prone to overfitting: averaging across trees reduces variance
Cons:
  • Black-box nature can challenge regulatory approval and internal validation
  • Less stable: results may vary across runs due to bootstrapping

Gradient boosting
Pros:
  • High predictive power: often outperforms traditional models and random forests
  • Captures non-linearities and interactions effectively
  • Handles outliers and missing data well
Cons:
  • Even harder to explain than random forests due to the additive tree structure
  • High risk of overfitting; requires careful tuning
  • Requires large datasets for optimal performance

Neural network
Pros:
  • High performance: with enough data, can outperform other model classes
  • Highly flexible: learns complex interactions automatically
  • Feature learning: deep architectures can extract latent features
Cons:
  • Extremely hard to explain to regulators and stakeholders
  • Requires large amounts of data to train effectively and avoid overfitting
  • Highly complex: requires significant expertise to design the network architecture (layers, nodes, activation functions) and tune the training process
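As a concrete illustration of the traditional approach, here is a minimal logistic-regression PD sketch using scikit-learn. The features and default flag are synthetic placeholders, not a recommended model specification.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Toy data standing in for borrower financial ratios (illustrative only);
# columns could be, e.g., leverage, interest coverage, current ratio.
X = rng.normal(size=(5_000, 3))
# Synthetic binary default flag, driven by the first feature so that the
# overall default rate is low, as in a real portfolio.
y = (rng.random(5_000) < 1 / (1 + np.exp(-(X[:, 0] - 2.5)))).astype(int)

model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X, y)

# Predicted PDs for new obligors; these would then be mapped to ORRs.
pds = model.predict_proba(X[:5])[:, 1]
print(pds.round(4))
```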

PD Modeling Strategies When Internal Data Is Limited

Banks sometimes need to estimate the probability of default (PD) for a loan portfolio without sufficient internal default history. Common situations include offering a new product, having very few historical defaults in a low-default portfolio, lacking centralized financial statements or customer data, or facing selection bias because the bank only recorded data on accepted customers.

Scenario 1 – New product or no internal defaults

When the bank has no internal default data because the product or customer segment is new, modeling must rely on external information or expert judgment.

  • Use external or comparable data: Basel II/III guidelines state that when banks lack sufficient internal default history, they may map their internal rating grades to external data (rating agencies or industry databases) and use statistical default models [1]. Credit bureaus maintain long time series of defaults; CGAP's credit-scoring guide notes that lenders without performance data can "use the bureau's score as an input" for underwriting and that bureau scores can be customized [2].
  • Structural or "market-implied" models: when financial statements exist but historical defaults are absent, structural models such as Merton's treat company equity as an option on the firm's assets. The Merton model calculates PD as the probability that the firm's asset value falls below its debt at maturity [3]; default probability increases with leverage and asset volatility [4]. (A numerical sketch follows this list.)
  • Macroeconomic and rating-group models: the Federal Reserve's corporate-loan stress-test model assumes that PDs depend on borrower rating, industry, and macroeconomic variables (e.g., unemployment). Initial PDs are derived from long-run average PDs for the rating group, and PDs evolve using relationships between changes in PD and changes in macroeconomic variables [5].
  • Use external vendor or bureau models: industry vendors (e.g., Moody's KMV/EquityEdge or RiskCalc) produce PD estimates using option-theoretic or financial-statement models. Regulatory guidance notes that lenders may rely on such models but must justify that the target population is similar to the vendor's development population [6].
  • Expert judgment and qualitative scorecards: when no comparable data exist, lenders may start with judgmental or expert scorecards. CGAP explains that organizations launching a new product should develop an expert scorecard that assigns points based on borrower characteristics and use a controlled pilot to gather repayment data [7].
  • Pilot to generate repayment data: for products with no performance history, CGAP recommends a controlled lending pilot in which small loans are disbursed to a representative sample of customers; after a few loan cycles, the bank can develop a statistical model using the newly gathered default data [8].
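To illustrate the structural approach, here is a minimal Merton-style PD sketch. It assumes the asset value, drift, and volatility have already been estimated (in practice they are backed out from equity prices), and the inputs are illustrative numbers, not calibrated values.

```python
from math import log, sqrt
from scipy.stats import norm

def merton_pd(asset_value, debt_face, mu, sigma, horizon=1.0):
    """PD under Merton (1974): probability that assets fall below debt at T.

    asset_value: current market value of the firm's assets V
    debt_face:   face value of debt due at the horizon D
    mu:          expected (drift) return on assets
    sigma:       asset volatility
    """
    # Distance to default, then PD = N(-DD)
    dd = (log(asset_value / debt_face) + (mu - 0.5 * sigma**2) * horizon) / (
        sigma * sqrt(horizon)
    )
    return norm.cdf(-dd)

# Illustrative firm: ~71% leverage, 10% asset volatility, 5% drift
print(round(merton_pd(asset_value=140.0, debt_face=100.0, mu=0.05, sigma=0.10), 6))
```

As the text notes, raising leverage (debt_face relative to asset_value) or sigma in this sketch pushes the PD up.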

Scenario 2 – Very few defaults (low-default portfolio)

In portfolios with only a few defaults (e.g., sovereigns, blue-chip corporates, or specialized lending), traditional logistic regression can be unreliable. Regulators classify these as low-default portfolios (LDPs) and require conservative estimation.

  • Conservative confidence-bound methods (Pluto-Tasche): for portfolios with zero or very few defaults, Pluto & Tasche propose estimating PDs from upper confidence bounds. The method calculates the most prudent PD estimate for each rating grade by solving for the upper bound of the binomial distribution, ensuring that PD estimates increase monotonically across grades [9]. The approach is required by some regulators and is conservative because it produces high PDs when defaults are rare [10]. (A numerical sketch follows this list.)
  • Extended default definition: Finalyse (2023) notes that LDP modelers often redefine the outcome to increase the number of "bad" cases, e.g., using 60-day or 30-day past-due rather than 90-day default, or extending the observation window [11]. This creates more events for modeling while keeping calibration based on the regulatory default definition.
  • Sampling and oversampling techniques: low-default portfolios suffer from class imbalance. Finalyse explains that oversampling or undersampling can be used to increase the ratio of defaults to non-defaults; the synthetic minority oversampling technique (SMOTE) generates synthetic default observations by interpolating between nearest neighbors [12]. Studies comparing techniques for LDPs find that limited logistic regression (adding a parameter to bound the probability) and Bayesian logistic regression outperform classical logistic regression and benefit from oversampling [13]. (See the SMOTE sketch after this list.)
  • Bayesian methods: Bayesian statistics combine prior information with sparse data. In the low-default context, Bayesian approaches treat PD parameters as random variables and incorporate expert opinion or information from similar portfolios. The "Bayesian logistic regression" thesis explains that using priors derived from an old data set and combining them with limited new data improves predictive performance; the influence of the prior decreases as more data accumulate [14]. Finalyse highlights that Bayesian methods eliminate the need for selecting confidence levels and allow expert information to be embedded through informative priors [15].
  • Quasi-moment matching (QMM): QMM calibrates scores to PDs using a relationship between the empirical cumulative distribution of scores for survivors and the mean portfolio PD; it solves a two-parameter equation and can be approximated by logistic regression [16].
  • Structural and simulation models: when there are no observed defaults, simulation can be used. Rhino Risk's paper on modeling with little or no data notes that structural models simulate key risk drivers (e.g., asset value, borrower income) and count the number of simulated defaults to estimate PDs; this approach is valid even with no defaults but depends on how well the model represents reality [17].
  • Markov chains / migration matrices: another simple approach is to construct migration matrices (Markov chains) showing transitions between risk states and through to default. Rhino Risk describes using such matrices to estimate eventual default rates when default data are scarce [18].
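Below is a minimal sketch of the Pluto-Tasche "most prudent estimation" idea, assuming grades are ordered from best to worst and using the standard device of pooling each grade with all riskier grades so the estimates are monotone. The obligor counts and confidence level are illustrative.

```python
from scipy.optimize import brentq
from scipy.stats import binom

def upper_bound_pd(n_obligors, n_defaults, gamma=0.90):
    """Most prudent PD: the largest p with P(X <= d | n, p) >= 1 - gamma.

    For zero defaults this reduces to p = 1 - (1 - gamma)**(1 / n).
    """
    if n_defaults >= n_obligors:
        return 1.0
    f = lambda p: binom.cdf(n_defaults, n_obligors, p) - (1.0 - gamma)
    return brentq(f, 1e-12, 1.0 - 1e-12)

# Grades ordered best -> worst: (number of obligors, observed defaults)
grades = [(500, 0), (300, 0), (150, 1)]

# For grade i, pool grade i with all riskier grades so that the PD
# estimates are monotonically non-decreasing across grades.
for i, _ in enumerate(grades):
    n = sum(g[0] for g in grades[i:])
    d = sum(g[1] for g in grades[i:])
    print(f"grade {i + 1}: PD <= {upper_bound_pd(n, d):.4%}")
```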
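And here is a minimal sketch of rebalancing a low-default sample with SMOTE via the imbalanced-learn package; the data is synthetic and the ~1% default rate is chosen only to mimic an LDP.

```python
import numpy as np
from collections import Counter
from imblearn.over_sampling import SMOTE

rng = np.random.default_rng(0)

# Synthetic, heavily imbalanced sample: roughly 1% defaults.
X = rng.normal(size=(2_000, 4))
y = (rng.random(2_000) < 0.01).astype(int)

# SMOTE interpolates between nearest-neighbour defaults to create
# synthetic default observations until the classes are balanced.
X_res, y_res = SMOTE(random_state=0, k_neighbors=3).fit_resample(X, y)
print(Counter(y), "->", Counter(y_res))
```

Note that any rebalanced model still needs to be recalibrated to the portfolio's true (regulatory) default rate before use.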

Scenario 3 – Lack of centralized financial statements or "thin-file" borrowers

When customers have little or no formal financial history, banks must use alternative data and creative data-collection strategies.

  • Use alternative financial data (rent, utilities, telco, bank transactions): the Kansas City Fed explains that fintechs and credit bureaus are collecting alternative data, such as bank account balances; rent, utility, and subscription payments; or income from gig-economy platforms, to supplement traditional credit reports [19]. These data are often more readily available for thin-file customers and can improve credit access. (A small feature-engineering sketch follows this list.)
  • Leverage non-financial and digital-footprint data: alternative data also include non-financial information, such as educational and employment history, public records, and a customer's digital footprint [20]. An NBER study on digital footprints finds that simple website metadata (device type, operating system, email provider, time of access, etc.) have discriminatory power equal to or greater than bureau scores; digital footprints provide similar predictive power for previously unscorable customers [21].
  • Pilot to collect transaction data and build behavioral scorecards: as noted earlier, launching a controlled pilot allows a lender to generate repayment data. CGAP recommends designing short-term, low-amount loans and collecting demographic and behavioral information during the pilot; an expert scorecard can be used to select customers, and the resulting data can feed a statistical model [7].
  • Mapping to external ratings and industry ratios: for corporate borrowers without centralized financial statements, banks can use industry averages or regional statistics. Basel guidelines permit mapping internal grades to external ratings or industry default rates [1]. Merton-style models can estimate PDs using market-based asset values even if financial statements are limited, provided the firm has traded equity [3].
  • Use vendor-provided alternative-data scores: credit bureaus and fintechs offer scores that incorporate alternative data (e.g., FICO's UltraFICO, Experian Boost). The Kansas City Fed notes that some products allow consumers to contribute alternative data, such as bank account cash flows or rent payments, to enhance their credit scores [22]. Banks can incorporate these scores into underwriting.
  • Data privacy and bias considerations: the World Bank cautions that while alternative data can increase predictive accuracy by 5–20%, they raise concerns about privacy and discrimination [23]. Lenders should ensure compliance with data-protection regulations and avoid using sensitive variables.
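As a small illustration of how alternative financial data could be turned into model inputs, the sketch below derives simple behavioral features (payment count, average rent, payment regularity) from a transaction table. The schema and values are entirely invented.

```python
import pandas as pd

# Illustrative transaction history for two thin-file applicants.
tx = pd.DataFrame({
    "applicant_id": [1, 1, 1, 2, 2],
    "date": pd.to_datetime(["2024-01-05", "2024-02-05", "2024-03-06",
                            "2024-01-20", "2024-03-25"]),
    "category": ["rent", "rent", "rent", "rent", "rent"],
    "amount": [1200, 1200, 1200, 950, 950],
})

# Behavioral features: count, level, and regularity of rent payments
# (lower gap_std_days = more regular payer).
features = (
    tx[tx["category"] == "rent"]
    .groupby("applicant_id")
    .agg(rent_payments=("amount", "size"),
         avg_rent=("amount", "mean"),
         gap_std_days=("date", lambda d: d.sort_values().diff().dt.days.std()))
)
print(features)
```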

Scenario 4 – Selection bias because data exist only for approved borrowers

Credit-risk models built only on accepted loan applications can be biased because they exclude information on applicants who were rejected. This is common when the bank has historical repayment data for approved customers but not for rejected ones.

  • Reject inference techniques: SAS documentation describes several reject inference methods to infer the default behavior of rejected applicants. Techniques include parceling, nearest neighbor, bureau performance, fuzzy augmentation, and simple augmentation [24]. These methods assign hypothetical default outcomes to rejected applicants to reduce bias.
  • Heuristic methods (parceling, augmentation): parceling assigns rejected applicants to good and bad classes based on their score. Simple augmentation assigns all rejected applicants to the bad class. Fuzzy augmentation assigns partial good and bad weights to each rejected applicant. SAS notes that fuzzy augmentation is more sophisticated and less biased than the simpler methods [25]. (A minimal sketch follows this list.)
  • Statistical methods (maximum likelihood, Heckman correction): the Heckman two-step correction models the selection process (accept/reject) and then the outcome (default/non-default). SAS states that this method is theoretically sound but complex and sensitive to model specification [26]. Maximum likelihood estimation jointly estimates the acceptance and default processes.
  • Bureau-based reject inference: if the bank obtained credit bureau scores for all applicants (including rejects), it can use the bureau's performance data on those rejects. SAS explains that this method is effective if the bureau has performance data for the rejects from other lenders [27].
  • Use machine learning and semi-supervised learning: machine learning techniques such as semi-supervised learning or positive-unlabeled (PU) learning can be applied to reject inference. These methods treat rejected applicants as unlabeled and attempt to infer their true class. A study on credit scoring with reject inference found that semi-supervised methods improved model accuracy and reduced bias [28].
  • Ongoing data collection and policy changes: to avoid future selection bias, banks can occasionally approve a random sample of applicants who would normally be rejected. This strategy, called "through-the-door sampling" or "inference by experiment," provides performance data for all score ranges but is costly because it increases credit risk [29].
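Here is a minimal sketch of fuzzy augmentation, assuming an accepts-only model is used to score the rejects and each reject then enters the training set twice, once as a weighted good and once as a weighted bad, before refitting. All data is synthetic.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Synthetic accepted (labeled) and rejected (unlabeled) applicants.
X_acc = rng.normal(size=(3_000, 3))
y_acc = (rng.random(3_000) < 1 / (1 + np.exp(-(X_acc[:, 0] - 2)))).astype(int)
X_rej = rng.normal(loc=0.5, size=(1_000, 3))  # rejects skew riskier

# Step 1: fit on accepts only, then score the rejects.
base = LogisticRegression().fit(X_acc, y_acc)
p_bad = base.predict_proba(X_rej)[:, 1]

# Step 2: fuzzy augmentation - each reject appears twice, once as good
# with weight (1 - p_bad) and once as bad with weight p_bad.
X_aug = np.vstack([X_acc, X_rej, X_rej])
y_aug = np.concatenate([y_acc, np.zeros(len(X_rej)), np.ones(len(X_rej))])
w_aug = np.concatenate([np.ones(len(X_acc)), 1 - p_bad, p_bad])

# Refit on the augmented, weighted sample to reduce selection bias.
final = LogisticRegression().fit(X_aug, y_aug, sample_weight=w_aug)
print(final.coef_.round(3))
```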

References

  1. Basel Committee on Banking Supervision. (2006). International Convergence of Capital Measurement and Capital Standards: A Revised Framework.
  2. CGAP. (2012). A Guide to Credit Scoring.
  3. Merton, R. C. (1974). On the Pricing of Corporate Debt: The Risk Structure of Interest Rates. Journal of Finance.
  4. Bharath, S. T., & Shumway, T. (2008). Forecasting Default with the Merton Distance to Default Model. Review of Financial Studies.
  5. Federal Reserve. (2017). Dodd-Frank Act Stress Test 2017: Supervisory Stress Test Methodology and Results.
  6. OCC. (2011). Supervisory Guidance on Model Risk Management.
  7. CGAP. (2012). A Guide to Credit Scoring.
  8. CGAP. (2012). A Guide to Credit Scoring.
  9. Pluto, K., & Tasche, D. (2005). Thinking Positively about Low-Default Portfolios. Risk.
  10. BCBS. (2005). Studies on the Validation of Internal Rating Systems.
  11. Finalyse. (2023). Low Default Portfolio Modelling.
  12. Bastos, J. A. (2010). Forecasting Bank Loans Loss-Given-Default. Journal of Banking & Finance.
  13. Korobilis, D. (2013). Bayesian Methods in Empirical Finance.
  14. Finalyse. (2023). Low Default Portfolio Modelling.
  15. Tasche, D. (2013). The Art of Probability-of-Default Curve Calibration. Journal of Credit Risk.
  16. Rhino Risk. (2020). Modelling with Little or No Data.
  17. Federal Reserve Bank of Kansas City. (2019). Alternative Data and the Digital Transformation of Lending.
  18. Jagtiani, J., & Lemieux, C. (2018). The Roles of Alternative Data and Machine Learning in Fintech Lending. Federal Reserve Bank of Philadelphia.
  19. Berg, T., Burg, V., Gombović, A., & Puri, M. (2020). On the Rise of FinTechs: Credit Scoring Using Digital Footprints. The Review of Financial Studies.
  20. Federal Reserve Bank of Kansas City. (2019). Alternative Data and the Digital Transformation of Lending.
  21. World Bank. (2019). Alternative Data Transforming SME Finance.
  22. SAS. (2018). Reject Inference in Credit Scoring.
  23. Crook, J. N., & Banasik, J. (2004). Does Reject Inference Really Improve the Performance of Application Scoring Models? Journal of Banking & Finance.
  24. Hand, D. J., & Henley, W. E. (1993). Statistical Classification Methods in Consumer Credit Scoring: A Review. Journal of the Royal Statistical Society.