Modeling and Analytics Strategy

Balanced Approach

  • Balance complexity and simplicity
    • Simplicity serves the user experience and operational efficiency
  • Balance the data and the experts
  • Overrides are not necessarily an indicator that the modeling and analytics have failed; they are an indicator of the complexity of the subject.
  • At critical decision points, provide 2–3 alternatives and make a recommendation
  • Meet both business requirements and regulatory requirements
The workflow pairs each stage with a guiding principle:

  • Research: understand the complexities
  • Design: make them simple
  • Quantitative model: primary support from data, secondary support from experts
  • Qualitative approach: primary support from experts, secondary support from data
  • Override: account for the complexities

Modeling and Analytics Partnership

Communication and Documentation

Communication and documentation connect three parties: modelers, business & credit users, and model validators.
  • Use the model documentation template as a guideline
  • Meet the operational requirements

PD Models

Target Variable Approaches

Binary event (default, non-default)

Modeling the "ground truth": the objective, observable event of default. The model predicts a PD, which is then mapped to an ORR based on the master scale (a minimal mapping sketch follows this section).

Pros:
  • Conceptual soundness and objectivity: the model is based on actual default events rather than subjective rating assignments, creating a direct, unbroken chain of logic from borrower characteristics to default risk
  • Binary PD models are standard in regulatory frameworks (e.g., Basel)
  • Simpler modeling approach
Cons:
  • Significant information loss in the target variable
  • Calibration challenges: the PD model must be well calibrated to avoid distorted ratings
  • Significant challenges for low-default portfolios

Obligor Risk Ratings (ORR)

Modeling the "human process": replicating the expert judgment that goes into assigning a rating. The model predicts an ORR, which is then mapped to a PD based on the master scale.

Pros:
  • Richer information in the target variable
  • Captures embedded expert judgment: the model can learn the experts' collective wisdom and align with internal practices
  • Historical consistency: maintains better continuity with existing rating methodologies
  • Works better for low-default portfolios
Cons:
  • Subjectivity and circularity: the model's validity depends entirely on the historical quality and consistency of the very rating process it is trying to replicate
  • May attract higher regulatory scrutiny due to its subjective foundation
  • Increased model complexity
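To make the PD-to-ORR direction concrete, here is a minimal sketch of a master-scale lookup. The grade boundaries below are invented for illustration only; an actual master scale is institution-specific.

```python
import bisect

# Hypothetical master scale: upper PD bound for each ORR grade.
# These cut-offs are invented; real master scales are bank-specific.
MASTER_SCALE = [
    (0.0005, "ORR 1"),   # PD <= 0.05%
    (0.0025, "ORR 2"),   # PD <= 0.25%
    (0.0100, "ORR 3"),   # PD <= 1.00%
    (0.0500, "ORR 4"),   # PD <= 5.00%
    (1.0000, "ORR 5"),   # PD <= 100%
]

def pd_to_orr(pd_estimate: float) -> str:
    """Map a model PD to a rating grade via the master scale."""
    bounds = [upper for upper, _ in MASTER_SCALE]
    idx = bisect.bisect_left(bounds, pd_estimate)
    return MASTER_SCALE[min(idx, len(MASTER_SCALE) - 1)][1]

print(pd_to_orr(0.003))  # -> "ORR 3"
```

The ORR-to-PD direction simply reverses the lookup, assigning each grade a representative PD (e.g., the midpoint or long-run average of its bucket).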

Modeling Approaches

Traditional regression models

Logistic regression / probit regression (a minimal sketch follows this section)
Pros:
  • Transparent: coefficients provide clear insight into the risk drivers
  • Well-understood statistical properties; easy to validate and explain to stakeholders
  • Widely accepted by regulators
Cons:
  • Assumes a linear relationship between the predictors and the log-odds of default; limited ability to capture complex interactions
  • Requires a sufficient number of default observations, which can be challenging for low-default portfolios
  • May underperform compared to machine learning models

Cox proportional hazards model
Pros:
  • Naturally handles time-to-default dynamics
  • Incorporates censored observations effectively
  • Handles varying observation periods effectively
Cons:
  • Less common in PD modeling
  • The proportional hazards assumption may not hold
  • Requires sufficient default events for stable estimation

Machine learning models

Random forest
Pros:
  • Captures non-linearities and interactions effectively
  • Handles outliers and missing data well
  • Less prone to overfitting: averaging across trees reduces variance
Cons:
  • Black-box nature can challenge regulatory approval and internal validation
  • Less stable: results may vary across runs due to bootstrapping

Gradient boosting
Pros:
  • High predictive power: often outperforms traditional models and random forests
  • Captures non-linearities and interactions effectively
  • Handles outliers and missing data well
Cons:
  • Even harder to explain than random forests due to the additive tree structure
  • High risk of overfitting; requires careful tuning
  • Requires large datasets for optimal performance

Neural network
Pros:
  • High performance: with enough data, can outperform other model classes
  • Highly flexible: learns complex interactions automatically
  • Feature learning: deep architectures can extract latent features
Cons:
  • Extremely hard to explain to regulators and stakeholders
  • Requires large amounts of data to train effectively and avoid overfitting
  • Highly complex: requires significant expertise to design the network architecture (layers, nodes, activation functions) and tune the training process
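As a concrete illustration of the traditional approach, here is a minimal logistic-regression PD sketch using scikit-learn. The features and default flag are synthetic placeholders, not a recommended model specification.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Toy data standing in for borrower financial ratios (illustrative only);
# columns could be, e.g., leverage, interest coverage, current ratio.
X = rng.normal(size=(5_000, 3))
# Synthetic binary default flag, driven by the first feature so that the
# overall default rate is low, as in a real portfolio.
y = (rng.random(5_000) < 1 / (1 + np.exp(-(X[:, 0] - 2.5)))).astype(int)

model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X, y)

# Predicted PDs for new obligors; these would then be mapped to ORRs.
pds = model.predict_proba(X[:5])[:, 1]
print(pds.round(4))
```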

PD Modeling Strategies When Internal Data Is Limited

Banks sometimes need to estimate the probability of default (PD) for a loan portfolio without sufficient internal default history. Common situations include offering a new product, having very few historical defaults in a low-default portfolio, lacking centralized financial statements or customer data, or facing selection bias because the bank only recorded data on accepted customers.

Scenario 1 – New product or no internal defaults

When the bank has no internal default data because the product or customer segment is new, modeling must rely on external information or expert judgment.

  • Use external or comparable data: Basel II/III guidelines state that when banks lack sufficient internal default history, they may map their internal rating grades to external data (rating agencies or industry databases) and use statistical default models [1]. Credit bureaus maintain long time series of defaults; CGAP's credit-scoring guide notes that lenders without performance data can "use the bureau's score as an input" for underwriting and that bureau scores can be customized [2].
  • Structural or "market-implied" models: when financial statements exist but historical defaults are absent, structural models such as Merton's treat company equity as an option on the firm's assets. The Merton model calculates PD as the probability that the firm's asset value falls below its debt at maturity [3]; default probability increases with leverage and asset volatility [4]. (A numerical sketch follows this list.)
  • Macroeconomic and rating-group models: the Federal Reserve's corporate-loan stress-test model assumes that PDs depend on borrower rating, industry, and macroeconomic variables (e.g., unemployment). Initial PDs are derived from long-run average PDs for the rating group, and PDs evolve using relationships between changes in PD and changes in macroeconomic variables [5].
  • Use external vendor or bureau models: industry vendors (e.g., Moody's KMV/EquityEdge or RiskCalc) produce PD estimates using option-theoretic or financial-statement models. Regulatory guidance notes that lenders may rely on such models but must justify that the target population is similar to the vendor's development population [6].
  • Expert judgment and qualitative scorecards: when no comparable data exist, lenders may start with judgmental or expert scorecards. CGAP explains that organizations launching a new product should develop an expert scorecard that assigns points based on borrower characteristics and use a controlled pilot to gather repayment data [7].
  • Pilot to generate repayment data: for products with no performance history, CGAP recommends a controlled lending pilot in which small loans are disbursed to a representative sample of customers; after a few loan cycles, the bank can develop a statistical model using the newly gathered default data [8].
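To illustrate the structural approach, here is a minimal Merton-style PD sketch. It assumes the asset value, drift, and volatility have already been estimated (in practice they are backed out from equity prices), and the inputs are illustrative numbers, not calibrated values.

```python
from math import log, sqrt
from scipy.stats import norm

def merton_pd(asset_value, debt_face, mu, sigma, horizon=1.0):
    """PD under Merton (1974): probability that assets fall below debt at T.

    asset_value: current market value of the firm's assets V
    debt_face:   face value of debt due at the horizon D
    mu:          expected (drift) return on assets
    sigma:       asset volatility
    """
    # Distance to default, then PD = N(-DD)
    dd = (log(asset_value / debt_face) + (mu - 0.5 * sigma**2) * horizon) / (
        sigma * sqrt(horizon)
    )
    return norm.cdf(-dd)

# Illustrative firm: ~71% leverage, 10% asset volatility, 5% drift
print(round(merton_pd(asset_value=140.0, debt_face=100.0, mu=0.05, sigma=0.10), 6))
```

As the text notes, raising leverage (debt_face relative to asset_value) or sigma in this sketch pushes the PD up.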

Scenario 2 – Very few defaults (low-default portfolio)

In portfolios with only a few defaults (e.g., sovereigns, blue-chip corporates, or specialized lending), traditional logistic regression can be unreliable. Regulators classify these as low-default portfolios (LDPs) and require conservative estimation.

  • Conservative confidence-bound methods (Pluto-Tasche): for portfolios with zero or very few defaults, Pluto & Tasche propose estimating PDs from upper confidence bounds. The method calculates the most prudent PD estimate for each rating grade by solving for the upper bound of the binomial distribution, ensuring that PD estimates increase monotonically across grades [9]. The approach is required by some regulators and is conservative because it produces high PDs when defaults are rare [10]. (A numerical sketch follows this list.)
  • Extended default definition: Finalyse (2023) notes that LDP modelers often redefine the outcome to increase the number of "bad" cases, e.g., using 60-day or 30-day past-due rather than 90-day default, or extending the observation window [11]. This creates more events for modeling while keeping calibration based on the regulatory default definition.
  • Sampling and oversampling techniques: low-default portfolios suffer from class imbalance. Finalyse explains that oversampling or undersampling can be used to increase the ratio of defaults to non-defaults; the synthetic minority oversampling technique (SMOTE) generates synthetic default observations by interpolating between nearest neighbors [12]. Studies comparing techniques for LDPs find that limited logistic regression (adding a parameter to bound the probability) and Bayesian logistic regression outperform classical logistic regression and benefit from oversampling [13]. (See the SMOTE sketch after this list.)
  • Bayesian methods: Bayesian statistics combine prior information with sparse data. In the low-default context, Bayesian approaches treat PD parameters as random variables and incorporate expert opinion or information from similar portfolios. The "Bayesian logistic regression" thesis explains that using priors derived from an old data set and combining them with limited new data improves predictive performance; the influence of the prior decreases as more data accumulate [14]. Finalyse highlights that Bayesian methods eliminate the need for selecting confidence levels and allow expert information to be embedded through informative priors [15].
  • Quasi-moment matching (QMM): QMM calibrates scores to PDs using a relationship between the empirical cumulative distribution of scores for survivors and the mean portfolio PD; it solves a two-parameter equation and can be approximated by logistic regression [16].
  • Structural and simulation models: when there are no observed defaults, simulation can be used. Rhino Risk's paper on modeling with little or no data notes that structural models simulate key risk drivers (e.g., asset value, borrower income) and count the number of simulated defaults to estimate PDs; this approach is valid even with no defaults but depends on how well the model represents reality [17].
  • Markov chains / migration matrices: another simple approach is to construct migration matrices (Markov chains) showing transitions between risk states and through to default. Rhino Risk describes using such matrices to estimate eventual default rates when default data are scarce [18].
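Below is a minimal sketch of the Pluto-Tasche "most prudent estimation" idea, assuming grades are ordered from best to worst and using the standard device of pooling each grade with all riskier grades so the estimates are monotone. The obligor counts and confidence level are illustrative.

```python
from scipy.optimize import brentq
from scipy.stats import binom

def upper_bound_pd(n_obligors, n_defaults, gamma=0.90):
    """Most prudent PD: the largest p with P(X <= d | n, p) >= 1 - gamma.

    For zero defaults this reduces to p = 1 - (1 - gamma)**(1 / n).
    """
    if n_defaults >= n_obligors:
        return 1.0
    f = lambda p: binom.cdf(n_defaults, n_obligors, p) - (1.0 - gamma)
    return brentq(f, 1e-12, 1.0 - 1e-12)

# Grades ordered best -> worst: (number of obligors, observed defaults)
grades = [(500, 0), (300, 0), (150, 1)]

# For grade i, pool grade i with all riskier grades so that the PD
# estimates are monotonically non-decreasing across grades.
for i, _ in enumerate(grades):
    n = sum(g[0] for g in grades[i:])
    d = sum(g[1] for g in grades[i:])
    print(f"grade {i + 1}: PD <= {upper_bound_pd(n, d):.4%}")
```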
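And here is a minimal sketch of rebalancing a low-default sample with SMOTE via the imbalanced-learn package; the data is synthetic and the ~1% default rate is chosen only to mimic an LDP.

```python
import numpy as np
from collections import Counter
from imblearn.over_sampling import SMOTE

rng = np.random.default_rng(0)

# Synthetic, heavily imbalanced sample: roughly 1% defaults.
X = rng.normal(size=(2_000, 4))
y = (rng.random(2_000) < 0.01).astype(int)

# SMOTE interpolates between nearest-neighbour defaults to create
# synthetic default observations until the classes are balanced.
X_res, y_res = SMOTE(random_state=0, k_neighbors=3).fit_resample(X, y)
print(Counter(y), "->", Counter(y_res))
```

Note that any rebalanced model still needs to be recalibrated to the portfolio's true (regulatory) default rate before use.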

Scenario 3 – Lack of centralized financial statements or "thin-file" borrowers

When customers have little or no formal financial history, banks must use alternative data and creative data-collection strategies.

  • Use alternative financial data (rent, utilities, telco, bank transactions): the Kansas City Fed explains that fintechs and credit bureaus are collecting alternative data, such as bank account balances; rent, utility, and subscription payments; or income from gig-economy platforms, to supplement traditional credit reports [19]. These data are often more readily available for thin-file customers and can improve credit access. (A small feature-engineering sketch follows this list.)
  • Leverage non-financial and digital-footprint data: alternative data also include non-financial information, such as educational and employment history, public records, and a customer's digital footprint [20]. An NBER study on digital footprints finds that simple website metadata (device type, operating system, email provider, time of access, etc.) have discriminatory power equal to or greater than bureau scores; digital footprints provide similar predictive power for previously unscorable customers [21].
  • Pilot to collect transaction data and build behavioral scorecards: as noted earlier, launching a controlled pilot allows a lender to generate repayment data. CGAP recommends designing short-term, low-amount loans and collecting demographic and behavioral information during the pilot; an expert scorecard can be used to select customers, and the resulting data can feed a statistical model [7].
  • Mapping to external ratings and industry ratios: for corporate borrowers without centralized financial statements, banks can use industry averages or regional statistics. Basel guidelines permit mapping internal grades to external ratings or industry default rates [1]. Merton-style models can estimate PDs using market-based asset values even if financial statements are limited, provided the firm has traded equity [3].
  • Use vendor-provided alternative-data scores: credit bureaus and fintechs offer scores that incorporate alternative data (e.g., FICO's UltraFICO, Experian Boost). The Kansas City Fed notes that some products allow consumers to contribute alternative data, such as bank account cash flows or rent payments, to enhance their credit scores [22]. Banks can incorporate these scores into underwriting.
  • Data privacy and bias considerations: the World Bank cautions that while alternative data can increase predictive accuracy by 5–20%, they raise concerns about privacy and discrimination [23]. Lenders should ensure compliance with data-protection regulations and avoid using sensitive variables.
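As a small illustration of how alternative financial data could be turned into model inputs, the sketch below derives simple behavioral features (payment count, average rent, payment regularity) from a transaction table. The schema and values are entirely invented.

```python
import pandas as pd

# Illustrative transaction history for two thin-file applicants.
tx = pd.DataFrame({
    "applicant_id": [1, 1, 1, 2, 2],
    "date": pd.to_datetime(["2024-01-05", "2024-02-05", "2024-03-06",
                            "2024-01-20", "2024-03-25"]),
    "category": ["rent", "rent", "rent", "rent", "rent"],
    "amount": [1200, 1200, 1200, 950, 950],
})

# Behavioral features: count, level, and regularity of rent payments
# (lower gap_std_days = more regular payer).
features = (
    tx[tx["category"] == "rent"]
    .groupby("applicant_id")
    .agg(rent_payments=("amount", "size"),
         avg_rent=("amount", "mean"),
         gap_std_days=("date", lambda d: d.sort_values().diff().dt.days.std()))
)
print(features)
```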

Scenario 4 – Selection bias because data exist only for approved borrowers

Credit-risk models built only on accepted loan applications can be biased because they exclude information on applicants who were rejected. This is common when the bank has historical repayment data for approved customers but not for rejected ones.

  • Reject inference techniques: SAS documentation describes several reject inference methods to infer the default behavior of rejected applicants. Techniques include parceling, nearest neighbor, bureau performance, fuzzy augmentation, and simple augmentation [24]. These methods assign hypothetical default outcomes to rejected applicants to reduce bias.
  • Heuristic methods (parceling, augmentation): parceling assigns rejected applicants to good and bad classes based on their score. Simple augmentation assigns all rejected applicants to the bad class. Fuzzy augmentation assigns partial good and bad weights to each rejected applicant. SAS notes that fuzzy augmentation is more sophisticated and less biased than the simpler methods [25]. (A minimal sketch follows this list.)
  • Statistical methods (maximum likelihood, Heckman correction): the Heckman two-step correction models the selection process (accept/reject) and then the outcome (default/non-default). SAS states that this method is theoretically sound but complex and sensitive to model specification [26]. Maximum likelihood estimation jointly estimates the acceptance and default processes.
  • Bureau-based reject inference: if the bank obtained credit bureau scores for all applicants (including rejects), it can use the bureau's performance data on those rejects. SAS explains that this method is effective if the bureau has performance data for the rejects from other lenders [27].
  • Use machine learning and semi-supervised learning: machine learning techniques such as semi-supervised learning or positive-unlabeled (PU) learning can be applied to reject inference. These methods treat rejected applicants as unlabeled and attempt to infer their true class. A study on credit scoring with reject inference found that semi-supervised methods improved model accuracy and reduced bias [28].
  • Ongoing data collection and policy changes: to avoid future selection bias, banks can occasionally approve a random sample of applicants who would normally be rejected. This strategy, called "through-the-door sampling" or "inference by experiment," provides performance data for all score ranges but is costly because it increases credit risk [29].
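Here is a minimal sketch of fuzzy augmentation, assuming an accepts-only model is used to score the rejects and each reject then enters the training set twice, once as a weighted good and once as a weighted bad, before refitting. All data is synthetic.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Synthetic accepted (labeled) and rejected (unlabeled) applicants.
X_acc = rng.normal(size=(3_000, 3))
y_acc = (rng.random(3_000) < 1 / (1 + np.exp(-(X_acc[:, 0] - 2)))).astype(int)
X_rej = rng.normal(loc=0.5, size=(1_000, 3))  # rejects skew riskier

# Step 1: fit on accepts only, then score the rejects.
base = LogisticRegression().fit(X_acc, y_acc)
p_bad = base.predict_proba(X_rej)[:, 1]

# Step 2: fuzzy augmentation - each reject appears twice, once as good
# with weight (1 - p_bad) and once as bad with weight p_bad.
X_aug = np.vstack([X_acc, X_rej, X_rej])
y_aug = np.concatenate([y_acc, np.zeros(len(X_rej)), np.ones(len(X_rej))])
w_aug = np.concatenate([np.ones(len(X_acc)), 1 - p_bad, p_bad])

# Refit on the augmented, weighted sample to reduce selection bias.
final = LogisticRegression().fit(X_aug, y_aug, sample_weight=w_aug)
print(final.coef_.round(3))
```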

References

  1. Basel Committee on Banking Supervision. (2006). International Convergence of Capital Measurement and Capital Standards: A Revised Framework.
  2. CGAP. (2012). A Guide to Credit Scoring.
  3. Merton, R. C. (1974). On the Pricing of Corporate Debt: The Risk Structure of Interest Rates. Journal of Finance.
  4. Bharath, S. T., & Shumway, T. (2008). Forecasting Default with the Merton Distance to Default Model. Review of Financial Studies.
  5. Federal Reserve. (2017). Dodd-Frank Act Stress Test 2017: Supervisory Stress Test Methodology and Results.
  6. OCC. (2011). Supervisory Guidance on Model Risk Management.
  7. CGAP. (2012). A Guide to Credit Scoring.
  8. CGAP. (2012). A Guide to Credit Scoring.
  9. Pluto, K., & Tasche, D. (2005). Thinking Positively about Low-Default Portfolios. Risk.
  10. BCBS. (2005). Studies on the Validation of Internal Rating Systems.
  11. Finalyse. (2023). Low Default Portfolio Modelling.
  12. Bastos, J. A. (2010). Forecasting Bank Loans Loss-Given-Default. Journal of Banking & Finance.
  13. Korobilis, D. (2013). Bayesian Methods in Empirical Finance.
  14. Finalyse. (2023). Low Default Portfolio Modelling.
  15. Tasche, D. (2013). The Art of Probability-of-Default Curve Calibration. Journal of Credit Risk.
  16. Rhino Risk. (2020). Modelling with Little or No Data.
  17. Federal Reserve Bank of Kansas City. (2019). Alternative Data and the Digital Transformation of Lending.
  18. Jagtiani, J., & Lemieux, C. (2018). The Roles of Alternative Data and Machine Learning in Fintech Lending. Federal Reserve Bank of Philadelphia.
  19. Berg, T., Burg, V., Gombović, A., & Puri, M. (2020). On the Rise of FinTechs: Credit Scoring Using Digital Footprints. The Review of Financial Studies.
  20. Federal Reserve Bank of Kansas City. (2019). Alternative Data and the Digital Transformation of Lending.
  21. World Bank. (2019). Alternative Data Transforming SME Finance.
  22. SAS. (2018). Reject Inference in Credit Scoring.
  23. Crook, J. N., & Banasik, J. (2004). Does Reject Inference Really Improve the Performance of Application Scoring Models? Journal of Banking & Finance.
  24. Hand, D. J., & Henley, W. E. (1993). Statistical Classification Methods in Consumer Credit Scoring: A Review. Journal of the Royal Statistical Society.