Commercial-Bank PD Modeling Team

SAS Analyst Interview – Question & Answer Bank

Question 1 – Code Review
Look at the following snippet and tell me what it does, why the data-step-merge is risky, and how you would refactor it to avoid accidental many-to-many joins and missing BY values.
/* What it does: left-joins loans to financials on facility_id, keeps only matches. */ /* Risks: many-to-many merge if duplicates exist; silently drops records with missing BY values. */ proc sql; create table merged as select l.*, f.* from loans as l join financials as f on l.facility_id = f.facility_id; quit;
Question 2 – PD Vintage Analysis
Using only PROC SQL or the data step, calculate the cumulative 12-month default rate for each origination vintage.
proc sql; create table vintage_pd as select vintage_qtr, sum(flag_default)/count(distinct facility_id) as cum12_pd from ( select l.vintage_qtr, l.facility_id, max(case when intck('month',l.orig_date,p.report_date) between 1 and 12 and p.dpd ge 90 then 1 else 0 end) as flag_default from loans l join perf.monthly p on l.facility_id = p.facility_id group by l.vintage_qtr, l.facility_id ) group by vintage_qtr; quit;
Question 3 – MNAR Imputation
Describe two SAS procedures or macros you could use to implement multiple-imputation by chained equations (MICE) and explain how you would pool the resulting PD parameter estimates.
/* PROC MI with FCS for MICE */ proc mi data=loans nimpute=10 seed=12345; fcs nbiter=20; var debt_service_coverage_ratio ltv dscr other_ratios; run; /* PROC MIANALYZE to pool logistic regression results */ proc mianalyze parms=parms_pd; modeleffects intercept ltv dscr; run;
Question 4 – PROC LOGISTIC Fine-Tuning
Show the exact options you would add to PROC LOGISTIC to fit a PD model with backward selection (SBC), forced zero intercept, exposure weighting, and export ROC & parameter estimates.
proc logistic data=train; model default(event='1') = &vars / selection=backward(select=SBC) noint weight=exposure_wgt; output out=scored pred=p; roc; ods output roc=rocdata; ods output parameterestimates=parms; run;
Question 5 – Scalability Debate
Compare Base-SAS multithreaded, PROC HPLOGISTIC, and CAS actions on Viya for scoring 25 million facilities quarterly. Which do you recommend?
Base-SAS: single-machine RAM-bound. PROC HPLOGISTIC: multithreaded, up to ~10 M rows. CAS actions (Viya): distributed memory, linear scale. Recommendation: use CAS action logistic in Viya if available; else PROC HPLOGISTIC.
Question 6 – Data-Step Array Puzzle
Given 60 monthly PD forecasts (pd1–pd60), compute cumulative 12-month PD under conditional independence.
data want; set have; array pd{60} pd1-pd60; cum12_pd = 1 - exp(sum(log(1-pd{i}}), i=1 to 12)); run;
Question 7 – Hash Object vs. Formats
Explain when a hash object is preferable to a user-defined format for NAICS look-up and show skeleton code.
/* Hash – dynamic, multi-var return */ if _n_=0 then do; declare hash h(dataset:'industry_map'); h.defineKey('naics'); h.defineData('risk_weight'); h.defineDone(); end; rc = h.find(); /* Format – fastest, single var */ proc format; value $naics_fmt '311'='Low' '312'='Medium'; run;
Question 8 – Macro Facility Debugging
Describe systematic debugging steps and defensive coding for an inherited macro that throws “Apparent symbolic reference not resolved”.
/* Debug steps */ options mprint mlogic symbolgen; %put _user_; /* Defensive macro */ %macro score_pd(lib=, ds=); %if %length(&lib)=0 or %length(&ds)=0 %then %do; %put ERROR: parameters missing; %return; %end; ... %mend;
Question 9 – Stress-Testing Integration
Outline a macro that re-calculates PDs under 20 macro scenarios, writes outputs, and logs audit metadata.
%macro stress_pd(scen_file=); proc import datafile="&scen_file" out=scenarios dbms=csv replace; run; data _null_; set scenarios end=eof; call symputx(cats('scen',_n_),scenario_name); call symputx('nscen',_n_); run; %do i=1 %to &nscen; data loans_scen; set loans; scenario="&&scen&i"; macro_pd=...; run; proc append base=results.pd_scenarios data=loans_schen force; run; proc sql; insert into audit_log set timestamp=datetime(), scenario="&&scen&i", nobs=&sqlobs; quit; %end; %mend;
Question 10 – Git & Enterprise Guide Workflow
Propose a workflow so analysts can use Enterprise Guide interactively while production runs the exact Git-tagged version.
1. Clone repo to local workspace. 2. EG project links *.sas files via relative paths; .egp not committed. 3. Commit .sas files with Git; tag releases. 4. Production batch job runs: sasbatch.bat myfile.sas from tagged release branch.