Finance Analytics Portfolio

Cayman Roden · Data Analyst | Credit Risk, Market Analysis & BI Deliverables

SQL and Python analysis on real 2022 CFPB HMDA mortgage data (1.2M+ rows) covering credit risk, fair lending, and financial risk - findings delivered as a 25-page interactive dashboard, written business memos, 8-tab Excel workbook, and 11 SQL query templates.

Three findings that would change a lender's decisions today:

DTI ratio, not FICO, drives default among prime borrowers (720+ FICO) - tightening DTI thresholds reduces expected losses by $600K-$900K per $100M with no impact on approval volume
Geographic HMDA analysis surfaces county-level fair lending disparities with $5-15M estimated regulatory penalty exposure
Parametric VaR understates tail risk by 12-18% - GARCH volatility forecasts reveal $12-18M in unhedged exposure per $1B equity portfolio

Live Dashboard - no sign-up required

Credentials: 21 professional certifications · 1,831 training hours · 127 courses — view breakdown on the Credentials page

For Hiring Managers

If you're hiring a...	Start here
Data Analyst	SQL Case Study · Advanced SQL Patterns · BigQuery SQL Layer · Tableau Dashboard · HMDA Fair Lending Memo · Narrative Notebook
Business Analyst	Business Requirements Doc · Data Model ERD · Case Study Memo · Data Dictionary · HMDA Memo
Analytics Engineer	dbt layer (5 marts, 33 schema tests) · BigQuery SQL · Evidence.dev Report (dbt → BI as code) · ETL pipeline · 12 SQL files in `sql/`
Risk / Quant Analyst	VaR Analysis · Stress Testing (CCAR/DFAST) · Credit Risk Explorer
BI Developer	Tableau Dashboard · Power BI PBIP (14 DAX measures) · LookML Models · Evidence.dev · Google Sheets Export

Written deliverable: CASE_STUDY.md — bank memo format, VP-ready.

Analyst Deliverables

Deliverable	Description	Link
Tableau Public	Interactive mortgage credit risk dashboard — approval patterns, FICO analysis, fair lending disparities	Live Dashboard
Power BI Project	3-page PBIP report (Credit Risk, Sector Performance, Macro Dashboard) with DAX measures — Git-friendly text format	powerbi/
LookML Models	Looker semantic layer — lending, stocks, and FRED views with credit risk explore	lookml/
Evidence.dev	5-page narrative BI report — credit risk, fair lending, portfolio returns, sector performance — built on dbt marts with SQL + Svelte charts	Live Report
BigQuery SQL Layer	5 analytics queries: DDL with partitioning/clustering, credit risk analysis, portfolio analytics, fair lending compliance, CCAR stress testing	bigquery/
Business Requirements Doc	BA-format BRD: project charter, RACI, 7 business requirements with MoSCoW priority and acceptance criteria, functional requirements, traceability matrix	BUSINESS_REQUIREMENTS.md
Data Model ERD	Star schema Mermaid diagram — 4 source tables + 5 dbt marts, grain statements, relationship cardinality, design decisions	DATA_MODEL.md
Advanced SQL Patterns	5 interview-ready patterns: PIVOT cross-tab, QUALIFY top-N, CUME_DIST percentile, self-join period-over-period, GROUPING SETS subtotals	sql/advanced_patterns.sql
Excel Workbook	8-tab export: Executive Summary, Lending, Stocks, FRED, Risk Tiers, Cohorts, What-If	export script
HMDA Fair Lending Memo	4-page analyst memo — approval disparities, income controls, race gap, regulatory context, and 3 actionable recommendations	hmda-fair-lending-memo.pdf
Narrative Notebook	DA-style analysis (60% markdown) — findings presented to a compliance team, not to a code reviewer	notebook 16
Executive PDF	4-page chart-embedded brief — portfolio overview, credit risk, macro recommendations	executive-brief.pdf
Google Sheets	Cloud-native export with formatting — same 8-tab structure as Excel	export script
Case Study Memo	VP-ready bank memo format with dollar-impact findings	CASE_STUDY.md

Tableau Dashboard

Where Mortgage Approvals Break Down: HMDA Credit Risk Analysis — Tableau Public

Interactive executive dashboard analyzing CFPB HMDA mortgage lending data. Answers: Which loan grades carry the highest default risk, and where are applicants being denied at disproportionate rates?

Key findings surfaced:

Grade G defaults at ~4× the rate of Grade A; sub-prime tail risk concentrated in FICO < 640 band
Denial rates vary significantly by income tier — income-disparity pattern visible across geographies
DTI and loan purpose are stronger denial predictors than loan amount alone

Data pipeline: 9.5MB HMDA dataset prepared with Python + dbt before visualization. Export script: scripts/export_tableau_data.py. Spec: docs/TABLEAU.md.

Evidence.dev Report (BI as Code)

Finance Analytics Report — deployed on Vercel, built from dbt marts

Five-page narrative analytics report built with Evidence.dev — a markdown-based BI framework that compiles SQL queries and Svelte chart components into a static site. Each page combines live SQL against DuckDB marts with analyst-written narrative, alert callouts, and actionable recommendations.

Report pages:

Page	What It Covers	Link
Homepage	Executive overview with key metrics, portfolio composition, and navigation to all analyses	View
Credit Risk	Default rates by loan grade, FICO/DTI distributions, risk tier classification, underwriting recommendations	View
Fair Lending	Disparate impact analysis (ECOA/FHA), DI ratios by income tier and ownership, rate spread disparities, regulatory risk flags	View
Portfolio Returns	Sharpe ratios by sector and ticker, risk-return scatter, cumulative return trends, top 20 holdings summary	View
Sector Performance	Monthly sector rankings, rotation trends, volatility regimes, efficiency frontier scatter	View

What makes this different from the Streamlit dashboard: Evidence.dev pages are written like analyst memos — each finding has a narrative title, an explanatory paragraph, and a "So What?" section with numbered business recommendations. The SQL source queries include calculated columns (Sharpe ratios, DI ratios, momentum labels, risk flags) that power the charts directly, demonstrating analytics engineering alongside storytelling.

Stack: Evidence.dev + dbt-core + DuckDB | Source: evidence-report/ | Auto-deploys from GitHub via Vercel.

Methodology

Data: 50K synthetic lending records (seed=42, matched to Lending Club statistical properties) + live yfinance/FRED with synthetic fallback. See docs/DATA_GENERATION.md.
Models: scikit-learn (RF/GBM/LogReg), statsmodels (OLS, ARIMA), arch (GARCH), lifelines (Kaplan-Meier), shap (TreeExplainer)
Analytics layer: dbt-core + DuckDB (3 staging models, 5 marts, 33 schema tests). See SQL & Analytics Engineering.

Model governance: docs/model_card.md — SR 11-7 compliant.

Quick Start

git clone https://github.com/ChunkyTortoise/finance-analytics-portfolio
cd finance-analytics-portfolio
make setup   # pip install -r requirements.txt
make app     # streamlit run dashboard/app.py

Or with Docker:

docker build -t finance-analytics .
docker run -p 8501:8501 finance-analytics
# open http://localhost:8501

No API keys required — synthetic data fallback is built in.

Key Findings

Three findings that would change a lender's underwriting or risk decisions today:

DTI beats FICO for high-credit borrowers. Among borrowers with FICO > 720, debt-to-income ratio explains ~38% of default variance that FICO-only screening misses. For a $100M portfolio, tightening DTI thresholds in the 720-760 band is the higher-ROI lever without declining additional applications. → Credit Risk Explorer
Stagflation is harder on portfolios than severe recession. Grade G PD multiplier reaches 2.4x under stagflation vs 2.1x for severe recession — floating-rate exposure amplifies rate shocks beyond what unemployment-only stress models predict. → Stress Testing
Parametric VaR understates tail risk by 12–18%. GARCH(1,1) conditional volatility forecasts outperform rolling-window estimates with 15% lower out-of-sample RMSE. For a $1B equity portfolio, the gap implies ~$12–18M in unhedged tail exposure at 99% confidence. → VaR Analysis

Flagship Pages

Page	What It Demonstrates
Credit Risk Explorer	Random Forest + SHAP explanations, 3-model comparison (LogReg/RF/GBM), ROC curves, grade transition matrix, Reg B framing
VaR Analysis	Parametric/Historical/Monte Carlo VaR + ES, Kupiec + Christoffersen backtests, GARCH(1,1), horizon scaling, regime classification
Causal Inference	DiD estimator, synthetic control, regression discontinuity, propensity score matching — causal vs correlational
Fair Lending Geography	HMDA 2022 Colorado mortgage data, county-level approval rates, income-disparity scatter, choropleth
Stress Testing	CCAR/DFAST macro scenarios, grade-level PD amplification, sensitivity heatmap, Markov regime switching

Credit Risk Explorer	Fair Lending Geography

Architecture

graph LR
    subgraph Sources
        YF[yfinance<br/>15 tickers]
        FRED[FRED API<br/>7 indicators]
        SYN[Synthetic<br/>seed=42]
    end

    subgraph "ETL Pipeline"
        EX[extract.py]
        TR[transform.py]
        LD[load.py]
    end

    subgraph Storage
        SP[stocks.parquet]
        FP[fred.parquet]
        LP[lending.parquet]
    end

    subgraph "dbt Analytics"
        STG[3 staging views]
        MRT[5 mart tables]
    end

    subgraph "Analysis (26 modules)"
        CM[credit_models.py]
        NC[neural_credit.py]
        VA[var_analysis.py]
        ST[stress_testing.py]
        MORE[+19 modules]
    end

    subgraph Dashboard
        APP[Streamlit<br/>24 pages]
    end

    YF --> EX --> TR --> LD
    FRED --> EX
    SYN --> EX
    LD --> SP & FP & LP
    SP & FP & LP --> STG --> MRT
    SP & FP & LP --> CM & VA & ST & MORE
    CM & VA & ST & MORE --> APP

All Pages

24 pages (click to expand)

Page	Description
Credit Risk Explorer	FICO/DTI distributions, Random Forest default model, SHAP explanations, model comparison (LogReg/RF/GBM + ROC), grade transition matrix
VaR Analysis	Parametric/Historical/MC VaR + ES, Component VaR, Kupiec + Christoffersen backtests, GARCH(1,1), horizon risk scaling, regime classification
Causal Inference	DiD, synthetic control, regression discontinuity, propensity score matching
Fair Lending Geography	HMDA 2022 Colorado: county approval rates, income-disparity scatter, choropleth
Stress Testing	CCAR/DFAST macro scenarios, grade-level PD amplification, sensitivity heatmap, Markov regime switching
Executive Summary	CFO-ready KPIs (S&P YTD, Sharpe, default rate, recession risk), key findings, Excel export
Market Overview	Sector treemap, top movers, market breadth, S&P 500 moving average signals
Economic Dashboard	FRED macro time series, yield curve, event study (CAR/AAR)
Portfolio Optimizer	Efficient frontier, Monte Carlo simulation, factor attribution, rolling beta
Forecasting	Prophet time series with trend/seasonality decomposition
Survival Analysis	Kaplan-Meier curves by grade, cohort heatmap, vintage curves
A/B Testing Lab	Power analysis, frequentist z-test, Bonferroni, Bayesian Beta-Binomial
SQL Case Study	Gap-and-islands, recursive CTEs, cohort analysis — with business recommendations
SQL Playground	Interactive DuckDB runner — 7 templates, schema browser, NL-to-SQL
Segmentation	K-means clustering, PCA 2D projection, segment profiles
Data Quality	Schema validation, missing value profiling, freshness checks
Model Monitoring	PSI/CSI score drift, calibration curve, threshold optimization
Anomaly Detection	Isolation Forest, Z-score, CUSUM structural break detection
Fairness & Bias	Disparate impact, equalized odds, threshold mitigation — ECOA/FHA framing
Regression Analysis	OLS (LPM), VIF, stepwise selection (AIC/BIC), residual diagnostics
Live Market Monitor	Real-time simulated streaming market data, rolling VaR, volatility regime classification
News Sentiment	VADER headline sentiment, daily aggregation, sentiment-return correlation, top terms
Interview Walkthrough	Guided walkthrough of the portfolio's analytical decisions and methodology
Credentials & Expertise	21 professional certifications, provider breakdown, skills radar, "Applied" badges

SQL & Analytics Engineering

dbt Analytics

make dbt-run        # dbt run --target dev (DuckDB)
make dbt-test       # dbt test --target dev (33 schema tests)
make dbt-snowflake  # dbt run + test --target snowflake (requires env vars)

Staging (views): stg_stocks, stg_lending, stg_macro — source via {{ source_table() }} macro, dispatches to read_parquet() (DuckDB) or {{ source() }} (Snowflake)

Marts (tables): mart_default_rates, mart_portfolio_returns, mart_sector_performance, mart_credit_risk, mart_fair_lending — min_by()/max_by() throughout for cross-database compatibility

See docs/SNOWFLAKE.md for free trial setup and bulk loading.

SQL Case Studies

11 standalone queries in sql/ — each framed as a business question:

File	Business Question
`cohort_defaults.sql`	How do default rates evolve by loan origination cohort over time?
`cross_domain_correlation.sql`	Does rising unemployment predict rising default rates?
`gap_islands.sql`	Which origination periods show activity gaps vs continuous volume?
`monthly_default_trend.sql`	What is the month-over-month default trend by loan grade?
`recursive_cte.sql`	How do defaults cascade through a portfolio over a multi-period horizon?
`risk_decile_ntile.sql`	Which borrowers fall in the top-risk decile, and what do they share?
`borrower_risk_tiers.sql`	Which borrowers belong in each risk tier based on FICO and DTI combined?
`sector_momentum.sql`	Which sectors show sustained momentum vs mean-reversion?
`sector_rankings.sql`	How do sectors rank by risk-adjusted return?
`sessionization.sql`	What are the stage-to-stage conversion rates in the loan application funnel?
`yoy_comparison.sql`	How do sector returns compare year-over-year?

Tests (1,522 across 59 files)

make test   # pytest tests/ -v --tb=short

1,522 tests across 60 files covering ETL, SQL analytics, cohort/survival analysis, A/B testing, forecasting, model comparison, PyTorch neural networks, LangChain SQL chains, segmentation, data quality, Excel export, KPI computations, model monitoring, anomaly detection, fairness/bias, stress testing (CCAR/DFAST), VaR/ES (Component VaR, Christoffersen test, horizon scaling), GARCH volatility, OLS regression, factor attribution, rolling beta, credit transition matrices, Markov regime switching, event studies, HMDA geographic analysis, Snowflake SQL compatibility, and Great Expectations data quality validation.

Tool Integrations

Tool	What's built	Docs
Tableau Public	Interactive mortgage credit risk dashboard (approval/denial patterns, FICO analysis, fair lending) built on CFPB HMDA data prepared via Python/SQL/dbt pipeline	Live Dashboard
Power BI	3-page PBIP report with DAX measures, Git-friendly format	powerbi/
LookML	Looker semantic layer with lending, stocks, and FRED explores	lookml/
Snowflake	Dual-target dbt (`--target dev` or `--target snowflake`), `source_table()` dispatch macro, bulk parquet loader	docs/SNOWFLAKE.md
Apache Airflow	TaskFlow DAG (ETL→GX→dbt→Tableau), Docker Compose, isolated deps	airflow/README.md
Great Expectations	3 validation suites (lending: 12, stocks: 8, FRED: 6 expectations), Data Docs HTML output	`python -m great_expectations`
Google Sheets	Cloud-native export with formatting — 8-tab structure	export script
Polars	pandas vs Polars benchmark notebook (6 ops, LazyFrame, DuckDB Arrow zero-copy)	`notebooks/14_polars_comparison.ipynb`

Tech Stack

Category	Libraries
Data manipulation	`pandas`, `numpy`, `duckdb`, `pyarrow`
Statistics & modeling	`scipy`, `scikit-learn`, `statsmodels`, `lifelines`
Forecasting / volatility	`prophet`, `arch` (GARCH, EGARCH)
ML explainability	`shap`
Visualization	`plotly`, `matplotlib`, `seaborn`
Data acquisition	`yfinance`, `fredapi`
Dashboard	`streamlit`, `openpyxl`
Analytics layer	`dbt-core`, `dbt-duckdb`

For engineering details (FastAPI, Airflow, streaming, neural networks), see docs/TECHNICAL.md.

Limitations

Synthetic lending data: Generated to match real-world FICO/DTI/grade distributions (seed=42), not real borrower records.
Single period: 2019–2024 window covers COVID volatility but not a full credit cycle.
Univariate GARCH: Per-asset GARCH(1,1). Production would use DCC-GARCH for cross-asset correlation.
No transaction costs: Portfolio optimization assumes frictionless trading.

Name		Name	Last commit message	Last commit date
Latest commit History 119 Commits
.claude		.claude
.github/workflows		.github/workflows
.streamlit		.streamlit
airflow		airflow
analysis		analysis
api		api
bigquery		bigquery
dashboard		dashboard
data		data
dbt		dbt
docs		docs
etl		etl
evidence-report		evidence-report
gx_suites		gx_suites
lookml		lookml
notebooks		notebooks
output		output
powerbi		powerbi
research		research
scripts		scripts
sql		sql
tests		tests
web		web
.dockerignore		.dockerignore
.gitattributes		.gitattributes
.gitignore		.gitignore
CASE_STUDY.md		CASE_STUDY.md
Dockerfile		Dockerfile
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
TECHNICAL_DECISIONS.md		TECHNICAL_DECISIONS.md
audit_concentration.html		audit_concentration.html
docker-compose.yml		docker-compose.yml
mortgage_credit_risk.twb		mortgage_credit_risk.twb
mortgage_credit_risk.twbx		mortgage_credit_risk.twbx
pyproject.toml		pyproject.toml
requirements-airflow.txt		requirements-airflow.txt
requirements-dev.txt		requirements-dev.txt
requirements.txt		requirements.txt
runtime.txt		runtime.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Finance Analytics Portfolio

For Hiring Managers

Analyst Deliverables

Tableau Dashboard

Evidence.dev Report (BI as Code)

Methodology

Quick Start

Key Findings

Flagship Pages

All Pages

SQL & Analytics Engineering

dbt Analytics

SQL Case Studies

Limitations

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Finance Analytics Portfolio

For Hiring Managers

Analyst Deliverables

Tableau Dashboard

Evidence.dev Report (BI as Code)

Methodology

Quick Start

Key Findings

Flagship Pages

All Pages

SQL & Analytics Engineering

dbt Analytics

SQL Case Studies

Limitations

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages