Most people coming to Burning Cost have one of three starting points. Pick the path closest to yours — each one recommends three to five libraries that solve problems you will encounter first, in an order that makes sense.
Quick install
The six libraries most teams reach for first. Copy the command that matches your setup.
Your first 5 minutes
A minimal working example using shap-relativities. Fits a CatBoost model on synthetic motor data and extracts a multiplicative factor table — the same output format as exp(β) from a GLM. No data download required.
The `factors` dict maps each feature name to a DataFrame with one row per level. `relativity` is the multiplicative factor — the same structure as GLM output from Emblem or Radar. `reconstruction_r2` tells you how much of the model's variance the factor table explains; above 0.95 is production-usable.
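For intuition about what the relativity column contains: for a one-way factor in a log-link Poisson GLM, exp(β) is each level's exposure-weighted claim frequency divided by the base level's. A stdlib-only toy sketch of that structure (hypothetical data; this is the GLM baseline being emulated, not the shap-relativities API):

```python
from collections import defaultdict

# Toy motor records: (vehicle_group, exposure_years, claim_count).
# Hypothetical numbers for illustration only.
records = [
    ("A", 120.0, 6), ("A", 80.0, 4),
    ("B", 100.0, 10), ("B", 50.0, 5),
    ("C", 60.0, 9),
]

# Exposure-weighted claim frequency per level, as a one-way GLM sees it.
exposure = defaultdict(float)
claims = defaultdict(float)
for level, exp_yrs, n in records:
    exposure[level] += exp_yrs
    claims[level] += n

freq = {lvl: claims[lvl] / exposure[lvl] for lvl in exposure}

# Multiplicative relativities: each level's frequency relative to the
# base level "A" — the exp(beta) of a log-link Poisson GLM.
base = freq["A"]
relativity = {lvl: f / base for lvl, f in freq.items()}
print(relativity)  # {'A': 1.0, 'B': 2.0, 'C': 3.0}
```

The real factor table adds confidence intervals and handles interactions, but every row reduces to this ratio-to-base interpretation.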
Pricing actuary moving to Python
You know the techniques: GLMs, factor tables, A/E monitoring, credibility. What you need are Python equivalents that produce output in the same formats you already use, and that handle the insurance-specific details (IBNR buffers, exposure weighting, renewal cohort structure) that generic ML libraries ignore.
- shap-relativities: Turns a CatBoost GBM into a multiplicative factor table — same output format as exp(β) from Emblem or Radar, with confidence intervals and exposure weighting. This is where most actuaries start.
- insurance-cv: Walk-forward cross-validation with IBNR buffers. Stops future experience from leaking into training folds. Produces Poisson and Gamma deviance scores, not just RMSE.
- insurance-monitoring: Three-layer post-deployment monitoring — exposure-weighted PSI/CSI, segmented A/E ratios with IBNR adjustment, and a Gini z-test that tells you whether to recalibrate or refit.
- insurance-governance: PRA SS1/23-aligned model validation reports. Bootstrap Gini CI, Poisson A/E CI, double-lift charts, renewal cohort test. HTML/JSON output for model risk committees.
- insurance-credibility: Bühlmann-Straub credibility in Python. Practical for capping thin segments, stabilising NCD factors, and blending a new model with an incumbent rate.
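The credibility blending behind that last item is compact enough to sketch directly. Below is the simple Bühlmann form with assumed variance components (Bühlmann-Straub additionally estimates them from the data); segment names and numbers are hypothetical, and this is not the insurance-credibility API:

```python
# Hypothetical segments: (exposure in policy-years, observed claim frequency).
segments = {
    "young_drivers": (50.0, 0.18),
    "fleet":         (400.0, 0.09),
    "classic_cars":  (15.0, 0.02),
}

collective_mean = 0.10   # portfolio-wide frequency (assumed)
process_var = 0.10       # expected process variance sigma^2 (assumed)
structure_var = 0.002    # variance of hypothetical means tau^2 (assumed)

k = process_var / structure_var  # credibility constant k = sigma^2 / tau^2

blended = {}
for name, (n, xbar) in segments.items():
    z = n / (n + k)  # Buhlmann credibility weight: more exposure, more weight
    blended[name] = z * xbar + (1 - z) * collective_mean

# Thin segments shrink hard toward the collective mean; large segments
# mostly keep their own experience.
for name, est in blended.items():
    print(name, round(est, 4))
```

The practical payoff is exactly what the bullet describes: `classic_cars`, with 15 policy-years, is pulled most of the way back to the portfolio frequency rather than being priced on two claims' worth of noise.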
Data scientist joining an insurance pricing team
You have the ML fundamentals. What you are missing is the insurance context: why you cannot use k-fold CV, what IBNR means for your validation, how proxy discrimination is tested under FCA guidance, and why your coefficient estimates might be confounded. These libraries encode that context.
- insurance-datasets: Synthetic UK motor portfolio data with a known data-generating process. Use it to validate your methods before touching real data.
- insurance-cv: Temporally correct cross-validation. Random folds are wrong for insurance data; this library explains why and fixes it.
- insurance-fairness: Proxy discrimination auditing aligned with FCA EP25/2 and Consumer Duty. Quantifies indirect discrimination risk from rating variables correlated with protected characteristics.
- insurance-causal: Double machine learning for deconfounding rating factors. If your rating variables correlate with distribution channel or policyholder behaviour, standard GLM coefficients are biased.
- shap-relativities: Produces outputs that the actuarial side of the team will recognise. A bridge between a model that lives in Python and a committee that wants a factor table.
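The "why random folds are wrong" point deserves a concrete picture: claims develop over months, so a random k-fold split trains on policies written after the ones it scores, and on claims that would not yet be reported at the scoring date. A minimal stdlib sketch of a walk-forward split with an IBNR buffer (hypothetical `walk_forward_folds` helper and data; not the insurance-cv API):

```python
from datetime import date, timedelta

# Hypothetical daily policy inceptions over two years.
policies = [{"inception": date(2022, 1, 1) + timedelta(days=i)}
            for i in range(730)]

def walk_forward_folds(rows, n_folds=4, buffer_days=90):
    """Yield (train, test) index lists where training data always precedes
    the test window, separated by a buffer so claims still developing
    (IBNR) at the cutoff never leak into training."""
    order = sorted(range(len(rows)), key=lambda i: rows[i]["inception"])
    fold_size = len(order) // (n_folds + 1)
    for f in range(1, n_folds + 1):
        test = order[f * fold_size:(f + 1) * fold_size]
        cutoff = rows[test[0]]["inception"] - timedelta(days=buffer_days)
        train = [i for i in order[:f * fold_size]
                 if rows[i]["inception"] < cutoff]
        yield train, test

for train, test in walk_forward_folds(policies):
    last_train = max(policies[i]["inception"] for i in train)
    first_test = min(policies[i]["inception"] for i in test)
    assert last_train < first_test  # no future experience in training folds
```

Random k-fold fails both assertions by construction: every fold's training set straddles its test window in time.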
Technical pricing team lead evaluating what to adopt
You need to know what is production-ready, what the regulatory exposure is, and how to move models through sign-off without creating a maintenance burden. These libraries have actuarial tests, clear scope, and outputs a pricing committee or auditor can follow.
- insurance-deploy: Champion/challenger framework with shadow mode, SHA-256 deterministic routing, SQLite quote log, and a bootstrap likelihood ratio test to declare a winner. ICOBS 6B.2 audit trail included.
- insurance-governance: PRA SS1/23 model validation reports in HTML and JSON. Covers the tests a model risk function will ask for, in a format they can file.
- insurance-monitoring: Post-deployment drift monitoring with a clear decision rule: recalibrate vs. refit. Reduces the judgement calls that stall model review cycles.
- insurance-fairness: Proxy discrimination audit that produces an evidence pack for Consumer Duty and FCA supervisory review. Quantifies risk before sign-off, not after a regulatory question.
- insurance-conformal: Distribution-free prediction intervals with finite-sample coverage guarantees. Relevant wherever a model needs a principled uncertainty bound for Solvency II or internal capital.
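SHA-256 deterministic routing is worth understanding before adopting it: hashing the quote ID gives a stable, stateless, reproducible champion/challenger split, which is what makes the audit trail possible. A minimal sketch of the technique (hypothetical `route` function and share parameter; not the insurance-deploy API):

```python
import hashlib

def route(quote_id: str, challenger_share: float = 0.1) -> str:
    """Deterministically route a quote to champion or challenger.

    Hashing the quote ID means the same quote always gets the same model,
    there is no routing state to store, and the split can be replayed
    exactly for audit."""
    digest = hashlib.sha256(quote_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the digest to a uniform value in [0, 1).
    u = int.from_bytes(digest[:8], "big") / 2**64
    return "challenger" if u < challenger_share else "champion"

routes = [route(f"Q{i:06d}") for i in range(10_000)]
share = routes.count("challenger") / len(routes)
print(f"challenger share ~ {share:.3f}")  # close to the configured 0.10
# Replaying the same quote IDs reproduces the assignment exactly.
assert routes == [route(f"Q{i:06d}") for i in range(10_000)]
```

Because the assignment is a pure function of the quote ID, an auditor can verify any historical routing decision from the quote log alone.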
Worked Examples
The burning-cost-examples GitHub repo contains 35 Databricks notebooks covering the full ecosystem; the Notebooks page on this site hosts a curated subset. Each notebook installs its own dependencies, generates synthetic data, fits models, and benchmarks against a standard actuarial baseline. Browse the notebooks/ directory or pick from the table below, sorted by library name.
| Library | What it shows | Notebook |
|---|---|---|
| bayesian-pricing | Hierarchical Bayesian vs raw experience on thin segments | view |
| experience-rating | NCD stationary distribution, optimal claim threshold | view |
| insurance-causal | DML causal effect vs naive Poisson GLM on confounded data | view |
| insurance-causal-policy | SDID rate change evaluation with event study and HonestDiD | view |
| insurance-conformal | Tweedie conformal intervals vs bootstrap on 50k motor | view |
| insurance-conformal-ts | ACI/SPCI vs split conformal on non-exchangeable time series | view |
| insurance-covariate-shift | Importance-weighted evaluation after distribution shift | view |
| insurance-credibility | Bühlmann-Straub credibility vs raw experience on 30 segments | view |
| insurance-cv | Random CV vs temporal CV vs true OOT holdout | view |
| insurance-demand | Conversion/retention, DML elasticity, GIPP-constrained optimiser | view |
| insurance-deploy | Shadow mode, quote logging, bootstrap LR test, ENBP audit | view |
| insurance-dispersion | DGLM vs constant-phi Gamma GLM, per-risk volatility scoring | view |
| insurance-distributional | Distributional GBM (TweedieGBM) vs standard point predictions | view |
| insurance-distributional-glm | GAMLSS vs standard Gamma GLM on heterogeneous-variance data | view |
| insurance-dynamics | GAS Poisson filter vs static GLM, BOCPD changepoint detection | view |
| insurance-elasticity | DML elasticity, ENBP-constrained optimiser, efficient frontier | view |
| insurance-fairness | Proxy discrimination audit, bias metrics, Lindholm correction | view |
| insurance-frequency-severity | Sarmanov copula joint freq-sev vs independence assumption | view |
| insurance-gam | EBM/ANAM vs Poisson GLM with planted non-linear effects | view |
| insurance-glm-tools | Nested GLM embeddings for 500 vehicle makes vs dummy-coded GLM | view |
| insurance-governance | PRA SS1/23 validation workflow, MRM risk tiering, HTML report | view |
| insurance-interactions | CANN/NID interaction detection vs exhaustive pairwise GLM search | view |
| insurance-monitoring | Exposure-weighted PSI/CSI, A/E ratios, Gini drift z-test | view |
| insurance-multilevel | CatBoost + REML random effects vs one-hot encoding | view |
| insurance-optimise | SLSQP constrained optimisation, efficient frontier, FCA audit | view |
| insurance-quantile | CatBoost quantile regression vs lognormal, TVaR, ILF curves | view |
| insurance-severity | Spliced Lognormal-GPD + DRN vs Gamma GLM, tail quantiles | view |
| insurance-spatial | BYM2 territory factors vs postcode grouping, Moran's I | view |
| insurance-survival | Cure models vs KM/Cox PH, CLV bias by cure band | view |
| insurance-synthetic | Vine copula generation, fidelity report, TSTR benchmarks | view |
| insurance-telematics | HMM latent-state features vs raw trip aggregates | view |
| insurance-thin-data | GLMTransfer + TabPFN vs raw GLM on thin segments | view |
| insurance-trend | Automated trend selection vs naive OLS, structural breaks | view |
| insurance-whittaker | W-H smoothing with REML lambda vs manual step smoothing | view |
| shap-relativities | CatBoost relativities vs GLM vs true DGP on synthetic motor | view |
How to run: Import any notebook into Databricks with `databricks workspace import notebooks/<name>.py /Workspace/Users/you@example.com/<name> --language PYTHON --overwrite`, or drag-and-drop the .py file in the Databricks UI. Notebooks use `%pip install` cells and run on Databricks Free Edition serverless compute — no cluster setup needed.
Each notebook generates synthetic data inline — no external files needed. Install the relevant library and run.