The Stack

Eleven libraries. One workflow. Here is what to reach for at each stage.

Workflow overview

Data prep → Smoothing → Modelling → Validation → Deployment → Monitoring

Data prep

insurance-datasets — Synthetic UK motor data with a known data-generating process. Use it to validate your pipeline before touching production data, or to benchmark a new method against something with ground truth.

Smoothing

insurance-whittaker — Whittaker-Henderson penalised smoothing with automatic REML lambda selection and Bayesian credible intervals. Replaces the Excel five-point moving average for experience rating curves.
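As a rough illustration of the underlying technique (not the insurance-whittaker API, which is not shown here), a Whittaker-Henderson smoother can be sketched in a few lines of NumPy. It minimises weighted squared error plus a penalty on squared differences of the fitted curve. The function name, the age-band data, and the hand-picked lambda are all assumptions for this sketch; REML selection of lambda and credible intervals are left out.

```python
import numpy as np

def whittaker_smooth(y, weights, lam, order=2):
    """Minimise sum_i w_i (y_i - z_i)^2 + lam * sum_i (order-th difference of z)_i^2."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(weights, dtype=float)
    n = y.size
    # D is the (n - order) x n difference matrix of the requested order.
    D = np.diff(np.eye(n), n=order, axis=0)
    # Normal equations of the penalised least-squares problem: (W + lam * D'D) z = W y
    return np.linalg.solve(np.diag(w) + lam * D.T @ D, w * y)

# Made-up example: noisy claim frequency by driver age, weighted by exposure.
rng = np.random.default_rng(0)
ages = np.arange(18, 80)
true_curve = 0.25 * np.exp(-(ages - 18) / 15) + 0.05
exposure = rng.integers(50, 500, size=ages.size)
observed = rng.poisson(true_curve * exposure) / exposure

smoothed = whittaker_smooth(observed, exposure, lam=100.0)  # lam fixed by hand here
```

With a second-order penalty, a straight line passes through unchanged, and the exposure-weighted total of the smoothed curve matches that of the raw data, which is one reason the method suits experience curves better than an unweighted moving average.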

Modelling

insurance-gam — Interpretable tariff models: Explainable Boosting Machines, Actuarial NAM, and PIN. Shape functions a pricing actuary can read; exact Shapley values for relativities.

insurance-causal — Double Machine Learning for causal inference. Estimates price elasticity, renewal CATEs, and telematics treatment effects from observational data, with valid frequentist confidence intervals.
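To make the idea concrete, here is a minimal sketch of the partialling-out form of Double Machine Learning with cross-fitting, on simulated data. It is not the insurance-causal API. The function name, variable names, and data-generating process are assumptions for illustration, and the nuisance models are plain least squares to keep the block self-contained; in practice flexible ML learners would take their place.

```python
import numpy as np

def dml_partial_out(y, d, X, n_folds=5, seed=0):
    """Cross-fitted partialling-out estimate of the effect of d on y, controlling for X."""
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=float)
    n = y.size
    design = np.column_stack([np.ones(n), np.asarray(X, dtype=float).reshape(n, -1)])
    folds = np.array_split(np.random.default_rng(seed).permutation(n), n_folds)
    y_res = np.empty(n)
    d_res = np.empty(n)
    for k, test in enumerate(folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != k])
        # Nuisance models are fitted on the other folds only (cross-fitting).
        coef_y, *_ = np.linalg.lstsq(design[train], y[train], rcond=None)
        coef_d, *_ = np.linalg.lstsq(design[train], d[train], rcond=None)
        y_res[test] = y[test] - design[test] @ coef_y
        d_res[test] = d[test] - design[test] @ coef_d
    # Regress outcome residuals on treatment residuals.
    theta = (d_res @ y_res) / (d_res @ d_res)
    score = d_res * (y_res - theta * d_res)
    se = np.sqrt(np.mean(score ** 2) / n) / np.mean(d_res ** 2)
    return theta, se

# Simulated book: risk drives both the price change and the outcome (confounding).
rng = np.random.default_rng(42)
n = 5000
risk = rng.normal(size=n)
price_change = 0.8 * risk + rng.normal(size=n)
outcome = -2.0 * price_change + 1.5 * risk + rng.normal(size=n)  # true effect is -2.0

theta, se = dml_partial_out(outcome, price_change, risk)
naive_slope = np.polyfit(price_change, outcome, 1)[0]  # ignores the confounder
ci_95 = (theta - 1.96 * se, theta + 1.96 * se)
```

In this simulation the naive slope is pulled towards zero because higher-risk customers both receive larger increases and score higher on the outcome, while the residual-on-residual estimate lands close to the true effect.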

insurance-credibility — Bühlmann-Straub group credibility and Bayesian individual experience rating. Finds the optimal blend between a scheme's own history and the portfolio average.
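The blend itself can be sketched with the classical nonparametric Bühlmann-Straub estimators. The block below is an illustration on a balanced panel (every scheme observed in every year), not the insurance-credibility API; the function name, the returned keys, and the loss-ratio and exposure figures are made up for the example.

```python
import numpy as np

def buhlmann_straub(x, w):
    """x[i, j]: observed loss ratio of scheme i in year j; w[i, j]: matching exposure."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    n_groups, n_periods = x.shape
    w_i = w.sum(axis=1)                  # total exposure per scheme
    w_tot = w_i.sum()
    x_i = (w * x).sum(axis=1) / w_i      # exposure-weighted mean per scheme
    x_bar = (w_i * x_i).sum() / w_tot    # exposure-weighted portfolio mean
    # Expected process variance (within-scheme noise).
    epv = (w * (x - x_i[:, None]) ** 2).sum() / (n_groups * (n_periods - 1))
    # Variance of hypothetical means (between-scheme signal), floored at zero.
    vhm = ((w_i * (x_i - x_bar) ** 2).sum() - (n_groups - 1) * epv) / (
        w_tot - (w_i ** 2).sum() / w_tot
    )
    vhm = max(vhm, 0.0)
    z = w_i / (w_i + epv / vhm) if vhm > 0 else np.zeros(n_groups)
    # Credibility-weighted collective mean; falls back to x_bar if all z are zero.
    mu = (z * x_i).sum() / z.sum() if z.sum() > 0 else x_bar
    return {"z": z, "scheme_mean": x_i, "collective": mu,
            "premium": z * x_i + (1 - z) * mu}

# Three hypothetical schemes over four years, from large and stable to small and noisy.
loss_ratio = [[0.60, 0.65, 0.62, 0.63],
              [0.80, 0.85, 0.78, 0.82],
              [0.40, 0.70, 0.55, 0.30]]
exposure = [[1000, 1100, 1200, 1300],
            [200, 220, 210, 230],
            [20, 25, 22, 18]]
result = buhlmann_straub(loss_ratio, exposure)
```

The credibility factor z rises with a scheme's exposure, so the largest scheme is rated almost entirely on its own history while the smallest is pulled furthest towards the collective mean.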

insurance-frequency-severity — Sarmanov bivariate model for correlated frequency and severity. Fits on top of your existing statsmodels GLMs; returns per-policy correction factors.

insurance-telematics — HMM-based trip scoring into latent driving regimes, credibility-weighted to driver level, producing GLM-ready features you can explain to the FCA.

Validation

insurance-conformal — Distribution-free prediction intervals for Tweedie and Poisson models. Finite-sample coverage guarantee, 13–14% narrower than parametric baselines on heterogeneous motor books.
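For readers new to the method, the following is a minimal sketch of split conformal prediction for a count model, not the insurance-conformal API. The Pearson-type nonconformity score (absolute residual divided by the square root of the prediction), the function names, and the simulated data are assumptions for the example; the library's own score may differ.

```python
import math
import numpy as np

def conformal_quantile(scores, alpha):
    """k-th smallest calibration score with k = ceil((n + 1) * (1 - alpha))."""
    scores = np.sort(np.asarray(scores, dtype=float))
    n = scores.size
    k = math.ceil((n + 1) * (1 - alpha))
    # With too few calibration points, only an infinite interval has the guarantee.
    return np.inf if k > n else scores[k - 1]

def poisson_split_conformal(pred_cal, y_cal, pred_new, alpha=0.1):
    """Intervals for new predictions, calibrated on a held-out set."""
    pred_cal = np.asarray(pred_cal, dtype=float)
    pred_new = np.asarray(pred_new, dtype=float)
    # Pearson-type score: residual scaled by the Poisson standard deviation.
    scores = np.abs(np.asarray(y_cal, dtype=float) - pred_cal) / np.sqrt(pred_cal)
    half_width = conformal_quantile(scores, alpha) * np.sqrt(pred_new)
    # Counts cannot be negative, so clipping the lower bound loses no coverage.
    return np.maximum(pred_new - half_width, 0.0), pred_new + half_width

# Simulated check: predictions stand in for the output of an already fitted model.
rng = np.random.default_rng(1)
pred_cal = rng.uniform(0.5, 3.0, size=2000)
y_cal = rng.poisson(pred_cal)
pred_new = rng.uniform(0.5, 3.0, size=2000)
y_new = rng.poisson(pred_new)

lower, upper = poisson_split_conformal(pred_cal, y_cal, pred_new, alpha=0.1)
coverage = np.mean((y_new >= lower) & (y_new <= upper))
```

The coverage guarantee depends only on the calibration and new data being exchangeable, not on the model being well specified. Scaling the score by the square root of the prediction lets interval width vary with expected claim count instead of being constant across the book.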

insurance-fairness — Proxy discrimination audit for FCA Consumer Duty. Detects which rating factors act as protected-characteristic proxies; generates a sign-off-ready Markdown report with regulatory cross-references.

Deployment

insurance-optimise — Constrained portfolio rate optimisation. Enforces FCA PS21/5 ENBP ceilings, loss ratio targets, retention floors, and rate-change caps simultaneously; produces an auditable JSON pricing record.

Monitoring

insurance-monitoring — Model drift detection across three axes: covariate shift (PSI/CSI per feature), calibration drift (A/E with Murphy decomposition), and discrimination decay (Gini drift z-test). Returns a structured RECALIBRATE / REFIT / NO_ACTION recommendation.
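The covariate-shift axis is the simplest to illustrate. Below is a minimal sketch of a Population Stability Index calculation for one numeric feature, using bins set at the quantiles of the reference sample. It is not the insurance-monitoring API; the function name, bin count, and the simulated feature are assumptions for the example.

```python
import numpy as np

def population_stability_index(reference, current, n_bins=10, eps=1e-6):
    """PSI between a reference sample and a current sample of one numeric feature."""
    reference = np.asarray(reference, dtype=float)
    current = np.asarray(current, dtype=float)
    # Interior bin edges at the reference quantiles; outer bins are open-ended.
    edges = np.quantile(reference, np.linspace(0, 1, n_bins + 1)[1:-1])
    ref_counts = np.bincount(np.searchsorted(edges, reference, side="right"),
                             minlength=n_bins)
    cur_counts = np.bincount(np.searchsorted(edges, current, side="right"),
                             minlength=n_bins)
    # Floor the proportions so an empty bin does not produce log(0).
    ref_p = np.clip(ref_counts / ref_counts.sum(), eps, None)
    cur_p = np.clip(cur_counts / cur_counts.sum(), eps, None)
    return float(np.sum((cur_p - ref_p) * np.log(cur_p / ref_p)))

# Simulated feature: training sample, a stable later sample, and a drifted one.
rng = np.random.default_rng(7)
train = rng.normal(0.0, 1.0, size=5000)
stable = rng.normal(0.0, 1.0, size=5000)
drifted = rng.normal(0.5, 1.0, size=5000)

psi_stable = population_stability_index(train, stable)
psi_drifted = population_stability_index(train, drifted)
```

A commonly quoted rule of thumb treats PSI below 0.1 as stable and above 0.25 as a material shift, though any threshold should be calibrated to the book and the sample sizes involved.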

Common workflows

Motor pricing refresh

insurance-datasets → insurance-whittaker → insurance-gam → insurance-conformal → insurance-fairness → insurance-optimise → insurance-monitoring

Start with synthetic data to validate the pipeline, smooth experience curves, fit an interpretable tariff, attach distribution-free prediction intervals, audit for proxy discrimination, optimise rates under FCA constraints, then monitor post-deployment.

Renewal pricing with causal elasticity

insurance-causal → insurance-optimise → insurance-fairness → insurance-monitoring

Estimate causal price sensitivity from your existing renewal book (correcting the confounding that a raw GLM carries), feed CATE-segmented elasticities into the constrained optimiser, audit the resulting differentials for Consumer Duty compliance, and track the deployed rates.

Telematics product build

insurance-telematics → insurance-causal → insurance-gam → insurance-conformal

Score raw trip data through an HMM into auditable driver-level features, use DML to estimate the causal effect of driving style on claims (separating it from correlated demographics), fit an interpretable tariff with shape functions, and quantify per-policy uncertainty before deployment.

Scheme and fleet pricing

insurance-credibility → insurance-frequency-severity → insurance-optimise

Blend thin scheme experience with portfolio rates using Bühlmann-Straub credibility, apply a Sarmanov correction for frequency-severity dependence driven by no-claims discount (NCD) structure, then optimise the scheme terms under commercial and regulatory constraints.