Eleven libraries. One workflow. Here is what to reach for at each stage.
Workflow overview
Data prep → Smoothing → Modelling → Validation → Deployment → Monitoring
Data prep
insurance-datasets — Synthetic UK motor data with a known data-generating process. Use it to validate your pipeline before touching production data, or to benchmark a new method against something with ground truth.
Smoothing
insurance-whittaker — Whittaker-Henderson penalised smoothing with automatic REML lambda selection and Bayesian credible intervals. Replaces the Excel five-point moving average for experience rating curves.
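If you have not met the method before, the core of Whittaker-Henderson smoothing is one penalised least-squares solve. The sketch below uses plain NumPy with a hand-picked lambda. It is not the insurance-whittaker API, and it leaves out the REML selection and credible intervals described above. The function name, its arguments and the sample figures are illustrative assumptions.

```python
import numpy as np


def whittaker_smooth(y, weights, lam, order=2):
    """Whittaker-Henderson smoother with a fixed penalty weight.

    Minimises sum(w * (y - z)**2) + lam * sum((D z)**2), where D takes
    order-th differences of z. The minimiser solves (W + lam * D'D) z = W y.
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(weights, dtype=float)
    n = y.size
    # (n - order) x n matrix of order-th differences
    D = np.diff(np.eye(n), n=order, axis=0)
    A = np.diag(w) + lam * (D.T @ D)
    return np.linalg.solve(A, w * y)


# Illustrative figures only: observed claim frequency by age band,
# weighted by exposure in each band.
observed = np.array([0.121, 0.104, 0.112, 0.089, 0.093, 0.078, 0.081, 0.070])
exposure = np.array([120.0, 340.0, 560.0, 800.0, 950.0, 900.0, 610.0, 300.0])
smoothed = whittaker_smooth(observed, exposure, lam=500.0)
```

Two properties make this a reasonable replacement for a moving average: thin cells are pulled towards their neighbours in proportion to their exposure, and the exposure-weighted total of the smoothed curve equals that of the raw data.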
Modelling
insurance-gam — Interpretable tariff models: Explainable Boosting Machines, Actuarial NAM, and PIN. Shape functions a pricing actuary can read; exact Shapley values for relativities.
insurance-causal — Double Machine Learning for causal inference. Estimates price elasticity, renewal CATEs, and telematics treatment effects from observational data, with valid frequentist confidence intervals.
insurance-credibility — Bühlmann-Straub group credibility and Bayesian individual experience rating. Finds the optimal blend between a scheme's own history and the portfolio average.
insurance-frequency-severity — Sarmanov bivariate model for correlated frequency and severity. Fits on top of your existing statsmodels GLMs; returns per-policy correction factors.
insurance-telematics — HMM-based trip scoring into latent driving regimes, credibility-weighted to driver level, producing GLM-ready features you can explain to the FCA.
Validation
insurance-conformal — Distribution-free prediction intervals for Tweedie and Poisson models. Finite-sample coverage guarantee, 13–14% narrower than parametric baselines on heterogeneous motor books.
insurance-fairness — Proxy discrimination audit for FCA Consumer Duty. Detects which rating factors act as protected-characteristic proxies; generates a sign-off-ready Markdown report with regulatory cross-references.
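To make the finite-sample guarantee behind insurance-conformal concrete, here is a minimal split-conformal sketch in plain Python. It assumes a held-out calibration set and a Pearson-type non-conformity score, which is one common choice for count and Tweedie-style models and not necessarily the score the library uses. The function name and parameters are hypothetical.

```python
import math


def split_conformal_interval(cal_actual, cal_pred, new_pred, alpha=0.1, power=1.0):
    """Split-conformal interval using the score |y - mu| / mu**(power / 2).

    power is the variance power of the assumed mean-variance relation
    (1.0 for Poisson, between 1 and 2 for compound Poisson-gamma Tweedie).
    All predictions must be strictly positive.
    """
    scores = sorted(
        abs(y - mu) / mu ** (power / 2) for y, mu in zip(cal_actual, cal_pred)
    )
    n = len(scores)
    # Finite-sample rank: the ceil((n + 1) * (1 - alpha))-th smallest score.
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        raise ValueError("calibration set too small for this alpha")
    q = scores[rank - 1]
    half_width = q * new_pred ** (power / 2)
    # Claims cannot be negative, so clip the lower bound at zero.
    return max(new_pred - half_width, 0.0), new_pred + half_width
```

Provided the calibration policies and the new policy are exchangeable, an interval built this way covers the outcome with probability at least 1 - alpha, whatever the underlying model. Scaling the score by the predicted mean lets the interval widen for high-risk policies instead of using one width for the whole book.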
Deployment
insurance-optimise — Constrained portfolio rate optimisation. Enforces FCA PS21/5 ENBP ceilings, loss ratio targets, retention floors, and rate-change caps simultaneously; produces an auditable JSON pricing record.
Monitoring
insurance-monitoring — Model drift detection across three axes: covariate shift (PSI/CSI per feature), calibration drift (A/E with Murphy decomposition), and discrimination decay (Gini drift z-test). Returns a structured RECALIBRATE / REFIT / NO_ACTION recommendation.
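As a reference point for the covariate-shift axis, PSI for a single feature can be computed in a few lines of standard-library Python. This is a sketch of the textbook calculation, not the insurance-monitoring implementation. The function name, the quantile binning and the floor applied to empty bins are assumptions made for illustration.

```python
import math
from bisect import bisect_right


def population_stability_index(reference, current, n_bins=10, floor=1e-6):
    """PSI between a reference sample and a current sample of one feature.

    Bin edges are quantiles of the reference sample, so each bin holds
    roughly 1 / n_bins of the reference data.
    """
    ref_sorted = sorted(reference)
    # Interior bin edges taken at evenly spaced ranks of the reference sample.
    edges = [ref_sorted[(len(ref_sorted) * i) // n_bins] for i in range(1, n_bins)]

    def bin_shares(values):
        counts = [0] * n_bins
        for v in values:
            counts[bisect_right(edges, v)] += 1
        # Floor the shares so that empty bins do not produce log(0).
        return [max(c / len(values), floor) for c in counts]

    ref_share = bin_shares(reference)
    cur_share = bin_shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_share, cur_share))
```

Typical use is to pass the training-period values of a rating factor as `reference` and the latest month's values as `current`. Identical distributions give exactly zero, and the value grows as mass moves between bins.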
Common workflows
Motor pricing refresh
insurance-datasets → insurance-whittaker → insurance-gam → insurance-conformal → insurance-fairness → insurance-optimise → insurance-monitoring
Start with synthetic data to validate the pipeline, smooth experience curves, fit an interpretable tariff, attach distribution-free prediction intervals, audit for proxy discrimination, optimise rates under FCA constraints, then monitor post-deployment.
Renewal pricing with causal elasticity
insurance-causal → insurance-optimise → insurance-fairness → insurance-monitoring
Estimate causal price sensitivity from your existing renewal book, correcting for the confounding that a raw GLM coefficient absorbs. Feed the elasticities, segmented by conditional average treatment effect (CATE), into the constrained optimiser, audit the resulting price differentials for Consumer Duty compliance, and monitor the deployed rates.
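The confounding correction in the first step can be illustrated with a cross-fitted partialling-out estimator, the simplest form of Double Machine Learning. The sketch below uses NumPy only and fits the nuisance models by ordinary least squares as a placeholder; a real analysis would normally swap in a flexible learner. It is not the insurance-causal API, and the function name, the synthetic data and the true effect of -2.0 are assumptions chosen for the demonstration.

```python
import numpy as np


def dml_partially_linear(y, t, X, n_folds=5, seed=0):
    """Cross-fitted estimate of theta in y = theta * t + g(X) + noise.

    E[y | X] and E[t | X] are fitted by least squares on the other folds,
    then theta is the slope of the y residuals on the t residuals.
    """
    y = np.asarray(y, dtype=float)
    t = np.asarray(t, dtype=float)
    X1 = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    folds = np.random.default_rng(seed).permutation(len(y)) % n_folds

    y_res = np.empty_like(y)
    t_res = np.empty_like(t)
    for k in range(n_folds):
        train, test = folds != k, folds == k
        # Fit both nuisance models out of fold, then residualise this fold.
        beta_y, *_ = np.linalg.lstsq(X1[train], y[train], rcond=None)
        beta_t, *_ = np.linalg.lstsq(X1[train], t[train], rcond=None)
        y_res[test] = y[test] - X1[test] @ beta_y
        t_res[test] = t[test] - X1[test] @ beta_t

    theta = np.sum(t_res * y_res) / np.sum(t_res ** 2)
    # Standard error from the estimated influence function.
    psi = t_res * (y_res - theta * t_res)
    se = np.sqrt(np.mean(psi ** 2) / len(y)) / np.mean(t_res ** 2)
    return theta, se


# Synthetic check: the price change t is confounded with a risk score x.
rng = np.random.default_rng(42)
n = 4000
x = rng.normal(size=n)
t = 0.8 * x + rng.normal(size=n)
y = -2.0 * t + 1.5 * x + rng.normal(size=n)  # true effect of t is -2.0

theta_hat, se = dml_partially_linear(y, t, x)
naive_slope = np.polyfit(t, y, 1)[0]  # regression of y on t that ignores x
```

On data like this the naive slope is pulled away from the true effect because price changes and risk move together, while the residual-on-residual estimate recovers it along with a standard error that supports an ordinary confidence interval.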
Telematics product build
insurance-telematics → insurance-causal → insurance-gam → insurance-conformal
Score raw trip data through an HMM into auditable driver-level features, use DML to estimate the causal effect of driving style on claims (separating it from correlated demographics), fit an interpretable tariff with shape functions, and quantify per-policy uncertainty before deployment.
Scheme and fleet pricing
insurance-credibility → insurance-frequency-severity → insurance-optimise
Blend thin scheme experience with portfolio rates using Bühlmann-Straub credibility, apply a Sarmanov correction for the frequency-severity dependence driven by no-claims discount (NCD) structure, then optimise the scheme terms under commercial and regulatory constraints.
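For the first step, the Bühlmann-Straub blend reduces to a credibility factor Z = w / (w + k) per scheme, where w is the scheme's exposure and k is the ratio of within-scheme noise to between-scheme spread. The sketch below implements the standard moment estimators in plain Python. It is not the insurance-credibility API; the function name and the input layout are assumptions.

```python
def buhlmann_straub(history):
    """Bühlmann-Straub credibility premiums.

    history maps each scheme to a list of (exposure, loss_rate) pairs,
    one pair per period. At least two schemes are needed, and at least
    one scheme must have two or more periods.
    """
    totals, means = {}, {}
    for name, periods in history.items():
        totals[name] = sum(w for w, _ in periods)
        means[name] = sum(w * x for w, x in periods) / totals[name]

    w_all = sum(totals.values())
    grand_mean = sum(totals[g] * means[g] for g in history) / w_all

    # Expected process variance: within-scheme noise per unit of exposure.
    within = sum(w * (x - means[g]) ** 2 for g, p in history.items() for w, x in p)
    epv = within / sum(len(p) - 1 for p in history.values())

    # Variance of hypothetical means: between-scheme spread, floored at zero.
    between = sum(totals[g] * (means[g] - grand_mean) ** 2 for g in history)
    scale = w_all - sum(w ** 2 for w in totals.values()) / w_all
    vhm = max((between - (len(history) - 1) * epv) / scale, 0.0)

    if vhm == 0.0:
        # No detectable between-scheme variation: scheme data gets no weight.
        return {g: {"z": 0.0, "premium": grand_mean} for g in history}

    k = epv / vhm
    z = {g: totals[g] / (totals[g] + k) for g in history}
    collective = sum(z[g] * means[g] for g in history) / sum(z.values())
    return {
        g: {"z": z[g], "premium": z[g] * means[g] + (1 - z[g]) * collective}
        for g in history
    }
```

A scheme with a long, high-exposure history gets Z close to 1 and is priced largely on its own experience. A thin scheme gets Z close to 0 and falls back to the credibility-weighted portfolio mean.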