Top 10 Libraries — Where to Start

Forty-plus libraries, ten that matter most. These are the ones we would reach for first — the tools that address genuinely hard problems in UK personal lines pricing where nothing adequate existed before. Start here, then use the problem guide to find everything else.

# 1
insurance-fairness FCA · Consumer Duty

Proxy discrimination auditing — FCA Consumer Duty, Equality Act 2010, EP25/2

Your pricing actuary says postcode is a legitimate risk variable. Your compliance function needs to demonstrate it does not operate as an ethnic proxy under Consumer Duty and EP25/2. Spearman correlation misses non-linear categorical relationships entirely. This library runs CatBoost proxy R² and mutual information scoring to catch what rank correlation cannot see — and produces the Consumer Duty evidence pack for sign-off.

Benchmark: Postcode proxy R² = 0.777 (RED); manual Spearman r = 0.064 — missed entirely. Detection time: 0.5s.
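Why a rank correlation can miss a categorical proxy is easy to see on toy data. The sketch below is an illustration only, not the library's API: it swaps the CatBoost proxy R² for a simpler correlation ratio (the share of variance explained by per-category means), and the area codes and protected-group shares are made up. Because the numbering of the codes carries no order, Spearman stays near zero while the category-level R² is large.

```python
import random
from statistics import mean

def ranks(values):
    """Average ranks (1-based); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out

def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    return pearson(ranks(x), ranks(y))

def category_r2(codes, target):
    """Share of target variance explained by per-category means."""
    groups = {}
    for c, t in zip(codes, target):
        groups.setdefault(c, []).append(t)
    grand = mean(target)
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    total = sum((t - grand) ** 2 for t in target)
    return between / total

rng = random.Random(0)
# Hypothetical data: 40 area codes; even codes have a 90% protected share, odd codes 10%
codes = [i % 40 for i in range(4000)]
protected = [1.0 if rng.random() < (0.9 if c % 2 == 0 else 0.1) else 0.0 for c in codes]

rho = spearman(codes, protected)   # close to zero: code order is meaningless
r2 = category_r2(codes, protected) # large: the category itself predicts the attribute
print(f"Spearman rho = {rho:.3f}, category R2 = {r2:.3f}")
```

A gradient-boosted classifier plays the same role as `category_r2` here, with the added ability to handle high-cardinality postcodes and interactions.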

pip install insurance-fairness
GitHub · Benchmark
# 2
insurance-monitoring Monitoring · Drift

Model drift detection — exposure-weighted PSI/CSI, A/E ratios, Gini drift, sequential testing

A/E ratios are creeping but aggregate figures look fine — because errors in young drivers and vehicle age cancel at portfolio level. This library disaggregates: exposure-weighted PSI/CSI per segment, segmented A/E with IBNR adjustment, and a Gini z-test with a formal recalibrate-vs-refit decision rule. v0.7.0 adds PITMonitor (e-process martingale for calibration drift — 3% FPR vs 46% for repeated Hosmer-Lemeshow) and InterpretableDriftDetector (BH-corrected feature-level attribution so you know which rating factors are driving the drift, not just that drift exists).

Benchmark: Manual aggregate A/E verdict: INVESTIGATE. MonitoringReport verdict: REFIT — because calibration drift concentrated in vehicle_age < 3 cancels at portfolio level. PITMonitor FPR ~3% vs repeated H-L 46%.
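The cancellation effect and the exposure-weighted PSI can be sketched in a few lines. This is a minimal illustration with made-up segment figures, not the library's `MonitoringReport` implementation, and it leaves out the IBNR adjustment. `weighted_psi` is a hypothetical helper name.

```python
import math

def weighted_psi(expected_w, actual_w, eps=1e-6):
    """PSI between two binned distributions, given exposure per bin."""
    e_tot, a_tot = sum(expected_w), sum(actual_w)
    psi = 0.0
    for e, a in zip(expected_w, actual_w):
        pe = max(e / e_tot, eps)   # floor avoids log(0) on empty bins
        pa = max(a / a_tot, eps)
        psi += (pa - pe) * math.log(pa / pe)
    return psi

# Hypothetical segments: (name, actual claims, expected claims)
segments = [
    ("vehicle_age < 3", 130.0, 100.0),
    ("vehicle_age 3-9", 300.0, 300.0),
    ("vehicle_age 10+", 170.0, 200.0),
]
portfolio_ae = sum(a for _, a, _ in segments) / sum(e for _, _, e in segments)
segment_ae = {name: a / e for name, a, e in segments}

# Exposure (car-years) per vehicle-age bin: training period vs current period
psi = weighted_psi([4000.0, 9000.0, 7000.0], [5500.0, 8500.0, 6000.0])

print(f"portfolio A/E = {portfolio_ae:.2f}")
for name, ae in segment_ae.items():
    print(f"{name}: A/E = {ae:.2f}")
print(f"exposure-weighted PSI = {psi:.4f}")
```

The portfolio A/E comes out at exactly 1.00 while the youngest vehicles run at 1.30 and the oldest at 0.85, which is the pattern an aggregate check cannot see.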

pip install insurance-monitoring
GitHub · Benchmark
# 3
insurance-conformal Uncertainty · Solvency II

Conformal prediction intervals for Tweedie/Poisson GBMs — distribution-free, finite-sample coverage

Standard conformal prediction meets aggregate coverage but systematically undercovers the highest-risk decile — the segment that drives SCR and reinsurance cost. Locally-weighted non-conformity scores adapt interval width to local variance, producing calibrated bounds in every risk segment. No distributional assumptions. Solvency II SCR bounds included.

Benchmark: Standard conformal: 87.9% worst-decile coverage (misses the 90% target). Locally-weighted conformal: 90%+ coverage in every decile, with intervals 11.7% narrower than the parametric intervals.
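The difference between the two score types can be reproduced on synthetic data. The sketch below is a generic split-conformal illustration under strong assumptions: the predictions are the true means, the local scale is known exactly (in practice it would be estimated by a second model), and the noise is Gaussian rather than Tweedie. None of the names come from the library.

```python
import math
import random

def conformal_quantile(scores, alpha):
    """Finite-sample split-conformal quantile: the ceil((n+1)(1-alpha))-th smallest score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(scores)[k - 1]

rng = random.Random(1)

def simulate(n):
    # Hypothetical risks: prediction mu, noise scale proportional to mu
    rows = []
    for _ in range(n):
        mu = rng.uniform(100.0, 1000.0)
        sigma = 0.3 * mu
        rows.append((mu, sigma, mu + rng.gauss(0.0, sigma)))
    return rows

calib, test = simulate(2000), simulate(2000)
alpha = 0.10

# Standard score: absolute residual. Locally weighted score: residual / local scale.
q_abs = conformal_quantile([abs(y - mu) for mu, _, y in calib], alpha)
q_loc = conformal_quantile([abs(y - mu) / s for mu, s, y in calib], alpha)

def coverage(rows, half_width):
    return sum(abs(y - mu) <= half_width(mu, s) for mu, s, y in rows) / len(rows)

top = sorted(test, key=lambda r: r[0])[-200:]          # highest-risk decile
cov_abs_top = coverage(top, lambda mu, s: q_abs)       # constant-width interval
cov_loc_top = coverage(top, lambda mu, s: q_loc * s)   # width scales with local variance
cov_loc_all = coverage(test, lambda mu, s: q_loc * s)

print(f"top decile, standard score:         {cov_abs_top:.3f}")
print(f"top decile, locally weighted score: {cov_loc_top:.3f}")
print(f"all risks, locally weighted score:  {cov_loc_all:.3f}")
```

The constant-width interval is calibrated on average but too narrow where the variance is highest, which is the undercoverage pattern described above.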

pip install insurance-conformal
GitHub · Benchmark
# 4
shap-relativities GBM · Interpretability

SHAP-based rating relativities from GBM models — extract GLM-style multiplicative factor tables

Your GBM beats the production GLM on every holdout metric. The rating engine and the actuarial committee both need multiplicative factor tables. There is no exp(β) in CatBoost. This library extracts TreeSHAP values, exposure-weights them per rating band, and produces the factor table format that pricing committees and rating engines expect — with reconstruction R² to validate fidelity.

Benchmark: +2.85pp Gini lift over direct GLM. NCD=5 relativity error 4.47% (GLM: 9.44%). Conviction factor recovered at 1.57× within confidence interval.
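The aggregation step can be sketched without a fitted model. This assumes a log-link GBM, so that SHAP values are additive on the log scale and exponentiating a difference gives a multiplicative relativity. The SHAP values below are invented, and `shap_relativities` is a hypothetical helper, not the library's function.

```python
import math
from collections import defaultdict

def shap_relativities(bands, shap_values, exposure, base_band):
    """Exposure-weighted mean SHAP per band, exponentiated relative to the base band."""
    num, den = defaultdict(float), defaultdict(float)
    for b, s, w in zip(bands, shap_values, exposure):
        num[b] += s * w
        den[b] += w
    mean_shap = {b: num[b] / den[b] for b in num}
    return {b: math.exp(m - mean_shap[base_band]) for b, m in mean_shap.items()}

# Hypothetical per-policy SHAP values (log scale) for an NCD feature
bands = [0, 0, 5, 5, 5]
shap_values = [0.30, 0.34, -0.25, -0.21, -0.23]
exposure = [1.0, 0.5, 1.0, 1.0, 0.5]

table = shap_relativities(bands, shap_values, exposure, base_band=0)
for band, rel in sorted(table.items()):
    print(f"NCD={band}: relativity {rel:.3f}")
```

The base band is 1.000 by construction; every other band reads as a multiplier against it, which is the shape a rating engine expects.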

pip install shap-relativities
GitHub · Benchmark
# 5
insurance-causal Causal Inference · DML

Double machine learning for deconfounding rating factors and causal price elasticity

Your vehicle value factor looks significant in the GLM, but vehicle value correlates with distribution channel — direct customers buy cheaper cars. You cannot tell whether it is genuine risk signal or channel confounding, and ordinary regression cannot separate the two. Double machine learning residualises both outcome and treatment on confounders using CatBoost nuisance models, then estimates the causal effect in the residuals. Produces a confounding bias report alongside the deconfounded coefficients. insurance_causal.causal_forest extends to segment-level heterogeneous treatment effects (GATES/CLAN/RATE) for portfolio-level price response analysis.

Honest benchmark: DML wins at n ≥ 50,000 with large treatment effects and compounding GLM misspecification. At n = 5,000 it over-partials — see the README for conditions.
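The partialling-out idea can be shown on synthetic data with a known answer. This sketch is a simplification: it uses one-variable linear regressions as nuisance models in place of CatBoost, a single confounder, and two-fold cross-fitting. All variable names and the data-generating process are assumptions for illustration.

```python
import random

def ols_fit(x, y):
    """Slope and intercept of a one-variable least-squares fit."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx

def residualise(c_train, t_train, c_test, t_test):
    """Fit a nuisance model on one fold, return out-of-fold residuals on the other."""
    slope, intercept = ols_fit(c_train, t_train)
    return [t - (intercept + slope * c) for c, t in zip(c_test, t_test)]

rng = random.Random(42)
n, theta = 4000, 0.5                                   # theta is the true causal effect
c = [rng.gauss(0, 1) for _ in range(n)]                # confounder (e.g. channel mix)
d = [0.8 * ci + rng.gauss(0, 1) for ci in c]           # treatment (e.g. vehicle value)
y = [theta * di + 1.5 * ci + rng.gauss(0, 1) for ci, di in zip(c, d)]

naive, _ = ols_fit(d, y)                               # ignores the confounder

half = n // 2
folds = [(slice(0, half), slice(half, n)), (slice(half, n), slice(0, half))]
res_d, res_y = [], []
for train, test in folds:
    res_d += residualise(c[train], d[train], c[test], d[test])
    res_y += residualise(c[train], y[train], c[test], y[test])

# Final stage: regress outcome residuals on treatment residuals (no intercept)
dml = sum(a * b for a, b in zip(res_d, res_y)) / sum(a * a for a in res_d)

print(f"true effect {theta}, naive {naive:.3f}, partialled-out {dml:.3f}")
```

The naive slope absorbs the confounder's contribution; the residual-on-residual estimate recovers a value close to the true 0.5 in this linear setting.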

pip install insurance-causal
GitHub · Benchmark
# 6
insurance-optimise Optimisation · FCA ENBP

Constrained portfolio rate optimisation — SLSQP, FCA ENBP compliance, Pareto front

You have a technical price per segment, a loss ratio target, and movement caps. The rate change recommendation is still done in a spreadsheet where the constraints interact and the solution is not optimal. SLSQP with analytical Jacobians finds the optimal rate changes while respecting FCA ENBP constraints and retention floors simultaneously. v0.4.1 adds ParetoFrontier: single-objective optimisation is blind to fairness costs (premium disparity ratio 1.168 in the benchmark); the Pareto surface makes the profit/retention/fairness trade-off explicit and defensible.

Benchmark: 3–8% profit uplift over flat rate change on a 2,000-renewal portfolio. ParetoFrontier: 4 non-dominated solutions on the 3-objective surface; TOPSIS selection picks the balanced operating point.
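The selection step on a Pareto surface can be sketched independently of the optimiser. The code below is a generic non-dominated filter followed by textbook TOPSIS, not the library's `ParetoFrontier` class. The candidate solutions are invented, and fairness is expressed as a hypothetical score where higher is better so that all three objectives point the same way.

```python
import math

def non_dominated(points):
    """Keep points not dominated by any other (all objectives maximised)."""
    keep = []
    for i, p in enumerate(points):
        dominated = any(
            all(q[k] >= p[k] for k in range(len(p))) and any(q[k] > p[k] for k in range(len(p)))
            for j, q in enumerate(points) if j != i
        )
        if not dominated:
            keep.append(p)
    return keep

def topsis(points, weights):
    """Index of the point with the highest relative closeness to the ideal point."""
    m = len(weights)
    norms = [math.sqrt(sum(p[k] ** 2 for p in points)) for k in range(m)]
    scaled = [[weights[k] * p[k] / norms[k] for k in range(m)] for p in points]
    ideal = [max(s[k] for s in scaled) for k in range(m)]
    worst = [min(s[k] for s in scaled) for k in range(m)]

    def closeness(s):
        d_best, d_worst = math.dist(s, ideal), math.dist(s, worst)
        return d_worst / (d_best + d_worst)

    return max(range(len(points)), key=lambda i: closeness(scaled[i]))

# Hypothetical candidates: (profit, retention, fairness score)
candidates = [
    (1.00, 0.80, 0.70),
    (0.90, 0.86, 0.80),
    (0.80, 0.88, 0.95),
    (0.85, 0.85, 0.78),   # dominated by the second candidate
    (0.70, 0.90, 0.97),
]
front = non_dominated(candidates)
best = front[topsis(front, weights=[1 / 3, 1 / 3, 1 / 3])]
print(f"{len(front)} non-dominated solutions; TOPSIS pick: {best}")
```

Changing the weights is how a committee would express its preference between profit, retention and fairness; the front itself does not change.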

pip install insurance-optimise
GitHub · Benchmark
# 7
insurance-governance PRA SS1/23 · MRM

Model governance — PRA SS1/23 validation reports, risk tier scoring

Your model governance committee needs a validation report. You are producing it manually in PowerPoint. The automated suite runs bootstrap Gini CI, Poisson A/E CI, double-lift charts, and a renewal cohort test — structured to what a model risk function and PRA review expect. HTML and JSON output. The benchmark case shows manual checklists miss miscalibration concentrated in young drivers (age < 30) that the automated suite catches via Hosmer-Lemeshow (p < 0.0001).

Benchmark: Manual checklist: flags global A/E only. Automated suite: catches age-band miscalibration, PSI shift, and Poisson CI on A/E. Overhead: 1.2s for the automated suite vs 0.09s for the manual checklist, which is acceptable for a sign-off workflow.
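A Poisson confidence interval on A/E by age band is one of the simpler checks to sketch. The version below uses a normal approximation to the Poisson count (an exact interval would use chi-squared quantiles) and invented claim counts. `ae_confidence_interval` is a hypothetical helper, not the library's function.

```python
import math

def ae_confidence_interval(actual_claims, expected_claims, z=1.96):
    """Normal-approximation CI for A/E, treating the actual count as Poisson."""
    ratio = actual_claims / expected_claims
    half_width = z * math.sqrt(actual_claims) / expected_claims
    return ratio, ratio - half_width, ratio + half_width

# Hypothetical age bands: (actual claims, expected claims)
bands = {"age < 30": (260, 200.0), "age 30-59": (590, 620.0), "age 60+": (150, 180.0)}
total_a = sum(a for a, _ in bands.values())
total_e = sum(e for _, e in bands.values())

overall = ae_confidence_interval(total_a, total_e)

# A band is flagged when its interval excludes 1.0
flagged = []
for name, (a, e) in bands.items():
    _, lo, hi = ae_confidence_interval(a, e)
    if not lo <= 1.0 <= hi:
        flagged.append(name)

print(f"overall A/E = {overall[0]:.2f} [{overall[1]:.3f}, {overall[2]:.3f}]")
print("flagged bands:", flagged)
```

The global A/E is 1.00 with an interval that comfortably contains 1.0, yet two of the three bands are flagged, which is the gap between a global checklist and a banded check.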

pip install insurance-governance
GitHub · Benchmark
# 8
insurance-severity Severity · EVT

Spliced distributions, EVT, Deep Regression Networks, composite Lognormal-GPD

A single Gamma GLM fits attritional claims adequately but fails structurally at the tail: large losses follow a heavy-tailed, Pareto-type distribution driven by a different generating process, which a light-tailed Gamma cannot reproduce. This library provides spliced body-tail models with covariate-dependent thresholds, composite Lognormal-GPD for heavy tails, Deep Regression Networks for non-parametric severity, and EQRN extreme quantile neural networks. ILF tables and TVaR per risk are included. The EVT module corrects for policy limit truncation, which naive GPD ignores.

Benchmark: Composite model: 5.6% tail error reduction vs single lognormal. Heavy-tail benchmark (Pareto α = 1.5): 15–20pp Q99 error reduction vs Gamma GLM. TruncatedGPD vs naive GPD: shape bias 0.006 vs 0.035.
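How heavy a tail is can be estimated directly from the largest losses. The sketch below uses the Hill estimator, a simpler stand-in for the GPD fitting the library describes, on simulated Pareto losses with α = 1.5. The threshold choice `k` and the loss scale are arbitrary assumptions, and no policy-limit truncation is modelled.

```python
import math
import random

def hill_tail_index(losses, k):
    """Hill estimator of the Pareto tail index alpha from the k largest losses."""
    ordered = sorted(losses, reverse=True)
    threshold = ordered[k]                       # (k+1)-th largest loss
    mean_log_excess = sum(math.log(x / threshold) for x in ordered[:k]) / k
    return 1.0 / mean_log_excess

rng = random.Random(7)
alpha_true = 1.5
# Pareto losses above 10,000 by inverse-transform sampling; 1 - U lies in (0, 1]
losses = [10_000 * (1.0 - rng.random()) ** (-1.0 / alpha_true) for _ in range(20_000)]

alpha_hat = hill_tail_index(losses, k=2_000)
xi_hat = 1.0 / alpha_hat                         # corresponding GPD shape parameter
print(f"alpha_hat = {alpha_hat:.3f}, implied GPD shape xi = {xi_hat:.3f}")
```

A tail index below 2 means the loss variance is infinite, so any model that assumes finite variance, a Gamma GLM included, will understate extreme quantiles.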

pip install insurance-severity
GitHub · Benchmark
# 9
insurance-causal-policy Causal Inference · SDID

SDID + doubly robust SC for causal rate change evaluation — HonestDiD sensitivity, FCA evidence pack

You put through a rate increase in Q3 and conversion dropped. You cannot tell how much of that drop was the rate change versus market conditions, because you have no control group and the before/after comparison is confounded by market inflation. Synthetic difference-in-differences constructs a synthetic control from unaffected segments to isolate the rate change effect. Produces event study charts, HonestDiD sensitivity analysis for violations of parallel trends, and an FCA evidence pack.

Benchmark: SDID: near-zero bias, ~93–95% CI coverage. Naive before-after: ~2pp upward bias (4 periods × 0.5pp market inflation absorbed into the estimate).
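The arithmetic behind the before-after bias is worth seeing once. This sketch uses plain difference-in-differences with a single control series, which is much simpler than SDID (SDID additionally reweights control units and time periods). The outcome levels, drift and effect size are invented.

```python
def before_after(series, n_pre):
    """Mean of post-period values minus mean of pre-period values."""
    pre, post = series[:n_pre], series[n_pre:]
    return sum(post) / len(post) - sum(pre) / len(pre)

drift, effect, n_pre = 0.5, -3.0, 4     # market drift in pp per period; true effect in pp

# Hypothetical outcome series over 8 periods (4 pre, 4 post)
treated = [60.0 + drift * t + (effect if t >= n_pre else 0.0) for t in range(8)]
control = [55.0 + drift * t for t in range(8)]   # unaffected segments, same market drift

naive = before_after(treated, n_pre)             # effect plus 4 periods of drift
did = naive - before_after(control, n_pre)       # drift is differenced out

print(f"true effect {effect:+.1f}pp, naive {naive:+.1f}pp, DiD {did:+.1f}pp")
```

The naive estimate is off by exactly 4 × 0.5pp = 2pp because the gap between the pre-period and post-period midpoints is four periods of market drift.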

pip install insurance-causal-policy
GitHub · Benchmark
# 10
insurance-quantile Tail Risk · ILF

Tail risk quantile and expectile regression — TVaR, increased limit factors, exceedance curves

Your mean model gives you no handle on the upper tail. Large loss loading and ILF curves require quantile estimates at the 90th and 99th percentile per risk, not just the expected value. This library fits quantile and expectile GBMs at multiple levels simultaneously, producing per-risk TVaR and ILF tables in standard actuarial format. The GBM approach captures non-linear covariate effects on the tail that parametric EVT methods miss in moderate-sized portfolios.

Benchmark: GBM shows lower TVaR bias than the parametric lognormal on heavy tails. Lognormal wins on pinball loss at small n: use the parametric path below ~10,000 policies and the GBM above it.
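The two metrics named in the benchmark have short definitions. The sketch below computes pinball loss and an empirical VaR/TVaR on a toy loss sample; it is a generic illustration with hypothetical helper names, not the library's fitting code, and the TVaR here is a simple mean of losses at or above the empirical quantile.

```python
import math

def pinball_loss(y_true, y_pred, tau):
    """Average quantile (pinball) loss at level tau."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)

def var_tvar(losses, tau):
    """Empirical VaR at level tau, and TVaR as the mean of losses at or above it."""
    ordered = sorted(losses)
    idx = min(len(ordered) - 1, math.ceil(tau * len(ordered)) - 1)
    var = ordered[idx]
    tail = [x for x in ordered if x >= var]
    return var, sum(tail) / len(tail)

losses = [float(x) for x in range(1, 101)]       # hypothetical losses 1..100
var90, tvar90 = var_tvar(losses, 0.90)

# Pinball loss at tau = 0.9 is lower for the 90th percentile than for the median
loss_at_var = pinball_loss(losses, [var90] * len(losses), 0.90)
loss_at_median = pinball_loss(losses, [50.0] * len(losses), 0.90)

print(f"VaR90 = {var90}, TVaR90 = {tvar90}")
print(f"pinball at VaR90 = {loss_at_var:.2f}, at median = {loss_at_median:.2f}")
```

Pinball loss rewards the correct quantile rather than the correct mean, which is why it is the natural scoring rule for comparing a quantile GBM against a parametric fit.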

pip install insurance-quantile
GitHub

All 34 libraries across the full pricing stack

Browse all libraries · Problem → library guide · Getting started paths · All benchmarks