How-To

84 articles in this topic

24 Mar 2026

Walk-Forward Cross-Validation for Insurance GLMs in Python

How to implement walk-forward cross-validation for insurance GLMs in Python using insurance-cv. Covers IBNR buffers, fold design, and a full worked example on freMTPL2-style mot...
24 Mar 2026

Migrating from Emblem to Python: What Actually Changes

If you are considering moving your GLM workflow from Emblem to Python, the modelling is not the hard part - here is what is.
24 Mar 2026

GLM, GAM, and GBM for UK Motor Pricing in Python

The Python equivalent of the IFoA MLR Working Party's R tutorial: Poisson GLM baseline, EBM GAM, and CatBoost GBM on UK motor data, with the full pipeline from data to governance.
24 Mar 2026

Does insurance-gam actually work for insurance pricing?

Benchmark results on a known-DGP synthetic UK motor book. EBM beats the GLM by 35 Gini points. But the deviance number is misleading. We explain why, and when you should care.
24 Mar 2026

Does HMM telematics scoring actually work for insurance pricing?

Benchmark results on a known-DGP synthetic UK motor fleet. HMM state fractions deliver 5–10pp Gini lift over simple aggregates. State classification recovers >50% of true high-r...
24 Mar 2026

Does GBM-to-GLM Distillation Actually Work for Insurance Pricing?

Honest benchmark: does fitting a surrogate GLM on CatBoost pseudo-predictions recover more discriminatory power than a direct GLM? We test it on 30,000 synthetic UK motor policies.
24 Mar 2026

Claims Inflation Adjustment in Pricing Models: Beyond CPI

CPI-adjusting your historical claims data before fitting a pricing model introduces systematic bias - here is what to use instead.
23 Mar 2026

Python vs R for Actuarial Pricing: A Practical Comparison

Python vs R for UK personal lines pricing: data wrangling, GLMs, GBMs, deployment, and Databricks. Honest about where R still wins in 2026.
23 Mar 2026

One-Way Analysis in Python: From Scratch to Production

One-way analysis in Python for pricing actuaries: pandas from scratch, credibility-weighted confidence intervals, thin cell handling, GBM shortcuts.
23 Mar 2026

How to Reproduce an Emblem GLM in Python

Reproduce an Emblem frequency-severity GLM in Python: factor tables, one-way plots, deviance residuals, and lift charts using statsmodels, CatBoost, and Polars.
23 Mar 2026

GLM Assumptions in Insurance Pricing: What Actually Matters

Which GLM assumptions actually matter for insurance pricing, which ones you routinely violate without consequence, and the diagnostics worth running before signing off a product...
23 Mar 2026

Getting Started: Three Libraries, One Workflow

A practical walkthrough for pricing analysts: use insurance-causal for causal inference, insurance-conformal for prediction intervals, and insurance-monitoring for drift detecti...
23 Mar 2026

Exposure-Weighted Gini Coefficient in Python

Exposure-weighted Gini for insurance pricing: correct formula, Python implementation, and why ignoring exposure distorts motor model governance.
23 Mar 2026

Does Whittaker-Henderson smoothing actually work for insurance pricing?

Benchmark results on a known-DGP synthetic UK motor age curve. REML recovers the true frequency well in the data-rich middle. The tails are a different story. Numbers, not claims.
23 Mar 2026

Does automated model monitoring actually work for insurance pricing?

Aggregate A/E at 0.94 looks fine. The model has been mispricing under-25s for eight months. Benchmark results on a synthetic UK motor book with three planted failure modes.
23 Mar 2026

Does Bühlmann-Straub credibility actually work for insurance pricing?

Benchmark results on 100 synthetic schemes with known true loss rates. Credibility blending reduces MSE by 25–35% vs the best naive alternative. Numbers, not theory.
23 Mar 2026

CatBoost Factor Table to Radar in 45 Minutes

CatBoost motor frequency model to Radar factor table in 45 minutes: SHAP relativity extraction, GLM distillation, and Radar-compatible CSV export in Python.
23 Mar 2026

How to Build a Burning Cost Model for Insurance Pricing in Python

Build a burning cost model in Python: frequency-severity split, exposure offsets, large loss capping, IBNR adjustment, and combined pure premium for UK pricing.
22 Mar 2026

Python Insurance Pricing Cookbook: 20 Recipes for Common Tasks

20 short code recipes for common insurance pricing tasks. Each recipe uses a real API from one of our open-source libraries. Copy and adapt.
22 Mar 2026

Insurance Model Monitoring in Python: Gini Drift, A/E Ratios and Double-Lift Curves

What pricing actuaries actually monitor and how to do it in Python: Gini coefficient stability, A/E ratios by segment, exposure-weighted PSI, and double-lift curves using the in...
22 Mar 2026

Insurance Model Monitoring in Python: Gini, A/E, and Double-Lift

Tutorial on monitoring insurance pricing models using actuarial KPIs. Gini tracking, segmented A/E, double-lift for champion/challenger. Why generic drift tools miss what matters.
22 Mar 2026

GLMs for UK Insurance Pricing in Python: What the Generic Tutorials Miss

GLMs in Python for UK pricing actuaries: exposure handling, Consumer Duty, frequency-severity split, and Emblem/Radar deployment with glum.
22 Mar 2026

The Complete Python Insurance Pricing Toolkit: Every Library You Need in 2026

The Python insurance pricing ecosystem has grown fast enough that it is genuinely difficult to keep track of what exists. This is our attempt at a comprehensive, honest map — ge...
22 Mar 2026

CatBoost for Insurance Pricing: Frequency-Severity on freMTPL2

Build a CatBoost frequency-severity pricing model on freMTPL2 using Polars. Poisson frequency, Gamma severity, combined burning cost, SHAP factor extraction, and distillation to...
22 Mar 2026

CatBoost Frequency-Severity Modelling on freMTPL2

Step-by-step CatBoost frequency-severity model on freMTPL2. Poisson + Gamma, burning cost combination, SHAP factor tables, and GLM distillation for Radar.
22 Mar 2026

Building an Insurance Pricing Pipeline in Python: From Raw Claims to Production Tariff

End-to-end UK motor pricing pipeline in Python: synthetic data generation, CatBoost frequency/severity models, GLM factor extraction, FCA fairness audit, conformal prediction in...
21 Mar 2026

Why k-Fold CV Is Wrong for Insurance and What to Do Instead

Insurance walk-forward cross-validation prevents the look-ahead bias that makes standard k-fold results useless for prospective evaluation. Complete Python example with insuranc...
21 Mar 2026

Tweedie Regression for Insurance: What sklearn Doesn't Tell You About Exposure

sklearn's TweedieRegressor tutorial gets you to a fitted model in six lines. It also produces predictions that are wrong for any policy with non-annual exposure. Here is the cor...
21 Mar 2026

Insurance Model Monitoring Beyond Generic Data Drift

Evidently and NannyML are excellent tools. They do not understand exposure weighting, development lags, or the Gini drift test. insurance-monitoring does.
20 Mar 2026

Truncation-Corrected GPD Fitting for Capped Claims: Unbiased Tail Index Estimation Under Policy Limits

Standard GPD fitting is biased when claims are capped by policy limits. Most actuaries know this and do it anyway. insurance-severity v0.2.0 fixes it.
20 Mar 2026

FCA Consumer Duty Pricing Fairness in Python

The FCA expects pricing teams to demonstrate their models don't proxy-discriminate under Consumer Duty. Most teams do this in Excel. Here is how to do it properly in Python, usi...
18 Mar 2026

Whittaker-Henderson Smoothing for Rating Tables: The Penalised Least-Squares Method UK Actuaries Should Already Be Using

Every UK pricing actuary smooths experience tables. Most do it with a 5-point moving average or a polynomial fitted by eye.
17 Mar 2026

Bühlmann-Straub Treats Last Year the Same as Five Years Ago

Static credibility weights all years equally. The dynamic Poisson-gamma state-space model weights recent experience more - and quantifies how much more.
15 Mar 2026

Your Rating Table Smoothing Is Wrong

Every UK pricing actuary smooths experience tables. Most do it with a 5-point moving average or a polynomial fitted by eye.
15 Mar 2026

Monthly Covariate Shift Monitoring: When to Reweight and When to Retrain

How to run covariate shift detection as a recurring monthly check: monitoring cadence, ESS ratio trends, and the thresholds that trigger a retraining...
15 Mar 2026

Adaptive Conformal Inference for Non-Exchangeable Claims Series: Handling Trend Without Retraining

Standard split conformal prediction requires exchangeability — a condition insurance claims time series systematically violate.
15 Mar 2026

Debiasing Price Elasticity Estimates with Double Machine Learning: Removing the Risk Model's Fingerprint

OLS elasticity in formula-rated books is contaminated by your own risk model. insurance-causal fixes this with CausalForestDML and CatBoost nuisance.
14 Mar 2026

Optimal Binning for GLM Rating Factors: Beyond the Eyeball Test

Automated GLM factor banding for UK insurance pricing: R2VF fused lasso, neural embeddings for high-cardinality categoricals, SKATER spatial clustering.
14 Mar 2026

Per-Segment Large Loss Loading with Quantile GBMs: TVaR and ILFs at Risk Level

December is the season for year-end rate reviews where someone adds a flat 8% large loss loading to every segment regardless of tail weight.
14 Mar 2026

The Python Insurance Pricing Stack: 35 Libraries for Everything Emblem Can't Do

35 open-source Python libraries for UK insurance pricing: GBM-to-GLM distillation, causal inference, FCA fairness auditing, rate optimisation, PRA SS1/23.
14 Mar 2026

One Package, One Install: PRA SS1/23 Validation and MRM Governance Unified

insurance-governance merges insurance-validation and insurance-mrm. PRA SS1/23 statistical validation and MRM governance in one install - no version conflicts.
13 Mar 2026

Your Model Drift Alert Is Too Late

Aggregate A/E is a lagging indicator. insurance-monitoring catches input drift, feature drift, and score drift before the loss ratio moves.
13 Mar 2026

Credibility-Weighted Broker and Scheme Effects with REML

Two-stage CatBoost plus REML random effects for UK insurance broker adjustments. insurance-multilevel - Bühlmann-Straub credibility weighting, not guesswork.
13 Mar 2026

Foundation Models for Thin Segments: TabPFN and TabICLv2 in Insurance Pricing

TabPFN and TabICLv2 for thin-segment UK insurance pricing. In-context learning at inference, no gradient descent. insurance-thin-data wraps both for actuaries.
13 Mar 2026

Individual Experience Rating Beyond NCD: From Bühlmann-Straub to Neural Credibility

Four-tier experience rating in Python: Bühlmann-Straub, Poisson-Gamma state-space, GBM surrogate, attention credibility. Policy-level multiplicative factors.
13 Mar 2026

Mixture Cure Models for Retention Pricing: Separating Structural Non-Lapsers from the At-Risk Book

Logistic regression treats all non-lapsers the same. Mixture cure models split them into two groups: structural non-lapsers who will never leave, and...
12 Mar 2026

Telematics Risk Scoring: From Raw Trips to GLM Features

How to convert raw telematics trip data into GLM-ready features for UK motor pricing. Covers HMM state segmentation and score calibration to GLM relativities.
12 Mar 2026

Building a Modern Insurance Pricing Pipeline in Python

Complete UK insurance pricing pipeline in Python: CatBoost GLM distillation, causal inference, FCA fairness auditing, rate optimisation, PRA SS1/23 governance.
12 Mar 2026

GARCH for Claims Inflation: Modelling Volatility That Clusters

GARCH for UK insurance claims inflation: time-varying variance in trend analysis. insurance-garch - Engle (1982) applied to actuarial trend and pricing models.
11 Mar 2026

Quantitative Model Validation Under PRA SS1/23: Pass/Fail Tests with Reproducible Audit Trails

PRA SS1/23 requires quantitative pass/fail tests, not narrative. insurance-governance automates the full validation suite and generates auditable HTML reports.
10 Mar 2026

Double GLM for Insurance Severity: Per-Policy Dispersion via the Smyth-Jørgensen Method

Standard Gamma GLMs assign one dispersion parameter to every policy. That is wrong for most UK books. GAMLSS sigma submodels and Tweedie p estimation fix it.
10 Mar 2026

When You Can't Fit a GLM from Scratch: Transfer Learning for Thin Segments

GLMTransfer borrows statistical strength from a related source book to price thin target segments. Motor-to-fleet, home-to-landlord, and fleet roll-outs.
09 Mar 2026

Whittaker-Henderson Smoothing for Insurance Pricing

Whittaker-Henderson smoothing for noisy experience rating tables in Python. REML lambda selection, Bayesian confidence intervals, 2D surface smoothing.
09 Mar 2026

Getting Spatial Territory Factors Into Production

From CatBoost frequency model to BYM2 spatial territory factors for Emblem or Radar. Data engineering, MCMC convergence checks, and Polars joins in Python.
09 Mar 2026

EBMs for Insurance Pricing: Better Than a GLM, Readable by a Pricing Committee

insurance-gam wraps EBM for UK pricing teams: Poisson/Tweedie loss, exposure offsets, RelativitiesTable, MonotonicityEditor, GLM comparison diagnostics.
08 Mar 2026

How Do You Know Your Sigma Model Is Working?

Three diagnostics prove a GAMLSS sigma submodel is real: quantile residuals, worm plots, split-sample calibration. From insurance-distributional-glm.
08 Mar 2026

Sarmanov Copula for Frequency-Severity Dependence: The Independence Assumption UK Motor Violates

Your frequency GLM and severity GLM are both correct. Multiplying them is not. How to test and correct for the dependence your pricing model ignores.
07 Mar 2026

Quantile GBMs for Insurance: TVaR, ILFs, and Large Loss Loadings

CatBoost MultiQuantile plus actuarial output layer: TVaR, ILFs, large loss loadings, exceedance probabilities for UK insurance pricing. insurance-quantile.
07 Mar 2026

Bühlmann-Straub Treats Last Year the Same as Five Years Ago

Static credibility weights all years equally. The dynamic Poisson-gamma state-space model weights recent experience more - and quantifies how much more.
06 Mar 2026

Spliced Severity Distributions: When One Distribution Isn't Enough

A practitioner tutorial on fitting spliced composite severity distributions for UK motor claims using insurance-severity.
04 Mar 2026

Per-Risk Volatility Scoring: How to Replace Your Constant Phi with a Distributional GBM

Per-risk volatility scores using TweedieGBM from insurance-distributional. Price volatility as a rating factor, not a portfolio-level adjustment.
04 Mar 2026

How to Build a Large Loss Loading Model for Home Insurance

Per-risk large loss loadings for UK home insurance using quantile GBMs. Avoids the flat-loading trap by making the loading a function of the risk itself.
04 Mar 2026

GLM Interaction Detection: A Six-Step Walkthrough with CANN, NID, and SHAP

Step-by-step tutorial: plant two interactions in synthetic motor data, detect them with CANN + NID, validate with SHAP, confirm with A/E surfaces, and...
02 Mar 2026

How to Score Repeat Claimants with a Shared Frailty Model

Step-by-step: fit a shared frailty model in Python to score repeat claimants. Gamma frailty via EM, posterior credibility weights, GLM comparison.
02 Mar 2026

How to Extract GLM-Style Rating Factors from a CatBoost Model

Step-by-step: extract multiplicative CatBoost rating factors using shap-relativities. SHAP decomposition to GLM-format exp(beta) tables with CI and...
01 Mar 2026

From CatBoost to Radar in 50 Lines of Python

Python library distilling CatBoost GBMs into multiplicative GLM factor tables for Radar and Emblem. Open-source GBM-to-GLM distillation for UK pricing teams.
28 Feb 2026

When Credibility Meets CatBoost: Choosing Between Classical and Modern Approaches

Bühlmann-Straub vs CatBoost vs two-stage multilevel for UK motor pricing: when each wins and how insurance-credibility and insurance-multilevel combine them.
27 Feb 2026

Finding the Interactions Your GLM Missed

Automated interaction search for UK motor GLMs using CANN residuals and NID. Bonferroni-corrected shortlist before manual testing - insurance-interactions.
27 Feb 2026

Experience Rating: NCD and Bonus-Malus

Python library for NCD and bonus-malus in UK motor insurance. Optimal claiming thresholds peak at a 20% NCD, not 65% - derived mathematically.
23 Feb 2026

Why Your Cross-Validation is Lying to You

Standard k-fold CV is wrong for insurance pricing. Temporal leakage and IBNR contamination inflate scores. Walk-forward validation in Python fixes both.
21 Feb 2026

From GBM to Radar: A Complete Databricks Workflow for Pricing Actuaries

Databricks workflow for UK pricing actuaries: CatBoost plus MLflow tracking, SHAP relativities, and Radar export. End-to-end motor pricing in Python.
17 Feb 2026

Extracting Rating Relativities from GBMs with SHAP

Extract multiplicative rating relativities from CatBoost using SHAP - same exp(beta) format as a GLM. Python for UK personal lines, with confidence intervals.
17 Jan 2026

Your Elasticity Estimate Is Biased and You Already Know Why

OLS elasticity in formula-rated books is contaminated by your own risk model. insurance-elasticity fixes this with CausalForestDML and CatBoost nuisance.
08 Jan 2026

Monthly Covariate Shift Monitoring: When to Reweight and When to Retrain

How to run covariate shift detection as a recurring monthly check: monitoring cadence, ESS ratio trends, and the thresholds that trigger a retraining...
30 Dec 2025

Your Conformal Intervals Are Wrong When the Claims Series Has Trend

Standard split conformal prediction requires exchangeability — a condition insurance claims time series systematically violate.
21 Dec 2025

Your Year-End Large Loss Loading Is a Finger in the Air

December is the season for year-end rate reviews where someone adds a flat 8% large loss loading to every segment regardless of tail weight.
12 Dec 2025

Your Lapse Model Ignores Cure: The Customers Who Were Never Going to Leave

Logistic regression treats all non-lapsers the same. Mixture cure models split them into two groups: structural non-lapsers who will never leave, and...
03 Dec 2025

Telematics Risk Scoring: From Raw Trips to GLM Features

How to convert raw telematics trip data into GLM-ready features for UK motor pricing. Covers HMM state segmentation and score calibration to GLM relativities.
24 Nov 2025

Your Model Validation Is a Checklist, Not a Test

PRA SS1/23 requires quantitative pass/fail tests, not narrative. insurance-governance automates the full validation suite and generates auditable HTML reports.
15 Nov 2025

Your Severity Model Assumes the Same Variance for Every Policy

Standard Gamma GLMs assign one dispersion parameter to every policy. That is wrong for most UK books. GAMLSS sigma submodels and Tweedie p estimation fix it.
06 Nov 2025

When You Can't Fit a GLM from Scratch: Transfer Learning for Thin Segments

GLMTransfer borrows statistical strength from a related source book to price thin target segments. Motor-to-fleet, home-to-landlord, and fleet roll-outs.
28 Oct 2025

How Do You Know Your Sigma Model Is Working?

Three diagnostics prove a GAMLSS sigma submodel is real: quantile residuals, worm plots, split-sample calibration. From insurance-distributional-glm.
19 Oct 2025

Your Frequency-Severity Independence Assumption Is Costing You Premium

Your frequency GLM and severity GLM are both correct. Multiplying them is not. How to test and correct for the dependence your pricing model ignores.
15 Sep 2025

Spliced Severity Distributions: When One Distribution Isn't Enough

A practitioner tutorial on fitting spliced composite severity distributions for UK motor claims using insurance-severity.