Short practical guides for pricing actuaries: one-way analysis, Gini coefficients, factor tables, burning cost models, GLM reproduction, and common tasks you spend too long doing in Excel. Code-first, production-ready.
A complete Python tutorial for Whittaker-Henderson smoothing of insurance rating tables. Replace your Excel moving average or SAS graduation with automatic REML lambda selection...
A practical Python tutorial for telematics pricing: load raw GPS trip data, classify driving regimes with a Hidden Markov Model, and produce GLM-ready risk features using insura...
A hands-on Python tutorial for insurance pricing analysts on survival analysis and lapse modelling. Covers Kaplan-Meier, Weibull AFT, mixture cure models, customer lifetime valu...
How to set up insurance model monitoring in Python from scratch: PSI, Gini drift, and A/E with the insurance-monitoring library. Know when to redeploy, recalibrate, or refit.
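As a rough illustration of the PSI check mentioned above, here is a minimal standard-library sketch. It does not use the insurance-monitoring library. The function name `psi`, the quantile binning on the reference sample, and the `eps` floor for empty bins are all assumptions made for this example.

```python
import math

def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population Stability Index between a reference and a current sample.

    Assumption: bin edges are quantiles of the reference sample.
    Equal-width bins are another common choice.
    """
    ref = sorted(expected)
    # Interior bin edges taken at the k/n_bins quantiles of the reference sample
    edges = [ref[int(len(ref) * k / n_bins)] for k in range(1, n_bins)]

    def shares(values):
        counts = [0] * n_bins
        for v in values:
            # Bin index = number of edges strictly below the value
            counts[sum(1 for e in edges if v > e)] += 1
        # Floor empty bins at eps so the logarithm stays finite
        return [max(c / len(values), eps) for c in counts]

    p, q = shares(expected), shares(actual)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))

# Hypothetical data: a reference sample and a copy shifted upwards
reference = [i / 100 for i in range(1000)]
shifted = [v + 3 for v in reference]
print(psi(reference, reference), psi(reference, shifted))
```

A common rule of thumb treats PSI below 0.1 as stable and above 0.25 as a material shift, but those cut-offs are conventions rather than statistical tests.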
End-to-end GLM frequency model in Python using freMTPL2 from OpenML. Data prep, exposure handling, glum fitting, deviance residuals, actual vs expected, and factor relativity ex...
A hands-on tutorial on GAM insurance pricing in Python using the insurance-gam library. Covers EBM tariff construction, shape function extraction, GLM comparison, Shapley values...
A practical Python tutorial on credibility theory for insurance pricing analysts. Covers the Bühlmann-Straub model, the insurance-credibility library, a UK motor example, GLM integrat...
A step-by-step tutorial on conformal prediction for insurance models in Python, specifically the frequency-severity decomposition. Covers the calibration subtlety that breaks naive...
A hands-on tutorial on causal inference for insurance pricing in Python using the insurance-causal library. Covers double machine learning (DML), CatBoost nuisance models, ATE/C...
How to run actuarial model validation in Python for UK insurance pricing models. Covers Solvency II Article 120 and Consumer Duty requirements, the five-test validation suite, a...
A stable Gini coefficient is not evidence that a model is performing well. It is evidence that the model is still ranking risks in the same order. Score decomposition separates ...
insurance-distributional now has five distributional model classes. NeuralGaussianMixture is the newest and the most demanding. A routing guide: which model for which problem, a...
The hunger-for-bonus effect biases your NCD frequency relativities. It also biases your severity model. The two errors partially offset each other — but the combined underpricin...
Miao & Pesenti's KL discrimination-insensitive result is theoretically clean. Deploying it in a production GLM-based pricing system is not. The paper is silent on how to extract...
Holtan (2001) showed that the NCD reporting threshold falls when interest rates rise — the NPV of future premium penalties shrinks, so policyholders become more willing to claim...
Lee et al. (arXiv:2602.02398) prove that standard hurdle-Poisson models with bivariate normal random effects can violate credibility order — your frequency estimate goes down af...
insurance-monitoring v1.0.0 adds ModelMonitor with check_gmcb and check_lmcb — separate tests for global and local calibration drift, wired into a three-way REDEPLOY/RECALIBRATE...
A new mortality model from Liu & Zhou (2026) shows that cause-specific shocks decay heterogeneously — some fast, some slow. The analogy to UK claims inflation is exact, and the ...
Two January 2026 arXiv papers formalise what motor actuaries have always known informally: NCD creates rational incentives to suppress small claims, and the GLM you're using to ...
insurance-credibility v0.1.9 adds BMSEquilibriumSimulator — Lemaire NPV reporting thresholds, Liang 2-class equilibrium, and a frequency correction for the selection bias in NCD...
A runnable Python implementation of Goffard, Piette, and Peters (ASTIN Bulletin 2025): infer claim frequency and severity from competitor PCW quotes using ABC-SMC with isotonic ...
Yanez, Guillen and Nielsen (ASTIN Bulletin 2025) apply a bounded Bonus-Malus System not to claims but to telematics signals themselves, updating weekly. The result: Gini from 0....
Zhang, Mao and Wang (arXiv:2603.14991, March 2026) prove a closed-form equivalent for worst-case quantile regression under Wasserstein distributional uncertainty — a result that...
Protected NCD is widely misunderstood by consumers, and the product may not deliver the value it charges for. The Consumer Duty fair value test and the hunger-for-bonus literatu...
Standard conformal prediction gives valid coverage only when calibration and test data are exchangeable. For insurance models deployed for 12+ months — through claims inflation ...
Two January 2026 arXiv preprints formalise what UK pricing teams have long intuited: observed claim frequency at high-NCD classes understates true frequency by 15–35%, because p...
ANAM (Laub, Pho, and Wong, NAAJ 2025) fits each rating factor as a neural subnetwork with hard monotonicity constraints, exposure offsets, and proper actuarial losses. The insur...
ACI satisfies its marginal coverage guarantee while producing months of invalid intervals after a claims inflation shock. A new paper proves the minimax-optimal algorithm flushe...
Policyholders with good NCD rationally choose not to report small claims. Your frequency model is trained on that suppressed data. Two January 2026 papers formalise what this me...
Brauer, Menzel & Wüthrich (arXiv:2510.04556) give us two things we have been missing: a formal hypothesis test for Gini drift and a Murphy score decomposition that tells you whe...
EBMs achieve near-XGBoost predictive performance on insurance claims data while remaining fully interpretable by design — no post-hoc SHAP required. We show the Poisson frequenc...
Fitting one aggregate trend to UK motor claims 2019–2024 embeds a single implicit decay rate across parts shortage, labour shortage, and social inflation — components that norma...
We fit WhittakerHendersonPoisson to driver age frequencies from 677K French MTPL policies. The Poisson smoother handles count data correctly, REML selects lambda automatically, ...
Why Tweedie GLM is the standard for aggregate loss modelling in insurance, with a complete Python example covering power parameter selection, exposure offset, and comparison wit...
Deep learning survival models underperform Cox regression on tabular insurance data. Cure models are the real story post-GIPP. Here is what the research says and what UK pricing...
UK motor average claim costs reached a record £5,300 in Q4 2024. But applying a flat 8% trend assumption treats structural and cyclical inflation identically. They have opposite...
How to produce a full IBNR distribution in Python using the Mack method and Bootstrap ODP sampling. Covers analytical standard errors, 5,000-simulation bootstrap, percentile tab...
When does it make sense to reach beyond chain ladder and bootstrap ODP for neural reserving methods? We compare DeepTriangle, individual RNN approaches, and the Richman-Wüthrich...
End-to-end motor insurance pricing in Python using the French MTPL dataset. Frequency-severity GLMs, exposure offsets, coefficient interpretation, validation, and calibration to...
Most governance tooling is tested on toy examples with clean DGPs and inflated Gini coefficients. We ran the full insurance-governance validation suite on 677K freMTPL2 policies...
The standard rate adequacy workflow — earned premium at current rate level, ultimate losses, trend to future period, expense load, indicated rate change — built in Python with p...
Raw loss ratios by age band are noisy. A 5-year moving average introduces boundary bias and requires a judgment call you cannot defend in an IFRS 17 review. This tutorial shows ...
The GBM sits in a notebook outperforming the production GLM. This tutorial shows how to extract multiplicative rating relativities from a CatBoost Poisson model using shap-relat...
Point estimates from pricing models are incomplete. This tutorial shows how to add distribution-free prediction intervals to a CatBoost Tweedie model using insurance-conformal —...
Single-objective fairness constraints force a binary choice. NSGA-II finds the full tradeoff surface, so governance committees can make an explicit, documented decision about wh...
We benchmarked Whittaker-Henderson against raw rates and a 5-point weighted moving average on a synthetic UK motor driver age curve with known truth. W-H reduces MSE by 57.2% vs...
PSI detects covariate shift but not rank collapse. On a synthetic UK motor book where a new risk factor emerges post-deployment, PSI stays GREEN while Gini drops 8 points. The B...
On a UK motor DGP with a monotone young-driver requirement, unconstrained EBM violates monotonicity in 31% of runs. Constrained EBM matches GLM monotonicity compliance at 100% w...
We benchmarked Bühlmann-Straub credibility against raw experience and manual Z-factors on a 30-segment synthetic UK motor fleet book with a known DGP. On thin schemes, it reduce...
REML-selected lambda beats manual tuning on a 63-band age curve benchmark: 22% lower MSE on thin tail bands, zero analyst discretion, and principled credible intervals. The hone...
BonusMalus is built from past claims — a naive regression conflates the causal effect with selection. We ran Double Machine Learning on 677K freMTPL2 policies to isolate what Bo...
Build a working Solvency II SCR estimate from scratch using compound distributions and Monte Carlo simulation. Poisson/NegBin frequency, lognormal severity, 50k simulations, VaR...
A practical Python walkthrough of the burning cost method for pricing excess of loss reinsurance treaties — loss trending, development, pure rate calculation, and sensitivity an...
We ran Bühlmann-Straub credibility on the freMTPL2freq dataset — 677K French MTPL policies, 22 regions — and quantified how much thin regions get pulled toward the portfolio mea...
H2O, FLAML, and AutoGluon are genuinely useful tools. None of them handle the log(exposure) offset that makes insurance frequency modelling work. Here is an honest account of wh...
A complete Python tutorial for building a Tweedie GLM for insurance pricing: synthetic motor data, statsmodels, exposure offset, interpreting the p parameter, residual diagnosti...
Richman-Wüthrich's one-shot PtU reserving paper (arXiv:2603.11660) ships with R code only. We map the algorithm to Python, explain the censored-claims exposure mechanism that ma...
EY forecasts a 111% net combined ratio for UK motor in 2026. WTW documents a 13% annual premium fall. Here is what the data shows and what pricing teams should do about it.
Applying CANN + NID to severity (Gamma) GLMs. Why the signal is weaker than frequency, what configuration changes are needed, and when a severity interaction is worth adding.
How UK home insurers should model physical climate risk: UKCP18 projections, Flood Re's 2039 exit, ABI claims data, and practical code using insurance-whittaker, insurance-confo...
The FCA has explicitly flagged pet insurance for monitoring in its 2026 regulatory priorities. FOS complaint upheld rates hit 52% in Q1 2025 — the highest of any UKGI business l...
How to detect when a motor book has hit the floor of its underwriting cycle — using PSI on new business mix, segment-level A/E, Gini stability, and mSPRT to know when the next m...
Modern Insurance Pricing with Python and Databricks - all 12 modules, free, on GitHub. GLMs through causal elasticity, fairness auditing, spatial BYM2 territory models, and mode...
A hands-on tutorial for the RateChangeEvaluator in insurance-causal v0.6.0. DiD when you have a control group. ITS when you don't. Real code, real API.
A 5pp Gini improvement means nothing to a CFO. The Loss Ratio Error framework from arXiv:2512.03242 converts model correlation into expected loss ratio — and from there into pou...
Build a double-lift chart to compare GLM vs GBM predictions. Bin by prediction ratio, compute A/E per decile, plot with matplotlib. Standard tool for pricing committee model val...
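The binning step described above can be sketched in plain Python. This is a minimal illustration, not the post's code. The function name `double_lift` and the toy data are hypothetical, exposure weighting is ignored, and the matplotlib plotting step is left out.

```python
def double_lift(actual, pred_a, pred_b, n_bins=10):
    """Actual-vs-expected per bin for two models, binned by the ratio pred_b / pred_a.

    Assumption: equal-count bins with no exposure weighting.
    """
    # Sort risks by how much model B disagrees with model A
    rows = sorted(zip(actual, pred_a, pred_b), key=lambda r: r[2] / r[1])
    n = len(rows)
    table = []
    for k in range(n_bins):
        chunk = rows[k * n // n_bins:(k + 1) * n // n_bins]
        act = sum(r[0] for r in chunk)
        table.append({
            "bin": k + 1,
            "ae_a": act / sum(r[1] for r in chunk),  # A/E for model A
            "ae_b": act / sum(r[2] for r in chunk),  # A/E for model B
        })
    return table

# Hypothetical toy data: model B tracks actuals, model A predicts a flat mean
actual = [float(i) for i in range(1, 21)]
pred_a = [10.5] * 20
pred_b = list(actual)
for row in double_lift(actual, pred_a, pred_b, n_bins=5):
    print(row)
```

The model whose A/E stays closer to 1.0 across the bins is the one that prices better where the two models disagree.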
The first Python implementation of the asymptotic Gini drift test from Brauer et al. (2025). A proper z-test for ranking degradation — not a heuristic, not a threshold, a p-value.
FCA EP25/2 published July 2025. Expected claim costs per home policy up 49% from £92 to £138. Average inception premium up only 5%. The data says insurers absorbed the shock — n...
FCA Consumer Duty PRIN 2A requires insurers to tell policyholders what they can change to get a better outcome. Most pricing teams have not built this. insurance-recourse does i...
insurance-fairness v0.6.3 ships DiscriminationInsensitiveReweighter. Here's why dropping the protected column doesn't work, how propensity-based reweighting does, and what the A...
EP25/2 (the FCA's evaluation of GIPP price-walking remedies) flags ongoing fair value supervision in motor and home. No single technical checklist exists for the pricing actuary...
Fleet motor, property, and liability pricing in Python. Covers Bühlmann-Straub credibility for fleet schemes, GPD large loss loading, MBBEFD ILF tables, and PSI drift detection ...
Why GLM coefficients aren't causal effects, and how to fix that using insurance-causal: DML with CatBoost nuisances, causal forests for heterogeneous treatment effects, and DiD/...
A clean Python tutorial for the most-cited neural network architecture in actuarial pricing: the Combined Actuarial Neural Network (Schelldorfer & Wüthrich, 2019). Architecture,...
Standard Tweedie GLMs handle zeros implicitly. When that implicit handling breaks — specialty lines, niche segments, specific peril models — you need ZIP or hurdle models. Here ...
How to implement walk-forward cross-validation for insurance GLMs in Python using insurance-cv. Covers IBNR buffers, fold design, and a full worked example on freMTPL2-style mot...
Definitive Python benchmark: Poisson GLM vs XGBoost vs CatBoost vs LightGBM for insurance frequency modelling on freMTPL2. Poisson deviance, Gini coefficient, and A/E calibratio...
How the Ogden discount rate and Periodical Payment Orders change the maths of large BI pricing in the UK — with Python code to calculate lump sum equivalents, discount PPO cash ...
UK motor bodily injury severity has outrun CPI since 2022. This post implements a multiplicative severity separation model and Whittaker-Henderson smoothing in Python to separat...
Migrating from Emblem to Python for insurance GLM pricing: what changes in workflow, what gets easier, what gets harder, and what the transition actually looks like in practice.
Step-by-step: extract CatBoost factor tables with shap-relativities and write a clean Excel file with openpyxl. Formatted output ready to paste into Radar or Emblem.
The Python equivalent of the IFoA MLR Working Party's R tutorial: Poisson GLM baseline, EBM GAM, and CatBoost GBM on UK motor data, with the full pipeline from data to governance.
A practical statsmodels tutorial for pricing actuaries: Poisson frequency model with exposure offset, Gamma severity model, overdispersion tests, factor table extraction, and A/...
Benchmark results on a known-DGP synthetic UK motor book. EBM beats the GLM by 12.6 Gini points (0.455 vs 0.329). But the deviance number is misleading. We explain why, and when...
Benchmark results on a known-DGP synthetic UK motor fleet. HMM state fractions deliver 5–10pp Gini lift over simple aggregates. State classification recovers >50% of true high-r...
Honest benchmark: does fitting a surrogate GLM on CatBoost pseudo-predictions recover more discriminatory power than a direct GLM? We test it on 30,000 synthetic UK motor policies.
Extract the calendar-year inflation component from a claims development triangle using Taylor's two-factor separation. Python from scratch, then connect to severity trending.
CPI-adjusting your historical claims data before fitting a pricing model introduces systematic bias. How to apply line-specific inflation indices for motor and home insurance in...
A practical comparison of Python and R for UK personal lines insurance pricing — data wrangling, GLMs, GBMs, deployment, and Databricks. Honest about where R still wins.
Reproduce an Emblem frequency-severity GLM in Python: factor tables, one-way plots, deviance residuals, and lift charts using statsmodels, CatBoost, and Polars.
Which GLM assumptions actually matter for insurance pricing, which ones you routinely violate without consequence, and the diagnostics worth running before signing off a product...
A practical walkthrough for pricing analysts: use insurance-causal for causal inference, insurance-conformal for prediction intervals, and insurance-monitoring for drift detecti...
Benchmark results on a known-DGP synthetic UK motor age curve. REML recovers the true frequency well in the data-rich middle. The tails are a different story. Numbers, not claims.
Aggregate A/E at 0.94 looks fine. The model has been mispricing under-25s for eight months. Benchmark results on a synthetic UK motor book with three planted failure modes.
Benchmark results on 100 synthetic schemes with known true loss rates. Credibility blending reduces MSE by 25–35% vs the best naive alternative. Numbers, not theory.
Build a burning cost model in Python: frequency-severity split, exposure offsets, large loss capping, IBNR adjustment, and combined pure premium for UK pricing.
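To make the arithmetic behind those steps concrete, here is a minimal sketch under stated assumptions. The function `burning_cost`, the single `ibnr_factor` applied to claim counts, and the flat spreading of excess-over-cap losses across all exposure are simplifications invented for this example, not the post's method.

```python
def burning_cost(claims, exposure_years, cap, ibnr_factor=1.0):
    """Frequency-severity burning cost with large loss capping.

    Assumptions: `ibnr_factor` scales reported claim counts to ultimate,
    and the excess over the cap is spread evenly across all exposure.
    """
    capped = [min(c, cap) for c in claims]
    excess = sum(claims) - sum(capped)

    frequency = len(claims) * ibnr_factor / exposure_years  # ultimate claims per exposure year
    severity = sum(capped) / len(claims)                    # average capped claim cost
    large_load = excess / exposure_years                    # flat large loss loading

    return {
        "frequency": frequency,
        "severity": severity,
        "large_load": large_load,
        "pure_premium": frequency * severity + large_load,
    }

# Hypothetical book: three claims on 100 exposure years, capped at 10,000
result = burning_cost([1000.0, 2000.0, 50000.0], exposure_years=100.0,
                      cap=10000.0, ibnr_factor=1.1)
print(result)
```

In practice the frequency and severity pieces would each be a model with an exposure offset instead of a single ratio, but the combination step has the same shape.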
sklearn's TweedieRegressor is a well-engineered GLM. It fits a fixed-power Tweedie model correctly. The problem is that insurance pricing needs per-risk variance, not a single p...
Tutorial on monitoring insurance pricing models using actuarial KPIs. Gini tracking, segmented A/E, double-lift for champion/challenger. Why generic drift tools miss what matters.
Build a CatBoost frequency-severity pricing model on freMTPL2 using Polars. Poisson frequency, Gamma severity, combined burning cost, SHAP factor extraction, and distillation to...
Insurance walk-forward cross-validation prevents the look-ahead bias that makes standard k-fold results useless for prospective evaluation. Complete Python example with insuranc...
sklearn's TweedieRegressor tutorial gets you to a fitted model in six lines. It also produces predictions that are wrong for any policy with non-annual exposure. Here is the cor...
Insurance model monitoring in Python that understands exposure weighting, development lags, and Gini drift. Why Evidently and NannyML miss what matters for pricing, and what ins...
The FCA expects pricing teams to demonstrate their models don't proxy-discriminate under Consumer Duty. Most teams do this in Excel. Here is how to do it properly in Python, usi...
Static credibility weights all years equally. The dynamic Poisson-gamma state-space model weights recent experience more - and quantifies how much more.
How to run covariate shift detection as a recurring monthly check: monitoring cadence, ESS ratio trends, and the thresholds that trigger a retraining...
insurance-governance merges insurance-validation and insurance-mrm. PRA SS1/23 statistical validation and MRM governance in one install - no version conflicts.
Two-stage CatBoost plus REML random effects for UK insurance broker adjustments. insurance-multilevel - Bühlmann-Straub credibility weighting, not guesswork.
TabPFN and TabICLv2 for thin-segment UK insurance pricing. In-context learning at inference, no gradient descent. insurance-thin-data wraps both for actuaries.
Logistic regression treats all non-lapsers the same. Mixture cure models split them into two groups: structural non-lapsers who will never leave, and...
GARCH for UK insurance claims inflation: time-varying variance in trend analysis. insurance-garch - Engle (1982) applied to actuarial trend and pricing models.
GLMTransfer borrows statistical strength from a related source book to price thin target segments. Motor-to-fleet, home-to-landlord, and fleet roll-outs.
GAMLSS in Python: seven families, RS algorithm, variance as a function of covariates. insurance-distributional-glm - the actuarial implementation Python lacked.
CatBoost MultiQuantile plus actuarial output layer: TVaR, ILFs, large loss loadings, exceedance probabilities for UK insurance pricing. insurance-quantile.
insurance-distributional models the full conditional loss distribution, not just the mean. First open-source Python implementation of the ASTIN 2024 Best Paper.
Per-risk large loss loadings for UK home insurance using quantile GBMs. Avoids the flat-loading trap by making the loading a function of the risk itself.
Step-by-step tutorial: plant two interactions in synthetic motor data, detect them with CANN + NID, validate with SHAP, confirm with A/E surfaces, and...
Python library distilling CatBoost GBMs into multiplicative GLM factor tables for Radar and Emblem. Open-source GBM-to-GLM distillation for UK pricing teams.
Bühlmann-Straub vs CatBoost vs two-stage multilevel for UK motor pricing: when each wins and how insurance-credibility and insurance-multilevel combine them.
Automated interaction search for UK motor GLMs using CANN residuals and NID. Bonferroni-corrected shortlist before manual testing - insurance-interactions.
How to extract SHAP relativities from insurance GBMs. Multiplicative factor tables in GLM exp(beta) format, with confidence intervals and exposure weighting. Python, CatBoost, U...
How to convert raw telematics trip data into GLM-ready features for UK motor pricing. Covers HMM state segmentation and score calibration to GLM relativities.
PRA SS1/23 requires quantitative pass/fail tests, not narrative. insurance-governance automates the full validation suite and generates auditable HTML reports.