The core discipline. Rate adequacy, burning cost models, credibility weighting, GLM structure, frequency-severity separation, and the practical decisions that determine whether a book makes money. 123 articles.
A complete Python tutorial for Whittaker-Henderson smoothing of insurance rating tables. Replace your Excel moving average or SAS graduation with automatic REML lambda selection...
You don't need a JBA licence to build a materially better flood model. A French study on 968,000 policies shows which open data sources actually move the needle — and the answer...
A practical Python tutorial for telematics pricing: load raw GPS trip data, classify driving regimes with a Hidden Markov Model, and produce GLM-ready risk features using insura...
Richman and Wüthrich's March 2026 paper (arXiv:2603.11660) proves that aggregate chain-ladder produces materially biased ultimate estimates on liability lines. For UK personal l...
C. Evans Hedges (Lemonade, December 2025) derives the first closed-form formula connecting model discrimination to expected loss ratio. LRE translates a correlation improvement ...
Zanzouri et al. (NAAJ 2025) benchmark four ML severity models inside the QPP framework. The tau adjustment is elegant. CatBoost was missing from the comparison. Our library alre...
The Pool Adjacent Violators Algorithm solves an O(N) monotonicity problem with no parametric assumptions. It appears in three distinct insurance pricing contexts: as the link fu...
NeuralGaussianMixture is now in insurance-distributional v0.4.0. The question is not whether it can fit bimodal severity — it can. The question is whether your data actually nee...
insurance-distributional now has five distributional model classes. NeuralGaussianMixture is the newest and the most demanding. A routing guide: which model for which problem, a...
The hunger-for-bonus effect biases your NCD frequency relativities. It also biases your severity model. The two errors partially offset each other — but the combined underpricin...
UK personal lines generates hundreds of millions of competitor quotes per year. The industry treats them as competitive positioning data. They are, in fact, risk calibration dat...
Holtan (2001) showed that the NCD reporting threshold falls when interest rates rise — the NPV of future premium penalties shrinks, so policyholders become more willing to claim...
Everyone is buying CyberCube or Kovrr. The open data suggests entity-level cyber frequency scoring is feasible — but the honest numbers (r = 0.36, US-only, no severity) tell a d...
The hardest part of fitting a GPD is picking the threshold. A new Bayesian nonparametric approach eliminates the choice entirely — and tells you what fraction of your book has i...
Orihara, Momozaki & Sugasawa (arXiv:2506.04868) produce a Bayesian posterior over the ATE by tilting the product of independent posteriors to satisfy the DR moment condition. We...
UK motor bodily injury severity is structurally bimodal. A GammaGBM fits one mode between two humps, understating the 95th percentile by 30-40%. NeuralGaussianMixture fixes this...
Avanzi, Richman, Wüthrich et al. (arXiv:2601.07637) treat individual claim development as a Markov decision process, using Soft Actor-Critic to revise outstanding claim liabilit...
Kirke (arXiv:2603.15664) applies Quantum Amplitude Estimation to catastrophe insurance tail-risk pricing and claims quadratic speedup over classical Monte Carlo. The maths is re...
When you acquire a portfolio or enter a scheme, your pricing model was fitted on a different risk population. Weill and Wang (2026) give a kernel GLM framework for correcting th...
Most pricing teams treat acquisition and retention as separate problems. CLV-aware ratemaking integrates expected policy duration, cross-sell probability, and claim cost traject...
Nieto-Barajas (arXiv:2602.07228, 2026) proposes a Bayesian nonparametric mixture of Scaled Generalised Gaussian distributions that eliminates threshold selection entirely. The m...
Two January 2026 arXiv papers formalise what motor actuaries have always known informally: NCD creates rational incentives to suppress small claims, and the GLM you're using to ...
Izbicki and Rodrigues (arXiv:2603.26611, March 2026) benchmark TabPFN-2.5, RealTabPFN-2.5 and TabICL-Quantiles as conditional density estimators across 39 datasets. The thin-dat...
Zhang, Mao and Wang (arXiv:2603.14991, March 2026) prove a closed-form equivalent for worst-case quantile regression under Wasserstein distributional uncertainty — a result that...
Threshold selection is the Achilles' heel of extreme value theory in insurance pricing. Majumder and Richards (arXiv:2504.19994) eliminate it by blending a spline-neural-network...
Standard conformal prediction gives symmetric intervals calibrated on average. For right-skewed claim distributions, that average includes a lot of zero claims pulling threshold...
Roy, Singh, and Das (arXiv:2603.14841) build a 0-100 driver safety score by inverting a crash classifier and multiplying in condition-specific penalty factors. The maths is clea...
A follow-up to our QPP introduction: the honest case for quantile-based loading (it works for heavy-tailed lines and low-frequency risks, it does not work below the zero mass), ...
Protected NCD is widely misunderstood by consumers, and the product may not deliver the value it charges for. The Consumer Duty fair value test and the hunger-for-bonus literatu...
PenalizedGLMInference in insurance-gam v0.5.0 implements Manna et al. arXiv:2410.01008: bias-corrected confidence intervals for Poisson, Gamma, and Tweedie GLMs after Lasso or e...
Standard conformal prediction gives valid coverage only when calibration and test data are exchangeable. For insurance models deployed for 12+ months — through claims inflation ...
Two January 2026 arXiv preprints formalise what UK pricing teams have long intuited: observed claim frequency at high-NCD classes understates true frequency by 15–35%, because p...
An MGA launching on a UK PCW needs prices on day one with zero claims history. Here is the full architecture: market ABC as the prior, Bühlmann-Straub blending as claims arrive,...
Braun et al. (arXiv:2507.20941) replace the standard hyperrectangular joint prediction set with an ellipsoid built from a Mahalanobis nonconformity score. For d=2 (frequency + s...
LoBoostCP in insurance-conformal v1.0.0 implements Santos et al. arXiv:2602.22432 — local conformal prediction that uses the leaf structure of your existing GBT to calibrate pre...
A UK MGA at launch has five routes to market data: competitor quote reverse-engineering, rate indices from Consumer Intelligence, ABI aggregate statistics, capacity provider dat...
Paolo Toccaceli's CRPS-Optimal Binning for Conformal Regression (arXiv:2603.22000) partitions the covariate space using dynamic programming to minimise LOO-CRPS, then calibrates...
Verschuren (2021) showed that a Dutch insurer's home claim history predicts motor risk, and vice versa. The framework is technically clean. The UK structural context — aggregato...
Burger (arXiv:2512.23602, Dec 2025) applies conformal prediction to insurance model monitoring, replacing PSI > 0.2 and A/E > 1.15 with thresholds that are calibrated from data ...
Standard conformal prediction fails with right-censored survival data because you never observe the true event time for censored policies. ConformalisedSurvival in insurance-con...
Independent Lee-Carter models per cause-of-death produce forecasts that do not sum to total mortality — a coherence failure that flows directly into CI and LTC reserves. Nigri, ...
When you launch a new product with no claims history, you borrow from a related portfolio. Transfer learning formalises this. But the most-cited deep learning method for domain ...
Tab-TRM sets the French MTPL benchmark at 23.589×10⁻² Poisson deviance, beating PIN ensemble by 0.3%. The linearisation result — Tab-TRM is approximately a state-space model — i...
The quantile premium principle maps a single number, your risk appetite parameter tau, to per-risk safety loadings. Zanzouri et al. (NAAJ 2025) show QRNN outperforms tree-bas...
Policyholders with good NCD rationally choose not to report small claims. Your frequency model is trained on that suppressed data. Two January 2026 papers formalise what this me...
FL solves the same variance-reduction problem as Bühlmann-Straub — but iteratively, with communication overhead, and without actuarial precedent. For UK personal lines, that tra...
Sun, Xie & Zhang (arXiv:2503.11375) combine parallel trends and synthetic control into a single estimator that remains consistent if either assumption holds. We explain the math...
Fitting one aggregate trend to UK motor claims 2019–2024 embeds a single implicit decay rate across parts shortage, labour shortage, and social inflation — components that norma...
EA NaFRA is open, 2m resolution, and free. So is the EPC register. So are OS building footprints. A UK pricing actuary who has actually tried to use them explains what you can a...
insurance-distributional v0.3.0 ships ZeroInflatedTweedieGBM — the first open-source implementation of So & Valdez (2024) Scenario 2. When standard Tweedie gets structural zeros...
Three-way benchmark on 677K French motor policies. TabPFN cannot handle log-exposure offsets — the structural limitation that makes it unviable for bread-and-butter Poisson freq...
UK motor average claim costs reached a record £5,300 in Q4 2024. But applying a flat 8% trend assumption treats structural and cyclical inflation identically. They have opposite...
Your GLM or GBM was trained on policyholders who chose to buy from you at your price. That is not a random sample of the market. The mechanism, what it does to your frequency an...
A practitioner-oriented deep dive on applying reinforcement learning and contextual bandits to PCW margin optimisation for UK personal lines. Two serious papers exist, six hard ...
XGBoostLSS, LightGBMLSS, NGBoost, and PGBM can all output a full conditional distribution rather than a point prediction. The Chevalier & Côté benchmark (EAJ 2025) tested 11 alg...
In January 2026 the PRA named AI as an insurance supervisory theme for the first time. The FCA published a bias research note in December 2024. The FRC updated TAS 100 to make b...
Most governance tooling is tested on toy examples with clean DGPs and inflated Gini coefficients. We ran the full insurance-governance validation suite on 677K freMTPL2 policies...
Gamma GLMs fit a single mode to severity data that often has two or three. Mixture Density Networks output the full conditional distribution — mixing weights, component means, a...
The standard rate adequacy workflow — earned premium at current rate level, ultimate losses, trend to future period, expense load, indicated rate change — built in Python with p...
Kolmogorov-Arnold Networks replace fixed activations with learnable splines on edges, letting the model discover its own functional forms. Here is what that means for insurance ...
The FCA published its final premium finance market study on 3 February 2026. No APR cap was imposed. That does not mean your book is clean. Here is what changed, what double dip...
Most UK motor insurers think they know their price elasticity. They are probably wrong by a factor of 3–5, in the direction that leads them to misprice systematically. The eviden...
A practitioner's guide to dynamic pricing in UK insurance: what GIPP actually permits, why your elasticity model likely has a 3–5x bias, and an honest assessment of whether Earn...
BonusMalus is built from past claims — a naive regression conflates the causal effect with selection. We ran Double Machine Learning on 677K freMTPL2 policies to isolate what Bo...
H2O, FLAML, and AutoGluon are genuinely useful tools. None of them handle the log(exposure) offset that makes insurance frequency modelling work. Here is an honest account of wh...
A definitive survey of open-source Python tools for insurance pricing in 2026. General-purpose ML libraries, specialist actuarial packages, the Burning Cost stack, and honest ga...
Denuit, Michaelides and Trufin (March 2026) unify autocalibration and non-discrimination into a single actuarial test. If your model fails it, you have a pricing problem and a r...
We fitted a Poisson GLM on the first third of freMTPL2 (677K French motor policies) and monitored it across two later temporal segments without refitting. PSI, A/E ratios with W...
We ran insurance-fairness against ausprivauto0405 — a real Australian motor dataset with an explicit Gender field. Here is what FairnessAudit, MulticalibrationAudit, and Indirec...
Sesia & Favaro's March 2026 survey of conformal prediction is the clearest account yet of what finite-sample distribution-free guarantees mean - and why the marginal/condition...
Lee, Badescu, and Lin (2026) replace ad-hoc event counts with a principled actuarial risk index: MODWT decomposes the acceleration signal, a Gaussian-Uniform mixture anchors tai...
Most UK insurers fit a logistic regression on PCW quote data and call it a demand model. It is biased in at least three distinct ways. Here is the causal structure that explains...
Brehmer & Strokorb (2019) proved that no proper scoring rule applied to raw data can discriminate tail indices. Bladt & Øhlenschlæger (arXiv:2603.24122) fix this by scoring norm...
A stochastic SIR model calibrated to LockBit ransomware data shows why treating cyber losses as independent events badly underestimates portfolio-level risk.
Balzer and Benlahlou (arXiv:2603.14543) extend gradient boosting to spatial panel data. Here is what it does, how it compares to BYM2 and Blier-Wong, and when a UK pricing team ...
A new paper combines panel fixed effects, double machine learning, and instrumental variables. The headline result is not the estimator — it's that ML covariate adjustment frequ...
Python has no equivalent of R's msm package for continuous-time multi-state modelling of claims. We explain the mathematics, show why a Poisson GLM substitution works for most p...
Moriah et al. (2026) run a sequential model-building exercise on a French home insurance portfolio to measure what each data layer — hydrological zoning, rainfall intensity, bui...
Why the standard flat EV surcharge is wrong in two directions simultaneously, what the claims data actually shows, and how to build a severity model that handles the bimodal str...
When you have fewer than 5,000 policies in a segment, should you use Bühlmann-Straub credibility or a GBM with transfer learning? The answer depends on whether you have a relate...
Conformal prediction and the parametric bootstrap both produce prediction intervals for insurance pricing models. They answer different questions, have different computational c...
Conformal prediction gives finite-sample valid 99.5% risk bounds for individual policies — useful for premium risk SCR validation and model validation consistent with Solvency I...
A structured decision framework for choosing between conformal prediction, distributional GBM, Bühlmann-Straub credibility, GLM bootstrap, and GAM uncertainty. Model type, data ...
EY forecasts a 111% net combined ratio for UK motor in 2026. WTW documents a 13% annual premium fall. Here is what the data shows and what pricing teams should do about it.
TabPFN v2 (Nature 637:319–326, 2025) does zero-shot prediction on datasets up to 10K rows. Here is what that actually means for the pricing segments where your current models ar...
Tab-TRM (arXiv:2601.07675) is a 14,820-parameter recursive model that beats CatBoost on French MTPL while connecting to GLM theory. We explain the architecture, the numbers, and...
Applying CANN + NID to severity (Gamma) GLMs. Why the signal is weaker than frequency, what configuration changes are needed, and when a severity interaction is worth adding.
The complete PS21/5 compliance workflow: CATE estimation with insurance-causal, ENBP-constrained optimisation with insurance-optimise, fairness audit with insurance-fairness, an...
Chevalier & Côté (EAJ 2025) benchmark nine GBM variants on five insurance datasets. We read it so you don't have to, then show where insurance-distributional fits in.
FCA MS24/2 (February 2026) means pricing teams now own the APR question. Here is how to treat it as a pricing problem — with the same tools used for the insurance itself.
How UK home insurers should model physical climate risk: UKCP18 projections, Flood Re's 2039 exit, ABI claims data, and practical code using insurance-whittaker, insurance-confo...
The FCA has explicitly flagged pet insurance for monitoring in its 2026 regulatory priorities. FOS complaint upheld rates hit 52% in Q1 2025 — the highest of any UKGI business l...
SOA and CAS research from late 2025 has sharpened the methods for calibrating parametric triggers and quantifying basis risk. Here is what that means in practice for UK flood an...
NSGA-II finds the non-dominated pricing strategies across accuracy, group fairness, and counterfactual fairness simultaneously. TOPSIS turns that front into an auditable regulat...
How to detect when a motor book has hit the floor of its underwriting cycle — using PSI on new business mix, segment-level A/E, Gini stability, and mSPRT to know when the next m...
A 5pp Gini improvement means nothing to a CFO. The Loss Ratio Error framework from arXiv:2512.03242 converts model correlation into expected loss ratio — and from there into pou...
Build a double-lift chart to compare GLM vs GBM predictions. Bin by prediction ratio, compute A/E per decile, plot with matplotlib. Standard tool for pricing committee model val...
The average treatment effect hides a 5x spread in price elasticity across a UK motor book. GATES, CLAN, and RATE tell you the size, who's who, and whether the ranking is actiona...
The FCA's interim report on MS24/1 landed in January 2026 with a Q3 2026 final report expected. Here is what pure protection pricing teams need to build before that lands.
FCA Consumer Duty PRIN 2A requires insurers to tell policyholders what they can change to get a better outcome. Most pricing teams have not built this. insurance-recourse does i...
A covariate that predicts mean severity well may tell you almost nothing about your 99th percentile claims. Here is how to identify which rating factors actually drive large los...
BCG's 2025 analysis puts embedded insurance at 30% CAGR. The pricing architecture question is not whether to do it - it's whether your model can answer in under 100ms without co...
Why GLM coefficients aren't causal effects, and how to fix that using insurance-causal: DML with CatBoost nuisances, causal forests for heterogeneous treatment effects, and DiD/...
arXiv:2504.16592 formalises what pricing teams have been quietly observing for years: autonomous pricing algorithms can converge to supra-competitive prices without any firm eve...
Laub, Pho and Wong's ANAM paper enforces smoothness and monotonicity architecturally, not as penalties. Here is what the mechanism actually is, why it matters more than the benc...
Standard Tweedie GLMs handle zeros implicitly. When that implicit handling breaks — specialty lines, niche segments, specific peril models — you need ZIP or hurdle models. Here ...
How to implement walk-forward cross-validation for insurance GLMs in Python using insurance-cv. Covers IBNR buffers, fold design, and a full worked example on freMTPL2-style mot...
Every insurance team checks their champion/challenger results monthly. Every month you look, you inflate the false positive rate. Here is how to do it correctly using sequential...
Monthly peeking at champion/challenger results with a t-test inflates your false positive rate to ~25%. The mixture SPRT (Johari et al. 2022) is an e-process: valid at every int...
PS25/21 abolished the mandatory 12-month product review cycle in December 2025. Harm-proportionate review cadence is now the requirement. Here is what that means for actuarial g...
How the Ogden discount rate and Periodical Payment Orders change the maths of large BI pricing in the UK — with Python code to calculate lump sum equivalents, discount PPO cash ...
UK motor bodily injury severity has outrun CPI since 2022. This post implements a multiplicative severity separation model and Whittaker-Henderson smoothing in Python to separat...
Migrating from Emblem to Python for insurance GLM pricing: what changes in workflow, what gets easier, what gets harder, and what the transition actually looks like in practice.
Step-by-step: extract CatBoost factor tables with shap-relativities and write a clean Excel file with openpyxl. Formatted output ready to paste into Radar or Emblem.
The Python equivalent of the IFoA MLR Working Party's R tutorial: Poisson GLM baseline, EBM GAM, and CatBoost GBM on UK motor data, with the full pipeline from data to governance.
A practical statsmodels tutorial for pricing actuaries: Poisson frequency model with exposure offset, Gamma severity model, overdispersion tests, factor table extraction, and A/...
Extract the calendar-year inflation component from a claims development triangle using Taylor's two-factor separation. Python from scratch, then connect to severity trending.
CPI-adjusting your historical claims data before fitting a pricing model introduces systematic bias. How to apply line-specific inflation indices for motor and home insurance in...
Which GLM assumptions actually matter for insurance pricing, which ones you routinely violate without consequence, and the diagnostics worth running before signing off a product...
Build a burning cost model in Python: frequency-severity split, exposure offsets, large loss capping, IBNR adjustment, and combined pure premium for UK pricing.
Build a CatBoost frequency-severity pricing model on freMTPL2 using Polars. Poisson frequency, Gamma severity, combined burning cost, SHAP factor extraction, and distillation to...
Insurance walk-forward cross-validation prevents the look-ahead bias that makes standard k-fold results useless for prospective evaluation. Complete Python example with insuranc...
Active Consumer Duty investigations in home and travel insurance. What a defensible pricing model actually requires under PRIN 2A, and what the FCA thematic review said about mo...
EconML is the standard Python library for causal ML. It was not built for insurance pricing, Poisson/Gamma exposure models, or the dual-selection bias problems specific to renew...
DoWhy is the most rigorous general-purpose causal inference library in Python — DAG specification, formal identification, refutation tests. It was not built for insurance pricin...
How to run covariate shift detection as a recurring monthly check: monitoring cadence, ESS ratio trends, and the thresholds that trigger a retraining...
Three interpretable architectures for UK insurance pricing: EBM, ANAM, and PIN via insurance-gam. Refuse the GLM-vs-GBM accuracy trade-off with factor tables.
CausalForestDML separates causal price effect from risk-lapse correlation in UK motor renewal. insurance-elasticity - per-customer CATE and ENBP optimiser.
Champion/challenger with ICOBS 6B.2.51R compliance for UK insurers. SHA-256 routing, SQLite logging, bootstrap LR tests, SMF-signable report - insurance-deploy.
Two-stage CatBoost plus REML random effects for UK insurance broker adjustments. insurance-multilevel - Bühlmann-Straub credibility weighting, not guesswork.
Continuous-time HMM for telematics risk scoring in UK motor pricing. Latent driving regimes from GPS data - actuarially interpretable features for Poisson GLM.
TabPFN and TabICLv2 for thin-segment UK insurance pricing. In-context learning at inference, no gradient descent. insurance-thin-data wraps both for actuaries.
Pairwise Interaction Networks produce exact tabulatable 2D rating factor surfaces, not SHAP approximations. Beats GBMs on French MTPL benchmark. Python.
Covariate-conditioned IBNR completion by risk segment using ML-EM algorithm. insurance-nowcast corrects aggregate LDF bias from your actual recent risk mix.
Joint conformal prediction sets for frequency and severity in UK insurance. Fan and Sesia coordinate-wise standardisation - simultaneous coverage across both.
EVT for UK motor large loss pricing: censored GPD for open TPBI claims, profile likelihood CIs, excess layer pure premiums. insurance-evt Python library.
Distributionally robust rate optimisation: worst-case demand within a Wasserstein ball. Price-of-robustness curve for UK pricing committee papers - Python.
Shared-trunk neural model for frequency-severity dependence in UK motor pricing. Explicit dependence testing where two-part GLMs assume independence - Python.
Conformal risk control for UK insurance: coverage calibrated to financial shortfall, not miscoverage rate. insurance-conformal - beyond standard intervals.
Logistic regression treats all non-lapsers the same. Mixture cure models split them into two groups: structural non-lapsers who will never leave, and...
Doubly robust TMLE for insurance pricing with Poisson outcomes and exposure offsets. insurance-tmle - first Python library with the implementation AIPW lacks.
Bandit algorithms for FCA GIPP-compliant price experimentation in UK general insurance. ENBP constraints and compliance reporting built in - insurance-online.
GARCH for UK insurance claims inflation: time-varying variance in trend analysis. insurance-garch - Engle (1982) applied to actuarial trend and pricing models.
Vine copulas for multi-peril UK home pricing. Flood-subsidence correlation costs ~9% in mispriced revenue. insurance-copula: BIC selection, PML simulation.
Fine-Gray subdistribution hazard for UK insurance competing risks. Separates lapse, MTC, and NTU correctly - insurance-survival Python, not naive censoring.
Causal Forests with Fixed Effects for UK insurance panel data. Rate change evaluation by segment - beyond before-and-after loss ratios. causalfe Python.
Bayesian Causal Forests for heterogeneous lapse effects in UK insurance pricing. Segment-level elasticity with posteriors - insurance-bcf wrapping stochtree.
Transfer learning for thin-segment UK insurance pricing: Tian-Feng GLM algorithm, CatBoost source-as-offset, CANN fine-tuning, negative transfer diagnostics.
Regression Discontinuity Design tests whether UK motor risk drops at age 25. Exposure-weighted Poisson outcomes, geographic boundaries, Consumer Duty output.
GLMTransfer borrows statistical strength from a related source book to price thin target segments. Motor-to-fleet, home-to-landlord, and fleet roll-outs.
A 12% rate increase on young motor drivers. An 8% lapse spike three months later. Here is how to tell whether the rate change caused it — using synthetic difference-in-differences.
ICC diagnostics for multiple group factors in insurance pricing. When broker, scheme, fleet, and postcode sector effects are worth modelling with REML...
Per-risk large loss loadings for UK home insurance using quantile GBMs. Avoids the flat-loading trap by making the loading a function of the risk itself.
How to combine GLM and GBM predictions for production pricing: cross-validated blend weights, PRA interpretability, and when blending actually helps. Once the blended model is v...
Bühlmann-Straub vs CatBoost vs two-stage multilevel for UK motor pricing: when each wins and how insurance-credibility and insurance-multilevel combine them.
Assumes familiarity with the Murphy decomposition framework. Focuses on the operational question: given a monitoring alert, how do you read GMCB vs LMCB...
How to convert raw telematics trip data into GLM-ready features for UK motor pricing. Covers HMM state segmentation and score calibration to GLM relativities.
PRA SS1/23 requires quantitative pass/fail tests, not narrative. insurance-governance automates the full validation suite and generates auditable HTML reports.