Machine Learning

31 articles in this topic

03 Apr 2026

Your SHAP Feature Importance Bar Chart Has No Error Bars — Here Is How to Fix That

A February 2026 paper provides the first statistically valid confidence intervals for global SHAP feature importance. We explain what changes for UK insurance pricing teams, whe...
02 Apr 2026

RobustMMDGLM: Severity Modelling When Gamma GLMs Break

insurance-glm-tools v0.2.0 ships RobustMMDGLM — a Gamma GLM that automatically downweights large losses and selects features via L1, based on Kang & Kang (2026). Replaces ad-hoc...
02 Apr 2026

Locally Adaptive Score Selection for Conformal Intervals: insurance-conformal v1.2.0

insurance-conformal v1.2.0 adds LCPModelSelector: locally adaptive conformal model and score selection that gives per-prediction tighter intervals while maintaining coverage gua...
02 Apr 2026

Genetic GAMs: An Interesting Idea That EBM Already Solved

Shankar & Cohen automate GAM structure search using NSGA-II evolutionary algorithms. The idea is legitimate; the problem is that EBM already does this better for insurance prici...
02 Apr 2026

The Fairness Pareto Front: Why There Is No Single Dial

Pricing teams treat fairness as a single slider between accuracy and parity. NSGA-II reveals it is a landscape with multiple competing criteria. Here is what the Pareto front lo...
02 Apr 2026

Conformal Fairness When the Protected Attribute Is Missing

Kong, Liu & Yang prove that standard conformal coverage guarantees degrade unevenly when protected attributes are absent at test time. With post-ECJ gender prohibition and GDPR ...
02 Apr 2026

Your Conformal Intervals Lie About Some Policyholders

Marginal 90% coverage can hide severe undercoverage for specific risk profiles. ConditionalCoverageAssessor — new in insurance-conformal v1.2.0 — quantifies it with CVI, decompo...
02 Apr 2026

ANAM's Monotonicity Insight Is Real. The tensorflow_lattice Dependency Is Not.

Laub/Pho/Wong's Actuarial Neural Additive Model has a genuine architectural insight in PWLCalibration monotonicity. It also depends on an unmaintained TensorFlow library. EBM is...
01 Apr 2026

LDP Optimal Mechanism for Privatised Fairness Auditing in Insurance Pricing

k-ary randomised response applies symmetric noise — wasteful and fairness-suboptimal. The Ghoukasian-Asoodeh optimal mechanism is asymmetric: minority groups get a higher correc...
01 Apr 2026

Intersectional Fairness in Insurance Pricing: Why Auditing Age and Gender Separately Is Not Enough

A model can pass its age fairness audit and its gender fairness audit and still systematically overprice young women. This is fairness gerrymandering. We explain the CCdCov meas...
01 Apr 2026

Two Things Random Splits and Pearson Correlation Get Wrong in Insurance Data

insurance-cv v0.3.0 adds SupportPointSplit (distributional train-test splitting via energy distance minimisation) and ChatterjeeSelector (nonlinear feature screening using Chatt...
31 Mar 2026

Transfer Learning for Thin Portfolios: What Works, What Doesn't, and Why DANN Is the Wrong Tool

When you launch a new product with no claims history, you borrow from a related portfolio. Transfer learning formalises this. But the most-cited deep learning method for domain ...
31 Mar 2026

Text Embeddings for Insurance Pricing — When Do They Actually Help?

Text embeddings can compress 13,000 business categories into 24 useful features and add 2-3% Gini lift on commercial lines. For UK personal motor FNOL they do essentially nothin...
31 Mar 2026

Tab-TRM: Best on the Benchmark, Not the Right Starting Point

Tab-TRM sets the French MTPL benchmark at 23.589×10⁻² Poisson deviance, beating PIN ensemble by 0.3%. The linearisation result — Tab-TRM is approximately a state-space model — i...
31 Mar 2026

Sequential Optimal Transport for Multi-Attribute Fairness in Insurance Pricing

Naively applying Wasserstein barycenter corrections sequentially across multiple protected attributes is miscalibrated: the ECDF for attribute k was fitted on the original predi...
31 Mar 2026

Setting premiums at the 85th percentile: quantile premium pricing with neural networks

The quantile premium principle maps a single number — your risk appetite parameter tau — to per-risk safety loadings. Zanzouri et al. (NAAJ 2025) shows QRNN outperforms tree-bas...
31 Mar 2026

Proxy Discrimination Risk When LLM-Generated Features Enter Insurance Pricing Models

LLMs encode societal stereotypes. When you use one to generate rating features, those stereotypes enter your pricing model. The insurer is responsible. Here is the testing proto...
31 Mar 2026

LLM Feature Engineering for Insurance Pricing — What Actually Works

Three published frameworks use LLMs to generate tabular features and beat classical search tools on generic benchmarks. None has been tested on an actuarial dataset. We explain ...
31 Mar 2026

Focal Loss for Insurance Fraud Detection — Why You Should Use XGBoost Instead

Focal loss is a clever idea from computer vision that does not translate well to tabular insurance fraud data. AUC=0.63 from a three-stage focal loss neural network versus 0.75-...
31 Mar 2026

Federated Learning vs Credibility Theory: When Does the Complexity Pay Off?

FL solves the same variance-reduction problem as Bühlmann-Straub — but iteratively, with communication overhead, and without actuarial precedent. For UK personal lines, that tra...
31 Mar 2026

Conformal Prediction Intervals for Insurance Pricing Models

Parametric Tweedie intervals over-cover low-risk policies and under-cover the high-risk tail. Conformal prediction fixes this with a finite-sample guarantee that does not rely o...
31 Mar 2026

Conditional Coverage and Conformal Prediction Model Selection: CVI and CC-Select

Marginal coverage guarantees say nothing about which policyholders are being undercovered. CVI decomposes conditional coverage into undercoverage risk and overcoverage cost. CC-...
28 Mar 2026

Zero-Inflated Tweedie GBM for Insurance Pricing

insurance-distributional v0.3.0 ships ZeroInflatedTweedieGBM — the first open-source implementation of So & Valdez (2024) Scenario 2. When standard Tweedie gets structural zeros...
28 Mar 2026

Probabilistic Gradient Boosting for Insurance Pricing — Beyond Point Predictions

XGBoostLSS, LightGBMLSS, NGBoost, and PGBM can all output a full conditional distribution rather than a point prediction. The Chevalier & Côté benchmark (EAJ 2025) tested 11 alg...
26 Mar 2026

LLMs for Property Insurance: What They Can Actually Do

LLMs can extract structured data from surveys, flag non-standard construction in loss adjuster notes, and rate categorical variables at scale. They cannot reliably assess claim ...
25 Mar 2026

Reinforcement Learning for Individual Claims Reserving: What Avanzi, Richman, and Wüthrich Propose

Avanzi, Richman, and Wüthrich reformulate individual claims reserving as a Markov Decision Process. We explain why it matters, what it actually does, and when a UK reserving act...
25 Mar 2026

Actuarial Neural Additive Model: What the Paper Actually Does (arXiv:2509.08467)

Laub, Pho and Wong's ANAM paper enforces smoothness and monotonicity architecturally, not as penalties. Here is what the mechanism actually is, why it matters more than the benc...
23 Mar 2026

CatBoost vs XGBoost for Insurance Pricing

A practical comparison of CatBoost and XGBoost for UK personal lines insurance pricing — categorical handling, Tweedie support, and why we default to CatBoost.
13 Mar 2026

Actuarial Neural Additive Models: Exact Interpretability with Tweedie Loss

Actuarial Neural Additive Models for UK pricing: exact interpretability, Tweedie loss, guaranteed monotonicity. insurance-gam - beyond SHAP and EBM limitations.
13 Mar 2026

Frequency-Severity Dependence in UK Motor: A Shared-Trunk Neural Architecture

Shared-trunk neural model for frequency-severity dependence in UK motor pricing. Explicit dependence testing where two-part GLMs assume independence - Python.
09 Mar 2026

EBMs for Insurance Pricing: Better Than a GLM, Readable by a Pricing Committee

insurance-gam wraps EBM for UK pricing teams: Poisson/Tweedie loss, exposure offsets, RelativitiesTable, MonotonicityEditor, GLM comparison diagnostics.