Validation

23 articles in this topic

03 Apr 2026

PAVA in Three Places: Isotonic Regression for Insurance Pricing

The Pool Adjacent Violators Algorithm solves the isotonic (monotone) regression problem in O(N) time with no parametric assumptions. It appears in three distinct insurance pricing contexts: as the link fu...
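
As a rough illustration of the algorithm named in this teaser, here is a minimal pure-Python sketch of weighted pool-adjacent-violators for a non-decreasing fit. The function name `pava` and the toy inputs are assumptions made for illustration; this is not the code from the article or from any particular library.

```python
def pava(y, w=None):
    """Weighted least-squares non-decreasing fit via pool adjacent violators.

    Hypothetical helper for illustration only.
    """
    n = len(y)
    if w is None:
        w = [1.0] * n
    # Each block holds [weighted mean, total weight, number of points pooled].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Pool backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    # Expand the pooled blocks back to one fitted value per input point.
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


# Toy observed rates by ordered band (made-up numbers).
print(pava([1.0, 3.0, 2.0, 4.0]))
```

Each point is pushed once and each merge removes a block, so the total work is linear in the number of points.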
03 Apr 2026

Validating a Mixture Severity Model: When NE-GMM Earns Its Keep and When GammaGBM Still Wins

NeuralGaussianMixture is now in insurance-distributional v0.4.0. The question is not whether it can fit bimodal severity — it can. The question is whether your data actually nee...
28 Mar 2026

What PRA SS1/23 Validation Looks Like on Real Data: 677K French Motor Policies

Most governance tooling is tested on toy examples with clean DGPs and inflated Gini coefficients. We ran the full insurance-governance validation suite on 677K freMTPL2 policies...
28 Mar 2026

Does Whittaker-Henderson Smoothing Actually Work for Insurance Pricing?

We benchmarked Whittaker-Henderson against raw rates and a 5-point weighted moving average on a synthetic UK motor driver age curve with known truth. W-H reduces MSE by 57.2% vs...
28 Mar 2026

Does Sarmanov Copula Frequency-Severity Modelling Actually Work?

The standard UK motor pricing formula multiplies E[N] by E[S] and assumes independence. On a 15,000-policy benchmark with planted omega=3.5, that assumption understates portfoli...
28 Mar 2026

Does PSI Actually Catch Pricing Model Drift?

PSI detects covariate shift but not rank collapse. On a synthetic UK motor book where a new risk factor emerges post-deployment, PSI stays GREEN while Gini drops 8 points. The B...
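
For context, the Population Stability Index compares the binned distribution of a variable at two points in time, which is why it responds to covariate shift and says nothing about how well the model ranks risks. The sketch below uses the standard PSI formula; the function name `psi`, the `eps` floor for empty bins, and the example counts are illustrative assumptions, not the benchmark code from the article.

```python
import math


def psi(expected_counts, actual_counts, eps=1e-6):
    """Population Stability Index over pre-binned counts.

    PSI = sum over bins of (a_i - e_i) * ln(a_i / e_i), where e_i and a_i
    are the expected (reference) and actual (current) bin proportions.
    Hypothetical helper for illustration only.
    """
    if len(expected_counts) != len(actual_counts):
        raise ValueError("both inputs must use the same bins")
    e_total = sum(expected_counts)
    a_total = sum(actual_counts)
    value = 0.0
    for e_count, a_count in zip(expected_counts, actual_counts):
        # Floor proportions at eps so empty bins do not produce log(0).
        e = max(e_count / e_total, eps)
        a = max(a_count / a_total, eps)
        value += (a - e) * math.log(a / e)
    return value


# Made-up bin counts: reference period vs. current period.
print(psi([50, 50], [80, 20]))
```

Because the inputs are only marginal bin counts, two periods with identical feature distributions give a PSI of zero regardless of what has happened to the model's discrimination.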
28 Mar 2026

Does Proxy Discrimination Testing Actually Work?

Manual Spearman correlation missed postcode as an ethnicity proxy in 100% of 50 benchmark runs. CatBoost proxy R-squared caught it in 100% of runs. The difference is the non-lin...
28 Mar 2026

Does Monotonicity-Constrained EBM Actually Work for Insurance Pricing?

On a UK motor DGP with a monotone young-driver requirement, unconstrained EBM violates monotonicity in 31% of runs. Constrained EBM matches GLM monotonicity compliance at 100% w...
28 Mar 2026

Does HMM Telematics Risk Scoring Actually Work for Insurance Pricing?

HMM-derived driving state features improve Gini by 5–10 percentage points over raw trip averages on a state-structured DGP. The reason is temporal: the HMM knows that aggressive...
28 Mar 2026

Does Constrained Rate Optimisation Actually Work?

We benchmarked constrained portfolio optimisation against a uniform +7% rate change on a 2,000-policy UK motor book. The optimiser achieved the same GWP target with £4,000–8,000...
28 Mar 2026

Does Bühlmann-Straub Credibility Actually Work?

We benchmarked Bühlmann-Straub credibility against raw experience and manual Z-factors on a 30-segment synthetic UK motor fleet book with a known DGP. On thin schemes, it reduce...
28 Mar 2026

Does Automatic Lambda Selection for Whittaker-Henderson Actually Work?

REML-selected lambda beats manual tuning on a 63-band age curve benchmark: 22% lower MSE on thin tail bands, zero analyst discretion, and principled credible intervals. The hone...
27 Mar 2026

Does Automated Model Monitoring Actually Work?

We planted three simultaneous model failures in a 50,000-policy UK motor book. The aggregate A/E never triggered. The library detected the first problem after 1,500 policies. He...
26 Mar 2026

Does Conformal Prediction Actually Work for Insurance Claims?

Parametric Tweedie intervals undercover high-risk policies by 10–15 percentage points. We tested conformal prediction on 50,000 UK motor policies to find out whether the fix act...
25 Mar 2026

Does DML Causal Inference Actually Work for Insurance Pricing?

We ran Double Machine Learning against a naive GLM on a 50,000-policy UK motor telematics book. The GLM overestimated the treatment effect by 50–90%. Here is what that means for...
24 Mar 2026

Does insurance-gam actually work for insurance pricing?

Benchmark results on a known-DGP synthetic UK motor book. EBM beats the GLM by 12.6 Gini points (0.455 vs 0.329). But the deviance number is misleading. We explain why, and when...
24 Mar 2026

Does HMM telematics scoring actually work for insurance pricing?

Benchmark results on a known-DGP synthetic UK motor fleet. HMM state fractions deliver 5–10pp Gini lift over simple aggregates. State classification recovers >50% of true high-r...
24 Mar 2026

Does GBM-to-GLM Distillation Actually Work for Insurance Pricing?

Honest benchmark: does fitting a surrogate GLM on CatBoost pseudo-predictions recover more discriminatory power than a direct GLM? We test it on 30,000 synthetic UK motor policies.
23 Mar 2026

Does Whittaker-Henderson smoothing actually work for insurance pricing?

Benchmark results on a known-DGP synthetic UK motor age curve. REML recovers the true frequency well in the data-rich middle. The tails are a different story. Numbers, not claims.
23 Mar 2026

Does automated model monitoring actually work for insurance pricing?

Aggregate A/E at 0.94 looks fine. The model has been mispricing under-25s for eight months. Benchmark results on a synthetic UK motor book with three planted failure modes.
23 Mar 2026

Does DML causal inference actually work for insurance pricing?

We ran the benchmarks. On a synthetic UK motor book with nonlinear confounding, naive logistic GLM overestimates the telematics treatment effect by 50–90%. DML recovers the grou...
23 Mar 2026

Does conformal prediction actually work for insurance pricing?

Benchmark results on a known-DGP synthetic motor book. Conformal hits 90% across all deciles. Parametric Tweedie under-covers the top decile by 10–15pp. Numbers, not theory.
23 Mar 2026

Does Bühlmann-Straub credibility actually work for insurance pricing?

Benchmark results on 100 synthetic schemes with known true loss rates. Credibility blending reduces MSE by 25–35% vs the best naive alternative. Numbers, not theory.