Validation

12 articles in this topic

27 Mar 2026

Does Automated Model Monitoring Actually Work?

We planted three simultaneous model failures in a 50,000-policy UK motor book. The aggregate A/E never triggered. The library detected the first problem after 1,500 policies. He...

26 Mar 2026

Does Conformal Prediction Actually Work for Insurance Claims?

Parametric Tweedie intervals under-cover high-risk policies by 10–15 percentage points. We tested conformal prediction on 50,000 UK motor policies to find out whether the fix act...

25 Mar 2026

Does DML Causal Inference Actually Work for Insurance Pricing?

We ran Double Machine Learning against a naive GLM on a 50,000-policy UK motor telematics book. The GLM overestimated the treatment effect by 50–90%. Here is what that means for...

24 Mar 2026

Does insurance-gam actually work for insurance pricing?

Benchmark results on a known-DGP synthetic UK motor book. EBM beats the GLM by 35 Gini points. But the deviance number is misleading. We explain why, and when you should care.

24 Mar 2026

Does HMM telematics scoring actually work for insurance pricing?

Benchmark results on a known-DGP synthetic UK motor fleet. HMM state fractions deliver 5–10pp Gini lift over simple aggregates. State classification recovers >50% of true high-r...

24 Mar 2026

Does GBM-to-GLM Distillation Actually Work for Insurance Pricing?

Honest benchmark: does fitting a surrogate GLM on CatBoost pseudo-predictions recover more discriminatory power than a direct GLM? We test it on 30,000 synthetic UK motor policies.

23 Mar 2026

Does Whittaker-Henderson smoothing actually work for insurance pricing?

Benchmark results on a known-DGP synthetic UK motor age curve. REML recovers the true frequency well in the data-rich middle. The tails are a different story. Numbers, not claims.

23 Mar 2026

Does Sarmanov copula frequency-severity modelling actually work for insurance pricing?

We read the source, ran the benchmark, and checked the claim: the independence assumption in standard two-part GLMs is wrong for UK motor, and this library corrects it analytica...

23 Mar 2026

Does automated model monitoring actually work for insurance pricing?

Aggregate A/E at 0.94 looks fine. The model has been mispricing under-25s for eight months. Benchmark results on a synthetic UK motor book with three planted failure modes.

23 Mar 2026

Does DML causal inference actually work for insurance pricing?

We ran the benchmarks. On a synthetic UK motor book with nonlinear confounding, naive logistic GLM overestimates the telematics treatment effect by 50–90%. DML recovers the grou...

23 Mar 2026

Does conformal prediction actually work for insurance pricing?

Benchmark results on a known-DGP synthetic motor book. Conformal hits 90% across all deciles. Parametric Tweedie under-covers the top decile by 10–15pp. Numbers, not theory.

23 Mar 2026

Does Bühlmann-Straub credibility actually work for insurance pricing?

Benchmark results on 100 synthetic schemes with known true loss rates. Credibility blending reduces MSE by 25–35% vs the best naive alternative. Numbers, not theory.