Double machine learning, causal forests, TMLE, regression discontinuity, and synthetic difference-in-differences applied to pricing problems. When correlation isn't enough and you need to know what actually drives loss. 12 articles.
Orihara, Momozaki & Sugasawa (arXiv:2506.04868) produce a Bayesian posterior over the ATE by tilting the product of independent posteriors to satisfy the DR moment condition. We...
If you have called RetentionUpliftModel with outcome='survival', your model silently ran as binary. There is no pip-installable Python package for survival CATE. We explain the ...
Gevorgyan et al. propose exchangeable multi-task Gaussian processes for causal effect estimation in staggered-adoption designs. The method handles nonlinear trends that break SD...
Most retention models measure whether a customer lapses. surv-iTMLE measures when — and what your pricing intervention caused. We explain the estimand, why left truncation is mo...
DML removes the omitted variable bias that makes naive GLM price elasticity estimates wrong by 20–80%. We explain why it works, show the two core insurance applications — price ...
Standard conformal prediction breaks under instrumental variable regression — the calibration residuals are not exchangeable. Kato (arXiv:2603.25509, March 2026) fixes this by r...
Sun, Xie & Zhang (arXiv:2503.11375) combine parallel trends and synthetic control into a single estimator that remains consistent if either assumption holds. We explain the math...
Your GLM or GBM was trained on policyholders who chose to buy from you at your price. That is not a random sample of the market. The mechanism, what it does to your frequency an...
Most UK motor insurers think they know their price elasticity. They are probably wrong by a factor of 3–5, in the direction that leads them to misprice systematically. The eviden...
BonusMalus is built from past claims — a naive regression conflates the causal effect with selection. We ran Double Machine Learning on 677K freMTPL2 policies to isolate what Bo...
Ciganovic et al. (March 2026) show that standard DML cross-fitting leaks future information when your data is a time series. Their fix — Reverse Cross-Fitting — has direct impli...
Most UK insurers fit a logistic regression on PCW quote data and call it a demand model. It is biased in at least three distinct ways. Here is the causal structure that explains...
A new paper combines panel fixed effects, double machine learning, and instrumental variables. The headline result is not the estimator — it's that ML covariate adjustment frequ...
How to estimate a causally identified price elasticity from PCW quote data in Python, using commercial loading variation as an instrument and CatBoost nuisance models. The pract...
Li and Castro-Camilo (arXiv:2603.23309, March 2026) unify inverse probability weighting and extreme value extrapolation in a single estimating equation. Here is what it does, wh...
EconML is the standard Python library for causal ML. It was not built for insurance pricing, Poisson/Gamma exposure models, or the dual-selection bias problems specific to renew...
DoWhy is the most rigorous general-purpose causal inference library in Python — DAG specification, formal identification, refutation tests. It was not built for insurance pricin...
CausalForestDML separates the causal price effect from risk-lapse correlation in UK motor renewals. insurance-elasticity: per-customer CATE and ENBP optimiser.
DiD and Callaway-Sant'Anna for rate change attribution. insurance-causal-policy quantifies what your rate change actually achieved, with FCA Consumer Duty-aligned evidence output.
Doubly robust TMLE for insurance pricing with Poisson outcomes and exposure offsets. insurance-tmle - first Python library with the implementation AIPW lacks.
Causal Forests with Fixed Effects for UK insurance panel data. Rate change evaluation by segment - beyond before-and-after loss ratios. causalfe Python.
Bayesian Causal Forests for heterogeneous lapse effects in UK insurance pricing. Segment-level elasticity with posteriors - insurance-bcf wrapping stochtree.
Regression Discontinuity Design tests if UK motor risk drops at age 25. Exposure-weighted Poisson outcomes, geographic boundaries, Consumer Duty output.
A 12% rate increase on young motor drivers. An 8% lapse spike three months later. Here is how to tell whether the rate change caused it — using synthetic difference-in-differences.
Where double machine learning beats naive regression for insurance pricing — and where it does not. Benchmarks on 100,000-policy synthetic UK motor data with known ground truth....