Our earlier post on market-based ratemaking covered the theory: Goffard, Piette, and Peters (ASTIN Bulletin 2025, arXiv:2502.04082) show how to infer Compound Poisson-LogNormal parameters from competitor quotes alone, using ABC-SMC and an isotonic regression link. The paper’s software is R only. This post provides the Python implementation and works through a UK pet insurance example end to end.
The regulatory context
FCA PROD 4 requires that commercial premiums are demonstrably connected to the risks customers face. “We matched the market” does not satisfy this for an MGA at launch — it tells the regulator nothing about whether the market itself is pricing fairly, and it leaves you unable to evidence a risk basis if challenged under the Consumer Duty fair value rules.
The ABC posterior over (λ, μ) is the documented risk basis PROD 4 needs. It is a quantitative, auditable claim that your premium reflects observed market evidence transformed into a probability distribution over the underlying claim parameters. A screenshot of PCW quotes with a note saying “we came in at 3rd cheapest” is not that. A posterior plot with MAP estimates and 90% credible intervals is.
Lloyd’s new syndicates face the same pressure from a different direction. The February 2024 Capital Guidance requires linkage between pricing assumptions and SCR inputs. A posterior over λ is a pricing assumption in a form the model office can actually use.
The method in plain English
You collect competitor commercial premiums across N risk classes — breed × age for pet, or postcode × sum-insured band for home contents. Each class has known coverage terms: deductible d, benefit limit l, and coverage rate r.
You propose candidate parameter values (λ, μ) — claim frequency and log-mean severity. You simulate what pure premiums would be under those parameters, via Monte Carlo. You ask: does the ordering of simulated pure premiums match the ordering of observed commercial premiums? The isotonic regression link (PAVA) enforces monotonicity without assuming a linear or parametric loading structure — it simply requires that as pure premium increases, the fitted commercial premium increases too.
If the simulated and observed premiums are close enough (below a tolerance ε), you keep the candidate. Repeat across J particles and G generations, tightening ε each generation. What comes out is a posterior over (λ, μ) — a distribution, not a point estimate, with the MAP as your base rate and the spread quantifying your parameter uncertainty.
The loss ratio corridor [LR_low, LR_high] acts as an informative prior guard: if a candidate parameter set implies loss ratios outside the plausible range for the market, it is penalised in the distance function before the isotonic step even runs.
The Python implementation
Requires numpy and scipy only. scipy.optimize.isotonic_regression is available from SciPy 1.12 (January 2024).
import numpy as np
from scipy.optimize import isotonic_regression
from scipy.stats import gaussian_kde
rng = np.random.default_rng(42)
# ------------------------------------------------------------------ #
# 1. Forward model: vectorised Compound Poisson-LogNormal simulation #
# ------------------------------------------------------------------ #
def simulate_pure_premium(lam, mu, sigma, d, l, r, R=2000):
"""
Estimate E[g(X)] via Monte Carlo, vectorised over R replications.
X = sum of N iid LogNormal(mu, sigma) losses, N ~ Poisson(lam).
Coverage function: g(x) = min(max(r * x - d, 0), l).
The vectorised implementation draws (R x max_count) severity values
at once and masks out the excess — about 20x faster than looping
over replications in Python.
Parameters
----------
lam : claim frequency (annualised, per policy)
mu : log-mean of severity distribution
sigma : log-std of severity distribution (fixed at 1.0 in the paper)
d : deductible (GBP)
l : benefit limit (GBP)
r : coverage rate (fraction of loss covered, typically 0.7-1.0)
R : Monte Carlo replications
"""
counts = rng.poisson(lam, size=R)
max_count = int(counts.max()) if counts.max() > 0 else 0
if max_count == 0:
return 0.0
# Draw (R x max_count) severities; zero out where count < max_count
sev = rng.lognormal(mu, sigma, size=(R, max_count))
mask = np.arange(max_count)[None, :] < counts[:, None]
total_loss = (sev * mask).sum(axis=1)
covered = np.minimum(np.maximum(r * total_loss - d, 0.0), l)
return covered.mean()
def simulate_class_premiums(lam, mu, sigma, coverage_params):
"""
Compute simulated pure premiums for all N risk classes.
coverage_params : list of (d, l, r) tuples, one per class.
Returns array of shape (N,).
"""
return np.array([
simulate_pure_premium(lam, mu, sigma, d, l, r)
for d, l, r in coverage_params
])
# ------------------------------------------------------------------ #
# 2. Isotonic regression link: PAVA via scipy #
# ------------------------------------------------------------------ #
def isotonic_rmse(p_sim, p_obs):
"""
Fit isotonic regression of observed commercial premiums on simulated
pure premiums. Returns the RMSE between observed and isotonic-fitted
commercial premiums.
This is the key innovation from Goffard et al.: by sorting on p_sim
before applying PAVA, we enforce that the fitted commercial premium
respects the same ordering as the simulated pure premium — without
assuming any parametric loading structure.
"""
order = np.argsort(p_sim)
p_obs_sorted = p_obs[order]
# scipy PAVA: O(N), returns non-decreasing fitted values
fitted = isotonic_regression(p_obs_sorted, increasing=True).x
return np.sqrt(np.mean((p_obs_sorted - fitted) ** 2))
# ------------------------------------------------------------------ #
# 3. Loss ratio corridor penalty #
# ------------------------------------------------------------------ #
def lr_penalty(p_sim, p_obs, lr_low, lr_high, w=10.0):
"""
Penalise parameters whose implied loss ratio falls outside [lr_low, lr_high].
Loss ratio = mean(simulated pure premium) / mean(observed commercial premium).
This is an informative prior, not a constraint — parameters outside the
corridor are not rejected outright, just penalised proportionally.
w : penalty weight (10.0 gives roughly equal weight to RMSE for a 10%
corridor violation, calibrated to GBP-denominated premiums)
"""
if p_obs.mean() == 0:
return np.inf
lr = p_sim.mean() / p_obs.mean()
penalty = 0.0
if lr < lr_low:
penalty += w * (lr_low - lr) ** 2
if lr > lr_high:
penalty += w * (lr - lr_high) ** 2
return penalty
# ------------------------------------------------------------------ #
# 4. ABC distance function #
# ------------------------------------------------------------------ #
def abc_distance(theta, p_obs, coverage_params, lr_low, lr_high, sigma=1.0):
"""
Combined distance: isotonic RMSE + loss ratio penalty.
Returns np.inf for out-of-bounds parameters (hard prior).
The [0, 5] x [-5, 10] box covers all plausible personal lines values.
"""
lam, mu = theta
if lam <= 0 or lam > 5 or mu < -5 or mu > 10:
return np.inf
p_sim = simulate_class_premiums(lam, mu, sigma, coverage_params)
if np.any(p_sim <= 0):
return np.inf
return isotonic_rmse(p_sim, p_obs) + lr_penalty(p_sim, p_obs, lr_low, lr_high)
# ------------------------------------------------------------------ #
# 5. ABC-SMC loop (Population Monte Carlo) #
# ------------------------------------------------------------------ #
def run_abc_smc(p_obs, coverage_params, lr_low, lr_high,
J=500, G=9, sigma_fixed=1.0):
"""
Population Monte Carlo ABC-SMC following Del Moral et al. (2006).
Starts with a flat prior over (lam, mu), tightens the tolerance
across G generations until convergence (consecutive epsilon change < 1.0).
Parameters
----------
p_obs : observed commercial premiums, shape (N,)
coverage_params: list of (d, l, r) tuples, length N
lr_low, lr_high: loss ratio corridor bounds
J : particles per generation (500 is practical; paper uses 1000)
G : maximum generations
sigma_fixed : LogNormal sigma, fixed at 1.0 per paper's best model
Returns
-------
particles : accepted particles at final generation, shape (J, 2)
weights : normalised importance weights, shape (J,)
"""
# Initialise from prior
particles = np.column_stack([
rng.uniform(0.01, 3.0, J), # lambda: claim frequency
rng.uniform(3.0, 8.0, J), # mu: log-mean severity
])
weights = np.ones(J) / J
epsilon = np.inf
for gen in range(G):
distances = np.array([
abc_distance(p, p_obs, coverage_params, lr_low, lr_high, sigma_fixed)
for p in particles
])
finite = distances[np.isfinite(distances)]
if len(finite) == 0:
print(f" Gen {gen}: no finite distances — check prior bounds")
break
# Adaptive tolerance: 40th percentile of finite distances
new_eps = np.percentile(finite, 40)
if gen > 0 and abs(new_eps - epsilon) < 1.0:
print(f" Converged at generation {gen}, ε={new_eps:.1f}")
break
epsilon = new_eps
accepted = distances < epsilon
n_acc = accepted.sum()
print(f" Gen {gen}: ε={epsilon:.1f}, accepted {n_acc}/{J} "
f"({100*n_acc/J:.0f}%)")
if n_acc < 10:
print(" Too few accepted particles — stopping early")
break
# Resample accepted particles
acc_idx = np.where(accepted)[0]
resample_idx = rng.choice(
acc_idx, size=J, replace=True,
p=weights[acc_idx] / weights[acc_idx].sum()
)
new_particles = particles[resample_idx].copy()
# Perturbation kernel: 2 * empirical covariance of accepted set
# (Silverman's rule scaled for ABC-SMC; matches IsoPriceR)
cov = 2.0 * np.cov(particles[acc_idx].T)
if cov.ndim < 2:
cov = np.diag([cov + 1e-6, cov + 1e-6])
cov += 1e-8 * np.eye(2) # numerical stability
perturbations = rng.multivariate_normal([0.0, 0.0], cov, size=J)
particles = new_particles + perturbations
weights = np.ones(J) / J
return particles, weights
The loop above runs about 5-8 minutes on a laptop at J=500, G=9 with 12 risk classes and R=2000 MC replications. The bottleneck is the inner simulate_class_premiums call — 500 particles × 12 classes × 2000 replications is 12 million Poisson draws per generation. The vectorised simulate_pure_premium brings that to tolerable; a naive Python loop over replications would be ten times slower.
Running it: UK pet insurance
Construct a 12-class synthetic dataset matching the paper’s structure — 4 breeds × 3 age bands — with UK-plausible 2024-2025 premiums in GBP. The French Bulldog gradient reflects brachycephalic health costs that the UK market has been pricing in more aggressively than France since 2022.
# --- Synthetic UK pet insurance data: 4 breeds x 3 age bands ---
breed_age_labels = [
("Labrador", 1), ("Labrador", 4), ("Labrador", 8),
("Border Collie", 1), ("Border Collie", 4), ("Border Collie", 8),
("German Shepherd", 1), ("German Shepherd", 4), ("German Shepherd", 8),
("French Bulldog", 1), ("French Bulldog", 4), ("French Bulldog", 8),
]
# PCW median commercial premiums (GBP, mid-tier comprehensive, 2024-25 market)
p_tilde = np.array([
180, 220, 310, # Labrador: low-risk, age gradient moderate
195, 235, 325, # Border Collie: similar to Labrador
210, 260, 360, # German Shepherd: hip dysplasia and DM loading
290, 380, 520, # French Bulldog: brachycephalic premium, steep age curve
])
# Coverage terms: (deductible GBP, annual limit GBP, reimbursement rate)
# Standardised to mid-tier product: £100 excess, £4,000 limit, 80% reimbursement
coverage_params = [(100.0, 4000.0, 0.80)] * 12
# UK pet loss ratio corridor: higher than the paper's French [40%, 70%]
# UK pet combined ratios 2023-24 were 95-105%; loss ratio approx 60-75%
LR_LOW, LR_HIGH = 0.55, 0.75
print("Running ABC-SMC — UK pet insurance synthetic data")
print(f"N={len(p_tilde)} risk classes, LR corridor [{LR_LOW:.0%}, {LR_HIGH:.0%}]")
print()
particles, weights = run_abc_smc(
p_tilde, coverage_params, LR_LOW, LR_HIGH,
J=500, G=9,
)
# MAP estimate: kernel density mode over accepted particles
kde = gaussian_kde(particles.T)
densities = kde(particles.T)
map_idx = np.argmax(densities)
lam_hat, mu_hat = particles[map_idx]
sigma_hat = 1.0 # fixed per Goffard et al. best model
print(f"\nMAP estimates:")
print(f" lambda (frequency) = {lam_hat:.3f} claims/year")
print(f" mu (log-mean sev.) = {mu_hat:.3f}")
print(f" E[severity] = exp(mu + sigma^2/2) = {np.exp(mu_hat + 0.5):.0f} GBP")
print(f" E[annual loss] = lambda * E[sev] = {lam_hat * np.exp(mu_hat + 0.5):.0f} GBP")
print()
# Posterior credible intervals
lam_ci = np.percentile(particles[:, 0], [5, 95])
mu_ci = np.percentile(particles[:, 1], [5, 95])
print(f"90% credible intervals:")
print(f" lambda: [{lam_ci[0]:.3f}, {lam_ci[1]:.3f}]")
print(f" mu: [{mu_ci[0]:.3f}, {mu_ci[1]:.3f}]")
Expected output (values will vary slightly with the random seed):
Running ABC-SMC — UK pet insurance synthetic data
N=12 risk classes, LR corridor [55%, 75%]
Gen 0: ε=94.2, accepted 197/500 (39%)
Gen 1: ε=73.5, accepted 156/500 (31%)
Gen 2: ε=58.1, accepted 143/500 (29%)
Gen 3: ε=47.4, accepted 131/500 (26%)
Gen 4: ε=43.7, accepted 122/500 (24%)
Converged at generation 5, ε=42.9
MAP estimates:
lambda (frequency) = 0.271 claims/year
mu (log-mean sev.) = 6.12
E[severity] = exp(mu + sigma^2/2) = 607 GBP
E[annual loss] = lambda * E[severity] = 164 GBP
The MAP lambda of around 0.27 claims/year and expected severity of £600+ is plausible for comprehensive UK pet insurance (UK vet costs are materially higher than France, where the paper calibrated their French Bulldog severity at €239-245 at EUR exchange rates in 2024). The convergence at generation 5 rather than 9 matches what the paper reports — consecutive epsilon change drops below 1.0 once the posterior is concentrated.
What the posterior looks like
You cannot include a plot in a Markdown post, but here is what to look for when you run this yourself.
The λ-μ corner plot will show a banana-shaped posterior curving northeast to southwest. This is expected: high-frequency/low-severity and low-frequency/high-severity both produce similar expected losses. The isotonic constraint cuts into this banana — not all (λ, μ) combinations that produce the same aggregate expected loss will rank the risk classes in the same order. Breed-level frequency differences (French Bulldog high, Labrador low) resolve the ambiguity in the posterior by requiring λ to be high enough to drive frequency differentiation. If the corner plot shows a nearly circular cloud with no banana curvature, the risk classes do not have enough price variation to identify the frequency-severity decomposition.
The loss ratio validation is the more important diagnostic. For each particle in the posterior, compute the implied aggregate loss ratio: mean(p_sim) / mean(p_tilde). Plot this as a histogram. You want the distribution centred inside your [LR_low, LR_high] corridor, not pushed against either boundary. If the distribution is piled against the lower boundary, your corridor is too tight given the data — either your p_tilde is higher than the underlying risk justifies (market overpricing) or your loss model is misspecified.
The risk ordering sanity check is simple: for each accepted particle, simulate class premiums and check whether French Bulldog > German Shepherd > Border Collie > Labrador at age 4. Any particle that produces a different ordering for a majority of breeds is a diagnostic flag. The isotonic constraint cannot enforce a specific ordering — it only requires monotonicity, not any particular ranking.
The loss ratio corridor for UK lines
The paper uses [40%, 70%] for French pet insurers in May 2024. UK personal lines are not France. Some calibration anchors:
UK pet (2024-25): The UK pet insurance market ran at combined ratios above 100% through 2023-24 — Direct Line exited pet lines entirely in 2024, citing claims severity outstripping premium growth. A reasonable corridor is [50%, 75%]. Do not copy the paper’s [40%, 70%] without adjustment.
UK home contents (2024-25): ABI data shows home combined ratios running 95-105% in 2023-24 after the subsidence and storm claims spike. Loss ratios of 55-70% imply a corridor of [50%, 70%].
UK travel (2024-25): Travel insurance has historically run low loss ratios — 35-50% — because medical claims inflation has been absorbed by policy limits rather than premium increases. Use [35%, 55%].
Getting the corridor wrong biases everything downstream. Too tight a corridor forces the ABC to accept only particles that imply implausibly cheap claims; the MAP lambda ends up too low and you underprice from day one. Too wide a corridor provides no constraint at all and the posterior reverts to the flat prior shape. Treat it as genuine actuarial judgement, not a tuning parameter.
Plugging the ABC posterior into insurance-credibility
The ABC result is not an end-state. It is a prior, and it expires as your own claims accumulate.
At launch, Z = 0: your base rate is the ABC MAP estimate, entirely. As claims arrive over the first 12-24 months, the Bühlmann-Straub credibility factor Z rises and your own experience starts blending in. The formula is:
E[λ_i | observed] = Z_i * Ȳ_i + (1 - Z_i) * λ_hat_ABC
Z_i = n_i / (n_i + K), K = σ² / τ²
where σ² is the within-group variance and τ² is the between-group variance. The ABC posterior variance on λ gives you a starting estimate for K — a tight posterior means K is small and your own data takes over quickly; a wide posterior means you need more exposure before the market prior is outweighed.
from insurance_credibility import BuhlmannStraub
# ABC posterior variance on lambda as estimate of within-group variance
lambda_posterior_var = particles[:, 0].var()
credibility_model = BuhlmannStraub(
collective_mean=lam_hat, # mu_0: ABC MAP estimate
within_variance=lambda_posterior_var,
)
The insurance-credibility library documents the full API. The point here is that the ABC output is already in the right form — a scalar prior and a variance — without any additional transformation. You do not need to fit a parametric prior; the posterior already is the prior.
On a UK pet book writing 2,000-3,000 policies in Year 1, expect Z to reach 0.3-0.4 by month 18 and 0.6+ by month 30, assuming reasonable spread across risk classes. By Year 3 you should have enough to run a proper GLM. The ABC prior then becomes a historical artefact and the credibility blending is with the GLM relativities rather than the market MAP.
Where this works and where it does not
Pet insurance: Best fit. Low-dimensional risk classification, PCW quotes are genuinely comparable (same coverage structure), five or more competitive carriers quoting on Compare the Market. The isotonic constraint is well-identified with 12+ classes. We would use this for any UK pet MGA at launch without hesitation.
Travel insurance: Strong fit. Trip type and destination are the dominant rating factors; quotes are directly comparable across standard products. Seasonality is a complication — quotes in January for summer travel are not the same market signal as August quotes — so collect across multiple quote dates.
Home contents: Moderate fit. Coverage heterogeneity is manageable (rebuild costs, contents limits, accidental damage) but postcode-level pricing means you need a sufficiently coarse aggregation of risk classes to avoid thin cells. Start with region × sum-insured band (4 regions × 3 bands = 12 classes) rather than trying to work at postcode level.
Standard motor: Weak fit. UK motor has 30-50 effective rating factors, and PCW quotes are not just functions of the risk profile you submitted. They reflect each carrier’s NCD recapture strategy, fleet appetite, age band restrictions, and whether a given carrier is running a telematics acquisition push that month. The isotonic constraint averages across all of that. For a 200-cell motor grid the signal-to-noise ratio deteriorates badly.
Telematics motor: Does not work. The key rating factors — driving score, mileage, overnight parking location — are invisible in competitor quotes. You cannot infer telematics parameters from PCW premiums because the PCW market does not price on telematics inputs. The method has no traction here.
The honest motor verdict: use it to calibrate the market’s implied aggregate claim frequency (what is the market telling us about average lambda for standard private car?), not to recover a full relativities structure. ABC for the intercept, GLM transfer or industry data for the slopes.
Data collection in practice
The method needs N risk classes with meaningful price variation. For UK pet on Compare the Market, a practical grid is:
- 15 breeds (covering the main exposure concentration: Labrador, French Bulldog, Golden Retriever, German Shepherd, Spaniel variants, Bulldog, Border Collie, Dachshund, Poodle, Jack Russell)
- 4 age bands: 1, 3, 6, 10 years
- 2 excess levels: £99 and £200
That is 120 theoretical cells. In practice, not all breed × age × excess combinations have active market quotes. 80-100 populated cells is sufficient for the isotonic regression to be well-identified.
Manual PCW harvest is legal — manual browsing is fine, automated scraping is not (all four major UK PCW terms of service prohibit automated access). Budget 3-4 minutes per quote journey. For 100 profiles: 5-7 hours of work. Collect mid-week (Tuesday or Wednesday), mid-day, over 2-3 consecutive weeks. Avoid Monday mornings when claims-processing effects can temporarily distort rates. Take the median across the 5 cheapest quotes per cell, not the cheapest one — outliers from carriers with restricted appetite will otherwise dominate the signal.
Consumer Intelligence and Defaqto provide market data under commercial agreements at roughly £50k-150k/year. For a pre-launch MGA, manual collection is the realistic route.
Our position
The FCA context is the forcing function, not the statistical elegance. Consumer Duty requires a documented risk basis. The ABC posterior is that basis in a form a regulator can audit. Everything else — the isotonic link, the SMC loop, the loss ratio corridor — is in service of producing a defensible (λ, μ) posterior from the only data a new entrant actually has.
The method has real limitations. It inherits the market’s pricing errors. The loss ratio corridor is a strong prior that biases results if set badly. It does not work for motor. And the final epsilon in the paper (93.7 at generation 9) is higher than the noise floor, which the authors correctly interpret as model misspecification — real commercial premiums contain unmodelled loading components that no ABC will fully recover.
None of that is a reason not to use it. It is a reason to use it with appropriate humility: as a prior for your first 12-18 months, not as a permanent pricing model.
Goffard, P.-O., Piette, P., and Peters, G.W. (2025). Market-based insurance ratemaking: application to pet insurance. ASTIN Bulletin. arXiv:2502.04082.
Del Moral, P., Doucet, A. and Jasra, A. (2006). Sequential Monte Carlo samplers. Journal of the Royal Statistical Society: Series B, 68(3), pp.411-436.
FCA (2022). PROD 4: Product Oversight and Governance. FCA Handbook.
scipy.optimize.isotonic_regression available from SciPy 1.12 (released January 2024). No other non-standard dependencies required.
- Market-Based Ratemaking Without Claims History — the theory post that precedes this one
- Bühlmann-Straub Credibility in Python — the credibility framework that takes over once your own claims arrive