v0.2.3
Extract multiplicative rating relativities from GBM models using SHAP values. Built for insurance pricing.
GBMs outperform GLMs on most insurance pricing datasets, but the standard output, a single predicted value per row, is not directly useful for pricing teams. Underwriters and actuaries work with relativities: a table of multipliers, one per feature level, that explains how each risk characteristic adjusts the base rate.
shap-relativities converts a trained CatBoost model into exactly
that format. The output is directly comparable to GLM exp(beta)
relativities: a Polars DataFrame of (feature, level, relativity)
triples where the base level is 1.0. Multiplying the base rate by the relativity
for each of a risk's feature levels gives the model's expected prediction.
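To make the multiplicative structure concrete, here is a minimal sketch of how a relativity table could be applied to price one risk. The base rate and relativity values are made-up illustrative numbers, not output from the library, and `price_risk` is a hypothetical helper that is not part of the package.

```python
from math import prod

# Illustrative numbers only: not real output from shap-relativities.
base_rate = 0.08

# (feature, level) -> relativity; base levels carry a relativity of 1.0.
relativities = {
    ("area", "London"): 1.00,
    ("area", "Leeds"): 0.85,
    ("ncd_years", 0): 1.00,
    ("ncd_years", 5): 0.60,
}


def price_risk(base_rate, relativities, risk):
    """Multiply the base rate by the relativity of each feature level of the risk."""
    return base_rate * prod(relativities[(f, lvl)] for f, lvl in risk.items())


base_risk = {"area": "London", "ncd_years": 0}
other_risk = {"area": "Leeds", "ncd_years": 5}

print(price_risk(base_rate, relativities, base_risk))   # a risk at base levels gets the base rate
print(price_risk(base_rate, relativities, other_risk))  # base rate scaled by 0.85 and 0.60
```

A risk that sits at the base level of every feature is priced at the base rate itself, which is what makes the table directly comparable to GLM exp(beta) output.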
```bash
uv add shap-relativities
# or, for CatBoost + plotting support:
uv add "shap-relativities[all]"
```
```python
from shap_relativities import SHAPRelativities

sr = SHAPRelativities(
    model=catboost_model,
    X=df.select(["area", "ncd_years", "vehicle_age"]),
    exposure=df["exposure"],
    categorical_features=["area", "ncd_years"],
)
sr.fit()

rels = sr.extract_relativities(
    normalise_to="base_level",
    base_levels={"area": "London", "ncd_years": 0},
)
print(rels)
```
Or use the one-shot convenience wrapper:
```python
from shap_relativities import extract_relativities

rels = extract_relativities(
    model, X,
    exposure=df["exposure"],
    categorical_features=["area"],
)
```
| Name | Description |
|---|---|
| `SHAPRelativities` | Main class. Wraps a trained model and feature matrix, computes SHAP values, and extracts relativities with CIs. |
| `.fit()` | Compute SHAP values via TreeExplainer. Must be called before extracting relativities. |
| `.extract_relativities()` | Return a Polars DataFrame of (feature, level, relativity, lower_ci, upper_ci, ...). |
| `.extract_continuous_curve()` | Smoothed relativity curve for a continuous feature (LOESS or isotonic). |
| `.validate()` | Diagnostic checks: reconstruction error, feature coverage, sparse levels. |
| `.baseline()` | `exp(expected_value)`: the annualised base rate in prediction space. |
| `.to_dict()` / `.from_dict()` | Serialise and restore a fitted instance without the original model. |
| `extract_relativities()` | Convenience function for one-shot use. Calls `fit()` and `extract_relativities()` internally. |
| `datasets.load_motor()` | Synthetic UK motor portfolio dataset with known true parameters for validation. |
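The reconstruction check listed under `.validate()` presumably rests on SHAP's additivity property: in log space, the expected value plus a row's SHAP values equals that row's log prediction, so `exp(expected_value)` times the product of the `exp(shap)` factors reproduces the prediction. The sketch below illustrates that identity with made-up numbers in plain Python. It is an assumption about what the diagnostic measures, not the library's implementation, and `reconstruction_error` is a hypothetical helper.

```python
from math import exp, log, prod


def reconstruction_error(expected_value, shap_rows, predictions):
    """Max relative gap between predictions and baseline * prod(exp(shap))."""
    baseline = exp(expected_value)
    worst = 0.0
    for shap_row, pred in zip(shap_rows, predictions):
        rebuilt = baseline * prod(exp(s) for s in shap_row)
        worst = max(worst, abs(rebuilt - pred) / pred)
    return worst


# Illustrative log-space values (hypothetical, not from a real model).
expected_value = log(0.08)
shap_rows = [
    [0.10, -0.30, 0.05],
    [-0.20, 0.00, 0.15],
]
# Predictions that satisfy additivity exactly: exp(expected_value + sum(shap)).
predictions = [exp(expected_value + sum(row)) for row in shap_rows]

print(reconstruction_error(expected_value, shap_rows, predictions))
```

An error near zero means the multiplicative decomposition is faithful to the model. A large error would suggest the SHAP values were computed on a different output scale than the predictions.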
| Column | Description |
|---|---|
| `feature` | Feature name |
| `level` | Feature level (category value or per-observation value for continuous features) |
| `relativity` | Multiplicative relativity. Base level = 1.0 |
| `lower_ci` | Lower bound of 95% CLT confidence interval |
| `upper_ci` | Upper bound of 95% CLT confidence interval |
| `mean_shap` | Exposure-weighted mean SHAP value in log space |
| `shap_std` | Weighted standard deviation of SHAP values |
| `n_obs` | Observation count for this level |
| `exposure_weight` | Total exposure weight for this level |
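As a rough illustration of how these columns relate, the sketch below computes an exposure-weighted mean SHAP value per level, converts it to a relativity against a base level, and attaches a normal-approximation 95% interval. The formulas are assumptions for illustration, in particular the effective-sample-size standard error and the choice to ignore uncertainty in the base level. The library's exact calculation may differ, and `level_stats` is a hypothetical helper.

```python
from math import exp, sqrt


def level_stats(shap_values, exposures):
    """Exposure-weighted mean, weighted std, and standard error of SHAP values."""
    total_w = sum(exposures)
    mean = sum(w * s for w, s in zip(exposures, shap_values)) / total_w
    var = sum(w * (s - mean) ** 2 for w, s in zip(exposures, shap_values)) / total_w
    # Kish effective sample size (an assumption): unequal exposures widen the interval.
    n_eff = total_w ** 2 / sum(w ** 2 for w in exposures)
    return mean, sqrt(var), sqrt(var / n_eff)


# Hypothetical log-space SHAP values for one feature: level -> (shap values, exposures).
data = {
    "London": ([0.02, -0.01, 0.00, 0.03], [1.0, 0.5, 1.0, 0.75]),
    "Leeds": ([-0.18, -0.15, -0.20], [1.0, 1.0, 0.5]),
}
base_mean, _, _ = level_stats(*data["London"])  # London is the base level

rows = []
for level, (shap_values, exposures) in data.items():
    mean, std, se = level_stats(shap_values, exposures)
    rows.append({
        "level": level,
        "relativity": exp(mean - base_mean),  # base level comes out as exactly 1.0
        "lower_ci": exp(mean - base_mean - 1.96 * se),
        "upper_ci": exp(mean - base_mean + 1.96 * se),
        "mean_shap": mean,
        "shap_std": std,
        "n_obs": len(shap_values),
        "exposure_weight": sum(exposures),
    })

for row in rows:
    print(row)
```

Working in log space and exponentiating at the end keeps the interval bounds positive and makes the relativity a ratio to the base level.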