The Tweedie Distribution

The Tweedie distribution is a Poisson mixture of gammas. It is an exponential family distribution [Jørgensen, 1997]. Tweedie distributions are a suitable model for pure premiums and are used as unit distributions in GLMs [McCullagh and Nelder, 2019]. Tweedie distributions do not have a closed form density, but estimating the density is easy using aggregate.

The Tweedie family of distributions is a three-parameter exponential family. A variable \(X \sim \mathsf{Tw}_p(\mu, \sigma^2)\) when \(\mathsf E[X] = \mu\) and \(\mathsf{Var}(X) = \sigma^2 \mu^p\), \(1 \le p \le 2\). Here \(p\) is a shape parameter and \(\sigma^2>0\) is a scale parameter called the dispersion.

A Tweedie with \(1<p<2\) is a compound Poisson distribution with gamma distributed severities. The limit when \(p=1\) is an over-dispersed Poisson and when \(p=2\) is a gamma. More generally: \(\mathsf{Tw}_0(\mu,\sigma^2)\) is normal \((\mu, \sigma^2)\), \(\mathsf{Tw}_1(\mu, \sigma^2)\) is over-dispersed Poisson \(\sigma^2\mathsf{Po}(\mu/\sigma^2)\), and \(\mathsf{Tw}_2(\mu,\sigma^2)\) is a gamma with CV \(\sigma\).

Let \(\mathsf{Ga}(\alpha, \beta)\) denote a gamma with shape \(\alpha\) and scale \(\beta\), with density \(f(x;\alpha,\beta)=x^{\alpha-1} e^{-x/\beta} / (\beta^\alpha \Gamma(\alpha))\). It has mean \(\alpha\beta\), variance \(\alpha\beta^2\), expected square \(\alpha(\alpha+1)\beta^2\) and coefficient of variation \(1/\sqrt\alpha\). We can define an alternative parameterization \(\mathsf{Tw}^*(\lambda, \alpha, \beta) = \mathsf{CP}(\lambda, \mathsf{Ga}(\alpha,\beta))\) as a compound Poisson of gammas, with expected frequency \(\lambda\).

The dictionary between the two parameterizations relies on the relation between the two shape parameters \(\alpha\) and \(p\) given by

\[\alpha = \frac{2-p}{p-1}, \qquad p = \frac{2+\alpha}{1+\alpha}.\]

Starting from \(\mathsf{Tw}_p(\mu, \sigma^2)\): \(\lambda = \displaystyle\frac{\mu^{2-p}}{(2-p)\sigma^2}\) and \(\beta = (p-1)\sigma^2\mu^{p-1} = \displaystyle\frac{\mu}{\lambda \alpha}\).

Starting from \(\mathsf{Tw}^*(\lambda, \alpha, \beta)\): \(\mu = \lambda \alpha \beta\) and \(\sigma^2 = \lambda \alpha(\alpha + 1) \beta^2 / \mu^p\), by equating expressions for the variance.
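The dictionary can be sketched in a few lines of Python. The function names `tweedie_to_cp` and `cp_to_tweedie` are hypothetical, chosen for illustration, and are not part of aggregate. The sketch assumes \(1<p<2\) and uses \(\beta = \mu/(\lambda\alpha)\) together with \(\sigma^2 = \mathsf{Var}(X)/\mu^p\), where \(\mathsf{Var}(X) = \lambda\alpha(\alpha+1)\beta^2\) is the compound Poisson variance. The numerical inputs are illustrative only.

```python
def tweedie_to_cp(mu, p, sigma2):
    """Map Tw_p(mu, sigma2) to (lambda, alpha, beta); assumes 1 < p < 2."""
    if not 1 < p < 2:
        raise ValueError("the compound Poisson form requires 1 < p < 2")
    alpha = (2 - p) / (p - 1)                   # gamma shape
    lam = mu ** (2 - p) / ((2 - p) * sigma2)    # expected claim count
    beta = mu / (lam * alpha)                   # gamma scale
    return lam, alpha, beta


def cp_to_tweedie(lam, alpha, beta):
    """Map (lambda, alpha, beta) back to (mu, p, sigma2)."""
    p = (2 + alpha) / (1 + alpha)
    mu = lam * alpha * beta                          # frequency times mean severity
    variance = lam * alpha * (alpha + 1) * beta**2   # frequency times E[severity^2]
    sigma2 = variance / mu**p                        # since Var(X) = sigma2 * mu^p
    return mu, p, sigma2


# Illustrative values only: convert one way, then back again.
lam, alpha, beta = tweedie_to_cp(mu=100.0, p=1.5, sigma2=2.0)
mu, p, sigma2 = cp_to_tweedie(lam, alpha, beta)
print(lam, alpha, beta)
print(mu, p, sigma2)
```

The round trip should recover the starting \(\mu\), \(p\) and \(\sigma^2\) up to floating point error, which is a quick check that the two directions are mutually consistent.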

It is easy to convert from the gamma mean \(m\) and CV \(\nu\) to \(\alpha=1/\nu^2\) and \(\beta = m/\alpha\). Remember that the scale argument of scipy.stats.gamma equals \(\beta\).
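As a minimal sketch of this conversion, the helper below builds a frozen scipy.stats gamma from a mean and CV. The name `gamma_from_mean_cv` is hypothetical and the inputs are illustrative.

```python
from scipy import stats


def gamma_from_mean_cv(m, nu):
    """Return a frozen scipy.stats gamma with mean m and CV nu."""
    alpha = 1 / nu**2    # shape
    beta = m / alpha     # scale
    return stats.gamma(alpha, scale=beta)


# Illustrative severity: mean 1000 and CV 0.5, so alpha = 4 and beta = 250.
sev = gamma_from_mean_cv(m=1000.0, nu=0.5)
print(sev.mean(), sev.std() / sev.mean())
```

Passing \(\beta\) by keyword as `scale` matters because the second positional argument of scipy.stats.gamma is `loc`, not the scale.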

Tweedie distributions are mixed: they have a probability mass of \(p_0 =e^{-\lambda}\) at 0 and are continuous on \((0, \infty)\).
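The mass at zero can be checked by simulation. The sketch below uses NumPy rather than aggregate, with illustrative parameters and a fixed seed. It relies on the fact that a sum of \(n\) independent \(\mathsf{Ga}(\alpha,\beta)\) variables is \(\mathsf{Ga}(n\alpha,\beta)\), so each simulated total needs only one gamma draw.

```python
import numpy as np

rng = np.random.default_rng(12345)    # fixed seed for reproducibility

lam, alpha, beta = 2.0, 1.5, 100.0    # illustrative parameters
n_sims = 200_000

# Claim counts N ~ Poisson(lam). Given N = n > 0, the total of n independent
# Ga(alpha, beta) severities is Ga(n * alpha, beta).
counts = rng.poisson(lam, size=n_sims)
totals = np.zeros(n_sims)
positive = counts > 0
totals[positive] = rng.gamma(shape=counts[positive] * alpha, scale=beta)

prob_zero_sim = np.mean(totals == 0)
prob_zero_theory = np.exp(-lam)                     # p_0 = exp(-lambda)
mean_theory = lam * alpha * beta
var_theory = lam * alpha * (alpha + 1) * beta**2

print(prob_zero_sim, prob_zero_theory)
print(totals.mean(), mean_theory)
print(totals.var(), var_theory)
```

The simulated proportion of zeros should be close to \(e^{-\lambda}\), and the sample mean and variance close to \(\lambda\alpha\beta\) and \(\lambda\alpha(\alpha+1)\beta^2\), up to sampling error.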

Jørgensen calls \(\mathsf{Tw}^*(\lambda, \alpha, \beta)\) the additive form of the model because, for independent summands,

\[\sum_i \mathsf{Tw}^*(\lambda_i, \alpha, \beta) = \mathsf{Tw}^*\left(\sum_i \lambda_i, \alpha, \beta\right).\]

He calls \(\mathsf{Tw}_p(\mu, \sigma^2)\) the reproductive exponential dispersion model. If \(X_i\sim \mathsf{Tw}_p(\mu, \sigma^2/w_i)\) are independent then

\[\frac{1}{w}\sum_i w_i X_i \sim \mathsf{Tw}_p\left(\mu, \frac{\sigma^2}{w}\right)\]

where \(w = \sum_i w_i\). The weight \(w_i\) represents the volume in cell \(i\) and \(X_i\) represents its pure premium. The weighted average on the left is the pure premium for all cells combined.
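As a check on the first two moments, assume the \(X_i\) are independent with \(\mathsf E[X_i]=\mu\) and \(\mathsf{Var}(X_i)=\sigma^2\mu^p/w_i\). Then

\[\mathsf E\left[\frac{1}{w}\sum_i w_i X_i\right] = \frac{1}{w}\sum_i w_i\mu = \mu, \qquad \mathsf{Var}\left(\frac{1}{w}\sum_i w_i X_i\right) = \frac{1}{w^2}\sum_i w_i^2\,\frac{\sigma^2\mu^p}{w_i} = \frac{\sigma^2\mu^p}{w},\]

which matches the mean and variance of \(\mathsf{Tw}_p(\mu, \sigma^2/w)\). This confirms only the moments; that the weighted average is again a Tweedie with the same \(p\) is the content of the reproductive property.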

The next diagram shows how the Tweedie family fits within the broader power variance exponential family of distributions. See the blog post The Tweedie-Power Variance Function Family for more details.

In [1]: from aggregate.extensions.figures import power_variance_family

In [2]: power_variance_family()
[Figure: the Tweedie family within the power variance exponential family (tweedie_powervariance.png)]