2.13.2. Embrechts and Frei (2009)

2.13.2.1. Poisson/Pareto Example

Embrechts and Frei [2009], Panjer recursion versus FFT for compound distributions.

Consider a \(\mathrm{Po}(\lambda)\vee\mathrm{Pareto}(\alpha, \beta)\), \(\alpha\) is shape and \(\beta\) is scale.

In [1]: from aggregate import build, qd

In [2]: from pandas import option_context

In [3]: import matplotlib.pyplot as plt

In [4]: alpha = 4

In [5]: beta = 3

In [6]: freq = 20

In [7]: a = build(f'agg EF.1 {freq} claims '
   ...:           f'sev {beta} * pareto {alpha} - {beta} '
   ...:            'poisson', bs=1/8, log2=8, padding=0, normalize=False)
   ...: 

In [8]: qd(a)

      E[X] Est E[X]   Err E[X]   CV(X) Est CV(X)  Skew(X) Est Skew(X)
X                                                                    
Freq    20                     0.22361            0.22361            
Sev      1   0.9985 -0.0014965  1.4142    1.3956   7.0711      5.1093
Agg     20   17.704   -0.11481  0.3873   0.38559   1.1619    -0.21402
log2 = 8, bandwidth = 1/8, validation: fails sev mean, agg mean, agg mean error >> sev, possible aliasing; try larger bs.

The last dataframe shows poor accuracy. Try different ways to compute the aggregate: padding, tilting, and more buckets.

In [9]: from pandas import option_context

In [10]: df = a.density_df[['p_total']].rename(columns={'p_total': 'Pad 0, tilt 0'})

In [11]: a.update(bs=1/8, log2=8, padding=1, normalize=False)

In [12]: df['Pad 1, tilt 0'] = a.density_df.p_total

In [13]: a.update(bs=1/8, log2=8, padding=2, normalize=False)

In [14]: df['Pad 2, tilt 0'] = a.density_df.p_total

In [15]: a.update(bs=1/8, log2=8, padding=0, tilt_vector=0.01, normalize=False)

In [16]: df['Pad 0, tilt 0.01'] = a.density_df.p_total

In [17]: a.update(bs=1/32, log2=16, padding=1, normalize=False)

In [18]: bit = a.density_df[['p_total']].rename(columns={'p_total': 'log2 16, pad 1, tilt 0'})

In [19]: qd(a)

      E[X] Est E[X]    Err E[X]   CV(X) Est CV(X)  Skew(X) Est Skew(X)
X                                                                     
Freq    20                      0.22361            0.22361            
Sev      1  0.99995 -5.4251e-05  1.4142    1.4144   7.0711      7.0284
Agg     20   19.999 -5.4252e-05  0.3873   0.38732   1.1619      1.1567
log2 = 16, bandwidth = 1/32, validation: not unreasonable.

The last dataframe shows a good approximation.

The next figure (compare Figure 1 in the paper, shown below) shows that padding, as recommended in Wang [1998], removes aliasing as effectively as padding, albeit at the expense of a longer FFT computation. The log density shows the aliasing is completely removed.

In [20]: f, axs = plt.subplots(1, 2, figsize=(2 * 3.5, 2.45), constrained_layout=True, squeeze=True)

In [21]: ax0, ax1 = axs.flat

In [22]: df.plot(ax=ax0, logy=False)
Out[22]: <Axes: xlabel='loss'>

In [23]: df.plot(ax=ax1, logy=True)
Out[23]: <Axes: xlabel='loss'>

In [24]: ax0.legend(loc='upper left')
Out[24]: <matplotlib.legend.Legend at 0x7f1b75ce6020>

In [25]: ax1.legend(loc='lower right');

Clearly there is not enough space with only 2**8 buckets. Expanding to 2**16 and using a finer bucket covers a more realistic range. The log density plot shows a change in regime from Poisson body to Pareto tail. The extreme tail can be approximated by differentiating Feller’s theorem, which says the survival function is converges to \(20\mathsf{Pr}(X>x)\) where \(X\) is the Pareto severity (right hand plot). The multiplication by four accounts for the different bs values.

In [26]: f, axs = plt.subplots(1, 2, figsize=(2 * 3.5, 2.45), constrained_layout=True, squeeze=True)

In [27]: ax0, ax1 = axs.flat

In [28]: df.plot(ax=ax0, logy=False)
Out[28]: <Axes: xlabel='loss'>

In [29]: (bit * 4).plot(ax=ax0, lw=3, alpha=.5);

In [30]: bit.plot(ax=ax1, logy=True);

# density from tail, need to divide by bs
In [31]: ax1.plot(bit.index, (20*4/3*a.bs)*(3/(3+bit.index))**5, label='Feller approximation');

In [32]: ax0.set(xlim=[-5, a.q(0.99999)]);

In [33]: ax0.legend(loc='upper right');

In [34]: ax1.legend(loc='upper right');

2.13.2.2. Choice of Bandwidth (Bucket Size)

This example replicates parts of Table 1. As well as the 99.9%ile it shows the 99.9999%ile.

In [35]: import pandas as pd

In [36]: a = build('agg EF.2 50 claims sev expon poisson', update=False)

In [37]: ans = []

In [38]: for log2, bs in zip([10, 10, 10, 16, 16, 16, 16], [1, 1/2, 1/8, 1/8, 1/16, 1/64, 1/512]):
   ....:     a.update(log2=log2, bs=bs, padding=1)
   ....:     ans.append([log2, 1/bs, a.q(0.999), a.q(1-1e-6)])
   ....: 

In [39]: df = pd.DataFrame(ans, columns=['log2', '1/bs', 'p999', 'p999999'])

In [40]: qd(df, accuracy=4)

   log2  1/bs   p999  p999999
0    10     1     84      107
1    10     2   84.5      108
2    10     8 85.125   108.25
3    16     8 85.125   108.25
4    16    16 85.125   108.25
5    16    64 85.109   108.23
6    16   512 85.105   108.23