2.4.4. The Severity Clause

The severity clause specifies the ground-up severity distribution, or “severity curve” as it is sometimes known. It is a very flexible clause. Its design follows the scipy.stats package’s specification of random variables using shape, location, and scale factors; see probability background. The syntax is different for non-parametric discrete distributions and parametric continuous distributions.

2.4.4.1. Non-Parametric Severity Distributions

Discrete distributions (supported on a finite number of outcomes) can be directly entered as a severity using the dsev keyword followed by two equal-length row vectors. The first gives the outcomes and the (optional) second gives the probabilities.

dsev [outcomes] <[probabilities]>

The horizontal layout is irrelevant and commas are optional. If the probabilities vector is omitted then all probabilities are set equal to the reciprocal of the length of the outcomes vector. A Python-like colon notation is available for ranges. Probabilities can be entered as fractions, but no other arithmetic operation is supported.

Examples:

dsev [0 9 10] [0.5 0.3 0.2]
dsev [0 9 10]
dsev [1:6]
dsev [0:100:25]
dsev [1:6] [1/4 1/4 1/8 1/8 1/8 1/8]

dsev [0 9 10] [0.5 0.3 0.2] is a severity with a 0.5 chance of taking the value 0, 0.3 chance of 9, and 0.2 of 10.
dsev [0 9 10] gives equally likely outcomes of 0, 9, or 10.
dsev [1:6] gives equally likely outcomes 1, 2, 3, 4, 5, 6. Unlike Python (but like pandas.DataFrame.loc) the right-hand limit is included.
dsev [0:100:25] gives equally likely outcomes 0, 25, 50, 75, 100.
dsev [1:6] [1/4 1/4 1/8 1/8 1/8 1/8] gives outcomes 1 or 2 with probability 0.25 or 3-6 with probability 0.125.
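
The range and default-probability rules can be mimicked in a few lines of plain Python. This is only an illustrative sketch: expand_range and equal_probs are hypothetical helpers, not part of aggregate, and the real parser may treat non-integer steps or edge cases differently.

```python
from fractions import Fraction

def expand_range(spec):
    """Expand 'start:stop' or 'start:stop:step', including the right-hand limit."""
    parts = [int(p) for p in spec.split(':')]
    start, stop = parts[0], parts[1]
    step = parts[2] if len(parts) == 3 else 1
    return list(range(start, stop + 1, step))

def equal_probs(outcomes):
    """Default probabilities: the reciprocal of the number of outcomes."""
    return [Fraction(1, len(outcomes))] * len(outcomes)

print(expand_range('1:6'))       # [1, 2, 3, 4, 5, 6]
print(expand_range('0:100:25'))  # [0, 25, 50, 75, 100]
print(equal_probs([0, 9, 10]))
```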

Warning

Use binary fractions (denominator a power of two) to avoid rounding errors!
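
The following sketch shows one way such rounding appears in ordinary floating-point arithmetic. It is a generic Python illustration, not aggregate code; running_total is a hypothetical helper that accumulates left to right.

```python
def running_total(values):
    """Plain left-to-right floating-point accumulation."""
    total = 0.0
    for v in values:
        total += v
    return total

eighths = [1 / 8] * 8    # binary fraction: exactly representable
tenths = [1 / 10] * 10   # not exactly representable in binary floating point

print(running_total(eighths) == 1.0)  # True
print(running_total(tenths) == 1.0)   # False
```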

2.4.4.1.1. Details

A dsev clause is converted by the parser into a dhistogram step distribution:

sev dhistogram xps [outcomes] [probabilities]

In rare cases you want a continuous (ogive, piecewise linear distribution) version:

sev chistogram xps [outcomes] [probabilities]

When executed, these are both converted into a scipy.stats histogram class.

Discrete severities, specified using the dsev keyword, are implemented using a scipy.stats rv_histogram object, which is actually continuous. They work by concentrating the probability in small intervals just to the left of each knot point (to make the function right continuous). Given:

dsev [xs] [ps]

where xs and ps are the vectors of outcomes and probabilities, internally aggregate creates:

import numpy as np
import scipy.stats as ss

xss = np.sort(np.hstack((xs - 2 ** -30, xs)))
pss = np.vstack((ps, np.zeros_like(ps))).reshape((-1,), order='F')[:-1]
fz_discr = ss.rv_histogram((pss, xss))

The value 2**-30 needs to be smaller than the bucket size resolution, i.e., enough not to “split the bucket”. The mass is to the left of the knot to make a right continuous function (the approximation ramps up before the knot). Generally histograms are downsampled, not upsampled, so this is not a restriction.
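
To see the effect of placing the mass just to the left of each knot, the construction can be imitated without scipy. This is a pure-Python stand-in under stated assumptions: discrete_as_histogram and cdf are hypothetical helpers, and the piecewise-linear CDF only approximates what an rv_histogram object would report.

```python
EPS = 2 ** -30

def discrete_as_histogram(xs, ps):
    """Return (edges, masses) with each probability placed in [x - EPS, x]."""
    edges = sorted([x - EPS for x in xs] + list(xs))
    masses = []
    for p in ps:
        masses.extend([p, 0.0])  # narrow bin holding p, then an empty gap
    return edges, masses[:-1]     # one fewer bin than edges

def cdf(edges, masses, x):
    """Piecewise-linear CDF of the histogram."""
    total = 0.0
    for left, right, m in zip(edges[:-1], edges[1:], masses):
        if x >= right:
            total += m
        elif x > left:
            total += m * (x - left) / (right - left)
    return total

edges, masses = discrete_as_histogram([0, 9, 10], [0.5, 0.3, 0.2])
print(cdf(edges, masses, 0))      # the full mass at 0 is included at the knot
print(cdf(edges, masses, 8.999))  # unchanged until just before 9
print(cdf(edges, masses, 9))      # jumps by 0.3 at the knot
```

Because each ramp finishes exactly at its knot, the CDF evaluated at a knot already includes that knot's probability, which is the right-continuous behavior described above.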

A dsev statement is translated into the more general:

sev dhistogram xps [xs] [ps]

where dhistogram creates a discrete histogram (as above) and the xps keyword prefixes the knots and probabilities. It is also possible to specify the input severity as a continuous histogram that is uniform on \((x_k, x_{k+1}]\). The discrete probabilities are \(p_k=P(x_k < X \le x_{k+1})\). Creating the rv_histogram variable is much easier in this case; just use:

sev chistogram xps [xs] [ps]

which is translated into:

xs2 = np.hstack((xs, xs[-1] + xs[1]))
fz_cts = ss.rv_histogram((ps, xs2))

The code adds an additional knot at the end to create enough differences (there are only two differences between three points). The Sidebar: Continuous Discretization uses a chistogram.

The discrete method is appropriate when the distribution will be used and interpreted as fully discrete, which is the assumption the FFT method makes and the default. The continuous method is useful if the distribution will be used to create a scipy.stats rv_histogram variable. If the continuous method is interpreted as discrete and the mean is computed as \(\sum_k p_k x_k\), which is appropriate for a discrete variable, then it will be under-estimated by \(b/2\), where \(b\) is the (constant) spacing between knots.
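
The size of the under-estimate can be checked with a small calculation. The knots and probabilities below are made-up illustration values with equal spacing; the code is plain arithmetic and does not call aggregate.

```python
xs = [0, 10, 20, 30]        # equally spaced knots
ps = [0.4, 0.3, 0.2, 0.1]
b = xs[1] - xs[0]           # knot spacing

# discrete interpretation: mass p_k sits at x_k
mean_discrete = sum(p * x for p, x in zip(ps, xs))

# continuous interpretation: mass p_k is uniform on (x_k, x_k + b]
mean_continuous = sum(p * (x + b / 2) for p, x in zip(ps, xs))

print(mean_discrete, mean_continuous, mean_continuous - mean_discrete)
```

The difference between the two means equals half the knot spacing, because each uniform piece has its mean at the midpoint of its interval.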

2.4.4.2. Parametric Severity

A parametric distribution can be specified in two ways:

sev DIST_NAME MEAN cv CV
sev DIST_NAME <SHAPE1> <SHAPE2>

where

sev is a keyword indicating the severity specification,
DIST_NAME is the scipy.stats distribution name, see scipy.stats Continuous Random Variables,
MEAN is the expected loss,
cv (lowercase) is a keyword indicating entry of the CV,
CV is the loss coefficient of variation, and
SHAPE1, SHAPE2 are the (optional) shape variables.

The first form enters the expected ground-up severity and CV directly. It is available for distributions with only one shape parameter and the beta distribution on \([0,1]\). aggregate uses a formula (lognormal, gamma, beta) or numerical method (all other one shape parameter distributions) to solve for the shape parameter to achieve the correct CV and then scales to the desired mean. The second form directly enters the shape variable(s). Shape parameters entered for zero parameter distributions are ignored.

Example. Entering sev lognorm 10 cv 0.2 produces a lognormal distribution with a mean of 10 and a CV of 0.2. Entering sev lognorm 0.2 produces a lognormal with \(\mu=0\) and \(\sigma=0.2\), which can then be scaled and shifted.
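
For the lognormal and gamma, the mean and CV determine the parameters in closed form. The sketch below uses the standard moment formulas; lognormal_params and gamma_params are hypothetical helper names, and this is not necessarily how aggregate implements the calculation internally.

```python
import math

def lognormal_params(mean, cv):
    """Return (mu, sigma) of the lognormal with the given mean and CV."""
    sigma = math.sqrt(math.log(1 + cv ** 2))
    mu = math.log(mean) - sigma ** 2 / 2
    return mu, sigma

def gamma_params(mean, cv):
    """Return (shape, scale) of the gamma with the given mean and CV."""
    shape = 1 / cv ** 2
    scale = mean / shape
    return shape, scale

mu, sigma = lognormal_params(10, 0.2)
# check: mean = exp(mu + sigma^2 / 2) and CV = sqrt(exp(sigma^2) - 1)
print(math.exp(mu + sigma ** 2 / 2), math.sqrt(math.exp(sigma ** 2) - 1))

shape, scale = gamma_params(10, 0.2)
# check: mean = shape * scale and CV = 1 / sqrt(shape)
print(shape * scale, 1 / math.sqrt(shape))
```

In scipy.stats terms, sigma corresponds to the lognorm shape parameter and exp(mu) to its scale.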

DIST_NAME can be any zero, one, or two shape parameter scipy.stats continuous distribution. They have (mostly) easy to guess names. For example:

Distributions with no shape parameters include: norm, Gaussian normal; uniform, the uniform; and expon, the exponential.
Distributions with one shape parameter include: pareto, lognorm, gamma, invgamma, loggamma, and weibull_min, the Weibull.
Distributions with two shape parameters include: beta and gengamma, the generalized gamma.

See scipy.stats Continuous Random Variables for a full list and Appendix: scipy.stats Continuous Random Variables for details of each.

Details.

dhistogram and chistogram create discrete (point mass) and continuous (ogive) empirical distributions. chistogram is rarely used and dhistogram is easier to input using dsev, Non-Parametric Severity Distributions.

2.4.4.3. Shifting and Scaling Severity

A parametric severity clause can be transformed by scaling and location factors, following the scipy.stats scale and loc syntax. Location is a shift or translation. The syntax is:

sev SCALE * DISTNAME SHAPE + LOC
sev SCALE * DISTNAME SHAPE - LOC

For zero parameter distributions SHAPE is omitted. Two parameter distributions are entered sev SCALE * DISTNAME SHAPE1 SHAPE2 + LOC.

Examples.

sev lognorm 10 cv 3: lognormal, mean 10, CV 3.
sev 10 * lognorm 1.75: lognormal, \(10X\), \(X \sim \mathrm{lognormal}(\mu=0,\sigma=1.75)\)
sev 10 * lognorm 1.75 + 20: lognormal, \(10X + 20\)
sev 10 * lognorm 1 cv 3 + 50: lognormal: \(10Y + 50\), \(Y\sim\) lognormal mean 1, CV 3
sev 100 * pareto 1.3 - 100: Pareto, shape \(\alpha=1.3\), scale \(\lambda=100\).
sev 100 * pareto 1.3: Single parameter Pareto for \(x \ge 100\), shape (\(\alpha\)) 1.3, scale (\(\lambda\)) 100.
sev 50 * norm + 100: normal, mean (location) 100, standard deviation (scale) 50. No shape parameter.
sev 5 * expon: exponential, mean (scale) 5. No shape parameter.
sev 5 * uniform + 1: uniform between 1 and 6, scale 5, location 1. No shape parameters.
sev 50 * beta 2 3: beta: \(50Z\), \(Z \sim \beta(2,3)\), shape parameters 2, 3, scale 50.

With the parameterization sev 100 * pareto 1.3 - 100, the Pareto has survival function \(S(x)=(100 / (100 + x))^{1.3}\) for \(x \ge 0\).
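
The effect of scale and location on the Pareto can be sketched directly from the scipy.stats convention, where the standard Pareto satisfies \(P(Z > z) = z^{-\alpha}\) for \(z \ge 1\). The function pareto_sf below is a hypothetical pure-Python helper written for illustration; it does not call scipy or aggregate.

```python
def pareto_sf(x, shape, scale=1.0, loc=0.0):
    """Survival function of scale * Z + loc, with P(Z > z) = z ** -shape for z >= 1."""
    z = (x - loc) / scale
    return 1.0 if z < 1 else z ** -shape

# sev 100 * pareto 1.3 - 100: support starts at 0
print(pareto_sf(0, 1.3, scale=100, loc=-100))
print(pareto_sf(100, 1.3, scale=100, loc=-100), (100 / 200) ** 1.3)

# sev 100 * pareto 1.3: support starts at 100
print(pareto_sf(100, 1.3, scale=100))
```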

The scale and location parameters can be vectors.

Warning

dsev severities cannot be shifted or scaled. If that is required use a Python f-string to adjust the outcomes:

f'dsev [{5 * outcomes + 10}] [probabilities]'
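
A slightly fuller sketch of the same idea, using plain Python lists, is shown here. The outcome and probability values are made up for illustration. The values are joined explicitly because the string form of a list or numpy array already contains its own brackets and separators.

```python
outcomes = [0, 9, 10]
probabilities = [0.5, 0.3, 0.2]

scaled = [5 * x + 10 for x in outcomes]  # scale by 5, shift by 10
program = (
    f"dsev [{' '.join(str(x) for x in scaled)}] "
    f"[{' '.join(str(p) for p in probabilities)}]"
)
print(program)  # dsev [10 55 60] [0.5 0.3 0.2]
```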

Warning

Shifting left (negative shift) must be written with a space: sev 10 * lognorm 1.5 - 10, not sev 10 * lognorm 1.5 -10. The lexer binds unary minus to the number, so the latter omits the operator. sev 10 * lognorm 1.5 + -10, sev 10 * lognorm 1.5 +10 and sev 10 * lognorm 1.5 + 10 are all acceptable because there is no unary +. This is a known bug and is insidious: the -10 will be interpreted as a second shape parameter and ignored. You will not get the answer you expect.
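
The mechanism can be imitated with a toy tokenizer. This is not aggregate's lexer: the regular expression below is an assumption made for illustration, chosen only so that a minus sign directly in front of a digit is absorbed into the number.

```python
import re

# A number may carry its own leading minus sign; '+' is always a separate token.
TOKEN = re.compile(r'-?\d+(?:\.\d+)?|[a-z_]+|[*+-]')

def tokenize(text):
    return TOKEN.findall(text)

print(tokenize('sev 10 * lognorm 1.5 - 10'))  # '-' and '10' are separate tokens
print(tokenize('sev 10 * lognorm 1.5 -10'))   # '-10' is a single number token
print(tokenize('sev 10 * lognorm 1.5 +10'))   # '+' is still an operator
```

In the second case no operator token is produced, so a grammar reading these tokens would see two consecutive numbers after the distribution name.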

2.4.4.4. Unconditional Severity

The severity clause is entered ground-up. It is converted to a distribution conditional on a loss to the layer if there is a limits sub-clause. Thus, for an excess layer \(y\) xs \(a\), the severity used to create the aggregate has a distribution \(X \mid X > a\), where \(X\) is specified in the sev clause. For a ground-up (or missing) layer there is no adjustment.

The default behavior can be over-ridden by adding ! after the severity distribution.

Example.

The default behavior uses severity conditional on a loss to the layer. In this example, the conditional layer severity is 6.

In [1]: from aggregate import build, qd

In [2]: cond = build('agg DecL:Conditional '
   ...:              '1 claim '
   ...:              '12 xs 8 '
   ...:              'sev 20 * uniform '
   ...:              'fixed')
   ...: 

In [3]: qd(cond)

      E[X] Est E[X]   Err E[X]   CV(X) Est CV(X)     Skew(X) Est Skew(X)
X                                                                       
Freq     1                           0                                  
Sev      6        6 -9.992e-16 0.57735   0.57735 -8.2046e-15 -9.0251e-14
Agg      6        6 -9.992e-16 0.57735   0.57735 -9.5721e-15 -9.0251e-14
log2 = 16, bandwidth = 1/2048, validation: fails sev skew, agg skew.

To specify unconditional severity, append ! to the severity clause. The unconditional layer severity is only 3.6 because there is just a 60% chance of attaching the layer. In the last line, uncd.sevs[0].fz is sev 20 * uniform ground-up.

In [4]: uncd = build('agg DecL:Unconditional '
   ...:              '1 claim '
   ...:              '12 xs 8 '
   ...:              'sev 20 * uniform ! '
   ...:              'fixed')
   ...: 

In [5]: qd(uncd)

      E[X] Est E[X]    Err E[X]  CV(X) Est CV(X) Skew(X) Est Skew(X)
X                                                                   
Freq     1                           0                              
Sev    3.6      3.6 -1.1102e-16 1.1055    1.1055 0.65784     0.65784
Agg    3.6      3.6 -1.1102e-16 1.1055    1.1055 0.65784     0.65784
log2 = 16, bandwidth = 1/2048, validation: not unreasonable.

In [6]: print(uncd.sevs[0].fz.sf(8), uncd.agg_m / cond.agg_m)
0.6 0.6
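
The figures in the two sessions above can be reproduced by hand for this uniform example. The sketch below is plain arithmetic and does not call aggregate; the variable names are chosen for illustration.

```python
import math

upper = 20.0               # ground-up severity is uniform on (0, 20)
attach, limit = 8.0, 12.0  # layer 12 xs 8

p_attach = (upper - attach) / upper  # P(X > 8)

# Given X > 8, the layer loss min(X - 8, 12) is uniform on (0, 12).
cond_mean = limit / 2
cond_cv = (limit / math.sqrt(12)) / cond_mean

# Unconditionally, the layer loss is 0 with probability 1 - p_attach.
uncd_mean = p_attach * cond_mean
uncd_second = p_attach * limit ** 2 / 3  # E[Y^2] for uniform(0, 12) is 12^2 / 3
uncd_cv = math.sqrt(uncd_second - uncd_mean ** 2) / uncd_mean

print(p_attach, cond_mean, round(cond_cv, 5))
print(round(uncd_mean, 4), round(uncd_cv, 4))
```

The ratio of the unconditional to the conditional mean is the attachment probability, matching the last line of the session.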

2.4.4.5. scipy.stats Continuous Random Variables

All scipy.stats continuous random variable classes can be used as severity distributions, see scipy.stats Severity Distributions for a complete list. As always, with great power comes great responsibility.

Warning

The user must determine if a severity distribution is appropriate; aggregate will not check! Only specified zero parameter (uniform, exponential, normal) and two parameter () distributions are allowed, but all one parameter distributions will work. However, any zero parameter distribution can be called with a dummy argument that is ignored. Be careful out there!
