3.3. Distributions Module

3.3.1. Frequency Class

class aggregate.distributions.Frequency(freq_name, freq_a, freq_b, freq_zm, freq_p0)[source]

Manages Frequency distributions: creates moment function and MGF.

  • freq_moms(n): returns EN, EN^2 and EN^3 when EN=n

  • freq_pgf(n, z): returns the probability generating function evaluated at z (equivalently, the MGF at log z) when EN=n

Frequency distributions are either non-mixture types or mixture types.

Non-Mixture Frequency Types

  • fixed: no parameters

  • bernoulli: exp_en interpreted as a probability, must be < 1

  • binomial: Binomial(n/p, p) where p = freq_a, and n = exp_en

  • poisson: Poisson(n)

  • geometric: geometric(1/(n + 1)), supported on 0, 1, 2, …

  • logarithmic: logarithmic(theta), supported on 1, 2, …; theta solved numerically

  • neymana: Neyman A, a Po(n/freq_a) stopped sum of Po(freq_a); freq_a = “eggs per cluster”

  • negbin: freq_a is the variance multiplier, the ratio of variance to mean (see the check after this list)

  • pascal: (generalized) Pascal-Poisson distribution, a Poisson stopped sum of negative binomials; exp_en gives the overall claim count. freq_a is the CV of the frequency distribution and freq_b is the number of claimants per claim (or claims per occurrence). Hence, the Poisson component has mean exp_en / freq_b and the number of claims per occurrence has mean freq_b. This parameterization may not be ideal(!).
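For example, with a degenerate severity of 1 the aggregate equals the frequency, so the negbin variance multiplier can be read off the reported statistics. A minimal sketch in the style of the code proofs below (parameter values are illustrative):

from aggregate import build, qd

vm = 3    # variance multiplier: variance / mean of the frequency
nb = build(f'agg VarMult 10 claims dsev[1] negbin {vm}')
qd(nb)
# with mean 10, the reported CV should be close to (vm / 10) ** 0.5 = 0.5477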

Mixture Frequency Types

These distributions are G-mixed Poisson, so N | G ~ Poisson(n G). They are labelled by the name of the mixing distribution or the common name for the resulting frequency distribution. See Panjer and Willmot or JKK.

In all cases freq_a is the CV of the mixing distribution, which corresponds to the asymptotic CV of the frequency distribution and of any aggregate when the severity has a variance. See the code check after the list below.

  • gamma: negative binomial, freq_a = cv of gamma distribution

  • delaporte: shifted gamma, freq_a = cv of mixing distribution, freq_b = proportion of certain claims = shift. freq_b must be between 0 and 1.

  • ig: inverse gaussian, freq_a = cv of mixing distribution

  • sig: shifted inverse gaussian, freq_a = cv of mixing distribution, freq_b = proportion of certain claims = shift. freq_b must be between 0 and 1.

  • sichel: generalized inverse gaussian mixing distribution, freq_a = cv of mixing distribution and freq_b = lambda value. The beta and mu parameters are solved to match moments. Note lambda = -0.5 corresponds to inverse gaussian and 0.5 to reciprocal inverse gaussian. Other special cases are available.

  • sichel.gamma: generalized inverse gaussian mixture where the parameters match the moments of a delaporte distribution with given freq_a and freq_b

  • sichel.ig: generalized inverse gaussian mixture where the parameters match the moments of a shifted inverse gaussian distribution with given freq_a and freq_b. This parameterization has poor numerical stability and may fail.

  • beta: beta mixing with freq_a = CV where beta is supported on the interval [0, freq_b]. This method should be used carefully. It has poor numerical stability and can produce bizarre aggregates when the alpha or beta parameters are < 1 (so there is a mode at 0 or freq_b).
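As a check on the gamma case: a gamma mixing CV of c applied to a Poisson with mean n produces a negative binomial with variance multiplier 1 + c^2 n. A minimal sketch in the style of the code proofs below (the mixed gamma DecL clause and parameter values are assumptions for illustration):

from aggregate import build, qd
import pandas as pd

n, c = 10, 0.5             # mean claim count and mixing CV (illustrative)
vm = 1 + c**2 * n          # implied negative binomial variance multiplier

mixed = build(f'agg GammaMix {n} claims dsev[1] mixed gamma {c}')
nb = build(f'agg NegBin {n} claims dsev[1] negbin {vm}')
qd(mixed)
qd(nb)

df = pd.concat((mixed.density_df.p_total, nb.density_df.p_total), axis=1)
df.columns = ['mixed_gamma', 'negbin']
assert (df.mixed_gamma - df.negbin).abs().max() < 1e-5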

Code proof for Neyman A:

from aggregate import build, qd
import pandas as pd
mean = 10
eggs_per_cluster = 4
neya = build(f'agg Neya {mean} claims dsev[1] neymana {eggs_per_cluster}')
qd(neya)

po = build(f'agg Po4 {eggs_per_cluster} claims dsev[1] poisson')
po_pmf = po.density_df.query('p_total > 1e-13').p_total

byhand = build(f'agg ByHand {mean / eggs_per_cluster} claims dsev {list(po_pmf.index)} {po_pmf.values} poisson')
qd(byhand)

df = pd.concat((neya.density_df.p_total, byhand.density_df.p_total), axis=1)
df.columns = ['neya', 'byhand']
df['err'] = df.neya - df.byhand
assert df.err.abs().max() < 1e-5
df.head(40)

Code proof for Pascal:

from aggregate import build, qd
import pandas as pd
mean = 10
claims_per_occ = 1.24
overall_cv = 1.255
pascal = build(f'agg PascalEg {mean} claims dsev[1] pascal {overall_cv} {claims_per_occ}', log2=16)
qd(pascal)

c = (mean * overall_cv**2 - 1 - claims_per_occ) / claims_per_occ
th = claims_per_occ * c
a = 1 / c
# from form of nb pgf identify r = a and beta = theta, mean is rb, var is rb(1+b)
nb = build(f'agg NB {claims_per_occ} claims dsev[1] negbin {th + 1}', log2=16)
nb_pmf = nb.density_df.query('p_total > 1e-13').p_total
qd(nb)

byhand = build(f'agg ByHand {mean / claims_per_occ} claims dsev {list(nb_pmf.index)} {nb_pmf.values} poisson', log2=16)
qd(byhand)

df = pd.concat((pascal.density_df.p_total, byhand.density_df.p_total), axis=1)
df.columns = ['pascal', 'byhand']
df['err'] = df.pascal - df.byhand
assert df.err.abs().max() < 1e-5
df.head(40)
Parameters
  • freq_name – name of the frequency distribution, poisson, geometric, etc.

  • freq_a

  • freq_b

__init__(freq_name, freq_a, freq_b, freq_zm, freq_p0)[source]

Creates the freq_pgf and moment function:

  • moment function(n) returns EN, EN^2, EN^3 when EN=n.

  • freq_pgf(n, z) is the probability generating function evaluated at z (the MGF evaluated at log(z)) when EN=n

Parameters
  • freq_name – name of the frequency distribution, poisson, geometric, etc.

  • freq_a

  • freq_b

  • freq_zm – True if zero modified, default False

  • freq_p0 – modified p0, probability of zero claims
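A minimal sketch of the moment function, assuming Frequency can be instantiated standalone with the signature above (values are illustrative):

import numpy as np
from aggregate.distributions import Frequency

freq = Frequency('poisson', 0, 0, False, np.nan)   # unmodified Poisson
en, en2, en3 = freq.freq_moms(10)
# Poisson non-central moments: E[N] = n, E[N^2] = n + n^2, E[N^3] = n^3 + 3n^2 + n
assert np.allclose([en, en2, en3], [10, 110, 1310])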

3.3.2. Severity Class

class aggregate.distributions.Severity(sev_name, exp_attachment=None, exp_limit=inf, sev_mean=0, sev_cv=0, sev_a=nan, sev_b=0, sev_loc=0, sev_scale=0, sev_xs=None, sev_ps=None, sev_wt=1, sev_lb=0, sev_ub=inf, sev_conditional=True, name='', note='')[source]
__init__(sev_name, exp_attachment=None, exp_limit=inf, sev_mean=0, sev_cv=0, sev_a=nan, sev_b=0, sev_loc=0, sev_scale=0, sev_xs=None, sev_ps=None, sev_wt=1, sev_lb=0, sev_ub=inf, sev_conditional=True, name='', note='')[source]

A continuous random variable; subclasses scipy.stats.rv_continuous, adding layer and attachment functionality. It overrides

  • cdf

  • pdf

  • isf

  • ppf

  • sf

  • stats

See scipy.stats continuous rvs for more details about available distributions. The following distributions with two shape parameters are supported:

  • Burr (burr)

  • Generalized Pareto (genpareto)

  • Generalized gamma (gengamma)

It is easy to add others in the code below. With two shape parameters the mean cv input format is not available.

See code in Problems and Solutions to extract distributions from scipy stats by introspection.

Parameters
  • sev_name – scipy.stats continuous distribution name | (c|d)histogram (continuous or discrete) | fixed

  • exp_attachment – None if layer is missing, distinct from 0; if a = 0 then losses are conditional on X>a, if a = None then losses are conditional on X>=0

  • exp_limit

  • sev_mean

  • sev_cv

  • sev_a – first shape parameter

  • sev_b – second shape parameter (e.g., beta)

  • sev_loc – scipy.stats location parameter

  • sev_scale – scipy.stats scale parameter

  • sev_xs – for fixed or histogram classes

  • sev_ps

  • sev_wt – this is not used directly; but it is convenient to pass it in and ignore it because sevs are implicitly created with sev_wt=1.

  • sev_conditional – conditional or unconditional; for severities use conditional

  • name – name of the severity object

  • note – optional note.
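A minimal usage sketch, assuming a lognormal severity in a 1M xs 1M layer (all parameter values are illustrative):

from aggregate.distributions import Severity

# ground-up lognormal with mean 500,000 and cv 2, in a 1M xs 1M layer
sev = Severity('lognorm', exp_attachment=1e6, exp_limit=1e6,
               sev_mean=500e3, sev_cv=2)
print(sev.moms())       # first three moments of the layer severity
print(sev.sf(0.5e6))    # survival function reflects the layer and attachment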

cv_to_shape(cv, hint=1)[source]

Create a frozen object of type dist_name with the given cv. The lognormal, gamma, inverse gamma and inverse gaussian distributions are solved analytically. Other distributions are solved numerically and may be unstable.

Parameters
  • cv

  • hint

Returns

mean_to_scale(shape, mean, loc=0)[source]

Adjust the scale to achieve the desired mean. Return a frozen instance.

Parameters
  • shape

  • mean

  • loc – location parameter (note: location is added to the mean…)

Returns

moms()[source]

Revised moments for the Severity class. We want the moments of the layer loss

\(X(a, d) = \min(d - a, (X - a)^+)\),

the layer from attachment a to detachment d, so that

\(E[X(a,d)^n] = \int_a^d (x-a)^n f(x)\,dx + (d-a)^n S(d)\).

Substitute x = q(p), F(x) = p, f(x)dx = dp, for n = 1, 2, 3:

\(E[X(a,d)^n] = \int_{F(a)}^{F(d)} (q(p)-a)^n\,dp + (d-a)^n S(d)\).

The base computation is \(\int_{F(a)}^{F(d)} q(p)^n\,dp\). These are the exi values below. They are then adjusted (expanding \((q(p)-a)^n\) binomially) to create the moments needed.

The old moments routine tried to compute integrals of the form int S(x)dx, but that is over a large, non-compact domain and did not work well. With 0.9.3 old_moms was removed. The old_moms code did this:

ex1 = safe_integrate(lambda x: self.fz.sf(x), 1)
ex2 = safe_integrate(lambda x: 2 * (x - self.attachment) * self.fz.sf(x), 2)
ex3 = safe_integrate(lambda x: 3 * (x - self.attachment) ** 2 * self.fz.sf(x), 3)

Test examples

import numpy as np
import pandas as pd
import scipy.stats as ss
import aggregate as agg

# `moms` refers to the revised moment routine (this method), patched onto the object below
def test(mu, sigma, a, y):
    global moms
    import types
    # analytic moments with no layer or attachment
    fz = ss.lognorm(sigma, scale=np.exp(mu))
    tv = np.array([np.exp(k*mu + k * k * sigma**2/2) for k in range(1,4)])

    # old method
    s = agg.Severity('lognorm', sev_a=sigma, sev_scale=np.exp(mu),
                     exp_attachment=a, exp_limit=y)
    est = np.array(s.old_moms())

    # swap out moment routine
    setattr(s, moms.__name__, types.MethodType(moms, s))
    ans = np.array(s.moms())

    # summarize and report
    sg = f'Example: mu={mu}  sigma={sigma}  a={a}  y={y}'
    print(f'{sg}\n{"="*len(sg)}')
    print(pd.DataFrame({'new_ans' : ans, 'old_ans': est,
                        'err': ans/est-1, 'no_la_analytic' : tv}))


test(8.7, .5, 0, np.inf)
test(8.7, 2.5, 0, np.inf)
test(8.7, 2.5, 10e6, 200e6)

Example: mu=8.7, sigma=0.5, a=0, y=inf

        new_ans       old_ans           err  no_la_analytic
0  6.802191e+03  6.802191e+03  3.918843e-11    6.802191e+03
1  5.941160e+07  5.941160e+07  3.161149e-09    5.941160e+07
2  6.662961e+11  6.662961e+11  2.377354e-08    6.662961e+11

Example: mu=8.7 sigma=2.5 a=0 y=inf (here the old method failed)

        new_ans       old_ans           err  no_la_analytic
0  1.366256e+05  1.366257e+05 -6.942541e-08    1.366257e+05
1  9.663487e+12  1.124575e+11  8.493016e+01    9.669522e+12
2  2.720128e+23  7.597127e+19  3.579469e+03    3.545017e+23

Example: mu=8.7 sigma=2.5 a=10000000.0 y=200000000.0

        new_ans       old_ans           err  no_la_analytic
0  1.692484e+07  1.692484e+07  2.620126e-14    1.366257e+05
1  1.180294e+15  1.180294e+15  5.242473e-13    9.669522e+12
2  1.538310e+23  1.538310e+23  9.814372e-14    3.545017e+23

The numerical issues are very sensitive; the implementation goes for a compromise between speed and accuracy. This is only an issue for very thick-tailed distributions with no upper limit, which is not a realistic situation. Here is a tester program for two common cases:

import numpy as np
from aggregate import Severity, logger_level

logger_level(30)  # warnings only
for sh, dist in zip([1,2,3,4, 3.5,2.5,1.5,.5], ['lognorm']*3 + ['pareto']*4):
    s = Severity(dist, sev_a=sh, sev_scale=1, exp_attachment=0)
    print(dist,sh, s.moms())
    if dist == 'lognorm':
        print('actual', [(n, np.exp(n*n*sh*sh/2)) for n in range(1,4)])
Returns

vector of moments. np.nan signals an unreliable but finite value. np.inf is correct: the moment does not exist.

plot(n=100, axd=None, figsize=(7.0, 4.9), layout='AB\nCD')[source]

Quick plot, updated for 0.9.3 with mosaic and no grid lines. (F(x), x) plot replaced with log density plot.

Parameters
  • n – number of points to plot.

  • axd – axis dictionary; if None, create a new figure. Must have keys ‘A’, ‘B’, ‘C’, ‘D’.

  • figsize – (width, height) in inches.

  • layout – the subplot_mosaic layout of the figure. Default is ‘AB\nCD’.

Returns

3.3.3. Aggregate Class

class aggregate.distributions.Aggregate(name, exp_el=0, exp_premium=0, exp_lr=0, exp_en=0, exp_attachment=None, exp_limit=inf, sev_name='', sev_a=nan, sev_b=0, sev_mean=0, sev_cv=0, sev_loc=0, sev_scale=0, sev_xs=None, sev_ps=None, sev_wt=1, sev_lb=0, sev_ub=inf, sev_conditional=True, sev_pick_attachments=None, sev_pick_losses=None, occ_reins=None, occ_kind='', freq_name='', freq_a=0, freq_b=0, freq_zm=False, freq_p0=nan, agg_reins=None, agg_kind='', note='')[source]
__init__(name, exp_el=0, exp_premium=0, exp_lr=0, exp_en=0, exp_attachment=None, exp_limit=inf, sev_name='', sev_a=nan, sev_b=0, sev_mean=0, sev_cv=0, sev_loc=0, sev_scale=0, sev_xs=None, sev_ps=None, sev_wt=1, sev_lb=0, sev_ub=inf, sev_conditional=True, sev_pick_attachments=None, sev_pick_losses=None, occ_reins=None, occ_kind='', freq_name='', freq_a=0, freq_b=0, freq_zm=False, freq_p0=nan, agg_reins=None, agg_kind='', note='')[source]

The Aggregate distribution class manages creation and calculation of aggregate distributions. It allows for very flexible creation of Aggregate distributions. Severity can express a limit profile, a mixed severity or both. Mixed frequency types share a mixing distribution across all broadcast terms to ensure an appropriate inter-class correlation.

Parameters
  • name – name of the aggregate

  • exp_el – expected loss or vector

  • exp_premium – premium volume or vector (requires loss ratio)

  • exp_lr – loss ratio or vector (requires premium)

  • exp_en – expected claim count per segment (self.n = total claim count)

  • exp_attachment – occurrence attachment; None indicates no limit clause, which is treated differently from an attachment of zero.

  • exp_limit – occurrence limit

  • sev_name – severity name or sev.BUILTIN_SEV or meta.var agg or port or similar or vector or matrix

  • sev_a – scipy stats shape parameter

  • sev_b – scipy stats shape parameter

  • sev_mean – average (unlimited) severity

  • sev_cv – unlimited severity coefficient of variation

  • sev_loc – scipy stats location parameter

  • sev_scale – scipy stats scale parameter

  • sev_xs – xs and ps must be provided if sev_name is (c|d)histogram, xs are the bucket break points

  • sev_ps – ps are the probability densities within each bucket; if buckets equal size no adjustments needed

  • sev_wt – weight for mixed distribution

  • sev_lb – lower bound for severity (length of sev_lb must equal length of sev_ub and weights)

  • sev_ub – upper bound for severity

  • sev_conditional – if True, severity is conditional, else unconditional.

  • sev_pick_attachments – if not None, a list of attachment points to define picks

  • sev_pick_losses – if not None, a list of losses by layer

  • occ_reins – layers: share po layer xs attach or XXXX

  • occ_kind – ceded to or net of

  • freq_name – name of frequency distribution

  • freq_a – cv of freq dist mixing distribution

  • freq_b – claims per occurrence (delaporte or sig), scale of beta or lambda (Sichel)

  • freq_zm – True/False zero modified flag

  • freq_p0 – if freq_zm, provides the modified value of p0; default is nan

  • agg_reins – layers

  • agg_kind – ceded to or net of

  • note – note, enclosed in {}

_apply_reins_work(reins_list, base_density, debug=False)[source]

Actually do the work. Called by apply_reins and reins_audit_df. Only needs self to get limits, which it must guess without q (not computed at this stage). Does not need to know if occ or agg reins, only that the correct base_density is supplied.

Parameters
  • reins_list

  • kind – occ or agg, for debug plotting

  • debug

Returns

ceder, netter,

_reins_audit_df_work(kind='occ')[source]

Apply each reinsurance layer separately and aggregate loss and other stats.

_repr_html_()[source]

For IPython.display

aggregate_error_analysis(log2, bs2_from=None, **kwargs)[source]

Analysis of aggregate error across a range of bucket sizes. If bs2_from is None, use recommend_bucket plus/minus 3. Note: if the distribution does not have a second moment, you must enter bs2_from.

Parameters
  • log2

  • bs2_from – lower bound on bs to use, in log2 terms; estimate using recommend_bucket if not input.

  • kwargs – passed to update

apply_agg_reins(debug=False, padding=1, tilt_vector=None)[source]

Apply the entire agg reins structure and save output. For by layer detail create reins_audit_df. Makes sev_density_gross, sev_density_net and sev_density_ceded, and updates sev_density to the requested view.

Not reflected in statistics df.

Returns

apply_distortion(dist)[source]

Apply distortion to the aggregate density and append as the exag column to density_df.

TODO: check consistency with other implementations.

Parameters

dist – the distortion to apply

Returns

apply_occ_reins(debug=False)[source]

Apply the entire occurrence reinsurance structure and save output. For by-layer detail create reins_audit_df. Makes sev_density_gross, sev_density_net and sev_density_ceded, and updates sev_density to the requested view.

Not reflected in statistics df.

Parameters

debug – More verbose.

Returns

approximate(approx_type='slognorm', output='scipy')[source]

Create an approximation to self using method of moments matching.

Compared to Portfolio.approximate, which returns a single-severity, fixed-frequency agg, this returns a scipy distribution by default.

Use case: exam questions with the normal approximation!

Parameters
  • approx_type – norm, lognorm, slognorm (shifted lognormal), gamma, sgamma. If ‘all’ then returns a dictionary of each approx.

  • output – ‘scipy’: frozen scipy.stats continuous rv object; ‘sev_decl’: DecL program for the severity (to substitute into an agg; no name); ‘sev_kwargs’: dictionary of parameters to create a Severity; ‘agg_decl’: DecL program agg T 1 claim sev_decl fixed; any other string: created Aggregate object

Returns

as above.
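A minimal usage sketch (the DecL program and parameter values are illustrative):

from aggregate import build, qd

a = build('agg ApproxEg 10 claims sev lognorm 10 cv 0.5 poisson')
fz = a.approximate('slognorm')       # frozen scipy.stats object (default output)
print(fz.stats(moments='mvs'))       # compare with the statistics shown by qd(a)
qd(a)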

cdf(x, kind='previous')[source]

Return cumulative probability distribution at x using kind interpolation.

2022-10 change: kind introduced; default was linear

Parameters

x – loss size

Returns

cramer_lundberg(rho, cap=0, excess=0, stop_loss=0, kind='index', padding=1)

Return the Pollaczek-Khinchine capital function relating surplus to the eventual probability of ruin. Assumes frequency is Poisson.

See Embrechts, Kluppelberg, Mikosch 1.2, page 28 Formula 1.11

TODO: Should return a named tuple.

Parameters
  • rho – rho = prem / loss - 1 is the margin-to-loss ratio

  • cap – cap = cap severity at cap, which replaces severity with X | X <= cap

  • excess – excess = replace severity with X | X > cap (i.e. no shifting)

  • stop_loss – stop_loss = apply stop loss reinsurance to cap, so X > stop_loss replaced with Pr(X > stop_loss) mass

  • kind

  • padding – for update (the frequency tends to be high, so more padding may be needed)

Returns

ruin vector as pd.Series and function to lookup (no interpolation if kind==index; else interp) capitals
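A sketch of a typical call (the DecL program and margin are illustrative; per the description above, the result contains the ruin vector and a capital lookup function):

from aggregate import build

a = build('agg RuinEg 10 claims sev lognorm 1 cv 0.75 poisson')
ans = a.cramer_lundberg(rho=0.2)    # 20% margin-to-loss ratio
print(ans)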

property density_df

Create and return the density_df data frame. A read only property, though if you write d = a.density_df you can obviously edit d. Some duplication of columns (p and p_total) to ensure consistency with Portfolio.

Returns

DataFrame similar to Portfolio.density_df.

property describe

Theoretic and empirical stats. Used in _repr_html_.

discretize(sev_calc, discretization_calc, normalize)[source]

Discretize the severity distributions and weight.

sev_calc describes how the severity is discretized; see Discretizing the Severity Distribution. The options are discrete=round, forward, backward or moment.

sev_calc='continuous' (same as forward, kept for backwards compatibility) is used when you think of the resulting distribution as continuous across the buckets (which we generally don’t). The buckets are not shifted and so \(Pr(X=b_i) = Pr( b_{i-1} < X \le b_i)\). Note that \(b_{i-1}=-bs/2\) is prepended.

We use the discretized distribution as though it is fully discrete and only takes values at the bucket points. Hence, we should use sev_calc=’discrete’. The buckets are shifted left by half a bucket, so \(Pr(X=b_i) = Pr( b_i - b/2 < X \le b_i + b/2)\).

The other wrinkle is the right-hand end of the range. If we extend to np.inf then we ensure the probabilities sum to 1, but that method introduces a probability mass in the last bucket that is often not desirable (we expect to see a smooth continuous distribution, and we get a mass). The alternative is to use an endpoint one bucket beyond the last, which avoids this problem but can leave the probabilities short. We opt for the latter and normalize (rescale).

discretization_calc controls whether individual probabilities are computed using backward differences of the survival function, forward differences of the distribution function, or both. The former is most accurate in the right tail and the latter in the left tail of the distribution. We are usually concerned with the right tail, so prefer survival. Using both takes the greater of the two estimates, giving the best of both worlds (underflow makes the distribution zero in the right tail and the survival zero in the left tail, so the maximum gives the best estimate) at the expense of computing time.

Sensible defaults: sev_calc=discrete, discretization_calc=survival, normalize=True.

Parameters
  • sev_calc – discrete=round, forward, backward, or continuous and method becomes discrete otherwise

  • discretization_calc – survival, distribution or both; in addition the method then becomes survival

  • normalize – if True, normalize the severity so sum probs = 1. This is generally what you want; but when dealing with thick tailed distributions it can be helpful to turn it off.

Returns
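A minimal sketch comparing sev_calc options through update (the DecL program, bucket size and displayed statistics are illustrative):

from aggregate import build, qd

a = build('agg DiscEg 5 claims sev lognorm 10 cv 1 poisson', update=False)
for sev_calc in ['discrete', 'forward', 'backward']:
    a.update(log2=16, bs=1/32, sev_calc=sev_calc)
    print(sev_calc)
    qd(a)    # compare empirical aggregate statistics under each scheme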

easy_update(log2=16, bs=0, recommend_p=0.99999, debug=False, **kwargs)

Convenience function, delegates to update_work. Avoids having to pass xs. Also aliased as easy_update for backward compatibility.

Parameters
  • log2

  • bs

  • recommend_p – p value passed to recommend_bucket. If > 1 converted to 1 - 10**-p in rec bucket.

  • debug

  • kwargs – passed through to update

Returns

entropy_fit(n_moments, tol=1e-10, verbose=False)[source]

Find the maximum entropy fit to the aggregate based on matching n_moments moments. The constant is added (sum of probabilities constraint), so for two moments there are n_const = 3 constraints.

Based on discussions with, and R code from, Jon Evans

Run

ans = obj.entropy_fit(2)
ans['ans_df'].plot()

to compare the fits.

Parameters
  • n_moments – number of moments to match

  • tol

  • verbose

Returns

explain_validation()[source]

Explain validation result. Validation computed if needed.

freq_pmf(log2)[source]

Return the frequency probability mass function (pmf) computed using 2**log2 buckets. Uses self.en to compute the expected frequency. The Frequency does not know the expected claim count, so this is a method of Aggregate.

html_info_blob()[source]

Text top of _repr_html_

json()[source]

Write spec to json string.

Returns

limits(stat='range', kind='linear', zero_mass='include')[source]

Suggest sensible plotting limits for kind=range, density, etc., same as Portfolio.

Should optionally return a locator for plots?

Called by plotting routines. Single point of failure!

Must work without q function when not computed (apply_reins_work for occ reins; then use report_ser instead).

Parameters
  • stat – range or density (for y axis)

  • kind – linear or log (this is the y-axis, not log of range…that is rarely plotted)

  • zero_mass – include exclude, for densities

Returns

more(regex)[source]

More information about methods and properties matching regex

pdf(x)[source]

Probability density function, assuming a continuous approximation of the bucketed density.

Parameters

x

Returns

picks(attachments, layer_loss_picks, debug=False)[source]

Adjust the computed severity to hit picks targets in layers defined by the attachments. Delegates work to utilities.picks_work. See that function for details.

plot(axd=None, xmax=0, **kwargs)[source]

Basic plot with severity and aggregate, linear and log plots and Lee plot.

Parameters
  • xmax – Enter a “hint” for the xmax scale. E.g., if plotting gross and net you want all on the same scale. Only used on linear scales?

  • axd

  • kwargs – passed to make_mosaic_figure

Returns

pmf(x)[source]

Probability mass function, treating the aggregate as discrete. x must be in the index (?)

pollaczeck_khinchine(rho, cap=0, excess=0, stop_loss=0, kind='index', padding=1)[source]

Return the Pollaczek-Khinchine capital function relating surplus to the eventual probability of ruin. Assumes frequency is Poisson.

See Embrechts, Kluppelberg, Mikosch 1.2, page 28 Formula 1.11

TODO: Should return a named tuple.

Parameters
  • rho – rho = prem / loss - 1 is the margin-to-loss ratio

  • cap – cap = cap severity at cap, which replaces severity with X | X <= cap

  • excess – excess = replace severity with X | X > cap (i.e. no shifting)

  • stop_loss – stop_loss = apply stop loss reinsurance to cap, so X > stop_loss replaced with Pr(X > stop_loss) mass

  • kind

  • padding – for update (the frequency tends to be high, so more padding may be needed)

Returns

ruin vector as pd.Series and function to lookup (no interpolation if kind==index; else interp) capitals

ppf(p, kind='lower')

Return quantile function of density_df.p_total.

Definition 2.1 (Quantiles)
  • x(α) = q_α(X) = inf{x ∈ R : P[X ≤ x] ≥ α} is the lower α-quantile of X.

  • x(α) = q^α(X) = inf{x ∈ R : P[X ≤ x] > α} is the upper α-quantile of X.

kind=='middle' has been removed.

Parameters
  • p

  • kind – ‘lower’ or ‘upper’.

Returns

property pprogram

pretty print the program

property pprogram_html

pretty print the program to html

price(p, g, kind='var')[source]

Price using regulatory and pricing g functions, mirroring Portfolio.price. Unlike Portfolio, cannot calibrate. Applying specified Distortions only. If calibration is needed, embed Aggregate in a one-line Portfolio object.

Compute E_price(X ∧ E_reg(X)) (∧ = min), where E_price uses the pricing distortion and E_reg uses the regulatory distortion.

Regulatory capital distortion is applied on an unlimited basis; reg_g can be:

  • if input < 1, it is interpreted as a p value and used to determine VaR capital

  • if input > 1, it is a directly input capital number

  • a dictionary: a Distortion spec {name = dist name | var, shape = p value}, a distortion used directly

pricing_g is a spec {name = ph|wang, shape = ...}; lr or roe are not allowed (they require calibration).

if ly, must include ro in spec

Parameters
  • p – a distortion function spec or just a number; if >1 assets, if <1 a prob converted to quantile

  • kind – var lower upper tvar

  • g – pricing distortion function

Returns
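A hedged usage sketch (the DecL program and distortion parameters are illustrative assumptions):

from aggregate import build, Distortion

a = build('agg PriceEg 10 claims sev lognorm 10 cv 0.5 poisson')
# regulatory capital at the 0.999 VaR, priced with a proportional hazard distortion
ans = a.price(0.999, Distortion('ph', 0.7), kind='var')
print(ans)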

q(p, kind='lower')[source]

Return quantile function of density_df.p_total.

Definition 2.1 (Quantiles)
  • x(α) = q_α(X) = inf{x ∈ R : P[X ≤ x] ≥ α} is the lower α-quantile of X.

  • x(α) = q^α(X) = inf{x ∈ R : P[X ≤ x] > α} is the upper α-quantile of X.

kind=='middle' has been removed.

Parameters
  • p

  • kind – ‘lower’ or ‘upper’.

Returns
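A minimal check of the lower and upper quantile definitions (the DecL program is illustrative):

from aggregate import build

a = build('agg QuantEg 10 claims sev lognorm 10 cv 1 poisson')
p = 0.99
ql, qu = a.q(p, 'lower'), a.q(p, 'upper')
print(ql, qu, a.cdf(ql))   # expect ql <= qu and cdf(ql) >= p per the definition
assert ql <= qu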

q_sev(p)[source]

Compute quantile of severity distribution, returning element in the index. Very similar code to q, but only lower quantiles.

Parameters

p

Returns

recommend_bucket(log2=10, p=0.99999, verbose=False)[source]

Recommend a bucket size given 2**N buckets. Not rounded.

For thick tailed distributions need higher p, try p=1-1e-8.

If no second moment, throws a ValueError. You just can’t guess in that situation.

Parameters
  • log2 – log2 of number of buckets. log2=10 is default.

  • p – percentile to use to determine the needed range. Default is RECOMMEND_P. If > 1, converted to 1 - 10**-p.

  • verbose – print out recommended bucket sizes for 2**n for n in {log2, 16, 13, 10}

Returns

property reinsurance_audit_df

Create and return the _reins_audit_df data frame. Read only property.

Returns

reinsurance_description(kind='both', width=0)[source]

Text description of the reinsurance.

Parameters
  • kind – both, occ, or agg

  • width – width of text for textwrap.fill; omitted if width==0

property reinsurance_df

Version of density_df tailored to reinsurance. Several cases

  • occ program only: agg_density_.. is recomputed manually for all three outcomes

  • agg program only: sev_density_… not set for gcn

  • both programs: agg is gcn for the agg program applied to the requested occ output

See _apply_reins_work.

reinsurance_kinds()[source]

Text description of the kinds of reinsurance applied: None, Occurrence, Aggregate, both.

Returns

property reinsurance_occ_layer_df

How losses are layered by the occurrence reinsurance. Expected loss, CV layer loss, and expected counts to layers.

reinsurance_occ_plot(axs=None)[source]

Plots for occurrence reinsurance: occurrence log density and aggregate quantile plot.

property reinsurance_report_df

Create and return a dataframe with the reinsurance report. TODO: sort out the overlap with reinsurance_audit_df (occ and agg). What this property adds is the ceded/net of occurrence aggregates before application of the agg reinsurance. The pure occ and agg parts are in reinsurance_audit_df.

property report_df

Report created on the fly to audit creation of the object. There were some bad choices of columns in audit_df, but it is [maybe] embedded in other code. E.g., the use of _1 vs _m for the mean is inconsistent.

Returns

rescale(scale, kind='homog')[source]

Return a rescaled Aggregate object - used to compute derivatives.

All multiplications need to be safe because, with an array specification, there may be an array that is not a numpy array.

TODO: have the parser return numpy arrays, not lists!

Parameters
  • scale – amount of scale

  • kind – homog or inhomog

Returns

sample(n, replace=True)[source]

Draw a sample of n items from the aggregate distribution. Wrapper around pd.DataFrame.sample.

property sev

Make exact sf, cdf and pdfs and store in namedtuple for use as sev.cdf etc.

severity_error_analysis(sev_calc='round', discretization_calc='survival', normalize=True)[source]

Analysis of severity component errors, uses the current bs in self. Gives detailed, component by component, error analysis of severities. Includes discretization error (bs large relative to mean) and truncation error (tail integral large).

Total S shows the aggregate not severity. Generally about self.n * (1 - sum_p) (per Feller).

sf(x)[source]

Return survival function using linear interpolation.

Parameters

x – loss size

Returns

snap(x)[source]

Snap value x to the index of density_df, i.e., as a multiple of self.bs.

Parameters

x

Returns

property spec

Get the dictionary specification, but treat as a read only property

Returns

property spec_ex

All relevant info.

Returns

property statistics

Pandas series of theoretic frequency, severity, and aggregate 1st, 2nd, and 3rd moments. Mean, cv, and skewness.

Returns

tvar(p, kind='')[source]

Updated June 2023, 0.13.0

Compute the tail value at risk at threshold p

Definition 2.6 (Tail mean and Expected Shortfall). Assume E[X−] < ∞. Then x̄(α) = TM_α(X) = α^{−1} E[X 1_{X ≤ x(α)}] + x(α)(α − P[X ≤ x(α)]) is the tail mean of X at level α. Acerbi and Tasche (2002).

We are interested in the right-hand exceedance version [?? note > vs ≥]: α^{−1} E[X 1_{X > x(α)}] + x(α)(P[X ≤ x(α)] − α).

McNeil et al., pp. 66-70; this follows from the definition of ES as an integral of the quantile function.

q is the exact quantile (most of the time); q1 is the smallest index element (bucket multiple) greater than or equal to q.

The tvar integral is \(\int_p^1 q(s)\,ds = \int_q^\infty x f(x)\,dx = q + \int_q^\infty S(x)\,dx\); we use the last form. np.trapz approximates the integral, and the missing piece between q and q1 is approximated as a trapezoid too.

Parameters
  • p

  • kind

Returns
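A quick consistency sketch (the DecL program is illustrative): the tail value at risk should dominate the quantile at the same level.

from aggregate import build

a = build('agg TvarEg 10 claims sev lognorm 10 cv 1 poisson')
for p in [0.9, 0.99, 0.999]:
    print(p, a.q(p), a.tvar(p))
    assert a.tvar(p) >= a.q(p)    # TVaR is at least VaR at every level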

tvar_sev(p)[source]

TVaR of severity - now available for free!

added June 2023

update(log2=16, bs=0, recommend_p=0.99999, debug=False, **kwargs)[source]

Convenience function, delegates to update_work. Avoids having to pass xs. Also aliased as easy_update for backward compatibility.

Parameters
  • log2

  • bs

  • recommend_p – p value passed to recommend_bucket. If > 1 converted to 1 - 10**-p in rec bucket.

  • debug

  • kwargs – passed through to update

Returns

update_work(xs, padding=1, tilt_vector=None, approximation='exact', sev_calc='discrete', discretization_calc='survival', normalize=True, force_severity=False, debug=False)[source]

Compute a discrete approximation to the aggregate density.

See discretize for sev_calc, discretization_calc and normalize.

Quick simple test with log2=13 update took 5.69 ms and _eff took 2.11 ms. So quicker but not an issue unless you are doing many buckets or aggs.

Parameters
  • xs – range of x values used to discretize

  • padding – for FFT calculation

  • tilt_vector – tilt_vector = np.exp(tilt_amount * np.arange(N)), N=2**log2, and tilt_amount * N < 20 recommended

  • approximation – ‘exact’ = perform frequency / severity convolution using FFTs. ‘slognorm’ or ‘sgamma’ use a shifted lognormal or shifted gamma approximation.

  • sev_calc – discrete=round, forward, backward, or continuous and method becomes discrete otherwise

  • discretization_calc – survival, distribution or both; in addition the method then becomes survival

  • normalize – if True, normalize the severity so sum probs = 1. This is generally what you want; but when dealing with thick tailed distributions it can be helpful to turn it off.

  • force_severity – make severities even if using approximation, for plotting

  • debug – run reinsurance in debug mode if True.

Returns

property valid

Check if the model appears valid. An answer of True means the model is “not unreasonable”. It does not guarantee the model is valid. On the other hand, False means it is definitely suspect. (The interpretation is similar to the null hypothesis in a statistical test). Called and reported automatically by qd for Aggregate objects.

Checks the relative errors (from self.describe) for:

  • severity mean < eps

  • severity cv < 10 * eps

  • severity skew < 100 * eps (skewness is more difficult to estimate)

  • aggregate mean < eps and < 2 * severity mean relative error (larger values indicate possibility of aliasing and that bs is too small).

  • aggregate cv < 10 * eps

  • aggregate skew < 100 * eps

The default uses eps = 1e-4 relative error. This can be changed by setting the validation_eps variable.

The test is only applied to CV and skewness when they are > 0.

Run with logger level 20 (info) for more information on failures.

A Type 1 error (rejecting a valid model) is more likely than a Type 2 error (failing to reject an invalid one).

Returns

True (interpreted as not unreasonable) if all tests are passed, else False.
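A short sketch (the DecL program and coarse bucket count are illustrative) showing the validation flag and its explanation:

from aggregate import build

# deliberately coarse update so the empirical moments may not reconcile
a = build('agg ValidEg 10 claims sev lognorm 10 cv 3 poisson', log2=8)
print(a.valid)
print(a.explain_validation())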

var_dict(p, kind='lower', snap=False)[source]

Make a dictionary of values at risk for the line, mirroring Portfolio.var_dict. Here it just marshals calls to the appropriate var or tvar function.

No epd. Allows the price function to run consistently with Portfolio version.

Example Use:

for p, arg in zip([.996, .996, .996, .985, .01], ['var', 'lower', 'upper', 'tvar', 'epd']):
    print(port.var_dict(p, arg,  snap=True))
Parameters
  • p

  • kind – var (defaults to lower), upper, lower, tvar

  • snap – snap tvars to index

Returns