5.2. Quantiles and Related Risk Measures

Objectives: Definition and calculation of quantiles and related risk measures.

Audience: Readers interested in quantiles, VaR, and TVaR risk measures.

Prerequisites: Risk measures, probability.

See also: Insurance Probability.

Contents:

Helpful References
Quantiles
Value at Risk
The Failure of VaR to be Subadditive
Tail VaR and Related Risk Measures

5.2.1. Helpful References

Klugman et al. [2019]
Mildenhall and Major [2022], Chapter 4
Hyndman and Fan [1996]

5.2.2. Quantiles

A quantile function is inverse to the distribution function \(F(x):=\mathsf{Pr}(X\le x)\). For each \(0 < p < 1\), it solves \(F(x)=p\) for \(x\), answering the question,

which \(x\) has non-exceedance probability equal to \(p\)?

Or, said another way,

which \(x\) has exceedance probability equal to \(1-p\)?

When the distribution function is continuous and strictly increasing there is a unique such \(x\). It is called the \(p\)-quantile, and is denoted \(q(p)\). The resulting function \(q(p)=F^{-1}(p)\) is called the quantile function; it satisfies \(F(q(p))=p\).
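As a small illustration of the continuous, strictly increasing case, the sketch below uses a unit-mean exponential distribution. That choice is an assumption made only because its inverse has a closed form, and the names `F` and `q` are illustrative rather than part of any library.

```python
import math

def F(x):
    """Distribution function of a unit-mean exponential (illustrative choice)."""
    return 1.0 - math.exp(-x)

def q(p):
    """Quantile function: the unique x solving F(x) = p for 0 < p < 1."""
    return -math.log(1.0 - p)

# Round trip: F(q(p)) recovers p up to floating-point error.
for p in (0.1, 0.5, 0.99):
    print(p, q(p), F(q(p)))
```

Because \(F\) is continuous and strictly increasing, the round trip \(F(q(p))=p\) holds for every \(0 < p < 1\).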

Two issues arise when defining quantiles.

The equation \(F(x)=p\) may fail to have a unique solution when \(F\) is not strictly increasing. This can occur for any \(F\). It corresponds to a range of outcome values with probability zero.
When \(F\) is not continuous, the equation \(F(x)=p\) may have no solution: \(F\) can jump from below \(p\) to above \(p\). Simulation and catastrophe models, and all discrete random variables, have discontinuous distributions.

Example.

Here’s an example of the problems that can occur.

In [1]: from aggregate.extensions.pir_figures import fig_4_1

In [2]: fig = fig_4_1()

The distribution \(F\) has a flat spot between 0.9 and 1.5 at height \(p=0.417\). At \(x=1.5\) it jumps up to \(p=0.791\). The “inverse” to \(F\) at \(p=0.417\) could be any value between 0.9 and 1.5, as illustrated by the lower green horizontal dashed line. The inverse at any value \(0.417 < p < 0.791\) does not exist because \(F\) jumps over it: for example, there is no \(x\) so that \(F(x)=0.6\). However, any rational person looking at the graph would agree that the answer must be \(x=1.5\), where the black dashed line intersects the vertical line \(x=1.5\).

When \(F\) is not continuous and \(F(x)=p\) has no solution because \(p\) lies within a jump, we can still find an \(x\) so that

\[\mathsf{Pr}(X < x)\le p \le \mathsf{Pr}(X\le x).\]

\(\mathsf{Pr}(X<x)\) equals the height of \(F\) at the bottom of the jump and \(\mathsf{Pr}(X\le x)\) at the top. Turning this around, we can also say \(\mathsf{Pr}(X\ge x)\ge 1-p\ge \mathsf{Pr}(X> x)\). At a \(p\) with no jump, \(\mathsf{Pr}(X=x)=0\), \(\mathsf{Pr}(X < x)=p=\mathsf{Pr}(X\le x)\), and we have a well defined inverse, as the lower line at \(p=0.283\) illustrates.

The vertical segment at \(x=1.5\) between \(p=0.417\) and \(p=0.791\) is not strictly a part of \(F\)’s graph, because a function must associate a unique value to each \(x\) in its domain. However, filling in the vertical segment makes it easier to locate inverse values by finding the graph’s intersection with the horizontal line at \(p\) and is recommended in Rockafellar and Royset [2014]. Mentally, you should always fill in jumps in this way, treating the added segment as part of the graph.

Definition. Let \(X\) be a random variable with distribution function \(F\) and \(0 < p < 1\). Any \(x\) satisfying

\[\mathsf{Pr}(X < x)\le p\le \mathsf{Pr}(X\le x)\]

is a \(p\)-quantile of \(X\). Any function \(q(p)\) satisfying

\[\mathsf{Pr}(X < q(p))\le p\le \mathsf{Pr}(X\le q(p))\]

for \(0 < p < 1\) is a quantile function of \(X\).

Exercise. What are the \(0.1\) and \(1/6\) quantiles for the outcomes of the fair roll of a 6-sided die?

Solution. There are six outcomes \(\{1,2,3,4,5,6\}\) each with probability \(1/6\). The distribution function jumps at each outcome.

For \(p=0.1\) we seek \(x\) so that \(\mathsf{Pr}(X < x) \le 0.1 \le \mathsf{Pr}(X\le x)\). We know \(0=\mathsf{Pr}(X<1)<\mathsf{Pr}(X\le 1)=1/6\) and therefore \(q(0.1)=1\). It is good to rule out other possible values. If \(x<1\) then \(\mathsf{Pr}(X\le x)=0\) and if \(x>1\) then \(\mathsf{Pr}(X < x)\ge 1/6\), showing neither alternative satisfies the definition of a quantile.
For \(p=1/6\) we seek \(x\) so that \(\mathsf{Pr}(X < x) \le 1/6 \le \mathsf{Pr}(X\le x)\), which is satisfied by any \(1\le x \le 2\). If we pick \(x=1\) then \(0=\mathsf{Pr}(X<1)<1/6=\mathsf{Pr}(X\le 1)\). If we pick \(1 < x < 2\) then \(\mathsf{Pr}(X < x)=1/6=\mathsf{Pr}(X\le x)\). If \(x=2\) then \(\mathsf{Pr}(X<2)=1/6<\mathsf{Pr}(X\le 2)=1/3\).
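The solution can be checked mechanically. The sketch below tests the defining inequality \(\mathsf{Pr}(X < x)\le p\le \mathsf{Pr}(X\le x)\) for the fair die using exact fractions; the helper names (`pr_less`, `pr_less_equal`, `is_quantile`) are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

# Fair six-sided die: outcomes 1..6, each with probability 1/6 (exact arithmetic).
OUTCOMES = range(1, 7)
PROB = Fraction(1, 6)

def pr_less(x):
    """Pr(X < x)."""
    return sum(PROB for k in OUTCOMES if k < x)

def pr_less_equal(x):
    """Pr(X <= x)."""
    return sum(PROB for k in OUTCOMES if k <= x)

def is_quantile(x, p):
    """True when x satisfies Pr(X < x) <= p <= Pr(X <= x)."""
    return pr_less(x) <= p <= pr_less_equal(x)

# x = 1 is a 0.1-quantile, and x = 2 is not.
print(is_quantile(1, Fraction(1, 10)), is_quantile(2, Fraction(1, 10)))
# Every x in [1, 2] is a 1/6-quantile; x = 5/2 is not.
print([is_quantile(x, Fraction(1, 6)) for x in (1, Fraction(3, 2), 2, Fraction(5, 2))])
```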

Since the distribution and quantile functions are inverse, their graphs are reflections of one another in a 45-degree line through the origin. The distribution function is continuous from the right, hence the location of the probability masses indicated by the circles.

Define

The lower quantile function \(q^-(p) := \sup\ \{x \mid F(x) < p \} = \inf\ \{ x \mid F(x) \ge p \}\), and
The upper quantile function \(q^+(p) := \sup\ \{x \mid F(x) \le p \} = \inf\ \{ x \mid F(x) > p \}\).

The lower and upper quantiles both satisfy the requirements to be a quantile function. The lower quantile is left continuous. The upper quantile is right continuous. When the quantile is not unique, it lies between the lower and upper values.
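For a discrete distribution both infima are attained at a mass point, so the two functions can be sketched with a cumulative scan. This is a minimal illustration using exact fractions, not the aggregate implementation; the function names are assumptions.

```python
from fractions import Fraction

def lower_quantile(values, probs, p):
    """q^-(p) = inf{x : F(x) >= p} for a discrete distribution."""
    cum = Fraction(0)
    for x, pr in sorted(zip(values, probs)):
        cum += pr
        if cum >= p:
            return x
    raise ValueError("probabilities must sum to at least p")

def upper_quantile(values, probs, p):
    """q^+(p) = inf{x : F(x) > p} for a discrete distribution."""
    cum = Fraction(0)
    for x, pr in sorted(zip(values, probs)):
        cum += pr
        if cum > p:
            return x
    raise ValueError("no finite upper quantile: F never exceeds p")

die = list(range(1, 7))
probs = [Fraction(1, 6)] * 6

# At p = 1/6 the quantile is not unique: the lower value is 1 and the upper is 2.
print(lower_quantile(die, probs, Fraction(1, 6)), upper_quantile(die, probs, Fraction(1, 6)))
# At p = 1/10, inside a jump, the two agree.
print(lower_quantile(die, probs, Fraction(1, 10)), upper_quantile(die, probs, Fraction(1, 10)))
```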

5.2.3. Value at Risk

When a quantile is used as a risk measure it is called Value at Risk (VaR): \(\mathsf{VaR}_p(X):=q^-(p) = \inf\ \{ x\mid F(x) \ge p\}\).

Thus \(l\) is \(\mathsf{VaR}_p(X)\) if it is the smallest loss such that the probability that \(X\le l\) is at least \(p\). This is sometimes phrased: the smallest loss so that \(X\le l\) with confidence at least \(p\). Smallest loss allows for the case where \(F\) is flat at \(p\). Probability \(\ge p\) allows for jumps in \(F\).

VaR has several advantages. It is simple to explain, can be estimated robustly, and is always finite. It is widely used by regulators, rating agencies, and companies in their internal risk management. Its principal disadvantage is its failure to be subadditive.

5.2.4. The Failure of VaR to be Subadditive

It is easy to create simple discrete examples where VaR fails to be subadditive, for example:

\[\begin{split}\small \begin{matrix} \begin{array}{clrrrr}\hline \text{Event} & \text{Prob} & F & X_1 & X_2 & X \\ \hline 1 & 0.98 & 0.98 & 0 & 0 & 0 \\ 2 & 0.01 & 0.99 & 1000 & 100 & 1100 \\ 3 & 0.01 & 1.00 & 150 & 1100 & 1250 \\ \hline \end{array} \end{matrix}\end{split}\]

\(X_1\) has 0.99 VaR 150 and \(X_2\) has 0.99 VaR 100 but \(X\) has 0.99 VaR 1100.
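The three VaR values can be reproduced directly from the table. The sketch below applies the lower-quantile definition \(\inf\ \{x \mid F(x)\ge p\}\) with exact fractions; the helper `var` is hypothetical and is not the aggregate API.

```python
from fractions import Fraction

# Event probabilities and outcomes copied from the table above.
probs = [Fraction("0.98"), Fraction("0.01"), Fraction("0.01")]
X1 = [0, 1000, 150]
X2 = [0, 100, 1100]
X = [a + b for a, b in zip(X1, X2)]

def var(outcomes, probs, p):
    """Lower quantile VaR_p = inf{x : F(x) >= p} for a discrete variable."""
    cum = Fraction(0)
    for x, pr in sorted(zip(outcomes, probs)):
        cum += pr
        if cum >= p:
            return x
    raise ValueError("probabilities must sum to at least p")

p = Fraction("0.99")
v1, v2, v = var(X1, probs, p), var(X2, probs, p), var(X, probs, p)
print(v1, v2, v)      # 150 100 1100
print(v <= v1 + v2)   # False: subadditivity fails at p = 0.99
```

Each variable is sorted by its own outcomes before accumulating probability, which is why \(X_1\) reaches probability 0.99 at 150 rather than at 1000.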

More interestingly, 0.7-VaR applied to the sum of two independent exponential distributions is not subadditive, but 0.95-VaR is.

In [3]: from aggregate import build, qd

In [4]: import pandas as pd

In [5]: p = build('port NotSA '
   ...:           'agg A dfreq [1] sev 1 * expon '
   ...:           'agg B dfreq [1] sev 1 * expon')
   ...: 

In [6]: ans = p.var_dict(0.7)

In [7]: ans['sum'] = ans['A'] + ans['B']

In [8]: ans2 = p.var_dict(0.95)

In [9]: ans2['sum'] = ans2['A'] + ans2['B']

In [10]: pd.DataFrame([ans, ans2], index=pd.Index(['0.70', '0.95'], name='p'))
Out[10]: 
          A      B  total    sum
p                               
0.70  1.204  1.204  2.439  2.408
0.95  2.996  2.996  4.744  5.992

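The function var_dict returns the VaR of each unit in p and of the total. Subadditivity requires the total VaR to be less than or equal to the sum of the parts. At \(p=0.7\) the total VaR (2.439) is greater than the sum of the parts (2.408), so subadditivity fails; at \(p=0.95\) the total (4.744) is less than the sum (5.992).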
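The table can be cross-checked without aggregate, because both distributions have closed forms: a unit-mean exponential has \(F(x)=1-e^{-x}\), and the sum of two independent copies has \(F(x)=1-e^{-x}(1+x)\) (a gamma distribution with shape 2). The sketch below uses only the standard library; the function names are illustrative, and small differences from the table would be expected because aggregate works with a discretized distribution.

```python
import math

def var_expon(p):
    """VaR of a unit-mean exponential: solves 1 - exp(-x) = p."""
    return -math.log(1.0 - p)

def cdf_sum(x):
    """Distribution function of the sum of two independent unit-mean exponentials."""
    return 1.0 - math.exp(-x) * (1.0 + x)

def var_sum(p, lo=0.0, hi=50.0, tol=1e-10):
    """Invert cdf_sum by bisection; cdf_sum is continuous and strictly increasing."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf_sum(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for p in (0.70, 0.95):
    parts = 2 * var_expon(p)
    total = var_sum(p)
    print(f"p={p:.2f} total={total:.3f} sum={parts:.3f} subadditive={total <= parts}")
```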

5.2.5. Tail VaR and Related Risk Measures

Tail value at risk (TVaR) is the conditional average of the worst \(1-p\) proportion of outcomes. Let \(X\) be a loss random variable and \(0 \le p<1\). Then \(p\)-Tail Value at Risk is given by

\[\begin{split}\mathsf{TVaR}_p(X) :&= \dfrac{1}{1-p}\int_{p}^1 \mathsf{VaR}_s(X)\,ds \\ &= \dfrac{1}{1-p}\int_{p}^1 q^-(s)\,ds.\end{split}\]

In particular \(\mathsf{TVaR}_0(X)=\mathsf{E}[X]\). When \(p=1\), \(\mathsf{TVaR}_1(X)\) is defined to be \(\sup(X)\) if \(X\) is bounded.

TVaR is defined in terms of \(q^-\), that is, dual implicit events. The actual sample space on which \(X\) is defined is not used. Recall, \(\mathsf{VaR}_p(X)\) refers to the lower quantile \(q^-(p)\).

TVaR is a well behaved function of \(p\). It is continuous, differentiable almost everywhere, and equal to the integral of its derivative (fundamental theorem of calculus). It takes every value between \(\mathsf{E}[X]\) and \(\sup X\). TVaR has a kink at jumps in \(F\) and is differentiable elsewhere.

5.2.5.1. Algorithm to Evaluate TVaR for a Discrete Distribution

Algorithm Input: \(X\) is a discrete random variable, taking \(N\) equally likely values \(X_j\ge 0\), \(j=0,\dots, N-1\). Probability level \(p\).

Follow these steps to determine \(\mathsf{TVaR}_p(X)\).

Algorithm Steps

Sort outcomes into ascending order \(X_0 \le \dots \le X_{N-1}\).
Find \(n\) so that \(n \le pN < (n+1)\).
If \(n+1=N\) then \(\mathsf{TVaR}_p(X) := X_{N-1}\) is the largest observation, exit;
Else \(n < N-1\) and continue.
Compute \(T_1 := X_{n+1} + \cdots + X_{N-1}\).
Compute \(T_2 := ((n+1)-pN)X_n\).
Compute \(\mathsf{TVaR}_p(X) := (1-p)^{-1}(T_1+T_2)/N\).

These steps compute the average of the largest \(N(1-p)\) observations. Step (6) adds a pro-rata portion of the \(\lceil N(1-p)\rceil\)th largest observation when \(N(1-p)\) is not an integer. For instance, if \(N=71\) and \(p=0.95\), then \(Np=67.45\) and \(n=67\), giving \(\mathsf{TVaR}_p = 20(0.55X_{67}+X_{68}+X_{69}+X_{70})/71\).
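The algorithm steps translate almost line for line into Python. The following is a minimal sketch assuming equally likely outcomes passed as a list; `tvar_discrete` is a hypothetical name and this is not the aggregate implementation.

```python
import math

def tvar_discrete(outcomes, p):
    """TVaR at level p of N equally likely outcomes, following the steps above."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    x = sorted(outcomes)               # step 1: ascending order
    N = len(x)
    n = math.floor(p * N)              # step 2: n <= pN < n + 1
    if n + 1 == N:                     # step 3: only the largest observation remains
        return x[-1]
    t1 = sum(x[n + 1:])                # step 5: observations counted in full
    t2 = ((n + 1) - p * N) * x[n]      # step 6: pro-rata share of X_n
    return (t1 + t2) / (N * (1 - p))   # step 7

# The N = 71, p = 0.95 instance from the text, with X_j = j for illustration.
sample = list(range(71))
print(tvar_discrete(sample, 0.95))
print(20 * (0.55 * 67 + 68 + 69 + 70) / 71)
```

The result is a continuous function of \(p\), so small floating-point errors in \(pN\) near an integer change the answer only slightly.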

Example.

Let \(X\) be defined on a sample space with ten equally likely events and outcomes \(0,1,1,1,2,3, 4,8, 12, 25\). Compute \(\mathsf{TVaR}_p(X)\) for all \(p\). Is it a piecewise linear function?

Solution. For \(0.9 \le p < 1\), \(q^-(s)=25\) for all \(s>p\) and so \(\mathsf{TVaR}_p(X)=25\). For \(0.8 \le p < 0.9\)

\[\begin{split}(1-p)\mathsf{TVaR}_p(X) &= \int_p^1 q^-(s)ds \\ &= \int_p^{0.9}q^-(s)ds+ \int_{0.9}^1q^-(s)ds \\ &= (0.9-p)\times 12 + (1-0.9)\times \mathsf{TVaR}_{0.9}(X),\end{split}\]

for \(0.7 \le p < 0.8\)

\[(1-p)\mathsf{TVaR}_p(X) = (0.8-p)\times 8 + (1-0.8)\times \mathsf{TVaR}_{0.8}(X),\]

and so forth. The TVaR function is shown below. TVaR is not piecewise linear. For example, for \(0.8\le p<0.9\), \(\mathsf{TVaR}_p(X)=(12(0.9-p) + 2.5)/(1-p)\).
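The non-linearity can be confirmed numerically by integrating the step function \(q^-\) directly. The sketch below is an illustration under the assumption of equally likely outcomes; `tvar_by_integration` is a hypothetical helper.

```python
def tvar_by_integration(outcomes, p):
    """(1 - p)^{-1} times the integral of q^-(s) over (p, 1] for N equally likely outcomes.

    q^-(s) equals the k-th sorted outcome for k/N < s <= (k+1)/N, so the
    integral is a sum of rectangle areas.
    """
    x = sorted(outcomes)
    N = len(x)
    integral = 0.0
    for k, xk in enumerate(x):
        left, right = max(p, k / N), (k + 1) / N
        if right > left:
            integral += xk * (right - left)
    return integral / (1.0 - p)

outcomes = [0, 1, 1, 1, 2, 3, 4, 8, 12, 25]
t80, t85, t90 = (tvar_by_integration(outcomes, p) for p in (0.80, 0.85, 0.90))
print(t80, t85, t90)
print((t80 + t90) / 2)   # chord midpoint, which differs from t85
```

On \(0.8\le p<0.9\) the formula simplifies to \(12 + 1.3/(1-p)\), which is convex in \(p\), so the value at \(p=0.85\) lies below the midpoint of the chord joining the values at 0.8 and 0.9.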

The default aggregate TVaR function ignores this slight non-linearity and just interpolates. To get a more exact answer use kind='tail'. The difference is illustrated on the left in the next figure.

In [11]: from aggregate.extensions.pir_figures import fig_4_8

In [12]: fig_4_8()

5.2.5.2. CTE and WCE: Alternatives to TVaR

There are two other risk measures (confusingly) similar to TVaR.

Tail value at risk (TVaR) is the conditional average of the worst \(1-p\) proportion of outcomes.
Conditional tail expectation (CTE) refers to the conditional expectation of \(X\) over \(X\ge \mathsf{VaR}_p(X)\).
Worst conditional expectation (WCE) refers to the greatest expected value of \(X\) conditional on a set of probability \(>1-p\).

The formal definitions of CTE and WCE are as follows. Let \(X\) be a loss random variable and \(0 \le p<1\).

\(\mathsf{CTE}_p(X) := \mathsf{E}[X \mid X \ge \mathsf{VaR}_p(X)]\) is the (lower) conditional tail expectation (CTE).
The upper CTE equals \(\mathsf{E}[X \mid X \ge q^+(p)]\).
\(\mathsf{WCE}_p(X) := \sup\ \{ \mathsf{E}[X \mid A] \mid \mathsf{Pr}(A) > 1-p \}\) is the worst conditional expectation.

Like TVaR, CTE is defined in terms of quantiles, and the sample space on which \(X\) is defined is not used. In contrast, WCE works with the original sample space and relies on its events. Some actuarial papers refer to CTE as tail value at risk, e.g., Bodoff [2007].

For continuous random variables TVaR, CTE, and WCE are all equal, and they are easy to compute. The distinctions between them arise for discrete and mixed variables when \(p\) coincides with a mass point.
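To see the distinction at a mass point, the sketch below compares lower CTE, upper CTE, and TVaR at \(p=0.8\) for the ten equally likely outcomes \(0,1,1,1,2,3,4,8,12,25\) used in the TVaR example. It uses exact fractions, all function names are hypothetical, and WCE is not computed because it depends on the events of the underlying sample space.

```python
from fractions import Fraction

outcomes = sorted([0, 1, 1, 1, 2, 3, 4, 8, 12, 25])   # ten equally likely events
N = len(outcomes)

def lower_q(p):
    """q^-(p) = inf{x : F(x) >= p}."""
    return next(x for x in outcomes if Fraction(sum(y <= x for y in outcomes), N) >= p)

def upper_q(p):
    """q^+(p) = inf{x : F(x) > p}."""
    return next(x for x in outcomes if Fraction(sum(y <= x for y in outcomes), N) > p)

def cond_mean_at_or_above(threshold):
    """E[X | X >= threshold]."""
    tail = [x for x in outcomes if x >= threshold]
    return Fraction(sum(tail), len(tail))

def tvar(p):
    """(1 - p)^{-1} times the integral of q^-(s) over (p, 1], as a sum of rectangles."""
    total = Fraction(0)
    for k, xk in enumerate(outcomes):
        left, right = max(p, Fraction(k, N)), Fraction(k + 1, N)
        if right > left:
            total += xk * (right - left)
    return total / (1 - p)

p = Fraction(8, 10)
lower_cte = cond_mean_at_or_above(lower_q(p))
upper_cte = cond_mean_at_or_above(upper_q(p))
print(lower_cte, upper_cte, tvar(p))   # 15 37/2 37/2
```

Here \(F(8)=0.8\), so \(\mathsf{VaR}_{0.8}(X)=8\) and the lower CTE averages the three outcomes 8, 12, and 25, whereas TVaR averages only the worst 20 percent of outcomes, 12 and 25.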

5.2.5.3. Expected Policyholder Deficit

The expected policyholder deficit EPD when a risk \(X\) is supported by assets \(a\) equals \(\mathsf E[(X-a)^+]\), the unconditional excess loss cost. The insurer defaults on the EPD amount.

The EPD ratio is defined as the ratio of the EPD to expected losses. It gives the proportion of losses that are unpaid when \(X\) is supported by assets \(a\).
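For a finite sample of equally likely outcomes both quantities are simple averages. The sketch below reuses the ten outcomes from the TVaR example purely for illustration; the function names are assumptions and the code does not read aggregate's density_df.

```python
def epd(outcomes, assets):
    """E[(X - a)^+] for equally likely outcomes."""
    return sum(max(x - assets, 0.0) for x in outcomes) / len(outcomes)

def epd_ratio(outcomes, assets):
    """EPD divided by expected losses E[X]."""
    mean = sum(outcomes) / len(outcomes)
    return epd(outcomes, assets) / mean

outcomes = [0, 1, 1, 1, 2, 3, 4, 8, 12, 25]
for a in (0, 12, 25):
    print(a, epd(outcomes, a), epd_ratio(outcomes, a))
```

With assets of 12 only the outcome 25 causes a shortfall, of 13, so the EPD is \(13/10=1.3\) and the EPD ratio is \(1.3/5.7\approx 0.228\).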

Example.

We can use the EPD to define a tail risk measure that is analogous to VaR and TVaR. Define the EPD risk measure \(\mathsf{EPD}_s(X)\) to be the amount of assets resulting in an EPD ratio of \(0 < s < 1\), i.e., solving

\[\mathsf{E}[(X-\mathsf{EPD}_s(X))^+] = s\mathsf{E}[X].\]

The EPD risk measure is a stricter standard for smaller \(s\). It accounts for the degree of default relative to promised payments, making it attractive to regulators. It is used to set risk based capital standards in Butsic [1994] and as a capital standard in Myers and Read Jr. [2001].
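Because the EPD ratio decreases continuously from 1 to 0 as assets increase from 0 to the largest loss, the defining equation can be solved by bisection. The sketch below assumes non-negative, equally likely outcomes; `epd_assets` is a hypothetical helper, not an aggregate function.

```python
def epd_ratio(outcomes, assets):
    """E[(X - a)^+] / E[X] for equally likely outcomes."""
    n = len(outcomes)
    return (sum(max(x - assets, 0.0) for x in outcomes) / n) / (sum(outcomes) / n)

def epd_assets(outcomes, s, tol=1e-10):
    """Assets a whose EPD ratio equals s, for 0 < s < 1, found by bisection.

    Assumes non-negative losses, so the ratio is 1 at a = 0 and 0 at a = max(X).
    """
    if not 0 < s < 1:
        raise ValueError("s must satisfy 0 < s < 1")
    lo, hi = 0.0, float(max(outcomes))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if epd_ratio(outcomes, mid) > s:
            lo = mid   # shortfall still too large: more assets are needed
        else:
            hi = mid
    return 0.5 * (lo + hi)

outcomes = [0, 1, 1, 1, 2, 3, 4, 8, 12, 25]
for s in (0.2, 0.1):
    print(s, epd_assets(outcomes, s))
```

The smaller ratio requires more assets, consistent with the EPD risk measure being a stricter standard for smaller \(s\).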

EPD is available in aggregate as the epd column in density_df.