## A Tangled Tale

May 31, 2007

Posted by Peter in Exam 1/P.

As is now common knowledge, Lewis Carroll was not merely a writer of children’s tales, but also a mathematician. He was fond of puzzles of a logical nature, and in his work A Tangled Tale, he posed a question that is particularly relevant to the concepts of probability theory tested on Exam 1/P:

“Sad — but very curious when you come to look at it arithmetically,” was her aunt’s less romantic reply. “Some of them have lost an arm in their country’s service, some a leg, some an ear, some an eye — ”

“And some, perhaps, all!” Clara murmured dreamily….

“Say that 70 per cent have lost an eye — 75 per cent an ear — 80 per cent an arm — 85 per cent a leg — that’ll do it beautifully. Now, my dear, what percentage, at least, must have lost all four?”

Being the writer that he was, Carroll posed the question in the setting of a conversation between a young girl, Clara, and her apparently unsympathetic aunt, who, I suspect, may have had Asperger’s syndrome, and perhaps would have made an excellent actuary. But the question is this: Given that 70% of veterans have lost an eye, 75% an ear, 80% an arm, and 85% a leg, what is the minimum percentage of veterans that have lost all four? The solution, as I furnished it to the individual who directed my attention to this question, is as follows:

If no less than 70% of the soldiers lost one eye, then no more than 30% of the soldiers did not lose one eye. Similarly, no more than 25% of the soldiers did not lose one ear; no more than 20% of the soldiers did not lose one arm; and no more than 15% of the soldiers did not lose one leg.

We see that the minimum possible percentage of soldiers who lost all four parts is attained when the groups of soldiers who retained each part are mutually disjoint. This is because if we maximize the number of soldiers who retained at least one body part, we minimize the number of soldiers who lost all such parts. To this end, the maximum number of soldiers retaining at least one body part occurs if no soldier has more than one surviving body part; that is, the 30%, 25%, 20%, and 15% of soldiers who retained an eye, ear, arm, and leg, respectively, are assumed to have lost all other parts.

Consequently, the percentage of soldiers who have retained at least one body part is at most 30 + 25 + 20 + 15 = 90%. Therefore, we are guaranteed that at least 10% of the soldiers have lost all four body parts.
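To see that the 10% bound is actually attained, here is a minimal sketch that builds one configuration of 100 hypothetical soldiers in which the groups who kept a part are disjoint. The index ranges are an arbitrary choice made for illustration; any disjoint assignment of the same sizes would do.

```python
# 100 hypothetical soldiers, indexed 0..99.
soldiers = set(range(100))

# Disjoint groups of soldiers who KEPT each part (sizes 30, 25, 20, 15).
kept = {
    "eye": set(range(0, 30)),
    "ear": set(range(30, 55)),
    "arm": set(range(55, 75)),
    "leg": set(range(75, 90)),
}

# Everyone outside a "kept" group lost that part.
lost = {part: soldiers - keepers for part, keepers in kept.items()}
loss_counts = {part: len(group) for part, group in lost.items()}

# Soldiers who lost all four parts are in every "lost" group.
lost_all = set.intersection(*lost.values())

print(loss_counts)    # {'eye': 70, 'ear': 75, 'arm': 80, 'leg': 85}
print(len(lost_all))  # 10
```

The loss percentages match the aunt's figures, and exactly 10 of the 100 soldiers have lost all four parts, so the lower bound cannot be improved.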

## Question 21, Spring 2007 MLC

May 29, 2007

Posted by Peter in Exam 3/MLC.

This question was received with a bit of controversy because of its wording.

21. You are given the following information about a new model for buildings with limiting age ω.

1. The expected number of buildings surviving at age x will be $l_x = (\omega - x)^\alpha$, x < ω.
2. The new model predicts a 33.3% higher complete life expectancy (over the previous DeMoivre model with the same ω) for buildings aged 30.
3. The complete life expectancy for buildings aged 60 under the new model is 20 years.

Calculate the complete life expectancy under the previous DeMoivre model for buildings aged 70.

The problem with the way this question reads lies in the phrase “previous DeMoivre model” mentioned in item 2, and at the end of the question. Usually, one does not relegate essential information to an offhandedly casual and parenthetical remark. A properly posed question should read “…previous model, which is DeMoivre….” This eliminates the ambiguity of whether the word “previous” belongs to “DeMoivre” or to “model,” the latter being the intended meaning. This issue is further exacerbated by the fact that the new model is a modified/generalized DeMoivre, and that both models share the same ω. Together, this leads to a confusingly written question, because the author did not take care to make it absolutely clear (preferably in a separately listed item) that the old model was DeMoivre (or equivalently, α = 1). That said, the solution is as follows:

Solution: We first compute the complete life expectancy of a building aged (x) under the new model, noting that the old model has α = 1:

${\setlength\arraycolsep{2pt} \begin{array}{rcl}\displaystyle\overset{\circ}{e}_x(\alpha) &=& \displaystyle\int_0^{\omega - x} \!\!_t p_x \, dt = \int_0^{\omega - x} \frac{l_{x+t}}{l_x} \, dt = \int_0^{\omega - x} \!\left(1 - \frac{t}{\omega - x}\right)^{\!\alpha} dt \\ &=& \displaystyle\left[-\frac{\omega-x}{\alpha+1} \left(1 - \frac{t}{\omega - x}\right)^{\alpha+1}\right]_{t=0}^{\omega-x} = \frac{\omega-x}{\alpha+1}. \end{array}}$

Then item (2) gives the condition $\overset{\circ}{e}_{30}(\alpha) = \frac{4}{3} \overset{\circ}{e}_{30}(1)$, from which we obtain

$\displaystyle \frac{\omega-30}{\alpha+1} = \frac{4}{3} \cdot \frac{\omega - 30}{2},$

and hence α = 1/2. Item (3) then gives the condition $\displaystyle \overset{\circ}{e}_{60}(1/2) = \frac{\omega-60}{\frac{1}{2}+1} = 20,$ so ω = 90. Therefore,

$\overset{\circ}{e}_{70}(1) = \frac{90-70}{2} = 10.$
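The arithmetic above can be checked with exact fractions. This is only a sketch: the helper name `complete_expectation` is made up for illustration, and the "33.3% higher" in item 2 is taken to mean a factor of exactly 4/3, as in the solution.

```python
from fractions import Fraction

def complete_expectation(omega, x, alpha):
    """Complete life expectancy (omega - x) / (alpha + 1) derived above."""
    return Fraction(omega - x) / (alpha + 1)

# Item 2: (omega - 30)/(alpha + 1) = (4/3) * (omega - 30)/2,
# so 1/(alpha + 1) = 2/3 regardless of omega.
alpha = 1 / (Fraction(4, 3) * Fraction(1, 2)) - 1

# Item 3: (omega - 60)/(alpha + 1) = 20.
omega = 60 + 20 * (alpha + 1)

# DeMoivre (alpha = 1) expectation at age 70.
answer = complete_expectation(omega, 70, 1)

print(alpha, omega, answer)  # 1/2 90 10
```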

## From the AIME

May 29, 2007

Posted by Peter in Exam 1/P.

Some time ago, I answered a question that was featured in the American Invitational Mathematics Examination (AIME), which is one step in the series of AMC exams that leads to the US team selection for the International Mathematical Olympiad (IMO). The AMC is open to high school students in the US, and every now and then, a question from the theory of probability crops up on the exam. This one would make a particularly challenging question for CAS/SOA Exam 1/P:

A jar has 10 red candies and 10 blue candies. Terry picks two candies at random, then Mary picks two of the remaining candies at random. Given that the probability that they get the same color combination, irrespective of order, is m/n, where m and n are relatively prime positive integers, find m + n.

Solution:
Terry has a 1/2 probability of choosing a red candy on his first draw. He then has a 9/19 probability of choosing another red candy on his second draw. Thus the probability of his having two red candies is 9/38. Similarly, the probability of his having two blue candies is 9/38. Therefore, the probability of his having one of each color is 1 – 2(9/38) = 10/19.

Now, given that Terry has two candies of the same color, the probability that Mary selects two more candies of that same color is (8/18)(7/17). Therefore, the probability that Terry and Mary have all chosen candies of the same color (all red or all blue) is

(9/19)(8/18)(7/17) = 28/323.

However, given that Terry has one candy of each color, the probability that Mary also selects one red and one blue candy is simply 9/17. This is because it is equivalent to the probability of her second candy color not being the same as her first. Another way to view it is to see that she can either choose red, then blue, or blue, then red. Each occurs with a probability of

(9/18)(9/17) = 9/(2·17),

so their combined probability is twice this, or 9/17. Hence the combined probability that Terry and Mary each choose candies of both colors is

(10/19)(9/17) = 90/323.

Therefore, the probability Terry and Mary choose candies of the same type is

(28+90)/323 = 118/323,

and since 118 and 323 are relatively prime, the answer is 441.
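As a sanity check, the same probability can be obtained by brute force. The sketch below treats the 20 candies as distinguishable, enumerates every pair Terry could take and every pair Mary could take from the remainder, and counts the cases where the color combinations agree.

```python
from fractions import Fraction
from itertools import combinations

# Candies 0..9 are red, 10..19 are blue.
candies = ["R"] * 10 + ["B"] * 10

same = total = 0
for terry in combinations(range(20), 2):
    terry_colors = sorted(candies[i] for i in terry)
    rest = [i for i in range(20) if i not in terry]
    for mary in combinations(rest, 2):
        total += 1
        # Sorting makes the comparison independent of draw order.
        if sorted(candies[i] for i in mary) == terry_colors:
            same += 1

prob = Fraction(same, total)
print(prob, prob.numerator + prob.denominator)  # 118/323 441
```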

One can also work out the problem for the general case where one has n red candies and n blue candies. The desired probability is

$\displaystyle \frac{2 \binom{n}{2}\binom{n-2}{2} + n^2 (n-1)^2}{\binom{2n}{2}\binom{2n-2}{2}} = \frac{3n^2 - 7n + 6}{8n^2 - 16n + 6}.$
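A quick way to gain confidence in the simplification is to compare the two sides with exact arithmetic over a range of n. The function names below are illustrative only.

```python
from fractions import Fraction
from math import comb

def by_counting(n):
    """Left-hand side: favorable pairs of draws over all pairs of draws."""
    same_color = 2 * comb(n, 2) * comb(n - 2, 2)   # both pick RR, or both pick BB
    mixed = n**2 * (n - 1)**2                      # both pick one red and one blue
    return Fraction(same_color + mixed, comb(2 * n, 2) * comb(2 * n - 2, 2))

def closed_form(n):
    """Right-hand side: the simplified rational function."""
    return Fraction(3 * n**2 - 7 * n + 6, 8 * n**2 - 16 * n + 6)

# n must be at least 2 so that both players can draw two candies.
agree = all(by_counting(n) == closed_form(n) for n in range(2, 51))
print(agree, closed_form(10))  # True 118/323
```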

Can you generalize the question to the case where there are r red and b blue candies?

## Question 12, Spring 2007 Exam 4/C

May 27, 2007

Posted by Peter in Exam 4/C.

12. For 200 auto accident claims you are given:

1. Claims are submitted t months after the accident occurs, t = 0, 1, 2, ….
2. There are no censored observations.
3. $\hat{S}(t)$ is calculated using the Kaplan-Meier product limit estimator.
4. $\displaystyle c_S^2(t) = \frac{\widehat{\rm Var}[\hat{S}(t)]}{\hat{S}^2(t)}$, where $\widehat{\rm Var}[\hat{S}(t)]$ is calculated using Greenwood’s approximation.
5. $\hat{S}(8) = 0.22, \; \hat{S}(9) = 0.16, \; c_S^2(9) = 0.02625, \; c_S^2(10) = 0.04045$.

Determine the number of claims that were submitted to the company 10 months after an accident occurred.

Solution. There are two key observations we need to make. The first is that we are given the risk set at time t = 0, namely $r_0 = 200$. The second is that because no observations are censored, the Kaplan-Meier estimator of the survival function takes on a particularly simple form: in the absence of censoring, $r_{j+1} = r_j - s_j$; that is, the risk set at time $y_{j+1}$ is simply the risk set at time $y_j$ minus the $s_j$ claims submitted at time $y_j$. Therefore

$\displaystyle \hat{S}(y_n) = \prod_{j=0}^n \frac{r_j - s_j}{r_j} = \prod_{j=0}^n \frac{r_{j+1}}{r_j} = \frac{r_{n+1}}{r_0} = \frac{r_{n+1}}{200}.$

So with this in mind, we have $r_{10} = 200\hat{S}(9) = 32$. Recalling Greenwood’s approximation,

$\displaystyle c_S^2(t) = \frac{\widehat{\rm Var}[\hat{S}(t)]}{\hat{S}^2(t)} = \sum_{j=0}^t \frac{s_j}{r_j(r_j-s_j)},$

so $\displaystyle c_S^2(10) - c_S^2(9) = \frac{s_{10}}{r_{10}(r_{10}-s_{10})} = 0.0142.$

Substituting and solving, we obtain $s_{10} = 9.9978 \approx 10$.
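The "substituting and solving" step is a linear equation in $s_{10}$ once $r_{10}$ is known. A minimal numerical sketch, using only the values given in the question:

```python
# Risk set at t = 10: with no censoring, r_10 = 200 * S_hat(9).
r10 = 200 * 0.16

# Increment in the Greenwood sum between t = 9 and t = 10.
delta = 0.04045 - 0.02625

# delta = s / (r * (r - s))  =>  s * (1 + delta * r) = delta * r**2.
s10 = delta * r10**2 / (1 + delta * r10)

print(round(s10, 4), round(s10))  # 9.9978 10
```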

## Interpolation

May 27, 2007

Posted by Peter in Exam 3/MLC.

We use interpolation whenever we want to construct a continuous model from discrete data. Interpolation methods range from the rudimentary (linear interpolation) to the sophisticated (polynomial splines). Since splines are no longer on the 4/C syllabus, we’ll instead talk about forms of interpolation based on the relation

$\varphi(s(x+t)) = (1-t)\varphi(s(x)) + t\varphi(s(x+1)), \quad t \in [0,1].$

Here, s(x) represents the function to be interpolated, $\varphi$ is an interpolation assumption, and t is a parameter on [0,1]. The simplest instance of the above is when $\varphi$ is the identity function:

$s(x+t) = (1-t)s(x) + t s(x+1).$

This is called linear interpolation, and it is the basis of the uniform distribution of deaths (UDD) assumption in life contingencies, where s is the survival distribution and x is age. But we also use this relationship (albeit slightly modified) when interpolating values in, say, a normal distribution table:

$\Phi\left(x+\frac{t}{100}\right) = (1-t)\Phi(x) + t\Phi\left(x+\frac{1}{100}\right).$

This assumes that adjacent entries in the table are listed in increments of 0.01. For instance, suppose we want to find Φ(1.263). 1.263 is between 1.26 and 1.27, so we have

$\Phi(1.263) \approx (1-0.3)\Phi(1.26) + 0.3 \Phi(1.27),$

and looking up the values in the table, we get Φ(1.263) ≈ (0.7)(0.8962) + (0.3)(0.8980) = 0.89674.
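To get a feel for how good the table interpolation is, one can compare it with a direct evaluation of Φ. The sketch below uses the standard identity Φ(x) = ½(1 + erf(x/√2)); the function name `phi_exact` is just an illustrative label.

```python
from math import erf, sqrt

def phi_exact(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(x / sqrt(2)))

# Two adjacent four-decimal table entries, as quoted above.
table = {1.26: 0.8962, 1.27: 0.8980}

t = 0.3  # 1.263 lies 30% of the way from 1.26 to 1.27
approx = (1 - t) * table[1.26] + t * table[1.27]
error = abs(approx - phi_exact(1.263))

print(round(approx, 5))  # 0.89674
print(error < 1e-4)      # True
```

Most of the remaining error comes from the table entries being rounded to four decimal places rather than from the linear interpolation itself.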

In life contingencies, we are also sometimes interested in the constant force of mortality interpolation assumption; that is to say, deaths are not uniformly distributed at fractional ages, but rather, survival is exponentially distributed between integer ages. In this case, $\varphi(s) = \log s$ and the interpolation relation becomes

$\log s(x+t) = (1-t) \log s(x) + t \log s(x+1)$

or equivalently,

$s(x+t) = s(x)^{1-t} s(x+1)^t.$

To see that this indeed results in a constant force of mortality between integer ages, we differentiate the above with respect to t to obtain

$\displaystyle \mu_x(t) = -\frac{d}{dt}\left[\log s(x+t)\right] = \log s(x) - \log s(x+1) = \log \frac{s(x)}{s(x+1)} \ge 0,$

since s(x) ≥ s(x+1). Finally, there is the Balducci, or hyperbolic, interpolation assumption, where we set $\varphi(s) = 1/s$:

$\displaystyle \frac{1}{s(x+t)} = \frac{1-t}{s(x)} + \frac{t}{s(x+1)}.$

This model is so called because the survival function at fractional ages forms an arc of a hyperbola. In all cases, we can use the resulting relation on the survival function to derive the other life table functions $l_{x+t}, \,_t p_x, \,_t q_x, \mu_x(t)$, etc. But as we have seen, these three interpolation assumptions are not the only ones we can use, even in the very simple case of two-point interpolation.
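The three assumptions differ only in the choice of $\varphi$, so they can share one implementation. The following is a sketch under stated assumptions: the helper `interpolate` and the survival values s(x) = 1.0, s(x+1) = 0.9 are hypothetical, chosen only to illustrate the relation.

```python
import math

def interpolate(s0, s1, t, phi, phi_inv):
    """Solve phi(s(x+t)) = (1-t)*phi(s(x)) + t*phi(s(x+1)) for s(x+t)."""
    return phi_inv((1 - t) * phi(s0) + t * phi(s1))

# Each assumption is a pair (phi, inverse of phi).
ASSUMPTIONS = {
    "UDD": (lambda s: s, lambda y: y),
    "constant force": (math.log, math.exp),
    "Balducci": (lambda s: 1 / s, lambda y: 1 / y),
}

s0, s1, t = 1.0, 0.9, 0.5  # hypothetical survival values and fractional age
results = {
    name: interpolate(s0, s1, t, phi, phi_inv)
    for name, (phi, phi_inv) in ASSUMPTIONS.items()
}

for name, value in results.items():
    print(f"{name}: {value:.6f}")
```

At t = 1/2 the three results are the arithmetic, geometric, and harmonic means of s(x) and s(x+1), so the UDD value is the largest and the Balducci value the smallest.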

## Random Variables

May 26, 2007

Posted by Peter in Exam 1/P.

Suppose you have a standard, fair, 6-sided die. We are interested in the possible outcomes of rolling this die. Naturally, the outcomes of multiple rolls of the die are not predetermined or fixed, but rather are the result of a random process. And yet, “random” doesn’t mean that we don’t have any information about what the possible outcomes might be. We could ask any of the following questions of any given roll of the die:

1. What is the numerical outcome?
2. What is the square of the numerical outcome?
3. How many other possible outcomes are less than the value rolled?
4. What is the sum of the top and opposite faces?

Each of these questions corresponds to a distinct random variable on the probability space of the die roll. Loosely speaking, a random variable (or RV) is simply a function of the outcome of a random process.

Question 1. Suppose we roll a 5. What are the values of the random variables described in the above items 1-4?

Now suppose we have two fair 6-sided dice, and let X be the RV that denotes the sum of the rolled values. What is the probability that X > 8? Well, we have the possible outcomes {(3,6), (4,5), (4,6), (5,4), (5,5), (5,6), (6,3), (6,4), (6,5), (6,6)}, so there are 10 outcomes where X > 8. But there are 6(6) = 36 total outcomes, so the desired probability is Pr[X > 8] = 10/36 = 5/18. The idea behind this example is that we can construct an RV and compute an associated probability, because the RV has an associated probability distribution of possible values. For instance, we can compute

Pr[X = 2] = 1/36; Pr[X = 3] = 2/36; Pr[X = 4] = 3/36; ….

and in doing so, we have completely specified the probability distribution of X.
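The counting argument for Pr[X > 8] can be mirrored directly in code by enumerating the 36 equally likely outcomes. This is a minimal sketch; the helper `prob` is an illustrative name, and it takes any condition on the sum X.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes of rolling two fair 6-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability that the sum X of the two dice satisfies `event`."""
    favorable = sum(1 for roll in outcomes if event(sum(roll)))
    return Fraction(favorable, len(outcomes))

p_gt_8 = prob(lambda x: x > 8)
p_eq_2 = prob(lambda x: x == 2)

print(p_gt_8)  # 5/18
print(p_eq_2)  # 1/36
```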

Question 2. Complete the above list by computing Pr[X = n] for any real number n.

## Welcome

May 26, 2007

Posted by Peter in Uncategorized.