Actuarial practice exam: Find number of exponential data values above threshold given MLE.

actuarial-science, probability, statistics

I was going through an actuarial practice exam and came across a problem that I can't figure out. Here is the problem as stated on the exam:


You are given:

$\bullet$ An insurance product with a per loss limit of 200 covers losses from an exponential distribution with parameter $\theta$.

$\bullet$ Based on the following table, the maximum likelihood estimate of $\theta$ is 168.

$$\begin{array}{|c | c | c |}\hline
\text{Size of loss} & \text{Number of claims} & \text{Sum of losses} \\ \hline
\text{Less than } 200 & 1,114 & 142,752\\ \hline
\text{At least } 200 & N & 200N \\ \hline
\end{array}$$

Calculate $N$ for the table above.


The answer is:

$N$ is less than 250.


My question: How can we know for sure that $N$ is less than 250? $N$ can be thought of as a random variable, can't it? There could be a single loss above 200 that is so large that the overall average is still 168, yes?


It seems that we are supposed to assume, by the memoryless property, that each loss above 200 exceeds 200 by an exponential amount with mean 168, so that it averages 368. Then $$\frac{142752+368N}{1114+N}=168$$
gives $N=222$, which is consistent with the official answer.

But with the given information, it could be that the first 1,114 data points are all about 128.14 and that there is a single data point of 44,568. Of course this is an event of negligible probability, but the probability is nonetheless positive.

What am I missing?

Best Answer

The point is that, because of the per loss limit, you do not observe the values above $200$, only how many of these censored values there are. What those high values would have been had they not been censored could affect the likelihood, but you do not know them.

Given $\theta$, the probability density for an observation $x_i$ below $200$ is $\frac{1}{\theta} e^{-x_i/\theta}$, while the probability (not density) that an observation $x_j$ is censored at $200$ is $e^{-200/\theta}$.

That makes the likelihood for $\theta$ from the observations below $200$ proportional to $\frac{1}{\theta^{1114}} e^{-142752/\theta}$ and the likelihood from the observations above $200$ proportional to $e^{-200N/\theta}$. Multiply these together and you get an overall likelihood proportional to $\frac{1}{\theta^{1114}} e^{-(142752+200N)/\theta}$. That likelihood is maximised when $\theta = \frac{142752+200N}{1114}$.
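To spell out the maximisation step, one way to write it is with the log-likelihood $\ell(\theta)$ (this symbol is introduced here only for the derivation):

$$\begin{aligned}
\ell(\theta) &= -1114\ln\theta - \frac{142752+200N}{\theta} + \text{const},\\
\ell'(\theta) &= -\frac{1114}{\theta} + \frac{142752+200N}{\theta^{2}} = 0
\quad\Longrightarrow\quad
\hat{\theta} = \frac{142752+200N}{1114}.
\end{aligned}$$

So the estimate is the total of the observed (capped) losses divided by the number of uncensored claims only.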

We are told that $\hat{\theta}= 168$, which gives $N = \frac{1114 \times 168 - 142752}{200} = 222$.
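As a quick numerical check of the arithmetic, here is a minimal Python sketch. The function names are hypothetical and chosen only for illustration; the formulas are the ones derived above for a right-censored exponential sample.

```python
import math


def censored_exponential_mle(n_uncensored, sum_uncensored, n_censored, limit):
    """MLE of theta: total capped loss divided by the number of uncensored claims."""
    return (sum_uncensored + limit * n_censored) / n_uncensored


def solve_n_censored(theta_hat, n_uncensored, sum_uncensored, limit):
    """Invert the MLE formula to recover the number of censored claims."""
    return (n_uncensored * theta_hat - sum_uncensored) / limit


def log_likelihood(theta, n_uncensored, sum_uncensored, n_censored, limit):
    """Log-likelihood up to an additive constant."""
    return (-n_uncensored * math.log(theta)
            - (sum_uncensored + limit * n_censored) / theta)


# Figures taken from the table in the question.
n = solve_n_censored(168, 1114, 142752, 200)
print(n)  # 222.0
print(censored_exponential_mle(1114, 142752, 222, 200))  # 168.0
```

Evaluating `log_likelihood` at values of `theta` slightly below and above 168 (with `n_censored=222`) gives smaller values than at 168, which is consistent with 168 being the maximiser.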
