Wow...detailed question!! But very interesting :) Having some background in economics and finance, I will offer some ideas on this, but note that I am not a professional banker or underwriter, so please confer with your colleagues and the literature before using anything from this site (which I am sure you were going to do anyway).
I agree with you that your colleagues' approach seems confused, especially in the sense that its normalization on term length assumes that your risk/reward function is perfectly linear (i.e., twice the margin for twice the term length), which is probably not true since there is a qualitative, human element in setting term length and margin.
From your description, your problem is actually multi-dimensional, and hence cannot be so conveniently normalized to gauge performance. Below, I will offer some ideas that may help you better understand where you are mispricing risk.
First, some assumptions and definitions (please correct if I am wrong):
- Your firm is risk-averse, and so requires increasing profits from riskier investments. Note that there is no objective way to price risk, as there is always this subjective element to it. However, once you know your risk posture, you can make progress.
- Let's change the definition of margin ($m$) to be the profit margin, i.e., $m=\frac{\text{anticipated revenue}}{\text{purchase price}}$. This removes the effect of the loan amount from our evaluation.
- The expected margin ($p$) is the margin ($m$) adjusted for the default rate ($d$): $p=(1-d)m$
- The subjective risk, $r$, is a function of the specific client information $C$. For a given risk level $r$, the possible loan terms can be described by a function of the margin $m$ and term length $t$, hence: $r(C) = f(m,t)$. This is an implicit relationship, as both the left- and right-hand sides are the results of human evaluations, but formally this is what you are doing.
- The default rate is a function of the risk and the loan terms: $d=g(m,t,r)$.
- I am ignoring any internal discount factors you may be using to get the NPV of your loans. You will need to adjust your expected margins for the time value of money if you think this will be relevant.
OK, now let's see what we can do given the above:
The key to evaluating your performance will be to verify that you are acting risk-averse (i.e., consistent with your risk posture). To do this, you will need to do two things, one difficult, one relatively easy:
- Hard part: You will need to know what "risk category" or "risk level" your analysts assigned to each loan at the time of application (not ex post facto). If you already have such a system in place, then use those risk categories; if not, you will need to use the assigned margins and payback periods to infer the risk. A simple function that will do this is $r(m,t)=\frac{m}{t}$. This function assumes that, of two loans with the same period, the one with the higher margin was perceived as riskier. Likewise, of two loans with the same margin, the one with the longer period is assumed to be less risky. The exact risk may be some power of this ratio or some multiple of it, but at least you will be correctly ordering your loans by perceived risk.
- Easy part: Calculate the actual margin, $\hat m = \frac{\text{actual revenue}}{\text{purchase price}}$, for each loan.
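As a rough illustration of these two steps, here is a minimal R sketch. The data frame, its column names, and the numbers in it are all made up for the example, and the term is assumed to be measured in months; substitute your own loan records.

```r
# Hypothetical loan records: every name and value here is an illustrative assumption.
loans <- data.frame(
  purchase_price      = c(10000, 20000, 15000),
  anticipated_revenue = c(12000, 26000, 16500),
  actual_revenue      = c(11800, 21000, 16500),
  term_months         = c(6, 12, 6)
)

# Anticipated margin m and actual margin m_hat, both relative to the purchase price.
loans$m     <- loans$anticipated_revenue / loans$purchase_price
loans$m_hat <- loans$actual_revenue / loans$purchase_price

# Inferred risk r(m, t) = m / t, used only when no risk category was recorded
# at application time.
loans$r <- loans$m / loans$term_months

# Loans ordered from lowest to highest perceived risk.
loans[order(loans$r), c("m", "term_months", "r", "m_hat")]
```

Only the ordering produced by `r` is meaningful here, since the true risk may be a power or multiple of the ratio.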
To get a measure of performance, you will want to perform a regression using your observed triples $x_i \equiv (r_i,t_i,\hat m_i)$, with $r_i$ and $t_i$ being the predictors and $\hat m_i$ being the response. Specifically, we will model $\hat m_i$ as follows:
$\hat m_i = \varepsilon (r_i t_i)^k$, where $k$ is an unknown parameter and $\varepsilon$ is a lognormal random variable on $[0,\infty)$ with logmean $\mu$ and logvariance $\sigma^2$ (both unknown).
I chose the lognormal for computational convenience. A full, albeit more complex, treatment would require generalized linear models, which I think may be too much for this application.
Taking the natural logarithm of both sides, we get the usual linear regression equation with normally distributed errors:
$\ln(\hat m_i) = \ln(\varepsilon)+k\ln(r_it_i) = \ln(\varepsilon)+ks_i$, where $s_i = \ln(r_it_i)$
You can now estimate $k$ by performing a simple linear regression with $\ln(\hat m_i)$ as the response and $s_i$ as the predictor.
The regression output (from Excel, Minitab, or whatever you use) should give you a confidence interval or standard error and degrees of freedom for the slope parameter $k$. You will want to test $k>1$ against $k\leq 1$. If $k>1$, it means that you are acting in a risk-averse manner and are hence properly pricing your risk.
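If you happen to use R, the log-log regression and the one-sided test on $k$ might look like the sketch below. The data are simulated stand-ins (the sample size, the "true" exponent of 1.2, and the error spread are assumptions chosen only so the example runs); replace `r_i`, `t_i`, and `m_hat` with your observed values.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Simulated stand-in data (an assumption, not real loans).
n_loans <- 200
t_i     <- sample(c(3, 6, 12, 24), n_loans, replace = TRUE)  # term lengths
r_i     <- runif(n_loans, 0.05, 0.5)                         # assigned or inferred risk
k_true  <- 1.2                                               # hypothetical exponent
eps     <- rlnorm(n_loans, meanlog = 0, sdlog = 0.1)         # lognormal error term
m_hat   <- eps * (r_i * t_i)^k_true                          # actual margins

# Simple linear regression of ln(m_hat) on s_i = ln(r_i * t_i).
s_i <- log(r_i * t_i)
fit <- lm(log(m_hat) ~ s_i)

# Slope estimate and its standard error from the coefficient table.
est   <- coef(summary(fit))["s_i", ]
k_hat <- est["Estimate"]
se_k  <- est["Std. Error"]

# One-sided t test of H0: k <= 1 against H1: k > 1.
t_stat  <- (k_hat - 1) / se_k
p_value <- pt(t_stat, df = df.residual(fit), lower.tail = FALSE)

c(k_hat = unname(k_hat), p_value = unname(p_value))
```

A small `p_value` is evidence for $k>1$. The intercept of the fit estimates the logmean $\mu$ of the error term.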
For a more detailed view, you can make a 3-D plot of your original triples to see if there are certain subsets where the $\hat m$ surface "slopes downward" significantly. You may be good at identifying very low and very high risks but are inconsistent in the middle risk ranges.
Best Answer
(a) Let $n = 1000$ be the number of loans in a year, and $X$ be the number of foreclosures in a year, assuming a foreclosure rate of $r = 2\% = .02$. The total loss is $L = \$120,000\,X.$
In terms of probabilities, $X \sim Binom(n, .02)$. The expected value $E(L)$ is the average loss per year over many years. We have $$E(L) = 120,000E(X) = 120,000(n)(r) = 120,000(1000)(.02) = 2,400,000$$ dollars.
The way I read your question, you are asked to simulate the loss for one year. If you repeat this several times, you will get a great variety of answers, because the standard deviation of $L$ is quite large. (You give no clue how much you know about probability. Can you find $SD(L) \approx 531,263\,?$)
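For reference, the standard deviation follows from the binomial variance $Var(X) = nr(1-r)$ and the fact that $L$ is a constant multiple of $X$. A short worked version, using only the quantities defined above:

$$SD(X) = \sqrt{n\,r\,(1-r)} = \sqrt{1000(.02)(.98)} = \sqrt{19.6} \approx 4.427, \qquad SD(L) = 120,000\,SD(X) \approx 531,263$$

dollars, which is more than a fifth of the expected loss $E(L) = 2,400,000$.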
Simulating a binomial random variable. I think the most convenient way to simulate one realization of $X$ in R is to use the statement `rbinom(1, n, r)`. So here are simulated (and quite different) losses for three successive years. Try several runs for yourself: most of your numbers of foreclosures will likely be between 10 and 30.
I used the code below to simulate the number $X$ of foreclosures per year over an imaginary 10000-year period, and make a histogram of the results.
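A minimal sketch of such a simulation follows. The seed and variable names are assumptions made for the example, and your own numbers will differ with a different seed or with none.

```r
set.seed(2024)  # arbitrary seed, only so that reruns give the same numbers

n <- 1000            # loans per year
r <- 0.02            # foreclosure rate
loss_each <- 120000  # loss per foreclosure, in dollars

# Three successive years: rbinom(3, n, r) draws three independent values of X.
x3    <- rbinom(3, n, r)
loss3 <- loss_each * x3
loss3

# An imaginary 10000-year period, with a histogram of yearly foreclosure counts.
x <- rbinom(10000, n, r)
hist(x, breaks = seq(min(x) - 0.5, max(x) + 0.5, by = 1),
     main = "Simulated foreclosures per year", xlab = "Number of foreclosures")

# Fraction of simulated years with between 10 and 30 foreclosures.
mean(x >= 10 & x <= 30)
```

The bin breaks are placed at half-integers so that each integer count gets its own bar.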
Simulating with random sampling. However, you have been asked to use the `sample` function. One way to do this is to use `sample` with parameter `prob`, which gives the probabilities of foreclosure and no foreclosure. We use 0 to stand for no foreclosure and 1 to stand for a foreclosure. We use the parameter `rep=T` because 0's and 1's can be used repeatedly (sampling with replacement). Then `sample` gives a vector of length 1000, typically with lots of 0's and a few 1's, and `sum` counts how many foreclosures. Again here, you will get a great variety of different answers if you run the code below several times.
This code using `sample` and `sum` is not as convenient as the code with `rbinom`. Maybe you were asked to do the exercise with `sample` as a preliminary to learning about binomial random variables.
I hope this is enough to get you started on the entire problem. If not, please edit your Question or leave a Comment showing what you have tried. This is not a 'homework answer' site. Once you show some additional participation, maybe I or someone else can help you with the next step.
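The `sample`-based approach described above can be sketched roughly as follows. The seed and variable names are assumptions for the example, and `replace = TRUE` is the spelled-out form of `rep=T`.

```r
set.seed(7)  # arbitrary seed, only for reproducibility

n <- 1000  # loans per year
r <- 0.02  # foreclosure rate

# One year of loans: 0 = no foreclosure, 1 = foreclosure.
# The prob vector is listed in the same order as the values c(0, 1).
outcomes <- sample(c(0, 1), n, replace = TRUE, prob = c(1 - r, r))

x    <- sum(outcomes)  # number of foreclosures this year
loss <- 120000 * x     # total loss for the year, in dollars
c(foreclosures = x, loss = loss)
```

Removing the `set.seed` line and rerunning shows how much the yearly loss varies.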
Note. There are several reasons for learning to use simulation in such probability problems. Here are two: (a) Not all actuarial problems are as simple as the one posed here, and it may be difficult to get an analytical solution. (b) After you have an analytical solution and you're 'sure' it is right, it may be a good idea to go through the thought process one more time to program a simulation, and thus to check the validity of the analytic solution. This is an especially good idea where losses have as many 0's in them as in the situation described here.
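As one possible illustration of point (b), the sketch below compares simulated values of the mean and standard deviation of $L$ with the analytic binomial formulas. The number of simulated years (100000) and the seed are arbitrary assumptions.

```r
set.seed(11)  # arbitrary seed, only for reproducibility

n <- 1000            # loans per year
r <- 0.02            # foreclosure rate
loss_each <- 120000  # loss per foreclosure, in dollars

# Simulated yearly losses over many imaginary years.
L <- loss_each * rbinom(100000, n, r)

# Analytic values from the binomial mean n*r and variance n*r*(1-r).
analytic_mean <- loss_each * n * r
analytic_sd   <- loss_each * sqrt(n * r * (1 - r))

# Simulated and analytic values side by side; they should agree closely.
rbind(mean = c(simulated = mean(L), analytic = analytic_mean),
      sd   = c(simulated = sd(L),   analytic = analytic_sd))
```

If the two columns disagree by more than simulation noise, either the formula or the program deserves a second look.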