Wow...detailed question!! But very interesting :) Having some background in economics and finance, I will offer some ideas on this, but note that I am not a professional banker or underwriter, so please confer with your colleagues and the literature before using anything from this site (which I am sure you were going to do anyway).
I agree with you that your colleagues' approach seems confused, especially in the sense that its normalization on term length assumes that your risk/reward function is perfectly linear (i.e., twice the margin for twice the term length), which is probably not true since there is a qualitative, human element in setting term length and margin.
From your description, your problem is actually multi-dimensional, and hence cannot be so conveniently normalized to gauge performance. Below, I will offer some ideas that may help you better understand where you are mispricing risk.
First, some assumptions and definitions (please correct if I am wrong):
- Your firm is risk-averse, and so requires increasing profits from riskier investments. Note that there is no objective way to price risk, as there is always this subjective element to it. However, once you know your risk posture, you can make progress.
- Let's change the definition of margin ($m$) to be the profit margin, i.e., $m=\frac{\text{anticipated revenue}}{\text{purchase price}}$. This removes the effect of the loan amount from our evaluation.
- The expected margin ($p$) is the margin ($m$) adjusted for the default rate ($d$): $p=(1-d)m$
- The subjective risk, $r$, is a function of the specific client information $C$. For a given risk level $r$, the possible loan terms can be described by a function of the margin $m$ and term length $t$, hence: $r(C) = f(m,t)$. This is an implicit relationship, as both the left and right hand sides are the results of human evaluations, but formally this is what you are doing.
- The default rate is a function of the risk and the loan terms: $d=g(m,t,r)$.
- I am ignoring any internal discount factors you may be using to get the NPV of your loans. You will need to adjust your expected margins for the time value of money if you think this will be relevant.
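To make the expected-margin definition concrete, here is a minimal sketch in Python. The function name and the example numbers are hypothetical, chosen only to illustrate $p=(1-d)m$; they are not taken from your data.

```python
def expected_margin(margin: float, default_rate: float) -> float:
    """Return p = (1 - d) * m: the margin m scaled down by the default rate d."""
    if not 0.0 <= default_rate <= 1.0:
        raise ValueError("default_rate must lie in [0, 1]")
    return (1.0 - default_rate) * margin


# Hypothetical loan: margin of 1.30 with a 10% default rate.
p = expected_margin(1.30, 0.10)
print(round(p, 4))
```

Any discounting for the time value of money would have to be applied on top of this; the sketch ignores it, as the assumptions above do.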
OK, now let's see what we can do given the above:
The key to evaluating your performance will be to verify that you are acting risk-averse (i.e., consistent with your risk posture). To do this, you will need to do two things, one difficult, one relatively easy:
- Hard part: You will need to know what "risk category" or "risk level" your analysts assigned to each loan at the time of application (not ex post facto). If you already have such a system in place, then use those risk categories; if not, you will need to use the assigned margins and payback periods to infer the risk. A simple function that will do this is $r(m,t)=\frac{m}{t}$. This function assumes that if the loan periods are the same, then the one with the higher margin was perceived as riskier. Likewise, if both have the same margin but one has a longer period than the other, then the one with the longer period is assumed to have been perceived as less risky. The exact risk may be some power of this ratio or some multiple of it, but at least you will be correctly ordering your loans by perceived risk.
- Easy part: Calculate the actual margin, $\hat m = \frac{\text{actual revenue}}{\text{purchase price}}$, for each loan.
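The two steps above can be sketched as follows. The `Loan` record, its field names, and the three example loans are all hypothetical; the only things taken from the text are the risk proxy $r = m/t$ and the actual margin $\hat m$.

```python
from dataclasses import dataclass


@dataclass
class Loan:
    margin: float          # m: profit margin agreed at application
    term: float            # t: term length (e.g. in months)
    purchase_price: float  # amount advanced
    actual_revenue: float  # amount actually collected


def perceived_risk(loan: Loan) -> float:
    """Risk proxy r = m / t, used only to ORDER loans by perceived risk."""
    return loan.margin / loan.term


def actual_margin(loan: Loan) -> float:
    """Actual margin m_hat = actual revenue / purchase price."""
    return loan.actual_revenue / loan.purchase_price


# Hypothetical loans, for illustration only.
loans = [
    Loan(margin=1.2, term=12, purchase_price=1000, actual_revenue=1150),
    Loan(margin=1.4, term=12, purchase_price=1000, actual_revenue=1300),
    Loan(margin=1.2, term=24, purchase_price=1000, actual_revenue=1250),
]

# Lowest perceived risk first.
ranked = sorted(loans, key=perceived_risk)
for loan in ranked:
    print(round(perceived_risk(loan), 4), round(actual_margin(loan), 4))
```

Same term, higher margin ranks as riskier; same margin, longer term ranks as less risky, which is exactly the ordering described above.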
To get a measure of performance, you will want to perform a regression using your observed triples $x_i \equiv (r_i,t_i,\hat m_i)$, with $r_i$ and $t_i$ being the predictors and $\hat m_i$ being the response. Specifically, we will model $\hat m_i$ as follows:
$\hat m_i = \varepsilon (r_i t_i)^k$, where $k$ is an unknown parameter and $\varepsilon$ is a lognormal random variable on $(0,\infty)$ with logmean $\mu$ and logvariance $\sigma^2$ (both unknown).
I chose the lognormal for computational convenience. A full, albeit more complex, treatment would require generalized linear models, which I think may be too much for this application.
Taking the natural logarithm of both sides, we get the usual linear regression equation with normally distributed errors:
$\ln(\hat m_i) = \ln(\varepsilon)+k\ln(r_it_i) = \ln(\varepsilon)+ks_i$, where $s_i = \ln(r_it_i)$
You can now estimate $k$ by performing a simple linear regression with $\ln(\hat m_i)$ as the response and $s_i$ as the predictor.
The regression output (from Excel, Minitab, or whatever you use) should give you a confidence interval or standard error and degrees of freedom for the slope parameter $k$. You will want to test that $k>1$ vs $k\leq 1$. If $k>1$, it means that you are acting in a risk-averse manner and are hence properly pricing your risk.
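If you would rather script this than use a spreadsheet, here is a rough sketch of the log-log fit and the one-sided test of $k>1$ in plain Python. The data are synthetic, built with a known exponent of 1.5 so the result can be checked; the helper name `fit_slope` is my own. The p-value uses a normal approximation to the t distribution, which is an assumption that is only reasonable for large samples; with few loans, look up the t distribution with $n-2$ degrees of freedom instead.

```python
import math
from statistics import NormalDist


def fit_slope(s, y):
    """Simple linear regression of y on s.

    Returns (slope k, standard error of k, residual degrees of freedom).
    """
    n = len(s)
    s_bar = sum(s) / n
    y_bar = sum(y) / n
    sxx = sum((x - s_bar) ** 2 for x in s)
    sxy = sum((x - s_bar) * (v - y_bar) for x, v in zip(s, y))
    k = sxy / sxx
    intercept = y_bar - k * s_bar
    sse = sum((v - intercept - k * x) ** 2 for x, v in zip(s, y))
    dof = n - 2
    se_k = math.sqrt(sse / dof / sxx)
    return k, se_k, dof


# Synthetic triples (r_i, t_i, m_hat_i), generated with k = 1.5 (illustration only).
true_k = 1.5
terms = [6, 12, 18, 24, 36]
log_rt = [0.1, 0.2, 0.3, 0.4, 0.5]
noise = [0.01, -0.01, 0.0, -0.01, 0.01]
triples = [
    (math.exp(s) / t, t, math.exp(0.05 + true_k * s + e))
    for s, t, e in zip(log_rt, terms, noise)
]

s = [math.log(r * t) for r, t, _ in triples]      # predictor s_i = ln(r_i t_i)
y = [math.log(m_hat) for _, _, m_hat in triples]  # response ln(m_hat_i)

k, se_k, dof = fit_slope(s, y)
t_stat = (k - 1.0) / se_k                  # tests H0: k <= 1 against H1: k > 1
p_value = 1.0 - NormalDist().cdf(t_stat)   # normal approximation (assumption)
print(round(k, 4), round(se_k, 4), dof)
```

A large positive `t_stat` (small `p_value`) supports $k>1$.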
For a more detailed view, you can make a 3-D plot of your original triples to see if there are certain subsets where the $\hat m$ surface "slopes downward" significantly. You may be good at identifying very low and very high risks but are inconsistent in the middle risk ranges.
This CANNOT be done with only the means, standard deviations, correlation, and range. The range is not needed, but one other thing is needed that was mentioned in the first paragraph but not in the question in the last paragraph: the sample size, which is $250$.
The equation of the least-squares line is:
$$
\frac{y - 478}{107.2} = 0.58\left(\frac{x-424}{81.7}\right).
$$
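For convenience, the same line can be put into slope-intercept form $y = a + bx$. A small sketch, using only the summary statistics quoted above (the variable names are mine):

```python
# Summary statistics from the question.
mean_x, sd_x = 424.0, 81.7
mean_y, sd_y = 478.0, 107.2
r = 0.58

# Slope is r * (sd_y / sd_x); the line passes through (mean_x, mean_y).
slope = r * sd_y / sd_x
intercept = mean_y - slope * mean_x
print(round(slope, 4), round(intercept, 2))
```

This gives a slope of roughly $0.761$ and an intercept of roughly $155.3$.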
The sum of squares of deviations from the average $y$-value is $249\cdot 107.2^2$. The proportion of this that is explained by variability in $x$-values is $0.58^2$, so the explained sum of squares is $0.58^2\cdot249\cdot 107.2^2\approx 962597.889$. The unexplained sum of squares, which is the sum of squares of residuals, is $(1-0.58^2)\cdot249\cdot 107.2^2\approx 1898870.271$. So we have this ANOVA table:
$$
\begin{array}{c|c|c|c|c|c}
\text{source of variability} & \text{sum of squares} & \text{degrees of freedom} & \text{mean square} & F & p \\
\hline
\text{first test} & 962597.889 & 1 & 962597.889 & 125.719 & \approx 0 \\
\text{error} & 1898870.271 & 248 & 7656.735 & &
\end{array}
$$
Finding the value of $p$ is generally the only part that exceeds elementary arithmetic. But in this case the $F$ value is so extreme that it's trivial to say that for all practical purposes, $p=0$.
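The arithmetic behind the table can be reproduced in a few lines of Python. Everything here comes from the summary statistics above except the p-value step: since the numerator has 1 degree of freedom, $F=t^2$, and I approximate the t distribution with 248 degrees of freedom by a standard normal. That approximation is my assumption, used only to show how small $p$ is.

```python
import math

n = 250       # sample size
sd_y = 107.2  # standard deviation of y
r = 0.58      # correlation between x and y

ss_total = (n - 1) * sd_y ** 2         # total sum of squares
ss_regression = r ** 2 * ss_total      # explained sum of squares
ss_error = (1 - r ** 2) * ss_total     # residual sum of squares

df_regression, df_error = 1, n - 2
ms_regression = ss_regression / df_regression
ms_error = ss_error / df_error
f_stat = ms_regression / ms_error

# With 1 numerator df, F = t^2. Two-sided normal tail as a stand-in for
# the t distribution with 248 df (an approximation, not the exact p-value).
t_stat = math.sqrt(f_stat)
p_approx = math.erfc(t_stat / math.sqrt(2))

print(round(ss_regression, 3), round(ss_error, 3))
print(round(ms_error, 3), round(f_stat, 3), p_approx)
```

The approximate p-value is many orders of magnitude below any conventional significance level.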
The z-score is the standardisation that you should plot. Full stop. (And you have the correct formula for the z-score.) The z-scores will usually range from about -3 to +3, and you can then plot both z-score distributions on the same graph, where each is centred at z=0. You mention you want to plot on a 0-10 scale. What do you mean by this?
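As a minimal sketch of the standardisation, here is how two data sets on very different scales end up on a common z-scale. The two score lists are made up for illustration, and I am assuming the sample standard deviation (`statistics.stdev`); use `pstdev` if your formula divides by $n$ instead.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardise values to mean 0 and (sample) standard deviation 1."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


# Hypothetical raw scores on two different scales.
scores_a = [52, 60, 71, 45, 66]
scores_b = [410, 380, 455, 500, 430]

z_a = z_scores(scores_a)
z_b = z_scores(scores_b)
print([round(z, 2) for z in z_a])
print([round(z, 2) for z in z_b])
```

Both lists are now centred at 0 with unit spread, so they can share one axis.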