Regression Metrics – Best Metrics for Cases with Very Large Values and Zero Values

customer-lifetime-value · model-evaluation · regression

I'm working on a CLTV problem where the objective is to predict customers' future spending given their past behaviour. Following arXiv:1912.07753, Section 4 (EVALUATION METRICS), I'm measuring calibration (the difference between actual and predicted values) and discrimination (the ranking of users by CLV).

I'm having a hard time finding a good calibration metric, because the values can be extremely large for some customers (making squared-error-based metrics such as R-squared and MSE less meaningful) or exactly zero (making percentage-based metrics such as the MAPE impossible to compute).
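To make both failure modes concrete, here is a small numpy sketch with made-up toy values (the customer amounts are purely illustrative, not from any real dataset):

```python
import numpy as np

y_true = np.array([0.0, 5.0, 12.0, 10_000.0])  # a zero spender and one very large spender
y_pred = np.array([1.0, 6.0, 10.0, 9_000.0])

# MAPE divides by the actual value, so any exact zero produces an infinite term
with np.errstate(divide="ignore"):
    ape = np.abs(y_true - y_pred) / y_true
print(ape)  # first entry is inf

# Squared errors are dominated by the single large customer
sq = (y_true - y_pred) ** 2
print(sq / sq.sum())  # nearly all of the weight sits on the last entry
```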

The only good metric I could find is the MAE, but it is scale-dependent and therefore very dataset-specific, so it doesn't allow comparison with results on other datasets.

What metrics would you recommend for regression problems with 0 and very large values?

Best Answer

It depends on what functional of the future distribution you want to elicit.

Put differently, future outcomes follow some probability distribution (which, judging from your description, may be heavy-tailed and/or zero-inflated), and the point forecast you want to evaluate is a one-number summary of this distribution. This holds even if you never look at the distribution explicitly: it is always there, lurking under the surface.

The issue is that different error measures elicit different one-number summaries of the underlying distribution. The MSE is minimized in expectation by the expectation (mean) of the distribution. The MAE is minimized by its median. (That the MSE is more strongly influenced by the tail of the distribution than the MAE is just another way of saying that the mean is more strongly influenced by the tail than the median.) A quantile loss is optimized by the corresponding quantile.
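You can see this elicitation effect numerically. The sketch below simulates a zero-inflated, heavy-tailed "future spend" (the 70% zero rate and lognormal parameters are arbitrary choices for illustration) and searches over constant point forecasts: the MSE-optimal constant lands near the mean, while the MAE-optimal constant lands on the median, which here is zero.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy "future spend": 70% of customers spend nothing, the rest are heavy-tailed
n = 50_000
y = np.where(rng.random(n) < 0.7, 0.0,
             rng.lognormal(mean=3.0, sigma=1.5, size=n))

# Search over constant point forecasts c and see which one each loss prefers
grid = np.linspace(0.0, np.quantile(y, 0.999), 1000)
mse_opt = grid[np.argmin([np.mean((y - c) ** 2) for c in grid])]
mae_opt = grid[np.argmin([np.mean(np.abs(y - c)) for c in grid])]

print("mean(y)   =", y.mean(), "   MSE-optimal constant =", mse_opt)
print("median(y) =", np.median(y), "   MAE-optimal constant =", mae_opt)
```

Note that a forecast of exactly zero for every customer "wins" under the MAE here, which is rarely what a CLTV model is supposed to deliver.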

One consequence is that different point forecasts will be optimal under different error measures. Another is that your OLS regression likely optimizes the MSE as its objective function, so it does not make much sense to evaluate forecasts from an OLS model using the MAPE. (The MAE makes sense if you believe your errors are symmetric, which again does not seem to be the case here.)
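If a tail quantile is the functional you care about (say, flagging likely high spenders), the matching error measure is the pinball (quantile) loss. A minimal sketch, again with arbitrary lognormal parameters, verifying that the pinball loss at level tau is minimized near the sample tau-quantile:

```python
import numpy as np

def pinball(y, c, tau):
    """Pinball (quantile) loss of a constant forecast c at level tau."""
    d = y - c
    return np.mean(np.maximum(tau * d, (tau - 1.0) * d))

rng = np.random.default_rng(1)
y = rng.lognormal(mean=3.0, sigma=1.5, size=50_000)  # heavy-tailed toy spend

tau = 0.9
grid = np.linspace(0.0, np.quantile(y, 0.999), 2000)
best = grid[np.argmin([pinball(y, c, tau) for c in grid])]

print("pinball-optimal constant:", best)
print("sample 0.9 quantile:     ", np.quantile(y, tau))  # the two should be close
```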

So the question should first be what functional you are interested in, and only after you have given this some thought should you pick an appropriate error measure. Which functional solves your problem, in turn, depends on what you want to do with the point forecast afterwards.

More information can be found at What are the shortcomings of the Mean Absolute Percentage Error (MAPE)?, at Why use a certain measure of forecast error (e.g. MAD) as opposed to another (e.g. MSE)? and in Kolassa (2020).
