Why isn’t the monetary value of the first transaction used in the gamma-gamma spend model?

gamma distribution, intuition, marketing, python, r

I've been doing some customer value calculations using the Lifetimes library in Python. This library employs the approach found in Fader, Hardie, and Lee's paper on using iso-value curves for customer base analysis. This approach is also implemented in the BTYD (Buy 'Til You Die) package for R.

In this approach, a gamma-gamma spend model is used to estimate the average order value for each customer (gamma distribution) and the average of these average values across all customers (gamma distribution). As a customer makes more purchases, their expected spend moves closer to their observed average and further away from the average of all customers' averages. The derivation is discussed in Hardie's paper, and the intuition is discussed on page 20 of Fader, Hardie, and Lee.
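For reference, the shrinkage described above can be written out explicitly. This is a sketch in what I understand to be the notation of the gamma-gamma derivation: $p$, $q$, and $\gamma$ are the model parameters, $x$ is the number of transactions whose values are used, and $m_x$ is their observed average. The symbols should be checked against the paper before relying on them.

$$
E(M \mid p, q, \gamma; m_x, x)
= \frac{(\gamma + m_x x)\,p}{p x + q - 1}
= \frac{q - 1}{p x + q - 1} \cdot \frac{\gamma p}{q - 1}
+ \frac{p x}{p x + q - 1} \cdot m_x
$$

Here $\gamma p / (q - 1)$ is the mean transaction value across all customers (which requires $q > 1$). The weight on the customer's own average $m_x$ grows with $x$, and it is exactly zero when $x = 0$.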

In Fader, Hardie, and Lee (page 20), it is asserted that…

If no repeat purchases are observed… our best guess is that the individual's average transaction value in the future is simply the mean of the overall transaction value distribution

This means that the first purchase that an individual makes is not enough to learn anything about what their average spend might be, and it is totally ignored. In the formulas presented, a single-purchase customer's values are not used because their repeat purchase frequency is 0. It is unclear why the model could not be adjusted to use first purchases, thus including observations from the overwhelming number of single-purchase customers in most data sets. The paper does not intuitively explain why this information is not used. If the overall average is \$50 and a first-time customer spends \$5, it seems reasonable to assume that their average value may be lower than that of most customers, right?
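To make the \$50 versus \$5 example concrete, here is a minimal sketch of the gamma-gamma conditional expectation in plain Python. The parameter values are made up (chosen only so that the population mean comes out at 50, not fitted to any data), and the function name is hypothetical rather than part of Lifetimes or BTYD. The second call is not how the paper treats the data; it only shows what the same formula would return if the first purchase were counted as one observation.

```python
def expected_avg_spend(x, m_x, p, q, gamma):
    """Gamma-gamma conditional expectation of a customer's mean transaction value.

    x           : number of transactions whose values are used
                  (repeat purchases only, in the paper's setup)
    m_x         : observed average value of those transactions (no effect when x == 0)
    p, q, gamma : model parameters; q must exceed 1 for the population mean to exist
    """
    if q <= 1:
        raise ValueError("q must be greater than 1")
    population_mean = gamma * p / (q - 1)
    # Weight placed on the customer's own observed average; zero when x == 0.
    weight_on_observed = p * x / (p * x + q - 1)
    return (1 - weight_on_observed) * population_mean + weight_on_observed * m_x


# Illustrative (not fitted) parameters: population mean = 25 * 6 / (4 - 1) = 50
P, Q, GAMMA = 6.0, 4.0, 25.0

# As in the paper: no repeat purchases, so the $5 first purchase carries no weight.
no_repeat = expected_avg_spend(0, 5.0, P, Q, GAMMA)

# Hypothetical variant: treat the $5 first purchase as a single observation.
with_first = expected_avg_spend(1, 5.0, P, Q, GAMMA)

print(no_repeat, with_first)
```

With these made-up parameters, the first call returns the population mean of 50, while the hypothetical variant is pulled down to about 20. That gap is the information my question is asking about.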

Best Answer

I suspect the reason is primarily theoretical. There may be mathematical implications as well, but I think these are more pronounced in the BG/NBD model than in the gamma-gamma model. Consider that Fader and Hardie's research builds on work by Ehrenberg, who began by modeling repeat transactions. Technically it began with modeling customer buying patterns, but it takes multiple transactions to generate a pattern.

The language in the papers reflects this: the initial transaction is treated as a 'trial' rather than a repeat purchase and is considered to be of a different nature (see the initial statement of the problem in this paper: http://www.brucehardie.com/notes/006/creating_dor_summary_2004-05-04.pdf). Predicting initial customer acquisition (where success is defined as a trial purchase) is a different problem from predicting incremental transactions from already-acquired customers (where success is defined as an additional repeat transaction).