Machine Learning Clustering – Why Use Avg Monetary Value Instead of Total in RFM Customer Segmentation

I am trying to segment our customers based on their purchase data, and I came across the RFM technique (Recency, Frequency, and Monetary) through these posts here, here etc.

Recency – How recently they made a purchase

Frequency – How many times they made a purchase

Monetary – How much revenue the company got from that customer
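To make the three definitions concrete, here is a minimal sketch that derives them from a transaction log. The `transactions` list, the `rfm_table` helper, and the `as_of` reference date are all hypothetical names invented for illustration, and Monetary is taken as the raw total here.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction log: (customer_id, purchase_date, amount)
transactions = [
    ("A", date(2023, 1, 10), 100.0),
    ("A", date(2023, 3, 5), 50.0),
    ("B", date(2023, 3, 20), 300.0),
]

def rfm_table(transactions, as_of):
    """Return {customer_id: (recency_days, frequency, total_monetary)}."""
    by_customer = defaultdict(list)
    for customer_id, purchase_date, amount in transactions:
        by_customer[customer_id].append((purchase_date, amount))

    table = {}
    for customer_id, rows in by_customer.items():
        last_purchase = max(d for d, _ in rows)
        recency_days = (as_of - last_purchase).days  # Recency: days since last purchase
        frequency = len(rows)                        # Frequency: number of purchases
        total = sum(a for _, a in rows)              # Monetary: total revenue
        table[customer_id] = (recency_days, frequency, total)
    return table

rfm = rfm_table(transactions, as_of=date(2023, 4, 1))
for customer_id, values in sorted(rfm.items()):
    print(customer_id, values)
```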

I also came across a python package called lifetimes here

While I understand the idea of RFM, I am confused as to why they consider the average revenue of a customer (from all his/her transactions) instead of the total revenue for all his/her transactions.

For ex: If a new customer places 2 huge orders for 100K and 200K, then he contributed 300K to the company and could be classified as "New but promising" or "New but Heavy spender" etc.

But doesn't taking the average put everyone on the same scale? Then monetary value stops being a useful metric for segmenting customers, and we would have to use only Recency and Frequency (because they have raw values).

Is there any reason why you think average revenue is better than Total revenue?

Best Answer

The three RFM metrics are selected since they are often uncorrelated with each other, but you are free to redefine or supplement them as you choose. Average revenue is a different metric from total revenue, although the two are correlated, so it might make sense to create a flag in your analysis for 'new heavy spenders'. While total revenue can often be expressed as average revenue * frequency, that is not always true if frequency is measured by individual items, which are often grouped together in an invoice. Sometimes businesses also include the number of items as a separate variable.
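As a rough illustration of how the two metrics differ, the sketch below reuses the question's 100K and 200K orders and compares that customer with a made-up regular customer who reaches the same total through many small orders. The `orders` data and the `monetary_summary` helper are assumptions for illustration only.

```python
# Hypothetical order amounts per customer (one entry per invoice)
orders = {
    "new_big_spender": [100_000, 200_000],  # the two large orders from the question
    "steady_regular": [10_000] * 30,        # made-up: 30 small orders
}

def monetary_summary(amounts):
    """Return (frequency, total, average) for a list of invoice amounts."""
    frequency = len(amounts)
    total = sum(amounts)
    average = total / frequency
    return frequency, total, average

summary = {name: monetary_summary(amounts) for name, amounts in orders.items()}
for name, (frequency, total, average) in summary.items():
    print(f"{name}: frequency={frequency}, total={total}, average={average}")
```

Both customers have the same total, so total revenue alone cannot tell them apart, while the average can. The average also adds information that Frequency does not already carry. Note that total = average * frequency holds here only because frequency counts invoices; if it counted individual items, the identity would no longer hold.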

All of these metrics are better off being computed within a timeframe, let's say the last year or two, since you don't want to give equal weight to customers who spent heavily a long time ago but aren't spending anything now. So, if you are averaging, consider an exponentially smoothed average, which gives greater weight to the most recent spending. On the other hand, if you have a 'Win Back' program, that is, one trying to recapture higher-spend (or frequent) customers who bought in the past but may now be inactive, you could give greater weight to the older transactions.
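One possible way to combine the time window with recency weighting is an exponentially decayed weighted average. This is only a sketch of one such scheme, not a standard RFM formula: the `recency_weighted_average` function, the 180-day half-life, and the 730-day window are all assumptions chosen for illustration.

```python
from datetime import date, timedelta

def recency_weighted_average(purchases, as_of, half_life_days=180.0, window_days=730):
    """Average purchase amount, where a purchase's weight halves every half_life_days.

    purchases: iterable of (purchase_date, amount).
    Purchases older than window_days (or dated after as_of) are ignored.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for purchase_date, amount in purchases:
        age_days = (as_of - purchase_date).days
        if age_days < 0 or age_days > window_days:
            continue  # outside the analysis window
        weight = 0.5 ** (age_days / half_life_days)  # exponential decay with age
        weighted_sum += weight * amount
        weight_total += weight
    return weighted_sum / weight_total if weight_total else 0.0

as_of = date(2023, 7, 1)
purchases = [
    (as_of, 100.0),                          # today: weight 1.0
    (as_of - timedelta(days=180), 400.0),    # one half-life ago: weight 0.5
    (as_of - timedelta(days=1000), 9999.0),  # outside the window: ignored
]
weighted = recency_weighted_average(purchases, as_of)
print(weighted)
```

With these inputs the recent 100.0 purchase counts twice as much as the older 400.0 one, so the result falls below the plain average of the two in-window purchases (250.0).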
