Exponential Distribution – Calculating Average Days Between Customer Orders

exponential distribution, marketing

I have detailed sales data including customer number, SKU, product line, and order date. For marketing purposes, I would like to know the average days between orders both on a per-customer, per product-line basis, as well as on per product-line basis across our entire customer base.

The application is to send promotional materials for target product lines (or SKUs, or some other product attribute) ahead of the customer's expected order date.

My initial intuition was to take the difference in days between consecutive orders (per customer, per product line), sum the differences, and divide by the number of orders minus 1 (i.e., the number of differences). I also wanted to get an idea of the spread of the data by calculating the standard deviation, and that's where I got stuck.

Since this is frequency data, it's non-normal, and to the best of my understanding, it falls under an exponential distribution. I've read some resources online, but in those cases, the mean $\mu$ is always given or calculated from a given $\lambda$, not calculated from a data set.

Is my approach of simply calculating the arithmetic mean sound? And if so, is that result also my standard deviation?

Any help would be much appreciated.

Best Answer

Calculating the mean is sound in this sense: it tells you the mean.
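As a minimal sketch of that calculation, assuming the sales data can be reduced to `(customer, product_line, order_date)` tuples (the function name `gap_stats` and the sample rows are hypothetical, not taken from the question), the per-customer, per-product-line mean and sample standard deviation of the gaps could be computed like this:

```python
from collections import defaultdict
from datetime import date
from statistics import mean, stdev

def gap_stats(orders):
    """orders: iterable of (customer, product_line, order_date) tuples.

    Returns {(customer, product_line): (mean_gap_days, sd_gap_days, n_gaps)}.
    """
    dates_by_key = defaultdict(list)
    for customer, product_line, order_date in orders:
        dates_by_key[(customer, product_line)].append(order_date)

    stats = {}
    for key, dates in dates_by_key.items():
        dates.sort()
        # Days between each pair of consecutive orders.
        gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
        if not gaps:
            continue  # a single order gives no interval to measure
        # The sample standard deviation needs at least two gaps.
        sd = stdev(gaps) if len(gaps) > 1 else None
        stats[key] = (mean(gaps), sd, len(gaps))
    return stats

# Hypothetical sample data for illustration only.
orders = [
    ("C1", "paper", date(2024, 1, 1)),
    ("C1", "paper", date(2024, 1, 11)),
    ("C1", "paper", date(2024, 1, 31)),
    ("C1", "ink", date(2024, 2, 1)),
]
stats = gap_stats(orders)
```

Because the gaps telescope, the mean gap for one customer and product line is just the days between the first and last order divided by the number of gaps. For a product-line figure across all customers, one option is to pool the individual gaps before averaging, so that customers with more orders contribute more intervals.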

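For an exponential distribution the standard deviation equals the mean (both are $1/\lambda$), so the mean is your standard deviation only if the data really are exponential. Comparing the sample mean with the sample standard deviation is therefore a quick probe of whether you have an exponential distribution: if the two are far apart, the exponential model fits poorly (my purchases of film views on Netflix are probably exponential, but my purchases of toilet paper are not).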
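This comparison can be reduced to a single number, the coefficient of variation (standard deviation divided by mean), which is 1 for an exact exponential distribution. The sketch below uses made-up data: a simulated exponential sample and a hypothetical, very regular reorder pattern. The function name `coefficient_of_variation` is an illustrative choice.

```python
import random
from statistics import mean, stdev

def coefficient_of_variation(gaps):
    """Sample standard deviation divided by the sample mean."""
    return stdev(gaps) / mean(gaps)

# Simulated exponential gaps with a mean of 30 days (illustrative only).
rng = random.Random(0)
exponential_gaps = [rng.expovariate(1 / 30) for _ in range(5000)]

# Hypothetical regular reorders, roughly every 30 days.
regular_gaps = [28, 30, 29, 31, 30, 32, 29]

cv_exponential = coefficient_of_variation(exponential_gaps)  # close to 1
cv_regular = coefficient_of_variation(regular_gaps)          # far below 1
```

A value well below 1 suggests orders arrive more regularly than an exponential model implies, while a value well above 1 suggests burstier behaviour. With only a handful of gaps per customer the ratio is noisy, so it is a rough screen rather than a formal test.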

Another sensible check is to plot a histogram of the intervals with a log scale for the counts: does it look like a straight line? If not, do the deviations occur for quick reorders or for slow reorders?
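The straight-line check can also be done numerically, without a plotting library. The sketch below bins the gaps, takes the log of each bin count, and fits an ordinary least-squares line. For exponential data the slope should be roughly $-1/\text{mean}$. The bin width, the `min_count` cutoff for sparse tail bins, and the simulated data are all assumptions chosen for illustration.

```python
import math
import random
from collections import Counter

def log_histogram(gaps, bin_width, min_count=20):
    """Return (bin_start, log(count)) pairs, skipping sparse bins."""
    counts = Counter(int(g // bin_width) for g in gaps)
    return [
        (b * bin_width, math.log(c))
        for b, c in sorted(counts.items())
        if c >= min_count  # sparse tail bins are too noisy to fit
    ]

def fit_line(points):
    """Ordinary least-squares slope and intercept through (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Simulated exponential gaps with a mean of 30 days (illustrative only).
rng = random.Random(0)
gaps = [rng.expovariate(1 / 30) for _ in range(5000)]

points = log_histogram(gaps, bin_width=10)
slope, intercept = fit_line(points)

# Residuals show where the data depart from the line: large residuals in
# the first bins point at quick reorders, in the last bins at slow ones.
residuals = [(x, y - (slope * x + intercept)) for x, y in points]
```

On real order data, the pattern of residuals is more informative than the slope itself, since it shows whether the mismatch sits at short or long intervals.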

Using an exponential distribution is likely to be an approximation. Maybe it is a good enough approximation for your business purposes.
