Chi-squared test to see if a treatment group spent more than a control group

chi-squared-test, expected-value, hypothesis-testing

I have two groups of people. One group was shown an ad and the other was not. I know in aggregate that the treatment group spent \$799 and the control group spent \$412, so overall it seems people responded. Both groups had 1000 people in them.

I am not sure how to translate this to verify whether this is statistically significant. I was going to do a chi squared test with treatment group observed values and use the expected value of the average spend per customer in the control group (\$0.412).

Is this the correct way to choose an expected value?

Best Answer

Comment: There is one sense in which seeing the ad seems to have prompted a greater response. That is the response rate (rather than the dollar amount spent).

There are various tests of $H_0: p_1 = p_2$ vs. $H_a: p_1 > p_2,$ where the $p_i$ are the respective response rates.

Output from Minitab for two such tests is shown below.

Test and CI for Two Proportions 

Sample   X     N  Sample p
1       50  1000  0.050000
2       29  1000  0.029000

Difference = p (1) - p (2)
Estimate for difference:  0.021
95% lower bound for difference:  0.00669270
Test for difference = 0 (vs > 0):  
   Z = 2.41  P-Value = 0.008

Fisher’s exact test: P-Value = 0.011

The first test (P-value 0.008) uses a normal approximation to the binomial proportions, which should give reasonably accurate results for such large samples. Fisher's exact test (P-value 0.011) uses a hypergeometric distribution. Both tests are significant at the 5% level.
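For readers without Minitab, here is a minimal Python sketch of the same two tests, using only the counts in the table above (50 and 29 responders out of 1000 each). The function names are my own, and I am assuming the pooled standard error for the z statistic and a one-sided hypergeometric tail for Fisher's test; this is an illustration, not a description of Minitab's internals.

```python
from math import comb, sqrt
from statistics import NormalDist


def two_prop_z_test(x1, n1, x2, n2):
    """One-sided z test of H0: p1 = p2 vs Ha: p1 > p2 (pooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, 1 - NormalDist().cdf(z)


def fisher_exact_greater(x1, n1, x2, n2):
    """One-sided Fisher exact P-value: P(X >= x1) for a hypergeometric X."""
    total, successes = n1 + n2, x1 + x2
    upper = min(successes, n1)
    # Count the ways to place k of the responders in group 1, for k >= x1.
    tail = sum(comb(successes, k) * comb(total - successes, n1 - k)
               for k in range(x1, upper + 1))
    return tail / comb(total, n1)


z, p_z = two_prop_z_test(50, 1000, 29, 1000)
p_fisher = fisher_exact_greater(50, 1000, 29, 1000)
print(f"z = {z:.2f}, one-sided P = {p_z:.3f}")
print(f"Fisher exact one-sided P = {p_fisher:.3f}")
```

The z statistic and its P-value should agree with the Minitab line `Z = 2.41  P-Value = 0.008`.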

If looking at response rates is of interest to you, you can find particulars in an elementary applied statistics text or online.

As @NickCox suggests, we would have to know the $(50+29 = 79)$ individual dollar amounts in order to explore two-sample tests for amount spent with any confidence. However, it seems each purchase in each group averages around \$15, so looking at response rates might tell you what you really want to know about the effect of exposure to the ad.
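The "around \$15" figure is just total spend divided by the number of buyers. A quick check, with the totals taken from the question and the buyer counts from the Minitab table (the variable names are mine):

```python
# Aggregate spend divided by the number of buyers in each group.
spend = {"treatment": 799, "control": 412}   # total dollars, from the question
buyers = {"treatment": 50, "control": 29}    # responders, from the Minitab table

avg_purchase = {g: spend[g] / buyers[g] for g in spend}
for g, avg in avg_purchase.items():
    print(f"{g}: ${avg:.2f} per purchase")
```

That gives roughly \$16 per purchase in the treatment group and \$14 in the control group.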


Note: Just as an experiment, I simulated a dataset assuming that the 50 nonzero sales in Group 1 are distributed $\mathsf{Norm}(\mu=16, \sigma=3)$ and that the 29 nonzero sales in Group 2 are distributed $\mathsf{Norm}(\mu=14, \sigma=3).$ Dollar amounts were rounded to integers:

table(x1)
x1
  0  10  11  12  13  14  15  16  17  18  19  20  21  22 
950   2   1   7   6   8   5   4   5   2   2   5   1   2 

table(x2)
x2
  0   8   9  10  11  12  13  14  15  16  18  21  22 
971   1   2   1   3   4   2   3   4   3   4   1   1 
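A dataset of this kind can be generated along the following lines. This is a Python sketch rather than the R session that produced the tables above; the seed is arbitrary (my assumption), so the individual values will differ from those tabulated.

```python
import random


def simulate_group(n_total, n_buyers, mu, sigma, rng):
    """Rounded normal purchase amounts for buyers, zeros for everyone else."""
    purchases = [round(rng.gauss(mu, sigma)) for _ in range(n_buyers)]
    return purchases + [0] * (n_total - n_buyers)


rng = random.Random(2024)  # arbitrary seed: an assumption, not the original one
x1 = simulate_group(1000, 50, 16, 3, rng)   # Group 1: 50 buyers, Norm(16, 3)
x2 = simulate_group(1000, 29, 14, 3, rng)   # Group 2: 29 buyers, Norm(14, 3)
print(sum(x1) / len(x1), sum(x2) / len(x2))  # mean spend per person
```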

A Welch two-sample t test in R gave P-value 0.0034, as follows:

        Welch Two Sample t-test

data:  x by g
t = 2.7117, df = 1803.6, p-value = 0.003379
alternative hypothesis: 
  true difference in means is greater than 0
95 percent confidence interval:
  0.1415163       Inf
sample estimates:
mean in group 1 mean in group 2 
          0.768           0.408 
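The t statistic and degrees of freedom can be recomputed from the two frequency tables. Below is a Python sketch; the helper names are mine, and because the standard library has no t distribution I use a normal approximation for the P-value, which is an assumption that seems harmless with about 1800 degrees of freedom.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

# Value -> count, copied from the table(x1) and table(x2) output.
T1 = {0: 950, 10: 2, 11: 1, 12: 7, 13: 6, 14: 8, 15: 5, 16: 4,
      17: 5, 18: 2, 19: 2, 20: 5, 21: 1, 22: 2}
T2 = {0: 971, 8: 1, 9: 2, 10: 1, 11: 3, 12: 4, 13: 2, 14: 3,
      15: 4, 16: 3, 18: 4, 21: 1, 22: 1}


def expand(table):
    """Turn a value -> count table back into a list of observations."""
    return [v for v, c in table.items() for _ in range(c)]


def welch(x, y):
    """Welch t statistic and Satterthwaite degrees of freedom."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


t, df = welch(expand(T1), expand(T2))
p_approx = 1 - NormalDist().cdf(t)  # normal approximation to the t tail
print(f"t = {t:.4f}, df = {df:.1f}, approximate one-sided P = {p_approx:.4f}")
```

The statistic and degrees of freedom should match the R output (t = 2.7117, df = 1803.6); the approximate P-value comes out slightly below R's exact 0.003379.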

In spite of all the zeros and additional ties that result from rounding dollars to integers, a one-sided, two-sample Wilcoxon (rank sum) test in R gave P-value 0.007, with no error messages.

wilcox.test(x ~ g, alt="g")$p.val
[1] 0.00725217
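To see how the ties are handled, here is a Python sketch of the rank sum test on the tabulated data. I am assuming the usual large-sample recipe: midranks for tied values, a tie-corrected variance, and a continuity correction of 0.5. The function and variable names are mine.

```python
from collections import Counter
from math import sqrt
from statistics import NormalDist

# Value -> count, copied from the table(x1) and table(x2) output.
T1 = {0: 950, 10: 2, 11: 1, 12: 7, 13: 6, 14: 8, 15: 5, 16: 4,
      17: 5, 18: 2, 19: 2, 20: 5, 21: 1, 22: 2}
T2 = {0: 971, 8: 1, 9: 2, 10: 1, 11: 3, 12: 4, 13: 2, 14: 3,
      15: 4, 16: 3, 18: 4, 21: 1, 22: 1}


def expand(table):
    """Turn a value -> count table back into a list of observations."""
    return [v for v, c in table.items() for _ in range(c)]


def rank_sum_greater(x, y):
    """One-sided rank sum P-value (Ha: x tends to be larger than y)."""
    m, n = len(x), len(y)
    N = m + n
    ties = Counter(x + y)
    # Midrank of each distinct value in the pooled, sorted sample.
    midrank, start = {}, 0
    for v in sorted(ties):
        midrank[v] = start + (ties[v] + 1) / 2
        start += ties[v]
    w = sum(midrank[v] for v in x) - m * (m + 1) / 2   # Mann-Whitney form
    tie_term = sum(t ** 3 - t for t in ties.values()) / (N * (N - 1))
    sigma = sqrt(m * n / 12 * ((N + 1) - tie_term))    # tie-corrected SD
    z = (w - m * n / 2 - 0.5) / sigma                  # continuity correction
    return 1 - NormalDist().cdf(z)


p_wilcox = rank_sum_greater(expand(T1), expand(T2))
print(f"one-sided rank sum P = {p_wilcox:.4f}")
```

Almost all of the tie correction comes from the 1921 zeros, which is why the variance is so much smaller than it would be for untied data.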

A one-sided permutation test with the pooled 2-sample t statistic as metric (but not assuming normality) gave P-value about 0.003.
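A permutation test of this kind might be sketched as follows in Python. The number of permutations (2000) and the seed are my own arbitrary choices, so the simulated P-value will vary a little around the figure quoted above.

```python
import random
from math import sqrt

# Value -> count, copied from the table(x1) and table(x2) output.
T1 = {0: 950, 10: 2, 11: 1, 12: 7, 13: 6, 14: 8, 15: 5, 16: 4,
      17: 5, 18: 2, 19: 2, 20: 5, 21: 1, 22: 2}
T2 = {0: 971, 8: 1, 9: 2, 10: 1, 11: 3, 12: 4, 13: 2, 14: 3,
      15: 4, 16: 3, 18: 4, 21: 1, 22: 1}


def expand(table):
    """Turn a value -> count table back into a list of observations."""
    return [v for v, c in table.items() for _ in range(c)]


def pooled_t(s1, q1, n1, s2, q2, n2):
    """Pooled two-sample t statistic from sums (s) and sums of squares (q)."""
    ss1 = q1 - s1 * s1 / n1
    ss2 = q2 - s2 * s2 / n2
    sp2 = (ss1 + ss2) / (n1 + n2 - 2)
    return (s1 / n1 - s2 / n2) / sqrt(sp2 * (1 / n1 + 1 / n2))


def perm_test_greater(x, y, n_perm=2000, seed=1):
    """One-sided permutation P-value with the pooled t statistic as metric."""
    rng = random.Random(seed)          # seed is an arbitrary assumption
    pooled = x + y
    n1, n2 = len(x), len(y)
    S, Q = sum(pooled), sum(v * v for v in pooled)
    s1, q1 = sum(x), sum(v * v for v in x)
    t_obs = pooled_t(s1, q1, n1, S - s1, Q - q1, n2)
    hits = 0
    for _ in range(n_perm):
        g1 = rng.sample(pooled, n1)    # random relabelling of group 1
        s1, q1 = sum(g1), sum(v * v for v in g1)
        if pooled_t(s1, q1, n1, S - s1, Q - q1, n2) >= t_obs:
            hits += 1
    return (hits + 1) / (n_perm + 1)   # add-one rule avoids reporting P = 0


p_perm = perm_test_greater(expand(T1), expand(T2))
print(f"one-sided permutation P is about {p_perm:.4f}")
```

The group labels are reshuffled without any distributional assumption; the t statistic only serves as a measure of how far apart the two relabelled groups are.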

Unless your non-zero dollar values are much different from my simulated ones, I do not expect a problem finding a valid two-sample test to compare dollar amounts.
