Z-test and Chi squared test producing different p-values

hypothesis-testing statistical-inference statistics

I'm doing A/B testing for conversion rate on two groups: group A has 6000 samples, of which 90 are conversions, and group B has 4000 samples, of which 80 are conversions. I want to know whether group B has a statistically significantly higher conversion rate.

I seemingly get a different answer depending on whether I use a Z-test or a Chi squared test at alpha = 0.05. The Z-test returns a p-value of 0.0327, whereas the Chi squared test gives a p-value of 0.058.

The problem originates from https://towardsdatascience.com/the-art-of-a-b-testing-5a10c9bb70a4 , and when I try it on my own I get the same values as in the article. The author attempts to explain the discrepancy by saying the Z-test doesn't take into account that the random variable for the difference of the means is restricted to [-1, 1], but I don't really follow.

I was under the impression that these tests are equivalent for this type of problem, so why do they return different p-values?

Thanks.

Edit: As @BruceET suspected, I was doing a two-sided Chi squared test, which obviously doesn't give the same p-value as the one-sided Z-test (or T-test, to be more accurate) for proportions. As was also pointed out, I wasn't clear about how I was estimating the variances, which was another problem. The method used in the article I followed was Welch's T-test (i.e. a T-test without pooling variances). If I use the "exact" variance = mean*(1-mean)*(1/n_A + 1/n_B), where the mean is taken over both A and B, the p-value is 0.029, exactly half of that of the Chi squared test. I suspect I'll get something close to it if I use a pooled variance, but I have not tried it.
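
The two variance choices described in this edit can be checked by hand. The sketch below is a minimal illustration in R, assuming a normal approximation for both statistics (the article's Welch T-test would use a t distribution, which is nearly identical at these sample sizes). The variable names are illustrative and not taken from the article.

```r
# Counts from the question
conv <- c(A = 90, B = 80)
size <- c(A = 6000, B = 4000)
p_hat <- conv / size                      # observed conversion rates
diff  <- p_hat[["B"]] - p_hat[["A"]]      # difference in the direction of H1: p_B > p_A

# Unpooled (Welch-style) standard error: each group contributes its own variance
se_unpooled <- sqrt(sum(p_hat * (1 - p_hat) / size))
z_unpooled  <- diff / se_unpooled
p_unpooled  <- pnorm(z_unpooled, lower.tail = FALSE)   # one-sided p-value

# Standard error from the formula above: mean * (1 - mean) * (1/n_A + 1/n_B),
# where the mean is the overall conversion rate across both groups
p_pool    <- sum(conv) / sum(size)
se_pooled <- sqrt(p_pool * (1 - p_pool) * sum(1 / size))
z_pooled  <- diff / se_pooled
p_pooled  <- pnorm(z_pooled, lower.tail = FALSE)       # one-sided p-value

round(c(unpooled = p_unpooled, pooled = p_pooled, two_sided_pooled = 2 * p_pooled), 4)
round(z_pooled^2, 4)   # compare with the X-squared statistic of the Chi squared test
```

Under these assumptions the unpooled p-value should land near the 0.0327 reported above, and doubling the one-sided p-value from the overall-mean variance should reproduce the Chi squared p-value of about 0.058, because the squared z-statistic is the Pearson statistic.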

Best Answer

I realize that this is not a direct answer to your question. However, using two fundamentally different procedures that I trust, I do not find any conflict in the results. [My guess is that your 'z-test' may be one-sided and your 'chi-squared test' two-sided.]

Data:

conv = c(90,80)
size = c(6000, 4000)
nonc = size - conv
MAT = rbind(conv,nonc)

MAT
     [,1] [,2]
conv   90   80
nonc 5910 3920

One-sided Fisher Exact test:

fisher.test(MAT, alt="less")

         Fisher's Exact Test for Count Data

data:  MAT
p-value = 0.03543
alternative hypothesis: 
  true odds ratio is less than 1
95 percent confidence interval:
 0.00000 0.97505
sample estimates:
odds ratio 
 0.7462279 

One-sided test of $p_A = p_B$ against $p_A < p_B:$

prop.test(conv, size, alt="less")

        2-sample test for equality of proportions 
        with continuity correction

data:  conv out of size
X-squared = 3.2975, df = 1, p-value = 0.03469
alternative hypothesis: less
95 percent confidence interval:
 -1.0000000000 -0.0003285328    # Does not incl 0

sample estimates:
prop 1 prop 2 
 0.015  0.020 

Two-sided chi-squared contingency test. (Irrelevant because you say you want a one-sided test, but this test is inherently two-sided.)

chisq.test(MAT, correct=FALSE)

        Pearson's Chi-squared test

data:  MAT
X-squared = 3.5904, df = 1, p-value = 0.05811
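
For a 2x2 table, the uncorrected two-sided test and the uncorrected one-sided test are built on the same statistic, so the one-sided p-value is half the two-sided one when the observed difference lies in the hypothesized direction. A minimal sketch of that relationship, assuming the continuity correction is switched off in both calls (the data vectors are redefined so the snippet runs on its own):

```r
conv <- c(90, 80)
size <- c(6000, 4000)

# Same Pearson statistic, no continuity correction, two different alternatives
two_sided <- prop.test(conv, size, correct = FALSE)
one_sided <- prop.test(conv, size, alternative = "less", correct = FALSE)

# X-squared here matches the statistic from the uncorrected chisq.test on MAT
two_sided$statistic

# The two-sided p-value is twice the one-sided p-value for these data
c(two_sided = two_sided$p.value,
  one_sided = one_sided$p.value,
  ratio     = two_sided$p.value / one_sided$p.value)
```

The one-sided value from this call differs slightly from the 0.03469 shown earlier only because that earlier `prop.test` call applied the continuity correction.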