I am new to A/B testing and would like to generate confidence intervals in Python. At the moment I have a ratio (proportion), for instance the ratio of add-to-cart events to total sessions in the control and experiment variants. I have two code snippets:
A
number of successes = 204,503
number of trials = 965,760
from statsmodels.stats.proportion import proportion_confint
ci_low, ci_upp = proportion_confint(204503, 965760, alpha=0.05, method='normal')
ci_low is the lower confidence bound and ci_upp is the upper confidence bound.
B
from scipy import stats
def binomial_ci(x, n, alpha=0.05):
    # x is the number of successes, n is the number of trials
    if x == 0:
        c1 = 0
    else:
        c1 = stats.beta.interval(1 - alpha, x, n - x + 1)[0]
    if x == n:
        c2 = 1
    else:
        c2 = stats.beta.interval(1 - alpha, x + 1, n - x)[1]
    return c1, c2
p1,p2 = binomial_ci(204503,965760)
Link: https://stackoverflow.com/a/65873479/2740831
I cannot figure out what each snippet does or when to use which.
Subsequently, I will need to generate confidence intervals for average order value (AOV) and gross margin value (GMV).
Sorry for the redundant code and question; I do not know whom else to ask. I could blindly use either snippet, but I would like to understand the underlying logic.
I conduct the proportion test as follows:
from statsmodels.stats.proportion import proportions_ztest
import numpy as np
def conduct_prop_test(sample_success_a, sample_success_b, sample_size_a, sample_size_b):
    # sample_success_a, sample_success_b: number of successes in variants A and B
    # sample_size_a, sample_size_b: number of trials in variants A and B
    successes = np.array([sample_success_a, sample_success_b])
    samples = np.array([sample_size_a, sample_size_b])
    stat, p_value = proportions_ztest(count=successes, nobs=samples, alternative='two-sided')
    return stat, p_value
##function ends
##how to run
conduct_prop_test(327061,81702,1543914,384635) ## returns a tuple of z-value and p-value
Best Answer
I almost address how to do this here.
The test statistic for a difference in proportions is
$$ z=\frac{p_{1}-p_{2}}{\sqrt{\frac{p_{1}\left(1-p_{1}\right)}{n_{1}}+\frac{p_{2}\left(1-p_{2}\right)}{n_{2}}}}$$
Here, $p_i$ is the estimated proportion in group $i$, i.e. successes divided by trials. We fail to reject the null at the 5% level (two-sided) when $ \vert z \vert < 1.96$. A 95% confidence interval is then obtained by rearranging that inequality for $p_1 - p_2$, namely
$$ (p_1 - p_2) \pm 1.96 \sqrt{\frac{p_{1}\left(1-p_{1}\right)}{n_{1}}+\frac{p_{2}\left(1-p_{2}\right)}{n_{2}}} $$
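
The interval above can be sketched in a few lines of Python. This is only a minimal illustration, not a library routine: the function name diff_prop_ci is made up for this example, the counts are the ones from the question's call to conduct_prop_test, and the critical value is taken from the standard library's NormalDist rather than hard-coding 1.96, so that other values of alpha also work.

```python
from math import sqrt
from statistics import NormalDist


def diff_prop_ci(x1, n1, x2, n2, alpha=0.05):
    """Confidence interval for p1 - p2 using the unpooled standard error.

    diff_prop_ci is a hypothetical helper written for this example.
    x1, x2 are success counts; n1, n2 are trial counts.
    """
    p1, p2 = x1 / n1, x2 / n2
    # Standard error from the formula above (each group's own variance).
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Two-sided critical value; roughly 1.96 when alpha = 0.05.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    diff = p1 - p2
    return diff - z_crit * se, diff + z_crit * se


# Counts taken from the question's example call.
low, upp = diff_prop_ci(327061, 1543914, 81702, 384635)
print(low, upp)
```

If the resulting interval contains zero, that corresponds to failing to reject the null of equal proportions at the chosen level, under the same normal approximation as the test statistic above.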