Solved – What type of statistic should I use to compare ratios in the experiment

t-test

I have performed an experiment to analyse the ratio of intronic to exonic levels for 1000 transcripts after treating cells with a drug, and I want to compare the results with untreated cells. Each transcript corresponds to a gene; intronic regions are the parts of the transcript that are rapidly removed, and exons are the parts that encode proteins. The drug is thought to increase the levels of intronic regions relative to exonic regions.
I have compared the ratio of intron/exon signals in treated versus untreated cells in two independent experiments (two biological replicates). What kind of statistical test should I use to see if there is a significant increase in the intron/exon ratio after drug treatment?
I was performing a Student's t-test comparing the intron/exon ratio in biological replicates 1 and 2 treated with the drug against untreated cells. Is that correct?

Thanks a lot for your help

Best Answer

Ok here's a quick summary of what I think you can do:

Since you expect an increase in the intronic/exonic ratio after drug treatment, you expect to see a ratio that is significantly higher for the treated cells than for the untreated ones (if I've understood your problem correctly).

So let $r_t$ be the "true" ratio (intronic/exonic) for treated cells and $r_u$ the "true" ratio for untreated ones. Your null hypothesis ($H_0$) is that $r_t=r_u$ (i.e., no difference, the drug has no effect), whereas your alternative hypothesis ($H_a$) is $r_t > r_u$ (an upper-tail alternative). Now we can transform this problem into a problem of proportions by considering $$p={\#\text{intronic}\over (\#\text{intronic} + \#\text{exonic})} = {r\over r+1}$$ and noting that, since $r/(r+1)$ is increasing in $r$, $r_t>r_u \Leftrightarrow p_t>p_u$.
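As a quick check of the ratio-to-proportion mapping above, here is a minimal Python sketch. The function name `ratio_to_proportion` and the example ratios are made up for illustration; they are not from your data.

```python
def ratio_to_proportion(r):
    """Map an intron/exon ratio r >= 0 to the proportion p = r / (r + 1)."""
    if r < 0:
        raise ValueError("a ratio of two non-negative levels cannot be negative")
    return r / (r + 1)

# Hypothetical ratios, for illustration only.
r_untreated = 0.25
r_treated = 0.50

p_untreated = ratio_to_proportion(r_untreated)
p_treated = ratio_to_proportion(r_treated)

# The mapping is increasing, so the ordering of the ratios is preserved.
print(p_untreated, p_treated, p_treated > p_untreated)
```

Because the mapping preserves order, testing $p_t > p_u$ is equivalent to testing $r_t > r_u$.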

Now you can look at the observed difference in proportions $\hat p_t-\hat p_u$, where $\hat p_t$ and $\hat p_u$ are estimated from samples of size $n_t$ and $n_u$. Under the null hypothesis (i.e., under the hypothesis that $p_t=p_u=p$) and suitable assumptions [1], approximately,

$$ \hat{p}_t-\hat{p}_u \sim \mathcal N\left(0,p(1-p)\left({1\over n_t}+{1\over n_u}\right)\right)$$

So under the null, the following statistic should approximately follow a standard normal distribution:

$$ z = {\hat{p}_t-\hat{p}_u\over \sqrt{\hat{p}(1-\hat{p})\left({1\over n_t}+{1\over n_u}\right)}} $$

where $\hat{p}$ is the observed proportion over the pooled sample, i.e.,

$$\hat{p} = {n_t\hat{p}_t+n_u\hat{p}_u\over n_t+n_u} .$$

It now remains for you to compute $P(Z\ge z)$, where $Z$ is a standard normal variable. If this probability is very small, there is evidence to reject the null hypothesis (i.e., to say that your drug does indeed seem to have a significant effect).
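Here is a rough Python sketch of the computation, using only the standard library. It assumes that $n_t$ and $n_u$ are total counts (intronic + exonic) in each condition and that the intronic counts are available. The function name `two_proportion_z_test` and the numbers in the example are hypothetical, so plug in your own data.

```python
import math

def two_proportion_z_test(x_t, n_t, x_u, n_u):
    """Upper-tail z-test for H0: p_t = p_u against Ha: p_t > p_u.

    x_t, x_u: intronic counts (treated, untreated)  -- assumed inputs
    n_t, n_u: total counts, intronic + exonic       -- assumed inputs
    Returns (z, p_value) with p_value = P(Z >= z).
    """
    p_hat_t = x_t / n_t
    p_hat_u = x_u / n_u
    # Pooled proportion: equals (n_t * p_hat_t + n_u * p_hat_u) / (n_t + n_u).
    p_hat = (x_t + x_u) / (n_t + n_u)
    # Estimated standard deviation of the difference under the null.
    s_hat = math.sqrt(p_hat * (1 - p_hat) * (1 / n_t + 1 / n_u))
    z = (p_hat_t - p_hat_u) / s_hat
    # Upper tail of the standard normal, via the complementary error function.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value

# Hypothetical counts, for illustration only.
z, p_value = two_proportion_z_test(x_t=60, n_t=100, x_u=45, n_u=100)
print(f"z = {z:.3f}, P(Z >= z) = {p_value:.4f}")
```

Keep in mind that the normal approximation relies on reasonably large counts in both groups; with very small counts an exact test would be a safer choice.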

Remark: even before computing the statistic, you can just look at your observed difference $\hat{p}_t-\hat{p}_u$ and at the estimated variance of that difference under the null ($\hat{s}^2=\hat{p}(1-\hat{p})(1/n_t+1/n_u)$). If the observed difference is much larger than the estimated standard deviation $\hat{s}$, that is evidence that the difference is significant. (Do you see why?)

[1] I strongly recommend you check a reference to understand the underlying theory (it's not hard). Any good introductory book on statistics will cover hypothesis testing. One example among many is "Mathematical Statistics with Applications" (Wackerly, Mendenhall, Scheaffer).
