T-Test – Is a T-Test Applicable for Comparing Two Proportions?

proportion;t-test;z-test

I have 3 categories of tissue examined by microscopy (A, B and C, 12 specimens of each). I want to test whether the proportion of tissue stained by a particular dye differs among the three categories. Each of the 36 tissue specimens is stained and examined under the microscope. The data are recorded as the proportion of cells staining positive for the dye in each tissue specimen.

Now, I understand that the proportions have a binomial distribution and that a t-test is probably not appropriate because the variance of a proportion is known. But I don't know how to apply the z-test to detect a difference in proportions here, since the data collected are themselves in the form of a "proportion" for each specimen. Do I calculate the mean proportion for each category ($p_A,p_B,p_C$) by averaging the proportions recorded for its specimens and then apply the z-test?

As an alternative thought, am I not estimating the proportion in my study population by averaging the proportions themselves? Since I am actually estimating the true population proportion from the sample itself, would the assumptions of the z-test hold here?

Best Answer

As @Dave already said, there's no reason to "throw away" the information on the distribution.

However, I suspect that the more important consideration here is that your data has a nested structure:

  • 3 types of tissue (A, B, C): this is a fixed factor
  • 12 slides nested within each tissue type, exhibiting random variation
  • a number of target cells nested within each slide, each either stained or not: also a random factor
  • (we're presumably not looking into further factors like day-to-day variance, or spatial heterogeneity in the staining of a single tissue type)

An (unpaired) t-test or z-test or binomial proportion test cannot properly describe this structure with 3 factors/levels of data hierarchy. You'd thus need to assume that the uncertainty in estimating the proportion is negligible compared to the variance between slides. This may be achievable by counting a sufficiently large number of target cells, and you'd need to justify this for the following test.
I'd say the distribution of stained proportions across the slides is unknown, since it reflects not only the (binomial) variance due to the finite number of evaluated target cells but also other sources of variance between slides (e.g. repeatability of the staining protocol). A t-test should be OK here. Its p-values may be off if your proportions are all over the place across the slides, but in that case you would still reach the practically important conclusion that your staining protocol doesn't work repeatably.
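As a rough illustration of the "negligible compared to the variance between slides" assumption, here is a minimal sketch that compares the observed variance of the per-slide proportions with the variance expected from binomial counting alone, $p(1-p)/n$. The counts below are made up for illustration (12 hypothetical slides of one tissue type, 200 cells counted on each), and `variance_check` is my own helper name, not something from a library. Note that it needs the raw counts, not just the recorded proportions.

```python
from statistics import mean, variance

def variance_check(stained, total):
    """Compare between-slide variance with the binomial counting variance.

    stained[i] / total[i] is the proportion of stained cells on slide i.
    Returns (observed variance, expected binomial variance, their ratio).
    """
    props = [s / n for s, n in zip(stained, total)]
    p_bar = sum(stained) / sum(total)  # pooled proportion for this tissue type
    observed = variance(props)         # sample variance across slides
    # Variance expected if counting noise were the only source of variation
    binomial = mean([p_bar * (1 - p_bar) / n for n in total])
    return observed, binomial, observed / binomial

# Hypothetical counts, for illustration only
stained = [62, 85, 71, 98, 55, 77, 90, 66, 81, 59, 94, 73]
total = [200] * 12

observed, binomial, ratio = variance_check(stained, total)
print(f"observed between-slide variance: {observed:.5f}")
print(f"expected binomial variance:      {binomial:.5f}")
print(f"ratio observed / binomial:       {ratio:.2f}")
```

A ratio well above 1 suggests that slide-to-slide variation dominates, so treating each slide's proportion as a single measurement is more defensible. A ratio near 1 suggests that counting noise is not negligible.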

Nevertheless, a generalized linear (binomial) mixed model can deal with your data structure directly. It will give you estimates of the proportions stained in A, B, and C tissues, as well as an estimate of slide-to-slide repeatability and an estimate of the binomial uncertainty due to the number of evaluated cells (i.e. a quality check for your experimental design: was the number of cells evaluated per slide adequate?).
I'd thus recommend that you consider using it for your evaluation.
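For concreteness, one common way to write such a model is sketched below. The notation is mine and is only an assumption about how the model might be set up: $y_{ij}$ is the number of stained cells among the $n_{ij}$ cells counted on slide $j$ of tissue type $i \in \{A, B, C\}$.

$$
\begin{aligned}
y_{ij} \mid u_{ij} &\sim \mathrm{Binomial}(n_{ij},\, p_{ij}),\\
\operatorname{logit}(p_{ij}) &= \beta_i + u_{ij},\\
u_{ij} &\sim \mathcal{N}(0,\, \sigma^2_{\text{slide}}).
\end{aligned}
$$

In this sketch the fixed effects $\beta_i$ describe the stained proportion of each tissue type on the logit scale, $\sigma^2_{\text{slide}}$ captures the slide-to-slide variation, and the binomial layer accounts for the counting uncertainty. Fitting it requires the stained and total cell counts per slide rather than only the recorded proportions.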
