Statistical test for multiple comparison of groups of animals with Yes or No outcome

binary data, multiple-comparisons, sas

In my experiment (pre-clinical vaccine testing), I want to know which statistical test to use to compare 9 groups of animals (72 animals randomly divided into 9 groups, each with 7 or 8 animals). After administering a different experimental vaccine to each group (9 vaccines for 9 groups), each animal is evaluated for a continuous response (log10 titer) at unequal intervals for about 1 year (day 0, the day of vaccination; days 7, 14, 21, and 28; and 1, 3, 6, 9, and 12 months after vaccination). After one year the animals are assessed for protection status (Yes or No for each animal), so there are 72 yes/no observations. For the time-series data I have used a linear mixed model with Tukey's multiple comparison test to find significant group differences.

  1. Now I want to use only the yes/no data from each animal to find out which group is best. Which statistical significance tests should be carried out?
  2. How do I obtain p-values for multiple comparisons of the percentage of protection in each group (calculated as number of animals protected / total number of animals)?
  3. How do I obtain a confidence interval for the percentage of protection in each group?
  4. How do I assess which treatment group is best based on the protection data combined with the one-year continuous time-series data?

I have SAS 9.3 and can also work in R (RStudio). I searched Google and found people suggesting different methods, such as PROC MULTTEST, PROC GENMOD, and PROC LOGISTIC. Some suggest Fisher's exact test or the chi-squared test. In my opinion, though, logistic regression or a generalized linear model requires more data than I have here.

My data look like this:

data animal;
input   AnimalNo   Treatment $   Protection;
cards;  
3   T-01    0
53  T-01    0
58  T-01    0
59  T-01    0
66  T-01    0
8   T-02    1
23  T-02    0
40  T-02    1
44  T-02    1
49  T-02    1
55  T-02    1
57  T-02    1
11  T-03    0
18  T-03    1
20  T-03    0
32  T-03    1
41  T-03    1
43  T-03    1
67  T-03    1
74  T-03    1
19  T-04    1
21  T-04    1
22  T-04    1
24  T-04    1
38  T-04    0
45  T-04    1
51  T-04    0
69  T-04    0
10  T-05    1
30  T-05    1
31  T-05    1
47  T-05    1
50  T-05    1
56  T-05    1
70  T-05    1
72  T-05    1
2   T-06    1
4   T-06    0
6   T-06    0
9   T-06    1
15  T-06    0
48  T-06    0
64  T-06    0
79  T-06    0
5   T-07    1
7   T-07    1
14  T-07    0
28  T-07    1
33  T-07    1
37  T-07    1
68  T-07    1
12  T-08    0
16  T-08    1
27  T-08    0
36  T-08    0
39  T-08    0
42  T-08    1
60  T-08    0
1   T-09    0
25  T-09    1
26  T-09    0
52  T-09    1
54  T-09    1
63  T-09    1
71  T-09    1
75  T-09    0
;
run;

Best Answer

If you concentrate on the crucial pre-planned comparisons of protection (each test vaccine against no treatment and against a known-effective vaccine), correction for multiple comparisons is less of an issue than it is in post-hoc analyses of patterns discovered in the data. Those comparisons will tell you whether each test vaccine is better than no vaccine, and whether any differ significantly from the known-effective vaccine. The small number of cases, however, means that you will not have much power to detect true differences from the known-effective vaccine. If you have two equivalent standard vaccines (T-02 and T-03), you might want to combine them into a single group to gain power.

The Fisher exact test (fisher.test() in R) is your best choice for these comparisons. You don't have to worry about whether the requirements of the chi-squared test are met, and with so few cases you won't run into numerical overflow. The Wikipedia page on Fisher's exact test is a good place to start learning about the controversies over the conservative nature of the test (and of other discrete tests), and it provides references. In practice, the conservativeness of the Fisher test should tend to offset the issues raised by multiple comparisons, although I don't know of a way to gauge the balance precisely.
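As a rough illustration of those pre-planned comparisons, here is a minimal Python sketch that uses only the standard library instead of R's fisher.test(). It rests on assumptions that the question does not state outright: that T-01 is the no-vaccine control and that T-02 and T-03 are the standard vaccines that may be pooled. The protected/total counts are tallied from the posted data, and the names `COUNTS`, `fisher_exact_two_sided`, and `compare` are purely illustrative. The two-sided p-value sums the probabilities of all tables that are no more likely than the observed one, which is the convention documented for fisher.test().

```python
from math import comb

# Protected / total animals per group, tallied from the data listing in the question.
COUNTS = {
    "T-01": (0, 5), "T-02": (6, 7), "T-03": (6, 8),
    "T-04": (5, 8), "T-05": (8, 8), "T-06": (2, 8),
    "T-07": (6, 7), "T-08": (2, 7), "T-09": (5, 8),
}


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k "protected" animals in the first row.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is at most as probable as the observed one
    # (small relative tolerance guards against floating-point ties).
    total = sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def compare(group, ref_protected, ref_total):
    """p-value for one group against a reference (protected, total) pair."""
    x, n = COUNTS[group]
    return fisher_exact_two_sided(x, n - x, ref_protected, ref_total - ref_protected)


# Assumption: T-01 is the untreated control; T-02 and T-03 are pooled as "standard".
control = COUNTS["T-01"]
standard = (COUNTS["T-02"][0] + COUNTS["T-03"][0],
            COUNTS["T-02"][1] + COUNTS["T-03"][1])

for group in ("T-04", "T-05", "T-06", "T-07", "T-08", "T-09"):
    p_ctrl = compare(group, *control)
    p_std = compare(group, *standard)
    print(f"{group}: vs control p={p_ctrl:.4f}, vs pooled standard p={p_std:.4f}")
```

The p-values printed here are unadjusted; if you want a formal correction for the number of comparisons, it would have to be applied on top of them.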

Looking for the "best" among the test vaccines is problematic. The nominally best, T-05, gave protection in 8 of 8 cases, but the lower limit of its 95% confidence interval is a protected fraction of only 0.63. (The binom.test() function in R provides 95% confidence intervals.) So any vaccine that gave protection in at least 5 of 8 cases (T-04, T-07, T-09) can't really be distinguished from it. Ideally, you would use these results to choose a set of best candidates and investigate them more extensively in a larger study.
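To see where a figure like 0.63 comes from, the exact (Clopper-Pearson) interval that binom.test() reports can be sketched in plain Python. This is an illustrative reimplementation, not R's code: the helper names `binom_cdf` and `clopper_pearson` are made up for this example, and the limits are found by simple bisection on the binomial tail probabilities.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(x, n, conf=0.95):
    """Exact two-sided confidence interval for a binomial proportion x/n."""
    alpha = 1 - conf

    def bisect(f, target):
        # f is increasing in p on [0, 1]; find p with f(p) == target.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if f(mid) < target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower limit: the p at which P(X >= x) equals alpha / 2 (0 when x == 0).
    lower = 0.0 if x == 0 else bisect(lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper limit: the p at which P(X <= x) equals alpha / 2 (1 when x == n).
    upper = 1.0 if x == n else bisect(lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2)
    return lower, upper


# Protected / total per group, tallied from the data listing in the question.
counts = {
    "T-01": (0, 5), "T-02": (6, 7), "T-03": (6, 8),
    "T-04": (5, 8), "T-05": (8, 8), "T-06": (2, 8),
    "T-07": (6, 7), "T-08": (2, 7), "T-09": (5, 8),
}

for group, (x, n) in counts.items():
    low, high = clopper_pearson(x, n)
    print(f"{group}: {x}/{n} protected, 95% CI [{low:.2f}, {high:.2f}]")
```

For 8 of 8 protected, the lower limit solves p^8 = 0.025, which is about 0.63 and matches the value quoted above. With 7 or 8 animals per group the intervals are wide and overlap heavily, which is the reason a single "best" vaccine cannot be picked with confidence.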

Do not give up on using logistic regression to analyze the relation of protection to a continuous measure of titer. Unless you have reason to believe that the relation of titer to protection differs among the vaccines, including all 66 animals (or at least the 61 that received any vaccine) together in such an analysis will show whether there is an overall relation between titer measurements and protection, and could support the use of titer as a surrogate biomarker for protection in future studies. You might consider including vaccine as a random effect in your model to see whether there is any evidence of differences among vaccines in the titer-protection relation, although again power may be limited.
