Solved – Kruskal-Wallis

binomial distribution · kruskal-wallis test · proportion

I have three sets of data (one collected before treatment, a second during treatment, and a third after), each consisting of hundreds of trials on a specific task that involves choosing 1 of 7 possible options, where each choice is either correct or incorrect (a binary outcome). I'd like to determine whether the proportion of correct choices differs significantly as a result of treatment. Which statistical test could potentially help?

Best Answer

So you have data from two subjects who each completed hundreds of trials at three assessment phases (before/during/after treatment), right? Sounds like you're looking to reject the null hypothesis that both subjects achieved equal success before and during treatment (and maybe after too?), but your data would permit more tests than this. A quick omnibus check is sketched just below.
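
As an assumption-light starting point for that null, you could cross-tabulate phase against correctness and run a chi-squared test of independence. A minimal sketch, with counts made up purely for illustration:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical counts of (correct, incorrect) trials in each phase;
# replace with your real tallies.
table = np.array([
    [210,  90],   # before treatment
    [260,  40],   # during treatment
    [255,  45],   # after treatment
])

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi-squared = {chi2:.2f}, dof = {dof}, p = {p:.4g}")
```

This ignores the subject and trial-order structure entirely, which is exactly why the model discussed next may be worth the extra effort.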

A mixed-effects model could test the effect of assessment phase (effectively a test of your treatment), test for differences between your two subjects, and test for a treatment × subject interaction. If you know the order of the trials within assessment phases, you could also test for effects of practice, fatigue, or whatever else might change subjects' performance over a long series of consecutive trials (see the sketch below). It seems desirable to statistically control for any such effects of trials nested within phases, as well as for differences between your two subjects, as this would reduce unexplained variance and make the treatment effect easier to detect. You might also find evidence of an interaction interesting, as it would suggest individual differences in the effect of the treatment.
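
For instance, a logistic regression with the trial's within-phase position as a covariate can probe such order effects. This is a sketch under stated assumptions: the data frame, its columns (`correct`, `phase`, `trial`), and the simulated practice effect are all hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Hypothetical trial-level data: 300 trials per phase with a slight
# practice effect (accuracy drifts up over consecutive trials).
rows = []
for phase, base in [("before", 0.60), ("during", 0.80), ("after", 0.85)]:
    for trial in range(300):
        p = min(base + 0.0002 * trial, 0.95)
        rows.append({"phase": phase, "trial": trial,
                     "correct": rng.binomial(1, p)})
df = pd.DataFrame(rows)

# correct ~ phase + within-phase trial index: the `trial` coefficient
# captures practice/fatigue trends over consecutive trials.
fit = smf.logit("correct ~ C(phase) + trial", data=df).fit(disp=0)
print(fit.summary())
```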

The fixed vs. random effects distinction is somewhat ambiguous (Gelman, 2005; see also "What is the difference between fixed effect, random effect and mixed effect models?"), so I should admit that I'm not sure which effect(s?) you'd want to treat as random according to some definitions. I'm mainly suggesting a mixture of between-subjects and within-subjects factors in your model (a Python sketch of its fixed-effects part follows the list below): $$Y_{ij}=\mu+\beta_1{\rm Subject}_i+\beta_2{\rm Time}_j+\beta_3{\rm Subject}_i{\rm Time}_j+U_i+W_j+\varepsilon_{ij}$$

  • $Y_{ij}$ is the response variable, as usual. If this is a count of successful trials for subject $i$ at time $j$, and both subjects had an equal number of trials at each time, then you could assume a negative binomial distribution (or maybe Poisson, though that adds the assumption that the variance equals the mean). If your subjects had a different number of trials or you prefer to model proportions of successful trials for some other reason, you could assume a beta distribution instead. If you prefer not to worry about the distribution, you could also try a nonparametric model (e.g., Gu & Ma, 2005).
  • $\mu$ is the grand mean of the response variable across all trials/subjects/times.
  • $\beta_1$ is the mean difference between subjects across all trials/times.
  • $\beta_2$ is the average slope of changes across your three times for all trials/subjects.
  • $\beta_3$ represents the difference between your two subjects in the slope of changes across times (i.e., the subject × time interaction).
  • $U_i$ represents subject-specific error, which could be useful to model if your subjects' performances have different variances.
  • $W_j$ represents time-specific error. E.g., a gradual effect of the treatment could increase variance across trials during the treatment phase only. You might be able to estimate this as a hierarchical model, since you have hundreds of trials nested within assessment times, but because the trials are binary, I'm not sure this is possible. I'm not sure it applies here, but consider Hudecová (2013), and see also "Autocorrelation of discrete time series"; a quick empirical check of trial-to-trial dependence is sketched after this list.

    It might help if you could get another subject too, as this would give you three proportions per assessment phase. That's at least enough to begin separating subject-specific error from time-specific error within a single time period, I think. I may be stretching what I know about latent variable modeling a little too far here.

  • $\varepsilon_{ij}$ would be the remaining error that can't be attributed to subject-specific error (e.g., inattentiveness) or time-specific error. If you can separate out the other two kinds of error at all, this could help isolate the effect of the treatment.
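
To make the fixed-effects part of the model above concrete, here's a minimal Python sketch using statsmodels. Everything in it is hypothetical: the simulated data, the column names (`subject`, `phase`, `correct`), and the cell probabilities. Since the trials are binary, the model is fit on the log-odds scale; the random components $U_i$ and $W_j$ would require a genuine mixed model, e.g., statsmodels' BinomialBayesMixedGLM or lme4's glmer in R.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import chi2

rng = np.random.default_rng(1)

# Hypothetical trial-level data: 2 subjects x 3 phases x 300 trials,
# with made-up success probabilities per cell.
probs = {(1, "before"): 0.60, (1, "during"): 0.80, (1, "after"): 0.85,
         (2, "before"): 0.55, (2, "during"): 0.70, (2, "after"): 0.72}
df = pd.DataFrame([
    {"subject": s, "phase": ph, "correct": rng.binomial(1, p)}
    for (s, ph), p in probs.items() for _ in range(300)
])

# Fixed-effects analogue of mu + Subject + Time + Subject x Time.
full = smf.logit("correct ~ C(subject) * C(phase)", data=df).fit(disp=0)
print(full.summary())

# Likelihood-ratio test of the subject x phase interaction (beta_3):
reduced = smf.logit("correct ~ C(subject) + C(phase)", data=df).fit(disp=0)
lr = 2 * (full.llf - reduced.llf)
print(f"LR stat = {lr:.2f}, p = {chi2.sf(lr, df=2):.4g}")
```

The likelihood-ratio comparison at the end is one way to ask whether the treatment affected the two subjects differently, per the interaction point above.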
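
And for the trial-to-trial dependence worry raised in the $W_j$ bullet, a crude first check is the lag-1 autocorrelation of the binary series within a phase. A sketch, where `x` stands in for one subject's hypothetical 0/1 outcomes in a single phase:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.binomial(1, 0.7, size=300)  # hypothetical within-phase 0/1 outcomes

# Near 0 for independent trials; a markedly positive value hints at
# streaky performance (practice, fatigue, or drift in the treatment effect).
r1 = np.corrcoef(x[:-1], x[1:])[0, 1]
print(f"lag-1 autocorrelation: {r1:.3f}")
```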

If your two subjects aren't very different, including terms to differentiate them in your model might not be worth the degrees of freedom, but that's an empirical question worth testing IMO. Then again, I seem to have taken this in a somewhat different direction than @Glen_b and @Adrian indicate. Hopefully someone will speak up if I'm suggesting something incorrect or infeasible, or if the random effects idea can be clarified.


References
· Gelman, A. (2005, January 25). Why I don’t use the term “fixed and random effects”. Statistical Modeling, Causal Inference, and Social Science. Retrieved from http://andrewgelman.com/2005/01/25/why_i_dont_use/.
· Gu, C., & Ma, P. (2005). Generalized nonparametric mixed-effect models: Computation and smoothing parameter selection. Journal of Computational and Graphical Statistics, 14(2), 485–504. Retrieved from http://www.stat.purdue.edu/~chong/ps/guma.pdf.
· Hudecová, Š. (2013). Structural changes in autoregressive models for binary time series. Journal of Statistical Planning and Inference, 143(10), 1744–1752.