Solved – Difference Between Full Model and Reduced Model in One-Way ANOVA

anova

I recently started learning ANOVA. I've explored the ANOVA table and how the values within the table are calculated (factors, sum of squares, degrees of freedom, mean square, F-value, P-value, etc.). I've also read about the concept of a "full" versus "reduced" model. However, the only (very brief) explanation I got was that the "full" model "separates" the group means, whilst the "reduced" model combines them into a single mean.

I found this explanation of "full" versus "reduced" models to be lacking and, as a result, am thoroughly confused as to what the "full" and "reduced" models are, what they mean, and how one would calculate them.

I would greatly appreciate it if people could please take the time to give an explanation of this concept for a beginner such as myself.

P.S. I have also been doing lots of practice problems using the aov() function in R, so if someone wants to explain using R, then that is fine with me.

Best Answer

Borrowing notation from Wikipedia:

Full model: $y_{i,j}=\mu_j+\varepsilon_{i,j}$

Reduced model: $y_{i,j}=\mu+\varepsilon_{i,j}$

Consider two groups, A and B, so $j = 1, 2$. You have a total of 10 observations, indexed by $i$ within each group. The observations in A have a mean $\mu_1$ and those in B have a mean $\mu_2$. You can also pool them together and obtain a grand mean $\mu$. Your estimates of these parameters depend on which model you choose, because that choice defines the optimization problem (which sum of squared errors is being minimized).

In the full model we assume that each observation is a function of its own group's mean, so there are more parameters (one mean per group, hence "full"). In the reduced model we assume that each observation is a function of the grand mean, so there is only one parameter, $\mu$ (hence "reduced").
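To make the two fits concrete, here is a minimal sketch in plain Python. The data values and the names `groups` and `sse` are made up for illustration (they are not from the question); the sketch assumes ordinary least squares, under which the fitted parameters are simply the sample means.

```python
from statistics import mean

# Hypothetical toy data: two groups, five observations each (10 in total).
groups = {
    "A": [4.1, 5.0, 4.6, 5.3, 4.8],
    "B": [6.2, 5.9, 6.8, 6.1, 6.5],
}

def sse(values, center):
    """Sum of squared errors of `values` around a fitted mean `center`."""
    return sum((y - center) ** 2 for y in values)

# Full model: one fitted mean per group (two parameters here).
group_means = {name: mean(ys) for name, ys in groups.items()}
sse_full = sum(sse(ys, group_means[name]) for name, ys in groups.items())

# Reduced model: a single grand mean for all observations (one parameter).
all_obs = [y for ys in groups.values() for y in ys]
grand_mean = mean(all_obs)
sse_reduced = sse(all_obs, grand_mean)

print("group means:", group_means)
print("grand mean: ", grand_mean)
print("SSE (full):   ", sse_full)
print("SSE (reduced):", sse_reduced)
```

The full model's error can never exceed the reduced model's, because the full model could always set both group means equal to the grand mean. The question is only how much smaller it is.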

If the observations are actually drawn from the reduced model (that is, they are all generated with the same mean), then there should be no real difference between the error achieved with the estimates from the full model and the error achieved with the estimates from the reduced model.
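One standard way to quantify "no real difference" is the F statistic from the ANOVA table. As a sketch, write $SSE_F$ and $SSE_R$ for the sums of squared errors under the full and reduced models, $k$ for the number of groups, and $N$ for the total number of observations (these symbols are my own shorthand, not part of the notation above):

$$SSE_F=\sum_{j}\sum_{i}\left(y_{i,j}-\hat{\mu}_j\right)^2,\qquad SSE_R=\sum_{j}\sum_{i}\left(y_{i,j}-\hat{\mu}\right)^2$$

$$F=\frac{\left(SSE_R-SSE_F\right)/(k-1)}{SSE_F/(N-k)}$$

Here $\hat{\mu}_j$ is the sample mean of group $j$ and $\hat{\mu}$ is the sample mean of all observations. With two groups and 10 observations, the degrees of freedom are $k-1=1$ and $N-k=8$. A large $F$ means that giving each group its own mean reduced the error by more than chance alone would explain.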

This concept is generally introduced using linear regression instead of ANOVA (https://onlinecourses.science.psu.edu/stat501/node/295).
