The short answer is "contrasts". I had heard of them before but had never learned what they are; in fact, they serve exactly the purpose I was looking for. They can be used to alter the coefficients: for a given model you always get the same number of coefficients, but with contrasts they become different linear combinations of the default ones. That is, the model is reparameterized - same model, different coefficient meanings and values.
Also, additional comparisons of other "virtual parameters" (linear combinations of the actual parameters) that are not reported in the model summary can be made after the model is fit, using the multcomp package, although this is not entirely straightforward. (The "contrasts" package may also do this, though I couldn't figure out how to use it.)
The default contrast in R is "contr.treatment" and it leads to the coefficients in the problem statement, with levels being compared to the base level within that factor. Another contrast, "contr.sum", generates coefficients that compare each level to the mean of all the levels of that factor.
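As a quick illustration (this snippet is not from the original session), the two coding matrices for a three-level factor can be printed directly:
> contr.treatment(3)
  2 3
1 0 0
2 1 0
3 0 1
> contr.sum(3)
  [,1] [,2]
1    1    0
2    0    1
3   -1   -1
With contr.treatment each column is an indicator for one non-base level; with contr.sum the last level is coded -1 in every column, so its implied coefficient is the negative of the sum of the others.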
In my case, I want the outer factor to keep the default "contr.treatment" contrast, because I do want the outer factor compared to the control level, j="A". For the inner factor, though, I want "contr.sum", so that the k levels are compared to the mean of all k levels within that j level. This is done after creating the k factor, with an assignment statement such as contrasts(k) = "contr.sum" (in the code below the optional how.many argument is also passed).
The full code now looks like this:
> j = gl(3, 30, labels=LETTERS[1:3])
> k = gl(3, 10, length=90, labels=LETTERS[13:15])
> k[1:30] = "M" # Reference j="A" has k="M" only
> contrasts(k, 3) = "contr.sum"
> x = 100*(2^as.numeric(k)) # 200,400,800 for k=M,N,O
> x[1:30] = mean(c(200,400,800)) # Reference j="A" has x=same mean as "B" and "C"
> x = x + rnorm(90, 0, 4) # A little randomness
> x = x + rep(c(0, 1, 1.2), each=30) # A slight non-significant deviation by j.
> df = data.frame(x=x, j=j, k=k)
> f = lm(x ~ j/k, data=df, drop.unused.levels=TRUE)
> summary(f)
Call:
lm(formula = x ~ j/k, data = df, drop.unused.levels = TRUE)
Residuals:
Min 1Q Median 3Q Max
-10.6628 -2.9090 -0.4446 2.9412 10.0984
Coefficients: (2 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 467.3444 0.7878 593.259 <2e-16 ***
jB 0.6631 1.1141 0.595 0.553
jC 0.4530 1.1141 0.407 0.685
jA:k1 NA NA NA NA
jB:k1 -266.9330 1.1141 -239.604 <2e-16 ***
jC:k1 -267.5542 1.1141 -240.162 <2e-16 ***
jA:k2 NA NA NA NA
jB:k2 -66.0202 1.1141 -59.261 <2e-16 ***
jC:k2 -65.7784 1.1141 -59.044 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 4.315 on 83 degrees of freedom
Multiple R-squared: 0.9996, Adjusted R-squared: 0.9996
F-statistic: 3.344e+04 on 6 and 83 DF, p-value: < 2.2e-16
The coefficients are labeled "k1" and "k2" instead of "kM" and "kN" because the contr.sum contrast matrix has unnamed, numbered columns, but the results are now what is expected, as verified here:
> mean(df$x[df$j == "A"])
[1] 467.3444
> mean(df$x[df$j == "B"])-mean(df$x[df$j == "A"])
[1] 0.6631321
> mean(df$x[df$j == "C"])-mean(df$x[df$j == "A"])
[1] 0.4530243
> mean(df$x[df$j == "B" & df$k == "M"])-mean(df$x[df$j == "B"])
[1] -266.933
> mean(df$x[df$j == "C" & df$k == "M"])-mean(df$x[df$j == "C"])
[1] -267.5542
> mean(df$x[df$j == "B" & df$k == "N"])-mean(df$x[df$j == "B"])
[1] -66.02023
> mean(df$x[df$j == "C" & df$k == "N"])-mean(df$x[df$j == "C"])
[1] -65.77839
The comparisons for jB:kO and jC:kO (i.e., jB:k3 and jC:k3) are still missing. They do not appear as coefficients because they are simple linear combinations of the existing coefficients. In fact, with the "contr.sum" contrast the k deviations within each j level sum to zero, so the values these coefficients would have are jB:k3 = -(jB:k1+jB:k2) and jC:k3 = -(jC:k1+jC:k2). This can be verified with a little algebra, remembering that the coefficient jB is the mean of level j="B" minus the intercept, and jB:k1 is the mean of the j="B", k="M" cell minus the mean of level j="B".
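As a quick data-level check (this snippet is illustrative and was not part of the original session), the omitted k="O" deviation within j="B" can be computed the same way as the verifications above, and it should equal the negative sum of the two reported coefficients:
> # Deviation of the j="B", k="O" cell mean from the j="B" mean (the missing jB:k3)
> mean(df$x[df$j == "B" & df$k == "O"]) - mean(df$x[df$j == "B"])
> # Should match the negative sum of the reported coefficients
> -(coef(f)["jB:k1"] + coef(f)["jB:k2"])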
To use that knowledge to run the other two tests I want, the glht() function in the multcomp package can be used. However, it chokes on the NA coefficient values that resulted from level j="A" containing only k="M", so we have to modify the model matrix to remove the columns corresponding to the NA coefficients, as well as the intercept column:
> # With contr.sum contrasts, the jB:k3 term is the negative of jB:k1+jB:k2
> summary(glht(f, linfct="0 - jB:k1 - jB:k2 = 0"))
Error in summary(glht(f, linfct = "0 - jB:k1 - jB:k2 = 0")) :
error in evaluating the argument 'object' in selecting a method for function 'summary': Error in modelparm.default(model, ...) :
dimensions of coefficients and covariance matrix don't match
>
> # It fails because of NA coefficients. They can be removed by removing their
> # columns in the model matrix and then updating the model.
> M = model.matrix(f)
> # Save the "assign" and "contrasts" attributes of the model matrix.
> attr.assign = attr(M, "assign")
> attr.contrasts = attr(M, "contrasts")
> rmvCols = c(1, which(summary(f)$aliased))
> M = M[,-rmvCols] # Remove intercept column (required) and NA columns
> # Reassign the "assign" attribute, removing the corresponding elements from it.
> attr(M, "assign") = attr.assign[-rmvCols]
> # Reassign the "contrasts" attribute to its original value
> attr(M, "contrasts") = attr.contrasts
Now we update the model with the new model matrix and try glht() again.
> f1 = update(f, ~M)
> f1
Call:
lm(formula = x ~ M, data = df, drop.unused.levels = TRUE)
Coefficients:
(Intercept) MjB MjC MjB:k1 MjC:k1 MjB:k2 MjC:k2
466.9725 0.2974 -0.4174 -266.9183 -267.7272 -65.2277 -66.5216
> # Now try glht() using the updated model fit. The coefficient names must be prefixed
> # with "M", which R added because the model matrix variable was named M.
> summary(glht(f1, linfct="0 - MjB:k1 - MjB:k2 = 0"))
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = x ~ M, data = df, drop.unused.levels = TRUE)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
0 - MjB:k1 - MjB:k2 == 0 332.953 1.114 298.9 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Adjusted p values reported -- single-step method)
Excellent! This is one of the two missing comparisons I wanted.
To confirm that the estimate is correct:
> coef(f)
(Intercept) jB jC jA:k1 jB:k1 jC:k1 jA:k2 jB:k2 jC:k2
467.3444162 0.6631321 0.4530243 NA -266.9329982 -267.5542422 NA -66.0202315 -65.7783877
> -(coef(f)[5] + coef(f)[8])
jB:k1
332.9532
Yes, it is correct!
Now do both missing comparisons, jB:k3 and jC:k3, at the same time:
> summary(glht(f1, linfct=c("0 - MjB:k1 - MjB:k2 = 0", "0 - MjC:k1 - MjC:k2 = 0")))
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = x ~ M, data = df, drop.unused.levels = TRUE)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
0 - MjB:k1 - MjB:k2 == 0 332.953 1.114 298.9 <1e-10 ***
0 - MjC:k1 - MjC:k2 == 0 333.333 1.114 299.2 <1e-10 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Adjusted p values reported -- single-step method)
Sweet!
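For reference, the same two hypotheses can also be given to glht() as a numeric contrast matrix rather than as strings. This is a sketch (not from the original session) that assumes the coefficient order shown by coef(f1) above: (Intercept), MjB, MjC, MjB:k1, MjC:k1, MjB:k2, MjC:k2.
> # One row per hypothesis, one column per coefficient of f1; each row estimates
> # -(k1 + k2) within that j level, i.e. the missing k3 term, and tests it against zero
> K = rbind("jB:k3" = c(0, 0, 0, -1, 0, -1, 0),
+           "jC:k3" = c(0, 0, 0, 0, -1, 0, -1))
> summary(glht(f1, linfct=K))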
Best Answer
You have to be careful about the interpretation of the coefficients, which changes with the parameterization.
We first specify the One-Way ANOVA model:
$$y_{ij} = \mu + \alpha_{j} + \epsilon_{ij}$$
where $y_{ij}$ is observation $i$ in group $j$, $\mu$ is the overall mean, $\alpha_{j}$ is the effect of group $j$, and $\epsilon_{ij}$ is the error term.
Since this model is not of full rank (the columns of the design matrix are linearly dependent), we have to restrict it somehow to get unique solutions for the coefficients. The parameterizations you noted are two ways of doing this.
Using the restriction $\alpha_{1}=0$
This is the default restriction for one-way ANOVA in R, and it is equivalent to the CornerPoint parameterization. In this case $\alpha_{2}$ is the effect of the treatment ($j=2$) relative to the mean of group 1 (the intercept).
Then the t-test is $$H_0: \alpha_{2}=0 \quad \text{versus} \quad H_1: \alpha_{2} \ne 0.$$
It tests whether there is any effect of the treatment at all; in other words, whether the difference in means between the two groups is significant.
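To make this concrete (a short derivation, consistent with the model above), with $\alpha_{1}=0$ the expected responses are
$$E[y_{i1}] = \mu, \qquad E[y_{i2}] = \mu + \alpha_{2},$$
so $\alpha_{2} = E[y_{i2}] - E[y_{i1}]$ is exactly the difference between the two group means.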
Using the restriction $\mu=0$
This is what you call the GroupPoint parameterization. In this case $\alpha_{1}$ and $\alpha_{2}$ are the means of the respective groups.
Then the t-test is $$H_0: \alpha_{j}=0 \quad \text{versus} \quad H_1: \alpha_{j} \ne 0.$$
This tests whether the mean of group $j$ is zero.
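Concretely (again a short derivation from the model above), with $\mu=0$ the expected response in group $j$ is
$$E[y_{ij}] = \alpha_{j},$$
so each $\alpha_{j}$ is simply the mean of group $j$, and $H_0: \alpha_{j}=0$ asks whether that group mean equals zero.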
This is testing something very different from the first t-test.