Estimating Lambda for the Box-Cox Transformation in ANOVA

Assumptions:

In an ANOVA where the normality assumption is violated, the Box-Cox transformation can be applied to the response variable. Lambda can be estimated by maximum likelihood, which optimizes the normality of the model residuals.
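For reference, the transformation and the quantity being maximized can be written out as follows. This is the standard Box-Cox formulation rather than anything specific to my data; the symbols $n$ (number of observations) and $\mathrm{RSS}(\lambda)$ (residual sum of squares after regressing the transformed response on the model's predictors) are notation introduced here for illustration, and the response is assumed to be strictly positive.

```latex
% Box-Cox transform of a strictly positive response y
\[
y^{(\lambda)} =
\begin{cases}
  \dfrac{y^{\lambda} - 1}{\lambda}, & \lambda \neq 0, \\[1ex]
  \log y, & \lambda = 0.
\end{cases}
\]

% Profile log-likelihood for lambda, up to an additive constant
\[
\ell(\lambda) = -\frac{n}{2} \log\!\left( \frac{\mathrm{RSS}(\lambda)}{n} \right)
              + (\lambda - 1) \sum_{i=1}^{n} \log y_i
\]
```

Because $\mathrm{RSS}(\lambda)$ is computed from the residuals of a particular model, the value of lambda that maximizes $\ell(\lambda)$ depends on which predictors the model includes. That is why the null and full models can give different estimates.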

Question:

When the estimates for lambda in the null model and the full model differ, how should lambda be estimated?

My Data:

In my data, the lambda estimate for the null model is -2.3 and the lambda estimate for the full model is -2.8. Transforming the response with each of these parameters and performing the ANOVA leads to different F-statistics.

I have produced below a simplified version of the analysis using beta distributions with different parameters to simulate non-normality. Unfortunately, in this example the results of the ANOVA are insensitive to the different estimates of lambda. So, it doesn't fully illustrate the problem.

library(ggplot2)
library(MASS)
library(car)


# Generate random beta-distributed data for two groups
set.seed(1)  # arbitrary seed, added only for reproducibility
n <- 200
df <- rbind(
  data.frame(x=factor(rep("a1",n)), y=rbeta(n,2,5)), # right-skewed
  data.frame(x=factor(rep("a2",n)), y=rbeta(n,2,2))) # symmetric

# Density of y by group (qplot() is deprecated in recent ggplot2 releases)
print(ggplot(df, aes(x = y, color = x)) + geom_density())

print("Untransformed Analysis of Variance:")
m.null <- lm(y ~ 1, df)
m.full <- lm(y ~ x, df)
print(anova(m.null, m.full))

# Estimate maximum-likelihood Box-Cox transform parameters for both models.
# boxcox() plots the profile log-likelihood and returns the lambda grid (x) and its values (y).
bc.null <- boxcox(m.null)
bc.null.opt <- bc.null$x[which.max(bc.null$y)]
bc.full <- boxcox(m.full)
bc.full.opt <- bc.full$x[which.max(bc.full$y)]

print(paste("ML Box-Cox estimate for null model:",bc.null.opt))
print(paste("ML Box-Cox estimate for full model:",bc.full.opt))

df$y.bc.null <- bcPower(df$y, bc.null.opt)
df$y.bc.full <- bcPower(df$y, bc.full.opt)

# Boxplots of the transformed response by group, one per lambda estimate
print(ggplot(df, aes(x = x, y = y.bc.null)) + geom_boxplot())
print(ggplot(df, aes(x = x, y = y.bc.full)) + geom_boxplot())


print("Analysis of Variance with optimal Box-Cox transform for null model")
m.bc_null.null <- lm(y.bc.null ~ 1, data=df)
m.bc_null.full <- lm(y.bc.null ~ x, data=df)
print(anova(m.bc_null.null, m.bc_null.full))

print("Analysis of Variance with optimal Box-Cox transform for full model")
m.bc_full.null <- lm(y.bc.full ~ 1, data=df)
m.bc_full.full <- lm(y.bc.full ~ x, data=df)
print(anova(m.bc_full.null, m.bc_full.full))
