Solved – How to plot the contribution of each regression coefficient in a model, with R

data visualization, r, regression

I've fit a model with lm(), and I now want to analyze it to see where it's overfit, etc. I'm imagining a plot that has the index of each observation on the x-axis and the corresponding responses plotted as points on the y-axis. In (vertical) line with each point is a set of colored line segments stacked on top of each other. Each vertical, colored segment corresponds to a predictor in the model, and its height is the coefficient on that predictor applied to the data point it's in line with.

Is there already a function that does this? I don't want to reinvent the wheel (particularly because parsing the output of lm looks unpleasant).

Best Answer

This sounds like just a stacked bar chart. I don't see how you would handle the case where the "contribution" of a predictor (its coefficient times its value) is negative. Still, you might get something like the not very elegant but workable code below. It issues warnings when it tries to plot something negative; perhaps that doesn't happen in your case.
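To make "contribution" concrete, here is one way to write it down, assuming an ordinary linear model with two predictors as fitted by lm(). The symbols (c for a contribution, i for the observation index, j for the predictor index) are my own notation, not something lm() reports:

```latex
\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{i1} + \hat{\beta}_2 x_{i2}
          = c_{i0} + c_{i1} + c_{i2},
\qquad c_{i0} = \hat{\beta}_0, \quad c_{ij} = \hat{\beta}_j \, x_{ij} \quad (j = 1, 2)
```

Stacking the c_{ij} end to end reaches the fitted value only when they all have the same sign, which is why negative contributions are awkward in a plain stacked bar chart.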

The code uses ggplot2 0.8.9. I think melt() changes in the latest implementation, but I don't have it installed at work:

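library(ggplot2)  # 0.8.9 attaches reshape, which provides melt(); newer versions may need library(reshape2)
# Simulate two predictors and a response
X1 <- rnorm(100)
X2 <- rnorm(100, 5, 3)
Y <- 4 + 5*X1 + 3*X2 + rnorm(100)
mod <- lm(Y ~ X1 + X2)
# Multiply each column of the design matrix by its coefficient: one row per observation
tmp <- data.frame(t(coef(mod) * t(cbind(1, X1, X2))))
names(tmp) <- c("Intercept", "X1", "X2")
# Stacked bars of the contributions, with the fitted values overlaid as points
qplot(x=as.factor(1:100), fill=variable, weight=value, geom="bar", data=melt(tmp)) +
    geom_point(aes(x=1:100, y=predict(mod)))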
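
For anyone on a newer ggplot2, here is a minimal sketch of the same idea that avoids building the design matrix by hand and does not rely on melt(). It assumes ggplot2 2.2.0 or later (for geom_col()); the fixed seed and the names contrib, long and pts are illustrative choices of mine. The contributions come from model.matrix(), so nothing has to be parsed out of the lm output:

```r
library(ggplot2)  # assumption: ggplot2 >= 2.2.0, which provides geom_col()

set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible
n <- 100
d <- data.frame(X1 = rnorm(n), X2 = rnorm(n, 5, 3))
d$Y <- 4 + 5 * d$X1 + 3 * d$X2 + rnorm(n)
mod <- lm(Y ~ X1 + X2, data = d)

# Per-observation contribution of each term: every column of the design
# matrix (including the intercept column) times its coefficient
contrib <- sweep(model.matrix(mod), 2, coef(mod), `*`)

# Sanity check: the contributions in each row add up to the fitted value
stopifnot(isTRUE(all.equal(unname(rowSums(contrib)), unname(fitted(mod)))))

# Long format built with base R (as.vector() unrolls the matrix column by column)
long <- data.frame(
  obs   = rep(seq_len(n), times = ncol(contrib)),
  term  = rep(colnames(contrib), each = n),
  value = as.vector(contrib)
)
pts <- data.frame(obs = seq_len(n), y = d$Y)

# Stacked bars of the contributions, observed responses overlaid as points
p <- ggplot(long, aes(x = obs, y = value, fill = term)) +
  geom_col() +
  geom_point(data = pts, aes(x = obs, y = y), inherit.aes = FALSE)
print(p)
```

As far as I can tell, recent ggplot2 versions stack negative values below zero rather than warning, so the top of a bar is then the sum of the positive contributions only, and the fitted value is that sum plus the (negative) part below the axis. If you want something closer to a built-in, predict(mod, type = "terms") returns a similar per-term matrix, but each term is centered on its mean and the offset is stored in the "constant" attribute.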
