I want to assess how several explanatory variables relate to a response variable:
> dput(data)
structure(list(Y = c(28.2, 28.1, 27.3, 25.9, 27.2, 30.6, 27.6,
28.4, 26.6, 28.1, 30.1, 26.3), X1 = c(27, 27.8, 27.7, 26.6, 26.8,
30.7, 27.6, 25.4, 26.7, 26.7, 29.4, 25.1), X2 = c(26.6, 27.5,
27.1, 26.2, 24.8, 27.2, 26.3, 23.9, 24.3, 24.1, 25.1, 24), X3 = c(26.8,
27.4, 27.4, 26.3, 25.8, 29.2, 27.1, 25, 24.8, 25.3, 27.7, 24.9
)), class = "data.frame", row.names = c(NA, -12L))
My first idea was to fit a GLM, but after finding high multicollinearity among my predictors, I chose ridge regression instead:
> library(ridge)
> model <- linearRidge(Y ~ ., data)
> summary(model)
So, what would be the best way to graphically represent in R the "impact" of each explanatory variable on the response, from the Ridge model?
Best Answer
To visualize it, you could simply plot Y against each predictor separately and draw the fitted line for that variable, i.e. the usual basic regression plots, just using your regularized coefficients instead of the OLS ones.

However, if you want traditional inference on top of that, with p-values and so on, ridge is generally considered a poor route, because the regularization biases the estimates and the usual standard errors no longer apply. An alternative is to go fully Bayesian: a normal prior on the coefficients yields a ridge-type estimate, and you can then inspect the posterior directly, since Bayesian inference remains coherent in that setup.
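A minimal base-R sketch of the first suggestion, using the data from the question. To keep it self-contained it computes the ridge solution in closed form on standardized predictors rather than calling linearRidge, and the penalty lambda = 0.1 is an arbitrary illustrative value, not the one linearRidge would choose. Each panel shows Y against one predictor with the line implied by that predictor's regularized coefficient (other predictors held at their means):

```r
# Data from the question
data <- data.frame(
  Y  = c(28.2, 28.1, 27.3, 25.9, 27.2, 30.6, 27.6, 28.4, 26.6, 28.1, 30.1, 26.3),
  X1 = c(27, 27.8, 27.7, 26.6, 26.8, 30.7, 27.6, 25.4, 26.7, 26.7, 29.4, 25.1),
  X2 = c(26.6, 27.5, 27.1, 26.2, 24.8, 27.2, 26.3, 23.9, 24.3, 24.1, 25.1, 24),
  X3 = c(26.8, 27.4, 27.4, 26.3, 25.8, 29.2, 27.1, 25, 24.8, 25.3, 27.7, 24.9)
)

# Closed-form ridge on standardized predictors: b = (Z'Z + lambda*I)^-1 Z'y
X <- scale(as.matrix(data[, c("X1", "X2", "X3")]))
y <- data$Y - mean(data$Y)
lambda <- 0.1  # illustrative value, not tuned
b <- solve(t(X) %*% X + lambda * diag(3), t(X) %*% y)

# One panel per predictor: points plus the line implied by its coefficient
op <- par(mfrow = c(1, 3))
for (j in 1:3) {
  slope <- b[j] / attr(X, "scaled:scale")[j]                 # back to original units
  intercept <- mean(data$Y) - slope * attr(X, "scaled:center")[j]
  plot(data[[j + 1]], data$Y, xlab = names(data)[j + 1], ylab = "Y")
  abline(a = intercept, b = slope, col = "blue")
}
par(op)
```

If you prefer to stay with the fitted linearRidge object, coef(model) should give you the same kind of slopes to feed into abline.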
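The Bayesian route has a closed form if you treat the error variance as known: a N(0, tau^2) prior on the coefficients gives a Gaussian posterior whose mean is exactly the ridge estimate with lambda = sigma^2 / tau^2. A self-contained sketch of that conjugate case (sigma^2 and tau^2 are illustrative values, not estimated from the data):

```r
# Data from the question
data <- data.frame(
  Y  = c(28.2, 28.1, 27.3, 25.9, 27.2, 30.6, 27.6, 28.4, 26.6, 28.1, 30.1, 26.3),
  X1 = c(27, 27.8, 27.7, 26.6, 26.8, 30.7, 27.6, 25.4, 26.7, 26.7, 29.4, 25.1),
  X2 = c(26.6, 27.5, 27.1, 26.2, 24.8, 27.2, 26.3, 23.9, 24.3, 24.1, 25.1, 24),
  X3 = c(26.8, 27.4, 27.4, 26.3, 25.8, 29.2, 27.1, 25, 24.8, 25.3, 27.7, 24.9)
)

# Prior b ~ N(0, tau^2 I), known sigma^2; posterior is N(m, V) with
#   m = (X'X + lambda*I)^-1 X'y,  V = sigma^2 (X'X + lambda*I)^-1,
#   lambda = sigma^2 / tau^2  (so m is the ridge estimate)
X <- scale(as.matrix(data[, c("X1", "X2", "X3")]))
y <- data$Y - mean(data$Y)
sigma2 <- 1; tau2 <- 10        # illustrative, not estimated
lambda <- sigma2 / tau2
A <- solve(t(X) %*% X + lambda * diag(3))
m <- A %*% t(X) %*% y          # posterior mean = ridge estimate
V <- sigma2 * A                # posterior covariance

# Posterior summary per (standardized) coefficient
print(data.frame(mean = as.numeric(m), sd = sqrt(diag(V)),
                 row.names = c("X1", "X2", "X3")))
```

From here you can plot the posterior densities (they are normals with these means and sds) to show each variable's "impact" with uncertainty. For unknown sigma^2 or more elaborate priors, an MCMC tool such as brms or Stan would do the same job without the conjugacy assumption.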