Solved – How to analyse data with multiple dependent and independent variables

multivariate analysis, regression

I have two dependent variables, Abundance and Richness of moths, and 12 independent climate variables. These are Temperature, Rainfall and Sunlight, for each of the 4 seasons.

How do I go about analysing this? From running individual simple linear regressions I have found significance for summer rainfall and winter temperature as factors influencing my dependent variables, but I know that this isn't very statistically viable!
Is principal component analysis a suitable way of analysing this data? Are there any other multivariate techniques I could use?
Thanks.

Best Answer

Since you have multiple dependent and independent variables, a multivariate analysis would be one way to proceed. A multivariate analysis models both responses jointly against your independent variables, and lets you test whether each factor is significant in that joint model. If you are only interested in the significance of the relationship between the covariates and one response at a time, you can run a regression the way you describe.

Suppose, though, that you want to construct a model for both responses simultaneously, and assess the significance of the factors in that model. Then you can use multivariate analysis of covariance (MANCOVA). MANCOVA will provide you with the contribution to the variance in the responses made by each factor, as well as their significance. Note that MANCOVA can be run with type I, II, or III sums of squares (SS). Which one is appropriate depends on the balance of your data. See here for more information on the types of SS: http://mcfromnz.wordpress.com/2011/03/02/anova-type-iiiiii-ss-explained/
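To see why the choice of SS type matters, here is a minimal sketch on simulated data. The data, the effect sizes, and the restriction to two predictors are assumptions for illustration, not your moth data. In R, `summary()` on a `manova` fit tests terms sequentially (type I), so when predictors are correlated the result for a term changes with its position in the formula.

```r
# Simulated stand-in data; variable names follow the question, values are made up.
set.seed(1)
n <- 60
Temp_1 <- rnorm(n)
Rain_1 <- 0.7 * Temp_1 + rnorm(n, sd = 0.7)  # deliberately correlated with Temp_1
Abundance <- 2 + 1.0 * Temp_1 + 0.5 * Rain_1 + rnorm(n)
Richness  <- 1 + 0.3 * Temp_1 + 0.8 * Rain_1 + rnorm(n)
dat <- data.frame(Abundance, Richness, Temp_1, Rain_1)

# Same model, two different term orders.
m_temp_first <- manova(cbind(Abundance, Richness) ~ Temp_1 + Rain_1, data = dat)
m_rain_first <- manova(cbind(Abundance, Richness) ~ Rain_1 + Temp_1, data = dat)

# Type I tests are sequential: Temp_1 is tested unadjusted when it comes first,
# and adjusted for Rain_1 when it comes second.
s_temp_first <- summary(m_temp_first, test = "Pillai")$stats
s_rain_first <- summary(m_rain_first, test = "Pillai")$stats
print(s_temp_first)
print(s_rain_first)
```

If the two tables disagree for the same term, as they should here, the order of entry is influencing the test, which is the situation where type II SS are often preferred.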

If you are using R, you can determine the statistical significance of your factors by performing multivariate regression and using this as input in the manova function. The code would go something like:

```r
# Fit a multivariate regression model, then test the type I SS using MANCOVA.
fit <- lm(cbind(Abundance, Richness) ~ Temp_1 + Rain_1 + Sunlight_1 +
            Temp_2 + Rain_2 + Sunlight_2 +
            Temp_3 + Rain_3 + Sunlight_3 +
            Temp_4 + Rain_4 + Sunlight_4,
          data = yourData)
summary(manova(fit), test = "Hotelling-Lawley")
```

A more thorough overview of how to perform such an analysis is provided here: http://www.uni-kiel.de/psychologie/rexrepos/posts/multRegression.html

Note that separate regressions return the same slopes as multivariate regression. Also note that tests other than "Hotelling-Lawley" are possible for the MANCOVA test of type I SS ("Pillai", "Wilks" and "Roy" are the other options in R), and that you can also test type II SS.
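The remark that separate regressions return the same slopes as the multivariate regression can be checked directly. The following is a small sketch on simulated data. Here `yourData` is filled with made-up values and only two predictors are used, both assumptions made to keep the example short.

```r
# Simulated stand-in for yourData; the coefficients used to generate it are arbitrary.
set.seed(42)
n <- 50
yourData <- data.frame(Temp_1 = rnorm(n), Rain_1 = rnorm(n))
yourData$Abundance <- 3 + 1.5 * yourData$Temp_1 - 0.5 * yourData$Rain_1 + rnorm(n)
yourData$Richness  <- 1 + 0.4 * yourData$Temp_1 + 0.9 * yourData$Rain_1 + rnorm(n)

# One multivariate fit versus one univariate fit per response.
fit_multi <- lm(cbind(Abundance, Richness) ~ Temp_1 + Rain_1, data = yourData)
fit_abund <- lm(Abundance ~ Temp_1 + Rain_1, data = yourData)
fit_rich  <- lm(Richness ~ Temp_1 + Rain_1, data = yourData)

# Each column of the multivariate coefficient matrix should match
# the coefficients of the corresponding univariate model.
same_abund <- isTRUE(all.equal(coef(fit_multi)[, "Abundance"], coef(fit_abund)))
same_rich  <- isTRUE(all.equal(coef(fit_multi)[, "Richness"], coef(fit_rich)))
print(coef(fit_multi))
print(c(Abundance = same_abund, Richness = same_rich))
```

So the multivariate fit does not change the estimates. What it adds is the joint test across both responses.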

As you pointed out, PCA is another multivariate data analysis method. It may be interesting for performing exploratory data analysis, though you will not obtain the same sort of significance testing discussed above. If you perform PCA on your data, a biplot may be a good way to investigate interesting relationships. SAS provides a rather clear discussion of how to interpret the biplot: http://support.sas.com/documentation/cdl/en/imlsug/62558/HTML/default/viewer.htm#ugmultpca_sect2.htm
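For the exploratory route, a PCA of the climate variables might look roughly like the sketch below. The `climate` data frame is simulated and only covers two seasons, so its values and its size are assumptions. With real data you would pass your 12 climate columns instead.

```r
# Simulated climate predictors; means and spreads are arbitrary placeholders.
set.seed(7)
n <- 40
climate <- data.frame(
  Temp_1 = rnorm(n, 5, 2),   Rain_1 = rnorm(n, 80, 20), Sunlight_1 = rnorm(n, 60, 15),
  Temp_2 = rnorm(n, 12, 3),  Rain_2 = rnorm(n, 60, 15), Sunlight_2 = rnorm(n, 150, 30)
)

# Centre and scale, because the variables are measured in different units.
pca <- prcomp(climate, center = TRUE, scale. = TRUE)

# Proportion of total variance captured by each principal component.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained, 3))

# Loadings of each climate variable on the first two components.
print(round(pca$rotation[, 1:2], 2))
```

In an interactive session, `biplot(pca)` then draws the observations and the variable loadings on the first two components in one plot.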