Solved – Ecological modelling: multivariate abundance time-series data

correspondence-analysis, count-data, ecology, multivariate analysis, time series

I am working with a dataset that consists of abundance counts of 6 microbial taxa in a lake, measured weekly for 20 weeks. I also have environmental data (temperature, nutrient concentrations, conductivity, etc.) associated with each bacterial census. In total, I have about 25 environmental variables (IVs), though I expect only 4 or 5 of them to be strongly correlated with my DVs (species abundance).

I would like to construct a model that relates the abundance of my six species (DVs) to environmental conditions in the lake (IVs). I expect my response variables to respond unimodally to my independent variables, as is most commonly the case in ecology. A negative-binomial distribution is typically recommended for this type of species count data. I also expect interactions between my IVs because the species are competing for resources. There is a great R package called mvabund that fits a GLM to each species in the dataset to model its abundance. However, a major assumption is that the observations of the response variable are independent. In my case, the abundance of a species should be temporally autocorrelated.
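As a rough illustration of the autocorrelation concern, the lag-1 autocorrelation of a single taxon's weekly counts can be computed directly. This is only a minimal Python/numpy sketch: the function name `lag1_autocorrelation` and the counts are made up for illustration and are not from the dataset described here.

```python
import numpy as np

def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation: covariance of the series with itself
    shifted by one time step, divided by its total sum of squares."""
    x = np.asarray(series, dtype=float)
    d = x - x.mean()
    return float((d[:-1] * d[1:]).sum() / (d ** 2).sum())

# Hypothetical weekly counts for one taxon over 20 weeks (made-up numbers).
counts = [12, 15, 21, 30, 41, 55, 60, 58, 49, 40,
          33, 25, 19, 14, 11, 9, 8, 10, 13, 17]

r1 = lag1_autocorrelation(counts)
print(round(r1, 2))
```

A value near 1 means consecutive weeks are far from independent, which is the situation that conflicts with the independence assumption; a value near 0 would suggest the assumption is less of a problem for that taxon.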

What is an appropriate model for relating species abundance to environmental variables in time-series data? I am not interested in predictive power, but rather in understanding the relative importance of different environmental variables to each species' abundance.

Best Answer

I would suggest using Canonical Correspondence Analysis (sometimes called Constrained Correspondence Analysis). In your case, the "sites" are temporal, rather than spatial, but it should work just fine. You'll need a sites by species abundance matrix (which you seem to have) and a sites by environmental data matrix (which I presume you have or can construct).
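To make the two-matrix setup concrete, here is a bare-bones Python/numpy sketch of the core CCA computation: chi-square-standardised abundances, a weighted regression on the environmental matrix, then an SVD of the fitted values. It is an illustration under assumptions, not a replacement for an established implementation such as `cca` in the R package vegan. The function name `cca_sketch` and the simulated data are hypothetical, and the sketch assumes every week and every species has a non-zero total count.

```python
import numpy as np

def cca_sketch(Y, X):
    """Y: sites x species counts; X: sites x environmental variables.
    Returns the canonical eigenvalues and the proportion of total
    inertia explained by the environmental variables."""
    Y = np.asarray(Y, dtype=float)
    X = np.asarray(X, dtype=float)
    P = Y / Y.sum()                      # relative frequencies
    r = P.sum(axis=1)                    # row (site) weights
    c = P.sum(axis=0)                    # column (species) weights
    expected = np.outer(r, c)
    Q = (P - expected) / np.sqrt(expected)   # chi-square contributions
    Xc = X - r @ X                       # centre X on its weighted means
    Xw = Xc * np.sqrt(r)[:, None]        # apply the row weights
    coef, *_ = np.linalg.lstsq(Xw, Q, rcond=None)
    fitted = Xw @ coef                   # part of Q explained by X
    s = np.linalg.svd(fitted, compute_uv=False)
    eigenvalues = s ** 2
    total_inertia = float((Q ** 2).sum())
    return eigenvalues, float(eigenvalues.sum()) / total_inertia

# Simulated example (made-up data): 20 weekly "sites", 6 taxa, 2 variables.
rng = np.random.default_rng(0)
temperature = np.linspace(5.0, 25.0, 20)
nutrient = rng.uniform(0.1, 2.0, size=20)
X = np.column_stack([temperature, nutrient])
optima = np.linspace(5.0, 25.0, 6)       # each taxon peaks at its own temperature
lam = 50.0 * np.exp(-(temperature[:, None] - optima) ** 2 / (2 * 4.0 ** 2))
Y = rng.poisson(lam + 1.0)

eig, prop = cca_sketch(Y, X)
print("canonical eigenvalues:", np.round(eig[:2], 3))
print("proportion of inertia explained:", round(prop, 3))
```

With two environmental variables there are at most two non-zero canonical eigenvalues; the proportion of inertia explained is a rough summary of how much of the variation in community composition the constraints account for.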

There is a great discussion of CCA (and associated methods) in Numerical Ecology with R. It's geared towards using the R programming language, but the underlying theory is well described and should be extendable to whatever programming language/software you use.

If you don't have free access to the book through your university, there are a few websites that describe how to use CCA. Just google it, but be careful not to confuse it with Canonical Correlation Analysis (which is different).

If that still doesn't work for you, try these tutorials on basic ordination analyses like PCA, DCA, and NMDS (the precursors to CCA), and work your way up.
