Solved – interpreting NMDS ordinations that show both samples and species

Tags: correspondence-analysis, descriptive statistics, interpretation, multidimensional scaling, r

I am using the vegan package in R to plot non-metric multidimensional scaling (NMDS) ordinations. I am using this package because of its compatibility with common ecological distance measures. When you plot the metaMDS() ordination, it plots both the samples (as black dots) and the species (as red dots). My question is: How do you interpret this simultaneous view of species and sample points?

My understanding of NMDS:

  • You start with a matrix of distances between all your points in multidimensional space.
  • The algorithm places your points in a lower-dimensional (say 2D) space.
  • The algorithm moves your points around in 2D space so that the distances between points in 2D space have the same rank order as the distances between points in the original multidimensional space.
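
The three steps above can be sketched in R roughly as follows. This is only an illustration: it assumes vegan's bundled `dune` data set and Bray-Curtis dissimilarities (neither is specified in the question), and it uses a Spearman correlation as an informal check of how well the rank order of distances is preserved.

```r
library(vegan)

data(dune)  # example community matrix shipped with vegan: rows = sites, cols = species

# Step 1: dissimilarities between samples in the full species space
d_orig <- vegdist(dune, method = "bray")

# Steps 2-3: NMDS in 2 dimensions. autotransform = FALSE so the ordination
# is based on the same untransformed Bray-Curtis dissimilarities as d_orig.
set.seed(1)  # NMDS uses random starts; fix the seed for repeatability
ord <- metaMDS(dune, distance = "bray", k = 2,
               autotransform = FALSE, trace = FALSE)

# Distances between samples in the 2D ordination
d_ord <- dist(scores(ord, display = "sites"))

# Informal check: do the two sets of distances rank the sample pairs similarly?
rank_agreement <- cor(as.vector(d_orig), as.vector(d_ord), method = "spearman")
rank_agreement
ord$stress  # the stress value NMDS tries to minimise
```

A rank correlation close to 1 (and a low stress) suggests that the 2D picture preserves the ordering of the original dissimilarities reasonably well.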

BUT there are 2 possible distance matrices you can make with your rows=samples cols=species data:

  1. distances between samples based on species composition (i.e. distances in species space)
  2. distances between species based on co-occurrence in samples (i.e. distances in sample space)

Is metaMDS() calculating BOTH possible distance matrices automatically?

Is the ordination plot an overlay of two sets of arbitrary axes from separate ordinations?

How do you interpret co-localization of species and samples in the ordination plot?

Note: I did not include example data because you can see the plots I'm talking about in the package documentation example.

Best Answer

The NMDS that vegan performs is the common-or-garden form of NMDS: it ordinates the samples only, using a single dissimilarity matrix between samples. If metaMDS() is passed the original community data, it can then position the species points (shown in the plot) at the weighted average of the site scores (the sample points in the plot) on each of the NMDS dimensions retained/drawn. The weights are given by the abundances of the species.
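
To make the weighted-average idea concrete, here is a minimal base-R sketch. The abundances and site scores are made-up toy values, and the names `comm`, `site_scores` and `species_scores` are hypothetical; this is not vegan's code.

```r
# Toy data (hypothetical): abundances of 3 species at 4 sites
comm <- matrix(c(10, 0,  0,
                  5, 5,  0,
                  0, 5,  5,
                  0, 0, 10),
               nrow = 4, byrow = TRUE,
               dimnames = list(paste0("site", 1:4), c("spA", "spB", "spC")))

# Hypothetical site scores on two NMDS axes (as if taken from an ordination)
site_scores <- matrix(c(-1.0,  0.5,
                        -0.5, -0.5,
                         0.5, -0.5,
                         1.0,  0.5),
                      nrow = 4, byrow = TRUE,
                      dimnames = list(rownames(comm), c("NMDS1", "NMDS2")))

# Species score on each axis = abundance-weighted mean of the site scores.
# t(comm) %*% site_scores sums abundance * score over sites for each species;
# dividing by colSums(comm) turns each sum into a weighted average.
species_scores <- t(comm) %*% site_scores / colSums(comm)
species_scores
```

Each species ends up nearest the sites where it is most abundant. vegan provides `wascores()` for this kind of calculation; as far as I know, metaMDS() by default also rescales ("expands") the averages, so this sketch should not be expected to reproduce its numbers exactly.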

This is one way to think of how species points are positioned in a correspondence analysis (CA) biplot: the species scores sit at the weighted average of the site scores, and the site scores sit at the weighted average of the species scores. Indeed, one way to solve CA was discovered simply by iterating those two steps from some initial starting conditions until the scores stopped changing.
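
The back-and-forth averaging described here (often called reciprocal averaging) can be sketched in base R as below. The matrix `Y` is made-up toy data and all variable names are hypothetical. The centring and rescaling step is an addition needed to stop the scores collapsing to a constant, and the sketch recovers only the first CA axis.

```r
# Toy abundances (hypothetical): 5 sites x 4 species
Y <- matrix(c(8, 2, 0, 0,
              5, 5, 1, 0,
              1, 6, 4, 0,
              0, 3, 6, 2,
              0, 0, 3, 7),
            nrow = 5, byrow = TRUE)
site_tot <- rowSums(Y)  # total abundance per site
spp_tot  <- colSums(Y)  # total abundance per species

site_sc <- seq_len(nrow(Y))  # arbitrary starting site scores
for (iter in 1:1000) {
  # species scores: abundance-weighted average of the current site scores
  spp_sc <- as.vector(t(Y) %*% site_sc) / spp_tot
  # site scores: abundance-weighted average of the species scores
  new_sc <- as.vector(Y %*% spp_sc) / site_tot
  # centre and rescale (weighted by site totals) so the scores keep a fixed spread
  new_sc <- new_sc - sum(site_tot * new_sc) / sum(site_tot)
  new_sc <- new_sc / sqrt(sum(site_tot * new_sc^2) / sum(site_tot))
  converged <- max(abs(new_sc - site_sc)) < 1e-10
  site_sc <- new_sc
  if (converged) break  # scores have stopped changing
}
# final species scores as weighted averages of the converged site scores
spp_sc <- as.vector(t(Y) %*% site_sc) / spp_tot

site_sc
spp_sc
```

Up to sign and scaling, the converged site scores should match the first axis from an eigen- or SVD-based CA of the same table.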

You'll notice that if you supply a dissimilarity matrix to metaMDS(), the plot will not show the species points, because the function does not have access to the species abundances (to use as weights).
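
A quick way to see the difference is to fit the same data both ways. This sketch assumes vegan's bundled `dune` data set, and it inspects the `species` component of the fitted object directly, which relies on how the result happens to be stored rather than on a documented interface.

```r
library(vegan)

data(dune)  # example community matrix shipped with vegan
set.seed(1)

# Community matrix supplied: abundances are available as weights,
# so weighted-average species scores can be computed
ord_comm <- metaMDS(dune, distance = "bray", trace = FALSE)

# Only dissimilarities supplied: the abundances never reach metaMDS()
ord_dist <- metaMDS(vegdist(dune, method = "bray"), trace = FALSE)

# Assumption: species scores, when present, are stored as a matrix in $species
has_species_comm <- is.matrix(ord_comm$species)
has_species_dist <- is.matrix(ord_dist$species)
has_species_comm
has_species_dist
```

Plotting `ord_comm` and `ord_dist` side by side should show the same kind of site configuration, with species points appearing only in the first plot.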

Really, these species points are an afterthought, a way to help interpret the plot. They do not come from a second ordination of species; they are computed from the site scores after the NMDS has been fitted, so both sets of points share the same axes. You interpret the site scores (points) as you would in any other NMDS: distances between points approximate the rank order of the dissimilarities between samples. The species just add a little extra information; think of each species point as the "optimum" of that species in the NMDS space.