Solved – Assumptions for Canonical Correspondence analysis

analysis, assumptions, canonical-correlation, correspondence-analysis

I have been trying to find the major assumptions that Canonical Correspondence Analysis makes, but I have had a hard time finding anything useful. I did, however, find the assumptions for Canonical Correlation Analysis. Does anyone know if the same assumptions apply to Canonical Correspondence Analysis? My guess is that the assumptions are: a Gaussian relationship between the sets of variables, a Gaussian relationship between the variates, homoscedasticity, and, as a strong recommendation, normally distributed data.

Best Answer

Assumptions made are:

  1. Mean-variance relationship

Correspondence analysis assumes that your data follow a Poisson or multinomial distribution, since it divides the raw residuals by the square root of their expected value under the independence model. In matrix form, the quantity being analysed is:

$$R^{-1/2}(X-E)C^{-1/2}$$

where $X$ is your n-by-p matrix of interest, $R$ is an n-by-n diagonal matrix containing the row sums of $X$, $C$ is a p-by-p diagonal matrix containing the column sums of $X$, and $E = \frac{R11'C}{a}$ is the matrix of expectations. Here $a$ is the sum of all elements of $X$, and $1$ and $1'$ are vectors of ones of length n and p respectively.

There is thus a hidden distributional assumption that $E(X_{ij}) = Var(X_{ij})$, which is what makes the Pearson residuals $\frac{x_{ij}-e_{ij}}{\sqrt{e_{ij}}}$ properly standardized. This holds for regular as well as canonical correspondence analysis.
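To make the standardization concrete, here is a minimal sketch in plain Python that computes the expectations under independence and the matrix $R^{-1/2}(X-E)C^{-1/2}$ entry by entry. The function name `standardized_residuals` and the small count table are made up for illustration; they are not from any particular package. Since $r_i c_j = a\,e_{ij}$, each entry equals the Pearson residual divided by $\sqrt{a}$, which the last lines check numerically.

```python
import math

def standardized_residuals(X):
    """Return (E, S): expected counts under independence and
    S = R^{-1/2} (X - E) C^{-1/2}, computed entry by entry."""
    row_sums = [sum(row) for row in X]          # diagonal of R
    col_sums = [sum(col) for col in zip(*X)]    # diagonal of C
    a = sum(row_sums)                           # grand total of X
    # e_ij = r_i * c_j / a
    E = [[r * c / a for c in col_sums] for r in row_sums]
    # s_ij = (x_ij - e_ij) / sqrt(r_i * c_j)
    S = [[(X[i][j] - E[i][j]) / math.sqrt(row_sums[i] * col_sums[j])
          for j in range(len(col_sums))]
         for i in range(len(row_sums))]
    return E, S

# Hypothetical 3 sites x 3 species count table (illustrative numbers only).
X = [[10, 2, 0],
     [4, 6, 3],
     [0, 3, 12]]

E, S = standardized_residuals(X)
a = sum(map(sum, X))

# Pearson residuals (x_ij - e_ij) / sqrt(e_ij) for comparison.
pearson = [[(X[i][j] - E[i][j]) / math.sqrt(E[i][j]) for j in range(3)]
           for i in range(3)]

# The two differ only by the constant factor 1 / sqrt(a).
max_gap = max(abs(S[i][j] - pearson[i][j] / math.sqrt(a))
              for i in range(3) for j in range(3))
```

If the counts are overdispersed relative to the Poisson (variance larger than the mean), dividing by $\sqrt{e_{ij}}$ no longer puts all cells on a common scale, which is the practical consequence of the assumption above.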

  2. Response functions

Canonical correspondence analysis indeed assumes a Gaussian response function and equal tolerances for all species.
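As an illustration of what a Gaussian response function with equal tolerances looks like, the sketch below uses the commonly used parameterization in which a species' expected abundance along a gradient is a bell-shaped curve with an optimum (where it peaks), a tolerance (the width of the curve) and a maximum (its height). The parameter names, the species labels and all the numbers are assumptions made for this example; they are not taken from the answer above.

```python
import math

def gaussian_response(x, optimum, tolerance, maximum=1.0):
    """Expected abundance at gradient position x for a unimodal,
    bell-shaped response: maximum * exp(-(x - optimum)^2 / (2 * tolerance^2))."""
    return maximum * math.exp(-0.5 * ((x - optimum) / tolerance) ** 2)

# Hypothetical species: different optima, one shared tolerance
# (this is the "equal tolerances" part of the assumption).
optima = {"sp_a": 2.0, "sp_b": 5.0, "sp_c": 8.0}
shared_tolerance = 1.5

# Environmental gradient sampled from 0.0 to 10.0 in steps of 0.5.
gradient = [0.5 * k for k in range(21)]

curves = {
    name: [gaussian_response(x, u, shared_tolerance) for x in gradient]
    for name, u in optima.items()
}

# Gradient position at which each species reaches its highest abundance.
peaks = {
    name: gradient[max(range(len(gradient)), key=values.__getitem__)]
    for name, values in curves.items()
}
```

Each curve peaks at its own optimum, and all curves have the same width. Species whose real response is monotone, skewed or much broader than the others depart from this picture.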