I have been trying to find the major assumptions that Canonical Correspondence Analysis makes, but I have had a hard time finding anything useful. I did, however, find the assumptions for Canonical Correlation Analysis. Does anyone know whether these are the same assumptions as for Canonical Correspondence Analysis? My guess is that the assumptions are a Gaussian relationship between the sets of variables, a Gaussian relationship between the variates, homoscedasticity, and, as a strong recommendation, normally distributed data.
Solved – Assumptions for Canonical Correspondence analysis
analysis, assumptions, canonical-correlation, correspondence-analysis
Related Solutions
The normality assumption is not necessary for nonlinear regression. It is often used because it's convenient. However, if it's clearly violated then I wouldn't use such an assumption at all. The same goes for homoscedasticity.
In your example the dependent variable seems to be confined between 0 and 100%. You could still assume normality and homoscedasticity if the data were "far" from the bounds. However, you show a sample in which the data span the whole range, with a substantial portion clustered near the bounds. In this case neither homoscedasticity nor normality seems like a reasonable assumption.
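A rough illustration of why bounds and constant variance clash, assuming (purely for the sake of the sketch, not something stated in the question) that each percentage is a binomial proportion out of a hypothetical `n = 50` trials:

```python
import numpy as np

# Assumption for illustration only: each observation is a proportion
# out of n trials, so its standard deviation is sqrt(p * (1 - p) / n).
n = 50
p = np.array([0.02, 0.25, 0.50, 0.75, 0.98])  # hypothetical mean proportions

sd = np.sqrt(p * (1 - p) / n)

for mean, spread in zip(p, sd):
    print(f"mean = {mean:.2f}  sd = {spread:.3f}")
```

Under that assumption the spread is largest in the middle of the range and shrinks toward 0 and 1, so a single common variance cannot describe data that pile up near the bounds.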
There's nothing special or magically different about structural equation modeling (SEM) compared with other statistical techniques. Regression (and hence t-tests and ANOVA), MANOVA, etc. can all be thought of as special cases of structural equation models. In addition, SEM and multilevel models are often equivalent; see http://curran.web.unc.edu/files/2015/03/Curran2003.pdf (edit: link updated, thanks @Peter Humburg). If something is an assumption in statistical analysis generally, it's an assumption in SEM. If something is an issue or a problem in statistical analysis generally, it's an issue or a problem in SEM.
- Common method bias is an issue in the interpretation of your model. It's not an assumption.
- Think of multiple regression as being a structural equation model. If it's an assumption in regression, it's an assumption in SEM. Outliers are a problem in regression, and a problem in SEM.
- Multicollinearity is not an assumption in regression, or SEM, unless your matrices cannot be inverted because they are not positive definite, in which case it's an assumption for everyone. It's a problem.
- Normality is an assumption in regular ML, as it is in regular OLS regression. But there are ways to handle it (as there are in regression).
- Linearity is an assumption. There are ways around it, but they vary from "a bit fiddly" to "really hard".
- There are ways to handle data that are missing at random or missing completely at random, same as (almost) every other type of statistical analysis. It's a problem, and the solutions bring assumptions.
- Not an assumption, it's something that your model can test.
Best Answer
Assumptions made are:
Correspondence analysis assumes that your data follow the Poisson or multinomial distribution, since it divides the raw residuals by the square root of their expected value under the independence model. Mathematically speaking this becomes:
$$R^{-1/2}(X-E)C^{-1/2}$$
with $X$ your $n$-by-$p$ matrix of interest, $R$ an $n$-by-$n$ diagonal matrix containing the row sums of $X$, $C$ a $p$-by-$p$ diagonal matrix containing the column sums of $X$, and $E = \frac{R11'C}{a}$ the matrix of expectations, where $a$ is the total sum of all elements of $X$, $1$ is a column vector of ones of length $n$, and $1'$ is a row vector of ones of length $p$.
There is thus a hidden distributional assumption that $E(X_{ij}) = Var(X_{ij})$ for the Pearson residuals $\frac{x_{ij}-e_{ij}}{\sqrt{e_{ij}}}$ to be properly standardized. This holds for regular as well as canonical correspondence analysis.
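The standardization can be checked numerically. The following is a minimal NumPy sketch on a made-up 3-by-4 count table (the data and variable names are hypothetical, not from any real study). It builds $R^{-1/2}(X-E)C^{-1/2}$ and confirms that it equals the matrix of Pearson residuals up to the constant factor $\sqrt{a}$:

```python
import numpy as np

# Hypothetical 3-by-4 table of counts (e.g. rows = sites, columns = species).
X = np.array([[10, 2, 0, 5],
              [3, 8, 4, 1],
              [0, 6, 9, 2]], dtype=float)

a = X.sum()              # grand total
r = X.sum(axis=1)        # row sums (the diagonal of R)
c = X.sum(axis=0)        # column sums (the diagonal of C)
E = np.outer(r, c) / a   # expectations under independence: R 1 1' C / a

# R^{-1/2} (X - E) C^{-1/2}, using broadcasting instead of diagonal matrices.
S = (X - E) / np.sqrt(np.outer(r, c))

# Pearson residuals: raw residuals divided by sqrt(expected value),
# i.e. standardized as if Var(X_ij) = E(X_ij).
pearson = (X - E) / np.sqrt(E)

# The two matrices differ only by the constant factor sqrt(a).
assert np.allclose(S * np.sqrt(a), pearson)

# The squared singular values of S sum to the chi-square statistic divided by a.
chi2 = (pearson ** 2).sum()
singular_values = np.linalg.svd(S, compute_uv=False)
assert np.isclose((singular_values ** 2).sum(), chi2 / a)
```

If the counts are overdispersed relative to the Poisson (variance larger than the mean), dividing by $\sqrt{e_{ij}}$ under-standardizes the large cells, which is where the hidden assumption starts to matter.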
Canonical correspondence analysis indeed assumes a Gaussian response function and equal tolerances for all species.
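For reference, the Gaussian response function is usually written as follows. The notation here is mine, not part of the question: $y_{ik}$ is the abundance of species $k$ at site $i$, $x_i$ the value of the environmental gradient at site $i$, $u_k$ the optimum of species $k$, $t_k$ its tolerance and $c_k$ its maximum expected abundance.

$$E(y_{ik}) = c_k \exp\left(-\frac{(x_i - u_k)^2}{2 t_k^2}\right)$$

The equal-tolerance assumption then amounts to $t_k = t$ for all species $k$, so each species has a unimodal, bell-shaped response of the same width and differs from the others only in its optimum and maximum.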