Why robust PCA results change with each run

compositional-data, multivariate-analysis, pca, r, robust

According to Filzmoser et al. 2009, the best way to conduct a principal component analysis for compositional data with outliers is:

  • using a robust PCA method
  • and using the isometric log ratio transformation (instead of the centred log ratio transformation, see also the discussion here).

The function pcaCoDa() from the R package robCompositions can do both things.

However, every time I run the function, I get a different result… how is that possible?

Examples from four different runs:

[Biplots from the 1st, 2nd, 3rd and 4th runs; images not preserved]

In some of the biplots above, it's just a matter of the components being rotated, but for others, I don't think that's the case.

Also, from what I understand after checking help(pcaCoDa), the data set passed to the function must not be transformed beforehand; the transformation is done internally. But what about scaling? Should the matrix be scaled before running pcaCoDa() if the columns use very different units?

Best Answer

It looks to me as though the proposed method, at its core, uses robust estimates of location and covariance based on the MCD (Minimum Covariance Determinant) estimator, specifically the FastMCD variant. This algorithm randomly samples the data hundreds of times, constructs a covariance matrix estimate for each subsample, then selects the one with the minimum determinant.
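To make the role of the random sampling concrete, here is a deliberately simplified sketch in base R. It is not FastMCD (there are no C-steps and no refinement of the subsets), and `toy_mcd`, its arguments and the simulated data are all made up for illustration. It only shows that a "draw random subsets, keep the one with the smallest determinant" search depends on the state of the random number generator.

```r
# Toy illustration only: NOT the FastMCD algorithm used by robust packages.
# Draw random subsets of size h, compute each subset's covariance matrix,
# and keep the subset whose covariance has the smallest determinant.
toy_mcd <- function(x, h = floor((nrow(x) + ncol(x) + 1) / 2), n_draws = 50) {
  best <- list(det = Inf, center = NULL, cov = NULL)
  for (i in seq_len(n_draws)) {
    idx <- sample(nrow(x), h)              # the random step
    sub <- x[idx, , drop = FALSE]
    s <- cov(sub)
    d <- det(s)
    if (d < best$det) {
      best <- list(det = d, center = colMeans(sub), cov = s)
    }
  }
  best
}

# Simulated data (an assumption for the demo): 45 clean points plus 5 outliers
set.seed(1)
x <- rbind(matrix(rnorm(90), ncol = 2),
           matrix(rnorm(10, mean = 6), ncol = 2))

fit_a <- toy_mcd(x)   # RNG state simply continues from above
fit_b <- toy_mcd(x)   # different random subsets, so typically a different estimate

# Resetting the seed before each call reproduces the same subsets
set.seed(42); fit_c <- toy_mcd(x)
set.seed(42); fit_d <- toy_mcd(x)
identical(fit_c$cov, fit_d$cov)
```

Because PCA takes its eigenvectors from this covariance estimate, any run-to-run change in the selected subset carries straight through to the loadings and the biplot.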

From your perspective, the important part is that "randomly samples" bit. It means that the estimated covariance matrix at the core of the pcaCoDa algorithm is non-deterministic, so the output eigenvectors are too. Given how different the results are from run to run, I'd guess that some of the parameter settings in the calls to the FastMCD algorithm aren't working well for this problem. Since it doesn't appear that you can alter the parameters passed to the FastMCD algorithm through any of the arguments of pcaCoDa, you may have to modify the code yourself, or seek another approach altogether.
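If the variation really does come from the random subsampling, and if that step draws from R's own random number generator (an assumption here, though robustbase::covMcd does behave that way), then fixing the seed before each call should at least make the output repeatable. The sketch below assumes the robCompositions package is installed and uses the arcticLake data set that ships with it, not the data from the question.

```r
# Assumes robCompositions is installed; arcticLake is one of its example data sets.
if (requireNamespace("robCompositions", quietly = TRUE)) {
  data("arcticLake", package = "robCompositions")

  set.seed(123)                                  # fix the RNG state before run 1
  run1 <- robCompositions::pcaCoDa(arcticLake)   # default method is the robust one

  set.seed(123)                                  # same seed before run 2
  run2 <- robCompositions::pcaCoDa(arcticLake)

  # With identical seeds the two sets of loadings should agree
  same_fit <- isTRUE(all.equal(run1$loadings, run2$loadings))
  print(same_fit)
} else {
  message("robCompositions is not installed; skipping the demo")
}
```

Note that a fixed seed only pins down one of the possible answers. If different seeds give very different biplots, the robust covariance estimate itself is unstable for this data set, and that underlying problem remains.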
