Solved – How to calculate the R-squared value and assess the model fit in multidimensional scaling

multidimensional-scaling, r

I would like to do multidimensional scaling (MDS) using cmdscale() in R. I have read that it is useful to find out how many dimensions suit the data by trying different values of k and checking what proportion of variance the MDS result accounts for, as measured by the R-square value. R-square values greater than 0.6 are generally considered acceptable for a good fit between the data and the number of dimensions.

However, how do I calculate R-square from an MDS generated by cmdscale()?

Best Answer

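You can look at the "GOF" ("goodness of fit") component of the result. It is returned when you ask for a list result (for example with eig=TRUE) and is computed for the number of dimensions k you specify. It contains two numbers, which are equal unless some eigenvalues are negative, that is, unless the doubly centred matrix derived from the distances is not positive semi-definite (the distances are not exactly Euclidean).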
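To make the two GOF numbers less of a black box, here is a minimal sketch that recomputes them by hand from the eigenvalues. It assumes the definition given in the cmdscale() help page (sum of the first k eigenvalues, divided either by the sum of absolute eigenvalues or by the sum of the positive ones). The names k, fit, ev, gof_abs and gof_pos are illustrative only.

```r
# eurodist ships with base R (package "datasets")
k   <- 2
fit <- cmdscale(eurodist, k = k, eig = TRUE)
ev  <- fit$eig                      # all eigenvalues, in decreasing order

# First GOF value: negative eigenvalues count with their absolute value
gof_abs <- sum(ev[1:k]) / sum(abs(ev))

# Second GOF value: negative eigenvalues are dropped (treated as zero)
gof_pos <- sum(ev[1:k]) / sum(pmax(ev, 0))

print(c(gof_abs, gof_pos))          # hand-computed
print(fit$GOF)                      # as reported by cmdscale()
```

The two hand-computed values should agree with fit$GOF. They differ from each other only because eurodist produces a few negative eigenvalues.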

You can also directly look at the eigenvalues: when they become small, you have enough dimensions.

In the following example, two dimensions seem sufficient.

> cmdscale(eurodist, 1, eig=TRUE)$GOF
[1] 0.4690928 0.5401388
> cmdscale(eurodist, 2, eig=TRUE)$GOF
[1] 0.7537543 0.8679134
> cmdscale(eurodist, 3, eig=TRUE)$GOF
[1] 0.7904600 0.9101784
> r <- cmdscale(eurodist, eig=TRUE)
> plot(cumsum(r$eig) / sum(r$eig), 
       type="h", lwd=5, las=1, 
       xlab="Number of dimensions", 
       ylab=expression(R^2))
> plot(r$eig, 
       type="h", lwd=5, las=1, 
       xlab="Number of dimensions", 
       ylab="Eigenvalues")
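
If by "R-square" you mean something closer to a regression-style fit measure, one common choice (an assumption here, not something cmdscale() reports) is the squared correlation between the original distances and the distances between the fitted points. A rough sketch, with the hypothetical names d, ks and r2:

```r
d  <- eurodist                      # input distances (class "dist")
ks <- 1:5                           # candidate numbers of dimensions

r2 <- sapply(ks, function(k) {
  pts <- cmdscale(d, k = k)         # n x k matrix of fitted coordinates
  # squared Pearson correlation between input and fitted distances
  cor(as.vector(d), as.vector(dist(pts)))^2
})
names(r2) <- ks
print(round(r2, 3))
```

This is not the same quantity as GOF, so the two need not agree numerically, but both can be read the same way: look for the k after which the value stops improving much.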