In multidimensional scaling, how can one determine dimensionality of a solution given a stress value?
Having a stress value it is not possible to determine the dimensionality of the dataset. At best, you can evaluate whether the value is low or high (this evaluation is also a bit problematic to me).
From what I understand, stress value is inversely related to the number of dimensions of a MDS solution,
correct
and that higher stress value indicates that there is a lot of error (i.e. badness-of-fit) in the current model,
correct
indicating a solution with more dimensions.
Not very accurate conclusion. consider stress as a function, "number of dimensions" is one of the inputs of this function. The others [significant factors] are the model that you are using as your MDS model, the initial configuration of points in the MDS configuration(map) or even the order of rows/columns in the dissimilarity matrix. Therefore, you will get different stress values in 2-dimension space for instance just by changing the initial configuration of the points! [although this change in the stress value is not considerable comparing to the one resulted by change in the number of dimensions]
Now if you want to figure out the most proper number of dimensions regarding the stress value, there is a straight-forward solution:
In multidimensional scaling, the pragmatic way of depicting the inverse relation of number of dimensions and stress is computing the stress for 2,3,4...,n-1 dimensions. n is the original number of dimension of the data.
The result of above computations becomes more lucid and comprehensible through "Scree plot of number of dimensions ~ amount of stress". The example below is from Cox and Cox(2001):
Now we can decide about the number of dimensions based on the relation. It is a trade-off: more dimensions-->lower stress (more accurate map) and less dimension reduction(more difficult to visualize and interpret).
Besides, the proper number of dimensions are not decided solely based on stress value. Your goal also matters. If you want to have a 2D map, then you choose 2-dimensions and then try to minimize the stress as much as possible.
Nevertheless, if you are implying "how much stress is too much" then we have another story! one way of evaluation of your magnitude of stress is comparing it to the average stress values of different possible configurations of your dataset. (have look at "Multidimensional Scaling in R: SMACOF" written by Patrick Mair).
Are the randomly generated coordinates, number of variables, and number of categories in a variable related?
Sorry but I don't understand this part of your question.
Scale should make no difference. But, all else being equal, the greater the number of points, the higher the stress.
As ttnphns comments, the cause of this is that when you have fewer observations, the model will over-fit, so the stress is downwardly biased. As the number of observations grow, the extent of the bias reduces.
Pretty much every measure of goodness-of-fit with a fixed minimum and maximum, as in this case, suffers from the same problem. For example, R-squared goes down as the number of observations go up, all else being equal, and the Adjusted R-squared was developed to address this. While I do appreciate that it would be great to have measures that were not influenced by the number of observations, as the degree of "noise" is going to differ from problem to problem, this is probably not solvable (e.g., Adjusted R-squared is not used by people with a good knowledge of regression).
You can compare between different data sets by randomly sampling the number of observations in the larger data set. For example if data set 1 has 20 observations, and data set 2 has 30, randomly sample 20 from data set 2 and compare the stress with 1. If you repeat this multiple times you will be able to do a significance test comparing the stress levels.
Best Answer
The MDS performed here is non-metric MDS: Kruskal's stress (or loss function) as defined in your question:
\begin{equation} \sqrt{\frac{\sum \left(d_{ij}-\delta_{ij}\right)^2}{\sum d_{ij}^{2}}} \end{equation}
where the disparities $\delta_{ij}$ preserve the order of the original dissimilarities
To minimise Kruskal's stress in R you can use the function
isoMDS
in theMASS
package.Below I provide a simple example of this with the Kellogg's data from Multidimensional Scaling by Cox & Cox (2001).
In the output from
isoMDS
you can see that the stress convergedThen you can also create a plot if you would like to do so
Here is the data that I used if you want to try it out for yourself: