Solved – Normalizing data before applying MDS with strain criterion

MATLAB, multidimensional scaling, normalization

The features of my dataset are as follows:
• BI-RADS assessment: 1 to 5 (ordinal)
• Age: patient's age in years (integer) ranges from 18 to 96
• Shape: mass shape: round=1 oval=2 lobular=3 irregular=4 (nominal)
• Margin: mass margin: circumscribed=1 microlobulated=2 obscured=3 ill-defined=4 spiculated=5 (nominal)
• Density: mass density high=1 iso=2 low=3 fat-containing=4 (ordinal)

When I run MDS with the "strain" criterion on this dataset without normalizing it first, I get the following result:
[image: MDS result without normalization]
However, if I normalize the data first, the result is as follows:
[image: MDS result with normalization]
The second result is very similar to the results I got with the other criteria, and also with PCA, even though I did not normalize the data for those either.

So, my question is: why does normalizing the data make a difference for the "strain" criterion?

Thanks in advance…

Best Answer

Suppose you don't normalize your data. You could have a situation like this:

    user1   user2   user3   user4   user5   user6   user7   user8   user9
1   0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
2   18      96      21      45      93      19      21      19      90
3   0.08    0.12    0.19    0.70    0.12    0.01    0.00    0.09    0.04
4   1019    217     53      1082    1010    2       30      0       100

Here each row is a feature and each column is a user. It is clear that the features with large numeric ranges (rows 2 and 4) influence the result much more than the others.
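To make this concrete, the sketch below takes the four feature rows from the table above and measures what fraction of the total squared pairwise Euclidean distance each feature contributes, before and after normalization. It uses only the Python standard library. The choice of z-scoring as the normalization and the helper names `zscore` and `distance_shares` are assumptions made for illustration; the question does not say which normalization was used.

```python
from itertools import combinations
from statistics import mean, pstdev

# Feature rows 1-4 from the table above, one value per user (user1..user9).
features = [
    [0.00] * 9,
    [18, 96, 21, 45, 93, 19, 21, 19, 90],
    [0.08, 0.12, 0.19, 0.70, 0.12, 0.01, 0.00, 0.09, 0.04],
    [1019, 217, 53, 1082, 1010, 2, 30, 0, 100],
]


def zscore(values):
    """Standardize to zero mean and unit variance; a constant feature maps to all zeros."""
    mu, sigma = mean(values), pstdev(values)
    return [0.0 if sigma == 0 else (v - mu) / sigma for v in values]


def distance_shares(rows):
    """Fraction of the total squared pairwise distance contributed by each feature."""
    per_feature = [
        sum((a - b) ** 2 for a, b in combinations(row, 2)) for row in rows
    ]
    total = sum(per_feature)
    return [p / total for p in per_feature]


raw_shares = distance_shares(features)
scaled_shares = distance_shares([zscore(row) for row in features])

for i, (raw, scaled) in enumerate(zip(raw_shares, scaled_shares), start=1):
    print(f"feature {i}: raw share = {raw:.4f}, z-scored share = {scaled:.4f}")
```

On the raw values, feature 4 accounts for more than 99% of the squared distance between users, so any distance-based embedding is essentially an embedding of that one feature. After z-scoring, the three non-constant features contribute equally.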

Remember that MDS tries to reproduce the differences between all pairs of elements in n dimensions as differences in 2 dimensions. I think that in the first case the strain function

[image: strain function formula]

is affected by the lack of normalization. Once the data are normalized, every feature contributes on a comparable scale, so the algorithm can capture the real differences among the points.
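For reference, the strain criterion of classical MDS is commonly written as below. The notation is an assumption and may differ from the formula the answer originally showed: here $x_i$ are the low-dimensional coordinates, $D^{(2)}$ is the matrix of squared pairwise distances between the original points, and $N$ is the number of points.

```latex
\mathrm{Strain}(x_1,\dots,x_N)
  = \left( \frac{\sum_{i,j} \left( b_{ij} - x_i^{\top} x_j \right)^2}
                {\sum_{i,j} b_{ij}^2} \right)^{1/2},
\qquad
B = -\tfrac{1}{2}\, J D^{(2)} J,
\quad
J = I - \tfrac{1}{N}\mathbf{1}\mathbf{1}^{\top}
```

When the distances are Euclidean, the entries $b_{ij}$ are the inner products of the mean-centered data points. A feature with a large numeric range therefore dominates every $b_{ij}$, and the embedding that minimizes strain mostly reflects that feature.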
