The features of my dataset are like below:
• BI-RADS assessment:
1 to 5 (ordinal)
• Age:
patient's age in years (integer) ranges from 18 to 96
• Shape:
mass shape: round=1 oval=2 lobular=3 irregular=4 (nominal)
• Margin:
mass margin: circumscribed=1 microlobulated=2 obscured=3 ill-defined=4 spiculated=5 (nominal)
• Density:
mass density high=1 iso=2 low=3 fat-containing=4 (ordinal)
When I run MDS
with "strain"
criterion on such a dataset without normalizing it first, I get a result as follows:
However, if I normalize the data the result is as follows:
The second results is pretty similar to results that I have got for other criteria and also for the PCA
even I didn't normalize the data for them also.
So, my question is: Why does normalizing data make difference for "strain" criterion?
Thanks in advance…
Best Answer
Suppose you don't normalize your data. You could have a situation like this:
So it is clear that some features influence the results much more than others.
Remember that the MDS is a way to force differences between elements in n dimensions in differences in 2 dimensions, between all the couples of elements! I simply think that in the first case the strain function
is affected by the lack of normalization. When you normalize the data the algorithm is able to catch the real differences among the points considering in a proper way all the different features.