Solved – Difference between dimensionality reduction and clustering

clustering, dimensionality reduction, pca

A common practice for clustering is to apply some form of linear or non-linear dimensionality reduction first, especially when the number of features (say n) is high. For a linear technique like PCA, the objective is to find orthogonal principal components (say m) that explain most of the variance in the data, such that m << n when n is large.
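As a minimal sketch of that pipeline (hypothetical data: a random 200×50 matrix with the variance concentrated in the first three columns; the 95% threshold is an arbitrary illustrative choice), PCA via SVD picks m components covering most of the variance:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))         # n = 50 features
X[:, :3] *= 20                         # concentrate variance in 3 directions

Xc = X - X.mean(axis=0)                # center the data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = S**2 / np.sum(S**2)        # variance ratio per component
# smallest m whose cumulative explained variance reaches 95%
m = int(np.searchsorted(np.cumsum(explained), 0.95) + 1)
Z = Xc @ Vt[:m].T                      # project onto the first m components
print(m, Z.shape)                      # m << n; cluster on Z instead of X
```

One would then run k-means (or any other clustering algorithm) on `Z` rather than on the raw 50-dimensional `X`.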

But for non-linear dimensionality reduction techniques like autoencoders, can the reduced dimensions themselves act as clusters that indicate, for example, different modes of operation of industrial components? Am I missing something here, or is my understanding of non-linear dimensionality reduction wrong? Any help is appreciated.

This question might be too basic for some, so please don't be extremely critical of the question if you don't want to answer it.

@fk128 shared an interpretation of my question that may be clearer and easier to follow than what I have written above.

Best Answer

The components learned by an autoencoder are, if anything, even less reliable than the output of your usual clustering.

Why don't you just try it: train autoencoders on some data sets, and visualize the "clusters" you get from the components?
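Such an experiment can be sketched in a few lines of numpy (all of this is a toy illustration, not the answerer's code: two hypothetical Gaussian blobs stand in for "modes of operation", and a single-hidden-layer autoencoder with a 2-unit tanh bottleneck is trained by plain gradient descent so the code space can be plotted):

```python
import numpy as np

rng = np.random.default_rng(1)
# two hypothetical "modes of operation": two blobs in 10-D feature space
A = rng.normal(loc=0.0, scale=0.5, size=(100, 10))
B = rng.normal(loc=3.0, scale=0.5, size=(100, 10))
X = np.vstack([A, B])

n, h = X.shape[1], 2                   # 2-unit bottleneck, easy to visualize
W1 = rng.normal(scale=0.1, size=(n, h))   # encoder weights
W2 = rng.normal(scale=0.1, size=(h, n))   # decoder weights

def forward(X):
    Z = np.tanh(X @ W1)                # 2-D code
    return Z, Z @ W2                   # linear reconstruction

losses, lr = [], 0.01
for _ in range(500):
    Z, Xhat = forward(X)
    err = Xhat - X
    losses.append(float(np.mean(err ** 2)))
    # backpropagate the mean-squared reconstruction error
    dXhat = 2 * err / err.size
    dW2 = Z.T @ dXhat
    dZ = dXhat @ W2.T
    dpre = dZ * (1 - Z ** 2)           # tanh derivative
    dW1 = X.T @ dpre
    W1 -= lr * dW1
    W2 -= lr * dW2

Z, _ = forward(X)
# scatter-plot Z (e.g. with matplotlib) and inspect whether the apparent
# "clusters" in code space actually correspond to the two modes
```

Repeating this on data with no real cluster structure is the instructive part: the code space can still show apparent groupings.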

While this great answer on tSNE for clustering is specific to tSNE, I believe the results for other such encoders will be similar: they will produce fake clusters by emphasizing random fluctuations in the data.
