Solved – Tutorial for feature extraction on unsupervised learning

Tags: feature-engineering, machine learning, references, unsupervised learning

I would like to extract features from (without loss of generality) numerical data using unsupervised learning methods among these:

  1. transformations: PCA/ICA/NMF
  2. embeddings: t-distributed stochastic neighbor embedding (t-SNE)
  3. cluster based methods: k-means or similar
  4. kernel based: kernel PCA

I am also thinking about using auto-encoders or similar. The extracted features are then used in a classifier.
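To make the setup concrete, here is a minimal sketch of "unsupervised extractor, then classifier" using scikit-learn. The dataset (`load_digits`), the number of components (16), and the choice of logistic regression are assumptions made purely for illustration and are not part of the question. t-SNE is left out because scikit-learn's `TSNE` only offers `fit_transform`, so it cannot map held-out samples.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.decomposition import NMF, PCA, KernelPCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Illustrative data: 8x8 digit images, scaled to [0, 1] (NMF needs non-negative input).
X, y = load_digits(return_X_y=True)
X = X / 16.0
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Each extractor is fitted without labels; only the final classifier sees y.
extractors = {
    "pca": PCA(n_components=16, random_state=0),
    "nmf": NMF(n_components=16, init="nndsvda", max_iter=500, random_state=0),
    # KMeans.transform returns the distance of each sample to every centroid.
    "kmeans": KMeans(n_clusters=16, n_init=10, random_state=0),
    "kernel_pca": KernelPCA(n_components=16, kernel="rbf"),
}

scores = {}
for name, extractor in extractors.items():
    model = make_pipeline(
        extractor,
        StandardScaler(),  # put the extracted features on a common scale
        LogisticRegression(max_iter=2000),
    )
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)

for name, accuracy in scores.items():
    print(f"{name}: test accuracy = {accuracy:.3f}")
```

Wrapping the extractor and the classifier in one pipeline means the extractor is fitted on the training split only, which avoids leaking information from the test split into the features.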

My question: I am studying each of these methods one by one, some in their original context (e.g. clustering) and some in the context of feature extraction. I lack experience with the details, and many questions arise, such as:

  • Can I stack these methods? What do I lose?
  • Can I fit them on a subset of the data (to reduce training time) and predict on the rest?
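Both questions can be tried out directly. The sketch below stacks scaling, PCA and k-means in a scikit-learn pipeline, fits the stack on a random subset, and transforms the remaining rows. The dataset, the subset size (500) and the component counts are illustrative assumptions. For the PCA step, one measurable answer to "what do I lose" is the share of variance that is discarded.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, _ = load_digits(return_X_y=True)  # labels are not used here

# Pick a random subset for fitting; everything else is only transformed.
rng = np.random.default_rng(0)
subset_idx = rng.choice(len(X), size=500, replace=False)
rest_mask = np.ones(len(X), dtype=bool)
rest_mask[subset_idx] = False

# Stacked extractors: scale -> PCA -> distances to k-means centroids.
stacked = make_pipeline(
    StandardScaler(),
    PCA(n_components=20, random_state=0),
    KMeans(n_clusters=10, n_init=10, random_state=0),
)
stacked.fit(X[subset_idx])

# Works because every step implements transform() for unseen samples.
features_rest = stacked.transform(X[rest_mask])

# Fraction of the (scaled) subset's variance kept by the PCA step.
retained = stacked.named_steps["pca"].explained_variance_ratio_.sum()

print("features for held-out rows:", features_rest.shape)
print(f"variance retained by PCA: {retained:.2%}")
```

This pattern only works for methods that can map new samples. PCA, NMF, kernel PCA and k-means all expose `transform` in scikit-learn, whereas its `TSNE` does not, so t-SNE would have to be refitted on the full data or replaced by a parametric alternative.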

Thus:

Are there tutorials/lecture notes/blog posts on the web that describe best practice of feature extraction in this sense?

PS:
Courses like this one ("Week 4: Feature construction") deal with my question, but I would love to see more examples from an applied point of view.
This question ("Tutorials for feature engineering") is similar, but I hope mine is not a duplicate.

Best Answer

A nice reference is "Dimensionality Reduction: A Short Tutorial" by Ali Ghodsi. It covers PCA, Locally Linear Embedding (LLE), Multidimensional Scaling (MDS) and Isomap.

Dan Ventura provides some nice worked examples of manifold learning, specifically PCA, LLE and Isomap.
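For a quick hands-on feel for the three methods named above, the following sketch applies them to a synthetic "swiss roll" with scikit-learn. It is not taken from any of the referenced tutorials; the dataset, sample size and neighbor count are assumptions chosen for illustration.

```python
from sklearn.datasets import make_swiss_roll
from sklearn.decomposition import PCA
from sklearn.manifold import Isomap, LocallyLinearEmbedding

# 3-D points lying on a rolled-up 2-D sheet; t is the position along the roll.
X, t = make_swiss_roll(n_samples=1000, noise=0.05, random_state=0)

methods = {
    "pca": PCA(n_components=2),  # linear projection
    "lle": LocallyLinearEmbedding(n_neighbors=12, n_components=2, random_state=0),
    "isomap": Isomap(n_neighbors=12, n_components=2),
}

# Each method maps the 3-D points to 2-D features without using any labels.
embeddings = {name: method.fit_transform(X) for name, method in methods.items()}

for name, embedding in embeddings.items():
    print(name, embedding.shape)
```

Plotting each embedding colored by `t` is a common way to check whether a method has unrolled the sheet or merely projected it.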

Kilian Weinberger has a nice web page devoted to Manifold Learning

There is a high-level overview of Feature Engineering at Machine Learning Mastery that also has some references.

Lawrence Cayton has an overview paper on Algorithms for Manifold Learning

Even though it is mostly about supervised feature extraction, I would hate to omit the work of Isabelle Guyon. She has a nice paper, "An Introduction to Variable and Feature Selection", slides from a KDD tutorial, and a book on feature extraction.

All links checked as of 18 Jan 2017
