Solved – Tutorials for feature engineering

feature-engineering, machine learning, references

Feature engineering is widely regarded as extremely important to machine learning, yet I have found few materials on the subject. I have participated in several Kaggle competitions and believe that, in some cases, good features may matter even more than a good classifier.
Does anyone know of any tutorials on feature engineering, or is it purely a matter of experience?

Best Answer

I would say experience -- the basic ideas are:

  • fit the features to how the classifier works; giving a geometry problem to a tree, too many dimensions to kNN, or interval data to an SVM is not a good idea
  • remove as many nonlinearities as possible; expecting a classifier to do Fourier analysis internally is rather naive (and even if it manages to, it will waste a lot of its complexity there)
  • make features generic to all objects, so that sampling somewhere in the processing chain won't knock them out
  • check previous work -- a transformation used for visualisation or for testing similar types of data is often already tuned to uncover the interesting aspects
  • avoid unstable, optimizing transformations such as PCA, which may lead to overfitting
  • experiment a lot
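
To make the first two points concrete, here is a minimal sketch of the "geometry problem given to a tree" case. It is only an illustration under stated assumptions: the toy data (points labelled by whether they fall inside the unit circle) and the helper `best_stump_accuracy` are made up for this example, and a single-threshold rule stands in for one split of a decision tree rather than for a full tree learner.

```python
import math
import random


def best_stump_accuracy(values, labels):
    """Accuracy of the best single-threshold rule on one feature.

    A stump predicts one class for value <= t and the other class otherwise,
    which is the kind of axis-aligned split a decision tree makes at a node.
    (Hypothetical helper written for this illustration.)
    """
    n = len(values)
    best = 0.0
    for t in sorted(set(values)):
        # Points where the rule "value <= t means class 1" is correct.
        hits = sum((v <= t) == bool(y) for v, y in zip(values, labels))
        # The flipped rule ("value > t means class 1") gets n - hits right.
        best = max(best, hits / n, (n - hits) / n)
    return best


random.seed(0)
# Toy data (assumed): label 1 if the point lies inside the unit circle.
points = [(random.uniform(-2, 2), random.uniform(-2, 2)) for _ in range(500)]
labels = [1 if x * x + y * y <= 1.0 else 0 for x, y in points]

xs = [x for x, _ in points]
ys = [y for _, y in points]
# Engineered feature: distance from the origin, which encodes the geometry.
radius = [math.hypot(x, y) for x, y in points]

acc_x = best_stump_accuracy(xs, labels)
acc_y = best_stump_accuracy(ys, labels)
acc_r = best_stump_accuracy(radius, labels)
print(f"best split on x:      {acc_x:.2f}")
print(f"best split on y:      {acc_y:.2f}")
print(f"best split on radius: {acc_r:.2f}")
```

On this toy data a single split on the engineered radius separates the classes, while a single split on either raw coordinate can do little better than always predicting the majority class. A real tree would approximate the circle with many splits on x and y, which is the wasted complexity mentioned above.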