Machine Learning – Why Does Feature Engineering Work?

feature-engineering, machine learning

Recently I learned that one way to find better solutions to ML problems is to create new features, for example by summing two existing features.

For example, suppose we have two features, "attack" and "defense", for some kind of hero. We then create an additional feature called "total", which is the sum of "attack" and "defense". What seems strange to me is that even though "attack" and "defense" are almost perfectly correlated with "total", we still gain useful information.

What is the math behind that? Or is my reasoning wrong?

Additionally, is it not a problem for classifiers such as kNN that "total" will always be bigger than "attack" or "defense"? Even after standardization, will we not still have features containing values from different ranges?

Best Answer

Your question title and its content seem mismatched to me. If you are using a linear model, adding a total feature in addition to attack and defense will make things worse.

First, I will answer why feature engineering works in general.

A picture is worth a thousand words. This figure may give you some insight into feature engineering and why it works (picture source):

[Figure: the same two-class data shown in Cartesian coordinates and in polar coordinates]

  • The data in Cartesian coordinates is more complicated, and it is relatively hard to write a rule or build a model that separates the two classes.

  • The data in polar coordinates is much easier: we can write a simple rule on $r$ to separate the two classes.

This tells us that the representation of the data matters a lot. In certain spaces, certain tasks are much easier than in others.
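To make the point about representation concrete, here is a minimal sketch in plain Python. The ring-shaped toy data is an assumption chosen for illustration (it is not the data from the figure), and `make_rings` and `best_threshold_accuracy` are hypothetical helper names. It compares how well a single threshold rule does on $x$, on $y$, and on the engineered feature $r = \sqrt{x^2 + y^2}$.

```python
import math
import random


def make_rings(n_per_class=200, seed=0):
    """Toy data (an assumption): class 0 lies inside radius 1,
    class 1 lies in a ring between radius 2 and radius 3."""
    rng = random.Random(seed)
    points, labels = [], []
    for label, (r_lo, r_hi) in enumerate([(0.0, 1.0), (2.0, 3.0)]):
        for _ in range(n_per_class):
            r = rng.uniform(r_lo, r_hi)
            theta = rng.uniform(0.0, 2.0 * math.pi)
            points.append((r * math.cos(theta), r * math.sin(theta)))
            labels.append(label)
    return points, labels


def best_threshold_accuracy(values, labels):
    """Accuracy of the best rule of the form `value > t` (or its negation)."""
    n = len(values)
    best = 0.0
    for t in values:
        correct = sum((v > t) == bool(y) for v, y in zip(values, labels))
        best = max(best, correct / n, 1 - correct / n)
    return best


points, labels = make_rings()

# Raw Cartesian features.
acc_x = best_threshold_accuracy([x for x, _ in points], labels)
acc_y = best_threshold_accuracy([y for _, y in points], labels)

# Engineered feature: the radius from the polar representation.
radius = [math.hypot(x, y) for x, y in points]
acc_r = best_threshold_accuracy(radius, labels)

print(f"threshold on x: {acc_x:.2f}")
print(f"threshold on y: {acc_y:.2f}")
print(f"threshold on r: {acc_r:.2f}")
```

On this toy data a single threshold on $r$ separates the classes exactly, because the two rings do not overlap in radius, while no single threshold on $x$ or $y$ can do so. The information was in the data all along; the new feature just puts it where a simple model can use it.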

Next, I will answer the question about your example (a total of attack and defense).

In fact, the feature engineering in this sum-of-attack-and-defense example will not work well for many models, such as linear models, and it will cause some problems. See Multicollinearity. On the other hand, such feature engineering may work for other models, such as decision trees / random forests. See @Imran's answer for details.
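The multicollinearity problem can be sketched in a few lines of plain Python. The hero stats below are made up, and `gram`, `det`, and `predict` are hypothetical helpers written for this illustration. The sketch shows that once "total" is added, the matrix $X^\top X$ used by ordinary least squares becomes singular, and that two different weight vectors give identical predictions.

```python
# Hypothetical hero stats as (attack, defense); the numbers are made up.
heroes = [(50, 40), (70, 20), (30, 90), (60, 60), (80, 35)]


def gram(rows):
    """Return X^T X for a list of equal-length rows."""
    k = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]


def det(m):
    """Determinant by cofactor expansion (fine for tiny matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, pivot in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * pivot * det(minor)
    return total


def predict(weights, row):
    """Linear model without intercept: dot product of weights and features."""
    return sum(w * x for w, x in zip(weights, row))


X_without_total = [[a, d] for a, d in heroes]
X_with_total = [[a, d, a + d] for a, d in heroes]

det_without_total = det(gram(X_without_total))  # nonzero: invertible
det_with_total = det(gram(X_with_total))        # zero: singular

# Two different weight vectors over (attack, defense, total) that
# describe exactly the same function, attack + 2 * defense.
w_a = (1, 2, 0)
w_b = (0, 1, 1)
same_predictions = all(
    predict(w_a, row) == predict(w_b, row) for row in X_with_total
)

print(det_without_total, det_with_total, same_predictions)
```

Because "total" is an exact linear combination of the other two columns, the least-squares coefficients are no longer uniquely determined. The added column gives a linear model nothing it could not already express with "attack" and "defense" alone.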

So the answer is that it depends on the model you use: some feature engineering will help some models but not others.
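As a rough illustration of the "helps some models" side, consider the simplest possible tree, a single threshold split. The setup below is an assumption made for this sketch: the hypothetical label `strong` depends only on attack + defense, and `stump_accuracy` is a helper written here, not a library function.

```python
import random

rng = random.Random(0)

# Hypothetical heroes as (attack, defense) with integer stats from 0 to 100.
heroes = [(rng.randint(0, 100), rng.randint(0, 100)) for _ in range(300)]

# Assumed ground truth: a hero is "strong" when attack + defense exceeds 100.
strong = [a + d > 100 for a, d in heroes]


def stump_accuracy(values, labels):
    """Best accuracy of a depth-1 tree (one threshold split) on one feature."""
    n = len(values)
    best = 0.0
    for t in set(values):
        correct = sum((v > t) == y for v, y in zip(values, labels))
        best = max(best, correct / n, 1 - correct / n)
    return best


acc_attack = stump_accuracy([a for a, _ in heroes], strong)
acc_defense = stump_accuracy([d for _, d in heroes], strong)
acc_total = stump_accuracy([a + d for a, d in heroes], strong)

print(f"split on attack:  {acc_attack:.2f}")
print(f"split on defense: {acc_defense:.2f}")
print(f"split on total:   {acc_total:.2f}")
```

A tree splits on one feature at a time, so with only "attack" and "defense" it has to approximate the diagonal boundary with many axis-aligned splits. With "total" available, one split is enough in this setup. That is the sense in which the same engineered feature is redundant for a linear model yet useful for a tree.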
