Solved – What machine learning algorithms are good for estimating which features are more important

feature selectionmachine learning

I have data with a minimum number of features that don't change, and a few additional features that can change and have a big impact on the outcome. My data-set looks like this:

Features are A, B, C (always present), and D, E, F, G, H (sometimes present)

A = 10, B = 10, C = 10                  outcome = 10
A = 8,  B = 7,  C = 8                   outcome = 8.5
A = 10, B = 5,  C = 11, D = 15          outcome = 178
A = 10, B = 10, C = 10, E = 10, G = 18  outcome = 19
A = 10, B = 8,  C = 9,  E = 8,  F = 4   outcome = 250
A = 10, B = 11, C = 13, E = 8,  F = 4   outcome = 320
...

I want to predict the outcome value, and the combination of additional parameters is very important for determining the outcome. In this example, the presence of E and F leads to a big outcome, whereas the presence of E and G doesn't. What machine learning algorithms or techniques are good to capture this phenomenon ?

Best Answer

This is one of the main areas of research in Machine Learning and it is known as Feature Selection.

In general, the only way to say what the best subset of features is (for input into some predictive model that can combine them), is to try all possible subsets. This is usually impossible, so people try to sample the space of feature subsets by various heuristics (see the article for some typical approaches).

Related Question