Solved – Which data transformation can improve the performance of MLP neural networks for classification

I am trying to fit several MLP neural network models with a single hidden layer using the caret R package. My main concern right now is the preprocessing step. My training data features (16 in total) are either right-skewed or left-skewed, as shown in the following image:

Given this, I am considering transforming my data. I have experimented with the following transformations and plotted the results.

Log, center and scale: my concern with combining the log transformation with centering and scaling is that the x-scales of the features do not end up in the same range (compare with the next preprocessing technique). In addition, one of the variables (c._TE.) contains zero values, so the log transformation converts them to -Inf.
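The -Inf problem comes from taking the logarithm of an exact zero. One common workaround, shown here only as a sketch and not as the asker's method, is to use log(1 + x) instead of log(x). The example below is plain Python rather than caret, the feature values are made up for illustration, and it assumes the feature is non-negative.

```python
import math

# Hypothetical right-skewed feature that contains an exact zero
# (stand-in values, not the asker's c._TE. column).
feature = [0.0, 0.5, 3.0, 120.0]

# log1p(x) computes log(1 + x), so a zero maps to 0.0 rather than -Inf.
log1p_feature = [math.log1p(v) for v in feature]

print(log1p_feature)
```

The shift by 1 is itself a modelling choice: it compresses small values less than a plain log would, so it is worth checking the resulting histogram before centering and scaling.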

Box-Cox, center and scale: this looks right, but I have found few references showing that the Box-Cox transformation is appropriate for neural networks used for classification:
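For readers unfamiliar with what Box-Cox does, here is a minimal sketch in plain Python. It applies the standard transform (x^λ - 1)/λ, with log(x) at λ = 0, and picks λ by maximizing the profile log-likelihood over a grid. The grid from -2 to 2 in steps of 0.1, the function names, and the sample data are assumptions chosen for illustration; this is not a description of how caret implements it. Note that Box-Cox is only defined for strictly positive values, so a feature containing zeros needs a shift or a different transform first.

```python
import math

def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a given lambda."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if abs(lam) < 1e-12:
        # The limit of (x**lam - 1) / lam as lam -> 0 is log(x).
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]

def box_cox_log_likelihood(values, lam):
    """Profile log-likelihood of lambda, up to an additive constant.

    Assumes the transformed values are not all identical.
    """
    y = box_cox(values, lam)
    n = len(y)
    mean = sum(y) / n
    var = sum((t - mean) ** 2 for t in y) / n
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(v) for v in values)

def estimate_lambda(values, grid=None):
    """Pick the lambda on the grid with the highest profile log-likelihood."""
    if grid is None:
        grid = [i / 10 for i in range(-20, 21)]  # assumed grid: -2.0 to 2.0
    return max(grid, key=lambda lam: box_cox_log_likelihood(values, lam))

# Made-up right-skewed sample: exponentials of symmetric values.
skewed = [math.exp(z) for z in (-2, -1, 0, 1, 2)]
best_lambda = estimate_lambda(skewed)
transformed = box_cox(skewed, best_lambda)
print(best_lambda, transformed)
```

After the transform, centering and scaling are applied exactly as they would be for untransformed features.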

I would really appreciate any feedback or suggestions about the best preprocessing practices for my case.

Best Answer

I don't think left- or right-skewness is your real concern; the large difference in variance between features is. Also, the transformed features don't need to be on exactly the same scale; similar scales work fine. I suggest standardizing your feature matrix to zero mean and unit variance. But why does each of your features have two different axes? What are the labels of the x- and y-axes?
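As a rough illustration of the zero-mean, unit-variance suggestion, the sketch below standardizes each feature column using the mean and sample standard deviation computed from the training data. It is plain Python, not caret; the feature names, values, and helper functions are hypothetical.

```python
import statistics

def fit_standardizer(column):
    """Return the mean and sample standard deviation of a training column."""
    mean = statistics.fmean(column)
    sd = statistics.stdev(column)
    if sd == 0:
        raise ValueError("cannot standardize a constant column")
    return mean, sd

def standardize(column, mean, sd):
    """Center and scale a column with previously fitted parameters."""
    return [(v - mean) / sd for v in column]

# Made-up training features on very different scales.
train = {
    "feature_a": [1.0, 2.0, 3.0, 4.0],
    "feature_b": [100.0, 300.0, 500.0, 900.0],
}

# Fit on the training data only, then reuse the same parameters elsewhere.
params = {name: fit_standardizer(col) for name, col in train.items()}
train_scaled = {name: standardize(col, *params[name]) for name, col in train.items()}

print(train_scaled)
```

The same fitted means and standard deviations should be reused for validation and test data, so that no information from those sets leaks into preprocessing.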
