Solved – Normalization of data with a skewed distribution

deep-learning, normalization, skewness

I have a data set with 10 features (columns). I want to normalize the data before feeding it to a deep learning model.

Each column has a different distribution, but overall they are all skewed!

1) Which normalization method should I use?

2) Can I use different normalization methods on different columns, or should it be the same for all?

Best Answer

It is useful to keep in mind the usual reason for transforming features (assuming Gaussian errors): reshaping their PDFs to enhance the symmetry of the error structure. It is not a violation of any deep learning or multivariate analytic assumption to apply different transformations ("normalizations") to different features. In other words, nowhere is it written that you have to use the same transformation for every feature. Sometimes it is convenient to do so, as with multiplicative, log-log modeling, but it is definitely not required.
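As a minimal sketch of this point (column names and distributions are hypothetical, and NumPy/pandas are assumed rather than anything from the question), applying a different transformation to each feature is straightforward:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Hypothetical skewed features: "income" is strictly positive and
# right-skewed; "balance" is heavy-tailed and can be negative.
df = pd.DataFrame({
    "income": rng.lognormal(mean=10, sigma=1, size=1000),
    "balance": rng.normal(0, 1, size=1000) ** 3,
})

# A different transformation per column: log for strictly positive data,
# inverse hyperbolic sine for data that can be zero or negative.
transformed = pd.DataFrame({
    "income": np.log(df["income"]),
    "balance": np.arcsinh(df["balance"]),
})
```

The point is only that nothing constrains you to one function; each column can get whatever transformation suits its own distribution.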

It's impossible to recommend a specific transformation in the abstract; it is entirely a data-driven decision. Textbooks have been devoted to the seemingly limitless number of mathematical functions that can be applied. Some common transformations compress the PDF, e.g., square roots or natural logs, while others stretch it, e.g., polynomials. Extreme-valued PDFs can benefit from functions such as the inverse hyperbolic sine or the Lambert W function.
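Since the choice is data-driven, one practical check is to compare sample skewness before and after each candidate transformation. A sketch (assuming NumPy and SciPy, with a simulated right-skewed feature standing in for your data):

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(42)
x = rng.lognormal(mean=0, sigma=1, size=10_000)  # strongly right-skewed

# Compare how much each transformation symmetrizes the distribution.
print(f"raw skewness:     {skew(x):.2f}")
print(f"sqrt skewness:    {skew(np.sqrt(x)):.2f}")     # compresses moderately
print(f"log skewness:     {skew(np.log(x)):.2f}")      # compresses strongly
print(f"arcsinh skewness: {skew(np.arcsinh(x)):.2f}")  # log-like for large values
```

For this lognormal example the log brings skewness near zero, while the square root only partially reduces it; on your own columns the ranking may differ, which is exactly why the decision has to be made from the data.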

The biggest problem with many of these "solutions" is that you can lose the meaning of the original unit in which the variable was scaled. This may not be an issue for deep learning algorithms but can be a problem for insights-focused statistical analysis. In this instance, the cure can be worse than the disease, particularly given that Gaussian models are quite robust to violations of the assumptions.
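When unit interpretability matters, one mitigation (my suggestion, not part of the original answer) is to prefer invertible transformations, so model outputs can be mapped back to the original scale. A sketch with NumPy's `log1p`/`expm1` pair, which also handles zeros:

```python
import numpy as np

x = np.array([0.0, 1.5, 20.0, 3000.0])  # original units, e.g. dollars

z = np.log1p(x)       # transformed values used for modeling
x_back = np.expm1(z)  # back-transform to the original units

# The round trip recovers the original values (up to floating point).
assert np.allclose(x_back, x)
```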