Solved – How to predict output after standardizing data for use in Ridge and Lasso

lasso, python, ridge regression, scikit-learn, standardization

I'm trying out Ridge and Lasso for the feature selection step in machine learning.

I have the training data below:

            input_cap  output_cap   cpu  disk  load   ips
2016-07-01       1.34        4.43  18.1    11  2.75  4863
2016-07-02       1.41        4.56  14.5    11  2.71  4616
2016-07-03       1.37        4.43  16.8    11  2.68  4440
2016-07-04       1.26        3.91  14.0    10  2.77  4047
2016-07-05       1.39        4.68  16.2    11  2.70  4720

and the test data below:

            input_cap  output_cap    ips
2017-04-01       1.93        7.21  10077
2017-04-02       1.91        7.97  10840
2017-04-03       2.06        9.86  12768
2017-04-04       2.09       10.55  13896
2017-04-05       2.04        7.28  12756

I did the following (ignore the datetime index):

# Split training into x_train and y_train
x_train = train.iloc[:, [0,2,3,4,5]]
y_train = train.output_cap

# Split test data into x_test and y_test
x_test = test.ips
y_test = test.output_cap

# I want to find out which feature is most important in predicting
# output_cap. So, I did the following:

from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Ridge, Lasso

# Standardize x_train
scaler = StandardScaler().fit(x_train)
x_train = scaler.transform(x_train)

# Fit x_train and y_train
ridge = Ridge().fit(x_train, y_train)
lasso = Lasso().fit(x_train, y_train)

# Print results for both Ridge and Lasso
print('Ridge: ', ridge.coef_)
print('Lasso:', lasso.coef_)

Ridge:  [-0.13489306  0.33747024  0.37065464  0.27221361  0.94848913]
Lasso: [ 0.          0.          0.          0.          0.15643667]

# From the results, it shows that feature 'ips' is most significant in 
# predicting output_cap.

This is where I'm confused…

Let's assume that, for simplicity, I want to use LinearRegression() from scikit-learn to predict output_cap_yhat from the test data using only the feature 'ips'. My questions are:

  1. Do I need to first convert the standardized x_train back to its original scale, then fit the LinearRegression() and predict?

OR

  2. Do I need to standardize the x_test, then fit the LinearRegression()?

  3. How do I convert the standardized data back into non-standardized (original) data?

Best Answer

Lasso and Ridge penalize the size of the coefficients, and the size of a coefficient depends on the scale of its variable, so it is beneficial to standardize the data before fitting these models. Here's a very helpful answer on this very topic: https://stats.stackexchange.com/a/86435/156469
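To see why scale matters, here is a small sketch on synthetic data (not the data from the question). The features a and b, the noise level, and alpha=0.5 are all assumptions chosen for illustration: both features have the same effect on y per standard deviation, but b is measured in units 1000 times larger.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

# Synthetic data (an assumption for illustration): two features with the
# same per-standard-deviation effect on y, but very different units.
rng = np.random.default_rng(0)
n = 200
a = rng.normal(size=n)             # standard deviation about 1
b = rng.normal(size=n) * 1000.0    # standard deviation about 1000
y = 1.0 * a + 0.001 * b + rng.normal(scale=0.1, size=n)
X = np.column_stack([a, b])

# Same penalty strength, fitted on raw and on standardized features.
raw = Lasso(alpha=0.5).fit(X, y)
std = Lasso(alpha=0.5).fit(StandardScaler().fit_transform(X), y)

# Express the raw coefficients per standard deviation so both fits are comparable.
raw_effect = raw.coef_ * X.std(axis=0)
std_effect = std.coef_

print("raw features:         ", raw_effect)
print("standardized features:", std_effect)
```

On the raw features, the small-unit feature a is shrunk much more than b, even though both matter equally; after standardizing, the penalty treats them about the same.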

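Standardizing a variable as $z=\frac{x-\mu}{\sigma}$ gives you its mean ($\mu$) and standard deviation ($\sigma$). With these you can convert standardized values (including predictions, if the target itself was standardized) back to the original scale: $x=z\sigma+\mu$. In scikit-learn, a fitted StandardScaler stores them as mean_ and scale_, and its inverse_transform method applies this conversion.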
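A minimal sketch of that back-conversion, using only the five training and five test rows shown in the question. The variable names ips_train, ips_test, and so on are illustrative, and the model is a plain LinearRegression on 'ips' as the question proposes. It also shows that the test feature should be transformed with the training $\mu$ and $\sigma$, and that the predictions need no back-conversion when only the features were standardized.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Rows copied from the question's excerpts (feature: ips, target: output_cap).
ips_train = np.array([[4863.0], [4616.0], [4440.0], [4047.0], [4720.0]])
y_train = np.array([4.43, 4.56, 4.43, 3.91, 4.68])
ips_test = np.array([[10077.0], [10840.0], [12768.0], [13896.0], [12756.0]])

# Fit the scaler on the training feature only; it stores mu and sigma.
scaler = StandardScaler().fit(ips_train)
z_train = scaler.transform(ips_train)

# Back-conversion x = z * sigma + mu, by hand and via inverse_transform.
x_manual = z_train * scaler.scale_ + scaler.mean_
x_builtin = scaler.inverse_transform(z_train)
print(np.allclose(x_manual, ips_train), np.allclose(x_builtin, ips_train))

# Reuse the training mu and sigma for the test feature, then predict.
z_test = scaler.transform(ips_test)
model = LinearRegression().fit(z_train, y_train)
y_hat = model.predict(z_test)  # already in output_cap units: y was never scaled

# Ordinary least squares gives the same predictions without any scaling.
y_hat_raw = LinearRegression().fit(ips_train, y_train).predict(ips_test)
print(np.allclose(y_hat, y_hat_raw))
```

For an unpenalized LinearRegression, standardizing the feature changes the coefficient but not the predictions, so either route in the question works as long as the training and test features go through the same transformation.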