Solved – How to scale predictions from a neural network in R when the output is not a part of the dataset

data mining, machine learning, neural networks, r, regression

I've been using a neural network to make predictions. My training data is in one .csv file, which I read in and then scale. My test data is in another file, which I also read in and scale. However, my test data does not contain an output column, because I will be submitting my predictions for it to Kaggle to check whether they are correct. (It is part of this Kaggle competition: https://www.kaggle.com/c/carseatsales).

I am not really sure how to scale my prediction if my test data does not have this output column.

Here is how I scaled the data:

train10           = read.csv("Carseats_training.csv")
train10$ShelveLoc = as.numeric(train10$ShelveLoc)
train10$Urban     = as.numeric(train10$Urban)
train10$US        = as.numeric(train10$US)

maxs  <- apply(train10, 2, max) 
mins  <- apply(train10, 2, min)
index <- sample(1:nrow(train10), round(1*nrow(train10)))

scaled <- as.data.frame(scale(train10, center = mins, scale = maxs - mins))

train100 <- scaled[index,]

test10           = read.csv("Carseats_testing.xls")
test10$ShelveLoc = as.numeric(test10$ShelveLoc)
test10$Urban     = as.numeric(test10$Urban)
test10$US        = as.numeric(test10$US)

maxss  <- apply(test10, 2, max) 
minss  <- apply(test10, 2, min)
index1 <- sample(1:nrow(test10), round(1*nrow(test10)))

scaleds <- as.data.frame(scale(test10, center = minss, scale = maxss - minss))

test100 <- scaleds[index1,]

This is my neural network:

library(neuralnet)

nn <- neuralnet(Sales ~ CompPrice + Income + Advertising + Population + Price
                        + ShelveLoc + Age + Education + Urban + US,
                data = train100,
                hidden = c(5, 3),
                linear.output = TRUE)

I am trying to make a prediction for Sales:

pr.nn <- compute(nn, test100[,2:11])

But now I am not really sure how to scale my result.

I would really appreciate any help. I have been stuck on this part for a while now.

Best Answer

  1. We only (need to) scale the predictor variables. We do this to help the machine learning algorithm converge (faster) to a minimum of the loss function. In the case of a (feed-forward) neural network, the parameters we want to estimate are the weights and biases.
  2. For scaling between $0$ and $1$, we use the following transform for each predictor variable: $$ \tilde{x}_{ij} = \frac{x_{ij} - \min_i(x_{ij})}{\max_i(x_{ij}) - \min_i(x_{ij})}, $$ where the rows are indexed with $i$ and columns with $j$, as is customary.
  3. Given the first point above, one sees that one can simply reuse the minima and maxima found on the training set to scale the predictors in the test set, as in the sketch below.
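
Concretely, here is a minimal sketch in R of points 2 and 3, reusing the neuralnet package, the Carseats column names, and the training file name from the question (the test file name Carseats_testing.csv and the predictors vector are my own assumptions for illustration):

library(neuralnet)

# --- Training data: convert the factor columns to numeric codes, as in the question
train <- read.csv("Carseats_training.csv")
train$ShelveLoc <- as.numeric(train$ShelveLoc)
train$Urban     <- as.numeric(train$Urban)
train$US        <- as.numeric(train$US)

predictors <- c("CompPrice", "Income", "Advertising", "Population", "Price",
                "ShelveLoc", "Age", "Education", "Urban", "US")

# Min-max statistics are computed on the TRAINING predictors only (point 3)
mins <- apply(train[, predictors], 2, min)
maxs <- apply(train[, predictors], 2, max)

# Scale only the predictors (point 1); Sales stays on its original scale
train_scaled <- train
train_scaled[, predictors] <- scale(train[, predictors],
                                    center = mins, scale = maxs - mins)

nn <- neuralnet(Sales ~ CompPrice + Income + Advertising + Population + Price
                        + ShelveLoc + Age + Education + Urban + US,
                data = train_scaled, hidden = c(5, 3), linear.output = TRUE)

# --- Test data: reuse the TRAINING mins and maxs, never the test set's own
test <- read.csv("Carseats_testing.csv")   # assumed file name; adjust to yours
test$ShelveLoc <- as.numeric(test$ShelveLoc)
test$Urban     <- as.numeric(test$Urban)
test$US        <- as.numeric(test$US)

test_scaled <- test
test_scaled[, predictors] <- scale(test[, predictors],
                                   center = mins, scale = maxs - mins)

# Sales was never scaled, so the predictions are already in the original units
pred <- compute(nn, test_scaled[, predictors])$net.result

If you do prefer to scale Sales as well, keep its training minimum and maximum and invert the transform on the network output, e.g. pred * (max_sales - min_sales) + min_sales, which is just the formula in point 2 solved for $x_{ij}$.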