Solved – Is this the correct usage of nnet in R?

neural networks, r

I have a dataset that looks like this:

char    x1 y1 x2 y2 x3 y3 x4 y4 x5 y5 x6 y6 x7 y7 x8 y8 x9 y9   
n   0.1875 0.140625 0.09375 0.515625 0 0.8828125 0.1796875 0.5078125 0.4609375 0.140625 0.640625 0.109375 0.515625 0.5390625 0.3828125 0.8828125 0.671875 0.9765625  
h   0 0 0.046875 0.3125 0.0625 0.671875 0.15625 0.765625 0.328125 0.4609375 0.4609375 0.3359375 0.625 0.4609375 0.6015625 0.7421875 0.53125 0.9921875

I am using the nnet library in R to predict the char column from this data set. The data set is split into 80% (training) and 20% (test).
I have also converted the response char to numeric (1:26) for all rows using the code below:

trainResponse = as.integer(factor(trainingData[,1], levels = letters))

After changing the response column of the training data set, a row where char was "h" now has the value 8:

char  x1  y1        x2   y2        x3        y3       x4        y4        x5
8     0  0 0.0390625 0.25 0.0546875 0.5859375 0.046875 0.8359375 0.1015625
      y5        x6    y6        x7       y7      x8       y8        x9        y9
      0.671875 0.2421875 0.625 0.3515625 0.828125 0.40625 0.921875 0.2890625 0.7578125

# The output above shows one row of modifiedTrainData
modifiedTrainData = cbind(char = trainResponse, trainingData[,2:19])
# Fitting the nnet model
nn1 = nnet(char ~ ., data = modifiedTrainData, size = 20, maxit = 1000,
           range = 0.1, trace = TRUE)

# weights:  401
initial  value 904134.647697  
final  value 846130.000000 
converged

When I ask for the fitted values, I get 1 for every row:
nn1$fitted.values

2662    1
4991    1
445     1
3311    1
2271    1
2622    1
4433    1
3571    1
1269    1

predict again gives results in the same format:

predict(nn1, testData[,2:19])

3543    1
3545    1
3549    1
3554    1

for all the rows in the test data.

However, I am not sure whether I am doing this correctly. How can I tell whether the model predicted the char column correctly on the test data?

Best Answer

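For the most part you are fine. The glaring issue is that there is no reason to convert char to an integer, because nnet accepts factors. With a numeric response, nnet fits a single output unit that is logistic by default (linout = FALSE), so its output cannot exceed 1. With targets running from 1 to 26, the fit simply saturates at 1, which is why you only see 1's reported. As an example: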

set.seed(123)
vars <- as.matrix(replicate(18, rnorm(25)))

# Wrong way
char <- as.integer(factor(rep(letters[1:5], each=5)))
df <- data.frame(char, vars)
head(df)

library(nnet)

nn1 <- nnet(char ~ ., data=df, size=20, maxit=1000, range=0.1, trace=TRUE)
nn1$fitted.values
   [,1]
1     1
2     1
3     1
4     1
5     1
6     1
...

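# Right way: keep the response as a factor
# (since R 4.0.0, data.frame() no longer converts strings to factors,
# so the conversion is done explicitly here)
char <- factor(rep(letters[1:5], each=5))
df <- data.frame(char, vars)

nn2 <- nnet(char ~ ., data=df, size=20, maxit=1000, range=0.1, trace=TRUE)
nn2$fitted.values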
              a            b            c            d            e
1  1.000000e+00 2.281148e-08 2.034399e-11 5.934214e-12 3.212223e-10
2  1.000000e+00 1.568664e-09 3.117289e-14 7.895235e-14 5.656804e-23
3  9.999958e-01 1.666613e-07 6.259551e-08 3.969482e-06 4.485918e-23
4  9.999994e-01 5.522178e-07 3.361721e-10 1.284468e-08 1.236227e-20
5  9.999909e-01 8.255399e-06 2.314208e-09 8.657084e-07 1.898005e-14
6  1.718135e-17 1.000000e+00 6.838461e-14 1.594482e-12 3.872572e-21
...

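When you supply the actual classes as a factor, nnet returns one probability per class for each row, which you can use to predict classes. To check the model on the test data, compare the labels returned by predict with type = "class" against the true char values.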
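
One way to check predictions on held-out data is to ask predict for class labels and tabulate them against the true labels. The following is a minimal sketch, not the asker's actual pipeline: the data are simulated (three letters, 18 features with class-dependent means), and the names dat, train, test, fit, pred, conf_mat and accuracy are made up for illustration. The size and maxit values are arbitrary choices for this toy problem.

```r
library(nnet)

set.seed(123)

# Hypothetical stand-in for the real data: 3 classes, 18 numeric features.
# Each class gets a different mean so there is something to learn.
n_per_class <- 60
classes <- c("a", "b", "c")
char  <- factor(rep(classes, each = n_per_class), levels = classes)
shift <- rep(c(0, 2, 4), each = n_per_class)
vars  <- matrix(rnorm(length(char) * 18, mean = shift), ncol = 18)
dat   <- data.frame(char, vars)

# 80/20 split into training and test sets, as in the question
train_idx <- sample(nrow(dat), size = 0.8 * nrow(dat))
train <- dat[train_idx, ]
test  <- dat[-train_idx, ]

# The response stays a factor, so nnet fits one output per class
fit <- nnet(char ~ ., data = train, size = 5, maxit = 500, trace = FALSE)

# type = "class" returns the most probable label for each test row
pred <- predict(fit, newdata = test, type = "class")

# Confusion matrix: rows are true labels, columns are predicted labels
conf_mat <- table(actual    = test$char,
                  predicted = factor(pred, levels = levels(test$char)))

# Share of test rows whose predicted label matches the true label
accuracy <- mean(pred == test$char)

print(conf_mat)
print(accuracy)
```

On the real data the same pattern should apply with char kept as factor(trainingData[,1], levels = letters) and the test set passed as newdata. The diagonal of the confusion matrix counts correct predictions, and the off-diagonal cells show which letters get confused with each other.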