Solved – R package nnet – neural network question

neural networks

I am experimenting with the neural network package nnet in R and I have some questions.

  1. The regulatory environment I work in requires me to reproduce my results to show them to auditors. How can I reproduce my model results after a few months or years? Can I use a seed value to control the model output?
  2. How can I validate a neural network model? Are there any goodness-of-fit tests?
  3. How do I choose the number of hidden layers I need? I have 18500 observations in my training dataset and 8 variables. Does that help in identifying the number of hidden layers required in any way?
  4. Many times the model stops after 100 iterations. I have used the maxit option, but sometimes the output says converged and sometimes it says stopped after 100 iterations. When it says stopped after 100 iterations, does that mean I have a bad model and that it did not converge?

Thank you

Best Answer

  1. Setting a seed is a good start. nnet draws its initial weights at random, so calling set.seed() right before fitting should give you the same results from the same code and data. It would also be worth recording the version of R and the version of nnet that you're using (as well as any other relevant packages), in case something changes in later versions. If you want to be really careful, you could even archive the current versions so you can go back to them if CRAN ever goes down.

  2. The best evaluation approach usually involves some kind of out-of-sample prediction. Cross-validation is one very good option, as is a simple holdout approach. The caret package automates some of these tools. You have so much data that holding some of it out probably won't hurt you much.

  3. nnet always has one hidden layer. Perhaps you meant the number of hidden units (the size argument)? Cross-validation can be a good option here as well. Just make sure that the data you use to tune the model is not the same data you plan to use to evaluate its performance (as in question #2).

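  4. Convergence means that the parameters are no longer changing appreciably: the optimizer has done as well as it can, given the setup you chose and the random seed used. "Stopped after 100 iterations" means the fit hit the iteration limit (maxit, which defaults to 100) before reaching that point. That does not by itself mean the model is bad, only that the optimizer was cut off. If you want it to run for more iterations, increase maxit. If you want a better fit, try increasing the size of the hidden layer or starting from a different set of initial conditions.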
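
To illustrate point 1, here is a minimal sketch of a seeded fit. The data frame `train`, its columns, and the helper `fit_once` are hypothetical stand-ins, not anything from the question. The idea is that fixing the seed immediately before the call fixes the random starting weights, so two runs of the same code on the same data should agree.

```r
library(nnet)

# Hypothetical stand-in data; replace with the real training set.
set.seed(1)
train <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
train$y <- factor(ifelse(train$x1 + train$x2 + rnorm(200, sd = 0.5) > 0,
                         "yes", "no"))

# Hypothetical helper: fix the seed, then fit.
fit_once <- function(seed) {
  set.seed(seed)  # nnet draws its initial weights from R's random number generator
  nnet(y ~ x1 + x2, data = train, size = 3, maxit = 200, trace = FALSE)
}

fit_a <- fit_once(42)
fit_b <- fit_once(42)

# Same seed, same data, same code: compare the fitted weights.
same_weights <- identical(fit_a$wts, fit_b$wts)
same_weights

# Record the software versions alongside the results.
r_version    <- R.version.string
nnet_version <- as.character(packageVersion("nnet"))
r_version
nnet_version
```

For an audit trail it may also help to store the fitted object itself (for example with saveRDS()) next to the recorded versions. nnet also accepts explicit starting weights through its `Wts` argument, which would remove the dependence on the random number generator altogether.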
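
Points 2 and 3 can be sketched together: cross-validate over a few candidate numbers of hidden units on the training portion, then evaluate the chosen model once on a test set that played no part in the tuning. This is only an illustration under assumptions: the simulated data, the candidate sizes, the decay value, and the 75/25 split are all made up for the example, and the caret package mentioned above can automate much of this.

```r
library(nnet)

set.seed(123)

# Hypothetical stand-in data with 8 predictors; replace with the real data set.
n <- 600
X <- as.data.frame(matrix(rnorm(n * 8), ncol = 8))
names(X) <- paste0("x", 1:8)
X$y <- factor(ifelse(X$x1 - X$x2 + 0.5 * X$x3 + rnorm(n) > 0, "yes", "no"))

# Hold out 25% as a test set that is never used for tuning.
test_idx <- sample(n, size = n / 4)
test  <- X[test_idx, ]
train <- X[-test_idx, ]

# 5-fold cross-validation on the training set to compare hidden-layer sizes.
k <- 5
fold  <- sample(rep(1:k, length.out = nrow(train)))
sizes <- c(1, 2, 4, 8)  # candidate numbers of hidden units (assumed)

cv_error <- sapply(sizes, function(s) {
  mean(sapply(1:k, function(i) {
    fit <- nnet(y ~ ., data = train[fold != i, ], size = s,
                decay = 0.01, maxit = 300, trace = FALSE)
    pred <- predict(fit, newdata = train[fold == i, ], type = "class")
    # Misclassification rate on the held-out fold
    mean(pred != as.character(train$y[fold == i]))
  }))
})

best_size <- sizes[which.min(cv_error)]

# Refit on the full training set and evaluate once on the untouched test set.
final <- nnet(y ~ ., data = train, size = best_size,
              decay = 0.01, maxit = 300, trace = FALSE)
test_pred  <- predict(final, newdata = test, type = "class")
test_error <- mean(test_pred != as.character(test$y))

data.frame(size = sizes, cv_error = cv_error)
test_error
```

The cross-validated errors are used only to pick `best_size`; the test error is computed once at the end, which keeps the performance estimate separate from the tuning.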
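
On point 4, the fitted object records whether the optimizer hit its iteration limit: according to the nnet documentation, the `convergence` component is 1 if the maximum number of iterations was reached and 0 otherwise. The sketch below uses made-up data and arbitrary `maxit` values to compare a fit that is cut off early with one given a larger iteration budget from the same starting weights.

```r
library(nnet)

# Hypothetical stand-in data; replace with the real training set.
set.seed(7)
d <- data.frame(x1 = rnorm(300), x2 = rnorm(300))
d$y <- factor(ifelse(d$x1 * d$x2 + rnorm(300, sd = 0.3) > 0, "a", "b"))

# Same seed before each call, so both fits start from the same weights.
set.seed(7)
fit_short <- nnet(y ~ x1 + x2, data = d, size = 5, decay = 0.01,
                  maxit = 5, trace = FALSE)
set.seed(7)
fit_long  <- nnet(y ~ x1 + x2, data = d, size = 5, decay = 0.01,
                  maxit = 2000, trace = FALSE)

# 1 if the iteration limit was reached, otherwise 0
fit_short$convergence
fit_long$convergence

# Final value of the fitting criterion (including the weight-decay term)
fit_short$value
fit_long$value
```

A `convergence` value of 1 is a prompt to raise `maxit` and refit, not a verdict on model quality; out-of-sample performance is still the thing to judge the model by.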