library(car)    # provides the Prestige data set
library(caret)

set.seed(123)   # make the partition and resampling reproducible

trainIndex <- createDataPartition(Prestige$income, p = 0.7, list = FALSE)
prestige.train <- Prestige[trainIndex, ]
prestige.test  <- Prestige[-trainIndex, ]

# one model will be fit per unique (decay, size) combination
my.grid <- expand.grid(decay = c(0.5, 0.1), size = c(5, 6, 7))

prestige.fit <- train(income ~ prestige + education, data = prestige.train,
                      method = "nnet", maxit = 1000, tuneGrid = my.grid,
                      trace = FALSE, linout = TRUE)  # linout = TRUE for regression

prestige.predict <- predict(prestige.fit, newdata = prestige.test)
prestige.rmse <- sqrt(mean((prestige.predict - prestige.test$income)^2))
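(Incidentally, caret already ships a helper, `postResample`, that computes this hold-out RMSE along with related metrics; a minimal sketch with illustrative stand-in numbers, which you would apply to `prestige.predict` and `prestige.test$income` above:)

```r
library(caret)

# illustrative stand-ins for prestige.predict and prestige.test$income
pred <- c(10500, 8200, 15300)
obs  <- c(11000, 8000, 15000)

# postResample() returns a named vector with RMSE and related metrics
postResample(pred = pred, obs = obs)
```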
The code above was discussed here: *How to train and validate a neural network model in R?*
- Does the caret package fit the model many times, once for each combination of decay and size? If so, what is the default number of iterations?
- What is the final choice of decay and size? When I run `summary(prestige.fit)`, the decay is 0.5 and the size is 5. Is that the final combination that caret chose as the best option?
Best Answer
1) First off, yes: a neural-network model is fit for every unique combination of `decay` and `size` that you supplied in `my.grid` — because, well, you created it. If you had instead specified `my.grid <- data.frame(decay = 0.5, size = 5)`, you would get only one model.

As far as iterations are concerned, you specified the backend modeling function to be `nnet` from the `nnet` package, so each model is fit by an iterative gradient-based optimization that stops once it converges (or diverges). It could take very few iterations if the random starting weights happen to land near a local optimum of the fitting criterion. Alternately, it could iterate 1000 times and fail to converge, since you specified `maxit = 1000`.
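The effect of `maxit` can be seen by calling `nnet` directly; a minimal sketch on synthetic data (the data, network size, and seed here are illustrative, not from the question):

```r
library(nnet)

set.seed(1)
# illustrative regression data, standing in for the Prestige variables
x <- matrix(runif(200), ncol = 2)
y <- x[, 1] + 0.5 * x[, 2] + rnorm(100, sd = 0.05)

# tight cap: the optimizer is cut off after 5 iterations
fit.short <- nnet(x, y, size = 5, decay = 0.1, linout = TRUE,
                  maxit = 5, trace = FALSE)
# generous cap: run until convergence or 1000 iterations
fit.long  <- nnet(x, y, size = 5, decay = 0.1, linout = TRUE,
                  maxit = 1000, trace = FALSE)

fit.short$value  # final fitting criterion (residual SS plus decay penalty)
fit.long$value   # typically smaller, since the optimizer was allowed to finish
```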
That limit applies to the model-fitting process. Also built into this innocuous `train` call is a validation step, where the best row of `my.grid` is picked according to some objective measure and resampling process. The measure is specified by hand or inferred from your outcome: `metric = ifelse(is.factor(y), "Accuracy", "RMSE")`. So for `income` (continuous), caret picks the grid combination that minimized root-mean-square error (a minimal bias/variance trade-off, a good starting place). The resampling method defaults to the bootstrap, with a staggering default of just 25 resamples; see `?trainControl`. Considering this is parallelized, I'm shocked. In my thesis, we weren't quite happy with 10,000 resamples.

2) Your specific model gives the combination in `my.grid` for which the RMSE, pooled over the default bootstrap resamples, is minimal.
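To answer question 2 concretely: the selected combination lives in the fitted object's `bestTune` element, and the resampling scheme can be changed through `trainControl`. A minimal sketch (the seed and the switch to 10-fold cross-validation are illustrative choices, not caret defaults):

```r
library(car)    # Prestige data
library(caret)

set.seed(123)
idx  <- createDataPartition(Prestige$income, p = 0.7, list = FALSE)
grid <- expand.grid(decay = c(0.5, 0.1), size = c(5, 6, 7))

# 10-fold cross-validation instead of the default 25-resample bootstrap
ctrl <- trainControl(method = "cv", number = 10)

fit <- train(income ~ prestige + education, data = Prestige[idx, ],
             method = "nnet", maxit = 1000, tuneGrid = grid,
             trace = FALSE, linout = TRUE, trControl = ctrl)

fit$bestTune   # the single (decay, size) row caret selected
fit$results    # mean resampled RMSE for every row of the grid
```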