Solved – Prediction Interval for Neural Net With Hessian :: nnet in R

maximum-likelihood, neural-networks, prediction-interval, regression

I would like to construct a confidence interval around a prediction from a neural network, without resorting to bootstrapping, given its computational cost. Can I use the Hessian that nnet can return (sketched below) to produce a 95% CI?
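By "the Hessian returned" I am thinking of the Hessian of the fitting criterion that nnet can hand back via its Hess argument, something like this toy fit (data and settings invented just for illustration):

    library(nnet)
    set.seed(1)
    x <- matrix(runif(200), ncol = 2)
    y <- 0.5 + 0.4 * sin(2 * pi * x %*% c(0.4, 0.6)) + rnorm(100, sd = 0.05)

    fit <- nnet(x, y, size = 3, decay = 5e-4, maxit = 500, Hess = TRUE)
    H <- fit$Hessian   # Hessian of the fitting criterion w.r.t. the weights
    dim(H)             # p x p, with p = length(fit$wts)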

1) Can you take the negative inverse of the Hessian as the variance-covariance matrix? I have read that this depends on whether the objective is being maximized or minimized (see the toy sketch after these questions). Is this true, and how can you know for certain?

2) Is this an accepted routine for producing a confidence interval around a prediction, and if so, how would you do it?
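For reference, this is the kind of Hessian-to-covariance step I have in mind, on a toy Gaussian likelihood fitted with optim (the example is mine, just to make the question concrete):

    # As I understand it: optim() MINIMIZES, and the objective below is the
    # NEGATIVE log-likelihood, so the plain inverse of the returned Hessian
    # estimates the variance-covariance matrix of the parameters; if the
    # log-likelihood itself were maximized, the negative inverse would be used.
    set.seed(1)
    z <- rnorm(200, mean = 2, sd = 1.5)
    negLogLik <- function(par) -sum(dnorm(z, mean = par[1], sd = exp(par[2]), log = TRUE))

    mle <- optim(c(0, 0), negLogLik, hessian = TRUE)
    vcovHat <- solve(mle$hessian)      # approximate var/covar of (mean, log sd)
    se <- sqrt(diag(vcovHat))
    cbind(estimate = mle$par,
          lower = mle$par - 1.96 * se,
          upper = mle$par + 1.96 * se) # ~95% Wald intervals for the parameters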

Best Answer

Try the nnetpredint package. https://cran.r-project.org/web/packages/nnetpredint/

I ran into the same problem and also wanted to construct prediction intervals for neural networks, so I developed the nnetpredint R package using the method from the related papers listed below. Instead of the Hessian, it uses the Jacobian matrix (the first-order derivatives of the network output with respect to the weights, evaluated over the training data) to estimate the model error. The package manual is on CRAN, and the method provides a function interface to models trained with the nnet, neuralnet and RSNNS packages.

Here is an example for the nnet package. The nnetPredInt method takes the model weights, the number of nodes in each layer, the training data, and so on as input, and computes prediction intervals for new data:

    install.packages("nnetpredint")

    # Example: Using the nnet object trained by nnet package
    library(nnet)
    xTrain <- rbind(cbind(runif(150, min = 0,   max = 0.5), runif(150, min = 0,   max = 0.5)),
                    cbind(runif(150, min = 0.5, max = 1),   runif(150, min = 0.5, max = 1)))
    nObs <- dim(xTrain)[1]
    yTrain <- 0.5 + 0.4 * sin(2 * pi * xTrain %*% c(0.4, 0.6)) + rnorm(nObs, mean = 0, sd = 0.05)
    plot(xTrain %*% c(0.4, 0.6), yTrain)

    # Training nnet models
    net <- nnet(yTrain ~ xTrain, size = 3, rang = 0.1, decay = 5e-4, maxit = 500)
    yFit <- c(net$fitted.values)
    nodeNum <- c(2,3,1)
    wts <- net$wts

    # New data for prediction intervals
    library(nnetpredint)
    newData <- cbind(seq(0,1,0.05),seq(0,1,0.05))
    yTest <- 0.5 + 0.4 * sin(2 * pi * newData %*% c(0.4, 0.6)) + rnorm(dim(newData)[1], mean = 0, sd = 0.05)

    # S3 generic method: Object of nnet
    yPredInt <- nnetPredInt(net, xTrain, yTrain, newData, alpha = 0.05) # 95% prediction interval
    print(yPredInt[1:20,])

    # S3 default method for user defined input
    yPredInt2 <- nnetPredInt(object = NULL, xTrain, yTrain, yFit, node = nodeNum, wts = wts,
                             newData, alpha = 0.05, funName = 'sigmoid')

    plot(newData %*% c(0.4, 0.6), yTest, type = 'b')
    lines(newData %*% c(0.4, 0.6), yPredInt$yPredValue, type = 'b', col = 'blue') # point prediction
    lines(newData %*% c(0.4, 0.6), yPredInt$lowerBound, type = 'b', col = 'red')  # lower bound
    lines(newData %*% c(0.4, 0.6), yPredInt$upperBound, type = 'b', col = 'red')  # upper bound

The key to the estimation method:

Expand f(x) in a first-order Taylor series around the fitted weight parameters, and compute the gradient vector / Jacobian matrix of the network output with respect to the weights over the training data.
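To make that concrete, here is a rough base-R sketch of the first-order (delta-method) calculation, written for this answer rather than taken from the package's internals. It reuses net, xTrain, yTrain and newData from the example above, assumes nnet's usual weight ordering (bias first, then inputs, for each hidden unit, then the output unit) and a sigmoid output unit, and the helpers nnForward and gradW are ad-hoc names:

    sigmoid <- function(z) 1 / (1 + exp(-z))

    # Forward pass of the fitted 2-3-1 network, following nnet's weight order:
    # b->h1, i1->h1, i2->h1, b->h2, ..., b->o, h1->o, h2->o, h3->o
    nnForward <- function(w, x) {
        h <- sigmoid(c(w[1] + sum(w[2:3] * x),
                       w[4] + sum(w[5:6] * x),
                       w[7] + sum(w[8:9] * x)))
        sigmoid(w[10] + sum(w[11:13] * h))
    }

    # Simple forward-difference gradient of f at w
    gradW <- function(f, w, eps = 1e-6) {
        f0 <- f(w)
        sapply(seq_along(w), function(j) {
            wj <- w; wj[j] <- wj[j] + eps
            (f(wj) - f0) / eps
        })
    }

    wHat <- net$wts
    p <- length(wHat)   # 13 weights for a 2-3-1 network

    # Jacobian of the fitted values w.r.t. the weights, one row per training point
    J <- t(sapply(seq_len(nrow(xTrain)),
                  function(i) gradW(function(w) nnForward(w, xTrain[i, ]), wHat)))

    resid <- yTrain - apply(xTrain, 1, function(x) nnForward(wHat, x))
    s2 <- sum(resid^2) / (nrow(xTrain) - p)          # residual variance estimate

    # 95% prediction interval for the first new point
    JtJinv <- solve(crossprod(J) + diag(1e-8, p))    # small ridge keeps J'J invertible
    x0 <- newData[1, ]
    g0 <- gradW(function(w) nnForward(w, x0), wHat)
    yHat <- nnForward(wHat, x0)
    half <- as.numeric(qt(0.975, nrow(xTrain) - p) *
                       sqrt(s2 * (1 + t(g0) %*% JtJinv %*% g0)))
    c(fit = yHat, lower = yHat - half, upper = yHat + half)

This crude sketch will not reproduce nnetPredInt exactly, but it shows the shape of the first-order calculation described above.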

References:

De Veaux, R. D., Schumi, J., Schweinsberg, J., and Ungar, L. H. (1998), "Prediction intervals for neural networks via nonlinear regression", Technometrics, 40(4): 273-282.

Chryssolouris, G., Lee, M., and Ramsey, A. (1996), "Confidence interval prediction for neural network models", IEEE Transactions on Neural Networks, 7(1): 229-232.

For the detailed mathematics, also see "Confidence Intervals for Neural Networks and Applications to Modeling Engineering Materials": http://cdn.intechopen.com/pdfs-wm/14915.pdf