MATLAB: How can I analyze the neural network?

neural network analysis

I have created a newpr network with two hidden layers, and I want to analyze the network. I look at the confusion plot and its results, and also at the performance plot, but I do not understand how to judge whether the result is good or not. The mean squared error is going down and the test confusion matrix shows 83.5; is that good?
Another thing: why is there a validation confusion matrix? And what does "Best Validation Performance is 0.10113 at epoch 26" mean? Is that good?
Any help would be appreciated.
thanks a lot

Best Answer

See the comp.ai.neural-nets FAQ.
Consider
1. Randomly draw all of the data from the same probability distribution. Use all of the data to
a. Train many candidate models: Typically, by just using different numbers of hidden nodes, H, and many different weight initializations for each value of H; but sometimes also varying other design parameters.
b. Choose the best of all the models.
c. Use the performance of the best model to estimate performance on unseen data.
Using the same data to perform all three tasks often leads to naive, optimistically biased estimates of performance on unseen data.
The final result can be disastrous performance on unseen data, loss of clients, reputation and employment, bankruptcy of the company and failure of the economy.
A simple way to avoid the optimistic estimates and associated repercussions is to use a sufficiently large data set. However, in most real-world situations, that much data is either not available or too unwieldy.
The most common approach is to perform the three tasks using three separate subsets of the data.
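In MATLAB the pattern recognition tools will make that three-way split for you. A minimal sketch (patternnet is the current replacement for newpr; x, t and the choice of 10 hidden nodes are just illustrative):
net = patternnet(10);               % one hidden layer with 10 nodes
net.divideFcn = 'dividerand';       % random division (the default)
net.divideParam.trainRatio = 0.70;  % task a: training
net.divideParam.valRatio   = 0.15;  % task b: validation
net.divideParam.testRatio  = 0.15;  % task c: test estimate
[net, tr] = train(net, x, t);       % tr records the split: tr.trainInd, tr.valInd, tr.testInd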
Now, consider the following
1. Randomly split the data into a design subset (for tasks a and b) and a test subset (for task c).
total = design + test
2. Set the test subset aside to be used ONCE and ONLY ONCE for task c.
3. Randomly split the design set into training (task a) and validation (task b) subsets.
design = training + validation
4. Perform tasks a, b, and c.
5. If the estimated performance is unsatisfactory, either accept what you have or return to step 1.
6. Steps 1 to 5 can be repeated 32 or more times so that histograms and confidence levels on the performance estimate can be obtained.
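One way to approximate step 6 with the toolbox's built-in division is sketched below (x, t, the 10 hidden nodes and 32 repetitions are illustrative assumptions; each call to train redraws a random dividerand split and new initial weights):
Nrep = 32;
MSEtst = zeros(1, Nrep);
for i = 1:Nrep
    net = patternnet(10);                 % new random initial weights each repetition
    net.performFcn = 'mse';               % measure performance as mean squared error
    [net, tr] = train(net, x, t);         % new random train/val/test split each repetition
    ytst = net(x(:, tr.testInd));         % outputs on the held-out test subset
    MSEtst(i) = perform(net, t(:, tr.testInd), ytst);   % test-set MSE for this repetition
end
histogram(MSEtst)                         % basis for histograms and confidence levels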
Very often the training set is too small to be representative of unseen data and/or too small to support the accurate estimation of a large number of weights. Consequently, as training proceeds, the tendency for MSEtrn to decrease is not representative of performance on unseen data.
There are several ways to avoid this. They are discussed in the comp.ai.neural-nets FAQ. The method used here is called Early Stopping (aka "Stopped Training"); I like my own term "Validation Stopping". Training stops when the error on the nontraining validation set reaches a minimum. MATLAB's default definition of Early Stopping is that the validation error has failed to decrease for 6 consecutive epochs. The default value of 6 can be overridden if you wish.
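In the toolbox the stopping criterion and the quantities on the performance plot live in trainParam and the training record tr. A sketch (the value 20 is only an example of an override):
net = patternnet(10);
net.trainParam.max_fail = 6;     % default: stop after 6 consecutive validation failures
% net.trainParam.max_fail = 20;  % example: relax the stopping criterion
[net, tr] = train(net, x, t);
tr.best_epoch                    % epoch at which the validation error was smallest
tr.best_vperf                    % the "Best Validation Performance" shown by plotperform(tr)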
Finally, an unbiased estimate of performance on unseen data is obtained from the test set.
The design set should be used repeatedly (randomly re-splitting it into training + validation subsets) to find the best combination of hidden nodes, initial weights, number of training epochs and other training parameters; a sketch follows.
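Here is a hedged sketch of that design loop, assuming the design inputs and targets xd, td have already been separated from the test subset, and with illustrative choices for Hvec and Ntrials:
Hvec = 2:2:20;                   % candidate numbers of hidden nodes
Ntrials = 10;                    % random weight initializations per candidate
MSEval = zeros(Ntrials, numel(Hvec));
for j = 1:numel(Hvec)
    for i = 1:Ntrials
        net = patternnet(Hvec(j));              % fresh random initial weights each trial
        net.performFcn = 'mse';                 % measure performance as mean squared error
        net.divideParam.trainRatio = 0.85;      % design = training + validation
        net.divideParam.valRatio   = 0.15;
        net.divideParam.testRatio  = 0;         % the test subset is never touched here
        [net, tr] = train(net, xd, td);
        yval = net(xd(:, tr.valInd));
        MSEval(i, j) = perform(net, td(:, tr.valInd), yval);   % validation MSE
    end
end
[bestMSEval, k] = min(MSEval(:));   % smallest validation MSE; ind2sub(size(MSEval), k) recovers trial and H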
The test set is used to obtain an unbiased estimate of performance on unseen data.
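To turn the test confusion plot into a single number like the 83.5 quoted in the question, something like this sketch can be used (net, tr, x, t as above):
ytst = net(x(:, tr.testInd));
ttst = t(:, tr.testInd);
[c, cm] = confusion(ttst, ytst);   % c = fraction misclassified, cm = confusion matrix counts
PctCorrectTest = 100 * (1 - c)     % overall percent correct on the test subset
plotconfusion(ttst, ytst)          % the same information graphically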
Many repetitions can be used to obtain confidence intervals.
Hope this helps.
Thank you for officially accepting my answer.
Greg
P.S. There are many valid variations of the above design procedure.