Hello. Could you please let me know what "Hub" means in NN training? I have seen that several criteria are used for setting NN training parameters, like Hub, Nw, Ntr, ... Could you please suggest some references (papers, books) for those who are interested to learn more about these terms? Best
MATLAB: A question on neural networks
Tags: Deep Learning Toolbox, neural network
Related Solutions
Poorly worded question. Are we supposed to guess:
1. That you are referring to a classifier?
2. Which MATLAB function you are using ... patternnet?
3. The number of classes c?
4. The dimensionality of the inputs I?
5. The number of hidden nodes H?
6. The trn/val/tst ratio 0.7/0.15/0.15?
Overfitting only means that you have more unknown weights than training equations:

Nw > Ntrneq
where
Ntrneq = Ntrn*c
Nw = (I+1)*H + (H+1)*c
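Incidentally, if the "Hub" in the original question is the upper bound on the number of hidden nodes H for which the net is not overfit (my assumption here), it follows directly from requiring Nw <= Ntrneq. A minimal MATLAB sketch with hypothetical values:

I = 9;                                  % input dimensionality (assumed)
c = 3;                                  % number of classes (assumed)
H = 10;                                 % number of hidden nodes (assumed)
Ntrn = 70;                              % number of training cases (assumed)
Ntrneq = Ntrn*c                         % number of training equations (210)
Nw = (I+1)*H + (H+1)*c                  % number of unknown weights (133): not overfit
% Requiring Nw <= Ntrneq and solving for H gives the upper bound
Hub = floor((Ntrneq - c)/(I + c + 1))   % = 15 for these values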
Validation stopping has nothing to do with training-data performance. It has to do with

OVERTRAINING AN OVERFIT NET

It means that training has reached the point where validation-set performance (MSE or cross-entropy) has passed a local minimum, indicating that if you don't stop, you will have overtrained an overfit net to the point where further training will probably make the net perform worse on validation, test, and unseen non-training data.
Remember:
The goal of design is to use training data to obtain a net that works well on all non-training data:
validation + test + unseen
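For the record, a minimal sketch of validation stopping with the 0.7/0.15/0.15 division mentioned above (the toy data and H = 10 are assumed for illustration):

[x, t] = simplefit_dataset;          % toy curve-fitting data shipped with the toolbox
net = fitnet(10);                    % hypothetical H = 10
net.divideFcn = 'dividerand';        % random trn/val/tst division (the default)
net.divideParam.trainRatio = 0.70;
net.divideParam.valRatio   = 0.15;
net.divideParam.testRatio  = 0.15;
net.trainParam.max_fail = 6;         % stop after 6 consecutive validation-error rises
[net, tr] = train(net, x, t);
tr.best_epoch                        % epoch with the minimum validation error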
Hope this helps.
Thank you for formally accepting my answer
Greg
K-FOLD CROSS-VALIDATION IS NOT A CURE FOR THE ILLS OF AN OVERFIT NET.
BACKGROUND:
1. OVERFITTING:
Nw > Ntrneq
where
Nw = number of unknown weights
Ntrneq = number of training equations
2. GENERALIZATION:
If y0 is a solution of the system of equations f(x0, y) = 0, then the system generalizes well if

0 = f(x0 + dx, y) ==> y = y0 + dy

for NON-INFINITESIMALLY SMALL dx and dy.
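For a concrete (hypothetical) illustration: take f(x0, y) = y - x0^2, whose solution is y0 = x0^2. A non-infinitesimal input perturbation dx changes the solution by only dy = 2*x0*dx + dx^2, which stays small when dx is small, so this system generalizes well near x0.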
3. NONUNIQUENESS and INSTABILITY

Solutions to overfit systems are typically not unique; in fact, there are usually infinitely many. More importantly, the non-uniqueness can lead to instability and poor generalization of iterative solutions: many of the solutions do not generalize well, and iterative training of an overfit system can converge to one of the inappropriate ones. I call this problem
4. OVERTRAINING AN OVERFIT NET
There are several approaches to avoid overtraining an overfit net (a sketch of option c follows this list):

a. NON-OVERFITTING: Do not overfit the net in the first place by using the rule Ntrneq >= Nw.

b. STOPPED TRAINING: Use train/val/test data division and STOP TRAINING when the validation-subset error increases continually for a prespecified number of epochs (the MATLAB default is 6). This technique is used in the LEVENBERG-MARQUARDT and SCALED-CONJUGATE-GRADIENT training functions TRAINLM and TRAINSCG, respectively.

c. BAYESIAN REGULARIZATION: Constrain the size of the weights by adding to the minimization function a penalty term proportional to the squared Euclidean norm of the weights. Although this technique is the default in the training function TRAINBR, it can be specified with other training functions.
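A minimal sketch of option c (the data and H are assumed; TRAINBR does not use validation stopping, so all of the data can be assigned to training):

[x, t] = simplefit_dataset;      % toy data shipped with the toolbox
net = fitnet(10, 'trainbr');     % hypothetical H = 10
net.divideFcn = 'dividetrain';   % no val/tst split needed for trainbr
[net, tr] = train(net, x, t);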
5. Perhaps you confused k-fold CROSS-VALIDATION with the DATA-DIVISION STOPPED TRAINING technique used to avoid overtraining an overfit net. It is not the same thing. See below.
6. K-FOLD CROSS-VALIDATION

a. This widely known technique is not offered in the MATLAB NN TOOLBOX.

b. Nonetheless, my use of the CROSSVAL and CVPARTITION functions from other toolboxes can be found in both the NEWSGROUP and ANSWERS by including "greg" as a search word with cross validation, cross-validation and crossvalidation.
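For reference, a minimal sketch of CVPARTITION from the Statistics and Machine Learning Toolbox, with an assumed 10-fold split:

load fisheriris                        % toy classification data shipped with that toolbox
cvp = cvpartition(species, 'KFold', 10);
for k = 1:cvp.NumTestSets
    trnInd = training(cvp, k);         % logical indices of fold k's training cases
    tstInd = test(cvp, k);             % logical indices of fold k's test cases
    % ... design and evaluate a net on this fold ...
end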
7. However, instead of using other toolboxes to implement k-fold cross-validation, I compensate by using m multiple designs (typically 10 <= m <= 30) that differ only by a random division of training, validation and test subsets, in addition to the default random selection of initial weights.
8. My technique is trivial to implement (a sketch follows this list). Given an I-H-O net topology:

a. Initialize the random number generator so that designs can be duplicated.

b. Store the current state of the RNG at the beginning of the loop so that any design can be recreated at a later date without regenerating the others.

c. Design a net and store the performance results (e.g., the Normalized Mean Square Error, NMSE). Storing the net is not necessary since it is easily redesigned given the stored state of the RNG.
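A minimal sketch of that loop (the data, H = 10 and m = 10 below are assumed for illustration):

[x, t] = simplefit_dataset;            % toy data shipped with the toolbox
H = 10;  m = 10;                       % assumed topology and number of designs
rng(0)                                 % a. reproducible designs
state = cell(m, 1);
NMSE  = zeros(m, 1);
MSE00 = mean(var(t', 1));              % reference MSE of the naive constant-output model
for i = 1:m
    state{i} = rng;                    % b. store the RNG state so design i can be recreated
    net = fitnet(H);                   % random initial weights + random trn/val/tst division
    [net, tr] = train(net, x, t);
    y = net(x);
    NMSE(i) = mean((t - y).^2)/MSE00;  % c. normalized mean square error
end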
Hope this helps.
Thank you for formally accepting my answer
Greg