MATLAB: Neural network is giving incorrect results after 99.4% training. Why?

neural network, speech recognition, urdu

Hello everyone,
I'm developing an automatic Urdu speech recognizer. I've taken 28 samples (7 males, 7 females, 2 recordings from each). After applying filters (spectral subtraction, silence removal) and the DTW algorithm, I calculated 42 MFCC features and then applied the k-means algorithm to them.
Now I feed the resulting vector to the neural network as follows:
e = 0.2; % initial error value
net = newff(minmax(input),Tar,[2 5],{'tansig','logsig'},'traingdx');
% 2 is the number of hidden-layer neurons and 5 the number of output-layer
% neurons, since I have to classify 5 words. Tar is the target matrix.
net.divideParam.trainRatio = 0.7; % ANN will take 70% of the data for training
net.divideParam.testRatio  = 0.3; % and 30% for testing
net.trainParam.epochs = 1160;     % maximum epochs
net.trainParam.goal = mean(var(Tar'))/100;
while e >= 0.0260
    [net,tr] = train(net,input,Tar);
    test = sim(net,input);
    temp = round(test);
    e = Tar - test;
    disp('error');
    e = mse(e)
end
Now the confusion matrix gives me a 99.3% result, but when I test the network on the same samples it gives wrong answers. What should I do to get the right results, and why is it giving incorrect results even after 99.3% training accuracy?

Best Answer

Ali on 31 Jan 2012 at 10:13
Thank you.
OK, now I have excluded the irrelevant information.
I'm using MATLAB 7.9 (R2009b). The newff function is described in the documentation as:
newff(P,T,[S1 S2...S(N-l)],{TF1 TF2...TFNl}, BTF,BLF,PF,IPF,OPF,DDF)
TYPO: REPLACE "N-l" WITH "N-1", WHERE N IS THE NUMBER OF WEIGHT LAYERS. NOTE: "LAYERS" MEANS LAYERS OF WEIGHTS, NOT LAYERS OF NODES.
P : R x Q1 matrix of Q1 sample R-element input vectors
TAKE A GOOD LOOK AT THE ABOVE STATEMENT AND SEE WHY YOUR VERSION BELOW IS INVALID
T : SN x Q2 matrix of Q2 sample SN-element target vectors
THE DOCUMENTATION SHOULD EMPHASIZE Q2 = Q1!
Si : Size of ith layer, for N-1 layers, default = []. (Output layer size SN is determined from T.)
TFi: Transfer function of ith layer (Default = 'tansig' for hidden layers and 'purelin' for output layer.)
BTF : Backpropagation network training function
DEFAULT = 'TRAINLM'
BLF : Backpropagation weight/bias learning function (default = 'learngdm')
PF: Performance function. (Default = 'mse')
IPF: Row cell array of input processing functions.
OPF: Row cell array of output processing functions.
DDF: Data division function (default = 'dividerand')
In my newff call I took the default values after BTF; that's why I didn't write those parameters.
WHY DIDN'T YOU USE THE DEFAULT BTF 'TRAINLM'?
net=newff(minmax(input),Tar,[2 5],{'tansig','logsig'},'traingdx')
INVALID. SEE ABOVE COMMENTS. APPARENTLY YOU ARE CONFUSING YOUR VERSION OF NEWFF WITH THE EARLIER ONE FROM CIRCA 2005 OR SO.
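[For reference, a sketch of what the R2009b-style call could look like, following the documented signature quoted above: pass the raw input matrix rather than its minmax, and list only the hidden-layer sizes, since the output size is inferred from the targets. The split ratios here are illustrative, not a tested fix.]

```matlab
% R2009b newff: first argument is the input matrix P itself, not minmax(P),
% and [S1 ... S(N-1)] lists only HIDDEN layer sizes -- the output layer
% size (5) is taken from the rows of Tar.
net = newff(input, Tar, 2, {'tansig','logsig'});  % BTF defaults to 'trainlm'
net.divideParam.trainRatio = 0.70;
net.divideParam.valRatio   = 0.15;  % illustrative split; adjust as needed
net.divideParam.testRatio  = 0.15;
[net, tr] = train(net, input, Tar);
```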
input is my input matrix of size [7800 140]
DOESN'T THE SIZE OF A 7800 DIMENSIONAL INPUT VECTOR BOTHER YOU? WHAT ABOUT 140 OF THEM?
Tar is my target matrix of size [5 140].
tansig is the hidden-layer transfer function and logsig the output-layer transfer function.
traingdx is the network training function.
WHY NOT THE DEFAULT TRAINLM?
NUMBER OF UNKNOWN WEIGHTS?
Doesn't the newff function randomly initialize the weights and update them? (In my case the learning function is the default, learngdm.)
YES. BUT THE QUESTION IS: HOW MANY WEIGHTS ARE YOU TRYING TO ESTIMATE? FOR A 7800-2-5 NODE TOPOLOGY I COUNT NW = (7800+1)*2 + (2+1)*5 = 15,617 !
NUMBER OF TRAINING EQUATIONS?
Did you mean the training function? It is traingdx.
NO. HOW MANY EQUATIONS ARE YOU USING TO ESTIMATE 15,617 UNKNOWNS? I COUNT ONLY NEQ = 140*5 = 700
DOES 700 EQUATIONS AND 15,617 UNKNOWNS MAKE YOU FEEL UNEASY?
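[Greg's bookkeeping can be reproduced directly. A sketch of the arithmetic, assuming an I-H-O = 7800-2-5 topology with Q = 140 training cases:]

```matlab
% Rough capacity check: count unknown weights vs. training equations.
I = 7800;  H = 2;  O = 5;  Q = 140;   % inputs, hidden, outputs, cases
Nw  = (I+1)*H + (H+1)*O               % unknown weights (incl. biases): 15617
Neq = Q*O                             % training equations: 700
ratio = Neq/Nw                        % ~0.045 -- far too few equations
```

With roughly 22 times more unknowns than equations, the network can memorize the training set without learning anything that generalizes.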
YOU HAVE 14 PEOPLE SAYING 2 WORDS EACH. YOU HAVE TO CLASSIFY 5 WORDS. HOW DOES THAT BREAK DOWN?
Sorry, my previous statement was not clear. What I meant to say is that I took recordings of five words from 14 people, and each person recorded each word twice.
TAR = ? ..... PLEASE CUT AND PASTE
a = [1 0 0 0 0; 0 1 0 0 0; 0 0 1 0 0; 0 0 0 1 0; 0 0 0 0 1];
EYE(5)
Tar = repmat(a,1,28);
% size(Tar) = [ 5 140]
WHY DO YOU THINK YOU NEED A WHILE STATEMENT ??
Because the weights are randomly initialized, the neural network gives different results each time I run the program. We never know at which weight values it will give good results, so in order to catch a good result I put the training inside a while loop that stops whenever the error gets below the threshold.
THAT IS DONE AUTOMATICALLY VIA NET.TRAINPARAM.GOAL.
WHY 1160?
I was just checking, before adding the while loop, whether increasing the number of epochs makes any difference.
NO. YOU HAVE NOT SEPARATED TRAIN AND TEST DATA
I've defined train ratio and test ratio. Have I missed something?
YES. SEE HELP TRAIN AND/OR DOC TRAIN TO SEE HOW TO SEPARATE THE PERFORMANCE OF THE TRAIN/VAL/TEST SUBSETS.
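[A sketch of what Greg is suggesting: the second output of train is a training record whose index fields let you score each subset separately. Field names are as documented for R2009b-era toolboxes; treat the details as something to verify against your version.]

```matlab
[net, tr] = train(net, input, Tar);
out = sim(net, input);
% tr.trainInd / tr.valInd / tr.testInd hold the column indices that
% 'dividerand' assigned to each subset.
mseTrain = mse(Tar(:,tr.trainInd) - out(:,tr.trainInd))
mseTest  = mse(Tar(:,tr.testInd)  - out(:,tr.testInd))
% Only mseTest estimates performance on data the training never saw;
% simulating on the whole of `input` mixes the two together.
```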
NO. YOU HAVE NOT IDENTIFIED THE MAXIMUM OUTPUT THAT YIELDS THE CLASS ASSIGNMENT.
How can I identify the maximum output?
HELP MAX
DOC MAX
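[Concretely, the predicted class is the row with the largest output in each column, which can then be compared against the target classes. A minimal sketch (dummy first outputs used because R2009b does not support the `~` placeholder):]

```matlab
out = sim(net, input);
[dum, predicted] = max(out, [], 1);  % row index of largest output per column
[dum, actual]    = max(Tar, [], 1);  % true class of each column
accuracy = sum(predicted == actual) / numel(actual)
```

This replaces rounding the raw outputs, which can produce all-zero or multi-hot columns that match no class.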
HOPE THIS HELPS.
GREG