Solved – How to design neural networks for pattern recognition in biometry

machine-learning, neural-networks

Having read numerous texts on neural networks and their characteristics, I am, paradoxically, getting more and more confused – I am looking for a brief explanation or pointers to the right sources.

I am trying to implement neural networks with PyBrain to recognise patterns in biometric data and classify them. It is just a small project to learn the procedures; the goal is to recognise which subject provided which data.

I am using a feedforward multilayer perceptron trained with back-propagation, and the results are fairly successful (roughly 0.7 classification accuracy), but honestly I am just guessing at settings such as the types of layers.

I realise this question is very broad and very difficult, but could you perhaps suggest the best approach to such a task and identify the variables that determine which algorithms to use?

The data are in the time domain, so, for example, a typical input for training and subsequent classification would be a vector of 256 values representing 2 seconds of ECG input.

~~I should also point out that the same data are classified with ~0.97 accuracy by Matlab's Neural Network Toolbox, which is why I believe that the configuration could use some work and that the data contain the necessary unique patterns.~~

Cheers.


Edit to address the strikethrough:

The Matlab program in question was not mine, and it turned out that its input data had been badly corrupted during preprocessing, so none of that result was true.


Edit to answer @lambruscoAcido's questions:

I have learned a lot since asking this question, and some parts of the procedure as well as the results have changed, but the general problem persists – choosing a correct model. Despite reading through loads of literature, I still lack a basic understanding of certain concepts.

  • The raw data from each subject are processed before being fed to the NN (MLP) by extracting the same single channel, filtering out everything outside the beta range, splitting the signal into chunks of 256 samples with 50% overlap (fs = 128 Hz), and obtaining a PSD estimate using Welch's method, so the resulting features are skewed. [plot of the skewed PSD estimates] (A sketch of this pipeline is shown after this list.)

  • They do correlate with each other, but I am not sure about the principal components – how can I find out? I am interested in the general methodology for empirically establishing the optimal number of hidden layers, the number of perceptron units, and their activation functions (I am currently using tanh for the hidden layer and softmax for the output). The studies I read did not include any information on this procedure.

  • The dimensions of my current network are 129-10-6. The input is the 129-point power spectral density estimate (as pictured above) computed from each chunk of 256 samples; each of the 6 subjects provides 30 chunks' worth of data, which the 50% overlap turns into 2 × 30 − 1 = 59 examples per subject.
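
For clarity, here is a minimal sketch of the preprocessing pipeline described in the first bullet, assuming scipy/numpy; the exact 13-30 Hz beta band and the variable names are illustrative choices, not taken verbatim from my code:

```python
# Sketch of the preprocessing described above (scipy/numpy assumed).
# The 13-30 Hz beta band and the `signal` input are illustrative.
import numpy as np
from scipy.signal import butter, filtfilt, welch

FS = 128           # sampling rate (Hz)
CHUNK = 256        # samples per chunk (2 s)
STEP = CHUNK // 2  # 50% overlap

def psd_features(signal, low=13.0, high=30.0):
    """Band-pass a single channel to the beta range, split it into
    50%-overlapping 256-sample chunks, and return one Welch PSD
    estimate per chunk (129 frequency bins each -> the NN inputs)."""
    b, a = butter(4, [low, high], btype="band", fs=FS)
    filtered = filtfilt(b, a, signal)   # zero-phase band-pass
    feats = []
    for start in range(0, len(filtered) - CHUNK + 1, STEP):
        _, pxx = welch(filtered[start:start + CHUNK], fs=FS, nperseg=CHUNK)
        feats.append(pxx)
    return np.array(feats)  # shape: (n_chunks, 129)
```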

I am currently achieving a mean error of ~15% after 100 epochs, so there has been an improvement. The studies I have read report ~2%.
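
For reference, a minimal sketch of how the current 129-10-6 network might be set up in PyBrain; the feature matrix `X` and label vector `y` are hypothetical stand-ins (loading the data is omitted):

```python
# Sketch of the 129-10-6 MLP with tanh hidden units and softmax output.
# `X` (n_examples x 129 PSD features) and `y` (subject indices 0-5)
# are assumed to exist already.
from pybrain.datasets import ClassificationDataSet
from pybrain.tools.shortcuts import buildNetwork
from pybrain.structure import TanhLayer, SoftmaxLayer
from pybrain.supervised.trainers import BackpropTrainer

ds = ClassificationDataSet(129, nb_classes=6)
for x, label in zip(X, y):
    ds.addSample(x, [label])
ds._convertToOneOfMany()  # one-hot targets for the softmax output

net = buildNetwork(129, 10, 6,
                   hiddenclass=TanhLayer,
                   outclass=SoftmaxLayer)

trainer = BackpropTrainer(net, dataset=ds)
for epoch in range(100):
    trainer.train()  # one epoch of back-propagation per call
```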

Best Answer

Neural networks are too often used as a black box without a correct model specification. Indeed, you could take your 256 values per individual, model them with a feed-forward NN with 256 inputs and perhaps 5 to 10 perceptrons in the hidden layer, and hope for a good output.

However, to choose a correct model it would be a good idea to do some preliminary analysis. Maybe you already have, but at the least I have the following questions:

  • How are the values distributed? Normal, symmetric, skewed? (Pre-transformations are often very useful!)
  • Do they correlate with each other? How many principal components are needed to explain most of the variation? (If only a few, you do not need many perceptrons; see the sketch after this list.)
  • Do the 256 values measure the same thing? Do you have, per subject, a time series of 256 values ranging from seconds 0-2 up to 510-512?
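
As a rough sketch of how the first two checks could be done in plain numpy/scipy (the feature matrix `X` here is a hypothetical stand-in for your inputs):

```python
# Illustrative preliminary checks on a hypothetical feature matrix `X`
# (rows = examples, columns = the input values).
import numpy as np
from scipy.stats import skew

def preliminary_analysis(X):
    # 1) Distribution: strongly skewed features often benefit from a
    #    log pre-transformation (common for PSD-like values).
    print("mean skewness per feature:", skew(X, axis=0).mean())
    X_log = np.log(X + 1e-12)

    # 2) Correlation / principal components: how many components are
    #    needed to explain most of the variance?
    Xc = X_log - X_log.mean(axis=0)            # centre each feature
    cov = np.cov(Xc, rowvar=False)             # feature covariance
    eigvals = np.linalg.eigvalsh(cov)[::-1]    # descending eigenvalues
    cum = np.cumsum(eigvals) / eigvals.sum()   # cumulative variance ratio
    k95 = np.searchsorted(cum, 0.95) + 1
    print("components for 95% of variance:", k95)
    return k95
```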

Once you have a better idea of the data, specifying the model becomes feasible.