Solved – Unsupervised training of CNN

classification, clustering, conv-neural-network, machine learning, unsupervised learning

I have some unlabeled 1D (i.e. time-domain) signals (real neuron measurements) that I would like to classify into 3 classes. I would like to use a ConvNet to do this. However, as far as I know, ConvNets are trained in a supervised fashion.

I've done some research and I've found this paper that I'm not completely sure I understand. Does it use an improved version of the k-means algorithm to find the filters that best extract the features for the inputs to be clustered? I don't understand why the ConvNet would be of use here, if we are just using k-means to cluster the data. I've also found this question, which I believe is related to my problem but I'm not quite sure if it applies directly.

In a nutshell, my question is: how could I use a CNN to classify unlabeled one-dimensional data?

Best Answer

In the first paper you mention, k-means is used to learn the filter (convolutional) layers of the network. The difference between what they are doing and vanilla k-means clustering is that in this case:

the points are randomly extracted image patches and the centroids are the filters that will be used to encode images.

However, plain k-means learns redundant filters, so they combine it with a kind of sparse dictionary encoding to eliminate the redundancy.
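To make the "patches as points, centroids as filters" idea concrete for 1-D signals, here is a minimal sketch using only NumPy. It is not the paper's method: it runs plain Lloyd's k-means and leaves out the sparse dictionary step that removes redundant filters. The function names, the random placeholder data, and all sizes (`patch_len`, `k`, number of patches) are assumptions chosen for illustration.

```python
import numpy as np

def extract_patches(signals, patch_len, n_patches, rng):
    """Sample random fixed-length windows from an array of shape (n_signals, n_samples)."""
    n_signals, n_samples = signals.shape
    rows = rng.integers(0, n_signals, size=n_patches)
    starts = rng.integers(0, n_samples - patch_len + 1, size=n_patches)
    idx = starts[:, None] + np.arange(patch_len)[None, :]
    patches = signals[rows[:, None], idx]
    # Normalise each patch so clustering reacts to shape, not offset or amplitude.
    patches = patches - patches.mean(axis=1, keepdims=True)
    return patches / (patches.std(axis=1, keepdims=True) + 1e-8)

def kmeans(points, k, n_iter, rng):
    """Plain Lloyd's k-means; returns the centroids, shape (k, n_features)."""
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared distance from every point to every centroid, shape (n_points, k).
        d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids

def encode(signal, filters):
    """Cross-correlate one signal with every filter (a 'valid' 1-D convolution layer)."""
    return np.stack([np.correlate(signal, f, mode="valid") for f in filters])

rng = np.random.default_rng(0)
signals = rng.standard_normal((20, 500))   # placeholder for the real recordings
patches = extract_patches(signals, patch_len=16, n_patches=2000, rng=rng)
filters = kmeans(patches, k=8, n_iter=10, rng=rng)   # centroids act as filters
features = encode(signals[0], filters)               # one feature map per filter
print(filters.shape, features.shape)
```

The resulting feature maps would still need pooling and a classifier (or a clustering step) on top, which is where labels come in for the classification task.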

Note that the actual task the algorithm performs is classification, not clustering, so in my opinion it's really more of a semi-supervised method than an unsupervised one. I think the second link is much closer in spirit to what you are after.

I'm sure it's possible to do clustering using CNNs but do consider that they are not the right model for every problem.

If you are a subject matter expert in the area where these measurements are taken, can I suggest labelling the data? It's possible to achieve good results fine-tuning a CNN with only 100-150 elements per class, if you can find a good base model. I know there has been a bunch of work on 1-D CNNs for ECG data (as mentioned in your links), so perhaps start transfer learning from one of those models. If you have less data, I think CNNs are probably not the right tool; maybe consider Gaussian processes.
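As a rough sketch of what that fine-tuning could look like in PyTorch: freeze the convolutional base and train only a new 3-class head. Everything here is an assumption for illustration. The `base` network is a randomly initialised stand-in for a pretrained 1-D model (in practice you would load real weights), and the data is random noise shaped like 100 labelled signals per class.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-in for a pretrained 1-D feature extractor;
# in practice, load the weights of an existing model here instead.
base = nn.Sequential(
    nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(),
    nn.Conv1d(16, 32, kernel_size=7, padding=3), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),   # -> (batch, 32)
)
for p in base.parameters():
    p.requires_grad_(False)                  # freeze the base model

head = nn.Linear(32, 3)                      # new classifier for the 3 classes
model = nn.Sequential(base, head)

# Placeholder data: 300 signals of 500 samples, about 100 per class.
x = torch.randn(300, 1, 500)
y = torch.randint(0, 3, (300,))

optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)  # only the head is updated
loss_fn = nn.CrossEntropyLoss()

base.eval()
for epoch in range(5):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

logits = model(x)
print(logits.shape)
```

Once the head has converged, a common next step is to unfreeze some of the later base layers and continue with a smaller learning rate, but whether that helps depends on how close the base model's data is to yours.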
