Solved – How to train convolutional neural networks with multi-channel images

conv-neural-networkdeep learningimage processingtrain

I have $m$ labeled images, each with 224×224 pixels and 5 different image channels. What is the best way to train a CNN architecture using this data when $m$ is small (less than 2000)? Is it possible to consider each channel as a separate image or is it better to input all $n$ channels into the input layer at the same time?

Best Answer

You should work with each image as a volume of dimensions 224x224x5. You still do 2D convolution over the first 2 dimensions as usual, but keep the entire 3rd dimension. For instance, if you use a 7x7 convolution window, each filter will produce a 224x224x1 volume as output (with stride = 1 and zero padding), and the convolutional layer as a whole will produce a 224x224xN volume, where N is the number of filters. ConvNetJS and many other conv nets libraries take this approach.