Solved – Pixel-wise classification on a large image using a deep learning network

classification, computer vision, conv-neural-network, deep learning

I am trying to classify every pixel of a large image (a satellite image, ~6000×4000 pixels) as belonging to one of 4 classes: "Cloud", "Thin Cloud", "Clear", or "Shadow". To that end, I have taken inspiration from the paper "Brain Tumor Segmentation with Deep Neural Networks" and designed a neural network that looks something like this:

[Figure: TwoPathCNN architecture]

The idea is that the network above predicts the class of a pixel by processing the 31×31 patch centered on that pixel.
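For concreteness, here is a much-simplified, single-pathway stand-in for that network (PyTorch; the layer widths and kernel sizes are my own assumptions, not the paper's exact TwoPathCNN). The only property that matters for the rest of the question is that it maps a 31×31 patch to 4 class scores and that every layer, including the final "dense" one, is written as a convolution:

```python
import torch
import torch.nn as nn

class PatchClassifier(nn.Module):
    """Toy patch classifier: 31x31 patch -> logits over 4 classes.

    A simplified, single-pathway sketch, NOT the paper's TwoPathCNN.
    """
    def __init__(self, in_channels=3, num_classes=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 64, kernel_size=7),  # 31 -> 25
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=1),      # 25 -> 24 (stride-1 pooling keeps the net dense-friendly)
            nn.Conv2d(64, 64, kernel_size=5),           # 24 -> 20
            nn.ReLU(inplace=True),
        )
        # The final fully connected layer is written as a convolution covering
        # the remaining 20x20 map, so the same weights can later be slid over
        # a full image instead of a single patch.
        self.classifier = nn.Conv2d(64, num_classes, kernel_size=20)

    def forward(self, x):                # x: (N, 3, 31, 31)
        x = self.classifier(self.features(x))
        return x.flatten(1)              # (N, 4) logits, one per patch
```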

Training and testing on various patches have been carried out; the main problem now is full-image inference. Since the image is so large, it takes almost 2 hours to extract a patch at every pixel and process each one to determine its class. The authors, however, "[feed] as input a full image and not individual patches. Therefore, convolutions at all layers can be extended to obtain all label probabilities $p(Y_{ij} \mid X)$ for the entire image."

So, my questions are:

  1. If I run all the convolutional layers over my full image, will I get a 4×5970×3970 tensor?
  2. How do I then make the leap from that to a class label for every pixel of my 6000×4000 image?

Best Answer

I would suggest having a look at the paper Fully Convolutional Networks for Semantic Segmentation.

The paper proposes an architecture that can take inputs of arbitrary dimensions and produce segmentation heat maps of the same spatial size, i.e. a class score for every pixel in a single forward pass.
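To illustrate the idea (a minimal sketch in PyTorch, not the architecture from the paper): padded convolutions and pooling build a downsampled feature map, a 1×1 convolution turns it into per-location class scores, and a learned upsampling (transposed convolution) brings the scores back to the input resolution.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyFCN(nn.Module):
    """Minimal FCN-style sketch: image of any size -> per-pixel class logits."""
    def __init__(self, in_channels=3, num_classes=4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                                    # 1/2 resolution
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                                    # 1/4 resolution
        )
        self.classifier = nn.Conv2d(64, num_classes, 1)         # per-location class scores
        self.upsample = nn.ConvTranspose2d(num_classes, num_classes,
                                           kernel_size=8, stride=4, padding=2)

    def forward(self, x):
        h, w = x.shape[-2:]
        scores = self.upsample(self.classifier(self.encoder(x)))
        # Resize to the exact input size so odd dimensions also work.
        return F.interpolate(scores, size=(h, w), mode="bilinear", align_corners=False)

# Usage: logits for every pixel of an arbitrarily sized image.
logits = TinyFCN()(torch.rand(1, 3, 512, 768))   # -> (1, 4, 512, 768)
labels = logits.argmax(dim=1)                    # per-pixel class map
```

Trained on your 4 cloud/shadow classes, a network along these lines would give you the full 6000×4000 label map directly (possibly processed in tiles for memory), instead of classifying one patch at a time.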