Solved – How to mix image and data into a CNN

conv-neural-network, image processing, neural networks

I've recently been experimenting with TensorFlow and Keras on a project to classify images. So far it's been working, but now I want to feed additional, non-image data into the network together with the image in order to solve a different problem.

Imagine that I have a camera that takes pictures of a living room, and I want to combine those pictures with the input from sensors to do a different classification. Basically, a CNN mixed with a regular ANN.

Best Answer

You need to define sub-modules of the network, merge their outputs, and then do further processing on the combined representation. This is usually done by creating smaller neural networks within the bigger one. For example, you have one sub-network that processes images (say, a convolutional network) and another that processes tabular data (say, a dense network); then you combine the outputs of both and put some layers on top (dense layers are the simplest case). By merging I mean operations like concatenating the outputs, but other operations are possible as well. For example, if all the dimensions match, you can simply add the outputs element-wise.
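As a rough illustration of the idea above, here is a minimal sketch using the Keras functional API. The input shapes, layer sizes, number of classes, and the helper name `build_mixed_model` are all assumptions made up for the example, and the training data is random noise standing in for real images and sensor readings.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers


def build_mixed_model(image_shape=(64, 64, 3), n_sensor_features=8,
                      n_classes=4, merge="concat"):
    """Hypothetical two-branch model: CNN for images, dense net for sensors."""
    # Image branch: a small convolutional sub-network ending in a 32-dim vector.
    image_in = keras.Input(shape=image_shape, name="image")
    x = layers.Conv2D(16, 3, activation="relu")(image_in)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(32, 3, activation="relu")(x)
    x = layers.GlobalAveragePooling2D()(x)

    # Sensor branch: a dense sub-network, also ending in a 32-dim vector
    # so that element-wise addition is possible.
    sensor_in = keras.Input(shape=(n_sensor_features,), name="sensors")
    y = layers.Dense(32, activation="relu")(sensor_in)

    # Merge the two branches: concatenate (always works) or add
    # (only works because both branches output the same dimension).
    if merge == "concat":
        merged = layers.Concatenate()([x, y])
    elif merge == "add":
        merged = layers.Add()([x, y])
    else:
        raise ValueError("merge must be 'concat' or 'add'")

    # Layers on top of the merged representation.
    z = layers.Dense(32, activation="relu")(merged)
    out = layers.Dense(n_classes, activation="softmax")(z)
    return keras.Model(inputs=[image_in, sensor_in], outputs=out)


model = build_mixed_model()
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Random stand-in data: 10 "images", 10 sensor vectors, 10 integer labels.
images = np.random.rand(10, 64, 64, 3).astype("float32")
sensors = np.random.rand(10, 8).astype("float32")
labels = np.random.randint(0, 4, size=10)

# Both inputs are passed together, in the order the model declares them.
model.fit([images, sensors], labels, epochs=1, verbose=0)
probs = model.predict([images, sensors], verbose=0)
```

Both branches are trained jointly, since the loss is backpropagated through the merge layer into each sub-network. In practice you would likely want to scale the sensor features first so that they are on a range comparable to the image features.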

A recent example of such a network was described on the Google Cloud blog (diagram below), where three sources of data were used to forecast the movie audience: raw videos, tabular data on the movies, and tabular data on the viewers of the movies.

[Figure: diagram of the network from the Google Cloud blog post; image not reproduced]
