Solved – the definition of a “feature map” (aka “activation map”) in a convolutional neural network

conv-neural-network, deep-learning, neural-networks

 Intro / Background

Within a convolutional neural network, we usually have a general structure / flow that looks like this:

  1. input image (i.e. a 2D array of pixels, x)

(1st Convolutional layer (Conv1) starts here…)

  2. convolve a set of filters (w1) over the 2D image (i.e. do the z1 = w1*x + b1 dot-product multiplications), where z1 is 3D and b1 holds the biases.
  3. apply an activation function (e.g. ReLU) to make z1 non-linear, giving a1 = ReLU(z1), where a1 is 3D.

(2nd Convolutional layer (Conv2) starts here…)

  4. convolve a set of filters (w2) over the newly computed activations (i.e. do the z2 = w2*a1 + b2 dot-product multiplications), where z2 is 3D and b2 holds the biases.
  5. apply an activation function (e.g. ReLU) to make z2 non-linear, giving a2 = ReLU(z2), where a2 is 3D (a code sketch of this flow follows below).
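
To make the flow above concrete, here is a minimal sketch in PyTorch (my choice of framework; the filter counts and sizes are illustrative, not taken from the question):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(1, 3, 32, 32)  # input image x (batch of 1, 3 channels, 32x32)

conv1 = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=5)   # filters w1, biases b1
conv2 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=5)  # filters w2, biases b2

z1 = conv1(x)     # z1 = w1*x + b1, shape (1, 16, 28, 28): 3D per image
a1 = F.relu(z1)   # a1 = ReLU(z1): 16 feature maps of size 28x28

z2 = conv2(a1)    # z2 = w2*a1 + b2, shape (1, 32, 24, 24)
a2 = F.relu(z2)   # a2 = ReLU(z2): 32 feature maps of size 24x24
```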

 The Question

The definition of the term "feature map" seems to vary across the literature. Concretely:

  • For the 1st convolutional layer, does "feature map" correspond to the input image x, the output dot product z1, the output activations a1, the "process" converting x to a1, or something else?
  • Similarly, for the 2nd convolutional layer, does "feature map" correspond to the input activations a1, the output dot product z2, the output activations a2, the "process" converting a1 to a2, or something else?

In addition, is it true that "feature map" means exactly the same thing as "activation map"? (Or do they actually mean two different things?)

 Additional references:

Snippets from Neural Networks and Deep Learning – Chapter 6:

The nomenclature is being used loosely here. In particular, I'm using "feature map" to mean not the function computed by the convolutional layer, but rather the activation of the hidden neurons output from the layer. This kind of mild abuse of nomenclature is pretty common in the research literature.


Snippets from Visualizing and Understanding Convolutional Networks by Matt Zeiler:

In this paper we introduce a visualization technique that reveals the input stimuli that excite individual feature maps at any layer in the model. […] Our approach, by contrast, provides a non-parametric view of invariance, showing which patterns from the training set activate the feature map. […] a local contrast operation that normalizes the responses across feature maps. […] To examine a given convnet activation, we set all other activations in the layer to zero and pass the feature maps as input to the attached deconvnet layer. […] The convnet uses relu non-linearities, which rectify the feature maps thus ensuring the feature maps are always positive. […] The convnet uses learned filters to convolve the feature maps from the previous layer. […] Fig. 6, these visualizations are accurate representations of the input pattern that stimulates the given feature map in the model […] when the parts of the original input image corresponding to the pattern are occluded, we see a distinct drop in activity within the feature map. […]

Remarks: the paper also introduces the terms "feature map" and "rectified feature map" in Fig. 1.
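
The "set all other activations in the layer to zero" step lends itself to a short illustration. Below is a hedged sketch (my own code, not Zeiler & Fergus's) of isolating a single feature map before it would be handed to a deconvnet:

```python
import torch

def isolate_feature_map(a, channel):
    """Return a copy of activations a (batch x C x H x W) with every
    feature map except `channel` zeroed out."""
    isolated = torch.zeros_like(a)
    isolated[:, channel] = a[:, channel]
    return isolated

# e.g. to examine feature map 7 of the layer-2 activations a2:
# deconv_input = isolate_feature_map(a2, channel=7)
```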


Snippets from Stanford CS231n Chapter on CNN:

[…] One dangerous pitfall that can be easily noticed with this visualization is that some activation maps may be all zero for many different inputs, which can indicate dead filters, and can be a symptom of high learning rates […] Typical-looking activations on the first CONV layer (left), and the 5th CONV layer (right) of a trained AlexNet looking at a picture of a cat. Every box shows an activation map corresponding to some filter. Notice that the activations are sparse (most values are zero, in this visualization shown in black) and mostly local.
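
One possible way to flag the all-zero ("dead filter") symptom mentioned above — my own illustration, not CS231n's code:

```python
import torch

def find_dead_filters(activations, eps=0.0):
    """activations: batch x C x H x W tensor of post-ReLU activations.
    Returns channel indices whose maps are (near-)zero for every input."""
    per_channel_max = activations.abs().amax(dim=(0, 2, 3))  # max over batch and space
    return (per_channel_max <= eps).nonzero(as_tuple=True)[0].tolist()

# e.g. find_dead_filters(a1) returns [] when every filter fires on some input
```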


Snippets from A Beginner's Guide To Understanding Convolutional Neural Networks:

[…] Every unique location on the input volume produces a number. After sliding the filter over all the locations, you will find out that what you’re left with is a 28 x 28 x 1 array of numbers, which we call an activation map or feature map.
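
The 28 x 28 figure follows from the standard output-size formula, assuming the guide's setup of a 32x32 input, a 5x5 filter, stride 1, and no padding (my reading of that example):

```python
def conv_output_size(n, f, stride=1, pad=0):
    """Spatial output size of a convolution: (n - f + 2*pad) // stride + 1."""
    return (n - f + 2 * pad) // stride + 1

print(conv_output_size(32, 5))  # 28 -> one 28 x 28 x 1 activation map per filter
```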

Best Answer

A feature map, or activation map, is the output activations for a given filter (a1 in your case), and the definition is the same regardless of which layer you are on.

Feature map and activation map mean exactly the same thing. It is called an activation map because it is a mapping that corresponds to the activation of different parts of the image, and a feature map because it is also a mapping of where a certain kind of feature is found in the image. A high activation means a certain feature was found.

A "rectified feature map" is just a feature map that was created using Relu. You could possibly see the term "feature map" used for the result of the dot products (z1) because this is also really a map of where certain features are in the image, but that is not common to see.