Solved – 1D CNN for time series regression without pooling layers

conv-neural-network, neural-networks, pooling, regression, time-series

I am working on a prognostics task, where I predict the Remaining Useful Life (RUL) of some equipment, i.e., the number of time steps remaining until failure. To do that, I use multivariate time series sensor data containing several run-to-failure recordings for different units. For each time step I can calculate the number of time steps remaining until failure and use that as the target for a 1D Convolutional Neural Network model.

Therefore, the problem consists of modeling the input-output mapping between a tensor $X \in \mathbb{R}^{n\times d}$ and a scalar $y \in \mathbb{R}$, where $n$ is the length of my sliding time windows and $d$ is the dimensionality of the input data.
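
For concreteness, here is a minimal sketch of how one run-to-failure recording can be sliced into such windows with their RUL targets (the helper name and the toy sizes are illustrative, not from my actual pipeline):

import numpy as np

def make_windows(series, n):
    # series: one run-to-failure recording of shape (T, d)
    # returns windows of shape (num_windows, n, d) and scalar RUL targets
    T = len(series)
    X, y = [], []
    for start in range(T - n + 1):
        end = start + n              # window covers time steps [start, end)
        X.append(series[start:end])  # shape (n, d)
        y.append(T - end)            # time steps remaining after the window
    return np.stack(X), np.array(y)

unit = np.random.randn(100, 5)       # toy unit: 100 time steps, 5 sensors
X, y = make_windows(unit, n=30)
print(X.shape, y.shape)              # (71, 30, 5) (71,)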

The time window lengths are relatively short ($20 \leq n \leq 40$), and because of that I chose not to use pooling layers: the convolutions themselves already shrink the tensor to some extent, and the dimensions are not large to begin with. The resulting model consists of several 1D convolution / dropout layer pairs (the output of each convolution layer passes through a non-linear activation function), followed by one flatten layer and one dense layer leading to the output.
An example of a model summary is given below (I use Keras):

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv1d_66 (Conv1D)           (None, 34, 16)            336       
_________________________________________________________________
dropout_66 (Dropout)         (None, 34, 16)            0         
_________________________________________________________________
conv1d_67 (Conv1D)           (None, 31, 32)            2080      
_________________________________________________________________
dropout_67 (Dropout)         (None, 31, 32)            0         
_________________________________________________________________
conv1d_68 (Conv1D)           (None, 28, 64)            8256      
_________________________________________________________________
dropout_68 (Dropout)         (None, 28, 64)            0         
_________________________________________________________________
conv1d_69 (Conv1D)           (None, 25, 128)           32896     
_________________________________________________________________
dropout_69 (Dropout)         (None, 25, 128)           0         
_________________________________________________________________
flatten_28 (Flatten)         (None, 3200)              0         
_________________________________________________________________
dense_28 (Dense)             (None, 1)                 3201      
=================================================================
Total params: 46,769
Trainable params: 46,769
Non-trainable params: 0
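
For reference, a minimal Keras sketch that reproduces this summary; the input shape (37, 5), kernel size 4, activation, and dropout rate are assumptions inferred from the printed output shapes and parameter counts:

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(37, 5)),                            # n = 37, d = 5 (assumed)
    layers.Conv1D(16, kernel_size=4, activation="relu"),   # -> (34, 16)
    layers.Dropout(0.2),
    layers.Conv1D(32, kernel_size=4, activation="relu"),   # -> (31, 32)
    layers.Dropout(0.2),
    layers.Conv1D(64, kernel_size=4, activation="relu"),   # -> (28, 64)
    layers.Dropout(0.2),
    layers.Conv1D(128, kernel_size=4, activation="relu"),  # -> (25, 128)
    layers.Dropout(0.2),
    layers.Flatten(),                                      # -> (3200,)
    layers.Dense(1),                                       # scalar RUL output
])
model.compile(optimizer="adam", loss="mse")
model.summary()                                            # 46,769 parameters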

So my question is: do pooling layers serve any purpose in this type of problem (other than to decrease the tensor length without introducing trainable parameters)?

In other words, am I doing something conceptually wrong or potentially hurting my model's performance by not using pooling layers?

Can you think of any reason why a pooling layer could in fact be beneficial in this scenario?

Best Answer

A pooling layer is used to extract the most salient information from the data (size reduction is a byproduct).

Think of it as follows: after the last dropout layer, you have an output of shape (None, 25, 128), which is nothing but 128 feature maps, each of length 25. Each filter carries information about the input signal, and pooling helps get rid of redundant or irrelevant information, which in turn lets the Dense layer focus on the most salient features of the data.
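
As a sketch, inserting a pooling layer between the last convolution and the Flatten would look like the following; the window shape, kernel sizes, and pool size are illustrative, not taken from the question:

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(37, 5)),                 # illustrative window shape
    layers.Conv1D(16, 4, activation="relu"),
    layers.Dropout(0.2),
    layers.Conv1D(32, 4, activation="relu"),
    layers.Dropout(0.2),
    layers.Conv1D(64, 4, activation="relu"),
    layers.Dropout(0.2),
    layers.Conv1D(128, 4, activation="relu"),   # -> (25, 128)
    layers.MaxPooling1D(pool_size=5),           # -> (5, 128): keeps peak responses
    layers.Flatten(),                           # -> (640,) instead of (3200,)
    layers.Dense(1),
])

Note that this also cuts the Dense layer from 3,201 to 641 parameters without adding any trainable weights of its own.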

Note: even concatenating the outputs of max pooling and global average pooling gives comparable results in the case of 1D CNNs.
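
A minimal sketch of that concatenation idea with the Keras functional API (the shapes are illustrative):

from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(37, 5))                   # illustrative window shape
x = layers.Conv1D(64, 4, activation="relu")(inputs)   # -> (34, 64)
x = layers.Conv1D(128, 4, activation="relu")(x)       # -> (31, 128)
max_pool = layers.GlobalMaxPooling1D()(x)             # peak response per filter -> (128,)
avg_pool = layers.GlobalAveragePooling1D()(x)         # mean response per filter -> (128,)
merged = layers.Concatenate()([max_pool, avg_pool])   # -> (256,)
outputs = layers.Dense(1)(merged)                     # scalar RUL output
model = keras.Model(inputs, outputs)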