Solved – How does Caffe handle non-integer convolution layer output size

conv-neural-network, deep learning, machine learning

I am studying a project that someone built in Caffe, where the input image is 400 by 400 pixels and the first layer is a convolution with kernel_size: 11 and stride: 4. According to my calculations, the output image size is ((400-11)/4) + 1 = 98.25, which is not an integer. What would the output size be in this case? The following is the prototxt with these values:

    name: "RP"
    input: "data"
    input_dim: 32
    input_dim: 3
    input_dim: 400
    input_dim: 400
    layers {
      bottom: "data"
      top: "conv1"
      name: "conv1"
      type: CONVOLUTION
      convolution_param {
        num_output: 64
        kernel_size: 11
        stride: 4
        weight_filler {
          type: "xavier"
        }
        bias_filler {
          type: "constant"
          value: 0.1
        }
      }
    }

Best Answer

The output size is floor((input + 2*pad - filter) / stride) + 1. Your prototxt does not set pad, so it defaults to 0, and in your case this gives floor((400 - 11) / 4) + 1 = floor(97.25) + 1 = 98.
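As a quick sanity check, the formula above can be sketched in Python. This is only an illustration of the arithmetic, not Caffe's actual code: the function names `conv_output_size` and `unused_trailing_pixels` are hypothetical, and dilation is assumed to be 1.

```python
def conv_output_size(input_size: int, kernel_size: int,
                     stride: int = 1, pad: int = 0) -> int:
    """Output length along one spatial axis, using the floor formula.

    Hypothetical helper for illustration; assumes dilation of 1.
    """
    padded = input_size + 2 * pad
    if kernel_size > padded:
        raise ValueError("kernel is larger than the padded input")
    # Integer division floors the result for non-negative operands.
    return (padded - kernel_size) // stride + 1


def unused_trailing_pixels(input_size: int, kernel_size: int,
                           stride: int = 1, pad: int = 0) -> int:
    """Number of trailing (padded) input pixels that no window covers."""
    out = conv_output_size(input_size, kernel_size, stride, pad)
    # The last window starts at (out - 1) * stride and spans kernel_size pixels.
    covered = (out - 1) * stride + kernel_size
    return input_size + 2 * pad - covered


if __name__ == "__main__":
    # Values from the question: 400x400 input, kernel_size 11, stride 4, no pad.
    print(conv_output_size(400, 11, stride=4))        # 98
    print(unused_trailing_pixels(400, 11, stride=4))  # 1
```

With the values from the question, the last window starts at pixel 388 and ends at pixel 398, so the flooring simply means the final row and column of the 400 by 400 input do not contribute to conv1.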

ref: caffe source code
also see this answer
