I am studying a Caffe project in which the input image is 400 by 400 pixels and the first layer is a convolution with kernel_size: 11 and stride: 4. By my calculation, the output size is ((400-11)/4) + 1 = 98.25, which is not an integer. What would the output size be in this case? The prototxt with these values follows:
name: "RP"
input: "data"
input_dim: 32
input_dim: 3
input_dim: 400
input_dim: 400
layers {
bottom: "data"
top: "conv1"
name: "conv1"
type: CONVOLUTION
convolution_param {
num_output: 64
kernel_size: 11
stride: 4
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
Best Answer
It should be floor((input + 2*pad - filter) / stride) + 1. No pad is set in your prototxt, so pad takes its default of 0, and in your case this is floor((400 - 11) / 4) + 1 = floor(97.25) + 1 = 98.
Ref: Caffe source code.
Also see this answer.
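To check the arithmetic yourself, here is a minimal Python sketch of the formula above. The function name conv_output_size and its parameter names are made up for illustration and are not part of Caffe's API; it only reproduces the floor-based calculation for one spatial dimension.

```python
def conv_output_size(input_size, kernel_size, stride=1, pad=0):
    """Output size along one spatial dimension of a convolution.

    Hypothetical helper (not a Caffe function). Implements
    floor((input + 2*pad - kernel) / stride) + 1 using integer
    floor division, which matches floor() for non-negative values.
    """
    return (input_size + 2 * pad - kernel_size) // stride + 1


# Values from the question: 400x400 input, kernel_size 11, stride 4, no padding.
print(conv_output_size(400, 11, stride=4))  # 98
```

With these settings conv1 would be 98 by 98 per channel. The fractional part is simply dropped, which means the last pixel along each edge of the input is not covered by any kernel window.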