MATLAB: Cannot continue training R-CNN detector using Inception; Error: “Unconnected input. Each layer input must be connected to the output of another layer.”

Computer Vision Toolbox, Deep Learning Toolbox, MATLAB, rcnn, transfer learning

Intro:
I'm trying to use the Deep Learning Toolbox to train an R-CNN object detector using InceptionV3. However, when I try to "continue" training, I get an error that the layers are unconnected. According to the documentation:
"When you specify the network as a SeriesNetwork, an array of Layer objects, or by the network name, the network is automatically transformed into a R-CNN network by adding new classification and regression layers to support object detection"
However, it looks like the Layers taken from this transformed network are not compatible with trainRCNNObjectDetector, or I'm missing something; if there is something extra I need to do, the documentation is very unclear about it. While I could start from scratch every time, that seems inefficient: I couldn't transfer from an already-tuned network when new input data arrives, or continue training to increase accuracy (assuming we're not yet in overfitting territory). The reason for doing a small batch at the beginning is to tune the learning rate, and possibly to use a cyclic learning rate such as the one-cycle policy.
This appears to be a bug, but I'm not sure.
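For reference, a minimal sketch of the ways the network argument can be supplied, based on the quote above plus the layerGraph form that the accepted answer below relies on (stopSigns, layers, and options are the variables from the example below; net stands for a pretrained DAGNetwork):
% By pretrained-network name; auto-transformed into an R-CNN network:
rcnn = trainRCNNObjectDetector(stopSigns, 'inceptionv3', options);
% As an array of Layer objects; only suits series (linear-chain) topologies:
rcnn = trainRCNNObjectDetector(stopSigns, layers, options);
% As a LayerGraph, which is what DAG topologies like InceptionV3 need:
rcnn = trainRCNNObjectDetector(stopSigns, layerGraph(net), options);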
Step 1: Using the stop-sign example from trainRCNNObjectDetector
%% Load training data and network layers.
load('rcnnStopSigns.mat', 'stopSigns', 'layers')
mynetwork = 'inceptionv3';
%%
% Add the image directory to the MATLAB path.
imDir = fullfile(matlabroot, 'toolbox', 'vision', 'visiondata', ...
    'stopSignImages');
addpath(imDir);
%%

% Set network training options to use mini-batch size of 32 to reduce GPU
% memory usage. Lower the InitialLearnRate to reduce the rate at which
% network parameters are changed. This is beneficial when fine-tuning a
% pre-trained network and prevents the network from changing too rapidly.
options = trainingOptions('sgdm', ...
    'MiniBatchSize', 32, ...
    'InitialLearnRate', 1e-6, ...
    'MaxEpochs', 5);
%%
% Train the R-CNN detector. Training can take a few minutes to complete.
rcnn = trainRCNNObjectDetector(stopSigns, mynetwork, options, 'NegativeOverlapRange', [0 0.3]);
And the output is:
*******************************************************************
Training an R-CNN Object Detector for the following object classes:
* stopSign
--> Extracting region proposals from 27 training images...done.
--> Training a neural network to classify objects in training data...
Training on single GPU.
Initializing input data normalization.
|========================================================================================|
| Epoch | Iteration | Time Elapsed | Mini-batch | Mini-batch | Base Learning |
| | | (hh:mm:ss) | Accuracy | Loss | Rate |
|========================================================================================|
| 1 | 1 | 00:00:03 | 28.13% | 0.7700 | 1.0000e-06 |
| 2 | 50 | 00:05:58 | 31.25% | 0.7473 | 1.0000e-06 |
| 3 | 100 | 00:11:54 | 34.38% | 0.7232 | 1.0000e-06 |
| 5 | 150 | 00:17:50 | 50.00% | 0.7184 | 1.0000e-06 |
| 5 | 175 | 00:20:48 | 46.88% | 0.6998 | 1.0000e-06 |
|========================================================================================|
Network training complete.
--> Training bounding box regression models for each object class...100.00%...done.
Detector training complete.
*******************************************************************
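As an aside, the detector from step 1 only lives in the workspace at this point; it can be persisted with plain save/load before attempting to continue in a later session (nothing detector-specific assumed here, just ordinary MATLAB serialization):
save('rcnnStopSignDetector.mat', 'rcnn');  % persist the trained detector
% ... in a later session ...
load('rcnnStopSignDetector.mat', 'rcnn');  % restore it before continuing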
Step 2: However, when I try to continue training (to run for more epochs, for instance), I get an error
%% set up training options - test learning rate
checkpointPath = pwd;
options = trainingOptions('sgdm', ...
    'MiniBatchSize', 32, ...
    'InitialLearnRate', 1e-5, ...
    'LearnRateSchedule', 'piecewise', ...
    'LearnRateDropFactor', 0.1, ...
    'LearnRateDropPeriod', 10, ...
    'MaxEpochs', 100, ...
    'Verbose', true);
%% continue training from previous version
network = rcnn.Network;
layers = network.Layers;
rcnn = trainRCNNObjectDetector(stopSigns, layers, options, 'NegativeOverlapRange', [0 0.3], 'PositiveOverlapRange',[0.5 1]);
I get the following error:
Error using trainRCNNObjectDetector (line 256)
Invalid network.
Error in testcodeforinceptionrcnn (line 40)
rcnn = trainRCNNObjectDetector(stopSigns, layers, options, 'NegativeOverlapRange', [0 0.3], 'PositiveOverlapRange',[0.5 1]);
Caused by:
Layer 'concatenate_1': Unconnected input. Each layer input must be connected to the output of another layer.
Detected unconnected inputs:
input 'in2'
Layer 'concatenate_2': Unconnected input. Each layer input must be connected to the output of another layer.
Detected unconnected inputs:
input 'in2'
Layer 'conv2d_7': Input size mismatch. Size of input to this layer is different from the expected input size.
Inputs to this layer:
from layer 'activation_9_relu' (output size 35×35×64)
Layer 'mixed0': Unconnected input. Each layer input must be connected to the output of another layer.
Detected unconnected inputs:
input 'in2'
input 'in3'
input 'in4'
Layer 'mixed1': Unconnected input. Each layer input must be connected to the output of another layer.
Detected unconnected inputs:
input 'in2'
input 'in3'
input 'in4'
Layer 'mixed10': Unconnected input. Each layer input must be connected to the output of another layer.
Detected unconnected inputs:
input 'in2'
input 'in3'
input 'in4'
Layer 'mixed2': Unconnected input. Each layer input must be connected to the output of another layer.
Detected unconnected inputs:
input 'in2'
input 'in3'
input 'in4'
Layer 'mixed3': Unconnected input. Each layer input must be connected to the output of another layer.
Detected unconnected inputs:
input 'in2'
input 'in3'
Layer 'mixed4': Unconnected input. Each layer input must be connected to the output of another layer.
Detected unconnected inputs:
input 'in2'
input 'in3'
input 'in4'
Layer 'mixed5': Unconnected input. Each layer input must be connected to the output of another layer.
Detected unconnected inputs:
input 'in2'
input 'in3'
input 'in4'
Layer 'mixed6': Unconnected input. Each layer input must be connected to the output of another layer.
Detected unconnected inputs:
input 'in2'
input 'in3'
input 'in4'
Layer 'mixed7': Unconnected input. Each layer input must be connected to the output of another layer.
Detected unconnected inputs:
input 'in2'
input 'in3'
input 'in4'
Layer 'mixed8': Unconnected input. Each layer input must be connected to the output of another layer.
Detected unconnected inputs:
input 'in2'
input 'in3'
Layer 'mixed9': Unconnected input. Each layer input must be connected to the output of another layer.
Detected unconnected inputs:
input 'in2'
input 'in3'
input 'in4'
Layer 'mixed9_0': Unconnected input. Each layer input must be connected to the output of another layer.
Detected unconnected inputs:
input 'in2'
Layer 'mixed9_1': Unconnected input. Each layer input must be connected to the output of another layer.
Detected unconnected inputs:
input 'in2'
Examining the network that comes from Step 1:
>> rcnn.Network
ans =
DAGNetwork with properties:
Layers: [315×1 nnet.cnn.layer.Layer]
Connections: [349×2 table]
InputNames: {'input_1'}
OutputNames: {'rcnnClassification'}
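Note the 349×2 Connections table that the DAGNetwork carries alongside its 315 Layers. To see the topology those connections describe, the graph can be visualized (plot accepts a LayerGraph, and analyzeNetwork works too in recent releases):
lgraph = layerGraph(rcnn.Network);  % keeps both Layers and Connections
plot(lgraph)                        % draw the DAG; the inception branches are visible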
Finally, the output of the "layers" array that I'm trying to use:
>> rcnn.Network.Layers
ans =
315×1 Layer array with layers:
1 'input_1' Image Input 299x299x3 images with 'rescale-symmetric' normalization
2 'conv2d_1' Convolution 32 3x3x3 convolutions with stride [2 2] and padding [0 0 0 0]
3 'batch_normalization_1' Batch Normalization Batch normalization with 32 channels
4 'activation_1_relu' ReLU ReLU
5 'conv2d_2' Convolution 32 3x3x32 convolutions with stride [1 1] and padding [0 0 0 0]
6 'batch_normalization_2' Batch Normalization Batch normalization with 32 channels
7 'activation_2_relu' ReLU ReLU
8 'conv2d_3' Convolution 64 3x3x32 convolutions with stride [1 1] and padding 'same'
9 'batch_normalization_3' Batch Normalization Batch normalization with 64 channels
10 'activation_3_relu' ReLU ReLU
11 'max_pooling2d_1' Max Pooling 3×3 max pooling with stride [2 2] and padding [0 0 0 0]
12 'conv2d_4' Convolution 80 1x1x64 convolutions with stride [1 1] and padding [0 0 0 0]
13 'batch_normalization_4' Batch Normalization Batch normalization with 80 channels
14 'activation_4_relu' ReLU ReLU
15 'conv2d_5' Convolution 192 3x3x80 convolutions with stride [1 1] and padding [0 0 0 0]
16 'batch_normalization_5' Batch Normalization Batch normalization with 192 channels
17 'activation_5_relu' ReLU ReLU
18 'max_pooling2d_2' Max Pooling 3×3 max pooling with stride [2 2] and padding [0 0 0 0]
19 'conv2d_9' Convolution 64 1x1x192 convolutions with stride [1 1] and padding 'same'
20 'batch_normalization_9' Batch Normalization Batch normalization with 64 channels
21 'activation_9_relu' ReLU ReLU
22 'conv2d_7' Convolution 48 1x1x192 convolutions with stride [1 1] and padding 'same'
23 'conv2d_10' Convolution 96 3x3x64 convolutions with stride [1 1] and padding 'same'
24 'batch_normalization_7' Batch Normalization Batch normalization with 48 channels
25 'batch_normalization_10' Batch Normalization Batch normalization with 96 channels
26 'activation_7_relu' ReLU ReLU
27 'activation_10_relu' ReLU ReLU
28 'average_pooling2d_1' Average Pooling 3×3 average pooling with stride [1 1] and padding 'same'
29 'conv2d_6' Convolution 64 1x1x192 convolutions with stride [1 1] and padding 'same'
30 'conv2d_8' Convolution 64 5x5x48 convolutions with stride [1 1] and padding 'same'
31 'conv2d_11' Convolution 96 3x3x96 convolutions with stride [1 1] and padding 'same'
32 'conv2d_12' Convolution 32 1x1x192 convolutions with stride [1 1] and padding 'same'
33 'batch_normalization_6' Batch Normalization Batch normalization with 64 channels
34 'batch_normalization_8' Batch Normalization Batch normalization with 64 channels
35 'batch_normalization_11' Batch Normalization Batch normalization with 96 channels
36 'batch_normalization_12' Batch Normalization Batch normalization with 32 channels
37 'activation_6_relu' ReLU ReLU
38 'activation_8_relu' ReLU ReLU
39 'activation_11_relu' ReLU ReLU
40 'activation_12_relu' ReLU ReLU
41 'mixed0' Depth concatenation Depth concatenation of 4 inputs
42 'conv2d_16' Convolution 64 1x1x256 convolutions with stride [1 1] and padding 'same'
43 'batch_normalization_16' Batch Normalization Batch normalization with 64 channels
44 'activation_16_relu' ReLU ReLU
45 'conv2d_14' Convolution 48 1x1x256 convolutions with stride [1 1] and padding 'same'
46 'conv2d_17' Convolution 96 3x3x64 convolutions with stride [1 1] and padding 'same'
47 'batch_normalization_14' Batch Normalization Batch normalization with 48 channels
48 'batch_normalization_17' Batch Normalization Batch normalization with 96 channels
49 'activation_14_relu' ReLU ReLU
50 'activation_17_relu' ReLU ReLU
51 'average_pooling2d_2' Average Pooling 3×3 average pooling with stride [1 1] and padding 'same'
52 'conv2d_13' Convolution 64 1x1x256 convolutions with stride [1 1] and padding 'same'
53 'conv2d_15' Convolution 64 5x5x48 convolutions with stride [1 1] and padding 'same'
54 'conv2d_18' Convolution 96 3x3x96 convolutions with stride [1 1] and padding 'same'
55 'conv2d_19' Convolution 64 1x1x256 convolutions with stride [1 1] and padding 'same'
56 'batch_normalization_13' Batch Normalization Batch normalization with 64 channels
57 'batch_normalization_15' Batch Normalization Batch normalization with 64 channels
58 'batch_normalization_18' Batch Normalization Batch normalization with 96 channels
59 'batch_normalization_19' Batch Normalization Batch normalization with 64 channels
60 'activation_13_relu' ReLU ReLU
61 'activation_15_relu' ReLU ReLU
62 'activation_18_relu' ReLU ReLU
63 'activation_19_relu' ReLU ReLU
64 'mixed1' Depth concatenation Depth concatenation of 4 inputs
65 'conv2d_23' Convolution 64 1x1x288 convolutions with stride [1 1] and padding 'same'
66 'batch_normalization_23' Batch Normalization Batch normalization with 64 channels
67 'activation_23_relu' ReLU ReLU
68 'conv2d_21' Convolution 48 1x1x288 convolutions with stride [1 1] and padding 'same'
69 'conv2d_24' Convolution 96 3x3x64 convolutions with stride [1 1] and padding 'same'
70 'batch_normalization_21' Batch Normalization Batch normalization with 48 channels
71 'batch_normalization_24' Batch Normalization Batch normalization with 96 channels
72 'activation_21_relu' ReLU ReLU
73 'activation_24_relu' ReLU ReLU
74 'average_pooling2d_3' Average Pooling 3×3 average pooling with stride [1 1] and padding 'same'
75 'conv2d_20' Convolution 64 1x1x288 convolutions with stride [1 1] and padding 'same'
76 'conv2d_22' Convolution 64 5x5x48 convolutions with stride [1 1] and padding 'same'
77 'conv2d_25' Convolution 96 3x3x96 convolutions with stride [1 1] and padding 'same'
78 'conv2d_26' Convolution 64 1x1x288 convolutions with stride [1 1] and padding 'same'
79 'batch_normalization_20' Batch Normalization Batch normalization with 64 channels
80 'batch_normalization_22' Batch Normalization Batch normalization with 64 channels
81 'batch_normalization_25' Batch Normalization Batch normalization with 96 channels
82 'batch_normalization_26' Batch Normalization Batch normalization with 64 channels
83 'activation_20_relu' ReLU ReLU
84 'activation_22_relu' ReLU ReLU
85 'activation_25_relu' ReLU ReLU
86 'activation_26_relu' ReLU ReLU
87 'mixed2' Depth concatenation Depth concatenation of 4 inputs
88 'conv2d_28' Convolution 64 1x1x288 convolutions with stride [1 1] and padding 'same'
89 'batch_normalization_28' Batch Normalization Batch normalization with 64 channels
90 'activation_28_relu' ReLU ReLU
91 'conv2d_29' Convolution 96 3x3x64 convolutions with stride [1 1] and padding 'same'
92 'batch_normalization_29' Batch Normalization Batch normalization with 96 channels
93 'activation_29_relu' ReLU ReLU
94 'conv2d_27' Convolution 384 3x3x288 convolutions with stride [2 2] and padding [0 0 0 0]
95 'conv2d_30' Convolution 96 3x3x96 convolutions with stride [2 2] and padding [0 0 0 0]
96 'batch_normalization_27' Batch Normalization Batch normalization with 384 channels
97 'batch_normalization_30' Batch Normalization Batch normalization with 96 channels
98 'activation_27_relu' ReLU ReLU
99 'activation_30_relu' ReLU ReLU
100 'max_pooling2d_3' Max Pooling 3×3 max pooling with stride [2 2] and padding [0 0 0 0]
101 'mixed3' Depth concatenation Depth concatenation of 3 inputs
102 'conv2d_35' Convolution 128 1x1x768 convolutions with stride [1 1] and padding 'same'
103 'batch_normalization_35' Batch Normalization Batch normalization with 128 channels
104 'activation_35_relu' ReLU ReLU
105 'conv2d_36' Convolution 128 7x1x128 convolutions with stride [1 1] and padding 'same'
106 'batch_normalization_36' Batch Normalization Batch normalization with 128 channels
107 'activation_36_relu' ReLU ReLU
108 'conv2d_32' Convolution 128 1x1x768 convolutions with stride [1 1] and padding 'same'
109 'conv2d_37' Convolution 128 1x7x128 convolutions with stride [1 1] and padding 'same'
110 'batch_normalization_32' Batch Normalization Batch normalization with 128 channels
111 'batch_normalization_37' Batch Normalization Batch normalization with 128 channels
112 'activation_32_relu' ReLU ReLU
113 'activation_37_relu' ReLU ReLU
114 'conv2d_33' Convolution 128 1x7x128 convolutions with stride [1 1] and padding 'same'
115 'conv2d_38' Convolution 128 7x1x128 convolutions with stride [1 1] and padding 'same'
116 'batch_normalization_33' Batch Normalization Batch normalization with 128 channels
117 'batch_normalization_38' Batch Normalization Batch normalization with 128 channels
118 'activation_33_relu' ReLU ReLU
119 'activation_38_relu' ReLU ReLU
120 'average_pooling2d_4' Average Pooling 3×3 average pooling with stride [1 1] and padding 'same'
121 'conv2d_31' Convolution 192 1x1x768 convolutions with stride [1 1] and padding 'same'
122 'conv2d_34' Convolution 192 7x1x128 convolutions with stride [1 1] and padding 'same'
123 'conv2d_39' Convolution 192 1x7x128 convolutions with stride [1 1] and padding 'same'
124 'conv2d_40' Convolution 192 1x1x768 convolutions with stride [1 1] and padding 'same'
125 'batch_normalization_31' Batch Normalization Batch normalization with 192 channels
126 'batch_normalization_34' Batch Normalization Batch normalization with 192 channels
127 'batch_normalization_39' Batch Normalization Batch normalization with 192 channels
128 'batch_normalization_40' Batch Normalization Batch normalization with 192 channels
129 'activation_31_relu' ReLU ReLU
130 'activation_34_relu' ReLU ReLU
131 'activation_39_relu' ReLU ReLU
132 'activation_40_relu' ReLU ReLU
133 'mixed4' Depth concatenation Depth concatenation of 4 inputs
134 'conv2d_45' Convolution 160 1x1x768 convolutions with stride [1 1] and padding 'same'
135 'batch_normalization_45' Batch Normalization Batch normalization with 160 channels
136 'activation_45_relu' ReLU ReLU
137 'conv2d_46' Convolution 160 7x1x160 convolutions with stride [1 1] and padding 'same'
138 'batch_normalization_46' Batch Normalization Batch normalization with 160 channels
139 'activation_46_relu' ReLU ReLU
140 'conv2d_42' Convolution 160 1x1x768 convolutions with stride [1 1] and padding 'same'
141 'conv2d_47' Convolution 160 1x7x160 convolutions with stride [1 1] and padding 'same'
142 'batch_normalization_42' Batch Normalization Batch normalization with 160 channels
143 'batch_normalization_47' Batch Normalization Batch normalization with 160 channels
144 'activation_42_relu' ReLU ReLU
145 'activation_47_relu' ReLU ReLU
146 'conv2d_43' Convolution 160 1x7x160 convolutions with stride [1 1] and padding 'same'
147 'conv2d_48' Convolution 160 7x1x160 convolutions with stride [1 1] and padding 'same'
148 'batch_normalization_43' Batch Normalization Batch normalization with 160 channels
149 'batch_normalization_48' Batch Normalization Batch normalization with 160 channels
150 'activation_43_relu' ReLU ReLU
151 'activation_48_relu' ReLU ReLU
152 'average_pooling2d_5' Average Pooling 3×3 average pooling with stride [1 1] and padding 'same'
153 'conv2d_41' Convolution 192 1x1x768 convolutions with stride [1 1] and padding 'same'
154 'conv2d_44' Convolution 192 7x1x160 convolutions with stride [1 1] and padding 'same'
155 'conv2d_49' Convolution 192 1x7x160 convolutions with stride [1 1] and padding 'same'
156 'conv2d_50' Convolution 192 1x1x768 convolutions with stride [1 1] and padding 'same'
157 'batch_normalization_41' Batch Normalization Batch normalization with 192 channels
158 'batch_normalization_44' Batch Normalization Batch normalization with 192 channels
159 'batch_normalization_49' Batch Normalization Batch normalization with 192 channels
160 'batch_normalization_50' Batch Normalization Batch normalization with 192 channels
161 'activation_41_relu' ReLU ReLU
162 'activation_44_relu' ReLU ReLU
163 'activation_49_relu' ReLU ReLU
164 'activation_50_relu' ReLU ReLU
165 'mixed5' Depth concatenation Depth concatenation of 4 inputs
166 'conv2d_55' Convolution 160 1x1x768 convolutions with stride [1 1] and padding 'same'
167 'batch_normalization_55' Batch Normalization Batch normalization with 160 channels
168 'activation_55_relu' ReLU ReLU
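(Listing truncated; the array continues through layer 315, which per the OutputNames shown above should be the added 'rcnnClassification' output layer.)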

Best Answer

Sorry, I figured out the answer on my own but didn't re-post.
The solution (for InceptionV3 and the ResNets, at least) is to re-wrap the network in a layerGraph after the initial training, rather than simply using its Layers. For instance, the last three lines of "Step 2" above become:
network = rcnn.Network;
rcnn = trainRCNNObjectDetector(stopSigns, layerGraph(network), options, 'NegativeOverlapRange', [0 0.3], 'PositiveOverlapRange',[0.5 1]);
I've asked MathWorks to update their documentation to call attention to this, as it's not obvious from the example.
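To spell out why this works, as far as I can tell: Network.Layers is just a 315×1 column of layers, so the 349×2 Connections table is thrown away, and trainRCNNObjectDetector can only reassemble the layers as a linear chain; hence every depth-concatenation layer complains about unconnected inputs. Wrapping the DAGNetwork in layerGraph keeps both pieces. A quick illustrative check:
lgraph = layerGraph(rcnn.Network);  % LayerGraph keeps Layers AND Connections
size(lgraph.Connections)            % ans = [349 2]; the DAG's edges survive
layers = rcnn.Network.Layers;       % 315x1 Layer array; the edge info is gone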