MATLAB: Cannot continue training R-CNN detector using Inception; Error: “Unconnected input. Each layer input must be connected to the output of another layer.”

Computer Vision Toolbox, Deep Learning Toolbox, MATLAB, rcnn, transfer learning

Intro:
I'm trying to use the Deep Learning Toolbox to train an R-CNN object detector using InceptionV3. However, when I try to "continue" training, I get an error that the layers are unconnected. According to the documentation:
"When you specify the network as a SeriesNetwork, an array of Layer objects, or by the network name, the network is automatically transformed into a R-CNN network by adding new classification and regression layers to support object detection"
However, it looks like the Layers taken from this transformed network are not compatible with trainRCNNObjectDetector, or I'm missing something; if there is something extra I need to do, the documentation is very unclear about it. While I could start from scratch every time, that seems inefficient: I couldn't transfer from an already-tuned network when new input data arrives, or continue training to increase accuracy (assuming we're not yet in overfitting territory). The reason for doing a small batch at the beginning is to tune the learning rate, and possibly to use a cyclic learning rate such as the one-cycle policy.
This appears to be a bug, but I'm not sure.
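For reference, a minimal sketch of the ways the network argument can be supplied, based on the quote above plus the layerGraph form that the accepted answer below relies on (stopSigns, layers, and options are the variables from the example below; net stands for a pretrained DAGNetwork):
% By pretrained-network name; auto-transformed into an R-CNN network:
rcnn = trainRCNNObjectDetector(stopSigns, 'inceptionv3', options);
% As an array of Layer objects; only suits series (linear-chain) topologies:
rcnn = trainRCNNObjectDetector(stopSigns, layers, options);
% As a LayerGraph, which is what DAG topologies like InceptionV3 need:
rcnn = trainRCNNObjectDetector(stopSigns, layerGraph(net), options);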
Step 1: Using the stop-sign example from trainRCNNObjectDetector
%% Load training data and network layers.
load('rcnnStopSigns.mat', 'stopSigns', 'layers')
mynetwork = 'inceptionv3';
%%
% Add the image directory to the MATLAB path.
imDir = fullfile(matlabroot, 'toolbox', 'vision', 'visiondata', ...
    'stopSignImages');
addpath(imDir);
%%

% Set network training options to use mini-batch size of 32 to reduce GPU
% memory usage. Lower the InitialLearnRate to reduce the rate at which
% network parameters are changed. This is beneficial when fine-tuning a
% pre-trained network and prevents the network from changing too rapidly.
options = trainingOptions('sgdm', ...
    'MiniBatchSize', 32, ...
    'InitialLearnRate', 1e-6, ...
    'MaxEpochs', 5);
%%
% Train the R-CNN detector. Training can take a few minutes to complete.
rcnn = trainRCNNObjectDetector(stopSigns, mynetwork, options, 'NegativeOverlapRange', [0 0.3]);
And the output is:
*******************************************************************
Training an R-CNN Object Detector for the following object classes:
* stopSign
--> Extracting region proposals from 27 training images...done.
--> Training a neural network to classify objects in training data...
Training on single GPU.
Initializing input data normalization.
|========================================================================================|
| Epoch | Iteration | Time Elapsed | Mini-batch | Mini-batch | Base Learning |
| | | (hh:mm:ss) | Accuracy | Loss | Rate |
|========================================================================================|
| 1 | 1 | 00:00:03 | 28.13% | 0.7700 | 1.0000e-06 |
| 2 | 50 | 00:05:58 | 31.25% | 0.7473 | 1.0000e-06 |
| 3 | 100 | 00:11:54 | 34.38% | 0.7232 | 1.0000e-06 |
| 5 | 150 | 00:17:50 | 50.00% | 0.7184 | 1.0000e-06 |
| 5 | 175 | 00:20:48 | 46.88% | 0.6998 | 1.0000e-06 |
|========================================================================================|
Network training complete.
--> Training bounding box regression models for each object class...100.00%...done.
Detector training complete.
*******************************************************************
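As an aside, the detector from step 1 only lives in the workspace at this point; it can be persisted with plain save/load before attempting to continue in a later session (nothing detector-specific assumed here, just ordinary MATLAB serialization):
save('rcnnStopSignDetector.mat', 'rcnn');  % persist the trained detector
% ... in a later session ...
load('rcnnStopSignDetector.mat', 'rcnn');  % restore it before continuing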
Step 2: However, when I try to continue training (to run for more epochs, for instance), I get an error
%% set up training options - test learning rate
checkpointPath = pwd;
options = trainingOptions('sgdm', ...
    'MiniBatchSize', 32, ...
    'InitialLearnRate', 1e-5, ...
    'LearnRateSchedule', 'piecewise', ...
    'LearnRateDropFactor', 0.1, ...
    'LearnRateDropPeriod', 10, ...
    'MaxEpochs', 100, ...
    'Verbose', true);
%% continue training from previous version
network = rcnn.Network;
layers = network.Layers;
rcnn = trainRCNNObjectDetector(stopSigns, layers, options, 'NegativeOverlapRange', [0 0.3], 'PositiveOverlapRange',[0.5 1]);
I get the following error:
Error using trainRCNNObjectDetector (line 256)
Invalid network.
Error in testcodeforinceptionrcnn (line 40)
rcnn = trainRCNNObjectDetector(stopSigns, layers, options, 'NegativeOverlapRange', [0 0.3], 'PositiveOverlapRange',[0.5 1]);
Caused by:
Layer 'concatenate_1': Unconnected input. Each layer input must be connected to the output of another layer.
Detected unconnected inputs:
input 'in2'
Layer 'concatenate_2': Unconnected input. Each layer input must be connected to the output of another layer.
Detected unconnected inputs:
input 'in2'
Layer 'conv2d_7': Input size mismatch. Size of input to this layer is different from the expected input size.
Inputs to this layer:
from layer 'activation_9_relu' (output size 35×35×64)
Layer 'mixed0': Unconnected input. Each layer input must be connected to the output of another layer.
Detected unconnected inputs:
input 'in2'
input 'in3'
input 'in4'
Layer 'mixed1': Unconnected input. Each layer input must be connected to the output of another layer.
Detected unconnected inputs:
input 'in2'
input 'in3'
input 'in4'
Layer 'mixed10': Unconnected input. Each layer input must be connected to the output of another layer.
Detected unconnected inputs:
input 'in2'
input 'in3'
input 'in4'
Layer 'mixed2': Unconnected input. Each layer input must be connected to the output of another layer.
Detected unconnected inputs:
input 'in2'
input 'in3'
input 'in4'
Layer 'mixed3': Unconnected input. Each layer input must be connected to the output of another layer.
Detected unconnected inputs:
input 'in2'
input 'in3'
Layer 'mixed4': Unconnected input. Each layer input must be connected to the output of another layer.
Detected unconnected inputs:
input 'in2'
input 'in3'
input 'in4'
Layer 'mixed5': Unconnected input. Each layer input must be connected to the output of another layer.
Detected unconnected inputs:
input 'in2'
input 'in3'
input 'in4'
Layer 'mixed6': Unconnected input. Each layer input must be connected to the output of another layer.
Detected unconnected inputs:
input 'in2'
input 'in3'
input 'in4'
Layer 'mixed7': Unconnected input. Each layer input must be connected to the output of another layer.
Detected unconnected inputs:
input 'in2'
input 'in3'
input 'in4'
Layer 'mixed8': Unconnected input. Each layer input must be connected to the output of another layer.
Detected unconnected inputs:
input 'in2'
input 'in3'
Layer 'mixed9': Unconnected input. Each layer input must be connected to the output of another layer.
Detected unconnected inputs:
input 'in2'
input 'in3'
input 'in4'
Layer 'mixed9_0': Unconnected input. Each layer input must be connected to the output of another layer.
Detected unconnected inputs:
input 'in2'
Layer 'mixed9_1': Unconnected input. Each layer input must be connected to the output of another layer.
Detected unconnected inputs:
input 'in2'
Examining the network that comes from Step 1:
>> rcnn.Network
ans =
DAGNetwork with properties:
Layers: [315×1 nnet.cnn.layer.Layer]
Connections: [349×2 table]
InputNames: {'input_1'}
OutputNames: {'rcnnClassification'}
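Note the 349×2 Connections table that the DAGNetwork carries alongside its 315 Layers. To see the topology those connections describe, the graph can be visualized (plot accepts a LayerGraph, and analyzeNetwork works too in recent releases):
lgraph = layerGraph(rcnn.Network);  % keeps both Layers and Connections
plot(lgraph)                        % draw the DAG; the inception branches are visible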
Finally, the output of the "layers" array that I'm trying to use:
>> rcnn.Network.Layers
ans =
315×1 Layer array with layers:
1 'input_1' Image Input 299x299x3 images with 'rescale-symmetric' normalization
2 'conv2d_1' Convolution 32 3x3x3 convolutions with stride [2 2] and padding [0 0 0 0]
3 'batch_normalization_1' Batch Normalization Batch normalization with 32 channels
4 'activation_1_relu' ReLU ReLU
5 'conv2d_2' Convolution 32 3x3x32 convolutions with stride [1 1] and padding [0 0 0 0]
6 'batch_normalization_2' Batch Normalization Batch normalization with 32 channels
7 'activation_2_relu' ReLU ReLU
8 'conv2d_3' Convolution 64 3x3x32 convolutions with stride [1 1] and padding 'same'
9 'batch_normalization_3' Batch Normalization Batch normalization with 64 channels
10 'activation_3_relu' ReLU ReLU
11 'max_pooling2d_1' Max Pooling 3×3 max pooling with stride [2 2] and padding [0 0 0 0]
12 'conv2d_4' Convolution 80 1x1x64 convolutions with stride [1 1] and padding [0 0 0 0]
13 'batch_normalization_4' Batch Normalization Batch normalization with 80 channels
14 'activation_4_relu' ReLU ReLU
15 'conv2d_5' Convolution 192 3x3x80 convolutions with stride [1 1] and padding [0 0 0 0]
16 'batch_normalization_5' Batch Normalization Batch normalization with 192 channels
17 'activation_5_relu' ReLU ReLU
18 'max_pooling2d_2' Max Pooling 3×3 max pooling with stride [2 2] and padding [0 0 0 0]
19 'conv2d_9' Convolution 64 1x1x192 convolutions with stride [1 1] and padding 'same'
20 'batch_normalization_9' Batch Normalization Batch normalization with 64 channels
21 'activation_9_relu' ReLU ReLU
22 'conv2d_7' Convolution 48 1x1x192 convolutions with stride [1 1] and padding 'same'
23 'conv2d_10' Convolution 96 3x3x64 convolutions with stride [1 1] and padding 'same'
24 'batch_normalization_7' Batch Normalization Batch normalization with 48 channels
25 'batch_normalization_10' Batch Normalization Batch normalization with 96 channels
26 'activation_7_relu' ReLU ReLU
27 'activation_10_relu' ReLU ReLU
28 'average_pooling2d_1' Average Pooling 3×3 average pooling with stride [1 1] and padding 'same'
29 'conv2d_6' Convolution 64 1x1x192 convolutions with stride [1 1] and padding 'same'
30 'conv2d_8' Convolution 64 5x5x48 convolutions with stride [1 1] and padding 'same'
31 'conv2d_11' Convolution 96 3x3x96 convolutions with stride [1 1] and padding 'same'
32 'conv2d_12' Convolution 32 1x1x192 convolutions with stride [1 1] and padding 'same'
33 'batch_normalization_6' Batch Normalization Batch normalization with 64 channels
34 'batch_normalization_8' Batch Normalization Batch normalization with 64 channels
35 'batch_normalization_11' Batch Normalization Batch normalization with 96 channels
36 'batch_normalization_12' Batch Normalization Batch normalization with 32 channels
37 'activation_6_relu' ReLU ReLU
38 'activation_8_relu' ReLU ReLU
39 'activation_11_relu' ReLU ReLU
40 'activation_12_relu' ReLU ReLU
41 'mixed0' Depth concatenation Depth concatenation of 4 inputs
42 'conv2d_16' Convolution 64 1x1x256 convolutions with stride [1 1] and padding 'same'
43 'batch_normalization_16' Batch Normalization Batch normalization with 64 channels
44 'activation_16_relu' ReLU ReLU
45 'conv2d_14' Convolution 48 1x1x256 convolutions with stride [1 1] and padding 'same'
46 'conv2d_17' Convolution 96 3x3x64 convolutions with stride [1 1] and padding 'same'
47 'batch_normalization_14' Batch Normalization Batch normalization with 48 channels
48 'batch_normalization_17' Batch Normalization Batch normalization with 96 channels
49 'activation_14_relu' ReLU ReLU
50 'activation_17_relu' ReLU ReLU
51 'average_pooling2d_2' Average Pooling 3×3 average pooling with stride [1 1] and padding 'same'
52 'conv2d_13' Convolution 64 1x1x256 convolutions with stride [1 1] and padding 'same'
53 'conv2d_15' Convolution 64 5x5x48 convolutions with stride [1 1] and padding 'same'
54 'conv2d_18' Convolution 96 3x3x96 convolutions with stride [1 1] and padding 'same'
55 'conv2d_19' Convolution 64 1x1x256 convolutions with stride [1 1] and padding 'same'
56 'batch_normalization_13' Batch Normalization Batch normalization with 64 channels
57 'batch_normalization_15' Batch Normalization Batch normalization with 64 channels
58 'batch_normalization_18' Batch Normalization Batch normalization with 96 channels
59 'batch_normalization_19' Batch Normalization Batch normalization with 64 channels
60 'activation_13_relu' ReLU ReLU
61 'activation_15_relu' ReLU ReLU
62 'activation_18_relu' ReLU ReLU
63 'activation_19_relu' ReLU ReLU
64 'mixed1' Depth concatenation Depth concatenation of 4 inputs
65 'conv2d_23' Convolution 64 1x1x288 convolutions with stride [1 1] and padding 'same'
66 'batch_normalization_23' Batch Normalization Batch normalization with 64 channels
67 'activation_23_relu' ReLU ReLU
68 'conv2d_21' Convolution 48 1x1x288 convolutions with stride [1 1] and padding 'same'
69 'conv2d_24' Convolution 96 3x3x64 convolutions with stride [1 1] and padding 'same'
70 'batch_normalization_21' Batch Normalization Batch normalization with 48 channels
71 'batch_normalization_24' Batch Normalization Batch normalization with 96 channels
72 'activation_21_relu' ReLU ReLU
73 'activation_24_relu' ReLU ReLU
74 'average_pooling2d_3' Average Pooling 3×3 average pooling with stride [1 1] and padding 'same'
75 'conv2d_20' Convolution 64 1x1x288 convolutions with stride [1 1] and padding 'same'
76 'conv2d_22' Convolution 64 5x5x48 convolutions with stride [1 1] and padding 'same'
77 'conv2d_25' Convolution 96 3x3x96 convolutions with stride [1 1] and padding 'same'
78 'conv2d_26' Convolution 64 1x1x288 convolutions with stride [1 1] and padding 'same'
79 'batch_normalization_20' Batch Normalization Batch normalization with 64 channels
80 'batch_normalization_22' Batch Normalization Batch normalization with 64 channels
81 'batch_normalization_25' Batch Normalization Batch normalization with 96 channels
82 'batch_normalization_26' Batch Normalization Batch normalization with 64 channels
83 'activation_20_relu' ReLU ReLU
84 'activation_22_relu' ReLU ReLU
85 'activation_25_relu' ReLU ReLU
86 'activation_26_relu' ReLU ReLU
87 'mixed2' Depth concatenation Depth concatenation of 4 inputs
88 'conv2d_28' Convolution 64 1x1x288 convolutions with stride [1 1] and padding 'same'
89 'batch_normalization_28' Batch Normalization Batch normalization with 64 channels
90 'activation_28_relu' ReLU ReLU
91 'conv2d_29' Convolution 96 3x3x64 convolutions with stride [1 1] and padding 'same'
92 'batch_normalization_29' Batch Normalization Batch normalization with 96 channels
93 'activation_29_relu' ReLU ReLU
94 'conv2d_27' Convolution 384 3x3x288 convolutions with stride [2 2] and padding [0 0 0 0]
95 'conv2d_30' Convolution 96 3x3x96 convolutions with stride [2 2] and padding [0 0 0 0]
96 'batch_normalization_27' Batch Normalization Batch normalization with 384 channels
97 'batch_normalization_30' Batch Normalization Batch normalization with 96 channels
98 'activation_27_relu' ReLU ReLU
99 'activation_30_relu' ReLU ReLU
100 'max_pooling2d_3' Max Pooling 3×3 max pooling with stride [2 2] and padding [0 0 0 0]
101 'mixed3' Depth concatenation Depth concatenation of 3 inputs
102 'conv2d_35' Convolution 128 1x1x768 convolutions with stride [1 1] and padding 'same'
103 'batch_normalization_35' Batch Normalization Batch normalization with 128 channels
104 'activation_35_relu' ReLU ReLU
105 'conv2d_36' Convolution 128 7x1x128 convolutions with stride [1 1] and padding 'same'
106 'batch_normalization_36' Batch Normalization Batch normalization with 128 channels
107 'activation_36_relu' ReLU ReLU
108 'conv2d_32' Convolution 128 1x1x768 convolutions with stride [1 1] and padding 'same'
109 'conv2d_37' Convolution 128 1x7x128 convolutions with stride [1 1] and padding 'same'
110 'batch_normalization_32' Batch Normalization Batch normalization with 128 channels
111 'batch_normalization_37' Batch Normalization Batch normalization with 128 channels
112 'activation_32_relu' ReLU ReLU
113 'activation_37_relu' ReLU ReLU
114 'conv2d_33' Convolution 128 1x7x128 convolutions with stride [1 1] and padding 'same'
115 'conv2d_38' Convolution 128 7x1x128 convolutions with stride [1 1] and padding 'same'
116 'batch_normalization_33' Batch Normalization Batch normalization with 128 channels
117 'batch_normalization_38' Batch Normalization Batch normalization with 128 channels
118 'activation_33_relu' ReLU ReLU
119 'activation_38_relu' ReLU ReLU
120 'average_pooling2d_4' Average Pooling 3×3 average pooling with stride [1 1] and padding 'same'
121 'conv2d_31' Convolution 192 1x1x768 convolutions with stride [1 1] and padding 'same'
122 'conv2d_34' Convolution 192 7x1x128 convolutions with stride [1 1] and padding 'same'
123 'conv2d_39' Convolution 192 1x7x128 convolutions with stride [1 1] and padding 'same'
124 'conv2d_40' Convolution 192 1x1x768 convolutions with stride [1 1] and padding 'same'
125 'batch_normalization_31' Batch Normalization Batch normalization with 192 channels
126 'batch_normalization_34' Batch Normalization Batch normalization with 192 channels
127 'batch_normalization_39' Batch Normalization Batch normalization with 192 channels
128 'batch_normalization_40' Batch Normalization Batch normalization with 192 channels
129 'activation_31_relu' ReLU ReLU
130 'activation_34_relu' ReLU ReLU
131 'activation_39_relu' ReLU ReLU
132 'activation_40_relu' ReLU ReLU
133 'mixed4' Depth concatenation Depth concatenation of 4 inputs
134 'conv2d_45' Convolution 160 1x1x768 convolutions with stride [1 1] and padding 'same'
135 'batch_normalization_45' Batch Normalization Batch normalization with 160 channels
136 'activation_45_relu' ReLU ReLU
137 'conv2d_46' Convolution 160 7x1x160 convolutions with stride [1 1] and padding 'same'
138 'batch_normalization_46' Batch Normalization Batch normalization with 160 channels
139 'activation_46_relu' ReLU ReLU
140 'conv2d_42' Convolution 160 1x1x768 convolutions with stride [1 1] and padding 'same'
141 'conv2d_47' Convolution 160 1x7x160 convolutions with stride [1 1] and padding 'same'
142 'batch_normalization_42' Batch Normalization Batch normalization with 160 channels
143 'batch_normalization_47' Batch Normalization Batch normalization with 160 channels
144 'activation_42_relu' ReLU ReLU
145 'activation_47_relu' ReLU ReLU
146 'conv2d_43' Convolution 160 1x7x160 convolutions with stride [1 1] and padding 'same'
147 'conv2d_48' Convolution 160 7x1x160 convolutions with stride [1 1] and padding 'same'
148 'batch_normalization_43' Batch Normalization Batch normalization with 160 channels
149 'batch_normalization_48' Batch Normalization Batch normalization with 160 channels
150 'activation_43_relu' ReLU ReLU
151 'activation_48_relu' ReLU ReLU
152 'average_pooling2d_5' Average Pooling 3×3 average pooling with stride [1 1] and padding 'same'
153 'conv2d_41' Convolution 192 1x1x768 convolutions with stride [1 1] and padding 'same'
154 'conv2d_44' Convolution 192 7x1x160 convolutions with stride [1 1] and padding 'same'
155 'conv2d_49' Convolution 192 1x7x160 convolutions with stride [1 1] and padding 'same'
156 'conv2d_50' Convolution 192 1x1x768 convolutions with stride [1 1] and padding 'same'
157 'batch_normalization_41' Batch Normalization Batch normalization with 192 channels
158 'batch_normalization_44' Batch Normalization Batch normalization with 192 channels
159 'batch_normalization_49' Batch Normalization Batch normalization with 192 channels
160 'batch_normalization_50' Batch Normalization Batch normalization with 192 channels
161 'activation_41_relu' ReLU ReLU
162 'activation_44_relu' ReLU ReLU
163 'activation_49_relu' ReLU ReLU
164 'activation_50_relu' ReLU ReLU
165 'mixed5' Depth concatenation Depth concatenation of 4 inputs
166 'conv2d_55' Convolution 160 1x1x768 convolutions with stride [1 1] and padding 'same'
167 'batch_normalization_55' Batch Normalization Batch normalization with 160 channels
168 'activation_55_relu' ReLU ReLU
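(Listing truncated; the array continues through layer 315, which per the OutputNames shown above should be the added 'rcnnClassification' output layer.)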

Best Answer

Sorry, I figured out the answer on my own but didn't re-post.
The solution (for InceptionV3 and the ResNets, at least) is to re-wrap the network in a layerGraph after the initial training, rather than simply using its Layers. For instance, the last three lines of "Step 2" above become:
network = rcnn.Network;
rcnn = trainRCNNObjectDetector(stopSigns, layerGraph(network), options, 'NegativeOverlapRange', [0 0.3], 'PositiveOverlapRange',[0.5 1]);
I've asked MathWorks to update their documentation to call attention to this, as it's not obvious from the example.
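To spell out why this works, as far as I can tell: Network.Layers is just a 315×1 column of layers, so the 349×2 Connections table is thrown away, and trainRCNNObjectDetector can only reassemble the layers as a linear chain; hence every depth-concatenation layer complains about unconnected inputs. Wrapping the DAGNetwork in layerGraph keeps both pieces. A quick illustrative check:
lgraph = layerGraph(rcnn.Network);  % LayerGraph keeps Layers AND Connections
size(lgraph.Connections)            % ans = [349 2]; the DAG's edges survive
layers = rcnn.Network.Layers;       % 315x1 Layer array; the edge info is gone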