The problem stems from the fact that SqueezeNet has many convolution, ReLU, and concatenation layers. Every time a layer is activated, at least two activations must be held in GPU memory, and this memory accumulates because GPUs do not allow individual variables to be cleared. There are a number of ways to address this issue:
1) One potential solution is to reset the GPU before any batch operation (in this case, the detection operation). After the detector is trained, reset the GPU using the reset method. Do this every time a new image is used to detect objects:
>> d = gpuDevice;
>> reset(d);
>> testsim = imread('/myimages/im1.jpg');
>> [bboxes, scores, labels] = detect(detector, testsim);
2) Another potential solution is to reduce the NumStrongestRegions parameter of the detect function from its default value of 2000 to a lower number. Remember to reset the GPU every time the detect method is called:
>> [bboxes, scores, labels] = detect(detector, testsim, 'NumStrongestRegions', 1000)
3) Use the MinSize and MaxSize name-value pairs of detect if you know the approximate size of the objects being detected.
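For example, if the objects are known to be roughly between 50-by-50 and 300-by-300 pixels (example values, adjust them to your data), restricting the search range reduces the number of regions the detector must process:
>> [bboxes, scores, labels] = detect(detector, testsim, 'MinSize', [50 50], 'MaxSize', [300 300]);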
4) Use a GPU device with more memory.
5) Create a custom Fast R-CNN network for trainFastRCNNObjectDetector, then run the detect method. The idea is to choose a feature extraction layer well before the output layer so that the number of computation layers is reduced.
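To pick a suitable feature extraction layer, you can list the network's layer names and inspect the activation sizes first (this assumes the Deep Learning Toolbox Model for SqueezeNet support package is installed; 'fire5-concat' below is only an example of an earlier layer, not a recommendation):
>> net = squeezenet;
>> lgraph = layerGraph(net);
>> disp({lgraph.Layers.Name}')  % list layer names, e.g. 'fire5-concat'
>> analyzeNetwork(net)          % interactive view of layers and activation sizes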