MATLAB: Specifying gpuArray input for codegen

Tags: codegen, Deep Learning Toolbox, Embedded Coder, GPU Coder, gpuarray, Parallel Computing Toolbox, simulation

Hi,
I am trying to accelerate some simulation code using my GPU. My plan is to first create a set of gpuArrays holding the relevant variables and then pass these arrays to a function (updateNet) that I want to turn into a MEX file using GPU Coder. The gpuArrays will be different sizes for different simulations, but within a particular simulation they will all be the same size. Since GPU Coder does not support variable-sized gpuArray inputs, I have written a function that takes an integer N and specifies that as the size of all of the gpuArray inputs for that call to codegen.
function [] = compile_easySim(N)
cfg = coder.gpuConfig('mex');
cfg.GpuConfig.CompilerFlags = '--fmad=false';
cfg.GenerateReport = true;
ARGS = cell(23,1);
ARGS{1} = coder.typeof(gpuArray(single(0)),[N 1],[0 0]); % V (membrane Voltage)
ARGS{2} = coder.typeof(gpuArray(single(0)),[N 1],[0 0]); % Gref (refractory conductance)
ARGS{3} = coder.typeof(gpuArray(single(0)),[N 1],[0 0]); % dGref (refractory conductance change on spike)
ARGS{4} = coder.typeof(gpuArray(single(0)),[N 1],[0 0]); % tau_ref (refractory time_constant)
ARGS{5} = coder.typeof(gpuArray(single(0)),[N 1],[0 0]); % Vth (spike threshold)
ARGS{6} = coder.typeof(gpuArray(single(0)),[N 1],[0 0]); % VsynE (excitatory synaptic reversal potential)
ARGS{7} = coder.typeof(gpuArray(single(0)),[N 1],[0 0]); % VsynI (inhibitory synaptic reversal potential)
ARGS{8} = coder.typeof(gpuArray(single(0)),[N 1],[0 0]); % GsynE (total excitatory synaptic conductance)
ARGS{9} = coder.typeof(gpuArray(single(0)),[N 1],[0 0]); % GsynI (total inhibitory synaptic conductance)
ARGS{10} = coder.typeof(gpuArray(single(0)),[N 1],[0 0]); % maxGsynE
ARGS{11} = coder.typeof(gpuArray(single(0)),[N 1],[0 0]); % maxGsynI
ARGS{12} = coder.typeof(gpuArray(single(0)),[N N],[0 0]); % dGsyn (synaptic strength matrix)
ARGS{13} = coder.typeof(gpuArray(single(0)),[N 1],[0 0]); % tau_synE (excitatory synaptic decay time constant)
ARGS{14} = coder.typeof(gpuArray(single(0)),[N 1],[0 0]); % tau_synI (inhibitory synaptic decay time constant)
ARGS{15} = coder.typeof(gpuArray(single(0)),[N 1],[0 0]); % Cm (membrane capacitance)
ARGS{16} = coder.typeof(gpuArray(single(0)),[N 1],[0 0]); % Gl (leak conductance)
ARGS{17} = coder.typeof(gpuArray(single(0)),[N 1],[0 0]); % El (leak reversal potential)
ARGS{18} = coder.typeof(gpuArray(single(0)),[N 1],[0 0]); % Ek
ARGS{19} = coder.typeof(gpuArray(single(0)),[N 1],[0 0]); % dth
ARGS{20} = coder.typeof(gpuArray(single(0)),[N 1],[0 0]); % Iapp
ARGS{21} = coder.typeof(single(0),[1],[0]); % dt
ARGS{22} = coder.typeof(gpuArray(false),[N 1],[0 0]); % ecells
ARGS{23} = coder.typeof(gpuArray(false),[N 1],[0 0]); % icells
codegen updateNet -args ARGS -nargout 5
However, when I do this I get the following error:
Use of CODER.TYPEOF to represent GPU inputs is supported only with GPU Coder.
Use help codegen for more information on using this command.
This confuses me since I thought I was using GPU Coder. Does GPU Coder only refer to the app GUI where you must manually specify each input?
For reference, the output of coder.checkGpuInstall is:
coder.checkGpuInstall
Compatible GPU : PASSED
CUDA Environment : PASSED
Runtime : PASSED
cuFFT : PASSED
cuSOLVER : PASSED
cuBLAS : PASSED
cuDNN Environment : FAILED (Unable to find the 'NVIDIA_CUDNN' environment variable. Set 'NVIDIA_CUDNN' to point to the root directory of a NVIDIA cuDNN installation.)
Basic Code Generation : PASSED
Basic Code Execution : PASSED
ans =
struct with fields:
gpu: 1
cuda: 1
cudnn: 0
tensorrt: 0
basiccodegen: 1
basiccodeexec: 1
deepcodegen: 0
deepcodeexec: 0
tensorrtdatatype: 0
profiling: 0
The only thing that fails is cuDNN, but since I'm not trying to do deep learning, this shouldn't cause problems.
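(If I did want to clear that check, I assume it would just be a matter of pointing the environment variable at a cuDNN install before re-running the check; the path below is only a placeholder.)
setenv('NVIDIA_CUDNN', 'C:\tools\cudnn');  % placeholder path; point at the cuDNN root directory
coder.checkGpuInstall                      % re-run the environment check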
Where am I going wrong? Thanks!

Best Answer

Okay, I was able to fix this problem by getting gpucoder to produce the script that the GPU Coder app uses to compile my function.
  1. Install the GPU Coder Interface for Deep Learning Libraries support package.
  2. Use the GPU Coder app to compile your code.
  3. Run
gpucoder -script yourscript.m -tocode yourgpucoderproject.prj
replacing the .m and .prj names with the names of your files.
This will output a script called yourscript.m containing the code the app used. You can then turn this script into a function that takes the sizes of the arrays it should expect as an input.
The critical difference between what I was doing and what the GPU Coder app does is that it uses
coder.typeof(single(0), [N 1], 'Gpu', true)
to signal a gpuArray input. NOWHERE in the documentation is this syntax shown or explained.
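For anyone hitting the same error, here is a minimal sketch of what the parameterized compile function looks like with that change. Only a few of the 23 inputs are shown, the function name here is just made up, and the variable names follow my original post; I am also assuming the 'Gpu' flag works the same way for the logical inputs as it does for the single ones:
function compile_easySim_gpu(N)
% Sketch only: trimmed-down version of compile_easySim using the 'Gpu'
% flag that the app-generated script revealed.
cfg = coder.gpuConfig('mex');
cfg.GpuConfig.CompilerFlags = '--fmad=false';
cfg.GenerateReport = true;
ARGS = cell(23,1);
ARGS{1}  = coder.typeof(single(0),[N 1],'Gpu',true); % V (membrane voltage)
ARGS{12} = coder.typeof(single(0),[N N],'Gpu',true); % dGsyn (synaptic strength matrix)
ARGS{21} = coder.typeof(single(0));                  % dt (plain host-side scalar)
ARGS{22} = coder.typeof(false,[N 1],'Gpu',true);     % ecells (assumed: 'Gpu' flag with a logical)
% ... remaining entries follow the same pattern ...
codegen -config cfg updateNet -args ARGS -nargout 5
end
Note that the sketch also passes the config object explicitly with -config cfg; my original call to codegen above never actually used the cfg object I created.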