Hello,
Is it possible to use a C++-style templated CUDA kernel via MATLAB's GPU Computing interface?
For example, consider the following (useless) toy code:
template<typename T>
__global__ void get_nans(T*, const int*);

template<>
__global__ void get_nans<double>(double* out, const int* dims)
{
    const int tx = blockIdx.x*blockDim.x + threadIdx.x;
    const int ty = blockIdx.y*blockDim.y + threadIdx.y;
    if ((tx < dims[1]) && (ty < dims[0]))
        out[tx*dims[0] + ty] = nan(0);
}

template<>
__global__ void get_nans<float>(float* out, const int* dims)
{
    const int tx = blockIdx.x*blockDim.x + threadIdx.x;
    const int ty = blockIdx.y*blockDim.y + threadIdx.y;
    if ((tx < dims[1]) && (ty < dims[0]))
        out[tx*dims[0] + ty] = nanf(0);
}
I then compile this into PTX code, but when I try to instantiate the kernel object in MATLAB I get the following error:
>> k = parallel.gpu.CUDAKernel( 'get_nans.ptx', 'get_nans.cu' );
Error using handleKernelArgs (line 61)
Found multiple matching entries in the PTX code. Matches found:
_Z16get_nansIdEvPT_PKS0_S3_S3_PKiS5_
_Z16get_nansIfEvPT_PKS0_S3_S3_PKiS5_
Thank you,
Alex
Best Answer