I am new to parallel and GPU computing.
I understand that parfor does not work on the GPU.
I would like to do an operation similar to parfor on the GPU. I checked that I can go up to 512 workers.
How can I assign a worker to each GPU processor (I only have 1 gpuDevice), so that I could perhaps use a for loop to do the operation? Or does arrayfun + gpuArray already give parallel GPU execution?
For example, in a matrix operation, each GPU processor would handle one row or column of the matrix.
Some info:
CUDADevice with properties:
    Name: 'GeForce GTX 1060 3GB'
    Index: 1
    ComputeCapability: '6.1'
    SupportsDouble: 1
    DriverVersion: 10
    ToolkitVersion: 9.1000
    MaxThreadsPerBlock: 1024
    MaxShmemPerBlock: 49152
    MaxThreadBlockSize: [1024 1024 64]
    MaxGridSize: [2.1475e+09 65535 65535]
    SIMDWidth: 32
    TotalMemory: 3.2212e+09
    AvailableMemory: 2.5223e+09
    MultiprocessorCount: 9
    ClockRateKHz: 1708500
    ComputeMode: 'Default'
    GPUOverlapsTransfers: 1
    KernelExecutionTimeout: 1
    CanMapHostMemory: 1
    DeviceSupported: 1
    DeviceSelected: 1
Best Answer