C = repelem( A, size( B, 1 ), 1 ) + repmat( B, size( A, 1 ), 1 );
This should work unchanged for gpuArrays too. Whether it is the fastest option at runtime is another matter entirely. There are any number of possible ways of doing it; this is just one. I certainly don't have time to think up, implement and time all of them!
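To make the behaviour concrete, here is a small worked example. The values of A and B are made up for illustration, and I am assuming both have the same number of columns. The line above adds every row of B to every row of A, with the rows of A varying slowest. The second variant, using implicit expansion (R2016b or later), is just one of the other possible ways. I have not timed it against the first.

```matlab
% Hypothetical inputs: A is 2-by-2, B is 3-by-2 (same number of columns).
A = [1 2; 3 4];
B = [10 20; 30 40; 50 60];

% Original approach: repeat each row of A size(B,1) times,
% tile the whole of B size(A,1) times, then add.
C = repelem( A, size( B, 1 ), 1 ) + repmat( B, size( A, 1 ), 1 );
% C = [11 22; 31 42; 51 62; 13 24; 33 44; 53 64]

% Alternative sketch: move the rows of A to dimension 2 and let
% implicit expansion build a size(B,1)-by-size(A,1)-by-ncols array,
% then flatten the first two dimensions.
C2 = reshape( permute( A, [3 1 2] ) + permute( B, [1 3 2] ), [], size( A, 2 ) );

disp( isequal( C, C2 ) )   % displays 1 (true)
```

To try it on the GPU, wrapping the inputs as gpuArray( A ) and gpuArray( B ) before running the same lines should be enough, assuming Parallel Computing Toolbox and a supported device are available.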