MATLAB: Matrix multiply result different from loops

loops, MATLAB, matrix multiply, numeric precision

I know I'm having numeric precision issues with a matrix multiplication operation, but I can't reproduce the same result using for loops. Using the two matrices found in the attached mat file, run the following script. No matter which method I use to calculate the resulting matrix, I can't reproduce the result of the matrix multiply. How is the MATLAB matrix multiply computed?
load matricies
C = A*B; % normal matrix multiply: A is 54 x 3 and B is 3 x 54, so C is 54 x 54
Ci1 = zeros(54);
Ci2 = zeros(54);
Ci3 = zeros(54);
for i = 1:54
    for j = 1:54
        for k = 1:3 % accumulate in ascending k order
            Ci1(i,j) = Ci1(i,j) + A(i,k)*B(k,j);
        end
        for k = 3:-1:1 % accumulate in descending k order
            Ci2(i,j) = Ci2(i,j) + A(i,k)*B(k,j);
        end
        Ci3(i,j) = A(i,:)*B(:,j); % row-times-column product
    end
end
d = abs(C - Ci1); % abs so that negative differences are not missed
fprintf('max error using k=1:3: %g\n', max(d(:)))
d = abs(C - Ci2);
fprintf('max error with k=3:-1:1 %g\n', max(d(:)))
d = abs(C - Ci3);
fprintf('max error with A(i,:)*B(:,j) %g\n', max(d(:)))
eps_max_orig = eps(max(abs(C(:))));
eps_min_orig = eps(min(abs(C(:))));
fprintf('range of eps is [%g, %g]\n', eps_min_orig, eps_max_orig)

Best Answer

MATLAB calls third-party BLAS library code to perform matrix multiplication. This library is highly optimized and multithreaded. The order of operations is not published, and it likely depends on the size of the matrices, the number of cores used, your CPU's cache sizes, and so on. You might guess the operation order correctly for your particular matrix size and machine, but that would not necessarily tell you what happens with other matrix sizes or on a different machine.
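To illustrate why operation order matters, here is a minimal sketch. It first shows that floating-point addition is not associative, then compares A*B against a fixed-order accumulation using a tolerance instead of exact equality. The random matrices are stand-ins (an assumption, since the mat file from the question is not available here), and the tolerance 2*n*eps*(abs(A)*abs(B)) is a deliberately loose bound based on the standard rounding-error analysis of dot products. It is not a statement about what the BLAS library does internally.

```matlab
% Floating-point addition is not associative, so summation order matters.
a = 0.1; b = 0.2; c = 0.3;
s1 = (a + b) + c;
s2 = a + (b + c);
fprintf('(a+b)+c - (a+(b+c)) = %g\n', s1 - s2);

% Stand-in data (assumed): same sizes as in the question, random values.
rng(0);
A = randn(54, 3);
B = randn(3, 54);
C = A*B;                          % BLAS-backed multiply, unknown order

% Reference result accumulated in a fixed, known order (k = 1, 2, 3).
Cloop = zeros(54);
for k = 1:3
    Cloop = Cloop + A(:,k)*B(k,:);
end

% Elementwise tolerance: n is the inner dimension. The factor 2 allows
% for rounding error in both computed results.
n = size(A, 2);
bound = 2 * n * eps * (abs(A)*abs(B));
err = abs(C - Cloop);
withinBound = all(err(:) <= bound(:));
fprintf('max abs difference: %g\n', max(err(:)));
fprintf('all differences within rounding bound: %d\n', withinBound);
```

If the differences stay inside a bound of this kind, both results are equally valid roundings of the same exact product. In that case, comparing with a tolerance is usually more practical than trying to match the library's operation order bit for bit.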