Hi Tracy,
The functions in the Image Processing Toolbox and the Computer Vision System Toolbox use the pre-multiplication convention for coordinate transformations: a row vector multiplied by a matrix. This differs from the convention you may see in many textbooks, which post-multiply a column vector, and it requires the matrix to be transposed. That is why you are seeing a 4x3 matrix instead of a 3x4 matrix. It does indeed map homogeneous world coordinates onto homogeneous image coordinates (ignoring distortion), but, again, it operates on row vectors:
w * [x, y, 1] = [X, Y, Z, 1] * P;
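As a concrete sketch of that convention (variable names here are illustrative; `P` is the 4-by-3 matrix returned by `cameraMatrix`):

```matlab
% Build the 4-by-3 camera matrix from calibration results
% (cameraParams, R, t are assumed to come from your calibration session).
P = cameraMatrix(cameraParams, R, t);

worldPoint = [X, Y, Z];                      % world coordinates, e.g. in mm
projected  = [worldPoint, 1] * P;            % 1-by-4 row vector times 4-by-3 matrix
imagePoint = projected(1:2) / projected(3);  % divide by w to get pixel coordinates
```

Note the final division by the homogeneous coordinate `w` to recover the actual pixel location.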
Now back to your problem. My guess is that the mismatch you are seeing happens because imagePoints come from the original image, which has some lens distortion. To do your test correctly, you have to first undistort the image using the aptly named undistortImage function, and then detect the checkerboard in the undistorted image. Then you can compute the extrinsics and the camera matrix from the resulting image points.
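The steps above might look like this (a sketch, assuming you already have a calibrated `cameraParams` object and the checkerboard's `worldPoints`):

```matlab
% Undistort the image; newOrigin accounts for any shift of the output's origin.
[undistortedIm, newOrigin] = undistortImage(im, cameraParams);

% Detect the checkerboard in the undistorted image, not the original one.
imagePoints = detectCheckerboardPoints(undistortedIm);
imagePoints = imagePoints + newOrigin;   % shift points back into the original frame

% Compute extrinsics from the undistorted points, then the camera matrix.
[R, t] = extrinsics(imagePoints, worldPoints, cameraParams);
P = cameraMatrix(cameraParams, R, t);
```

With `P` computed this way, projecting `worldPoints` through it should line up with the detected points to within a small reprojection error.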