MATLAB: Determination of stereo camera world coordinates with respect to calibration target

camera-calibration, computer vision, Computer Vision Toolbox, stereo, stereo calibration

I'm having trouble getting a precise, accurate estimate of my camera locations. I am defining my world coordinate system with its origin on the calibration target, and I want the world coordinates of the camera centers relative to that origin. I take a set of ~20 images, calibrate the cameras with the stereo calibration functions, and then compute the camera centers as:
wocoIdx = 1; % index of the image where the target sits at the desired world origin
C1 = -stereoParams.CameraParameters1.TranslationVectors(wocoIdx,:);
C2 = -stereoParams.CameraParameters2.TranslationVectors(wocoIdx,:);
The issue is that this typically gives me cameras spaced ~111mm apart horizontally, versus the 140mm I have measured manually.
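For reference, the spacing I quote is the Euclidean distance between the two estimated centers:
baseline = norm(C1 - C2); % gives ~111 mm, vs. the ~140 mm measured by hand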
(1) Is camera height vs. camera spacing a limiting factor here? My cameras are about 1960mm from the world coordinate target and spaced ~140mm apart. When calibrating, I translate the calibration target in a range of 200mm towards/away from the camera (this is limited by the depth of field of the setup).
(2) Am I missing another step (rotation?) to obtain the camera centers? I've also tried using the triangulate function but had little success getting an accurate estimate. The approach I used with triangulate is:
undistortedImagePts1 = undistortPoints(imagePoints(:,:,wocoIdx,1), stereoParams.CameraParameters1);
undistortedImagePts2 = undistortPoints(imagePoints(:,:,wocoIdx,2), stereoParams.CameraParameters2);
% triangulate the first checkerboard corner (the intended world origin)
C1 = triangulate(undistortedImagePts1(1,:), undistortedImagePts2(1,:), stereoParams);
% TranslationOfCamera2 is a 1-by-3 row vector, so no transpose is needed
C2 = C1 + stereoParams.TranslationOfCamera2;

Best Answer

Matt's answer is almost correct. The extrinsics R and t represent the transformation from world coordinates into the camera's coordinates, so t by itself is not the camera center; you have to apply the rotation as well. The only problem with Matt's solution is that it does not follow the vector-matrix multiplication convention used by the Computer Vision System Toolbox. The camera center in world coordinates is
c = -t * R';
because t is a row vector: the toolbox maps a world point X into camera coordinates as X*R + t, so the camera center c is the point that maps to the origin, 0 = c*R + t, which gives c = -t*R'.
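For concreteness, here is a minimal sketch using the variable names from the question (RotationMatrices and TranslationVectors hold the per-image extrinsics in each cameraParameters object):
R1 = stereoParams.CameraParameters1.RotationMatrices(:,:,wocoIdx);
t1 = stereoParams.CameraParameters1.TranslationVectors(wocoIdx,:);
C1 = -t1 * R1'; % camera 1 center in world (target) coordinates
R2 = stereoParams.CameraParameters2.RotationMatrices(:,:,wocoIdx);
t2 = stereoParams.CameraParameters2.TranslationVectors(wocoIdx,:);
C2 = -t2 * R2'; % camera 2 center in world (target) coordinates
norm(C1 - C2) % baseline; should now come out near the measured ~140 mm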
Also, you do not have to use one of your calibration images. You can calibrate your cameras, then take a new picture of a checkerboard and use the extrinsics function to compute R and t relative to that board. See this example.
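A minimal sketch of that workflow for camera 1, assuming a new checkerboard image I1 and the calibration square size squareSize (both names are placeholders):
[imagePoints, boardSize] = detectCheckerboardPoints(I1); % find corners in the new image
worldPoints = generateCheckerboardPoints(boardSize, squareSize); % matching board coordinates
imagePoints = undistortPoints(imagePoints, stereoParams.CameraParameters1); % extrinsics expects undistorted points
[R, t] = extrinsics(imagePoints, worldPoints, stereoParams.CameraParameters1);
C1 = -t * R'; % camera 1 center relative to the new board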