Most of the following should apply to quadrics in general, but I'll refer here to proper ellipsoids and not worry about the general case and edge conditions.
Before sketching out two reconstruction methods, here are some preliminaries.
An ellipsoid is defined by a quadratic equation
$$
Ax^2+By^2+Cz^2+Dxy+Eyz+Fzx+Gx+Hy+Iz+J=0.
$$
This can be rewritten in the form
$$\mathbf{X}^T Q \mathbf{X}=0$$ where $\mathbf{X}^T=[x,y,z,1]$ and $Q=Q^T$ is a $4\times4$ symmetric matrix. If $Q=[q_{ij}]$ then $q_{11}=A, q_{12}=q_{21}=D/2$, etc.
This form is handy for various operations. For example, $\mathbf{X}^T Q^{-1} \mathbf{X}=0$ defines the quadric that is the dual of the ellipsoid defined by $\mathbf{X}^T Q \mathbf{X}=0$ (for more on this, see section 2.2 of this paper).
In general, $5$ points define a conic. The well-known method of finding the equation of a conic from $5$ given points is described in this answer.
In general, $9$ points define an ellipsoid. To find the equation of the ellipsoid, generalize the method used for conics.
There is a caveat, in that the points have to be in some sense independent. Otherwise the equation will be degenerate.
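For the case of conics, the points must be distinct and no four of them may be collinear. If three of them are collinear, the conic through the five points is still unique, but it degenerates into a pair of lines.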
For the case of quadrics/ellipsoids, no more than $5$ of the $9$ points can lie on the same conic. Given two conics lying on the ellipsoid, no more than $8$ points can lie on these two conics (see Figure 2 in this paper).
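The generalization from conics to quadrics can be sketched as follows: each point contributes one linear equation in the ten coefficients $A,\dots,J$, so nine points give a $9\times10$ system whose null vector holds the coefficients up to scale. This is a minimal sketch assuming NumPy; the function name `quadric_from_points` is made up for illustration, and nothing here checks the independence conditions described above.

```python
import numpy as np

def quadric_from_points(points):
    """Return a symmetric 4x4 matrix Q (up to scale) of a quadric through 9 points."""
    pts = np.asarray(points, dtype=float)
    x, y, z = pts[:, 0], pts[:, 1], pts[:, 2]
    # One row per point, one column per coefficient A, B, C, D, E, F, G, H, I, J.
    M = np.column_stack([x*x, y*y, z*z, x*y, y*z, z*x, x, y, z, np.ones_like(x)])
    # The null vector of the 9x10 system is the right singular vector
    # belonging to the smallest singular value.
    _, _, vt = np.linalg.svd(M)
    A, B, C, D, E, F, G, H, I, J = vt[-1]
    return np.array([
        [A,     D / 2, F / 2, G / 2],
        [D / 2, B,     E / 2, H / 2],
        [F / 2, E / 2, C,     I / 2],
        [G / 2, H / 2, I / 2, J],
    ])
```

If the nine points violate the independence conditions, the null space has dimension greater than one and the returned matrix is only one of many possible quadrics.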
With these preliminaries, here are some suggestions for solving the $3$-camera reconstruction problem.
Find nine points on the ellipsoid. An answer to a related question gives a method for computing points given two cameras. This method identifies points on two ellipses lying on the ellipsoid, but as noted in the preliminaries a maximum of $8$ points can be used from these ellipses. So a ninth point can be found from a different pair of cameras using the third camera.
Find nine tangent planes. To find a tangent plane, take a point $T$ on one of the image plane ellipses, and find the tangent line $t$ to the ellipse at $T$. If the point $X$ is the corresponding camera position, the plane containing $t$ and the line $TX$ is a tangent plane of the ellipsoid. Find three such tangent planes for each image ellipse, for a total of nine tangent planes. Using the nine points that are the duals of the tangent planes, build a quadric on which those points lie. Then the dual of that quadric (using the trick in the preliminaries) is the desired ellipsoid.
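The dual construction can be sketched in code. A plane $\pi=(a,b,c,d)^T$ is tangent to the quadric $Q$ exactly when $\pi^T Q^{-1}\pi=0$, so the nine plane vectors are treated as points on the dual quadric, and the fitted matrix is inverted at the end. This is a hedged sketch assuming NumPy; the name `quadric_from_tangent_planes` is hypothetical, and computing the tangent planes from the images is not shown.

```python
import numpy as np

def quadric_from_tangent_planes(planes):
    """Return Q (up to scale) of the quadric tangent to 9 planes given as rows (a, b, c, d)."""
    P = np.asarray(planes, dtype=float)
    i, j = np.triu_indices(4)
    # pi^T Q* pi = sum_i q_ii pi_i^2 + 2 sum_{i<j} q_ij pi_i pi_j,
    # which is linear in the 10 independent entries of the dual matrix Q*.
    M = P[:, i] * P[:, j] * np.where(i == j, 1.0, 2.0)
    _, _, vt = np.linalg.svd(M)
    Q_dual = np.zeros((4, 4))
    Q_dual[i, j] = vt[-1]   # upper triangle (including diagonal)
    Q_dual[j, i] = vt[-1]   # mirror into the lower triangle
    # The dual of the dual quadric is the quadric itself.
    return np.linalg.inv(Q_dual)
```

The result is only defined up to a non-zero scale factor, so compare matrices after normalizing, for example by one fixed entry.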
Use epipolar methods. This paper uses concepts from epipolar geometry (important in computer vision) to solve the problem. Frankly, I haven't had the time to read and understand it, but maybe it is helpful. I don't doubt that the solution they give is smarter and more robust than the two suggestions I've given above.
Suppose your identified ellipse (which lies in the image plane) is given by the $2D$ curve
$ p(t) = C + \cos t \, u_1 + \sin t \, u_2 $
where $C$ is the center of the ellipse and $u_1, u_2$ are its semi-axis vectors. Since the photo is taken straight, we can assume that
$ C = (0, c_0) $
$ u_1 = (a, 0 ) $
$ u_2 = (0, b) $
where $c_0, a$ and $b$ are known.
Since the image plane lies a distance $z_0$ from the center of the camera (the origin of the camera coordinate frame), then with respect to the camera coordinate frame, the equation of the ellipse is
$ p(t) = (0, c_0, -z_0) + \cos t (a, 0, 0) + \sin t (0, b, 0) $
Connecting the origin to the points of this ellipse generates a cone of view.
It can be shown that this cone has the equation
$p^T Q p = 0 $
where
$ Q = \begin{bmatrix} \dfrac{1}{a^2} & 0 & 0 \\ 0 & \dfrac{1}{b^2} & \dfrac{c_0}{b^2 z_0} \\ 0 & \dfrac{c_0}{b^2 z_0} & \dfrac{ c_0^2 }{(b z_0)^2} - \dfrac{1}{z_0^2} \end{bmatrix} $
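One way to see where these entries come from, as a sketch using only the setup above: a point $p=(x,y,z)$ with $z\neq0$ lies on the cone exactly when the line through the origin and $p$ meets the image plane $z=-z_0$ at a point of the ellipse. That line meets the image plane at $\left(-\dfrac{z_0 x}{z},\, -\dfrac{z_0 y}{z},\, -z_0\right)$, so the ellipse condition reads
$$ \dfrac{1}{a^2}\left(\dfrac{z_0 x}{z}\right)^2 + \dfrac{1}{b^2}\left(\dfrac{z_0 y}{z} + c_0\right)^2 = 1 . $$
Multiplying through by $z^2/z_0^2$ gives
$$ \dfrac{x^2}{a^2} + \dfrac{1}{b^2}\left(y + \dfrac{c_0}{z_0}\, z\right)^2 - \dfrac{z^2}{z_0^2} = 0 . $$
Expanding the square, the coefficient of $x^2$ is $\dfrac{1}{a^2}$, that of $y^2$ is $\dfrac{1}{b^2}$, that of $yz$ is $\dfrac{2c_0}{b^2 z_0}$ (split equally between the two symmetric off-diagonal entries), and that of $z^2$ is $\dfrac{c_0^2}{(b z_0)^2} - \dfrac{1}{z_0^2}$.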
This cone of view is to be intersected with a plane representing the top of the cylinder which is assumed horizontal in the world coordinate frame. Now, with respect to the camera coordinate frame, the equation of this "horizontal" plane is
$ n^T p = K $
where $K$ is an arbitrary non-zero constant and
$ n = [0, \sin \theta, \cos \theta ]^T =:[0, s, c ]^T $
It can be shown that this intersection curve between the cone of view and the plane representing the top of the cylinder will be a circle if and only if
$ \dfrac{1}{a^2} = \dfrac{1}{b^2} \left( - 2 s c \dfrac{c_0}{z_0} + c^2 + s^2 \left( \dfrac{c_0}{z_0} \right)^2 \right) - \dfrac{s^2}{z_0^2} $
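A sketch of why this is the condition, with in-plane coordinates taken along the orthonormal vectors $e_1=[1,0,0]^T$ and $e_2=[0,c,-s]^T$, both perpendicular to $n$ (the names $e_1, e_2, p_0, u, v$ are introduced here only for illustration): write a point of the plane as $p = p_0 + u\,e_1 + v\,e_2$ with $n^T p_0 = K$. Substituting into $p^T Q p = 0$ gives a conic in $(u,v)$ whose quadratic part is
$$ (e_1^T Q e_1)\,u^2 + 2\,(e_1^T Q e_2)\,uv + (e_2^T Q e_2)\,v^2 . $$
The conic can be a circle only if the $uv$ term vanishes and the coefficients of $u^2$ and $v^2$ are equal. Here $e_1^T Q e_2 = 0$ holds automatically, $e_1^T Q e_1 = \dfrac{1}{a^2}$, and
$$ e_2^T Q e_2 = c^2 Q_{22} - 2sc\,Q_{23} + s^2 Q_{33}, $$
which, after inserting the entries of $Q$, is the right-hand side of the condition above.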
This equation relates the focal length $z_0$ to the angle $\theta$, which is the angle between the camera axis (its $z$ axis) and the normal of the plane of the cylinder top.
Since the plane of the cylinder top is horizontal, the camera must be pointing in a direction that makes an angle of $\phi =\dfrac{\pi}{2} - \theta$ below the horizontal.
The above equation implies that to determine the focal length $z_0$ we need to know the angle $\phi$ at which the camera was pointing. Then solving for $z_0$ is a matter of solving a quadratic equation in $\left( \dfrac{1}{z_0} \right)$, which is trivial.
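To make the last step concrete, here is a small sketch. Collecting the circle condition in powers of $w = 1/z_0$ gives $s^2\left(\dfrac{c_0^2}{b^2}-1\right)w^2 - \dfrac{2sc\,c_0}{b^2}\,w + \left(\dfrac{c^2}{b^2}-\dfrac{1}{a^2}\right)=0$. The function name `focal_length_candidates` and the tolerance `1e-12` are choices made here for illustration. The quadratic can have two positive roots, in which case the condition alone does not say which one is the physical focal length.

```python
import math

def focal_length_candidates(a, b, c0, phi):
    """Positive solutions z0 of the circle condition for a camera tilt phi (radians)."""
    theta = math.pi / 2 - phi
    s, c = math.sin(theta), math.cos(theta)
    # Circle condition collected in powers of w = 1/z0: qa*w^2 + qb*w + qc = 0
    qa = s * s * (c0 * c0 / (b * b) - 1.0)
    qb = -2.0 * s * c * c0 / (b * b)
    qc = c * c / (b * b) - 1.0 / (a * a)
    if abs(qa) < 1e-12:
        # Degenerate case: the equation is linear in w (or has no usable root).
        roots = [] if abs(qb) < 1e-12 else [-qc / qb]
    else:
        disc = qb * qb - 4.0 * qa * qc
        if disc < 0.0:
            return []  # no real solution for this combination of inputs
        r = math.sqrt(disc)
        roots = [(-qb - r) / (2.0 * qa), (-qb + r) / (2.0 * qa)]
    # Keep only w > 0, i.e. a positive focal length z0 = 1/w.
    return sorted(1.0 / w for w in roots if w > 0.0)
```

An empty result means the measured ellipse and the assumed angle $\phi$ are not consistent with a circular cylinder top under this model.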
Best Answer
As John Hughes explained in his answer, what you want to do is not possible given only the information that you first provided in your question. However, with the additional information about the plane that the sphere is sitting on and the assumption that it is small relative to its distance from the cameras, the problem becomes soluble. Basically, you just need to compute the intersection of the back-projection of an image point with that plane.
Here’s one way to do this: From the information that you have about the pink camera, you’ll need to work out its $3\times4$ projection matrix $\mathtt P = [\mathtt M\mid\mathbf p_4]$. Working in homogeneous coordinates, you know two points on the back-projection of an image point $\mathbf x$: the camera’s center $\mathbf C$ and $\mathtt P^+\mathbf x$, where $\mathtt P^+$ is the right pseudo-inverse of $\mathtt P$: $$\mathtt P^+=\mathtt P^T(\mathtt P\mathtt P^T)^{-1}.$$ If the “ground” plane has equation $ax+by+cz+d=0$, then setting $\Pi=(a,b,c,d)^T$, the intersection of the back-projected ray and this plane is $$((\mathtt P^+\mathbf x)^T\Pi)\mathbf C-(\mathbf C^T\Pi)(\mathtt P^+\mathbf x). \tag1$$ If you’re not comfortable working with homogeneous coordinates, you can instead compute the intersection of the ground plane with the line through the pink camera center that has direction vector $\mathtt M^{-1}\mathbf x$.
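The homogeneous-coordinate recipe can be sketched with NumPy as below. The function name `backproject_to_plane` is hypothetical, and the camera center is taken as the right null vector of $\mathtt P$ (since $\mathtt P\mathbf C=0$), which the text above does not spell out. The sketch assumes $\mathtt P$ has full rank and that the ray is not parallel to the plane.

```python
import numpy as np

def backproject_to_plane(P, x, plane):
    """Intersect the back-projected ray of image point x with the plane (a, b, c, d)."""
    P = np.asarray(P, dtype=float)        # 3x4 projection matrix
    x = np.asarray(x, dtype=float)        # homogeneous image point, e.g. (u, v, 1)
    Pi = np.asarray(plane, dtype=float)   # plane coefficients (a, b, c, d)
    # Camera center C: right null vector of P, i.e. P @ C = 0.
    _, _, vt = np.linalg.svd(P)
    C = vt[-1]
    # Right pseudo-inverse P^+ = P^T (P P^T)^{-1}; P^+ x is a second point on the ray.
    P_plus = P.T @ np.linalg.inv(P @ P.T)
    X = P_plus @ x
    # Equation (1): the point on the line through C and X that lies on the plane.
    Y = (X @ Pi) * C - (C @ Pi) * X
    # Convert from homogeneous to ordinary 3D coordinates.
    return Y[:3] / Y[3]
```

The result does not depend on the scale or sign of the null vector returned by the SVD, because those factors cancel in the final division.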
Once you have the coordinates of this intersection point, working out the direction in which to point the blue camera is a simple exercise. It won’t point directly at the center of the green sphere, but if your assumption holds that the sphere is small relative to its distance from the cameras, the error will also be small.