[Math] Unprojecting a 2D point to 3D space on a plane with perspective.

Tags: 3d, geometry

I'm not a good mathematician, but I'm trying to unproject 2D screen coordinates to a plane in a 3D space with perspective.

I first apply a uniform scaling to the 3D scene. Then I rotate the plane defined by the X and Z axes around the X axis. Then I translate the scene along the Z axis by -1.0.

So I have a ProjectionView matrix, computed in the following way using the linmath.h library, in C:

mat4x4 ProjectionView, matrix;

/* perspective projection; the cast avoids integer division if width and height are ints */
mat4x4 projection;
mat4x4_identity(projection);
mat4x4_perspective(projection, fova, (float)width / (float)height, zNear, zFar);

/* view = translate * rotate * scale: points are scaled first,
   then rotated around the X axis, then translated along Z */
mat4x4 view;
mat4x4_identity(view);
mat4x4_translate(view, 0.0, 0.0, -1.0);
mat4x4_rotate(matrix, view, 1.0, 0.0, 0.0, theta);
mat4x4_scale_aniso(view, matrix, scale, scale, scale);

mat4x4_mul(ProjectionView, projection, view);

Here is the graphic I use to try to unproject:

[figure omitted: diagram showing the distance Oz on the plane containing the unprojected points]

The distance Oz is something I can compute using trigonometry; it lies on the plane containing the unprojected points, but it does not take the effects of perspective into account.

So I would like to know how to correct this Oz distance using the perspective parameters, in order to obtain the unprojected z coordinate.

And then, how do I compute the x coordinate from the unprojected z and the perspective parameters?

After some reading on the internet, I tried using the inverse ProjectionView matrix, and also the inverse Projection matrix, without good results, perhaps because I don't know what z value to feed these matrices. So I wonder: is there a way to solve this problem without using an inverse matrix?

The computed Oz distance is close to the true result; I think it just needs a correction for the perspective.

Best Answer

Your ProjectionView matrix $P$ maps the visible part of the 3D world (bounded by the view frustum and the near and far planes) to normalized device coordinates, a $2\times 2\times 2$ cube with the computer's screen (the near plane) at the $z=-1$ face.

So your first task is to convert the screen coordinates of the "clicked point" $(p_x, p_y)$ to normalized device coordinates. If your viewport has width $w$ and height $h$, this is $$n_x = \frac{2p_x}{w} -1, \qquad n_y = 1- \frac{2p_y}{h}, \qquad n_z = -1.$$
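In C, that conversion is just a few lines. Here is a minimal sketch, assuming px and py hold the clicked pixel coordinates (origin at the top-left corner) and w and h are the viewport dimensions:

/* screen (pixel) coordinates to normalized device coordinates */
float n_x = 2.0f * px / w - 1.0f;
float n_y = 1.0f - 2.0f * py / h;   /* flip y: pixel y grows downward */
float n_z = -1.0f;                  /* the near plane in NDC */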

Now the perspective projection happens in homogeneous coordinates, so that if $\mathbf{q} = (q_x,q_y,q_z,1)$ is a point in world coordinates, $$\left([P\mathbf{q}]_x/[P\mathbf{q}]_w, [P\mathbf{q}]_y/[P\mathbf{q}]_w, [P\mathbf{q}]_z/[P\mathbf{q}]_w, 1\right) = (n_x, n_y, n_z, 1)$$ is the corresponding projected point in normalized device coordinates. This mapping is not linear and cannot be inverted simply by applying $P^{-1}$; but notice that $$P\mathbf{q} = [P\mathbf{q}]_w\, \mathbf{n},$$ and so $$\mathbf{q} = \frac{P^{-1}\mathbf{n}}{[P^{-1}\mathbf{n}]_w}.$$ (The last equality holds because the $w$ coordinate of $\mathbf{q}$ is 1, which pins down the overall scale.)
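With linmath.h this can be done directly. Here is a minimal sketch, assuming ProjectionView is the matrix built in the question and n_x, n_y come from the previous step:

/* the clicked point on the near plane: n_z = -1, n_w = 1 */
vec4 ndc = { n_x, n_y, -1.0f, 1.0f };

mat4x4 inv;
mat4x4_invert(inv, ProjectionView);

vec4 q;
mat4x4_mul_vec4(q, inv, ndc);

/* divide by the new w coordinate to return to affine world coordinates */
float w = q[3];
q[0] /= w; q[1] /= w; q[2] /= w; q[3] = 1.0f;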

Of course, this is the clicked point's position in world coordinates, not the location of the projected point on the black plane. To find the position of the point on the plane, you could query the depth buffer and set $n_z$ to that depth; this approach is error-prone if the depth buffer lacks sufficient precision, though. Another approach is to simply shoot a ray from the eye through $\mathbf{q}$ and compute where it intersects the black plane.
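For the ray approach, here is a sketch under the assumption that the black plane is the XZ plane ($y = 0$) in world coordinates; view is the matrix from the question, q is the unprojected point from above, and eye, dir, hit are illustrative names:

/* eye position in world coordinates: inverse view transform of the camera origin */
mat4x4 invView;
mat4x4_invert(invView, view);
vec4 origin = { 0.0f, 0.0f, 0.0f, 1.0f };
vec4 eye;
mat4x4_mul_vec4(eye, invView, origin);

/* direction of the ray from the eye through the unprojected point q */
vec3 dir = { q[0] - eye[0], q[1] - eye[1], q[2] - eye[2] };

/* intersect with the plane y = 0: solve eye_y + t * dir_y = 0 for t */
float t = -eye[1] / dir[1];   /* assumes the ray is not parallel to the plane */
vec3 hit = { eye[0] + t * dir[0], 0.0f, eye[2] + t * dir[2] };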

Note the many possible places for subtle bugs in the above procedure:

1. trying to unproject the point from screen coordinates rather than NDC;
2. forgetting to flip the $y$-axis when converting to NDC;
3. setting $n_z$ to something other than $-1$;
4. forgetting to set $n_w$ to 1;
5. forgetting to divide the unprojected point by its new (non-1) $w$ coordinate; etc.

By the way, there are functions like gluUnProject that do most of the above for you; check your graphics library's documentation.
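For example, with the classic GLU helper (a sketch, not tailored to your setup: it assumes double-precision copies of the matrices, a viewport of {0, 0, width, height}, and GLU's convention that window y is measured from the bottom; winZ = 0.0 selects the near plane):

#include <GL/glu.h>

/* double-precision copies of the view and projection matrices */
GLdouble model[16], proj[16];
const float *v = &view[0][0], *p = &projection[0][0];
for (int i = 0; i < 16; ++i) { model[i] = v[i]; proj[i] = p[i]; }

GLint viewport[4] = { 0, 0, width, height };

GLdouble ox, oy, oz;
gluUnProject(px, viewport[3] - py, 0.0,   /* flip y for GLU's convention */
             model, proj, viewport, &ox, &oy, &oz);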
