You were doing well up until the line $H'^{-1} = H'^T$. There’s no particular reason to believe that $H$ is orthogonal; quite the opposite, in fact. (I’m going to switch here to the more conventional name $\mathtt P$ for the projection matrix and also make use of the block decomposition $\mathtt P = \left[\mathtt M \mid \mathbf p_4\right]$.) The first two columns of $\mathtt P$, $\mathbf p_1$ and $\mathbf p_2$, are the vanishing points of the world $x$- and $y$ axes. Unless the camera happens to be positioned just right relative to the $x$-$y$ plane, these vectors will not be orthogonal in the image. The upshot is that you have to compute the inverse of $\begin{bmatrix}\mathbf p_1 & \mathbf p_2 & \mathbf p_4\end{bmatrix}$, not its transpose.
This leads to the next potential problem: this matrix might not be invertible. The submatrix $\mathtt M$ is effectively the composition of a rotation and nonsingular affine transformation, so we know the first three columns are linearly independent, but it’s entirely possible that $\mathbf p_4 = \lambda \mathbf p_1 + \mu \mathbf p_2$, that is, that the image of the world origin lies on the vanishing line of the $x$-$y$ plane. This occurs when the camera center lies on this plane. In the specific construction in your question, this isn’t really an issue since you’re reprojecting onto the $x$-$y$ plane, but keep this in mind when generalizing to other planes.
Thus, the reprojection onto the $x$-$y$ plane is given by the homography matrix $$\mathtt H = \begin{bmatrix}\mathbf p_1 & \mathbf p_2 & \mathbf p_4\end{bmatrix}^{-1}.$$ This can be generalized to any plane that doesn’t contain the camera center by inserting an appropriate coordinate transformation $\mathtt B$ into the cascade, i.e., start with $$w\begin{bmatrix}x_c \\ y_c \\ 1 \end{bmatrix} = \mathtt{PB}\begin{bmatrix}x_w\\y_w\\0\\1\end{bmatrix}.$$
For comparison, here’s a construction that makes direct use of back-mapping the image points. Given a plane with homogeneous coordinate vector $\mathbf\Pi$, its intersection with the line through a fixed point $\mathbf C$ not on the plane and an arbitrary point $\mathbf X$ is $\left(\mathbf\Pi^T\mathbf X\right) \mathbf C - \left(\mathbf C^T\mathbf\Pi\right) \mathbf X$. The world coordinates of the camera center can be recovered from $\mathbf P$: its inhomogeneous Cartesian coordinates are $\tilde{\mathbf C}=\mathtt M^{-1}\mathbf p_4$. Finally, an image point $\mathbf x$ back-projects to a ray that intersects the plane at infinity at $\left((\mathtt M^{-1}\mathbf x)^T,0\right)^T$. Putting this all together, and adding a final matrix $\mathtt B$ that imposes a coordinate system on the plane, we get $$\mathtt H = \mathtt B \left(\mathbf C \mathbf \Pi^T-\mathbf C^T \mathbf \Pi \mathtt I_4 \right) \begin{bmatrix}\mathtt M^{-1}\\\mathbf 0^T\end{bmatrix}.$$ With $\mathbf\Pi=(0,0,1,0)^T$—the $x$-$y$ plane—and $$\mathtt B = \begin{bmatrix}1&0&0&0\\0&1&0&0\\0&0&0&1\end{bmatrix}$$ this produces the same homography matrix as above.
The details of the anatomy of the projection matrix that I used above can be found in Hartley and Zisserman’s Multiple View Geometry In Computer Vision and other standard references on the subject.
Ok I solved it. I can rotate whole system so my projection vector is orthogonal (for example (0, 0, 1)). And then its simple. X, Y coords stay the same and Z coord has to fit the plane equation ax + by + cz + d = 0.
If d = 0 (plane intersects point (0,0,0) (you can move your plane)) then its simple:
\begin{bmatrix}1&0&0\\0&1&0\\-a&-b&c\end{bmatrix}
matrix.
And after you can rotate system back.
Best Answer
To convert the projected 3D points into 2D coordinates you first need to define a 2D coordinate system which is contained in your plane. For this you need to define the base vectors $\overrightarrow{e_x}$ and $\overrightarrow{e_y}$ of your coordinate frame. I assume that you would want a right-handed orthonormal base.
First you need to define your base vector $\overrightarrow{e_x}$. For this you can choose any unit vector, which is contained in your plane (orthogonal to $\overrightarrow{n}$, length 1). One possibility would be to define your basis first basis vector $\overrightarrow{e_x}$ via $\overrightarrow{r}$:
$$\overrightarrow{e_x} = \frac{\overrightarrow{r} \times \overrightarrow{n}}{||\overrightarrow{r} \times \overrightarrow{n}||_2}$$
Where $\times$ denotes the vector cross product. This methods works unless $\overrightarrow{r}$ and $\overrightarrow{s}$ are parallel. Once you have found any possible basis vector $e_x$ you can derive the right hand basis vector $e_y$ by:
$$ \overrightarrow{e_y} = \frac{\overrightarrow{n} \times \overrightarrow{e_x}}{||\overrightarrow{n} \times \overrightarrow{e_x}||_2}$$
We now denote $p'$ as the projection of an arbitrary point p onto the defined plane in 3D coordinates. We can then define the 3d vector denoting the distance of point p' to the origin of the new coordinate system by:
$$\overrightarrow{rp'} = \overrightarrow{p'} - \overrightarrow{r}$$
To get the 2D coordinates of that point you simply project the 3D vector of $p'$ onto our derived 3D vectors for $e_x$ and $e_y$ using the scalar product:
$$ p'_{\ 2D} = \begin{pmatrix} \overrightarrow{rp'} \cdot \overrightarrow{e_x} \\ \overrightarrow{rp'} \cdot \overrightarrow{e_y} \end{pmatrix} $$
Finally you could reformulate the above equation into matrix form as follows:
$$ p'_{\ 2D} = K \ \overrightarrow{rp'} = \begin{pmatrix} \overrightarrow{e_x}^T \\ \overrightarrow{e_y}^T \end{pmatrix} \overrightarrow{rp'} = \begin{pmatrix} e_{x,1} & e_{x,2} & e_{x,3} \\ e_{y,1} & e_{y,2} & e_{y,3} \end{pmatrix} \overrightarrow{rp'} $$
Where K denotes the projection matrix which you can simply apply to any 3D point in the defined plane. With this the transition to a 2D coordinate for a projected point is complete.