You can compute the homography matrix H with your eight points with a matrix system such that the four correspondance points $(p_1, p_1'), (p_2, p_2'), (p_3, p_3'), (p_4, p_4')$ are written as $2\times9$ matrices such as:
$p_i = \begin{bmatrix}
-x_i \quad -y_i \quad -1 \quad 0 \quad 0 \quad 0 \quad x_ix_i' \quad y_ix_i' \quad x_i' \\
0 \quad 0 \quad 0 \quad -x_i \quad -y_i \quad -1 \quad x_iy_i' \quad y_iy_i' \quad y_i'
\end{bmatrix}$
It is then possible to stack them into a matrix $P$ to compute:
$PH = 0$
Such as:
$PH = \begin{bmatrix}
-x_1 \quad -y_1 \quad -1 \quad 0 \quad 0 \quad 0 \quad x_1x_1' \quad y_1x_1' \quad x_1' \\
0 \quad 0 \quad 0 \quad -x_1 \quad -y_1 \quad -1 \quad x_1y_1' \quad y_1y_1' \quad y_1' \\
-x_2 \quad -y_2 \quad -1 \quad 0 \quad 0 \quad 0 \quad x_2x_2' \quad y_2x_2' \quad x_2' \\
0 \quad 0 \quad 0 \quad -x_2 \quad -y_2 \quad -1 \quad x_2y_2' \quad y_2y_2' \quad y_2' \\
-x_3 \quad -y_3 \quad -1 \quad 0 \quad 0 \quad 0 \quad x_3x_3' \quad y_3x_3' \quad x_3' \\
0 \quad 0 \quad 0 \quad -x_3 \quad -y_3 \quad -1 \quad x_3y_3' \quad y_3y_3' \quad y_3' \\
-x_4 \quad -y_4 \quad -1 \quad 0 \quad 0 \quad 0 \quad x_4x_4' \quad y_4x_4' \quad x_4' \\
0 \quad 0 \quad 0 \quad -x_4 \quad -y_4 \quad -1 \quad x_4y_4' \quad y_4y_4' \quad y_4' \\
\end{bmatrix} \begin{bmatrix}h1 \\ h2 \\ h3 \\ h4 \\ h5 \\ h6 \\ h7 \\ h8 \\h9 \end{bmatrix} = 0$
While adding an extra constraint $|H|=1$ to avoid the obvious solution of $H$ being all zeros. It is easy to use SVD $P = USV^\top$ and select the last singular vector of $V$ as the solution to $H$.
Note that this gives you a DLT (direct linear transform) homography that minimizes algebraic error. This error is not geometrically meaningful and so the computed homography may not be as good as you expect. One typically applies nonlinear least squares with a better cost function (e.g. symmetric transfer error) to improve the homography.
Once you have your homography matrix $H$, you can compute the projected coordinates of any point $p(x, y)$ such as:
$\begin{bmatrix}
x' / \lambda \\
y' / \lambda \\
\lambda
\end{bmatrix} =
\begin{bmatrix}
h_{11} & h_{12} & h_{13} \\
h_{21} & h_{22} & h_{23} \\
h_{31} & h_{32} & h_{33}
\end{bmatrix}.
\begin{bmatrix}
x \\
y \\
1
\end{bmatrix}$
A common way to do this is to find a planar perspective transformation that “warps” the rectangle into the image quadrilateral and then use its inverse to map points on the image back to the rectangle. There are software libraries that include this as standard functionality.
If you want to code this up for yourself, it’s not terribly difficult to work out the necessary mapping. Without going into the detailed derivation, a general planar perspective transformation can be represented (in homogeneous coordinates) by a matrix of the form $$
M=\pmatrix{m_{00} & m_{01} & m_{02} \\ m_{10} & m_{11} & m_{12} \\ m_{20} & m_{21} & 1},
$$ which corresponds to the mapping $$\begin{align}
x' &= {m_{00}x+m_{01}y+m_{02} \over m_{20}+m_{21}+1} \\
y' &= {m_{10}x+m_{11}y+m_{12} \over m_{20}+m_{21}+1}.
\end{align}$$ Given a set of corner-to-corner correspondences between a pair of quadrilaterals, the eight coefficients can by found by solving a system of linear equations.
If one of the quads is a rectangle aligned with the coordinate axes, this transformation can be built up in stages, which can be more convenient for implementation in software. Assuming that we have the matrix $A$ which maps from the unit square to the quadrilateral, the rectangle-to-quadrilateral transformation can be derived by composing it with suitable translation and scaling transformations: $$
M = A\cdot S\left(\frac1w,\frac1h\right)\cdot T(-x_{LL},-y_{LL}).
$$ The inverse map is then $$
M^{-1}=T(x_{LL},y_{LL})\cdot S(w,h)\cdot A^{-1} = \pmatrix{w&0&x_{LL} \\ 0&h&y_{LL} \\ 0&0&1 }\cdot A^{-1},
$$ where $(x_{LL},y_{LL})$ are the coordinates of the rectangle’s lower-left corner. Left-handed coordinate systems can be accommodated by throwing in the appropriate reflections, and skew rectangles can be dealt with by adding a rotation to the transformation cascade.
Now, it’s just a matter of finding the matrix $A$. Let the corners of the quadrilateral be given by the points $q_i'$, and map the corners of the unit square to them as follows: $$\begin{align}
(0,0)&\mapsto q_0' \\
(1,0)&\mapsto q_1' \\
(0,1)&\mapsto q_2' \\
(1,1)&\mapsto q_3'.
\end{align}$$ Solving the resulting system of equations is tedious, but straightforward. One form of the solution is as follows: $$\begin{align}
a_{00} &= a_{20}x_1'+\Delta x_{10} \\
a_{10} &= a_{20}y_1'+\Delta y_{10} \\
a_{20} &= {\omega(\Delta_{10},\Delta_{32})+\omega(\Delta_{20},\Delta_{32})-\omega(\Delta_{30},\Delta_{32}) \over \omega(\Delta_{31},\Delta_{32})} \\
a_{01} &= a_{21}x_2'+\Delta x_{20} \\
a_{11} &= a_{21}y_2'+\Delta y_{20} \\
a_{21} &= -{\omega(\Delta_{10},\Delta_{31})+\omega(\Delta_{20},\Delta_{31})-\omega(\Delta_{30},\Delta_{31}) \over \omega(\Delta_{31},\Delta_{32})} \\
a_{02} &= x_0' \\
a_{12} &= y_0', \\
\end{align}$$ where $\Delta_{ij}=q_i'-q_j'$, $\Delta x_{ij}$ and $\Delta y_{ij}$ are the corresponding coordinate deltas, and $\omega$ is the symplectic form $\omega(\mathbf u, \mathbf v) = u_xv_y-u_yv_x$. (That this looks like a lot of cross products is no coincidence—there’s a geometric interpretation of the solution as the intersection of various planes.)
Since you’re mainly interested in mapping from a quadrilateral to a rectangle, it might be more efficient, and probably more computationally stable, to compute $A^{-1}$ directly from the corner coordinates. It’s similar to the solution for the other direction, but I’ll leave that calculation to you.
Best Answer
The question is a bit vague, but the first idea that came to my mind would be to use a low-discrepancy sequence such as a Halton sequence or a Sobol sequence. Here are examples of 256 points distributed on the plane using each of them, with a random distribution for comparison:
From left to right: Halton sequence, Sobol sequence, random sequence. Images by Jheald / Wikimedia Commons, licensed under the CC-By-SA 3.0 license.
At least, you could use these sequences as starting configurations for a local optimization algorithm, which might e.g. repeatedly move each point a small distance away from its nearest neighbor until the configuration stabilizes.