By insisting that a quadrilateral be sent to another quadrilateral, you are insisting that its four vertices (no three of which are collinear) be sent to the four vertices of the other.
By a translation, we can take any vertex of $Q_1$ onto any vertex of $Q_2$. That leaves three free vertices in each of $Q_1$ and $Q_2$ to be paired up. Given one free vertex of $Q_1$, there are three free vertices of $Q_2$ for it to be sent to. Given another remaining free vertex of $Q_1$, there are two free vertices of $Q_2$ remaining. Hence, taking the vertices of $Q_1$ in a fixed order, there are exactly $4 \times 3 \times 2 = 24$ distinct affine transformations which match three of the vertices of $Q_1$ with three of the vertices of $Q_2$.
At this point, all of the degrees of freedom have been used up. An affine transformation of the plane is determined by the images of three non-collinear points, and we have specified those. The fourth vertex of $Q_1$ may or may not land on the fourth vertex of $Q_2$; only in very exceptional circumstances will it do so. Remember that parallel lines must remain parallel under affine transformations and, conversely, non-parallel lines cannot be made parallel by an invertible affine transformation.
(Think of an ordinary linear space with a fixed origin. Parallelograms are sent to parallelograms. If $Q_1$ is a rhombus and $Q_2$ is a rectangle, both with a vertex at the origin, then pairing the origin and its two adjacent vertices automatically pairs the fourth vertices as well, because both shapes are parallelograms. If $Q_2$ is instead a trapezoid with only one pair of parallel sides, you can still pair three of the vertices, but the fourth can never match, because the image of the rhombus must again be a parallelogram.)
In short, for almost all pairs of quadrilaterals, there is no affine transformation from one to the other. In very exceptional circumstances, the fourth vertices will line up and at least one of the 24 candidate pairings gives an affine transformation from $Q_1$ to $Q_2$. How many of the pairings succeed depends on the symmetry of the quadrilaterals; for two parallelograms, for example, 8 of the 24 do.
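The counting argument can be checked numerically. The following is a minimal sketch in plain Python, not part of the original answer: the function names `image_of_fourth` and `count_matching_pairings` and the sample coordinates are made up for illustration. For each of the 24 pairings it builds the affine map fixed by the first three vertex pairs and tests whether the fourth vertex of $Q_1$ lands on the remaining vertex of $Q_2$.

```python
from itertools import permutations


def sub(u, v):
    """Component-wise difference u - v of two 2D points."""
    return (u[0] - v[0], u[1] - v[1])


def cross(u, v):
    """2D determinant det[u v]; zero exactly when u and v are parallel."""
    return u[0] * v[1] - u[1] * v[0]


def image_of_fourth(p, q):
    """Image of p[3] under the affine map sending p[0], p[1], p[2] to q[0], q[1], q[2]."""
    u1, u2, w = sub(p[1], p[0]), sub(p[2], p[0]), sub(p[3], p[0])
    det = cross(u1, u2)
    if det == 0:
        raise ValueError("the first three vertices are collinear")
    # Write p[3] - p[0] = alpha * u1 + beta * u2 (Cramer's rule).
    alpha = cross(w, u2) / det
    beta = cross(u1, w) / det
    v1, v2 = sub(q[1], q[0]), sub(q[2], q[0])
    # An affine map preserves these coefficients.
    return (q[0][0] + alpha * v1[0] + beta * v2[0],
            q[0][1] + alpha * v1[1] + beta * v2[1])


def count_matching_pairings(Q1, Q2, tol=1e-9):
    """Count the pairings of vertices (out of 24) realised by an affine map."""
    count = 0
    for perm in permutations(range(4)):
        q = [Q2[i] for i in perm]
        img = image_of_fourth(Q1, q)
        if abs(img[0] - q[3][0]) <= tol and abs(img[1] - q[3][1]) <= tol:
            count += 1
    return count


# Illustrative data (assumed, not from the original post).
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
parallelogram = [(0, 0), (2, 0), (3, 1), (1, 1)]
generic = [(0, 0), (2, 0), (3, 2), (0, 1)]
print(count_matching_pairings(square, parallelogram))
print(count_matching_pairings(square, generic))
```

The tolerance `tol` is an arbitrary choice for floating-point comparison; with exact rational input one could use `fractions.Fraction` and compare exactly.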
I am trying to "interpolate" an affine transform […]
I'm suggesting a different approach, which should be well suited to interpolation, although it does not depend on splitting the transformation into separate elementary operations the way your question suggests. Instead, I'd use fractional powers of a matrix, as I'll describe now.
Suppose $A$ is diagonalizable (which will be the case for most transformations). Then there exist an invertible matrix $P$ and a diagonal matrix $D$ such that $A=P\,D\,P^{-1}$. The entries of $D$ are the eigenvalues of $A$, which I'll call $\lambda_1$ and $\lambda_2$. Now you can define $A$ raised to the $t$-th power like this:
$$
A^t = P\,D^t\,P^{-1} = P\,\begin{pmatrix}\lambda_1^t&0\\0&\lambda_2^t\end{pmatrix}\,P^{-1}
$$
Now if you change $t$ continuously from $0$ to $1$, the matrix $A^t$ will change from the identity matrix to $A$. So this is your interpolation.
One thing you have to be careful about is that the $\lambda_k$ will very likely be a conjugate pair of complex numbers. You can express them as $\lambda_k=e^{z_k}$, where $z_k=\log\lambda_k$, but the logarithm of a complex number is only defined up to multiples of $2\pi i$. So in this case, you should make sure that the imaginary part doesn't become too big, namely you want $-\pi\le\operatorname{Im}(z_k)\le\pi$. This ensures that the interpolation will not take more turns for a rotation than actually required. Furthermore, you should maintain $z_2=\overline{z_1}$ to make sure that the interpolating matrices will be real as well. With this choice, $\lambda_k^t=e^{t\cdot z_k}$ is well defined and behaves as you described for the case of rotation.
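To make the recipe concrete, here is a minimal sketch of $A^t = P\,D^t\,P^{-1}$ for a real $2\times 2$ matrix, written in plain Python with `cmath`. It is not the proof-of-concept implementation mentioned in this answer; the names `eig_2x2` and `matrix_power_t` are made up for illustration. It relies on `cmath.log` returning the principal logarithm (imaginary part in $[-\pi,\pi]$) and assumes $A$ is invertible, diagonalizable, and has no negative real eigenvalue, since in that last case the two logarithms cannot be chosen as a conjugate pair.

```python
import cmath


def eig_2x2(A):
    """Eigenvalues of a real 2x2 matrix and a matrix P with eigenvectors as columns."""
    (a, b), (c, d) = A
    if b == 0 and c == 0:  # already diagonal
        return [complex(a), complex(d)], [[1, 0], [0, 1]]
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    lams = [(tr + root) / 2, (tr - root) / 2]
    # Eigenvector for lam: (b, lam - a) if b != 0, otherwise (lam - d, c).
    vecs = [(b, lam - a) if b != 0 else (lam - d, c) for lam in lams]
    P = [[vecs[0][0], vecs[1][0]],
         [vecs[0][1], vecs[1][1]]]
    return lams, P


def matrix_power_t(A, t):
    """Fractional power A**t via P * D**t * P^{-1}, returned as a real 2x2 list."""
    lams, P = eig_2x2(A)
    det_P = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    if abs(det_P) < 1e-12:
        raise ValueError("A is not diagonalizable (or nearly so)")
    P_inv = [[P[1][1] / det_P, -P[0][1] / det_P],
             [-P[1][0] / det_P, P[0][0] / det_P]]
    # lambda**t = exp(t * z) with z the principal logarithm of lambda.
    d = [cmath.exp(t * cmath.log(lam)) for lam in lams]
    PD = [[P[i][j] * d[j] for j in range(2)] for i in range(2)]
    At = [[sum(PD[i][k] * P_inv[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    # With conjugate logarithms the imaginary parts cancel up to rounding.
    return [[x.real for x in row] for row in At]


# Example: halfway through a quarter turn.
print(matrix_power_t([[0, -1], [1, 0]], 0.5))
```

Dropping the imaginary part at the end is only justified under the stated assumptions; for a matrix with a negative real eigenvalue the result would be wrong rather than merely imprecise.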
I've created a proof of concept implementation which you can use to experiment, in order to decide whether this is what you want. Here is a snapshot of the kind of interpolation this will create:
With that experiment, I realized that one should include the translational part in the matrix as well, in just the way you composed a matrix including $A$ and $b$ in your question. Otherwise, the positions of the interpolated frames will depend on the location of the defining triples in the plane, which I consider undesirable.
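One way to carry this out without a $3\times 3$ eigen-solver is sketched below, under assumptions that go beyond the original text. The homogeneous matrix built from $A$ and $b$ has the eigenvalues of $A$ plus the eigenvalue $1$, whose eigenvector is the fixed point $c$ of the map $x\mapsto Ax+b$. Its $t$-th power is therefore the map $x\mapsto c+A^t(x-c)$, with translation part $b_t=c-A^t c$. The sketch assumes $A$ has two distinct eigenvalues, none of them $0$, $1$, or a negative real number, and it computes $A^t$ with the spectral-projector form of $P\,D^t\,P^{-1}$. All function names are hypothetical.

```python
import cmath


def power_2x2(A, t):
    """A**t for a real 2x2 matrix with two distinct eigenvalues (see assumptions above)."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    if abs(root) < 1e-12:
        raise ValueError("repeated eigenvalue: this shortcut does not apply")
    l1, l2 = (tr + root) / 2, (tr - root) / 2
    f1 = cmath.exp(t * cmath.log(l1))  # principal logarithm
    f2 = cmath.exp(t * cmath.log(l2))

    def entry(i, j):
        delta = 1 if i == j else 0
        # A^t = f1 * (A - l2*I)/(l1 - l2) + f2 * (A - l1*I)/(l2 - l1)
        value = (f1 * (A[i][j] - l2 * delta) - f2 * (A[i][j] - l1 * delta)) / (l1 - l2)
        return value.real

    return [[entry(0, 0), entry(0, 1)], [entry(1, 0), entry(1, 1)]]


def affine_power(A, b, t):
    """Return (A_t, b_t), the t-th power of the affine map x -> A x + b."""
    # The fixed point c solves (I - A) c = b (Cramer's rule).
    m00, m01, m10, m11 = 1 - A[0][0], -A[0][1], -A[1][0], 1 - A[1][1]
    det = m00 * m11 - m01 * m10
    if abs(det) < 1e-12:
        raise ValueError("no unique fixed point (1 is an eigenvalue of A)")
    cx = (b[0] * m11 - m01 * b[1]) / det
    cy = (m00 * b[1] - m10 * b[0]) / det
    At = power_2x2(A, t)
    bt = (cx - (At[0][0] * cx + At[0][1] * cy),
          cy - (At[1][0] * cx + At[1][1] * cy))
    return At, bt


def apply_affine(A, b, x):
    """Evaluate A x + b for a 2D point x."""
    return (A[0][0] * x[0] + A[0][1] * x[1] + b[0],
            A[1][0] * x[0] + A[1][1] * x[1] + b[1])


# Example: quarter turn about the point (1, 0), taken halfway.
A_half, b_half = affine_power([[0, -1], [1, 0]], (1, -1), 0.5)
print(apply_affine(A_half, b_half, (1, 0)))  # the centre of rotation stays put
```

A pure translation ($A$ equal to the identity) has no fixed point and is excluded here; in that case the homogeneous matrix is not diagonalizable and the interpolation would have to be handled separately.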
Best Answer
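An affine transformation in three dimensions includes scaling (3 scaling values plus 3 degrees of freedom determining the directions of scaling), rotation (only 3 degrees of freedom), and translation (3 more), for 12 degrees of freedom in total. Each pair of corresponding points supplies 3 equations, so 4 pairs supply the required 12. So your answer is correct: 4 points not lying in the same plane are enough.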
You can think of it this way: suppose the points $p_0,\ldots,p_3$ are mapped to $q_0,\ldots,q_3$. Any other point can be expressed as $p=p_0+\alpha(p_1-p_0)+\beta(p_2-p_0)+\gamma(p_3-p_0)$, and since an affine transformation preserves such combinations, it goes to $q=q_0+\alpha(q_1-q_0)+\beta(q_2-q_0)+\gamma(q_3-q_0)$.
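The construction translates directly into code. The sketch below is plain Python with made-up names (`det3`, `make_affine_map`) and made-up sample points. It solves for $\alpha,\beta,\gamma$ with Cramer's rule and then forms the same combination of the $q_i$; it assumes $p_0,\ldots,p_3$ are not coplanar.

```python
def sub(u, v):
    """Component-wise difference u - v of two 3D points."""
    return tuple(x - y for x, y in zip(u, v))


def det3(a, b, c):
    """Determinant of the 3x3 matrix whose columns are a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - b[0] * (a[1] * c[2] - a[2] * c[1])
            + c[0] * (a[1] * b[2] - a[2] * b[1]))


def make_affine_map(ps, qs):
    """Return the affine map sending ps[i] to qs[i] for i = 0..3."""
    p0, q0 = ps[0], qs[0]
    e = [sub(p, p0) for p in ps[1:]]  # p1-p0, p2-p0, p3-p0
    f = [sub(q, q0) for q in qs[1:]]  # q1-q0, q2-q0, q3-q0
    d = det3(*e)
    if abs(d) < 1e-12:
        raise ValueError("p0..p3 lie in one plane")

    def apply(p):
        w = sub(p, p0)
        # Solve w = alpha*e0 + beta*e1 + gamma*e2 by Cramer's rule.
        alpha = det3(w, e[1], e[2]) / d
        beta = det3(e[0], w, e[2]) / d
        gamma = det3(e[0], e[1], w) / d
        return tuple(q0[i] + alpha * f[0][i] + beta * f[1][i] + gamma * f[2][i]
                     for i in range(3))

    return apply


# Illustrative data: images of the origin and the three unit vectors.
ps = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
qs = [(1, 2, 3), (3, 2, 3), (1, 3, 3), (1, 3, 6)]
T = make_affine_map(ps, qs)
print(T((1, 2, 3)))
```

With noisy measurements or more than four point pairs, a least-squares fit would be the usual alternative to this exact solve.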