The problem is to find a rotation matrix $R$ and a translation vector $\vec t$ such that
$$R\vec p_{1B}+\vec t=\vec p_{1A}\;,\tag1$$
$$R\vec p_{2B}+\vec t=\vec p_{2A}\;.\tag2$$
Subtracting these yields
$$R\left(\vec p_{1B}-\vec p_{2B}\right)=\vec p_{1A}-\vec p_{2A}\;.\tag3$$
Since the $y$ axis of system $B$ is parallel to the $x$-$y$ plane of system $A$, we can obtain the rotation by first rotating around the $y$ axis and then rotating around the $z$ axis, so $R$ must have the form
$$
\begin{pmatrix}\cos\alpha&\sin\alpha\\-\sin\alpha&\cos\alpha\\&&1\end{pmatrix}
\begin{pmatrix}\cos\beta&&\sin\beta\\&1&\\-\sin\beta&&\cos\beta\end{pmatrix}
=
\begin{pmatrix}
\cos\alpha\cos\beta&\sin\alpha&\cos\alpha\sin\beta\\
-\sin\alpha\cos\beta&\cos\alpha&-\sin\alpha\sin\beta\\
-\sin\beta&&\cos\beta
\end{pmatrix}
\;.
$$
So the third row depends only on $\beta$, and writing out the corresponding row of $(3)$ yields a trigonometric equation for $\beta$:
$$-\sin\beta(p_{1Bx}-p_{2Bx})+\cos\beta(p_{1Bz}-p_{2Bz})=p_{1Az}-p_{2Az}\;.$$
Since we'll get another equation of the form
$$a\sin\phi+b\cos\phi=c\tag4$$
shortly, I'll solve it in that general form, and you can substitute
$$
\begin{eqnarray}
a&=&p_{2Bx}-p_{1Bx}\;,\\
b&=&p_{1Bz}-p_{2Bz}\;,\\
c&=&p_{1Az}-p_{2Az}
\end{eqnarray}
$$
to solve this one. Writing $a$ and $b$ in polar form, $a=r\cos\xi$, $b=r\sin\xi$, leads to
$$r\cos\xi\sin\phi+r\sin\xi\cos\phi=c\;,$$
$$\sin(\xi+\phi)=\frac cr\;.$$
You can get one solution from
$$\phi_1=\arcsin\frac cr-\xi\;,$$
but note that in general there's a second one, $\phi_2=\pi-\phi_1$, which makes sense, since there are in general two different angles through which you can turn a vector around the $y$ axis to give it a certain $z$ component.
You can determine $r$ and $\xi$ using $r=\sqrt{a^2+b^2}$ and $\xi=\text{atan}(b,a)$, where $\text{atan}$ is the two-argument arctangent function found in many programming environments, which takes into account the signs of both arguments to disambiguate the arctangent on the full range of angles.
Now you have two values of $\beta$, and you can substitute them into the rotation matrix. For instance, substituting into the first row of $(3)$ yields
$$
\cos\alpha\cos\beta(p_{1Bx}-p_{2Bx})+\sin\alpha(p_{1By}-p_{2By})+\cos\alpha\sin\beta(p_{1Bz}-p_{2Bz})=p_{1Ax}-p_{2Ax}\;,
$$
which is again a trigonometric equation for $\alpha$ of the form $(4)$, with
$$
\begin{eqnarray}
a
&=&
p_{1By}-p_{2By}\;,
\\
b
&=&
\cos\beta(p_{1Bx}-p_{2Bx})+\sin\beta(p_{1Bz}-p_{2Bz})
\;,
\\
c
&=&
p_{1Ax}-p_{2Ax}
\;.
\end{eqnarray}
$$
You can solve it just like the other one, to obtain two values of $\alpha$ for each of the two values of $\beta$. Again, this makes sense, since we've only used the first row of $(3)$ so far and there are two different angles through which you can turn a vector around the $z$ axis to give it a certain $x$ component. However, only one of each of these two pairs of values of $\alpha$ will also solve the second row of $(3)$ (that is, the other one would produce a wrong $y$ component), so in the end you obtain two sets of solutions for $\alpha$ and $\beta$. You can substitute each of these into either $(1)$ or $(2)$ to get the corresponding value of $\vec t$.
So in general you get two different affine transformations for valid point coordinates, though sometimes, e. g. if $a=b=c=0$ in either trigonometric equation, the solution will be a lot more underdetermined than that, and if you substitute invalid point coordinates (e. g. for points at different distances from each other, or such that $|c|>r$ in either trigonometric equation), there will be no solutions at all.
I wrote the affine transform for transforming from $B$ to $A$ to simplify the calculations, but you can easily obtain the transform in the other direction as
$$\vec p_B=R^T(\vec p_A-\vec t)\;.$$
I'd say this won't work. The way I read it, this SVD-based approach makes use of the fact that the true rotation matrix is orthogonal with unit singular values. The SVD of the approximate result will yield two orthogonal matrices and a diagonal matrix whose elements should be almost one. Setting them to exactly one will yield the result. But you use homogenous coordinates, the true transform won't have singular values which are all one.
Looking for alternatives, I guess I'd try separating the translation and the rotation. You could treat the barycenter of your points as the center of the object, and consider its translation. Translate it to the origin, perform the rotation using SVD there, then translate to the target position. Not sure whether you get better results from an integrated technique, but I guess that very much depends on your error cost function which you haven't specified.
As for the number of points: translation has three real degrees of freedom, and so has rotation. Two arbitrary points in space gives six real degrees of freedom, but if their distance is fixed, you only have five left. So two points cannot be enough to fully determine the transformation, and you should use a third point, and if you already have a fourth, I'd plug that in as well just to make things more robust and get a better idea of how large the errors are.
Best Answer
The problem is called Wahba's problem (Note 1) and seeks to minimise the following:
$J(\mathbf{R}) = \frac{1}{2} \sum_{k=1}^{N} a_k|| \mathbf{w}_k - \mathbf{R} \mathbf{v}_k ||^2$
where $a_k$ are weights, $\mathbf{w}_k$ are the vectors in one frame, $\mathbf{v}_k$ are the vectors in the other frame and $\mathbf{R}$ is the rotation matrix that you seek.
You can construct two vectors from the 3 points that you have.
The easiest way to solve it (in my opinion) is by using a Singular Value Decomposition as reported by Markley (1988):
$\mathbf{B} = \sum_{i=1}^{n} a_i \mathbf{w}_i {\mathbf{v}_i}^T$
$\mathbf{B} = \mathbf{U} \mathbf{S} \mathbf{V}^T$
$\mathbf{R} = \mathbf{U} \mathbf{M} \mathbf{V}^T$
where $\mathbf{M} = diag(\begin{bmatrix} 1 & 1 & det(\mathbf{U}) det(\mathbf{V})\end{bmatrix})$
The answer will be "exact" for 2 vectors. With more than 2 vectors, it will choose the "best" rotation that fits the cost function.
For exactly 2 vectors, you can also use the Triad Method which is somewhat simpler to implement.
Note 1: This problem - and related problems - have different names in different fields. See also the following -
References: Markley, F. L. Attitude Determination using Vector Observations and the Singular Value Decomposition Journal of the Astronautical Sciences, 1988, 38, 245-258