Distance
Convert from spherical coordinates to cartesian ones. If you have latitude $\varphi$ and longitude $\lambda$, you can compute
\begin{align*}
x &= r\cos\varphi\cos\lambda \\
y &= r\cos\varphi\sin\lambda \\
z &= r\sin\varphi
\end{align*}
This assumes that the earth is a sphere of radius $r$, and has the $z$ axis pointing north and the $x$ axis pointing twoards the intersection of the equator and the prime meridian.
Do this conversion for both points, and you have two positions in $\mathbb R^3$. Compute their element-wise difference and take the length of that, by squaring the elements, adding them up and computing the square root.
$$\ell = \sqrt{(x_1-x_2)^2+(y_1-y_2)^2+(z_1-z_2)^2}$$
Direction
In order to compute a kind of bearing from the difference between two locations, you'd most likely want to orthogonally project the displacement vector onto a plane tangent to the surface of the earth at your current location. To do this, let us compute two tangent directions, as derivatives with respect to latitude and longitude. The first points north, the second east.
\begin{align*}
n = \frac{\partial(x,y,z)^T}{\partial\varphi} &=
\begin{pmatrix}
-\sin\varphi\cos\lambda \\
-\sin\varphi\sin\lambda \\
\cos\varphi
\end{pmatrix}
&
e = \frac{\partial(x,y,z)^T}{\partial\lambda} &=
\begin{pmatrix}
-\cos\varphi\sin\lambda \\
\cos\varphi\cos\lambda \\
0
\end{pmatrix}
\end{align*}
Next you have to normalize these to unit length. $n$ already has unit length, but $e$ has to be rescaled.
\begin{align*}
\hat n &= \frac{n}{\lVert n\rVert} = n = \begin{pmatrix}
-\sin\varphi\cos\lambda \\
-\sin\varphi\sin\lambda \\
\cos\varphi
\end{pmatrix} &
\hat e &= \frac{e}{\lVert e\rVert} = \frac1{\cos\varphi}e = \begin{pmatrix}
-\sin\lambda \\
\cos\lambda \\
0
\end{pmatrix}
\end{align*}
Then you can use dot products to project your displacement vector onto these directions. This gives you two components, the ratio of which corresponds to the tangens of the azimuth. With zero at north and positive angle going east, you want the north-pointing component in the denominator and the east-pointing component in the numerator.
\begin{align*}
\tan\alpha &= \frac{\hat e\cdot(p_2-p_1)}{\hat n\cdot(p_2-p_1)}
\\&= \frac
{-\sin\lambda_1(x_2-x_1)
+\cos\lambda_1(y_2-y_1)}
{-\sin\varphi_1\cos\lambda_1(x_2-x_1)
-\sin\varphi_1\sin\lambda_1(y_2-y_1)
+\cos\varphi_1(z_2-z_1)}
\end{align*}
Note that there are two solutions to this, so you either have to check the signs or use an atan2 function.
If you wanted to, you could perform additional computations to determine the elevation, i.e. the angle between the displacement vector and the tangent plane.
Properties of this bearing
I would assume the difference between magnetic north and true north to be rather small in most places, so I'm not sure how much more detail you can expect from a computation like this, with simplifications like a spherical earth and so on.
The initial bearing chosen in this fashion is identical to the one obtained for a greatcircle course. The reason is because in both of these cases you obtain the bearing by intersecting the tangent plane with the plane spanned by origin, destination and center of the earth. From the USA to Mecca that bearing would be mostly NE, according to Google Earth:
I hope this matches your question. If the answer didn't match the answer you expected, then you should probably confer with the authorities which caused that expectation, to see whether you asked the right question in the first place. The rhumb line you mention as an alternative has certain benefits for simple compass navigation, but is not related to an orthogonal projection of the direct displacement vector, or to the shortest distance either through the earth or along its surface.
I may be oversimplifying this, but the geometry seems to be be a non-issue. The formula for integrating over a parametrized surface $\vec r=\vec r(u,v)$ is
$$\iint_D f(\vec r(u,v))|\vec r_u\times \vec r_v|\,du\,dv \tag1$$
where $D$ is the parameter domain (rectangle in your case). There is no conceptual difference between the factors $f(\vec r(u,v))$ and $|\vec r_u\times \vec r_v|$ in (1). It's just an integral over a rectangle of some function $g(u,v)= f(\vec r(u,v))|\vec r_u\times \vec r_v|$. To evaluate (1) numerically, your can sample the integrand $g$ along a grid and multiply by appropriate rectangular areas $\Delta u_i \Delta v_j$.
the trapezoid formed by connecting the image of the four points
The images of four points will not lie in the same plane in general, much less form a trapezoid. The term $|\vec r_u\times \vec r_v|\,du\,dv$ implicitly uses approximation of a piece of the surface with a parallelogram.
Best Answer
Getting the boundary of the visible region is much easier than calculating its area. The boundary curve is an ellipse. It is the intersection of the ellipsoid and the so-called polar plane of the view-point.
If the ellipsoid has equation $$ \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 $$ Then the polar plane of the point $(u,v,w)$ has equation $$ \frac{ux}{a^2} + \frac{vy}{b^2} + \frac{wz}{c^2} = 1 $$ And, actually, I have now found that everything you need is probably in this paper. The details of the ellipse “horizon” are worked out, and the enclosed area is computed by numerical methods.