One way to proceed is to generate a point uniformly on the sphere, apply the mapping $f : (x,y,z) \mapsto (x'=ax,y'=by,z'=cz)$ and then correct the distortion created by the map by discarding the point randomly with some probability $p(x,y,z)$ (after discarding you restart the whole thing).
When we apply $f$, a small area $dS$ around some point $P(x,y,z)$ will become a small area $dS'$ around $P'(x',y',z')$, and we need to compute the multiplicative factor $\mu_P = dS'/dS$.
I need two tangent vectors around $P(x,y,z)$, so I will pick $v_1 = (dx = y, dy = -x, dz = 0)$ and $v_2 = (dx = z,dy = 0, dz=-x)$
We have $dx' = adx, dy'=bdy, dz'=cdz$ ;
$Tf(v_1) = (dx' = adx = ay = ay'/b, dy' = bdy = -bx = -bx'/a,dz' = 0)$, and similarly $Tf(v_2) = (dx' = az'/c,dy' = 0,dz' = -cx'/a)$
(we can do a sanity check and compute $x'dx'/a^2+ y'dy'/b^2+z'dz'/c^2 = 0$ in both cases)
Now, $dS = v_1 \wedge v_2 = (y e_x - xe_y) \wedge (ze_x-xe_z) = x(y e_z \wedge e_x + ze_x \wedge e_y + x e_y \wedge e_z)$ so $|| dS || = |x|\sqrt{x^2+y^2+z^2} = |x|$
And $dS' = (Tf \wedge Tf)(dS) = ((ay'/b) e_x - (bx'/a) e_y) \wedge ((az'/c) e_x-(cx'/a) e_z) = (x'/a)((acy'/b) e_z \wedge e_x + (abz'/c) e_x \wedge e_y + (bcx'/a) e_y \wedge e_z)$
And finally $\mu_{(x,y,z)} = ||dS'||/||dS|| = \sqrt{(acy)^2 + (abz)^2 + (bcx)^2}$.
It's quick to check that when $(x,y,z)$ is on the sphere the extrema of this expression can only happen at one of the six "poles" ($(0,0,\pm 1), \ldots$). If we suppose $0 < a < b < c$, its minimum is at $(0,0,\pm 1)$ (where the area is multiplied by $ab$) and the maximum is at $(\pm 1,0,0)$ (where the area is multiplied by $\mu_{\max} = bc$)
The smaller the multiplication factor is, the more we have to remove points, so after choosing a point $(x,y,z)$ uniformly on the sphere and applying $f$, we have to keep the point $(x',y',z')$ with probability $\mu_{(x,y,z)}/\mu_{\max}$.
Doing so should give you points uniformly distributed on the ellipsoid.
The map $\varphi:(\theta,\phi)\rightarrow\Big(f(\theta,\phi),\theta,\phi\Big)$ is not a parametric representation of the spherical surface $r=f(\theta,\phi)$. Rather, $\varphi$ is a parametric representation of the surface $x=f(y,z)$ since parametric surfaces (and curves) are inherently represented in Cartesian form. What you're doing is equivalent to saying $\theta \longrightarrow \Big(\theta^2,\theta\Big)$ is a parametric represention of the polar spiral $r=\theta^2$ when, in reality, $\theta \rightarrow \Big(\theta^2,\theta\Big)$ is parametric representation of the parabola $x=y^2$ while $\theta \rightarrow \Big(\theta^2 \cos(\theta),\theta^2 \sin(\theta)\Big)$ is the spiral $r=\theta^2$.
Let's refer to your example in which you seek to compute the surface area of a sphere.
In spherical coordinates, the equation of a sphere is $r=1$ on the domain $(\theta,\phi)\in[0,2\pi)\times [0,\pi]$. You can represent this parametrically as $$(\phi,\theta) \longrightarrow \Big(\sin(\phi)\cos(\theta),\sin(\phi)\sin(\theta),\cos(\phi)\Big)$$ simply by converting from spherical to cartesian coordinates. However, the image of your function $\varphi:(\phi,\theta)\longrightarrow (1,\phi,\theta)$ on the same domain is the rectangular patch $\{1\}\times [0,2\pi)\times [0,\pi]$ embedded in the vertical plane $x=1$ which has area $2\pi^2$. This is precisely why $$\int_0^{2\pi}\int_0^{\pi}\sqrt{1+\big(f_{\theta}\big)^2+\big(f_{\phi}\big)^2}d\phi d\theta=2\pi^2 \neq 4\pi$$
whenever $f(\phi,\theta)=1$; this integral is calculating the area of the surface $x=1$ on $(y,z)\in [0,2\pi)\times[0,\pi]$ which is the image of your map $\varphi$.
If you want to find the area of the surface given in spherical coordinates by $r=f(\phi,\theta)$ defined on domain $(\phi,\theta)\in \mathcal{U}$ you will need to express this parametrically as $$\vec{p}(\phi,\theta)=\Big(f(\phi,\theta)\sin(\phi)\cos(\theta),f(\phi,\theta)\sin(\phi)\sin(\theta),f(\phi,\theta)\cos(\phi)\Big)$$ The area of the surface will be $$S=\int \int _{\mathcal{U}}||\vec{p}_\phi \times \vec{p}_\theta||d\phi d\theta$$ If you compute $||\vec{p}_\phi \times \vec{p}_\theta||$ you will surely obtain your desired result.
Best Answer
I did find the following ovservation. Let us denote $\rho = \phi \circ \psi$. We want to find $\psi$ such that for all open subsets $W \subseteq U$, the submanifold $\rho(W)$ satisfies that $\rho$ preserves the probabiliy:
$$\frac{\int_W \sqrt{\det (D\rho ^T D\rho)}\text{d}u}{\text{vol}_k(M)}=\frac{\int_W \text{d}u}{\text{vol}_k(U)}$$
Since this is for all $W$ we must have $$\frac{\sqrt{\det (D\rho ^T D\rho)}}{\text{vol}_k(M)}=\frac{1}{\text{vol}_k(U)}$$ Note that by the chain rule $$\sqrt{\det (D\rho ^T D\rho)}=\sqrt{\det ((D\phi \circ \psi D\psi)^T(D\phi \circ \psi D\psi)}=\sqrt{\det (D\psi^T (D\phi \circ \psi)^T (D\phi \circ \psi) D\psi}$$ $$=|\det D\psi|\sqrt{\det ((D\phi \circ \psi)^T (D\phi \circ \psi)}=\frac{\text{vol}_k(M)}{\text{vol}_k(U)}$$ From here I think it's pretty obvious that generally you can't solve explicitly for $\psi$, At least that's my guess.