[Math] Probability of two random points being orthogonal in higher-dimensional unit sphere

dimensional analysis, geometric-probability, geometry, probability, topological-data-analysis

I understand that most points will be close to the surface due to volume concentration. I also understand the concentration of volume near the equator, relative to any specific point (the North pole).

However, I can't understand why the North pole concept should influence the angle between the two points.

The argument seems to be that, once the first point is selected randomly and treated as the North pole, the second point is more likely to be found near the equator.

I don't understand this argument because I think the orientation of the second point has no bearing on the location of the first point. Given that both are drawn randomly, all locations on the sphere are equally probable candidates for both points. The angle should have an equal probability for all values (from zero to 180 degrees) instead of having more probability for orthogonality.

What if both points are selected at the same time (no first and second), or selected on two separate spheres? How does the North pole affect the outcomes?

[edit: add another counter-argument] Instead of picking two points, pick the distance between them randomly and then orient that distance randomly. The end points of that distance will be the two random points. Will they still be orthogonal?

Why was the North pole concept considered to influence the location for the second point?

Any thoughts?

Best Answer

I'm assuming a uniform continuous distribution, assuming the probability distributions of the two points are independent, and assuming you mean "close to orthogonal" rather than strictly orthogonal.

It is true that "all locations on the sphere are equally probable candidates for both points." Assuming independent distributions, all locations are equally probable for the second point even given a known position for the first point.

The point is that the part of the sphere close to the "equator" that you randomly selected when you placed the first point is much larger than the part that is close to that point or to the antipodal point. If you go to a high enough dimension, most of the sphere will be "close" to the "equator," for whatever your definition of "close" is.
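A quick simulation can make this concrete. The following is a minimal sketch, not part of the original argument: the function names, the dimensions tried, the 200 pairs, and the 5-degree tolerance are all illustrative assumptions. It draws each point independently, with no reference to the other, by normalizing a standard Gaussian vector (a standard way to get a uniform direction). Only the direction matters for the angle, so the same result applies to points drawn uniformly inside the ball.

```python
import math
import random


def random_unit_vector(n, rng):
    """Uniform random direction in n dimensions (normalized Gaussian vector)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def angle_degrees(u, v):
    """Angle between two unit vectors, in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding just outside [-1, 1]
    return math.degrees(math.acos(dot))


def sample_angles(n, pairs, seed=0):
    """Angles between `pairs` independently drawn pairs of random directions."""
    rng = random.Random(seed)
    return [
        angle_degrees(random_unit_vector(n, rng), random_unit_vector(n, rng))
        for _ in range(pairs)
    ]


def fraction_near_right_angle(angles, tolerance_degrees):
    """Share of angles within `tolerance_degrees` of 90 degrees."""
    hits = sum(abs(a - 90.0) <= tolerance_degrees for a in angles)
    return hits / len(angles)


for n in (3, 30, 1000):
    angles = sample_angles(n, pairs=200)
    print(n, fraction_near_right_angle(angles, 5.0))
```

Neither point is given any special role while sampling. The "pole" only appears afterwards, when the angle is measured, and the fraction of near-right angles should still grow with the dimension.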


The remarks above, of course, amount merely to a restatement of the claim in slightly different terms. Better would be an actual proof.

Suppose we have a hypersphere of $n$ dimensions. To simplify the calculations, let the radius of the sphere be $1.$ The total volume inside this sphere is therefore $$ V_n(1) = \frac{\pi^{n/2}}{\Gamma\left(\frac n2 + 1\right)}. $$ We assume a uniform distribution over the volume inside the sphere. That is, when we pick a random point according to this distribution, the probability that the point lands inside a particular region of hyperspace inside the sphere is proportional to the volume of that region.

Now arbitrarily pick a number $q$ with $0 < q \leq 1$ and consider the probability that the distance $r$ between the center of the hypersphere and the random point will be less than $q.$ This probability is the volume inside the hypersphere of radius $q$ divided by the total volume inside the hypersphere, namely, $$ P(r \leq q) = \frac{V_n(q)}{V_n(1)} = \frac{\frac{\pi^{n/2}}{\Gamma\left(\frac n2 + 1\right)} q^n} {\frac{\pi^{n/2}}{\Gamma\left(\frac n2 + 1\right)}} = q^n. $$

Now suppose $n$ is very large, for example, $n = 10000,$ and let's see what is the probability that the random point lands within a distance $q = 0.999$ from the center of the hypersphere. This is $$ P(r \leq 0.999) = 0.999^{10000} \approx 0.000045173. $$ So the random point will land in this region less than $0.005\%$ of the time. The other $99.995\%$ of the time, the point lands in the thin hyperspherical shell within just $0.001$ units from the surface.
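The arithmetic above is easy to check. Here is a small sketch that evaluates $q^n$ for a few dimensions. The helper name `prob_within_radius` and the list of dimensions are assumptions made for illustration.

```python
def prob_within_radius(q, n):
    """P(r <= q) for a point uniform in the unit n-ball: V_n(q) / V_n(1) = q**n."""
    if not 0.0 < q <= 1.0:
        raise ValueError("q must satisfy 0 < q <= 1")
    return q ** n


for n in (2, 3, 100, 10000):
    inner = prob_within_radius(0.999, n)
    # 1 - inner is the probability of landing in the outer shell of thickness 0.001
    print(f"n={n:>5}: P(r <= 0.999) = {inner:.9f}, shell = {1 - inner:.9f}")
```

In low dimensions almost all of the probability sits inside radius $0.999$. By $n = 10000$ almost none of it does.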

This doesn't happen because the point was somehow attracted to the surface of the hypersphere or because we skewed the distribution to something non-uniform to make it close to the surface. This is a distribution that is strictly uniform by volume, with no other influences. But it almost always lands very close to the surface simply because that's where almost all the volume is.

High dimensions are weird!

Now let's randomly choose two points and measure the angle between them. In order to measure the angle, after the points have already been chosen (so that we can no longer unduly influence the random placement of the points), we choose one of the points as the pole of a set of spherical coordinates. We do this only so that we can use the "latitude" in that coordinate system to measure the angle between the two points. The chance that the angle will be between $\phi_1$ and $\phi_2$ (where these are both angles between $0$ and $180$ degrees) is then proportional to how much of the sphere lies between the lines of latitude at angles $\phi_1$ and $\phi_2$ from the pole.

A general formula for this probability in high dimensions is rather ugly, I think. It is related to the question "What is the surface area of a cap on a hypersphere?"

The probability density, however, is dealt with in the answer to "Distribution of an angle between a random and fixed unit-length $n$-vectors". If the angle $\phi$ is measured in radians, the density over the range $0 \leq \phi\leq \pi$ is $$ f(\phi)=\frac{\sin^{n-2}\phi}{\int_0^\pi\sin^{n-2}\theta\,\mathrm d\theta}. $$
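To evaluate this density numerically, one option is to use the standard identity $\int_0^\pi \sin^k\theta\,\mathrm d\theta = \sqrt\pi\,\Gamma\left(\frac{k+1}2\right)/\Gamma\left(\frac k2 + 1\right)$ for the normalizing constant. This identity is not stated in the answer above and is brought in here as an assumption. The sketch below works in log space so that large $n$ does not underflow. The function name `angle_density` is illustrative.

```python
import math


def angle_density(phi, n):
    """Density f(phi) = sin^(n-2)(phi) / integral_0^pi sin^(n-2), for n >= 2."""
    k = n - 2
    if k == 0:
        return 1.0 / math.pi  # n = 2: the angle is uniform on [0, pi]
    s = math.sin(phi)
    if s <= 0.0:
        return 0.0  # the density vanishes at the poles when n > 2
    # log of sqrt(pi) * Gamma((k + 1) / 2) / Gamma(k / 2 + 1)
    log_norm = (
        0.5 * math.log(math.pi)
        + math.lgamma((k + 1) / 2)
        - math.lgamma(k / 2 + 1)
    )
    return math.exp(k * math.log(s) - log_norm)


for n in (2, 3, 100, 10000):
    print(n, angle_density(math.pi / 2, n))
```

For $n = 3$ this reduces to $\frac12\sin\phi$, the familiar density of the polar angle on an ordinary sphere. The value at $\phi = \pi/2$ grows with $n$ as the density concentrates around the right angle.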

Now let's take our $10000$-dimensional hypersphere again and see what the probability is that the angle is within $1/50$ radian (a little more than one degree) of a right angle. This is $$ P\left(\frac\pi2 - \frac1{50} \leq \phi \leq \frac\pi2 + \frac1{50}\right) = \frac{\int_{\pi/2 - 1/50}^{\pi/2 + 1/50}\sin^{9998}\phi\,\mathrm d\phi}{\int_0^\pi\sin^{9998}\theta\,\mathrm d\theta} \approx 0.95449. $$ So there is a better than $95\%$ chance that the angle will be between $88.85$ and $91.15$ degrees.
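The ratio of integrals can be approximated with ordinary numerical quadrature. The sketch below uses composite Simpson's rule. That choice of method, the step count, and the function name `prob_near_right_angle` are assumptions for illustration, not how the figure above was necessarily obtained.

```python
import math


def prob_near_right_angle(n, delta, steps=20000):
    """Approximate P(|phi - pi/2| <= delta) for the density proportional to sin^(n-2)."""
    k = n - 2

    def integrand(phi):
        # Far from pi/2 the power underflows harmlessly to 0.0 for large k.
        return max(math.sin(phi), 0.0) ** k

    def simpson(a, b, m):
        # Composite Simpson's rule on [a, b]; m must be even.
        h = (b - a) / m
        total = integrand(a) + integrand(b)
        for i in range(1, m):
            total += (4 if i % 2 else 2) * integrand(a + i * h)
        return total * h / 3

    near = simpson(math.pi / 2 - delta, math.pi / 2 + delta, steps)
    whole = simpson(0.0, math.pi, steps)
    return near / whole


print(prob_near_right_angle(10000, 1 / 50))
```

As a sanity check, for $n = 3$ the same function should return $\sin\delta$, since the density there is $\frac12\sin\phi$.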
