Probability of Hexagon Side Lengths on Unit Circle – Conjecture P(ab < c) = 3/5

circlesconjecturesgeometric-probabilityintegrationprobability

The vertices of a hexagon are uniformly random points on a unit circle; $a,b,c$ are the lengths of three distinct random sides.

A simulation with $10^7$ such random hexagons yielded a proportion of $0.60008$ of them satisfying $ab<c$.

Is the following conjecture true: $P(ab<c)=\dfrac35$.

Probabilities involving the side lengths of random polygons inscribed in a circle are often simple rational numbers; for example here, here and here.

Those examples involve triangles, which are determined by only two parameters (two random central angles, with the third angle determined by the others), so it was easy to set up an integral. But this question involves a hexagon, which involves five parameters (five random angles, with the sixth angle determined by the others), and I do not know how to set up an integral.

Sometimes these kinds of questions can be answered without integration, using pure geometry, for example here. But I have been unable to find a purely geometric proof for this question.

$P(ab<c)$ seems to be a simple rational number only for triangles and hexagons, among $n$-gons for $3\le n\le 12$, based on simulations.

Distribution of $a$

Here I derive $f(x)$, the density function of $a$, which is the same as the density function of $b$ and $c$. However, I do not know if this is helpful, because $a,b,c$ are not independent.

Suppose the first chosen random point on the circle is $O$, and $a$ is the length of the side immediately anticlockwise from $O$.

enter image description here

$a>x$ if all the other chosen points are on the major arc subtended by the red line segment, and the other chosen points are not all on the minor arc subtended by the green line segment.

$\theta=\arcsin \frac{x}{2}$

$\begin{align}
P(a<x)&=1-P(a>x)\\
&=1-\left(\left(1-\frac{1}{\pi}\arcsin\frac{x}{2}\right)^5-\left(\frac{1}{\pi}\arcsin\frac{x}{2}\right)^5\right)
\end{align}$

$\begin{align}
f(x)&=\frac{d}{dx}P(a<x)\\
&=\frac{5}{\pi\sqrt{4-x^2}}\left(\left(1-\frac{1}{\pi}\arcsin\frac{x}{2}\right)^4+\left(\frac{1}{\pi}\arcsin\frac{x}{2}\right)^4\right)\\
\end{align}$

enter image description here

I checked that $\int_0^2f(x)\mathrm dx=1$.

Best Answer

Some words on how this answer evolved in time. After looking at the other "nice probabilities" related to metric conditions on segments between points randomly taken on the unit circle, the first impression was that $3/5$ should somehow quickly show up. However, neither the geometry nor the analytic tools were leading to something in the reach of this clean number. A piece of answer was only waiting for the last computational step. But it could not be found. So i decided to also experiment, and/or numerically compute the integral for the wanted probability. More and more, the feeling crystallized that $3/5$ is not the right answer, but it was rather the one result "close to some beautiful number" obtained experimentally, maybe in a long row of experiments. Then my answer completely changed, below we no longer have any geometric argument, rather there is a path to convince the reader oppinion against the $3/5$ result.



Let $C(6)$ the space of configurations of six distinct points on the unit circle $S$, identified with $[0,2\pi)=\Bbb R/2\pi \Bbb Z$, and $C_0(6)$ the space of configurations of six distinct points in cyclic order on $S$. The symmetric group acts on $C(S)$, and the cyclic group on $C_0(S)$. A tuple in $C(6)$ will be denoted by $P=(P_0,P_1,P_2,P_3,P_4,P_5)$. A tuple in $C_0(6)$ will be denoted by $A=(A_0,A_1,A_2,A_3,A_4,A_5)$. There is a map $C(6)\to C_0(6)$ that brings a configuration $P$ into a configuration $A$ with $A_0=P_0$, and with the following components ordered in cyclic order, so that as sets the components of the two tuples $A,P$ coincide.

A side of $P\in C(6)$ is one among the segments $P_0P_1$, $P_1P_2$, $P_2P_3$, $P_3P_4$, $P_4P_5$, $P_5P_0$.

We now pick from each configuration in $C(6)$ or $C_0(6)$ three random sides. Let $a,b,c$ be their lengths. We have to compute the probability $p$ of the event $E$ that $ab<c$.

Let $(\Omega,\Bbb P)$ be the space of configuration in $C(6)$ with three chosen sides, $E\subset\Omega$.

Let $(\Omega_0,\Bbb P)$ be the space of configuration in $C_0(6)$ with three chosen sides, take $E_0$ the part of $E$ with configurations in $C_0(6)$.

The computation of $p=\Bbb P(E)$ working in $\Omega$ is the same as for $\Bbb P_0(E_0)$ working in $\Omega_0$, because by definition it factorizes through the step of cyclicly ordering configurations from $C(6)$. We work thus over $C_0(6)$-configurations from now on.

Here are some pictures of making a choice of three sides as a set, when a configuration of six points $A\in C_0(6)$ is fixed.

enter image description here


Let us fix ideas, and consider the case I. By rotational symmetry, we can place the component $A_0$ of a $6$-tulple $A$ in $0$, then integrate on the piece $\Delta(5)$ of all $x=(x_1,x_2,x_3,x_4,x_5)\in[0,2\pi)^5$ with $ 0\le x_1\le x_2\le x_3\le x_4\le x_5\le 2\pi $, which determine the positions of $A_1,A_2,A_3,A_4,A_5)$ on the circumerence. The volume of this space of configurations is $(2\pi)^5/5!$, the volume of the $5$-dimensional simplex with size $(2\pi)$.


Assume for a second that $a,b,c$ are chosen to be the sides $A_0A_1$, $A_1A_2$, $A_2A_3$, $$ a=2\sin\underbrace{\frac{x_1-0}2}_{:=y_1}\ ,\qquad b=2\sin\underbrace{\frac{x_2-x_1}2}_{:=y_2}\ ,\qquad c=2\sin\underbrace{\frac{x_3-x_2}2}_{:=y_3}\ , $$ we will use the above $y_1,y_2,y_3$, and it is convenient to also introduce

$\displaystyle y_4=\frac12(x_4-x_3)$, $\displaystyle y_5=\frac12(x_5-x_4)$, $\displaystyle y_6=\frac12(2\pi-x_5)$.

So $y_1+y_2+y_3+y_4+y_5+y_6=\pi$, and we may change variables from $x$ to $y$, so that for example:

$dy_1=\frac 12 dx_1$, $dy_2=\frac 12(dx_2-dx_1)$, $dy_3=\frac 12(dx_3-dx_2)$, $dy_4=\frac 12(dx_4-dx_3)$, $dy_5=\frac 12(dx_5-dx_4)$,

and building Jacobians, or wedges to have computations in a line:

$dy_1\wedge dy_2=\frac 12dx_1\wedge\frac 12(dx_2-dx_1)=\frac 1{2^2}dx_1\wedge dx_2$, and similarly

$dy:=dy_1 \wedge dy_2 \wedge dy_3 \wedge dy_4 \wedge dy_5$ is $\frac 1{2^5} dx_1 \wedge dx_2 \wedge dx_3 \wedge dx_4 \wedge dx_5 =\frac 1{2^5}dx$.

The formula for the probability $p_1$ of the event $E$, conditioned to the case $I$, and the made choice for $a,b,c$ (first three chords) is now $$ \begin{aligned} p_1 &=\frac 1{(2\pi)^5/5!} \int_{\Delta(5)}1_{A_0A_1\cdot A_1A_2<A_2A_3}\; dx\\ &=\frac 1{(2\pi)^5/5!} \int_0^{2\pi}dx_1 \int_{x_1}^{2\pi}dx_2 \int_{x_2}^{2\pi}dx_3 \int_{x_3}^{2\pi}dx_4 \int_{x_4}^{2\pi}dx_5 \cdot 1_{2\sin\frac {x_1-0}2\cdot 2\sin\frac {x_2-x_1}2<2\sin\frac {x_3-x_2}2}\\ &=\frac 1{\pi^5/5!} \int_0^{\pi}dy_1 \int_0^{\pi-y_1}dy_2 \int_0^{\pi-y_1-y_2}dy_3\cdot 1_{2\sin y_1\cdot 2\sin y_2<2\sin y_3} \underbrace{ \iint_{\substack{0\le y_4,y_5\\y_4+y_5\le \pi-y_1-y_2-y_3}} dy_4\;dy_5}_{=\frac 12(\pi-y_1-y_2-y_3)^2} \\ &\color{blue}{ =\frac 1{\pi^5/5!} \int_{\substack{0\le y_1,y_2,y_3\le \pi\\0\le y_1+y_2+y_3\le \pi}} \frac 12(\pi-y_1-y_2-y_3)^2 \cdot1_{2\sin y_1\cdot 2\sin y_2<2\sin y_3} \;dy_1\; dy_2\; dy_3} \\ &\color{brown}{ =\frac 1{\pi^5/5!} \int_{\substack{0\le y_1,y_2,y_3,y_4\le \pi\\y_1+y_2+y_3+y_4= \pi}} \frac 12y_4^2 \cdot1_{2\sin y_1\cdot 2\sin y_2<2\sin y_3} \;dy} \ . \end{aligned} $$



Recall that the above calculation was done under the condition that we are in case one, and for the special fixed choice of $a,b,c$.

For short, we did the following. We have isolated the among the six sectors delimited by $A=(A_0,A_1,A_2,A_3,A_4,A_5)$ on the circumference those three that contain the sides $a,b,c$, we have used $2y_1,2y_2,2y_3$ for the angles at center pointing to these chords $a,b,c$ respectively, then we have changed variables, and integrated w.r.t. the remained variables.

A similar procedure applies also for an other choice of $a,b,c$ in Case $I$, and also for the Cases $II$, for the Case $II'$ (the reflected, changed orientation view to $II$), and for the case $III$. In each case we obtain the same formula.

This implies by cumulation of all conditional cases that $p=p_1$. For practical usage, the blue expression for $p=p_1$ is a good start, but in order to see the symmetry the brown formula is better suited. (We can first integrate w.r.t. $y_4$ if we want so.)



For other values of $n\ge 4$, the similar problem when $n$ instead of $6$ points on the unit circle are chosen, the formula for the corresponding probability $p(n)$ is with a similar argumentation: $$ \begin{aligned} p(n) &\color{blue}{ =\frac {(n-1)!}{\pi^{n-1}} \int_{\substack{0\le y_1,y_2,y_3\le \pi\\0\le y_1+y_2+y_3\le \pi}} \frac 1{(n-4)!}(\pi-y_1-y_2-y_3)^{n-4} \cdot1_{2\sin y_1\cdot 2\sin y_2<2\sin y_3} \;dy_1\; dy_2\; dy_3} \\ &\color{brown}{ =\frac {(n-1)!}{\pi^{n-1}} \int_{\substack{0\le y_1,y_2,y_3,y_4\le \pi\\y_1+y_2+y_3+y_4= \pi}} \frac 1{(n-4)!}y_4^{n-4} \cdot1_{2\sin y_1\cdot 2\sin y_2<2\sin y_3} \;dy} \ . \end{aligned} $$



My problem is now that the many ways i tried to get a "clean" value for $p=p(6)$, presumably $3/5$, all were involving many pieces of broken integrals, and none of them had a close formula. I promised to submit an answer, so far we have only easy steps, also covered here and in the many other answers to the related questions, but unfortunately starting from this point there is no "canonical" next step.

So i decided to close with a numerical experiment. Since numerical integrals are very well implemented in pari/gp, this will be the weapon of choice. The function intnum (integrale numerique, i suppose) does the job on a specific interval. To work on domains having more dimensions, we have to break the one specific domain of interest in slices, Fubini.
However, practice shows that we should use at most double integrals, to finish in time. We have reduced the computation to an integral in $3D$, so we must (and want to get better precision) integrate by hand w.r.t. the one of the variables.

We can implement the blue formula in more than one way. The simplest may be to let $y_1,y_2$ run in $[0,\pi]$ so that $0\le y_1+y_2\le \pi$. For each such fixed choice of $(y_1,y_2)$,

  • we check first if $ab=2\sin y_1\cdot 2\sin y_2$ is $\le 2$, if not such points do not contribute to the integral,
  • else we consider values of $y_3$ so that $2\sin y_3 = c > ab$, i.e. $\sin y_3>2\sin y_1\sin y_2:=\sin s(y_1,y_2)$,
  • so we have a first constraint for $y_3$, it is allowed to walk only between $s(y_1,y_2)$ and $\pi - s(y_1,y_2)$,
  • a second constraint being $y_3\le \pi -y_1-y_2$.

So the code computing the integral is:

{s(y1, y2) = asin( 2*sin(y1)*sin(y2) );}
{f(y1, y2, y3) = -1/6*(Pi - y1 - y2 - y3)^3;}

{p = 120./Pi^5 *     \
 intnum(y1=0, Pi,    \
 intnum(y2=0, Pi-y1, \
        if(2*sin(y1)*sin(y2) > 1   , 0,  
        if(Pi - y1 - y2 < s(y1, y2), 0, f(y1, y2, min(Pi-s(y1,y2), Pi-y1-y2)) - f(y1, y2, s(y1, y2))))));
print("p ~ ", p);
}

(Replacing the $\min$ in the formula for the primitive function $f$ by the argument $\pi -y_1-y_2$ does not change the result. A geometric argument can be used to confirm.)

We obtain using precision $100$ the result: $$ \bbox[lightyellow]{ p \approx \color{red}{0.6000501}5758345889343487322888211443947167399337503112\dots } $$ and of course only some first few decimals can match the exact result, but there are certainly more decimals then the first five/six only. Using $300$ digits instead, the result is: $$ \bbox[lightyellow]{ p \approx \color{red}{0.6000501}9618116188519951290405662775869199611849700\dots } $$ An other way to proceed would be to let $c=2\sin y_3=A_6A_0$ first take its possible values, more exactly we pick a place for $A_6$ (in cyclic order before $A_0=0$), then consider $y_{12}=y_1+y_2=\frac 12(\overset\frown{A_0A_1} + \overset\frown{A_1A_2} )$ running from $0$ to $\pi-y_3$, which determines the position of $A_2$, then finally consider all positions of $A_1$ on the arc from $A_0$ to $A_2$ so that the $A_1$-height of the triangle $\Delta A_0A_1A_2$ is less $\frac 12c=\sin y3$. It is not easy to implement this road, however.



Further experimental support: Using sage, the following code snipet simulates statistically the probability $p(n)$, $n=6$, for the case that the choice of the three chords is always corresponding to the extremities $A_0=0$, $A_1$, $A_2, A_3$.

import random
random.seed('MSE4898900')    # to be able to reproduce

n = 6        # take six random points on the unit circle, or take some othe n > 3
TRIALS = 10**5
count = 0    # count success cases below

def ru():    return random.uniform(0, 2*pi)
def f(x):    return 2*abs(sin(x/2))

for trial in range(TRIALS):
    x = sorted([ru() for k in range(n - 1)])
    x1, x2, x3 = x[:3]
    if f(x1) * f(x2-x1)  < f(x3-x2):
        count += 1

print(f"p ~ {count} / {TRIALS} ~ {count/TRIALS.n()}")

Yes, i know, this is slow, loops are the main weakness of python, and i insist to go with this weakness to "profit" at most from it, but after the coffee break there is the following result:

p ~ 59754 / 100000 ~ 0.597540000000000

It is in concordance with the inexact numerical value for the integral computing $p$, yes, a value close to $3/5$. Having only this experience, the "dot-six" result is still plausible, this is the reason why i tried so long on the other direction.


However at some point i wanted to have a better statistic view. Python cannot help. Using numpy (python with C++ vectorialization support, so pointers are tacitly part of the game) the experiment can use a bigger number of trials. Here is the code, and the results are immediately inserted in a table, that collects simulations of tuples of $n$ points for $n$ from $4$ to $10$.

import numpy as np

TRIALS = 10**8

def simulate(n, trials=TRIALS):
    a = np.random.uniform(0, 2*pi, size=(TRIALS, n - 1))
    a . sort(axis=1)
    x1, x2, x3 = a[:, 0], a[:, 1], a[:, 2]
    y1, y2, y3 = x1/2, (x2 - x1)/2, (x3 - x2)/2 
    
    return (2*sin(y1)*sin(y2) < sin(y3)).sum()

for n in [4, 5, 6, 7, 8, 9, 10]:
    print(f"| ${n}$ | ${simulate(n)/TRIALS}$ |")

The same experiment for $n$ among $4,5,6,7,8,9,10$ delivered this time...

$n$ Monte-Carlo substitute for $p(n)$
$4$ $0.53468146$
$5$ $0.56940318$
$6$ $0.60009132$
$7$ $0.62659136$
$8$ $0.64985936$
$9$ $0.67028715$
$10$ $0.68809787$

And the simulation for $n=6$ is somehow to far away from $3/5$.

On the side computing the integrals in pari/gp we have the following results:

$n$ Numerical integration for $p(n)$
$4$ $0.5346886788355638969611403661$
$5$ $0.5693885668751201134922537845$
$6$ $0.6000501867477713446343104695$
$7$ $0.6266785349552812266607746033$
$8$ $0.6498647578116058579889858517$
$9$ $0.6701997775201605495993263629$
$10$ $0.6881760215387701094302772635$

So if for $n=4,5,\dots$ there is an ugly answer, but Monte-Carlo and numerical integration somehow show the variance and give some frame for the error made, it is not probable that $0.60005\dots$ comes back to $0.6$. The numerical integration is in concordance with the numpy simulation. Used pari/gp code under realprecision = 211 significant digits (200 digits displayed):

{s(y1, y2) = asin( 2*sin(y1)*sin(y2) );}
{f(m, y1, y2, y3) = -1/m!*(Pi - y1 - y2 - y3)^m;}
{p(n) = m = n - 3;  
  (n-1)!/Pi^(n-1) *
 intnum(y1=0, Pi,
 intnum(y2=0, Pi-y1,
        if(2*sin(y1)*sin(y2) > 1   , 0,
        if(Pi - y1 - y2 < s(y1, y2), 0, -f(m, y1, y2, s(y1, y2))))));}
      
{for(n=4, 10, printf("| $%s$ |  $%30.28f$ |\n", n, p(n)));}

So i conclude with the following:

  • The wanted probability has a formula in terms of an integral. This integral has optically no reasons to have a closed formula in terms of "elementary functions".
  • Numerical integration and statistic experiments rather discourage a possible $3/5$ result.
  • All the above was still a positive experience, despite of going first in the false direction, i had to force numpy to give in few lines the result, learned a lot while doing so.