Average area of triangle formed ${1\over8}$ that of square

calculus, geometric-probability, integration, multivariable-calculus, probability

Here's a question from my infamous probability textbook:

A point is taken at random in each of the two adjacent sides of a square. Show that the average area of the triangle formed by joining them is one eighth of the area of the square.

Intuitively, this makes sense, since if I select the midpoints of two adjacent sides of the square, the triangle formed has ${1\over8}$ the area of the square. But I'm not sure how to go about averaging across all triangles, i.e. what the correct integral to set up is. Any help would be much appreciated.

EDIT: Following Ninad Munshi's hint in the comments, I got the expression $${{\int_0^1 \int_0^1 {{xy}\over2}\,\text{d}x\,\text{d}y}\over{1^2}},$$ which indeed evaluates to ${1\over8}$ after a short calculation. But it feels like magic. Can anyone explain in depth and conceptually why this integral gets us the desired average?

EDIT 2: Thanks to heropup for his answer. However, I am asking for an explanation that specifically addresses my concerns in my previous edit, which his answer unfortunately does not.

Best Answer

Rather than thinking about the triangle itself, think about the following problem:

Say you have a square cake of unit side length. Uniformly at random, you pick a number $0 < x < 1$ and you slice the cake vertically into two rectangles with side lengths $x, 1$ and $1-x, 1$. Next, independently of the first cut and also uniformly at random, you pick a number $0 < y < 1$ and you slice the cake horizontally at $y$, so that now you have four rectangular pieces with sides $x, y$, $1-x, y$, $x, 1-y$, and $1-x, 1-y$.

If I ask you, what is the average area of the rectangle with side lengths $x, y$, it should be obvious that this is $(1/2)(1/2) = 1/4$. In fact, this is the average area of any one of the four pieces of cake that were created, since by an elementary symmetry argument, there is no difference between the probability distributions of the areas of the four pieces: You could, for instance, flip the cake horizontally or vertically. And since the total area of the cake is $1$, and cutting it twice leaves $4$ pieces, the average area should be $1/4$.
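If the symmetry argument feels too quick, it can be checked numerically. The following is only a rough sketch: the function name `average_piece_areas`, the number of trials, and the seed are my own choices for illustration, not part of the problem.

```python
import random


def average_piece_areas(trials=200_000, seed=0):
    """Estimate the mean area of each of the four cake pieces by simulation."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    totals = [0.0, 0.0, 0.0, 0.0]
    for _ in range(trials):
        x = rng.random()  # position of the vertical cut
        y = rng.random()  # position of the horizontal cut, independent of x
        pieces = (x * y, (1 - x) * y, x * (1 - y), (1 - x) * (1 - y))
        for i, area in enumerate(pieces):
            totals[i] += area
    return [t / trials for t in totals]


if __name__ == "__main__":
    print(average_piece_areas())
```

Each of the four estimates should come out close to $1/4$, and they always add up to $1$ because the four pieces make up the whole cake.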

Another way to think about this is that if $X$ and $Y$ are the independent and identically distributed random variables (each uniform on $(0,1)$) that describe the locations of the two cuts along each side, then $XY$ is the random area of one of the four pieces, and $$\operatorname{E}[XY] \overset{\text{ind}}{=} \operatorname{E}[X]\operatorname{E}[Y] = (1/2)(1/2) = 1/4,$$ where the first equality holds because $X$ and $Y$ are independent, and the second because the average location of each cut is $1/2$.

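Now, what does this have to do with the triangle? Well, the two chosen points and the corner shared by the two sides form a right triangle with legs $x$ and $y$, so that triangle has exactly half the area of the aforementioned rectangle. Its average area is therefore $(1/2)(1/4) = 1/8$.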
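The same conclusion can be checked by simulating the triangle directly. This is a minimal sketch under my own assumptions: the shared corner is placed at the origin, the two sides lie along the coordinate axes, and the helper names `triangle_area` and `mean_triangle_area` are purely illustrative.

```python
import random


def triangle_area(p, q, r):
    """Area of triangle pqr via the shoelace (cross product) formula."""
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])
    return abs(cross) / 2


def mean_triangle_area(trials=200_000, seed=1):
    """Average the triangle's area over random points on two adjacent sides."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    corner = (0.0, 0.0)  # assumed position of the corner shared by both sides
    total = 0.0
    for _ in range(trials):
        x = rng.random()  # random point (x, 0) on the bottom side
        y = rng.random()  # random point (0, y) on the left side
        total += triangle_area(corner, (x, 0.0), (0.0, y))
    return total / trials


if __name__ == "__main__":
    print(mean_triangle_area())
```

The estimate should land near $1/8$, and the shoelace formula confirms along the way that each individual triangle has area $xy/2$.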


After seeing the edit, it is difficult to address the specific question without knowing what level of discussion and prerequisite theorems can be taken as true. You want something "in depth" and "conceptual" but, for instance, I do not know if you accept the formula for the expectation of a continuous random variable, say $$\operatorname{E}[X] = \int_{x \in \Omega} x f_X(x) \, dx,$$ where $f_X$ is the density of $X$ and $\Omega$ is its support, or if you need that to be proven. If you need a conceptual explanation for why this holds true, then an analogy with a discrete-valued random variable is a better motivation; e.g., we consider that the expected value of a discrete random variable is in a sense a weighted average of its outcomes, weighted by the likelihood of each such outcome: $$\operatorname{E}[X] = \sum_{x \in \Omega} x \Pr[X = x],$$ where $\Omega$ is again the support of $X$. Replacing the sum by an integral and the probability mass by a probability density gives the previous formula and can be thought of as a limiting process similar to that of Riemann summation.

The bivariate continuous case is a natural extension of the above. We simply take the realization of an area $A(x,y) = xy/2$ for each possible $(x,y) \in [0,1]^2$, and weight it by the probability density of its occurrence. This leads to $$\operatorname{E}[A(X,Y)] = \int_{x=0}^1 \int_{y=0}^1 \frac{xy}{2} f_{X,Y}(x,y) \, dy \, dx = \frac{1}{2} \int_{x=0}^1 x \, dx \int_{y=0}^1 y \, dy = \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{8},$$ since the joint density is uniform on the unit square; i.e. $f_{X,Y}(x,y) = 1$ for $0 \le x, y \le 1$.
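To make the "weighted average" reading of this integral concrete, here is a small sketch that replaces the integral by a finite sum over a grid of cells. The grid size `n` and the function name `riemann_expected_area` are assumptions made for illustration. Each cell contributes (area of the triangle at the cell's midpoint) times (density) times (size of the cell), which is exactly the discrete formula with the probability of landing in a cell playing the role of $\Pr[X = x]$.

```python
def riemann_expected_area(n=200):
    """Midpoint Riemann sum for E[XY/2] with (X, Y) uniform on the unit square."""
    h = 1.0 / n  # side length of each grid cell
    density = 1.0  # joint density of (X, Y) on the unit square
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h  # midpoint of the i-th cell in the x direction
        for j in range(n):
            y = (j + 0.5) * h  # midpoint of the j-th cell in the y direction
            cell_probability = density * h * h  # chance of landing in this cell
            total += (x * y / 2) * cell_probability
    return total


if __name__ == "__main__":
    print(riemann_expected_area())
```

Because the integrand is linear in each variable separately, the midpoint sum agrees with $1/8$ up to rounding for any grid size; for a general area function the sum would only approach the integral as `n` grows.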

That said, all of the above is something I would consider to be more conceptually sophisticated than the original question. However, there really is no simpler way to explain the meaning of the aforementioned integral. To ask for something even simpler would be like asking to explain what it means to integrate a function without talking about limits or sums.