Show that $E[f(X_1,X_2)\mid \mathcal{F}_n]=\frac{2}{n(n-1)}\sum_{ 1 \leq p<q \leq n}f(X_p,X_q)$

conditional-expectation, martingales, probability-theory, stochastic-processes

Let $(X_k)_k$ be a sequence of i.i.d. random variables and let $f:\mathbb{R}^2 \to \mathbb{R}$ be a measurable function such that $f(X_1,X_2) \in L^1$ and $f(x,y)=f(y,x)$ for every $(x,y) \in \mathbb{R}^2$. For $n \geq 2$, let $Y_n=\frac{2}{n(n-1)}\sum_{ 1 \leq p<q \leq n}f(X_p,X_q)$ and $\mathcal{F}_n=\sigma(Y_n,Y_{n+1},\dots)$.

Show that for $n \geq 2, Y_n=E[f(X_1,X_2)\mid \mathcal{F}_n].$

We have that

\begin{align}
\sum_{1 \leq p<q \leq n}f(X_p,X_q)&=E\left[\sum_{1 \leq p<q \leq n}f(X_p,X_q)\mid Y_n \right]
\\&=\sum_{1 \leq p <q \leq n}E [f(X_p,X_q)\mid Y_n ]
\\&=\sum_{1 \leq p<q \leq n}E [f(X_1,X_2)\mid Y_n ]
\\&=\frac{n(n-1)}{2}E [f(X_1,X_2)\mid Y_n ]
\end{align}

where the third equality holds since, for $1 \leq p < q \leq n$, $P_{(X_p,X_q,Y_n)}=P_{(X_1,X_2,Y_n)}.$

How can I verify that $Y_n=E[f(X_1,X_2)\mid \mathcal{F}_n ]$?

Best Answer

The $\sigma$-algebra $\mathcal{F}_{n}$ is generated by the sets of the form $\{Y_{n}\in{A_{n}}, Y_{n+1}\in{A_{n+1}}, \dots, Y_{n+k}\in{A_{n+k}}\}$, where $k \geq 0$ and $A_{n}, A_{n+1}, \dots, A_{n+k}$ are Borel subsets of $\mathbb{R}$, and these sets form a $\pi$-system. Since $Y_{n}$ is $\mathcal{F}_{n}$-measurable and integrable, to verify that $\mathbb{E}(f(X_{1},X_{2})|\mathcal{F}_{n})=Y_{n}$ it suffices to show that:

$$\int_{Y_{n}\in{A_{n}}, Y_{n+1}\in{A_{n+1}}, ..., Y_{n+k}\in{A_{n+k}}} f(X_{1},X_{2})d\mathbb{P}=\int_{Y_{n}\in{A_{n}}, Y_{n+1}\in{A_{n+1}}, ..., Y_{n+k}\in{A_{n+k}}} Y_{n}d\mathbb{P}$$

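To this end, notice that the values of $Y_{n}, Y_{n+1}, Y_{n+2}, \dots$ are invariant under any permutation of $X_{1}, X_{2}, \dots, X_{n}$, and that, since the $X_{k}$ are i.i.d., the joint law of $(X_{1}, \dots, X_{n+k})$ is unchanged by such a permutation. Thus: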

$$ \int_{Y_{n}\in{A_{n}}, Y_{n+1}\in{A_{n+1}}, ..., Y_{n+k}\in{A_{n+k}}} f(X_{1},X_{2})d\mathbb{P}=\int_{Y_{n}\in{A_{n}}, Y_{n+1}\in{A_{n+1}}, ..., Y_{n+k}\in{A_{n+k}}} f(X_{i},X_{j})d\mathbb{P}$$

for any choice of $i,j$ such that $1\leq{i}<j\leq{n}$. Summing both sides of the equation above over all such pairs $(i,j)$ and dividing by $\frac{n(n-1)}{2}$ gives the desired result.
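For readers who want the last step written out, here is a sketch of the computation. The shorthand $A=\{Y_{n}\in{A_{n}}, Y_{n+1}\in{A_{n+1}}, \dots, Y_{n+k}\in{A_{n+k}}\}$ is notation introduced only for this display and is not part of the original argument. There are $\frac{n(n-1)}{2}$ pairs $(i,j)$ with $1\leq i<j\leq n$, each contributing the same integral on the left, so

\begin{align}
\frac{n(n-1)}{2}\int_{A} f(X_{1},X_{2})\,d\mathbb{P}
&=\sum_{1\leq i<j\leq n}\int_{A} f(X_{i},X_{j})\,d\mathbb{P}
\\&=\int_{A}\sum_{1\leq i<j\leq n} f(X_{i},X_{j})\,d\mathbb{P}
\\&=\frac{n(n-1)}{2}\int_{A} Y_{n}\,d\mathbb{P},
\end{align}

where the last line uses only the definition of $Y_{n}$. Dividing by $\frac{n(n-1)}{2}$ should then give $\int_{A} f(X_{1},X_{2})\,d\mathbb{P}=\int_{A} Y_{n}\,d\mathbb{P}$ for every such set $A$.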