If $f \circ V=f$ implies $f$ is constant, then $V$ must be ergodic.

ergodic-theory measure-theory

I'm taking a course in measure theory, and we've been introduced to the definition of ergodicity as shown below:

"Let $(X,\mathcal{A},m)$ be a probability space. Let $V:X \to X$ be a measure-preserving bijection. We say that $V$ is ergodic if, for every measurable set $A$ such that $V^{-1}(A)=A$, we have $m(A)=0$ or $m(A)=1$."

Following this is the remark:

"Equivalently, $V$ is ergodic if every random variable $f : X \to \mathbb{R}$ such that $f \circ V = f$ is constant almost everywhere."

My question is: why does the condition in this remark imply ergodicity as defined above?
I know that the other direction holds, as shown in the question "If $g$ is invariant under an ergodic map then it's almost everywhere constant".

My attempt

Inspired by the answer in the link, I'm wondering if $V^{-1}(A)=A$ implies that $A=\{x \in X: f(x) \geq c_1\}$, where $c_1$ is some arbitrary constant and $f:X \to \mathbb R$ is some measurable function that satisfies $f \circ V=f$. If this implication is true, then it is clear to me why we can conclude $m(A)=0$ or $m(A)=1$. But I don't know whether the implication is true. The most difficult part for me is that I don't know how to progress from the statement $V^{-1}(A)=A$.

Best Answer

Welcome to MSE!

Hint: What happens if $f = \chi_A$ is the characteristic function of a set $A$? What does it mean for $\chi_A \circ V = \chi_A$? What does it mean for $\chi_A$ to be constant a.e.?

To follow up with your question in the comments, it's a kind of "standard trick" in measure theory. If you know things about sets, then we can often pass to functions by first considering characteristic functions, then simple functions (by taking linear combinations), then all measurable functions (by taking limits). For instance, this is how we define the Lebesgue integral.
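
As a rough sketch of that pattern for the Lebesgue integral on $(X,\mathcal{A},m)$: the symbols $s$, $a_i$ and $A_i$ are introduced here only for illustration, with each $A_i$ measurable, and the last line is stated for nonnegative measurable $f$.

$$\begin{aligned}
\int \chi_A \, dm &= m(A), \\
\int \sum_{i=1}^n a_i \chi_{A_i} \, dm &= \sum_{i=1}^n a_i \, m(A_i), \\
\int f \, dm &= \sup\Big\{ \int s \, dm : s \text{ simple},\ 0 \le s \le f \Big\}.
\end{aligned}$$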

Conversely, if we know things about functions, then we can often recover information about sets by seeing what happens to characteristic functions. Notice this is the "easier" direction of the two, because functions are the more complicated object. That said, this trick of considering characteristic functions is still extremely useful!
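
Spelled out, the hint might go as follows. This is only a sketch, and it relies on the identity $\chi_A \circ V = \chi_{V^{-1}(A)}$, which is not stated above but follows from $\chi_A(V(x)) = 1 \iff V(x) \in A \iff x \in V^{-1}(A)$. Suppose every measurable $f$ with $f \circ V = f$ is constant a.e., and let $A$ be a measurable set with $V^{-1}(A) = A$. Then

$$\chi_A \circ V = \chi_{V^{-1}(A)} = \chi_A,$$

so $\chi_A = c$ a.e. for some constant $c$. Since $\chi_A$ only takes the values $0$ and $1$ and $m(X) = 1$, we must have $c \in \{0, 1\}$, and

$$c = 0 \implies m(A) = 0, \qquad c = 1 \implies m(X \setminus A) = 0 \implies m(A) = 1.$$

In either case $m(A) \in \{0,1\}$, which is the definition of ergodicity from the question.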


I hope this helps ^_^
