Subtracting a number (expected value) from a function (random value)

functions, probability theory, variance

While self-studying probability, I came across this formula for the variance of a random variable:

$$\operatorname{Var}(X)=E\left[(X-E(X))^2\right]$$

However, what I understood from earlier definitions is that:

  • $X$ is a random variable, which is a measurable function
  • $E(X)$ is the expected value or mean, which evaluates to a single number

In the above formula, it seems that we are subtracting this single number $E(X)$ from a function $X$ (in the part $(X-E(X))$). From the examples following the definition in the book, I understood that when applying the formula, we subtract the same expected value from each possible value $x$ in the codomain of $X$.

So does mathematics allow subtracting a number, or any single value, from a function? If so, is the meaning always similar to adding or subtracting a scalar and a vector, where the scalar is added to or subtracted from each entry of the vector?

I'm comparing this with my programming experience, where functions and numbers are generally treated as separate types that cannot be combined algebraically.

Best Answer

Yes, in mathematics we can add a number to a function or subtract one from it. Rigorously, though, what we are doing is identifying the number with the function that is constantly equal to that number.

Let me elaborate further. If $f, g:X\to Y$ are functions where addition is defined in $Y$ as $+_Y:Y\times Y\to Y$, one can define the sum between $f$ and $g$ as follows: $$(f+g)(x)=f(x)+_Y g(x)$$ for every $x\in X$. This should be read as "$f+g$ is the function which assigns to each $x\in X$ the element $f(x)+_Y g(x)\in Y$". You can also interpret this as the vector associated to the function $f+g$ being the result of the vector sum between the vectors $(f(x))_{x\in X}$ and $(g(x))_{x\in X}$.

Now, if one chooses a fixed element $y \in Y$, one can define the function $C_y : X\to Y$ such that $C_y(x)=y$ for every $x\in X$. As your intuition told you, for any function $f:X\to Y$ we have $(f+C_y)(x)=f(x)+_Y C_y(x)=f(x)+_Y y$; that is, the function $f+C_y$ evaluated at each $x$ gives the result of adding $f(x)$ and $y$.
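From a programming point of view, the same construction can be sketched in Python. This is only an illustration: the names `add`, `const`, and `square` are made up for this example, and the domain is taken to be the integers purely for convenience.

```python
from typing import Callable

# Assumption for illustration: functions from int to float.
Func = Callable[[int], float]

def add(f: Func, g: Func) -> Func:
    """Pointwise sum: (f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def const(y: float) -> Func:
    """The constant function C_y: ignores its argument and returns y."""
    return lambda x: y

def square(x: int) -> float:
    return float(x * x)

# "square + 3" really means "square + C_3": both operands are functions.
shifted = add(square, const(3.0))

values = [shifted(x) for x in range(4)]
print(values)
```

The types stay consistent: `add` only ever combines two functions, and the number 3 enters the sum only after being wrapped as the constant function `const(3.0)`.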

Now let's explain what identification is being made in your particular case. We can define $C_{\mathbb{E}[X]}:\Omega\to \mathbb{R}$ such that $C_{\mathbb{E}[X]}(\omega)=\mathbb{E}[X]$ for all $\omega\in \Omega$. This function is a random variable: it satisfies the usual definition because the preimage of any set under a constant function is either $\emptyset$ or $\Omega$, so it is measurable. Then, one can define: $$\text{Var}[X]=\mathbb{E}[(X-C_{\mathbb{E}[X]})^2]$$ which is a well-defined expectation in the sense that you know, as $(X-C_{\mathbb{E}[X]})^2$ is a random variable.
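As a concrete check, here is a minimal Python sketch on a finite sample space. The fair six-sided die, and the names `omega`, `prob`, `expectation`, and `C_mean`, are assumptions chosen for illustration; they do not come from the question.

```python
from fractions import Fraction

# Assumed sample space: outcomes of a fair six-sided die.
omega = [1, 2, 3, 4, 5, 6]
prob = {w: Fraction(1, 6) for w in omega}

def X(w):
    """Random variable: a function on omega (here, the face shown)."""
    return Fraction(w)

def expectation(rv):
    """E[rv] = sum over omega of rv(w) * P({w}); the result is a number."""
    return sum(rv(w) * prob[w] for w in omega)

mean = expectation(X)  # a number, not a function

def C_mean(w):
    """Constant random variable that equals E[X] at every outcome."""
    return mean

def centered_squared(w):
    """The random variable (X - C_mean)^2, defined pointwise."""
    return (X(w) - C_mean(w)) ** 2

variance = expectation(centered_squared)
print(mean, variance)
```

Here `expectation` only accepts functions on `omega`, so subtracting the mean is done by subtracting the constant function `C_mean` at each outcome.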

We usually do not distinguish in notation between the number $\mathbb{E}[X]$ and the constant function. To avoid cumbersome notation, we simply write $\mathbb{E}[X]$ for the function that is constantly equal to $\mathbb{E}[X]$.
