I'm reading about statistical decision theory, and at one point in my book the author defines the expected squared prediction error by:
$$EPE = E(Y-g(X))^2 = \int(y -g(x))^2Pr(dx, dy)$$
I prefer to write this with the joint density function, so that it is more explicit:
$$EPE = \int\int(y-g(x))^2f(x,y)\;dx\;dy$$
Later, the author says that by conditioning on $X$, $EPE$ can be written as:
$$EPE = E_XE_{Y|X}([Y-g(X)]^2\;|\;X)$$
For some reason this notation confuses me. Could someone write this conditional form of $EPE$ more explicitly, i.e. in terms of the joint density function of the random variables $X$ and $Y$?
Just to be sure: $X$ is the variable we use to predict $Y$, and $g(X)$ is the function we are trying to find, namely the one that minimizes $EPE$.
Thank you for any help 🙂
Best Answer
$EPE = \int\int {(y-g(x))^2f(x,y)dxdy}$
By the definition of the conditional density (the product rule), $f(x,y)=f(y\,|\,x)\,f(x)$, we have:
$EPE = \int\int {(y-g(x))^2f(y\,|\,x)\,f(x)dxdy}$
Since $f(x)$ does not depend on $y$, it can be pulled out of the integral over $y$. Rearranging gives:
$EPE = \int f(x)\;\left(\,\int (y-g(x))^2f(y\,|\,x)dy\,\right)\;dx$
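Using the definition of $E_X$ (the outer integral is an expectation with respect to the marginal density $f(x)$), we get:
$EPE = E_X\left(\;\int (y-g(X))^2f(y\,|\,X)dy\;\right) $
Using the definition of $E_{Y\,|\,X}$ (the inner integral is an expectation with respect to the conditional density $f(y\,|\,x)$), we get:
$EPE = E_X(\;\;\;E_{Y\,|\,X}(\,(Y-g(X))^2\,|\,X\,)\;\;\;) $
Or, in even shorter notation:
$EPE = E_XE_{Y\,|\,X}(\,[Y-g(X)]^2\,|\,X\,) $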
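If it helps to see the identity numerically, here is a minimal sketch in Python. It uses a discrete analogue, so the integrals become sums, and everything in it is an assumption made for illustration: the joint probabilities in `joint` are made up, and `g` is an arbitrary predictor rather than the one that minimizes $EPE$. It computes the error once directly from the joint distribution and once as the iterated expectation, and the two values should agree up to floating-point rounding.

```python
# Hypothetical discrete joint pmf f(x, y); the numbers are invented for
# illustration and sum to 1.
joint = {
    (0, 0): 0.10, (0, 1): 0.30,
    (1, 0): 0.25, (1, 2): 0.15,
    (2, 1): 0.05, (2, 2): 0.15,
}


def g(x):
    # Arbitrary illustrative predictor (an assumption, not the optimal g).
    return 0.5 * x + 0.25


# Direct form: sum over (x, y) of (y - g(x))^2 * f(x, y).
epe_joint = sum((y - g(x)) ** 2 * p for (x, y), p in joint.items())

# Marginal f(x), obtained by summing the joint pmf over y.
marginal = {}
for (x, _y), p in joint.items():
    marginal[x] = marginal.get(x, 0.0) + p


def inner(x):
    # E_{Y|X}([Y - g(X)]^2 | X = x): sum over y of (y - g(x))^2 * f(y | x),
    # where f(y | x) = f(x, y) / f(x).
    return sum(
        (y - g(x)) ** 2 * p / marginal[x]
        for (xx, y), p in joint.items()
        if xx == x
    )


# Iterated form: E_X of the inner conditional expectation.
epe_iterated = sum(marginal[x] * inner(x) for x in marginal)

print(epe_joint, epe_iterated)
```

The division by `marginal[x]` inside `inner` and the multiplication by `marginal[x]` in the outer sum cancel, which is exactly the factorization $f(x,y)=f(y\,|\,x)\,f(x)$ used in the derivation.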