In an expression involving more than one random variable, the symbol $E$ alone does not make clear with respect to which random variable the expected value is "taken". For example
$$E[h(X,Y)] =\text{?} \int_{-\infty}^{\infty} h(x,y) f_X(x)\,dx$$
or
$$E[h(X,Y)] = \text{?} \int_{-\infty}^\infty h(x,y) f_Y(y)\,dy$$
Neither. When several random variables are involved, and there is no subscript on the $E$ symbol, the expected value is taken with respect to their joint distribution:
$$E[h(X,Y)] = \int_{-\infty}^\infty \int_{-\infty}^\infty h(x,y) f_{XY}(x,y) \, dx \, dy$$
When a subscript is present... in some cases it tells us on which variable we should condition. So
$$E_X[h(X,Y)] = E[h(X,Y)\mid X] = \int_{-\infty}^\infty h(x,y) f_{Y\mid X}(y\mid x)\,dy $$
Here, we "integrate out" the $Y$ variable, and we are left with a function of $X$.
...But in other cases, it tells us which marginal density to use for the "averaging":
$$E_X[h(X,Y)] = \int_{-\infty}^\infty h(x,y) f_{X}(x) \, dx $$
Here, we "average over" the $X$ variable, and we are left with a function of $Y$.
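To make the difference concrete, here is a small worked example under assumptions that are mine, not part of the notation itself: take $h(X,Y) = XY$, and suppose $X$ and $Y$ are independent with finite means $\mu_X$ and $\mu_Y$, so that $f_{XY}(x,y) = f_X(x) f_Y(y)$ and $f_{Y\mid X}(y\mid x) = f_Y(y)$. With no subscript, the joint distribution is used and the result is a number:
$$E[XY] = \int_{-\infty}^\infty \int_{-\infty}^\infty x\,y\, f_X(x) f_Y(y) \, dx \, dy = \mu_X \mu_Y$$
Under the "conditioning" reading of the subscript, $Y$ is integrated out and a function of $x$ remains:
$$E[XY \mid X = x] = \int_{-\infty}^\infty x\,y\, f_{Y\mid X}(y\mid x)\,dy = x\,\mu_Y$$
Under the "marginal averaging" reading, $X$ is integrated out and a function of $y$ remains:
$$\int_{-\infty}^\infty x\,y\, f_X(x)\,dx = \mu_X\, y$$
So the same symbol $E_X[XY]$ could stand for $x\,\mu_Y$ or for $\mu_X\, y$, depending on the author's convention.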
Rather confusing, I would say, but who said that scientific notation is totally free of ambiguity or multiple uses? You should check how each author defines the use of such symbols.
The $i$ and $j$ are a mathematical, rather than statistical, convention. The $i$ is used because it is the first letter of the word "index", and $j$ simply comes after $i$. They have the benefit of being small, clear, and unobtrusive ($x_{ij}$ looks pretty good).
This is usually seen in the context of matrix entries, where $i$ indexes the rows, and $j$ indexes the columns. The matrix convention is followed far and wide in general branches of mathematics and science using linear algebraic machinery.
This is what I was told in my first linear algebra course in school; I know of no definitive reference.
Best Answer
Normally it means that you are taking the expectation with respect to that distribution (or that probability measure). Sometimes we change the probability measure, and the expectations are then taken with respect to the new probability measure. So the subscript specifies exactly which probability measure the expectation is taken with respect to.
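As a sketch of what this looks like in symbols (the names $P$, $Q$, $p$, $q$ and $g$ are introduced here for illustration only): suppose $X$ has density $p$ under the measure $P$ and density $q$ under the measure $Q$. Then the two subscripted expectations of the same function $g(X)$ are, in general, different numbers:
$$E_P[g(X)] = \int_{-\infty}^\infty g(x)\, p(x)\,dx, \qquad E_Q[g(X)] = \int_{-\infty}^\infty g(x)\, q(x)\,dx$$
If, in addition, $q(x) > 0$ wherever $p(x) > 0$, one can move from one measure to the other by reweighting:
$$E_P[g(X)] = \int_{-\infty}^\infty g(x)\, \frac{p(x)}{q(x)}\, q(x)\,dx = E_Q\!\left[g(X)\,\frac{p(X)}{q(X)}\right]$$
Without the subscript, it would be unclear which of these integrals is meant.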