For questions of this nature, I generally like to gain intuition by assuming $X$ and $Y$ are discrete random variables. So let us assume $X$ and $Y$ take values in $\{1,\dots,n\}$, and keep in mind that they are not necessarily independent. For simplicity, assume $P(Y=y) > 0$ for all $y$.
If you think back to undergraduate level probability classes on conditioning, you can probably guess that $E\left[X|Y=y\right]$ should intuitively be the expected value of the random variable $X$ if you know that $Y=y$. So the computation should go something like this:
\begin{align*}
E[X|Y=y] &= \sum_{x=1}^n xP(X=x|Y=y) = \sum_{x=1}^n\frac{xP(X=x,Y=y)}{P(Y=y)}
\end{align*}
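To make this concrete, here is a small numerical sketch that computes $E[X|Y=y]$ exactly as in the formula above. The joint pmf is made up purely for illustration:

```python
import numpy as np

# Hypothetical joint pmf of (X, Y), each taking values in {1, 2, 3}.
# Rows index x, columns index y; entry [i, j] is P(X = i+1, Y = j+1).
joint = np.array([
    [0.10, 0.05, 0.05],
    [0.05, 0.20, 0.10],
    [0.05, 0.15, 0.25],
])
values = np.array([1, 2, 3])

def cond_exp(joint, values, y_index):
    """E[X | Y=y] = sum_x x * P(X=x, Y=y) / P(Y=y)."""
    p_y = joint[:, y_index].sum()                # marginal P(Y = y)
    return (values * joint[:, y_index]).sum() / p_y

print([cond_exp(joint, values, j) for j in range(3)])
```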
Now let's recall the actual definition of conditional expectation. Here, $\sigma(X,Y)$ is the sigma algebra generated by $(X,Y)$. Going back to our discrete case, this is given by $\sigma(X,Y) = \left\{\{(X,Y) \in A\}: A \subseteq \{1,\dots,n\}^2\right\}$. This is just the set of all possible things you can say about $X$ and $Y$ (for example, the set $\{X=2\} = \{(X,Y) \in A\}$ with $A := \{(2,i): i \in \{1,\dots,n\}\}$). Fix any sigma algebra $\mathcal{F} \subseteq \sigma(X,Y)$. Then $E[X|\mathcal{F}]$ is an $\mathcal{F}$-measurable random variable satisfying certain conditions that we'll get to shortly.
What does it mean for a random variable $Z$ to be $\mathcal{F}$-measurable? Well, recall that $Z$ is a function from the probability space to the real numbers. We can explicitly write $Z = Z(x,y) \in \mathbb{R}$ for $x,y \in \{1,\dots,n\}$. Then $\mathcal{F} \supseteq \{Z^{-1}(A): A \in \mathcal{B}(\mathbb{R})\}$. This means that every possible outcome of $Z$ can be completely described in terms of a set in $\mathcal{F}$.
Here are some examples. Let $\mathcal{F} = \sigma(\{X=1\})$. Then $Z(x,y) = \mathbb{I}_{X=1}$ is $\mathcal{F}$-measurable, because no matter what you tell me about the output of $Z$, I can find a set in $\mathcal{F}$ that describes what's going on. For example, $Z < 1$ corresponds to $\{X\neq 1\}$, and $Z = 2$ corresponds to the empty set. Both are in $\mathcal{F}$. However, $Z(x,y) = y$ is not $\mathcal{F}$-measurable. If you tell me $Z=3$, that could overlap with $\{X=1\}$ or $\{X\neq 1\}$, but it's not equal to either set. Nor is it equal to $\{(X,Y) \in \{1,\dots,n\}^2\}$ or $\emptyset$.
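A handy way to check measurability in the discrete setting: a random variable is $\mathcal{F}$-measurable iff it is constant on each atom of $\mathcal{F}$. Here is a small sketch (with $n=3$ hard-coded, purely for illustration) confirming that $\mathbb{I}_{X=1}$ is measurable with respect to $\sigma(\{X=1\})$ while $Z(x,y)=y$ is not:

```python
from itertools import product

n = 3
omega = list(product(range(1, n + 1), repeat=2))   # outcomes (x, y)

# Atoms of F = sigma({X = 1}): the event {X = 1} and its complement.
atoms = [[w for w in omega if w[0] == 1],
         [w for w in omega if w[0] != 1]]

def is_measurable(Z, atoms):
    """A discrete random variable is F-measurable iff it takes a
    single value on each atom of F."""
    return all(len({Z(w) for w in atom}) == 1 for atom in atoms)

indicator = lambda w: 1 if w[0] == 1 else 0   # Z = indicator of {X = 1}
proj_y    = lambda w: w[1]                    # Z(x, y) = y

print(is_measurable(indicator, atoms))  # True
print(is_measurable(proj_y, atoms))     # False
```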
$E[X|\mathcal{F}]$ is the unique (up to events of probability zero) $\mathcal{F}$-measurable random variable whose expectation on any set in $\mathcal{F}$ is the same as the expectation of $X$ on that set. Intuitively, since $\mathcal{F}$ is coarser than $\sigma(X,Y)$, you can think of $E[X|\mathcal{F}]$ as the best estimate of $X$ if you have some information about $(X,Y)$ which is described by $\mathcal{F}$. If $\mathcal{F}$ is the trivial sigma algebra, then $E[X|\mathcal{F}] = E[X]$ as that's your best guess if you only know the distribution of $(X,Y)$ and nothing else.
We can write out this definition as follows. For any $A \in \mathcal{F}$,
\begin{align*}
E\left[\mathbb{I}_A E[X|\mathcal{F}]\right] &= \sum_{(x,y) \in A} P(X=x,Y=y)E[X|\mathcal{F}](x,y) = E\left[\mathbb{I}_A X\right] = \sum_{(x,y) \in A} xP(X=x,Y=y)
\end{align*}
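Here is a short numerical sketch (using a randomly generated joint pmf, purely for illustration) that builds $E[X|\mathcal{F}]$ for $\mathcal{F}=\sigma(Y)$ by averaging $X$ over each atom $\{Y=y\}$, then verifies the defining identity above for every $A\in\mathcal{F}$:

```python
import numpy as np
from itertools import product, combinations

rng = np.random.default_rng(0)
n = 2
omega = list(product(range(1, n + 1), repeat=2))   # outcomes (x, y)
p = rng.dirichlet(np.ones(len(omega)))             # a random joint pmf
prob = dict(zip(omega, p))

# F = sigma(Y): its atoms are the events {Y = y}.
atoms = [[w for w in omega if w[1] == y] for y in range(1, n + 1)]

# Build E[X|F]: on each atom it is the probability-weighted average of X.
cond = {}
for atom in atoms:
    mass = sum(prob[w] for w in atom)
    avg = sum(w[0] * prob[w] for w in atom) / mass
    for w in atom:
        cond[w] = avg

# Verify E[1_A E[X|F]] == E[1_A X] for every A in F (unions of atoms).
for r in range(len(atoms) + 1):
    for combo in combinations(atoms, r):
        A = [w for atom in combo for w in atom]
        lhs = sum(prob[w] * cond[w] for w in A)
        rhs = sum(prob[w] * w[0] for w in A)
        assert abs(lhs - rhs) < 1e-12
print("defining property holds for every A in F")
```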
So now let's consider your original question. Let $\mathcal{F} = \sigma(Y)$. Then $\mathcal{F} = \{\{Y\in A\}: A \subseteq \{1,\dots,n\}\}$. Since $E[X|\sigma(Y)]$ is $\sigma(Y)$-measurable, it cannot depend on $x$. So, we can write $E[X|\sigma(Y)](x,y) = E[X|\sigma(Y)](y)$ for all $(x,y) \in\{1,\dots,n\}^2$. Let $A = \{y\}$ for some $y \in \{1,\dots,n\}$:
\begin{align*}
\sum_{x=1}^n\sum_{y\in A}P(X=x,Y=y)E[X|\sigma(Y)](y) &= P(Y=y)E[X|\sigma(Y)](y)\\
& = \sum_{x=1}^n\sum_{y \in A} xP(X=x,Y=y) = \sum_{x=1}^n xP(X=x,Y=y)
\end{align*}
Dividing both sides by $P(Y=y)$,
$$E[X|\sigma(Y)](y) = \frac{\sum_{x=1}^n xP(X=x,Y=y)}{P(Y=y)} = E[X|Y=y]$$
So $E[X|\sigma(Y)](y) = E[X|Y=y]$. That's why we use this shorthand notation. You can extend this argument to more general random variables and even stochastic processes, but that involves many other tedious details. The main gist is the same. For any two arbitrary random variables, you can write $E[X|Y] = E[X|\sigma(Y)]$ with the intuitive idea that when $Y=y$, $E[X|\sigma(Y)] = E[X|Y=y]$. Hope that makes sense!
Best Answer
This is a well-known result due to Doob.
Theorem: Let $\mathscr{A}$, $\mathscr{B}$ and $\mathscr{C}$ be sub-$\sigma$-algebras of $\mathscr{F}$. Then $\mathscr{A}\perp_\mathscr{C} \mathscr{B}$ (that is, $\mathscr{A}$ and $\mathscr{B}$ are conditionally independent given $\mathscr{C}$) iff $$ \begin{align} \Pr[A|\sigma(\mathscr{C},\mathscr{B})]=\Pr[A|\mathscr{C}]\tag{1}\label{doob-independence} \end{align} $$ for all $A\in \mathscr{A}$.
Here is a short proof:
Suppose that $\mathscr{A}$ and $\mathscr{B}$ are conditionally independent given $\mathscr{C}$, that is $$ \Pr[A\cap B|\mathscr{C}]=\Pr[A|\mathscr{C}] \Pr[B|\mathscr{C}] $$ for all $A\in \mathscr{A}$ and $B\in \mathscr{B}$. Then, for any $A\in\mathscr{A}$, $B\in\mathscr{B}$ and $C\in\mathscr{C}$ we have $$ \begin{align} \Pr\big[A\cap\big(C\cap B)\big]&=\Pr\big[ \mathbb{1}_C\Pr[A\cap B|\mathscr{C}]\big]= \Pr\big[\mathbb{1}_C\Pr[A|\mathscr{C}]\Pr[B|\mathscr{C}]\big]\\ &= \Pr\big[\Pr[A|\mathscr{C}]\Pr[B\cap C|\mathscr{C}]\big]= \Pr\Big[\Pr\big[\Pr[A|\mathscr{C}]\mathbb{1}_{B\cap C}\big|\mathscr{C}\big]\Big]\\ &= \Pr\big[\Pr[A|\mathscr{C}]\mathbb{1}_{B\cap C}\big] \end{align} $$ Since $\sigma(\mathscr{B},\mathscr{C})=\sigma\Big(\{B\cap C: B\in\mathscr{B}, C\in\mathscr{C}\}\Big)$, a monotone class argument shows that $$ \begin{align} \Pr[A\cap H]=\Pr\big[\Pr[A|\mathscr{C}]\mathbb{1}_H \big] \end{align} $$ for all $H\in\sigma(\mathscr{B},\mathscr{C})$. This means that $$ \Pr[A|\sigma(\mathscr{B},\mathscr{C})]=\Pr[A|\mathscr{C}] $$
Conversely, suppose that $\eqref{doob-independence}$ holds. For any $A\in\mathscr{A}$ and $B\in\mathscr{B}$ we have \begin{align*} \Pr[A\cap B|\mathscr{C}]=\Pr\Big[\mathbb{1}_{B}\Pr[A|\sigma(\mathscr{B},\mathscr{C})]\Big| \mathscr{C}\Big]= \Pr\Big[\mathbb{1}_B\Pr[A|\mathscr{C}]\Big|\mathscr{C}\Big] =\Pr[A|\mathscr{C}]\Pr[B|\mathscr{C}] \end{align*} This shows that $\mathscr{A}$ and $\mathscr{B}$ are conditionally independent given $\mathscr{C}$.
The extension from events to random variables is done by passing first to simple functions and then using the usual monotone approximation by simple functions.
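For a concrete sanity check of the theorem in the discrete case, the sketch below (all probabilities made up for illustration) builds a joint pmf with $X$ and $Y$ conditionally independent given $Z$ by construction, then verifies Doob's criterion $\Pr[A|\sigma(\mathscr{B},\mathscr{C})]=\Pr[A|\mathscr{C}]$ on every atom of $\sigma(Y,Z)$:

```python
from itertools import product

# Made-up conditional tables: P(Z), P(X|Z), P(Y|Z). Multiplying them
# makes X and Y conditionally independent given Z by construction.
p_z = {0: 0.4, 1: 0.6}
p_x_given_z = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}
p_y_given_z = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.9, 1: 0.1}}

joint = {(x, y, z): p_z[z] * p_x_given_z[z][x] * p_y_given_z[z][y]
         for x, y, z in product((0, 1), repeat=3)}

def pr(pred):
    """Probability of the event described by a predicate on (x, y, z)."""
    return sum(p for w, p in joint.items() if pred(w))

# Doob's criterion for A = {X = 1}: conditioning on sigma(Y, Z) gives
# the same answer as conditioning on sigma(Z) alone, on every atom.
for y, z in product((0, 1), repeat=2):
    lhs = (pr(lambda w: w[0] == 1 and w[1] == y and w[2] == z)
           / pr(lambda w: w[1] == y and w[2] == z))
    rhs = pr(lambda w: w[0] == 1 and w[2] == z) / pr(lambda w: w[2] == z)
    assert abs(lhs - rhs) < 1e-12
print("Pr[A | sigma(B, C)] == Pr[A | C] on every atom")
```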