I think we should start from the definition of conditional independence. Two random variables $X$ and $Y$ are conditionally independent w.r.t. $\mathcal{G}$ if for any $B\in \sigma(X)$ and $D\in \sigma(Y)$ we have
$$
P(B\cap D|\mathcal{G}) = P(B|\mathcal{G})P(D|\mathcal{G}).
$$
Starting from this definition (the one given on Wikipedia), we can prove
\begin{equation}
E[XY|\mathcal{G}] = E[X|\mathcal{G}]E[Y|\mathcal{G}]\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (1)
\end{equation}
by following the “standard machinery” of probability and measure theory. It consists of three steps:
1) If both $X$ and $Y$ are simple functions, i.e., $X = \sum_{i=1}^n x_iI_{A_i}$, $Y = \sum_{j=1}^m y_jI_{B_j}$ for two finite measurable partitions of the whole space, $\Omega = \cup_{i=1}^n A_i = \cup_{j=1}^m B_j$, then equation (1) follows from a direct computation (sketched right after this list).
2) If $X$ and $Y$ are non-negative, then there exist two sequences of simple functions $\{f_n\}$ and $\{g_n\}$, with $f_n$ $\sigma(X)$-measurable and $g_n$ $\sigma(Y)$-measurable, such that $f_n\uparrow X$ and $g_n\uparrow Y$ (so step 1 applies to each pair $f_n, g_n$). Since $X$ and $Y$ have finite expectation (so that $E[X|\mathcal{G}]$ and $E[Y|\mathcal{G}]$ are finite a.s.) and $f_ng_n\uparrow XY$, the monotone convergence theorem for conditional expectation gives
$$
E[XY|\mathcal{G}] = \lim_{n\rightarrow \infty}E[f_ng_n|\mathcal{G}] = \lim_{n\rightarrow \infty}E[f_n|\mathcal{G}]E[g_n|\mathcal{G}] = E[X|\mathcal{G}]E[Y|\mathcal{G}].
$$
3) For general $X$ and $Y$, we define $X^+(\omega) = \max\{X(\omega),0\}\geq 0$, $X^-(\omega) = \max\{-X(\omega),0\}\geq 0$, and write $X = X^+-X^-$, $Y = Y^+-Y^-$. Applying the result in 2) to $X^+,X^-,Y^+,Y^-$ (if $X$ and $Y$ are conditionally independent, then so are $X^+$ and $Y^+$, and so on), and expanding $XY = X^+Y^+ - X^+Y^- - X^-Y^+ + X^-Y^-$, we obtain the equality for $X$ and $Y$ by linearity of conditional expectation.
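For completeness, here is the computation behind step 1 (a sketch, assuming as usual that the $x_i$ are distinct, so that $A_i = \{X = x_i\} \in \sigma(X)$, and similarly $B_j = \{Y = y_j\} \in \sigma(Y)$): since $XY = \sum_{i,j} x_i y_j I_{A_i\cap B_j}$,
$$
E[XY|\mathcal{G}] = \sum_{i=1}^n\sum_{j=1}^m x_i y_j P(A_i\cap B_j|\mathcal{G}) = \sum_{i=1}^n\sum_{j=1}^m x_i y_j P(A_i|\mathcal{G})P(B_j|\mathcal{G}) = E[X|\mathcal{G}]E[Y|\mathcal{G}],
$$
where the outer equalities use linearity of conditional expectation and the middle one uses the definition of conditional independence.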
Let $(\Omega, \mathcal F, \mathbb P)$ be a probability space, let $(X,Y)$ be a random vector with joint probability density function $g_{(X,Y)}$, and let $f$ be any Borel function such that $\mathbb E[|f(X,Y)|] < \infty$. Then $\mathbb E[f(X,Y)\mid Y] = h(Y)$ almost surely, where
$$ h(y) = \frac{\int_{\mathbb R} f(x,y)g_{(X,Y)}(x,y)\,dx}{\int_{\mathbb R} g_{(X,Y)}(x,y)\,dx} $$ when $\int_{\mathbb R} g_{(X,Y)}(x,y)\,dx \neq 0$, and $h(y) = 0$ otherwise.
First, we may set $h(y) = 0$ in the second case, because the set $S=\{ \omega \in \Omega : \int_{\mathbb R} g_{(X,Y)}(x,Y(\omega))\, dx = 0 \}$ has measure $0$. Indeed, $\mathbb P(S) = \mathbb P(Y \in S_Y)$, where $S_Y = \{ y \in \mathbb R: \int_{\mathbb R} g_{(X,Y)}(x,y)\,dx = 0 \}$. Then $\mathbb P(Y \in S_Y) = \int_{S_Y} g_Y(y)\, dy$, where $g_Y$ is the marginal density of $Y$ (it exists by Fubini's theorem together with the existence of the joint density of $(X,Y)$). But $g_Y(y) = \int_{\mathbb R} g_{(X,Y)}(x,y)\,dx$, so on $S_Y$ we are integrating the zero function, and hence $\mathbb P(S) = 0$. This, together with the fact that the conditional expectation is only determined up to sets of measure $0$, allows us to ignore the case $g_Y(y) = 0$.
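Summarized in one display, the chain of equalities just described reads
$$
\mathbb P(S) = \mathbb P(Y \in S_Y) = \int_{S_Y} g_Y(y)\, dy = \int_{S_Y} \Big(\int_{\mathbb R} g_{(X,Y)}(x,y)\,dx\Big)\, dy = \int_{S_Y} 0\, dy = 0.
$$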
So we have to prove two things:
1) $h(Y)$ is $\sigma(Y)$-measurable. By Fubini's theorem, the maps $y \mapsto \int_{\mathbb R} g_{(X,Y)}(x,y)\, dx$ and $y \mapsto \int_{\mathbb R} f(x,y)\,g_{(X,Y)}(x,y)\, dx$ are Borel measurable (here we use that $g_{(X,Y)}$ is integrable and that $\mathbb E[|f(X,Y)|]$ is finite in order to apply Fubini's theorem). Hence $h$ is a Borel function, and $h(Y)$ is $\sigma(Y)$-measurable.
2) For any $A \in \sigma(Y)$ we have to show $\int_A f(X,Y)\, d\mathbb P = \int_A h(Y)\, d\mathbb P$. Note that $A$ is of the form $A = Y^{-1}(B)$ for some Borel set $B \in \mathcal B(\mathbb R)$.
Note that $\int_A f(X,Y)\, d\mathbb P = \mathbb E[ f(X,Y) \cdot \chi_{_{Y \in B}} ]$ and $\int_A h(Y)\, d\mathbb P = \mathbb E[ h(Y) \cdot \chi_{_{Y \in B}} ]$.
We'll use the fact that if a random variable/vector $V$ (in $\mathbb R^n$) has density function $g_V$, then for any Borel function $\phi: \mathbb R^n \to \mathbb R$ with $\mathbb E[|\phi(V)|] < \infty$, we have $\mathbb E[\phi(V)] = \int_{\mathbb R^n} \phi(v) g_V(v)\, d\lambda_n(v)$.
Then: $$\mathbb E[ f(X,Y) \cdot \chi_{_{Y \in B}} ] = \int_{\mathbb R^2} f(x,y)\,\chi_{_{B}}(y)\, g_{(X,Y)}(x,y)\, d\lambda_2(x,y) = \int_{B} \int_{\mathbb R} f(x,y)g_{(X,Y)}(x,y)\,dx\,dy $$
The last splitting of the integral is justified by Fubini's theorem (the integrand is integrable by our assumption on $f$).
And now, proceeding similarly for the other side:
$$ \mathbb E[ h(Y) \cdot \chi_{_{Y \in B}} ] = \int_{B} h(y) (\int_{\mathbb R} g_{(X,Y)}(x,y)dx)dy$$
Now, by our convention for $h$ (the case where the denominator is $0$ occurs only on a set of measure $0$, so it does not affect the integral), we have:
$$ \int_{B} (h(y)) (\int_{\mathbb R} g_{(X,Y)}(x,y)dx) dy = \int_{B} (\frac{\int_{\mathbb R} g_{(X,Y)}(x,y)f(x,y)dx}{\int_{\mathbb R} g_{(X,Y)}(x,y)dx}) (\int_{\mathbb R} g_{(X,Y)}(x,y)dx )dy$$
After simplification we get $\mathbb E[ h(Y) \cdot \chi_{_{Y \in B}} ] = \int_{B} \int_{\mathbb R} f(x,y)g_{(X,Y)}(x,y)\,dx\,dy = \mathbb E[ f(X,Y) \cdot \chi_{_{Y \in B}} ]$, which is what we wanted to prove.
Now your "definition $1$" follows when you take $f(x,y) = x$: then $h(y) = \mathbb E[X\mid Y=y]$.
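As a quick numerical sanity check of this formula (separate from the proof), here is a minimal Python sketch assuming a standard bivariate normal joint density with correlation $\rho$, for which the conditional expectation is known in closed form, $\mathbb E[X\mid Y=y] = \rho y$; the function names, the grid, and the value of $\rho$ are just illustrative choices.

```python
import numpy as np

# Numerical check of  h(y) = (integral of x*g(x,y) dx) / (integral of g(x,y) dx)
# for f(x, y) = x, using (as an assumed toy example) the standard bivariate
# normal density with correlation rho, where E[X | Y = y] = rho * y is known.

rho = 0.7
xs = np.linspace(-10.0, 10.0, 4001)   # integration grid in x
dx = xs[1] - xs[0]

def joint_density(x, y):
    """Standard bivariate normal density with correlation rho."""
    norm = 1.0 / (2.0 * np.pi * np.sqrt(1.0 - rho**2))
    quad = (x**2 - 2.0 * rho * x * y + y**2) / (2.0 * (1.0 - rho**2))
    return norm * np.exp(-quad)

def h(y):
    """Approximate h(y) by Riemann sums over x."""
    dens = joint_density(xs, y)
    numerator = np.sum(xs * dens) * dx    # ~ integral of x * g(x,y) dx
    denominator = np.sum(dens) * dx       # ~ integral of g(x,y) dx, i.e. g_Y(y)
    return numerator / denominator

for y in (-2.0, -0.5, 0.0, 1.0, 2.5):
    print(f"y = {y:5.2f}:   h(y) = {h(y): .4f}   rho * y = {rho * y: .4f}")
```

The printed values of $h(y)$ should agree with $\rho y$ up to the discretization error of the grid.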
Best Answer
Consider any probability space $(\Omega, \mathcal{F}, \mathbb{P})$ ample enough to host two independent random variables $X$ and $U$ on it, where
$$ X \sim \text{Uniform}(\mathcal{X}) \qquad\text{and}\qquad U \sim \text{Uniform}([-1,1]). $$
Then define $Y$ as
$$ Y = \mathbf{1}[U \leq \eta(X)] - \mathbf{1}[U > \eta(X)] = \begin{cases} 1, & U \leq \eta(X), \\ -1, & U > \eta(X). \end{cases} $$
We claim that $\mathbb{E}[Y\mid X=x] = \eta(x)$ for a.e. $x$ (with respect to the Lebesgue measure). Indeed, for a.e. $x$ we have, using the independence of $U$ and $X$ in the second equality,
\begin{align*} \mathbb{E}[Y \mid X = x] &= \mathbb{E}[ \mathbf{1}[U \leq \eta(X)] - \mathbf{1}[U > \eta(X)] \mid X = x] \\ &= \mathbb{E}[ \mathbf{1}[U \leq \eta(x)] - \mathbf{1}[U > \eta(x)] ] \\ &= \mathbb{P}[U \leq \eta(x)] - \mathbb{P}[U > \eta(x)] \\ &= \tfrac{1}{2}(1+\eta(x)) - \tfrac{1}{2}(1-\eta(x)) \\ &= \eta(x). \end{align*}
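A short Monte Carlo sketch of this construction (assuming, purely for illustration, $\mathcal X = [0,1]$ and $\eta(x) = 2x - 1$; neither choice comes from the answer above) confirms $\mathbb{E}[Y\mid X=x] = \eta(x)$ numerically:

```python
import numpy as np

rng = np.random.default_rng(0)

# Monte Carlo sketch of the construction above. Purely for illustration we
# take the space X = [0, 1] and eta(x) = 2*x - 1 (values in [-1, 1]);
# neither choice comes from the answer itself.
def eta(x):
    return 2.0 * x - 1.0

n = 1_000_000
X = rng.uniform(0.0, 1.0, size=n)       # X ~ Uniform([0, 1])
U = rng.uniform(-1.0, 1.0, size=n)      # U ~ Uniform([-1, 1]), independent of X
Y = np.where(U <= eta(X), 1.0, -1.0)    # Y = 1{U <= eta(X)} - 1{U > eta(X)}

# Estimate E[Y | X = x] by averaging Y over narrow bins of X, then compare to eta.
bins = np.linspace(0.0, 1.0, 21)
idx = np.digitize(X, bins) - 1
for b in range(0, 20, 4):
    x_mid = 0.5 * (bins[b] + bins[b + 1])
    est = Y[idx == b].mean()
    print(f"x ~ {x_mid:.3f}:   E[Y | X ~ x] ~ {est: .3f}   eta(x) = {eta(x_mid): .3f}")
```

With $10^6$ samples the binned averages should match $\eta$ to within a couple of hundredths.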