For a Markov process $(X_t)_{t \geq 0}$ we define the generator $A$ by
$$Af(x) := \lim_{t \downarrow 0} \frac{\mathbb{E}^x(f(X_t))-f(x)}{t} = \lim_{t \downarrow 0} \frac{P_tf(x)-f(x)}{t}$$
whenever the limit exists in $(C_{\infty},\|\cdot\|_{\infty})$. Here $P_tf(x) := \mathbb{E}^xf(X_t)$ denotes the semigroup of $(X_t)_{t \geq 0}$.
By Taylor's formula this means that
$$\mathbb{E}^xf(X_t) \approx f(x)+t Af(x)$$
for small $t \geq 0$. So, basically, the generator describes the movement of the process in an infinitesimal time interval. One can show that
$$\frac{d}{dt} P_t f(x) = A P_tf(x), \tag{1}$$
i.e. the generator is the time derivative of the mapping $t \mapsto P_tf(x)=\mathbb{E}^x(f(X_t))$. Reading $(1)$ as a (partial) differential equation we see that $u(t,x) := P_t f(x)$ is a solution to the PDE
$$\frac{\partial}{\partial t} u(t,x) = Au(t,x) \qquad u(0,x)=f(x).$$
This is one important reason why generators are of interest. Another, more probabilistic, reason is that the process
$$M_t^f := f(X_t) - f(X_0)- \int_0^t Af(X_s) \, ds, \qquad t \geq 0 \tag{2}$$
is a martingale for every $f$ in the domain of $A$. This means that we can associate with $(X_t)_{t \geq 0}$ a whole family of martingales, and this martingale property comes in handy very often, for example whenever we deal with expectations of the form $\mathbb{E}^x(f(X_t))$. Taking expectations in $(2)$ leads to Dynkin's formula.
Generators are also connected with the martingale problem, which in turn can be used to characterize (weak) solutions of stochastic differential equations. Furthermore, generators of stochastic processes are strongly related to Dirichlet forms and carré du champ operators; they turn out to be extremely helpful for carrying over results from probability theory to analysis (and vice versa). One important application is heat-kernel estimates.
Example: Brownian motion

In the case of (one-dimensional) Brownian motion $(B_t)_{t \geq 0}$, we see that
$$\mathbb{E}^x(f(B_t)) \approx f(x)+ \frac{t}{2} f''(x)$$
for small $t$. This formula can be motivated by Taylor's formula: Indeed,
$$\mathbb{E}^x(f(B_t)) \approx \mathbb{E}^x \left[f(x)+f'(x)(B_t-x)+\frac{1}{2} f''(x)(B_t-x)^2 \right]= f(x)+0+\frac{t}{2} f''(x)$$
using that $\mathbb{E}^x(B_t-x)=0$ and $\mathbb{E}^x((B_t-x)^2)=t$.
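As a quick numerical sanity check of this approximation, here is a minimal Python sketch. It assumes the test function $f(x)=e^{-x^2}$ (my own choice, not taken from the text above) and approximates $\mathbb{E}^x f(B_t)$ by a trapezoidal rule against the Gaussian density. The helper names `heat_semigroup` and `generator_quotient` are hypothetical.

```python
import math

def f(x):
    # Illustrative test function (an assumption); it vanishes at infinity.
    return math.exp(-x * x)

def f_second(x):
    # Second derivative of f, computed by hand.
    return (4 * x * x - 2) * math.exp(-x * x)

def heat_semigroup(g, x, t, n=4001, width=10.0):
    """Approximate P_t g(x) = E^x g(B_t), writing B_t = x + sqrt(t) * Z with Z ~ N(0, 1)."""
    if t == 0:
        return g(x)
    s = math.sqrt(t)
    h = 2 * width / (n - 1)  # grid spacing in the standardized variable z
    total = 0.0
    for i in range(n):
        z = -width + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0  # trapezoidal end-point weights
        total += weight * g(x + s * z) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2 * math.pi)

def generator_quotient(g, x, t):
    """Difference quotient (P_t g(x) - g(x)) / t from the definition of the generator."""
    return (heat_semigroup(g, x, t) - g(x)) / t

x = 0.7
for t in (1e-1, 1e-2, 1e-3):
    # The second column should approach the third, (1/2) f''(x), as t decreases.
    print(t, generator_quotient(f, x, t), 0.5 * f_second(x))
```

The difference quotient only agrees with $\frac{1}{2}f''(x)$ up to an error of order $t$, so the match improves as $t$ shrinks.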
From $(1)$ we see that $u(t,x) := \mathbb{E}^x(f(B_t))$ is the (unique) solution of the heat equation
$$\partial_t u(t,x) = \frac{1}{2}\partial_x^2 u(t,x) \qquad u(0,x)=f(x).$$
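To see the heat equation at work, one can plug in a case where $u$ is known in closed form. The sketch below assumes the initial datum $f(x)=e^{-x^2}$ (an illustrative choice), for which a standard Gaussian integral gives $u(t,x) = (1+2t)^{-1/2} e^{-x^2/(1+2t)}$, and checks the PDE with central finite differences. The function names are hypothetical.

```python
import math

def u(t, x):
    # Closed form of E^x f(B_t) for the assumed initial datum f(x) = exp(-x^2).
    return math.exp(-x * x / (1 + 2 * t)) / math.sqrt(1 + 2 * t)

def pde_residual(t, x, h=1e-3):
    """Finite-difference estimate of du/dt - (1/2) d^2u/dx^2 at (t, x)."""
    du_dt = (u(t + h, x) - u(t - h, x)) / (2 * h)
    d2u_dx2 = (u(t, x + h) - 2 * u(t, x) + u(t, x - h)) / (h * h)
    return du_dt - 0.5 * d2u_dx2

# The residual should vanish up to discretization error.
for t, x in ((0.3, 0.4), (1.0, -1.2), (2.5, 0.0)):
    print(t, x, pde_residual(t, x))
```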
Moreover, one can show that the solution of the Dirichlet problem is also related to the Brownian motion. Furthermore, $(2)$ yields that
$$M_t^f := f(B_t)-f(B_0) - \frac{1}{2} \int_0^t f''(B_s) \, ds$$
is a martingale. Having Itô's formula in mind, this is not surprising since
$$f(B_t)-f(B_0) = \int_0^t f'(B_s) \, dB_s+ \frac{1}{2} \int_0^t f''(B_s) \,ds = M_t^f + \frac{1}{2} \int_0^t f''(B_s) \,ds.$$
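A martingale started at $0$ has mean zero, so in particular $\mathbb{E}^x(M_t^f)=0$, which is Dynkin's formula for Brownian motion. The following Monte Carlo sketch checks this for the assumed test function $f=\cos$ (so $f''=-\cos$). The Brownian increments are sampled exactly on a time grid, while the time integral is replaced by a left Riemann sum; the function name and all parameter values are illustrative assumptions.

```python
import math
import random

def simulate_martingale_mean(x0=0.3, t=1.0, steps=100, paths=5000, seed=0):
    """Monte Carlo estimate of E^x[M_t^f] for f = cos, where f'' = -cos."""
    rng = random.Random(seed)
    dt = t / steps
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(paths):
        b = x0
        integral = 0.0  # left Riemann sum approximating int_0^t f''(B_s) ds
        for _ in range(steps):
            integral += -math.cos(b) * dt
            b += sqrt_dt * rng.gauss(0.0, 1.0)  # exact Brownian increment
        # M_t^f = f(B_t) - f(B_0) - (1/2) * int_0^t f''(B_s) ds
        total += math.cos(b) - math.cos(x0) - 0.5 * integral
    return total / paths

# Should be close to 0, up to Monte Carlo noise and a small time-discretization bias.
print(simulate_martingale_mean())
```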
The above-mentioned results (and proofs thereof) can be found in the monograph Brownian Motion - An Introduction to Stochastic Processes by René L. Schilling & Lothar Partzsch.
You can view your process $X_{t}$ as the first component of a two-dimensional stochastic process
$$Y_{t}=\left[\begin{array}{c}X_{t}\\ \eta_{t}\end{array}\right].$$
Then
$$dY_{t}=\left[\begin{array}{c}dX_{t}\\ d\eta_{t}\end{array}\right]=\left[\begin{array}{c}b(X_t)+\lambda\eta_{t}\sigma(X_{t})\\ \lambda\eta_{t}\end{array}\right]dt+\left[\begin{array}{cc}\alpha\sigma(X_t)&0\\ 0&\alpha\end{array}\right]\left[\begin{array}{c}dW_{t}\\ dW_{t}\end{array}\right]$$
and the infinitesimal generator is of the form
$$\begin{aligned}
LV(y)=LV(x,\eta)={}&\left(b(x)+\lambda\eta \sigma(x)\right)V'_{x}(x,\eta)+\lambda\eta V'_{\eta}(x,\eta)\\
&+\frac{1}{2}\alpha^{2}\sigma^{2}(x)V''_{xx}(x,\eta)+\alpha^{2}\sigma(x)V''_{x\eta}(x,\eta)+\frac{1}{2}\alpha^{2}V''_{\eta\eta}(x,\eta).
\end{aligned}$$
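The mixed term has coefficient $\alpha^2\sigma(x)$ because both components are driven by the same Brownian motion $W_t$: with dispersion vector $g=(\alpha\sigma(x),\alpha)^\top$ the diffusion matrix is $a = gg^\top$, and the generator is $\text{drift}\cdot\nabla V + \frac{1}{2}\sum_{i,j} a_{ij}\partial_{ij}V$. Here is a rough Python sketch that evaluates this recipe with central finite differences. The function name `generator_2d` and the concrete choices of `b`, `sigma` and `V` are hypothetical and only serve as an illustration.

```python
import math

def generator_2d(V, x, eta, b, sigma, lam, alpha, h=1e-3):
    """Evaluate LV(x, eta) = drift . grad V + (1/2) sum_ij a_ij d_ij V by central differences."""
    Vx = (V(x + h, eta) - V(x - h, eta)) / (2 * h)
    Ve = (V(x, eta + h) - V(x, eta - h)) / (2 * h)
    Vxx = (V(x + h, eta) - 2 * V(x, eta) + V(x - h, eta)) / h**2
    Vee = (V(x, eta + h) - 2 * V(x, eta) + V(x, eta - h)) / h**2
    Vxe = (V(x + h, eta + h) - V(x + h, eta - h)
           - V(x - h, eta + h) + V(x - h, eta - h)) / (4 * h**2)
    drift = (b(x) + lam * eta * sigma(x), lam * eta)
    g = (alpha * sigma(x), alpha)  # both components load on the same W
    a_xx, a_xe, a_ee = g[0] * g[0], g[0] * g[1], g[1] * g[1]  # a = g g^T
    return (drift[0] * Vx + drift[1] * Ve
            + 0.5 * (a_xx * Vxx + 2 * a_xe * Vxe + a_ee * Vee))

# Illustrative coefficients and test function (assumptions, not from the question).
b = lambda x: -x
sigma = lambda x: 1.0 + 0.5 * math.sin(x)
V = lambda x, eta: x * x * eta + eta * eta

print(generator_2d(V, 0.4, -0.3, b, sigma, lam=0.7, alpha=1.2))
```

If the two components were driven by independent Brownian motions instead, `a_xe` would be zero and the mixed term would drop out.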
By the way, the infinitesimal generator of an Ornstein-Uhlenbeck process of the form
$$d\eta_{t} = \lambda\eta_{t} dt + \alpha dW_{t}$$
is
$$LV(\eta)=\lambda \eta V'(\eta) + \frac{\alpha^2}{2}V''(\eta)$$
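For a concrete check, take $V(\eta)=\eta^2$ (an illustrative choice), so that $LV(\eta)=2\lambda\eta^2+\alpha^2$. Solving the linear SDE gives $\mathbb{E}(\eta_t^2 \mid \eta_0) = \eta_0^2 e^{2\lambda t} + \alpha^2\frac{e^{2\lambda t}-1}{2\lambda}$ for $\lambda \neq 0$, and the difference quotient at $t \downarrow 0$ should reproduce $LV(\eta_0)$. A small Python sketch with hypothetical function names and parameter values:

```python
import math

def ou_second_moment(eta0, t, lam, alpha):
    """E[eta_t^2 | eta_0 = eta0] for d eta = lam * eta dt + alpha dW, with lam != 0."""
    growth = math.exp(2 * lam * t)
    return eta0**2 * growth + alpha**2 * (growth - 1) / (2 * lam)

def generator_on_square(eta, lam, alpha):
    """LV(eta) for V(eta) = eta^2: lam * eta * V' + (alpha^2 / 2) * V''."""
    return 2 * lam * eta**2 + alpha**2

eta0, lam, alpha = 0.8, -1.5, 0.6  # illustrative values (assumptions)
for t in (1e-2, 1e-4, 1e-6):
    quotient = (ou_second_moment(eta0, t, lam, alpha) - eta0**2) / t
    # The quotient should approach the generator value as t decreases.
    print(t, quotient, generator_on_square(eta0, lam, alpha))
```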
Best Answer
Take a look at Examples 3.3.2 and 3.3.3 in "Stochastic Analysis on Manifolds" by Elton P. Hsu. He derives the generator for Brownian motion on the sphere,
$$\mathcal{L}f(x):=\frac{1}{2}\Delta_{S}f(x)=\frac{1}{2}\Delta(f(\frac{x}{|x|})),$$
where $\Delta_{S}$ is the spherical Laplacian.
The formula you mentioned does have analogues on manifolds. For example, see Theorem 4.9 in "Heat Kernel and Analysis on Manifolds" by A. Grigor'yan, where he describes the relation between the semigroup and the generator and in particular derives the relation
$$\frac{d}{dt}P_{t}(f)=-\mathcal{L}P_{t}(f),$$
which reduces to the one you mentioned as $t\to 0$ (also see Exercises 4.41-42). Note the sign: it is consistent with a convention in which $\mathcal{L}$ denotes a non-negative operator such as $-\Delta$, in contrast to $\mathcal{L}=\frac{1}{2}\Delta_{S}$ above.
To be clear, even in $\mathbb{R}^{n}$ one has to start from some generator and then define the corresponding semigroup (solving the Cauchy problem) and stochastic process.
Once some basic processes had been built, the correspondence was also worked out in the reverse direction (see, e.g., Feynman-Kac), so one can start from an SDE and work out the corresponding generator.