With just an affine structure you will not be able to get an induced connection. (Part of the story is told in Fox's AMS Notices article from March 2012 titled "What is an affine sphere?".)
Instead, for codimension 1 submanifolds you can consider the following: let $\tau:N\to M$ be an embedding and let $v$ be a vector field on $M$ along $N$ that is transverse to $N$. Then the pair $(\nabla,v)$ on $M$ induces a connection on $N$. For vector fields $X,Y$ on $N$, we can define
$$ D^{(v)}_X Y = [\nabla_{\tau_*X}\tau_*Y] $$
where, for $W\in T_pM$ with $p\in \tau(N)$, the vector $[W]$ is defined by $\tau_*[W] - W = \lambda v$ for some $\lambda\in\mathbb{R}$; that is, $[W]$ is the projection of $W$ onto the tangent space of $N$ along $v$. In higher codimension you need more (linearly independent) vector fields. In the Riemannian case, $v$ is canonically chosen to be the unit normal vector to $N$ (or, in higher codimension, a family that spans the normal bundle).
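To make the construction concrete, it may help to write it as a splitting of $\nabla_{\tau_*X}\tau_*Y$ into a part tangent to $N$ and a multiple of $v$. The symbol $h$ below is notation introduced here for illustration (in terms of the $\lambda$ above, $h(X,Y) = -\lambda$):
$$ \nabla_{\tau_*X}\tau_*Y = \tau_*\big(D^{(v)}_X Y\big) + h(X,Y)\, v. $$
In the Riemannian case, with $\nabla$ the Levi-Civita connection of a metric $g$ on $M$ and $v = n$ the unit normal, this is the usual Gauss formula: $h(X,Y) = g(\nabla_{\tau_*X}\tau_*Y, n)$ is the second fundamental form, and $D^{(n)}$ is the Levi-Civita connection of the induced metric on $N$.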
All of my geometric intuition for "immersed" versus "embedded" surfaces is contained in my geometric intuition for "immersions" versus "embeddings". In particular, as many people have pointed out, immersions need not be injective. But, of course, even injective immersions need not be embeddings. As a very simple example, consider the map $f : (-\pi, \pi) \to \mathbb{R}^2$ given by
$$
f(t) = (\sin t, \sin 2t).
$$
The image of this map is a sort of "figure 8" in the plane, traced out starting near the origin and moving through quadrants II, III, I, and IV, in that order, as $t$ moves from $-\pi$ to $\pi$. It's easy to see that $f$ is an injective immersion, but $f$ is not an embedding: every neighborhood of the origin in $\mathbb{R}^2$ contains the points $f(-\pi + \epsilon)$ and $f(\pi - \epsilon)$ for all sufficiently small $\epsilon > 0$, so the inverse of $f$ on its image is not continuous at $f(0) = (0,0)$.
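As a quick numerical sanity check of the two claims above (the quadrant order and the failure of the embedding property), here is a minimal Python sketch. The helper names `quadrant` and `dist_to_origin` and the sample parameters are illustrative choices, not part of the original argument.

```python
import math


def f(t):
    """The figure-eight curve f(t) = (sin t, sin 2t) on (-pi, pi)."""
    return (math.sin(t), math.sin(2 * t))


def quadrant(point):
    """Quadrant number (1-4) of a point with nonzero coordinates."""
    x, y = point
    if x > 0:
        return 1 if y > 0 else 4
    return 2 if y > 0 else 3


def dist_to_origin(t):
    """Euclidean distance from f(t) to the origin f(0) = (0, 0)."""
    return math.hypot(*f(t))


# One sample parameter from each quarter of the interval (-pi, pi).
samples = [-3 * math.pi / 4, -math.pi / 4, math.pi / 4, 3 * math.pi / 4]
quadrants = [quadrant(f(t)) for t in samples]

# Parameters near the ends of the interval stay far from t = 0,
# yet their images get arbitrarily close to f(0).
eps_values = [10.0 ** (-k) for k in range(1, 6)]
end_distances = [
    max(dist_to_origin(-math.pi + eps), dist_to_origin(math.pi - eps))
    for eps in eps_values
]

print("quadrants visited:", quadrants)
for eps, d in zip(eps_values, end_distances):
    print(f"eps = {eps:.0e}: distance to origin = {d:.3e}")
```

The printed distances shrink with $\epsilon$ even though the parameters $\pm(\pi - \epsilon)$ stay a distance of almost $\pi$ away from $t = 0$, which is exactly the discontinuity of the inverse map at the origin.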
As another one-dimensional example of this type, you could consider the closed topologist's sine curve with a loop, which is the graph of $g(x) = \sin(1/x)$ for $x \in (0, 1]$ together with the $y$-axis between $y = -1$ and $y = 1$ together with a "loop" smoothly connecting the point $(0, -1)$ to the point $(1, \sin(1))$. It's clear that there is some injective immersion $f : [0, \infty) \to \mathbb{R}^2$ whose image is this curve, and this immersion is not an embedding.
You can, of course, easily make either of these examples into a surface by considering $h: (-\pi, \pi) \times (0,1) \to \mathbb{R}^3$ given by $h(t, s) = (f(t), s)$ (with $(-\pi, \pi)$ replaced by the domain of $f$ in the second example).
On the other hand, I don't really know any examples of injective immersions of surfaces which aren't embeddings that are "interesting" in a way that's fundamentally different from the examples above. The idea I have is that immersions are allowed to "approach themselves" or "limit onto themselves" in crazy ways that embeddings are not. In particular, if $f : X \to Y$ is an injective immersion, the topology on $f(X)$ as a subspace of $Y$ might be very different from the topology on $X$.
Best Answer
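Let's look at what do Carmo actually writes for the definition of a regular surface. In paraphrase: a subset $S \subset \mathbb{R}^3$ is a regular surface if, for each $p \in S$, there exist a neighborhood $V$ of $p$ in $\mathbb{R}^3$ and a map ${\bf x} : U \to V \cap S$ from an open set $U \subset \mathbb{R}^2$ onto $V \cap S$ such that (1) ${\bf x}$ is differentiable, (2) ${\bf x}$ is a homeomorphism, and (3) for each $q \in U$ the differential $d{\bf x}_q : \mathbb{R}^2 \to \mathbb{R}^3$ is one-to-one.
This is section 2-2, definition 1 in do Carmo.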
Notice a couple of things. Since $V \cap S$ doesn't have a smooth manifold structure yet, we can't, strictly speaking, talk about ${\bf x}$ being differentiable. If you look at the definition do Carmo gives, what he's really saying is that the composition $$U \xrightarrow{\bf x} S \cap V \hookrightarrow \mathbb{R}^3$$ is differentiable. Similarly, when he talks about $d{\bf x}_q$, he's talking about the differential of this same composition, since again we can't talk about ${\rm T}_q (S \cap V)$ yet, as it hasn't been defined.
Okay, now a smooth embedding, according to Lee, is an injective immersion which is a homeomorphism onto its image. Let $i : S \hookrightarrow \mathbb{R}^3$ denote the inclusion. Since $i$ is the inclusion of a subspace, the topological conditions are already satisfied, so it remains only to check that $i$ is an immersion, i.e. that $i_\ast : {\rm T}_P S \to {\rm T}_P \mathbb{R}^3$ is injective for each $P \in S$.
Take a local parametrization ${\bf x} : U \to S$ at $P$, say with ${\bf x}(0) = P$. By definition, the composition $$U \xrightarrow{\bf x} S \xrightarrow{i} \mathbb{R}^3$$ is an immersion. Since $(i \circ {\bf x})_\ast = i_\ast \circ {\bf x}_\ast$, the composition $${\rm T}_0 U \xrightarrow{{\bf x}_\ast} {\rm T}_P S \xrightarrow{i_\ast} {\rm T}_P \mathbb{R}^3$$ is injective. Above, you've said you believe that $S$ is a 2-manifold. Given this, $\dim {\rm T}_0 U = \dim {\rm T}_P S = 2$ and $\dim {\rm T}_P \mathbb{R}^3 = 3$. Injectivity of the composition forces ${\bf x}_\ast$ to be injective, hence an isomorphism, since both of its spaces have dimension 2; it then follows by elementary linear algebra that $i_\ast$ is injective as well.
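For completeness, the linear algebra step can be spelled out as follows. The labels $A = {\bf x}_\ast$ and $B = i_\ast$ are shorthand introduced here, not notation from do Carmo or Lee:
$$ \ker A \subseteq \ker (BA) = 0 \;\Longrightarrow\; A \text{ is injective} \;\Longrightarrow\; A \text{ is an isomorphism, since } \dim {\rm T}_0 U = \dim {\rm T}_P S = 2, $$
$$ B = (BA)\,A^{-1} \;\Longrightarrow\; B \text{ is injective, being a composition of injective maps.} $$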