All of my geometric intuition for "immersed" versus "embedded" surfaces is contained in my geometric intuition for "immersions" versus "embeddings". In particular, as many people have pointed out, immersions need not be injective. But, of course, even injective immersions need not be embeddings. As a very simple example, consider the map $f : (-\pi, \pi) \to \mathbb{R}^2$ given by
$$
f(t) = (\sin t, \sin 2t).
$$
The image of this map is a sort of "figure 8" in the plane, traced out starting near the origin and moving through quadrants II, III, I, and IV, in that order (passing through the origin at $t = 0$), as $t$ moves from $-\pi$ to $\pi$. It's easy to see that $f$ is an injective immersion, but $f$ is not an embedding: every neighborhood of the origin in $\mathbb{R}^2$ contains the points $f(-\pi + \epsilon)$ and $f(\pi - \epsilon)$ for all sufficiently small $\epsilon > 0$, so $f$ is not a homeomorphism onto its image.
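For anyone who wants the "easy to see" part spelled out, here is a quick sketch of the verification, using only the formula for $f$ above. The derivative is
$$
f'(t) = (\cos t, \, 2\cos 2t),
$$
and $\cos t = 0$ on $(-\pi, \pi)$ only at $t = \pm\pi/2$, where $2\cos 2t = -2 \neq 0$, so $f'$ never vanishes and $f$ is an immersion. For injectivity, suppose $f(s) = f(t)$, so that $\sin s = \sin t$ and $2 \sin s \cos s = 2 \sin t \cos t$. If $\sin s = \sin t \neq 0$, then $\cos s = \cos t$ as well, which forces $s = t$ on an open interval of length $2\pi$; if $\sin s = \sin t = 0$, then $s = t = 0$. Finally, the failure of the embedding property can be seen from
$$
f(\pi - \epsilon) = (\sin \epsilon, \, -\sin 2\epsilon) \to (0, 0) = f(0), \qquad f(-\pi + \epsilon) = (-\sin \epsilon, \, \sin 2\epsilon) \to (0, 0) = f(0)
$$
as $\epsilon \to 0^+$, while $\pi - \epsilon$ and $-\pi + \epsilon$ do not converge to $0$ in $(-\pi, \pi)$. So the inverse map from the image (with the subspace topology) back to $(-\pi, \pi)$ is not continuous at the origin.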
As another one-dimensional example of this type, you could consider the closed topologist's sine curve with a loop, which is the graph of $g(x) = \sin(1/x)$ for $x \in (0, 1]$, together with the segment of the $y$-axis between $y = -1$ and $y = 1$, together with a "loop" smoothly connecting the point $(0, -1)$ to the point $(1, \sin(1))$. It's clear that there is some injective immersion $f : [0, \infty) \to \mathbb{R}^2$ whose image is this curve, and this immersion is not an embedding, since every point of the segment on the $y$-axis is a limit of points $f(t_n)$ with $t_n \to \infty$.
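You can, of course, easily make either of these examples into a surface by crossing with an interval: for the first example, consider $h: (-\pi, \pi) \times (0,1) \to \mathbb{R}^3$ given by $h(t, s) = (f(t), s)$, and similarly for the second with the domain of $f$ replaced accordingly.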
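As a rough check that the product construction behaves as claimed (taking $f(t) = (\sin t, \sin 2t)$ for concreteness), the differential of $h(t, s) = (\sin t, \sin 2t, s)$ is
$$
Dh(t, s) = \begin{pmatrix} \cos t & 0 \\ 2\cos 2t & 0 \\ 0 & 1 \end{pmatrix},
$$
whose columns are linearly independent exactly when $(\cos t, 2\cos 2t) \neq (0, 0)$, which holds for every $t$ because $f$ is an immersion. So $h$ is an immersion, and it is injective because $f$ is. It still fails to be an embedding for the same reason as before: for each fixed $s \in (0, 1)$,
$$
h(\pi - \epsilon, s) = (\sin \epsilon, \, -\sin 2\epsilon, \, s) \to (0, 0, s) = h(0, s) \quad \text{as } \epsilon \to 0^+,
$$
even though $(\pi - \epsilon, s)$ does not converge to $(0, s)$ in the domain.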
On the other hand, I don't really know any examples of injective immersions of surfaces which aren't embeddings that are "interesting" in a way that's fundamentally different from the examples above. The idea I have is that immersions are allowed to "approach themselves" or "limit onto themselves" in crazy ways that embeddings are not. In particular, if $f : X \to Y$ is an injective immersion, the topology on $f(X)$ as a subspace of $Y$ might be very different from the topology on $X$.
Best Answer
He mentions that to specify what he means by continuity of $f^{-1}$, probably for those readers who don't have any background in topology. In general, a map is called continuous if preimages of open sets are open. Therefore you need to know which sets are called open (i.e. the topology), in this case the induced topology of $S$. His way of avoiding this is to say that $f^{-1}$ is the restriction of some "bigger" map, which is why he introduces $W$.
However, if you have some basic topology knowledge you can skip that. It's enough to say that $S$ is locally homeomorphic to $\mathbb{R}^2$, where $X \subset S$ is called open if $X = A \cap S$ for some open $A \subset \mathbb{R}^3$.