If you think of a vector field on $\Bbb R^n$ as a map $f\colon\Bbb R^n\to\Bbb R^n$, of course it need not be injective (the $0$-vector field is the easiest counterexample). If, however, you think of a vector field as a section of $T\Bbb R^n$, then, as you already proved, it's injective. We can see this directly because we're then looking at the map $\sigma\colon\Bbb R^n\to\Bbb R^{2n}$ given by $\sigma(x)=(x,f(x))$, and the identity map in the first coordinate makes the mapping injective.
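A quick numerical illustration of the two viewpoints, as a minimal sketch: it assumes $n=2$ and takes $f$ to be the zero vector field. The names `f`, `sigma`, `x`, and `y` are illustrative choices only.

```python
import numpy as np

def f(x):
    # The zero vector field: as a map R^n -> R^n it sends every point to 0.
    return np.zeros_like(x)

def sigma(x):
    # The associated section of T R^n = R^(2n): x -> (x, f(x)).
    return np.concatenate([x, f(x)])

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

# f collapses the two distinct points to the same value...
same_f = np.array_equal(f(x), f(y))
# ...but sigma keeps them apart, because its first n coordinates are x and y.
same_sigma = np.array_equal(sigma(x), sigma(y))
```

Here `same_f` is true while `same_sigma` is false, which is exactly the difference between the map $f$ and the section $\sigma$.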
I think the simplest motivation is that of vector fields. We want to be able to assign a tangent vector to each point in the manifold, giving us a "field" of vectors on the manifold. That is, $F$ should be some map such that
$$ F(p)\in T_pM $$
for $p\in M$. So, what's wrong with just saying that? Well, nothing really, if you're only interested in the value of vector fields at a point. If you ever want to look beyond individual points, you need some structure connecting your different tangent spaces. For example, to look at continuity of $F$, we need the space of outputs of $F$ to have a topological structure; to look at differentiability, we need a differentiable structure.
So, we need some space which
- contains all the tangent vectors of $M$
- has the same "level" of structure that $M$ has
To satisfy the first requirement, we simply glue together all the tangent spaces by taking a union. Since the tangent spaces are completely disconnected from one another, we can make this explicit by using a disjoint union
$$ TM = \coprod\limits_{p\in M} T_pM $$
The rest of the bundle structure is just there to "lift" the structure of $M$ onto $TM$. With that, we can now define vector fields as functions in the usual way
$$ F: M\to TM $$
and we are able to discuss continuity and differentiability to the extent that $M$ admits such properties. However, this definition isn't "complete" because it allows for e.g. attaching a tangent vector from $T_qM$ to $p$, which doesn't fit our idea of a vector field. This gives us another requirement,
- we need a way to determine which point a vector is tangent to
This is the bundle projection map, $\pi: TM\to M$, and so we can add the requirement on $F$ that $\pi(F(p)) = p$ everywhere.
Here, we created a bundle with a base manifold and a vector space at each point, but we could imagine a more general concept of bundle which just has some kind of space $B$ with some other kind of space $F_p B$ at each point $p\in B$. Even in this general setting, we can see the utility of attaching to each point of the base space some element of its attached space. We define a function
$$ \sigma: B\to FB = E $$
with $\pi(\sigma(p)) = p$ as a cross-section (or just section) of the total space $E$.
Applying this terminology to our original example, we can then reconstruct the more terse definition:
$$ \text{a vector field on } M \text{ is a section }\sigma\text{ of the tangent bundle } TM$$
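The section condition $\pi(\sigma(p)) = p$ can be checked concretely. The sketch below assumes $M = S^1 \subset \Bbb R^2$ and stores an element of $TM$ as a pair (base point, vector); the helper names `proj`, `rotation_field`, `bad_field`, and `is_section` are hypothetical and only serve the illustration.

```python
import numpy as np

def proj(element):
    # Bundle projection pi: TM -> M, keeping the base point and forgetting the vector.
    base_point, _vector = element
    return base_point

def rotation_field(p):
    # A genuine section: attaches to p a vector tangent to the circle at p.
    x, y = p
    return (p, np.array([-y, x]))

def bad_field(p):
    # A map M -> TM that is not a section: the vector it returns
    # lives over the antipodal point q = -p instead of over p.
    q = -p
    return (q, np.array([-q[1], q[0]]))

def is_section(F, points):
    # Check pi(F(p)) == p at every sample point.
    return all(np.allclose(proj(F(p)), p) for p in points)

angles = np.linspace(0.0, 2.0 * np.pi, 12, endpoint=False)
points = [np.array([np.cos(t), np.sin(t)]) for t in angles]

rotation_ok = is_section(rotation_field, points)
bad_ok = is_section(bad_field, points)
```

Both functions are perfectly good maps $M\to TM$, but only `rotation_field` passes the check, which is the extra requirement that the projection $\pi$ lets us state.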
If you embed $S^{n-1}$ into $\mathbb{R}^n$ as the unit sphere you can embed the tangent bundle as the collection of all hyperplanes tangent to the unit sphere. Said more explicitly, the tangent space at a unit vector $v \in S^{n-1}$ can be identified with the space of all vectors in $\mathbb{R}^n$ orthogonal to $v$, and the tangent bundle can be identified with the space of pairs $(v, x) \in (\mathbb{R}^n)^2 \cong T(\mathbb{R}^n)$ such that $\| v \| = 1$ and $\langle v, x \rangle = 0$. So you can specify a vector field on $S^{n-1}$ by writing down a continuous function $S^{n-1} \to \mathbb{R}^n$ sending a vector $v \in S^{n-1} \subset \mathbb{R}^n$ to any vector orthogonal to it. For $S^1$ and $S^3$ the ambient $\mathbb{R}^2$ and $\mathbb{R}^4$ should be thought of as identified with $\mathbb{C}$ and $\mathbb{H}$ respectively, and then you can check that multiplication by $i$ (for $S^1$) or by $i, j, k$ (for $S^3$) always produces a vector orthogonal to a given vector.
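The orthogonality claim is easy to test numerically. The following is a sketch under stated assumptions: quaternions are stored as arrays `(w, x, y, z)`, the multiplication is taken on the left, and `quat_mul` is a hand-written Hamilton product rather than a library routine.

```python
import numpy as np

def quat_mul(a, b):
    # Hamilton product of quaternions stored as (w, x, y, z).
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return np.array([
        aw*bw - ax*bx - ay*by - az*bz,
        aw*bx + ax*bw + ay*bz - az*by,
        aw*by - ax*bz + ay*bw + az*bx,
        aw*bz + ax*by - ay*bx + az*bw,
    ])

I = np.array([0.0, 1.0, 0.0, 0.0])
J = np.array([0.0, 0.0, 1.0, 0.0])
K = np.array([0.0, 0.0, 0.0, 1.0])

rng = np.random.default_rng(0)

# A random point on S^3 inside R^4 = H.
v = rng.normal(size=4)
v /= np.linalg.norm(v)

# Euclidean inner products <v, i*v>, <v, j*v>, <v, k*v>; all should vanish.
dots_s3 = [float(v @ quat_mul(u, v)) for u in (I, J, K)]
max_dot_s3 = max(abs(d) for d in dots_s3)

# A random point on S^1 inside R^2 = C, and the real inner product <z, i*z>.
z = complex(*rng.normal(size=2))
z /= abs(z)
dot_s1 = (z.conjugate() * (1j * z)).real
```

Up to floating-point error, `max_dot_s3` and `dot_s1` are zero, so $v\mapsto iv$, $v\mapsto jv$, $v\mapsto kv$ are tangent vector fields on $S^3$ and $z\mapsto iz$ is one on $S^1$.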
The diffeomorphism is given by taking linear combinations of the vector fields. At each point they evaluate to a basis of the tangent space. So if the vector fields are $X_1, \dots, X_n$ we have a map which takes the vector $(c_1, \dots, c_n) \in \mathbb{R}^n$, in the fiber of the trivial bundle with fiber $\mathbb{R}^n$ at any point $p$, to the tangent vector given by taking $\sum c_i X_i$ and then evaluating it at $p$.
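As a concrete instance of this map, here is a sketch that assumes the manifold is $S^3\subset\mathbb{R}^4$ and that $X_1, X_2, X_3$ are left multiplication by $i, j, k$, written out in coordinates $p=(w,x,y,z)$. The names `frame` and `trivialize` are hypothetical.

```python
import numpy as np

def frame(p):
    # 4x3 matrix whose columns are X_1(p), X_2(p), X_3(p),
    # i.e. i*p, j*p, k*p for p = (w, x, y, z) viewed as a quaternion.
    w, x, y, z = p
    return np.column_stack([
        [-x,  w, -z,  y],
        [-y,  z,  w, -x],
        [-z, -y,  x,  w],
    ])

def trivialize(p, c):
    # Send (p, c) in S^3 x R^3 to the tangent vector sum_i c_i X_i(p) at p.
    return frame(p) @ c

rng = np.random.default_rng(0)
p = rng.normal(size=4)
p /= np.linalg.norm(p)
c = np.array([0.5, -2.0, 1.0])

tangent = trivialize(p, c)

# The fields form a basis of T_p S^3 exactly when the frame has rank 3.
rank = np.linalg.matrix_rank(frame(p))

# Inverting the map on the fiber: recover the coefficients c from the tangent vector.
recovered = np.linalg.lstsq(frame(p), tangent, rcond=None)[0]
```

The rank check is the pointwise statement that the fields evaluate to a basis, and recovering `c` from `tangent` shows that the map is invertible on each fiber, which is what makes it an isomorphism with the trivial bundle.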