Let $\phi$ denote the parametrization of the 2-torus in $\mathbb{R}^4$, that is, $\phi: \mathbb{R}^2 \to T^2\subseteq\mathbb{R}^4$ with $\phi(x,y)=(\sin(x),\cos(x),\sin(y),\cos(y))$. Now let $\pi$ be a line in $\mathbb{R}^2$ with irrational slope, say $\sqrt{2}$, and let $\varphi:=\phi_{\vert \pi}$. This map is differentiable and is in fact an immersion, but it is not an embedding: its image is a one-dimensional immersed submanifold that is dense in the torus, so $\varphi$ is not a homeomorphism onto its image. Actually, in both this example and yours, the pathological behaviour is in some sense caused by the fact that the mappings are not "proper": they send points arbitrarily far out (near infinity) close to the images of other points. Formalizing this condition, we get the definition of an embedding.
It must be an immersion: if it is not, then the Jacobian matrix fails to have maximal rank somewhere, and so the transformation won't be invertible there.
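To see the "points near infinity land near others" behaviour numerically, here is a minimal Python sketch. It assumes the line passes through the origin (the text only fixes the slope $\sqrt{2}$), and the function names `phi`, `curve` and `closest_return` are made up for this illustration. It samples the curve at $t=2\pi k$ and looks for the sample whose image comes closest to the image of $t=0$.

```python
import math

SLOPE = math.sqrt(2)  # the irrational slope from the example

def phi(x, y):
    """Parametrization of the 2-torus in R^4."""
    return (math.sin(x), math.cos(x), math.sin(y), math.cos(y))

def curve(t):
    """phi restricted to the line y = SLOPE * x (assumed to pass through the origin)."""
    return phi(t, SLOPE * t)

def closest_return(max_k):
    """Among t = 2*pi*k with 1 <= k <= max_k, find the k whose image is nearest to curve(0)."""
    base = curve(0.0)
    best_k, best_d = None, math.inf
    for k in range(1, max_k + 1):
        d = math.dist(curve(2 * math.pi * k), base)  # Euclidean distance in R^4
        if d < best_d:
            best_k, best_d = k, d
    return best_k, best_d

k, d = closest_return(500)
print(k, d)
```

With `max_k = 500` the closest return occurs at $k=408$, i.e. at a parameter value $t\approx 2564$, while the distance in $\mathbb{R}^4$ is only about $0.005$. Raising `max_k` makes the distance as small as you like, which is exactly why the inverse of $\varphi$ onto its image cannot be continuous.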
Here is the main issue: how can one define the pushforward of a vector field? Rather – when? Take two copies of the real line, parametrized as $M=\mathbb{R}\times \{0\}\sqcup\mathbb{R}\times \{1\}\subseteq \mathbb{R}^2$ and define a map
$$ \pi:M\to \mathbb{R}$$
by $\pi(x,n)=x$. Then take the vector field $X\in \mathfrak{X}(M)$ defined by $X_{(x,0)}=-1$ and $X_{(x,1)}=1$ for all $x\in \mathbb{R}$. We calculate $\pi_{*,(x,0)}X_{(x,0)}=-1\in T_x\mathbb{R}$ and $\pi_{*,(x,1)} X_{(x,1)}=1\in T_x\mathbb{R}$. The sensible way to (try to) define the pushforward vector field here is to set $Y=\pi_{*}X$ to be $Y_p=\pi_{*,q}(X_q)$ for some $q\in M$ with $\pi(q)=p$.
Unfortunately, this won't work: if for instance we choose $p=1$, then its preimages under $\pi$ are $(1,0)$ and $(1,1)$. If we choose $q=(1,0)$ we get $Y_p=-1$, and if we choose $q=(1,1)$ we get $Y_p=1$. So, the pushforward vector field is not well-defined in general.
The only way this could be defined is if the following criterion is met:
Let $F:M\to N$ denote a map of $\mathscr{C}^\infty$ manifolds. Then given $X\in \mathfrak{X}(M)$, there exists a vector field $F_*X\in \mathfrak{X}(F(M))$ defined as above if and only if for each $p\in N$, $F_{*,q}(X_{q})=F_{*,q'}(X_{q'})$ for all $q,q'\in F^{-1}(p)$.
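As a sanity check, the criterion can be tested on the two-lines example. The following is only a sketch: it inspects a single fibre at a time, and every name in it (`pushforward_values`, `fibre_of_pi`, `d_pi`, `X`, `X_const`) is hypothetical. Tangent vectors to $\mathbb{R}$ are represented by plain floats.

```python
def pushforward_values(fibre, X, dF):
    """Collect the candidate values dF(q, X(q)) over all q in one fibre F^{-1}(p)."""
    return {dF(q, X(q)) for q in fibre}

def fibre_of_pi(p):
    """Fibre of pi(x, n) = x over p: one point on each copy of the real line."""
    return [(p, 0), (p, 1)]

def d_pi(q, v):
    """Differential of pi at q: pi is the identity on each copy, so v is unchanged."""
    return v

def X(q):
    """The vector field from the example: -1 on the copy n = 0, +1 on the copy n = 1."""
    return -1.0 if q[1] == 0 else 1.0

def X_const(q):
    """A field that takes the same value on both copies, for comparison."""
    return 1.0

bad = pushforward_values(fibre_of_pi(1.0), X, d_pi)
good = pushforward_values(fibre_of_pi(1.0), X_const, d_pi)
print(len(bad) == 1, bad)    # criterion fails on this fibre
print(len(good) == 1, good)  # criterion holds on this fibre
```

The criterion holds at $p$ exactly when the returned set has a single element. For `X` the fibre over $p=1$ yields two different values, so no pushforward exists; for `X_const` the values agree.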
An immediate Corollary is that when each $F^{-1}(p)$ contains only one point, the pushforward vector field is defined. So, if we have a smooth embedding $F:M\to N$ then the pushforward vector field is defined. It suffices to have a smooth injective map, too. An immersion might not work in general, however.
Now, for the first question, a criterion (found in Tu's Introduction to Manifolds) says that a vector field $X\in \mathfrak{X}(M)$ is $\mathscr{C}^\infty$ if and only if for any $f\in \mathscr{C}^\infty(M)$ the map $p\mapsto X_pf$ is a smooth function. For convenience, assume $F$ is an embedding. Now, for $g\in \mathscr{C}^\infty(N)$, fix $p=F(q)\in N$. Then $(F_*X)_p(g)=X_q(g\circ F)$, where we note that $g\circ F\in \mathscr{C}^\infty(M)$. So, the association $p\mapsto (F_*X)_p(g)$ is given by the composition $p\mapsto F^{-1}(p)=q\mapsto X_q(g\circ F)$. Since $X\in \mathfrak{X}(M)$ is smooth, the second map is smooth. Since $F$ admits a smooth inverse defined on $F(M)$, the first map is smooth. Hence, the pushforward of a smooth vector field (by an embedding) is again smooth.
Best Answer
It is still smooth. If $F\in \mathscr{C}^\infty(M,N)$, then fixing $p\in M$ and local coordinates $(x^1,\ldots, x^n)$ centred at $p$ on a trivializing neighborhood $U\subseteq M$ for $TM\to M$, and fixing analogous coordinates $(y^1,\ldots, y^m)$ centred at $F(p)$ on a neighborhood $V\subseteq N$ containing $F(U)$ and trivializing $TN\to N$, we can write down $F_*: TM\to TN$ in local coordinates as a map $TU\to TV$.
In local coordinates, $F$ is given by an $m$-tuple of smooth functions, $y^i=F_i(x^1,\ldots, x^n)$ for $1\le i \le m$. And given the local trivialization condition we can view $TU\cong U\times \mathbb{R}^n$ and $TV\cong V\times \mathbb{R}^m$. Then $F_*:U\times \mathbb{R}^n\to V\times \mathbb{R}^m$ is $F\times L$ where $$L|_{\{x\}\times \mathbb{R}^n}=L_x:\{x\}\times\mathbb{R}^n\to \{F(x)\}\times\mathbb{R}^m$$ is a linear transformation and the transformations $L_x$ vary smoothly according to the choice of $x\in U$. So, denoting the variable in $U$ by $x$ and the variable in $\mathbb{R}^n$ by $y$, $F_*$ can be viewed as a map $F_*(x,y)=(F(x),L_x(y))$. All the components are smooth, and hence so is $F_*$.
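For what it's worth, the maps $L_x$ can be written out explicitly. This is the standard Jacobian formula, stated in the notation above, with $y=(y^1,\ldots,y^n)$ denoting the fibre variable in $\mathbb{R}^n$ as in the last sentence (not the coordinates on $V$):
$$L_x(y)=\left(\sum_{j=1}^n \frac{\partial F_1}{\partial x^j}(x)\,y^j,\ \ldots,\ \sum_{j=1}^n \frac{\partial F_m}{\partial x^j}(x)\,y^j\right).$$
Each entry is linear in $y$ with coefficients that are smooth functions of $x$, which is the precise sense in which $L_x$ varies smoothly with $x$.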