As noted in the comments, your description of $\mathcal{F}^a$ is not quite correct. First, it should use the disjoint union of the stalks $\mathcal{F}_x$, not their direct sum. Second, not all sections of $\pi$ are allowed, only those that satisfy the following condition:
$$ (1) \quad \forall x\in U, \exists V\ni x \text{ a neighborhood of $x$ in $U$ and }t\in\mathcal{F}(V) \text{ such that } \forall y\in V, s(y)=t_y $$
Otherwise you have too many sections. For example, if $\mathcal{F}=\mathcal{C}$ is the sheaf of continuous functions on a space $X$, then with your definition a section of $\mathcal{F}^a$ would consist of the choice of a germ of a continuous function at every point, with no condition that these germs glue (and their values might define a function which is not continuous).
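To make this concrete, here is one possible example (the choice $X=\mathbb{R}$ and the section $s$ below are illustrative assumptions, not taken from the question). Writing $c_x$ for the constant function on $\mathbb{R}$ with value $0$ if $x<0$ and $1$ if $x\ge 0$, set
$$ s(x) = (c_x)_x \in \mathcal{C}_x \quad \text{for all } x\in\mathbb{R}. $$
This $s$ is a section of $\pi$, but it fails condition $(1)$ at $x=0$: if some $t\in\mathcal{C}(V)$ with $0\in V$ satisfied $t_y=s(y)$ for all $y\in V$, then evaluating germs at their base points would give
$$ t(y)=\begin{cases}0 & \text{if } y<0,\\ 1 & \text{if } y\ge 0,\end{cases}\qquad y\in V, $$
which is not continuous at $0$.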
Hence the correct definition is the following:
$$\mathcal{F}^a(U)=\{s:U\rightarrow\coprod_{x\in U}\mathcal{F}_x | s \text{ is a section of $\pi$ and satisfies condition $(1)$}\}$$
Now it is easy to see that $\mathcal{F}^a_x=\mathcal{F}_x$. Indeed, if $s_x$ is a germ in $\mathcal{F}^a_x$, then you can find a representative $(U,s)$ where $s\in\mathcal{F}^a(U)$. By condition $(1)$, there exist a neighborhood $V$ of $x$ in $U$ and $t\in\mathcal{F}(V)$ such that $\forall y\in V, s(y)=t_y$. This says that $s|_V$ is exactly the section $y\mapsto t_y$ induced by $t$, so $(U,s)$ and $(V,t)$ define the same germ at $x$. So $s_x$ corresponds to the germ $t_x=s(x)\in\mathcal{F}_x$.
To be perfectly rigorous, check that what I just described is a well-defined map $\mathcal{F}^a_x\rightarrow\mathcal{F}_x$ which is the inverse of the obvious map $\mathcal{F}_x\rightarrow\mathcal{F}^a_x$.
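As a sketch of that check (the names $\varphi$ and $\psi$ are introduced here only for illustration), the two maps can be written as
$$ \varphi:\mathcal{F}^a_x\rightarrow\mathcal{F}_x,\quad [(U,s)]\mapsto s(x), \qquad\qquad \psi:\mathcal{F}_x\rightarrow\mathcal{F}^a_x,\quad t_x\mapsto [(V,\ y\mapsto t_y)], $$
where $(V,t)$ is any representative of $t_x$. Both are well defined, because two representatives of the same germ agree on some neighborhood $W$ of $x$: for $\varphi$ this gives the same value at $x$, and for $\psi$ it gives $t_y=t'_y$ for all $y\in W$. Then
$$ \varphi(\psi(t_x)) = t_x, \qquad \psi(\varphi([(U,s)])) = \psi(t_x) = [(V,\ y\mapsto t_y)] = [(V,s|_V)] = [(U,s)], $$
where in the second computation $(V,t)$ is the pair given by condition $(1)$ applied to $s$ at $x$, so that $s(x)=t_x$.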
For the first question: yes, the morphism of presheaves $i$ is injective, as its kernel is zero.
For the second question: I believe that "natural" here means canonical, since it comes out of the universal property of sheafification. I realise that this is confusing since "natural" has a specific meaning regarding compatibility with functors in category theory.
Best Answer
For (1), you want the "sheafification" to have the same stalks as $\mathcal F$, so if we allow $s(p)$ to be something outside $\mathcal F_p$, we'd get "too many" stalks.
For (2):
Take a space $X$. Define the presheaf $F$ by letting $F(U)$, for each open $U\subset X$, be the set of bounded functions $f:U\to\mathbb R$. Clearly, if $V\subset U$ and $f\in F(U)$, then $f|_V$ is a bounded function on $V$, so this is a presheaf.
But it is not a sheaf, because we cannot stitch an arbitrary number of bounded functions together to get a bounded function.
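For instance (taking $X=\mathbb{R}$ purely as an illustrative assumption), cover $\mathbb{R}$ by the open sets $U_n=(-n,n)$ for $n\ge 1$ and consider
$$ f_n\in F(U_n),\qquad f_n(x)=x,\qquad |f_n|<n \text{ on } U_n. $$
These sections agree on overlaps, since $f_n|_{U_n\cap U_m}=f_m|_{U_n\cap U_m}$ for all $n,m$, but the only function on $\mathbb{R}$ restricting to every $f_n$ is
$$ f(x)=x,\qquad f\notin F(\mathbb{R}), $$
because $f$ is unbounded. So the gluing axiom fails for $F$.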
The sheaf you get when you "sheafify" this presheaf is the sheaf of all locally bounded functions $f$, that is, functions that are bounded on some neighborhood of each point. This is generally what "sheafification" does: the resulting objects are those which "locally" have the properties of the presheaf.
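Perhaps a simpler example: Let $F(U)$ be a singleton if the closure of $U$ is compact, and empty if not. Then the sheafification of $F$ gives a singleton at $U$ precisely when every point of $U$ has a neighborhood in $U$ whose closure is compact, which for reasonable (e.g. Hausdorff) spaces is what it means for $U$ to be locally compact.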
Indeed, I suspect almost any time you refer to something as "locally $P$," for some property $P$, you are referencing a sheafification of the original property, $P$. (For example, the other answer gives you the idea of a function being constant, and a function being "locally constant.")