It is enough to prove that localization preserves short exact sequences $\;0\to M\to N\to P\to 0$. As the tensor product is right-exact, and $S^{-1}M\simeq M\otimes_A S^{-1}A$, it is even enough to prove that it preserves injectivity.
So consider an injective morphism $\varphi\colon M\to N$ and suppose $\;(S^{-1}\varphi)\Bigl(\dfrac ms\Bigr)=\dfrac{\varphi(m)}s=0$ in $S^{-1}N$. This means there exists $t\in S$ such that $\;t\mkern1mu\varphi(m)=\varphi(tm)=0$. Since $\varphi$ is injective, $tm=0$ in $M$. But then
$$\frac ms=\frac{tm}{ts}=\frac0{ts}=0,$$
which shows $\;S^{-1}\varphi\;$ is injective.
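As a quick sanity check (an illustration of my own choosing, not taken from the question): take $A=\mathbb Z$, $S=\mathbb Z\setminus\{0\}$, so that $S^{-1}\mathbb Z=\mathbb Q$, and let $\varphi\colon\mathbb Z\to\mathbb Z$ be multiplication by $2$, which is injective. Then
$$S^{-1}\varphi\colon\mathbb Q\to\mathbb Q,\qquad \frac ms\mapsto\frac{2m}s,$$
is again injective, as the argument predicts. Compare this with tensoring by a module that is not a localization, say $\mathbb Z/2\mathbb Z$:
$$\varphi\otimes\operatorname{Id}\colon\mathbb Z/2\mathbb Z\to\mathbb Z/2\mathbb Z,\qquad \bar m\mapsto\overline{2m}=0,$$
which is the zero map. So injectivity really does need an argument specific to $S^{-1}A$; right-exactness of the tensor product alone does not give it.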
To address questions 1-3: in a way, there are two concepts of a direct sum, and some books make a clear distinction between internal direct sums and external direct sums.
If you have two submodules of an "ambient" module, $M,N\subseteq W$, then you can form their sum as a new submodule $M+N=\{w=m+n\mid m\in M,n\in N\}\subseteq W$. If each element of the sum, $w\in M+N$, can be represented uniquely in this way, then we say that this sum is a direct sum, and thus we have an internal direct sum $M\oplus N$ of the submodules $M$ and $N$. From what you wrote above, this seems to be Axler's definition. From this point of view, a statement such as $U\oplus W=\mathbb{F}^3$ makes perfect sense: elements of $U$ and $W$ are actually elements of the same module, so they can be added together and their sum compared with the entire $\mathbb{F}^3$.
The construction used in the other two books, where $M\oplus N=\{(m,n)\mid m\in M,n\in N\}$, is that of an external direct sum $M\oplus N$ of the modules $M$ and $N$. Note that here we don't care where $M$ and $N$ come from. For all we know, they can (and often do) consist of elements of a different "nature". This construction creates a new kind of element, the pair $(m,n)$, which is neither in $M$ nor in $N$.
Now, for two submodules $M,N\subseteq W$, assuming $M\cap N=\{0\}$, we can create both their internal and their external direct sum. But they are isomorphic, so people tend to abuse the language and the notation a little bit and not distinguish them from each other. Or in other words, it's usually clear from the context which type of direct sum has been constructed or is being discussed, so authors don't care to make the distinction.
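To spell out that isomorphism (a standard argument, sketched here; the name $\Phi$ is just my notation): send a pair in the external direct sum to the corresponding sum in $W$,
$$\Phi\colon\{(m,n)\mid m\in M,n\in N\}\to M+N,\qquad \Phi(m,n)=m+n.$$
This map is linear, and it is surjective by the very definition of $M+N$. Its kernel is
$$\operatorname{Ker}(\Phi)=\{(m,-m)\mid m\in M\cap N\},$$
which is $\{0\}$ exactly when $M\cap N=\{0\}$. The same computation shows why the condition $M\cap N=\{0\}$ is equivalent to uniqueness of the representation: if $m+n=m'+n'$, then
$$m-m'=n'-n\in M\cap N,$$
so both differences vanish precisely when the intersection is trivial.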
Question 4. Pick an arbitrary $b\in B$, and let $x=b-h(g(b))$. Then
$$g(x)=g[b-h(g(b))]=g(b)-[\underbrace{g\circ h}_{\operatorname{Id}_C}\circ g](b)=g(b)-g(b)=0,$$
i.e. $x\in\operatorname{Ker}(g)=\operatorname{Im}(f)$, and
$$b=x+h(g(b))\in\operatorname{Im}(f)+\operatorname{Im}(h).$$
That's why they generate all of $B$.
A similar trick should work for Question 5 too.
Exactness of localization can indeed make the problem simpler. The inclusions $B_i \subseteq \sum B_i$ localize to give $S^{-1}B_i \subseteq S^{-1}(\sum B_i)$ for every $i$, so $\sum S^{-1}B_i \subseteq S^{-1}(\sum B_i)$. Conversely, an element of $S^{-1}(\sum B_i)$ is of the form $\frac{b_1 + \cdots + b_n}{s}$, where $s\in S$ and each $b_j$ lies in some $B_{i_j}$. But $\frac{b_1 + \cdots + b_n}{s} = \frac{b_1}{s} + \cdots + \frac{b_n}{s} \in \sum S^{-1}B_i$, so equality holds.
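For a concrete instance (an example chosen for illustration, not part of the original problem): let $A=\mathbb Z$, $S=\{1,3,9,\dots\}$, so $S^{-1}\mathbb Z=\mathbb Z[\tfrac13]$, and take $B_1=4\mathbb Z$, $B_2=6\mathbb Z$. Then $B_1+B_2=2\mathbb Z$, and since $3$ is a unit in $\mathbb Z[\tfrac13]$,
$$S^{-1}B_1+S^{-1}B_2=4\,\mathbb Z[\tfrac13]+6\,\mathbb Z[\tfrac13]=4\,\mathbb Z[\tfrac13]+2\,\mathbb Z[\tfrac13]=2\,\mathbb Z[\tfrac13]=S^{-1}(B_1+B_2),$$
in agreement with the general statement.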