As others have pointed out, your statement of Zorn's lemma is ambiguous, and the one reasonable interpretation in English is not what is wanted (and in fact makes the statement false). This might be the source of your confusion. However, to address your questions at face value:
A maximal element of $P$ is an element such that no other element is strictly greater than it. That is, $x$ is maximal if $p\not> x$ for every $p\in P$. Of course, because $P$ is not necessarily totally ordered, this is weaker than the notion of a maximum element, which demands that every other element be less than it.
Upper bounds require a bit more care to define: given a partially ordered set $P$ and a subset $S\subseteq P$, we say that $x\in P$ is an upper bound of $S$ if every element of $S$ is less than or equal to $x$. An upper bound is therefore more similar to a maximum element than to a maximal element. The important difference is that an upper bound need not belong to $S$ itself, only to $P$.
(A classic example where this distinction matters: a non-empty open set on the real line never has a maximum element, but it may well have upper bounds.)
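To make the three notions concrete, here is a small sketch on a finite poset. The choice of poset ($\{1,\dots,6\}$ ordered by divisibility) and the function names are my own, purely for illustration; nothing here needs Zorn's lemma, since everything is finite.

```python
def leq(a, b):
    """Divisibility order: a <= b means a divides b."""
    return b % a == 0

def maximal_elements(P):
    """Elements with nothing strictly above them."""
    return [x for x in P if not any(leq(x, p) and p != x for p in P)]

def maximum(P):
    """The element lying above every element of P, or None if there is none."""
    for x in P:
        if all(leq(p, x) for p in P):
            return x
    return None

def upper_bounds(S, P):
    """Elements of P lying above every element of S (they need not be in S)."""
    return [x for x in P if all(leq(s, x) for s in S)]

P = [1, 2, 3, 4, 5, 6]
print(maximal_elements(P))      # three maximal elements: 4, 5 and 6
print(maximum(P))               # no maximum element
print(upper_bounds([2, 3], P))  # the only upper bound of {2, 3} is 6, which is not in {2, 3}
```

So this poset has several maximal elements but no maximum, and the subset $\{2,3\}$ has an upper bound that lies outside it.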
I am not really aware of a book like that. There are a few books about the axiom of choice, but they mainly focus on other things, not on "typical applications of Zorn's lemma". These are "Axiom of Choice" by Herrlich and "The Axiom of Choice" by Jech, for example.
I would imagine that many books about rings, groups, modules, and other algebraic structures which are infinite, will have a handful of "typical applications of Zorn's lemma".
When I have to explain the "typical use" to my students, I actually prefer to point them to a different lemma: the Teichmüller–Tukey lemma.
Definition. We say that a family of sets $\cal F$ has finite character if the following holds: $A\in\cal F$ if and only if every finite $B\subseteq A$ satisfies $B\in\cal F$.
In other words, $\cal F$ has finite character if in order to verify that $A\in\cal F$, we only need to verify that its finite subsets are in $\cal F$.
Lemma. (Teichmüller–Tukey) If $\cal F$ is a non-empty family with finite character, then there is a $\subseteq$-maximal member of $\cal F$.
Note that it is often easy to verify that something has finite character. Take, for example, the linearly independent subsets of a vector space. If a set $A$ is not linearly independent, this is already witnessed by a finite subset of $A$. So if all finite subsets of $A$ are linearly independent, then so is $A$. Now, by the lemma, there is a maximal linearly independent set, and it is not hard to prove that such a maximal set is a basis.
In the typical use of Zorn's lemma we actually have a family with finite character. So when we take a chain of sets in the family, the union of these sets is an upper bound (or, in many cases, generates one). The reason is that if the union were not in the family, it would fail to satisfy the property we are interested in; by finite character, this failure is witnessed by finitely many elements of the union, each of which lies in some member of the chain. The largest of those finitely many members already contains all the witnesses, so it would fail the property as well, which is a contradiction.
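As a toy illustration of the linear independence example, here is a sketch in the finite space $\Bbb F_2^3$, with vectors encoded as 3-bit integers and vector addition as XOR. The encoding and the function names are assumptions of mine. In a finite space no choice principle is needed, since a greedy pass already produces a maximal independent set, but the code shows what "dependence is witnessed by a finite subset" means in practice.

```python
from functools import reduce
from itertools import combinations

def is_independent(vectors):
    """Distinct vectors over GF(2) are dependent iff some non-empty subset XORs to 0."""
    vs = list(vectors)
    for r in range(1, len(vs) + 1):
        for combo in combinations(vs, r):
            if reduce(lambda a, b: a ^ b, combo) == 0:
                return False  # this finite subset witnesses the dependence
    return True

def extend_to_maximal(start, space):
    """Greedily add vectors from `space` for as long as independence is preserved."""
    current = list(start)
    for v in space:
        if v not in current and is_independent(current + [v]):
            current.append(v)
    return current

space = range(8)                        # all of GF(2)^3, encoded as the integers 0..7
basis = extend_to_maximal([0b011], space)
print(basis)                            # a maximal independent set containing 0b011
```

The result is maximal: adding any further vector of the space breaks independence, which is exactly why it is a basis.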
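Think about being a subgroup, or a proper ideal, or a chain in a partial order, and so on. These properties either have finite character (such as being a chain) or are at least preserved under unions of chains (such as being a subgroup or a proper ideal), because a failure is always witnessed by finitely many elements. This is why Zorn's lemma and the Teichmüller–Tukey lemma are so useful in proving the existence of maximal sets with these properties.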
Best Answer
First of all, in $(0,1)\cup[2,3]$ the chain $(0,1)$ has a least upper bound, namely $2$. While $2$ is not the least upper bound of $(0,1)$ in $\Bbb R$, it is not $\Bbb R$ that we're talking about.
Secondly, it is indeed not true that if "every chain has an upper bound", then "every chain has a least upper bound". The simplest example would be $\Bbb Q\cap[0,1]$, where every chain has an upper bound, $1$, but if you look at the chain $\left\{q\in\Bbb Q\cap[0,1]\mathrel{}\middle|\mathrel{} q<\sqrt\frac12\right\}$, then it does not have a least upper bound.
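If you want to see why that chain has no least upper bound in $\Bbb Q\cap[0,1]$, here is a small sketch using exact rational arithmetic. The Newton-style step below is my own choice of construction (any method of finding a rational strictly between $\sqrt{\frac12}$ and $u$ would do): given a rational upper bound $u$, it returns a strictly smaller rational upper bound, so no upper bound can be the least one.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def is_upper_bound(u):
    """A rational u in [0,1] bounds {q : q < sqrt(1/2)} iff u > sqrt(1/2), i.e. u*u > 1/2."""
    return 0 <= u <= 1 and u * u > HALF

def smaller_upper_bound(u):
    """Average of u and (1/2)/u: by AM-GM it lies strictly between sqrt(1/2) and u."""
    return (u + HALF / u) / 2

u = Fraction(1)
for _ in range(4):
    v = smaller_upper_bound(u)
    print(v, is_upper_bound(v), v < u)  # each v is still an upper bound and is smaller than u
    u = v
```

Starting from $u=1$ the first two steps give $\frac34$ and $\frac{17}{24}$, both still upper bounds of the chain.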
The easiest way to prove this is to find an intermediate partial order where every chain has a least upper bound. Here are two suggestions:
Just prove the axiom of choice directly. Given a family of non-empty sets, consider all the choice functions from subfamilies, ordered by inclusion. It is not hard to show that if $C$ is a chain, then $\bigcup_{f\in C}f$ is a choice function, and it is the least upper bound of $C$.
Given a partial order $P$ where every chain has an upper bound, consider the partial order of all chains in $P$, ordered by inclusion. Again, since this is inclusion of chains, a chain-of-chains has a least upper bound: its union, just as in the previous example. Now we get a maximal chain in $P$; since by assumption it has an upper bound in $P$, it is easy to see that this upper bound is a maximal element, thus proving Zorn's lemma.
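The last step (an upper bound of a maximal chain is a maximal element) can be checked by hand on a finite example. The sketch below uses $\{1,\dots,6\}$ ordered by divisibility, which is my own choice of poset, and builds a maximal chain greedily; in the infinite case this greedy pass is exactly what has to be replaced by the argument above.

```python
def leq(a, b):
    """Divisibility order: a <= b means a divides b."""
    return b % a == 0

def is_chain(C):
    """Every two elements of C are comparable."""
    return all(leq(a, b) or leq(b, a) for a in C for b in C)

def maximal_chain(P):
    """One greedy pass: keep x whenever the result is still a chain."""
    chain = []
    for x in P:
        if is_chain(chain + [x]):
            chain.append(x)
    return chain

def upper_bounds(S, P):
    return [x for x in P if all(leq(s, x) for s in S)]

def is_maximal(x, P):
    return not any(leq(x, p) and p != x for p in P)

P = [1, 2, 3, 4, 5, 6]
C = maximal_chain(P)
bounds = upper_bounds(C, P)
print(C, bounds, [is_maximal(b, P) for b in bounds])
```

If an upper bound $b$ of the maximal chain $C$ were not maximal, some $p>b$ could be added to $C$ to give a strictly larger chain, contradicting the maximality of $C$.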