The prime number theorem in its usual form is fairly intuitive from the tables of data that prompted Gauss, in 1792 or 1793, to speculate that the density of primes near $x$ is $1/\log x.$ Unfortunately, very little is obvious beyond this. Proofs of the PNT often use the version $\psi(x)\sim x$, so intuition about proofs of the PNT requires some familiarity with that version, which can be shown to be equivalent to the usual $\pi(x)\sim x/\log x.$ See Apostol [2], p. 79.
Your question (in bold) is: what does the equivalence $\sum \mu(n)/n=0\leftrightarrow PNT$ look like? I mentioned in a comment that it depends on the result that $\sum_{n\leq x} \mu(n)=o(x) \leftrightarrow PNT.$ Because this latter result is easy to find (see, for example, Apostol), below is only a sketch of the last step from Ayoub [1], showing that $\sum_{n\leq x} \mu(n)/n= o(1) \leftrightarrow \sum_{n\leq x} \mu(n)=o(x)$ (and hence $\sum_{n\leq x} \mu(n)/n=o(1)\leftrightarrow PNT$).
The general theorem below is a prerequisite for the proof and I think Apostol and others more or less also prove it so the proof is omitted.
Theorem. If $a(x)$ is defined for integral $x$, if $B(x)$ is of bounded variation in every finite interval, if $\sum_{n \leq x}a(n)=o(x)$, if $B(x) =O(1)$, and if $\sum_{n\leq x}|a(n)|=O(x)$, then
$$\sum_{n\leq x}a(n)B\left(\frac{x}{n}\right)=o(x) $$ [proof omitted].
Main result. Let $M(x) = \sum_{n\leq x}\mu(n), L(x) =\sum_{n\leq x}\frac{\mu(n)}{n}.$ Then $M(x) =o(x)$ if and only if $L(x)=o(1).$
First assume $M(x)=o(x).$ In the theorem above put $a(n)=\mu(n)$ and $B(x)=x-[x]$ (note that $B(x)=O(1)$ and $\sum_{n\leq x}|\mu(n)|\leq x$). Then
$$\sum_{n\leq x }\mu(n)\left( \frac{x}{n}-\left[\frac{x}{n}\right]\right)=o(x). $$
But $\sum_{n\leq x}\mu(n)\left[ \frac{x}{n}\right]=1$ (see Apostol, etc.) and so
$$\sum_{n\leq x}\mu(n)\frac{x}{n}=o(x),$$ and dividing by $x$ gives $L(x)=o(1).$
Now assume $L(x) = o(1).$ Then $M(x)=\sum_{n\leq x}\mu(n)=\sum_{n\leq x} \frac{\mu(n)}{n}\cdot n$
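As a numerical sanity check (not part of Ayoub's argument), here is a small Python sketch that sieves $\mu(n)$, confirms the identity $\sum_{n\leq x}\mu(n)\left[\frac{x}{n}\right]=1$ for small integer $x$, and prints $M(x)/x$ next to $L(x)$ so one can watch both shrink. The names `mobius_sieve`, `floor_identity`, `M` and `L` are my own choices for illustration.

```python
def mobius_sieve(limit):
    """Return a list mu with mu[n] = Moebius function of n for 1 <= n <= limit."""
    mu = [1] * (limit + 1)
    mu[0] = 0  # index 0 is unused
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if not is_prime[p]:
            continue
        for m in range(2 * p, limit + 1, p):
            is_prime[m] = False
        for m in range(p, limit + 1, p):
            mu[m] = -mu[m]  # one more prime factor flips the sign
        for m in range(p * p, limit + 1, p * p):
            mu[m] = 0       # divisible by a square
    return mu


def floor_identity(x, mu):
    """sum_{n <= x} mu(n) * floor(x / n); expected to equal 1 for integer x >= 1."""
    return sum(mu[n] * (x // n) for n in range(1, x + 1))


def M(x, mu):
    """Mertens function M(x) = sum_{n <= x} mu(n)."""
    return sum(mu[1:x + 1])


def L(x, mu):
    """L(x) = sum_{n <= x} mu(n) / n, in floating point."""
    return sum(mu[n] / n for n in range(1, x + 1))


if __name__ == "__main__":
    mu = mobius_sieve(10**5)
    assert all(floor_identity(x, mu) == 1 for x in range(1, 500))
    for x in (10, 10**2, 10**3, 10**4, 10**5):
        print(x, M(x, mu) / x, L(x, mu))
```

Of course a finite table proves nothing about $o(x)$ or $o(1)$; it only makes the two statements of the main result concrete.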
$$= xL(x)-\int_1^x L(t)~ dt = o(x) + o(x) =o(x). $$
The last line is a consequence of the following lemma.
Lemma. Let $x\geq 1$ and let $\phi(x)$ have a continuous derivative for $x\geq 1.$ Let $S(x)=\sum_{n\leq x}C(n)$ with $C(n)$ real or complex numbers. Then
$$\sum_{n\leq x}C(n)\phi(n)=S(x)\phi(x)-\int_1^x S(t)\phi'(t)dt. $$
If we put $C(n)=\mu(n)/n$ and $\phi(x)=x$, so that $S(x)=L(x)$ and $\phi'(x)=1$, then the displayed formula for $M(x)$ above follows. This is the direction of the proof that is included in Apostol's text [2], so you can get the non-trivial details there on page 97.
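The lemma with $C(n)=\mu(n)/n$ and $\phi(x)=x$ can be checked exactly for integer $x=N$. Since $L(t)$ is constant on each interval $[k,k+1)$, the integral $\int_1^N L(t)\,dt$ equals $\sum_{k=1}^{N-1}L(k)$. The sketch below uses exact rational arithmetic; `mobius` and `abel_check` are hypothetical helper names of mine.

```python
from fractions import Fraction


def mobius(n):
    """Moebius function of n >= 1 by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # p^2 divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result  # one remaining prime factor
    return result


def abel_check(N):
    """Return (M(N), N*L(N) - integral_1^N L(t) dt) as exact numbers."""
    M = 0
    L = Fraction(0)
    integral = Fraction(0)  # accumulates L(1) + ... + L(N-1)
    for k in range(1, N + 1):
        m = mobius(k)
        M += m
        L += Fraction(m, k)
        if k < N:
            integral += L
    return M, N * L - integral


if __name__ == "__main__":
    for N in (1, 2, 10, 100, 1000):
        lhs, rhs = abel_check(N)
        print(N, lhs, rhs, lhs == rhs)
```

Both sides agree exactly, which is just the partial summation identity; the analytic content is in the estimate $\int_1^x L(t)\,dt=o(x)$.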
[1] Ayoub, An Introduction to the Analytic Theory of Numbers, AMS 1963.
[2] Apostol, Introduction to Analytic Number Theory, Springer 2000.
Best Answer
Re 1., the absolute convergence is used to be able to write the product of the sums as the sum of products of terms. If you have two sequences $(a_n),\, (b_k)$ of complex numbers, and the series $A = \sum_{n=0}^\infty a_n$ and $B = \sum_{k=0}^\infty b_k$ are convergent, can you write $A\cdot B$ as a series using the $a_n$ and $b_k$? Since there is no natural ordering on the index set $\mathbb{N}\times\mathbb{N}$ of the set of all products $a_nb_k$, it is at least difficult to express $A\cdot B$ as a series of terms built from the individual products if the convergence is only conditional. But if the convergence is absolute, the order of summation doesn't matter; all orderings give the same sum. Here, with $a_n = (p^n)^{-s}$ and $b_k = (q^k)^{-s}$ for distinct primes $p,\,q$, besides the usual Cauchy product of the series, we have a somewhat natural ordering by the products of the bases $p^nq^k$.
Re 2., the result is used in step $\mathbf{IV}$, though a quick glance seems to indicate that a weaker result, namely that $\zeta(s) - \frac{1}{s-1}$ extends holomorphically to $\Re s > 1-\varepsilon$ for some $\varepsilon > 0$, would suffice. But since the proof immediately yields the extension to $\Re s > 0$, why state something weaker?
Re 3., from the inequality
$$e^{\vartheta(2n) - \vartheta(n)} \leqslant 2^{2n},$$
one obtains
$$\vartheta(2n) - \vartheta(n) \leqslant \log 2\cdot (2n).\tag{1}$$
The inequality aimed for is
$$\vartheta(x) - \vartheta(x/2) \leqslant C\cdot x\tag{2}$$
for all $x > 0$. For $x$ of the particular form $x = 2n,\; n \in \mathbb{Z}^+$, the inequality is just $(1)$. For general $x$, choose $n = \lfloor x/2\rfloor$ to obtain
$$\begin{align} \vartheta(x) - \vartheta(x/2) &= \vartheta(2n) - \vartheta(n) + \left(\vartheta(x) - \vartheta(2n)\right) + \left(\vartheta(n) - \vartheta(x/2)\right)\\ &\leqslant \log 2\cdot (2n) + O(\log x) + O(\log x)\\ &\leqslant C\cdot x \end{align}$$
absorbing the possible logarithmic contributions into the constant for $x \geqslant x_0$. The $x_0$ depends on the constant $C$: you must have $\log x \leqslant (C - \log 2)x$ for $x \geqslant x_0$ [because $\log x$ is the maximal possible value of $\left(\vartheta(x) -\vartheta(2n)\right) - \left(\vartheta(x/2) - \vartheta(n)\right)$].
Having that, one writes
$$\vartheta(x) = \left(\vartheta(x) - \vartheta(x/2)\right) + \left(\vartheta(x/2) - \vartheta(x/4)\right) + \dotsb + \left(\vartheta(x/2^r) - \vartheta(x/2^{r+1})\right) + \vartheta(x/2^{r+1})$$
where $r$ is chosen with $x/2^{r+1} < x_0 \leqslant x/2^r$ (for $x \geqslant x_0$), and each difference is bounded by $C\cdot \frac{x}{2^k}$, while the last term is bounded by $\vartheta(x_0)$, so you get
$$\vartheta(x) \leqslant \sum_{k=0}^r C\cdot \frac{x}{2^k} + \vartheta(x_0) \leqslant 2Cx + \vartheta(x_0).$$
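For a feel of the numbers, the following Python sketch tabulates $\vartheta$ on the integers, checks inequality $(1)$ for a range of $n$, and reports the largest value of $\vartheta(x)/x$ seen. It is only an illustration under my own naming (`theta_table`, `check_doubling`, `max_ratio`), not a replacement for the argument.

```python
import math


def theta_table(limit):
    """theta[x] = sum of log p over primes p <= x, for integers 0 <= x <= limit."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            for m in range(p * p, limit + 1, p):
                is_prime[m] = False
    theta = [0.0] * (limit + 1)
    for x in range(1, limit + 1):
        theta[x] = theta[x - 1] + (math.log(x) if is_prime[x] else 0.0)
    return theta


def check_doubling(theta, n_max):
    """True if theta(2n) - theta(n) <= 2n * log 2 for all 1 <= n <= n_max."""
    return all(theta[2 * n] - theta[n] <= 2 * n * math.log(2)
               for n in range(1, n_max + 1))


def max_ratio(theta):
    """Largest value of theta(x) / x over the tabulated integers x >= 1."""
    return max(theta[x] / x for x in range(1, len(theta)))


if __name__ == "__main__":
    theta = theta_table(200_000)
    print("inequality (1) holds:", check_doubling(theta, 100_000))
    print("max theta(x)/x:", max_ratio(theta))
```

The printed maximum stays well below $2\log 2$, consistent with the bound $\vartheta(x)\leqslant 2Cx+\vartheta(x_0)$ being far from sharp.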
Explanation of part $(\mathbf{IV})$:
From the Euler product, by termwise logarithmic differentiation for $\Re s > 1$, one obtains
$$-\frac{\zeta'(s)}{\zeta(s)} = \sum_{p} \frac{\log p}{p^s-1}.$$
Termwise differentiation is legitimate because of the locally uniform convergence of the product, just as for series. Then, by $\frac{1}{p^s-1} = \frac{1}{p^s} + \frac{1}{p^s(p^s-1)}$, and splitting the sum, one obtains
$$-\frac{\zeta'(s)}{\zeta(s)} = \Phi(s) + \underbrace{\sum_p \frac{\log p}{p^s(p^s-1)}}_{\alpha(s)}.$$
Rearranging the equation, we get
$$\Phi(s) = -\frac{\zeta'(s)}{\zeta(s)} - \alpha(s),\tag{3}$$
first for $\Re s > 1$. But by $(\mathbf{II})$, $\zeta(s)$ extends meromorphically to $\Re s > 0$, with a simple pole at $s = 1$ and no other pole. Therefore $-\zeta'(s)/\zeta(s)$ is meromorphic on $\Re s > 0$, with a simple pole at $s = 1$, and simple poles at the zeros of $\zeta$ in $\Re s > 0$.
The sum
$$\sum_p \frac{\log p}{p^s(p^s-1)}$$
converges (locally uniformly) for $\Re s > 1/2$, since
$$\left\lvert \frac{\log p}{p^s(p^s-1)}\right\rvert \leqslant \frac{\log p}{\lvert p^s\rvert (\lvert p^s\rvert -1)} \leqslant \frac{2\log p}{\lvert p^s\rvert^2} = \frac{2\log p}{p^{2\Re s}}$$
for $p$ large enough so that $\lvert p^s\rvert - 1 \geqslant \lvert p^s\rvert/2$ (for $\Re s > 1/2$, $p \geqslant 4$ suffices), and it is well known that $\sum \frac{\log n}{n^t}$ converges for $t > 1$.
So on the right hand side of $(3)$, we have a function that is meromorphic on $\Re s > 1/2$, hence we can extend the left hand side, $\Phi$, meromorphically to the half plane $\Re s > 1/2$ by defining it as the right hand side for $1/2 < \Re s \leqslant 1$. Since $\alpha(s)$ has no poles there, the poles of $\Phi$ in that half plane are exactly the poles of $-\zeta'(s)/\zeta(s)$.
When a meromorphic function $f$ has a simple pole in $z_0$, i.e. it has a representation
$$f(z) = \frac{a}{z-z_0} + g(z)$$
in a punctured neighbourhood of $z_0$ with a holomorphic $g$, then we have $\lim\limits_{z\to z_0} (z-z_0)f(z) = a$ (the residue of $f$ in $z_0$). When a meromorphic function $f$ has a zero of order $k$ in $z_0$, then its logarithmic derivative $f'/f$ has a simple pole with residue $k$ in $z_0$:
$$\begin{align} f(z) &= (z-z_0)^k\cdot g(z),\; g(z_0) \neq 0\\ \rightsquigarrow \frac{f'(z)}{f(z)} &= \frac{k(z-z_0)^{k-1}g(z) + (z-z_0)^kg'(z)}{(z-z_0)^kg(z)} = \frac{k}{z-z_0} + \frac{g'(z)}{g(z)}. \end{align}$$
Similarly, if $f$ has a pole of order $k$ in $z_0$, then the logarithmic derivative has a simple pole with residue $-k$ in $z_0$.
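To spell out the pole case in the same way (a routine computation, written here under the assumption $f(z) = (z-z_0)^{-k}\cdot g(z)$ with $g$ holomorphic near $z_0$ and $g(z_0)\neq 0$):
$$\frac{f'(z)}{f(z)} = \frac{-k(z-z_0)^{-k-1}g(z) + (z-z_0)^{-k}g'(z)}{(z-z_0)^{-k}g(z)} = \frac{-k}{z-z_0} + \frac{g'(z)}{g(z)}.$$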
For uniformity of expression, we say that $f$ has a zero (or pole) of order $0$ in $z_0$ if $f$ is holomorphic in $z_0$ and $f(z_0) \neq 0$, and by abuse of language, that $f$ has a simple pole with residue $0$ in points of holomorphy.
Since in $(3)$ we have the negative of the logarithmic derivative of $\zeta(s)$, the signs flip, and the simple pole of $\zeta$ in $s = 1$ leads to a simple pole with residue $1$ of $\Phi$ in $s = 1$, thus $\lim\limits_{\varepsilon \to 0} \varepsilon \Phi(1+\varepsilon) = 1$. The convergence need not be restricted to approaching $1$ via positive $\varepsilon$ per se, but in the following inequality, $\varepsilon > 0$ is needed. The other two limits hold for the same reason: a zero of order $\mu$ resp. $\nu$ of $\zeta(s)$ (at $1 \pm i\alpha$ resp. $1 \pm 2i\alpha$) produces a simple pole with residue $-\mu$ resp. $-\nu$ for the negative logarithmic derivative. And finally, the symmetry $\zeta(\overline{s}) = \overline{\zeta(s)}$ means that if $\zeta(s)$ has a zero of order $\mu$ in $1 + i\alpha,\; \alpha \in \mathbb{R}\setminus\{0\}$, then it also has a zero of order $\mu$ in $1 - i\alpha$.
By explicit summation, one obtains the equality
$$\sum_{r=-2}^2 \binom{4}{2+r} \Phi(1+\varepsilon + ir\alpha) = \sum_p \frac{\log p}{p^{1+\varepsilon}}\left(p^{i\alpha/2} + p^{-i\alpha/2}\right)^4,$$
and each term in the sum on the right is nonnegative, hence the sum is $\geqslant 0$. Multiplying with $\varepsilon > 0$ and taking the limit $\varepsilon \searrow 0$, one obtains
$$0 \leqslant \lim_{\varepsilon\searrow 0} \sum_{r=-2}^2 \binom{4}{2+r} \varepsilon\Phi(1+\varepsilon + ir\alpha) = 6 - 8\mu - 2\nu,$$
and since $\mu,\, \nu \geqslant 0$ because $s = 1$ is the only pole of $\zeta(s)$, that inequality can only hold if $\mu = 0$, so $\zeta(1 + i\alpha) \neq 0$.
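The two elementary facts used here can be checked numerically: per prime, the binomially weighted sum of the factors $p^{-ir\alpha}$ equals the fourth power $\left(2\cos(\tfrac{\alpha}{2}\log p)\right)^4 \geqslant 0$, and $6 - 8\mu - 2\nu \geqslant 0$ with nonnegative integers forces $\mu = 0$. A minimal Python sketch, with function names of my own choosing:

```python
import cmath
import math


def weighted_sum(p, alpha):
    """sum_{r=-2}^{2} C(4, 2+r) * p^(-i r alpha): the per-prime factor on the left."""
    t = alpha * math.log(p)
    return sum(math.comb(4, 2 + r) * cmath.exp(-1j * r * t) for r in range(-2, 3))


def fourth_power(p, alpha):
    """(p^(i alpha/2) + p^(-i alpha/2))^4 = (2 cos(alpha log p / 2))^4, a real number >= 0."""
    t = alpha * math.log(p)
    return (2 * math.cos(t / 2)) ** 4


def admissible_orders(bound=5):
    """All (mu, nu) with 0 <= mu, nu < bound satisfying 6 - 8*mu - 2*nu >= 0."""
    return [(mu, nu) for mu in range(bound) for nu in range(bound)
            if 6 - 8 * mu - 2 * nu >= 0]


if __name__ == "__main__":
    for p in (2, 3, 5, 7, 11):
        for alpha in (0.5, 1.0, 14.134725):
            print(p, alpha, abs(weighted_sum(p, alpha) - fourth_power(p, alpha)))
    print(admissible_orders())
```

Every admissible pair has $\mu = 0$, which is exactly the conclusion $\zeta(1+i\alpha)\neq 0$.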
Part $(\mathbf{VI})$:
You're right, the $xx$ is a typo and should be $x$. Since certainly $\pi(y) \leqslant y$, in the last line we can replace $-\pi(x^{1-\varepsilon})$ with $O(x^{1-\varepsilon})$ to obtain
$$\vartheta(x) \geqslant (1-\varepsilon)\log x \left(\pi(x) + O(x^{1-\varepsilon})\right).$$
We can't take the limit $\varepsilon \searrow 0$ in that inequality, since in the limit we would lose the guarantee that the $O(x^{1-\varepsilon})$ term grows slower than $\frac{x}{\log x}$.
Together with $\vartheta(x) \sim x$, the inequality $\vartheta(x) \leqslant \pi(x)\log x$ yields
$$\liminf_{x\to\infty} \frac{\pi(x)\log x}{x} \geqslant \lim_{x\to\infty}\frac{\vartheta(x)}{x} = 1.$$
The other inequality, $\vartheta(x) \geqslant (1-\varepsilon)\log x\left(\pi(x) + O(x^{1-\varepsilon})\right)$, yields
$$\frac{\pi(x)\log x}{x} \leqslant \frac{\vartheta(x)}{(1-\varepsilon)x} + O\left(\frac{\log x}{x^\varepsilon}\right)$$
and, since $\lim_{x\to\infty}\frac{\log x}{x^\varepsilon} = 0$ for all $\varepsilon > 0$,
$$\limsup_{x\to\infty} \frac{\pi(x)\log x}{x} \leqslant \frac{1}{1-\varepsilon}\lim_{x\to\infty} \frac{\vartheta(x)}{x} + \lim_{x\to\infty}O\left(\frac{\log x}{x^\varepsilon}\right) = \frac{1}{1-\varepsilon}.$$
Now we take the limit $\varepsilon \searrow 0$ to conclude
$$\limsup_{x\to\infty} \frac{\pi(x)\log x}{x} \leqslant 1.$$
Personally, I prefer leaving the $\varepsilon$ out, and arguing
$$\begin{align} \pi(x) &= \pi(y) + \sum_{y < p \leqslant x} 1\\ &\leqslant \pi(y) + \sum_{y < p \leqslant x}\frac{\log p}{\log y}\\ &\leqslant y + \frac{1}{\log y}\vartheta(x)\\ \Rightarrow \frac{\pi(x)\log x}{x} &\leqslant \frac{y\log x}{x} + \frac{\log x}{\log y}\cdot \frac{\vartheta(x)}{x} \end{align}$$
for $1 < y < x$, and then choosing (for $x \geqslant 3$) $y = \dfrac{x}{(\log x)^2}$ to get
$$\frac{\vartheta(x)}{x} \leqslant \frac{\pi(x)\log x}{x} \leqslant \frac{1}{\log x} + \frac{\log x}{\log x - 2 \log \log x}\cdot \frac{\vartheta(x)}{x}.$$
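To see how tight this sandwich is in practice, here is a short Python sketch that tabulates $\pi(x)$ and $\vartheta(x)$ with a sieve and evaluates the three expressions of the last display for integer $x \geqslant 3$. The helper names `prime_tables` and `sandwich` are assumptions of mine.

```python
import math


def prime_tables(limit):
    """Return (pi, theta): pi[x] = #{p <= x}, theta[x] = sum_{p <= x} log p."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            for m in range(p * p, limit + 1, p):
                is_prime[m] = False
    pi = [0] * (limit + 1)
    theta = [0.0] * (limit + 1)
    for x in range(1, limit + 1):
        pi[x] = pi[x - 1] + (1 if is_prime[x] else 0)
        theta[x] = theta[x - 1] + (math.log(x) if is_prime[x] else 0.0)
    return pi, theta


def sandwich(x, pi, theta):
    """Return (lower, middle, upper) of the final inequality for integer x >= 3."""
    lx = math.log(x)
    lower = theta[x] / x
    middle = pi[x] * lx / x
    upper = 1 / lx + lx / (lx - 2 * math.log(lx)) * theta[x] / x
    return lower, middle, upper


if __name__ == "__main__":
    pi, theta = prime_tables(10**6)
    for x in (10, 10**2, 10**3, 10**4, 10**5, 10**6):
        print(x, sandwich(x, pi, theta))
```

The upper bound approaches the lower one only slowly, because the factor $\frac{\log x}{\log x - 2\log\log x}$ tends to $1$ very slowly.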