You seem to assume that the implied constants are completely independent of the choice of Littlewood-Paley (LP) partition of unity. They are not: if you look at the proof of the standard Littlewood-Paley inequality, you'll see that the estimates for the vector-valued kernel depend on the LP partition of unity. For the upper LP estimate, we can say the following: for all mean zero, $C^{1}$ functions $\psi$ such that
$$|\psi(x)|\leq B(1+|x|)^{-n-1},\quad |\nabla\psi(x)|\leq B(1+|x|)^{-n-1}$$
we have, with $P_{j}f:=f*\psi_{2^{-j}}$ and $\psi_{t}(x):=t^{-n}\psi(x/t)$, that
$$\left\|\left(\sum_{j\in\mathbb{Z}}|P_{j}f|^{2}\right)^{1/2}\right\|_{L^{p}}\leq C_{n,p}B\|f\|_{L^{p}},\quad 1<p<\infty$$
Conversely, if $\psi$ is a Schwartz function such that $\sum_{j}\widehat{\psi}(2^{-j}\xi)=1$ for $\xi\neq 0$, then there is a constant $C_{n,\psi}$, depending only on the dimension $n$ and the function $\psi$, such that
$$\|f\|_{L^{p}}\lesssim_{p}C_{n,\psi}\left\|\left(\sum_{j}|P_{j}f|^{2}\right)^{1/2}\right\|_{L^{p}}$$
Having said all that, let's turn to what you want to prove. Let $\phi$ be a Schwartz function such that $\widehat{\phi}$ is supported in the annulus $\left\{1/2\leq|\xi|\leq 2\right\}$ and $\sum_{j}\widehat{\phi}(2^{-j}\xi)=1$ for all $\xi\neq 0$. Define a Schwartz function $\psi$ by
$$\widehat{\psi}(\xi)=\begin{cases}\sum_{j\leq 0}\widehat{\phi}(2^{-j}\xi) & {\xi\neq 0}\\ 1 & {\xi=0}\end{cases}$$
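To make this definition concrete, here is a minimal one-dimensional Python sketch. It assumes one particular choice that is not part of the argument: $\widehat{\phi}(\xi)=\chi(\xi)-\chi(2\xi)$, where $\chi$ is a smooth bump equal to $1$ on $|\xi|\leq 1$ and $0$ on $|\xi|\geq 2$. With this choice the sum over $j\leq 0$ telescopes to $\chi$, which is one way to see that $\widehat{\psi}$ is smooth and compactly supported. The names `chi`, `phi_hat`, `psi_hat`, and `full_sum` are hypothetical helpers, and the infinite sums are truncated at `depth`.

```python
import math

def h(t):
    # Standard smooth building block: exp(-1/t) for t > 0, zero otherwise.
    return math.exp(-1.0 / t) if t > 0 else 0.0

def smooth_step(t):
    # Smooth transition: 0 for t <= 0, 1 for t >= 1.
    return h(t) / (h(t) + h(1.0 - t))

def chi(xi):
    # Assumed bump: 1 on |xi| <= 1, 0 on |xi| >= 2.
    return 1.0 - smooth_step(abs(xi) - 1.0)

def phi_hat(xi):
    # Assumed LP multiplier, supported in 1/2 <= |xi| <= 2.
    return chi(xi) - chi(2.0 * xi)

def psi_hat(xi, depth=60):
    # Truncated version of sum_{j <= 0} phi_hat(2^{-j} xi), with value 1 at xi = 0.
    if xi == 0:
        return 1.0
    return sum(phi_hat(2.0 ** (-j) * xi) for j in range(-depth, 1))

def full_sum(xi, depth=60):
    # Truncated version of sum over all j of phi_hat(2^{-j} xi), for xi != 0.
    return sum(phi_hat(2.0 ** (-j) * xi) for j in range(-depth, depth + 1))
```

For instance, `full_sum(xi)` should be numerically $1$ for nonzero `xi` of moderate size, and `psi_hat(xi)` should agree with `chi(xi)`, so it is $1$ for $|\xi|\leq 1$ and $0$ for $|\xi|\geq 2$.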
Technically, you need to check that $\psi$ is a well-defined Schwartz function, but this is easy. Let $P_{\leq 0}$ be the operator given by convolution with $\psi$. We then have the following result,
$$\|f\|_{L^{p}}\sim_{n,p,\phi}\|P_{\leq 0}f\|_{L^{p}}+\left\|\left(\sum_{j=1}^{\infty}|P_{j}f|^{2}\right)^{1/2}\right\|_{L^{p}},\quad f\in L^{p}(\mathbb{R}^{n})$$
for all $1<p<\infty$.
Observe that the identity $f=P_{\leq 0}f+\sum_{j\geq 1}P_{j}f$ holds for all Schwartz functions $f\in\mathcal{S}(\mathbb{R}^{n})$. Indeed, by our hypothesis that $\sum_{j\in\mathbb{Z}}\widehat{\phi}(2^{-j}\xi)=1$ for $\xi\neq 0$, we have that for $\xi\neq 0$,
$$\widehat{f}(\xi)=\sum_{j\in\mathbb{Z}}\widehat{\phi}(2^{-j}\xi)\widehat{f}(\xi)=\sum_{j\leq 0}\widehat{\phi}(2^{-j}\xi)\widehat{f}(\xi)+\sum_{j\geq 1}\widehat{P_{j}f}(\xi)=\widehat{P_{\leq 0}f}(\xi)+\sum_{j\geq 1}\widehat{P_{j}f}(\xi)$$
By nonnegativity of the summands and the upper LP estimate,
$$\left\|\left(\sum_{j\geq 1}|P_{j}f|^{2}\right)^{1/2}\right\|_{L^{p}}\leq\left\|\left(\sum_{j\in\mathbb{Z}}|P_{j}f|^{2}\right)^{1/2}\right\|_{L^{p}}\lesssim_{n,p,\phi}\|f\|_{L^{p}}$$
By Young's inequality,
$$\|P_{\leq 0}f\|_{L^{p}}\leq\|\psi\|_{L^{1}}\|f\|_{L^{p}}$$
These two estimates together with Minkowski's inequality give us one side of the inequality.
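As a sanity check on the Young's inequality step, here is a small Python sketch of its discrete analogue on $\mathbb{Z}$ (finitely supported sequences with counting measure), where the same bound $\|\psi*f\|_{\ell^{p}}\leq\|\psi\|_{\ell^{1}}\|f\|_{\ell^{p}}$ holds. The sequences `psi_seq` and `f_seq` are arbitrary illustrative values, not derived from any actual LP kernel.

```python
def convolve(psi, f):
    # Full discrete convolution of two finitely supported sequences.
    out = [0.0] * (len(psi) + len(f) - 1)
    for i, a in enumerate(psi):
        for j, b in enumerate(f):
            out[i + j] += a * b
    return out

def lp_norm(v, p):
    # l^p norm of a finite sequence, for 1 <= p < infinity.
    return sum(abs(x) ** p for x in v) ** (1.0 / p)

# Arbitrary sample data (assumption, for illustration only).
psi_seq = [0.25, 0.5, 0.25]
f_seq = [1.0, -2.0, 3.0, 0.5, -1.5]

def young_gap(p):
    # Right-hand side minus left-hand side of Young's inequality; should be >= 0.
    return lp_norm(psi_seq, 1) * lp_norm(f_seq, p) - lp_norm(convolve(psi_seq, f_seq), p)
```

Evaluating `young_gap(p)` for a few exponents such as `1.5`, `2`, and `4` should give nonnegative values.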
We now show the other side of the inequality. We have the trivial estimate
$$\left(\sum_{j\geq 1}|P_{j}f|^{2}\right)^{1/2}\geq|P_{1}f|,$$
which implies by Minkowski's inequality, writing $P_{\leq 1}:=P_{\leq 0}+P_{1}$, that
$$\left\|\left(\sum_{j\geq 1}|P_{j}f|^{2}\right)^{1/2}\right\|_{L^{p}}+\left\|P_{\leq 0}f\right\|_{L^{p}}\geq\left\|P_{\leq 1}f\right\|_{L^{p}}$$
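Now the multiplier of $P_{\leq 1}$ equals $1$ on $\{|\xi|\leq 2\}$, which contains the support of $\widehat{\phi}(2^{-k}\cdot)$ for every $k\leq 0$. By these support properties, we have that $P_{k}P_{\leq 1}=P_{k}$ if $k\leq 0$. Whence,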
$$\left(\sum_{k\leq0}|P_{k}f|^{2}\right)^{1/2}\leq\left(\sum_{k\in\mathbb{Z}}|P_{k}P_{\leq 1}f|^{2}\right)^{1/2}$$
Taking $L^{p}$ norms and using the standard two-sided LP inequality, we obtain
$$\left\|\left(\sum_{k\leq 0}|P_{k}f|^{2}\right)^{1/2}\right\|_{L^{p}}\lesssim_{n,p,\phi}\left\|P_{\leq 1}f\right\|_{L^{p}}$$
Putting these results together and using Minkowski's inequality and the two-sided LP inequality again, we conclude that
\begin{align*}
\|f\|_{L^{p}}&\lesssim_{n,p,\phi}\left\|\left(\sum_{j\in\mathbb{Z}}|P_{j}f|^{2}\right)^{1/2}\right\|_{L^{p}}\\
&\leq\left\|\left(\sum_{j\leq 0}|P_{j}f|^{2}\right)^{1/2}\right\|_{L^{p}}+\left\|\left(\sum_{j\geq 1}|P_{j}f|^{2}\right)^{1/2}\right\|_{L^{p}}\\
&\lesssim_{n,p,\phi}\|P_{\leq 0}f\|_{L^{p}}+2\left\|\left(\sum_{j\geq 1}|P_{j}f|^{2}\right)^{1/2}\right\|_{L^{p}}
\end{align*}
Best Answer
$W^{s,p}$ is the inhomogeneous Triebel-Lizorkin space $F_{p,q}^{s}(\mathbb{R}^{n})$, with $q=2$, defined by
$$\|f\|_{F_{p,q}^{s}}=\left\|\left(\sum_{k}2^{kqs}|P_{k}f|^{q}\right)^{1/q}\right\|_{L^{p}}$$
As you point out, one obtains the Besov space $B_{p,q}^{s}$ simply by interchanging the order in which norms are taken. I think you would agree that interchanging norms is, in general, a nontrivial action. It is clear from Minkowski's integral inequality that
$$\|f\|_{F_{p,q}^{s}}\leq\|f\|_{B_{p,q}^{s}} \enspace\text{if } p\geq q, \quad\|f\|_{B_{p,q}^{s}}\leq\|f\|_{F_{p,q}^{s}} \enspace\text{if } q\geq p$$
Additionally, combining this with the nesting property of sequence spaces ($\|a\|_{\ell^{q}}\leq\|a\|_{\ell^{r}}$ for $q\geq r$), $$\|f\|_{F_{p,q}^{s}}\leq\|f\|_{B_{p,r}^{s}} \enspace\text{if } r\leq\min(p,q), \quad \|f\|_{B_{p,r}^{s}}\leq\|f\|_{F_{p,q}^{s}} \enspace\text{if } r\geq\max(p,q)$$
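The effect of interchanging the two norms can be seen in a finite model. The following Python sketch assumes a small array `a[k][x]` standing in for $2^{ks}P_{k}f(x)$, with counting measure in both variables; the entries are arbitrary and the function names are hypothetical. It computes the mixed norms in both orders so that the Minkowski ordering can be checked numerically.

```python
import math

def lp_of_lq(a, p, q):
    # Triebel-Lizorkin ordering: l^q in k first, then l^p in x.
    n_x = len(a[0])
    inner = [sum(abs(row[x]) ** q for row in a) ** (1.0 / q) for x in range(n_x)]
    return sum(v ** p for v in inner) ** (1.0 / p)

def lq_of_lp(a, p, q):
    # Besov ordering: l^p in x first, then l^q in k.
    inner = [sum(abs(v) ** p for v in row) ** (1.0 / p) for row in a]
    return sum(v ** q for v in inner) ** (1.0 / q)

# Arbitrary sample array (assumption): 5 "frequency" rows, 8 "space" columns.
a = [[math.sin(k * x + k + 1.0) for x in range(8)] for k in range(5)]

f_norm_p4 = lp_of_lq(a, 4.0, 2.0)   # p >= q: expect f_norm_p4 <= b_norm_p4
b_norm_p4 = lq_of_lp(a, 4.0, 2.0)
f_norm_p15 = lp_of_lq(a, 1.5, 2.0)  # q >= p: expect b_norm_p15 <= f_norm_p15
b_norm_p15 = lq_of_lp(a, 1.5, 2.0)
```

When $p=q$ the two orderings coincide, which is a quick consistency check on the two functions.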
I believe that an equivalent characterization of $B_{p,q}^{s}$, for $0<s<1$, is in terms of the norm
$$\|f\|_{L^{p}}+\left(\int_{\mathbb{R}^{n}}\dfrac{\|f(\cdot+t)-f\|_{L^{p}}^{q}}{|t|^{n+sq}}dt\right)^{1/q}$$
If this is correct, then Besov spaces correspond to the generalized Lipschitz spaces $\Lambda_{\alpha}^{p,q}$ in E.M. Stein, Singular Integrals and Differentiability Properties of Functions, Chapter 5, where $s=\alpha$ in our notation. Furthermore, one can show using this characterization that $$W^{s,p}(\mathbb{R})\not\subset B_{p,q}^{s}(\mathbb{R}), \quad q<2 \tag{1}$$ and $$B_{p,q}^{s}(\mathbb{R})\not\subset W^{s,p}(\mathbb{R}), \quad q>2 \tag{2}$$
According to Section 6.8 of the aforementioned reference, the functions
$$f_{s,\sigma}(x):=e^{-\pi x^{2}}\sum_{k=1}^{\infty}a^{-ks}k^{-\sigma}e^{2\pi i a^{k}x}, \quad x\in\mathbb{R}$$
where $a>1$ is an integer, satisfy
$$f_{s,\sigma}\in W^{s,p}(\mathbb{R})\Leftrightarrow \sigma>\dfrac{1}{2},\quad \forall 1<p<\infty$$ and $$f_{s,\sigma}\in B_{p,q}^{s}(\mathbb{R})\Leftrightarrow \sigma>\dfrac{1}{q},\quad\forall 1<p<\infty$$
From this result, which I imagine depends on results for lacunary Fourier series, it is easy to deduce (1) and (2).
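The deduction amounts to picking $\sigma$ strictly between $1/2$ and $1/q$. Here is a minimal Python sketch of that bookkeeping. It takes the two membership criteria quoted above as given (it does not verify them), and the function names are hypothetical.

```python
def in_sobolev(sigma):
    # Quoted criterion (assumed, not verified here): f_{s,sigma} in W^{s,p} iff sigma > 1/2.
    return sigma > 0.5

def in_besov(sigma, q):
    # Quoted criterion (assumed, not verified here): f_{s,sigma} in B^s_{p,q} iff sigma > 1/q.
    return sigma > 1.0 / q

def separating_sigma(q):
    # Midpoint of 1/2 and 1/q; strictly between them whenever q != 2.
    return 0.5 * (0.5 + 1.0 / q)

# q < 2: the function is in W^{s,p} but not in B^s_{p,q}, which gives (1).
sigma_1 = separating_sigma(1.5)
# q > 2: the function is in B^s_{p,q} but not in W^{s,p}, which gives (2).
sigma_2 = separating_sigma(4.0)
```

For $q=2$ the two thresholds coincide and no separating $\sigma$ exists, consistent with $W^{s,p}=F_{p,2}^{s}$ playing the borderline role.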