$W^{s,p}$ is the inhomogeneous Triebel-Lizorkin space $F_{p,q}^{s}(\mathbb{R}^{n})$, with $q=2$, defined by
$$\|f\|_{F_{p,q}^{s}}=\left\|\left(\sum_{k}2^{kqs}|P_{k}f|^{q}\right)^{1/q}\right\|_{L^{p}}$$
As you point out, one obtains the Besov space $B_{p,q}^{s}$ simply by interchanging the order in which norms are taken. I think you would agree that interchanging norms is, in general, a nontrivial action. It is clear from Minkowski's integral inequality that
$$\|f\|_{F_{p,q}^{s}}\leq\|f\|_{B_{p,q}^{s}} \enspace p\geq q, \quad\|f\|_{B_{p,q}^{s}}\leq\|f\|_{F_{p,q}^{s}} \enspace q\geq p$$
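As a sanity check on the two Minkowski inequalities above, here is a minimal Python sketch on a finite grid. The array `a[k][x]` is a stand-in for the weighted pieces $2^{ks}P_{k}f(x)$; the random data, the grid size and all function names are assumptions made for illustration only. It checks the direction of the inequalities and does not compute actual Littlewood-Paley projections.

```python
import random

def lp_norm(values, p):
    """Discrete L^p norm with respect to the uniform probability measure."""
    return (sum(abs(v) ** p for v in values) / len(values)) ** (1.0 / p)

def lq_norm(values, q):
    """Discrete l^q norm with respect to counting measure."""
    return sum(abs(v) ** q for v in values) ** (1.0 / q)

def f_type_norm(a, p, q):
    """L^p in x of the l^q norm in k (the Triebel-Lizorkin ordering)."""
    num_x = len(a[0])
    pointwise = [lq_norm([row[x] for row in a], q) for x in range(num_x)]
    return lp_norm(pointwise, p)

def b_type_norm(a, p, q):
    """l^q in k of the L^p norm in x (the Besov ordering)."""
    return lq_norm([lp_norm(row, p) for row in a], q)

random.seed(0)
# a[k][x]: 6 hypothetical frequency pieces sampled at 200 points.
a = [[random.gauss(0.0, 1.0) for _ in range(200)] for _ in range(6)]

f_big_p, b_big_p = f_type_norm(a, 4.0, 2.0), b_type_norm(a, 4.0, 2.0)  # p >= q
f_big_q, b_big_q = f_type_norm(a, 1.5, 3.0), b_type_norm(a, 1.5, 3.0)  # q >= p

print("p >= q, F <= B:", f_big_p <= b_big_p)
print("q >= p, B <= F:", b_big_q <= f_big_q)
```

Both orderings use the same data; only the order in which the two norms are taken changes.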
Additionally, combining this with the nesting property of sequence spaces ($\|\cdot\|_{\ell^{q}}\leq\|\cdot\|_{\ell^{r}}$ for $q\geq r$),
$$\|f\|_{F_{p,q}^{s}}\leq\|f\|_{B_{p,r}^{s}} \enspace r\leq\min(p,q), \quad \|f\|_{B_{p,r}^{s}}\leq\|f\|_{F_{p,q}^{s}} \enspace r\geq\max(p,q)$$
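The nesting property itself is easy to see numerically: for a fixed sequence, the $\ell^{q}$ norm is nonincreasing in $q$. The sketch below uses an arbitrary test sequence chosen for illustration; nothing in it is specific to Littlewood-Paley pieces.

```python
def lq_norm(values, q):
    """Discrete l^q norm with respect to counting measure."""
    return sum(abs(v) ** q for v in values) ** (1.0 / q)

# Hypothetical test sequence; any finite sequence would do.
c = [(k + 1) * 2.0 ** (-k) for k in range(10)]

exponents = [1.0, 1.5, 2.0, 4.0]
norms = [lq_norm(c, q) for q in exponents]

for q, value in zip(exponents, norms):
    print(q, value)
```

The printed norms decrease as the exponent increases, which is the inequality $\|c\|_{\ell^{q}}\leq\|c\|_{\ell^{r}}$ for $q\geq r$.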
I believe that an equivalent characterization of $B_{p,q}^{s}$, for $0<s<1$, is in terms of the norm
$$\|f\|_{L^{p}}+\left(\int_{\mathbb{R}^{n}}\dfrac{(\|f(x+t)-f(x)\|_{L^{p}})^{q}}{|t|^{n+sq}}dt\right)^{1/q}$$
If this is correct, then Besov spaces correspond to the generalized Lipschitz spaces $\Lambda_{\alpha}^{p,q}$ in E.M. Stein, Singular Integrals and Differentiability Properties of Functions, Chapter 5, where $s=\alpha$ in our notation. Furthermore, one can show using this characterization that
$$W^{s,p}(\mathbb{R})\not\subset B_{p,q}^{s}(\mathbb{R}), \quad q<2 \tag{1}$$
and
$$B_{p,q}^{s}(\mathbb{R})\not\subset W^{s,p}(\mathbb{R}), \quad q>2 \tag{2}$$
According to section 6.8 of the aforementioned reference, the function
$$f_{s,\sigma}(x):=e^{-\pi x^{2}}\sum_{k=1}^{\infty}a^{-ks}k^{-\sigma}e^{2\pi i a^{k}x}, \quad x\in\mathbb{R}$$
where $a>1$ is an integer, satisfies
$$f_{s,\sigma}\in W^{s,p}(\mathbb{R})\Leftrightarrow \sigma>\dfrac{1}{2},\quad \forall 1<p<\infty$$
and
$$f_{s,\sigma}\in B_{p,q}^{s}(\mathbb{R})\Leftrightarrow \sigma>\dfrac{1}{q},\quad\forall 1<p<\infty$$
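From this result, which I imagine depends on results for lacunary Fourier series, it is easy to deduce (1) and (2): for $q<2$, any $\sigma$ with $1/2<\sigma\leq 1/q$ gives $f_{s,\sigma}\in W^{s,p}(\mathbb{R})\setminus B_{p,q}^{s}(\mathbb{R})$, and for $q>2$, any $\sigma$ with $1/q<\sigma\leq 1/2$ gives $f_{s,\sigma}\in B_{p,q}^{s}(\mathbb{R})\setminus W^{s,p}(\mathbb{R})$.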
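A heuristic way to see where the thresholds $1/2$ and $1/q$ come from, offered as an assumption rather than as Stein's argument: each frequency $a^{k}$ sits in essentially one dyadic block, so after weighting by the smoothness factor the coefficients behave like the sequence $k^{-\sigma}$, and membership reduces to whether $\sum_{k}k^{-\sigma q}$ converges. The sketch below compares partial sums for $\sigma=3/4$ with exponents $2$ and $1$; the function name and the cutoff `n_terms` are illustrative choices.

```python
def partial_sum(sigma, q, n_terms):
    """Partial sum of k^(-sigma*q) for k = 1..n_terms."""
    return sum(k ** (-sigma * q) for k in range(1, n_terms + 1))

sigma = 0.75      # satisfies 1/2 < sigma <= 1/q for q = 1
n_terms = 10 ** 4

# Exponent 2 models the W^{s,p} condition (sigma > 1/2): the series converges.
l2_sum = partial_sum(sigma, 2.0, n_terms)
# Exponent 1 models the B^{s}_{p,1} condition (sigma > 1): the series diverges.
l1_sum = partial_sum(sigma, 1.0, n_terms)

print(l2_sum, l1_sum)
```

The first partial sum stays bounded as `n_terms` grows, while the second grows like `n_terms ** 0.25`, consistent with $f_{s,3/4}$ lying in $W^{s,p}$ but not in $B_{p,1}^{s}$.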
Best Answer
You seem to think that the implied constants will be completely independent of our choice of Littlewood-Paley partition of unity. If you look at the proof of the standard Littlewood-Paley inequality, you'll see that the estimates for the vector-valued kernel depend on the LP partition of unity. For the upper LP estimate, we can say the following: for all mean zero, $C^{1}$ functions $\psi$ such that $$|\psi(x)|\leq B(1+|x|)^{-n-1},\quad |\nabla\psi(x)|\leq B(1+|x|)^{-n-1}$$ we have that $$\left\|\left(\sum_{j\in\mathbb{Z}}|P_{j}f|^{2}\right)^{1/2}\right\|_{L^{p}}\leq C_{n,p}B\|f\|_{L^{p}},\quad 1<p<\infty$$ where $P_{j}$ denotes convolution with $2^{jn}\psi(2^{j}\cdot)$. Conversely, if $\psi$ is a Schwartz function such that $\sum_{j}\widehat{\psi}(2^{-j}\xi)=1$ for $\xi\neq 0$, then there is a constant $C_{n,\psi}$ depending only on the dimension $n$ and the function $\psi$, such that $$\|f\|_{L^{p}}\lesssim_{p}C_{n,\psi}\left\|\left(\sum_{j}|P_{j}f|^{2}\right)^{1/2}\right\|_{L^{p}}$$
Having said all that, let's turn to what you want to prove. Let $\phi$ be a Schwartz function such that $\widehat{\phi}$ is supported in the annulus $\left\{1/2\leq|\xi|\leq 2\right\}$ and $\sum_{j}\widehat{\phi}(2^{-j}\xi)=1$ for all $\xi\neq 0$. Define a Schwartz function $\psi$ by $$\widehat{\psi}(\xi)=\begin{cases}\sum_{j\leq 0}\widehat{\phi}(2^{-j}\xi) & {\xi\neq 0}\\ 1 & {\xi=0}\end{cases}$$ Technically, you need to check that $\psi$ is a well-defined Schwartz function, but this is easy. Let $P_{\leq 0}$ be the operator given by convolution with $\psi$. We then have the following result, $$\|f\|_{L^{p}}\sim_{n,p,\phi}\|P_{\leq 0}f\|_{L^{p}}+\left\|\left(\sum_{j=1}^{\infty}|P_{j}f|^{2}\right)^{1/2}\right\|_{L^{p}},\quad f\in L^{p}(\mathbb{R}^{n})$$ for all $1<p<\infty$.
Observe that the identity $f=P_{\leq 0}f+\sum_{j\geq 1}P_{j}f$ holds for all Schwartz functions $f\in\mathcal{S}(\mathbb{R}^{n})$. Indeed, by our hypothesis that $\sum_{j\in\mathbb{Z}}\widehat{\phi}(2^{-j}\xi)=1$, $\xi\neq 0$, we have that for $\xi\neq 0$, $$\widehat{f}(\xi)=\sum_{j\in\mathbb{Z}}\widehat{\phi}(2^{-j}\xi)\widehat{f}(\xi)=\sum_{j\leq 0}\widehat{\phi}(2^{-j}\xi)\widehat{f}(\xi)+\sum_{j\geq 1}\widehat{P_{j}f}(\xi)=\widehat{P_{\leq 0}f}(\xi)+\sum_{j\geq 1}\widehat{P_{j}f}(\xi)$$ By nonnegativity, $$\left\|\left(\sum_{j\geq 1}|P_{j}f|^{2}\right)^{1/2}\right\|_{L^{p}}\leq\left\|\left(\sum_{j\in\mathbb{Z}}|P_{j}f|^{2}\right)^{1/2}\right\|_{L^{p}}\lesssim_{n,p,\phi}\|f\|_{L^{p}}$$ By Young's inequality, $$\|P_{\leq 0}f\|_{L^{p}}\leq\|\psi\|_{L^{1}}\|f\|_{L^{p}}$$ Adding these two estimates gives us one side of the inequality.
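The frequency-side identity can be checked numerically for one concrete choice of $\widehat{\phi}$. The sketch below works in dimension one and assumes the common construction $\widehat{\phi}(\xi)=\chi(\xi)-\chi(2\xi)$, where $\chi$ is a smooth cutoff equal to $1$ on $|\xi|\leq 1$ and $0$ on $|\xi|\geq 2$. This particular $\chi$, the truncation of the infinite sums at $|j|\leq 60$, and all function names are assumptions for illustration and are not taken from the argument above.

```python
import math

def smooth_step(t):
    """Smooth function equal to 0 for t <= 0 and 1 for t >= 1."""
    def g(u):
        return math.exp(-1.0 / u) if u > 0 else 0.0
    return g(t) / (g(t) + g(1.0 - t))

def chi(xi):
    """Smooth cutoff: 1 for |xi| <= 1, 0 for |xi| >= 2."""
    return 1.0 - smooth_step(abs(xi) - 1.0)

def phi_hat(xi):
    """Annular bump supported in 1/2 <= |xi| <= 2 (assumed construction)."""
    return chi(xi) - chi(2.0 * xi)

def psi_hat(xi, lowest=-60):
    """Multiplier of P_{<=0}: sum over j <= 0 of phi_hat(2^{-j} xi), and 1 at 0."""
    if xi == 0.0:
        return 1.0
    return sum(phi_hat(xi * 2.0 ** (-j)) for j in range(lowest, 1))

def high_sum(xi, highest=60):
    """Sum over j >= 1 of the multipliers of P_j, truncated at j = highest."""
    return sum(phi_hat(xi * 2.0 ** (-j)) for j in range(1, highest + 1))

xis = [0.001, 0.3, 0.7, 1.0, 1.5, 3.0, 100.0]
max_defect = max(abs(psi_hat(xi) + high_sum(xi) - 1.0) for xi in xis)
print("largest deviation from 1:", max_defect)
```

With this construction the low-frequency sum telescopes to $\chi$ itself, which is one way to see that $\psi$ is a Schwartz function.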
We now show the other side of the inequality. Write $P_{\leq 1}:=P_{\leq 0}+P_{1}$. We have the trivial estimate $$\left(\sum_{j\geq 1}|P_{j}f|^{2}\right)^{1/2}\geq|P_{1}f|,$$ which implies by Minkowski that $$\left\|\left(\sum_{j\geq 1}|P_{j}f|^{2}\right)^{1/2}\right\|_{L^{p}}+\left\|P_{\leq 0}f\right\|_{L^{p}}\geq\left\|P_{\leq 1}f\right\|_{L^{p}}$$ Now the multiplier of $P_{\leq 1}$ equals $1$ on $\left\{|\xi|\leq 2\right\}$, which contains the support of $\widehat{\phi}(2^{-k}\cdot)$ for every $k\leq 0$, so $P_{k}P_{\leq 1}=P_{k}$ if $k\leq 0$. Whence, $$\left(\sum_{k\leq0}|P_{k}f|^{2}\right)^{1/2}\leq\left(\sum_{k\in\mathbb{Z}}|P_{k}P_{\leq 1}f|^{2}\right)^{1/2}$$ Taking $L^{p}$ norms and using the standard two-sided LP inequality, we obtain $$\left\|\left(\sum_{k\leq 0}|P_{k}f|^{2}\right)^{1/2}\right\|_{L^{p}}\lesssim_{n,p,\phi}\left\|P_{\leq 1}f\right\|_{L^{p}}$$ Putting these results together and using Minkowski's inequality and the two-sided LP inequality again, we conclude that \begin{align*} \|f\|_{L^{p}}&\lesssim_{n,p,\phi}\left\|\left(\sum_{j\in\mathbb{Z}}|P_{j}f|^{2}\right)^{1/2}\right\|_{L^{p}}\\ &\leq\left\|\left(\sum_{j\leq 0}|P_{j}f|^{2}\right)^{1/2}\right\|_{L^{p}}+\left\|\left(\sum_{j\geq 1}|P_{j}f|^{2}\right)^{1/2}\right\|_{L^{p}}\\ &\lesssim_{n,p,\phi}\|P_{\leq 0}f\|_{L^{p}}+2\left\|\left(\sum_{j\geq 1}|P_{j}f|^{2}\right)^{1/2}\right\|_{L^{p}} \end{align*}
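The step $P_{k}P_{\leq 1}=P_{k}$ for $k\leq 0$ is a statement about multipliers, so it can be tested pointwise in frequency. The sketch below again assumes, purely for illustration, the one-dimensional construction $\widehat{\phi}(\xi)=\chi(\xi)-\chi(2\xi)$ with a smooth cutoff $\chi$; the grid, the truncation at $j\geq -60$ and the helper names are hypothetical. It also shows that the identity fails for $k=1$, which is why the argument only uses $k\leq 0$.

```python
import math

def smooth_step(t):
    """Smooth function equal to 0 for t <= 0 and 1 for t >= 1."""
    def g(u):
        return math.exp(-1.0 / u) if u > 0 else 0.0
    return g(t) / (g(t) + g(1.0 - t))

def chi(xi):
    """Smooth cutoff: 1 for |xi| <= 1, 0 for |xi| >= 2."""
    return 1.0 - smooth_step(abs(xi) - 1.0)

def phi_hat(xi):
    """Annular bump supported in 1/2 <= |xi| <= 2 (assumed construction)."""
    return chi(xi) - chi(2.0 * xi)

def m_le1(xi, lowest=-60):
    """Multiplier of P_{<=1} = P_{<=0} + P_1 for xi != 0, truncated below."""
    return sum(phi_hat(xi * 2.0 ** (-j)) for j in range(lowest, 2))

def defect(k, xi):
    """Multiplier of P_k P_{<=1} - P_k at frequency xi."""
    return phi_hat(xi * 2.0 ** (-k)) * (m_le1(xi) - 1.0)

grid = [0.01 * i for i in range(1, 501)]  # frequencies in (0, 5]
worst_low = max(abs(defect(k, xi)) for k in range(-5, 1) for xi in grid)
worst_k1 = max(abs(defect(1, xi)) for xi in grid)

print("k <= 0:", worst_low)
print("k = 1:", worst_k1)
```

For $k\leq 0$ the defect vanishes up to rounding, while for $k=1$ it is visibly nonzero on the part of the annulus where the multiplier of $P_{\leq 1}$ has already started to decay.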