You and Lubos are integrating the expression $dU = -pdV$ for a constant composition system, but this expression is only valid for constant entropy $S$. During integration you maintain $T$ constant, but the latter is not equivalent to $S$ constant.
If you chose a representation $(T,V)$ then
$$dU = \left( \frac{\partial U}{\partial T}\right)_V dT + \left( \frac{\partial U}{\partial V}\right)_T dV$$
and you can integrate for the special case of constant $T$, but then $({\partial U}/{\partial V})_T$ is not $(-p)$, as you assumed above. The value of this partial derivative can be obtained from the Helmholtz equation $({\partial U}/{\partial V})_T = T^2 [{\partial}/{\partial T} (p/T)]_V$. Integration gives
$$U(T,V,N) = U_0(T,V_0,N) + \int_{V_0}^V T^2 \left( \frac{\partial}{\partial T} \frac{p}{T} \right)_V dV$$
Noting that when the volume tends to infinity the real gas approaches an ideal gas, which implies that $U\to U_\mathrm{ideal}$, we can chose $V_0\to\infty$ to fix the integratino constant $U_0$
$$U(T,V,N) = U_\mathrm{ideal}(T,N) + \int_{\infty}^V T^2 \left( \frac{\partial}{\partial T} \frac{p}{T} \right)_V dV$$
Substituting the van der Waals equation of state
$$\frac{p}{T} = \frac{NR}{V-Nb} - \frac{a}{T} \frac{N^2}{V^2}$$
and working out yields the internal energy of the van der Waals gas
$$U(T,V,N) = U_\mathrm{ideal}(T,N) - \frac{aN^2}{V} $$
The energy of the ideal gas is given by the well-known form $U_\mathrm{ideal} = N C_V T$ with $C_V$ the heat capacity for ideal gas, e.g. $3R/2$ for monoatomic gas. Now you can understand why the appearance of the heat capacity for an ideal gas is not anything wrong, as you commented above.
Notice also that when $V\to\infty$ the internal energy of the van der Waals gas reduces to the internal energy of an ideal gas, which does not depend on volume. This is the correct result.
Firstly, $P$, $V$ and $T$ in the van der Waals equation of state have exactly the same meanings as they have elsewhere in thermodynamics: they represent the pressure, volume, and temperature of a sample of the material being studied.
Secondly, this equation of state is one of many approximate equations of state which attempt to model the values measured in experiment, especially (in this case) non-ideal gases. There are many other, more accurate, equations of state which are empirical fits to the measured data. The van der Waals equation is a useful one to study because it is simple, and one can understand in fairly simple terms where it comes from. It has some basis in statistical mechanics, which also helps explain its form.
But remember, it is just an approximation.
For example,
although it predicts the existence of a critical point,
it continues in an unphysical way below the critical temperature
into the two-phase region where
liquid and gas are in equilibrium with each other.
The equations of state page actually gives a handwaving explanation of the two modified terms. I'll paraphrase here.
Indeed $(V-nb)$ attempts to account for the excluded volume of the atoms. If we ignore all the attractions, we are essentially modelling the atoms as hard spheres, and writing the equation of state as
$$
\frac{PV}{nRT} = \left( \frac{V}{V-nb}\right) = \left( \frac{1}{1-b\rho}\right)
$$
where $\rho=n/V$. This means that, for a given density $\rho$ and temperature $T$, the pressure is higher than we might have expected if we had assumed that the gas were ideal. Also, it recognizes that there is a minimum possible value of $V/n$ or a maximum density $\rho$, because at some point the atoms will overlap. (We don't take the precise value of the maximum density too literally; in practice, the system will crystallize to a solid, which the equation does not describe). Nowadays, we know a lot more about the hard sphere fluid,
and have more accurate equations of state such as Carnahan-Starling for this system.
The effects of attractions are then treated as a weak perturbation, superimposed on the above equation. The gas surrounding a typical atom is treated as a uniform attractive background: a kind of mean field theory. This reduces the pressure, compared with what we might have thought on the basis of the ideal gas equation.
The equations of state page gives a way to think of this:
While ideal gas molecules do not interact, we consider molecules attracting others within a distance of several molecules' radii. It makes no effect inside the material, but surface molecules are attracted into the material from the surface. We see this as diminishing of pressure on the outer shell (which is used in the ideal gas law), so we write ( $P$ + something) instead of $P$. To evaluate this "something", let's examine an additional force acting on an element of gas surface. While the force acting on each surface molecule is $\sim\rho$, the force acting on the whole element is $\sim\rho^{2}$ ....
In fact, this can be made more rigorous:
it can be shown using that, for thermodynamic consistency,
if we add a uniform attractive background to a system, the effect on the equation of state must take the form
$$
P = P_0 - a\rho^2
$$
where $P_0$ is the pressure of the system without the attractions
(for example, $P_0=1/(1-b\rho)$ as above,
or a more accurate equation such as Carnahan-Starling)
and $a$ is a positive constant.
This is the form appearing in van der Waals' equation.
[EDIT following OP comment]
If the van der Waals equation describes the real gas accurately,
then $P$ is the pressure of the real gas (not an ideal gas)
and likewise $V$, $T$, $n$ are properties of the real gas.
If you want to compare the properties of a real gas and an ideal gas,
then it is necessary to specify what is kept the same, and what is to be compared.
I believe that it is clearest to keep $n$, $V$ and $T$ the same,
and compare pressures. Then, rearranging the equation of state
\begin{align*}
P &= \frac{nRT}{V-nb} - a\frac{n^2}{V^2} \\
\frac{PV}{nRT} = \frac{P}{P_{\text{id}}} &=
\frac{V}{V-nb} - \frac{a}{RT}\frac{n}{V} = \frac{1}{1-\rho b} - \frac{a}{RT}\rho
\end{align*}
where I have introduced
the density in moles per unit volume, $\rho=n/V$,
and $P_{\text{id}}=nRT/V$ which is the ideal gas pressure
at the same density and temperature.
At low density, $\rho\rightarrow 0$, $P\rightarrow P_{\text{id}}$,
as we would expect.
The equation makes clear that the $b$ term (excluded volume due to short range repulsive interactions) tends to
increase $P/P_{\text{id}}$ and the $a$ term (attractive long range interactions) tends to decrease $P/P_{\text{id}}$.
Depending on the temperature and the density,
it is possible for $P$ to be higher, or lower, than $P_{\text{id}}$.
Best Answer
Statistical mechanics
Van der Waals equation can be derived from virial/cluster expansion in terms of the average/mean density of the liquid/gas: $$ \rho(\mathbf{r})=\left\langle \sum_{i=1}^N\delta(\mathbf{r}-\mathbf{r}_i)\right\rangle, $$ where $\mathbf{r}_i$ is the position of the i-th atom/molecule.
Thermodynamics
Another way to look at it is that pressure is the derivative of the free energy in respect to volume, so one can obtain the expression for the free energy by integrating the pressure expressed from van der Waals equation in respect to volume. One can then analyze liquid-gas transition in terms of Landau theory of phase transitions, with density serving as the order parameter.
See, e.g., section 8.5 in Lecture notes on Statistical Physics by Lydéric Bocquet
See also Wikipedia for the expression for the Helmholz free energy in more conventional terms.
Simple mean-field derivation of Van der Waals equation
Simple mean-field derivation of VdW equation can be performed along the lines of the derivation given in Wikipedia. I sketch it here in even more simplified/clarified form.
The energy of a gas of N interacting particles is $$ H(\mathbf{x}_1, \mathbf{p}_1;...;\mathbf{x}_N, \mathbf{p}_N)= \sum_{i=1}^N\frac{\mathbf{p}_i^2}{2m}+\frac{1}{2}\sum_{i=1}^N\sum_{j=1,j\neq i}^NU(|\mathbf{x}_i-\mathbf{x}_j|), $$ where the potential is attractive, i.e., $U(|\mathbf{x}_i-\mathbf{x}_j|)<0$ (unless the molecules/atoms are very close to each other)
The partition function is $$ Z=\int \prod_{i=1}^Nd^3\mathbf{x}_id^3\mathbf{p}_ie^{-\beta H(\mathbf{x}_1, \mathbf{p}_1;...;\mathbf{x}_N, \mathbf{p}_N)}= \int d^3\mathbf{p}_1...d^3\mathbf{p}_Ne^{-\beta \sum_{i=1}^N\frac{\mathbf{p}_i^2}{2m}}Z_c=\\\left(2\pi mk_B T\right)^{\frac{3N}{2}}Z_c $$ where the configuration integral is $$ Z_c=\int d^3\mathbf{x}_1...d^3\mathbf{x}_Ne^{-\frac{\beta}{2}\sum_{i=1}^N\sum_{j=1,j\neq i}^NU(|\mathbf{x}_i-\mathbf{x}_j|)}, $$ and $\beta=1/(k_BT)$.
We now perform the mean-field approximation replacing the pairwise interactions between particles by an interaction of each particle with an averaged field of all other particles: $$ \frac{1}{2}\sum_{i=1}^N\sum_{j=1,j\neq i}^NU(|\mathbf{x}_i-\mathbf{x}_j|)\approx \frac{1}{2}\sum_{i=1}^N \frac{1}{V}\int_V d^3\mathbf{x_j}U(|\mathbf{x}_i-\mathbf{x}_j|)\approx\\ \frac{1}{2}\sum_{i=1}^N \frac{N}{V}\int_V d^3\mathbf{y}U(|\mathbf{y}|)=-\frac{a'N^2}{V}, $$ where the minus sign is due to the fact that the potential is attractive, so that $a'>0$. The configuration integral is then $$ Z_c=\int d^3\mathbf{x}_1...d^3\mathbf{x}_N e^{\beta\frac{a'N^2}{V}}= (V-b'N)^N e^{\beta\frac{a'N^2}{V}}, $$ where we took account for the fact that the particles cannot be too close to each other, and hence the volume integral for each particle is reduced by the factor accounting for the volume of all other particles.
We now can write down the free energy of the gas as $$ F=-\frac{1}{\beta}\log Z=-\frac{a'N^2}{V}-\frac{N}{\beta}\log(V-b'N)-\frac{3N}{2\beta}\log\left(2\pi mk_BT\right) $$ The gas pressure is given by (since $dF=-SdT-pdV$) $$ P=-\left(\frac{\partial F}{\partial V}\right)_T= \frac{Nk_BT}{V-Nb'}-\frac{a'N^2}{V^2} $$ Converting to molar fractions $n=N/N_A$, $b=N_Ab'$, $a=a'N_A^2$, we obtain the usual form of the VdW equation $$ P=\frac{nRT}{V-nb}-\frac{an^2}{V^2}\Leftrightarrow \left(P+\frac{an^2}{V^2}\right)(V-nb)=nRT. $$