The more formal derivation of the van der Waals equation of state utilises the partition function. If we have an interaction $U(r_{ij})$ between particles $i$ and $j$, then we can expand in the Mayer function,
$$f_{ij}= e^{-\beta U(r_{ij})} -1$$
the partition function of the system, which for $N$ indistinguishable particles is given by,
$$\mathcal Z = \frac{1}{N! \lambda^{3N}} \int \prod_i d^3 r_i \left( 1 + \sum_{j>k}f_{jk} + \sum_{j>k,l>m} f_{jk}f_{lm} + \dots\right)$$
where $\lambda$ is a convenient constant, the de Broglie thermal wavelength and this expansion is simply obtained by the Taylor series of the exponential. The first term $\int \prod_i d^3 r_i$ simply gives $V^N$, and the first correction is simply the same sum each time, contributing,
$$V^{N-1}\int d^3 r \, f(r).$$
The free energy can be derived from the partition function, which allows us to approximate the pressure of the system as,
$$p = \frac{Nk_B T}{V} \left( 1-\frac{N}{2V} \int d^3r \, f(r) + \dots\right).$$
If we use the van der Waals interaction,
$$U(r) = \left\{\begin{matrix}
\infty & r < r_0\\
-U_0 \left( \frac{r_0}{r}\right)^6 & r \geq r_0
\end{matrix}\right.$$
and evaluate the integral, we find,
$$\frac{pV}{Nk_B T} = 1 - \frac{N}{V} \left( \frac{a}{k_B T}-b\right)$$
where $a = \frac23 \pi r_0^3 U_0$ and $b = \frac23 \pi r_0^3$ which is directly related to the excluded volume $\Omega = 2b$.
Firstly, $P$, $V$ and $T$ in the van der Waals equation of state have exactly the same meanings as they have elsewhere in thermodynamics: they represent the pressure, volume, and temperature of a sample of the material being studied.
Secondly, this equation of state is one of many approximate equations of state which attempt to model the values measured in experiment, especially (in this case) non-ideal gases. There are many other, more accurate, equations of state which are empirical fits to the measured data. The van der Waals equation is a useful one to study because it is simple, and one can understand in fairly simple terms where it comes from. It has some basis in statistical mechanics, which also helps explain its form.
But remember, it is just an approximation.
For example,
although it predicts the existence of a critical point,
it continues in an unphysical way below the critical temperature
into the two-phase region where
liquid and gas are in equilibrium with each other.
The equations of state page actually gives a handwaving explanation of the two modified terms. I'll paraphrase here.
Indeed $(V-nb)$ attempts to account for the excluded volume of the atoms. If we ignore all the attractions, we are essentially modelling the atoms as hard spheres, and writing the equation of state as
$$
\frac{PV}{nRT} = \left( \frac{V}{V-nb}\right) = \left( \frac{1}{1-b\rho}\right)
$$
where $\rho=n/V$. This means that, for a given density $\rho$ and temperature $T$, the pressure is higher than we might have expected if we had assumed that the gas were ideal. Also, it recognizes that there is a minimum possible value of $V/n$ or a maximum density $\rho$, because at some point the atoms will overlap. (We don't take the precise value of the maximum density too literally; in practice, the system will crystallize to a solid, which the equation does not describe). Nowadays, we know a lot more about the hard sphere fluid,
and have more accurate equations of state such as Carnahan-Starling for this system.
The effects of attractions are then treated as a weak perturbation, superimposed on the above equation. The gas surrounding a typical atom is treated as a uniform attractive background: a kind of mean field theory. This reduces the pressure, compared with what we might have thought on the basis of the ideal gas equation.
The equations of state page gives a way to think of this:
While ideal gas molecules do not interact, we consider molecules attracting others within a distance of several molecules' radii. It makes no effect inside the material, but surface molecules are attracted into the material from the surface. We see this as diminishing of pressure on the outer shell (which is used in the ideal gas law), so we write ( $P$ + something) instead of $P$. To evaluate this "something", let's examine an additional force acting on an element of gas surface. While the force acting on each surface molecule is $\sim\rho$, the force acting on the whole element is $\sim\rho^{2}$ ....
In fact, this can be made more rigorous:
it can be shown using that, for thermodynamic consistency,
if we add a uniform attractive background to a system, the effect on the equation of state must take the form
$$
P = P_0 - a\rho^2
$$
where $P_0$ is the pressure of the system without the attractions
(for example, $P_0=1/(1-b\rho)$ as above,
or a more accurate equation such as Carnahan-Starling)
and $a$ is a positive constant.
This is the form appearing in van der Waals' equation.
[EDIT following OP comment]
If the van der Waals equation describes the real gas accurately,
then $P$ is the pressure of the real gas (not an ideal gas)
and likewise $V$, $T$, $n$ are properties of the real gas.
If you want to compare the properties of a real gas and an ideal gas,
then it is necessary to specify what is kept the same, and what is to be compared.
I believe that it is clearest to keep $n$, $V$ and $T$ the same,
and compare pressures. Then, rearranging the equation of state
\begin{align*}
P &= \frac{nRT}{V-nb} - a\frac{n^2}{V^2} \\
\frac{PV}{nRT} = \frac{P}{P_{\text{id}}} &=
\frac{V}{V-nb} - \frac{a}{RT}\frac{n}{V} = \frac{1}{1-\rho b} - \frac{a}{RT}\rho
\end{align*}
where I have introduced
the density in moles per unit volume, $\rho=n/V$,
and $P_{\text{id}}=nRT/V$ which is the ideal gas pressure
at the same density and temperature.
At low density, $\rho\rightarrow 0$, $P\rightarrow P_{\text{id}}$,
as we would expect.
The equation makes clear that the $b$ term (excluded volume due to short range repulsive interactions) tends to
increase $P/P_{\text{id}}$ and the $a$ term (attractive long range interactions) tends to decrease $P/P_{\text{id}}$.
Depending on the temperature and the density,
it is possible for $P$ to be higher, or lower, than $P_{\text{id}}$.
Best Answer
Short Answer: Every particle attracts every other, so the total attraction (which is the correction to pressure) has to be proportional to $N^2$.
Long Answer: If you want to rigorously derive the VdW equation of state, you should look into the Cluster expansion. It is covered in most textbooks on statistical mechanics. For a good introduction, have a look at David Tong's lecture notes.
More intuitively, you can view the VdW equation as a mean field theory of a weakly interacting gas. Take a look at the partition function $$ Z = \frac{1}{N! \lambda^{3N}} \int e^{-\beta \sum U(\Delta x_{ij})} dx^{3N} \,, $$ with the momentum integration already performed and put into the thermal wavelength $\lambda$.
For an arbitrary potential $U$, this integral is impossible to do exactly. But you can do a mean field approximation, where you assume that all particles feel the same 'mean field potential' $U_{\rm MF}(r)$. The partition function then factorizes into $$ Z = \frac{1}{N! \lambda^{3N}} \left[\int e^{-\beta U_{\rm MF}(x)} dx^3\right]^N \,. $$
What can you say about $U_{\rm MF}(r)$? You know two things:
Since our simplified $U_{\rm MF}$ does not depend on position anymore, we can pull it out of the integral. $$ Z = \frac{(e^{\beta a N / V})^N}{N! \lambda^{3N}} \left[\int dx^3\right]^N = \frac{1}{N!} \left[\frac{e^{\beta a N / V} (V- Nb)}{\lambda^3}\right]^N\,. $$ Here you can see that the $N^2$ dependence comes from the overall exponent of $N$: $$ \left(e^{\beta a N / V}\right)^N = e^{\beta a N^2 / V} $$ You can derive the VdW equation of state by calculating the Gibbs free energy $G$ and looking at its derivatives. Along the way you will pick up another $V$ to give you $a N^2 / V^2$.
If you want to know more I recommend the book Introduction to the Theory of Soft Matter by Jonathan Selinger.