Once you start thinking about relativity, gauge fields, QFT, etc., it's easy to forget that the massless KG equation is actually just a fancy name for one of the simplest and most common equations in physics:
$$ (\partial_t^2 - \partial_x^2) \, \varphi = 0 , $$
the wave equation!
The most familiar example is waves on a string. Here's the answer in that context:
$$ (\partial_t^2 - \partial_x^2 + m^2) \, \varphi = 0 $$
With $m=0$ you are talking about waves on a string, where each little string segment is coupled only to its neighbors. (We call this the "wave equation".)
With $m\neq 0$, each little string segment has a harmonic restoring force back to its equilibrium displacement, in addition to neighbor coupling. (I'd call this the "wave equation with dispersion.")
The value of $m$ tells you the strength of the harmonic restoring force at each point, relative to the strength of neighbor coupling.
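As a rough numerical sketch of this picture (my own illustration, not part of the argument above), one can chop the string into segments and apply the discretized equation of motion to a sinusoidal mode. The function name `mode_frequency_squared`, the periodic boundary condition, and the particular finite-difference form are all assumptions made for the example. For long wavelengths the measured $\omega^2$ should approach $k^2 + m^2$.

```python
import math


def mode_frequency_squared(n_sites, spacing, m, mode):
    """Return omega^2 of a sinusoidal mode on a periodic chain of string segments.

    Assumed (illustrative) discretization of (d_t^2 - d_x^2 + m^2) phi = 0:
        phi_j'' = (phi_{j+1} - 2 phi_j + phi_{j-1}) / spacing^2 - m^2 phi_j
    """
    # Wavenumber allowed by periodic boundaries on a ring of length n_sites * spacing.
    k = 2 * math.pi * mode / (n_sites * spacing)
    phi = [math.cos(k * j * spacing) for j in range(n_sites)]

    def acceleration(j):
        left = phi[(j - 1) % n_sites]
        right = phi[(j + 1) % n_sites]
        neighbor_coupling = (right - 2 * phi[j] + left) / spacing**2
        restoring_force = -m**2 * phi[j]  # harmonic pull back to equilibrium
        return neighbor_coupling + restoring_force

    # A normal mode obeys acceleration = -omega^2 * phi at every site;
    # site 0 is convenient because phi[0] = cos(0) = 1.
    return -acceleration(0) / phi[0]


if __name__ == "__main__":
    n_sites, spacing = 1000, 0.01
    k = 2 * math.pi / (n_sites * spacing)  # longest nonzero wavelength on the ring
    for m in (0.0, 2.0):
        lattice = mode_frequency_squared(n_sites, spacing, m, mode=1)
        continuum = k**2 + m**2
        print(f"m={m}: lattice omega^2={lattice:.6f}, continuum k^2+m^2={continuum:.6f}")
```

The uniform mode (`mode=0`) has no neighbor-coupling contribution at all, so its frequency is set purely by the restoring force: $\omega = m$.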
Okay, so why "massive" and "massless"? A few reasons.
1. Look at the dispersion relation $\omega = \sqrt{k^2 + m^2}$. In quantum mechanics $\omega \sim E$ and $k \sim p$, roughly speaking. Translating, the dispersion relation reads $E = \sqrt{p^2 + m^2}$, which is the relativistic energy of a particle with rest mass $m$.
2. Normalized wavepackets have a minimum total energy $m$. (This might not be strictly true, but the idea is right; I didn't work out a proof. The point is that in Fourier space, at a fixed time, you're summing up energies related to $\omega(k) \geq m$.)
3. The group velocity of every wavepacket is $c$ (of course $c=1$ here) if $m=0$. If $m>0$, all wavepackets have group velocity less than $c$. In the massive case, low-energy normalized wavepackets just sit still (all "rest mass" energy, no kinetic energy), whereas very energetic normalized wavepackets move at almost $c$ (high kinetic energy).
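For completeness, here is the standard short calculation behind the dispersion relation and the group-velocity statements (a filled-in step, with $c=1$ as above). Inserting a plane wave into the equation gives
$$ (\partial_t^2 - \partial_x^2 + m^2) \, e^{i(kx - \omega t)} = (-\omega^2 + k^2 + m^2) \, e^{i(kx - \omega t)} = 0 \quad \Longrightarrow \quad \omega = \sqrt{k^2 + m^2} , $$
taking the positive root. The group velocity is then
$$ v_g = \frac{d\omega}{dk} = \frac{k}{\sqrt{k^2 + m^2}} , $$
so $|v_g| = 1$ for $m = 0$, while for $m > 0$ one has $|v_g| < 1$, with $v_g \to 0$ as $k \to 0$ and $|v_g| \to 1$ as $|k| \to \infty$.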
When you go quantum, properties 2 and 3 of classical wavepackets basically translate to the corresponding properties of quantum excitations.
So basically the answer to your second question is: because the KG dispersion relation corresponds to the relativistic energy equation for a particle of rest mass $m$, and the associated wavepacket dynamics agrees with the analogy as well.
I'm sure there are many more ways to think about this, some mathematically more rigorous, but I think they're all fundamentally related to that basic fact and the properties above.
The Schr$\ddot{\rm o}$dinger equation is non-relativistic and for a free particle is derived from the Hamiltonian
\begin{equation}
H\boldsymbol{=} \dfrac{p^2}{2m}
\tag{K-01}\label{eqK-01}
\end{equation}
by the transcription
\begin{equation}
H\boldsymbol{\longrightarrow} i\hbar\dfrac{\partial}{\partial t}\quad \text{and}\quad \mathbf{p}\boldsymbol{\longrightarrow} \boldsymbol{-}i\hbar\boldsymbol{\nabla}
\tag{K-02}\label{eqK-02}
\end{equation}
so that
\begin{equation}
i\hbar \dfrac{\partial \psi}{\partial t}\boldsymbol{+}\dfrac{\hbar^2}{2m}\nabla^2\psi\boldsymbol{=} 0
\tag{K-03}\label{eqK-03}
\end{equation}
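As a quick consistency check (a sketch added for illustration, not part of the original derivation), one may insert a plane wave with energy $\;E\;$ and momentum $\;\mathbf{p}\;$ into \eqref{eqK-03}:
\begin{equation}
\psi\boldsymbol{=}e^{i\left(\mathbf{p}\boldsymbol{\cdot}\mathbf{x}\boldsymbol{-}Et\right)/\hbar}\quad \boldsymbol{\Longrightarrow}\quad i\hbar \dfrac{\partial \psi}{\partial t}\boldsymbol{+}\dfrac{\hbar^2}{2m}\nabla^2\psi\boldsymbol{=}\left(E\boldsymbol{-}\dfrac{p^2}{2m}\right)\psi\boldsymbol{=}0
\nonumber
\end{equation}
which returns $\;E\boldsymbol{=}p^2/2m$, that is, the Hamiltonian \eqref{eqK-01}.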
As a first attempt at deriving a relativistic quantum mechanical equation, we use the fact that, according to the theory of special relativity, the total energy $\;E\;$ and momenta $\;(p_x,p_y,p_z)\;$ transform as components of a contravariant four-vector
\begin{equation}
p^\mu\boldsymbol{=}\left(p^0,p^1,p^2,p^3\right)\boldsymbol{=}\left(\dfrac{E}{c},p_x,p_y,p_z\right)
\tag{K-04}\label{eqK-04}
\end{equation}
of invariant length
\begin{equation}
\sum\limits_{\mu\boldsymbol{=}0}^{3}p_{\mu} p^{\mu}\boldsymbol{\equiv}p_{\mu} p^{\mu}\boldsymbol{=}\dfrac{E^2}{c^2}\boldsymbol{-}\mathbf{p}\boldsymbol{\cdot}\mathbf{p}\boldsymbol{\equiv}m^2c^2\tag{K-05}\label{eqK-05}
\end{equation}
where $\;m\;$ is the rest mass of the particle and $\;c\;$ the velocity of light in vacuum.
Following this it is natural to take as the Hamiltonian of a relativistic free particle
\begin{equation}
H\boldsymbol{=}\sqrt{p^{2}c^2\boldsymbol{+}m^2c^4}
\tag{K-06}\label{eqK-06}
\end{equation}
and to write, as a relativistic quantum analogue of \eqref{eqK-03},
\begin{equation}
i\hbar \dfrac{\partial \psi}{\partial t}\boldsymbol{=}\sqrt{\boldsymbol{-}\hbar^2c^2 \nabla^{2}\boldsymbol{+}m^2c^4}\,\psi
\tag{K-07}\label{eqK-07}
\end{equation}
Facing the problem of interpreting the square root operator on the right-hand side of eq. \eqref{eqK-07}, we simplify the mathematics by removing it, that is, by applying the operator on each side twice, so that
\begin{equation}
\left[\dfrac{1}{c^2}\dfrac{\partial^2}{\partial t^2}\boldsymbol{-}\nabla^{2}\boldsymbol{+}\left(\dfrac{mc}{\hbar}\vphantom{\dfrac{\partial^2 \psi}{\partial t^2}}\right)^2\right]\psi\boldsymbol{=}0
\tag{K-08}\label{eqK-08}
\end{equation}
or, in a form recognizable as the classical wave equation with an additional mass term,
\begin{equation}
\left[\square\boldsymbol{+}\left(\dfrac{mc}{\hbar}\right)^2\right]\psi\boldsymbol{=}0
\tag{K-09}\label{eqK-09}
\end{equation}
where(1)
\begin{equation}
\square\boldsymbol{\equiv}\dfrac{1}{c^2}\dfrac{\partial^2}{\partial t^2}\boldsymbol{-}\nabla^{2}\boldsymbol{=}\dfrac{\partial}{\partial x_\mu}\dfrac{\partial}{\partial x^\mu}
\tag{K-10}\label{eqK-10}
\end{equation}
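The step from \eqref{eqK-07} to \eqref{eqK-08} can be spelled out as follows (a filled-in intermediate step). The square root operator does not depend on time, so it commutes with $\;\partial/\partial t$, and applying $\;i\hbar\,\partial/\partial t\;$ to \eqref{eqK-07} once more gives
\begin{equation}
\left(i\hbar \dfrac{\partial}{\partial t}\right)^2\psi\boldsymbol{=}\left(\boldsymbol{-}\hbar^2c^2 \nabla^{2}\boldsymbol{+}m^2c^4\right)\psi\quad \boldsymbol{\Longrightarrow}\quad \boldsymbol{-}\hbar^2\dfrac{\partial^2 \psi}{\partial t^2}\boldsymbol{=}\boldsymbol{-}\hbar^2c^2 \nabla^{2}\psi\boldsymbol{+}m^2c^4\psi
\nonumber
\end{equation}
Dividing by $\;\hbar^2c^2\;$ and collecting all terms on one side yields \eqref{eqK-08}.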
Equation \eqref{eqK-09} is the Klein-Gordon equation for a free particle. With its complex conjugate we have
\begin{align}
& \dfrac{1}{c^2}\dfrac{\partial^2 \psi\hphantom{^{\boldsymbol{*}}}}{\partial t^2}\boldsymbol{-}\nabla^{2}\psi\hphantom{^{\boldsymbol{*}}}\boldsymbol{+}\left(\dfrac{mc}{\hbar}\vphantom{\dfrac{\partial^2 \psi}{\partial t^2}}\right)^2\psi\hphantom{^{\boldsymbol{*}}}\boldsymbol{=} 0
\tag{K-11.1}\label{eqK-11.1}\\
&\dfrac{1}{c^2}\dfrac{\partial^2 \psi^{\boldsymbol{*}}}{\partial t^2}\boldsymbol{-}\nabla^{2}\psi^{\boldsymbol{*}}\boldsymbol{+}\left(\dfrac{mc}{\hbar}\vphantom{\dfrac{\partial^2 \psi}{\partial t^2}}\right)^2\psi^{\boldsymbol{*}}\boldsymbol{=} 0
\tag{K-11.2}\label{eqK-11.2}
\end{align}
Multiplying them by $\;\psi^{\boldsymbol{*}},\psi\;$ respectively, and subtracting the second from the first, we have(2)
\begin{align}
\dfrac{1}{c^2}\left(\psi^{\boldsymbol{*}}\dfrac{\partial^2 \psi}{\partial t^2}\boldsymbol{-}\psi\dfrac{\partial^2 \psi^{\boldsymbol{*}}}{\partial t^2}\right)\boldsymbol{-}\left(\psi^{\boldsymbol{*}}\nabla^{2}\psi\boldsymbol{-}\psi\nabla^{2}\psi^{\boldsymbol{*}}\vphantom{\dfrac{\partial^2 \psi}{\partial t^2}}\right)&\boldsymbol{=} 0\quad \boldsymbol{\Longrightarrow}
\nonumber\\
\dfrac{1}{c^2}\dfrac{\partial}{\partial t}\left(\psi^{\boldsymbol{*}}\dfrac{\partial \psi}{\partial t}\boldsymbol{-}\psi\dfrac{\partial \psi^{\boldsymbol{*}}}{\partial t}\right)\boldsymbol{+}\boldsymbol{\nabla \cdot}\left(\psi\boldsymbol{\nabla }\psi^{\boldsymbol{*}}\boldsymbol{-}\psi^{\boldsymbol{*}}\boldsymbol{\nabla }\psi\vphantom{\dfrac{\partial^2 \psi}{\partial t^2}}\right)&\boldsymbol{=} 0
\tag{K-12}\label{eqK-12}
\end{align}
We multiply the above equation by $\;i\hbar/2m\;$, both to obtain real quantities and to get an expression for the probability current density vector identical to the one from the Schr$\ddot{\rm o}$dinger equation
\begin{equation}
\dfrac{\partial}{\partial t}\left[\dfrac{i\hbar}{2mc^2}\left(\psi^{\boldsymbol{*}}\dfrac{\partial \psi}{\partial t}\boldsymbol{-}\psi\dfrac{\partial \psi^{\boldsymbol{*}}}{\partial t}\right)\right]\boldsymbol{+}\boldsymbol{\nabla \cdot}\left[\dfrac{i\hbar}{2m}\left(\psi\boldsymbol{\nabla }\psi^{\boldsymbol{*}}\boldsymbol{-}\psi^{\boldsymbol{*}}\boldsymbol{\nabla }\psi\vphantom{\dfrac{\partial^2 \psi}{\partial t^2}}\right)\right]\boldsymbol{=} 0
\tag{K-13}\label{eqK-13}
\end{equation}
so
\begin{equation}
\dfrac{\partial \varrho}{\partial t}\boldsymbol{+}\boldsymbol{\nabla \cdot}\boldsymbol{S}\boldsymbol{=} 0
\tag{K-14}\label{eqK-14}
\end{equation}
where
\begin{equation}
\boxed{\:\:\varrho\boldsymbol{\equiv}\dfrac{i\hbar}{2mc^2}\left(\psi^{\boldsymbol{*}}\dfrac{\partial \psi}{\partial t}\boldsymbol{-}\psi\dfrac{\partial \psi^{\boldsymbol{*}}}{\partial t}\right)\:\:}\quad \text{and} \quad \boxed{\:\:\boldsymbol{S}\boldsymbol{\equiv}\dfrac{i\hbar}{2m}\left(\psi\boldsymbol{\nabla }\psi^{\boldsymbol{*}}\boldsymbol{-}\psi^{\boldsymbol{*}}\boldsymbol{\nabla }\psi\vphantom{\dfrac{\partial^2 \psi}{\partial t^2}}\right)\:\:}
\tag{K-15}\label{eqK-15}
\end{equation}
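We would like to interpret $\dfrac{i\hbar}{2mc^2}\left(\psi^{\boldsymbol{*}}\dfrac{\partial \psi}{\partial t}\boldsymbol{-}\psi\dfrac{\partial \psi^{\boldsymbol{*}}}{\partial t}\right)$ as a probability density $\varrho$. However, this is impossible, since the expression is not positive definite: the Klein-Gordon equation is second order in time, so $\psi$ and $\partial \psi/\partial t$ can be prescribed independently at an initial instant, and $\varrho$ can take either sign.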
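A plane-wave example makes the sign problem explicit (added here as an illustration; the labels $\;\psi_{\pm}\;$ and $\;E_p\;$ are introduced only for this check). Both
\begin{equation}
\psi_{\pm}\boldsymbol{=}e^{i\left(\mathbf{p}\boldsymbol{\cdot}\mathbf{x}\boldsymbol{\mp}E_p t\right)/\hbar}\,,\qquad E_p\boldsymbol{\equiv}\boldsymbol{+}\sqrt{p^{2}c^2\boldsymbol{+}m^2c^4}
\nonumber
\end{equation}
solve \eqref{eqK-09}, and since $\;\partial \psi_{\pm}/\partial t\boldsymbol{=}\boldsymbol{\mp}\left(iE_p/\hbar\right)\psi_{\pm}\;$ the density in \eqref{eqK-15} evaluates to
\begin{equation}
\varrho_{\pm}\boldsymbol{=}\dfrac{i\hbar}{2mc^2}\left(\boldsymbol{\mp}\dfrac{2iE_p}{\hbar}\right)\vert\psi_{\pm}\vert^2\boldsymbol{=}\boldsymbol{\pm}\dfrac{E_p}{mc^2}\vert\psi_{\pm}\vert^2
\nonumber
\end{equation}
So the solutions with time dependence $\;e^{\boldsymbol{+}iE_p t/\hbar}\;$ carry a negative $\;\varrho$, while for the other branch $\;\varrho\;$ reduces to $\;\vert\psi\vert^2\;$ in the non-relativistic limit $\;E_p\boldsymbol{\approx}mc^2$.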
(1)
We define
\begin{align}
\blacktriangleright x^\mu\boldsymbol{=}\left(ct,\mathbf{x}\right)&\blacktriangleright \nabla^\mu\boldsymbol{=}\partial^\mu\boldsymbol{=}\dfrac{\partial}{\partial x_\mu}\boldsymbol{=}\left(\dfrac{1}{c}\dfrac{\partial}{\partial t},\boldsymbol{-}\boldsymbol{\nabla}\right)
\nonumber\\
&\blacktriangleright \nabla_\mu\boldsymbol{=}\partial_\mu\boldsymbol{=}\dfrac{\partial}{\partial x^\mu}\boldsymbol{=}\left(\dfrac{1}{c}\dfrac{\partial}{\partial t},\boldsymbol{+}\boldsymbol{\nabla}\right)\blacktriangleright\square \boldsymbol{=}\nabla^\mu\nabla_\mu \boldsymbol{=}\partial^\mu\partial_\mu \boldsymbol{=}\dfrac{\partial}{\partial x_\mu}\dfrac{\partial}{\partial x^\mu}
\nonumber
\end{align}
(2)
If $\;\psi\;$ and $\;\mathbf{a}\;$ are a scalar and a vector function on $\;\mathbb{R}^{3}$, respectively, then
\begin{equation}
\boldsymbol{\nabla \cdot}\left(\psi\mathbf{a}\right)\boldsymbol{=}\mathbf{a}\boldsymbol{\cdot}\boldsymbol{\nabla}\psi\boldsymbol{+}\psi\boldsymbol{\nabla \cdot}\mathbf{a}
\nonumber
\end{equation}
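Applied to the step leading to \eqref{eqK-12}, with $\;\mathbf{a}\boldsymbol{=}\boldsymbol{\nabla}\psi^{\boldsymbol{*}}\;$ and then with the roles of $\;\psi\;$ and $\;\psi^{\boldsymbol{*}}\;$ exchanged, this identity gives (a filled-in intermediate step)
\begin{align}
\boldsymbol{\nabla \cdot}\left(\psi\boldsymbol{\nabla}\psi^{\boldsymbol{*}}\right)&\boldsymbol{=}\boldsymbol{\nabla}\psi^{\boldsymbol{*}}\boldsymbol{\cdot}\boldsymbol{\nabla}\psi\boldsymbol{+}\psi\nabla^{2}\psi^{\boldsymbol{*}}
\nonumber\\
\boldsymbol{\nabla \cdot}\left(\psi^{\boldsymbol{*}}\boldsymbol{\nabla}\psi\right)&\boldsymbol{=}\boldsymbol{\nabla}\psi\boldsymbol{\cdot}\boldsymbol{\nabla}\psi^{\boldsymbol{*}}\boldsymbol{+}\psi^{\boldsymbol{*}}\nabla^{2}\psi
\nonumber
\end{align}
Subtracting the second line from the first, the gradient products cancel and
\begin{equation}
\boldsymbol{\nabla \cdot}\left(\psi\boldsymbol{\nabla}\psi^{\boldsymbol{*}}\boldsymbol{-}\psi^{\boldsymbol{*}}\boldsymbol{\nabla}\psi\right)\boldsymbol{=}\boldsymbol{-}\left(\psi^{\boldsymbol{*}}\nabla^{2}\psi\boldsymbol{-}\psi\nabla^{2}\psi^{\boldsymbol{*}}\right)
\nonumber
\end{equation}
which is the spatial term in \eqref{eqK-12}.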
Best Answer
Every component of the metric is dimensionless if you use rectilinear coordinates. $g_{22}$ and $g_{33}$ only have dimensions if you are using curvilinear coordinates (probably spherical, in this case). In that case, $\partial_2$ and $\partial_3$ also have correspondingly different dimensions from $\partial_0$ and $\partial_1$.
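As an illustration, assume spherical coordinates ordered as $x^\mu = (ct, r, \theta, \phi)$ with signature $(+,-,-,-)$; both the ordering and the signature are assumptions made for this example. The flat-space line element is then
$$ ds^2 = c^2 dt^2 - dr^2 - r^2 d\theta^2 - r^2 \sin^2\theta \, d\phi^2 , $$
so $g_{00} = 1$ and $g_{11} = -1$ are dimensionless, while $g_{22} = -r^2$ and $g_{33} = -r^2 \sin^2\theta$ have dimensions of length squared. Correspondingly, $\partial_0$ and $\partial_1$ have dimensions of inverse length, whereas $\partial_2 = \partial/\partial\theta$ and $\partial_3 = \partial/\partial\phi$ are dimensionless. A combination such as $g^{22} \partial_2 \partial_2 = -r^{-2} \, \partial_\theta^2$ therefore has the same dimensions (inverse length squared) as $g^{00} \partial_0 \partial_0$ and $g^{11} \partial_1 \partial_1$.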