Paul,
I have always thought that the way the article states this problem is sloppy as well. The most confusing part of the discussion is the statement "The continuity equation is as before". At first one writes the continuity equation as:
$$\nabla \cdot J + \dfrac{\partial\rho}{\partial t} = 0$$
Although the del operator can be defined in any number of dimensions, it is usually reserved for three, so the sentence does not have a clear interpretation. If you look up conserved current, you find the 4-vector version of the continuity equation:
$$\partial_\mu j^\mu = 0$$
What is important about the derivation in the Wikipedia article is the replacement of a density that contains no time derivative by one that does, or rather:
$$\rho = \psi^*\psi$$
becomes
$$\rho = \dfrac{i\hbar}{2m}(\psi^*\partial_t\psi - \psi\partial_t\psi^*)$$
The intent is clear: they want to make the time component have the same form as the space components. The equation for the current is now:
$$J^\mu = \dfrac{i\hbar}{2m}(\psi^*\partial^\mu\psi - \psi\partial^\mu\psi^*)$$
which now contains the time component. So the continuity equation that should be used is:
$$\partial_\mu J^\mu = 0$$
where the capitalization of $J$ appears to be an arbitrary choice in the derivation.
One can verify that this is the intent by referring to the article on probability current.
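As a short check of this reading (a sketch, assuming the metric signature $(+,-,-,-)$ and $x^0 = ct$, neither of which the article states explicitly), the time component of $J^\mu$ is $c$ times the density once the factors of $c$ are restored:
$$J^0 = \dfrac{i\hbar}{2m}(\psi^*\partial^0\psi - \psi\partial^0\psi^*) = \dfrac{i\hbar}{2mc}(\psi^*\partial_t\psi - \psi\partial_t\psi^*) = c\rho, \qquad \rho = \dfrac{i\hbar}{2mc^2}(\psi^*\partial_t\psi - \psi\partial_t\psi^*)$$
so the 4-vector equation splits back into the familiar form, with $J$ now denoting the spatial part of the current:
$$\partial_\mu J^\mu = \dfrac{1}{c}\dfrac{\partial J^0}{\partial t} + \nabla \cdot J = \dfrac{\partial\rho}{\partial t} + \nabla \cdot J = 0$$
For $c = 1$ this $\rho$ reduces to the expression for the density given above.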
From the above I can see that the sudden insertion of the statement that one can arbitrarily pick $\psi$ and $\dfrac{\partial \psi}{\partial t}$ isn't well explained. This part of the article was a source of confusion for me as well, until I realized that the author was trying to get to a discussion of the Klein-Gordon equation.
A quick web search for "probability current and Klein-Gordon equation" finds good links, including a good one from the physics department at UC Davis. If you follow the discussion in that paper, you can see it confirms that the argument is really leading up to the Klein-Gordon equation and its connection to probability density.
Now, if one does another quick search for "negative solutions to the Klein-Gordon equation", one can find a nice paper from the physics department of Ohio University. The discussion around equation 3.13 in that paper reiterates that redefining the density introduced some additional variability, since the density now depends on $\partial_t\psi$ as well as on $\psi$. So the equation:
$$\rho = \dfrac{i\hbar}{2mc^2}(\psi^*\partial_t\psi - \psi\partial_t\psi^*)$$
(where in the original $c$ was set to 1)
really is at the root of the problem (confirming the intent in the original article). However, it probably still doesn't answer the question,
"can anyone show me why the expression for density not positive definite?",
but if one goes on a little shopping spree one can find the book Quantum Field Theory Demystified by David McMahon (there are some free downloads out there, but I won't link to them out of respect for the author). On page 116 you will find the discussion:
Remembering the free particle solution
$$\varphi(\vec{x},t) = e^{-ip\cdot x} = e^{-i(Et- px)}$$
the time derivatives are
$$\dfrac{\partial\varphi}{\partial t} = -iEe^{-i(Et- px)}$$
$$\dfrac{\partial\varphi^*}{\partial t} = iEe^{i(Et- px)}$$
We have
$$\varphi^*\dfrac{\partial\varphi}{\partial t} = e^{i(Et- px)}[-iEe^{-i(Et- px)}] = -iE$$
$$\varphi\dfrac{\partial\varphi^*}{\partial t} = e^{-i(Et- px)}[iEe^{i(Et- px)}] = iE$$
So the probability density is
$$\rho = i\left(\varphi^*\dfrac{\partial\varphi}{\partial t} - \varphi\dfrac{\partial\varphi^*}{\partial t}\right) = i(-iE-iE) = 2E$$
Looks good so far-except for those pesky negative energy solutions. Remember that
$$E = \pm\sqrt{p^2+m^2}$$
In the case of the negative energy solution
$$\rho = 2E =-2\sqrt{p^2+m^2}<0$$
which is a negative probability density, something which simply does not make sense.
Hopefully that helps. The notion of a negative probability does not make sense because probability is defined on the interval [0,1], so by definition negative probabilities have no meaning. This point is sometimes lost on people when they try to make sense of things, but logically any discussion of negative probabilities is nonsense. This is why QFT ended up reinterpreting the Klein-Gordon equation and repurposing it as an equation for field operators built from creation and annihilation operators.
The Schrödinger equation is non-relativistic and, for a free particle, is derived from the Hamiltonian
\begin{equation}
H\boldsymbol{=} \dfrac{p^2}{2m}
\tag{K-01}\label{eqK-01}
\end{equation}
by the transcription
\begin{equation}
H\boldsymbol{\longrightarrow} i\hbar\dfrac{\partial}{\partial t}\quad \text{and}\quad \mathbf{p}\boldsymbol{\longrightarrow} \boldsymbol{-}i\hbar\boldsymbol{\nabla}
\tag{K-02}\label{eqK-02}
\end{equation}
so that
\begin{equation}
i\hbar \dfrac{\partial \psi}{\partial t}\boldsymbol{+}\dfrac{\hbar^2}{2m}\nabla^2\psi\boldsymbol{=} 0
\tag{K-03}\label{eqK-03}
\end{equation}
As a first attempt at deriving a relativistic quantum mechanical equation, we use the fact that, according to the theory of special relativity, the total energy $\;E\;$ and momenta $\;(p_x,p_y,p_z)\;$ transform as components of a contravariant four-vector
\begin{equation}
p^\mu\boldsymbol{=}\left(p^0,p^1,p^2,p^3\right)\boldsymbol{=}\left(\dfrac{E}{c},p_x,p_y,p_z\right)
\tag{K-04}\label{eqK-04}
\end{equation}
of invariant length
\begin{equation}
\sum\limits_{\mu\boldsymbol{=}0}^{3}p_{\mu} p^{\mu}\boldsymbol{\equiv}p_{\mu} p^{\mu}\boldsymbol{=}\dfrac{E^2}{c^2}\boldsymbol{-}\mathbf{p}\boldsymbol{\cdot}\mathbf{p}\boldsymbol{\equiv}m^2c^2\tag{K-05}\label{eqK-05}
\end{equation}
where $\;m\;$ is the rest mass of the particle and $\;c\;$ the velocity of light in vacuum.
Following this it is natural to take as the Hamiltonian of a relativistic free particle
\begin{equation}
H\boldsymbol{=}\sqrt{p^{2}c^2\boldsymbol{+}m^2c^4}
\tag{K-06}\label{eqK-06}
\end{equation}
and to write for a relativistic quantum analogue of \eqref{eqK-03}
\begin{equation}
i\hbar \dfrac{\partial \psi}{\partial t}\boldsymbol{=}\sqrt{\boldsymbol{-}\hbar^2c^2 \nabla^{2}\boldsymbol{+}m^2c^4}\,\psi
\tag{K-07}\label{eqK-07}
\end{equation}
Facing the problem of interpreting the square root operator on the right-hand side of eq. \eqref{eqK-07}, we simplify the mathematics by removing this square root operator, so that
\begin{equation}
\left[\dfrac{1}{c^2}\dfrac{\partial^2}{\partial t^2}\boldsymbol{-}\nabla^{2}\boldsymbol{+}\left(\dfrac{mc}{\hbar}\vphantom{\dfrac{\partial^2 \psi}{\partial t^2}}\right)^2\right]\psi\boldsymbol{=}0
\tag{K-08}\label{eqK-08}
\end{equation}
or, written in the form of the classical wave equation,
\begin{equation}
\left[\square\boldsymbol{+}\left(\dfrac{mc}{\hbar}\right)^2\right]\psi\boldsymbol{=}0
\tag{K-09}\label{eqK-09}
\end{equation}
where(1)
\begin{equation}
\square\boldsymbol{\equiv}\dfrac{1}{c^2}\dfrac{\partial^2}{\partial t^2}\boldsymbol{-}\nabla^{2}\boldsymbol{=}\dfrac{\partial}{\partial x_\mu}\dfrac{\partial}{\partial x^\mu}
\tag{K-10}\label{eqK-10}
\end{equation}
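One way to make the removal of the square root in \eqref{eqK-07} explicit is sketched below. It assumes that the operator on each side of \eqref{eqK-07} may simply be applied a second time to $\;\psi\;$, which is the usual textbook shortcut rather than something justified above:
\begin{equation}
\left(i\hbar\dfrac{\partial}{\partial t}\right)^{2}\psi\boldsymbol{=}\left(\boldsymbol{-}\hbar^2c^2\nabla^{2}\boldsymbol{+}m^2c^4\right)\psi\quad \boldsymbol{\Longrightarrow}\quad \boldsymbol{-}\hbar^2\dfrac{\partial^2 \psi}{\partial t^2}\boldsymbol{=}\boldsymbol{-}\hbar^2c^2\nabla^{2}\psi\boldsymbol{+}m^2c^4\psi
\nonumber
\end{equation}
Dividing by $\;\boldsymbol{-}\hbar^2c^2\;$ and collecting all terms on the left gives \eqref{eqK-08}. The price of this step is that the squared equation also admits solutions with $\;E\boldsymbol{=}\boldsymbol{-}\sqrt{p^{2}c^2\boldsymbol{+}m^2c^4}\;$, which \eqref{eqK-07} excluded.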
Equation \eqref{eqK-09} is the Klein-Gordon equation for a free particle. With its complex conjugate we have
\begin{align}
& \dfrac{1}{c^2}\dfrac{\partial^2 \psi\hphantom{^{\boldsymbol{*}}}}{\partial t^2}\boldsymbol{-}\nabla^{2}\psi\hphantom{^{\boldsymbol{*}}}\boldsymbol{+}\left(\dfrac{mc}{\hbar}\vphantom{\dfrac{\partial^2 \psi}{\partial t^2}}\right)^2\psi\hphantom{^{\boldsymbol{*}}}\boldsymbol{=} 0
\tag{K-11.1}\label{eqK-11.1}\\
&\dfrac{1}{c^2}\dfrac{\partial^2 \psi^{\boldsymbol{*}}}{\partial t^2}\boldsymbol{-}\nabla^{2}\psi^{\boldsymbol{*}}\boldsymbol{+}\left(\dfrac{mc}{\hbar}\vphantom{\dfrac{\partial^2 \psi}{\partial t^2}}\right)^2\psi^{\boldsymbol{*}}\boldsymbol{=} 0
\tag{K-11.2}\label{eqK-11.2}
\end{align}
Multiplying \eqref{eqK-11.1} by $\;\psi^{\boldsymbol{*}}\;$ and \eqref{eqK-11.2} by $\;\psi\;$ and subtracting side by side we have (2)
\begin{align}
\dfrac{1}{c^2}\left(\psi^{\boldsymbol{*}}\dfrac{\partial^2 \psi}{\partial t^2}\boldsymbol{-}\psi\dfrac{\partial^2 \psi^{\boldsymbol{*}}}{\partial t^2}\right)\boldsymbol{-}\left(\psi^{\boldsymbol{*}}\nabla^{2}\psi\boldsymbol{-}\psi\nabla^{2}\psi^{\boldsymbol{*}}\vphantom{\dfrac{\partial^2 \psi}{\partial t^2}}\right)&\boldsymbol{=} 0\quad \boldsymbol{\Longrightarrow}
\nonumber\\
\dfrac{1}{c^2}\dfrac{\partial}{\partial t}\left(\psi^{\boldsymbol{*}}\dfrac{\partial \psi}{\partial t}\boldsymbol{-}\psi\dfrac{\partial \psi^{\boldsymbol{*}}}{\partial t}\right)\boldsymbol{+}\boldsymbol{\nabla \cdot}\left(\psi\boldsymbol{\nabla }\psi^{\boldsymbol{*}}\boldsymbol{-}\psi^{\boldsymbol{*}}\boldsymbol{\nabla }\psi\vphantom{\dfrac{\partial^2 \psi}{\partial t^2}}\right)&\boldsymbol{=} 0
\tag{K-12}\label{eqK-12}
\end{align}
We multiply the above equation by $\;i\hbar/2m\;$, on the one hand to obtain real quantities and on the other hand to obtain the same expression for the probability current density vector as the one from the Schrödinger equation
\begin{equation}
\dfrac{\partial}{\partial t}\left[\dfrac{i\hbar}{2mc^2}\left(\psi^{\boldsymbol{*}}\dfrac{\partial \psi}{\partial t}\boldsymbol{-}\psi\dfrac{\partial \psi^{\boldsymbol{*}}}{\partial t}\right)\right]\boldsymbol{+}\boldsymbol{\nabla \cdot}\left[\dfrac{i\hbar}{2m}\left(\psi\boldsymbol{\nabla }\psi^{\boldsymbol{*}}\boldsymbol{-}\psi^{\boldsymbol{*}}\boldsymbol{\nabla }\psi\vphantom{\dfrac{\partial^2 \psi}{\partial t^2}}\right)\right]\boldsymbol{=} 0
\tag{K-13}\label{eqK-13}
\end{equation}
so
\begin{equation}
\dfrac{\partial \varrho}{\partial t}\boldsymbol{+}\boldsymbol{\nabla \cdot}\boldsymbol{S}\boldsymbol{=} 0
\tag{K-14}\label{eqK-14}
\end{equation}
where
\begin{equation}
\boxed{\:\:\varrho\boldsymbol{\equiv}\dfrac{i\hbar}{2mc^2}\left(\psi^{\boldsymbol{*}}\dfrac{\partial \psi}{\partial t}\boldsymbol{-}\psi\dfrac{\partial \psi^{\boldsymbol{*}}}{\partial t}\right)\:\:}\quad \text{and} \quad \boxed{\:\:\boldsymbol{S}\boldsymbol{\equiv}\dfrac{i\hbar}{2m}\left(\psi\boldsymbol{\nabla }\psi^{\boldsymbol{*}}\boldsymbol{-}\psi^{\boldsymbol{*}}\boldsymbol{\nabla }\psi\vphantom{\dfrac{\partial^2 \psi}{\partial t^2}}\right)\:\:}
\tag{K-15}\label{eqK-15}
\end{equation}
We would like to interpret $\dfrac{i\hbar}{2mc^2}\left(\psi^{\boldsymbol{*}}\dfrac{\partial \psi}{\partial t}\boldsymbol{-}\psi\dfrac{\partial \psi^{\boldsymbol{*}}}{\partial t}\right)$ as a probability density $\varrho$. However, this is impossible, since it is not a positive definite expression.
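To see the sign problem concretely, here is a minimal numerical sketch (not part of the derivation above) that evaluates the $\varrho$ of \eqref{eqK-15} on free plane waves $\psi = e^{-i(Et - px)/\hbar}$ in one space dimension, once for each sign of $E = \pm\sqrt{p^2c^2+m^2c^4}$. The function names `kg_density` and `plane_wave`, the units $\hbar = c = m = 1$, the sample point, and the central-difference time derivative are all assumptions made for illustration.

```python
import cmath
import math


def plane_wave(E, p, hbar=1.0):
    """Return psi(t, x) = exp(-i (E t - p x) / hbar) for one space dimension."""
    return lambda t, x: cmath.exp(-1j * (E * t - p * x) / hbar)


def kg_density(psi, t, x, m=1.0, hbar=1.0, c=1.0, dt=1e-6):
    """Evaluate rho = (i hbar / (2 m c^2)) (psi* d_t psi - psi d_t psi*).

    The time derivative is approximated by a central difference
    (an illustrative choice, not part of the original derivation).
    """
    dpsi_dt = (psi(t + dt, x) - psi(t - dt, x)) / (2 * dt)
    psi0 = psi(t, x)
    # For real t, the time derivative of psi* is the conjugate of d_t psi.
    rho = 1j * hbar / (2 * m * c**2) * (
        psi0.conjugate() * dpsi_dt - psi0 * dpsi_dt.conjugate()
    )
    return rho.real  # the imaginary part vanishes identically


# Illustrative values in units hbar = c = 1 (assumed, not from the text).
m, p = 1.0, 0.5
E = math.sqrt(p**2 + m**2)

rho_pos = kg_density(plane_wave(+E, p), t=0.3, x=0.7, m=m)
rho_neg = kg_density(plane_wave(-E, p), t=0.3, x=0.7, m=m)

print("positive-energy density:", rho_pos)
print("negative-energy density:", rho_neg)
```

For a plane wave the exact value is $\varrho = E/(mc^2)$, so the positive-energy wave gives a positive density and the negative-energy wave gives a negative one of the same magnitude.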
(1)
We define
\begin{align}
\blacktriangleright x^\mu\boldsymbol{=}\left(ct,\mathbf{x}\right)&\blacktriangleright \nabla^\mu\boldsymbol{=}\partial^\mu\boldsymbol{=}\dfrac{\partial}{\partial x_\mu}\boldsymbol{=}\left(\dfrac{1}{c}\dfrac{\partial}{\partial t},\boldsymbol{-}\boldsymbol{\nabla}\right)
\nonumber\\
&\blacktriangleright \nabla_\mu\boldsymbol{=}\partial_\mu\boldsymbol{=}\dfrac{\partial}{\partial x^\mu}\boldsymbol{=}\left(\dfrac{1}{c}\dfrac{\partial}{\partial t},\boldsymbol{+}\boldsymbol{\nabla}\right)\blacktriangleright\square \boldsymbol{=}\nabla^\mu\nabla_\mu \boldsymbol{=}\partial^\mu\partial_\mu \boldsymbol{=}\dfrac{\partial}{\partial x_\mu}\dfrac{\partial}{\partial x^\mu}
\nonumber
\end{align}
(2)
If $\;\psi\;$ and $\;\mathbf{a}\;$ are scalar and vector functions in $\;\mathbb{R}^{3}$ then
\begin{equation}
\boldsymbol{\nabla \cdot}\left(\psi\mathbf{a}\right)\boldsymbol{=}\mathbf{a}\boldsymbol{\cdot}\boldsymbol{\nabla}\psi\boldsymbol{+}\psi\boldsymbol{\nabla \cdot}\mathbf{a}
\nonumber
\end{equation}
Best Answer
$\partial_t\equiv\frac\partial{\partial t}$ and $\partial^\mu\equiv g^{\mu\nu}\frac\partial{\partial x^\nu}=\left(\sum_{\nu=0}^3g^{\mu\nu}\frac\partial{\partial x^\nu}\right)_{\mu=0}^3$ are differential operators. $\partial^\mu$ is formally contravariant (upper index) and obeys the corresponding transformation laws. $\partial_t$ has a lower index and is (up to a constant factor) a component of the formally covariant operator $\partial_\mu$ via $\partial_0=\frac1c\partial_t$, which, in general, is not equal to $\partial^0$, the zeroth component of $\partial^\mu$.
The differential operator $\partial^\mu$ is known as the gradient, which derives vector fields from potential functions. The gradient is not a natural operation on arbitrary manifolds and is only available if there's a metric. Its dual $\partial_\mu\equiv\frac\partial{\partial x^\mu}$, on the other hand, is a natural operation corresponding to the differential $\mathrm d$, taking potentials to 1-forms (covector fields).
As a side note, $\partial_t$ can also be understood as a local vector field, as one of the intrinsic definitions of vectors on manifolds is via their directional derivatives. In mathematical literature, it is common to write the basis of the tangent space as $\{\frac\partial{\partial x^\mu}\}$ and its dual space as $\{\mathrm dx^\mu\}$.
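As a concrete instance of the distinction between $\partial_\mu$ and $\partial^\mu$ (a sketch, assuming the flat Minkowski metric $g^{\mu\nu}=\mathrm{diag}(1,-1,-1,-1)$ and $x^0=ct$, a choice the discussion above leaves open):
$$\partial_\mu=\left(\frac1c\partial_t,\ \nabla\right),\qquad \partial^\mu=g^{\mu\nu}\partial_\nu=\left(\frac1c\partial_t,\ -\nabla\right)$$
In this special case $\partial^0=\partial_0$ while $\partial^i=-\partial_i$. For a general metric, $\partial^0=g^{00}\partial_0+g^{0i}\partial_i$, so $\partial^0$ and $\partial_0$ differ whenever $g^{00}\neq1$ or some $g^{0i}\neq0$.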