If the charge is moving its position must change, so the actual time-explicit form of the charge density is:
$\rho\left(\mathbf{r},\,t\right)=q \delta^{(3)}\left(\mathbf{r}-\mathbf{r}_q\left(t\right)\right)$
Next you need the $\dot{\rho}=q\frac{\partial}{\partial t}\delta^{(3)}\left(\mathbf{r}-\mathbf{r}_q\left(t\right)\right)=q\frac{\partial \left(\mathbf{r}-\mathbf{r}_q\left(t\right)\right)^i}{\partial t}\,\frac{\partial}{\partial\left(\mathbf{r}-\mathbf{r}_q\left(t\right)\right)^i}\delta^{(3)}\left(\mathbf{r}-\mathbf{r}_q\left(t\right)\right)$
This is simply a chain rule. Note the sum over the repeated index $i$. Basically all I am saying here is that if you have a function $g=g\left(t\right)$ inside another function $f$ then
$\frac{df\left(g\left(t\right)\right)}{dt}=\dot{g}\left(t\right)\,f'\left(g\left(t\right)\right)$,
but this time instead of a single function $g$ you have three components of the $\mathbf{r}-\mathbf{r}_q\left(t\right)$
The first partial derivative is easy:
$\frac{\partial \left(\mathbf{r}-\mathbf{r}_q\left(t\right)\right)^i}{\partial t}=\frac{d \left(-\mathbf{r}_q\left(t\right)\right)^i}{d t}=-\mathbf{v}^i$
for the second one note that, for any function
$f=f\left(\mathbf{r}-\mathbf{r}_q\right)$
we have
$\frac{\partial f\left(\mathbf{r}-\mathbf{r}_q\right)}{\partial\left(\mathbf{r}-\mathbf{r}_q\left(t\right)\right)^i}=\frac{\partial f\left(\mathbf{r}-\mathbf{r}_q\right)}{\partial \mathbf{r}^i}=\boldsymbol{\nabla}_i f\left(\mathbf{r}-\mathbf{r}_q\right)$.
Now, delta function is not a normal function so taking derivatives of it may create problems, and should instead be handeled using a test function and an integral, but in this case such problems do not arise, so we proceed.
$\dot{\rho}\left(\mathbf{r},\,t\right)=-q \mathbf{v}^i \boldsymbol{\nabla}_i \delta^{(3)}\left(\mathbf{r}-\mathbf{r}_q\left(t\right)\right) = - \boldsymbol{\nabla}.\left(q\mathbf{v} \delta^{(3)}\left(\mathbf{r}-\mathbf{r}_q\right)\right)$
The last step is allowed since velocity does not depend on position (only on time).
Now finally you need to find a current density such that $\dot{\rho}+\boldsymbol{\nabla}.\mathbf{J}=0$
The solution you are trying to achieve is clearly a valid solution given the shape of $\dot{\rho}$. It is not, however unique, and there is not much you can do about it. Some may appeal to Special Relativity to establish uniqueness, but I have usually seen the charge and current density of a single charge being used in prooving that four-current was a four-vector, so that would be a circular argument.
following the comment.
Firstly, a disclaimer. I am not an expert on the theory of delta and other generalized functions, but I have used them a lot, primarily in electromagnetism, so do not trust me blindly, but here it goes.
Basic hand-wavey definition of the delta function would be a function that is zero everywhere except at one point AND the integral of that function is 1. More info here https://en.wikipedia.org/wiki/Dirac_delta_function. Much more info here:
M. J. Lightill, "An Introduction to Fourier Analysis and Generalized Functions" 1964
M. Rahman, "Applications of Fourier Transforms to Generalized Functions" 2011
Delta functions are not continuos so some of the normal notions you have for other functions may not work. The two problems I have seen in my practice are:
(1) Products of delta functions. What is the value of $\delta\left(x-a\right)\delta\left(x-b\right)$? Well, it is not defined. One way to think about a delta-function is that it is a limit of a very narrow Gaussian normalized to give unit integral. The key-word here is LIMIT. Two delta-functions: two limits,you need to define which limit is taken first, or how do these limits relate to each other.
For me, generalized functions are only 'safe' when they are inside an integral. For example $\delta\left(x-a\right)$ is zero everyhere and infinite at $x=a$. But stick it inside the integral with another 'good' function $f$ (continuous, differentialble, square-integrable, see books) and you get:
$\int dx \delta\left(x-a\right)f\left(x\right)=f\left(a\right)$.
(2) Derivatives of delta functions. Derivatives of delta functions make no sense on the face of it since delta functions are either zero or discontinuos, but again, stick it inside the integral:
$\frac{df}{dx}=\lim_{\epsilon\to 0}\frac{f\left(x+\epsilon/2\right)-f\left(x-\epsilon/2\right)}{\epsilon}=\lim_{\epsilon\to 0}\frac{\int dx' \delta\left(x'-\left(x+\epsilon/2\right)\right)f\left(x'\right)-\int dx' \delta\left(x'-\left(x-\epsilon/2\right)\right)f\left(x'\right)}{\epsilon}$
Here the left-hand side certainly makes sense, so we can define the right-handside, with the derivative of the delta function, with a proviso that the limit on the delta-functions should only be taken once they sit in the integral, then:
$\frac{df}{dx} = \int dx' \frac{d\delta\left(x'-x\right)}{dx}f\left(x'\right)=\int dx' \frac{d\left(x'-x\right)}{dx}\frac{d\delta\left(x'-x\right)}{d\left(x'-x\right)}f\left(x'\right)=-\int dx' \delta'\left(x'-x\right) f\left(x'\right)$
That's probably enough. In my main topic I have mentioned that derivatives of delta functions should not give you problems. The reason for this is that the most likely case where you will need the charge current (e.g.) is $\int_V d^3 r' G\left(\mathbf{r}-\mathbf{r'}\right)\rho\left(\mathbf{r'}\right)$, where $G$ is the Greens function and the observer ($\mathbf{r}$) is located outside of $V$. In this case $G$ is well-behaved and the trick with doign derivative of the delta-function once it is inside the integral, will work. If the observer is inside $G$, then the Green's function is not a 'good' function, and a lot more care is needed. For example see: W. C. Chew, "Waves and Fields in Inhomogeneous Media" 1995
Finally, there is an excellent paper by C. P. Frahm, "Some novel delta-function identities", Am. J. Phys. 51, 826 (1983). It really helped me to get friendy with these beasts.
This form of Maxwell's Equations should cover metals dielectrics and most other media. Focus on the following
$$
\begin{align}
\boldsymbol{\nabla}.\mathbf{D}&=\rho_f \\
\boldsymbol{\nabla}\times\mathbf{H}&=\mathbf{J}_f+\partial_t\mathbf{D}
\end{align}
$$
Now take divergence of the second equation, and bear in mind that divergence of a curl is zero:
$$
\boldsymbol{\nabla}.\boldsymbol{\nabla}\times\mathbf{H}=\boldsymbol{\nabla}.\mathbf{J}_f+\boldsymbol{\nabla}.\partial_t \mathbf{D}=0
$$
You can then swap and substitute $\boldsymbol{\nabla}.\partial_t \mathbf{D}=\partial_t \boldsymbol{\nabla}.\mathbf{D}=\partial_t \rho_f$
So the continuity equation is a direct consequence of Maxwell's equations. Verifying it is akin to verifying Maxwell's equations. No need to mess around with thing wires and loops, the above prescription applies to most media and even mixed domains, i.e. with different media.
The above analysis tells you that you cannot talk about charge density and current density separately.
I am not sure what proof you are expecting for infinitely thin wire. There I would simply state current density as:
$$
\mathbf{J}\left(\mathbf{r}\right)=\alpha\int ds'\: \boldsymbol{\mathcal{\dot{R}}}\left(s'\right)\delta^{\left(3\right)}\left(\mathbf{r}-\boldsymbol{\mathcal{R}}\left(s'\right)\right)
$$
Where $\boldsymbol{\mathcal{R}}\left(s\right)$ is the trajectory of your wire, parametrized by arc-length $s$. Assuming $\boldsymbol{\mathcal{\dot{R}}}.\boldsymbol{\mathcal{\dot{R}}}=\mathcal{\dot{R}}^2=const$
Value of $\alpha$ follows from:
$$
\int_S{d^2}r\: \mathbf{\hat{n}}.\mathbf{J}=I\Delta\left(s\in S\right)\cos\theta_\mathcal{\dot{R}}=\alpha\mathcal{\dot{R}}^2\,\Delta\left(s\in S\right)
$$
Where the integral above is over the surface area $S$ of the dot-product of the current density with the surface normal $\mathcal{\hat{n}}$. Quantity $\Delta\left(s\in S\right)=1$ is the wire only goes through the surface area $S$ once. $\cos\theta_{\mathcal{\dot{R}}}$ is the cosine of the angle between the wire and the surface normal. $I$ is the current in the wire. Equation above only makes sense if wire goes through $S$ once or not at all.
From this you can then extract the charge density by taking divergence. I think, you will find that once you take divergence the derivative (of the divergence) and the integral ($\int ds'$) will cancel out, and you will get zero charge density for closed-loop wires
ADDENDUM
Lets see how certain integrals/derivatives will transform
Let us restrict our attention to a region, $\left(x,y,z\right)\in\Omega$ and $s\in\left(s_0,s_1\right)$, where $z$ has a one-to-one relationship with arc-length $s$. More specifically, $\mathcal{R}_z\left(s\right)$ gives the z-coordinate of the arc at all points. Within the region we are considering, let $\mathcal{R}_z^{-1}\left(z\right)$ be defined.
Irrespective of where $z$ is the component of the arc-tangent (of the wire) in the direction of $z$ is:
$$
\frac{d\mathcal{R}_z}{ds}=\mathbf{\hat{z}}.\frac{d\boldsymbol{\mathcal{R}}}{ds}=\mathbf{\hat{z}}.\boldsymbol{\dot{\mathcal{R}}}
$$
Next note that within $\Omega$ for any function $f=f\left(x,y,z\right)$:
$$
\begin{align}
\int^b_a dz\, f\left(x,y,z\right)&=\int^{\mathcal{R}_z^{-1}\left(b\right)}_{\mathcal{R}_z^{-1}\left(a\right)} \frac{dz}{ds} ds f\left(x,y,\mathcal{R}_z\left(s\right)\right)=\int^{\mathcal{R}_z^{-1}\left(b\right)}_{\mathcal{R}_z^{-1}\left(a\right)} \frac{d\mathcal{R}_z}{ds} ds f\left(x,y,\mathcal{R}_z\left(s\right)\right)\\
&=\mathbf{\hat{z}}.\int^{\mathcal{R}_z^{-1}\left(b\right)}_{\mathcal{R}_z^{-1}\left(a\right)} \boldsymbol{\mathcal{\dot{R}}} f\left(x,y,\mathcal{R}_z\left(s\right)\right)ds
\end{align}
$$
Where I simply replaced the integration variable $z\to s$ where $z=\mathcal{R}_z\left(s\right)$. The same trick will work in volume integrals within $\Omega$. The the transform would be:
$$
\begin{align}
x &\to x \\
y&\to y \\
z &\to s \\
\end{align}
$$
With the Jacobian $\left|\frac{\partial\left(x,y,z\right)}{\partial\left(x,y,s\right)}\right|=\mathbf{\hat{z}}.\boldsymbol{\mathcal{\dot{R}}}$
It then follows that (within $\Omega$):
$$
\begin{align}
\mathbf{\hat{z}}.\mathbf{J}\left(\mathbf{r}\right)&=\alpha\mathbf{\hat{z}}.\int ds'\: \boldsymbol{\mathcal{\dot{R}}}\left(s'\right)\delta^{\left(3\right)}\left(\mathbf{r}-\boldsymbol{\mathcal{R}}\left(s'\right)\right)=\alpha\int dz'\: \delta^{\left(3\right)}\left(\left(\begin{array}\\x\\y\\z\end{array}\right)-\left(\begin{array}\\\mathcal{R}_x\left(\mathcal{R}_z^{-1}\left(z'\right)\right)\\\mathcal{R}_y\left(\mathcal{R}_z^{-1}\left(z'\right)\right)\\z'\end{array}\right)\right)=\\
&=\alpha \:\delta\left(x-\mathcal{R}_x\left(\mathcal{R}_z^{-1}\left(z\right)\right)\right)\:\delta\left(y-\mathcal{R}_y\left(\mathcal{R}_z^{-1}\left(z\right)\right)\right)
\end{align}
$$
From here it should be relatively easy to derive:
$$
\int_{x_0}^{x_1} dx \int_{y_0}^{y_1} dy \int_{-l/2}^{l/2} dz\: \mathbf{\hat{z}}.\mathbf{J}\left(\mathbf{r}\right)=l\,\alpha
$$
Assuming that $\left(x_0,\,x_1\right)\times\left(y_0,y_1\right)\times\left(-l/2,l/2\right)$ contain the wire and are within $\Omega$. I think this is what you were after.
Another interesting thing to try is:
$$
\begin{align}
\boldsymbol{\nabla}.\mathbf{J}&=\int^{s_b}_{s_a} ds'\: \boldsymbol{\mathcal{\dot{R}}}\left(s'\right).\boldsymbol{\nabla}\delta^{\left(3\right)}\left(\mathbf{r}-\boldsymbol{\mathcal{R}}\left(s'\right)\right)=\int^{s_b}_{s_a} ds'\: \frac{d\boldsymbol{\mathcal{R}}}{ds'}.\boldsymbol{\nabla}\delta^{\left(3\right)}\left(\mathbf{r}-\boldsymbol{\mathcal{R}}\left(s'\right)\right)
\\
&=-\int^{s_b}_{s_a} ds'\: \frac{d}{ds'}.\delta^{\left(3\right)}\left(\mathbf{r}-\boldsymbol{\mathcal{R}}\left(s'\right)\right)=\delta^{\left(3\right)}\left(\mathbf{r}-\boldsymbol{\mathcal{R}}\left(s_a\right)\right)-\delta^{\left(3\right)}\left(\mathbf{r}-\boldsymbol{\mathcal{R}}\left(s_b\right)\right)
\end{align}
$$
Clearly, if $s_a=s_b$ as would be in the case of a closed loop, the divergence of the current density would vanish.
Best Answer
I'll use their hydrodynamics analogy to provide an answer which might be a little bit easier to visualize. In hydrodynamics, $\rho$ represents the density of the fluid, while $\mathbf{j} = \rho \mathbf{v}$ is a "mass current", where $\mathbf{v}$ is the fluid's velocity. Notice it is sort of a generalization of momentum.
To interpret the equation, let us perform a volume integral over some region $\mathcal{V}$ (if you prefer, you can choose this volume to be a neighborhood of some point, to use the same words the authors used). The volume integral is then $$\int_{\mathcal{V}} \frac{\partial \rho}{\partial t} \ \textrm{d}^3 x = -\int_{\mathcal{V}} \mathbf{\nabla\cdot j} \ \textrm{d}^3 x.$$ We can pull the derivative out of the first integral, since we are integrating over space and differentiating with respect to time. Hence, $$\frac{\textrm{d}}{\textrm{d} t} \int_{\mathcal{V}} \rho \ \textrm{d}^3 x = -\int_{\mathcal{V}} \mathbf{\nabla\cdot j} \ \textrm{d}^3 x.$$
However, notice that $\int_{\mathcal{V}} \rho \ \textrm{d}^3 x$ is the integral of the density over some volume. In other words, this is simply the total mass contained in the volume $\mathcal{V}$. Let us denote it by $Q$ (the reason being the fact that in Electrodynamics, this would be just the total charge within $\mathcal{V}$). Hence, our equation reads $$\frac{\textrm{d} Q}{\textrm{d} t} = -\int_{\mathcal{V}} \mathbf{\nabla\cdot j} \ \textrm{d}^3 x.$$
This is cool, but what about the RHS? To deal with it, we'll employ the Divergence Theorem (also known as Gauss' Theorem), which states that $$\int_{\mathcal{V}} \mathbf{\nabla\cdot j} \ \textrm{d}^3 x = \int_{\partial\mathcal{V}} \mathbf{j \cdot n} \ \textrm{d}S,$$ where $\partial \mathcal{V}$ is the surface enclosing $\mathcal{V}$, $\mathbf{n}$ is the unit vector normal to this surface and pointing outwards, and $\textrm{d}S$ is a surface element. We get $$\frac{\textrm{d} Q}{\textrm{d} t} = - \int_{\partial\mathcal{V}} \mathbf{j \cdot n} \ \textrm{d}S.$$
Let us then understand the meaning of each term of this equation. $\mathbf{j \cdot n}$ measures how much $\mathbf{j}$ is aligned with $\mathbf{n}$ at each point. In other words, it tells us how much the fluid is flowing across the surface at that point. If $\mathbf{j}$ and $\mathbf{n}$ are orthogonal, the fluid is flowing parallel to the surface, and hence it is not coming in nor going out. If $\mathbf{j \cdot n} > 0$, then the two vectors are aligned and fluid is going out. If $\mathbf{j \cdot n} < 0$, they are opposed to each other and there is fluid coming in. We then integrate this quantity over the entire surface: $\int_{\partial\mathcal{V}} \mathbf{j \cdot n} \textrm{d}S$. This gives us a measure of how much fluid is going out. If we put a negative sign in front of this integral, we get a measure of how much fluid is coming in.
On the LHS, we get $\frac{\textrm{d} Q}{\textrm{d} t}$, which measures the change in mass in the volume $\mathcal{V}$. If $\frac{\textrm{d} Q}{\textrm{d} t} > 0$, then the amount of fluid must be increasing. If $\frac{\textrm{d} Q}{\textrm{d} t} < 0$, it is decreasing. If it vanishes, it remains constant. Our equation says $$\frac{\textrm{d} Q}{\textrm{d} t} = - \int_{\partial\mathcal{V}} \mathbf{j \cdot n} \ \textrm{d}S.$$
Hence, in words, it says $$\text{the increase in amount of fluid in the volume} = \text{the amount of fluid coming in through the surface}.$$
That is the idea: the equation states mathematically that the fluid flows in a continuous way. If there is more fluid inside the volume, it can't have come from nowhere: it must have crossed the surface bounding the volume.
The same idea holds for Electrodynamics: if there is a change of charge inside a volume, it must have come by crossing the surface that bounds the volume. It can't appear out of nothing and it can't vanish. In this sense, charge is conserved because it can't be created out of nothing and it can't disappear out of thin air. It can only leave the volume by crossing the surface, and it can only get inside the volume by crossing the surface.