A simpler approach is via integral theorems.
As stated in the question, the special cases for a rectangle in the $xy$ , $yz$ , $zx$ planes are well understood.
According to Green's theorem :
$$
\begin{cases}
\iint_{xy} \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right) dx\, dy
= \oint_{xy} \left( F_x\, dx + F_y\, dy \right) \\
\iint_{yz} \left(\frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z}\right) dy\, dz
= \oint_{xy} \left( F_y\, dy + F_z\, dz \right) \\
\iint_{zx} \left(\frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x}\right) dz\, dx
= \oint_{xy} \left( F_z\, dz + F_x\, dx \right) \end{cases}
$$
But instead of rectangles, we take half rectangles, or better: the triangles $OAB$ , $OBC$ , $OAC$ respectively:
Thanks to Green's theorem we can replace area integrals by line-integrals; mind that they are counter-clockwise.
Then it is clear that, irrespective of any further content:
$$
\oint_{OAB} + \oint_{OBC} + \oint_{OAC} + \oint_{ABC} = 0
$$
Assuming that the operator rot(ation) is not defined yet in general, this means that we now have an expression for it:
$$
2 \iint_{ABC} \vec{\operatorname{rot}}(\vec{F}) \cdot \vec{n}\, dA = \\
- \iint_{xy} \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right) dx\, dy
- \iint_{yz} \left(\frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z}\right) dy\, dz
- \iint_{zx} \left(\frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x}\right) dz\, dx
$$
Continuing with infinitesimal volumes / areas and flipping normals on the right hand side, so that they become
the components of the normal at the left hand side:
$$
\vec{\operatorname{rot}}(\vec{F}) \cdot \vec{n}\, \Delta A = \\
\left(\frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z}\right)\cdot n_x\, \Delta A +
\left(\frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x}\right)\cdot n_y\, \Delta A +
\left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right)\cdot n_z\, \Delta A
$$
Leaving out the infinitesimal area $\,\Delta A\,$ gives us the same answer as found by the OP themselves.
A somewhat neater approach is to calculate mean values and let the area of the (red) triangle go to zero:
$$
\vec{\operatorname{rot}}(\vec{F}) \cdot \vec{n} = \lim_{ABC \to 0}
\frac{\iint_{ABC} \vec{\operatorname{rot}}(\vec{F}) \cdot \vec{n}\, dA}{\iint_{ABC} dA}
$$
Note. I've encountered essentially the same method at several places elsewhere in physics
(I think it's with stress and strain). Aanyway, a related subject is :
What does shear mean?
To make things easier, first change the $x,y,z$ coordinates to a new system of Cartesian coordinates $x',y',z'$ parallel to the original one but with the origin at $(0,0,-1)$, so $x'=x$, $y'=y$, $z'=z+1$. In the new coordinate system the vector $\vec g$ is:
$$\vec g = \frac{(x',y',z')}{(x^{\prime 2}+y^{\prime 2}+z^{\prime 2})^{3/2}} = \frac{\vec r}{r^3}\,,$$
where $\vec r=(x',y',z')$ is the radial vector field (from the new origin of coordinates) and $r$ its magnitude. This brings up the symmetry of the problem (and its physical connection; $\vec g$ is proportional to the electric field of a static electric charge at the new origin of coordinates, or to the the magnetic field produced by a Dirac's magnetic monopole at that point). Although you can work in Cartesian coordinates, to take advantage of the symmetry it is probably easier to work using standard spherical coordinates $r,\theta,\phi$, and that is what I am going to do, later converting back to Cartesian coordinates. Before we start it is a good idea to check that $\nabla\cdot\vec g=0$, otherwise the problem has no solution. But the divergence of this vector field is well known ($\vec g$ is essentially the gradient of the Green's function of the Poisson equation):
$$ \nabla\cdot\frac{\vec r}{r^3} = 4\pi\delta^3(\vec r)\,.$$
Although outside the origin the divergence is zero, because it is not zero everywhere in $R^3$, it is not possible to find a vector $\vec f$ that is non-singular everywhere $R^3$ minus the origin and such that $\nabla\times\vec f=\vec g$. However it is going to be possible to find $\vec f$ for the required region $z>0$. Note that the solution to this equation is far from unique, because adding to a solution any gradient also gives a solution. We need only to find one solution, all other solution are obtained form this one by adding an arbitrary gradient. To make the calculations easier we can impose some reasonable condition that will restrict the number of available solutions (physicsits call this to "fix the gauge"). For example, if you want to work in Cartesian coordinates you could require your solution to have $f_z=0$ (physicists call this choice the "axial gauge"). Because $\vec g$ is radial, and the radial component of $\vec f$ does not contribute to it, I am going to make the radial component of $\vec f$, $f_r$ equal to zero. With this choice, using the expression of the curl in spherical coordinates we have for $f_\theta$ and $f_\phi$:
$$
\begin{align}
\frac{1}{r\sin{\theta}}\left[\frac{\partial}{\partial\theta}\left(\sin\theta f_\phi\right)-\frac{\partial f_\theta}{\partial\phi}\right] &= \frac{1}{r^2} \\
\frac{\partial}{\partial r}\left(rf_\phi\right) &= 0 \\
\frac{\partial}{\partial r}\left(rf_\theta\right) &= 0
\end{align}
$$
The last two equations show that in this gauge ($f_r=0$), $f_\theta$ and $f_\phi$ have to be proportional to $1/r$, which makes also the radial part of the first equation correct. In fact, with this in mind, it is easily seen by inspection of the first equation that the following choice is a solution:
$$f_\phi = \frac{C-\cos\theta}{r\sin\theta},~~f_\theta = 0\,,$$
where $C$ is some constant. Because we want the solution to be valid if $z>0$ we
must choose $C=1$. In other words, if $\hat u_\phi$ is the unit vector along the $\phi$ direction, then:
$$\vec f = \frac{1-\cos\theta}{r\sin\theta}\hat u_\phi$$
is a solution to the problem in spherical coordinates. The only thing that remains is to convert back to Cartesian coordinates. If $\hat u_x, \hat u_y, \hat u_z$ are the unit vectors along the Cartesian axes (old and new!), and using:
$$\hat u_\phi = -\sin\phi\,\hat u_x+\cos\phi\,\hat u_y$$
and
$$\cos\phi = \frac{x'}{r\sin\theta},~~\sin\phi = \frac{y'}{r\sin\theta},
~~\cos\theta = \frac{z'}{r},~~r^2\sin^2\theta = x'^2+y'^2$$
we get:
$$\vec f = \frac{y'(z'-r)}{r(x'^2+y'^2) }\hat u_x -\frac{x'(z'-r)}{r(x'^2+y'^2) }\hat u_y\,.$$
Reverting to the original $x,y,z$ coordinate system (just replace $x'~y'$ by $x~y$, and $z'$ by $z+1$) we get the final answer:
$$\vec f = \frac{y(z+1-r)\,\hat u_x-x(z+1-r)\,\hat u_y}{r(x^2+y^2)}$$
with
$$ r=\sqrt{x^2+y^2+(z+1)^2}\,.$$
Note that this solution satisfy the physicists "axial gauge" $f_z=0$, although I was not looking for it. Also note that this solution is regular for $z>-1$, as required, and becomes singular in the half line $x=y=0, z<=-1$.
Best Answer
Units, curl, and circulation
First it is convenient to conceive of a vector field as the velocity field of a flow, $${\bf F} =(F_1,F_2,F_3)= \left({dx \over dt},\, {dy \over dt},\, {dz \over dt}\right)\,,$$ which has the dimensions of length/time. The curl then becomes $${\rm curl}\; {\bf F} = \left( {\partial \over \partial y} {dz \over dt} - {\partial \over \partial z} {dy \over dt},\, {\partial \over \partial z} {dx \over dt} - {\partial \over \partial x} {dz \over dt},\, {\partial \over \partial x} {dy \over dt} - {\partial \over \partial y} {dx \over dt} \right)$$ which has the dimensions of 1/time, the same as angular velocity. The so-called circulation (around an oriented surface $S$) is given by $$\hbox{circulation} = \int_{\partial S} {\bf F} \cdot dr$$ which has the dimensions of area/time and is a bit difficult to understand intuitively. (If ${\bf F}$ represent force, the integral represents net work around $S$, but in that case, the curl has the odd dimensions of mass/time${}^2$, which I am at a loss to explain.)
From Stokes' to circulation/area
One way to approach the idea of the curl is through Stokes' theorem, which says the circulation of vector field around a surface is equal to the flux of the curl across the surface: $$\int_{\partial S} {\bf F} \cdot dr = \iint_S {\rm curl}\; {\bf F} \cdot {\bf n}\;dS$$ where ${\bf n}$ is the surface normal. By the mean value theorem $$\iint_S {\rm curl}\; {\bf F} \cdot {\bf n} \;dS= {\rm curl}\; {\bf F} \cdot {\bf n}\iint_S dS = ({\rm curl}\; {\bf F} \cdot {\bf n})\;A(S) $$ where ${\rm curl}\; {\bf F} \cdot {\bf n}$ is evaluated at some point on $S$. And so $${\rm curl}\; {\bf F} \cdot {\bf n} = {\hbox{circulation} \over A(S)}$$ If $S$ is a small square normal to ${\bf n}$, the circulation will be maximized when ${\bf n}$ points in the direction of ${\rm curl}\; {\bf F}$.
From circulation/area to the formula for curl
One can compute the components of the curl by letting ${\bf n}$ be ${\bf i}, {\bf j}, {\bf k}$. We will show the calculation for ${\bf n} = {\bf k}$. Let $S$ be a square parallel with opposite corners $(x,y,z)$ and $(x+dx,y+dy,z)$. Along a side of $S$ the integral of ${\bf F} \cdot dr$ is $\pm F_1\,dx$ or $\pm F_2\,dy$, evaluated at some point on the side and the sign according with the orientation of the side. Opposite sides have opposite orientations. Combining all four terms, we can approximate the circulation by $$\begin{align} (F_2(x+dx,y^*,z)&-F_2(x,y^*,z))\;dy - (F_1(x^*,y+dy,z)-F_1(x^*,y,z))\;dx \\ &= {\partial \over \partial x} {dy \over dt} \;dx\;dy - {\partial \over \partial y} {dx \over dt} \;dx\;dy \end{align}$$ Thus since $A(S) = dx\;dy$, $${\rm curl}\; {\bf F} \cdot {\bf k} = { {\partial \over \partial x} {dy \over dt} \;dx\;dy - {\partial \over \partial y} {dx \over dt} \;dx\;dy \over dx\;dy} = {\partial \over \partial x} {dy \over dt} - {\partial \over \partial y} {dx \over dt}\,.$$ The equations become exact in the limit as $dx \rightarrow 0$ and $dy \rightarrow 0$, and a rigorous proof can be constructed with the aid of the mean value theorem. The other components may be derived similarly.
I suppose one normally does the reverse, that is, derive Stokes' theorem from the curl rather than the formula for curl from Stokes', but if you accept that Stokes' has been proved, then it shouldn't matter that we've used it to better understand the formula. Yet there is still more to say about how the formula is related to rotation.
The flow of the vector field and local instantaneous rotation
So if we again consider our square $S$ with normal ${\bf k}$ and we wish to consider the relative motions of two points at $(x,y,z)$ and at $(x+dx,y,z)$. We wish to understand the rotation of the second about an axis parallel to ${\bf k}$ through $(x,y,z)$. Clearly the $z$ component is irrelevant. Also the $x$ component is, because if $dx/dt=F_1(x,y,z)$ differs from $dx/dt=F_1(x+dx,y,z)$ the $x$ coordinates of the points are moving nearer or farther from each other and such a relative motions does not contribute to rotation. What is left is the $y$ component. The change in the $y$ coordinates of the two particle in a small time $dt$ is approximately $(F_2(x+dx,y,z)-F_2(x,y,z))\;dt$. Since the points are $dx$ apart, the angle turned is $$d\theta_1 \approx \tan d\theta_1 = {(F_2(x+dx,y,z)-F_2(x,y,z))\;dt \over dx} = {\partial \over \partial x} F_2 \; dt = {\partial \over \partial x} {dy \over dt} \; dt\,.$$ Similarly comparing the points $(x,y,z)$ and $(x,y+dy,z)$, we can show the angle turned is approximately $$d\theta_2 \approx - {\partial \over \partial y} {dx \over dt} \; dt\,.$$
One might wonder if the analysis above is a little too loose. One can see what's going on in this figure:
The points $(x,y,z)$ and $(x+dx,y,z)$ on the left flow in a given time $\Delta t$ to two points $(x',y',z')$ and $(x'+dx+\Delta x,y'+\Delta y,z'+\Delta z)$ on the right. The relative change in position is given by $$(\Delta x,\Delta y,\Delta z) \approx \left( {\partial F_1 \over \partial x}\;dx,\, {\partial F_2 \over \partial x}\;dx,\, {\partial F_3 \over \partial x}\;dx \right)\; \Delta t = \left( {\partial \over \partial x} {dx \over dt},\, {\partial \over \partial x} {dy \over dt},\, {\partial \over \partial x} {dz \over dt} \right) \;dx \;\Delta t$$ The amount rotation $\Delta\theta$ in the time $\Delta t$ about ${\bf k}$ is given by $$\tan \Delta\theta = {\Delta y \over dx + \Delta x}$$ As $\Delta t \rightarrow 0$, $dx$ is fixed; but $\Delta x \rightarrow 0$, $\Delta y \rightarrow 0$ and $\tan \Delta\theta \sim \Delta\theta \rightarrow 0$. Thus $${\Delta\theta \over \Delta t} \sim {\tan \Delta\theta \over \Delta t} \sim {\Delta y/\Delta t \over dx} \rightarrow {\partial \over \partial x} {dy \over dt}\,.$$ The partial derivative is evaluated at some point $(x^*,y,z)$ between $(x,y,z)$ and $(x+dx,y,z)$. As $dx \rightarrow 0$, $(x^*,y,z) \rightarrow (x,y,z)$, and the angular rate becomes the term in the formula for the curl.
Curl and the rate of rotation
We found the ${\bf k}$ component of the curl to be the sum of two rates of rotation $${\rm curl}\; {\bf F} \cdot {\bf k} ={d\theta_1 \over dt} + {d\theta_2 \over dt} = { \left({\partial \over \partial x} {dy \over dt} \right) + \left( - {\partial \over \partial y} {dx \over dt} \right)}\,.$$ The other components may be derived similarly. Cartesian coordinate axes may be chosen arbitrarily. At a point on a given surface with unit normal ${\bf n}$, we may choose the coordinate axes so that ${\bf k} = {\bf n}$ (with ${\bf i}, {\bf j}$ being whatever mutually orthogonal unit vector we please). Thus the component of the curl ${\rm curl}\; {\bf F} \cdot {\bf n}$ is the sum of two rates of rotation independent of the choice of ${\bf i}, {\bf j}$.
One way to think of it is that the curl is twice the average of two rates of rotation. Why twice? Why the average? Plausible arguments: (1) It turns out that way. :) (2) The bisector of the angle formed by $(x,y+dy,z)$--$(x,y,z)$--$(x+dx,y,z)$ rotates at the average rate. (3) The circulation/area has a factor of 2 in it when $S$ is a disk or square, when viewed in a certain way: Let $S$ be a disk or square centered at $(x,y,z)$ whose perimeter is a radius/distance $R$ from the center. If we write the circulation/area in the form $${\rm curl}\; {\bf F} \cdot {\bf n} = {\int_{\partial S} {\bf F} \cdot dr \over \hbox{area}}= {(\bar{{F}})(\hbox{perimeter}) \over \hbox{area}} = {(\bar {F})(2\pi R) \over \pi R^2} \,\buildrel {\rm or} \over = \, {(\bar {F})(8R) \over 4R^2}= {2\,\bar {F} \over R}\,,$$ then the normal component of the curl is twice the mean circulation $\bar {F}$ over the radius of $S$. So for instance if a disk turns at an angular rate $\omega$, the velocity at the perimeter is a constant $\omega R$, which also equals $\bar {F}$. The circulation is $(\omega R)\,(2\pi R)$, and therefore the circulation/area is $2 \omega$, that is, twice the average rate of rotation.
Curl and the carpool lane
Suppose we model the flow of traffic on a highway by a $C^2$ autonomous vector field, where each car represents a discrete particle of the flow. If (as a passenger) you keep your gaze fixed toward a car near you, then the absolute rate at which your head (or rather eye) turns will be proportional to the curl. By absolute rate, I mean that the angle is to be measured with respect to a fixed direction, such as due north.