You obtain the presented formula by employing the fact that $$ \partial_{xy} u = \partial_x(\partial_y u) $$ and approximating the derivative in $y$ first through central differences:
$$u_y(x, y) \approx \frac{u(x, y + \Delta y ) - u(x, y - \Delta y)}{2 \Delta y}$$
and then $\partial_x u_y$ also through central differences:
\begin{align} \partial_x u_y &\approx \frac{u_y(x + \Delta x,y) - u_y(x - \Delta x, y)}{2 \Delta x} \\
& = \frac{\frac{u(x + \Delta x, y + \Delta y ) - u(x + \Delta x, y - \Delta y)}{2 \Delta y} - \frac{u(x - \Delta x, y + \Delta y ) - u(x - \Delta x, y - \Delta y)}{2 \Delta y}}{2 \Delta x} \\
&= \frac{u(x + \Delta x, y + \Delta y ) - u(x + \Delta x, y - \Delta y) - u(x - \Delta x, y + \Delta y ) +u(x - \Delta x, y - \Delta y) }{4\Delta x \Delta y} \end{align}
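As a quick numerical sanity check (not part of the derivation itself), here is a small Python sketch that evaluates this four-point formula on a test function with a known mixed derivative; the test function and evaluation point are arbitrary choices:

```python
import math

def u(x, y):
    # Test function with known mixed derivative: u_xy = -cos(x) * sin(y)
    return math.sin(x) * math.cos(y)

def u_xy_central(u, x, y, dx, dy):
    # Four-point central difference for the mixed derivative u_xy
    return (u(x + dx, y + dy) - u(x + dx, y - dy)
            - u(x - dx, y + dy) + u(x - dx, y - dy)) / (4 * dx * dy)

x, y, h = 0.5, 0.3, 1e-3
approx = u_xy_central(u, x, y, h, h)
exact = -math.cos(x) * math.sin(y)
print(abs(approx - exact))  # second-order accurate: error on the order of h^2
```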
In the same manner you can construct any combination of one-sided/central/mixed finite differences. The second-order forward finite difference is given by
$$u_x(x, y) \approx \frac{-1.5 u(x, y) +2 u(x + \Delta x, y) - 0.5 u(x + 2 \Delta x, y )}{\Delta x } $$
and the backward difference analogously
$$u_x(x, y) \approx \frac{0.5 u(x - 2 \Delta x, y) -2 u(x - \Delta x, y) + 1.5 u(x, y)}{\Delta x } $$
You can combine forward, backward, and central differences as needed, e.g. at boundaries and corners of a rectangular domain.
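To illustrate this combination in the simplest 1D setting (a sketch with an arbitrary test function, not taken from the original), the following computes $u_x$ on a uniform grid using the forward formula at the left boundary, central differences in the interior, and the backward formula at the right boundary:

```python
import math

def d_dx(f_vals, dx):
    # Second-order u_x on a uniform grid: one-sided formulas at the
    # two boundary points, central differences everywhere else.
    n = len(f_vals)
    d = [0.0] * n
    d[0] = (-1.5 * f_vals[0] + 2 * f_vals[1] - 0.5 * f_vals[2]) / dx        # forward
    for i in range(1, n - 1):
        d[i] = (f_vals[i + 1] - f_vals[i - 1]) / (2 * dx)                   # central
    d[-1] = (0.5 * f_vals[-3] - 2 * f_vals[-2] + 1.5 * f_vals[-1]) / dx     # backward
    return d

dx = 0.01
xs = [i * dx for i in range(101)]
deriv = d_dx([math.exp(x) for x in xs], dx)
# Every entry approximates exp(x) to second order, boundary points included.
```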
Now let's take a more structured approach via Taylor series. Consider your 2D example of $u(x + \Delta x, y + \Delta y)$, which you expand around $u(x, y)$:
\begin{align} u(x + \Delta x, y + \Delta y) &= u(x, y) + \Delta x u_x(x, y) + \Delta y u_y(x, y) \\
&+ \frac{\Delta x^2}{2} u_{xx}(x, y) + \frac{\Delta y^2}{2} u_{yy}(x, y) \\
&+ \Delta x \Delta y u_{xy} + \mathcal{O}\big(\Delta x^3, \Delta y^3, \Delta x^2 \Delta y, \Delta x \Delta y^2 \big)\end{align}
Now you want to obtain an approximation of $u_{xy}$ through some linear combination of $u(x + k_x^{(i)} \Delta x, y + k_y^{(i)} \Delta y)$ with $k_x^{(i)}, k_y^{(i)} \in \mathbb{Z}$, $i = 1, \dots, K$, which can be written as
$$ u_{xy} \overset{!}{=} \sum_{i=1}^K w_i u\Big(x + k_x^{(i)} \Delta x, y + k_y^{(i)} \Delta y \Big)$$
The RHS of the equation above can be rewritten as
\begin{align}
\sum_{i=1}^K w_i u\Big(x + k_x^{(i)} \Delta x, y + k_y^{(i)} \Delta y \Big)
&= u\sum_{i=1}^K w_i + u_x \sum_{i=1}^K w_i \Delta x k_x^{(i)} + u_y \sum_{i=1}^K w_i \Delta y k_y^{(i)} \\
&+ u_{xx} \sum_{i=1}^K \frac{w_i}{2} \Big( \Delta x k_x^{(i)} \Big)^2 + u_{yy}\sum_{i=1}^K \frac{w_i}{2} \Big( \Delta y k_y^{(i)} \Big)^2 \\
& + u_{xy} \sum_{i=1}^K w_i \Delta x k_x^{(i)} \Delta y k_y^{(i)} + \mathcal{O}\big(\Delta x^3, \Delta y^3, \Delta x^2 \Delta y, \Delta x \Delta y^2 \big) \\
&\overset{!}{=} u_{xy}\end{align}
What have we gained through this formulation? You can now directly read off the conditions for an $\mathcal{O}\big(\Delta x^3, \Delta y^3, \Delta x^2 \Delta y, \Delta x \Delta y^2 \big)$ approximation of $u_{xy}$:
\begin{align}
u:& \sum_{i=1}^K w_i &\overset{!}{=} 0 \\
u_x:& \sum_{i=1}^K w_i \Delta x k_x^{(i)} &\overset{!}{=} 0 \\
u_y:& \sum_{i=1}^K w_i \Delta y k_y^{(i)} &\overset{!}{=} 0 \\
u_{xx}:& \sum_{i=1}^K \frac{w_i}{2} \Big( \Delta x k_x^{(i)} \Big)^2 &\overset{!}{=} 0 \\
u_{yy}: & \sum_{i=1}^K \frac{w_i}{2} \Big( \Delta y k_y^{(i)} \Big)^2 &\overset{!}{=} 0 \\
u_{xy}: & \sum_{i=1}^K w_i \Delta x k_x^{(i)} \Delta y k_y^{(i)} &\overset{!}{=} 1
\end{align}
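As a sanity check, the classic central weights $\pm \frac{1}{4 \Delta x \Delta y}$ on the four corner points satisfy exactly these six conditions. A small Python sketch (the spacing $\Delta x = \Delta y = 0.5$ is an arbitrary choice):

```python
# Stencil offsets (k_x, k_y) and the classic central weights +-1/(4 dx dy)
dx = dy = 0.5
stencil = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
w = [1 / (4 * dx * dy), -1 / (4 * dx * dy), -1 / (4 * dx * dy), 1 / (4 * dx * dy)]

# Evaluate the left-hand side of each of the six conditions
conds = {
    "u":    sum(w),
    "u_x":  sum(wi * dx * kx for wi, (kx, ky) in zip(w, stencil)),
    "u_y":  sum(wi * dy * ky for wi, (kx, ky) in zip(w, stencil)),
    "u_xx": sum(wi / 2 * (dx * kx) ** 2 for wi, (kx, ky) in zip(w, stencil)),
    "u_yy": sum(wi / 2 * (dy * ky) ** 2 for wi, (kx, ky) in zip(w, stencil)),
    "u_xy": sum(wi * dx * kx * dy * ky for wi, (kx, ky) in zip(w, stencil)),
}
print(conds)  # all conditions are 0 except u_xy, which equals 1
```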
So you have $6$ restrictions. This system can in general be solved for $6$ variable weights $w_i$ for any choice of $k_x^{(i)}, k_y^{(i)}$, unless the arising matrix is singular.
You can take a look at the Matlab code below to understand how you arrive at the classic formula you gave for the central FD, and use it for your own choices of points.
clear;
clc;
% Point 1: +1, +1
k = [1, 1];
% Point 2: +1, -1
k = [k; [1, -1]];
% Point 3: -1, +1
k = [k; [-1, 1]];
% Point 4: -1, -1;
k = [k; [-1, -1]];
% Point 5 & 6: Do not matter for this clever choice!
k = [k; [0, 0]]; % E.g. origin
k = [k; [0, 1]]; % E.g. some other point
A = zeros(6);
RHS = zeros(6, 1);
% u:
A(1, :) = ones(1, 6);
% u_x:
A(2, :) = k(:, 1)';
% u_y:
A(3, :) = k(:, 2)';
% u_xx:
A(4, :) = (k(:, 1).^2)';
% u_yy:
A(5, :) = (k(:, 2).^2)';
% u_xy:
A(6, :) = (k(:, 1).* k(:, 2))';
% Grid spacing
dx = 1;
dy = 1;
RHS(6) = 1 / (dx * dy);
w = A \RHS;
disp(w);
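If you prefer to check this without Matlab, here is a pure-Python translation of the same script (a sketch: same stencil, $\Delta x = \Delta y = 1$, solved with exact rational arithmetic via a small Gauss-Jordan elimination):

```python
from fractions import Fraction

# Same stencil as the Matlab script: the four corners plus two "free" points
stencil = [(1, 1), (1, -1), (-1, 1), (-1, -1), (0, 0), (0, 1)]

# Rows of the 6x6 system: u, u_x, u_y, u_xx, u_yy, u_xy
A = [
    [Fraction(1) for _ in stencil],
    [Fraction(kx) for kx, ky in stencil],
    [Fraction(ky) for kx, ky in stencil],
    [Fraction(kx * kx) for kx, ky in stencil],
    [Fraction(ky * ky) for kx, ky in stencil],
    [Fraction(kx * ky) for kx, ky in stencil],
]
b = [Fraction(0)] * 5 + [Fraction(1)]  # only the u_xy condition is nonzero

# Gauss-Jordan elimination with partial pivoting on the augmented matrix
n = len(b)
M = [row[:] + [bi] for row, bi in zip(A, b)]
for col in range(n):
    piv = max(range(col, n), key=lambda r: abs(M[r][col]))
    M[col], M[piv] = M[piv], M[col]
    for r in range(n):
        if r != col and M[r][col] != 0:
            f = M[r][col] / M[col][col]
            M[r] = [a - f * c for a, c in zip(M[r], M[col])]
w = [M[i][n] / M[i][i] for i in range(n)]
print(w)  # weights 1/4, -1/4, -1/4, 1/4 on the corners, 0 on the two extra points
```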
For higher-order schemes you would also need to cancel the factors in front of the higher derivatives, which means imposing more conditions and hence using more points.
In summary, by using knowledge about the 1D case you can combine existing finite differences to get formulas for the mixed derivative. In principle, unless the generating matrix is singular, you can select any choice of points to obtain a finite difference quotient of the desired order.
However, only consistency has been addressed here; stability is not considered at all.