I think the difficulty with the subdifferential approach you describe lies in finding critical points. For instance, with your example, solving the equations you get by setting the partials to $0$ yields only $x_1 = | x_2 |$ and $\lambda = -1$. That system is underdetermined, so it does not yield a solution. As you point out, if you were given a solution $x_1 = x_2 = 0$, $\lambda = -1$, you could verify that it is a critical point with the subdifferential approach, but that's quite different from deriving $x_1 = x_2 = 0$, $\lambda = -1$ as a critical point.
Remember, though, that the definition of a critical point of a function includes points at which the derivative fails to exist. So when you're constructing your set of critical points using Lagrange multipliers, look not only for points at which the partials are zero but also for points at which a partial does not exist. For instance, it's clear in your example that the only point at which $\frac{\partial L}{\partial x_2}$ fails to exist is $x_2 = 0$. (You also get $\lambda = 0$ from $\frac{\partial L}{\partial x_2}= 0$, but this is inconsistent with the $\lambda = -1$ from $\frac{\partial L}{\partial x_1}= 0$.) Including that with your equations from $\frac{\partial L}{\partial x_1} = 0$ and $\frac{\partial L}{\partial \lambda} = 0$ quickly gets you the solution $x_1 = x_2 = 0$, $\lambda = -1$.
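To make that concrete, here is a stand-in computation (I'm guessing at the exact objective here, but any objective with $\frac{\partial f}{\partial x_1}=1$ together with the constraint $x_1-|x_2|=0$ reproduces the numbers quoted above). Take
$$L(x_1,x_2,\lambda)=x_1+\lambda\,(x_1-|x_2|)\ .$$
Then $\frac{\partial L}{\partial x_1}=1+\lambda=0$ gives $\lambda=-1$, and $\frac{\partial L}{\partial \lambda}=x_1-|x_2|=0$ gives $x_1=|x_2|$. The remaining partial $\frac{\partial L}{\partial x_2}=-\lambda\,\operatorname{sgn}(x_2)$ exists only for $x_2\ne0$, where setting it to zero would force $\lambda=0$, contradicting $\lambda=-1$; so the only remaining candidate is the non-differentiable point $x_2=0$, which together with the constraint yields $x_1=x_2=0$, $\lambda=-1$.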
In fact, finding points where a partial does not exist is generally not any harder (and is sometimes easier) than finding points where a partial is zero. There aren't many ways for a point in the domain of a continuous function to have a partial that fails to exist. (Besides the absolute value situation, you could have division by zero with the partial from an expression like $x^p$, with $0 < p < 1$, in the function. Perhaps there are some others I'm forgetting - maybe others reading this can supply them. There are also some pathological cases, like the continuous but nowhere differentiable Weierstrass function, but those don't generally show up in practice.)
And, of course, if you have more than just a few variables, you don't want to use Lagrange multipliers anyway: Solving a large number of nonlinear equations is just too difficult. In that case, you will probably want an iterative nonlinear optimization method.
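As an illustration of the iterative route, here is a minimal sketch in Python using SciPy's SLSQP solver; the objective, the constraint, and the starting point are made up purely for this example:

```python
# Minimal sketch of the iterative alternative to Lagrange multipliers.
# Objective, constraint, and starting point are illustrative only.
import numpy as np
from scipy.optimize import minimize

n = 50  # many variables: solving the Lagrange equations by hand is hopeless here

def f(x):
    # a smooth made-up objective
    return np.sum((x - np.linspace(0.0, 1.0, n)) ** 2)

constraints = [{"type": "eq", "fun": lambda x: np.sum(x ** 2) - 1.0}]  # one nonlinear equality

x0 = np.full(n, 1.0 / np.sqrt(n))  # feasible starting point
result = minimize(f, x0, method="SLSQP", constraints=constraints)

print(result.x)    # approximate constrained minimizer
print(result.fun)  # value of f there
```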
Consider a point $p$ in the common domain $\Omega\subset{\mathbb R}^n$ of $f$ and the constraints $$g_k(x)=0\qquad(1\leq k\leq r)\ .\tag{1}$$ The gradients $\nabla g_k(p)$ determine a subspace $U$ of allowed directions for walking away from $p$: a direction $X$ is allowed only if it lies in the tangent plane of every level surface in $(1)$. This means that $X$ is perpendicular to all $\nabla g_k(p)$, i.e., that $X$ is a solution of the homogeneous system of equations
$$\nabla g_k(p)\cdot X=0\qquad(1\leq k\leq r)\ .\tag{2}$$
Now comes an important technical condition for the application of Lagrange's method: We have to assume that the $r$ gradients $\nabla g_k(p)$ are linearly independent, i.e. that $p$ is a regular point of the manifold defined by $(1)$. In this case the $\nabla g_k(p)$ span an $r$-dimensional subspace $V$, and the system $(2)$ has full rank. It follows that ${\rm dim}(U)= n-r$. Therefore we not only have $U\subset V^\perp$, but in fact
$$U=V^\perp\ .$$
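A quick numerical sketch of this (the constraints below are chosen just for illustration): at a regular point $p$ the allowed directions are exactly the null space of the Jacobian whose rows are the $\nabla g_k(p)$, and that null space has dimension $n-r$.

```python
# Numerical check of U = V-perp at a regular point (illustrative constraints only).
import numpy as np
from scipy.linalg import null_space

def grad_g1(p):  # g1(x) = x.x - 4   (a sphere in R^3)
    return 2.0 * p

def grad_g2(p):  # g2(x) = x1 + x2 + x3 - 2   (a plane)
    return np.array([1.0, 1.0, 1.0])

p = np.array([0.0, 2.0, 0.0])            # satisfies g1(p) = 0 and g2(p) = 0
J = np.vstack([grad_g1(p), grad_g2(p)])  # rows span V; here r = 2, n = 3
U = null_space(J)                        # columns span U, the allowed directions

print(U.shape[1])   # dim U = n - r = 1
print(J @ U)        # ~0: every allowed direction is perpendicular to all grad g_k(p)
```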
If $\nabla f(p)\cdot X\ne0$ for some allowed direction $X$, then $f$ is not conditionally stationary at $p$: moving away from $p$ along an admissible curve with tangent $X$ changes $f$ to first order. For a constrained local extremum of $f$ at $p$ we therefore need $$\nabla f(p)\cdot X=0$$
for all directions $X\in U$; in other words, it is necessary that $$\nabla f(p)\in U^\perp=V\ .\tag{3}$$
If $\nabla f(p)\in V=\langle\nabla g_1(p),\ldots,\nabla g_r(p)\rangle$, then there are numbers $\lambda_k$ $(1\leq k\leq r)$ such that
$$\nabla f(p)=\sum_{k=1}^r \lambda_k\>\nabla g_k(p)\ .\tag{4}$$
Solving $(4)$ (with $x$ in place of $p$) together with $(1)$ will bring all regular constrained extrema of $f$ to the fore.
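As a small worked illustration of that last step (the objective and constraint here are made up), one can solve $(4)$ together with $(1)$ symbolically:

```python
# Sketch: solve (4) together with (1) symbolically for a made-up problem,
# extremizing f(x, y) = x + 2y subject to g(x, y) = x^2 + y^2 - 1 = 0.
import sympy as sp

x, y, lam = sp.symbols('x y lambda', real=True)
f = x + 2 * y
g = x ** 2 + y ** 2 - 1

grad_f = [sp.diff(f, v) for v in (x, y)]
grad_g = [sp.diff(g, v) for v in (x, y)]

# equations (4): grad f = lambda * grad g, plus the constraint (1): g = 0
equations = [sp.Eq(grad_f[i], lam * grad_g[i]) for i in range(2)] + [sp.Eq(g, 0)]
solutions = sp.solve(equations, (x, y, lam), dict=True)

print(solutions)  # the two regular conditionally stationary points: the max and the min of f on the circle
```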
Best Answer
Here is a moderately complicated example, cooked up in such a way that it can be solved explicitly all the way to the end.
Given are the two points $A=(0,-7)$, $B=(-7,0)$, and the circle $\gamma:\ x^2+y^2=4$. Determine two points $P$, $Q\in\gamma$ such that the quantity $$d(P,Q):=|AP|^2+|PQ|^2+|QB|^2$$ becomes maximal or minimal, respectively.
You will obtain four conditionally stationary situations $(P_k,Q_k)$: the two extremal situations and two "saddle points". The latter are to be expected, since the configuration space $\gamma\times\gamma$ determined by the constraints is a torus, and a smooth function on a torus cannot have only a maximum and a minimum as critical points.
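If you just want to locate the four stationary configurations numerically before working through the solution, here is one possible sketch in Python: parametrize both points by angles, so that $d$ becomes a function on the torus $(s,t)$, and hunt for zeros of its gradient from many starting points. The parametrization and the solver are just one way of doing this.

```python
# Numerical sketch for the example above: P = 2(cos s, sin s), Q = 2(cos t, sin t),
# so d is a function on the torus (s, t); find its stationary points by root-finding
# on a finite-difference gradient from a grid of starting points.
import numpy as np
from scipy.optimize import fsolve

A = np.array([0.0, -7.0])
B = np.array([-7.0, 0.0])

def d(s, t):
    P = 2.0 * np.array([np.cos(s), np.sin(s)])
    Q = 2.0 * np.array([np.cos(t), np.sin(t)])
    return np.sum((A - P) ** 2) + np.sum((P - Q) ** 2) + np.sum((Q - B) ** 2)

def grad_d(z, h=1e-6):
    # central-difference gradient of d on the torus
    s, t = z
    return [(d(s + h, t) - d(s - h, t)) / (2 * h),
            (d(s, t + h) - d(s, t - h)) / (2 * h)]

found = set()
for s0 in np.linspace(0, 2 * np.pi, 8, endpoint=False):
    for t0 in np.linspace(0, 2 * np.pi, 8, endpoint=False):
        z = fsolve(grad_d, [s0, t0])
        if np.allclose(grad_d(z), 0.0, atol=1e-6):
            found.add((round(z[0] % (2 * np.pi), 3), round(z[1] % (2 * np.pi), 3)))

for s, t in sorted(found):
    print(s, t, d(s, t))  # the stationary configurations (up to rounding): max, min, and two saddles
```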
A full solution (in German) is given on pp. 212–214 of the following pdf file. The file contains a full chapter (pages 128–236) of a textbook for engineering students; scroll down to page 212 to find the accompanying figure.
https://people.math.ethz.ch/~blatter/Inganalysis_5.pdf