Basically, the multiplier method is a way to encode the constraint information of the system directly into the Lagrangian so that you don't have to worry about screwing up the physical requirements of the problem when you solve the equations of motion. In other words, instead of solving the equations of motion and constraining the results, you're constraining the equations of motion first and then solving them.
Another (and equivalent) way to introduce multipliers is to change the Lagrangian:
$$
L=T-U+\lambda(f(x,y)-C)
$$
The function multiplying $\lambda$ is zero by virtue of the constraint equation, so it's not going to screw up your equations of motion, which depend on taking derivatives. Now, the constraint equation is reproduced by solving a third Lagrange equation, the one for $\lambda$: since $\dot\lambda$ never appears in $L$, that equation reduces to $\partial L/\partial\lambda = f(x,y)-C = 0$, which is just the constraint itself.
Solve your new system of three differential equations in three unknowns ($x$, $y$, and $\lambda$) and you're home free. No need to worry about the messy details at the end because you squirreled them away sneakily at the beginning. And all of this is sort of just a clever way of saying that constraint forces do no work.
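To make the recipe concrete, here is a minimal SymPy sketch, assuming for illustration a mass $m$ confined to the circle $x^2+y^2=l^2$ in a gravitational field (the setup, symbols, and helper `euler_lagrange` are my own choices, not part of the answer above). The $\lambda$ equation comes out as the constraint, exactly as advertised:

```python
# Minimal sketch (illustrative): augmented Lagrangian for a mass m
# confined to the circle x^2 + y^2 = l^2 in a gravitational field.
import sympy as sp

t = sp.Symbol('t')
m, g, l = sp.symbols('m g l', positive=True)
x, y, lam = (sp.Function(s)(t) for s in ('x', 'y', 'lam'))

L = sp.Rational(1, 2)*m*(x.diff(t)**2 + y.diff(t)**2) - m*g*y \
    + lam*(x**2 + y**2 - l**2)

def euler_lagrange(L, q):
    """Return d/dt(dL/dq') - dL/dq = 0 for the generalized coordinate q(t)."""
    return sp.Eq(sp.diff(L.diff(q.diff(t)), t) - L.diff(q), 0)

eq_x = euler_lagrange(L, x)      # m*x'' - 2*lam*x = 0
eq_y = euler_lagrange(L, y)      # m*y'' + m*g - 2*lam*y = 0
eq_lam = euler_lagrange(L, lam)  # the lam equation is just the constraint
```

Since $\dot\lambda$ never enters $L$, `eq_lam` carries no dynamics at all: it is $x^2+y^2-\ell^2=0$, squirreled into the system as promised.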
Let there be given a (configuration) manifold $M$. Often in physics one assumes that a constraint function $\chi$ obeys the following regularity conditions:
$\chi: \Omega\subseteq M \to \mathbb{R}$ is defined in an open neighborhood $\Omega$ of the constrained submanifold $C\subset M$;
$\chi$ is (sufficiently$^1$ many times) differentiable in $\Omega$;
The gradient $\vec{\nabla} \chi$ is non-vanishing on the constrained submanifold $C\subset M$.
Here it is implicitly understood that $\chi$ vanishes on the constrained submanifold $C\subset M$, i.e.
$$C\cap \Omega ~=~\chi^{-1}(\{0\})~:=~\{x\in\Omega \mid \chi(x)=0\}.$$
[Also we imagine that the full constrained submanifold $C\subset M$ is covered by a family $(\Omega_{\alpha})_{\alpha\in I}$ of open neighborhoods, each with a corresponding constraint function $\chi_{\alpha}: \Omega_{\alpha}\subseteq M \to \mathbb{R}$, and such that the constraint functions $\chi_{\alpha}$ and $\chi_{\beta}$ are compatible in neighborhood overlaps $\Omega_{\alpha}\cap \Omega_{\beta}$.] Since there (locally) is only one constraint, the constrained submanifold will be a hypersurface, i.e. of codimension 1. [More generally, there could be more than one constraint: Then the above regularity conditions should be modified accordingly. See e.g. Ref. 1 for details.]
The above regularity conditions are strictly speaking not always necessary, but they greatly simplify the general theory of constrained systems, e.g. in cases where one would like to use the inverse function theorem, the implicit function theorem, or to reparametrize the constraints $\chi\to\chi^{\prime}$. [The rank condition (3.) can be tied to the non-vanishing of the Jacobian $J$ in the inverse function theorem.]
Quantum mechanically, reparametrizations of constraints may induce a Faddeev-Popov-like determinantal factor in the path integral.
Example 1a: OP's 1st example (v1)
$$\tag{1a} \chi(x,y)~=~x^2+y^2-\ell^2$$
would fail condition 3 if $\ell=0$. If $\ell=0$, then $C=\{(0,0)\}\subset M=\mathbb{R}^2$ is just the origin, which has codimension 2. On the other hand, the $\chi$-constraint satisfies the regularity conditions 1-3 if $\ell>0$.
Example 1b: OP's 1st example (v3)
$$\tag{1b} \chi(x,y)~=~\sqrt{x^2+y^2}-\ell$$
is not differentiable at the origin $(x,y)=(0,0)$, and hence would fail condition 2 if $\ell=0$. On the other hand, the $\chi$-constraint satisfies the regularity conditions 1-3 if $\ell>0$.
Example 2a: Assume $\ell>0$. OP's 2nd example (v1)
$$\tag{2a} \chi(x,y)~=~\sqrt{x^2+y^2-\ell^2}$$
would fail conditions 1 and 2. The square root is not well-defined on one side ($x^2+y^2<\ell^2$) of the constrained submanifold $C$, so $\chi$ is not defined on any open neighborhood of $C$.
Example 2b: Assume $\ell>0$. OP's 2nd example (v3)
$$\tag{2b} \chi(x,y)~=~(x^2+y^2-\ell^2)^2$$
would fail condition 3 since the gradient $\vec{\nabla} \chi$ vanishes on the constrained submanifold $C$.
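The gradient tests in Examples 1a and 2b can be checked symbolically; here is a minimal SymPy sketch (the sample point $(x,y)=(\ell,0)$ on $C$ is my own choice for illustration):

```python
# Sketch: checking regularity condition 3 (non-vanishing gradient on C)
# for the constraints of Examples 1a and 2b, at a sample point on C.
import sympy as sp

x, y, l = sp.symbols('x y l', positive=True)

chi_1a = x**2 + y**2 - l**2        # Example 1a
chi_2b = (x**2 + y**2 - l**2)**2   # Example 2b

def gradient_on_C(chi, point):
    """Evaluate (d chi/dx, d chi/dy) at a point on the constraint surface."""
    return tuple(chi.diff(v).subs(point) for v in (x, y))

point = {x: l, y: 0}  # a point with x^2 + y^2 = l^2

grad_1a = gradient_on_C(chi_1a, point)  # (2*l, 0): condition 3 holds
grad_2b = gradient_on_C(chi_2b, point)  # (0, 0): condition 3 fails
```

By symmetry the same happens at every point of $C$: squaring the constraint as in Example 2b kills the gradient on the whole constraint surface.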
References:
- M. Henneaux and C. Teitelboim, Quantization of Gauge Systems, 1994; Subsection 1.1.2.
--
$^1$ Exactly how many times differentiable depends on the application.
So you should rewrite your Lagrangian $\mathcal{L}$ as follows:
$$ \mathcal{L} = \frac{1}{2}m(\dot{x}^2 + \dot{y}^2) -mgy + \lambda (x^2 + y^2 - l^2), $$
where the final term encodes your constraint, which normally takes the form $f(x,y)=0$ or, in generalised coordinates, $f(q_i,t)=0$ (a holonomic constraint). So the next step is to do what you have done above. Hope you can take it from here. P.S. A nice reference on how to approach the question is given here and here.
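As a sanity check on what $\lambda$ buys you, here is a minimal SymPy sketch of an assumed special case (the static equilibrium is my own simplification, not part of the answer above): with the pendulum at rest at $(x,y)=(0,-l)$, the $y$ equation of motion $m\ddot y + mg - 2\lambda y = 0$ fixes $\lambda$, and $2\lambda\,(x,y)$ is the constraint force, i.e. the string tension.

```python
# Minimal sketch (assumed static case, for illustration only): the pendulum
# hangs at rest at (x, y) = (0, -l); the y equation of motion
# m*y'' + m*g - 2*lambda*y = 0 then fixes lambda.
import sympy as sp

m, g, l = sp.symbols('m g l', positive=True)
lam = sp.Symbol('lambda')  # can be negative, so no positivity assumption

y_eq = sp.Eq(m*g - 2*lam*(-l), 0)      # y'' = 0 and y = -l substituted
lam_val = sp.solve(y_eq, lam)[0]       # lambda = -m*g/(2*l)
F_y = (2*lam*(-l)).subs(lam, lam_val)  # y component of the constraint force
# F_y comes out to m*g: the multiplier term supplies exactly the tension
# needed to balance gravity.
```

This is the payoff of the multiplier method: $\lambda$ is not just bookkeeping, it hands you the constraint force for free.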