The upshot of this answer is as follows: if a path satisfies the Euler-Lagrange equations for $L^2/2$, then it will satisfy the Euler-Lagrange equations for $L$, but the converse does not hold unless the path has affine parameterization.
Let $L = L(x, \dot x)$ be a Lagrangian that is a local function of only position and velocity. A parameterized path $x(s) = (x^i(s))$ on $M$ is said to satisfy the Euler-Lagrange equations for $L$ provided
\begin{align}
\frac{\partial L}{\partial x^i}(x(s), \dot x(s)) - \frac{d}{ds}\frac{\partial L}{\partial \dot x^i}(x(s), \dot x(s)) = 0
\end{align}
for all $i$ and for all $s$ in the domain of $x$.
Lemma 1. If $x$ satisfies the Euler-Lagrange equations for $L$, then the Beltrami Identity holds for $x$:
$$
\frac{d}{ds}L(x(s), \dot x(s)) = \frac{d}{ds}\left(\frac{\partial L}{\partial \dot x^i}\big(x(s), \dot x(s)\big)\cdot \dot x^i(s)\right)
$$
for all $s$ in the domain of $x$.
Proof. Try it yourself! The proof hinges on the fact that $L$ is a local function of only $x$ and $\dot x$.
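For readers who want to check their attempt, here is a sketch of one way the argument can go. It uses only the chain rule (there is no explicit $s$-dependence in $L$) and, in the second line, the Euler-Lagrange equations:
\begin{align}
\frac{dL}{ds}
&= \frac{\partial L}{\partial x^i}\,\dot x^i + \frac{\partial L}{\partial \dot x^i}\,\ddot x^i \\
&= \left(\frac{d}{ds}\frac{\partial L}{\partial \dot x^i}\right)\dot x^i + \frac{\partial L}{\partial \dot x^i}\,\ddot x^i \\
&= \frac{d}{ds}\left(\frac{\partial L}{\partial \dot x^i}\,\dot x^i\right).
\end{align}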
Lemma 2. If $L(x,\dot x) = \sqrt{g_{ij}(x)\dot x^i\dot x^j}$, then $L$ satisfies the following identity:
$$
\frac{\partial (L^2/2)}{\partial \dot x^i}(x, \dot x) \dot x^i = L(x,\dot x)^2
$$
Proof. Try this yourself too!
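As a check on your own computation, a sketch assuming $g_{ij}$ is symmetric: since $L^2/2 = \tfrac12 g_{ij}(x)\dot x^i\dot x^j$,
$$
\frac{\partial (L^2/2)}{\partial \dot x^k} = \frac12 g_{ij}\left(\delta^i_k\,\dot x^j + \dot x^i\,\delta^j_k\right) = g_{kj}\,\dot x^j,
$$
and contracting with $\dot x^k$ gives $g_{kj}\dot x^k\dot x^j = L^2$.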
Corollary. If $L(x,\dot x) = \sqrt{g_{ij}(x)\dot x^i\dot x^j}$, and $x$ satisfies the Euler-Lagrange equations for $L^2/2$, then $x$ satisfies the Euler-Lagrange equations for $L$.
Proof. If $x$ satisfies the Euler-Lagrange equations for $L^2/2$, then Lemma 1 gives the following Beltrami identity (we use notational shorthand here: all expressions should be evaluated on $x(s)$)
$$
\frac{d(L^2/2)}{ds} = \frac{d}{ds}\left(\frac{\partial (L^2/2)}{\partial \dot x^i}\cdot \dot x^i\right)
$$
On the other hand, evaluating both sides of Lemma 2 on $x(s)$, and taking the derivative of both sides with respect to $s$ gives
$$
\frac{d}{ds}\left(\frac{\partial (L^2/2)}{\partial \dot x^i}\cdot \dot x^i\right) = \frac{d(L^2)}{ds}
$$
Combining these two equations gives $d(L^2/2)/ds = d(L^2)/ds$, so $d(L^2)/ds = 0$. Hence $L^2$ is constant along $x(s)$, and therefore $L$ is also constant along $x(s)$:
$$
\frac{dL}{ds} = 0.
$$
Now, we separately notice that since $x$ satisfies the Euler-Lagrange equations for $L^2/2$, we have
\begin{align}
0
&= \frac{\partial(L^2/2)}{\partial x^i} - \frac{d}{ds} \frac{\partial (L^2/2)}{\partial \dot x^i} \\
&= L\left(\frac{\partial L}{\partial x^i} - \frac{d}{ds}\frac{\partial L}{\partial \dot x^i}\right) - \frac{dL}{ds}\frac{\partial L}{\partial \dot x^i} \tag{$\star$}\\
&= L\left(\frac{\partial L}{\partial x^i} - \frac{d}{ds}\frac{\partial L}{\partial \dot x^i}\right)
\end{align}
and therefore as long as $L\neq 0$, we see that $x$ satisfies the Euler-Lagrange equations for $L$ as was desired.
The crucial point here is that because of the specific form of $L$, any path satisfying the Euler-Lagrange equations for $L^2/2$ has the nice property that $dL/ds = 0$ along the path. This kills the term proportional to $dL/ds$ in $(\star)$, which is the essential difference between the Euler-Lagrange equations for $L^2/2$ and the Euler-Lagrange equations for $L$.
However, if $x$ satisfies the Euler-Lagrange equations for $L$, then it is not necessarily the case that $dL/ds = 0$ along $x$. In that case the $dL/ds$ term in $(\star)$ does not vanish, so $x$ need not be a solution to the Euler-Lagrange equations for $L^2/2$.
Nonetheless, if $x$ is affinely parameterized, then it will automatically have the property that $L$ is constant along it, so it will automatically satisfy both Euler-Lagrange equations.
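To see why affine parameterization forces $L$ to be constant, here is a short sketch. It assumes the Levi-Civita connection of $g$, and the symbol $D/ds$ (covariant derivative along the path) is notation introduced only for this remark. An affinely parameterized geodesic satisfies $D\dot x^j/ds = 0$, and metric compatibility then gives
$$
\frac{d}{ds}\left(g_{ij}\,\dot x^i\dot x^j\right) = 2\,g_{ij}\,\dot x^i\,\frac{D\dot x^j}{ds} = 0,
$$
so $L^2$, and hence $L$, is constant along the path.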
In fact, using parts of the computations above, it is not hard to show the following:
Proposition. Let $L(x, \dot x) = \sqrt{g_{ij}(x)\dot x^i\dot x^j}$. A path $x$ is an affinely parameterized geodesic if and only if it solves the Euler-Lagrange equations of both $L$ and $L^2/2$.
So the Euler-Lagrange equations of $L^2/2$ yield all affinely parameterized geodesics, while the Euler-Lagrange equations of $L$ yield all geodesics, regardless of parameterization.
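A quick numerical sanity check of this distinction, as a minimal sketch rather than a proof. It assumes the flat Euclidean plane ($g_{ij} = \delta_{ij}$), where $\partial L/\partial x^i = 0$ and the Euler-Lagrange expression reduces to $-\frac{d}{ds}$ of the conjugate momentum. The function names and the two test paths (`cubic_line`, `affine_line`) are hypothetical choices made for illustration; both paths trace the same straight line, one with a non-affine parameter and one with an affine parameter.

```python
import math

def velocity(path, s, h=1e-5):
    """Central-difference approximation of dx/ds."""
    ahead, behind = path(s + h), path(s - h)
    return [(a - b) / (2 * h) for a, b in zip(ahead, behind)]

def momentum_length(path, s):
    """dL/d(xdot) for L = |xdot| in the flat plane: the unit tangent."""
    v = velocity(path, s)
    speed = math.hypot(v[0], v[1])
    return [vi / speed for vi in v]

def momentum_energy(path, s):
    """d(L^2/2)/d(xdot) for the flat metric: the velocity itself."""
    return velocity(path, s)

def el_residual(momentum, path, s, h=1e-3):
    """Largest component of |d/ds momentum| at s.

    In flat space dL/dx = 0, so this is the whole Euler-Lagrange expression
    (up to sign); it vanishes exactly when the equations hold at s.
    """
    ahead, behind = momentum(path, s + h), momentum(path, s - h)
    return max(abs(a - b) / (2 * h) for a, b in zip(ahead, behind))

def cubic_line(s):
    """Straight line y = 2x with a non-affine parameter (assumed example)."""
    f = s**3 + s
    return (f, 2.0 * f)

def affine_line(s):
    """The same straight line with an affine parameter (assumed example)."""
    return (3.0 * s + 1.0, 6.0 * s + 2.0)

if __name__ == "__main__":
    for name, path in [("cubic", cubic_line), ("affine", affine_line)]:
        print(name,
              "length residual:", el_residual(momentum_length, path, 1.0),
              "energy residual:", el_residual(momentum_energy, path, 1.0))
```

For the non-affinely parameterized line, the residual for $L$ should vanish up to rounding while the residual for $L^2/2$ does not; for the affinely parameterized line, both should vanish up to rounding.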
The two formulations are thus equivalent.
This is false.
In the case of spacelike geodesics, your definition of $L$ gives an imaginary number. The complex numbers are not an ordered field, so there is no such thing as "shortest."
There is a similar problem for null geodesics. A null geodesic has $L=0$, and perturbations of a null geodesic may make $L$ either real or imaginary.
Even in the timelike case, it can happen that a geodesic is not a maximal-time curve. There is a discussion of this in Misner, Thorne, and Wheeler, p. 318.
The only general definition of a geodesic that works is that it parallel-transports its own tangent vector, i.e., it's the straightest path.
Best Answer
As you may know, the geodesic equation, your equation (1), is not obtained as the Euler-Lagrange equations of the curve-length functional (2), but rather as the Euler-Lagrange equations of the energy functional
$$E = \frac12\int d\lambda\, g_{\mu\nu} \dot{x}^\mu\dot{x}^\nu.$$
I'm writing $\lambda$ rather than $\tau$ to avoid the suggestion that this has to be the proper time.
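For reference, here is a sketch of how the Euler-Lagrange equations of $E$ reduce to the standard form of the geodesic equation. It assumes that this is what your equation (1) looks like and that $g_{\mu\nu}$ is symmetric; dots denote $d/d\lambda$:
\begin{align}
0 &= \frac{d}{d\lambda}\left(g_{\mu\nu}\dot x^\nu\right) - \frac12\,\partial_\mu g_{\alpha\beta}\,\dot x^\alpha\dot x^\beta \\
&= g_{\mu\nu}\ddot x^\nu + \frac12\left(\partial_\alpha g_{\mu\beta} + \partial_\beta g_{\mu\alpha} - \partial_\mu g_{\alpha\beta}\right)\dot x^\alpha\dot x^\beta .
\end{align}
Contracting with the inverse metric $g^{\rho\mu}$ gives
$$
\ddot x^\rho + \Gamma^\rho_{\alpha\beta}\,\dot x^\alpha\dot x^\beta = 0,
\qquad
\Gamma^\rho_{\alpha\beta} = \frac12\,g^{\rho\mu}\left(\partial_\alpha g_{\mu\beta} + \partial_\beta g_{\mu\alpha} - \partial_\mu g_{\alpha\beta}\right).
$$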
It is not very hard to show that extremals of $E$ are extremals of $L$, but the converse doesn't hold. In fact, length-extremizing curves are extrema of $E$ if and only if they are true geodesics, i.e. affinely parameterized.
So your equation (1) is the Euler-Lagrange equation of $E$, whose solutions are already affinely parameterized. Adding (3) to it, the only additional requirement is that the curve be timelike.
All three classes of geodesics (timelike, spacelike, and lightlike) have affine parameterizations. For timelike geodesics, proper time can be taken as an affine parameter; for spacelike geodesics, proper length can be taken; and for lightlike geodesics, no affine parameter has a special meaning.