There are two manifolds involved in string propagation:

1. The spacetime in which the string propagates.
2. The worldsheet of the string itself.
The fields $X^\mu$ are embedding coordinates of the worldsheet in the spacetime manifold. This means that for each point $(\sigma^1, \sigma^2)$ on the worldsheet, $X^\mu(\sigma^1, \sigma^2)$ gives the coordinates of that point in the spacetime manifold.
In the case you are considering, the spacetime is taken to be Minkowski, so the metric is $\eta_{\mu\nu}$. Now we could ask
"Given that the worldsheet is a two dimensional embedded submanifold of Minkowski space, is there some way that this manifold inherits its metric from the metric on the ambient spacetime?"
This question is analogous to
"Given that the sphere $S^2$ is some two-dimensional embedded submanifold of Euclidean space $\mathbb R^3$, is there some natural sense in which it inherits its metric from $\mathbb R^3$?"
The answer to both of these questions is yes, and the metric on the submanifold that does this is precisely the induced metric. The formula expressing the induced metric for a two-dimensional submanifold of some ambient manifold with metric $g_{\mu\nu}$ (not necessarily flat) in terms of embedding coordinates is
$$
\gamma_{ab}(\sigma) = g_{\mu\nu}(X(\sigma))\partial_aX^\mu(\sigma)\partial_b X^\nu(\sigma), \qquad \sigma = (\sigma^1, \sigma^2)
$$
You are right about the derivation of the induced metric: it comes from demanding that the distance between points on the embedded submanifold is the same number whether you compute it with the ambient metric or with the induced metric. To see that the above expression for the induced metric does this, note that the squared infinitesimal distance between two nearby points on the embedded submanifold can be written in terms of the ambient metric and the embedding coordinates as
\begin{align}
g_{\mu\nu}(X(\sigma))d(X^\mu(\sigma))d(X^\nu(\sigma))
&= g_{\mu\nu}(X(\sigma))\partial_a X^\mu(\sigma)\partial_bX^\nu(\sigma)d\sigma^ad\sigma^b \\
&= \gamma_{a b}(\sigma)d\sigma^ad\sigma^b
\end{align}
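If it helps to see the formula for $\gamma_{ab}$ evaluated concretely, here is a minimal numerical sketch. It is only an illustration under stated assumptions: the Minkowski metric is taken with signature $(-,+,+,+)$, the embedding `circular_string` is a hypothetical static circular loop of radius `RADIUS` (it is not claimed to solve the string equations of motion), and the derivatives $\partial_a X^\mu$ are approximated by central finite differences.

```python
import math

# Assumed convention: Minkowski metric with signature (-, +, +, +).
ETA = [
    [-1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

RADIUS = 2.0  # hypothetical loop radius, chosen only for illustration


def circular_string(tau, sigma):
    """Hypothetical embedding X^mu(tau, sigma): a static circular loop."""
    return (tau, RADIUS * math.cos(sigma), RADIUS * math.sin(sigma), 0.0)


def induced_metric(embed, point, ambient=ETA, h=1e-5):
    """Approximate gamma_ab = g_mu_nu d_a X^mu d_b X^nu at a worldsheet point.

    The derivatives d_a X^mu are estimated with central finite differences.
    The ambient metric is taken to be constant (flat), as in the Minkowski case.
    """
    tangents = []
    for a in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[a] += h
        backward[a] -= h
        x_plus = embed(*forward)
        x_minus = embed(*backward)
        tangents.append([(p - m) / (2 * h) for p, m in zip(x_plus, x_minus)])

    dim = len(ambient)
    return [
        [
            sum(
                ambient[mu][nu] * t_a[mu] * t_b[nu]
                for mu in range(dim)
                for nu in range(dim)
            )
            for t_b in tangents
        ]
        for t_a in tangents
    ]


gamma = induced_metric(circular_string, (0.3, 1.1))
print(gamma)  # numerically close to diag(-1, RADIUS**2)
```

For this particular embedding the tangent vectors are $\partial_\tau X = (1, 0, 0, 0)$ and $\partial_\sigma X = (0, -R\sin\sigma, R\cos\sigma, 0)$, so the induced metric is $\mathrm{diag}(-1, R^2)$ in the assumed signature, which is what the finite-difference result approximates.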
To get some intuition for all of this, recall that the embedding coordinates of the unit sphere $S^2$ in $\mathbb R^3$ are
\begin{align}
X(\theta, \phi) &= \sin\theta\cos\phi\\
Y(\theta, \phi) &= \sin\theta\sin\phi\\
Z(\theta, \phi) &= \cos\theta
\end{align}
and using these embeddings you should be able to show that the metric on the sphere is simply
$$
\gamma_{ab}(\theta, \phi) = \mathrm{diag}(1, \sin^2\theta)
$$
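For reference, here is one way that check can go, assuming the unit sphere and the flat Euclidean ambient metric $g_{\mu\nu} = \delta_{\mu\nu}$ (the shorthand $\vec X = (X, Y, Z)$ is introduced here only for compactness). The tangent vectors are
\begin{align}
\partial_\theta \vec X &= (\cos\theta\cos\phi,\ \cos\theta\sin\phi,\ -\sin\theta)\\
\partial_\phi \vec X &= (-\sin\theta\sin\phi,\ \sin\theta\cos\phi,\ 0)
\end{align}
and contracting them with $\delta_{\mu\nu}$ gives
\begin{align}
\gamma_{\theta\theta} &= \cos^2\theta\cos^2\phi + \cos^2\theta\sin^2\phi + \sin^2\theta = 1\\
\gamma_{\theta\phi} &= -\sin\theta\cos\theta\sin\phi\cos\phi + \sin\theta\cos\theta\sin\phi\cos\phi = 0\\
\gamma_{\phi\phi} &= \sin^2\theta\sin^2\phi + \sin^2\theta\cos^2\phi = \sin^2\theta
\end{align}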
Let me know if that's unclear or if you need more detail!
Let's recap: vectors carry upper indices ($x^\mu$), and the inner product on Minkowski space is given by the metric $g_{\mu \nu}$, so "dual vectors" carry lower indices: $x_\nu = g_{\nu \rho} x^\rho.$
Then you see that a matrix (in the sense of linear map between vectors) has one upper and one lower index, because it maps a vector to another vector:
$$x^\mu \mapsto A^\mu{}_\nu x^\nu.$$
So $\Lambda^\mu {}_\nu$ is actually a very natural object: it is the matrix that rotates or Lorentz-boosts vectors.
If you prefer working with the tensor $\Lambda_{\mu \nu}$, then the link between both tensors is given by
$$\Lambda^\mu{}_\nu = g^{\mu \rho} \Lambda_{\rho \nu} \Leftrightarrow \Lambda_{\mu \nu} = g_{\mu \sigma} \Lambda^\sigma {}_\nu.$$
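The index gymnastics above can be checked with a small numerical sketch. The assumptions here are mine, not part of the answer: the metric is Minkowski with signature $(-,+,+,+)$, the transformation is a boost along $x$ with rapidity 0.5, and the helper names (`boost_x`, `matmul`, `apply`, `inner`) are hypothetical.

```python
import math

# Assumed convention: Minkowski metric with signature (-, +, +, +).
G = [
    [-1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
# In these coordinates the Minkowski metric is numerically its own inverse.
G_INV = [row[:] for row in G]


def matmul(a, b):
    """Matrix product (a b)_{ij} = sum_k a_{ik} b_{kj}."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def boost_x(rapidity):
    """Mixed-index matrix Lambda^mu_nu for a boost along the x axis."""
    c, s = math.cosh(rapidity), math.sinh(rapidity)
    return [
        [c, -s, 0.0, 0.0],
        [-s, c, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def apply(matrix, vector):
    """x^mu -> A^mu_nu x^nu."""
    return [sum(row[nu] * vector[nu] for nu in range(len(vector))) for row in matrix]


def inner(x, y):
    """Inner product g_{mu nu} x^mu y^nu."""
    return sum(G[m][n] * x[m] * y[n] for m in range(4) for n in range(4))


lam_mixed = boost_x(0.5)
lam_lower = matmul(G, lam_mixed)     # Lambda_{mu nu} = g_{mu sigma} Lambda^sigma_nu
lam_back = matmul(G_INV, lam_lower)  # Lambda^mu_nu = g^{mu rho} Lambda_{rho nu}

x = [1.0, 2.0, 3.0, 4.0]
x_boosted = apply(lam_mixed, x)
```

Lowering and then raising the first index returns the original mixed-index matrix, and the boosted vector has the same inner product with itself as the original, which is the sense in which $\Lambda^\mu{}_\nu$ is the natural object acting on vectors.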
Since the metric and inverse metric are related by $$ g^{\mu\lambda}g_{\lambda\nu} = \delta^\mu_\nu $$ taking the variation of both sides gives $$ \delta g^{\mu\lambda}g_{\lambda\nu} + g^{\mu\lambda}\delta g_{\lambda\nu} =0 $$ or in other words, after contracting with another factor of the metric, $$ \delta g_{\mu\nu} = -g_{\mu\rho}g_{\nu\sigma}\delta g^{\rho\sigma} $$ It follows that there is a one-to-one correspondence between variations of the metric and variations of its inverse, so a functional of the metric is stationary with respect to variations of the metric if and only if it is stationary with respect to variations of the inverse metric. We can therefore vary with respect to either the metric or its inverse, whichever is more convenient.
For example, the Einstein-Hilbert action involves the Ricci scalar, which can be written in terms of the inverse metric as $R = g^{\mu\nu}R_{\mu\nu}$, so varying with respect to the inverse metric is convenient: $\delta R = \delta g^{\mu\nu} R_{\mu\nu} + g^{\mu\nu}\delta R_{\mu\nu}$.
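As a sanity check of the relation $\delta g_{\mu\nu} = -g_{\mu\rho}g_{\nu\sigma}\delta g^{\rho\sigma}$, here is a minimal numerical sketch. Everything specific in it is an assumption made for illustration: a hypothetical $2\times 2$ symmetric metric `g`, a small symmetric perturbation `delta_g_inv` of its inverse, and a step size `eps`. The exact change in the metric should agree with the first-order formula up to terms of order `eps**2`.

```python
def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul(a, b):
    """Matrix product of two 2x2 matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


g = [[2.0, 0.5], [0.5, 1.0]]  # hypothetical symmetric metric g_{mu nu}
g_inv = inverse_2x2(g)        # inverse metric g^{mu nu}

eps = 1e-6
direction = [[0.3, -0.1], [-0.1, 0.2]]  # symmetric direction for delta g^{rho sigma}
delta_g_inv = [[eps * v for v in row] for row in direction]

# Exact change in g_{mu nu} caused by perturbing the inverse metric.
g_inv_new = [[g_inv[i][j] + delta_g_inv[i][j] for j in range(2)] for i in range(2)]
g_new = inverse_2x2(g_inv_new)
delta_g_exact = [[g_new[i][j] - g[i][j] for j in range(2)] for i in range(2)]

# First-order prediction: delta g_{mu nu} = -g_{mu rho} g_{nu sigma} delta g^{rho sigma}.
# Because g is symmetric, this is minus the matrix product g * delta_g_inv * g.
g_dg_g = matmul(matmul(g, delta_g_inv), g)
delta_g_first_order = [[-v for v in row] for row in g_dg_g]

max_error = max(
    abs(delta_g_exact[i][j] - delta_g_first_order[i][j])
    for i in range(2)
    for j in range(2)
)
print(max_error)  # expected to be of order eps**2, far smaller than eps
```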