The best derivation is Polyakov's, and it is found in the long string chapter of "Gauge Fields and Strings".
The key point is that the h-field in the path integral is integrated over, but it doesn't have derivative terms, so the fluctuations in the h-field just act to replace it at each point by its stationary value. The X-parts just go along for the ride when looking for stationary points of h, so you can write the action as
$$ S = \int \sqrt{h} h^{\alpha\beta} \gamma_{\alpha\beta} $$
where $\gamma_{\alpha\beta} = \partial_\alpha X^\mu \partial_\beta X_\mu$ is the dot product of an $\alpha$ coordinate step with a $\beta$ coordinate step, i.e. it is the induced metric. The induced metric plays the role of a source term in the h path-integral (ignoring the X path integral). The stationary point condition is found by varying the action with respect to $h^{\alpha\beta}$, using the important determinant variation formula $\delta h = h h^{\alpha\beta} \delta h_{\alpha\beta} = -h h_{\alpha\beta} \delta h^{\alpha\beta}$, which you learn in math class as "expansion by minors" and "the inverse-minor theorem":
$$\sqrt{h} \gamma_{\alpha\beta} - {1\over2\sqrt{h}} h h_{\alpha\beta} h^{\kappa\delta}\gamma_{\kappa\delta} = 0 $$
If you solve for h, you find that
$$ h_{\alpha\beta} = {\gamma_{\alpha\beta}\over {1\over 2} h^{\kappa\delta}\gamma_{\kappa\delta}}$$
This might look like an incomplete solution, but the denominator on the right is a scalar, so this is saying that the tensors h and $\gamma$ are proportional:
$$ h_{\alpha\beta} = A(x) \gamma_{\alpha\beta} $$
where the proportionality factor A(x) won't make any difference (any choice of A gives a solution, and all choices lead to the same action).
Substitute the extremal value for h into the action, remembering how to take an inverse matrix, $h^{\alpha\beta} = {1\over A} \gamma^{\alpha\beta}$, and that $\sqrt{h} = A\sqrt{\gamma}$ in two dimensions. The factors of A cancel, so the action contribution for each external source $\gamma_{\alpha\beta}$ is proportional to $\sqrt{\gamma}$ no matter what $A(x)$ happens to be, which gives the Nambu-Goto action. Then you integrate the Nambu-Goto action over the remaining path-integral variables, which are the embedding coordinates $X^\mu$.
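As a quick sanity check of the cancellation, here is a minimal numerical sketch in plain Python. It assumes a Euclidean-signature worldsheet metric, and the sample matrix `gamma` and the helper names (`det2`, `inv2`, `polyakov_density`) are made up for illustration. It evaluates $\sqrt{h}\,h^{\alpha\beta}\gamma_{\alpha\beta}$ at $h = A\gamma$ for a few values of $A$ and compares with $2\sqrt{\gamma}$.

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]

def polyakov_density(h, gamma):
    """sqrt(det h) * h^{ab} gamma_{ab} for a Euclidean-signature 2x2 metric h."""
    h_inv = inv2(h)
    contraction = sum(h_inv[a][b] * gamma[a][b]
                      for a in range(2) for b in range(2))
    return math.sqrt(det2(h)) * contraction

# Hypothetical induced metric (symmetric, positive definite), for illustration only.
gamma = [[2.0, 0.3], [0.3, 1.5]]
nambu_goto = 2.0 * math.sqrt(det2(gamma))

for A in (0.1, 1.0, 7.5):
    # Stationary solution h = A * gamma with an arbitrary positive factor A.
    h = [[A * gamma[a][b] for b in range(2)] for a in range(2)]
    print(A, polyakov_density(h, gamma), nambu_goto)
```

The last two numbers printed on each line should agree up to rounding, whatever the value of $A$.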
The Nambu-Goto path integral is hard to understand in any way other than solving it classically, defining harmonic oscillators, and quantizing these by assuming they turn into standard harmonic oscillators. This is the old approach to string theory. The Polyakov action is just used to fix a gauge for h which will turn the problem into a simple sigma-model. So the equivalence between them is more of a formal thing, which relates the harmonic oscillator expansion to the vertex operators in the h formalism. It isn't necessarily a path-integral equality, because the Nambu-Goto path integral is not clearly well defined outside of turning it into Polyakov and fixing gauge for h.
I) The closest cosmetic resemblance between the Nambu-Goto action and the Polyakov action is achieved if we write them as
$$\tag{1} S_{NG}~=~ -\frac{T_0}{c} \int d^2{\rm vol} ~\det(M)^{\frac{1}{2}} , $$
and
$$\tag{2} S_{P}~=~ -\frac{T_0}{c}\int d^2{\rm vol}~ \frac{{\rm tr}(M)}{2} , $$
respectively. Here $h_{ab}$ is an auxiliary world-sheet (WS) metric of Lorentzian signature $(-,+)$, i.e. minus in the temporal WS direction;
$$\tag{3} d^2{\rm vol}~:=~\sqrt{-h}~d\tau \wedge d\sigma$$
is a diffeomorphism-invariant WS volume-form (an area actually);
$$\tag{4} M^{a}{}_{c}~:=~(h^{-1})^{ab}\gamma_{bc} $$
is a mixed tensor; and
$$\tag{5} \gamma_{ab}~:=~(X^{\ast}G)_{ab}~:=~\partial_a X^{\mu} ~\partial_b X^{\nu}~ G_{\mu\nu}(X) $$
is the induced WS metric via pull-back of the target space (TS) metric $G_{\mu\nu}$ with Lorentzian signature $(-,+, \ldots, +)$.
Note that the Nambu-Goto action (1) does not actually depend on the auxiliary WS metric $h_{ab}$ at all, while the Polyakov action (2) does.
II) As is well-known, varying the Polyakov action (2) wrt. the WS metric $h_{ab}$ implies that the $2\times 2$ matrix
$$\tag{6} M^{a}{}_{b}~\approx~\frac{{\rm tr}(M)}{2} \delta^a_b~\propto~\delta^a_b $$
must be proportional to the $2\times 2$ unit matrix on-shell. This implies that
$$\tag{7} \det(M)^{\frac{1}{2}} ~\approx~ \frac{{\rm tr}(M)}{2},$$
so that the two actions (1) and (2) coincide on-shell, see e.g. the Wikipedia page. (Here the $\approx$ symbol means equality modulo eom.)
III) Now, let us imagine that we only know the Nambu-Goto action (1) and not the Polyakov action (2). The only diffeomorphism-invariant combinations of the matrix $M^{a}{}_{b}$ are the determinant $\det(M)$, the trace ${\rm tr}(M)$, and functions thereof.
If furthermore the TS metric $G_{\mu\nu}$ is dimensionful, and we demand that the action is linear in that dimension, this leads us to consider action terms of the form
$$\tag{8} S~=~ -\frac{T_0}{c}\int d^2{\rm vol}~ \det(M)^{\frac{p}{2}} \left(\frac{{\rm tr}(M)}{2}\right)^{1-p} , $$
where $p\in \mathbb{R}$ is a real power. Alternatively, Weyl invariance leads us to consider the action (8). Obviously, the Polyakov action (2) (corresponding to $p=0$) is not far away if we would like simple integer powers in our action.
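The properties of the family (8) can be checked numerically. This is a rough sketch in plain Python, not part of the original argument: the sample matrices `h` and `gamma` (both of Lorentzian signature) and all helper names are assumptions chosen for illustration. It evaluates the integrand $\sqrt{-h}\,\det(M)^{p/2}\,({\rm tr}(M)/2)^{1-p}$ for several $p$, before and after a Weyl rescaling $h_{ab}\to e^{2\omega}h_{ab}$, and also on-shell at $h_{ab}\propto\gamma_{ab}$.

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]

def matmul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def weyl_rescale(h, omega):
    """Return e^{2 omega} * h."""
    factor = math.exp(2.0 * omega)
    return [[factor * h[a][b] for b in range(2)] for a in range(2)]

def density(h, gamma, p):
    """sqrt(-det h) * det(M)^(p/2) * (tr(M)/2)^(1-p), with M = h^{-1} gamma."""
    M = matmul2(inv2(h), gamma)
    half_trace = (M[0][0] + M[1][1]) / 2.0
    return math.sqrt(-det2(h)) * det2(M) ** (p / 2.0) * half_trace ** (1.0 - p)

# Hypothetical Lorentzian-signature sample metrics, for illustration only.
h = [[-1.0, 0.2], [0.2, 1.3]]
gamma = [[-2.0, 0.1], [0.1, 0.8]]

for p in (0.0, 0.5, 1.0, 2.0):
    off_shell = density(h, gamma, p)
    rescaled = density(weyl_rescale(h, 0.7), gamma, p)
    on_shell = density(weyl_rescale(gamma, 0.3), gamma, p)  # h proportional to gamma
    print(p, off_shell, rescaled, on_shell)
```

For each $p$ the first two numbers should agree (Weyl invariance), and the third should equal $\sqrt{-\det\gamma}$ for every $p$, while the off-shell values for $p=0$ (Polyakov) and $p=1$ (Nambu-Goto) differ.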
There are two manifolds that are involved in string propagation:
1. The spacetime in which the string propagates.
2. The worldsheet of the string itself.
The fields $X^\mu$ are embedding coordinates of the worldsheet in the spacetime manifold. This means that for each point $(\sigma^1, \sigma^2)$ on the worldsheet, $X^\mu(\sigma^1, \sigma^2)$ gives the coordinates of that point in the spacetime manifold.
In the case you are considering, the spacetime is taken to be Minkowski, so the metric is $\eta_{\mu\nu}$. Now we could ask
"Given that the worldsheet is a two dimensional embedded submanifold of Minkowski space, is there some way that this manifold inherits its metric from the metric on the ambient spacetime?"
This question is analogous to
"Given that the sphere $S^2$ is some two-dimensional embedded submanifold of Euclidean space $\mathbb R^3$, is there some natural sense in which it inherits its metric from $\mathbb R^3$?"
The answer to both of these questions is yes, and the metric on the submanifold that does this is precisely the induced metric. The formula expressing the induced metric for a two-dimensional submanifold of some ambient manifold with metric $g_{\mu\nu}$ (not necessarily flat) in terms of embedding coordinates is
$$ \gamma_{ab}(\sigma) = g_{\mu\nu}(X(\sigma))\partial_aX^\mu(\sigma)\partial_b X^\nu(\sigma), \qquad \sigma = (\sigma^1, \sigma^2) $$
You are right about the derivation of the induced metric: it comes from demanding that the distance measured between points on the embedded submanifold is the same number whether you use the ambient metric or the induced metric. To see that the above expression for the induced metric does this, simply note that the infinitesimal distance between any two points on the embedded submanifold can be written in terms of the ambient metric and the embedding coordinates as
\begin{align} g_{\mu\nu}(X(\sigma))d(X^\mu(\sigma))d(X^\nu(\sigma)) &= g_{\mu\nu}(X(\sigma))\partial_a X^\mu(\sigma)\partial_bX^\nu(\sigma)d\sigma^ad\sigma^b \\ &= \gamma_{a b}(\sigma)d\sigma^ad\sigma^b \end{align}
To get some intuition for all of this, recall that the expression for the embedding coordinates of the unit sphere $S^2$ in $\mathbb R^3$ is
\begin{align} X(\theta, \phi) &= \sin\theta\cos\phi\\ Y(\theta, \phi) &= \sin\theta\sin\phi\\ Z(\theta, \phi) &= \cos\theta \end{align}
and using these embeddings you should be able to show that the metric on the sphere is simply
$$ \gamma_{ab}(\theta, \phi) = \mathrm{diag}(1, \sin^2\theta) $$
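If you want to check the sphere result without doing the algebra, the following plain-Python sketch computes the induced metric numerically. It assumes a flat Euclidean ambient metric and approximates the derivatives $\partial_a X^\mu$ by central differences; the function names and the sample point are made up for illustration.

```python
import math

def embedding(theta, phi):
    """Embedding coordinates (X, Y, Z) of the unit sphere in R^3."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def induced_metric(embed, sigma, eps=1e-6):
    """gamma_ab = sum_mu (d_a X^mu)(d_b X^mu) for a flat Euclidean ambient metric.

    The partial derivatives are approximated by central differences.
    """
    tangents = []
    for a in range(2):
        plus, minus = list(sigma), list(sigma)
        plus[a] += eps
        minus[a] -= eps
        x_plus, x_minus = embed(*plus), embed(*minus)
        tangents.append([(xp - xm) / (2.0 * eps)
                         for xp, xm in zip(x_plus, x_minus)])
    return [[sum(u * v for u, v in zip(tangents[a], tangents[b]))
             for b in range(2)] for a in range(2)]

# Arbitrary sample point on the sphere, chosen for illustration.
theta, phi = 1.1, 0.4
gamma = induced_metric(embedding, (theta, phi))
expected = [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]
print(gamma)
print(expected)
```

The two printed matrices should agree up to finite-difference error. For a curved ambient space the sum over $\mu$ would be replaced by a contraction with $g_{\mu\nu}(X(\sigma))$.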
Let me know if that's unclear or if you need more detail!