Let's start at the beginning:
The setting for relativity - be it special or general - is that spacetime is a manifold $\mathcal{M}$, i.e. something that is locally homeomorphic to Cartesian space $\mathbb{R}^n$ ($n = 4$ in the case of relativity), but not globally.
Such manifolds possess a tangent space $T_p\mathcal{M}$ at every point, which is where the vectors one usually talks about live. If you choose coordinates $x^i$ on the manifold, then the space of tangent vectors is
$$T_p\mathcal{M} := \left\{\sum_{i=0}^{3} c^i \frac{\partial}{\partial x^i} \,\middle|\, c^i \in \mathbb{R} \right\}$$
When we say that a tuple $(c^0,c^1,c^2,c^3)$ is a vector, we mean that it corresponds to the object $c^i\partial_i \in T_p\mathcal{M}$ at some point $p \in \mathcal{M}$.
A metric on $\mathcal{M}$ can be given by specifying a non-degenerate, symmetric bilinear form at each point
$$g_p : T_p\mathcal{M} \times T_p\mathcal{M} \rightarrow \mathbb{R}$$
What you learned "in general" is that, for chosen basis vectors $\partial_i$ of $T_p\mathcal{M}$, the components of the metric are defined by $g_{ij} = g(\partial_i,\partial_j)$. You can now indeed see the metric as a kind of scalar product, setting $X \cdot Y := g(X,Y)$ for two vectors $X,Y$. (This contains the answer to your second problem.) But for non-Riemannian manifolds, i.e. manifolds whose metric is not positive-definite, this is not a scalar product in the sense you may be used to. In particular, it can be zero for nonzero vectors. Vectors for which it is zero are usually called lightlike or null.
The important thing to take away is that manifolds do not always behave like Cartesian space.
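For instance (a minimal numerical sketch, with the signature convention $(-,+,+,+)$ assumed purely for illustration), a metric of this kind assigns zero "length" to some nonzero vectors:

```python
import numpy as np

# Minkowski metric in components, signature (-, +, +, +) (a convention
# assumed here for illustration).
eta = np.diag([-1.0, 1.0, 1.0, 1.0])

def g(X, Y):
    """The bilinear form g(X, Y) = g_ij X^i Y^j in these coordinates."""
    return X @ eta @ Y

timelike  = np.array([1.0, 0.0, 0.0, 0.0])
spacelike = np.array([0.0, 1.0, 0.0, 0.0])
null      = np.array([1.0, 1.0, 0.0, 0.0])  # nonzero, yet g(X, X) = 0

print(g(timelike, timelike))    # -1.0
print(g(spacelike, spacelike))  # 1.0
print(g(null, null))            # 0.0  -> lightlike / null
```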
Now, for your third problem, we need the concept of the cotangent space $T_p^*\mathcal{M}$. It is the dual vector space to the tangent space, spanned by the differentials $\mathrm{d}x^i : T_p\mathcal{M} \rightarrow \mathbb{R}$ for a chosen coordinate system, and defined by
$$\mathrm{d}x^i(\partial_j) = \delta^i_j$$
Now, recall that the metric was a map from twice the tangent space to $\mathbb{R}$. As such, we can see it as an element of the tensor product $T_p^*\mathcal{M} \otimes T_p^*\mathcal{M}$, which is the space spanned by elements of the form $\mathrm{d}x^i \otimes \mathrm{d}x^j$. As the metric is an element of this space, it is expandable in this basis:
$$ g = g_{ij}\mathrm{d}x^i\mathrm{d}x^j$$
where the physicist just drops the bothersome $\otimes$ sign. Now, what has this to do with infinitesimal distance? We simply define the length of a path $\gamma : [a,b] \rightarrow \mathcal{M}$ to be (with $\gamma'(t)$ denoting the tangent vector to the path)$[1]$
$$ L[\gamma] := \int_a^b \sqrt{\lvert g(\gamma'(t),\gamma'(t))\rvert}\mathrm{d}t$$
And, by using physicists' sloppy notation, $g(\gamma'(t),\gamma'(t)) = g_{ij} \frac{\mathrm{d}x^i}{\mathrm{d}t}\frac{\mathrm{d}x^j}{\mathrm{d}t}$, if we understand $x^i(t)$ as the $i$-th coordinate of the point $\gamma(t)$, and so:
$$ L[\gamma] = \int_a^b \sqrt{g_{ij} \frac{\mathrm{d}x^i}{\mathrm{d}t}\frac{\mathrm{d}x^j}{\mathrm{d}t}}\mathrm{d}t = \int_a^b \sqrt{g_{ij}\mathrm{d}x^i\mathrm{d}x^j}\frac{\mathrm{d}t}{\mathrm{d}t} = \int_a^b \sqrt{g_{ij}\mathrm{d}x^i\mathrm{d}x^j}$$
Since we call $\mathrm{d}s$ the infinitesimal line element that fulfills $L = \int \mathrm{d}s$, this is suggestive of the notation
$$ \mathrm{d}s^2 = g_{ij}\mathrm{d}x^i\mathrm{d}x^j$$
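As a sanity check of the length formula (a sketch with sympy, using the familiar Euclidean plane in polar coordinates rather than a relativistic metric), the path $\gamma(t) = (r,\phi) = (R, t)$ for $t \in [0, 2\pi]$ should come out with length $2\pi R$, the circumference of a circle:

```python
import sympy as sp

t, R = sp.symbols('t R', positive=True)

# Euclidean plane in polar coordinates: g = dr^2 + r^2 dphi^2,
# i.e. g_ij = diag(1, r^2) in the (r, phi) chart.
r_of_t = R        # the path: a circle of radius R ...
phi_of_t = t      # ... traversed as t runs from 0 to 2*pi

# components of the tangent vector gamma'(t) in this chart
dr_dt = sp.diff(r_of_t, t)
dphi_dt = sp.diff(phi_of_t, t)

# g(gamma'(t), gamma'(t)) = g_rr (dr/dt)^2 + g_phiphi (dphi/dt)^2
speed_sq = dr_dt**2 + r_of_t**2 * dphi_dt**2

L = sp.integrate(sp.sqrt(speed_sq), (t, 0, 2*sp.pi))
print(L)  # 2*pi*R
```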
If we notice that, by the definition of tangent and cotangent vectors through derivatives and differentials as above, objects with upper indices transform exactly in the opposite way to objects with lower indices (see also my answer here), it is seen that this expression is indeed invariant under arbitrary coordinate transformations.
$[1]$ $\gamma'(t)$ is really a tangent vector in the following sense:
Let $x : \mathcal{M} \rightarrow \mathbb{R}^n$ be a coordinate chart. Consider then $x \circ \gamma : [a,b] \rightarrow \mathbb{R}^n$. Since it is an ordinary function between (subsets of) Cartesian spaces, it has a derivative
$$(x \circ \gamma)' : [a,b] \rightarrow \mathbb{R}^n$$
Now, $(x \circ \gamma)'^i(t)$ can be thought of as the components of the tangent vector $\gamma'(t) := (x \circ \gamma)'^i(t)\partial_i \in T_{\gamma(t)}\mathcal{M}$. It is a somewhat tedious, but worthwhile exercise to show that this definition of $\gamma'(t)$ is independent of the choice of coordinates $x$.
Your exam question with the surfaces is asking about something different. You are given an embedding of a lower-dimensional submanifold $\mathcal{N}$ into Cartesian space
$$ \sigma: \mathcal{N} \hookrightarrow \mathbb{R}^n $$
and asked to calculate the induced metric on the submanifold from the Cartesian metric
$$\mathrm{d}s^2 = \sum_{i = 1}^n \mathrm{d}(x^i)^2$$
(which, in components w.r.t. Cartesian coordinates on $\mathbb{R}^n$, is just the identity matrix, i.e. the dot product)
Now, how is a metric induced? Let $y : \mathbb{R}^m \rightarrow \mathcal{N}$ be coordinates for the submanifold (you are actually given $\sigma \circ y$ in the question), and $x$ be the coordinates of the Cartesian space. Observe that any morphism of manifolds $\sigma$ induces a morphism of tangent spaces
$$ \mathrm{d}\sigma_p : T_p\mathcal{N} \rightarrow T_{\sigma(p)}\mathbb{R}^n, \frac{\partial}{\partial y^i} \mapsto \sum_j \frac{\partial(\sigma \circ y)^j}{\partial y^i}\frac{\partial}{\partial x^j} $$
called the differential of $\sigma$. As a morphism of vector spaces, it is a linear map given, as a matrix, by the Jacobian $\mathrm{d}\sigma^{ij} := \frac{\partial(\sigma \circ y)^j}{\partial y^i}$ of the morphism of manifolds. Now, inducing a metric means setting
$$ g_\mathcal{N}(\frac{\partial}{\partial y^i},\frac{\partial}{\partial y^j}) := g_\mathrm{Euclidean}(\mathrm{d}\sigma(\frac{\partial}{\partial y^i}),\mathrm{d}\sigma(\frac{\partial}{\partial y^j}))$$
On the right hand side is now the dot product of two ordinary vectors in $\mathbb{R}^n$, and what your exam calls $\vec e_{y^i}$ is my $\mathrm{d}\sigma(\frac{\partial}{\partial y^i})$. If you note that you are given $\sigma \circ y$, then all you need to do is to calculate the metric components by evaluating $g_\mathcal{N}$ as above for every possible combination of $y^i,y^j$ (in 2D, fortunately, there are only four).
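This recipe is easy to run symbolically. A sketch with sympy, using the unit sphere in $\mathbb{R}^3$ as a stand-in example (not necessarily the surface from your exam): the columns of the Jacobian are the pushed-forward basis vectors $\mathrm{d}\sigma(\frac{\partial}{\partial y^i})$, and their pairwise dot products are the induced metric components.

```python
import sympy as sp

theta, phi = sp.symbols('theta phi', positive=True)

# sigma∘y for the unit sphere, with coordinates y = (theta, phi)
X = sp.Matrix([sp.sin(theta) * sp.cos(phi),
               sp.sin(theta) * sp.sin(phi),
               sp.cos(theta)])

# Jacobian: column i holds the components of d(sigma)(d/dy^i) in R^3
J = X.jacobian(sp.Matrix([theta, phi]))

# induced metric: g_N(d/dy^i, d/dy^j) = (column i) . (column j),
# i.e. J^T J, since the ambient metric is the identity
gN = sp.simplify(J.T * J)
print(gN)  # Matrix([[1, 0], [0, sin(theta)**2]])
```

which is the familiar round metric $\mathrm{d}s^2 = \mathrm{d}\theta^2 + \sin^2\theta\, \mathrm{d}\phi^2$.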
The confusion comes from the fact that two concepts are involved in your question, which makes the topic rather complicated. The two concepts are:

- description of the metric tensor in coordinate-dependent form (the classical way)
- description of the metric tensor in coordinate-free form (the modern way), which actually requires the use of differential forms.
It is so easy to write down an equation like $e_i \cdot e_j = \delta_{ij}$, and it looks so intuitive that one does not hesitate for a moment to believe it; but behind it lies the rather abstract theory of tangent vectors and differential forms on manifolds.
In this theory the symbols $e_i$ are tangent vectors at a freely chosen point of a manifold, and these are actually written as expressions in partial derivatives; in your question they are chosen to be orthonormal. On a 2-dimensional flat manifold there are, for instance -- it's our choice -- 2 orthonormal tangent vectors $\frac{\partial}{\partial x}$ and $\frac{\partial}{\partial y}$, which have the (nice) property of being dual to their corresponding covectors $\{dx, dy\}$. (Covectors form the basis of the dual space, here called the cotangent space, which is defined pointwise, i.e. at each point of the manifold there is another tangent space and cotangent space, etc.) Explicitly:
$dx(\frac{\partial}{\partial x})=1$ and $dy(\frac{\partial}{\partial y})=1$ whereas $ dy(\frac{\partial}{\partial x})=0$ and $dx(\frac{\partial}{\partial y})=0$.
Now we define what is meant by $\cdot$, the product between tangent vectors. For this we need the metric tensor $g$, which is a symmetric tensor, and we set $e_i \cdot e_j := g(e_i, e_j)$. So if our basis is chosen to be orthonormal, then we indeed get: $e_i \cdot e_j = g(e_i, e_j)=\delta_{ij}$.
Let us work that out a bit more. In the formalism of differential forms, our tensor $g$ is actually:
$g = dx \otimes dx + dy \otimes dy$
If we want to know its components, we have to evaluate it on the basis vectors (remember $dx(\frac{\partial}{\partial x})=1$ and $dy(\frac{\partial}{\partial y})=1$, whereas $dy(\frac{\partial}{\partial x})=0$ and $dx(\frac{\partial}{\partial y})=0$):
$ e_x \cdot e_x = g(\frac{\partial}{\partial x}, \frac{\partial}{\partial x})= dx \otimes dx ( \frac{\partial}{\partial x}, \frac{ \partial}{\partial x} ) + dy \otimes dy( \frac{\partial}{\partial x}, \frac{ \partial}{\partial x} ) = g_{xx} = 1+0= 1.$
$ e_x \cdot e_y = g(\frac{\partial}{\partial x}, \frac{\partial}{\partial y})= dx \otimes dx ( \frac{\partial}{\partial x}, \frac{ \partial}{\partial y} ) + dy \otimes dy( \frac{\partial}{\partial x}, \frac{ \partial}{\partial y} ) = g_{xy} = 0 + 0 =0.$
$ e_y \cdot e_x = g(\frac{\partial}{\partial y}, \frac{\partial}{\partial x})= dx \otimes dx ( \frac{\partial}{\partial y}, \frac{ \partial}{\partial x} ) + dy \otimes dy( \frac{\partial}{\partial y}, \frac{ \partial}{\partial x} ) = g_{yx} =0 + 0 =0.$
$ e_y \cdot e_y = g(\frac{\partial}{\partial y}, \frac{ \partial}{\partial y})= dx \otimes dx ( \frac{\partial}{\partial y}, \frac{ \partial}{\partial y} ) + dy \otimes dy( \frac{\partial}{\partial y}, \frac{ \partial}{\partial y} ) = g_{yy}= 0 + 1 =1.$
We got the desired result: the basis vectors are orthonormal, as required.
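The four evaluations above can be mimicked in a toy computation where covectors and tangent vectors are just component tuples and the pairing $\mathrm{d}x^i(\partial_j)$ is the ordinary sum of products (a sketch, not the full formalism):

```python
# pairing of a covector with a tangent vector, both given by components
def pair(covector, vector):
    return sum(a * b for a, b in zip(covector, vector))

dx, dy = (1, 0), (0, 1)   # dx(d/dx)=1, dx(d/dy)=0, etc.
ex, ey = (1, 0), (0, 1)   # d/dx, d/dy

# g = dx⊗dx + dy⊗dy evaluated on a pair of tangent vectors
def g(X, Y):
    return pair(dx, X) * pair(dx, Y) + pair(dy, X) * pair(dy, Y)

print(g(ex, ex), g(ex, ey), g(ey, ex), g(ey, ey))  # 1 0 0 1
```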
What happens if we change the coordinates? Let's go to polar coordinates $(r,\phi)$. (Remember $(x,y) =(r \cos\phi, r \sin\phi)$; the derivatives below have to be carried out using this definition.)
With these coordinates we can construct the following tangent vectors $\left(\frac{\partial}{\partial r}, \frac{\partial}{\partial \phi}\right)$. The corresponding covectors are $(dr, d\phi)$:
The metric in polar coordinates looks like this:
$g = dr \otimes dr + r^2 d\phi \otimes d\phi$
$g_{rr}= g(\frac{\partial}{\partial r}, \frac{ \partial}{\partial r})= dr \otimes dr ( \frac{\partial}{\partial r}, \frac{ \partial}{\partial r} ) + r^2 d\phi \otimes d\phi( \frac{\partial}{\partial r}, \frac{ \partial}{\partial r} ) = 1 + 0 =1. $
$g_{r\phi}= g(\frac{\partial}{\partial r}, \frac{ \partial}{\partial \phi})= dr \otimes dr ( \frac{\partial}{\partial r}, \frac{ \partial}{\partial \phi} ) + r^2 d\phi \otimes d\phi( \frac{\partial}{\partial r}, \frac{ \partial}{\partial \phi} ) = 0 + 0 =0.$
If $r$ and $\phi$ in the tangent vectors are swapped, the result is also zero: $g_{\phi r}=0$.
$g_{\phi\phi}= g(\frac{\partial}{\partial \phi}, \frac{ \partial}{\partial \phi})= dr \otimes dr ( \frac{\partial}{\partial \phi}, \frac{ \partial}{\partial \phi} ) + r^2 d\phi \otimes d\phi( \frac{\partial}{\partial \phi}, \frac{\partial}{\partial \phi} ) = 0 + r^2 =r^2$.
We actually find that our chosen tangent vectors are orthogonal to each other, but not orthonormal. That's okay; it is our choice, and the basis does not need to be orthonormal. We can easily fix this by choosing $e_\phi = \frac{1}{r}\frac{\partial}{\partial \phi}$.
But there is a little caveat. Up to now our covectors (the duals of our tangent vectors) were total differentials. That is no longer possible for this new choice of basis: the covector dual to $e_\phi = \frac{1}{r}\frac{\partial}{\partial \phi}$ is $r\, d\phi$, which can no longer be represented as a total differential. Such bases are called anholonomic. They are extremely practical for computations, but somewhat unnatural. Nevertheless, in the modern formalism of differential forms you find them everywhere.
Finally, if you apply a coordinate transformation, the components of the metric tensor $g(e_i,e_j)$ transform according to the rule
$g(e'_i,e'_j) = \frac{\partial x^k}{\partial x'^i} \frac{\partial x^l}{\partial x'^j} g(e_k,e_l)$, where the summation is carried out over repeated indices.
Transformation from polar (unprimed) coordinates to Cartesian (primed) coordinates:
First, we know from our computations above (we'll use the holonomic coordinates $(r,\phi)$): $g_{rr}=1$, $g_{r\phi}=0$, and $g_{\phi\phi}=r^2$.
With this in mind we set up the transformation equations:
$g_{xx} = \frac{\partial r}{\partial x} \frac{\partial r}{\partial x} g_{rr} + 2 \frac{\partial r}{\partial x} \frac{\partial \phi}{\partial x} g_{r\phi} + \frac{\partial \phi}{\partial x} \frac{\partial \phi}{\partial x} g_{\phi\phi} = \cos^2\phi\, g_{rr} + 0 + \frac{(-\sin\phi)^2}{r^2} g_{\phi\phi} = \cos^2\phi + \sin^2\phi = 1. $
Remember, the metric tensor is symmetric, so we can combine the two mixed terms into one; and since $g_{r\phi}= g_{\phi r} =0$, we can in fact forget the mixed terms altogether.
$g_{xy} = \frac{\partial r}{\partial x} \frac{\partial r}{\partial y} g_{rr} + \left(\frac{\partial r}{\partial x} \frac{\partial \phi}{\partial y} + \frac{\partial \phi}{\partial x} \frac{\partial r}{\partial y}\right) g_{r\phi} + \frac{\partial \phi}{\partial x} \frac{\partial \phi}{\partial y} g_{\phi\phi} = \cos\phi \sin\phi\, g_{rr} + 0 + \frac{-\sin\phi}{r} \frac{\cos\phi}{r} g_{\phi\phi} = \cos\phi \sin\phi - \cos\phi \sin\phi = 0. $
$g_{yy} = \frac{\partial r}{\partial y} \frac{\partial r}{\partial y} g_{rr} + 2 \frac{\partial r}{\partial y} \frac{\partial \phi}{\partial y} g_{r\phi} + \frac{\partial \phi}{\partial y} \frac{\partial \phi}{\partial y} g_{\phi\phi} = \sin^2\phi\, g_{rr} + \frac{\cos\phi}{r} \frac{\cos\phi}{r} g_{\phi\phi} = \sin^2\phi + \cos^2\phi = 1. $
We can confirm that the transformation formula for the metric tensor works correctly in the case of the coordinate transformation from polar to Cartesian coordinates.
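The same check can be written as one matrix computation (a sketch with sympy; the inverse Jacobian of $(r,\phi) \mapsto (x,y)$ supplies the partial derivatives $\frac{\partial r}{\partial x}$, $\frac{\partial \phi}{\partial x}$, etc. used above):

```python
import sympy as sp

r, phi = sp.symbols('r phi', positive=True)
x = r * sp.cos(phi)
y = r * sp.sin(phi)

g_polar = sp.Matrix([[1, 0], [0, r**2]])  # g_rr, g_rphi, g_phiphi from above

# Jacobian of (r, phi) -> (x, y); its inverse holds dr/dx, dphi/dx, ...
J = sp.Matrix([x, y]).jacobian(sp.Matrix([r, phi]))
Jinv = J.inv()

# g'_ij = (dx^k/dx'^i)(dx^l/dx'^j) g_kl, with primed = cartesian coordinates
g_cart = sp.simplify(Jinv.T * g_polar * Jinv)
print(g_cart)  # Matrix([[1, 0], [0, 1]])
```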
Actually, one can also do this with anholonomic bases; maybe there is a slight change in the transformation law, but a priori it should also work out.
I hope this helps, but maybe it would be necessary to learn a bit more about differential forms to make this answer even clearer.
Best Answer
The basis vectors are exactly what you'd expect, $$(1, 0, 0, 0), \quad (0, 1, 0, 0), \quad (0, 0, 1, 0), \quad (0, 0, 0, 1).$$ However, the inner product, i.e. the way we combine two vectors into a number, is not the same as the usual dot product. Using your notation, we're changing the definition of $\cdot$, not the definition of the $e_{\mu}$.
You are correct that it's possible to continue working with the dot product formally if we define some of the basis vectors to have imaginary components. That's how it was done in the past, but it's a bad idea: time and lengths just aren't complex numbers. They're perfectly real, so moving to a complex vector space doesn't make physical sense. (Moreover, the dot product itself is unnatural in a complex vector space, where the Hermitian inner product fits better.)
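To make the comparison concrete (a numerical sketch, with the signature convention $(-,+,+,+)$ assumed): the basis vectors stay the ordinary ones and only the product changes, while the old trick of an imaginary time component reproduces the same number using the plain dot product.

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])  # assumed signature convention

e = np.eye(4)  # the ordinary basis vectors (1,0,0,0), (0,1,0,0), ...

def inner(X, Y):
    """The redefined product; the basis vectors e_mu are unchanged."""
    return X @ eta @ Y

print(inner(e[0], e[0]))  # -1.0 (the timelike direction)
print(inner(e[1], e[1]))  # 1.0

# the historical "imaginary time" trick: keep the plain dot product but
# make the time component imaginary -- numerically the same result
e0_imag = np.array([1j, 0, 0, 0])
print(np.dot(e0_imag, e0_imag))  # (-1+0j)
```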