Let's start at the beginning:
The setting for relativity, be it special or general, is that spacetime is a manifold $\mathcal{M}$, i.e. something that is locally homeomorphic to Cartesian space $\mathbb{R}^n$ ($n = 4$ in the case of relativity), but not necessarily globally.
Such manifolds possess a tangent space $T_p\mathcal{M}$ at every point, which is where the vectors one usually talks about live. If you choose coordinates $x^i$ on the manifold, then the space of tangent vectors is
$$T_p\mathcal{M} := \left\{\sum_{i=0}^3 c^i \frac{\partial}{\partial x^i} \;\middle|\; c^i \in \mathbb{R} \right\}$$
When we say that a tuple $(c^0,c^1,c^2,c^3)$ is a vector, we mean that it corresponds to the object $c^i\partial_i \in T_p\mathcal{M}$ at some point $p \in \mathcal{M}$.
A metric on $\mathcal{M}$ can be given by specifying a non-degenerate, symmetric bilinear form at each point
$$g_p : T_p\mathcal{M} \times T_p\mathcal{M} \rightarrow \mathbb{R}$$
What you learned "in general" is that the components of the metric are, for chosen basis vectors $\partial_i$ of $T_p\mathcal{M}$, defined by $g_{ij} = g(\partial_i,\partial_j)$. You can now indeed see the metric as a kind of scalar product, setting $X \cdot Y := g(X,Y)$ for two vectors $X,Y$. (This contains the answer to your second problem.) But for non-Riemannian manifolds, i.e. manifolds whose metric is not positive-definite (not all of its eigenvalues are positive), this is not a scalar product in the sense you may be used to. In particular, $X \cdot X$ can be zero for a non-zero vector $X$. Vectors for which it is zero are usually called lightlike or null.
The important thing to take away is that manifolds do not always behave like Cartesian space.
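As a minimal sketch of such a "scalar product", here is the flat Minkowski metric evaluated on a few component tuples. The signature $(-,+,+,+)$ and the names `ETA` and `g` are choices made for this illustration only; nothing above fixes a sign convention.

```python
# Minkowski metric components in an inertial frame.
# Assumption: signature (-,+,+,+); the opposite convention works equally well.
ETA = [
    [-1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]


def g(X, Y, metric=ETA):
    """Evaluate g(X, Y) = g_ij X^i Y^j for component tuples X and Y."""
    n = len(metric)
    return sum(metric[i][j] * X[i] * Y[j] for i in range(n) for j in range(n))


timelike = (1.0, 0.0, 0.0, 0.0)
spacelike = (0.0, 1.0, 0.0, 0.0)
null = (1.0, 1.0, 0.0, 0.0)  # a non-zero vector along a light ray

print(g(timelike, timelike))    # negative in this convention
print(g(spacelike, spacelike))  # positive
print(g(null, null))            # zero, although the vector itself is not zero
```

The last line is the point: with an indefinite metric, $g(X,X) = 0$ does not force $X = 0$.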
Now, for your third problem, we need the concept of the cotangent space $T_p^*\mathcal{M}$. It is the dual vector space to the tangent space, spanned by the differentials $\mathrm{d}x^i : T_p\mathcal{M} \rightarrow \mathbb{R}$ for a chosen coordinate system, and defined by
$$\mathrm{d}x^i(\partial_j) = \delta^i_j$$
Now, recall that the metric was a map from two copies of the tangent space to $\mathbb{R}$. As such, we can see it as an element of the tensor product $T_p^*\mathcal{M} \otimes T_p^*\mathcal{M}$, which is the space spanned by elements of the form $\mathrm{d}x^i \otimes \mathrm{d}x^j$. As the metric is an element of this space, it is expandable in its basis:
$$ g = g_{ij}\mathrm{d}x^i\mathrm{d}x^j$$
where the physicist just drops the bothersome $\otimes$ sign. Now, what has this to do with infinitesimal distance? We simply define the length of a path $\gamma : [a,b] \rightarrow \mathcal{M}$ to be (with $\gamma'(t)$ denoting the tangent vector to the path)$[1]$
$$ L[\gamma] := \int_a^b \sqrt{\lvert g(\gamma'(t),\gamma'(t))\rvert}\mathrm{d}t$$
And, by using physicists' sloppy notation, $g(\gamma'(t),\gamma'(t)) = g_{ij} \frac{\mathrm{d}x^i}{\mathrm{d}t}\frac{\mathrm{d}x^j}{\mathrm{d}t}$, if we understand $x^i(t)$ as the $i$-th coordinate of the point $\gamma(t)$, and so:
$$ L[\gamma] = \int_a^b \sqrt{\left\lvert g_{ij} \frac{\mathrm{d}x^i}{\mathrm{d}t}\frac{\mathrm{d}x^j}{\mathrm{d}t}\right\rvert}\mathrm{d}t = \int_a^b \sqrt{\lvert g_{ij}\mathrm{d}x^i\mathrm{d}x^j\rvert}\frac{\mathrm{d}t}{\mathrm{d}t} = \int_a^b \sqrt{\lvert g_{ij}\mathrm{d}x^i\mathrm{d}x^j\rvert}$$
Since we call $\mathrm{d}s$ the infinitesimal line element that fulfills $L = \int \mathrm{d}s$, this is suggestive of the notation
$$ \mathrm{d}s^2 = g_{ij}\mathrm{d}x^i\mathrm{d}x^j$$
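To make the length functional concrete, here is a small numerical sketch. It assumes, purely as an example, the flat plane in polar coordinates $(r,\varphi)$ with $\mathrm{d}s^2 = \mathrm{d}r^2 + r^2\mathrm{d}\varphi^2$, and integrates $\sqrt{\lvert g(\gamma',\gamma')\rvert}$ along a circle of radius $R$; the helper names `metric_polar` and `path_length` are made up for this illustration.

```python
import math


def metric_polar(point):
    """Components g_ij of the flat plane in polar coordinates (r, phi)."""
    r, _phi = point
    return [[1.0, 0.0], [0.0, r * r]]


def path_length(gamma, gamma_dot, metric, a, b, steps=10_000):
    """Approximate L = int_a^b sqrt(|g(gamma', gamma')|) dt with the midpoint rule."""
    h = (b - a) / steps
    total = 0.0
    for k in range(steps):
        t = a + (k + 0.5) * h
        g = metric(gamma(t))      # metric components at the point gamma(t)
        v = gamma_dot(t)          # components dx^i/dt of the tangent vector
        n = len(v)
        speed_sq = sum(g[i][j] * v[i] * v[j] for i in range(n) for j in range(n))
        total += math.sqrt(abs(speed_sq)) * h
    return total


R = 2.0


def circle(t):
    """Circle of radius R: r stays fixed, phi = t."""
    return (R, t)


def circle_dot(t):
    """Coordinate velocity (dr/dt, dphi/dt) of the circle."""
    return (0.0, 1.0)


length = path_length(circle, circle_dot, metric_polar, 0.0, 2.0 * math.pi)
print(length, 2.0 * math.pi * R)  # the two numbers should agree closely
```

The coordinate velocity here is just $(0,1)$; the factor $R$ in the circumference comes entirely from the metric component $g_{\varphi\varphi} = r^2$.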
If we notice that, by the definition of tangent and cotangent vectors by derivatives and differentials as above, things with upper indices transform in exactly the opposite way from things with lower indices (see also my answer here), we see that this is indeed invariant under arbitrary coordinate transformations.
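As a sketch of why the line element is invariant: the primed coordinates $x'^i$ are introduced here only for illustration, and the two transformation laws below are the standard ones for differentials and for components of a $(0,2)$ tensor, which are assumed rather than derived in this answer. Under a change of coordinates $x \mapsto x'(x)$,

$$ \mathrm{d}x'^i = \frac{\partial x'^i}{\partial x^m}\mathrm{d}x^m, \qquad g'_{ij} = \frac{\partial x^k}{\partial x'^i}\frac{\partial x^l}{\partial x'^j}\, g_{kl} $$

so the Jacobian factors cancel in pairs by the chain rule:

$$ g'_{ij}\,\mathrm{d}x'^i\mathrm{d}x'^j = g_{kl}\,\frac{\partial x^k}{\partial x'^i}\frac{\partial x'^i}{\partial x^m}\,\frac{\partial x^l}{\partial x'^j}\frac{\partial x'^j}{\partial x^n}\,\mathrm{d}x^m\mathrm{d}x^n = g_{kl}\,\delta^k_m\delta^l_n\,\mathrm{d}x^m\mathrm{d}x^n = g_{mn}\,\mathrm{d}x^m\mathrm{d}x^n $$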
$[1]$ $\gamma'(t)$ is really a tangent vector in the following sense:
Let $x : \mathcal{M} \rightarrow \mathbb{R}^n$ be a coordinate chart. Consider then: $ x \circ \gamma : [a,b] \rightarrow \mathbb{R}^n$. Since it is an ordinary function between (subsets of) Cartesian spaces, it has a derivative
$$(x \circ \gamma)' : [a,b] \rightarrow \mathbb{R}^n$$
Now, $(x \circ \gamma)'^i(t)$ can be thought of as the components of the tangent vector $\gamma'(t) := (x \circ \gamma)'^i(t)\partial_i \in T_{\gamma(t)}\mathcal{M}$. It is a somewhat tedious, but worthwhile exercise to show that this definition of $\gamma'(t)$ is independent of the choice of coordinates $x$.
Your exam question with the surfaces is asking about something different. You are given an embedding of a lower-dimensional submanifold $\mathcal{N}$ into Cartesian space
$$ \sigma: \mathcal{N} \hookrightarrow \mathbb{R}^n $$
and asked to calculate the induced metric on the submanifold from the Cartesian metric
$$\mathrm{d}s^2 = \sum_{i = 1}^n (\mathrm{d}x^i)^2$$
(which is just the identity matrix in component form w.r.t. any orthonormal basis of coordinates in $\mathbb{R}^n$, i.e. the dot product)
Now, how is a metric induced? Let $y : \mathbb{R}^m \rightarrow \mathcal{N}$ be coordinates for the submanifold (you are actually given $\sigma \circ y$ in the question), and $x$ be the coordinates of the Cartesian space. Observe that any morphism of manifolds $\sigma$ induces a morphism of tangent spaces
$$ \mathrm{d}\sigma_p : T_p\mathcal{N} \rightarrow T_{\sigma(p)}\mathbb{R}^n, \frac{\partial}{\partial y^i} \mapsto \sum_j \frac{\partial(\sigma \circ y)^j}{\partial y^i}\frac{\partial}{\partial x^j} $$
called the differential of $\sigma$. As a morphism of vector spaces, it is a linear map given, as a matrix, by the Jacobian ${\mathrm{d}\sigma^j}_i := \frac{\partial(\sigma \circ y)^j}{\partial y^i}$ of the morphism of manifolds. Now, inducing a metric means setting
$$ g_\mathcal{N}(\frac{\partial}{\partial y^i},\frac{\partial}{\partial y^j}) := g_\mathrm{Euclidean}(\mathrm{d}\sigma(\frac{\partial}{\partial y^i}),\mathrm{d}\sigma(\frac{\partial}{\partial y^j}))$$
On the right-hand side is now the dot product of two ordinary vectors in $\mathbb{R}^n$, and what your exams call $\vec e_{y^i}$ is my $\mathrm{d}\sigma(\frac{\partial}{\partial y^i})$. If you note that you are given $\sigma \circ y$, then all you need to do is to calculate the metric components by calculating $g_\mathcal{N}$ as above for every possible combination of $y^i,y^j$ (in 2D, fortunately, there are only four, and by the symmetry of the metric only three of them are independent).
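The recipe can be sketched numerically as follows. Since the surfaces from the exam are not reproduced here, the example assumes the unit sphere with $\sigma \circ y : (\theta,\varphi) \mapsto (\sin\theta\cos\varphi, \sin\theta\sin\varphi, \cos\theta)$; the function names are hypothetical, and the partial derivatives are approximated by central differences instead of being computed by hand.

```python
import math


def embedding(theta, phi):
    """sigma after y: spherical angles to a point on the unit sphere in R^3."""
    return (
        math.sin(theta) * math.cos(phi),
        math.sin(theta) * math.sin(phi),
        math.cos(theta),
    )


def pushforward_basis(f, y, h=1e-6):
    """Approximate the vectors d sigma(d/dy^i) in R^n by central differences."""
    vectors = []
    for i in range(len(y)):
        plus = list(y)
        minus = list(y)
        plus[i] += h
        minus[i] -= h
        fp, fm = f(*plus), f(*minus)
        vectors.append([(p - m) / (2.0 * h) for p, m in zip(fp, fm)])
    return vectors


def induced_metric(f, y):
    """g_N(d/dy^i, d/dy^j) = Euclidean dot product of the pushed-forward vectors."""
    e = pushforward_basis(f, y)
    m = len(y)
    return [[sum(a * b for a, b in zip(e[i], e[j])) for j in range(m)] for i in range(m)]


theta, phi = 1.0, 0.3
g = induced_metric(embedding, (theta, phi))
print(g)                     # approximately [[1, 0], [0, sin(theta)^2]]
print(math.sin(theta) ** 2)  # compare with the lower-right entry
```

For this choice of embedding the result reproduces the familiar $\mathrm{d}s^2 = \mathrm{d}\theta^2 + \sin^2\theta\,\mathrm{d}\varphi^2$ up to discretization error.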
Best Answer
If you want to be super systematic about language and not overloading the terminology, you can say the following. Fix a smooth $n$-dimensional manifold $M$.
Now, in physics, it is common to abbreviate terminology in the following ways. First, the adjective pseudo-Riemannian is often omitted because in GR we care exclusively about Lorentzian signature (1 plus, and $n-1$ minus, or the other way around), and since everyone knows it's about Lorentzian signature, we'd rather not beat the already dead horse. Next, the phrase "tensor field" is often shortened (by abuse of language) to just "tensor" because... well that's just the way things are. So, you may hear $g$ being referred to as "the metric tensor $g$". This is of course incorrect (but standard) terminology since the word field tells us there is one at every point in the manifold. The next abbreviation is to omit the word 'tensor' in this description, and simply speak of "the metric $g$". In Physics, people won't have any trouble understanding what you mean, but in math, a very common source of confusion for students is with the use of 'metric' in 'metric tensor field' in the context of Riemannian manifolds and 'metric space'. Now, one can always introduce a coordinate chart $(U,x=(x^1,\dots, x^n))$, and in this chart, we can write \begin{align} g|_U&=g_{ab}\,dx^a\otimes dx^b, \end{align} where $g_{ab}:U\to\Bbb{R}$ are smooth functions, namely $g_{ab}(p):=g\left(\frac{\partial}{\partial x^a}(p),\frac{\partial}{\partial x^b}(p)\right)$.
Ok, now given the object $g$ as above, we can define the following object, called the quadratic form associated to $g$. This I shall denote as $Q_g$, and it is a function $Q_g:TM\to\Bbb{R}$ defined as $Q_g(v)= g(v,v)$, for all $v\in TM$. So, you take any tangent vector $v$, and plug it into $g$ twice. The following is a basic linear algebra fact: we can recover $g$ from $Q_g$, in the following sense. For any $p\in M$ and $v,w\in T_pM$, we have \begin{align} g(v,w)&=\frac{Q_g(v+w)-Q_g(v-w)}{4}\tag{$*$}. \end{align} Think of the analogous statement for multiplication of real numbers. If I have two real numbers $x,y$, then from the sum/difference of squares formula, $xy=\frac{(x+y)^2-(x-y)^2}{4}$. The general form $(*)$ above is called the polarization identity. Thus, given any symmetric $(0,2)$ tensor field, we can define a corresponding quadratic form, and conversely given any quadratic form, we can define a symmetric $(0,2)$ tensor field which has that as the quadratic form.
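A quick numerical check of the polarization identity $(*)$ at a single point, as a sketch: the symmetric matrix `G` and the vectors `v`, `w` are arbitrary values chosen for illustration, with an indefinite signature to emphasize that positivity is not needed.

```python
# Components g_ab of some symmetric bilinear form at one point.
# Assumption: arbitrary illustrative values, deliberately indefinite.
G = [
    [-1, 0, 0],
    [0, 2, 1],
    [0, 1, 3],
]


def bilinear(metric, v, w):
    """g(v, w) = g_ab v^a w^b."""
    n = len(metric)
    return sum(metric[a][b] * v[a] * w[b] for a in range(n) for b in range(n))


def quadratic_form(metric, v):
    """Q_g(v) = g(v, v)."""
    return bilinear(metric, v, v)


def polarize(metric, v, w):
    """Recover g(v, w) from Q_g alone via the polarization identity."""
    plus = [x + y for x, y in zip(v, w)]
    minus = [x - y for x, y in zip(v, w)]
    return (quadratic_form(metric, plus) - quadratic_form(metric, minus)) / 4


v = (1, 2, 3)
w = (4, -1, 2)
print(bilinear(G, v, w), polarize(G, v, w))  # both values should coincide
```

Note that the identity relies on $g$ being symmetric; for a non-symmetric bilinear form, the right-hand side would only recover the symmetric part.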
Because of this equivalence (going back and forth between $(0,2)$ tensor (fields) and quadratic form (fields)) some would consider this a reasonable overload of terminology, and start referring to $Q_g$ as "the metric".
The above terminology of "quadratic form associated to $g$" is how a mathematician would phrase it. In terms of a coordinate chart $(U,x)$, this would be written as \begin{align} Q_g|_U&=g_{ab}\,dx^a\,dx^b. \end{align} The meaning of the product $dx^a\,dx^b$ on the right is as follows. The object $Q_g$ takes a tangent vector $v\in T_pM$ and outputs the number $Q_g(v)=g_{ab}(p)\,dx^a(v)\cdot dx^b(v)$ (recall that $dx^a$ is a 1-form so it takes a vector $v\in T_pM$ as input and outputs $dx^a(v)\in\Bbb{R}$ as output; this number is often denoted as $v^a$, and called the "$a^{th}$ component of $v$ with respect to the coordinate-induced basis $\left\{\frac{\partial}{\partial x^i}(p)\right\}_{i=1}^n$ of $T_pM$").
Motivated by the coordinate expression on the right and classical terminology, in Physics, we refer to $Q_g$ as the infinitesimal squared distance, or also as the line element (associated to $g$), and in SR/GR also the (infinitesimal) spacetime interval (the adjective 'infinitesimal' referring to the fact that it is at the level of tangent spaces). Instead of the notation $Q_g$, it is much more common to use the notation $ds^2$ (even though it is not the exterior derivative $d(s^2)$ of the square of a function $s^2$, nor is it the product of a 1-form $ds$ with itself in any manner). This is just symbolic and suggestive notation. In coordinates, we thus write \begin{align} Q_g|_U\equiv ds^2|_U=g_{ab}\,dx^a\,dx^b, \end{align} where $\equiv$ means 'same thing in different notation'. See this MSE answer of mine for more details about the notation involving tensor products and symmetrized tensor products and quadratic forms.
In your specific example, you're dealing with Riemannian signature (all plus). So, being slightly more systematic with the language, I would say the metric tensor field $g$ on the unit sphere $S^2$ is such that if you restrict it to the domain of the spherical coordinate mapping, then \begin{align} g&=d\theta\otimes d\theta+\sin^2\theta\,d\phi\otimes d\phi. \end{align} (but take a look at the above MSE answer of mine; if you use the symmetrized tensor product notation, you can write this also as $g=d\theta^2+\sin^2\theta\,d\phi^2$). Equivalently, you could say the line element on the sphere is such that when restricted to the domain of the spherical coordinate mapping, it equals \begin{align} ds^2=d\theta^2+\sin^2\theta\,d\phi^2 \end{align} (the RHS now being interpreted as a quadratic form).
At the end of the day, they're giving you the same information.
If it were me, I would prefer to keep a terminological distinction between $g$ (pseudo-Riemannian metric tensor field, or if I'm working in Riemannian geometry, I'd abbreviate this to "Riemannian metric", or if I'm doing GR, I'd say "Lorentzian metric") and $Q_g\equiv ds^2$ (which I'd prefer to call the quadratic form associated to $g$, or just the line element). Having said this, because of the linear algebra fact mentioned above, it isn't super necessary (once you have learnt the definitions) to be so strict with maintaining the distinction (and for people who know this fact, it is so obvious that they may even blur the distinction between a symmetric $(0,2)$ tensor field and its associated quadratic form, so they may start writing stuff like $g=ds^2=g_{ab}\,dx^a\,dx^b$).
One final thing I'll mention is that sometimes you'll see statements like "the metric $g_{ab}$". This can be interpreted in two ways. The first is you have $g$ as above, and you're fixing a coordinate chart $(U,x)$ as above, and considering the component functions $\{g_{ab}\}_{a,b=1}^n$; in this sense identifying a tensor field with its component functions with respect to some coordinates is an abuse/overload of language, and I would strongly caution against it unless you know precisely what you're talking about. Alternatively, it is also common to use the abstract index notation in which the symbol $g_{ab}$ denotes the actual $(0,2)$ tensor field $g$ (I have my reasons for using this notation only occasionally, but it's logically fine).