It is possible to see the cotangent bundle of the submanifold as a kind of symplectic reduction of the cotangent bundle of the ambient manifold. I think it might be enough to explain the analogous fact from linear algebra.
Let $V$ be a vector space and $U \subset V$ a subspace. There is a natural symplectic form $\omega_V$ on $V^*\oplus V$ given by
$$
\omega_V((\alpha,u),(\beta,v)) = \alpha(v) - \beta(u)
$$
where Greek letters are elements of $V^*$ and Roman letters are elements of $V$. (This is just $d$ of the Liouville form in this situation.) There is an analogous form $\omega_U$ on $U^* \oplus U$.
Now, let $U^0$ denote the annihilator of $U$ in $V^*$. Consider the subspace
$$
U^0 \times\{0\} \subset V^* \oplus V
$$
This subspace is isotropic for $\omega_V$. Its symplectic complement is the coisotropic subspace $V^*\oplus U$.
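To spell out the complement (a short check that uses only the definition of $\omega_V$ above): pairing an arbitrary $(\beta,v)$ with an element $(\alpha,0)$ of $U^0\times\{0\}$ gives
$$
\omega_V((\alpha,0),(\beta,v)) = \alpha(v) - \beta(0) = \alpha(v),
$$
which vanishes for every $\alpha \in U^0$ exactly when $v \in U$, with no condition on $\beta$. The same formula with $v = 0$ shows that $U^0\times\{0\}$ pairs to zero with itself.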
Now it is a standard fact in symplectic geometry that if you divide a coisotropic subspace by its symplectic complement the result is naturally a symplectic vector space. (This is the linear algebra behind symplectic reduction.) Applying this idea here we see that the quotient
$$
(V^*\oplus U )/ (U^0\times\{0\})
$$
inherits a natural symplectic structure. Of course, the quotient is precisely $U^*\oplus U$ (restricting covectors to $U$ identifies $V^*/U^0$ with $U^*$), and the symplectic form is nothing but $\omega_U$.
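For anyone who wants to see this in coordinates, here is a small numerical sanity check. It is only a sketch under my own choices: $V=\mathbb{R}^4$, $U$ the span of the first two basis vectors, coordinates $(\alpha_1,\dots,\alpha_n,u_1,\dots,u_n)$ on $V^*\oplus V$, and helper names (`canonical_form`, `symplectic_complement`) that are made up for the illustration.

```python
import numpy as np

def canonical_form(n):
    """Matrix J with omega(a, b) = a^T J b on V* (+) V, where dim V = n."""
    # In coordinates (alpha, u): omega((alpha,u),(beta,v)) = alpha.v - beta.u
    Z, I = np.zeros((n, n)), np.eye(n)
    return np.block([[Z, I], [-I, Z]])

def symplectic_complement(J, basis):
    """Columns span {w : omega(b, w) = 0 for every column b of basis}."""
    A = basis.T @ J                      # each row is the functional omega(b, .)
    _, s, vt = np.linalg.svd(A)
    rank = int(np.sum(s > 1e-12))
    return vt[rank:].T                   # null space of A

n, k = 4, 2                              # assumed example: V = R^4, U = span(e_1, e_2)
J_V, J_U = canonical_form(n), canonical_form(k)

# Basis of U^0 x {0}: the covectors e_3^*, e_4^* paired with the zero vector.
N = np.zeros((2 * n, n - k))
for j in range(n - k):
    N[k + j, j] = 1.0

# Isotropy: omega_V vanishes identically on U^0 x {0}.
is_isotropic = np.allclose(N.T @ J_V @ N, 0.0)

# Symplectic complement; it should be V* (+) U, of dimension n + k.
C = symplectic_complement(J_V, N)
complement_is_Vstar_plus_U = C.shape[1] == n + k and np.allclose(C[n + k:], 0.0)

# Quotient map V* (+) U -> U* (+) U: restrict alpha to U, keep u.
P = np.zeros((2 * k, 2 * n))
for i in range(k):
    P[i, i] = 1.0                        # alpha_i for i < k
    P[k + i, n + i] = 1.0                # u_i for i < k

# The map kills U^0 x {0}, is onto, and carries omega_V to omega_U.
kills_kernel = np.allclose(P @ N, 0.0)
is_onto = np.linalg.matrix_rank(P @ C) == 2 * k
forms_agree = np.allclose(C.T @ J_V @ C, (P @ C).T @ J_U @ (P @ C))

print(is_isotropic, complement_is_Vstar_plus_U, kills_kernel, is_onto, forms_agree)
```

The last check is the one that matters: on the coisotropic subspace, $\omega_V$ only sees the restrictions of the covectors to $U$, so it descends to $\omega_U$ on the quotient.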
Let's start by answering the first question.
Let $M$ be any manifold. Consider a physical system consisting of a point-particle moving on $M$. What are the configurations of this physical system? The points of $M$. Hence $M$ is the configuration space.
Typically one takes $M$ to be riemannian, and one may add a potential function on $M$ in order to define the dynamics. (More complicated dynamics are certainly possible -- this is just the simplest example.)
As an example, let's consider a point particle of mass $m$ moving in $\mathbb{R}^3$ under the influence of a central potential
$$V= k/r,$$
where $r$ is the distance from the origin. The configuration space is $M = \mathbb{R}^3\setminus\lbrace 0\rbrace$.
Classical trajectories are curves $x(t)$ in $M$ which satisfy Newton's equation
$$m \frac{d^2 x}{dt^2} = -\nabla V(x) = \frac{k\,x}{|x|^3}.$$
To write this equation as a first order equation we introduce the velocity $v(t) = \frac{dx}{dt}$. Geometrically $v(t)$ is a tangent vector to $M$ at $x(t)$ (so $(x(t),v(t))$ is a point of the tangent bundle $TM$) and hence the classical trajectory $(x(t),v(t))$ defines a curve in $TM$ satisfying a first order ODE:
$$\frac{d}{dt}(x(t),v(t)) = \left(v(t), \frac{k\,x(t)}{m|x(t)|^3}\right).$$
This equation can be derived from a variational problem associated to a lagrangian function $L: TM \to \mathbb{R}$ given by
$$L(x,v) = \frac12 m v^2 - \frac{k}{|x|}.$$
The fibre derivative of the lagrangian function defines a bundle morphism $TM \to T^*M$:
$$(x,v) \mapsto (x,p)$$
where
$$p(x,v) = \frac{\partial L}{\partial v}.$$
In this example, $p = mv$. The Legendre transform of the lagrangian function $L$ gives a hamiltonian function $H$ on $T^*M$, which in this example is the total energy of the system:
$$H(x,p) = \frac{1}{2m}p^2 + \frac{k}{|x|}.$$
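In case the step is not familiar, the Legendre transform here amounts to substituting $v = p/m$ into $p\cdot v - L$:
$$H(x,p) = \Big(p\cdot v - L(x,v)\Big)\Big|_{v = p/m} = \frac{p^2}{m} - \left(\frac{p^2}{2m} - \frac{k}{|x|}\right) = \frac{1}{2m}p^2 + \frac{k}{|x|}.$$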
The equations of motion can be recovered as the flow along the hamiltonian vector field associated to $H$ via the standard Poisson brackets in $T^*M$:
$$ \frac{dx}{dt} = \lbrace x,H \rbrace \qquad\mathrm{and}\qquad \frac{dp}{dt} = \lbrace p,H \rbrace.$$
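With the convention $\lbrace x_i, p_j\rbrace = \delta_{ij}$ (an assumption about normalisation on my part), these equations read $\frac{dx}{dt} = p/m$ and $\frac{dp}{dt} = k\,x/|x|^3$. Here is a rough numerical sketch of that flow. The leapfrog integrator, the values $m = k = 1$, the initial condition and the function names are all my own choices for illustration, not part of the discussion above.

```python
import numpy as np

def hamiltonian(x, p, m=1.0, k=1.0):
    """H(x, p) = p^2 / (2m) + k / |x|."""
    return p @ p / (2.0 * m) + k / np.linalg.norm(x)

def force(x, k=1.0):
    """dp/dt = {p, H} = -dH/dx = k x / |x|^3."""
    return k * x / np.linalg.norm(x) ** 3

def leapfrog(x, p, dt, steps, m=1.0, k=1.0):
    """Integrate Hamilton's equations with a kick-drift-kick leapfrog scheme."""
    x = np.array(x, dtype=float)         # work on copies of the inputs
    p = np.array(p, dtype=float)
    for _ in range(steps):
        p += 0.5 * dt * force(x, k)      # half step in momentum
        x += dt * p / m                  # full step in position: dx/dt = p/m
        p += 0.5 * dt * force(x, k)      # second half step in momentum
    return x, p

# Assumed initial state in T^*M: unit distance from the origin, tangential momentum.
x0 = np.array([1.0, 0.0, 0.0])
p0 = np.array([0.0, 1.0, 0.0])
x1, p1 = leapfrog(x0, p0, dt=1e-3, steps=2000)

# H is conserved along the exact flow, so the numerical drift should be small.
energy_drift = abs(hamiltonian(x1, p1) - hamiltonian(x0, p0))
print("energy drift:", energy_drift)
print("angular momentum before/after:", np.cross(x0, p0), np.cross(x1, p1))
```

The integrator is started from a single point $(x_0, p_0)$ of $T^*M$ and nothing else, which is the sense in which a point of phase space is a state of the system.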
Since the classical trajectories are integral curves of a vector field, there is a unique classical trajectory through any given point in $T^*M$; hence $T^*M$ is a phase space for the system, that is, a space of states of the physical system. Of course $TM$ is also a space of states, but historically one calls $T^*M$ the phase space of the system with configuration space $M$. (I don't know the history well enough to know why. There are brackets in $TM$ as well and one could equally well work there.)
Not every space of states is a cotangent bundle, of course. One can obtain examples by hamiltonian reduction from cotangent bundles by symmetries which are induced from diffeomorphisms of the configuration space, for instance. Or you could consider systems whose physical trajectories satisfy an ODE of order higher than 2, in which case the cotangent bundle is not the space of states, since you need to know more than just the position and the velocity at a point in order to determine the physical trajectory.
It's late here, so I'll forgo answering the bonus question for now.
Best Answer
Configuration space is, by definition, the position space of your particles. Phase space, on the other hand, is the space of pairs (position, momentum). The latter has a symplectic structure; the former has a Riemannian structure.
Regarding the relationship between kinetic energy and the Riemannian structure: you will recall from your high school physics class that kinetic energy is $\frac{1}{2} mv^2$. Of course, the $v^2$ is really the dot product $v \cdot v$; in other words, it's $g(v,v)$, where $g$ is the Riemannian metric and $v$ is a tangent vector. The $\frac{1}{2}$ explains the "twice the kinetic energy" part.