Any 'sufficiently nice' set (in this case, pair) of variables that are compatible with the constraints will do: longitude and latitude are just one possible choice among many others.
Generally, we cannot find a sufficiently nice set of parameters to cover the whole constraint surface. Instead, we use multiple compatible coordinate neighbourhoods (where compatible means that transition maps on overlapping neighbourhoods must be diffeomorphisms), yielding the structure of a differentiable manifold.
For your particular example, in addition to the given parametrization
$$
(\phi,\theta)\mapsto(R\sin\theta\cos\phi,R\sin\theta\sin\phi,R\cos\theta)
$$
we could also use
$$
(z,\phi)\mapsto(\sqrt{R^2-z^2}\cos\phi,\sqrt{R^2-z^2}\sin\phi,z)
$$
inspired by cylindrical coordinates, and stereographic projection would work equally well.
In fact, the upper hemisphere can be covered with just
$$
(x,y)\mapsto(x,y,\sqrt{R^2-x^2-y^2})
$$
i.e. one way to get a parametrization is to 'invert' the constraints after choosing a set of independent variables.
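As a minimal numerical sketch (the radius `R = 2.0`, the sample arguments, and all function names are illustrative choices, not part of the question), one can check that each of the three parametrizations lands on the constraint surface $x^2+y^2+z^2=R^2$, and that two different charts can describe the same point:

```python
import math

R = 2.0  # sphere radius; arbitrary illustrative value

def spherical(phi, theta):
    """(phi, theta) -> point on the sphere, as in the first parametrization."""
    return (R * math.sin(theta) * math.cos(phi),
            R * math.sin(theta) * math.sin(phi),
            R * math.cos(theta))

def cylindrical(z, phi):
    """(z, phi) -> point on the sphere, inspired by cylindrical coordinates."""
    rho = math.sqrt(R * R - z * z)
    return (rho * math.cos(phi), rho * math.sin(phi), z)

def upper_graph(x, y):
    """(x, y) -> point on the upper hemisphere, obtained by solving for z."""
    return (x, y, math.sqrt(R * R - x * x - y * y))

def constraint(p):
    """Residual of the constraint x^2 + y^2 + z^2 - R^2 (zero on the sphere)."""
    x, y, z = p
    return x * x + y * y + z * z - R * R

# Each chart should give a (numerically) vanishing residual.
samples = [spherical(0.7, 1.1), cylindrical(0.5, 2.0), upper_graph(0.3, -0.9)]
residuals = [constraint(p) for p in samples]

# The same point seen through two different charts: z = R cos(theta).
phi, theta = 0.7, 1.1
p_spherical = spherical(phi, theta)
p_cylindrical = cylindrical(R * math.cos(theta), phi)
```

The residuals vanish up to floating-point rounding, and `p_spherical` agrees with `p_cylindrical`, which is the statement that the transition map between the two charts is just $z = R\cos\theta$ on their overlap.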
The key point in all of this is that general relativity is a gauge theory, and, as the saying goes, "the gauge always hits twice" (apparently attributed to Claudio Teitelboim). What this means is that (1) you have an arbitrary freedom in defining your evolution, corresponding to the ability to make gauge transformations, and (2) some of the evolution equations will be constraints. This second fact means that you are not allowed to choose arbitrary initial data for your theory; rather, the initial data that you pick is subject to the constraints, which arise since your action is gauge invariant.
It's usually easiest to start with vacuum electrodynamics. There the equations of motion read
$$\partial^\mu(\partial_\mu A_\nu - \partial_\nu A_\mu)=0.$$
Not all of these equations are second order in time; just look at the $\nu=0$ component:
$$\begin{aligned}
&\partial_t^2 A_0 - \nabla^2A_0 -\partial_t(\partial_t A_0 - \nabla\cdot\vec{A}) = 0 \\
&\implies\partial_t\nabla\cdot\vec{A}-\nabla^2A_0 = 0.
\end{aligned}$$
This is basically the $\nabla\cdot \vec{E} = 0$ vacuum Maxwell equation (for example, in Coulomb gauge, $\nabla\cdot\vec{A}=0$, it reduces to $\nabla^2 A_0 = 0$ with $\vec{E} = -\nabla A_0$). It contains no second time derivatives, so it is a constraint on your initial data: you are not allowed to make an arbitrary choice for $(A_0, \vec{A})$ and $(\partial_t A_0, \partial_t \vec{A})$; rather, they need to satisfy this constraint. So this cuts down the number of initial conditions from 4 to 3. Then the gauge transformation $A_\mu \mapsto A_\mu + \partial_\mu \lambda$ allows you to remove another piece of initial data by imposing a gauge-fixing condition (e.g. $\nabla\cdot\vec{A}=0$). This brings us to 2 degrees of freedom.
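As a quick consistency check, here is a short sketch (using the same sign conventions as the $\nu=0$ equation above, and assuming $\lambda$ is smooth so that partial derivatives commute) showing that the constraint is itself unchanged by a gauge transformation $A_0\mapsto A_0+\partial_t\lambda$, $\vec{A}\mapsto\vec{A}+\nabla\lambda$:
$$\begin{aligned}
\partial_t\nabla\cdot(\vec{A}+\nabla\lambda)-\nabla^2(A_0+\partial_t\lambda)
&= \partial_t\nabla\cdot\vec{A}-\nabla^2A_0 + \partial_t\nabla^2\lambda-\nabla^2\partial_t\lambda \\
&= \partial_t\nabla\cdot\vec{A}-\nabla^2A_0.
\end{aligned}$$
So imposing a gauge condition does not interfere with the constraint, which is why the two reductions can be counted separately.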
For general relativity, you now have 4 gauge freedoms generated by diffeomorphisms described by a vector $\xi^\mu$. So applying the maxim, we should expect to cut down $4\times2=8$ degrees of freedom. In fact, the Bianchi identity tells us where to look for the constraints. Let's expand it out a bit:
$$0=\nabla_\mu G^{\mu\nu} = \partial_0 G^{0\nu}+\partial_i G^{i\nu} + \Gamma^\mu_{\mu\alpha}G^{\alpha \nu}+ \Gamma^{\nu}_{\mu\alpha}G^{\mu\alpha}.$$
This tells us that the first time derivative ($\partial_0$) of $G^{0\nu}$ is related to spatial derivatives of $G^{i\nu}$ as well as terms in which $G^{\mu\alpha}$ appears undifferentiated. The important thing here is that this is an identity, so it holds even if you don't impose the vacuum Einstein equations $G^{\mu\nu}=0$. The tensor $G^{\mu\nu}$ has two derivatives of the metric in it. But if $G^{0\nu}$ contained two time derivatives, then $\partial_0 G^{0\nu}$ would contain three, and there would be no way to satisfy the Bianchi identity because no other term in the identity has three time derivatives acting on the metric. This means the equations $G^{0\nu}=0$ are not evolution equations: they involve at most one time derivative of the dynamical variables, and thus are initial value constraints. So that kills 4 degrees of freedom, and you kill 4 more from gauge fixing. This is how you get the $10-4-4=2$ degrees of freedom in general relativity.
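The counting in both examples follows the same arithmetic, which can be sketched as below. The function name `physical_dof` is a hypothetical helper introduced only for illustration; it simply encodes "the gauge always hits twice" and is not a general-purpose constraint analysis.

```python
def physical_dof(field_components, gauge_parameters):
    """Count degrees of freedom per spatial point.

    Each gauge parameter removes two components: one through an
    initial-value constraint and one through a gauge-fixing condition.
    """
    return field_components - 2 * gauge_parameters

# Electrodynamics: A_mu has 4 components, one gauge function lambda.
em_dof = physical_dof(field_components=4, gauge_parameters=1)

# General relativity: symmetric g_{mu nu} has 10 components, xi^mu has 4.
gr_dof = physical_dof(field_components=10, gauge_parameters=4)
```

Both `em_dof` and `gr_dof` come out to 2, matching the two polarizations of the photon and of the graviton.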
As for your second question: yes, general relativity describes the two degrees of freedom of a massless spin-2 particle.
Every rigid body has 3 translational dof. In addition, there are 0, 2, or 3 rotational dof, depending on the geometry, giving a total of 3, 5, or 6 dof.
A spherically symmetric rigid body has no rotational dof.
A rigid body with rotational symmetry around an axis has 2 rotational dof, namely two angles for orienting the symmetry axis along a direction.
All other rigid bodies have 3 rotational dof, namely two angles with respect to an arbitrary axis attached to the body, and an angle for rotating around this axis. This gives the Euler angle parameterization of the manifold of orientations (algebraically, $SO(3)$). An important alternative parameterization is the quaternion parameterization, especially useful in computational geometry. It has a parameter vector $u$ with 4 components whose length is 1, leaving 3 dof. ($u$ and $-u$ describe the same rotation.)
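A minimal sketch of the quaternion parameterization, assuming the common scalar-first convention $u=(w,x,y,z)$ (the component ordering, the function names, and the sample values are choices made here for illustration): it builds the rotation matrix from a unit quaternion and shows that $u$ and $-u$ give the same matrix.

```python
import math

def normalize(u):
    """Scale a 4-vector to unit length, leaving 3 independent parameters."""
    n = math.sqrt(sum(c * c for c in u))
    return tuple(c / n for c in u)

def quat_to_matrix(u):
    """Rotation matrix of a unit quaternion u = (w, x, y, z), scalar first."""
    w, x, y, z = u
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

u = normalize((1.0, 2.0, -0.5, 0.3))   # arbitrary sample quaternion
minus_u = tuple(-c for c in u)

rot_u = quat_to_matrix(u)
rot_minus_u = quat_to_matrix(minus_u)

# Orthogonality check: rot_u times its transpose should be the identity.
gram = [[sum(rot_u[i][k] * rot_u[j][k] for k in range(3)) for j in range(3)]
        for i in range(3)]
```

Every matrix entry is quadratic in the components of $u$, so flipping the overall sign changes nothing: `rot_u` and `rot_minus_u` coincide, which is the double cover of $SO(3)$ by unit quaternions mentioned above.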