The definition of temperature through the Maxwell and Boltzmann distributions has certain problems in quantum mechanics.
In thermodynamics, temperature is usually defined through the derivative of the entropy, as you say:
$$
\frac{1}{T} = \frac{\partial S(E,\mathbf{V})}{\partial E}. \qquad (1)
$$
The division of the system into different parts (or different degrees of freedom) can be understood from the microcanonical distribution. Let the system have a Hamiltonian of the following form:
$$
H = H(\mathbf{q}, \mathbf{p}, \mathbf{V});
$$
where $\mathbf{q}$ and $\mathbf{p}$ are the vectors of microscopic generalized coordinates and momenta, respectively, and $\mathbf{V}$ is the vector of macroscopic parameters that are constant (on average) in equilibrium.
The dimension of $\mathbf{q}$ and $\mathbf{p}$ is the number of the degrees of freedom of the system. Note that degrees of freedom of the same type (e.g. translation along $x$ axis) of different particles are different degrees of freedom. The set of $(\mathbf{q},\mathbf{p})$ pairs is the phase space of the system.
The distribution function for the system is
$$
f(\mathbf{q},\mathbf{p}) =
\frac{
\delta\bigl( E - H(\mathbf{q}, \mathbf{p}, \mathbf{V}) \bigr)
}{\Omega(E, \mathbf{V})};
$$
where $E$ is the internal energy and $\Omega(E, \mathbf{V})$ is the phase density of states or the number of accessible microscopic states for given $E$ and $\mathbf{V}$:
$$
\Omega(E, \mathbf{V}) =
\int \delta\bigl( E - H(\mathbf{q}, \mathbf{p}, \mathbf{V}) \bigr) d\mathbf{q} d\mathbf{p}.
$$
The entropy (in units where the Boltzmann constant $k = 1$) is
$$
S(E, \mathbf{V}) = \ln \Omega(E, \mathbf{V})
$$
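As a quick check of definition (1), here is a sketch for a monatomic ideal gas of $N$ atoms of mass $m$ in a volume $V$ (an illustrative assumption; the symbols $N$, $m$ and the scalar $V$ are not used elsewhere in this answer). With $H = \sum_i \mathbf{p}_i^2 / 2m$ the density of states scales as
$$
\Omega(E, V) \propto V^N E^{3N/2 - 1},
$$
so that
$$
\frac{1}{T} = \frac{\partial \ln \Omega(E, V)}{\partial E} = \frac{3N/2 - 1}{E} \approx \frac{3N}{2E} \quad (N \gg 1),
$$
which reproduces the familiar $E = \tfrac{3}{2} N T$ (with $k = 1$).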
Temperature of a subsystem
Let the system consist of two independent (non-interacting) subsystems. Then
$$
\mathbf{q} = (\mathbf{q}_1, \mathbf{q}_2); \quad \mathbf{p} = (\mathbf{p}_1, \mathbf{p}_2);
$$
$$
H(\mathbf{q}, \mathbf{p}, \mathbf{V}) =
H_1(\mathbf{q}_1, \mathbf{p}_1, \mathbf{V}) +
H_2(\mathbf{q}_2, \mathbf{p}_2, \mathbf{V}). \qquad (2)
$$
NB:
The subsystems need not be separated spatially. They need not even consist of different particles. The only requirement is that the Hamiltonian must have the form (2). We can put all translational coordinates into $\mathbf{q}_1$, rotational into $\mathbf{q}_2$, oscillatory into $\mathbf{q}_3$ and so on. If the energy transfer (interaction) between the subsystems is negligible during some period of time, then expression (2) is correct for that period.
We can introduce distribution functions for each subsystem:
$$
f_i(\mathbf{q}_i,\mathbf{p}_i) =
\frac{
\delta\bigl( E_i - H_i(\mathbf{q}_i, \mathbf{p}_i, \mathbf{V}) \bigr)
}{\Omega_i(E_i, \mathbf{V})};
$$
where $E_i$ is the internal energy of the subsystem.
The entropy of the subsystem then is
$$
S_i(E_i, \mathbf{V}) = \ln \Omega_i(E_i, \mathbf{V})
$$
and the temperature is
$$
T_i = \left( \frac{\partial S_i(E_i, \mathbf{V})}{\partial E_i} \right)^{-1} \qquad (3)
$$
This is the definition of the temperature of a subsystem (that is, of a group of degrees of freedom).
Temperatures in the equilibrium
Since the subsystems are independent, the distribution function of the whole system is the product:
$$
f(\mathbf{q},\mathbf{p}) = f_1(\mathbf{q}_1,\mathbf{p}_1)f_2(\mathbf{q}_2,\mathbf{p}_2);
$$
and the total number of accessible states is:
$$
\Omega(E_1, E_2, \mathbf{V}) = \Omega_1(E_1, \mathbf{V})\Omega_2(E_2, \mathbf{V}).
$$
Hence the total entropy is
$$
S(E_1, E_2, \mathbf{V}) = S_1(E_1, \mathbf{V}) + S_2(E_2, \mathbf{V}) \qquad (4)
$$
If there is an interaction between the subsystems, internal energy will be transferred from one subsystem to the other until equilibrium is reached. During this process the total energy is constant:
$$
E = E_1 + E_2 = \text{const}
$$
The energies of the subsystems change with time and take definite values in equilibrium. According to the 2nd law of thermodynamics the total entropy is maximal in this state. The condition for the extremum is
$$
\frac{\partial S(E_1, E_2(E, E_1), \mathbf{V})}{\partial E_1} = 0.
$$
From (4) we get:
$$
\frac{\partial S(E_1, E_2(E, E_1), \mathbf{V})}{\partial E_1} =
\frac{\partial S_1(E_1, \mathbf{V})}{\partial E_1} +
\frac{\partial S_2(E_2, \mathbf{V})}{\partial E_2}\frac{\partial E_2}{\partial E_1} =
$$
$$
\frac{1}{T_1} - \frac{1}{T_2} = 0
$$
or
$$
T_1 = T_2.
$$
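The extremum condition can also be checked numerically. The following is only a minimal sketch under an assumed toy model: each subsystem is given a hypothetical density of states $\Omega_i \propto E_i^{a_i}$, so that $S_i = a_i \ln E_i$ and $T_i = E_i / a_i$ (with $k = 1$). The function names and the values of `A1`, `A2` and `E_TOTAL` are illustrative and do not come from the derivation above.

```python
import math

def entropy(energy, a):
    """S = a * ln(E) in units k = 1, from the assumed Omega ~ E**a."""
    return a * math.log(energy)

def temperature(energy, a):
    """T = (dS/dE)^-1 = E / a for the assumed toy entropy."""
    return energy / a

def equilibrium_split(total_energy, a1, a2, steps=200):
    """Ternary search for the E1 that maximises S1(E1) + S2(E - E1).

    The total entropy is concave in E1, so ternary search converges
    to the single maximum.
    """
    lo = 1e-12 * total_energy
    hi = total_energy * (1.0 - 1e-12)
    for _ in range(steps):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        s_m1 = entropy(m1, a1) + entropy(total_energy - m1, a2)
        s_m2 = entropy(m2, a1) + entropy(total_energy - m2, a2)
        if s_m1 < s_m2:
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

# Illustrative values (assumptions, arbitrary units)
E_TOTAL, A1, A2 = 10.0, 1.5, 1.0

e1 = equilibrium_split(E_TOTAL, A1, A2)
t1 = temperature(e1, A1)
t2 = temperature(E_TOTAL - e1, A2)
print(f"E1 = {e1:.6f}, T1 = {t1:.6f}, T2 = {t2:.6f}")
```

For this toy model the maximum can be found by hand, $E_1 = a_1 E / (a_1 + a_2)$, and the two temperatures printed by the script should agree with each other to within the accuracy of the search.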
One can prove that these temperatures are equal to $T$ defined as (1).
If you start with a monatomic gas then the only degrees of freedom available are the three translational degrees of freedom. Each of them absorbs $\tfrac{1}{2}kT$ of energy, so the specific heat (at constant volume) is $\tfrac{3}{2}k$ per atom or $\tfrac{3}{2}R$ per mole.
If you move to a diatomic molecule there are two rotational modes as well - only two extra modes because rotation about the axis of the molecule has energy levels too widely spaced to be excited at normal temperatures. Each of those two rotational degrees of freedom will soak up another $\tfrac{1}{2}kT$, giving a specific heat of $\tfrac{5}{2}k$ per molecule or $\tfrac{5}{2}R$ per mole.
But the rotational energy levels are quantised, with energies $E_J = BJ(J+1)$, i.e. $E = 0, 2B, 6B, 12B$ and so on, where $B$ is the rotational constant for the molecule:
$$ B = \frac{\hbar^2}{2\mu d^2} $$
where $\mu$ is the reduced mass and $d$ is the bond length. So these rotational energy levels will only be populated when $kT$ is a lot greater than $B$ - say 10 to 100 times greater. You can look up the rotational constant of nitrogen, or it's easy enough to calculate, and the result is:
$$ B \approx 3.97 \times 10^{-23} \text{J} $$
which is about $3k$ (that is, $B/k \approx 3$ K). So as long as the temperature is above, say, 30 K the rotational modes will be excited and nitrogen will have a specific heat of $\tfrac{5}{2}R$. If you go down to temperatures of 3 K and below, the specific heat will fall to $\tfrac{3}{2}R$, just like a monatomic gas.
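For anyone who wants to reproduce the number, here is a short sketch of the calculation. The atomic mass of nitrogen and the N$_2$ bond length used below are assumed inputs taken as typical tabulated values; they are not given in the text above.

```python
HBAR = 1.054571817e-34   # reduced Planck constant, J s
K_B = 1.380649e-23       # Boltzmann constant, J/K
AMU = 1.66053907e-27     # atomic mass unit, kg

# Assumed inputs (not from the text): 14N atomic mass and N2 bond length
M_NITROGEN = 14.003 * AMU   # kg
BOND_LENGTH = 1.098e-10     # m

# Reduced mass of a homonuclear diatomic molecule: mu = m * m / (m + m) = m / 2
mu = M_NITROGEN / 2.0

# Rotational constant in energy units: B = hbar^2 / (2 * mu * d^2)
B = HBAR**2 / (2.0 * mu * BOND_LENGTH**2)

# The same quantity expressed as a temperature, B / k
B_over_k = B / K_B

print(f"B   = {B:.3e} J")
print(f"B/k = {B_over_k:.2f} K")
```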
The specific heat of nitrogen at constant volume is 0.743 kJ/(kg·K); converting this to molar units gives 20.8 J/(mol·K), which is indeed 2.50R (to three significant figures).
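The unit conversion is a one-liner. As a sketch, the snippet below assumes a molar mass of 28.0134 g/mol for N$_2$, which is not stated above:

```python
R = 8.314462618            # molar gas constant, J/(mol K)
CV_MASS = 0.743e3          # specific heat from the text, J/(kg K)
MOLAR_MASS_N2 = 28.0134e-3 # assumed molar mass of N2, kg/mol

# Convert from per-kilogram to per-mole, then express in units of R
cv_molar = CV_MASS * MOLAR_MASS_N2
cv_over_R = cv_molar / R

print(f"C_V = {cv_molar:.1f} J/(mol K) = {cv_over_R:.2f} R")
```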
The conformist mentions that the vibrations of the nitrogen molecule will contribute to the specific heat, and indeed they will. However the energy of the first vibrational mode is 2359 cm$^{-1}$, which converted to non-spectrogeek units is $4.7 \times 10^{-20}$ J or about $3400k$ (a characteristic temperature of about 3400 K). So the vibrational mode contributes very little to the specific heat at room temperature and only becomes important as the temperature approaches the order of 3400 K.
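The conversion from wavenumber to energy and temperature uses $E = hc\tilde{\nu}$. A minimal sketch (the variable names are just illustrative):

```python
H = 6.62607015e-34       # Planck constant, J s
C_CM = 2.99792458e10     # speed of light, cm/s (so wavenumbers can stay in cm^-1)
K_B = 1.380649e-23       # Boltzmann constant, J/K

WAVENUMBER_N2 = 2359.0   # vibrational wavenumber from the text, cm^-1

# E = h * c * wavenumber, then divide by k to get a characteristic temperature
energy = H * C_CM * WAVENUMBER_N2
theta_vib = energy / K_B

print(f"E       = {energy:.2e} J")
print(f"E / k   = {theta_vib:.0f} K")
```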
Best Answer
The reason that new degrees of freedom open up at higher temperatures is that, with the possible exception of translational kinetic energy, degrees of freedom are quantized. Due to quantum mechanics, the molecules can only vibrate/rotate/get excited with certain discrete energies, and there is a lowest energy at which this happens. The particular energy of this "lowest excited state" determines the temperature at which the degree of freedom "turns on," by which we mean that it is accessible to a large number of particles in the ensemble (in reality, you'll nearly always have a few highly excited particles at any temperature simply due to the nature of the statistical distribution of the particles, but in most situations this tiny fraction is irrelevant). As such, the temperature at which new degrees of freedom turn on is highly dependent on the specific material that is being examined.
Water, for example, has rotational energy levels at very low energies (low enough to be excited by microwaves) due to its asymmetry, while its vibrational energy levels are somewhat higher (can only be excited by higher-energy infrared radiation). Incidentally, this is why both infrared radiation and microwaves are perceived as heat - they put energy directly into one of these degrees of freedom, and this energy redistributes to the translational degrees of freedom to give a higher temperature.
Water is a bit of a complicated case, as it's a nonlinear asymmetric molecule, so for more concrete predictions, the numbers to look for are the rotational temperature and the vibrational temperature, which are material-specific, and tend to only be calculated for linear molecules. The rotational temperature is typically much lower than the vibrational temperature, as rotational motion typically has a much lower first excited state than vibrational motion. Some typical values for each are found on Wikipedia (https://en.wikipedia.org/wiki/Rotational_temperature, https://en.wikipedia.org/wiki/Vibrational_temperature). For example, oxygen gas (O$_2$) has a rotational temperature of 2.08 K and a vibrational temperature of 2256 K. This means, at room temperature, the typical energy of the oxygen gas molecules is more than large enough to excite rotational modes, but vibrational modes will be essentially out of reach.
As such, if you were to heat oxygen gas to around 2256 K and above, you would see the heat capacity rise as the vibrational modes become accessible places to store energy; likewise, cooling oxygen to below about 2 K would cause a decrease in heat capacity, as the rotational modes are no longer easily accessible.
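To get a feel for how the vibrational mode "turns on", one can evaluate the textbook heat capacity of a single quantum harmonic oscillator, $C_\text{vib}/R = x^2 e^{-x} / (1 - e^{-x})^2$ with $x = \theta_\text{vib}/T$. This is a rough sketch only: treating the O$_2$ vibration as an ideal harmonic oscillator is an assumption made here, and the function name is hypothetical.

```python
import math

THETA_VIB_O2 = 2256.0  # vibrational temperature of O2 quoted above, K

def vib_heat_capacity_over_R(temperature, theta_vib=THETA_VIB_O2):
    """Vibrational heat capacity per mole in units of R for one
    harmonic mode (assumed model): x^2 e^-x / (1 - e^-x)^2."""
    x = theta_vib / temperature
    # expm1 keeps 1 - exp(-x) accurate when x is small (high temperature)
    one_minus_decay = -math.expm1(-x)
    return x * x * math.exp(-x) / one_minus_decay**2

for t in (300.0, 1000.0, 2256.0, 5000.0):
    c_vib = vib_heat_capacity_over_R(t)
    print(f"T = {t:6.0f} K: C_vib = {c_vib:.3f} R")
```

In this model the contribution is only a few percent of $R$ at room temperature and approaches the full classical value of $R$ once $T$ is well above $\theta_\text{vib}$, so the change is a smooth crossover rather than a sharp step.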