In non-relativistic quantum mechanics the mass can, in principle, be considered an observable and thus described by a self-adjoint operator.
In this sense a quantum physical system may have several different values of the mass, and a value is fixed as soon as one performs a measurement of the mass observable, exactly as happens for the momentum, for instance.
However, it is possible to prove that, when the physical system is invariant under the Galilean group, a superselection rule arises: the well-known Bargmann mass superselection rule. It states that coherent superpositions of pure states with different values of the mass are forbidden.
Therefore the whole description of the system is always confined to a fixed eigenspace of the mass operator (in particular because all remaining observables, including the Hamiltonian, commute with the mass operator).
In practice, the mass of the system behaves just like a non-quantum, fixed parameter. This is the reason, barring subtle technicalities (non-separability of the Hilbert space if the spectrum of the mass operator is continuous), why the mass can be considered a fixed parameter rather than a self-adjoint operator in non-relativistic quantum mechanics.
In relativistic quantum mechanics the picture is quite different. First of all, one has to distinguish between elementary systems (elementary free particles in Wigner's definition) and compound (interacting) systems. The former are defined as irreducible (strongly continuous) unitary representations of the Poincaré group. Each such representation is identified by a set of numbers, namely the eigenvalues of certain observables that attain constant values in the representation because of the irreducibility requirement. The nature of these numbers depends on the structure of the group one is considering.
Each such observable, in the irreducible Hilbert space of the system, has the form $\lambda I$ where $\lambda$ is a fixed real number. Referring to the Poincaré group, the mass operator turns out to be one of these elementary observables. Therefore, in relativistic quantum mechanics, the elementary systems must have a trivial mass operator, which, as before, can be considered a fixed, non-quantum parameter.
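To make the statement concrete, here is a sketch of the standard argument, assuming units with $c=1$ and metric signature $(+,-,-,-)$ (both conventions are my choice, not fixed by the text above). With $P^\mu=(H,\mathbf{P})$ the generators of spacetime translations, the squared mass operator is

$$
M^2 = P^\mu P_\mu = H^2 - \mathbf{P}^2 , \qquad [M^2, P^\mu] = 0 , \qquad [M^2, J^{\mu\nu}] = 0 ,
$$

where $J^{\mu\nu}$ denotes the Lorentz generators. Since $M^2$ commutes with every generator of the group, Schur's lemma gives, in an irreducible representation,

$$
M^2 = m^2 I ,
$$

so $\lambda = m^2$ is one of the fixed numbers labelling the representation.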
The picture changes dramatically if one focuses on compound systems: there the mass is simply the energy operator evaluated in the rest frame of the system. It generally shows a mixed spectrum made of a continuous part, due to the "relative" kinetic energy and, below that, a point spectrum describing the possible masses of the overall system.
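A simple illustration of the continuous part, under the assumption (mine, for the sake of the example) of two non-interacting particles with masses $m_1$ and $m_2$ in units with $c=1$: in the rest frame of the pair the momenta are $\mathbf{k}$ and $-\mathbf{k}$, and the mass of the compound system is the total energy,

$$
M(\mathbf{k}) = \sqrt{m_1^2 + \mathbf{k}^2} + \sqrt{m_2^2 + \mathbf{k}^2} \;\ge\; m_1 + m_2 .
$$

As the relative momentum $\mathbf{k}$ varies, $M$ covers the whole half-line $[m_1+m_2,\infty)$, which is the continuous spectrum. Once an attractive interaction is switched on, bound states can appear as isolated eigenvalues below the threshold $m_1+m_2$.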
ADDENDUM. As Arnold Neumaier pointed out to me, neutrinos appear to have non-fixed values of the mass (i.e. the mass operator is not trivial) in view of the presence of the weak interaction. In my view, it is disputable if they can be considered elementary particles since they include weak interaction in their description. Surely they are elementary from a purely physical viewpoint. Maybe Wigner's description is physically inappropriate.
I think the main issue here is that you're attempting to think about a system in isolation -- the energy states of an electron subject to the electrostatic potential created by the positively charged nucleus -- and trying to understand the measurement based on this system. This is hopeless, as the measurement is not a part of this system.
Your first big red flag should be that energy is not conserved in such a transition. Where does the energy in the difference between the two states go?
You might try thinking about how this measurement is performed in a laboratory. Somewhat tangentially, this was an experiment in my undergraduate physics laboratory, because it encouraged you to think about precisely this type of problem[1].
I should also specify that the Rydberg-Ritz "difference of frequencies associated with the spectral lines" really just means "transitions between two excited states of the atom."
When an electron 'relaxes' from one energy state to another, the difference in energy has to go somewhere (by conservation of energy). As an interesting corollary, such a transition is forbidden if there is nowhere for that energy to go[2]. In this case, the electron's potential energy is released in the form of an excitation of the electromagnetic field, also known as a photon.
On the other end of the experiment, we measure the wavelength of this photon, typically with a grating monochromator or similar apparatus. Using the energy-wavelength relation for a photon in free space, $E=\frac{hc}{\lambda}$, we calculate the energy of the emitted photons.
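As a quick numerical check of the relation $E=\frac{hc}{\lambda}$, here is a minimal sketch. The function name `photon_energy_ev` and the sample wavelength (roughly the hydrogen H-alpha line) are my own illustrative choices; the constants are the exact SI values.

```python
# Physical constants (exact SI values since the 2019 redefinition).
PLANCK_H = 6.62607015e-34        # J s
SPEED_OF_LIGHT = 2.99792458e8    # m / s
ELECTRON_VOLT = 1.602176634e-19  # J


def photon_energy_ev(wavelength_nm: float) -> float:
    """Return the photon energy in eV for a free-space wavelength in nm."""
    wavelength_m = wavelength_nm * 1e-9
    return PLANCK_H * SPEED_OF_LIGHT / wavelength_m / ELECTRON_VOLT


# Illustrative input: a wavelength close to the hydrogen H-alpha line.
h_alpha_ev = photon_energy_ev(656.3)
print(f"{h_alpha_ev:.3f} eV")
```

The result is close to 1.89 eV, which is what the simple Bohr estimate $13.6\,\mathrm{eV}\times(1/4-1/9)$ gives for the $n=3\to2$ transition.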
But we still have not gotten to the measurement in any true quantum mechanical sense. How do we know which wavelengths (energies) correspond to the transition lines of the atom? We plot the expected number of photons per unit time versus energy and look for spikes that look like a Lorentzian (or really, a Voigt profile). The center of those spikes is the energy we associate with the transition.
So the true measurement we are making is the expected value of the number operator, $\left<\hat{N}\right> = \left< a^\dagger a \right>$, when the monochromator is set to different wavelength values.
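The peak-finding step can be sketched as follows. This is a toy, noise-free example: the line center, width, amplitude, and energy grid are made-up numbers, and a real analysis would fit a Lorentzian or Voigt profile to noisy counts rather than read the maximum off a grid.

```python
import numpy as np


def lorentzian(energy, center, fwhm, amplitude):
    """Lorentzian line shape with the given center, full width at half maximum, and peak height."""
    half_width = fwhm / 2.0
    return amplitude * half_width**2 / ((energy - center) ** 2 + half_width**2)


# Synthetic "photon counts versus energy" (all values are hypothetical).
energies = np.linspace(1.85, 1.93, 801)  # eV, grid step of 1e-4 eV
counts = lorentzian(energies, center=1.889, fwhm=0.004, amplitude=1000.0)

# The transition energy is read off as the position of the spike.
estimated_center = energies[int(np.argmax(counts))]

# Width estimate: extent of the region where counts exceed half the maximum.
above_half = energies[counts >= counts.max() / 2.0]
estimated_fwhm = above_half[-1] - above_half[0]

print(estimated_center, estimated_fwhm)
```

Both estimates are limited by the grid spacing, which plays the role of the monochromator's step size here.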
In summary, you are correct that the difference in energy levels does not correspond nicely to a measurement. What does correspond to a measurement is the energy of the emitted photon when an electron traverses that energy difference. By conservation of energy, these must have the same value[3].
As an end note, including the measurement apparatus can be a powerful tool in the analysis of quantum systems. In the field of quantum information, it's typically referred to as the 'ancilla' system, and it allows you to understand the measurement of POVMs.
[1] Technically, we were looking at the effects of the change in nuclear mass between Hydrogen and Deuterium, but that is really a tangent.
[2] This is the basic idea behind suppression of spontaneous emission in the Purcell Effect.
[3] I left out an important bit here -- why the entire energy of the transition must be conveyed to a single photon. This is essentially a consequence of the quantization of energy levels and the linearity of the electromagnetic field, though the ability to explain this simply and accurately lies beyond my skills.
One distinction I find it useful to make is between operators that are essentially measurement operators and operators that are essentially part of the mathematics that is used to construct a state. The operator $\hat x$ is a measurement operator, but we also need a state to tell us what measurement results we will observe. To construct one particular state, the conventional ground state of the simple harmonic oscillator, we introduce another operator, the annihilation operator $\hat a$, and write (not worrying about constant factors) $\hat x=\hat a+\hat a^\dagger$. We can use $\hat a$ to define/construct an object $\left|0\right>$ for which $\hat a\left|0\right>=0$. This object lets us construct the expectation value of any function of $\hat x$ in this state, $f(\hat x)\mapsto\left<0\right|f(\hat x)\left|0\right>$, using the commutation relation $[\hat a,\hat a^\dagger]=1$.
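For readers who like to see numbers, here is a minimal sketch of this construction using truncated matrices in the number basis. The truncation dimension and the helper name `vacuum_expectation` are my own choices, and a finite matrix only approximates $[\hat a,\hat a^\dagger]=1$ (the last diagonal entry of the commutator is wrong), which does not affect low moments in the vacuum.

```python
import numpy as np

N = 40  # truncation dimension of the Fock space (an arbitrary choice)

# Annihilation operator in the number basis: a|n> = sqrt(n)|n-1>.
a = np.diag(np.sqrt(np.arange(1, N)), k=1)
a_dag = a.conj().T
x = a + a_dag  # position operator, constant factors ignored as in the text

vacuum = np.zeros(N)
vacuum[0] = 1.0  # the state |0>, which satisfies a|0> = 0


def vacuum_expectation(operator: np.ndarray) -> float:
    """Return <0| operator |0> for a real matrix in the number basis."""
    return float(vacuum @ operator @ vacuum)


x2 = vacuum_expectation(x @ x)                         # <0| x^2 |0>
x4 = vacuum_expectation(np.linalg.matrix_power(x, 4))  # <0| x^4 |0>
print(x2, x4)
```

With this normalization the vacuum behaves like a Gaussian of unit variance, so the second and fourth moments come out as 1 and 3.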
At this point, we have to introduce a moderately high level of mathematics, which lets us use the vacuum state (or any other state that we can create using other mathematics) as input to the Gelfand-Naimark-Segal construction of a Hilbert space, but the essential fact at the elementary level is as I've already noted, that we can construct the expectation values of any function of $\hat x$ (and we can extend the algebra of functions to include functions of both $\hat x$ and $\hat p=\mathrm{i}(\hat a^\dagger-\hat a)\;$) in the vacuum state, using the commutation relations $[\hat a,\hat a^\dagger]=1$.
The SHO is especially simple because there are only bound states, so that we can have $\hat x$ be self-adjoint by (abstract) construction as it is here, but the same distinction can be helpful elsewhere --- between operators that are introduced as mathematical analogues of measurements and operators that are introduced as mathematical tools for constructing states. I find this distinction helpful in quantum field theory, for example, because the SHO is foundational there.
I hope this is not too abstract. It's not a distinction that I have seen made as explicitly as this in textbooks, so use it carefully. The dichotomy between states and observables is not often emphasized as much as I think it might be; however, it's manifest in the well-known expression of the expected value of an observable in a given state as a trace of the product of an observable and a density operator, $\mathsf{E}=\mathsf{Tr}\bigl[\hat A\hat\rho\bigr]$, which to me suggests that we think about whether various operators are used more to construct the $\hat A$'s or more to construct the $\hat\rho$'s.
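A small two-level illustration of the trace formula, with an observable and states picked purely for the sake of example (Pauli matrices, a 75/25 mixture, and an equal superposition):

```python
import numpy as np

# Observables (the A's): Pauli matrices for a two-level system.
sigma_z = np.array([[1.0, 0.0], [0.0, -1.0]])
sigma_x = np.array([[0.0, 1.0], [1.0, 0.0]])

# A mixed state (a rho): 75% |0><0| plus 25% |1><1|, weights chosen arbitrarily.
rho_mixed = np.diag([0.75, 0.25])

# A pure state: the equal superposition (|0> + |1>)/sqrt(2) and its density operator.
psi = np.array([1.0, 1.0]) / np.sqrt(2.0)
rho_pure = np.outer(psi, psi.conj())


def expectation(observable: np.ndarray, density: np.ndarray) -> float:
    """Return E = Tr[A rho]."""
    return float(np.real(np.trace(observable @ density)))


e_mixed = expectation(sigma_z, rho_mixed)    # 0.75 - 0.25
e_pure = expectation(sigma_x, rho_pure)      # equals <psi| sigma_x |psi>
e_direct = float(psi.conj() @ sigma_x @ psi)
print(e_mixed, e_pure, e_direct)
```

The observable and the state enter the formula as separate ingredients, which is the dichotomy I have in mind.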
Almost immediately an EDIT, to engage with your Question slightly more: All the above is built on abstract linear algebras, however the construction of a state can alternatively be put in terms of the construction of a representation of the linear algebra --- which in elementary mathematics is one and the same as the linear algebra itself, all done as "matrices", but a representation of a linear algebra requires considerably more structure than is required to construct an abstract linear algebra.