I agree with you that most books do not follow a logical path when defining thermodynamic terms, even great books such as Fermi's and Pauli's.
The first thing you need to define is the concept of thermodynamic variables.
Thermodynamic variables are macroscopic quantities whose values depend
only on the current state of thermodynamic equilibrium of the system.
By thermodynamic equilibrium we mean that those variables do not change with time. Their values at equilibrium cannot depend on the process by which the system reached equilibrium. Examples of thermodynamic variables are volume, pressure, surface tension, and magnetization. The equilibrium values of these quantities define the thermodynamic state of a system.
When a thermodynamic system is not isolated, its thermodynamic variables can change under the influence of the surroundings. We say the system and the surroundings are in thermal contact. When the system is not in thermal contact with the surroundings, we say it is adiabatically isolated. We can then define:
Two bodies are in thermal equilibrium when they - in thermal contact
with each other - have constant thermodynamic variables.
Now we are able to define temperature. From a purely thermodynamic point of view this is done through the Zeroth Law. A detailed explanation can be found in this post. Basically,
We say that two bodies have the same temperature if and only if they
are in thermal equilibrium.
Borrowing the mechanical definition of work, one can observe, by way of experiments, that the work needed to achieve a given change in the thermodynamic state of an adiabatically isolated system is always the same. This allows us to define that value as a change in internal energy,
$$W=-\Delta U.$$
By removing the adiabatic isolation we notice that the equation above is no longer valid and we correct it by adding a new term,
$$\Delta U=Q-W,$$
so
The heat $Q$ is the energy the system exchanges with the surroundings in
a form that is not work.
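The sign bookkeeping above can be made concrete with a short numerical sketch. The numbers are purely illustrative, and the function name is my own; the convention matches the text ($W$ is work done by the system, $Q$ is heat absorbed by it):

```python
# First law, Delta U = Q - W, with the text's sign convention:
# W = work done BY the system, Q = heat absorbed BY the system.
# All numbers below are illustrative, not from any real experiment.

def internal_energy_change(Q, W):
    """Return Delta U (joules) for heat Q absorbed and work W done by the system."""
    return Q - W

# Adiabatic process (Q = 0): the first law reduces to W = -Delta U.
dU_adiabatic = internal_energy_change(Q=0.0, W=150.0)
print(dU_adiabatic)  # -150.0: the system loses exactly the energy it spends as work

# General process: 500 J of heat in, 150 J of work out.
dU = internal_energy_change(Q=500.0, W=150.0)
print(dU)  # 350.0: the remainder is stored as internal energy
```

Note that in the adiabatic case the result reproduces $W=-\Delta U$, the defining experiment above.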
Notice that I have skipped more basic definitions, such as thermodynamic system and isolated system, but these can easily and logically be defined within this construction.
You are right. The thermal equilibrium will eventually be reached. In this process, heat is transferred from the water to the thermometer. This increases the temperature of the thermometer and decreases the temperature of the water until they are equal.
However, the amount of water is generally large enough that the heat it loses is too small to change its temperature significantly.
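A minimal sketch of this equilibration, assuming lumped heat capacities and no losses to the environment (all numbers are illustrative assumptions):

```python
# Water-thermometer equilibration: heat flows until temperatures are equal.
# Lumped-capacity model; numbers are assumed for illustration.

def equilibrium_temperature(C1, T1, C2, T2):
    """Common final temperature of two bodies with heat capacities
    C1, C2 (J/K) initially at T1, T2. From the energy balance
    C1*(Tf - T1) + C2*(Tf - T2) = 0."""
    return (C1 * T1 + C2 * T2) / (C1 + C2)

C_water = 4184.0   # ~1 kg of water, J/K (assumed)
C_thermo = 2.0     # small glass thermometer, J/K (assumed)
T_water, T_thermo = 80.0, 20.0  # initial temperatures, degrees C

Tf = equilibrium_temperature(C_water, T_water, C_thermo, T_thermo)
print(round(Tf, 3))  # just under 80: the large water bath barely changes
```

With the water's heat capacity some two thousand times the thermometer's, the common final temperature sits within a few hundredths of a degree of the water's initial value, which is the point made above.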
Your description of the disturbance wrought on the system by the thermometer is sound. You may be able to lessen the effect with a thermal diffusion model of the thermometer, calculating what the system's temperature was before it brought the thermometer into equilibrium with itself; but for that approach to work, one must know the system's heat capacity (and the thermometer's temperature before the measurement). Measurements always disturb systems in the way you describe; this is the Observer Effect. Although it is not to be confused with Heisenberg's uncertainty principle, historically Heisenberg began thinking about measurement uncertainty along the lines of an observer effect. He didn't stop there, of course, and his work and that which followed ultimately led mainstream physics to the separate, famous uncertainty principle, with its full-blown denial of counterfactual definiteness (the idea that measurement outcomes have definite values before the measurement).
Is your statement that a thermometer can't really measure something's temperature correct? Strictly speaking, it is always true, as it is of any instrument. But the more practical questions are "is it misleading?" or "does it give the right impression for the experiment at hand?", questions best handled by theoretical calculation and overbounding of the effect. Thus, if you are using a thermometer to measure the temperature of a swimming pool, your statement, although strictly true, gives an utterly misleading idea of the thermometer's usefulness for that task. However, your statement is very practically true (anything else would be misleading) for the measurement of the temperature of a cubic centimetre of liquid if your purposes call for $\pm1\,\text{K}$ accuracy. For very small things, infrared thermometers are useful: these measure the blackbody spectrum of an emitter, but one must take care that the reading is not contaminated by radiation from other things nearby. This is the approach you might try for a cubic centimetre or, if you get really good at experimental measurement, a cubic millimetre of substance.
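The pool versus cubic-centimetre contrast can be estimated with the same lumped energy balance used for the water and thermometer. All heat capacities and temperatures here are rough assumptions for illustration only:

```python
# Observer-effect magnitude for two sample sizes, using a lumped
# energy-balance model. All numbers are illustrative assumptions.

def equilibrium_temperature(C_sys, T_sys, C_th, T_th):
    """Common final temperature after thermometer and sample equilibrate."""
    return (C_sys * T_sys + C_th * T_th) / (C_sys + C_th)

C_th, T_th = 2.0, 20.0  # thermometer: ~2 J/K, starting at 20 C (assumed)
T_sys = 37.0            # true sample temperature before the measurement

# Swimming pool: ~50 m^3 of water, C ~ 2.1e8 J/K (assumed).
T_pool = equilibrium_temperature(2.1e8, T_sys, C_th, T_th)
# One cubic centimetre of water: C ~ 4.18 J/K.
T_cc = equilibrium_temperature(4.18, T_sys, C_th, T_th)

print(T_sys - T_pool)  # on the order of 1e-7 K: utterly negligible
print(T_sys - T_cc)    # several kelvin: a gross error at +/-1 K accuracy

# If C_sys and T_th are known, the pre-measurement temperature can be
# recovered from the reading Tf:  T_sys = Tf + (C_th / C_sys) * (Tf - T_th),
# which is the back-correction approach mentioned above.
```

The same reading error that is invisible for the pool dwarfs a $\pm1\,\text{K}$ accuracy budget for the cubic-centimetre sample, which is why the practical answer depends on the experiment at hand.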
Experimental design for very precise measurements is often mostly about lessening the observer effect for the experiment at hand. Ultimately it boils down to a thorough investigation of the theoretically expected signal-to-noise ratio of your measurement setup and whether the foreseen SNR works for your purposes, together with a statistical analysis of repeated measurements to check experimentally whether your theoretical SNR analysis is sound.