I agree with you that most books do not follow a logical path when defining thermodynamic terms, not even great books such as Fermi's and Pauli's.
The first thing you need to define is the concept of thermodynamic variables.
Thermodynamic variables are macroscopic quantities whose values depend
only on the current state of thermodynamic equilibrium of the system.
By thermodynamic equilibrium we mean that those variables do not change with time. Their values at equilibrium cannot depend on the process by which the system reached equilibrium. Examples of thermodynamic variables are volume, pressure, surface tension, magnetization... The equilibrium values of these quantities define the thermodynamic state of a system.
When a thermodynamic system is not isolated, its thermodynamic variables can change under the influence of the surroundings. We say the system and the surroundings are in thermal contact. When the system is not in thermal contact with the surroundings we say the system is adiabatically isolated. We can then define that,
Two bodies are in thermal equilibrium when they - in thermal contact
with each other - have constant thermodynamic variables.
Now we are able to define temperature. From a purely thermodynamic point of view this is done through the Zeroth Law. A detailed explanation can be found in this post. Basically,
We say that two bodies have the same temperature if and only if they
are in thermal equilibrium.
Borrowing the mechanical definition of work, one can observe experimentally that the work needed to achieve a given change in the thermodynamic state of an adiabatically isolated system is always the same, however that work is performed. This allows us to define its value as a change in internal energy. With $W$ denoting the work done by the system,
$$W=-\Delta U.$$
By removing the adiabatic isolation we notice that the equation above is no longer valid and we correct it by adding a new term,
$$\Delta U=Q-W,$$
so
The heat $Q$ is the energy the system exchanges with the surroundings in
a form that is not work.
Notice that I have skipped more basic definitions such as thermodynamic system and isolated system, but these can be easily and logically defined in this construction.
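As a small bookkeeping illustration of the two equations above, here is a minimal sketch that takes $W$ to be the work done by the system (the convention implied by $W=-\Delta U$). The function name and all numbers are hypothetical and chosen only for illustration.

```python
def heat_exchanged(delta_u, work_by_system):
    """Return Q from the first law, Delta U = Q - W.

    W is the work done BY the system (assumed sign convention).
    """
    return delta_u + work_by_system


# Hypothetical change of state: internal energy drops by 50 J.
delta_u = -50.0

# Adiabatic path: by definition W = -Delta U, so no heat is exchanged.
w_adiabatic = -delta_u
q_adiabatic = heat_exchanged(delta_u, w_adiabatic)

# Same change of state without adiabatic isolation: the system only
# does 20 J of work, so the remaining energy must leave as heat.
q_diathermal = heat_exchanged(delta_u, 20.0)

print(q_adiabatic, q_diathermal)
```

The point of the sketch is that $\Delta U$ is fixed by the initial and final states, while $Q$ and $W$ each depend on the path taken between them.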
First of all, I will address your last concern, which amounts to this: equilibrium doesn't necessarily mean that nothing is moving. As an example, particles in a solution at equilibrium can move from one side to the other as long as almost the same number of particles move in the opposite direction (this usually happens because of thermal agitation, i.e. Brownian motion). What doesn't change with time is the average concentration of particles in the solution.

Using this analogy again, suppose you have an isolated system made of, say, sea water and distilled water. When you first mix them you can still distinguish them, as one has a higher concentration of salts. Internal gradients, which arise because the system is out of equilibrium, then drive the salts to homogenise, and after a suitably long relaxation time the end result is water with a homogeneous concentration of salts.

The same happens with temperature: if you replace sea and distilled water with hot and cold water, the same mechanism will homogenize your system. Temperature gradients will mix the water around until the temperature is the same everywhere. Then the gradients disappear and dynamical equilibrium sets in (the water is not completely static, since molecules still move around because of thermal agitation).
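The relaxation described above can be illustrated with a toy model. The sketch below assumes a one-dimensional column of cells that exchange heat only by diffusion (no bulk mixing) and has insulated ends, so it is a simplification of the hot and cold water example rather than a model of it. The function name, the cell count and the parameter `alpha` are all hypothetical.

```python
def relax(temps, alpha=0.25, steps=2000):
    """Apply explicit diffusion steps to a 1D row of cells with insulated ends.

    alpha stands for D*dt/dx**2 and must stay <= 0.5 for the scheme
    to be stable (assumed dimensionless parameter).
    """
    t = list(temps)
    n = len(t)
    for _ in range(steps):
        new = t[:]
        for i in range(n):
            # Insulated ends: a boundary cell sees a copy of itself outside.
            left = t[i - 1] if i > 0 else t[i]
            right = t[i + 1] if i < n - 1 else t[i]
            new[i] = t[i] + alpha * (left - 2.0 * t[i] + right)
        t = new
    return t


# Hypothetical initial state: hot water next to cold water.
initial = [80.0] * 10 + [20.0] * 10
final = relax(initial)

spread = max(final) - min(final)   # remaining temperature difference
mean = sum(final) / len(final)     # conserved, since the system is isolated
print(spread, mean)
```

The gradient decays while the mean temperature stays fixed, which is the macroscopic signature of equilibrium; the model says nothing about the microscopic motion that continues afterwards.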
Best Answer
I will provide an answer from an astrophysics point of view, in which the term local thermodynamic equilibrium (LTE) is often used. In astrophysics the distinction between 'thermal equilibrium' and 'thermodynamic equilibrium' is not carefully made, because there is rarely if ever a situation in this context in which thermal equilibrium might hold without thermodynamic equilibrium.
The most common situation in which the presence or absence of LTE is considered is for a star. There is a flow of heat from the interior of a star to its atmosphere at large radii, and the temperature varies as a function of radius. So clearly gas near the atmosphere is not in thermodynamic (or thermal) equilibrium with gas in the core.
However, models of stellar interiors can be vastly simplified if one recognizes that there is still a local thermodynamic (thermal) equilibrium in the sense that the kinetic distribution of the free electrons, the plasma ionization state, and the radiation field at each radius can all be very well described by a single number, a local temperature $T$. The electron velocities obey a Maxwell-Boltzmann distribution, the ionization state follows from the Saha equation, and the radiation field is described by a Planck function (blackbody), all evaluated for some common $T$.
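To make "all evaluated for some common $T$" concrete, here is a minimal sketch that evaluates two of the three distributions named above, the Maxwell-Boltzmann speed distribution for electrons and the Planck function, at one shared temperature. The Saha equation is left out to keep the example short. The function names and the value of `T_LOCAL` are assumptions for illustration only.

```python
import math

H = 6.62607015e-34    # Planck constant, J s
K_B = 1.380649e-23    # Boltzmann constant, J/K
C = 2.99792458e8      # speed of light, m/s
M_E = 9.1093837e-31   # electron mass, kg


def maxwell_speed_pdf(v, temp, mass=M_E):
    """Maxwell-Boltzmann probability density for speed v (per m/s)."""
    a = mass / (2.0 * K_B * temp)
    return 4.0 * math.pi * (a / math.pi) ** 1.5 * v * v * math.exp(-a * v * v)


def planck_b_nu(nu, temp):
    """Planck function B_nu(T) in W m^-2 Hz^-1 sr^-1."""
    x = H * nu / (K_B * temp)
    return 2.0 * H * nu ** 3 / C ** 2 / math.expm1(x)


# Hypothetical local temperature shared by electrons and radiation.
T_LOCAL = 6000.0

# Most probable electron speed and the frequency where B_nu peaks.
v_peak = math.sqrt(2.0 * K_B * T_LOCAL / M_E)
nu_peak = 2.821439 * K_B * T_LOCAL / H

print(v_peak, maxwell_speed_pdf(v_peak, T_LOCAL))
print(nu_peak, planck_b_nu(nu_peak, T_LOCAL))
```

In LTE the single input `T_LOCAL` fixes both curves; when LTE fails, the electrons and the radiation field would each need their own description.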
As the link you provide explains, this works as long as the mean free path of any particles that might transport heat (e.g. photons, electrons) is very small compared to the length scale over which the temperature is changing. In the atmospheres of stars, where the photon mean free path grows large, LTE in the above sense can break down. However, if the electron mean free path is still small enough, it can be helpful to apply an even more limited notion of LTE in which one takes the electron velocities to be in LTE, while acknowledging that the radiation field may depart from a Planck function.
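The criterion above can be written as a simple ratio. The sketch below assumes the photon mean free path is $1/(\kappa\rho)$ for opacity $\kappa$ and density $\rho$, and takes $T/|dT/dr|$ as the length scale over which the temperature changes. The function names are hypothetical, and the input values are round placeholder numbers rather than data for any real star.

```python
def photon_mean_free_path(opacity, density):
    """Mean free path in m, for opacity in m^2/kg and density in kg/m^3."""
    return 1.0 / (opacity * density)


def temperature_scale_length(temp, dtemp_dr):
    """Length in m over which the temperature changes appreciably."""
    return temp / abs(dtemp_dr)


def lte_ratio(mean_free_path, temp, dtemp_dr):
    """Mean free path over temperature scale length; LTE needs this << 1."""
    return mean_free_path / temperature_scale_length(temp, dtemp_dr)


# Hypothetical round numbers for a dense, opaque interior layer.
opacity = 0.1        # m^2/kg (placeholder)
density = 1.0e3      # kg/m^3 (placeholder)
temp = 1.0e7         # K (placeholder)
dtemp_dr = -1.0e-2   # K/m (placeholder)

mfp = photon_mean_free_path(opacity, density)
ratio = lte_ratio(mfp, temp, dtemp_dr)
print(mfp, ratio)
```

With these placeholder values the ratio is many orders of magnitude below one. In an atmosphere the density drops, the mean free path grows, and the same ratio can approach or exceed one, which is where LTE for the radiation field breaks down.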
To summarize, LTE is a useful concept to the extent that it simplifies the description of a large system as a collection of local regions each described by a single temperature. The distinction between thermal and thermodynamic equilibrium is rarely needed in the situations in which LTE is a helpful approximation.