You may have found a small glitch in that waterfall analogy. An analogy I like much better is water flowing through pipes.
The voltage (potential difference) corresponds to the pressure difference between two points. A higher pressure in one spot means a larger "push" on the water. For charges in a circuit, the voltage is the "push" that drives them forward through the obstacles, in the form of resistors and other circuit components.
Such a pressure difference corresponds directly to a potential energy difference. This is why the waterfall analogy is often used: it is a more intuitive way to think of potential energy. But increasing the voltage across two points in a circuit corresponds, in that analogy, not to a higher "pressure" difference between the top and the bottom of the waterfall but to a larger potential difference. And a higher potential difference means a higher waterfall, because the potential energy being compared here is gravitational.
So the increase in height of the waterfall is analogous to an increase in charge accumulation in an electric circuit. The distance changes in that analogy, so the speeds are not really comparable. They would be in the pipe analogy.
I admire your determination to understand, and your line of questioning. I'll try and address one or two of your points.
First of all, consider potential difference in free space in an electric field. I prefer to define the pd between two points, P and Q, as the work done by the electric field on a charge, per unit charge, as it goes from P to Q. So a greater pd between the points implies more work done on the charge as it goes from P to Q, which can only mean a greater electric field strength, that is, a greater force acting on the charge. [Work = Force × distance in direction of force, and we're considering the fixed distance between P and Q.]
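To put that definition in symbols, here is a sketch that assumes a uniform field of strength E directed along the line from P to Q, with d the distance between them. The symbols W, q, F, E, d and V_PQ are my own labels, not taken from the question.

```latex
V_{PQ} = \frac{W}{q} = \frac{F\,d}{q} = \frac{(qE)\,d}{q} = E\,d
\quad\Longrightarrow\quad
E = \frac{V_{PQ}}{d}
```

So for a fixed distance d, doubling the pd doubles the field strength, and hence doubles the force qE on the charge.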
So if you apply a pd between the ends of a wire, the free electrons in the wire experience forces, urging them to travel through the wire. Because of collisions between the electrons and the lattice of ions (this is simplified) the electrons don't accelerate continuously under the force from the electric field, but reach a steady mean speed (called the drift speed). If you increase the pd you increase the force on each electron and the drift speed increases. This means that more electrons pass through any cross-section of the wire per second, that is the current increases.
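That last step can be written as a standard relation between current and drift speed. The symbols are ones I've introduced for illustration: n is the number of free electrons per unit volume, A the cross-sectional area of the wire, e the magnitude of the electron charge, and v_d the drift speed.

```latex
I = n A e \, v_d
```

With n, A and e fixed for a given wire, a larger drift speed means a proportionally larger current.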
Jumping now to the end of your question: "Why does POTENTIAL energy per charge between two points translate to THERMAL energy? I thought that if voltage was a push, the potential energy would just get converted to kinetic energy in the electrons, so where did the thermal energy come from?"
(1) Voltage isn't "a push"; its units are joules per coulomb! But, as I tried to explain above, it is related to the push (that is the force) that charges get in an electric field.
(2) The thermal energy comes from the collisions that the electrons, driven by the electric field and losing electrical potential energy, make with the lattice of ions. This increases the random vibration energy of the ions. [The extra kinetic energy that the electrons acquire due to the applied voltage is pretty negligible. For a current of a few amperes in an ordinary wire, the drift speed is of the order of a millimetre per second.]
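If you want to check the size of that drift speed yourself, here is a rough sketch in Python using I = n A e v_d. The numbers are my assumptions, not from the question: a copper wire with roughly 8.5e28 free electrons per cubic metre, a cross-section of 1 mm², and a current of 3 A. The function names are just illustrative.

```python
# Rough estimate of electron drift speed in a wire, from I = n * A * e * v_d.
# All input values below are illustrative assumptions.

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
ELECTRON_MASS = 9.1093837e-31        # kilograms


def drift_speed(current, number_density, area):
    """Drift speed in m/s for a given current (A), free-electron
    number density (per m^3) and cross-sectional area (m^2)."""
    return current / (number_density * area * ELEMENTARY_CHARGE)


def drift_kinetic_energy(speed):
    """Kinetic energy in joules of one electron moving at the drift speed."""
    return 0.5 * ELECTRON_MASS * speed ** 2


n_copper = 8.5e28  # assumed free-electron density of copper, per m^3
area = 1.0e-6      # assumed cross-section: 1 mm^2, in m^2
current = 3.0      # assumed current, in amperes

v = drift_speed(current, n_copper, area)
print(f"drift speed: {v * 1e3:.2f} mm/s")
print(f"kinetic energy per electron from drift: {drift_kinetic_energy(v):.1e} J")
```

With these assumed values the drift speed comes out at a fraction of a millimetre per second, and the kinetic energy associated with it is of order 10^-38 J per electron, which is why it can be ignored next to the energy handed over to the lattice in collisions.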
Best Answer
More voltage means the electrons are trying harder to repel each other. The harder they repel each other, the greater the force with which they move, so they can more readily push through obstacles (such as resistance), and more of them get through those obstacles at any given instant.
It's like pressurizing an air tank. The more air you shove into a tank, the higher the pressure gets, which is the same as the force with which the gas is trying to spread out to get away from itself. Then if you open the tank to let the air rush out, the airflow is greater when the pressure is higher. Same idea.