You have to be precise with the definitions here. You mention a temperature difference at both sides, but this is a 1D problem: you have a different temperature at each end, and a temperature difference across the rod.
The heat flux is proportional to the temperature gradient; this is referred to as Fourier's law (see e.g. http://en.wikipedia.org/wiki/Fourier's_law#Fourier.27s_law ).
Now, I understand your question as: why is $q=-k \frac{du}{dx}$ ($u$ is temperature in your case), and not just proportional to $u$? Suppose the flux were proportional to the temperature. Then you run into problems. For instance, suppose you have a rod of uniform temperature. The heat flux would then be the same nonzero constant everywhere along the rod, but in what direction? And what would the new equilibrium be then?
The notion that energy should flow from high to low temperature is captured by the gradient: energy always travels down the gradient, towards a more uniform state. To make it work out quantitatively, a constant of proportionality is introduced, which is most often a system (or material) property.
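To make the argument concrete, here is a minimal sketch of Fourier's law on a rod split into cells, using a simple finite difference for $\frac{du}{dx}$. The function name `fourier_flux`, the values of `k` and `dx`, and the temperatures are assumptions chosen for illustration; they are not taken from the question.

```python
def fourier_flux(u, k=1.0, dx=1.0):
    """Flux between neighbouring cells, q = -k * du/dx.

    The finite difference du/dx ~ (u[i+1] - u[i]) / dx is written with the
    sign folded in, so a positive value means heat flowing towards larger x.
    """
    return [k * (u[i] - u[i + 1]) / dx for i in range(len(u) - 1)]


# Uniform rod: no gradient, so the flux vanishes and no direction is needed.
uniform = [300.0, 300.0, 300.0, 300.0]
print(fourier_flux(uniform))

# Hot on the left, cold on the right: du/dx < 0, so q > 0 (heat moves right).
graded = [400.0, 350.0, 300.0, 250.0]
print(fourier_flux(graded))
```

With a flux proportional to $u$ itself, the uniform rod would carry a nonzero flux with no preferred direction; with the gradient, the flux is zero there and points from hot to cold otherwise.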
The efficiency of a heat pump in cooling can be defined by the expression
$$\eta = \frac{Q_C}{W},$$
that is, the heat taken from the cooler reservoir divided by the work put into the pump.
If you have two pumps in parallel, the efficiency is the same, since both the heat and the work are twice as large:
$$\eta = \frac{2Q_C}{2W}.$$
If, however, you have two pumps in sequence, the equation reads
$$\eta = \frac{Q_C}{2W}.$$
So in order to obtain at least the same efficiency, each pump must extract twice as much heat. That is, the efficiency of each pump should be twice as large for about half of the temperature change. So far so easy.
In the next step we must take the actual cyclic thermodynamic processes into consideration. There are plenty of them that you can use, and none can be calculated exactly in theory. In such cases it is useful to look at the most efficient thermodynamic process, the Carnot cycle, and draw conclusions from it.
The efficiency of a heat pump in cooling based on the Carnot cycle can be shown to be
$$\eta = \frac{T_C}{T_H-T_C}.$$
In the case of two pumps, take as a first iteration the intermediate temperature $T' = \frac{T_C+T_H}{2}$, exactly in the middle. The efficiencies of the two pumps are then
$$\eta_1 = \frac{T_C}{T'-T_C}, \qquad \eta_2 = \frac{T'}{T_H-T'}.$$
Since $T'-T_C = T_H-T' = \frac{T_H-T_C}{2}$, obviously $\eta_1 = 2 \eta$ and $\eta_2 > 2 \eta$; therefore two sequential pumps in cooling should be more efficient.
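The two relations can be checked numerically with a short sketch. The helper name `carnot_cop_cooling` and the temperatures (270 K and 300 K) are illustrative assumptions, not values from the question; the code only evaluates the Carnot expressions written above.

```python
def carnot_cop_cooling(t_cold, t_hot):
    """Carnot efficiency in cooling, T_C / (T_H - T_C), temperatures in kelvin."""
    return t_cold / (t_hot - t_cold)


t_c, t_h = 270.0, 300.0   # assumed reservoir temperatures in kelvin
t_mid = (t_c + t_h) / 2   # intermediate temperature T'

eta = carnot_cop_cooling(t_c, t_h)        # single pump across the full span
eta_1 = carnot_cop_cooling(t_c, t_mid)    # first pump, T_C to T'
eta_2 = carnot_cop_cooling(t_mid, t_h)    # second pump, T' to T_H

print(eta, eta_1, eta_2)
print(eta_1 / eta, eta_2 / eta)           # ratios to compare against 2
```

The first ratio comes out as 2 and the second slightly above 2, in line with the statement about the individual pumps.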
It is interesting that, by the same arguments, two sequential pumps in warming should be less efficient! I cannot find the error in the reasoning, so before accepting the answer, please wait a while so that others can check it and give their comments.
Best Answer
If we take the definition of heat flux given here seriously, then heat flux is defined as a vector field $\vec\phi$ with units of energy per unit time, per unit area. At every point $\vec x$ in space, the vector $\vec\phi(\vec x)$ tells you the direction and magnitude of heat flow in a neighborhood of that point. In particular, if we consider some small two-dimensional surface element $d\vec A$ containing $\vec x$, then $$ \vec\phi(\vec x) \cdot d\vec A $$ will tell us the amount of energy per unit time flowing through that surface. Notice that here "flux" is being used to describe a vector field, not a scalar as with electric flux in EM. Perhaps this is rather bad terminology for this reason.
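As a small illustration of the dot product $\vec\phi \cdot d\vec A$, the sketch below evaluates it for one flux vector and two orientations of the same surface element. The function name `energy_rate_through` and all numerical values are hypothetical, chosen only to show how the orientation of $d\vec A$ matters.

```python
def energy_rate_through(phi, d_area):
    """Dot product phi . dA: energy per unit time through a surface element."""
    return sum(p * a for p, a in zip(phi, d_area))


phi = (0.0, 0.0, 50.0)       # heat flux vector in W/m^2 (assumed values)
facing = (0.0, 0.0, 0.25)    # 0.25 m^2 element whose normal points along z
edge_on = (0.25, 0.0, 0.0)   # same area, normal along x (parallel to the flow)

print(energy_rate_through(phi, facing))    # energy crosses this element
print(energy_rate_through(phi, edge_on))   # nothing crosses this one
```

The vector field carries the direction of the flow, and the scalar rate through a surface only appears once a surface element has been chosen.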