Your description of molecular orbital theory is rather misleading, though I concede that it is introduced to students in the way you describe. To really understand what is going on you need a better understanding of how MO theory works.
If we write down the Schrödinger equation for a diatomic molecule like $H_2$ we find it has no analytic solution, so we look for ways of finding approximate solutions. One way of approximating the solution is to write it as a sum of some other functions $\phi_i$:
$$ \Psi = \sum a_i\phi_i $$
We call the functions $\phi_i$ basis functions, and they can be any functions we want - there is no special restriction on what we choose as our basis. However it makes sense to use functions that give a good approximation to $\Psi$ with as few terms as possible. In the case of the $H_2$ molecule an obvious choice for the basis functions is the hydrogen atomic orbitals i.e. the $1s$, $2s$, $2p$, etc.
To get an exact expression for the $H_2$ wavefunction would require an infinite basis, but we would expect a reasonable approximation with a finite number of terms in the sum. In particular we expect to get a good start by considering only two terms, i.e. the $1s$ orbitals of the two hydrogen atoms. Let's call these $\phi_1$ and $\phi_2$, so our expression for the molecular wavefunction looks like:
$$ \Psi_{H_2} \approx a_1\phi_1 + a_2\phi_2 $$
And we want to choose the constants $a_1$ and $a_2$ to give the best approximation. For a homonuclear diatomic molecule like $H_2$ this is easy because the two nuclei are equivalent, so $|\Psi|^2$ must be symmetric and that means $|a_1| = |a_2|$. Taking the coefficients to be real, the only possible expressions for $\Psi$ are (give or take a normalising factor):
$$ \Psi_{+} \approx \phi_1 + \phi_2 $$
$$ \Psi_{-} \approx \phi_1 - \phi_2 $$
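For reference, the normalising factor can be written out explicitly. This is a sketch assuming $\phi_1$ and $\phi_2$ are real and individually normalised, with $S$ introduced here as shorthand for the overlap integral $\langle\phi_1|\phi_2\rangle$:
$$ \langle\Psi_{\pm}|\Psi_{\pm}\rangle = \langle\phi_1|\phi_1\rangle + \langle\phi_2|\phi_2\rangle \pm 2\langle\phi_1|\phi_2\rangle = 2(1 \pm S) $$
$$ \Psi_{\pm} = \frac{\phi_1 \pm \phi_2}{\sqrt{2(1 \pm S)}} $$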
The energy of these two orbitals is given by:
$$ E = \langle\Psi|H|\Psi\rangle $$
where $H$ is the Hamiltonian for the hydrogen molecule and $\Psi$ is taken to be normalised. If you're interested in the details I found quite a nice account here, but assuming you just want an overview, the energy depends on the overlap integral $\langle\phi_1|\phi_2\rangle$. In brief, if the overlap integral is large then the energy is low and if the overlap integral is small then the energy is high.
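To put rough numbers on this, here is a minimal sketch that swaps in the one-electron ion $H_2^+$ for $H_2$, because its two-orbital integrals have simple closed forms in atomic units. This is an illustrative stand-in, not the calculation from the account mentioned above. The function names and the sample distances are assumptions made for this example.

```python
import math

E_1S = -0.5  # energy of an isolated hydrogen 1s orbital, in hartree


def overlap(R):
    """Overlap integral <phi_1|phi_2> of two 1s orbitals a distance R apart (bohr)."""
    return math.exp(-R) * (1.0 + R + R * R / 3.0)


def coulomb(R):
    """<phi_1| -1/r_2 |phi_1>: attraction of the density on nucleus 1 to nucleus 2."""
    return -(1.0 - (1.0 + R) * math.exp(-2.0 * R)) / R


def exchange(R):
    """<phi_1| -1/r_1 |phi_2>: attraction of the overlap density to a nucleus."""
    return -(1.0 + R) * math.exp(-R)


def lcao_energies(R):
    """Return (E_plus, E_minus) for phi_1 + phi_2 and phi_1 - phi_2 in the H2+ model."""
    S, J, K = overlap(R), coulomb(R), exchange(R)
    # 1/R is the nuclear repulsion; the normalisation gives the 1 +/- S denominators.
    e_plus = E_1S + 1.0 / R + (J + K) / (1.0 + S)
    e_minus = E_1S + 1.0 / R + (J - K) / (1.0 - S)
    return e_plus, e_minus


for R in (1.5, 2.5, 4.0, 8.0):
    e_plus, e_minus = lcao_energies(R)
    print(f"R = {R:3.1f}  S = {overlap(R):.3f}  E+ = {e_plus:+.4f}  E- = {e_minus:+.4f}")
```

In this model the in-phase combination drops below the isolated-atom energy of $-0.5$ hartree around $R = 2.5$ bohr, while the out-of-phase combination stays above it.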
And finally we get to the reason the signs of $\phi_1$ and $\phi_2$ matter. If both are positive then they add up and the sum is large in the region where they overlap. This makes the overlap integral large and gives a low energy. By contrast, if they have different signs then the sum (i.e. the difference) is small in the region where they overlap, so the overlap integral is small and the energy is high.
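The effect of the sign can be seen by sampling the two combinations along the bond axis. The sketch below assumes unnormalised $1s$-like functions $e^{-|x - x_0|}$ centred on nuclei at $\pm R/2$. The distance `R = 2.0` and the function names are illustrative choices, not values from the discussion above.

```python
import math


def phi(x, centre):
    """Unnormalised 1s-like function exp(-|x - centre|) sampled on the bond axis."""
    return math.exp(-abs(x - centre))


def bond_axis_density(x, R, sign):
    """|phi_1 + sign * phi_2|^2 at position x, with nuclei at -R/2 and +R/2."""
    psi = phi(x, -R / 2.0) + sign * phi(x, +R / 2.0)
    return psi * psi


R = 2.0  # hypothetical internuclear distance in bohr
for x in (-1.0, -0.5, 0.0, 0.5, 1.0):
    plus = bond_axis_density(x, R, +1)
    minus = bond_axis_density(x, R, -1)
    print(f"x = {x:+.1f}  |psi+|^2 = {plus:.3f}  |psi-|^2 = {minus:.3f}")
```

At the midpoint the out-of-phase combination vanishes, because the two terms cancel there, while the in-phase combination piles up density between the nuclei.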
Best Answer
It is guaranteed that finite wave packets always create places where the interference is constructive as well as places where it is destructive: the energy simply flows from the maxima to the minima.
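A minimal sketch of this bookkeeping, assuming two waves of equal amplitude $A$ whose relative phase $\delta$ sweeps uniformly through a full period across the pattern. The symbol $\delta$, the function names and the sample amplitude are illustrative. The intensity $|A + Ae^{i\delta}|^2 = 2A^2(1 + \cos\delta)$ ranges from $4A^2$ down to $0$, and its average over the pattern is $2A^2$, the sum of the two separate intensities.

```python
import math


def intensity(amplitude, delta):
    """Intensity |A + A e^{i delta}|^2 of two equal-amplitude waves with phase difference delta."""
    return 2.0 * amplitude**2 * (1.0 + math.cos(delta))


def fringe_average(amplitude, samples=1000):
    """Average the intensity over one full period of the phase difference."""
    total = sum(
        intensity(amplitude, 2.0 * math.pi * k / samples) for k in range(samples)
    )
    return total / samples


A = 1.5  # hypothetical amplitude
print("constructive:", intensity(A, 0.0))       # 4 A^2
print("destructive: ", intensity(A, math.pi))   # 0
print("average:     ", fringe_average(A))       # close to 2 A^2
```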
In your convention, it's guaranteed that the total energy at the end is always "in between" the energy from the constructive interference and the energy from the destructive one, which is simply the average $$ (4 A^2 + 0 ) / 2 = 2A^2,$$ exactly equal to the original energy. More generally, energy conservation can be proved even locally - as a continuity equation - directly from Maxwell's equations, so it always holds. This is particularly easy to prove for the vacuum Maxwell's equations - enough for propagation and interference of light,