The energy of a wave is not always proportional to the square of its amplitude, but there are good reasons to expect this to hold in most cases in the limit of small amplitudes. This follows simply from expanding the energy in a Taylor series, $E=a_0+a_1 A+a_2 A^2+\ldots$ We can take the $a_0$ term to be zero, since it would just represent some potential energy already present in the medium when there was no wave excitation. The $a_1$ term has to vanish, because otherwise it would dominate the sum for sufficiently small values of $A$, and you could then have waves with negative energy for an appropriately chosen sign of $A$. That means that the first nonvanishing term should be $A^2$. Since we don't expect the energy of the wave to depend on phase, and flipping the sign of $A$ is just a phase shift of half a cycle, we expect that only the even terms should occur, $E=a_2A^2+a_4A^4+\ldots$ So it's only in the limit of small amplitudes that we expect $E\propto A^2$.
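As a rough numerical check of this argument, here is a minimal sketch using a system that is not in the text above: a simple pendulum (an oscillator rather than a wave) whose amplitude $A$ is taken to be the maximum swing angle. The function name and the values of `m`, `g` and `length` are assumptions chosen for illustration. The energy measured from rest is $mgl(1-\cos A)$, which contains only even powers of $A$, so $E/A^2$ should approach $a_2 = mgl/2$ for small $A$ and drift away from it at large $A$.

```python
import math

def pendulum_energy(theta0, m=1.0, g=9.81, length=1.0):
    """Energy of a pendulum released from rest at angle theta0 (radians),
    measured from the equilibrium position: E = m g l (1 - cos(theta0))."""
    return m * g * length * (1.0 - math.cos(theta0))

# Amplitudes from "small" to "large" (radians); illustrative values only.
amplitudes = [0.01, 0.1, 0.5, 1.0, 2.0]

# If E were exactly proportional to A^2, this ratio would be constant.
ratios = [pendulum_energy(a) / a**2 for a in amplitudes]

for a, r in zip(amplitudes, ratios):
    print(f"A = {a:4.2f} rad   E/A^2 = {r:.4f}")
```

With these assumed parameters the small-amplitude ratio sits close to $mgl/2$, and the departure at larger $A$ is the effect of the $a_4A^4$ and higher terms.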
The other issue to consider is that we had to assume that $E$ was a sufficiently smooth function of $A$ to allow it to be calculated using a Taylor series. This doesn't have to be true in general. As an easy example involving an oscillating particle, rather than a wave, consider a pointlike particle in a gravitational field, bouncing up and down elastically on an inflexible floor. If we define the amplitude as the height of the bounce, then we have $E \propto |A|$. But a realistic ball deforms, so the small-amplitude limit consists of the ball vibrating while remaining in contact with the floor, and we regain $E\propto A^2$.
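To spell out the bouncing-particle case (the mass $m$ and gravitational acceleration $g$ are symbols introduced here): at the top of the bounce the particle is momentarily at rest, so all of its energy is gravitational,
$E = mg|A|$.
This is linear in the height and has a kink at $A=0$, so it has no Taylor expansion about zero amplitude, which is exactly the smoothness assumption that fails.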
You could also make up examples where $a_2$ vanishes and the first nonvanishing coefficient is $a_4$.
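One such made-up example, given as an illustration under stated assumptions: a particle in a purely quartic potential $U(x)=cx^4$ with $c>0$, where the amplitude $A$ is the maximum displacement. At the turning point the kinetic energy vanishes, so
$E = U(A) = cA^4$.
Here $a_2=0$ and the first nonvanishing coefficient is $a_4=c$, so $E\propto A^4$ even at arbitrarily small amplitudes.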
A monochromatic wave has a fixed amplitude:
$\psi(x,t) = A \, e^{i(k\cdot x-\omega t)}$
However, a wave packet (WP) is a combination of monochromatic waves:
$\psi_{WP}(x,t) = \int d\tilde k \, A(k) \, e^{i(k\cdot x-\omega t)}$,
where $A(k)$ is a complex quantity.
So $\psi_{WP}(x,t)$ is a complex quantity, which can of course always be written as
$\psi_{WP}(x,t) = A_{WP}(x,t) e^{i\phi_{WP}(x,t)}$, where $\phi_{WP}(x,t)$ is a real phase and $A_{WP}(x,t)$ is a real, non-negative amplitude that varies with position and time.
Now, the Schrödinger equation treats the whole wave packet $\psi_{WP}(x,t)$ as a probability amplitude (PA) $\psi_{PA}(x,t)$ (so it no longer describes a real, physical wave).
With the decomposition $\psi_{PA}(x,t) = A_{PA}(x,t) e^{i\phi_{PA}(x,t)}$, the probability density for finding the particle at position $x$ at time $t$ is then just $|\psi_{PA}(x,t)|^2 = (A_{PA}(x,t))^2$.
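The decomposition into a real amplitude and a real phase can be checked numerically. The sketch below is only an illustration under assumptions that are not in the text: one spatial dimension, real Gaussian weights $A(k)$ (a special case of the complex $A(k)$ above), a free-particle dispersion $\omega = k^2/2$ in units with $\hbar = m = 1$, and a plain Riemann sum in place of the integral.

```python
import numpy as np

# Assumed illustration values (not from the text): Gaussian weights A(k)
# centred on k0 with width sigma_k, and dispersion omega = k^2 / 2.
k0, sigma_k = 5.0, 0.5
k = np.linspace(k0 - 5 * sigma_k, k0 + 5 * sigma_k, 401)
dk = k[1] - k[0]
A_k = np.exp(-((k - k0) ** 2) / (2 * sigma_k**2))
omega = 0.5 * k**2

x = np.linspace(-10.0, 10.0, 801)
dx = x[1] - x[0]
t = 0.0

# psi_WP(x, t) ~ sum over k of A(k) exp(i (k x - omega t)) dk
phases = np.exp(1j * (np.outer(x, k) - omega * t))
psi = (phases * A_k).sum(axis=1) * dk

# Normalise so that the total probability on the grid is 1.
psi /= np.sqrt(np.sum(np.abs(psi) ** 2) * dx)

amplitude = np.abs(psi)    # A_WP(x, t): real, non-negative, varies with x
phase = np.angle(psi)      # phi_WP(x, t): real
density = amplitude**2     # probability density |psi|^2

print("total probability:", density.sum() * dx)
print("most likely position:", x[np.argmax(density)])
```

Unlike the monochromatic wave, whose modulus is the same everywhere, the packet's amplitude is peaked in space, and its square is what gets interpreted as the probability density.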
Best Answer
The energy of an oscillator, as Planck defined it, is quantized. It can be related to a harmonic oscillator, but it is not simple to say which part is the potential energy and which is the kinetic energy. Here we do not have a mass moving in some potential field, with potential and kinetic energy of its own. We have the field itself, and it carries an energy of its own. So the energy of the e.m. field is defined differently, not through potential and kinetic energy:
$ (\text {I}) \ H = \frac {1}{2} \int (\epsilon _0 |\vec E|^2 + \frac {1}{\mu _0} |\vec B|^2) \, \text d \vec r$,
where $H$ is the energy (the Hamiltonian). The vectors $\vec E$ and $\vec B$ also have an amplitude that oscillates in time.
Now, in the quantum theory, the fields $\vec E$ and $\vec B$ are replaced by the operators $\hat {\vec E}$ and $\hat {\vec B}$ (see D. F. Walls and G. J. Milburn, "Quantum Optics"). So, in the Hamiltonian $(\text {I})$, the squared amplitudes $|\vec E|^2$ and $|\vec B|^2$ become operators. Next, to get the energy of the photon(s) in some state $|\psi\rangle$ of the e.m. field, we calculate the expectation value of the resulting Hamiltonian operator in that state, $ \langle \psi | \hat H|\psi \rangle$, and you will see that the energy we obtain is given by Planck's formula.
I will show you a few steps of such a calculation.
a) In classical electromagnetism, $\vec E$ and $\vec B$ are obtained from the vector potential $\vec A$ (with no sources, in a gauge where the scalar potential vanishes):
$ (\text {II}) \ \vec E = - \frac {\partial \vec A}{\partial t}, \ \ \ \vec B = \nabla \times \vec A .$
We do the same here and define a vector potential operator
$ (\text {III}) \ \hat {\vec A}(\vec r,t) = \sum _k ( \frac {\hbar }{2 \omega _k \epsilon _0})^{1/2}[\hat a_k \vec u_k(\vec r) e^{-i\omega _k t} + \hat a^{\dagger}_k \vec u^*_k(\vec r) e^{i\omega _k t}]$,
where the index $k$ labels the $k$-th frequency (more exactly, angular frequency) in the field, i.e. we number $\omega_1, \omega_2, \ldots, \omega_k, \ldots$; $C_k = (\hbar / 2 \omega _k \epsilon _0)^{1/2}$ is a constant depending on $\omega_k$; $\vec u_k(\vec r)$ are the mode functions; and $\hat a_k, \hat a^{\dagger}_k$ are the annihilation and creation operators.
b) From $\hat {\vec A}(\vec r,t)$ we calculate $\hat {\vec E}$ and $\hat {\vec B}$ by applying the formulas in $ (\text {II})$. Then we take their absolute squares, i.e. $\hat {\vec E}^{\dagger} \cdot \hat {\vec E}$ and $\hat {\vec B}^{\dagger} \cdot \hat {\vec B}$.
c) We insert these operators into the Hamiltonian, i.e.
$ (\text {IV}) \ \hat H = \frac {1}{2} \int (\epsilon _0 \hat {\vec E}^{\dagger} \cdot \hat {\vec E} + \frac {1}{\mu _0} \hat {\vec B}^{\dagger} \cdot \hat {\vec B} ) \, \text d \vec r$.
Performing the integral and applying the commutation rules of the annihilation and creation operators, we get
$ (\text {V}) \ \hat H = \sum _k \hbar \omega _k (\hat a^{\dagger}_k \hat a_k + \frac{1}{2})$.
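The last step can be sketched as follows. The intermediate form is the standard textbook one and is not written out above, so treat it as an assumption about how the omitted algebra goes. With orthonormal mode functions, the spatial integral leaves a symmetric combination of the operators for each mode,
$ \hat H = \frac{1}{2} \sum _k \hbar \omega _k (\hat a_k \hat a^{\dagger}_k + \hat a^{\dagger}_k \hat a_k)$.
The bosonic commutation rule $[\hat a_k, \hat a^{\dagger}_{k'}] = \delta _{kk'}$ gives $\hat a_k \hat a^{\dagger}_k = \hat a^{\dagger}_k \hat a_k + 1$, and substituting this into each term of the sum turns it into $\hbar \omega _k (\hat a^{\dagger}_k \hat a_k + \frac{1}{2})$, which is the form in $(\text {V})$.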
d) Now, as I said, we calculate the expectation value in the given state of the e.m. field. Let's take a simple state, $|n\rangle$, which means that we have $n$ photons in a single mode, all of the same frequency $\nu = \omega/2\pi$, so only one term of the sum in $(\text {V})$ contributes. We get the energy
$ (\text {VI}) \ \mathscr E = \langle \hat {H} \rangle = (n + \frac {1}{2})\hbar \omega$
in agreement with Planck's formula up to the term $\hbar \omega /2$ that I won't discuss here.
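For a single mode, the result $(\text {VI})$ can be checked numerically with matrices. This is a minimal sketch under assumptions: the number basis is truncated to `dim` states, and `hbar` and `omega` are set to arbitrary illustrative values. The helper name `energy` is introduced here for the example.

```python
import numpy as np

hbar, omega = 1.0, 2.0   # assumed units and value, for illustration only
dim = 12                 # truncated number basis |0>, ..., |dim-1>

# Annihilation operator in the number basis: a|n> = sqrt(n) |n-1>,
# so the entries sqrt(1), ..., sqrt(dim-1) sit on the first superdiagonal.
a = np.diag(np.sqrt(np.arange(1, dim)), k=1)
a_dag = a.conj().T

# Single-mode Hamiltonian, one term of the sum in (V).
H = hbar * omega * (a_dag @ a + 0.5 * np.eye(dim))

def energy(n):
    """Expectation value <n|H|n> for the number state |n>."""
    ket = np.zeros(dim)
    ket[n] = 1.0
    return ket @ H @ ket

for n in range(4):
    print(n, energy(n), (n + 0.5) * hbar * omega)

# [a, a_dag] = 1 holds except in the last row and column,
# which is an artifact of truncating the basis.
commutator = a @ a_dag - a_dag @ a
```

Each printed pair compares $\langle n|\hat H|n\rangle$ with $(n+\frac{1}{2})\hbar\omega$. The truncation does not affect these diagonal values because $\hat a^{\dagger}\hat a$ is exactly diagonal in the number basis.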