The power transfer is maximised at resonance because the driving force and the velocity of the oscillator are in phase.
If you multiply two sinusoidal terms together (the force and the velocity) with a phase difference between them, then the product has its maximum average value when the phase difference is zero, and its average is zero when the phase difference is $\pm \pi/2$.
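A quick numerical check of that statement, as a minimal sketch: the function name `mean_product` and the sample count are my own choices, not part of the question. It averages $\sin\theta\,\sin(\theta+\delta)$ over one period and compares the result with $\tfrac{1}{2}\cos\delta$.

```python
import math

def mean_product(delta, n=10_000):
    """Average of sin(theta) * sin(theta + delta) over one full period.

    Uses n equally spaced samples of theta in [0, 2*pi).
    """
    total = 0.0
    for k in range(n):
        theta = 2.0 * math.pi * k / n
        total += math.sin(theta) * math.sin(theta + delta)
    return total / n

# Compare the numerical average with cos(delta) / 2 for a few phase differences.
for delta in (0.0, math.pi / 4, math.pi / 2, -math.pi / 2, math.pi):
    print(f"delta = {delta:+.3f}   average = {mean_product(delta):+.4f}   "
          f"cos(delta)/2 = {math.cos(delta) / 2:+.4f}")
```

The average is largest at $\delta = 0$, vanishes at $\delta = \pm\pi/2$, and becomes negative beyond that, meaning the oscillator would on average be doing work on the driver.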
Your steady state solution could be correct, but it is more usual to say that if the driving force is $F_0 \sin \omega t$, then the displacement $x \propto \sin(\omega t + \phi)$, where the phase difference $\phi$ is given by
$$ \phi = \tan^{-1}\left(\dfrac{-\gamma \omega}{\omega_0^{2} - \omega^{2}}\right),$$
and $\gamma$ is the damping coefficient.
You can see that when $\omega = \omega_0$ the phase difference between displacement and force is $-\pi/2$. But if you differentiate the displacement to get the velocity, you find
$$v \propto \cos(\omega t + \phi) = \sin(\omega t + \phi +\pi/2),$$
so at resonance the phase difference between velocity and force is zero.
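Here is a small sketch that evaluates these phases for a few driving frequencies. The function names and the numbers are hypothetical. I have also assumed the equation of motion $\ddot{x} + \gamma \dot{x} + \omega_0^2 x = (F_0/m)\sin\omega t$ and used `math.atan2` rather than a plain arctangent, so that $\phi$ stays on the branch running from $0$ to $-\pi$ as $\omega$ passes through $\omega_0$.

```python
import math

def displacement_phase(omega, omega0, gamma):
    """Phase of the displacement relative to a driving force F0*sin(omega*t).

    atan2 keeps the result between -pi and 0 (assumes gamma > 0, omega > 0).
    """
    return math.atan2(-gamma * omega, omega0**2 - omega**2)

def velocity_phase(omega, omega0, gamma):
    """Phase of the velocity relative to the force: displacement phase + pi/2."""
    return displacement_phase(omega, omega0, gamma) + math.pi / 2

omega0, gamma = 2.0, 0.3  # illustrative values only
for omega in (0.5, 1.5, 2.0, 2.5, 5.0):
    print(f"omega = {omega:.1f}   "
          f"phi_x = {displacement_phase(omega, omega0, gamma):+.3f}   "
          f"phi_v = {velocity_phase(omega, omega0, gamma):+.3f}")
```

Below resonance the velocity leads the force, above resonance it lags, and only at $\omega = \omega_0$ are the two exactly in phase.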
Maximal power transfer is also why the amplitude is large at resonance, since the velocity amplitude grows with the amplitude of the displacement (strictly, with damping present the displacement amplitude peaks slightly below $\omega_0$).
Keep in mind that the bold text isn't a derivation; it's a way to understand things qualitatively, so it's going to be very, very unconvincing. Instead of imagining that you are trying to compute a number, imagine you are trying to estimate it to within a factor of 10 or 100, or even a factor of 1000 or more. That's how unconvincing it will be.
So. At equilibrium amplitudes, power supplied equals power lost to friction. How much energy is lost to friction? Well, when no power is supplied, friction dissipates almost all of the available energy $E$ in a time $\tau$. So friction saps away something like $E/\tau$ as a power lost to friction. So the power supplied is something like $E/\tau$ too.
The conclusion is that we expect the power supplied and the power lost to be somewhat equal to the energy stored divided by $\tau.$
But this is unconvincing because the power lost to friction changes and isn't constant. If we waited longer, then closer to 100% of the energy would be dissipated, but the average power would be much smaller because we included more time of lower power loss.
The whole thing is an incredibly wild estimate, not much different from dimensional analysis.
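For what it's worth, the estimate can be compared against the textbook linear oscillator. The sketch below makes assumptions that are mine, not the answer's: the equation of motion is $m\ddot{x} + m\gamma\dot{x} + m\omega_0^2 x = F_0 \sin\omega t$, $\tau$ is taken to be $1/\gamma$ (the decay time of the energy for light damping), and the function names are hypothetical. It computes the ratio of the average dissipated power to $E/\tau$ in the steady state.

```python
import math

def steady_state(omega, omega0, gamma, f0=1.0, m=1.0):
    """Average dissipated power and average stored energy in the steady state
    of m*x'' + m*gamma*x' + m*omega0**2*x = f0*sin(omega*t)."""
    amp = (f0 / m) / math.sqrt((omega0**2 - omega**2) ** 2 + (gamma * omega) ** 2)
    v_amp = omega * amp                      # velocity amplitude
    power = 0.5 * m * gamma * v_amp**2       # mean of (m*gamma*v) * v
    energy = 0.25 * m * v_amp**2 + 0.25 * m * omega0**2 * amp**2  # mean KE + mean PE
    return power, energy

def power_over_e_tau(omega, omega0, gamma):
    """Ratio of dissipated power to E/tau, with tau assumed to be 1/gamma."""
    power, energy = steady_state(omega, omega0, gamma)
    tau = 1.0 / gamma
    return power / (energy / tau)

omega0, gamma = 1.0, 0.05  # illustrative values only
for omega in (0.5, 0.9, 1.0, 1.1, 2.0):
    print(f"omega = {omega:.1f}   P / (E/tau) = {power_over_e_tau(omega, omega0, gamma):.3f}")
```

Under these assumptions the ratio works out to $2\omega^2/(\omega^2 + \omega_0^2)$, so it is of order one at every driving frequency and equal to one at $\omega = \omega_0$. What changes dramatically with frequency is the size of $E$ itself.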
Most of the energy $E$ is lost by $\tau$ if the driving force is absent. Okay. But how is it related to the stored energy? Even if the driving force is present, it will have to supply $E/\tau$ so as to compensate for the loss. How can it be stored, then? It is just nullifying the dissipation; it is not stored?
The energy stored is defined to be $E$, because the letter $E$ is the symbol I made up for the total stored energy. And $\tau$ is an almost arbitrary time related to how long it takes for the friction force to dissipate almost all of the energy $E$. Why is it that almost all of the energy is what is dissipated? Because we choose the zero of energy to be located at the state that friction drives the system towards. It's not the absolute zero. There is more energy there: there is, for instance, rest energy associated with every particle. There is chemical binding energy associated with each electron being in some atom or molecule. There is some energy associated with the thermal motion of the system. You could extract that energy with antimatter, chemical reactions with more reactive elements, and thermal contact with colder objects. But there isn't any more macroscopic kinetic or potential energy available besides $E$; that's all that is available. And that's the state the friction drives the system to. It never gets perfectly there, but it gets close in time $\tau$, and so in time $\tau$ almost all of $E$ is dissipated. That follows from the definition of the two symbols.
Edit: At resonance the amplitude can grow. As it grows, the friction increases. The amplitude keeps growing until it is so large that the power dissipated by friction is exactly equal to the driving power. And that friction power is roughly equal to $E/\tau$, where $E$ is the total available energy stored.
At another frequency the amplitude doesn't grow nearly as much, so it settles at a much smaller steady state. In a complicated, real system there might be many modes that can be resonant, and the modes might shift over time.
Best Answer
Well, I finally worked it out.
I used Green's functions and it was pretty straightforward.
For a harmonic oscillator, you have to solve:
$$ \left(\frac{d^2}{dt^2} + 2b\frac{d}{dt} + \omega^2\right)G(t-t')= \delta(t-t') $$
The solution for $t>t'$ is:
$$ G(t-t')= \exp(-b(t-t'))\,\frac{\sin(\omega'(t-t'))}{\omega'} $$
where $\omega' = \sqrt{\omega^2-b^2}$.
The response to a driving force $f(t)$ is then:
$$ y(t)= \int_{-\infty}^{t} f(t')\exp(-b(t-t'))\,\frac{\sin(\omega'(t-t'))}{\omega'}\,dt' $$
Using $f(t) = \sum_n \delta(t-nT)$, the integral becomes very easy: you can interchange the sum and the integral, and each delta function simply picks out $t' = nT$.
Finally:
$$ y(t)= \sum_{nT<t} \exp(-b(t-nT))\,\frac{\sin(\omega'(t-nT))}{\omega'} $$
So what we get is as many sine functions as the comb has Dirac deltas before time $t$, each vibrating at the natural frequency (just like a guitar), regardless of whether you are plucking it at some particular frequency.
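The final sum is easy to evaluate numerically. The following is only a sketch under my own assumptions: the oscillator is underdamped ($b < \omega$), the comb has a finite number of kicks at $t' = 0, T, 2T, \dots$, and the function names and parameter values are made up for illustration.

```python
import math

def green(dt, b, omega):
    """Causal Green's function of y'' + 2*b*y' + omega**2*y = delta(t - t').

    dt is t - t'. Returns 0 for dt <= 0. Assumes b < omega (underdamped).
    """
    if dt <= 0.0:
        return 0.0
    omega_p = math.sqrt(omega**2 - b**2)
    return math.exp(-b * dt) * math.sin(omega_p * dt) / omega_p

def comb_response(t, b, omega, period, n_kicks):
    """Response at time t to unit kicks at t' = 0, T, ..., (n_kicks - 1)*T."""
    return sum(green(t - n * period, b, omega) for n in range(n_kicks))

b, omega = 0.1, 2.0                      # illustrative values only
omega_p = math.sqrt(omega**2 - b**2)
t_res = 2.0 * math.pi / omega_p          # kick spacing matched to omega'

# Sample the response a quarter of an oscillation after the last of 10 kicks.
for period in (t_res, 1.5 * t_res):
    t = 9 * period + 0.25 * t_res
    print(f"T = {period:.3f}   y(t) = {comb_response(t, b, omega, period, 10):+.4f}")
```

Every term oscillates at $\omega'$ whatever $T$ is. What the spacing controls is how the terms add: when $T$ is a whole number of periods $2\pi/\omega'$ all the sines have the same phase and reinforce each other, while for other spacings they partly cancel.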