There are a few technical issues with the mathematics in that article, but I think the overall logic/presentation is great.
To begin, the relation:
$$\mathbf{P}(\omega)=\frac{\mathbf{P}_0}{\omega_0+i\omega}\tag{1}$$
was derived using the model of a step-function electric field and an exponentially-decaying response $\mathbf{P}(t)$ with characteristic time $\tau\equiv 1/\omega_0$. If you notice though, this was derived using an odd definition for the Fourier transform:
$$\mathbf{P}(\omega) \overset{?}{=}\int_{\color{red}{0}}^{\infty}\mathbf{P}(t)e^{-i\omega t}\,dt\tag{2}$$
The lower limit should be $-\infty$, but that would lead to a $\delta$ function for both $\mathbf{P}(\omega)$ and $\mathbf{E}(\omega)$. What the author has effectively assumed, then, is a model in which $\mathbf{E}(t)=\mathbf{E}_0\delta(t)$ (never mind the units) and:
$$\mathbf{P}(t)=\begin{cases}0, & t<0\\ \mathbf{P}_0 e^{-\omega_0 t}, & t>0\end{cases}\tag{3}$$
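As a sanity check, the one-sided transform of eq. (2) applied to the decaying response of eq. (3) really does reproduce eq. (1). A minimal numerical sketch (all parameter values are arbitrary):

```python
import numpy as np

# Illustrative parameters for the decay model in eq. (3)
P0, w0 = 1.0, 2.0                  # amplitude and decay rate 1/tau
w = 3.0                            # probe frequency
t, dt = np.linspace(0.0, 50.0, 200_001, retstep=True)

# One-sided transform of eq. (2): integral_0^infinity of P(t) e^{-i w t} dt,
# truncated at t = 50/w0 where the integrand is negligible
f = P0 * np.exp(-w0 * t) * np.exp(-1j * w * t)
numeric = np.sum((f[1:] + f[:-1]) / 2) * dt   # trapezoid rule

analytic = P0 / (w0 + 1j * w)      # eq. (1)
print(abs(numeric - analytic))     # small
```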
This is a little odd, so let me clear the air a little.
The relation $\mathbf{P}(\omega)=\chi(\omega)\mathbf{E}(\omega)$ is an abstract/general, but very physical, relation between the polarization of an object and the applied electric field (assuming sufficient uniformity of the field, or sufficient smallness of the object, or both). You can easily get it by assuming that in the time-domain, $\mathbf{P}(t)$ is related to the applied field $\mathbf{E}(t)$ through a linear integral operator:
$$\mathbf{P}(t)=\int_{-\infty}^{t}\epsilon_0\chi_e(t-t')\mathbf{E}(t')dt'\tag{4}$$
where I have limited the upper integration bound to $t$ under the assumption of causality: $\mathbf{P}$ can't possibly respond to future values of $\mathbf{E}$. Notice that in this formulation, if $\mathbf{P}$ responds only to the instantaneous value of the applied field $\mathbf{E}$ at the same instant $\mathbf{P}$ is measured (i.e., infinitely fast information travel and no memory of past values; a Markovian response), then the kernel $\chi_e(t-t')$ becomes a $\delta$ function:
$$\mathbf{P}(t)=\int_{-\infty}^{t}\epsilon_0\chi_e\delta (t-t') \mathbf{E}(t')dt'=\epsilon_0\chi_e \mathbf{E}(t)\tag{5}$$
but in general this will not be the case: the response (which we are assuming to be linear!) will generally depend on the past values of the applied field. Notice that if we now take the Fourier transform of eq. (4), the convolution theorem turns the integral into a product:
$$\mathbf{P}(\omega)=\epsilon_0\chi_e(\omega)\mathbf{E}(\omega)\tag{6}$$
where every function with the argument $\omega$ is understood to be the Fourier transform of its respective time-domain counterpart, i.e. $\chi_e(\omega)\equiv \mathcal{F}\{\chi_e(t)\}$.
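The step from eq. (4) to eq. (6) is just the convolution theorem, and it is easy to verify numerically. A minimal sketch with $\epsilon_0=1$ and an illustrative kernel and field, both chosen to decay so the integrals converge:

```python
import numpy as np

dt = 0.01
t = np.arange(0.0, 10.0, dt)                  # time grid, N = 1000 points
chi = np.exp(-2.0 * t)                        # a causal response kernel chi_e(t)
E = np.sin(5.0 * t) * np.exp(-0.5 * t)        # an applied field that dies out

# Time-domain convolution, as in eq. (4) (with eps_0 = 1):
P = np.convolve(chi, E) * dt                  # full linear convolution

# Frequency domain: approximate the continuous transforms by scaled FFTs,
# zero-padded so the circular FFT reproduces the linear convolution exactly
n = 2 * len(t)
Chi_w = np.fft.fft(chi, n) * dt
E_w = np.fft.fft(E, n) * dt
P_w = np.fft.fft(P, n) * dt

# eq. (6): P(omega) = chi_e(omega) E(omega), here to machine precision
print(np.max(np.abs(P_w - Chi_w * E_w)))
```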
This frequency-domain relationship is something you literally measure by applying an electric field $\mathbf{E}=\mathbf{E}_0\sin\omega t$ and then measuring the resulting $\mathbf{P}(t)$, which we physically expect to also be a sinusoid (with a possible phase shift). The relation between the amplitudes, and the phase shift between $\mathbf{E}(t)$ and $\mathbf{P}(t)$, is meant to be captured by that general relationship.
But notice that we cannot just say that $\mathbf{E}(\omega)$ is the Fourier transform of this $\mathbf{E}(t)$: the Fourier transform of a pure sinusoid is a $\delta$-function (well, two of them, at $\pm\omega$). This is a problem, of course, because we cannot "divide" $\delta$-functions: they are not really functions. They are distributions, which are not defined point-wise.
This problem is an artifact of the assumed continuous nature, and infinite extent, of $\omega$. If we assume $\omega$ can take values from a continuum, of course we're going to get a distribution in the end. This problem wouldn't exist if we had used a Fourier series. You can get a hint of this by noticing that the units of a Fourier transform are not even the same as those of Fourier-series coefficients.
In reality, we do not sample the material for an infinite amount of time, and thus we do not have a $\delta$-function Fourier transform. We have a finite sample time $T\gg 1/\omega$. If we assume that over that sample time our applied field was a single frequency $\mathbf{E}(t)=\mathbf{E}_0e^{i\omega_0t}$, you get a $\text{sinc}$ function for the Fourier transform, which of course converges to a $\delta$-function in the limit $T\rightarrow\infty$. My point is, if you do all your work before/without taking the limit $T\rightarrow\infty$, you will have no problems. Similarly, if you work in the real world (with finite and discrete sampling), you will work with Fourier coefficients (discrete Fourier transform), and you will never have any problems of this kind (arising from $\delta$-functions).
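To see the sinc explicitly, one can evaluate the windowed transform $\int_0^T e^{i\omega_0 t}e^{-i\omega t}\,dt$ numerically and compare it to the closed form $T\,\mathrm{sinc}\!\left[(\omega_0-\omega)T/2\pi\right]$ (using NumPy's normalized sinc convention; all numbers are arbitrary):

```python
import numpy as np

w0, T = 4.0, 40.0                          # tone frequency and sample time
t, dt = np.linspace(0.0, T, 40_001, retstep=True)
w = np.linspace(0.0, 8.0, 81)              # probe frequencies

# Windowed transform int_0^T e^{i w0 t} e^{-i w t} dt at each probe frequency,
# via the trapezoid rule along the time axis
integrand = np.exp(1j * np.outer(w0 - w, t))
F = np.sum((integrand[:, 1:] + integrand[:, :-1]) / 2, axis=1) * dt

# Closed form: a sinc of width ~ 1/T centered at w0 (np.sinc(x) = sin(pi x)/(pi x))
analytic = T * np.sinc((w0 - w) * T / (2 * np.pi))
print(np.max(np.abs(np.abs(F) - np.abs(analytic))))   # small
```

As $T$ grows, the peak sharpens and the sinc approaches a $\delta$-function, as described above.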
So when we apply an electric field of the form $\mathbf{E}(t)=\mathbf{E}_0e^{i\omega_0 t}$, it must be understood that $\mathbf{E}(\omega)=\delta_{\omega,\omega_0}\mathbf{E}_0$, where $\delta_{\omega,\omega_0}$ is a Kronecker delta (a discrete, series-type coefficient), not the Fourier transform.
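The Kronecker-delta statement is exactly what a DFT gives you when the tone sits on a frequency bin: one coefficient is nonzero and all the others vanish. A quick illustration (the length and bin index are arbitrary):

```python
import numpy as np

# DFT of a single on-bin complex exponential: E(omega) is a Kronecker delta,
# not a Dirac delta. N samples, tone sitting exactly on bin k0.
N, k0 = 64, 5
n = np.arange(N)
E = np.exp(2j * np.pi * k0 * n / N)    # E0 = 1
X = np.fft.fft(E) / N                  # normalized DFT coefficients
print(np.abs(X[k0]))                   # 1, to machine precision
print(np.max(np.abs(np.delete(X, k0))))  # ~ 0: every other bin vanishes
```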
Best Answer
It so happens that the autocorrelation function and the power spectral density form a Fourier transform pair (this is the Wiener-Khinchin theorem). This is not to say that the only way to calculate the power spectral density is from the autocorrelation function.
First, let me use the generic symbol $X_{k}$ to represent the Fourier transform of the time domain signal $x_{n}$.
As I stated at https://physics.stackexchange.com/a/309544/59023, the power spectral density, $S_{k}$, is proportional to the square of the magnitude of the Fourier transform of the signal, i.e., $S_{k} \propto \lvert X_{k} \rvert^{2}$.
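The $S_{k} \propto \lvert X_{k} \rvert^{2}$ route and the autocorrelation route give the same answer, which is easy to check discretely. A toy sketch with a random signal (the circular autocorrelation is used so the discrete identity is exact):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(256)              # a toy real signal x_n

# Power spectral density from the DFT directly: S_k proportional to |X_k|^2
X = np.fft.fft(x)
S_direct = np.abs(X) ** 2

# Same quantity via the circular autocorrelation (Wiener-Khinchin):
# r[m] = sum_n x[n] x[(n+m) mod N], whose DFT is exactly |X_k|^2
r = np.array([np.sum(x * np.roll(x, -m)) for m in range(len(x))])
S_from_acf = np.fft.fft(r).real

print(np.max(np.abs(S_direct - S_from_acf)))   # ~ machine precision
```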
The words power spectrum are somewhat ambiguous here. In principle, one can compute a power spectrum (i.e., respective value vs. frequency) from each component of $\mathbf{E}$ or its magnitude. One can also compute the amplitude spectra, $A_{k} \propto \lvert X_{k} \rvert$, of the signal.
In the following, I will assume you are asking about $S_{k}$ and not $A_{k}$ for each of these.
The power spectrum of $\mathbf{E}$, whether of a component ($E_{j}$) or of the vector magnitude ($\lvert \mathbf{E} \rvert$), describes the power of the field as a function of frequency, with units (if properly normalized) of (V m$^{-1}$)$^{2}$ Hz$^{-1}$. This is useful when trying to determine whether there exists, e.g., a wave at a given frequency, which would show up as a peak above the background in $S_{k}$. If the oscillations exist only along the x-component of $\mathbf{E}$ (i.e., a longitudinal, electrostatic oscillation), then the spectra of both $\lvert \mathbf{E} \rvert$ and $E_{x}$ would show a peak at that frequency, but those of $E_{y}$ and $E_{z}$ would not.
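As a toy illustration of that last point, one can synthesize a longitudinal oscillation along $x$ plus noise and compare the component spectra (every number here is made up):

```python
import numpy as np

fs, N = 1000.0, 4096                   # sample rate [vectors/s] and record length
t = np.arange(N) / fs
f_wave = 125.0                         # wave frequency, chosen to land on a DFT bin
rng = np.random.default_rng(1)
Ex = np.sin(2 * np.pi * f_wave * t) + 0.1 * rng.standard_normal(N)
Ey = 0.1 * rng.standard_normal(N)      # noise only: no transverse oscillation

freqs = np.fft.rfftfreq(N, 1.0 / fs)

def psd(s):
    return np.abs(np.fft.rfft(s)) ** 2     # unnormalized power spectrum

k = np.argmin(np.abs(freqs - f_wave))      # bin containing the wave
print(psd(Ex)[k] / np.median(psd(Ex)))     # huge: the line stands out in E_x
print(psd(Ey)[k] / np.median(psd(Ey)))     # order 1: no line in E_y
```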
The intensity, as you have written it, is just the field energy density multiplied by a constant. Thus, the power spectrum of $I$ would be qualitatively similar to that of $\lvert \mathbf{E} \rvert$.
I am not sure about what you were told and whether you are correctly conveying that information. A discrete Fourier transform or DFT (i.e., what you use in practice on real signals, through algorithms like the FFT) is not the same as a continuous Fourier transform (CFT). In a DFT, the frequency bin width is: $$ \Delta f = \frac{ f_{s} }{ N } \tag{1} $$ where $f_{s}$ is the sample rate of the signal [e.g., vectors per second] and $N$ is the number of individual points used in the DFT, so that $\Delta f = 1/T$ for a record of total duration $T = N/f_{s}$.
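NumPy's `np.fft.fftfreq`, for instance, spaces its bins by exactly $f_{s}/N$ (illustrative numbers):

```python
import numpy as np

fs, N = 1000.0, 256                    # illustrative sample rate and record length
freqs = np.fft.fftfreq(N, d=1.0 / fs)  # DFT bin frequencies
df = freqs[1] - freqs[0]
print(df, fs / N)                      # 3.90625 3.90625
```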
In a CFT, the minimum $\Delta f$ is mathematically zero (i.e., infinitesimally small), but quantum mechanics shows us that energy/momentum are quantized and thus take discrete values. Therefore, there are physical limits on the lower bound of $\Delta f$. In this case, a variant of the uncertainty principle is applicable, called the time-energy uncertainty principle, which is roughly given as: $$ \Delta E \ \Delta t \geq \frac{\hbar}{2} \tag{2} $$ where $\hbar$ is the reduced Planck constant and $\Delta Q$ denotes the minimum resolution of quantity $Q$.
Thus, the transition has a well-defined energy change, but we cannot know it better than the limit given by Equation 2. For photons, we can directly convert energy to frequency, i.e., $E = h \ \nu$, and thus we have the same limitation on the frequency resolution of the emitted photons.
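Plugging in numbers: for an excited-state lifetime of order $10^{-8}$ s (a made-up but typical atomic value), Equation 2 together with $E = h\nu$ gives a minimum linewidth of a few MHz:

```python
# Minimum photon linewidth implied by Eq. 2 for a state of lifetime dt
# (the lifetime value here is illustrative, not from the text)
hbar = 1.054571817e-34   # reduced Planck constant [J s]
h = 6.62607015e-34       # Planck constant [J s]
dt = 1e-8                # excited-state lifetime [s]
dE = hbar / (2.0 * dt)   # minimum energy uncertainty [J], from Eq. 2
dnu = dE / h             # corresponding linewidth via E = h nu [Hz]
print(dnu)               # ~ 8e6 Hz
```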