If I take the Fourier transform of the autocorrelation of a signal in time, I will get the power spectral density.
It so happens that the autocorrelation function and the power spectral density form a Fourier transform pair (the Wiener–Khinchin theorem). This is not to say that the only way to calculate the power spectral density is from the autocorrelation function.
As I stated at https://physics.stackexchange.com/a/309544/59023, the power spectral density, $S_{k}$, is proportional to the square of the magnitude of the Fourier transform of a signal, i.e., $S_{k} \propto \lvert X_{k} \rvert^{2}$.
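This relationship is easy to verify numerically. The following is a minimal NumPy sketch (the random test signal and the unnormalized periodogram are illustrative choices): the FFT of the circular autocorrelation reproduces $\lvert X_{k} \rvert^{2}$.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 256
x = rng.standard_normal(N)          # arbitrary test signal

# Power spectral density (unnormalized periodogram): S_k ∝ |X_k|^2
X = np.fft.fft(x)
S = np.abs(X) ** 2

# Circular autocorrelation of x, via the inverse FFT of |X_k|^2
r = np.fft.ifft(S).real

# Wiener–Khinchin: the FFT of the autocorrelation recovers S_k
S_from_r = np.fft.fft(r).real
assert np.allclose(S, S_from_r)
```

Either route (direct $\lvert X_k \rvert^2$ or autocorrelation first) gives the same spectrum, up to the chosen normalization.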
Although what is the physical meaning of the power spectra of $E(t)$ and $I(t)$, respectively, and of their differences?
First, let me use the generic symbol $X_{k}$ to represent the Fourier transform of the time domain signal $x_{n}$.
The term power spectrum is somewhat ambiguous here. In principle, one can compute a power spectrum (i.e., the respective value vs. frequency) from each component of $\mathbf{E}$ or from its magnitude. One can also compute the amplitude spectra, $A_{k} \propto \lvert X_{k} \rvert$, of the signal.
In the following, I will assume you are asking about $S_{k}$ and not $A_{k}$ for each of these.
The power spectrum of $\mathbf{E}$, whether of the components ($E_{j}$) or of the vector magnitude ($\lvert \mathbf{E} \rvert$), describes the power of the field as a function of frequency, with units (if properly normalized) of $(\mathrm{V \, m^{-1}})^{2} \, \mathrm{Hz}^{-1}$. This is useful when trying to determine whether there exists, e.g., a wave at a given frequency, which would show up as a peak above the background in $S_{k}$. If the oscillations exist only along the x-component of $\mathbf{E}$ (i.e., a longitudinal, electrostatic oscillation), then the spectra of both $\lvert \mathbf{E} \rvert$ and $E_{x}$ would show a frequency peak but those of $E_{y}$ or $E_{z}$ would not.
The intensity, as you have written it, is just the field energy density multiplied by a constant. Thus, the power spectrum of $I$ would be qualitatively similar to that of $\lvert \mathbf{E} \rvert$.
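To make the component picture concrete, here is a minimal NumPy sketch (the 10 Hz tone, sample rate, and record length are made-up values): a longitudinal oscillation along $x$ produces a spectral peak in the power spectrum of $E_{x}$, while the spectrum of $E_{y}$ stays empty.

```python
import numpy as np

fs, N = 100.0, 1000                  # assumed sample rate [Hz] and record length
t = np.arange(N) / fs

Ex = np.cos(2 * np.pi * 10.0 * t)    # oscillation along x only (10 Hz)
Ey = np.zeros(N)                     # no transverse oscillation

freqs = np.fft.rfftfreq(N, 1.0 / fs)
Sx = np.abs(np.fft.rfft(Ex)) ** 2    # power spectrum of E_x
Sy = np.abs(np.fft.rfft(Ey)) ** 2    # power spectrum of E_y

assert freqs[np.argmax(Sx)] == 10.0  # peak at the oscillation frequency
assert Sy.max() == 0.0               # no power along y
```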
Although why is this true? Wouldn't this power spectrum only indicate the frequency of the signal itself and not the photons comprising it?
I am not sure about what you were told and whether you are correctly conveying that information. A discrete Fourier transform, or DFT (i.e., what you use in practice on real signals, through algorithms like the FFT), is not the same as a continuous Fourier transform (CFT). In a DFT, the frequency bin width is defined as:
$$
\Delta f = \frac{ f_{s} }{ N } \tag{1}
$$
where $f_{s}$ is the sample rate of the signal [e.g., vectors per second] and $N$ is the number of individual points used in the DFT.
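One can check this bin spacing directly with `np.fft.fftfreq` (a small sketch; the values of $f_{s}$ and $N$ are arbitrary):

```python
import numpy as np

fs = 1000.0                       # assumed sample rate [Hz]
N = 500                           # assumed number of points in the DFT
freqs = np.fft.fftfreq(N, d=1.0 / fs)

df = freqs[1] - freqs[0]          # DFT bin width
assert np.isclose(df, fs / N)     # 2.0 Hz here
```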
In a CFT, the minimum $\Delta f$ is mathematically zero (i.e., infinitesimally small), but quantum mechanics shows us that energy and momentum are quantized and thus take discrete values. Therefore, there are physical limits on the lower bound of $\Delta f$. In this case a variant of the uncertainty principle applies, called the time-energy uncertainty principle, which is roughly given as:
$$
\Delta E \ \Delta t \geq \frac{\hbar}{2} \tag{2}
$$
where $\hbar$ is the reduced Planck constant and $\Delta Q$ denotes the minimum resolution of quantity $Q$.
Thus, the transition has a known energy change, but we cannot know it more precisely than Equation 2 allows. For photons, we can directly convert energy to frequency via $E = h \ \nu$, so the same limit applies to the frequency resolution of the emitted photons.
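Combining $E = h \nu$ with Equation 2 gives $\Delta \nu \geq 1/(4 \pi \, \Delta t)$. As a rough numerical illustration (the 10 ns lifetime is an assumed, made-up value, not from the question):

```python
import math

tau = 10e-9                          # assumed excited-state lifetime [s]

# ΔE Δt ≥ ħ/2 combined with E = h ν  =>  Δν ≥ 1 / (4 π Δt)
dnu_min = 1.0 / (4.0 * math.pi * tau)
print(f"minimum linewidth ~ {dnu_min:.2e} Hz")   # ~ 8e6 Hz
```

So a 10 ns lifetime caps the frequency resolution of the emitted photons at roughly 8 MHz, no matter how the spectrum is measured.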
There are many possible choices regarding the overall scaling coefficients as well as the scaling coefficient converting time and frequency. It is possible to summarize these conventions succinctly using two numbers $a$ and $b$. I use the same notation as used in the Mathematica Fourier Transform function.
We define the Fourier Transform:
$$
\mathcal{FT}_{a,b}[f(t)](\omega) = \sqrt{\frac {|b|}{(2\pi)^{1-a}}}\int_{-\infty}^{+\infty} e^{+i b \omega t} f(t) dt
$$
And the inverse Fourier Transform
$$
\mathcal{FT}_{a,b}^{-1}[\tilde{f}(\omega)](t) = \sqrt{\frac{|b|}{(2\pi)^{1+a}}}\int_{-\infty}^{+\infty} e^{-i b \omega t} \tilde{f}(\omega) d\omega
$$
Let
$$
\tilde{f}_{a,b}(\omega) = \mathcal{FT}_{a,b}[f(t)](\omega)
$$
$$
\check{f}_{a,b}(t) = \mathcal{FT}_{a,b}^{-1}[\tilde{f}_{a,b}(\omega)](t)
$$
It can be shown via the Fourier inversion theorem that, for the classes of functions we care about in physics, $\check{f}_{a,b}(t) = f(t)$ for any $a$ and $b$. That is, for these definitions of the Fourier transform and inverse Fourier transform, the two operations are inverses of each other.
It turns out that in the engineering and scientific literature there are many conventions, which people choose depending mostly on what they are used to.
The first convention in the OP is $(a,b) = (1,-1)$, which is commonly used in physics, about as commonly as $(a,b) = (1,+1)$, the second convention you have shown.
In addition you will also see conventions where $(a,b) = (0,\pm1)$ where the factor of $2\pi$ is split evenly between the transform and inverse transform showing up with a square root.
Furthermore, in math and signal processing you will usually come across the $(a,b) = (0,\pm 2\pi)$ convention, in which there is no prefactor of $2\pi$ on either the transform or the inverse transform; instead of the angular frequency, $\omega$ represents a cyclic frequency, and a $2\pi$ appears in all of the exponentials.
All of these different conventions have advantages and disadvantages which may make one choice of convention more attractive than another depending on the application. The main point is that in any problem, whichever convention is chosen should be kept the same throughout the whole problem.
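The round-trip property can be checked numerically for several $(a,b)$ pairs at once. This is a sketch (the Gaussian test function, grid limits, and simple Riemann-sum quadrature are arbitrary choices):

```python
import numpy as np

t = np.linspace(-8.0, 8.0, 1001)
w = np.linspace(-8.0, 8.0, 1001)
f = np.exp(-t**2)                       # Gaussian test signal
dt, dw = t[1] - t[0], w[1] - w[0]

def ft(a, b):
    # FT_{a,b}[f](w) = sqrt(|b| / (2π)^{1-a}) ∫ e^{+i b w t} f(t) dt
    pref = np.sqrt(abs(b) / (2 * np.pi) ** (1 - a))
    return pref * (np.exp(1j * b * np.outer(w, t)) * f).sum(axis=1) * dt

def ift(a, b, F):
    # FT^{-1}_{a,b}[F](t) = sqrt(|b| / (2π)^{1+a}) ∫ e^{-i b w t} F(w) dw
    pref = np.sqrt(abs(b) / (2 * np.pi) ** (1 + a))
    return pref * (np.exp(-1j * b * np.outer(t, w)) * F).sum(axis=1) * dw

# The inverse transform recovers f(t) for every convention tried:
for a, b in [(1, -1), (1, 1), (0, 1), (0, -1)]:
    assert np.allclose(ift(a, b, ft(a, b)).real, f, atol=1e-6)
```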
To get back to the OP's main question now. In the language set up in this answer the OP is basically asking if it matters whether $b=+1$ or $b=-1$. The short answer is that it does not matter. Either way works and converts the original signal as a function of time into a function of frequency. The difference has to do with how we interpret positive and negative frequencies.
Consider
$$
f^1(t) = e^{+i\omega_0 t}
$$
$$
f^2(t) = e^{-i \omega_0 t}
$$
The phasor for the first function rotates counterclockwise in phase space, whereas the second rotates clockwise.
If we choose the $b=-1$ convention then $\tilde{f}^1_{1,-1}(\omega)$ will have a nonzero contribution at $+\omega_0$ whereas $\tilde{f}^2_{1,-1}(\omega)$ will have a nonzero contribution at $-\omega_0$. We might say $f^1$ is a positive frequency signal while $f^2$ is negative.
However, if we choose $b=+1$ then everything reverses. $\tilde{f}^1_{1,+1}(\omega)$ will have a nonzero contribution at $-\omega_0$, while $\tilde{f}^2_{1,+1}(\omega)$ will have a contribution at $+\omega_0$. Now $f^1$ is a negative-frequency signal and $f^2$ is a positive-frequency signal!
Thus we see that both $b=+1$ and $b=-1$ give answers that we can interpret as frequencies, with the only difference being what we call positive and negative frequencies. As a note, I personally prefer $(a,b)=(1,+1)$ because it makes the formula for the Fourier transform (which I use more often than the inverse transform) as simple as possible: no prefactor and no minus sign in the exponent.
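This sign flip is easy to see numerically. NumPy's forward FFT happens to use the $e^{-i\ldots}$ kernel, i.e., the $b=-1$ sign choice; the 5 Hz tone below is an arbitrary example:

```python
import numpy as np

fs, N, f0 = 64.0, 64, 5.0              # assumed sample rate, length, tone
t = np.arange(N) / fs

f1 = np.exp(+2j * np.pi * f0 * t)      # counterclockwise phasor
f2 = np.exp(-2j * np.pi * f0 * t)      # clockwise phasor

freqs = np.fft.fftfreq(N, 1.0 / fs)
# With the b = -1 forward transform, f1 lands at +f0 and f2 at -f0:
peak1 = freqs[np.argmax(np.abs(np.fft.fft(f1)))]   # +5.0 Hz
peak2 = freqs[np.argmax(np.abs(np.fft.fft(f2)))]   # -5.0 Hz
assert peak1 == +f0 and peak2 == -f0
```

With the opposite ($b=+1$) kernel the two peaks would simply swap places, which is exactly the relabeling described above.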
edit: As you have pointed out, sometimes these signs can have a substantial effect on a physical quantity, such as reversing the sign (inverting the phase) of the complex impedance of a capacitor. Unfortunately, this is something we just have to deal with by staying consistent with our own conventions and with those used by the references we consult. Of course, both conventions give the same answer for a real, measurable quantity such as $V(t)$ across a resistor.
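For the capacitor example, here is a sketch (component values and drive current are made up) showing that both phasor conventions produce the same real voltage waveform:

```python
import numpy as np

w, C, I0 = 2 * np.pi * 50.0, 1e-6, 1.0   # assumed frequency, capacitance, current amplitude
t = np.linspace(0.0, 0.04, 401)

# e^{+jwt} convention: Z_C = 1/(jwC), so v(t) = Re[Z_C I0 e^{+jwt}]
v_plus = np.real((I0 / (1j * w * C)) * np.exp(+1j * w * t))

# e^{-jwt} convention: the impedance sign flips, Z_C = -1/(jwC)
v_minus = np.real((-I0 / (1j * w * C)) * np.exp(-1j * w * t))

assert np.allclose(v_plus, v_minus)      # same measurable waveform
```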
It is important to understand that we can state the electric field either in the time domain or in the frequency domain, never in both. I would recommend avoiding the notation $E(\omega,t)$; instead, replace $\omega$ by $\omega_0$ and drop the functional dependence in order to avoid confusion. A monochromatic field in the time domain is given by $$ E(t) = E_0 \cos\omega_0 t = \frac{E_0}{2} \left(e^{-i\omega_0 t}+e^{i\omega_0 t}\right) $$ Here it is clear that $\omega_0$ is a fixed number, i.e., a parameter. The expansion into exponential functions helps us see that the Fourier transform must consist of two delta functions, as the exponential function is the kernel of the Fourier transform. The electric field in the Fourier domain is then $$ \tilde{E}(\omega) = \frac{E_0}{2}\left( \delta(\omega-\omega_0) + \delta(\omega+\omega_0) \right) $$

So far everything is fine. However, one should note that your monochromatic field is somewhat problematic, as it contains infinite energy: $$ E_\mathrm{field} \propto \int |E(t)|^2 \,\mathrm{d}t = \int |\tilde{E}(\omega)|^2 \,\mathrm{d}\omega = \infty \quad \text{(Parseval's theorem)} $$ If you look at the spectrum, i.e., $|\tilde{E}(\omega)|^2$, this is especially problematic because in a monochromatic wave all energy is concentrated into a single frequency, here $\omega_0$. Therefore your expression for the number of photons is simply undefined, so you cannot assume a purely monochromatic wave in this case. Instead, assume for example a finite wave packet with an electric field $$ E(t) = E_0 \exp\left(-\frac{t^2}{2 \sigma_t^2}\right) \cos\omega_0 t $$ With this field everything works out.
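One can verify numerically that the Gaussian wave packet carries finite energy and satisfies Parseval's theorem. This is a sketch: $E_0$, $\sigma_t$, and $\omega_0$ are arbitrary values, and the DFT is scaled so the discrete sums approximate the continuous integrals.

```python
import numpy as np

E0, sigma_t, w0 = 1.0, 1.0, 20.0        # assumed, illustrative parameters
t = np.linspace(-10.0, 10.0, 8000, endpoint=False)
dt = t[1] - t[0]

# Gaussian wave packet
E = E0 * np.exp(-t**2 / (2 * sigma_t**2)) * np.cos(w0 * t)

energy_t = (np.abs(E) ** 2).sum() * dt  # finite, unlike the monochromatic case

Ef = np.fft.fft(E) * dt                 # approximate continuous Fourier transform
df = 1.0 / (len(t) * dt)
energy_f = (np.abs(Ef) ** 2).sum() * df

assert np.isfinite(energy_t)
assert np.isclose(energy_t, energy_f)   # Parseval's theorem holds
```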
Another alternative is to change your expression for the number of photons by using the Poynting vector $\mathbf{S}$, which is the energy flux per unit area and is well defined even for monochromatic waves (the problem in your formula is that it assumes all the photons in the field are absorbed). $$ \mathbf{S} = \frac{1}{\mu_0} \mathbf{E}(t) \times \mathbf{B}(t) $$
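For a plane wave ($\mathbf{B}$ perpendicular to $\mathbf{E}$, in phase, with $B = E/c$), the time-averaged Poynting flux is $\langle S \rangle = \tfrac{1}{2}\epsilon_0 c E_0^2$, which stays finite even for a monochromatic wave. A quick numerical check (the field amplitude and frequency are made-up values):

```python
import numpy as np

mu0 = 4e-7 * np.pi                  # vacuum permeability [H/m]
c = 299792458.0                     # speed of light [m/s]
eps0 = 1.0 / (mu0 * c**2)           # vacuum permittivity [F/m]

E0, w = 1.0, 2 * np.pi * 1e9        # assumed amplitude [V/m] and angular frequency
T = 2 * np.pi / w
t = np.linspace(0.0, 10 * T, 10000, endpoint=False)   # ten full periods

Ex = E0 * np.cos(w * t)             # E along x
By = (E0 / c) * np.cos(w * t)       # B along y, in phase (plane wave)
Sz = Ex * By / mu0                  # Poynting vector along z

# Time average over full periods matches (1/2) eps0 c E0^2
assert np.isclose(Sz.mean(), 0.5 * eps0 * c * E0**2)
```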