I am not a signal processing expert, but here goes.
Regarding questions 1 and 2: The first element is the 0 Hz (DC) component. It equals the sum of all samples, i.e. n times the mean of the data. This is true for all Fourier transformed data sets. If your data has a relatively high mean value, this element dwarfs everything else, so the spectrum may be easier to analyze visually if you remove it. Removing the first value is more or less equivalent to data = data - mean(data) prior to the fft: subtracting the mean sets the first element to zero (up to rounding) and leaves all other elements unchanged.
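To make the point about the first element concrete, here is a minimal sketch. The test signal, the sample rate fs and the variable names Y and Y0 are my own assumptions for illustration, not part of the original example.

```matlab
% Hypothetical test signal: a 5 Hz sine riding on a large offset of 10.
fs = 100;                      % assumed sample rate in Hz
t = (0:fs-1)/fs;               % one second of samples
data = 10 + sin(2*pi*5*t);

Y  = fft(data);                % transform of the raw data
Y0 = fft(data - mean(data));   % transform after removing the mean

% The first element is the sum of the samples (n times the mean).
disp(abs(Y(1) - sum(data)))            % close to 0
% After subtracting the mean, the first element vanishes ...
disp(abs(Y0(1)))                       % close to 0
% ... while every other element is unchanged.
disp(max(abs(Y(2:end) - Y0(2:end))))   % close to 0
```

All three printed values should be zero up to floating-point rounding.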
Regarding question 3: Yes.
Regarding question 4: As far as I am aware, yes. That is why the example combines these two steps: n = length(y); power = abs(y(1:floor(n/2))).^2;
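As a rough sketch of how those two steps fit together with a frequency axis: the signal, fs, N and freq below are assumptions of mine, and only the two lines computing n and power come from the example.

```matlab
% Hypothetical test signal: a 5 Hz sine with an offset, sampled at 100 Hz.
fs = 100;                          % assumed sample rate in Hz
t = (0:fs-1)/fs;
data = 10 + sin(2*pi*5*t);

N = length(data);                  % number of samples before trimming
y = fft(data);
y(1) = [];                         % drop the 0 Hz element

n = length(y);
power = abs(y(1:floor(n/2))).^2;   % the two steps from the example

% After dropping the first element, y(k) belongs to frequency k*fs/N.
freq = (1:floor(n/2)) * fs / N;

[~, k] = max(power);               % index of the strongest component
disp(freq(k))                      % 5 for this test signal
```

Note that the frequency axis is built from the original length N, not from n, because the 0 Hz element has already been removed from y.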
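Regarding question 5: This drops out because of the Nyquist criterion. You can only meaningfully detect frequencies up to half of your sampling frequency. For real-valued data, the second half of the fft output is a mirror image (complex conjugate) of the first half, so it carries no additional information. Hope this helps.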
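If you want to check the question 5 point yourself, the sketch below compares the two halves of the fft output for a real-valued signal. The signal and the names lower and upper are assumptions of mine for illustration.

```matlab
% Hypothetical real-valued test signal sampled at 100 Hz.
fs = 100;                      % assumed sample rate in Hz
t = (0:fs-1)/fs;
data = 10 + sin(2*pi*5*t);

Y = fft(data);
N = length(Y);

% Skip the 0 Hz element, then compare the rest with itself in reverse order.
lower = Y(2:N);                % elements 2, 3, ..., N
upper = Y(N:-1:2);             % elements N, N-1, ..., 2

% For real input each element is the complex conjugate of its mirror partner.
disp(max(abs(lower - conj(upper))))   % close to 0

% Highest frequency that can be resolved at this sample rate.
disp(fs/2)                             % 50
```

Since the upper half only repeats the lower half, keeping the elements up to fs/2 loses nothing for real data.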