[Math] Understanding Discrete Cosine Transformation

fourier analysis, signal analysis

I'm currently working on some software in which a key component is the 2D DCT. My question is more general, though: I'm trying to understand the DCT itself, let's say from an engineer's point of view.
To start, I know that there are 8 types of DCT, and that authors use different notation, sometimes even different parameterizations, but that doesn't matter here, as I'm not going to implement the DCT; I only want to understand it.

I will stick to the formula taken from http://www.cs.cf.ac.uk/Dave/Multimedia/node231.html.

The DCT is defined as follows:
$$
F(u) = \left( \frac{2}{N} \right)^{\frac{1}{2}} \sum_{i=0}^{N-1} \Lambda(i) \cos\left[ \frac{\pi \cdot u}{2N}(2i+1) \right] f(i)
$$
$N$ is the number of samples.
$i$ is the index of a particular sample and $f(i)$ is its value.
$\Lambda$: well, I'm not sure, but it's only a weight coefficient, so it does not affect the principle of the DCT.
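
To make the formula concrete, here is a minimal Python sketch that evaluates it directly. One assumption that the source does not confirm: the weight is taken to be the usual orthonormal DCT-II choice, $1/\sqrt{2}$ for $u=0$ and $1$ otherwise, applied once per output index $u$. The function name `dct_direct` is purely illustrative.

```python
import math

def dct_direct(f):
    """Evaluate the DCT formula term by term (O(N^2), for understanding only)."""
    N = len(f)
    result = []
    for u in range(N):
        # ASSUMPTION: orthonormal DCT-II weight, 1/sqrt(2) for u == 0, else 1.
        weight = 1.0 / math.sqrt(2.0) if u == 0 else 1.0
        total = 0.0
        for i in range(N):
            # Cosine basis function number u, evaluated at sample i.
            total += math.cos(math.pi * u * (2 * i + 1) / (2 * N)) * f[i]
        result.append(math.sqrt(2.0 / N) * weight * total)
    return result

# A constant signal contains only the u = 0 ("DC") component.
coeffs = dct_direct([1.0, 1.0, 1.0, 1.0])
```

With this weighting, the constant input should give a single nonzero coefficient at $u=0$ and (up to rounding) zeros elsewhere, which fits the reading of $F(u)$ as "how much of basis cosine $u$ is in the data".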

What I'm struggling to understand are the values $u$ and $F(u)$. I know that the DCT transforms data to the frequency domain, but I have not found the meaning of these values. My guess is that $u$ is a particular frequency and $F(u)$ is the amount of this frequency in the data; e.g. for a signal with a frequency of 8 kHz (a whistle, for example), the DCT would return $0$ for all values of $u$ except for some large value at $u=8000$. (This is an idealized case; I know the example is oversimplified.)
I've also deduced that the maximum frequency in the DCT result will be limited by the number of samples, e.g. for sound sampled at 44100 Hz there won't be any coefficient for a frequency higher than 44100 Hz, due to the Nyquist criterion.

So are my conclusions right, or completely off track? Thanks in advance.

Best Answer

You're very close. $u$ corresponds to frequency and $|F(u)|$ is the amount of that frequency in the signal. Let me explain the relation between the variable $u$ and the frequency it corresponds to.

If a signal is sampled with a sampling period of $T_{p}$, the maximum frequency it can successfully represent is $1/(2T_{p})$. Here $f=1/T_{p}$ is the sampling rate, which for audio signals is typically 44.1 kHz (so that it can represent frequencies up to about 22 kHz, which is close to the limit of human hearing).
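
As a worked example with these numbers, taking the sampling rate to be exactly 44100 Hz:
$$
T_{p} = \frac{1}{44100}\,\text{s} \approx 22.7\,\mu\text{s}, \qquad \frac{1}{2T_{p}} = \frac{44100}{2}\,\text{Hz} = 22050\,\text{Hz}
$$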

Now, which frequencies it can represent depends on $N$, i.e. the number of samples that you take.

For the DCT formula above, the cosine with index $u$ completes $u/(2N)$ cycles per sample, so the frequencies take the discrete values $0, \frac{f}{2N}, \frac{2f}{2N}, \dots, \frac{(N-1)f}{2N}$, corresponding to $u=0,1,2,\dots,N-1$. All of these lie below $f/2$; frequencies above $f/2$ alias back onto lower ones. So the more samples you take, the more finely spaced the frequencies you can represent.
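
A quick numerical check of the index-to-frequency mapping, as a sketch under stated assumptions: the cosine in the question's formula, $\cos\left[\frac{\pi u}{2N}(2i+1)\right]$, completes $u/(2N)$ cycles per sample, so a tone at $u_0 f/(2N)$ Hz should peak at index $u_0$. The sampling rate, $N$, and $u_0$ below are arbitrary illustrative values, the weight $\Lambda$ is left out because it only rescales the result, and the tone is sampled at half-sample offsets so that it matches one basis function exactly (a general tone would spread over several indices).

```python
import math

def dct_magnitudes(signal):
    """|sum_i cos(pi*u*(2i+1)/(2N)) * signal[i]| for each u (weights omitted)."""
    N = len(signal)
    mags = []
    for u in range(N):
        s = sum(math.cos(math.pi * u * (2 * i + 1) / (2 * N)) * signal[i]
                for i in range(N))
        mags.append(abs(s))
    return mags

fs = 8000.0   # hypothetical sampling rate in Hz
N = 64        # number of samples
u0 = 10       # index we expect to light up

tone_hz = u0 * fs / (2 * N)   # frequency that index u0 corresponds to
# ASSUMPTION: sample i is taken at time (i + 0.5) / fs, matching the (2i+1) term.
signal = [math.cos(2 * math.pi * tone_hz * (i + 0.5) / fs) for i in range(N)]

mags = dct_magnitudes(signal)
peak = max(range(N), key=lambda u: mags[u])
peak_hz = peak * fs / (2 * N)   # convert the peak index back to Hz
```

Here `peak` should come out equal to `u0` and `peak_hz` equal to `tone_hz`, with every other magnitude close to zero, because the basis cosines are mutually orthogonal.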
