Huffman Algorithm

algorithmscomputer scienceinformation theory

We use Huffman’s algorithm to obtain an encoding of alphabet ${a,b,c,d}$ with frequencies $f_{a}, f_{b}, f_{c}, f_{d}.$ In each of the following cases, either give an example of frequencies $(fa, fb, fc, fd)$ that would yield the specified code, or explain why the code cannot possibly be obtained (no matter what the frequencies are).

(a) Code: ${0,10,110,111}$

(b) Code: ${1,00,01,110} $

(c) Code: ${00,01,10,11}$

I mostly solved part $b$, but could not be able to solve $a$ and $c$. Would be welcomed to know your solutions on this!

Best Answer

In Huffman coding, we build the tree bottom-up by considering pairs of alphabets with the least frequency/probability. So the two least frequent alphabets get assigned the longest code length. In part $(b)$, $d$ has the longest code $110$ so some other alphabet must also have the code $111$ which is not the case.

In part $(a),c,d$ have the largest code length so they have the two least probabilities, say $p_c\le p_d$. We start building the tree by forming a dummy vertex labelled $v_1$ with left child $c$, right child $d$ and $p_1=p_c+p_d$. Then $v_1$ has the code $11$. Now $b$ has the code $10$, i.e. $b$ is the left child and $v_1$ is the right child of a dummy vertex $v_2$. So $p_2=p_b+p_1$ and $p_b\le p_1$. Since out of $\{a,b,v_1\},b,v_1$ have the longest codes, we also have $p_b,p_1\le p_a$.

Now $v_2$ has the code $1$. $a$ has the code $0$, i.e. $a$ is the left child and $v_2$ is the right child of the root vertex $R$. So $p_a\le p_2.$ So the constraints we have are$$\begin{align*}&p_c\le p_d\le p_b\le p_a\\&p_b\le p_c+p_d\le p_a\\&p_a\le p_b+p_c+p_d\\&p_a+p_b+p_c+p_d=1\end{align*}$$

You can check that $p_a=0.41,p_b=0.29,p_c=0.15,p_d=0.15$ is a solution and it generates the tree in the picture:

Graph

For part $(c)$, all alphabets have the same code length so one solution is when they are all equiprobable, i.e. $p_a=p_b=p_c=p_d=0.25$. Can you check if you get the correct tree and codes?


Edit: Your tree should look like this:

Tree

Note that instead of $a,b,c,d$ we could have $b,a,c,d$ or any other order of the four alphabets in the bottom four vertices since they have the same probability, but we have to make the tree according to the codes given. The below tree also corresponds to the given probabilities but the code for $b$ is $00$, which is not desired.

Tree

Related Question