**1 Answer**

written 21 months ago by | • modified 21 months ago |

**Huffman Coding**

- Huffman Coding is a
that uses the frequency or probability of characters for generating codes.*Greedy Algorithm* - Here, we use the frequency of characters to create a Huffman tree and find out the Huffman code for each character in the given string CONNECTION.
- It allows a
of data.*Lossless Compression* - It generates variable-length codes for all the different characters.
- To do this it uses
*variable-length encoding.* - This process is called as
**Huffman Encoding**. - The Time Complexity of Huffman Coding becomes $O(N log N)$, where N is the number of unique characters in the given string.

**Prefix Rule:**

- To prevent any ambiguities while decoding, Huffman Coding creates a rule called as a
*Prefix Rule.* - It ensures that the code assigned to any character is not a prefix of the code assigned to any other character.

**Phases of Huffman Encoding:**

** Phase 1 -** Build a Huffman Tree from the input characters of the string.

** Phase 2 -** Allocate the Huffman Code to each unique character by traversing the Huffman Tree.

## Working of the Huffman Coding

**The given String is CONNECTION**

- The given string contains a total of 10 characters.
- As Huffman Coding technique is used to compress the string into a smaller size.
- To do this it first creates a
using the frequencies of each character and then generates code for each character.*Huffman Tree*

**Phase 1 – Huffman Tree Generation**

*Step 1 –*

- Calculate the frequency of each character in the given string CONNECTION.
- To do this make each unique character of the given string as a leaf node.
- Leaf node of a character shows the frequency occurrence of that unique character. This is shown in the below figure.

Characters | C | O | N | E | T | I |
---|---|---|---|---|---|---|

Frequency |
2 | 2 | 3 | 1 | 1 | 1 |

*Step 2 –*

- Arrange the characters in
of their frequencies.*Increasing order or Ascending order*

Therefore,

E | T | I | C | O | N |
---|---|---|---|---|---|

1 | 1 | 1 | 2 | 2 | 3 |

*Step 3 –*

Consider the first two nodes of the characters having minimum frequencies,

- Create a new internal node.
- The frequency of this new node is the sum of the frequencies of those two character nodes.

*Step 4 –*

- Keep repeating
*Step – 3*until all the nodes form a single tree. - The tree obtained is the required Huffman Tree for the given string CONNECTION.

*The figure shown below shows the step-by-step generation of the Huffman Tree for the given string CONNECTION:*

*Step 5 –*

- Allocate weights to all the edges of the generated Huffman Tree based on the below rules.

** Rule 1 -** If assign weight ‘0’ to the left edges, then assign weight ‘1’ to the right edges.

** Rule 2 -** If assign weight ‘1’ to the left edges, then assign weight ‘0’ to the right edges.

- Any of the above two rules can be adopted.
- But must ensure that the same rule is also adopted at the time of decoding too, that is adopted at the time of encoding.
- Here, we use
which means allocate weight ‘0’ to the left edges and weight ‘1’ to the right edges.*Rule 1*

*Now, the final Huffman Tree with allocated weights to the edges look as follows for the given String CONNECTION:*

**Phase 2 – Huffman Code Calculation**

- To find the Huffman Code for each unique character of the given string, traverse the Huffman Tree from the root node to the leaf node of that character.

*Therefore, the Huffman code for each unique character is as follows:*

Character | Huffman Code |
---|---|

E | 1010 |

T | 1011 |

I | 100 |

C | 00 |

O | 01 |

N | 11 |