Search Header Logo
Huffman Coding

Huffman Coding

Assessment

Presentation

Computers

10th Grade

Hard

Created by

Roy Duguid

Used 4+ times

FREE Resource

43 Slides • 0 Questions

1

​Do Now:

​ 1. Log in

  1. Go to class notebook

  2. Complete Do Now

2

​Do Now ANSWERS:

3

​Big Question

  • How can data can be compressed using Huffman coding?

  • How do we interpret a Huffman tree?

  • How do we calculate the number of bits required to store a piece of data compressed using Huffman coding?

​Key Words: Huffman, node, frequency

4

Without compression, each character in a text file would have an equal number of bits.

If you were using 8-bit ASCII, then each character would take up 1 byte of space. This would be higher for 16- or 32-bit Unicode.

​Huffman Coding

5

This is acceptable for short text files, but what if you were writing a book with 100,000 words in it?

An uncompressed book could end up having a file size of around 500,000 bytes.

To ensure that hundreds or even thousands of books can be saved on e-readers, some compression needs to occur.

​Huffman Coding

6

media

Huffman coding is a compression
algorithm that allocates bit patterns of
varying lengths to the characters in a
text file.

Huffman coding

Activity 1

7

media

Huffman coding is a compression
algorithm that allocates bit patterns of
varying lengths to the characters in a
text file.

Characters that occur most frequently
get a short bit pattern.

Huffman coding

e = 01

Activity 1

8

media

Huffman coding is a compression
algorithm that allocates bit patterns of
varying lengths to the characters in a
text file.

Characters that occur most frequently
get a short bit pattern.

Characters that occur less frequently
get a longer bit pattern.

Huffman coding

e = 01

z = 10001000

Activity 1

9

media

This means that instead of storing each
character using 8 or 16 bits, the most
common characters use much smaller
bit patterns.

This helps to reduce the overall file size
of the text file.

Huffman coding

e = 01

z = 10001000

Activity 1

10

media

The Huffman coding algorithm will:

Check the entire text file, counting
the number of occurrences
recorded for each character

Place those characters in order from
most frequent to least frequent

Create a Huffman tree based on
the frequencies of the characters

Apply a bit pattern to each
character in the tree

Encode each character using the
new bit pattern

Huffman coding

Activity 1

11

media

Let’s use a single word as an example:

MISSISSIPPI

There are 11 characters in this word.

The uncompressed text file would have a
file size of 88 bits (or 11 bytes) if using
8-bit ASCII.

Huffman coding

Activity 1

12

media

You can create a frequency table to
count how many times each letter
appears in this text file.

MISSISSIPPI

Huffman coding

Activity 1

M

I

S

P

13

media

You can create a frequency table to
count how many times each letter
appears in this text file.

MISSISSIPPI

Huffman coding

Activity 1

M

I

S

P

1

14

media

You can create a frequency table to
count how many times each letter
appears in this text file.

MISSISSIPPI

Huffman coding

Activity 1

M

I

S

P

1

4

15

media

You can create a frequency table to
count how many times each letter
appears in this text file.

MISSISSIPPI

Huffman coding

Activity 1

M

I

S

P

1

4

4

16

media

You can create a frequency table to
count how many times each letter
appears in this text file.

MISSISSIPPI

Huffman coding

Activity 1

M

I

S

P

1

4

4

2

17

media

The letters can then be sorted into order
to see which letters occur most
frequently, and which occur least

frequently.

Huffman coding

Activity 1

I

S

P

M

4

4

2

1

18

media

A Huffman tree is then created based on
the frequencies.

Huffman coding

Activity 1

19

media

A Huffman tree uses a binary tree
structure.

A binary tree is made up of nodes and
branches.

Huffman coding

Activity 1

Nodes

Branches

20

media

Each node can have a maximum of two
child nodes.

This is where the term ‘binary tree’
comes from.

Huffman coding

Activity 1

21

media

A Huffman tree for the word
MISSISSIPPI looks like this.

Huffman coding

Activity 1

11

7

3

S

4

I

4

M

1

P

2

22

media

Each node can have a maximum of two
child nodes.

This is where the term ‘binary tree’
comes from.

The node at the top is referred to as the
root node, and is also a parent node.

Huffman coding

Activity 1

Parent node

23

media

You can see the four characters in the
word listed here on the left-hand side.

Huffman coding

Activity 1

11

7

3

S

4

I

4

M

1

P

2

24

media

You can see the frequency of each
character listed on the right-hand side.

Huffman coding

Activity 1

11

7

3

S

4

I

4

M

1

P

2

25

media

Notice how the less frequently occurring
characters are furthest away from the
root node.

Huffman coding

Activity 1

11

7

3

S

4

I

4

M

1

P

2

Root node

26

media

The cumulative frequency of each
character is placed in the parent node.

Huffman coding

Activity 1

11

7

3

S

4

I

4

M

1

P

2

27

media

The cumulative frequency of each
character is placed in the parent node.

1 + 2 = 3

Huffman coding

Activity 1

11

7

3

S

4

I

4

M

1

P

2

28

media

The cumulative frequency of each
character is placed in the parent node.

1 + 2 = 3

4 + 3 = 7

Huffman coding

Activity 1

11

7

3

S

4

I

4

M

1

P

2

29

media

The cumulative frequency of each
character is placed in the parent node.

1 + 2 = 3

4 + 3 = 7

4 + 7 = 11

Huffman coding

Activity 1

11

7

3

S

4

I

4

M

1

P

2

30

media

Once the Huffman tree is created, a
binary number is assigned to each
character.

Huffman coding

Activity 1

11

7

3

S

4

I

4

M

1

P

2

31

media

This is done by following the path from
the root of the tree to the character
node.

Huffman coding

Activity 1

11

7

3

S

4

I

4

M

1

P

2

32

media

Moving left uses a 0 and moving right
uses a 1.

Huffman coding

Activity 1

11

7

3

S

4

I

4

M

1

P

2

0

0

0

1

1

1

33

media

For the character S, you just move left
once,so the bit pattern for S is 0.

Huffman coding

Activity 1

11

7

3

S

4

I

4

M

1

P

2

0

0

0

1

1

1

0

34

media

For the character I, you move right once
and left once,so the bit pattern is 10.

Huffman coding

Activity 1

11

7

3

S

4

I

4

M

1

P

2

0

0

0

1

1

1

0

10

35

media

For the character M, you move right
twice and then left once, so the bit
pattern is 110.

Huffman coding

Activity 1

11

7

3

S

4

I

4

M

1

P

2

0

0

0

1

1

1

0

10

110

36

media

For the character P, you move right
three times, so the bit pattern is 111.

Huffman coding

Activity 1

11

7

3

S

4

I

4

M

1

P

2

0

0

0

1

1

1

0

10

110

111

37

media

These bit patterns can now be used to
create the compressed file.

Huffman coding

Activity 1

S = 0
I = 10
P = 111
M = 110

38

media

These bit patterns can now be used to
create the compressed file.

Huffman coding

Activity 1

S = 0
I = 10
P = 111
M = 110

MISSISSIPPI

110 10

0010

0010 111 111 10

39

media

The compressed file now has a file size of
21 bits compared to the original file size
of 88 bits.

Huffman coding

Activity 1

S = 0
I = 10
M = 110
P = 111

MISSISSIPPI

110 10

0010

0010 111 111 10

21 bits

40

media

The compressed file now has a file size of
21 bits compared to the original file size
of 88 bits.

Notice that there has been no loss of
data during this compression. This means
that Huffman coding is a lossless
compression algorithm.

Huffman coding

Activity 1

S = 0
I = 10
M = 110
P = 111

MISSISSIPPI

110 10

0010

0010 111 111 10

21 bits

41

media

The compressed file now has a file size of
21 bits compared to the original file size
of 88 bits.

Notice that there has been no loss of
data during this compression. This means
that Huffman coding is a lossless
compression algorithm.

Note: 21 bits refers to the raw file size and
doesn’t take into account the amount of
storage required to store the tree that is

needed for decompression.

Huffman coding

Activity 1

S = 0
I = 10
M = 110
P = 111

MISSISSIPPI

110 10

0010

0010 111 111 10

21 bits

42

media

Write down a brief explanation of how
data can be compressed using Huffman
coding.

Explain how Huffman coding works

Plenary

43

media

Huffman coding is a lossless
compression technique. It does this by
assigning bit patterns of varying lengths
to characters. The most frequently
occurring characters have a short bit
pattern and less frequently occurring
characters have a longer bit pattern.

Characters are encoded using a
Huffman tree, which has the same
structure as a binary tree.

Explain how Huffman coding works – Model answer

Plenary

​Do Now:

​ 1. Log in

  1. Go to class notebook

  2. Complete Do Now

Show answer

Auto Play

Slide 1 / 43

SLIDE