Lecture 7: Source Coding and Kraft Inequality


Codes
Kraft inequality and consequences

Dr. Yao Xie, ECE587, Information Theory, Duke University

Horse Racing


pi      Code 1    Code 2
1/2     000       0
1/4     001       10
1/8     010       110
1/16    011       1110
1/64    100       111100
1/64    101       111101
1/64    110       111110
1/64    111       111111
E[li]   3         2

H(X) = -Σi pi log pi = 2 bits
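The numbers in the table are easy to verify; a minimal Python sketch (variable names are my own):

```python
from math import log2

# Probabilities of the 8 horses and the two codes from the table above
p = [1/2, 1/4, 1/8, 1/16, 1/64, 1/64, 1/64, 1/64]
code1 = ["000", "001", "010", "011", "100", "101", "110", "111"]
code2 = ["0", "10", "110", "1110", "111100", "111101", "111110", "111111"]

# Entropy H(X) = -sum p_i log2 p_i
H = -sum(pi * log2(pi) for pi in p)

# Expected code lengths E[l] = sum p_i * l_i
L1 = sum(pi * len(c) for pi, c in zip(p, code1))
L2 = sum(pi * len(c) for pi, c in zip(p, code2))

print(H, L1, L2)  # 2.0 3.0 2.0
```

Code 2 matches the entropy exactly because every pi is a power of 1/2.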

How to find the best code?


Codes
A source code C for a random variable X is a mapping
  C : X → D*
where D* is the set of finite-length strings of symbols from the D-ary alphabet D
Code length: l(x) is the length of C(x)
Example: X = {red, blue}, D = {0, 1}, C(red) = 00, C(blue) = 11


Morse's code (1836)

A code for the English alphabet over a four-symbol alphabet
Developed for the electric telegraph system
D = {dot, dash, letter space, word space}
Short sequences represent frequent letters
Long sequences represent infrequent letters


Source coding applications

Magnetic recording: cassette, hard drive, USB...
Speech compression
Compact disc (CD)
Image compression: JPEG
Still an active area of research:
  Solid-state drives
  Sensor networks: distributed source coding


What defines a good code

Non-singular:
  x ≠ x' ⇒ C(x) ≠ C(x')
Non-singularity is enough to describe a single RV X

When we send a sequence of values of X without commas, can we still
uniquely decode?
Uniquely decodable if the extension of the code is nonsingular:
  C(x1)C(x2)· · ·C(xn)


X    Singular    Nonsingular, not        Uniquely decodable,    Prefix
                 uniquely decodable      not prefix
1    0           0                       10                     0
2    0           010                     00                     10
3    0           01                      11                     110
4    0           10                      110                    111
Uniquely decodable: only one possible source string can produce a given
encoded string

However, we may have to look at the entire string to decode
Prefix code (instantaneous code): no codeword is a prefix of any other
codeword
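The prefix condition is easy to test mechanically; a small sketch using the last two codes from the table above (`is_prefix_free` is my own helper name):

```python
def is_prefix_free(codewords):
    """Return True if no codeword is a prefix of another (instantaneous code)."""
    for i, a in enumerate(codewords):
        for j, b in enumerate(codewords):
            if i != j and b.startswith(a):
                return False
    return True

uniquely_decodable = ["10", "00", "11", "110"]  # uniquely decodable, not prefix
prefix_code = ["0", "10", "110", "111"]         # prefix (instantaneous)

print(is_prefix_free(uniquely_decodable))  # False ("11" is a prefix of "110")
print(is_prefix_free(prefix_code))         # True
```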

(Figure: nested code classes)
All codes ⊃ nonsingular codes ⊃ uniquely decodable codes ⊃ instantaneous codes


Expected code length

Expected length L(C) of a source code C for X with pmf p(x):
  L(C) = Σx∈X p(x) l(x)

We wish to construct instantaneous codes of minimum expected length


Kraft inequality
Due to Kraft (1949)
Code over an alphabet of size D
m codewords with lengths l1, . . . , lm
The codeword lengths of any instantaneous code must satisfy the Kraft
inequality
  Σi=1..m D^(-li) ≤ 1

Conversely, given l1, . . . , lm satisfying the Kraft inequality, we can
construct an instantaneous code with these lengths

Extends to uniquely decodable codes (McMillan inequality)
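A quick numerical check of the inequality for the codes seen earlier (a sketch; `kraft_sum` is my own helper name):

```python
def kraft_sum(lengths, D=2):
    """Sum of D^(-l_i); the Kraft inequality requires this to be <= 1."""
    return sum(D ** -l for l in lengths)

# Codeword lengths of the two horse-race codes from earlier
print(kraft_sum([3] * 8))                   # 1.0 (equality: 8 * 2^-3)
print(kraft_sum([1, 2, 3, 4, 6, 6, 6, 6]))  # 1.0 (also equality)
print(kraft_sum([1, 1, 2]))                 # 1.25 > 1: no instantaneous code exists
```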

Proof of Kraft inequality

Consider a D-ary tree
Each codeword is represented by a leaf node
The path from the root traces out the symbols of the codeword
Prefix code: no codeword is an ancestor of any other codeword on the
tree
Each codeword eliminates its descendants as candidate codewords


(Figure: code tree with the root at the top and codewords 10, 110, 111 at
leaf nodes)

Let lmax be the length of the longest codeword

A codeword at level li has D^(lmax - li) descendants at level lmax
The descendant sets must be disjoint, so
  Σi D^(lmax - li) ≤ D^(lmax)  ⇒  Σi D^(-li) ≤ 1

Converse: if l1, . . . , lm satisfy the Kraft inequality, label the first
available node at depth l1 as a codeword, remove its descendants, and continue
The argument extends to (countably) infinite prefix codes
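The converse construction (label a node, remove its descendants) can be written compactly as a canonical code assignment; a sketch for the binary case (D = 2), with names of my own choosing:

```python
def prefix_code_from_lengths(lengths):
    """Construct binary codewords with the given lengths: assign codewords
    in increasing order of length, and skip each assigned leaf together
    with its descendants. Assumes the Kraft inequality is satisfied."""
    assert sum(2 ** -l for l in lengths) <= 1, "lengths violate Kraft"
    codewords = []
    next_node = 0  # next free node at the current depth, as an integer
    prev_len = 0
    for l in sorted(lengths):
        next_node <<= (l - prev_len)  # descend to depth l
        codewords.append(format(next_node, f"0{l}b"))
        next_node += 1                # skip this leaf and its subtree
        prev_len = l
    return codewords

print(prefix_code_from_lengths([1, 2, 3, 3]))  # ['0', '10', '110', '111']
```

The result reproduces the prefix code from the earlier table.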

Optimal expected code length

One application of the Kraft inequality
The expected code length of a D-ary code is lower bounded by entropy:
  L ≥ HD(X)
Proof:
  L - HD(X) = Σi pi li - Σi pi logD (1/pi) = D(p||r) + logD (1/c) ≥ 0
where
  ri = D^(-li) / Σj D^(-lj),  c = Σi D^(-li) ≤ 1

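The identity in the proof can be checked numerically; a sketch reusing the horse-race Code 2, for which the bound is tight:

```python
from math import log2

p = [1/2, 1/4, 1/8, 1/16, 1/64, 1/64, 1/64, 1/64]
lengths = [1, 2, 3, 4, 6, 6, 6, 6]  # Code 2 codeword lengths

L = sum(pi * l for pi, l in zip(p, lengths))
H = -sum(pi * log2(pi) for pi in p)

c = sum(2 ** -l for l in lengths)          # Kraft sum
r = [2 ** -l / c for l in lengths]         # r_i = 2^(-l_i) / c
KL = sum(pi * log2(pi / ri) for pi, ri in zip(p, r))  # D(p||r)

# L - H = D(p||r) + log2(1/c); here the code is optimal, so both sides are 0
print(L - H, KL + log2(1 / c))
```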

Achieve the minimum code length if

c = 1: the Kraft inequality holds with equality
ri = pi: the D-adic approximation of the pmf is exact
How to construct such an optimal code?
Find the D-adic distribution closest to the distribution of X
Construct the code by the converse of the Kraft inequality


Construction of optimal codes

Finding the D-adic distribution closest to the distribution of X is
impractical: there is no obvious way to compute it
Good suboptimal procedures:
  Shannon-Fano coding
  Arithmetic coding
Optimal procedure: Huffman coding


First step: finding optimal code lengths

Solve the optimization problem
  minimize over li:  Σi=1..m pi li
  subject to:        Σi=1..m D^(-li) ≤ 1

Solve using a Lagrange multiplier:

  J = Σi=1..m pi li + λ (Σi=1..m D^(-li) - 1)
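Treating the li as real-valued, setting the derivative of J to zero fills in the step between the Lagrangian and the solution below:

```latex
\frac{\partial J}{\partial l_i} = p_i - \lambda D^{-l_i} \ln D = 0
\quad\Longrightarrow\quad
D^{-l_i} = \frac{p_i}{\lambda \ln D}
```

Summing over i and using the constraint with equality, Σi D^(-li) = 1, gives λ ln D = 1, hence D^(-li) = pi.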

Solution:

  li* = -logD pi

Achieves the lower bound:

  L* = Σi pi li* = -Σi pi logD pi = HD(X)

Problem: -logD pi may not be an integer!

Rounding up,

  li = ⌈logD (1/pi)⌉,

may not be optimal.

Usable code constructions?
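Rounding up gives the Shannon code lengths; a quick check, with an illustrative pmf of my own choosing, that the resulting expected length satisfies H ≤ L < H + 1:

```python
from math import ceil, log2

def shannon_lengths(p):
    """Code lengths l_i = ceil(log2(1/p_i)); they satisfy the Kraft
    inequality since 2^(-l_i) <= p_i, so an instantaneous code with
    these lengths exists."""
    return [ceil(log2(1 / pi)) for pi in p]

p = [0.4, 0.3, 0.2, 0.1]  # illustrative pmf
lengths = shannon_lengths(p)
L = sum(pi * l for pi, l in zip(p, lengths))
H = -sum(pi * log2(pi) for pi in p)

print(lengths)         # [2, 2, 3, 4]
print(H <= L < H + 1)  # True
```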

Summary
Nonsingular ⊃ uniquely decodable ⊃ instantaneous codes
Kraft inequality for instantaneous codes
Entropy is a lower bound on expected code length
