Lecture 7: Source Coding and Kraft Inequality


Codes
Kraft inequality and consequences

Dr. Yao Xie, ECE587, Information Theory, Duke University

Horse Racing


pi      Code 1    Code 2
1/2     000       0
1/4     001       10
1/8     010       110
1/16    011       1110
1/64    100       111100
1/64    101       111101
1/64    110       111110
1/64    111       111111
E[li]   3         2

H(X) = -Σi pi log pi = 2 bits
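The numbers in the table are easy to verify; a minimal Python sketch (variable names are my own):

```python
from math import log2

# Probabilities of the 8 horses and the two codes from the table above
p = [1/2, 1/4, 1/8, 1/16, 1/64, 1/64, 1/64, 1/64]
code1 = ["000", "001", "010", "011", "100", "101", "110", "111"]
code2 = ["0", "10", "110", "1110", "111100", "111101", "111110", "111111"]

# Entropy H(X) = -sum p_i log2 p_i
H = -sum(pi * log2(pi) for pi in p)

# Expected code lengths E[l] = sum p_i * l_i
L1 = sum(pi * len(c) for pi, c in zip(p, code1))
L2 = sum(pi * len(c) for pi, c in zip(p, code2))

print(H, L1, L2)  # 2.0 3.0 2.0
```

Code 2 matches the entropy exactly because every pi is a power of 1/2.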

How to find the best code?


Codes
A source code C for a random variable X is a mapping
  C : X → D*
where D* is the set of finite-length strings of symbols from the D-ary alphabet D
Code length: l(x) is the length of C(x)
Example: X = {red, blue}, D = {0, 1}, C(red) = 00, C(blue) = 11


Morse's code (1836)

A code for the English alphabet over a four-symbol alphabet
Developed for the electric telegraph system
D = {dot, dash, letter space, word space}
Short sequences represent frequent letters
Long sequences represent infrequent letters


Source coding applications

Magnetic recording: cassette, hard drive, USB...
Speech compression
Compact disc (CD)
Image compression: JPEG
Still an active area of research:
  Solid-state drives
  Sensor networks: distributed source coding


What defines a good code

Non-singular:
  x ≠ x' ⇒ C(x) ≠ C(x')
Non-singularity is enough to describe a single RV X

When we send a sequence of values of X without commas, can we still
uniquely decode?
Uniquely decodable if the extension of the code is nonsingular:
  C(x1)C(x2)· · ·C(xn)


X    Singular    Nonsingular, not        Uniquely decodable,    Prefix
                 uniquely decodable      not prefix
1    0           0                       10                     0
2    0           010                     00                     10
3    0           01                      11                     110
4    0           10                      110                    111
Uniquely decodable: only one possible source string can produce a given
encoded string

However, we may have to look at the entire string to decode
Prefix code (instantaneous code): no codeword is a prefix of any other
codeword
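The prefix condition is easy to test mechanically; a small sketch using the last two codes from the table above (`is_prefix_free` is my own helper name):

```python
def is_prefix_free(codewords):
    """Return True if no codeword is a prefix of another (instantaneous code)."""
    for i, a in enumerate(codewords):
        for j, b in enumerate(codewords):
            if i != j and b.startswith(a):
                return False
    return True

uniquely_decodable = ["10", "00", "11", "110"]  # uniquely decodable, not prefix
prefix_code = ["0", "10", "110", "111"]         # prefix (instantaneous)

print(is_prefix_free(uniquely_decodable))  # False ("11" is a prefix of "110")
print(is_prefix_free(prefix_code))         # True
```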

(Figure: nested code classes)
All codes ⊃ nonsingular codes ⊃ uniquely decodable codes ⊃ instantaneous codes


Expected code length

Expected length L(C) of a source code C for X with pmf p(x):
  L(C) = Σx∈X p(x) l(x)

We wish to construct instantaneous codes of minimum expected length


Kraft inequality
Due to Kraft (1949)
Code over an alphabet of size D
m codewords with lengths l1, . . . , lm
The codeword lengths of any instantaneous code must satisfy the Kraft
inequality
  Σi=1..m D^(-li) ≤ 1

Conversely, given l1, . . . , lm satisfying the Kraft inequality, we can
construct an instantaneous code with these lengths

Extends to uniquely decodable codes (McMillan inequality)
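A quick numerical check of the inequality for the codes seen earlier (a sketch; `kraft_sum` is my own helper name):

```python
def kraft_sum(lengths, D=2):
    """Sum of D^(-l_i); the Kraft inequality requires this to be <= 1."""
    return sum(D ** -l for l in lengths)

# Codeword lengths of the two horse-race codes from earlier
print(kraft_sum([3] * 8))                   # 1.0 (equality: 8 * 2^-3)
print(kraft_sum([1, 2, 3, 4, 6, 6, 6, 6]))  # 1.0 (also equality)
print(kraft_sum([1, 1, 2]))                 # 1.25 > 1: no instantaneous code exists
```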

Proof of Kraft inequality

Consider a D-ary tree
Each codeword is represented by a leaf node
The path from the root traces out the symbols of the codeword
Prefix code: no codeword is an ancestor of any other codeword on the
tree
Each codeword eliminates its descendants as candidate codewords


(Figure: code tree with the root at the top and codewords 10, 110, 111 at
leaf nodes)

Let lmax be the length of the longest codeword

A codeword at level li has D^(lmax - li) descendants at level lmax
The descendant sets must be disjoint, so
  Σi D^(lmax - li) ≤ D^(lmax)  ⇒  Σi D^(-li) ≤ 1

Converse: if l1, . . . , lm satisfy the Kraft inequality, label the first
available node at depth l1 as a codeword, remove its descendants, and continue
The argument extends to (countably) infinite prefix codes
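The converse construction (label a node, remove its descendants) can be written compactly as a canonical code assignment; a sketch for the binary case (D = 2), with names of my own choosing:

```python
def prefix_code_from_lengths(lengths):
    """Construct binary codewords with the given lengths: assign codewords
    in increasing order of length, and skip each assigned leaf together
    with its descendants. Assumes the Kraft inequality is satisfied."""
    assert sum(2 ** -l for l in lengths) <= 1, "lengths violate Kraft"
    codewords = []
    next_node = 0  # next free node at the current depth, as an integer
    prev_len = 0
    for l in sorted(lengths):
        next_node <<= (l - prev_len)  # descend to depth l
        codewords.append(format(next_node, f"0{l}b"))
        next_node += 1                # skip this leaf and its subtree
        prev_len = l
    return codewords

print(prefix_code_from_lengths([1, 2, 3, 3]))  # ['0', '10', '110', '111']
```

The result reproduces the prefix code from the earlier table.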

Optimal expected code length

One application of the Kraft inequality
The expected code length of a D-ary code is lower bounded by entropy:
  L ≥ HD(X)
Proof:
  L - HD(X) = Σi pi li - Σi pi logD (1/pi) = D(p||r) + logD (1/c) ≥ 0
where
  ri = D^(-li) / Σj D^(-lj),  c = Σi D^(-li) ≤ 1

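The identity in the proof can be checked numerically; a sketch reusing the horse-race Code 2, for which the bound is tight:

```python
from math import log2

p = [1/2, 1/4, 1/8, 1/16, 1/64, 1/64, 1/64, 1/64]
lengths = [1, 2, 3, 4, 6, 6, 6, 6]  # Code 2 codeword lengths

L = sum(pi * l for pi, l in zip(p, lengths))
H = -sum(pi * log2(pi) for pi in p)

c = sum(2 ** -l for l in lengths)          # Kraft sum
r = [2 ** -l / c for l in lengths]         # r_i = 2^(-l_i) / c
KL = sum(pi * log2(pi / ri) for pi, ri in zip(p, r))  # D(p||r)

# L - H = D(p||r) + log2(1/c); here the code is optimal, so both sides are 0
print(L - H, KL + log2(1 / c))
```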

Achieve the minimum code length if

c = 1: the Kraft inequality holds with equality
ri = pi: the D-adic approximation of the pmf is exact
How to construct such an optimal code?
Find the D-adic distribution closest to the distribution of X
Construct the code by the converse of the Kraft inequality


Construction of optimal codes

Finding the D-adic distribution closest to the distribution of X is
impractical: there is no obvious way to compute it
Good suboptimal procedures:
  Shannon-Fano coding
  Arithmetic coding
Optimal procedure: Huffman coding


First step: finding optimal code lengths

Solve the optimization problem
  minimize over li:  Σi=1..m pi li
  subject to:        Σi=1..m D^(-li) ≤ 1

Solve using a Lagrange multiplier:

  J = Σi=1..m pi li + λ (Σi=1..m D^(-li) - 1)
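Treating the li as real-valued, setting the derivative of J to zero fills in the step between the Lagrangian and the solution below:

```latex
\frac{\partial J}{\partial l_i} = p_i - \lambda D^{-l_i} \ln D = 0
\quad\Longrightarrow\quad
D^{-l_i} = \frac{p_i}{\lambda \ln D}
```

Summing over i and using the constraint with equality, Σi D^(-li) = 1, gives λ ln D = 1, hence D^(-li) = pi.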

Solution:

  li* = -logD pi

Achieves the lower bound:

  L* = Σi pi li* = -Σi pi logD pi = HD(X)

Problem: -logD pi may not be an integer!

Rounding up,

  li = ⌈logD (1/pi)⌉,

may not be optimal.

Usable code constructions?
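Rounding up gives the Shannon code lengths; a quick check, with an illustrative pmf of my own choosing, that the resulting expected length satisfies H ≤ L < H + 1:

```python
from math import ceil, log2

def shannon_lengths(p):
    """Code lengths l_i = ceil(log2(1/p_i)); they satisfy the Kraft
    inequality since 2^(-l_i) <= p_i, so an instantaneous code with
    these lengths exists."""
    return [ceil(log2(1 / pi)) for pi in p]

p = [0.4, 0.3, 0.2, 0.1]  # illustrative pmf
lengths = shannon_lengths(p)
L = sum(pi * l for pi, l in zip(p, lengths))
H = -sum(pi * log2(pi) for pi in p)

print(lengths)         # [2, 2, 3, 4]
print(H <= L < H + 1)  # True
```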

Summary
Nonsingular ⊃ uniquely decodable ⊃ instantaneous codes
Kraft inequality for instantaneous codes
Entropy is a lower bound on expected code length
