Lempel Ziv
Lempel-Ziv-Welch Coding
LZW starts out with a dictionary of 256 entries (in
the case of 8-bit characters), one per possible single
character, and uses those as the "standard" character set.
It then reads data 8 bits at a time (e.g., 't', 'r', etc.) and
encodes the data as the number that represents its
index in the dictionary.
Every time it comes across a new substring (say, "tr"), it
outputs the code for the part it has already seen, adds
the new substring to the dictionary, and starts again
from the last character read. Every time it comes across a
substring it has already seen, it just reads in a new
character and concatenates it with the current string to
get a new substring.
The next time LZW revisits a substring, it will be
encoded using a single number.
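The encoding loop described above can be sketched in Python. This is a minimal illustration under stated assumptions, not a reference implementation: the name `lzw_encode` and the optional `alphabet` parameter (which shrinks the initial dictionary to a few symbols for small examples) are hypothetical, and codes are returned as a list of integers rather than packed into bits.

```python
def lzw_encode(text, alphabet=None):
    """Return the list of LZW codes for `text`.

    `alphabet` (hypothetical parameter) lists the single characters of the
    initial dictionary; by default all 256 8-bit character values are used.
    Every character of `text` is assumed to be in the alphabet.
    """
    if alphabet is None:
        alphabet = [chr(i) for i in range(256)]
    # Initial dictionary: each single character maps to its index.
    dictionary = {ch: i for i, ch in enumerate(alphabet)}

    current = ""   # longest already-seen string read so far
    codes = []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            # Seen before: keep extending the current string.
            current = candidate
        else:
            # New substring: emit the code for the known prefix,
            # register the new substring, restart from this character.
            codes.append(dictionary[current])
            dictionary[candidate] = len(dictionary)
            current = ch
    if current:
        # End of input: emit the code for whatever is still pending.
        codes.append(dictionary[current])
    return codes
```

With a five-symbol initial dictionary a=0, b=1, d=2, n=3, _=4, the call `lzw_encode("banana_bandana", "abdn_")` returns 1,0,3,6,0,4,5,3,2,8.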
Example: encoding "banana_bandana". The codes in the table imply
an initial dictionary of a=0, b=1, d=2, n=3, _=4.

Input            Current String   Seen this Before?   Encoded Output        New Dictionary Entry/Index
b                b                yes                 nothing               none
ba               ba               no                  1                     ba / 5
ban              an               no                  1,0                   an / 6
bana             na               no                  1,0,3                 na / 7
banan            an               yes                 no change             none
banana           ana              no                  1,0,3,6               ana / 8
banana_          a_               no                  1,0,3,6,0             a_ / 9
banana_b         _b               no                  1,0,3,6,0,4           _b / 10
banana_ba        ba               yes                 no change             none
banana_ban       ban              no                  1,0,3,6,0,4,5         ban / 11
banana_band      nd               no                  1,0,3,6,0,4,5,3       nd / 12
banana_banda     da               no                  1,0,3,6,0,4,5,3,2     da / 13
banana_bandan    an               yes                 no change             none
banana_bandana   ana              yes                 1,0,3,6,0,4,5,3,2,8   none

At the end of the input, the pending string "ana" is emitted as
its dictionary index, 8.
Example: encoding "abababab" (a=0, b=1).

Input      Seen this Before?   Encoded Output   New Dictionary Entry/Index
a          yes                 nothing          none
ab         no                  0                ab/2
aba        no                  0,1              ba/3
abab       yes                 no change        none
ababa      no                  0,1,2            aba/4
ababab     yes                 no change        none
abababa    yes                 no change        none
abababab   no                  0,1,2,4          abab/5
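Step tables like the ones above can be regenerated mechanically. The sketch below records one row per input character; the name `lzw_trace` and the row layout are assumptions chosen for illustration. Like the tables, it stops at the last input character and does not include the final flush of the pending string.

```python
def lzw_trace(text, alphabet):
    """Return one row per input character:
    (input so far, current string, seen before?, codes so far, new entry).
    """
    dictionary = {ch: i for i, ch in enumerate(alphabet)}
    current, codes, rows = "", [], []
    for pos, ch in enumerate(text, start=1):
        candidate = current + ch
        seen = candidate in dictionary
        new_entry = None
        if seen:
            current = candidate
        else:
            codes.append(dictionary[current])        # emit the known prefix
            dictionary[candidate] = len(dictionary)  # register new substring
            new_entry = (candidate, dictionary[candidate])
            current = ch
        rows.append((text[:pos], candidate, seen, list(codes), new_entry))
    return rows


if __name__ == "__main__":
    for row in lzw_trace("abababab", "ab"):
        print(row)
```

The second field of each row is the string that was looked up at that step, which is what the "Seen this Before?" answer refers to.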
Uncompression
The uncompression process for LZW is also
straightforward. In addition, it has an advantage
over static compression methods: no dictionary or
other overhead information has to be stored or
sent along with the compressed data for the
decoding algorithm.
A dictionary identical to the one created during
compression is reconstructed during decoding.
Both encoding and decoding programs
must start with the same initial dictionary
(in the standard case, all 256 possible 8-bit
character values).
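A decoder that rebuilds the dictionary while it reads the codes might look like the following sketch. The name `lzw_decode` and the `alphabet` parameter are hypothetical, and the input is assumed to be a list of integer codes. Each new entry is the previously decoded string plus the first character of the string just decoded, which mirrors what the encoder added at the same step.

```python
def lzw_decode(codes, alphabet=None):
    """Rebuild the original text from a list of LZW codes.

    `alphabet` (hypothetical parameter) must match the initial dictionary
    used by the encoder; by default all 256 8-bit character values.
    """
    if alphabet is None:
        alphabet = [chr(i) for i in range(256)]
    # Same initial dictionary as the encoder, but index -> string.
    dictionary = {i: ch for i, ch in enumerate(alphabet)}
    if not codes:
        return ""

    previous = dictionary[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == len(dictionary):
            # Code for the entry that is about to be created.
            entry = previous + previous[0]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        output.append(entry)
        # Recreate the entry the encoder added at this point.
        dictionary[len(dictionary)] = previous + entry[0]
        previous = entry
    return "".join(output)
```

Called as `lzw_decode([1, 0, 3, 6, 0, 4, 5, 3, 2, 8], "abdn_")`, this returns "banana_bandana" without ever being given the entries 5 through 13.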
Example: decoding 0,1,2,4 (a=0, b=1).

Input      Decoded Output   Current String   Dictionary Entry/Index
0          0=a              none             none
0,1        1=b              ab               ab/2
0,1,2      2=ab             abab             ba/3
0,1,2,4    4=???            abab???          ab???
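The last row raises the question of what to do when a code arrives before the decoder has created its entry. The usual resolution is that such a code can only refer to the entry the encoder made on the very step it emitted it, so the string must be the previous string followed by its own first character. The sketch below replays just that step; `resolve_code` is a hypothetical helper name, and the dictionary state is written out by hand from the table.

```python
def resolve_code(dictionary, code, previous):
    """Return the string for `code`, covering the not-yet-defined case."""
    if code in dictionary:
        return dictionary[code]
    if code == len(dictionary):
        # The encoder added this entry and used it immediately, so it is
        # the previous string plus that string's first character.
        return previous + previous[0]
    raise ValueError(f"invalid LZW code: {code}")


# Decoder state after reading 0,1,2 with initial dictionary a=0, b=1.
dictionary = {0: "a", 1: "b", 2: "ab", 3: "ba"}
previous = "ab"  # string decoded from code 2

entry = resolve_code(dictionary, 4, previous)
# New entry: previous string plus first character of the decoded string.
dictionary[len(dictionary)] = previous + entry[0]
decoded = "a" + "b" + "ab" + entry
```

Under this rule code 4 decodes to "aba", the new dictionary entry is aba/4 (matching the encoder's table), and the decoded string so far is "abababa".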