Data Compression Unit-1 - 1
Introduction:
A compression algorithm (or compression technique)
takes an input X and generates a representation Xc that
requires fewer bits. A corresponding reconstruction
algorithm operates on the compressed
representation Xc to generate the reconstruction Y.
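The X, Xc, Y pipeline can be sketched in a few lines of Python. This is only an illustration: it uses the standard-library zlib module as a stand-in compression technique, and the function names compress and reconstruct and the sample input are assumptions made for this example.

```python
import zlib

def compress(x: bytes) -> bytes:
    """Compression algorithm: input X -> compressed representation Xc."""
    return zlib.compress(x)

def reconstruct(xc: bytes) -> bytes:
    """Reconstruction algorithm: Xc -> reconstruction Y."""
    return zlib.decompress(xc)

# Hypothetical, highly repetitive input so that compression is visible.
x = b"abracadabra " * 100
xc = compress(x)
y = reconstruct(xc)

print("bits before compression:", len(x) * 8)
print("bits after compression: ", len(xc) * 8)
print("Y identical to X:", y == x)
```

Because zlib is a lossless scheme, Y matches X exactly here; a lossy scheme would not guarantee that.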
Data Compression
Types or Classes of Data Compression:
Based on the requirements of reconstruction, data
compression schemes can be divided into two broad
classes:
• Lossless Compression
– Schemes in which the reconstruction Y is identical to X.
• Lossy Compression
– Schemes that generally provide much higher
compression than lossless compression,
but allow Y to be different from X.
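A rough sketch of the lossy case, assuming 8-bit samples and a simple uniform quantizer (the quantizer, the step size, and the sample values are assumptions chosen for illustration, not a scheme taken from these notes). Each sample is replaced by a coarser index that needs fewer bits, so Y can only approximate X.

```python
def quantize(samples, step):
    """Lossy step: keep only the index of the interval each sample falls in."""
    return [s // step for s in samples]

def dequantize(indices, step):
    """Reconstruct each sample as the midpoint of its interval."""
    return [i * step + step // 2 for i in indices]

# Hypothetical 8-bit samples (values 0..255).
x = [12, 15, 19, 200, 203, 90]
step = 8                      # 256 / 8 = 32 intervals, so 5 bits per index

xc = quantize(x, step)        # compressed representation Xc
y = dequantize(xc, step)      # reconstruction Y

print("X :", x)
print("Xc:", xc)
print("Y :", y)
print("Y identical to X:", y == x)
```

With this quantizer each index fits in 5 bits instead of 8, and every reconstructed sample is within step // 2 of the original.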
Measure of Performance:
A compression algorithm can be evaluated in a
number of different ways:
– The memory required to implement the algorithm.
– How fast the algorithm performs on a given machine.
– The amount of compression.
– How closely the reconstruction resembles the
original.
• Compression Ratio:
– The ratio of the number of bits required to represent the data
before compression to the number of bits required to
represent the data after compression.
• Rate:
– Another way of reporting compression performance is to
provide the average number of bits required to represent a
single sample.
• Distortion:
– The difference between the original and the reconstruction is
often called the distortion.
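The three measures above can be computed as in the following sketch. The function names and the example figures are assumptions; mean squared error is used here as one common way to quantify distortion, which these notes do not prescribe.

```python
def compression_ratio(bits_before, bits_after):
    """Bits before compression divided by bits after compression."""
    return bits_before / bits_after

def rate(bits_after, num_samples):
    """Average number of bits used per sample after compression."""
    return bits_after / num_samples

def mean_squared_error(x, y):
    """One possible distortion measure between original x and reconstruction y."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)

# Hypothetical example: a 256 x 256 image at 8 bits per pixel
# that compresses from 65,536 bytes to 16,384 bytes.
num_pixels = 256 * 256
bits_before = num_pixels * 8
bits_after = 16_384 * 8

print("compression ratio:", compression_ratio(bits_before, bits_after))  # 4.0, i.e. 4:1
print("rate (bits/pixel):", rate(bits_after, num_pixels))                # 2.0
print("distortion (MSE): ", mean_squared_error([0, 0], [3, 4]))          # 12.5
```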
Modeling and Coding:
The development of data compression algorithms
for a variety of data can be divided into two
phases.
• Modeling
– In this phase we try to extract information about any redundancy that exists
in the data and describe the redundancy in the form of a model.
• Coding
– A description of the model and a “description” of how the data differ from
the model are encoded, generally using a binary alphabet.
(The difference between the data and the model is often referred to as the
residual.)
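As a small illustration of the two phases, the sketch below assumes a short integer sequence and a straight-line model n + 8 for the n-th value (both are assumptions chosen for this example). The residual takes far fewer distinct values than the data, so it can be coded with fewer bits per sample.

```python
def model(n):
    """Hypothetical model: predict the n-th sample (n = 1, 2, ...) as n + 8."""
    return n + 8

# Illustrative data that roughly follows a straight line.
data = [9, 11, 11, 11, 14, 13, 15, 17, 16, 17, 20, 21]

# Modeling: describe the data as "model + residual".
residual = [x - model(n) for n, x in enumerate(data, start=1)]

# The decoder knows the model, so it only needs the residual.
reconstructed = [model(n) + r for n, r in enumerate(residual, start=1)]

print("residual:", residual)
print("distinct residual values:", sorted(set(residual)))
print("lossless:", reconstructed == data)
```

Here the residual only takes the values -1, 0, and 1, so 2 bits per sample are enough, whereas the raw values (up to 21) would need 5 bits each in a fixed-length binary code.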
Information Theory:
Information theory is used in the development of
lossless compression techniques.
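Self Information i(A):
Suppose we have an event A, which is a set of outcomes of
some random experiment. If P(A) is the probability that the
event A will occur, then the self information associated with
A is given by
i(A) = log_b (1 / P(A)) = -log_b P(A)
where the base b of the logarithm sets the unit of
information (b = 2 gives bits).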
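In the standard definition, i(A) = log_b (1 / P(A)) = -log_b P(A), so less probable events carry more information. A minimal Python sketch follows; the function name self_information and the default base of 2 (information measured in bits) are assumptions for this example.

```python
import math

def self_information(p, base=2):
    """Self information -log_base(p) of an event with probability p."""
    if not 0 < p <= 1:
        raise ValueError("probability must be in the interval (0, 1]")
    return -math.log(p) / math.log(base)

# A certain event carries no information; rarer events carry more.
for p in (1.0, 0.5, 0.25, 0.125):
    print(f"P(A) = {p:<6} i(A) = {self_information(p):.3f} bits")
```

With base 2, an event of probability 1/2 carries 1 bit, and halving the probability adds one more bit.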
The self information associated with the occurrence of
both events A and B