Finite Precision Numerical Effects
$$H(z) = \frac{\sum_{k=0}^{M} b_k z^{-k}}{1 - \sum_{k=1}^{N} a_k z^{-k}}$$
Quantization
$$\hat{H}(z) = \frac{\sum_{k=0}^{M} \hat{b}_k z^{-k}}{1 - \sum_{k=1}^{N} \hat{a}_k z^{-k}}$$
MATLAB Demo
Poles of Quantized Second-Order Sections
Consider a second-order system with a complex-conjugate pole pair
The pole locations after quantization will lie on the grid points
(Figures: pole grids for 3-bit and 7-bit quantization)
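The grid of reachable pole locations can be sketched numerically. The snippet below is a minimal illustration, assuming a direct-form denominator 1 - 2r cos(θ) z^-1 + r^2 z^-2 whose two coefficients are rounded to multiples of 2^-bits. The function name `direct_form_pole_grid` and the bit-counting convention are assumptions made for this sketch, not taken from the slides.

```python
import math

def direct_form_pole_grid(bits):
    """Upper-half-plane poles reachable when c1 = 2*r*cos(theta) and
    c2 = r**2 are multiples of 2**-bits (assumed fixed-point format)."""
    step = 2.0 ** -bits
    poles = []
    for i in range(0, 2 ** (bits + 1)):       # c1 in [0, 2): right half plane
        c1 = i * step
        for j in range(1, 2 ** bits):         # c2 in (0, 1): inside unit circle
            c2 = j * step
            disc = c2 - (c1 / 2) ** 2
            if disc > 0:                      # keep complex-conjugate pairs only
                poles.append((c1 / 2, math.sqrt(disc)))
    return poles

grid = direct_form_pole_grid(3)
# Distinct pole radii: the square roots of the quantized r**2 values.
radii = sorted({round(math.hypot(re, im), 9) for re, im in grid})
```

Under these assumptions the poles fall on circles of radius sqrt(c2) crossed with vertical lines at c1/2, so the grid is non-uniform: the circles are more widely spaced near the origin than near the unit circle.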
Coupled-Form Implementation of Complex-Conjugate Pair
Equivalent implementation of the second-order system
But the quantization grid this time is uniform (figure not reproduced)
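For comparison, here is a minimal sketch of the coupled-form grid, assuming the quantized coefficients are r cos(θ) and r sin(θ), each rounded to a multiple of 2^-bits. The function name `coupled_form_pole_grid` is hypothetical.

```python
def coupled_form_pole_grid(bits):
    """Upper-half-plane poles reachable when r*cos(theta) and r*sin(theta)
    are multiples of 2**-bits (assumed fixed-point format)."""
    step = 2.0 ** -bits
    levels = range(-(2 ** bits) + 1, 2 ** bits)
    return [(m * step, n * step)
            for m in levels
            for n in levels
            if n > 0 and (m * step) ** 2 + (n * step) ** 2 < 1.0]

coupled = coupled_form_pole_grid(3)
real_parts = sorted({re for re, _ in coupled})
imag_parts = sorted({im for _, im in coupled})
```

Because the real and imaginary parts of the pole are quantized directly, the reachable poles form a rectangular grid with equal spacing in both directions.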
Effects of Coefficient Quantization in FIR Systems
No poles to worry about, only zeros
Direct form is commonly used for FIR systems
Suppose the coefficients are quantized
The quantized system is linearly related to the quantization error
Again, the quantization error is higher for clustered zeros
However, most FIR filters have spread zeros
$$H(z) = \sum_{n=0}^{M} h[n]\, z^{-n}$$
$$\hat{H}(z) = \sum_{n=0}^{M} \hat{h}[n]\, z^{-n} = H(z) + \Delta H(z)$$
$$\Delta H(z) = \sum_{n=0}^{M} \Delta h[n]\, z^{-n}$$
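The linear relation between the quantized system and the coefficient error can be checked numerically. The sketch below assumes rounding to multiples of 2^-bits, so each coefficient error is at most 2^-(bits+1) in magnitude. The coefficient values in `h` are made up for illustration and `quantize` is a hypothetical helper.

```python
import cmath
import math

def quantize(x, bits):
    """Round x to the nearest multiple of 2**-bits (assumed rounding rule)."""
    step = 2.0 ** -bits
    return math.floor(x / step + 0.5) * step

def freq_response(h, w):
    """Evaluate H(e^{jw}) = sum_n h[n] e^{-jwn}."""
    return sum(hn * cmath.exp(-1j * w * n) for n, hn in enumerate(h))

h = [0.0412, 0.2317, 0.4542, 0.2317, 0.0412]   # hypothetical FIR coefficients
bits = 7
h_hat = [quantize(v, bits) for v in h]
dh = [q - v for q, v in zip(h_hat, h)]          # coefficient errors

# Worst-case bound on |Delta H|: (M + 1) coefficients, each off by at most step/2.
bound = len(h) * 2.0 ** -(bits + 1)
freqs = [math.pi * k / 64 for k in range(65)]
worst = max(abs(freq_response(dh, w)) for w in freqs)
```

Here `worst` stays below `bound` at every sampled frequency, and the error response computed from `dh` equals the difference of the two frequency responses up to floating-point error.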
MATLAB Demo
Round-Off Noise
Round-Off Noise in Digital Filters
Difference equations implemented with finite-precision arithmetic are nonlinear systems
(Figure: second-order direct form I system)
(Figure: model with quantization effect)
(Figure: density function of the error terms for rounding)
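As a short worked step, the variance of a single rounding error follows from its density function. This assumes rounding to B fractional bits, so the step size is Δ = 2^-B and the error is uniformly distributed over one step centered at zero (the symbol Δ is introduced here for illustration):

```latex
\sigma_e^2 = \int_{-\Delta/2}^{\Delta/2} e^2 \,\frac{1}{\Delta}\, de
           = \frac{\Delta^2}{12}
           = \frac{2^{-2B}}{12},
\qquad \Delta = 2^{-B}
```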
Analysis of Quantization Error
Combine all error terms into a single location to get
$$e[n] = e_0[n] + e_1[n] + e_2[n] + e_3[n] + e_4[n]$$
The variance of e[n] in the general case is
$$\sigma_e^2 = (M + 1 + N)\,\frac{2^{-2B}}{12}$$
The contribution of e[n] to the output is
$$f[n] = \sum_{k=1}^{N} a_k f[n-k] + e[n]$$
The variance of the output error term f[n] is
$$\sigma_f^2 = (M + 1 + N)\,\frac{2^{-2B}}{12} \sum_{n=-\infty}^{\infty} \left| h_{ef}[n] \right|^2, \qquad H_{ef}(z) = \frac{1}{A(z)}$$
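The output noise variance can be evaluated numerically by generating the impulse response of 1/A(z) with the same recursion as f[n] and summing its squares. This is a minimal sketch assuming A(z) = 1 - sum_k a_k z^-k, a stable filter, and a truncated sum. The function name and the `n_terms` parameter are hypothetical.

```python
def output_noise_variance(a, num_b_coeffs, bits, n_terms=2000):
    """Approximate sigma_f^2 for a direct form I filter.

    a            : [a_1, ..., a_N], feedback coefficients of A(z)
    num_b_coeffs : M + 1, the number of feedforward multipliers
    bits         : B, so each rounding error has variance 2**(-2B) / 12
    n_terms      : truncation length of the infinite sum (assumption)
    """
    n_sources = num_b_coeffs + len(a)             # M + 1 + N noise sources
    sigma_e2 = n_sources * 2.0 ** (-2 * bits) / 12.0

    # Impulse response of H_ef(z) = 1/A(z): h[n] = sum_k a_k h[n-k] + delta[n]
    h = []
    for n in range(n_terms):
        acc = 1.0 if n == 0 else 0.0
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                acc += ak * h[n - k]
        h.append(acc)
    return sigma_e2 * sum(v * v for v in h)

# Example: one feedforward and one feedback coefficient, 8 fractional bits.
var_first_order = output_noise_variance([0.5], 1, 8)
```

The truncation is only reasonable when the poles are well inside the unit circle. Poles close to the unit circle need a longer `n_terms`.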
Round-Off Noise in a First-Order System
Suppose we want to implement the following stable system
$$H(z) = \frac{b}{1 - a z^{-1}}, \qquad |a| < 1$$
The quantization error noise variance is
$$\sigma_f^2 = (M + 1 + N)\,\frac{2^{-2B}}{12} \sum_{n=-\infty}^{\infty} \left| h_{ef}[n] \right|^2 = 2\,\frac{2^{-2B}}{12} \sum_{n=0}^{\infty} |a|^{2n} = 2\,\frac{2^{-2B}}{12} \left( \frac{1}{1 - |a|^2} \right)$$
Noise variance increases as the pole at z = a gets closer to the unit circle
As |a| gets closer to 1 we have to use more bits to compensate for the increasing error
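To make the bit-count trade-off concrete, the closed-form variance can be inverted to estimate how many additional bits keep the noise constant when the pole moves outward. This is a rough sketch. The function names and the example values 0.9 and 0.99 are chosen for illustration only.

```python
import math

def first_order_noise_variance(a, bits):
    """sigma_f^2 = 2 * (2**(-2B) / 12) / (1 - |a|**2) for the first-order system."""
    return 2 * (2.0 ** (-2 * bits) / 12.0) / (1.0 - abs(a) ** 2)

def extra_bits(a_ref, a_new):
    """Additional (possibly fractional) bits needed at a_new to match the
    noise variance obtained at a_ref with the original word length."""
    ratio = (1.0 - abs(a_ref) ** 2) / (1.0 - abs(a_new) ** 2)
    return 0.5 * math.log2(ratio)

delta_b = extra_bits(0.9, 0.99)
v_ref = first_order_noise_variance(0.9, 8)
v_new = first_order_noise_variance(0.99, 8 + delta_b)
```

Since the variance scales as 2^(-2B), every extra bit lowers it by a factor of four, which is why the required increase grows only logarithmically in 1 / (1 - |a|^2).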
Zero-Input Limit Cycles
Zero-Input Limit Cycles in Fixed-Point Realization of IIR Filters
For stable IIR systems the output will decay to zero when the input becomes zero
A finite-precision implementation, however, may continue to oscillate indefinitely
Nonlinear behaviour is very difficult to analyze, so we will study it by example
Example: Limit Cycle Behavior in First-Order Systems
Assume x[n] and y[n-1] are implemented by 4-bit registers
$$y[n] = a\,y[n-1] + x[n], \qquad |a| < 1$$
Example (continued)
Assume that a = 1/2 = 0.100b and the input is
$$x[n] = \frac{7}{8}\,\delta[n] = (0.111\text{b})\,\delta[n]$$
with the system
$$y[n] = a\,y[n-1] + x[n], \qquad |a| < 1$$
If we calculate the output for successive values of n
n    y[n]                Q(y[n])
0    7/8  = 0.111b       7/8 = 0.111b
1    7/16 = 0.011100b    1/2 = 0.100b
2    1/4  = 0.010000b    1/4 = 0.010b
3    1/8  = 0.001000b    1/8 = 0.001b
4    1/16 = 0.000100b    1/8 = 0.001b
A finite-duration input caused an oscillation with period 1: the output stays at 1/8 instead of decaying to zero
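The example can be reproduced with exact rational arithmetic. This is a minimal sketch, assuming 3 fractional bits plus a sign bit and a tie rule that rounds halfway values away from zero. The helper names are hypothetical.

```python
from fractions import Fraction
import math

def round_to_bits(x, frac_bits=3):
    """Round to the nearest multiple of 2**-frac_bits.
    Ties are rounded away from zero (an assumed tie rule)."""
    sign = -1 if x < 0 else 1
    magnitude = math.floor(abs(x) * 2 ** frac_bits + Fraction(1, 2))
    return Fraction(sign * magnitude, 2 ** frac_bits)

def simulate_first_order(a, x, n_samples):
    """Run y[n] = Q(a * y[n-1] + x[n]) with zero initial state;
    x is treated as zero beyond its given length."""
    y_prev = Fraction(0)
    outputs = []
    for n in range(n_samples):
        xn = x[n] if n < len(x) else Fraction(0)
        y = round_to_bits(a * y_prev + xn)
        outputs.append(y)
        y_prev = y
    return outputs

ys = simulate_first_order(Fraction(1, 2), [Fraction(7, 8)], 8)
```

With these assumptions the quantized output settles at 1/8 from n = 3 onward, because 1/16 lies exactly halfway between 0 and 1/8 and is rounded back up on every step.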
MATLAB Demo
Limit Cycles due to Overflow
Example: Limit Cycles due to Overflow
Consider a second-order system realized by
$$y[n] = x[n] + Q(a_1\,y[n-1]) + Q(a_2\,y[n-2])$$
Where Q() represents two's-complement rounding
Word length is chosen to be 4 bits
Assume a_1 = 3/4 = 0.110b and a_2 = -3/4 = 1.010b
Also assume
$$y[-1] = 3/4 = 0.110\text{b} \quad \text{and} \quad y[-2] = -3/4 = 1.010\text{b}$$
With zero input, the output at sample n=0 is
$$y[0] = 0.110\text{b} \times 0.110\text{b} + 1.010\text{b} \times 1.010\text{b} = 0.100100\text{b} + 0.100100\text{b}$$
After rounding up we get
$$y[0] = 0.101\text{b} + 0.101\text{b} = 1.010\text{b} = -3/4$$
Binary carry overflows into the sign bit, changing the sign
When repeated for n=1
$$y[1] = 1.011\text{b} + 1.011\text{b} = 0.110\text{b} = 3/4$$
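The overflow oscillation can be simulated with exact fractions. This is a sketch under stated assumptions: 3 fractional bits plus a sign bit, halfway values rounded away from zero, zero input, and an adder that wraps into [-1, 1) as two's-complement addition does. The helper names are hypothetical.

```python
from fractions import Fraction
import math

def round_to_bits(x, frac_bits=3):
    """Round to the nearest multiple of 2**-frac_bits.
    Ties are rounded away from zero (an assumed tie rule)."""
    sign = -1 if x < 0 else 1
    magnitude = math.floor(abs(x) * 2 ** frac_bits + Fraction(1, 2))
    return Fraction(sign * magnitude, 2 ** frac_bits)

def wrap_twos_complement(x):
    """Wrap a sum into [-1, 1), mimicking overflow into the sign bit."""
    return (x + 1) % 2 - 1

def simulate_zero_input(a1, a2, y1, y2, n_samples):
    """Run y[n] = Q(a1*y[n-1]) + Q(a2*y[n-2]) with wraparound addition.
    y1 and y2 are the initial values y[-1] and y[-2]."""
    outputs = []
    for _ in range(n_samples):
        y = wrap_twos_complement(round_to_bits(a1 * y1) + round_to_bits(a2 * y2))
        outputs.append(y)
        y1, y2 = y, y1
    return outputs

ys = simulate_zero_input(Fraction(3, 4), Fraction(-3, 4),
                         Fraction(3, 4), Fraction(-3, 4), 6)
```

Under these assumptions the output alternates between -3/4 and 3/4 indefinitely, even though the input is zero.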
Avoiding Limit Cycles
Desirable to get zero output for zero input: avoid limit cycles
Generally, adding more bits would avoid overflow
Using double-length accumulators at addition points would decrease the likelihood of limit cycles
Trade-off between limit-cycle avoidance and complexity
FIR systems cannot support zero-input limit cycles
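The last point can be illustrated directly: an FIR filter has no feedback, so once the input has been zero for as many samples as the filter has delays, every product is zero and the quantized output is exactly zero. This is a minimal sketch with made-up coefficients and hypothetical helper names, using the same assumed rounding rule (3 fractional bits, ties away from zero).

```python
from fractions import Fraction
import math

def round_to_bits(x, frac_bits=3):
    """Round to the nearest multiple of 2**-frac_bits, ties away from zero."""
    sign = -1 if x < 0 else 1
    magnitude = math.floor(abs(x) * 2 ** frac_bits + Fraction(1, 2))
    return Fraction(sign * magnitude, 2 ** frac_bits)

def quantized_fir(h, x, n_samples):
    """y[n] = Q(sum_k Q(h[k] * x[n-k])), with x zero outside its given range."""
    outputs = []
    for n in range(n_samples):
        acc = Fraction(0)
        for k, hk in enumerate(h):
            xn = x[n - k] if 0 <= n - k < len(x) else Fraction(0)
            acc += round_to_bits(hk * xn)     # round each product
        outputs.append(round_to_bits(acc))
    return outputs

h = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8)]   # hypothetical coefficients
ys = quantized_fir(h, [Fraction(7, 8)], 8)
```

With a single-sample input and three coefficients, every output from n = 3 onward is exactly zero regardless of the rounding rule.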