Laboratory Manual
First Edition
August 2002
The University of Michigan. All rights reserved.
Preface
This laboratory manual was written during the first three semesters that EECS 206 was taught at
the University of Michigan. It represents an effort to provide hands-on experience with signals and
systems concepts and engineering practice by working with the MATLAB mathematics environment. The
specific goals are:
• To acquaint students with a number of problems/tasks addressed by signals and systems engineering,
and with some of the approaches to these problems.
• To involve students in the design, implementation, and testing of systems that address some
signals and systems engineering tasks.
• To familiarize students with the use of MATLAB as a primary prototyping tool for signals and
systems engineering.
Although this is not a “programming” class, it is important that students be able to do things for
themselves, such as implement a solution to some basic signals and systems task. In signals and systems
engineering, this often involves programming in a language like MATLAB. However, for these laboratories
we have attempted to limit the amount of “programming for the sake of programming,”
which is better obtained in a true programming course. What remains should allow students to gain
facility with MATLAB without requiring advanced programming skills.
The lab assignments presume that students have had some significant programming experience
(e.g., a first course at the freshman level) and some experience with MATLAB (e.g., two to three weeks
of coverage in a first programming course). This prerequisite notwithstanding, the lab manual begins
with a tutorial, which serves to review MATLAB and to emphasize the constructs needed in these
assignments. It has been found that students with significant programming experience but no prior
MATLAB exposure can also succeed in these laboratory assignments, provided they make the extra effort
to focus strongly on MATLAB during the first couple of weeks of the course. That is, MATLAB is
readily learnable by people familiar with another programming language.
The laboratory assignments are intended to be mostly self-contained. To this end, each contains
a substantial amount of background material. This material highlights important theoretical
concepts, introduces specific signals and systems problems/tasks, and describes the specific approaches
to their solution examined in the assignments. In some cases, the background material
for a lab is meant as a reference rather than as strictly necessary for the completion
of the laboratory assignment. In other cases, the background material describes an approach
that you will use in the laboratory assignment.
Each lab assignment also contains a MATLAB section introducing commands or techniques that
will be important in the assignment, a demo section listing the demonstrations that will take place
in the lab session, and an assignment section listing exactly what must be done. Note that the bullets
indicate items to be included in your lab report.
It is highly recommended that you read through each laboratory before arriving at the laboratory
session in which you will begin the lab. This will not only give you a better foundation to understand
the material in the laboratories, but it will allow you to complete the laboratory more quickly once
you have begun working on it.
Commensurate with the first listed goal, all of the laboratories are meant to reinforce key concepts
of the course. However, the presentation will often be somewhat different from that of the
lecture or textbook. For instance, we develop convolution and filtering by connecting them to the operation
of correlation, which we present in Lab 2. We also use the idea of correlation to motivate the
key concepts of spectrum and the Fourier series. In other cases, we use the visualization capabilities
of MATLAB to help develop an intuitive sense of how systems “work.” For instance, Lab 9 uses a
GUI to graphically show the effects of poles and zeros on the frequency response of a filter.
While the laboratories reinforce material from the lectures and textbook, commensurate with the
second and third goals listed earlier, they also go beyond them in numerous places. For instance, the
ideas of detection and classification form a common theme throughout the laboratories. These ideas
are not commonly introduced at the undergraduate level, but they form an important component
of signals and systems engineering. As another example, Lab 5 develops a transform-based image
encoder, similar to JPEG. We also work with two-dimensional signals (images) in Labs 5 and 6,
rather than solely concentrating on one-dimensional signals.
To a great extent, the amount you will get out of these laboratories depends upon the amount
you put into them. A wide variety of topics is covered in these labs. We have necessarily not
examined them in great depth, but we wish to encourage further thought and exploration into many
of them. In many of the labs, you will see items labeled “food for thought.” These are exercises
that will lead you to examine other aspects of a problem, often in greater depth than in the actual
assignment. While these “food for thought” items are in no way required, we strongly recommend
that you look at them and discuss them with your lab instructors and peers. Hopefully, you will
find many ideas and applications in this course that will interest you and encourage you to explore
further.
A note about the “electronic” portion of this laboratory manual: each laboratory involves the
use of MATLAB code, data files, and programs that must be downloaded from the course web page.
These programs were developed using MATLAB 6 (Release 12) on a Windows 2000 platform.
While most of the code should work on any version of MATLAB, some (most notably the GUI
programs) require MATLAB 6 or greater. Additionally, we have provided “compiled” MEX-file
versions of many of the programs that you will be writing code to complete. This allows you to
check the results produced by your code against “correct” code, and also gives you a way to continue
working on the laboratory even if you cannot get your code working. Note, however, that these
programs are compiled as Windows .dll files. As such, they will ONLY operate on a Windows-based
operating system. In general, we recommend that you use CAEN machines with the latest
version of Windows.
Remember that these laboratories are covered by the College of Engineering Honor Code. In
particular, it is a violation of the Honor Code to work on these laboratories with others, unless they
are members of the lab group to which you are assigned. Further, using, or in any way deriving
advantage from, solutions from previous terms is a violation. If you have any questions about how
the Honor Code applies to this class, talk to your instructors.
Finally, we would like to acknowledge all of those who helped us during the development of
these laboratories. In particular, we would like to thank Professors Stephane Lafortune and Jeffrey
Fessler for their input and comments on these laboratories. We would also like to thank the GSIs who
Contents

An Introduction to MATLAB
1 What is MATLAB?
   1.1 MATLAB is a mathematics environment
   1.2 MATLAB is a tool for visualizing data
   1.3 MATLAB is a prototyping language
   1.4 MATLAB can do more...
2 Demos for the first tutorial lab section
3 Using MATLAB: The basics
   3.1 Starting MATLAB
   3.2 How to get help
   3.3 Using MATLAB as a calculator (with variables)
4 Vectors, Matrices, and Arrays
   4.1 Constructing arrays
   4.2 Concatenating arrays
   4.3 Transposition and “flipping” arrays
   4.4 Building large arrays
   4.5 The colon operator
5 Array Arithmetic
6 Indexing
   6.1 Basic indexing
   6.2 Single number indexing
   6.3 Vector indexing
   6.4 Finding the size of an array
   6.5 Vector indexing to modify arrays
   6.6 Conditional statements and the “find” command
7 Data Visualization
   7.1 Using “plot”
   7.2 Interpolation; line and point styles
   7.3 Axis labels and titles
   7.4 Commands related to “plot”
   7.5 Plotting with an x-axis
1 What is MATLAB?
The MathWorks, Inc., maker of MATLAB, claims that MATLAB is “the language of technical computing.”
By and large, they are right. MATLAB is widely used in a great number of scientific fields.
For those who work with signals and systems, MATLAB is a de facto standard. Engineers from a
wide array of disciplines, in both academia and industry, use MATLAB on a regular basis. As
such, a knowledge of MATLAB will be useful not only for this course, but for future courses and for
your career as a whole. One of the main reasons for MATLAB’s popularity is its wide array
of uses. So what is MATLAB?
1.1 MATLAB is a mathematics environment that can easily handle vectors and matrices
MATLAB was originally written to provide an easy-to-use interface to the mathematical subroutines
included in LINPACK and EISPACK. These two packages are sets of subroutines written in FORTRAN
for a wide variety of linear algebra operations. MATLAB’s original focus on linear algebra
means that it has very well developed capabilities for handling vectors and matrices.1 In fact, MATLAB
is short for “Matrix Laboratory.” For our purposes, both vectors and matrices are examples of
signals – a mathematical environment that can easily handle vectors and matrices makes working
with signals just as easy.
Let’s look at an example to see exactly what this buys us. Suppose that we have two signals, x
and y, each of which is simply an array with 100 elements. How would we add these signals in a
language like C++? The easiest way probably involves the following fragment of code:
// x and y are assumed to be existing arrays of 100 doubles
double z[100];
for (int i = 0; i < 100; i++)
{
    z[i] = x[i] + y[i];
}
This is a simple enough piece of code, but it is not as clear as it could be. In M ATLAB we can
simply do the following:
z = x + y;
1 Vectors and matrices are simply one- and two-dimensional arrays, respectively.
Simply adding two signals (vectors or matrices) of the same size automatically performs an
element-by-element sum. Which of these two is easier to understand? Using this MATLAB syntax,
we can see immediately what is happening. MATLAB takes care of any necessary looping and
variable declarations for us. This is a very common feature in MATLAB; many operations that
you would normally need to perform explicitly in another programming language can be performed
implicitly in MATLAB.
>> help
See that list of categories? You can call help on any of these categories to get an organized list of
commands with brief discussions. Then, you can call help on any of the commands for a complete
description of that command. The description also includes a “see also” line near the bottom which
suggests other commands that may be related to the one you’re looking at. Select a category that
looks interesting and call help on it. Do the same for whichever command strikes your fancy. For
instance:
>> help elfun
>> help abs
Most often you’ll use help in this last capacity. Note that help abs lists commands related to
the absolute value function as well.
Unfortunately, the traditional help system isn’t so helpful if you don’t know the name of the
command you’re looking for. One way around this is to use the lookfor command. For instance,
if you know you’re looking for a function that deals with time, you can try:
>> lookfor time
This searches the first line of every help description for the word “time.” This can take a while,
though (depending upon your system’s configuration). You should get into the habit of reading the
help on every new command that you run across. So call help on both help and lookfor.
There’s some useful information there.
Another very good source of help is the MATLAB helpdesk. It may or may not be available
on your system; to find out, simply try:
>> helpdesk
If it is available, you will see a help window. The MATLAB helpdesk contains all of the help
pages that you can find using help or lookfor, along with many other useful documents. The
helpdesk is also easily searchable (and often much faster than lookfor), so you would benefit
from becoming familiar with this tool.
>> 6 * 7
>> (((12 + 5) * 62/22.83) - 5)^2.4
(The ^ operator performs exponentiation.) Notice that when you execute these commands, MATLAB
indicates that ans = 7.4977e+003 (or whatever the answer is). This indicates that the result has
been stored in a variable called ans. We can then refer to this quantity like this:
>> 0 * ans
>> ans + 1
It is important to note that each of these commands overwrites ans. If we want to save an
answer, we can simply perform assignment, like this:
>> my_variable = 42
This is the only declaration of my_variable that is needed, and we can use this variable later
just as we could with ans. Further, my_variable will retain its value until we explicitly assign
something else to it.
We can also remove variables with the command clear. Typing who or whos will list what
variables we have in our workspace.
Using variables, then, is straightforward.
>> x = 5.4
>> y = 2
>> z = (my_variable*y)^x
Note that sometimes you don’t need or want to see what M ATLAB returns in response to a
particular command. To suppress the output, simply add a semicolon, ;, after the command. Try
any of the above commands with and without the semicolon to see what this does.
We also have access to a wealth of standard mathematical functions. Thus, if we want to
calculate the sine of the square root of two and store it in a variable called var, we simply type:
>> var = sin(sqrt(2))
Consider a column vector v and a matrix M, for example:
>> v = [1; 2; 3]
>> M = [1 2 3; 4 5 6]
To access the 3 from vector v, we simply need to know that it is in the third row. (In MATLAB,
we use v(3) to access this element.) Thus the vector is one-dimensional. To access the 6 in the
matrix M, though, we need to know that it is in the second row and the third column. We index the
6 using the pair (2,3), and so the matrix is two-dimensional. (In MATLAB, we use M(2,3) to access
this element.) MATLAB arrays can have any number of dimensions. In practice, though, we will
only need vectors and matrices.
Let’s define a few arrays to concatenate (these particular values are just examples):
>> a = [1 2 3]
>> b = [4 5 6]
>> c = [1 2; 3 4]
>> d = [5 6 7]
>> e = [a b]
>> f = [a; b]
>> g = [c d]
Oops! That last command produced an error. When concatenating arrays, the concatenated arrays
must have sizes such that the resulting array is rectangular.
4.4 Building large arrays
>> ones(5,3)
>> zeros(3,4)
>> zeros(5)
>> eye(4)
4.5 The colon operator
>> 1:7
>> 1:2:13
>> 0.1:0.01:2.4
Each of these commands defines a row vector. With only two arguments, as in the first command,
the colon operator produces a row vector starting at the first argument and incrementing by one
until the second argument has been reached. The optional middle argument (seen in the last two
commands) provides a different increment amount. The colon operator is extremely useful,
so it is recommended that you check out help colon for more details. Play with some other
combinations of parameters to familiarize yourself with the behavior of this operator.
5 Array Arithmetic
MATLAB allows you to perform mixed arithmetic between scalars and arrays, as well as two different
types of arithmetic between arrays. Mixed scalar/array arithmetic is the most straightforward. Adding,
subtracting, multiplying, or dividing an array by a scalar is equivalent to performing the operation
on every element of the array. For instance,
>> [5 10 15 20]/5
divides each element of the vector by 5. Most mathematical functions behave similarly, operating on
each element of an array. Thus, the commands
>> t = 0:.1:pi;
>> sin(t)
return a 32-element vector (the same size as t) containing the sine of each element of t.
If we have two arrays, addition and subtraction is also straightforward. Provided that the arrays
are the same size, adding and subtracting them performs the operation on an element-by-element
basis. Thus, the (3,4) element in the output (for instance) is the result of the operation being per-
formed on the (3,4) elements in the input arrays. If the arrays are not the same size, M ATLAB will
generate an error message.
For multiplication, division, exponentiation, and a few other operations, there are two different
ways of performing the operation in question. The first involves matrix arithmetic, which you may
have studied previously. You may recall that the product of two matrices is only defined if the “inner
dimensions” are the same; that is, we can multiply an m×n matrix by an n×p matrix to yield an
m×p matrix, but we cannot reverse the order of the matrices. Then, the (p,q) element of the result
is equal to the sum of the element-by-element products of the pth row of the first matrix and the qth
column of the second. Division and exponentiation are defined with respect to this matrix product.
It is not imperative that you recall matrix multiplication here (most likely you will see it in a linear
algebra course in the future); however, it is important that you note that in MATLAB the standard
mathematical operators (*, /, and ^) default to these forms of the operations.
A form of multiplication, division, and exponentiation for arrays that is more useful for our
purposes is the element-by-element variety. To use this form, we must use the “dot” forms of the
respective operators (.*, ./, and .^). Once again, the arrays must have the same dimensions or
MATLAB will return an error. Thus, the commands
>> [1 2 3 4].*[9 8 7 6]
>> [7; 1; 4]./(1:3)'
>> [5 6 7].^[2 3 4]
>> 2.^[1 2 3 4 5 6]
perform element-by-element multiplication, division, and two slightly different forms of exponentiation.
Note that the .^ form is necessary even for scalar-to-array exponentiation operations.
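To see the difference between the matrix and element-by-element operators, try the following (the
particular matrices are arbitrary examples):
>> A = [1 2; 3 4];
>> B = [5 6; 7 8];
>> A*B
>> A.*B
The first product is the matrix product, [19 22; 43 50], while the second is the element-by-element
product, [5 12; 21 32].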
The array arithmetic capabilities of MATLAB contribute greatly to its power as a programming
language. Using these operators, we can perform mathematical operations on hundreds or thousands
of numbers with a single command. This also has the side effect of simplifying MATLAB code,
making it shorter and easier to read (usually).
6 Indexing
6.1 Basic indexing
To make arrays truly useful, we need to be able to access the elements in those arrays. First, let’s fill
a couple of arrays:
>> a = 5:5:60
>> d = [9, 8, 7, 6 ; 5, 4, 3, 2]
>> a(6)
>> a(3) = 12
>> d(2,3)
The first command retrieves the sixth element from the vector a. The second assigns a new value to the
third element of the same vector. For the third command, the order of the dimensions is important.
In MATLAB, the first dimension is always rows and the second dimension is always columns.
Note particularly that this is the opposite of (x, y) indexing. Thus, the third command retrieves the
element from row two, column three.
>> d(2)
>> d(3)
>> d(7)
Each of these commands indexes the matrix d with a single number. MATLAB counts down the
columns first, so, for instance, d(7) returns the element in the first row of the fourth column. We can
also index with a vector of indices, as in these commands (the particular indices are just examples):
>> a([1 4 6])
>> a(2:2:8)
>> a([3 end])
These commands return a subset of the appropriate vector, as determined by the indexing vector. For
instance, the first command returns the first, fourth, and sixth elements from the vector a. Notice
the use of the end keyword in the third command. In an indexing context, end is interpreted as the
length of the currently indexed dimension. This is particularly useful because MATLAB will return
an error if you try to access the eighth element of a seven-element vector, for instance. In general,
indices must be strictly positive integers no greater than the length of the dimension being indexed.
Thus, unlike in C or C++, indices begin at one rather than at zero.
Using multiple indices into multi-dimensional arrays is more complicated than doing so with
vectors, but in some cases it can be extremely useful. Consider the following commands:
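For instance, ranges can be used in each dimension (using the matrix d defined above; the particular
indices are illustrative):
>> d(1,:)
>> d(:,2)
>> d(1:2, 2:3)
The colon by itself selects an entire dimension, so the first command returns the entire first row of
d, the second returns its entire second column, and the third returns the 2-by-2 submatrix formed
from columns two and three.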
7 Data Visualization
7.1 Using “plot”
So now we know how to build arbitrarily large arrays, populate them with interesting things, and
get individual elements or sections of them. However, poring over pages and pages of numbers
is generally not much fun. Instead, we would really like to be able to visualize our data somehow.
Of course, MATLAB provides many options for this. Let’s start by building a vector we can use
throughout this section, and then looking at it. Execute the following commands:
>> x = sin(2*pi*(1:200)/25);
>> plot(x);
>> zoom on;
The first command builds up a sine wave, and the second command plots it. A window should
have popped up with a sine wave in it. Notice that the y-axis extends from -1 to 1, as we would expect.
Using this form of plot, the x-axis is labeled with the index number; that is, our vector has 200
elements, and so the data is plotted from 1 to 200. The third command turns on MATLAB’s zooming
capabilities. If you left-click on the figure, it will zoom in; right-clicking5 will zoom out. You
can also left-click and drag to produce a zoom box that lets you control where the figure zooms.
Experiment with this zoom tool until you’re comfortable with it. Depending on the version of
MATLAB that you are using, there may also be an icon of a magnifying glass with a + in it above
the figure; clicking this icon will also enable and disable zoom mode.
>> plot(x,'x-')
>> plot(x,'o')
>> plot(x,'rd:')
help plot lists the various combinations of characters that you can use to change line styles,
point styles, and colors.
Note that the single tick marks, ', delimit strings that are passed to these commands.
5 For Mac users, I believe you double-click to zoom out all the way.
>> stem(x)
>> stairs(x)
>> bar(x)
In this course, you will most often be using the plot and stem commands. Each is useful in a
somewhat different context.
7.5 Plotting with an x-axis
So far, plot has labeled the x-axis with the index of each element; plotting x against its indices
explicitly produces the same result:
>> plot(1:length(x),x,'x-');
Sometimes, we’ll have a time axis that we want to plot against. For instance,
>> t = 0:.01:1.99;
>> plot(t,x);
This scales the time axis to match t. We will find this very useful when working with sampled
signals.
We can also plot two curves at once by giving plot several triples of arguments (the second curve
here is just an example):
>> y = cos(2*pi*(1:200)/25);
>> plot(t, x, '-', t, y, '--');
This plots x and y versus t on the same figure with different line types. Note that the line style
arguments are optional; without them, M ATLAB will plot each curve using a different color.
The hold command provides another method of plotting several curves on the same figure.
When we type hold on, an existing figure will not be erased before a new curve is plotted. To add a
curve to the plot we produced above, use commands such as:
>> hold on
>> plot(t, x/2, 'r:');
>> hold off
A third way to plot multiple lines simultaneously makes use of the fact that plot will plot the
columns of a matrix as separate lines. Execute the following commands.
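For example (the matrix here is illustrative, with each of its two columns holding one curve):
>> M = [x' (x/2)'];
>> plot(t, M);
Each column of M is drawn as its own line against t.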
7.7 Legends
You can add a legend to a plot using the legend command like this:
>> legend('Data set 1', 'Data set 2');
The legend command can take any number of parameters; usually, though, you want one string
for each data set on your plot.
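One way to build and view a surface is with meshgrid and surf (the particular function plotted here
is just an example):
>> [X, Y] = meshgrid(-2:.1:2);
>> Z = X .* exp(-X.^2 - Y.^2);
>> surf(X, Y, Z);
>> rotate3d on;
The rotate3d command enables interactive rotation of the figure with the mouse.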
Now we have some “3-D” visualizations of our surface. If you click-and-drag the plot, you should
be able to rotate the surface so that you can see it from various directions. Experiment with this until
you’re comfortable with how it works. Notice what happens if you look at the surface from directly
above.
MATLAB has some very powerful tools for data visualization; here, you’ve seen only a small
sampling. There are many more. If you’re interested in exploring this topic further, check help
graph2d, help graph3d, and help specgraph.
8 Programming in MATLAB
Programming in MATLAB is really just like using the MATLAB command line. The only difference
is that commands are placed in a file (called an M-file) so that they can be executed by simply calling
the file’s name. We’ll also see that MATLAB has many of the same control-flow structures, like loops
and conditionals, as other, more traditional programming languages.
Before we jump into programming in MATLAB, we need to make a few comments about files in
MATLAB. MATLAB has access to a machine’s file system in roughly the same way that a command-line
based operating system like DOS or UNIX does. It has a “present working directory” (which you can see
with the command pwd); any files in the present working directory can be seen by MATLAB. You
can change the present working directory in roughly the same way that you do in DOS or UNIX,
using the cd command (for “change directory”). MATLAB also has a “path,” like the path in DOS
or UNIX, which lists other directories that contain files that MATLAB can see. The path command
will list the directories in the path. We’ll be making a few files in this tutorial, and you’ll need
to store commands in files when doing the laboratories. You’ll probably want to make a directory
somewhere in your personal workspace, cd to that directory, and store your files there. Unless
you’re working on your own system, do not store them in the main MATLAB directory; if you do,
the system’s administrator will probably become very irritated with you.
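For example (the directory name here is hypothetical):
>> pwd
>> cd mymatlabfiles
>> pwd
>> path
The first pwd shows where you currently are, cd moves you into the (already existing) directory
mymatlabfiles, and path lists the directories that MATLAB searches for M-files.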
There are two types of files containing commands that MATLAB can call: scripts and functions. Both
use the “.m” file extension (and, thus, are called M-files). A script is nothing but a list of commands.
When you call the script (by simply typing in the script’s filename), MATLAB will execute all of
the commands in the file and return to the command line exactly as if you had typed the commands
in by hand. Functions are different in that they have their own workspace and variables. We pass
information to a function by means of input parameters, and receive information from the function
through output parameters.
MATLAB scripts
Start the MATLAB editor using the command edit6. Then, place the following lines in the text file
and save it as “hello.m”.
% hello.m -- Introductory 'Hello World' script
% These lines are comments, because they start with '%'
disp('Hello World');
MATLAB functions
The second type of file that we can put commands in is called a function. A function communicates
with the current workspace by passing variables called parameters. It also creates a separate
workspace so that its variables don’t get mixed up with whatever variables you have in your current
workspace. Note that most MATLAB commands are also functions, and the M-file code is
available for most of them. You can see the code by using the type command, for instance as
type fliplr.
Using your text editor, make a new file that contains the following lines and save it as “hello2.m.”
% hello2.m -- Introductory 'Hello World' function
% Try typing 'help hello2' when you're done, and see what happens
%
% function output_param = hello2(input_param)
6 UNIX versions of MATLAB prior to version 6 did not include a built-in editor.
Loops
The for loop is used to execute a set of commands a certain number of times, while also providing
an index variable. Consider the simple loop here:
for index = 1:10
    disp(index);
end
This loop executes the disp command ten times. The first time it is executed, index is set to 1.
Thereafter, it is incremented by one each time the commands in the loop are executed. Note that the
colon form of the for loop is not mandatory; any row vector can be used in its place, and the index
(which, of course, can be renamed) will be sequentially set to each of the elements in the vector
from left to right.
We can use while loops in a similar manner. Consider this:
ct = 10;
while ct > 0.5
ct = ct/2;
disp(ct);
end
As long as the conditional after while is true, the loop will be executed.
Conditional statements
A more traditional method of conditional execution comes from the if-else statement. Consider
this:
if pi > 4
disp('Pi is too big!');
elseif pi < 3
disp('Pi is too small!');
else
disp('Pi is just about right.');
end
Here, M ATLAB will first check the conditional, pi > 4. If this is true, the first display command
will be executed and the remainder of the if-else statement will be skipped (that is, none of the
other conditionals will be tested). If the first conditional is false, M ATLAB will begin to check the re-
maining conditionals. There can be any number of elseif statements in this construct (including
none), and the else statement is entirely optional. If you have a large number of chained con-
ditionals, you might consider using the switch-case construct (type help switch or help
case).
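For instance, a simple switch-case construct looks like this (the variable and cases here are just
examples):
x = 2;
switch x
case 1
    disp('one');
case 2
    disp('two');
otherwise
    disp('something else');
end
Only the commands under the matching case (here, the second) are executed; the otherwise
clause handles everything else.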
Also, for any C programmers in the audience, note that you can perform formatted string output
with fprintf and sprintf.
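For example:
>> fprintf('The answer is %d\n', 42);
This prints “The answer is 42” followed by a newline.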
It is often useful to convert numbers to strings. We can use the num2str command to do this.
Consider this:
>> disp(['The answer is ' num2str(42)]);
In this way, we can produce formatted output without using fprintf or sprintf.
For more information on strings, look at help strings and help strfun.
Placing the command keyboard into your code is effectively the same as placing a breakpoint in
the code, in that you can execute commands before returning to program execution (with the
command return).
There are a number of error types that you are likely to encounter. One very good rule of thumb
says that if an error occurs inside a MATLAB function, the error is almost assuredly in the calling
function. Usually this means that the function is being passed improper parameters; check the call
stack or dbstep out until you find the line in your program which is causing problems. Other
common errors include indexing errors (indexing with 0 or with a number greater than the length of
the indexed dimension of a variable) and assignment size mismatches. MATLAB is usually pretty
descriptive with its error messages once you figure out how to interpret what it is saying. As is
usually the case when debugging, an error message at a particular line may in fact indicate an error
that has occurred several lines before.
1.1 Introduction
In everyday language, a signal is anything that conveys information. We might talk about traffic
signals or smoke signals, but the underlying purpose is the same. In the study of signals and systems
engineering, however, we adopt a somewhat more specific notion of a signal. In this field, a signal
is a numerical quantity that varies with respect to one or more independent variables. One can think
of a signal as a functional relationship, where the independent variable might be time or position.
As an example, one signal might be the voltage on the wires from a microphone as it varies
with time. Another signal might be the light intensity as it varies with position along a sensor array.
The important aspect of these signals, though, is the mathematical representation, not the underlying
medium. That is, the voltage and light signals might be mathematically the same, despite the fact that
the signals come from two very different physical sources. In signals and systems engineering, we
recognize that the most important aspects of signals are mathematical. Thus, we don’t necessarily
need to know anything about the physical behavior of voltage or light to deal with these signals.
What purposes do signals serve? Let us highlight a few of the many important ones. First, a
signal can embody a sensory experience, as in a sound that we would like to hear or a picture that we
would like to see. Second, a signal can convey symbolic information, as in the text of a newspaper.
Third, a signal can serve to control some system. For example, in a typical modern automobile, an
electronic control signal determines how much gasoline is emitted by the fuel injectors. Last, we
mention that a signal can embody an important measurement, for example, the speed of a vehicle or
the EKG of some patient.
What is the advantage of having a sound or a picture or text or control information or a mea-
surement embodied in a signal? For one thing, it enables us to transmit it to a remote location or
to record it. In many, but not all, cases, these are done electronically, either with analog or digital
hardware. For another, the signals we encounter frequently need to be processed, which can also be
done electronically with analog or digital hardware. For example, a signal may contain unwanted
noise that needs to be removed; this is an example of what is called noise reduction or signal
recovery. Alternatively, the desired information or sensory experience may need to be extracted from the
signal, as in the case of AM and FM radio signals, which need to be demodulated before we can
listen to them, or in the case of CT scan signals, which need extensive processing before an X-ray
image can be formed.
• How can I detect the presence of “speech” within a segment of a speech signal?
1.2 Background
1.2.1 Continuous-time and discrete-time signals
In its most elementary form, a signal is a time-varying numerical quantity, for example, the time-
varying voltage produced by a microphone. Equivalently, a signal is a numerically valued function
of time. That is, it is an assignment of a numerical value to each instance of time. As such, it is
customary to use ordinary mathematical function notation. For example, if we use s to denote the
signal, i.e. the function, then s(t) denotes the value of the signal at time instance t. In common
usage, the notation s(t) also has an additional interpretation — it may also refer to the entire signal.
Usually, the context will make clear which interpretation is intended.
We will deal with many different signals and to keep them separate we will use a variety of
symbols to denote them, such as r, x, y, z, x0. Occasionally, we will use other symbols to
denote time, such as t0, s, u. In some situations, the independent parameter t represents something
other than “time”, such as “distance”. This happens, for example, when pictures are considered to
be signals.
As illustrated in Figure 1.1, there are two basic kinds of signals. When the time variable t
ranges over all real numbers, the signal s(t) is said to be a continuous-time signal. When the time
variable t ranges only over the set of integers {. . . , −2, −1, 0, 1, 2, . . .}, the signal s(t) is said to be
a discrete-time signal. To distinguish these, from now on we will use a somewhat different notation
for discrete-time signals. Specifically, we will use one of the letters i, j, k, l, m, n to denote the time
variable, and we will enclose the time-variable in square brackets, rather than parentheses, as in s[n].
Thus, for example, s[17] denotes the value of the discrete-time signal s[n] at time n = 17. Note that
for discrete-time signals, the time argument has no “units”. For example, s[17] simply indicates the
17th sample of the signal. When the independent parameter t or n represents something other than
time, for example distance, then the signal can be said to be continuous-space or discrete-space,
respectively, or more generally, continuous-parameter or discrete-parameter.
It is important to reemphasize the inherent ambiguity in the notation s(t) and s[n]. Sometimes
s(t) refers to the value of the signal at the specific time t. At other times, s(t) refers to the entire
signal. Usually, the intended meaning will be clear from context. The same two potential interpre-
tations apply to s[n].
For other signals, there are no such formulas. Rather they might simply be measured and recorded, as
with an analog tape recorder. Similarly, some discrete-time signals can be described with formulas,
such as s[n] = sin(n) or

    s[n] = { 0,       n < 0
           { cos(n),  n ≥ 0                                (1.2)
and some are described simply by recording their values for all values of n.
Often, a discrete-time signal is obtained by sampling a continuous-time signal. That is, if Ts is
a small time increment, then the discrete-time signal s[n] obtained by sampling s(t) with sampling
interval or sampling period Ts is defined by

    s[n] = s(nTs) .                                        (1.3)
For example, if s(t) = sin(t) and Ts = 3, then the discrete-time signal obtained by sampling with
sampling interval Ts is s[n] = sin(3n). The reciprocal of Ts is called the sampling rate or sampling
frequency and denoted fs = 1/Ts . Its units are samples per second. The discrete-time signal in
Figure 1.1 was obtained by sampling the continuous-time signal shown above it.
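As a quick illustration, the sampling example above might be carried out in MATLAB as follows (a sketch; the variable names are our own):

```matlab
% Sample s(t) = sin(t) with sampling interval Ts = 3, so s[n] = sin(3n)
Ts = 3;              % sampling interval (seconds per sample)
fs = 1/Ts;           % sampling rate (samples per second)
n  = -10:10;         % sample indices, including negative times
s  = sin(n*Ts);      % the sampled discrete-time signal
stem(n, s)           % plot s[n] against the sample index n
```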
In the above example, we have allowed the time parameter to be negative as well as positive,
which raises the question of how to interpret negative time. Time 0 is generally taken to be some
convenient reference time, and negative times simply refer to times prior to this reference time.
Nowadays, signals are increasingly processed by digital machines such as computers and DSP
chips. These are inherently discrete-time machines. They can record and work with a signal in just
two ways: as a formula or as a sequence of samples. The former applies to continuous-time and
discrete-time signals. For example, a computer can easily compute the value of the continuous-time
signal s(t) = sin(t) at time t = √2 or the value of the discrete-time signal s[n] = cos(n) at time
n = 17. However, the latter works only with discrete-time signals. Thus, whenever a digital machine
works with a continuous-time signal, it must either use a formula or it must work with its samples.
That is, it must work with the discrete-time signal that results from sampling the continuous-time
signal. This admonition applies to us, because in this and future lab assignments, many of the signals
in which we are interested are continuous-time, yet we will process them exclusively with digital
machines, i.e. ordinary computers.
Except in certain ideal cases, which never apply perfectly in real-world situations, sampling a
continuous-time signal entails a “loss”. That is, the samples only partially “capture” the signal.
Alternatively, they constitute an approximate representation of the original continuous-time signal.
However, as the sampling interval decreases (equivalently, the sampling rate increases), the loss
inherent in the sampled signal decreases. Thus in practical situations, when the sampling interval
is chosen suitably small, one can reliably work with a continuous-time signal by working with its
samples, i.e. with the discrete-time signal obtained by sampling at a sufficiently high rate. This will
be the approach we will take in this and future lab assignments, when working with continuous-time
signals that cannot be described with formulas.
When digital machines are used to process signals, in addition to sampling, one must also quan-
tize, or round, the sampled signal values to the limited precision with which numbers are represented
in the machine, e.g. to 32-bit floating point. This engenders another “loss” in the signal represen-
tation. Fortunately, for the computers we will use in performing our lab experiments, this loss is so
small as to be negligible. For example, MATLAB uses 64-bit double-precision floating point
representation of numbers. (Lab 5 is an exception; in that lab, we will consider systems that are designed
to produce digital signal representations with as few bits as possible.)
examples, the duration of s(t) is 3, and the duration of s[n] is 4. Note that the support and duration
of a signal can be either finite or infinite.
1.2.4 Periodicity
Periodicity is a property of many naturally occurring or man-made signals. A continuous-time signal
s(t) is said to be periodic with period T, where T is some positive real number, if

    s(t + T) = s(t) ,  for all t                           (1.4)
If s(t) is periodic with period T , then it is also periodic with period 2T , 3T , . . . . The fundamental
period To of s(t) is the smallest T such that s(t) is periodic with period T .
Similarly, a discrete-time signal s[n] is said to be periodic with period N , where N is some
positive integer, if
s[n + N ] = s[n] , for all n (1.5)
If s[n] is periodic with period N , then it is also periodic with period 2N , 3N , . . . . The fundamental
period No is the smallest N such that s[n] is periodic with period N .
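As a sketch (with a signal of our own choosing), discrete-time periodicity can be checked numerically in MATLAB by comparing s[n + N] to s[n]:

```matlab
% s[n] = sin(2*pi*n/10) is periodic with fundamental period No = 10
n = 0:49;
s = sin(2*pi*n/10);
N = 10;
max(abs(s(1+N:end) - s(1:end-N)))   % essentially zero, so s[n+N] = s[n]
```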
Discrete-time signals
We begin with an example. Suppose we want to represent the following discrete-time signal as an
array in MATLAB:

    s[n] = { n²,  5 ≤ n ≤ 15
           { 0,   else                                     (1.6)
In MATLAB, we do this by creating two vectors: a support vector and a value or signal vector. The
support vector represents the support interval of the signal, i.e. the set of integers from the first time
at which the signal is nonzero to the last. For this example, the support vector can be created with
the command
>> n = 5:15
This causes n to be the array of 11 numbers 5, 6, . . . , 15. Next, the signal vector can be created with
the command
>> s = n.^2
which causes s to be the array of 11 numbers 25, 36, . . . , 225.
Note that as in the above example, we usually only specify the signal within the range of times
that it is nonzero. That is, we usually do not include zero values outside the support interval in the
signal vector.
It is often quite instructive to plot signals. To plot the discrete-time signal s[n], use the stem
command:
>> stem(n,s)
You can also use the plot command; however, plot draws straight lines between plotted points,
which may not be desirable.
It is important to note that in MATLAB, when i is an integer then s(i) is not necessarily the
signal value at time i. Rather it is the signal value at time n(i). Thus, stem(n,s) and stem(s)
result in similar plots with different labelings of the time axis. Occasionally, it will happen that
n(i) = i, in which case s(i) = s(n(i)) and stem(n,s) and stem(s) result in identical
plots with identical time axis labels.
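As a small sketch using the signal from equation (1.6), the MATLAB index i and the signal time n(i) are generally different:

```matlab
n = 5:15;       % support vector
s = n.^2;       % signal vector
s(1)            % returns 25: the signal value at time n(1) = 5, not at time 1
s(3)            % returns 49: the signal value at time n(3) = 7
```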
Continuous-time signals
We begin with an example. Suppose we wish to represent the following continuous-time signal as
an array in MATLAB:

    s(t) = { t²,  5 ≤ t ≤ 15
           { 0,   else                                     (1.7)

Since an array cannot hold a continuum of values, we instead store samples of the signal, here
taken with sampling interval Ts = 1/20:
>> Ts = 1/20
>> t = 5:Ts:15
>> s = t.^2
What have we done? To represent s(t), we have created a support vector t that contains the
sample times 5, 5 + 1/20, 5 + 2/20, 5 + 3/20, . . . , 15 , and we have created a signal vector s
that contains the samples s(5), s(5 + 1/20), s(5 + 2/20), s(5 + 3/20), . . . , s(15). That is, for
n = 1, . . . , 301, s(n) contains the signal value at time t(n) = 5 + (n-1)/20.
Note that when representing a continuous-time signal as an array, it is usually important to
choose the sampling interval Ts small enough that the signal changes little over any time interval of
Ts seconds.
As with discrete-time signals, it is frequently instructive to plot a continuous-time signal. This
is done with the command
>> plot(t,s)
which plots the points (t(1),s(1)), (t(2),s(2)), . . ., and connects them with straight lines.
Connecting these points in this manner produces a plot that approximates the original continuous-
time signal s(t), which takes values at all times (not just integers) and which usually does not change
significantly between samples (assuming Ts is chosen to be small enough). Note that plot(s)
produces a similar plot, but the horizontal axis is labeled with sample “indices” (i.e., the number of
the corresponding sample) rather than sample times. When working with continuous-time signals, it
is important that you always use plot(t,s) rather than plot(s). It is also important that your plot
indicates what the axes represent, which can be done using the xlabel and ylabel commands.
1. Support Interval. A signal’s support interval (also occasionally known as just the signal’s
support or its interval) is the smallest interval that includes all non-zero values of the signal.
    Continuous-time:  t1 ≤ t ≤ t2                          (1.8)
    Discrete-time:    n1 ≤ n ≤ n2                          (1.9)
    MATLAB:           n = n1:n2 for a signal s.            (1.10)
2. Duration. The duration of a signal is simply the length of the support interval.
    Continuous-time:  t2 − t1                              (1.11)
    Discrete-time:    n2 − n1 + 1                          (1.12)
    MATLAB:           length(s) for a signal s.            (1.13)
    Sampled:          (t2 − t1) = (n2 − n1 + 1)Ts          (1.14)
3. Periodicity. Periodicity was described in section 1.2.4. The key formulas are included here.
4. Maximum and Minimum Value. These values are the largest and smallest values that a
signal takes on over some interval defined by n1 and n2. In MATLAB these values are found
using the min and max commands.
5. Average Value. The average value, M, is the value around which the signal is “centered”.
1 This code assumes that the signal vector s is defined only over the range of times for which we wish to compute the
statistic. More generally, if n is the support vector and n1 and n2 define a subset of the support vector over which we wish
to calculate our statistic, we can compute the statistic over only this range, n1:n2, by replacing the signal s with the shorter
signal s((n1:n2)-n(1)+1).
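For instance, the indexing described in the footnote might look like this in MATLAB (a sketch with hypothetical interval values):

```matlab
% Compute a statistic over the sub-interval n1..n2 of the support vector n
n  = 5:15;  s = n.^2;            % the signal from equation (1.6)
n1 = 8;  n2 = 12;                % hypothetical sub-interval of interest
seg = s((n1:n2) - n(1) + 1);     % the samples s[n1], ..., s[n2]
M   = mean(seg)                  % average value over the sub-interval: 102
```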
6. Mean-squared value. The mean-squared value (or MSV) of a signal, MS, is defined as the
average squared value of the signal over an interval. The MSV is also called the average
power, because the squared value of a signal is considered to be the instantaneous power of
the signal.

    Continuous-time:  MS(s(t)) = 1/(t2 − t1) · ∫[t1 to t2] s²(t) dt            (1.24)
    Discrete-time:    MS(s[n]) = 1/(n2 − n1 + 1) · Σ[n=n1 to n2] s²[n]         (1.25)
7. Root mean squared value. The root mean squared value (or RMS value) of a signal over
some interval is simply the square root of the mean squared value.

    Continuous-time:  RMS(s(t)) = √( 1/(t2 − t1) · ∫[t1 to t2] s²(t) dt )      (1.28)
    Discrete-time:    RMS(s[n]) = √( 1/(n2 − n1 + 1) · Σ[n=n1 to n2] s²[n] )   (1.29)
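In MATLAB, the MSV and RMS value over the full length of a signal vector might be computed as follows (a sketch; the manual does not prescribe these exact commands):

```matlab
s   = [1 -2 2 -1];      % example signal vector
MS  = mean(s.^2)        % mean-squared value, as in equation (1.25): 2.5
RMS = sqrt(MS)          % root mean squared value, as in equation (1.29)
```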
8. Signal Energy. The energy of a signal, E, indicates how much signal strength is present over
some interval. Note that energy equals the average power times the length of the interval.

    Continuous-time:  E(s(t)) = ∫[t1 to t2] s²(t) dt                           (1.32)
    Discrete-time:    E(s[n]) = Σ[n=n1 to n2] s²[n]                            (1.33)
    MATLAB:           E(s) = sum(s.^2)                                         (1.34)
    Sampled:          E(s(t)) ≈ E(s[n]) Ts                                     (1.35)
[Figure 1.2: a signal value distribution (left) and a histogram approximation of it (right).]
9. Signal Value Distribution. The signal value distribution is a plot indicating the relative fre-
quency of occurrence of values in a signal. There is no closed-form definition of the signal
value distribution, but it can be approximated using a histogram. A histogram counts the num-
ber of samples that fall within particular ranges, or bins. Note that the y-axis is effectively
arbitrary, and that the coarseness of the approximation is determined by the number of his-
togram bins that are used. Figure 1.2 shows an example of a signal value distribution and the
histogram approximation to that distribution.
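A histogram approximation like the one in Figure 1.2 can be produced with the hist command (a sketch with a made-up signal):

```matlab
s = sin(2*pi*(1:1000)/100);   % hypothetical example signal
hist(s, 50)                   % counts of samples falling into 50 bins
xlabel('Signal value')
ylabel('Number of samples')
```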
Assuming that we have both s[n] and r[n], we can easily calculate the error signal v[n] as

    v[n] = s[n] − r[n] .
Note that if s[n] and r[n] are identical, v[n] will be zero for all n. This suggests that we can simply
measure the signal strength of v[n] by using one of the energy or power statistics.
[Block diagram: the input signal x[n] is passed to a statistic calculator, whose output statistic d
is passed to a decision maker, which produces a binary decision signal.]
Mean squared value is a natural choice because it normalizes the error with respect to the length
of the signal. Sometimes, though, the RMS value is more desirable because it produces error values
that are directly comparable to the values in v[n]. When we measure the MSV of an error signal,
we sometimes call it the mean squared error or MSE. Similarly, the RMS value of an error signal is
often called the root mean squared error or RMSE.
In MATLAB, we will usually want to calculate the MSE or RMSE over the entire length of the
signals that we have. Supposing that we are given a signal s and a modified version s_mod (with
the same size), we can calculate the MSE and RMSE like this:
>> MSE = mean((s - s_mod).^2)
>> RMSE = sqrt(mean((s - s_mod).^2))
Notice that we could also subtract s from s_mod; the order doesn’t matter because of the square
operation. Also note that you must include the period before the exponentiation operator in order to
correctly square each sample.
[Block diagram: the input signal x[n] is broken into blocks x1[n], x2[n], . . . , xk[n]; the RMS
value of each block is calculated and compared to a threshold c; the resulting binary decisions
are recombined into the decision signal d[m].]
Figure 1.4: A detailed block diagram for the “signal present” detector.
The first step is to specify what our system needs to do. From the description above, we know
that we will receive a signal as an input. For simplicity, we’ll assume that we are given an entire
discrete-time signal. What must our system do? We need some sort of indication as to when speech
is present in a signal. However, the signal that we are given will contain both periods of silence and
periods with speech. For some integer N , let us break the signal into blocks of N samples, such
that the first N samples make up the first block, the next N samples make up the second block, and
so on. Then, we will make a “speech present” or “speech not present” decision separately for each
block. The output of our system will consist of a signal with a 0 or 1 for each block, where a 1
denotes “speech present” and a 0 denotes “speech not present”. To describe this signal in MATLAB,
our system will produce a signal support vector containing the index of the first sample
of each block and a signal vector containing the 0 or 1 for each block. Choosing the support vector
in this way will allow us to plot the decisions on the same figure as the signal itself.
How will we make the decision for each block? Since we can assume a signal that is relatively
free of noise, we can simply calculate the energy for each block and compare the result to a threshold.
If the statistic exceeds the threshold, we decide that speech is present. Otherwise, we decide that
speech is not present. Using signal energy, though, is not ideal; the necessary threshold will depend
on the block size. We may want to change the block size, and we should be able to keep the threshold
constant. Using average power is a better option, but we would like our threshold to have a value
comparable to the values of the signal. Thus, the RMS value seems to be an ideal choice, and this is
what we will use. In summary, for each block the detector computes the RMS value R and compares
it to a threshold c. The decision is

    signal present,      if R ≥ c
    signal not present,  if R < c

Note that our detector system has two design parameters. One is the block size N, and the other
is the threshold c. To tune the system, we will need to find reasonable values for these parameters
when we make the detector itself. A more detailed block diagram of the detector can be found in
Figure 1.4.
Detector algorithm
Now that we’ve specified the behavior of the detector, let’s come up with an algorithm for performing
the detection. Of course, this is not the only way to implement this detector.
It is assumed that the input signal is contained in a signal vector x whose support vector is simply
1,2,...,length(x).
(a) If you implement the algorithm using a for loop, you should first initialize the output
array to “empty” using the command “output = [];”. Then, loop over the values in
support_output. Within the loop, you need to determine what values of n1 and
n2 to use in the RMS value calculation for a given value of the loop counter. Then,
compare the RMS value that you calculate to the threshold and append the result to the
end of output.4
(b) An alternative to the for loop is to use the reshape command to make a matrix out of
our signal with one block of the signal per column. If you choose to use reshape, you
first need to discard all samples beyond the first block_size × number_of_blocks
samples of the input signal. reshape this shorter signal into a matrix with block_size
3 Type help colon if you need assistance with this operator.
4 Use either output(end+1) = result; or output = [output, result];
rows and number_of_blocks columns5. Then, use .^ to square each element in the
matrix, use mean to take the mean value of each column, and take the sqrt of the
resulting vector to produce a vector of RMS values. Finally, compare this vector to
threshold to yield your output vector.
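Putting the steps of option (b) together, the detector might be sketched as follows (variable names are our own, and the block size and threshold values are placeholders):

```matlab
% Sketch of the reshape-based detector; x is the input signal vector
N = 512;  c = 0.2;                        % block size and threshold (example values)
num_blocks = floor(length(x) / N);        % number of whole blocks in x
xm = reshape(x(1:N*num_blocks), N, num_blocks);   % one block per column
R  = sqrt(mean(xm.^2));                   % RMS value of each block (a row vector)
output = double(R >= c);                  % 1 = "signal present", 0 = "not present"
support_output = 1 + (0:num_blocks-1)*N;  % index of the first sample of each block
```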
>> hold on
>> plot(1:10,1:10,'-')
>> plot(1:10,2:11,':')
>> plot(1:10,3:12,'--')
plot lines using solid, dotted, and dashed lines. Type help plot for more details about
using different line styles and colors. The legend command adds a figure legend for labeling
the different signals. For instance, calling legend with one label string per signal
adds a legend with labels for each of the three signals on the figure. Note that signal labels are
given in the order that the signals were added to the figure.
• Labeling Figures: Any time that you create a figure for this laboratory, you need to include
axis labels, a figure number, and a caption:
Note that it is recommended that you use your word processor to produce figure numbers and
captions, rather than using the title command. You also need to include the code that you
used to produce the figures, including label commands. Note that each subplot of a figure
must include its own axis labels.
• Function Headers: At the top of the file containing a function declaration, you must have a
line like this:
5 Remember to assign the output of reshape to something! No MATLAB function ever modifies its input parameters;
results are communicated only through the values it returns.
7 The number of elements in array can be checked using the command prod(size(array)).
1. (A simple signal and its statistics) Use the following M ATLAB commands to create a signal:
>> n = 1:50;
>> s = sin(2*pi*n/50);
(a) (Plotting a signal with labels) Use stem to plot the signal. Make sure that you include8:
• The figure itself9 .
• An x-axis label and a y-axis label.
• A figure number and a caption that describes the figure.
• The code you used to produce the signal and the figure. This should be included
in an appendix at the end of your report10 . Make sure you clearly indicate which
problem the code belongs to.
(b) (Calculating signal statistics) Calculate the following statistics over the length of the
signal (i.e., let n1 = 1 and n2 = length(s)), and include your results in your report11 .
• Maximum value
• Minimum value
• Mean value
• Mean squared value
• RMS value
• Energy
8 Note that every figure that you produce in a laboratory for this class must include these things!
9 On Windows systems, you can select “Copy Figure” from the Edit menu on the figure window to copy the figure to the
clipboard and then paste it into your report. Also, to make your report compact, you should make all figures as small as
possible, while being just large enough that the important features are clearly discernible. There are two ways to shrink plots:
you can shrink them in your lab report document, or you can shrink the MATLAB window before copying and pasting.
Shrinking the MATLAB window is generally preferable because it does not shrink the axis labels. Note, you may need to
specify the appropriate copy option, so that what is in fact copied is the shrunk rather than the original version of the plot.
10 You should include all MATLAB code that you use in the appendix. However, you do not need to include code that is
2. (Statistics of real-world signals) Download the file lab1_data.mat from the course web
page. Place it in the present working directory or in a directory on the path, and type
>> load lab1_data
This file contains two signals which will be loaded into your workspace. You will use the
signal clarinet12 in this problem and in Problem 3. The other signal, mboc, will be used
in Problem 4.
(a) (Plot the real-world signal) Define the support vector for clarinet as 1:length(clarinet).
Then, use plot to plot the signal clarinet.
• Include the figure (with axis labels, figure number, caption, and M ATLAB code) in
your report.
(b) (Zoom in on the signal) Zoom in on the signal so that you can see four or five “periods.”13
• Include the zoomed-in figure (with axis labels, figure number, caption, and MATLAB
code) in your report.
(c) (Find the signal’s period) Estimate the “fundamental period” of clarinet. Include:
• Your estimate for the discrete-time signal (in samples).
• Your estimate for the original continuous-time signal (in seconds).
(d) (Approximate the SVD) Use the hist command to estimate the signal value distribution
of clarinet. Use 50 bins.
• Include the figure (with axis labels, code, etc.) in your report.
• From the histogram, make an educated guess of the MSV and RMSV. Explain how
you arrived at these guesses.
(e) (Calculate statistics) Calculate the following (discrete-time) statistics over the length of
the signal:
• Mean value
• Energy
• Mean squared value
• RMS value
12 This is a one-second recording of a clarinet, recorded at a sampling frequency of 22,050 Hz. To listen to the sound, use
3. (Looking at and measuring signal distortion) In this problem, we’ll measure the amount of
distortion introduced to a signal by two “systems.” Download the two files lab1_sys1.m
and lab1_sys2.m. Apply each system to the variable clarinet using the following
commands:
(a) (Examine the effects of the systems) Use plot14 and M ATLAB’s zoom capabilities to
display roughly one “period” of:
• The input and output of lab1_sys1 on the same figure.
• The input and output of lab1_sys2 on the same figure.
(b) (Describe the effects of the systems) What happens to the signal when it is passed
through these two systems? Look at your plots from the previous section and describe
the effect of:
• lab1_sys1.m on clarinet.
• lab1_sys2.m on clarinet.
(c) (Measure the distortion) Calculate the RMS error introduced by each system.
• RMS error introduced by lab1_sys1.
• RMS error introduced by lab1_sys2.
• Which system introduces the least error? Is this what you would have expected
from your plots?
4. (Developing an energy detector) In this problem, we will develop a detector that identi-
fies segments of a speech signal in which speech is actually present. Download the files
sig_nosig.m and lab1_data.mat (if you haven’t already) from the course web page.
The first is a “skeleton” m-file for the signal/no signal detector function. The second contains
a speech signal, mboc15 , that we will use to test the detector.
(a) (Write the detector function) Following the detector description given in Section 1.2.8,
complete the function in sig_nosig.m. Use a threshold of 0.2 and a block size of
512. Verify the operation of your completed function on the mboc signal by comparing
its output to that of sig_nosig_demo.dll16 .
• Include the code for your completed version of sig_nosig.m in the appendix of
your lab report.
(b) (Plot the results of your function) Call sig_nosig17 like this:
>> [detection,n] = sig_nosig(mboc);
Then, plot the output of sig_nosig (using “stairs(n,detection,'k:');”)
and the signal mboc (with plot) on the same figure.
14 Make sure the two signals are easily distinguishable by using different line styles. Also, any time that you plot multiple
signals on a single set of axes, you must use legend or some other means to label the signals!
15 This signal has also been recorded with a sampling frequency of 22,050 Hz.
16 sig nosig demo.dll is a completed version of the function in sig nosig.m. This demo function has been com-
ment.
5. On the front page of your report, please provide an estimate of the average amount of time
spent outside of lab by each member of the group.
2.1 Introduction
• How can we transmit and receive bits from several different users on the same communication
channel?
• How can we develop a radar detection scheme that is robust to noise, and how do we charac-
terize its performance?
[Figure 2.1: three example signal pairs. Each column shows x[n] (top), y[n] (middle), and the
product x[n]·y[n] (bottom); the correlations of the three pairs are C(x,y) = 38, C(x,y) = 0, and
C(x,y) = −38.]
2.2 Background
2.2.1 Correlation
Suppose that we have two discrete-time signals, x[n] and y[n]. We compute the correlation1 between
these two signals, C(x, y), using the formula

    C(x, y) = Σ[n=n1 to n2] x[n]y[n]                       (2.1)
where n1 and n2 define the interval over which we are calculating the correlation. In words, we
compute a correlation by multiplying two signals together and then summing the product. The
result is a single number that indicates the similarity between the signals x[n] and y[n].
What values can C(x, y) take on, and what does this tell us about the signals x[n] and y[n]? Let
us consider the examples in Figure 2.1. For the signals shown in the first column, C(x, y) > 0, in
which case, the signals are said to be positively correlated. Basically, this means that the signals are
more similar than they are dissimilar. In the second column, we can see an example where C(x, y)
is zero. In this case, the two signals are uncorrelated. One might say that uncorrelated signals are
“equally” similar and dissimilar. Notice, for instance, that the signal x[n] × y[n] is positive as often
as it is negative. Knowledge of the value of signal x[n] at time n indicates little about the value of
y[n] at time n. Finally, in the third column we see an example where C(x, y) < 0, which means that
x[n] and y[n] are negatively correlated. This means the signals are mostly dissimilar.
Note that the positively correlated signals given in Figure 2.1 are actually identical. This is a
special case; from equation (2.1), we can see that in this case the correlation is simply the energy of
x[n], i.e.

    C(x, x) = E(x) .                                       (2.2)

1 We will occasionally refer to this operation as “in-place” correlation to distinguish it from “running” correlation.
Sometimes, it is more useful to work with normalized correlation, as defined by

    CN(x, y) = C(x, y) / √(E(x)E(y)) = 1/√(E(x)E(y)) · Σ[n=n1 to n2] x[n]y[n].   (2.3)
Normalized correlation is somewhat easier to interpret. The well-known Cauchy-Schwarz inequality
shows that the normalized correlation varies between −1 and +1. That is, for any two signals
−1 ≤ CN (x, y) ≤ 1. (2.4)
Thus, signals that are as positively correlated as possible have normalized correlation 1 and signals
that are as negatively correlated as possible have normalized correlation -1. Moreover, it is known
that two signals have normalized correlation equal to 1 when and only when one of the signals is
simply the other multiplied by a positive number. In this case, the signals are said to be perfectly
correlated. Similarly, two signals have normalized correlation equal to -1 when and only when one
is simply the other multiplied by a negative number, in which case they are said to be perfectly
anticorrelated.
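As a small numerical sketch (with signals of our own choosing):

```matlab
x = [1 2 3 -1];  y = [2 1 -1 0];
C  = sum(x .* y)                       % in-place correlation, equation (2.1): 1
CN = C / sqrt(sum(x.^2) * sum(y.^2))   % normalized correlation, equation (2.3)
```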
Note that since n was used as the time variable for x and y, we have introduced a new time variable,
k, for r.
Suppose, for example, that we want to know the distance to a certain object, like an airplane. We
transmit a radar pulse, x[n], and receive a signal, y[n], that contains the reflection of our pulse off of
the object. For simplicity, let’s assume that we know y[n] is simply a delayed version of x[n], that
is2 ,
y[n] = x[n − n0 ], (2.7)
However, we do not know the delay factor, n0 . Since n0 is proportional to the distance to our object,
this is the quantity that we wish to estimate. We can use correlation to help us determine this delay,
but we need to use running correlation rather than simply in-place correlation.
Suppose that we first guess that n0 is equal to zero. We correlate y[n] with x[n] and record the
resulting correlation value as one sample of a new signal, r[0]. Then, we guess that n0 is equal to
one, shift x[n] over by one sample, and correlate y[n] with x[n − 1]. We record this correlation
value as r[1]. We can continue this shift-and-correlate procedure, building up the new signal r[k]
according to the formula
$$r[k] = C(x[n-k], y[n]) = \sum_{n=-\infty}^{\infty} x[n-k]\, y[n]. \qquad (2.8)$$
2 Recall that a signal x[n − n0 ] is equal to the signal x[n] shifted n0 samples to the right.
Figure 2.2: (A) A radar pulse. (B) A received sequence from the radar system, containing two
pulses and noise. (C) The running correlation produced by correlating the radar pulse with the
received signal.
Once we find a value of r[k] that equals E(x), we have found the value of n0 . This procedure of
building up the signal r[k] is known as running correlation or sliding correlation. We will refer to
the resulting signal (r[k] above) as the correlation signal.
As an example, Figure 2.2 shows a radar pulse, a received signal containing two delayed versions
of the radar pulse (one without noise and one with noise), and the running correlation produced by
correlating the pulse with the received signal.
Let us note a couple important features of the correlation signal. First, the limits of summation
in equation (2.8) are infinite. Usually, though, the support of x[n] and y[n] will be finite, so we do
not actually need to perform an infinite summation. Instead, the duration of the correlation signal
will be equal to the sum of the durations of x[n] and y[n] minus one³. There will also be transient
effects (or edge effects) at the beginning and end of the correlation signal. These transient effects
result from cases where x[n − k] only partially overlaps y[n]. Finally, notice that the value of the
correlation signal at time k = 0 is just the in-place correlation C(x[n], y[n]).
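The duration property and the relationship r[0] = C(x[n], y[n]) can be checked with a direct (non-real-time) evaluation of equation (2.8). This is an illustrative sketch in plain Python with hypothetical signals, not the MATLAB implementation used in the lab:

```python
def running_corr(x, y):
    # r[k] = sum_n x[n-k] * y[n]  (equation (2.8)) for finite-support signals;
    # k runs from -(len(x)-1) to len(y)-1, giving len(x)+len(y)-1 samples
    r = {}
    for k in range(-(len(x) - 1), len(y)):
        r[k] = sum(x[n - k] * y[n]
                   for n in range(len(y))
                   if 0 <= n - k < len(x))
    return r

x = [1, 2, 3]                    # a hypothetical "pulse"
y = [1, 2, 3, 0, 0, 1, 2, 3]     # contains x at delay 0 and at delay 5
r = running_corr(x, y)

# The correlation signal has len(x) + len(y) - 1 samples; r[0] equals the
# in-place correlation C(x[n], y[n]) = E(x) = 14, and r[5] = 14 reveals
# the second copy of the pulse at delay n0 = 5
```

The small values of r[k] adjacent to the peaks are the transient (edge) effects described above, where x[n − k] only partially overlaps y[n].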
[Block diagram: the input signal feeds a correlation calculator; the resulting correlation signal feeds a decision maker, which outputs a decision.]
While this procedure will work for the idealized system presented, real systems are usually much less ideal. We may have
multiple reflections, distorted reflections, reductions in reflection amplitude, and various kinds of
environmental noise. In order to address such problems in a wide variety of systems, we commonly
use a simple threshold comparison as our decision maker. For instance, if we compute a running
correlation signal r[n], we might choose a constant c called a threshold and make a decision for each
sample based on the following formula:
$$r \underset{0}{\overset{1}{\gtrless}} c \qquad (2.9)$$
That is, when the correlation value r is greater than the threshold, c, we decide 1, or “signal present.”
If the value is less than the threshold, we decide 0, or “signal absent.” In our radar example, for
instance, we might select the threshold to be c = E(x)/2.
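A minimal sketch of the threshold decision rule (2.9), in Python with invented correlation values (the pulse energy here is hypothetical):

```python
def detect(r, threshold):
    # Decision rule (2.9): decide 1 ("signal present") when a correlation
    # sample exceeds the threshold, and 0 ("signal absent") otherwise
    return [1 if v > threshold else 0 for v in r]

E_x = 14.0                                # hypothetical pulse energy
c = E_x / 2                               # the threshold suggested in the text
r = [0.3, 1.2, 14.0, 0.9, -2.0, 13.1]     # invented correlation samples
decisions = detect(r, c)                  # [0, 0, 1, 0, 0, 1]
```

Note that the sample 13.1, which falls short of E(x) because of noise, is still correctly detected because the threshold sits halfway between 0 and E(x).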
[Figure 2.4: Three example code signals, each defined over 100 samples with values between −1 and 1.]
Figure 2.5: (A) Example of a transmitted signal. (B) The sum of the transmitted signals from four
users.
the same fashion, except that, of course, they use different code signals. For example, Figure 2.5
shows a transmitted signal conveying eight message bits using the top code signal from Figure 2.4.
It also shows this signal with the transmitted signals from three other users added to it. Notice that
the signal in the upper panel is obscured in the lower panel.
When someone, say the user’s friend, receives the signal from the communication channel⁵ and
wishes to decode the user’s message bits, the friend correlates the received signal with the user’s
code signal. Specifically, in-place correlation of the received signal with the code signal produces a
value from which the first message bit can be decided. Then in-place correlation of
the received signal with a delayed version of the code signal produces a value from which the second
message bit can be decided, and so on. Since each of these correlation values would equal plus or
minus the energy of the code signal if there were no other signals or noise present, it is natural to
have the decision maker decide that a message bit is “one” if the correlation value is positive and
“zero” if the correlation value is negative. That is, a threshold of zero is chosen.
A communication system of this form is said to be a code-division, multiple-access (CDMA)
system or a direct-sequence, spread-spectrum (DSSS) system. They are used, for example, in 900
5 For simplicity, we assume the communication channel does not attenuate or otherwise distort the transmitted signal.
MHz cordless telephones. Such systems work best when the code signals are as different as possible,
i.e. when the normalized correlation between the code signals of distinct users is as near to zero as
possible, which is what system designers typically attempt to do. Consider the examples in Figure
2.4. The first two code signals are completely uncorrelated, as are the second two. The first and
third signals are slightly anticorrelated. The normalized correlation between these signals is only
-0.2, which is small enough that these two code signals will not interfere much with one another.
Above, we’ve indicated that our detection system uses in-place correlation. This means that this
system is synchronous; that is, the receiver knows when bits are sent. However, we can actually save
ourselves some work by using running correlation, and then sampling the resulting correlation signal
at the appropriate times. This is how we will implement this communication system in the laboratory
assignment. Using the running correlation algorithm presented in this lab, the “appropriate times”
occur in the correlation signal at the end of each code signal. That is, if our code signals are N
samples long, we want to pick off the (k × N)th sample of the running correlation signal to decide the
kth message bit.
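To make this decoding recipe concrete, here is a small Python simulation with hypothetical 8-sample codes (not the lab's code signals). For simplicity, each bit decision uses the in-place correlation of one N-sample segment, which is exactly the (k × N)th sample of the running correlation:

```python
# Hypothetical 8-sample code signal and message bits (not the lab's data)
code = [1, -1, 1, 1, -1, 1, -1, -1]
bits = [1, 0, 1, 1, 0]

# Transmit: send +code for a "one" and -code for a "zero", back to back
tx = []
for b in bits:
    tx.extend(code if b == 1 else [-v for v in code])

# Add a second user whose code is uncorrelated with ours
other = [1, 1, -1, -1, 1, 1, -1, -1]
rx = [t + o for t, o in zip(tx, other * len(bits))]

# Decode: the k-th decision value is the in-place correlation of the k-th
# N-sample segment with the code, and the decision threshold is zero
N = len(code)
decoded = []
for k in range(len(bits)):
    segment = rx[k * N:(k + 1) * N]
    value = sum(s * c for s, c in zip(segment, code))
    decoded.append(1 if value > 0 else 0)
```

Because the two codes here happen to have zero correlation, the second user contributes nothing to the decision values, which come out at exactly ±E(code).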
The threshold used to decode bits in this detection system, which we have chosen to be zero,
is actually a design parameter of the system. If it should happen, for instance, that the system’s
noise is biased in a way that we tend to get slightly positive correlations when no signal is sent,
then we would be able to improve performance of the system by using a positive threshold, rather
than a threshold of zero. Alternatively, we might want to decide that no bit has been sent if the
magnitude of the correlation is below some threshold. In this case, we actually have two thresholds.
One separates “no signal” from a binary “one;” the other separates “no signal” from a binary “zero.”
Suppose now that the received signal is
$$y[n] = x[n - n_0] + w[n], \qquad (2.10)$$
where x[n − n0] is the reflected radar pulse and w[n] is noise, i.e. an unpredictable, usually wildly
fluctuating signal that normally is little correlated with x[n] or any delayed version of x[n]. To
estimate n0 , we perform running correlation of y[n] with x[n], and the resulting correlation signal is
$$\begin{aligned}
r[k] &= \sum_{n=-\infty}^{\infty} x[n-k]\, y[n] \\
&= \sum_{n=-\infty}^{\infty} x[n-k]\,(x[n-n_0] + w[n]) \\
&= \sum_{n=-\infty}^{\infty} x[n-k]\, x[n-n_0] + \sum_{n=-\infty}^{\infty} x[n-k]\, w[n] \\
&= r_0[k] + r_w[k], \qquad (2.11)
\end{aligned}$$
where r0[k] is the running correlation of x[n − n0] with x[n]. Note that r0[k] is what r[k] would
be if there were no noise, as given in equation (2.8), while rw[k] is the running correlation of the
noise w[n] with x[n]. Thus, the effect of noise is to add rw[k] to r0[k]. Though in a well-designed
system rw[k] is usually close to zero, it will occasionally be large enough to influence the decision
made by the decision maker.
In Section 2.2.3, we argued that a threshold-based decision maker was useful for such systems.
Then, for example, when r[k] > c, the decision is that a radar pulse is present at time k, whereas
when r[k] < c, the decision is that no radar pulse is present at time k. Since in the absence of noise
r[k] = E(x) when there is a radar pulse at time k, and since r[k] = 0 when there is no pulse at time
k, it is natural, as mentioned in Section 2.2.3, to choose the threshold c = E(x)/2.
Though it makes good sense to use a threshold detector, such a detector will nevertheless occa-
sionally make an error, i.e. the wrong decision. Indeed, there are two types of errors that a detector
can make. First, it could detect a reflection of the transmitted signal where no actual reflection exists.
This is called a false alarm. It occurs at time k when r0 [k] = 0 and rw [k] > c, i.e. when the part
of the correlation due to noise is larger than the threshold. The other type of error occurs when the
detector fails to detect an actual reflection because the noise causes the correlation to drop below
the threshold even though a signal is present. This type of error is called a miss. It occurs when
r0 [k] = E(x) and r[k] = E(x) + rw [k] < c, which in turn happens when rw [k] < c − E(x). In
summary, a false alarm occurs when there is no radar pulse present, yet the noise causes rw[k] > c,
and a miss occurs when there is a radar pulse present, yet the noise causes rw [k] < c − E(x).
Depending on the detection system being developed, these two types of error could be equally
undesirable or one could be more undesirable than the other. For instance, in a defensive radar system,
false alarms are probably preferable to misses, since the former are decidedly less dangerous. We
can trade off the likelihood of these two types of error by adjusting the threshold. Raising the
threshold decreases the likelihood of a false alarm, while lowering it decreases the likelihood of a
miss.
It is often useful to know the frequency of each type of error. There is a simple way to empirically
estimate these frequencies. First, we perform an experiment where we do not send any radar pulses,
but simply record the received signal y[n], which contains just environmental noise w[n]. We then
compute its running correlation r[k] with the radar pulse x[n], which is just rw[k]. We count the
number of times that rw[k] exceeds the threshold c and divide by the total number of samples. This
gives us an estimate of the false alarm rate, which is the frequency with which the detector will decide
a radar pulse is present when actually there is none. We can also use this technique to estimate
the miss rate. When a radar pulse is present, an error occurs when rw [k] < c − E(x). Thus, we
can estimate the miss rate by counting the number of times the already computed correlation signal
rw [k] is less than c − E(x), and dividing by the total number of samples.
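The error-rate estimation procedure can be sketched as follows (in Python, with zero-mean Gaussian noise standing in for the recorded noise-only correlation signal rw[k]; the pulse energy is a made-up value):

```python
import random

random.seed(0)                 # reproducible "recorded" noise
E_x = 16.0                     # hypothetical radar-pulse energy
c = E_x / 2                    # the threshold suggested in the text

# Stand-in for the noise-only correlation signal r_w[k]
rw = [random.gauss(0.0, 4.0) for _ in range(10000)]

# False alarm: no pulse present, yet noise alone pushes r_w[k] above c
false_alarm_rate = sum(1 for v in rw if v > c) / len(rw)

# Miss: pulse present, but E(x) + r_w[k] < c, i.e. r_w[k] < c - E(x)
miss_rate = sum(1 for v in rw if v < c - E_x) / len(rw)
```

Since the simulated noise distribution is symmetric about zero and c = E(x)/2, the two estimated rates come out approximately equal, as the discussion below predicts.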
The signal value distribution is also useful here. If we plot the histogram of values in rw[k], we
can use this plot to determine the error rate estimates. The estimate of the false alarm rate is the area of
the histogram over values that exceed c, divided by the total area of the histogram⁶. Similarly, the
estimate of the miss rate is the area of the histogram over values that are less than c − E(x).
Assuming that the distribution of rw [k] is symmetric about 0, we can minimize the total error
rate (which is simply the sum of the false alarm and miss rates) by setting a threshold that yields the
same number of false alarms as misses. Since the distribution of rw [k] is assumed to be symmetric,
we get an equal number of false alarms and misses when c = E(x)/2, which is the threshold value
suggested earlier.
Next, it is important to note how the error rates depend on the energy of the radar pulse.
Consider first the false alarm rate, which corresponds to the frequency with which the noise in-
duced correlation signal rw [k] exceeds c = E(x)/2. Suppose for example that the radar pulse
is amplified by a factor of two. Then its energy increases by a factor of four, and consequently,
6 That is, we sum the values in this region of the histogram and divide by the sum of all values in the histogram
the threshold c increases by a factor of four. On the other hand, one can see from the formula
$$r_w[k] = \sum_{n} x[n-k]\, w[n]$$
that the noise term rw[k] will only be doubled. Since the threshold is quadrupled but the noise term
is only doubled, the frequency with which the noise term exceeds the threshold will be greatly
decreased, i.e. the false alarm rate is greatly decreased. A similar argument shows
that the miss rate is also greatly decreased. Thus, we see that what matters is the energy of the radar
pulse, in relation to the strength of the noise. If the energy of the signal increases, but the typical
values of the noise w[n] remain about the same, the system will make fewer errors. By making the
energy sufficiently large, we can make the error rate as small as we like. In the lab assignments to
follow, we will observe situations where the noise w[n] is so strong that it completely obscures the
radar pulse x[n − n0 ], yet the radar pulse is long enough that it has enough energy that a correlation
detector will make few errors.
Finally, we comment on the effects of noise on the DSSS detector. In this case, instead of
deciding whether a pulse is present or not, the detector decides whether a positive or negative code
signal is present. As with the radar example, this must ordinarily be accomplished in the presence
of noise. However, in this case there are two kinds of noise: environmental noise, similar to that
which affects radar, and multiple user noise, which is due to other users transmitting their own code
signals. In the absence of any noise, the in-place correlation r(x, y) computed by the detector will be
+E(x) when the message bit is “one” and −E(x) when the message bit is “zero”, where x[n] is the
user’s code signal. For this reason, using a decision threshold c = 0 is natural. When the message
bit is zero, an error occurs when r(x, y) > 0, which happens when the correlation term rw due to
noise exceeds E(x). Similarly, when the message bit is one, an error occurs when r(x, y) < 0,
which happens when rw < −E(x). As with the radar example, errors occur less frequently when
the signal energy becomes larger. This will be evident in the lab assignment, when code signals of
different lengths, and hence different energies, are used.
(d) Output the running sum as the next sample of the correlation signal.
In the laboratory assignment, you will be asked to complete an implementation of this algorithm.
Note that significant portions of this algorithm can be implemented very simply in MATLAB. For
instance, all of (a) can be accomplished using a single line of code. Similarly, parts (b) through (d)
can all be accomplished in a single line using one of MATLAB’s built-in functions and its vector
arithmetic capabilities.
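A Python sketch of the real-time algorithm, assuming steps (a)-(c) are the buffer shift and pointwise multiply-and-sum suggested by the surrounding notes (in the lab you will write this as the MATLAB function run_corr.m):

```python
def run_corr(x, y):
    # "Real-time" running correlation: the buffer holds the most recent
    # len(x) samples of the input; each incoming sample yields one output
    buf = [0.0] * len(x)
    out = []
    # Pad with len(x)-1 trailing zeros so partial overlaps at the end
    # are included, giving len(x)+len(y)-1 output samples in all
    for sample in list(y) + [0.0] * (len(x) - 1):
        # (a) shift the buffer and append the new input sample
        buf = buf[1:] + [sample]
        # (b)-(d) multiply pointwise with x, sum, and output the running sum
        out.append(sum(a * b for a, b in zip(x, buf)))
    return out

r = run_corr([1, 2, 3], [0, 0, 1, 2, 3, 0])
# The peak value equals E(x) = 14 where the pulse is fully aligned
```

The buffer shift `buf = buf[1:] + [sample]` is the Python analogue of the MATLAB buffer operation `b = [b(2:end), new_value]` discussed below.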
Note that x and y must be the same size; otherwise MATLAB will return an error.
• The subplot command: In order to put several plots on the same figure in MATLAB, we
use the subplot command. subplot creates a rectangular array of axes in a figure. Figure
2.6 has an example figure with such an array. Each time you call subplot, you activate
one of the axes. subplot takes three input parameters. The first and second indicate the
number of rows and the number of columns in the array, respectively. The third parameter
indicates which of the axes to activate by counting along the rows⁷. Thus the command:
>> subplot(2,3,5)
Other useful forms of the axis command include axis tight, which fits the axis range
closely around the data in a plot, and axis equal, which assures that the x- and y-axes
have the same scale.
• Buffer operations in MATLAB: It is often useful to use MATLAB’s vectors as buffers, with
which we can shift values in the buffer towards the beginning or end of the buffer by one
position. Such an operation has two parts. First, we discard the number at the beginning or
end of the buffer. If our buffer is a vector b, we can do this using either b = b(2:end) or
b = b(1:end-1). Then, we append a new number to the opposite end of the buffer using
a standard array concatenation operation. Note that we can easily combine these two steps
into a single command. For instance, if b is a row vector and we wish to shift towards the end
of the buffer, we use a command such as
>> b = [b(2:end), new_value];
where new_value denotes the new sample to be appended.
• Counting elements that meet some condition: Occasionally we may want to determine how
many elements in a vector meet some condition. This is simple in MATLAB because of how
the conditional operators are handled. Recall that for a vector, v, (v == 3) will return a
vector with the same size as v, the elements of which are either 1 or 0 depending upon the
truth of the conditional statement. Thus, to count the number of elements in v that equal 3,
we can simply use the command
>> sum(v == 3)
(a) (Plotting code signals) Use subplot and stairs to plot the three code signals on
three separate axes in the same figure. After plotting each signal, call axis([1, 100, -1.5, 1.5])
to make sure that the signal is visible.
• Include your figure, with axis labels on each subplot, a figure number and caption,
and the generating code in your report.
(b) (Calculate statistics) For each of the three signals generated above, calculate:
• Their mean values.
• Their energies.
(c) (Calculate correlations) Calculate the “in-place” correlation and normalized correlation
for the following pairs of signals.
• code1 and code2
• code1 and code3
• code2 and code3
(d) (Classify correlations) For each of the signal pairs given in problem 1c:
• Identify each pair as positively correlated, uncorrelated, or negatively correlated.
2. (Implementing and interpreting running correlation) Download the file run_corr.m, which
is a “skeleton” file for an implementation of the “real-time” running correlation algorithm
described in Section 2.2.2. It accepts two input signals, performs running correlation on them,
and produces the correlation signal with a length equal to the sum of the lengths of the input
signals minus one.
(a) (Write the code) Complete the function, following the algorithm given in Section 2.2.2.
You can use the completed demo version of the function, run_corr_demo.dll to
check your function’s output⁸.
• Include your code in the MATLAB appendix of your report.
(b) (Compute running correlations) Use run_corr.m to compute the running correlation
between the following pairs of signals, and plot the resulting correlation signals on the
same figure using subplot.
• code1 and code2.
• code3 and itself.
(c) (Interpret a running correlation) When performing running correlation with a signal and
itself, the resulting correlation signal has some special properties. Look at the correlation
signal that you computed between code3 and itself.
• Is the correlation signal symmetric? (It can be shown that it should be.)
• What is the maximum value of the correlation signal? How does the maximum
value relate to the energy of code3?
3. (Using correlation to decode DSSS signals transmitted simultaneously with other signals.)
Download the file lab2_data.mat and load it into your workspace. The file contains the
variable dsss, which represents a received signal that is the sum of several message-carrying
signals, one from each of four users. The message-carrying signal from each user conveys a
sequence of message bits.
8 If you cannot get your function working properly, you may use run_corr_demo.dll to complete the rest of the
assignment.
(a) (Plot the signals) First, let’s look at the signals we’re given.
• Use subplot and stairs to plot dsss, cs1, and cs2 on three separate axes of
the same figure.
(b) (Decoding the bits of the user with the longer code signal) Start by using run_corr to
correlate the received signal dsss with the longer code signal cs1. Call the resulting
signal cor1. Now, to decode the sequence of message bits from this user, we need to
extract the appropriate samples from cor1. That is, we need to extract just those sam-
ples of the running correlation that correspond to the appropriate in-place correlations.
We can do this in MATLAB using the following command:
>> sub_cor1 = cor1(length(cs1):length(cs1):length(cor1));
Each sample of sub_cor1 is used to make the decision about one of the user’s bits.
When it is greater than zero, i.e. the correlation of the received signal with the code
signal is positive, the decoder decides the bit is 1. When it is less than zero, the decoder
decides the user’s bit is 0.
• On two subplots of the same figure, use plot to plot cor1, and stem to plot
sub_cor1.
• Decode the sequence of bits. (You can do this visually or with MATLAB.) (Hint:
The sequence is 10 bits long, and the first 3 bits are “011”.)
(c) (Decoding the bits of the user with the shorter code signal) Repeat the procedure in a
and b above, this time using the code signal cs2. Call your correlation signal cor2,
and the vector of extracted values sub_cor2.
• On two subplots of the same figure, use plot to plot cor2, and stem to plot the
signal sub_cor2.
• Decode the sequence of bits. (Hint: there are 17 bits in this sequence.)
• Since the code signal cs2 has less energy (because it is shorter), there is a greater
chance of error. Are there any decoded bits that you suspect might be incorrect?
Which ones? Why?
4. (Using running correlation to detect reflected radar pulses) lab2_data.mat also contains
three other signals: radar_pulse, radar_received, and radar_noise. The re-
ceived signal contains several reflections of the transmitted radar pulse and noise. The signal
radar_noise contains noise with similar characteristics to the noise in the received signal
without the reflected pulses.
(a) (Examining the radar signals) First, let’s take a look at the first two signals.
5. On the front page of your report, please provide an estimate of the average amount of time
spent outside of lab by each member of the group.
9 Remember that the radar pulse must travel to the object and then back again.
3.1 Introduction
Sinusoids are important signals. Part of their importance comes from their prevalence in the everyday
world, where many signals can be easily described as a sinusoid or a sum of sinusoids. Another part
of their importance comes from their properties when passed through linear time-invariant systems.
Any linear time-invariant system whose input is a sinusoid will have an output that is a sinusoid of
the same frequency, but possibly with different amplitude and phase. Since a great many natural
systems are linear and time-invariant, this means that sinusoids form a powerful tool for analyzing
systems.
Being able to identify the parameters of a sinusoid is a very important skill. From a plot of the
sinusoid, any student of signals and systems should be able to easily identify the amplitude, phase,
and frequency of that sinusoid.
However, there are many practical situations where it is necessary to build a system that iden-
tifies the amplitude, phase, and/or frequency of a sinusoid — not from a plot, but from the actual
signal itself. For example, many communication systems convey information by modulating, i.e.
perturbing, a sinusoidal signal called a carrier. To demodulate the signal received at the antenna,
i.e. to recover the information conveyed in the transmitted signal, the receiver often needs to know
the amplitude, phase, and frequency of the carrier. While the frequency of the sinusoidal carrier is
often specified in advance, the phase is usually not specified (it is just whatever phase happens to
occur when the transmitter is turned on), and the amplitude is not known because it depends on the
attenuation that takes place during transmission, which is usually not known in advance. Moreover,
though the carrier frequency is specified in advance, no transmitter can produce this frequency ex-
actly. Thus, in practice the receiver must be able to “lock onto” the actual frequency that it receives.
Doppler radar provides another example. With such a system, a transmitter transmits a sinusoidal
waveform at some frequency fo . When this sinusoid reflects off a moving object, the frequency of
the returned sinusoid is shifted in proportion to the velocity of the object. A system that determines
the frequency of the reflected sinusoid will also be able to determine the speed of the moving object.
How can a system be designed that automatically determines the amplitude, frequency and phase
of a sinusoid? One could imagine any number of heuristic methods for doing so, each based on how
you would visually extract these parameters. It turns out, though, that there are more convenient
methods for doing so – methods which involve correlation. In this lab, we will examine how to
automatically extract parameters from a sinusoid using correlation. Along the way, we will discover
how complex numbers can help us with this task. In particular, we will make use of the complex
exponential signal and see the mathematical benefits of using an “imaginary” signal that does not
really exist.
3.2 Background
3.2.1 Complex numbers
Before we begin, let us quickly review the basics of complex numbers. Recall that a complex number
z = x + jy is defined by its real part, x, and its imaginary part, y, where $j = \sqrt{-1}$. Also recall that
we can rewrite any complex number into polar form¹ or exponential form, $z = re^{j\theta}$, where r = |z|
is the magnitude of the complex number and θ = angle(z) is the angle. We can convert between the
two forms using the formulas
$$x = r\cos(\theta) \qquad (3.1)$$
$$y = r\sin(\theta) \qquad (3.2)$$
and
$$r = \sqrt{x^2 + y^2} \qquad (3.3)$$
$$\theta = \begin{cases} \tan^{-1}\left(\frac{y}{x}\right), & x \ge 0 \\[4pt] \tan^{-1}\left(\frac{y}{x}\right) + \pi, & x < 0 \end{cases} \qquad (3.4)$$
A common operation on complex numbers is the complex conjugate. The complex conjugate of
a complex number z, denoted z*, is given by
$$z^* = x - jy \qquad (3.5)$$
$$= re^{-j\theta} \qquad (3.6)$$
Conjugation is particularly useful because zz ∗ = |z|2 .
Euler’s² formula is a very important (and useful) relationship for complex numbers. This formula
allows us to relate the polar and rectangular forms of a complex number. Euler’s formula is
$$e^{j\theta} = \cos(\theta) + j\sin(\theta) \qquad (3.7)$$
Equally important are Euler’s inverse formulas:
$$\cos(\theta) = \frac{e^{j\theta} + e^{-j\theta}}{2} \qquad (3.8)$$
$$\sin(\theta) = \frac{e^{j\theta} - e^{-j\theta}}{2j} \qquad (3.9)$$
It is strongly recommended that you commit these three equations to memory; you will be using
them regularly throughout this course.
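Euler's formula and its inverses are easy to check numerically. This Python sketch uses the standard cmath module (the lab itself uses MATLAB, where exp, abs, and angle play the same roles); the angle and test number are arbitrary:

```python
import cmath
import math

theta = 0.7   # an arbitrary angle, in radians

# Euler's formula (3.7): e^{j*theta} = cos(theta) + j*sin(theta)
lhs = cmath.exp(1j * theta)
rhs = complex(math.cos(theta), math.sin(theta))

# Euler's inverse formulas (3.8) and (3.9)
cos_theta = (cmath.exp(1j * theta) + cmath.exp(-1j * theta)) / 2
sin_theta = (cmath.exp(1j * theta) - cmath.exp(-1j * theta)) / 2j

# Rectangular <-> polar conversion, equations (3.1)-(3.4)
z = 3 - 4j
r_mag = abs(z)            # sqrt(3^2 + (-4)^2) = 5
theta_z = cmath.phase(z)  # the angle of z
```

Reconstructing z as r_mag * e^{j theta_z} recovers the original rectangular form, illustrating the equivalence of the two representations.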
1 Sometimes the polar form is written as z = r∠θ, which is a mathematically less useful form. This form, however, is
Using Euler’s formula, we can also interpret a complex exponential signal c(t) as the sum of a real
cosine wave and an imaginary sine wave:
c(t) = A cos(ω0 t + φ) + jA sin(ω0 t + φ) (3.17)
Sometimes it is useful to visualize a complex exponential signal as a “corkscrew” in three dimen-
sions, as in Figure 3.1. Note that it is common to permit complex exponential signals to have
either positive or negative frequency. The sign of the frequency determines the “handedness” of the
corkscrew.
3.2.3 Finding the amplitude and phase of a sinusoid with known frequency
We’ve suggested that we can use correlation to help us determine the amplitude and phase of a sinu-
soid with known frequency. Suppose that we have a continuous-time sinusoid (the target sinusoid)
s(t) = A cos(ω0 t + φ) (3.18)
with known frequency ω0 , but unknown amplitude A and phase φ, which we would like to find.
We can perform in-place correlation⁴ between this sinusoid and a reference sinusoid, u(t), with the
3 These are sometimes referred to simply as complex exponentials.
4 In-place correlation between two real, continuous-time signals, x(t) and y(t), is defined as $C(x, y) = \int_a^b x(t)\, y(t)\, dt$.
The length (b − a) is the correlation length.
[Figure 3.1: A complex exponential signal plotted as a three-dimensional “corkscrew,” with time on one axis and the real and imaginary parts on the other two.]
same frequency and known amplitude and phase. Without loss of generality, let u(t) have A = 1
and φ = 0. Then⁵,
$$\begin{aligned}
C(s, u) &= \int_{t_1}^{t_2} A\cos(\omega_0 t + \phi)\cos(\omega_0 t)\, dt \qquad &(3.19)\\
&= \frac{A}{2}\int_{t_1}^{t_2} \cos(\phi) + \cos(2\omega_0 t + \phi)\, dt \qquad &(3.20)\\
&= \frac{A}{2}\left[\cos(\phi)\, t + \frac{1}{2\omega_0}\sin(2\omega_0 t + \phi)\right]_{t_1}^{t_2} \qquad &(3.21)
\end{aligned}$$
Since we know the frequency, ω0 , we can easily set the limits of integration to include an integer
number of fundamental periods of our sinusoids. In this case, the second term evaluates to zero and
the correlation reduces to
$$C(s, u) = \frac{A}{2}\cos(\phi)(t_2 - t_1) \qquad (3.22)$$
This formula is a useful first step. If we happen to know the phase φ, then we can readily calculate
the amplitude A of s(t) from C(s, u). Similarly, if we know the amplitude A, we can narrow the
phase φ down to one of two values. If both amplitude and phase are unknown, though, we cannot
uniquely determine them.
Note that if the interval over which we correlate is not a multiple of the fundamental period of
u(t), then the second term in equation (3.21) will not be zero. However, if as commonly happens ω0
is much greater than one, then the second term will be so small that it can be ignored, and equation
(3.22) holds with approximate equality.
To resolve the ambiguity when both amplitude and phase are unknown, one common approach
is to correlate with a second reference sinusoid that is π/2 out of phase with the first. Here, though, we
will explore a different method which is somewhat more enlightening. Notice what happens if we
5 Recall that $\cos(A)\cos(B) = \frac{1}{2}\cos(A - B) + \frac{1}{2}\cos(A + B)$.
If we again assume that we are correlating over an integer number of periods of our target sinusoid,
then the second term goes to zero and we are left with
$$C(s, c) = \frac{A}{2}\, e^{j\phi}\, (t_2 - t_1). \qquad (3.29)$$
Our correlation has resulted in a simple complex number whose magnitude is directly proportional
to the amplitude of the original sinusoid and whose angle is identically equal to its phase! We can
easily turn the above formula inside-out to obtain
$$A = \frac{2}{t_2 - t_1}\, |C(s, c)| \qquad (3.30)$$
$$\phi = \mathrm{angle}(C(s, c)) \qquad (3.31)$$
We can also see from equation (3.29) that in correlating with a complex exponential signal, we have
effectively calculated the phasor⁷ representation of our sinusoid.
As with the case of correlating with a sinusoid, we note that when the interval over which we
correlate is not a multiple of the fundamental period of c(t), then the second term in equation (3.28)
is not zero. However, if as commonly happens ω0 is much greater than 1, then the second term
will again be small enough that it can be ignored, and equations (3.29), (3.30), and (3.31) hold with
approximate equality.
[Figure 3.2 diagram: the signal vector, support vector, and frequency are inputs to the Amplitude and Phase Calculator (APC), which outputs the amplitude and phase.]
Figure 3.2: System diagram for the “amplitude and phase calculator.”
As shown below, when Ts is small, the correlation between s(t) and c(t) can be approximately
computed from the correlation between s[n] and c[n]. Let {n1 , . . . , n2 } denote the discrete-time
interval corresponding to the continuous-time interval [t1 , t2 ], and let N = n2 − n1 + 1 denote the
number of samples taken in the interval [t1 , t2 ], so that t2 − t1 ≈ N Ts . Then,
$$\begin{aligned}
C(s, c) &= \int_{t_1}^{t_2} s(t)\, c^*(t)\, dt \qquad &(3.34)\\
&= \sum_{n=n_1}^{n_2} \int_{nT_s}^{(n+1)T_s} s(t)\, c^*(t)\, dt \qquad &(3.35)\\
&\approx \sum_{n=n_1}^{n_2} \int_{nT_s}^{(n+1)T_s} s(nT_s)\, c^*(nT_s)\, dt \qquad &(3.36)\\
&= \sum_{n=n_1}^{n_2} s(nT_s)\, c^*(nT_s)\, T_s \qquad &(3.37)\\
&= \sum_{n=n_1}^{n_2} s[n]\, c^*[n]\, T_s \qquad &(3.38)\\
&= C_d(s, c)\, T_s \qquad &(3.39)
\end{aligned}$$
where the approximation leading to the third relation is valid because Ts is small, and consequently
the signals s(t) and c(t) change little over each Ts second sampling interval, and where we use
Cd (s, c) to denote the correlation between the discrete-time signals s[n] and c[n], to distinguish it
from the correlation between continuous time signals s(t) and c(t). We see from this derivation
that the continuous-time correlation is approximately the discrete-time correlation multiplied by the
sampling interval, i.e. $C(s, c) \approx C_d(s, c)\, T_s$.
We will use this value of correlation in equations (3.30) and (3.31) to estimate the amplitude and
phase of a continuous-time sinusoid.
In the laboratory assignment, we will be implementing an “amplitude and phase calculator”
(APC) as a MATLAB function. A diagram of this system is shown in Figure 3.2. The system takes
three input parameters. The first is the signal vector which contains the sinusoid itself. The second
is the support vector for the sinusoid. The third input parameter is the frequency of the reference
sinusoid in radians per second. Note that for the system’s output to be exact, the input sinusoid must
be defined over exactly an integer number of fundamental periods.
The system outputs the sinusoid’s amplitude and its phase in radians. The system calculates
these outputs by first computing the discrete-time correlation given by equations (3.37) or (3.38).
Then, this correlation value is used with equations (3.30) and (3.31) to compute the amplitude and
phase. Note that in equation (3.30), we need to replace $t_2 - t_1$ with $N = n_2 - n_1 + 1$ when
implementing in discrete time.
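Although the lab’s apc.m is written in MATLAB, the same computation can be sketched in Python/NumPy. The function name apc_sketch and the test signal below are illustrative only, not part of the assignment:

```python
import numpy as np

def apc_sketch(s, t, w0):
    """Estimate amplitude and phase of a sampled sinusoid by correlating
    it with exp(-j*w0*t), then applying the discrete-time forms of
    equations (3.30)/(3.31): A = 2|Cd|/N, phi = angle(Cd)."""
    N = len(s)
    Cd = np.sum(s * np.exp(-1j * w0 * t))  # discrete correlation, eq. (3.38)
    return 2 * np.abs(Cd) / N, np.angle(Cd)

# Exactly five periods of a 5 Hz sinusoid with A = 3, phi = 0.5 rad
Ts = 0.001
t = np.arange(0, 1, Ts)
s = 3 * np.cos(2 * np.pi * 5 * t + 0.5)
A, phi = apc_sketch(s, t, 2 * np.pi * 5)
print(A, phi)   # approximately 3.0 and 0.5
```

Because the support covers an integer number of fundamental periods, the recovered amplitude and phase are exact up to floating-point error, as the derivation above predicts.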
Here, let us make a simplifying assumption and assume that (ωs + ωc ) is sufficiently large that we
can neglect the second term. Then, we have
$$C(s,c) \approx \frac{A}{2j(\omega_s - \omega_c)}\left[e^{j[(\omega_s-\omega_c)t_2+\phi]} - e^{j[(\omega_s-\omega_c)t_1+\phi]}\right] \tag{3.46}$$
The resulting equation depends primarily on the frequency difference (ωs − ωc ) between the target
sinusoid and our reference signal. Though it is not immediately apparent, the value of this correlation
converges to the value of equation (3.29) as (ωs − ωc) approaches zero.
Consider now the length-normalized correlation, $\tilde{C}(s,c)$, defined as
$$\tilde{C}(s,c) = \frac{C(s,c)}{t_2 - t_1}\,. \tag{3.47}$$
One can see from equation (3.29) that when the reference and target signals have the same frequency,
the length-normalized correlation does not depend on the length of the signal. However, when the
signals have different frequencies, one can see from equations (3.46) and (3.47) that the magnitude
of the length-normalized correlation becomes smaller as we correlate over a longer period of time.
(This happens more slowly as the frequency difference becomes smaller.) In the limit as the corre-
lation length goes to infinity, the length-normalized correlation goes to zero unless the frequencies
match exactly. This is a very important theoretical result in signals and systems.
Another special case occurs when we correlate over a common period of the target and reference
signals. This occurs when our correlation interval includes an integer number of periods of both
the target signal and reference signal. In this case, the correlation in equation (3.46), for signals of
different frequencies, is identically zero8 . Of course, the correlation is not zero when the frequencies
match. Note that this is the same condition required for equation (3.29) to be exact.
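This orthogonality is easy to confirm numerically. The following NumPy sketch (not part of the lab) correlates two complex exponentials over one second, a common period of all integer-Hz harmonics:

```python
import numpy as np

# Over a common period, the length-normalized correlation of two complex
# exponentials is zero unless their frequencies match.
Ts = 0.01
t = np.arange(0, 1, Ts)   # one second: a common period of all 1 Hz harmonics

def norm_corr(f1_hz, f2_hz):
    c = np.sum(np.exp(2j*np.pi*f1_hz*t) * np.conj(np.exp(2j*np.pi*f2_hz*t))) * Ts
    return c / 1.0        # divide by t2 - t1 = 1 second

print(abs(norm_corr(3, 3)))   # about 1.0 (same frequency)
print(abs(norm_corr(3, 5)))   # about 0   (different frequencies, common period)
```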
How does all of this help us to determine the frequency of the target sinusoid? The answer is
perhaps less elegant than one might hope; basically, we “guess and check”. If we have no prior
knowledge about possible frequencies for the sinusoid, we need to check the correlation with com-
plex exponentials having a variety of frequencies. Then, whichever complex exponential yields
the highest correlation, we take the frequency of that complex exponential as our estimate of the
frequency of the target signal. In the next section, we will formalize this algorithm.
$$\frac{1}{T},\ \frac{2}{T},\ \ldots,\ \frac{N}{2T} = \frac{1}{2T_s} \tag{3.48}$$
Then, for k = 1, 2, . . . , N/2, the length-normalized correlation of s(t) with the complex exponential
at frequency k/T is (using equations (3.37), (3.38) and (3.47))
$$\begin{aligned}
X[k] &\approx \frac{1}{T}\sum_{n=0}^{N-1} s(nT_s)\, e^{-j2\pi \frac{k}{T} n T_s}\, T_s && (3.49)\\
&= \frac{1}{N}\sum_{n=0}^{N-1} s[n]\, e^{-j2\pi \frac{k}{N} n} && (3.50)
\end{aligned}$$
where we have used the fact that Ts /T = 1/N and where we have denoted the result X[k] because
this is the notation used in future labs for the last formula above. Thus, the output of these
correlations is the set of N/2 numbers X[1], . . . , X[N/2]. Remember that X[k] will generally be
complex. To estimate the frequency of the target sinusoid, we simply identify the value of k for
which |X[k]| is largest. With kmax denoting this value, our estimated frequency, ωest , is
$$\omega_{est} = 2\pi \frac{k_{max}}{T} = 2\pi \frac{k_{max}}{N T_s} \tag{3.51}$$
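The “guess and check” search of equations (3.48)–(3.51) can be sketched in Python/NumPy (the lab’s fape.m does this in MATLAB; the variable names here are illustrative):

```python
import numpy as np

# X[k] correlates s[n] with a complex exponential at frequency k/T Hz;
# the largest |X[k]| picks the frequency estimate.
Ts = 0.001
t = np.arange(0, 1, Ts)
N = len(t)                          # N = 1000, so T = N*Ts = 1 second
s = np.cos(2 * np.pi * 40 * t)      # target sinusoid at 40 Hz = 80*pi rad/s

n = np.arange(N)
ks = np.arange(1, N // 2 + 1)
X = np.array([np.sum(s * np.exp(-2j*np.pi*k*n/N)) / N for k in ks])  # (3.50)
k_max = ks[np.argmax(np.abs(X))]
w_est = 2 * np.pi * k_max / (N * Ts)    # eq. (3.51)
print(k_max, w_est)                     # 40 and about 251.33 rad/s
```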
Now that we have estimated the frequency, we should also be able to estimate the amplitude and
phase as well. In fact, we have almost calculated these estimates already from equations (3.30) and
(3.31).
Figure 3.3: System diagram for the “frequency, amplitude, and phase estimator.”
There is one potential problem here, however. Previously, we assumed the frequency was known ex-
actly when determining the amplitude and phase; now, we only know the frequency approximately.
In the laboratory assignment, we will see the effect of this approximation.
In the laboratory assignment, we will be developing a system that can automatically estimate the
amplitude, phase, and frequency of a sinusoid. A block diagram of the “frequency, amplitude, and
phase estimator” (FAPE) system is given in Figure 3.3. Unlike the APC, this system takes only two
input parameters: a signal vector and the corresponding support vector. The system has four output
parameters. The first three are the estimates of the frequency, amplitude, and phase of the input
sinusoid. The fourth is the vector of correlations X[1], . . . , X[N/2] produced by the correlations. It
is often useful to examine this vector to get a sense of what the system is doing.
that the imaginary component of a complex number is in fact a real number, which MATLAB
stores in the usual way. It thinks of a complex number as a pair of floating-point numbers, one
to be interpreted as the real part and the other to be interpreted as the imaginary part. And it
knows the rules of arithmetic to apply to such pairs of numbers in order to do what complex
arithmetic is supposed to do.
• Extracting parts of complex numbers: If z contains a complex number (or an array of
complex numbers), you can find the real and imaginary parts using the commands real(z)
and imag(z), respectively. You can obtain the magnitude and angle of a complex number (or
an array of complex numbers) using the commands abs(z) and angle(z), respectively.
• Complex conjugation: To compute the complex conjugate of a value (or array) z, simply use
the MATLAB command conj(z).
• Finding the index of the maximum value in a vector: Sometimes we don’t just want to
find the maximum value in a vector; instead, we need to know where that maximum value is
located. The max command will do this for us. If v is a vector and you use the command
[max_value, index] = max(v)
then the variable max_value will contain the largest value in the vector, and index contains
the position of max_value in v.
• MATLAB commands to help you visually determine the amplitude, frequency, and phase
of a sinusoid: Sometimes you may need to determine the frequency, phase, and amplitude of
a sinusoid from a MATLAB plot. In these cases, there are three commands that are quite useful.
First, the command grid on includes a reference grid on the plot; this makes it
easier to see where the sinusoid crosses zero (for instance). The zoom command is also
useful, since you can drag a zoom box to zoom in on any part of the sinusoid. Finally, you can
use axis in conjunction with zoom to find the period of the signal. To do so, simply zoom
in on exactly one period of the signal and type axis. MATLAB will return the current axis
limits as [x_min, x_max, y_min, y_max].
• Calling apc: The function apc, which you will be writing in this laboratory, estimates am-
plitude and phase of a continuous-time target sinusoid from its samples. The input parameters
are a (sampled) target sinusoid s, the sinusoid’s support vector t, and the continuous-time
frequency w0 in radians per second. We call apc like this (the outputs are the amplitude and
phase estimates):
>> [A, phi] = apc(s, t, w0)
Note that a compiled version of this function, called apc_demo.dll, is also available.
• Calling fape: The function fape, which you will be writing in this laboratory, implements
the frequency, amplitude, and phase estimator system. This function accepts the samples of a
target continuous-time sinusoid s and its support vector t, like this:
>> [frq, A, phi, X] = fape(s, t)
where frq is the estimated frequency in radians per second, A is the estimated amplitude,
phi is the estimated phase, and X is the vector of correlations, X[1], . . . , X[N/2] between s
and each reference complex exponential. Note that a compiled version of this function, called
fape_demo.dll, is also available.
(a) (Extracting sinusoid parameters) Visually identify the amplitude, continuous-time fre-
quency, and phase of the continuous-time (sampled) sinusoid that you’ve just plotted.
• Include your estimated values in your report. Reduce your answers to decimal form.
• What is the phasor that corresponds to this sinusoid? Write it in both rectangular and
polar form. (Again, keep your answers in decimal form. You should use MATLAB
to perform these calculations.)
(b) (Checking your parameters) Verify your answers in the previous problem by generating
a sinusoid using those parameters and plotting them on the above graph using hold on.
Use t as your time axis/support vector. The new plot should be close to the original, but
it does not need to be exactly correct.
• Include the resulting graph in your report. Remember to include a legend.
2. (The Amplitude and Phase Calculator) In this problem we will complete and test a func-
tion which implements the “Amplitude and Phase Calculator”, as described in Section 3.2.3.
Download the file apc.m. This is a “skeleton” M-file for the “amplitude and phase calcu-
lator”. Also, generate the following sinusoid (s_test) with its support vector (t_test):
(a) (Identify sinusoid parameters by hand) What are the amplitude, frequency in radians per
second, and phase of s_test?
• Include your answers in your lab report.
(b) (Write the APC) Complete the function apc. You should use the signal s_test to
test the operation of your function. You may also wish to use the compiled function
apc_demo.dll to test your results on other sinusoids.
• Include the code for apc in your MATLAB appendix.
(c) (Test APC on a sinusoid with unknown parameters) Download the file lab3_data.mat.
This .mat file contains the support vector (t_samp) and signal vector (s_samp) for
a sampled continuous-time sinusoid with a continuous-time frequency of ω0 = 200π
radians per second.
• From t_samp, determine the sampling period, Ts , of this signal.
• Use apc9 to determine the amplitude and phase of the sinusoid exactly.
(d) (APC in a non-ideal case) What happens if we use apc to correlate over a non-integral
number of periods of our target sinusoid? We will investigate this question in this prob-
lem and the next. First, let’s examine a single non-integral number of periods. Generate
the following sinusoid:
>> apc_support = 0:0.1:8;
>> apc_test = cos(apc_support*2*pi/3);
This is a sinusoid with a frequency of ω0 = 2π/3 radians per second, unit amplitude, and
zero phase shift.
• Plot apc_test and include the plot in your report.
• What is the fundamental period of apc_test?
• Approximately how many periods are included in apc_test?
• Use apc to estimate the amplitude and phase of this sinusoid. What are the ampli-
tude and phase errors for this signal?
(e) (APC in many non-ideal cases) Now we wish to examine a large number of different
lengths of this sinusoid. You will do this by writing a for loop that repeats the previous
part for many different values of the length of the incoming sinusoid. Specifically, write
a for loop with loop counter support_length ranging over values of 1:0.1:50
seconds. In each iteration of the loop, you should
i. Set apc_support equal to 0:0.1:(support_length-0.1),
ii. Recalculate apc_test using the new apc_support,
iii. Use apc to estimate the amplitude and phase of apc_test, and
iv. Store these estimates in two separate vectors.
Put your code in an M-file script so that you can run it easily.
• Include your code in the MATLAB appendix.
• Use subplot to plot the amplitude and phase estimates as a function of support
length in two subplots of the same figure. You should be able to see both local
oscillation of the estimates and a global decrease in error with increased support
length.
• At what support lengths are the amplitude estimates correct (i.e., equal to 1)?
• What minimum support length do we need to be sure that the phase error is less
than 0.01 radians?
3. (The Frequency, Amplitude, and Phase Estimator) In this problem, we’ll explore the fre-
quency, amplitude and phase estimator, as described in Section 3.2.4. Download the file
fape.m. This is a “skeleton” M-file for the “frequency, amplitude, and phase estimator”
system.
9 If you failed to correctly complete apc.m, you may use apc_demo.dll for the following problems. If you use the
(a) (Write the FAPE) Complete the fape function. You can use t_test and s_test
from Problem 2 along with the compiled fape_demo.dll to check your function’s
results.
• Include the completed code in your report’s MATLAB appendix.
• What are the frequency (in radians per second), amplitude, and phase estimates
returned by fape for t_test and s_test? Are these estimates correct?
• Use stem and abs to plot the magnitude of the vector of correlations returned by
fape versus the associated frequencies.
• What do you notice about this plot? What can you deduce from this fact? (Hint:
Consider what this plot tells you about the returned estimates.)
(b) (Running FAPE in a non-ideal case) In this problem, we’ll see what happens to FAPE
when the target sinusoid does not include an integral number of periods. lab3_data.mat
contains the variables fape_test_t (a support vector) and fape_test_s (its asso-
ciated sinusoidal signal). Run fape on this signal.
• What are the frequency in radians per second, amplitude, and phase estimates that
are returned?
• Use stem and abs to plot the magnitude of the returned vector of correlations.
• Plot fape_test_s and a new sinusoid that you generate from the parameter esti-
mates returned by FAPE on the same figure (using hold on). Use fape_test_t
as the support vector for the new sinusoid. Make sure you use different line types
and include a legend.
• What can you say about the accuracy of estimates returned by FAPE?
• Compare the plot of the correlations generated in this problem and in Problem 3a.
What do these different plots tell you?
Food for thought: Investigate the error characteristics of fape as you did with apc in
problem 2e. Do the frequency, amplitude, and phase estimates improve as we use longer
support lengths? Which parameter exhibits the most error? What does the vector of
correlations, X[k], tell you about these estimates?
(c) Measuring speed via Doppler shift. A sonar transmitter in the ocean emits a sinu-
soidal signal with frequency 1000 Hz, and the signal reflects off an object moving
toward the transmitter. The received signal can be found in the MATLAB workspace
lab3_data.mat. The signal vector is called s_sonar and the support vector is
t_sonar. The speed of sound in salt water is approximately 1450 meters/second.
(Note: because the signal is rather long, it may take a little while for FAPE to run.)
• Estimate the speed of the object.
Food for thought: Use randn to add some random noise to s_sonar and observe
how your estimate changes. How much noise do you need to add to produce an error?
Does the system degrade gracefully? (That is, is the amount of error proportional to the
amount of noise?)
4. On the front page of your report, please provide an estimate of the average amount of time
spent outside of lab by each member of the group.
4.1 Introduction
As emphasized in the previous lab, sinusoids are an important part of signal analysis. We noted that
many signals that occur in the real world are composed of sinusoids. For example, many musical
signals can be approximately described as sums of sinusoids, as can some speech sounds (vowels in
particular). It turns out that any periodic signal can be written exactly as a sum of amplitude-scaled
and phase-shifted sinusoids. Equivalently, we can use Euler’s inverse formulas to write periodic
signals as sums of complex exponentials. This is a mathematically more convenient description, and
the one that we will adopt in this laboratory and, indeed, in the rest of this course. The description
of a signal as a sum of sinusoids or complex exponentials is known as the spectrum of the signal.
Why do we need another representation for a signal? Isn’t the usual time-domain representa-
tion enough? It turns out that spectral (or frequency-domain) representations of signals have many
important properties. First, a frequency-domain representation may be simpler than a time-domain
representation, especially in cases where we cannot write an analytic expression for the signal. Sec-
ond, a frequency-domain representation of a signal can often tell us things about the signal that we
would not know from just the time-domain signal. Third, a signal’s spectrum provides a simple way
to describe the effect of certain systems (like filters) on that signal. There are many more uses for
frequency-domain representations of a signal, and we will examine many of them throughout this
course. Spectral representations are one of the most central ideas in signals and systems theory, and
can also be one of the trickiest to understand.
Consider the following problem. Suppose that we have a signal that is actually the sum of
two different signals. Further, suppose that we would like to separate one signal from the other,
but the signals overlap in time. If the signals have frequency-domain representations that do not
overlap, it is still possible to separate the two signals. In this way, we can see that frequency-domain
representations provide another “dimension” to our understanding of signals.
In this laboratory, we will examine two tools that allow us to use spectral representations. The
Fourier Series is a tool that we use to work with spectral representations of periodic continuous-
time signals. The Discrete Fourier Transform (DFT) is an analogous tool for periodic discrete-
time signals. Each of these tools allow both analysis (the determination of the spectrum of the
time-domain signal) and synthesis (the reconstruction of the time-domain signal from its spectrum).
Though you may not be aware of it, you have already performed DFT analysis; the “frequency,
amplitude, and phase estimator” system that you implemented in Laboratory 3 actually performs
DFT analysis.
4.2 Background
where the αk ’s are called the Fourier coefficients. The Fourier coefficients are determined by the
Fourier series analysis formula
$$\alpha_k = \frac{1}{T}\int_{\langle T\rangle} s(t)\,e^{-j\frac{2\pi k}{T}t}\,dt\,, \tag{4.4}$$
where $\int_{\langle T\rangle}$ indicates an integral over any T-second interval2 . In other words, the Fourier synthesis
formula shows that the complex exponential component of s(t) at frequency 2πk/T is
$$\alpha_k\, e^{j\frac{2\pi k}{T}t}\,. \tag{4.5}$$
Similarly, the Fourier analysis formula shows how the complex exponential components can be
determined from s(t), even when no exponential components are evident.
In general, the Fourier coefficients, i.e. the αk ’s, are complex. Thus, they have a magnitude |αk |
and a phase or angle ∠αk . The magnitude |αk | can be viewed as the strength of the exponential
component at frequency kω0 = 2πk/T , while the angle ∠αk gives the phase of that component.
The coefficient α0 is the DC term; it measures the average value of the signal over one period.
Once we know the αk ’s, the spectrum of s(t) is simply a plot consisting of spectral lines at
frequencies . . . , −2ω0 , −ω0 , 0, ω0 , 2ω0 , . . .. The spectral line at frequency kω0 is drawn with height
indicating the magnitude |αk | and is labeled with the complex value of αk . Alternatively, two
separate spectral line plots can be drawn — one showing the |αk |’s and the other showing the ∠αk ’s.
Notice that the Fourier synthesis formula is very similar to the formula given in Lab 3 for the
correlation between a sinusoid and a complex exponential. Indeed it has the same interpretation: in
computing αk we are computing the correlation3 between the signal s(t) and a complex exponential
with frequency 2πk/T . Thought of another way, this correlation gives us an indication of how much
of a particular complex exponential is contained in the signal s(t).
1 This is the exponential form of the Fourier series synthesis formula. There is also a sinusoidal form, which is presented
Partial Series
Notice the infinite limits of summation in the synthesis formula (4.3). This tells us that, for the
general case, we need an infinite number of complex exponentials to represent our signal. However,
in practical situations, such as in this lab assignment, when we use the synthesis formula to determine
signal values, we can generally only include a finite number of terms in the sum. For example, if we
use only the first N positive and negative frequencies plus the DC term (at k = 0), our approximate
synthesis equation becomes
$$s(t) \approx \sum_{k=-N}^{N} \alpha_k\, e^{j\frac{2\pi k}{T}t}\,. \tag{4.6}$$
Fortunately, Fourier series theory shows that this approximation becomes better and better4 as
N → ∞. Alternatively, it is known that the mean-squared value of the difference between s(t)
and the approximation tends to zero as N → ∞. Specifically, it can be shown that
$$MS\!\left(s(t) - \sum_{k=-N}^{N}\alpha_k e^{j\frac{2\pi k}{T}t}\right) = MS(s(t)) - \sum_{k=-N}^{N}|\alpha_k|^2 \longrightarrow 0 \ \text{ as } N \longrightarrow \infty\,. \tag{4.7}$$
How large must N be for the approximation to be good? There is no simple answer. However,
you will gain some idea by the experiments you perform in this lab assignment.
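As a worked example (not from the manual), consider a ±1 square wave of period T, whose Fourier coefficients are known to be α0 = 0 and αk = 2/(jπk) for odd k (zero for even k), with MS(s) = 1. Equation (4.7) then gives the mean-squared error of the N-term partial sum in closed form:

```python
import numpy as np

# MS error of the partial Fourier sum of a +/-1 square wave, via eq. (4.7):
# MS(s - partial sum) = MS(s) - sum over |k| <= N of |alpha_k|^2.
def ms_error(N):
    odd_k = np.arange(1, N + 1, 2)                       # odd k in 1..N
    return 1.0 - 2 * np.sum((2 / (np.pi * odd_k))**2)    # +k and -k terms

print(ms_error(5), ms_error(51), ms_error(501))          # decreasing toward 0
```

The error decreases monotonically with N but only slowly (roughly like 1/N), which is typical for signals with jump discontinuities.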
where
$$\omega_0 = \frac{2\pi}{T} \tag{4.9}$$
and the 2T-second Fourier series has components at the frequencies
$$\ldots,\ -2\omega_0',\ -\omega_0',\ 0,\ \omega_0',\ 2\omega_0',\ \ldots \ =\ \ldots,\ -\omega_0,\ -\frac{\omega_0}{2},\ 0,\ \frac{\omega_0}{2},\ \omega_0,\ \ldots\,, \tag{4.10}$$
4 It is known that under rather benign assumptions about the signal s(t), the approximation converges to s(t) as N −→ ∞
at all times t where s(t) is continuous, and at times t where s(t) has a jump discontinuity, the approximation converges to
the average of the values immediately to the left and right of the discontinuity.
where
$$\omega_0' = \frac{2\pi}{2T} = \frac{\omega_0}{2}\,. \tag{4.11}$$
From this we see that the 2T-second Fourier series decomposes s(t) into frequency components
with half the separation of that of the T-second Fourier series. However, since s(t) is periodic with
period T, its spectrum is actually concentrated at frequencies that are multiples of ω0 (or a subset
thereof). Hence, the “additional” coefficients in the 2T-second Fourier series must be zero, and it
turns out that the nonzero coefficients are the same as for the T-second Fourier series. Specifically,
it can be shown that with αk and αk′ denoting the T-second and 2T-second Fourier coefficients,
respectively, then
$$\alpha_k' = \begin{cases}\alpha_{k/2}, & k \text{ even}\\ 0, & k \text{ odd}\end{cases} \tag{4.12}$$
In summary, Fourier series analysis/synthesis can be performed over one fundamental period or
over any number of fundamental periods. Usually, when a Fourier series is mentioned, the desired
analysis interval will be clear from context. In any case, the spectrum is not affected by
the choice of T.
This will give us an idea of the frequency content of the signal during the given time interval. It is
important to emphasize, however, that the synthesis equation (4.14) is valid only when t is between t 1
and t2 . Outside of this time interval, the synthesis formula will not necessarily equal s(t). Instead,
it describes a signal that is periodic with period T , called the periodic extension of the segment
between t1 and t2 .
1. (Fourier series analysis) The T-second Fourier series analysis of a periodic signal s(t) with
period T produces a set of Fourier coefficients αk , k = . . . , −2, −1, 0, 1, 2, . . ., which are,
in general, complex valued.
2. (Frequency components) If αk are the coefficients of the T-second Fourier series of the peri-
odic signal s(t) with period T, then the frequency or spectral component of s(t) at frequency
$2\pi k/T$ is $\alpha_k\, e^{j\frac{2\pi k}{T}t}$.
5. (Conjugate symmetry) If s(t) is a real-valued signal, i.e. its imaginary part is zero, then for
any integer k, $\alpha_{-k} = \alpha_k^*$.
6. (Conjugate pairs) If the αk ’s are the T-second Fourier coefficients for a real-valued signal s(t),
then for any k the sum of the complex exponential components of s(t) corresponding to αk
and α−k is a sinusoid at frequency 2πk/T. Specifically, using the inverse Euler relation,
$$\alpha_k e^{j\frac{2\pi k}{T}t} + \alpha_{-k} e^{-j\frac{2\pi k}{T}t} = 2|\alpha_k|\cos\!\left(\frac{2\pi k}{T}t + \angle\alpha_k\right).$$
7. (Sinusoidal form of the Fourier synthesis formula) The previous property leads to the sinu-
soidal form of the Fourier synthesis formula:
$$s(t) = \alpha_0 + \sum_{k=1}^{\infty} 2|\alpha_k|\cos\!\left(\frac{2\pi k}{T}t + \angle\alpha_k\right). \tag{4.19}$$
8. (Linear combinations) If s(t) and s′(t) have T-second Fourier coefficients αk and αk′ , respec-
tively, then as(t) + bs′(t) has T-second Fourier coefficients aαk + bαk′ .
9. (Fourier series of elementary signals) The following lists the T -second Fourier coefficients of
some elementary signals.
(a) Complex exponential signal: $s(t) = e^{j\frac{2\pi m}{T}t}$ $\Longrightarrow$
$$\alpha_k = \begin{cases}1, & k = m\\ 0, & k \neq m\end{cases}\,. \tag{4.20}$$
5 By “distinct”, we mean that s(t) and s′(t) are sufficiently different that s(t) ≠ s′(t) for all times t in some interval
(t1 , t2 ) with nonzero length. They are not “distinct” if they differ only at a set of isolated points. To see why we
need this clarification, observe that if s(t) and s′(t) differ only at time t1 , then they have the same Fourier coefficients,
because integrals, such as those defining Fourier coefficients, are not affected by changes to their integrands at isolated
points. Likewise, s(t) and s′(t) will have the same Fourier coefficients if they differ only at isolated times t1 , t2 , . . ..
However, if s(t) ≠ s′(t) for all t in an entire interval, no matter how small, then αk ≠ αk′ for at least one k.
10. (nT-second Fourier series) If a periodic signal s(t) has period T and T-second Fourier coeffi-
cients αk , then the nT-second Fourier coefficients are
$$\alpha_k' = \begin{cases}\alpha_{k/n}, & k \text{ a multiple of } n\\ 0, & \text{else}\end{cases} \tag{4.24}$$
11. (Parseval’s relation) If αk ’s are the T -second Fourier coefficients for signal s(t), then the
mean-squared value of s(t), equivalently the power, equals the sum of the squared magnitudes
of the Fourier coefficients. That is,
$$MS(s) = \frac{1}{T}\int_{\langle T\rangle} |s(t)|^2\,dt = \sum_{k=-\infty}^{\infty} |\alpha_k|^2 \tag{4.25}$$
$$0,\ \hat{\omega}_0,\ 2\hat{\omega}_0,\ \ldots,\ (N-1)\hat{\omega}_0\,. \tag{4.27}$$
The reason is that any complex exponential signal with frequency $k\hat{\omega}_0$ is in fact identical to
a complex exponential signal with one of the N frequencies listed above6 . Notice that this set of
frequencies ranges from 0 to $\frac{2\pi(N-1)}{N}$, which is just a little less than 2π.
We now assert that the representation of s[n] in terms of complex exponentials with the above
frequencies is given by the discrete-time Fourier series synthesis formula or, as we will usually call
it, the Discrete Fourier Transform (DFT) synthesis formula
$$s[n] = S[0]e^{j\frac{2\pi\cdot 0}{N}n} + S[1]e^{j\frac{2\pi\cdot 1}{N}n} + S[2]e^{j\frac{2\pi\cdot 2}{N}n} + \cdots + S[N-1]e^{j\frac{2\pi(N-1)}{N}n} = \sum_{k=0}^{N-1} S[k]\,e^{j\frac{2\pi k}{N}n}\,, \tag{4.28}$$
where the S[k]’s, which are called DFT coefficients, are determined by the DFT analysis formula
$$S[k] = \frac{1}{N}\sum_{\langle N\rangle} s[n]\,e^{-j\frac{2\pi k}{N}n}\,, \quad k = 0, 1, 2, 3, \ldots, N-1 \tag{4.29}$$
where $\langle N\rangle$ indicates a sum over any N consecutive integers7 , e.g. the sum over 0, . . . , N − 1.
As with the continuous-time Fourier series, the DFT coefficients are, in general, complex. Thus,
they have a magnitude |S[k]| and a phase or angle ∠S[k]. The magnitude |S[k]| can be viewed as
the strength of the exponential component at frequency $k\hat{\omega}_0 = 2\pi k/N$, while ∠S[k] is the phase
of that component. The coefficient S[0] is the DC term; it measures the average value of the signal
over one period.
Once we know the S[k]’s, the spectrum of s[n] is simply a plot consisting of spectral lines at
frequencies $0,\ \hat{\omega}_0,\ 2\hat{\omega}_0,\ \ldots,\ (N-1)\hat{\omega}_0$. The spectral line at frequency $k\hat{\omega}_0$ is drawn with height
indicating the magnitude |S[k]| and is labeled with the complex value of S[k]. Alternatively, two
separate spectral line plots can be drawn — one showing the |S[k]|’s and the other showing the
∠S[k]’s.
Since the sums in the synthesis and analysis formulas are finite, there are no convergence-of-
partial-sum issues, such as those that arise for the continuous-time Fourier series.
Often the DFT coefficients S[0], . . . , S[N − 1] are said to be the “DFT of the signal s[n]” and the
process of computing them via the analysis equation (4.29) is called “taking the DFT” of s[n].
Conversely, applying the synthesis equation (4.28) is often called “taking the inverse DFT” of
S[0], . . . , S[N − 1].
Notice that the DFT analysis formula (4.29) is identical to equation (3.45) in Lab 3. That is,
in computing the set of correlations between a signal s[n] and the various complex exponentials in
Lab 3, we were actually taking the DFT of s[n]. Indeed, it continues to be helpful to view the DFT
analysis as the process of correlating s[n] with various complex exponentials. Those correlations that
lead to larger magnitude coefficients indicate frequencies where the signal has larger components.
In some treatments, the DFT analysis and synthesis formulas differ slightly from those given
above in that the 1/N factor is moved from the analysis formula to the synthesis formula8 , or re-
placed by a $1/\sqrt{N}$ factor multiplying each formula. All of these approaches are equally valid. The
choice between them is largely a matter of taste. For example, our approach is the only one for
6 If $k\hat{\omega}_0$ is not in this range, then k = mN + l where m ≠ 0 and 0 ≤ l < N. It then follows that the complex exponential
with this frequency is $e^{j\frac{2\pi k}{N}n} = e^{j\frac{2\pi(mN+l)}{N}n} = e^{j2\pi mn}\,e^{j\frac{2\pi l}{N}n} = e^{j\frac{2\pi l}{N}n}$, which is an exponential with one of the N
frequencies in the list above.
7 Because $s[n]e^{-j\frac{2\pi k}{N}n}$ is periodic with period N, the sum is the same for any choice of N consecutive integers.
8 The DSP First textbook does this in Chapter 9.
which S[0] equals the average signal value. For the other approaches, the average is S[0] multiplied
by a known constant. The only cautionary note is that one should never use the analysis formula
from one version with the synthesis formula from another. In this course, we will always use the
analysis and synthesis formulas shown above.
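The choice of convention matters when comparing against library FFT routines. NumPy, for example, puts no scale factor in np.fft.fft and the 1/N factor in np.fft.ifft, so under this manual’s convention S[k] = np.fft.fft(s)/N. A quick numerical check against the analysis sum (4.29):

```python
import numpy as np

# Compare the direct analysis sum (with 1/N) against NumPy's FFT rescaled
# to match this manual's convention.
N = 8
n = np.arange(N)
s = np.cos(2 * np.pi * 2 * n / N)
S_direct = np.array([np.sum(s * np.exp(-2j*np.pi*k*n/N)) / N for k in range(N)])
S_fft = np.fft.fft(s) / N
print(np.allclose(S_direct, S_fft))   # True
print(np.round(np.abs(S_fft), 6))     # 0.5 at k = 2 and k = N-2 = 6, else 0
```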
Although we will always take $0,\ \hat{\omega}_0,\ 2\hat{\omega}_0,\ \ldots,\ (N-1)\hat{\omega}_0$ as the analysis frequencies produced
by the DFT, it is important to point out that every frequency $\hat{\omega}$ in the upper half of this range,
i.e. between π and 2π, is equivalent to a frequency $\hat{\omega} - 2\pi$, which lies between −π and 0. By
“equivalent,” we mean that a complex exponential with frequency $\hat{\omega}$ with $\pi < \hat{\omega} < 2\pi$ equals the
complex exponential with frequency $\hat{\omega} - 2\pi$. Thus, it is often useful to think of frequencies in the
upper half of our designated range as representing frequencies in the range −π to 0.
For example, let us look at the DFT of a sinusoidal signal, $s[n] = \cos(\frac{2\pi m}{N}n)$, with $0 < m < \frac{N}{2}$,
for which S[m] = S[N − m] = 1/2 and S[k] = 0 for other k’s. In the synthesis formula, the
coefficient S[m] multiplies the complex exponential $e^{j\frac{2\pi m}{N}n}$, and the coefficient S[N − m] multiplies
the complex exponential $e^{j\frac{2\pi(N-m)}{N}n} = e^{-j\frac{2\pi m}{N}n}$. Thus, these two coefficients can be viewed as
multiplying exponentials at frequencies $\pm\frac{2\pi m}{N}$, which by the inverse Euler formula sum to yield
$\cos(\frac{2\pi m}{N}n)$.
N -point DFT
As with continuous-time signals, if a discrete-time signal s[n] is periodic with period N , then it also
periodic with period 2N , and period 3N , and so on. Thus, when applying the DFT, we have a choice
as to the value of N . Sometimes we choose it to be the the smallest period, i.e. the fundamental
period, but sometimes we do not. When we want to explicitly specify the value of N used in a DFT,
we will say N -point DFT.
The relationship between the N -point and 2N -point DFT is just like the relationship between
the T -second and 2T -second Fourier series. That is, whereas the N -point DFT has components at
frequencies
$$0,\ \hat{\omega}_0,\ 2\hat{\omega}_0,\ \ldots,\ (N-1)\hat{\omega}_0\,, \tag{4.31}$$
the 2N-point DFT has components at the frequencies
$$0,\ \hat{\omega}_0',\ 2\hat{\omega}_0',\ \ldots,\ (2N-1)\hat{\omega}_0' \ =\ 0,\ \frac{\hat{\omega}_0}{2},\ \hat{\omega}_0,\ \frac{3\hat{\omega}_0}{2},\ \ldots,\ (2N-1)\frac{\hat{\omega}_0}{2}\,, \tag{4.32}$$
where
$$\hat{\omega}_0' = \frac{2\pi}{2N} = \frac{\hat{\omega}_0}{2}\,. \tag{4.33}$$
From this we see that the separation between frequency components has been halved. Moreover, it
can be shown that the relationship between the original and new coefficients is
$$S'[k] = \begin{cases}S[k/2], & k \text{ even}\\ 0, & k \text{ odd}\end{cases} \tag{4.34}$$
In summary, DFT analysis/synthesis can be performed over one fundamental period or over any
number of fundamental periods. Usually, when the DFT is mentioned, the desired analysis interval
will be clear from context. In any case, the spectrum is not affected by the choice of N.
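The relationship in equation (4.34) is easy to confirm numerically. This sketch (not part of the lab) uses NumPy’s FFT, rescaled by 1/N to match the analysis convention used here:

```python
import numpy as np

# Taking the 2N-point DFT of two periods of a period-N signal doubles the
# frequency resolution but fills only the even-indexed coefficients.
N = 8
n = np.arange(N)
s = 0.25 + np.cos(2 * np.pi * 3 * n / N)      # arbitrary period-N signal
S = np.fft.fft(s) / N                         # N-point DFT coefficients
S2 = np.fft.fft(np.tile(s, 2)) / (2 * N)      # 2N-point DFT of two periods
print(np.allclose(S2[0::2], S))               # True: S'[2k] = S[k]
print(np.allclose(S2[1::2], 0))               # True: odd coefficients are 0
```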
This will give us an idea of the frequency content of the signal during the given time interval. It is
important to emphasize, however, that the synthesis equation (4.36) is valid only at times n from n1
to n2 . Outside of this time interval, the synthesis formula will not necessarily equal s[n]. Instead, it
describes a signal that is periodic with period N , called the periodic extension of the segment from
n1 to n2 .
Moreover, it can be shown that if it should happen that s(t) has no spectral components at frequen-
cies greater than 1/(2Ts ), then

αk = S[k],      0 ≤ k ≤ N/2
     S[N + k],  −N/2 ≤ k < 0    (4.38)
     0,         |k| > N/2
The above two equations show how the DFT can be used to compute, at least approximately, the
Fourier series coefficients. In fact, the Fourier series analysis program described in the M ATLAB
section of this assignment uses the DFT to compute the Fourier coefficients.
properties are stated without derivations. However, each can be derived straightforwardly from the
analysis and synthesis formulas. Though not required in this laboratory, you may want to confirm
some of these properties using the DFT analysis and synthesis programs described in Section 4.3.
1. (DFT analysis) The N -point DFT of a periodic signal s[n] with period N produces a vector of
N DFT coefficients S[0], . . . , S[N − 1], which are, in general, complex valued. Equivalently,
the coefficients may be considered to be determined by a set of N signal samples.
2. (Frequency components) If S[k] is the N -point DFT of the periodic signal s[n] with period N ,
then the frequency or spectral component of s[n] at frequency 2πk/N is S[k]e^{j(2πk/N)n}. The
component of the signal at frequency −2πk/N is S[N − k]e^{−j(2πk/N)n}.
3. (DC component) The coefficient S[0] equals the average value or DC value of s[n].
5. (Conjugate symmetry) If s[n] is a real-valued signal, i.e. its imaginary part is zero, then for
any integer k

S[N − k] = S*[k].
These facts indicate that we are usually only interested in the first half of the DFT coefficients.
In particular, note that when we plot the DFT, the location of the origin and the appearance of
the symmetry is different than when we plot the Fourier Series. See Figure 4.2 for an example
of the relation between the two.
6. (Conjugate pairs) If S[k] is the N -point DFT of a real-valued signal s[n], then for any k the
sum of the complex exponential components of s[n] corresponding to S[k] and S[N − k] is a
sinusoid at frequency 2πk/N . Specifically, using the inverse Euler relation,

S[k]e^{j(2πk/N)n} + S[N − k]e^{−j(2πk/N)n} = 2|S[k]| cos((2πk/N)n + ∠S[k]).
7. (Linear combinations) If s[n] and s0 [n] have N -point DFT S[k] and S 0 [k], respectively, then
as[n] + bs0 [n] has N -point DFT aS[k] + bS 0 [k].
8. (Sampled continuous-time signals) If the discrete-time signal s[n] comes from sampling a
continuous-time signal s(t) with sampling interval Ts , i.e. if s[n] = s(nTs ), then the continuous-
time frequency represented by DFT coefficient S[k] is (2πk/N )fs , where fs = 1/Ts samples per
second is the sampling rate.
9. (DFT of elementary signals) The following lists the N -point DFT of some elementary signals.
Figure 4.2: (A) The magnitude of the Fourier Series coefficients αk for a periodic continuous-time
signal. (B) The DFT of a periodic discrete-time version of the same signal. Note that the origin for
the Fourier Series coefficients is in the middle of the plot, but the origin for the DFT is to the left.
(a) Complex exponential signal: s[n] = e^{j(2πm/N)n} =⇒

(S[0], . . . , S[N − 1]) = (0, . . . , 0, 1, 0, . . . , 0) ,    (4.43)

where the only nonzero coefficient is S[m].

(b) Cosine: s[n] = cos((2πm/N)n) =⇒

(S[0], . . . , S[N − 1]) = (0, . . . , 0, 1/2, 0, . . . , 0, 1/2, 0, . . . , 0) ,    (4.44)

where the nonzero coefficients are S[m] and S[N − m].
(c) Sine: s[n] = sin((2πm/N)n) =⇒

(S[0], . . . , S[N − 1]) = (0, . . . , 0, −j/2, 0, . . . , 0, j/2, 0, . . . , 0) ,    (4.45)

where the nonzero coefficients are S[m] and S[N − m].
(d) General sinusoid: s[n] = cos((2πm/N)n + φ) =⇒

(S[0], . . . , S[N − 1]) = (0, . . . , 0, (1/2)e^{jφ}, 0, . . . , 0, (1/2)e^{−jφ}, 0, . . . , 0) ,    (4.46)

where the nonzero coefficients are S[m] and S[N − m].
(e) Not quite periodic sinusoid: s[n] = cos((2π(m+ε)/N)n), where m+ε is non-integer =⇒
The resulting S[k]'s will all be nonzero9 , typically with small magnitudes except those
corresponding to frequencies closest to 2π(m+ε)/N .
(f) Period containing a unit impulse: s[n] = (1, 0, . . . , 0) =⇒

(S[0], . . . , S[N − 1]) = (1/N, . . . , 1/N ) .    (4.47)
9 This is the same effect that you saw in lab 3 when you ran fape over a non-integer number of periods of the sinusoid.
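Two entries from the list above can be confirmed numerically: the general sinusoid with phase φ, and the period containing a unit impulse. As elsewhere, this is a plain-Python sketch of MATLAB-style computations; the dft helper is hypothetical and includes the 1/N analysis factor.

```python
import cmath
import math

def dft(s):
    # DFT with the 1/N factor in the analysis formula.
    N = len(s)
    return [sum(s[n] * cmath.exp(-2j * math.pi * k * n / N)
                for n in range(N)) / N
            for k in range(N)]

N, m, phi = 8, 2, 0.7
# General sinusoid: nonzero coefficients are (1/2)e^{j phi} at k = m
# and (1/2)e^{-j phi} at k = N - m, as in equation (4.46).
S_cos = dft([math.cos(2 * math.pi * m * n / N + phi) for n in range(N)])
# One period containing a unit impulse: every coefficient equals 1/N,
# as in equation (4.47).
S_imp = dft([1] + [0] * (N - 1))
```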
10. (mN -point DFT) If S[k] is the N -point DFT of the periodic signal s[n] with period N , then the
mN -point DFT coefficients are

S′[k] = S[k/m],  k a multiple of m
        0,       otherwise    (4.48)
By Parseval's relation for the DFT,

(1/N ) Σ_{n=0}^{N−1} |s[n]|² = Σ_{k=0}^{N−1} |S[k]|².

This shows that the power in the signal s[n] equals the energy of the DFT coefficients.
Figure 4.3: The DFT of a harmonic series, plotted as |X[k]| versus coefficient number, k. Note that
only the first half of the DFT coefficients are shown in this figure.
series, but a listener “hears” a signal which is the sum of these two signals. By the linear combination
properties of the Fourier Series and DFT, we know that the spectrum of the combined signal is simply
the sum of the spectra of the separate signals. We can use this property to separate the two signals
in the frequency-domain, even though they overlap in the time-domain.
Suppose that we wish to simply remove one of the notes from the combined signal. We’ll assume
that we have recorded and sampled the signal, so we’re working in discrete-time. We’ll also assume
that the combined signal is also periodic10 with some (fairly long) fundamental period N0 . If we
take the N0 -point DFT of a segment of the combined signal, we can identify the coefficients that
make up each harmonic series. Then, we simply zero-out all of the coefficients corresponding to the
harmonics of the note we wish to remove. When we resynthesize the signal with the inverse DFT,
the resulting signal will contain only one of the two notes.
We can extend this procedure to more complicated signals, like melodies with many notes.
In this case, we simply analyze and resynthesize each note individually. Of course, with more
simultaneously-sounding notes and more complicated music, this procedure becomes rather diffi-
cult. In this lab, we will implement this procedure to remove a “corrupting” note held throughout a
simple, easily analyzed melody. Though somewhat idealized, the problem should help to motivate
the use of the DFT and the frequency domain.
where CC is a vector containing the Fourier coefficients, T is the interval (in seconds) over
which the Fourier series is applied, and periods is the (integer) number of periods to include in
the resynthesis; periods defaults to a value of 1 if not provided. The optional parameter Ns
specifies how many samples per period to include in the output signal.
It is assumed that CC contains the coefficients α−N . . . αN . (N is implicitly determined from
the length of CC.) Thus, CC has length 2N + 1, and the CC(n) element contains the Fourier series
coefficient αn−N −1 . Further, note that the α0 coefficient falls at CC(N+1).
The two returned parameters are the signal vector ss and the corresponding signal support
vector tt.
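The structure of such a synthesis routine can be sketched as follows. This is a hypothetical plain-Python port for illustration only; the course supplies the actual fourier_synthesis as a MATLAB .m file, and the default Ns here is an assumption.

```python
import cmath
import math

def fourier_synthesis(CC, T, periods=1, Ns=64):
    """Hypothetical sketch of the lab's fourier_synthesis routine.
    CC holds coefficients alpha_{-N} ... alpha_{N}, so alpha_0 sits at
    the middle entry CC[N] (CC(N+1) in MATLAB's 1-based indexing)."""
    N = (len(CC) - 1) // 2
    n_samples = periods * Ns
    tt = [periods * T * i / n_samples for i in range(n_samples)]
    # Sum alpha_k e^{j 2 pi k t / T}; for conjugate-symmetric CC the
    # result is real, so we keep only the real part.
    ss = [sum(CC[k + N] * cmath.exp(2j * math.pi * k * t / T)
              for k in range(-N, N + 1)).real
          for t in tt]
    return ss, tt

# alpha_{-1} = alpha_1 = 1/2 synthesizes cos(2*pi*t/T).
ss, tt = fourier_synthesis([0, 0.5, 0, 0.5, 0], T=0.1)
```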
by simply using a long DFT. In this case, each harmonic may be “spread” over several DFT coefficients, so to remove a
harmonic we need to zero-out all of the coefficients associated with it. This spreading behavior is the same as what you saw in
Lab 3 when running fape over non-periodic signals.
where ss is a vector containing the signal samples, T is the interval T in seconds over which
the Fourier series is to be computed, and N is the number of positive harmonics to include in
the analysis. (2N+1 is the total number of harmonics.) It is assumed that ss contains samples
of the signal to be analyzed over the interval [0, T ].
The outputs are the vectors CC, which contains the 2N + 1 Fourier coefficients 11 , and ww,
which contains the frequencies (in Hertz) associated with each Fourier coefficient.
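The analysis direction can be sketched the same way. Again this is a hypothetical plain-Python port for illustration, not the course's MATLAB code; it approximates each αk by a Riemann sum of the Fourier analysis integral over the samples in [0, T).

```python
import cmath
import math

def fourier_analysis(ss, T, N):
    """Hypothetical sketch of the lab's fourier_analysis routine.
    ss holds samples of one period over [0, T); the outputs are the
    2N+1 coefficients alpha_{-N} ... alpha_{N} and their frequencies
    (in hertz)."""
    Ns = len(ss)
    CC = [sum(ss[n] * cmath.exp(-2j * math.pi * k * n / Ns)
              for n in range(Ns)) / Ns
          for k in range(-N, N + 1)]
    ww = [k / T for k in range(-N, N + 1)]  # harmonic frequencies in Hz
    return CC, ww

T, Ns = 0.1, 256
ss = [math.cos(2 * math.pi * n / Ns) for n in range(Ns)]  # cos(2*pi*t/T)
CC, ww = fourier_analysis(ss, T, N=2)
# CC approximates (0, 1/2, 0, 1/2, 0); ww is (-20, -10, 0, 10, 20) Hz.
```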
• DFT Analysis in M ATLAB: In order to calculate an N -point DFT using M ATLAB, we use
the fft command12 . The specific calling command is
>> XX = fft(xx)/length(xx);
This computes the N -point DFT of the signal vector xx , where N is the length of xx, and
where the signal is assumed to have support 0, 1, . . . , N − 1. Since the M ATLAB command
fft does not include the factor 1/N in the analysis formula, as in equation (4.29), we must
divide by length(xx) to obtain the N DFT coefficients XX.
• DFT Synthesis in M ATLAB: The synthesis equation for the DFT is computed with the com-
mand ifft. If we have computed the DFT using the above command, we must also remem-
ber to multiply the result by N :
>> xx = ifft(XX)*length(XX);
Note that the ifft command will generally return complex values even when the synthesis
should be exactly real. However, the imaginary part should be negligible (i.e., less than 1 ×
10−14 ). You can eliminate this imaginary part using the real command.
• Indexing the DFT: Since M ATLAB begins its indexing from 1 rather than 0, remember to use
the following rules for indexing the DFT:
X[0] ⇒ X(1)
X[1] ⇒ X(2)
X[k] ⇒ X(k+1)
X[N − k] ⇒ X(N-k+1)
X[N − 1] ⇒ X(N)
definition requires O(N 2 ) computations, but the FFT only requires O(N log N ). Additionally, the FFT is faster when N is
equal to a power of two (i.e., N = 256, 512, 1024, 2048, etc.).
• [16(+2)] Include the resulting figure window in your report. (On Windows systems, use
the “Copy to Clipboard” button to copy the figure, then you can simply paste it into a
Word or similar document. There is also a “Print Figure” button for other systems if you
can’t get access to a PC.)
Food for thought14 : Did you try the procedure suggested in the hint above, in which you tune
each sinusoid one at a time and then return to each for a “second round” of tuning? If so, can
you explain why the second round did or did not lead to any improvements? (Hint: Consider
Fourier series property 12.)
Food for thought: By executing sinsum(1), sinsum(2), and sinsum(3), you can
match different signals with sinusoids. Find MSE’s that are as small as possible for each
of these other signals.
2. (Applying Fourier series synthesis) In this problem you will simply apply fourier_synthesis
to a given set of Fourier coefficients and find the resulting continuous-time signal. Download
13 Note that this function will only work under M ATLAB 6 and higher. It is highly recommended that you use a Windows-
based PC for this problem, since you need to copy the figure window into your report. Using the Windows clipboard simplifies
this task significantly.
14 “Food for thought” items are not required to be read or acted upon. There is no extra credit involved. However, if you
include something in your report, your GSI will read and comment on it. Alternatively, you can discuss “food for thought”
topics in office hours.
Let T = 0.1 seconds, and generate 5 periods of the signal. Use N = 20, giving you 41
Fourier series coefficients. (Hint: First, define a frequency support vector, kk=-20:20.
Then, generate CC from kk and set all even harmonics to zero.)
• Use stem to plot the magnitude of the Fourier coefficients. Use your kk vector as the
x-axis.
• Use plot to plot samples of the continuous-time signal that fourier_synthesis
returns versus time in seconds.
• What kind of signal is this?
3. (Applying Fourier series analysis) In this problem you will use the Fourier series analysis and
synthesis formula to see how the accuracy of the approximate synthesis formula (4.6) depends
on N .
Download the files lab4_data.mat and fourier_analysis.m. lab4_data.mat
contains the variables step_signal and step_time, which are the signal and support
vectors for the samples of a continuous-time periodic signal with fundamental period T0 = 1
second. Note that there are Ns = 16384 samples in one fundamental period. (step_signal
and step_time include several fundamental periods, but you’ll be dealing with only one
period in several parts of this problem. As such, you might find it useful to create a one-
period version of step_signal.)
(d) (Meet an MSE target) Find the smallest value of N for which the mean-squared error of
the resynthesis is less than 0.5% of the mean-squared value of step_signal.
• Include this value in your report.
Food for thought: Try repeating Part (b) with the Fourier analysis performed over two fun-
damental periods of the signal, and compare to the previous answer to Part (b). Do the new
Fourier coefficients turn out as expected?
4. (Using the DFT to describe a signal as a sum of discrete-time sinusoids) In this problem,
you will simply apply the DFT to a particular discrete-time signal, which is also contained
in lab4_data.mat, namely, signal_id. signal_id is considered to be a periodic
discrete-time signal with fundamental period N0 = 128 = length(signal_id). Take
the N0 -point DFT of signal_id.
• Use stem to plot the magnitude of the DFT versus the DFT coefficient index, k.
• Use the DFT to describe signal_id as a sum of discrete-time sinusoids. That is, for
each sinusoid, give the amplitude, frequency (in radians per sample), and phase.
5. (Use the DFT to remove undesired components from a signal) In this problem you will use
the technique described in Section 4.2.4 to eliminate a noise signal from a desired signal. This
signal, melody, is also contained in lab4_data.mat. This variable contains samples of
a continuous-time signal sampled at rate fs = 8192 samples/second. It contains a simple
melody with one note every 1/2 second. Unfortunately, this melody is corrupted by another
“instrument” playing a constant note throughout. We would like to remove this second instru-
ment from the signal, and we will use the DFT to do so.
It is a good idea to begin by listening to melody using the soundsc command.
(a) (Examine DFT of first note) In order to remove the corrupting instrument, we need to
determine where it lies in the frequency domain. Let’s begin by looking at just the first
note (i.e. the first 0.5 seconds or 4096 samples). This “note” consists of the sum of
two notes — one is the first note of the melody, the other is the constant note from the
corrupting instrument. Each of these notes has components forming a harmonic series.
The fundamental frequencies of these harmonic series are different, which is the key
to our being able to remove the corrupting note. Take the DFT of the first 0.5 seconds
(4096 samples) of the signal.
• Use stem to plot the magnitude of the DFT for the first note.
• Identify the frequencies contained in each of the two harmonic series present in the
signal. What are the fundamental frequencies?
(b) (Examine DFT of second note) By comparing the spectra of the first two notes, we can
identify the corrupting instrument. Take the DFT of the second 0.5 seconds (samples
4097 through 8192).
• Use stem to plot the magnitude of the DFT for the second 0.5 seconds.
• What are the fundamental frequencies (in Hz) of the two harmonic series in this
note?
• We know that the melody changes from the first note to the second, but the corrupt-
ing instrument does not. Thus, by comparing the harmonic series found in this and
the previous part, identify which fundamental frequency belongs to the melody and
which to the corrupting instrument.
(c) (Identify the DFT coefficients of the corrupting signal) In order to remove the “cor-
rupting” instrument, we simply need to zero-out the coefficients corresponding to the
harmonics of the note from the corrupting instrument. This is done directly on the DFT
coefficients of each 0.5 seconds of the signal. Then, we resynthesize the signal from the
modified DFT coefficients.
• Based on this, and your results from the previous parts of this problem, which DFT
coefficients need to be set to zero in order to remove the corrupting instrument from
this signal? (Hint: Remember the conjugate pairs.)
(d) (Complete the function that removes the corrupting instrument) Finally, we’d like to re-
move the corrupting instrument from our melody. Download the file fix_melody.m.
This function contains the code that you’ll use to remove the corrupting instrument from
the melody signal. For each note of the melody, the function takes the DFT, zeros out
the appropriate coefficients (which you must provide), and resynthesizes the signal.
• Complete the function by setting the variable zc equal to a vector containing the
DFT coefficients that must be zeroed-out.
• Execute the function using the command
>> result = fix_melody(melody);
Listen to the resulting signal. Have you successfully removed the corrupting instru-
ment?
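The coefficient-zeroing idea behind this procedure can be sketched generically. The example below is a plain-Python illustration on a made-up two-component signal, not the melody data or the fix_melody code; the remove_harmonic helper is hypothetical. Note that it zeroes both S[k] and its conjugate pair S[N − k], as the hint above suggests.

```python
import cmath
import math

def remove_harmonic(x, k):
    """Zero DFT coefficients k and N-k (the conjugate pair) of a real
    signal, then resynthesize it with the inverse DFT."""
    N = len(x)
    # Analysis with the 1/N factor.
    X = [sum(x[n] * cmath.exp(-2j * math.pi * j * n / N)
             for n in range(N)) / N
         for j in range(N)]
    X[k] = 0
    X[(N - k) % N] = 0          # conjugate pair
    # Synthesis; imaginary parts are negligible for a real result.
    return [sum(X[j] * cmath.exp(2j * math.pi * j * n / N)
                for j in range(N)).real
            for n in range(N)]

N = 32
x = [math.cos(2 * math.pi * 3 * n / N) + math.cos(2 * math.pi * 5 * n / N)
     for n in range(N)]
y = remove_harmonic(x, 5)       # removes the k = 5 component
```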
(e) (Check your result with the spectrogram) Finally, we’d like to be able to visually check
our result. Download the function melody_check.m. melody_check produces
an image called a spectrogram that you can use to check your work. Basically, the
spectrogram works by taking the DFT of many short segments of a signal and arranging
them as the columns of an image. Note that the x-axis is time and the y-axis is frequency.
The color of each point on the image represents the strength of the spectral component
(in decibels) at that time and frequency. The dark horizontal bands show the presence of
sinusoidal components in the signal at the associated times.
• Execute melody_check by passing it melody. Include the resulting figure in
your report.
• Can you identify the components of the corrupting instrument on this spectrogram?
• Now, execute melody_check by passing it result. Include the resulting figure
in your report.
• Compare the spectrogram of melody to the spectrogram of result. What differ-
ences do you see? Is this what you expect to see?
6. On the front page of your report, please provide an estimate of the average amount of time
spent outside of lab by each member of the group.
5.1 Introduction
A common application of signals and systems is in the production, manipulation, storage and dis-
tribution of images. For example, image transmission is an important aspect of communication
(especially on the internet), and we would like to be able to distribute high quality images quickly
over low-bandwidth connections. To do so, images must be encoded into a sequence or file of
bits, which can be digitally transmitted or stored. When display of the image is required, the se-
quence/file of bits must be decoded into a reproduction of the image. A block diagram of a general
data compression system, with an encoder and decoder, is shown in Figure 5.1.
Systems or algorithms that do the encoding and decoding are called source coders, coders, data
compressors, or compressors. They are called source coders because they encode the data from a
source, e.g. a camera or scanner. They are also called data compressors, because their encoders
usually produce fewer bits than were produced by the original data collector. For example, JPEG
is a commonly used, standardized image compressor. You’ve probably downloaded many JPEG
encoded images over the internet — any image with filename extension .jpg. FAX machines use a
different image compression algorithm.
In this lab, we will experiment with some basic data compression techniques as applied to im-
ages. Typically, there is a tradeoff between the number of bits an encoder produces and the quality
of the decoded reproduction. With more bits we can usually obtain better quality at the expense of
greater storage or bandwidth requirements. When we assess how well these techniques work, we
will count the number of bits their encoders produce (fewer is better), and we will compute the
mean-squared or RMS error as a measure of the quality of the decoded reproduction (low error
means high quality, or equivalently, low distortion).
Figure 5.1: A general data compression system: the Signal enters the Encoder/Compressor, which
produces bits; the Decoder/Decompressor converts the bits back into the Reconstructed Signal.
5.2 Background
5.2.1 Images
So far, we have dealt entirely with one-dimensional signals. That is, these signals are indexed by
only one independent variable (usually time). In this lab, we will start to consider two-dimensional
signals. An image is an example of a two-dimensional signal. In an image, we usually index the
signal based on horizontal and vertical position — two dimensions that are needed to find the “signal
value” at any given point.
In this lab, we will generally restrict our attention to gray-scale images1 . We mathematically
represent such an image (in continuous-space) as a signal x(t, s), where 0 ≤ t ≤ H, 0 ≤ s ≤ W .
x(t, s) denotes the intensity, brightness, or value of the image at the position with vertical coordinate
t and horizontal coordinate s, and H and W are the height and width of the image, respectively. The
values of x(t, s) are generally nonnegative. Thus, a small value of x(t, s) (close to zero) corresponds
to black while larger values correspond to progressively lighter shades of gray.
In digital image processing, the image is assumed to be sampled at regularly spaced intervals
creating a discrete-space image x[m, n]:

x[m, n] = x(mTs , nTs ),

where Ts is the sampling interval, given in units of distance. Thus, in discrete-space, an image is
simply an M × N array or matrix of numbers x[m, n], where m and n are integers in the range
[1, M ] and [1, N ], respectively. Each x[m, n] is called a pixel. We adopt the usual convention that
x[1, 1] is the upper left pixel, x[1, N ] is the upper right, x[M, 1] is the lower left, and x[M, N ] is the
lower right.
We shall also adopt the common, but not universal, convention of digital image processing that
pixel values, often called levels, are integers ranging from 0 to 255. The reason the pixel values
are integers is that computers cannot store real-valued quantities. Instead the raw pixel values must
be quantized to values from a finite set. The usual practice is to scale the raw image pixel values
by some constant so the maximum value is close to 255 and then to round each pixel value to the
nearest integer, thereby obtaining an image whose values are integers between 0 and 255. Why 0 to
255? There are two reasons. One is that these values can be conveniently represented with one byte,
i.e. 8 bits.2 Another reason is that the effects of rounding to 256 possible levels are not ordinarily
observable, whereas rounding to a significantly smaller number, say 128, is sometimes noticeable.
Notice the relationship between the variance and standard deviation, equation (5.10), and the
relationship between these statistics and the MS and RMS values3 , equations (5.8) and (5.12). The
variance and standard deviation measure how widely the values of a signal vary. If they
are small, the signal values (and thus the signal value distribution) are tightly clustered
around the mean value, while if they are large, the signal values range widely.
Recall from Laboratory 1 that we often use the MS and RMS values to measure distortion of a
signal. We will be doing this for images in this laboratory. If y[m, n] is a distorted version of x[m, n],
then we can measure the mean-squared error (MSE) and root mean-squared error (RMSE), using
MSE = (1/(N M )) Σ_{n=1}^{N} Σ_{m=1}^{M} (x[m, n] − y[m, n])²    (5.13)

RMSE = √MSE.    (5.14)
3 Equation (5.8) is not something that is immediately obvious, but it is something that can be straightforwardly derived.
(Doing so is an interesting exercise.)
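Equations (5.13) and (5.14) translate directly into code. The sketch below is a plain-Python illustration (the lab itself would use MATLAB matrix operations); mse_rmse is a hypothetical helper name.

```python
import math

def mse_rmse(x, y):
    """Mean-squared error and root mean-squared error between two
    equal-size images, per equations (5.13) and (5.14).
    x and y are lists of rows (M rows of N pixels)."""
    M, N = len(x), len(x[0])
    mse = sum((x[m][n] - y[m][n]) ** 2
              for m in range(M) for n in range(N)) / (M * N)
    return mse, math.sqrt(mse)

# Tiny 2x2 example: squared errors are 4, 0, 0, 16, so MSE = 5.
x = [[10, 20], [30, 40]]
y = [[12, 20], [30, 36]]
mse, rmse = mse_rmse(x, y)
```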
Many advanced compression systems, including JPEG, use variable length coders. MP3 coding has
provisions for both fixed-length and variable-length coding.
As illustrated in Figure 5.3, the decoder/decompressor corresponding to the encoder/compressor
just described has two components. The input to the decoder is the bits produced by the encoder.
The first component, the binary decoder, inverts the operation of the binary encoder, and produces
the levels originally produced by the quantizer. The quantization operation performed by the encoder
is not invertible, so there is no corresponding decoding step. Instead the last step is the inverse
transform, which as the name suggests performs the inverse of the encoding transform. The output
of the decoder, called a decoded image or reproduction, can be displayed on a monitor
or printed on paper, as desired.
5.2.4 Transformation
Efficient lossy data compressors typically perform some sort of preprocessing on the data to be
compressed. One very common preprocessing step is a transform, and such compressors are called
transform coders. For example, JPEG is a transform coder based on the discrete cosine transform
(DCT), which is a spectral transformation similar to the DFT. The transform is typically applied
to small groups of pixels called blocks. In this lab, we will experiment with a simple DFT-based
transform coder that uses short 1 × 8 pixel blocks. That is, we use an N -point DFT with N = 8.
Recall that the synthesis and analysis formulas for an 8-point DFT are given by

x[n] = Σ_{k=0}^{7} X[k] e^{j(2πk/8)n}    (5.15)

X[k] = (1/8) Σ_{n=0}^{7} x[n] e^{−j(2πk/8)n}    (5.16)

Here, X[k]e^{j(2πk/8)n} is the “spatial” frequency component at frequency ω̂ = 2πk/8. The DFT synthesis
formula shows that an image block x[n] can be viewed as the sum of such components. More
specifically,
Note that we are NOT presuming that these blocks are in any way periodic.
which is an often useful fact. This is derived using Parseval’s relation, as given in Lab 4 4 .
4 For this derivation, see the document titled “Notes: The Distortion of Transform Coding” by D.L. Neuhoff.
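The 1 × 8 block transform of equations (5.15) and (5.16) can be sketched directly. This is a plain-Python illustration (the lab uses MATLAB); dft8 and idft8 are hypothetical helper names, and the sample block values are made up.

```python
import cmath
import math

def dft8(x):
    # Analysis formula (5.16): includes the 1/8 factor.
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / 8)
                for n in range(8)) / 8
            for k in range(8)]

def idft8(X):
    # Synthesis formula (5.15); real part only, since the input block
    # is real-valued.
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / 8)
                for k in range(8)).real
            for n in range(8)]

block = [52, 55, 61, 66, 70, 61, 64, 73]   # one 1x8 block of pixel values
c = dft8(block)
# c[0] is the average (DC) value of the block, and the inverse
# transform recovers the block exactly (up to roundoff).
recovered = idft8(c)
```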
5.2.5 Quantization
Quantization is the most elementary form of lossy data compression, while also forming a funda-
mental part of more advanced lossy compression schemes such as transform coding. We may quan-
tize an image directly, or we may quantize the results of a transformation as described in Section
5.2.4. When a number x is quantized to L levels, we mean that its value is replaced by (or quantized
to) the nearest member of a set of L quantization levels. Here, we consider uniform quantization 5 .
For the uniform quantization used here:
• We place the quantization level for a given segment in the middle of that segment.
The quantizer is illustrated with the figure shown below, which shows L = 8 segments of width
∆ = (xmax − xmin )/8 as thick lines and the corresponding levels within each segment as circles.
Given a pixel x[m, n], the quantizer operates by outputting the nearest level. Equivalently, if
x[m, n] lies in the ith segment, then the quantizer outputs the ith level. If x is larger than xmax , then x
is quantized to the largest level, namely, xmax − ∆/2. Similarly, if x is smaller than xmin , then x is
quantized to the smallest level, namely, xmin + ∆/2.
One can see that if x is within the quantizer range, then its quantized value will differ from x
by at most ∆/2, so that the quantizer introduces only a small error. On the other hand, when x is
outside the range, the quantizer can introduce a large error. Thus, when designing a quantizer it
is important to choose the quantizer range so that it includes most values of x. Making the range
large will do this. However, we don’t want to make the range too large. Larger ranges mean that
∆ = (xmax − xmin )/L is larger, which in turn increases the maximum possible error introduced
when x lies within the range of the quantizer.
tizers in this lab. Uniform quantizers are sometimes called uniform scalar quantizers to distinguish them from more sophis-
ticated quantizers that do not operate independently on successive data samples.
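The uniform quantizer just described can be sketched in a few lines. This is a plain-Python illustration (the lab uses MATLAB); uniform_quantize is a hypothetical helper name.

```python
def uniform_quantize(x, xmin, xmax, L):
    """Uniform quantizer sketch: L segments of width delta over
    [xmin, xmax], with each level placed in the middle of its segment.
    Values outside the range are quantized to the extreme levels."""
    delta = (xmax - xmin) / L
    if x >= xmax:
        return xmax - delta / 2
    if x <= xmin:
        return xmin + delta / 2
    i = int((x - xmin) / delta)        # index of the segment holding x
    return xmin + (i + 0.5) * delta    # level at the segment's middle

# With xmin = 0, xmax = 256, L = 8: delta = 32 and the levels are
# 16, 48, ..., 240.  The value 100 lies in segment 3, so it is
# quantized to the level 112.
q = uniform_quantize(100, 0, 256, 8)
```

Note that for every in-range input, the quantization error is at most ∆/2 = 16, matching the discussion above.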
[Figure: x[n, m] → Quantizer → x̂[n, m] → Binary Encoder → bits → Binary Decoder → x̂[n, m].]
reproduction of the original piece of data. For instance, if the image pixel x[m, n] lies in the third
segment of the quantizer, the binary encoder will produce 010, which, when received, causes the
decoder to produce the third level as the reproduction of x[m, n].
If, as often happens, the number of levels is a power of two, i.e. L = 2^b where b is an integer,
then the simplest approach is to make each codeword have b bits. It does not matter which b-bit
sequence is assigned to which level, but the usual scheme, as illustrated above, is to assign the
binary sequence representing 0 to the smallest level, the binary sequence representing 1 to the next
largest level, and so on. With this type of binary coding, the encoder is fixed-length (or fixed-rate) in
the sense described earlier. Often, a better scheme is to use shorter codewords for the quantization
levels that occur more frequently, and longer ones for those that are used less frequently. Such
variable-length codes are used in JPEG and other high efficiency schemes.
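The fixed-length scheme just described is easy to sketch. This is a plain-Python illustration; encode_level and decode_level are hypothetical helper names, and the smallest level gets the all-zeros codeword as in the usual scheme above.

```python
def encode_level(i, b):
    """Fixed-length binary encoder sketch: the b-bit codeword for
    quantizer level index i (the smallest level gets 00...0)."""
    return format(i, "0{}b".format(b))

def decode_level(codeword):
    # Binary decoder: invert the encoder, recovering the level index.
    return int(codeword, 2)

# With L = 8 = 2^3 levels, each codeword has b = 3 bits; the third
# segment (index 2) is encoded as 010, as in the example above.
codeword = encode_level(2, 3)
```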
5.2.7 Performance
There are two ways that we measure the performance of a compression system. First, we want
to know how many bits are required to store an image. The total number of bits produced by the
encoder is equal to the number of blocks multiplied by the number of bits required to encode one
block. More commonly, we report the number of bits required to store a single pixel. This is called
the coding rate, R. The coding rate is equal to the number of bits required to code a single block
divided by the number of pixels in a block. Naturally, we prefer a lower coding rate.
The second performance measure is the amount of distortion introduced by the coder. Generally,
we measure this distortion by computing the mean-squared (MSE) or RMS error (RMSE). We also
prefer to have low distortion, and equivalently low error.
Unfortunately, we generally have to trade off between these two performance measures. That is,
we can produce a highly compressed (with a low coding rate) image, but this generally introduces a
large RMS error. Alternatively, we can have a very high-quality representation of an image (with low
distortion), but such a representation requires many bits to encode. Figure 5.5 shows an illustration
of the tradeoff between the two performance measures. In the laboratory assignment, you will
produce a plot similar to this for compression using uniform quantization.
Elementary theory predicts that when the quantizer range includes most values of the image
x[m, n] and when ∆ is much smaller than the standard deviation of the image, then the MSE induced
by quantizing with level spacing ∆ can be approximated as follows6 :

MSE ≈ (1/12)∆²    (5.21)
    = (1/12)((xmax − xmin )/L)²    (5.22)
    = (1/12)(xmax − xmin )² 2^{−2R},    (5.23)
This shows that if we were to shrink ∆ by a factor of 2, as would happen if L were doubled and the
range were held constant, then the MSE would decrease by a factor of four. Equivalently, the last
equation shows that this factor of four reduction comes by increasing the coding rate by one bit per
pixel.
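The ∆²/12 approximation can be checked empirically by quantizing synthetic data. The sketch below is a plain-Python illustration with uniformly distributed data (a convenient assumption, not image data); uniform_quantize is the same hypothetical midpoint-level quantizer described in Section 5.2.5.

```python
import random

def uniform_quantize(x, xmin, xmax, L):
    # Midpoint-level uniform quantizer, as in Section 5.2.5.
    delta = (xmax - xmin) / L
    if x >= xmax:
        return xmax - delta / 2
    if x <= xmin:
        return xmin + delta / 2
    return xmin + (int((x - xmin) / delta) + 0.5) * delta

random.seed(206)
levels, xmin, xmax = 32, 0.0, 256.0
delta = (xmax - xmin) / levels
data = [random.uniform(xmin, xmax) for _ in range(100000)]
mse = sum((x - uniform_quantize(x, xmin, xmax, levels)) ** 2
          for x in data) / len(data)
predicted = delta ** 2 / 12    # the Delta^2/12 approximation, eq. (5.21)
# The empirical MSE lands close to the predicted value.
```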
When a quantizer is applied to data whose signal value distribution is fairly constant over a given
range, then it is usually good practice to choose the quantizer range to match the data range. This
is generally the case when directly quantizing images, so we will generally choose x min = 0 and
xmax = 255.
On the other hand, when quantizing data whose signal value distribution is quite uneven, then it
may be best to choose the quantizer range to be a subset of the data range. For example, in transform
coding, it often happens that most of the data to be quantized is near zero but there are a few very, very
large values. In such cases, experience has shown that to design a quantizer with small MSE, one
should normally choose the width of the range to be proportional to the standard deviation of the data
being quantized, i.e. (xmax − xmin ) = c × Std(x) = c √(Var(x)). The constant of proportionality c
is usually between 2 and 6. Smaller values of c work well for smaller values of L, and larger values
6 See the document “Note: The ∆2 /12 Formula” by D.L. Neuhoff.
Figure 5.5: There is an inherent tradeoff between coding rate and distortion (RMS error).
Figure 5.6: A block diagram of a transform coder. The encoder divides the incoming image, x[m, n],
into 1 × 8 blocks and transforms each block into a sequence of 8 coefficients c[0], . . . , c[7]. These
coefficients are then quantized to yield ĉ[0], . . . , ĉ[7], and encoded into a binary representation. The
decoder creates a reconstruction of the image, x̂[m, n], by decoding the binary codewords and in-
verting the transformation.
of c work well for large values of L. Using this relation in (5.21), we find

    MSE ≈ ∆²/12 = (1/12) ((xmax − xmin)/L)²            (5.24)
        ≈ (c²/12) Var(x)/L²                            (5.25)
        ≈ (c²/12) Var(x) 2^(−2R) .                     (5.26)
This shows that quantizer MSE is proportional to the variance of the data and inversely proportional
to L².
If coefficient c[k] is quantized using bk bits, then we can write the overall coding rate of the transform coder, R, as

    R = (1/8) Σ_{k=0}^{7} bk  bpp .                    (5.27)
In many situations, we are given a desired coding rate R, e.g. R = 2 bpp. In this case, the question
becomes how we should divide these bits among the eight types of coefficients, i.e. how to choose
the bk ’s, so they average to the desired coding rate R, yet cause the distortion in the reproduction
produced by the transform code to be as small as possible.
Using (5.19), it can be shown that7

    MSE = Σ_{k=0}^{7} MSE[k] ,                         (5.28)
where MSE[k] is the MSE of the quantizer for c[k]. In other words, the MSE of the transform coder
is approximately the sum of the MSE’s of the quantizers for the different coefficients.
Let us first consider a transform coder where each type of coefficient is quantized with the same number of bits/pixel, i.e. b0 = b1 = . . . = b7 . We assert without proof that such a transform coder has roughly the same MSE as that of direct quantization with the same number of bits/pixel. We will now argue that changing the bk 's so that some are larger than others will make the transform coder work better than direct quantization.
From (5.26) we have that

    MSE[k] ≈ (c²/12) Var(c[k]) 2^(−2bk) ,              (5.29)
where Var(c[k]) denotes the variance of the c[k] values. One can see from the above that the coefficients with larger variance will be quantized with larger mean-squared error. In particular, the DC coefficients c[0] usually have the largest variance, so they will have the largest MSE. On the other hand, the c[3]'s and c[7]'s usually have the smallest variance and distortion.
Now suppose we increase b0 by one and decrease b7 by one. From (5.27) we see that this will
have no net effect on the number of bits produced by the coder. However, from (5.29) we see that
this decreases the (large) MSE of the DC coefficients c[0] by a factor of 4, and increases the (small)
MSE of the c[7] coefficients by a factor of 4. Is it beneficial to decrease one MSE by 4, when another
one increases by 4? We can see from (5.28) that indeed it is: decreasing a large MSE by a factor of 4 reduces the sum in (5.28) by more than increasing a small MSE by a factor of 4 adds to it.8 Thus, what we want to do is shift bits towards the coefficients with larger
variances. This will make MSE smaller than if all coefficients were quantized with the same number
of bits and, therefore, smaller than the distortion of direct quantization.
More generally, in a well designed transform code, all of the MSE[k]’s will be approximately
the same. If they were quite different, we could move a bit from a coefficient with small MSE to one
with large MSE and achieve a net decrease in overall MSE. In this light, we can see that the role of
the transform is to make the variances of the coefficients as different as possible. Some should be
large, and others should be small.
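The bit-shifting argument can be checked directly against (5.29). The sketch below (Python, for illustration only; the coefficient variances are hypothetical numbers, not measurements from a real image) compares the total MSE of an equal bit allocation with one that moves a bit from a low-variance coefficient to a high-variance one:

```python
def total_mse(variances, bits, c=4.0):
    # MSE[k] ~ (c**2 / 12) * Var(c[k]) * 2**(-2*b_k), summed over k, as in (5.28)-(5.29)
    return sum((c ** 2 / 12) * v * 2 ** (-2 * b) for v, b in zip(variances, bits))

# hypothetical coefficient variances: c[0] (DC) largest, c[7] smallest
variances = [4000, 900, 400, 150, 150, 60, 30, 10]

equal = [4] * 8                     # 4 bits for every coefficient
shifted = [5, 4, 4, 4, 4, 4, 4, 3]  # move one bit from c[7] to c[0]; same total

print(total_mse(variances, equal), total_mse(variances, shifted))
```

The shifted allocation uses exactly the same total number of bits, yet the second printed value is smaller, which is the whole point of unequal bit allocation.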
>> x = double(imread('my_img.tif'));
>> y = x(:);
Note that the imread command will load many standard image file formats, including JPEG,
PNG, BMP, TIFF, PCX, and a host of others.
• Displaying images. To display an image in MATLAB, there are actually a number of commands that must be used together. To display the image itself, we use the imagesc command. To tell MATLAB to display the image as a gray-scale image, we use the command colormap(gray). To set the axes so that the aspect ratio is correct, use the command axis image. Finally, to add a "color bar" that relates image values to colors, use the colorbar command. Every image that you produce for this course must have a color bar; you will lose points for every image you display without a color bar. To do all of these things at once to display an image x, use the following code:
>> imagesc(x); colormap(gray); axis image; colorbar;
You will be using this sequence of commands often, so you might wish to write a short function that executes all of these commands simultaneously.
• Quantizing an image: The function quantize_fcn.m, which we provide to you for this
lab, implements a uniform quantizer for images and transform coefficients. It takes a signal,
the desired number of quantization levels (L), and the two numbers that define the quantization range, xmin and xmax . For instance, to quantize an image, img, to 64 levels, use the command
>> [q_img, delta] = quantize_fcn(img, 64, 0, 255);
q_img contains the quantized image, while delta contains the ∆ value used for quantization. Here, note that xmin = 0 and xmax = 255. This separately quantizes each pixel of img to one of 64 levels, in accordance with the procedure described in the background section.
• Using the DFT Coder: The DFT-based transform coder that we have described in this labo-
ratory is provided as three separate functions.
– dft_block.m breaks the image into 1 × 8 blocks and computes the DFT of each block. If the image is M × N , this function produces a series of eight band images. For k = 1, . . . , 8, the k th band image contains the c[k − 1] coefficients for each block. For example, the k = 1 band image contains the c[0], or DC, coefficients from each block. Each band image has size M × N/8.
The eight band images are returned as a three-dimensional array. To produce the band
images for an image, img and then access the third band image, for instance, we would
use the commands
>> A = dft_block(img);
>> A(:,:,3);
Note that except for the first one, each band image contains both positive and negative
values. However, we can still display them using imagesc.
– inverse_dft_block.m reconstructs the image from the matrix of band images returned by dft_block.m.
– dft_coder.m puts both of these blocks together by calling dft_block, quantizing
the coefficient matrix, and reconstructing the image with inverse_dft_block.
dft_coder takes several input parameters, all of which are optional except the first
one. The first parameter is the image to encode. The second is a vector of bit allocations,
bk . For instance, if we call dft_coder like this
>> coded = dft_coder(img,[8 6 6 6 6 4 4 4]);
we quantize our c[0] (DC) coefficients using 8 bits, the next four (real) coefficients with
6 bits each, and the last three (imaginary) coefficients using 4 bits each9 .
9 Though more advanced coders may allow the allocation of fractions of bits, for this coder you must allocate a whole
number of bits to each coefficient. You can, however, assign no bits to a coefficient. In this case, that coefficient is simply set
to a constant value.
Note that the number of bits required to encode a single pixel is equal to the average
value of all of the bk ’s. Thus, the example above uses 5.5 bits per pixel.
When run, dft_coder returns the decoded image and also displays a table of useful statistics corresponding to each coefficient c[k]. Make sure that you put a semicolon at the end of your call to dft_coder, so that the returned image is not also echoed to the screen and the table remains visible.
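The 5.5 bits-per-pixel figure quoted above is just (5.27) applied to the example allocation: the per-pixel cost is the average of the eight bit allocations. A one-line check (in Python, purely illustrative):

```python
bits = [8, 6, 6, 6, 6, 4, 4, 4]  # the allocation passed to dft_coder in the example
R = sum(bits) / len(bits)        # average bits per coefficient = bits per pixel, as in (5.27)
print(R)  # → 5.5
```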
(a) (Display an image) Load the image “cameraman.tif”. (If your computer does not have
the Image Processing Toolbox, you’ll need to download the file from the web page).
• Display the image and include the resulting figure in your report.
• Calculate the size of the image (the number of rows and columns) and the total
number of pixels in the image.
• Find the minimum and maximum pixel values, xmin and xmax in the image.
(b) (Produce and interpret a histogram) Estimate the signal value distribution of this image
by generating a histogram with 256 bins centered at integers from 0 to 255.
• Include the resulting plot in your report.
• From this histogram, what signal values occur the most often in this image?
• In words, describe which part(s) of the image corresponds to these signal values.
(c) (Examining signal values) It is useful to be able to think of images in terms of the signal
values that make them up. Download the M-file display_square.m, which will
help this process. Use this function to display the pixel values in several rectangular
segments of the “cameraman” image. Find, approximately, the smallest rectangle of
pixels that includes the black tip of the camera lens.
• Include in your report a plot from display_square.m showing the pixel values
of the rectangle you found.
• From this display, what are the row and column indices of this rectangle?
• From this display, what are minimum and maximum values within this rectangle?
(d) (Signal representations) We know that this image takes on only integer values over a
finite range, but there are still a few different ways we can represent the image. In the
original file, for instance, each pixel is represented using 8 bits. In MATLAB, though, we convert the image into 64-bit double precision values.
• How many bits are required to describe the entire image at 8 bits per pixel?
• How many bits are required to describe the entire image at 64 bits per pixel?
• How many possible pixel values can a 64-bit number represent?
2. (Direct quantization) In this problem, we will experiment with direct quantization as an image
compression mechanism. Download the function quantize_fcn.m.
(a) (Quantize an image) Use quantize_fcn to quantize the “cameraman” image using
64 levels, 16 levels, and 4 levels. Assume xmin = 0 and xmax = 255.
• Display and include in your report the three resulting quantized images along with
the original using subplot. Again, make sure that you indicate which image is
which.
• Describe the effects of the quantization in these plots.
(b) (Plot quantization functions) Use MATLAB to make a plot of the function being implemented by quantize_fcn.m. For example, for the 64 level quantizer, run quantize_fcn(x,64,0,255) for x ranging from 0 to 255, and plot the resulting values versus x.
• Plot the quantization function for the 16 level quantizer.
• Also, plot the histogram of the image quantized with 16 levels, using 256 bins centered at the integers from 0 to 255.
(c) (Quantization as compression) For the 4, 16, and 64 level quantizers,
• How many bits are needed to represent each of these quantized images?
• How many bits are needed to represent each pixel in one of these images?
(d) (Measuring quantization error) Find the “error image” corresponding to each of these
quantized images.
• Using subplot, display and include in your report the three error images in the
same plot.
• Can you see aspects of the original images in these plots?
• Calculate the RMS error for each quantization of the image.
(e) (Evaluating RMS error predictions) Now, we want to compare the actual RMS error for
“cameraman” versus the predicted RMS error (based on the derivation in Section 5.2.7)
for quantizers with 2, 4, 8, 16, 32, 64, and 128 levels.
• Calculate the actual RMS error for each of these quantizers.
• Calculate the predicted RMS errors for these quantizers.
• Plot both the actual and predicted RMS error values versus the required number of
bits per pixel.
• For what number of bits per pixel is this prediction most accurate?
3. (Compression using a transform coder) In this problem, you will experiment with the DFT-
based transform coder that is described in the background section.
(a) (Create and examine band images) Download the M-file dft_block.m. Use it to
generate the matrix of band images for the “cameraman” image.
• Use subplot to simultaneously display all eight band images. Use axis square
rather than axis image when you display these band images.
• Discuss the appearances of the various band images. For example, can you see any
features of the original cameraman image in any or all of them?
(b) (Reconstruct a coded image) Download the M-file inverse_dft_block.m. Use
this function to reconstruct the original image from the set of band images produced by
dft_block.
• Compute the RMS error between the original and the transformed/inverse transformed image. (It should be negligibly small.)
(c) (Designing coders for image compression) Download the M-file dft_coder.m. Our
goal in using dft_coder is to find appropriate parameters for the eight quantizers when compressing the "cameraman" image. Through intelligent design, we hope to
achieve lower RMS error than with direct quantization of the image using the same
number of bits. We do this by allocating bits to each of our eight quantizers independently.
i. (Design a 4 bpp coder) Find a 4 bits per pixel design with as small an RMS error as
you can. You should be able to get an RMS error less than 4. (Hint: As a general
rule of thumb from Section 5.2.8, bigger coefficients should get more bits.)
• What bit allocation did you use, and what was the resulting RMS error?
• Display the reconstructed image and the error image on the same figure using
subplot.
• Compare your RMS error to the RMS error of 4 bits per pixel uniform quantization that you performed in problem 2e.
• Compare the qualitative appearance of the reconstruction produced by the transform coder to that produced by the direct quantizer.
ii. (Design a 3 bpp coder) Find a 3 bits per pixel design with as small an RMS error as
you can. You should be able to get an RMS error less than 6.4.
• What bit allocation did you use, and what was the resulting RMS error?
• Display the reconstructed image and the error image on the same figure using
subplot.
• Compare your RMS error to the RMS error of the 3 bits per pixel uniform
quantization that you performed in problem 2e.
• Compare the qualitative appearance of the reconstruction produced by the transform coder to that produced by the direct quantizer. (Note: You have not yet displayed the 3 bpp image, so you will need to generate it for comparison.)
iii. (Design a 2 bpp coder) Find a 2 bits per pixel design with as small an RMS error as
you can. You should be able to get an RMS error less than 10.8.
• What bit allocation did you use, and what was the resulting RMS error?
• Display the reconstructed image and the error image on the same figure using
subplot.
• Compare your RMS error to the RMS error of a 2 bits per pixel uniform quantization that you performed in problem 2e.
• Compare the qualitative appearance of the reconstruction produced by the transform coder to that produced by the direct quantizer.
iv. (Comment on coder design) Given your experimentation with this transform coder,
• Comment on the relative performances of direct quantization and transform
coding as the number of bits/pixel changes.
Food for Thought: In this lab, we've used a 1-dimensional transform for our coder. We can achieve significantly better compression if we use a 2-dimensional transform. MATLAB implements a two-dimensional DFT with the command fft2. As a challenging project, consider modifying the transform coder provided here to work on 4 × 4 or 8 × 8 blocks of an image.
How much compression can you achieve with this modified coder?
4. On the front page of your report, please provide an estimate of the average amount of time
spent outside of lab by each member of the group.
1. JPEG uses a two-dimensional transform. This allows much greater compaction of the data
into a few transform coefficients.
2. JPEG uses a transform called the discrete cosine transform (DCT), which is purely real, rather than the DFT. This removes some of the redundancies in our coding method.
3. JPEG uses a technique called run-length encoding. This allows a coder to store a “run” of
similar values by indicating the value and the number of repetitions.
4. JPEG uses a variable-length coding scheme (often Huffman coding, which you may study
in an intermediate programming course on data structures and algorithms) to produce a bit
stream for the final coded representation.
All of these improvements allow images to be significantly compressed with relatively small distor-
tion. For more information about JPEG coding, you might wish to look at the JPEG Tutorial:
http://www.ece.purdue.edu/~ace/jpeg-tut/jpegtut1.html
6.1 Introduction
Digital filters are one of the most important tools that signal processors have to modify and improve
signals. Part of their importance comes from their simplicity. In the days when analog signal processing was the norm, almost all filtering was accomplished with RLC circuits. Now, a great deal of
filtering is accomplished digitally with simple (and extremely fast) routines that can run on special
digital signal processing hardware or on general purpose processors.
So why do we filter signals? There are many reasons. One of the biggest is noise reduction
(which we have called signal recovery). If our signal has undesirable frequency components, e.g. it
contains noise in a frequency range where there is little or no desired signal, then we can use filters
to reduce the relative amplitude of the signal at such frequencies. Such filters are often called frequency blocking filters, because they block signal components at certain frequencies. For example,
lowpass filters block high frequency signal components, highpass filters block low frequency signal
components, and bandpass filters block all frequencies except those in some particular range (or
band) of frequencies.
There is a wide range of uses for filtering in image processing. For example, filters can be used to improve the appearance of an image. For instance, if the image has granular noise, we might want to smooth or blur the image to remove such noise. Typically such noise has components at all frequencies,
whereas the desired image has components at low and middle frequencies. The smoothing acts as a
lowpass filter to reduce the high frequency components, which come, predominantly, from the noise.
Alternatively, we might want to sharpen the image to make its edges stand out more. This requires
a kind of highpass filter.
In this lab, we will experiment with a class of filters called FIR (finite impulse response) filters.
FIR filters are simple to implement and work with. In fact, an FIR filtering operation is almost
identical to the operation of running correlation which you have worked with in Laboratory 2. In
particular, we will examine the use of FIR filters for image processing, including both smoothing
and sharpening. We will also examine their use on simple one-dimensional signals.
• How can we improve the appearance of an image? Specifically, how can we remove noise or
“sharpen” an image?
6.2 Background
6.2.1 Implementing FIR Filters
FIR filters are systems that we apply to signals. An FIR filter takes an input signal x[n], modifies it
by the application of a mathematical rule, and produces an output signal y[n]. This rule is generally
called a difference equation, and it tells us how to compute each sample of the output signal y[n] as
a weighted sum of samples of the input signal x[n]. A common form of the difference equation is
given as
    y[n] = Σ_{k=0}^{M} bk x[n − k]                                         (6.1)
         = b0 x[n] + b1 x[n − 1] + b2 x[n − 2] + . . . + bM x[n − M ]      (6.2)
The bk ’s are called the FIR filter coefficients, and M is the order of the FIR filter. The set of FIR filter
coefficients completely specifies an FIR filter. Different choices of the order and the coefficients
leads to different kinds of filters, e.g. to lowpass, highpass and bandpass filters.
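Equation (6.1) translates almost directly into code. The sketch below (Python rather than MATLAB, purely illustrative) implements a causal FIR filter by computing the weighted sum for each output sample, treating x[n] as zero for n < 0:

```python
def fir_filter(b, x):
    """Causal FIR filter: y[n] = sum_{k=0}^{M} b[k]*x[n-k], with x[n] = 0 for n < 0."""
    y = []
    for n in range(len(x)):
        acc = 0
        for k in range(len(b)):
            if n - k >= 0:
                acc += b[k] * x[n - k]
        y.append(acc)
    return y

# second-order example: y[n] = x[n] + 2*x[n-1] + x[n-2]
# feeding in an impulse returns the coefficients themselves
print(fir_filter([1, 2, 1], [1, 0, 0, 0]))  # → [1, 2, 1, 0]
```

Note that the impulse response of an FIR filter is just the coefficient sequence itself, which is where the name "finite impulse response" comes from.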
Equation (6.1) defines the class of causal FIR filters. A more general form is given by
    y[n] = Σ_{k=−M1}^{M2} bk x[n − k]                                      (6.3)
         = b−M1 x[n + M1 ] + . . . + b−1 x[n + 1]
           + b0 x[n] + b1 x[n − 1] + . . . + bM2 x[n − M2 ] ,              (6.4)
where M1 and M2 are nonnegative integers. Here, the order of the filter is M1 + M2 . When
M1 > 0, the FIR filter is non-causal. To calculate the “present” value of y[n0 ], a causal FIR
filter only requires “present” (n = n0 ) and “past” (n < n0 ) values of x[n]. Non-causal filters, on the
other hand, require “future” (n > n0 ) values of x[n]. Thus, a filter with difference equation given by
y[n] = x[n] + x[n − 1] is causal, but a filter with difference equation given by y[n] = x[n] + x[n + 1]
is non-causal. The distinction between causal and non-causal filters is important if we wish to implement one of these filters in real time. Causal filters can be implemented in real time, but to implement non-causal filters we generally need all of the data for a signal before we can filter it.
Compare equation (6.3) with the equation for performing running correlation between a signal
b[n] and x[n]:
    y[n] = C(b[k], x[k − n]) = Σ_{k=−∞}^{∞} b[k] x[k − n] .                (6.5)
Recall that we thought of running correlation as a procedure where we "slid" one signal across the other, calculating the in-place correlation at each step. If we consider that the bk 's of an FIR filter form a signal, then the application of an FIR filter uses the same procedure with two minor differences. First, when we apply an FIR filter, we are only "correlating" over a finite range; however, we typically assume bk = 0 for k outside the range [−M1 , M2 ]. Thus, we can change the limits of summation to range over (−∞, ∞) without changing the result. Second, when applying a filter,
Figure 6.1: A graphical illustration of filtering. The filter coefficients, bk , and the signal to be filtered, x[n], are shown on the top axis. The middle axis shows x[n] and a time-reversed and shifted version of bk . We multiply these two signals and sum the result to yield a single sample of the output, y[n], which is shown on the bottom axis. For example, to compute the y[6] sample, we multiply the samples of x[n] by b6−n and sum the result.
the signal x[n] is time-reversed with respect to the bk coefficients1 . This is not the case for running
correlation.
From the definition alone, it is not easy to see how a filter “works.” With the connection to
correlation, though, we can suggest an intuitive graphical understanding of this process which is
shown in Figure 6.1. To calculate a single sample of y[n], we time-reverse the signal formed by
the bk coefficients (by flipping it across the n = 0 axis). Then, we shift this time-reversed signal
by n samples and perform in-place correlation. The result is the nth sample of y[n]. To build up
the entire signal y[n], we do this repeatedly, “sliding” one signal across the other and calculating
in-place correlations at each point.
You may find it useful to go back to Lab 2 and review the algorithm for in-place correlation.
In that description of the algorithm, we used x[n] where here we wish to use the signal formed by
the bk ’s. We can use this algorithm when implementing FIR filters, as well. Note, however, that
we want to time-reverse the bk coefficients when we multiply them by the incoming signal samples.
That is, we always want to multiply the b−M1 coefficient by the newest sample in the buffer.
For example, consider the five-point averaging filter with difference equation

    y[n] = (1/5) x[n] + (1/5) x[n − 1] + (1/5) x[n − 2] + (1/5) x[n − 3] + (1/5) x[n − 4] .    (6.6)
1 That is, x[n − k] is a time-reversed version of x[k − n], just as s[−n] is a time-reversed version of s[n]. Note that we
can “time-reverse” the bk coefficients rather than x[n] and achieve the same result.
Figure 6.2: Plots of an input signal x[n] (top) and the corresponding output y[n] (bottom) versus sample number.
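The five-point average in (6.6) is a simple smoothing (lowpass) filter. As a quick illustration (in Python rather than MATLAB; the input here is a made-up noisy constant, not a signal from this lab), applying it reduces the fluctuations in the signal:

```python
import random

def moving_average5(x):
    # y[n] = (x[n] + x[n-1] + ... + x[n-4]) / 5, taking x[n] = 0 for n < 0, as in (6.6)
    return [sum(x[max(0, n - 4):n + 1]) / 5 for n in range(len(x))]

random.seed(1)
x = [10 + random.uniform(-1, 1) for _ in range(200)]  # noisy constant signal
y = moving_average5(x)

def spread(s):
    """Mean squared deviation from the mean, a simple measure of fluctuation."""
    m = sum(s) / len(s)
    return sum((v - m) ** 2 for v in s) / len(s)

# skip the first few samples to avoid the start-up transient
print(spread(x[10:]), spread(y[10:]))  # the filtered signal fluctuates less
```

Averaging five nearby samples cancels much of the independent noise while leaving the slowly varying part of the signal nearly unchanged, which is exactly the lowpass behavior described in the text.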
Figure 6.3: A block diagram of additive noise and a recovery filter that attempts to remove the noise. A signal s[n] is corrupted by additive noise v[n] to form x[n], which the recovery filter processes to produce the estimate ŝ[n].
changes the amplitude of a signal, but does nothing else. Compare this to the system with difference
equation y[n] = x[n − N ]. This system’s only effect is to delay the signal by N samples. In some
circumstances, the delay introduced by a causal filter does not affect the operation of the system.
For our purposes in this laboratory, we will need to be careful to account for the delay introduced by
FIR filters when comparing two signals with a mean-squared or RMS distortion measure.
RMS error between x[n] and ŝ[n]) is minimized as a function of filter strength.
Nonlinear filtering
While standard FIR filters can be useful for noise reduction, in some cases we may find that they
distort the desired signal too much. An alternative is to use nonlinear filters. Nonlinear filters have
the potential to remove more noise while introducing less distortion to the desired signal; however,
the effects of these filters are much more difficult to analyze.
Consider the case of an image, for instance. One of the most important features of images of natural scenes is edges. Edges in images are usually just sharp transitions where one object ends and another begins. If we are attempting to remove high-frequency noise from an image, we will
often apply a lowpass filter. Edges, though, have considerable high-frequency content, so the edges in the resulting image will be smoothed out. To get around this problem, we can consider the application
of a common nonlinear filter called a median filter. Median filters replace each sample of a signal
with the median (i.e., the most central value) of a block of samples around the original sample. That
is, we can describe the operation of the median filter as

    y[n] = Median(x[n + M1 ], . . . , x[n], . . . , x[n − M2 ]) ,

where

    Median(x1 , . . . , xN ) = x((N +1)/2)                   N odd
                             = (1/2)(x(N/2) + x(N/2+1) )     N even        (6.10)
and where x(n) is the nth smallest of the values x1 through xN . The order of the median filter is given
by M1 + M2 , and it determines how many samples will be included in the median calculation. Note
that the filter is noncausal because its output depends on future, as well as past and present, inputs.
Unlike lowpass filters, median filters tend to preserve edges in signals very well. These filters are
also very powerful for removing certain types of noise while introducing relatively little distortion.
In this laboratory, we will examine the effect of applying nonlinear filters to two-dimensional signals.
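The edge-preserving behavior described above is easy to see on a one-dimensional step edge. This sketch (Python, illustrative) compares a width-5 median filter with a width-5 moving average; for simplicity, the window is clipped at the signal boundaries:

```python
import statistics

def median_filter(x, half=2):
    """Width-(2*half + 1) median filter; the window is clipped at the signal ends."""
    return [statistics.median(x[max(0, n - half):n + half + 1]) for n in range(len(x))]

def average_filter(x, half=2):
    """Moving average over the same clipped window, for comparison."""
    out = []
    for n in range(len(x)):
        window = x[max(0, n - half):n + half + 1]
        out.append(sum(window) / len(window))
    return out

step = [0] * 10 + [100] * 10  # an ideal edge
print(median_filter(step)[8:12])   # → [0, 0, 100, 100]   (the edge stays sharp)
print(average_filter(step)[8:12])  # → [20.0, 40.0, 60.0, 80.0]   (the edge is smeared)
```

The median output jumps from 0 to 100 in a single sample, while the moving average turns the edge into a gradual ramp over several samples.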
Figure 6.4: The coefficients of a two-dimensional moving average filter. In this figure, pixels exist at the intersections of the horizontal and vertical lines.
We could describe this filtering operation in terms of a two-dimensional set of filtering coefficients. For instance, the difference equation for this two-dimensional filter would be

    y[m, n] = (1/25) Σ_{k=0}^{4} Σ_{l=0}^{4} x[m − k, n − l] .             (6.11)

This operation is equivalent to filtering with a two-dimensional set of coefficients bk,l , where bk,l = 1/25 for k = 0, . . . , 4 and l = 0, . . . , 4.
This result suggests the third, most general, approach to FIR filtering of two-dimensional signals.
The general difference equation for this approach is
    y[m, n] = Σ_{k=−M1}^{M2} Σ_{l=−N1}^{N2} bk,l x[m − k, n − l] .         (6.12)
[−M1 , M2 ] and [−N1 , N2 ] define the range of nonzero coefficients. Note that a filter, such as the one defined by equation (6.11), is causal if M1 = 0 and N1 = 0. However, we should also note
that in image processing, causality is rarely important. Thus, two-dimensional FIR filters typically
have coefficients centered around b0,0 . A schematic of such a set of filter coefficients is shown in
Figure 6.4.
The “edge finding” filter highlights edges in an image by producing large positive or negative
values while setting constant regions of the image to zero. The most basic edge finding filter is a
simple one-dimensional first difference filter. A first difference filter has the difference equation
y[n] = x[n] − x[n − 1]. (6.13)
This filter will tend to respond positively to increases in the signal and negatively to decreases in
the signal. Adjacent input samples that are identical (or nearly so), though, will tend to cancel
one another, causing the output to be zero (or close to zero). There are various two-dimensional
“equivalents” of the first-difference filter, many of which respond to edges of a particular orientation.
One general edge-finding filter has the following difference equation:

    y[m, n] =   (1/4) x[m + 1, n + 1] − x[m + 1, n] + (1/4) x[m + 1, n − 1]
              − x[m, n + 1] + 3 x[m, n] − x[m, n − 1]                          (6.14)
              + (1/4) x[m − 1, n + 1] − x[m − 1, n] + (1/4) x[m − 1, n − 1]
This filter "finds" edges of almost any orientation by outputting a value with large magnitude wherever an edge occurs. Both the first difference filter and this general edge-finding filter are examples
of highpass filters. Note the “oscillatory” pattern of bk values such that adjacent coefficients are
negatives of one another. This pattern is characteristic of highpass filters. Note that both of these
filters will typically produce both positive and negative values, even if our input signal is strictly
non-negative. Also note that for both of these filters, the average of the bk coefficients is zero; this
means that these filters tend to “reject” constant regions of an input signal by setting them to zero.
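The claims above about the first difference filter (6.13) — zero output on constant regions, large positive or negative output at transitions — can be checked in a few lines (Python, illustrative):

```python
def first_difference(x):
    # y[n] = x[n] - x[n-1], taking x[-1] = 0, as in (6.13)
    return [x[n] - (x[n - 1] if n > 0 else 0) for n in range(len(x))]

signal = [5, 5, 5, 9, 9, 9, 2, 2]  # two edges separating constant regions
print(first_difference(signal))     # → [5, 0, 0, 4, 0, 0, -7, 0]
# the leading 5 is a start-up transient (x[-1] was taken to be 0);
# constant regions map to 0, and the edges produce +4 and -7
```

As the text notes, the filter responds positively to increases and negatively to decreases, and the coefficients (1, −1) sum to zero, so constant regions are rejected.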
The third operation, sharpening, makes use of an edge finding filter as well. Basically, the
sharpening filter produces a weighted sum of the output of an edge-finding filter and the original
image. Suppose that x[m, n] is the original image, and y[m, n] is the result of filtering x[m, n] with
the filter defined in equation (6.14). Then, the result of sharpening, z[m, n] is given by
z[m, n] = x[m, n] + by[m, n], (6.15)
where b controls the amount of sharpening; higher values of b produce a “sharper” image. Note that
z[m, n] can also be viewed as the output of a single filter. For display purposes, we will threshold the
resulting signal so that the output image has the same range of data values as the input image. That
is, assuming that our input image has values between 0 and 255, the final output of the sharpening
operation, ẑ[m, n] will be
    ẑ[m, n] = 0            z[m, n] < 0
            = z[m, n]      0 ≤ z[m, n] ≤ 255                               (6.16)
            = 255          255 < z[m, n]
Note that thresholding is a nonlinear operation, but it is not crucial to the sharpening process. This
final result can also be considered to be the output of a single nonlinear filter.
Sharpening is a useful operation when an image has undergone an undesired smoothing operation. This happens frequently in optical systems when they are not entirely in focus. Unlike
smoothing filters, though, sharpening filters tend to enhance random noise; often they may make
“noise-like” components of a signal visible where they were not visible before.
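Equations (6.15) and (6.16) combine into a short routine. In the sketch below (Python, illustrative), a simple 1-D first difference stands in for the 2-D edge finder of (6.14), purely to show the arithmetic of z = x + b·y followed by thresholding:

```python
def sharpen(x, b):
    """z[n] = x[n] + b*y[n], where y is an edge signal, clipped to [0, 255] as in (6.16)."""
    # a 1-D first difference stands in for the 2-D edge finder; x[-1] is taken equal to x[0]
    y = [x[n] - (x[n - 1] if n > 0 else x[0]) for n in range(len(x))]
    z = [xi + b * yi for xi, yi in zip(x, y)]
    return [min(255, max(0, zi)) for zi in z]

blurred_edge = [50, 50, 100, 150, 200, 200]  # a ramp where a sharp edge was smoothed
print(sharpen(blurred_edge, 2))  # → [50, 50, 200, 250, 255, 200]
```

The gentle ramp becomes a much steeper transition (with some overshoot clipped at 255), which is the visual effect of sharpening.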
>> yy = filter(bb,1,xx);
(We'll use the second parameter later in the course when we study IIR filters.) xx is a vector containing the discrete-time input signal to be filtered, bb is a vector of the bk filter coefficients, and yy is the output signal. The first element of this vector, bb(1), is assumed to be b0 .
By default, filter returns a portion of the filtered signal equal in length to xx. Specifically,
the resulting signal includes the start-up transient but not the ending transient. This means
that the output will be delayed by an amount determined by the coefficients of the filter.
A filtering method that does not introduce delay, i.e. a noncausal filtering method, is often desirable, especially when calculating RMS error between filtered and original versions of a signal. The command filter2 is meant as a two-dimensional filtering routine, but it
can be used for 1-D filtering as well. Further, it can be instructed to return a “delay-free”
version of the output signal. When using filter2, it is important that xx and bb are either
both row vectors or both column vectors. Then, we use the command
>> yy = filter2(bb,xx,'same');
where xx is the input signal vector, yy is the output signal vector, and bb is the vector of
filter coefficients. If the length of the vector bb is odd, the b0 coefficient is taken to be the coefficient at the center of the vector bb. If the length of bb is even, b0 is taken to be just left of the center. The output of filter2 has support equal to that of the input signal xx.
Though we will not use these additional options, we can also have filter2 return the full
length of the filtered signal (the length of the input signal plus the order of the filter) like this:
>> yy = filter2(bb,xx,'full');
or just the portion not affected by edge effects (the length of the input signal minus the order of the filter), like this:
>> yy = filter2(bb,xx,'valid');
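These trimming conventions are easy to check numerically. As an illustrative aside (the lab itself uses MATLAB), the following Python/NumPy sketch mimics filter and the three filter2 output options for a 3-point running average; np.convolve plus slicing plays the role of the MATLAB commands, and the ramp signal is invented for the example:

```python
import numpy as np

bb = np.ones(3) / 3.0              # 3-point running-average coefficients
xx = np.arange(10, dtype=float)    # an invented ramp input, length 10

full = np.convolve(bb, xx)         # 'full': length len(xx) + len(bb) - 1 = 12

# Causal filtering (like MATLAB's filter): keep the start-up transient,
# drop the ending transient, so the output is delayed relative to xx.
causal = full[:len(xx)]

# 'same': a centered slice of the full convolution, so there is no delay.
start = (len(bb) - 1) // 2
same = full[start:start + len(xx)]

# 'valid': only the samples computed without running off either end of xx.
valid = np.convolve(bb, xx, mode='valid')

print(len(full), len(causal), len(same), len(valid))
```

Comparing causal and same element by element shows the one-sample delay of the causal output for this length-3 filter.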
• 2-D Filtering in M ATLAB: Three approaches to filtering a two-dimensional signal were men-
tioned in Section 6.2.4. The first approach, which simply applies a one-dimensional filter to
each row of the image (alternatively, to each column), can be implemented with the MATLAB
commands described in the previous bullet.
The second approach applies a one-dimensional filter first to the columns and then to the rows
of the image produced by the first stage of filtering. If the one-dimensional filter is causal
with coefficients bk contained in the M ATLAB vector bb and the image is contained in the
2-dimensional matrix xx, then this approach can be implemented with the command
>> yy = filter(bb,1,filter(bb,1,xx)')';
Note that we do not need to vectorize the image xx, because when presented with a matrix,
filter applies one-dimensional filtering to each column. However, to perform the second
stage of filtering (on rows of the image produced by the first stage), we need to transpose the
image produced by the first stage of filtering and then transpose the final result again to
Figure 6.5: The coefficients of g_smooth filters with widths 3, 4, 5, and 6, plotted as coefficient
amplitude versus coefficient number.
restore the original orientation. This approach will introduce edge effects at the top and on
one side of the image; however, the resulting image will be the same size as xx.
The third approach uses a two-dimensional set of coefficients bk,l . If these coefficients are
contained in the matrix bb and the image is contained in the matrix xx, then the filter can be
implemented with the command
>> yy = filter2(bb,xx,'same');
Note that the 'same' parameter indicates that the filter is non-causal and thus the b0,0 coefficient
is located as near to the center of the matrix bb as possible. The alternate third
parameters for filter2 that are listed in the 1-D filtering bullet above apply here as well.
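The second and third approaches coincide when the two-dimensional coefficients are separable, i.e., when bk,l is the product of two one-dimensional coefficient sets. As an illustrative aside, this Python/SciPy sketch (with an invented 8 × 8 random “image”) checks the equivalence numerically, using lfilter and convolve2d in place of filter and filter2:

```python
import numpy as np
from scipy.signal import lfilter, convolve2d

bb = np.ones(3) / 3.0                          # 1-D smoothing coefficients
xx = np.random.default_rng(0).random((8, 8))   # invented 8 x 8 test "image"

# Approach 2: filter down the columns, then across the rows
# (mirrors  yy = filter(bb,1,filter(bb,1,xx)')'  in MATLAB).
yy_sep = lfilter(bb, 1, lfilter(bb, 1, xx, axis=0), axis=1)

# Approach 3: a true 2-D coefficient set; for a separable filter it is the
# outer product of the 1-D coefficients.  Keeping the causal (top-left)
# portion of the full 2-D convolution matches the two-stage result.
yy_2d = convolve2d(xx, np.outer(bb, bb), mode='full')[:8, :8]

print(np.allclose(yy_sep, yy_2d))
```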
• Generating filter coefficients: We will be examining the effects of many types of filters in
this laboratory. Some have filter coefficients that can be generated easily in M ATLAB. Others
require a function (which we will provide to you) to generate. Note that the vectors
representing the bk ’s will be column vectors. For example, the command
>> bb = g_smooth(0.8);
returns the coefficients of a g_smooth filter with width 0.8.
5. In Section 6.2.5, we presented a general-purpose two-dimensional edge-finding filter in
equation (6.14). The coefficients for this filter are given by
>> bb = [.25, -1, .25; -1, 3, -1; .25, -1, .25];
6. In Section 6.2.5, we also discussed a method for implementing a sharpening filter. Since
we include a threshold operation, this operation is nonlinear and cannot be accomplished
using only an FIR filter. Thus, we provide the sharpen command, which takes an
image and a sharpening “strength” and returns a sharpened image:
>> yy = sharpen(xx,0.7);
The second parameter is the strength factor, b, as discussed in Section 6.2.5. A sharpen-
ing strength of 0 passes the signal without modification.
7. As described in Section 6.2.3, median filters are a special type of nonlinear filter, and
they cannot be described using linear difference equations. To use a median filter on a
one-dimensional signal, we use the command2 medfilt1 like this:
>> yy = medfilt1(xx,N);
N is the order of the median filter, which simply describes how many samples we con-
sider when taking the median. In two dimensions3, we use medfilt1 twice:
>> yy = medfilt1(medfilt1(xx,N)',N)';
Again, N is the order of the median filter. Here, we are using a one-dimensional filter on
both the rows and columns of the image. Note that since medfilt1 operates down the
columns, we need to transpose the image between the filtering operations and again at
the end.
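As an illustrative aside, the following Python/SciPy sketch (scipy.signal.medfilt plays the role of medfilt1; the signal is invented) shows why median filters suit impulsive noise: an isolated outlier is removed completely, while a moving average merely spreads it out:

```python
import numpy as np
from scipy.signal import medfilt

xx = np.full(11, 10.0)     # an invented constant signal ...
xx[5] = 255.0              # ... corrupted by a single "salt" sample

yy_med = medfilt(xx, kernel_size=3)                    # order-3 median filter
yy_avg = np.convolve(xx, np.ones(3) / 3, mode='same')  # 3-point average

# At the corrupted sample: the median of (10, 255, 10) is 10, so the
# outlier vanishes; the average (10 + 255 + 10)/3 only dilutes it.
print(yy_med[5], yy_avg[5])
```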
(a) (Effects of delay) First, we’ll examine the delay introduced by the two filtering
implementations, filter and filter2, that we will be using. Filter the signal simple with a 7-
point running average filter. Do this twice, first using filter and then using filter2
with the 'same' parameter4 .
• Use subplot and plot to plot the original signal and two filtered signals in three
subplots of the same figure.
• One of the filtering commands has introduced some delay. Which one? How many
samples of delay have been added?
• Compute the mean-squared error between the original signal and the two filtered
signals. Which is lower? Why?
(b) (Measuring distortion in 1-D) Now, use filter2 to apply the same 7-point running
average filter to the signal simple_noise. Referring to Figure 6.3, we consider
simple to be the signal of interest x[n], simple_noise to be the noise corrupted
signal s[n], and their difference to be the noise, v[n] = s[n] − x[n]. Note that the lower
of the two mean-squared errors that you computed in Problem 1a is MS(x̂[n] − x[n]),
which is a measure of the distortion of the signal of interest introduced by the filter.
• Compute the mean-squared error between simple and simple_noise. Refer-
ring back to Figure 6.3, this is MS(v[n]), the mean-squared value of the noise.
• Compute the mean-squared error between your filtered signal and simple. This
value is MS(ŝ[n] − x[n]), which is a measure of how good a job the filter has
done at recovering the signal of interest.
• Determine the distortion due to noise at the output of your reconstruction filter (i.e.,
MS(v̂[n])) by subtracting MS(x̂[n] − x[n]) from MS(ŝ[n] − x[n]).
• Compare MS(v̂[n]) and MS(ŝ[n] − x[n]) to MS(v[n]). What is the dominant
source of distortion in this filtered signal?
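As an illustrative aside, the bookkeeping above can be sketched in Python/NumPy. The signals here are invented stand-ins for simple and simple_noise (a slow sinusoid plus white noise), not the actual lab data:

```python
import numpy as np

def ms(v):
    """Mean-squared value of a signal."""
    return float(np.mean(np.asarray(v, dtype=float) ** 2))

rng = np.random.default_rng(1)
x = np.sin(2 * np.pi * 0.01 * np.arange(500))   # stand-in for "simple"
s = x + 0.3 * rng.standard_normal(500)          # stand-in for "simple_noise"

bb = np.ones(7) / 7.0                           # 7-point running average
s_hat = np.convolve(s, bb, mode='same')         # filtered noisy signal
x_hat = np.convolve(x, bb, mode='same')         # filter applied to clean signal

ms_v = ms(s - x)               # MS(v[n]): noise power before filtering
ms_total = ms(s_hat - x)       # MS(s_hat[n] - x[n]): total output distortion
ms_dist = ms(x_hat - x)        # MS(x_hat[n] - x[n]): distortion of x itself
ms_v_hat = ms_total - ms_dist  # approximately MS(v_hat[n]), residual noise

print(ms_v, ms_total, ms_dist, ms_v_hat)
```

With these stand-ins the residual noise term dominates the total output distortion; the filter's own distortion of the slow sinusoid is comparatively small.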
(c) (Running average filters in 1-D) Use filter2 to apply a 3-point, a 5-point, and a
9-point moving average filter to simple_noise.
• Use plot and subplot to plot the original signal, the three filtered signals, and
the three sets of filter coefficients, in seven panels of the same figure.
• Compute the mean-squared error between each filtered signal and simple.
• Which of the four moving average filters that you have applied has the lowest mean-
squared error? Compare this value to MS(v[n]).
(d) (Tapered smoothing filter in 1-D) Download the file g_smooth.m, and use it to gen-
erate filter coefficients with “widths” of 0.5, 0.75, and 1.0. (Note the lengths of the re-
turned coefficient vectors. You should plot the filter coefficients to get a sense of how the
“width” factor affects them.) Use filter2 to apply these filters to simple_noise.
• Use plot and subplot to plot the three filtered signals and the three sets of
coefficients in six panels of the same figure.
• Compute the mean-squared error between each filtered signal and simple.
• Which of these filtered signals has the lowest mean-squared error? Compare this
value to the lowest mean-squared error that you found for the moving average filters
and to MS(v[n]).
4 Henceforth, every time you use filter2 in this laboratory, you should use the 'same' parameter.
2. (Noise reduction on images) In this problem, you will look at the effects of applying smoothing filters
to an image for noise reduction. Download the files peppers.tif5 and peppers_noise1.tif.
The first is a “noise-free” image, while the second is an image corrupted by random noise.
Load these two images into M ATLAB.
(a) (Examining 2-D filter coefficients) We’ll be using the function g_smooth2 to produce
filter coefficients for this problem. To get a sense of what these coefficients look like,
generate the coefficients for a g_smooth2 filter with width 5. In two side-by-side
subplots of the same figure:
• Display the coefficients as an image using imagesc.
• Generate a surface plot of the coefficients using the command surf(bb) (assum-
ing your coefficients matrix is called bb).
(b) (Examine the effects of noise) First, we’ll consider the noisy signal peppers_noise1.
• Use subplot to display peppers and peppers_noise1 side-by-side in a
single figure. Remember to set the color map, set the axis shape, and include a
colorbar as you did in lab 4.
• Compute the mean-squared error between these two images.
(c) (Minimizing the MSE) Our goal is to find a g_smooth2 reconstruction filter that min-
imizes the mean-squared error between the filtered image and the original, noise-free
image. Use filter2 when filtering signals in this problem.
• Find a filter width that minimizes the mean-squared error. What is this filter width
and the corresponding mean-squared error? (Hint: you might want to plot the mean-
squared error as a function of filter width.)
• Display the filtered image with the smallest mean-squared error.
• Look at some filtered images with different widths. Can you find one that looks
better than the minimum mean-squared error image6? What filter width produced
that image?
3. (Salt and pepper noise in images) Next, we’ll look at methods of removing a different type
of random noise from this image. Download the file peppers_noise2.tif and load it
into M ATLAB. This signal is corrupted with salt and pepper noise, which may result from a
communication system that loses pixels.
(a) (Examining the noise) First, let’s see what we’re up against. Salt and pepper noise
randomly replaces pixels with a value of either 0 or 255. In this image, one-fifth of the
pixels have been lost in this manner.
• Display peppers_noise2.
• Compute the mean-squared error between this image and peppers.
(b) (Using lowpass filters) Now, let’s try using some g_smooth2 filters to eliminate this
noise. Start by using filter2 to filter peppers_noise2 with a g_smooth2 filter
of width 1.3. Note that this is very close to the optimal width value.
• Display the resulting image.
5 Like “cameraman”, “peppers” is a standard image used for testing image processing routines. Our version, however, is
smaller than the traditionally used image.
6 Though mean-squared error is widely used as a measure of signal distortion, it is well known that its judgments of quality
do not always agree with those of human viewers.
4. (Edge-finding and enhancing) In this last problem, we’ll look at edge-finding and sharpening
filters.
(a) (Applying a first difference filter) In order to see how edge-finding filters work, let’s start
in one dimension. Use filter to apply a one-dimensional first difference filter to the
signal simple (which can be found in lab6_data.mat).
• Plot the resulting signal.
• There are five non-zero “features” of this signal. (These features should be clear
from the plot.) Describe them and what they correspond to in simple.
(b) (“Finding” edges) Now we’d like to look at the effects of the general edge-finding filter
presented in Section 6.2.5. Use filter2 to apply this filter to peppers.
• Display the resulting image.
• Describe the resulting image.
• Zoom in on the filtered image and examine some of the more prominent edges.
What do you notice about these edges? (Hint: Are they just a “ridge” of a single
color?)
(c) (Sharpening an image) Download sharpen.m and use the function to display several
sharpened versions of the peppers image.
• Display the sharpened image with a “strength” of 1 alongside the original peppers
image using subplot.
• Zoom in on this sharpened image. What makes it look “sharper”? (Hint: Again,
look at the prominent edges of the images. What do you notice?)
• The sharpened images (especially for strengths greater than 1) generally appear
more “noisy” than the original image. Speculate as to why this might be the case.
(d) (Using sharpening to remove smoothing) Finally, we want to try using the “sharpen”
function to undo a blurring operation. Download the file peppers_blur.tif and
load it into M ATLAB.
• Compute the RMS error between peppers and peppers_blur.
• Use sharpen to “de-blur” the blurred image. Find the sharpening strength that min-
imizes the RMS error of the “de-blurred” image. Include this strength and its cor-
responding RMS error in your report.
• Display the “de-blurred” image with the minimum RMS error alongside peppers_blur
using subplot. Include the resulting figure in your report.
Note that sharpening is very much a perceptual operation. The minimum-distortion
sharpened image may not look much improved. Look at what happens as you
increase the sharpening factor even more. With additional “sharpening,” the (measured)
distortion may increase, but the result looks better perceptually.
5. On the front page of your report, please provide an estimate of the average amount of time
spent outside of lab by each member of the group.
7.1 Introduction
In Lab 6, you examined the behavior of several different filters. Some of the filters were “smoothing
filters” that averaged the signal over many samples. Others were “sharpening” filters that accentu-
ated transitions and edges. While it is very useful to understand the effects of these filters in the
time-domain or (for images) the spatial-domain, it is often not easy to quantify these effects, es-
pecially when we are dealing with more complicated filters. Thus, just as we did with signals, we
would like to obtain a better understanding of the behavior of our filters in the frequency-domain.
Assuming that our filter is linear and time-invariant, we can talk about the filter having a fre-
quency response. We derive the frequency response in the following way. We know that if we
put a complex exponential signal into such a filter, the output will be a scaled and shifted com-
plex exponential signal with the same frequency. The amount of scaling and phase shift, though,
is dependent on the frequency of the input signal. If we send a complex exponential signal with
some frequency through the filter, we can measure the scaling and phase shifting of that signal. The
collection of complex numbers which corresponds to this scaling and shifting for all possible fre-
quencies is known as the filter’s frequency response. The magnitude of the frequency response at a
given frequency is the filter’s gain at that frequency.
In this lab, we will be using the frequency response of filters to examine the problem solved by
telephone touch-tone dialing. The problem is this: given a noisy audio channel (like a telephone
connection), how can we reliably transmit and detect phone numbers? The solution, which was
developed at AT&T, involves the transmission of a sum of sinusoids with particular frequencies.
In order for this solution to be feasible, we must be able to easily decode the resulting signal to
determine which numbers were dialed. We will see that we can do this easily by considering filters
in the frequency domain.
7.2 Background
7.2.1 DTMF signals and Touch ToneTM Dialing
Whenever you hit a number on a telephone touch pad, a unique tone is generated. Each tone is
actually a sum of two sinusoids, and the resulting signal is called a dual-tone multifrequency (or
DTMF) signal. Table 7.1 shows the frequencies generated for each button. For instance, if the “6”
button is pressed, the telephone will generate a signal which is the sum of a 1336 Hz and a 770 Hz
sinusoid.
Table 7.1: DTMF encoding table for touch tone dialing. When any key
is pressed, the tones of the corresponding row and column are generated.

           1209 Hz   1336 Hz   1477 Hz
  697 Hz      1         2         3
  770 Hz      4         5         6
  852 Hz      7         8         9
  941 Hz      *         0         #
We will call the set of all seven frequencies listed in this table the DTMF frequencies. These
frequencies were chosen to minimize the effects of signal distortions. Notice that none of the DTMF
frequencies is a multiple of another. We will see what happens when the signal is distorted and why
this property is important.
Looking at a DTMF signal in the time domain does not tell us very much, but there is a common
signal processing tool that we can use to view a more useful picture of the DTMF signal. The
spectrogram is a tool that allows us to see the frequency properties of a signal as they change over
time. The spectrogram works by taking multiple DFTs over small, overlapping segments1 of a
signal. The magnitudes of the resulting DFTs are then combined into a matrix and displayed as
an image. Figure 7.1 shows the spectrogram of a DTMF signal. Time is shown along the x-axis
and frequency along the y-axis. Note the bars, each of which represents a sinusoid of a particular
frequency existing over some time period. At each time, there are two bars which indicate the
presence of the two sinusoids that make up the DTMF tone. From this display, we can actually
identify the number that has been dialed; you will be asked to do this in the lab assignment.
1 Note that each segment is some very small fraction of a second, and the segments usually overlap by 25-75%.
Figure 7.1: A spectrogram of a DTMF signal. Each horizontal bar indicates a sinusoid that exists
over some time period.
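As an illustrative aside, the spectrogram computation can be sketched with Python/SciPy (the segment length, overlap, and test tone are arbitrary choices for this example; the lab itself uses MATLAB's tools):

```python
import numpy as np
from scipy.signal import spectrogram

fs = 8192                                   # sampling rate used in this lab
t = np.arange(0, 0.5, 1 / fs)
x = np.sin(2 * np.pi * 770 * t) + np.sin(2 * np.pi * 1336 * t)  # the "6" tone

# Magnitudes of DFTs taken over short, overlapping segments of the signal.
f, tt, Sxx = spectrogram(x, fs=fs, nperseg=256, noverlap=128)

# The two strongest frequency bins sit near the two tone frequencies,
# which is exactly what the horizontal bars in Figure 7.1 represent.
peaks = f[np.argsort(Sxx.mean(axis=1))[-2:]]
print(sorted(peaks))
```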
From this equation, we have an FIR filter with order M . (Note that the support length of the impulse
response is M + 1.) What should M be? M is a design parameter. You may remember from Lab 3
that correlating over a long time produces better estimates of similarity. Thus, we should get better
differentiation between passed frequencies and rejected frequencies if M is large. There is a tradeoff,
though. The larger M is, the more computation is required to perform the convolution. Thus
for efficiency reasons we would like M to be as small as possible. More computation also equates
to more expensive devices, so we prefer smaller M for reasons of device economy as well. Since
we have seven DTMF frequencies, we will also have seven bandpass filters in our system; in our
decoder system, we will choose a different value of M for each bandpass filter.
Because of the relatively small set of frequencies of concern in DTMF decoding, we will see that
larger values of M do not necessarily produce better frequency differentiation. In order to judge how good a
bandpass filter is at rejecting unwanted DTMF frequencies, we will define the gain-ratio, R. Given
a bandpass filter with center frequency fc , the gain-ratio is

R = (the filter’s gain at fc ) / (the maximum of the filter’s gains at the frequencies fˆ),

where fˆ ranges over the set of DTMF frequencies and fˆ ≠ fc . In words, we define R to be the ratio of
the filter’s gain at its center frequency to the next-highest gain at one of the DTMF frequencies.
Having a high gain-ratio is desirable, since it indicates that the filter is rejecting the other possible
frequencies.
Figure 7.2: A block diagram of the DTMF decoder system. The input is a DTMF signal, and the
output is a string of numbers corresponding to the original signal. Each of the seven DTMF
frequencies has its own branch consisting of a bandpass filter (e.g., the 697 Hz filter), a rectifier,
and a lowpass filter.
Note that since we will be comparing the outputs of a variety of bandpass filters, we also need to
normalize each filter by the center frequency gain. Thus, we will need to record not only the M that
we select but also the center frequency gain. You will be directed to record and include these gains
in the lab assignment.
Figure 7.3: A comparison of half-wave and full-wave rectification. Notice that full-wave rectifica-
tion allows us to achieve a higher output signal level after lowpass filtering.
The order of this filter is MLP . The value MLP (and thus the corresponding strength of the smooth-
ing filter) is a design parameter of the decoder system. When choosing MLP , there is a tradeoff
between the amount of smoothing and transient effects. If our filter’s impulse response is not long
enough, the output signal will still have significant variations. If it is too long, transient effects will
dominate the output of the filter, and the system may “smooth over” short DTMF tones
or periods of silence. Note that in our decoder system, we will apply the same smoothing filter to the
output of each filter. Figure 7.3 shows the results of smoothing for half-wave and full-wave rectified
signals.
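As an illustrative aside, the advantage of full-wave rectification can be checked numerically. In this Python/NumPy sketch (the 770 Hz input and the smoothing-filter order are invented choices), the smoothed full-wave output settles at roughly twice the level of the half-wave output:

```python
import numpy as np

fs = 8192
t = np.arange(0, 0.1, 1 / fs)
x = np.sin(2 * np.pi * 770 * t)     # idealized bandpass-filter output

half = np.maximum(x, 0.0)           # half-wave rectifier: clip negative lobes
full = np.abs(x)                    # full-wave rectifier: fold them upward

M_LP = 200                          # invented smoothing-filter order
h = np.ones(M_LP + 1) / (M_LP + 1)  # moving-average smoothing filter
half_smooth = np.convolve(half, h, mode='valid')
full_smooth = np.convolve(full, h, mode='valid')

# Full-wave rectification keeps twice as much signal area, so the smoothed
# level is about twice as high (2/pi versus 1/pi for a unit sinusoid).
print(full_smooth.mean() / half_smooth.mean())
```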
Figure 7.4: An illustration of the detector subsystem. (a) A clean DTMF signal is compared to a
threshold, c. (b) The threshold should be set so that noise neither produces false tone detections nor
causes true tones to be missed. (c) Near the threshold crossing, noise can cause
multiple detections.
every sample of the input signal. Instead, we only make a decision every 100 samples. This makes
it more likely that there will only be one decision made in the vicinity of the threshold crossing. It
also reduces computation time somewhat. Note that the number 100 is somewhat arbitrary. We can
choose a smaller number, but then we increase the risk of the multiple-crossing problem. Alternatively,
we can make it larger; however, if we make it too large, our detector may miss short tones or silences.
The second step is to decode the DTMF tones that we have detected in the previous step.
By “decode,” we simply mean that we must decide which key was pressed to generate a particular
DTMF tone. To do this, we determine which two bandpass filters have the largest output at each
time when a DTMF tone was detected. Then, we effectively perform a table look-up to see which
key was pressed at these times. The result is a sequence of decoded numbers corresponding to key
presses. However, each DTMF tone will generally produce a sequence of identical numbers since
it is “decoded” at many times during the duration of the DTMF tone. To translate this sequence of
numbers into a sequence of key presses, we need a third step.
The third step simply combines adjacent, identical numbers in the decoded sequence. That is, a
“run” of identical numbers is replaced by a single number. Through this process, each DTMF tone
is finally represented by a single number. Note that for this process to work correctly, our sequence
of numbers must also contain an indication of when no tone was present. Otherwise, any repeated
key press would be decoded as only a single key press.
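As an illustrative aside, the third step can be sketched in a few lines of Python (the convention that 0 marks “no tone” is invented for this example):

```python
from itertools import groupby

def collapse_runs(decoded):
    """Replace each run of adjacent, identical numbers by a single number."""
    return [key for key, _ in groupby(decoded)]

# Invented decoder output: 0 marks "no tone detected".  The silence markers
# are what keep the repeated 5s from collapsing into a single key press.
decoded = [0, 5, 5, 5, 0, 0, 5, 5, 0, 9, 9, 9, 0]
keys = [k for k in collapse_runs(decoded) if k != 0]
print(keys)   # -> [5, 5, 9]
```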
Whenever designing a communication system, like the DTMF coder/decoder described here, it is
important to consider how the system behaves in the presence of undesirable effects. For instance,
the telephone system could corrupt our DTMF signal with some amount of static. Under such
conditions, how well would the decoder work? How much noise can the system tolerate? These
are all questions about the robustness of the decoder system to noise. No system can work perfectly
under less than ideal conditions, so it is important to understand when and how a system will fail. In
the lab assignment, we will examine the robustness of this system under noise.
• Computing frequency responses: The MATLAB command freqz evaluates the frequency
response of an FIR filter:
>> [H,w] = freqz(bb,1,n);
Here, bb is the set of filter coefficients (i.e., the impulse response) of the FIR filter, n is the
number of points in the range [0, π) at which to evaluate the frequency response, H is the
frequency response, and w is the set of n corresponding discrete-time frequencies, which are
spaced uniformly from 0 to π. The frequency response, H, is a vector of complex numbers
which define the gain (abs(H)) and phase-shift (angle(H)) of the filter at the given fre-
quencies.
Alternatively, we can evaluate the frequency response only at a specified set of frequencies by
replacing n with a vector of discrete-time frequencies. Thus, the command
>> H = freqz(bb,1,ww);
evaluates the frequency response at the discrete-time frequencies in the vector ww.
When we apply a filter to a sampled signal with sampling frequency fs (in samples per sec-
ond), we can evaluate the frequency response at the discrete-time frequencies corresponding
to a specified set of continuous-time frequencies in Hertz in the following manner:
>> H = freqz(bb,1,2*pi*ff/fs);
This converts the specified continuous-time frequencies (in the vector ff) into discrete-time
frequencies and evaluates the frequency response at those points.
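As an illustrative aside, SciPy's freqz mirrors the MATLAB command just described (though it returns the frequency vector first). This sketch evaluates the gains of an invented 8-point averaging filter at the seven DTMF frequencies, using the same conversion from Hertz to discrete-time frequency:

```python
import numpy as np
from scipy.signal import freqz

fs = 8192
bb = np.ones(8) / 8.0          # an invented 8-point averager, for illustration

# Default: the response at 512 frequencies spaced uniformly over [0, pi).
w, H = freqz(bb, 1, worN=512)

# Or evaluate only at chosen continuous-time frequencies (in Hz), converted
# to discrete-time frequencies by w = 2*pi*f/fs.
ff = np.array([697.0, 770, 852, 941, 1209, 1336, 1477])  # DTMF frequencies
w2, H2 = freqz(bb, 1, worN=2 * np.pi * ff / fs)

gains = np.abs(H2)             # gain of the filter at each DTMF frequency
print(gains.round(3))
```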
• Sorting a vector: The M ATLAB command sort sorts a vector in ascending order. Thus,
given a vector x, the command
>> y = sort(x);
produces a vector y such that y(1) is the smallest value in x and y(end) is the largest value
in x.
• Creating matrices of ones and zeros: In order to create arrays of arbitrary size containing
only ones or only zeros, we use the M ATLAB ones and zeros commands. Both commands
take the same set of input parameters. If only one input parameter is used, a square matrix
with the specified number of rows and columns is generated. For instance, the command
>> x = ones(5);
produces a 5 × 5 matrix of ones. Two parameters specify the desired number of rows and
columns in the matrix. For instance, the command
>> x = zeros(4,8);
produces a 4 × 8 matrix (i.e., four rows and eight columns) containing only zeros. To generate
column vectors or row vectors, we set the first or second parameter to 1, respectively.
• The DTMF Dialer: dtmf_dial.m is a DTMF “dialer” function. It takes a vector of key
presses (i.e., a phone number) and produces the corresponding audio DTMF signal. Note that
this function as provided is incomplete; you will be directed to complete it in the laboratory
assignment. (The lines of code that you need to complete are marked with a ?.) To produce
the DTMF signal that lets you dial the number 555-2198, use the command:
>> signal = dtmf_dial([5 5 5 2 1 9 8]);
An optional second parameter will cause the function to display a spectrogram of the resulting
DTMF signal:
>> signal = dtmf_dial([5 5 5 2 1 9 8],1);
This function assumes a sampling frequency of 8192 samples per second. Each DTMF tone
has a length of 1/2 second, and the tones are separated by 1/10 second of silence. Note that
the number 10 corresponds to a '#', 11 corresponds to a '0', and 12 corresponds to a '*'.
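As an illustrative aside, the signal construction that dtmf_dial performs for a single key can be sketched in Python/NumPy (variable names invented; the actual MATLAB function is provided with the lab):

```python
import numpy as np

fs = 8192                        # sampling frequency used by dtmf_dial
t = np.arange(0, 0.5, 1 / fs)    # each tone lasts 1/2 second

# Key "6" lies in the 770 Hz row and the 1336 Hz column of Table 7.1,
# so its tone is the sum of those two sinusoids.
tone6 = np.sin(2 * np.pi * 770 * t) + np.sin(2 * np.pi * 1336 * t)

silence = np.zeros(int(0.1 * fs))  # 1/10 second of silence between tones
signal = np.concatenate([tone6, silence, tone6])   # e.g., dialing "66"
print(len(signal) / fs)            # total duration in seconds
```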
• The DTMF Decoder: dtmf_decode.m is an (incomplete) DTMF decoder function.
(Once again, the lines of code that you need to complete are marked with a ?.) It takes a
DTMF signal (as generated by dtmf_dial) and returns the sequence of key-presses used to
create the signal. Thus, if our DTMF signal is stored in signal, we decode the signal using
the command:
>> keys = dtmf_decode(signal);
An optional second parameter will cause the function to display a plot of the smoothed and
rectified outputs of each bandpass filter:
>> keys = dtmf_decode(signal,1);
• Testing the robustness of the DTMF decoder: dtmf_attack.m is a function that tests
the DTMF decoder in the presence of random noise. This function generates a standard seven
digit DTMF signal, adds a specified amount of noise to the signal, and then passes it through
your completed dtmf_decode function. The decoded string of key presses is compared
to those that generated the signal. Since the noise is random, this procedure is repeated ten
times. The function then outputs the fraction of trials decoded successfully. The function
also displays the plot from the last execution of dtmf_decode. (Note: since each call to
dtmf_decode takes a little time, this function is rather slow. Be patient with it.)
For instance, to test the system with a noise power of 2.5, we use the following command:
>> dtmf_attack(2.5)
The result is a number that provides the fraction of the 10 trials that were successful.
Note that dtmf_attack is a complete function, but it calls both dtmf_dial and dtmf_decode,
each of which you must complete.
• Complete the function and include the code in your lab report.
• Using your newly completed dialer function, execute the following command to create
a DTMF signal and display its spectrogram:
>> signal = dtmf_dial([1 2 3 4 5 6 7 8 9 10 11 12],1);
Include the resulting figure in your report. Note how each key press produces a different
pattern on the spectrogram.
• What is the phone number that has been dialed in Figure 7.1?
2. (The bandpass filters of the DTMF Decoder.) As we have noted, a key part of the DTMF
decoder is the bank of bandpass filters that is used to detect the presence of sinusoids at the
DTMF frequencies. We have specified a general form for the bandpass filters, but we still
need to choose the filter orders and create their impulse responses. In this problem you will
be identifying good values for M .
(a) (The impulse response of one bandpass filter.) First, we need to be able to create the
impulse response for a bandpass filter. Using equation (7.1) with a sampling frequency
fs = 8192 Hz and M = 50, use M ATLAB to create a vector containing the impulse
response, h, of a 770 Hz bandpass filter3 .
• What is the command that you used to create this impulse response?
• Use stem to plot your impulse response.
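Equation (7.1) is not reproduced in this excerpt. Assuming the cosine-modulated form h[n] = cos(2π fc n/fs ) for n = 0, . . . , M (this form is an assumption; check equation (7.1) for the exact expression and any scaling factor), the construction can be sketched in Python/NumPy as:

```python
import numpy as np

# ASSUMED form of equation (7.1):  h[n] = cos(2*pi*fc*n/fs),  n = 0, ..., M.
# Verify against the lab text before using.
fs = 8192            # sampling frequency (given in the problem)
fc = 770             # center frequency of this bandpass filter, in Hz
M = 50               # filter order; the support length is M + 1
n = np.arange(M + 1)
h = np.cos(2 * np.pi * fc * n / fs)

print(h.shape)       # (M + 1,) coefficients, i.e. (51,)
```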
(b) (The frequency response of one bandpass filter.) When we talk about the response of
a filter to a particular frequency, we can think about filtering a unit amplitude sinusoid
with that frequency and measuring the amplitude and phase shift of the resulting signal.
We can certainly do this in M ATLAB, but it’s far simpler to use the freqz command.
Here, you’ll use freqz to examine the frequency response and gain-ratio of a bandpass
filter like the ones we’ll use in the DTMF decoder.
• Use freqz to calculate the frequency response of your 770 Hz bandpass filter at all
seven of the DTMF frequencies4 . Calculate the gain at each frequency, and include
these numbers in your report.
• From the frequency response of your filter at these frequencies, calculate the gain-
ratio, R.
• Do you think that this is a good gain-ratio for our bandpass filters? (Hint: You
might want to come back to this problem after you’ve worked the remainder of this
problem.)
(c) (Choosing M for this bandpass filter.) Now, we’d like to see what happens when we
change M for your 770 Hz bandpass filter. We’ve provided you with a function that will
facilitate this. Download the file dtmf_filt_char.m. This function will help you to
visualize the frequency response of these filters and to determine their gain at the DTMF
frequencies.
• Use this function to verify that the gains you calculated in Problem 2b were correct.
• Include the frequency-response plot that dtmf_filt_char produces in your re-
port.
3 Remember that if a filter has order M, the support length of the impulse response should be M + 1.
4 Remember that our system uses a sampling frequency of 8192 Hz.
• The frequency response of this filter is characterized by several “humps” which are
typically called lobes. Describe the frequency response in terms of such lobes. Vary
M and examine the plots that result (you do not need to include these plots). De-
scribe the differences in the frequency response as M (which represents the length
of the filter’s impulse response) is changed.
• What happens to the relative heights of adjacent lobes as M is changed?
• What features of the filter’s frequency response contribute to the gain ratio R?
• For what values of M do we achieve gain ratios greater than 10?
(d) (A function for computing gain ratios.) You’ll need to compute the gain-ratio repeatedly
while finding good design parameters for the bandpass filters, so in this problem you’ll
automate this task. Write a function that accepts a vector of gains (such as that returned
by dtmf_filt_char) and computes the gain ratio, R. (Hint: This is a simple function
if you use the sort command. You can assume that the center frequency gain is the
largest value in the vector of gains.)
• Include the code for this function in your report.
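A sketch of such a function, in Python/NumPy for illustration (your lab version should be a MATLAB function built around sort), assuming the center-frequency gain is the largest entry in the vector:

```python
import numpy as np

def gain_ratio(gains):
    """Ratio of the largest gain (assumed to be the center-frequency gain)
    to the next-highest gain in the vector."""
    s = np.sort(np.asarray(gains, dtype=float))   # ascending, like sort
    return s[-1] / s[-2]

print(gain_ratio([0.2, 12.0, 0.9, 1.1]))   # -> 12.0 / 1.1
```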
(e) (Specifying the bandpass filters.) For each bandpass filter that corresponds to one of the
seven DTMF frequencies, we want to find a choice of M that yields a good gain ratio
but also minimizes the computation required for filtering.
To do this, for each bandpass filter frequency, use dtmf_filt_char and your func-
tion from Problem 2d to calculate R for all M between 1 and 200. Then, plot R as a
function of M . You can save some computation time by setting the third parameter of
dtmf_filt_char to zero to suppress plotting. You should be able to identify at least
one local maximum5 of R on the plot. The “optimal” value of M that we are looking
for is the smallest one that produces a local maximum of R that is greater than 10.
• Create this plot of R as a function of M for the bandpass filter with a center fre-
quency of 770 Hz. Include the resulting plot in your report.
• Identify the “optimal” value of M for this filter, the associated center frequency
gain, and the resulting value of R.
• Repeat the above two steps for the remaining six bandpass filters. (You do not need
to include the additional plots in your report.) Create a table in which you record
the center frequency, the optimal M value, the associated center frequency gain,
and the resulting value of R.
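The sweep described above might look like the following sketch. We are assuming here that dtmf_filt_char takes the center frequency, the order M, and the plot-suppression flag in that order and returns the vector of gains (check the file itself for the exact interface), and that a gain-ratio function like the one from Problem 2d is on your path:

```matlab
% Sweep R over M = 1..200 for the 770 Hz bandpass filter.
% Assumed interface: gains = dtmf_filt_char(fc, M, 0), with the third
% argument set to zero to suppress plotting, as described above.
fc = 770;
Ms = 1:200;
R  = zeros(size(Ms));
for k = 1:length(Ms)
    gains = dtmf_filt_char(fc, Ms(k), 0);
    R(k)  = gain_ratio(gains);   % gain-ratio function from Problem 2d
end
plot(Ms, R);
xlabel('Filter order M');
ylabel('Gain ratio R');
```

The “optimal” M is then read off the plot as the smallest M at a local maximum with R greater than 10.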
3. (Completing the DTMF decoder.) Now we have designed the bank of bandpass filters that
we need for the DTMF decoder. In this problem, we’ll use the parameters that we found to
help us complete the decoder design. Download the file dtmf_decode.m. This function is
a nearly complete implementation of the DTMF decoder system described earlier in this lab.
There are several things that you need to add to the function.
(a) (Setting the M ’s and the gains of the bandpass filters.) First, you need to record your
“optimized” values of M and the center frequency gains in the function. Replace the
question marks on line 29 by a vector of your optimized values of M . They should be
in order from smallest frequency to largest frequency. Do the same on line 32 for the
variable G, which contains the center frequency gains.
5 A local maximum is basically just a point on the plot that is larger than all other values in its vicinity. It may or may not be the largest value on the entire plot.
• Make these modifications to the code. (At the end of this problem, make sure that
you include your completed function in your report.)
(b) (Setting the impulse responses of the bandpass filters.) Also, you need to define the
impulse response for each bandpass filter on line 49. Use equation (7.1) for this, where
the filter’s order is given by M(i).
• Make this modification to the code.
(c) (Selecting the order of the post-rectifier smoothing filter.) Next, you need to specify
the post-rectifier smoothing filter, h_smooth. Temporarily set both h_smooth (line
36) and threshold (line 40) equal to 1 and run dtmf_decode on the DTMF signal
you generated in Problem 1. This function displays a figure containing the rectified and
smoothed outputs for each bandpass filter. With h_smooth equal to 1, no smoothing
is done and we only see the results of the rectifier in this figure. We will use moving
average filters of order MLP, as defined by the MATLAB command
>> h_smooth = ones(M_LP+1,1)/(M_LP+1);
We want the smoothed output to be effectively constant during most of the duration of
the DTMF tones, but we don’t want to smooth so much that we might miss short DTMF
tones or pauses between tones.
• Examine the behavior of the smoothed signal when you replace line 36 with moving
average filters with order MLP equal to 20, 200, and 2000. Which filter order, MLP,
gives the best tradeoff between transient effects and smoothing?
• Set h_smooth to be the filter you have just selected.
(d) (Detection threshold.) Finally, you need to identify a good value for threshold.
threshold determines when our system detects the presence of a DTMF signal.
dtmf_decode plots the threshold on its figure as a black dotted line. We want the
threshold to be smaller than the large amplitude signals during the steady-state portions
of a DTMF signal, but larger than the signals during the start-up transients for each
DTMF tone. (Hint: When choosing a threshold, consider what might happen if we add
noise to the input signal.)
• By looking at the figure produced by dtmf_decode, what would be a reasonable
threshold value? Why did you choose this value?
• Set threshold to the value you have just selected.
• Now, execute dtmf_decode and include the resulting plot in your report. (Note:
You can include this plot in black and white, if you like.)
• dtmf_decode should output the same vector of “key presses” that was used to
produce your signal. What “key presses” does the function produce? Do these
match the ones used to generate the DTMF signal? If not, you’ve probably made a
poor choice of threshold.
(e) Remember to include the code for your completed dtmf_decode function in your
report.
4. (Robustness of the DTMF decoder to noise.) In the introduction to this lab, we indicated that
we would be transmitting our DTMF signals over a noisy audio channel. So far, though, we
have assumed that the decoder sees a perfect DTMF signal. In this problem, we will examine
the effects of additive noise on the DTMF decoder.
(a) Download the file dtmf_attack.m. Execute dtmf_attack with various noise
powers. Find a value of noise power for which some but not all of the trials fail.
• What value of noise power did you find? (Hint: use the parameter searching method
discussed in the background section to speed your search).
• Make a plot of the fraction of successes versus noise power. Include at least 10
values on your plot. Make sure that your minimum noise power has a success rate
at (or at least near) 1 and your maximum noise power has a success rate at (or near)
0. Try to get a good plot of the transition between high success rates and low success
rates. While making this plot, pay attention to the types of errors that the decoder is
making.
(b) By examining the plots for failure trials and the types of errors that the decoder is mak-
ing, you should be able to speculate about the source of the errors.
• What types of errors is the system making when it decodes the noisy signals?
• Speculate about what you could do to the decoder in order to increase the system’s
tolerance to additive noise.
5. On the front page of your report, please provide an estimate of the average amount of time
spent outside of lab by each member of the group.
8.1 Introduction
The ability to recognize and categorize things is fundamental to human cognition. A large part of our
ability to understand and deal with the world around us is a result of our ability to classify things.
This task, which is generally known as classification, is important enough that we often want to
design systems that are capable of recognition and categorization. For instance, we want vending
machines to be able to recognize the bills inserted into the bill changer. We want internet search
engines to classify web pages based on their relevance to our query. We want computers that can
recognize and classify speech properly so that we can interact with them naturally. We want medical
systems that can classify unusual regions of an x-ray as cancerous or benign. We want high speed
digital communication modems that can determine which sequence of, say, 64-ary signals was
transmitted.
There is a vast array of applications for classification. We have actually already seen some of
these applications. Detection, which we studied in Labs 1 and 2, is a form of classification where
we choose from only two possibilities. In this lab, we consider one popular application of multiple-alternative classification: speech recognition. In particular, we will focus on a simplified version
of speech recognition, namely, vowel classification. That is, we will experiment with systems that
classify a short segment of an audio signal that corresponds to a spoken vowel, such as an
“ah”, an “ee”, an “oh”, and so on. (We won’t deal with how one determines that a given segment
corresponds to a vowel.) In the process, we will develop some of the basic ideas behind automatic
classification.
One of these basic ideas is that an item to be classified is called an instance. For example, if
each of 50 short segments of speech must be individually classified, then each segment is considered
to be one instance. A second basic idea is that there is a finite set of prespecified classes to which
instances may belong. The goal of a classifier system (or simply a classifier) is to determine the
class to which a presented instance belongs. A third basic idea is that to simplify the process, the
classification of a given instance is based on a set of feature values. This set is a relatively small
list of numbers that, to an appropriate degree, describes the given instance. For example, a short
segment of speech might contain thousands of samples, but we will see that vowel classification can
be based on feature sets with as few as two components. A fourth basic idea is that classification is
often performed by comparing the feature values for an instance to be classified with sets of feature
values that are representative of each class. The output of the classifier will be the class whose
representative feature values are most similar, in some appropriate sense, to the feature values of the
instance to be classified.
[Figure 8.1 appears here: block diagram of a classifier system. An instance enters the feature calculator, which produces a feature vector; the feature classifier compares this vector with the class representative feature vectors and outputs a class label.]
8.2 Background
8.2.1 An Introduction to Classification
You may recall Lab 7, in which we developed a system for decoding DTMF signals into the sequence
of key-presses that produced the original signal. Our DTMF decoder was actually performing clas-
sification on each segment of the DTMF signal. Classification is a process in which we examine
instances of some thing (like an object, a number, or a signal) and try to determine which of a num-
ber of groups, or classes, each instance belongs to. We can think of this as a labeling process. In
our DTMF decoder, for example, we looked at a given segment of the signal and labeled it with a
number corresponding to an appropriate key press.
Generally, classification is a two-stage process. Figure 8.1 shows a block diagram of a classifier
system. First, we need some information about the instance that we are considering. This infor-
mation is traditionally referred to as a set of features. If we are classifying people, for instance,
we might use height, weight, or hair color as features. If we are classifying signals, we might use
power, the output of some filter, or the energy in a certain spectral band as features. So that we
can deal with our features easily, we generally like to have a set of measurable features to which
we can assign numerical feature values. When we are using more than one feature to describe an
instance, we typically place all of the feature values into a feature vector, f = (f 1 , f2 , . . . , fN ). N
is the number of elements in the feature vector and is called the dimension of the feature vector.
A feature vector is calculated for each instance we wish to classify by measuring the appropriate
aspects of that instance. As shown in Figure 8.1, the first block is the “feature calculator,” which
takes an instance (a signal, say) and produces the set of numerical feature values. For
our DTMF decoder, our features were the spectral strength of a given segment of the signal at each
DTMF frequency. That is, the feature calculator produced a seven-element feature vector, one for
each DTMF frequency.
The second stage of classification, the “feature classifier” (which we have previously called
a “decision maker”), uses the feature vector to decide which class an instance belongs to.
Generally, we make this decision by comparing the feature vector for an instance to each member
of a set of representative feature vectors, one for each class under consideration. The idea is that
the feature classifier labels the instance as the class that has the most similar representative feature
vector. We will discuss the specifics of the feature classifier after we have presented a classification
example.1
Before we continue, we should note the relationship between what we previously called “detec-
tion” and what we now call “classification”. Detection generally refers to binary “signal present”
or “signal not present” decisions. For instance, in Lab 1 we used energy to decide whether a signal
was present or not, and in Lab 2 we used correlation to make such decisions. As such, detection is
generally considered to be a special case of the more general notion of classification, which refers to
decisions among two or more classes. However, this usage is not universal. For example, “detection”
is sometimes used to describe a system that decides which of 64 potential signals was transmitted
to a modem, each representing a distinct pattern of 6 bits. This lab assignment also generalizes the
idea, used in Labs 1 and 2, that decisions are made on a single number or feature. However, as noted
before, Lab 7 also used such a generalization.
[Figure 8.2 appears here: a histogram of the number of flowers versus flower height (36–56 cm), with separate bars for Type A and Type B plants.]
Figure 8.2: A simple example where one feature (plant height) is sufficient to perform classification.
This histogram shows how many plants have a given height.
[Figure 8.3 appears here: left, a histogram of the number of flowers versus flower height (cm) for Type A and Type B plants; right, a scatter plot of leaf length versus flower height for the same plants.]
Figure 8.3: An example where a histogram of one feature is not sufficient to perform perfect clas-
sification (left), but a scatter plot of two features shows a clear separation between the two classes
(right).
In the histogram of Figure 8.2, we see two easily distinguished clusters of feature values. The type A plants form a cluster with heights centered
around a mean (i.e., average) of 50 centimeters, and the type B plants form a cluster with heights
centered around a mean of 40 centimeters. Most importantly, the two clusters do not overlap. This
suggests that classification can indeed be based on plant height.
How do we use this information to classify a new plant (i.e., a new instance)? Intuitively, if the
new plant’s height is closer to the Type A mean of 50 cm than to the Type B mean of 40 cm, we
should classify the plant as type A rather than type B. In this case, we can use a simple threshold test.
If a new plant’s height is greater than 45 cm (which is halfway between the two mean feature values),
we classify the new plant as Type A. Conversely, if its height is less than 45 cm, we classify it as
Type B. In other words, for each class, we use the mean feature value as the class representative,
and we compare the feature value of a new instance to be classified (its height) to the two means
and decide in favor of the class whose representative feature value (its mean) is closest to the feature value of
the given instance.
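In MATLAB, this one-feature threshold rule is a single comparison; the variable names below are ours, not from the lab files:

```matlab
% Classify a plant from its height alone; 45 cm is the midpoint between
% the Type A mean (50 cm) and the Type B mean (40 cm).
height = 47;                 % height of the new plant, in cm
if height > 45
    plant_type = 'A';
else
    plant_type = 'B';
end
```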
Figure 8.3 (left) shows a histogram of plant heights in a more troublesome scenario. In this case,
Type A plants still tend to be taller than Type B plants, but there are a significant number of plants
that we will confuse (that is, misclassify) if we decide exclusively using this one feature. Though
we will typically need to deal with some classification error, we can often reduce it by adding more
features. Suppose we measure not only the height of the plant but also the average length of its
leaves. Now, instead of a histogram, we can look at the training set of features using a scatter plot,
in which we plot a point for each feature vector in our training set. We put one of the two features
along each of the plot’s axes. For example, a scatter plot for the two features just mentioned for each
plant is shown in Figure 8.3 (right). Here we again see two distinct clusters, which suggests that we
can classify with little error by using these two features together.2
How do we design a classifier for this case? We cannot simply use a threshold on one of the
features. Instead, we will use a more general decision rule, which is based on mean feature vectors
and distances between an instance and the mean feature vectors. First, let us consider the two
features as a two-dimensional vector f = (f1 , f2 ). Thus, if a plant is 52 cm tall and has leaves with
2 In this case, classification using either feature by itself will result in many classification errors. That is, by itself, neither
feature is sufficient to separate the two clusters. One can see this by projecting the scatter plot onto either one of the axes.
When we do so, we see that the two feature values of the two classes are intermingled, rather than remaining distinct.
average length of 44 mm, our feature vector is f = (52, 44). Now, given the feature vectors from
each plant in our design set of one type, we want to calculate a mean feature vector for plants of
that type. Since the mean feature vector indicates the central tendency of each feature in a class, we
use it as a representative of the entire class. To calculate a mean feature vector in this case, we first
take the mean, m1 , of all of the plant heights for plants of one type. Then we take the mean, m2 , of
all of the leaf lengths for plants of the same type. The mean feature vector is then f̄ = (m1 , m2 ).
Note that this is the general procedure for calculating the mean of a set of vectors, regardless of the
vector’s dimension. On the scatter plot in Figure 8.3, we’ve plotted the locations of mean feature
vectors with large symbols.
As with the one-feature case, we will classify new instances based on how close they are to
each of the mean feature vectors. To do this, we still need to know how to calculate distances
between two feature vectors. For simplicity, we will calculate distances using the Euclidean distance
measure3 . The Euclidean distance between two vectors is simply the straight-line distance between
their corresponding points on a scatter plot like that in Figure 8.3. To calculate the distance, d,
between two feature vectors (f1 , f2 ) and (m1 , m2 ), we simply use the formula
d = sqrt( (f1 − m1)^2 + (f2 − m2)^2 )    (8.1)
Euclidean distance generalizes to any number of dimensions; the general formula can be found
later in equation (8.2). Note that the Euclidean distance is essentially the RMS difference (i.e.,
RMS “error”) between two vectors4 , which we have used repeatedly throughout this course. Here,
though, we refer to the computation as “Euclidean distance”, rather than RMS difference, to motivate
a geometric interpretation of classification.
Now that we have designed a classifier for this case, we can finally consider the classification of
a new instance. To classify a new instance, we first calculate the distances between that instance’s
feature vector and the mean feature vectors of each class. Then, we simply classify the instance
as a member of the class for which the distance is smallest. Consider what this means in terms of
the scatter plot. Given a new instance, we can plot its feature vector on the scatter plot. Then, we
classify based on the nearest mean feature vector. For a two-class case such as that shown in Figure
8.3, there exists some set of points that are equally far from both mean feature vectors. These points
form a decision line that separates the plane into two halves. We can then classify based on the half
of the plane on which a feature vector falls. For example, in Figure 8.3, any plant with a feature
vector that falls above the line will be classified as type B. Similarly, any plant with a feature vector
that falls below the line will be classified as type A.
With this classification rule, we can correctly classify almost all of our training instances. How-
ever, note that we’re not classifying perfectly: there is one rogue Type B instance close to the rest of the
Type A’s. In general, though, we will need to accept more error than this.
Of course, two features may not be enough either. If our scatter plot looked like the one in
Figure 8.4, then we can still see the two clusters, but we can’t perfectly distinguish them based only
on these two features. The line we draw for our distance rule will properly classify most of the
instances, but many are still classified incorrectly. Once again, we can either accept the errors that
will be made or we can try to find another feature to help us better distinguish between the two
classes. Unfortunately, visualizing feature spaces with more than two dimensions is rather difficult.
However, the intuition we’ve built for two-dimensional feature spaces extends to higher dimensions.
We can calculate mean feature vectors and distances in roughly the same way regardless of the
number of dimensions.
3 There are a wide variety of possible distance measures; Euclidean distance is certainly not the only choice.
4 The two calculations actually differ by a scaling factor, since RMS involves a mean while Euclidean distance involves a
sum.
[Figure 8.4 appears here: a scatter plot for Type A and Type B plants, with flower height on the horizontal axis.]
Figure 8.4: An example where two features are not as clearly separated.
1 That is, the class whose representative feature vector the instance is most similar to. In this lab, we use Euclidean distance to measure similarity, and so we use a distance-based feature classifier.
Other types of feature classifier are also possible, such as a correlation-based feature classifier.
[Figure 8.5 appears here: four scatter plots, panels (A) through (D), each showing Feature 2 versus Feature 1.]
Figure 8.5: (A) Classes overlap, so the features do not allow much discrimination; these are bad
features. (B) Feature 2 aids discrimination, but Feature 1 does not. (C) An example with four
distinct classes; decision lines are approximate. (D) An example with three indistinct classes.
The feature classifier relies on a set of representative feature vectors, one for each class. We will denote the representative feature
vector for the cth class as f̄c = (f̄c,1, . . . , f̄c,N), where c = 1, . . . , C.
Given a set of representative feature vectors (the choice of such will be discussed later), we can
classify new instances using the feature classifier. The feature classifier (seen in Figure 8.6) has
two steps. The first step computes the distances between the input feature vector and each of the
class representatives. As we have done in the previous sections, we will use Euclidean distance
in our system. Equation (8.1) gives the formula for Euclidean distance in two dimensions. For a
general, N-dimensional feature space, we use the following equation. Let u = (u1, . . . , uN) and
v = (v1, . . . , vN) be two N-dimensional vectors (i.e., arrays with length N). We calculate the
Euclidean distance between them as

d(u, v) = sqrt( sum_{i=1}^{N} (vi − ui)^2 ) = sqrt( (v1 − u1)^2 + (v2 − u2)^2 + · · · + (vN − uN)^2 ).    (8.2)
Again, we note that, to within a scaling factor, Euclidean distance is equivalent to the root mean
squared error between two vectors.
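In MATLAB, equation (8.2) is a single line; for example, with two made-up feature vectors:

```matlab
% Euclidean distance between two feature (row) vectors, per equation (8.2).
u = [7.5 10.2];                % hypothetical feature vectors
v = [8.1  9.6];
d = sqrt(sum((v - u).^2));     % straight-line distance between the two points
```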
The second step of the feature classifier applies a decision rule to select the best class for the
input instance. The decision rule that we will use is the nearest class representative rule. This simply
[Figure 8.6 appears here: the feature vector enters a distance calculator, which uses the class representative feature vectors to compute the distances to the class representatives; a decision rule then produces the class label.]
Figure 8.6: Block diagram of a distance-based feature classifier, which makes the decision in a
general classifier system.
means that the classifier decides the class whose representative feature vector is closest (in
Euclidean distance) to the feature vector of the instance being classified. That is, if f is the feature
vector for an instance to be classified, then the decision rule decides class c if d(f, f̄c) is less than
d(f, f̄c′) for all other classes6 c′. Other decision rules, which may weight the distances from the
various class representatives, are also possible, but they will not be considered here7.
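A sketch of the nearest-class-representative rule, using the one-feature-vector-per-row convention adopted later in this lab; the variable names and numbers are ours:

```matlab
% f is a 1-by-N feature vector; M is a C-by-N matrix whose cth row is the
% representative feature vector for class c.
f = [8.0 10.1];
M = [7.6 10.0; 8.8 9.3];            % two hypothetical class representatives
D = M - repmat(f, size(M,1), 1);    % difference from each representative
dists = sqrt(sum(D.^2, 2));         % Euclidean distance to each class
[dmin, label] = min(dists);         % decide the class with the smallest distance
```

Here label comes out as 1, since f is closer to the first representative feature vector than to the second.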
Let us now discuss how to choose the class representative feature vectors f̄1 , . . . , f̄C . Finding
these vectors is the main aspect in feature classifier design. We have previously suggested that we
can find a representative feature vector for a class by taking the mean across some set of instances
that belong to that class. We describe this calculation formally as follows. Suppose that we have
a set of N-dimensional feature vectors from M instances of a given class c (this is the design set
of instances for this class). Let f̃i = (f̃i,1, f̃i,2, . . . , f̃i,N) denote the ith such feature vector. We
calculate the mean feature vector, f̄c = (f̄c,1, . . . , f̄c,N), for this class as

f̄c = (1/M) sum_{i=1}^{M} f̃i = (1/M)(f̃1 + f̃2 + · · · + f̃M).    (8.3)
Alternatively, we can say that the jth element of the mean feature vector, f̄c, is

f̄c,j = (1/M) sum_{i=1}^{M} f̃i,j = (1/M)(f̃1,j + f̃2,j + · · · + f̃M,j).    (8.4)
7 Note that our DTMF signal classifier from Lab 7 used a simpler “feature classifier” that was based on neither distance
nor correlation. However, with a little extra work it could have been formulated as either of these types of classifier, most
likely without a degradation of performance.
The diagonal elements, Kn,n, show what fraction of instances from the nth class were correctly
classified. The .9 in the upper left corner, for instance, indicates that 90% of instances from the first
class were classified as belonging to the first class. Thus, higher diagonal elements are desirable.
The off-diagonal element Kn,m indicates what fraction of instances from the mth class were
misclassified as belonging to class n. In the example above, for instance, the .02 in the upper right
corner indicates that 2% of instances in the fifth class were incorrectly classified as belonging to the
first class. Thus, we hope that off-diagonal elements are as small as possible. The confusion matrix
for a perfect classifier will be an identity matrix (i.e., ones on the diagonal, zeros elsewhere).
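For instance, with a hypothetical two-class confusion matrix, the overall fraction of correct classifications (assuming equal numbers of test instances per class) is just the average of the diagonal:

```matlab
% Hypothetical confusion matrix; each column sums to 1, since column m
% gives the fractions of class-m instances assigned to each class.
K  = [0.90 0.05;
      0.10 0.95];
pc = mean(diag(K));      % overall correct-classification rate: 0.925
```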
[Figure 8.7 appears here: four plots of |S(ω)| in dB versus frequency from 0 to 4000 Hz.]
Figure 8.7: The magnitude spectrum (in decibels) of four vowel signals. The plots on the left
correspond to two instances of an “ee” vowel, as in the word tree. The plots on the right correspond
to two instances of an “ah” vowel, as in the word father. The solid line is a smoothed version of the
spectrum, which shows the general trends of the spectrum.
A vowel signal is approximately periodic, and so it consists of a sum of sinusoids with harmonically related frequencies. As with the DTMF signals in Lab 7, the time
domain provides relatively little information about the signal. So, as with the DTMF signals, this
suggests that we need to examine vowels in the frequency domain. Figure 8.7 shows examples of
the magnitude spectrum (in decibels8 ) of two different vowels. The plots on the left correspond to
an “ee” vowel (as in the word tree), while the plots on the right correspond to an “ah” vowel (as
in the word father). Also shown is a smoothed version of each spectrum, which shows its general
trend.
There are a number of interesting things to note about these plots. First, we can see the peaks that
correspond to the harmonics that make up the periodic signal. Notice that the peaks are spaced more
closely in some plots than others, corresponding to a lower fundamental frequency and thus a longer
fundamental period. As illustrated by this figure, though, the fundamental frequency of the signal is
independent of the vowel being produced. Notice that the overall shape of the frequency spectrum
is different between the two vowels, but remains relatively constant between the two instances of
each vowel, as can be seen from the smoothed versions. This shape determines the timbre9 of the
sound, and, correspondingly, the “sound” of the vowel. Notice that there are peaks in the smoothed
8 To convert a number, x, into decibels, we use the formula xdB = 20 log10 (x).
9 Pronounced “tambor.”
spectrum at various places. These peaks are called formants; it is generally known that the positions
of these formants are the primary feature that distinguishes one vowel from another, i.e., that makes
one vowel sound different from another.
Unfortunately, there is no solid definition of a “formant,” and they are remarkably difficult to
identify automatically. In fact, there is some disagreement as to what constitutes a formant in some
cases. In this lab, we’ll work with two sets of features that hopefully capture the information con-
tained in the formant positions. In Lab 9, we’ll investigate the use of another, somewhat more
sophisticated feature for vowel recognition. This feature actually models speech production, and
thus should more readily capture the relevant aspects of the vowel signal.
The first feature set that we use in this lab will be the formant features. The formant features
attempt to locate the formants themselves using a simple algorithm. This algorithm first uses the
DFT to compute the spectrum of a short segment of a vowel. Then, the spectrum is smoothed using
a weighted averaging filter. Finally, the algorithm returns frequencies of the largest peaks on the
smoothed signal that occur above and below 1500 Hz. Thus, there are two formant features, so the
resulting feature vector is two-dimensional.
The second feature set, the filter bank features, is quite similar to the features used in the
DTMF decoder. The filter bank features compute the energy (in decibels) of the speech signal after
it has been passed through a bank of six bandpass filters. We will use bandpass filters with center
frequencies of 600 Hz, 1200 Hz, 1800 Hz, 2400 Hz, 3000 Hz, and 3600 Hz. Thus, the resulting
feature vectors are six-dimensional.
Note that there are a large number of vowels that we could possibly consider. However, for
simplicity we will restrict attention to just five vowels: “ee” (as in tree), “ah” (as in father), “ae” (as
in fate), “oh” (as in boat), and “oo” (as in moon). Each of these five vowels will be its own class.
This command can also be used to simultaneously convert a vector of values to decibels.
• Calculating features for a vowel signal: As indicated, the features we would like to consider
in order to classify a vowel signal are based on the signal’s spectrum. We provide functions to
calculate the two feature sets described in this laboratory. Each function takes an audio wave-
form, x, and (optionally) the sampling frequency in samples per second, fs. (If no sampling
frequency is specified, a sampling frequency of 8192 samples per second is assumed.) Both
functions return a row vector, y, that contains the features calculated from the waveform.
To compute the “formant features,” use calc_formants.m:
>> y = calc_formants(x,fs);
To compute the “filter bank features,” use calc_fbank.m:
>> y = calc_fbank(x,fs);
• Working with feature vectors in MATLAB: In this lab, we will adopt the convention that
a feature vector is a row vector, and that a set of feature vectors, such as a set of class repre-
sentatives or a set of testing data, is stored in a matrix such that there is one feature vector per
row and one feature per column. This allows us to easily compute mean feature vectors from
such a matrix.
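With this convention, MATLAB’s mean function computes a class’s mean feature vector (equation (8.3)) in a single call, since by default it averages down each column. The numbers below are made up for illustration:

```matlab
% Hypothetical design set for one class: 3 instances (rows), 2 features (columns).
F    = [52 44;
        49 41;
        51 45];
fbar = mean(F);      % mean down each column: the 1-by-2 mean feature vector
```

Here fbar is [50.6667 43.3333], the componentwise average of the three rows.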
When computing Euclidean distances, note that the computation is almost the same as that
which we used for computing RMS error. The only difference is that we replace the mean
operation by a summation.
• Advanced plotting: You may recall from Lab 1 that we can use MATLAB’s plot command
to change the color and style of plotted lines. A line-style string consists of as many as three
parts. One part specifies a color (for instance, ‘k’ for black or ‘r’ for red). Another part
specifies the type of markers at each data point (for instance, ‘*’ uses asterisks while ‘o’
specifies circles). The third part specifies the type of line used to connect the points (‘:’
specifies a dotted line, while ‘-’ specifies a solid line). Note that these three parts can occur
in any order, and all are optional. If no color is specified, one is chosen automatically. If no
marker is specified, a marker will be not be plotted. If a marker is specified but a line type is
not, then lines will not be drawn between data points. Thus, the command:
>> plot(x1,y1,'rx',x2,y2,'k:');
will plot y1 versus x1 using red x’s with no connecting line, and also y2 versus x2 with
a dotted connecting line but no marker. See help plot for more details.
Additionally, we can change the width of lines and the size of markers using additional
parameter-pairs. For instance, to increase the line width to 2 and the marker size to 18, use
the command
>> plot(x1,y1,'rx--','Linewidth',2,'Markersize',18);
The function outputs a column vector of class labels, labels, with one label for each row
of fmatrix. The labels are numbers that indicate which representative feature vector the
corresponding instance is closest to. Thus, if the first element of labels is a 3, it means that
the feature vector in the first row of fmatrix is closest to the representative feature vector
in the third row of M.
>> K = confusion_matrix(M,class1,class2,class3);
Note that the function confusion_matrix works for any number of classes. The size of
the confusion matrix is determined by the number of input parameters.
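Its behavior can be sketched as follows (illustrative Python/NumPy, not the lab's implementation; nearest_rep is a hypothetical stand-in for the feature classifier):

```python
import numpy as np

def nearest_rep(M, fmatrix):
    # 1-based index of the closest row of M for each row of fmatrix.
    d = np.linalg.norm(fmatrix[:, None, :] - M[None, :, :], axis=2)
    return np.argmin(d, axis=1) + 1

def confusion_matrix(M, *classes):
    """Sketch of confusion_matrix: pass one feature matrix per class,
    in the same class order as the rows of M. K[i, j] counts instances
    of class i classified as class j; the size of K is set by the
    number of class matrices passed in."""
    n = len(classes)
    K = np.zeros((n, n), dtype=int)
    for i, fmatrix in enumerate(classes):
        labels = nearest_rep(M, np.asarray(fmatrix))
        for j in range(n):
            K[i, j] = np.sum(labels == j + 1)
    return K
```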
• Vowel spectra
• Vowel features
• Calculate and include the filter bank feature vector for this vowel.
• What are the center frequencies of the filters that have the greatest output amplitude?
Compare this to your estimated formant locations.
2. (Mean feature vectors and hand classification.) lab8_data.mat also has several other
variables, including matrices containing features for 50 instances of each vowel. The vari-
ables ah_form, ee_form, ae_form, oh_form, and oo_form contain formant feature
vectors for each vowel. Each matrix has 50 rows (one for each instance) and two columns
(one for each feature value). Similarly, the variables ah_fbank, ee_fbank, ae_fbank,
oh_fbank, and oo_fbank contain the filter bank feature vectors with one row per instance.
(a) (Mean feature vectors) Calculate the mean feature vectors for each vowel class and for
both feature classes.
• Include the five mean formant feature vectors in your report. Make sure you label
them.
• Include the five mean filter bank feature vectors in your report. Again, make sure
you label them.
(b) (Generate a scatter plot) We would like to see how separable the classes are from the
formant feature vectors. To do this, you’ll create a scatter plot that plots the first formant
location versus the second formant location. Plot each of the feature vectors, using a
different color and marker symbol for each vowel. Make sure you include a legend.
Also, plot the mean vector for each class on your scatter plot. (Hint: To make your
mean vectors stand out, you should increase the line width and marker size for just those
points.)
• Include the scatter plot in your report.
• Interpret this scatter plot. Are all of the classes distinct and easily separated? Do
you expect any vowels to be frequently confused? Do you expect any vowels to
frequently be classified correctly?
Food for thought: What would happen if we only used one of these two features for
classification? Which would give us better classification results? Do you think that a
third formant feature might improve class separation?
(c) (Hand classification) Now, you’ll “classify” the signal vowel1 by hand. To do this,
you’ll need to compute the distance between the instance and the five mean feature
vectors.
• Compute the distances between the formant feature vector that you generated for
vowel1 and the five mean formant feature vectors.
• Compute the distances between the filter bank feature vector that you generated for
vowel1 and the five mean filter bank feature vectors.
• Using the nearest-representative decision rule, use the above results to classify
vowel1. Do the results for both feature sets agree? If not, which feature set
produces the correct answer?
3. (Complete and use the feature classifier code.) In this problem, you'll complete and then
use a function, feature_classifier.m, that does this classification for us automatically.
As described in the background section, the function takes as input a matrix of
representative feature vectors and a matrix of instances to classify. We output a label for each
of the instances in the input.
(a) Complete the function. (Hint: You should use two for loops. One loops over the rows
of the matrix of test instances. Then, for each instance, loop over the rows of M and
compute the distances. To make the classification for each instance, find the position of
the smallest distance and store it in labels.)
• Include your code in your report.
(b) (Test feature_classifier on the formant features.) Place your mean formant fea-
ture vectors into a matrix, M_form with one feature vector per row. For consistency, put
“oo” in the first row, “oh” in the second row, “ah” in the third row, “ae” in the fourth row,
and “ee” in the fifth row. Call your feature_classifier function using this matrix
and ee_form. (Hint: To make sure your function works correctly, you should compare
its output to the completed and compiled function feature_classifier_demo.dll.
If you did not successfully complete feature_classifier, you can use this demo
function throughout the remainder of the lab.)
• What fraction of these instances is properly classified?
• Calculate the fraction of the instances that are misclassified as each of the incorrect
classes. That is, determine the fraction that are misclassified as “ah,” the fraction
misclassified as “ae,” and so on.
• From this data, what vowel is “ee” most often misclassified as?
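The fractions asked for above reduce to counting labels. A sketch (Python/NumPy for illustration; the label vector is made up, with class 5 standing for the true class):

```python
import numpy as np

# Hypothetical classifier output for ten instances whose true class is 5.
labels = np.array([5, 5, 3, 5, 5, 2, 5, 3, 5, 5])

# Fraction properly classified, and fraction sent to each wrong class.
correct_fraction = np.mean(labels == 5)
misclass = {k: np.mean(labels == k) for k in range(1, 6) if k != 5}
```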
(c) Repeat the above with the filter bank features, this time using the matrix ee_fbank
and generating the matrix M_fbank. Use the same order for your classes. (Again, you
should compare your function’s output to the output of feature_classifier_demo.dll).
(a) Use confusion_matrix to compute the confusion matrix for the formant features.
Use M_form as your set of class representatives. Use the following vowel order for
your remaining input parameters: “oo,” “oh,” “ah,” “ae,” and “ee.” (This should be
the same as the order of the classes in M_form.)
• Include this confusion matrix in your report. Label each row and column with the
corresponding class.
• In Problem 3b, you computed a portion of the confusion matrix. Identify that por-
tion and verify that your results were correct.
• From this confusion matrix, determine how many instances of “ee” vowels were
misclassified as “oo” vowels.
• Which vowel is most commonly misclassified using this feature set?
(b) Use confusion_matrix to compute the confusion matrix for the filter bank features.
Use M_fbank as your set of class representatives. Use the following vowel order for
your remaining input parameters: “oo,” “oh,” “ah,” “ae,” and “ee.” (This should be
the same as the order of the classes in M_fbank.)
• Include this confusion matrix in your report. Label each row and column with the
corresponding class.
• In problem 3c, you computed a portion of the confusion matrix. Identify that portion
and verify that your results were correct.
• From this confusion matrix, determine how many instances of “ae” vowels were
misclassified as “ah” vowels.
• Which vowel is most commonly misclassified using this feature set?
(c) Finally, compare the two confusion matrices.
• Which feature set has the best performance overall?
• Based on the performance of these classifiers, comment on the spectral similarities
between the various vowels. That is, are any of the vowel classes particularly like
any of the other vowel classes?
5. On the front page of your report, please provide an estimate of the average amount of time
spent outside of lab by each member of the group.
• The above function also returns the set of features that it computed from the recorded vowel. If
you collect these features into a series of matrices of testing data, you can use confusion_matrix.m
to formally compute the performance of this classifier on your voice.
• How well does the classifier work on other people’s voices? Are there certain people for whom
it works very well? Very poorly? Does it work if we vary the pitch?
• The compiled functions listed above take an optional input parameter, which is a matrix of rep-
resentative feature vectors. Since the current classifier is designed using only one speaker’s
vowels, maybe you can improve the performance by coming up with a better set of repre-
sentative feature vectors. To do this, consider gathering a set of vowels from a number of
different speakers and combining them into a set of mean feature vectors. Can you improve
the performance of the system?
9.1 Introduction
So far, we’ve been considering filters as systems that we design and then apply to signals to achieve a
desired effect. However, filtering is also something that occurs everywhere, without the intervention
of a human filter designer. At sunset, the light of the sun is filtered by the atmosphere, often yielding
a spectacular array of colors. A concert hall filters the sound of an orchestra before it reaches your
ear, coloring the sound and adding pleasing effects like reverberation. Even our own head, shoulders,
and ears form a pair of filters that allows us to localize sounds in space.
Quite often, we may wish to recreate these filtering effects so that we can study them or apply
them in different situations. One way to do this is to model these “natural” filters using simple
discrete-time filters. That is, if we can measure the response of a particular system, we would often
like to design a filter that has the same (or a similar) response.
One of the goals for this laboratory is to introduce the use of discrete-time filters as models of
real-world filters. In particular, we will examine how to apply a modeling approach to understanding
vowel signals. This in turn will suggest a way that we might improve the performance of the vowel
classifier we developed in Lab 8 using an automatic modeling method.
Another goal of this lab is to present a method of filter design called pole-zero placement design.
Working with this method of filter design is extremely useful for building an intuition of how the
z-plane “works” with respect to the frequency domain that you are already familiar with. The design
interface that we use for this task should help you to develop a graphical understanding of how poles
and zeros affect the frequency response of a system. We will use this design methodology both to
design a traditional “goal-oriented” lowpass filter, and to do some filter modeling.
9.2 Background
9.2.1 Filters and the z-transform
Previously, we have presented the general time-domain input-output relationship for a causal filter 1
given by the convolution sum:
y[n] = x[n] ∗ h[n] = Σ_k h[k] x[n − k] = Σ_k x[k] h[n − k] ,  (9.1)
where x[n] is the input signal, y[n] is the output signal, and h[n] is the filter impulse response.
Using the z-transform techniques described in Chapter 7 of DSP First, we can also describe the
input/output relationship in the z-domain as
Y(z) = H(z)X(z) ,  (9.2)
where X(z) is the z-transform of x[n], which is the complex-valued function, defined on the com-
plex plane2 by
X(z) = Σ_n x[n] z^−n ,  (9.3)
where Y (z) is the z-transform of y[n], defined in a similar fashion, and where H(z) is the system
function of the filter, which is a complex-valued function defined on the complex plane by one of
the following equivalent definitions:
1. The system function is the z-transform of the filter impulse response h[n], i.e
H(z) = Σ_n h[n] z^−n .  (9.4)
2. For X(z) and Y(z) as defined above, the system function is given by
H(z) = Y(z)/X(z) .  (9.5)
The system function has a very important relationship to the frequency response of a system,
H(ω̂). The system function evaluated at ej ω̂ is equal to the frequency response evaluated at fre-
quency ω̂. That is,
H(ω̂) = H(ej ω̂ ) . (9.6)
We can derive this result from equation (9.4). If we let z = e^{jω̂}, then H(z) = H(e^{jω̂}) =
Σ_n h[n] e^{−jω̂n}, which is simply the definition of a system's frequency response given its impulse
response h[n].
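This relationship is easy to verify numerically. The sketch below (Python/NumPy for illustration; the three-tap impulse response is arbitrary) evaluates the frequency-response sum and the system function at z = e^{jω̂} and confirms that they agree:

```python
import numpy as np

h = np.array([1.0, -0.5, 0.25])   # arbitrary FIR impulse response
w = 0.3                           # a test frequency (radians/sample)
n = np.arange(len(h))

# Frequency response: sum over n of h[n] e^{-j w n}
H_freq = np.sum(h * np.exp(-1j * w * n))

# System function evaluated on the unit circle at z = e^{jw}
z = np.exp(1j * w)
H_z = np.sum(h * z ** (-n))
```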
1 Note that in this lab, we will only be concerned with causal filters.
2 The complex plane is simply the set of all complex numbers. The real part of the complex number is indicated by the
x-axis, while the imaginary part is indicated by the y-axis.
where we have used the fact that the z-transform of x[n − n0] is X(z) z^−n0. Dividing both sides of
the above by X(z) gives the system function:
H(z) = Y(z)/X(z) = b0 + b1 z^−1 + b2 z^−2 + b3 z^−3 + · · · + bM z^−M .  (9.9)
Notice that H(z) is a polynomial of order M . We can factor the above complex-valued polynomial
as3
H(z) = K(1 − r1 z −1 )(1 − r2 z −1 )(1 − r3 z −1 ) · · · (1 − rM z −1 ) , (9.10)
where K is a real number called the gain, and {r1 , . . . , rM } are the M roots or zeros of the polyno-
mial, i.e. the values r such that H(r) = 0. We typically assume that the filter coefficients b k are real.
In this case, the zeros may be real or complex, and if one is complex, then its complex conjugate is
also a zero. That is, complex roots come in conjugate pairs.
The very important point to observe now from equation (9.10) is that the system function H(z)
of a causal FIR filter is completely determined by its gain and its zeros. Therefore, we can think of
{K, r1 , . . . , rM } as one more way to describe a filter4 . We will see that when it comes to designing
an FIR filter to have a certain desired frequency response, the description of the filter in terms of its
gain and its zeros is by far the most useful. In other words, the best way to design a filter to have a
desired frequency response (e.g., a low pass filter) is to appropriately choose its gain and zeros. One
may then find the system function by multiplying out the terms of equation (9.10), and then picking
off the filter coefficients from the system function. For example, the number multiplying z −3 in the
system function is the filter coefficient b3 . The specific procedure will be described shortly.
The fact that we may design the frequency response of a causal FIR filter by choosing its zeros 5
stems from the following principle:
If a filter has a zero r located on the unit circle, i.e. |r| = 1, then H(∠r) = 0, i.e. the
frequency response has a null at frequency ∠r. Similarly, if a filter has a zero r located
close to the unit circle, i.e. |r| ≈ 1, then H(∠r) ≈ 0, i.e. the frequency response has a
dip at frequency ∠r. In either case, H(ω̂) ≈ 0, when ω̂ ≈ ∠r.
3 The Fundamental Theorem of Algebra guarantees that H(z) factors in this way.
4 Previous ways of describing a filter have included the filter coefficients, the impulse response sequence, the frequency
response function, and the system function.
5 The gain does not affect the shape of the frequency response.
The above fact follows from the property that if ω̂ = ∠r and |r| = 1, then ej ω̂ = r, and so
H(ω̂) = H(ej ω̂ ) = H(r) = 0 . (9.11)
A similar statement shows H(ω̂) ≈ 0 when |r| ≈ 1 and/or ω̂ ≈ ∠r.
From this fact, we see that we can make a filter block a particular frequency, i.e.
create a null or a dip in the frequency response, simply by placing a zero on or near the unit circle at
an angle equal to the desired frequency6 . On the other hand, the frequency response at frequencies
corresponding to angles that are not close to these zeros will have large magnitude. The filter will
“pass” these frequencies. The specific procedure to design such a filter is the following.
1. Choose frequencies ω̂1 , ..., ω̂L at which the frequency response should contain a null or a dip.
2. Choose zeros ri = ρi e^{jω̂i}, i = 1, . . . , L, with ρi = 1 or ρi ≈ 1, depending upon whether
a null or a dip is desired at frequency ω̂i. For each ω̂i ≠ 0, choose also a zero rj that is the
complex conjugate of ri. Let M be the total number of zeros chosen.
3. Form the system function H(z) = K(1 − r1 z −1 ) × · · · × (1 − rM z −1 ), where K is a gain
that we also choose.
4. Cross multiply the factors of H(z) found in the previous step so as to express H(z) as a
polynomial whose terms are powers of z −1 .
5. Identify the FIR filter coefficients {b0 , . . . , bM }, which are simply the coefficients of the poly-
nomial found in the previous step, as shown in equation (9.9).
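The five steps can be sketched numerically. The following illustrative Python/NumPy fragment places a conjugate zero pair on the unit circle at ω̂ = π/2 and multiplies out the factors (np.poly expands a list of roots into polynomial coefficients, which carries out steps 3 and 4):

```python
import numpy as np

w0 = np.pi / 2                                # frequency to null (step 1)
zeros = [np.exp(1j * w0), np.exp(-1j * w0)]   # conjugate pair (step 2)

K = 1.0                                       # gain (step 3)
b = K * np.poly(zeros).real                   # steps 4-5: coefficients b_k

# Verify the null: H(z) = b0 + b1 z^-1 + b2 z^-2 at z = e^{j w0}.
z = np.exp(1j * w0)
H_at_null = np.sum(b * z ** (-np.arange(len(b))))
```

Here b comes out as [1, 0, 1], i.e. the filter y[n] = x[n] + x[n − 2], whose frequency response has a null at ω̂ = π/2.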
which is never zero for any positive n. (Note that the impulse response is generally not so simple to
compute; this is an unusual case where the impulse response can be obtained by inspection.) Thus,
by introducing feedback terms into our difference equation, we have produced a filter with an infinite
impulse response, i.e., an IIR filter.
In general, computing the system function by taking the z-transform of the resulting infinite
impulse may not be trivial because of the required infinite sum, and also because it may be difficult
to find the impulse response. However, we can use the fact that H(z) = Y (z)/X(z) to determine
the system function. To do this, we first collect the y[n] terms on the left side of the equation and
take the z-transform of the result.
H(z) = K [(1 − r1 z^−1)(1 − r2 z^−1)(1 − r3 z^−1) · · · (1 − rM z^−1)] / [(1 − p1 z^−1)(1 − p2 z^−1)(1 − p3 z^−1) · · · (1 − pN z^−1)] .  (9.19)
The roots of the polynomial in the numerator, {r1 , . . . , rM }, are again called the zeros of the system
function. The roots of the polynomial in the denominator, {p1 , . . . , pN } are called the poles of the
system function. K is again a gain factor that determines the overall amplitude of the system’s
output. As before, the zeros are complex values where H(z) goes to zero. The poles, on the other
hand, are complex values where the denominator goes to zero and thus the system function goes to
infinity10 . Again, we typically assume that the filter coefficients bk and ak are real, so both the poles
and zeros of the system function must be either purely real or must appear in complex conjugate
pairs.
Just as we could completely characterize an FIR filter by its gain and its zeros, we can completely
characterize an IIR filter by its gain, its zeros, and its poles. As in the FIR case, this is typically the
most useful characterization when designing IIR filters. As before, if the system function has zeros
near the unit circle, then the filter magnitude frequency response will be small at frequencies near the
angles of these zeros. On the other hand, if there are poles near the unit circle, then the magnitude
frequency response will be large at frequencies near the angles of these poles. With FIR filters
we could directly design filters to have nulls or dips at desired frequencies. Now, with IIR filters, we
can design peaks in the frequency response, as well as nulls. The specific procedure is the following.
1. Choose frequencies ω̂1 , ..., ω̂L at which the frequency response should contain a null, a dip,
or a peak.
9 This is a generalization of the terminology in which the ratio of two integers is called a rational number.
10 Technically, because of a division by zero, H(z) is undefined at the location of a pole. However, the magnitude of the
system function becomes very large in the neighborhood of a pole.
2. Choose zeros ri = ρi e^{jω̂i} at those frequencies at which a null or a dip should occur, with
ρi = 1 or ρi ≈ 1, as desired. For each such ω̂i ≠ 0, choose also a zero rj that is the complex
conjugate of ri. Let M be the total number of zeros chosen.
3. Choose poles pi = ρi e^{jω̂i} at those frequencies at which a peak should occur, with ρi ≈ 1
but ρi < 1 so that each pole lies inside the unit circle. For each such ω̂i ≠ 0, choose also a
pole pj that is the complex conjugate of pi. Let N be the total number of poles chosen.
4. Form the system function H(z) = K [(1 − r1 z^−1) × · · · × (1 − rM z^−1)] / [(1 − p1 z^−1) × · · · × (1 − pN z^−1)], where K is a gain that we also choose.
5. Cross multiply the factors of H(z) found in the previous step and express H(z) as the ratio
of two polynomials whose terms are powers of z −1 .
6. Identify the IIR filter coefficients {a0 , . . . , aN , b0 , . . . , bM }, which are simply the coefficients
of the polynomials found in the previous step, as shown in equation (9.18).
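As an illustrative sketch of this procedure (Python/NumPy; the peak frequency π/2 and pole radius 0.8 are arbitrary choices matching Figure 9.3), the following places a conjugate pole pair, multiplies out the denominator, and confirms the resulting peak:

```python
import numpy as np

w0, rho = np.pi / 2, 0.8          # peak frequency and pole radius
poles = [rho * np.exp(1j * w0), rho * np.exp(-1j * w0)]

b = np.array([1.0])               # no nontrivial zeros in this sketch
a = np.poly(poles).real           # denominator coefficients a_k

def H(w):
    # Evaluate H(z) = B(z)/A(z) on the unit circle at z = e^{jw}.
    z = np.exp(1j * w)
    num = np.sum(b * z ** (-np.arange(len(b))))
    den = np.sum(a * z ** (-np.arange(len(a))))
    return num / den

gain_peak = abs(H(w0))            # high gain near the pole angle
gain_dc = abs(H(0.0))             # much lower gain away from it
```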
infinity.
12 Trivial poles and zeros do affect the phase (and thus the delay or time shift) of a system.
[Figure 9.1: An example pole-zero plot in the z-plane (horizontal axis: Real Part; vertical axis: Imaginary Part).]
Note also that if one chooses filter coefficients such that the numerator and denominator contain
an identical factor, i.e. if ri = pj for some i and j, then these factors “cancel” each other, i.e. the
filter is equivalent to a filter whose system function has neither factor.
Pole-zero plots
It is often very useful to graphically display the locations of a system’s poles and zeros. The standard
method for this is the pole-zero plot. Figure 9.1 shows an example of a pole-zero plot. This is a two-
dimensional plot of the z-plane that shows the unit circle, the real and imaginary axes, and the
position of the system’s poles and zeros. Zeros are typically marked with an ‘o’, while poles are
indicated with an ‘x’. Sometimes, a location has multiple poles and zeros. In this case, a number is
marked next to that location to indicate how many poles or zeros exist there. Figure 9.1, for instance,
shows four zeros (two conjugate pairs), two “trivial” poles at the origin, and one other conjugate pair
of poles. Recall that zeros and poles near the unit circle can be expected to have a strong influence
on the magnitude frequency response of the filter.
Figure 9.2: (A) The z-plane surface defined by the system function H(z) = (1 − e^{jπ/4} z^−1)(1 − e^{−jπ/4} z^−1). (B) The corresponding pole-zero plot. (C) The corresponding magnitude frequency response.
Figure 9.3: (A) The z-plane surface defined by the system function H(z) = 1 / [(1 − 0.8e^{jπ/2} z^−1)(1 − 0.8e^{−jπ/2} z^−1)]. (B) The corresponding pole-zero plot. (C) The corresponding magnitude frequency response.
Figure 9.4: (A) The z-plane surface for a complicated system function with four poles and four
zeros. (B) The corresponding pole-zero plot. (C) The corresponding magnitude frequency response.
system’s magnitude frequency response.) Thus, the magnitude frequency response has higher gain
at points far away from the zeros.
Figure 9.3 shows the surface |H(z)| as defined by a different system function. This system
function has two poles (which form a complex conjugate pair) and two zeros at the origin. Notice
how the poles “push up” the surface near them, like poles under a tent. The surface then typically
“drapes” down away from the poles, getting lower at points further from them. The magnitude
frequency response here has a point of high gain in the vicinity of the poles. (Again, the zeros in this
system function are located at the origin, and thus do not affect the magnitude frequency response.)
Figure 9.4 shows the surface for a system function which has poles and zeros interacting on
the surface. This system function has four poles and four zeros. Notice the tendency of the poles
and zeros to cancel the effects of one another. If a pole and a zero coincide exactly, they will
completely cancel. If, however, a pole and a zero are very near one another but do not have exactly
the same position, the z-plane surface must decrease in height from infinity to zero quite rapidly.
This behavior allows the design of filters with rapid transitions between high gain and low gain.
[Figure: Lowpass filter specifications plotted on |H(ω)|: passband with passband ripple, transition band, and stopband with stopband attenuation.]
circle roughly equally). You might use the example of the running average filter and bandpass filters
(given in Chapter 7 of DSP First) as a prototype of how to use zeros to design FIR filters using zero
placement.
If we wish to design an IIR filter (with both poles and zeros), it usually makes sense to start
with the poles since they typically affect the frequency response to a greater extent. If the frequency
response that we are trying to match has peaks on it, this suggests that we should place a pole
somewhere near that peak (inside the unit circle). Then, use zeros to try to pull down the frequency
response where it is too high. As with zeros, poles near the origin have relatively little effect on the
system's frequency response.
Regardless of which type of filter we are designing, there are a couple of methodological points
that should be mentioned. First, moving a pole or zero affects the frequency response of the entire
system. This means that we cannot simply optimize the position of each pole-pair and zero-pair
individually and expect to have a system which is optimized overall. Instead, after adjusting the
position of any pole-pair or zero-pair, we generally need to move many of the remaining pairs to
compensate for the changes. This means that filter design using manual pole-zero placement is
fundamentally an iterative design process.
Additionally, it is important that you consider the filter’s gain. Often we cannot adjust the overall
magnitude of the frequency response using just poles and zeros. Thus, to match the frequency re-
sponse properly, you may need to adjust the filter’s gain up or down. The pole-zero design interface
that you will use in this Lab includes an edit box where you can change the gain parameter. Alter-
nately, by dragging the frequency response curve, you can change the gain graphically. A related
idea is that of spectral slope. By having a pair of poles or zeros inside the unit circle and near the
real axis, we can adjust the overall “tilt” of the frequency response. As we move the pair to the right
and left on the z-plane, we can adjust the slope of the system’s frequency response up and down.
Note that there are automatic filter design methods which do not require manual placement of
poles and zeros. In Section 9.2.8 we discuss one such method.
[Figure 9.7: Block diagram of the source-filter model: a Glottal Source block produces x[n], which drives the Vocal Tract Filter to produce the output y[n].]
Note that the above description is only accurate for vowels and so-called voiced consonants like
“m” and “n.” Most consonant sounds are produced using the tongue, lips, and teeth rather than the
vocal cords. We will not consider consonants in this lab.
It is traditional to model speech production using a source-filter model. Figure 9.7 shows a
block diagram of the source-filter model. The first block is the glottal source, which takes as input
a fundamental frequency and produces a periodic signal (the glottal source signal) with the given
fundamental frequency. The signal produced is typically modeled as a periodic pulse train. To a
first approximation, we can assume that the spectrum of this pulse train is composed of equal amplitude
harmonics. The glottal source signal is meant to be analogous to the signal formed by the air pressure
fluctuations produced by the vibrating vocal cords. Note that to model whispering, the glottal source
signal can be modeled using random noise rather than a pulse train.
The second block of the source-filter model is the vocal tract filter. This is a discrete-time filter
that mimics the spectrum-shaping properties of the vocal tract. Since we are assuming a source
signal with equal-amplitude harmonics, the vocal tract filter provides the spectral envelope for our
output signal. That is, when we filter the source signal with fundamental frequency ω̂ 0 radians per
sample, the k th harmonic of the output signal will have an amplitude equal to the filter’s magnitude
frequency response evaluated at k ω̂0 . This is illustrated in Figure 9.8 which shows a particular
example. The magnitude spectrum of the glottal source signal is shown on top, the magnitude
frequency response of the vocal tract filter is shown in the center, and the magnitude spectrum of the
output signal, which models the specific vowel, is shown on the bottom. One may clearly see that, as
Figure 9.8: A plot of the magnitude spectrum of a glottal source signal, the frequency response of a
vocal tract filter, and the magnitude spectrum of the output signal (horizontal axis: frequency in radians per sample).
desired, the envelope of the spectrum of the vowel signal model matches the spectrum of the vocal
tract filter.
In Problem 3 of this assignment, we will make such source-filter models for particular vowel
signals, by measuring the spectrum of the vowel signal and designing an IIR vocal tract filter whose
frequency response approximates this spectrum.
Typically, our vocal tract filter can have relatively few filter coefficients (i.e., approximately 10-
20 coefficients). Further, the acoustics of the vocal tract suggest that this filter should be IIR. Often,
the vocal tract is modeled using an all-pole filter which has no nontrivial zeros. This is because an
acoustic passageway like the vocal tract primarily affects a sound through resonances. A resonance
is a part of a system that tends to vibrate at a certain resonant frequency, thus amplifying that
frequency in signals passed through it. The feedback form of an IIR filter is a direct implementation
of resonance; this is how IIR filters are able to produce high gain at certain frequencies. Using this
simple model of speech production, it is possible to synthesize artificial vowels.
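A minimal synthesis along these lines can be sketched as follows (Python/NumPy for illustration; the sampling rate, 100 Hz fundamental, and single 700 Hz resonance are toy choices, far simpler than a realistic 10-20 coefficient vocal tract filter):

```python
import numpy as np

fs, f0 = 8000, 100                 # sample rate and fundamental (Hz)
N = 8000                           # one second of samples

# Glottal source: periodic pulse train with period fs/f0 samples.
x = np.zeros(N)
x[::fs // f0] = 1.0

# Toy all-pole "vocal tract": one resonance from a conjugate pole pair.
w_res, rho = 2 * np.pi * 700 / fs, 0.95
a = np.poly([rho * np.exp(1j * w_res), rho * np.exp(-1j * w_res)]).real

# Feedback difference equation: y[n] = x[n] - a1 y[n-1] - a2 y[n-2].
y = np.zeros(N)
for i in range(N):
    y[i] = x[i]
    for k in range(1, len(a)):
        if i >= k:
            y[i] -= a[k] * y[i - k]
```

Playing y back (e.g. with soundsc in M ATLAB) gives a buzzy, vowel-like tone, and the harmonic nearest the resonance dominates the spectrum.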
from a time-domain waveform. These tools fit the spectrum of a time-domain signal with poles in a
least-squares sense. Note that these tools work directly with the time-domain waveform rather than
its spectrum; typically, they return the resulting ak feedback coefficients for a filter with those poles,
rather than the locations of the poles themselves. We will explore these tools for all-pole analysis in
the laboratory assignment, and we will compare classification performance using features based on
these models to the performance we achieved with our other feature sets from Lab 8.
As with FIR filters, M ATLAB’s convention for the bk coefficients is B(1)= b0 , B(2)= b1 ,
. . ., B(M+1)= bM . M ATLAB’s convention for the ak coefficients is A(1)= 1, A(2)= −a1 ,
. . ., A(N+1)= −aN . Both bk and ak are given as defined in equation 9.18. H contains the
frequency response and w contains the corresponding discrete-time frequencies. Alternatively,
we can compute the frequency response only at a desired set of frequencies. For example, the
command
returns the frequency response of the filter at the frequencies π/4, π/2, and 3π/4.
• Pole-Zero Place 3-D: In this laboratory, we will primarily be exploring filter design using
manual pole-zero placement. To help us do this, we will be using a M ATLAB graphical user
interface (GUI) called Pole-Zero Place 3-D. Pole-Zero Place 3-D allows you to place, move,
and delete poles and zeros on the z-plane, and provides immediate feedback by displaying
the filter’s frequency response and the |H(z)| surface. Additionally, it calculates some useful
statistics for assessing the quality of a particular filter design.
To run this program you need to download two different files: pole_zero_place3d.m
and pole_zero_place3d.fig. To begin Pole-Zero Place 3-D, simply execute15 the
command
>> pole_zero_place3d;
Once the program starts, the GUI window shown in Figure 9.9 will appear. The axis in the
upper left of the window shows a portion of the z-plane with the unit circle. In the lower left
is an axis that displays the frequency response of the system. In the lower right is a 3-D axis
which displays a 3-D graph of the |H(z)| surface16 .
The interface allows you to do a wide variety of things.
15 This program was designed to run on Windows systems running M ATLAB 6 or higher; it will not work with previous
versions of M ATLAB. It should work with Unix operating systems running M ATLAB 6, but this has not been tested.
16 This surface plot requires significant computation, and thus it can be toggled on and off using the View 3-D checkbox in
[Figure 9.9: The Pole-Zero Place 3-D GUI, showing the z-plane plot (Real(z) vs. Imag(z)), the magnitude frequency response |H(ω)| versus ω (discrete radian frequency), and the 3-D |H(z)| surface.]
1. To add poles or zeros to the z-plane, click the Add Zeros or Add Poles button and then
click on the z-plane plot in the upper left of the GUI. The state of the Place pair check-
box determines whether a single (real) pole or zero is added, or whether a conjugate pair
is added.
Note that the program also adds the hidden poles and zeros that accompany nontrivial
poles and zeros. Specifically, for each zero that is added, a pole is added at the origin;
but if there is already at least one zero at the origin, that zero is removed (i.e., cancelled)
instead. Likewise, for each pole that is added, a zero is added at the origin, or if there
are already poles at the origin, one pole is removed instead. The program does not allow
you to place zeros at infinity, and it can be shown that zeros at infinity will not be
induced by any other choice of poles or zeros.
2. To move a real pole or zero (or a conjugate pair of complex poles or zeros), you must
first select the pole/zero by clicking on one member of the pair. Then, you can drag it
around the z-plane, use the arrow keys to move it, or move it to a particular location by
inputting the magnitude and angle (in radians) in the Magnitude and Angle edit boxes.
3. To delete a pole or zero (or pair), select it and hit the Delete Poles/Zeros button. Again,
the system will maintain an equal number of poles and zeros by also removing poles or
zeros from the origin as necessary. This may also have the effect of no longer cancelling
other poles and zeros, and thus the total number of poles and zeros that appear at the
origin will change.
4. To change the filter’s gain, you can either use the Filter Gain edit box or you can click-
and-drag the blue frequency response curve in the lower left.
5. To toggle between linear amplitude and decibel displays in the lower two plots, select
the desired radio button above the Filter Gain edit box.
6. To rotate the 3-D |H(z)| plot, simply click-and-drag the axes in the lower right of the
GUI. To enable or disable the 3-D plot, toggle the View 3-D checkbox in the upper right
of the GUI.
7. To begin with an initial filter configuration defined by the feedforward coefficients, B,
and the feedback coefficients, A, start the program with the command
>> pole_zero_place3d(B,A);
This is useful if you wish to continue working on a design that you previously saved.
You may set either of these parameters to empty ([]) if you do not wish to specify the
filter coefficients.
8. To print the GUI window, you can either use the Copy to Clipboard button to copy an
image of the figure into the clipboard17, or you can print the figure using the Print GUI
button.
9. To save your current design, use the Export Filter Coefs button. The feedforward and
feedback coefficients will be stored in the variables B_pz and A_pz, respectively.
10. To hear your filter’s response to a periodic signal with equal-amplitude harmonics, press
the Play Sound button. This is particularly useful when using the GUI to design vocal
tract filters for vowel synthesis.
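The hidden poles and zeros described in item 1 are easy to check outside the GUI as well. As a quick sketch (the filter here is a made-up example), a single-zero FIR filter carries a companion pole at the origin:

```matlab
% H(z) = 1 - 0.5*z^(-1) = (z - 0.5)/z
% One nontrivial zero at z = 0.5, plus a hidden pole at z = 0.
B = [1 -0.5];    % feedforward coefficients
A = 1;           % feedback coefficients (FIR filter)
zplane(B, A);    % plot shows the zero at 0.5 and the hidden pole at the origin
```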
• Pole-Zero Place 3-D – Filter Matching Mode: In “filter matching mode,” you specify sam-
ples of a desired transfer function at harmonically related frequencies and try to match that
transfer function. The GUI plots a red curve or stem plot along with the frequency response
function; this is the response we wish to match. Two edit boxes labeled Linear Matching Error
and Decibel Matching Error indicate how closely your filter matches the desired frequency
response. The matching error values are computed as the RMS error between the desired
frequency response and your filter design in both linear amplitude and in decibels.
To start the GUI in this mode, use the following command:
>> pole_zero_place3d(B,A,filter_gains,fund_frq);
filter_gains are the values of the desired filter frequency response at harmonically re-
lated frequencies, and fund_frq is the fundamental frequency (in radians per sample) of the
harmonic series at which filter_gains are defined.
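The matching errors reported by the GUI can also be computed by hand. The sketch below assumes the desired gains are stored in a vector d and your filter’s gains, sampled at the same harmonic frequencies, are in h; the GUI’s exact computation may differ in detail:

```matlab
% RMS matching errors between desired gains d and achieved gains h
lin_err = sqrt(mean((abs(h) - abs(d)).^2));                       % linear amplitude
db_err  = sqrt(mean((20*log10(abs(h)) - 20*log10(abs(d))).^2));   % decibels
```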
• Pole-Zero Place 3-D – Lowpass Design Mode: In “lowpass design mode,” you specify the
maximum frequency of the passband and the minimum frequency of the stopband (both in
radians per sample). To start the GUI in this mode, use the following command:
>> pole_zero_place3d(B,A,[pass_max,stop_min]);
In this mode, the GUI computes the passband ripple and the stopband attenuation of your
lowpass filter design. You can use these measures to evaluate your filter design. The figure in
the lower right also displays the passband and stopband of the filter, with appropriate minima
and maxima.
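These measures can also be estimated directly from a sampled frequency response. The following sketch assumes filter coefficient vectors B and A and band edges pass_max and stop_min in radians per sample; the GUI’s exact definitions may differ slightly:

```matlab
[H, w] = freqz(B, A, 512);         % frequency response at 512 points on [0, pi)
Hdb = 20*log10(abs(H));
pass = Hdb(w <= pass_max);         % passband samples
stop = Hdb(w >= stop_min);         % stopband samples
ripple = max(pass) - min(pass);    % peak-to-peak passband ripple (dB)
atten  = max(pass) - max(stop);    % stopband attenuation (dB)
```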
• Converting between filter coefficients and zeros-poles: Given a set of filter coefficients, we
often need to determine the set of poles and zeros defined by those coefficients. Similarly, we
often need to take a set of poles and zeros and compute the corresponding filter coefficients.
There are two MATLAB commands that help us do this. First, if we have our filter coefficients
stored in the vectors B and A, we compute the poles and zeros using the commands
>> zeros = roots(B);
>> poles = roots(A);
17 Windows operating systems only
This is because the system zeros are simply the roots of the numerator polynomial whose
coefficients are the numbers in B, while the system poles are simply the roots of the denominator
polynomial whose coefficients are the numbers in A. We can see this in equation 9.18.
To convert back, use the commands
>> B = poly(zeros);
>> A = poly(poles);
Note that we lose the filter’s gain coefficient, K, in both of these conversions.
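A quick round trip illustrates the conversions and the lost gain. Here a gain of K = 3 is folded into the coefficients, survives roots (roots are unaffected by an overall scale), but disappears after poly, which always returns a monic polynomial:

```matlab
B  = 3*[1 -1];     % numerator with gain K = 3 folded in
z  = roots(B);     % zero at z = 1; the gain does not affect the roots
B2 = poly(z);      % B2 = [1 -1]: the factor of 3 is gone
```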
• Generating pole-zero plots: Frequently, we’d like to use MATLAB to make a pole-zero plot
for a filter. If our filter is defined by feedforward coefficients B and feedback coefficients A
(both row vectors), we can generate a pole-zero plot using the command:
>> zplane(B,A);
Alternately, if we have a list of poles, p, and a list of zeros, z, (both column vectors) we can
use the following command:
>> zplane(z,p);
An example of a pole-zero plot resulting from this command is shown in Figure 9.1.
• Automatic all-pole modeling: Using the MATLAB command aryule, we can compute an
all-pole filter model for a discrete-time signal. That is, aryule automatically finds an all-
pole filter whose magnitude frequency response in some sense matches the magnitude
frequency response of the signal. If the signal is stored in the vector signal, the command
>> A = aryule(signal,N);
returns the filter feedback coefficients ak as a vector A. The parameter N indicates how many
poles we wish to use in our filter model. Once we have A, we can compute the filter’s frequency
response at 256 points using freqz:
>> H = freqz(1,A,256);
>> pole_zero_place3d([],[],FIR_fr,2*pi/8192);
Use the GUI to find an FIR filter with six nontrivial zeros that matches the frequency response
of the original filter. You should be able to get the linear matching error to be less than 0.1.
(Hint: The original filter had all six of its zeros inside the unit circle, so yours should as well.)
• Include the GUI window with your matching filter in your report. In this and the fol-
lowing problems, make sure that it is possible to read the filter evaluation scores on your
printout. (Note: The easiest way to include the GUI window is to use the Copy to Clip-
board button on a Windows machine. After hitting the button, wait for the GUI to flash
white and then paste the result into your report.)
• What are the filter coefficients bk and ak for your filter?
• Where are the zeros on the z-plane? Give your answers in rectangular form.
2. (Design a lowpass filter.) In this problem, we will use the “lowpass design mode” of Pole-Zero
Place 3-D to design some lowpass filters, as described in Section 9.2.7. For the various parts
of this problem, use the command
>> pole_zero_place3d([],[],2*pi*[1500,2000]/8192);
This sets the filter transition band to run from 1500 Hz to 2000 Hz if we assume a sampling rate of 8192
samples per second.
(a) (Design an FIR lowpass filter to maximize stopband attenuation.) First, let’s see what
we can do with just zeros (that is, with FIR filters). Using only six nontrivial zeros (i.e.,
three zero pairs), design a lowpass filter with a stopband attenuation of at least 30 dB.
(Remember, we want our stopband attenuation to be as large as possible). For now, take
note of your filter’s passband ripple, but don’t worry about minimizing it.
• Include the GUI window with your matching filter in your report.
• What are the filter coefficients bk and ak for your filter?
• Where are the zeros on the z-plane? Give your answers in rectangular form.
Food for thought: Using just zeros, try to find a way to minimize the passband ripple.
What does this do to your stopband attenuation? Try this with more zeros, but don’t use
any poles.
(b) (Design an IIR lowpass filter to maximize stopband attenuation.) There are two primary
benefits to the use of IIR filters. First, it is very easy to get very high gain at certain fre-
quencies. This lets us design a lowpass filter with very high stopband attenuation. Using
a single pair of nontrivial poles, design a lowpass filter that has a stopband attenuation
greater than 60 dB. Use the same transition band as in the previous problem. Again, you
should take note of the passband ripple, but don’t worry about minimizing it.
• Include the GUI window with your matching filter in your report.
• What are the filter coefficients bk and ak for your filter?
• Where are the poles on the z-plane? Give your answers in rectangular form.
(c) (Design an IIR lowpass filter for both high stopband attenuation and low ripple.) The
second benefit of IIR filters is the ability to achieve fast transitions between high gain
and low gain. Among other things, this allows us to transition between the passband
and stopband more quickly, which in turn allows us to achieve relatively high stopband
attenuation with low passband ripple.
Once again using the same transition band, design a lowpass filter with a passband ripple
of less than 2 dB and a stopband attenuation of at least 20 dB. You may use as many poles
and zeros as you wish, but it is possible to meet these criteria with only two poles and
four zeros. (Hint: use decibel mode to help you increase the stopband attenuation, and
linear mode to help you decrease the passband ripple.)
• Include the GUI window with your matching filter in your report.
• What are the filter coefficients bk and ak for your filter?
• Where are the poles and zeros on the z-plane? Give your answers in rectangular
form.
Food for thought: For a more interesting challenge, design a lowpass filter with a pass-
band ripple of less than 1 dB and a stopband attenuation of 60 dB. This can be done
with six poles and six zeros, but you might want to use more than this.
7. When a function returns multiple values, we use square brackets to retrieve them:
Here, max_value is a vector of six ones (since the maximum value in each column is 1) and
index is a vector containing the row number of the 1 in each column.
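As a generic illustration of the square-bracket syntax (this is not the matrix used above), max returns both the maximum value and its index:

```matlab
[m, i] = max([3 7 5]);    % m = 7 (the maximum), i = 2 (its position)
```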
9. The end keyword is exceptionally useful when indexing into arrays of unknown size. Thus,
if I want to return all elements in a vector but the first and last one, I can use the command:
>> x(2:end-1)
which is equivalent to:
>> x(2:length(x)-1)
10. MATLAB automatically resizes arrays for you. Thus, if I want to add an element onto the end
of a vector, I can use the command:
>> x(end+1) = 5;
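For example, repeated use of end+1 grows a vector one element at a time:

```matlab
x = [1 2 3];
x(end+1) = 5;    % x is now [1 2 3 5]
x(end+1) = 7;    % x is now [1 2 3 5 7]
```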