Introduction To Rule-Based Fuzzy Logic Systems: by Jerry M. Mendel
Rule-Based
Fuzzy Logic Systems
by Jerry M. Mendel
[Figure: A rule-based fuzzy logic system. Crisp inputs x enter a Fuzzifier, which produces fuzzy input sets; Rules and Fuzzy Inference produce fuzzy output sets; an Output Processor produces crisp outputs y = f(x).]
CONTENTS
A Self-Study Course (Introduction)
Lesson 1
Lesson 2
Fuzzy Sets, Part 1
Lesson 3
Fuzzy Sets, Part 2
Lesson 4
Fuzzy Sets, Part 3
Lesson 5
Fuzzy Logic
Lesson 6
Case Studies
Lesson 7
Lesson 8
Lesson 9
Introduction to Rule-Based
Fuzzy Logic Systems
A Self-Study Course
This course was designed around Chapters 1, 2, 4-6, 13 and 14 of Uncertain Rule-Based Fuzzy Logic
Systems: Introduction and New Directions by Jerry M. Mendel, Prentice-Hall, 2001. The goal of this self-study course is to provide training in the field of rule-based fuzzy logic systems.
In this course, which is the first of two self-study courses, the participant will focus on rule-based fuzzy
logic systems when no uncertainties are present. This is analogous to first studying deterministic systems
before studying random systems. In the follow-on self-study course New Directions in Rule-Based Fuzzy
Logic Systems: Handling Uncertainties, the participant will learn about expanded and richer kinds of rule-based fuzzy logic systems, ones that can directly model uncertainties and minimize their effects. The
present course (or equivalent knowledge) is a prerequisite to the follow-on course.
Prerequisites
This course is directed at participants who have had no formal training in fuzzy logic and want to learn
about rule-based fuzzy logic systems. It assumes a college undergraduate degree, preferably in electrical
engineering or computer science.
Course Objectives
After completing this course, you should be able to:
Describe many differences between fuzzy sets and crisp sets, and fuzzy logic and crisp logic
Describe numerous applications for rule-based fuzzy logic systems (FLSs)
Demonstrate how a fuzzy set is described by a membership function
Compute set theoretic operations for fuzzy sets using membership functions
Demonstrate compositions of fuzzy relations and compute their membership functions
Describe and use Zadeh's Extension Principle
Explain the transition from crisp logic to fuzzy logic
Demonstrate membership functions for rules
Explain how rules are fired and implement the firing of rules
Describe and demonstrate how a FLS can be used to forecast a time-series
Describe and demonstrate how a FLS can be used as a fuzzy logic advisor for making social or
engineering judgments
Describe the architectures of three type-1 FLSs
Compute the input-output relationships for these three FLSs
Demonstrate and implement a variety of design methods for optimizing the design parameters of
these three FLSs
Describe the nature of and the order of all computations needed to design and implement these three
FLSs
Explain what software is available to implement and design these three FLSs
Course Components
This course includes:
A study guide including learning objectives, reading assignments, and practice problems (with
solutions)
A final exam and its solution.
The textbook is not included.
Acknowledgements
I would like to take this opportunity to thank Qilian Liang for his careful review of the Study Guide and
Li-Xin Wang for contributing the write-up in Lesson 13 about fuzzy logic control.
Reading Assignment
I. What is Fuzzy Logic (FL)?
We answer this question by contrasting FL with logic.
According to the Encyclopedia Britannica, "Logic is the study of propositions and their use in argumentation." According to Webster's Dictionary of the English Language, logic is "the science of formal reasoning, using principles of valid inference," and logic is "the science whose chief end is to ascertain the principles on which all valid reasoning depends, and which may be applied to test the legitimacy of every conclusion that is drawn from premises." Although multi-valued logic exists, we are most familiar with two-valued (dual-valued) logic in which a proposition is either true or false. This kind of logic is also referred to as crisp logic.
Traditional (sometimes called Western) logic was first systematized by Aristotle thousands of
years ago, in ancient Athens. There are two fundamental laws of classical logic:
Law of the Excluded Middle: A set and its complement must comprise the universe of
discourse.
Law of Contradiction: An element can either be in its set or its complement; it cannot
simultaneously be in both.
These two laws sound similar, but the Law of Contradiction forbids something being
simultaneously true and not true, whereas the Law of the Excluded Middle forbids anything
other than something being true or not true. Shakespeare's Hamlet exemplified the Law of
Contradiction when he said "To be or not to be, that is the question."
Fuzzy logic (FL) is a type of logic that includes more than just true or false values. It is the logic that deals with situations where you can't give a clear yes/no (true/false) answer. In FL, propositions are represented with degrees of truthfulness or falsehood, i.e., FL uses a continuous range of truth values in the interval [0, 1] rather than just true or false values. In FL, both of the two fundamental laws of classical logic can be broken, i.e., it is possible for an element to simultaneously be in its set and its complement, but to different degrees, the sums of which add up to unity. This will be made very clear in Lesson 3. So, Zadeh's Hamlet might have said "To be somewhat and not to be somewhat, that is the conundrum." FL includes classical dual-valued logic as a special case.
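The way both laws can fail, while the membership degrees still sum to one, can be sketched in a few lines of Python (the fuzzy set "tall" and its 0.7 membership grade are invented for illustration):

```python
# A sketch of fuzzy membership degrees; the fuzzy set "tall" and the
# membership grade 0.7 are invented for illustration.

def complement(mu: float) -> float:
    """Standard fuzzy complement: mu_notA(x) = 1 - mu_A(x)."""
    return 1.0 - mu

mu_tall = 0.7                    # degree to which some person is "tall"
mu_not_tall = complement(mu_tall)

# Both memberships are non-zero (the Law of Contradiction is broken),
# yet the two degrees still add up to unity.
assert mu_tall > 0 and mu_not_tall > 0
assert mu_tall + mu_not_tall == 1.0
```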
Read the IEEE Spectrum (June 1995, pp. 32-35) profile of Zadeh that is a supplement to this
lesson and appears at the end of the Study Guide.
Lotfi Zadeh is the founding father of FL. His first seminal paper on fuzzy sets appeared in 1965,
although he began to formulate ideas about them at least four years earlier. Fuzzy sets met with
great resistance in the West, perhaps because of the negative connotations associated with the
word "fuzzy." Let's face it, "fuzzy" does not conjure up visions of scientific or mathematical rigor. For decades after 1965, some (albeit a relatively small number of) people, along with Zadeh, developed the rigorous mathematical foundations of fuzzy sets and fuzzy logic.
Interestingly enough, Chinese and Japanese researchers devoted a large effort to fuzzy sets and
fuzzy logic. A popular hypothesis for this is that "fuzzy" fits in quite nicely with Eastern philosophies and religions (e.g., the complementarity of Yin and Yang). But, until the early
1970s fuzzy logic was a theory looking for an application. Then, a major breakthrough occurred
in 1975 when Mamdani and Assilian showed how to use rule-based FL to control a non-linear
dynamical system. It was relatively easy to do this, and it was a fast way to design a control
system. Although the design did not lend itself to the well-accepted, important, critical and
rigorous examinations called for by control theory, it did demonstrate an important real
application for FL. Other applications of rule-based FL began to appear, two very notable ones in
Japan: control of the Sendai city's subway system, and control of a water treatment system. Commercial products began to appear, e.g., a fuzzy shower, fuzzy washing machine, and fuzzy rice-cooker, and, in Japan, the word "fuzzy" took on the connotation of intelligent and in 1990 received an award. Western industries took notice (there was big money to be made) and the decade of the '90s rolled in, during which FL achieved a high degree of acceptability (there still is an on-going debate between subjective probabilists and fuzzy theorists about whether FL is the same as or is different from subjective probability). The IEEE established the IEEE Transactions on Fuzzy Systems and the IEEE Conference on Fuzzy Systems (FUZZ); there are
many other journals devoted to fuzzy systems (e.g., Fuzzy Sets and Systems); and, there are many
workshops and conferences devoted either exclusively to or that include sessions on fuzzy
technologies. In 1995, the IEEE awarded Zadeh its highest honor, its Medal of Honor, which is
comparable to the Nobel Prize. Fuzzy logic is now widely used in many industries and fields to
solve practical problems, and is still a subject of intense research by academics all over the
world. Although many applications have been found for FL, it is its application to rule-based
systems that has most significantly demonstrated its importance as a powerful design
methodology. Such rule-based fuzzy logic systems (FLSs) are what this course is all about.
If you are interested in a less impressionistic history of FL, then see, for example, the books by McNeill and Freiberger (1992), Wang (1997), or Kosko (1993a). One of the best histories of FL appears in the recent textbook by Yen and Langari (1999, pp. 3-18).
IV. Four Components That Make Up a Rule-Based Fuzzy Logic System (FLS)
Read pages 3-8 of the textbook.
FL has led to a new architecture for problem solving. This architecture processes its inputs non-linearly and is built upon a class of logical propositions: rules. Rules can be extracted from
experts and can then be quantified using the mathematics of FL that you will learn in this course;
doing this leads to the architecture of a FLS. Or, we can a priori assume the architecture of a
FLS, using the mathematics of FL, and tune the parameters of the FLS to solve a problem. The
latter approach is in the spirit of using a neural network (NN) to solve a problem, where the
architecture of the NN is assumed ahead of time and its parameters are tuned to solve a problem.
The former approach is truly unique to FL. The two approaches can be combined, allowing an
architecture to be developed that can be based on a combination of linguistic and numerical
information. Both approaches have an important role to play in problem solving and are
described in this course.
VII. Coverage
After this introductory first lesson, there are a series of four lessons that will provide you with
the basic tools that are needed in order to mathematically describe a rule-based FLS. Three
lessons (Lessons 2-4) are about fuzzy sets and relations and one lesson (Lesson 5) is about
fuzzy logic. Lesson 6 then describes two applications that are treated in the rest of this course as
case studies, namely forecasting of time-series and knowledge mining using surveys. We then
turn to three specific architectures for FLSs. Four lessons (Lessons 7-10) cover many aspects
of the very widely used singleton type-1 FLS (also known as a Mamdani FLS), ranging from
analysis to design to applications. Lesson 11 then covers all aspects of a non-singleton type-1
FLS, also ranging from analysis to design to applications. The non-singleton FLS lets us model
the inputs to the FLS as fuzzy numbers, whereas the singleton FLS does not, and, because a non-singleton FLS is very similar to a singleton FLS, we spend only one lesson on it. Finally, Lesson
12 covers many aspects of a type-1 TSK FLS, again ranging from analysis to design to
applications. The TSK FLS is very popular in control systems applications of FL and is also
becoming popular in signal processing applications of a FLS. Lesson 13 lets you explore some
applications of a type-1 FLS, namely: rule-based pattern classification, equalization of time-varying non-linear digital communication channels, and fuzzy logic control. Its main purpose is
to let you see how one or more of the FLSs already studied can be used to solve some real-world
problems. Lesson 14 focuses on computation, both for implementing a FLS during its operation
and during the design of the FLS. It enumerates all computations for singleton and non-singleton
type-1 Mamdani FLSs and a singleton type-1 TSK FLS, and, overviews on-line software that is
available for these computations. Finally, Lesson 15 focuses on the shortcomings of type-1 FLSs
and how they can be overcome.
Key Points
Fuzzy logic is a type of logic that includes more than just true or false values; it uses a
continuous range of truth values in the interval [0, 1].
Fuzzy logic lets us combine linguistic knowledge and numerical data in a systematic way.
Lotfi Zadeh is the founder of fuzzy logic.
A rule-based fuzzy logic system is comprised of four elements: rules, fuzzifier, inference
engine and output processor.
A FLS is a new architecture for problem solving, one that processes its inputs nonlinearly
and is built upon IF-THEN rules.
FL and FLSs have been applied in many different fields and industries.
Today, fuzzy and neural are being combined into fuzzy neural networks and neural fuzzy
systems.
Questions
1. Consider an engineering project that you are working on or have recently worked on. What are
some IF-THEN rules for that project?
2. What are the antecedents and consequent(s) for the just-stated rules?
3. Why do you think that fuzzy logic as a discipline has encountered so much resistance?
Note: No solutions are provided for these questions because each participant will have their own
answers to them. Deeper answers to Question 3 than are given in Section III above can be found in the references mentioned at the end of that section.
Reading Assignment
Read pages 1925 of the textbook.
Key Points
A crisp set can be defined using a membership function (MF) that only has two values, 0 or
1.
A fuzzy set is a generalization of a crisp set to MFs that have values in the closed interval [0,
1].
A crisp set is a special case of a fuzzy set.
Linguistic variables are variables whose values are not numbers but words or sentences in a
natural or artificial language.
Membership functions are associated with terms (linguistic variables) which appear in the antecedents or consequents of rules, or in phrases.
Popular shapes for MFs are triangles, Gaussian, trapezoidal, piece-wise linear, and bell-shaped.
There is no unique MF for a term; even when its shape is agreed upon, there are parameters
for the shape that can be chosen in different ways. The freedom to make such choices
provides fuzzy logic systems with design degrees of freedom.
The terms "support of a fuzzy set," "fuzzy singleton," and "normal fuzzy set" let us communicate about fuzzy sets.
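The popular MF shapes listed above are easy to write down; here is a sketch of three of them in Python (all parameter values in any call are arbitrary choices, part of the design degrees of freedom noted above):

```python
import math

# Sketches of three popular MF shapes; parameters are design choices.

def triangular(x, a, b, c):
    """Triangle with feet at a and c and peak (membership 1) at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def trapezoidal(x, a, b, c, d):
    """Trapezoid with feet at a and d and shoulders (membership 1) on [b, c]."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

def gaussian(x, m, sigma):
    """Gaussian MF centered at m with spread sigma (never reaches 0)."""
    return math.exp(-0.5 * ((x - m) / sigma) ** 2)
```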
Practice Problems
Complete Exercise 1-2 (all six parts).
Demonstrate how the basic crisp set theoretic operations of union, intersection and
complement can be computed using membership functions.
Explain basic set-theoretic properties for crisp sets (e.g., associativity, DeMorgan's Laws,
Law of Excluded Middle, Law of Contradiction).
Describe the generalizations of the set theoretic operations of union, intersection and
complement to fuzzy sets, and how they can be computed using membership functions.
Explain what t-norms and t-conorms are.
Explain basic set-theoretic properties for fuzzy sets (e.g., associativity, DeMorgan's Laws,
Law of Excluded Middle, Law of Contradiction).
Demonstrate crisp relations and compositions on the same product space.
Demonstrate fuzzy relations and compositions on the same product space and explain how
these differ from their crisp counterparts.
Explain the concept of a hedge and list some hedges and their MFs.
Reading Assignment
Read pages 26-36, 517-520, and 42-44 (in this order) of the textbook.
Key Points
The basic crisp set theoretic operations of union, intersection and complement can be
computed using crisp membership functions whose values are either 0 or 1. The maximum
and minimum functions can be used for union and intersection, respectively.
Operations on crisp sets satisfy many properties including associativity, DeMorgan's Laws,
Law of Excluded Middle, Law of Contradiction, and these properties can be proved using
Venn diagrams or membership functions.
The basic fuzzy set theoretic operations of union, intersection and complement can be
computed using fuzzy membership functions whose values are in the closed interval [0, 1].
The maximum and minimum functions can be used for fuzzy union and fuzzy intersection,
respectively, but they are not the only operations that can be used.
T-norms are operators that can be used for fuzzy intersection.
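Using the max/min choices for t-conorm and t-norm, the fuzzy set operations above can be computed pointwise; a small Python sketch (the universe and membership grades are invented):

```python
# Pointwise fuzzy union, intersection and complement using the max/min
# choice of t-conorm/t-norm. Universe and membership grades are invented.

universe = [1, 2, 3, 4, 5]
mu_A = {1: 0.0, 2: 0.3, 3: 1.0, 4: 0.3, 5: 0.0}
mu_B = {1: 0.2, 2: 0.6, 3: 0.6, 4: 0.2, 5: 0.0}

union        = {x: max(mu_A[x], mu_B[x]) for x in universe}   # t-conorm: max
intersection = {x: min(mu_A[x], mu_B[x]) for x in universe}   # t-norm: min
complement_A = {x: 1.0 - mu_A[x] for x in universe}

# The Law of Contradiction fails: A and its complement overlap at x = 2.
assert min(mu_A[2], complement_A[2]) > 0.0
```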
Practice Problems
Complete Exercises 1-9, 1-11 and 1-19a.
Reading Assignment
Read pages 36-42 of the textbook.
The "sup" in the sup-star composition is short for supremum. If S is a set of real numbers bounded from above, then there is a smallest real number y such that x ≤ y for all x ∈ S. The number y is called the least upper bound, or supremum, of S and is denoted sup_{x∈S}(x). We use the maximum for the supremum.
The sup-star composition, which is given in Equation (1-45), is the most important formula for a
rule-based FLS; but, it is not proven in the text. Because of its importance, we provide a proof of
it here. Since an understanding of the proof is not essential to the use of the sup-star composition,
you may consider the proof as optional reading.
First, we define the composition of two fuzzy relations.
We have already learned that an element belongs to a fuzzy set if it has a non-zero membership in that set. In this respect, the composition of two fuzzy relations means:
If R(U,V) and S(V,W) (R and S, for short) are two type-1 fuzzy relations on U × V and V × W, respectively, then the composition of these two relations, denoted R(U,V) ∘ S(V,W) ≡ R∘S(U,W), is defined as a subset R∘S(U,W) of U × W such that (u,w) ∈ R∘S if and only if the membership for the pair (u,w), u ∈ U and w ∈ W, is non-zero [i.e., μ_R∘S(u,w) > 0] for at least one v ∈ V such that μ_R(u,v) > 0 and μ_S(v,w) > 0.
We shall show that this condition is equivalent to the sup-star composition

    μ_R∘S(u,w) = sup_{v∈V} [μ_R(u,v) ⋆ μ_S(v,w)]

An Aside: In the proof given next, we use the following method. Let A be the statement "μ_R∘S(u,w) > 0," and B be the statement "there exists at least one v ∈ V such that μ_R(u,v) > 0 and μ_S(v,w) > 0." We prove that A iff B by first proving that not-B implies not-A (equivalent to proving that A implies B, i.e., necessity of B) and then proving that not-A implies not-B (equivalent to proving that B implies A, i.e., sufficiency of B).
Proof of (1-45): Necessity: If there exists no v ∈ V such that μ_R(u,v) > 0 and μ_S(v,w) > 0, then this means that for every v ∈ V, either μ_R(u,v) or μ_S(v,w) is equal to zero (or both are zero), which in turn implies that μ_R(u,v) ⋆ μ_S(v,w) = 0 for every v ∈ V, i.e., the supremum of μ_R(u,v) ⋆ μ_S(v,w) over v ∈ V is zero. Hence, μ_R∘S(u,w) = 0, as it should be.
Sufficiency: If the sup-star composition is zero, then it must be true that μ_R(u,v) ⋆ μ_S(v,w) = 0 for every v ∈ V, which means that for every v ∈ V, either μ_R(u,v) or μ_S(v,w) (or both) is zero. This means that there is no v ∈ V such that μ_R(u,v) > 0 and μ_S(v,w) > 0.
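On finite universes the supremum is an ordinary maximum, so the sup-star composition can be computed directly from relational matrices; a sketch (the matrix entries are invented, and both the max-min and max-product choices of star are shown):

```python
# Sup-star composition of two fuzzy relations given as relational matrices.
# On finite universes the supremum is a maximum, so max-min and max-product
# compositions are a few lines each. The matrix entries are invented.

def sup_star(R, S, tnorm):
    """R is |U| x |V| and S is |V| x |W|; returns the |U| x |W| composition."""
    return [[max(tnorm(R[u][v], S[v][w]) for v in range(len(S)))
             for w in range(len(S[0]))]
            for u in range(len(R))]

R = [[0.3, 0.8],
     [1.0, 0.4]]
S = [[0.5, 0.9],
     [0.2, 1.0]]

max_min  = sup_star(R, S, min)                  # star = minimum t-norm
max_prod = sup_star(R, S, lambda a, b: a * b)   # star = product t-norm
```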
Key Points
The composition of two crisp relations on different product spaces that share a common set
can be computed in different ways, including relational matrices and sagittal diagrams; but,
using formulas to do this, such as the max-min or max-product compositions (or their shortcuts), is very efficient because they can be easily implemented on a digital computer.
The composition of two fuzzy relations on different product spaces that share a common set is
performed using the sup-star composition, where star denotes a t-norm operator.
The most important application of the sup-star composition in a rule-based FLS is when one
of the relations is a fuzzy set.
The Extension Principle (EP) lets us extend mathematical relationships between non-fuzzy
variables to fuzzy variables.
When using the EP, we must be careful to distinguish between one-to-one and one-to-many
mappings, and, single- and multiple-variable mappings, so as to use the proper version of it
in each case.
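The one-to-many case of the EP can be sketched as follows: when several x map to the same y, the output membership takes the supremum over the preimage (the discrete universe and membership grades below are invented):

```python
# Extension Principle for a one-to-many mapping f(x) = x**2: several x
# map to the same y, so mu_B(y) is the supremum of mu_A over the preimage.
# The discrete universe and membership grades are invented.

mu_A = {-2: 0.4, -1: 1.0, 0: 0.6, 1: 0.8, 2: 0.2}

def extend(mu_A, f):
    mu_B = {}
    for x, mu in mu_A.items():
        y = f(x)
        mu_B[y] = max(mu, mu_B.get(y, 0.0))  # keep the supremum over preimages
    return mu_B

mu_B = extend(mu_A, lambda x: x * x)
# e.g. both x = -1 and x = 1 map to y = 1, so mu_B[1] = max(1.0, 0.8)
```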
Practice Problems
Complete Exercises 1-16 and 1-20 (b).
Explain that rules are a form of propositions, and describe what propositions are.
Demonstrate the role of truth tables in crisp logic.
Explain the major elements of crisp logic and demonstrate the truth table for five operations
that are frequently applied to propositions.
Explain the concept of a tautology and demonstrate how to use it to determine MFs for crisp
implications (rules).
Describe the firing of crisp rules using Modus Ponens and Modus Tollens.
Explain the transition from crisp logic to fuzzy logic.
Describe Generalized Modus Ponens and demonstrate how to implement it using a sup-star
composition formula.
Create insightful pictorial diagrams that show the steps of the Generalized Modus Ponens sup-star composition.
Explain what engineering implications are and why they are needed.
Reading Assignment
Read pages 48-59 of the textbook.
Because of the importance of the sup-star composition (1-74), we now illustrate its computation
when there is some uncertainty about the measurement of input variable x, in which case the
measurement can be modeled as a fuzzy number. These results are used in Lesson 11.
Let the measured value of x be denoted x̄. In our two examples below we create a fuzzy number A* centered about x̄ by using the following Gaussian membership function for A*:

    μ_A*(x) = exp[-(1/2)((x - x̄)/σ_A*)²]

and we assume that the antecedent fuzzy set A also has a Gaussian membership function:

    μ_A(x) = exp[-(1/2)((x - m_A)/σ_A)²]
Example 5-1: Calculation of the sup-star composition for Gaussian MFs and Product t-norm

In this example, we assume product implication and product t-norm, i.e., μ_{A→B}(x, y) = μ_A(x)μ_B(y).

(a) First, we show that the sup-star composition in (1-74) can be expressed as

    μ_B*(y) = sup_{x∈X} [μ_A*(x) ⋆ μ_{A→B}(x, y)]
            = sup_{x∈X} [μ_A*(x) μ_A(x) μ_B(y)]
            = (sup_{x∈X} [μ_A*(x) μ_A(x)]) μ_B(y)

because μ_B(y) does not depend on x.

(b) Next, we show that the supremum is attained at

    x_max = (σ_A² x̄ + σ_A*² m_A)/(σ_A² + σ_A*²)

Derivation: Define f(x) = μ_A*(x) μ_A(x), and substitute the exponential MFs stated above into it, to see that

    f(x) = exp{-(1/2)[((x - x̄)/σ_A*)² + ((x - m_A)/σ_A)²]} ≡ exp{-(1/2)g(x)}

Maximizing f(x) is equivalent to minimizing g(x); hence, setting dg(x)/dx = 0 at x = x_max,

    (x_max - x̄)/σ_A*² + (x_max - m_A)/σ_A² = 0
    ⇒ σ_A²(x_max - x̄) + σ_A*²(x_max - m_A) = 0
    ⇒ x_max = (σ_A² x̄ + σ_A*² m_A)/(σ_A² + σ_A*²)    QED

(c) Finally, we evaluate f(x_max). Observe that

    x_max - x̄ = σ_A*²(m_A - x̄)/(σ_A² + σ_A*²)  and  x_max - m_A = σ_A²(x̄ - m_A)/(σ_A² + σ_A*²)

so that

    g(x_max) = [σ_A*²(m_A - x̄)² + σ_A²(x̄ - m_A)²]/(σ_A² + σ_A*²)² = (x̄ - m_A)²/(σ_A² + σ_A*²)

Hence,

    f(x_max) = exp[-(1/2)(x̄ - m_A)²/(σ_A² + σ_A*²)]
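The closed-form x_max and peak value of Example 5-1 can be checked numerically against a brute-force supremum; a sketch (the parameter values are arbitrary test choices):

```python
import math

# Numerical check of Example 5-1: for Gaussian MFs and product t-norm, the
# supremum of mu_A*(x)*mu_A(x) is at x_max and equals f(x_max) as derived
# above. All parameter values are arbitrary test choices.

m_A, sigma_A = 2.0, 1.0        # antecedent MF A
x_bar, sigma_As = 4.0, 0.5     # fuzzy number A* about the measurement x_bar

def g(x, m, s):
    return math.exp(-0.5 * ((x - m) / s) ** 2)

def f(x):
    return g(x, x_bar, sigma_As) * g(x, m_A, sigma_A)

x_max = (sigma_A**2 * x_bar + sigma_As**2 * m_A) / (sigma_A**2 + sigma_As**2)
f_max = math.exp(-0.5 * (x_bar - m_A)**2 / (sigma_A**2 + sigma_As**2))

# A brute-force supremum over a fine grid agrees with the closed forms.
brute = max(f(i / 1000.0) for i in range(-2000, 8001))
assert abs(f(x_max) - f_max) < 1e-9
assert abs(brute - f_max) < 1e-6
```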
Example 5-2: Calculation of the sup-star composition for Gaussian MFs and Minimum t-norm

In this example, we assume minimum implication and minimum t-norm, i.e., μ_{A→B}(x, y) = min[μ_A(x), μ_B(y)].

(a) First, we show that the sup-star composition in (1-74) can be expressed as

    μ_B*(y) = sup_{x∈X} min[μ_A*(x), μ_{A→B}(x, y)]
            = sup_{x∈X} min[μ_A*(x), min(μ_A(x), μ_B(y))]
            = sup_{x∈X} min[min(μ_A*(x), μ_A(x)), μ_B(y)]
            = min[sup_{x∈X} min(μ_A*(x), μ_A(x)), μ_B(y)]

where the last line follows because μ_B(y) does not depend on x.

(b) Next, we show that the supremum of min[μ_A*(x), μ_A(x)] is attained at

    x_max = (σ_A x̄ + σ_A* m_A)/(σ_A* + σ_A)

Derivation: [A figure here plots μ_A*(x), μ_A(x) and min[μ_A*(x), μ_A(x)]; it is clear from the figure that x_max occurs at the intersection of the two Gaussian membership functions.] We must take into account the fact that, at the point where the two exponential functions cross each other, one is increasing and the other is decreasing; hence, x_max is the solution to

    (x_max - x̄)/σ_A* = -(x_max - m_A)/σ_A
    ⇒ σ_A(x_max - x̄) + σ_A*(x_max - m_A) = 0
    ⇒ x_max = (σ_A x̄ + σ_A* m_A)/(σ_A* + σ_A)

(c) Finally, because the two MFs are equal at x_max,

    sup_{x∈X} min[μ_A*(x), μ_A(x)] = μ_A*(x_max) = μ_A(x_max)

and, since

    x_max - m_A = σ_A(x̄ - m_A)/(σ_A* + σ_A)

we obtain

    μ_A(x_max) = exp[-(1/2)((x̄ - m_A)/(σ_A* + σ_A))²]
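A similar numerical check works for Example 5-2's crossing-point formula for the minimum t-norm (parameter values are again arbitrary test choices):

```python
import math

# Numerical check of Example 5-2: for Gaussian MFs and minimum t-norm,
# sup_x min[mu_A*(x), mu_A(x)] occurs where the two Gaussians cross.
# All parameter values are arbitrary test choices.

m_A, sigma_A = 2.0, 1.0
x_bar, sigma_As = 4.0, 0.5

def g(x, m, s):
    return math.exp(-0.5 * ((x - m) / s) ** 2)

x_max = (sigma_A * x_bar + sigma_As * m_A) / (sigma_As + sigma_A)
peak = math.exp(-0.5 * ((x_bar - m_A) / (sigma_As + sigma_A)) ** 2)

# The two MFs are equal at the crossing point.
assert abs(g(x_max, x_bar, sigma_As) - g(x_max, m_A, sigma_A)) < 1e-12

# A brute-force supremum of the min over a fine grid agrees with the formula
# to grid resolution (the min has a kink, not a flat top, at x_max).
brute = max(min(g(x, x_bar, sigma_As), g(x, m_A, sigma_A))
            for x in (i / 1000.0 for i in range(-2000, 8001)))
assert abs(brute - peak) < 1e-3
```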
Key Points
For our work in rule-based FLSs, the following tautologies for an implication are most important because they let us establish MFs for the implication: (p → q) ⇔ ¬[p ∧ (¬q)] and (p → q) ⇔ (¬p) ∨ q.
Logic, set theory and Boolean algebra are mathematically equivalent; any statement that is true in one system becomes a true statement in the others simply by making some changes in notation.
Crisp rules are fired using inference mechanisms known as Modus Ponens and Modus Tollens; only Modus Ponens plays a role in a FLS.
The transition from crisp logic to FL is done by replacing crisp logic's MFs by fuzzy MFs, and Modus Ponens by Generalized Modus Ponens.
Generalized Modus Ponens is a fuzzy composition where the first fuzzy relation is a fuzzy
set.
The MF of a fired rule is given by the sup-star composition.
Singleton fuzzification simplifies the computation of the sup-star composition by
eliminating the need to perform the supremum operation.
When all MFs are Gaussian then it is possible to compute the sup-star composition
analytically for both product and minimum t-norms.
Pictorial descriptions of the sup-star composition provide insight into its operations, and
demonstrate a problem with using fuzzy versions of classical crisp implications, namely a
bias in the MF of a fired rule.
Mamdani implications (product and minimum) overcome the problem of a bias in the MF of a fired rule; but their MFs are a departure from those of classical crisp implications.
Practice Problems
Complete Exercises 1-23 (c) and 1-26.
Reading Assignment
Read pages 110-118 of the textbook.
Although we focus on the Mackey-Glass chaotic time-series in this course and in Chapters 5 and
6 of the textbook, it is by no means the only chaotic time series that has been used to demonstrate
the forecasting capabilities of a FLS, e.g., the Duffing equation is considered by Mendel and
Mouzouris in their 1997 paper.
Table 4-1 needs some additional explanation in relation to this course. Although it refers to six kinds of forecasters, in this course we will only cover three kinds: singleton type-1, non-singleton type-1, and TSK. The Mackey-Glass equation may be chaotic, but it is deterministic,
i.e., even though it is very sensitive to its initial conditions (a property of a chaotic system), once
they have been chosen, then each time we run a simulation of that equation we obtain exactly the
same results. A singleton type-1 forecaster is useful when no uncertainties are present, i.e., there
is no measurement noise so that the measurements that activate the forecaster are perfect, and,
training and testing data are noise-free. A non-singleton type-1 forecaster tries to handle the
situation when the data is corrupted by measurement noise, both during the design and operation
of the forecaster. It does so by modeling the measurements as type-1 fuzzy numbers.
Unfortunately, this leaves a lot to be desired; but, we cannot do better within the framework of a type-1 FLS. To do better we must use a type-2 FLS, as described in the next course, New Directions in Rule-Based Fuzzy Logic Systems: Handling Uncertainties. Finally, we use a totally different time series (a stream of compressed video) to illustrate the forecasting capabilities of a TSK forecaster. That series is random but has no measurement noise associated with it either
during its design or operation.
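The Mackey-Glass series used in the forecasting case study can be approximated by a simple Euler discretization, and slicing the data into (past samples → next sample) pairs is a one-liner; a sketch (the step size, delay, initial history, and window length are illustrative choices, not the textbook's):

```python
# A sketch of Mackey-Glass data generation (Euler discretization with unit
# step) and of windowing it into (past samples -> next sample) training
# pairs. Step size, delay, initial history and window length are
# illustrative choices, not the textbook's.

def mackey_glass(n, tau=30, x0=1.2):
    """dx/dt = 0.2*x(t - tau)/(1 + x(t - tau)**10) - 0.1*x(t), Euler step 1."""
    x = [x0] * (tau + 1)                       # constant initial history
    for _ in range(n):
        x_tau = x[-(tau + 1)]
        x.append(x[-1] + 0.2 * x_tau / (1.0 + x_tau**10) - 0.1 * x[-1])
    return x[tau + 1:]

def windows(series, p=4):
    """Each pair: (p past samples, the next sample to be forecast)."""
    return [(series[i:i + p], series[i + p]) for i in range(len(series) - p)]

series = mackey_glass(1000)
pairs = windows(series, p=4)
```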
Read pages 119-126 of the textbook.
Sometimes a FLA is comprised of FL sub-advisors. Here we describe three architectures for such
a FLA, assuming for illustrative purposes that there are three sub-advisors. The extension of
these results to more than three sub-advisors is straightforward.
There are many different ways to combine/use three FL sub-advisors. First, however, we explain
why one would construct sub-advisors. To ask people questions that use more than two
antecedents is very difficult, because people usually cannot correlate more than two things at a
time. So, if more than two indicators are present for a social or engineering judgment, we can
rank order them in importance (if this ordering is known ahead of timeor it may have to be
established) and then use one or two of the indicators at a time to create the sub-advisors, after
which results from the sub-advisors are combined to give the overall output of the FLA.
[Figure 1: Parallel architecture. The indicator subsets x1, x2 and x3 each feed their own sub-advisor (FLA 1, FLA 2, FLA 3); the sub-advisor outputs y1, y2 and y3 feed a Combiner whose output y is the decision.]
2. Parallel Architecture: Aggregate Decision Maker
In the figure for this architecture (Figure 2) we have again partitioned the indicators in x into
three subsets, each of which is the input to its own FLA. Again, I assume that the dimension of each of these subsets is one or two (if there are more than 6 indicators, then more sub-advisors
will be needed). Simple one- or two-antecedent questions can be created in order to construct the
sub-advisors.
[Figure 2: Parallel architecture with aggregate decision making. Each indicator subset (x1, x2, x3) feeds both a Consensus FLA and an Individual's FLA (pairs 1-3); each pair's outputs are compared to yield Action/Decision #1, #2 and #3, which feed a (dashed) Combiner that produces the overall Action/Decision.]
The architecture of the overall FLA is different than the architecture shown in Fig. 4-3. Now
actions or decisions are made at the output of each sub-advisor and it is those actions or
decisions that are passed on to the Combiner. The Combiner could use a majority-rules strategy,
or some other strategy.
I have shown the block for the Combiner dashed because instead of combining actions and
decisions it may be important to examine the actions/decisions at the output of each sub-advisor.
For social judgments, an individual could be sensitized at the sub-advisor level with the hope that
in so doing he or she would become sensitized at the aggregate level.
3. Hierarchical Architecture
In the figure for this architecture (Figure 3) we have again partitioned the indicators in x into
three subsets. FLA 1 has antecedents that depend on the indicators in x1. The output of that sub-advisor, y1(x1), acts as one of the indicators of FLA 2. The output of that sub-advisor, y2(x2, y1(x1)), then acts as one of the indicators of FLA 3. The output of FLA 3 is considered to be the overall output of the FLA, namely

    y(x) = y3[x3, y2] = y3[x3, y2(x2, y1(x1))]
The output of each sub-advisor can be the same social judgment, but conditioned on different
antecedents. The questions for FLA 1 are the standard ones. Those for FLAs 2 and 3 are not. For
example a question for FLA 2 would have to be structured like:
IF judgment y made on the basis of indicators x1 is ____
and indicator x21 is _______ and indicator x22 is _______
THEN judgment y is __________
[Figure 3: Hierarchical architecture. x1 feeds FLA 1; its output y1, together with x2, feeds FLA 2; the output y2, together with x3, feeds FLA 3, whose output y is the decision.]
We immediately see a potential problem for this architecture, namely if a sub-advisor indicator
vector has two elements, then the questions associated with that sub-advisor will have three
antecedents. Such three-antecedent questions are very difficult for people to answer. So, the
overall indicator vector must be partitioned more finely so that each sub-advisor has at most two antecedents. This can lead to an architecture that has a lot of sub-advisors.
For engineering judgments, when rules are extracted from data, it is possible to use the
hierarchical architecture without having to worry about the dimension of the antecedents, since
questions will not be asked of people.
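The hierarchical wiring y(x) = y3[x3, y2(x2, y1(x1))] can be sketched directly; here each sub-advisor is stubbed as a plain linear function purely to show the composition (a real sub-advisor would be a fuzzy logic system, and all stub formulas below are invented):

```python
# A minimal sketch of the hierarchical FLA wiring. Each sub-advisor is
# stubbed as a plain linear function purely to show the composition; in a
# real FLA each would be a fuzzy logic system. Stub formulas are invented.

def fla1(x1):
    return 0.5 * x1                 # y1(x1)

def fla2(x2, y1):
    return 0.5 * (x2 + y1)          # y2(x2, y1); y1 is one antecedent

def fla3(x3, y2):
    return 0.5 * (x3 + y2)          # y3(x3, y2)

def advisor(x1, x2, x3):
    """Overall output y(x) = y3[x3, y2(x2, y1(x1))]."""
    return fla3(x3, fla2(x2, fla1(x1)))
```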
Key Points
To design a FLS forecaster, data is partitioned into training and testing subsets. The number
of elements in each subset depends on the size of the window of data points that is used to
forecast the next data point.
The training data are used in a FLS forecaster to establish its rules.
One way to extract rules from numerical training data is: Let the data establish the fuzzy sets
that appear in the antecedents and consequents of the rules.
Another way to extract rules from numerical training data is: Pre-specify fuzzy sets for the
antecedents and consequents and then associate the data with those fuzzy sets.
A third way to extract rules from numerical training data is: Establish the architecture of a
FLS and use the data to optimize its parameters.
Chaotic behavior can be described as bounded fluctuations of the output of a non-linear system with a high degree of sensitivity to initial conditions.
The Mackey-Glass equation is a non-linear delay differential equation that is known to
exhibit chaos when its delay parameter is greater than 17.
Knowledge mining, as used in this course, means extracting information in the form of
IF-THEN rules from people.
Judgment means an assessment of the level of a variable of interest.
A six-step methodology for knowledge mining involves: identifying the behavior of interest;
determining the indicators of the behavior of interest; establishing scales for each indicator
and the behavior of interest; establishing names and interval information for each of the
indicators' fuzzy sets and the behavior of interest's fuzzy sets; establishing rules; and surveying
people (experts) to provide a consequent for each rule.
Rules that are extracted from people about a judgment can be modeled using a FLS called a
fuzzy logic advisor (FLA).
FLAs can be used in different ways for social judgments or engineering judgments; e.g., they
can be used to sensitize people about social judgments.
FLAs can be comprised of sub-advisors that can be organized in a variety of architectures;
this is useful so that people can be asked questions with at most one or two antecedents.
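The chaotic Mackey-Glass behavior summarized in these key points can be seen directly in simulation. Here is a minimal sketch (not from the textbook) that Euler-integrates the Mackey-Glass delay differential equation dx/dt = a x(t−τ)/(1 + x(t−τ)^n) − b x(t), using the commonly quoted parameter values a = 0.2, b = 0.1, n = 10; with the delay τ > 17 the output stays bounded yet two runs that differ only slightly in their initial condition drift apart, which is exactly the sensitivity to initial conditions described above. The step size, horizon, and perturbation size are illustrative choices.

```python
# Sketch: Euler simulation of the Mackey-Glass equation (a=0.2, b=0.1, n=10).
# tau > 17 produces chaos; two nearly identical initial conditions diverge.

def mackey_glass(x0, tau=30.0, dt=0.1, steps=20000, a=0.2, b=0.1, n=10):
    """Euler integration of dx/dt = a*x(t-tau)/(1+x(t-tau)**n) - b*x(t)."""
    delay = int(round(tau / dt))           # the delay expressed in time steps
    x = [x0] * (delay + 1)                 # constant history on [-tau, 0]
    for _ in range(steps):
        x_tau = x[-delay - 1]              # the delayed sample x(t - tau)
        x.append(x[-1] + dt * (a * x_tau / (1.0 + x_tau ** n) - b * x[-1]))
    return x

xa = mackey_glass(1.2)
xb = mackey_glass(1.2 + 1e-4)              # tiny perturbation of x(0)
gap = max(abs(u - v) for u, v in zip(xa, xb))
print(max(xa), min(xa), gap)               # bounded trajectories that drift apart
```

The forecasting experiments in the textbook use samples of exactly this kind of series as training and testing data.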
Practice Problem
Participate in the survey given in Table 4-2 (see, also, the discussion given on p. 77) by: (1)
providing your start and end points for the five range labels, and (2) re-computing the mean and
standard deviation values for the start and end points for each label using those shown in the
table (obtained from 47 students) and your new values.
Reading Assignment
Read pages 131–142 of the textbook.
Key Points
A singleton type-1 FLS consists of rules, fuzzifier, inference mechanism and defuzzifier.
A multiple-antecedent multiple-consequent rule can always be considered as a group of
multi-input single-output rules.
Many non-obvious rules can be cast into the form of a standard IF-THEN rule, so that a rule-based FLS is quite broad in its applicability.
The MF of a fired rule, μ_B(y), is given by μ_B(y) = sup_{x∈X} [μ_A(x) ★ μ_{A→G}(x, y)], ∀y ∈ Y.
Fired rules can be combined in different ways; there is no one best way to do this.
A singleton fuzzifier has a MF that is non-zero at only one point, x_i = x_i′.
For singleton fuzzification the supremum operation in the sup-star composition is very easy
to evaluate, because the MF of the input is non-zero only at one point, x_i = x_i′.
For singleton fuzzification, the MF of a fired rule, μ_{B^l}(y), is given by
μ_{B^l}(y) = μ_{G^l}(y) ★ μ_{F_1^l}(x_1′) ★ ⋯ ★ μ_{F_p^l}(x_p′), ∀y ∈ Y.
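The singleton-fuzzification key points above can be made concrete in a few lines of code. This is an illustrative sketch in my own notation (not code from the textbook): the sup over x collapses to an evaluation at x′, so the fired-rule output set is just the consequent MF combined, via the t-norm, with the firing level μ_{F_1^l}(x_1′) ★ ⋯ ★ μ_{F_p^l}(x_p′). Gaussian MFs and all parameter values are assumptions made for the example; replace the product by min for the minimum t-norm.

```python
# Sketch: firing level and fired-rule MF for singleton fuzzification,
# product t-norm, Gaussian antecedent/consequent MFs (illustrative values).
import math

def gauss(x, m, sigma):
    """Gaussian membership function centered at m with spread sigma."""
    return math.exp(-0.5 * ((x - m) / sigma) ** 2)

antecedents = [(0.0, 1.0), (2.0, 0.5)]     # (m, sigma) for F_1^l, F_2^l
x_prime = [0.5, 1.8]                       # crisp (singleton) input x'

# Firing level: t-norm (here, product) of antecedent MFs evaluated at x'.
firing = 1.0
for (m, s), xi in zip(antecedents, x_prime):
    firing *= gauss(xi, m, s)

def mu_Bl(y, m_G=1.0, s_G=0.4):
    """Fired-rule output set: consequent MF scaled by the firing level."""
    return firing * gauss(y, m_G, s_G)

print(firing, mu_Bl(1.0))
```

Note that at the consequent center y = 1.0 the fired-rule MF equals the firing level itself, since the consequent MF is 1 there.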
Practice Problem
Example 5-1 is one of the most important ones given in Chapter 5, because it provides a
geometric interpretation for the operations that occur within the inference engine. In this
exercise, I want you to provide the figures that are comparable to the ones given in Figures 5-4–5-6, but using triangular MFs. Do this for both the minimum and product t-norms.
Describe five popular methods for defuzzification: centroid, center-of-sums, height, modified
height, and center-of-sets.
Explain why there is no one singleton type-1 FLS, and demonstrate that the many choices
that need to be made to specify or design such a FLS lead to a rich variety of FLSs.
Demonstrate the input–output formula for a singleton type-1 FLS as a new kind of basis
function expansion: a fuzzy basis function (FBF) expansion.
Demonstrate that each rule, whether it derives from expert linguistic knowledge or is
extracted from numerical data, can be associated with one FBF.
Explain what a universal approximation theorem is, and describe a singleton type-1 FLS as a
universal approximator.
Demonstrate what is meant by rule explosion.
Reading Assignment
Read pages 142–148 of the textbook.
Section 5.5.2: Here we derive (5-16) by beginning with the additive combiner depicted in Figure
5-3, assuming product implication and product t-norm, and formally determining the centroid of

μ_B(y) = Σ_{l=1}^{M} w^l μ_{B^l}(y).
Derivation: From the last line of (5-10) we know that, for product implication and product
t-norm, the MF for the additive combiner can be expressed as:

μ_B(y) = Σ_{l=1}^{M} w^l μ_{B^l}(y) = Σ_{l=1}^{M} w^l μ_{G^l}(y) Π_{i=1}^{p} μ_{F_i^l}(x_i) = Σ_{l=1}^{M} w^l f^l μ_{G^l}(y)

where f^l = Π_{i=1}^{p} μ_{F_i^l}(x_i). Consequently,

Centroid of μ_B(y) = ∫_Y y μ_B(y) dy / ∫_Y μ_B(y) dy
  = Σ_{l=1}^{M} w^l f^l ∫_Y y μ_{G^l}(y) dy / Σ_{l=1}^{M} w^l f^l ∫_Y μ_{G^l}(y) dy

Let c_{G^l} and a_{G^l} denote the centroid and area, respectively, of μ_{G^l}(y), i.e.

c_{G^l} = ∫_Y y μ_{G^l}(y) dy / ∫_Y μ_{G^l}(y) dy   and   a_{G^l} = ∫_Y μ_{G^l}(y) dy

so that ∫_Y y μ_{G^l}(y) dy = c_{G^l} a_{G^l}. Hence:

Centroid of μ_B(y) = Σ_{l=1}^{M} w^l f^l c_{G^l} a_{G^l} / Σ_{l=1}^{M} w^l f^l a_{G^l}

which is (5-16), Kosko's standard additive model (SAM). When minimum implication is used instead,
μ_{B^l}(y) = min[μ_{G^l}(y), f^l], where f^l is defined above. Clearly, the previous derivation of
the Centroid of μ_B(y) depends on the separability of μ_{G^l}(y) and f^l in the equation for
μ_{B^l}(y), something that cannot be guaranteed when μ_{B^l}(y) = min[μ_{G^l}(y), f^l]; hence,
Kosko's SAM is of very limited value.
Note that the center-of-sums defuzzifier is still applicable in this case, because (5-14) is in terms
of the centroid and area of output fuzzy sets and not consequent fuzzy sets. These quantities can
be computed numerically from knowledge of μ_{B^l}(y), as calculated from (5-10).
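The numerical computation just mentioned is easy to sketch. The snippet below (my own illustrative example, not from the textbook) approximates the centroid and area of a sampled output set μ_{B^l}(y) by Riemann sums; the output set shown is a triangle clipped at a firing level of 0.5, the shape that minimum implication produces. Grid limits and MF parameters are assumptions for the example.

```python
# Sketch: numerical centroid and area of an output set mu_Bl(y), as needed
# by the center-of-sums defuzzifier when no closed form exists.
def centroid_and_area(mu, y_lo, y_hi, n=2001):
    """Riemann-sum estimates of int y*mu(y)dy / int mu(y)dy and int mu(y)dy."""
    dy = (y_hi - y_lo) / (n - 1)
    ys = [y_lo + i * dy for i in range(n)]
    num = sum(y * mu(y) for y in ys) * dy
    den = sum(mu(y) for y in ys) * dy
    return num / den, den

def tri(y, a=0.0, b=1.0, c=2.0):
    """Triangular consequent MF on [a, c] with apex at b."""
    if a < y <= b:
        return (y - a) / (b - a)
    if b < y < c:
        return (c - y) / (c - b)
    return 0.0

# Minimum implication clips the triangle at the firing level 0.5.
c, area = centroid_and_area(lambda y: min(tri(y), 0.5), -1.0, 3.0)
print(c, area)   # symmetric clipped triangle: centroid 1.0, area 0.75
```

For this symmetric set the numerical centroid lands at the apex location, and the area matches the trapezoid formula 0.5 · (2 + 1)/2 = 0.75, which is a handy sanity check for the routine.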
Read pages 149–157.
Key Points
Defuzzification produces a crisp output from the fuzzy sets that appear at the output of a
FLS's inference block.
There are many kinds of defuzzifiers.
The defuzzifiers that are based on some sort of center of gravity computation are: centroid,
center-of-sums, height, modified height, and center-of-sets.
Many choices need to be made in order to specify or design a type-1 FLS; they provide the
designer with many design degrees of freedom.
A FLS can be interpreted as a fuzzy basis function (FBF) expansion, which places a FLS into
the more global perspective of function approximation.
FBFs are not radial basis functions and they are not orthogonal basis functions.
Every rule in a FLS, whether it comes from linguistic knowledge or is extracted from data,
can be associated with a FBF.
A FLS is a universal approximator, i.e., it can uniformly approximate any real continuous
non-linear function to an arbitrary degree of accuracy.
Universal approximation is an existence theorem that helps to explain why a FLS is so
successful in engineering applications, but it does not tell us how to specify a FLS.
Rule explosion refers to rapid growth in the maximum number of rules that may be required
in a FLS; e.g., if there are p input variables, each of which is divided into r overlapping
regions, then a complete FLS must contain r^p rules.
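A quick numeric illustration of this key point: with each of p inputs partitioned into r = 5 overlapping regions, a complete rule base has r^p rules, which grows exponentially in the number of inputs (the values of p and r below are arbitrary choices for the example).

```python
# Sketch: rule explosion — a complete rule base has r**p rules.
r = 5
counts = {p: r ** p for p in (2, 4, 6)}
print(counts)   # {2: 25, 4: 625, 6: 15625}
```

Going from two to six inputs multiplies the maximum rule count by a factor of 625, which is why high-dimensional FLS designs rely on incomplete rule bases or hierarchical architectures.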
Practice Problems
Complete Exercises 5-4 and 5-6.
Reading Assignment
Read pages 157–166 of the textbook. Omit Sections 5.9.4 and 5.9.5.
The following material supplements Section 5.9.3.
I. Interpretation of a Type-1 FLS as a Three-Layered Architecture
A singleton (or non-singleton) type-1 FLS can be viewed as a three-layered architecture. This
was first discovered by my former Ph.D. student Li-Xin Wang, around 1990, as part of his Ph.D.
research. This architecture suggests the possibility of back-propagating errors from the output
of the FLS to earlier layers, in analogy with back-propagation in a feed-forward neural network
(NN) (see discussions about this on the top of p. 166 in the textbook). It is important to note,
though, that the three-layer architecture for the FLS is merely a re-interpretation of the FLS and
is not a physical architecture or implementation. This is different from the layered architecture of
a NN, where that architecture is usually viewed as a physical implementation of the network.
Starting with (5-24) and (5-25), we re-express y(x) as follows:

y(x) = f_s(x) = h / g

where

h = Σ_{l=1}^{M} y̅^l w^l   and   g = Σ_{l=1}^{M} w^l

in which

w^l = Π_{i=1}^{p} μ_{F_i^l}(x_i),   l = 1, ..., M

These equations lead to the following three-layered architecture for this FLS:

[Figure 1: the three-layered architecture. Layer 1 computes the firing levels w^1, ..., w^M from the input x = col(x_1, ..., x_p) via the antecedent MFs μ_{F_i^l}(x_i); Layer 2 computes the sums h and g; Layer 3 computes the output y = f_s(x) = h/g.]
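The three-layered computation just described can be transcribed almost line for line into code. The sketch below assumes Gaussian antecedent MFs and illustrative rule parameters (neither comes from the textbook): Layer 1 forms the firing levels w^l, Layer 2 forms h = Σ y̅^l w^l and g = Σ w^l, and Layer 3 outputs f_s(x) = h/g.

```python
# Sketch: the three-layered interpretation of a singleton type-1 FLS,
# with Gaussian antecedent MFs and product t-norm (illustrative parameters).
import math

def fls_output(x, rules):
    """rules: list of (means, sigmas, ybar), one (m, sigma) pair per input."""
    # Layer 1: firing level w^l of each rule (product t-norm over antecedents).
    w = []
    for means, sigmas, _ in rules:
        wl = 1.0
        for xi, m, s in zip(x, means, sigmas):
            wl *= math.exp(-0.5 * ((xi - m) / s) ** 2)
        w.append(wl)
    # Layer 2: the two sums h and g.
    h = sum(wl * ybar for wl, (_, _, ybar) in zip(w, rules))
    g = sum(w)
    # Layer 3: normalization, i.e. the output f_s(x) = h / g.
    return h / g

rules = [([0.0, 0.0], [1.0, 1.0], -1.0),
         ([2.0, 2.0], [1.0, 1.0], +3.0)]
print(fls_output([0.0, 0.0], rules))   # dominated by rule 1, so near -1
print(fls_output([2.0, 2.0], rules))   # dominated by rule 2, so near +3
```

Because of the Layer 3 normalization, the output always lies between the smallest and largest consequent centers y̅^l, whichever rule dominates.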
II. A Very Short Primer on Optimizing a Function Using an Algorithm That Makes Use of
First Derivative Information
There are many ways to optimize (i.e., minimize or maximize) a function. Here I will briefly
describe a very popular way that uses not only the value of the function being optimized but also
its first derivative. Methods that use this information are called steepest descent algorithms.
In order to keep the initial discussion as simple as possible, I shall assume that the function being
minimized depends only on a single parameter, θ. That function (called an objective function) is
denoted J(θ), and an example of it is depicted in Figure 2. Observe that there are various kinds
of extrema that can occur: relative maxima, relative minima, global maximum, global
minimum, and even inflection points. When our goal is to minimize J(θ), then we want to
determine the value of θ labeled in Figure 2 as θ*. One of the great challenges to doing this is
not to get trapped at a local extremum, e.g., at θ_1* or θ_2*. The importance of a good starting value
for θ cannot be over-stated. If, for example, our initial choice is at θ_0, then it is very likely that
an optimization algorithm that is based on derivative information will cause θ to lock on
(converge) to θ_1* or θ_2*. On the other hand, if the initial choice is at θ_0′, then it is very likely that
an optimization algorithm that is based on derivative information will cause θ to converge to the
global minimum at θ*.
One approach to trying to achieve the global minimum is to randomly choose θ_0, solve for the
associated minimum of J(θ), say J(θ̂), and to repeat this procedure for a collection of such θ_0
values. One then chooses θ* as that value of θ associated with the smallest value of J(θ̂). In
many practical optimization problems, it may not be essential to compute the overall global
minimum of J(θ). A value of θ that leads to a small enough value of J(θ) may suffice.
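The multi-start strategy just described is easy to demonstrate on a toy one-dimensional problem. In the sketch below (my own example; the objective, step size, and iteration counts are all illustrative), a first-derivative steepest descent iteration is run from several random starting points θ_0, and the θ with the smallest J is kept; the quartic objective has two minima, so single starts can get trapped in the shallower one.

```python
# Sketch: multi-start steepest descent on a 1-D objective with two minima.
import random

def J(t):
    return (t * t - 1.0) ** 2 + 0.3 * t      # two minima; the global one near t = -1

def dJ(t):
    return 4.0 * t * (t * t - 1.0) + 0.3     # analytical first derivative

def descend(t, alpha=0.01, iters=2000):
    """Plain steepest descent: t <- t - alpha * dJ(t)."""
    for _ in range(iters):
        t = t - alpha * dJ(t)
    return t

random.seed(0)
starts = [random.uniform(-2.0, 2.0) for _ in range(10)]
candidates = [descend(t0) for t0 in starts]
best = min(candidates, key=J)                # keep the smallest-J candidate
print(best, J(best))                         # global minimum lies near t = -1
```

Starts landing in the right-hand basin converge to the shallower local minimum near t = +1; the final min-over-candidates step is what recovers the global one.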
[Figure 2: the objective function J(θ) versus θ, showing relative extrema θ_1* and θ_2*, the global minimum J(θ*) at θ = θ*, and candidate starting points θ_0 and θ_0′.]

θ* = arg min_θ J(θ)   (1)
Note that in the textbook I refer to all of the training data as "the data," which is then partitioned into
a training data subset and a testing data subset. The idea is to use the training data subset to
minimize J(θ) the best you can, but to then evaluate how well you do this by using the testing
data subset. There will be a trade-off between over-fitting using the training data subset and
generalization using the testing data subset. Usually, over-fitting leads to poor generalization
performance.
My goal in the next few paragraphs is to give you a fairly high-level explanation of the
construction of a steepest descent algorithm for minimizing objective function J(θ), where in the
discussions below θ now is a vector of design parameters. In order to emphasize the role of the
data during the optimization process, as used by the optimization algorithm, I shall denote J(θ)
as J = J(D, θ). D_TRAIN is used by the steepest descent algorithm because that algorithm is based
on minimizing J_TRAIN = J(D_TRAIN, θ). D_TEST is used to evaluate the overall optimization results by
computing J_TEST = J(D_TEST, θ) and establishing an overall stopping rule, of the form:
|J(D_TEST, θ_{i+1}) − J(D_TEST, θ_i)| ≤ ε   (2)
where ε is pre-specified. This is only one example of a stopping rule, but it is one that is
frequently used in practice. Another practical stopping rule is to choose a pre-specified
maximum number of iterations, and to stop the iterative minimization when that number is
reached. This stopping rule is not as effective as the first one because J_TEST = J(D_TEST, θ) could
still be changing by a large amount after the pre-specified number of iterations has been reached.
The general structure of a steepest descent algorithm is:

θ_{i+1} = θ_i − α g_{i+1}(D_TRAIN, θ_i),   i = 0, 1, ...   (3)

g_{i+1}(D_TRAIN, θ_i) = [derivatives of J(D_TRAIN, θ) with respect to the elements of θ]|_{θ = θ_i},   i = 0, 1, ...   (4)
The vertical-bar notation means that after we determine the derivatives of J(D_TRAIN, θ)
analytically, some or all of them will still be explicit functions of the unknown θ, and those θ
values are then replaced by the best values we have for them, namely θ_i.
In our tuning procedure we use a squared-error function [see (5-47) in the textbook], i.e.

J(D_TRAIN, θ) = e(D_TRAIN, θ)   (5)

where

e(D_TRAIN, θ) = (1/2) [y(D_TRAIN, θ) − y^(j)(D_TRAIN)]²   (6)

and

y(D_TRAIN, θ) = f_s(D_TRAIN, θ)   (7)
In (7), f_s(D_TRAIN, θ) is the output of a singleton type-1 FLS. Its exact structure depends on the
many choices that have to be made by the designer of a FLS. One example of f_s(D_TRAIN, θ) is
given in (5-46) of the textbook.
It is easy to compute the derivatives of J(D_TRAIN, θ), which are needed in (4), using (5)–(7), i.e.

∂J(D_TRAIN, θ)/∂θ = ∂e(D_TRAIN, θ)/∂θ
  = [y(D_TRAIN, θ) − y^(j)(D_TRAIN)] ∂y(D_TRAIN, θ)/∂θ
  = [y(D_TRAIN, θ) − y^(j)(D_TRAIN)] ∂f_s(D_TRAIN, θ)/∂θ   (8)
In order to proceed further, the specific FLS choices mentioned above must be made. Those
choices will let us determine analytical formulas for ∂f_s(D_TRAIN, θ)/∂θ. We complete these
calculations for a specific set of choices below in Section III.
This completes the high-level overview on optimizing a function using a steepest descent
algorithm. Lots of good software already exists for doing this (e.g., The MathWorks
Optimization Toolbox), software that has been written by experts who have included lots of the
bells and whistles that let a steepest descent algorithm work well. We return to software for
doing this in Lesson 14.
III. Completing the Steepest Descent Calculations for a Specific FLS

For a single training pair (x^(i), y^(i)), the squared-error objective function is

J(θ) = (1/2) [f_s(x^(i)) − y^(i)]²

so that, for any design parameter θ,

∂J(θ)/∂θ = [f_s(x^(i)) − y^(i)] ∂f_s(x^(i))/∂θ

where, for Gaussian antecedent MFs and height defuzzification,

f_s(x^(i)) = Σ_{l=1}^{M} y̅^l φ_l(x^(i))

and

φ_l(x^(i)) = Π_{k=1}^{p} exp[−(x_k^(i) − m_{F_k^l})² / 2σ²_{F_k^l}] / Σ_{l=1}^{M} Π_{k=1}^{p} exp[−(x_k^(i) − m_{F_k^l})² / 2σ²_{F_k^l}]   (9)

(a) θ = y̅^l: In this case,

∂f_s(x^(i))/∂y̅^l = φ_l(x^(i))

so that

y̅^l(i+1) = y̅^l(i) − α [f_s(x^(i)) − y^(i)] φ_l(x^(i))

which is (5-49).

(b) θ = m_{F_k^l}: In this case, it is helpful to use the layered architecture interpretation for (5-24) and
(5-25), i.e. f_s = h/g, where

h = Σ_{l=1}^{M} y̅^l w^l,   g = Σ_{l=1}^{M} w^l

and

w^l = Π_{k=1}^{p} exp[−(x_k^(i) − m_{F_k^l})² / 2σ²_{F_k^l}]

By the chain rule,

∂f_s/∂m_{F_k^l} = (∂f_s/∂w^l)(∂w^l/∂m_{F_k^l})

where

∂f_s/∂w^l = (g y̅^l − h)/g² = (y̅^l − f_s)/g

and

∂w^l/∂m_{F_k^l} = w^l (x_k^(i) − m_{F_k^l}) / σ²_{F_k^l}

Hence, since φ_l(x^(i)) = w^l/g,

∂f_s/∂m_{F_k^l} = [y̅^l − f_s(x^(i))] φ_l(x^(i)) (x_k^(i) − m_{F_k^l}) / σ²_{F_k^l}

Substituting this last equation into the general derivative formula, we reach the steepest descent algorithm
for updating m_{F_k^l} that is given in (5-48):

m_{F_k^l}(i+1) = m_{F_k^l}(i) − α [f_s(x^(i)) − y^(i)] [y̅^l(i) − f_s(x^(i))] φ_l(x^(i)) (x_k^(i) − m_{F_k^l}(i)) / σ²_{F_k^l}(i)

(c) θ = σ_{F_k^l}: The derivation of (5-50) is just like the derivation of (5-48). The key steps are
summarized in the layered architectural equations given above for f_s, h, g and w^l. We then
compute

∂f_s/∂σ_{F_k^l} = (∂f_s/∂w^l)(∂w^l/∂σ_{F_k^l})

where ∂f_s/∂w^l has been computed above, so we only need the new computation of ∂w^l/∂σ_{F_k^l}.
Because this last computation is just like the one we just carried out for ∂w^l/∂m_{F_k^l}, we leave its
completion to you.
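The three update equations just derived, (5-49) for the y̅^l, (5-48) for the antecedent means, and its analogue (5-50) for the antecedent spreads, can be collected into a short runnable sketch. Everything below is illustrative (the training pairs, initial parameters, step size α, and epoch count are my own choices, not the textbook's); the point is only that the updates implement the chain-rule derivatives above and drive the training error down.

```python
# Sketch: steepest descent tuning of a singleton type-1 FLS with Gaussian
# antecedent MFs and height defuzzification, per (5-48)-(5-50).
import math

M, p, alpha = 2, 1, 0.1
m = [[0.0], [1.0]]          # m_{F_k^l}
sig = [[0.7], [0.7]]        # sigma_{F_k^l}
ybar = [0.0, 0.0]           # ybar^l
data = [([0.1], 0.2), ([0.9], 0.8), ([0.5], 0.5)]   # (x^(i), y^(i)) pairs

def forward(x):
    """Compute f_s(x) and the FBFs phi_l(x) of (9)."""
    w = [math.prod(math.exp(-0.5 * ((x[k] - m[l][k]) / sig[l][k]) ** 2)
                   for k in range(p)) for l in range(M)]
    g = sum(w)
    phi = [wl / g for wl in w]
    return sum(ybar[l] * phi[l] for l in range(M)), phi

def sse():
    return sum(0.5 * (forward(x)[0] - y) ** 2 for x, y in data)

J0 = sse()
for _ in range(200):                      # epochs of steepest descent
    for x, y in data:
        fs, phi = forward(x)
        err = fs - y
        for l in range(M):
            for k in range(p):
                common = err * (ybar[l] - fs) * phi[l]
                # (5-48): mean update, and its sigma analogue (5-50):
                m[l][k] += -alpha * common * (x[k] - m[l][k]) / sig[l][k] ** 2
                sig[l][k] += -alpha * common * (x[k] - m[l][k]) ** 2 / sig[l][k] ** 3
            ybar[l] += -alpha * err * phi[l]          # (5-49)
J1 = sse()
print(J0, J1)    # the training error decreases after tuning
```

With only three training pairs and six free parameters, this tiny example fits easily; in a real design, the constraint between the number of training pairs, rules, and antecedents noted in the Key Points below must be respected.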
Key Points
Each training datum can be interpreted as an IF-THEN rule of the form IF x_1 is F_1^l and
⋯ and x_p is F_p^l, THEN y is G^l, where the F_i^l are fuzzy sets described by Gaussian (other shapes
can be used) MFs. A particular design method establishes how the MF parameters are
specified.
It is good design practice to have fewer FLS design parameters than training pairs; hence, a
constraint always exists among the number of training samples, number of rules, and number
of antecedents.
Three high-level designs can be associated with a singleton type-1 FLS, ranging from one in
which the data establishes the rules and no tuning is used, to two others in which the training
data is used to tune some or all of the antecedent and consequent MF parameters.
The layered architecture for a type-1 FLS suggests that errors will be back-propagated
during a steepest descent parameter tuning procedure, just as they are during the steepest-descent design of a feed-forward neural network.
The two one-pass design methods let the data establish either the parameters of the MFs or
the entire rule. Their major drawback is that they lead to a FLS that has too many rules.
When all of the antecedent parameters are pre-specified, the method of least-squares can be
used to design the consequent parameters; doing this leads to a linear system of equations
that has to be solved for the consequent parameters. Knowing how to choose the antecedent
parameters ahead of time is a major drawback to using this design method.
When none of the antecedent or consequent parameters are pre-specified, they can all be
tuned using the method of steepest descent.
Calculating the derivative of the objective function, which is required to derive a steepest
descent algorithm, requires a careful use of the chain rule; this can be expedited by making
use of the three-layer architectural interpretation of a type-1 FLS.
Practice Problem
Complete Exercise 5-10.
Reading Assignment
Read pages 169–183 of the textbook.
Key Points
Practice Problems
Complete Exercises 5-14 and 5-15.
Explain why the architecture of a non-singleton type-1 FLS is the same as for a singleton
type-1 FLS.
Describe what is meant by non-singleton fuzzification.
Demonstrate the calculation of the sup-star composition for the case of non-singleton fuzzification and explain why it is more difficult than in the singleton case.
Explain how a non-singleton FLS can be interpreted as a prefiltering operation on the
measurements followed by the inference mechanism.
Demonstrate pictorial descriptions of the firing of rules and the combining of multiple-fired
rules.
Explain that what is new for a non-singleton type-1 FLS is the need for the designer to
choose MFs for the input measurements, something that wasn't necessary for a singleton
type-1 FLS.
Demonstrate the input–output formula for a non-singleton type-1 FLS as a fuzzy basis
function (FBF) expansion and explain the differences between this FBF expansion and the
one for singleton type-1 FLSs.
Explain how training data can be interpreted as a collection of IF-THEN rules and describe
what the difference is between these IF-THEN rules and the ones for a singleton type-1 FLS.
Enumerate how many design parameters there can be in a specific design and describe the
relation of that number to the number of possible rules in the non-singleton type-1 FLS, and
how these numbers compare with those for a singleton type-1 FLS.
Describe four high-level designs that can be associated with a non-singleton type-1 FLSs.
Describe two high-level approaches to the tuning of a non-singleton FLS.
Demonstrate that the design methods learned for singleton type-1 FLSs are easily modified
for non-singleton type-1 FLSs.
Demonstrate how one-pass and back-propagation design methods can be applied to
forecasting the Mackey-Glass time-series.
Explain that if we only have access to noisy measurements, then a non-singleton type-1 FLS outperforms a singleton type-1 FLS, but that there is room for
further improvements.
Reading Assignment
Read pages 186–192 of the textbook.
Before you read Example 6-2, review Examples 5-1 and 5-2 in Lesson 5.
Read pages 193–209 of the textbook. Omit Sections 6.6.4 and 6.6.5.
Key Points
A non-singleton type-1 FLS consists of a fuzzifier, inference mechanism and defuzzifier; its
rules are the same as those for a singleton type-1 FLS; it differs from a singleton type-1 FLS
in the nature of the fuzzifier.
A non-singleton fuzzifier treats each input as a fuzzy number, i.e. it assigns a MF to each
input that has a value equal to one at the measured value of the input and decreases to zero as
the input variable gets farther away from the measured input value.
As in a singleton type-1 FLS, the MF of a fired rule, μ_B(y), is given by μ_B(y) =
sup_{x∈X} [μ_A(x) ★ μ_{A→G}(x, y)], ∀y ∈ Y; but, for a non-singleton type-1 FLS, the sup operation
does not disappear, because μ_A(x) has non-zero values over a range of values for each x_i.
Except for some simple, but important choices for the MFs (e.g., Gaussian MFs) it is not
possible to evaluate this sup-star composition analytically.
A non-singleton FLS first pre-filters its input x, transforming it into x^l_max. Doing this
accounts for the effects of the input measurement uncertainty, and is a direct result of the
sup-star composition.
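For Gaussian MFs this pre-filtering can be written down explicitly: with a product t-norm, the sup in the sup-star composition over μ_X(x) μ_F(x) is attained at a point that blends the measurement x′ with the antecedent mean, x_max = (σ_X² m_F + σ_F² x′)/(σ_X² + σ_F²), which is the standard closed form for the all-Gaussian case. The sketch below (all parameter values are illustrative) verifies this maximizer against a brute-force numerical search.

```python
# Sketch: the sup-star maximizer for Gaussian input and antecedent MFs
# (product t-norm), checked against brute-force search on a grid.
import math

def gauss(x, m, s):
    return math.exp(-0.5 * ((x - m) / s) ** 2)

x_meas, s_X = 1.0, 0.3        # measured input x' and its uncertainty spread
m_F, s_F = 2.0, 0.5           # antecedent Gaussian MF

# Closed-form maximizer of gauss(x, x_meas, s_X) * gauss(x, m_F, s_F).
x_max = (s_X ** 2 * m_F + s_F ** 2 * x_meas) / (s_X ** 2 + s_F ** 2)

# Numerical check of the supremum.
grid = [x / 1000.0 for x in range(-2000, 5000)]
x_num = max(grid, key=lambda x: gauss(x, x_meas, s_X) * gauss(x, m_F, s_F))
print(x_max, x_num)           # the two maximizers agree
```

Note that x_max lies strictly between the measurement and the antecedent mean: the more uncertain the measurement (larger σ_X), the further the effective input is pulled toward m_F.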
Only the pictorial description for the input and antecedent operations of a non-singleton type-1 FLS differs from the one for a singleton FLS. The other pictorial descriptions remain the
same.
The only difference between a type-1 non-singleton and a singleton FLS is the numerical
value of the firing level; for the former, this value includes the effects of input uncertainties,
whereas for the latter it does not.
The same choices must be made to specify or design a non-singleton type-1 FLS as had to be
made for a singleton type-1 FLS. In addition, the designer must specify the MFs for the input
measurements, which provides new design degrees of freedom to the non-singleton FLS.
A non-singleton FLS can also be interpreted as a FBF expansion. Input uncertainty may
activate more of these FBFs, which means that decisions are more distributed in the non-singleton case than in the singleton case.
Training data establish exactly the same sort of rules as they did in the singleton FLS case,
and a particular design method establishes how the MF parameters are specified, including
those for the input MFs.
The constraint that exists among the number of training samples, number of rules, and
number of antecedents is slightly different for a non-singleton type-1 FLS than it is for a
singleton type-1 FLS, because of the addition of the input MF parameters.
Four high-level designs can be associated with a non-singleton type-1 FLS, ranging from one
in which the data establishes the rules and no tuning is used, to three in which the training
data is used to tune some or all of the antecedent, consequent, and input measurement MF
parameters.
One approach to designing a non-singleton type-1 FLS, the partially dependent approach,
is to first design the best possible singleton FLS, freeze the common parameters, and only
optimize the parameters that are new to the non-singleton type-1 FLS. A second
approach, the totally independent approach, is to design the best possible non-singleton
type-1 FLS regardless of any pre-existing singleton FLS design.
The one-pass and least-squares design methods developed for a singleton type-1 FLS are
essentially the same for a non-singleton type-1 FLS.
The steepest-descent algorithms are different for a non-singleton type-1 FLS because of the
pre-filtering operation performed by the sup-star composition.
A non-singleton type-1 FLS forecaster is less sensitive to noisy measurements than a
singleton type-1 FLS forecaster, but the improvement is modest.
When the training data are noisy there is no way to account for this in the antecedent and
consequent MFs of a type-1 FLS. This represents a limitation of a type-1 FLS.
Practice Problems
Exercise 111
Example 6-1 (just as Example 5-1) is one of the most important ones given in Chapter 6, because
it provides a geometric interpretation for the operations that occur within the inference engine. In
this exercise, I want you to provide the figures that are comparable to the ones given in Figures
6-3, 5-5 and 5-6, but using triangular MFs. Do this for both the minimum and product t-norms.
Complete Exercises 6-5 and 6-7.
Reading Assignment
Read pages 421–428 of the textbook. Omit Section 13.3.
Section 13.4 explains how to design both type-1 and type-2 TSK and Mamdani FLSs for the
problem of forecasting compressed video traffic. Because the textbook interweaves material
about both type-1 and type-2 designs, here we will filter out all of the type-2 design materials
(leaving them for the follow-on course New Directions in Rule-Based Fuzzy Logic Systems:
Handling Uncertainties), i.e. we will guide you through Section 13.4.
Start by reading Section 13.4.1, including Example 13-5, pp. 442–444, but omit the two
paragraphs directly after Example 13-5. Read the last paragraph of Section 13.4.1.
Next, we have extracted materials from Sections 13.4.2–13.5 that focus on the type-1 designs.
Section 13.4.2 Forecasting I frame sizes: General Information
In the rest of this section we focus on the problem of forecasting I frame sizes (i.e., the number
of bits/frame) for a specific video product, namely Jurassic Park. All of our methodologies for
doing this apply as well to forecasting P and B frame sizes and can also be applied to other video
products.
Here we examine two designs of FLS forecasters based on the logarithm of the first 1000 I frame
sizes of Jurassic Park, s(1), s(2), ..., s(1000) (see Figure 13-1). Those designs are a type-1 TSK
FLS and a singleton type-1 Mamdani FLS. We used the first 504 data [s(1), s(2), ..., s(504)] for
tuning the parameters of these forecasters, and the remaining 496 data [s(505), s(506), ...,
s(1000)] for testing after tuning.
Type-1 TSK FLS: The rules of this FLS forecaster are (i = 1, ..., M)

R^i: IF s(k−3) is F_1^i and s(k−2) is F_2^i and s(k−1) is F_3^i and s(k) is F_4^i,
THEN s^i(k+1) = c_0^i + c_1^i s(k−3) + c_2^i s(k−2) + c_3^i s(k−1) + c_4^i s(k)   (13-53)

Singleton type-1 Mamdani FLS: The rules of this FLS forecaster have the same antecedents as in
(13-53), but each consequent is a fuzzy set, i.e. THEN s(k+1) is G^i   (13-57)
We used height defuzzification. As we did for the type-1 TSK FLS, we initially chose the F_j^i to be
the same for all i (rules) and j (antecedents), and used a Gaussian membership function for them,
one whose initial mean and standard deviation were chosen from the first 500 I frames, as
described earlier, as m = 4.7274 and σ = 0.0954. According to Table 13-1, the number of
design parameters for this singleton type-1 Mamdani FLS is (2p + 1)M = 9M.
13.4.3 Forecasting I frame sizes: Using the same number of rules
In this first approach to designing the two FLS forecasters, we fixed the number of rules at five
in both of them; i.e., M = 5. Doing this means that the type-1 TSK FLS is described by 65 design
parameters and the singleton type-1 Mamdani FLS is described by 45 design parameters.
Steepest descent algorithms (as described in Section 5.9.3 for the Mamdani FLS and in Section
13.2.4 for the TSK FLS) were used to tune all of these parameters. In these algorithms, we used
step sizes of α = 0.001 and α = 0.01 for the TSK and Mamdani FLSs, respectively.
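The parameter counts quoted above are worth checking. Assuming Gaussian antecedent MFs (2 parameters each), a p-antecedent TSK rule carries 2p antecedent parameters plus p + 1 consequent coefficients, while a Mamdani rule with height defuzzification carries 2p antecedent parameters plus one consequent center:

```python
# Sketch: design-parameter counts for the two five-rule forecasters (p = 4).
p, M = 4, 5
tsk = M * (2 * p + (p + 1))      # 5 * 13 = 65 parameters
mamdani = M * (2 * p + 1)        # 5 * 9  = 45 parameters
print(tsk, mamdani)
```

This bookkeeping also explains the choice made in Section 13.4.4 below: with 9 parameters per Mamdani rule, seven rules give 63 parameters, which approximately equals the 65 of the five-rule TSK FLS.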
We have already explained how we chose initial values for the membership function parameters.
All of the remaining parameters were initialized randomly, as follows:
Consequent parameters, c ij (i = 1,...,5; j = 0,1,...,4) , of the TSK FLS were each chosen
randomly in [0, 0.2] with uniform distribution.
Consequent parameters, y i (i = 1,...,5), of the Mamdani FLS were chosen randomly in
[0, 5] with uniform distribution.
Because we chose the initial values of the consequent parameters randomly, we ran 50 Monte-Carlo realizations for each of the two designs.¹⁰ For each realization, each of the two FLSs was
tuned for 10 epochs on the 504 training data. All designs were then evaluated on the remaining
496 testing data using the following RMSE:
RMSE = sqrt{ (1/496) Σ_{k=504}^{999} [s(k+1) − f_FLS(s^(k))]² }   (13-59)

where s^(k) = [s(k−3), s(k−2), s(k−1), s(k)]^T. The average value and standard deviations of these
RMSEs are plotted in Figure 13-2 for each of the 10 epochs. Observe, from Figure 13-2(a), that
(pay attention only to the curves for the two type-1 designs):
1. After 10 epochs of tuning, the average RMSE of the 2 FLS forecasters is:
2. In terms of average RMSE and standard deviation of the RMSE, the type-1
TSK FLS outperforms the singleton type-1 Mamdani FLS for epochs 2–10.
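The testing RMSE in (13-59) transcribes directly into code. In the sketch below the indexing (k = 504, ..., 999, a four-sample window) follows (13-59) exactly, but the actual Jurassic Park data and the tuned forecaster are not reproduced here, so a synthetic series and a naive stand-in forecaster are used for illustration.

```python
# Sketch: the testing RMSE of (13-59), with a stand-in forecaster.
import math

def rmse(s, f_fls):
    """(13-59): RMSE over the 496 testing points k = 504, ..., 999."""
    total = sum((s[k + 1] - f_fls([s[k - 3], s[k - 2], s[k - 1], s[k]])) ** 2
                for k in range(504, 1000))
    return math.sqrt(total / 496.0)

# Synthetic log-frame-size series and a naive "persistence" forecaster.
s = [4.7 + 0.1 * math.sin(0.3 * k) for k in range(1001)]
naive = lambda window: window[-1]        # predict s(k+1) = s(k)
print(rmse(s, naive))
```

Any tuned FLS forecaster can be dropped in for `naive` (it just needs to map a four-sample window to a prediction), so the same routine serves for comparing the TSK and Mamdani designs.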
13.4.4 Forecasting I frame sizes: Using the same number of design parameters
Because a five-rule TSK FLS always has more parameters (design degrees of freedom) to tune
than does a comparable five-rule Mamdani FLS, we modified the previous approach to designing
the two FLSs. We did this by fixing the rules used by the TSK FLS at five and by then choosing
the number of rules used by the Mamdani FLS so that its total number of design parameters
approximately equals the number for the TSK FLS. Doing this led us to use seven rules for the
Mamdani FLS. The designs of the resulting two FLSs proceeded exactly as described in the
preceding section. All designs were again evaluated using the RMSE in (13-59). The average
value and standard deviations of these RMSEs are plotted in Figure 13-3 for each of the 10
epochs (again, only pay attention to the curves for the two type-1 designs). Observe that:
The results are similar to the ones depicted in Figure 13-2; so, at least for this
example, equalizing the numbers of design parameters in the Mamdani and
TSK FLSs does not seem to be so important.
13.4.5/13.5 Conclusion
It is not our intention in this example to recommend one FLS architecture over another. Some
people prefer a TSK FLS over a Mamdani FLS or vice-versa. We leave that choice to the
designer who, as always, must be guided by a specific application. When both kinds of FLSs are
applicable, as in the case of forecasting a random-signal and perfect-measurement time-series,
the designer can carry out a comparative performance analysis between the two architectures, as
we have just done.
10
In Chapters 5 and 6 Monte-Carlo simulations were run to average out the effects of additive measurement noise. Here they are run to average
out the effects of random initial consequent parameter values.
Key Points
TSK is short for Takagi, Sugeno and Kang, the originators of the TSK FLS.
To date, only a singleton type-1 TSK FLS has been described in the literature.
The most widely used type-1 TSK FLS uses first-order rules, i.e., rules whose antecedents
are type-1 fuzzy sets, and whose consequent is a linear combination of the measured
antecedents. The fact that its consequent is a function and not a fuzzy set is the biggest
difference between a TSK FLS and a Mamdani FLS.
The output formula for a type-1 TSK FLS is obtained by combining its rules in a prescribed
way; it does not derive from the sup-star composition, as does the output of a type-1
Mamdani FLS. This is another big difference between a TSK FLS and a Mamdani FLS.
Normalized and unnormalized type-1 TSK FLSs have been defined.
When the consequent function in a TSK rule is a constant, then the normalized type-1 TSK
FLS is exactly the same as a type-1 Mamdani FLS that uses either center-of-sums, height,
modified height, or center-of-sets defuzzification.
TSK FLSs are also universal approximators.
Just as in a type-1 Mamdani FLS, a constraint always exists among the number of training
samples, number of rules and number of antecedents in a type-1 TSK FLS. Because the
consequent of a TSK rule contains more design parameters than does the consequent of a
Mamdani rule, a TSK FLS that uses the same number of rules as a Mamdani FLS always has
more design degrees of freedom than a Mamdani FLS.
Two high-level designs can be associated with a singleton TSK FLS. In one design, the
shapes and parameters of all the antecedent MFs are fixed ahead of time and the training data
is used to tune only the consequent parameters. In the other design, the training data is used
to tune all of the MF and consequent parameters.
When all of the antecedent parameters are pre-specified, the method of least-squares can be
used to design the consequent parameters; doing this leads to a linear system of equations
that has to be solved for the consequent parameters. Knowing how to choose the antecedent
parameters ahead of time is a major drawback to using this method.
When none of the antecedent or consequent parameters are pre-specified, they can all be
tuned using the method of steepest descent.
It is possible to interweave the steepest-descent and least squares design methods to obtain a
more powerful iterative design method. (This can also be done for the design of a Mamdani
FLS.)
Forecasting compressed video means predicting a future value of either an I, P or B frame,
directly in the compressed video domain, using a window of previously measured I, P, or B
frame values.
Forecasting of compressed video can be accomplished using either singleton type-1 TSK or
Mamdani FLSs. Somewhat better performance is achieved for the TSK forecaster.
Some people prefer a TSK FLS over a Mamdani FLS or vice-versa. The final choice is left to
the designer who, as always, must be guided by a specific application.
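The least-squares consequent design mentioned in the key points above is worth sketching: with the antecedent MFs fixed, f_s(x) = Σ_l y̅^l φ_l(x) is linear in the consequents y̅^l, so they solve a linear system. The example below (my own illustrative two-rule, one-input case, with hand-written 2×2 normal equations; all numbers are assumptions) fits approximately linear data.

```python
# Sketch: least-squares design of the consequents ybar^l with the
# antecedent MFs (and hence the FBFs phi_l) fixed in advance.
import math

def phi(x, means=(0.0, 1.0), s=0.5):
    """The two FBFs for fixed Gaussian antecedents."""
    w = [math.exp(-0.5 * ((x - m) / s) ** 2) for m in means]
    g = sum(w)
    return [wl / g for wl in w]

data = [(0.0, 0.1), (0.25, 0.3), (0.5, 0.5), (0.75, 0.7), (1.0, 0.9)]

# Normal equations A*ybar = b, A[i][j] = sum phi_i*phi_j, b[i] = sum phi_i*y.
A = [[0.0, 0.0], [0.0, 0.0]]
b = [0.0, 0.0]
for x, y in data:
    ph = phi(x)
    for i in range(2):
        b[i] += ph[i] * y
        for j in range(2):
            A[i][j] += ph[i] * ph[j]

det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
ybar = [(b[0] * A[1][1] - b[1] * A[0][1]) / det,
        (b[1] * A[0][0] - b[0] * A[1][0]) / det]

fit = lambda x: sum(yb * p for yb, p in zip(ybar, phi(x)))
print(ybar, max(abs(fit(x) - y) for x, y in data))
```

For more rules and antecedents the same idea scales by solving the larger linear system with a standard least-squares routine; the drawback noted above remains that the antecedent parameters must be well chosen ahead of time.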
Practice Problem
Complete Exercise 13-1. [This exercise is very similar to the calculations that are included in
Lesson 9 of this Study Guide. So, you may be wondering why I am asking you to once again
carry out derivative calculations. My answer to this rhetorical question is: these calculations
require your bringing together all of the equations that are needed to implement a type-1 TSK
FLS, and this is a good thing to do.]
Reading Assignment
Read below about one or more of the following three applications:
1. Rule-based pattern classification of video traffic
2. Equalization of time-varying non-linear digital communication channels
3. Fuzzy logic control
I. Rule-Based Classification of Video Traffic
For this self-study course, we focus on the use of type-1 FLSs as rule-based classifiers.
Consequently, we have modified Section 14.4 of the textbook as follows:
1. Read the first three paragraphs in Section 14.4 on pp. 458-459.
2. Paragraph 4 of Section 14.4, on p. 459, is modified to:
Given a collection of MPEG-1 compressed movies and sports program videos, we shall use a
subset of them to create (i.e., design and test) a rule-based classifier (RBC) in the framework of
FL. We shall develop two type-1 classifiers and compare them to see which provides the best
performance. Our overall approach is to:
1. Choose appropriate features that act as the antecedents in a RBC
2. Establish rules using the features
3. Optimize the rule design-parameters using a tuning procedure
4. Evaluate the performance of the optimized RBC using testing
The first two steps of this procedure are relatively straightforward. The third step requires that
we establish the computational formulas for the FL-based classifiers, in much the same way that
we established such formulas for the Mamdani FLSs of Chapters 5 and 6 and the TSK FLS in
Chapter 13. We do this next. The fourth step requires that we also baseline our FL
classifiers. We do this using the accepted standard of a Bayesian classifier, one whose structure
we also explain next.
R^l: IF x_1 is F_1^l and x_2 is F_2^l and x_3 is F_3^l, THEN y^l = c^l,  l = 1, ..., M    (14-1)

where c^l = +1 when rule l's video product is a movie and c^l = -1 when it is a sports program.
Observe that these rules are a special case of a Mamdani FLS rule, one in which the consequent
is a singleton. Such a rule can also be interpreted as a TSK rule.
We use a very small number of rules, namely one per video product, e.g. if our training set
contains four movies and four sports programs, we use just eight rules.
6. Omit Section 14.4.4.
7. Section 14.4.5 (Design parameters in a FL RBC) is modified to:
In our simulations below we shall design two FL RBCs, a singleton type-1 FL RBC and a
non-singleton type-1 FL RBC. The design results will establish which classifier provides the better
performance.
Each antecedent membership function has two design parameters, its mean and standard
deviation; hence, there are six design parameters per rule. For the non-singleton type-1 FL RBC
there is also one additional design parameter for each measurement: the standard deviation of
its Gaussian MF.
Optimum values for all design parameters are determined during a tuning process; but, before
such a process can be programmed, we must first establish computational formulas for the FL
RBCs.
8. Read Section 14.4.6.
9. Omit Section 14.4.7.
From these results, we see that the non-singleton type-1 FL RBC provides the best performance,
and has 35.8% fewer false alarms than does the Bayesian classifier. Additional simulation
studies that use 20 video products (10 movies and 10 sports programs) have been performed and
support these conclusions.
In summary, we have demonstrated that it is indeed possible to perform high-level classification
of movies and sports programs working directly with compressed data. Even better performance
is possible using type-2 FL RBCs, as will be demonstrated in the follow-on course New
Directions in Rule-Based Fuzzy Logic Systems: Handling Uncertainties (see, also, the discussion
of results in the textbook's Section 14.4.10 on pp. 468-469).
II. Equalization of Time-Invariant Non-linear Digital Communication Channels
For this self-study course, we focus on the use of type-1 FLSs as equalizers, i.e., fuzzy adaptive
filters (FAFs), for time-invariant non-linear digital communication channels. Consequently, we
have modified Section 14.5 of the textbook as follows:
1. Read pp. 469-470, through Figure 14-2.
2. Read Section 14.5.1.
3. Omit Section 14.5.2.
4. Section 14.5.3 (Designing the FAFs) is modified to:
Here we illustrate the design of a singleton type-1 FAF for the non-linear time-invariant channel
in (14-36). The FAF has eight rules, one per channel state, and the rules have the following
structure (l = 1, ..., 8):

R^l: IF r(k) is F_1^l and r(k-1) is F_2^l, THEN y^l = w^l    (14-44)

Decision rule: use (14-10), with y_RBC,1(x) replaced by y_FAF,1(x).
In Karnik et al. (1999) and Liang and Mendel (2000d), the mean-value parameters of all
membership functions were estimated using a clustering procedure [Chen et al. (1993a)] that was
applied to some training data, because such a procedure is computationally simple. We used this
same procedure. An alternative to doing this is to use a tuning procedure.
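The clustering idea can be illustrated generically. The sketch below is not the procedure of Chen et al. (1993a); it is a plain k-means stand-in, with illustrative names, showing how cluster centers of the training data can supply the mean values of the antecedent MFs:

```python
import numpy as np

def kmeans_centers(data, k, iters=50, seed=0):
    """Plain k-means: each final center supplies the mean value of one
    rule's antecedent MFs (one rule per cluster / channel state)."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assign each training vector to its nearest center.
        d2 = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Move each center to the mean of its members.
        for j in range(k):
            members = data[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers
```

Such a procedure is computationally simple, which is exactly why it is attractive as an alternative to gradient-based tuning of the mean values.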
5. Section 14.5.4 (Simulations and conclusions) is modified to:
Here we compare a singleton type-1 FAF and a K-nearest neighbor classifier (NNC) [Savazzi et
al. (1998)] for equalization of the time-invariant non-linear channel in (14-36). In our
simulations, we chose the number of taps of the equalizer, p, equal to the number of taps of the
channel, n + 1, where n = 1; i.e., p = n + 1 = 2. The number of rules equaled the number of
clusters; i.e., 2^(p+n) = 8. We used a sequence s(k) of length 1000 for our experiments. The first
121 symbols were used for training (i.e., clustering) and the remaining 879 were used for testing.
The training sequence established the parameters of the antecedent membership functions, as
described in Section 14.5.3. After training, the parameters of the type-1 FAF were fixed and then
testing was performed.
The results below do not appear in the textbook (the ones in Figures 14-4 and 14-5 are for a
time-varying channel). They were created especially for the Study Guide by Dr. Qilian Liang.
We ran simulations for nine different SNR values, ranging from SNR = 10dB to SNR = 18dB (at
equal increments of 1dB), and we set d = 0. We performed 100 Monte-Carlo simulations for
each value of SNR, where each realization used a different sample of the additive Gaussian noise (AGN). The mean values and
standard deviations of the bit error rate (BER) for the 100 Monte-Carlo realizations are plotted in
Figures 1 and 2 below, respectively. Observe, from these figures that:
1. In terms of the mean values of BER, the type-1 FAF performs better than the NNC (see
Figure 1).
2. In terms of the standard deviation of BER, the type-1 FAF is more robust than the NNC
(see Figure 2).
These observations suggest that a type-1 FAF, as just designed, looks very promising as a
transversal equalizer for time-invariant non-linear channels.
Figure 1: Average BER of type-1 FAF and nearest neighbor classifier (NNC) versus SNR.
Figure 2: STD of BER of type-1 FAF and nearest neighbor classifier (NNC) versus SNR.
When uncertainties, such as additive measurement noise or time-varying channel coefficients,
are present, then type-2 FAFs outperform their type-1 counterparts, because they are able to
model such uncertainties and minimize their effects. This will be demonstrated in the follow-on
course New Directions in Rule-Based Fuzzy Logic Systems: Handling Uncertainties.
[Figure: A taxonomy of fuzzy control. Fuzzy control divides into non-adaptive fuzzy control
(plant model known) and adaptive fuzzy control (plant model unknown).
Non-adaptive fuzzy control:
- Linear plant model: basic property analysis, optimal fuzzy control, etc.
- Non-linear plant model: basic property analysis, optimal fuzzy control, etc.
- Fuzzy plant model: robust fuzzy control and LMI, stability analysis, etc.
Adaptive fuzzy control:
- Direct scheme: learning the fuzzy controller parameters on-line, incorporating human control
knowledge, stability and convergence analysis, etc.
- Indirect scheme: learning model parameters on-line, using plant knowledge, stability and
convergence analysis, etc.]
that rule. The fuzzy control rules are designed so as to push the system's states to the so-called
sliding surface. See Chapter 19 of Wang (1997) or Driankov et al. (1996) for detailed
discussions about fuzzy sliding-mode control.
III.B.3 Fuzzy plant model plus fuzzy controller
Using a fuzzy model of the plant as well as a fuzzy controller is very popular in recent fuzzy
control studies. The plant is modeled using r TSK fuzzy logic rules of the form:
Plant R^i: IF z_1(t) is F_i1 and ... and z_g(t) is F_ig, THEN ẋ(t) = A_i x(t) + B_i u(t), y_i(t) = C_i x(t)    (1)
where i = 1,...,r . The fuzzy controller is also modeled by r TSK fuzzy rules of the form:
Controller R^i: IF z_1(t) is F_i1 and ... and z_g(t) is F_ig, THEN u(t) = K_i x(t)    (2)
where i = 1,...,r . The main advantage of this approach is that, although the plant model and the
controller are non-linear, the control law can be designed locally (i.e., for each i) using linear
control design principles. Specifically, from (1) and (2), we see that for each local region
described by "IF z_1(t) is F_i1 and ... and z_g(t) is F_ig," the plant model is linear and the controller is also
linear. Studies have shown that if all of the local linear controllers are stable, then under certain
conditions the global control system is also stable. For detailed discussions about this, see
[Tanaka, K., T. Ikeda, and H. O. Wang, "Fuzzy Regulators and Fuzzy Observers: Relaxed
Stability Conditions and LMI-Based Designs," IEEE Trans. on Fuzzy Systems, vol. 6, pp. 250-256, 1998.]
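The local-to-global blending behind (1) and (2) can be sketched as follows. This is an illustrative sketch with my own names, assuming the rule activations supplied by member(z) are nonnegative and are normalized inside the function:

```python
import numpy as np

def tsk_plant_derivative(x, u, z, A, B, member):
    """ẋ(t) as the normalized blend of the r local linear models in (1).
    member(z) returns the (r,) nonnegative antecedent firing levels."""
    h = member(z)
    h = h / h.sum()                      # normalized rule activations
    return sum(h[i] * (A[i] @ x + B[i] * u) for i in range(len(A)))
```

When only one rule fires, the global dynamics reduce to that rule's local linear model, which is exactly the property that lets the control law be designed locally with linear methods.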
IV. Adaptive Fuzzy Control
In this section we describe two kinds of adaptive fuzzy controlindirect and direct. In indirect
adaptive fuzzy control the fuzzy controller comprises a number of fuzzy systems constructed
(initially) from knowledge about the plant, whereas in direct adaptive fuzzy control, the fuzzy
controller is a single fuzzy system constructed (initially) from knowledge about the control. It is
even possible to combine indirect and direct fuzzy controllers.
IV.A Indirect Adaptive Fuzzy Control
In indirect adaptive fuzzy control, plant non-linearities are unknown and fuzzy systems are used
to model them. The parameters of the fuzzy systems are tuned on-line in such a way that the
overall output of the fuzzy system model follows the output of the plant. The controller is
designed according to the fuzzy system model, which is considered to be the true model of the
plant. Since the fuzzy system model is changing on-line, the controller is time-varying and
adaptive. More specifically, consider the plant with the structure:
ẋ(t) = f(x(t)) + g(x(t)) u(t)    (3)

y(t) = x_1(t)    (4)
where f and g are unknown non-linear functions. The fuzzy system model for the plant is

ẋ(t) = f̂(x(t) | θ_f(t)) + ĝ(x(t) | θ_g(t)) u(t)    (5)

where f̂ and ĝ are fuzzy systems, and θ_f(t) and θ_g(t) are parameters of the respective fuzzy
system. These parameters change on-line (which is why they are shown as functions of time) so
as to make f̂ and ĝ approximate f and g, respectively. The adaptation laws for θ_f(t) and θ_g(t)
have the general forms:

θ̇_f(t) = h_f(θ_f(t), x(t))    (6)

θ̇_g(t) = h_g(θ_g(t), x(t))    (7)
The controller, u(t), is designed as if (5) is the true model of the plant in (3). For example, the
following controller cancels the non-linearities and then uses a linear control law to make the
plant output x(t) follow a desired trajectory, x_d(t), of a first-order dynamical system:

u(t) = [1 / ĝ(x(t) | θ_g(t))] [-f̂(x(t) | θ_f(t)) + ẋ_d(t) + k(x_d(t) - x(t))]    (8)

where k > 0 is a design gain.

IV.B Direct Adaptive Fuzzy Control

In direct adaptive fuzzy control, the controller itself is a single fuzzy system:

u(t) = u_D(x(t) | θ(t))    (9)

where u_D is a standard FLS whose parameters, θ(t), are updated on-line in a similar manner to
(6) and (7), so as to force y(t) to follow a desired trajectory, x_d(t). See Wang (1997, 1994) for the
details.
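The certainty-equivalence idea behind (8) can be sketched as follows. This is an illustrative sketch, not Wang's exact design: the gain k is an assumed design parameter, and the on-line adaptation of θ_f(t) and θ_g(t) is omitted (f_hat and g_hat are treated as frozen estimates):

```python
def ce_control(x, x_d, xdot_d, f_hat, g_hat, k=3.0):
    """Certainty-equivalence control in the spirit of (8): cancel the
    estimated non-linearity f_hat, then apply linear tracking feedback."""
    return (-f_hat(x) + xdot_d + k * (x_d - x)) / g_hat(x)
```

If f_hat and g_hat happened to be exact, the tracking error e(t) = x_d(t) - x(t) would obey ė = -k e, so x(t) converges to x_d(t).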
V. Conclusions
Fuzzy control is an active research field and many new results have appeared in recent years. A
good reference that puts many approaches to fuzzy control into a single book is [Farinwata, S.
S., Filev, D. and R. Langari, Fuzzy Control: Synthesis and Analysis, John Wiley & Sons, Ltd.,
New York, 2000].
Key Points
I. Rule-Based Classification of Video Traffic
Direct classification of compressed video traffic can save time and money.
The three features that are used in RBCs are the logarithms of the bits per I, P, and B frame.
I frames have more bits/frame than P frames, which have more bits/frame than B frames.
Rules for a RBC of compressed video traffic use the three selected features as their
antecedents and have one consequent (+1 for a movie or -1 for a sports program).
Each video product leads to one rule.
The computational formulas for type-1 FL RBCs follow directly from computational
formulas for singleton or non-singleton unnormalized Mamdani type-1 FLSs, in which the
MF of the consequent is either 1 (for a movie or sports program) or 0 (for anything else).
Each rule has a small number of design parameters that can be tuned using a training set of
video traffic and the steepest descent tuning procedures described in Chapters 5 and 6.
The performance of the FL RBCs is base-lined against a Bayesian classifier.
False-alarm rate (FAR) is used as the measure of performance for all classifiers.
The FL classifiers outperformed the Bayesian classifier, and the non-singleton type-1 FL
RBC gave the lowest FAR.
In adaptive fuzzy control, the controller's parameters are updated during the real-time
operation of the overall system.
Review Questions
I. Rule-Based Classification of Video Traffic
1. Circle all of the possible design parameters for a non-singleton type-1 FL RBC, when
Gaussian MFs are used:
a. Mean of each antecedent MF
b. Mean of the consequent MF
c. Standard deviation of each antecedent MF
d. Standard deviation of the consequent MF
e. Mean of each measurement MF
f. Standard deviation of each measurement MF
g. Kurtosis of each measurement MF
2. The output of the FLS in a RB FLC:
a. must be normalized
b. does not have to be normalized
c. must come from a Mamdani architecture
3. For a FL RBC, the sum of the normalized firing levels, Σ_{l=1}^{M} φ_l(x):
a. = 0
b. = 1
c. > 0 always
d. < 0 always
4. A 10-rule singleton type-1 FL RBC that uses Gaussian MFs has how many design parameters?
a. 50
b. 60
c. 70
5. Suppose that a FL RBC gives the following results for 500 testing elements: 240 movies are
correctly classified, 245 sports programs are correctly classified, 10 movies are mis-classified as
sports programs, and 5 sports programs are mis-classified as movies. How many false alarms are
there?
a. 5
b. 10
c. 15
III. Fuzzy Logic Control
1. How many kinds of fuzzy logic controllers are there?
a. one
b. many
c. six
2. A controller's parameters are determined during the design phase and do not change during the
on-line implementation phase in what kind of control?
a. non-linear
b. non-adaptive fuzzy control
c. adaptive fuzzy control
d. sliding-mode control
3. The motivation to study the control of a linear plant using a fuzzy controller is:
a. Improved performance can usually be obtained by controlling a linear plant with a
non-linear controller, and a FL controller is non-linear
b. Systems that use a fuzzy logic controller are guaranteed to be stable and robust
c. They are very simple to design
4. Sliding-mode control can be applied in the presence of non-linear plant-model uncertainties
and plant-parameter disturbances, provided that the uncertainties and disturbances are:
a. uncorrelated
b. unknown
c. known
d. stationary
5. The main advantage to using a fuzzy model of the plant as well as a fuzzy controller is:
a. The plant model and controller are linear
b. Although the plant model and controller are non-linear, the control law can be designed
locally, and for each local region the plant model is linear and the controller is also linear
c. Although the plant model and controller are non-linear, the control law can be designed
locally, and for each local region the plant model and the controller are non-linear
Lesson 14: COMPUTATION
Learning Objectives
This lesson focuses on computation, both for implementing a type-1 FLS during its operation
and for the design of the FLS. The purposes of this lesson are to enumerate all computations for
singleton and non-singleton type-1 Mamdani FLSs and for a singleton type-1 TSK FLS, and to
overview on-line software that is available for these computations. This lesson will let you see
the forest for the trees (so to speak). After completing this lesson you will be able to:
Describe the nature of and the order of all computations needed to implement the three type-1
FLSs studied in this course.
Describe the nature of and the order of all computations needed to design the three type-1
FLSs studied in this course.
Explain what software is available to implement and design the three type-1 FLSs studied in
this course.
Reading Assignment
All of the reading material for this lesson is in this Study Guide.
I. Implementation of Type-1 Mamdani FLSs
In this section we collect all of the equations that are needed to implement singleton and
non-singleton type-1 Mamdani FLSs. These equations require the designer to make many choices
(see Figure 5-9) and will change if the choices are different from the ones we make.
I.A Singleton type-1 Mamdani FLS
General equation for the inference engine [see (5-10)]:

μ_B^l(y) = μ_G^l(y) ⋆ μ_F_1^l(x_1) ⋆ ... ⋆ μ_F_p^l(x_p),  y ∈ Y    (1)

Input-output equation for the FLS: This requires specific choices to be made, e.g., max-product
composition and product implication [which together mean we use the product t-norm in
(1)], and height defuzzification, so that [see (5-24) and (5-25)]:

y(x) = f_s(x) = Σ_{l=1}^{M} ȳ^l φ_l(x)    (2)

φ_l(x) = [Π_{i=1}^{p} μ_F_i^l(x_i)] / [Σ_{l=1}^{M} Π_{i=1}^{p} μ_F_i^l(x_i)],  l = 1, ..., M    (3)
Final implementation of input-output equation for the FLS: This requires choices to be
made about the MFs, e.g., Gaussian antecedent MFs [see (5-33)]:

μ_F_i^l(x_i) = exp[-(1/2)((x_i - m_F_i^l) / σ_F_i^l)^2]    (4)
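Equations (2)-(4) can be collected into a short Python sketch (illustrative names; the product t-norm and height defuzzification are assumed, as above):

```python
import numpy as np

def f_s(x, means, sigmas, ybar):
    """Singleton type-1 Mamdani FLS: Gaussian antecedent MFs (4),
    product t-norm, height defuzzification (2)-(3).
    x: (p,); means, sigmas: (M, p); ybar: (M,) consequent centers."""
    mu = np.exp(-0.5 * ((x - means) / sigmas) ** 2)   # (M, p) memberships
    fire = mu.prod(axis=1)                            # (M,) firing levels
    return float(fire @ ybar / fire.sum())            # height defuzzifier
```

When the input sits at one rule's antecedent centers and far from all others, the output is essentially that rule's consequent center.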
I.B Non-singleton type-1 Mamdani FLS

General equations for the inference engine:

μ_Q_k^l(x^l_k,max) = sup_{x_k ∈ X_k} μ_Q_k^l(x_k)    (5)

where

μ_Q_k^l(x_k) = μ_X_k(x_k) μ_F_k^l(x_k)    (6)

μ_B^l(y) = μ_G^l(y) T_{k=1}^{p} μ_Q_k^l(x^l_k,max)    (7)
Input-output equation for the FLS: This requires specific choices to be made, e.g., max-product
composition and product implication [which together mean we use the product t-norm in
(7)], and height defuzzification, so that [see (6-17) and (6-18)]:

y(x) = f_ns(x) = Σ_{l=1}^{M} ȳ^l φ_l(x)    (8)

φ_l(x) = [Π_{k=1}^{p} μ_Q_k^l(x^l_k,max)] / [Σ_{l=1}^{M} Π_{k=1}^{p} μ_Q_k^l(x^l_k,max)]    (9)
Final implementation of input-output equation for the FLS: This requires choices to be
made about the MFs, e.g., Gaussian antecedent MFs and Gaussian input MFs [see (6-24) and (6-25)]:

μ_F_k^l(x_k) = exp[-(1/2)((x_k - m_F_k^l) / σ_F_k^l)^2]    (10)

μ_X_k(x_k) = exp[-(1/2)((x_k - m_X_k) / σ_X_k)^2],  k = 1, ..., p    (11)

x^l_k,max = (σ_X_k^2 m_F_k^l + σ_F_k^l^2 m_X_k) / (σ_X_k^2 + σ_F_k^l^2)    (12)

μ_Q_k^l(x^l_k,max) = exp[-(1/2)(m_X_k - m_F_k^l)^2 / (σ_X_k^2 + σ_F_k^l^2)]    (13)
Equations (8), (9) and (13) implement a non-singleton type-1 Mamdani FLS.
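Equations (8), (9) and (13) similarly collapse into a few lines. The sketch below uses illustrative names and, for brevity, one common input standard deviation sigma_x for all p inputs, with the input MFs centered at the measurements:

```python
import numpy as np

def f_ns(x, means, sigmas, sigma_x, ybar):
    """Non-singleton type-1 Mamdani FLS via (8), (9) and (13):
    Gaussian input MFs centered at the measurements x, std sigma_x."""
    # mu_{Q_k^l}(x^l_{k,max}) from (13), with m_{X_k} = x_k
    q = np.exp(-0.5 * (x - means) ** 2 / (sigma_x ** 2 + sigmas ** 2))
    fire = q.prod(axis=1)                  # (M,) firing levels
    return float(fire @ ybar / fire.sum())
```

Setting sigma_x = 0 recovers the singleton FLS of Section I.A, as it should.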
II. Implementation of Type-1 TSK FLSs

II.A First-order normalized type-1 TSK FLS

General equations: Using the product t-norm,

y_TSK,1(x) = [Σ_{i=1}^{M} f^i(x) y^i(x)] / [Σ_{i=1}^{M} f^i(x)]    (14)

f^i(x) = T_{k=1}^{p} μ_F_k^i(x_k)    (15)

where y^i(x) = c_0^i + c_1^i x_1 + ... + c_p^i x_p is the i-th rule's first-order consequent.
Final implementation of input-output equation for the normalized TSK FLS: This
requires choices to be made about the MFs, e.g., Gaussian antecedent MFs [see (13-6)]:

μ_F_k^i(x_k) = exp[-(1/2)((x_k - m_F_k^i) / σ_F_k^i)^2]    (16)
II.B First-order unnormalized type-1 TSK FLS

General equations [see (13-2)-(13-4)]: Using the product t-norm,

y_TSK,1(x) = Σ_{i=1}^{M} f^i(x) y^i(x)    (17)

f^i(x) = T_{k=1}^{p} μ_F_k^i(x_k)    (18)
Final implementation of input-output equation for the unnormalized TSK FLS: This
requires choices to be made about the MFs, e.g., Gaussian antecedent MFs [see (13-6)]:

μ_F_k^i(x_k) = exp[-(1/2)((x_k - m_F_k^i) / σ_F_k^i)^2]    (19)
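Equations (17)-(19) can be sketched as follows (illustrative names; first-order consequents assumed):

```python
import numpy as np

def y_tsk_unnorm(x, means, sigmas, c):
    """Unnormalized first-order type-1 TSK FLS, (17)-(19).
    c: (M, p+1) with row i = (c_i0, c_i1, ..., c_ip)."""
    f = np.exp(-0.5 * ((x - means) / sigmas) ** 2).prod(axis=1)  # (18)-(19)
    y_i = c[:, 0] + c[:, 1:] @ x        # first-order rule consequents
    return float(f @ y_i)               # (17): no normalization
```

Note the only difference from the normalized TSK FLS of Section II.A is the missing division by Σ_i f^i(x).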
III. Designs of Mamdani FLSs Using a Back Propagation (Steepest Descent) Design
Procedure
In this section we collect all of the equations that are needed to design singleton and
non-singleton type-1 Mamdani FLSs using the back-propagation (steepest descent) method. These
equations require the designer to make many choices (see Figure 5-9) and will change if the
choices are different from the ones we make.
III.A Singleton type-1 Mamdani FLS
The key design equations are described in Section 5.9.3 [see (5-48)-(5-50)]:
m_F_k^l(i+1) = m_F_k^l(i) - α [f_s(x^(i)) - y^(i)] [ȳ^l(i) - f_s(x^(i))] φ_l(x^(i)) (x_k^(i) - m_F_k^l(i)) / σ_F_k^l^2(i)    (20)

σ_F_k^l(i+1) = σ_F_k^l(i) - α [f_s(x^(i)) - y^(i)] [ȳ^l(i) - f_s(x^(i))] φ_l(x^(i)) (x_k^(i) - m_F_k^l(i))^2 / σ_F_k^l^3(i)    (21)

ȳ^l(i+1) = ȳ^l(i) - α [f_s(x^(i)) - y^(i)] φ_l(x^(i))    (22)

where φ_l(x^(i)) and f_s(x^(i)) are computed using (2)-(4), in which ȳ^l = ȳ^l(i), m_F_k^l = m_F_k^l(i) and
σ_F_k^l = σ_F_k^l(i).
III.B Non-singleton type-1 Mamdani FLS
The key design equations are described in Section 6.6.3 [see (6-30)-(6-33)]:
m_F_k^l(i+1) = m_F_k^l(i) - α [f_ns(x^(i)) - y^(i)] [ȳ^l(i) - f_ns(x^(i))] φ_l(x^(i)) (x_k^(i) - m_F_k^l(i)) / (σ_X^2(i) + σ_F_k^l^2(i))    (23)

ȳ^l(i+1) = ȳ^l(i) - α [f_ns(x^(i)) - y^(i)] φ_l(x^(i))    (24)

σ_F_k^l(i+1) = σ_F_k^l(i) - α [f_ns(x^(i)) - y^(i)] [ȳ^l(i) - f_ns(x^(i))] φ_l(x^(i)) σ_F_k^l(i) (x_k^(i) - m_F_k^l(i))^2 / (σ_X^2(i) + σ_F_k^l^2(i))^2    (25)

σ_X(i+1) = σ_X(i) - α [f_ns(x^(i)) - y^(i)] [ȳ^l(i) - f_ns(x^(i))] φ_l(x^(i)) σ_X(i) (x_k^(i) - m_F_k^l(i))^2 / (σ_X^2(i) + σ_F_k^l^2(i))^2    (26)

where φ_l(x^(i)) and f_ns(x^(i)) are computed using (8), (9) and (13), in which ȳ^l = ȳ^l(i),
m_F_k^l = m_F_k^l(i), σ_F_k^l = σ_F_k^l(i) and σ_X = σ_X(i).
IV. Designs of TSK FLSs Using a Back Propagation (Steepest Descent) Design Procedure
In this section we collect all of the equations that are needed to design singleton normalized and
unnormalized type-1 TSK FLSs using the back-propagation (steepest descent) method. These
equations also require the designer to make many choices and will change if the choices are
different from the ones we make.
IV.A First-order normalized type-1 TSK FLSs
The key design equations have been worked out by you in Lesson 12, Exercise 13-1 [see
(5) and (15) in the solution to Exercise 13-1]:
c_j^i(n+1) = c_j^i(n) - α [y_TSK,1(x^(t)) - y^(t)] (w^i(n) / g(n)) x_j^(t),  j = 0, 1, ..., p  (with x_0^(t) = 1)    (27)

m_F_k^i(n+1) = m_F_k^i(n) - α [y_TSK,1(x^(t)) - y^(t)] (w^i(n) / g(n)) [y^i(x^(t)) - y_TSK,1(x^(t))] (x_k^(t) - m_F_k^i(n)) / σ_F_k^i^2(n)    (28)

where

w^i(n) = Π_{k=1}^{p} exp[-(1/2)((x_k^(t) - m_F_k^i(n)) / σ_F_k^i(n))^2]    (29)

g(n) = Σ_{i=1}^{M} w^i(n)    (30)

and a similar steepest-descent equation holds for σ_F_k^i(n+1).
V.C Normalized TSK FLS
tsk_type1.m: Compute the output(s) of a type-1 TSK FLS (type-1 antecedents and type-0
consequent) when the antecedent membership functions are Gaussian.
train_tsk_type1.m: Tune the parameters of a type-1 TSK FLS (type-1 antecedents and
type-0 consequent) when the antecedent membership functions are Gaussian, using some
input-output training data.
V.D Unnormalized TSK FLS
Although no M-files are available for unnormalized TSK FLSs, they can easily be
constructed using the structure of the M-files that are available for a normalized TSK FLS.
Key Points
The equations needed to implement singleton type-1 and non-singleton type-1 Mamdani
FLSs and singleton normalized and unnormalized type-1 TSK FLSs have been collected in
one place.
The equations needed to design [using the back-propagation (steepest descent) design
method] singleton type-1 and non-singleton type-1 Mamdani FLSs and singleton normalized
and unnormalized type-1 TSK FLSs have been collected in one place. These equations also
make use of the ones for implementing their respective type-1 FLSs.
On-line (free) software for the implementation and design of FLSs (singleton type-1 and
non-singleton type-1 Mamdani FLSs and normalized type-1 TSK FLSs) is available on the
Internet at: http://sipi.usc.edu/~mendel/software.
Practice Problems
Exercise SG 14-1
Suppose that the t-norm used for implementation of a singleton type-1 Mamdani FLS is the
minimum. How do the implementation equations change for that FLS?
Exercise SG 14-2
Suppose that the t-norm used for implementation of a non-singleton type-1 Mamdani FLS is the
minimum. How do the implementation equations change for that FLS?
Exercise SG 14-3
Suppose that the t-norm used for implementation of a singleton normalized type-1 TSK FLS is
the minimum. How do the implementation equations change for that FLS?
Reading Assignment
Read pages 66-78 of the textbook. Then read the following new material.
I. Uncertainties in Our Applications
Here I explain where uncertainties can occur in the applications that have been included as part
of this course.
I.A Forecasting of time series (Lesson 6)
Since rules are extracted from numerical data, if the data are corrupted by additive noise then the
rule antecedents and the rule consequent are uncertain. Uncertainty also affects the tuning of the
FLS parameters because noisy measurements are used. Finally, if only noisy measurements are
available to activate the FLS, then uncertainty also affects the inputs to the FLS. In this
application, all four sources of uncertainty that are listed in the first paragraph on p. 68 of the
textbook can be present.
I.B Knowledge mining using surveys (Lesson 6)
In Chapter 2, we saw that words can mean different things to different people, so rule
antecedents are uncertain because they use words. Surveys collected from a group of experts lead
to a histogram for the consequent of each rule; hence, there is uncertainty about a rule's
consequent. There is no tuning of a FLA, so this kind of uncertainty is not present in a FLA.
Activating a FLA can be done using words. In this case, there is the usual uncertainty about
words associated with this activation. In this application, only three kinds of uncertainty are
present because data is not used to tune the FLA.
I.C Rule-based classification of video traffic (Lesson 13)
Example 13-5 in the textbook demonstrates that the logarithms of I, P, or B frame sizes are more
appropriately modeled as Gaussians, each of whose mean is a constant but whose standard
deviation varies. This suggests that we should use a Gaussian MF with a fixed mean and an
uncertain standard deviation to model each frame of the compressed video. Hence, rule
antecedents are uncertain. Rule consequents in a RBC are certain because they correspond to a
class (e.g., +1 for movies or -1 for sports programs). If the parameters of a FL RBC are tuned using a
training sample, then the just-described uncertainties also affect the tuning. Measurements that
activate the FL RBC will also be uncertain because they are logarithms of I, P, or B frame sizes
as computed over a window of measurements. In this application, only three sources of
uncertainty will be present, because the uncertainty about words does not affect a rule's
consequent.
I.D Equalization of Time-Invariant Non-linear Digital Communication Channels
(Lesson 13)
Because equalization using a rule-based FLS (a FAF) is equivalent to rule-based classification,
there can be three sources of uncertainty present, namely: uncertainty about a rule's antecedents
(but not about a rule's consequent), uncertainty about the data used to tune the parameters of the
FAF, and uncertainty about the measurements used to activate the FAF. When measurements are
very accurate, then all of these uncertainties disappear. However, if the communication system is
in a time-varying environment (e.g., as in mobile communications), then channel coefficients
will be time-varying, and rule antecedents become uncertain (e.g., see Example 14-2).
I.E Fuzzy Logic Control
Because of the vast scope of FL control, and our very brief coverage of FL control in Lesson 13,
we can only provide a very cursory discussion here about where uncertainties can occur in FL
control. To be as specific as possible, we focus first on non-adaptive fuzzy control in which a
fuzzy model is used for both the plant as well as the controller [see Equations (1) and (2) in
Lesson 13]. Uncertainty can occur in rule antecedents, and may also be present in each rules
consequent if only noisy measurements are available, or if the control cannot be implemented
perfectly. If control parameters are tuned during an off-line design phase using noisy data, then
that kind of uncertainty is also present. It would seem, therefore, that all four sources of
uncertainty could be present in this kind of non-adaptive fuzzy controller problem.
Next, focus on indirect adaptive fuzzy control, as described by Equations (3)(7) in Lesson 13.
The fuzzy system models for f and g will use IF-THEN rules. If they are Mamdani rules, then
uncertainties may be present in both antecedent and consequent words. If they use TSK rules,
then uncertainties may be present just in the antecedent words. Uncertain antecedents or
consequents can be used to model the lack of knowledge about the true non-linearities, f and g.
Observe, in the adaptation laws (6) and (7), that fuzzy system parameters are updated using
measurements, so if only noisy measurements are available to do this, then data uncertainties will
also be present. It would seem, therefore, that all four sources of uncertainty can also be present
in this kind of adaptive fuzzy controller problem.
Now locate triangles so that their base end-points can be anywhere in the intervals along the
x-axis associated with the blurred average end-points. Doing this leads to a continuum of triangular
MFs sitting on the x-axis, e.g., picture a whole bunch of triangles all having the same apex point
but different base points, as in Figure 1.
For purposes of this discussion, suppose there are exactly N such triangles. Then at each value of
x, there can be up to N MF values, MF_1(x), MF_2(x), ..., MF_N(x). Let's assign a weight to each of
the possible MF values, say w_x1, w_x2, ..., w_xN (see the insert on Figure 1). We can think of these
weights as the possibilities associated with each triangle at this value of x. The resulting type-2
MF can be expressed mathematically as

{(x, {(MF_i(x), w_xi) | i = 1, ..., N}) | x ∈ X}

Another way to write this is:

{(x, MF(x, w)) | x ∈ X and w ∈ J_x}

MF(x, w) is a type-2 MF. It is three-dimensional because MF(x, w) depends on two variables, x
and w.
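The construction above can be made concrete with a small sketch (illustrative names and numbers): discretize the continuum to N embedded triangles sharing an apex, and, at a fixed x, pair each primary membership value MF_i(x) with its weight w_xi:

```python
def tri(x, l, apex, r):
    """Triangular MF with base [l, r] and unit height at apex."""
    if x <= l or x >= r:
        return 0.0
    if x <= apex:
        return (x - l) / (apex - l)
    return (r - x) / (r - apex)

def secondary_mf(x, bases, weights, apex=0.5):
    """At a fixed x, a type-2 MF pairs each primary value MF_i(x)
    with its weight (possibility) w_xi."""
    return [(tri(x, l, apex, r), w) for (l, r), w in zip(bases, weights)]
```

Sweeping x over X and collecting these pairs traces out the three-dimensional type-2 MF described above.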
[Figure 1: Triangular MFs when base end-points (l and r) have uncertainty intervals associated
with them. The insert shows the weights w_x1, ..., w_xN assigned to the MF values
MF_1(x), ..., MF_N(x) at one value of x.]
A type-1 FLS only uses type-1 fuzzy sets whereas a type-2 FLS uses at least one type-2 fuzzy
set. The diagram for a type-2 FLS is the same as for a type-1 FLS (see Figure 1-1 in the
textbook). The inference engine of a type-1 FLS maps type-1 input fuzzy sets into type-1 output
fuzzy sets, whereas the inference engine of a type-2 FLS maps type-2 and/or type-1 fuzzy sets
into type-2 fuzzy sets. The output processor for a type-1 FLS transforms a type-1 fuzzy set into a
number (i.e. a type-0 fuzzy set), and is the familiar defuzzifier. The output processor for a type-2
FLS has two components to it: (1) a type-reducer that transforms a type-2 fuzzy set into a type-1
fuzzy set (a two-dimensional type-reduced set), followed by (2) a defuzzifier that transforms the
resulting type-1 fuzzy set into a number. A type-reduced set is like a confidence interval. The
more uncertainty that is present, then the larger is the type-reduced set, and vice-versa.
Type-2 FLSs have been developed that satisfy the following fundamental design requirement:
when all sources of uncertainty disappear, a type-2 FLS must reduce to a comparable type-1
FLS. This design requirement is analogous to what happens to a probability density function
when random uncertainties disappear. In that case, the variance of the pdf goes to zero, and a
probability analysis reduces to a deterministic analysis. So, just as the capability for a
deterministic analysis is embedded within a probability analysis, the capability for a type-1 FLS
is embedded within a type-2 FLS.
Type-2 FLSs are described by type-2 membership functions (MFs) that are characterized by
more parameters than are MFs for type-1 FLSs. During the designs of type-1 and type-2 FLSs,
MF parameters are optimized using some training data. Because type-2 FLSs are characterized
by more design parameters than are type-1 FLSs (i.e., they have more design degrees of
freedom), type-2 FLSs have the potential to outperform type-1 FLSs.
Of course, one way to introduce more design degrees of freedom into a type-1 FLS is to add
more rules to it. Unfortunately, additional rules do not let a type-1 FLS account for uncertainties,
because uncertainties cannot be modeled by type-1 fuzzy sets. And, in all fairness, the additional
rules should also be provided to the type-2 FLS, especially if we require that a type-2 FLS must
reduce to a type-1 FLS when all sources of uncertainty disappear.
Some specific situations where we have found that type-2 FLSs outperform type-1 FLSs are: (1)
Measurement noise is non-stationary, but the nature of the non-stationarity cannot be expressed
ahead of time mathematically (e.g. variable SNR measurements); (2) A data-generating
mechanism is time-varying, but the nature of the time-variations cannot be expressed ahead of
time mathematically (e.g. equalization of non-linear and time-varying digital communication
channels); (3) Features are described by statistical attributes that are non-stationary, but the
nature of the non-stationarity cannot be expressed ahead of time mathematically (e.g. rule-based
classification of video traffic); and (4) Knowledge is mined from experts using IF-THEN
questionnaires (e.g. connection admission control for ATM networks).
Type-2 fuzzy sets and FLSs are covered in the textbook that came with this course and are the
subject of the follow-on (to this course) IEEE Self-Study Course New Directions in Rule-Based
Fuzzy Logic Systems: Handling Uncertainties.
Key Points

Solutions

(a) Real numbers close to 10:

μ_close to 10(x) = exp[−(x − 10)²/2σ²],  x ∈ R

(b) Real numbers approximately equal to 6:

μ_approximately equal to 6(x) = exp[−(x − 6)²/0.005],  x ∈ R

(c) μ(x) = a(x) for x < 200 (x ∈ I), and μ(x) = 1 for x ≥ 200 (x ∈ I).

There is no unique choice for a(x) and the choice of 200 is also arbitrary.
(d) Complex numbers near the origin

Let x = a + jb, so that |x| = sqrt(a² + b²). We can interpret complex numbers near the origin as those numbers for which |x| is very small. In this case |x| must be a positive real number. By multiplying a MF like any of the three given on p. 188 of the textbook by a unit step function, we can obtain the desired MF, e.g.,

μ_close to origin(x) = exp[−|x|²/2σ²] u₋₁(|x|)

where u₋₁ denotes the unit step function and σ is chosen to be small.
For LIGHT we can use

μ_LIGHT(w) = 2e^(−aw) / (1 + e^(−aw)),  w ≥ 0,

that is related to the sigmoidal function (1 − e^(−aw))/(1 + e^(−aw)) (a shifted version of which is widely used in neural networks). A more general s-curve that can be used for μ_LIGHT(w) is given in Cox (1994, pp. 51-53).

[Figure: μ_LIGHT(w) versus weight w]

For HEAVY we can use

μ_HEAVY(w) = (1 − e^(−aw)) / (1 + e^(−aw)),  w ≥ 0.

[Figure: μ_HEAVY(w) versus weight w]
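The two s-shaped MFs above are exact complements of each other, which can be checked numerically. The following is a minimal Python sketch; the function names and the choice a = 1 are illustrative, not from the textbook:

```python
import math

def mu_heavy(w, a=1.0):
    """S-curve MF for HEAVY: 0 at w = 0, rising toward 1 as w grows."""
    return (1 - math.exp(-a * w)) / (1 + math.exp(-a * w))

def mu_light(w, a=1.0):
    """MF for LIGHT: 1 at w = 0, decaying toward 0 as w grows."""
    return 2 * math.exp(-a * w) / (1 + math.exp(-a * w))

# The two MFs are complementary: mu_light(w) + mu_heavy(w) = 1 for all w >= 0.
for w in [0.0, 0.5, 1.0, 2.0, 10.0]:
    assert abs(mu_light(w) + mu_heavy(w) - 1.0) < 1e-12
```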
[Figure panels (a)-(e): MF plots of μ_{A∗B∗C}(x), μ_{(A∗B)∗C}(x), and μ_{A∗(B∗C)}(x), where ∗ denotes the set operation (union or intersection) whose associativity is being illustrated.]
Exercise 1-11

We restate (1-31) for the union, which uses the maximum:

μ_{c∪s}(u_i, v_j) = max[μ_c(u_i, v_j), μ_s(u_i, v_j)]

It follows that:

μ_{c∪s}(1,1) = max(0.9, 0) = 0.9
μ_{c∪s}(1,2) = max(0.4, 0.6) = 0.6
μ_{c∪s}(1,3) = max(0.1, 1) = 1
μ_{c∪s}(2,1) = max(0.1, 0) = 0.1
μ_{c∪s}(2,2) = max(0.4, 0) = 0.4
μ_{c∪s}(2,3) = max(0.9, 0.3) = 0.9

For the intersection, which uses the minimum:

μ_{c∩s}(u_i, v_j) = min[μ_c(u_i, v_j), μ_s(u_i, v_j)]

It follows that:

μ_{c∩s}(1,1) = min(0.9, 0) = 0
μ_{c∩s}(1,2) = min(0.4, 0.6) = 0.4
μ_{c∩s}(1,3) = min(0.1, 1) = 0.1
μ_{c∩s}(2,1) = min(0.1, 0) = 0
μ_{c∩s}(2,2) = min(0.4, 0) = 0
μ_{c∩s}(2,3) = min(0.9, 0.3) = 0.3
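The union and intersection of sampled fuzzy relations are just elementwise max and min over the membership values. A minimal Python sketch using the values from this exercise:

```python
# Membership values of the two relations, indexed by (u_i, v_j),
# taken from Exercise 1-11.
mu_c = {(1, 1): 0.9, (1, 2): 0.4, (1, 3): 0.1,
        (2, 1): 0.1, (2, 2): 0.4, (2, 3): 0.9}
mu_s = {(1, 1): 0.0, (1, 2): 0.6, (1, 3): 1.0,
        (2, 1): 0.0, (2, 2): 0.0, (2, 3): 0.3}

# Union: pointwise maximum; intersection: pointwise minimum.
union = {k: max(mu_c[k], mu_s[k]) for k in mu_c}
intersection = {k: min(mu_c[k], mu_s[k]) for k in mu_c}
```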
Exercise 1-19a

Let very likely ≡ VL. Then, according to the concept of concentration, μ_VL(x) = (μ_L(x))². We use μ_L(x) from (1-56) to compute μ_VL(x), i.e.,

μ_VL(x) = ⋯ + 0.09/0.4 + 0.04/0.3 + ⋯

Observe the higher concentration of μ_VL(x) MF values for high values of probability (x), which seems sensible for the term very likely.
Given (rows u1, u2; columns v1, v2, v3)

μ_c(u,v) = [ 0.9  0.4  0.1
             0.1  0.4  0.9 ]

and (rows v1, v2, v3; columns w1, w2)

μ_mb(v,w) = [ 0    0
              0.6  0
              1    0.7 ]

In this Exercise, ⋆ ≡ product and ⊕ ≡ maximum. The four elements of μ_{c∘mb}(u,w) are computed using the max-product composition shortcuts that are described on p. 42, as:

μ_{c∘mb}(u1,w1) = max(0.9×0, 0.4×0.6, 0.1×1) = max(0, 0.24, 0.1) = 0.24
μ_{c∘mb}(u1,w2) = max(0.9×0, 0.4×0, 0.1×0.7) = max(0, 0, 0.07) = 0.07
μ_{c∘mb}(u2,w1) = max(0.1×0, 0.4×0.6, 0.9×1) = max(0, 0.24, 0.9) = 0.9
μ_{c∘mb}(u2,w2) = max(0.1×0, 0.4×0, 0.9×0.7) = max(0, 0, 0.63) = 0.63

so that

μ_{c∘mb}(u,w) = [ 0.24  0.07
                  0.9   0.63 ]   when ⋆ ≡ product and ⊕ ≡ maximum

μ_{c∘mb}(u,w) = [ 0.4  0.1
                  0.9  0.7 ]   when ⋆ ≡ minimum and ⊕ ≡ maximum

Observe that the two MF matrices are very similar, with the biggest difference between the two occurring in the 1-1 element.
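The max-product and max-min compositions of this exercise can be reproduced with a few lines of code. This is an illustrative sketch; the helper name `compose` is ours, not the textbook's:

```python
def compose(A, B, tnorm):
    """Sup-star composition of relation matrices A (|U| x |V|) and
    B (|V| x |W|): mu(u, w) = max over v of tnorm(A[u][v], B[v][w])."""
    return [[max(tnorm(A[i][k], B[k][j]) for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

mu_c  = [[0.9, 0.4, 0.1],
         [0.1, 0.4, 0.9]]
mu_mb = [[0.0, 0.0],
         [0.6, 0.0],
         [1.0, 0.7]]

max_prod = compose(mu_c, mu_mb, lambda a, b: a * b)  # product t-norm
max_min  = compose(mu_c, mu_mb, min)                 # minimum t-norm
```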
The values of μ_B(y) are given in the following table. We used the Extension Principle, i.e., μ_B(y) = max over x in f⁻¹(y) of μ_A(x), with y = |x|:

 x    μ_A(x)   y = |x|   μ_B(y)
−5    0.2      5         max{0.2, 0.1} = 0.2
−4    0.4      4         max{0.4, 0.5} = 0.5
−3    0.4      3         max{0.4, 0.8} = 0.8
−2    0.5      2         max{0.5, 1} = 1
−1    0.5      1         max{0.5, 0.9} = 0.9
 0    0.6      0         max{0.6} = 0.6
 1    0.9      1         max{0.5, 0.9} = 0.9
 2    1        2         max{0.5, 1} = 1
 3    0.8      3         max{0.4, 0.8} = 0.8
 4    0.5      4         max{0.4, 0.5} = 0.5
 5    0.1      5         max{0.2, 0.1} = 0.2
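The table can be generated mechanically: for each y, take the maximum of μ_A(x) over all x that map to y. A minimal sketch:

```python
# Membership values of A over the integers -5..5, as in the exercise.
mu_A = {-5: 0.2, -4: 0.4, -3: 0.4, -2: 0.5, -1: 0.5,
         0: 0.6,  1: 0.9,  2: 1.0,  3: 0.8,  4: 0.5, 5: 0.1}

f = abs  # the crisp mapping y = |x|

# Extension Principle for a many-to-one mapping:
# mu_B(y) = max of mu_A(x) over all x with f(x) = y.
mu_B = {}
for x, m in mu_A.items():
    y = f(x)
    mu_B[y] = max(mu_B.get(y, 0.0), m)
```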
The truth table for ((p∧q) → r) ↔ ((p→r) ∨ (q→r)) is:

p  q  r  |  p∧q  |  (p∧q)→r  |  p→r  |  q→r  |  (p→r)∨(q→r)
T  T  T  |   T   |     T     |   T   |   T   |      T
T  T  F  |   T   |     F     |   F   |   F   |      F
T  F  T  |   F   |     T     |   T   |   T   |      T
T  F  F  |   F   |     T     |   F   |   T   |      T
F  T  T  |   F   |     T     |   T   |   T   |      T
F  T  F  |   F   |     T     |   T   |   F   |      T
F  F  T  |   F   |     T     |   T   |   T   |      T
F  F  F  |   F   |     T     |   T   |   T   |      T

Because the columns for (p∧q)→r and (p→r)∨(q→r) agree in every row, ((p∧q) → r) ↔ ((p→r) ∨ (q→r)) is a tautology.
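The tautology can also be confirmed exhaustively by machine; a short sketch using Python's itertools:

```python
from itertools import product

def implies(p, r):
    """Material implication: p -> r is false only when p is true and r is false."""
    return (not p) or r

# Check that ((p and q) -> r) <-> ((p -> r) or (q -> r)) holds in all 8 rows.
tautology = all(
    implies(p and q, r) == (implies(p, r) or implies(q, r))
    for p, q, r in product([True, False], repeat=3)
)
```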
Using (1-76),

μ_{A→B}(x, y) = 1 − μ_A(x)[1 − μ_B(y)]   (1-76)

The three figures below provide our construction of μ_B*(y). Observe that the result in part (c) is identical to the result in part (c) of Figure 1-10; hence, conclusions drawn at the end of Example 1-19 apply here as well.

[Figure panels (a)-(c): plots of μ_B(y), μ_A(x)[1 − μ_B(y)], and μ_B*(y) versus y.]
x̄(i) = (1/i) Σ_{j=1}^{i} x(j)

A recursive formula for the sample mean lets us fold a new measurement, x(i+1), into this formula to compute x̄(i+1). It is obtained as follows:

x̄(i+1) = [1/(i+1)] Σ_{j=1}^{i+1} x(j) = [1/(i+1)] [ Σ_{j=1}^{i} x(j) + x(i+1) ]

so that

x̄(i+1) = [i/(i+1)] x̄(i) + [1/(i+1)] x(i+1)

and, in particular,

x̄(48) = (47/48) x̄(47) + (1/48) x(48)

Next, consider the standard deviation. We update the standard deviation by first updating the variance, σ²(i), and then taking its positive square root. Recall that the sample variance is given as

σ²(i) = (1/i) Σ_{j=1}^{i} [x(j) − x̄(j)]²

so that

σ²(i+1) = [i/(i+1)] σ²(i) + [1/(i+1)] [x(i+1) − x̄(i+1)]²

and, in particular,

σ²(48) = (47/48) σ²(47) + (1/48) [x(48) − x̄(48)]²
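The recursions for the sample mean and variance are easy to check against the batch definitions. The sketch below folds measurements in one at a time; note that, as in the exercise, the variance uses the running mean x̄(j) at each step, not the final mean:

```python
def recursive_stats(xs):
    """Running mean and variance via the recursions
    xbar(i+1) = i/(i+1) * xbar(i) + 1/(i+1) * x(i+1)
    var(i+1)  = i/(i+1) * var(i)  + 1/(i+1) * (x(i+1) - xbar(i+1))**2"""
    xbar, var = 0.0, 0.0
    for i, x in enumerate(xs):  # i = number of samples already folded in
        xbar = i / (i + 1) * xbar + x / (i + 1)
        var = i / (i + 1) * var + (x - xbar) ** 2 / (i + 1)
    return xbar, var
```

Because each step multiplies the old accumulator by i/(i+1) before adding the new term, unrolling the loop reproduces the sums (1/i) Σ x(j) and (1/i) Σ [x(j) − x̄(j)]² exactly.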
[Figure 5-4: Pictorial description of input and antecedent operations for a type-1 FLS that uses triangular MFs. (a) Singleton fuzzification with minimum t-norm, and (b) singleton fuzzification with product t-norm.]
[Figure 5-5: Pictorial description of consequent operations for a type-1 FLS when consequent fuzzy set MFs are triangles. (a) Fired output sets with minimum t-norm, and (b) fired output sets with product t-norm.]
[Figure 5-6: Pictorial description of (a) combined output sets for the two fired output sets depicted in Figure 5-5(a), and (b) combined output sets for the two fired output sets depicted in Figure 5-5(b). Observe that the maximum of the MFs for the two fired sets coincides with the MF for the first fired output set.]
Note that it is purely by coincidence that the second fired rule makes no contribution to the
combined output sets for the two fired output sets. Your solution to this exercise might have led
to a maximum operation in which the two fired output sets contributed to the final output set.
Exercise 5-6:

When triangles are used for the interior MFs and piecewise-linear functions are used for the two shoulder (exterior) MFs, the design parameters are:

1. Shoulder MFs: break point and slope of leg (or location of base point): 2 parameters/MF.
2. Interior triangles: center location and length of base [assume that the triangle is symmetrical; for a non-symmetrical triangle a third parameter is needed, e.g., the slopes of both legs, or the left-end and right-end base points]: 2 parameters/MF.

Assume L fuzzy sets for each antecedent and consequent. Total antecedent/consequent MF design parameters: 2L.
J(θ) = (1/N) Σ_{i=1}^{N} e^{(i)}

where

e^{(i)} = (1/2) [f_s(x^{(i)}) − y^{(i)}]²   (5-47)

Hence,

grad J(θ) = ∂J(θ)/∂θ = (1/N) Σ_{i=1}^{N} ∂e^{(i)}/∂θ
          = (2/2N) Σ_{i=1}^{N} [f_s(x^{(i)}) − y^{(i)}] ∂f_s(x^{(i)})/∂θ
          = (1/N) Σ_{i=1}^{N} [f_s(x^{(i)}) − y^{(i)}] ∂f_s(x^{(i)})/∂θ

In this last equation, note that the summation also acts on ∂f_s(x^{(i)})/∂θ. Calculations of ∂f_s(x^{(i)})/∂θ are exactly the same as in Exercise 5-9. See its solution given in this Study Guide. For example, using ∂f_s(x^{(i)})/∂ȳ^l = φ_l(x^{(i)}),

ȳ^l(i+1) = ȳ^l(i) − α ∂J(ȳ^l)/∂ȳ^l = ȳ^l(i) − α (1/N) Σ_{i=1}^{N} [f_s(x^{(i)}) − y^{(i)}] φ_l(x^{(i)})
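A batch steepest-descent step of this kind can be sketched as follows, assuming Gaussian antecedents and product t-norm. All names (`fbf`, `update_ybar`) and the toy parameters are illustrative, not the textbook's notation:

```python
import math

def fbf(x, m, s):
    """Fuzzy basis functions phi_l(x) for Gaussian antecedents:
    w_l = prod_k exp(-0.5 * ((x_k - m_lk) / s_lk)**2), normalized over rules."""
    w = [math.prod(math.exp(-0.5 * ((xk - mk) / sk) ** 2)
                   for xk, mk, sk in zip(x, m[l], s[l]))
         for l in range(len(m))]
    total = sum(w)
    return [wl / total for wl in w]

def update_ybar(ybar, data, m, s, alpha):
    """One batch steepest-descent step on the consequent centroids:
    ybar_l <- ybar_l - alpha * (1/N) * sum_i [f(x_i) - y_i] * phi_l(x_i)."""
    N = len(data)
    grad = [0.0] * len(ybar)
    for x, y in data:
        phi = fbf(x, m, s)
        f = sum(yl * pl for yl, pl in zip(ybar, phi))
        for l, pl in enumerate(phi):
            grad[l] += (f - y) * pl / N
    return [yl - alpha * g for yl, g in zip(ybar, grad)]
```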
y_c1(2, 4) = (1/1.8262) [2.099 + 2.3546 + 0.5751 + 0.5115] = 5.5402/1.8262 = 3.0337
Exercise 5-15:

Single-antecedent rules: Using Figure 5-13, project upwards from the horizontal axis and observe that there can be either one, two or three intersections with MFs. Hence, a single-antecedent FLS that uses these MFs can fire one, two or three rules.

Two-antecedent rules: If each antecedent can intersect one, two or three MFs, then we take all possible combinations of products of (1, 2, 3) and (1, 2, 3) to obtain (1, 2, 3, 4, 6, 9). Hence, we conclude that a two-antecedent FLS that uses these MFs can fire one, two, three, four, six or nine rules.
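The counting argument for two-antecedent rules is a one-liner:

```python
# Each antecedent of a two-antecedent rule base can intersect 1, 2, or 3 MFs,
# so the number of fired rules is a product of two values from {1, 2, 3}.
fired_counts = sorted({i * j for i in (1, 2, 3) for j in (1, 2, 3)})
```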
[Figure 1: Pictorial description of input and antecedent operations for a non-singleton type-1 FLS that uses triangular MFs. (a) Non-singleton fuzzification with minimum t-norm, and (b) non-singleton fuzzification with product t-norm.]

Figures comparable to Figures 5-5 and 5-6 have already been created by you in Lesson 7, and they do not change. What does change are the numerical values of the firing levels.
Exercise 6-5:

Regardless of whether θ = m_{F_k^l}, ȳ^l, or σ_{F_k^l},

∂J(θ)/∂θ = ∂/∂θ { (1/2) [f_ns(x^{(i)}) − y^{(i)}]² } = [f_ns(x^{(i)}) − y^{(i)}] ∂f_ns(x^{(i)})/∂θ   (1)

where

f_ns(x^{(i)}) = Σ_{l=1}^{M} ȳ^l φ_l(x^{(i)})   (2)

and

φ_l(x^{(i)}) = Π_{k=1}^{p} exp[ −(1/2) (x_k^{(i)} − m_{F_k^l})² / (σ_X² + σ_{F_k^l}²) ]
               / Σ_{l=1}^{M} Π_{k=1}^{p} exp[ −(1/2) (x_k^{(i)} − m_{F_k^l})² / (σ_X² + σ_{F_k^l}²) ]   (3)

Comparing the first two equations in Section III of Lesson 9 (in the Study Guide) with Equations (1) and (2) above, we see that they are identical. Comparing Equation (10) in Section III of Lesson 9 with Equation (3) above, we see that they are identical when

σ²_{F_k^l}(5-9)  ↔  σ_X² + σ²_{F_k^l}(6-5)   (4)

Hence, we do not need to repeat the derivation of the steepest descent algorithms for m_{F_k^l}, ȳ^l, and σ_{F_k^l}.

We did not derive a steepest descent algorithm for σ_X in Chapter 5, because σ_X = 0 in that chapter. In (1)-(3), σ_X only appears in (3). Comparing (3) above with Equation (10) in Lesson 9 of this Study Guide, we see (as mentioned in the textbook) that, in Chapter 6, σ_X² + σ²_{F_k^l} plays the role of Chapter 5's σ²_{F_k^l}. We leave it to the reader to obtain the steepest descent algorithm for σ_{F_k^l} in (6-32) and for σ_X in (6-33). Set σ_X = 0 in this chapter's steepest-descent algorithms to show that they reduce to their singleton counterparts in (5-48)-(5-50).
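The key observation of this exercise, that the input spread enters only through the combined variance σ_X² + σ_{F_k^l}², can be illustrated with a small sketch (the function name and parameters are ours, not the textbook's):

```python
import math

def firing(x, m, s_f, s_x=0.0):
    """Per-rule Gaussian firing strength for a non-singleton type-1 FLS.
    The input spread s_x simply adds to each antecedent variance,
    s_f**2 -> s_x**2 + s_f**2; with s_x = 0 this is the singleton case."""
    return math.prod(
        math.exp(-0.5 * (xk - mk) ** 2 / (s_x ** 2 + sk ** 2))
        for xk, mk, sk in zip(x, m, s_f))
```

With s_x = 0 the expression reduces term-by-term to the Chapter 5 (singleton) firing strength, which is exactly the reduction the exercise asks for.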
Exercise 6-7:
1. Fix the shapes and parameters of all the antecedent, consequent and input measurement
membership functions ahead of time. The data establishes the rules and the standard deviation of
the measurements, and no tuning is used.
Either of our two one-pass methods can be used.
2. Fix the shapes and parameters of the antecedent and input measurement membership
functions ahead of time. Use the training data to tune the consequent parameters.
Use the least-squares method to do this.
3. Fix the shapes and parameters of all the antecedent and consequent membership functions
ahead of time. Fix the shape but not the parameter(s) of the input measurement membership
function(s) ahead of time. Use the training data to tune the parameter(s) of the input
measurement membership function(s).
Use a back-propagation (steepest descent) method to do this.
4. Fix the shapes of all the antecedent, consequent and input measurement membership functions
ahead of time. Use the training data to tune the antecedent, consequent and input measurement
parameters.
Use a back-propagation (steepest descent) method to do this.
Regardless of whether θ = c_j^i, m_{F_k^i}, or σ_{F_k^i},

∂J(θ)/∂θ = ∂/∂θ { (1/2) [y_TSK,1(x^{(t)}) − y^{(t)}]² } = [y_TSK,1(x^{(t)}) − y^{(t)}] ∂y_TSK,1(x^{(t)})/∂θ   (1)

where

y_TSK,1(x^{(t)}) = Σ_{i=1}^{M} Σ_{j=0}^{p} c_j^i g_ij(x^{(t)})   (2)

and

g_ij(x^{(t)}) = x_j^{(t)} Π_{k=1}^{p} exp[ −(1/2) ((x_k^{(t)} − m_{F_k^i}) / σ_{F_k^i})² ]
                / Σ_{i=1}^{M} Π_{k=1}^{p} exp[ −(1/2) ((x_k^{(t)} − m_{F_k^i}) / σ_{F_k^i})² ]   (3)

(1) θ = c_j^i: In this case,

∂y_TSK,1(x^{(t)})/∂c_j^i = g_ij(x^{(t)})   (4)

Hence,

c_j^i(n+1) = c_j^i(n) − α ∂J(c_j^i)/∂c_j^i = c_j^i(n) − α [y_TSK,1(x^{(t)}) − y^{(t)}] g_ij(x^{(t)})   (5)

where j = 0, 1, …, p and i = 1, …, M.

(2) θ = m_{F_k^i}: Write

y_TSK,1 = h/g   (6)

where

h = Σ_{i=1}^{M} Σ_{j=0}^{p} c_j^i w^i x_j^{(t)}   (7)

g = Σ_{i=1}^{M} w^i   (8)

and

w^i = Π_{k=1}^{p} exp[ −(1/2) ((x_k^{(t)} − m_{F_k^i}) / σ_{F_k^i})² ]   (9)

Then

∂y_TSK,1/∂m_{F_k^i} = (∂y_TSK,1/∂w^i)(∂w^i/∂m_{F_k^i})   (10)

where

∂y_TSK,1/∂w^i = [g ∂h/∂w^i − h ∂g/∂w^i] / g² = [ Σ_{j=0}^{p} x_j^{(t)} c_j^i − y_TSK,1 ] / g   (11)

and

∂w^i/∂m_{F_k^i} = ∂/∂m_{F_k^i} { Π_{k=1}^{p} exp[ −(1/2) ((x_k^{(t)} − m_{F_k^i}) / σ_{F_k^i})² ] }   (12)

so that

∂w^i/∂m_{F_k^i} = w^i (x_k^{(t)} − m_{F_k^i}) / σ²_{F_k^i}   (13)

Combining (10)-(13),

∂y_TSK,1/∂m_{F_k^i} = [ Σ_{j=0}^{p} x_j^{(t)} c_j^i − y_TSK,1 ] (x_k^{(t)} − m_{F_k^i}) w^i / (g σ²_{F_k^i})   (14)

Hence,

m_{F_k^i}(n+1) = m_{F_k^i}(n) − α [y_TSK,1(x^{(t)}) − y^{(t)}] ∂y_TSK,1/∂m_{F_k^i}   (15)

i.e.,

m_{F_k^i}(n+1) = m_{F_k^i}(n) − α [y_TSK,1(x^{(t)}) − y^{(t)}] [ Σ_{j=0}^{p} x_j^{(t)} c_j^i(n) − y_TSK,1(x^{(t)}) ] (x_k^{(t)} − m_{F_k^i}(n)) w^i(n) / (g(n) σ²_{F_k^i}(n))   (16)

(3) θ = σ_{F_k^i}: Because this computation is just like the one for ∂w^i/∂m_{F_k^i}, we leave its details to the reader.
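A sketch of the normalized first-order TSK forward pass and the consequent-coefficient update of Equation (5). The function names and toy dimensions are illustrative, not the textbook's notation:

```python
import math

def tsk_forward(x, m, s, c):
    """Normalized first-order TSK output:
    y = sum_i w_i * (c_i0 + sum_j c_ij * x_j) / sum_i w_i,
    with Gaussian rule firings w_i."""
    w = [math.prod(math.exp(-0.5 * ((xk - mk) / sk) ** 2)
                   for xk, mk, sk in zip(x, m[i], s[i]))
         for i in range(len(m))]
    g = sum(w)
    xe = [1.0] + list(x)  # x_0 = 1 handles the constant term c_i0
    h = sum(w[i] * sum(cij * xj for cij, xj in zip(c[i], xe))
            for i in range(len(m)))
    return h / g, w, g

def update_c(x, y_target, m, s, c, alpha):
    """One steepest-descent step on the consequent coefficients:
    c_ij <- c_ij - alpha * (y - y_target) * g_ij, with g_ij = x_j * w_i / g."""
    y, w, g = tsk_forward(x, m, s, c)
    xe = [1.0] + list(x)
    return [[cij - alpha * (y - y_target) * xe[j] * w[i] / g
             for j, cij in enumerate(ci)]
            for i, ci in enumerate(c)]
```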
Under minimum t-norm, the firing-level equations become

f^l(x) = min_{i=1,2,…,p} [ μ_{X_i}(x_i) ⋆ μ_{F_i^l}(x_i) ],  l = 1, …, M,

which, for singleton fuzzification, reduces to

f^l(x) = min_{i=1,2,…,p} μ_{F_i^l}(x_i),  l = 1, …, M.

Exercise SG 14-2:

Equation (9) changes to [see (6-19)]

x^l_{k,max} = ( σ_{F_k^l} m_{X_k} + σ_{X_k} m_{F_k^l} ) / ( σ_{X_k} + σ_{F_k^l} ),  l = 1, …, M

[see Lesson 5, Example 5-2, Part (c), in which we make the appropriate substitutions for x_max, x′, m_A, σ_{A*}, and σ_A], so that

μ_{Q_k^l}(x^l_{k,max}) = exp[ −(1/2) (m_{X_k} − m_{F_k^l})² / (σ_{X_k} + σ_{F_k^l})² ]

Equation (8) and the new equations for f^l(x) and μ_{Q_k^l}(x^l_{k,max}) implement a non-singleton type-1 Mamdani FLS under minimum t-norm.

Exercise SG 14-3:

Only Equation (15) changes, to

f^i(x) = min_{k=1,2,…,p} μ_{F_k^i}(x_k)

This new equation for f^i(x), and Equation (16), implement a singleton, normalized type-1 TSK FLS under minimum t-norm.
LOTFI A. ZADEH

Child of privilege

Tekla S. Perry, Senior Editor
Perhaps the confidence Zadeh had in his judgment despite some tough opposition, and his willingness to stand apart from the crowd, originated in a childhood of privilege. He was born in 1921 in Azerbaijan, then part of the Soviet Union, and moved to Iran at age 10. His parents (his father a businessman and newspaper correspondent, his mother a doctor) were comfortably well off. As a child, Zadeh was surrounded by governesses and tutors ...

IEEE Spectrum, June 1995

Lotfi A. Zadeh
Date of birth: Feb. 4, 1921
Birthplace: Baku, Azerbaijan
Height: 178 cm
Weight: 63.5 kg
Family: wife, Fay; children, Stella and Norman
Education: BSEE, University of Teheran, 1942; MSEE, Massachusetts Institute of Technology, 1946; Ph.D., Columbia University, 1949
First job: design and analysis of defense systems, International Electronics Corp., New York City, summer of 1944
Patents: one U.S. patent, two Iranian patents
Favorite books: "I made a conscious decision to stop reading fiction at age 15, when I was a voracious reader. I now read scientific books and other nonfiction only."
Periodicals read daily: four newspapers (The New York Times, San Francisco Chronicle, San Francisco Examiner, The Wall Street Journal or San Jose Mercury News), Business Week, The Economist
Favorite kind of music: classical and electronic
Favorite composers: Sergey Prokofiev, Dmitry Shostakovich
Computer: a Hewlett-Packard workstation, which is used "only to print my ..."; he dictates his answers to ...
Favorite TV show: "MacNeil/Lehrer NewsHour"
Least favorite food: any kind of shellfish
Favorite restaurant: Three Cs Cafe, an inexpensive creperie in Berkeley, Calif.
Favorite expression: "No matter what you are told, take it as a compliment."
Favorite city: Berkeley, Calif.
Leisure activities: portrait photography (has photographed U.S. Presidents Richard Nixon and Harry Truman, as well as other notables), high-fidelity audio, garage sales
Car: Nissan Quest Minivan
Languages spoken: English, Russian, Iranian, French
Airline mileage: two million miles in past 10 years on American and United airlines alone, uncounted mileage on other airlines
Key organizational memberships: the IEEE, Association for Computing Machinery, International Fuzzy Systems Association, American Association for Artificial Intelligence
Top awards: the IEEE Medal of Honor (1995) and the Japan Honda Prize (1989)
Berkeley beckons

As Zadeh was pretty much entrenched at Columbia, he surprised his colleagues when he packed up in 1959 and moved to the University of California at Berkeley.

"I had not been looking for another position," Zadeh said, "so the offer from Berkeley was unexpected." It came from electrical engineering department chairman John Whinnery, who called him at home over the weekend and offered him a position. "If my line had been busy, I believe I would still be at Columbia," Zadeh told Spectrum.

Whinnery recalls it slightly differently. He had heard from a colleague that Zadeh had been toying with the idea of leaving Columbia. Minutes later, Whinnery picked up the phone and called him, arranged to meet him in New York City for dinner, and soon afterward hired him. Berkeley was then growing rapidly, and Whinnery was on the lookout for young scholars who were considered brilliant in their fields. Zadeh fit the bill.

For Zadeh, moving to Berkeley was a simple decision to make: "I was happy at Columbia, but the job was too soft. It was a comfortable, undemanding environment; I was not challenged internally. I realized that at Berkeley my life would not be anywhere near as comfortable, but I felt that it would be good for me to be challenged."

Zadeh has never regretted the decision. To this day he remains at Berkeley, although by now as professor emeritus.

At Berkeley, Zadeh initially continued his work in linear, nonlinear, and finite-state systems analysis. But before long he became convinced that digital systems would grow in importance. Appointed as chairman of the electrical engineering department, he decided to act on that conviction, and immediately set about strengthening the role of computer science in the department's curriculum. He also lobbied the electrical engineering community ...
Fuzzy is born

While he was focusing on systems analysis, in the early 1960s, Zadeh began to feel that traditional systems analysis techniques were too precise for real-world problems. In a paper written in 1961, he mentioned that a new technique was needed, a "fuzzy" kind of mathematics. At the time, though, he had no clear idea how this would work.

That idea came in July 1964. Zadeh was in New York City visiting his parents, ...
9. Linguistic variables are variables whose values are:
a. numbers
b. algebraic relations
c. words or sentences in a natural or artificial language
10. A normal Gaussian MF has how many design degrees of freedom?
a. one
b. two
c. three
11. Membership functions for fuzzy sets:
a. are unique
b. can be of shapes that are chosen by the designer
c. are always chosen as triangles
12. A MF for integers close to 10 is:
a. μ_close to 10(x) = 0.3/7 + 0.6/8 + 1/9 + 1/10 + 1/11 + 0.6/12 + 0.3/13
b. μ_close to 10(x) = x/10 for x < 10, and 1 for x ≥ 10
c. μ_close to 10(x) = exp[−(x − 10)²/2σ²], x ∈ R
[Figures 15-a, 15-b, 15-c and 15-d: each panel plots candidate MFs μ_B(x) and μ_A(x) for questions 13-15.]
16. Given two fuzzy relations on the same product space, R(U, V) and S(U, V). Which of the following is the correct expression for the intersection of R(U, V) and S(U, V)?
a. μ_{R∩S}(x, y) = μ_R(x, y) ⋆ μ_S(x, y), ∀x ∈ U and y ∈ V
b. μ_{R∩S}(x, y) = max[μ_R(x, y), μ_S(x, y)], ∀x ∈ U and y ∈ V
c. μ_{R∩S}(x, y) = 1 − μ_R(x, y) ⋆ μ_S(x, y), ∀x ∈ U and y ∈ V
17. The intersection and union of two fuzzy relations on the same product space are called _________ of the fuzzy
relations.
a. Cartesian products
b. compositions
c. membership functions
18. A linguistic hedge is an:
a. even bet
b. operator that acts on a fuzzy set's MF, converting it into a crisp MF
c. operation that modifies the meaning of a term
19. The MF for very very unlikely is:
a. 1 − μ⁴_LIKELY(x)
b. [1 − μ_LIKELY(x)]²
c. [1 − μ_LIKELY(x)]⁴
22. Given μ_a(u, v) and

μ_b(v, w) = [ 0    0.2
              0.8  1
              0.3  0.4 ]

μ_{a∘b}(u2, w1) is:
a. μ_{a∘b}(u2, w1) = 0
b. μ_{a∘b}(u2, w1) = 0.3
c. μ_{a∘b}(u2, w1) = 0.5
23. The Extension Principle lets us:
a. compute the sup-star composition for continuous-valued fuzzy sets
b. extend mathematical relations between one-dimensional fuzzy variables to multi-dimensional fuzzy
variables
c. extend mathematical relationships between non-fuzzy variables to fuzzy variables
24. Given A = small = A (x) , in which one of the situations below would you use the Extension Principle?
a. Find the MF of C = not small = C (x)
b. Find the MF of B = very small = B (x)
c. Find the MF of D = (very small)³
25. Which of the Extension Principles stated below is the correct one to use for a one-to-many multi-variable mapping?
a. μ_B(y) = max_{x ∈ f⁻¹(y)} μ_A(x), y ∈ V
b. μ_B(y) = sup_{(x1, x2) ∈ f⁻¹(y)} min{μ_A(x1), μ_A(x2)}, and μ_B(y) = 0 if f⁻¹(y) = ∅
c. B = f(A) = f( Σ_{x∈U} μ_A(x)/x ) = μ_A(x1)/y1 + μ_A(x2)/y2 + ⋯ + μ_A(xN)/yN = B(y)
32. When the antecedent MF is Gaussian, μ_A(x) = exp[−(1/2)((x − m_A)/σ_A)²], the input is modeled as a Gaussian fuzzy number, μ_X(x) = exp[−(1/2)((x − x′)/σ_X)²], and product implication and t-norm are used, then:
a. sup_{x∈X}[μ_X(x) μ_A(x)] occurs at x = x_max = (σ_A² m_A + σ_X² x′)/(σ_A² + σ_X²)
b. sup_{x∈X}[μ_X(x) μ_A(x)] occurs at x = x_max = (m_A + x′) σ_A²/(σ_A² + σ_X²)
c. sup_{x∈X}[μ_X(x) μ_A(x)] occurs at x = x_max = (σ_X² m_A + σ_A² x′)/(σ_X² + σ_A²)
b. sum of all antecedents and consequents multiplied by the number of rules
c. number of rules
51. FBFs are:
a. orthogonal
b. coupled
c. uncoupled
52. The fact that a FLS is a universal approximator:
a. helps to explain why a FLS is so successful in engineering applications
b. tells us exactly how to design a FLS
c. means that a FLS can uniformly approximate any real discontinuous non-linear function to arbitrary degree of
accuracy
53. Rule explosion:
a. refers to rules that are too hot to handle
b. is no problem in a FLS
c. refers to rapid growth in the maximum number of rules that may be required in a FLS.
c. can be computed in closed form for all MFs
68. An informative way to interpret a non-singleton type-1 FLS is as a:
a. pre-filter of the input x that transforms x into x lmax (l = 1, , M), after which the remaining operations of the
FLS are the same as those of a singleton FLS
b. pre-filter of the input x that transforms x into x lmax (l = 1, , M), after which the remaining operations of the
FLS are different from those of a singleton FLS
c. an inference mechanism followed by post-filtering
69. The firing level for a non-singleton type-1 FLS is:
a. the same as the firing level of a singleton type-1 FLS
b. smaller than the firing level of a singleton type-1 FLS
c. different than the firing level of a singleton type-1 FLS
70. The new design degrees of freedom for a non-singleton type-1 FLS, as compared to those in a singleton type-1
FLS, are associated with parameters in the:
a. consequent MF
b. antecedent MF
c. input MF
71. The rules of a non-singleton type-1 FLS are __________ the rules of a singleton type-1 FLS:
a. different from
b. the same as
c. fewer than
72. In general, the totally independent design approach should provide ____________ performance than the
partially independent design approach:
a. better
b. the same
c. worse
73. Which design method can be used to optimize all the parameters of a non-singleton type-1 FLS?
a. one-pass
b. least-squares
c. back-propagation
74. Although a non-singleton type-1 FLS can model uncertain measurements, this is usually insufficient to achieve
significantly improved performance over a singleton type-1 FLS because the:
a. rules of the two kinds of FLSs are the same
b. uncertainty contained in noisy training data is accounted for by the antecedent MFs of a type-1 FLS
c. uncertainty contained in noisy training data cannot be accounted for by the antecedent and consequent MFs of
a type-1 FLS
b. the consequent function of a type-1 TSK FLS is a function and not a fuzzy set
c. the consequent function of a type-1 TSK FLS is a fuzzy set and not a function
d. the output formula for a type-1 TSK FLS is obtained using a sup-star composition
e. the output formula for a type-1 TSK FLS is obtained by combining its rules in a prescribed way and does not
derive from the sup-star composition
77. Under which condition is the normalized type-1 TSK FLS exactly the same as a type-1 Mamdani FLS?
a. the consequent function in a TSK rule is a linear function of the antecedent variables
b. the consequent function in a TSK rule is a quadratic function of the antecedent variables
c. the consequent function in a TSK rule is a constant
78. A TSK FLS that uses the same number of rules as a Mamdani FLS always has _______ design degrees of
freedom than a Mamdani FLS:
a. fewer
b. the same number
c. more
79. When all of the antecedent parameters of a type-1 TSK FLS are pre-specified, which design method can be used
to design the consequent parameters?
a. least-squares
b. one-pass
c. back-propagation
80. A TSK FLS can outperform a Mamdani FLS because it:
a. has more design degrees of freedom when both use the same number of rules
b. is a universal approximator
c. does not require the use of the sup-star composition
81. Within the general class of time-series forecasting problems, the problem of forecasting compressed video falls
into which category?
a. deterministic-signal and noisy-measurement case
b. random-signal and noisy-measurement case
c. random-signal and perfect-measurement case
83a. The overall approach to designing a RBC involves four steps performed in a correct order, taken from the
following candidate steps:
1.a Establish rules using the features
1.b Evaluate the performance of the optimized RBC using testing
1.c Choose appropriate features that act as the antecedents in a RBC
1.d Optimize the rule design-parameters using a tuning procedure
1.e Cluster the features in feature space
Which of the following is the correctly ordered four steps?
a. 1.a, 1.b, 1.c, 1.d
b. 1.a, 1.e, 1.c, 1.b
c. 1.e, 1.d, 1.c, 1.b
d. 1.c, 1.a, 1.d, 1.b
e. 1.c, 1.e, 1.a, 1.d
84a. The rules for a RBC of compressed video traffic have:
a. Two antecedents and one consequent
b. Three antecedents and two consequents
c. Three antecedents and one consequent
85a. The consequent of a RB FLC for classification of video traffic is:
a. a fuzzy set
b. a linear function of its antecedent variables
c. 1
86a. Suppose that a FL RBC gives the following results for 600 testing elements: 370 movies are correctly
classified, 210 sports programs are correctly classified, 12 movies are mis-classified as sports programs, and 8
sports programs are mis-classified as movies. How many false alarms are there?
a. 12
b. 8
c. 20
85b. A channel of order 6 that is equalized by a transversal equalizer of order 4 has how many states?
a. 2⁴
b. 2²⁴
c. 2¹⁰
86b. For a binary input sequence, equalization is equivalent to:
a. The output of an unnormalized TSK FLS
b. The centroid defuzzified output of a Mamdani FLS
c. Two category classification
86c. In indirect adaptive fuzzy control, fuzzy systems are used to model what kinds of plant non-linearities?
a. known
b. unknown
c. discontinuous
d. continuous
26. A, D
27. A
28. A
29. C
30. A
31. B
32. C
51. B
52. A
53. C
54. B
55. C
56. B
57. C
8. A
9. C
10. B
11. B
12. A
13. B
14. A, B, D, F
15. B
16. A
17. B
18. C
19. C
20. C
21. A, B, E
22. B
23. C
24. C
25. B
33. B
34. A, C
35. C
36. B
37. C
38. B
39. A
40. A, C, D, E
41. C
42. B
43. B
44. A
45. B
46. C
47. A
48. B
49. B
50. C
58. A
59. C
60. A
61. B
62. B
63. A
64. C
65. A
66. C
67. B
68. A
69. C
70. C
71. B
72. A
73. C
74. C
75. B
76. B, E
77. C
78. C
79. A
80. A
81. C
82a. B
82b. A
83a. D
83b. B
84a. C
84b. B
85a. C
85b. C
86a. C
86b. C
87. D
88. B
89. A
90. C
91. C
92. B
93. C
94. B
95. A
96. C
97. A, D, E, G
98. C
99. A
100. B
82c. A
83c. C
84c. A, D, F
85c. C
86c. A