MTH 202


Lecture Notes on Discrete Mathematics

July 30, 2019


Contents

1 Basic Set Theory
1.1 Sets
1.2 Operations on sets
1.3 Relations
1.4 Functions
1.5 Composition of functions
1.6 Equivalence relation

2 The Natural Number System
2.1 Peano Axioms
2.2 Other forms of Principle of Mathematical Induction
2.3 Applications of Principle of Mathematical Induction
2.4 Well Ordering Property of Natural Numbers
2.5 Recursion Theorem
2.6 Construction of Integers
2.7 Construction of Rational Numbers

3 Countable and Uncountable Sets
3.1 Finite and infinite sets
3.2 Families of sets
3.3 Constructing bijections
3.4 Cantor-Schröder-Bernstein Theorem
3.5 Countable and uncountable sets

4 Elementary Number Theory
4.1 Division algorithm and its applications
4.2 Modular arithmetic
4.3 Chinese Remainder Theorem

5 Combinatorics - I
5.1 Addition and multiplication rules
5.2 Permutations and combinations
5.2.1 Counting words made with elements of a set S
5.2.2 Counting words with distinct letters made with elements of a set S
5.2.3 Counting words where letters may repeat
5.2.4 Counting subsets
5.2.5 Pascal’s identity and its combinatorial proof
5.2.6 Counting in two ways
5.3 Solutions in non-negative integers
5.4 Binomial and multinomial theorems
5.5 Circular arrangements
5.6 Set partitions
5.7 Number partitions
5.8 Lattice paths and Catalan numbers

6 Combinatorics - II
6.1 Pigeonhole Principle
6.2 Principle of Inclusion and Exclusion
6.3 Generating Functions
6.3.1 Generating Functions and Partitions of n
6.4 Recurrence Relation
6.5 Generating Function from Recurrence Relation

7 Introduction to Logic
7.1 Logic of Statements (SL)
7.2 Formulas and truth values in SL
7.3 Equivalence and Normal forms in SL
7.4 Inferences in SL
7.5 Predicate logic (PL)
7.6 Equivalences and Validity in PL
7.7 Inferences in PL

8 Partially Ordered Sets, Lattices and Boolean Algebra
8.1 Partial Orders
8.2 Lattices
8.3 Boolean Algebras
8.4 Axiom of choice and its equivalents

9 Graphs - I
9.1 Basic concepts
9.2 Connectedness
9.3 Isomorphism in graphs
9.4 Trees
9.5 Eulerian graphs
9.6 Hamiltonian graphs
9.7 Bipartite graphs
9.8 Planar graphs
9.9 Vertex coloring

10 Graphs - II
10.1 Connectivity
10.2 Matching in graphs
10.3 Ramsey numbers
10.4 Degree sequence
10.5 Representing graphs with matrices

11 Polya Theory∗
11.1 Groups
11.2 Lagrange’s Theorem
11.3 Group action
11.4 The Cycle index polynomial
11.5 Polya’s inventory polynomial

Index

Chapter 1

Basic Set Theory

The following notation will be used throughout the book.

1. The empty set, denoted ∅, is the set that has no element;
2. N := {1, 2, . . .}, the set of natural numbers;
3. W := {0, 1, 2, . . .}, the set of whole numbers;
4. Z := {0, 1, −1, 2, −2, . . .}, the set of integers;
5. Q := {p/q : p, q ∈ Z, q ≠ 0}, the set of rational numbers;
6. R := the set of real numbers; and
7. C := the set of complex numbers.

This chapter is devoted to set theory, relations and functions. We start with basic set theory.

1.1 Sets
Mathematicians over the last two centuries have been used to the idea of considering a collection of
objects/numbers as a single entity. These entities are what are typically called sets. The technique of
using the concept of a set to answer questions is hardly new; it has been in use since ancient times.
However, the rigorous treatment of sets happened only in the 19th century, due to the German
mathematician Georg Cantor. He was solely responsible for ensuring that sets had a home in mathematics.
Cantor developed the concept of a set during his study of trigonometric series, which led him to what is now
known as the limit point or the derived set operator. He developed two types of transfinite numbers,
namely, transfinite ordinals and transfinite cardinals. His new and path-breaking ideas were not well
received by his contemporaries. Further, from his definition of a set, a number of contradictions and
paradoxes arose. One of the most famous of these is Russell’s Paradox, discovered by Bertrand Russell
in 1901. This paradox, amongst others, set the stage for the development of axiomatic set theory.
The interested reader may refer to Katz [8].
In this book, we will consider the intuitive or naive viewpoint of sets. The notion of a set is taken
as a primitive and so we will not try to define it explicitly. We only give an informal description of
sets and then proceed to establish their properties.
A “well-defined collection” of distinct objects can be considered to be a set. Thus, the principal
property of a set is that of “membership” or “belonging”. Well-defined, in this context, would enable
us to determine whether a particular object is a member of a set or not.


Members of the collection comprising the set are also referred to as elements of the set. Elements
of a set can be just about anything from real physical objects to abstract mathematical objects. An
important feature of a set is that its elements are “distinct” or “uniquely identifiable.”
A set is typically expressed by curly braces, { }, enclosing its elements. If A is a set and a is an
element of it, we write a ∈ A. The fact that a is not an element of A is written as a ∉ A. For instance,
if A is the set {1, 4, 9, 2}, then 1 ∈ A, 4 ∈ A, 2 ∈ A and 9 ∈ A. But 7 ∉ A, π ∉ A, the English word
‘four’ is not in A, etc.
Example 1.1.1. 1. Let X = {apple, tomato, orange}. Here, orange ∈ X, but potato ∉ X.
2. X = {a₁, a₂, . . . , a₁₀}. Then, a₁₀₀ ∉ X.
3. Observe that the sets {1, 2, 3}, {3, 1, 2} and {digits in the number 12321} are the same, as the
order in which the elements appear doesn’t matter.

We now address the idea of distinctness of elements of a set, which comes with its own subtleties.
Example 1.1.2. 1. Consider the list of digits 1, 2, 1, 4, 2. Is it a set?
2. Let X = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}. Then X is the set of first 10 natural numbers. Or equivalently,
X is the set of integers between 0 and 11.

Definition 1.1.3. The set S that contains no element is called the empty set or the null set and
is denoted by { } or ∅. A set that has only one element is called a singleton set.

There are three main ways of specifying a set. They are:

1. Listing all its elements (list notation), e.g., X = {2, 4, 6, 8, 10}. Then X is the set of even integers
between 0 and 12.

2. Stating a property with notation (predicate notation), e.g.,

(a) X = {x : x is a prime number}. This is read as “X is the set of all x such that x is a prime
number”. Here, x is a variable and stands for any object that meets the criteria after the
colon.
(b) The set X = {2, 4, 6, 8, 10} in the predicate notation can be written as
i. X = {x : 0 < x ≤ 10, x is an even integer }, or
ii. X = {x : 1 < x < 11, x is an even integer }, or
iii. X = {x : 2 ≤ x ≤ 10, x is an even integer }, etc.

Note that the above expressions are certain rules that help in defining the elements of the set
X. In general, one writes X = {x : p(x)} or X = {x | p(x)} to denote the set of all elements x
(variable) such that property p(x) holds. In the above, note that “colon” is sometimes replaced
by “|”.
3. Defining a set of rules which generate its members (recursive notation), e.g., let

X = {x : x is an even integer greater than 3}.

Then, X can also be specified by


(a) 4 ∈ X,
(b) whenever x ∈ X, then x + 2 ∈ X, and
(c) every element of X satisfies the above two rules.

In the recursive definition of a set, the first rule is the basis of recursion, the second rule gives
a method to generate new element(s) from the elements already determined and the third rule
binds or restricts the defined set to the elements generated by the first two rules. The third rule
should always be there. But, in practice it is left implicit. At this stage, one should make it
explicit.
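
The three ways of specifying a set can be tried out on a computer. The following is a minimal Python sketch, not part of the original notes: the helper name `generate_recursively` and the cut-off `bound = 20` are assumptions made only for illustration, since the recursively defined set in item 3 is infinite and a program can list only finitely many of its elements.

```python
# List notation: X = {2, 4, 6, 8, 10}
X_list = {2, 4, 6, 8, 10}

# Predicate notation: X = {x : 0 < x <= 10, x is an even integer}
X_predicate = {x for x in range(1, 11) if x % 2 == 0}


def generate_recursively(basis, step, bound):
    """Return the elements <= bound obtained from the basis by repeating the step rule.

    `bound` is a hypothetical cut-off; the set defined in the text is infinite.
    """
    members = set()
    x = basis
    while x <= bound:
        members.add(x)   # rule (a) for the basis, rule (b) for later elements
        x = step(x)
    return members       # rule (c): nothing else is ever added


# Recursive notation: 4 is in X, and x + 2 is in X whenever x is in X.
X_recursive = generate_recursively(4, lambda x: x + 2, 20)

# The same elements described by a predicate, up to the same cut-off.
X_direct = {x for x in range(4, 21) if x % 2 == 0}

print(X_list == X_predicate)
print(X_recursive == X_direct)
```

Both comparisons use set equality, so the order in which elements were produced plays no role.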

Definition 1.1.4. Let X and Y be two sets.


1. Suppose that whenever x ∈ X, then x ∈ Y as well. Then X is said to be a
subset of the set Y , and this is denoted by X ⊆ Y . When there exists x ∈ X such that x ∉ Y , then
we say that X is not a subset of Y , and we write X ⊈ Y .
2. If X ⊆ Y and Y ⊆ X, then X and Y are said to be equal, and this is denoted by X = Y .
3. If X ⊆ Y and X ≠ Y , then X is called a proper subset of Y .
Thus, X is a proper subset of Y if and only if X ⊆ Y and X ≠ Y .
Example 1.1.5. 1. For any set X, we see that X ⊆ X. Thus, ∅ ⊆ ∅. Also, ∅ ⊆ X. Hence, the
empty set is a subset of every set. It thus follows that there is only one empty set.
2. We know that N ⊆ W ⊆ Z ⊆ Q ⊆ R ⊆ C.
3. Note that ∅ ∉ ∅.
4. Let X = {a, b, c}. Then a ∈ X but {a} ⊆ X. Also, {{a}} ⊈ X.
5. Notice that {{a}} ⊈ {a} and {a} ⊈ {{a}}; though {a} ∈ {a, {a}} and also {a} ⊆ {a, {a}}.
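
The difference between membership (∈) and inclusion (⊆) in the last three items can be checked mechanically. The sketch below is an illustration only; it assumes that the symbol a is modelled by the Python string "a" and that a set used as an element is modelled by a `frozenset`, because Python's mutable sets cannot be elements of other sets.

```python
X = {"a", "b", "c"}
inner = frozenset({"a"})            # the set {a}, usable as an element

a_is_member = "a" in X              # a ∈ X
singleton_is_subset = {"a"} <= X    # {a} ⊆ X
nested_is_subset = {inner} <= X     # {{a}} ⊆ X ?

Y = {"a", inner}                    # the set {a, {a}}
inner_is_member_of_Y = inner in Y   # {a} ∈ {a, {a}}
inner_is_subset_of_Y = {"a"} <= Y   # {a} ⊆ {a, {a}}

empty_in_empty = frozenset() in set()       # ∅ ∈ ∅ ?
empty_subset_of_empty = set() <= set()      # ∅ ⊆ ∅ ?

print(a_is_member, singleton_is_subset, nested_is_subset)
print(inner_is_member_of_Y, inner_is_subset_of_Y)
print(empty_in_empty, empty_subset_of_empty)
```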

We now mention some set operations that enable us to generate new sets from existing ones.

1.2 Operations on sets

Definition 1.2.1. Let X and Y be two sets.


1. The union of X and Y , denoted by X ∪ Y , is the set that consists of all elements of X and also
all elements of Y . More specifically, X ∪ Y = {x|x ∈ X or x ∈ Y }.
2. The intersection of X and Y , denoted by X ∩ Y , is the set of all common elements of X and
Y . More specifically, X ∩ Y = {x|x ∈ X and x ∈ Y }.
3. The sets X and Y are said to be disjoint if X ∩ Y = ∅.
Example 1.2.2. 1. Let A = {1, 2, 4, 18} and B = {x : x is an integer, 0 < x ≤ 5}. Then,

A ∪ B = {1, 2, 3, 4, 5, 18} and A ∩ B = {1, 2, 4}.

2. Let S = {x ∈ R : 0 ≤ x ≤ 1} and T = {x ∈ R : .5 ≤ x < 7}. Then,

S ∪ T = {x ∈ R : 0 ≤ x < 7} and S ∩ T = {x ∈ R : .5 ≤ x ≤ 1}.

3. Let X = {{b, c}, {{b}, {c}}, b} and Y = {a, b, c}. Then

X ∩ Y = {b} and X ∪ Y = {a, b, c, {b, c}, {{b}, {c}} }.
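
Items 1 and 3 of this example involve only finite sets, so they can be reproduced directly. This is a small Python sketch under the assumption that the symbols a, b, c are modelled by strings and that sets occurring as elements are modelled by `frozenset`; the variable names are chosen here for illustration.

```python
# Item 1: A = {1, 2, 4, 18}, B = {x : x is an integer, 0 < x <= 5}
A = {1, 2, 4, 18}
B = set(range(1, 6))
union_AB = A | B          # A ∪ B
inter_AB = A & B          # A ∩ B

# Item 3: X = {{b, c}, {{b}, {c}}, b}, Y = {a, b, c}
b_c = frozenset({"b", "c"})
bb_cc = frozenset({frozenset({"b"}), frozenset({"c"})})
X = {b_c, bb_cc, "b"}
Y = {"a", "b", "c"}
inter_XY = X & Y          # only the element b is common
union_XY = X | Y

print(sorted(union_AB), sorted(inter_AB))
print(inter_XY, len(union_XY))
```

Note that c is an element of Y but not of X: in X it occurs only inside the elements {b, c} and {{b}, {c}}.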

We now state a few properties related to the union and intersection of sets.

Lemma 1.2.3. Let R, S and T be sets. Then,


1. (a) S ∪ T = T ∪ S and S ∩ T = T ∩ S (union and intersection are commutative operations).
(b) R ∪ (S ∪ T ) = (R ∪ S) ∪ T and R ∩ (S ∩ T ) = (R ∩ S) ∩ T (union and intersection are
associative operations).
(c) S ⊆ S ∪ T, T ⊆ S ∪ T .
(d) S ∩ T ⊆ S, S ∩ T ⊆ T .
(e) S ∪ ∅ = S, S ∩ ∅ = ∅.
(f ) S ∪ S = S ∩ S = S.
2. Distributive laws (combines union and intersection):
(a) R ∪ (S ∩ T ) = (R ∪ S) ∩ (R ∪ T ) (union distributes over intersection).
(b) R ∩ (S ∪ T ) = (R ∩ S) ∪ (R ∩ T ) (intersection distributes over union).

Proof. 2a. Let x ∈ R ∪ (S ∩ T ). Then, x ∈ R or x ∈ S ∩ T . If x ∈ R then x ∈ R ∪ S and x ∈ R ∪ T .
Thus, x ∈ (R ∪ S) ∩ (R ∪ T ). If x ∉ R, then x ∈ S ∩ T . So, x ∈ S and x ∈ T . Here, x ∈ R ∪ S and
x ∈ R ∪ T . Thus, x ∈ (R ∪ S) ∩ (R ∪ T ). In other words, R ∪ (S ∩ T ) ⊆ (R ∪ S) ∩ (R ∪ T ).
Now, let y ∈ (R ∪ S) ∩ (R ∪ T ). Then, y ∈ R ∪ S and y ∈ R ∪ T . Now, if y ∈ R ∪ S then either
y ∈ R or y ∈ S or both.
If y ∈ R, then y ∈ R ∪ (S ∩ T ). If y ∉ R, then the conditions y ∈ R ∪ S and y ∈ R ∪ T imply that y ∈ S
and y ∈ T . Thus, y ∈ S ∩ T and hence y ∈ R ∪ (S ∩ T ). This shows that (R ∪ S) ∩ (R ∪ T ) ⊆ R ∪ (S ∩ T ),
thereby proving the first distributive law. The remaining proofs are left as exercises.
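
Before attempting the remaining proofs, it can be reassuring to test the two distributive laws on all triples of subsets of a small set. The Python sketch below does this for the universe {1, 2, 3}; the helper `subsets` and the choice of universe are assumptions made for illustration. Such a finite check is only a sanity test and does not replace the proof.

```python
from itertools import combinations, product


def subsets(universe):
    """Return all subsets of a finite set, each as a frozenset."""
    items = list(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


all_subsets = subsets({1, 2, 3})   # 8 subsets, hence 8**3 = 512 triples (R, S, T)

# R ∪ (S ∩ T) = (R ∪ S) ∩ (R ∪ T)
union_over_intersection = all(
    R | (S & T) == (R | S) & (R | T)
    for R, S, T in product(all_subsets, repeat=3)
)

# R ∩ (S ∪ T) = (R ∩ S) ∪ (R ∩ T)
intersection_over_union = all(
    R & (S | T) == (R & S) | (R & T)
    for R, S, T in product(all_subsets, repeat=3)
)

print(union_over_intersection, intersection_over_union)
```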

Exercise 1.2.4. 1. Complete the proof of Lemma 1.2.3.

2. Prove the following:
(a) S ∪ (S ∩ T ) = S ∩ (S ∪ T ) = S.
(b) S ⊆ T if and only if S ∪ T = T .
(c) If R ⊆ T and S ⊆ T then R ∪ S ⊆ T .
(d) If R ⊆ S and R ⊆ T then R ⊆ S ∩ T .
(e) If S ⊆ T then R ∪ S ⊆ R ∪ T and R ∩ S ⊆ R ∩ T .
(f) If S ∪ T ≠ ∅ then either S ≠ ∅ or T ≠ ∅.
(g) If S ∩ T ≠ ∅ then both S ≠ ∅ and T ≠ ∅.
(h) S = T if and only if S ∪ T = S ∩ T .

Definition 1.2.5. Let X and Y be two sets.


1. The set difference of X and Y , denoted by X \ Y , is defined by X \ Y = {x ∈ X : x ∉ Y }.
2. The set (X \ Y ) ∪ (Y \ X), denoted by X∆Y , is called the symmetric difference of X and Y .
Example 1.2.6. 1. Let A = {1, 2, 4, 18} and B = {x ∈ Z : 0 < x ≤ 5}. Then,

A \ B = {18}, B \ A = {3, 5} and A∆B = {3, 5, 18}.

2. Let S = {x ∈ R : 0 ≤ x ≤ 1} and T = {x ∈ R : 0.5 ≤ x < 7}. Then,

S \ T = {x ∈ R : 0 ≤ x < 0.5} and T \ S = {x ∈ R : 1 < x < 7}.

3. Let X = {{b, c}, {{b}, {c}}, b} and Y = {a, b, c}. Then

X \ Y = {{b, c}, {{b}, {c}}}, Y \ X = {a, c} and X∆Y = {a, c, {b, c}, {{b}, {c}}}.

In naive set theory, all sets are essentially taken to be subsets of some reference set, referred to
as the universal set and denoted by U . We now define the complement of a set.

Definition 1.2.7. Let U be the universal set and X ⊆ U . Then, the complement of X, denoted by
X^c, is defined by X^c = {x ∈ U : x ∉ X}.

We state more properties of sets.

Lemma 1.2.8. Let U be the universal set and S, T ⊆ U . Then,


1. U^c = ∅ and ∅^c = U .
2. S ∪ S^c = U and S ∩ S^c = ∅.
3. S ∪ U = U and S ∩ U = S.
4. (S^c)^c = S.
5. S ⊆ S^c if and only if S = ∅.
6. S ⊆ T if and only if T^c ⊆ S^c.
7. S = T^c if and only if S ∩ T = ∅ and S ∪ T = U .
8. S \ T = S ∩ T^c and T \ S = T ∩ S^c.
9. S∆T = (S ∪ T ) \ (S ∩ T ).
10. De Morgan’s laws:
(a) (S ∪ T )^c = S^c ∩ T^c.
(b) (S ∩ T )^c = S^c ∪ T^c.

De Morgan’s laws help us to convert arbitrary set expressions into those that involve only
complements and unions or only complements and intersections.
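
Several parts of the lemma can be spot-checked on a concrete universal set. In the Python sketch below, the universe U = {0, 1, . . . , 9}, the sets S and T, and the helper `complement` are all hypothetical choices made for illustration; the checks illustrate items 4, 8, 9 and 10 and are not a proof.

```python
U = set(range(10))        # a hypothetical universal set
S = {0, 1, 2, 3}
T = {2, 3, 4, 5}


def complement(X, universe=U):
    """Complement of X relative to the universal set."""
    return universe - X


double_complement = complement(complement(S)) == S                      # item 4
difference_rule = (S - T) == (S & complement(T))                        # item 8
symmetric_difference_rule = (S ^ T) == (S | T) - (S & T)                # item 9
de_morgan_union = complement(S | T) == complement(S) & complement(T)    # item 10(a)
de_morgan_inter = complement(S & T) == complement(S) | complement(T)    # item 10(b)

print(double_complement, difference_rule, symmetric_difference_rule)
print(de_morgan_union, de_morgan_inter)
```

Here `S - T` is Python's set difference S \ T and `S ^ T` is the symmetric difference S∆T.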

Exercise 1.2.9. Let S and T be subsets of a universal set U .


1. Prove Lemma 1.2.8.
2. Suppose that S∆T = T . Is S = ∅?

Definition 1.2.10. Let X be a set. Then, the set that contains all subsets of X is called the power
set of X and is denoted by P(X) or 2^X.

Example 1.2.11. 1. Let X = ∅. Then P(∅) = P(X) = {∅, X} = {∅}.

2. Let X = {∅}. Then P({∅}) = P(X) = {∅, X} = {∅, {∅}}.

3. Let X = {a, b, c}. Then P(X) = {∅, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}}.

4. Let X = {{b, c}, {{b}, {c}}}. Then P(X) = {∅, {{b, c}}, {{{b}, {c}}}, {{b, c}, {{b}, {c}}} }.
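
The power sets in this example can be generated by listing the subsets of each size. The following Python sketch uses the standard library function `itertools.combinations`; the function name `power_set` and the modelling of sets-as-elements by `frozenset` are assumptions made for illustration.

```python
from itertools import chain, combinations


def power_set(X):
    """Return P(X) as a set of frozensets, for a finite set X."""
    items = list(X)
    all_combinations = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(c) for c in all_combinations}


empty = frozenset()
P_empty = power_set(set())              # P(∅) = {∅}
P_of_set_of_empty = power_set({empty})  # P({∅}) = {∅, {∅}}
P_abc = power_set({"a", "b", "c"})      # 8 subsets

print(len(P_empty), len(P_of_set_of_empty), len(P_abc))
```

For a set with n elements the sketch produces 2^n subsets, which is one reason for the notation 2^X.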

1.3 Relations
In this section, we introduce the set theoretic concepts of relations and functions. We will use these
concepts to relate different sets. This method also helps in constructing new sets from existing ones.

Definition 1.3.1. Let X and Y be two sets. Then their Cartesian product, denoted by X × Y , is
defined as X × Y = {(a, b) : a ∈ X, b ∈ Y }. The elements of X × Y are also called ordered pairs
with the elements of X as the first entry and elements of Y as the second entry. Thus,

(a1 , b1 ) = (a2 , b2 ) if and only if a1 = a2 and b1 = b2 .

Example 1.3.2. 1. Let X = {a, b, c} and Y = {1, 2, 3, 4}. Then

X × X = {(a, a), (a, b), (a, c), (b, a), (b, b), (b, c), (c, a), (c, b), (c, c)}.
X × Y = {(a, 1), (a, 2), (a, 3), (a, 4), (b, 1), (b, 2), (b, 3), (b, 4), (c, 1), (c, 2), (c, 3), (c, 4)}.

2. The Euclidean plane, denoted by R² = R × R = {(x, y) : x, y ∈ R}.

3. By convention, ∅ × Y = X × ∅ = ∅. In fact, X × Y = ∅ if and only if X = ∅ or Y = ∅.

Remark 1.3.3. Let X and Y be two nonempty sets. Then, X × Y can also be defined as follows:
Let x ∈ X and y ∈ Y and think of (x, y) as the set {{x}, {x, y}}, i.e., we have a new set in which an
element (a set formed using the first element of the ordered pair) is a subset of the other element (a set
formed with both the elements of the ordered pair). Then, with the above understanding, the ordered
pair (y, x) will correspond to the set {{y}, {x, y}}. When x ≠ y, the two sets {{x}, {x, y}} and {{y}, {x, y}} are
not the same, and so the ordered pair (x, y) ≠ (y, x).
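
The set-theoretic encoding of an ordered pair described in this remark can be modelled with nested `frozenset` objects. The Python sketch below is an illustration under that assumption; the function name `kpair` is hypothetical. It checks, over a small range of integers, that two encoded pairs are equal exactly when their first entries agree and their second entries agree.

```python
from itertools import product


def kpair(x, y):
    """Encode the ordered pair (x, y) as the set {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})


order_matters = kpair(1, 2) != kpair(2, 1)

# When x = y the encoding collapses to {{x}}, which is still unambiguous.
diagonal_pair = kpair(1, 1)

# (a1, b1) = (a2, b2) if and only if a1 = a2 and b1 = b2, checked on {0, 1, 2}.
characteristic_property = all(
    (kpair(a1, b1) == kpair(a2, b2)) == (a1 == a2 and b1 == b2)
    for a1, b1, a2, b2 in product(range(3), repeat=4)
)

print(order_matters, characteristic_property)
```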

Exercise 1.3.4. Let X, Y, Z and W be nonempty sets. Then, prove the following statements:
1. The product construction can be used on sets several times, e.g.,

X × Y × Z = {(x, y, z) : x ∈ X, y ∈ Y, z ∈ Z} = (X × Y ) × Z = X × (Y × Z).

2. X × (Y ∪ Z) = (X × Y ) ∪ (X × Z).
3. X × (Y ∩ Z) = (X × Y ) ∩ (X × Z).
4. (X × Y ) ∩ (Z × W ) = (X ∩ Z) × (Y ∩ W ).
5. (X × Y ) ∪ (Z × W ) ⊆ (X ∪ Z) × (Y ∪ W ). Give an example to show that the reverse inclusion need not
hold.
6. Is it possible to write the set T = {(x, x, y) : x, y ∈ N} as a Cartesian product of 3 sets? What
about the set T = {(x, x², y) : x, y ∈ N}?

A relation can be informally thought of as a property which either holds or does not hold between
two objects. For example, x is taller than y can be a relation. However, if x is taller than y, then y
cannot be taller than x.

Definition 1.3.5. Let X and Y be two nonempty sets. A relation R from X to Y is a subset of
X × Y , i.e., it is a collection of certain ordered pairs. We write xRy to mean (x, y) ∈ R ⊆ X × Y .
Thus, for any two sets X and Y , the sets ∅ and X × Y are always relations from X to Y . A relation
from X to X is called a relation on X.

Example 1.3.6. 1. Let X be any nonempty set and consider the set P(X). Define a relation R
on P(X) by R = {(S, T ) ∈ P(X) × P(X) : S ⊆ T }.

2. Let A = {a, b, c, d}. Some relations R on A are:

(a) R = A × A.

(b) R = {(a, a), (b, b), (c, c), (d, d), (a, b), (a, c), (b, c)}.
(c) R = {(a, a), (b, b), (c, c)}.
(d) R = {(a, a), (a, b), (b, a), (b, b), (c, d)}.
(e) R = {(a, a), (a, b), (b, a), (a, c), (c, a), (c, c), (b, b)}.
(f) R = {(a, b), (b, c), (a, c), (d, d)}.
(g) R = {(a, a), (b, b), (c, c), (d, d), (a, b), (b, c)}.
(h) R = {(a, a), (b, b), (c, c), (d, d), (a, b), (b, a), (b, c), (c, b)}.
(i) R = {(a, a), (b, b), (c, c), (a, b), (b, c)}.

Sometimes, we draw pictures to have a better understanding of different relations. For example,
to draw pictures for relations on a set X, we first put a node for each element x ∈ X and label
it x. Then, for each (x, y) ∈ R, we draw a directed line from x to y. If (x, x) ∈ R then a loop is
drawn at x. Pictures for some of these relations are given in Figure 1.1.

[Figure 1.1: Pictorial representation of some relations from Example 2. Panels: A × A, Example 2.b, Example 2.c.]

3. Let A = {1, 2, 3}, B = {a, b, c} and let R = {(1, a), (1, b), (2, c)}. Figure 1.2 represents the
relation R.¹

[Figure 1.2: Pictorial representation of the relation in Example 3.]

4. The set R = {(x, y) : x, y ∈ Z and y = x + 5m for some m ∈ Z} is a relation on Z. If we try to draw
a picture for this relation then there is no arrow between any two distinct elements of {1, 2, 3, 4, 5}.

5. Fix n ∈ N. Let R = {(x, y) : x, y ∈ Z and y = x + nm for some m ∈ Z}. Then, R is a relation
on Z. A picture for this relation has no arrow between any two distinct elements of {1, 2, 3, . . . , n}.

¹ We use pictures to help our understanding; they are not part of any proof.


Definition 1.3.7. Let X and Y be two nonempty sets and let R be a relation from X to Y . Then,
the inverse relation, denoted by R⁻¹, is a relation from Y to X, defined by R⁻¹ = {(y, x) ∈ Y × X :
(x, y) ∈ R}. So, for all x ∈ X and y ∈ Y ,

(x, y) ∈ R if and only if (y, x) ∈ R⁻¹.

Example 1.3.8. 1. If R = {(1, a), (1, b), (2, c)} then R⁻¹ = {(a, 1), (b, 1), (c, 2)}.
2. Let R = {(a, b), (b, c), (a, c)} be a relation on A = {a, b, c}. Then R⁻¹ = {(b, a), (c, b), (c, a)} is
also a relation on A.

Let R be a relation from X to Y . Consider an element x ∈ X. It is natural to ask if there exists
y ∈ Y such that (x, y) ∈ R. This gives rise to the following three possibilities:
1. (x, y) ∉ R for all y ∈ Y .
2. There is a unique y ∈ Y such that (x, y) ∈ R.
3. There exist at least two distinct elements y₁, y₂ ∈ Y such that (x, y₁), (x, y₂) ∈ R.

One can ask similar questions for an element y ∈ Y . To accommodate all these, we introduce a
notation in the following definition.

Definition 1.3.9. Let R be a nonempty relation from X to Y . Then,

1. the set dom R := {x ∈ X : (x, y) ∈ R for some y ∈ Y } is called the domain of R,¹ and
2. the set rng R := {y ∈ Y : (x, y) ∈ R for some x ∈ X} is called the range of R.

Notation 1.3.10. Let R be a nonempty relation from X to Y . Then,

1. for any set Z, one writes R(Z) := {y : (z, y) ∈ R for some z ∈ Z}.
2. for any set W , one writes R⁻¹(W ) := {x ∈ X : (x, w) ∈ R for some w ∈ W }.

Example 1.3.11. Let a, b, c, and d be distinct symbols and let R = {(1, a), (1, b), (2, c)}. Then,
1. dom R = {1, 2}, rng R = {a, b, c},
2. R({1}) = {a, b}, R({2}) = {c}, R({1, 2}) = {a, b, c}, R({1, 2, 3}) = {a, b, c}, R({4}) = ∅.
3. dom R⁻¹ = {a, b, c}, rng R⁻¹ = {1, 2},
4. R⁻¹({a}) = {1}, R⁻¹({a, b}) = {1}, R⁻¹({b, c}) = {1, 2}, R⁻¹({a, d}) = {1}, R⁻¹({d}) = ∅.
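
The computations in this example are easy to reproduce once a relation is stored as a set of pairs. The Python sketch below is one possible modelling; the function names `dom`, `rng`, `inverse` and `image` are chosen here for illustration, and the symbols a, b, c, d are represented by strings.

```python
R = {(1, "a"), (1, "b"), (2, "c")}


def dom(rel):
    """Domain: all first entries of pairs in the relation."""
    return {x for x, _ in rel}


def rng(rel):
    """Range: all second entries of pairs in the relation."""
    return {y for _, y in rel}


def inverse(rel):
    """Inverse relation: swap the entries of every pair."""
    return {(y, x) for x, y in rel}


def image(rel, Z):
    """rel(Z) = {y : (z, y) in rel for some z in Z}; Z may be any set."""
    return {y for x, y in rel if x in Z}


R_inv = inverse(R)

print(dom(R), rng(R))
print(image(R, {1}), image(R, {1, 2, 3}), image(R, {4}))
print(image(R_inv, {"b", "c"}), image(R_inv, {"a", "d"}), image(R_inv, {"d"}))
```

With this modelling, R⁻¹(W ) is simply the image of W under the inverse relation, which matches part 2 of Proposition 1.3.12.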

The following is an immediate consequence of the definition, but we give the proof of a few parts
for the sake of better understanding.

Proposition 1.3.12. Let R be a nonempty relation from X to Y , and let Z be any set.
1. R(Z) = R(X ∩ Z) ⊆ Y, R⁻¹(Z) = R⁻¹(Z ∩ Y ) ⊆ X.
2. dom R = R⁻¹(Y ) = rng R⁻¹ ⊆ X, rng R = R(X) = dom R⁻¹ ⊆ Y.
3. R(Z) ≠ ∅ if and only if dom R ∩ Z ≠ ∅.
4. R⁻¹(Z) ≠ ∅ if and only if rng R ∩ Z ≠ ∅.

Proof. We prove the last two parts. The proof of the first two parts is left as an exercise.
3. Let R(Z) ≠ ∅. Then there exist a ∈ Z ∩ X and b ∈ Y such that (a, b) ∈ R. It follows that a ∈ dom R ∩ Z.
The converse is proved in a similar way.
4. Let rng R ∩ Z ≠ ∅. Then there exist b ∈ rng R ∩ Z and a ∈ X such that (a, b) ∈ R. Thus, a ∈ R⁻¹({b}) ⊆
R⁻¹(Z). Similarly, the converse follows.

¹ In some texts, the set X is referred to as the domain set of R; it should not be confused with dom R.

1.4 Functions
Definition 1.4.1. Let X and Y be nonempty sets and let f be a relation from X to Y .
1. f is called a partial function from X to Y , denoted by f : X ⇀ Y , if for each x ∈ X, f ({x})
is either a singleton or ∅.
2. For an element x ∈ X, if f ({x}) = {y}, a singleton, we write f (x) = y. Here, y is referred to
as the image of x under f , and x is referred to as a pre-image of y under f .
We say that f (x) is undefined at x ∈ X if f ({x}) = ∅.
3. If f is a partial function from X to Y such that for each x ∈ X, f ({x}) is a singleton, then f is
called a function and is denoted by f : X → Y .

Observe that for any partial function f : X ⇀ Y , the condition (a, b), (a, b′) ∈ f implies b = b′.
Thus, if f : X ⇀ Y , then for each x ∈ X, either f (x) is undefined, or there exists a unique y ∈ Y
such that f (x) = y. Moreover, if f : X → Y is a function, then f (x) exists for each x ∈ X, i.e., there
exists a unique y ∈ Y such that f (x) = y.
It thus follows that a partial function f : X ⇀ Y is a function if and only if dom f = X, i.e.,
the domain of f is the whole of X.

Example 1.4.2. Let A = {a, b, c, d}, B = {1, 2, 3, 4} and X = {3, 4, b, c}.

1. Consider the relation R₁ = {(a, 1), (b, 1), (c, 2)} from A to B. The following are true.
(a) R₁ is a partial function.
(b) R₁(a) = 1, R₁(b) = 1, R₁(c) = 2. Also, R₁({d}) = ∅; thus R₁(d) is undefined.
(c) R₁(X) = {1, 2}.
(d) R₁⁻¹ = {(1, a), (1, b), (2, c)}. So, R₁⁻¹({1}) = {a, b} and R₁⁻¹(2) = c. For any x ∈ X,
R₁⁻¹({x}) = ∅. Therefore, R₁⁻¹(x) is undefined.

2. R₂ = {(a, 1), (b, 4), (c, 2), (d, 3)} is a relation from A to B. The following are true.
(a) R₂ is a partial function.
(b) R₂(a) = 1, R₂(b) = 4, R₂(c) = 2 and R₂(d) = 3.
(c) R₂(X) = {2, 4}.
(d) R₂⁻¹(1) = a, R₂⁻¹(2) = c, R₂⁻¹(3) = d and R₂⁻¹(4) = b. Also, R₂⁻¹(X) = {b, d}.
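
Whether a finite relation is a partial function or a function can be decided by inspecting its pairs. The Python sketch below applies this to the two relations of this example; the helper names `is_partial_function`, `is_function` and `inverse` are assumptions introduced for illustration, with the symbols a, b, c, d represented by strings.

```python
A = {"a", "b", "c", "d"}
B = {1, 2, 3, 4}
R1 = {("a", 1), ("b", 1), ("c", 2)}
R2 = {("a", 1), ("b", 4), ("c", 2), ("d", 3)}


def is_partial_function(rel):
    """True if no first entry is paired with two different second entries."""
    firsts = [x for x, _ in rel]
    return len(firsts) == len(set(firsts))


def is_function(rel, X):
    """True if rel is a partial function whose domain is all of X."""
    return is_partial_function(rel) and {x for x, _ in rel} == set(X)


def inverse(rel):
    """Inverse relation: swap the entries of every pair."""
    return {(y, x) for x, y in rel}


print(is_partial_function(R1), is_function(R1, A))    # R1 is undefined at d
print(is_partial_function(R2), is_function(R2, A))
print(is_partial_function(inverse(R1)))               # 1 is paired with both a and b
print(is_function(inverse(R2), B))
```

Because a relation is stored as a set, each pair occurs once, so a repeated first entry always signals two different second entries.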

Convention:
Let p(x) be a polynomial in the variable x with integer coefficients. Then, by writing ‘f : Z → Z
is a function defined by f (x) = p(x)’, we mean the function f = {(a, p(a)) : a ∈ Z}. For example,
the function f : Z → Z given by f (x) = x² corresponds to the set {(a, a²) : a ∈ Z}.

Example 1.4.3. 1. For A = {a, b, c, d} and B = {1, 3, 5}, let f = {(a, 5), (b, 1), (d, 5)} be a relation
in A × B. Then, f is a partial function with dom f = {a, b, d} and rng f = {1, 5}. Further, we
can define a function g : {a, b, d} → {1, 5} by g(a) = 5, g(b) = 1 and g(d) = 5. Also, using g, one
obtains the relation g⁻¹ = {(1, b), (5, a), (5, d)}.
2. The following relations f : Z → Z are indeed functions.
(a) f = {(x, 1) : x is even} ∪ {(x, 5) : x is odd}.
(b) f = {(x, −1) : x ∈ Z}.
(c) f = {(x, 1) : x < 0} ∪ {(0, 0)} ∪ {(x, −1) : x > 0}.

3. Define f : Q⁺ → N by f = {(p/q, 2^p 3^q) : p, q ∈ N, q ≠ 0, p and q are coprime}. Then, f is a
function.
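
The map in item 3 is well defined because every positive rational number has exactly one representation p/q with p and q coprime. A small Python sketch, assuming the standard library type `fractions.Fraction` (which always stores a fraction in lowest terms) and a hypothetical function name `f`:

```python
from fractions import Fraction


def f(r):
    """Map the positive rational p/q (in lowest terms) to 2**p * 3**q."""
    r = Fraction(r)
    if r <= 0:
        raise ValueError("f is defined only on positive rationals")
    # Fraction normalises to lowest terms, so numerator and denominator are coprime.
    return 2 ** r.numerator * 3 ** r.denominator


print(f(Fraction(1, 2)))
print(f(Fraction(2, 4)))   # same rational number as 1/2, hence the same value
print(f(Fraction(3, 1)))
```

Without the coprimality condition, 1/2 and 2/4 would be assigned different values and f would fail to be a function.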
Remark 1.4.4. 1. If X = ∅, then by convention, one assumes that there is a function, called the
empty function, from X to Y .
2. If Y = ∅ and X ≠ ∅, then by convention, we say that there is no function from X to Y .
3. Individual relations and functions are also sets. Therefore, one can have equality between
relations and functions, i.e., they are equal if and only if they contain the same set of pairs. For
example, let X = {−1, 0, 1}. Then, the functions f, g, h : X → X defined by f (x) = x, g(x) = x|x| and
h(x) = x³ are equal, as the three functions correspond to the relation R = {(−1, −1), (0, 0), (1, 1)}
on X.
4. A function is also called a map.
5. Throughout the book, whenever the phrase ‘let f : X → Y be a function’ is used, it will be
assumed that both X and Y are nonempty sets.

Some important functions are now defined.

Definition 1.4.5. Let X be a nonempty set.


1. The relation Id := {(x, x) : x ∈ X} is called the identity relation on X.
2. The function f : X → X defined by f (x) = x, for all x ∈ X, is called the identity function and
is denoted by Id.
3. The function f : X → R with f (x) = 0, for all x ∈ X, is called the zero function and is denoted
by 0.

Exercise 1.4.6. 1. Do the following relations represent functions? Why?
(a) f : Z → Z defined by
i. f = {(x, 1) : 2 divides x} ∪ {(x, 5) : 3 divides x}.
ii. f = {(x, 1) : x ∈ S} ∪ {(x, −1) : x ∈ S^c}, where S = {n² : n ∈ Z} and S^c = Z \ S.
iii. f = {(x, x³) : x ∈ Z}.
(b) f : R⁺ → R defined by f = {(x, ±√x) : x ∈ R⁺}, where R⁺ is the set of all positive real
numbers.
(c) f : R → R defined by f = {(x, √x) : x ∈ R}.
(d) f : R → C defined by f = {(x, √x) : x ∈ R}.
(e) f : R⁻ → R defined by f = {(x, log_e |x|) : x ∈ R⁻}, where R⁻ is the set of all negative real
numbers.
(f) f : R → R defined by f = {(x, tan x) : x ∈ R}.

2. Let f : X → Y be a function. Then f⁻¹ is a relation from Y to X. Show that the following
results hold for f⁻¹:
(a) f⁻¹(A ∪ B) = f⁻¹(A) ∪ f⁻¹(B) for all A, B ⊆ Y .
(b) f⁻¹(A ∩ B) = f⁻¹(A) ∩ f⁻¹(B) for all A, B ⊆ Y .
(c) f⁻¹(∅) = ∅.
(d) f⁻¹(Y ) = X.
(e) f⁻¹(Y \ B) = X \ f⁻¹(B) for each B ⊆ Y .

3. Let S = {(x, y) ∈ R² : x² + y² = 1, x ≥ 0}. It is a relation from R to R. Draw a picture of the
inverse of this relation.

Definition 1.4.7. A function f : X → Y is said to be injective (also called one-one or an injection)
if for all x, y ∈ X, x ≠ y implies f (x) ≠ f (y). Equivalently, f is one-one if for all x, y ∈ X, f (x) = f (y)
implies x = y.
Example 1.4.8. 1. Let X be a nonempty set. Then, the identity map Id on X is one-one.
2. Let X be a nonempty proper subset of Y . Then f (x) = x is a one-one map from X to Y .
3. The function f : Z → Z defined by f (x) = x2 is not one-one as f (−1) = f (1) = 1.
4. The function f : {1, 2, 3} → {a, b, c, d} defined by f (1) = c, f (2) = b and f (3) = a, is one-one.
It can be checked that there are 24 one-one functions f : {1, 2, 3} → {a, b, c, d}.
5. There is no one-one function from the set {1, 2, 3} to its proper subset {1, 2}.
6. There are one-one functions from the set N of natural numbers to its proper subset {2, 3, . . .}.
One of them is given by f (1) = 4, f (2) = 3, f (3) = 2 and f (n) = n + 1, for all n ≥ 4.
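
The count of 24 in item 4 can be confirmed by brute force. The sketch below is an illustration only; it represents a function {1, 2, 3} → {a, b, c, d} as the tuple of its images, and the names `is_injective` and `all_functions` are introduced here for the check.

```python
from itertools import product

domain = [1, 2, 3]
codomain = ["a", "b", "c", "d"]

def is_injective(images):
    """A tuple of images is one-one exactly when no image repeats."""
    return len(set(images)) == len(images)

# Each tuple (f(1), f(2), f(3)) describes one function from domain to codomain.
all_functions = list(product(codomain, repeat=len(domain)))
injective = [f for f in all_functions if is_injective(f)]

print(len(all_functions), len(injective))  # 64 24
```

The value 24 agrees with the direct count 4 · 3 · 2: four choices for f (1), three remaining for f (2) and two for f (3).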

Definition 1.4.9. Let f : X → Y be a function. Let A ⊆ X and A ≠ ∅. The restriction of f to
A, denoted by f_A , is the function f_A = {(x, y) : (x, y) ∈ f, x ∈ A}.

Example 1.4.10. Define f : R → R by f (x) = 0 if x is rational, and f (x) = 1 if x is irrational. Then,
f_Q : Q → R is the zero function.

Proposition 1.4.11. Let f : X → Y be a one-one function and let Z be a nonempty subset of X.
Then f_Z is also one-one.

Proof. Suppose f_Z (x) = f_Z (y) for some x, y ∈ Z. Then f (x) = f (y). As f is one-one, x = y. Thus,
f_Z is one-one.

Definition 1.4.12. A function f : X → Y is said to be surjective (also called onto or a surjection)
if f⁻¹({b}) ≠ ∅ for each b ∈ Y . Equivalently, f : X → Y is onto if there exists a pre-image under f ,
for each b ∈ Y .
Example 1.4.13. 1. Let X be a nonempty set. Then the identity map on X is onto.
2. Let X be a nonempty proper subset of Y . Then the identity map f : X → Y is not onto.
3. There are 6 onto functions from {a, b, c} to {a, b}. For example, f (a) = a, f (b) = b, and f (c) = b
is one such function.
4. Let X be a nonempty subset of Y . Fix an element a ∈ X. Define g : Y → X by
$$g(y) = \begin{cases} y, & \text{if } y \in X, \\ a, & \text{if } y \in Y \setminus X. \end{cases}$$
Then g is an onto function.


5. There does not exist any onto function from the set {a, b} to its proper superset {a, b, c}.
6. There exist onto functions from the set {2, 3, . . .} to its proper superset N. An example of such
a function is f (n) = n − 1 for all n ≥ 2.
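
Items 3 and 5 can likewise be checked by enumeration. The following sketch is illustrative; the helper `onto_functions` is a name introduced here, and a function is again stored as the tuple of its images.

```python
from itertools import product

def onto_functions(domain, codomain):
    """List all functions domain -> codomain whose image is the whole codomain."""
    return [f for f in product(codomain, repeat=len(domain))
            if set(f) == set(codomain)]

three_to_two = onto_functions(["a", "b", "c"], ["a", "b"])
two_to_three = onto_functions(["a", "b"], ["a", "b", "c"])

print(len(three_to_two), len(two_to_three))  # 6 0
```

The count 6 is 2³ − 2: of the eight functions into {a, b}, only the two constant functions fail to be onto.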

Definition 1.4.14. Let X and Y be sets. A function f : X → Y is said to be bijective (also called a
bijection) if f is both one-one and onto. The set X is said to be equinumerous¹ with the set Y if
there exists a bijection f : X → Y .
¹ If X is equinumerous with Y then X is also said to be equivalent to Y .

Clearly, if a set X is equinumerous with a set Y then Y is also equinumerous with X. Hence, X
and Y are said to be equinumerous sets.
Example 1.4.15. 1. The function f : {1, 2, 3} → {a, b, c} defined by f (1) = c, f (2) = b and
f (3) = a, is a bijection. Thus, f⁻¹ : {a, b, c} → {1, 2, 3} is a bijection; and the set {a, b, c} is
equinumerous with {1, 2, 3}.
2. Let X be a nonempty set. Then the identity map on X is a bijection. Thus, the set X is
equinumerous with itself.
3. The set N is equinumerous with {2, 3, . . .}. Indeed the function f : N → {2, 3, . . .} defined by
f (1) = 3, f (2) = 2 and f (n) = n + 1, for all n ≥ 3 is a bijection.
Exercise 1.4.16. 1. Let f : X → Y be a bijection. Then, for every choice of pairs x, y with x ∈ X
and y ∈ Y there exists a bijection, say h : X → Y , such that h(x) = y.
2. Define f : W → Z by f = {(x, −x/2) : x is even} ∪ {(x, (x + 1)/2) : x is odd}. Is f one-one? Is it onto?
3. Define f : N → Z by f = {(x, 2x) : x ∈ N}, and g : Z → Z by g = {(x, x/2) : x is even} ∪ {(x, 0) :
x is odd}. Are f and g one-one? Are they onto?


4. Let X be a nonempty set. Give a one-one function from X to P(P(P(X))).
5. For a fixed n ∈ N, let Aₙ and Bₙ be nonempty sets and let Rₙ be a one-one relation from Aₙ to
Bₙ . Then, ⋂ₙ Rₙ is a one-one relation.
6. Let A be the set of subsets of {1, 2, . . . , 9} each having 5 elements and let B be the set of 5
digit numbers with strictly increasing digits. For a ∈ A, define f (a) as the number obtained by
arranging the elements of a in increasing order. Is f one-one and onto?

1.5 Composition of functions


Definition 1.5.1. Let f and g be two relations such that rng f ⊆ dom g. Then, the composition of
f and g, denoted by g ◦ f , is defined as

g ◦ f = {(x, z) : (x, y) ∈ f and (y, z) ∈ g for some y ∈ rng f ⊆ dom g} .

Notice that the composition of two relations in the above definition is a relation. In case both f
and g are functions, g ◦ f is also a function, and (g ◦ f )(x) = g (f (x)) as (x, z) ∈ g ◦ f implies that
there exists y such that y = f (x) and z = g(y). Similarly, one defines f ◦ g if rng g ⊆ dom f .

Example 1.5.2. Let f = {(β, a), (3, b), (3, c)} and g = {(a, 3), (b, β), (c, β)}. Then, g◦f = {(3, β), (β, 3)}
and f ◦ g = {(a, b), (a, c), (b, a), (c, a)}.
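
The two compositions in Example 1.5.2 can be recomputed directly from Definition 1.5.1. The sketch below stores a relation as a Python set of pairs; the function name `compose` is introduced here for illustration.

```python
def compose(g, f):
    """Return g ∘ f = {(x, z) : (x, y) in f and (y, z) in g for some y}."""
    return {(x, z) for (x, y) in f for (y2, z) in g if y == y2}

f = {("β", "a"), (3, "b"), (3, "c")}
g = {("a", 3), ("b", "β"), ("c", "β")}

g_after_f = compose(g, f)
f_after_g = compose(f, g)

print(g_after_f == {(3, "β"), ("β", 3)})                          # True
print(f_after_g == {("a", "b"), ("a", "c"), ("b", "a"), ("c", "a")})  # True
```

Note that f here is a relation but not a function, since 3 is related to both b and c; the definition of composition does not require functions.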

The proof of the next result is omitted as it directly follows from definition.

Proposition 1.5.3. [Algebra of composition of functions] Let f : X → Y, g : Y → Z and
h : Z → W be functions.
1. Then, (h ◦ g) ◦ f : X → W and h ◦ (g ◦ f ) : X → W are functions. Moreover, (h ◦ g) ◦ f =
h ◦ (g ◦ f ) (associativity holds).
2. If f and g are injections then g ◦ f : X → Z is an injection.
3. If f and g are surjections then g ◦ f : X → Z is a surjection.
4. If f and g are bijections then g ◦ f : X → Z is a bijection.
5. [Extension] If f and h are bijections with dom f ∩ dom h = ∅ and rng f ∩ rng h = ∅, then the function f ∪ h from X ∪ Z to
Y ∪ W defined by f ∪ h = {(a, f (a)) : a ∈ X} ∪ {(c, h(c)) : c ∈ Z} is a bijection.
6. Let X and Y be sets with at least two elements each and let f : X → Y be a bijection. Then the
number of bijections from X to Y is at least 2.

Theorem 1.5.4. [Properties of the identity function] Let X and Y be two nonempty sets and Id
be the identity function on X. Then, for any two functions f : X → Y and g : Y → X,

f ◦ Id = f and Id ◦ g = g.

Proof. By definition, (f ◦ Id)(x) = f (Id(x)) = f (x), for all x ∈ X. Hence, f ◦ Id = f . Similarly, the
other equality follows.

We now give a very important bijection principle.

Theorem 1.5.5. [Bijection principle] Let f : X → Y and g : Y → X be functions such that


(g ◦ f )(x) = x for each x ∈ X. Then f is one-one and g is onto.

Proof. To show that f is one-one, suppose f (a) = f (b) for some a, b ∈ X. Then

a = (g ◦ f )(a) = g (f (a)) = g (f (b)) = (g ◦ f )(b) = b.

Thus, f is one-one.
To show that g is onto, let a ∈ X. Write b = f (a). Now, a = (g ◦ f )(a) = g(f (a)) = g(b). That is,
we have found b ∈ Y such that g(b) = a. Hence, g is onto.
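
The bijection principle can be illustrated with the doubling and halving maps on W = N ∪ {0}. The following sketch checks the hypothesis and both conclusions on a finite window of W only, so it is an illustration and not a proof; the window size 50 is an arbitrary choice.

```python
def f(x):
    """Doubling map on the whole numbers."""
    return 2 * x

def g(x):
    """Halve even numbers; send odd numbers to 0."""
    return x // 2 if x % 2 == 0 else 0

window = range(50)

# Hypothesis of the bijection principle: (g ∘ f)(x) = x.
left_inverse_ok = all(g(f(x)) == x for x in window)

# Conclusions on the window: f is one-one, and every x is attained by g.
f_injective = len({f(x) for x in window}) == len(window)
g_hits_window = set(window) <= {g(y) for y in range(100)}

# The reverse composition is not the identity: f(g(1)) = 0.
print(left_inverse_ok, f_injective, g_hits_window, f(g(1)))  # True True True 0
```

The last value shows that the hypothesis (g ◦ f )(x) = x does not force (f ◦ g)(y) = y, so f need not be onto and g need not be one-one.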

Exercise 1.5.6. 1. Let f, g : W → W be defined by f = {(x, 2x) : x ∈ W} and g = {(x, x/2) :
x is even} ∪ {(x, 0) : x is odd}. Verify that g ◦ f is the identity function on W, whereas f ◦ g
maps even numbers to even numbers and odd numbers to 0.


2. Let f : X → Y be a function. Prove that f⁻¹ : Y → X is a function if and only if f is a
bijection.
3. Define f : N × N → N by f (m, n) = 2ᵐ⁻¹(2n − 1). Is f a bijection?
4. Let f : X → Y be a bijection and let A ⊆ X. Is f (X \ A) = Y \ f (A)?
5. Let f : X → Y and g : Y → X be two functions such that
(a) (f ◦ g)(y) = y for each y ∈ Y ,
(b) (g ◦ f )(x) = x for each x ∈ X.

Show that f is a bijection and g = f⁻¹. Can we conclude the same without assuming the second
condition?

1.6 Equivalence relation


We look at some relations that are of interest in mathematics.

Definition 1.6.1. Let A be a nonempty set. Then, a relation R on A is said to be


1. reflexive if for each a ∈ A, (a, a) ∈ R.
2. symmetric if for each pair of elements a, b ∈ A, (a, b) ∈ R implies (b, a) ∈ R.
3. transitive if for each triple of elements a, b, c ∈ A, (a, b), (b, c) ∈ R imply (a, c) ∈ R.

Exercise 1.6.2. For relations defined in Example 1.3.6, determine which of them are
1. reflexive.
2. symmetric.
3. transitive.

Definition 1.6.3. Let A be a nonempty set. A relation on A is called an equivalence relation if it


is reflexive, symmetric and transitive. It is customary to write a supposed equivalence relation as ∼
rather than R. The equivalence class of the equivalence relation ∼ containing an element a ∈ A is
denoted by [a], and is defined as [a] := {x ∈ A : x ∼ a}.
Example 1.6.4. 1. Consider the relations on A of Example 1.3.6.
(a) The relation in Example 1.3.6.1 is not an equivalence relation; it is not symmetric.
(b) The relation in Example 1.3.6.2a is an equivalence relation with [a] = {a, b, c, d} as the only
equivalence class.
(c) Other relations in Example 1.3.6.2 are not equivalence relations.
(d) The relation in Example 1.3.6.4 is an equivalence relation with the equivalence classes as
i. [0] = {. . . , −15, −10, −5, 0, 5, 10, . . .}.
ii. [1] = {. . . , −14, −9, −4, 1, 6, 11, . . .}.
iii. [2] = {. . . , −13, −8, −3, 2, 7, 12, . . .}.
iv. [3] = {. . . , −12, −7, −2, 3, 8, 13, . . .}.
v. [4] = {. . . , −11, −6, −1, 4, 9, 14, . . .}.
(e) The relation in Example 1.3.6.5 is an equivalence relation with the equivalence classes as
[0] = {. . . , −3n, −2n, −n, 0, n, 2n, . . .}.
[1] = {. . . , −3n + 1, −2n + 1, −n + 1, 1, n + 1, 2n + 1, . . .}.
[2] = {. . . , −3n + 2, −2n + 2, −n + 2, 2, n + 2, 2n + 2, . . .}.
⋮
[n − 2] = {. . . , −2n − 2, −n − 2, −2, n − 2, 2n − 2, 3n − 2, . . .}.
[n − 1] = {. . . , −2n − 1, −n − 1, −1, n − 1, 2n − 1, 3n − 1, . . .}.

2. Consider the relation R = {(a, a), (b, b), (c, c)} on the set A = {a, b, c}. Then R is an equivalence
relation with three equivalence classes, namely [a] = {a}, [b] = {b} and [c] = {c}.
3. The relation R = {(a, a), (b, b), (c, c), (a, c), (c, a)} is an equivalence relation on A = {a, b, c}. It
has two equivalence classes, namely [a] = [c] = {a, c} and [b] = {b}.
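
Equivalence classes of a relation on a finite set can be computed directly from the definition [a] = {x ∈ A : x ∼ a}. The sketch below does this for item 3 and, assuming (as the listed classes suggest) that the relation of item 1(d) is congruence modulo 5, for a finite window of integers. The helper name `equivalence_classes` and the window −15, . . . , 14 are choices made for this illustration.

```python
def equivalence_classes(elements, related):
    """Collect the distinct classes [a] = {x : x ~ a} of an equivalence relation."""
    elements = list(elements)
    classes = []
    for a in elements:
        cls = frozenset(x for x in elements if related(x, a))
        if cls not in classes:
            classes.append(cls)
    return classes

# Item 3: R on A = {a, b, c}.
R = {("a", "a"), ("b", "b"), ("c", "c"), ("a", "c"), ("c", "a")}
classes_R = equivalence_classes(["a", "b", "c"], lambda x, y: (x, y) in R)

# Assumed form of item 1(d): x ~ y when 5 divides x - y, on a finite window.
classes_mod5 = equivalence_classes(range(-15, 15), lambda x, y: (x - y) % 5 == 0)

print(len(classes_R), len(classes_mod5))  # 2 5
```

On the window, each of the five classes is the corresponding infinite class [0], . . . , [4] cut down to the integers from −15 to 14.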

Proposition 1.6.5. [Equivalence relation divides a set into disjoint classes] Let ∼ be an equivalence
relation on a nonempty set X. Then,
1. any two equivalence classes are either disjoint or identical;
2. the set X is equal to the union of all equivalence classes of ∼.

That is, an equivalence relation ∼ on X divides X into disjoint equivalence classes.

Proof. 1. Let a, b ∈ X be distinct elements of X. If the equivalence classes [a] and [b] are disjoint,
then there is nothing to prove. So, assume that there exists c ∈ X such that c ∈ [a] ∩ [b]. That is,
c ∼ a and c ∼ b. By symmetry of ∼ it follows that a ∼ c and b ∼ c. We will show that [a] = [b].
For this, let x ∈ [a]. Then x ∼ a. Since a ∼ c and ∼ is transitive, we have x ∼ c. Again, c ∼ b
and transitivity of ∼ imply that x ∼ b. Thus, x ∈ [b]. That is, [a] ⊆ [b]. A similar argument proves
that [b] ⊆ [a]. Thus, whenever two equivalence classes intersect, they are indeed equal.

2. Notice that for each x ∈ X, the equivalence class [x] is well defined, x ∈ [x] and [x] ⊆ X. Thus, if
we take the union of the equivalence classes over all x ∈ X, we get X = ⋃_{x∈X} [x].

Exercise 1.6.6. Determine the equivalence relations among the relations given below. Further, for
each equivalence relation, determine its equivalence classes.
1. R = {(a, b) ∈ Z² : a ≤ b} on Z.
2. R = {(a, b) ∈ Z∗ × Z∗ : a divides b} on Z∗ , where Z∗ = Z \ {0}.
3. Recall the greatest integer function f : R → Z given by f (x) = [x] and let R = {(a, b) ∈ R × R :
[a] = [b]} on R.
4. For x = (x₁, x₂), y = (y₁, y₂) ∈ R² and R∗ = R \ {0}, let
(a) R = {(x, y) ∈ R² × R² : x₁² + x₂² = y₁² + y₂²}.
(b) R = {(x, y) ∈ R² × R² : x = αy for some α ∈ R∗ }.
(c) R = {(x, y) ∈ R² × R² : 4x₁² + 9x₂² = 4y₁² + 9y₂²}.
(d) R = {(x, y) ∈ R² × R² : x − y = α(1, 1) for some α ∈ R∗ }.
(e) Fix c ∈ R. Now, define R = {(x, y) ∈ R² × R² : y₂ − x₂ = c(y₁ − x₁)}.
(f ) R = {(x, y) ∈ R² × R² : |x₁| + |x₂| = α(|y₁| + |y₂|)}, for some number α ∈ R⁺.
(g) R = {(x, y) ∈ R² × R² : x₁x₂ = y₁y₂}.

5. For x = (x₁, x₂), y = (y₁, y₂) ∈ R², let S = {x ∈ R² : x₁² + x₂² = 1}. Then, are the relations
given below equivalence relations on S?
(a) R = {(x, y) ∈ S × S : x₁ = y₁, x₂ = −y₂}.
(b) R = {(x, y) ∈ S × S : x = −y}.


Definition 1.6.7. Let X be a nonempty set. Then a partition of X is a collection of disjoint,


nonempty subsets of X whose union is X.

Example 1.6.8. Let X = {a, b, c, d, e}.


1. Then {{a, b}, {c, e}, {d}} is a partition of X.
Consider the relation R = {(a, a), (b, b), (c, c), (d, d), (e, e), (a, b), (b, a), (c, e), (e, c)} on X. The
equivalence classes of R are [a] = [b] = {a, b}, [c] = [e] = {c, e} and [d] = {d}, which constitute
the said partition of X.
2. Consider the partition {{a}, {b, c, d}, {e}} of X. Verify that the relation

R = {(a, a), (b, b), (c, c), (d, d), (e, e), (b, c), (c, d), (b, d), (c, b), (d, c), (d, b)}

is an equivalence relation on X with equivalence classes [a] = {a}, [b] = {b, c, d} and [e] = {e}.

Given a partition of a nonempty set X, does there exist an equivalence relation on X such that
the disjoint equivalence classes are exactly the elements of the partition? Recall that the elements of
a partition are subsets of the given set.

Proposition 1.6.9. [Constructing equivalence relation from equivalence classes] Let P be a
partition of a nonempty set X. Let ∼ be the relation on X defined by

for each pair of elements x, y ∈ X, x ∼ y if and only if both x and y are elements of the
same subset A in P.

Then the set of equivalence classes of ∼ is equal to P.

Proof. The construction of ∼ says that if A and B are two distinct elements of P, then all elements
of A are related to each other by ∼, all elements of B are related to each other by ∼, but no element
of A is related to any element of B by ∼.
Let x ∈ X. Since P is a partition, x ∈ A for some A ∈ P. Then x ∼ x. So, ∼ is reflexive.
Let x, y ∈ X such that x ∼ y. Then, there exists A ∈ P such that x, y ∈ A. So, y ∼ x. Hence ∼
is symmetric.
Let x, y, z ∈ X such that x ∼ y and y ∼ z. Then there exist A, B ∈ P such that x, y ∈ A and
y, z ∈ B. As y ∈ A ∩ B and distinct elements of P are disjoint, A = B. It follows that x ∼ z. That is, ∼ is transitive.
To complete the proof, we show that
1. Each equivalence class of ∼ is an element of P.
2. each element of P is an equivalence class of ∼.

1. Let [x] be an equivalence class of ∼ for some x ∈ X. This x is in some A ∈ P. Now,
y ∈ [x] ⇔ x ∼ y ⇔ y ∈ A. Then [x] = A.
2. Similarly, let B ∈ P. Take x ∈ B. Now y ∈ B ⇔ y ∼ x ⇔ y ∈ [x]. Then [x] = B.
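
The construction in Proposition 1.6.9 can be carried out mechanically for a finite partition. The sketch below builds ∼ from the partition of Example 1.6.8.2 and then recovers the equivalence classes; the function name `relation_from_partition` is introduced here for illustration.

```python
def relation_from_partition(partition):
    """x ~ y exactly when x and y lie in the same block of the partition."""
    return {(x, y) for block in partition for x in block for y in block}

partition = [{"a"}, {"b", "c", "d"}, {"e"}]
X = set().union(*partition)

R = relation_from_partition(partition)

# Recover the equivalence classes [x] = {y : y ~ x}.
classes = {frozenset(y for y in X if (y, x) in R) for x in X}

print(len(R))                                       # 11
print(classes == {frozenset(b) for b in partition})  # True
```

The relation has 1 + 9 + 1 = 11 pairs, which matches the relation R listed in Example 1.6.8.2.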

Exercise 1.6.10. 1. Let X and Y be two nonempty sets and f be a relation from X to Y . Let
Id_X and Id_Y be the identity relations on X and Y , respectively. Then,
(a) is it necessary that f⁻¹ ◦ f ⊆ Id_X ?
(b) is it necessary that f⁻¹ ◦ f ⊇ Id_X ?
(c) is it necessary that f ◦ f⁻¹ ⊆ Id_Y ?
(d) is it necessary that f ◦ f⁻¹ ⊇ Id_Y ?

2. In addition to the data in (1), suppose f is a function. Then,


(a) is it necessary that f ◦ f⁻¹ ⊆ Id_Y ?
(b) is it necessary that Id_X ⊆ f⁻¹ ◦ f ?

3. Take X ≠ ∅. Is X × X an equivalence relation on X? If yes, what are the equivalence classes?


4. On a nonempty set X, what is the smallest equivalence relation (in the sense that every other
equivalence relation will contain this equivalence relation; recall that a relation is a set)?
5. Supply the equivalence relation on R whose equivalence classes are {[m, m + 1) : m ∈ Z}.

6. A relation on a nonempty set may or may not be reflexive, symmetric, or transitive. Thus there
are 8 types of relations. With X = {1, 2, 3, 4, 5}, give one example for each type of such relations.
7. What is the number of all relations on {1, 2, 3}?
8. What is the number of relations f from {1, 2, 3} to {a, b, c} such that dom f = {1, 3}?
9. What is the number of relations f on {1, 2, 3} such that f = f⁻¹?
10. What is the number of partial functions on {1, 2, 3}? How many of them are functions?
11. What is the number of functions from {1, 2, 3} to {a₁, a₂, . . . , aₙ}?
12. What is the number of equivalence relations on {1, 2, 3, 4, 5}?
13. Let f, g be two non-equivalence relations on R. Then, is it possible to have f ◦g as an equivalence
relation? Give reasons for your answer.

14. Let f, g be two equivalence relations on R. Then, prove/disprove the following statements.
(a) f ◦ g is necessarily an equivalence relation.
(b) f ∩ g is necessarily an equivalence relation.
(c) f ∪ g is necessarily an equivalence relation.
(d) f ∪ gᶜ is necessarily an equivalence relation. (gᶜ = (R × R) \ g)

Chapter 2

The Natural Number System

Proofs are to mathematics what spelling is to poetry. Mathematical works do consist of proofs, just as
poems do consist of words - V. Arnold.

2.1 Peano Axioms


In this section, the set of natural numbers is defined axiomatically. These axioms are credited to the
Italian mathematician G. Peano and the German mathematician J. W. R. Dedekind. The goal in
these axioms is to first establish the existence of one natural number and then define a function, called
the successor function, to generate the remaining natural numbers. Each of these axioms, listed P1
to P3 below, is crucial to the properties that the set of natural numbers enjoys.

P1. 1 ∈ N, i.e., 1 is a natural number.
P1 guarantees the existence of one natural number. We now generate more natural numbers
using the successor function. So, we assume the existence of a successor function S defined on
N. The existence of the successor function is a property unique to the set of natural numbers.
P2. There exists an injective function S : N → N \ {1}.
Here, for each x ∈ N, S(x) is called the successor of x.
Axiom P2 implies that 1 is not the successor of any natural number. As S(1) ≠ 1, denote S(1)
by 2. Now S(S(1)), which is S(2), is different from both 1 and 2. Denote S(2) by 3. By a similar
argument, denote S(3) by 4, S(4) by 5, etc. From this argument each of the elements of
the set {1, 2, 3, . . .} is also an element of N. Thus, the axiomatic/formal definition of N includes
all the usual elements, i.e., 1, 2, 3, . . ..
Further, to exclude versions of N that are ‘too large’, the last axiom, called the Axiom of
Induction, is stated next.
P3. [Axiom of Induction] Let X ⊆ N be such that
1. 1 ∈ X, and
2. for each x ∈ X, S(x) ∈ X.

Then X = N.
Axioms P1 and P2 ensure that {1, 2, . . .} ⊆ N. Further, as 1 ∈ {1, 2, . . .} and for each n ∈
{1, 2, . . .}, S(n) ∈ {1, 2, . . .}, Axiom P3 ensures that N = {1, 2, . . .}.

The next result ensures that any natural number different from 1 has to be a successor of some
other natural number. This, in effect, re-emphasizes the Axioms P2 and P3.


Lemma 2.1.1. If n ∈ N and n ≠ 1, then there exists m ∈ N such that S(m) = n.

Proof. Let X = {x ∈ N : x = 1 or ∃ y ∈ N such that x = S(y)}. By the definition of X, both 1 and
S(1) belong to X, i.e., X \ {1} ≠ ∅.
So, for any x ∈ X \ {1}, there must exist y ∈ N such that x = S(y). Observe that S(y) ∈ N.
Therefore, S(x) = S(S(y)) implies that S(x) ∈ X. Also, S(1) ∈ X. Thus, S(x) ∈ X for each x ∈ X and hence,
by the induction axiom P3, X = N. Consequently, each n ∈ N with n ≠ 1 equals S(m) for some m ∈ N.

The existence of the set of natural numbers has been established axiomatically. So, we now discuss
the arithmetic on N, an important property of the set of natural numbers. The arithmetic in N that
touches every aspect of our lives is clearly addition and multiplication. So, depending solely on the
Peano axioms, we define the operation of addition on N. 1 is always a natural number by Axiom P1.
First, we establish what it means to add 1 to a natural number n. Here, we define n + 1 = S(n).
We now wish to add any two natural numbers n and m. The case m = 1 is covered above, so assume that
m ≠ 1. From Lemma 2.1.1, there exists k ∈ N such that m = S(k). So, to define n + m, it is sufficient
to define n + S(k). We do this by using the following recursive definition: n + S(k) = S(n + k).
For example, suppose we wish to compute 1 + 2. By the paragraph after Axiom P2, 2 = S(1). So,
1 + 2 = 1 + S(1). By the above definition, 1 + S(1) = S(1 + 1) and 1 + 1 = S(1), which is 2 by the
paragraph after Axiom P2. Thus, 1 + S(1) = S(1 + 1) = S(2) = 3. An iteration of this process will
generate the usual addition on N. In short, the definition for addition is:

Definition 2.1.2. We define addition as follows.
1. For each n ∈ N, n + 1 := S(n), and
2. for each m, n ∈ N, n + S(m) := S(n + m).

Using a similar argument, axiomatic multiplication “·” can be defined. First, set n · 1 to be n.
The multiplication of arbitrary natural numbers is now defined in a recursive manner. The formal
definition is:

Definition 2.1.3. The multiplication of two natural numbers is defined as follows.


1. For all n ∈ N, n · 1 := n, and
2. for all m, n ∈ N, n · S(m) := (n · m) + n.

We follow the usual convention of writing (n · m) + k as n · m + k.
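
Definitions 2.1.2 and 2.1.3 translate directly into recursive procedures. The sketch below models N by the positive Python integers with S(n) = n + 1, which is an assumption of the illustration and not part of the axiomatic development; `pred` is a helper name introduced here for the element m with S(m) = n supplied by Lemma 2.1.1.

```python
def S(n):
    """Successor function, modelled on Python's positive integers."""
    return n + 1

def pred(n):
    """The m with S(m) = n; it exists only for n != 1 (Lemma 2.1.1)."""
    assert n != 1, "1 is not a successor"
    return n - 1

def add(n, m):
    """n + 1 := S(n) and n + S(k) := S(n + k)."""
    if m == 1:
        return S(n)
    return S(add(n, pred(m)))

def mul(n, m):
    """n · 1 := n and n · S(k) := (n · k) + n."""
    if m == 1:
        return n
    return add(mul(n, pred(m)), n)

print(add(1, 2), mul(3, 4))  # 3 12

# Compare with built-in arithmetic on a small range.
agrees = all(add(n, m) == n + m and mul(n, m) == n * m
             for n in range(1, 13) for m in range(1, 13))
print(agrees)  # True
```

The call add(1, 2) unfolds as S(add(1, 1)) = S(S(1)) = 3, which is the computation of 1 + 2 carried out in the text.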

Using the above axiomatic definitions of both addition and multiplication, we derive the properties
of the set of natural numbers N.

1. [Associativity of addition] For every n, m, k ∈ N, n + (m + k) = (n + m) + k.


Proof. Let X = {k ∈ N : for all m, n ∈ N, n + (m + k) = (n + m) + k}. We show that X = N.
Let n, m ∈ N. As

n + (m + 1) = n + S(m) (Definition 2.1.2.1)


= S(n + m) (Definition 2.1.2.2)
= (n + m) + 1, (Definition 2.1.2.1)

we get 1 ∈ X. Now, let z ∈ X and let us show that S(z) ∈ X. As z ∈ X, by definition of X

n + (m + z) = (n + m) + z, for all n, m ∈ N. (2.1)



Therefore, using Definition 2.1.2 and Equation (2.1), we see that

n+(m+S(z)) = n+S(m+z) = S(n+(m+z)) = S((n+m)+z) = (n+m)+S(z) for all n, m ∈ N.

Hence, S(z) ∈ X and thus by the induction axiom, Axiom P3, X = N.

2. [Commutativity of addition] For every x, y ∈ N, x + y = y + x.


Proof. Let X = {k ∈ N : for all n ∈ N, n + k = k + n}. We show that X = N.
To show 1 ∈ X, we define the set Y to be Y = {n ∈ N : n + 1 = 1 + n} and prove
that Y = N.
Firstly, 1 + 1 = 1 + 1 and hence 1 ∈ Y . Now, let y ∈ Y . We show that S(y) ∈ Y . As y ∈ Y , we have
1 + y = y + 1 and hence

1 + S(y) = S(1 + y) = S(y + 1) = S(S(y)) = S(y) + 1.

Thus, S(y) ∈ Y and hence by Axiom P3, Y = N. Therefore, we conclude that 1 ∈ X.


Now, let z ∈ X. We show that S(z) ∈ X. As z ∈ X, we have n + z = z + n, for all n ∈ N. Thus,
using 1 ∈ X, n + z = z + n, for all n ∈ N and associativity, one has

n + S(z) = n + (z + 1) = (n + z) + 1 = (z + n) + 1 = 1 + (z + n) = (1 + z) + n = S(z) + n,

for all n ∈ N. Hence, S(z) ∈ X and thus by Axiom P3, X = N.

3. [Distributive law] For every n, m, k ∈ N, n · (m + k) = n · m + n · k.

Proof. Let X = {k ∈ N : for all m, n ∈ N, n · (m + k) = n · m + n · k}. We show that X = N.
1 ∈ X as for each n, m ∈ N,

n · (m + 1) = n · S(m) = n · m + n = n · m + n · 1.

Now, let z ∈ X and let us show that S(z) ∈ X. Since z ∈ X

n · (m + z) = n · m + n · z, for all n, m ∈ N. (2.2)

Thus, by definition and Equation (2.2), we see that

n·(m+S(z)) = n·S(m+z) = n·(m+z)+n = (n·m+n·z)+n = n·m+(n·z +n) = n·m+n·S(z),

for all n, m ∈ N. Hence, S(z) ∈ X and thus by Axiom P3, X = N.

Exercise 2.1.4. Prove the following using only the above properties:
1. [Uniqueness of addition] For every m, n, k ∈ N, if m = n then m + k = n + k.
2. [Additive cancellation] For every x, y ∈ N, if x + z = y + z for some z ∈ N then x = y.
3. [Associativity of multiplication] For every x, y, z ∈ N, x · (y · z) = (x · y) · z.
4. [Multiplication by 1] For each n ∈ N, 1 · n = n.
5. [Second distributive law] For every n, m, k ∈ N, (m + n) · k = m · k + n · k.
6. [Commutativity of multiplication] For each m, n ∈ N, n · m = m · n.
7. [Uniqueness of multiplication] For every m, n, k ∈ N, whenever m = n then m · k = n · k.
8. [Multiplicative cancellation] For every x, y ∈ N, if x · z = y · z for some z ∈ N then x = y.

2.2 Other forms of Principle of Mathematical Induction


Mathematical Induction is an important and useful technique used for proofs in Mathematics. This
in a sense is a reformulation of the Axiom of Induction. We discuss this principle now.
Let P (n) be a statement which may or may not be true for any natural number n. Consider the
set X = {n ∈ N : P (n) is true }. The axiom of induction states that if 1 ∈ X and n ∈ X implies
n + 1 = S(n) ∈ X, for all n ∈ N then X = N. In other words, if P (1) is true and P (n) is true
implies P (n + 1) is true for all n ∈ N then one concludes that P (n) is true for all n ∈ N. The formal
description is given below.
[Principle of Mathematical Induction (PMI)] Let P (n) be a statement (proposition) dependent on
a natural number n ∈ N such that the following hold:
1. Base step: P (1) is true.
2. Induction step: for each n ∈ N, the statement P (n) is true implies P (n + 1) is true.

Then, P (n) is true for all n ∈ N.


We give an analogy to the above principle.

Observation.
Imagine a ladder with n rungs, where n can be very large. Suppose I wish to climb the ladder.
The strategy that I would like to adopt is:

1. I step onto the first rung of the ladder.
2. When I am on the k-th rung of the ladder, I know how to climb to the (k + 1)-th rung.

Here, observe that if k = 1, then I am on the first rung and using 2, I climb to the second rung.
When k = 2, by 2, I can climb to the third rung. In short, using 1, I step onto the ladder and
then using 2 repeatedly, I ascend up the ladder. This is the essence of mathematical induction.

Stepping onto the ladder is referred to as the base step and the process of moving to the (k + 1)-th
step from the k-th step is referred to as the inductive step. The above idea is formalized as the
principle of mathematical induction, which, as explained above, is a reformulation of the Axiom of
Induction (Axiom P3). We now present three simple examples to illustrate this.
Example 2.2.1. 1. Compute the sum of the first n natural numbers.
Let P (n) be the statement that $\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$.
(a) Base step: n = 1 ⇒ $\sum_{i=1}^{1} i = 1 = \frac{1 \cdot 2}{2}$.
(b) Induction hypothesis: Let us assume that P (k) holds and show that P (k + 1) holds. Here,

$$\sum_{i=1}^{k+1} i = \sum_{i=1}^{k} i + (k+1) = \frac{k(k+1)}{2} + (k+1) = \frac{(k+1)(k+2)}{2}.$$

Thus, by PMI P (n) is true for all n ∈ N.


2. Prove that 6 divides n³ + 5n for all natural numbers n.
Let P (n) be the statement that 6 divides n³ + 5n.
(a) Base step: n = 1 ⇒ 1³ + 5 · 1 = 6, which is clearly divisible by 6.
(b) Induction hypothesis: Let us assume that P (k) holds and show that P (k + 1) holds. Note
that the properties of addition and multiplication imply that (k + 1)³ = k³ + 3k² + 3k + 1.
Thus,

(k + 1)³ + 5(k + 1) = k³ + 3k² + 3k + 1 + 5k + 5 = (k³ + 5k) + 3k(k + 1) + 6.

By induction hypothesis, 6 divides k³ + 5k; 6 divides 6 and 6 also divides 3k(k + 1) as either
k or k + 1 is even for every natural number k.

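Thus, by PMI P (n) is true for all n ∈ N.

3. Let $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$. Then prove that $A^n = \begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix}$, for all n ≥ 1.
For n ≥ 1, let P (n) be the statement that $A^n = \begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix}$. Then,
(a) P (1) holds true, as $A^1 = A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$.
(b) So, let us assume that P (k) is true and show that P (k + 1) holds. Here, P (k) holds true
implies that $A^k = \begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix}$. Thus,

$$A^{k+1} = A^k A = \begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & k+1 \\ 0 & 1 \end{pmatrix}.$$

Thus, by PMI P (n) is true for all n ≥ 1.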
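
Before writing an induction proof, it is often useful to test the claimed statement for small n. The sketch below checks the three statements of Example 2.2.1 for n = 1, . . . , 50; this is only numerical evidence and does not replace the proofs. The helper names `mat_mul` and `mat_pow` and the bound 50 are choices made for the illustration.

```python
def mat_mul(P, Q):
    """Product of two 2 x 2 integer matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_pow(A, n):
    """A^n for n >= 1, computed by repeated multiplication A^(k+1) = A^k A."""
    result = A
    for _ in range(n - 1):
        result = mat_mul(result, A)
    return result

A = [[1, 1], [0, 1]]

checks = []
for n in range(1, 51):
    sum_ok = sum(range(1, n + 1)) == n * (n + 1) // 2   # item 1
    div_ok = (n**3 + 5 * n) % 6 == 0                    # item 2
    pow_ok = mat_pow(A, n) == [[1, n], [0, 1]]          # item 3
    checks.append(sum_ok and div_ok and pow_ok)

print(all(checks))  # True
```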



There is another form of the principle of mathematical induction, generally called the principle of
strong induction, wherein the difference is in the induction step.

Theorem 2.2.2. [Principle of strong induction (PSI)] Let P (n) be a statement dependent on n ∈ N
such that the following hold:
1. Base step: P (1) is true.
2. Induction step: For each n ∈ N, P (1), P (2), . . . , P (n) are all true implies P (n + 1) is true.

Then, P (n) is true for all n ∈ N.

Proof. Let X = {n ∈ N : P (1) and P (2) and . . . and P (n) hold true}. Since P (1) is assumed true,
1 ∈ X. Let n ∈ X. Then all of P (1), P (2), . . . , P (n) are true. By the induction step, P (n + 1)
is true. That is, n + 1 = S(n) ∈ X. Thus, X is an inductive set and hence by Axiom P3, X = N.
Therefore, P (n) is true for all n ∈ N.

As expected, PSI is equivalent to PMI. We now prove this equivalence.

Theorem 2.2.3. [Equivalence of PMI and PSI] Let P (n) be a statement dependent on n ∈ N.
Suppose that P means the statement ‘P (n) is true for each n ∈ N.’ Then ‘P can be proved using
PMI’ if and only if ‘P can be proved using PSI’.

Proof. Let us assume that P has been proved using PMI. Hence, P (1) is true. Further, whenever
P (n) is true, we are able to establish that P (n + 1) is true. Therefore, we can recursively establish
that P (n + 1) is true if P (1), . . . , P (n) are true. Hence, P can be proved using PSI.
So, now let us assume that P has been proved using PSI. Define Q(n) to mean ‘P (ℓ) holds
for ℓ = 1, 2, . . . , n.’ Notice that Q(1) is true. Suppose that Q(n) is true, i.e., P (ℓ) is true for
ℓ = 1, 2, . . . , n. But, by hypothesis, we know that P has been proved using PSI. Thus, P (n + 1) is
true whenever P (ℓ) is true for ℓ = 1, 2, . . . , n. This, in turn, means that Q(n + 1) is true. Hence, by
PMI, Q(n) is true for all n ∈ N. Thus, P can be proved using PMI.

There are many variations of PMI and PSI. One useful formulation considers the set N \ {1, 2, . . . , n₀}
(for some fixed n₀ ∈ N) instead of N. We formulate and prove one such version of PMI below.

Theorem 2.2.4. [Another form of PMI] Let n₀ ∈ N. Let P (n) be a statement dependent on n ∈ N
such that the following hold:
1. P (n₀ + 1) is true.
2. For each n ≥ n₀ + 1, P (n) is true implies P (n + 1) is true.

Then, P (n) is true for each n ≥ n₀ + 1.

Proof. Since n₀ ∈ N, for each n ∈ N, n + n₀ ∈ N. Consider the statement Q(n) := P (n + n₀). Then
Q(1) = P (n₀ + 1), which is true by the first assumption.
Let n ≥ n₀ + 1. Then, n = n₀ + ℓ, for some ℓ ∈ N. Let us now assume that Q(ℓ) is true.
Then, by definition P (ℓ + n₀) = P (n) holds true as Q(ℓ) = P (ℓ + n₀). Therefore, using the second
assumption and the commutativity of addition, P (n + 1) = P (ℓ + n₀ + 1) = P (ℓ + 1 + n₀) holds true.
Thus, Q(ℓ + 1) = P (ℓ + 1 + n₀) holds true. Hence, we have shown the following:
1. Q(1) is true.
2. Further, for each ℓ ∈ N, the assumption Q(ℓ) is true implies that Q(ℓ + 1) is true.

Hence, by PMI, it follows that for each m ∈ N, Q(m) is true. As every n ≥ n₀ + 1 can be written as
n = m + n₀ with m ∈ N, P (n) is true for each n ≥ n₀ + 1.

Exercise 2.2.5. Prove the following variations of PSI and PMI.


1. Variation of PSI: Let n₀ ∈ N be fixed. Let P (n) be a statement dependent on n ∈ N such that
the following hold:

P (n₀ + 1) is true.
For each n ≥ n₀ + 1, P (n₀ + 1), P (n₀ + 2), . . . , P (n) are true implies P (n + 1) is true.

Then for each n ≥ n₀ + 1, P (n) is true.

2. Variation of PMI: Let n₀ ∈ N and let N₀ = {n₀ + 1, n₀ + 2, . . .}. Let X ⊆ N₀ be such that
n₀ + 1 ∈ X, and for each n ∈ N₀ , n₀ + 1, n₀ + 2, . . . , n ∈ X implies S(n) ∈ X. Then X = N₀ .

As an application, we now prove the following result.

Example 2.2.6. Every natural number greater than or equal to 2 is a product of primes.¹
For n ≥ 2, let P (n) be the statement that n can be written as a product of primes.
1. Base step: Let n = 2. As 2 is prime, P (2) is true.
2. Induction step: Assume that P (2), P (3), . . . , P (k) are all true.
Consider the natural number k + 1. Then, we consider the following two cases:
(a) If k + 1 is prime then P (k + 1) holds.
(b) k + 1 is not a prime. In this case, there exist p, q ∈ {2, 3, . . . , k} such that p · q = k + 1.
Since p, q ≤ k, by the induction hypothesis each of p and q can be written as a product of
primes, say p = p₁ · · · pₛ and q = q₁ · · · qₜ . Thus, k + 1 = (p₁ · · · pₛ ) · (q₁ · · · qₜ ). Therefore,
P (k + 1) holds.

Hence by PSI (in the form of Exercise 2.2.5.1 with n0 = 1), P (n) is true for all n ≥ 2.

¹ Refer to Definition 4.1.11 for prime numbers.
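
The strong induction argument above is, in effect, a recursive procedure: either k + 1 is prime, or it splits as p · q with both factors smaller, and each factor is handled by the same procedure. The following sketch mirrors the two cases; the function name `prime_factors` and the trial-division search for a divisor are choices of this illustration, not part of the notes.

```python
from math import isqrt, prod

def prime_factors(n):
    """Return a list of primes whose product is n, for an integer n >= 2."""
    if n < 2:
        raise ValueError("n must be at least 2")
    for p in range(2, isqrt(n) + 1):
        if n % p == 0:
            # Case (b): n = p * q with 2 <= p, q < n, so recurse on both factors.
            return prime_factors(p) + prime_factors(n // p)
    # Case (a): no divisor between 2 and sqrt(n), so n is prime.
    return [n]

print(prime_factors(60))  # [2, 2, 3, 5]

# Every n from 2 to 200 is recovered as the product of the returned list.
all_ok = all(prod(prime_factors(n)) == n for n in range(2, 201))
print(all_ok)  # True
```

The recursion terminates because both factors in case (b) are strictly smaller than n, which is exactly what the hypothesis P (2), . . . , P (k) provides in the proof.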

2.3 Applications of Principle of Mathematical Induction


Example 2.3.1. [Triangular numbers]
1. Show that for each x ∈ N, x ≥ 2, there exists a unique t ∈ N such that 1 + 2 + · · · + t < x ≤
1 + 2 + · · · + t + (t + 1).
2. Let S₀ = 0¹ and let Sₜ = 1 + 2 + · · · + t for t ∈ N. Show that for each x ∈ N, there exists a
unique t ∈ W = N ∪ {0} such that Sₜ < x ≤ Sₜ₊₁ .
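
The statement in item 2 can be explored numerically before proving it. The sketch below finds the index t by stepping through the triangular numbers and also counts, by brute force, how many t satisfy the inequalities; the function names and the search bound 200 are choices made for this illustration.

```python
def triangular(t):
    """S_t = 1 + 2 + ... + t, with S_0 = 0."""
    return t * (t + 1) // 2

def triangular_index(x):
    """Smallest t >= 0 with x <= S_(t+1); it then satisfies S_t < x <= S_(t+1)."""
    t = 0
    while triangular(t + 1) < x:
        t += 1
    return t

def count_valid_t(x):
    """Number of t in {0, ..., x} with S_t < x <= S_(t+1)."""
    return sum(1 for t in range(0, x + 1)
               if triangular(t) < x <= triangular(t + 1))

print([triangular_index(x) for x in range(1, 8)])  # [0, 1, 1, 2, 2, 2, 3]
unique = all(count_valid_t(x) == 1 for x in range(1, 200))
print(unique)  # True
```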

The base steps in PMI and PSI are important, and overlooking these may result in spurious
arguments. See the following example.

Example 2.3.2. [Wrong use of PSI] The following is an incorrect proof of “if a set of n balls contains
a green ball then all the balls in the set are green”. Find the error.

Proof. The statement holds trivially for n = 1. Assume that the statement is true for n ≤ k. Take a
collection Bk+1 of k + 1 balls that contains at least one green ball. From Bk+1 , pick a collection Bk
of k balls that contains at least one green ball. Then by the induction hypothesis, each ball in Bk is
green. Now, remove one ball from Bk and put the ball which was left out in the beginning. Call it
B′k . Again by induction hypothesis, each ball in B′k is green. Thus, each ball in Bk+1 is green. Hence
by PMI, our proof is complete.

The following result enables us to define a function on N inductively.

Theorem 2.3.3. [Inductive definition of function] Let f be a relation from N to a nonempty set X
satisfying
1. f ({1}) is a singleton, and
2. for each n ∈ N, f ({n}) is a singleton implies f ({S(n)}) is a singleton.

Then, f is a function from N to X.

Proof. By the hypothesis, f is already a partial function. Now, let A = dom f . Note that 1 ∈ A and
n ∈ A implies S(n) ∈ A. So, by the induction axiom A = N. Thus, f is a function.
In the following exercises, assume the usual properties of x^n where x ∈ C and n ∈ N ∪ {0}.
Exercise 2.3.4. 1. Let a, a + d, a + 2d, . . . , a + (n − 1)d be the first n terms of an arithmetic
progression, with a, d ∈ C. Then
$\sum_{i=0}^{n-1} (a + id) = a + (a + d) + \cdots + (a + (n-1)d) = \frac{n}{2}\,\bigl(2a + (n-1)d\bigr)$.

2. Let a, ar, ar^2 , . . . , ar^{n−1} be the first n terms of a geometric progression, with a, r ∈ C, r ≠ 1.
Then
$\sum_{i=0}^{n-1} a r^{i} = a + ar + \cdots + ar^{n-1} = a\,\frac{r^{n} - 1}{r - 1}$.
[Footnote 1, to Example 2.3.1: The reader may refer to Section 2.6 for the construction of the set of integers.]

3. Prove that
(a) 6 divides n^3 − n, for all n ∈ N.
(b) 12 divides n^4 − n^2 , for all n ∈ N.
(c) 7 divides n^7 − n, for all n ∈ N.
(d) 3 divides 2^{2n} − 1, for all n ∈ N.
(e) 9 divides 2^{2n} − 3n − 1, for all n ∈ N.
(f ) 10 divides n^9 − n, for all n ∈ N.
(g) 12 divides 2^{2n+2} − 3n^4 + 3n^2 − 4, for all n ∈ N.
(h) 1^3 + 2^3 + · · · + n^3 = (1 + 2 + · · · + n)^2 .

4. Find a formula for 1 · 2 + 2 · 3 + 3 · 4 + · · · + (n − 1) · n and prove it.


5. Find a formula for 1 · 2 · 3 + 2 · 3 · 4 + 3 · 4 · 5 + · · · + (n − 1) · n · (n + 1) and prove it.
6. Find a formula for 1 · 3 · 5 + 2 · 4 · 6 + · · · + n · (n + 2) · (n + 4) and prove it.
7. For every positive integer n ≥ 5 prove that 2^n > n^2 > 2n + 1.
8. Prove by induction that 2^n divides (n + 1)(n + 2) · · · (2n).
9. [AM-GM inequality]
(a) Let a1 , . . . , a9 be non-negative real numbers such that the sum a1 + · · · + a9 = 5. Consider
the numbers (a1 + a2 )/2, (a1 + a2 )/2, a3 , . . . , a9 and argue that
$\frac{a_1 + a_2}{2} + \frac{a_1 + a_2}{2} + a_3 + \cdots + a_9 = 5, \qquad a_1 \cdots a_9 \le \Bigl(\frac{a_1 + a_2}{2}\Bigr)^{2} a_3 \cdots a_9$.
(b) Among two pairs of non-negative real numbers with equal sum, the pair with least difference
has the largest product.
(c) The product of n ≥ 2 non-negative real numbers with a given sum is maximum when all the
numbers are equal.
(d) Let a1 , . . . , an be non-negative real numbers. Show that [(a1 + · · · + an )/n]^n ≥ a1 · · · an ; and
equality is achieved when a1 = · · · = an .

10. Prove that, for all n ≥ 32, there exist non-negative integers x and y such that n = 5x + 9y.
11. Prove that, for all n ≥ 40, there exist non-negative integers x and y such that n = 5x + 11y.
12. Prove that for µ > 0,
$\prod_{l=1}^{p} (1 + l\mu) \ge 1 + \frac{p(p+1)}{2}\,\mu + \frac{1}{2}\Bigl[\frac{p^{2}(p+1)^{2}}{4} - \frac{p(p+1)(2p+1)}{6}\Bigr]\mu^{2}$.

13. By an L-shaped piece, we mean a piece of the type shown in the picture. Consider a 2^n × 2^n
square with one unit square cut. See the picture given below.

[Figure: an L-shaped piece; a 4 × 4 square with a unit square cut]

Show that a 2^n × 2^n square with one unit square cut can be tiled with L-shaped pieces.
14. Use (k + 1)^5 − k^5 = 5k^4 + 10k^3 + 10k^2 + 5k + 1 to get a closed form expression for
$\sum_{k=1}^{n} k^{4}$. Then
use PMI to prove your answer.
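
Before attempting an induction proof, it can help to test a claim numerically for small n. The following Python sketch checks Parts 3, 7, 8, 10 and 11 of Exercise 2.3.4 on a finite range. The helper representable and the range bounds are our own choices, and a passing check is evidence only, not a proof.

```python
import math


def representable(n: int, a: int, b: int) -> bool:
    """True if n = a*x + b*y for some non-negative integers x and y."""
    return any((n - a * x) % b == 0 for x in range(n // a + 1))


N = range(1, 41)  # finite test range, chosen arbitrarily
checks = {
    "3(a)": all((n**3 - n) % 6 == 0 for n in N),
    "3(b)": all((n**4 - n**2) % 12 == 0 for n in N),
    "3(c)": all((n**7 - n) % 7 == 0 for n in N),
    "3(d)": all((2**(2 * n) - 1) % 3 == 0 for n in N),
    "3(e)": all((2**(2 * n) - 3 * n - 1) % 9 == 0 for n in N),
    "3(f)": all((n**9 - n) % 10 == 0 for n in N),
    "3(g)": all((2**(2 * n + 2) - 3 * n**4 + 3 * n**2 - 4) % 12 == 0 for n in N),
    "3(h)": all(sum(k**3 for k in range(1, n + 1)) == sum(range(1, n + 1))**2 for n in N),
    "7": all(2**n > n**2 > 2 * n + 1 for n in range(5, 41)),
    "8": all(math.prod(range(n + 1, 2 * n + 1)) % 2**n == 0 for n in N),
    "10": all(representable(n, 5, 9) for n in range(32, 200)),
    "11": all(representable(n, 5, 11) for n in range(40, 200)),
}

failed = [name for name, ok in checks.items() if not ok]
print(failed)  # an empty list means no counterexample was found in the tested range
```

The bounds 32 and 40 in Parts 10 and 11 cannot be lowered: representable(31, 5, 9) and representable(39, 5, 11) both return False.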

2.4 Well Ordering Property of Natural Numbers


In this section, we introduce an ordering, denoted by <, on N. So, for any m, n ∈ N, we need to define
what n < m means.

Definition 2.4.1. Let m, n ∈ N. Then, the natural number n is said to be strictly less than the
natural number m, denoted by n < m (in words, n is less than m), if there exists a k ∈ N such that
m = n + k. Further, n ≤ m means that either n = m or n < m. When n < m, we also write
m > n and read it as m is greater than n.

We prove some properties of N with the ordering <.

Lemma 2.4.2. [Transitivity] Let x, y, z ∈ N such that x < y and y < z. Then x < z.

Proof. Since x < y, there exists k ∈ N such that y = x + k. Similarly, y < z gives the existence of
ℓ ∈ N such that z = y + ℓ. Hence, z = y + ℓ = (x + k) + ℓ = x + (k + ℓ) = x + t, where t = k + ℓ.
Since the sum of two natural numbers is a natural number, t ∈ N, and we conclude from Definition 2.4.1
that x < z.

Exercise 2.4.3. Let x, y, z ∈ N. Then prove the following:


1. If x ≤ y and y < z then x < z.
2. If x < y and y ≤ z then x < z.
3. If x ≤ y and y ≤ z then x ≤ z.
4. If x < y then x + z < y + z and x · z < y · z.

Lemma 2.4.4. For all m, n ∈ N, m ≠ m + n.

Proof. Suppose there exist m, n ∈ N such that m = m + n. Then m + 1 = m + n + 1 = m + S(n). By
additive cancellation (Exercise 2.1.4.2), 1 = S(n), contradicting Axiom P2.

Lemma 2.4.5. [Law of trichotomy] For all m, n ∈ N, exactly one of the following is true:

n < m, n = m, n > m.

Proof. As a first step, we show that no two of the above can hold together. For, suppose n < m
and n = m. Then m = n + k for some k ∈ N and n = m. That is, m = m + k, which contradicts
Lemma 2.4.4. As another possibility, assume that n < m and n > m. Then there exist k, ℓ ∈ N such
that n = m + k and m = n + ℓ. So that n = m + k = n + (ℓ + k), which again contradicts Lemma 2.4.4.
Similarly, the remaining possibility can be ruled out, and this is left as an exercise for the reader.
To complete the proof, fix n ∈ N, and define X = {m ∈ N : n < m or n = m or n > m}. We show
that X = N.
First, we need to show that 1 ∈ X. If n = 1 then 1 = 1 and hence 1 ∈ X. If n ≠ 1 then there
exists y ∈ N such that n = S(y) = y + 1 = 1 + y and hence by the definition of order, 1 < n, that is, n > 1.
Thus, 1 ∈ X.
Next, in order to apply Axiom P3, assume that m ∈ X. Then either n < m or n = m or n > m.
We will consider all three cases and in each case show that S(m) ∈ X.
If n < m, then m = n + ℓ for some ℓ ∈ N. Thus, S(m) = S(n + ℓ) = (n + ℓ) + 1 = n + (ℓ + 1); and
hence n < S(m). Therefore, S(m) ∈ X.
If m = n, then S(m) = m + 1 = n + 1. So, n < S(m). Thus S(m) ∈ X.
If n > m then n = m + k, for some k ∈ N. Further, if k = 1, then n = m + 1 and S(m) = n. Thus,
S(m) ∈ X. If k ≠ 1, then there exists ℓ ∈ N such that S(ℓ) = k. Then,

n = m + k = m + S(ℓ) = m + (ℓ + 1) = m + (1 + ℓ) = (m + 1) + ℓ = S(m) + ℓ.

Hence S(m) < n and hence S(m) ∈ X.


Thus, by Axiom P3, X = N.
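
Definition 2.4.1 can be tested directly on small numbers. The sketch below models N by Python's positive integers, which is an assumption made only for illustration. The names less_than and trichotomy_holds are our own.

```python
def less_than(n: int, m: int) -> bool:
    """n < m in the sense of Definition 2.4.1: m = n + k for some natural number k."""
    return any(m == n + k for k in range(1, m + 1))


def trichotomy_holds(n: int, m: int) -> bool:
    """Exactly one of n < m, n = m, n > m is true."""
    return [less_than(n, m), n == m, less_than(m, n)].count(True) == 1


print(all(trichotomy_holds(n, m) for n in range(1, 30) for m in range(1, 30)))  # prints True
```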

As an application of the law of trichotomy, we show that there does not exist any natural number
between n and S(n). Or equivalently, if n ≤ m < n + 1, then it is necessarily true that n = m.
Observe that this fact is a consequence of the following result.

Lemma 2.4.6. For all m, n ∈ N, m ≤ n if and only if m < n + 1.

Proof. Let m, n ∈ N. Suppose m ≤ n. Clearly, n < n + 1. So, if m = n, then n < n + 1 implies that
m < n + 1. If m < n, then n < n + 1 again implies that m < n + 1. Thus, in any case, m < n + 1.
Conversely, suppose m < n + 1. If m ≰ n, then by the law of trichotomy, m > n. That is, there
exists ℓ ∈ N such that m = n + ℓ. It follows that n + ℓ < n + 1 for some ℓ ∈ N. Thus, using Additive
Cancellation (Exercise 2.1.4.2), one has ℓ < 1. However, either ℓ = 1 or ℓ = S(k) for some k ∈ N.
The first case implies 1 < 1 and the second case implies that 1 is a successor of some natural number;
giving us a contradiction in either case. Hence m ≤ n.

We are now in a position to state an important principle, namely the well ordering principle.

Theorem 2.4.7. [Well Ordering Principle in N] Every nonempty subset X of N contains its least
element.

Proof. By definition, a least element of a set is an element of the set. We thus need to show that
every nonempty subset of N has a least element. On the contrary, suppose A is a nonempty subset of
N that has no least element. Let B = N \ A. If 1 ∈ A, then 1 will be the least element of A. Thus
1 ∉ A so that 1 ∈ B.
Suppose 1, 2, . . . , m ∈ B. Then, none of 1, 2, . . . , m is in A. If S(m) ∈ A, then S(m) would be the
least element of A. Thus, S(m) ∉ A and hence S(m) ∈ B.
Hence, by the strong form of induction, B = N. Then, B = N \ A implies A = ∅, a contradiction.

Exercise 2.4.8. [Variation of well ordering principle] Let n0 ∈ N and let X be a nonempty subset
of {n0 + 1, n0 + 2, . . .}. Then prove that X contains its least element.

2.5 Recursion Theorem


Recall how we defined addition and multiplication in N. For any fixed n ∈ N, we defined addition by
declaring that n + 1 := S(n) and n + S(m) := S(n + m). Due to induction, we remarked that for each
m ∈ N, these two conditions defined n + m. This intuitive work requires a formal justification. Notice
that + is a binary operation on N, that is, + is a function from N×N to N. We need to derive rigorously
from our axioms that a function satisfying the properties n + 1 := S(n) and n + S(m) := S(n + m)
exists, and that such a function is unique. Multiplication needs a similar justification. Instead of
treating each operation separately, we present a more general result, and view the definitions of
addition and multiplication as special cases. The following result provides this general framework in N.

Theorem 2.5.1. [Recursion Theorem] Let f : N → N be a function. Then, for any fixed natural
number α, there exists a unique function g : N → N such that

g(1) = α and g(S(x)) = f (g(x)) for each x ∈ N. (2.3)

Proof. Define g ⊆ N × N as follows


1. (1, α) ∈ g, and
2. (x, y) ∈ g implies (S(x), f (y)) ∈ g.

As 1 is not a successor of any natural number, g({1}) = {α}, is a singleton. Assume that g({x}) = {y}.
Then, g({S(x)}) = {f (y)}, a singleton as f is a function. So, by Theorem 2.3.3, g is a function.
To show the uniqueness of the function g, we consider two functions g1 , g2 : N → N, satisfying
Equation (2.3). Now, define
V = {n ∈ N : g1 (n) = g2 (n)}.

From Equation (2.3), g1 (1) = g2 (1) = α. So, 1 ∈ V .


Let n ∈ V . Here, g1 (n) = g2 (n). Therefore, g1 (S(n)) = f (g1 (n)) = f (g2 (n)) = g2 (S(n)). Thus,
S(n) ∈ V . By Axiom P3, V = N. Therefore, g1 = g2 .
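
Computationally, Equation (2.3) says that g(x) is obtained by starting from α and applying f exactly x − 1 times. The sketch below illustrates this reading. The name recursively_defined is our own, and N is modelled by Python's positive integers as an assumption.

```python
from typing import Callable


def recursively_defined(f: Callable[[int], int], alpha: int) -> Callable[[int], int]:
    """Return g with g(1) = alpha and g(S(x)) = f(g(x)), as in Equation (2.3)."""
    def g(x: int) -> int:
        if x < 1:
            raise ValueError("x must be a natural number (x >= 1)")
        value = alpha              # g(1) = alpha
        for _ in range(x - 1):     # each step passes from g(k) to g(S(k)) = f(g(k))
            value = f(value)
        return value
    return g


g = recursively_defined(lambda y: 2 * y, 1)
print([g(x) for x in range(1, 6)])  # prints [1, 2, 4, 8, 16]
```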

Using the recursion theorem, we now show that the definitions of addition and multiplication are
indeed well defined.
Example 2.5.2. 1. [Addition function] Let f : N → N be the function defined by f (x) = S(x),
for all x ∈ N. Fix any element y ∈ N. By the recursion theorem, there exists a unique function

g : N → N such that g(1) = S(y) and f (g(x)) = g(S(x)), for all x ∈ N. (2.4)

Define

for all x ∈ N, y + x := g(x) (2.5)

When x = 1, from Equation (2.5), we get y + 1 = g(1). As g(1) = S(y), we get y + 1 = S(y).
Further, for any x ∈ N, we see that

y + S(x) = g(S(x)) (using Equation (2.5))
= f (g(x)) (using f (g(x)) = g(S(x)))
= S(g(x)) (using f (x) = S(x))
= S(y + x). (using g(x) = y + x)

Thus, for all y, x ∈ N, y + S(x) = S(y + x). Hence, both the rules of addition stated
in Definition 2.1.2 are satisfied.
2. [Multiplication function] Fix an element y ∈ N and consider the function f : N → N defined
by f (x) = x + y. (Observe that this is well defined by Part 1. )
Then, by the recursion theorem, there exists a unique function h : N → N, such that h(1) = y
and f (h(x)) = h(S(x)), for all x ∈ N. Now, define y · x := h(x), for all x ∈ N.
Then, for x = 1, we get y · 1 = h(1) = y. Further, for any x ∈ N, we see that

y · S(x) = h(S(x)) = f (h(x)) = f (y · x) = y · x + y,

thereby, proving both the rules of multiplication stated in Definition 2.1.3.


3. [Power function] Fix an element m ∈ N and consider the function f : N → N defined by
f (x) = x · m. (Part 2 allows us to define such a function.)
Then, by the recursion theorem, there exists a unique function p : N → N, such that p(1) = m
and f (p(x)) = p(S(x)), for all x ∈ N. Now, define m^x := p(x), for all x ∈ N.
Then, for x = 1, we get m^1 = p(1) = m. Further, for any x ∈ N, S(x) = x + 1 gives

m^{x+1} = m^{S(x)} = p(S(x)) = f (p(x)) = p(x) · m = (m^x ) · m.

Hence, we have obtained the required power function.
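
The three constructions of Example 2.5.2 can be traced in code, with each operation built from the previous one through the same recursion scheme. In the sketch below, the names S, recurse, add, mul and power are our own, and the successor is modelled by Python's built-in integers as an assumption.

```python
def S(n: int) -> int:
    """Successor function, modelled with Python integers (an assumption)."""
    return n + 1


def recurse(f, alpha):
    """Return g with g(1) = alpha and g(S(x)) = f(g(x))."""
    def g(x):
        value = alpha
        for _ in range(x - 1):
            value = f(value)
        return value
    return g


def add(y: int, x: int) -> int:
    # Part 1: f = S and g(1) = S(y), so y + x := g(x).
    return recurse(S, S(y))(x)


def mul(y: int, x: int) -> int:
    # Part 2: f(v) = v + y and h(1) = y, so y . x := h(x).
    return recurse(lambda v: add(v, y), y)(x)


def power(m: int, x: int) -> int:
    # Part 3: f(v) = v . m and p(1) = m, so m^x := p(x).
    return recurse(lambda v: mul(v, m), m)(x)


print(add(3, 4), mul(3, 4), power(2, 5))  # prints 7 12 32
```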

Remark 2.5.3. Recall that in Example 2.5.2.1, it was easy to show that y + S(x) = S(y + x), for all
y, x ∈ N. What is more difficult to prove is that S(y) + x = S(y + x), for all x, y ∈ N which together
with Example 2.5.2.1 gives us commutativity of addition.
So, we take X = {x ∈ N : S(y) + x = S(y + x)} and prove that X is an inductive set.
By the recursion theorem, there exists a unique function t : N → N such that t(1) = S(S(y)) and
f (t(x)) = t(S(x)), for all x ∈ N. Define

S(y) + x := t(x) for all x ∈ N. (2.6)

As g(1) = S(y) (see Example 2.5.2.1) and g(1) = y + 1 (Equation (2.5)), we see that for x = 1,
S(y) + 1 = t(1) = S(S(y)) = S(g(1)) = S(y + 1). This implies that 1 ∈ X.
To show that X = N, we assume that x ∈ X. Now, consider S(y) + S(x). Then, using
Example 2.5.2.1, S(y) + S(x) = S(S(y) + x). As x ∈ X, S(y) + x = S(y + x) and hence

S(y) + S(x) = S(S(y) + x) = S(S(y + x)) = S(y + S(x)),

where the last equality also follows from Example 2.5.2.1.
Therefore, S(x) ∈ X, whenever x ∈ X. Therefore, by Axiom P3, X = N.

2.6 Construction of Integers


By now, the readers should have got a glimpse of the work required to axiomatically construct N,
the set of natural numbers. Similarly, the construction of integers from natural numbers and the
construction of rational numbers from integers require quite a lot of work. These constructions are
very helpful in understanding advanced algebra. In this section and the succeeding one, we will discuss
how to construct the integers and rational numbers from the natural numbers.
To start with let X = N × N. We define a relation ‘∼’ on X by

(a, b) ∼ (c, d) if a + d = b + c for all a, b, c, d ∈ N.

Then, verify that ∼ is indeed an equivalence relation on X. Let Z denote the collection of all equiv-
alence classes under this relation. So, if [x], [y] ∈ Z then [x] is an equivalence class containing
x = (x1 , x2 ), for some x1 , x2 ∈ N and [y] is an equivalence class containing y = (y1 , y2 ), for some
y1 , y2 ∈ N. Now, using the successor function S defined in Axiom P2, observe that
1. [(1, 1)] = {(n, n) : n ∈ N},
2. for a fixed element m ∈ N, [(1, S(m))] = {(n, m + n) : n ∈ N}, and
3. for a fixed element m ∈ N, [(S(m), 1)] = {(m + n, n) : n ∈ N}.

Further, Z consists of all equivalence classes of the above forms. That is,

Z = {[(1, 1)]} ∪ {[(1, S(m))] : m ∈ N} ∪ {[(S(m), 1)] : m ∈ N}.

Definition 2.6.1. Let [x] = [(x1 , x2 )], [y] = [(y1 , y2 )] ∈ Z for some x1 , x2 , y1 , y2 ∈ N. Define

[x] ⊕ [y] = [(x1 , x2 )] ⊕ [(y1 , y2 )] = [(x1 + y1 , x2 + y2 )]. (2.7)

The map ⊕ : Z × Z → Z, defined above is called the addition in Z.

Note that addition, i.e., the function ⊕, maps a pair of nonempty sets, say [(x1 , x2 )] and
[(y1 , y2 )], to the set [(x1 + y1 , x2 + y2 )]. Thus, we need to verify that different
representatives of the same classes in the domain give rise to the same set in the range. This process
of defining a map using representatives and then verifying that the image is independent of the
representatives chosen is characterized by saying that “the map is well-defined”. So, let us now prove
that ⊕ is well-defined.

Lemma 2.6.2. The map ⊕ defined in Equation (2.7) is well-defined.

Proof. Let [(u1 , u2 )] = [(v1 , v2 )] and [(x1 , x2 )] = [(y1 , y2 )] be two equivalence classes in Z. Then, by
definition

[(u1 , u2 )] ⊕ [(x1 , x2 )] = [(u1 + x1 , u2 + x2 )], [(v1 , v2 )] ⊕ [(y1 , y2 )] = [(v1 + y1 , v2 + y2 )].

For well-definedness, we need to show that [(u1 + x1 , u2 + x2 )] = [(v1 + y1 , v2 + y2 )]. Or equivalently,
we need to show that u1 + x1 + v2 + y2 = u2 + x2 + v1 + y1 .
But, the equality of the equivalence classes [(u1 , u2 )] = [(v1 , v2 )] and [(x1 , x2 )] = [(y1 , y2 )] implies
u1 + v2 = u2 + v1 and x1 + y2 = x2 + y1 . Thus, adding the two and using the commutativity of addition
in N, we get

u1 + x1 + v2 + y2 = u2 + x2 + v1 + y1 .

Thus, the required result follows.
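
The content of Lemma 2.6.2 can be tested on a finite sample of representatives. The sketch below is an illustration and not a proof: the names equivalent and add are our own, and pairs are restricted to entries 1 to 5 to keep the search small.

```python
from itertools import product


def equivalent(p, q) -> bool:
    """(a, b) ~ (c, d) if a + d = b + c."""
    (a, b), (c, d) = p, q
    return a + d == b + c


def add(p, q):
    """A representative of [p] (+) [q], as in Equation (2.7)."""
    return (p[0] + q[0], p[1] + q[1])


pairs = list(product(range(1, 6), repeat=2))  # sample of N x N

# Replacing u by an equivalent v, and x by an equivalent y, must not change the class of the sum.
well_defined = all(
    equivalent(add(u, x), add(v, y))
    for u in pairs for v in pairs if equivalent(u, v)
    for x in pairs for y in pairs if equivalent(x, y)
)
print(well_defined)  # prints True
```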

On similar lines, we now define multiplication among elements of Z.

Definition 2.6.3. Let [x] = [(x1 , x2 )], [y] = [(y1 , y2 )] ∈ Z, for some x1 , x2 , y1 , y2 ∈ N. Then, one
defines multiplication in Z, denoted by ⊙, as

[x] ⊙ [y] = [(x1 , x2 )] ⊙ [(y1 , y2 )] = [(x1 y1 + x2 y2 , x1 y2 + x2 y1 )]. (2.8)

Since we are talking about multiplication between two sets using their representatives, we need
to verify that the multiplication is indeed well-defined. So, the readers are required to prove that
multiplication is well-defined. Further, the following properties of addition and multiplication in
Z can be proved by using the corresponding properties of natural numbers and hence are left as an
exercise for the readers.
Exercise 2.6.4. 1. Show that the multiplication defined in Equation (2.8) is well-defined.
2. Let [x], [y], [z] ∈ Z. Write [0] = [(1, 1)]. Prove the following:
(a) [Associativity of addition] ([x] ⊕ [y]) ⊕ [z] = [x] ⊕ ([y] ⊕ [z]).
(b) [Commutativity of addition] [x] ⊕ [y] = [y] ⊕ [x].
(c) [Existence of the zero element] [x] ⊕ [0] = [x].
(d) [Cancellation property] If [x] ⊕ [y] = [x] ⊕ [z] then [y] = [z]. This implies that the zero
element is unique.
(e) [Existence of additive inverse] For every [x] = [(x1 , x2 )], the equivalence class [(x2 , x1 )],
denoted by −[x], satisfies [x] ⊕ (−[x]) = [0]. Now, use the cancellation property in Z to show
that the additive inverse is unique. So, the equivalence class −[x] is called the additive
inverse of [x].
(f ) [Distributive law] ([x] ⊕ [y]) ⊙ [z] = ([x] ⊙ [z]) ⊕ ([y] ⊙ [z]).
(g) [Associativity of multiplication] ([x] ⊙ [y]) ⊙ [z] = [x] ⊙ ([y] ⊙ [z]).
(h) [Commutativity of multiplication] [x] ⊙ [y] = [y] ⊙ [x].
(i) [Existence of the identity element] [x] ⊙ [1] = [x], where [1] = [(S(1), 1)].
(j) [Cancellation property] If [x] ⊙ [y] = [x] ⊙ [z] with [x] ≠ [0] then [y] = [z].
(k) [x] ⊙ [0] = [0].

As a last property, we show that a copy of N naturally sits inside Z.

Lemma 2.6.5. Define f : N → Z by f (n) = [(S(n), 1)] for all n ∈ N. Then the following are true:
1. f is one-one.
2. For all a, b ∈ N, f (a + b) = f (a) ⊕ f (b).
3. For all a, b ∈ N, f (a · b) = f (a) ⊙ f (b).

Proof. 1. Suppose f (a) = f (b) for some a, b ∈ N. By definition, [(S(a), 1)] = [(S(b), 1)] or equivalently,
S(a) + 1 = S(b) + 1. By the cancellation law in N, we get S(a) = S(b). Since S is one-one, we have
a = b.

2. Let a, b ∈ N. By definition, f (a + b) = [(S(a + b), 1)]. So

f (a) ⊕ f (b) = [(S(a), 1)] ⊕ [(S(b), 1)] = [(S(a) + S(b), 1 + 1)] = [(S(a) + b + 1, 1 + 1)]
= [(S(a + b) + 1, 1 + 1)] = [(S(a + b), 1)] = f (a + b).

3. Let a, b ∈ N. Now, f (a · b) = [(S(a · b), 1)]. So

f (a) ⊙ f (b) = [(S(a), 1)] ⊙ [(S(b), 1)] = [(S(a) · S(b) + 1 · 1, S(a) · 1 + 1 · S(b))]
= [(S(a) · S(b) + 1, S(a) + S(b))] = [(S(a · b), 1)] = f (a · b)

as S(a) · S(b) + 1 + 1 = S(a) · b + S(a) · 1 + 1 + 1 = a · b + 1 · b + S(a) + 1 + 1 = S(a · b) + S(b) + S(a).
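
The three parts of Lemma 2.6.5 can be checked numerically on representatives. The sketch below is an illustration only: the names equivalent, add, mul and embed are our own, pairs of Python integers stand in for representatives of classes, and the tested range is arbitrary.

```python
def equivalent(p, q) -> bool:
    """(a, b) ~ (c, d) if a + d = b + c."""
    (a, b), (c, d) = p, q
    return a + d == b + c


def add(p, q):
    """Representative of the sum of two classes, Equation (2.7)."""
    return (p[0] + q[0], p[1] + q[1])


def mul(p, q):
    """Representative of the product of two classes, Equation (2.8)."""
    (x1, x2), (y1, y2) = p, q
    return (x1 * y1 + x2 * y2, x1 * y2 + x2 * y1)


def embed(n: int):
    """The map f(n) = [(S(n), 1)], returned as the representative (n + 1, 1)."""
    return (n + 1, 1)


R = range(1, 25)
one_one = all(equivalent(embed(a), embed(b)) == (a == b) for a in R for b in R)
additive = all(equivalent(embed(a + b), add(embed(a), embed(b))) for a in R for b in R)
multiplicative = all(equivalent(embed(a * b), mul(embed(a), embed(b))) for a in R for b in R)
print(one_one, additive, multiplicative)  # prints True True True
```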

We have shown that f (N) ⊆ Z. Further, the map f commutes with the addition operation and the
multiplication operation. Thus, we identify f (N) inside Z as a copy of N. From now on, the symbols
+ and · will be used for addition and multiplication in integers. Further, as n ∈ N is identified with
f (n) = [(S(n), 1)], we would like to associate the symbol ‘−’ as n = S(n) − 1 and −n = 1 − S(n). We
proceed to do this in the next few paragraphs.

Definition 2.6.6. Let [x] = [(x1 , x2 )], [y] = [(y1 , y2 )] ∈ Z, for some x1 , x2 , y1 , y2 ∈ N. Then, the
order in Z is defined by saying that [x] < [y] if x1 + y2 < y1 + x2 . Further, [x] ≤ [y] if either [x] = [y]
or [x] < [y].

We again need to check for well-definedness. So, let [(u1 , u2 )] = [(v1 , v2 )] and [(x1 , x2 )] = [(y1 , y2 )]
be two equivalence classes in Z with [(u1 , u2 )] < [(x1 , x2 )]. We need to show that [(v1 , v2 )] < [(y1 , y2 )],
or equivalently, v1 + y2 < y1 + v2 . As [(u1 , u2 )] = [(v1 , v2 )] and [(x1 , x2 )] = [(y1 , y2 )], one has u1 + v2 =
v1 + u2 and x1 + y2 = y1 + x2 . Thus, u1 + v2 + y1 + x2 = v1 + u2 + x1 + y2 . Hence,

v1 + y2 + x1 + u2 = v1 + u2 + x1 + y2 = u1 + v2 + y1 + x2 = y1 + v2 + u1 + x2
< y1 + v2 + x1 + u2 ,

as u1 + x2 < x1 + u2 . Therefore, by the order property in N (see Exercise 2.4.3), v1 + y2 < y1 + v2 .
Thus, the order in Z is well-defined. At this stage, one would like to verify that the function f
defined in Lemma 2.6.5 preserves the order as well.

Lemma 2.6.7. Consider the map f : N → Z defined by f (n) = [(S(n), 1)] for all n ∈ N. Then, for
all a, b ∈ N, a < b if and only if f (a) < f (b).

Proof. Using Exercise 2.4.3, a < b if and only if a + 1 + 1 < b + 1 + 1, or equivalently, a < b if and
only if S(a) + 1 < S(b) + 1. Thus, a < b if and only if f (a) = [(S(a), 1)] < [(S(b), 1)] = f (b).

Definition 2.6.8. Let [x] = [(x1 , x2 )] ∈ Z. Then, [x] is said to be positive if [0] < [x] and is said to
be non-negative if [0] ≤ [x]. In general, we write [x] > [0] to mean [x] is positive and [x] ≥ [0] for
[x] being non-negative.

Lemma 2.6.9. Let [x] = [(x1 , x2 )] ∈ Z. Then, [x] > [0] if and only if x1 > x2 .

Proof. By definition, [(x1 , x2 )] > [0] = [(1, 1)] if and only if x1 + 1 > x2 + 1. Or equivalently, using
Exercise 2.4.3, one obtains [(x1 , x2 )] > [(1, 1)] if and only if x1 > x2 .

Exercise 2.6.10. 1. Prove the following results for any [x] ∈ Z.


(a) [x] > 0 if and only if [x] = [(S(n), 1)] = f (n) for some n ∈ N.
(b) [x] > 0 if and only if −[x] < 0.

2. [y] > [z], for some [y], [z] ∈ Z if and only if [y] + [x] > [z] + [x].
3. If [y] > [z], for some [y], [z] ∈ Z then [y] · [x] > [z] · [x], whenever [x] > 0.

Thus, Z = N ∪ {0} ∪ (−N) and hence from now on, in place of using equivalence classes to represent
the elements of Z, we will just use natural numbers, their negatives and the zero element to represent
Z, the set of integers. Thus, whenever we define functions or operations on Z then we need not
worry about well-definedness. Let us now discuss the “absolute value function”, namely the modulus
function.

Definition 2.6.11. A function g : Z → N ∪ {0} is called an absolute/modulus function if


1. g(n) = n if n ≥ 0,
2. g(n) = −n, if n < 0.

This function is denoted by | · |. Thus, |m| = m, if m ≥ 0 and −m, if m < 0. Further, by Exer-
cise 2.6.10.1, observe that |m| ≥ 0 for all m ∈ Z.

For a better understanding of this function, we prove the following two results.

Lemma 2.6.12. For any x ∈ Z, −|x| ≤ x ≤ |x|. Further, if x ≥ 0 and −x ≤ y ≤ x for some y ∈ Z,
then |y| ≤ x.

Proof. Let x ≥ 0. Then, by definition |x| = x and hence x ≤ |x|. As |x| = x, the other inequality
−|x| ≤ x reduces to −x ≤ x. Or equivalently, we need to show that 0 = x + (−x) ≤ x + x = 2x, which
is indeed true. If x < 0 then we see that |x| > 0 > x and hence x ≤ |x|. Note that the condition
−|x| ≤ x is equivalent to the condition |x| + x ≥ 0 (use Exercise 2.6.10.2) which is indeed true as by
definition x + |x| = x + (−x) = 0.
For the second part, we again consider two cases, namely, y ≥ 0 and y < 0. If y ≥ 0 then |y| = y
and hence the condition y ≤ x implies |y| ≤ x. If y < 0 then |y| = −y. Further, using Exercise 2.6.10.2,
the condition −x ≤ y is equivalent to the condition 0 ≤ y + x which in turn is equivalent to −y ≤ x.
Hence |y| = −y ≤ x. Thus, the required result follows.

As a direct application of Lemma 2.6.12, one obtains the triangle inequality.

Lemma 2.6.13. [Triangle inequality in Z] Let x, y ∈ Z. Then |x + y| ≤ |x| + |y|.

Proof. Using Lemma 2.6.12, one has −|x| ≤ x ≤ |x| and −|y| ≤ y ≤ |y|. Hence,

−|x| + (−|y|) ≤ x + y ≤ |x| + |y|.

Now, use the associativity and commutativity of addition to get

0 = −|x| + (−|y|) + |x| + |y| = −(|x| + |y|) + (|x| + |y|)

and hence the uniqueness of the additive inverse implies −|x| + (−|y|) = −(|x| + |y|). Thus, the
required result follows from the second part of Lemma 2.6.12.

This finishes most of the results on the basic operations related to integers. As a last note, we
make the following remark.

Remark 2.6.14. Even though the well ordering principle and its extension (Exercise 2.4.8) are stated
for subsets of N, they can be generalized to W, the set of whole numbers. Furthermore, if we fix an
integer z ∈ Z and take S = {z, z + 1, z + 2, . . .} then it can also be shown that every nonempty subset
X of S contains its least element. Or equivalently, every nonempty subset X of Z which is bounded
below contains its least element.

2.7 Construction of Rational Numbers


We will describe the construction of rational numbers in brief, along with a few properties of
addition, multiplication, subtraction and division by nonzero elements.
We write Z∗ := Z \ {0}, define an equivalence relation on X = Z × Z∗ , and then do everything
afresh as was done for the set of integers. Define a relation ‘∼’ on X by

(a, b) ∼ (c, d) if a · d = b · c for all a, c ∈ Z, b, d ∈ Z∗ .

Then, verify that ∼ is indeed an equivalence relation on X. Let Q denote the collection of all
equivalence classes under this relation. This set is called the “set of rational numbers”. In this
set, we define addition and multiplication, using the addition and multiplication in Z, as follows:
1. Let [x] = [(x1 , x2 )], [y] = [(y1 , y2 )] ∈ Q. Then, addition in Q, denoted as ⊕, is defined by

[x] ⊕ [y] = [(x1 , x2 )] ⊕ [(y1 , y2 )] = [(x1 · y2 + x2 · y1 , x2 · y2 )].

2. Let [x] = [(x1 , x2 )], [y] = [(y1 , y2 )] ∈ Q. Then, multiplication in Q, denoted as ⊙, is defined by

[x] ⊙ [y] = [(x1 , x2 )] ⊙ [(y1 , y2 )] = [(x1 · y1 , x2 · y2 )].

The readers are advised to verify that the above operations in Q are well-defined. Further, the
map f : Z → Q defined by f (a) = [(a, 1)], is one-one and it preserves addition and multiplication.
Thus, Z sits inside Q as f (Z). As earlier, we replace the symbols ‘⊕’ and ‘⊙’ by ‘+’ and ‘·’.
Sometimes, x · y is simply written as xy. Note that the element 0 ∈ Z corresponds to [(0, 1)] = [(0, x)]
for all x ∈ Z∗ . Hence, [(x1 , x2 )] ≠ 0 implies that x1 ≠ 0. Verify that
for each [(x1 , x2 )] ∈ Q with x1 ≠ 0, the element [(x2 , x1 )] ∈ Q satisfies [(x1 , x2 )] · [(x2 , x1 )] = 1. As the
next operation, one defines division in Q as follows.

Definition 2.7.1. Let [x] = [(x1 , x2 )], [y] = [(y1 , y2 )] ∈ Q with y1 ≠ 0. Then, the division in Q,
denoted as /, is defined by

[x]/[y] = [(x1 , x2 )]/[(y1 , y2 )] = [(x1 y2 , x2 y1 )].

Note that x2 y1 ∈ Z∗ as x2 , y1 ≠ 0.
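
The pair formulas for addition, multiplication and division in Q can be compared with exact rational arithmetic. The sketch below uses Python's standard fractions.Fraction as the reference. The names q_add, q_mul, q_div and as_fraction are our own, and pairs (x1, x2) with x2 ≠ 0 stand in for representatives of classes.

```python
from fractions import Fraction


def q_add(x, y):
    """Representative of [x] + [y]: (x1*y2 + x2*y1, x2*y2)."""
    (x1, x2), (y1, y2) = x, y
    return (x1 * y2 + x2 * y1, x2 * y2)


def q_mul(x, y):
    """Representative of [x] . [y]: (x1*y1, x2*y2)."""
    (x1, x2), (y1, y2) = x, y
    return (x1 * y1, x2 * y2)


def q_div(x, y):
    """Representative of [x]/[y] as in Definition 2.7.1; requires y1 != 0."""
    (x1, x2), (y1, y2) = x, y
    if y1 == 0:
        raise ZeroDivisionError("division by the zero class")
    return (x1 * y2, x2 * y1)


def as_fraction(p) -> Fraction:
    """Interpret the pair (p1, p2) as the rational number p1/p2."""
    return Fraction(p[0], p[1])


x, y = (1, -2), (3, 4)
print(as_fraction(q_add(x, y)), as_fraction(q_mul(x, y)), as_fraction(q_div(x, y)))
# prints 1/4 -3/8 -2/3
```

Since Fraction reduces every pair to lowest terms with a positive denominator, equal values of as_fraction correspond to equivalent pairs.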

The readers are advised to verify that division is well-defined. Before proceeding further with
other important properties of rational numbers, the readers should verify all the properties related
with addition, subtraction, multiplication, and division by a nonzero element. The next result helps
in defining order in Q.

Lemma 2.7.2. [Representation of an Element of Q] Let [x] ∈ Q. Then [x] = [(y1 , y2 )] for some
y1 , y2 ∈ Z, y2 > 0.

Proof. Let [x] = [(x1 , x2 )] for some x1 ∈ Z and x2 ∈ Z∗ . If x2 > 0, we are done. Else, x2 < 0 and, using
Exercise 2.6.10.1, we know that −x2 > 0. Then, since x1 · (−x2 ) = x2 · (−x1 ), the definition of the
equivalence relation gives [x] = [(x1 , x2 )] = [(−x1 , −x2 )]. Hence the required result follows.

Definition 2.7.3. Let [x] = [(x1 , x2 )], [y] = [(y1 , y2 )] ∈ Q for some x1 , x2 , y1 , y2 ∈ Z with x2 , y2 > 0.
Then the order in Q is defined by [x] > [y] if x1 y2 > x2 y1 .

One should verify that the order in Q is indeed well-defined. Notice that as earlier, [x] ≥ [y] means
either [x] = [y] or [x] > [y]. Further, it may be seen that Q is an ordered field, that is, the following
are satisfied for all a, b, c ∈ Q:
1. a + b = b + a.
2. (a + b) + c = a + (b + c).
3. a + 0 = a.
4. There exists an element, written as −a such that a + (−a) = 0.
5. a · b = b · a.
6. (a · b) · c = a · (b · c).
7. a · 1 = a.
8. If a ≠ 0, there exists an element, written as 1/a ∈ Q, such that a · (1/a) = 1.
9. a · (b + c) = (a · b) + (a · c).
10. Exactly one of the conditions a < b or a = b or b < a is true.
11. If a < b and b < c, then a < c.
12. If a < b, then a + c < b + c.
13. If a < b and 0 < c, then a · c < b · c.

As a final result of this section, we prove the following result.

Lemma 2.7.4. [Existence of a Rational between two Rationals] Let [x], [y] ∈ Q with [x] < [y].
Then there exists [z] ∈ Q such that [x] < [z] < [y].

Proof. Let [x] = [(x1 , x2 )] and [y] = [(y1 , y2 )], for some x1 , x2 , y1 , y2 ∈ Z with x2 , y2 > 0. Since
[x] < [y], we have x1 y2 < x2 y1 , and hence 2x1 y2 < x1 y2 + x2 y1 < 2x2 y1 . Further, 2x2 y2 > 0 and hence
let us take [z] = [(x1 y2 + x2 y1 , 2x2 y2 )]. Using x2 , y2 > 0 and the multiplicative cancellation
(Exercise 2.1.4.8) in Z, it can be verified that [x] < [z] < [y].
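
The element [z] chosen in the proof is the average of [x] and [y]. The following sketch computes it on representatives and tests the order of Definition 2.7.3 on a small sample. The names q_less and between are our own, and both denominators are assumed positive, as in the proof.

```python
def q_less(x, y) -> bool:
    """[x] < [y] as in Definition 2.7.3; both denominators must be positive."""
    (x1, x2), (y1, y2) = x, y
    if x2 <= 0 or y2 <= 0:
        raise ValueError("representatives must have positive denominators")
    return x1 * y2 < x2 * y1


def between(x, y):
    """The representative (x1*y2 + x2*y1, 2*x2*y2) used in the proof of Lemma 2.7.4."""
    (x1, x2), (y1, y2) = x, y
    return (x1 * y2 + x2 * y1, 2 * x2 * y2)


x, y = (1, 3), (1, 2)
z = between(x, y)
print(z, q_less(x, z), q_less(z, y))  # prints (5, 12) True True
```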
Chapter 3

Countable and Uncountable Sets

In this chapter, we discuss the size of sets. Intuitively, the number of elements in a set may be
considered as its size. For instance, the set {1} has size 1 and the set {a, b} has size 2. We will be
concerned about size of sets of various kinds.

3.1 Finite and infinite sets


We first show that the intuitive notion of ‘number of elements in a set’ is a well defined notion, at
least for finite sets. Since the set {1, 2, . . . , m} will be used often, we give a notation for this set.

Notation: [m] = {1, 2, . . . , m} for m ∈ N.


We hope that this notation will not conflict with the notation of an equivalence class induced by
an equivalence relation; the context will clarify which one is used.

Lemma 3.1.1. Let n ∈ N. There exists no one-one function from [n] to any of its proper subsets.

Proof. We use PMI to prove this result. For each n ∈ N, let P (n) be the statement that there exists
no one-one function from [n] to any of its proper subsets.
The statement P (1) holds as there exists no one-one function from [1] to ∅. Assume the induction
hypothesis that for an m ∈ N, P (m) holds. We show that P (m + 1) holds.
On the contrary, suppose there exists one-one function f : [m + 1] → A, where A is a proper subset
of [m + 1]. We consider two cases depending on whether m + 1 ∈ rng f or not.
Case 1: m + 1 ∈ rng f.
(a) If f (m + 1) = m + 1, then the restriction f |[m] is a one-one function from [m] to A \ {m + 1},
which is a proper subset of [m]. This contradicts the induction hypothesis.
(b) If f (m + 1) ≠ m + 1, then there exist k, ℓ ∈ [m] such that f (k) = m + 1 and f (m + 1) = ℓ. Define
the function g : [m] → A \ {m + 1} by

g(k) = ℓ, g(x) = f (x) for x ≠ k.

Observe that g is one-one and A \ {m + 1} is a proper subset of [m]. This contradicts the induction
hypothesis.
Case 2: m + 1 ∉ rng f .
In this case, f (m + 1) ∈ [m]. Then the restriction f |[m] is a one-one function from [m] to
A \ {f (m + 1)}, which is a proper subset of [m]. Again, it contradicts the induction hypothesis.

Hence, there exists no one-one function from [m + 1] to any of its proper subsets so that P (m + 1)
holds.

As an application of Lemma 3.1.1, we prove the following result.

Lemma 3.1.2. Let m, n ∈ N. Then the following are true:


1. [Injection] There exists a one-one function from [m] to [n] if and only if m ≤ n.
2. [Bijection] There exists a bijection from [m] to [n] if and only if m = n.

Proof. (1) Suppose m ≤ n. Then the function Id : [m] → [n] given by Id(x) = x is a one-one function.
Conversely, let f : [m] → [n] be a one-one function. If m > n, then [n] is a proper subset of [m]. Now,
f is one-one function from [m] to a proper subset of [m] contradicting Lemma 3.1.1. Hence m ≤ n.
(2) Assume that m = n. Then the identity function on [n], given by Id(x) = x is a bijection.
Conversely, suppose that g : [m] → [n] is a bijection. Then both g and g −1 : [n] → [m] are one-one
functions. By (1), m ≤ n and n ≤ m. Therefore, m = n.
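
For small values of m and n, Lemma 3.1.2 can be checked by exhaustive search. The following is a minimal sketch, not part of the proof; the function names `injection_exists` and `bijection_exists` are our own, and a function [m] → [n] is represented as a tuple of m values.

```python
from itertools import product

def injection_exists(m: int, n: int) -> bool:
    """Return True if some one-one function [m] -> [n] exists (brute force)."""
    codomain = range(1, n + 1)
    # A tuple of m values is one-one exactly when all its entries differ.
    return any(len(set(values)) == m for values in product(codomain, repeat=m))

def bijection_exists(m: int, n: int) -> bool:
    """Return True if some bijection [m] -> [n] exists (brute force)."""
    codomain = set(range(1, n + 1))
    return any(
        len(set(values)) == m and set(values) == codomain
        for values in product(sorted(codomain), repeat=m)
    )

# Compare the search with the statement of the lemma for 1 <= m, n <= 4.
for m in range(1, 5):
    for n in range(1, 5):
        assert injection_exists(m, n) == (m <= n)
        assert bijection_exists(m, n) == (m == n)
```

The search space has n^m elements, so this is only meant as a sanity check for very small m and n.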

Recall that two sets are said to be equinumerous if there is a bijection between them, and that the
composition of two bijections is a bijection. Thus, if m, n ∈ N, m ≠ n and A is a set equinumerous
with {1, 2, . . . , m}, then A cannot be equinumerous with {1, 2, . . . , n}, i.e., such a set A has a definite
number of elements. This idea provides a mathematical justification of the fact that if two persons
count all English words on this page correctly, then they will arrive at the same number.
Taking cue from the above results, we define the notions of finite sets, infinite sets, and the number
of elements, or the cardinality of a finite set as follows.

Definition 3.1.3. 1. A set X is called finite if either X = ∅ or there exists a bijection from X
to [m] for some m ∈ N; this number m is called the cardinality of X and is denoted by |X|.
We write |∅| = 0.
2. A set which is not finite is called an infinite set.

For instance, [m] is a finite set for any m ∈ N. Moreover, |[m]| = m. For any m ∈ N, if a1 , . . . , am
are distinct objects, then A := {a1 , . . . , am } is a finite set since f : A → [m] defined by f (aj ) = j is a
bijection; and, |A| = m.
If N is a finite set, then there is a bijection f : N → [n] for some n ∈ N. In that case, the restriction
function f[n+1] : [n + 1] → [n] is one-one. It contradicts Lemma 3.1.1. Therefore, N is an infinite set.
We give some characterization of finite and infinite sets, where the requirements are seemingly
weaker than those mentioned in their definitions.

Theorem 3.1.4. 1. A nonempty set X is finite if and only if there exists a one-one function
f : X → [m] for some m ∈ N.
2. A set X is infinite if and only if there exists a one-one function f : N → X.
3. A set X is infinite if and only if there exists a bijection from X to one of its proper subsets.
4. A set X is infinite if and only if there exists a one-one function from X to one of its proper
subsets.

Proof. (1) Let X be a nonempty set. If X is finite, then there is a bijection f : X → [n] for some
n ∈ N. Now, f itself is a one-one function.
Conversely, let g : X → [m] be a one-one function for some m ∈ N. We show by PMI on m that X
is finite. For m = 1, if g : X → {1} is one-one, then g is onto, and hence a bijection. So, by definition,
X is finite. Assume that the statement is true for m = k and let g : X → [k + 1] be a one-one function.
If g is onto, then g is a bijection, i.e., X is equinumerous with [k + 1] and
hence by definition, X is finite. So, assume that g is not onto.
If k + 1 ∉ rng g, then g : X → [k] is one-one, and the induction hypothesis implies that X is finite.
Otherwise, there exist x₀ ∈ X and ℓ ≤ k such that g(x₀) = k + 1 and ℓ ∉ rng g. Define h : X → [k] by

\[
h(t) = \begin{cases} g(t), & \text{if } t \neq x_0, \\ \ell, & \text{if } t = x_0. \end{cases}
\]

Then h : X → [k] is one-one. By the induction hypothesis, X is finite.


(2) Let X be an infinite set. Since X ≠ ∅, there exists at least one element, say, a₁ ∈ X. We
show by induction that for each n ≥ 2, there exists aₙ ∈ X different from a₁, . . . , aₙ₋₁. Now that
a₁ has been chosen, consider the set X \ {a₁}. If this set is empty, then X = {a₁}, which is a finite
set. As X is infinite, X \ {a₁} is nonempty. So, let a₂ ∈ X \ {a₁}. This proves the basis case.
So, suppose a₁, . . . , aₘ ∈ X have been chosen corresponding to the numbers 1, 2, . . . , m. The set
X \ {a₁, a₂, . . . , aₘ} is nonempty, since otherwise X = {a₁, a₂, . . . , aₘ} would be a finite set. So, let
aₘ₊₁ ∈ X \ {a₁, a₂, . . . , aₘ}. This proves the induction step.
Hence, corresponding to 1, there exists a₁ ∈ X, and for each n ≥ 2, there exists aₙ ∈ X different
from all of a₁, a₂, . . . , aₙ₋₁. Define the function f : N → X by f (n) = aₙ. Then f is a one-one
function. (Notice that for different choices of the aₙ's, we get different functions f .)
Conversely, let f : N → X be one-one. If X is finite, then there exists a one-one function
g : X → [m] for some m ∈ N. Then g ◦ f : N → [m] is one-one. The restriction of g ◦ f to [m + 1] is a
one-one function from [m + 1] to [m]. It contradicts Lemma 3.1.1. Therefore, X is infinite.

(3) Let X be an infinite set. By (2), there is a one-one function f : N → X. Now define the function
g : X → X \ {f (1)} by

\[
g(x) = \begin{cases} x, & \text{if } x \notin \operatorname{rng} f, \\ f(k+1), & \text{if } x = f(k) \text{ for some } k \in \mathbb{N}. \end{cases}
\]

Then g is a bijection. So, we have a bijection from X to one of its proper subsets.
Conversely, let g : X → Y be a bijection, where Y is a proper subset of X. On the contrary, assume
that X is a finite set. Then, there is a bijection f : X → [m] for some m ∈ N. Since Y is a proper
subset of X, f (Y ) is a proper subset of f (X). As f (X) = [m], the function f ◦ g ◦ f⁻¹ : [m] → f (Y )
is a bijection from [m] to a proper subset of [m]. This contradicts Lemma 3.1.1.
(4) Let X be an infinite set. By (3) there exists a bijection from X to one of its proper subsets. This
bijection is itself a one-one function from X to that subset. Conversely, suppose that h : X → Y is
one-one, where Y is a proper subset of X. Let Z = rng h. We see that Z is also a proper subset of X
and h : X → Z is a bijection. By (3), X is infinite.

Observe that Parts 3 and 4 of Theorem 3.1.4 imply that a set X is finite if and only if there is no bijection from
X to any of its proper subsets, if and only if, there is no one-one function from X to any of its proper
subsets.

Exercise 3.1.5.
1. A subset of a finite set is finite.
2. If X and Y are disjoint sets with |X| = m and |Y | = n, then |X ∪ Y | = m + n. In particular,
if X and Y are disjoint finite sets, then X ∪ Y is finite.
3. Let X and Y be finite sets. Then X ∪ Y is finite.
4. Let X be a nonempty set with |X| = n. For any x ∈ X, |X \ {x}| = n − 1.


5. A superset of an infinite set is infinite.
6. Let X be an infinite set and let Y be a finite set. Then X \ Y is an infinite set.
7. Let X and Y be nonempty finite sets. Then |X| ≤ |Y | if and only if there exists a one-one
function f : X → Y.
8. Let X and Y be nonempty finite sets. Then |X| = |Y | if and only if there is a bijection from
X to Y .
9. Let X be a finite nonempty set and let α be a fixed symbol. Let Y = {(a, α) : a ∈ X}. Then
|X| = |Y | .
10. Let X be a nonempty finite set. Then, for any set Y , |X| = |X \ Y | + |X ∩ Y | .
11. Let X and Y be two finite sets. Then |X ∪ Y | = |X| + |Y | − |X ∩ Y | .
12. Let A and B be finite sets. Show that A × B is a finite set, and |A × B| = |A| × |B| .
13. Let f : A → B be a function, where both A and B are finite sets. If rng f = {b₁, . . . , bₙ}, then
show that $|A| = \sum_{j=1}^{n} |f^{-1}(b_j)|$. In particular, if $|f^{-1}(b_j)| = k$ for j = 1, 2, . . . , n, then |A| = nk.

3.2 Families of sets


In this section, we extend the notation of operations on sets to sets of sets.

Definition 3.2.1. Let I be a set. For each α ∈ I, take a set $A_\alpha$. The set

\[
\{A_\alpha\}_{\alpha \in I} := \{A_\alpha : \alpha \in I\}
\]

is called a family of sets indexed by elements of I. In this case, the set I is called an index set. The
family of sets $\{A_\alpha : \alpha \in I\}$ is called a nonempty family when the index set I is nonempty.
Let $\{Y_\alpha\}_{\alpha \in I}$ be a nonempty family of sets. We define the union and intersection of the sets in the
family as follows:
1. union: $\bigcup_{\alpha \in I} Y_\alpha = \{y : y \in Y_\alpha \text{ for some } \alpha \in I\}$;
2. intersection: $\bigcap_{\alpha \in I} Y_\alpha = \{y : y \in Y_\alpha \text{ for all } \alpha \in I\}$.

[Convention] The union of sets in an empty family is ∅. The intersection of sets in an empty family
of subsets of a set S is S. (Footnote: Consider the family $\{A_\alpha\}_{\alpha \in I}$, where each $A_\alpha$ is a subset of a set S. Let x ∈ S. If $x \notin \bigcap_{\alpha \in I} A_\alpha$, then there exists an
α ∈ I such that $x \notin A_\alpha$. However, such an α does not exist since I is empty. Therefore, each such x belongs to $\bigcap_{\alpha \in I} A_\alpha$.)
Unless otherwise mentioned, we assume that the index set for a family of sets is nonempty so that
the family is a nonempty family.

Example 3.2.2.
1. Take A = {1, 2, 3}, B₁ = {1, 2}, B₂ = {2, 3} and B₃ = {4, 5}. Then the family

$\{B_\alpha : \alpha \in A\} = \{B_1, B_2, B_3\} = \{\{1, 2\}, \{2, 3\}, \{4, 5\}\}$.

Thus, $\bigcup_{\alpha \in A} B_\alpha = \{1, 2, 3, 4, 5\}$ and $\bigcap_{\alpha \in A} B_\alpha = \emptyset$.

2. Take A = N and Bn = {n, n + 1, . . .}. Then the family

{Bα : α ∈ A} = {B1 , B2 , . . .} = {{1, 2, . . .}, {2, 3, . . .}, . . .} .

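Thus, $\bigcup_{\alpha \in A} B_\alpha = \mathbb{N}$ and $\bigcap_{\alpha \in A} B_\alpha = \emptyset$.
3. Verify that $\bigcap_{n \in \mathbb{N}} \left[-\tfrac{1}{n}, \tfrac{2}{n}\right] = \{0\}$.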

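Proposition 3.2.3. Let $\{A_\alpha\}_{\alpha \in I}$ be a nonempty family of subsets of X and let B be any set. For
any subset Y of X, write $Y^c = X \setminus Y$. Then
1. $B \cup \big(\bigcap_{\alpha \in I} A_\alpha\big) = \bigcap_{\alpha \in I} (B \cup A_\alpha)$,
2. $B \cap \big(\bigcup_{\alpha \in I} A_\alpha\big) = \bigcup_{\alpha \in I} (B \cap A_\alpha)$,
3. $\big(\bigcup_{\alpha \in I} A_\alpha\big)^c = \bigcap_{\alpha \in I} A_\alpha^c$, and
4. $\big(\bigcap_{\alpha \in I} A_\alpha\big)^c = \bigcup_{\alpha \in I} A_\alpha^c$.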
 
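Proof. (1) Let $x \in B \cup \big(\bigcap_{\alpha \in I} A_\alpha\big)$. Then x ∈ B or $x \in \bigcap_{\alpha \in I} A_\alpha$. If x ∈ B, then $x \in B \cup A_\alpha$ for each
α ∈ I. So, $x \in \bigcap_{\alpha \in I} (B \cup A_\alpha)$. If $x \in \bigcap_{\alpha \in I} A_\alpha$, then for each α ∈ I, $x \in A_\alpha$ so that $x \in B \cup A_\alpha$. Then
$x \in \bigcap_{\alpha \in I} (B \cup A_\alpha)$. In any case, $x \in \bigcap_{\alpha \in I} (B \cup A_\alpha)$.
Conversely, suppose $x \in \bigcap_{\alpha \in I} (B \cup A_\alpha)$. Then for each α ∈ I, $x \in B \cup A_\alpha$. If x ∈ B, then
$x \in B \cup \big(\bigcap_{\alpha \in I} A_\alpha\big)$. If x ∉ B but $x \in B \cup A_\alpha$ for each α ∈ I, then $x \in A_\alpha$ for each α ∈ I, so that
$x \in \bigcap_{\alpha \in I} A_\alpha$. Then $x \in B \cup \big(\bigcap_{\alpha \in I} A_\alpha\big)$.

(3) Notice that both the sets are subsets of X. So, let x ∈ X. Now,

\[
x \in \Big(\bigcup_{\alpha \in I} A_\alpha\Big)^c \iff x \notin \bigcup_{\alpha \in I} A_\alpha \iff \text{for each } \alpha \in I,\ x \notin A_\alpha \iff \text{for each } \alpha \in I,\ x \in A_\alpha^c \iff x \in \bigcap_{\alpha \in I} A_\alpha^c.
\]

Proofs of (2) and (4) are similar to those of (1) and (3), respectively.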
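
The four identities of Proposition 3.2.3 can be tested on a small finite family. The sketch below is only an illustration; the ambient set `X`, the indexed family `family`, and the set `B` are hypothetical examples of our own choosing.

```python
# Ambient set X, an indexed family {A_alpha} of subsets of X, and a set B.
X = set(range(10))
family = {1: {1, 2, 3}, 2: {2, 3, 4}, 3: {3, 4, 5}}
B = {0, 3, 9}

def union(sets):
    """Union of a (possibly empty) collection of sets."""
    return set().union(*sets)

def intersection(sets):
    """Intersection of a nonempty collection of sets."""
    sets = list(sets)
    return set.intersection(*sets)

def complement(Y):
    """Complement of Y relative to the ambient set X."""
    return X - Y

A = list(family.values())
assert B | intersection(A) == intersection(B | a for a in A)             # Part 1
assert B & union(A) == union(B & a for a in A)                           # Part 2
assert complement(union(A)) == intersection(complement(a) for a in A)    # Part 3
assert complement(intersection(A)) == union(complement(a) for a in A)    # Part 4
```

Note that `intersection` needs a nonempty family, in line with the standing assumption that the index set is nonempty.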

Practice 3.2.4.

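1. Consider $\{A_x\}_{x \in \mathbb{R}}$, where $A_x = [x, x + 1]$. What are $\bigcup_{x \in \mathbb{R}} A_x$ and $\bigcap_{x \in \mathbb{R}} A_x$?
2. For x ∈ [0, 1] write $\mathbb{Z}x := \{zx : z \in \mathbb{Z}\}$ and $A_x = \mathbb{R} \setminus \mathbb{Z}x$. What are $\bigcup_{x \in [0, 1]} A_x$ and $\bigcap_{x \in [0, 1]} A_x$?
3. Write the closed interval [1, 2] as $\bigcap_{n \in \mathbb{N}} I_n$ for suitable open intervals $I_n$.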

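Proposition 3.2.5. Let X and Y be nonempty sets and let f be a relation from X to Y . Let $\{A_\alpha\}_{\alpha \in I}$
be a family of subsets of X. Then

\[
f\Big(\bigcup_{\alpha \in I} A_\alpha\Big) = \bigcup_{\alpha \in I} f(A_\alpha) \quad \text{and} \quad f\Big(\bigcap_{\alpha \in I} A_\alpha\Big) \subseteq \bigcap_{\alpha \in I} f(A_\alpha).
\]

Proof. For the equality,

\begin{align*}
y \in f\Big(\bigcup_{\alpha \in I} A_\alpha\Big)
&\iff (x, y) \in f \text{ for some } x \in \bigcup_{\alpha \in I} A_\alpha \\
&\iff (x, y) \in f \text{ where } x \in A_\alpha \text{ for some } \alpha \in I \\
&\iff y \in f(A_\alpha) \text{ for some } \alpha \in I \\
&\iff y \in \bigcup_{\alpha \in I} f(A_\alpha).
\end{align*}

For the containment, the case $\bigcap_{\alpha \in I} A_\alpha = \emptyset$ is obvious. So, assume that $\bigcap_{\alpha \in I} A_\alpha \neq \emptyset$. Then

\begin{align*}
y \in f\Big(\bigcap_{\alpha \in I} A_\alpha\Big)
&\iff (x, y) \in f \text{ for some } x \in \bigcap_{\alpha \in I} A_\alpha \\
&\iff (x, y) \in f \text{ with } x \in A_\alpha \text{ for all } \alpha \in I \\
&\implies y \in f(A_\alpha) \text{ for all } \alpha \in I \\
&\iff y \in \bigcap_{\alpha \in I} f(A_\alpha).
\end{align*}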

Remark 3.2.6. Observe that in the proof of the containment in Proposition 3.2.5, if y ∈ f (Aα ) for
each α ∈ I, then for each α ∈ I, we can find some xα ∈ Aα such that (xα , y) ∈ f . However, such an xα
need not be the same for each α. Thus the containment need not be an equality. To see that it is indeed
the case, consider the relation f from {1, 2, 3, 4} to {a, b} given by f = {(1, a), (2, a), (2, b), (3, b), (4, b)}.
Take A₁ = {1, 3} and A₂ = {1, 2, 4} and verify that f (A₁ ∩ A₂) ≠ f (A₁) ∩ f (A₂).
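
The verification asked for in Remark 3.2.6 can be carried out mechanically. In the sketch below, a relation is stored as a set of ordered pairs, and the helper name `image` is our own.

```python
def image(relation, subset):
    """Image of `subset` under `relation`, a set of ordered pairs (x, y)."""
    return {y for (x, y) in relation if x in subset}

f = {(1, "a"), (2, "a"), (2, "b"), (3, "b"), (4, "b")}
A1, A2 = {1, 3}, {1, 2, 4}

lhs = image(f, A1 & A2)              # f(A1 ∩ A2)
rhs = image(f, A1) & image(f, A2)    # f(A1) ∩ f(A2)

assert lhs < rhs                                       # proper containment
assert image(f, A1 | A2) == image(f, A1) | image(f, A2)  # equality for unions
```

Here A1 ∩ A2 = {1}, so the left side is {a}, whereas both f(A1) and f(A2) equal {a, b}.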

To define the product of sets in a family, we first rewrite the product of two sets in an equivalent
way. Let A1 and A2 be nonempty sets and let a1 ∈ A1 , a2 ∈ A2 . The ordered pair (a1 , a2 ) may be
thought of as the function f : {1, 2} → A1 ∪ A2 with f (1) = a1 and f (2) = a2 . Therefore, A1 × A2 is
identified with the set of all functions f : {1, 2} → A1 ∪A2 with f (1) ∈ A1 and f (2) ∈ A2 . Generalizing
this observation leads to the following definition.

Definition 3.2.7. Let $\{A_\alpha\}_{\alpha \in I}$ be a nonempty family of sets. Assume that $A_\alpha$ is nonempty for each
α ∈ I. The product of the sets in the family is defined as

\[
\prod_{\alpha \in I} A_\alpha = \Big\{ f : f \text{ is a function from } I \text{ to } \bigcup_{\alpha \in I} A_\alpha \text{ with } f(\alpha) \in A_\alpha \text{ for each } \alpha \in I \Big\}.
\]

In case $A_\alpha = \emptyset$ for some α ∈ I, we define the product $\prod_{\alpha \in I} A_\alpha := \emptyset$.

Example 3.2.8. Take I = N and $A_\alpha = \{0, 1\}$ for each α ∈ N. Then the product $\prod_{\alpha \in I} A_\alpha$ is the set of
all functions f : N → {0, 1}. In other words, the product is the set of all 0-1 sequences.
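
For a finite index set, the product of Definition 3.2.7 can be listed explicitly. The following sketch represents each element of the product as a Python dictionary from indices to values; the name `family_product` and the sample families are our own assumptions, and the infinite index set of Example 3.2.8 is of course out of reach of such a listing.

```python
from itertools import product

def family_product(family):
    """List all functions f on the index set with f(alpha) in A_alpha.

    `family` maps each index alpha to the finite set A_alpha; every
    function in the product is returned as a dict {alpha: f(alpha)}.
    """
    indices = sorted(family)
    factors = [sorted(family[alpha]) for alpha in indices]
    return [dict(zip(indices, choice)) for choice in product(*factors)]

# Three copies of {0, 1}: the 0-1 sequences of length 3.
bits = family_product({1: {0, 1}, 2: {0, 1}, 3: {0, 1}})
assert len(bits) == 8

# If some A_alpha is empty, no choice function exists and the product is empty.
assert family_product({1: {0, 1}, 2: set()}) == []
```

The second assertion matches the convention that the product is ∅ as soon as one of the sets in the family is empty.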

Exercise 3.2.9.
1. Write R as a union of an infinite number of pairwise disjoint infinite sets.
2. Write the set {1, 2, 3, 4} as the intersection of an infinite number of infinite sets.
3. Prove Parts 2 and 4 of Proposition 3.2.3.
4. Let f : X → Y be a partial function, A ⊆ X, B ⊆ Y and let $\{B_\beta\}_{\beta \in I}$ be a nonempty family of
subsets of Y . Show the following.
(a) $f^{-1}\big(\bigcap_{\beta \in I} B_\beta\big) = \bigcap_{\beta \in I} f^{-1}(B_\beta)$.
(b) $f^{-1}(B^c) = \operatorname{dom} f \setminus f^{-1}(B)$.
(c) $f\big(f^{-1}(B) \cap A\big) = B \cap f(A)$.
(d) $f^{-1}\big(\bigcup_{\beta \in I} B_\beta\big) = \bigcup_{\beta \in I} f^{-1}(B_\beta)$.
Also, show that in (a)-(c), equality may fail if f is a relation but not a partial function. Observe
that (d) is a special case of Proposition 3.2.5.
5. Let f : X → Y be a one-one function and let $\{A_\alpha\}_{\alpha \in I}$ be a nonempty family of subsets of X. Is
it true that $f\big(\bigcap_{\alpha \in I} A_\alpha\big) = \bigcap_{\alpha \in I} f(A_\alpha)$?
6. Show that each set can be written as a union of finite sets.
7. Give an example of an equivalence relation on N for which there are 7 equivalence classes, out
of which exactly 5 are infinite.
8. Show that the union of finitely many finite sets is a finite set.
9. Let I = A₁ = A₂ = A₃ = {1, 2, 3}. Is the set $\prod_{\alpha \in I} A_\alpha$ equal to the set of all functions from {1, 2, 3}
to {1, 2, 3}? Give reasons for your answer.
10. Give sets $A_n$, n ∈ N, such that $\prod_{n \in \mathbb{N}} A_n$ has 6 elements. Give another. (Footnote: When we ask for more than one example, we encourage the reader to get examples of different types, if possible.)

3.3 Constructing bijections


Though we have discussed criteria for classifying a set as finite or infinite through injections, the
definitions demand creating bijections. If f : X → Y is one-one, then f : X → rng f is a bijection.
Besides this, we now discuss some general techniques to create bijections.

Experiment 1: Make a horizontal list of the elements of N using only dots instead of writing the
numbers themselves. Also write Z using dots horizontally below the list for N. Draw arrows connecting
the dots on the top list to dots on the bottom list to supply a bijection from N to Z. Can you supply
another bijection by changing the arrows?

Experiment 2: Consider an open interval (a, b). Its center is c = (a + b)/2, its length is ℓ = b − a, and the
distance of the center from each end-point is ℓ/2. View the open interval as a line segment on the real
line. Stretch (a, b) uniformly without disturbing the center and make its length equal to L. Use this
information to answer the following:
1. Where is c now (in R)?
2. Where is c − ℓ/2?
3. Where is c + ℓ/2?
4. Where is c − α × ℓ/2, for a fixed α ∈ (−1, 1)?
Using this information, find a bijection from (a, b) to (s, t). [Hint: First, fix the center.]

Practice 3.3.1.
1. Construct two bijections from (1, ∞) to (5, ∞).
2. Construct two bijections from (0, 1) to (1, ∞).
3. Construct two bijections from (−1, 1) to (−∞, ∞).
4. Construct two bijections from (0, 1) to R.
5. Construct two bijections from (0, 1) × (0, 1) to R × R.

Experiment 3: Let P = (0, 1), T = (3, 5) and f : P → T be a bijection. Imagine elements of P as


‘persons’ and elements of T as ‘seats’ in a train. So, f assigns a seat to each person and the train is
full.
1. Now suppose a new person 0 is arriving. He wants a seat. To manage it, let us un-seat two
persons 1/2, 1/3. So, two seats f (1/2), f (1/3) are vacant. But we have 3 persons to take those seats.
Giving each person a seat is not possible.
2. Suppose that we un-seat 1/2, 1/3, . . . , 1/30. Can we manage it?
3. Suppose that we un-seat 1/2, 1/3, . . . . Can we manage it now?
4. What do we do if we had two new persons arriving? Fifty new persons arriving? A set
{a1 , a2 , · · · } of new persons arriving?

It leads to the following result, which you can prove easily.

Theorem 3.3.2. [Train Seat Argument] Let X be a set with {a₁, a₂, . . .} ⊆ X, where the aᵢ are distinct, and let f : X → Y
be a bijection.

1. If c₁, . . . , cₖ are distinct objects not in X, then the function

\[
h(x) = \begin{cases}
f(x) & \text{if } x \in X \setminus \{a_1, a_2, \ldots\} \\
f(a_{i+k}) & \text{if } x = a_i,\ i \in \mathbb{N} \\
f(a_i) & \text{if } x = c_i,\ i = 1, 2, \ldots, k
\end{cases}
\]

is a bijection from X ∪ {c₁, . . . , cₖ} to Y.

2. If c₁, c₂, . . . are distinct objects not in X, then the function

\[
h(x) = \begin{cases}
f(x) & \text{if } x \in X \setminus \{a_1, a_2, \ldots\} \\
f(a_{2n-1}) & \text{if } x = a_n,\ n \in \mathbb{N} \\
f(a_{2n}) & \text{if } x = c_n,\ n \in \mathbb{N}
\end{cases}
\]

is a bijection from X ∪ {c₁, c₂, . . .} to Y.
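
Part 1 of the Train Seat Argument can be tried out on a concrete instance. The sketch below is only an illustration under assumptions of our own: X = Y = N, f is the identity, the un-seated persons are the even numbers a_i = 2i, and the k newcomers c_i are modelled as tuples ("c", i).

```python
def train_seat(k):
    """Return h from Part 1 for X = Y = N, f = identity and a_i = 2i."""
    def h(x):
        if isinstance(x, tuple):     # x = c_i, a newcomer
            _, i = x
            return 2 * i             # f(a_i)
        if x % 2 == 0:               # x = a_i with i = x // 2
            return x + 2 * k         # f(a_{i+k}) = 2(i + k)
        return x                     # x is not among the a_i
    return h

k, N = 3, 50
h = train_seat(k)
domain = [("c", i) for i in range(1, k + 1)] + list(range(1, N + 1))
values = [h(x) for x in domain]

assert len(set(values)) == len(values)        # h is one-one on this window
assert set(range(1, N + 1)) <= set(values)    # every seat 1, ..., N is taken
```

A finite window cannot prove bijectivity, but it shows the shifting pattern: the newcomers take the first k seats of the a_i, and each a_i moves k places along the list.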

Example 3.3.3. In each of the following cases, give a bijection from X to Y :

1. X = [0, 1) and Y = (0, 1).

Ans: Map {0, 1/2, 1/3, . . .} onto {1/2, 1/3, . . .} and each of the rest to itself. That is, define
f : X → Y by

\[
f(x) = \begin{cases}
1/2 & \text{if } x = 0 \\
1/(n+1) & \text{if } x = 1/n \text{ with } n \in \{2, 3, \ldots\} \\
x & \text{if } x \notin \{0, 1/2, 1/3, 1/4, \ldots\}.
\end{cases}
\]

2. X = (0, 1) and Y = R \ N.

Ans: f : X → R given by f (x) = tan(π(x − 1/2)) is a bijection. Define g : R → Y by

\[
g(x) = \begin{cases}
x & \text{if } x \in \mathbb{R} \setminus \mathbb{Z} \\
-2x & \text{if } x \in \mathbb{N} \cup \{0\} \\
2x + 1 & \text{if } -x \in \mathbb{N}.
\end{cases}
\]

That is, g maps each x in R \ Z to itself by the identity map, and then it maps 0, −1, 1, −2, 2, −3, 3, . . .
to 0, −1, −2, −3, −4, −5, −6, . . . in that order. Clearly, g is a bijection. Hence g ◦ f : X → Y is
a bijection.
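
The action of g on the integers can be checked numerically. The sketch below restricts g to Z and uses 2x + 1 on the negative integers, which is the rule that reproduces the listed values 0, −1, −2, −3, . . . ; the function name `g_on_integers` is our own.

```python
def g_on_integers(x: int) -> int:
    """Restriction of g to Z: the image should be Z \\ N = {0, -1, -2, ...}."""
    if x >= 0:
        return -2 * x        # 0, 1, 2, ...   go to 0, -2, -4, ...
    return 2 * x + 1         # -1, -2, -3, ... go to -1, -3, -5, ...

order = [0, -1, 1, -2, 2, -3, 3]
assert [g_on_integers(x) for x in order] == [0, -1, -2, -3, -4, -5, -6]

# On the window -50, ..., 50 the values are exactly 0, -1, ..., -100.
values = [g_on_integers(x) for x in range(-50, 51)]
assert len(set(values)) == len(values)
assert set(values) == set(range(-100, 1))
```

Since g is the identity on R \ Z, these checks cover the only part of g where something moves.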

Exercise 3.3.4. In each of the following, use Theorem 3.3.2 to give a bijection from X to Y .

1. X = [0, 1] and Y = (0, 1).

2. X = (0, 1) ∪ {1, 2, 3, 4} and Y = (0, 1).

3. X = (0, 1) ∪ N and Y = (0, 1).

4. X = [0, 1] and Y = [0, 1] \ {1/1, 1/3, 1/5, . . .}.

5. X = R and Y = R \ N.

6. X = [0, 1] and Y = R \ N.

7. X = (0, 1) and Y = (1, 2) ∪ (3, 4).

8. X = R \ Z and Y = R \ N.

3.4 Cantor-Schröder-Bernstein Theorem


Let A and B be finite sets with |A| = m and |B| = n. Suppose there exists a one-one function from
A to B. Then we know that m ≤ n. In addition, if there exists a one-one function from B to A, then
n ≤ m so that m = n. It then follows that there is a bijection from A to B. Does the same result hold
good for infinite sets? That is, given one-one functions f : A → B and g : B → A does there exist a
bijection from A to B?

Experiment : Creating a Bijection from Injections


Let X = Y = N. Take one-one functions f : X → Y and g : Y → X defined by f (x) = x + 2 and
g(x) = x + 1. In the picture, we have X on the left and Y on the right. If (x, y) ∈ f , we draw a solid
line joining x and y. If (y, x) ∈ g, we draw a dotted line joining y and x.

[Figure 3.1: Graphic representation of functions f and g. The elements 1, 2, 3, . . . of X are listed on the left and those of Y on the right.]

We want to create a bijection h from X to Y by erasing some of these lines. Initially, we keep all solid
lines and look at rng f . Since f is not an onto function, there are elements in Y \ rng f . Each one of
these elements must be connected by a dotted line to some element in X. So, we keep all those pairs
(y, x) ∈ g such that y 6∈ rng f . We follow the heuristic of keeping as many pairs in f as possible; and
then keep a pair (y, x) ∈ g if no pair (z, y) ∈ f has been kept.

1. The elements 1, 2 ∈ Y are not in rng f . So, the dotted lines connecting them to elements in
X must stay. That is, the pairs (1, 2), (2, 3) ∈ g must be kept.
2. Then the pairs (2, 4), (3, 5) ∈ f must be deleted.
3. Now, (1, 3) ∈ f ; it is kept, and then (3, 4) ∈ g must be deleted.
4. The pair (4, 5) ∈ g is kept; so (5, 7) ∈ f must be deleted.
5. The pair (4, 6) ∈ f is kept, and then (6, 7) ∈ g must be deleted.
6. The pair (7, 8) ∈ g is kept; so (8, 10) ∈ f must be deleted.

Continue this scheme to realize what is happening. Then the bijection h : X → Y is given by

\[
h(x) = \begin{cases}
f(x) & \text{if } x = 3n - 2,\ n \in \mathbb{N} \\
g^{-1}(x) & \text{otherwise.}
\end{cases}
\]

Practice 3.4.1. Construct bijections using the given injections f : N → N and g : N → N.
1. f (x) = x + 1 and g(x) = x + 2.
2. f (x) = x + 1 and g(x) = x + 3.
3. f (x) = x + 1 and g(x) = 2x.

We use this heuristic method of constructing a bijection in proving the following theorem.

Theorem 3.4.2. [Cantor-Schröder-Bernstein (CSB)] Let X and Y be nonempty sets and let
f : X → Y and g : Y → X be one-one functions. Then there exists a bijection h : X → Y.

Proof. If f is onto, then f itself is a bijection. So, assume that f is not onto. Then f (X) is a proper
subset of Y . Write B = Y \ f (X), φ = f ◦ g, and $A = B \cup \varphi(B) \cup \varphi^2(B) \cup \cdots = B \cup \bigcup_{n=1}^{\infty} \varphi^n(B)$. Then
A ⊆ Y and

\[
\varphi(A) = \varphi(B) \cup \bigcup_{n=2}^{\infty} \varphi^n(B) = \bigcup_{n=1}^{\infty} \varphi^n(B).
\]

Hence A = B ∪ φ(A). Notice that f (X) = Y \ B, φ(A) = f (g(A)) ⊆ Y , and f is one-one. Hence

f (X \ g(A)) = f (X) \ f (g(A)) = [Y \ B] \ φ(A) = Y \ [B ∪ φ(A)] = Y \ A.

Thus, the restriction of f to X \ g(A) is a bijection onto Y \ A. As g is one-one, its restriction to A
is a bijection onto g(A). That is, g⁻¹ : g(A) → A is a bijection. Therefore, the function h : X → Y
defined by

\[
h(x) = \begin{cases}
f(x), & \text{if } x \in X \setminus g(A), \\
g^{-1}(x), & \text{if } x \in g(A)
\end{cases}
\]

is a bijection.
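
The construction in this proof can be turned into a small program when membership in g(A) is decidable by tracing an element backwards. The sketch below makes assumptions of our own: X = Y = N, the partial inverses `f_inv` and `g_inv` return `None` when no preimage exists, and every backward chain is finite (which holds for the injections of the experiment above, since preimages decrease).

```python
def csb(f, f_inv, g_inv):
    """Build h : X -> Y from injections f : X -> Y and g : Y -> X.

    x lies in g(A) exactly when the backward chain
    x, g^{-1}(x), f^{-1}(g^{-1}(x)), ... stops in B = Y \\ f(X).
    """
    def in_gA(x):
        while True:
            y = g_inv(x)
            if y is None:      # x is not in g(Y): the chain stops in X
                return False
            x = f_inv(y)
            if x is None:      # y is in B: the chain stops in Y \ f(X)
                return True

    def h(x):
        return g_inv(x) if in_gA(x) else f(x)

    return h

# The injections of the experiment: f(x) = x + 2 and g(y) = y + 1 on N.
h = csb(
    f=lambda x: x + 2,
    f_inv=lambda y: y - 2 if y >= 3 else None,
    g_inv=lambda x: x - 1 if x >= 2 else None,
)
values = [h(x) for x in range(1, 61)]
assert set(values) == set(range(1, 61))   # h permutes {1, ..., 60}
```

For these two injections the program reproduces the rule found in the experiment: h(x) = f(x) when x = 3n − 2, and h(x) = g⁻¹(x) otherwise.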

Alternate. If g is onto, we have nothing to prove. So, assume that g is not onto. Then O :=
X \ g(Y ) ≠ ∅. Write ψ = g ◦ f and $E = O \cup \psi(O) \cup \psi^2(O) \cup \cdots = O \cup \bigcup_{n=1}^{\infty} \psi^n(O)$. Observe that
O ⊆ E ⊆ X, ψ : X → X is one-one, and g does not map any element of Y to any element of O. Hence

\[
\psi(E) = \psi\Big(O \cup \bigcup_{n=1}^{\infty} \psi^n(O)\Big) = \bigcup_{n=1}^{\infty} \psi^n(O) = E \setminus O.
\]

Thus the restriction of ψ to E is a bijection from E onto E \ O. Define the function τ : X → X \ O by

\[
\tau(x) = \begin{cases}
x, & \text{if } x \in X \setminus E, \\
\psi(x), & \text{if } x \in E.
\end{cases}
\]

Then τ is a bijection. Write h := τ⁻¹ ◦ g. Then h is one-one and h(Y ) = τ⁻¹(g(Y )) = τ⁻¹(X \ O) = X.
Therefore, h is a bijection from Y to X, and h⁻¹ is a bijection from X to Y .

Alternate. Consider the family $\mathcal{F} = \{T \subseteq X : g(f(T)^c) \subseteq T^c\}$ of subsets of X. Here, $T^c = X \setminus T$
and $f(T)^c = Y \setminus f(T)$.

[Figure 3.2: Depiction of CSB-theorem. f maps T onto f (T ), and g maps f (T )^c onto g(f (T )^c).]

Note that $\emptyset \in \mathcal{F}$. Put $U = \bigcup_{T \in \mathcal{F}} T$. Then

\[
g\big(f(U)^c\big) = g\Big(\Big(f\Big(\bigcup_{T \in \mathcal{F}} T\Big)\Big)^c\Big) = g\Big(\Big(\bigcup_{T \in \mathcal{F}} f(T)\Big)^c\Big) = g\Big(\bigcap_{T \in \mathcal{F}} f(T)^c\Big) = \bigcap_{T \in \mathcal{F}} g\big(f(T)^c\big) \subseteq \bigcap_{T \in \mathcal{F}} T^c = U^c.
\]

Thus, $U \in \mathcal{F}$; and hence U is the maximal element of $\mathcal{F}$. Now that $g(f(U)^c) \subseteq U^c$, we want to
show that $g(f(U)^c) = U^c$. On the contrary, assume that $U^c \neq g(f(U)^c)$. Then we have an element
$x \in U^c \setminus g(f(U)^c)$. Write V = U ∪ {x}. Then $g(f(U)^c) \subseteq U^c \cap \{x\}^c$ and f (U ) ⊆ f (V ). Thus,
$f(V)^c \subseteq f(U)^c$ and

\[
g\big(f(V)^c\big) \subseteq g\big(f(U)^c\big) \subseteq U^c \cap \{x\}^c = V^c.
\]

This contradicts the maximality of U in $\mathcal{F}$. So, $g(f(U)^c) = U^c$. Hence f is a bijection from U to
f (U ) and g is a bijection from $f(U)^c$ to $U^c$. Define h : X → Y by

\[
h(x) = \begin{cases}
f(x) & \text{if } x \in U, \\
g^{-1}(x) & \text{if } x \notin U.
\end{cases}
\]

Then h is a bijection.

We apply CSB-theorem to prove the following important result. Also, we give different proofs of
this fact.
Theorem 3.4.3. The set N × N is equinumerous with N.
Proof. We already know that the function f : N → N × N given by f (n) = (n, 1) is one-one. Define the
function g : N × N → N by $g(m, n) = 2^m 3^n$. Note that g(m, n) = g(r, s) implies that $2^{m-r} = 3^{s-n}$.
Since one side is a power of 2 and the other is a power of 3, their equality ensures that the exponents are 0.
Hence m = r and s = n; that is, (m, n) = (r, s), and thus g is one-one. By CSB-theorem, there exists
a bijection from N × N to N.

Alternate. Define the function h : N × N → N by $h(x, y) = 2^{x-1}(2y - 1)$. Suppose h(x, y) = h(m, n).
Then, $2^{x-1}(2y - 1) = 2^{m-1}(2n - 1)$. Let x > m. Then $2^{x-m}(2y - 1) = 2n - 1$ implies that the left
hand side is an even number whereas the right hand side is an odd number; this is a contradiction.
Similarly, x < m leads to a contradiction. Hence x = m. Then the equality implies 2y − 1 = 2n − 1
so that y = n. Thus, (x, y) = (m, n) and hence h is a one-one function. Further, each x ∈ N can be
uniquely written as $x = 2^{r-1}(2n - 1)$, for some r, n ≥ 1. So, h is an onto function.
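
The bijection h(x, y) = 2^(x−1)(2y − 1) and its inverse are easy to compute: the inverse strips the factors of 2 and reads y off the remaining odd part. The following is a minimal sketch; the names `pair` and `unpair` are our own.

```python
def pair(x: int, y: int) -> int:
    """h(x, y) = 2^(x-1) * (2y - 1) for x, y >= 1."""
    return 2 ** (x - 1) * (2 * y - 1)

def unpair(z: int) -> tuple[int, int]:
    """Inverse of pair: write z = 2^(x-1) * (odd number 2y - 1)."""
    x = 1
    while z % 2 == 0:      # strip the factors of 2
        z //= 2
        x += 1
    return x, (z + 1) // 2   # the remaining odd part is 2y - 1

# Round trips in both directions on a finite window.
assert all(pair(*unpair(z)) == z for z in range(1, 1001))
assert all(unpair(pair(x, y)) == (x, y)
           for x in range(1, 11) for y in range(1, 11))
```

The first assertion reflects that h is onto, and the second that h is one-one, on the ranges tested.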

Alternate. Define f : N × N → N by f (m, n) = (m + n − 1)(m + n − 2)/2 + n. Since m ≥ 1, n ≥ 1,
(m + n − 1)(m + n − 2)/2 + n ≥ 1. Hence f is well defined. Write $S_0 = 0$; and for any r ∈ N, write
$1 + 2 + \cdots + r = S_r$. Notice that $f(m, n) = S_{m+n-2} + n$. In Example 2.3.1.2, we have shown that
corresponding to each x ∈ N, there exists a unique t ∈ N ∪ {0} such that $S_t < x \le S_{t+1}$. The existence
of such a t shows that f is onto, and its uniqueness shows that f is one-one. The details are as follows.
Suppose f (k, ℓ) = f (m, n) for some choice of k, ℓ, m, n ∈ N, i.e., $x := S_{k+\ell-2} + \ell = S_{m+n-2} + n$.
Since ℓ ≤ k + ℓ − 1 and n ≤ m + n − 1, we have $S_{k+\ell-2} < x \le S_{k+\ell-1}$ and $S_{m+n-2} < x \le S_{m+n-1}$. By
the uniqueness of t corresponding to x it follows that k + ℓ − 2 = m + n − 2. Therefore $S_{k+\ell-2} = S_{m+n-2}$
and $\ell = x - S_{k+\ell-2} = x - S_{m+n-2} = n$. This, along with k + ℓ − 2 = m + n − 2, implies that k = m.
Hence, (k, ℓ) = (m, n) and consequently, f is one-one.
To show that f is onto, let x ∈ N. Then there exists t ∈ N ∪ {0} such that $S_t < x \le S_{t+1}$. Take
$n = x - S_t$. The inequality $S_t < x \le S_{t+1}$ implies that 1 ≤ n ≤ t + 1. So, take m = t + 2 − n. Then
note that for m, n chosen as above, m ≥ 1, n ≥ 1, t = m + n − 2 and $f(m, n) = S_{m+n-2} + n = S_t + n = x$.
Therefore, f is an onto function.
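
The onto part of this proof is constructive, so the inverse of f can be computed by following it step by step. In the sketch below the names `cantor` and `cantor_inverse` are our own; the loop finds the unique t ≥ 0 with S_t < x ≤ S_{t+1}.

```python
def cantor(m: int, n: int) -> int:
    """f(m, n) = (m + n - 1)(m + n - 2)/2 + n for m, n >= 1."""
    return (m + n - 1) * (m + n - 2) // 2 + n

def cantor_inverse(x: int) -> tuple[int, int]:
    """Return the pair (m, n) with f(m, n) = x, following the proof."""
    t = 0
    while (t + 1) * (t + 2) // 2 < x:   # advance while S_{t+1} < x
        t += 1
    n = x - t * (t + 1) // 2            # n = x - S_t, so 1 <= n <= t + 1
    m = t + 2 - n
    return m, n

# f and its inverse undo each other on a finite window.
assert all(cantor(*cantor_inverse(x)) == x for x in range(1, 2001))
assert all(cantor_inverse(cantor(m, n)) == (m, n)
           for m in range(1, 31) for n in range(1, 31))
```

Listing `cantor_inverse(x)` for x = 1, 2, 3, . . . walks through N × N along the diagonals m + n = 2, 3, 4, . . . .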

The function f (m, n) in the above proof is called Cantor’s pairing function. It is not yet
known whether, apart from f and the polynomial obtained from it by interchanging m and n, there exists another polynomial in m and n which is a bijection from N × N to N.

Example 3.4.4. We show that Q is equinumerous with N. For this, write Q = Q+ ∪ Q− ∪ {0}, where

\[
\mathbb{Q}^+ = \Big\{ \frac{m}{n} : m, n \in \mathbb{N},\ \gcd(m, n) = 1 \Big\}, \qquad \mathbb{Q}^- = \{-x : x \in \mathbb{Q}^+\}.
\]

1. Prove that Q+ is equinumerous with N.

Proof. Let p₁, p₂, . . . be the infinite list of prime numbers arranged in an increasing order, that
is, p₁ = 2, p₂ = 3, p₃ = 5, etc. The prime factorization theorem asserts that each n ∈ N can be
written uniquely as $n = p_1^{a_1} p_2^{a_2} \cdots$, where $a_i \in \mathbb{N}$ only for a finite number of $p_i$'s, and the rest
of the $a_i$'s are 0. Hence each q ∈ Q+ can be written uniquely as $q = p_1^{b_1} p_2^{b_2} \cdots$, where $b_i \in \mathbb{Z} \setminus \{0\}$
only for a finite number of $p_i$'s, and the rest of the $b_i$'s are 0. Let f : N ∪ {0} → Z be the bijection
given by f (n) = −n/2 if n is even, and f (n) = (n + 1)/2 if n is odd; note that f (0) = 0. Define g : N → Q+ by
$g(n) = p_1^{f(a_1)} p_2^{f(a_2)} \cdots$ for $n = p_1^{a_1} p_2^{a_2} \cdots$. Then g is a bijection.
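
The map g can be computed directly from the prime factorization of n. The sketch below uses trial division and Python's `Fraction` type; it takes f on the exponents 0, 1, 2, . . . with f (0) = 0, so that primes not dividing n contribute nothing. The function names are our own.

```python
from fractions import Fraction

def f(a: int) -> int:
    """Bijection from {0, 1, 2, ...} onto Z: 0, 1, -1, 2, -2, ..."""
    return -(a // 2) if a % 2 == 0 else (a + 1) // 2

def g(n: int) -> Fraction:
    """Send n = p1^a1 p2^a2 ... to the rational p1^f(a1) p2^f(a2) ..."""
    q = Fraction(1)
    p = 2
    while n > 1:
        a = 0
        while n % p == 0:      # a = exponent of p in n
            n //= p
            a += 1
        if a:
            q *= Fraction(p) ** f(a)
        p += 1                 # composite p never divides what is left of n
    return q

values = [g(n) for n in range(1, 2001)]
assert len(set(values)) == len(values)   # g is one-one on this window
assert all(v > 0 for v in values)        # and takes values in Q+
```

For instance, 12 = 2² · 3 is sent to 2^f(2) · 3^f(1) = 2⁻¹ · 3 = 3/2.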

2. Use the above to conclude that Q− is equinumerous with N.

Ans: The function h : Q+ → Q− given by h(q) = −q is a bijection. Using Part 1, we see that
h ◦ g : N → Q− is a bijection.

3. Use the above two parts to conclude that Q is equinumerous with N.

Ans: Let A = {2n : n ∈ N}, B = {2n + 1 : n ∈ N}. Then N = A ∪ B ∪ {1}. Define φ₁ : A → N
by φ₁(n) = n/2, and φ₂ : B → N by φ₂(n) = (n − 1)/2. Let g : N → Q+ and h : Q+ → Q− be
the bijections given in Parts 1 and 2. Then g ◦ φ₁ is a bijection from A to Q+ , and h ◦ g ◦ φ₂ is
a bijection from B to Q− . We see that the following function ψ : N → Q is a bijection:

\[
\psi(x) = \begin{cases}
(g \circ \varphi_1)(x) & \text{if } x \in A \\
(h \circ g \circ \varphi_2)(x) & \text{if } x \in B \\
0 & \text{if } x = 1.
\end{cases}
\]

Exercise 3.4.5.

1. For each of the exercises in Exercise 3.3.4, give injections. Then use the CSB-theorem to prove
that all the sets are equinumerous.

2. Define f : Q → N by

\[
f(x) = \begin{cases}
2^r 3^s & \text{if } x = \tfrac{r}{s},\ \gcd(r, s) = 1,\ r > 0,\ s > 0 \\
5^r 3^s & \text{if } x = \tfrac{-r}{s},\ \gcd(r, s) = 1,\ r > 0,\ s > 0 \\
1 & \text{if } x = 0.
\end{cases}
\]

Show that f is one-one. Apply CSB-theorem to prove that Q is equinumerous with N.

3. Let X = {(x, y) ∈ N × N : y ≤ x}.

(a) Define a function f : N × N → X by f (x, y) = (x + y − 1, y). Prove that f is a bijection.
(b) Further, define g : X → N by g(x, y) = x(x − 1)/2 + y. Prove that g is a bijection.

Note that g ◦ f is a bijection from N × N to N. Is this function the same as Cantor’s pairing
function?

3.5 Countable and uncountable sets


As we have seen, N × N and Q are equinumerous with N. By induction it follows that $\mathbb{N}^k$, that is, the
product of N with itself taken k times, for any natural number k, is also equinumerous with N. Does
it mean that every infinite set is equinumerous with N? With the hope of discovering an answer to
this question, we introduce some related notions.
Definition 3.5.1. 1. A set which is equinumerous with N is called a denumerable set. A denu-
merable set is also called a countably infinite set.
2. A set which is either finite or denumerable is called a countable set.
3. A set which is not countable is called an uncountable set.

Since the identity function on N is a bijection, it follows that N is denumerable. Each finite set,
such as ∅ and [m] for m ∈ N, is countable; so is N.

Example 3.5.2.
1. Define f : N → Z and g : Z → N, respectively by

\[
f(x) = \begin{cases} -x/2 & \text{if } x \text{ is even} \\ (x - 1)/2 & \text{if } x \text{ is odd,} \end{cases}
\qquad
g(z) = \begin{cases} -2z & \text{if } z \text{ is negative} \\ 1 + 2z & \text{if } z \text{ is non-negative.} \end{cases}
\]

Then, we see that g ◦ f and f ◦ g are identity functions on their respective domains. Hence f is
a bijection. Therefore, Z is denumerable; and also countable.
2. By Theorem 3.4.3, there is a bijection from N × N onto N. Thus, N × N is denumerable, and
countable.
3. By Example 3.4.4, Q+ , Q− , Q are denumerable, and countable.
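
The claim in Example 3.5.2.1 that g ◦ f and f ◦ g are identity functions can be spot-checked on a finite range. The sketch below is a direct transcription of the two formulas into Python, with integer division in place of the fractions.

```python
def f(x: int) -> int:
    """f : N -> Z, sending 1, 2, 3, 4, 5, ... to 0, -1, 1, -2, 2, ..."""
    return -(x // 2) if x % 2 == 0 else (x - 1) // 2

def g(z: int) -> int:
    """g : Z -> N, with -2z for negative z and 1 + 2z otherwise."""
    return -2 * z if z < 0 else 1 + 2 * z

assert all(g(f(x)) == x for x in range(1, 1001))     # g ∘ f is the identity on N
assert all(f(g(z)) == z for z in range(-500, 501))   # f ∘ g is the identity on Z
```

Listing f (1), f (2), f (3), . . . gives the enumeration 0, −1, 1, −2, 2, . . . of Z.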

Before exploring other examples, we will give simpler characterizations of these notions.

Theorem 3.5.3. Let X be a nonempty set.


1. X is countable if and only if there exists a one-one function f : X → N.
2. X is denumerable if and only if there exist one-one functions f : X → N and g : N → X.

Proof. 1. Let X be a countable set. If X is finite, then there exists a bijection f : X → [m] for some
m ∈ N. This bijection gives a one-one function f : X → N. Else, X is denumerable, so that there
is a bijection g : X → N. In this case, the function g is one-one. Conversely, suppose there exists
a one-one function f : X → N. If X is finite, then it is countable. So, suppose that X is infinite.
Then, by Theorem 3.1.4.2, there exists a one-one function g : N → X. By CSB-theorem, there exists
a bijection h : X → N. Hence X is denumerable; thus countable.
2. Let X be a denumerable set. By definition there is a bijection f : X → N. Thus, f : X → N and
f −1 : N → X are one-one functions. Conversely, suppose there exist one-one functions f : X → N and
g : N → X. Then, by CSB-theorem, there exists a bijection h : X → N. Hence X is denumerable.

Definition 3.5.4.
1. Let X be a denumerable set. Then, there is a bijection f : N → X. So, we can list all the
elements of X as f (1), f (2), . . .. This list is called an enumeration of the elements of X.
2. Let X be a nonempty set. An infinite sequence of elements of X is a function f : N → X.
Writing $f(i) = x_i$, such a sequence is represented by $(x_i)_{i \in \mathbb{N}} = (x_1, x_2, \ldots)$, where $x_i \in X$.

In the proof of Theorem 3.1.4, Part 2, we have essentially extracted an infinite sequence from the
infinite set X.
Since Z is denumerable, its elements can be enumerated. For example, 0, 1, −1, 2, −2, 3, −3, . . . is
an enumeration of Z. Similarly, all rational numbers can be enumerated; and sometimes we write such
an enumeration of Q as r₁, r₂, r₃, . . .. It says that there is a sequence (r₁, r₂, r₃, . . .) in which each
rational number occurs exactly once. This is what an enumeration means. If X is a countable set,
then its elements can be enumerated in a sequence; but the sequence can be finite or infinite.
By a denumerable family of sets, we mean a family of sets which is denumerable. A denumerable
family of sets can be indexed by N and we may write such a family as {Ai }i∈N . We also use the same
notation for a countable family, where possibly only a finite number of sets Ai are nonempty. The
union of sets in a countable family will be referred to as a countable union of sets.
Notice that a countably infinite set is denumerable. Besides this, some more facts about countable
sets are listed in the following proposition.

Proposition 3.5.5. [Facts about countable sets]


1. Each subset of a denumerable set is countable.
2. Each infinite subset of a denumerable set is denumerable.
3. A set is infinite if and only if it has a denumerable subset.
4. Any subset of a countable set is countable; and any superset of an uncountable set is uncountable.
5. A countable union of countable sets is countable.
6. For any k ∈ N, the Cartesian product Nk is denumerable.
T
AF

7. A finite product of countable sets is countable.



Proof. (1) Let X ⊆ Y , where Y is denumerable. There exists a bijection f : Y → N. The inclusion
function Id : X → Y , given by Id(x) = x, is one-one. So, f ◦ Id : X → N is one-one. Hence, X is countable.
(2) Let X be an infinite subset of a denumerable set. By (1), X is countable. So, X is countably
infinite, that is, denumerable.
(3) Let X be an infinite set. Then, by Theorem 3.5.3, there is a one-one function f : N → X. Thus,
f : N → rng f is a bijection. Hence, rng f is a denumerable subset of X.
Conversely, let X be a set and let Y ⊆ X be denumerable. There exists a bijection f : Y → N.
The function $f^{-1}$ : N → X is one-one. By Theorem 3.1.4, X is an infinite set.
(4) Let X be a countable set and let Y ⊆ X. If Y = ∅, then it is finite, thus countable. So, suppose
that Y ≠ ∅. As X is countable, by Theorem 3.5.3, there exists a one-one function f : X → N. The
restriction of f to Y is also a one-one function from Y to N. Hence Y is countable.
Let X be an uncountable set and let Y ⊇ X. If Y were countable, then by what we have just proved,
X would be countable, a contradiction. Hence, Y is uncountable.
(5) Let {Ai }i∈N be a countable family of sets, where each Ai is a countable set. Write $X = \bigcup_{i \in \mathbb{N}} A_i$.
We show that X is countable.
If X is finite, then it is countable. So, let X be infinite. By Theorem 3.1.4.2, there is a one-one
function f : N → X. Now, let x ∈ X. Then, there exists at least one i ∈ N such that x ∈ Ai . Further,
since each Ai is countable, we may assume that each Ai has been enumerated. Let i be the smallest natural
number for which x ∈ Ai , and suppose x appears at the k-th position in the enumeration of Ai . Thus,
corresponding to each x ∈ X, we have a unique pair (i, k) of natural numbers. Define g : X → N by
$g(x) = 2^i 3^k$. By the uniqueness of prime factorization, g is one-one. Therefore, by the
CSB-theorem, X is equinumerous with N. Hence, X is countable.


(6) For k = 1, the result is obvious. Suppose the result is true for k = m. That is, there exists a
bijection $f : \mathbb{N}^m \to \mathbb{N}$. From Theorem 3.4.3, we have a bijection g : N × N → N. Define $h : \mathbb{N}^{m+1} \to \mathbb{N}$
by $h(x_1, \ldots, x_m, x_{m+1}) = g\big(f(x_1, \ldots, x_m),\, x_{m+1}\big)$. Then h is a bijection. Thus, by the PMI the result
holds.
Alternate. The function $f : \mathbb{N} \to \mathbb{N}^k$ given by f (m) = (m, 1, . . . , 1) is one-one. Next, let p1 , . . . , pk
be the first k primes, i.e., p1 = 2, p2 = 3, etc. Define $g : \mathbb{N}^k \to \mathbb{N}$ by
$g(m_1, \ldots, m_k) = p_1^{m_1 - 1} p_2^{m_2 - 1} \cdots p_k^{m_k - 1}$.
The prime factorization theorem implies that g is one-one. So, by the CSB-theorem, there exists a
bijection from $\mathbb{N}^k$ to $\mathbb{N}$.
(7) Let A1 , . . . , Ak be countable sets. We need to show that X := A1 × · · · × Ak is countable. If
any Ai = ∅, then X = ∅; thus it is countable. So, assume that each Ai is nonempty. Since Ai is
countable, there exists a one-one function fi : Ai → N. Then the function $f : X \to \mathbb{N}^k$ defined by
$f(x_1, \ldots, x_k) = \big(f_1(x_1), \ldots, f_k(x_k)\big)$ is one-one. Let $g : \mathbb{N}^k \to \mathbb{N}$ be the one-one function given in (6).
Then g ◦ f : X → N is a one-one function. Hence, X is countable.
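
The one-one map g used in the alternate proof of (6) can be tried out on a small range. The following is a minimal sketch, not part of the original notes; the helper names `first_primes` and `encode` are hypothetical, and the check covers only triples with entries between 1 and 5.

```python
from itertools import product


def first_primes(k):
    """Return the first k primes by trial division (adequate for small k)."""
    primes = []
    candidate = 2
    while len(primes) < k:
        if all(candidate % p != 0 for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def encode(ms):
    """g(m1, ..., mk) = p1^(m1-1) * ... * pk^(mk-1) for positive integers mi."""
    result = 1
    for p, m in zip(first_primes(len(ms)), ms):
        result *= p ** (m - 1)
    return result


# Distinct triples with entries in 1..5 should receive distinct codes.
triples = list(product(range(1, 6), repeat=3))
codes = {encode(t) for t in triples}
print(len(triples), len(codes))  # 125 125
```

A finite check of this kind proves nothing in general; injectivity on all of N^k rests on the uniqueness of prime factorization, as stated in the proof.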

We now address the question whether all infinite sets are denumerable or not. Its answer is hidden
in Cantor’s experiment, which we present in the following. Recall that if X is a set, then its power
set P(X) denotes the set of all subsets of X.

Cantor’s experiment: Take a blank sheet of paper.


1. On the left draw an oval (of vertical length) and write the elements of {1, 2, 3, 4} inside it,
one below the other. On the right draw a similar but larger oval and write the elements of
P({1, 2, 3, 4}) inside it, one below the other.
2. Now draw a directed line from 1 (on the left) to any element on the right. Repeat this for 2, 3
and 4. We have drawn a function. Call it f.
3. Notice that f (1), f (2), f (3) and f (4) are sets. Find out the set Y = {i : i ∉ f (i)}. Locate this
set on the right.
4. It is guaranteed that you do not have a directed line touching Y . Why?

Theorem 3.5.6. [Cantor] There exists no surjection from a set to its power set.

Proof. On the contrary, let X be a set and let f : X → P(X) be an onto function. For each x ∈ X,
f (x) ⊆ X. Consider the set Y = {x ∈ X : x ∉ f (x)}. Since Y ∈ P(X) and f is onto, there exists
s ∈ X with f (s) = Y .
If s ∈ Y , then s satisfies the defining property of Y , i.e., s ∉ f (s). As f (s) = Y , s ∉ Y .
If s ∉ Y , then f (s) = Y gives s ∉ f (s). So, s satisfies the defining property of Y , and hence s ∈ Y .
We thus see that s ∈ Y if and only if s 6∈ Y . This is a contradiction.
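
For a small finite set, the construction in Cantor's experiment and in the proof above can be checked exhaustively. The sketch below is an illustration only (the names `power_set` and `diagonal_set` are ours): it runs through all 512 functions f : {1, 2, 3} → P({1, 2, 3}) and confirms that the set Y = {x : x ∉ f(x)} is never among the values of f.

```python
from itertools import combinations, product


def power_set(xs):
    """All subsets of xs, each represented as a frozenset."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]


def diagonal_set(f):
    """Y = {x : x not in f(x)} for a function f given as a dict x -> subset."""
    return frozenset(x for x in f if x not in f[x])


X = [1, 2, 3]
subsets = power_set(X)

# Every function f : X -> P(X) corresponds to a tuple of |X| subsets.
missed_every_time = all(
    diagonal_set(dict(zip(X, images))) not in images
    for images in product(subsets, repeat=len(X))
)
print(missed_every_time)  # True
```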

Remark 3.5.7. Cantor’s theorem implies that one cannot have a bijection between a set and its
power set. In particular, the sets N and P(N) cannot be equinumerous. However, f : N → P(N) given
by f (x) = {x} is one-one. Thus the set P(N) is infinite but not denumerable, i.e., by Definition 3.5.1,
P(N) is an uncountable set. It follows that any set equinumerous with P(N) is uncountable. In
general, the following result holds.

Theorem 3.5.8. The power set of any infinite set is uncountable.



Proof. Let X be an infinite set. By Theorem 3.1.4, there exists a one-one function f : N → X. Define
the function g : P(N) → P(X) by

g(A) = {f (i) : i ∈ A} for each A ∈ P(N).



Then, g is one-one. As Remark 3.5.7 shows, P(N) is uncountable. Thus g(P(N)) is uncountable. The
set P(X), being a superset of g(P(N)), is uncountable.

Example 3.5.9.
1. Let X be the family of all functions x : N → {0, 1}. Equivalently, let

X = {x : x = (x1 , x2 , . . .), xi ∈ {0, 1} for each i ∈ N},

the set of all 0-1 sequences. Define f : X → P(N) by

f (x) = f (x1 , x2 , . . .) = {n : xn = 1}.

Then f is a bijection. Hence, X is uncountable.



2. Let Y = {.a1 a2 a3 · · · : ai ∈ {0, 1} for each i ∈ N}. It follows from (1) that Y is uncountable.
We give another proof by Cantor.

Cantor’s diagonalization: On the contrary, suppose Y is countable. Clearly Y is not finite.
So, let x1 , x2 , · · · be an enumeration of Y . Let $x_n = .x_{n1} x_{n2} \cdots$, where $x_{ni} \in \{0, 1\}$. We
construct the numbers $y_n$ as follows:

If $x_{nn} = 0$, then take $y_n = 1$; otherwise, take $y_n = 0$.

Now, consider the number y = .y1 y2 · · · ∈ Y . Notice that for each n, y ≠ xn , since y and xn differ
in the n-th digit. That is, y ∈ Y but it is not in the enumeration of Y . This is a contradiction.


Recall that every real number in the interval [0, 1) has a unique non-terminating binary repre-
sentation, and also a non-terminating decimal representation. Thus we have shown that [0, 1) is
an uncountable set.
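
The diagonal construction can be tried on a finite table of digits. The following is a small sketch under the assumption that each row is a Python list holding the first few binary digits of one listed number; the function name `diagonal_flip` is hypothetical.

```python
def diagonal_flip(rows):
    """Given n binary digit lists (each with at least n digits), return digits
    y_1, ..., y_n where y_i differs from the i-th digit of the i-th list."""
    return [1 - rows[i][i] for i in range(len(rows))]


rows = [
    [0, 1, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [1, 0, 1, 1],
]
y = diagonal_flip(rows)
print(y)  # [1, 0, 1, 0]
```

By construction y differs from the n-th row in the n-th position, so it cannot equal any row of the table; the proof applies the same idea to an infinite enumeration.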

Theorem 3.5.10. The set P(N) is equinumerous with [0, 1) and also with R.

Proof. By Example 3.5.9, there exists a one-one function f : P(N) → [0, 1). Let r ∈ (0, 1). Consider the
non-terminating binary representation of r. Denote by Fr the set of positions of 1 in this representation.
Define g : [0, 1) → P(N) by g(0) = ∅, and g(r) = Fr if r ≠ 0. Then g is one-one. Therefore, by
CSB-theorem, P(N) is equinumerous with [0, 1).
The next statement follows as [0, 1) is equinumerous with (0, 1) (see Exercise 3.3.4.1) and (0, 1) is
equinumerous with R (see Practice 3.3.1.4).

Exercise 3.5.11.
1. Let X be a nonempty set. Prove by two methods that there is no injection from P(X) to X;
once by using CSB-theorem and once without using it.
2. Give a bijection from R to R \ Q.
3. Write R as a union of pairwise disjoint sets of size 5.
4. Supply a bijection from (0, 1) to (1, 2) ∪ (3, 4) ∪ (5, 6) ∪ (7, 8) ∪ · · · .
5. Show using CSB-theorem that (0, 1) is equinumerous with (0, 1].

6. Show that (0, 1) is equinumerous with (0, 1) × (0, 1); and that R × R is equinumerous with R.
7. Let A1 , A2 , . . . be an infinite sequence of nonempty sets such that $A_k$ is a proper superset of $A_{k+1}$
for each k ∈ N. Show that A1 is an infinite set.
8. Let X be a set such that f : N → X is an onto function. Then prove that X is countable.
9. Let S be the set of sequences (xn ), with xn ∈ {0, 1, . . . , 9}, for each n ∈ N, such that ‘if $x_k < x_{k+1}$,
then $x_{k+1} = x_{k+2} = \cdots$’. Is S countable?
10. Let S be the set of all decreasing¹ sequences made with natural numbers. Is S countable?
11. Let S be the set of all increasing sequences made with natural numbers. Is S uncountable?
12. Let S be a countable set of points on the unit circle in R2 . Consider the line segments Ls with
one end at the origin and the other end at a point s ∈ S. Fix these lines. We are allowed to
rotate the circle anticlockwise (the lines do not move). Let T be another countable set of points
on the unit circle. Can we rotate the circle by an angle θ so that no line Ls touches any of the
points of T ?
13. A complex number is called algebraic if it is a root of a polynomial equation with integer
coefficients. All other complex numbers are called transcendental.
(a) Show that the set of algebraic numbers is countable.
(b) Show that the set of transcendental numbers is uncountable.

14. Fix an n ∈ N and let Tn be the set of all functions from {1, 2, . . . , n} to N.
(a) Is Tn a countable set?
(b) Is the set $\bigcup_{n=1}^{\infty} T_n$ countable?

15. Let X be the set of all functions from N to N.


(a) Is X uncountable? Justify your answer.
(b) A function f ∈ X is said to be eventually constant if there exist m, N ∈ N such that
f (n) = m for all n ≥ N . Let S ⊆ X be the set of all eventually constant functions. Is S
countable?

¹ A sequence (xn ) is called decreasing if $x_{m+1} \le x_m$ for each m ∈ N; increasing if $x_{m+1} \ge x_m$ for each m ∈ N; strictly
decreasing if $x_{m+1} < x_m$ for each m ∈ N; and strictly increasing if $x_{m+1} > x_m$ for each m ∈ N.
Chapter 4

Elementary Number Theory

4.1 Division algorithm and its applications


In this section, we study some properties of integers. We start with the ‘division algorithm’.

Lemma 4.1.1. [Division algorithm] Let a and b be two integers with b > 0. Then there exist unique
integers q, r such that a = qb + r, where 0 ≤ r < b. The integer q is called the quotient and r, the
remainder.

Proof. Existence: Take S = {a + bx | x ∈ Z} ∩ W. Then a + |a|b ∈ S. Hence, S is a nonempty subset
of W. Therefore, by the well ordering principle, S contains its minimum, say s0 . So, s0 = a + bx0 , for
some x0 ∈ Z. Since s0 ∈ W, s0 ≥ 0.
If s0 ≥ b then 0 ≤ s0 − b = a + b(x0 − 1) ∈ S. This contradicts the minimality of s0 . Hence
0 ≤ s0 < b. Take q = −x0 and r = s0 . Then qb + r = −x0 b + s0 = −x0 b + a + bx0 = a, i.e., we have
obtained q and r such that a = qb + r with 0 ≤ r < b.



Uniqueness: Assume that there exist integers q1 , q2 , r1 and r2 satisfying a = q1 b + r1 , 0 ≤ r1 < b,
a = q2 b+r2 , and 0 ≤ r2 < b. Suppose r1 < r2 . Then 0 < r2 −r1 < b. Notice that r2 −r1 = (q1 −q2 )b. So,
0 < (q1 − q2 )b < b. This is a contradiction since (0, b) does not contain any integer which is a multiple
of b. Similarly, r2 < r1 leads to a contradiction. Therefore, r1 = r2 . Then, 0 = r1 − r2 = (q1 − q2 )b
and b ≠ 0 imply that q1 = q2 .
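
As a quick illustration of the lemma, the quotient and remainder can be computed with floor division. This is a minimal sketch; the function name `divide` is our own, and it relies on the fact that Python's `//` rounds towards negative infinity, which gives 0 ≤ r < b whenever b > 0.

```python
def divide(a, b):
    """Return (q, r) with a == q*b + r and 0 <= r < b, for an integer a and b > 0."""
    if b <= 0:
        raise ValueError("b must be positive")
    q = a // b      # floor division
    r = a - q * b   # hence 0 <= r < b
    return q, r


print(divide(-275, 155))  # (-2, 35)
print(divide(17, 5))      # (3, 2)
```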

Definition 4.1.2. Let a, b ∈ Z with b ≠ 0. If a = bc, for some c ∈ Z then b is said to divide a and
we write b|a (read as ‘b divides a’). When b|a, we also say that b is a divisor of a, and that a is a
multiple of b.

Remark 4.1.3. Let a be a nonzero integer. If b is a positive divisor of a, then 1 ≤ b ≤ |a|. Hence the
set of all positive divisors of a nonzero integer is a nonempty finite set.
Further, if a is a positive integer and b is a positive divisor of a, then a = kb for some k ∈ N so
that b ≤ a. It then follows that if a, b ∈ N such that a|b and b|a, then a = b.

Definition 4.1.4. 1. Let a and b be two nonzero integers. Then the set S of their common positive
divisors is nonempty and finite. Thus, S contains its greatest element. This element is called
the greatest common divisor of a and b and is denoted by gcd(a, b). The gcd is also called
the highest common factor.

2. An integer a is said to be relatively prime to an integer b if gcd(a, b) = 1. In this case, we also
say that the integers a and b are coprime.

61
62 CHAPTER 4. ELEMENTARY NUMBER THEORY

The next result is often stated as ‘the gcd(a, b) is a linear combination of a and b’.

Theorem 4.1.5. [Bézout’s identity] Let a and b be two nonzero integers and let d = gcd(a, b).
Then there exist integers x0 , y0 such that d = ax0 + by0 .

Proof. Consider the set S = {ax + by : x, y ∈ Z} ∩ N. Then, either a ∈ S or −a ∈ S. Thus, S is a
nonempty subset of N. By the well ordering principle, S contains its least element, say d. As d ∈ S,
we have d = ax0 + by0 , for some x0 , y0 ∈ Z. We show that d = gcd(a, b).
By the division algorithm, there exist integers q and r such that a = dq + r, with 0 ≤ r < d. If
r > 0, then

r = a − dq = a − q(ax0 + by0 ) = a(1 − qx0 ) + b(−qy0 ) ∈ {ax + by : x, y ∈ Z}.

In this case, r is a positive integer in S which is strictly less than d. This contradicts the choice of d
as the least element of S. Thus, r = 0. Consequently, d|a. Similarly, d|b. Hence d ≤ gcd(a, b).
Now, gcd(a, b)|a and gcd(a, b)|b. Since d = ax0 + by0 for some x0 , y0 ∈ Z, we have gcd(a, b)|d.
That is, d = k × gcd(a, b) for some integer k. However, both gcd(a, b) and d are positive. Thus k is a
positive integer. Hence d ≥ gcd(a, b).
Therefore, d = gcd(a, b).

We prove three useful corollaries to Bézout’s identity.

Corollary 4.1.6. Let a, b ∈ Z and let d ∈ N. Then, d = gcd(a, b) if and only if d|a, d|b, and each
common divisor of a and b divides d.

Proof. Suppose d = gcd(a, b). Then d|a and d|b. By Bézout’s identity, d = ak + bm for some k, m ∈ Z.
Thus, any common divisor of a and b divides d = gcd(a, b).

Conversely, suppose d|a, d|b and each common divisor of a and b divides d. Since d is a common
divisor of a and b, by what we have just proved, d| gcd(a, b). Further, gcd(a, b) is a common divisor of
a and b; so, by assumption gcd(a, b)|d. By Remark 4.1.3, d = gcd(a, b).

Corollary 4.1.7. Let a, b be nonzero integers. Then gcd(a, b) = 1 if and only if there exist integers
x0 and y0 such that ax0 + by0 = 1.

Proof. If gcd(a, b) = 1, then by Bézout’s identity, there exist integers x0 and y0 such that ax0 +by0 = 1.
Conversely, suppose there exist integers x0 and y0 such that ax0 + by0 = 1. If gcd(a, b) = k, then k is
a positive integer such that k|1. It follows that k ≤ 1; consequently, k = 1.

Corollary 4.1.8. Let n1 , . . . , nk be positive integers which are pairwise coprime. If a ∈ Z is such
that n1 |a, . . . , nk |a, then n1 · · · nk |a.

Proof. That the positive integers n1 , . . . , nk are pairwise coprime means that if i ≠ j, then gcd(ni , nj ) = 1.
Let a ∈ Z be such that n1 |a, . . . , nk |a. We show by induction that n1 · · · nk |a. For k = 2, it is given
that n1 |a, n2 |a and gcd(n1 , n2 ) = 1. By Bézout’s identity, there exist x, y ∈ Z such that n1 x+n2 y = 1.
Multiplying by a, we have a = an1 x + an2 y = n1 n2 (x(a/n2 ) + y(a/n1 )).
Since n2 |a and n1 |a, we see that a/n2 , a/n1 ∈ Z so that x(a/n2 ) + y(a/n1 ) ∈ Z. Hence n1 n2 |a.
Assume the induction hypothesis that the statement is true for k = m. Let each of $n_1, \ldots, n_{m+1}$
divide a and let them be pairwise coprime. Let n1 · · · nm = ℓ. Then $\gcd(\ell, n_{m+1}) = 1$. By the
induction hypothesis, ℓ|a. By the basis case (k = 2, as proved), we conclude that $\ell\, n_{m+1} \mid a$. That is,
$n_1 \cdots n_{m+1} \mid a$.

The division algorithm helps to algorithmically compute the greatest common divisor of two
nonzero integers, by a procedure commonly known as Euclid’s algorithm.
Let a and b be nonzero integers. By the division algorithm, there exist integers q and r with
0 ≤ r < |b| such that a = |b|q + r. We apply our observation that a common divisor of two integers
divides their gcd.
Now, gcd(|b|, r) divides both |b| and r; hence it divides a. Again, gcd(|b|, r) divides both a and |b|.
Hence gcd(|b|, r)| gcd(a, |b|).
Similarly, with r = a − |b|q, we see that gcd(a, |b|) divides both a and |b|; hence gcd(a, |b|)|r.
Consequently, gcd(a, |b|)| gcd(|b|, r).
Further, the gcd of any two integers is positive. Thus, gcd(a, b) = gcd(a, |b|). So, we obtain

gcd(a, b) = gcd(a, |b|) = gcd(|b|, r).

Euclid’s algorithm applies this idea repeatedly to find the greatest common divisor of two given
nonzero integers, which we now present.

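Euclid’s algorithm
Input: Two nonzero integers a and b, with b > 0 (if b < 0, replace b by |b|, since gcd(a, b) = gcd(a, |b|)); Output: gcd(a, b).

$$\begin{aligned}
a &= b\, q_0 + r_0 && \text{with } 0 \le r_0 < b \\
b &= r_0\, q_1 + r_1 && \text{with } 0 \le r_1 < r_0 \\
r_0 &= r_1\, q_2 + r_2 && \text{with } 0 \le r_2 < r_1 \\
r_1 &= r_2\, q_3 + r_3 && \text{with } 0 \le r_3 < r_2 \\
&\;\;\vdots \\
r_{\ell-1} &= r_\ell\, q_{\ell+1} + r_{\ell+1} && \text{with } 0 \le r_{\ell+1} < r_\ell \\
r_\ell &= r_{\ell+1}\, q_{\ell+2}. \\
\gcd(a, b) &= r_{\ell+1}
\end{aligned}$$

The process terminates because b > r0 > r1 > · · · is a strictly decreasing sequence of non-negative
integers; in particular, at most b − 1 of the remainders are nonzero. Also, note that $r_{\ell+1}$ can be expressed in
the form $r_{\ell+1} = a x_0 + b y_0$ for integers x0 , y0 using backtracking. That is,

$$r_{\ell+1} = r_{\ell-1} - r_\ell\, q_{\ell+1} = r_{\ell-1} - q_{\ell+1}(r_{\ell-2} - r_{\ell-1} q_\ell) = r_{\ell-1}(1 + q_{\ell+1} q_\ell) - q_{\ell+1} r_{\ell-2} = \cdots .$$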
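
The sequence of divisions in Euclid's algorithm can be sketched as a short loop. This is an illustration under our own naming (`euclid_steps` is hypothetical); it replaces b by |b| first, as discussed above, and records each division so that the rows can be compared with a hand computation.

```python
def euclid_steps(a, b):
    """Run Euclid's algorithm on a and |b|.

    Returns (gcd, steps), where each step is (dividend, divisor, q, r)
    with dividend == divisor * q + r and 0 <= r < divisor.
    """
    if a == 0 or b == 0:
        raise ValueError("both inputs must be nonzero")
    x, y = a, abs(b)
    steps = []
    while y != 0:
        q, r = divmod(x, y)  # 0 <= r < y because y > 0
        steps.append((x, y, q, r))
        x, y = y, r
    return x, steps


g, steps = euclid_steps(-275, 155)
for dividend, divisor, q, r in steps:
    print(f"{dividend} = {divisor}*({q}) + {r}")
print("gcd =", g)  # gcd = 5
```

The last nonzero remainder is the gcd; the loop stops as soon as a remainder 0 appears.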

Example 4.1.9. We apply Euclid’s algorithm for computing gcd(155, −275) as follows.

−275 = (−2) · 155 + 35 (so, gcd(−275, 155) = gcd(155, 35))


155 = 4 · 35 + 15 (so, gcd(155, 35) = gcd(35, 15))
35 = 2 · 15 + 5 (so, gcd(35, 15) = gcd(15, 5))
15 = 3·5 (so, gcd(15, 5) = 5).

To write 5 = gcd(155, −275) in the form 155x0 + (−275)y0 , notice that

5 = 35 − 2 · 15 = 35 − 2(155 − 4 · 35) = 9 · 35 − 2 · 155 = 9(−275 + 2 · 155) − 2 · 155 = 9 · (−275) + 16 · 155.

Also, note that 275 = 5 · 55 and 155 = 5 · 31 and thus, 5 = (9 + 31x) · (−275) + (16 + 55x) · 155, for all
x ∈ Z. Therefore, we see that there are infinitely many choices for the pair $(x, y) \in \mathbb{Z}^2$ for which
d = ax + by.
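
The backtracking used in Example 4.1.9 can be folded into the division loop itself, which is usually called the extended Euclidean algorithm. The sketch below is one common iterative formulation, not necessarily the one intended in these notes; the function name `extended_gcd` is our own.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) > 0 and a*x + b*y == g, for nonzero a, b."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    # Invariant: old_r == a*old_x + b*old_y and r == a*x + b*y.
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if old_r < 0:  # normalise so that the gcd is positive
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y


g, x, y = extended_gcd(155, -275)
assert g == 5 and 155 * x + (-275) * y == 5
print(g, x, y)  # 5 16 9

# Shifting by multiples of (55, 31) gives further pairs, as noted in the example.
for t in range(-3, 4):
    assert 155 * (16 + 55 * t) + (-275) * (9 + 31 * t) == 5
```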
Exercise 4.1.10. 1. Let a, b ∈ N with gcd(a, b) = d. Then gcd(a/d, b/d) = 1.
2. Prove that the system 15x + 12y = b has a solution for x, y ∈ Z if and only if 3 divides b.

3. [Linear Diophantine equation] Let a, b, c ∈ Z \ {0}. Then the linear system ax + by = c, in
the unknowns x, y ∈ Z has a solution if and only if gcd(a, b) divides c. Furthermore, determine
all pairs (x, y) ∈ Z × Z such that ax + by is indeed c.
4. Prove that gcd(a, bc) = 1 if and only if gcd(a, b) = 1 and gcd(a, c) = 1, for any three nonzero
integers a, b and c.
5. Euclid’s algorithm can sometimes be applied to check whether two numbers which are dependent
on an unknown integer n, are relatively prime or not. For example, we can use the algorithm to
prove that gcd(2n + 3, 5n + 7) = 1 for every n ∈ Z.
6. Suppose a milkman has only 3 cans of sizes 7, 9 and 16 liters. What is the minimum number of
operations required to deliver 1 liter of milk to a customer? Explain.

To proceed further, we need the following definitions.


Definition 4.1.11. 1. The integer 1 is called the unity (or the unit element) of Z.
2. An integer p > 1 is called a prime, if p has exactly two positive divisors, namely, 1 and p.
3. An integer r > 1 is called composite if r is not a prime.

We are now ready to prove an important result that helps us in proving the fundamental theorem
of arithmetic.

Lemma 4.1.12. [Euclid’s Lemma] Let a, b ∈ Z and let p be a prime. If p|ab then p|a or p|b.

Proof. Suppose p|ab. If p|a, then there is nothing to prove. So, assume that p ∤ a. As p is a prime,
gcd(p, a) = 1. Thus there exist integers x, y such that 1 = ax + py. Then b = abx + pby. Since p|ab
and p|pb, we see that p|b.

One also has the following result.



Proposition 4.1.13. Let a, b, n ∈ Z be such that n|ab. If gcd(n, a) = 1, then n|b.

Proof. Suppose gcd(n, a) = 1. There exist x0 , y0 ∈ Z such that nx0 + ay0 = 1. Then b = aby0 + nbx0 .
Since n|ab and n|nb, we have n|b.

Now, we are ready to prove the fundamental theorem of arithmetic that states that ‘every positive
integer greater than 1 is either a prime or is a product of primes. This product is unique, except for
the order in which the prime factors appear’.

Theorem 4.1.14. [Fundamental theorem of arithmetic] Let n ∈ N with n ≥ 2. Then there exist
prime numbers p1 > p2 > · · · > pk and positive integers s1 , s2 , . . . , sk such that $n = p_1^{s_1} p_2^{s_2} \cdots p_k^{s_k}$, for
some k ≥ 1. Moreover, if n also equals $q_1^{t_1} q_2^{t_2} \cdots q_\ell^{t_\ell}$, for distinct primes q1 > q2 > · · · > qℓ and positive
integers t1 , t2 , . . . , tℓ then k = ℓ and for each i ∈ {1, . . . , k}, pi = qi and si = ti .

Proof. See Example 2.2.6 for a proof.

Theorem 4.1.15. [Euclid: Infinitude of primes] The number of primes is infinite.

Proof. On the contrary assume that the number of primes is finite, say p1 = 2, p2 = 3, . . . , pk . Consider
the positive integer N = p1 p2 · · · pk + 1. We see that none of the primes p1 , p2 , . . . , pk divides N . This
contradicts Theorem 4.1.14.
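
A small numerical illustration of the argument: starting from the primes 2, 3, 5, 7, 11, 13, the number N = p1 p2 · · · pk + 1 leaves remainder 1 on division by each of them, so its prime factors lie outside the list. The sketch below is ours (the helper `smallest_prime_factor` is hypothetical); note that N itself need not be prime.

```python
from math import prod


def smallest_prime_factor(n):
    """Smallest prime factor of an integer n >= 2, found by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n


primes = [2, 3, 5, 7, 11, 13]
N = prod(primes) + 1
print(N)                         # 30031
print([N % p for p in primes])   # [1, 1, 1, 1, 1, 1]
print(smallest_prime_factor(N))  # 59, a prime that is not in the list
```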

Proposition 4.1.16. [Primality testing] Let n ∈ N with n ≥ 2. If no prime p ≤ √n divides n, then
n is prime.

Proof. We prove the contrapositive. Suppose n is composite, say n = xy, for 2 ≤ x, y < n. Then, either
x ≤ √n or y ≤ √n, since otherwise xy > n. Without loss of generality,
assume x ≤ √n. If x is a prime, we are done. Else, take a prime divisor p of x. Now, p ≤ √n and p
divides n.
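
The proposition suggests a simple test by trial division. The sketch below is illustrative only; for simplicity it tries every integer d with 2 ≤ d ≤ √n rather than only the primes in that range, which gives the same answer at the cost of some extra divisions. The name `is_prime` is our own.

```python
from math import isqrt


def is_prime(n):
    """Return True when n >= 2 has no divisor d with 2 <= d <= sqrt(n)."""
    if n < 2:
        return False
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            return False
    return True


print([n for n in range(2, 30) if is_prime(n)])
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```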

Exercise 4.1.17. 1. Prove that there are infinitely many primes of the form 4n − 1.
2. Fix N ∈ N, N ≥ 2. Then, there exists a consecutive set of N natural numbers that are composite.

Definition 4.1.18. The least common multiple of integers a and b, denoted as lcm(a, b), is the
smallest positive integer that is a multiple of both a and b.

Lemma 4.1.19. Let a, b ∈ Z and let ℓ ∈ N. Then, ℓ = lcm(a, b) if and only if a|ℓ, b|ℓ and ℓ divides
each common multiple of a and b.

Proof. Let ℓ = lcm(a, b). Clearly, a|ℓ and b|ℓ. Let x be a common multiple of both a and b. If ℓ ∤ x,
then by the division algorithm, x = ℓ · q + r for some integer q and some r with 0 < r < ℓ. Notice that
a|x and a|ℓ. So, a|r. Similarly, b|r. That is, r is a positive common multiple of both a and b which is
less than lcm(a, b). This is a contradiction. Hence, ℓ = lcm(a, b) divides each common multiple of a
and b.
Conversely, suppose a|ℓ, b|ℓ and ℓ divides each common multiple of a and b. By what we have
just proved, lcm(a, b)|ℓ. Further, lcm(a, b) is a common multiple of a and b. Thus ℓ| lcm(a, b). By
Remark 4.1.3, we conclude that ℓ = lcm(a, b).

Theorem 4.1.20. Let a, b ∈ N. Then gcd(a, b) · lcm(a, b) = ab. In particular, lcm(a, b) = ab if and
only if gcd(a, b) = 1.

Proof. Let d = gcd(a, b). Then a = a1 d and b = b1 d for some a1 , b1 ∈ N. Further,

ab = a1 d b1 d = (a1 b1 d) · gcd(a, b).

Thus, it is enough to show that lcm(a, b) = a1 b1 d.
Towards this, notice that a1 b1 d = ab1 = a1 b, that is, a|a1 b1 d and b|a1 b1 d. Let c ∈ N be any
common multiple of a and b. Then c/a, c/b ∈ Z. Further, by Bézout’s identity, d = as + bt for some
s, t ∈ Z. So,
$$\frac{c}{a_1 b_1 d} = \frac{cd}{(a_1 d) \cdot (b_1 d)} = \frac{c(as + bt)}{ab} = \frac{c}{b}\, s + \frac{c}{a}\, t \in \mathbb{Z}.$$
Hence a1 b1 d|c. That is, a1 b1 d divides each common multiple of a and b. By Lemma 4.1.19,
a1 b1 d = lcm(a, b).
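
Theorem 4.1.20 gives a practical way to compute the lcm once the gcd is known. A minimal sketch for positive integers, using `math.gcd` from the Python standard library (the function name `lcm` below is our own definition, written out so that the formula is visible):

```python
from math import gcd


def lcm(a, b):
    """Least common multiple of positive integers a and b, via gcd(a, b) * lcm(a, b) = a * b."""
    return a * b // gcd(a, b)


print(lcm(155, 275))  # 8525
print(lcm(4, 9))      # 36, which equals 4 * 9 since gcd(4, 9) = 1
```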

4.2 Modular arithmetic


Definition 4.2.1. Fix a positive integer n. Let a, b ∈ Z. If n divides a−b, we say that a is congruent
to b modulo n, and write a ≡ b (mod n).

Example 4.2.2. 1. Notice that 2|(2k − 2m) and also 2|[(2k − 1) − (2m − 1)]. Therefore, any two
even integers are congruent modulo 2; and any two odd integers are congruent modulo 2.
2. The numbers ±10 and 22 are congruent modulo 4 as 4|(22 − 10) and 4|(22 − (−10)).
3. Let n be a fixed positive integer. Recall the notation [n − 1] := {0, 1, 2, . . . , n − 1}.
(a) Then, by the division algorithm, for any a ∈ Z there exists a unique b ∈ [n − 1] such that
a ≡ b (mod n). The number b is called the residue of a modulo n.

(b) Further, $\mathbb{Z} = \bigcup_{a=0}^{n-1} \{a + kn : k \in \mathbb{Z}\}$, i.e., every integer is congruent to an element of [n − 1].
The set [n − 1] is taken as the standard representative for the set of residue classes
modulo n.

Theorem 4.2.3. Fix n ∈ N, and let a, b, c, d ∈ Z. Then the following are true:
1. If a ≡ b (mod n) and b ≡ c (mod n), then a ≡ c (mod n).
2. If a ≡ b (mod n), then a + c ≡ b + c (mod n), a − c ≡ b − c (mod n) and ac ≡ bc (mod n).
3. If a ≡ b (mod n) and c ≡ d (mod n), then a + c ≡ b + d (mod n), a − c ≡ b − d (mod n) and
ac ≡ bd (mod n). In particular, a ≡ b (mod n) implies $a^m \equiv b^m \pmod{n}$ for all m ∈ N.
4. If ac ≡ bc (mod n) for nonzero a, b, c, and d = gcd(c, n), then a ≡ b (mod n/d). In particular,
if ac ≡ bc (mod n) for nonzero a, b, c, and gcd(c, n) = 1, then a ≡ b (mod n).

Proof. We will only prove two parts. The readers should supply the proof of other parts.
3. Note that ac − bd = ac − bc + bc − bd = c(a − b) + b(c − d). Thus, n|ac − bd, whenever n|a − b
and n|c − d. In particular, taking c = a and d = b and repeatedly applying the above result, one has
$a^m \equiv b^m \pmod{n}$, for all m ∈ N.
4. Let gcd(c, n) = d. Then, there exist nonzero c1 , n1 ∈ Z with c = c1 d, n = n1 d and gcd(c1 , n1 ) = 1.
Then n|ac − bc implies n1 d|c1 d(a − b), which implies n1 |c1 (a − b). By Proposition 4.1.13, n1 |a − b,
i.e., (n/ gcd(c, n)) | a − b.

Example 4.2.4. 1. Note that 3 · 9 + 13 · (−2) ≡ 1 (mod 13). If x satisfies 9x ≡ 4 (mod 13) then

x ≡ x · 1 ≡ x · (3 · 9 + 13 · (−2))    as 3 · 9 + 13 · (−2) ≡ 1 (mod 13)
  ≡ 3 · 9x                             as 13 ≡ 0 (mod 13)
  ≡ 3 · 4                              as 9x ≡ 4 (mod 13)
  ≡ 12 (mod 13).

To verify, if x ≡ 12 (mod 13), then 9x ≡ 108 ≡ (13 × 8 + 4) ≡ 4 (mod 13). Therefore, the
congruence equation 9x ≡ 4 (mod 13) has solution x ≡ 12 (mod 13).
2. Verify that 9 · (−5) + 23 · (2) = 1. Hence, the equation 9x ≡ 1 (mod 23) has the solution

x ≡ x · 1 ≡ x (9 · (−5) + 23 · (2)) ≡ (−5) · (9x) ≡ −5 × 1 ≡ 18 (mod 23).

3. Verify that the equation 3x ≡ 15 (mod 30) has solutions x = 5, 15, 25; whereas the equation
7x ≡ 15 (mod 30) has only one solution x = 15; and that the equation 3x ≡ 5 (mod 30) has no
solution.

Theorem 4.2.5. [Linear Congruence] Let n be a positive integer and let a, b be nonzero integers.
Then the congruence equation ax ≡ b (mod n) has at least one solution if and only if gcd(a, n)|b.
Moreover, if d = gcd(a, n)|b, then ax ≡ b (mod n) has exactly d solutions r1 , . . . , rd ∈
{0, 1, 2, . . . , n − 1}, where ri ≡ rj (mod n/d) for all i, j = 1, 2, . . . , d.

Proof. Write d = gcd(a, n). Let x0 be a solution of ax ≡ b (mod n). Then, by definition, ax0 −b = nq,
for some q ∈ Z. Thus, b = ax0 − nq. Since d|a and d|n, we have d|ax0 − nq = b.
Conversely, suppose d|b. Then, b = b1 d, for some b1 ∈ Z. By Bézout’s identity, there exist
x0 , y0 ∈ Z such that ax0 + ny0 = d. Hence,

a(x0 b1 ) ≡ b1 (ax0 ) ≡ b1 (ax0 + ny0 ) ≡ b1 d ≡ b (mod n).



That is, x0 b1 is a solution of ax ≡ b (mod n). This proves the first statement.
To proceed further, assume that d|b. By what we have just proved, there exists a solution x1 of
ax ≡ b (mod n). By the division algorithm, there exist p, r ∈ Z with 0 ≤ r < n such that x1 = pn + r.
Now, ar ≡ a(x1 − pn) ≡ ax1 ≡ b (mod n). Thus, r is also a solution of ax ≡ b (mod n), i.e., there
exists r ∈ {0, 1 . . . , n − 1} satisfying ar ≡ b (mod n).
If x2 ∈ {0, 1, . . . , n − 1} is any other solution of ax ≡ b (mod n), then ax2 ≡ b ≡ ar (mod n).
Thus, by Theorem 4.2.3.4, x2 ≡ r (mod n/d). Conversely, if x2 ≡ r (mod n/d), then x2 = r + m(n/d)
for some m ∈ Z. Then ax2 = ar + am(n/d) = ar + mn(a/d). As d|a, the number a/d is an integer.
Hence, ax2 ≡ ar (mod n) so that x2 is a solution of ax ≡ b (mod n).
Therefore, the solutions of ax ≡ b (mod n) in {0, 1, . . . , n − 1} are exactly the integers of the form
r + k(n/d), k ∈ Z, that lie in this set.
However, there are exactly d integers in {0, 1, . . . , n − 1} which are congruent to r modulo
(n/d). Hence there are exactly d solutions of ax ≡ b (mod n) in {0, 1, . . . , n − 1}.
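
The statement of Theorem 4.2.5 translates directly into a small solver. The sketch below is illustrative (the name `solve_linear_congruence` is hypothetical): it finds one solution r in {0, . . . , n/d − 1} by direct search, which is adequate for small n, and then lists the d solutions r + k(n/d).

```python
from math import gcd


def solve_linear_congruence(a, b, n):
    """Return all x in {0, ..., n-1} with a*x ≡ b (mod n), in increasing order."""
    d = gcd(a, n)
    if b % d != 0:
        return []  # no solution unless gcd(a, n) divides b
    step = n // d
    # Exactly one solution lies in {0, ..., n/d - 1}; find it by direct search.
    r = next(x for x in range(step) if (a * x - b) % n == 0)
    return [r + k * step for k in range(d)]


print(solve_linear_congruence(9, 4, 13))   # [12]
print(solve_linear_congruence(3, 15, 30))  # [5, 15, 25]
print(solve_linear_congruence(7, 15, 30))  # [15]
print(solve_linear_congruence(3, 5, 30))   # []
```

These outputs agree with the congruences worked out in Example 4.2.4.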

Remark 4.2.6. Observe that a solution of the congruence ax ≡ b (mod n) is a number in {0, 1, . . . , n − 1}.
This set is not to be confused with the congruence class [n − 1]. When d = gcd(a, n), we may write
the distinct solutions in [n − 1] in increasing order as r1 = r, r2 = r + n/d, r3 = r + 2n/d, . . . , rd =
r + (d − 1)n/d. It means that the solutions are xi ≡ ri (mod n) for i = 1, 2, . . . , d.

Exercise 4.2.7. 1. Complete the proof of Theorem 4.2.3.


2. Determine the solutions of the system 3x ≡ 5 (mod 65).
3. Determine the solutions of the system 5x ≡ 95 (mod 100).
4. Prove that the system 3x ≡ 4 (mod 28) is equivalent to the system x ≡ 20 (mod 28).
5. Consider the congruence pair 3x ≡ 4 (mod 28) and 4x ≡ 2 (mod 27).
(a) Prove that the given pair is equivalent to the pair x ≡ 20 (mod 28) and x ≡ 14 (mod 27).
(b) Prove that solving the congruence pair in (a) is equivalent to solving one of the congruences
20 + 28k ≡ 14 (mod 27) or 14 + 27k ≡ 20 (mod 28) for the unknown quantity k.
(c) Verify that k = 21 is the solution for the first case in (b) and k = 22 for the second case.
(d) Conclude that x = 20 + 28 · 21 = 14 + 27 · 22 is a solution for the given congruence pair.
6. Prove that if p is a prime, then $p \mid C(p, k) := \frac{p!}{k!\,(p-k)!}$ for 1 ≤ k ≤ p − 1.
7. Let p be a prime. Write Zp := {0, 1, 2, . . . , p − 1} and Z∗p := {1, 2, . . . , p − 1} = Zp \ {0}. Show
that Zp has the following properties:
(a) For all a, b ∈ Zp , a + b (mod p) ∈ Zp .
(b) For all a, b ∈ Zp , a + b ≡ b + a (mod p).
(c) For all a, b, c ∈ Zp , a + (b + c) ≡ (a + b) + c (mod p).
(d) For all a ∈ Zp , a + 0 ≡ a (mod p).
(e) For all a ∈ Zp , a + (p − a) ≡ 0 (mod p).
(f ) For all a, b ∈ Z∗p , a · b (mod p) ∈ Z∗p .
(g) For all a, b ∈ Z∗p , a · b ≡ b · a (mod p).
(h) For all a, b, c ∈ Z∗p , a · (b · c) ≡ (a · b) · c (mod p).
(i) For all a ∈ Z∗p , a · 1 ≡ a (mod p).
(j) For each a ∈ Z∗p , there exists b ∈ Z∗p such that a · b ≡ 1 (mod p).

(k) For all a, b, c ∈ Zp , a · (b + c) ≡ (a · b) + (a · c) (mod p).

Any nonempty set containing at least two elements such as 0 and 1, in which ‘addition’ and
‘multiplication’ can be defined in such a way that the above properties are satisfied, is called a
field. So, Zp = {0, 1, 2, . . . , p − 1} is an example of a field. The well known examples of fields
are:
(a) Q, the set of rational numbers.
(b) R, the set of real numbers.
(c) C, the set of complex numbers.

8. Let p be an odd prime. Prove the following:


(a) The equation $x^2 \equiv 1 \pmod{p}$ has exactly two solutions in Zp .
(b) Corresponding to any a ∈ {2, 3, . . . , p − 2}, if there exists b ∈ Z∗p such that a · b ≡ 1 (mod p),
then b ∈ {2, 3, . . . , p − 2} and b ≠ a.
(c) If a, b, c, d ∈ {2, 3, . . . , p − 2} satisfy a ≠ c, a · b ≡ 1 (mod p) and c · d ≡ 1 (mod p), then
b ≠ d.
(d) Let p > 3. Write q = (p − 3)/2. There exist two-element sets {a1 , b1 }, {a2 , b2 }, . . . , {aq , bq }
that are pairwise disjoint satisfying ai · bi ≡ 1 (mod p) for 1 ≤ i ≤ q, and
$\bigcup_{i=1}^{q} \{a_i, b_i\} = \{2, 3, \ldots, p-2\}$.
(e) If p > 3, then 2 · 3 · · · · · (p − 2) ≡ 1 (mod p).

9. [Wilson’s Theorem] If p is any prime, then (p − 1)! ≡ −1 (mod p).



10. [Primality Testing] Any integer n > 1 is a prime if and only if (n − 1)! ≡ −1 (mod n).

4.3 Chinese Remainder Theorem


Theorem 4.3.1. [Chinese remainder theorem] Fix a positive integer m. Let n1 , n2 , . . . , nm be
pairwise coprime positive integers. Write M = n1 n2 · · · nm . Then, the system of congruences

x ≡ a1 (mod n1 )
x ≡ a2 (mod n2 )
⋮
x ≡ am (mod nm )

has a unique solution modulo M .


Proof. For 1 ≤ k ≤ m, define Mk = M/nk . Then, gcd(Mk , nk ) = 1 and hence there exist integers xk , yk
such that Mk xk + nk yk = 1 for 1 ≤ k ≤ m. Let 1 ≤ i, j ≤ m. Then

Mi xi ≡ Mi xi + ni yi ≡ 1 (mod ni ); i ≠ j ⇒ ni |Mj ⇒ Mj xj ≡ 0 (mod ni ).

Now, $x_0 := \sum_{k=1}^{m} M_k x_k a_k \equiv M_i x_i a_i \equiv 1 \cdot a_i \equiv a_i \pmod{n_i}$. That is, x0 is a solution to the given
system of congruences.
If y0 is any solution to the system of congruences, then for each integer k with 1 ≤ k ≤ m, we have
y0 ≡ ak (mod nk ) so that y0 − x0 ≡ ak − ak ≡ 0 (mod nk ). Since n1 , . . . , nm are pairwise coprime and
their product is M , Corollary 4.1.8 implies that y0 − x0 ≡ 0 (mod M ). Therefore, x0 is the unique
solution of the system of congruences modulo M .

Example 4.3.2. Consider the system of congruences x ≡ 20 (mod 28) and x ≡ 14 (mod 27) in
Exercise 4.2.7.5. In this case, a1 = 20, a2 = 14, n1 = 28 and n2 = 27 so that M = 28 · 27 = 756, M1 =
27 and M2 = 28. Then, x1 = −1 and x2 = 1 show that M1 x1 + M2 x2 = 27 · (−1) + 28 · 1 = 1. Hence

x0 = 27 · (−1) · 20 + 28 · 1 · 14 ≡ −540 + 392 ≡ −148 ≡ 608 (mod 756).
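The construction used in the proof of Theorem 4.3.1 can be sketched in Python as follows. The names extended_gcd and crt are ours; the integers xk are obtained here from the extended Euclidean algorithm, which is one standard way (an assumption on our part, not something the proof prescribes) of producing xk , yk with Mk xk + nk yk = 1.

```python
from math import prod

def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y

def crt(residues, moduli):
    """Return the solution x0 in {0, ..., M-1} of x ≡ a_k (mod n_k) for all k."""
    M = prod(moduli)
    x0 = 0
    for a_k, n_k in zip(residues, moduli):
        M_k = M // n_k
        g, x_k, _ = extended_gcd(M_k, n_k)   # M_k*x_k + n_k*y_k == g
        if g != 1:
            raise ValueError("moduli must be pairwise coprime")
        x0 += M_k * x_k * a_k                # the k-th term of the sum in the proof
    return x0 % M

print(crt([20, 14], [28, 27]))  # 608, as in Example 4.3.2
```

Reducing modulo M at the end picks the representative in {0, 1, . . . , M − 1} of the unique solution class.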

Exercise 4.3.3. 1. Find the smallest positive integer which when divided by 4 leaves a remainder
1 and when divided by 9 leaves a remainder 2.
2. Find the smallest positive integer which when divided by 8 leaves a remainder 4 and when divided
by 15 leaves a remainder 10.
3. Does there exist a positive integer n such that n ≡ 4 (mod 14) and n ≡ 6 (mod 18)? Give
reasons for your answer. What if we replace 6 or 4 with an odd number?

4. Let n be a positive integer. Show that the set Zn := {0, 1, 2, . . . , n−1} has the following properties:

(a) For all a, b ∈ Zn , a + b (mod n) ∈ Zn .
(b) For all a, b ∈ Zn , a + b ≡ b + a (mod n).
(c) For all a, b, c ∈ Zn , a + (b + c) ≡ (a + b) + c (mod n).
(d) For all a ∈ Zn , a + 0 ≡ a (mod n).
(e) For all a ∈ Zn , a + (n − a) ≡ 0 (mod n).
(f ) For all a, b ∈ Zn , a · b (mod n) ∈ Zn .
(g) For all a, b ∈ Zn , a · b ≡ b · a (mod n).
(h) For all a, b, c ∈ Zn , a · (b · c) ≡ (a · b) · c (mod n).
(i) For all a ∈ Zn , a · 1 ≡ a (mod n).
(j) For all a, b, c ∈ Zn , a · (b + c) ≡ (a · b) + (a · c) (mod n).

Any set, say R, with 0, 1 ∈ R, 0 ≠ 1, in which ‘addition’ and ‘multiplication’ can be defined in
such a way that the above properties are satisfied, is called a commutative ring with unity.
So, Zn = {0, 1, 2, . . . , n − 1} is an example of a commutative ring with unity. The well known
examples of commutative rings with unity are:
(a) Z, the set of integers.
(b) Q, the set of rational numbers.
(c) R, the set of real numbers.
(d) C, the set of complex numbers.

5. Let m and n be two coprime positive integers. By Exercise 4.3.3.4, the sets Zm , Zn , and Zmn are
commutative rings with unity. Now, define addition and multiplication in Zm × Zn component-
wise. Also, define the function

f : Zmn → Zm × Zn by f (x) = (x (mod m), x (mod n)) for all x ∈ Zmn .

Then, prove the following:


(a) Zm × Zn is a commutative ring with unity. What are the 0 and 1 here?
(b) For all x, y ∈ Zmn , f (x + y) = f (x) + f (y).

(c) For all x, y ∈ Zmn , f (x · y) = f (x) · f (y).


(d) For each (a, b) ∈ Zm ×Zn there exists a unique x ∈ Zmn such that x ≡ a (mod m) and x ≡ b
(mod n).
(e) |Zm × Zn | = |Zmn | = mn.

Such a function f is called a ring isomorphism, and thus, the two rings Zm × Zn and Zmn are
isomorphic.
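For small coprime m and n, parts (b), (c) and (d) of the last exercise can be verified exhaustively. The sketch below is only a numerical check and not a proof; the helper name is_ring_isomorphism is ours.

```python
def f(x, m, n):
    """The map f : Z_mn -> Z_m x Z_n, f(x) = (x mod m, x mod n)."""
    return (x % m, x % n)

def is_ring_isomorphism(m, n):
    """Check exhaustively that f is a bijection respecting + and ·."""
    mn = m * n
    elements = range(mn)
    # (d): f is a bijection exactly when it takes mn distinct values.
    bijective = len({f(x, m, n) for x in elements}) == mn
    for x in elements:
        for y in elements:
            (a1, b1), (a2, b2) = f(x, m, n), f(y, m, n)
            # (b), (c): addition and multiplication of pairs are component-wise.
            if f((x + y) % mn, m, n) != ((a1 + a2) % m, (b1 + b2) % n):
                return False
            if f((x * y) % mn, m, n) != ((a1 * a2) % m, (b1 * b2) % n):
                return False
    return bijective

print(is_ring_isomorphism(4, 9))  # True: 4 and 9 are coprime
print(is_ring_isomorphism(4, 6))  # False: gcd(4, 6) = 2, so f is not injective
```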

Chapter 5

Combinatorics - I

Combinatorics can be traced back more than 3000 years to India and China. For many centuries,
it primarily comprised the solving of problems relating to the permutations and combinations of
objects. The use of the word “combinatorial” can be traced back to Leibniz’s dissertation on
the art of combinatorics in 1666. Over the centuries, combinatorics evolved through recreational pastimes.
These include the Königsberg bridges problem, the four-colour map problem, the Tower of Hanoi, the
birthday paradox and Fibonacci’s ‘rabbits’ problem. In the modern era, the subject has developed
both in depth and variety and has cemented its position as an integral part of modern mathematics.
Undoubtedly part of the reason for this importance has arisen from the growth of computer science and
the increasing use of algorithmic methods for solving real-world practical problems. These have led
to combinatorial applications in a wide range of subject areas, both within and outside mathematics,
including network analysis, coding theory, and probability.

5.1 Addition and multiplication rules


We first consider some questions.

1. How many possible crossword puzzles are there?

2. Suppose we have to select 4 balls from a bag of 20 balls numbered 1 to 20. How often do two of
the selected balls have consecutive numbers?

3. How many ways are there of rearranging the letters in the word ALPHABET?

4. Can we construct a floor tiling from squares and regular hexagons?

We observe various things about the above problems. A priori, unlike many problems in math-
ematics, there is hardly any abstract or technical language. Despite the initial simplicity, some of
these problems will be frustratingly difficult to solve. Further, we notice that although these problems
appear to be diverse and unrelated, they principally involve selecting, arranging, and counting
objects of various types. We will first address the problem of counting. Clearly, we would like to be
able to count without actually counting. In other words, can we figure out how many things there
are with a given property without actually enumerating each of them? Quite often this entails deep
mathematical insight. We now introduce two standard techniques which are very useful for counting
without actually counting. These techniques can easily be motivated through the following examples.

Example 5.1.1.


1. Let the cars in New Delhi have license plates containing 2 letters followed by 2 digits.
What is the total number of license plates possible?
Ans: Here, we observe that there are 26 choices for the first letter and another 26 choices
for the second letter. After this, there are 10 choices for each of the two digits in the
license plate. Hence, we have a maximum of 26 × 26 × 10 × 10 = 67,600 license plates.
2. Let the cars in New Delhi have license plates containing 2 letters followed by 2 digits
with the added condition that “in the license plates that start with a vowel the sum of the digits
should always be even”. What is the total number of license plates possible?
Ans: Here, we need to consider two cases.
Case 1: The license plate doesn’t start with a vowel. Then using the previous example, the
number of license plates equals 21 × 26 × 10 × 10 = 54600.
Case 2: The license plate starts with a vowel. Then the number of license plates equals
5 × 26 × (5 × 5 + 5 × 5) = 6500.
Hence, we have a maximum of 54600 + 6500 = 61100 license plates.
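Both counts in Example 5.1.1 are small enough to confirm by listing every plate. The following Python sketch does so by brute force; the variable names are ours, and we assume the letters are A to Z and the digits are 0 to 9.

```python
from itertools import product
from string import ascii_uppercase

DIGITS = "0123456789"
VOWELS = set("AEIOU")

# Part 1: every plate is a choice of (letter, letter, digit, digit).
plates = list(product(ascii_uppercase, ascii_uppercase, DIGITS, DIGITS))
print(len(plates))   # 67600 = 26 * 26 * 10 * 10

# Part 2: plates starting with a vowel must have an even digit sum.
allowed = [p for p in plates
           if p[0] not in VOWELS or (int(p[2]) + int(p[3])) % 2 == 0]
print(len(allowed))  # 61100 = 54600 + 6500
```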

Generalization of the first example leads to what is referred to as the rule of product and that of
the second leads to the rule of addition. To understand these rules, we explain the involved ideas.
Suppose we have a task to complete and that the task has some parts (subtasks). Assume that
each of the parts can be completed on its own and completion of one part does not result in the
completion of any other part. We say the parts are compulsory to mean that each of the parts must
be completed to complete the task. We say the parts are alternative to mean that exactly one of
the parts must be completed to complete the task. With this setting we state the two basic rules of
combinatorics.

Discussion 5.1.2. [Basic counting rules] Let n, m1 , . . . , mn ∈ N.


1. [Multiplication/Product rule] If a task consists of n compulsory parts and the i-th part can
be completed in mi ways, i = 1, 2, . . . , n, then the task can be completed in m1 m2 · · · mn ways.
2. [Addition rule] If a task consists of n alternative parts, and the i-th part can be completed in
mi ways, i = 1, . . . , n, then the task can be completed in m1 + m2 + · · · + mn ways.

To illustrate these rules once again let us consider the following examples.
Example 5.1.3. 1. How many three digit natural numbers can be formed using digits 0, 1, · · · , 9?
Identify the number of parts in the task and the type of the parts (compulsory or alternative).
Which rule applies here?
Ans: The task of forming a three digit number can be viewed as filling three boxes kept in a
horizontal row. Our task has three compulsory parts. Part 1: choose a digit for the leftmost
place. Part 2: choose a digit for the middle place. Part 3: choose a digit for the rightmost place.

Multiplication rule applies. Ans: 9 × 10 × 10.


2. How many three digit natural numbers with distinct digits can be formed using digits 1, · · · , 9
such that each digit is odd or each digit is even? Identify the number of parts in the task and
the type of the parts (compulsory or alternative). Which rule applies here?

Ans: The task has two alternative parts. Part 1: form a three digit number with distinct digits
using digits from {1, 3, 5, 7, 9}. Part 2: form a three digit number with distinct digits using
digits from {2, 4, 6, 8}. Observe that Part 1 is a task having three compulsory subparts. Using
multiplication rule, we see that Part 1 can be done in 5 × 4 × 3 ways. Part 2 is a task having
three compulsory subparts. So, it can be done in 4 × 3 × 2 ways. Since our task has alternative
parts, addition rule applies. Ans: 84.
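The two answers in Example 5.1.3 can be confirmed by direct enumeration. The Python sketch below is one way to do this; the helper name qualifies is ours.

```python
three_digit = range(100, 1000)
print(len(three_digit))  # 900 = 9 * 10 * 10

def qualifies(n):
    """Digits from 1..9, pairwise distinct, and all odd or all even."""
    digits = [int(c) for c in str(n)]
    return (0 not in digits
            and len(set(digits)) == 3
            and len({d % 2 for d in digits}) == 1)

# The two alternative parts: all digits odd, or all digits even.
all_odd = [n for n in three_digit if qualifies(n) and n % 2 == 1]
all_even = [n for n in three_digit if qualifies(n) and n % 2 == 0]
print(len(all_odd), len(all_even), len(all_odd) + len(all_even))  # 60 24 84
```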

Remark 5.1.4. There is another way to formulate the above rules. Let Ai be the set of all possible
ways in which the i-th part can be completed. In this setting, the multiplication rule can be re-written
as: if A1 , A2 , . . . , An are nonempty finite sets, then |A1 × A2 × · · · × An | = |A1 | · |A2 | · · · · · |An |.
For the addition rule, note that, as the completion of one part does not result in the completion of any
other part, A1 , A2 , . . . , An are disjoint. Thus, the addition rule can be re-written as: if A1 , A2 , . . . , An
are disjoint, nonempty finite sets, then |A1 ∪ A2 ∪ · · · ∪ An | = |A1 | + |A2 | + · · · + |An |.

5.2 Permutations and combinations


This section is primarily devoted to introducing some very common combinatorial objects and developing
methods to count them using the addition rule and multiplication rule.

5.2.1 Counting words made with elements of a set S

The first fundamental combinatorial object one commonly studies is a function f : [k] → S. The set
of all functions from A to B will be denoted by Map(A, B).

Discussion 5.2.1. 1. Let k ∈ N and let f ∈ Map([k], S). Then, we may view f as the ordered
k-tuple (f (1), . . . , f (k)). Thus f is an element of $S^k = S \times S \times \cdots \times S$ (k times).

2. Consider an ordered k-tuple (x1 , x2 , . . . , xk ) of elements of X. If we remove the brackets and the
commas, then what we get is x1 x2 . . . xk , which is called a word of length k made with elements
of X. Thus, the word corresponding to the tuple (a, a, b) is aab.
3. Consider a function f : [3] → {a, b, . . . , z}, defined by f (1) = a, f (2) = a and f (3) = b.
Technically, f = {(1, a), (2, a), (3, b)} and the ordered tuple it gives is (a, a, b) and the word related
to it is aab. Because of this natural one-one correspondence, people use them interchangeably.

Theorem 5.2.2. Let n, r ∈ N be fixed. Then $|Map([n], [r])| = r^n$.

Proof. Forming such a function is a task with n compulsory parts, where each part can be done in r
many ways. So, by the product rule, the number of such functions is $r^n$.

Example 5.2.3. 1. How many functions are there from [9] to [12]?
Ans: $12^9$. This task has 9 compulsory parts, where each part can be done in 12 ways.
2. Determine the number of words of length 9 made with letters from {a, b, . . . , z}.
Ans: $26^9$. This task has 9 compulsory parts, where each part can be done in 26 ways.
3. Suppose 3 distinct coins are tossed and the possible outcomes, namely H and T , are recorded.
For example, the word TTH means that the first two coins have shown T and the third coin has
shown H. Determine the number of possible outcomes.
Ans: It is the same as the number of words of length 3 made using T and H. So, it is $2^3$.
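The identification of a function f : [n] → [r] with the tuple (f (1), . . . , f (n)) translates directly into code. The following Python sketch, with the helper name all_functions chosen by us, lists these tuples with itertools.product and compares their number with the value given by Theorem 5.2.2.

```python
from itertools import product

def all_functions(n, r):
    """Each function [n] -> [r], written as the tuple (f(1), ..., f(n))."""
    return list(product(range(1, r + 1), repeat=n))

# Theorem 5.2.2 for n = 3, r = 4: there are 4**3 functions.
print(len(all_functions(3, 4)))          # 64

# The coin example: words of length 3 over {H, T}.
outcomes = ["".join(w) for w in product("HT", repeat=3)]
print(len(outcomes), "TTH" in outcomes)  # 8 True
```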

Practice 5.2.4. 1. Let n, r ∈ N. In how many ways can r distinct balls be placed into n distinct
boxes?

2. How many ways are there to make 5-letter words (words of length 5) using the ENGLISH alphabet
such that the vowels do not appear at even positions?

3. Determine the number of possible outcomes if three distinct coins and five distinct dice are tossed?

Discussion 5.2.5. [Use of complements] A simple technique which is used very frequently is counting
the complement of a set, when we know the size of the whole set. For example, consider the following
question.
How many 5-letter words can be made using the letters A, B, C, D that do not contain the string
“ADC”? For example, ADCDD, BADCB are not counted but DACAD is counted.
Ans: Let X be the set of all 5-letter words that can be made using A, B, C, D. Then $|X| = 4^5$.
Consider the sets A = {words in X of the form ADC∗∗}, B = {words in X of the form ∗ADC∗},
and C = {words in X of the form ∗∗ADC}. We see that $|A| = |B| = |C| = 4^2$. As the sets A, B, C
are disjoint, we see that $|A \cup B \cup C| = 3 \times 4^2$. Hence our answer to the original question is $4^5 - 3 \times 4^2 = 976$.
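The complement count above is easy to confirm by listing all words. A brute-force sketch in Python (variable names are ours):

```python
from itertools import product

# All 5-letter words over {A, B, C, D}.
words = ["".join(w) for w in product("ABCD", repeat=5)]

# Words that avoid the string "ADC", and the complementary set.
without_adc = [w for w in words if "ADC" not in w]
with_adc = [w for w in words if "ADC" in w]

print(len(words), len(with_adc), len(without_adc))  # 1024 48 976
```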
Practice 5.2.6. 1. Determine the number of functions f : [6] → [5] satisfying f (i) ≠ i for at least
two values of i?
2. How many 5 digit natural numbers are there that do not have the digit 9 appearing exactly 4
times?

5.2.2 Counting words with distinct letters made with elements of a set S

We now discuss the next combinatorial object namely the one-one functions. For n ∈ N, the term
n-set is used for ‘a set of size n’. Further, n! = 1 · 2 · · · · · n and by convention, 0! = 1.

Discussion 5.2.7. [Injections] Let n, r ∈ N and X be a non-empty set.


1. An injection f : [r] → X can be viewed as an ordered r-tuple of elements of X with distinct
entries. It can also be viewed as a word of length r with distinct letters made with elements of X.
The set of all injections from A to B will be denoted by Inj(A, B).
2. If |X| = r, then a bijection f : X → X is called a permutation of X. If X = {x1 , . . . , xr },
then f (x1 ), . . . , f (xr ) is just a rearrangement of elements of X.
3. We define P(n, r) := |Inj([r], [n])|. As a convention, P (n, 0) = 1 for n ≥ 0.

Example 5.2.8. How many one-one maps f : [4] → {A, B, . . . , Z} are there?
Ans: The task of forming such a one-one map has 4 compulsory parts: selecting f (1), f (2), f (3)
and f (4). Further, f (2) ≠ f (1), f (3) ≠ f (1), f (2) and so on. So, by the product rule, the number of
one-one maps equals 26 · 25 · 24 · 23 = 26!/22!.

Theorem 5.2.9. [Number of injections f : [r] → S] Let n, r ∈ N with r ≤ n and |S| = n. Then the number
of injections from [r] to S is P (n, r) = n!/(n − r)!.

Proof. The task is to form an r-tuple (f (1), . . . , f (r)) of distinct elements. It has r compulsory parts,
namely selecting f (1), f (2), . . ., f (r) with the condition that f (k) ∉ {f (1), f (2), . . . , f (k − 1)}, for
2 ≤ k ≤ r. So, using the product rule, P (n, r) = |Inj([r], [n])| = n(n − 1) · · · (n − r + 1) = n!/(n − r)!.
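As a quick check of the formula for P (n, r), the sketch below lists injections as tuples with distinct entries and compares the count with n!/(n − r)!. The helper name injections is ours; math.perm is the standard-library function returning n!/(n − r)!.

```python
from itertools import permutations
from math import factorial, perm

def injections(r, n):
    """All injections [r] -> [n], each as the tuple (f(1), ..., f(r))."""
    return list(permutations(range(1, n + 1), r))

print(len(injections(3, 6)), factorial(6) // factorial(3))  # 120 120

# Example 5.2.8: one-one maps from [4] into the 26 letters.
print(perm(26, 4))  # 358800 = 26 * 25 * 24 * 23
```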

Practice 5.2.10. 1. How many ways are there to make 5 letter words using the ENGLISH alpha-
bet if the letters must be different?

2. How many ways are there to arrange the 5 letters of the word ROYAL?

3. How many bijections f : [12] → [12] are there if a multiple of 3 is mapped to a multiple of 3?

5.2.3 Counting words where letters may repeat

Consider the word AABAB. We want to give subscripts 1, 2, 3 to the A’s and subscripts 1, 2 to
the B’s so that we create words made with A1 , A2 , A3 , B1 , and B2 . For example, one such word is
A2 A3 B2 A1 B1 . How many such words can we create? Fill the following table to get all such words.
Notice that each of these words becomes AABAB when we erase the subscripts.
A1 A2 B1 A3 B2        A1 A2 B2 A3 B1
A1 A3 B1 A2 B2        A1 A3 B2 A2 B1
. . .                 . . .
A3 A2 B1 A1 B2        A3 A2 B2 A1 B1
The following is another useful principle. It is a special case of Exercise 3.1.5.13.

Proposition 5.2.11. [Principle of disjoint pre-images of equal size] Let A, B be nonempty finite sets
and f : A → B be a function satisfying $|f^{-1}(i)| = k = |f^{-1}(j)|$, for each i, j ∈ B. Then, |A| = k|B|.
In particular, for k = 1 this principle is also called the principle of bijection.

Let n1 , . . . , nk ∈ N. Suppose we are given ni copies of the symbol Ai , for i = 1, . . . , k. Then, by
an arrangement of these n1 + · · · + nk symbols, we mean a way of placing them in a row. It is a
word made with the symbols A1 , . . . , Ak containing the symbol Ai exactly ni times, i = 1, . . . , k. For
example, ABBAA is an arrangement of 3 copies of A and 2 copies of B.

Example 5.2.12. 1. How many words of size 5 can be formed using three A’s and two B’s?
Ans: Let A = {arrangements of A1 , A2 , A3 , B1 , B2 } and B = {words of size 5 which use three
A’s and two B’s}. For each arrangement a ∈ A, define Er(a) to be the word in B obtained by
erasing the subscripts. Then, the function Er : A → B satisfies:

‘for each b, c ∈ B, b ≠ c, we have $|Er^{-1}(b)| = |Er^{-1}(c)| = 3!\,2!$’.

Thus, by Proposition 5.2.11, $|B| = \frac{|A|}{3!\,2!} = \frac{5!}{3!\,2!}$.

2. Determine the number of ways to place 4 couples in a row if each couple sits together.
Ans: Let X be the set of all arrangements of A, B, C, D. Let Y be the set of all arrangements
of A, A, B, B, C, C, D, D in which both the copies of each letter are together. For example
AACCDDBB ∈ Y but ABBCCDDA ∉ Y . Let Z be the set of all arrangements of Ah , Aw ,
Bh , Bw , Ch , Cw , Dh , Dw in which Ah , Aw are together, Bh , Bw are together, Ch , Cw are together,
and Dh , Dw are together.
In this setting, we need to find the size of Z. So, define Er : Z → Y by Er(z) equals the
arrangement obtained by erasing the subscripts, namely h and w, that appear in z. Notice
that each y ∈ Y has $2^4$ pre-images in Z. Now, define Mrg : Y → X by Mrg(y) equals
the arrangement obtained by merging the two copies of the same letters into one single letter.
For example, Mrg(BBAADDCC) = BADC. Notice that each x in X has exactly one
pre-image in Y . By applying the principle of disjoint pre-images of equal size twice, we see that
$|Z| = 2^4 |Y| = 2^4 |X| = 2^4 \cdot 4!$, as |X| = 4!.

Alternate. Instead of writing it in such a laborious way as the above, let us adopt a more reader
friendly way of writing the same. A couple can be thought of as one cohesive group (they are
to be seated together). So, the 4 cohesive groups can be arranged in 4! ways. But a couple can
sit either as “wife and husband” or “husband and wife”. So, the total number of arrangements
is $2^4 \cdot 4!$.
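Both parts of Example 5.2.12 can be checked by enumeration. In the Python sketch below (all names are ours), the function erase plays the role of Er, and a Counter records the size of each pre-image set, so that the hypothesis of Proposition 5.2.11 can be seen directly.

```python
from collections import Counter
from itertools import permutations

def erase(arrangement):
    """Er: drop the subscript of every symbol, e.g. ('A2', 'B1') -> 'AB'."""
    return "".join(symbol[0] for symbol in arrangement)

# Part 1: arrangements of A1, A2, A3, B1, B2, grouped by the word they erase to.
labelled = ["A1", "A2", "A3", "B1", "B2"]
fibres = Counter(erase(a) for a in permutations(labelled))
print(len(fibres), set(fibres.values()))  # 10 {12}: each word has 3!2! pre-images

# Part 2: seat 4 couples in a row with each couple together.
people = [c + s for c in "ABCD" for s in "hw"]

def couples_together(row):
    position = {person: i for i, person in enumerate(row)}
    return all(abs(position[c + "h"] - position[c + "w"]) == 1 for c in "ABCD")

seatings = sum(1 for row in permutations(people) if couples_together(row))
print(seatings)  # 384 = 2**4 * 4!
```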

Theorem 5.2.13. [Arrangements] Let n, n1 , n2 , . . . , nk ∈ N and suppose that we have ni copies of
the symbol (object) Ai , for i = 1, . . . , k and that n = n1 + · · · + nk . Then the number of arrangements
of these n symbols is
\[ \frac{n!}{n_1!\, n_2! \cdots n_k!}. \]
The formula remains valid even if we take some of the ni ’s to be 0.

Proof. Let S be the set of all arrangements of the n1 + n2 + · · · + nk symbols and let T be the set of
all arrangements of the symbols A1,1 , . . . , A1,n1 , A2,1 , . . . , A2,n2 , . . . , Ak,1 , . . . , Ak,nk . Define a function
Er : T → S by Er(t) equals the arrangement obtained by erasing the second subscripts that appear
in t. Notice that each s ∈ S has n1 !n2 ! · · · nk ! many pre-images. Hence, by the principle of disjoint
pre-images of equal size, we have |T | = n1 ! · · · nk !|S|. As |T | = (n1 + n2 + · · · + nk )!, we obtain the
desired result.
Assume that some ni ’s are 0 (all cannot be 0 as n ∈ N). Then our arrangements do not involve the
corresponding Ai ’s. Hence we can use the argument in the previous paragraph and get the number of
arrangements. As 0! = 1, we can insert some 0! in the denominator.

We have an immediate special case.



Corollary 5.2.14. Let m, n ∈ N. Then the number of arrangements of m copies of A and n copies
of B is $\frac{(m+n)!}{m!\, n!}$.
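The formula of Theorem 5.2.13 is straightforward to evaluate. The following sketch, with the function name arrangements chosen by us, computes n!/(n1 ! n2 ! · · · nk !) and compares it with a brute-force count of distinct words in a small case.

```python
from itertools import permutations
from math import factorial, prod

def arrangements(*counts):
    """Number of arrangements of counts[i] copies of the i-th symbol."""
    n = sum(counts)
    return factorial(n) // prod(factorial(c) for c in counts)

print(arrangements(3, 2))                        # 10, as in Corollary 5.2.14 with m = 3, n = 2
print(arrangements(2, 2, 1),
      len(set(permutations("AABBC"))))           # 30 30
print(arrangements(3, 0, 2))                     # 10: a count of 0 changes nothing, as 0! = 1
```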

5.2.4 Counting subsets


As an immediate application of Corollary 5.2.14, we have the following result which counts the number
of subsets of size k of a given set S.

Theorem 5.2.15. Let n ∈ N and k ∈ {0, 1, . . . , n}. Then the number of subsets of [n] of size k is
$\frac{n!}{k!\,(n-k)!}$.

Proof. If k = 0 or n, then we know that there is only one subset of size k and the formula also gives
us the same value. So, let 1 ≤ k ≤ n − 1 and let X be the set of all arrangements of k copies of T ’s
and n − k copies of F ’s. For an arrangement x = x1 x2 . . . xn ∈ X, define f (x1 . . . xn ) = {i | xi = T },
i.e., the set of positions where a T appears in x. Then, f is a bijection between X and the set of all
k-subsets of [n]. Hence, the number of k-subsets of [n] equals $|X| = \frac{n!}{k!\,(n-k)!}$, by Corollary 5.2.14.
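The bijection used in this proof can be made concrete for small n and k. In the Python sketch below (names are ours), positions_of_T is the map f of the proof, and we check that it sends the arrangements of k T's and n − k F's exactly onto the k-subsets of [n].

```python
from itertools import combinations, permutations
from math import comb

def positions_of_T(word):
    """The map f of the proof: the set of positions (from 1) holding a T."""
    return frozenset(i for i, x in enumerate(word, start=1) if x == "T")

n, k = 5, 2
X = set(permutations("T" * k + "F" * (n - k)))   # distinct arrangements
images = {positions_of_T(x) for x in X}
k_subsets = {frozenset(c) for c in combinations(range(1, n + 1), k)}

# f is injective (as many images as arrangements) and onto the k-subsets.
print(len(X), len(images), images == k_subsets, comb(n, k))  # 10 10 True 10
```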

Discussion 5.2.16. 1. For n ∈ N and r ∈ {0, 1, . . . , n}, the symbol C(n, r) is used to denote the
number of r-subsets of [n]. The value of C(0, 0) is taken to be 1. Many texts use the word
‘r-combination’ for an r-subset.
2. Using Theorem 5.2.15, we see that for n ∈ N0 and r = 0, 1, . . . , n, $C(n, r) = \frac{n!}{r!\,(n-r)!}$. Also it
follows from the definition that C(n, r) = 0 if n < r, and C(n, r) = 1 if n = r.
3. Let n ∈ N and n1 , n2 , . . . , nk ∈ N0 such that n = n1 + · · · + nk . Then by C(n; n1 , . . . , nk ) we
denote the number $\frac{n!}{n_1!\, n_2! \cdots n_k!}$. By Theorem 5.2.13, it is the number of arrangements of n objects
where ni are of type i, i = 1, . . . , k. By convention, C(0; 0, . . . , 0) = 1.

4. If n ∈ N and n1 , . . . , nk−1 ∈ N0 with n1 + · · · + nk−1 < n, we also use C(n; n1 , . . . , nk−1 ) to mean
C(n; n1 , . . . , nk−1 , n − n1 − · · · − nk−1 ).

5.2.5 Pascal’s identity and its combinatorial proof


We aim to supply a combinatorial proof of a very well known identity called Pascal’s identity.

Theorem 5.2.17. [Pascal] Let n and r be non-negative integers. Then

C(n, r) + C(n, r + 1) = C(n + 1, r + 1).

Proof. (This is not the combinatorial proof.) If r > n, then by definition all the three terms are zero.
So, we have the identity. If r = n, then the first and the third terms are 1 and the second term is 0. So,
again we have the identity. So, let us take r < n. Now we can use the formulas for C(n, r), C(n, r + 1)
and C(n + 1, r + 1) to verify the identity.

Sometimes, we want to supply a combinatorial proof of an identity, i.e., by associating the terms
on the left hand side (LHS) and the right hand side (RHS) with some objects and by showing a one
to one correspondence between them. Before we supply a combinatorial proof of Pascal’s identity, the
reader is advised to go through the following experiment to discover that proof on their own.

Experiment
Complete the following list by filling the left list with all 3-subsets of {1, 2, 3, 4, 5} and the right
list with 3-subsets of {1, 2, 3, 4} as well as with 2-subsets of {1, 2, 3, 4} as shown below. Can you
match the sets in the left with the sets in the right in some natural way?

Left list: C(5, 3) sets        Right list
{1, 2, 3}                      {1, 2, 3}    (C(4, 3) sets: the 3-subsets of {1, 2, 3, 4})
. . .                          . . .
{2, 3, 4}                      {2, 3, 4}
{1, 2, 5}                      {1, 2}       (C(4, 2) sets: the 2-subsets of {1, 2, 3, 4})
. . .                          . . .
{3, 4, 5}                      {3, 4}


We now present the combinatorial proof of Theorem 5.2.17.


Proof. If r > n, then by definition all the three terms are zero. So we have the identity. If r = n,
then the first and the third terms are 1 and the second term is 0. So, again we have the identity. So,
assume that r < n.
Let S = {1, 2, . . . , n, n + 1} and A be an (r + 1)-subset of S. Then, by definition, there are
C(n + 1, r + 1) such sets with either n + 1 ∈ A or n + 1 ∉ A.
Note that n + 1 ∈ A if and only if A \ {n + 1} is an r-subset of {1, 2, . . . , n}. So, the number of
(r + 1)-subsets of {1, 2, . . . , n, n + 1} which contain the element n + 1 is, by definition, C(n, r).
Also, n + 1 ∉ A if and only if A is an (r + 1)-subset of {1, 2, . . . , n}. So, a set A which does not
contain n + 1 can be formed in C(n, r + 1) ways.
Therefore, using the above two cases, an (r + 1)-subset of S can be formed, by definition, in
C(n, r) + C(n, r + 1) ways. Thus, the required result follows.
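The case split in this proof can be carried out explicitly for n = 4 and r = 2, which is the situation of the Experiment. The following Python sketch (names are ours) separates the 3-subsets of {1, . . . , 5} according to whether they contain 5, and checks the matching A ↦ A \ {5}.

```python
from itertools import combinations
from math import comb

def split_by_last_element(n, r):
    """Split the (r+1)-subsets of [n+1] by whether they contain n+1."""
    subsets = [set(A) for A in combinations(range(1, n + 2), r + 1)]
    containing = [A for A in subsets if n + 1 in A]
    avoiding = [A for A in subsets if n + 1 not in A]
    return containing, avoiding

containing, avoiding = split_by_last_element(4, 2)
print(len(containing), len(avoiding))  # 6 4, i.e. C(4, 2) and C(4, 3)

# Removing n+1 matches the first family with the 2-subsets of {1, 2, 3, 4}.
matched = sorted(sorted(A - {5}) for A in containing)
print(matched == sorted(sorted(c) for c in combinations(range(1, 5), 2)))  # True
```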

5.2.6 Counting in two ways


Let R and C be two nonempty finite sets and take a function f : R × C → ℝ. View the function
written as a matrix of real numbers with rows indexed by R and columns indexed by C. Then the
total sum of the entries of that matrix can be obtained either ‘by first taking the sum of entries in
each row and then summing them’ or ‘by first taking the sum of the entries in each column and then
summing them’, i.e.,
\[ \sum_{(x,y) \in R \times C} f(x,y) = \sum_{x \in R} \Bigl( \sum_{y \in C} f(x,y) \Bigr) = \sum_{y \in C} \Bigl( \sum_{x \in R} f(x,y) \Bigr). \]

This is known as ‘counting in two ways’ and it is a very useful tool to prove some combinatorial
identities. Let us see some examples.
Example 5.2.18. 1. [Newton’s Identity] Let n ≥ r ≥ k be natural numbers. Then

C(n, r)C(r, k) = C(n, k)C(n − k, r − k).

In particular, for k = 1, the identity becomes rC(n, r) = nC(n − 1, r − 1).
Ans: Let us use the method of ‘counting in two ways’. So, we take two appropriate sets
R = {all r-subsets of [n]} and C = {all k-subsets of [n]} and define f on R × C by f (A, B) = 1
if B ⊆ A, and f (A, B) = 0 if B ⊈ A.
Then a given set A ∈ R has C(r, k) many subsets of size k. Thus,
\[ \sum_{A \in R} \Bigl( \sum_{B \in C} f(A,B) \Bigr) = \sum_{A \in R} C(r,k) = C(n,r)\, C(r,k). \]
Similarly, given a set B ∈ C, there are C(n − k, r − k) subsets of [n] of size r that contain B. Hence,
\[ \sum_{B \in C} \Bigl( \sum_{A \in R} f(A,B) \Bigr) = \sum_{B \in C} C(n-k,r-k) = C(n,k)\, C(n-k,r-k). \]

Hence, the identity is established.

Alternate. We now present the same argument in a more reader friendly manner.
Select a team of size r from n students (in C(n, r) ways) and then from that team select k leaders
(in C(r, k) ways). So, there are C(n, r)C(r, k) ways of selecting a team and its leaders from the
team itself. Alternately, select the leaders first in C(n, k) ways and out of the rest select another
r − k to form the team in C(n − k, r − k) ways. So, using this argument, the number of ways of
doing this is C(n, k)C(n − k, r − k).
2. [Important] Let n, r ∈ N, n ≥ r. Then

C(1, r) + C(2, r) + · · · + C(n, r) = C(n + 1, r + 1). (5.1)

The RHS stands for the class F of all the subsets of [n + 1] of size r + 1. Let S ∈ F. Note
that S has a maximum element. A moment’s thought tells us that the maximum element of
such a set can vary from r + 1 to n + 1. If the maximum of S is r + 1, then the remaining
elements of S can be chosen in C(r, r) ways. If the maximum of S is r + 2, then the
remaining elements of S can be chosen in C(r + 1, r) ways, and so on. If the maximum
of S is n + 1, then the remaining elements of S can be chosen in C(n, r) ways. Thus,
C(n + 1, r + 1) = C(r, r) + C(r + 1, r) + · · · + C(n, r) = C(1, r) + C(2, r) + · · · + C(n, r),
where the last equality holds because C(m, r) = 0 for m < r.
Observe that for r = 1, it gives us 1 + 2 + · · · + n = n(n + 1)/2.
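Both parts of Example 5.2.18 can be checked on small cases. In the Python sketch below (all names are ours), incidence_totals sums the 0/1 matrix f (A, B) first by rows and then by columns, and subsets_by_maximum groups the (r + 1)-subsets of [n + 1] by their maximum element, as in the argument for (5.1).

```python
from collections import Counter
from itertools import combinations
from math import comb

def incidence_totals(n, r, k):
    """Sum f(A, B) = 1 if B ⊆ A else 0, row-wise and then column-wise."""
    R = [frozenset(A) for A in combinations(range(1, n + 1), r)]
    C = [frozenset(B) for B in combinations(range(1, n + 1), k)]
    by_rows = sum(sum(1 for B in C if B.issubset(A)) for A in R)
    by_cols = sum(sum(1 for A in R if B.issubset(A)) for B in C)
    return by_rows, by_cols

def subsets_by_maximum(n, r):
    """Count the (r+1)-subsets of [n+1] according to their maximum element."""
    return Counter(max(S) for S in combinations(range(1, n + 2), r + 1))

rows, cols = incidence_totals(6, 4, 2)
print(rows, cols, comb(6, 4) * comb(4, 2), comb(6, 2) * comb(4, 2))  # 90 90 90 90

by_max = subsets_by_maximum(6, 2)
# A subset with maximum m is completed by r elements chosen from [m - 1].
print([by_max[m] for m in range(3, 8)], sum(by_max.values()))  # [1, 3, 6, 10, 15] 35
```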

Exercise 5.2.19. 1. In a school there are 17 girls and 20 boys. A committee of 5 students is to
be formed to represent the class.

(a) Determine the number of ways of forming a committee consisting of 5 students.


(b) Suppose the committee also needs to choose two different people from among themselves,
who will act as “spokesperson” and “treasurer”. In this case, determine the number of ways
of forming a committee consisting of 5 students and selecting a treasurer and a spokesperson
among them. Note that two committees are different if
i. either the members are different, or
ii. even if the members are the same, they have different students as spokesperson and/or
treasurer.
(c) Due to certain restrictions, it was felt that the committee should have at least 3 girls. In
this case, determine the number of ways of forming the committee consisting of 5 students.

2. Prove that C(pn, pn − n) is a multiple of p directly from its expression.


3. Determine the number of arrangements of the letters of the word ABRACADABARAARCADA.
4. Prove the following identities using combinatorial arguments.
(a) C(n, r) = C(n, n − r), for non-negative integers n and r.
(b) C(n, r) = C(r, r)C(n − r, 0) + C(r, r − 1)C(n − r, 1) + · · · + C(r, 0)C(n − r, r) for natural
numbers n ≥ r.
(c) $C(n, 0)^2 + C(n, 1)^2 + \cdots + C(n, n)^2 = C(2n, n)$ for all n ∈ N.

5. Determine the number of ways of selecting a committee of m people from a group consisting of
n1 women and n2 men, with n1 + n2 ≥ m.

6. How many anagrams (rearrangements of letters) of MISSISSIPPI are there so that no two S’s
are adjacent?
7. How many rectangles are there in an n × n square? How many squares are there?
8. Supply combinatorial proofs of the following statements.
(a) For each n ∈ N, prove that n! divides the product of n consecutive natural numbers.
(b) For m, n ∈ N, the number $(m!)^n$ divides (mn)!.
(c) For n, p ∈ N, the number C(pn, pn − n) is a multiple of p.
(d) Prove combinatorially that $2^n \mid (n + 1)(n + 2) \cdots (2n)$.

9. If n points are placed on the circumference of a circle and all the lines connecting them are
joined, what is the largest number of points of intersection of these lines inside the circle that
can be obtained?
10. How many ways are there to form the word MATHEMATICIAN starting from any side and
moving only in horizontal or vertical directions?
                        M
                      M A M
                    M A T A M
                  M A T H T A M
                M A T H E H T A M
              M A T H E M E H T A M
            M A T H E M A M E H T A M
          M A T H E M A T A M E H T A M
        M A T H E M A T I T A M E H T A M
      M A T H E M A T I C I T A M E H T A M
    M A T H E M A T I C I C I T A M E H T A M
  M A T H E M A T I C I A I C I T A M E H T A M
M A T H E M A T I C I A N A I C I T A M E H T A M

5.3 Solutions in non-negative integers


There are 3 types of ice-creams available in the market: A, B, C. We want to buy 5 ice-creams in
total. In how many ways can we do that? For example, we can buy 5 of type A or we can buy 3 of A
and 2 of C. In general, suppose we are buying n1 of type A, n2 of type B and n3 of type C. Then, we
must have n1 + n2 + n3 = 5. So, we want to know the number of different possible tuples (n1 , n2 , n3 )
satisfying certain condition(s).
Let us discuss it in a general setup. Recall that N0 := N ∪ {0}. A point p = (p1 , . . . , pk ) ∈ $\mathbb{N}_0^k$ with
p1 + · · · + pk = n is called a solution of x1 + · · · + xk = n in non-negative integers or a solution of
x1 + · · · + xk = n in N0 . Two solutions (p1 , . . . , pk ) and (q1 , . . . , qk ) are said to be the same if pi = qi ,
for each i = 1, . . . , k. Thus, (5, 0, 0, 5) and (0, 0, 5, 5) are two different solutions of x + y + z + t = 10
in N0 .

Theorem 5.3.1. [Solutions in N0 ] The number of solutions of x1 +· · ·+xr = n in N0 is C(n+r−1, n).

Proof. Each solution (x1 , . . . , xr ) may be viewed as an arrangement of n dots and r − 1 bars.
‘Put x1 many dots; put a bar; put x2 many dots; put another bar; continue; and end by putting
xr many dots.’
For example, (0, 2, 1, 0, 0) is associated to |••|•|| and vice-versa. As there are
C(n + r − 1, r − 1) = C(n + r − 1, n) arrangements of n dots and r − 1 bars, we see that the number
of solutions of x1 + · · · + xr = n in N0 is C(n + r − 1, n).
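The correspondence between solutions and arrangements of dots and bars is easy to write down explicitly. The following Python sketch (function names are ours) converts in both directions and checks the count of Theorem 5.3.1 for the ice-cream question with n = 5 and r = 3.

```python
from itertools import product
from math import comb

def to_dots_and_bars(solution):
    """(x1, ..., xr) -> x1 dots, a bar, x2 dots, a bar, ..., xr dots."""
    return "|".join("•" * x for x in solution)

def from_dots_and_bars(word):
    """Inverse map: the lengths of the blocks of dots between the bars."""
    return tuple(len(block) for block in word.split("|"))

def solutions(n, r):
    """All solutions of x1 + ... + xr = n in non-negative integers."""
    return [x for x in product(range(n + 1), repeat=r) if sum(x) == n]

print(to_dots_and_bars((0, 2, 1, 0, 0)))   # |••|•||
sols = solutions(5, 3)
print(len(sols), comb(5 + 3 - 1, 5))       # 21 21
# Converting to dots and bars and back recovers every solution.
print(all(from_dots_and_bars(to_dots_and_bars(s)) == s for s in sols))  # True
```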

Example 5.3.2. Determine the number of words that can be made using all of 3 copies of A and 6
copies of B.
Ans: Note that this number equals the number of arrangements of 3 copies of A and 6 copies of
B. Hence, this number is C(9, 3).



Alternate. First put the three A’s in a row. Now put x1 B’s to the left of the first A, x2 B’s between
the first and the second A, x3 B’s between the second and the third A and x4 B’s after the third A.
Thus, we need to find number of solutions of x1 + x2 + x3 + x4 = 6 in N0 . By Theorem 5.3.1, the
number is C(6 + 4 − 1, 6) = C(9, 6).

Discussion 5.3.3. The question of finding non-negative integer solutions can also be asked in some
other styles.

1. In how many ways can we place 6 indistinguishable balls into 4 distinguishable boxes?
Taking ni as the number of balls to be put in the i-th box, it is asking us to find number of
solutions of n1 + n2 + n3 + n4 = 6 in N0 .
2. A multiset is a generalization of a set where elements are allowed to repeat. For example,
{a, b, a} and {a, a, b} mean the same multisets (imagine carrying all of them in a bag). A set is
also a multiset. How many multisets of size 6 can be made using the symbols a, b, c, d?
Taking na as the number of a’s to be put in the multiset and so on, it is asking us to find
solutions of na + nb + nc + nd = 6 in N0 .
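Both reformulations can be compared by enumeration. In the Python sketch below (variable names are ours), itertools.combinations_with_replacement lists the multisets of size 6 on {a, b, c, d}, and the placements of 6 indistinguishable balls into 4 boxes are listed as tuples (n1 , n2 , n3 , n4 ) with sum 6.

```python
from itertools import combinations_with_replacement, product
from math import comb

# Multisets of size 6 on the symbols a, b, c, d (each listed in sorted order).
multisets = list(combinations_with_replacement("abcd", 6))

# Placements of 6 indistinguishable balls into 4 distinguishable boxes.
placements = [x for x in product(range(7), repeat=4) if sum(x) == 6]

print(len(multisets), len(placements), comb(6 + 4 - 1, 6))  # 84 84 84
```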

Example 5.3.4. 1. Suppose there are 5 kinds of ice-creams available in our market complex. In
how many ways can we buy 15 of them for a party?
Ans: Suppose we buy xi ice-creams of the i-th type. Then, the problem reduces to finding the
number of solutions of x1 + · · · + x5 = 15 in non-negative integers.

2. [Variables are bounded below by other numbers] How many solutions in N0 are there to
x + y + z = 60 such that x ≥ 3, y ≥ 4, z ≥ 5?
Ans: Note that (x, y, z) is such a solution if and only if (x − 3, y − 4, z − 5) is a solution to
x + y + z = 48 in N0 . So, the answer is C(50, 2).
3. [Reducing a related problem] In how many ways can we pick integers x1 < x2 < x3 < x4 <
x5 , from {1, 2, . . . , 20} so that xi − xi−1 ≥ 3, i = 2, 3, 4, 5? For example, one such choice is
(1, 5, 8, 11, 19).
Ans: For each choice of (x1 , x2 , x3 , x4 , x5 ), note that

(x1 − 1) + (x2 − x1 ) + · · · + (x5 − x4 ) + (20 − x5 ) = 19 i.e., d1 + d2 + d3 + d4 + d5 + d6 = 19

where d1 ≥ 0, d2 ≥ 3, . . ., d5 ≥ 3 and d6 ≥ 0. Putting n1 = d1 , ni = di − 3 for i = 2, 3, 4, 5 and
n6 = d6 , the problem reduces to finding the number of solutions of n1 + n2 + · · · + n6 = 7 in N0 .
Hence, the answer is C(12, 5).

Alternate. Take an arrangement of fifteen dots (•’s) and five bars (|’s) such that between two
consecutive bars, there are at least two dots. The position of the bars in each such arrangement
gives us one solution. For example,

• • | • • • | • • • | • •| • • • •|• → (3, 7, 11, 14, 19).

Conversely, each solution can be converted into such an arrangement by the following method:
let n1 be the number of dots present to the left of the first bar; n2 be the number of dots present
between the first bar and the second bar and so on. The problem now has been converted to
counting integer solutions of n1 + n2 + n3 + n4 + n5 + n6 = 15, where n1 , n6 ≥ 0, n2 , n3 , n4 , n5 ≥ 2.
This is the same as the number of solutions of n1 + n2 + n3 + n4 + n5 + n6 = 7 in N0 .

Alternate. Notice that x1 , x2 − 3, x3 − 6, x4 − 9, x5 − 12 is a non-decreasing sequence of numbers
from 1, . . . , 8. For example, (1, 5, 8, 12, 17) → (1, 2, 2, 3, 5). And from any non-decreasing sequence
of numbers from 1, . . . , 8 we can get back our original sequence.
So, the problem reduces to counting the number of non-decreasing sequences of length 5 with entries
from 1, 2, . . . , 8. But, this is the same as the number of 5-multisets of [8], as each multiset can be
sorted to give a unique non-decreasing sequence. Ans: C(12, 5).
4. [Variables are bounded above] In this case problems become harder. How many solutions in
N0 are there to x + y + z = 60 such that 20 ≥ x ≥ 3, 30 ≥ y ≥ 4, 40 ≥ z ≥ 5?
Ans: We are looking for the number of solutions in N0 of x + y + z = 48 such that x ≤ 17, y ≤ 26
and z ≤ 35. So, let $A = \{(x, y, z) \in \mathbb{N}_0^3 : x + y + z = 48\}$,
(a) $A_x = \{(x, y, z) \in \mathbb{N}_0^3 : x + y + z = 48,\ x \ge 18\}$,
(b) $A_y = \{(x, y, z) \in \mathbb{N}_0^3 : x + y + z = 48,\ y \ge 27\}$, and
(c) $A_z = \{(x, y, z) \in \mathbb{N}_0^3 : x + y + z = 48,\ z \ge 36\}$.

We know |A| = C(50, 2). So, the answer equals C(50, 2) − |Ax ∪ Ay ∪ Az |. The calculation of
|Ax ∪ Ay ∪ Az | is left to the reader. A more general formula appears in the next chapter.
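Items 2, 3 and 4 of Example 5.3.4 are small enough to be checked by direct enumeration. The sketch below is only a sanity check under our own naming (`item2`, `item3`, `item4` are not notation from the notes); it does not replace the counting arguments.

```python
from itertools import combinations
from math import comb

# Item 2: x + y + z = 60 with x >= 3, y >= 4, z >= 5 (z is determined by x and y).
item2 = sum(1 for x in range(3, 61) for y in range(4, 61) if 60 - x - y >= 5)

# Item 3: 5-subsets of {1, ..., 20} whose consecutive elements differ by at least 3.
item3 = sum(1 for c in combinations(range(1, 21), 5)
            if all(b - a >= 3 for a, b in zip(c, c[1:])))

# Item 4: x + y + z = 60 with 3 <= x <= 20, 4 <= y <= 30, 5 <= z <= 40.
item4 = sum(1 for x in range(3, 21) for y in range(4, 31) if 5 <= 60 - x - y <= 40)

assert item2 == comb(50, 2)
assert item3 == comb(12, 5)
print(item2, item3, item4)
```

The printed value of `item4` can be compared with the value obtained from C(50, 2) − |Ax ∪ Ay ∪ Az| once the union has been computed.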
Exercise 5.3.5. 1. Determine the number of solutions of x + y + z = 7 with x, y, z ∈ N.
2. Find the number of ways to keep n identical objects in r distinct locations, so that location i gets
at least pi ≥ 0 elements, i = 1, 2, · · · , r.
3. Find the number of solutions in non-negative integers of a + b + c + d + e < 11.

4. How many 4-letter words (with repetition) are there with the letters in alphabetical order?
5. Determine the number of increasing sequences of length r using the numbers 1, 2, . . . , n.
6. How many ways are there to select 10 integers from the set {1, 2, . . . , 100} such that the positive
difference between any two of the 10 integers is at least 3?
7. There are 10 types of ice-creams available in the market. We want to buy 3 ice-creams for each
of 40 students. For example, we may buy 2 of first type and 1 of second type for a student. In
how many ways can this be done?
8. (a) In how many ways can one arrange n different books in m different boxes kept in a row, if
books inside the boxes are also kept in a row?
(b) What if no box can be empty?

9. In a room, there are 2 distinct book racks with 5 shelves each. Each shelf is capable of holding
up to 10 books. In how many ways can we place 10 distinct books in these two racks?
10. How many permutations of a, b, . . . , z have no 2 vowels together?
11. How many rearrangements of 5 copies of a, 5 copies of b, . . . , 5 copies of z have at least two
consonants between any two vowels?
12. How many 10-subsets of {a, b, . . . , z} have a pair of consecutive letters?
13. How many ways (write an expression) are there to distribute 60 identical balls to 5 persons if
Ram and Shyam together get no more than 30 and Mohan gets at least 10?
14. In how many ways can we pick 20 letters from 10 A’s, 15 B’s and 15 C’s?
15. Evaluate $\sum_{i_1=1}^{n} \sum_{i_2=1}^{i_1} \sum_{i_3=1}^{i_2} \cdots \sum_{i_k=1}^{i_{k-1}} 1$.
16. Evaluate $\sum_{i_1=1}^{9} \sum_{i_2=1}^{i_1} \sum_{i_3=1}^{i_2} \cdots \sum_{i_9=1}^{i_8} i_9^2$.
17. There are 10 persons to be seated on chairs with numbers 1 to 10. The first person comes first
and can sit on any chair. Then for i = 2, 3, . . . , 10, the i-th person enters and takes the seat i
if it is available, otherwise any other seat. In how many ways can they be seated?
18. Fix n ∈ N. Then, a composition of n is an expression of n as a sum of positive integers. For
example, if n = 4, then the distinct compositions are

4, 3 + 1, 1 + 3, 2 + 2, 2 + 1 + 1, 1 + 1 + 2, 1 + 2 + 1, 1 + 1 + 1 + 1.

Let $S_k(n)$ denote the number of compositions of n into k parts. Then, $S_1(4) = 1$, $S_2(4) = 3$,
$S_3(4) = 3$ and $S_4(4) = 1$. Determine $S_k(n)$, for 1 ≤ k ≤ n, and $\sum_{k \ge 1} S_k(n)$.
19. Let n ≥ 2 be a natural number. Supply a bijection between the set of all compositions of n and
P([n − 1]).
20. How many rearrangements of the letters in ABRACADABARAARCADA are there such that the first

(a) A precedes the first B?


(b) B precedes the first A and the first D precedes the first C?
(c) B precedes the first A and the first A precedes the first C?
(d) B and A both precede the first C?
(e) B or A precede the first C?

5.4 Binomial and multinomial theorems


Discussion 5.4.1. 1. By an algebraic expansion of $(x + y + z)^n$ we mean an expansion where
each term is of the form $\alpha x^i y^j z^k$, so that two terms differ in the degree of at least one of x, y, z.
For example, $x^3 + 3x^2 y + 3xy^2 + y^3$ is an algebraic expansion of $(x + y)^3$.

2. By a word expansion of $(x + y + z)^n$ we mean an expansion where each term is a word of
length n using letters x, y, z. For example, xxx + xxy + xyx + xyy + yxx + yxy + yyx + yyy is
a word expansion of $(x + y)^3$.

3. Algebraic and word expansions for $(x_1 + \cdots + x_r)^n$ are defined similarly.

4. Take the word expansion of $(X + Y + Z)^4$. It contains $3^4$ words each of length 4 made with
X, Y, Z. Imagine listing the words in the order

XXXX, XXXY, XXXZ, XXYX, . . . , ZZZY, ZZZZ.

5. Do you think the word ZXYZ appears in the list? At which position? It is not difficult to
see that it appears in the position $1 + (2012)_3$, where $(2012)_3$ is the value computed in base 3,
reading X, Y, Z as the digits 0, 1, 2. (Prove this by induction!)

6. In fact, each possible word of length 4 that can be made with X, Y, Z, appears somewhere in
the word expansion of $(X + Y + Z)^4$.

7. How many words in the list have two Z’s, one X and one Y ? The answer must be all possible
arrangements of Z, Z, X, Y , which is $\frac{4!}{2!\,1!\,1!}$.

8. Hence, the coefficient of $XYZ^2$ in the algebraic expansion of $(X + Y + Z)^4$ must be
$\frac{4!}{2!\,1!\,1!} = C(4; 2, 1, 1)$. We express this by writing

$\mathrm{cf}\left[XYZ^2,\ (X + Y + Z)^4\right] = C(4; 2, 1, 1).$
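The claims of Discussion 5.4.1 about the word expansion of (X + Y + Z)^4 can be verified by listing the words explicitly. This is a small sketch; the variable names are ours, and we rely on the fact that `itertools.product` emits tuples in the same lexicographic order as the list XXXX, XXXY, XXXZ, XXYX, . . .

```python
from itertools import product
from collections import Counter

# All words of length 4 over X, Y, Z, in the order used in the discussion.
words = ["".join(w) for w in product("XYZ", repeat=4)]

# Position (counted from 1) of ZXYZ, compared with 1 + (2012) in base 3.
position = words.index("ZXYZ") + 1
digits = "ZXYZ".translate(str.maketrans("XYZ", "012"))
assert position == 1 + int(digits, 3)

# Words with two Z's, one X and one Y: their number is the coefficient of X Y Z^2.
two_z = [w for w in words if Counter(w) == Counter("ZZXY")]
print(len(words), position, len(two_z))
```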


 

Theorem 5.4.2. [Multinomial Theorem] Let n, k ∈ N and n1 , . . . , nk ∈ N0 with n = n1 + · · · + nk .
Then
\[
\mathrm{cf}\left[x_1^{n_1} x_2^{n_2} \cdots x_k^{n_k},\ (x_1 + \cdots + x_k)^n\right] = C(n; n_1, \ldots, n_k).
\]
So
\[
(x_1 + \cdots + x_k)^n = \sum_{\substack{n_1, \ldots, n_k \ge 0 \\ n_1 + \cdots + n_k = n}} C(n; n_1, \ldots, n_k)\, x_1^{n_1} \cdots x_k^{n_k}.
\]

Proof. It is clear that the word expansion of $(x_1 + \cdots + x_k)^n$ contains all possible words of length n
made with letters x1 , . . . , xk . The coefficient of $x_1^{n_1} x_2^{n_2} \cdots x_k^{n_k}$ is given by the words of length n that
are made with n1 copies of x1 , n2 copies of x2 , . . ., nk copies of xk . As we already know, there are
$\frac{n!}{n_1!\, n_2! \cdots n_k!}$ such words. Hence, the first identity follows. The second identity follows from the first one.
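The proof counts words by their letter content, and this can be imitated directly for small n and k. The sketch below uses our own helper names (`multinomial`, `coefficients_by_words`); the convention that `multinomial` returns 0 for a negative part matches Remark 5.4.4.

```python
from itertools import product
from collections import Counter
from math import factorial

def multinomial(n, parts):
    """C(n; n1, ..., nk) = n!/(n1! ... nk!); 0 if a part is negative or the parts do not sum to n."""
    if sum(parts) != n or any(p < 0 for p in parts):
        return 0
    result = factorial(n)
    for p in parts:
        result //= factorial(p)
    return result

def coefficients_by_words(n, k):
    """Group the k**n words of the word expansion by how often each letter occurs."""
    tally = Counter()
    for word in product(range(k), repeat=n):
        tally[tuple(word.count(letter) for letter in range(k))] += 1
    return tally

n, k = 5, 3
tally = coefficients_by_words(n, k)
# Each group of words has exactly C(n; n1, ..., nk) members.
assert all(count == multinomial(n, parts) for parts, count in tally.items())
# Setting every x_i = 1 in the theorem gives k**n as the sum of all coefficients.
assert sum(tally.values()) == k ** n
```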

Theorem 5.4.3. [Binomial Theorem] Let n ∈ N and 0 ≤ i ≤ n be an integer. Then
\[
\mathrm{cf}\left[x^i y^{n-i},\ (x + y)^n\right] = C(n, i) \quad \text{or} \quad (x + y)^n = \sum_{k=0}^{n} C(n, k)\, x^{n-k} y^k.
\]

Proof. Follows from Theorem 5.4.2.



Remark 5.4.4. Let n ∈ N and n1 , . . . , nk ∈ Z such that n = n1 + · · · + nk . If some ni is negative,
then the term $x_1^{n_1} x_2^{n_2} \cdots x_k^{n_k}$ does not appear in the expansion of $(x_1 + \cdots + x_k)^n$, so we can think
that the coefficient $\mathrm{cf}\left[x_1^{n_1} x_2^{n_2} \cdots x_k^{n_k},\ (x_1 + \cdots + x_k)^n\right] = 0$. Defining C(n; n1 , · · · , nk ) = 0 if any
of the ni ’s is negative, we now see that the multinomial theorem remains valid even for
n1 , . . . , nk ∈ Z. A similar comment is true for the binomial theorem too.

The numbers C(n, r) and C(n; r1 , . . . , rk ) are thus known as ‘binomial coefficients’ and ‘multinomial
coefficients’, respectively. An immediate and important corollary to the binomial theorem is the
following.

Corollary 5.4.5. Let n ∈ N. Then the total number of subsets of [n] is $2^n$. (We can also prove this
statement using some other arguments. See the exercises.)

Proof. The number of subsets of size k is C(n, k). Thus the total number of subsets is C(n, 0) +
C(n, 1) + · · · + C(n, n) which is $(1 + 1)^n$ by the binomial theorem.
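For small n the corollary can be confirmed by listing the subsets of [n] size by size. This is a minimal sketch with a helper name of our own choosing.

```python
from itertools import combinations
from math import comb

def subsets_by_size(n):
    """Return the list [number of k-subsets of [n] for k = 0, ..., n], by enumeration."""
    return [sum(1 for _ in combinations(range(1, n + 1), k)) for k in range(n + 1)]

for n in range(1, 9):
    sizes = subsets_by_size(n)
    assert sizes == [comb(n, k) for k in range(n + 1)]
    assert sum(sizes) == 2 ** n
```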

The example below shows how the binomial and multinomial coefficients can be seen as an
additional tool in our study.
Example 5.4.6. 1. Fix m, n, k ∈ N. Then show that $C(m + n, k) = \sum_{i=0}^{k} C(m, i)\, C(n, k - i)$.
Ans: First, we give an argument using counting in two ways. We can form a committee of size
k from a group consisting of m men and n women in C(m + n, k) ways. On the other hand, such
a committee can be formed by taking i many men and k − i many women, where 0 ≤ i ≤ k. In
this way our answer is $\sum_{i=0}^{k} C(m, i)\, C(n, k - i)$. Hence, they are the same.

Alternate. We now give an argument using the binomial theorem. We have
\[
\begin{aligned}
C(m + n, k) &= \mathrm{cf}\left[x^k y^{m+n-k},\ (x + y)^{m+n}\right] = \mathrm{cf}\left[x^k y^{m+n-k},\ (x + y)^m (x + y)^n\right] \\
&= \sum_{i=0}^{k} \mathrm{cf}\left[x^i y^{m-i},\ (x + y)^m\right] \mathrm{cf}\left[x^{k-i} y^{n-k+i},\ (x + y)^n\right] = \sum_{i=0}^{k} C(m, i)\, C(n, k - i).
\end{aligned}
\]

2. Let n > m be natural numbers. Prove that $\sum_{k=m}^{n} C(k, m)\, C(n, k) = C(n, m)\, 2^{n-m}$.
Ans: Recall that C(k, m)C(n, k) = C(n, m)C(n − m, k − m). Hence,
\[
\begin{aligned}
\sum_{k=m}^{n} C(k, m)\, C(n, k) &= \sum_{k=m}^{n} C(n, m)\, C(n - m, k - m) = C(n, m) \sum_{k=m}^{n} C(n - m, k - m) \\
&= C(n, m) \sum_{s=0}^{n-m} C(n - m, s) = C(n, m)\, 2^{n-m}.
\end{aligned}
\]

Alternate. Noticing a combinatorial proof is relatively harder. The RHS counts the pairs (A, B)
where A ⊆ [n] is of size m and B ⊆ [n] \ A. For each fixed A, we have $2^{n-m}$ choices of B, and
this is why we have the RHS. On the other hand, we can first select a big set C of size |C| = k ≥ m.
From this set C, we will take a subset A of size m and we will treat the remaining elements as B.
The LHS expresses the number of ways in which this task can be done.

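Alternate. Yet another way to see it is to notice that C(n, k)C(k, m) = C(n; m, k − m, n − k),
which is $\mathrm{cf}\left[x^m y^{k-m} z^{n-k},\ (x + y + z)^n\right]$. Since m is fixed (and x’s can appear in any m of the
n places), this coefficient equals $C(n, m)\, \mathrm{cf}\left[y^{k-m} z^{n-k},\ (y + z)^{n-m}\right]$. Summing over k, we get
\[
\sum_{k-m=0}^{n-m} C(n, m)\, \mathrm{cf}\left[y^{k-m} z^{n-k},\ (y + z)^{n-m}\right] = C(n, m)\, 2^{n-m}.
\]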

3. Determine the number of words of size 5 using letters from ‘MATHEMATICIAN’ (including
multiplicity, i.e., you may use M at most twice).
Ans: To form such a word, suppose we have selected $x_m$ many M ’s, $x_a$ many A’s,
and so on. Then, the problem reduces to finding the solutions in non-negative
integers to $x_m + x_a + x_t + x_h + x_e + x_i + x_c + x_n = 5$, with $0 \le x_m, x_t, x_i \le 2$, $0 \le x_a \le 3$,
$0 \le x_h, x_c, x_n, x_e \le 1$. For each such solution, the number of words that can be formed is
$C(5; x_m, x_a, x_t, x_h, x_e, x_i, x_c, x_n)$. Hence, the total number of such words is
\[
\sum_{\substack{k_1 + \cdots + k_8 = 5 \\ k_1 \le 2,\ k_2 \le 3,\ k_3 \le 2,\ k_4 \le 1,\ k_5 \le 1,\ k_6 \le 2,\ k_7 \le 1,\ k_8 \le 1}} C(5; k_1, \ldots, k_8).
\]
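The three parts of Example 5.4.6 lend themselves to a numerical check. The sketch below is ours (function and variable names are not from the notes): it tests the two identities on a range of small parameters and compares the bounded multinomial sum of item 3 with a direct enumeration of the 5-letter words.

```python
from itertools import permutations, product
from math import comb, factorial

def vandermonde_holds(m, n, k):
    """Item 1: C(m+n, k) equals the sum of C(m, i) C(n, k-i) over i = 0..k."""
    return comb(m + n, k) == sum(comb(m, i) * comb(n, k - i) for i in range(k + 1))

def item2_holds(n, m):
    """Item 2: the sum of C(k, m) C(n, k) over k = m..n equals C(n, m) 2^(n-m)."""
    return sum(comb(k, m) * comb(n, k) for k in range(m, n + 1)) == comb(n, m) * 2 ** (n - m)

assert all(vandermonde_holds(m, n, k)
           for m in range(1, 7) for n in range(1, 7) for k in range(1, 10))
assert all(item2_holds(n, m) for n in range(2, 12) for m in range(1, n))

# Item 3: distinct 5-letter words from the letters of MATHEMATICIAN.
letters = "MATHEMATICIAN"
brute = len(set(permutations(letters, 5)))          # direct enumeration
limits = [letters.count(c) for c in sorted(set(letters))]
by_formula = 0
for ks in product(*(range(lim + 1) for lim in limits)):
    if sum(ks) == 5:
        term = factorial(5)
        for kk in ks:
            term //= factorial(kk)                  # C(5; k1, ..., k8)
        by_formula += term
assert brute == by_formula
print(brute)
```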

Exercise 5.4.7. 1. Show that $|\mathcal{P}(\{1, 2, \ldots, n\})| = 2^n$ in the following ways.

(a) By using ‘select a subset is a task with n compulsory parts’.
(b) By associating a subset with a 0-1 string of length n and evaluating their values in base-2.
(c) Arguing in the line of ‘a subset of {1, 2, . . . , n, n + 1} either contains n + 1 or not’ and using
induction.

2. Let S be a set of size n. Then, prove in two different ways that the number of subsets of S of
odd size is the same as the number of subsets of S of even size. That is,
\[
\sum_{k \ge 0} C(n, 2k) = \sum_{k \ge 0} C(n, 2k + 1) = 2^{n-1}.
\]

3. Show that $C(n, \ell) = \sum_{k=0}^{t} C(t, k)\, C(n - t, \ell - k) = \sum_{k=0}^{n} C(t, k)\, C(n - t, \ell - k)$, for any t, 0 ≤ t ≤ n.

4. Show that $C(n + r + 1, r) = \sum_{\ell=0}^{r} C(n + \ell, \ell)$.

5. We already have seen a combinatorial proof of $C(n + 1, r + 1) = \sum_{\ell=r}^{n} C(\ell, r)$. Supply a different
proof by manipulating the binomial coefficients.

6. We know that rC(n, r) = nC(n − 1, r − 1). Use it to evaluate the following sums.
(a) Evaluate $\sum_{r=0}^{n} r\, C(n, r)$ for n ≥ 3.
(b) Evaluate $\sum_{k=0}^{n} (2k + 1)\, C(n, 2k + 1)$ for n ≥ 3.
(c) Evaluate $\sum_{k=0}^{n} (5k + 3)\, C(n, 2k + 1)$ for n ≥ 3.
(d) Evaluate $\sum_{r=0}^{n} r^2\, C(n, r)$ for n ≥ 3.
7. For each i, n ∈ N, define $S_i(n) = \sum_{k=1}^{n} k^i$. Then, we know that $S_1(n) = \frac{n(n+1)}{2}$,
$S_2(n) = \frac{n(n+1)(2n+1)}{6}$ and $S_3(n) = \left(\frac{n(n+1)}{2}\right)^2$. Determine $S_4(n)$. Also, find a recursive method
to find a closed form expression for $S_i(n)$, for i ≥ 5.
8. For n ∈ N, k1 , . . . , km ∈ N such that k1 + · · · + km = n, show that

C(n; k1 , . . . , km ) = C(n − 1; k1 − 1, . . . , km ) + · · · + C(n − 1; k1 , . . . , km − 1).

This is called the generalized Pascal’s identity.



9. For n, m ∈ N, evaluate
\[
\sum_{\substack{k_1, \ldots, k_m \in \mathbb{N}_0 \\ k_1 + \cdots + k_m = n}} C(n; k_1, \ldots, k_m).
\]

10. Let m, n ∈ N. How many terms are there in the (algebraic) expansion of $(x_1 + x_2 + \cdots + x_m)^n$?
How many terms involve at least one of each $x_i$, i = 1, . . . , m? How many terms involve at least
two $x_1$ and at most five $x_1$?
11. Let n, r ∈ N. By the binomial theorem, we know that $(n + 1)^r = \sum_{k=0}^{r} C(r, k)\, n^k$. Supply a
combinatorial proof by using Map([r], [n]).
12. For n, m ∈ N and $r = \lfloor m/2 \rfloor$ (greatest integer function) evaluate
\[
\sum_{\substack{k_1, \ldots, k_m \in \mathbb{N}_0 \\ k_1 + \cdots + k_m = n}} (-1)^{k_2 + k_4 + \cdots + k_{2r}}\, C(n; k_1, \ldots, k_m).
\]

5.5 Circular arrangements


Let S be a nonempty finite multiset. By a circular arrangement of elements of S, we mean an
arrangement of the elements of S on a circle. Two circular arrangements are the same if each element
has the same ‘clockwise adjacent’ element, i.e., one can be obtained as a rotation of the other. By
[x1 , x2 , . . . , xn , x1 ], we shall denote a circular arrangement, keeping the anticlockwise direction in a
picture. We use the word circular permutation if elements of S are distinct. Thus, exactly two of
the following pictures represent the same circular permutation.

[Figure 5.1: Circular permutations. Pictures of circular arrangements of A1 , . . . , A5 ; two of the
captions read [A1 , A2 , A3 , A4 , A5 , A1 ] and [A1 , A5 , A4 , A3 , A2 , A1 ].]

Example 5.5.1. Determine the number of circular permutations of X = {A1 , A2 , A3 , A4 , A5 }.


Ans: 4!. Let B = {circular permutations of X} and A = {permutations of X}. Now, define
f : A → B as f (a) = b if a is obtained by breaking the cycle b at some gap and then following in
the anticlockwise direction. For example, if we break the leftmost circular permutation in Figure 5.1
at the gap between A1 and A2 , we get [A2 , A3 , A4 , A5 , A1 ]. Notice that $|f^{-1}(b)| = 5$, for each b ∈ B.
Further if b, c ∈ B are distinct, then $f^{-1}(b) \cap f^{-1}(c) = \emptyset$ (why?¹). Thus, by the principle of disjoint
pre-images of equal size, the number of circular permutations is 5!/5 = 4!.

Theorem 5.5.2. [Circular permutations] The number of circular permutations of {1, 2, . . . , n} is
(n − 1)!.

Proof. A proof may be obtained on the line of the previous example. Here we give an alternate proof.
Put A = {circular permutations of {1, 2, . . . , n − 1, n}}. Put B = {permutations of {1, 2, . . . , n − 1}}.
Define f : A → B as $f([n, x_1, x_2, \ldots, x_{n-1}, n]) = [x_1, x_2, \ldots, x_{n-1}]$. Define g : B → A as
$g([x_1, x_2, \ldots, x_{n-1}]) = [n, x_1, x_2, \ldots, x_{n-1}, n]$. Then, g ◦ f (a) = a, for each a ∈ A and f ◦ g(b) = b, for
each b ∈ B. Hence, by the bijection principle (see Theorem 1.5.5) f is a bijection, and so |A| = |B| = (n − 1)!.

¹ (Footnote to Example 5.5.1.) Think of creating the circular permutation from a given permutation.
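Theorem 5.5.2 can be checked for small n by grouping the permutations of {1, . . . , n} according to rotation. In the sketch below we pick, as a representative of each circular permutation, the lexicographically smallest rotation; this choice and the function names are ours.

```python
from itertools import permutations
from math import factorial

def canonical_rotation(arr):
    """Return the lexicographically smallest rotation of the tuple arr."""
    return min(arr[i:] + arr[:i] for i in range(len(arr)))

def count_circular_permutations(n):
    """Count circular permutations of {1, ..., n} as distinct canonical rotations."""
    return len({canonical_rotation(p) for p in permutations(range(1, n + 1))})

for n in range(1, 8):
    assert count_circular_permutations(n) == factorial(n - 1)
```

Two permutations have the same canonical rotation exactly when one is a rotation of the other, so the set built above has one element per circular permutation.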

Example 5.5.3. Find the number of circular arrangements of {A, B, B, C, C, D, D, E, E}.


Ans: There is only one A. Cutting A out from a circular arrangement we get a unique arrangement
of {B, B, C, C, D, D, E, E}. So, the required answer is $\frac{8!}{(2!)^4}$.
Definition 5.5.4. 1. Given an arrangement (not a circular arrangement) [X1 , . . . , Xn ], by a rotation
$R_1([X_1, \ldots, X_n])$, in short $R_1(X_1, \ldots, X_n)$, we mean the arrangement [X2 , . . . , Xn , X1 ] and
by $R_2(X_1, \ldots, X_n)$ we mean the arrangement [X3 , . . . , Xn , X1 , X2 ]. On similar lines, we define
$R_i$, i ∈ N and put $R_0(X_1, \ldots, X_n) = [X_1, \ldots, X_n]$. Thus, for each k ∈ N,

$R_0(X_1, \ldots, X_n) = R_{kn}(X_1, \ldots, X_n) = [X_1, \ldots, X_n].$

2. The orbit size of an arrangement [X1 , . . . , Xn ] is the smallest positive integer i which satisfies
$R_i(X_1, \ldots, X_n) = [X_1, \ldots, X_n]$. In that case, we call

$\{R_0(X_1, \ldots, X_n),\ R_1(X_1, \ldots, X_n),\ \ldots,\ R_{i-1}(X_1, \ldots, X_n)\}$

the orbit of [X1 , . . . , Xn ].


Discussion 5.5.5. 1. We have R1 (ABCABCABC) = [BCABCABCA], R2 (ABCABCABC) =
[CABCABCAB] and R3 (ABCABCABC) = [ABCABCABC]. Thus, the orbit size of [ABCABCABC]
is 3.
2. An arrangement of S = {A, A, B, B, C, C} with orbit size 6 is [AABCBC]. An arrangement of
S with orbit size 3 is [ACBACB].
3. There is no arrangement of S = {A, A, B, B, C, C} with orbit size 2. In fact, if there is an
arrangement with orbit size 2 then its form, by definition, must be [X1 X2 X1 X2 X1 X2 ]. Thus
the element X1 repeats at least 3 times in S, which is not possible.
4. There is no arrangement of {A, A, B, B, C, C} with orbit size 1 or 2 or 4 or 5.
5. There are 3! arrangements of {A, A, B, B, C, C} with orbit size 3.
6. Take an arrangement of {A, A, B, B, C, C} with orbit size 3. Make a circular arrangement by
joining the ends. How many distinct arrangements can we generate by breaking the circular
arrangement at gaps?
Ans: 3. They are the elements of the same orbit.
7. Take an arrangement of {A, A, B, B, C, C} with orbit size 6. Make a circular arrangement by
joining the ends. How many distinct arrangements can we generate by breaking the circular
arrangement at gaps?
Ans: 6. They are the elements of the same orbit.
8. Take an arrangement of n elements with orbit size k. Make a circular arrangement by joining the
ends. How many distinct arrangements can we generate by breaking the circular arrangement
at gaps?
Ans: k. They are the elements of the same orbit.
9. If we take the set of all arrangements of a finite multiset and group them into orbits (notice that
each orbit gives us exactly one circular arrangement), then the number of orbits is the number
of circular arrangements.

Proposition 5.5.6. The orbit size of an arrangement of an n-multiset is a divisor of n.

Proof. Suppose that the orbit size of [X1 , . . . , Xn ] is k and that n = kp + r, for some r, 0 < r < k. Then,

$R_k(X_1, \ldots, X_n) = R_{2k}(X_1, \ldots, X_n) = \cdots = R_{(p+1)k}(X_1, \ldots, X_n) = R_{k-r}(X_1, \ldots, X_n)$

as (p + 1)k = pk + k = n − r + k ≡ k − r (mod n). Thus, $R_{k-r}(X_1, \ldots, X_n) = [X_1, \ldots, X_n]$ with
0 < k − r < k, contradicting the minimality of k. Hence, r = 0. Equivalently, k divides n.
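The rotations of Definition 5.5.4 and the divisibility statement of Proposition 5.5.6 are easy to experiment with. The following sketch represents an arrangement as a Python string; `rotate` and `orbit_size` are our own names for R_i and the orbit size.

```python
from itertools import product

def rotate(arr, i):
    """R_i: move the first i symbols of the arrangement to the end."""
    i %= len(arr)
    return arr[i:] + arr[:i]

def orbit_size(arr):
    """Smallest positive i with R_i(arr) == arr."""
    return next(i for i in range(1, len(arr) + 1) if rotate(arr, i) == arr)

# The arrangements mentioned in Discussion 5.5.5.
assert rotate("ABCABCABC", 1) == "BCABCABCA"
assert orbit_size("ABCABCABC") == 3
assert orbit_size("AABCBC") == 6
assert orbit_size("ACBACB") == 3

# Proposition 5.5.6, checked on all words of length 6 over three letters.
assert all(6 % orbit_size("".join(w)) == 0 for w in product("ABC", repeat=6))
```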

Example 5.5.7. Find the number of circular arrangements of S = {A, A, B, B, C, C, D, D, E, E}.


Ans: How many arrangements are there of orbit size 1? 0.
How many arrangements are there of orbit size 2? 0.
How many arrangements are there of orbit size 3? 0.
How many arrangements are there of orbit size 4? 0.
How many arrangements are there of orbit size 5? 5!.
How many arrangements are there of orbit size 6, 7, 8, 9? 0.
How many arrangements are there of orbit size 10? $\frac{10!}{2!\,2!\,2!\,2!\,2!} - 5!$.
The number of circular arrangements generated by those of orbit size 5 is 5!/5 = 4!. The number of
circular arrangements generated by those of orbit size 10 is $\frac{10!}{2!\,2!\,2!\,2!\,2!\cdot 10} - \frac{5!}{10}$. Thus the total number of
circular arrangements is $4! + \frac{10!}{2!\,2!\,2!\,2!\,2!\cdot 10} - \frac{5!}{10}$.
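The orbit-by-orbit count of Example 5.5.7 can be reproduced by listing all distinct arrangements of the multiset and recording their orbit sizes. This is a sketch under our own naming; the generator `arrangements` is a plain backtracking enumeration and assumes the symbols are single characters.

```python
from collections import Counter

def arrangements(counts):
    """Yield each distinct arrangement of the multiset {symbol: multiplicity} as a string."""
    counts = dict(counts)
    n = sum(counts.values())
    prefix = []

    def extend():
        if len(prefix) == n:
            yield "".join(prefix)
            return
        for s in sorted(counts):
            if counts[s] > 0:
                counts[s] -= 1
                prefix.append(s)
                yield from extend()
                prefix.pop()
                counts[s] += 1

    yield from extend()

def orbit_size(word):
    """Smallest positive i such that rotating word by i positions gives word back."""
    return next(i for i in range(1, len(word) + 1) if word[i:] + word[:i] == word)

def circular_count(counts):
    """Return (arrangements per orbit size, number of circular arrangements)."""
    by_orbit = Counter(orbit_size(w) for w in arrangements(counts))
    # Arrangements of orbit size k fall into orbits of k elements each.
    return by_orbit, sum(num // size for size, num in by_orbit.items())

by_orbit, total = circular_count({s: 2 for s in "ABCDE"})
print(dict(by_orbit), total)
```

The enumeration visits all 10!/(2!)^5 arrangements, so it takes a moment to run; smaller multisets such as {A, A, B, B, C, C} finish instantly.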

Discussion 5.5.8. [Binary operations] We want to provide another way to count the number of
circular arrangements. Let [X1 , . . . , Xn ] and [Y1 , . . . , Yn ] be two arrangements of an n-multiset. Then,
in the remainder of this section, we shall consider expressions like [X1 , . . . , Xn ] + [Y1 , . . . , Yn ]. By
$[R_i + R_j](X_1, \ldots, X_n)$, we mean the expression $R_i(X_1, \ldots, X_n) + R_j(X_1, \ldots, X_n)$. By
$R_i([X_1, \ldots, X_n] + [Y_1, \ldots, Y_n])$ we denote the expression $R_i(X_1, \ldots, X_n) + R_i(Y_1, \ldots, Y_n)$.

Example 5.5.9. Think of all arrangements P1 , . . . , Pn , of two A’s, two B’s and two C’s, where
$n = \frac{6!}{2!\,2!\,2!}$. How many copies of [ABCABC] are there in $[R_0 + \cdots + R_5](P_1 + \cdots + P_n)$?
Ans: Of course 6. To see this, note that R0 , R3 take [ABCABC] to itself; R1 , R4 will take
[CABCAB] to [ABCABC]; R2 , R5 will take [BCABCA] to [ABCABC]; and no other arrangement
after rotation will give [ABCABC].

Proposition 5.5.10. Let P1 , . . . , Pn be all the arrangements of an m-multiset. Then,

[R0 + · · · + Rm−1 ](P1 + · · · + Pn ) = m(P1 + · · · + Pn ).

Proof. In fact, [R0 + · · · + Rm−1 ](P1 + · · · + Pn ) means, take all arrangements and apply all rotations
(R0 , . . . , Rm−1 ), and collect all resulting arrangements.
Note that, if we apply R0 on (P1 + · · · + Pn ), we get one copy of each arrangement. Similarly, if we
apply Ri on (P1 + · · · + Pn ), we get one copy of each arrangement. So, [R0 + · · · + Rm−1 ](P1 + · · · + Pn )
will contain m copies of each arrangement.

Proposition 5.5.11. Let P be an arrangement of an m-multiset which has orbit size k. Then the
number of rotations $R_i$, i = 0, 1, . . . , m − 1 which fix P (that is, satisfy $R_i(P) = P$) is $\frac{m}{k}$. Furthermore,

$[R_0 + R_1 + \cdots + R_{m-1}](P) = \frac{m}{k}\, \mathrm{orbit}(P).$

Proof. As k is the orbit size of P , we already know that k divides m. Put p = m/k. Then
$R_0, R_k, \ldots, R_{(p-1)k}$ fix P . If there is any other s, 0 ≤ s < m, such that $R_s$ fixes P , then noting that
s is not a multiple of k, let s = kj + r, where 0 < r < k. It now follows that $R_r(P) = P$. This is a
contradiction to the fact that k is the orbit size of P .
The next assertion follows from the fact that each of

$[R_0 + \cdots + R_{k-1}](P) = [R_k + \cdots + R_{2k-1}](P) = \cdots = [R_{(p-1)k} + \cdots + R_{pk-1}](P)$

is the orbit(P ).

Discussion 5.5.12. Let P be an arrangement of an m-multiset S which has orbit size k. Recall that
each orbit accounts for one circular arrangement of objects in S. Thus [R0 + · · · + Rm−1 ](P ) accounts
for m/k counts of the same circular arrangement.
Now, let P1 , . . . , Pn be all the arrangements of objects in S. Then,
\[
\begin{aligned}
\sum_{P_i} (\text{the number of rotations fixing } P_i)\ \mathrm{orbit}(P_i) &= \sum_{P_i} [R_0 + \cdots + R_{m-1}](P_i) \\
&= m(P_1 + \cdots + P_n) \\
&= m(\text{all circular arrangements}).
\end{aligned}
\]

The number of circular arrangements contained in the LHS being the same as that of the RHS, we
get that the total number of all circular arrangements is $\frac{1}{m} \sum_{P_i} (\text{the number of rotations fixing } P_i)$.
But, notice that
\[
\begin{aligned}
\sum_{P_i} \text{the number of rotations fixing } P_i &= \sum_{P_i} |\{R_j \mid R_j(P_i) = P_i\}| \\
&= |\{(P_i, R_j) \mid R_j(P_i) = P_i\}| \\
&= \sum_{R_j} |\{P_i \mid R_j(P_i) = P_i\}| \\
&= \sum_{R_j} \text{the number of } P_i\text{’s fixed by } R_j.
\end{aligned}
\]

Hence, the total number of circular arrangements is
\[
\frac{1}{m} \sum_{R_j \text{ a rotation}} \text{the number of } P_i\text{’s fixed by } R_j.
\]

Example 5.5.13. 1. How many circular arrangements of {A, A, A, B, B, B, C, C, C} are there?

Ans: First way:

orbit size       no. of arrangements      no. of circular arrangements
1                0                        0
2                0                        0
3                3!                       3!/3 = 2
4, 5, 6, 7, 8    0                        0
9                9!/(3!3!3!) − 3!         (9!/(3!3!3!) − 3!)/9 = 186
Total                                     188

Second way:

Rotations          no. of arrangements fixed by it
R0                 9!/(3!3!3!)
R1                 0
R2                 0
R3                 3!
R4 , R5 , R7 , R8  0
R6                 3!
Total              5·6·7·8 + 3! + 3!

Thus, the number of circular arrangements is

$\frac{5 \cdot 6 \cdot 7 \cdot 8 + 12}{9} = \frac{5 \cdot 2 \cdot 7 \cdot 8 + 4}{3} = \frac{564}{3} = 188.$


2. Determine the number of circular arrangements of size 5 using the letters A, B and C.

Ans: First way:

orbit size    no. of arrangements    no. of circular arrangements
1             3                      3
2, 3, 4       0                      0
5             3^5 − 3                (3^5 − 3)/5 = 48
Total                                51

Second way:

Rotations    no. of arrangements fixed by it
R0           3^5
R1           3
R2           3
R3           3
R4           3
Total        3^5 + 3 + 3 + 3 + 3

Hence, the number of circular arrangements is $\frac{3^5 + 4 \cdot 3}{5} = 51$.

Verify that the answer will be 8 if we have just two letters A and B.
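The ‘second way’ used in Example 5.5.13 (average, over all rotations, the number of arrangements fixed by each rotation) can be sketched in a few lines. The code relies on a standard fact that is not spelled out in the notes and is stated here as our assumption: on n positions, the rotation R_j splits the positions into gcd(j, n) cycles of length n/gcd(j, n), and R_j fixes an arrangement exactly when the arrangement is constant on each cycle. The function names are ours.

```python
from math import gcd, factorial

def fixed_by_rotation(counts, j):
    """Number of arrangements of a multiset (list of multiplicities) fixed by R_j."""
    n = sum(counts)
    cycle_len = n // gcd(j, n)          # assumed cycle structure of R_j
    if any(c % cycle_len for c in counts):
        return 0                         # some symbol cannot fill whole cycles
    result = factorial(n // cycle_len)   # distribute the symbols over the cycles
    for c in counts:
        result //= factorial(c // cycle_len)
    return result

def circular_arrangements(counts):
    """(1/n) * sum over rotations R_j of the number of arrangements fixed by R_j."""
    n = sum(counts)
    return sum(fixed_by_rotation(counts, j) for j in range(n)) // n

def circular_words(length, alphabet_size):
    """Circular arrangements of given length when each letter may be used any number of times."""
    return sum(alphabet_size ** gcd(j, length) for j in range(length)) // length

print(circular_arrangements([3, 3, 3]))   # item 1 of the example
print(circular_words(5, 3))               # item 2 of the example
print(circular_words(5, 2))               # the two-letter case
```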

Exercise 5.5.14. 1. If there are n girls and n boys then what is the number of ways of making
them sit around a circular table in such a way that no two girls are adjacent and no two boys
are adjacent?

2. Let us assume that any two garlands are same if one can be obtained from the other by rotation.
Then, determine the number of distinct garlands that can be formed using 6 flowers, in the
following cases.

(a) The flowers can have colors ‘red’ or ‘blue’.


(b) The flowers can have the colors ‘red’, ‘blue’ or ‘green’.

3. Let us assume that any two garlands are same if one can be obtained from the other by rotation.
Then, determine the number of distinct garlands that can be formed using 6 flowers, 4 of which
are blue and 2 are red.

4. Find the number of circular arrangements of {A, A, B, B, C, C, C, C}.



5. Let us assume that any two garlands are same if one can be obtained from the other by rotation.
Then, determine the number of distinct garlands that can be formed using 6 flowers which can
have colors, R1 , . . . , Rk .

6. Persons P1 , . . . , P100 are seated on a circle facing the center and talking. With this situation
find answers to the following questions.

(a) If Pi tells lies, then the person to his right tells truths. What is the minimum possible
number of persons telling truths? Give a circular arrangement of L and T showing that the
minimum is attainable. What is the orbit size of this circular arrangement?
(b) What if we change the condition to ‘if Pi tells lies, then the second person to his right tells
truths’ ? Give a circular arrangement of L and T showing that the minimum is attainable.
What is the orbit size of such a circular arrangement?
(c) What if we change the condition to ‘if Pi tells lies, then the next two persons to his right tell
truths’ ? Give a circular arrangement of L and T showing that the minimum is attainable.
What is the orbit size of this circular arrangement?

5.6 Set partitions

Discussion 5.6.1. There are 9 balls with numbers 1, 2, . . . , 9 written on them. Imagine that we have
to carry them in two identical polythene bags, without having a bag empty. In how many ways can
we do that? Well, we can carry them like

{1}, {2, 3, 4, 5, 6, 7, 8, 9} or
{1, 2, 9}, {3, 4, 5, 6, 7, 8} and other ways.

Notice that {1, 2, 9}, {3, 4, 5, 6, 7, 8} and {3, 4, 5, 6, 7, 8}, {1, 2, 9} do not give us different ways of
carrying as the bags are identical.

Let S be a nonempty set and k ∈ N. A partition of S into k subsets means a collection of k


pairwise disjoint nonempty subsets of S whose union is S. For brevity, a partition of S into k subsets
is called a k-partition of S.
 
Example 5.6.2. 1. Each of the collections {{1, 2}, {3}, {4, 5, 6}}, {{1, 3}, {2}, {4, 5, 6}} and
{{1, 2, 3, 4}, {5}, {6}} is a 3-partition of [6], whereas the collection {{1, 2, 3}, {3, 4, 5, 6}} is not
a partition of any set.

2. There are $2^{n-1} - 1$ ways to obtain a 2-partition of [n]. To see this, observe that, if n = 1, then
we cannot have a 2-partition of [1] and the formula also gives the value 0. So let n ≥ 2. For each
non-trivial A ⊆ [n] (that is, A ≠ ∅, [n]), the set $\{A, A^c\}$ is a 2-partition of [n]. Since $\{A, A^c\}$ and
$\{A^c, A\}$ are regarded as the same 2-partition and since the total number of non-trivial subsets
of [n] equals $2^n - 2$, the required number is $2^{n-1} - 1$.

3. Number of allocations of 7 students into 7 different project groups so that each group has one
student, is 7! = C(7; 1, 1, 1, 1, 1, 1, 1) but the number of partitions of a set of 7 students into 7
subsets is 1.
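The count of 2-partitions in item 2 of Example 5.6.2 can be confirmed by enumeration. In this sketch (names ours) a partition is stored as a frozenset of blocks, so that {A, A^c} and {A^c, A} are automatically identified.

```python
from itertools import combinations

def two_partitions(n):
    """All 2-partitions {A, A^c} of [n], each recorded once as a frozenset of two blocks."""
    ground = frozenset(range(1, n + 1))
    found = set()
    for size in range(1, n):                      # non-trivial subsets only
        for block in combinations(sorted(ground), size):
            a = frozenset(block)
            found.add(frozenset({a, ground - a}))
    return found

for n in range(1, 10):
    assert len(two_partitions(n)) == 2 ** (n - 1) - 1
```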
Discussion 5.6.3. 1. In how many ways can we write {{1, 2}, {3, 4}, {5, 6}, {7, 8, 9}, {10, 11, 12}}
on a piece of paper, with the condition that sets have to be written in a row in increasing size?
Ans: Let us write a few first.

{{1, 2}, {3, 4}, {5, 6}, {7, 8, 9}, {10, 11, 12}}    correct
{{2, 1}, {3, 4}, {5, 6}, {7, 8, 9}, {10, 11, 12}}    correct
{{5, 6}, {3, 4}, {1, 2}, {10, 11, 12}, {9, 7, 8}}    correct
{{2, 3}, {1, 4}, {5, 6}, {7, 8, 9}, {10, 11, 12}}    incorrect, not the same partition
{{2, 1}, {3, 4}, {7, 8, 9}, {5, 6}, {10, 11, 12}}    incorrect, not satisfying the condition

There are $3!(2!)^3 \times 2!(3!)^2$ ways. Notice that from each written partition, if we remove the
brackets, then we get an arrangement of elements of {1, 2, . . . , 12}.
2. How many arrangements do we generate from a partition which has $p_i$ subsets of size $n_i$, where
$n_1 < \cdots < n_k$?
Ans: $p_1!(n_1!)^{p_1} \cdots p_k!(n_k!)^{p_k} = \prod_{i=1}^{k} \left[p_i!\,(n_i!)^{p_i}\right]$.

Theorem 5.6.4. [Set partition] The number of partitions of [n] consisting of $p_i$ subsets of size $n_i$,
i = 1, 2, . . . , k where $n_1 < \cdots < n_k$, is
\[
\frac{n!}{(n_1!)^{p_1}\, p_1! \cdots (n_k!)^{p_k}\, p_k!}.
\]

Proof. Let X be the set of all arrangements of elements of [n] and Y be the set of partitions of [n] of
the given type. Take any arrangement $x = x_1 \ldots x_n$ of the elements of [n]. Since we know that the
sets in the partition have to be in the increasing order of their sizes, this arrangement naturally gives
us a way to construct the partition. To do this take the first $n_1$ letters of this arrangement and make
a set. Take the next $n_1$ letters and make a set. Do this $p_1$ times. Then take the next $n_2$ letters and
make a set. Continue similarly to finish the job. In fact, once an arrangement x is given, there is only
a unique partition of the above type that we will get in this way. Call the resulting partition f (x).
Thus we have defined a function f : X → Y .
Note that each partition y ∈ Y generates $\prod_{i=1}^{k} \left[p_i!\,(n_i!)^{p_i}\right]$ arrangements of elements of [n]. This
means $|f^{-1}(y)| = \prod_{i=1}^{k} \left[p_i!\,(n_i!)^{p_i}\right]$. Hence, as |X| = n!, by the principle of disjoint pre-images of
equal size, we have $|Y| = \frac{n!}{(n_1!)^{p_1}\, p_1! \cdots (n_k!)^{p_k}\, p_k!}$.
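The formula of Theorem 5.6.4 can be compared with a direct enumeration of set partitions for a small case. The sketch below is ours: `set_partitions` is a simple recursive generator, and `shape` is a hypothetical way of recording the type of a partition as a dictionary {block size n_i: number of blocks p_i}.

```python
from collections import Counter
from math import factorial

def set_partitions(elements):
    """Yield every partition of the list `elements` as a list of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        # Put `first` into one of the existing blocks ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + partition

def count_by_formula(n, shape):
    """n! / prod((n_i!)^(p_i) * p_i!) for shape = {n_i: p_i}."""
    denom = 1
    for size, p in shape.items():
        denom *= factorial(size) ** p * factorial(p)
    return factorial(n) // denom

# Partitions of [7] into two blocks of size 2 and one block of size 3.
n, shape = 7, {2: 2, 3: 1}
brute = sum(1 for part in set_partitions(list(range(1, n + 1)))
            if Counter(len(block) for block in part) == Counter(shape))
assert brute == count_by_formula(n, shape)

# The type appearing in Discussion 5.6.3: three 2-sets and two 3-sets in [12].
print(count_by_formula(12, {2: 3, 3: 2}))
```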

Let n, r ∈ N. Then the number of r-partitions of [n] is called the Stirling number of the second
kind and is denoted by S(n, r). By convention, S(0, 0) = 1 and S(n, 0) = 0 for n ∈ N.

Example 5.6.5. We have S(5, 5) = 1, as the only way to make a 5-partition of [5] is to consider
{{1}, {2}, . . . , {5}}.
We have S(5, 1) = 1, as the only way to make a 1-partition of [5] is to consider {[5]}.
We have S(5, 10) = 0, as there is no way we can make a 10-partition of [5].
We have S(5, 2) = 15, as the formula is $2^{n-1} - 1$ with n = 5.
We have S(50, 49) = C(50, 2), as we will have exactly one doubleton set in our partition and the rest
will be singletons, and a subset of size 2 of [50] can be chosen in C(50, 2) ways.

Theorem 5.6.6. [Recurrence for S(n, r)] Let n, r ∈ N. Then S(n + 1, r) = S(n, r − 1) + rS(n, r).

Proof. If r = 1, then the verification is trivial. So let r > 1. Take an r-partition F of [n + 1]. If
{n + 1} is an element of F , then removing that element from F we get an (r − 1)-partition of [n].

If {n + 1} is not present in F , then n + 1 is present in some part with some other elements. Now,
if we remove n + 1 from that part, we get an r-partition of [n]. Note that, given any r-partition of
[n], by inserting n + 1 into any of these r parts, we can create r many r-partitions of [n + 1]. Hence,
S(n + 1, r) = S(n, r − 1) + rS(n, r).
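The recurrence of Theorem 5.6.6 gives a direct way to compute S(n, r). The sketch below uses the conventions S(0, 0) = 1 and S(n, 0) = 0 stated above, and additionally takes S(0, r) = 0 for r ≥ 1 (our assumption, reflecting that the empty set has no partition into r ≥ 1 nonempty parts). The function name `stirling2` is ours.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def stirling2(n, r):
    """S(n, r) via S(n, r) = S(n-1, r-1) + r * S(n-1, r)."""
    if n == 0 and r == 0:
        return 1
    if n == 0 or r == 0:
        return 0
    return stirling2(n - 1, r - 1) + r * stirling2(n - 1, r)

# The values listed in Example 5.6.5.
assert stirling2(5, 5) == 1
assert stirling2(5, 1) == 1
assert stirling2(5, 10) == 0
assert stirling2(5, 2) == 15
assert stirling2(50, 49) == comb(50, 2)
```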

Example 5.6.7. Determine the number of ways of putting n distinct balls into r identical boxes with
the restriction that no box is empty.
Ans: Make an r-partition of the set of these balls in S(n, r) ways. One part goes to one box.
Since boxes are identical, this can be done in one way. So the answer is S(n, r).

To proceed further, consider the following example.

Example 5.6.8. Let A = {a, b, c, d, e} and define an onto function f : A → [3] by f(a) = f(b) =
f(c) = 1, f(d) = 2 and f(e) = 3. Then, the collection {f^{−1}(1) = {a, b, c}, f^{−1}(2) = {d}, f^{−1}(3) = {e}}
gives a 3-partition of A.

Conversely, take a 3-partition of A, say, A1 = {a, d}, A2 = {b, e}, A3 = {c}. Then, this partition
gives 3! onto functions fi from A to [3]. Each of them is related to a one-one function
gi : {A1, A2, A3} → [3]. We list them below. Notice that fi(p) = gi(Ar) if p ∈ Ar.

      A1  A2  A3              a   b   c   d   e
g1     1   2   3    →   f1    1   2   3   1   2
g2     1   3   2    →   f2    1   3   2   1   3
g3     2   1   3    →   f3    2   1   3   2   1
g4     2   3   1    →   f4    2   3   1   2   3
g5     3   1   2    →   f5    3   1   2   3   1
g6     3   2   1    →   f6    3   2   1   3   2

Lemma 5.6.9. Let n, k ∈ N. Then the number of onto functions from [n] to [k] is S(n, k)k!.

Proof. Let X be the set of all onto functions from [n] to [k] and Y be the set of all k-partitions of [n].
Observe that, when f : [n] → [k] is an onto function, then {f^{−1}(1), . . . , f^{−1}(k)} is a unique
k-partition of [n]. Keeping that in mind, we define F : X → Y as F(f) = {f^{−1}(1), . . . , f^{−1}(k)}.
On the other hand, given a k-partition α = {S1, . . . , Sk} of [n], we can define k! onto functions
f : [n] → [k] by taking a one-one function σ : {S1, . . . , Sk} → [k] and then defining f(p) = σ(Si) if
p ∈ Si, i = 1, . . . , k. This means |F^{−1}(α)| = k!, for each α ∈ Y.
Hence, by the principle of disjoint pre-images of equal size, we have |X| = k!S(n, k).
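
For small n and k the lemma can be checked by brute force. The sketch below (helper names `count_onto` and `stirling2` are ours) enumerates all maps [n] → [k], keeps the onto ones, and compares the count with k!·S(n, k), where S(n, k) is tabulated from the recurrence of Theorem 5.6.6.

```python
from itertools import product
from math import factorial


def count_onto(n: int, k: int) -> int:
    """Brute force: count maps [n] -> [k] whose image is all of [k]."""
    return sum(1 for f in product(range(k), repeat=n) if len(set(f)) == k)


def stirling2(n: int, r: int) -> int:
    """S(n, r) tabulated from S(a, b) = S(a-1, b-1) + b*S(a-1, b)."""
    table = [[0] * (r + 1) for _ in range(n + 1)]
    table[0][0] = 1
    for a in range(1, n + 1):
        for b in range(1, min(a, r) + 1):
            table[a][b] = table[a - 1][b - 1] + b * table[a - 1][b]
    return table[n][r]


# Compare both counts for a few small cases.
for n in range(1, 7):
    for k in range(1, 5):
        assert count_onto(n, k) == factorial(k) * stirling2(n, k)
print(count_onto(5, 3))
```

The enumeration visits k^n maps, so it is only meant as a sanity check for small values.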

Lemma 5.6.10. Let n, m ∈ N. Then,
$$n^m = \sum_{k=1}^{n} C(n, k)\, k!\, S(m, k). \tag{5.2}$$

Proof. The LHS is the number of all functions f : [m] → [n].


On the other hand, any function f : [m] → [n] is an onto function from [m] to rng f , and rng f
can only be a nonempty subset of [n]. So, we can first select a subset A ⊆ [n] of size k ≥ 1 and then
consider all onto functions f : [m] → A. This has to be done for each subset A of size k and for each
k = 1, . . . , n. Choosing a subset A of size k can be done in C(n, k) many ways and there are k!S(m, k)
many onto functions from [m] to A. So the total number of functions becomes the expression in the
RHS.
Proposition 5.6.11. Let n, k ∈ N. Then $S(n + 1, k + 1) = \sum_{i=0}^{n} C(n, i)\, S(n - i, k) = \sum_{i=0}^{n} C(n, i)\, S(i, k)$.

Proof. Imagine forming a (k + 1)-partition of [n + 1]. The number n + 1 must belong to some part.
Suppose there are i other elements in this part. They can be chosen in C(n, i) ways. The rest of the
elements of the set [n + 1] must get divided into k parts in S(n − i, k) ways. Since i varies from 0 to
n, we have the identity. As C(n, i) = C(n, n − i), we get the second equality.

Remark 5.6.12. 1. Recall that the number of functions f : [n] → [m] is the same as the
number of ways to put n distinct objects 1, 2, . . . , n into m distinct boxes 1, 2, . . . , m. In fact,
this is how we counted the total number of such functions to be m^n.

2. The number of onto functions f : [n] → [m] is the same as the number of ways to put n distinct
balls into m distinct boxes, so that no box is empty.

3. The numbers S(r, k) can be recursively calculated using Theorem 5.6.6. For example, S(5, 3) =
S(4, 2) + 3S(4, 3) = (2^{4−1} − 1) + 3C(4, 2) = 7 + 18 = 25.

Summary of some work done till now

1. In how many ways can we distribute n distinct books to r students, if there is no restriction
at all? All functions : r^n.
2. In how many ways can we distribute n distinct books to r students, if each student gets at
most one book? All injections : C(r, n)n!.
3. In how many ways can we distribute n distinct books to r students, if each student gets at
least one book? All onto functions : S(n, r)r!.
4. In how many ways can we carry n distinct books in r identical bags, if there is no restriction
at all? All partitions : $\sum_{i=0}^{r} S(n, i)$.
5. In how many ways can we carry n distinct books in r identical bags, if each bag contains at
most one book? Partition into singletons : 1 (provided n ≤ r; otherwise 0).
6. In how many ways can we carry n distinct books in r identical bags, if each bag contains at
least one book? All r-partitions : S(n, r).
7. In how many ways can we distribute n identical books to r students, if there is no restriction
at all? All non-negative integer solutions : C(n + r − 1, r − 1).
8. In how many ways can we distribute n identical books to r students, if each student gets at
most one book? All n-subset of [r] : C(r, n).
9. In how many ways can we distribute n identical books to r students, if each student gets at
least one book? All positive integer solutions : C(n − 1, r − 1).
Exercise 5.6.13. 1. Determine the number of ways of carrying 20 distinct heavy books with 4
identical bags if each bag contains 5 books?
2. Determine the number of ways of distributing 20 distinct toys among 4 children if each child
gets 5 toys?
3. We know that S(n, 1) = 1 and S(n, 2) = 2^{n−1} − 1. Give a formula for S(n, 3).
4. For n ∈ N, let Bell(n) denote the number of partitions of the set [n], i.e., $\mathrm{Bell}(n) = \sum_{r=0}^{n} S(n, r)$.
It is called the nth Bell number. By definition, Bell(0) = 1 = Bell(1). Determine Bell(n), for
2 ≤ n ≤ 5. Prove combinatorially that $\mathrm{Bell}(n + 1) = \sum_{k=0}^{n} C(n, k)\, \mathrm{Bell}(k)$.
5. Suppose 13 people get on the lift at level 0. If all of them get down at levels 1, 2, 3, 4
and 5, calculate the number of ways of getting down if at least one person gets down at each
level.
6. How many functions are there from [10] to [4] such that each i ∈ [4] has at least two pre-images?
7. Let n ≥ k be natural numbers. Show that $S(n, k) = \sum 1^{a_1 - 1}\, 2^{a_2 - 1} \cdots k^{a_k - 1}$, where the summation
is over all solutions of $a_1 + \cdots + a_k = n$ in N, by showing that the RHS has the same initial
values and satisfies the same recurrence relation.

5.7 Number partitions


Let n, k ∈ N. A partition of n into k parts is a tuple (x1, · · · , xk) ∈ N^k written in decreasing (that is,
non-increasing) order such that x1 + · · · + xk = n. By πn(k), we denote the number of partitions of n into
exactly k parts and by πn we denote the number of all partitions of n. Conventionally we take π0 = 1.
By definition πn(k) = 0, whenever k > n.
Example 5.7.1. 1. Notice that (1, 1, 1, 1), (2, 2), (2, 1, 1) are some partitions of 4.
2. Notice that π7 (4) = 3 as the partitions of 7 into 4-parts are (4, 1, 1, 1), (3, 2, 1, 1) and (2, 2, 2, 1).
Verify that π7 (2) = 3 and π7 (3) = 4.


Discussion 5.7.2. We give here two instances where number partitions occur naturally.
1. Determine the number of ways of carrying n copies of the same book in r identical bags with
the restriction that no bag goes empty.
Ans: As the books are indistinguishable, we need to count the number of books in each bag.
As the bags are indistinguishable, arrange them so that the number of books inside the bags are
in decreasing order. Also, each bag is nonempty and hence the answer is πn (r).

2. Determine the number of ways of carrying n copies of the same book in r identical bags
with no restriction.
Ans: As the books are indistinguishable, we need to count the number of books in each bag. As
the bags are indistinguishable, arrange them so that the number of books inside the bags are in
decreasing order. Also, as empty bags are allowed, the resulting sequence (of numbers of books
in the bags in decreasing order) may have some 0’s at the end. Truncating the 0’s we obtain a
partition of n with at most r parts, so the answer is πn(1) + · · · + πn(r).

At times ‘a partition of n into k parts’ is written in short as ‘a k-partition of n’.

Proposition 5.7.3. Let n, r ∈ N. Then the number of partitions of n into at most r parts is equal to
the number of partitions of n + r into r parts.

Proof. Given a partition of n into at most r parts, extend it to an r-tuple by adding some 0’s at the
right end. For example, if n = 7, r = 4, we change the partition (6, 1) which has at most four parts
into (6, 1, 0, 0) which is a four tuple. This can be done uniquely. Next, add 1 to each component of
the r-tuple. We get an r-partition of n + r. For example, our previous four tuple would now change
to (7, 2, 1, 1) which is a partition of 11 into four parts.
Conversely, given an r-partition of n+r, subtract 1 from each component. Some of the components
might become 0. Truncating them we get a partition of n into at most r parts.

Remark 5.7.4. [Recurrence for πn (k)] Another way of writing the previous result is

$$\pi_n(k) = \pi_{n-k}(0) + \pi_{n-k}(1) + \cdots + \pi_{n-k}(k)$$

and so
$$\pi_n(k) = \pi_{n-1}(k - 1) + \pi_{n-k}(k).$$

We can also prove the second one directly using the fact that a k-partition can have the last part 1 or
more than 1 and then derive the first one.
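
The second recurrence gives a quick way to tabulate πn(k). A minimal sketch (the function name `parts` is ours; we take π0(0) = 1, in line with the convention π0 = 1):

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def parts(n: int, k: int) -> int:
    """pi_n(k): the number of partitions of n into exactly k parts."""
    if n == 0 and k == 0:
        return 1  # the empty partition of 0
    if n <= 0 or k <= 0 or k > n:
        return 0
    # Last part equal to 1: drop it.  Last part > 1: subtract 1 from every part.
    return parts(n - 1, k - 1) + parts(n - k, k)


print(parts(7, 2), parts(7, 3), parts(7, 4))

# Proposition 5.7.3 for n = 7, r = 4: at most 4 parts of 7 versus 4 parts of 11.
print(sum(parts(7, k) for k in range(1, 5)) == parts(11, 4))
```

The first line of output can be compared with the values π7(2), π7(3) and π7(4) of Example 5.7.1.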

Practice 5.7.5. Calculate πn for n = 1, 2, 3, . . . , 8.

Practice 5.7.6. Prove that π2r (r) = πr for any r ∈ N.

Definition 5.7.7. Let n, k ∈ N and λ = (n1 , n2 , · · · , nk ) be a k-partition of n.


1. Then, the Ferrer’s Diagram of λ is a pictorial representation of the partition created in the
following way. The i-th part of the partition is represented by putting ni equally spaced dots in
a row. The first row is on the top. The leftmost dots of the rows lie in the same column.

2. The (i, j)-hook of the partition consists of the (i, j)-dot along with the dots (of i-th row) to the
right of it and the dots (of j-th column) below it. The hook length is the number of dots in
that particular hook.

Example 5.7.8. Ferrer’s diagram for the partitions λ1 = (5, 3, 3, 2, 1, 1), λ2 = (6, 4, 3, 1, 1) and
λ3 = (5, 5, 4, 3, 2) of 15, 15 and 19 are given below.

•••••                ••••••               •••••
•••                  ••••                 •••••
•••                  •••                  ••••
••                   •                    •••
•                    •                    ••
•

(5, 3, 3, 2, 1, 1)   (6, 4, 3, 1, 1)      (5, 5, 4, 3, 2)

Figure 5.2: Ferrer’s diagram of λ1 , λ2 , λ3

Suppose that we have a Ferrer’s diagram of some partition λ of n. Observe that the number of
dots in the first column of the Ferrer’s diagram is greater than or equal to the number of dots in the
second column. In general, the number of dots in the i-th column is always greater than or equal
to the number of dots in the (i + 1)-th column. Thus, if we interchange the rows and columns of
the Ferrer’s diagram (transposing), then the result is another Ferrer’s diagram of some partition of n.
This new partition is called the conjugate of λ and is denoted by λ′. A partition λ of n is called self
conjugate if λ = λ′.
For instance, if λ = (5, 3, 3, 2, 1, 1) is a partition of 15, then its conjugate is λ′ = (6, 4, 3, 1, 1). The
partition (5, 4, 3, 2, 1) is a self-conjugate partition of 15.
Remark 5.7.9. Let λ = (n1, . . . , nk) be a partition (of some number). One can write the conjugate
without drawing the Ferrer’s diagram. Its conjugate λ′ = (p_1, . . . , p_{n1}) has n1 components and p_i =
the number of components in λ that are at least i. For example, the conjugate of (5, 3, 1, 1) is a
partition with 5 components (p_1, . . . , p_5), where p_1 = the number of components in λ that are at least
1. So p_1 = 4. Now, p_2 = the number of components in λ that are at least 2. So p_2 = 2. Similarly,
p_3 = 2, p_4 = 1, and p_5 = 1. So λ′ = (4, 2, 2, 1, 1).
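
The rule in the remark is a one-line computation. The following sketch (the function name `conjugate` is ours) represents a partition as a tuple of positive integers in non-increasing order.

```python
def conjugate(lam: tuple) -> tuple:
    """Conjugate partition: the i-th entry counts the parts of lam that are >= i."""
    if not lam:
        return ()
    largest = lam[0]  # lam is non-increasing, so the first part is the largest
    return tuple(sum(1 for part in lam if part >= i) for i in range(1, largest + 1))


print(conjugate((5, 3, 1, 1)))
print(conjugate((5, 3, 3, 2, 1, 1)))
print(conjugate((5, 4, 3, 2, 1)) == (5, 4, 3, 2, 1))
```

Applying `conjugate` twice returns the original partition, which mirrors the fact that transposing a diagram twice gives the diagram back.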

Proposition 5.7.10. Let n ∈ N. Then the number of self conjugate partitions of n is the same as the
number of partitions of n whose parts are distinct odd numbers.

Proof. Let λ be a self conjugate partition of n with k diagonal dots. For 1 ≤ i ≤ k, define li = length
of the (i, i)-th hook. Since λ is self-conjugate, each li is odd and (l1 , . . . , lk ) is a strictly decreasing
sequence of positive integers with l1 + l2 + . . . + lk = n. Hence, from a self conjugate partition λ of n
we have got a partition of n whose parts are distinct and odd.
Conversely, given any partition, say l = (l1 , . . . , lk ) where parts are distinct and odd, we can get
a self conjugate partition by putting l1 dots in the (1, 1)-th hook, l2 dots in the (2, 2)-th hook and so
on. Since each li is odd, the hook is symmetric and as the hook lengths decrease at least by 2, we see
that the corresponding diagram of dots is indeed a Ferrer’s diagram. (Try to give a formula for the
resulting partition in terms of li ’s.) Hence the result follows.
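
The proposition can also be checked numerically for small n. The sketch below is only a sanity check, not a proof; the helper names `partitions`, `conjugate`, `self_conjugate_count` and `distinct_odd_count` are ours.

```python
def partitions(n: int, max_part=None):
    """Yield all partitions of n as non-increasing tuples of positive integers."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def conjugate(lam: tuple) -> tuple:
    """Conjugate partition: the i-th entry counts the parts of lam that are >= i."""
    return tuple(sum(1 for p in lam if p >= i) for i in range(1, (lam[0] if lam else 0) + 1))


def self_conjugate_count(n: int) -> int:
    return sum(1 for lam in partitions(n) if conjugate(lam) == lam)


def distinct_odd_count(n: int) -> int:
    return sum(
        1
        for lam in partitions(n)
        if all(p % 2 == 1 for p in lam) and len(set(lam)) == len(lam)
    )


for n in range(1, 21):
    assert self_conjugate_count(n) == distinct_odd_count(n)
print("checked n = 1, ..., 20")
```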

Proposition 5.7.11. Let n ∈ N and f (n) be the number of partitions of n in which no part is 1.
Then f (n) = πn − πn−1 .

Proof. For n = 1, both the sides of the equality are 0. So assume that n > 1.
We shall count the complement. Let λ = (n_1, . . . , n_k) be a partition of n with n_k = 1. (Since
n > 1, there are at least two parts.) Then, λ gives rise to a partition of n − 1, namely (n_1, . . . , n_{k−1}).
Conversely, if µ = (t_1, . . . , t_{k−1}) is a partition of n − 1, then (t_1, . . . , t_{k−1}, 1) is a partition of n with last
part 1. Hence, the number of k-partitions of n with last part 1 is π_{n−1}(k − 1) (the first term in the
recurrence of Remark 5.7.4) and, summing over k, the number of partitions of n with last part 1 is π_{n−1}.
Thus, the number of partitions of n in which no part is 1 is π_n − π_{n−1}.

Exercise 5.7.12. 1. Let n ∈ N. Find an expression for the number of k-partitions of n in which
each part is at least 3.
2. Let n, k, m ∈ N. Prove the following.
(a) The number of k-partitions of n with the first (largest) part m = the number of m-partitions
of n with the first part k.
(b) The number of k-partitions of n with the first part at most m = the number of partitions
of n into at most m parts with the first part k.
(c) The number of partitions of n into at most k parts with the first part at most m = the
number of partitions of n into at most m parts with the first part at most k.

3. For n, r ∈ N, prove that πn (r) is the number of partitions of n + C(r, 2) into r unequal parts.
4. Recall that a composition of n is an ordered tuple of positive integers whose sum is n. They are
also called ordered partitions. Express the following quantities in terms of Fibonacci numbers
(F1 = F2 = 1).
(a) The number of ordered partitions of n into parts > 1.
(b) The number of ordered partitions of n into parts equal to 1 or 2.
(c) The number of ordered partitions of n into odd parts.

5. Let f (n, r) be the number of partitions of n where each part repeats less than r times. Let g(n, r)
be the number of partitions of n where no part is divisible by r. Show that f (n, r) = g(n, r).

5.8 Lattice paths and Catalan numbers


Let A = (a1, a2) and B = (b1, b2), a1 ≤ b1, a2 ≤ b2, be two points on Z × Z. By a lattice path
from A to B we mean a sequence of points (A = P1, . . . , Pk = B) of Z × Z such that if Pi = (x, y) then
Pi+1 is either (x + 1, y) or (x, y + 1), for 1 ≤ i ≤ k − 1. Thus, at each step we move either one unit
right, denoted R, or one unit up, denoted U. For example, from (2, 3) if we take the sequence of steps
UURRURRRUR, then we reach (8, 7). This lattice path is shown in Figure 5.3.

Figure 5.3: A lattice with a lattice path from (2, 3) to (8, 7) (U = up, R = right)

Discussion 5.8.1. How many lattice paths are there from (0, 0) to (m, n)?
Ans: At each step, the step has to be either R or U. We have to take m many
steps of type R in total in order to reach a point with x-coordinate m. Similarly, we have to take n
many steps of type U in total in order to reach a point with y-coordinate n. So, any arrangement of
m many R’s and n many U ’s will give such a path uniquely. Hence, the answer is C(m + n, m).
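
The count can be cross-checked by dynamic programming: a path reaching (x, y) arrives either from (x − 1, y) or from (x, y − 1). A minimal sketch (the function name `lattice_paths` is ours):

```python
from math import comb


def lattice_paths(m: int, n: int) -> int:
    """Number of R/U lattice paths from (0, 0) to (m, n), by dynamic programming."""
    # ways[x][y] = number of paths from (0, 0) to (x, y); border cells have one path.
    ways = [[1] * (n + 1) for _ in range(m + 1)]
    for x in range(1, m + 1):
        for y in range(1, n + 1):
            ways[x][y] = ways[x - 1][y] + ways[x][y - 1]
    return ways[m][n]


# Paths from (2, 3) to (8, 7) need 6 steps R and 4 steps U.
print(lattice_paths(6, 4), comb(10, 6))
```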
Discussion 5.8.2. Use lattice paths to give a combinatorial proof of $\sum_{\ell=0}^{m} C(n + \ell, \ell) = C(n + m + 1, m)$.
Ans: Observe that C(n + m + 1, m) is the number of lattice paths from (0, 0) to (m, n + 1). A
lattice path from (0, 0) to (m, n + 1) must touch a point F of the form (i, n + 1), i = 0, 1, . . . , m, for
the first time. For i ≠ j, we see that the lattice paths for which F = (i, n + 1) are disjoint from
the lattice paths for which F = (j, n + 1).
Hence, the total number of lattice paths is the sum, over i = 0, 1, . . . , m, of the number of lattice
paths for which F = (i, n + 1). Such a path reaches (i, n), takes a step U to F and then takes only R
steps. So, the number of lattice paths for which F = (i, n + 1) is nothing but the number of lattice
paths from (0, 0) to (i, n), which is C(n + i, i). Our proof is complete.

Discussion 5.8.3. As observed earlier, the number of lattice paths from (0, 0) to (n, n) is C(2n, n).
Suppose, we wish to take paths so that at no step the number of U ’s exceeds the number of R’s. Then,
what is the number of such paths?
Ans: Call an arrangement of n many U ’s and n many R’s a ‘bad path’ if the number of U ’s exceeds
the number of R’s at least once. For example, the path RRU U U RRU is a ‘bad path’. To each such
arrangement, we correspond another arrangement of n+1 many U ’s and n−1 many R’s in the following
way: spot the first place where the number of U’s exceeds that of R’s in the ‘bad path’. Then, from the
next letter onwards change R to U and U to R. For example, the bad path RRUUURRU corresponds
to the path RRUUUUUR. Notice that this is a one-one correspondence between the bad paths and the
arrangements of n + 1 many U’s and n − 1 many R’s. Thus, the number of bad paths is C(2n, n − 1).
So, the answer to the question is
$$C(2n, n) - C(2n, n - 1) = \frac{C(2n, n)}{n + 1}.$$
Discussion 5.8.4. A rectangular grid with m units on x-axis and n units on y-axis is called an
(m, n)-lattice. By a standard (m, n)-lattice, we mean the rectangular grid with opposite corners at
(0, 0) and (m, n).
Consider the standard (n, n)-lattice. Recall that a lattice path from (0, 0) to (n, n) can be viewed
as an arrangement of n many R’s and n many U’s. An arrangement in which at some position the
number of U ’s is more than that of the R’s corresponds to a lattice path which enters the region y > x
in that grid.
From the previous discussion, it follows that the number of lattice paths from (0, 0) to (n, n) that
do not enter the region above the line y = x is C(2n, n)/(n + 1).
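
The reflection used in Discussion 5.8.3 and the resulting count can be tried out on small cases. In the sketch below (the names `reflect` and `good_paths` are ours), a path is a string over the letters R and U; `reflect` applies the swap described above to a bad path, and `good_paths` counts by brute force the arrangements in which the U’s never outnumber the R’s.

```python
from itertools import combinations
from math import comb


def reflect(path: str):
    """Swap R and U after the first position where the U's outnumber the R's.

    Returns None if the path is not a bad path.
    """
    r = u = 0
    for idx, step in enumerate(path):
        if step == "R":
            r += 1
        else:
            u += 1
        if u > r:
            tail = path[idx + 1:].translate(str.maketrans("RU", "UR"))
            return path[: idx + 1] + tail
    return None


def good_paths(n: int) -> int:
    """Count arrangements of n R's and n U's with no prefix having more U's than R's."""
    count = 0
    for u_positions in combinations(range(2 * n), n):
        u_set = set(u_positions)
        balance = 0  # (number of R's) - (number of U's) read so far
        ok = True
        for pos in range(2 * n):
            balance += -1 if pos in u_set else 1
            if balance < 0:
                ok = False
                break
        count += ok
    return count


print(reflect("RRUUURRU"))
for n in range(1, 7):
    assert good_paths(n) == comb(2 * n, n) - comb(2 * n, n - 1) == comb(2 * n, n) // (n + 1)
```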

Definition 5.8.5. The n-th Catalan number, denoted Cn , is the number of different representations
of the product A1 · · · An+1 of n + 1 square matrices of the same size using n pairs of brackets. By
convention C0 = 1.

Example 5.8.6. The different representations of the product A1 · · · A4 by using 3 pairs of brackets
are (((A1 A2 )A3 )A4 ), ((A1 A2 )(A3 A4 )), ((A1 (A2 A3 ))A4 ), (A1 ((A2 A3 )A4 )), (A1 (A2 (A3 A4 ))). Hence
C3 = 5.
Theorem 5.8.7. [Catalan number] Let n ∈ N. Then $C_n = \frac{C(2n, n)}{n + 1}$.

Proof. Consider a meaningful representation X of the product of n + 1 matrices with n pairs of
brackets. First we erase the subscripts, with the understanding that the i-th A from left is Ai.
Claim: After the (n − k)-th ‘(’, there are at least k + 2 many A’s.
Proof of the claim. It is true for n = 1, that is, when there are only two matrices. Assume it is
true for n = 1, 2, . . . , p − 1. Consider a meaningful representation X of the product of p + 1 matrices
with p pairs of brackets.
Observe that the last ( is followed by AA), as the product is meaningful.
Now, treat this (AA) as a single matrix, A. Then our original meaningful representation of the
product of p + 1 matrices changes into a meaningful representation X ∗ of p matrices with p − 1 pairs
of brackets.
Hence, by induction, in X ∗ , after the p−k = ((p−1)−(k−1))-th ‘(’, there are at least k+1 = k−1+2
many matrices. This means, in X, after the (p − k)-th ‘(’, there are at least k + 2 matrices. So the
claim is justified.
Drop the right brackets and one A from the right end, to have a sequence of n many ‘(’s and n
many A’s, where the number of A’s used till the (n − k)-th ‘(’ is at most n − (k + 1) = n − k − 1. So,
the number of A’s never exceeds the number of ‘(’.
Conversely, given such an arrangement, we can put back the ‘)’s: first add one more A at the right
end; find two consecutive letters from the last ‘(’; put a right bracket after them; treat (AA) as a
letter; repeat the process. For example,

((A((AAA → ((A((AAAA → ((A((AA)AA → ((A((AA)A)A → ((A((AA)A))A = ((A((AA)A))A)

By previous discussions, the number of such arrangements is $\frac{C(2n, n)}{n + 1}$.
Theorem 5.8.8. [Recurrence relation for Cn] Let n ∈ N. Then $C_n = \sum_{i=1}^{n} C_{i-1} C_{n-i} = \sum_{i=0}^{n-1} C_i C_{n-1-i}$.

Proof. As Cn is the number of ways to multiply n + 1 many A’s with n pairs of brackets, removing the
outer pair of brackets, we get two expressions: one is a meaningful multiplication of k many
A’s with k − 1 pairs of brackets and the other is a meaningful multiplication of n + 1 − k many A’s
with n − k pairs of brackets, where k can vary from 1 to n. The two expressions obtained for k = i
differ from the two expressions obtained for any k ≠ i. Hence,
$$C_n = \sum_{i=1}^{n} C_{i-1} C_{n-i} = \sum_{i=0}^{n-1} C_i C_{n-1-i}.$$
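
The recurrence, together with C0 = 1, determines all Catalan numbers. A short sketch (the function name `catalan_by_recurrence` is ours) that builds the list C0, . . . , Cn and compares it with the closed form of Theorem 5.8.7:

```python
from math import comb


def catalan_by_recurrence(limit: int) -> list:
    """Return [C_0, ..., C_limit] using C_n = sum_{i=0}^{n-1} C_i * C_{n-1-i}."""
    c = [1]  # C_0 = 1 by convention
    for n in range(1, limit + 1):
        c.append(sum(c[i] * c[n - 1 - i] for i in range(n)))
    return c


values = catalan_by_recurrence(10)
assert all(values[n] == comb(2 * n, n) // (n + 1) for n in range(11))
print(values[3])
```

The printed value is C3, which can be compared with Example 5.8.6.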

Example 5.8.9. A full binary tree is a rooted binary tree in which every node either has exactly two
offsprings or has no offspring, see Figure 5.4. Show that Cn is equal to the number of full binary trees
on 2n + 1 vertices.

Figure 5.4: Full binary trees on 7 vertices (or 4 leaves)

Let f(n) be the number of full binary trees on 2n + 1 vertices. The idea is to show that f(n)
satisfies the same recurrence relation as that of Cn and has the same initial values. We see that
f(0) = 1 = C0.
Now take any full binary tree on 2n + 1 vertices and delete the root. We get two trees, one on the
left, say Tl, and one on the right, say Tr. Notice that Tl and Tr are full binary trees and their sizes
are 2k + 1 and 2n − 2k − 1, respectively, where k can be 0, 1, . . . , n − 1. And these cases are mutually
disjoint, that is, a full binary tree with Tl having 2k + 1 vertices is different from one with Tl having
a different number of vertices. Hence, $f(n) = \sum_{k=0}^{n-1} f(k) f(n - k - 1)$. So f(n) = Cn.

Remark 5.8.10. The book titled “enumerative combinatorics” by Stanley [13] gives a comprehensive
list of places in combinatorics where Catalan numbers appear. The interested reader may have a look
at those.
Exercise 5.8.11. 1. Take C0 = 1. Use the recurrence relation $C_n = \sum_{i=1}^{n} C_{i-1} C_{n-i}$ to show that
Cn = C(2n, n)/(n + 1).
2. Give a bijection between ‘the solution set of x0 + x1 + x2 + · · · + xk = n in non-negative integers’
and ‘the set of lattice paths from (0, 0) to (n, k)’.
3. Use lattice paths to give a combinatorial proof of $\sum_{k=0}^{n} C(n, k) = 2^n$.
4. Use lattice paths to give a combinatorial proof of $\sum_{k=0}^{n} C(n, k)^2 = C(2n, n)$. [Hint: C(n, k) is the
number of lattice paths from (0, 0) to (n − k, k) as well as from (n − k, k) to (n, n).]
5. As Cn = C(2n, n)/(n + 1) is the number of ways of expressing a product of n + 1 many A’s


using n pairs of brackets meaningfully, it is an integer and so n + 1 divides C(2n, n). Give an
arithmetic proof of this fact.
6. A man is standing on the edge of a swimming pool (facing it) holding a bag containing n blue
and n red balls. He randomly picks up one ball at a time and discards it. If the ball is blue he
takes a step back and if the ball is red, he takes a step forward. What is the probability of his
falling into the swimming pool?
7. Let n ≥ 4 and consider a regular polygon with vertices 1, 2, · · · , n. In how many ways can we
divide the polygon into triangles using (n − 3) non-crossing diagonals?
8. How many lattice paths are there from (0, 0) to (9, 9) which do not cross the dotted line, that
is, they stay in the lower part of the lattice?

[Figure: lattice with corners (0, 0) and (9, 9) and a dotted line]

9. How many arrangements of n blue and n red balls are there such that at any position in the
arrangement the number of blue balls (till that position) is at most one more than the number of
red balls (till that position)?
10. We want to write a matrix of size 10 × 2 using numbers 1, . . . , 20 with each number appearing
exactly once. Then, determine the number of such matrices in which the numbers
(a) increase from left to right?
(b) increase from up to down?
(c) increase from left to right and up to down?

11. Show that Cn also equals the number of integer sequences that satisfy 1 ≤ a1 ≤ a2 ≤ · · · ≤ an
and ai ≤ i, for all i, 1 ≤ i ≤ n.

Exercise 5.8.12. [Additional exercises] :

1. Prove that there exists a bijection between any two of the following sets.

(a) The set of words of length n on an alphabet consisting of m letters.


(b) The set of maps of an n-set into an m-set.
(c) The set of distributions of n distinct objects into m distinct boxes.
(d) The set of n-tuples on m letters.

2. Prove that there exists a bijection between any two of the following sets.
(a) The set of n letter words with distinct letters out of an alphabet consisting of m letters.
(b) The set of one-one functions from an n-set into an m-set.
(c) The set of distributions of n distinct objects into m distinct boxes, subject to ‘if an object
is put in a box, no other object can be put in the same box’.
(d) The set of n-tuples on m letters, without repetition.
(e) The set of permutations of m symbols taken n at a time.

3. Prove that there exists a bijection between any two of the following sets.

(a) The set of increasing words of length n on m ordered letters.


(b) The set of distributions of n non-distinct objects into m distinct boxes.
(c) The set of combinations of m symbols taken n at a time with repetitions permitted.

Need to put somewhere


1. For n ≥ 1, let $a_n = (n - 1)n(n + 1)$. Write a generating function for $a_n$ and hence evaluate
$\sum_{k=1}^{n} (k - 1)k(k + 1)$.
2. Let $a_n = -3a_{n-1} + 10a_{n-2} + 3 \times 2^n$, for n ≥ 2 with $a_0 = 0$ and $a_1 = 6$. Use generating function
to evaluate $a_n$.
Chapter 6

Combinatorics - II

6.1 Pigeonhole Principle


The Pigeonhole Principle is an obvious but powerful tool in solving many combinatorial problems.
We will prove its mathematical form first.

Theorem 6.1.1. [Pigeonhole Principle, PHP] Let A be a finite set and let f : A → {1, 2, . . . , n}
be a function. Let p1 , . . . , pn ∈ N. If |A| > p1 + · · · + pn , then there exists i ∈ {1, 2, . . . , n} such that
|f −1 (i)| > pi .

Proof. On the contrary, suppose that for each i ∈ {1, 2, . . . , n}, $|f^{-1}(i)| \le p_i$. As A is a disjoint union
of the sets $f^{-1}(i)$, we have $|A| = \sum_{i=1}^{n} |f^{-1}(i)| \le p_1 + \cdots + p_n < |A|$, a contradiction.

The elements of A are thought of as pigeons and the elements of {1, 2, . . . , n} as pigeon holes, so that the
principle is commonly formulated in the following forms, which come in handy in particular problems.

Discussion 6.1.2. [Pigeonhole principle (PHP)]

PHP1. If n + 1 pigeons stay in n holes then there is a hole with at least two pigeons.
PHP2. If kn + 1 pigeons stay in n holes then there is a hole with at least k + 1 pigeons.
PHP3. If p1 + · · · + pn + 1 pigeons stay in n holes then there exists i, 1 ≤ i ≤ n such that the i-th
hole contains at least pi + 1 pigeons.
Example 6.1.3. 1. Consider a tournament of n > 1 players, where each pair plays exactly once
and each player wins at least once. Then, there are two players with the same number of wins.
Ans: Number of wins varies from 1 to n − 1 and there are n players.
2. A bag contains 5 red, 8 blue, 12 green and 7 yellow marbles. The least number of marbles to be
chosen to ensure that there are
(a) at least 4 marbles of the same color is 13,
(b) at least 7 marbles of the same color is 24,
(c) at least 4 red or at least 7 of any other color is 22.

3. In a group of 6 people, prove that there are three mutual friends or three mutual strangers.
Ans: Let a be a person in the group. Let F be the set of friends of a and S the set of strangers
to a. Clearly |S| + |F | = 5. By PHP either |F | ≥ 3 or |S| ≥ 3.
Case 1: |F | ≥ 3. If any two in F are friends then those two along with a are three mutual
friends. Else F is a set of mutual strangers of size at least 3.

Case 2: |S| ≥ 3. If any pair in S are strangers then those two along with a are three mutual
strangers. Else S becomes a set of mutual friends of size at least 3.
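4. Let {x1, . . . , x9} ⊆ N with $\sum_{i=1}^{9} x_i = 30$. Then, prove that there exist distinct i, j, k ∈ {1, 2, . . . , 9}
with $x_i + x_j + x_k \ge 12$.
Ans: Note that $\frac{1}{9}\sum_{i=1}^{9} x_i = \frac{30}{9} = 3 + \frac{3}{9} > 3$. Let $x_i \ge x_j \ge x_k$ be the three largest values. If
$x_i + x_j + x_k \le 11$, then $x_k \le 3$, so each of the remaining six values is at most 3 and the total is at
most 11 + 18 = 29 < 30, a contradiction. Hence, the required result follows.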
5. If each point of the plane is colored red or blue, then prove that there exist two points of the same
color which are at a distance of 1 unit.
Ans: Take a point, say P . Draw a unit circle with P as the center. If all the points on the
circumference have the same color then we are done. Else, the circumference contains a point
which has the same color as that of P .
6. If 7 points are chosen inside or on the unit circle, then there is a pair of points which are at a
distance at most 1.
Ans: Divide the circle into 6 equal sectors by drawing radii so that angle between two consecutive
radii is π/3. By PHP there is a sector containing at least two points. The distance between these
two points is at most 1.
7. If n + 1 integers are selected from {1, 2, . . . , 2n}, then there are two, where one of them divides
the other.
Ans: Each number has the form $2^k s$, where s = 2m + 1 is an odd number. There are n odd
numbers in {1, 2, . . . , 2n}. If we select n + 1 numbers from {1, 2, . . . , 2n}, by PHP some two of them
(say, x, y) have the same odd part, that is, $x = 2^i s$ and $y = 2^j s$. If i ≤ j, then x|y, otherwise y|x.
8. Given any n integers, n ≥ 1012, prove that there is a pair that either differ by, or sum
to, a multiple of 2021. Is this true if we replace 1012 by 1011?
Ans: Consider some 1012 integers out of the given ones, say, n1 , n2 , . . . , n1012 . Write S =
{n1 − nk , n1 + nk : k = 2, . . . , 1012}. Then, |S| = 2022 and hence, at least two of them will have
the same remainder when divided by 2021. Then, consider their difference.
The question in the second part has negative answer. For, consider {0, 1, 2, . . . , 1010}.
9. Prove that there exist two powers of 3 whose difference is divisible by 2021.
Ans: Let $S = \{1 = 3^0, 3, 3^2, 3^3, \ldots, 3^{2021}\}$. Then, |S| = 2022. As the remainder of any integer
when divided by 2021 is one of 0, 1, 2, . . . , 2020, by PHP, there is a pair which has the same remainder.
Hence, 2021 divides $3^j - 3^i$ for some i ≠ j.
10. Prove that there exists a power of three that ends with 0001.
Ans: Let $S = \{1 = 3^0, 3, 3^2, 3^3, \ldots\}$. Now, divide each element of S by $10^4$. As $|S| > 10^4$, by
PHP, there exist i > j such that the remainders of $3^i$ and $3^j$, when divided by $10^4$, are equal.
So, $10^4$ divides $3^i - 3^j = 3^j(3^{\ell} - 1)$, where ℓ = i − j. But $\gcd(10^4, 3) = 1$ and thus, $10^4$ divides
$3^{\ell} - 1$. Then $3^{\ell} - 1 = s \cdot 10^4$ for some positive integer s. That is, $3^{\ell} = s \cdot 10^4 + 1$, from which
the result follows.
11. Suppose that f (x) is a polynomial with integer coefficients. If f (x) = 5 for three distinct integers,
then for no integer x, f (x) can be equal to 4.
Ans: Let f (x) = 5, for x ∈ {a, b, c}. If f (d) = 4, for an integer d, then (d − a)|f (d) − f (a) = −1.
So, a = d ± 1. Similarly b, c = d ± 1. By PHP two of a, b, c are the same, a contradiction.
Alternate. If f is an integer polynomial and f (m) = 0 for some integer m, then using the
factor/remainder theorem f (x) = (x − m)g(x) for some integer polynomial g. For our problem,
we see that f (x) = (x − a)(x − b)(x − c)g(x) + 5, where g is an integer polynomial. If f (n) = 4,
then (n − a), (n − b), (n − c)| − 1, so that (n − a), (n − b), (n − c) ∈ {1, −1}. By PHP some two
of them are the same, a contradiction.
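
The pigeonhole arguments in items 9 and 10 of the example are constructive enough to run. The sketch below (the function name `first_repeat_of_powers` is ours) walks through the remainders of 1, 3, 3^2, . . . until a remainder repeats, which PHP guarantees will happen after at most `modulus` steps.

```python
def first_repeat_of_powers(base: int, modulus: int):
    """Return (j, i) with j < i and base**i ≡ base**j (mod modulus)."""
    seen = {}  # remainder -> exponent at which it first occurred
    value = 1 % modulus
    exponent = 0
    while value not in seen:
        seen[value] = exponent
        value = (value * base) % modulus
        exponent += 1
    return seen[value], exponent


# Item 9: two powers of 3 whose difference is divisible by 2021.
j, i = first_repeat_of_powers(3, 2021)
assert (3**i - 3**j) % 2021 == 0

# Item 10: a power of 3 ending in 0001; since gcd(3, 10**4) = 1, take l = i - j.
j, i = first_repeat_of_powers(3, 10**4)
l = i - j
assert str(3**l).endswith("0001")
print(l)
```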

Theorem 6.1.4. Let r1 , r2 , · · · , rmn+1 be a sequence of mn + 1 distinct real numbers. Then, prove
that there is a subsequence of m + 1 numbers which is increasing or there is a subsequence of n + 1
numbers which is decreasing.
Does the above statement hold for every collection of mn distinct numbers?

Proof. Define $l_i$ to be the maximum length of an increasing subsequence starting at $r_i$. If some
$l_i \ge m + 1$ then we have nothing to prove. So, let $1 \le l_i \le m$ for all i. Since $(l_i)$ is a sequence of mn + 1
integers, by PHP, there is one number which repeats at least n + 1 times. Let $l_{i_1} = l_{i_2} = \cdots = l_{i_{n+1}} = s$,
where $i_1 < i_2 < \cdots < i_{n+1}$. Notice that $r_{i_1} > r_{i_2}$, because if $r_{i_1} < r_{i_2}$, then ‘$r_{i_1}$ together with the
increasing sequence of length s starting with $r_{i_2}$’ gives an increasing sequence of length s + 1. Similarly,
$r_{i_2} > r_{i_3} > \cdots > r_{i_{n+1}}$ and hence the required result holds.
Alternate. Let $S = \{r_1, r_2, \ldots, r_{mn+1}\}$ and define a map $f : S \to \mathbb{Z} \times \mathbb{Z}$ by $f(r_i) = (s, t)$, for
1 ≤ i ≤ mn + 1, where s equals the length of the longest increasing subsequence starting with $r_i$ and
t equals the length of the longest decreasing subsequence ending at $r_i$. Now, if either s ≥ m + 1 or
t ≥ n + 1, we are done. If not, then note that 1 ≤ s ≤ m and 1 ≤ t ≤ n. So, the number of tuples
(s, t) is at most mn. Thus, the mn + 1 distinct numbers are being mapped to at most mn tuples and hence
by PHP there are two numbers $r_i \ne r_j$, with i < j, such that $f(r_i) = f(r_j)$. Now, proceed as in the previous case:
if $r_i < r_j$ then the value of s at $r_i$ exceeds that at $r_j$, and if $r_i > r_j$ then the value of t at $r_j$ exceeds that at $r_i$.
Either way $f(r_i) \ne f(r_j)$, a contradiction.

The statement is FALSE for a collection of mn distinct numbers. Consider the sequence:

n, n − 1, · · · , 1, 2n, 2n − 1, . . . , n + 1, 3n, 3n − 1, · · · , 2n + 1, · · · , mn, mn − 1, · · · , mn − n + 1.

Its longest increasing subsequence has length m and its longest decreasing subsequence has length n.
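To see concretely why mn numbers are not enough, here is a small Python sketch that builds the block sequence above and measures its longest increasing and decreasing subsequences. The helper names are ours, and the patience-sorting routine is a standard technique swapped in for illustration, not something used in the notes.

```python
from bisect import bisect_left


def longest_increasing(seq):
    """Length of the longest strictly increasing subsequence (patience sorting)."""
    tails = []  # tails[k] = smallest possible last element of an increasing run of length k + 1
    for value in seq:
        pos = bisect_left(tails, value)
        if pos == len(tails):
            tails.append(value)
        else:
            tails[pos] = value
    return len(tails)


def longest_decreasing(seq):
    """Length of the longest strictly decreasing subsequence."""
    return longest_increasing([-v for v in seq])


def block_sequence(m, n):
    """n, n-1, ..., 1, 2n, ..., n+1, ..., mn, ..., mn-n+1 (m decreasing blocks of length n)."""
    return [b * n - k for b in range(1, m + 1) for k in range(n)]


seq = block_sequence(3, 4)
print(seq, longest_increasing(seq), longest_decreasing(seq))
```

For m = 3 and n = 4 the sequence has 12 terms, yet no increasing subsequence of length 4 and no decreasing subsequence of length 5.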

Theorem 6.1.5. Corresponding to each irrational number a, there exist infinitely many rational
numbers $\frac{p}{q}$ such that $\left| a - \frac{p}{q} \right| < \frac{1}{q^2}$.

Proof. It is enough to show that there are infinitely many $(p, q) \in \mathbb{Z}^2$ with $|qa - p| < \frac{1}{q}$. As a is
irrational, for every m ∈ N, $0 < ia - \lfloor ia \rfloor < 1$, for i = 1, . . . , m + 1. Hence, by PHP there exist i, j
with i < j such that
\[ |(j - i)a - (\lfloor ja \rfloor - \lfloor ia \rfloor)| < \frac{1}{m} \le \frac{1}{j - i}. \]
Then, the pair $(p_1, q_1) = (\lfloor ja \rfloor - \lfloor ia \rfloor, j - i)$ satisfies the required property. To generate another pair,
find $m_2$ such that
\[ \frac{1}{m_2} < \left| a - \frac{p_1}{q_1} \right| \]
and proceed as before to get $(p_2, q_2)$ such that $|q_2 a - p_2| < \frac{1}{m_2} \le \frac{1}{q_2}$. Since $\left| a - \frac{p_2}{q_2} \right| < \frac{1}{m_2} < \left| a - \frac{p_1}{q_1} \right|$,
we have $\frac{p_1}{q_1} \ne \frac{p_2}{q_2}$. Now use induction to get the required result.
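The pigeonhole step of the proof is constructive, and a numerical sketch may help. The Python code below places the fractional parts of a, 2a, ..., (m + 1)a into m subintervals of [0, 1) and returns the pair (p, q) obtained from the first collision. It works with floating point numbers, so it is only an approximation of the exact argument; the function name and the choice a = √2 are our own assumptions.

```python
import math


def dirichlet_pair(a, m):
    """Return (p, q) with 1 <= q <= m and |q*a - p| < 1/m, following the pigeonhole proof.

    Floating point is used throughout, so this is a numerical illustration only.
    """
    seen = {}  # subinterval index -> the multiplier i whose fractional part landed there
    for i in range(1, m + 2):
        frac = i * a - math.floor(i * a)
        k = int(frac * m)  # frac lies in [k/m, (k+1)/m)
        if k in seen:
            j = seen[k]  # j < i and both fractional parts lie in the same subinterval
            return math.floor(i * a) - math.floor(j * a), i - j
        seen[k] = i
    raise RuntimeError("no collision found; only possible through rounding effects")


a = math.sqrt(2)
for m in (10, 100, 1000):
    p, q = dirichlet_pair(a, m)
    print(m, p, q, abs(a - p / q) < 1 / q**2)
```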

Theorem 6.1.6. Let α be a positive irrational number. Then prove that S = {m + nα : m, n ∈ Z} is
dense in R.

Proof. Consider any open interval (a, b). By the Archimedean property, there exists n ∈ N such that
$\frac{1}{n} < b - a$. Observe that $0 < r_k = k\alpha - \lfloor k\alpha \rfloor < 1$, k = 1, . . . , n + 1. By PHP, some two satisfy
$0 < r_i - r_j < 1/n$. Then $x = r_i - r_j = (i - j)\alpha + \lfloor j\alpha \rfloor - \lfloor i\alpha \rfloor \in S$. Let p be the smallest integer so
that px > a. If px ≥ b, then $(a, b) \subseteq [(p - 1)x, px]$ and so $b - a \le x < \frac{1}{n}$, which is not possible. So,
px ∈ (a, b) and px ∈ S as well. Thus, $(a, b) \cap S \ne \emptyset$.


Exercise 6.1.7. 1. Consider the poset (X = P({1, 2, 3, 4}), ⊆). Write 6 maximal chains $P_1, \ldots, P_6$
(need not be disjoint) such that $\bigcup_i P_i = X$. Let $A_1, \ldots, A_7$ be 7 distinct subsets of {1, 2, 3, 4}.
Use PHP to prove that there exist i, j such that $A_i, A_j \in P_k$, for some k. That is, $\{A_1, \ldots, A_7\}$
cannot be an anti-chain. Conclude that this holds as the width of the poset is 6.
2. Suppose that f (x) is a polynomial with integer coefficients. If
(a) f (x) = 14 for three distinct integers, then for no integer x, f (x) can be equal to 15.
(b) f (x) = 11 for five distinct integers, then for no integer x, f (x) can be equal to 9.

3. There are 7 distinct real numbers. Is it possible to select two of them, say x and y such that
$0 < \frac{x - y}{1 + xy} < \frac{1}{\sqrt{3}}$?
4. If n is odd then for any permutation p of {1, 2, . . . , n} the product $\prod_{i=1}^{n} \big(i - p(i)\big)$ is even.
5. Five points are chosen at the nodes of a square lattice (view Z × Z). Why is it certain that a
mid-point of some two of them is a lattice point?
6. Choose 5 points at random inside an equilateral triangle of side 2 units. Show that there exist
two points that are away from each other by at most 1 unit.
7. Take 25 points on a plane satisfying 'among any three of them there is a pair at a distance less
than 1'. Then, some circle of unit radius contains at least 13 of the given points.
8. If each point of a circle is colored either red or blue, then show that there exists an isosceles
triangle with vertices of the same color.


9. Each point of the plane is colored red or blue, then prove the following.
(a) There is an equilateral triangle all of whose vertices have the same color.
(b) There is a rectangle all of whose vertices have the same color.

10. Show that among any 6 integers from {1, 2, . . . , 10}, there exists a pair with odd sum.
11. Any 14-subset of {1, 2, . . . , 46} has four elements a, b, c, d such that a + b = c + d.
12. Show that if 9 of the 12 chairs in a row are filled, then some 3 consecutive chairs are filled. Will
8 work?
13. Show that every n-sequence of integers has a consecutive subsequence with sum divisible by n.
14. Let n > 3 and S ⊆ {1, 2, . . . , n} of size $m = \lfloor \frac{n+2}{2} \rfloor + 1$. Then, there exist a, b, c ∈ S such that
a + b = c.
15. Let a, b ∈ N, a < b. Given more than half of the integers in the set {1, 2, . . . , a + b}, there is a
pair which differ by either a or b.
16. Consider a chess board with two of the diagonally opposite corners removed. Is it possible to
cover the board with pieces of rectangular dominoes whose size is exactly two board squares?
17. Mark the centers of all squares of an 8 × 8 chess board. Is it possible to cut the board with 13
straight lines not passing through any center, so that every piece had at most 1 center?
18. Fifteen squirrels have 104 nuts. Then, some two squirrels have equal number of nuts.

19. Let $\{x_1, x_2, \ldots, x_n\} \subseteq \mathbb{Z}$. Prove that there exist 1 ≤ i ≤ j ≤ n such that $x_i + x_{i+1} + \cdots + x_{j-1} + x_j$
is a multiple of 2021, whenever n ≥ 2021.
20. Let A and B be two discs, each having 2n equal sectors. On disc A, n sectors are colored red and
n are colored blue. The sectors of disc B are colored arbitrarily with red and blue colors. Show
that there is a way of putting the two discs, one above the other, so that at least n corresponding
sectors have the same colors.

21. Show that there is a non-zero integer multiple of 2021 whose decimal representation has 2022
consecutive zeroes after the first decimal point.
22. If more than half of the subsets of {1, 2, . . . , n} are selected, then some two of the selected subsets
have the property that one is a subset of the other.
23. Suppose we are given any ten 4-subsets of {1, 2, . . . , 11}. Then, show that some two of them have
at least 2 elements in common.
24. A person takes at least one aspirin a day for 30 days. If he takes 45 aspirin altogether then prove
that in some sequence of consecutive days he takes exactly 14 aspirins.
25. If 58 entries of a 14 × 14 matrix are 1 and the remaining entries are 0, then prove that there is
a 2 × 2 submatrix with all entries 1.
26. Let A and B be two finite non-empty sets with $B = \{b_1, b_2, \ldots, b_m\}$. Let f : A → B be any
function. Then, for any non-negative integers $a_1, a_2, \ldots, a_m$, if $|A| = a_1 + a_2 + \cdots + a_m - m + 1$
then prove that there exists an i, 1 ≤ i ≤ m, such that $|f^{-1}(b_i)| \ge a_i$.
27. Each of the given 9 lines cuts a given square into two quadrilaterals whose areas are in the ratio
2 : 3. Prove that at least three of these lines pass through the same point.
28. Let S ⊆ {1, 2, . . . , 100} be a 10-set. Then, some two disjoint subsets of S have equal sum.
29. Prove that corresponding to each n ∈ N, n odd, there exists an ℓ ∈ N such that n divides $2^{\ell} - 1$.
30. Does there exist a multiple of 2021 that is formed using only the digits
(a) 2? Justify your answer.
(b) 2 and 3 and the number of 2’s and 3’s are equal? Justify your answer.

31. Each natural number has a multiple of the form 9 · · · 90 · · · 0, with at least one 9.

6.2 Principle of Inclusion and Exclusion


We start this section with the following example.

Example 6.2.1. How many natural numbers n ≤ 1000 are not divisible by any of 2, 3?
Ans: Let A2 = {n ∈ N|n ≤ 1000, 2|n} and A3 = {n ∈ N|n ≤ 1000, 3|n}. Then, |A2 ∪ A3 | =
|A2 | + |A3 | − |A2 ∩ A3 | = 500 + 333 − 166 = 667. So, the required answer is 1000 − 667 = 333.

We now generalize the above idea whenever we have 3 or more sets.

Theorem 6.2.2. [Principle of Inclusion and Exclusion, PIE] Let $A_1, \ldots, A_n$ be finite subsets of a
set U. Then,
\[ \Big| \bigcup_{i=1}^{n} A_i \Big| = \sum_{k=1}^{n} (-1)^{k+1} \sum_{1 \le i_1 < \cdots < i_k \le n} \big| A_{i_1} \cap \cdots \cap A_{i_k} \big|. \tag{6.1} \]
Or equivalently, the number of elements of U which are in none of $A_1, A_2, \ldots, A_n$ equals
\[ \Big| U \setminus \bigcup_{i=1}^{n} A_i \Big| = |U| - \sum_{k=1}^{n} (-1)^{k+1} \sum_{1 \le i_1 < \cdots < i_k \le n} \big| A_{i_1} \cap \cdots \cap A_{i_k} \big|. \]

Proof. Let $x \in \bigcup_{i=1}^{n} A_i$. We show that x contributes (increases the value by) exactly
1 to both sides of Equation (6.1). So, assume that x is included only in the sets $A_1, \ldots, A_r$, where r ≥ 1. Then,
the contribution of x to $|A_{i_1} \cap \cdots \cap A_{i_k}|$ is 1 if and only if $\{i_1, \ldots, i_k\} \subseteq \{1, 2, \ldots, r\}$. Hence, the
contribution of x to $\sum_{1 \le i_1 < \cdots < i_k \le n} |A_{i_1} \cap \cdots \cap A_{i_k}|$ is C(r, k). Thus, the contribution of x to the right
hand side of Equation (6.1) is

\[ C(r, 1) - C(r, 2) + C(r, 3) - \cdots + (-1)^{r+1} C(r, r) = 1 - (1 - 1)^r = 1. \]

The element x clearly contributes 1 to the left hand side of Equation (6.1) and hence the required
result follows. The proof of the equivalent condition is left for the readers.

Example 6.2.3. How many integers between 1 and 10000 are divisible by none of 2, 3, 5, 7?
Ans: For i ∈ {2, 3, 5, 7}, let Ai = {n ∈ N|n ≤ 10000, i|n}. Therefore, the required answer is
10000 − |A2 ∪ A3 ∪ A5 ∪ A7 | = 2285.
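The two counts in Examples 6.2.1 and 6.2.3 can be checked mechanically. The following Python sketch applies PIE over all subsets of the given divisors and compares the result with a direct count; the function names are ours, and the PIE routine assumes the divisors are pairwise coprime so that the size of each intersection is a simple floor.

```python
from itertools import combinations
from math import prod


def count_avoiding(limit, divisors):
    """Count 1 <= n <= limit divisible by none of the pairwise coprime divisors, via PIE."""
    total = 0
    for k in range(len(divisors) + 1):
        for subset in combinations(divisors, k):
            # |intersection of the A_d for d in subset| = floor(limit / product of subset)
            total += (-1) ** k * (limit // prod(subset))
    return total


def count_avoiding_bruteforce(limit, divisors):
    """Direct count, used only as a cross-check."""
    return sum(1 for n in range(1, limit + 1) if all(n % d for d in divisors))


print(count_avoiding(1000, [2, 3]), count_avoiding(10000, [2, 3, 5, 7]))
```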

Definition 6.2.4. [Euler Totient Function] For a fixed n ∈ N, the Euler’s totient function is
defined as ϕ(n) = |{k ∈ N : k ≤ n, gcd(k, n) = 1}|.

Thus, ϕ(n) is the number of natural numbers less than or equal to n and relatively prime to n.
For instance, ϕ(1) = 1, ϕ(2) = 1, ϕ(3) = 2, ϕ(4) = 2, ϕ(12) = 4, etc.

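Theorem 6.2.5. Let $p_1, \ldots, p_k$ be the distinct prime divisors of n. Then

\[ \varphi(n) = n \Big(1 - \frac{1}{p_1}\Big)\Big(1 - \frac{1}{p_2}\Big) \cdots \Big(1 - \frac{1}{p_k}\Big). \]

Proof. For 1 ≤ i ≤ k, let $A_i = \{m \in \mathbb{N} : m \le n,\ p_i \mid m\}$. Then, $|A_i| = \frac{n}{p_i}$, $|A_i \cap A_j| = \frac{n}{p_i p_j}$, and so on.
By PIE,
\[ \varphi(n) = n - \Big| \bigcup_{i=1}^{k} A_i \Big| = n \Big[ 1 - \sum_{i=1}^{k} \frac{1}{p_i} + \sum_{1 \le i < j \le k} \frac{1}{p_i p_j} - \cdots + (-1)^k \frac{1}{p_1 p_2 \cdots p_k} \Big] = n \Big(1 - \frac{1}{p_1}\Big)\Big(1 - \frac{1}{p_2}\Big) \cdots \Big(1 - \frac{1}{p_k}\Big). \]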
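As a quick check of Theorem 6.2.5, the sketch below computes ϕ(n) from the product formula and compares it with the definition. It is a minimal illustration with our own helper names, using trial division, which is adequate only for small n.

```python
from math import gcd


def prime_divisors(n):
    """Distinct prime divisors of n by trial division."""
    primes, d = [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes


def phi_product(n):
    """Euler totient from the formula n * prod(1 - 1/p), kept in integer arithmetic."""
    result = n
    for p in prime_divisors(n):
        result = result // p * (p - 1)
    return result


def phi_definition(n):
    """Euler totient straight from Definition 6.2.4."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


print([phi_product(n) for n in (1, 2, 3, 4, 12)])
```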

Definition 6.2.6. [Derangement] A derangement of objects in a finite set S is a
permutation/arrangement σ on S such that for each x, σ(x) ≠ x. The number of derangements of {1, 2, . . . , n}
is denoted by $D_n$ with the convention that $D_0 = 1$.

For example, 2, 1, 4, 3 is a derangement of 1, 2, 3, 4, but 2, 3, 1, 4 is not a derangement of 1, 2, 3, 4.


If a sequence (xn ) converges to some limit `, we say that xn is approximately ` for large values of
n, and write xn ≈ `.
Theorem 6.2.7. For n ∈ N, $D_n = n! \sum_{k=0}^{n} \frac{(-1)^k}{k!}$. Consequently, $\frac{D_n}{n!} \approx \frac{1}{e}$.

Proof. For each i, 1 ≤ i ≤ n, let $A_i$ be the set of arrangements σ such that σ(i) = i. Then, verify that
$|A_i| = (n - 1)!$, $|A_i \cap A_j| = (n - 2)!$ and so on. Thus,
\[ \Big| \bigcup_i A_i \Big| = n (n - 1)! - C(n, 2)(n - 2)! + \cdots + (-1)^{n-1} C(n, n)\, 0! = n! \sum_{k=1}^{n} \frac{(-1)^{k-1}}{k!}. \]
So, $D_n = n! - \big| \bigcup_i A_i \big| = n! \sum_{k=0}^{n} \frac{(-1)^k}{k!}$. Since $\sum_{k=0}^{\infty} \frac{(-1)^k}{k!} = e^{-1}$, it follows that $\lim_{n \to \infty} \frac{D_n}{n!} = \frac{1}{e}$.
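The formula of Theorem 6.2.7 is easy to test for small n. The sketch below evaluates it in exact rational arithmetic and compares it with a direct enumeration of permutations; the function names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import permutations
from math import e, factorial


def derangements_formula(n):
    """D_n = n! * sum_{k=0}^{n} (-1)^k / k!, computed exactly."""
    total = sum(Fraction((-1) ** k, factorial(k)) for k in range(n + 1))
    return int(factorial(n) * total)


def derangements_bruteforce(n):
    """Count permutations of {0, ..., n-1} without fixed points."""
    return sum(1 for p in permutations(range(n)) if all(p[i] != i for i in range(n)))


for n in range(1, 8):
    print(n, derangements_formula(n), derangements_formula(n) / factorial(n), 1 / e)
```

The printed ratios approach 1/e quickly, which is the sense in which $D_n/n! \approx 1/e$.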
Example 6.2.8. How many square-free integers do not exceed n for a given n ∈ N?

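Answer: Let $P = \{p_1, \ldots, p_s\}$ be the set of primes not exceeding $\sqrt{n}$ and for 1 ≤ i ≤ s, let $A_i$ be the
set of integers between 1 and n that are multiples of $p_i^2$. Then
\[ |A_i| = \Big\lfloor \frac{n}{p_i^2} \Big\rfloor, \quad |A_i \cap A_j| = \Big\lfloor \frac{n}{p_i^2 p_j^2} \Big\rfloor, \ \cdots \]

So, the number of square-free integers not greater than n is

\[ n - \Big| \bigcup_{i=1}^{s} A_i \Big| = n - \sum_{i=1}^{s} \Big\lfloor \frac{n}{p_i^2} \Big\rfloor + \sum_{1 \le i < j \le s} \Big\lfloor \frac{n}{p_i^2 p_j^2} \Big\rfloor - \sum_{1 \le i < j < k \le s} \Big\lfloor \frac{n}{p_i^2 p_j^2 p_k^2} \Big\rfloor + \cdots \]

For n = 100, we have P = {2, 3, 5, 7}. So, the number of square-free integers not exceeding 100 is
\[ 100 - \Big\lfloor \frac{100}{4} \Big\rfloor - \Big\lfloor \frac{100}{9} \Big\rfloor - \Big\lfloor \frac{100}{25} \Big\rfloor - \Big\lfloor \frac{100}{49} \Big\rfloor + \Big\lfloor \frac{100}{36} \Big\rfloor + \Big\lfloor \frac{100}{100} \Big\rfloor = 61. \]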
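The count of square-free integers can be cross-checked as follows. This Python sketch evaluates the PIE sum over all subsets of the primes up to √n and compares it with a direct test of each integer; both helper names are ours, and the subset enumeration is exponential in the number of primes, so it is meant for small n only.

```python
from itertools import combinations
from math import isqrt, prod


def count_square_free_pie(n):
    """Square-free integers in [1, n] via inclusion-exclusion over prime squares."""
    primes = [p for p in range(2, isqrt(n) + 1)
              if all(p % q for q in range(2, isqrt(p) + 1))]
    total = 0
    for k in range(len(primes) + 1):
        for subset in combinations(primes, k):
            total += (-1) ** k * (n // prod(subset) ** 2)
    return total


def count_square_free_direct(n):
    """Square-free integers in [1, n] by testing every candidate."""
    def square_free(k):
        return all(k % (d * d) for d in range(2, isqrt(k) + 1))
    return sum(1 for k in range(1, n + 1) if square_free(k))


print(count_square_free_pie(100), count_square_free_direct(100))
```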
Exercise 6.2.9. 1. In a school there are 12 students who take an art course A, 20 who take a
biology course B, 20 who take a chemistry course C and 8 who take a dance course D. There
are 5 students who take both A and B, 7 students who take both A and C, 4 students who take
both A and D, 16 students who take both B and C, 4 students who take both B and D and 3
students who take both C and D. There are 3 who take A, B and C; 2 who take A, B
and D; 3 who take A, C and D; and 2 who take B, C and D. Finally there are 2 in all four
courses and further 71 students who have not taken any of these courses. Find the total number
of students.
2. Let n ∈ N. Using PIE, show that $S(n, r) = \frac{1}{r!} \sum_{i=0}^{r-1} (-1)^i C(r, i)(r - i)^n$.
3. Show that $\sum_{k=0}^{m} (-1)^k C(m, k)(m - k)^n = \begin{cases} n! & \text{if } m = n \\ 0 & \text{if } m > n. \end{cases}$
4. Determine the number of 10-letter words over English alphabet that do not contain all the vowels.
5. Let m, n ∈ N with gcd(m, n) = 1. Prove that ϕ(mn) = ϕ(m)ϕ(n).
6. Determine all natural numbers n satisfying ϕ(n) = 13.
7. Determine all natural numbers n satisfying ϕ(n) = 12.
8. For each fixed n ∈ N, use mathematical induction to prove that $\sum_{d \mid n} \varphi(d) = n$.
9. For each fixed n ∈ N, use mathematical induction to prove that $\sum_{d \mid n} \varphi(d) = n$.

10. A function f : N → N is said to be multiplicative if f(nm) = f(n)f(m), whenever gcd(n, m) = 1.
Let f, g : N → N be functions satisfying $f(n) = \sum_{d \mid n} g(d)$ and f(1) = g(1) = 1. If f is
multiplicative then use induction to show that g is also multiplicative.

11. Show that for n ≥ 2, $D_n = \lfloor \frac{n!}{e} + \frac{1}{2} \rfloor$.
12. Prove combinatorially: $\sum_{i=0}^{n} C(n, i) D_{n-i} = n!$.
i=0
13. Find the number of non-negative integer solutions of a + b + c + d = 27, where 1 ≤ a ≤ 5, 2 ≤
b ≤ 7, 3 ≤ c ≤ 9, 4 ≤ d ≤ 11.

14. Let x be a natural number less than or equal to 9999999.

(a) Find the number of x’s for which the sum of the digits in x equals 30.
(b) How many of the solutions obtained in the first part consist of 7 digits?

15. In how many ways the digits 0, 1, . . . , 9 can be arranged so that the digit i is never followed
immediately by i + 1.

16. Determine the number of strings of length 15 that use some or all of the digits 0, 1, . . . , 9, so that
no string contains all the 10 digits.

17. Determine the number of ways of permuting the 26 letters of the English alphabet so that none
of the patterns lazy, run, show and pet occurs.
18. Let $S = \{(n_1, n_2, n_3) \mid n_i \in \mathbb{N},\ \sum n_i = 15\}$. Evaluate $\sum_{(n_1, n_2, n_3) \in S} \frac{15!}{n_1!\, n_2!\, n_3!}$.

19. Each of the 9 senior students said: ‘the number of junior students I want to help is exactly one’.
There were 4 junior students a, b, c, d, who wanted their help. The allocation was done randomly.
What is the probability that either a has exactly two seniors to help him or b has exactly 3 seniors
to help him or c has no seniors to help him?

6.3 Generating Functions


Generating functions are one of the strongest tools in combinatorics. We start with the definition of formal power series
over Q and develop the theory of generating functions. The theory is then used to get closed form expressions
for some known recurrence relations, which in turn yield some binomial identities.

Definition 6.3.1. 1. An algebraic expression of the form $f(x) = \sum_{n \ge 0} a_n x^n$, where $a_n \in \mathbb{Q}$ for all
n ≥ 0, is called a formal power series in the indeterminate x over Q. The set of all such series is denoted by Q[[x]].
By $\mathrm{cf}[x^n, f]$, we denote the coefficient of $x^n$ in f, e.g., $\mathrm{cf}\big[x^n, \sum_{n \ge 0} a_n x^n\big] = a_n$.

2. Two elements f, g ∈ Q[[x]] are said to be equal if $\mathrm{cf}[x^n, f] = \mathrm{cf}[x^n, g]$ for all n ≥ 0.

3. Let $f(x) = \sum_{n \ge 0} a_n x^n$ and $g(x) = \sum_{n \ge 0} b_n x^n$ be elements in Q[[x]]. Then, their

(a) sum/addition is defined by $\mathrm{cf}[x^n, f + g] = \mathrm{cf}[x^n, f] + \mathrm{cf}[x^n, g]$.
(b) scalar multiplication is defined by $\mathrm{cf}[x^n, \alpha f] = \alpha\, \mathrm{cf}[x^n, f]$, for α ∈ Q.
Thus, with the above operations, the class of formal power series Q[[x]] over Q is a vector
space which is isomorphic to the space of all sequences.
(c) One also defines the product (called the Cauchy product) by $\mathrm{cf}[x^n, f \cdot g] = c_n = \sum_{k=0}^{n} a_k b_{n-k}$.

Before proceeding further, we consider the following examples.



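Example 6.3.2. 1. How many words of size 8 can be formed with 6 copies of A and 6 copies of
B?
Ans: $\sum_{k=2}^{6} C(8, k)$, as we just need to choose k places for A, where 2 ≤ k ≤ 6.

Alternate. In any such word, we need m many A's and n many B's with m + n = 8, m ≤ 6
and n ≤ 6. Also, the number of words with m many A's and n many B's is $\frac{8!}{m!\, n!}$.
We identify this number with $\frac{8!\, x^m y^n}{m!\, n!}$ and note that this is a term of degree 8 in
\[ 8! \Big[ 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \frac{x^5}{5!} + \frac{x^6}{6!} \Big] \Big[ 1 + y + \frac{y^2}{2!} + \frac{y^3}{3!} + \frac{y^4}{4!} + \frac{y^5}{5!} + \frac{y^6}{6!} \Big]. \]
If we replace y by x, then our answer is
\[ \begin{aligned}
& 8!\, \mathrm{cf}\Big[x^8, \Big(1 + x + \frac{x^2}{2!} + \cdots + \frac{x^6}{6!}\Big)\Big(1 + x + \frac{x^2}{2!} + \cdots + \frac{x^6}{6!}\Big)\Big] \\
&= 8!\, \mathrm{cf}\Big[x^8, \Big(\frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^6}{6!}\Big)\Big(\frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^6}{6!}\Big)\Big] \\
&= 8!\, \mathrm{cf}\Big[x^8, \Big(\frac{x^2}{2!} + \frac{x^3}{3!} + \cdots\Big)\Big(\frac{x^2}{2!} + \frac{x^3}{3!} + \cdots\Big)\Big] \\
&= 8!\, \mathrm{cf}\big[x^8, (e^x - 1 - x)^2\big] = 8!\, \mathrm{cf}\big[x^8, e^{2x} + 1 + x^2 - 2xe^x - 2e^x + 2x\big] \\
&= 8! \Big( \frac{2^8}{8!} - \frac{2}{7!} - \frac{2}{8!} \Big) = 238.
\end{aligned} \]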

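2. How many anagrams (rearrangements) are there of the word MISSISSIPPI?

Ans: Using basic counting, the answer is $\frac{11!}{4!\,4!\,2!}$.

Alternate. For another understanding, note that $\frac{11!}{4!\,4!\,2!} = 11! \times \mathrm{cf}\Big[x^{11}, x \cdot \frac{x^4}{4!} \cdot \frac{x^4}{4!} \cdot \frac{x^2}{2!}\Big]$. Here
the numbers $1 = \mathrm{cf}[x, x]$, $\frac{1}{4!} = \mathrm{cf}\big[x^4, \frac{x^4}{4!}\big]$, $\frac{1}{4!} = \mathrm{cf}\big[x^4, \frac{x^4}{4!}\big]$ and $\frac{1}{2!} = \mathrm{cf}\big[x^2, \frac{x^2}{2!}\big]$ correspond
to the number of occurrences of M, I, S and P, respectively. Hence, the readers should note that

\[ \frac{11!}{4!\,4!\,2!} = 11!\, \mathrm{cf}\Big[x^{11}, (1 + x)\Big(1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!}\Big)^2 \Big(1 + x + \frac{x^2}{2!}\Big)\Big], \text{ or} \]
\[ \frac{11!}{4!\,4!\,2!} = 11!\, \mathrm{cf}\Big[x^{11}, \Big(x + \frac{x^2}{2!} + \cdots\Big)\Big(\frac{x^4}{4!} + \frac{x^5}{5!} + \cdots\Big)^2 \Big(\frac{x^2}{2!} + \frac{x^3}{3!} + \cdots\Big)\Big]. \]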

3. How many multi-subsets of size 4 of the multiset {E, X, A, M, I, N, A, T, I, O, N } are there?


Ans: By direct counting the answer is

C(5, 4) + C(5, 3)C(3, 1) + [C(5, 2)C(3, 2) + C(5, 2)C(3, 1)]


+[C(5, 1)C(3, 3) + C(5, 1)C(3, 1)C(2, 1)] + [C(5, 0)C(3, 2) + C(5, 0)C(3, 1)] = 136.

Alternate. It is as good as asking how many A’s are you including and how many E’s, etc.
Suppose that we are considering A2 EM (means {A, A, E, M }). But this is a term of degree 4 in

(1 + A + A2 )(1 + E)(1 + I + I 2 )(1 + M )(1 + N + N 2 )(1 + O)(1 + T )(1 + X).

So their number is nothing but

\[ \mathrm{cf}\big[x^4, (1 + x)^5 (1 + x + x^2)^3\big] = \mathrm{cf}\big[x^4, (1 + 5x + 10x^2 + 10x^3 + 5x^4 + \cdots)(1 + 3x + 6x^2 + 7x^3 + 6x^4 + \cdots)\big] = 136. \]

4. How many non-negative integer solutions of u + v + w + t = 10 are there?

Ans: Note that u can take any value from 0 to 10 which corresponds to $1 + x + \cdots + x^{10}$. Hence,
the required answer is
\[ \mathrm{cf}\big[x^{10}, (1 + x + x^2 + \cdots)^4\big] = \mathrm{cf}\big[x^{10}, (1 - x)^{-4}\big] = C(13, 10) = \frac{4 \cdot 5 \cdots 13}{10!}. \]
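The coefficient extractions in Parts 1, 3 and 4 of Example 6.3.2 can be reproduced with a few lines of truncated polynomial arithmetic. The sketch below is only an illustration: the helpers `multiply` and `power` are our own names, and series are represented as coefficient lists cut off at a chosen degree.

```python
from fractions import Fraction
from math import factorial


def multiply(f, g, degree):
    """Cauchy product of coefficient lists f and g, truncated after x**degree."""
    h = [Fraction(0)] * (degree + 1)
    for i, a in enumerate(f[: degree + 1]):
        for j, b in enumerate(g[: degree + 1 - i]):
            h[i + j] += a * b
    return h


def power(f, exponent, degree):
    """f**exponent, truncated after x**degree."""
    result = [Fraction(1)] + [Fraction(0)] * degree
    for _ in range(exponent):
        result = multiply(result, f, degree)
    return result


# Part 1: exponential-type factor 1 + x + ... + x^6/6! for each of the letters A and B.
letter = [Fraction(1, factorial(k)) for k in range(7)]
words = factorial(8) * power(letter, 2, 8)[8]

# Part 3: five letters occur once, three letters (A, I, N) occur twice.
multisubsets = multiply(power([1, 1], 5, 4), power([1, 1, 1], 3, 4), 4)[4]

# Part 4: each of the four variables contributes 1 + x + ... + x^10.
solutions = power([1] * 11, 4, 10)[10]

print(words, multisubsets, solutions)
```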
Definition 6.3.3. [Generating Functions] Let (bn ) = (b0 , b1 , b2 , . . . , ) be a sequence of integers.
Then,
1. the ordinary generating function (ogf) is the formal power series

b0 + b1 x + b2 x2 + b3 x3 + · · · , and

2. the exponential generating function (egf) is the formal power series

\[ b_0 + b_1 x + b_2 \frac{x^2}{2!} + b_3 \frac{x^3}{3!} + \cdots. \]

If the sequence has finitely many elements then the generating functions have finitely many terms.

Example 6.3.4. What is the number of non-negative integer solutions of 2a + 3b + 5c = r, r ∈ N0?

Ans: Note that a ∈ N0 and hence 2a corresponds to the formal power series $1 + x^2 + x^4 + \cdots$.
Thus, we need to consider the ogf
\[ (1 + x^2 + x^4 + \cdots)(1 + x^3 + x^6 + \cdots)(1 + x^5 + x^{10} + \cdots) = \frac{1}{(1 - x^2)(1 - x^3)(1 - x^5)}. \]
Hence, the required answer is $\mathrm{cf}\Big[x^r, \frac{1}{(1 - x^2)(1 - x^3)(1 - x^5)}\Big]$.
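For a specific r, the coefficient in Example 6.3.4 can be computed by multiplying in the factors $1/(1 - x^w)$ one at a time. The Python sketch below does this with a running sum and cross-checks against direct enumeration; the function names and the default weights tuple are our own illustrative choices.

```python
def count_solutions(r, weights=(2, 3, 5)):
    """Coefficient of x**r in the product of 1/(1 - x**w) over the given weights."""
    coeffs = [1] + [0] * r  # the series 1
    for w in weights:
        # Multiplying by 1 + x**w + x**(2w) + ... is a running sum with stride w.
        for n in range(w, r + 1):
            coeffs[n] += coeffs[n - w]
    return coeffs[r]


def count_solutions_direct(r):
    """Enumerate (a, b) and test whether the remainder is a non-negative multiple of 5."""
    return sum(1
               for a in range(r // 2 + 1)
               for b in range(r // 3 + 1)
               if r - 2 * a - 3 * b >= 0 and (r - 2 * a - 3 * b) % 5 == 0)


print([count_solutions(r) for r in range(11)])
```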
AF

Remark 6.3.5. 1. Let $f(x) = \sum_{n \ge 0} a_n \frac{x^n}{n!}$, $g(x) = \sum_{n \ge 0} b_n \frac{x^n}{n!} \in \mathbb{Q}[[x]]$. Then, in case of egf, their
product equals $\sum_{n \ge 0} d_n \frac{x^n}{n!}$, where $d_n = \sum_{k=0}^{n} C(n, k)\, a_k b_{n-k}$, for n ≥ 0.

2. Note that $e^{e^x - 1} \in \mathbb{Q}[[x]]$ as $e^y = \sum_{n \ge 0} \frac{y^n}{n!}$ implies that $e^{e^x - 1} = \sum_{n \ge 0} \frac{(e^x - 1)^n}{n!}$ and
\[ \mathrm{cf}\big[x^m, e^{e^x - 1}\big] = \mathrm{cf}\Big[x^m, \sum_{n \ge 0} \frac{(e^x - 1)^n}{n!}\Big] = \mathrm{cf}\Big[x^m, \sum_{n=0}^{m} \frac{(e^x - 1)^n}{n!}\Big]. \tag{6.2} \]
That is, for each m ≥ 0, $\mathrm{cf}\big[x^m, e^{e^x - 1}\big]$ is a sum of a finite number of rational numbers. Whereas,
the expression $e^{e^x} \notin \mathbb{Q}[[x]]$ as computing $\mathrm{cf}\big[x^m, e^{e^x}\big]$, for all m ≥ 0, requires infinitely many
computations.

3. Recall that if $f(x) = \sum_{n \ge 0} a_n x^n$, $g(x) = \sum_{n \ge 0} b_n x^n \in \mathbb{Q}[[x]]$ then the composition
\[ (f \circ g)(x) = f(g(x)) = \sum_{n \ge 0} a_n (g(x))^n = \sum_{n \ge 0} a_n \Big( \sum_{m \ge 0} b_m x^m \Big)^n \]
may not be defined (just to compute the constant term of the composition, one may have to
look at an infinite sum of rational numbers). For example, let $f(x) = e^x$ and g(x) = x + 1. Note
that g(0) = 1 ≠ 0. Here, $(f \circ g)(x) = f(g(x)) = f(x + 1) = e^{x+1}$. So, as a function f ◦ g is well
defined, but there is no formal procedure to write $e^{x+1}$ as $\sum_{k \ge 0} a_k x^k \in \mathbb{Q}[[x]]$ (i.e., with $a_k \in \mathbb{Q}$)
and hence $e^{x+1}$ is not a formal power series over Q.

With the algebraic operations as defined in Definition 6.3.1.3, it can be checked that Q[[x]] forms
a commutative ring with identity, where the identity element is given by the formal power series
f(x) = 1. In this ring, the element $f(x) = \sum_{n \ge 0} a_n x^n$ is said to have a reciprocal if there exists
another element $g(x) = \sum_{n \ge 0} b_n x^n \in \mathbb{Q}[[x]]$ such that f(x) · g(x) = 1. So, the question arises: under
what conditions on $\mathrm{cf}[x^n, f]$ can we find g(x) ∈ Q[[x]] such that f(x)g(x) = 1? The answer to this
question is given in the following proposition.

Proposition 6.3.6. The reciprocal of f ∈ Q[[x]] exists if and only if $\mathrm{cf}[x^0, f] \ne 0$. Further, if
the coefficients of f lie in Q, then so do the coefficients of its reciprocal.

Proof. Let $g(x) = \sum_{n \ge 0} b_n x^n \in \mathbb{Q}[[x]]$ be the reciprocal of $f(x) = \sum_{n \ge 0} a_n x^n$. Then, f(x)g(x) = 1 if and
only if $\mathrm{cf}[x^0, f \cdot g] = 1$ and $\mathrm{cf}[x^n, f \cdot g] = 0$, for all n ≥ 1.

But, by definition of the Cauchy product, $\mathrm{cf}[x^0, f \cdot g] = a_0 b_0$. Hence, if $a_0 = \mathrm{cf}[x^0, f] = 0$
then $\mathrm{cf}[x^0, f \cdot g] = 0$ and thus, f cannot have a reciprocal. However, if $a_0 \ne 0$, then the coefficients
$\mathrm{cf}[x^n, g] = b_n$ can be recursively obtained as follows:

$b_0 = 1/a_0$ as $1 = c_0 = a_0 b_0$;
$b_1 = -(a_1 b_0)/a_0$ as $0 = c_1 = a_0 b_1 + a_1 b_0$;
$b_2 = -(a_2 b_0 + a_1 b_1)/a_0$ as $0 = c_2 = a_0 b_2 + a_1 b_1 + a_2 b_0$; and in general, if we have computed $b_k$, for
k ≤ r, then using $0 = c_{r+1} = a_{r+1} b_0 + a_r b_1 + \cdots + a_1 b_r + a_0 b_{r+1}$, we get

\[ b_{r+1} = -(a_{r+1} b_0 + a_r b_1 + \cdots + a_1 b_r)/a_0. \]

Hence, the required result follows.
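The recursion in the proof translates directly into a short program. The following Python sketch computes the first few coefficients of the reciprocal in exact rational arithmetic; the function name `reciprocal` and the list-of-coefficients representation are our own assumptions.

```python
from fractions import Fraction


def reciprocal(a, terms):
    """First `terms` coefficients b_0, b_1, ... of 1/f, where f = a[0] + a[1] x + ...

    Follows the recursion b_r = -(a_r b_0 + ... + a_1 b_{r-1}) / a_0 and
    requires a[0] != 0, exactly as in Proposition 6.3.6.
    """
    if a[0] == 0:
        raise ValueError("no reciprocal: the constant term is zero")
    a = [Fraction(c) for c in a] + [Fraction(0)] * terms  # pad with zeros
    b = [1 / a[0]]
    for r in range(1, terms):
        b.append(-sum(a[k] * b[r - k] for k in range(1, r + 1)) / a[0])
    return b


print(reciprocal([1, -1], 6))      # 1/(1 - x)
print(reciprocal([1, -1, -1], 8))  # 1/(1 - x - x^2)
```

The second call uses $1 - x - x^2$, whose reciprocal has the Fibonacci numbers 1, 1, 2, 3, ... as coefficients.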


The next result gives the condition under which the composition (f ◦ g)(x) is well defined.

Proposition 6.3.7. Let f, g ∈ Q[[x]]. Then, the composition (f ◦ g)(x) ∈ Q[[x]] if either f is a
polynomial or $\mathrm{cf}[x^0, g(x)] = 0$. Moreover, if $\mathrm{cf}[x^0, f(x)] = 0$ and $\mathrm{cf}[x^1, f(x)] \ne 0$, then there exists g ∈ Q[[x]], with
$\mathrm{cf}[x^0, g(x)] = 0$, such that (f ◦ g)(x) = x. Furthermore, (g ◦ f)(x) ∈ Q[[x]] and (g ◦ f)(x) = x.

Proof. Let $f(x) = \sum_{n \ge 0} a_n x^n$ and suppose that either f is a polynomial or
$\mathrm{cf}[x^0, g(x)] = 0$. If f is a polynomial, then $(f \circ g)(x) = \sum_n a_n (g(x))^n$ is a finite sum of elements of Q[[x]].
If $\mathrm{cf}[x^0, g(x)] = 0$, then $x^n$ divides $(g(x))^n$, so to compute $c_k = \mathrm{cf}[x^k, (f \circ g)(x)]$, for k ≥ 0, one just needs to consider the
terms $\sum_{n=0}^{k} a_n (g(x))^n$. In either case, each $c_k \in \mathbb{Q}$ and thus, $(f \circ g)(x) = \sum_{n \ge 0} c_n x^n \in \mathbb{Q}[[x]]$.
This completes the proof of the first part. We leave the proof of the other part for the reader.
The proof of the next result is left for the reader.

Proposition 6.3.8. [Basic facts] Recall the following statements from Binomial theorem.
1. $\mathrm{cf}\big[x^n, (1 - x)^{-1}\big] = \mathrm{cf}\big[x^n, 1 + x + x^2 + \cdots\big] = 1$.

2. $(a_0 + a_1 x + \cdots)(1 + x + x^2 + \cdots) = a_0 + (a_0 + a_1)x + (a_0 + a_1 + a_2)x^2 + \cdots$.

3. $\mathrm{cf}\big[x^n, (1 - x)^{-r}\big] = \mathrm{cf}\big[x^n, (1 + x + x^2 + \cdots)^r\big] = C(n + r - 1, n)$. Thus,
$(1 - x)^{-5} = C(4, 4) + C(5, 4)x + C(6, 4)x^2 + \cdots$.

4. $(1 - x^m)^n = 1 - C(n, 1)x^m + C(n, 2)x^{2m} - \cdots + (-1)^n x^{nm}$.

5. $(1 + x + x^2 + \cdots + x^{m-1})^n = \Big( \frac{1 - x^m}{1 - x} \Big)^n = (1 - x^m)^n (1 + x + x^2 + \cdots)^n$.
We now define the formal differentiation in Q[[x]] and give some important results. The proof is
left for the reader.

Definition 6.3.9. Let $f(x) = \sum_{n \ge 0} a_n x^n \in \mathbb{Q}[[x]]$.

1. [Formal Differentiation] Then, the formal differentiation of f(x), denoted $f'(x)$, is defined by
\[ f'(x) = a_1 + 2a_2 x + \cdots + n a_n x^{n-1} + \cdots = \sum_{n \ge 1} n a_n x^{n-1}. \]

2. [Formal Integration] Then, the formal integration of f(x), denoted $\int f(x)\,dx$, is defined by
\[ \int f(x)\,dx = \alpha + a_0 x + \frac{a_1}{2} x^2 + \cdots + \frac{a_n}{n+1} x^{n+1} + \cdots = \alpha + \sum_{n \ge 0} \frac{a_n}{n+1} x^{n+1}. \]

Proposition 6.3.10. [ogf: tricks] Let g(x), h(x) be the ogf ’s for the sequences (an ), (bn ), respectively.
Then, the following are true.

1. Ag(x) + Bh(x) is the ogf for (Aan + Bbn ).

2. (1 − x)g(x) is the ogf for the sequence a0 , a1 − a0 , a2 − a1 , · · · .

3. (1 + x + x2 + · · · )g(x) = (1 − x)−1 g(x) is the ogf for (Mn ), where Mn = an + an−1 + · · · + a0 .

4. g(x)h(x) is the ogf for (cn ), where cn = a0 bn + a1 bn−1 + a2 bn−2 + · · · + an b0 .

5. $x g'(x)$ is the ogf for $(n a_n)$, n = 1, 2, . . ..

Example 6.3.11. 1. Let $a_r = 1$ for all r ≥ 0. Then, the ogf of the sequence $(a_r)$ equals
$1 + x + x^2 + \cdots = (1 - x)^{-1} = f(x)$. So, for r ≥ 0, the ogf for

(a) $a_r = r$ for all r ≥ 1 is $x f'(x)$ and
(b) $a_r = r^2$ for all r ≥ 1 is $x\big(f'(x) + x f''(x)\big)$.
(c) Using the above two examples, the ogf of the sequence $a_r = 3r + 5r^2$ for all r ≥ 1 is
$3x f'(x) + 5\big(x f'(x) + x^2 f''(x)\big) = 8x(1 - x)^{-2} + 10x^2(1 - x)^{-3}$.

2. The number of ways to distribute 50 coins among 30 students so that no student gets
more than 4 coins equals
\[ \begin{aligned}
\mathrm{cf}\big[x^{50}, (1 + x + x^2 + x^3 + x^4)^{30}\big] &= \mathrm{cf}\big[x^{50}, (1 - x^5)^{30} (1 - x)^{-30}\big] \\
&= \mathrm{cf}\big[x^{50}, (1 - x^5)^{30} \big( C(29, 29) + C(30, 29)x + C(31, 29)x^2 + \cdots \big)\big] \\
&= C(79, 50) - 30\, C(74, 45) + C(30, 2)\, C(69, 40) - \cdots \\
&= \sum_{i=0}^{10} (-1)^i C(30, i)\, C(79 - 5i, 29).
\end{aligned} \]

3. For n, r ∈ N, determine the number of solutions to $y_1 + \cdots + y_n = r$ with $y_i \in \mathbb{N}_0$, 1 ≤ i ≤ n.

Ans: Recall that this number equals C(r + n − 1, r) (see Theorem 5.3.1).

Alternate. We can think of the problem as follows: the above system can be interpreted as
coming from the monomial $x^r$, where $r = y_1 + \cdots + y_n$. Thus, the problem reduces to finding the
coefficients of $x^{y_k}$ of a formal power series, for $y_k \ge 0$. Now, recall that $\mathrm{cf}\big[x^{y_k}, (1 - x)^{-1}\big] = 1$.
Hence, the question reduces to computing
\[ \mathrm{cf}\Big[x^r, \frac{1}{(1 - x)(1 - x) \cdots (1 - x)}\Big] = \mathrm{cf}\Big[x^r, \frac{1}{(1 - x)^n}\Big] = C(r + n - 1, r). \]

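4. Evaluate $S := \sum_{k=0}^{\infty} \frac{k}{2^k} = \frac{1}{2} + \frac{2}{2^2} + \frac{3}{2^3} + \cdots$.
Ans: Note that
\[ \begin{aligned}
2S &= 1 + \frac{2}{2} + \frac{3}{2^2} + \frac{4}{2^3} + \cdots \\
S &= 0 + \frac{1}{2} + \frac{2}{2^2} + \frac{3}{2^3} + \cdots
\end{aligned} \]
Subtracting the second line from the first,
\[ S = 1 + \frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \cdots = 2. \]

Alternate. Put $f(x) = (1 - x)^{-1}$. Then, it has 1 as its radius of convergence and within this
radius, the derivative is the same as the power series obtained by term by term differentiation.
Thus, $f'(x) = 1 + 2x + 3x^2 + \cdots$ has 1 as its radius of convergence. Hence,
\[ S = \frac{1}{2} f'(1/2) = 2. \]

Alternate. Alternately (rearranging terms of an absolutely convergent series) it is
\[ \begin{array}{l}
\frac{1}{2} \; + \\
\frac{1}{4} + \frac{1}{4} \; + \\
\frac{1}{8} + \frac{1}{8} + \frac{1}{8} \; + \\
\quad \vdots
\end{array} \]
and adding column by column gives $1 + \frac{1}{2} + \cdots = 2$.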
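The alternating sum in Part 2 of Example 6.3.11 can be compared against a direct expansion of $(1 + x + \cdots + x^4)^{30}$. The sketch below does both; the function names and default arguments are ours and merely mirror the numbers of the example.

```python
from math import comb


def coin_distributions_formula(coins=50, students=30, cap=4):
    """Alternating sum read off from (1 - x**(cap+1))**students * (1 - x)**(-students)."""
    total = 0
    for i in range(coins // (cap + 1) + 1):
        remaining = coins - (cap + 1) * i
        total += (-1) ** i * comb(students, i) * comb(remaining + students - 1, students - 1)
    return total


def coin_distributions_expand(coins=50, students=30, cap=4):
    """Coefficient of x**coins in (1 + x + ... + x**cap)**students by repeated multiplication."""
    coeffs = [1] + [0] * coins
    for _ in range(students):
        coeffs = [sum(coeffs[n - k] for k in range(min(cap, n) + 1))
                  for n in range(coins + 1)]
    return coeffs[coins]


print(coin_distributions_formula() == coin_distributions_expand())
```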

Exercise 6.3.12. 1. Determine a closed form expression for $\sum_{n \ge 0} n x^n \in \mathbb{Q}[[x]]$. Or in other words,
write $\sum_{n \ge 0} n x^n = \frac{p(x)}{q(x)}$, where p(x), q(x) are polynomials with integer coefficients.
2. Determine the sum of the first N positive integers.

3. Determine the sum of the squares of the first N positive integers.

4. Determine a closed form expression for $\sum_{n \ge 0} \frac{n^2 + 5n + 16}{n!}$.
5. Determine a closed form expression for $\sum_{k=1}^{N} k^3$.
6. For n, r ∈ N determine the number of non-negative solutions to $x_1 + 2x_2 + \cdots + n x_n = r$ in the
unknowns $x_i$'s.

7. Determine $\sum_{k=0}^{\infty} \frac{1}{2^k} C(n + k - 1, k)$.
8. Find the number of non-negative integer solutions of a + b + c + d + e = 27, satisfying

(a) 3 ≤ a ≤ 8,
(b) 3 ≤ a, b, c, d ≤ 8
(c) c is a multiple of 3 and e is a multiple of 4.

9. Determine the number of ways in which 150 voters can cast their 150 votes for 5 candidates such
that no candidate gets more than 30 votes.

10. Verify the following table of formal power series.

Table of Formal Power Series

$e^x = \sum_{k \ge 0} \frac{x^k}{k!}$        $(1 + x)^n = \sum_{k \ge 0} C(n, k) x^k$, n ∈ N0
$\cos(x) = \sum_{r \ge 0} \frac{(-1)^r x^{2r}}{(2r)!}$        $\sin(x) = \sum_{r \ge 0} \frac{(-1)^r x^{2r+1}}{(2r+1)!}$
$\cosh(x) = \sum_{r \ge 0} \frac{x^{2r}}{(2r)!}$        $\sinh(x) = \sum_{r \ge 0} \frac{x^{2r+1}}{(2r+1)!}$

Radius of convergence: |x| < 1
$\log(1 - x) = -\sum_{k \ge 1} \frac{x^k}{k}$
$\frac{1}{1 - x} = \sum_{k \ge 0} x^k$        $\frac{1}{(1 - x)^n} = \sum_{k \ge 0} C(n + k - 1, k) x^k$, n ∈ N
$\frac{(1 + x)^n}{x^r} = \sum_{k \ge -r} C(n, r + k) x^k$        $\frac{x^n}{(1 - x)^{n+1}} = \sum_{k \ge 0} C(k, n) x^k$, n ∈ N0

Radius of convergence: |x| < 1/4
$\frac{1}{\sqrt{1 - 4x}} = \sum_{k \ge 0} C(2k, k) x^k$        $\frac{1 - \sqrt{1 - 4x}}{2x} = \sum_{k \ge 0} \frac{1}{k + 1} C(2k, k) x^k$

11. Find the ogf of the Fibonacci sequence $(F_n)_{n \ge 0} := (1, 1, 2, 3, \ldots)$. Hence, show that for n ≥ 1,
$F_n$ is the number of ways to write n as a sum of 1's and 2's.
12. Take a natural number n. Find

\[ C(n, 0) 2^n - C(n - 1, 1) 2^{n-2} + C(n - 2, 2) 2^{n-4} - C(n - 3, 3) 2^{n-6} + \cdots. \]

13. We know $(1 - x)^{-2} = 1 + 2x + 3x^2 + \cdots$. Also,

\[ (1 - x)^{-2} = (1 + x^2 - 2x)^{-1} = (1 - [2x - x^2])^{-1} = 1 + [2x - x^2] + [2x - x^2]^2 + \cdots. \]

So, can you verify this identity, i.e., the coefficient of $x^n$ in the latter expression is actually n + 1?

6.3.1 Generating Functions and Partitions of n


Recall from Page 95 that a partition of n into k parts is a tuple $(n_1, \ldots, n_k) \in \mathbb{N}^k$ written in
non-increasing order, that is, $n_1 \ge n_2 \ge \cdots \ge n_k$, such that $n_1 + n_2 + \cdots + n_k = n$. Also, recall that
$\pi_n$ is the number of distinct partitions of n. The following result due to Euler gives the generating
function of $\pi_n$.

Theorem 6.3.13. [Euler: partition of n] The generating function for $\pi_n$ is

\[ \varepsilon(x) = (1 + x + x^2 + \cdots)(1 + x^2 + x^4 + \cdots) \cdots (1 + x^n + x^{2n} + \cdots) = \frac{1}{(1 - x)(1 - x^2) \cdots (1 - x^n)}. \]
Proof. Note that any partition λ of n has $m_1$ copies of 1, $m_2$ copies of 2 and so on till $m_n$ copies of n,
where $m_i \in \mathbb{N}_0$ for 1 ≤ i ≤ n and $\sum_{i=1}^{n} i\, m_i = n$. Hence, λ uniquely corresponds to $(x^1)^{m_1} (x^2)^{m_2} \cdots (x^n)^{m_n}$
in the word-expansion of

\[ (1 + x + x^2 + \cdots)(1 + x^2 + x^4 + \cdots) \cdots (1 + x^n + x^{2n} + \cdots). \]

Thus, $\pi_n = \mathrm{cf}[x^n, \varepsilon(x)]$.

The next result is the same idea as Theorem 6.3.13 and hence the proof is omitted.

Theorem 6.3.14. The number of partitions of n with entries at most r is $\mathrm{cf}\Big[x^n, \prod_{i=1}^{r} \frac{1}{1 - x^i}\Big]$.

Corollary 6.3.15. Fix n, r ∈ N. Then, the ogf for the number of partitions of n into at most r parts
is $\frac{1}{(1 - x)(1 - x^2) \cdots (1 - x^r)}$.

Proof. Note that by using Ferrers diagram (taking conjugate) we see that the number of partitions
of n into at most r parts is same as the number of partitions of n with entries at most r. So, by
Theorem 6.3.14, this number is $\mathrm{cf}\Big[x^n, \prod_{i=1}^{r} \frac{1}{1 - x^i}\Big]$.

Theorem 6.3.16. [ogf of $\pi_n(r)$] Fix n, r ∈ N. Then, the ogf for $\pi_n(r)$, the number of partitions of n
into r parts, is $\frac{x^r}{(1 - x)(1 - x^2) \cdots (1 - x^r)}$.

Proof. Consider a partition $(\lambda_1, \ldots, \lambda_r)$ of n. So, n ≥ r. Assume that $\lambda_1, \ldots, \lambda_k > 1$ and $\lambda_{k+1}, \ldots, \lambda_r = 1$.
Then $(\lambda_1 - 1, \ldots, \lambda_k - 1)$ is a partition of n − r into at most r parts.
Conversely, if $(\mu_1, \ldots, \mu_k)$, k ≤ r, is a partition of n − r into at most r parts, then
$(\mu_1 + 1, \ldots, \mu_k + 1, 1, \ldots, 1)$, where the number of 1's is r − k, is an r partition of n.
Thus, the number of r partitions of n is the same as the number of partitions of n − r with at
most r parts. Thus, by Corollary 6.3.15 the required number is $\mathrm{cf}\Big[x^{n-r}, \frac{1}{(1 - x)(1 - x^2) \cdots (1 - x^r)}\Big]$. Hence,
the ogf for $\pi_n(r)$ is
\[ \frac{x^r}{(1 - x)(1 - x^2) \cdots (1 - x^r)}. \]
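Theorem 6.3.16 can be checked numerically for small n and r. The Python sketch below expands $\frac{1}{(1 - x)(1 - x^2) \cdots (1 - x^r)}$ up to a given degree and compares the coefficient of $x^{n-r}$ with a direct enumeration of partitions of n into exactly r parts. The function names are our own and the enumeration is a naive recursion suitable only for small inputs.

```python
def series_inverse_product(r, degree):
    """Coefficients up to x**degree of 1 / ((1 - x)(1 - x**2) ... (1 - x**r))."""
    coeffs = [1] + [0] * degree
    for part in range(1, r + 1):
        # Multiplying by 1/(1 - x**part) is a running sum with stride `part`.
        for n in range(part, degree + 1):
            coeffs[n] += coeffs[n - part]
    return coeffs


def partitions_into_exactly(n, r):
    """Count partitions of n into exactly r positive parts, listed in non-increasing order."""
    def count(remaining, parts, largest):
        if parts == 0:
            return 1 if remaining == 0 else 0
        return sum(count(remaining - first, parts - 1, first)
                   for first in range(1, min(largest, remaining) + 1))
    return count(n, r, n)


n, r = 7, 3
print(partitions_into_exactly(n, r), series_inverse_product(r, n)[n - r])
```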

Exercise 6.3.17. 1. For n, r ∈ N, prove that $\pi_n(r)$ is the number of partitions of n + C(r, 2) into
r unequal parts.

2. Let P, M ⊆ N and f(n) be the number of partitions of n where parts are from P and multiplicities
are from M. Find the generating function for the numbers f(n).

Theorem 6.3.18. Suppose there are k types of objects.


1. If there is an unlimited supply of each object, then the egf of the number of r-permutations is
ekx .
2. If there are mi copies of i-th object, then the egf of the number of r-permutations is

x2 xm1 x2 xm k
   
1+x+ + ··· + ··· 1 + x + + ··· + .
2! m1 ! 2! mk !

xr
3. Moreover, n!S(r, n) is the coefficient of r! in (ex − 1)n .

Proof.
1. Since there is an unlimited supply of each object, the egf for each object corresponds to $e^x = 1 + x + \cdots + \frac{x^n}{n!} + \cdots$. Hence, the required result follows.
2. Similar to the first part.
3. Recall that $n!\,S(r,n)$ is the number of surjections from $\{1, 2, \ldots, r\}$ to $X = \{s_1, \ldots, s_n\}$. Each surjection can be viewed as a word of length $r$ of elements of $X$, with each $s_i$ appearing at least once. Thus, we need a selection of $k_i \in \mathbb{N}$ copies of $s_i$, with $\sum_{i=1}^{n} k_i = r$. Also, by Exercise 5.4.7.8, the number of such words equals $C(r; k_1, \ldots, k_n)$. Hence,
\[ n!\,S(r,n) = r!\,\mathrm{cf}\left[x^r, \left(x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots\right)^{n}\right] = \mathrm{cf}\left[\frac{x^r}{r!}, (e^x - 1)^n\right]. \]

Example 6.3.19. 1. In how many ways can you get Rs 2007 using denominations 1, 10, 100, 1000 only?
Ans: $\mathrm{cf}\left[x^{2007}, \frac{1}{(1-x)(1-x^{10})(1-x^{100})(1-x^{1000})}\right]$.
2. If we use at most 9 of each denomination in Part 1, then this number is
\[ \mathrm{cf}\left[x^{2007}, \left(\sum_{i=0}^{9} x^{i}\right)\left(\sum_{i=0}^{9} x^{10i}\right)\left(\sum_{i=0}^{9} x^{100i}\right)\left(\sum_{i=0}^{9} x^{1000i}\right)\right] = \mathrm{cf}\left[x^{2007}, \frac{1-x^{10000}}{1-x}\right] = 1. \]

3. Every natural number has a unique base-r representation (r ≥ 2). Note that Part 2 corresponds
to the case r = 10.
4. Consider n integers k1 < k2 < · · · < kn with gcd(k1 , . . . , kn ) = 1. Then, the number of
natural numbers not having a partition using {k1 , . . . , kn } is finite. Determining the largest such
integer (Frobenius number) is the coin problem/ money changing problem. The general
problem is NP-hard. No closed form formula is known for n > 3.
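The coefficient extractions in Parts 1 and 2 can be checked numerically. The following is a minimal Python sketch, not part of the original notes; the function name `series_coeffs` and its parameters are our own choices. It multiplies the factors $1 + x^d + x^{2d} + \cdots$ one denomination at a time, optionally truncating each factor after a fixed number of copies.

```python
def series_coeffs(denoms, n_max, max_copies=None):
    """Coefficients of prod_d (1 + x^d + x^(2d) + ...) up to x^n_max.

    If max_copies is given, each factor is truncated so that at most
    max_copies copies of each denomination are used.
    """
    coeffs = [1] + [0] * n_max          # the constant series 1
    for d in denoms:
        new = [0] * (n_max + 1)
        for n, c in enumerate(coeffs):
            if c == 0:
                continue
            copies = 0
            # add `copies` coins of value d on top of a total of n
            while n + copies * d <= n_max and (
                max_copies is None or copies <= max_copies
            ):
                new[n + copies * d] += c
                copies += 1
        coeffs = new
    return coeffs


denominations = [1, 10, 100, 1000]
unlimited = series_coeffs(denominations, 2007)[2007]
at_most_nine = series_coeffs(denominations, 2007, max_copies=9)[2007]
print(unlimited, at_most_nine)
```

With at most nine copies of each denomination, the count is 1, which is the uniqueness of the decimal representation of 2007 noted in Part 3.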

Sometimes a recurrence relation can be obtained from the generating function. This technique is important, so study the next example carefully.
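Example 6.3.20. 1. Suppose $F = \frac{1}{(1-x)(1-x^{10})(1-x^{100})(1-x^{1000})} = \sum_{n \ge 0} a_n x^n$. Then, taking log and differentiating, we get
\[ F' = F\left[\frac{1}{1-x} + \frac{10x^{9}}{1-x^{10}} + \frac{100x^{99}}{1-x^{100}} + \frac{1000x^{999}}{1-x^{1000}}\right]. \]
So,
\[ n a_n = \mathrm{cf}\left[x^{n-1}, F'\right] = \mathrm{cf}\left[x^{n-1}, F\left(\frac{1}{1-x} + \frac{10x^{9}}{1-x^{10}} + \frac{100x^{99}}{1-x^{100}} + \frac{1000x^{999}}{1-x^{1000}}\right)\right] = \sum_{k=1}^{n} a_{n-k} b_k, \]
where
\[ b_k = \mathrm{cf}\left[x^{k-1}, \frac{1}{1-x} + \frac{10x^{9}}{1-x^{10}} + \frac{100x^{99}}{1-x^{100}} + \frac{1000x^{999}}{1-x^{1000}}\right] = \begin{cases} 1 & \text{if } 10 \nmid k \\ 11 & \text{if } 10 \mid k,\ 100 \nmid k \\ 111 & \text{if } 100 \mid k,\ 1000 \nmid k \\ 1111 & \text{else.} \end{cases} \]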

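2. We know that $\lim_{n\to\infty} \sum_{k=1}^{n} \frac{1}{k} = \infty$. What about $\lim_{n\to\infty} \sum_{k=1}^{n} \frac{1}{p_k}$, where $p_k$ is the $k$-th prime?
Ans: For $n > 1$, let $s_n = \sum_{k=1}^{n} \frac{1}{k}$. Then, note that
\[ s_n \le \left(1 + \frac12 + \frac14 + \cdots\right)\left(1 + \frac13 + \frac19 + \cdots\right) \cdots \left(1 + \frac{1}{p_n} + \frac{1}{p_n^2} + \cdots\right) = \prod_{k=1}^{n}\left(1 + \frac{1}{p_k - 1}\right). \]
Thus,
\[ \log s_n \le \log\left(\prod_{k=1}^{n}\left(1 + \frac{1}{p_k - 1}\right)\right) \le \sum_{k=1}^{n} \log\left(1 + \frac{1}{p_k - 1}\right) \le \sum_{k=1}^{n} \frac{1}{p_k - 1} \le 1 + \sum_{k=1}^{n-1} \frac{1}{p_k}. \]
As $n \to \infty$, we see that $\lim_{n\to\infty} \sum_{i=1}^{n} \frac{1}{p_i} = \infty$ as $\lim_{n\to\infty} \log s_n = \infty$.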

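3. Let $X$ be the set of natural numbers with only prime divisors 2, 3, 5, 7. Then,
\[ 1 + \sum_{n \in X} \frac{1}{n} = \left(1 + \frac12 + \frac14 + \cdots\right)\left(1 + \frac13 + \frac19 + \cdots\right) \cdots \left(1 + \frac17 + \frac1{49} + \cdots\right) = \frac{2}{1}\cdot\frac{3}{2}\cdot\frac{5}{4}\cdot\frac{7}{6} = \frac{35}{8}. \]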
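The recurrence $n a_n = \sum_{k=1}^{n} a_{n-k} b_k$ of Part 1 and the product of Part 3 can be sanity-checked numerically. The sketch below is ours (the function names are hypothetical, not from the notes): it computes $a_n$ once by the usual coin-counting expansion of the product and once from the recurrence, and compares the two.

```python
from fractions import Fraction

DENOMS = (1, 10, 100, 1000)


def coin_counts(n_max, denoms=DENOMS):
    """a_0..a_n_max: coefficients of 1 / prod_d (1 - x^d)."""
    a = [1] + [0] * n_max
    for d in denoms:
        for n in range(d, n_max + 1):
            a[n] += a[n - d]          # multiply the series by 1/(1 - x^d)
    return a


def b(k, denoms=DENOMS):
    """Coefficient of x^(k-1) in sum_d d*x^(d-1)/(1 - x^d):
    every denomination d dividing k contributes d."""
    return sum(d for d in denoms if k % d == 0)


def counts_from_recurrence(n_max):
    """a_0..a_n_max computed from n*a_n = sum_{k=1}^{n} a_{n-k} * b_k."""
    a = [1]
    for n in range(1, n_max + 1):
        total = sum(a[n - k] * b(k) for k in range(1, n + 1))
        a.append(total // n)          # the sum is an exact multiple of n
    return a


print(counts_from_recurrence(300) == coin_counts(300))

# Part 3: the product of p/(p-1) over the primes 2, 3, 5, 7.
euler_product = Fraction(1)
for p in (2, 3, 5, 7):
    euler_product *= Fraction(p, p - 1)
print(euler_product)
```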
Exercise 6.3.21. 1. Let $\sigma(n) = \sum_{d \mid n} d$, for $n \in \mathbb{N}$. Then, prove that $n\pi_n = \sum_{k=1}^{n} \pi_{n-k}\,\sigma(k)$.

2. A Durfee square is the largest square in a Ferrers diagram. Find the generating function for the number of self conjugate partitions of $n$ with a fixed size $k$ of the corresponding Durfee square. Show that
\[ (1 + x)(1 + x^3) \cdots = 1 + \sum_{k=1}^{\infty} \frac{x^{k^2}}{(1 - x^2)(1 - x^4) \cdots (1 - x^{2k})}. \]
3. Show that the number of partitions of n into distinct terms is the same as the number of partitions
of n into odd terms.
4. Find the number of r-digit binary numbers that can be formed using an even number of 0s and
an even number of 1s.
5. Find the egf of the number of words of size r using A, B, C, D, E,
(a) if the word has all the letters and the letter A appears an even number of times.
(b) if the word has all the letters and the first letter of the word appears an even number of
times.

6. A permutation $\sigma$ of $\{1, 2, \ldots, n\}$ is said to be connected if there does not exist $k$, $1 \le k < n$, such that $\sigma$ takes $\{1, 2, \ldots, k\}$ to itself. Let $c_n$ denote the number of connected permutations of $\{1, 2, \ldots, n\}$ (convention: $c_0 = 0$). Show that
\[ \sum_{k=1}^{n} c_k\,(n-k)! = n!. \]
Hence, derive the relationship between the generating functions of $(n!)$ and $(c_n)$.
7. Let $f(n, r)$ be the number of partitions of $n$ where each part repeats less than $r$ times. Let $g(n, r)$ be the number of partitions of $n$ where no part is divisible by $r$. Show that $f(n, r) = g(n, r)$.
8. Find the number of 9-sequences that can be formed using 0, 1, 2, 3 in each case:
(a) The sequence has an even number of 0s.
(b) The sequence has an odd number of 1s and an even number of 0s.
(c) No digit appears exactly twice.

6.4 Recurrence Relation


Definition 6.4.1. [Recurrence Relation] A recurrence relation is a way of recursively defining
the terms of a sequence as a function of preceding terms together with certain initial conditions.

Example 6.4.2. an = 3 + 2an−1 for n ≥ 1 with the initial condition a0 = 1 is a recurrence relation.
Note that it completely determines the sequence (an ) = {1, 5, 13, 29, 61, . . .}.

Definition 6.4.3. [Difference Equation] For a sequence $(a_n)$, the first difference $d(a_n)$ is $a_n - a_{n-1}$. The $k$-th difference is $d^k(a_n) = d^{k-1}(a_n) - d^{k-1}(a_{n-1})$. A difference equation is an equation involving $a_n$ and its differences.

Example 6.4.4. 1. $a_n - d^2(a_n) = 5$ is a difference equation. But, note that it doesn't give a recurrence relation as we don't have any initial condition(s).

2. Every recurrence relation can be expressed as a difference equation. The difference equation
corresponding to the recurrence relation an = 3 + 2an−1 is an = 3 + 2(an − d(an )).

Definition 6.4.5. [Solution of a Recurrence Relation] A solution of a recurrence relation is a function $u(n)$, generally denoted by $u_n$, satisfying the recurrence relation.

Example 6.4.6. 1. $u(n) = 2^{n+2} - 3$ is a solution of $a_n = 3 + 2a_{n-1}$ with $a_0 = 1$.
2. The Fibonacci sequence is given by $a_n = a_{n-1} + a_{n-2}$ for $n \ge 2$ with $a_0 = 0$, $a_1 = 1$. Use $\left(\frac{1+\sqrt5}{2}\right)^2 = \frac{3+\sqrt5}{2}$ and $\left(\frac{1-\sqrt5}{2}\right)^2 = \frac{3-\sqrt5}{2}$ to verify that
\[ a_n = \frac{1}{\sqrt5}\left[\left(\frac{1+\sqrt5}{2}\right)^n - \left(\frac{1-\sqrt5}{2}\right)^n\right] \]
is a solution of the recurrence relation that defines the Fibonacci sequence.
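Both closed forms in Example 6.4.6 can be compared against direct iteration of the recurrences. The following Python sketch is ours (function names are hypothetical); the Fibonacci closed form is evaluated in floating point and rounded, so it is only trusted here for small $n$.

```python
from math import sqrt


def iterate_affine(n_terms):
    """First n_terms of a_n = 3 + 2*a_{n-1} with a_0 = 1."""
    a = [1]
    for _ in range(n_terms - 1):
        a.append(3 + 2 * a[-1])
    return a


def fibonacci(n_terms):
    """First n_terms of a_n = a_{n-1} + a_{n-2} with a_0 = 0, a_1 = 1."""
    a = [0, 1]
    while len(a) < n_terms:
        a.append(a[-1] + a[-2])
    return a[:n_terms]


def fibonacci_closed_form(n):
    """Closed form with the roots (1 +- sqrt(5))/2, rounded to an integer."""
    s = sqrt(5)
    return round((((1 + s) / 2) ** n - ((1 - s) / 2) ** n) / s)


print(iterate_affine(5))
print(all(a == 2 ** (n + 2) - 3 for n, a in enumerate(iterate_affine(20))))
print(all(a == fibonacci_closed_form(n) for n, a in enumerate(fibonacci(40))))
```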

Definition 6.4.7. [LNRC/LHRC] A recurrence relation is called a linear nonhomogeneous recurrence relation with constant coefficients (LNRC) of order $r$ if, for a known function $f$,
\[ a_n = c_1 a_{n-1} + \cdots + c_r a_{n-r} + f(n), \quad \text{where } c_i \in \mathbb{R} \text{ for } 1 \le i \le r,\ c_r \ne 0. \tag{6.3} \]
If $f = 0$, then Equation (6.3) is homogeneous and is called the associated linear homogeneous recurrence relation with constant coefficients (LHRC).

Theorem 6.4.8. For $k \in \mathbb{N}$ and $1 \le i \le k$, let $f_i$ be known functions. Consider the $k$ LNRCs
\[ a_n = c_1 a_{n-1} + \cdots + c_r a_{n-r} + f_i(n) \quad \text{for } i = 1, \ldots, k, \tag{6.4} \]
with the same set of initial conditions. If $u_i(n)$ is a solution of the $i$-th recurrence relation, then
\[ a_n = c_1 a_{n-1} + \cdots + c_r a_{n-r} + \sum_{i=1}^{k} \alpha_i f_i(n) \tag{6.5} \]
with the same set of initial conditions has $\sum_{i=1}^{k} \alpha_i u_i(n)$ as its solution.

Proof. The proof is left as an exercise for the reader.

Definition 6.4.9. [Characteristic Equation] The equation $x^r - c_1 x^{r-1} - \cdots - c_r = 0$ is called the characteristic equation of the LHRC $a_n = c_1 a_{n-1} + \cdots + c_r a_{n-r}$ with $c_r \ne 0$. The roots of the characteristic equation are called the characteristic roots of the LHRC.

Observe that if $a_n = x^n$ is a solution of the LHRC $a_n = c_1 a_{n-1} + \cdots + c_r a_{n-r}$ with $c_r \ne 0$, then either $x = 0$ or $x$ is a characteristic root. Further, if $x_1, \ldots, x_r$ are the characteristic roots, then $a_n = x_i^n$ is a solution of the LHRC. It follows that $a_n = \sum_{i=1}^{r} \alpha_i x_i^n$ for $\alpha_i \in \mathbb{R}$ is a solution of the given LHRC. We show that the latter form of a solution is a general solution so that a given set of initial conditions may be satisfied.

Theorem 6.4.10. [General Solution: Distinct Roots] If the characteristic roots $x_1, \ldots, x_r$ of an LHRC are distinct, then every solution of the LHRC is a linear combination of $x_1^n, \ldots, x_r^n$. Moreover, the solution is unique if $r$ consecutive initial conditions are given.

Proof. Let $u(n)$ be any solution of a given LHRC $a_n = c_1 a_{n-1} + \cdots + c_r a_{n-r}$. That is,
\[ u(n) = \sum_{j=1}^{r} c_j u(n-j) = c_1 u(n-1) + \cdots + c_r u(n-r). \]
We show that there exist $\alpha_1, \ldots, \alpha_r \in \mathbb{R}$ such that $u(n) = \sum_{i=1}^{r} \alpha_i x_i^n$ for all $n \in \mathbb{W}$. We first consider a smaller problem, that is, whether the first $r$ values of $u(n)$ can be expressed in this form. The answer will be affirmative provided we can determine the constants $\alpha_1, \ldots, \alpha_r$ so that $u(n) = \sum_{i=1}^{r} \alpha_i x_i^n$ for $n = 0, 1, \ldots, r-1$. To explore this, substitute $n = 0, 1, \ldots, r-1$ to obtain the following linear system in the unknowns $\alpha_1, \ldots, \alpha_r$:
\[ \begin{bmatrix} u(0) \\ u(1) \\ \vdots \\ u(r-1) \end{bmatrix} = \begin{bmatrix} 1 & \cdots & 1 \\ x_1 & \cdots & x_r \\ \vdots & & \vdots \\ x_1^{r-1} & \cdots & x_r^{r-1} \end{bmatrix} \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_r \end{bmatrix}. \]
Since the above $r \times r$ matrix (commonly known as the Vandermonde matrix) is invertible, there exist $\alpha_1, \ldots, \alpha_r$ such that $u(n) = \sum_{i=1}^{r} \alpha_i x_i^n$ for $0 \le n \le r-1$. Hence, we have proved the result for the first $r$ values of $u(n)$. So, let us assume that $u(n) = \sum_{i=1}^{r} \alpha_i x_i^n$ for $0 \le n < k$, where $k \ge r$. Since each $(x_i^n)$ is a solution of the given LHRC, taking $n = k$ gives $x_i^k = \sum_{j=1}^{r} c_j x_i^{k-j}$. Then
\[ u(k) = \sum_{j=1}^{r} c_j u(k-j) = \sum_{j=1}^{r} c_j \sum_{i=1}^{r} \alpha_i x_i^{k-j} = \sum_{i=1}^{r} \alpha_i \sum_{j=1}^{r} c_j x_i^{k-j} = \sum_{i=1}^{r} \alpha_i x_i^k. \]
Hence by PMI, $u(n) = \sum_{i=1}^{r} \alpha_i x_i^n$ for all $n$.
For uniqueness, suppose $u(n)$ and $v(n)$ are solutions of the LHRC satisfying the $r$ initial conditions $u(i) = v(i) = a_i$ for $0 \le i \le r-1$. Write $y(n) = u(n) - v(n)$. Then $y(n)$ satisfies the same LHRC with initial conditions $y(0) = \cdots = y(r-1) = 0$. By what we have just proved, $y(n) = \sum_{i=1}^{r} \gamma_i x_i^n$ for some constants $\gamma_1, \ldots, \gamma_r$. Treating the $\gamma_i$ as unknowns, and substituting $n = 0, 1, \ldots, r-1$, we arrive at a linear system as above, where $u$ is replaced by $y$. Since the system matrix there is invertible, it leads to the unique solution $\gamma_1 = \cdots = \gamma_r = 0$. In turn, we obtain $y(n) = 0$ for all $n$. That is, $u(n) = v(n)$ for all $n$.
Notice that the characteristic roots are, in general, complex numbers, so that the constants in the linear combination can be complex numbers.
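The Vandermonde system in the proof can be solved mechanically. The sketch below is ours and makes a simplifying assumption: the characteristic roots are distinct rational numbers, so exact arithmetic with `Fraction` suffices (complex or irrational roots would need a different number type). The helper names are hypothetical. As an illustration we take $a_n = 3a_{n-1} + 4a_{n-2}$, whose characteristic roots are $-1$ and $4$, with the initial values $a_0 = 1$, $a_1 = 2$ chosen by us.

```python
from fractions import Fraction


def solve_linear_system(matrix, rhs):
    """Gauss-Jordan elimination over Fractions; assumes an invertible matrix."""
    n = len(matrix)
    aug = [[Fraction(v) for v in row] + [Fraction(b)]
           for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def coefficients_from_roots(roots, initial_values):
    """Solve sum_i alpha_i * x_i^n = u(n) for n = 0..r-1 (Vandermonde system)."""
    r = len(roots)
    vandermonde = [[Fraction(x) ** n for x in roots] for n in range(r)]
    return solve_linear_system(vandermonde, initial_values)


def closed_form(roots, alphas, n):
    """Evaluate sum_i alpha_i * x_i^n."""
    return sum(alpha * Fraction(x) ** n for alpha, x in zip(alphas, roots))


roots = [-1, 4]                      # roots of x^2 - 3x - 4 = 0
alphas = coefficients_from_roots(roots, [1, 2])

# Compare with direct iteration of a_n = 3*a_{n-1} + 4*a_{n-2}.
a = [1, 2]
for n in range(2, 12):
    a.append(3 * a[-1] + 4 * a[-2])
print(alphas, all(closed_form(roots, alphas, n) == a[n] for n in range(12)))
```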
Example 6.4.11. 1. Solve $a_n - 4a_{n-2} = 0$ for $n \ge 2$ with $a_0 = 1$ and $a_1 = 1$. Ans: The characteristic equation is $x^2 - 4 = 0$. As the characteristic roots $x = \pm 2$ are distinct, the general solution is $a_n = \alpha(-2)^n + \beta\,2^n$. The initial conditions give $\alpha + \beta = 1$ and $2\beta - 2\alpha = 1$. Hence, $\alpha = \frac14$, $\beta = \frac34$. Thus, the unique solution is $a_n = 2^{n-2}\left(3 + (-1)^n\right)$.

2. Solve $a_n = 3a_{n-1} + 4a_{n-2}$ for $n \ge 2$ with $a_0 = 1$ and $a_1 = c$, a constant. Ans: The characteristic equation is $x^2 - 3x - 4 = 0$. The characteristic roots are $-1$ and $4$; they are distinct. The general solution is $a_n = \alpha(-1)^n + \beta\,4^n$. The initial conditions imply $\alpha = \frac{4-c}{5}$ and $\beta = \frac{1+c}{5}$. Thus, the unique solution is $a_n = \frac15\left[(4 - c)(-1)^n + (1 + c)4^n\right]$.


3. Solve the Fibonacci recurrence $a_n = a_{n-1} + a_{n-2}$ with initial conditions $a_0 = 0$, $a_1 = 1$.
Ans: The characteristic equation $x^2 - x - 1 = 0$ gives distinct characteristic roots $\frac{1 \pm \sqrt5}{2}$. So, the general solution is $a_n = \alpha\left(\frac{1+\sqrt5}{2}\right)^n + \beta\left(\frac{1-\sqrt5}{2}\right)^n$. Using the initial conditions, we get $\alpha = 1/\sqrt5$, $\beta = -\alpha = -1/\sqrt5$. Hence, the required solution is
\[ a_n = \frac{1}{\sqrt5}\left[\left(\frac{1+\sqrt5}{2}\right)^n - \left(\frac{1-\sqrt5}{2}\right)^n\right]. \tag{6.6} \]

4. Solve the recurrence relation $a_n + a_{n-2} = 0$ with the initial conditions $a_0 = a_1 = 2$. Ans: The characteristic equation is $x^2 + 1 = 0$ with distinct characteristic roots $\pm i$. The general solution is in the form $a_n = \alpha\, i^n + \beta\,(-i)^n$. Initial conditions imply that $\alpha + \beta = 2$ and $\alpha i - \beta i = 2$. So, $\alpha = 1 - i$ and $\beta = 1 + i$. Then $a_n = (1 - i)i^n + (1 + i)(-i)^n$.
5. Consider a triangle with vertices $(a_1, b_1) = (0, 0)$, $(a_2, b_2) = (5, 0)$ and $(a_3, b_3) = (3, 7)$. For $n > 3$, define $(a_n, b_n)$ as the centroid of the triangle formed by $(a_{n-1}, b_{n-1})$, $(a_{n-2}, b_{n-2})$ and $(a_{n-3}, b_{n-3})$. Does the sequence $((a_n, b_n))$ converge? If so, to what limit?
Ans: Note that the sequence $((a_n, b_n))$ converges if and only if both the sequences $(a_n)$ and $(b_n)$ converge. We will first show that $(a_n)$ converges.
Let $M_1 = \max\{a_1, a_2, a_3\}$ and $m_1 = \min\{a_1, a_2, a_3\}$. Notice that $m_1 \le a_1, a_2, a_3 \le M_1$. Hence,
\[ m_1 \le \frac{a_1 + a_2 + a_3}{3} \le \frac{2M_1 + m_1}{3}, \quad \text{i.e., } m_1 \le a_4 \le \frac{2M_1 + m_1}{3}; \]
\[ m_1 \le \frac{a_2 + a_3 + a_4}{3} \le \frac{2M_1 + a_4}{3} \le \frac{8M_1 + m_1}{9}, \quad \text{i.e., } m_1 \le a_5 \le \frac{8M_1 + m_1}{9}; \text{ and} \]
\[ m_1 \le \frac{a_3 + a_4 + a_5}{3} \le \frac{26M_1 + m_1}{27}, \quad \text{i.e., } m_1 \le a_6 \le \frac{26M_1 + m_1}{27}. \]
As $\frac{2M_1 + m_1}{3} \le \frac{8M_1 + m_1}{9} \le \frac{26M_1 + m_1}{27}$, we see that
\[ m_1 \le a_4, a_5, a_6 \le \frac{26M_1 + m_1}{27}. \]
Let $M_2 = \max\{a_4, a_5, a_6\}$ and $m_2 = \min\{a_4, a_5, a_6\}$. Then
\[ [m_2, M_2] \subseteq [m_1, M_1] \quad \text{and} \quad \mathrm{length}([m_2, M_2]) \le \frac{26}{27}\,\mathrm{length}([m_1, M_1]). \]
Similarly, taking $M_{n+1} = \max\{a_{3n+1}, a_{3n+2}, a_{3n+3}\}$ and $m_{n+1} = \min\{a_{3n+1}, a_{3n+2}, a_{3n+3}\}$, we get a nested sequence of nonempty closed intervals
\[ [m_1, M_1] \supseteq [m_2, M_2] \supseteq [m_3, M_3] \supseteq \cdots \]
with diameters going to zero. By the nested interval theorem, $\bigcap_{i=1}^{\infty} [m_i, M_i]$ is a singleton set, say, $\{l\}$. Note that $[m_{n+1}, M_{n+1}]$ contains all the terms $a_{3n+1}, a_{3n+2}, a_{3n+3}, a_{3n+4}, \ldots$. It now follows that $\lim_{n\to\infty} a_n = l$. Thus, $\lim_{n\to\infty} \frac{a_{n+1} + 2a_{n+2} + 3a_{n+3}}{6} = l$. But notice that
\[ \frac{a_1 + 2a_2 + 3a_3}{6} = \frac{a_2 + 2a_3 + 3a_4}{6} = \frac{a_3 + 2a_4 + 3a_5}{6} = \cdots. \]
Thus $l = \frac{a_1 + 2a_2 + 3a_3}{6}$. The same argument applies to $(b_n)$. Thus, the answer to the original question is the limit $(19/6, 7/2)$.
* How did we guess the formula? To see that, write
\begin{align*}
3a_4 &= a_1 + a_2 + a_3 \\
3a_5 &= a_2 + a_3 + a_4 \\
&\ \ \vdots \\
3a_{n+3} &= a_n + a_{n+1} + a_{n+2}
\end{align*}
and add these equations to get
\[ 3(a_4 + a_5 + \cdots + a_{n+3}) = a_1 + 2a_2 + 3(a_3 + \cdots + a_n) + 2a_{n+1} + a_{n+2}. \]
Cancelling, we get $a_{n+1} + 2a_{n+2} + 3a_{n+3} = a_1 + 2a_2 + 3a_3$, which is what we required.

Alternate. This method is of interest to us. Note that we have the LHRC
\[ a_n = \frac{a_{n-1} + a_{n-2} + a_{n-3}}{3}, \quad n > 3. \]
So, the characteristic equation is $3x^3 - x^2 - x - 1 = 0$. Observe that 1 is a root. We now see that $3x^3 - x^2 - x - 1 = (x - 1)(3x^2 + 2x + 1)$ and so the other two roots are
\[ \alpha := \frac{-2 + \sqrt{4 - 12}}{6} = \frac{-1 + i\sqrt2}{3} \quad \text{and} \quad \beta := \frac{-1 - i\sqrt2}{3}. \]
Hence, by Theorem 6.4.10, there exist constants $a, b, c \in \mathbb{C}$ such that
\[ a_n = a(1)^{n-1} + b\,\alpha^{n-1} + c\,\beta^{n-1}. \]
As $|\alpha| = |\beta| = \frac{1}{\sqrt3} < 1$, we see that $a_n \to a$. Using the initial conditions, we get
\[ \begin{bmatrix} 1 & 1 & 1 \\ 1 & \alpha & \beta \\ 1 & \alpha^2 & \beta^2 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \end{bmatrix} = \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}. \]
Solving for $a$ gives $a = \frac{a_1 + 2a_2 + 3a_3}{6}$.
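The centroid iteration of Part 5 is easy to simulate. The following Python sketch is ours (the function names are hypothetical); it uses exact rational arithmetic to confirm that $(a_{n+1} + 2a_{n+2} + 3a_{n+3})/6$ is the same for every window of three consecutive points, and that the points approach $(19/6, 7/2)$.

```python
from fractions import Fraction


def centroid_sequence(points, n_terms):
    """Extend three starting points by repeatedly appending the centroid
    of the last three points."""
    pts = [tuple(Fraction(c) for c in p) for p in points]
    while len(pts) < n_terms:
        last_three = pts[-3:]
        pts.append(tuple(sum(coords) / 3 for coords in zip(*last_three)))
    return pts


def invariant(p, q, r):
    """(p + 2q + 3r)/6, computed coordinate-wise."""
    return tuple((a + 2 * b + 3 * c) / 6 for a, b, c in zip(p, q, r))


pts = centroid_sequence([(0, 0), (5, 0), (3, 7)], 60)
limit = (Fraction(19, 6), Fraction(7, 2))

print(all(invariant(*pts[i:i + 3]) == limit for i in range(len(pts) - 2)))
print(float(pts[-1][0]), float(pts[-1][1]))
```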

Theorem 6.4.12. [General Solution: Multiple Roots] Given an LHRC, let $t$ be a characteristic root of multiplicity $s$. Then $u(n) = t^n\left(\sum_{i=0}^{s-1} \alpha_i n^i\right)$ is a solution (called a basic solution). Moreover, if $t_1, \ldots, t_k$ are the distinct characteristic roots with multiplicities $s_1, \ldots, s_k$, respectively, then every solution is a sum of the $k$ corresponding basic solutions.

Proof. It is given that $t$ is a zero of the polynomial $F = x^r - c_1 x^{r-1} - \cdots - c_r$ of multiplicity $s$. Put
\begin{align*}
G_0 &= x^{n-r}F = x^n - c_1 x^{n-1} - \cdots - c_r x^{n-r} \\
G_1 &= xG_0' = n x^n - c_1 (n-1) x^{n-1} - \cdots - c_r (n-r) x^{n-r} \\
G_2 &= xG_1' = n^2 x^n - c_1 (n-1)^2 x^{n-1} - \cdots - c_r (n-r)^2 x^{n-r} \\
&\ \ \vdots \\
G_{s-1} &= xG_{s-2}' = n^{s-1} x^n - c_1 (n-1)^{s-1} x^{n-1} - \cdots - c_r (n-r)^{s-1} x^{n-r}
\end{align*}
Note that each of $G_0, G_1, \ldots, G_{s-1}$ has a zero at $t$, i.e., for $i = 0, 1, \ldots, s-1$, we have
\[ G_i(t) = t^n n^i - c_1 t^{n-1}(n-1)^i - \cdots - c_r t^{n-r}(n-r)^i = 0. \]
Thus, for any choice of $\alpha_i \in \mathbb{R}$, $0 \le i \le s-1$, if one defines $P(k) = \sum_{i=0}^{s-1} \alpha_i k^i$ for $k \ge 0$, then
\[ 0 = \sum_{i=0}^{s-1} \alpha_i G_i(t) = t^n P(n) - c_1 t^{n-1} P(n-1) - \cdots - c_r t^{n-r} P(n-r). \]
Hence, with $u(n) = t^n P(n)$, we get $u(n) - c_1 u(n-1) - \cdots - c_r u(n-r) = 0$. Therefore, $u(n)$ is a solution of the LHRC.
Now, the second statement follows from Theorem 6.4.10.

Example 6.4.13. Suppose that an LHRC has roots 2, 2, 3, 3, 3. Then, the general solution is given by $2^n(\alpha_1 + n\alpha_2) + 3^n(\beta_1 + n\beta_2 + n^2\beta_3)$.
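To see Example 6.4.13 in action, one can expand $(x-2)^2(x-3)^3$ to recover the coefficients $c_1, \ldots, c_5$ of an LHRC with these roots, and then check that the stated general solution satisfies it. The Python sketch below is ours; the function names and the particular values chosen for $\alpha_i, \beta_i$ are arbitrary illustrations.

```python
def poly_from_roots(roots):
    """Coefficients (highest degree first) of prod (x - t) over the roots."""
    coeffs = [1]
    for t in roots:
        # multiply the current polynomial by (x - t)
        coeffs = [a - t * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs


def recurrence_coefficients(roots):
    """c_1..c_r such that x^r - c_1 x^(r-1) - ... - c_r has the given roots."""
    return [-c for c in poly_from_roots(roots)[1:]]


def general_solution(n, a1, a2, b1, b2, b3):
    """2^n (a1 + n a2) + 3^n (b1 + n b2 + n^2 b3)."""
    return 2 ** n * (a1 + n * a2) + 3 ** n * (b1 + n * b2 + n * n * b3)


def satisfies(coeffs, u, n_values):
    """Check u(n) = c_1 u(n-1) + ... + c_r u(n-r) for each n in n_values."""
    return all(
        u(n) == sum(c * u(n - j) for j, c in enumerate(coeffs, start=1))
        for n in n_values
    )


c = recurrence_coefficients([2, 2, 3, 3, 3])
print(c)
for params in [(1, 0, 0, 0, 0), (0, 1, 0, 0, 0), (0, 0, 0, 0, 1), (2, -1, 3, 5, -4)]:
    print(satisfies(c, lambda n: general_solution(n, *params), range(5, 25)))
```

By contrast, $n^2 2^n$ is not a solution here, since 2 has multiplicity only two.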

Consider the LNRC in Equation (6.3). If vn and wn are solutions of the LNRC, then un := wn − vn
satisfies the associated LHRC. That is, wn = un + vn shows that any solution wn can be expressed as
a solution of the associated LHRC plus a solution vn of the LNRC. We summarize this finding in the
next theorem.

Theorem 6.4.14. [LNRC] Consider the LNRC in Equation (6.3). Let un be a general solution of
the associated LHRC. If vn is a (particular) solution of the LNRC, then an = un + vn is a general
solution of the LNRC.

Remark 6.4.15. Theorem 6.4.14 implies that in order to obtain a general solution of an LNRC, we
need to solve the associated LHRC for a general solution and also obtain a particular solution of the
same LNRC. Unlike an LHRC, no general algorithm is available to obtain a particular solution of an
LNRC. In some cases, heuristic methods can be used to obtain a particular solution. If $f(n) = a^n$ or $n^k$ or a linear combination of these, then a particular solution can be easily obtained.

Obtaining a particular solution after knowledge of the characteristic roots.

1. If $f(n) = a^n$ and $a$ is not a root of the LHRC, then use $v(n) = c\,a^n$.
2. If $f(n) = a^n$ and $a$ is a root of the LHRC of multiplicity $t$, then use $v(n) = c\,n^t a^n$.
3. If $f(n) = n^k$ and 1 is not a root of the LHRC, then use $v(n) = c_0 + c_1 n + \cdots + c_k n^k$.
4. If $f(n) = n^k$ and 1 is a root of the LHRC of multiplicity $t$, then use $v(n) = n^t(c_0 + c_1 n + \cdots + c_k n^k)$.
Example 6.4.16. 1. Solve $a_n = 3a_{n-1} + 2n$ for $n \ge 1$ with $a_0 = 1$.
Ans: Observe that 3 is the characteristic root of the associated LHRC ($a_n = 3a_{n-1}$). Thus, the general solution of the LHRC is $u_n = 3^n\alpha$. Note that 1 is not a characteristic root and hence a particular solution is $a + nb$, where $a$ and $b$ are to be computed using $a + nb = 3(a + (n-1)b) + 2n$. This gives $a = -3/2$ and $b = -1$. Hence, $a_n = 3^n\alpha - n - 3/2$. Using $a_0 = 1$, check that $\alpha = 5/2$.
2. Solve $a_n = 3a_{n-1} - 2a_{n-2} + 3\,(5)^n$ for $n \ge 3$ with $a_1 = 1$, $a_2 = 2$.
Ans: The associated LHRC ($a_n = 3a_{n-1} - 2a_{n-2}$) has the characteristic roots 1 and 2. Thus, the general solution of the LHRC is $u_n = \alpha\,1^n + \beta\,2^n$. Notice that 5 is not a characteristic root. So, $v_n = c\,5^n$ is a particular solution of the LNRC. That is, $c\,5^n = 3c\,5^{n-1} - 2c\,5^{n-2} + 3\,(5)^n$. It gives $c = 25/4$. Hence, the general solution of the LNRC is in the form $a_n = \alpha + \beta\,2^n + (25/4)5^n$. One can then determine $\alpha$ and $\beta$ from the initial conditions.
3. In the previous example, take $f(n) = 3(2^n)$. Trying $c\,(2)^n$ as a particular solution, we have $4c = 6c - 2c + 12$. This is absurd. The reason is that 2 is a characteristic root of the associated LHRC. Now, with the choice of $c\,n(2)^n$ as a particular solution, we get $4nc = 6(n-1)c - 2(n-2)c + 12$. It gives $c = 6$. Hence, the general solution of the LNRC is in the form $a_n = \alpha + \beta 2^n + 6n2^n$, from which the constants $\alpha$ and $\beta$ can be computed using the initial conditions.
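The particular solutions found in Example 6.4.16 can be verified by substituting them back into the recurrences. The sketch below is ours (the function names are hypothetical) and uses exact fractions so that the constants $5/2$, $-3/2$ and $25/4$ are represented without rounding.

```python
from fractions import Fraction


def residual(v, forcing, n):
    """v(n) - 3 v(n-1) + 2 v(n-2) - forcing(n); zero means v solves
    a_n = 3 a_{n-1} - 2 a_{n-2} + forcing(n) at index n."""
    return v(n) - 3 * v(n - 1) + 2 * v(n - 2) - forcing(n)


def part1(n):
    """Solution of a_n = 3 a_{n-1} + 2n with a_0 = 1."""
    return Fraction(5, 2) * 3 ** n - n - Fraction(3, 2)


def part2_particular(n):
    """Particular solution for the forcing term 3 * 5^n."""
    return Fraction(25, 4) * 5 ** n


def part3_particular(n):
    """Particular solution for the forcing term 3 * 2^n (2 is a root)."""
    return 6 * n * 2 ** n


ns = range(2, 15)
print(part1(0) == 1 and all(part1(n) == 3 * part1(n - 1) + 2 * n for n in range(1, 15)))
print(all(residual(part2_particular, lambda n: 3 * 5 ** n, n) == 0 for n in ns))
print(all(residual(part3_particular, lambda n: 3 * 2 ** n, n) == 0 for n in ns))
# The naive guess c * 2^n fails for every c: its residual is always -3 * 2^n.
print(all(residual(lambda n: 7 * 2 ** n, lambda n: 3 * 2 ** n, n) == -3 * 2 ** n for n in ns))
```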

6.5 Generating Function from Recurrence Relation


Sometimes we can find a solution to the recurrence relation using the generating function of an ; see
the following example.
Example 6.5.1. 1. Consider solving $a_n = 2a_{n-1} + 1$, $a_0 = 1$.
Ans: Let $F(x) = a_0 + a_1 x + \cdots$ be the generating function for $\{a_i\}$. Then,
\[ F = 1 + \sum_{i=1}^{\infty} a_i x^i = 1 + \sum_{i=1}^{\infty} (2a_{i-1} + 1)x^i = \sum_{i=0}^{\infty} x^i + 2x\sum_{i=0}^{\infty} a_i x^i = \frac{1}{1-x} + 2xF. \]
Hence, $F(x) = \frac{1}{(1-x)(1-2x)} = \frac{2}{1-2x} - \frac{1}{1-x}$ so that $a_n = \mathrm{cf}[x^n, F(x)] = 2^{n+1} - 1$.

2. Find the ogf $F$ for the Fibonacci recurrence relation $a_n = a_{n-1} + a_{n-2}$, $a_0 = 0$, $a_1 = 1$.
Ans: Define $F(x) = \sum_{n \ge 0} a_n x^n = \sum_{n \ge 1} a_n x^n$. Then using the recurrence relation, we have
\[ F(x) = \sum_{n \ge 0} a_n x^n = x + \sum_{n \ge 2} (a_{n-1} + a_{n-2})\,x^n = x + (x + x^2)F(x). \]
So, $F(x) = \frac{x}{1 - x - x^2}$.
Let $\alpha = \frac{1+\sqrt5}{2}$ and $\beta = \frac{1-\sqrt5}{2}$. Verify that $(1 - \alpha x)(1 - \beta x) = 1 - x - x^2$. Then
\[ F(x) = \frac{1}{\sqrt5}\left(\frac{1}{1 - \alpha x} - \frac{1}{1 - \beta x}\right) = \frac{1}{\sqrt5}\left(\sum_{n \ge 0} \alpha^n x^n - \sum_{n \ge 0} \beta^n x^n\right). \]
Therefore, $a_n = \mathrm{cf}[x^n, F(x)] = \frac{1}{\sqrt5}(\alpha^n - \beta^n)$, which equals Equation (6.6).
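The two generating functions of Example 6.5.1 can be expanded as power series to confirm the stated coefficients. The following is a minimal Python sketch of ours; `series_of_quotient` is a hypothetical helper that divides one polynomial by another as formal power series, assuming the denominator has constant term 1.

```python
def series_of_quotient(numerator, denominator, n_terms):
    """First n_terms coefficients of numerator/denominator as a power series.

    Both polynomials are coefficient lists, lowest degree first, and the
    denominator is assumed to have constant term 1.
    """
    if denominator[0] != 1:
        raise ValueError("this sketch assumes denominator[0] == 1")
    out = []
    for n in range(n_terms):
        value = numerator[n] if n < len(numerator) else 0
        # coefficient of x^n in (denominator * series) must equal numerator[n]
        for j in range(1, min(n, len(denominator) - 1) + 1):
            value -= denominator[j] * out[n - j]
        out.append(value)
    return out


# Part 1: 1 / ((1 - x)(1 - 2x)) = 1 / (1 - 3x + 2x^2)
part1 = series_of_quotient([1], [1, -3, 2], 10)
print(part1 == [2 ** (n + 1) - 1 for n in range(10)])

# Part 2: x / (1 - x - x^2)
print(series_of_quotient([0, 1], [1, -1, -1], 10))
```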

The next result follows using a small calculation and hence the proof is left for the reader.

Theorem 6.5.2. [Obtaining Generating Function from Recurrence Relation] Let $a_n$ be the solution of the $r$-th order LHRC with $r$ initial conditions given by
\[ a_n = c_1 a_{n-1} + \cdots + c_r a_{n-r} \quad \text{with } a_0 = A_0,\ a_1 = A_1,\ \ldots,\ a_{r-1} = A_{r-1}. \tag{6.7} \]
Then the generating function of $(a_n)$ is obtained by taking
\begin{align*}
F(x) &= A_0 + A_1 x + \cdots + A_{r-1}x^{r-1} + [c_1 A_{r-1} + \cdots + c_r A_0]\,x^r + \cdots \\
&= A_0 + A_1 x + \cdots + A_{r-1}x^{r-1} + c_r x^r F(x) + c_{r-1}x^{r-1}(F(x) - A_0) + \cdots \\
&\qquad + c_1 x\,(F(x) - A_0 - A_1 x - \cdots - A_{r-2}x^{r-2}).
\end{align*}
This implies that
\[ F(x) = \frac{\sum_{i=0}^{r-1} A_i x^i - c_1 x\sum_{i=0}^{r-2} A_i x^i - c_2 x^2\sum_{i=0}^{r-3} A_i x^i - \cdots - c_{r-1}x^{r-1}A_0}{1 - c_1 x - \cdots - c_r x^r}. \tag{6.8} \]
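Equation (6.8) translates directly into a small computation. The sketch below is ours (the function names are hypothetical): it builds the numerator and denominator from the lists $c = [c_1, \ldots, c_r]$ and $A = [A_0, \ldots, A_{r-1}]$, expands the quotient as a power series, and compares the coefficients with the sequence obtained by iterating the recurrence.

```python
def ogf_numerator(c, A):
    """Numerator of Equation (6.8), as coefficients lowest degree first."""
    r = len(c)
    num = list(A)                               # sum_{i<r} A_i x^i
    for j in range(1, r):                       # subtract c_j x^j * sum_{i<r-j} A_i x^i
        for i in range(r - j):
            num[i + j] -= c[j - 1] * A[i]
    return num


def ogf_denominator(c):
    """1 - c_1 x - ... - c_r x^r, lowest degree first."""
    return [1] + [-cj for cj in c]


def expand(num, den, n_terms):
    """Power-series coefficients of num/den; den[0] is 1 by construction."""
    out = []
    for n in range(n_terms):
        value = num[n] if n < len(num) else 0
        for j in range(1, min(n, len(den) - 1) + 1):
            value -= den[j] * out[n - j]
        out.append(value)
    return out


def iterate(c, A, n_terms):
    """Terms of a_n = c_1 a_{n-1} + ... + c_r a_{n-r} starting from A."""
    a = list(A)
    while len(a) < n_terms:
        a.append(sum(cj * a[-j] for j, cj in enumerate(c, start=1)))
    return a[:n_terms]


for c, A in [([1, 1], [0, 1]), ([3, 4], [1, 2]), ([13, -67, 171, -216, 108], [1, 0, 2, 0, 3])]:
    series = expand(ogf_numerator(c, A), ogf_denominator(c), 12)
    print(series == iterate(c, A, 12))
```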

Remark 6.5.3. We observe the following about Equation (6.8) in Theorem 6.5.2.
1. The numerator is a polynomial in $x$ of degree at most $r - 1$, determined by the initial conditions, and the denominator $Q(x)$ is a polynomial of degree $r$ determined by the recurrence relation.
2. Now consider all solutions of the LHRC $a_n = c_1 a_{n-1} + \cdots + c_r a_{n-r}$ of order $r$. We already know that they form a vector space of dimension $r$. Each such solution gives us an ogf as shown above. Since these ogfs have the same denominator, linearly independent solutions give linearly independent numerators. It now follows that, if $P(x)$ has degree less than $r$, then $\frac{P(x)}{Q(x)}$ is an ogf for some solution.
3. Note that we can write $1 - c_1 x - \cdots - c_r x^r = (1 - \alpha_1 x)^{s_1} \cdots (1 - \alpha_k x)^{s_k}$, where the $\alpha_i$ are distinct complex numbers and $s_1 + \cdots + s_k = r$. Let $P_1(x)$ have degree less than $s_1$. Then notice that
\[ \frac{P_1(x)}{(1 - \alpha_1 x)^{s_1}} = \frac{P_1(x)(1 - \alpha_2 x)^{s_2} \cdots (1 - \alpha_k x)^{s_k}}{(1 - \alpha_1 x)^{s_1}(1 - \alpha_2 x)^{s_2} \cdots (1 - \alpha_k x)^{s_k}} \]
is an ogf for some solution. Similarly, $\frac{P_1(x)}{(1 - \alpha_1 x)^{s_1}}, \ldots, \frac{P_k(x)}{(1 - \alpha_k x)^{s_k}}$, where each $P_i$ has degree less than $s_i$, are ogfs of some solutions. Are these solutions linearly independent? Yes. Indeed, if those solutions are linearly dependent, then there is a linear combination
\[ a_1\frac{P_1(x)}{(1 - \alpha_1 x)^{s_1}} + \cdots + a_k\frac{P_k(x)}{(1 - \alpha_k x)^{s_k}} = 0. \]
But this is not possible. Otherwise, multiplying by $(1 - \alpha_1 x)^{s_1}(1 - \alpha_2 x)^{s_2} \cdots (1 - \alpha_k x)^{s_k}$, we get that $a_1 R_1(x) + \cdots + a_k R_k(x)$ is the zero polynomial, where $R_i(x) = P_i(x)\prod_{j \ne i}(1 - \alpha_j x)^{s_j}$. As every term except the first one is divisible by $(1 - \alpha_1 x)^{s_1}$, the right-hand side is also divisible by $(1 - \alpha_1 x)^{s_1}$, and $P_1$ has degree less than $s_1$, it follows that $a_1 = 0$. Similarly, all other $a_i$ are 0. We already know that the sequences $(\alpha_1^n), (n\alpha_1^n), \ldots, (n^{s_1 - 1}\alpha_1^n)$ are linearly independent. Indeed, if there is a combination
\[ a_0(\alpha_1^n) + a_1(n\alpha_1^n) + \cdots + a_{s_1 - 1}(n^{s_1 - 1}\alpha_1^n) = (0, 0, \cdots), \]
then, as $\alpha_1 \ne 0$, we would get
\[ (a_0 + a_1 n + a_2 n^2 + \cdots + a_{s_1 - 1}n^{s_1 - 1}) = (0, 0, \cdots), \]
implying $a_0 = a_1 = \cdots = a_{s_1 - 1} = 0$.
4. Now suppose that the sequences
\[ (\alpha_1^n), (n\alpha_1^n), \ldots, (n^{s_1 - 1}\alpha_1^n), \cdots, (\alpha_k^n), (n\alpha_k^n), \ldots, (n^{s_k - 1}\alpha_k^n) \]
are linearly dependent. We then have
\[ (P_1(n)\alpha_1^n + P_2(n)\alpha_2^n + \cdots + P_k(n)\alpha_k^n) = (0, 0, \cdots), \]
for some polynomials $P_i(n)$ with degrees less than $s_i$, $i = 1, \ldots, k$.


DR

We explain Theorem 6.5.2 by considering the following examples.


Example 6.5.4. 1. Find the ogf for the Catalan numbers $C_n$.
Ans: Let $g(x) = 1 + \sum_{n \ge 1} C_n x^n$, where $C_n = \frac{C(2n, n)}{n+1} = \frac{2(2n-1)}{n+1}C_{n-1}$ with $C_0 = 1$. Then,
\begin{align*}
g(x) - 1 &= \sum_{n \ge 1} C_n x^n = \sum_{n \ge 1} \frac{2(2n-1)}{n+1}C_{n-1}x^n \\
&= \sum_{n=1}^{\infty} \frac{4n+4}{n+1}C_{n-1}x^n + \sum_{n=1}^{\infty} \frac{-6}{n+1}C_{n-1}x^n = 4xg(x) + \frac{-6}{x}\int_0^x t\,g(t)\,dt.
\end{align*}
So, $[g(x) - 1 - 4xg(x)]\,x = -6\int_0^x t\,g(t)\,dt$. Differentiate with respect to $x$ to get
\[ x(1 - 4x)g' + (1 - 2x)g = 1. \]
It is a linear ordinary differential equation. Observe that
\[ \int \frac{1 - 2x}{x(1 - 4x)}\,dx = \int \left(\frac{1}{x} + \frac{2}{1 - 4x}\right)dx = \ln\left(\frac{x}{\sqrt{1 - 4x}}\right). \]
We thus multiply the equation with its integrating factor $\frac{x}{\sqrt{1 - 4x}}$ to obtain
\[ g'(x)\frac{x}{\sqrt{1 - 4x}} + g(x)\frac{1 - 2x}{(1 - 4x)^{3/2}} = \frac{1}{(1 - 4x)^{3/2}} \iff \frac{d}{dx}\left(g(x)\frac{x}{\sqrt{1 - 4x}}\right) = \frac{1}{(1 - 4x)^{3/2}}. \]
Hence, $g(x)\frac{x}{\sqrt{1 - 4x}} = \frac{1}{2\sqrt{1 - 4x}} + C$, where $C \in \mathbb{R}$. Or, equivalently, $2xg(x) = 1 + 2C\sqrt{1 - 4x}$. Note that $C = -\frac12$ as $C_0 = \lim_{x \to 0} g(x) = 1$. Therefore, the ogf of the Catalan numbers is
\[ g(x) = \frac{1 - \sqrt{1 - 4x}}{2x}. \]

Alternate. Recall that $C_n$ is the number of representations of the product of $n + 1$ square matrices of the same size, using $n$ pairs of brackets. From such a representation, remove the leftmost and the rightmost brackets to obtain the product of two representations of the form:
\[ A_1(A_2 \cdots A_{n+1}),\ (A_1 A_2)(A_3 \cdots A_{n+1}),\ \cdots,\ (A_1 \cdots A_k)(A_{k+1} \cdots A_{n+1}),\ \cdots,\ (A_1 \cdots A_n)A_{n+1}. \]
Hence, we see that
\[ C_n = C_0 C_{n-1} + C_1 C_{n-2} + \cdots + C_{n-1}C_0. \tag{6.9} \]
Let $g(x)$ be the generating function of $C_n$; that is, $g(x) = \sum_{n=0}^{\infty} C_n x^n$. Then, for $n \ge 1$,
\[ \mathrm{cf}\left[x^{n-1}, g(x)^2\right] = \mathrm{cf}\left[x^{n-1}, \left(\sum_{n=0}^{\infty} C_n x^n\right)^2\right] = \sum_{i=0}^{n-1} C_i C_{n-1-i} = C_n \quad \text{using Equation (6.9).} \]
That is, $\mathrm{cf}\left[x^n, xg(x)^2\right] = C_n$. Hence, $g(x) = 1 + xg(x)^2$. Solving for $g(x)$, we get
\[ g(x) = \frac12\left(\frac{1}{x} \pm \sqrt{\frac{1}{x^2} - \frac{4}{x}}\right) = \frac{1 \pm \sqrt{1 - 4x}}{2x}. \]
As the function $g$ is continuous (being a power series in the domain of convergence) and $\lim_{x \to 0} g(x) = C_0 = 1$, it follows that
\[ g(x) = \frac{1 - \sqrt{1 - 4x}}{2x}. \]

2. Fix $r \in \mathbb{N}$ and let $(a_n)$ be a sequence with $a_0 = 1$ and $\sum_{k=0}^{n} a_k a_{n-k} = C(n + r, r)$ for all $n \ge 1$. Determine $a_n$.
Answer: Let $g(x) = \sum_{n \ge 0} a_n x^n$. Using $C(n + r, r) = C(n + (r + 1) - 1, n)$, we obtain
\[ g(x)^2 = \sum_{n \ge 0}\left(\sum_{k=0}^{n} a_k a_{n-k}\right)x^n = \sum_{n \ge 0} C(n + r, r)\,x^n = \sum_{n \ge 0} C(n + r, n)\,x^n = \frac{1}{(1 - x)^{r+1}}. \]
Hence, $a_n = \mathrm{cf}\left[x^n, \frac{1}{(1 - x)^{(r+1)/2}}\right]$. For example, when $r = 2$,
\[ a_n = (-1)^n C(-3/2, n) = \frac{3 \cdot 5 \cdot 7 \cdots (2n + 1)}{2^n\,n!} = \frac{(2n + 1)!}{2^{2n}\,n!\,n!}. \]

3. Determine the sequence $\{f(n, m) : n, m \in \mathbb{W}\}$ which satisfies $f(n, 0) = 1$ for all $n \ge 0$, $f(0, m) = 0$ for all $m > 0$, and
\[ f(n, m) = f(n - 1, m) + f(n - 1, m - 1) \quad \text{for } n > 0,\ m > 0. \tag{6.10} \]
Answer: For $n > 0$, define $F_n(x) = \sum_{m \ge 0} f(n, m)x^m = 1 + \sum_{m \ge 1} f(n, m)x^m$. Then $F_1(x) = 1 + x$, and for $n \ge 2$,
\begin{align*}
F_n(x) &= \sum_{m \ge 0} f(n, m)x^m = 1 + \sum_{m \ge 1} \left(f(n - 1, m) + f(n - 1, m - 1)\right)x^m \\
&= 1 + \sum_{m \ge 1} f(n - 1, m)x^m + \sum_{m \ge 1} f(n - 1, m - 1)x^m \\
&= F_{n-1}(x) + xF_{n-1}(x) = (1 + x)F_{n-1}(x).
\end{align*}
By induction it follows that $F_n(x) = (1 + x)^n$. Thus,
\[ f(n, m) = \mathrm{cf}[x^m, (1 + x)^n] = \begin{cases} C(n, m) & \text{if } 0 \le m \le n \\ 0 & \text{if } m > n. \end{cases} \]
Alternate. For $m > 0$, define $G_m(y) = \sum_{n \ge 0} f(n, m)y^n = \sum_{n \ge 1} f(n, m)y^n$. Then, $G_1(y) = \frac{y}{(1 - y)^2}$, and for $m \ge 2$, Equation (6.10) gives
\begin{align*}
G_m(y) &= \sum_{n \ge 1} f(n, m)y^n = \sum_{n \ge 1} \left(f(n - 1, m) + f(n - 1, m - 1)\right)y^n \\
&= \sum_{n \ge 1} f(n - 1, m)y^n + \sum_{n \ge 1} f(n - 1, m - 1)y^n \\
&= yG_m(y) + yG_{m-1}(y).
\end{align*}
Therefore, $G_m(y) = \frac{y}{1 - y}G_{m-1}(y)$. As $G_1(y) = \frac{y}{(1 - y)^2}$, one has $G_m(y) = \frac{y^m}{(1 - y)^{m+1}}$. Thus,
\[ f(n, m) = \mathrm{cf}\left[y^n, \frac{y^m}{(1 - y)^{m+1}}\right] = \mathrm{cf}\left[y^{n-m}, \frac{1}{(1 - y)^{m+1}}\right] = \begin{cases} C(n, m) & \text{if } 0 \le m \le n \\ 0 & \text{if } m > n. \end{cases} \]

4. Determine the sequence $\{S(n, m) : n, m \in \mathbb{W}\}$ which satisfies $S(0, 0) = 1$, $S(n, 0) = 0$ for $n > 0$, $S(0, m) = 0$ for $m > 0$, and
\[ S(n, m) = mS(n - 1, m) + S(n - 1, m - 1), \quad \text{for } n > 0,\ m > 0. \tag{6.11} \]
Answer: For $m > 0$, define $G_m(y) = \sum_{n \ge 0} S(n, m)y^n = \sum_{n \ge 1} S(n, m)y^n$. Then $G_1(y) = \frac{y}{1 - y}$, and for $m \ge 2$, Equation (6.11) gives
\begin{align*}
G_m(y) &= \sum_{n \ge 0} S(n, m)y^n = \sum_{n \ge 1} \left(mS(n - 1, m) + S(n - 1, m - 1)\right)y^n \\
&= m\sum_{n \ge 1} S(n - 1, m)y^n + \sum_{n \ge 1} S(n - 1, m - 1)y^n \\
&= myG_m(y) + yG_{m-1}(y).
\end{align*}
Therefore, $G_m(y) = \frac{y}{1 - my}G_{m-1}(y)$. By induction it follows that
\[ G_m(y) = \frac{y^m}{(1 - y)(1 - 2y) \cdots (1 - my)} = y^m\sum_{k=1}^{m} \frac{\alpha_k}{1 - ky}, \tag{6.12} \]
where $\alpha_k = \frac{(-1)^{m-k}k^m}{k!\,(m - k)!}$ for $1 \le k \le m$. Then
\begin{align*}
S(n, m) &= \mathrm{cf}\left[y^n, y^m\sum_{k=1}^{m} \frac{\alpha_k}{1 - ky}\right] = \mathrm{cf}\left[y^{n-m}, \sum_{k=1}^{m} \frac{\alpha_k}{1 - ky}\right] \\
&= \sum_{k=1}^{m} \alpha_k k^{n-m} = \sum_{k=1}^{m} \frac{(-1)^{m-k}k^n}{k!\,(m - k)!} \tag{6.13} \\
&= \frac{1}{m!}\sum_{k=1}^{m} (-1)^{m-k}k^n C(m, k) = \frac{1}{m!}\sum_{k=0}^{m-1} (-1)^k(m - k)^n C(m, k).
\end{align*}
(a) The identity $S(n, m) = \frac{1}{m!}\sum_{k=0}^{m-1} (-1)^k(m - k)^n C(m, k)$ is known as Stirling's Identity.
(b) As there is no restriction on $n, m \in \mathbb{N}_0$, Equation (6.13) is also valid for $n < m$. But, we know that $S(n, m) = 0$, whenever $n < m$. Hence, we get the following identity,
\[ \sum_{k=1}^{m} \frac{(-1)^{m-k}k^{n-1}}{(k - 1)!\,(m - k)!} = 0 \quad \text{whenever } n < m. \]

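5. [Bell Numbers] Recall that the $n$-th Bell number $b(n)$ for $n \in \mathbb{N}$, is the number of partitions of $\{1, 2, \ldots, n\}$. By convention we take $b(0) = 1$. For $n \ge 1$,
\begin{align*}
b(n) &= \sum_{m=1}^{n} S(n, m) = \sum_{m \ge 1} S(n, m) = \sum_{m \ge 1}\sum_{k=1}^{m} \frac{(-1)^{m-k}k^{n-1}}{(k - 1)!\,(m - k)!} \\
&= \sum_{k \ge 1} \frac{k^n}{k!}\sum_{m \ge k} \frac{(-1)^{m-k}}{(m - k)!} = \frac{1}{e}\sum_{k \ge 1} \frac{k^n}{k!} = \frac{1}{e}\sum_{k \ge 0} \frac{k^n}{k!} \tag{6.14}
\end{align*}
as $0^n = 0$ for $n \ge 1$. We see that Equation (6.14) is valid even for $n = 0$. Notice that $b(n)$ has terms of the form $\frac{k^n}{k!}$. So, we compute its egf as follows:
\begin{align*}
B(x) &= 1 + \sum_{n \ge 1} b(n)\frac{x^n}{n!} = 1 + \sum_{n \ge 1}\left(\frac{1}{e}\sum_{k \ge 1} \frac{k^n}{k!}\right)\frac{x^n}{n!} \\
&= 1 + \frac{1}{e}\sum_{k \ge 1} \frac{1}{k!}\sum_{n \ge 1} k^n\frac{x^n}{n!} = 1 + \frac{1}{e}\sum_{k \ge 1} \frac{1}{k!}\sum_{n \ge 1} \frac{(kx)^n}{n!} \\
&= 1 + \frac{1}{e}\sum_{k \ge 1} \frac{1}{k!}\left(e^{kx} - 1\right) = 1 + \frac{1}{e}\sum_{k \ge 1}\left(\frac{(e^x)^k}{k!} - \frac{1}{k!}\right) \\
&= 1 + \frac{1}{e}\left(e^{e^x} - 1 - (e - 1)\right) = e^{e^x - 1}. \tag{6.15}
\end{align*}
Recall that $e^{e^x - 1}$ is a valid formal power series (see Remark 6.3.5). Taking logarithm of Equation (6.15), we get $\log B(x) = e^x - 1$. Hence, $B'(x) = e^x B(x)$, or equivalently
\[ B'(x) = \sum_{n \ge 1} \frac{b(n)x^{n-1}}{(n - 1)!} = e^x\sum_{n \ge 0} b(n)\frac{x^n}{n!} = \sum_{m \ge 0} \frac{x^m}{m!} \cdot \sum_{n \ge 0} b(n)\frac{x^n}{n!}. \]
Thus,
\[ \frac{b(n)}{(n - 1)!} = \mathrm{cf}\left[x^{n-1}, B'(x)\right] = \mathrm{cf}\left[x^{n-1}, \sum_{m \ge 0} \frac{x^m}{m!} \cdot \sum_{n \ge 0} b(n)\frac{x^n}{n!}\right] = \sum_{m=0}^{n-1} \frac{1}{(n - 1 - m)!} \cdot \frac{b(m)}{m!}. \]
Therefore, $b(n) = \sum_{m=0}^{n-1} C(n - 1, m)\,b(m)$ for $n \ge 1$.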
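Several identities from Example 6.5.4 can be cross-checked numerically: the Catalan convolution (6.9), Stirling's identity from Part 4, and the Bell recurrence from Part 5. The following Python sketch is ours, with hypothetical function names; it assumes Python 3.8 or later for `math.comb`.

```python
from functools import lru_cache
from math import comb, factorial


def catalan_by_convolution(n_terms):
    """C_0..C_{n_terms-1} from C_n = sum_{i<n} C_i * C_{n-1-i}, C_0 = 1."""
    c = [1]
    for n in range(1, n_terms):
        c.append(sum(c[i] * c[n - 1 - i] for i in range(n)))
    return c


@lru_cache(maxsize=None)
def stirling_by_recurrence(n, m):
    """S(n, m) from Equation (6.11) with the stated boundary values."""
    if n == 0 and m == 0:
        return 1
    if n == 0 or m == 0:
        return 0
    return m * stirling_by_recurrence(n - 1, m) + stirling_by_recurrence(n - 1, m - 1)


def stirling_by_identity(n, m):
    """S(n, m) = (1/m!) * sum_{k=0}^{m-1} (-1)^k (m-k)^n C(m, k), for n, m >= 1."""
    total = sum((-1) ** k * (m - k) ** n * comb(m, k) for k in range(m))
    return total // factorial(m)      # the sum is an exact multiple of m!


def bell_by_recurrence(n_terms):
    """b(0)..b(n_terms-1) from b(n) = sum_{m<n} C(n-1, m) * b(m), b(0) = 1."""
    b = [1]
    for n in range(1, n_terms):
        b.append(sum(comb(n - 1, m) * b[m] for m in range(n)))
    return b


print(catalan_by_convolution(8))
print(all(
    stirling_by_identity(n, m) == stirling_by_recurrence(n, m)
    for n in range(1, 10) for m in range(1, 10)
))
bell = bell_by_recurrence(10)
print(all(
    bell[n] == sum(stirling_by_recurrence(n, m) for m in range(1, n + 1))
    for n in range(1, 10)
))
```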

Exercise 6.5.5. 1. Find the recurrence relation(s) for the number of binary words without having
sub-words 00 and 111.
2. Find the number of subsets (including the empty set) of {1, . . . , n} not containing consecutive
integers.
3. Let Fn be the nth Fibonacci number. Prove that if n, m ∈ N, then Fn divides Fnm .
4. In a particular semester 6 students took admission in our PhD program. There were 9 professors
who were willing to supervise these students. As a rule ‘a student can have either one or two
supervisors’. In how many ways can we allocate supervisors to these students if all the ‘will-
ing professors’ are to be allocated? What if we have an additional condition that exactly one
supervisor gets to supervise two students?
5. (a) Prove combinatorially that $D_n = (n - 1)(D_{n-1} + D_{n-2})$ for $n \ge 2$.
(b) Use (a) to show that the egf of $D_n$ is $\frac{e^{-x}}{1 - x}$.
6. (a) In how many ways can one distribute 10 identical chocolates among 10 students?
(b) In how many ways can one distribute 10 distinct chocolates among 10 students?
(c) In how many ways can one distribute 10 distinct chocolates among 10 students so that each
receives one?
(d) In how many ways can one distribute 15 distinct chocolates among 10 students so that each
receives at least one?
(e) In how many ways can one distribute 10 out of 15 distinct chocolates among 10 students so
that each receives one?
(f ) In how many ways can one distribute 15 distinct chocolates among 10 students so that each receives at most three?
(g) In how many ways can one distribute 15 distinct chocolates among 10 students so that each
receives at least one and at most three?
(h) In how many ways can one distribute 15 identical chocolates among 10 students so that
each receives at most three?
7. (a) In how many ways can one carry 15 distinct objects with 10 identical bags? Answer using
S(n, r).
(b) In how many ways can one carry 15 distinct objects in 10 identical bags with no empty bag?
Answer using S(n, r).
(c) In how many ways can one carry 15 distinct objects in 10 identical bags with each bag
containing at most three objects?
(d) In how many ways can one carry 15 identical objects in 10 identical bags?
(e) In how many ways can one carry 15 identical objects in 10 identical bags with no empty
bag?
(f ) In how many ways can one carry 15 identical objects in 20 identical bags?
8. What is the number of integer solutions of x + y + z = 10 with x ≥ −1, y ≥ −2 and z ≥ −3?
9. Is the number of solutions of $x + y + z = 10$ in non-negative multiples of $\frac12$ ($x, y, z$ are allowed to be $0, 1/2, 1, 3/2, \ldots$) at most four times the number of non-negative integer solutions of $x + y + z = 10$?
10. How many words of length 8 can be formed using the English alphabet, where each letter can
appear at most twice? Give answer using generating function.

11. Let $p_1, \ldots, p_n$, $n \ge 2$, be distinct prime numbers. In how many ways can we partition the set $\{p_1, \ldots, p_n, p_1^2, \ldots, p_n^2\}$ into subsets of size two such that no prime is in the same subset containing its square?
12. What is the value of $\sum_{k=0}^{15} (-1)^k C(15, k)(15 - k)^5$?
13. Give your answers to the following questions using generating functions:

(a) What is the number of partitions of n with entries at most r?


(b) What is the number of partitions of n with at most r parts?
(c) What is the number of partitions of n with exactly r parts (πn (r))?
(d) What is the number of partitions of n + C(r, 2) with r distinct parts?
(e) What is the number of partitions of n with distinct entries?
(f ) What is the number of partitions of n with odd entries?
(g) What is the number of partitions of n with distinct odd entries?
(h) What is the number of self conjugate partitions of n?

14. We summarize our findings about partitions in the following table.

Objects-n   Places-r    Places      Relate                           Number
distinct?   distinct?   nonempty?
Y           Y           Y           Onto functions                   r! S(n, r) = Σ_{i=0}^{r−1} (−1)^i C(r, i)(r − i)^n
Y           Y           N           All functions                    r^n
Y           N           Y           r-partition of a set             S(n, r)
Y           N           N           All partitions of a set          b(n) = Σ_{i=1}^{r} S(n, i)
N           Y           Y           Positive integer solutions       C(n − 1, r − 1)
N           Y           N           Nonnegative integer solutions    C(n + r − 1, r − 1)
N           N           Y           r-partition of n                 πn (r) = cf[x^{n−r} , 1/((1 − x)(1 − x^2) · · · (1 − x^r))]
N           N           N           Partitions of n of length ≤ r    Σ_{i=1}^{r} πn (i)

15. How many words of length 15 are there using the letters A,B,C,D,E such that each letter must
appear in the word and A appears an even number of times? Give your answers using generating
function.
16. The characteristic roots of an LHRC are 2, 2, 2, 3, 3. What is the form of the general solution?
17. Consider the LNRC an = c1 an−1 + · · · + cr an−r + 5n . Give a particular solution.
18. Obtain the ogf for an , where an = 2an−1 − an−2 + 2n , a0 = 0, a1 = 1.
19. Solve the recurrence relation an = 2an−1 − an−2 + 2n + 5, a0 = 0, a1 = 1.
20. Find the number of words of size 12 made using letters from {A, B, C} which do not have the
sub-word BCA. For instance, BCCABCCABCCA is such a word, but ABCABCCCCCBA is not.
21. Find the number of 8 letter words made using letters from {A, B, C, D} in which 3 consecutive
letters are not allowed to be the same.
22. We have 3 blue bags, 4 red bags and 5 green bags. We have many balls of each of the colors blue,
red and green. What is the smallest positive integer n so that if we distribute n balls (without
seeing the colors) into these bags, then at least one of the following three conditions is met?
Condition 1: A blue bag contains 3 blue balls or 4 red balls or 5 green balls.
Condition 2: A red bag contains 3 blue balls or 5 red balls or 7 green balls.
Condition 3: A green bag contains 3 blue balls or 6 red balls or 9 green balls.
23. Let f (x) be a polynomial with integer coefficients. What is the smallest natural number n such
that if f (x) = 2009 has n distinct integer roots, then f (x) = 9002 does not have an integer root?
24. My friend says that he has n ≥ 2 subsets of {1, 2, . . . , 14} each of which has size 6. Give a value
of n so that we can guarantee ‘some two of his subsets have 3 elements in common’, without
seeing his collection? What is the smallest possible value of n?
25. My class has n CSE, m MSC and r MC students. Suppose that t copies of the same book are
to be distributed so that each branch gets at least s copies. In how many ways can this be done,
if each student gets at most one? In how many ways can this be done, without the previous
restriction? Answer using generating functions.
26. My class has n CSE, m MSC and r MC students. Suppose that t distinct books are to be
distributed so that each branch gets at least s. In how many ways can this be done, if each
student gets at most one? In how many ways can this be done, without the previous restriction?
Answer only using generating function.
27. My class has N students. To conduct an exam, we have M identical answer scripts. In how
many ways can we distribute the answer scripts so that each student gets at least 2? Answer
using generating functions.
28. My class has N students. In an examination paper, there are M questions. Each student answers
all the questions in an order decided by him/her. In how many ways can it happen that some
three or more students have followed the same order? Answer using generating function.
29. Eleven teachers attended the Freshers’ Party. There were 4 types of soft drinks available. In how
many ways can a total of 18 glasses of soft drinks be served to them, in general? Answer using
generating function.
Chapter 7

Introduction to Logic

7.1 Logic of Statements (SL)


We study logic to differentiate between valid and invalid arguments. An argument is a set of state-
ments which has two parts: a set of premises and a conclusion. Each premise is a statement which is
assumed to hold for the sake of the argument. The conclusion is a statement claimed to hold by the
argument. An argument has the structure
Premises: Statement1 , . . ., Statementk ; therefore
Conclusion: Statementc .
The following are instances of arguments:

• Statement1 : If today is Monday, then Mr. X gets ₹5.
  Statement2 : Today is Monday.
  Statementc : (Therefore,) Mr. X gets ₹5.

• Statement1 : If today is Monday, then Mr. X gets ₹5.
  Statement2 : Mr. X gets ₹5.
  Statementc : (Therefore,) Today is Monday.

• Statement1 : If today is Monday, then Mr. X gets ₹5.
  Statement2 : Today is Tuesday.
  Statementc : (Therefore,) Mr. X gets ₹5.

• Statement1 : If today is Monday, then Mr. X gets ₹5.
  Statement2 : Today is Tuesday.
  Statementc : (Therefore,) Mr. X does not get ₹5.

We understand that the first one is a valid argument, whereas the next three are not. In order to
determine whether an argument is valid or not, we need to know the logical form of a statement. A
simple statement is an expression which is either false or true but not both. Complex statements are
made out of simple ones by using the words ‘not’, ‘and’, ‘or’, ‘implies’ and ‘if and only if’.
For example, ‘Today is Monday’ is a statement. ‘Today is Tuesday’ is a statement. ‘Today is not
Monday’ is a statement. ‘Today is Monday and today is Tuesday’ is also a statement.
Using symbols for simple statements and for the words ‘not’, ‘and’, ‘or’, ‘implies’ and ‘if and only if’
helps us in seeing the logical structure of a statement. Normally, we use the symbols p, q, r, p1 , p2 , . . .
to denote simple statements. The quoted words are denoted by ¬, ∧, ∨, → and ↔, respectively.
Then the complex statements are made using these symbols along with parentheses by following some
specified rules.

We abbreviate the phrase ‘Logic of Statements’ to ‘SL’ and present it in the following three sections.

7.2 Formulas and truth values in SL


Definition 7.2.1. Fix a countable set A = {p1 , p2 , . . .} of symbols. Each element of A is called
an atomic formula. An atomic formula is also called an atomic variable. The special symbols
¬, ∧, ∨, → and ↔ are called connectives; their names are ‘negation’, ‘conjunction’, ‘disjunction’,
‘implication’, and ‘biconditional’, respectively. The well formed formulas, or formulas, for short,
are generated by using the following rules recursively:
F1: Each atomic formula is a formula.
F2: If x is a formula, then (¬x) is a formula.
F3: If x and y are formulas, then (x ∧ y), (x ∨ y), (x → y) and (x ↔ y) are formulas.

The connective that has been introduced last in the process of generation of the formula is called the
principal connective in that formula.

The connectives ∨, ∧, →, and ↔ always connect two old formulas to create a new one. This is
why they are called binary connectives. The connective ¬ is used on a single old formula to give a
new one. So, it is called a unary connective. Notice that in every formula, the parentheses occur
in matching pairs.
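
The formation rules F1 to F3 can be mirrored by a small recursive data type. The following Python sketch is only an illustration: the nested-tuple encoding and the helper names `atom`, `neg`, `binary`, `principal_connective` and `show` are assumptions made here, not notation from these notes.

```python
# Formulas as nested tuples, built only through the rules F1-F3.
BINARY = {"∧", "∨", "→", "↔"}

def atom(i):
    """F1: the atomic formula p_i."""
    return ("atom", i)

def neg(x):
    """F2: from a formula x, build (¬x)."""
    return ("¬", x)

def binary(op, x, y):
    """F3: from formulas x and y, build (x op y)."""
    if op not in BINARY:
        raise ValueError(f"unknown binary connective: {op}")
    return (op, x, y)

def principal_connective(f):
    """The connective introduced last, or None for an atomic formula."""
    return None if f[0] == "atom" else f[0]

def show(f):
    """Write a formula with all its parentheses, as the formation rules require."""
    if f[0] == "atom":
        return f"p{f[1]}"
    if f[0] == "¬":
        return f"(¬{show(f[1])})"
    return f"({show(f[1])} {f[0]} {show(f[2])})"

# The formula (p1 ∨ ((¬(p1 → p1)) ↔ (p3 ∧ p5))), built step by step.
p1, p3, p5 = atom(1), atom(3), atom(5)
example = binary("∨", p1, binary("↔", neg(binary("→", p1, p1)), binary("∧", p3, p5)))
print(show(example))
print(principal_connective(example))
```

Because every formula is produced by one of the three constructors, the outermost tuple tag is exactly the connective introduced last.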

Example 7.2.2.
1. The expression (¬p5 ) is a formula.
Ans: Since p5 ∈ A, by (F1), it is a formula. By (F2), (¬p5 ) is a formula. The principal
connective in the formula is ¬.
2. The expression (¬(p3 ∧ (¬p4 ))) is a formula.
Ans: p3 , p4 ∈ A; by (F1), these are formulas. By (F2), (¬p4 ) is a formula. By (F3), (p3 ∧ (¬p4 ))
is a formula. Next, by (F2), (¬(p3 ∧ (¬p4 ))) is a formula. The principal connective in the formula
is ¬.
3. The expression (p1 → (p1 ∨ p1 )) is a formula.
Ans: By (F1), p1 is a formula. By (F3), (p1 ∨ p1 ) is a formula. Once more, by (F3), (p1 →
(p1 ∨ p1 )) is a formula. The principal connective in the formula is →.
4. The expression (p1 ∨ ((¬(p1 → p1 )) ↔ (p3 ∧ p5 ))) is a formula.
Ans: By (F1), p1 , p3 and p5 are formulas. By (F3), (p1 → p1 ) and (p3 ∧ p5 ) are formulas.
By (F2), (¬(p1 → p1 )) is a formula. Next, by two applications of (F3), (p1 ∨ ((¬(p1 → p1 )) ↔
(p3 ∧ p5 ))) is a formula. The principal connective in this formula is ∨.
5. The expression ¬p9 is not a formula since according to our formation rules, a pair of parentheses
should have been used. Of course, with the pair of parentheses, the expression (¬p9 ) is a
formula, where the principal connective is ¬. Similarly, (¬(p4 )) is not a formula due to an extra
pair of parentheses, but (¬p4 ) is a formula with the principal connective as ¬.
6. The expression (p4 ∨ p5 is not a formula since its right parenthesis is missing, but (p4 ∨ p5 ) is
a formula with the principal connective as ∨.
7. The expression (p6 ∨ p1 ) ∧ (¬p4 )) is not a formula: its last right parenthesis has no matching
left parenthesis. The connective ∧ demands a pair of outer parentheses, and the left one is
missing. We see that ((p6 ∨ p1 ) ∧ (¬p4 )) is a formula with the principal connective as ∧.
Convention: For our comfort, we use the symbols p, q, r, . . . with or without subscripts for atomic
formulas in place of p1 , p2 , . . .. Similarly, we ignore the outer parentheses in a formula. By using
precedence rules we also cut short some more parentheses. The precedence rules are as follows:
1. ¬ has the highest precedence.
2. ∧ and ∨ have the next precedence.
3. → and ↔ have the least precedence.

Recall that when we say that × has more precedence over +, the expression x × y + z × w means
(x × y) + (z × w). If ambiguity results from using this convention in a context, we expand the
abbreviated formulas to formulas and decide the case. We illustrate the convention in the following
example.

Example 7.2.3.
1. By abbreviating p5 as p, we abbreviate (¬p5 ) as ¬p.
2. To abbreviate the formula (¬(p3 ∧ (¬p4 ))), we write p3 as p, p4 as q. Using the precedence rules,
our abbreviation is ¬(p ∧ ¬q).
3. Writing p1 as p, we abbreviate (p1 → (p1 ∨ p1 )) as p → p ∨ p.
4. Write p1 as p, p3 as q, and p5 as r. Then the formula (p1 ∨ ((¬(p1 → p1 )) ↔ (p3 ∧ p5 ))) is
abbreviated to p ∨ (¬(p → p) ↔ q ∧ r).

To be careful, we should not abbreviate different atomic formulas to the same symbol in any
context. For instance in the last part of the above example, we should not abbreviate both p1 and p3
as p.
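
The abbreviation convention can be sketched as a small printer that drops the outer parentheses and keeps an inner pair only when the precedence rules do not make it redundant. This is an illustrative sketch under stated assumptions: the tuple encoding, the table `PREC` and the function name `abbreviate` are hypothetical, and atoms are already written with their abbreviated names.

```python
# Precedence from the convention: ¬ highest, then ∧ and ∨, then → and ↔.
PREC = {"¬": 3, "∧": 2, "∨": 2, "→": 1, "↔": 1}

def abbreviate(f):
    """Return the abbreviated form of a formula given as a nested tuple.

    Atoms are strings such as "p"; ("¬", x) is a negation; (op, x, y) is binary.
    """
    if isinstance(f, str):
        return f
    op = f[0]
    if op == "¬":
        inner = abbreviate(f[1])
        # Under ¬, only a binary formula keeps its parentheses.
        needs = not isinstance(f[1], str) and f[1][0] != "¬"
        return "¬" + (f"({inner})" if needs else inner)
    parts = []
    for child in f[1:]:
        text = abbreviate(child)
        # Keep the parentheses unless the child binds strictly tighter than op.
        if not isinstance(child, str) and PREC[child[0]] <= PREC[op]:
            text = f"({text})"
        parts.append(text)
    return f" {op} ".join(parts)

# (p1 → (p1 ∨ p1)) with p1 written as p.
print(abbreviate(("→", "p", ("∨", "p", "p"))))
# (p1 ∨ ((¬(p1 → p1)) ↔ (p3 ∧ p5))) with p1, p3, p5 written as p, q, r.
print(abbreviate(("∨", "p", ("↔", ("¬", ("→", "p", "p")), ("∧", "q", "r")))))
```

Since ∧ and ∨ share one precedence level, as do → and ↔, the sketch keeps the parentheses whenever two connectives of the same level meet, which avoids any ambiguity.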

Assuming familiarity with the process of abbreviation, we regard abbreviated formulas as formulas.

Since statements are supposed to be either true or false, we now discuss how to assign truth values
to formulas. Observe that any formula has occurrences of some finite number of atomic variables.
Further, if X is any formula, then either X = pi , an atomic variable, or X is in one of the forms:
¬p, p ∧ q, p ∨ q, p → q, or p ↔ q for formulas p, q, with the principal connective as ¬, ∧, ∨, →, ↔,
respectively.

Definition 7.2.4. Let X be a formula. Let B be the set of all formulas generated from the atomic
variables occurring in X. A truth assignment (appropriate to X) is a function f : B → {T, F }
satisfying the following conditions:
1. For an atomic variable pi , either f (pi ) = T or f (pi ) = F .
For formulas p and q,
2. f (¬p) = F if f (p) = T , and f (¬p) = T if f (p) = F .
3. f (p ∧ q) = T if f (p) = f (q) = T , and f (p ∧ q) = F otherwise.
4. f (p ∨ q) = F if f (p) = f (q) = F , and f (p ∨ q) = T otherwise.
5. f (p → q) = F if f (p) = T, f (q) = F , and f (p → q) = T otherwise.
6. f (p ↔ q) = T if f (p) = f (q), and f (p ↔ q) = F otherwise.
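
The six conditions translate directly into a recursive evaluator. The sketch below is illustrative only; it assumes formulas are encoded as nested tuples with atomic variables as strings, represents T and F by Python's `True` and `False`, and uses the hypothetical function name `value`.

```python
def value(formula, f):
    """Truth value of a formula under the assignment f.

    f maps each atomic variable (a string) to True (T) or False (F).
    Formulas are nested tuples: ("¬", x) or (op, x, y); atoms are strings.
    """
    if isinstance(formula, str):      # condition 1: atomic variable
        return f[formula]
    op = formula[0]
    if op == "¬":                     # condition 2
        return not value(formula[1], f)
    a = value(formula[1], f)
    b = value(formula[2], f)
    if op == "∧":                     # condition 3
        return a and b
    if op == "∨":                     # condition 4
        return a or b
    if op == "→":                     # condition 5: F only when a is T and b is F
        return (not a) or b
    if op == "↔":                     # condition 6
        return a == b
    raise ValueError(f"unknown connective: {op}")

f = {"p": True, "q": False}
print(value(("→", "p", "q"), f))
print(value(("↔", ("¬", "p"), "q"), f))
```

The recursion follows the way the formula was generated, so the value of a formula depends only on the values assigned to its atomic variables.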

Sometimes we write ‘f (p1 , . . . , pk ) is a formula’ to mean that ‘f is a formula involving the atomic
formulas p1 , . . . , pk ’. Let f (p1 , . . . , pk ) be a formula. Then, the truth value of f is determined based on
the truth values of the atomic formulas p1 , . . . , pk . Since there are 2 possible truth values for each pi ,
1 ≤ i ≤ k, there are 2^k ways of assigning truth values to these atomic formulas. A truth table for a
formula f (p1 , . . . , pk ) is a table which systematically lists the truth values of f under every possible
assignment of truth values to the involved atomic formulas. The above definition of assignment of
truth values can be depicted in a truth table. It is as follows.

Understanding the connectives in a Truth table:

 p  ¬p       p  q  p ∧ q       p  q  p ∨ q       p  q  p → q       p  q  p ↔ q
 T   F       T  T    T         T  T    T         T  T    T         T  T    T
 F   T       T  F    F         T  F    T         T  F    F         T  F    F
             F  T    F         F  T    T         F  T    T         F  T    F
             F  F    F         F  F    F         F  F    T         F  F    T

Assignment of truth values to ¬p, p ∧ q, p ∨ q, p → q, p ↔ q

For instance, look at the table for →. The second row there tells that when p is assigned T and q is
assigned F , p → q is assigned F . In all other cases, p → q is assigned T .

Read T as ‘true’ and F as ‘false’. Observe that ¬ makes a true statement false and a false statement
true. The formula p ∧ q is true if and only if both p, q are true; p ∨ q is true if and only if at least
one of p, q is true; p ↔ q is true if and only if either both p, q are true, or both p, q are false. The
case that ‘p → q is true’ closely resembles the sentence ‘if p is true, then q is true’, though this is
not obvious at first. (We illustrate this case in Example 7.2.6 below.) Accordingly, we also read the
connectives ¬, ∧, ∨, → and ↔ as not, and, or, then¹ and if and only if, respectively.

Example 7.2.5. The following is a truth table for the formula p ∨ (q ∧ r).

 p  q  r  q ∧ r  p ∨ (q ∧ r)
 F  F  F    F         F
 F  F  T    F         F
 F  T  F    F         F
 F  T  T    T         T
 T  F  F    F         T
 T  F  T    F         T
 T  T  F    F         T
 T  T  T    T         T
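
A truth table of this kind can be generated mechanically by running through all 2^k assignments. The sketch below is a minimal illustration, assuming that a formula is supplied as a Python function of Boolean arguments; the names `truth_table` and `letter` are hypothetical.

```python
from itertools import product

def truth_table(variables, formula):
    """List (row of truth values, value of the formula) for every assignment."""
    rows = []
    for values in product([False, True], repeat=len(variables)):
        rows.append((values, formula(*values)))
    return rows

def letter(b):
    """Print True as T and False as F."""
    return "T" if b else "F"

# The formula p ∨ (q ∧ r), with rows listed from F F F to T T T.
table = truth_table(["p", "q", "r"], lambda p, q, r: p or (q and r))
print("p q r  p ∨ (q ∧ r)")
for values, result in table:
    print(" ".join(letter(v) for v in values), "", letter(result))
```

With three atomic variables the loop visits 2^3 = 8 rows, one for each assignment.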

Example 7.2.6. Consider the formula p → q, where p and q symbolize the English statements as
follows:
p: you attend the class.
q: you understand the subject.
Then, p → q is the statement ‘if you attend the class, then you understand the subject’. The formula
p → q is true under the first three cases as explained below.
1. p is true and q is true. This means ‘you attend the class and understand the subject’. Here,
p → q is true.
2. p is false and q is false. This means ‘you do not attend the class and do not understand the
subject’. In this case, p → q is true.
3. p is false and q is true. This means ‘you do not attend the class but understand the subject’.
Here also, p → q is true.
4. p is true and q is false. This means ‘you attend the class and do not understand the subject’.
Then p → q is false.

Thus, a conditional p → q is true when either p is false or q is true.

¹ In many texts, p → q is read as ‘if p then q’. However, it will be easier to read it as ‘p then q’.

Practice 7.2.7.

1. Draw a truth table for the formula p ∧ ¬p → (p ∨ ¬q) .
2. Can both the formulas p → q and q → p be F for some assignment on p and q?

Depending on the structure of a formula f (p1 , . . . , pn ) it receives a truth value under an assignment
of truth values to the atomic formulas p1 , . . . , pn . It is quite possible that the formula receives the
truth value T under an assignment and it receives the truth value F under another assignment. In
this connection we isolate those formulas which receive the same truth value under each assignment.

Definition 7.2.8. A contradiction is a formula which takes the truth value F under each assignment.
A tautology is a formula which takes the truth value T under each assignment. Often we write a
contradiction as ⊥ and a tautology as ⊤.

For example, p ∧ ¬p is a contradiction and p ∨ ¬p is a tautology. Once a tautology and a contradiction
are given, new tautologies and contradictions can be obtained by using the following theorem.

Theorem 7.2.9. Let A be a formula having at least one occurrence of an atomic variable p. Let B be
any formula. Denote by A[p/B] the formula obtained by replacing each occurrence of p by B in A.
1. If A is a contradiction, then A[p/B] is a contradiction.
2. If A is a tautology, then A[p/B] is a tautology.


Proof. Let A be a contradiction. For ease in notation, write A = A(p; p1 , . . . , pn ), where other than
p, the atomic variables occurring in A are p1 , . . . , pn . Similarly, write A[p/B] = A(B; p1 , . . . , pn ). Let
f be any truth assignment that assigns truth values to p, p1 , . . . , pn and also to all atomic variables
occurring in B.
If f assigns T to B, then the value of A[p/B] is the same as that of A(T ; p1 , . . . , pn ), which is F
since A is a contradiction.
If f assigns F to B, then the value of A[p/B] is the same as that of A(F ; p1 , . . . , pn ), which is F
since A is a contradiction.
Hence, A[p/B] takes the value F under the assignment f . Since f is an arbitrary assignment, we
conclude that A[p/B] is a contradiction. This proves the first statement.
Statement 2 is proved similarly.

For example, ((p → q) ∧ (q ↔ r)) ∧ ¬((p → q) ∧ (q ↔ r)) is a contradiction, since it is obtained from
p ∧ ¬p by replacing p with ((p → q) ∧ (q ↔ r)). Similarly, ((p → q) ∧ (q ↔ r)) ∨ ¬((p → q) ∧ (q ↔ r))
is a tautology since it is obtained from p ∨ ¬p by replacing p with ((p → q) ∧ (q ↔ r)).
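
The two substitution instances above can be checked by brute force over all assignments. The following sketch is illustrative: formulas are modelled as Python functions of Boolean arguments, and the names `is_tautology`, `is_contradiction`, `A_sub` and `A2_sub` are assumptions introduced here.

```python
from itertools import product

def values(formula, n):
    """All truth values of an n-variable formula given as a Python function."""
    return [formula(*row) for row in product([True, False], repeat=n)]

def is_tautology(formula, n):
    return all(values(formula, n))

def is_contradiction(formula, n):
    return not any(values(formula, n))

def implies(a, b):
    return (not a) or b

# A = p ∧ ¬p and A2 = p ∨ ¬p, each in the single variable p.
A = lambda p: p and not p
A2 = lambda p: p or not p

# B = (p → q) ∧ (q ↔ r), in three variables.
B = lambda p, q, r: implies(p, q) and (q == r)

# A[p/B] and A2[p/B]: the value of B is fed in wherever p occurred.
A_sub = lambda p, q, r: A(B(p, q, r))
A2_sub = lambda p, q, r: A2(B(p, q, r))

print(is_contradiction(A, 1), is_contradiction(A_sub, 3))
print(is_tautology(A2, 1), is_tautology(A2_sub, 3))
```

Composing the functions mirrors the proof: whatever value B receives, it is passed to A in place of p, and A returns the same value for either input.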

7.3 Equivalence and Normal forms in SL


In an algebraic identity such as (x + y)2 = x2 + 2xy + y 2 , when we replace the variables x, y with
some numbers we see that both the sides give the same value. Such expressions help us in simplifying
algebraic expressions. Analogously, we introduce the notion of equivalence which will help us in
simplifying formulas.
Definition 7.3.1. Two formulas A and B are called equivalent if under any truth assignment, both
receive the same truth value. When A and B are equivalent, we write A ≡ B.

Thus, equivalent formulas are evaluated the same in each row of their truth table. Notice that
the sets of atomic variables occurring in the two formulas may not be the same; so the truth table is
to be constructed using all the atomic variables involved.

Example 7.3.2.
1. Is p → q ≡ ¬q → ¬p?
Ans: We construct a truth table as follows.

p q p → q ¬q → ¬p
T T T T
T F F F
F T T T
F F T T

Since in each row of the truth table, the truth values of the two formulas match, they are
equivalent.
2. Is p ≡ p ∧ (q ∨ (¬q))?
Ans: The truth table is constructed below.

 p  q  p ∧ (q ∨ (¬q))
 T  T        T
 T  F        T
 F  T        F
 F  F        F

Since in each row of the truth table the values of p and that of p ∧ (q ∨ (¬q)) match, they are
equivalent.

Practice 7.3.3. Is p ∨ ¬p ≡ q ∨ ¬q?

When many atomic variables are involved, it may be time consuming to construct a truth table.
In such a case, equivalence may be shown by using one of the following methods:
1. A ≡ B if and only if whenever A is true, B is true, and whenever B is true, A is also true.
2. A ≡ B if and only if whenever A is false, B is false, and whenever B is false, A is also false.

Example 7.3.4. Show that p → q ≡ ¬q → ¬p.


Ans: p → q is false if and only if p is true and q is false
if and only if ¬p is false and ¬q is true
if and only if ¬q → ¬p is false.
Hence p → q ≡ ¬q → ¬p.

Proposition 7.3.5. [Laws] Let p, q, r be formulas. Then the following equivalences hold:
1. [Commutativity] p ∨ q ≡ q ∨ p, p ∧ q ≡ q ∧ p
2. [Associativity] p ∨ (q ∨ r) ≡ (p ∨ q) ∨ r, p ∧ (q ∧ r) ≡ (p ∧ q) ∧ r
3. [Distributivity] p ∧ (q ∨ r) ≡ (p ∧ q) ∨ (p ∧ r), p ∨ (q ∧ r) ≡ (p ∨ q) ∧ (p ∨ r)
4. [De Morgan] ¬(p ∨ q) ≡ ¬p ∧ ¬q, ¬(p ∧ q) ≡ ¬p ∨ ¬q
5. [Idempotence] p ∨ p ≡ p, p ∧ p ≡ p
6. [Constants] ⊥ ∨ p ≡ p, ⊥ ∧ p ≡ ⊥, ⊤ ∨ p ≡ ⊤, ⊤ ∧ p ≡ p, p ∨ ¬p ≡ ⊤, p ∧ ¬p ≡ ⊥,
where ⊥ denotes contradiction and ⊤ denotes tautology.
7. [Double Negation] ¬(¬p) ≡ p
8. [Absorption] p ∨ (p ∧ q) ≡ p, p ∧ (p ∨ q) ≡ p
9. [Implication] p → q ≡ ¬p ∨ q, ¬(p → q) ≡ p ∧ ¬q
10. [Contraposition] p → q ≡ ¬q → ¬p, p → ¬q ≡ q → ¬p
11. [Biconditional] p ↔ q ≡ (p ∧ q) ∨ (¬p ∧ ¬q), p ↔ q ≡ (¬p ∨ q) ∧ (p ∨ ¬q),
p ↔ q ≡ (p → q) ∧ (q → p)

Proof. Construct the truth tables and verify.
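
The verification by truth tables can also be carried out mechanically. The sketch below checks a selection of the laws over all assignments to p, q, r; it is illustrative only, with formulas modelled as Python functions and the names `equivalent`, `imp` and `laws` introduced here as assumptions.

```python
from itertools import product

def equivalent(a, b, n):
    """True when the n-variable formulas a and b agree under every assignment."""
    return all(a(*row) == b(*row) for row in product([True, False], repeat=n))

def imp(x, y):
    """Truth value of x → y: F only when x is T and y is F."""
    return not (x and not y)

laws = {
    "Distributivity": (lambda p, q, r: p and (q or r),
                       lambda p, q, r: (p and q) or (p and r)),
    "De Morgan": (lambda p, q, r: not (p or q),
                  lambda p, q, r: (not p) and (not q)),
    "Absorption": (lambda p, q, r: p or (p and q),
                   lambda p, q, r: p),
    "Implication": (lambda p, q, r: imp(p, q),
                    lambda p, q, r: (not p) or q),
    "Contraposition": (lambda p, q, r: imp(p, q),
                       lambda p, q, r: imp(not q, not p)),
    "Biconditional": (lambda p, q, r: p == q,
                      lambda p, q, r: imp(p, q) and imp(q, p)),
}

for name, (lhs, rhs) in laws.items():
    print(name, equivalent(lhs, rhs, 3))

# By contrast, p → q and q → p do not agree under every assignment.
print("Converse", equivalent(lambda p, q, r: imp(p, q), lambda p, q, r: imp(q, p), 3))
```

The same function `equivalent` can be reused for any pair of formulas, provided both are written over a common list of variables.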

Remark 7.3.6. The statement q → p is called the converse of the statement p → q. In general, a
statement is not equivalent to its converse. Reason: The assignment f that assigns T to p and F to
q, assigns F to p → q but assigns T to q → p. Also, the assignment g that assigns T to q and F to p
assigns F to q → p while it assigns T to p → q. Compare this with the Rule of Contraposition. The
contrapositive of a statement p → q is ¬q → ¬p. The rule says that a statement is equivalent to its
contrapositive.

The above laws, in addition to the method of truth tables, help us in proving the equivalence of
formulas and in analyzing when the formulas are true or false.

Example 7.3.7. We use the laws to show the following:
1. p → (q → r) ≡ (p ∧ q) → r.
2. ¬(p ↔ q) ≡ ¬p ↔ q.
3. p → q ≡ p ↔ p ∧ q.

Ans:

(1) p → (q → r) ≡ ¬p ∨ (¬q ∨ r)                                   Implication
                ≡ (¬p ∨ ¬q) ∨ r                                   Associativity
                ≡ ¬(p ∧ q) ∨ r                                    De Morgan
                ≡ (p ∧ q) → r                                     Implication

(2) ¬(p ↔ q) ≡ ¬((p ∧ q) ∨ (¬p ∧ ¬q))                             Biconditional
             ≡ ¬(p ∧ q) ∧ ¬(¬p ∧ ¬q)                              De Morgan
             ≡ (¬p ∨ ¬q) ∧ (p ∨ q)                                De Morgan, Double negation
             ≡ (¬p ∧ p) ∨ (¬p ∧ q) ∨ (¬q ∧ p) ∨ (¬q ∧ q)          Distributivity
             ≡ (¬p ∧ q) ∨ (¬q ∧ p)                                Constants
             ≡ (¬p ∧ q) ∨ (¬¬p ∧ ¬q)                              Double negation, Commutativity
             ≡ ¬p ↔ q                                             Biconditional

(3) p ↔ p ∧ q ≡ (¬p ∨ (p ∧ q)) ∧ (p ∨ ¬(p ∧ q))                   Biconditional
              ≡ (¬p ∨ (p ∧ q)) ∧ (p ∨ (¬p ∨ ¬q))                  De Morgan
              ≡ (¬p ∨ p) ∧ (¬p ∨ q) ∧ (p ∨ (¬p ∨ ¬q))             Distributivity
              ≡ ¬p ∨ q                                            Constants
              ≡ p → q                                             Implication
Practice 7.3.8.
1. Does the absorption law imply p ∨ (p ∧ (¬q)) ≡ p and p ∧ (p ∨ (¬q)) ≡ p?

2. Write a statement equivalent to (p → q) → p → (q → ¬r) where → and ↔ do not occur.
Simplify so that the number of occurrences of connectives is minimum.

Any formula has a truth table. On the other hand, if a truth table is given, can we construct a
formula corresponding to it? For example, can we have a formula involving the atomic variables p, q, r
such that the formula receives the truth value T under the assignment T, F, T to p, q, r, respectively?
We see that the formula p ∧ ¬q ∧ r does the job.

Definition 7.3.9. A truth function of n variables is any function from {T, F }^n to {T, F }. A truth
function is expressed by a formula if the formula has the same truth table as that of the truth
function.

If φ is a truth function of n variables p1 , . . . , pn , then a truth table can be constructed to depict
it. Such a truth table will have n columns for the variables and 2^n rows, each row showing a different
assignment of truth values to the variables. The (n + 1)-th column is filled with T or F corresponding
to each row. For example, the truth function φ : {T, F }^2 → {T, F } given by

φ(T, T ) = T, φ(T, F ) = F, φ(F, T ) = T, φ(F, F ) = F

is depicted by the truth table


 p  q  φ
 T  T  T
 T  F  F
 F  T  T
 F  F  F

Notice that there are 2^(2^n) truth functions of n variables. Obviously, any formula defines a
truth function. The question is whether any truth function can be expressed by a
formula.
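
The count 2^(2^n) can be confirmed for small n by listing the truth functions explicitly. The sketch below is illustrative; it assumes a truth function is stored as a dictionary from rows of truth values to a truth value, and the name `all_truth_functions` is hypothetical.

```python
from itertools import product

def all_truth_functions(n):
    """Yield every truth function of n variables as a dict: row -> value."""
    rows = list(product([True, False], repeat=n))        # the 2^n assignments
    # Choosing T or F independently for each row gives one truth function.
    for outputs in product([True, False], repeat=len(rows)):
        yield dict(zip(rows, outputs))

for n in range(1, 4):
    count = sum(1 for _ in all_truth_functions(n))
    print(n, count, 2 ** (2 ** n))
```

Each of the 2^n rows receives one of two values independently, which is where the double exponent comes from.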

Experiment: Consider the variables p, q, r in that order.


A formula which takes value T only on the assignment T T T is p ∧ q ∧ r. Verify.
A formula which takes value T only on the assignment T T F is p ∧ q ∧ ¬r. Verify.
Give a formula which takes value T only on the assignment F T F .
Give a formula which takes value T only on the assignments T T F and F T F .
Give a formula which takes value T only on the assignments T F T , T T F and T F F .
Give a formula A which takes value T only on the assignments F T F and F F F , i.e., whose truth table
is the following:
p q r A
T T T F
T T F F
T F T F
T F F F
F T T F
F T F T
F F T F
F F F T
Theorem 7.3.10. Each truth function of n variables is expressed by a formula involving n variables.

Proof. Let φ be a truth function of n variables. Let p1 , . . . , pn be n atomic variables. We construct
a formula A that expresses φ. If rng φ = {F }, then A must be a contradiction; so, take
A = p1 ∧ ¬p1 ∧ p2 ∧ · · · ∧ pn . Otherwise, collect all those assignments f such that φ(f ) = T .
Suppose this set is {f1 , . . . , fm }. Corresponding to each fi , define the formula
Bi = r1 ∧ r2 ∧ · · · ∧ rn , where for 1 ≤ j ≤ n,

    rj = pj     if fi (pj ) = T ,
    rj = ¬pj    if fi (pj ) = F.

Notice that the formula Bi takes the value T only on the assignment fi . Thus, A = B1 ∨ B2 ∨ · · · ∨ Bm
is the required formula.

Example 7.3.11. Construct a formula that expresses the truth function φ given by

p q φ
T T T
T F T
F T F
F F F

Ans: The truth function φ is true only for the truth assignments f1 and f2 , where f1 (p) = f1 (q) =
T and f2 (p) = T, f2 (q) = F . The corresponding formulas are B1 = p ∧ q and B2 = p ∧ ¬q. So the
formula that expresses φ is (p ∧ q) ∨ (p ∧ ¬q).
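
The construction in the proof of Theorem 7.3.10 can be sketched as a short program: collect the rows on which φ is T and form one conjunction of literals per row. This is an illustration under stated assumptions; φ is passed as a Python function of Boolean arguments, the result is returned as a string, and the name `dnf_from_truth_function` is hypothetical.

```python
from itertools import product

def dnf_from_truth_function(variables, phi):
    """Build a formula (as a string) that expresses the truth function phi.

    variables: list of variable names; phi: function from booleans to a boolean.
    """
    conjunctions = []
    for row in product([True, False], repeat=len(variables)):
        if phi(*row):
            # B_i: take p_j where the row assigns T and ¬p_j where it assigns F.
            literals = [v if val else "¬" + v for v, val in zip(variables, row)]
            conjunctions.append("(" + " ∧ ".join(literals) + ")")
    if not conjunctions:
        # phi is F on every row: return a contradiction in the given variables.
        first = variables[0]
        return " ∧ ".join([first, "¬" + first] + variables[1:])
    return " ∨ ".join(conjunctions)

# The truth function of Example 7.3.11: phi(p, q) is T exactly when p is T.
print(dnf_from_truth_function(["p", "q"], lambda p, q: p))
```

The formula produced has one conjunction for each row with value T, so its length grows with the number of such rows; simplification by the laws is a separate step.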

As the proof of Theorem 7.3.10 shows, each truth function can be expressed by a formula which
has a special form. In particular, every formula can be equivalently expressed by a formula in such a
special form. We define such a special form, along with another related special form.

Definition 7.3.12. An atomic formula and the negation of an atomic formula are together called
literals. We say that a formula A is in disjunctive normal form (in short, DNF) if it is a disjunction
of conjunctions of literals. We say that a formula A is in conjunctive normal form (in short, CNF)
if it is a conjunction of disjunctions of literals. Both DNF and CNF are called normal forms.

Example 7.3.13. The formulas (p ∧ ¬q) ∨ ¬r and (p ∧ ¬q) ∨ (q ∧ ¬r) ∨ (r ∧ s) are in DNF; (p ∨ ¬q) ∧ r
and (p ∨ q) ∧ (q ∨ ¬r) ∧ (r ∨ s) are in CNF; while p, p ∨ q, ¬p ∧ q are in both CNF and DNF.

Practice 7.3.14. Write 5 formulas in CNF involving p, q, r.

Theorem 7.3.15. Any formula is equivalent to a formula in DNF, and also to a formula in CNF.

Proof. Since each formula defines a truth function, the first assertion follows from Theorem 7.3.10.
The second assertion can be proved similarly. Alternatively, if A is a formula, get a DNF for ¬A; then
negate the DNF and use the laws of De Morgan and Double Negation to get an equivalent CNF.

Practice 7.3.16. Write all the truth functions on two variables and write formulas for them.

A CNF and/or DNF representation of a formula can be computed by using equivalences. First, we
eliminate the connectives → and ↔ by using the laws of Implication and Biconditional, i.e., by using
the equivalences x → y ≡ ¬x ∨ y and x ↔ y ≡ (¬x ∨ y) ∧ (x ∨ ¬y). Next, we use the laws of De Morgan
and Double negation, that is, ¬(x ∨ y) ≡ (¬x ∧ ¬y), ¬(x ∧ y) ≡ (¬x ∨ ¬y) and ¬¬x ≡ x, so that the
earlier obtained formula is equivalent to one in which each occurrence of the connective ¬ precedes
an atomic variable. Finally, we use the laws of distributivity to obtain an equivalent formula, which is
in CNF and/or DNF. The formula so obtained can also be simplified using the laws of Absorption.
The following examples illustrate this method.

Example 7.3.17. Find a formula in DNF and also one in CNF equivalent to

((p → q) ∧ (q → r)) ∨ ((p ∧ q) → r).

We apply various laws in bringing the formula to its DNF and CNF as follows. Complete this by
mentioning the laws at each step.

((p → q) ∧ (q → r)) ∨ ((p ∧ q) → r)


≡ ((¬p ∨ q) ∧ (¬q ∨ r)) ∨ (¬(p ∧ q) ∨ r)
≡ ((¬p ∨ q) ∧ (¬q ∨ r)) ∨ (¬p ∨ ¬q ∨ r)

Ans: Using Distributivity on (¬p ∨ q) ∧ (¬q ∨ r), we get the DNF as

(¬p ∧ ¬q) ∨ (¬p ∧ r) ∨ (q ∧ ¬q) ∨ (q ∧ r) ∨ (¬p ∨ ¬q ∨ r).

Using Distributivity on the whole formula, we get the CNF as

(¬p ∨ q ∨ ¬p ∨ ¬q ∨ r) ∧ (¬q ∨ r ∨ ¬p ∨ ¬q ∨ r).

Notice that the CNF can be simplified using Absorption laws. The simplified formula equivalent
to the original formula is ¬p ∨ ¬q ∨ r, which is in both DNF and CNF.
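
The three-stage method (eliminate → and ↔, move ¬ inward, distribute) can be sketched as a program. The code below is an illustration under stated assumptions, not a definitive implementation: formulas are nested tuples with atoms as strings, the function names are hypothetical, and no simplification by Idempotence or Absorption is attempted.

```python
from itertools import product

def eliminate(f):
    """Remove → and ↔ using the laws of Implication and Biconditional."""
    if isinstance(f, str):
        return f
    if f[0] == "¬":
        return ("¬", eliminate(f[1]))
    x, y = eliminate(f[1]), eliminate(f[2])
    if f[0] == "→":
        return ("∨", ("¬", x), y)
    if f[0] == "↔":
        return ("∧", ("∨", ("¬", x), y), ("∨", x, ("¬", y)))
    return (f[0], x, y)

def push_negations(f):
    """Move ¬ inward by De Morgan and Double Negation (input uses only ¬, ∧, ∨)."""
    if isinstance(f, str):
        return f
    if f[0] == "¬":
        g = f[1]
        if isinstance(g, str):
            return f
        if g[0] == "¬":
            return push_negations(g[1])
        dual = "∨" if g[0] == "∧" else "∧"
        return (dual, push_negations(("¬", g[1])), push_negations(("¬", g[2])))
    return (f[0], push_negations(f[1]), push_negations(f[2]))

def distribute(f):
    """Distribute ∨ over ∧, giving a conjunction of disjunctions of literals."""
    if isinstance(f, str) or f[0] == "¬":
        return f
    x, y = distribute(f[1]), distribute(f[2])
    if f[0] == "∧":
        return ("∧", x, y)
    if not isinstance(x, str) and x[0] == "∧":
        return ("∧", distribute(("∨", x[1], y)), distribute(("∨", x[2], y)))
    if not isinstance(y, str) and y[0] == "∧":
        return ("∧", distribute(("∨", x, y[1])), distribute(("∨", x, y[2])))
    return ("∨", x, y)

def to_cnf(f):
    return distribute(push_negations(eliminate(f)))

def value(f, a):
    """Truth value of f under the assignment a (a dict from atoms to booleans)."""
    if isinstance(f, str):
        return a[f]
    if f[0] == "¬":
        return not value(f[1], a)
    x, y = value(f[1], a), value(f[2], a)
    return {"∧": x and y, "∨": x or y, "→": (not x) or y, "↔": x == y}[f[0]]

def show(f):
    if isinstance(f, str):
        return f
    if f[0] == "¬":
        return "¬" + show(f[1])
    return "(" + show(f[1]) + " " + f[0] + " " + show(f[2]) + ")"

# ((p → q) ∧ (q → r)) ∨ ((p ∧ q) → r)
original = ("∨", ("∧", ("→", "p", "q"), ("→", "q", "r")), ("→", ("∧", "p", "q"), "r"))
cnf = to_cnf(original)
rows = [dict(zip("pqr", vals)) for vals in product([True, False], repeat=3)]
print(show(cnf))
print(all(value(original, a) == value(cnf, a) for a in rows))
```

The printed formula keeps every binary pair of parentheses, so it looks longer than the hand-computed CNF, but it has the same two clauses. Distribution can enlarge a formula considerably, which is one reason to simplify afterwards.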



Exercise 7.3.18.
1. Use induction on the number of connectives to show that any formula is equivalent to a formula
in DNF and a formula in CNF.
2. A set of connectives is called adequate if every other connective can be expressed in terms of
the given ones. For instance, DNF and CNF conversion show that {¬, ∧, ∨} is an adequate
set. Determine which are adequate:
(a) {¬, ∧} (b) {¬, ∨} (c) {¬, →} (d) {∧, ∨} (e) {¬, ↔} (f ) {→, ∨, ∧}.
3. Fill in the blanks to prove that ‘f ≡ g’ if and only if ‘f ↔ g is a tautology’.
Proof. Assume that f ≡ g. Let b be an assignment. Then, the value of f and g are the same
under b. Thus, the value of f ↔ g is T under b. As b is an arbitrary assignment, we see that
f ↔ g is a tautology.
Therefore, if f is T under b, then g is T under b. That is, f → g and g → f are both T under
b. Thus, f ↔ g is T under the assignment b.
Conversely, suppose that f ↔ g is a tautology. Assume that f ≢ g. Then, there is an assignment b
under which f and g take different truth values.
So, suppose that f takes T and g takes F under b. Then f → g is F under b and hence f ↔ g
takes F under b, a contradiction. A similar contradiction is obtained if f takes F and g takes
T under b.

4. The dual P∗ of a formula P involving the connectives ∨, ∧, ¬ is obtained by interchanging ∨
with ∧. For instance, the dual of ¬(p ∨ q) ∧ r is ¬(p ∧ q) ∨ r. Prove the following:
(a) Let A(p1 , . . . , pk ) be a formula involving the atomic variables p1 , . . . , pk and connectives ∨, ∧
and ¬. If A(¬p1 , . . . , ¬pk ) is obtained by replacing pi with ¬pi in A for 1 ≤ i ≤ k, then
A(¬p1 , . . . , ¬pk ) ≡ ¬A∗ (p1 , . . . , pk ).
(b) Let A, B be formulas that use only the connectives ∨, ∧ and ¬. If A ≡ B, then A∗ ≡ B ∗ .

7.4 Inferences in SL
We now turn our attention towards the main goal of logic: when is a given argument valid? An
argument has the form: “ S1 , . . . , Sn . Therefore, Q. ”. Here, S1 , . . . , Sn and Q are sentences in some
natural language. To translate such an argument to SL involves translating the sentences to formulas
in SL. Suppose S1 , . . . , Sn , Q are translated to the formulas P1 , . . . , Pn , C, respectively. Our goal is
to determine whether C is true under the assumption that each of P1 , . . . , Pn is true. The translated
entity corresponding to the argument is denoted by

P1 , . . . , Pn ?⇒ C

and is called an inference. We use the terminology that P1 , . . . , Pn are premises and C is the conclusion
of this inference. Once the truth of C is determined from the assumption that P1 , . . . , Pn are true, we
would like to write
the inference P1 , . . . , Pn ?⇒ C is valid.

This last assertion is written as


P1 , . . . , Pn ⇒ C.

We formally define the notions involved.

Definition 7.4.1. An inference is an expression of the form {P1 , . . . , Pn } ?⇒ C, where P1 , . . . , Pn
and C are formulas. We also write the inference as P1 , . . . , Pn ?⇒ C. The formulas P1 , . . . , Pn are
called the premises or hypotheses, and C is called the conclusion of the inference. We say that
the inference is valid if (P1 ∧ · · · ∧ Pn ) → C is a tautology; in this case, we write {P1 , . . . , Pn } ⇒ C,
and also P1 , . . . , Pn ⇒ C. We read the symbol ⇒ as ‘implies’. When the inference is valid, we also
say that C is a logical conclusion of the premises P1 , . . . , Pn .

Example 7.4.2.

1. Is the following argument valid?

If x = 4, then discrete math is bad. Discrete math is bad. Therefore, x = 4.

Ans: Denote ‘x = 4’ by p and ‘discrete mathematics is bad’ by q. The argument is translated
to SL as the inference {p → q, q} ?⇒ p. The question is whether the inference is valid or not,
i.e., whether {p → q, q} ⇒ p. We need to determine whether (p → q) ∧ q → p is a tautology or
not.
Consider the assignment f with f (p) = F and f (q) = T . In this assignment, p → q is T ;
(p → q) ∧ q is T ; consequently, (p → q) ∧ q → p is F . Hence, the argument is invalid.

2. Is the following argument valid?

If discrete math is bad, then x = 4. Discrete math is bad. Therefore, x = 4.


Ans: Denote ‘x = 4’ by p and ‘discrete mathematics is bad’ by q. The argument is translated
into the inference {q → p, q} ?⇒ p. To determine whether it is valid, we need to find whether
(q → p) ∧ q → p is a tautology.
For this, suppose there is an assignment for which (q → p) ∧ q → p takes the value F . Then for
that assignment, p must be F and (q → p) ∧ q must be T . As (q → p) ∧ q is T , q must be T and
q → p must be T . Thus, we need to have, p is F , q is T , and q → p is T . This is impossible.
Hence, there is no assignment for which (q → p) ∧ q → p is F . Hence, it is a tautology. So p
logically follows from q → p and q. That is, {q → p, q} ⇒ p. The argument is valid.
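
The two arguments above can also be checked mechanically. The following is a minimal sketch, assuming we represent each formula as a Python function of the truth values of p and q; the helper names implies and is_valid are our own and are not part of the notes. It simply runs through all assignments and looks for one that makes every premise true and the conclusion false.

```python
from itertools import product

def implies(a, b):
    """Truth value of a -> b."""
    return (not a) or b

def is_valid(premises, conclusion, n_vars):
    """True if every assignment making all premises true also makes the conclusion true."""
    for values in product([True, False], repeat=n_vars):
        if all(prem(*values) for prem in premises) and not conclusion(*values):
            return False  # found an assignment with true premises and false conclusion
    return True

# Argument 1: {p -> q, q} => p
arg1 = is_valid([lambda p, q: implies(p, q), lambda p, q: q], lambda p, q: p, 2)
# Argument 2: {q -> p, q} => p
arg2 = is_valid([lambda p, q: implies(q, p), lambda p, q: q], lambda p, q: p, 2)

print(arg1, arg2)  # False True
```

The first argument fails exactly at the assignment f (p) = F , f (q) = T found above, while no such assignment exists for the second.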

Remark 7.4.3. Let A, B be formulas. A ⇒ B means that A → B is a tautology. Similarly, B ⇒ A
means B → A is a tautology. Hence “A ⇒ B and B ⇒ A” is the same as “A ↔ B is a tautology”, which
is again the same as A ≡ B. Thus, sometimes A ≡ B is also written as A ⇔ B.

While proving an inference to be correct, we only show that the falsity of the conclusion does not
go along with the truth of the premises, i.e., the premises and the negation of the conclusion cannot
be true simultaneously. And, if the conclusion of an inference is in the form p → q, we often ignore
the cases when p is false. This is so because when p is false, p → q is true, and in this case, we need
not use any premise towards a correct inference. These two proof methods are encapsulated in the
following result.

Theorem 7.4.4. Let A1 , . . . , An and X, Y be formulas.


1. [Rule of Contradiction] A1 , . . . , An ⇒ X if and only if A1 ∧ · · · ∧ An ∧ ¬X is a contradiction.
2. [Rule of Deduction] A1 , . . . , An ⇒ X → Y if and only if A1 , . . . , An , X ⇒ Y .

Proof. (1) Suppose A1 , . . . , An ⇒ X. Let f be a truth assignment. Then f assigns T to
A1 ∧ · · · ∧ An → X. If f assigns F to any one of A1 , . . . , An , then it assigns F to
A1 ∧ · · · ∧ An ∧ ¬X. Otherwise, f assigns T to each of A1 , . . . , An . Since f assigns T to
A1 ∧ · · · ∧ An → X, f assigns T to X. In this case, f assigns F to A1 ∧ · · · ∧ An ∧ ¬X.
Hence, each assignment f assigns F to A1 ∧ · · · ∧ An ∧ ¬X.
Thus, A1 ∧ · · · ∧ An ∧ ¬X is a contradiction.
Conversely, suppose A1 ∧ · · · ∧ An ∧ ¬X is a contradiction. Let f be an assignment. If f assigns F
to any of A1 , . . . , An , then f assigns T to A1 ∧ · · · ∧ An → X. Otherwise, suppose f assigns T to all of
A1 , . . . , An . Since A1 ∧ · · · ∧ An ∧ ¬X is a contradiction, f assigns F to ¬X. That is, f assigns T to
X. Hence, f assigns T to A1 ∧ · · · ∧ An → X. That is, each assignment f assigns T to
A1 ∧ · · · ∧ An → X. Therefore, A1 , . . . , An ⇒ X.
(2) We use (1) repeatedly and the equivalence ¬(X → Y ) ≡ X ∧ ¬Y to obtain the following:
A1 , . . . , An ⇒ X → Y
if and only if A1 ∧ · · · ∧ An ∧ ¬(X → Y ) is a contradiction
if and only if A1 ∧ · · · ∧ An ∧ X ∧ ¬Y is a contradiction
if and only if A1 , . . . , An , X ⇒ Y .
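
As a sanity check (not a substitute for the proof), both rules can be tested exhaustively in a small case. The sketch below assumes a single premise A and formulas over two atomic variables, so that every formula is represented by its truth table, a tuple of four truth values; the names TABLES, valid and contradiction are our own.

```python
from itertools import product

ROWS = range(4)                                   # the four assignments to two atomic variables
TABLES = list(product([True, False], repeat=4))   # all 16 truth tables on two variables

def valid(premises, conclusion):
    """Premises => conclusion: the conclusion is T on every row where all premises are T."""
    return all(conclusion[i] for i in ROWS if all(p[i] for p in premises))

def contradiction(table):
    """A formula is a contradiction if it is F on every row."""
    return not any(table)

checked = 0
for A, X, Y in product(TABLES, repeat=3):
    # Rule of Contradiction: A => X  iff  A ∧ ¬X is a contradiction
    a_and_not_x = tuple(A[i] and not X[i] for i in ROWS)
    assert valid([A], X) == contradiction(a_and_not_x)
    # Rule of Deduction: A => X -> Y  iff  A, X => Y
    x_implies_y = tuple((not X[i]) or Y[i] for i in ROWS)
    assert valid([A], x_implies_y) == valid([A, X], Y)
    checked += 1

print(checked)  # 4096
```

Every one of the 16 × 16 × 16 choices of truth tables for A, X and Y passes both checks.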

Example 7.4.5. [MP, MT, HS, AI, OI]


1. Show that p, p → q ⇒ q.
Ans: Suppose p and p → q are T (under an assignment). Suppose q is F (under the same
assignment). As p → q is T , p must be F . This is a contradiction.

Alternate. By the rule of Deduction, p, p → q ⇒ q (equivalently, p → q, p ⇒ q) if and only if
p → q ⇒ p → q, if and only if (p → q) → (p → q) is a tautology; and this is true.
This inference is called Modus Ponens, often abbreviated to MP.

2. Show that ¬q, p → q ⇒ ¬p.


Ans: Suppose ¬q and p → q are T . If ¬p is F , then p is T . Now that p → q is T , we see that q
is T . This is a contradiction.

Alternate. By the rule of Deduction, ¬q, p → q ⇒ ¬p if and only if (p → q) → (¬q → ¬p) is
a tautology; and this follows from Contraposition.
This inference is called Modus Tollens, often abbreviated to MT.
3. Show that p → q, q → r ⇒ p → r.
Ans: Suppose p → r is F . Then p is T and r is F . As r is F and q → r is T , q must be F . As
q is F and p → q is T , p is F , a contradiction.

Alternate. Using the rule of Deduction, we need to show that p → q, q → r, p ⇒ r. Using
Modus Ponens p, p → q ⇒ q, the premises p → q, q → r, p yield q; then q and q → r yield r
(again using Modus Ponens).
This inference is called Hypothetical Syllogism, abbreviated to HS.
4. Show that p → q, p → r ⇒ p → q ∧ r.
Ans: Suppose p is T . Since p → q is T , q is T . Since p → r is T , r is T . Then q ∧ r is T . Hence
p → q ∧ r is T .
This inference is called And Introduction, abbreviated to AI.
5. Show that p → r, q → r ⇒ p ∨ q → r.
Ans: Suppose p ∨ q → r is F . Then p ∨ q is T and r is F . Since r is F and the premise p → r
is T , we have p is F . Similarly, the premise q → r gives q is F . Now, the three statements p is
F , q is F and p ∨ q is T lead to a contradiction.
This inference is called Or Introduction, abbreviated to OI.
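
The five inferences of this example can also be confirmed by a brute-force truth table. The following is a small sketch under the assumption that each premise and conclusion is encoded as a Python function of (p, q, r); the dictionary rules and the helpers implies and is_valid are illustrative names of our own.

```python
from itertools import product

def implies(a, b):
    """Truth value of a -> b."""
    return (not a) or b

def is_valid(premises, conclusion, n_vars):
    """True if the conclusion is T on every row where all premises are T."""
    return all(
        conclusion(*v)
        for v in product([True, False], repeat=n_vars)
        if all(prem(*v) for prem in premises)
    )

# Each entry: name -> (list of premises, conclusion), all as functions of (p, q, r).
rules = {
    "MP": ([lambda p, q, r: p, lambda p, q, r: implies(p, q)],
           lambda p, q, r: q),
    "MT": ([lambda p, q, r: not q, lambda p, q, r: implies(p, q)],
           lambda p, q, r: not p),
    "HS": ([lambda p, q, r: implies(p, q), lambda p, q, r: implies(q, r)],
           lambda p, q, r: implies(p, r)),
    "AI": ([lambda p, q, r: implies(p, q), lambda p, q, r: implies(p, r)],
           lambda p, q, r: implies(p, q and r)),
    "OI": ([lambda p, q, r: implies(p, r), lambda p, q, r: implies(q, r)],
           lambda p, q, r: implies(p or q, r)),
}

results = {name: is_valid(prems, concl, 3) for name, (prems, concl) in rules.items()}
print(results)
```

Each of the five entries evaluates to True, in agreement with the arguments given above.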

As you see, correctness of an inference may be proved in three ways. Consider an inference

A1 , . . . , An ⇒? C.

We find out the atomic formulas involved in all the formulas Ai and C. Then we construct a truth
table having columns devoted to all the Ai's and also C. Next, we mark all those rows where all the Ai's are
T . In all these rows, check whether C is also T . If yes, then the inference is correct, else, the inference
is incorrect. This method of proof is called Proof by Truth Table.
Instead of constructing a truth table, one analyzes all possibilities of assigning truth values to the
atomic formulas so that the premises are true, and then shows that in all these cases, the conclusion
is also true. This method also comes under the method of truth table.
In another variation of the truth table method, we consider all possibilities of assigning truth values
to the atomic variables so that the conclusion is false. In each of these cases, we show that at least
one premise becomes false. This method is sometimes referred to as the indirect truth table method.
Thus, the truth table method has three varieties of proofs: one - construction of truth table, two
- analyzing the cases when premises are true, and three - analyzing the cases when the conclusion
is false. We see that when the conclusion is in the form p → q, it is advantageous to use the third
variation.
Alternatively, we may use the laws and the already known valid inferences such as Modus Ponens,
Modus Tollens, Hypothetical Syllogism, And Introduction, and Or Introduction to construct a proof of
the required inference. In this method, a proof is defined as a finite sequence of formulas, where each
formula is either a premise (some Ai ), or a tautology, or is derived from earlier formulas using some
law or already known valid inferences. The last formula in such a sequence must be the conclusion C.
Such a proof is called a Direct Proof. If the conclusion C is in the form p → q, then we may use p
as a new premise, and construct a proof with conclusion q. In symbols,

A1 , . . . , An ⇒ p → q if and only if A1 , . . . , An , p ⇒ q

This follows from the rule of Deduction; see Theorem 7.4.4.


As the third alternative, we construct a proof using the rule of Contradiction. Such a proof is called
an Indirect proof. In such a proof, one uses ¬C as a new premise, and then derives a contradiction.
Schematically,
A1 , . . . , An ⇒ C if and only if A1 , . . . , An , ¬C ⇒ ⊥

This method is justified by the rule of Contradiction as shown in Theorem 7.4.4. While constructing
the proof, when we find that some formula X has appeared in a line, and also ¬X has appeared
in some line, then it would mean that the same set of premises imply X as well as ¬X. This is a
contradiction. Thus we mention these two lines as our justification and write ⊥ on the last line.
In practice, we use the rule of Deduction and the rule of Contradiction to bring the given inference
to another form and proceed towards constructing a proof of the new inference. We explain these
methods of proof in the following example.

Example 7.4.6. Determine validity of the following argument:

The meeting can take place if all members are informed in advance and there is a quorum
(a minimum number of members are present). There is a quorum if at least 15 members
are present. Members would have been informed in advance if there was no postal strike.
Therefore, if the meeting was canceled, then either there were fewer than 15 members
present or there was a postal strike.

Ans: Let us symbolize the simple statements as follows:


m: the meeting takes place;
a: all members are informed;
f : at least fifteen members are present;
q: the meeting had quorum;
p: there was a postal strike.
We need to determine the validity of the inference

q ∧ a → m, f → q, ¬p → a ⇒? ¬m → ¬f ∨ p.

Proof by Truth table: In this case, we have five atomic formulas; the truth table will consist of 2⁵ = 32 rows.
After construction, we will find that there are fourteen rows in which all the premises are true.
In all these cases, we will find that the conclusion is also true.
However, this is time consuming. Even analyzing the truth values so that the premises are true is
no less time consuming. We will rather use the indirect truth table method.
Suppose the conclusion ¬m → (¬f ∨ p) is F and each of the premises q ∧ a → m, f → q and
¬p → a is T .
Now, ¬m → (¬f ∨ p) is F means ¬f ∨ p is F and ¬m is T . Hence, the atomic variables m, f and
p take values F, T and F , respectively. Since f → q is T and f is T , q must be T . Similarly, ¬p → a
is T gives a is T . Then (q ∧ a) → m is T and both q and a are T give m is T . This contradicts ¬m
taking the value T .
Therefore, the inference is valid; that is, q ∧ a → m, f → q, ¬p → a ⇒ ¬m → ¬f ∨ p; and hence
the argument is valid.
Direct Proof: First, we plan how to go about: from f → q and ¬p → a, we get f ∧ ¬p → q ∧ a. Then
q ∧ a → m gives f ∧ ¬p → m. Its contrapositive is ¬m → ¬f ∨ p. This plan is rewritten as a proof
below, where we mention the justification on the right side, which may be a tautology, a premise, a
law, or a known rule (valid inference) that uses previous lines.

 1. f ∧ ¬p → f            (p ∧ q ⇒ p)
 2. f ∧ ¬p → ¬p           (p ∧ q ⇒ q)
 3. f → q                 (Premise)
 4. f ∧ ¬p → q            (1, 3, HS)
 5. ¬p → a                (Premise)
 6. f ∧ ¬p → a            (2, 5, HS)
 7. f ∧ ¬p → (q ∧ a)      (4, 6, AI)
 8. q ∧ a → m             (Premise)
 9. f ∧ ¬p → m            (7, 8, HS)
10. ¬m → ¬(f ∧ ¬p)        (9, Contraposition)
11. ¬m → ¬f ∨ ¬¬p         (10, De Morgan)
12. ¬m → ¬f ∨ p           (11, Double negation)
Indirect Proof: Using the rule of Deduction and Contradiction, we have

q ∧ a → m, f → q, ¬p → a ⇒ ¬m → ¬f ∨ p
if and only if q ∧ a → m, f → q, ¬p → a, ¬m ⇒ ¬f ∨ p
if and only if q ∧ a → m, f → q, ¬p → a, ¬m, ¬(¬f ∨ p) ⇒ ⊥.

We then proceed to construct a proof of the last assertion.

 1. ¬(¬f ∨ p)             (Premise)
 2. f ∧ ¬p                (1, De Morgan, Double negation)
 3. f                     (2, p ∧ q ⇒ p)
 4. ¬p                    (2, p ∧ q ⇒ q)
 5. f → q                 (Premise)
 6. q                     (3, 5, MP)
 7. ¬p → a                (Premise)
 8. a                     (4, 7, MP)
 9. q ∧ a                 (6, 8, AI)
10. q ∧ a → m             (Premise)
11. m                     (9, 10, MP)
12. ¬m                    (Premise)
13. ⊥                     (11, 12)
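
The truth table approach for this example, although tedious by hand, is easy to carry out by machine. The sketch below is only an illustration under our own encoding: the five atomic variables are ordered as (m, a, f, q, p), and implies is a helper we introduce. It enumerates all rows, keeps those where the three premises are true, and checks the conclusion on each of them.

```python
from itertools import product

def implies(a, b):
    """Truth value of a -> b."""
    return (not a) or b

# All assignments to the atomic variables, in the order (m, a, f, q, p).
rows = list(product([True, False], repeat=5))

# Rows on which all three premises q ∧ a -> m, f -> q, ¬p -> a are true.
premise_rows = [
    (m, a, f, q, p)
    for (m, a, f, q, p) in rows
    if implies(q and a, m) and implies(f, q) and implies(not p, a)
]

# The inference is valid if the conclusion ¬m -> ¬f ∨ p is true on each such row.
valid = all(implies(not m, (not f) or p) for (m, a, f, q, p) in premise_rows)

print(len(rows), len(premise_rows), valid)  # 32 14 True
```

Of the 32 rows, the premises are simultaneously true on 14, and the conclusion holds on all of them.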

Exercise 7.4.7.
1. List all the nonequivalent formulas involving atomic variables p and q which take truth value T
on exactly half of the assignments.
2. Let A and B be two formulas involving the atomic variables p1 , . . . , pk . Prove that A ≡ B if and
only if ‘A ↔ B is a tautology’.

3. Prove (p → q ∨ r) ≡ (p ∧ ¬q → r) in three different ways: truth table method, simplification, and by
proving both p → q ∨ r ⇒ p ∧ ¬q → r and p ∧ ¬q → r ⇒ p → q ∨ r.
4. Determine which of the following are logically equivalent:
(a) q → s
(b) (p → r ∨ s) ∧ (q ∧ r → s)
(c) (s → q ∨ r) ∧ (q ∧ s → r)
 
(d) (p ∨ r ∨ (s → p)) ∧ (p → (s → r))
(e) (p ∨ s ∨ (q → p)) ∧ (p → (q → s)).

5. Let A be a formula that involves the connectives ∧, ∨, →, and atomic variables p1 , · · · , pk .
Show that the truth value of A is T under the assignment f (p1 ) = · · · = f (pk ) = T .
6. Verify the following assertions by analyzing truth table, and also by constructing a proof:
(a) p ∧ q ⇒ p
(b) p ⇒ p ∨ q
(c) ¬p ⇒ p → q
(d) ¬(p → q) ⇒ p
(e) ¬p, p ∨ q ⇒ q
(f) p, p → q ⇒ q
(g) ¬q, p → q ⇒ ¬p
(h) p → q, q → r ⇒ p → r
(i) p ∨ q, p → r, q → r ⇒ r
(j) p ↔ q ≡ (p ∧ q) ∨ (¬p ∧ ¬q)
(k) p ∧ q, p ∨ q ⇒ p → q
(l) p0 → p1 , p1 → p2 , . . . , p9 → p10 ⇒ ¬p0 ∨ p5 .
(m) ¬p ∨ q → r, s ∨ ¬q, ¬t, p → t, ¬p ∧ r → ¬s ⇒ ¬q.
(n) p → q, r ∨ s, ¬s → ¬t, ¬q ∨ s, ¬s, ¬p ∧ r → u, w ∨ t ⇒ u ∧ w.

7. [Monotonicity] Let S1 ⊆ S2 be finite sets of formulas and let A be a formula. Show that if
S1 ⇒ A, then S2 ⇒ A. (We have used this result without mention.)
8. Determine which of the following arguments is/are correct:
(a) If discrete math is bad, then computer programming is bad. If linear algebra is good, then
discrete math is good. If complex analysis is good, then discrete math is bad. If computer
programming is good, then linear algebra is bad. Complex analysis is bad and hence, at least
one more subject is bad. (Assume that a subject is either bad or good.)
(b) Three persons X, Y and Z are making statements. We know that if X is wrong, then Y is
right; if Y is wrong, then Z is right; and if Z is wrong, then X is right. Does it follow that
at least two of them are always right?
(c) If the lecture proceeds, then either black board is used or the slides are shown or the tablet
pc is used. If the black board is used, then students at the back bench are not comfortable
in reading the black board. If the slides are shown, then students are not comfortable with
the speed. If the tablet pc is used, then it causes a lot of small irritating disturbances to the
instructor. The lecture proceeds and the students are comfortable. Therefore, the instructor
faces disturbances.

9. The normal forms can be used for inferences. The clue lies in seeing when a normal form is a
tautology or a contradiction. Let A = C1 ∨ · · · ∨ Cm be a formula in DNF and let B = D1 ∧ · · · ∧ Dn
be a formula in CNF, where the Ci's are conjunctions of literals and the Dj's are disjunctions of literals.
Prove the following:
(a) A is a contradiction if and only if each Ci has an occurrence of p and also ¬p for some atomic
variable p. Such a p may vary from Ci to Ci .
(b) B is a tautology if and only if each Dj has an occurrence of p and also ¬p for some
atomic variable p. Such a p may vary from Dj to Dj .
Ans: Similar to the first part.

10. Let A and B be two formulas having the truth tables given below. How many nonequivalent
formulas C involving the atomic formulas p, q, r are there such that {A, B} ⇒ C?

p q r A p q r B
T T T T T T T T
T T F F T T F F
T F T T T F T T
T F F T T F F F
F T T F F T T T
F T F T F T F F
F F T F F F T T
F F F F F F F F


11. How many assignments of truth values to p, q, r and w are there for which ((p → q) → r) → w
is true? Guess a formula in terms of the number of variables.

12. Assume that F ≤ T . Let φ and ψ be two truth functions on the variables p1 , . . . , p9 . Suppose
that for each assignment f , we have φ(f ) ≤ ψ(f ). Does this imply that φ → ψ is a tautology?

13. Consider the set S of all nonequivalent formulas written using two atomic variables p and q.
For A, B ∈ S, define A ≤ B if A ⇒ B. Prove that this is a partial order on S. Draw its Hasse
diagram.

7.5 Predicate logic (PL)


How do we symbolize the argument ‘x runs faster than y, y runs faster than z, hence x runs faster
than z’? It cannot be symbolized in SL as {p, q} ⇒ r, since that inference is invalid, whereas the given
argument is valid. Notice that we are making this statement on a set whose elements are comparable as
to who runs faster than whom. In other words, we require something called a predicate, such as
faster(x, y), which takes the truth value T or F depending on the inputs, which are elements of such a set.

Definition 7.5.1. A k-place predicate P (x1 , . . . , xk ) is a sentence involving the variables x1 , . . . , xk
to which a truth value can be assigned under each assignment of values to x1 , . . . , xk from a nonempty
set, called a universe of discourse (UD).

Example 7.5.2.
1. Let P (x) mean ‘x > 0’. Then P (x) is a 1-place predicate. On the UD: [−1, 1], i.e., when an
element a ∈ [−1, 1] is selected corresponding to x, the resulting statement P (a) is either T or F .
2. Let P (x, y) mean ‘x² + y² = 1’. Then P (x, y) is a 2-place predicate. On the UD: R, when two
elements a, b ∈ R are selected corresponding to x, y, the resulting statement P (a, b) is either T
or F .

3. Let P (x, y, z) mean ‘x and y are children of z’. Then P (x, y, z) is a 3-place predicate. On the
UD: the set of all human beings, when three human beings a, b, c are selected corresponding to
x, y, z, the resulting statement P (a, b, c) is either T or F .

Definition 7.5.3. The well formed formulas, called formulas for short, of Predicate logic (PL)
are generated by using the following rules recursively:
1. Any predicate is a formula, called an atomic formula.
2. If A, B are formulas, then (¬A), (A ∧ B), (A ∨ B), (A → B) and (A ↔ B) are formulas.
3. If A is a formula and x is a variable, then (∀x A) and (∃x A) are formulas.

The symbols ∀ and ∃ are called quantifiers, where ∀ is the universal quantifier and ∃ is the existential
quantifier. Read ∀ as ‘for each’ and ∃ as ‘there exists’.

For example, (¬(∃x P (x, y, z))), (∀y (¬(∃x P (x, y, z)))), (∀z (¬((∃z R(z)) → R(z)))) are formulas.

Remark 7.5.4. We use the same term formula to mean a formula in SL, and one in PL. Notice that
PL is an extension of SL; so there should not be any confusion in the use of this term.

Definition 7.5.5. Let P be a formula.


1. In (∀x P ) or (∃x P ) the formula P is called the scope of the quantifier (extent to which that
quantification applies).
2. (a) If no quantifier occurs in P , then any occurrence of x in (∀x P ) is said to be bound by the
quantifier ∀, and any occurrence of x in (∃x P ) is said to be bound by the quantifier ∃.
(b) If some quantifiers occur in P , then any occurrence of x in (∀x P ) which is not already bound
by any quantifier occurring in P , is said to be bound by this occurrence of ∀. A similar statement
holds for the quantifier ∃.


3. An occurrence of a variable in a formula is called a free occurrence if that occurrence of the
variable is not bound by any quantifier. A variable in a formula is called a free variable if it
has at least one free occurrence in the formula.

Example 7.5.6. Let P (x, y, z) and R(z) be predicates.


1. In (∃x P (x, y, z)), the occurrence of y and z are free and both the occurrences of x are bound.
2. In (∀y (∃x P (x, y, z))), all occurrences of x and y are bound and the occurrence of z is free.

3. In (∀z ((∃z R(z)) → R(z))), the middle two z’s are bound by ∃ and the first and the last
occurrences of z are bound by ∀.¹
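
The notion of a free variable is recursive, and so it translates directly into a short program. The following is a minimal sketch, assuming a representation of our own in which formulas are nested tuples such as ("forall", "z", A) or ("pred", "R", ("z",)); neither this encoding nor the function name free_vars comes from the notes.

```python
def free_vars(formula):
    """Return the set of variables having at least one free occurrence in the formula."""
    kind = formula[0]
    if kind == "pred":                          # ("pred", name, (var, ...))
        return set(formula[2])
    if kind == "not":                           # ("not", A)
        return free_vars(formula[1])
    if kind in ("and", "or", "imp", "iff"):     # (connective, A, B)
        return free_vars(formula[1]) | free_vars(formula[2])
    if kind in ("forall", "exists"):            # (quantifier, var, A)
        # The quantifier binds every occurrence of var that is still free in A.
        return free_vars(formula[2]) - {formula[1]}
    raise ValueError(f"unknown node: {kind}")

P = ("pred", "P", ("x", "y", "z"))
R = ("pred", "R", ("z",))

f1 = ("exists", "x", P)                                  # ∃x P(x, y, z)
f2 = ("forall", "y", ("exists", "x", P))                 # ∀y ∃x P(x, y, z)
f3 = ("forall", "z", ("imp", ("exists", "z", R), R))     # ∀z (∃z R(z) → R(z))

print(free_vars(f1), free_vars(f2), free_vars(f3))
```

For the three formulas of the example, the free variables come out as y and z for the first, z for the second, and none for the third.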

Convention: Once the formation of formulas, scope, bound and free occurrences of variables are
understood, we will put forth the precedence rules so that formulas can be written in an abbreviated
form. The precedence rules are the following:
1. Outer parentheses are ignored.
2. ¬, ∀ and ∃ have the highest precedence.
3. ∧ and ∨ have the next precedence.
4. → and ↔ have the least precedence.
¹ Normally, we do not repeat the variable symbols used in the quantifiers. We will see that this formula is equivalent
to ∀z ((∃y R(y)) → R(z)).

For example, using the precedence rules, the formulas

(¬(∃x P (x, y, z))), (∀y (¬(∃x P (x, y, z)))), (∀z (¬((∃z R(z)) → R(z)))), (∀z ((∃y R(y)) → R(z)))

are respectively abbreviated to

¬∃x P (x, y, z), ∀y ¬∃x P (x, y, z), ∀z¬(∃zR(z) → R(z)), ∀z(∃yR(y) → R(z)).

We will use the abbreviated formulas with the understanding that in case an ambiguity arises, we
would resort back to the original form.
Definition 7.5.7.
1. Let A be a formula. An interpretation for A means fixing a nonempty set
UD (called the universe of discourse), assigning values to the free variables in A, and giving
meanings to the predicates in A. Schematically, an interpretation for A does the following:

      fix UD, assumed to be nonempty;
      assign values to the free variables occurring in A;
      give meanings to the predicates occurring in A.

If x is a variable, its value must be an element of UD; and if P (x1 , . . . , xn ) has n arguments,
then its meaning must be an n-ary relation on UD.
2. Let I be an interpretation for a formula ∀xP . Then we say ‘∀xP is T under I’ if for each a ∈ UD,
the value of P |x=a is T . Here, P |x=a means the expression obtained from P by replacing each
free occurrence of x with a.
Similarly, we say ‘∃xP is T under I’ if for some a ∈ UD, the value of P |x=a is T .
3. If P is a formula, then it will have a truth value T or F under each interpretation. (So you can
imagine a formula as a huge truth table.)
4. At times, the meaning of a formula under an interpretation, is also called an interpretation.

Remark 7.5.8. Formally, an interpretation I gives meaning to a predicate P (x1 , . . . , xn ) by assigning
to it an n-ary relation, say, P′ on the UD. So, P |x1 =a1 ,...,xn =an means (a1 , . . . , an ) ∈ P′. For ease in
notation, we continue with the informal assertion “P |x=a means the expression obtained from P by
replacing each free occurrence of x with a”, which is applied recursively.
Example 7.5.9. Consider the formula ∀x P (x, y).
1. Take N as UD. Let P (x, y) mean ‘x > y’. Let us assign 1 to the free variable y. Then the
formula is interpreted as ‘each natural number is greater than 1’, which has truth value F .
2. Take N as UD. Let P (x, y) mean ‘x + y is an integer’, and assign 2 to y. Then the formula is
interpreted as ‘when we add 2 to each natural number we get an integer’; it has truth value T .
Example 7.5.10. Let UD be the set of all human beings. Consider the 2-place predicate R(x, y): ‘x
runs faster than y’. Then
1. ∀x ∀y R(x, y) means ‘each human being runs faster than every human being’.
2. ∀x ∃y R(x, y) means ‘for each human being there is a human being who runs slower’.
3. ∃x ∃y R(x, y) means ‘there is a human being who runs faster than some human being’.
4. ∃x ∀y R(x, y) means ‘there is a human being who runs faster than every human being’.
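
On a finite UD, the four quantified statements above can be evaluated directly, since ∀ corresponds to Python's all and ∃ to any. The sketch below assumes a made-up UD of three runners with hypothetical speeds; the names and numbers are purely illustrative.

```python
# Hypothetical UD of three runners with made-up speeds (illustration only).
speed = {"Asha": 9.1, "Bala": 8.4, "Chitra": 7.7}
UD = list(speed)

def R(x, y):
    """R(x, y): x runs faster than y."""
    return speed[x] > speed[y]

s1 = all(all(R(x, y) for y in UD) for x in UD)   # ∀x ∀y R(x, y)
s2 = all(any(R(x, y) for y in UD) for x in UD)   # ∀x ∃y R(x, y)
s3 = any(any(R(x, y) for y in UD) for x in UD)   # ∃x ∃y R(x, y)
s4 = any(all(R(x, y) for y in UD) for x in UD)   # ∃x ∀y R(x, y)

print(s1, s2, s3, s4)  # False False True False
```

Note that s4 is F here because y ranges over the whole UD, including x itself, and nobody runs faster than himself or herself.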
Remark 7.5.11. [Translation] We expect to see that our developments on logic help us in drawing
appropriate conclusions. In order to do that we must know how to translate an English statement
into a formal logical statement that involves no English words. We may have to introduce appropriate
variables and required predicates. We may have to specify the UD, but normally we use the most
general UD.

Example 7.5.12.
1. Translate: Each person in this class room is either a BTech student or an MSc student.
Ans: Does the statement guarantee that there is a person in the room? No. All it says is that if there
is a person, then that person has certain properties. Let P (x) mean ‘x is a person in this class room’;
B(x) mean ‘x is a BTech student’; and M (x) mean ‘x is an MSc student’. Then the formula is
∀x (P (x) → B(x) ∨ M (x)).
2. Translate: There is a student in this class room who speaks Hindi or English.
Ans: Does the statement guarantee that there is a student in the room? Yes. Let S(x) mean ‘x
is a student in this class room’; H(x) mean ‘x speaks Hindi’; and E(x) mean ‘x speaks English’.

Then the formula is ∃x (S(x) ∧ (H(x) ∨ E(x))).
Note that ∃x (S(x) → H(x) ∨ E(x)) is not the correct translation. Why?¹


Notice that if a formula in PL has no free variables, then its translation into English will result in
a statement. Similarly, when English statements are translated into PL-formulas, they will result in
formulas having no free variables.

Example 7.5.13. Using the vocabulary Q(x): x is a rational number, R(x): x is a real number, and
L(x): x is less than 2, the following formulas are translated into English sentences, as shown:

1. ∀x (Q(x) → R(x)): Every rational number is a real number.

2. ∃x (¬Q(x) ∧ R(x)): There is a real number which is not rational.

3. ∀x (Q(x) ∧ L(x) → R(x) ∧ L(x)): Every rational number less than 2 is a real number less than 2.

4. ∀x (Q(x) ∧ L(x)) → ∀x (R(x) ∧ L(x)): If everything is a rational number less than 2, then
everything is a real number less than 2.

Exercise 7.5.14. Translate the following sentences into PL:


1. If there is a man on Mars, he is a genius.
2. For each student in IITG there is a student in IITG with a higher CPI.
3. Every natural number is either the square of a natural number or its square root is irrational.
4. For every real number x there is a real number y such that x + y = 0.
In the rest of the exercises, fill in the blank with a PL-formula so that the definition will be
complete.
5. A subset S ⊆ Rn is called compact, if —. Use the predicates O(x, A): x is an open cover of A;
S(x, y): x is a subset of y; and C(x, A): x is a finite cover of A.
6. A function f : R → R is called continuous at a point a, if —. Use UD = R and the predicates
P (x): x is positive; and Q(x, y, z): |x − y| < z.
7. A function f : R → R is called continuous if —. Use UD = R and the predicates P (x): x is
positive; and Q(x, y, z): |x − y| < z.
8. A function f : R → R is called uniformly continuous if —. Use UD = R and the predicates
P (x): x is positive; and Q(x, y, z): |x − y| < z.
9. A function f : S → T is called a bijection if —. Use predicates B(x, A): x is an element of A;
and E(x, y): x is equal to y.
¹ Remember, ∃x (P (x) → Q(x)) never asserts P (x). But ∃x (P (x) ∧ Q(x)) asserts both P (x) and Q(x).

7.6 Equivalences and Validity in PL


In parallel with SL, we isolate those formulas which receive the truth value T under every
interpretation; and use this notion to define equivalence of two given formulas.

Definition 7.6.1. A formula is called valid if every interpretation evaluates it to T . A formula
which receives the truth value F under each interpretation is called unsatisfiable. Two formulas A
and B are called equivalent, written A ≡ B, if A ↔ B is valid.

Notice that formulas A and B are equivalent if and only if under each interpretation, A and B
have the same truth value.
Example 7.6.2. Let R(x) be a predicate.
1. R(x) → R(x) is valid.
Reason: To see this, suppose I is an interpretation that fixes R(x) to a unary relation, say, R′
on some UD; and that assigns to x some element, say, a ∈ UD. A unary relation on UD is simply a
subset of UD, so R′ ⊆ UD. Now, I
assigns T to R(x) if and only if a ∈ R′. The formula R(x) → R(x) is interpreted as the sentence:
if a ∈ R′, then a ∈ R′. This sentence is true in any UD. Since I is an arbitrary interpretation,
we conclude that R(x) → R(x) is valid.
2. R(x) ∧ ¬R(x) is unsatisfiable.
Reason: Consider an interpretation I with any UD. Suppose I assigns b ∈ UD to x; and
interprets R(x) as the unary relation R′ ⊆ UD. The formula R(x) ∧ ¬R(x) is interpreted as the
statement: b ∈ R′ and b ∉ R′. This is false. Since I is an arbitrary interpretation, R(x) ∧ ¬R(x)
is unsatisfiable.
3. Are the formulas ∀x R(x) and ∀z R(z) equivalent?
Reason: Let I be an interpretation. Under I, suppose that the formula ∀x R(x) is T . It means
that for each a ∈ UD, the value of R(a) is T . Then ∀z R(z) is T under I. The argument is
similar if ∀x R(x) is F under I. Since I is an arbitrary interpretation, the two formulas are
equivalent.
Similarly, ∃x R(x) ≡ ∃y R(y).
 
4. Consider the formulas ∀z (∃z R(z) → R(z)) and ∀y (∃z R(z) → R(y)). Let I be an interpretation.
Assume that ∀z (∃z R(z) → R(z)) is T under I. This means (∃z R(z) → R(z))|z=a is T
for each a ∈ UD. This means ∃z R(z) → R(a) is T for each a ∈ UD. But this also means that
∀y (∃z R(z) → R(y)) is T under I.
Similarly, if ∀z (∃z R(z) → R(z)) is F under I, then ∀y (∃z R(z) → R(y)) is F under I. As I is
arbitrary, ∀z (∃z R(z) → R(z)) ≡ ∀y (∃z R(z) → R(y)).

Proposition 7.6.3. Let P and Q be formulas. The following are true:


1. All tautologies of SL are valid in PL.
2. If P is valid and x is any variable, then both ∀x P and ∃x P are valid.
3. ¬(∀x P ) ≡ ∃x ¬P , ¬(∃x P ) ≡ ∀x ¬P .
4. ∀x ∀y P ≡ ∀y ∀x P , ∃x ∃y P ≡ ∃y ∃x P .
5. ∀x (P ∧ Q) ≡ ∀x P ∧ ∀x Q , ∃x (P ∨ Q) ≡ ∃x P ∨ ∃x Q.

Proof. (1) In a tautology of SL, replace all atomic formulas by predicates of PL (chosen respectively).
For instance, in the tautology p → (q → p), replacing p by P (x, y) and q by R(x, y, z), we get the
formula P (x, y) → (R(x, y, z) → P (x, y)). The assertion says that the resulting formula of PL is valid.
Observe that the connectives are interpreted the same way in PL as in SL. Therefore, the assertion
holds.
(2) Let P be a valid formula and let x be any variable. Let I be an interpretation. Let a ∈ UD. Since
P is valid, P |x=a is T . This holds for each element a of UD. So, both the statements
“There exists a ∈ UD, P |x=a is T .” and “For each a ∈ UD, P |x=a is T .”
hold. (Recall that UD ≠ ∅.) Therefore, under I, both ∃x P and ∀x P are T . Since I is an arbitrary
interpretation, both ∃x P and ∀x P are valid.
(3) Assume that under some interpretation I, the formula ¬(∀x P ) is T . So, ∀x P is F under I. That
is, for some a ∈ UD, P |x=a is F under I. Thus, ¬(P |x=a ) is T under I. Hence, ∃x¬P is T under I.
Conversely, suppose that ∃x ¬P is T under an interpretation I. Then there is an a ∈ UD such that
(¬P )|x=a is T under I. This means, P |x=a is F under I. Hence, ∀xP is F under I. That is, ¬(∀xP )
is T under I. This proves the first assertion.
For the second assertion, we use the first assertion as follows:

¬(∃x P ) ≡ ¬(∃x ¬¬P ) ≡ ¬¬(∀x ¬P ) ≡ ∀x ¬P.

(4) Consider the formulas ∃x ∃y P and ∃y ∃x P . Let I be an interpretation. Suppose ∃x ∃y P is T
under I. Then for some a ∈ UD, we have (∃y P )|x=a is T under I. Then again, for some b ∈ UD, we
have P |x=a,y=b is T under I. Since P |x=a,y=b = P |y=b,x=a , we see that (∃x P )|y=b is T under I. This
means ∃y ∃x P is T under I. A similar argument shows that if ∃y ∃x P is T under I, then ∃x ∃y P is
also T under I. This proves the second assertion.
For the first assertion, we use the second as follows:

∀x ∀y P ≡ ¬¬(∀x ∀y P ) ≡ ¬(∃x ∃y ¬P ) ≡ ¬(∃y ∃x ¬P ) ≡ ∀y ∀x ¬¬P ≡ ∀y ∀x P.

(5) Let I be an interpretation under which ∀x (P ∧ Q) is T . Then for each element a ∈ UD, (P ∧ Q)|x=a
is T . However, (P ∧ Q)|x=a = (P |x=a ) ∧ (Q|x=a ). Thus, both P |x=a and Q|x=a are T under I.
Now, for each element a ∈ UD, P |x=a is T under I implies that ∀x P is T under I. Similarly, for
each element a ∈ UD, Q|x=a is T under I implies that ∀x Q is T under I. Therefore, ∀x P ∧ ∀x Q
is T under I.
Conversely, suppose ∀x P ∧ ∀x Q is T under I. Then both ∀x P and ∀x Q are T under I. Then for
each element a ∈ UD, P |x=a is T , and for each element b ∈ UD, Q|x=b is T . Let c ∈ UD. It follows
that under I, P |x=c is T and Q|x=c is T . That is, for each c ∈ UD, (P ∧ Q)|x=c is T under I. Hence
∀x (P ∧ Q) is T under I.
We conclude that under I, the formula ∀x (P ∧ Q) ↔ (∀x P ) ∧ (∀x Q) is T . Since I is an arbitrary
interpretation, this biconditional is valid, so that ∀x (P ∧ Q) ≡ ∀x P ∧ ∀x Q.
The second assertion is obtained from the first as in the following:

∃x (P ∨ Q) ≡ ¬¬∃x (P ∨ Q) ≡ ¬∀x ¬(P ∨ Q) ≡ ¬∀x (¬P ∧ ¬Q) ≡ ¬((∀x ¬P ) ∧ (∀x ¬Q))
≡ ¬(¬(∃x P ) ∧ ¬(∃x Q)) ≡ ¬¬((∃x P ) ∨ (∃x Q)) ≡ ∃x P ∨ ∃x Q.

The first part in Proposition 7.6.3 says that all the rules of the logic of statements also hold in
predicate logic. For instance, p ∨ ¬p being a tautology, it follows that ∀x P ∨ ¬∀x P is valid.
Again, ¬∀x P ≡ ∃x ¬P . Hence ∀x P ∨ ∃x ¬P is valid. You may similarly obtain many more valid
formulas in PL, and formulate many equivalences accordingly.
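
Parts (3) and (5) of Proposition 7.6.3 can be spot-checked on a small finite UD. The sketch below is an illustration only: it assumes a UD of three elements and treats P and Q as 1-place predicates, each given by a tuple of three truth values (the names UD, TABLES and checked are our own). Passing this check supports the proposition on that UD but does not replace the proof.

```python
from itertools import product

UD = range(3)                                      # a small universe {0, 1, 2}
TABLES = list(product([False, True], repeat=3))    # every 1-place predicate on UD

checked = 0
for P, Q in product(TABLES, repeat=2):
    # (3) ¬∀x P ≡ ∃x ¬P  and  ¬∃x P ≡ ∀x ¬P
    assert (not all(P[x] for x in UD)) == any(not P[x] for x in UD)
    assert (not any(P[x] for x in UD)) == all(not P[x] for x in UD)
    # (5) ∀x (P ∧ Q) ≡ ∀x P ∧ ∀x Q  and  ∃x (P ∨ Q) ≡ ∃x P ∨ ∃x Q
    assert all(P[x] and Q[x] for x in UD) == (all(P[x] for x in UD) and all(Q[x] for x in UD))
    assert any(P[x] or Q[x] for x in UD) == (any(P[x] for x in UD) or any(Q[x] for x in UD))
    checked += 1

print(checked)  # 64
```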

In the following example, we show that different quantifiers do not commute, ∀ does not distribute
over ∨, and ∃ does not distribute over ∧.

Example 7.6.4.

1. ∃x ∀y P ≢ ∀y ∃x P .
Reason: Consider P as the predicate Q(x, y) in the UD = N. Interpret Q(x, y) as ‘x > y’. Then
∃x ∀y P is the formula ∃x ∀yQ(x, y). It means ‘There is a natural number larger than all natural
numbers’. Clearly, this is false. The formula ∀y ∃x P is ∀y ∃x Q(x, y). It means ‘for each natural
number there is a larger natural number’, which is true.

2. ∀x (P ∨ Q) ≢ ∀x P ∨ ∀x Q.
Reason: Consider P as the predicate O(x) and Q as the predicate E(x) in the UD = N. Interpret
O(x) as ‘x is odd’, and E(x) as ‘x is even’. Then ∀x (P ∨ Q) is the formula ∀x (O(x) ∨ E(x)). It
means ‘each natural number is either odd or even’. This is true. Now, ∀x P ∨ ∀x Q is the formula
∀x O(x) ∨ ∀x E(x). It means ‘either all natural numbers are odd, or all natural numbers are
even’. Clearly, this is false.

3. ∃x (P ∧ Q) ≢ ∃x P ∧ ∃x Q.
Reason: Consider the predicates and their interpretations as in (2). The formula ∃x (P ∧ Q) is
interpreted as ‘there is a natural number which is both odd and even’. This is false, whereas
the formula ∃x P ∧ ∃x Q is interpreted as the true sentence ‘there exists a natural number which
is odd, and also there exists a natural number which is even’.
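
The three non-equivalences above can be spot-checked on a finite universe. The following is only a sketch under stated assumptions: UD = {1, ..., 6} stands in for N, and for item 1 the predicate Q(x, y) is read as ‘x = y’ instead of ‘x > y’, because on a finite UD the sentence ‘for each number there is a larger number’ is no longer true. The names `UD`, `odd` and `even` are illustrative.

```python
# Finite spot-check of Example 7.6.4; UD is an assumed finite stand-in for N.
UD = range(1, 7)

def odd(x):
    return x % 2 == 1

def even(x):
    return x % 2 == 0

# 1. Different quantifiers do not commute (Q(x, y) is read as x == y here).
exists_forall = any(all(x == y for y in UD) for x in UD)
forall_exists = all(any(x == y for x in UD) for y in UD)

# 2. The universal quantifier does not distribute over "or".
forall_or = all(odd(x) or even(x) for x in UD)
or_forall = all(odd(x) for x in UD) or all(even(x) for x in UD)

# 3. The existential quantifier does not distribute over "and".
exists_and = any(odd(x) and even(x) for x in UD)
and_exists = any(odd(x) for x in UD) and any(even(x) for x in UD)

print(exists_forall, forall_exists)  # False True
print(forall_or, or_forall)          # True False
print(exists_and, and_exists)        # False True
```

In each pair the two sides receive different truth values under one interpretation, which is all that is needed to refute an equivalence.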
Example 7.6.5. Is ∀x (R(x) → ∃y (R(y) ∧ P(x, y))) ≡ ∀x ∃y (R(x) → (R(y) ∧ P(x, y)))?

Ans: First, let us check the validity of X → Y, where

X = ∀x (R(x) → ∃y (R(y) ∧ P(x, y))),   Y = ∀x ∃y (R(x) → (R(y) ∧ P(x, y))).

Suppose that X → Y is invalid. So there is an interpretation I under which X is T and Y is F .


As Y is F, we see that for some a ∈ UD,

∃y (R(a) → (R(y) ∧ P(a, y))) is F.

That is, for each y ∈ UD, R(a) → (R(y) ∧ P(a, y)) is F.
That is, R(a) is T and for each y, R(y) ∧ P(a, y) is F.
That is, R(a) is T and ∃y (R(y) ∧ P(a, y)) is F.
That is, R(a) → ∃y (R(y) ∧ P(a, y)) is F.

This leads to a contradiction since X = ∀x (R(x) → ∃y (R(y) ∧ P(x, y))) is T.
Similarly, one shows that Y → X is valid.

Alternate. Write A = R(x) → ∃y (R(y) ∧ P(x, y)) and B = ∃y (R(x) → (R(y) ∧ P(x, y))). Consider
an element a ∈ UD. If R(a) is F, then both A|x=a and B|x=a are T. So, suppose R(a) is T. Notice that
in this case
R(a) → ∃y (R(y) ∧ P(a, y)) and ∃y (R(a) → (R(y) ∧ P(a, y))) have the same truth value. Thus, A ≡ B.
It follows that ∀xA ≡ ∀xB, that is, X ≡ Y .
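
The equivalence can also be sanity-checked by brute force on small universes. This is a sketch, not a proof: it assumes UD = {0, ..., n−1} for n = 1, 2, 3 and enumerates every interpretation of R and P as truth tables. The function name `equivalent_on` is hypothetical.

```python
from itertools import product

def equivalent_on(n):
    """Compare X and Y under every interpretation of R and P on UD = {0, ..., n-1}."""
    ud = range(n)
    pairs = [(a, b) for a in ud for b in ud]
    for r_bits in product([False, True], repeat=n):
        for p_bits in product([False, True], repeat=n * n):
            R = dict(zip(ud, r_bits))
            P = dict(zip(pairs, p_bits))
            # X = forall x (R(x) -> exists y (R(y) and P(x, y)))
            X = all((not R[x]) or any(R[y] and P[x, y] for y in ud) for x in ud)
            # Y = forall x exists y (R(x) -> (R(y) and P(x, y)))
            Y = all(any((not R[x]) or (R[y] and P[x, y]) for y in ud) for x in ud)
            if X != Y:
                return False
    return True

print(all(equivalent_on(n) for n in (1, 2, 3)))
```

A finite search of this kind can only fail to find a counterexample; the argument in the text is what establishes the equivalence for every interpretation.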

Exercise 7.6.6.

1. Show that ∀x R(x) → ∃y (R(y) ∧ P (x, y)) is not valid.
 
2. Show that ∀x (P(x) → Q(x)) → ∃x (¬P(x) → ¬Q(x)) is not valid.

3. Let P and Q be formulas. Determine whether ∀x(P → Q) ≡ ∀xP → ∀xQ.



7.7 Inferences in PL
As in SL, we translate arguments to inferences in PL. The validity of an inference is defined in an
analogous manner.
Definition 7.7.1. An inference is an expression of the form {P1, . . . , Pn} ⇒? C (the arrow ⇒ with
a question mark above it), where the formulas P1, . . . , Pn are called premises or hypotheses, and
the formula C is called the conclusion of the inference. We say that the inference is valid, and
write {P1, . . . , Pn} ⇒ C, if (P1 ∧ · · · ∧ Pn) → C is valid. In such a case, we also say that C is a
logical conclusion of the premises P1, . . . , Pn.

We abbreviate {P1, . . . , Pn} ⇒? C to P1, . . . , Pn ⇒? C and {P1, . . . , Pn} ⇒ C to P1, . . . , Pn ⇒ C;
and read the symbol ⇒ as ‘implies’.

It follows that X ≡ Y if and only if both X ⇒ Y and Y ⇒ X hold.


Since PL is an extension of SL, we will use all the laws and rules including the Rules of Contradiction
and Deduction. Moreover, to prove that P1 , . . . , Pn ⇒ C, all that we have to do is assume that all
premises P1 , . . . , Pn are T under an arbitrary interpretation I and show that under the same I, C
must be T . Alternatively, using the rule of Contradiction, P1 , . . . , Pn ⇒ C can be proved by assuming
that an interpretation I makes the conclusion C false, and then showing that I makes at least one of
the premises P1 , . . . , Pn false.
We have seen in Example 7.6.4 that some of the equivalences do not hold. In fact, we have shown
that one direction of each of these equivalences fails. Namely,

∀y ∃x P ⇏ ∃x ∀y P,   ∀x (P ∨ Q) ⇏ ∀x P ∨ ∀x Q,   ∃x P ∧ ∃x Q ⇏ ∃x (P ∧ Q).

We show that their converse implications hold.

Proposition 7.7.2. Let P and Q be formulas. Then the following assertions hold:

1. ∃x ∀y P ⇒ ∀y ∃x P .
2. ∀x P ∨ ∀x Q ⇒ ∀x (P ∨ Q).
3. ∃x (P ∧ Q) ⇒ ∃x P ∧ ∃x Q.

Proof. (1) Let I be an interpretation under which ∃x ∀y P is T. Then there is an element a ∈ UD
such that (∀y P)|x=a is T. Then for each b ∈ UD, P|x=a,y=b is T. It implies that for
each b ∈ UD, (∃x P)|y=b is T. Hence ∀y ∃x P is T under I. Since I is an arbitrary interpretation,
we conclude that ∃x ∀y P ⇒ ∀y ∃x P.
(2) Let I be an interpretation under which ∀x P ∨ ∀x Q is T . If ∀x P is T , then for each a ∈ UD,
P |x=a is T . However, P |x=a is T implies that P |x=a ∨ Q|x=a is T ; and P |x=a ∨ Q|x=a = (P ∨ Q)|x=a .
Thus, for each a ∈ UD, (P ∨ Q)|x=a is T . So, under I , ∀x (P ∨ Q) is T . Similarly, it follows that if
∀x Q is T under I, then ∀x (P ∨ Q) is also T . In any case, ∀x (P ∨ Q) is T under I. Since I is an
arbitrary interpretation, ∀x P ∨ ∀x Q ⇒ ∀x (P ∨ Q).
(3) We know ¬(∃x P ∧ ∃x Q) ≡ ¬∃x P ∨ ¬∃x Q ≡ ∀x ¬P ∨ ∀x ¬Q. By (2), ∀x ¬P ∨ ∀x ¬Q ⇒ ∀x (¬P ∨ ¬Q). Now,
∀x (¬P ∨ ¬Q) ≡ ∀x ¬(P ∧ Q) ≡ ¬∃x (P ∧ Q). Hence, ¬(∃x P ∧ ∃x Q) ⇒ ¬∃x (P ∧ Q). By contraposition, this is the same
as ∃x (P ∧ Q) ⇒ ∃x P ∧ ∃x Q.
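
As a sanity check of Proposition 7.7.2, the three implications can be tested exhaustively on small universes. This is a sketch under assumptions: UD = {0, ..., n−1}, the formulas P and Q are taken to be atomic predicates given by truth tables (unary for parts 2 and 3, binary for part 1), and `check_proposition` is a hypothetical name.

```python
from itertools import product

def check_proposition(n):
    """Test the three implications under every interpretation on UD = {0, ..., n-1}."""
    ud = range(n)
    tables = list(product([False, True], repeat=n))
    # Parts (2) and (3): P and Q are unary predicates given by truth tables.
    for P in tables:
        for Q in tables:
            # (2) (forall x P) or (forall x Q)  implies  forall x (P or Q)
            if (all(P) or all(Q)) and not all(p or q for p, q in zip(P, Q)):
                return False
            # (3) exists x (P and Q)  implies  (exists x P) and (exists x Q)
            if any(p and q for p, q in zip(P, Q)) and not (any(P) and any(Q)):
                return False
    # Part (1): P is a binary predicate, stored as an n-by-n truth table P2[x][y].
    for bits in product([False, True], repeat=n * n):
        P2 = [bits[i * n:(i + 1) * n] for i in range(n)]
        exists_forall = any(all(P2[x][y] for y in ud) for x in ud)
        forall_exists = all(any(P2[x][y] for x in ud) for y in ud)
        if exists_forall and not forall_exists:
            return False
    return True

print(all(check_proposition(n) for n in (1, 2, 3)))
```

The search covers only finitely many interpretations, so it illustrates the proposition rather than proving it.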

Example 7.7.3. Any student who appears in the exam and gets a score below 30, gets F grade. A
student x0 has not written the exam. Therefore x0 should get F grade. Do you agree?
Ans: Let S(x) mean ‘x is a student’, E(x) mean ‘x writes the exam’, B(x) mean ‘x gets a score
below 30’, and F (x) mean ‘x gets F grade’.

We want to see whether¹ ∀x ((S(x) ∧ E(x) ∧ B(x)) → F(x)), S(x0) ∧ ¬E(x0) ⇒ F(x0).

Take the following interpretation: S(x) is ‘x is a positive real number’, E(x) is ‘x is a rational
number’, B(x) is ‘x is an integer’, F(x) is ‘x is a natural number’, and x0 = √2.

In this interpretation, the premises mean ‘every positive integer is a natural number’ and ‘√2 is
a positive real number which is not rational’. Both of them are true, whereas the conclusion means
‘√2 is a natural number’, which is false. So the argument is incorrect.

Example 7.7.4. Translate the following argument into PL and then check whether it is correct:

All scientists are human beings. Therefore, all children of scientists are children of human
beings.

Ans: Let S(x) mean ‘x is a scientist’, H(x) mean ‘x is a human being’, and C(x, y) mean ‘x is a child
of y’. Then our hypothesis is ∀x (S(x) → H(x)). A few possible translations of the conclusion are the
following:
1. ∀x (∃y (S(y) ∧ C(x, y)) → ∃z (H(z) ∧ C(x, z))). It means ‘for each x, if x has a scientist father
then x has a human father’. This is a correct translation.
2. ∀x (∀y (S(y)∧C(x, y)) → ∀z (H(z)∧C(x, z))). The statement means ‘for all x, if x is a (common)
child of all scientists, then x is a (common) child of all human beings’. This is a wrong translation.
3. ∀x (S(x) → ∀y (C(y, x) → ∃z (H(z) ∧ C(y, z)))). This means ‘for each x, if x is a scientist, then
each child of x has a human father’. This is also a correct translation.
4. ∀x ∀y (S(x) ∧ C(y, x)) → ∀x ∀y (H(x) ∧ C(x, y)). This means ‘if each x is a scientist and each y
is a child of x (y can be equal to x), then each x is a human being and each y is a child of x’.
This is a wrong translation.

So, let us check whether ∀x (S(x) → H(x)) ⇒ ∀x (∃y (S(y) ∧ C(x, y)) → ∃z (H(z) ∧ C(x, z))).
Let I be an interpretation under which ∀x (S(x) → H(x)) is T . Let b be any element of UD.
Suppose that ∃y (S(y)∧C(b, y)) is T under I. Then there is an element a ∈ UD such that S(a)∧C(b, a)
is T . Since ∀x (S(x) → H(x)) is T , we see that S(a) → H(a) is T . It follows that H(a) ∧ C(b, a) is T .
Hence under I, ∃z (H(z) ∧ C(b, z)) is T .
Using the Rule of Deduction, we conclude that under I, the formula ∃y (S(y) ∧ C(b, y)) →
∃z (H(z) ∧ C(b, z)) is T . Since this holds for any arbitrary element b ∈ UD, we conclude that under I,

∀x (∃y (S(y) ∧ C(x, y)) → ∃z (H(z) ∧ C(x, z))) is T. Since I is an arbitrary interpretation, this proves
that the conclusion logically follows from the premise.
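
The inference can be cross-checked by enumerating small interpretations. The sketch below assumes UD = {0, ..., n−1} for n = 1, 2, 3 and tests both translations that the example marks as correct, (1) and (3), against the premise; `entails_on` is a hypothetical name.

```python
from itertools import product

def entails_on(n):
    """Whenever the premise holds on UD = {0, ..., n-1}, check translations 1 and 3."""
    ud = range(n)
    bools = [False, True]
    for S in product(bools, repeat=n):
        for H in product(bools, repeat=n):
            if not all((not S[x]) or H[x] for x in ud):
                continue  # the premise forall x (S(x) -> H(x)) is false here
            for bits in product(bools, repeat=n * n):
                def C(x, y):
                    # C(x, y): x is a child of y
                    return bits[x * n + y]
                # Translation 1
                t1 = all(
                    (not any(S[y] and C(x, y) for y in ud))
                    or any(H[z] and C(x, z) for z in ud)
                    for x in ud
                )
                # Translation 3
                t3 = all(
                    (not S[x])
                    or all((not C(y, x)) or any(H[z] and C(y, z) for z in ud)
                           for y in ud)
                    for x in ud
                )
                if not (t1 and t3):
                    return False
    return True

print(all(entails_on(n) for n in (1, 2, 3)))
```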

Example 7.7.5. Let P be a formula and let R be a formula that does not have any occurrence of x.
Show that
∀x (R ∨ P ) ≡ R ∨ ∀x P, ∀x (R → P ) ≡ R → ∀x P,

∃x (R ∧ P ) ≡ R ∧ ∃x P, ∃x (R → P ) ≡ R → ∃x P.

∀x P → R ≡ ∃x (P → R), ∃x P → R ≡ ∀x (P → R).

Ans: We already know that ∀x R ∨ ∀x P ⇒ ∀x (R ∨ P ). Since R does not have any occurrence of
x, R ≡ ∀x R. Hence R ∨ ∀x P ⇒ ∀x (R ∨ P ). For the converse, let I be an interpretation under
which ∀x (R ∨ P ) is T . Then for each element a ∈ UD, (R ∨ P )|x=a is T . Since R does not have any
occurrence of x, (R ∨ P)|x=a = R ∨ P|x=a. So, under I, either R is T or for each a ∈ UD, P|x=a is T.
That is, under I, R ∨ ∀x P is T. Since I is an arbitrary interpretation, ∀x (R ∨ P) ⇒ R ∨ ∀x P. We
conclude that ∀x (R ∨ P) ≡ R ∨ ∀x P.
The others follow from the above by using the equivalences A → B ≡ ¬A ∨ B, ¬∀x A ≡ ∃x ¬A,
¬∃x A ≡ ∀x ¬A, ¬¬A ≡ A, ¬(A ∨ B) ≡ ¬A ∧ ¬B and ¬(A ∧ B) ≡ ¬A ∨ ¬B.

¹ Footnote to Example 7.7.3: x0 there is not a variable; it is a constant. Constants are interpreted
as elements of UD just like variables, but their occurrence in a formula is never categorized into
bound or free.

Remark 7.7.6.
1. If S is a given set and P is a formula, sometimes we use ∀(x ∈ S)P and ∃(x ∈ S)P . These are
nothing but ∀x(E(x) → P ) and ∃x(E(x) ∧ P ), respectively, where, E(x) means x ∈ S.
2. At times, while dealing with real numbers or very familiar sets, we use certain predicate symbols
in an informal way. For example, we may write x ∈ S instead of using something like E(x, S);
or we may use x > 0 instead of using something like P (x).
For example, in the set R, the meaning of

∃(ε > 0) ∀(δ > 0)(0 < |x − a| < δ → |f(x) − ℓ| < ε)

is: “the set {|f(x) − ℓ| : x ∈ R, 0 < |x − a|} has an upper bound ε”.

Logic is used primarily to define and argue about mathematical systems. The predicate logic
developed so far is not enough to do that, in general. We need to extend it further by including the
equality predicate, constants, and function symbols. The equality predicate is a predicate like any
other but it is to be interpreted as the equality or identity relation on any UD. For instance, Peano’s
axioms, formulated to define the natural number system, use the constant symbol 1, the function
symbol S and the equality predicate =. Such an extension of PL is called the first order logic, which
we do not deal with here. However, the logical structure to tackle mathematical theories is provided
by PL.

In some of the exercises that follow you may use constants and the equality predicate freely if
required for translation into the formal language of PL. Revisit Example 7.7.3, where we have used a
constant symbol x0 .

Exercise 7.7.7.
1. Let f : R → R be a function and let a, ℓ ∈ R. Write a formal definition of lim_{x→a} f(x) ≠ ℓ.
2. In the following, fill in the blank with a PL-formula so that the definition will be complete:
(a) A subset S ⊆ R^n is called connected if ____.
(b) A set S is called a group if ____.
(c) A subset S ⊆ R^n is called a subspace if ____.
(d) A function f : R^n → R^k is called a linear transformation if ____.
(e) A function f : (S, ◦) → (T, +) is called a group isomorphism if ____.
(f) A function f : V → W is called a vector space isomorphism if ____.

3. Translate and check for validity of the following arguments.


(a) The decimal representation of a rational number either terminates or recurs, whereas that
of an irrational number neither terminates nor recurs. The square root of a natural number
either has a decimal representation which terminates or has a non-terminating decimal
representation and also a non-recurring decimal representation. The square roots of all
natural numbers which are squares have decimal representations that terminate. Therefore,
the square root of a natural number which is not a square is an irrational number.

(b) For any two algebraic numbers a and b, a ≠ 0, 1 and b irrational, we have that a^b is
transcendental. The number i (imaginary unit) is irrational and algebraic. The number i
is not equal to 0 or 1. Therefore, the number i^i is transcendental.
(c) Each student writes the exam using blue ink or black ink. A student who writes the exam
using black ink and does not write his/her roll number gets an F grade. A student who
writes the exam using blue ink and does not have his/her ID card gets an F grade. A
student who has his/her ID card has written the exam with black ink. Therefore a student
who passes the exam must have written his/her roll number.
Use predicates S(x): x is a student, B(x): x writes the exam using blue ink, Bl(x): x writes
the exam using black ink, R(x): x writes roll number, I(x): x has ID card, F(x): x gets F
grade.
(d) Check whether the following argument is correct:
Every mango is either an apple or an orange. Every pineapple is a mango. No apples
are pineapples. Every object is either an apple or a pineapple or a mango or an orange.
Therefore, if an apple is a pineapple, then it is an orange.
Use predicates M (x): x is a mango, A(x): x is an apple, P (x): x is a pineapple, O(x): x
is an orange.
Chapter 8

Partially Ordered Sets, Lattices and Boolean Algebra

8.1 Partial Orders


A relation can also be used to define an order on a set. For example, the words in a dictionary are
arranged according to a lexicographic ordering. So, ordering the objects according to a particular rule
brings a certain structure to the area of study. In the set of natural numbers, the relation “less than or
equal to” enables us to conclude whether a number precedes or succeeds another number. Similarly,
the relation ⊆ also brings an ordering to the set of sets. In this section, we study the concept of
“order”.

The reader is already aware of what reflexive, symmetric and transitive relations are. We now
introduce a fourth kind of relation, called an “anti-symmetric” relation.


Definition 8.1.1. The relation f defined on a nonempty set X is called an anti-symmetric relation
if and only if, ∀ x, y ∈ X, the property (x, y) ∈ f and (y, x) ∈ f implies that x = y.

It is possible to interpret an anti-symmetric relation using the arrow diagrams of relations. In this
context, a relation is called anti-symmetric if, whenever there is an arrow going from one element to
an element different from it, there does not exist an arrow going back from the second element to the
first.
Example 8.1.2. 1. Example 1.3.6.1 is an anti-symmetric relation.
2. Let R1 = {(x, y) ∈ Z+ × Z+ | x divides y} and R2 = {(x, y) ∈ (Z \ {0}) × Z | x divides y}.
(a) Show that R1 is an anti-symmetric relation on the set of positive integers.
(b) Show that R2 is not an anti-symmetric relation on the set of integers by giving a counter
example.

There are two relations which play a prominent role in mathematics. One of them is the equivalence
relation, which we have already seen is a relation which is reflexive, symmetric and transitive. We
now introduce the other relation called a partial order.

Definition 8.1.3. A relation f on a nonempty set X is called a partial order if f is reflexive,
transitive and anti-symmetric. Here (X, f) is a partially ordered set and is colloquially referred to as
a poset.

The relation less than or equal to on the set of real numbers and the relation subset on the set of
sets are two fundamental partial orders. These can be thought of as models for the general partial


order. It is common practice to use the symbol ⪯ to denote a partial order. Further, if (X, ⪯) is a
poset and x ⪯ y, then we read this as x is less than or equal to y.

Definition 8.1.4. Let (X, ⪯) be a poset. Elements x and y of X are said to be comparable if either
(x, y) ∈ ⪯ or (y, x) ∈ ⪯ holds. If neither (x, y) nor (y, x)
belongs to ⪯, then x and y are said to be incomparable.

Example 8.1.5.
1. Let X = {1, 2, 3, 4, 5}.
(a) The identity relation Id on X is reflexive, transitive and anti-symmetric and is therefore a
partial order. However, no two distinct elements of X are comparable.
(b) The relation Id ∪ {(1, 2)} is also a partial order on X. Here 1 and 2 are comparable.
(c) The relation ⪯ = Id ∪ {(1, 2), (2, 1)} is both reflexive and transitive, but not anti-symmetric.
Observe that (1, 2), (2, 1) ∈ ⪯ and 1 ≠ 2.
(d) The relation Id ∪ {(1, 2), (3, 4)} is a partial order on X. Here, 1 and 2 are comparable and
so are 3 and 4.

2. Let X = N. The relation ⪯ = {(a, b) : a divides b} is a partial order on X.
3. Let X be a nonempty collection of sets. Here, ⪯ = {(A, B) : A, B ∈ X, A ⊆ B} is a partial order
on X.
4. On R the set ⪯ = {(a, b) : a ≤ b} is a partial order. It is called the usual partial order on R.
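
For a finite set, the three defining properties can be checked mechanically. The following sketch represents a relation as a Python set of ordered pairs and replays parts (a) to (d) of item 1; the function name `is_partial_order` is an assumption made for illustration.

```python
def is_partial_order(X, rel):
    """Check reflexivity, anti-symmetry and transitivity of rel on the finite set X."""
    reflexive = all((x, x) in rel for x in X)
    antisymmetric = all(x == y for (x, y) in rel if (y, x) in rel)
    transitive = all((x, w) in rel
                     for (x, y) in rel for (z, w) in rel if y == z)
    return reflexive and antisymmetric and transitive

X = {1, 2, 3, 4, 5}
Id = {(x, x) for x in X}

print(is_partial_order(X, Id))                     # True  (item 1a)
print(is_partial_order(X, Id | {(1, 2)}))          # True  (item 1b)
print(is_partial_order(X, Id | {(1, 2), (2, 1)}))  # False (item 1c)
print(is_partial_order(X, Id | {(1, 2), (3, 4)}))  # True  (item 1d)
```

The same function applies to the ‘divides’ relation of item 2 once N is cut down to a finite initial part, for example {1, ..., 12}.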

Practice 8.1.6. Construct a partial order on the set {1, 2, 3, 4, 5}
1. of maximum cardinality and
2. of minimum cardinality.

In Example 8.1.5(4) any two elements are comparable, whereas in Example 8.1.5(1a), no two
distinct elements are comparable.

Definition 8.1.7. Let (X, ⪯) be a poset.
1. If any two elements in the poset (X, ⪯) are comparable, then ⪯ is called a linear order and
(X, ⪯) is called a linearly ordered set. Often a linear order is also referred to as a total order
or a complete order.
2. A subset C of X is called a chain if and only if ⪯ induces a linear order on C. If C is a finite
set, then the length of C is equal to the number of elements of C. If C is not a finite set, then
the length of C is said to be infinite.
3. A subset, A of X, is called an antichain if and only if no two elements of A are comparable.
The length of an antichain is defined in precisely the same manner as that of the chain.
4. The maximum of the lengths of the chains of X is called the height of X and the maximum of
the lengths of the antichains of X is called the width of X.

Let X be a nonempty set and let f be a relation on X. Then, recall from Definition 1.6.1 that f
is reflexive if (x, x) ∈ f for all x ∈ X; f is transitive if (x, y) ∈ f and (y, z) ∈ f imply (x, z) ∈ f for
all x, y, z ∈ X; and f is anti-symmetric if (x, y) ∈ f and x ≠ y implies (y, x) ∉ f, i.e., for all distinct
elements x, y of X both (x, y) and (y, x) cannot be in f . Relations which are simultaneously reflexive,
transitive and anti-symmetric play an important role in mathematics; and we give a name to such
relations.

Definition 8.1.8. Let X be a nonempty set. A relation f on X is called a partial order if f is
reflexive, transitive and anti-symmetric. Let f be a partial order on X and let a, b ∈ X. Then, a and
b are said to be comparable (with respect to the partial order f ) if either (a, b) ∈ f or (b, a) ∈ f .

When a partial order satisfies some other desirable properties, they are given different names. We
fix some of these in the following definition.

Definition 8.1.9. Let X be a nonempty set.
1. The pair (X, f) is called a partially ordered set (in short, poset) if f is a partial order on X.
2. A partial order f on X is called a linear order if either (x, y) ∈ f or (y, x) ∈ f for all x, y ∈ X,
i.e., when any two elements of X are comparable. A linear order is also called a total order,
or a complete order.
3. The poset (X, f ) is said to be a linearly ordered set if f is a linear order on X.
4. A linearly ordered subset of a poset is called a chain in the poset. The maximum size of a chain
in a poset is called the height of a poset.
5. Let (X, f ) be a poset and let A ⊆ X. A is called an anti-chain in the poset if no two elements
of A are comparable. The maximum size of an anti-chain in a poset is called the width of the
poset.

You may imagine the elements of a linearly ordered set as points on a line. The height of a poset
is the maximum of the cardinalities of all chains in the poset. The width of a poset is the maximum
of the cardinalities of all anti-chains in the poset.

Example 8.1.10.
1. The poset in Example 8.1.5.1a has height 1 (size of the chain {1}) and width 5 (size of the
anti-chain {1, 2, 3, 4, 5}).


2. The poset in Example 8.1.5.1b has height 2 (respective chain is {1, 2}) and width 4 (respective
anti-chains are {2, 3, 4, 5} and {1, 3, 4, 5}).
3. The poset in Example 8.1.5.1d has height 2 (respective chains are {1, 2} and {3, 4}) and width
3 (a respective anti-chain is {1, 3, 5}). Find other anti-chains.
4. The usual order (usual ≤) in N is a linear/complete/total order. The same holds for the usual
order in Z, Q and R.
5. If (X, f ) is a finite linearly ordered set then the singleton subsets of X are the only anti-chains.
In this case, the height of X is the number of elements in X and the width of X is 1.
6. The set N with the partial order f defined by “(a, b) ∈ f if a divides b” is not linearly ordered.
However, the set {1, 2, 4, 8, 16} is a chain. This is just a linearly ordered subset of the poset.
There are larger chains, for example, {2^k : k = 0, 1, 2, . . .}. The set of all primes is an anti-chain
here. The poset (N, f ) has infinite height and infinite width.
7. The poset (P({1, 2, 3, 4, 5}), ⊆) is not linearly ordered. However, {∅, {1, 2}, {1, 2, 3, 4, 5}} is a
chain in it. Also, {∅, {2}, {2, 3}, {2, 3, 4}, {2, 3, 4, 5}, {1, 2, 3, 4, 5}} is a chain. The height of this
poset is 6. What is its width?
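
For very small posets, height and width can be found by inspecting every subset. The sketch below does this by brute force for the three relations of items 1 to 3; it assumes the relation passed in is already a partial order, runs in exponential time, and uses the hypothetical name `height_and_width`.

```python
from itertools import combinations

def height_and_width(X, rel):
    """Return (height, width) of the finite poset (X, rel) by checking all subsets."""
    def comparable(a, b):
        return (a, b) in rel or (b, a) in rel

    height = width = 0
    elems = sorted(X)
    for k in range(1, len(elems) + 1):
        for sub in combinations(elems, k):
            pairs = list(combinations(sub, 2))
            if all(comparable(a, b) for a, b in pairs):
                height = max(height, k)   # sub is a chain
            if not any(comparable(a, b) for a, b in pairs):
                width = max(width, k)     # sub is an anti-chain
    return height, width

X = {1, 2, 3, 4, 5}
Id = {(x, x) for x in X}

print(height_and_width(X, Id))                     # (1, 5)
print(height_and_width(X, Id | {(1, 2)}))          # (2, 4)
print(height_and_width(X, Id | {(1, 2), (3, 4)}))  # (2, 3)
```

The subset enumeration is meant only for sets of a handful of elements; it is not a practical method for a poset such as P({1, 2, 3, 4, 5}) with 32 elements.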

Convention: It is common to use ≤ in infix notation for a partial order. That is, if f is a partial
order on a nonempty set X we write x ≤ y to mean that (x, y) ∈ f . Accordingly, the poset (X, f ) is
written as (X, ≤). Also, instead of writing ‘(X, f ) is a poset’ we will often write ‘X is a poset with

the partial order f ’. Following custom, by x ≥ y we mean y ≤ x; by x < y we mean that x ≤ y and
x ≠ y; by x > y we mean y < x. Also, we read x ≤ y as x is less than or equal to y; x < y as x is less
than y; x ≥ y as x is greater than or equal to y; and x > y as x is larger than y.

Practice 8.1.11. Let n ∈ N. Define P_n = {k ∈ N : k divides n}. Define a relation ≤_n on P_n
by ≤_n = {(a, b) : a divides b}. Show that (P_n, ≤_n) is a poset for each n ∈ N. Give a necessary and
sufficient condition on n so that (P_n, ≤_n) is a linearly ordered set.

In any finite set of symbols, we can fix a linear order by arbitrarily declaring which symbol comes
next to which other symbol. Further, the same linear order can be extended to the set of all words
formed using those symbols, such as that followed in a dictionary.

Definition 8.1.12. Let (Σ, ≤) be a finite linearly ordered set (like the English alphabet with a <
b < c < · · · < z) and let Σ∗ be the collection of all words formed using the elements of Σ. For
a = a1 a2 · · · an , b = b1 b2 · · · bm ∈ Σ∗ for m, n ∈ N, define a ≤ b if

(a) a1 < b1 , or
(b) ai = bi for i = 1, . . . , k for some k < min{m, n} and ak+1 < bk+1 , or
(c) ai = bi for i = 1, 2, . . . , n = min{m, n}.

Then (Σ∗ , ≤) is a linearly ordered set. This ordering is called the lexicographic or dictionary
ordering. Sometimes Σ is called the alphabet and the linearly ordered set Σ∗ is called the dictionary.
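
Conditions (a), (b) and (c) translate directly into a comparison routine. The sketch below assumes words are Python strings over the lowercase English alphabet with its usual order; `lex_leq` is a hypothetical name. For such words Python's built-in string comparison follows the same rule, which gives an easy cross-check.

```python
def lex_leq(a, b):
    """Return True when the word a precedes or equals the word b in dictionary order."""
    for x, y in zip(a, b):
        if x < y:
            return True    # cases (a) and (b): first differing letter is smaller in a
        if x > y:
            return False   # first differing letter is larger in a
    return len(a) <= len(b)  # case (c): a is a prefix of b (possibly a == b)

words = ["b", "ab", "abc", "a", "ba", "az", "zebra", "zeal"]
print(all(lex_leq(u, v) == (u <= v) for u in words for v in words))
print(sorted(words))
```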

Practice 8.1.13. Let D1 be the dictionary of words made from a, b, c and D2 be the dictionary of
words made from a, b, d. Are D1 and D2 equinumerous?


Discussion 8.1.14. [Directed Graph Representation of a Finite Poset] Often we represent
a finite poset (X, ≤) by a picture. The process is described below.


(a) Put a dot (called a node) for each element of X and label it with that element.
(b) If a ≤ b, draw a directed line (an arrow) from the node labeled a to the node labeled b.
(c) Put a loop at the node labeled a for each a ∈ X.

1. A directed graph representation of the poset (A, ≤) with A = {1, 2, 3, 9, 18} and ≤ as the ‘divides’
relation (a ≤ b if a|b) is given below.
[Figure: directed graph representation of the poset (A, ≤) on the nodes 1, 2, 3, 9, 18.]
An abbreviated diagrammatic representation of a finite poset is defined below.

Definition 8.1.15. The Hasse diagram of a finite poset (X, ≤) is a picture drawn in the following
way:
1. Each element of X is represented by a point and is labeled with the element.
2. If a ≤ b and a ≠ b, then the point labeled a must appear at a lower height than the point labeled b
and further the two points are joined by a line.
3. If a ≤ b and b ≤ c with a, b, c distinct, then the line between a and c is removed.

We will see later that for each finite poset a Hasse diagram exists; see Discussion 8.1.23.

Example 8.1.16. Hasse diagram for the poset (A, ≤) with A = {1, 2, 3, 9, 18} and ≤ as the ‘divides’
relation is given below.

[Figure: Hasse diagram of the poset (A, ≤).]
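
The lines that survive in a Hasse diagram are the pairs a < b with no element strictly between them. The following sketch computes these pairs for the poset of Example 8.1.16; the helper name `hasse_edges` and the representation of ≤ as a Python function are assumptions made for illustration.

```python
def hasse_edges(X, leq):
    """Return the pairs (a, b) with a < b and no c in X satisfying a < c < b."""
    def lt(a, b):
        return a != b and leq(a, b)

    return sorted(
        (a, b)
        for a in X for b in X
        if lt(a, b) and not any(lt(a, c) and lt(c, b) for c in X)
    )

A = [1, 2, 3, 9, 18]
print(hasse_edges(A, lambda a, b: b % a == 0))
# [(1, 2), (1, 3), (2, 18), (3, 9), (9, 18)]
```

Each returned pair (a, b) corresponds to one line of the diagram, drawn with a below b.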

Practice 8.1.17. Draw the Hasse diagram for


1. {1, 2, 3} × {1, 2, 3, 4} under lexicographic order.
2. {1, 2, 3, 6, 9, 18} (all positive divisors of 18) with the relation as ‘divides’.
3. {2, 3, 4, 5, 6, 7, 8} with the ‘divides’ relation.

Proposition 8.1.18. Let X and Y be nonempty sets. Let F be a nonempty family of partial functions
from X to Y. Suppose (F, ⊆) is a linearly ordered set. Let h = ∪_{f∈F} f. Then the following are true:
1. h is a partial function from X to Y.
2. dom h = ∪_{f∈F} dom f.
3. rng h = ∪_{f∈F} rng f.
4. If every element of F is one-one (from its domain to its range) then h is also one-one.

Proof. We shall only prove the first two.

1. Suppose that h is not a partial function. Then there exist x ∈ dom h and y ≠ z such that
(x, y), (x, z) ∈ h. So there are f, g ∈ F such that (x, y) ∈ f and (x, z) ∈ g. As F is linearly ordered,
either f ⊆ g or g ⊆ f. If f ⊆ g, then (x, y) ∈ g and (x, z) ∈ g. Then, g is not a partial function, a
contradiction. Similarly, g ⊆ f leads to a contradiction.
2. Note that x ∈ dom h means (x, y) ∈ h for some y. This means (x, y) ∈ f for some f ∈ F, i.e.,
x ∈ dom f for a partial function f ∈ F. Hence, x ∈ ∪_{f∈F} dom f. Conversely, if x ∈ dom f for some
f ∈ F, then (x, y) ∈ f ⊆ h for some y, so that x ∈ dom h.

Practice 8.1.19. Prove the other parts of Proposition 8.1.18.

We fix some more terminology for posets related to extreme elements.

Definition 8.1.20. Let (X, ≤) be a poset and let A ⊆ X.


1. We say that an element x ∈ X is an upper bound of A if for each z ∈ A, z ≤ x; or equivalently,
when each element of A is less than or equal to x. An element y ∈ X is called a lower bound
of A if for each z ∈ A, y ≤ z; or equivalently, when y is less than or equal to each element of A.
2. An element x ∈ A is called the maximum of A, if x is an upper bound of A. Thus, maximum
of A is an upper bound of A which is contained in A. Such an element is unique provided it
exists. In this case, we denote x = max{z : z ∈ A}. Similarly, minimum of A is an element
y ∈ A which is a lower bound of A. If minimum of A exists, then it is unique; and we write
y = min{z : z ∈ A}.

3. An element x ∈ X is called the least upper bound (lub) of A in X if x is an upper bound of
A and for each upper bound y of A, we have x ≤ y; i.e., when x is the minimum (least) element
of the set of all upper bounds of A. Similarly, the greatest lower bound (glb) of A is a lower
bound of A which is greater than or equal to all lower bounds of A; it is the maximum (largest)
of the set of all lower bounds of A.
4. An element x ∈ A is a maximal element of A if x ≤ z for some z ∈ A implies x = z; or
equivalently, when no element in A is larger than x. An element y ∈ A is called a minimal
element of A if z ≤ y for some z ∈ A implies y = z; or equivalently, when no element in A is less
than y.

Example 8.1.21. Consider the two posets X = {a, b, c} and Y = {a, b, c, d} described by the following
Hasse diagrams:
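[Figure 8.1: Hasse diagrams of the posets X and Y. In both, a is at the bottom with b and c above it;
in Y, d is additionally placed above b and c.]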

1. Let A = X. Then,
(a) the maximal elements of A are b and c,
(b) the only minimal element of A is a,
(c) a is the only lower bound of A in X,
(d) A has no upper bound in X,
(e) A has no maximum element,
(f) a is the minimum element of A,
(g) no element of X is the lub of A, and
(h) a is the glb of A in X.

2. The following table illustrates the definitions by taking different subsets A of X, and also
considering the same A as a subset of Y. Bounds, lub and glb are taken in X for the first two
columns and in Y for the third.

                            A = {b, c} ⊆ X    A = {a, c} ⊆ X    A = {b, c} ⊆ Y
Maximal element(s) of A     b, c              c                 b, c
Minimal element(s) of A     b, c              a                 b, c
Lower bound(s) of A         a                 a                 a
Upper bound(s) of A         does not exist    c                 d
Maximum element of A        does not exist    c                 does not exist
Minimum element of A        does not exist    a                 does not exist
lub of A                    does not exist    c                 d
glb of A                    a                 a                 a
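
The entries for A = {b, c} can be recomputed from the definitions. The sketch below encodes the order of Y as a set of pairs, assuming the structure a < b, a < c, b < d, c < d that the entries of the table indicate; the order of X is its restriction to {a, b, c}. All function names are illustrative.

```python
# Order relation of Y, assumed to be a < b, a < c, b < d, c < d.
Y = ["a", "b", "c", "d"]
covers = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")}
leq = {(x, x) for x in Y} | covers | {("a", "d")}  # reflexive and transitive closure

def upper_bounds(A, X):
    return [x for x in X if all((z, x) in leq for z in A)]

def lower_bounds(A, X):
    return [x for x in X if all((x, z) in leq for z in A)]

def least(S):
    """Return the minimum of S, or None if S has no minimum."""
    return next((s for s in S if all((s, t) in leq for t in S)), None)

def greatest(S):
    """Return the maximum of S, or None if S has no maximum."""
    return next((s for s in S if all((t, s) in leq for t in S)), None)

def maximal(A):
    return [x for x in A if not any(x != z and (x, z) in leq for z in A)]

A = ["b", "c"]
X = ["a", "b", "c"]
print(upper_bounds(A, X), least(upper_bounds(A, X)))     # [] None
print(upper_bounds(A, Y), least(upper_bounds(A, Y)))     # ['d'] d
print(lower_bounds(A, Y), greatest(lower_bounds(A, Y)))  # ['a'] a
print(maximal(A), greatest(A))                           # ['b', 'c'] None
```

Here the lub is computed as the least element of the set of upper bounds and the glb as the greatest element of the set of lower bounds, exactly as in Definition 8.1.20.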

Practice 8.1.22.
1. Apply induction to show that a finite poset has a maximal element and a minimal element.
2. Let (X, ≤) be a poset and let Y be a nonempty subset of X. For a, b ∈ Y , define a ≤Y b if a ≤ b.
Show that ≤Y is a partial order on Y . This is called the induced partial order on Y .

Discussion 8.1.23. [Hasse diagram exists] Let (X, ≤) be a finite poset. Let x1 , . . . , xk be the minimal
elements of X. (See Practice 8.1.22.1.) Draw k points on the same horizontal line and label them
x1 , . . . , xk . Now consider Y = X \ {x1 , . . . , xk } with the induced partial order ≤Y . By induction, the
picture of (Y, ≤Y ) can be drawn. Put it above those k dots. Let y1 , . . . , ym be the minimal elements
of Y . Now, draw the lines (xi , yj ) if xi ≤ yj in X. This is the Hasse diagram for the poset (X, ≤).

Remark 8.1.24. [Bounds of the Empty Set] Let (X, ≤) be a poset. Then each x ∈ X is an upper
bound for ∅ as well as a lower bound for ∅. So, an lub for ∅ may or may not exist. For example, if
X = {1, 2, 3} and ≤ is the usual order, then lub ∅ = 1. Whereas, if X = Z and ≤ is the usual order,
then an lub for ∅ does not exist. Similar statements hold for glb.

Another important class of partial orders is introduced next.

Definition 8.1.25. A linear order ≤ on a nonempty set X is said to be a well order if each nonempty
subset of X has a minimum. We call (X, ≤) a well ordered set to mean that ≤ is a well order on X.

Often we use the phrase ‘X is a well ordered set with the ordering as ≤’ to mean ‘(X, ≤) is a well
ordered set’.
Recall that in a linearly ordered set X, if a minimum of a subset exists, it is an element of the
subset and it is unique. Thus to say that every nonempty subset of X has a minimum is the same as
saying that every nonempty subset has its minimum, which is unique and is an element of that subset.
Further, in a linearly ordered set, a minimal element of a subset is the minimum of that subset. Thus
a linearly ordered set is well ordered if and only if every nonempty subset has a minimal element.
Example 8.1.26. 1. The set Z with the usual order is not well ordered, as {−1, −2, . . .} has no
minimum.
2. The ordering 0 ≤ 1 ≤ −1 ≤ 2 ≤ −2 ≤ 3 ≤ −3 ≤ · · · describes a well order on Z.
3. The set N with the usual order is well ordered.
4. The set R with the usual order (usual ≤) is not well ordered as the set (0, 1) has no minimum.
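
The well order on Z in item 2 can be made concrete by assigning to each integer its position in the listing 0, 1, −1, 2, −2, . . . . The sketch below uses a hypothetical function `rank` for this position; comparing ranks in the usual order of the nonnegative integers then reproduces the ordering of item 2.

```python
def rank(n):
    """Position of the integer n in the listing 0, 1, -1, 2, -2, 3, -3, ..."""
    if n > 0:
        return 2 * n - 1
    return -2 * n

print(sorted(range(-3, 4), key=rank))   # [0, 1, -1, 2, -2, 3, -3]
print(min({-1, -2, -3, -4}, key=rank))  # -1
```

Under this ordering the set of negative integers has the minimum −1, in contrast with item 1, where the same set has no minimum in the usual order.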

Theorem 8.1.27. [Principle of Transfinite Induction] Let (W, ≤) be a well ordered set. Let A ⊆ W
be such that for each x ∈ W, the condition {y ∈ W : y < x} ⊆ A implies x ∈ A. Then
A = W.

Proof. Suppose A ≠ W. Then A^c = W \ A ≠ ∅. As W is well ordered, let s be the minimum of A^c
in W. Notice that s ∈ A^c, and any element of W that is less than s is in A.
Indeed, consider the set W_s := {y ∈ W : y < s}. If z ∈ W_s, then z < s. So, by the minimality of s
in A^c, z ∈ A. Hence, W_s ⊆ A. Then, by the given condition, s ∈ A. This is a contradiction.

For any element a in a well ordered set (W, ≤), the subset W_a := {x ∈ W : x < a} is called the
initial segment of a. In the well ordered set N, the initial segment of any natural number n is
the set {1, . . . , n − 1}. The principle of transfinite induction says that if the condition that an initial
segment of an element, say x0, is contained in a subset, say A, implies that the element is in the
subset (x0 ∈ A), then the subset (A) cannot leave out any element of the well ordered set. We leave
it as an exercise to supply a formal proof that this principle is the same as the principle of mathematical
induction in N.
Exercise 8.1.28. 1. Determine the maximal elements, minimal elements, lower bounds, upper
bounds, maximum, minimum, lub and glb of A in the following posets (X, f ).
(a) Take X = Z with the usual order (usual ≤) and A = Z.
(b) Take X = N, f = {(i, i) : i ∈ N} and A = {4, 5, 6, 7}.

2. Does there exist a poset with exactly 5 maximal chains of sizes 2, 3, 4, 5, 6, and 2 maximal ele-
ments? If yes, draw the Hasse diagram. If no, give reasons.
3. Consider the poset X = {1, 2, 3, 6, 9, 18} with ‘divides’ relation.

(a) Draw the Hasse diagram of the poset.


(b) What is its height? What is its width?
(c) Let A = {2, 3, 6} ⊆ X. What are the maximal elements, minimal elements, maximum,
minimum, lower bounds, upper bounds, glb and lub of A?

4. Construct the Hasse diagram for the ‘⊆’ relation on P({a, b, c}).
 
5. Consider the poset X = {(1, 1), (1, 2), (1, 3), . . .} ∪ {(2, 1), (3, 1), (4, 1), . . .} with the partial order

f = ⋃_{m, n ∈ N, m ≤ n} {((1, m), (1, n))} ∪ ⋃_{m, n ∈ N, m ≤ n} {((m, 1), (n, 1))}.

(a) Does X have any minimal element(s)?


(b) Does every subset of X have a lower bound?
(c) Is X linearly ordered?
(d) Is it true that every nonempty subset of X has a minimal element?
(e) Is it true that every nonempty subset of X has a minimum?
(f ) What type of nonempty subsets of X always have a minimum?

6. [Tarski] A set X is finite if and only if every nonempty family of subsets of X has a minimal
element in the poset (P(X), ⊆).
7. Prove or disprove each of the following assertions:
(a) There are at least 5 functions f : R → R which are partial orders.
(b) Take N with the usual order. Then the dictionary order on N^2 is a well order.
(c) Take N with the usual order and N^2 with the dictionary order. Then any nonempty subset
of N^2 which is bounded above has an lub.
(d) There exists a partial order on N for which each nonempty subset has at least one but finitely
many upper bounds, and also at least one but finitely many lower bounds.
(e) There exists a partial order on N for which there are infinitely many maximal elements but
no minimal element.
(f ) Every countable linearly ordered set is well ordered with respect to the same ordering.

(g) Every countable chain which is bounded below, in a poset, is well ordered with respect to the
same ordering.
(h) The set Q can be well ordered.
(i) Let S be the set of words with length at most 8 using letters from {3, A, a, b, C, c}. We want
to define a lexicographic order on S to make it a dictionary. Are there more than 500 ways
to do that?
(j) An infinite poset in which each nonempty finite set has a minimum, must be linearly ordered.
(k) A finite poset in which each nonempty finite set has a minimum, must be well ordered.
(l) An infinite poset in which each nonempty finite set has a minimum, must be well ordered.
(m) Every total order corresponds to an equivalence relation.

8. Show that the principle of transfinite induction is the same as the principle of mathematical induction
in the well ordered set (N, ≤).

8.2 Lattices
In a poset, it is not necessary that two elements x, y should have a common upper bound. For instance,
consider the poset {1, 2, . . . , 6} with “a ≤ b if and only if a divides b”. The elements 5 and 3 have no
common upper bound.
Similarly, in a poset, if a pair {x, y} has at least one upper bound, it is not necessary that the set
{x, y} has an lub. For example, look at the poset described by the third Hasse diagram in Figure 8.2.
The set {a, b} has c and d as upper bounds, but there is no lub of {a, b}.
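Both phenomena can be checked mechanically on small posets. The sketch below is illustrative only: the helper names are hypothetical, and the four-element poset is assumed to consist of a, b lying below c, d with no other relations, as the text describes.

```python
def upper_bounds(elements, subset, leq):
    """All u in `elements` with x <= u for every x in `subset`."""
    return [u for u in elements if all(leq(x, u) for x in subset)]


def lub(elements, subset, leq):
    """Least upper bound of `subset`, or None if it does not exist."""
    ubs = upper_bounds(elements, subset, leq)
    least = [u for u in ubs if all(leq(u, v) for v in ubs)]
    return least[0] if least else None


def divides(a, b):
    return b % a == 0


# The poset {1, ..., 6} under divisibility.
X = list(range(1, 7))
print(upper_bounds(X, [3, 5], divides))  # [] : no common upper bound
print(lub(X, [2, 3], divides))           # 6

# Assumed poset: a, b lie below c, d and nothing else is related.
Y = ["a", "b", "c", "d"]
BELOW = {("a", "c"), ("a", "d"), ("b", "c"), ("b", "d")}


def leq_y(x, y):
    return x == y or (x, y) in BELOW


print(upper_bounds(Y, ["a", "b"], leq_y))  # ['c', 'd']
print(lub(Y, ["a", "b"], leq_y))           # None : c and d are incomparable
```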
[Figure 8.2: Hasse diagrams, captioned "A distributive lattice", "A non-distributive lattice" and "Both are non-lattices".]

Definition 8.2.1.
1. A poset (L, ≤) is called a lattice if each pair x, y ∈ L has an lub and also a glb. The lub of x, y
is also written as x ∨ y (read as ‘x or y’ / ‘join of x and y’) and the glb of x, y as x ∧ y (read as ‘x
and y’ / ‘meet of x and y’).
2. A lattice is called a distributive lattice if for all x, y, z ∈ L the following conditions,
called distributive laws, are satisfied:

x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z), x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z).

Example 8.2.2.
1. Consider the poset L = {0, 1}, where 0 < 1. So, L is a linearly ordered set. In this case,
a ∨ b = max{a, b} and a ∧ b = min{a, b}. Hence, L is a distributive lattice.

2. The set N with the usual order is a distributive lattice, where ∨ = max and ∧ = min. We
consider two cases to verify that a ∨ (b ∧ c) = (a ∨ b) ∧ (a ∨ c). The second distributive law is
left as an exercise to the reader.

(a) Case 1: a ≥ min{b, c}. Then, either a ≥ b or a ≥ c, say a ≥ b. So, max{a, b} = a and
max{a, c} ≥ a. Thus,

a ∨ (b ∧ c) = max{a, min{b, c}} = a = min{max{a, b}, max{a, c}} = (a ∨ b) ∧ (a ∨ c).

(b) Case 2: a < min{b, c}. Then, a < b and a < c. So, max{a, b} = b and max{a, c} = c.
Thus,

a ∨ (b ∧ c) = max{a, min{b, c}} = min{b, c} = min{max{a, b}, max{a, c}} = (a ∨ b) ∧ (a ∨ c).

3. The poset described by the first diagram in Figure 8.2 is a distributive lattice. (Verify.)
4. The poset described by the second diagram in Figure 8.2 is a lattice but not a distributive lattice.
(Verify by computing a ∨ (b ∧ c) and (a ∨ b) ∧ (a ∨ c) separately.)
5. Let S = {a, b, c}. Consider the poset P(S) with the partial order as ⊆. Then A ∨ B = A ∪ B
and A ∧ B = A ∩ B. Verify that P(S) is a distributive lattice.
6. Fix a positive integer n and let D(n) denote the set of all divisors of n. For elements x, y ∈ D(n),
define x ≤ y if x divides y. Then (D(n), ≤) is a distributive lattice, where ∨ = lcm and ∧ = gcd.
For n = 12, 30 and 36, the corresponding lattices are shown below.

[Hasse diagrams of D(12), D(30) and D(36). Levels from top to bottom: D(12): 12; 4, 6; 2, 3; 1. D(30): 30; 6, 10, 15; 2, 3, 5; 1. D(36): 36; 12, 18; 4, 6, 9; 2, 3; 1.]

To check the first distributive law, let a, b, c ∈ D(n), p a prime, and let k ∈ N. Further, let
p^k | lcm{a, gcd{b, c}}. Then, either p^k | a or (p^k | b and p^k | c). In either case, p^k | lcm{a, b} and p^k | lcm{a, c}.
So, p^k | gcd{lcm{a, b}, lcm{a, c}}.
Now, let us assume that p^k | gcd{lcm{a, b}, lcm{a, c}}. Then, p^k | lcm{a, b} and p^k | lcm{a, c}.
Then, either p^k | a or (p^k | b and p^k | c). So, p^k | lcm{a, gcd{b, c}}.
Thus, any power of a prime divides a ∨ (b ∧ c) if and only if it divides (a ∨ b) ∧ (a ∨ c). Therefore,
a ∨ (b ∧ c) = (a ∨ b) ∧ (a ∨ c). Similarly, the second distributive law can be verified.
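As a sanity check alongside the proof, both distributive laws in D(n) can be verified by brute force for small n. This is only a sketch; the function names are illustrative.

```python
from itertools import product
from math import gcd


def divisors(n):
    """The set D(n) of positive divisors of n, in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def lcm(a, b):
    return a * b // gcd(a, b)


def is_distributive(n):
    """Check both distributive laws in (D(n), lcm, gcd) by enumeration."""
    D = divisors(n)
    return all(
        lcm(a, gcd(b, c)) == gcd(lcm(a, b), lcm(a, c))
        and gcd(a, lcm(b, c)) == lcm(gcd(a, b), gcd(a, c))
        for a, b, c in product(D, repeat=3)
    )


print([is_distributive(n) for n in (12, 30, 36)])  # [True, True, True]
```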

Practice 8.2.3.
1. Fix a prime p and a positive integer n. Draw the Hasse diagram of D(p^n). Does this correspond
to a linearly ordered set? Give reasons for your answer.
2. Let n be a positive integer. Prove that D(n) is a linearly ordered set if and only if n = p^m for
some prime p and a positive integer m.
3. Is every linearly ordered set a distributive lattice?

We now prove the basic results on ∧ and ∨.

Proposition 8.2.4. [Laws] In a lattice (L, ≤), the following are true:
1. [Idempotence] : a ∨ a = a, a ∧ a = a
2. [Commutativity] : a ∨ b = b ∨ a, a ∧ b = b ∧ a
3. [Associativity] : a ∨ (b ∨ c) = (a ∨ b) ∨ c, a ∧ (b ∧ c) = (a ∧ b) ∧ c
4. a ≤ b ⇔ a ∨ b = b. Similarly, a ≤ b ⇔ a ∧ b = a
5. [Absorption] : a ∨ (a ∧ b) = a = a ∧ (a ∨ b)
6. [Isotonicity] : b ≤ c ⇒ a ∨ b ≤ a ∨ c, b ≤ c ⇒ a ∧ b ≤ a ∧ c
7. a ≤ b, c ≤ d ⇒ a ∨ c ≤ b ∨ d, a ≤ b, c ≤ d ⇒ a ∧ c ≤ b ∧ d
8. [Distributive Inequality] : a ∨ (b ∧ c) ≤ (a ∨ b) ∧ (a ∨ c), a ∧ (b ∨ c) ≥ (a ∧ b) ∨ (a ∧ c)
9. [Modularity] : a ≤ c ⇔ a ∨ (b ∧ c) ≤ (a ∨ b) ∧ c

Proof. We prove only the first parts of all assertions; the second parts can be proved similarly.
(1) a ∨ a is an upper bound of {a, a}. Hence a ∨ a ≥ a. On the other hand, a is an upper bound
of {a, a}. So, a ∨ a, being the least of all upper bounds of {a, a}, is less than or equal to a. Hence
a ∨ a = a.
(2) a ≤ b ∨ a, b ≤ b ∨ a. So, b ∨ a is an upper bound of {a, b}. Since a ∨ b is the least of all upper bounds
of {a, b}, we have a ∨ b ≤ b ∨ a. Exchanging a and b, we get b ∨ a ≤ a ∨ b. Hence a ∨ b = b ∨ a.


(3) Let d = a ∨ (b ∨ c). Then, d ≥ a, d ≥ b ∨ c so that d ≥ a, d ≥ b and d ≥ c. So, d ≥ a ∨ b and d ≥ c.
That is, d ≥ (a ∨ b) ∨ c. Similarly, e = (a ∨ b) ∨ c implies e ≥ a ∨ (b ∨ c). Thus, the first part of the
result follows.
(4) Let a ≤ b. As b is an upper bound of {a, b}, and a ∨ b is the least of all upper bounds of {a, b},
we have a ∨ b ≤ b. Also, a ∨ b is an upper bound of {a, b} and hence a ∨ b ≥ b. So, we get a ∨ b = b.
Conversely, let a ∨ b = b. As a ∨ b is an upper bound of {a, b}, we have a ≤ a ∨ b = b. Therefore,
a ≤ b ⇔ a ∨ b = b.
(5) By definition a ∧ b ≤ a. Also, a ≤ a. So, a is an upper bound of {a, a ∧ b} and hence
a ∨ (a ∧ b) ≤ a. Also, by definition a ∨ (a ∧ b) ≥ a. Hence, a ∨ (a ∧ b) = a.
(6) Let b ≤ c. Note that a ∨ c ≥ a and a ∨ c ≥ c ≥ b. So, a ∨ c is an upper bound of {a, b}. Thus,
a ∨ c ≥ lub{a, b} = a ∨ b.
(7) Using (6), we have a ∨ c ≤ b ∨ c ≤ b ∨ d. Again, using (6), we get a ∧ c ≤ b ∧ c ≤ b ∧ d.
(8) Note that a ≤ a ∨ b and a ≤ a ∨ c. Thus, a = a ∧ a ≤ (a ∨ b) ∧ (a ∨ c). As b ≤ a ∨ b and c ≤ a ∨ c,
by (7), we get b ∧ c ≤ (a ∨ b) ∧ (a ∨ c). So, by definition a ∨ (b ∧ c) ≤ (a ∨ b) ∧ (a ∨ c).
(9) Let a ≤ c. Then, a ∨ c = c and hence by (8), we have a ∨ (b ∧ c) ≤ (a ∨ b) ∧ (a ∨ c) = (a ∨ b) ∧ c.
Conversely, let a ∨ (b ∧ c) ≤ (a ∨ b) ∧ c. Then a ≤ a ∨ (b ∧ c) ≤ (a ∨ b) ∧ c ≤ c. Thus the required result
follows.
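The distributive inequality in (8) can be strict. The sketch below checks this on the five-element "pentagon" lattice 0 < a < c < 1, 0 < b < 1. This lattice is a standard example chosen here for illustration; it is an assumption of the sketch and is not taken from the notes.

```python
from itertools import product

ELEMENTS = ["0", "a", "b", "c", "1"]
# Strict order of the pentagon: 0 < a < c < 1 and 0 < b < 1.
LESS = {("0", "a"), ("0", "b"), ("0", "c"), ("0", "1"),
        ("a", "c"), ("a", "1"), ("b", "1"), ("c", "1")}


def leq(x, y):
    return x == y or (x, y) in LESS


def join(x, y):
    """Least upper bound of {x, y}, found by search."""
    ubs = [u for u in ELEMENTS if leq(x, u) and leq(y, u)]
    return next(u for u in ubs if all(leq(u, v) for v in ubs))


def meet(x, y):
    """Greatest lower bound of {x, y}, found by search."""
    lbs = [l for l in ELEMENTS if leq(l, x) and leq(l, y)]
    return next(l for l in lbs if all(leq(m, l) for m in lbs))


def distributive_inequality_holds():
    return all(
        leq(join(a, meet(b, c)), meet(join(a, b), join(a, c)))
        for a, b, c in product(ELEMENTS, repeat=3)
    )


print(distributive_inequality_holds())           # True
print(join("a", meet("b", "c")))                 # a
print(meet(join("a", "b"), join("a", "c")))      # c, strictly above a
```

Since a ≤ c here, the same computation also illustrates (9): a ∨ (b ∧ c) = a ≤ c = (a ∨ b) ∧ c, with strict inequality.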

Practice 8.2.5. Show that in a lattice one distributive law implies the other.

From two given lattices, a new lattice can be created by taking the product.
Definition 8.2.6. Let (L1 , ≤1 ) and (L2 , ≤2 ) be lattices. Then, (L1 × L2 , ≤) is a poset with a =
(a1 , a2 ) ≤ (b1 , b2 ) = b if a1 ≤1 b1 and a2 ≤2 b2 , i.e., if b dominates a entry-wise. This is called the
lattice order on L1 ×L2 . In this case, we see that a∨b = (a1 ∨1 b1 , a2 ∨2 b2 ) and a∧b = (a1 ∧1 b1 , a2 ∧2 b2 ).
Thus (L1 × L2 , ≤) is a lattice, called the direct product of lattices (L1 , ≤1 ) and (L2 , ≤2 ).
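The claim that the componentwise operations are the lub and glb in the lattice order can be checked by enumeration. The sketch below uses the chains {1, 2, 3} and {1, 2, 3, 4} with the usual order as an illustrative choice; the function names are hypothetical.

```python
from itertools import product

L1, L2 = [1, 2, 3], [1, 2, 3, 4]
P = list(product(L1, L2))


def leq(a, b):
    """Lattice order: b dominates a entry-wise."""
    return a[0] <= b[0] and a[1] <= b[1]


def join(a, b):
    return (max(a[0], b[0]), max(a[1], b[1]))


def meet(a, b):
    return (min(a[0], b[0]), min(a[1], b[1]))


def is_lub(c, a, b):
    ubs = [u for u in P if leq(a, u) and leq(b, u)]
    return c in ubs and all(leq(c, u) for u in ubs)


def is_glb(c, a, b):
    lbs = [l for l in P if leq(l, a) and leq(l, b)]
    return c in lbs and all(leq(l, c) for l in lbs)


def componentwise_ops_are_lub_glb():
    return all(
        is_lub(join(a, b), a, b) and is_glb(meet(a, b), a, b)
        for a, b in product(P, repeat=2)
    )


print(componentwise_ops_are_lub_glb())  # True
# (1, 4) and (2, 1) are incomparable in the lattice order ...
print(leq((1, 4), (2, 1)), leq((2, 1), (1, 4)))  # False False
# ... but comparable in the lexicographic order (Python's tuple order).
print((1, 4) < (2, 1))  # True
```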

Example 8.2.7.
1. Consider L = {0, 1} with the usual order (0 ≤ 1). The set L^n of all binary strings of length n is
a poset with the order (a1 , . . . , an ) ≤ (b1 , . . . , bn ) if ai ≤ bi for each i. This is the n-fold direct
product of L with itself. It is called the lattice of n-tuples of 0 and 1.
2. Consider the lattices {1, 2, 3} and {1, 2, 3, 4} with the usual orders. Hasse diagram of the direct
product {1, 2, 3} × {1, 2, 3, 4} is given below.
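[Hasse diagram of {1, 2, 3} × {1, 2, 3, 4}; the labelled corner elements are (1, 1) at the bottom, (3, 4) at the top, and (1, 4) and (3, 1) at the sides.]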

Practice 8.2.8. Consider N with the usual order. The lattice order defined on N^2 is different from
the lexicographic order on N^2 . Draw pictures for all (a, b) ≤ (5, 6) in both the orders to verify this.

Proposition 8.2.9. The direct product of two distributive lattices is a distributive lattice.

Proof. Let (a1 , b1 ), (a2 , b2 ), (a3 , b3 ) be elements in the direct product of two distributive lattices.
Then

[(a1 , b1 ) ∨ (a2 , b2 )] ∧ (a3 , b3 ) = (a1 ∨ a2 , b1 ∨ b2 ) ∧ (a3 , b3 )
= ((a1 ∨ a2 ) ∧ a3 , (b1 ∨ b2 ) ∧ b3 )
= ((a1 ∧ a3 ) ∨ (a2 ∧ a3 ), (b1 ∧ b3 ) ∨ (b2 ∧ b3 ))
= (a1 ∧ a3 , b1 ∧ b3 ) ∨ (a2 ∧ a3 , b2 ∧ b3 )
= [(a1 , b1 ) ∧ (a3 , b3 )] ∨ [(a2 , b2 ) ∧ (a3 , b3 )].

This verifies one of the distributive laws. Similarly, the other one can be verified, or use Prac-
tice 8.2.5.

As in all algebraic structures, there is a notion of lattice homomorphism and also lattice isomor-
phism. Informally, a homomorphism is a function from one lattice to the other which preserves the
two operations of ∨ and ∧; and an isomorphism is a bijective homomorphism.

Definition 8.2.10. Let (L1 , ≤1 ) and (L2 , ≤2 ) be lattices. A function f : L1 → L2 satisfying f (a∨1 b) =
f (a)∨2 f (b) and f (a∧1 b) = f (a)∧2 f (b) is called a lattice homomorphism. Further, if f is a bijection,
then it is called a lattice isomorphism.

Example 8.2.11.
1. Let D be the set of all words formed using the letters a, b, . . . , z and let S ⊆ D consist of all
words of length at most six. With the dictionary order, where a ≤ b ≤ · · · ≤ z, both D and
S are lattices. Define f : D → S as f (d) = d if d has length at most six, otherwise f (d) is
obtained from d by keeping its first six letters intact and chopping off the rest. Then, f is a
homomorphism. It is not an isomorphism as f (isomor) = f (isomorphism).
2. Consider the lattice N with the usual order. Let S = {0, 1, 2} with the usual order. Let f : N → S
be a homomorphism. If f (m) = 0 and f (n) = 1, then m ≤ n, or else, we have

f (m ∨ n) = f (m) = 0, f (m) ∨ f (n) = 0 ∨ 1 = 1.

Thus f (m ∨ n) ≠ f (m) ∨ f (n). So, the map f must have one of the following forms. Draw
pictures to understand this.
(a) f^{-1}(0) = N.
(b) f^{-1}(0) = {1, 2, . . . , k} and f^{-1}(1) = N \ {1, 2, . . . , k} for some k ∈ N.
(c) f^{-1}(0) = {1, 2, . . . , k}, f^{-1}(1) = {k + 1, k + 2, . . . , k + r} and f^{-1}(2) = N \ {1, 2, . . . , k + r}
for some k, r ∈ N.
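The truncation map of item 1 can be tested on a sample of words. In a linearly ordered set, ∨ = max and ∧ = min, and Python compares lowercase strings in dictionary order, so the following sketch checks the homomorphism property on a small, arbitrarily chosen word list (the list and the function names are illustrative).

```python
def f(word):
    """Keep the first six letters of `word` and chop off the rest."""
    return word[:6]


def is_homomorphism_on(words):
    """Check f(u v v) = f(u) v f(v) and f(u ^ v) = f(u) ^ f(v) on `words`."""
    return all(
        f(max(u, v)) == max(f(u), f(v)) and f(min(u, v)) == min(f(u), f(v))
        for u in words
        for v in words
    )


SAMPLE = ["isomor", "isomorphism", "lattice", "latticework", "meet", "join", "a", "zebra"]

print(is_homomorphism_on(SAMPLE))        # True
print(f("isomor") == f("isomorphism"))   # True, so f is not injective
```

A finite sample does not prove the claim; it only illustrates why truncation, being order preserving on a chain, respects max and min.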

In a lattice there may or may not exist an element which is greater than or equal to every other
element. If such an element exists, then it is
called a largest element. In fact, a largest element is unique. For, suppose in a lattice (L, ≤) there
exist elements a, b such that for all x ∈ L, we have x ≤ a and x ≤ b. Then, in particular, a ≤ b and
b ≤ a so that a = b. Thus, a lattice can have only one largest element. Similarly, a lattice can have
only one smallest element.

Definition 8.2.12. Let (L, ≤) be a lattice. It is called a bounded lattice if there exist elements
α, β ∈ L such that for each x ∈ L, we have x ≤ α and β ≤ x. Such an element α is called the largest
element of L, and is denoted by 1. The element β ∈ L satisfying β ≤ x for all x ∈ L is called the
smallest element of L, and is denoted by 0.

Notice that if a lattice is bounded, then 1 is the lub of the lattice and 0 is the glb of the lattice.

Definition 8.2.13. A lattice (L, ≤) is said to be complete if each nonempty subset of L has lub and
glb in L. For A ⊆ L, we write lub of A as ∨A, and glb of A, as ∧A.

It follows that each complete lattice is a bounded lattice.

Example 8.2.14.
1. Verify that the lattices in Figure 8.3 are complete.
2. The set [0, 5] with the usual order is a lattice which is both bounded and complete. So is the
set [0, 1) ∪ [2, 3].
3. The set (0, 5] with the usual order is a lattice which is neither bounded nor complete.
4. The set [0, 1) ∪ (2, 3] with the usual order is a lattice which is bounded but not complete.
5. Every finite lattice is complete, and hence, bounded. (Use induction.)
6. The set R with the usual order is a lattice. It is not a complete lattice. Observe that the
completeness property of R, i.e., “for every bounded nonempty subset a glb and an lub exist”
is different from the completeness in the lattice sense.
[Hasse diagrams of the lattice of 3-tuples of 0 and 1, of D(30), and of P({a, b, c}).]

Figure 8.3: Complete lattices

Definition 8.2.15. Let (L, ≤) be a bounded lattice. We say that (L, ≤) is a complemented lattice if
for each x ∈ L, there exists y ∈ L such that x ∨ y = 1 and x ∧ y = 0. Such an element y corresponding
to the element x is called a complement of x, and is denoted by ¬x.
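For the divisor lattices D(n) of Example 8.2.2.6, where 0 is the integer 1 and 1 is the integer n, complements can be found by search. A minimal sketch, with illustrative function names:

```python
from math import gcd


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def lcm(a, b):
    return a * b // gcd(a, b)


def complements(n, x):
    """All y in D(n) with x v y = n (the top) and x ^ y = 1 (the bottom)."""
    return [y for y in divisors(n) if lcm(x, y) == n and gcd(x, y) == 1]


def is_complemented(n):
    return all(complements(n, x) for x in divisors(n))


print(complements(30, 6))                        # [5]
print(complements(12, 2))                        # [] : 2 has no complement in D(12)
print(is_complemented(30), is_complemented(12))  # True False
```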

Example 8.2.16.

1. The interval [0, 1] with the usual ordering is a distributive lattice but is not complemented. In
fact, if x ∈ (0, 1), then it does not have a complement.

2. Verify the captions of the two figures given below. Also, compute ¬0, ¬a, ¬b, ¬c, and ¬1.

[Two Hasse diagrams, captioned "Complemented but NOT distributive" (on the elements 0, a, b, c, 1) and "Distributive but NOT complemented".]


Theorem 8.2.17. [The Comparison Table] Let (L, ≤) be a lattice and let a, b, c ∈ L. The following
table lists the properties that hold (make sense) in the specified type of lattices.
Properties | Lattice type
∨, ∧ are idempotent | Any lattice
∨, ∧ are commutative | Any lattice
∨, ∧ are associative | Any lattice
a ≤ b ⇔ a ∧ b = a ⇔ a ∨ b = b | Any lattice
[Absorption] a ∧ (a ∨ b) = a = a ∨ (a ∧ b) | Any lattice
[Isotonicity] b ≤ c ⇒ {a ∨ b ≤ a ∨ c, a ∧ b ≤ a ∧ c} | Any lattice
[Distributive inequalities] a ∨ (b ∧ c) ≤ (a ∨ b) ∧ (a ∨ c); a ∧ (b ∨ c) ≥ (a ∧ b) ∨ (a ∧ c) | Any lattice
[Modular inequality] a ≤ c ⇔ a ∨ (b ∧ c) ≤ (a ∨ b) ∧ c | Any lattice
0 is unique; 1 is unique | Bounded lattice
If a is a complement of b, then b is also a complement of a | Bounded lattice
¬0 is unique and it is 1; ¬1 is unique and it is 0 | Bounded lattice
An element a has a unique complement | Distributive complemented lattice
[Cancellation] a ∨ c = b ∨ c, a ∨ ¬c = b ∨ ¬c ⇒ a = b; a ∧ c = b ∧ c, a ∧ ¬c = b ∧ ¬c ⇒ a = b | Distributive complemented lattice
[De-Morgan] ¬(a ∨ b) = ¬a ∧ ¬b; ¬(a ∧ b) = ¬a ∨ ¬b | Distributive complemented lattice
a ∨ ¬b = 1 ⇔ a ∨ b = a; a ∧ ¬b = 0 ⇔ a ∧ b = a | Distributive complemented lattice

Proof. We will only prove the properties that appear in the last three rows; others are left as exercises.

Cancellation property:

b = b ∨ 0 = b ∨ (c ∧ ¬c) = (b ∨ c) ∧ (b ∨ ¬c) = (a ∨ c) ∧ (a ∨ ¬c) = a ∨ (c ∧ ¬c) = a ∨ 0 = a.
b = b ∧ 1 = b ∧ (c ∨ ¬c) = (b ∧ c) ∨ (b ∧ ¬c) = (a ∧ c) ∨ (a ∧ ¬c) = a ∧ (c ∨ ¬c) = a ∧ 1 = a.

De-Morgan’s property:

(a ∨ b) ∨ (¬a ∧ ¬b) = (a ∨ b ∨ ¬a) ∧ (a ∨ b ∨ ¬b) = 1 ∧ 1 = 1.
(a ∨ b) ∧ (¬a ∧ ¬b) = (a ∧ ¬a ∧ ¬b) ∨ (b ∧ ¬a ∧ ¬b) = 0 ∨ 0 = 0.
(a ∧ b) ∨ (¬a ∨ ¬b) = (a ∨ ¬a ∨ ¬b) ∧ (b ∨ ¬a ∨ ¬b) = 1 ∧ 1 = 1.
(a ∧ b) ∧ (¬a ∨ ¬b) = (a ∧ b ∧ ¬a) ∨ (a ∧ b ∧ ¬b) = 0 ∨ 0 = 0.

Using Definition 8.2.15 and the uniqueness of complements on the first two equalities, we get
¬(a ∨ b) = ¬a ∧ ¬b; and using them again on the last two equalities, we obtain ¬(a ∧ b) = ¬a ∨ ¬b.
To prove the next assertion, note that if a ∨ ¬b = 1, then

a = a ∨ (b ∧ ¬b) = (a ∨ b) ∧ (a ∨ ¬b) = (a ∨ b) ∧ 1 = a ∨ b.

Conversely, if a = a ∨ b, then a ∨ ¬b = (a ∨ b) ∨ ¬b = 1. Similarly, the second part is proved.
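The last three rows of the table can be checked by enumeration in a concrete distributive complemented lattice. The sketch below uses P({a, b, c}) with ∨ = ∪, ∧ = ∩ and ¬A = S \ A; the function names are illustrative.

```python
from itertools import combinations, product

S = frozenset("abc")
P = [frozenset(c) for r in range(len(S) + 1) for c in combinations(sorted(S), r)]
EMPTY = frozenset()


def neg(A):
    return S - A


def de_morgan_holds():
    return all(
        neg(A | B) == neg(A) & neg(B) and neg(A & B) == neg(A) | neg(B)
        for A, B in product(P, repeat=2)
    )


def cancellation_holds():
    # Whenever the two hypotheses hold, the conclusion A == B must hold.
    return all(
        A == B
        for A, B, C in product(P, repeat=3)
        if A | C == B | C and A | neg(C) == B | neg(C)
    )


def last_row_holds():
    return all(
        ((A | neg(B)) == S) == ((A | B) == A)
        and ((A & neg(B)) == EMPTY) == ((A & B) == A)
        for A, B in product(P, repeat=2)
    )


print(de_morgan_holds(), cancellation_holds(), last_row_holds())  # True True True
```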

Exercise 8.2.18.
1. Prove that every linearly ordered set is a distributive lattice.
2. Draw the Hasse diagrams of {1, 2, 3} × {1, 2, 3, 4} with dictionary order and the lattice order:
(m, n) ≤ (p, q) if m ≤ p and n ≤ q.
3. Give a partial order on N to make it a bounded lattice. You may draw a Hasse diagram representing it.
4. Consider the lattice N^2 with lexicographic order. Is it isomorphic to the direct product of (N, ≤)
with itself, where ≤ is the usual order?
5. Show that {0, 1, 2, . . .} is a complete lattice under divisibility relation. Characterize those
sets A for which ∨A = 0.
6. Is the lattice {1, 2} × {1, 2} × {1, 2} × {1, 2} isomorphic to {1, 2, 3, 4} × {1, 2, 3, 4}?
7. Prove or disprove: If L is a lattice which is not complete, then there exists a one-one function
from N to L.
8. Draw the Hasse diagram of a finite complemented lattice which is not distributive.
9. Fix n ∈ N. Let p1 , p2 , . . . , pn be n distinct primes. Prove that the lattice D(N ) (see Example 8.2.2.6)
for N = p1 p2 · · · pn is isomorphic to the lattice L^n (the lattice of n-tuples of 0
and 1) and to the lattice P(S), where S = {1, 2, . . . , n}. The Hasse diagram for n = 3 with
p1 = 2, p2 = 3, p3 = 5 is shown in Figure 8.3.
10. How many lattice homomorphisms are there from {1, 2} to {1, 2, . . . , 9}?
11. Draw as many Hasse diagrams of non-isomorphic lattices of size 6 as you can.

8.3 Boolean Algebras


In a distributive complemented lattice (see Theorem 8.2.17) the binary operations ∨, ∧, and the
unary operation ¬ satisfy certain properties. Taking cue from these properties, we define an algebraic
structure and later show that the algebraic structure is capable of capturing the seemingly more
general notion of a distributive complemented lattice.

Definition 8.3.1. A Boolean algebra is a nonempty set S which is closed under the binary opera-
tions ∨ (called join), ∧ (called meet), and the unary operation ¬ (called inverse or complement)
satisfying the following properties for all x, y, z ∈ S:

1. [Commutativity] : x ∨ y = y ∨ x and x ∧ y = y ∧ x.
2. [Distributivity] : x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z) and x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z).
3. [Identity elements] : There exist elements 0, 1 ∈ S such that x ∨ 0 = x and x ∧ 1 = x.
4. [Inverse] : x ∨ ¬x = 1 and x ∧ ¬x = 0.

When required, we write the Boolean algebra S as (S, ∨, ∧, ¬) showing the operations explicitly.

Notice that the fourth property in the definition above uses the two special elements 0 and 1,
whose existence has been asserted in the third property. This is meaningful when these two elements
are uniquely determined by the third property. We show that it is indeed the case.

Proposition 8.3.2. Let S be a Boolean algebra. Then the following statements are true:
1. Elements 0 and 1 are unique.
2. Corresponding to each s ∈ S, ¬s is the unique element in S that satisfies the property: s∨¬s = 1
and s ∧ ¬s = 0.
3. For each s ∈ S, ¬¬s = s.

Proof. (1) Let 0_1 , 0_2 ∈ S be such that for each x ∈ S, x ∨ 0_1 = x and x ∨ 0_2 = x. Then, in particular,
0_2 ∨ 0_1 = 0_2 and 0_1 ∨ 0_2 = 0_1 . By Commutativity, 0_2 ∨ 0_1 = 0_1 ∨ 0_2 . So, 0_2 = 0_1 . That is, 0 is the
unique element satisfying the property that for each x ∈ S, x ∨ 0 = x. A similar argument shows that
1 is the unique element that satisfies the property that for each x ∈ S, x ∧ 1 = x.
(2) Let s ∈ S. By definition, ¬s satisfies the required properties. For uniqueness, suppose t, r ∈ S
are such that s ∨ t = 1, s ∧ t = 0, s ∨ r = 1 and s ∧ r = 0. Then

t = t ∧ 1 = t ∧ (s ∨ r) = (t ∧ s) ∨ (t ∧ r) = 0 ∨ (t ∧ r) = (s ∧ r) ∨ (t ∧ r) = (s ∨ t) ∧ r = 1 ∧ r = r.

(3) By commutativity, ¬s ∨ s = 1 and ¬s ∧ s = 0. So, s satisfies the defining property of the
complement of ¬s. By (2), applied to ¬s, we get ¬¬s = s.

Example 8.3.3.
1. Let S be a nonempty set. Then P(S) is a Boolean algebra with ∨ = ∪, ∧ = ∩, ¬A = A^c , 0 = ∅
and 1 = S. This is called the power set Boolean algebra. So, we have Boolean algebras of
finite size as well as of uncountable size.
2. Take D(30) = {n ∈ N : n | 30} with a ∨ b = lcm(a, b), a ∧ b = gcd(a, b) and ¬a = 30/a. It is a
Boolean algebra with 0 = 1 and 1 = 30.
3. Let B = {T, F }, where ∨, ∧ and ¬ are the usual connectives. It is a Boolean algebra with 0 = F
and 1 = T .
4. Let B be the set of all truth functions involving the variables p1 , . . . , pn , with usual operations
∨, ∧ and ¬. Then B is a Boolean algebra with 0 = ⊥ and 1 = ⊤. This is called the free
Boolean algebra on the generators p1 , . . . , pn . (See Chapter 7.)
5. The set of all formulas (of finite length) involving variables p1 , p2 , . . . is a Boolean algebra with
usual operations. This is also called the free Boolean algebra on the generators p1 , p2 , . . .. Here
also 0 = ⊥ and 1 = ⊤. So, we have a Boolean algebra of denumerable size.
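The four defining properties of Definition 8.3.1 can be verified by enumeration for the divisor example in item 2. The sketch below also tries the same recipe on D(12), where it is expected to fail; the function names are illustrative.

```python
from itertools import product
from math import gcd


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def lcm(a, b):
    return a * b // gcd(a, b)


def satisfies_axioms(n):
    """Check the Boolean algebra axioms on D(n) with join = lcm, meet = gcd, neg(a) = n / a."""
    D = divisors(n)
    zero, one = 1, n

    def neg(a):
        return n // a

    commutative = all(
        lcm(a, b) == lcm(b, a) and gcd(a, b) == gcd(b, a)
        for a, b in product(D, repeat=2)
    )
    distributive = all(
        lcm(a, gcd(b, c)) == gcd(lcm(a, b), lcm(a, c))
        and gcd(a, lcm(b, c)) == lcm(gcd(a, b), gcd(a, c))
        for a, b, c in product(D, repeat=3)
    )
    identity = all(lcm(a, zero) == a and gcd(a, one) == a for a in D)
    inverse = all(lcm(a, neg(a)) == one and gcd(a, neg(a)) == zero for a in D)
    return commutative and distributive and identity and inverse


print(satisfies_axioms(30))  # True
print(satisfies_axioms(12))  # False : lcm(2, 12 // 2) = 6, not 12
```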

Remark 8.3.4. The rules of Boolean algebra treat (∨, 0) and (∧, 1) equally. Notice that the second
parts in the defining conditions of Definition 8.3.1 can be obtained from the corresponding first parts
by replacing ∨ with ∧, ∧ with ∨, 0 with 1, and 1 with 0 simultaneously. Thus, any statement that one
can derive from these assumptions has a dual version which is derivable from the same assumptions.
This is called the principle of duality.

Theorem 8.3.5. [Laws] Let S be a Boolean algebra. Then the following laws hold for all r, s, t ∈ S:
1. [Constants] : ¬0 = 1, ¬1 = 0, s ∨ 1 = 1, s ∧ 1 = s, s ∨ 0 = s, s ∧ 0 = 0.
2. [Idempotence] : s ∨ s = s, s ∧ s = s.
3. [Absorption] : s ∨ (s ∧ t) = s, s ∧ (s ∨ t) = s.
4. [Cancellation] : s ∨ t = r ∨ t, s ∨ ¬t = r ∨ ¬t ⇒ s = r.
5. [Cancellation] : s ∧ t = r ∧ t, s ∧ ¬t = r ∧ ¬t ⇒ s = r.
6. [Associativity] : (s ∨ t) ∨ r = s ∨ (t ∨ r), (s ∧ t) ∧ r = s ∧ (t ∧ r).

Proof. We give the proof of the first part of each item and that of its dual is left for the reader.
(1) 1 = 0 ∨ (¬0) = ¬0.
s ∨ 1 = (s ∨ 1) ∧ 1 = (s ∨ 1) ∧ (s ∨ ¬s) = s ∨ (1 ∧ ¬s) = s ∨ ¬s = 1.
s ∨ 0 = s holds by the identity property in Definition 8.3.1.
(2) s = s ∨ 0 = s ∨ (s ∧ ¬s) = (s ∨ s) ∧ (s ∨ ¬s) = (s ∨ s) ∧ 1 = (s ∨ s).
(3) s ∨ (s ∧ t) = (s ∧ 1) ∨ (s ∧ t) = s ∧ (1 ∨ t) = s ∧ 1 = s.
(4) Suppose that s ∨ t = r ∨ t and s ∨ ¬t = r ∨ ¬t. Then
s = s ∨ 0 = s ∨ (t ∧ ¬t) = (s ∨ t) ∧ (s ∨ ¬t) = (r ∨ t) ∧ (r ∨ ¬t) = r ∨ (t ∧ ¬t) = r ∨ 0 = r.
(5) This is the dual of (4) and left as an exercise.
(6) Using distributivity and absorption, we have

[s ∨ (t ∨ r)] ∧ ¬s = (s ∧ ¬s) ∨ [(t ∨ r) ∧ ¬s] = 0 ∨ [(t ∨ r) ∧ ¬s]
= (t ∨ r) ∧ ¬s = (t ∧ ¬s) ∨ (r ∧ ¬s).

[(s ∨ t) ∨ r] ∧ ¬s = [(s ∨ t) ∧ ¬s] ∨ (r ∧ ¬s) = [(s ∧ ¬s) ∨ (t ∧ ¬s)] ∨ (r ∧ ¬s)
= [0 ∨ (t ∧ ¬s)] ∨ (r ∧ ¬s) = (t ∧ ¬s) ∨ (r ∧ ¬s).

Hence, [s ∨ (t ∨ r)] ∧ ¬s = [(s ∨ t) ∨ r] ∧ ¬s.
Also, [(s ∨ t) ∨ r] ∧ s = [(s ∨ t) ∧ s] ∨ (r ∧ s) = s ∨ (r ∧ s) = s = [s ∨ (t ∨ r)] ∧ s.
Now, apply the Cancellation law (5) to obtain the required result.

Isomorphisms between two similar algebraic structures help us in understanding an unfamiliar
entity through a familiar one. Boolean algebras are no exception.

Definition 8.3.6. Let (B_1 , ∨_1 , ∧_1 , ¬_1 ) and (B_2 , ∨_2 , ∧_2 , ¬_2 ) be two Boolean algebras. A function
f : B_1 → B_2 is a Boolean homomorphism if it preserves 0, 1, ∨, ∧, and ¬, that is, if for all a, b ∈ B_1 ,

f (0_1 ) = 0_2 , f (1_1 ) = 1_2 , f (a ∨_1 b) = f (a) ∨_2 f (b), f (a ∧_1 b) = f (a) ∧_2 f (b), f (¬_1 a) = ¬_2 f (a).


A Boolean isomorphism is a Boolean homomorphism which is a bijection.

Unless we expect an ambiguity in reading and interpreting the symbols, we will not write the
subscripts with the operations explicitly as is done in Definition 8.3.6.

Example 8.3.7. Recall the notation [n] = {1, 2, . . . , n}. The function f : P([4]) → P([3]) defined
by f (S) = S \ {4} is a Boolean homomorphism. We check two of the properties and leave others as
exercises.

f (A ∨ B) = f (A ∪ B) = (A ∪ B) \ {4} = (A \ {4}) ∪ (B \ {4}) = f (A) ∨ f (B).


f (1) = f ([4]) = [4] \ {4} = [3] = 1.
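All five conditions of Definition 8.3.6 can be checked for this map by enumerating P([4]). The following is a sketch with illustrative names; subsets are modelled as frozensets.

```python
from itertools import combinations, product


def powerset(n):
    """All subsets of [n] = {1, ..., n} as frozensets."""
    return [frozenset(c) for r in range(n + 1) for c in combinations(range(1, n + 1), r)]


TOP4 = frozenset({1, 2, 3, 4})
TOP3 = frozenset({1, 2, 3})


def f(A):
    return A - {4}


def is_boolean_homomorphism():
    P4 = powerset(4)
    preserves_constants = f(frozenset()) == frozenset() and f(TOP4) == TOP3
    preserves_join_meet = all(
        f(A | B) == f(A) | f(B) and f(A & B) == f(A) & f(B)
        for A, B in product(P4, repeat=2)
    )
    # Complement is taken in [4] on the left and in [3] on the right.
    preserves_neg = all(f(TOP4 - A) == TOP3 - f(A) for A in P4)
    return preserves_constants and preserves_join_meet and preserves_neg


print(is_boolean_homomorphism())  # True
```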
Exercise 8.3.8. 1. Let B_1 and B_2 be two Boolean algebras and let f : B_1 → B_2 be a function that
satisfies the four conditions f (0_1 ) = 0_2 , f (1_1 ) = 1_2 , f (a ∨_1 b) = f (a) ∨_2 f (b) and f (a ∧_1 b) =
f (a) ∧_2 f (b). Then, prove that f also satisfies the fifth condition, namely f (¬_1 a) = ¬_2 f (a).
2. Let B be a Boolean algebra. If a, b ∈ B with a ∧ b ≠ a, then prove that a ∧ ¬b ≠ 0.
3. Let B be a Boolean algebra. Then prove the following:
(a) If B has three distinct atoms p, q and r, then p ∨ q ≠ p ∨ q ∨ r.
(b) Let b ∈ B. If p, q and r are the only atoms less than or equal to b, then b = p ∨ q ∨ r.

4. Prove or disprove: Let f : B1 → B2 be a Boolean homomorphism and let a ∈ B1 be an atom.
Then f (a) is an atom of B2 .
5. What is the number of Boolean homomorphisms from P([4]) to P([3])?
6. How many Boolean homomorphisms from P([4]) onto P([3]) exist?
7. See Example 8.3.3.2. How many atoms does D(30030) have? How many elements does it have?

We will show that finite Boolean algebras are simply the power set Boolean algebras, up to
isomorphism. Towards this, looking at a Boolean algebra as a lattice will be of help.
Let (L, ≤) be a distributive complemented lattice. Then, L has two binary operations ∨ and ∧
and the unary operation ¬. It can be verified that (L, ∨, ∧, ¬) is a Boolean algebra. Conversely, let
(B, ∨, ∧, ¬) be a Boolean algebra. Is it possible to define a partial order ≤ on B so that (B, ≤) will
be a distributive complemented lattice, and then in this lattice, the resulting operations of ∨, ∧ and
¬ will be the same operations we have started with?

Theorem 8.3.9. Let (B, ∨, ∧, ¬) be a Boolean algebra. Define the relation ≤ on B by

a ≤ b if and only if a ∧ b = a for all a, b ∈ B.

Then (B, ≤) is a distributive complemented lattice in which lub{a, b} = a ∨ b and glb{a, b} = a ∧ b for
all a, b ∈ B.

Proof. We first verify that ≤ is a partial order on B.

Reflexive: s ≤ s if and only if s ∧ s = s, which is true.
Antisymmetry: Let s ≤ t and t ≤ s. Then, using commutativity, we have s = s ∧ t = t ∧ s = t.
Transitive: Let s ≤ t and t ≤ r. Then s ∧ t = s and t ∧ r = t. Using associativity, s ∧ r = (s ∧ t) ∧ r =
s ∧ (t ∧ r) = s ∧ t = s; consequently, s ≤ r.

Now, we show that a ∨ b = lub{a, b}. Since B is a Boolean algebra, using absorption, we get
(a ∨ b) ∧ a = a and hence a ≤ a ∨ b. Similarly, b ≤ a ∨ b. So, a ∨ b is an upper bound for {a, b}.
Now, let x be any upper bound for {a, b}. Then, by distributive property, (a ∨ b) ∧ x = (a ∧ x) ∨
(b ∧ x) = a ∨ b. So, a ∨ b ≤ x. Thus, a ∨ b is the lub of {a, b}. Analogous arguments show that
a ∧ b = glb{a, b}.
Since for all a, b ∈ B, a ∨ b and a ∧ b are in B, we see that lub{a, b} and glb{a, b} exist. Thus (B, ≤)
is a lattice.
Further, if a ∈ B, then ¬a ∈ B. This provides the complement of a in the lattice (B, ≤). Further,
both the distributive properties are already satisfied in B. Hence (B, ≤) is a distributive complemented
lattice.

In view of Theorem 8.3.9, we give the following definition.

Definition 8.3.10. Let (B, ∨, ∧, ¬) be a Boolean algebra. The relation ≤ on B given by

a ≤ b if and only if a ∧ b = a for all a, b ∈ B

is called the induced partial order. A minimal element of B with respect to the partial order ≤,
which is different from 0 is called an atom in B.

It follows from Theorem 8.3.9 that a Boolean algebra can be defined as a distributive complemented
lattice. In this development, one then proves the defining properties and the laws of a Boolean algebra.

Example 8.3.11.
1. In the power set Boolean algebra, singleton sets are the only atoms.
2. In Example 8.3.3.2, atoms of D(30) are 2, 3 and 5.

3. The {F, T } Boolean algebra has only one atom, namely T .
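Atoms can be computed directly from the induced partial order a ≤ b ⇔ a ∧ b = a. The sketch below takes a finite list of elements, the meet operation and the zero element as inputs; the function names are illustrative.

```python
from math import gcd


def atoms(elements, meet, zero):
    """Minimal nonzero elements with respect to a <= b iff meet(a, b) == a."""
    def leq(a, b):
        return meet(a, b) == a

    nonzero = [x for x in elements if x != zero]
    return [x for x in nonzero if not any(y != x and leq(y, x) for y in nonzero)]


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


# D(30): meet is gcd and the zero element is the integer 1.
print(atoms(divisors(30), gcd, 1))  # [2, 3, 5]

# P({1, 2}): meet is intersection and the zero element is the empty set.
subsets = [frozenset(s) for s in ([], [1], [2], [1, 2])]
print(atoms(subsets, lambda a, b: a & b, frozenset()))  # the two singletons
```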

Practice 8.3.12.
1. What are the atoms of the free Boolean algebra with generators p1 , . . . , pn ?
2. Is it necessary that every Boolean algebra has at least one atom?

The following two results are intuitively obvious.

Proposition 8.3.13. Each finite Boolean algebra has at least one atom.

Proof. Let B be a finite Boolean algebra. Assume that no element of B is an atom. Now, 0 < 1 and
1 is not an atom. Then there exists b_1 ∈ B such that 0 < b_1 < 1. Since b_1 is not an atom, there exists
b_2 ∈ B such that 0 < b_2 < b_1 < 1. By induction it follows that we have a sequence of elements (b_i )
such that 0 < · · · < b_i < b_{i−1} < · · · < b_1 < 1. As B is finite, there exist k > j such that b_k = b_j . We
then have b_k < b_{k−1} < · · · < b_j = b_k . This is impossible. Hence B has at least one atom.

Proposition 8.3.14. Let p and q be atoms in a Boolean algebra B. If p ≠ q, then p ∧ q = 0.

Proof. Suppose that p ∧ q ≠ 0. We know that p ∧ q ≤ p. If p ∧ q ≠ p, then 0 < p ∧ q < p. But this is not
possible since p is an atom. So, p ∧ q = p. Similarly, q ∧ p = q. By commutativity, p = p ∧ q = q ∧ p = q.

Theorem 8.3.15. [Representation] Let B be a finite Boolean algebra. Then there exists a set X
such that B is isomorphic to P(X).

Proof. Let X be the set of all atoms of B. By Proposition 8.3.13, X ≠ ∅. Define f : B → P(X)
by f (b) = {x ∈ B : x is an atom and x ≤ b} for b ∈ B. We show that f is the required Boolean
isomorphism.

Injection: Suppose b1 ≠ b2 . Then, either b1 ≰ b2 or b2 ≰ b1 . Without loss of generality, let b1 ≰ b2 .
Note that b1 = b1 ∧ (b2 ∨ ¬b2 ) = (b1 ∧ b2 ) ∨ (b1 ∧ ¬b2 ). Also, the assumption b1 ≰ b2 implies b1 ∧ b2 ≠ b1
and hence b1 ∧ ¬b2 ≠ 0 (see Exercise 8.3.8.2). As B is finite, the argument of Proposition 8.3.13 shows
that every nonzero element has an atom below it. So, there exists an atom x ≤ (b1 ∧ ¬b2 ) and hence
x = x ∧ b1 ∧ ¬b2 . Then

x ∧ b1 = (x ∧ b1 ∧ ¬b2 ) ∧ b1 = x ∧ b1 ∧ ¬b2 = x.

Thus, x ≤ b1 . Similarly, x ≤ ¬b2 . As x ≠ 0, we cannot have x ≤ b2 (for, x ≤ ¬b2 and x ≤ b2 imply
x ≤ b2 ∧ ¬b2 = 0). Thus there is an atom in f (b1 ) which is not in f (b2 ). Therefore, f (b1 ) ≠ f (b2 ).
Surjection: Let A = {x1 , . . . , xk } ⊆ X. Write a = x1 ∨ · · · ∨ xk (if A = ∅, take a = 0). Clearly,
A ⊆ f (a). We show that A = f (a). So, let y ∈ f (a). Then y is an atom in B and

y = y ∧ a = y ∧ (x1 ∨ · · · ∨ xk ) = (y ∧ x1 ) ∨ · · · ∨ (y ∧ xk ).

Since y ≠ 0, by Proposition 8.3.14, y ∧ xi ≠ 0 for some i ∈ {1, 2, . . . , k}. As xi and y are atoms, we
have y = y ∧ xi = xi and hence y ∈ A. That is, f(a) ⊆ A so that f(a) = A. Thus, f is a surjection.
Preserving 0, 1 : Clearly f (0) = ∅ and f (1) = X.
Preserving ∨, ∧ : By definition,

x ∈ f (b1 ∧ b2 ) ⇔ x ≤ b1 ∧ b2 ⇔ x ≤ b1 and x ≤ b2
⇔ x ∈ f (b1 ) and x ∈ f (b2 ) ⇔ x ∈ f (b1 ) ∩ f (b2 ).

For the other one, let x ∈ f(b1 ∨ b2). Then, x = x ∧ (b1 ∨ b2) = (x ∧ b1) ∨ (x ∧ b2). So, x ∧ b1 ≠ 0
or x ∧ b2 ≠ 0. Without loss of generality, suppose x ∧ b1 ≠ 0. As x is an atom, x ≤ b1 and
hence x ∈ f(b1) ⊆ f(b1) ∪ f(b2). Conversely, let x ∈ f(b1) ∪ f(b2). Without loss of generality, let
x ∈ f(b1). Thus, x ≤ b1 and hence x ≤ b1 ∨ b2 which in turn implies that x ∈ f(b1 ∨ b2). Therefore,
x ∈ f(b1 ∨ b2) ⇔ x ∈ f(b1) ∪ f(b2).
Preserving ¬ : Let x ∈ B. Then f(x) ∪ f(¬x) = f(x ∨ ¬x) = f(1) = X and f(x) ∩ f(¬x) = f(x ∧ ¬x) =
f(0) = ∅. Thus f(¬x) = f(x)^c.
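
The map f of Theorem 8.3.15 can be checked by hand on a small case. The following is only an illustrative sketch: it assumes the divisor algebra B_30 of Exercise 8.3.17.6 (with ∨ = lcm, ∧ = gcd, ¬a = 30/a, and ≤ taken to be divisibility) as the finite Boolean algebra, and the helper names are hypothetical.

```python
from itertools import combinations
from math import gcd

N = 30  # assumed example: 30 is square-free
B = [d for d in range(1, N + 1) if N % d == 0]
ZERO, ONE = 1, N  # the least and greatest elements of this algebra

def meet(a, b):
    return gcd(a, b)

def join(a, b):
    return a * b // gcd(a, b)

def neg(a):
    return N // a

def leq(a, b):
    # a <= b in this algebra means that a divides b
    return b % a == 0

# Atoms: nonzero elements p with no element strictly between 0 and p.
atoms = [p for p in B if p != ZERO
         and not any(leq(q, p) and q not in (ZERO, p) for q in B)]

def f(b):
    """The map of the theorem: the set of atoms below b."""
    return frozenset(x for x in atoms if leq(x, b))

images = {f(b) for b in B}
all_subsets = {frozenset(c) for r in range(len(atoms) + 1)
               for c in combinations(atoms, r)}
is_bijection = len(images) == len(B) and images == all_subsets
preserves_ops = all(
    f(meet(a, b)) == f(a) & f(b) and f(join(a, b)) == f(a) | f(b)
    for a in B for b in B
) and all(f(neg(a)) == frozenset(atoms) - f(a) for a in B)
```

Here the atoms turn out to be the primes 2, 3, 5, and f sends each divisor to the set of primes dividing it.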

As immediate consequences of the representation theorem, we obtain the following results. The
readers should provide a proof.

Corollary 8.3.16. Let B be a finite Boolean algebra.


1. If B has exactly k atoms then B is isomorphic to P({1, 2, . . . , k}). Hence, B has exactly 2^k
elements.
2. Fix b ∈ B. If p1 , . . . , pn are the only atoms less than or equal to b, then b = p1 ∨ · · · ∨ pn .

Exercise 8.3.17.
1. Supply a Boolean homomorphism f from P([4]) to P([3]) such that rng f has 4 elements.
2. Prove or disprove: The number of Boolean homomorphisms from P([4]) to P([3]) is less than
the number of lattice homomorphisms from P([4]) to P([3]).
3. Show that a lattice homomorphism on a Boolean algebra which preserves 0 and 1 is a Boolean
homomorphism.
4. Consider the class of all functions f : R → {π, e}. Can we define some operations on this class
to make it a Boolean algebra?
5. We know that a finite Boolean algebra must have at least one atom. Is ‘finite’ necessary?
6. A positive integer is called square-free if it is not divisible by the square of a prime. Let
Bn = {k ∈ N : k | n}. For a, b ∈ Bn take the operations a ∨ b = lcm(a, b), a ∧ b = gcd(a, b) and
¬a = n/a. Show that Bn is a Boolean algebra if and only if n > 1 is square-free.
7. Show that the set of subsets of N which are either finite or have a finite complement is a de-
numerable Boolean algebra. Find the atoms. Is it isomorphic to the free Boolean algebra with
generators p1 , p2 , · · · ?
8. Let B be a Boolean algebra and let xi ∈ B for i = 1, 2, . . .. We know that, for each n ∈ N, the
expression ⋁_{i=1}^{n} xi is meaningful in each Boolean algebra due to associativity. Is ⋁_{i=1}^{∞} xi
necessarily a meaningful expression?
9. Show that a Boolean algebra with at least 3 atoms has at least 2^3 elements.
10. Prove or disprove: If B1 and B2 are finite Boolean algebras each of size k > 100, then they must
be isomorphic and there must be more than k isomorphisms between them.
11. Let F(N) = {X ⊆ N : X is finite or X^c is finite}. Similarly, define F(R). Show that both F(N)
and F(R) are Boolean algebras, where ∨ = ∪, ∧ = ∩ and ¬(Y) = Y^c. What is ≤ here?
12. Give examples of two denumerable non-isomorphic Boolean algebras.
13. Give examples of two uncountable non-isomorphic Boolean algebras.
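
For item 6, a numerical experiment can build some intuition before attempting the proof. The sketch below is not a proof and covers only part of the claim: it checks, for 2 ≤ n < 200, whether ¬a = n/a really is a complement of a in Bn (that is, a ∧ ¬a = 1 and a ∨ ¬a = n, where 1 and n play the roles of 0 and 1). The function names are hypothetical.

```python
from math import gcd

def is_square_free(n):
    # n is square-free if no k*k with k >= 2 divides it
    return all(n % (k * k) != 0 for k in range(2, int(n ** 0.5) + 1))

def complement_law_holds(n):
    """Check gcd(a, n/a) = 1 and lcm(a, n/a) = n for every divisor a of n."""
    for a in (d for d in range(1, n + 1) if n % d == 0):
        b = n // a
        if gcd(a, b) != 1 or a * b // gcd(a, b) != n:
            return False
    return True

# Compare the two properties on a small range of n.
agree = all(complement_law_holds(n) == is_square_free(n) for n in range(2, 200))
```

For instance, the check fails for n = 12 at a = 2, since gcd(2, 6) = 2.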

8.4 Axiom of choice and its equivalents


As mentioned earlier, unrestricted use of apparently natural constructions led to paradoxes in the
informal theory of sets. This brought forth many axiomatizations of set theory. Mathematicians
working in various branches, specifically those who worked on the foundations of mathematics, raised
some concerns regarding one particular axiom, called the Axiom of Choice. A priori, it is inconceivable
that this seemingly obvious statement should generate so much controversy. The controversy and
debate generated by the axiom of choice among mathematicians might be put in parallel to the
much discussed parallel postulate of Euclid. Though the axiom of choice looks very innocent, some of its
consequences are counter-intuitive. More than a century has passed since it was formulated. It has
been used in many branches of mathematics with much success in proving very important results.
There are different versions of the axiom of choice and some more equivalent statements, popularly
accepted as lemmas or principles named after their originators. We will give an overview of the topic
in this section and discuss its usefulness. The reader may refer to [7] and [11] for details.
We know that the Cartesian product of two nonempty sets is nonempty. Using induction, we can
show that the product of a finite number of nonempty sets is nonempty. Is it true that the product of
an infinite number of nonempty sets is nonempty? Axiom of choice posits that it is indeed true.

Axiom 8.4.1. [Axiom of Choice (AC)] The product of a nonempty family of nonempty sets is
nonempty.

Recall that if {Aα}α∈I is a nonempty family of nonempty sets with the index set I, then the union
of all sets in this family is denoted by ∪_{α∈I} Aα. Similarly, the product of this family consists of all
functions f from I to ∪_{α∈I} Aα, where f(α) ∈ Aα for each α ∈ I. Thus AC asserts that at least one
such function exists. Notice that any arbitrary family of sets C can be written as an indexed family
by taking the index set as C itself; for, C = {Aα}α∈C with Aα = α. The union of such a family of sets
is thus ∪_{Y∈C} Y, which is also written as ∪C. Hence a reformulation of AC is as follows:

AC: Given any nonempty family C of nonempty sets, there exists a function f : C → ∪_{Y∈C} Y, called
the choice function, such that f(X) ∈ X for each X in C.
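
For a finite family whose members carry an obvious rule for picking an element, a choice function can be written down explicitly and no axiom is involved. The sketch below assumes a family of nonempty finite sets of integers and uses the rule "take the least element"; the name choice_function is hypothetical. AC is of interest exactly when no such rule is available.

```python
def choice_function(family):
    """Return a dict f with f[X] in X for every set X in the family.

    Assumes each member is a nonempty finite set of integers, so that
    'take the least element' is a well-defined rule.
    """
    return {X: min(X) for X in family}

family = {frozenset({3, 1, 4}), frozenset({1, 5}), frozenset({9, 2, 6})}
f = choice_function(family)
```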

Another formulation of AC is given in the following axiom. It so closely resembles AC that it goes
by the acronym AC1.

Axiom 8.4.2. [Axiom of Choice 1 (AC1)] Given any nonempty family C of pairwise disjoint nonempty
sets, there exists a set B such that for each set X in C, X ∩ B is a singleton set.

Intuitively, one arrives at the set B in AC1 by choosing an element from each set in the given
family.

Theorem 8.4.3. AC1 is equivalent to AC.

Proof. Assume that AC1 is true. Let {Bα : α ∈ I} be a nonempty family of nonempty sets. For each
α ∈ I, write Cα = {(x, α) : x ∈ Bα}. In a way Cα is a copy of Bα, the only difference being that Cα
consists of the ordered pairs (x, α) instead of the elements x of Bα. Consider the family of sets
C = {Cα : α ∈ I}. Notice that if α ≠ β, then Cα ∩ Cβ = ∅. Thus C is a nonempty family of pairwise
disjoint nonempty sets. By AC1, there exists a set A such that A ∩ Cα is a singleton set for each α ∈ I.
Write A ∩ Cα = {(xα, α)}, where xα ∈ Bα. Define the function f : I → ∪_{α∈I} Bα by f(α) = xα. Clearly,
f is well defined and f(α) ∈ Bα for each α ∈ I. Therefore, AC is true. The proof of “AC implies AC1”
is left as an exercise.
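
The tagging trick in this proof can be traced on a small indexed family. This is a minimal sketch with hypothetical names: the family is finite, so the selector set A is produced by an explicit rule rather than by AC1, and the choice function is recorded on the index set.

```python
def disjoint_copies(indexed_family):
    """C_alpha = {(x, alpha) : x in B_alpha}; copies for different alpha are disjoint."""
    return {alpha: {(x, alpha) for x in B_alpha}
            for alpha, B_alpha in indexed_family.items()}

def selector(copies):
    """A set A meeting each C_alpha in exactly one point (explicit for a finite family)."""
    return {min(C_alpha) for C_alpha in copies.values()}

def choice_from_selector(copies, A):
    """Read off x_alpha from A ∩ C_alpha = {(x_alpha, alpha)}."""
    f = {}
    for alpha, C_alpha in copies.items():
        ((x_alpha, _),) = A & C_alpha  # unpack the single pair in the intersection
        f[alpha] = x_alpha
    return f

# The sets overlap and one set is repeated, yet the tagged copies are disjoint.
B = {"a": {1, 2}, "b": {2, 3}, "c": {1, 2}}
C = disjoint_copies(B)
A = selector(C)
f = choice_from_selector(C, A)
```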

There are many general statements equivalent to Axiom of Choice. We will state only some of
them and discuss their applications. For one of the equivalents of AC, we require a new notion that
we introduce now.

Definition 8.4.4. A family of sets C is called a family of finite character if for any set A,
A ∈ C if and only if each finite subset of A is in C.

Example 8.4.5.
1. The empty family is a family of finite character.
2. The power set of any set is a family of finite character.
3. {∅, {1}, {2}} is a family of finite character.
4. Let V be a nontrivial vector space. As we know, a subset A of V is linearly independent if
and only if all finite subsets of A are linearly independent. Therefore, the family of linearly
independent subsets of V is a family of finite character.

Proposition 8.4.6. Let A and B be two nonempty sets. Then P(A) ∪ P(B) is a family of finite
character.

Proof. Let C = P(A) ∪ P(B) and let X ∈ C. Suppose X ∈ P(A). Then X ⊆ A. If Y is a finite subset
of X, then Y ⊆ A. Then Y ∈ P(A). Similarly, if X ∈ P(B), then all finite subsets of X are in P(B).
Thus, all finite subsets of X are in P(A) ∪ P(B).
Conversely, suppose X is a set such that all finite subsets of X are in P(A) ∪ P(B). We need to
show that X is in P(A) ∪ P(B).
Assume the contrary, that X ⊈ A and X ⊈ B. Then there exist elements x ∈ X \ A and y ∈ X \ B.
Now, {x, y} is a finite subset of X; and hence {x, y} ∈ P(A) ∪ P(B). But {x, y} is neither in P(A)
nor in P(B), a contradiction.
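
Definition 8.4.4 can be tested mechanically for families of subsets of a small finite universe, where every subset is finite. The following sketch uses hypothetical helper names and assumes the family consists of subsets of the given universe.

```python
from itertools import combinations

def powerset(s):
    s = list(s)
    return {frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)}

def has_finite_character(family, universe):
    """Check the definition for a family of subsets of a finite universe.

    Every subset of a finite universe is finite, so the test reads:
    A is in the family  <=>  every subset of A is in the family.
    """
    family = {frozenset(A) for A in family}
    return all(
        (A in family) == powerset(A).issubset(family)
        for A in powerset(universe)
    )

ex1 = has_finite_character([set(), {1}, {2}], {1, 2})        # Example 8.4.5, item 3
A, B = {1, 2}, {2, 3}
ex2 = has_finite_character(powerset(A) | powerset(B), A | B)  # Proposition 8.4.6
ex3 = has_finite_character([{1}], {1})                        # the finite subset ∅ is missing
```

Such a finite check cannot produce the counterexamples needed when the sets involved are infinite.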

Practice 8.4.7. Let {Aα}α∈I be a nonempty family of nonempty sets. Show that ∪_{α∈I} P(Aα) need
not be a family of finite character.

The following theorem lists some of the widely used equivalents of the axiom of choice. You will
find some of them intuitively appealing while others are not. Nonetheless, each of them follows from
each of the others.

Theorem 8.4.8. The following are equivalent to the axiom of choice:


1. [Tukey’s lemma] Every nonempty family of finite character has a maximal element.
2. [Hausdorff ’s maximality principle] Every nonempty poset contains a maximal chain.
3. [Zorn’s lemma] In a nonempty poset, if every chain has an upper bound, then the poset has a
maximal element.
4. [Zermelo’s well ordering principle] Every set can be well ordered.

Proof. (AC ⇒ Tukey’s lemma) As usual, in a family of sets, we consider the partial order as ⊆.
Assume that the axiom of choice (AC) is true but Tukey’s lemma is false. Let F be a nonempty
family of finite character without any maximal element. So, each set A in the family F has a proper
superset in F. For each A ∈ F, define SA as the family of proper supersets of A that are in F. Thus
for each set A in the family F, the family SA is nonempty. Now, {SA}A∈F is a collection of families
indexed by the family F. The product of this indexed family is the set of all functions f from F to
∪_{A∈F} SA with f(A) ∈ SA for each A ∈ F. Since each SA is nonempty, AC implies that there exists
such a function f. Consequently, for each A ∈ F, f(A) is a proper superset of A.
For convenience, call a sub-family E of F an f -inductive family if the following conditions hold:

(i) ∅ ∈ E, (ii) if A ∈ E, then f (A) ∈ E, (iii) if B is a chain in E, then ∪ B ∈ E.



Notice that F itself is f -inductive. Let H be the intersection of all f -inductive sub-families of F. It
is easy to see that H is f -inductive. We show that H is a chain.
In order to do that define the following family of sets

L = {A ∈ F : if B ∈ H and B is a proper subset of A, then f (B) ⊆ A}.

Since F has no maximal element, L is a nonempty family. Fix any L ∈ L.


Claim 1 : For any H ∈ H, we have either H ⊆ L or f (L) ⊆ H.
To see this, consider the sub-family C of H defined by

C = {H ∈ H | H ⊆ L or f (L) ⊆ H}.

We first show that C is an f -inductive set.


Clearly, ∅ is in C. Observe that for any set A in C, the following are true:
1. If A is a proper subset of L, then f (A) ⊆ L, as L ∈ L. So f (A) ∈ C.
2. If A = L, then f (A) = f (L) ⇒ f (L) ⊆ f (A) ⇒ f (A) ∈ C.
3. If f (L) ⊆ A, then f (L) ⊆ A ⊆ f (A). So f (A) ∈ C.
4. If B is a chain in C, then ∪ B ∈ C.
Reason: If each element of B is subset of L, then ∪ B ⊆ L and so it is in C. If some element B
of B is not a subset of L, then f (L) ⊆ B and so f (L) ⊆ ∪ B; thus ∪ B ∈ C.

Thus C is an f -inductive set. Since H is the intersection of all f -inductive families, H ⊆ C. Also,
by the very definition of C, we have C ⊆ H. Therefore, C = H. Again, the definition of C implies that
if H ∈ H, then either H ⊆ L or f (L) ⊆ H.


Claim 2 : L = H.

We first show that L is an f -inductive family.


Clearly, ∅ is in L. Let L ∈ L. Does it follow that f(L) ∈ L? To answer this, let B ∈ H be a proper
subset of f(L). If B ⊈ L, then by Claim 1, f(L) ⊆ B, which is not possible. Hence B ⊆ L. Now, if
B = L, then f(B) = f(L). Otherwise, B is a proper subset of L. Since L ∈ L, f(B) ⊆ L ⊆ f(L). In
any case, f(B) ⊆ f(L). Hence f(L) ∈ L.
Let B be a chain in L. Is it true that ∪ B ∈ L? Well, let H be a proper subset of ∪ B. We show
that f (H) ⊆ ∪ B. For this, let B ∈ B. If H is a proper subset of B, then since B ∈ B ⊆ L, we have
f (H) ⊆ B ⊆ ∪ B. If H = B, then since B is a proper subset of ∪ B, ∪ B ∈ H and B ∈ L, we have
f(H) = f(B) ⊆ ∪ B. Otherwise, for each B ∈ B, we have H ⊈ B. As B ∈ L, f(B) ⊆ H. So, B ⊆ H,
for each B and then ∪ B ⊆ H. But this is not possible as H is a proper subset of ∪ B. Therefore,
∪ B ∈ L.
Hence, L is an f -inductive family. Since H is the intersection of all f -inductive families, H ⊆ L.
Also, by the very definition of L, we have L ⊆ H. Therefore, L = H.

From Claim 1, we conclude that for each pair of sets H1, H2 in H, we have either H2 ⊆ H1 or
f(H1) ⊆ H2 ⇒ H1 ⊆ H2. So H is a chain. Using Claim 2, we see that H is a chain in F satisfying

(a) ∅ ∈ H, (b) if A ∈ H, then f(A) ∈ H, (c) ∪_{A∈H} A ∈ H.

Now, starting with ∅ we see that ∅ ∈ H and f(∅) ∈ H, and then f^2(∅) ∈ H, etc. Using induction,
we have f^n(∅) ∈ H for each n ∈ N. It then follows that

∪_{n∈N} f^n(∅) is a proper subset of f(∪_{n∈N} f^n(∅)) ⊆ ∪_{n∈N} f^n(∅).

This is a contradiction.

(Tukey’s lemma ⇒ Hausdorff’s maximality principle) Assume that Tukey’s lemma is true. Let X be
a nonempty poset. Denote by C the family of all chains in X. Every finite subset of a chain is again
a chain. Conversely, let Y be a set such that all its finite subsets are in C. Then for any x, z ∈ Y, we
have {x, z} ∈ C; so, x and z are comparable. Thus, Y is a chain and so Y is a set in C. Hence the
family C is a nonempty family of finite character. Therefore, by Tukey’s lemma, C has a maximal
element; that is, X has a maximal chain.

(Hausdorff’s maximality principle ⇒ Zorn’s lemma) Assume that Hausdorff’s maximality principle is
true. Let (X, ≤) be a nonempty poset in which every chain has an upper bound. Due to Hausdorff’s
maximality principle, (X, ≤) has a maximal chain C. Let a be an upper bound of C. Suppose a is
not a maximal element of (X, ≤). Then there exists b ∈ X such that a < b. Then C ∪ {b} becomes
a larger chain than C, contradicting the assumption that C is a maximal chain in (X, ≤). Hence a
is a maximal element of (X, ≤). We have shown that if every chain in (X, ≤) has an upper bound in
(X, ≤), then (X, ≤) has a maximal element. This proves Zorn’s lemma.
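
In a finite poset the argument above can be carried out concretely, since a maximal chain can be built greedily without any choice principle. The sketch below assumes the poset ({1, . . . , 12}, divides); all function names are hypothetical.

```python
def maximal_chain(X, leq):
    """Greedily extend a chain; one pass suffices, because an element rejected
    once stays incomparable with some member of the growing chain."""
    chain = []
    for x in X:
        if all(leq(x, c) or leq(c, x) for c in chain):
            chain.append(x)
    return chain

def top(chain, leq):
    """The greatest element of a nonempty finite chain (an upper bound of it)."""
    return next(a for a in chain if all(leq(c, a) for c in chain))

def is_maximal(a, X, leq):
    return not any(leq(a, b) and a != b for b in X)

X = range(1, 13)
divides = lambda a, b: b % a == 0
chain = maximal_chain(X, divides)
a = top(chain, divides)
```

The upper bound a of the maximal chain is then a maximal element of the poset, as in the proof.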

(Zorn’s lemma ⇒ Zermelo’s well ordering principle) Assume that Zorn’s lemma is true. Let X be a
nonempty set. Consider the family of all well ordered subsets of X, with their respective well orders:

F = {(A, ≤A ) : A ⊆ X and ≤A is a well order on A}.

Notice that F is a set of ordered pairs, where the first element is a subset of X and the second element
is a well order on that subset. For (B, ≤B ), (C, ≤C ) in F, define (B, ≤B ) ≤ (C, ≤C ) if

B ⊆ C, ≤B ⊆ ≤C, and whenever b ∈ B and c ∈ C \ B, we have (b, c) ∈ ≤C.


We leave it as an exercise to show that ≤ is a partial order on F. We wish to see that the poset (F, ≤)
satisfies the hypotheses of Zorn’s lemma.

Let C be a nonempty chain in (F, ≤). We propose that (W, ≤W ) is an upper bound of C, where

W = ∪{A : (A, ≤A ) ∈ C}, ≤W = ∪{≤A : (A, ≤A ) ∈ C}.

Notice that the proposal goes through provided (W, ≤W ) ∈ F. We leave it as an exercise to show that
≤W is a linear order on W . We need to show that if P is a nonempty subset of W , then there exists
p0 ∈ P such that p0 ≤W p for each p ∈ P .
So, let P be a nonempty subset of W. Given p ∈ P, there exists (D, ≤D) ∈ C such that p ∈ D.
Consider the set Sp := {x ∈ P ∩ D : x ≤D p}. It has a minimum, say p0, as ≤D is a well order on D. We
claim that p0 is the minimum of P with respect to ≤W. For, suppose that there exists p1 ∈ P such
that p1 ≤W p0, p0 ≠ p1. Clearly, p1 ∉ D, otherwise p0 cannot be the minimum of Sp. So, let p1 ∈ E
for some pair (E, ≤E) ∈ C. As (D, ≤D) and (E, ≤E) are in the chain C, either D ⊆ E or E ⊆ D. But
p1 ∈ E and p1 ∉ D; so, E ⊈ D. Hence, D is a proper subset of E. That is, there exists b ∈ E such
that D = {x ∈ E : x ≤E b, x ≠ b}. It follows that p0 ≤E b, p0 ≠ b and b ≤E p1. This contradicts
p1 ≤W p0 as ≤W = ≤E on E.
Hence our proposal goes through, that is, C has an upper bound, namely, (W, ≤W ). By Zorn’s
lemma, F has a maximal element. Call such a maximal element (Y, ≤Y ). Notice that (Y, ≤Y ) is a
well ordered set. Now, if Y is a proper subset of X, then we have an element x ∈ X \ Y . We can
then extend ≤Y to a well order on Y ∪ {x}. This will contradict the maximality of (Y, ≤Y ). Hence,
Y = X. We rename ≤Y as ≤X and conclude that (X, ≤X ) is a well ordered set.

(Zermelo’s well ordering principle ⇒ AC) Assume that Zermelo’s well ordering principle is true. Let
{Xα}α∈L be a nonempty family of nonempty sets. Write X = ∪_{α∈L} Xα. By Zermelo’s well ordering
principle, we have a well order, say, ≤ on X. Hence, each set Xα, being a nonempty subset of X, has
a minimum mα. Define f on L by f(α) = mα for each α ∈ L. Then f ∈ ∏_{α∈L} Xα.

Example 8.4.9. Without using AC, show that Z and Q can be well ordered.
Ans: For x, y ∈ Z, x ≠ y, define x < y if either |x| < |y|, or |x| = |y| with x negative. In this
order, the elements of Z may be listed as 0, −1, 1, −2, 2, −3, 3, . . .. Clearly, (Z, ≤) is a well
ordered set.
For Q, recall that the set of positive rational numbers, Q+ , is denumerable. So, let r1 , r2 , . . . be
an enumeration of Q+ . Then enumerate Q by 0, −r1 , r1 , −r2 , r2 , . . .. This provides a well order on Q.
More directly, since Z is denumerable, any enumeration of it gives a well order on it; same for Q.
However, we do not know yet how to construct a well order on R. It is one of the reasons why
many mathematicians do not accept AC as one of the axioms of Set theory.
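
The well order on Z described above is easy to realise as a sort key. This is a small sketch with hypothetical names; it can only test finite sets of integers, whereas the well-order property concerns every nonempty subset.

```python
def z_key(x):
    """Sort key for the order 0, -1, 1, -2, 2, ...: compare |x| first,
    and for equal |x| put the negative number first."""
    return (abs(x), x > 0)

def least(S):
    """Least element of a nonempty finite set of integers in this order."""
    return min(S, key=z_key)

listing = sorted(range(-3, 4), key=z_key)
```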

We now discuss some applications of the axiom of choice. Due to Theorem 8.4.8, we are free to
use any of its equivalents when need arises. Further, since every set can be well ordered (assuming
AC), the Principle of transfinite induction (See Theorem 8.1.27.) can be used in any set with the help
of the well order.

Corollary 8.4.10. [Injection-Surjection] Let A and B be nonempty sets. Then there exists a one-one
function from A to B if and only if there exists an onto function from B to A.

Proof. Let f : A → B be a one-one function. Then f^{−1} : rng f → A is one-one and onto. Now, fix an
element a ∈ A. Define the function g : B → A by

g(x) = f^{−1}(x) if x ∈ rng f,
g(x) = a if x ∈ B \ rng f.

Then g is an onto function.


Conversely, let g : B → A be an onto function. For any x ∈ A, g^{−1}(x) is a nonempty subset of B.
Consider the family C = {g^{−1}(α) : α ∈ A}. Now, C is a nonempty family of nonempty sets. Further,
∪_{α∈A} g^{−1}(α) = B. By the axiom of choice, there exists a function f : A → B, where for each α ∈ A,
f(α) ∈ g^{−1}(α). Since g is a function, g^{−1}(α) ∩ g^{−1}(β) = ∅ for α ≠ β. So, f is one-one.
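
Both directions of this proof can be followed on finite sets, with functions stored as dictionaries. The sketch below uses hypothetical names; in the finite case the choice of a preimage is made by an explicit rule (the least preimage in sorted order), so no appeal to AC is needed.

```python
def onto_from_one_one(f, A, B):
    """Given a one-one f: A -> B (as a dict), build an onto g: B -> A."""
    a0 = next(iter(A))                      # the fixed element a of the proof
    inverse = {b: a for a, b in f.items()}  # f^{-1} on rng f
    return {b: inverse.get(b, a0) for b in B}

def one_one_from_onto(g, A):
    """Given an onto g: B -> A (as a dict), pick one preimage for each element of A."""
    picked = {}
    for b in sorted(g):
        picked.setdefault(g[b], b)          # keep only the first preimage seen
    return {a: picked[a] for a in A}

A, B = {1, 2}, {"x", "y", "z"}
f = {1: "x", 2: "z"}
g = onto_from_one_one(f, A, B)
f2 = one_one_from_onto(g, A)
```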

The Cardinal numbers are symbols that we associate with sets such that equinumerous sets get
the same symbol. We denote the cardinal number of a set A by |A| . If A is a finite set, then |A| is
the number of elements in A, which is some natural number m.
Generalizing the observation that “ |[m]| = |[n]| if and only if m = n” and “ |[m]| ≤ |[n]| if and
only if m ≤ n” we introduce the following definition to compare cardinal numbers.

Definition 8.4.11. Let A and B be sets. By |A| we mean the cardinality of A. By a cardinal
number, we mean the cardinality of some set. Comparison of cardinality of sets and some related
notation are defined as follows:
1. |A| ≤ |B| if there exists a one-one function from A to B.
2. |A| ≥ |B| if |B| ≤ |A| .
3. |A| = |B| if there is a bijection f : A → B.
4. |A| < |B| if |A| ≤ |B| and |A| 6= |B| .
5. |∅| = 0, |[n]| = n, |N| = ℵ0 .
6. If x = |A|, then by 2^x we mean |P(A)|.
7. If |B| = ℵk, then we write |P(B)| = ℵ_{k+1} = 2^{ℵk}.

Observe that, due to AC, |A| ≥ |B| if and only if there exists an onto function f : A → B.
Further, the Cantor-Schröder-Bernstein (CSB) theorem implies that |A| = |B| if and only if |A| ≤ |B|
and |B| ≤ |A|.

Example 8.4.12.
1. If α, β, γ are cardinal numbers such that α ≤ β and β ≤ γ, then α ≤ γ.
Ans: It says, if there exist a one-one function f from A to B and a one-one function g from B
to C, then there is a one-one function from A to C. This is true, as g ◦ f is such a function.
2. If α is any cardinal number, then α < 2^α.
Ans: If A is any set, then Cantor’s theorem says that there is no onto function from A to P(A).
That is, |A| ≠ |P(A)|. However, the function f : A → P(A) given by f(a) = {a} is a one-one
function. That is, |A| ≤ |P(A)|. Hence, the result follows.
3. The cardinal numbers we know till now are
0, 1, 2, 3, . . . , ℵ0 = |N|, ℵ1 = 2^{ℵ0} = |P(N)| = |R|, ℵ2 = 2^{ℵ1} = 2^{2^{ℵ0}} = |P(P(N))| = |P(R)|, . . . .

4. The cardinal numbers ℵ0 , ℵ1 , ℵ2 , . . . are infinite cardinal numbers. In general, |A| for any infinite
set A, is called an infinite cardinal.
5. The generalized continuum hypothesis by Cantor asserts that there is no cardinal number between
an infinite cardinal number α and 2^α.
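
Item 2 rests on Cantor’s theorem. Its usual proof is the diagonal argument: for any f : A → P(A), the set D = {a ∈ A : a ∉ f(a)} is not in the range of f. The sketch below (hypothetical names, a three-element A chosen for illustration) checks this exhaustively over all 8^3 functions from A to P(A).

```python
from itertools import combinations, product

def diagonal_set(A, f):
    """The set D = {a in A : a not in f(a)} used in the diagonal argument."""
    return frozenset(a for a in A if a not in f[a])

A = [0, 1, 2]
subsets = [frozenset(c) for r in range(len(A) + 1) for c in combinations(A, r)]

# Try every function f: A -> P(A); none of them has its diagonal set as a value.
never_onto = all(
    diagonal_set(A, dict(zip(A, values))) not in values
    for values in product(subsets, repeat=len(A))
)
```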

Again, generalizing on the operations on natural numbers, we obtain the following definition for
adding and multiplying cardinal numbers.

Definition 8.4.13. Let A and B be sets. Write α = |A| and β = |B| . Then, the addition and
multiplication of cardinal numbers are defined as follows:
1. If A ∩ B = ∅, then α + β := |A ∪ B| .
2. αβ := |A × B| .

We abbreviate αα · · · α (m times) to α^m.

Notice that we have a restriction in defining addition. But that is not a problem as the following
example shows.

Example 8.4.14. Let A and B be sets. Show the following:


1. There exists an object x which is not an element of A.
2. There exist sets C and D such that |C| = |A|, |B| = |D| and C ∩ D = ∅.

Ans: (1) Since |A| < |P(A)|, we cannot have P(A) ⊆ A. Hence there exists x ∈ P(A) which is not an element of A.
(2) Using (1), let x be an object which is not in A∪B, and let y be an object which is not in A∪B ∪{x}.
Write C = A × {x} and D = B × {y}. Then C and D satisfy all the requirements.
We will simply write C = A × {0} and D = B × {1} instead of using the objects x and y.
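
The tagged copies make cardinal addition well defined even when A and B overlap. A minimal sketch with finite sets and a hypothetical helper name:

```python
def disjoint_union(A, B):
    """Return the tagged copies C = A x {0}, D = B x {1} and their union."""
    C = {(a, 0) for a in A}
    D = {(b, 1) for b in B}
    return C, D, C | D

A, B = {1, 2, 3}, {2, 3, 4, 5}   # A and B share the elements 2 and 3
C, D, U = disjoint_union(A, B)
```

Here |A ∪ B| = 5, whereas |C ∪ D| = 7 = |A| + |B|.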

Example 8.4.15. Let A and B be nonempty sets. Then either |A| ≤ |B| or |B| ≤ |A| .
Proof. Let F be the family of all one-one functions f with dom f ⊆ A and rng f ⊆ B. Since A and B
are nonempty sets, F is nonempty. Consider the poset (F, ⊆). By Hausdorff’s maximality principle,
we have a maximal chain C. Write g = ∪_{f∈C} f.
It is easy to see that g is one-one, dom g = ∪_{f∈C} dom f and rng g = ∪_{f∈C} rng f.
If dom g is a proper subset of A and rng g is a proper subset of B, then take x ∈ A \ dom g and
y ∈ B \ rng g. Then h = g ∪ {(x, y)} is a one-one function in F and h ∉ C. Thus C ∪ {h} is a larger
chain, a contradiction to the maximality of C.
So either dom g = A, in which case |A| ≤ |B|; or rng g = B, in which case |B| ≤ |A|.
Example 8.4.16. If α is an infinite cardinal number, then α + α = α.
Proof. Let A be an infinite set with α = |A|. In view of Example 8.4.14,
α = |A × {0}| = |A × {1}|, where A × {0} and A × {1} are disjoint sets; so α + α = |A × {0, 1}|.
Let F be the set of all one-one functions f with dom f ⊆ A and rng f = dom f × {0, 1}.
Then (F, ⊆) is a nonempty poset. By Hausdorff’s maximality principle we have a maximal chain C.
Write g = ∪_{f∈C} f. We see that dom g = ∪_{f∈C} dom f. Now, if y ∈ rng g, then rng g = ∪_{f∈C} rng f implies
y ∈ rng f for some f. However, rng f = dom f × {0, 1}. So, there exists x ∈ dom f such that y = (x, 0)
or (x, 1). Since x ∈ dom g, we have y ∈ dom g × {0, 1}. Conversely, for any x ∈ dom g, we have
x ∈ dom f for some f and hence (x, 0), (x, 1) ∈ rng f ⊆ rng g. Therefore, rng g = dom g × {0, 1}.
Further, g is an onto function from dom g to rng g. We want to show that g is also one-one. On
the contrary, suppose that we have a, b ∈ dom g and c ∈ rng g such that a ≠ b and (a, c), (b, c) ∈ g. As
g = ∪_{f∈C} f, there exist f, h ∈ C such that (a, c) ∈ f and (b, c) ∈ h. Since C is a chain, either f ⊆ h
or h ⊆ f. If f ⊆ h, then (a, c), (b, c) ∈ h, a contradiction to the fact that h is one-one. Similarly,
h ⊆ f contradicts the fact that f is one-one. Thus, we conclude that g is one-one; and hence it is a
bijection from dom g to rng g = dom g × {0, 1}.


Is the set A \ dom g finite or infinite? Suppose A \ dom g is infinite. Then A \ dom g contains a
denumerable set, say D. There exists a bijection φ : D → D × {0, 1}. Then the function ψ = g ∪ φ,

ψ : dom g ∪ D → (dom g × {0, 1}) ∪ (D × {0, 1}) = (dom g ∪ D) × {0, 1},

is a bijection; so ψ ∈ F. Further, g is a proper subset of ψ, so that ψ ∉ C. Hence, C ∪ {ψ} is
a larger chain than C, a contradiction.
Hence, A \ dom g is finite. Write A \ dom g = {x1 , . . . , xn }. By the train-seat argument, we find a
bijection from A \ {x1 , . . . , xn } to A. That is, |A \ {x1 , . . . , xn }| = |A| = α. So, | dom g| = α, and g
is a bijection from dom g to dom g × {0, 1}. Therefore, α = α + α.
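
For the denumerable case the bijection can be written down explicitly, which is also the kind of map φ : D → D × {0, 1} used in the proof. The sketch below assumes the set {0, 1, 2, . . .} as the denumerable set and uses hypothetical names; the check covers only an initial segment.

```python
def split(n):
    """A bijection from {0, 1, 2, ...} onto {0, 1, 2, ...} x {0, 1}."""
    return (n // 2, n % 2)

def merge(x, tag):
    """The inverse map: (x, tag) goes back to 2x + tag."""
    return 2 * x + tag

round_trip = all(merge(*split(n)) == n for n in range(1000))
image = {split(n) for n in range(20)}
```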
A more general result is proved in the following example.
Example 8.4.17. If α is an infinite cardinal number, then α^2 = α.
A proof along the lines of the previous example can be constructed for α^2 = α; however, we give
another using Zorn’s lemma.
Proof. Let A be an infinite set with |A| = α. So, A has a denumerable subset N. Clearly, there is a
bijection f : N → N^2. (N^2 = N × N.) Define the set

X = {(M, g) : N ⊆ M ⊆ A and g : M → M^2 is a bijection}.

Define the partial order ≤ on X by

(M, g) ≤ (K, h) ⇔ (M ⊆ K and g is a restriction of h to M ).



It is easy to see that in the poset (X, ≤), every chain has an upper bound. By Zorn’s lemma, (X, ≤)
has a maximal element, say, (B, φ).
Consider the set C = A \ B; and write β = |B|, γ = |C|. Notice that the definition of X implies
that β^2 = β. Suppose β ≤ γ. Then there exists D ⊆ C such that |D| = β. Write E = (B ∪ D)^2 \ B^2
and ε = |E|. Then

β = |D| ≤ |D^2| ≤ ε = 3β^2 = 3β ≤ β^2 = β.

Hence ε = β. Thus, there exists a bijection ψ : D → E. Define the function ξ : (B ∪ D) → (B ∪ D)^2 by

ξ(x) = φ(x) if x ∈ B,
ξ(x) = ψ(x) if x ∈ D.

Clearly, ξ is a bijection that extends φ. This contradicts the maximality of (B, φ). Hence γ ≤ β.
Then

α ≤ α^2 = β^2 + βγ + γβ + γ^2 ≤ 4β^2 = 4β ≤ β^2 = β ≤ α.

Therefore, α^2 = α.
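
The proof starts from a bijection between a denumerable set N and N^2. One standard explicit choice, assumed here for illustration with N taken as {0, 1, 2, . . .}, is the Cantor pairing function; the function names are hypothetical and the check covers only finitely many values.

```python
from math import isqrt

def pair(x, y):
    """Cantor pairing: enumerate pairs along the diagonals x + y = 0, 1, 2, ..."""
    return (x + y) * (x + y + 1) // 2 + y

def unpair(z):
    """Inverse of pair: recover (x, y) from its position z."""
    w = (isqrt(8 * z + 1) - 1) // 2   # index of the diagonal containing z
    y = z - w * (w + 1) // 2          # offset of z along that diagonal
    return (w - y, y)

round_trip = all(pair(*unpair(z)) == z for z in range(2000))
first_diagonals = {pair(x, y) for x in range(10) for y in range(10) if x + y <= 9}
```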

Practice 8.4.18. Let α, β, γ be cardinal numbers with β ≤ γ. Show that α + β ≤ α + γ and αβ ≤ αγ.

Example 8.4.19. Every partial order on a nonempty set can be extended to a linear order.
Proof. Let (X, f) be a nonempty poset. We show that there exists a linear order g on X such that
f ⊆ g. Towards this, let F be the family of all partial orders h on X such that f ⊆ h. Since f ∈ F,
the poset (F, ⊆) is nonempty. By Hausdorff’s maximality principle, it has a maximal chain, say C.
Write g = ∪_{h∈C} h.

It is easy to verify that g is a partial order. Suppose that g is not a linear order on X. Then there
exist distinct x, y ∈ X such that (x, y) ∉ g and (y, x) ∉ g.
Define Lx = {z : (z, x) ∈ g} and My = {z : (y, z) ∈ g}. If z ∈ Lx ∩ My, then (y, x) ∈ g by
transitivity. Hence Lx ∩ My = ∅. Note that x ∈ Lx and y ∈ My. Write g1 = g ∪ (Lx × My). We show
that g1 is a partial order.
Reflexivity: Trivial.
Antisymmetry: Let (a, b), (b, a) ∈ g1 . Both of (a, b), (b, a) cannot be in Lx × My , as Lx ∩ My = ∅.
Suppose one of them is in Lx × My and the other is in g. Without loss of generality, assume that
(a, b) ∈ Lx × My and (b, a) ∈ g. This means (a, x) ∈ g, (y, b) ∈ g, and (b, a) ∈ g. But then (y, x) ∈ g,
a contradiction. Therefore, both of (a, b), (b, a) are in g, and hence a = b.
Transitivity: Let (a, b), (b, c) ∈ g1. Clearly, both of them cannot be in Lx × My, as Lx ∩ My = ∅.
If both of them are in g, we have nothing to prove. So let (a, b) ∈ Lx × My and (b, c) ∈ g. This
means (a, x) ∈ g, (y, b) ∈ g and (b, c) ∈ g. From the last two, c ∈ My. So (a, c) ∈ Lx × My ⊆ g1.
The case (a, b) ∈ g and (b, c) ∈ Lx × My is similar.
Notice that g1 ∉ C, as (x, y) ∈ g1 \ g. Then C ∪ {g1} is a larger chain than C, a contradiction.
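
On a finite set, the extension step g1 = g ∪ (Lx × My) from this proof can simply be repeated until no incomparable pair remains. The sketch below uses hypothetical names and assumes divisibility on the divisors of 12 as the starting partial order, stored as a set of pairs (a, b) meaning a ≤ b.

```python
def extend_to_linear_order(X, f):
    """Extend a partial order f on a finite list X to a linear order.

    Each pass applies the step of the proof: for an incomparable
    pair x, y, replace g by g ∪ (L_x × M_y).
    """
    g = set(f)
    while True:
        gap = next(((x, y) for x in X for y in X
                    if (x, y) not in g and (y, x) not in g), None)
        if gap is None:
            return g
        x, y = gap
        L_x = {z for z in X if (z, x) in g}
        M_y = {z for z in X if (y, z) in g}
        g |= {(a, b) for a in L_x for b in M_y}

def is_linear_order(X, g):
    reflexive = all((a, a) in g for a in X)
    antisymmetric = all(a == b for (a, b) in g if (b, a) in g)
    transitive = all((a, d) in g for (a, b) in g for (c, d) in g if b == c)
    total = all((a, b) in g or (b, a) in g for a in X for b in X)
    return reflexive and antisymmetric and transitive and total

X = [1, 2, 3, 4, 6, 12]
f = {(a, b) for a in X for b in X if b % a == 0}
g = extend_to_linear_order(X, f)
```

The loop terminates because each pass adds at least the pair (x, y) to g.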

Example 8.4.20. Let H be an Abelian subgroup of a non-Abelian group G. Show that there is a
maximal Abelian subgroup J of G such that H ⊆ J.
Ans: Let F be the set of all Abelian subgroups of G which contain H. Notice that H ∈ F. By
Hausdorff’s maximality principle there is a maximal chain C in F. Observe that H ∈ C, otherwise we
could extend C. Write J = ∪_{A∈C} A. It is easy to check that J is an Abelian subgroup of G. If J0 is any
Abelian subgroup that contains J properly, then J0 ∉ C and J0 ∈ F. Thus C ∪ {J0} is a larger chain
than C, which contradicts the maximality of C.

Example 8.4.21. Show that every vector space has a Hamel basis.
Ans: Recall that a Hamel basis of a vector space is a maximal linearly independent subset of the
vector space. Of course, the trivial vector space {0} has ∅ as its only Hamel basis.
Let V be a vector space. Recall that the family F of linearly independent subsets of V is a family
of finite character. By Tukey’s lemma, the family F, ordered as usual by the subset relation, has a
maximal element. Such a maximal element is a Hamel basis for V.

Example 8.4.22. Let (L, ≤) be a nonempty linearly ordered set. Prove that there exists W ⊆ L
such that ≤ well orders W , and for each x ∈ L, there exists y ∈ W such that x ≤ y. (For example,
for L = R, we can take W = N.)
Proof. Take an element ℓ ∈ L. The singleton set {ℓ} is well ordered by ≤. Let F be the family of
subsets of L satisfying the condition: “the set is well ordered by ≤ with ℓ as its minimum”.
Notice that {ℓ} ∈ F.
On F we define a partial order g by (A, B) ∈ g if and only if A ⊆ B and elements of B \ A are
upper bounds of A.
Then (F, g) is a nonempty poset. By Hausdorff’s maximality principle we have a maximal chain
in F, say C. Clearly, this chain starts with {ℓ}. Write W = ∪_{A∈C} A. Then it is clear that W ⊆ L.
To show that W is well ordered, let B ⊆ W be a nonempty set. Let b ∈ B. Then there is a set
Cb ∈ C such that b ∈ Cb . Recall that Cb is well ordered. Consider the initial segment I(b) = {x ∈
Cb : x < b}. Then (I(b) ∪ {b}) ∩ B is a nonempty subset of Cb ; hence it has a minimum, say, w in
(I(b) ∪ {b}) ∩ B.
We claim that w is the minimum of B. Suppose, on the contrary, that there exists y ∈ B such
that y < w. As w ≤ b, we see that y < b. If y ∈ Cb, then y ∈ I(b), and hence y ∈ (I(b) ∪ {b}) ∩ B,
which implies that w ≤ y. So y ∉ Cb. In that case, y can only belong to a set in C that comes after
Cb (which is a proper superset of Cb). But then y is an upper bound of Cb, contradicting y < b.


Thus W is well ordered. Now, suppose there exists p ∈ L which is a strict upper bound of W . Then
W0 := W ∪ {p} is well ordered and C ∪ {W0 } is a larger chain than C, contradicting the maximality
of C. That is, no element in L is strictly larger than every element of W . In other words, for each
x ∈ L, there exists y ∈ W such that x ≤ y.

Exercise 8.4.23.
1. Let A and B be two nonempty sets. Show that there is a set C such that C ∩ A = ∅ and
|C| = |B| .
2. Let X be the set of all infinite sequences formed using 0, 1 and let Y be the set of all infinite
sequences formed using 0, 1, 2. Which one is larger, |X| or |Y | ?
3. Let α, β be infinite cardinal numbers with α ≤ β. Then α + β = β and αβ = β.
4. Show that R is not a finite dimensional vector space over Q.
5. Let X be a vector space. Let A and S be nonempty subsets of X, where A is linearly independent,
A ⊆ S and span(S) = X. Show that there exists a Hamel basis B such that A ⊆ B ⊆ S.
6. Let A be a nonempty set and let F be a field. Write F^A := {f : f is a function from A to F}.
Let Γ := {f ∈ F^A : {a ∈ A : f(a) ≠ 0} is finite}. Show that Γ is a vector space over F with
respect to point-wise addition of functions and point-wise scalar multiplication. Also show that
every vector space V is isomorphic to Γ for some suitable choice of A.
Chapter 9

Graphs - I

9.1 Basic concepts

Experiment
‘Start from a dot. Move through each line exactly once. Draw it.’ Which of the following pictures
can be drawn? What if we want the ‘starting dot to be the finishing dot’ ?

Later, we shall see a theorem by Euler addressing this question.

Definition 9.1.1. A pseudograph G is a pair (V, E) where V is a nonempty set and E is a multiset
of unordered pairs {u, v} of elements of V, where u = v is allowed. The set V is called the vertex set and its elements are called
vertices. The multiset E is called the edge set and its elements are called edges.
  
Example 9.1.2. G = ({1, 2, 3, 4}, {{1, 1}, {1, 2}, {2, 2}, {3, 4}, {3, 4}}) is a pseudograph.

Discussion 9.1.3. A pseudograph can be represented in picture in the following way.


1. Put different points on the paper for vertices and label them.
2. If {u, v} appears in E some k times, draw k distinct lines joining the points u and v.
3. A loop at u is drawn if {u, u} ∈ E.

Example 9.1.4. A picture for the pseudograph in Example 9.1.2 is given in Figure 9.1.

Figure 9.1: A pseudograph


Definition 9.1.5. Let G = (V, E) be a graph. Then the following definitions and notations are in
order.
1. We sometimes use V(G) in place of V for the vertex set and E(G) in place of E for the edge set.
2. The number |V(G)| is called the order of the graph G, and is denoted by |G|. By ‖G‖, we
denote the number of edges of G. A graph with n vertices and m edges is called an (n, m)
graph.
3. An edge {u, v} is sometimes denoted uv. An edge uu is called a loop. The vertices u and v are
called the end vertices of the edge uv. Let e be an edge. We say ‘e is incident on u’ to mean
that ‘u is an end vertex of e’.
4. If uv is an edge in G, then we say that the vertices u and v are adjacent in G, and also that u
is a neighbor of v. We write u ∼ v to denote that u is adjacent to v.
5. If v ∈ V(G), then N(v) or N_G(v) denotes the set of neighbors of v in G, and |N(v)| is called
the degree of v. It is usually denoted by d_G(v) or d(v). A vertex of degree 0 is called isolated.
A vertex of degree one is called a pendant vertex.
6. Two edges e1 and e2 are called adjacent if they have a common end vertex.
7. A graph is said to be non-trivial if it has at least one edge; else it is called a trivial graph.
8. A multigraph is a pseudograph without loops. A multigraph is a simple graph if no edge
appears twice.
9. In this book, we consider only simple graphs with finite vertex sets. Thus, by a graph, we will
mean a simple graph with a finite vertex set, unless stated otherwise.
10. A set of vertices or edges is said to be independent if no two of them are adjacent. The
maximum size of an independent vertex set is called the independence number of G, denoted
α(G).

Discussion 9.1.6. Note that a graph is an algebraic structure, namely, a pair of sets satisfying some
conditions. However, it is easy to describe and carry out the arguments with a pictorial representation
of a graph. Henceforth, the pictorial representations are used to describe graphs and to provide our
arguments, whenever required. There is no loss of generality in doing this.

Example 9.1.7. Consider the graph G in Figure 9.2. The vertex 12 is an isolated vertex whereas
the vertex 10 is a pendant vertex. We have N(1) = {2, 4, 7}, d(1) = 3. The vertices 1 and 6 are not
adjacent. The set {9, 10, 11, 2, 4, 7} is an independent vertex set. The set {{1, 2}, {8, 10}, {4, 5}} is an
independent edge set.

Figure 9.2: A graph G.

Definition 9.1.8. Let G = (V, E) be a graph on n vertices, say V = {1, . . . , n}. Then, G is said to
be a
1. Complete graph, denoted Kn, if each pair of vertices in G is adjacent.
2. Path graph, denoted Pn, if E = {{i, i + 1} : 1 ≤ i ≤ n − 1}.
3. Cycle graph, denoted Cn, if E = {{i, i + 1} : 1 ≤ i ≤ n − 1} ∪ {{n, 1}}.
4. Bipartite graph if V = V1 ∪ V2 such that |V1|, |V2| ≥ 1, V1 ∩ V2 = ∅ and e = {u, v} ∈ E only if
either u ∈ V1 and v ∈ V2, or u ∈ V2 and v ∈ V1.
5. Complete bipartite graph, denoted Kr,s, if E = {{i, j} : 1 ≤ i ≤ r, 1 ≤ j ≤ s}.

Figure 9.3: Pn and Cn.

The importance of the labels of the vertices depends on the context. At this point, even
if we interchange the labels of the vertices, we still call them a complete graph or a path graph or a
cycle or a complete bipartite graph.

Figure 9.4: Some well-known families of graphs (panels: K1,1, K1,2, K2,2, K2,3; K1, K2, K3, K4, K5; C3, C4, C5, C6; P1, P2, P3, P4, P5)

Quiz 9.1.9. What is the maximum number of edges possible in a simple graph of order n?
Lemma 9.1.10. [Handshaking lemma] In any (simple) graph G, ∑_{v∈V} d(v) = 2|E|. Thus, the number
of vertices of odd degree is even.

Proof. Each edge contributes 2 to the sum ∑_{v∈V} d(v). Hence, ∑_{v∈V} d(v) = 2|E|. Note that

2|E| = ∑_{v∈V} d(v) = ∑_{v : d(v) is odd} d(v) + ∑_{v : d(v) is even} d(v).

Since ∑_{v : d(v) is even} d(v) is even, the above implies that ∑_{v : d(v) is odd} d(v) must be even as well. A sum of
odd numbers is even only when the number of summands is even. Therefore, the number of vertices of odd degree is even.
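
The lemma is easy to test numerically. The following Python sketch is only an illustration: the helper names `degrees` and `random_simple_graph` are made up for this example, and the random graphs are arbitrary test data, not graphs from the notes.

```python
from itertools import combinations
import random

def degrees(vertices, edges):
    """Return a dict mapping each vertex to its degree in a simple graph."""
    deg = {v: 0 for v in vertices}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def random_simple_graph(n, p, rng):
    """Vertices 1..n; each possible edge is kept independently with probability p."""
    vertices = list(range(1, n + 1))
    edges = [e for e in combinations(vertices, 2) if rng.random() < p]
    return vertices, edges

rng = random.Random(0)  # fixed seed, only so that the run is reproducible
for _ in range(100):
    V, E = random_simple_graph(8, 0.4, rng)
    deg = degrees(V, E)
    # Every edge is counted once at each of its two end vertices.
    assert sum(deg.values()) == 2 * len(E)
    # Hence the number of odd-degree vertices is even.
    assert sum(1 for v in V if deg[v] % 2 == 1) % 2 == 0
```

A check like this does not replace the proof; it only confirms the statement on the sampled graphs.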

Quiz 9.1.11. In a party of 27 persons, prove that someone must have an even number of friends
assuming that friendship is mutual.

Example 9.1.12. The graph in Figure 9.5 is called the Petersen graph. We shall use it as an
example in many places.

Figure 9.5: The Petersen graph

Proposition 9.1.13. In a graph G with n = |G| ≥ 2, there are two vertices of equal degree.

Proof. If G has two or more isolated vertices, we are done. First, suppose G has exactly one isolated
vertex. Then, the remaining n − 1 vertices have degrees between 1 and n − 2 and hence by PHP, the
result follows. Otherwise, G has no isolated vertex. Then G has n vertices whose degrees lie between
1 and n − 1. Again by PHP, we get the required result.

Exercise 9.1.14. 1. Let G = (V, E) be a graph with a vertex v ∈ V of odd degree. Then, prove
that there exists a vertex u ∈ V such that there is a path from v to u and deg(u) is also odd.
2. Let G = (V, E) be a graph having exactly two vertices, say u and v, of odd degree. Then, prove
that there is a path in G connecting u and v.

Definition 9.1.15. Let G = (V, E) be a graph. Then,


1. the minimum degree of a vertex in G is denoted by δ(G) and the maximum degree of a
vertex in G is denoted by ∆(G).
2. a graph G is called k-regular if d(v) = k for all v ∈ V (G).
3. a 3-regular graph is called cubic.
Example 9.1.16. 1. The cycle graph Cn is 2-regular whereas the complete graph Kn is (n − 1)-
regular.
2. The Petersen graph and the complete graph K4 are cubic.
3. The graph P4 is not regular.
4. Consider the graph G in Figure 9.2. We have δ(G) = 0 and ∆(G) = 3.

Quiz 9.1.17. Can we have a cubic graph on 5 vertices?



Definition 9.1.18. Let G = (V (G), E(G)) be a graph.


1. A graph H is called a subgraph of G if V(H) ⊆ V(G) and E(H) ⊆ E(G).
2. A subgraph H of G is called a spanning subgraph if V(G) = V(H).
3. A k-regular spanning subgraph is called a k-factor of G.
4. If U ⊆ V(G), then the induced subgraph of G on U is denoted by ⟨U⟩ = (U, E), where the
edge set E = {{u, v} ∈ E(G) : u, v ∈ U}.
Example 9.1.19. 1. Consider the graph G in Figure 9.2.

(a) Let H1 be the graph with V(H1) = {6, 7, 8, 9, 10, 12} and E(H1) = {{6, 7}, {9, 10}}. Then,
H1 is not a subgraph of G as {9, 10} ∉ E(G).
(b) Let H2 be the graph with V(H2) = {6, 7, 8, 9, 10, 12} and E(H2) = {{6, 7}, {8, 10}}. Then,
H2 is a subgraph but not an induced subgraph of G as {8, 9} ∈ E(G) but not in E(H2).
(c) Let H3 be the induced subgraph of G on the vertex set {6, 7, 8, 9, 10, 12}. Then, verify that
E(H3) = {{6, 7}, {8, 9}, {8, 10}}.
(d) The graph G does not have a 1-factor.

2. A complete graph has a 1-factor if and only if it has an even order.


3. The Petersen graph has many 1-factors. One of them is obtained by selecting the edges
{1, 6}, {2, 7}, {3, 8}, {4, 9} and {5, 10}.

Quiz 9.1.20. Consider K8 on the vertex set {1, 2, . . . , 8}. How many 1-factors does it have?

Definition 9.1.21. Let G = (V (G), E(G)) be a graph.


1. If v ∈ V(G), then the graph G − v, called the vertex deleted subgraph, is obtained from G
by deleting v and all the edges that are incident with v.
2. If e ∈ E(G), then the graph G − e = (V, E(G) \ {e}) is called the edge deleted subgraph.
3. If u, v ∈ V(G) such that u ≁ v, then G + uv = (V, E(G) ∪ {uv}) is called the graph obtained
by edge addition.
4. The complement Ḡ of a graph G is defined as (V(G), E), where E = {uv : u ≠ v, uv ∉ E(G)}.
Example 9.1.22. 1. Consider the graph G in Figure 9.2. Let H2 be the graph with V(H2) =
{6, 7, 8, 9, 10, 12} and E(H2) = {{6, 7}, {8, 10}}. Consider the edge e = {8, 9}. Then, H2 + e is
the induced subgraph ⟨{6, 7, 8, 9, 10, 12}⟩ and H2 − 8 = ⟨{6, 7, 9, 10, 12}⟩.
2. See Figure 9.6 for two examples of complement graphs.

Figure 9.6: Complement graphs (panels: C4, C̄4, C5, C̄5 = C5)

3. The complement of K3 contains 3 isolated points/vertices.


4. For any graph G, ‖G‖ + ‖Ḡ‖ = \binom{|G|}{2}.

5. In any graph G of order n, d_G(v) + d_Ḡ(v) = n − 1. Thus, ∆(G) + ∆(Ḡ) ≥ n − 1.
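
Items 4 and 5 can be checked on a small case. The sketch below is a hedged illustration: `complement` and `degree` are hypothetical helper names, and the cycle C5 on the vertices 1, . . . , 5 is chosen only as convenient test data.

```python
from itertools import combinations
from math import comb

def complement(vertices, edges):
    """Edges of the complement: all pairs of distinct vertices that are not edges."""
    present = {frozenset(e) for e in edges}
    return [p for p in combinations(sorted(vertices), 2)
            if frozenset(p) not in present]

def degree(v, edges):
    """Number of edges having v as an end vertex."""
    return sum(1 for e in edges if v in e)

# C5 on the vertices 1..5 (test data chosen for this sketch)
V = [1, 2, 3, 4, 5]
E = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]
E_bar = complement(V, E)

# Item 4: the edges of G and of its complement together make up all pairs.
assert len(E) + len(E_bar) == comb(len(V), 2)
# Item 5: in G and its complement, v is adjacent to every other vertex exactly once.
assert all(degree(v, E) + degree(v, E_bar) == len(V) - 1 for v in V)
```

Here `E_bar` is again a 5-cycle, in line with the picture of C̄5 in Figure 9.6.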



Quiz 9.1.23. 1. Characterize graphs G such that ∆(G) + ∆(Ḡ) = n − 1.

2. Can we have a graph G such that ∆(G) + ∆(Ḡ) = n?

3. Show that a k-regular simple graph on n vertices exists if and only if kn is even and n ≥ k + 1.

Definition 9.1.24. Let G = (V (G), E(G)) and H = (V (H), E(H)) be two graphs.

1. Then their intersection, denoted G ∩ H, is defined as (V (G) ∩ V (H), E(G) ∩ E(H)).

2. Then their union, denoted G ∪ H, is defined as (V (G) ∪ V (H), E(G) ∪ E(H)).

3. Then their disjoint union is the union while treating the vertex sets as disjoint sets.

4. If V (G) ∩ V (H) = ∅, then their join, denoted G + H has V (G) ∪ V (H) as the vertex set and
E(G) ∪ E(H) ∪ {uv : u ∈ V (G), v ∈ V (H)} as the edge set.

5. Then their Cartesian product, denoted G × H, has V(G) × V(H) as the vertex set and the
edge set consists of all elements {(u, v), (u′, v′)}, where either u = u′ and {v, v′} ∈ E(H), or
v = v′ and {u, u′} ∈ E(G).
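
The join and the Cartesian product translate directly into code. The following Python sketch is one possible transcription, assuming that vertices are arbitrary hashable labels and that edges are stored as frozensets; the function names `norm`, `join` and `cartesian_product` are invented for this illustration.

```python
from itertools import combinations

def norm(edges):
    """Represent an edge list as a set of frozensets (unordered pairs)."""
    return {frozenset(e) for e in edges}

def join(VG, EG, VH, EH):
    """Join G + H of two graphs with disjoint vertex sets (item 4)."""
    assert not set(VG) & set(VH), "the join needs disjoint vertex sets"
    V = list(VG) + list(VH)
    E = norm(EG) | norm(EH) | {frozenset((u, v)) for u in VG for v in VH}
    return V, E

def cartesian_product(VG, EG, VH, EH):
    """Cartesian product G x H (item 5)."""
    EG, EH = norm(EG), norm(EH)
    V = [(u, v) for u in VG for v in VH]
    E = set()
    for (u, v), (u2, v2) in combinations(V, 2):
        same_first = u == u2 and frozenset((v, v2)) in EH
        same_second = v == v2 and frozenset((u, u2)) in EG
        if same_first or same_second:
            E.add(frozenset(((u, v), (u2, v2))))
    return V, E

# K2 on {1, 2} joined with K3 on {'a', 'b', 'c'}: all 10 pairs become edges.
V, E = join([1, 2], [(1, 2)], ["a", "b", "c"],
            [("a", "b"), ("b", "c"), ("a", "c")])
print(len(V), len(E))
```

In general the product has |V(G)| · ‖H‖ + |V(H)| · ‖G‖ edges, since each edge of H is copied once per vertex of G and vice versa.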

Example 9.1.25. Two graphs G and H with their intersection G ∩ H, their union G ∪ H and their
disjoint union G1 are shown in Figure 9.7. Further, the joins K2 + K3 and K̄2 + K̄2 are also given.
Note that K2 + K3 = K5 and K̄2 + K̄2 = C4.

Figure 9.7: Examples of graph constructions (see Definition 9.1.24). Panels: G, H, G ∪ H, G ∩ H; G1, K2 + K3, K̄2 + K̄2; X, Y, X × Y, Y × Y.

Quiz 9.1.26. 1. What is the complement of the disjoint union of G and H?

2. Is Km,n = K̄m + K̄n?

9.2 Connectedness
Definition 9.2.1. Let G = (V, E) be a graph and let u, v ∈ V .
1. A u-v walk in G is a finite sequence of vertices [u = v1 , v2 , · · · , vk−1 , vk = v] such that vi vi+1 ∈ E,
for all i = 1, · · · , k − 1.
2. The length of a walk is the number of edges on it.
3. A walk is called a trail if edges on the walk are not repeated.
4. A u-v walk is called a path if the vertices involved are all distinct, except that u and v can
be the same.
5. If P is a u-v path with u ≠ v, then we sometimes call u and v the end vertices of P and
the remaining vertices on P the internal vertices.
6. A walk (trail, path) is called closed if u = v.
7. The length of a path is the number of edges on it. A path can have length 0.
8. A closed path is called a cycle/circuit. Thus, in a simple graph a cycle has length at least 3.
A cycle (walk, path) of length k is also written as a k-cycle (k-walk, k-path).
Example 9.2.2. 1. Take G = K5 with vertex set {1, 2, 3, 4, 5}.
(a) Then [1, 2, 3, 2, 1, 2, 5, 4, 3] is an 8-walk in G and [1, 2, 2, 1] is not a walk.
(b) The walk [1, 2, 3, 4, 5, 2, 4, 1] is a closed trail.
(c) The walk [1, 2, 3, 5, 4, 1] is a closed path, i.e., it is a 5-cycle.
(d) The maximum length of a cycle in G is 5 and the minimum length of a cycle in G is 3.
(e) The number of 3-cycles in G is \binom{5}{3} = 10.
(f) Verify that the number of 4-cycles in G is not \binom{5}{4}. Can it be 3 × \binom{5}{4}?
2. Let G be the Petersen graph. Then, G has a 9-cycle, namely, [6, 8, 10, 5, 4, 3, 2, 7, 9, 6]. But, G
has no 10-cycles. We shall see this when we discuss the Hamiltonian graphs.
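
The cycle counts for K5 in item 1 can be explored by brute force. The sketch below is an illustration under one assumption: a cycle is identified with its set of edges, so that rotations and reversals of the same vertex sequence count once. The function name `count_cycles_complete` is made up for this example.

```python
from itertools import permutations
from math import comb

def count_cycles_complete(n, k):
    """Count the k-cycles (k >= 3) of the complete graph on the vertices 1..n.

    Every sequence of k distinct vertices closes up to a cycle in K_n;
    sequences with the same edge set describe the same cycle.
    """
    seen = set()
    for seq in permutations(range(1, n + 1), k):
        edges = frozenset(frozenset((seq[i], seq[(i + 1) % k])) for i in range(k))
        seen.add(edges)
    return len(seen)

print(count_cycles_complete(5, 3), comb(5, 3))      # 3-cycles of K5
print(count_cycles_complete(5, 4), 3 * comb(5, 4))  # 4-cycles of K5
```

Each choice of 4 vertices carries three different 4-cycles, which is where the factor 3 in item (f) comes from.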

Proposition 9.2.3. Let u and v be distinct vertices in a graph G. Let W = [u = u1, . . . , uk = v] be
a walk. Then W contains a u-v path.

Proof. If no vertex on W repeats, then W is itself a path. So, let ui = uj for some i < j. Now,
consider the walk W1 = [u1 , . . . , ui−1 , uj , uj+1 , . . . uk ]. This is also a u-v walk but of shorter length.
Thus, using induction on the length of the walk, the desired result follows.
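
The shortening step of the proof can be carried out iteratively. The following Python sketch is one way to do this, not necessarily the order of deletions used in the proof: whenever a vertex reappears, the stretch of the walk since its earlier occurrence is cut out. The name `walk_to_path` is hypothetical.

```python
def walk_to_path(walk):
    """Shorten a u-v walk (a list of vertices) to a u-v path contained in it."""
    path = []
    position = {}  # vertex -> its index in `path`
    for v in walk:
        if v in position:
            # v was visited before: discard everything after that visit.
            for dropped in path[position[v] + 1:]:
                del position[dropped]
            del path[position[v] + 1:]
        else:
            position[v] = len(path)
            path.append(v)
    return path

# The 8-walk of Example 9.2.2 in K5, from 1 to 3.
print(walk_to_path([1, 2, 3, 2, 1, 2, 5, 4, 3]))
```

Consecutive vertices of the result are consecutive somewhere in the original walk, so the result is a walk with no repeated vertex, that is, a path.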

Definition 9.2.4. Let G = (V, E) be a graph.


1. The distance d(u, v) between two vertices u, v ∈ V, u ≠ v, is the shortest length of a u-v path
in G. If no such path exists, the distance is taken to be ∞.
2. The greatest distance between any two vertices in a graph G is called the diameter of G, and
is denoted by diam(G).
3. Let dist_v = max_{u∈V} d(v, u). The radius is min_{v∈V} dist_v and the center is the set of all vertices v
for which dist_v is the radius.
4. The girth, denoted g(G), of a graph G is the minimum length of a cycle contained in G. If G
has no cycle, then we put g(G) = ∞.

Example 9.2.5. The Petersen graph has diameter 2, radius 2 and each vertex is in the center.
Further, its girth is 5.
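
These values can be recomputed with breadth-first search. The sketch below assumes its own labelling of the Petersen graph (outer 5-cycle 0, . . . , 4, inner vertices 5, . . . , 9), which need not match the labels of Figure 9.5; the function names are invented for this illustration, and the eccentricity computation assumes a connected graph.

```python
from collections import deque

def petersen():
    """Petersen graph as an adjacency dict, with an assumed labelling."""
    adj = {v: set() for v in range(10)}
    def add(u, v):
        adj[u].add(v)
        adj[v].add(u)
    for i in range(5):
        add(i, (i + 1) % 5)          # outer 5-cycle
        add(i, i + 5)                # spokes
        add(5 + i, 5 + (i + 2) % 5)  # inner pentagram
    return adj

def bfs_distances(adj, source):
    """Distances from `source` to every vertex reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def girth(adj):
    """Length of a shortest cycle, or None for an acyclic graph."""
    best = None
    for s in adj:
        dist, parent = {s: 0}, {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w], parent[w] = dist[u] + 1, u
                    queue.append(w)
                elif parent[u] != w:
                    # A non-tree edge closes a cycle of at most this length.
                    cand = dist[u] + dist[w] + 1
                    if best is None or cand < best:
                        best = cand
    return best

adj = petersen()
ecc = {v: max(bfs_distances(adj, v).values()) for v in adj}  # dist_v
diameter, radius = max(ecc.values()), min(ecc.values())
center = [v for v in adj if ecc[v] == radius]
print(diameter, radius, len(center), girth(adj))
```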

Exercise 9.2.6. 1. Determine the diameter, radius, center and girth of the following graphs:
Pn , Cn , Kn and Kn,m .
2. Let G be a graph. Then, show that the distance function d(u, v) is a metric on V (G). That is,
it satisfies
(a) d(u, v) ≥ 0 for all u, v ∈ V (G) and d(u, v) = 0 if and only if u = v,
(b) d(u, v) = d(v, u) for all u, v ∈ V (G) and
(c) d(u, v) ≤ d(u, w) + d(w, v) for all u, v, w ∈ V(G).

Proposition 9.2.7. Let G be a graph with kGk ≥ 1 and d(v) ≥ 2, for each vertex except one, say v1 .
Then, G has a cycle.

Proof. Consider a longest path [v1 , . . . , vk ] in G (as V (G) is finite, such a path exists). As d(vk ) ≥ 2,
it must be adjacent to some vertex from v2 , . . . , vk−2 ; otherwise, we can extend it to a longer path.
Choose i ≥ 2 such that vi is adjacent to vk . Then, [vi , vi+1 , . . . , vk , vi ] is a cycle.

Proposition 9.2.8. Let P and Q be two different u-v paths in G. Then, P ∪ Q contains a cycle.

Proof. Imagine a signal was sent from u to v via P and was returned back from v to u via Q. Call an
edge ‘dead’ if signal has passed through it twice. Notice that each vertex receives the signal as many
times as it sends the signal.
Is E(P ) = E(Q)? No, otherwise both P and Q are the same paths.
So, there are some ‘alive’ edges. Get an alive edge v1v2. There must be an alive edge v2v3;
otherwise, v2 is incident to just one alive edge and some dead edges so that v2 has received more
signal than it has sent. Similarly, get v3v4 and so on. Stop at the first instance of repetition of a
vertex: [v1, v2, · · · , vi, vi+1, · · · , vj = vi]. Then, [vi, vi+1, · · · , vj = vi] is a cycle.



Alternate. Consider the graph H = (V(P) ∪ V(Q), E(P) ∆ E(Q)), where ∆ is the symmetric
difference. Notice that E(H) ≠ ∅, otherwise P = Q. As the degree of each vertex in the multigraph
P ∪ Q is even and H is obtained after deleting pairs of multiple edges, each vertex in H has even
degree. Hence, by Proposition 9.2.7, applied to a component of H that contains an edge, H has a cycle.
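
The alternate argument can be followed step by step in code. The sketch below is an illustration that assumes the two paths are given as vertex lists with the same end vertices and are different; the names `path_edges` and `cycle_in_symmetric_difference` are hypothetical. It forms E(P) ∆ E(Q), confirms that all degrees are even, and then walks along edges without turning straight back until a vertex repeats.

```python
def path_edges(path):
    """Edge set of a path given as a list of vertices."""
    return {frozenset((path[i], path[i + 1])) for i in range(len(path) - 1)}

def cycle_in_symmetric_difference(P, Q):
    """Return a cycle (closed vertex list) using only edges of E(P) ∆ E(Q).

    Assumes P and Q are different paths with the same end vertices.
    """
    H = path_edges(P) ^ path_edges(Q)
    adj = {}
    for e in H:
        a, b = tuple(e)
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    # Every vertex that meets an edge of H has even, hence at least 2, degree.
    assert all(len(nbrs) % 2 == 0 for nbrs in adj.values())
    walk, prev = [next(iter(adj))], None
    while True:
        cur = walk[-1]
        nxt = next(w for w in adj[cur] if w != prev)
        if nxt in walk:
            return walk[walk.index(nxt):] + [nxt]
        walk.append(nxt)
        prev = cur

print(cycle_in_symmetric_difference([1, 2, 3, 4], [1, 5, 3, 4]))
```

In this small example the common edge {3, 4} cancels, and the remaining four edges form the cycle through 1, 2, 3 and 5.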

Proposition 9.2.9. Every graph G containing a cycle satisfies g(G) ≤ 2 diam(G) + 1.

Proof. Let C = [v1 , v2 , . . . , vk , v1 ] be the shortest cycle and diam(G) = r. If k ≥ 2r + 2, then consider
the path P = [v1 , v2 , . . . , vr+2 ]. Since the length of P is r + 1 and diam(G) = r, there is a vr+2 -v1
path R of length at most r. Note that P and R are different v1 -vr+2 paths. By Proposition 9.2.8, the
closed walk P ∪ R of length at most 2r + 1 contains a cycle. Hence, the length of this cycle is at most
2r + 1, a contradiction to C having the smallest length k ≥ 2r + 2.

Definition 9.2.10. Let C = [v1, . . . , vk = v1] be a cycle in a graph G. An edge vivj in G is called a
chord of C if it is not an edge of C. G is called chordal if each cycle of length at least 4 has a chord.
G is acyclic if it has no cycles.

For example, complete graphs are chordal, so are the acyclic graphs. The Petersen graph is not
chordal.
Quiz 9.2.11. 1. How many acyclic graphs are there on the vertex set {1, 2, 3}?
2. How many chordal graphs are there on the vertex set {1, 2, 3, 4}?

Definition 9.2.12. 1. A graph G is said to be a maximal graph with respect to a property P
if G has property P and no proper supergraph of G has the property P. The term minimal
graph is defined similarly.

Notice!
The class of all graphs with that property is the poset here. So, the maximality and the
minimality are defined naturally.

2. A complete subgraph of G is called a clique. The maximum order of a clique is called the clique
number of G. It is denoted ω(G).
3. A graph G is called connected if there is a u-v path, for each u, v ∈ V (G).
4. A graph which is not connected is called disconnected. If G is a disconnected graph, then a
maximal connected subgraph is called a component or sometimes a connected component.

Example 9.2.13. Consider the graph G shown in Figure 9.2.


1. Some cliques in G are ⟨{8, 10}⟩, ⟨{2}⟩. The first is a maximal clique. Notice that every vertex
is a clique. Similarly, each edge is a clique. Here ω(G) = 2.
2. The graph G is not connected. It has four connected components, namely, ⟨{8, 9, 10, 11}⟩,
⟨{1, 2, 3, 4, 5, 6, 7}⟩, ⟨{12}⟩ and ⟨{13}⟩.

Quiz 9.2.14. What is ω(G) for the Petersen graph?


Proposition 9.2.15. If δ(G) ≥ 2, then G has a path of length δ(G) and a cycle of length at least
δ(G) + 1.

Proof. Let [v1, · · · , vk] be a longest path in G. As d(vk) ≥ 2, vk is adjacent to some vertex v ≠ vk−1.
If v is not on the path, then [v1, · · · , vk, v] is a path longer than [v1, · · · , vk], a contradiction.
Hence every neighbor of vk lies on the path. So, let i be the smallest positive integer such that vi is adjacent to vk. Then

δ(G) ≤ d(vk ) ≤ |{vi , vi+1 , · · · , vk−1 }|.

Hence, the cycle C = [vi , vi+1 , · · · , vk , vi ] has length at least δ(G) + 1 and the length of the path
P = [vi , vi+1 , · · · , vk ] is at least δ(G).

Definition 9.2.16. The edge density, denoted ε(G), is defined to be the number |E(G)|/|V(G)|.
Quiz 9.2.17. 1. When does the deletion of a vertex reduce the edge density of a graph?
2. Is δ(G)/2 a lower bound for ε(G)?
3. Suppose that ε(G) ≥ δ(G). Should we have a vertex v ∈ V (G) with ε(G) ≥ d(v)?

Proposition 9.2.18. Let G be a graph with ‖G‖ ≥ 1. Then G has a subgraph H with δ(H) > ε(H) ≥
ε(G).

Proof. Write n = |G|. If ε(G) < δ(G), then H = G. Otherwise, there exists v ∈ V(G) with ε(G) ≥ d(v). Put
G1 = G − v. Then, using ‖G1‖ = ‖G‖ − d(v) = n ε(G) − d(v) and ε(G) ≥ d(v), we have

ε(G1) = ‖G1‖/(n − 1) = ε(G) + (ε(G) − d(v))/(n − 1) ≥ ε(G).

If ε(G1) < δ(G1), then H = G1. Otherwise, there exists u ∈ V(G1) with ε(G1) ≥ d(u). Put
G2 = G1 − u. Again, we have ε(G2) ≥ ε(G1) ≥ ε(G).
Continuing as above, we note that “Initially ε(G) > 0. At the i-th stage, we obtained the subgraph
Gi satisfying |V(Gi)| = |G| − i, ε(Gi) ≥ ε(Gi−1). That is, we have been reducing the number of vertices
and the corresponding edge densities have not been decreasing.” Hence, this process must stop before we
reach a single vertex, as its edge density is 0.
So, let us assume that the process stops at H. Then, ‘ε(H) < δ(H)’ must be true, or else, the
process would not stop at H and hence the required result follows.
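
The deletion process of this proof is short enough to run. The Python sketch below is an illustration with invented names (`density`, `dense_subgraph`) and a made-up test graph, namely K4 on the vertices 1, . . . , 4 with a path 4, 5, 6 attached. Exact fractions are used so that comparisons with ε are not affected by rounding.

```python
from fractions import Fraction

def density(adj):
    """Edge density |E| / |V| as an exact fraction."""
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    return Fraction(m, len(adj))

def dense_subgraph(adj):
    """Delete vertices v with d(v) <= eps(current graph) until none is left.

    Assumes the input graph has at least one edge, as in the proposition.
    """
    adj = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    while True:
        eps = density(adj)
        low = [v for v in adj if len(adj[v]) <= eps]
        if not low:
            return adj  # now every degree exceeds the density
        v = low[0]
        for w in adj[v]:
            adj[w].discard(v)
        del adj[v]

# K4 on 1..4 with a pendant path 4 - 5 - 6 (test data for this sketch)
G = {1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2, 4},
     4: {1, 2, 3, 5}, 5: {4, 6}, 6: {5}}
H = dense_subgraph(G)
print(sorted(H), density(G), density(H))
```

On this input the vertices 6 and 5 are removed in turn and the process stops at K4, whose density 3/2 is at least the original density 4/3 and is smaller than its minimum degree 3.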

9.3 Isomorphism in graphs


Definition 9.3.1. Two graphs G = (V, E) and G′ = (V′, E′) are said to be isomorphic if there is
a bijection f : V → V′ such that u ∼ v in G if and only if f(u) ∼ f(v) in G′, for each u, v ∈ V. In
other words, an isomorphism is a bijection between the vertex sets which preserves adjacency. We
write G ≅ G′ to mean that G is isomorphic to G′.

Example 9.3.2. Consider the graphs in Figure 9.8. We observe the following:

Figure 9.8: F ≅ G but F ≇ H (panels: F, G, H)

1. The graph F is not isomorphic to H as α(F), the independence number of F, is 3 whereas
α(H) = 2. Alternately, H has a 3-cycle, whereas F does not have a 3-cycle.

2. The map f : V(F) → V(G) defined by f(1) = 1, f(2) = 5, f(3) = 3, f(4) = 4, f(5) = 2 and
f(6) = 6 gives an isomorphism. So, F ≅ G.

Check the adjacency


F G
1 → 2, 4, 6 f (1) = 1 → f (2) = 5, f (4) = 4, f (6) = 6
3 → 2, 4, 6 f (3) = 3 → f (2) = 5, f (4) = 4, f (6) = 6
5 → 2, 4, 6 f (5) = 2 → f (2) = 5, f (4) = 4, f (6) = 6
All edges are covered, no need to check any further.
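
The same adjacency check can be done mechanically. In the sketch below, the edge sets of F and G are transcribed from the table above (each of 1, 3, 5 is joined to 2, 4, 6 in F; each of 1, 3, 2 is joined to 5, 4, 6 in G), which is an assumption about the figure; the function name `is_isomorphism` is made up for this example.

```python
from itertools import combinations

def is_isomorphism(f, V1, E1, V2, E2):
    """True if f is a bijection V1 -> V2 with uv in E1 exactly when f(u)f(v) in E2."""
    if sorted(f) != sorted(V1) or sorted(f.values()) != sorted(V2):
        return False
    return all(
        (frozenset((u, v)) in E1) == (frozenset((f[u], f[v])) in E2)
        for u, v in combinations(V1, 2)
    )

V = [1, 2, 3, 4, 5, 6]
# Edge sets read off the adjacency table, not from the figure itself.
E_F = {frozenset((a, b)) for a in (1, 3, 5) for b in (2, 4, 6)}
E_G = {frozenset((a, b)) for a in (1, 2, 3) for b in (4, 5, 6)}
f = {1: 1, 2: 5, 3: 3, 4: 4, 5: 2, 6: 6}
identity = {v: v for v in V}

print(is_isomorphism(f, V, E_F, V, E_G))
print(is_isomorphism(identity, V, E_F, V, E_G))
```

The second call shows that not every bijection works: the identity map sends the edge {1, 2} of F to a non-edge of G.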

Discussion 9.3.3. [Isomorphism] Let F and G be isomorphic under f : V(F) → V(G). Relabel
each vertex v ∈ F as f(v). Call the new graph F′. Then, F′ = G. This is so, as V(F′) = V(G) and
E(F′) = E(G) due to the isomorphic nature of the function f.

Practice 9.3.4. Take the graphs F and G of Figure 9.8. Take the isomorphism f(1) = 1, f(2) = 5,
f(3) = 3, f(4) = 4, f(5) = 2 and f(6) = 6. Obtain the graph F′ as described in Discussion 9.3.3. List
V(F′) and E(F′). List V(G) and E(G). Notice that they are the same.

Definition 9.3.5. A graph G is called self-complementary if G ≅ Ḡ.

Example 9.3.6. Let G be a self-complementary graph on n vertices. Then ‖G‖ = n(n − 1)/4 as
‖G‖ = ‖Ḡ‖ and there are \binom{n}{2} edges in the complete graph. Thus, either n = 4k or n = 4k + 1. Now,
verify the following:

1. The path P4 = [1, 2, 3, 4] is self-complementary. An isomorphism from P4 to P̄4 is described by
f(i) = 2i (mod 5).
2. The cycle C5 = [0, 1, 2, 3, 4, 0] is self-complementary. An isomorphism from C5 to C̄5 is described
by f(i) = 2i (mod 5).
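
Both verifications can be automated. The sketch below is an illustration with invented helper names; it takes P4 on the vertices 1, 2, 3, 4, an assumption made so that the values 2i (mod 5) stay inside the vertex set, and C5 on the vertices 0, . . . , 4. A bijection of the vertex set is an isomorphism onto the complement exactly when it carries the edge set onto the set of non-edges.

```python
from itertools import combinations

def complement_edges(vertices, edges):
    """All pairs of distinct vertices that are not edges."""
    return {frozenset(p) for p in combinations(vertices, 2)} - edges

def maps_onto_complement(vertices, edges, f):
    """True if f permutes the vertices and sends E(G) onto E of the complement."""
    if sorted(f(v) for v in vertices) != sorted(vertices):
        return False
    image = {frozenset((f(u), f(v))) for u, v in map(tuple, edges)}
    return image == complement_edges(vertices, edges)

def doubling(i):
    """The map i -> 2i (mod 5) from the example."""
    return (2 * i) % 5

P4_vertices = [1, 2, 3, 4]
P4_edges = {frozenset(e) for e in [(1, 2), (2, 3), (3, 4)]}
C5_vertices = [0, 1, 2, 3, 4]
C5_edges = {frozenset((i, (i + 1) % 5)) for i in range(5)}

print(maps_onto_complement(P4_vertices, P4_edges, doubling))
print(maps_onto_complement(C5_vertices, C5_edges, doubling))
```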
Exercise 9.3.7. 1. Construct a self-complementary graph of order 4k.
2. Construct a self-complementary graph of order 4k + 1.

Definition 9.3.8. A graph invariant is a function which assigns the same value (output) to iso-
morphic graphs.

Observe that some of the graph invariants are: |G|, ‖G‖, ∆(G), δ(G), ω(G), α(G), ε(G), and the
multiset {d(v) : v ∈ V(G)}.

Exercise 9.3.9. How many graphs are there with vertex set {1, 2, . . . , n}? Do you find it easy if we
ask for non-isomorphic graphs (try for n = 4)?

Proposition 9.3.10. Let G and H be graphs and let f : G → H be an isomorphism. For any
v ∈ V(G), G − v ≅ H − f(v).

Proof. Consider the bijection g : V(G − v) → V(H − f(v)) described by g = f|_{V(G−v)}, the restriction of f to V(G − v).

Definition 9.3.11. An isomorphism of G to G is called an automorphism.

Example 9.3.12. 1. The identity map, denoted e is always an automorphism on any graph.
2. Any permutation in Sn is an automorphism of Kn .
3. There are only two automorphisms of a path P8 . Is it true for Pn , for n ≥ 3?

Proposition 9.3.13. Let G be a graph and let Γ(G) denote the set of all automorphisms of G. Then,
Γ(G) forms a group under composition of functions with e as the identity element.

Proof. Let V (G) = {1, 2, . . . , n} and σ, µ ∈ Γ(G) be two automorphisms. Then,

ij ∈ E(G) ⇔ µ(i)µ(j) ∈ E(G) ⇔ (σ ◦ µ)(i)(σ ◦ µ)(j) ∈ E(G).

Thus, σ ◦ µ is an automorphism. Moreover, µ−1 , σ −1 are indeed automorphisms.

Example 9.3.14. Determine Γ(C5).

Ans: Consider C5 = [1, . . . , 5, 1]. Note that σ = (2, 3, 4, 5, 1) is an automorphism. Hence,
{e, σ, σ^2, σ^3, σ^4} ⊆ Γ(C5) as σ^5 = e.
Now, let µ be an automorphism with µ(1) = i. Put τ = σ^{6−i} µ. Then, τ is an automorphism with
τ(1) = 1. If τ(2) = 2, then the adjacency structure implies that τ(j) = j for j = 3, 4, 5. In this case,
σ^{6−i} µ = e; consequently, µ = σ^{i−6} = σ^{i−1}.
If τ(2) ≠ 2, then τ(2) = 5 as 1 is adjacent only to the vertices 2 and 5. In this case, verify that
τ(3) = 4 and hence τ = (2, 5)(3, 4) is the reflection which fixes 1. Let us denote the permutation
(2, 5)(3, 4) by ρ. Then, Γ(C5) is the group generated by σ and ρ and hence Γ(C5) has 10 elements.

Example 9.3.15. Notice that Γ(C5) has a subgroup Γ1 = {e, σ, σ^2, σ^3, σ^4}, with σ^5 = e, of order
5. Let G be a subgraph of C5 obtained by deleting some (zero allowed) edges. If ‖G‖ = 5, then
|Γ(G)| = 10. If ‖G‖ = 0, then |Γ(G)| = |S5| = 5!. If ‖G‖ = 4, then |Γ(G)| = 2. If ‖G‖ = 3, then
|Γ(G)| = 2 or 4. If ‖G‖ = 2, then |Γ(G)| = 4 or 8. If ‖G‖ = 1, then |Γ(G)| = 2 × 3!. Thus, there is no
such subgraph G of C5 whose automorphism group is Γ1.
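
The orders listed here can be recomputed by brute force over all 5! permutations. The sketch below is an illustration with an invented function name; it uses the fact that a permutation of the vertices is an automorphism exactly when it maps the edge set onto itself.

```python
from itertools import combinations, permutations

V = [1, 2, 3, 4, 5]
C5 = [frozenset((i, i % 5 + 1)) for i in V]  # edges 12, 23, 34, 45, 51

def automorphism_count(vertices, edges):
    """Number of vertex permutations that map the edge set onto itself."""
    edges = set(edges)
    count = 0
    for perm in permutations(vertices):
        s = dict(zip(vertices, perm))
        if {frozenset((s[u], s[v])) for u, v in map(tuple, edges)} == edges:
            count += 1
    return count

# For each number m of retained edges, collect the possible values of |Γ(G)|.
orders = {
    m: sorted({automorphism_count(V, sub) for sub in combinations(C5, m)})
    for m in range(6)
}
print(orders)
```

None of the computed orders equals 5, in agreement with the conclusion of the example.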
Exercise 9.3.16. 1. Determine the graphs G for which Γ(G) = Sn , the group of all permutations
of 1, . . . , n.
2. Compute Γ(G) for some graphs of small order.
3. Let G be a subgraph of H of the same order. Explore more about the relationship between Γ(G)
and Γ(H).
4. List the automorphisms of the following graph.

[Figure: graph with vertex labels 5, 3, 6, 2, 1]

5. Determine the automorphism groups of the following graphs. Are the three groups isomorphic?
9.4 Trees
Definition 9.4.1. Let G be a connected graph. A vertex v of G is called a cut-vertex if G − v is
disconnected. Thus, G − v is connected if and only if v is not a cut-vertex.

Theorem 9.4.2. Let G be a connected graph with |G| ≥ 2 and let v ∈ V (G).
1. If d(v) = 1, then G − v is connected, so that v is never a cut-vertex.
2. If G − v is connected, then either d(v) = 1 or v is on a cycle.

Proof. 1. Let u, w ∈ V(G − v), u ≠ w. As G is connected, there is a u-w path P in G. The vertex v
cannot be an internal vertex of P, as each internal vertex has degree at least 2. Hence, the path P is
available in G − v. So, G − v is connected.
2. Assume that G − v is connected. If d_G(v) = 1, then there is nothing to prove. So, assume that
d(v) ≥ 2. We need to show that v is on a cycle in G.
Let u and w be two distinct neighbors of v in G. As G − v is connected there is a path, say
[u = u1, . . . , uk = w], in G − v. Then [u = u1, . . . , uk = w, v, u] is a cycle in G containing v.

Quiz 9.4.3. Let G be a graph and v be a vertex on a cycle. Can G − v be disconnected?


Definition 9.4.4. Let G be a graph. An edge e in G is called a cut-edge or a bridge if G − e has
more connected components than G.

Proposition 9.4.5. Let G be connected and let e = uv be a cut-edge. Then G−e has two components,
one containing u and the other containing v.

Proof. If G − e is not disconnected, then by definition, e cannot be a cut-edge. So, G − e has at least
two components. Let Gu (respectively, Gv ) be the component containing the vertex u (respectively,
v). We claim that these are the only components.
Let w ∈ V (G). Since G is connected, there is a path, say P , from w to u. Moreover, either P
contains v as its internal vertex or P does not contain v. In the first case, w ∈ V (Gv ) and in the latter
case, w ∈ V (Gu ). Thus, every vertex of G is either in V (Gv ) or in V (Gu ) and hence the required
result follows.

Theorem 9.4.6. Let G be a graph and let e be an edge. Then, e is a cut-edge if and only if e is not
on a cycle.

Proof. Suppose that e = uv is a cut-edge of G. Let F be the component of G that contains e. Then,
by Proposition 9.4.5, F − e has two components, namely, Fu that contains u and Fv that contains v.
If possible, let C = [u, v = v1, . . . , vk = u] be a cycle containing e = uv. Then [v = v1, . . . , vk = u]
is a v-u path in F − e. Hence, F − e is still connected, a contradiction. Thus, e cannot be on any
cycle.
Conversely, let e = uv be an edge which is not on any cycle. Now, suppose that F is the component
of G that contains e. We need to show that F − e is disconnected.
If possible, let there be a u-v path, say [u = u1, . . . , uk = v], in F − e. Then, [v, u = u1, . . . , uk = v]
is a cycle containing e, a contradiction to e not lying on any cycle.
Hence, e is a cut-edge of F. Consequently, e is a cut-edge of G.
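
The definition of a cut-edge can be applied literally: delete the edge and count components. The Python sketch below does this with a small union-find structure; the function names and the test graph (a triangle 1, 2, 3 with a tail 3, 4, 5) are made up for this illustration, and the method is a direct check rather than an efficient bridge-finding algorithm.

```python
def components(vertices, edges):
    """Number of connected components, computed with union-find."""
    parent = {v: v for v in vertices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in vertices})

def cut_edges(vertices, edges):
    """Edges whose deletion increases the number of components."""
    base = components(vertices, edges)
    return [e for e in edges
            if components(vertices, [f for f in edges if f != e]) > base]

# A triangle 1-2-3 with a tail 3-4-5 (test data for this sketch)
V = [1, 2, 3, 4, 5]
E = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5)]
print(cut_edges(V, E))
```

The three triangle edges lie on a cycle and are not reported, while the two tail edges lie on no cycle and are reported, as the theorem predicts.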

Exercise 9.4.7. Let G be a graph on n > 2 vertices. If ‖G‖ > \binom{n−1}{2}, is G necessarily connected?
Give an ‘if and only if’ condition for the connectedness of a graph with exactly \binom{n−1}{2} edges.

Definition 9.4.8. A connected acyclic graph is called a tree. A forest is a graph whose components
are trees.

Thus, any acyclic graph is a forest and any component of it is a tree.

Proposition 9.4.9. A tree on n vertices has n − 1 edges.

Proof. We apply strong induction on n. The case n = 1 is clear. Take a tree T on n ≥ 2 vertices and delete an edge e.
As T is acyclic, e is a cut-edge (Theorem 9.4.6), so by Proposition 9.4.5 we
get two subtrees T1, T2 of order n1, n2, respectively, where n1 + n2 = n. So, E(T) = E(T1) ∪ E(T2) ∪ {e}.
By induction hypothesis ‖T‖ = ‖T1‖ + ‖T2‖ + 1 = n1 − 1 + n2 − 1 + 1 = n1 + n2 − 1 = n − 1.

Corollary 9.4.10. A tree with at least two vertices has at least two pendant vertices.
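Proof. Let T be any tree on n ≥ 2 vertices. Then ∑_{v∈V(T)} d(v) = 2‖T‖ = 2(n − 1) = 2n − 2.
As T is connected and n ≥ 2, each d(v) ≥ 1. If at most one vertex had degree 1, the sum would be
at least 1 + 2(n − 1) = 2n − 1. Hence, T has at least two vertices of degree 1.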

Theorem 9.4.11. Let G be a graph with n vertices. Then the following are equivalent:
1. G is a tree.
2. G is a maximal acyclic graph.
3. G is a minimal connected graph.
4. G is acyclic and it has n − 1 edges.
5. G is connected and it has n − 1 edges.
6. Between any two distinct vertices of G there exists a unique path.

Proof. (1)⇔(2). Let G be a tree. On the contrary, suppose that G is not maximal acyclic. Then there
exist u, v ∈ V (G) such that G + uv is acyclic. If in G, there exists a u-v path, then G + uv would have
a cycle containing the edge uv. So, in G, there is no u-v path. It contradicts the assumption that G
is a tree and hence connected.
Conversely, suppose that G is maximal acyclic. If G is not a tree, then G has at least two
components. Let u and v be two vertices from different components, so that there exists no u-v path
in G. Thus G + uv has no cycle. This contradicts the assumption that G is maximal acyclic.
(1)⇔(3). Let G be a tree. Then G is connected. Let e = uv be an edge of G. By (2), e is the only
u-v path. Then G − e is disconnected. Hence G is minimal connected.
Conversely, suppose G is minimal connected. If G is not a tree, then there is a cycle in G. Let u, v
be two adjacent vertices on such a cycle. Now, G − uv is still connected. It contradicts the assumption
that G is minimal connected.
(1)⇔(4). Let G be a tree. Then G is acyclic, and by Proposition 9.4.9, G has n − 1 edges.
Conversely, let G be acyclic with n − 1 edges. If possible, let G be disconnected. Then G
has components G1, . . . , Gk, k ≥ 2. As G is acyclic, each Gi is a tree on, say ni ≥ 1 vertices, with
∑_{i=1}^{k} ni = n. As k ≥ 2, we have ‖G‖ = ∑_{i=1}^{k} (ni − 1) = n − k < n − 1 = ‖G‖, a contradiction.
(1)⇔(5). Let G be a tree. Then G is connected, and by Proposition 9.4.9, G has n − 1 edges.
Conversely, assume that G is connected and G has n − 1 edges. On the contrary, suppose that G
is not a tree. Then G has a cycle. Select an edge e from the cycle. Notice that G − e is connected.
We go on selecting edges from G that lie on cycles and keep removing them, until we get an acyclic
graph H. Since the edges that are being removed lie on some cycle, the graph H is still connected.
So, by definition, H is a tree on n vertices. Thus, by Proposition 9.4.9, ‖H‖ = n − 1. But, in the
above argument, we have deleted at least one edge and hence, ‖G‖ ≥ n. This gives a contradiction to
‖G‖ = n − 1.
(1)⇔(6). Let G be a tree. Since G is connected, between any two distinct vertices of G there exists
a path. If there exist more than one path between u, v ∈ V (G), then by Proposition 9.2.8 any two of
these u-v paths will contain a cycle. This is not possible as G is acyclic. Hence the uniqueness of such
a path.
Conversely, let (6) hold. Then G is clearly connected. Further, if G has a cycle, then that cycle
would provide two paths between any two vertices on the cycle. Hence G is acyclic, i.e., G is a tree.
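
The equivalence (1)⇔(5) gives a simple computational test for being a tree: check that the graph is connected and has n − 1 edges. The following is a minimal sketch in Python, assuming a simple graph given as a list of vertices and a list of edges; the name is_tree and this input format are illustrative choices, not part of the notes.

```python
from collections import defaultdict

def is_tree(vertices, edges):
    """Return True if the simple graph (vertices, edges) is connected with n - 1 edges."""
    vertices = list(vertices)
    if not vertices or len(edges) != len(vertices) - 1:
        return False
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Depth-first search from an arbitrary vertex to test connectivity.
    seen = {vertices[0]}
    stack = [vertices[0]]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return len(seen) == len(vertices)
```

For instance, a triangle together with an isolated vertex has 4 vertices and 3 edges but is disconnected, so the test rejects it.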

Proposition 9.4.12. The center of a tree consists of either a single vertex or two adjacent vertices.

Proof. Let T be a tree of radius k. Since the center contains at least one vertex, let u be a vertex in
the center of T . Now, let v be another vertex in the center. We claim that u is adjacent to v.
On the contrary, suppose u ≁ v. Then, there exists a path from u to v, denoted P (u, v), with
at least one internal vertex, say w. Let x be any pendant (d(x) = 1) vertex of T . Then, either
v ∈ P (x, w) or v ∉ P (x, w). In the latter case, check that ‖P (x, w)‖ < ‖P (x, v)‖ ≤ k.

[Figure: the two cases for the position of the pendant vertex x relative to the path u, w, v]

If v ∈ P (x, w), then u ∉ P (x, w) and ‖P (x, w)‖ < ‖P (x, u)‖ ≤ k. Thus in either case, the distance
from w to any pendant vertex is less than k. Hence, k is not the radius, a contradiction. Thus, uv ∈ T .
We cannot have another vertex in the center, or else, we will have a C3 in T , a contradiction.

Exercise 9.4.13. Draw a tree on 8 vertices. Label V (T ) as 1, . . . , 8 so that each vertex i ≥ 2 is
adjacent to exactly one element of {1, 2, . . . , i − 1}.

Proposition 9.4.14. Let T be a tree on n vertices. Let G be a graph with δ(G) ≥ n − 1. Then G has
a subgraph H with H ≅ T.

Proof. We prove the result by induction on n. The result is trivially true if n = 1 or 2. So, let the
result be true for every tree on n − 1 vertices and take a tree T on n vertices. Also, suppose that G
is any graph with δ(G) ≥ n − 1.
Due to Corollary 9.4.10, let v ∈ V (T ) with d(v) = 1. Take u ∈ V (T ) such that uv ∈ E(T ). Now,
consider the tree T1 = T − v. Then, δ(G) ≥ n − 1 > n − 2. Hence, by induction hypothesis, G has
a subgraph H such that H ≅ T1 under a map, say φ. Let h ∈ V (H) such that φ(h) = u. Since
δ(G) ≥ n − 1 and H has only n − 2 vertices other than h, the vertex h has a neighbor in G, say h1 ,
that is not a vertex of H. Now, map h1 to v to get the required result.

Definition 9.4.15. Let T be a tree on n > 2 vertices, labeled by the n integers {1, 2, . . . , n}.
The Prüfer code PT of T is a sequence X of size n − 2 created in the following way.

1. Find the pendant vertex with the largest label, say v1 . Let u1 be the neighbor of v1 . Put X(1) = u1 .
2. Let T1 = T − v1 and apply Step 1 to T1 to find X(2).
3. Repeat the procedure to obtain X(3), . . . , X(n − 2).
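
The three steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the tree is given as an edge list on the labels 1, . . . , n, the function name prufer_code is hypothetical, and the sample edge list is the 6-vertex tree read off from the steps recorded in Figure 9.10. Note that the code deletes the largest pendant vertex, as in Definition 9.4.15; many other texts delete the smallest one.

```python
def prufer_code(n, edges):
    """Prüfer code of a labeled tree on {1, ..., n}, n > 2, deleting the LARGEST pendant each time."""
    adj = {v: set() for v in range(1, n + 1)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    code = []
    for _ in range(n - 2):
        # Pendant vertices are exactly those with one remaining neighbor.
        leaf = max(v for v in adj if len(adj[v]) == 1)
        (neighbor,) = adj[leaf]
        code.append(neighbor)
        adj[neighbor].remove(leaf)
        del adj[leaf]
    return code

# Assumed sample tree: edges {1,6}, {6,2}, {2,3}, {2,4}, {2,5}.
print(prufer_code(6, [(1, 6), (6, 2), (2, 3), (2, 4), (2, 5)]))  # [2, 2, 2, 6]
```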

Example 9.4.16. Consider the tree T in Figure 9.9.

[Figure 9.9: A tree T on 6 vertices, with edges {1, 6}, {6, 2}, {2, 3}, {2, 4} and {2, 5}]

Then, the above process proceeds as shown in Figure 9.10.

Exercise 9.4.17. In the above process, prove that uj = i, for some j, if and only if d(i) ≥ 2.

Example 9.4.18. Can we get back the original tree T from the Prüfer code 2, 2, 2, 6?
Answer: Yes. The process of getting back the original tree is as follows.
1. Plot points 1, 2, . . . , 6.
2. Since each ui is either 2 or 6, the vertices 2 and 6 are not pendant vertices. Hence, the pendant
vertices in T must be {1, 3, 4, 5}. Thus, the algorithm implies that the largest pendant 5 must
be adjacent to (the first element of the sequence) 2.
206 CHAPTER 9. GRAPHS - I

Step   Pendant vi   Neighbor ui   PT = X(1), X(2), . . .   Ti = T − vi (edges)
1      5            2             2                        {1, 6}, {6, 2}, {2, 3}, {2, 4}
2      4            2             2, 2                     {1, 6}, {6, 2}, {2, 3}
3      3            2             2, 2, 2                  {1, 6}, {6, 2}
4      2            6             2, 2, 2, 6               {1, 6}

Figure 9.10: A tree T on 6 vertices

3. At step 1, the vertex 5 was deleted. Hence, V (T1 ) = {1, 2, 3, 4, 6} with the given sequence 2, 2, 6.
So, the pendants in T1 are {1, 3, 4} and the vertex 4 (largest pendant) is adjacent to 2.
4. Now, V (T2 ) = {1, 2, 3, 6} with the sequence as 2, 6. So, 3 is adjacent to 2.
5. Now, V (T3 ) = {1, 2, 6} with the sequence as 6. So, the pendants in the current T are {1, 2} and
2 is adjacent to 6.
6. Lastly, V (T4 ) = {1, 6}. As the process ends with K2 and we have only two vertices left, they
must be adjacent.

The corresponding sequence of figures is as follows.

[Figure: six panels on the points 1, . . . , 6, one for each step above; the edges {5, 2}, {4, 2}, {3, 2}, {2, 6} and {1, 6} are added in turn.]
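
The recovery procedure of Example 9.4.18 can be sketched in Python as follows. This is a minimal illustration, assuming the convention of Definition 9.4.15 (the largest pendant is deleted first) and that the code has exactly two entries fewer than the label set; the function name tree_from_prufer is hypothetical.

```python
def tree_from_prufer(labels, code):
    """Rebuild the edges of the tree on `labels` whose Prüfer code (largest pendant first) is `code`."""
    remaining = set(labels)
    edges = []
    for i, u in enumerate(code):
        later = set(code[i:])
        # The current pendants are the remaining vertices absent from the rest of the code.
        leaf = max(v for v in remaining if v not in later)
        edges.append((leaf, u))
        remaining.remove(leaf)
    # The process ends with K2: the last two vertices must be adjacent.
    a, b = sorted(remaining)
    edges.append((a, b))
    return edges

print(tree_from_prufer(range(1, 7), [2, 2, 2, 6]))
# [(5, 2), (4, 2), (3, 2), (2, 6), (1, 6)]
```

The edges appear in the same order as in steps 2 to 6 of the example.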

Proposition 9.4.19. Let T be a tree on the vertex set {1, 2, . . . , n}. Then, d(v) ≥ 2 if and only if v
appears in the Prüfer code PT . Thus, {v : v ∉ PT } are precisely the pendant vertices in T .

Proof. Let d(v) ≥ 2. Since the process ends with an edge, there is a stage, say i, where d(v) decreases
strictly. Thus, at the (i − 1)-th stage, v was adjacent to a pendant vertex w and at the i-th stage w
was deleted and thus, v appears in the sequence.
Conversely, let v appear in the sequence at the k-th stage for the first time. Then, the tree Tk had
a pendant vertex w of highest label that was adjacent to v. Note that Tk − w is a tree with at least
two vertices. Thus, d(v) ≥ dTk (v) ≥ 2.

Exercise 9.4.20. Prove that in the Prüfer code of T a vertex v appears exactly d(v) − 1 times. [Hint:
Use induction and if v is the largest pendant adjacent to w and T′ = T − v then PT = w, PT′ .]

Proposition 9.4.21. Let T and T′ be two trees on the same vertex set of integers. If PT = PT′ , then
T = T′.

Proof. The statement is trivially true for |T | = 3. Assume that the statement holds for |T | < n. Now,
let T and T′ be two trees with vertex set {1, 2, . . . , n} and PT = PT′ . As PT = PT′ , T and T′ have the
same set of pendants. Further, the largest labeled pendant w is adjacent to the vertex X(1) in both
the trees. Thus, PT−w = PT′−w and hence, by induction hypothesis, T − w = T′ − w. Thus, by PMI,
T = T′.

Proposition 9.4.22. Let S be a set of n ≥ 3 integers and let X be a sequence of length n − 2 of
elements from S. Then, there is a tree T with V (T ) = S and PT = X.

Proof. Verify the statement for |S| = 3. Now, let the statement hold for all sets of n ≥ 3 integers
and consider a set S of n + 1 integers and a sequence X of length n − 1 of elements of S.
Let v = max{x ∈ S : x ∉ X}, S′ = S \ {v} and X′ = X(2), . . . , X(n − 1). By definition, note that
v ≠ X(i), for 2 ≤ i ≤ n − 1. Thus, X′ is a sequence of elements of S′ of length n − 2. As |S′| = n, by
induction hypothesis, there is a tree T′ with V (T′) = S′ and PT′ = X′.
Let T be the tree obtained by adding a new pendant v at the vertex X(1) of T′. In T′, the vertices
X(i), for i ≥ 2, were not available as pendants and now in T the vertex X(1) is also not available as
a pendant (here some X(i)’s may be the same). Let R′ = {x ∈ S′ : x ∉ X′} be the pendants in T′.
Then, the set of pendants in T is (R′ ∪ {v}) \ {X(1)} which equals {x ∈ S : x ∉ X}. Thus, v is the
pendant of T of maximum label. Hence, PT = X.

Theorem 9.4.23. [A. Cayley, 1889, Quart. J. Math] Let n ≥ 3. Then, there are n^{n−2} different
trees with vertex set {1, 2, . . . , n}.

Proof. Let F be the class of trees on the vertex set {1, 2, . . . , n} and let G be the class of
(n − 2)-sequences of {1, 2, . . . , n}. By Propositions 9.4.21 and 9.4.22, the function f : F → G defined by
f (T ) = PT , the Prüfer code, is a one-one and onto mapping. As |G| = n^{n−2}, the required result follows.
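
For small n the count n^{n−2} can be checked by brute force. The sketch below does not use Prüfer codes; it simply enumerates all sets of n − 1 edges of Kn and keeps the acyclic ones, which are trees by the characterization (1)⇔(4) of trees. The function names are illustrative, and the approach is only practical for small n.

```python
from itertools import combinations

def find(parent, x):
    """Root of x in a union-find forest, with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def count_labeled_trees(n):
    """Count the (n - 1)-edge subsets of K_n on {1, ..., n} that contain no cycle."""
    vertices = range(1, n + 1)
    all_edges = list(combinations(vertices, 2))
    count = 0
    for subset in combinations(all_edges, n - 1):
        parent = {v: v for v in vertices}
        acyclic = True
        for u, v in subset:
            ru, rv = find(parent, u), find(parent, v)
            if ru == rv:          # u and v already connected: this edge closes a cycle
                acyclic = False
                break
            parent[ru] = rv
        if acyclic:
            count += 1
    return count

for n in range(3, 7):
    print(n, count_labeled_trees(n), n ** (n - 2))
```

Each printed line shows n, the brute-force count, and n^{n−2}; the last two columns agree.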

Exercise 9.4.24. 1. Find out all non-isomorphic trees of order 6 or less.
2. Count with diameter: how many non-isomorphic trees are there of order 7?
3. Show that every automorphism of a tree fixes a vertex or an edge.
4. Give a class of trees T with |Γ(T )| = 6.
5. Let T be a tree, σ ∈ Γ(T ), u ∈ V (T ) such that σ²(u) ≠ u. Can we have an edge uv ∈ E(T ) such
that σ(u) = v?
6. Let T be a tree with center {u} and radius r. Let v satisfy d(u, v) = r. Show that d(v) = 1.
7. Let T be a tree with |T | > 2. Let T′ be obtained from T by deleting all the pendant vertices of
T . Show that the center of T is the same as the center of T′.
8. Let T be a tree with center {u} and σ ∈ Γ(T ). Show that σ(u) = u.
9. Is it possible to have a tree such that |Γ(T )| = 7?
10. Construct a tree T on vertices S = {1, 2, 3, 6, 7, 8, 9} for which PT = 6, 3, 7, 1, 2.
11. Draw the tree on the vertex set {1, 2, . . . , 12} whose Prüfer code is 9, 9, 5, 4, 4, 4, 9, 7, 9, 5.
12. Practice with examples: get the Prüfer code from a tree; get the tree from a given code and a
vertex set.
13. How many trees of the following forms are there on the vertex set {1, 2, . . . , 100}?

[Figure: the tree forms for this exercise]
14. Show that any tree has at least ∆(T ) leaves (pendant edges).
15. Let T be a tree and T1 , T2 , T3 be subtrees of T such that T1 ∩ T3 ≠ ∅, T2 ∩ T3 ≠ ∅ and
T1 ∩ T2 ∩ T3 = ∅. Show that T1 ∩ T2 = ∅.
16. Let 𝒯 be a set of subtrees of a tree T . Assume that the trees in 𝒯 have nonempty pairwise
intersection. Show that their overall intersection is nonempty. Is this true if we replace T by a
graph G?
17. A connected graph G is said to be unicyclic if G has exactly one cycle as its subgraph. Prove
that if G is connected and |G| = ‖G‖, then G is a unicyclic graph.

9.5 Eulerian graphs


Definition 9.5.1. Let G be a graph. Then, G is said to have an Eulerian tour if there is a closed
walk, say [v0 , v1 , . . . , vk , v0 ], such that each edge of the graph appears exactly once in the walk. The
graph G is said to be Eulerian if it has an Eulerian tour.

Note that, by definition, a disconnected graph is not Eulerian. In this section, graphs may have
loops and multiple edges. Graphs that have a closed walk traversing each edge exactly once are
called “Eulerian graphs” after Euler’s solution, in 1736, of the famous Königsberg bridge problem.
The problem is as follows: the city of Königsberg (present-day Kaliningrad) is divided
into 4 land masses by the river Pregolya. These land masses are joined by 7 bridges (see Figure 9.11).
The question was: “is there a walk that starts from a land mass, passes over each of the seven
bridges in Figure 9.11 exactly once, and returns to the starting land mass?” Euler rephrased the
problem along the following lines: let the four land masses be denoted by the vertices A, B, C and
D of a graph and let the 7 bridges correspond to 7 edges of the graph. Then, he asked, “does this
graph have a closed walk that traverses each edge exactly once?” He gave a necessary and sufficient
condition for a graph to have such a closed walk, and thus gave a negative answer to the Königsberg
bridge problem.
One can also relate the above problem to the following one: “starting from a certain point, draw a
given figure with a pencil so that the pencil is never lifted from the paper, no line is repeated, and
the drawing ends at the initial point”.

Theorem 9.5.2. [Euler, 1736] A connected graph is Eulerian if and only if each vertex in the graph
is of even degree.

Proof. Let G be a connected graph. Suppose G has an Eulerian tour, say W = [v0 , v1 , . . . , vk , v0 ].
Observe that whenever we arrive at a vertex v ≠ v0 using an edge, say e, in W then we leave that
vertex using an edge, say e′, in W with e ≠ e′. As each edge appears exactly once in W and each edge
is traversed, d(v) = 2r, if v appears r times in the tour. Also, if v0 appears r times in the tour then
d(v0 ) = 2(r − 1). Hence, d(v) is always even.

[Figure 9.11: Königsberg bridge problem]

Conversely, let G be a connected graph with each vertex having even degree. Let W = v0 v1 · · · vk
be a longest walk in G without repeating any edge in it. As vk has an even degree it follows that
vk = v0 , otherwise W can be extended. If W is not an Eulerian tour then, as G is connected, there
exists an edge, say e′ = vi w, that is not in W but has an endpoint vi on W . In this case,
W′ = wvi vi+1 · · · vk (= v0 )v1 · · · vi−1 vi is a longer walk without repeated edges,
a contradiction. Thus, there is no edge lying outside W and hence W is an Eulerian tour.
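
For a connected graph, Theorem 9.5.2 reduces the Eulerian question to a parity check on the degrees. The sketch below applies this check to the Königsberg multigraph. The function name and the assignment of the letters A, B, C, D to particular land masses are assumptions made for illustration (Figure 9.11 is not reproduced here); only the degree pattern 5, 3, 3, 3 matters.

```python
from collections import Counter

def all_degrees_even(edges):
    """Return True if every vertex of the multigraph given by `edges` has even degree."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1   # a loop (u == v) therefore contributes 2 to the degree
    return all(d % 2 == 0 for d in deg.values())

# Seven bridges; the labeling of land masses is an assumption.
konigsberg = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
              ("A", "D"), ("B", "D"), ("C", "D")]
print(all_degrees_even(konigsberg))   # False: all four degrees are odd
```

The check covers only the degree condition; connectivity has to be verified separately.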

Proposition 9.5.3. Let G be a connected graph with exactly two vertices of odd degree. Then, there
is an Eulerian walk starting at one of those vertices and ending at the other.

Proof. Let x and y be the two vertices of odd degree and let v be a symbol such that v ∉ V (G). Then,
the graph H with V (H) = V (G) ∪ {v} and E(H) = E(G) ∪ {xv, yv} has each vertex of even degree
and hence by Theorem 9.5.2, H is Eulerian. Let Γ = [v, v1 = x, . . . , vk = y, v] be an Eulerian tour.
Then, Γ − v is an Eulerian walk with the required properties.

Exercise 9.5.4. Let G be an Eulerian graph and let e be any edge. Show that G − e is connected.

How to find an Eulerian tour (algorithm)?

Start from a vertex v0 . Repeatedly move along an edge that has not yet been taken, and delete it.
Do not take an edge whose deletion creates a non-trivial component not containing v0 .
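
The rule above can be sketched in Python as follows. This is a direct, unoptimized illustration, assuming the input is a connected graph (possibly with multiple edges) in which every vertex has even degree, given as a list of edges; the function and helper names are hypothetical. An edge is treated as allowed if, after deleting it, every remaining edge still lies in the component of v0.

```python
def eulerian_tour(edges, start):
    """Follow the rule above; returns the tour as a list of vertices beginning and ending at `start`."""
    remaining = list(edges)            # edges not yet traversed

    def component_of_start(rem):
        """Vertices reachable from `start` using only the edges in `rem`."""
        seen, stack = {start}, [start]
        while stack:
            x = stack.pop()
            for a, b in rem:
                for p, q in ((a, b), (b, a)):
                    if p == x and q not in seen:
                        seen.add(q)
                        stack.append(q)
        return seen

    def allowed(i):
        """Deleting edge i must not create a non-trivial component missing `start`."""
        rest = remaining[:i] + remaining[i + 1:]
        comp = component_of_start(rest)
        return all(a in comp for a, _ in rest)

    tour, u = [start], start
    while remaining:
        candidates = [i for i, (a, b) in enumerate(remaining) if u in (a, b)]
        i = next(j for j in candidates if allowed(j))
        a, b = remaining.pop(i)
        u = b if a == u else a
        tour.append(u)
    return tour

# Two triangles sharing the vertex 1.
bowtie = [(1, 2), (2, 3), (3, 1), (1, 4), (4, 5), (5, 1)]
print(eulerian_tour(bowtie, 2))
```

Starting at 2 and arriving at 1, the edge {3, 1} is rejected because deleting it would cut the unused triangle on 1, 4, 5 off from the start vertex.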

Exercise 9.5.5. Find Eulerian tours for the following graphs.

[Figure: two graphs, the first on the vertices 1, . . . , 13 and the second on the vertices 1, . . . , 16]

Theorem 9.5.6. [Finding Eulerian tour] The previous algorithm correctly gives an Eulerian tour
provided the given graph is Eulerian.

Proof. Let the algorithm start at a vertex, say v0 . Now, assume that we are at u with H as the current
graph and C as the only non-trivial component of H. Thus, dH (u) > 0. Assume that the deletion of
the edge uv creates a non-trivial component not containing v0 . Let Cu and Cv be the components of
C − uv, containing u and v, respectively.
We first claim that u ≠ v0 . In fact, if u = v0 , then H must have all vertices of even degree and
dH (v0 ) ≥ 2. So, C is Eulerian. Hence, C − uv cannot be disconnected, a contradiction to C − uv
having two components Cu and Cv . Thus, u ≠ v0 . Moreover, note that the only vertices of odd degree
in C are u and v0 .
Now, we claim that Cu is a non-trivial component. Suppose Cu is trivial. Then, v0 ∈ Cv , a
contradiction to the assumption that the deletion of the edge uv creates a non-trivial component not
containing v0 . So, Cu is non-trivial.
Finally, we claim that v0 ∈ Cv . If possible, let v0 ∈ Cu . Then, the only vertices in C − uv of odd
degree are v ∈ Cv and v0 ∈ Cu . Hence, C − uv + v0 v is a connected graph with each vertex of even
degree. So, by Theorem 9.5.2, the graph C − uv + v0 v is Eulerian. But, this cannot be true as vv0 is
a bridge. Thus, v0 ∈ Cv .
Hence, Cu is the newly created non-trivial component not containing v0 . Also, each vertex of Cu
has even degree and hence by Theorem 9.5.2, Cu is Eulerian. This means, we can take an edge e′
incident on u and complete an Eulerian tour in Cu . So, at u if we take the edge e′ in place of the edge
uv, then we will not create a non-trivial component not containing v0 .
Thus, at each stage of the algorithm either u = v0 or there is a path from u to v0 . Moreover, this is
the only non-trivial connected component. When the algorithm ends, we must have u = v0 . Because,
as seen above, the condition u ≠ v0 gives the existence of an edge that is incident on u and that can
be traversed (as dH (u) is odd). Hence, if u ≠ v0 , the algorithm cannot stop. Thus, when the algorithm
stops, u = v0 and all components are trivial.

Exercise 9.5.7. 1. Apply the algorithm to the graphs of Exercise 9.5.5. Also, create connected graphs,
where each vertex is of even degree, and apply the above algorithm.
2. Give a necessary and sufficient condition on m and n so that Km,n is Eulerian.
3. Each of the 8 persons in a room has to shake hands with every other person as per the following
rules:
(a) The handshakes should take place sequentially.
(b) Each handshake (except the first) should involve someone from the previous handshake.
(c) No person should be involved in 3 consecutive handshakes.
Is there a way to sequence the handshakes so that these conditions are all met?
4. Prove: A connected graph G is Eulerian if and only if E(G) can be partitioned into cycles.

9.6 Hamiltonian graphs


Definition 9.6.1. Let G be a graph. A cycle in G is said to be Hamiltonian if it contains all vertices
of G. If G has a Hamiltonian cycle, then G is called a Hamiltonian graph.

Finding a nice characterization of a Hamiltonian graph is an unsolved problem.

Example 9.6.2. 1. For each positive integer n ≥ 3, the cycle Cn is Hamiltonian.

2. The graphs corresponding to all platonic solids are Hamiltonian.

3. The Petersen graph is a non-Hamiltonian graph (the proof appears below).

Proposition 9.6.3. The Petersen graph is not Hamiltonian.


[Figure 9.12: A Hamiltonian graph on the vertices 1, . . . , 10]

Proof. Suppose that the Petersen graph, say G, is Hamiltonian. So, G contains C10 = [1, 2, 3, . . . , 10, 1]
as a subgraph. As each vertex of G has degree 3, G = C10 + M , where M is a set of 5 chords in which
each vertex appears as an endpoint. Now, consider the vertices 1, 2 and 3.
Since the girth g(G) is 5, the vertex 1 can be adjacent to only one of the vertices 5, 6 or 7. Hence, if 1 is
adjacent to 5, then the possible third vertex that is adjacent to 10 will create cycles of length 3 or 4.
Similarly, if 1 is adjacent to 7 then there is no choice for the possible third vertex that can be adjacent
to 2. So, let 1 be adjacent to 6. Then, 2 must be adjacent to 8. In this case, note that there is no
choice for the third vertex that can be adjacent to 3. Thus, the Petersen graph is non-Hamiltonian.

Theorem 9.6.4. Let G be a Hamiltonian graph. Then, for S ⊆ V (G) with S ≠ ∅, the graph G − S
has at most |S| components.

Proof. Note that by removing k vertices from a cycle, one can create at most k connected components.
As a Hamiltonian cycle C of G contains every vertex of G, the graph G − S has at most as many
components as C − S. Hence, the required result follows.

Theorem 9.6.5. [Dirac, 1952] Let G be a graph with |G| = n ≥ 3 and d(v) ≥ n/2, for each v ∈ V (G).
Then G is Hamiltonian.

Proof. We first show that G is connected. If possible, let G be disconnected. Then G has a component,
say H, with |V (H)| = k ≤ n/2. Hence, d(v) ≤ k − 1 < n/2, for each v ∈ V (H). A contradiction to
d(v) ≥ n/2, for each v ∈ V (G). Therefore, G is connected.
Now, let P = [v1 , v2 , · · · , vk ] be a longest path in G. Since P is a longest path, all neighbors of
v1 and vk are in P and k ≤ n. We claim that there exists an i such that v1 ∼ vi and vi−1 ∼ vk .
Otherwise, for each vi ∼ v1 , we must have vi−1 ≁ vk . Then, |N (vk )| ≤ k − 1 − |N (v1 )|. Hence,
|N (v1 )| + |N (vk )| ≤ k − 1 < n, a contradiction to d(v) ≥ n/2 for each v ∈ V (G).
So, the claim is valid and hence, we have a cycle P̃ := v1 vi vi+1 · · · vk vi−1 · · · v1 of length k.
We now prove that P̃ gives a Hamiltonian cycle. Suppose not. Then, there exists v ∈ V (G) such
that v is outside P and v is adjacent to some vj . Now, use P̃ , v and vj to create a path whose length
is larger than the length of P . Hence, P cannot be a path of longest length, a contradiction. Thus,
the required result follows.

A slight relaxation on the sufficient condition of a graph to be Hamiltonian is provided by the
following result. We expect the reader to prove it.

Theorem 9.6.6. [Ore, 1960] Let G be a graph on n ≥ 3 vertices such that d(u) + d(v) ≥ n for every
pair of non-adjacent vertices u and v. Then G is Hamiltonian.

Lemma 9.6.7. Let u and v be two non-adjacent vertices of a graph G such that d(u) + d(v) ≥ |G|.
Then G is Hamiltonian if and only if G + uv is Hamiltonian.

Proof. If G is Hamiltonian, then so is G+uv. Conversely, suppose that G+uv is Hamiltonian. If G+uv
has a Hamiltonian cycle not using uv, then G is Hamiltonian. Otherwise, let [u = v1 , . . . vn = v, u] be
a Hamiltonian cycle in G + uv. Then, the path [v1 , . . . , vn ] is available in G. Now proceeding as in
the proof of Dirac’s theorem, as d(v1 ) + d(vn ) ≥ n, we see that there must exist an i such that v1 ∼ vi
and vn ∼ vi−1 . Then the cycle [v1 , vi , vi+1 , . . . , vn , vi−1 , vi−2 , . . . , v1 ] is a Hamiltonian cycle in G.

Discussion 9.6.8. [Closure] Let G be a graph on n vertices, n ≥ 2. Suppose we perform the
following operation(s) on G.
Step 1: If G has two nonadjacent vertices u ≠ v such that d(u) + d(v) ≥ n, then add the edge
(u, v) to G and, treating the resulting graph as G, repeat Step 1 until the graph has no nonadjacent
vertices u ≠ v satisfying d(u) + d(v) ≥ n.
Step 2: If G has no nonadjacent vertices u ≠ v such that d(u) + d(v) ≥ n, then stop.
For example, let G be the graph on 10 vertices with no edges. Then, the application of
the above operation stops with G itself. Whereas, if G is the graph obtained from K10
by deleting the edges {1, 2} and {3, 4}, then applying the above operation gives K10 as the result.
Notice that in the above example, one might have added the edge {1, 2} first and then the edge
{3, 4} whereas someone else might have added {3, 4} first and then {1, 2}. However, they both get
the same end result. We prove this for any graph G. Before that, note that, if G is any graph on n
vertices, then the above operation can add only finitely many edges as the end result has to be
a subgraph of Kn .

Proposition 9.6.9. Let G be a graph on n vertices. Suppose the application of the operations described
in Discussion 9.6.8 to G by following two different sequences of edge additions gives K and F as the
end results. Then K = F .

Proof. Let K and F be obtained by sequentially adding edges

(e-list) e1 = u1 v1 , . . . , ek = uk vk and (f -list) f1 = x1 y1 , . . . , fr = xr yr ,

respectively, to G in that order.
Assume that K ≠ F . Then, without loss of generality, suppose an edge has been added in the
e-list which doesn’t appear in the f -list. Let ei be the first such edge in the e-list which does not
appear in the f -list. Put H = G + e1 + · · · + ei−1 . As e1 , . . . , ei−1 are in the f -list, we see that H is a
subgraph of F .
Furthermore, taking ei = {u, v}, as ei was the next to be added in the e-list, it follows that
dH (u) + dH (v) ≥ n. But as H is a subgraph of F , we see that dF (u) + dF (v) ≥ n too.
This means that F is not the end result, because in an end result there are no nonadjacent vertices
u ≠ v with sum of degrees at least n. This is a contradiction.

Let G be a graph. The graph obtained as the end result of applying the operation described in
Discussion 9.6.8, is called the closure of G, denoted C(G). (It is obtained by repeatedly choosing
pairs of non-adjacent vertices u, v such that d(u) + d(v) ≥ |G| and adding edges between them until
no such pair of vertices exists.) Proposition 9.6.9 tells us that for any graph G, C(G) is unique.
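
The closure operation of Discussion 9.6.8 can be sketched in Python as follows. This is a minimal illustration, assuming a simple graph on the vertices 1, . . . , n given as an edge list; the function name closure and the representation of edges as frozensets are illustrative choices. By Proposition 9.6.9, the order in which eligible pairs are found does not affect the result.

```python
from itertools import combinations

def closure(n, edges):
    """Closure C(G) of a simple graph G on vertices 1..n, returned as a set of frozenset edges."""
    E = {frozenset(e) for e in edges}
    changed = True
    while changed:
        changed = False
        # Recompute the degrees in the current graph.
        deg = {v: 0 for v in range(1, n + 1)}
        for e in E:
            for v in e:
                deg[v] += 1
        for u, v in combinations(range(1, n + 1), 2):
            if frozenset((u, v)) not in E and deg[u] + deg[v] >= n:
                E.add(frozenset((u, v)))
                changed = True
                break      # degrees have changed, so start the scan again
    return E

# K10 with the edges {1, 2} and {3, 4} deleted, as in Discussion 9.6.8.
k10_minus = [e for e in combinations(range(1, 11), 2) if e not in {(1, 2), (3, 4)}]
print(len(closure(10, k10_minus)))   # 45, the number of edges of K10
```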

Corollary 9.6.10. Let G be a graph. Then G is Hamiltonian if and only if C(G) is Hamiltonian. In
particular, if C(G) is Hamiltonian, then G is Hamiltonian.

Proof. Follows from Lemma 9.6.7.

Quiz 9.6.11. Let G be a graph on n ≥ 3 vertices. If G has a cut-vertex, then prove that C(G) ≠ Kn .

Theorem 9.6.12. Let d1 ≤ · · · ≤ dn be the vertex degrees of G which satisfy the property

R : If dk ≤ k then dn−k ≥ n − k for each k < n/2.

Then G is Hamiltonian.

Proof. We show that under the above condition H = C(G) ≅ Kn . On the contrary, assume that there
exists a pair of vertices u, v ∈ V (G) such that uv ∉ E(H) and dH (u) + dH (v) ≤ n − 1. Among all such
pairs, choose a pair u, v ∈ V (G) such that uv ∉ E(H) and dH (u) + dH (v) is maximum. Assume that
dH (v) ≥ dH (u) = k (say). As dH (u) + dH (v) ≤ n − 1, we get k < n/2.
Now, let Sv = {x ∈ V (H) : x ≠ v, xv ∉ E(H)} and Su = {w ∈ V (H) : w ≠ u, wu ∉ E(H)}.
Therefore, the assumption that dH (u) + dH (v) is the maximum among each pair of vertices u, v with
uv ∉ E(H) and dH (u) + dH (v) ≤ n − 1 implies that |Sv | = n − 1 − dH (v) ≥ dH (u) = k and
dH (x) ≤ dH (u) = k, for each x ∈ Sv . So, there are at least k vertices in H (elements of Sv ) with
degrees at most k.
Also, for any w ∈ Su , note that the choice of the pair u, v implies that dH (w) ≤ dH (v) ≤ n − 1 −
dH (u) = n − 1 − k < n − k. As dH (u) = k, |Su | = n − 1 − k. Further, the conditions dH (u) + dH (v) ≤ n − 1,
dH (v) ≥ dH (u) = k and u ∉ Su imply that dH (u) ≤ n − 1 − dH (v) ≤ n − 1 − k < n − k. So, there
are n − k vertices (Su ∪ {u}) in H with degrees less than n − k.
Therefore, if d′1 ≤ · · · ≤ d′n are the vertex degrees of H, then we observe that there exists a k < n/2
for which d′k ≤ k and d′n−k < n − k. As k < n/2 and di ≤ d′i , we get a contradiction to the given
hypothesis.
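
Property R of Theorem 9.6.12 depends only on the degree sequence, so it is easy to test mechanically. The following is a small sketch in Python; the function name is hypothetical, and the list index k − 1 corresponds to dk in the 1-based notation of the theorem. Since the condition is sufficient but not necessary, a graph that fails the test may still be Hamiltonian.

```python
def satisfies_property_R(degrees):
    """Check: for each k < n/2, if d_k <= k then d_{n-k} >= n - k (degrees sorted increasingly)."""
    d = sorted(degrees)
    n = len(d)
    for k in range(1, n):
        if 2 * k >= n:
            break
        # d[k - 1] is d_k and d[n - k - 1] is d_{n-k}.
        if d[k - 1] <= k and d[n - k - 1] < n - k:
            return False
    return True

print(satisfies_property_R([2, 2, 3, 3]))     # K4 minus an edge: True
print(satisfies_property_R([2, 2, 2, 2, 2]))  # C5: False, although C5 is Hamiltonian
```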

Exercise 9.6.13. Let d1 ≤ · · · ≤ dn be the vertex degrees of G which satisfy the property R (see
Theorem 9.6.12). Then show that C(G) also has property R.

Definition 9.6.14. The line graph H of a graph G is a graph with V (H) = E(G) and e1 , e2 ∈ V (H)
are adjacent in H if e1 and e2 share a common vertex/endpoint.

Example 9.6.15. Verify the following:
1. Line graph of C5 is C5 .
2. Line graph of P5 is P4 .
3. Line graph of any graph G contains a complete subgraph of size ∆(G).
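
Definition 9.6.14 translates directly into code. The sketch below builds the line graph of a simple graph given as an edge list and can be used to check the first two items above by counting vertices and edges; the function name and the representation of each edge as a frozenset are illustrative assumptions.

```python
from itertools import combinations

def line_graph(edges):
    """Line graph: its vertices are the edges of G; two are adjacent if they share an endpoint."""
    vertices = [frozenset(e) for e in edges]
    adjacent = [(e, f) for e, f in combinations(vertices, 2) if e & f]
    return vertices, adjacent

c5 = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]
p5 = [(1, 2), (2, 3), (3, 4), (4, 5)]
for name, g in (("C5", c5), ("P5", p5)):
    v, e = line_graph(g)
    print(name, len(v), len(e))
```

The line graph of C5 has 5 vertices and 5 edges, and that of P5 has 4 vertices and 3 edges, as expected for C5 and P4 .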
Exercise 9.6.16. 1. Let G be a connected Eulerian graph. Show that the line graph of G is
Hamiltonian. Is the converse true?
2. What can you say about the clique number of a line graph?

Theorem 9.6.17. A connected graph G is isomorphic to its line graph if and only if G = Cn for some
n ≥ 3.

Proof. If G is isomorphic to its line graph, then |G| = ‖G‖. Thus, G is a unicyclic graph. Let
[v1 , v2 , . . . , vk , vk+1 = v1 ] form the cycle in G. Then, the line graph of G contains a cycle P =
[v1 v2 , v2 v3 , . . . , vk v1 ]. We now claim that dG (vi ) = 2.
Suppose not and let dG (v1 ) ≥ 3. So, there exists a vertex u ∉ {v2 , . . . , vk } such that u ∼ v1 . In
that case, the line graph of G contains the triangle T = [v1 v2 , v1 vk , v1 u] and P ≠ T . Thus, the line
graph is not unicyclic, a contradiction.

Exercise 9.6.18. 1. Consider the graphs G and H shown below.

[Figure: the graphs G and H]

(a) Determine the closure of G.
(b) Show that H is not Hamiltonian.
2. Give a necessary and sufficient condition on m, n ∈ N so that Km,n is Hamiltonian.
3. Show that any graph on n ≥ 3 vertices with at least \binom{n−1}{2} + 2 edges is Hamiltonian.
4. Show that for any n ≥ 3 there is a graph H on n vertices with ‖H‖ = \binom{n−1}{2} + 1 that is not
Hamiltonian. But, prove that all such graphs H admit a Hamiltonian path (a path containing all vertices of H).

9.7 Bipartite graphs


Definition 9.7.1. A graph is said to be 2-colorable if its vertices can be colored with two colors in
a way that adjacent vertices get different colors.

Example 9.7.2. Prove the following results.
1. Every tree is 2-colorable.
2. Every cycle of even length is 2-colorable.
3. The complete bipartite graphs, namely Km,n , are 2-colorable.
4. The Petersen graph is not 2-colorable, but it is 3-colorable.

Lemma 9.7.3. Let P and Q be two v-w paths in G such that length of P is odd and length of Q is
even. Then, G contains an odd cycle.

Proof. If P, Q have no inner vertex (a vertex other than v, w) in common then P ∪ Q is an odd cycle
in G.
So, suppose P, Q have an inner vertex in common. Let x be the first common inner vertex when
we walk from v to w. Then, one of P (v, x), P (x, w) has odd length and the other is even. Let P (v, x)
be odd. If length of Q(v, x) is even then P (v, x) ∪ Q(x, v) is an odd cycle in G. If length of Q(v, x)
is odd then the length of Q(x, w) is also odd, while P (x, w) is even, and hence we can consider the
x-w paths P (x, w) and Q(x, w) and proceed as above to get the required result.

Theorem 9.7.4. Let G be a connected graph with at least two vertices. Then the following statements
are equivalent:
1. G is 2-colorable.
2. G is bipartite.
3. G does not have an odd cycle.

Proof. (1)⇒(2). Let G be 2-colorable. Let V1 be the set of red vertices and V2 be the set of blue
vertices. Clearly, G is bipartite with partition V1 , V2 .
(2)⇒(1). Color the vertices in V1 with red color and those of V2 with blue color to get the required
2-colorability of G.
(2)⇒(3). Let G be bipartite with partition V1 , V2 . Let v0 ∈ V1 and suppose Γ = v0 v1 v2 · · · vk = v0 is
a cycle. It follows that v1 , v3 , v5 · · · ∈ V2 . Since vk ∈ V1 , we see that k is even. Thus, Γ has an even
length.
(3)⇒(2). Suppose that G does not have an odd cycle. Pick any vertex v. Define

V1 = {w : there is a walk of even length from v to w},
V2 = {w : there is a walk of odd length from v to w}.

Clearly, v ∈ V1 . Also, G does not have an odd cycle implies that V1 ∩ V2 = ∅ (use Lemma 9.7.3). As
G is connected each w is either in V1 or in V2 .
Let x ∈ V1 . Then, there is an even path P (v, x) from v to x. If xy ∈ E(G), then we have a
v-y walk of odd length. Deleting all cycles from this walk (each of them is even, so the parity is
preserved), we have an odd v-y path. Thus, y ∈ V2 .
Similarly, if x ∈ V2 and xy ∈ E, then y ∈ V1 . Thus, G is bipartite with parts V1 , V2 .
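
The construction of V1 and V2 in the proof of (3)⇒(2) is essentially a breadth-first search that alternates two colors. The following is a minimal sketch in Python, assuming the graph is given as a dictionary mapping each vertex to a list of its neighbors; the function name two_coloring is illustrative. It returns a proper 2-coloring, or None when two adjacent vertices are forced into the same class, which happens exactly when the graph has an odd cycle.

```python
from collections import deque

def two_coloring(adj):
    """Return a dict vertex -> 0/1 giving a proper 2-coloring, or None if none exists."""
    color = {}
    for s in adj:                       # the outer loop also handles disconnected graphs
        if s in color:
            continue
        color[s] = 0
        queue = deque([s])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in color:
                    color[y] = 1 - color[x]
                    queue.append(y)
                elif color[y] == color[x]:
                    return None         # an edge inside one color class
    return color

c4 = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [3, 1]}
c5 = {1: [2, 5], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 1]}
print(two_coloring(c4))   # {1: 0, 2: 1, 4: 1, 3: 0}
print(two_coloring(c5))   # None
```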

Exercise 9.7.5. 1. There are 15 women and some men in a room. Each man shook hands with
exactly 6 women and each woman shook hands with exactly 8 men. How many men are there in
the room?
2. How do you test whether a graph is bipartite or not?
3. Prove the statements in Example 9.7.2.
4. Let G and H be two bipartite graphs. Prove that G × H is also a bipartite graph.

9.8 Planar graphs


Definition 9.8.1. A graph is said to be embedded on a surface S when it is drawn on S so that no
two edges intersect. A plane graph is a graph drawn on the plane where no two edges intersect. A
graph is said to be planar if it can be embedded on the plane.

[Figure 9.13: Planar and non-planar graphs. Panels: K5 (non-planar), K3,3 (non-planar), K4 , and a planar embedding of K4 .]

Example 9.8.2. 1. A tree is embeddable on a plane.
2. Any cycle Cn , n ≥ 3 is planar.
3. The planar embedding of K4 is given in Figure 9.13.
4. Draw a planar embedding of K2,3 .
5. Draw a planar embedding of the edges of a three dimensional cube.
6. Draw a planar embedding of K5 − e, where e is any edge.
7. Draw a planar embedding of K3,3 − e, where e is any edge.

Definition 9.8.3. Consider a planar embedding of a graph G. The regions on the plane defined by
this embedding are called faces/regions of G. The unbounded face/region is called the exterior face
(see Figure 9.14).

Example 9.8.4. Consider the following planar embedding of the graphs X1 and X2 .

[Figure 9.14: Planar graphs with labeled faces to understand the Euler’s theorem. Panels: Planar Graph X1 and Planar Graph X2 .]

1. The faces of the planar graph X1 and their corresponding edges are listed below.

Face   Corresponding Edges
f1     {9, 8}, {8, 9}, {8, 2}, {2, 1}, {1, 2}, {2, 7}, {7, 2}, {2, 3}, {3, 4}, {4, 6}, {6, 4}, {4, 5},
       {5, 4}, {4, 12}, {12, 4}, {4, 11}, {11, 10}, {10, 13}, {13, 14}, {14, 10}, {10, 8}, {8, 9}
f2     {10, 13}, {13, 14}, {14, 10}
f3     {4, 11}, {11, 10}, {10, 4}
f4     {2, 3}, {3, 4}, {4, 10}, {10, 8}, {8, 2}, {2, 15}, {15, 2}

2. Determine the faces of the planar graph X2 and their corresponding edges.
3. Any planar embedding of a tree has only one face, the exterior face.
4. Any planar embedding of a cycle has two faces.

From the table, we observe that each edge of X1 appears in two faces. This is easily seen for the
faces that do not have pendant vertices (see the faces f2 and f3 ). In faces f1 and f4 , there are a
few edges which are incident with a pendant vertex. Notice that the edges that are incident with a
pendant vertex, e.g., the edges {2, 15}, {8, 9} and {1, 2} etc., appear twice when traversing a particular
face. This observation leads to the proof of Euler’s theorem for planar graphs which is stated next.

Theorem 9.8.5. [Euler Formula] Let G be a connected plane graph with f number of faces. Then

|G| − ‖G‖ + f = 2. (9.1)

Proof. We use induction on f . Let f = 1. Then G cannot have a subgraph isomorphic to a cycle.
For, if G has a subgraph isomorphic to a cycle, then in any planar embedding of G, f ≥ 2. Therefore,
G is a tree; and hence, with n = |G|, |G| − ‖G‖ + f = n − (n − 1) + 1 = 2.
Assume that Equation (9.1) is true for all connected plane graphs with fewer than m faces, where
m ≥ 2. Let G be a connected plane graph with f = m. As f ≥ 2, G has a cycle; choose an edge e on
it, so that e is not a cut-edge. Then, G − e is still
a connected graph. Notice that the edge e is incident with two separate faces. So, its removal will
combine the two faces, and hence G − e has only m − 1 faces. Thus, using the induction hypothesis

|G| − ‖G‖ + f = |G − e| − (‖G − e‖ + 1) + m = |G − e| − ‖G − e‖ + (m − 1) = 2.

Hence the required result follows.

Lemma 9.8.6. Let G be a plane bridgeless graph with ‖G‖ ≥ 2. Then 2‖G‖ ≥ 3f. Further, if G has
no cycle of length 3, then 2‖G‖ ≥ 4f ⇔ ‖G‖ ≥ 2f.

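Proof. For each edge, put one dot on each side of the edge. The total number of dots is 2‖G‖. As
G is bridgeless, every edge lies on a cycle, and so each face has at least three edges on its boundary,
that is, each face contains at least three dots. So, the total number of dots is at least 3f.
Further, if G does not have a cycle of length 3, then each face contains at least four dots, and so 2‖G‖ ≥ 4f.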

Theorem 9.8.7. The complete graph K5 and the complete bipartite graph K3,3 are not planar.

Proof. If K5 is planar, then consider a plane representation of it. As |K5| = 5 and ‖K5‖ = 10,
Equation (9.1) gives f = 7. But, by Lemma 9.8.6, one has 20 = 2‖K5‖ ≥ 3f = 21, a contradiction.
If K3,3 is planar, then consider a plane representation of it. Note that it does not have a C3. Also,
as |K3,3| = 6 and ‖K3,3‖ = 9, Euler’s formula gives f = 5. Hence, by Lemma 9.8.6, one has
18 = 2‖K3,3‖ ≥ 4f = 20, a contradiction.
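
The arithmetic in the above proof can be replayed mechanically. The following is only a small sketch: the function names are ours, and the argument min_face_length stands for the bound (3 or 4) supplied by Lemma 9.8.6.

```python
def faces_if_planar(n_vertices, n_edges):
    """Number of faces forced by Euler's formula |G| - ||G|| + f = 2."""
    return 2 - n_vertices + n_edges


def contradicts_lemma(n_vertices, n_edges, min_face_length):
    """True if 2||G|| >= min_face_length * f fails for the forced value of f."""
    f = faces_if_planar(n_vertices, n_edges)
    return 2 * n_edges < min_face_length * f


k5 = contradicts_lemma(5, 10, 3)    # K5: f = 7 and 20 < 21
k33 = contradicts_lemma(6, 9, 4)    # K3,3 has no C3: f = 5 and 18 < 20
k4 = contradicts_lemma(4, 6, 3)     # K4: f = 4 and 12 >= 12, no contradiction
print(k5, k33, k4)
```

Note that a value of False (as for K4) only says that this counting argument gives no contradiction; by itself it does not prove planarity.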

Definition 9.8.8. Let G be a graph. Then, a subdivision of an edge uv in G is obtained by replacing
the edge by two edges uw and wv, where w is a new vertex. Two graphs are said to be homeomorphic
if they can be obtained from the same graph by a sequence of subdivisions.

For example, the paths Pn and Pm are homeomorphic for all m, n ∈ N. Similarly, all the cyclic
graphs are homeomorphic to the cycle C3. (We are considering only simple graphs. In general, one
can say that all cyclic graphs are homeomorphic to the graph G = (V, E), where V = {v} and
E = {{v, v}}. It is a graph having exactly one vertex and a loop.) Also, note that if two graphs are
isomorphic then they are also homeomorphic. Figure 9.15 gives examples of homeomorphic graphs
that are different from a path or a cycle.

Figure 9.15: Homeomorphic graphs

The following result characterizes planar graphs via homeomorphisms, which we do not prove.

Theorem 9.8.9. [Kuratowski, 1930] A graph is planar if and only if it has no subgraph homeomorphic
to K5 or K3,3 .

We have the following observations that directly follow from Kuratowski’s theorem.
Remark 9.8.10. 1. Among all simple connected non-planar graphs
(a) the complete graph K5 has minimum number of vertices.
(b) the complete bipartite graph K3,3 has minimum number of edges.

2. If Y is a non-planar subgraph of a graph X then X is also non-planar.



Definition 9.8.11. Let G be a graph. Define a relation on the edges of G by e1 ≃ e2 if either e1 = e2
or there is a cycle containing both these edges. Note that this is an equivalence relation. Let Ei be the
equivalence class containing the edge ei. Also, let Vi denote the set of endpoints of the edges in Ei. Then,
the induced subgraphs ⟨Vi⟩ are called the blocks of G.

The following result, which we do not prove, characterizes planar graphs via blocks.

Proposition 9.8.12. A graph G is planar if and only if each of its blocks is planar.

Definition 9.8.13. A graph is called maximal planar if it is planar and addition of any more edges
results in a non-planar graph.

Notice that a maximal planar graph is necessarily connected.

Proposition 9.8.14. If G is a maximal planar graph with at least 3 vertices, then every face is a
triangle and ‖G‖ = 3|G| − 6.

Proof. Suppose there is a face, say F, described by the cycle [u1, . . . , uk, u1], k ≥ 4. Then, we can take
a curve joining the vertices u1 and u3 lying totally inside the region F, so that G + u1u3 is planar.
This contradicts the fact that G is maximal planar. Thus, each face is a triangle. As each edge lies on
the boundary of two faces, it follows that 2‖G‖ = 3f, where f is the number of faces.
As |G| − ‖G‖ + f = 2, we have 2‖G‖ = 3f = 3(2 − |G| + ‖G‖) or ‖G‖ = 3|G| − 6.

Exercise 9.8.15. 1. Prove/disprove: A two colorable graph is necessarily planar.
2. Suppose that G is a plane graph such that each face is a 4-cycle. What is the number of edges
in G?
3. Show that the Petersen graph has a subgraph homeomorphic to K3,3.
4. Show that a plane graph with at least 3 vertices can have at most 2|G| − 5 bounded faces.
5. Let G be a plane graph with f faces and k components. Prove that |G| − ‖G‖ + f = k + 1 (use
induction).
6. If G is a plane graph without 3-cycles, then show that δ(G) ≤ 3.
7. Is it necessary that a plane graph G should contain a vertex of degree less than 5?
8. Show that any plane graph with at least 4 vertices has a vertex of degree at most five.
9. Show that any plane graph with at least 4 vertices has at least four vertices of degree at most 5.
10. Produce a planar embedding of the graph G given in Figure 9.16.

Figure 9.16: A graph on 8 vertices



9.9 Vertex coloring


Definition 9.9.1. A graph G is said to be k-colorable if the vertices can be assigned k colors in
such a way that adjacent vertices get different colors. The chromatic number of G, denoted χ(G),
is the minimum k such that G is k-colorable.

Exercise 9.9.2. Every connected bipartite graph on ≥ 2 vertices has chromatic number 2.

Theorem 9.9.3. For every graph G, χ(G) ≤ ∆(G) + 1.

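Proof. We use induction on |G|. If |G| = 1, the statement is trivial. Assume that the result is true
for |G| = n and let G be a graph on n + 1 vertices, labeled 1, 2, . . . , n + 1. Let H = G − 1. As
∆(H) ≤ ∆(G), the induction hypothesis shows that H is (∆(G) + 1)-colorable. As d(1) ≤ ∆(G), the
neighbors of the vertex 1 use at most ∆(G) colors, and so the vertex 1 can be given a color different
from those of its neighbors.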
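
The induction in the above proof amounts to coloring the vertices one at a time. The following is a minimal sketch of this greedy procedure, assuming the graph is given as a dictionary of neighbor lists; the function name and the example (the cycle C5) are ours.

```python
def greedy_coloring(adj):
    """Color the vertices one by one with the smallest color unused by neighbors."""
    color = {}
    for v in adj:
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


# Hypothetical example: the 5-cycle, an odd cycle with maximum degree 2.
c5 = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
coloring = greedy_coloring(c5)
max_degree = max(len(nbrs) for nbrs in c5.values())
num_colors = len(set(coloring.values()))
print(coloring, num_colors, max_degree + 1)
```

Since a vertex has at most ∆(G) neighbors, the while loop stops at some color c ≤ ∆(G), so at most ∆(G) + 1 colors are ever used. For C5 the bound ∆(G) + 1 = 3 is attained.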

In this connection we state the following result without proof.

Theorem 9.9.4. [Brooks, 1941] If G is a graph which is neither complete nor an odd cycle, then
χ(G) ≤ ∆(G).

Theorem 9.9.5. [5-Color Theorem] Every planar graph is 5-colorable.

Proof. Suppose, if possible, that the result is false. Let G be a planar graph on n vertices and m
edges which is not 5-colorable and is minimal (with respect to the number of vertices) with this property.
Then, n ≥ 6, and by Proposition 9.8.14, m ≤ 3n − 6. So, n δ(G) ≤ 2m ≤ 6n − 12 and hence,
δ(G) ≤ 2m/n < 6, that is, δ(G) ≤ 5. Let v be a vertex such that d(v) ≤ 5. By the minimality of G,
G − v is 5-colorable. Fix a 5-coloring of G − v.
If the neighbors of v use at most 4 colors, then v can be colored with the remaining color to get a
5-coloring of G. Else, v has exactly 5 neighbors and they receive 5 different colors. Take a planar
embedding in which the neighbors v1, . . . , v5 of v appear in clockwise order, and assume that vi has color i.
For i ≠ j, let H = G[Vi ∪ Vj] be the subgraph induced by the vertices colored i or j. If vi and vj are in different
connected components of H, then we can swap colors i and j in the component that contains vi, so
that the vertices v1, . . . , v5 use only 4 colors. Thus, as above, in this case the graph G is 5-colorable.
Otherwise, there is a 1, 3-colored path between v1 and v3 and similarly, a 2, 4-colored path between v2
and v4. But this is not possible as the graph G is planar: the first path together with v forms a closed
curve that separates v2 from v4. Hence, every planar graph is 5-colorable.

Chapter 10

Graphs - II

10.1 Connectivity
Proposition 10.1.1. Let G be a connected graph on the vertex set {1, 2, . . . , n}. Then, its vertices can
be labeled in such a way that the induced subgraph on the set {1, 2, . . . , i} is connected for 1 ≤ i ≤ n.

Proof. If n = 1, there is nothing to prove. Assume that the statement is true if n < k and let G be a
connected graph on the vertex set {1, 2, . . . , k}. Let T be a spanning tree of G. Pick any pendant
vertex of T and label it k. Then T − k is a spanning tree of G − k, and so G − k is connected. (Note
that deleting an arbitrary vertex lying on a cycle may disconnect G.) Now,
use the induction hypothesis to get the required result.
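
A labeling as in Proposition 10.1.1 can also be produced directly. The sketch below does not follow the inductive proof; it uses breadth-first search instead, in which every vertex after the first is adjacent to an earlier one. The function name and the example graph (two triangles sharing a vertex) are ours.

```python
from collections import deque


def connected_prefix_order(adj, start):
    """Return the vertices of a connected graph in BFS order from `start`.

    Every vertex after the first is adjacent to an earlier vertex, so each
    initial segment of the order induces a connected subgraph.
    """
    order, seen, queue = [], {start}, deque([start])
    while queue:
        v = queue.popleft()
        order.append(v)
        for u in adj[v]:
            if u not in seen:
                seen.add(u)
                queue.append(u)
    return order


# Hypothetical example: two triangles sharing the vertex "c".
adj = {
    "a": ["b", "c"], "b": ["a", "c"],
    "c": ["a", "b", "d", "e"],
    "d": ["c", "e"], "e": ["c", "d"],
}
order = connected_prefix_order(adj, "a")
labels = {v: i + 1 for i, v in enumerate(order)}
print(labels)
```

Here the vertex visited i-th receives the label i, so the induced subgraph on the labels {1, 2, . . . , i} is connected for each i.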

Definition 10.1.2. Let G be a graph. Then a set X ⊆ V (G) ∪ E(G) is called a separating set if
G − X has more connected components than G.

Let X be a separating set of G. Then there exist u, v ∈ V (G) that lie in the same component
of G but lie in different components of G − X. If {u} ⊆ V (G) is a separating set of G, then u is a
cut-vertex. If {e} ⊆ E(G) is a separating set of G, then e is a bridge/cut-edge.
Example 10.1.3. 1. In a tree, each edge is a bridge and each non-pendant vertex is a cut-vertex.
Is it true for a forest?
2. The graph K7 does not have a separating set of vertices. In K7 , a separating set of edges must
contain at least 6 edges.

Recall that a graph is said to be a non-trivial graph if it has at least one edge.

Definition 10.1.4. A graph G is said to be k-connected if |G| > k and G is connected even after
deletion of any k − 1 vertices. The vertex connectivity of a non-trivial graph G, denoted by κ(G),
is the largest k such that G is k-connected. Convention: κ(K1 ) = 0.
Example 10.1.5. 1. Each connected graph of order more than one is 1-connected.
2. A 2-connected graph is also a 1-connected graph.
3. For a disconnected graph, κ(G) = 0 and for n > 1, κ(Kn ) = n − 1.
4. The graph G in Figure 10.1 is 2-connected but not 3-connected. Thus, κ(G) = 2.

Figure 10.1: graph with vertex connectivity 2


5. The Petersen graph is 3-connected.

Definition 10.1.6. A graph G is called ℓ-edge connected if |G| > 1 and G − F is connected for
every F ⊆ E(G) with |F | < ℓ. The greatest integer ℓ such that G is ℓ-edge connected is the edge
connectivity of G, denoted λ(G). Convention: λ(K1) = 0.
Example 10.1.7. 1. Note that λ(Pn ) = 1, λ(Cn ) = 2 and λ(Kn ) = n − 1 for n > 1.
2. Let T be a tree on n ≥ 2 vertices. Then, λ(T ) = 1.
3. For the graph G in Figure 10.1, λ(G) = 3.
4. For the Petersen graph G, λ(G) = 3.

Exercise 10.1.8. Let |G| > 1. Show that κ(G) = |G| − 1 if and only if G = Kn . Can we say the
same for λ(G)?

Theorem 10.1.9. [H. Whitney, 1932] For any graph G, κ(G) ≤ λ(G) ≤ δ(G).

Proof. If G is disconnected or |G| = 1, then we have nothing to prove. So, let G be a connected graph
and |G| ≥ 2. Then, there is a vertex v with d(v) = δ(G). If we delete all edges incident on v, then the
graph is disconnected. Thus, δ(G) ≥ λ(G).
Suppose that λ(G) = 1 and G − uv is disconnected with components Cu and Cv. If |Cu| = |Cv| = 1,
then G = K2 and κ(G) = 1. If |Cu| > 1, then we delete u to see that κ(G) = 1.
If λ(G) = k ≥ 2, then there is a set of edges, say e1, . . . , ek, whose removal disconnects G. Notice
that G − {e1, . . . , ek−1} is a connected graph with a bridge, say ek = uv. For each of e1, . . . , ek−1 select
an end vertex other than u or v. Deletion of these vertices from G results in a graph H with uv as a
bridge of a connected component. Note that κ(H) ≤ 1. As H is obtained from G by deleting at most
k − 1 vertices, we get κ(G) ≤ (k − 1) + κ(H) ≤ k. Hence, κ(G) ≤ λ(G).
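
For small graphs, the three quantities in Whitney's theorem can be computed by brute force straight from Definitions 10.1.4 and 10.1.6. The following sketch (all names are ours, and the search is exponential, so it is meant only for small examples) does this for the Petersen graph, built here as an outer 5-cycle, an inner pentagram and five spokes.

```python
from itertools import combinations


def is_connected(vertices, edges):
    """Depth-first search test; the empty graph counts as connected here."""
    vertices = set(vertices)
    if not vertices:
        return True
    adj = {v: set() for v in vertices}
    for u, v in edges:
        if u in vertices and v in vertices:
            adj[u].add(v)
            adj[v].add(u)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for y in adj[x] - seen:
            seen.add(y)
            stack.append(y)
    return seen == vertices


def vertex_connectivity(vertices, edges):
    """Least number of vertices whose deletion disconnects G (|G| - 1 if none)."""
    n = len(vertices)
    for k in range(n - 1):
        for removed in combinations(vertices, k):
            if not is_connected(set(vertices) - set(removed), edges):
                return k
    return n - 1


def edge_connectivity(vertices, edges):
    """Least number of edges whose deletion disconnects G (0 for a single vertex)."""
    for k in range(len(edges) + 1):
        for removed in combinations(edges, k):
            rest = [e for e in edges if e not in removed]
            if not is_connected(vertices, rest):
                return k
    return 0


# The Petersen graph on the vertices 0, ..., 9.
vertices = list(range(10))
edges = ([(i, (i + 1) % 5) for i in range(5)]            # outer cycle
         + [(5 + i, 5 + (i + 2) % 5) for i in range(5)]  # inner pentagram
         + [(i, i + 5) for i in range(5)])               # spokes
kappa = vertex_connectivity(vertices, edges)
lam = edge_connectivity(vertices, edges)
delta = min(sum(v in e for e in edges) for v in vertices)
print(kappa, lam, delta)
```

For the Petersen graph all three values equal 3, in agreement with Examples 10.1.5 and 10.1.7.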



Exercise 10.1.10. Give a lower bound on the number of edges of a graph G on n vertices with vertex
connectivity κ(G) = k.

In this connection, we state the following result without proof.

Theorem 10.1.11. [Chartrand and Harary, 1968] For all integers a, b, c such that 0 < a ≤ b ≤ c,
there exists a graph with κ(G) = a, λ(G) = b and δ(G) = c.

Theorem 10.1.12. [Mader, 1972] Every graph G of average degree at least 4k has a k-connected
subgraph.

Proof. For k = 1, the assertion is trivial. So, let k ≥ 2. Note that

n = |G| ≥ ∆(G) ≥ 4k ≥ 2k − 1, (10.1)

m = ‖G‖ ≥ ½ (average degree × n) ≥ 2kn ≥ (2k − 3)(n − k + 1) + 1. (10.2)

We use induction on n to show that if G satisfies Equations (10.1) and (10.2), then G has a k-connected
subgraph. If n = 2k − 1, then m ≥ (2k − 3)(n − k + 1) + 1 = (n − 2)(n + 1)/2 + 1 = n(n − 1)/2. So, G is a
graph on n vertices with at least n(n − 1)/2 edges and hence G = Kn. Thus Kk+1 ⊆ Kn = G, and Kk+1
is k-connected.
Assume n ≥ 2k and that the assertion holds for all graphs having fewer than n vertices that satisfy
Equations (10.1) and (10.2). If v is a vertex with d(v) ≤ 2k − 3, then G − v is a graph on n − 1 vertices and

‖G − v‖ ≥ (2k − 3)(n − k + 1) + 1 − (2k − 3) = (2k − 3)((n − 1) − k + 1) + 1.

Hence, by the induction hypothesis G − v has a k-connected subgraph.



So, let d(v) ≥ 2k − 2, for each vertex v. If G is k-connected then we have nothing to prove.
Assume, on the contrary, that G is not k-connected. Then G = G1 ∪ G2 with |G1 ∩ G2| < k, |G1| < n
and |G2| < n. Thus each of G1 − V (G2) and G2 − V (G1) has at least one vertex, and there is no edge
between those vertices as G is not k-connected. As the degree of these vertices is at least 2k − 2, we
have |G1|, |G2| ≥ 2k − 1. Further,

|G1| + |G2| = |G1 ∪ G2| + |G1 ∩ G2| ≤ n + (k − 1) = n + k − 1. (10.3)

If G1 or G2 satisfies Equation (10.2), using induction hypothesis, the result follows. Otherwise,
‖Gi‖ ≤ (2k − 3)(|Gi| − k + 1), for i = 1, 2. Using Equation (10.3), we obtain

m = ‖G‖ ≤ ‖G1‖ + ‖G2‖ ≤ (2k − 3)(|G1| + |G2| − 2k + 2) ≤ (2k − 3)(n − k + 1),

a contradiction to Equation (10.2); and hence the required result follows.


The following characterization of k-connected graphs is often helpful.

Theorem 10.1.13. [Menger] A graph is k-edge-connected if and only if there are k edge disjoint
paths between each pair of distinct vertices. A graph is k-connected if and only if there are k internally
vertex disjoint paths between each pair of distinct vertices.

10.2 Matching in graphs


Definition 10.2.1. A matching in a graph G is an independent set of edges, that is, a set of edges
no two of which share an end vertex. A maximum matching is a matching with maximum number of
edges. A vertex v is saturated by a matching M if there is an edge e ∈ M incident on v. A matching
is a perfect matching if every vertex is saturated.

Example 10.2.2. 1. In Figure 10.2, M1 = {u1u2} is a matching. If e is any edge, then M2 = {e}
is a matching. The set M3 = {u3u2, u4u7} is also a matching. The set M4 = {u1u2, u4u5, u6u7}
is a maximum matching (why?). Can you give another maximum matching?

Figure 10.2: A graph

2. Any non-trivial graph G has a maximum matching.
3. Vertices that are saturated by M3 are u2, u3, u4 and u7.
4. Any graph with a perfect matching must have even order as each edge saturates two vertices.
So, the graph in Figure 10.2 cannot have a perfect matching.

Definition 10.2.3. Let M be a matching in G. A path P is called M -alternating if its edges are
alternately from M and from G − M . An M -alternating path with two unmatched vertices as end
points (of the alternating path) is called M -augmenting. Convention: Each path of length 1 in M
is M -alternating.

Example 10.2.4. Consider the graph in Figure 10.2.



1. The path [u1, u2] is M1-alternating. The only path of length 2 which is M1-alternating is
[u1, u2, u3]. Why is the path [u1, u2, u4] not M1-alternating of length 2?
2. The path [u1 , u2 , u4 , u7 ] is not M3 -alternating. But, [u2 , u3 , u4 , u7 ] is M3 -alternating.
3. The path P = [u1 , u2 , u3 , u4 , u7 , u6 ] is M3 -alternating and M3 -augmenting. This gives us a way
to get a larger (in size) matching M5 using M3 : throw away the even edges of P from M3 and
add the odd edges; i.e., M5 = M3 − {u2 u3 , u4 u7 } + {u1 u2 , u3 u4 , u7 u6 }.
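
The exchange in item 3 is just a symmetric difference of two edge sets. The following small sketch replays it on the path P and the matching M3 given above; the function name augment is ours, and edges are stored as unordered pairs.

```python
def augment(matching, path):
    """Return the symmetric difference of `matching` and the edge set of `path`."""
    path_edges = {frozenset(e) for e in zip(path, path[1:])}
    return set(matching) ^ path_edges


m3 = {frozenset({"u3", "u2"}), frozenset({"u4", "u7"})}
p = ["u1", "u2", "u3", "u4", "u7", "u6"]
m5 = augment(m3, p)
print(sorted(sorted(e) for e in m5))
```

The result is M5 = {u1u2, u3u4, u7u6}, which has one edge more than M3.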

Theorem 10.2.5. [Berge, 1957] A matching M is maximum if and only if there is no M -augmenting
path in G.

Proof. Let M = {u1v1, . . . , ukvk} be a maximum matching. If there is an M-augmenting path P, then
(P \ M) ∪ (M \ P) is a larger matching, a contradiction. Conversely, suppose that there is no
M-augmenting path in G but M is not maximum.
Let M∗ be a maximum matching. Consider the graph H = (V, M ∪ M∗). Note that dH(v) ≤ 2 for
each vertex in H. Thus, H is a collection of isolated vertices, paths and cycles, and along each path
or cycle the edges are alternately from M and from M∗. Since a cycle contains
equal number of edges of M and M∗, and |M∗| > |M|, there is a path P which contains more edges of M∗
than of M. Then P is an M-augmenting path, a contradiction.

Exercise 10.2.6. How do we find a maximum matching in a graph G?

Example 10.2.7. Can we find a matching that saturates all vertices in the graph given below?

[Figure: the graph X; three of its vertices are labeled 1, 2 and 3.]

Ans: No. Let X be the given graph and take S = {1, 2, 3}. If there is a matching that saturates
S then |N (S)| ≥ |S|. But this is not the case with this graph.

Theorem 10.2.8. [Hall, 1935] Let G = (X ∪ Y, E) be a bipartite graph. Then there is a matching
that saturates all vertices in X if and only if |N (S)| ≥ |S| for each S ⊆ X.

Proof. If there is such a matching, then obviously |S| ≤ |N (S)| for each subset S of X. Conversely,
suppose that |N (S)| ≥ |S| for each S ⊆ X. If possible, let M∗ be a maximum matching that does not
saturate x ∈ X.
As |N ({x})| ≥ |{x}|, there is a y ∈ Y such that xy ∉ M∗. Since M∗ cannot be extended, y must
have been matched to some x1 ∈ X.
Now consider N ({x, x1}). As |N ({x, x1})| ≥ |{x, x1}| and M∗ does not saturate x, N ({x, x1}) has
a vertex y1 which is adjacent to either x or x1 or both by an edge not in M∗. Again the condition
that M∗ cannot be extended implies that y1 must have been matched to some x2 ∈ X. Continuing
as above, we see that this process never stops and thus, G has infinitely many vertices, which is not
true. Hence, M∗ saturates each x ∈ X.
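
For a small bipartite graph, Hall's condition can be checked by examining every subset S of X. The sketch below assumes the graph is given as a dictionary sending each vertex of X to the set of its neighbors in Y; the function name and both example graphs are ours, and the search takes exponential time.

```python
from itertools import combinations


def hall_violation(adj):
    """Return a subset S of X with |N(S)| < |S|, or None if there is no such S."""
    xs = list(adj)
    for size in range(1, len(xs) + 1):
        for subset in combinations(xs, size):
            neighborhood = set().union(*(adj[x] for x in subset))
            if len(neighborhood) < size:
                return set(subset)
    return None


# Hypothetical examples: in `bad`, the vertices 1, 2, 3 compete for a and b only.
bad = {1: {"a", "b"}, 2: {"a"}, 3: {"b"}}
good = {1: {"a", "b"}, 2: {"a", "c"}, 3: {"b"}}
print(hall_violation(bad), hall_violation(good))
```

In the first graph the set S = {1, 2, 3} has only two neighbors, so no matching saturates X. In the second graph no such S exists and, by Hall's theorem, X can be saturated.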

Corollary 10.2.9. Let G be a k-regular (k ≥ 1) bipartite graph. Then G has a perfect matching.

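Proof. Let X and Y be the two parts of the bipartition of V (G). Since G is k-regular, counting the
edges from each side gives k|X| = ‖G‖ = k|Y |, and so |X| = |Y |. Let S ⊆ X and let
E be the set of edges with an end vertex in S. Then

$$k|S| = |E| \le \sum_{v \in N(S)} d(v) = k|N(S)|.$$

Hence, we see that for each S ⊆ X, |S| ≤ |N (S)| and thus, by Hall’s theorem, there is a matching
that saturates X. As |X| = |Y |, this matching is perfect.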

Definition 10.2.10. Let G be a graph. Then a subset S of V (G) is called a covering of G if each
edge has at least one end vertex in S. A minimum covering of G is a covering of G that has
minimum cardinality.
Exercise 10.2.11. 1. Can there be a graph in which the size of a minimum covering is |G|?
2. Show that for any graph G the size of a minimum covering is n − α(G).
3. Characterize G in terms of its girth if the size of a minimum covering is |G| − 2.

Proposition 10.2.12. Let G be a graph. If M is a matching and K is a covering of G, then |M | ≤ |K|.
If |M | = |K|, then M is a maximum matching and K is a minimum covering.

Proof. The first statement follows from the definitions: K contains an end vertex of each edge of M,
and these end vertices are distinct as no two edges of M share a vertex. To prove the second statement, suppose
that |M | = |K| and M is not a maximum matching. Let M∗ be a matching of G with |M∗| > |M |.
Then, using the first statement, we have |K| ≥ |M∗|. Hence, |K| ≥ |M∗| > |M | = |K|, a contradiction. Thus, M is
maximum. As each covering must have at least |M | elements, we see that K is a minimum covering.

Exercise 10.2.13. Let G = Kn, n ≥ 3. Then, determine
1. the cardinality of a maximum matching.
2. the cardinality of a minimum covering.

Is the converse of Proposition 10.2.12 necessarily true? Can you guess the class of graphs for which
the converse of Proposition 10.2.12 is true?

Theorem 10.2.14. [König, 1931] Let M be a maximum matching in a bipartite graph G and let K
be a minimum covering. Then |M | = |K|.

Proof. Let (L, R) (L for left and R for right) be the bipartition of V and let M be a maximum
matching. Let U be the set of unmatched vertices on the left.

[Figure: the left side L with U ⊆ U′_L, and the right side R with U′_R.]

Let U′ be the set of vertices reachable from U by alternating paths (with respect to M). Then U′
has two parts: one on the left, say U′_L, and the other on the right, say U′_R. Note that the vertices of
U are reachable from themselves. Hence, we have U ⊆ U′_L. We have a few observations.
a) If v ∈ L is a left vertex not in U′_L, then it is not in U, and so it must be matched to some right
vertex, say w. Can w ∈ U′_R? No. Because, if w ∈ U′_R, then we have an alternating path from some u ∈ U
to w and as [w, v] is a matching edge, we see that v is reachable from u by an alternating path. Then
v should have been in U′_L, a contradiction. Thus every vertex from L \ U′_L is matched to a vertex in
R \ U′_R.
b) Is every vertex in U′_R matched (saturated)? Yes. To see it, suppose that w ∈ U′_R is not matched.
As w ∈ U′_R, it must be reachable from a vertex u ∈ U via an alternating path. But, this alternating
path is an augmenting path. This means M is not a maximum matching, a contradiction.
c) The above two points imply that |M | = |L \ U′_L| + |U′_R|.
d) Is there any edge from a vertex in U′_L to a vertex in R \ U′_R? No. To see this note that each
vertex in U′_L \ U is reached from some vertex of U via an alternating path and the last edge of this
path must be a matching edge. Thus, each vertex in U′_L \ U is matched to some vertex in U′_R. This
means, if there is an edge from a vertex in U′_L to a vertex w ∈ R \ U′_R, it must be a nonmatching edge.
But then, this makes w reachable from U via an alternating path. So w should have been in U′_R, a
contradiction.
e) The previous point means that (L \ U′_L) ∪ U′_R is a covering. This is a minimum covering, as any
covering must contain at least |M | many vertices by Proposition 10.2.12.

Alternate proof. Let V = X ∪ Y be the bipartition of V and let M be a maximum matching. Let U be
the set of vertices in X that are not saturated by M and let Z be the set of vertices reachable from U by
an M-alternating path.
Put S = Z ∩ X, T = Z ∩ Y and K = T ∪ (X \ S). Then, U ⊆ Z ⊆ X ∪ Y and every element of X \ S
is saturated by M. Also, every vertex in T is saturated by M (as M is a maximum matching) and
N (S) = T (else there will be an M-augmenting path starting from u ∈ U). Further, a vertex v ∈ X \ S is
matched to some vertex y ∉ T. Thus, |K| = |T ∪ (X \ S)| ≤ |M |. If K is not a covering, then there is
an edge xy ∈ G with x ∈ S and y ∉ T, a contradiction to N (S) = T. Thus, K is a covering and hence,
using |K| ≤ |M | and Proposition 10.2.12, we get |K| = |M |. Furthermore, by Proposition 10.2.12, we
also see that K is a minimum covering.
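
Both proofs are constructive, and the covering K = T ∪ (X \ S) can be computed along the lines sketched below. This is only an illustration under our own assumptions: the bipartite graph is a dictionary sending each vertex of X to a list of neighbors in Y, the names of the vertices in X and in Y are distinct, and a maximum matching is found by repeatedly searching for augmenting paths (this is justified by Theorem 10.2.5). All function names are ours.

```python
def max_matching(adj):
    """Maximum matching in a bipartite graph by repeated augmentation.

    Returns a dictionary sending each matched vertex of Y to its partner in X.
    """
    match_of_right = {}

    def try_augment(x, visited):
        # Look for an alternating path from x to an unmatched vertex of Y.
        for y in adj[x]:
            if y in visited:
                continue
            visited.add(y)
            if y not in match_of_right or try_augment(match_of_right[y], visited):
                match_of_right[y] = x
                return True
        return False

    for x in adj:
        try_augment(x, set())
    return match_of_right


def konig_cover(adj):
    """Return (matching, covering) with the covering K = T ∪ (X \\ S)."""
    match_of_right = max_matching(adj)
    matched_left = set(match_of_right.values())
    # S and T: vertices of X and of Y reachable from the unsaturated
    # vertices of X by alternating paths.
    S = {x for x in adj if x not in matched_left}
    T = set()
    stack = list(S)
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y not in T:
                T.add(y)
                partner = match_of_right.get(y)  # exists, as the matching is maximum
                if partner is not None and partner not in S:
                    S.add(partner)
                    stack.append(partner)
    return match_of_right, T | (set(adj) - S)


# Hypothetical example: x2 and x3 both have y1 as their only neighbor.
adj = {"x1": ["y1", "y2"], "x2": ["y1"], "x3": ["y1"]}
matching, cover = konig_cover(adj)
print(matching, cover)
```

In this example the maximum matching has two edges and the covering found is {x1, y1}, so |M | = |K| as the theorem asserts.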


Exercise 10.2.15. 1. How many perfect matchings are there in a labeled K2n ?
2. Characterize G if the size of a minimum covering is |G| − 1.

10.3 Ramsey numbers


Recall that in any group of 6 or more persons either we see 3 mutual friends or we see 3 mutual
strangers. Expressed using graphs it reads as follows:

Any graph with at least 6 vertices has either K3 or K̄3 (the complement of K3, that is, three pairwise non-adjacent vertices) as an induced subgraph.

Definition 10.3.1. The Ramsey number r(m, n) is the smallest natural number k such that any
graph G on k vertices has either a Km or a K̄n as an induced subgraph. Here K̄n denotes the complement
of Kn, that is, n pairwise non-adjacent vertices.

Example 10.3.2. As C5 does not have K3 or K̄3 as its subgraph, r(3, 3) > 5. But, using the first
paragraph of this section, we get r(3, 3) ≤ 6 and hence, r(3, 3) = 6. It is known that r(3, 4) = 9 (see
the text by Harary [6] for a table).
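
The value r(3, 3) = 6 is small enough to be confirmed by exhaustive search. The sketch below (function names ours) checks that C5 has no three vertices that are pairwise adjacent or pairwise non-adjacent, and then runs through all 2^15 graphs on 6 labeled vertices.

```python
from itertools import combinations


def has_mono_triple(n, edges):
    """True if some 3 of the vertices 0..n-1 are pairwise adjacent or pairwise non-adjacent."""
    for a, b, c in combinations(range(n), 3):
        k = ((a, b) in edges) + ((a, c) in edges) + ((b, c) in edges)
        if k == 0 or k == 3:
            return True
    return False


def every_graph_has_mono_triple(n):
    """Check all graphs on n labeled vertices; edges are pairs (i, j) with i < j."""
    pairs = list(combinations(range(n), 2))
    for mask in range(1 << len(pairs)):
        edges = {p for i, p in enumerate(pairs) if mask >> i & 1}
        if not has_mono_triple(n, edges):
            return False
    return True


c5 = {(0, 1), (1, 2), (2, 3), (3, 4), (0, 4)}
print(has_mono_triple(5, c5), every_graph_has_mono_triple(6))
```

The first value printed is False, which shows r(3, 3) > 5, and the second is True, which shows r(3, 3) ≤ 6.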

Proposition 10.3.3. Let G be a graph on 9 vertices. Then, either K4 ⊆ G or K̄3 ⊆ G.

Proof. Let V be the vertex set of G, so that |V | = 9. For a vertex a, write N (a)′ = V \ (N (a) ∪ {a})
for the set of non-neighbors of a. We need to consider three cases.

Case I. There is a vertex a with d(a) ≤ 4. Then, |N (a)′| = 8 − d(a) ≥ 4. If all vertices in N (a)′
are pairwise adjacent, then K4 ⊆ G. Otherwise, there are two non-adjacent vertices, say b, c ∈ N (a)′.
In that case a, b, c induce the graph K̄3.

Case II. There is a vertex a with d(a) ≥ 6. If ⟨N (a)⟩ has a K̄3, we are done. Otherwise, r(3, 3) = 6
implies that ⟨N (a)⟩ has a K3 with vertices, say, b, c, d. In that case a, b, c, d induce the graph K4.

Case III. Each vertex has degree 5. This case is not possible as ∑ d(v) should be even, whereas
here ∑ d(v) = 9 · 5 = 45.
Exercise 10.3.4. 1. Can you draw a graph on 8 vertices
(a) which does not have K3, K̄4 in it?
(b) which does not have K4, K̄3 in it?

2. Consider the graph C8 = [1, 2, . . . , 8, 1] with 10 extra edges 13, 14, 17, 26, 27, 35, 36, 48, 57, 58.
Does this graph have a K4 or the complement of C3?

Theorem 10.3.5. [Erdős & Szekeres, 1935] Let m, n ∈ N. Then,

r(m, n) ≤ r(m − 1, n) + r(m, n − 1).

Proof. Let p = r(m − 1, n) and q = r(m, n − 1). Now, take any graph G on p + q vertices and take
a vertex a. If d(a) ≥ p, then ⟨N (a)⟩ has either a subgraph Km−1 (and Km−1 together with a gives
Km) or a subgraph K̄n. Otherwise, d(a) ≤ p − 1 and hence the set N (a)′ of non-neighbors of a
satisfies |N (a)′| ≥ (p + q − 1) − (p − 1) = q. In this case, ⟨N (a)′⟩ has either a subgraph Km or a
subgraph K̄n−1 (K̄n−1 together with a gives K̄n).
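
Iterating the inequality of Theorem 10.3.5 gives an explicit upper bound on r(m, n). The sketch below assumes the base values r(1, n) = r(m, 1) = 1 (a single vertex is both a K1 and a K̄1); these base values and the function name are ours and are not stated in the text above.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def ramsey_upper_bound(m, n):
    """Upper bound on r(m, n) obtained by iterating r(m, n) <= r(m-1, n) + r(m, n-1)."""
    if m == 1 or n == 1:
        return 1
    return ramsey_upper_bound(m - 1, n) + ramsey_upper_bound(m, n - 1)


for m, n in [(3, 3), (3, 4), (4, 4)]:
    print((m, n), ramsey_upper_bound(m, n), comb(m + n - 2, m - 1))
```

The recursion is that of Pascal's triangle, so the bound equals the binomial coefficient C(m + n − 2, m − 1). It gives 6 for r(3, 3), which is exact, and 10 for r(3, 4), which is one more than the value 9 quoted in Example 10.3.2.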

Exercise 10.3.6. Is it true that in any group of 7 persons there are 3 mutual friends or 4 mutual
strangers?

10.4 Degree sequence


Definition 10.4.1. The degree sequence of a graph of order n is the tuple (d1, . . . , dn) of the degrees
of its vertices, arranged so that d1 ≤ · · · ≤ dn. A non-decreasing sequence d = (d1, . . . , dn) of
non-negative integers is graphic if there is a graph whose degree sequence is d.

Example 10.4.2. Show that (1, 1, 3, 3) is not graphic.

Ans: Let the vertices be {u, v, w, x}. If d(u) = d(v) = 3, then u ∼ v, w, x and v ∼ u, w, x. Thus,
d(w) ≥ 2 and d(x) ≥ 2, which is not possible as the two remaining degrees are both 1.

Theorem 10.4.3. Fix n ≥ 1 and the natural numbers d1 ≤ · · · ≤ dn. Then, d = (d1, . . . , dn) is
the degree sequence of a tree on n vertices if and only if ∑ di = 2n − 2.

Proof. If d = (d1, . . . , dn) is the degree sequence of a tree T on n vertices then ∑ di = 2|E(T )| =
2(n − 1) = 2n − 2.
Conversely, let d1 ≤ · · · ≤ dn be a sequence of natural numbers with ∑ di = 2n − 2. We use
induction to show that d = (d1, . . . , dn) is the degree sequence of a tree on n vertices. For n = 1, 2, the
result is trivial. Let the result be true for all n < k and let d1 ≤ · · · ≤ dk, k > 2, be natural numbers
with ∑ di = 2k − 2. Since ∑ di = 2k − 2, we must have d1 = 1 and dk > 1. Then, we note that
d′2 = d2, . . . , d′k−1 = dk−1 and d′k = dk − 1 are natural numbers such that ∑ d′i = 2(k − 1) − 2. Hence,
by induction hypothesis, there is a tree T on the vertices 2, . . . , k − 1, k with degrees d′i. Now, introduce
a new vertex 1 and add the edge {1, k} to get a tree T′ that has the required degree sequence.

Theorem 10.4.4. [Havel-Hakimi, 1962] The degree sequence d = (d1, . . . , dn) is graphic if and only
if the sequence $d_1, d_2, \ldots, d_{n-d_n-1}, d_{n-d_n} - 1, \ldots, d_{n-1} - 1$ is graphic.

Proof. If the latter sequence is graphic then we introduce a new vertex and make it adjacent to the
vertices whose degrees are $d_{n-d_n} - 1, \ldots, d_{n-1} - 1$. Hence, the sequence d = (d1, . . . , dn) is graphic.
Now, assume that d is graphic and G is a graph with degree sequence d. Let dn = k and let
$N_G(n) = \{i_1, i_2, \ldots, i_k\}$ with $d_{i_1} \le d_{i_2} \le \cdots \le d_{i_k}$. If $d_{i_1} \ge d_v$ for all $v \in V(G) \setminus (N_G(n) \cup \{n\})$ then
$\{d_{i_1}, d_{i_2}, \ldots, d_{i_k}\} = \{d_{n-d_n}, d_{n-d_n+1}, \ldots, d_{n-1}\}$ and hence G − n is the required graph.
If $d_{i_1} < d_{v_0}$ for some $v_0 \in V(G) \setminus (N_G(n) \cup \{n\})$ then we construct another graph, say G′, such that G
and G′ have the same degree sequence but

$$\sum_{v \in N_{G'}(n)} d_v > \sum_{u \in N_G(n)} d_u. \tag{10.4}$$

As $v_0 \not\sim n$, the vertex $v_0$ has a neighbor $v \ne i_1$ with $v \not\sim i_1$, as $d_{i_1} < d_{v_0}$. Now, consider the
graph $G' = G - \{v_0, v\} + \{n, v_0\} + \{i_1, v\} - \{i_1, n\}$. Then, G′ also has d as its degree sequence
with $N_{G'}(n) = \{v_0, i_2, \ldots, i_k\}$. Thus, (10.4) holds. This process will end after a finite number of
steps by producing a graph in which the vertex n has degree dn and has neighbors with degrees
$d_{n-d_n}, d_{n-d_n+1}, \ldots, d_{n-1}$; and hence the required result follows.
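
Applying Theorem 10.4.4 repeatedly gives a simple test for a sequence to be graphic: remove the largest term dn, subtract 1 from the dn largest remaining terms, sort again and repeat. The following is a minimal sketch of this test; the function name is ours.

```python
def is_graphic(degrees):
    """Havel-Hakimi test for a sequence of non-negative integers."""
    seq = sorted(degrees)
    while seq:
        largest = seq.pop()              # remove d_n
        if largest > len(seq):
            return False                 # not enough vertices left to join to
        for i in range(len(seq) - largest, len(seq)):
            seq[i] -= 1                  # reduce the d_n largest remaining terms
            if seq[i] < 0:
                return False
        seq.sort()
    return True


print(is_graphic([1, 1, 3, 3]))     # the sequence of Example 10.4.2
print(is_graphic([1, 1, 2, 2, 2]))  # degree sequence of a path on 5 vertices
print(is_graphic([3, 3, 3, 3]))     # degree sequence of K4
```

The first call returns False, in agreement with Example 10.4.2, and the other two return True.
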
Exercise 10.4.5. 1. How many different degree sequences are possible on a graph with 5 vertices?
List all the degree sequences and draw a graph for each one.
2. For each of the degree sequences given below, draw the graph; or else, argue why it is not graphic.
(a) (2, 2, 3, 4, 4, 5)
(b) (1, 2, 2, 3, 3, 4)
(c) (2, 2, 3, 3, 3, 3, 3, 3, 4, 4)
3. If two graphs have the same degree sequence, are they necessarily isomorphic?
4. If two graphs are isomorphic, is it necessary that they have the same degree sequence?

10.5 Representing graphs with matrices


Definition 10.5.1. Let G = (V, E) be a simple (undirected) graph on vertices 1, . . . , n. Then the
n × n matrix, called the adjacency matrix A(G) of G (or simply A), is defined by

$$A(G) = [a_{ij}], \qquad a_{ij} = \begin{cases} 1 & \text{if } \{i, j\} \in E, \\ 0 & \text{otherwise.} \end{cases}$$

Let H be the graph obtained by relabeling the vertices of G. Then A(H) = S⁻¹A(G)S, for
some permutation matrix S (recall that for a permutation matrix Sᵗ = S⁻¹). Hence, we talk of the
adjacency matrix of a graph and ignore possible labeling of the vertices of G.

Example 10.5.2. The adjacency matrices of the 4-cycle C4 and the path P4 on 4 vertices are given
below.

$$A(C_4) = \begin{bmatrix} 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \end{bmatrix}, \qquad A(P_4) = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix}.$$

Exercise 10.5.3. 1. A graph G is not connected if and only if there exists a permutation matrix
P such that $P^t A(G) P = \begin{bmatrix} A_{11} & 0 \\ 0 & A_{22} \end{bmatrix}$ for some square matrices A11 and A22.
2. Two graphs G and H are isomorphic if and only if A(G) = Pᵗ A(H)P for some permutation
matrix P.

Theorem 10.5.4. The (i, j)th entry of B = A(G)^k is the number of i-j walks of length k in G.

Proof. Write A(G) = [aij] and B = [bij]. Then B = A(G)^k implies that

$$b_{ij} = \sum_{i_1, \ldots, i_{k-1}} a_{i i_1} a_{i_1 i_2} \cdots a_{i_{k-1} j}.$$

Thus, bij = r if and only if we have r sequences i1, . . . , ik−1 with $a_{i i_1} = \cdots = a_{i_{k-1} j} = 1$. That is,
bij = r if and only if we have r walks of length k between i and j.
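
Theorem 10.5.4 can be checked numerically on the 4-cycle of Example 10.5.2. The sketch below (helper names ours, plain lists instead of a matrix library) compares the entries of A^k with a direct recursive count of walks.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(a, k):
    """k-th power of a square matrix, starting from the identity matrix."""
    n = len(a)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, a)
    return result


def count_walks(a, i, j, k):
    """Number of i-j walks of length k, counted by following the edges."""
    if k == 0:
        return int(i == j)
    return sum(count_walks(a, t, j, k - 1) for t in range(len(a)) if a[i][t])


a_c4 = [[0, 1, 0, 1],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [1, 0, 1, 0]]
walks2 = mat_pow(a_c4, 2)
print(walks2)
```

For instance, the (1, 3) entry of A(C4)^2 is 2, corresponding to the two walks of length 2 between the opposite vertices 1 and 3 of the cycle (indices in the code start from 0).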

Theorem 10.5.5. Let G be a graph of order n. Then, G is connected if and only if all entries of
(I + A(G))^{n−1} are positive.

Proof. Write B = I + A = [bij]. Let G be connected. If P is an i-j path of length n − 1, then
$(B^{n-1})_{ij} \ge (A^{n-1})_{ij} \ge 1$. If P = [i, i1, . . . , ik = j] is an i-j path of length k < n − 1, then
$b_{ii} \cdots b_{ii}\, b_{i i_1} \cdots b_{i_{k-1} j} = 1$, where $b_{ii}$ is used n − 1 − k times. Thus, $(B^{n-1})_{ij} > 0$.
Conversely, let $(B^{n-1})_{ij} > 0$. Then, some summand $b_{i i_1} b_{i_1 i_2} \cdots b_{i_{n-2} j}$ of this entry is positive. By throwing
out the entries of the form $b_{ll}$, for 1 ≤ l ≤ n, from this expression, we have an expression which corresponds
to an i-j walk of length at most n − 1. As i and j are arbitrary, G is connected.
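
The criterion of Theorem 10.5.5 is easy to turn into a test. The following sketch (function name ours) computes (I + A)^{n−1} with plain lists and checks that every entry is positive; the second example graph, consisting of two disjoint edges, is ours.

```python
def is_connected_by_matrix(adj_matrix):
    """Check whether all entries of (I + A)^(n-1) are positive."""
    n = len(adj_matrix)
    b = [[adj_matrix[i][j] + int(i == j) for j in range(n)] for i in range(n)]
    power = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(n - 1):
        power = [[sum(power[i][t] * b[t][j] for t in range(n)) for j in range(n)]
                 for i in range(n)]
    return all(entry > 0 for row in power for entry in row)


a_p4 = [[0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0]]           # the path P4 of Example 10.5.2
two_edges = [[0, 1, 0, 0],
             [1, 0, 0, 0],
             [0, 0, 0, 1],
             [0, 0, 1, 0]]      # hypothetical example: two disjoint edges
print(is_connected_by_matrix(a_p4), is_connected_by_matrix(two_edges))
```

The test returns True for P4 and False for the graph with two components.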

Exercise 10.5.6. Let G be a graph with adjacency matrix A. Prove the following:
1. The eigenvalues of A are all real.
2. The eigenvectors of A can be chosen to form an orthonormal basis of Rⁿ.
3. Each rational eigenvalue of A is an integer.
4. If G = Kn, then A = J − I, where J is the matrix with each entry 1.
5. If G = Kn, then the eigenvalues of A are n − 1 with multiplicity 1, and −1 with multiplicity
n − 1.
6. Let Ḡ be the complement graph of G. Then, A(Ḡ) = J − I − A.
7. If G is k-regular then the following are true:
(a) k is an eigenvalue of A.
(b) n − k − 1 is an eigenvalue of A(Ḡ).
(c) If λ ≠ k is an eigenvalue of A, then −1 − λ is an eigenvalue of A(Ḡ).
8. If G is bipartite then there exists a permutation matrix P such that
$$B = P^t A P = \begin{bmatrix} 0 & B_1 \\ B_1^t & 0 \end{bmatrix}.$$
Further, prove that λ is an eigenvalue of A if and only if −λ is an eigenvalue of A.

Definition 10.5.7. Let G be a graph with V (G) = {1, 2, . . . , n} and E(G) = {e1, e2, . . . , em}. Let
us arbitrarily give an orientation to each edge of G. For this fixed orientation, the vertex-edge
incidence matrix or in short, incidence matrix, Q(G) = [qij] of G is an n × m matrix whose (i, ej)th
entry is given by

$$q_{ij} = \begin{cases} 1 & \text{if edge } e_j \text{ originates at } i, \\ -1 & \text{if edge } e_j \text{ terminates at } i, \\ 0 & \text{if edge } e_j \text{ is not incident with } i. \end{cases}$$


Example 10.5.8. Consider the graph given below.

[Figure: an oriented graph on the vertices 1, . . . , 5 with the edges e1, . . . , e7.]

It has V (G) = {1, 2, 3, 4, 5} and E(G) = {e1, e2, . . . , e7}. Its incidence matrix is

$$Q = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 1 & 1 \\ -1 & -1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & -1 & -1 & 0 & -1 \\ 0 & 0 & 0 & 1 & 0 & -1 & 0 \end{bmatrix}.$$
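
The matrix Q determines the oriented graph completely. As a sanity check, the sketch below (variable names ours) reads the oriented edges off the columns of Q, computes the degrees, and compares the n × n product Q Qᵗ with diag(d1, . . . , dn) − A.

```python
q = [
    [1, 0, 0, 0, 0, 1, 1],
    [-1, -1, 0, 0, 1, 0, 0],
    [0, 1, 1, 0, 0, 0, 0],
    [0, 0, -1, -1, -1, 0, -1],
    [0, 0, 0, 1, 0, -1, 0],
]
n, m = len(q), len(q[0])

# Read off each oriented edge as (tail, head), with the vertices labeled 1..n.
edges = []
for j in range(m):
    tail = next(i for i in range(n) if q[i][j] == 1) + 1
    head = next(i for i in range(n) if q[i][j] == -1) + 1
    edges.append((tail, head))

degrees = [sum(abs(x) for x in row) for row in q]
column_sums = [sum(q[i][j] for i in range(n)) for j in range(m)]

# Q Q^t is an n x n matrix; compare it with diag(degrees) - A.
qqt = [[sum(q[i][t] * q[j][t] for t in range(m)) for j in range(n)]
       for i in range(n)]
adjacency = [[0] * n for _ in range(n)]
for u, v in edges:
    adjacency[u - 1][v - 1] = adjacency[v - 1][u - 1] = 1
laplacian = [[(degrees[i] if i == j else 0) - adjacency[i][j] for j in range(n)]
             for i in range(n)]
print(edges, degrees, qqt == laplacian)
```

Each column of Q has exactly one entry 1 and one entry −1, so every column sum is 0; for example, the first column says that e1 is oriented from the vertex 1 to the vertex 2.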

Exercise 10.5.9. Let G be a graph on n vertices and m edges. Prove the following:
1. QQᵗ = diag(d1, d2, . . . , dn) − A, where diag(d1, d2, . . . , dn) is the diagonal matrix whose diagonal
entries di are the degrees of the n vertices.
2. Each diagonal entry of QᵗQ is 2 and, for j ≠ l, the (j, l)th entry of QᵗQ is ±1 if the edges ej and
el share an end vertex, and 0 otherwise. Thus, up to the signs of its entries, QᵗQ − 2I is the
adjacency matrix A(L(G)) of the line graph L(G) of G.
3. If e is the vector with each component as 1, then Qᵗe = 0.
4. If G is connected, then rank(Q) = n − 1.
5. Any square submatrix of Q is unimodular; that is, the determinant of any square submatrix of
Q is either −1 or 0 or 1.

Chapter 11

Polya Theory∗

In Section 5.5, we have already studied ideas and problems related to circular permutations. In this
chapter, we extend the ideas of that section to a more general setting. This will help
us to answer questions of the following type:
1. How many different necklace configurations are possible if we use 6 beads of 3 different colors?
Or for that matter what if we use n beads of m different colors?
2. How many different necklace configurations are possible if we use 12 beads among which 3 are
red, 5 are blue and 4 are green? And a generalization of this problem?

Observe that if we want to look at different color configurations of a necklace formed using 6 beads,
we need to understand the symmetries, not just the circular rotations, of a hexagon. Such a study is
achieved through what are called groups in the literature. Once we have learnt a bit about groups, we study
group actions. This study helps us in defining an equivalence relation on the set of color configurations
for a given necklace. It turns out that the number of distinct color configurations is the same as the
number of equivalence classes.

11.1 Groups
Before coming to the definition and its properties, let us look at the properties of the sets N, Z, Q, R
and C. We know that the set S, which may be Z, Q, R or C, satisfies the following:

Binary operation: For all a, b ∈ S, a + b, called the addition of a and b, is an element of S.

Addition is associative: For all a, b, c ∈ S, (a + b) + c = a + (b + c).

Additive identity: S contains an element, called zero, denoted 0, so that for each a ∈ S, a + 0 =
a = 0 + a.

Additive inverse: For every element a ∈ S, there exists an element −a ∈ S such that a + (−a) =
0 = −a + a.

Addition is commutative: For all a, b ∈ S, a + b = b + a.

Write S∗ = S \ {0}. Correspondingly, we write Z∗ = Z \ {0}, Q∗ = Q \ {0}, R∗ = R \ {0} and
C∗ = C \ {0}. As in the previous case, we see that similar statements hold true for S∗ with respect to
the multiplication operation, with one exception. They are as follows:

Binary operation: For all a, b ∈ S ∗ , a · b, called the multiplication of a and b, is an element of S ∗ .


Multiplication is associative: For all a, b, c ∈ S ∗ , (a · b) · c = a · (b · c).

Multiplicative identity: S∗ contains an element, called the unit element, or one, denoted 1, such
that for each a ∈ S∗, a · 1 = a = 1 · a.

Multiplication is commutative: For all a, b ∈ S ∗ , a · b = b · a.

Observe that if we choose a ∈ Z∗ with a ≠ 1, −1, then there does not exist an element b ∈ Z∗
such that a · b = 1 = b · a. In contrast, for each a in Q∗, R∗ or C∗, one can always find a b in the
same set such that a · b = 1 = b · a. This is the exception mentioned above.
Based on the above examples, an abstract notion called a group is defined. Formally, one defines
a group as follows.

Definition 11.1.1. Let G be a nonempty set and let ∗ be a binary operation on G. The pair (G, ∗)
is called a group if the following are satisfied:
1. For all a, b, c ∈ G, (a ∗ b) ∗ c = a ∗ (b ∗ c). (Associativity Property holds in G.)
2. There exists e ∈ G such that for each a ∈ G, a ∗ e = a = e ∗ a. (Existence of Identity in G.)
3. For each a ∈ G, there exists b ∈ G such that a ∗ b = e = b ∗ a. (Existence of Inverse in G. )

In addition, if the statement “For all a, b ∈ G, a ∗ b = b ∗ a” is true, then the group (G, ∗) is called
an abelian (commutative) group.

Observe that since ∗ is a binary operation on G, for each pair of elements a, b ∈ G, the element
a ∗ b is also an element of G.

When (G, ∗) is a group, we say informally that G is a group with the operation ∗. For example,
Z, Q, R and C are groups with addition as the binary operation. Also, Q \ {0}, R \ {0} and C \ {0} are
groups with multiplication as the binary operation. In general, if the binary operation ∗ is understood
from the context, we say that G is a group, and write ab instead of a ∗ b when a, b ∈ G.
Before proceeding with the examples of groups that concern us, we state a few basic results in group
theory in the following remark. They may be proved without much difficulty.

Remark 11.1.2. Let (G, ∗) be a group. Then the following hold:


1. The identity element of G is unique. Hence, keeping a definite notation such as e for the identity
element is meaningful.
2. Corresponding to any a ∈ G, the element b ∈ G that satisfies a ∗ b = e = b ∗ a is unique. So, we
denote such a b by a^{-1}, and call it the inverse of a.
3. e^{-1} = e.
4. For each a ∈ G, (a^{-1})^{-1} = a.
5. If a ∗ b = a ∗ c for some a, b, c ∈ G, then b = c. Similarly, if b ∗ d = c ∗ d for some b, c, d ∈ G, then
b = c. That is, the cancellation laws hold in G.
6. For all a, b ∈ G, (ab)^{-1} = b^{-1} a^{-1}.
7. By convention, we assume a^0 = e for each a ∈ G, and define a^n = a^{n−1} · a for n ∈ N. Then
a^n = a · a^{n−1}.
8. For each a ∈ G, (a^n)^{-1} = (a^{-1})^n for all n ∈ W. We write both (a^n)^{-1} and (a^{-1})^n as a^{-n}.
9. The last two statements define a^m for each a ∈ G and for each m ∈ Z.

In the remaining part of this chapter, the binary operation may not be explicitly mentioned as it
will be clear from the context. We now look at a few examples that will be used later in this chapter.

Example 11.1.3. [Symmetric Group on n letters/symbols] Write N = {1, 2, . . . , n}. Recall that a
bijection f : N → N is called a permutation on n elements. The set of all permutations on n elements
is denoted by Sn , i.e.,
Sn = {f : N → N | f is one to one and onto}.

We observe the following:


1. Suppose f, g, h ∈ Sn . Then f, g, h : N → N are one-to-one and onto functions.
(a) Hence f ◦ g, the composition of f and g, is also one-to-one and onto. Thus, f ◦ g ∈ Sn . So,
“composition of functions”, denoted ◦, defines a binary operation in Sn .
(b) It is well known that ◦ is an associative operation, i.e., (f ◦ g) ◦ h = f ◦ (g ◦ h).
(c) The identity function e : N → N defined by e(i) = i for all i = 1, 2, . . . , n is a one-to-one
and onto function. Further, f ◦ e = f = e ◦ f for all f ∈ Sn . The permutation e is called
the identity permutation.
(d) As f is a one-to-one and onto function, f^{-1} : N → N defined by f^{-1}(i) = j, whenever
f(j) = i, for all i = 1, 2, . . . , n, is a one-to-one and onto function. So, for each f ∈ Sn,
f^{-1} ∈ Sn and f ◦ f^{-1} = e = f^{-1} ◦ f.

2. Thus (Sn , ◦) is a group. This group is called the Symmetric group or the Permutation
group on n letters/symbols.

3. [Product of permutations] Let σ, τ ∈ Sn. Then σ ◦ τ (the composition of σ and τ) is popularly
called the product of σ and τ. From now onwards, we will just use στ in place of σ ◦ τ, i.e.,
we will not use the symbol ◦ unless it becomes necessary for the sake of clarity.
4. If σ ∈ Sn, then one represents it by writing
\[
\sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ \sigma(1) & \sigma(2) & \cdots & \sigma(n) \end{pmatrix}.
\]
This representation of an element of Sn is called the two-row notation.
5. Since σ : N → N is one-to-one and onto, {σ(1), σ(2), . . . , σ(n)} = N . Hence, there are n choices
for σ(1), n − 1 choices for σ(2) (all elements of N except σ(1)) and so on. Thus, |Sn | = n!.
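
The observations above can be checked by brute force for a small n. The sketch below stores a permutation σ as the bottom row of its two-row notation, that is, as a tuple whose (i−1)th entry is σ(i). The function names `compose` and `inverse` are illustrative, and the choice n = 4 is arbitrary.

```python
from itertools import permutations
from math import factorial

def compose(s, t):
    """Product st with (st)(i) = s(t(i)); s[i-1] stores the image of i."""
    return tuple(s[t[i] - 1] for i in range(len(t)))

def inverse(s):
    """Inverse permutation: if s(j) = i then inverse(s)(i) = j."""
    inv = [0] * len(s)
    for j, i in enumerate(s, start=1):
        inv[i - 1] = j
    return tuple(inv)

n = 4
Sn = set(permutations(range(1, n + 1)))   # all bottom rows of two-row notations
e = tuple(range(1, n + 1))                # the identity permutation

closed = all(compose(f, g) in Sn for f in Sn for g in Sn)
identity_ok = all(compose(f, e) == f == compose(e, f) for f in Sn)
inverses_ok = all(compose(f, inverse(f)) == e == compose(inverse(f), f) for f in Sn)
associative = all(compose(compose(f, g), h) == compose(f, compose(g, h))
                  for f in Sn for g in Sn for h in Sn)

print(len(Sn) == factorial(n), closed, identity_ok, inverses_ok, associative)
```

The associativity check runs over all 24^3 triples, which is still fast for n = 4 but grows quickly with n.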

Before discussing other examples, let us try to understand the group Sn . As seen above, any element
σ ∈ Sn can be represented using a two-row notation. There is another notation for permutations that
is often very useful. This notation is called the cycle notation which we define next.

Definition 11.1.4. Let σ ∈ Sn and let S = {i1, i2, . . . , ik} ⊆ {1, 2, . . . , n}. If σ satisfies

σ(i_ℓ) = i_{ℓ+1} for each ℓ = 1, 2, . . . , k − 1, σ(i_k) = i_1, and σ(r) = r for r ∉ S,

then σ is called a k-cycle and is denoted by σ = (i1, i2, . . . , ik) or (i2, i3, . . . , ik, i1) and so on.


Example 11.1.5. 1. The permutation
\[
\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 3 & 4 & 1 & 5 \end{pmatrix}
\]
can be written in cycle notation as (1234), (2341), (3412) or (4123), as σ(1) = 2, σ(2) = 3,
σ(3) = 4, σ(4) = 1 and σ(5) = 5.
2. The permutation
\[
\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 2 & 3 & 1 & 4 & 6 & 5 \end{pmatrix}
\]
equals (123)(56) in cycle notation, as σ(1) = 2, σ(2) = 3, σ(3) = 1; σ(4) = 4; and σ(5) = 6,
σ(6) = 5. That is, this element is formed with the help of the two cycles (123) and (56).

3. Consider two permutations σ = (143)(27) and τ = (1357)(246). Then, their product is obtained
as follows: (στ)(1) = σ(τ(1)) = σ(3) = 1, (στ)(2) = σ(τ(2)) = σ(4) = 3, (στ)(3) = σ(τ(3)) =
σ(5) = 5, (στ)(4) = σ(τ(4)) = σ(6) = 6, (στ)(5) = 2, (στ)(6) = 7 and (στ)(7) = 4. Hence
\[
\sigma\tau = (143)(27)(1357)(246) = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 1 & 3 & 5 & 6 & 2 & 7 & 4 \end{pmatrix} = (235)(467).
\]

4. Similarly, verify that (1456)(152) = (16)(245).


5. Let σ = (123) and τ = (56). Then, σ, τ can be thought of as elements of S6 with σ(i) = i for
4 ≤ i ≤ 6. Similarly, τ ∈ S6 with τ (i) = i for 1 ≤ i ≤ 4. Further, the permutation (123)(56) is
the product of σ and τ .
6. Note that the identity permutation e ∈ Sn satisfies e(i) = i for 1 ≤ i ≤ n. So, we sometimes
write e = (1)(2) · · · (n).
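
The products in items 3 and 4 can be recomputed mechanically. The following sketch multiplies cycles with the convention of these notes, in which the rightmost cycle acts first. The function names `cycle_image` and `from_cycles` are illustrative.

```python
def cycle_image(cycle, i):
    """Image of i under a single cycle (i1, i2, ..., ik)."""
    if i in cycle:
        return cycle[(cycle.index(i) + 1) % len(cycle)]
    return i

def from_cycles(cycles, n):
    """Bottom row of the two-row notation of a product of cycles in S_n.

    The cycles need not be disjoint; the rightmost cycle is applied first.
    """
    images = []
    for start in range(1, n + 1):
        x = start
        for cycle in reversed(cycles):
            x = cycle_image(cycle, x)
        images.append(x)
    return tuple(images)

# Item 3: (143)(27)(1357)(246) = (235)(467) in S_7.
lhs3 = from_cycles([(1, 4, 3), (2, 7), (1, 3, 5, 7), (2, 4, 6)], 7)
rhs3 = from_cycles([(2, 3, 5), (4, 6, 7)], 7)

# Item 4: (1456)(152) = (16)(245) in S_6.
lhs4 = from_cycles([(1, 4, 5, 6), (1, 5, 2)], 6)
rhs4 = from_cycles([(1, 6), (2, 4, 5)], 6)

print(lhs3 == rhs3, lhs4 == rhs4)
```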

Definition 11.1.6. Two cycles σ = (i1 , i2 , . . . , it ) and τ = (j1 , j2 , . . . , js ) are said to be disjoint if

{i1 , i2 , . . . , it } ∩ {j1 , j2 , . . . , js } = ∅.

The proof of the following theorem can be obtained from any standard book on abstract algebra.

Theorem 11.1.7. [Permutation as product of disjoint cycles] Let σ ∈ Sn . Then σ can be written
as a product of disjoint cycles.

Remark 11.1.8. Observe that the representation of a permutation as a product of disjoint cycles,
none of which is the identity, is unique up to the order of the disjoint cycles. The representation of
an element σ ∈ Sn as a product of disjoint cycles is called the cyclic decomposition of σ.
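
One way to compute the cyclic decomposition is to follow each element until it returns to its starting point. The sketch below does this for a permutation given by the bottom row of its two-row notation; the function name `cycle_decomposition` is an illustrative choice, and fixed points are omitted, as in the remark.

```python
def cycle_decomposition(perm):
    """Disjoint cycles of perm, where perm[i-1] is the image of i.

    Cycles of length 1 (fixed points) are left out.
    """
    seen, cycles = set(), []
    for start in range(1, len(perm) + 1):
        if start in seen:
            continue
        cycle, i = [], start
        while i not in seen:          # follow start -> perm(start) -> ...
            seen.add(i)
            cycle.append(i)
            i = perm[i - 1]
        if len(cycle) > 1:
            cycles.append(tuple(cycle))
    return cycles

print(cycle_decomposition((1, 3, 5, 6, 2, 7, 4)))
print(cycle_decomposition((2, 3, 1, 4, 6, 5)))
```

Each cycle is reported starting from its smallest element, which fixes one representative among the equivalent ways of writing the same cycle.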

Example 11.1.9. 1. Symmetries of regular n-gons in plane.

(a) Let A be the square in the XY-plane with its vertices labeled 1, 2, 3 and 4 and placed at
the points (0, 1, 0), (0, 0, 0), (1, 0, 0) and (1, 1, 0), respectively. (Since each side of A measures
1 unit, we say that A is a unit square.) Our aim is to move the square in space so that
each vertex may change its place but, altogether, the vertices are again placed at these four
points. Verify that whichever way we move the square (using only such movements), the
square is moved to one of the configurations shown in Figure 11.1.
Now, let e denote the initial position of A. Then, one can obtain the 8 possible positions
(see Figure 11.1) by repeated application of the counter-clockwise rotation of A by 90°,
denoted by r, and the flip of A about the vertical axis passing through the midpoints of the
opposite horizontal edges, denoted by f. So, we have a set G = {e, r, r^2, r^3, f, rf, r^2 f, r^3 f}
whose elements are functions that send A to one of the configurations in Figure 11.1.
Thus, with the composition of functions as the binary operation,

G = {e, r, r^2, r^3, f, rf, r^2 f, r^3 f} with relations r^4 = e = f^2 and f r^3 = rf (11.1)

forms a group. Further, (11.1) gives f r = r^3 f, and hence (rf)^2 = (rf)(rf) = r(fr)f =
r(r^3 f)f = r^4 f^2 = e. Similarly, it can be checked that (r^2 f)^2 = (r^3 f)^2 = e, i.e., the elements
f, rf, r^2 f and r^3 f are flips.

[Figure 11.1: Symmetries of a square.]

The group G is generally denoted by D4 and is called the Dihedral group with 8 elements
or the symmetries of a square. This group can also be represented by

H = {e, (1234), (13)(24), (1432), (14)(23), (24), (12)(34), (13)} (11.2)

where the elements are obtained using the position of the vertices of the square in its new
position with respect to the position of the vertices in A.
For another understanding, observe that if r and f in G are mapped to (1234) and (14)(23),
respectively, in H, then using the respective binary operations, the different elements of G
and H can be identified. For example, rf is mapped to the product (1234)(14)(23) = (24).

(b) In the same way, one can define the symmetries of an equilateral triangle (see Figure 11.2).
This group is denoted by D3 and is represented as

D3 = {e, r, r^2, f, rf, r^2 f} with relations r^3 = e = f^2 and f r^2 = rf, (11.3)

where r is a counter-clockwise rotation by 120° = 2π/3 and f is a flip. Using Figure 11.2,
one can check that the group D3, consisting of 6 elements, can also be represented by

D3 = {e, (ABC), (ACB), (BC), (CA), (AB)}.

The readers should verify that (ABC)^2 = (ABC)(ABC) = (ACB), (ABC)^3 = e, (AB)^2 =
e, (ABC)(AB) = (AC) and so on.
(c) It can be verified that the group of symmetries of a regular pentagon is given by
G = {e, r, r^2, r^3, r^4, f, rf, r^2 f, r^3 f, r^4 f} with r^5 = e = f^2 and rf = f r^4, where r
denotes a counter-clockwise rotation through an angle of 72° = 2π/5 and f is a flip along a
line that passes through a vertex and the midpoint of the opposite edge. Or equivalently,
if we label the vertices of a regular pentagon, counter-clockwise, with the numbers 1, 2, 3, 4
and 5, then

G = {e, (1, 2, 3, 4, 5), (1, 3, 5, 2, 4), (1, 4, 2, 5, 3), (1, 5, 4, 3, 2), (2, 5)(3, 4),
(1, 3)(4, 5), (1, 5)(2, 4), (1, 2)(3, 5), (1, 4)(2, 3)}.

(d) In general, one can define symmetries of a regular n-gon. This group is denoted by Dn, has
2n elements and is represented as

{e, r, r^2, . . . , r^{n−1}, f, rf, . . . , r^{n−1} f} with r^n = e = f^2 and f r^{n−1} = rf. (11.4)

Here the symbol r stands for a counter-clockwise rotation through an angle of 2π/n and f
stands for a vertical flip.

[Figure 11.2: Symmetries of an Equilateral Triangle.]

2. Symmetries of regular platonic solids.

(a) Recall from geometry that a tetrahedron is a 3-dimensional regular object having 6 edges,
4 vertices and 4 faces, each face being an equilateral triangle (see Figure 11.3). If we denote
the vertices of the tetrahedron with the numbers 1, 2, 3 and 4, then the group of symmetries
of the tetrahedron is the following:

T = {e, (234), (243), (124), (142), (123), (132), (134), (143), (12)(34), (13)(24), (14)(23)},

where, for distinct numbers i, j, k and ℓ, the element (ijk) is formed by a rotation of 120°
about the line that passes through the vertex ℓ and the centroid of the equilateral triangle
with vertices i, j and k. Similarly, the group element (ij)(kℓ) is formed by a rotation of
180° about the line that passes through the mid-points of the edges (ij) and (kℓ).
(b) Consider the Cube and the Octahedron given in Figure 11.3. It can be checked that the
group of symmetries of the two figures has 24 elements. We give the group elements for the
symmetries of the cube, when the vertices of the cube are labeled. The readers are required
to compute the group elements for the symmetries of the octahedron. For the cube (see
Figure 11.3), the group elements are
i. e, the identity element;
ii. 3 × 3 = 9 elements that are obtained by rotations about lines that pass through the
centers of opposite faces (3 pairs of opposite faces; each face is a square, which
corresponds to rotations by multiples of 90°). In terms of the vertices of the cube, the
group elements are

(1234)(5678), (13)(24)(57)(68), (1432)(5876), (1265)(3784), (16)(25)(38)(47),
(1562)(3487), (1485)(2376), (18)(45)(27)(36), (1584)(2673).

iii. 2 × 4 = 8 elements that are obtained by rotations about lines that pass through
opposite vertices (4 pairs of opposite vertices; each vertex is incident with 3 edges, which
corresponds to rotations by multiples of 120°). The group elements in terms of the
vertices of the cube are

(254)(368), (245)(386), (163)(457), (136)(475), (275)(138),
(257)(183), (168)(274), (186)(247).

iv. 1 × 6 = 6 elements that are obtained by rotations about lines that pass through the
midpoints of opposite edges (6 pairs of opposite edges, which corresponds to a rotation
of 180°). The corresponding elements in terms of the vertices of the cube are

(14)(67)(28)(35), (23)(58)(17)(46), (15)(37)(28)(64), (26)(48)(17)(35),
(12)(78)(36)(45), (34)(56)(17)(28).

[Figure 11.3: Regular Platonic solids.]

[Figure 11.4: Understanding the group of symmetries of a cube (rotations by 90°, 120° and 180°).]

(c) Consider now the icosahedron and the dodecahedron (see Figure 11.3). Note that the
icosahedron has 12 vertices, 20 faces and 30 edges and the dodecahedron has 20 vertices, 12
faces and 30 edges. It can be checked that the group of symmetries of each of the two
figures has 60 elements. We give the idea of the group elements for the symmetries of the
icosahedron. The readers are required to compute the group elements for the symmetries
of the dodecahedron. For the icosahedron, one has
i. e, the identity element;
ii. 2 × 10 = 20 elements that are obtained by rotations about lines that pass through
the centers of opposite faces (10 pairs of opposite faces; each face is an equilateral
triangle, which corresponds to rotations by multiples of 120°);
iii. 4 × 6 = 24 elements that are obtained by rotations about lines that pass through
opposite vertices (6 pairs of opposite vertices; each vertex is incident with 5 edges,
which corresponds to rotations by multiples of 72°);
iv. 1 × 15 = 15 elements that are obtained by rotations about lines that pass through the
midpoints of opposite edges (15 pairs of opposite edges, which corresponds to a rotation
of 180°).
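
The element counts in this example can be cross-checked by generating each group from a few of its listed elements. The sketch below is only an illustration: the choice of generators (r and f for the square, two 3-cycles for the tetrahedron, two face rotations for the cube) and the function names are assumptions made here, not part of the notes.

```python
def cycle_image(cycle, i):
    if i in cycle:
        return cycle[(cycle.index(i) + 1) % len(cycle)]
    return i

def from_cycles(cycles, n):
    """Permutation of {1..n} given by a product of cycles (rightmost acts first)."""
    images = []
    for start in range(1, n + 1):
        x = start
        for cycle in reversed(cycles):
            x = cycle_image(cycle, x)
        images.append(x)
    return tuple(images)

def compose(s, t):
    """(st)(i) = s(t(i))."""
    return tuple(s[t[i] - 1] for i in range(len(t)))

def generate(generators):
    """All finite products of the generators; in a finite group this is a subgroup."""
    identity = tuple(range(1, len(generators[0]) + 1))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

# Square: r = (1234), f = (14)(23); compare with the set H in (11.2).
D4 = generate([from_cycles([(1, 2, 3, 4)], 4), from_cycles([(1, 4), (2, 3)], 4)])
H = {from_cycles(c, 4) for c in (
    [], [(1, 2, 3, 4)], [(1, 3), (2, 4)], [(1, 4, 3, 2)],
    [(1, 4), (2, 3)], [(2, 4)], [(1, 2), (3, 4)], [(1, 3)])}

# Tetrahedron: generated here by the rotations (123) and (234).
T = generate([from_cycles([(1, 2, 3)], 4), from_cycles([(2, 3, 4)], 4)])

# Cube: generated here by the face rotations (1234)(5678) and (1265)(3784).
cube = generate([from_cycles([(1, 2, 3, 4), (5, 6, 7, 8)], 8),
                 from_cycles([(1, 2, 6, 5), (3, 7, 8, 4)], 8)])

print(len(D4), D4 == H, len(T), len(cube))
```

Closing under left multiplication by the generators suffices because every element of a finite group has finite order, so inverses are themselves products of generators.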

Exercise 11.1.10. Determine the groups of symmetries of a parallelogram, a rectangle, a rhombus
and an octahedron.

By now, we have already come across lots of examples of groups that arise as symmetries of different
objects. To proceed further, we study the notion of subgroup of a given group.

Definition 11.1.11. Let (G, ∗) be a group. A nonempty subset H of G is said to be a subgroup of
G if (H, ∗) is a group.

Note that the binary operation ∗ on H is the restriction of ∗ on G to the subset H. We informally
say that the binary operation on H is the same as that in G, and use the notation ∗ for the restriction
of ∗ to H. Thus, H is a subgroup of G if and only if H ⊆ G, H ≠ ∅ and H forms a group with the
same binary operation as the group G.


Example 11.1.12. 1. Let G be a group with identity element e. Then G and {e} are themselves
groups and hence are subgroups of G. These two subgroups are called trivial subgroups.
2. Both Z and Q are subgroups of R with addition as the binary operation.
3. The sets {e, f} and {e, r^2, f, r^2 f} form subgroups of D4.
4. Let σ ∈ S4. Then, using Theorem 11.1.7, we know that σ has a cycle representation. With this
understanding, verify that D4 is a subgroup of S4.
5. Verify that H = {e, r, r^2, . . . , r^{n−1}} is a subgroup of Dn.

We leave the proof of the next result to the readers.

Lemma 11.1.13. Let G be a group and let a ∈ G, a ≠ e. Then H = {a^n : n ∈ Z} is a subgroup of G.

In view of the above result, we give the following definition.

Definition 11.1.14. Let G be a group and let a ∈ G, a ≠ e. The subgroup generated by a,
denoted by ⟨a⟩, is the subgroup ⟨a⟩ := {a^n : n ∈ Z}.

The following two results help us decide whether a given nonempty subset H of a group G is a
subgroup or not.
subgroup or not.

Theorem 11.1.15 (Two-Step Subgroup Test). Let H be a nonempty subset of a group G. Then H
is a subgroup of G if and only if the following two conditions are satisfied:

1. For all a, b ∈ H, ab ∈ H. (H is closed with respect to the binary operation of G.)
2. For each a ∈ H, a^{-1} ∈ H. (H is closed with respect to taking inverse.)

Proof. If H is a subgroup, then clearly the two conditions are satisfied. Conversely, suppose that the
two conditions are satisfied.
Since H ≠ ∅, pick a ∈ H. The second condition implies that a^{-1} ∈ H. By the first condition,
e = aa^{-1} ∈ H. As in G, xe = ex = x for each x ∈ H. Hence e is the identity element of H.
For each x ∈ H, the second condition gives x^{-1} ∈ H, and xx^{-1} = e = x^{-1}x; so x^{-1} is the inverse
of x in H.
Associativity is inherited from G, and by the first condition the operation of G restricts to a
binary operation on H.
Therefore, H is a subgroup of G.

Theorem 11.1.16. [Subgroup test] Let H be a nonempty subset of a group G. Then H is a subgroup
of G if and only if for each pair of elements a, b ∈ H, ab^{-1} ∈ H.

Proof. If H is a subgroup, then a, b ∈ H implies ab^{-1} ∈ H. Conversely, suppose that for each pair of
elements a, b ∈ H, ab^{-1} ∈ H. Since H ≠ ∅, pick x ∈ H. Taking a = b = x gives e = xx^{-1} ∈ H.
First, if x ∈ H, then with a = e and b = x, we have ab^{-1} = ex^{-1} = x^{-1} ∈ H.
Second, if x, y ∈ H, then by what we have just proved, y^{-1} ∈ H. As y = (y^{-1})^{-1}, we see that
xy = x(y^{-1})^{-1} ∈ H.
By Theorem 11.1.15, H is a subgroup of G.

Example 11.1.17. 1. The subsets of Z given below are not subgroups of (Z, +).

(a) Let H = {0, 1, 2, 3, . . .} ⊆ Z. Note that, for each a, b ∈ H, a + b ∈ H and the identity
element 0 ∈ H. But H is not a subgroup of Z, as for each n ∈ H with n ≠ 0, −n ∉ H.
(b) Let H = Z \ {0} = {. . . , −3, −2, −1, 1, 2, 3, . . .} ⊆ Z. Note that the identity element 0 ∉ H
and hence H is not a subgroup of Z.
(c) Let H = {−1, 0, 1} ⊆ Z. Then H contains the identity element 0 of Z and for each h ∈ H,
the inverse −h ∈ H. But H is not a subgroup of Z as 1 + 1 = 2 ∉ H.

2. Let G be an abelian group with identity e. Write H = {x ∈ G : x^2 = e} and K = {x^2 : x ∈ G}.
Prove that H and K are subgroups of G.
Answer: Clearly e ∈ H; so H ≠ ∅. Let x, y ∈ H ⊆ G. Then x^2 = e = y^2. Since G is abelian,

(xy^{-1})^2 = xy^{-1}xy^{-1} = x^2 (y^{-1})^2 = e (y^2)^{-1} = e^{-1} = e.

So, xy^{-1} ∈ H. By Theorem 11.1.16, H is a subgroup of G.
Again, e = e^2 ∈ K; so, K ≠ ∅. Let x, y ∈ K. There exist a, b ∈ G such that x = a^2 and y = b^2.
Notice that b^{-1} ∈ G. Since G is abelian,

xy^{-1} = a^2 (b^2)^{-1} = a^2 (b^{-1})^2 = aab^{-1}b^{-1} = (ab^{-1})^2 ∈ K.

By Theorem 11.1.16, K is a subgroup of G.

As a last result of this section, we prove that the conditions of the above theorems can be weakened
if we assume that H is a finite, nonempty subset of a group G.

Theorem 11.1.18. [Finite subgroup test] Let H be a nonempty finite subset of a group G. Then,
H is a subgroup of G if and only if for each pair of elements a, b ∈ H, ab ∈ H.

Proof. If H is a subgroup, then it is closed with respect to the binary operation of G. Conversely,
suppose that for each pair of elements a, b ∈ H, ab ∈ H. Due to Theorem 11.1.15, we need to show
that for each a ∈ H, a^{-1} ∈ H. Notice that if a = e ∈ H then a^{-1} = e^{-1} = e ∈ H. So, assume that
a ≠ e and a ∈ H. Consider the set S = {a, a^2, a^3, . . . , a^n, . . .}. As H is closed with respect to the
binary operation of G, S ⊆ H. But H has only a finite number of elements. Hence, the powers
a, a^2, a^3, . . . cannot all be distinct. That is, there exist positive integers m, n with m > n such that
a^m = a^n. Thus, using the cancellation laws of Remark 11.1.2, one has a^{m−n} = e. As a ≠ e, we have
m − n ≥ 2, so that m − n − 1 ≥ 1. Hence, a^{-1} = a^{m−n−1} ∈ S ⊆ H.
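
The finite subgroup test can be illustrated by brute force in a small group. The sketch below lists every nonempty subset of S3 that is closed under the product and confirms that each one also contains the identity and all inverses. The choice of S3 and the function names are assumptions made for the illustration.

```python
from itertools import combinations, permutations

def compose(s, t):
    """(st)(i) = s(t(i)); s[i-1] stores the image of i."""
    return tuple(s[t[i] - 1] for i in range(len(t)))

G = list(permutations((1, 2, 3)))   # the group S_3
e = (1, 2, 3)

def is_closed(H):
    return all(compose(a, b) in H for a in H for b in H)

def is_subgroup(H):
    """Direct check of identity, closure and inverses (associativity is inherited)."""
    has_inverses = all(any(compose(a, b) == e == compose(b, a) for b in H) for a in H)
    return e in H and is_closed(H) and has_inverses

closed_subsets = [set(H)
                  for k in range(1, len(G) + 1)
                  for H in combinations(G, k)
                  if is_closed(set(H))]

print(all(is_subgroup(H) for H in closed_subsets))
print(sorted(len(H) for H in closed_subsets))
```

The sizes printed in the second line are the orders of all subgroups of S3.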

Exercise 11.1.19. 1. Consider the group D3 . Does the subset {e, rf } form a subgroup of D3 ?
2. Determine all subgroups of D4 .
3. Fix a positive integer n and consider the group Dn. Now, for each integer i, 0 ≤ i ≤ n − 1, does
the set {e, r^i f} form a subgroup of Dn? Justify your answer.
4. Determine all subgroups of the group of symmetries of a tetrahedron.
5. Determine all subgroups of the group of symmetries of a cube.

11.2 Lagrange’s Theorem


In this section, we prove the first fundamental theorem for groups having finitely many elements.
First, consider the following example.

Example 11.2.1. On R^2 = {(x, y) : x, y ∈ R} define addition componentwise. That is, for (x1, y1),
(x2, y2) ∈ R^2 we take (x1, y1) + (x2, y2) = (x1 + x2, y1 + y2). Then R^2 with componentwise addition
is a group. If H is a line passing through (0, 0), then H is a subgroup of R^2.
For instance, H1 = {(x, y) ∈ R^2 : y = 0}, H2 = {(x, y) ∈ R^2 : x = 0} and H3 = {(x, y) ∈ R^2 :
y = 3x} are subgroups of R^2. Notice that H1 represents the X-axis, H2 represents the Y-axis and H3
represents the line that passes through the origin and has slope 3.
Fix the element (2, 3) ∈ R2 . Then
1. (2, 3) + H1 = {(2, 3) + (x, y) : y = 0} = {(2 + x, 3) : x ∈ R}. It is the line that passes through
the point (2, 3) and is parallel to the X-axis.
2. Verify that (2, 3) + H2 represents the line that passes through the point (2, 3) and is parallel to
the Y -axis.
3. Similarly, (2, 3) + H3 = {(2 + x, 3 + 3x) : x ∈ R} = {(x, y) ∈ R2 : y = 3x − 3} represents the line
that has slope 3 and passes through the point (2, 3).

In general, if H is a line through the origin, regarded as a subgroup of R^2, and (x0, y0) ∈ R^2, then
the set (x0, y0) + H is the line through the point (x0, y0) obtained by a parallel shift of H. Further,
1. (x1, y1) lies on the line (x0, y0) + H if and only if (x0, y0) + H = (x1, y1) + H;
2. for any two points (x0, y0), (x1, y1) ∈ R^2, either (x0, y0) + H = (x1, y1) + H or the two sets are
distinct parallel lines, each parallel to the line H; and
3. the union of all these lines is the whole plane, that is,
\[
\bigcup_{x \in \mathbb{R}} \bigcup_{y \in \mathbb{R}} \bigl( (x, y) + H \bigr) = \mathbb{R}^2.
\]

That is, if we define a relation, denoted ∼, in R^2 by (x1, y1) ∼ (x2, y2) whenever (x1 − x2, y1 − y2) ∈ H,
then the above observations imply that this relation is an equivalence relation. Hence, as (x, y) varies
over all the points of R^2, we get a partition of R^2. Moreover, the equivalence class containing the
point (x0, y0) is the set (x0, y0) + H.

We see that, given a subgroup H of a group G, it may be possible to partition the group G into
subsets of the form gH or Hg, each of which is similar to H in some sense.

Definition 11.2.2. Let H be a subgroup of a group G. Let g ∈ G.


1. The set gH = {gh : h ∈ H} is called a left coset of H in G.
2. The set Hg = {hg : h ∈ H} is called a right coset of H in G.

Remark 11.2.3. Since the identity element e ∈ H, for each fixed g ∈ G, g = ge ∈ gH. Hence, we
often say that gH is the left coset of H containing g. Similarly, g ∈ Hg and hence Hg is said to be
the right coset of H containing g.

Example 11.2.4. Consider the subgroups H = {e, f} and K = {e, r^2} of the group D4. We observe
the following:
1. H = {e, f} = Hf, Hr = {r, fr} = Hfr, Hr^2 = {r^2, fr^2} = Hfr^2 and Hr^3 = {r^3, fr^3} =
Hfr^3.
2. H = {e, f} = fH, rH = {r, rf} = rfH, r^2 H = {r^2, r^2 f} = r^2 fH and r^3 H = {r^3, r^3 f} =
r^3 fH.
3. K = {e, r^2} = Kr^2 = r^2 K, Kr = {r, r^3} = rK = Kr^3 = r^3 K, Kf = {f, r^2 f} = fK =
Kr^2 f = r^2 fK and Kfr = {fr, fr^3} = frK = Kfr^3 = fr^3 K.

From Items 1 and 2, we see that Hg need not be equal to gH for every g ∈ D4; for instance,
Hr = {r, fr} whereas rH = {r, rf}, and fr = r^3 f ≠ rf. In contrast, Item 3 shows that
Kg = gK for each g ∈ D4. So, there is a need to distinguish between these two kinds of subgroups
of D4. This leads to the study of normal subgroups and beyond. The interested reader can look at
any standard book on abstract algebra to go further in this direction.
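
The cosets in Example 11.2.4 can be recomputed by realizing D4 as permutations of the vertices, with r = (1234) and f = (14)(23) as in (11.2). The sketch below is an illustration under that identification; the function names are not from the notes.

```python
def compose(s, t):
    """(st)(i) = s(t(i)); s[i-1] stores the image of i."""
    return tuple(s[t[i] - 1] for i in range(len(t)))

def power(g, k):
    result = tuple(range(1, len(g) + 1))
    for _ in range(k):
        result = compose(g, result)
    return result

e = (1, 2, 3, 4)
r = (2, 3, 4, 1)      # the 4-cycle (1234)
f = (4, 3, 2, 1)      # the flip (14)(23)
D4 = {compose(power(r, i), power(f, j)) for i in range(4) for j in range(2)}

H = {e, f}
K = {e, power(r, 2)}

def left_coset(g, S):
    return frozenset(compose(g, h) for h in S)

def right_coset(S, g):
    return frozenset(compose(h, g) for h in S)

left_H = {left_coset(g, H) for g in D4}
right_H = {right_coset(H, g) for g in D4}
K_commutes = all(left_coset(g, K) == right_coset(K, g) for g in D4)

print(len(left_H), len(right_H), left_H == right_H, K_commutes)
```

Both families of cosets of H partition D4 into 4 sets of size 2, but the two partitions differ, whereas the left and right cosets of K coincide.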



Some important facts about cosets are listed in the following theorem, which the reader can
prove with a little labor.

Theorem 11.2.5. [Cosets are equal or disjoint]


1. Let H be a subgroup of a group G. Suppose a, b ∈ G. Then the following results hold for left
cosets of H in G:
(a) aH = H if and only if a ∈ H.
(b) aH is a subgroup of G if and only if a ∈ H.
(c) Either aH = bH or aH ∩ bH = ∅.
(d) aH = bH if and only if a^{-1}b ∈ H.
(e) Each left coset is an equivalence class of the equivalence relation given by “a ∼ b if
a^{-1}b ∈ H”; and the collection of all left cosets is a partition of G.

2. The following statements hold for right cosets of H in G:
(a) Ha = H if and only if a ∈ H.
(b) Ha is a subgroup of G if and only if a ∈ H.
(c) Either Ha = Hb or Ha ∩ Hb = ∅.
(d) Ha = Hb if and only if ab^{-1} ∈ H.
(e) Each right coset is an equivalence class of the equivalence relation given by “a ∼ b if
ab^{-1} ∈ H”; and the collection of all right cosets is a partition of G.

3. Further, aH = Ha if and only if H = aHa^{-1} = {aha^{-1} : h ∈ H}.


To proceed further, we need the following definition.

Definition 11.2.6. Let G be a group. If G is finite as a set, then |G| is called the order of the
group G; in such a case, G is said to be a finite group, or a group of finite order. If G is infinite
as a set, then G is said to be an infinite group.

Theorem 11.2.7. [Lagrange Theorem] Let H be a subgroup of a finite group G. Then |H| divides
|G|. Moreover, the number of distinct left (right) cosets of H in G equals |G|/|H|.

Proof. We give the proof for left cosets. A similar proof holds for right cosets. Since G is a finite
group, the number of left cosets of H in G is finite. Let g1 H, g2 H, . . . , gm H be the collection of all
distinct left cosets of H in G. Then by Theorem 11.2.5, G is a disjoint union of the sets
g1 H, g2 H, . . . , gm H. Also, for each a ∈ G, the map h ↦ ah from H to aH is onto by the definition
of aH and one-to-one by the cancellation law; so |aH| = |H|. Hence, |gi H| = |H| for all
i = 1, 2, . . . , m. Thus,
\[
|G| = \Bigl| \bigcup_{i=1}^{m} g_i H \Bigr| = \sum_{i=1}^{m} |g_i H| = m|H|
\]
(the disjoint union gives the second equality). Thus, |H| divides |G| and the number of left cosets
equals m = |G|/|H|.

Remark 11.2.8. 1. The number m in Theorem 11.2.7 is called the index of H in G, and is denoted
by [G : H] or iG (H).
2. Theorem 11.2.7 is a statement about any subgroup of a finite group. It is quite possible that
both the group G and its subgroup H are infinite but the number of left (right) cosets of H
in G is finite. In this case, one still talks of index of H in G. For example, let G = Z and
H = 10Z = {10m : m ∈ Z}, with the group operation as addition. Then the left cosets are
H, 1 + H, . . . , 9 + H so that [Z : H] = 10.

3. In general, if m ∈ N, then mZ is a subgroup of Z and [Z : mZ] = m.



Definition 11.2.9. Let G be a group and let g ∈ G. Then the smallest positive integer m such that
g^m = e is called the order of g. If there is no such positive integer, then g is said to have infinite
order. The order of an element g is denoted by o(g).

Example 11.2.10. 1. The only element of order 1 in a group G is the identity element of G.
2. In D4, each of the elements r^2, f, rf, r^2 f, r^3 f has order 2, whereas the elements r and r^3 have
order 4.

Exercise 11.2.11. 1. Prove that for each a ∈ G, o(a) = o(a^{-1}).
2. Determine the index of each subgroup obtained in Exercise 11.1.19.
3. Let G be a finite group and a ∈ G, a ≠ e. If H = {a^n : n ∈ Z}, then prove that |H| = o(a).
4. Let G be a finite group and let a ∈ G. Show that o(a) ∈ N.

We now state some important corollaries of Lagrange’s Theorem, whose proofs are easy.

Corollary 11.2.12. Let G be a finite group and let a ∈ G. Then o(a) divides |G|, as H = {a^n : n ∈ Z}
is a subgroup of G with |H| = o(a).

Corollary 11.2.12 implies that the possible orders of elements of a finite group G are among the
divisors of |G|. For example, if |G| = 30 then for each g ∈ G, o(g) ∈ {1, 2, 3, 5, 6, 10, 15, 30}.
Further, let G be a finite group and let g ∈ G. Then, |G| = m · o(g) for some m ∈ N. Hence

g^{|G|} = g^{m·o(g)} = (g^{o(g)})^m = e^m = e.

We thus obtain the following corollary.

Corollary 11.2.13. Let G be a finite group. Then for each g ∈ G, g^{|G|} = e.
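
The two corollaries can be checked directly in a small group. The following sketch computes the order of every element of S4 and verifies that it divides |S4| = 24 and that g^24 = e. The choice of S4 and the function names are assumptions made for the illustration.

```python
from itertools import permutations

def compose(s, t):
    """(st)(i) = s(t(i)); s[i-1] stores the image of i."""
    return tuple(s[t[i] - 1] for i in range(len(t)))

def power(g, k):
    result = tuple(range(1, len(g) + 1))
    for _ in range(k):
        result = compose(g, result)
    return result

def order(g):
    """Smallest positive m with g^m = e."""
    e = tuple(range(1, len(g) + 1))
    m, current = 1, g
    while current != e:
        current = compose(g, current)
        m += 1
    return m

G = list(permutations(range(1, 5)))   # the group S_4, of order 24
e = (1, 2, 3, 4)

orders = sorted({order(g) for g in G})
orders_divide = all(len(G) % order(g) == 0 for g in G)
power_is_identity = all(power(g, len(G)) == e for g in G)

print(orders, orders_divide, power_is_identity)
```

Note that not every divisor of 24 occurs as the order of an element; the corollary only restricts the possible orders.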

We now use the above understanding to digress towards modular arithmetic. Recall that for
a, b ∈ Z, the notation “a ≡ b (mod m)” means that m divides a − b.
Let p be a prime. Write Z_p^* = {1, 2, . . . , p − 1}. Verify that (Z_p^*, ⊙_p) is a group, where

a ⊙_p b = the remainder when ab is divided by p.

Applying Corollary 11.2.13 to Z_p^* gives the following result.

Corollary 11.2.14. [Fermat’s Little Theorem] Let a ∈ N and let p be a prime.
1. If p does not divide a, then a^{p−1} ≡ 1 (mod p).
2. In general, a^p ≡ a (mod p).

We now state without proof a generalization of Fermat’s Little Theorem. Let n ∈ N. Consider

Un = {k : 1 ≤ k ≤ n, gcd(k, n) = 1}

together with the operation

a ⊙_n b = the remainder when ab is divided by n.

Then (Un, ⊙_n) is a group with |Un| = ϕ(n), where ϕ is Euler’s totient function; that is, ϕ(n)
denotes the number of integers between 1 and n that are coprime to n.
By Corollary 11.2.13, we obtain the following result.

Corollary 11.2.15. [Euler’s Theorem] If a ∈ Z, n ∈ N and gcd(a, n) = 1, then a^{ϕ(n)} ≡ 1 (mod n).

Example 11.2.16. 1. Find the digit in the unit place of 13^{1001} written in decimal notation.
Ans: Observe that 13 ≡ 3 (mod 10). So, 13^{1001} ≡ 3^{1001} (mod 10). Also, 3 ∈ U10 and therefore
by Corollary 11.2.13, 3^{|U10|} ≡ 1 (mod 10). But |U10| = 4 and 1001 = 4 · 250 + 1. Thus,

13^{1001} ≡ 3^{1001} ≡ 3^{4·250+1} ≡ (3^4)^{250} · 3^1 ≡ 1 · 3 ≡ 3 (mod 10).

Hence, the digit in the unit place of 13^{1001} is 3.

2. Find the digits in the unit and tens places of 23^{1002} written in decimal notation.
Ans: Observe that 23 ∈ U100 and 23^{|U100|} ≡ 1 (mod 100). But |U100| = 40 and 1002 = 40 · 25 + 2.
Hence

23^{1002} ≡ 23^{40·25+2} ≡ (23^{40})^{25} · 23^2 ≡ 1 · 23^2 ≡ 529 ≡ 29 (mod 100).

Therefore, in the decimal representation of 23^{1002}, the digit in the unit place is 9 and the digit
in the tens place is 2.
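
The computations above, together with Fermat's and Euler's theorems for small moduli, can be confirmed with Python's built-in modular exponentiation `pow(base, exponent, modulus)`. The helper names `units` and `phi`, and the ranges tested, are illustrative choices; the check is numerical evidence, not a proof.

```python
from math import gcd

def units(n):
    """The set U_n = {k : 1 <= k <= n, gcd(k, n) = 1}, as a list."""
    return [k for k in range(1, n + 1) if gcd(k, n) == 1]

def phi(n):
    """Euler's totient function, computed as |U_n|."""
    return len(units(n))

def is_closed(n):
    """U_n is closed under multiplication modulo n."""
    U = set(units(n))
    return all((a * b) % n in U for a in U for b in U)

# Euler's theorem for 2 <= n < 60, and Fermat's little theorem for a few primes.
euler_ok = all(pow(a, phi(n), n) == 1 for n in range(2, 60) for a in units(n))
fermat_ok = all(pow(a, p, p) == a % p for p in (2, 3, 5, 7, 11, 13) for a in range(1, 50))

# The two parts of Example 11.2.16.
unit_digit = pow(13, 1001, 10)
last_two_digits = pow(23, 1002, 100)

print(phi(10), phi(100), unit_digit, last_two_digits, euler_ok, fermat_ok)
```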

In general, the converse of Lagrange’s Theorem is not true. That is, there exist m, n ∈ N and a
group of order mn that has no subgroup of order m. See the following example.

Example 11.2.17. Let T be the group of symmetries of the tetrahedron as discussed in Exam-
ple 11.1.9.2a. This group has 12 elements. From Exercise 11.1.19(4), we see that there is no subgroup
of T with 6 elements.
If you have not completed that exercise, then let us show that there is no subgroup of T consisting
of 6 elements. On the contrary, suppose H is a subgroup of T and |H| = 6. We know that

T = {e, (234), (243), (124), (142), (123), (132), (134), (143), (12)(34), (13)(24), (14)(23)}.

Observe that T has exactly 8 elements of order 3; each of them is of the form (ijk) for distinct numbers
i, j and k. Let a be any one of these 8 elements. That is, a ∈ T with o(a) = 3. Consider the cosets
H, aH and a^2 H (as a^3 = e, the powers of a give no other coset). Using Theorem 11.2.5, we see that
the cosets of H in T are exactly 2 in number. Hence, at most two of the cosets H, aH and a^2 H are
distinct, so that at least two of them are equal. By Theorem 11.2.5, it follows that a ∈ H. Therefore,
all the 8 elements of order 3 must be elements of H. That is, H must have at least 9 elements (8 elements
of order 3 and one identity). This is absurd as |H| = 6.
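
The conclusion can also be confirmed by exhaustive search. The sketch below assumes that T is realized as the 12 even permutations of four symbols (written on 0, 1, 2, 3 instead of 1, 2, 3, 4) and generated from (123) and (12)(34); the function names are illustrative only.

```python
from itertools import combinations

def compose(p, q):
    """Permutation p∘q on 0..n-1: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def generate(gens):
    """All products of the generators; for a finite group this is the generated subgroup."""
    identity = tuple(range(len(gens[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

identity = (0, 1, 2, 3)
T = generate([(1, 2, 0, 3), (1, 0, 3, 2)])   # (123) and (12)(34)

def is_subgroup(subset):
    """A finite subset containing e and closed under the product is a subgroup."""
    s = set(subset)
    return all(compose(a, b) in s for a in s for b in s)

def subgroup_orders():
    """Orders of all subgroups of T, found by testing every subset that contains e."""
    others = sorted(T - {identity})
    orders = set()
    for k in range(len(others) + 1):
        for rest in combinations(others, k):
            if is_subgroup((identity,) + rest):
                orders.add(k + 1)
    return sorted(orders)

print(len(T), subgroup_orders())   # 12 [1, 2, 3, 4, 12]
```

Only 2^11 subsets need to be tested, so the search is instantaneous; the order 6 divides 12 but does not appear in the list.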

11.3 Group action


Recall that when f : A × B → C is a function and a ∈ A, b ∈ B, the value of the function at the
point (a, b) is written as f (a, b). This is called the outfix notation. In the infix notation, we write the
same value f (a, b) as af b. For example, + : N × N → N and we write +(3, 5) as 3 + 5. When we use
the infix notation, the function f is referred to as an operator. It is another name for a function.

Definition 11.3.1. Let (G, ·) be a group with identity e. Then G is said to act on a set X if there
exists an operator ⋆ : G × X → X satisfying the following two conditions:
1. For each x ∈ X, e ⋆ x = x.
2. For all x ∈ X, and g, h ∈ G, g ⋆ (h ⋆ x) = (g · h) ⋆ x.

In such a case, the operator ⋆ is called an action of the group G on the set X.
Remark 11.3.2. 1. Let us assume that X consists of a set of points and let us suppose that the
group G acts on X by moving the points. Then Definition 11.3.1 can be interpreted as follows:
(a) The first condition implies that the identity element of the group does not move any element
of X. That is, the points in X remain fixed when they are acted upon by the identity element
of G.
(b) The second condition implies that if a point, say x_0 ∈ X, is first moved by an element
h ∈ G and then by an element g ∈ G, then the final position of x_0 is the same as the position
it would have reached if it was moved by the element g · h ∈ G.

2. Suppose a group G acts on a set X with the group action as ⋆. Fix an element g ∈ G. Define
functions φ, ψ : X → X by φ(x) = g ⋆ x, ψ(x) = g^{-1} ⋆ x for each x ∈ X. Then

(ψ ◦ φ)(x) = g^{-1} ⋆ (g ⋆ x) = (g^{-1} · g) ⋆ x = e ⋆ x = x.

Similarly, it follows that (φ ◦ ψ)(x) = x. Hence the function φ is a bijection on X. That is, g
just permutes the elements of X. In particular, {g ⋆ x : x ∈ X} = X.
3. There may exist g, h ∈ G, with g ≠ h, such that g ⋆ x = h ⋆ x for all x ∈ X.
Example 11.3.3. 1. Consider the dihedral group D_6 = {e, r, . . . , r^5, f, rf, . . . , r^5 f}, with r^6 = e =
f^2 and rf = f r^5. Here, f stands for the vertical flip and r stands for counterclockwise rotation
by an angle of π/3. Then D_6 acts on the labeled edges/vertices of a regular hexagon by permuting
the labeling of the edges/vertices (see Figure 11.5).

Figure 11.5: Action of f on labeled edges and of r^2 on labeled vertices of a regular hexagon.

2. Let X denote the set of ways of coloring the vertices of a square with two colors, say, red and
blue. Then X equals the set of all functions h : {1, 2, 3, 4} → {red, blue}, where the vertices
south-west, south-east, north-east and north-west are, respectively, labeled as 1, 2, 3 and 4. Then
observe that |X| = 16. The distinct colorings have been depicted in Figure 11.6, where R stands
for the vertex colored “red” and B stands for the vertex colored “blue”. For example, the figure
labeled x_9 in Figure 11.6 corresponds to h(1) = R = h(4) and h(2) = B = h(3). Now, let
us denote the permutation (1234) by r and the permutation (12)(34) by f. Then the dihedral
group D_4 = {e, r, r^2, r^3, f, rf, r^2 f, r^3 f} acts on the set X. For example,
(a) x_1 and x_16 are mapped to themselves under the action of every element of D_4. That is,
g ⋆ x_1 = x_1 and g ⋆ x_16 = x_16, for all g ∈ D_4.
(b) r ⋆ x_2 = x_5 and f ⋆ x_2 = x_3.

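   B B       B B       B R       R B       B B
   x_12      x_13      x_14      x_15      x_16
   R B       B R       B B       B B       B B

   B B     B R     B R     R B     R B     R R
   x_6     x_7     x_8     x_9     x_10    x_11
   R R     R B     B R     R B     B R     B B

   R R       B R       R B       R R       R R
   x_1       x_2       x_3       x_4       x_5
   R R       R R       R R       R B       B R

Figure 11.6: Coloring the vertices of a square. For each x_i, the top row shows the north-west and
north-east vertices and the bottom row shows the south-west and south-east vertices.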

There are three important sets associated with a group action. We first define them and then try
to understand them using an example.

Definition 11.3.4. Let G act on a set X with the action as ⋆.

1. For each x ∈ X, O(x) := {g ⋆ x : g ∈ G} ⊆ X is called the orbit of x.
2. For each x ∈ X, G_x := {g ∈ G : g ⋆ x = x} ⊆ G is called the stabilizer of x in G.
3. For each g ∈ G, F_g := {x ∈ X : g ⋆ x = x} ⊆ X is called the fix of g.

Example 11.3.5. Consider the set X given in Example 11.3.3.2. Then using the depiction of the set
X in Figure 11.6, we have

O(x_2) = {x_2, x_3, x_4, x_5}, G_{x_2} = {e, rf}, and F_{rf} = {x_1, x_2, x_4, x_7, x_10, x_13, x_15, x_16}.
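
The three sets of Example 11.3.5 can be computed directly. The following is a minimal sketch under stated assumptions: vertices 1, 2, 3, 4 become indices 0, 1, 2, 3, a coloring is the tuple (h(1), h(2), h(3), h(4)), x_2 is taken to be the coloring whose only blue vertex is vertex 4 (north-west) as read off Figure 11.6, and g acts by moving the color at vertex x to vertex g(x). All function names are illustrative.

```python
from itertools import product

def compose(p, q):
    """Permutation p∘q on 0..n-1: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def generate(gens):
    """All products of the generators; for a finite group this is the generated subgroup."""
    identity = tuple(range(len(gens[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def act(g, coloring):
    """Move the color sitting at vertex x to vertex g(x)."""
    out = [None] * len(coloring)
    for x, color in enumerate(coloring):
        out[g[x]] = color
    return tuple(out)

r = (1, 2, 3, 0)                      # the permutation (1234)
f = (1, 0, 3, 2)                      # the permutation (12)(34)
D4 = generate([r, f])
X = list(product("RB", repeat=4))     # all 16 colorings

def orbit(x):
    return {act(g, x) for g in D4}

def stabilizer(x):
    return {g for g in D4 if act(g, x) == x}

def fix(g):
    return {x for x in X if act(g, x) == x}

x2 = ("R", "R", "R", "B")             # assumed reading of x_2: only vertex 4 is blue
rf = compose(r, f)
print(len(orbit(x2)), stabilizer(x2) == {(0, 1, 2, 3), rf}, len(fix(rf)))   # 4 True 8
```

With these conventions act(r, x2) has its only blue vertex at vertex 1 and act(f, x2) at vertex 3, which matches r ⋆ x_2 = x_5 and f ⋆ x_2 = x_3 in Example 11.3.3.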

We now state a few results associated with the above definitions without proof as they can be
easily verified.

Proposition 11.3.6. Let G act on a set X with action as ⋆.

1. Then for each x ∈ X, the stabilizer G_x of x is a subgroup of G.

2. Define the relation ∼ on X by x ∼ y if there exists g ∈ G such that g ⋆ x = y. Then ∼ is an
equivalence relation on X, and [x] = O(x). That is, the equivalence class containing x equals
the orbit of x.

3. In particular, for each x ∈ X, if t ∈ O(x), then O(x) = O(t).

4. For all x, t ∈ X, if g ⋆ x = t then G_x = g^{-1} G_t g.

Proposition 11.3.6 helps us to relate the distinct orbits of X under the action of G with the cosets
of G. This is stated and proved as the next result. Recall that the number of left cosets of a subgroup
H of a group G equals [G : H], the index of H in G.

Theorem 11.3.7. [Orbit stabilizer theorem] Let a group G act on a set X. Then for each x ∈ X,
there is a bijection between O(x) and the set of all left cosets of G_x in G. In particular,
|O(x)| = [G : G_x]. Moreover, if G is a finite group then for each x ∈ X, |G| = |O(x)| · |G_x|.

Proof. Let S be the set of distinct left cosets of G_x in G. Write S := {gG_x : g ∈ G}. Then
|S| = [G : G_x]. Consider the map τ : S → O(x) by τ(gG_x) = g ⋆ x, where ⋆ is the group action. Let
us first check that τ is well-defined.
Suppose the left cosets gG_x and hG_x are equal. That is, gG_x = hG_x. Then, using Theorem 11.2.5
and the definition of group action, one obtains the following sequence of assertions:

gG_x = hG_x ⇔ h^{-1}g ∈ G_x ⇔ (h^{-1}g) ⋆ x = x ⇔ h^{-1} ⋆ (g ⋆ x) = x ⇔ g ⋆ x = h ⋆ x ⇔ τ(gG_x) = τ(hG_x).

Hence, τ is well-defined and also a one-one map. To show τ is onto, note that for each y ∈ O(x),
there exists an h ∈ G, such that h ⋆ x = y. Also, for this choice of h ∈ G, the coset hG_x ∈ S. Therefore,
for this choice of h ∈ G, τ(hG_x) = h ⋆ x = y holds. Hence, τ is onto.
Therefore, τ is a bijection between S and O(x). The second statement follows by observing that
whenever |G| is finite, [G : G_x] = |G| / |G_x|, for each subgroup G_x of G.
The following results are immediate consequences of Proposition 11.3.6 and Theorem 11.3.7. We
give the proof for the sake of completeness.

Lemma 11.3.8. Let G be a finite group acting on a set X. Then for each y ∈ X,
\[ \sum_{x \in O(y)} |G_x| = |G|. \]

Proof. Recall that, for each x ∈ O(y), |O(x)| = |O(y)|. Hence, using Theorem 11.3.7, one has
|G| = |G_x| · |O(x)|, for all x ∈ X. Therefore,
\[ \sum_{x \in O(y)} |G_x| = \sum_{x \in O(y)} \frac{|G|}{|O(x)|} = \sum_{x \in O(y)} \frac{|G|}{|O(y)|} = \frac{|G|}{|O(y)|} \sum_{x \in O(y)} 1 = \frac{|G|}{|O(y)|} \, |O(y)| = |G|. \]

The next theorem is the generalization of Discussion 5.5.12 where we had calculated the number
of distinct circular arrangements.

Theorem 11.3.9. Let G be a finite group acting on a set X. Let N denote the number of distinct
orbits of X under the action of G. Then
\[ N = \frac{1}{|G|} \sum_{x \in X} |G_x|. \]

Proof. By Lemma 11.3.8, $\sum_{x \in O(y)} |G_x| = |G|$ for all y ∈ X. Let x_1, x_2, . . . , x_N be representatives of
the distinct orbits of X under the action of G. Since the orbits partition X (Proposition 11.3.6),
\[ \frac{1}{|G|} \sum_{x \in X} |G_x| = \frac{1}{|G|} \sum_{i=1}^{N} \sum_{y \in O(x_i)} |G_y| = \frac{1}{|G|} \sum_{i=1}^{N} |G| = \frac{1}{|G|} \, N \cdot |G| = N. \]

Example 11.3.10. For Example 11.3.3.2, check that the number of distinct colorings is
\[ \frac{1}{|G|} \sum_{i=1}^{16} |G_{x_i}| = \frac{1}{8}(8 + 2 + 2 + 2 + 2 + 2 + 4 + 2 + 2 + 4 + 2 + 2 + 2 + 2 + 2 + 8) = 6. \]

As the above example illustrates, the distinct configurations are obtained by listing out elements
of X. If we color the vertices of the square with 3 colors, then |X| = 3^4 = 81, whereas the number
of elements of the group of symmetries of a square, that is, of D_4, is only 8. Clearly, it will be
advantageous to relate the number of distinct orbits with the elements of the group, in place of the
elements of the set X. Our next result does this.

Theorem 11.3.11. [Burnside’s Lemma] Let G be a finite group acting on a set X. Let N be the
number of distinct orbits of X under the action of G. Then
\[ N = \frac{1}{|G|} \sum_{g \in G} |F_g|. \]

Proof. Write the group action as ⋆. Consider the set S = {(g, x) ∈ G × X : g ⋆ x = x}. We calculate
|S| by two methods. First, for each fixed x ∈ X, G_x gives the collection of elements of G that satisfy
g ⋆ x = x. So, $|S| = \sum_{x \in X} |G_x|$.
Second, for each g ∈ G, F_g is the collection of elements of X that satisfy g ⋆ x = x. So, $|S| = \sum_{g \in G} |F_g|$.
Thus $\sum_{x \in X} |G_x| = |S| = \sum_{g \in G} |F_g|$. By Theorem 11.3.9, we have $N = \frac{1}{|G|} \sum_{g \in G} |F_g|$.

Example 11.3.12. In Example 11.3.3.2, verify that

|F_e| = 16, |F_r| = 2, |F_{r^2}| = 4, |F_{r^3}| = 2, |F_f| = 4, |F_{rf}| = 8, |F_{r^2 f}| = 4 and |F_{r^3 f}| = 8.

Hence, the number of distinct configurations is
\[ \frac{1}{|G|} \sum_{g \in G} |F_g| = \frac{1}{8}(16 + 2 + 4 + 2 + 4 + 8 + 4 + 8) = 6. \]
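
The counts of Theorem 11.3.9 and of Burnside’s Lemma can be compared with a direct enumeration of orbits. The sketch below assumes the same conventions as Example 11.3.3.2 (r = (1234), f = (12)(34), vertices as indices 0..3, colors as integers); the function names are illustrative.

```python
from itertools import product

def compose(p, q):
    """Permutation p∘q on 0..n-1: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def generate(gens):
    """All products of the generators; for a finite group this is the generated subgroup."""
    identity = tuple(range(len(gens[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def act(g, coloring):
    """Move the color sitting at vertex x to vertex g(x)."""
    out = [None] * len(coloring)
    for x, color in enumerate(coloring):
        out[g[x]] = color
    return tuple(out)

def count_orbits(group, n_colors):
    """Return (sum |G_x| / |G|, sum |F_g| / |G|, number of orbits found by enumeration)."""
    n = len(next(iter(group)))
    X = list(product(range(n_colors), repeat=n))
    stabilizer_sum = sum(sum(1 for g in group if act(g, x) == x) for x in X)
    fix_sum = sum(sum(1 for x in X if act(g, x) == x) for g in group)
    orbits = {frozenset(act(g, x) for g in group) for x in X}
    return stabilizer_sum // len(group), fix_sum // len(group), len(orbits)

D4 = generate([(1, 2, 3, 0), (1, 0, 3, 2)])
print(count_orbits(D4, 2))   # (6, 6, 6)
print(count_orbits(D4, 3))   # (21, 21, 21)
```

The first two entries always agree because both sums count the same set S of pairs (g, x) with g ⋆ x = x, which is the double counting used in the proof of Theorem 11.3.11.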

It seems that we may still need to know all the elements of X to compute the above terms. In the
next section, it will be shown that to compute |F_g|, for any g ∈ G, we just need to find a proper n
such that g ∈ S_n and decompose g as a product of disjoint cycles.

11.4 The Cycle index polynomial


Let G be a finite group acting on a set X. Then as mentioned at the end of the previous section, we
need to understand the cycle decomposition of each g ∈ G as a product of disjoint cycles. Redfield and
Polya observed that elements of G with the same cycle decomposition make the same contribution to
the sets of fixed points. They defined the notion of cycle index polynomial to keep track of the cycle
decomposition of the elements of G. We first state the following result of Cayley which implies that
every group element can be written as an element of a symmetric group. We start with the following
definition.

Definition 11.4.1. Let (G_1, ⋆) and (G_2, ⊙) be two groups. Then, an isomorphism is a function
f : G_1 → G_2 satisfying
1. f is one-to-one,
2. f is onto, and
3. f(σ ⋆ τ) = f(σ) ⊙ f(τ), for each σ, τ ∈ G_1.

Example 11.4.2. Observe the following:

1. The function that sends r ↦ (1234) and f ↦ (14)(23) gives an isomorphism between the two
groups that appear in Example 11.1.9.1a.
2. The function that sends r ↦ (12345) and f ↦ (13)(45) gives an isomorphism between the two
groups that appear in Example 11.1.9.1c.

Theorem 11.4.3. [Cayley’s Theorem] Let G be a group. Then G is isomorphic to a subgroup of
the symmetric group acting on G.

Proof. Let S be the set of all bijections on G. Notice that S is a group with the operation as
composition of maps. Corresponding to x ∈ G, let λ_x be the function λ_x : G → G given by λ_x(g) = xg
for each g ∈ G. Now, if λ_x(a) = λ_x(b), then xa = xb implies a = b. So, λ_x is one-one. If y ∈ G, then
λ_x(x^{-1}y) = xx^{-1}y = y shows that λ_x is onto. Thus, λ_x ∈ S.
Define the function φ : G → S by φ(x) = λ_x for each x ∈ G. If φ(x) = φ(y), then x = λ_x(e) = λ_y(e) = y.
Hence, φ : G → rng φ ⊆ S is a bijection. Also, observe that for any x, y, g ∈ G,

φ(xy)(g) = λ_{xy}(g) = xyg = x(yg) = x(λ_y(g)) = λ_x(λ_y(g)) = (φ(x) ◦ φ(y))(g).

It shows that φ is a homomorphism and that rng φ is closed under the operation of composition of
maps. Since φ(x^{-1}) ◦ φ(x) = φ(e) is the identity map, rng φ is also closed under inverses; that is,
rng φ is a subgroup of S. Therefore, G is isomorphic to rng φ ⊆ S.
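
The construction in the proof can be tried on a small group. The following is a minimal sketch, assuming G is the symmetric group on three symbols stored as tuples and that each λ_x is recorded as a permutation of the positions 0..5 of the elements of G in a fixed list; the names left_translation and phi are illustrative.

```python
from itertools import permutations

def compose(p, q):
    """Permutation p∘q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

G = list(permutations(range(3)))            # the six elements of S_3
position = {g: i for i, g in enumerate(G)}  # fixed numbering of the elements of G

def left_translation(x):
    """lambda_x as a permutation of positions: the position of g goes to the position of x*g."""
    return tuple(position[compose(x, g)] for g in G)

phi = {x: left_translation(x) for x in G}

# Each lambda_x is a bijection of G, phi is one-one, and phi(xy) = phi(x) ∘ phi(y).
is_bijection = all(sorted(image) == list(range(len(G))) for image in phi.values())
is_injective = len(set(phi.values())) == len(G)
is_homomorphism = all(phi[compose(x, y)] == compose(phi[x], phi[y]) for x in G for y in G)
print(is_bijection, is_injective, is_homomorphism)   # True True True
```

Here a group of order 6 is embedded in the symmetric group on 6 positions, which is the content of the theorem for this particular G.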

Let us now start with a few definitions and examples to better understand the use of cycle decom-
position of an element of a permutation group.

Definition 11.4.4. A permutation σ ∈ S_n is said to have the cycle structure 1^{k_1} 2^{k_2} · · · n^{k_n}, if the
cycle representation of σ has k_i cycles of length i, for 1 ≤ i ≤ n. Observe that $\sum_{i=1}^{n} i \cdot k_i = n$.

Example 11.4.5. 1. Let e be the identity element of S_n. Then e = (1) (2) · · · (n) and hence the
cycle structure of e, as an element of S_n, equals 1^n.
2. Let
\[ \sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 & 14 & 15 \\ 3 & 6 & 7 & 10 & 14 & 1 & 2 & 13 & 15 & 4 & 11 & 5 & 8 & 12 & 9 \end{pmatrix} \in S_{15}. \]
We see that σ = (1 3 7 2 6) (4 10) (5 14 12) (8 13) (9 15) (11). Thus, the cycle structure of σ is
1^1 2^3 3^1 5^1.
3. Consider the group G of symmetries of the tetrahedron (see Example 11.1.9.2a). The elements
of G have the following cycle structures:

1^4 for exactly 1 element corresponding to the identity element;
1^1 3^1 for exactly 8 elements corresponding to 3-cycles;
2^2 for exactly 3 elements corresponding to (12)(34), (13)(24) and (14)(23).
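
The cycle decomposition of the permutation σ in item 2 can be computed by following each symbol until it returns to its starting point. This is a minimal sketch; the dictionary representation of σ and the function names are illustrative choices.

```python
from collections import Counter

def cycles(perm):
    """Disjoint cycles of a permutation given as a dict {i: sigma(i)}."""
    seen, result = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = perm[x]
        result.append(tuple(cycle))
    return result

def cycle_structure(perm):
    """Dict {length: number of cycles of that length}, i.e. the exponents k_i."""
    return dict(sorted(Counter(len(c) for c in cycles(perm)).items()))

images = [3, 6, 7, 10, 14, 1, 2, 13, 15, 4, 11, 5, 8, 12, 9]   # second row of sigma
sigma = {i + 1: v for i, v in enumerate(images)}

print(cycles(sigma))
# [(1, 3, 7, 2, 6), (4, 10), (5, 14, 12), (8, 13), (9, 15), (11,)]
print(cycle_structure(sigma))   # {1: 1, 2: 3, 3: 1, 5: 1}
```

The second output is the cycle structure 1^1 2^3 3^1 5^1, and 1·1 + 2·3 + 3·1 + 5·1 = 15 as required by Definition 11.4.4.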

Definition 11.4.6. Let G be a permutation/symmetric group on n symbols. For any g ∈ G, let ℓ_k(g)
denote the number of cycles of length k, 1 ≤ k ≤ n, in the cycle representation of g. Then the cycle
index polynomial of G is a polynomial in n variables z_1, z_2, . . . , z_n given by
\[ P_G(z_1, z_2, \ldots, z_n) = \frac{1}{|G|} \Big( \sum_{g \in G} z_1^{\ell_1(g)} z_2^{\ell_2(g)} \cdots z_n^{\ell_n(g)} \Big). \]

Notice that for each g ∈ G, the condition that g has exactly ℓ_k(g) cycles of length k, 1 ≤ k ≤ n,
implies that 1 · ℓ_1(g) + 2 · ℓ_2(g) + · · · + n · ℓ_n(g) = n.
Example 11.4.7. 1. Let G be the dihedral group D_4 (see Example 11.1.9.2). Then

e = (1)(2)(3)(4) → z_1^4, r = (1234) → z_4, r^3 = (1432) → z_4, r^2 = (13)(24) → z_2^2,
f = (14)(23) → z_2^2, rf = (1)(3)(24) → z_1^2 z_2, r^2 f = (12)(34) → z_2^2, r^3 f = (13)(2)(4) → z_1^2 z_2.

Thus,
\[ P_G(z_1, z_2, z_3, z_4) = \frac{1}{8} \big( z_1^4 + 2z_4 + 3z_2^2 + 2z_1^2 z_2 \big). \]

2. Let G be the dihedral group D_5 (see Example 11.1.9.1c). Then
\[ P_G(z_1, z_2, z_3, z_4, z_5) = \frac{1}{10} \big( z_1^5 + 4z_5 + 5z_1 z_2^2 \big). \]

3. Verify that the cycle index polynomial of the symmetries of a cube induced on the set of vertices
is given by
\[ P_G(z_1, z_2, \ldots, z_8) = \frac{1}{24} \big( z_1^8 + 6z_4^2 + 9z_2^4 + 8z_1^2 z_3^2 \big). \]
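
The coefficients in items 1 and 2 can be recovered by tallying cycle types over the group. The sketch below assumes the generators r = (1234), f = (14)(23) for D_4 and r = (12345), f = (13)(45) for D_5 (as in Example 11.4.2), written on indices starting at 0; a monomial is encoded by its exponent tuple (ℓ_1, . . . , ℓ_n), and the function names are illustrative.

```python
from collections import Counter

def compose(p, q):
    """Permutation p∘q on 0..n-1: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def generate(gens):
    """All products of the generators; for a finite group this is the generated subgroup."""
    identity = tuple(range(len(gens[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def cycle_type(p):
    """Tuple (l_1, ..., l_n), where l_k is the number of cycles of length k of p."""
    n = len(p)
    counts, seen = [0] * n, [False] * n
    for start in range(n):
        if seen[start]:
            continue
        length, x = 0, start
        while not seen[x]:
            seen[x] = True
            x = p[x]
            length += 1
        counts[length - 1] += 1
    return tuple(counts)

def cycle_index(group):
    """Counter {exponent tuple: number of group elements}; divide by |G| to get P_G."""
    return Counter(cycle_type(g) for g in group)

D4 = generate([(1, 2, 3, 0), (3, 2, 1, 0)])
D5 = generate([(1, 2, 3, 4, 0), (2, 1, 0, 4, 3)])
print(sorted(cycle_index(D4).items()))
# [((0, 0, 0, 1), 2), ((0, 2, 0, 0), 3), ((2, 1, 0, 0), 2), ((4, 0, 0, 0), 1)]
```

The printed tally reads 2z_4 + 3z_2^2 + 2z_1^2 z_2 + z_1^4, which is 8 · P_G for D_4.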
Let S be an object, for example, a geometrical figure, and let X be the finite set of vertices,
edges, faces etc. of S. Also, let C be a finite set (say, of colors). Let Ω denote the set of all
functions from X to C. Observe that an element of Ω gives a color pattern on the object
S. Let G be a subgroup of the group of permutations of the object S. Hence, G acts on the elements
of X. Let us denote this action by ⋆. So, g ⋆ x ∈ X, for all g ∈ G and x ∈ X.
One can also obtain an action of G on Ω, denoted ⊛, by the following rule:
for each φ ∈ Ω and g ∈ G, g ⊛ φ has to be an element of Ω, that is, a function from X to C.
Hence, one defines

(g ⊛ φ)(x) = φ(g^{-1} ⋆ x), for all x ∈ X.

We claim that ⊛ indeed defines a group action on the set Ω. First, (e ⊛ φ)(x) = φ(e ⋆ x) = φ(x) for
all x ∈ X, so that e ⊛ φ = φ. Next, note that for each h, g ∈ G and φ ∈ Ω, the definition of the
action on X and Ω gives

(h ⊛ (g ⊛ φ))(x) = (g ⊛ φ)(h^{-1} ⋆ x) = φ(g^{-1} ⋆ (h^{-1} ⋆ x)) = φ(g^{-1}h^{-1} ⋆ x)
= φ((hg)^{-1} ⋆ x) = (hg ⊛ φ)(x).

Since (h ⊛ (g ⊛ φ))(x) = (hg ⊛ φ)(x), for all x ∈ X, one has h ⊛ (g ⊛ φ) = hg ⊛ φ, for each h, g ∈ G
and φ ∈ Ω. Hence, the proof of the claim is complete. Now, using the above notations, we have the
following theorem.

Theorem 11.4.8. Let C, S, X and Ω be as defined above. Also, let G be a subgroup of the group of
permutations of the object S. Then the number of distinct color patterns (distinct elements of Ω),
distinct up to the action of G, is given by

P_G(|C|, |C|, . . . , |C|).

Proof. Let |X| = n. Then observe that G is a subgroup of S_n. So, each g ∈ G can be written as a
product of disjoint cycles. Also, by Burnside’s Lemma (Theorem 11.3.11), N, the number of distinct
color patterns (distinct orbits under the action of G), equals $\frac{1}{|G|} \sum_{g \in G} |F_g|$, where

F_g = {φ ∈ Ω : (g ⊛ φ)(x) = φ(x), for all x ∈ X}.

We claim that “g ∈ G fixes a color pattern φ ∈ Ω if and only if φ colors the elements
in a given cycle of g with the same color”.
Suppose that g ⊛ φ = φ. That is, (g ⊛ φ)(x) = φ(x), for all x ∈ X. So, using the definition, one
has φ(g^{-1} ⋆ x) = φ(x), for all x ∈ X. In particular, for a fixed x_0 ∈ X, one also has

φ(x_0) = φ(g ⋆ x_0) = φ(g^2 ⋆ x_0) = · · · .

Note that, for each fixed x_0 ∈ X and g ∈ G, the permutation (x_0, g ⋆ x_0, g^2 ⋆ x_0, . . .) corresponds to
a cycle of g. Therefore, if g fixes a color pattern φ, i.e., g ⊛ φ = φ, then φ assigns the same color to
each element of any cycle of g.
Conversely, fix an element g ∈ G and let φ be a color pattern (a function) that has the property
that every point in a given cycle of g is colored with the same color. That is, φ(x) = φ(g ⋆ x), for
each x ∈ X. Or equivalently, φ(x) = φ(g^{-1} ⋆ x) = (g ⊛ φ)(x), for all x ∈ X. Hence, by definition,
g ⊛ φ = φ. Thus, g fixes the color pattern φ. Hence, the proof of the claim is complete.
Therefore, we observe that for a fixed g ∈ G, a cycle of g can be given a color independent
of another cycle of g. Also, the number of distinct colors equals |C|. Hence, for a fixed g ∈ G,
|F_g| = |C|^{ℓ_1(g)} · |C|^{ℓ_2(g)} · · · |C|^{ℓ_n(g)}, where for each k, 1 ≤ k ≤ n, ℓ_k(g) denotes the number of cycles
of g of length k. Thus,
\[ N = \frac{1}{|G|} \sum_{g \in G} |F_g| = \frac{1}{|G|} \sum_{g \in G} |C|^{\ell_1(g)} \cdot |C|^{\ell_2(g)} \cdots |C|^{\ell_n(g)} = P_G(|C|, |C|, \ldots, |C|). \]
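
Theorem 11.4.8 can be tested against a brute-force orbit count. The sketch below assumes the dihedral group of an n-gon is realized by the maps i ↦ i + s and i ↦ s − i modulo n on vertices 0, . . . , n − 1; the function names are illustrative.

```python
from itertools import product

def dihedral(n):
    """The 2n symmetries of a regular n-gon (n at least 3) as permutations of 0..n-1."""
    rotations = [tuple((i + s) % n for i in range(n)) for s in range(n)]
    reflections = [tuple((s - i) % n for i in range(n)) for s in range(n)]
    return rotations + reflections

def num_cycles(p):
    """Total number of cycles of p, that is l_1(p) + ... + l_n(p)."""
    seen, count = set(), 0
    for start in range(len(p)):
        if start not in seen:
            count += 1
            x = start
            while x not in seen:
                seen.add(x)
                x = p[x]
    return count

def count_by_cycle_index(group, k):
    """P_G(k, ..., k): the average of k^(number of cycles of g) over g in G."""
    return sum(k ** num_cycles(g) for g in group) // len(group)

def count_by_orbits(group, k):
    """Brute force: pick the smallest coloring in each orbit and count the picks."""
    n = len(group[0])
    representatives = set()
    for coloring in product(range(k), repeat=n):
        representatives.add(min(tuple(coloring[g[i]] for i in range(n)) for g in group))
    return len(representatives)

print(count_by_cycle_index(dihedral(5), 3), count_by_orbits(dihedral(5), 3))   # 39 39
print(count_by_cycle_index(dihedral(6), 3), count_by_orbits(dihedral(6), 3))   # 92 92
```

The cycle index needs only the 2n group elements, whereas the brute-force count walks through all k^n colorings; the two agree, as the theorem asserts.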

We now give a few examples to indicate the importance of Theorem 11.4.8.

Example 11.4.9. 1. Determine the number of distinct color patterns, when the vertices of a pentagon
are colored with 3 colors.
Answer: We know that the group D_5 is the group of symmetries of a pentagon. Hence, D_5 acts
on the color patterns. Verify that
\[ P_{D_5}(z_1, z_2, \ldots, z_5) = \frac{1}{|D_5|} \big( z_1^5 + 4z_5 + 5z_1 z_2^2 \big) = \frac{z_1^5 + 4z_5 + 5z_1 z_2^2}{10}. \]
Thus, by Theorem 11.4.8, the required number equals N = (1/10)(3^5 + 4 · 3 + 5 · 3 · 3^2) = 39.
2. Suppose we are given beads of 3 different colors and that there are at least 6 beads of each color.
Determine the number of distinct necklace patterns that are possible using 6 beads.
Answer: Since we are forming a necklace using 6 beads, the group D_6 acts on the 6 beads of
the necklace. Also, the cycle index polynomial of D_6 equals
\[ P_{D_6}(z_1, z_2, \ldots, z_5, z_6) = \frac{1}{|D_6|} \big( z_1^6 + 2z_6 + 2z_3^2 + z_2^3 + 3z_2^3 + 3z_1^2 z_2^2 \big). \]
Hence, by Theorem 11.4.8, the number of distinct necklace patterns equals
(1/12)(3^6 + 2 · 3 + 2 · 3^2 + 4 · 3^3 + 3 · 3^2 · 3^2) = 92.

3. Consider the 2 × 2 square given in Figure 11.7. Determine the number of distinct color patterns,
when the vertices of the given figure are colored with two colors.
Answer: Observe that D_4 is the group of symmetries of the 2 × 2 square and it needs to act on
9 vertices. So, we need to write the elements of D_4 as a subgroup of S_9. Hence, the cycle index
polynomial is given by
\[ P_{D_4}(z_1, \ldots, z_9) = \frac{z_1^9 + 2z_1 z_4^2 + z_1 z_2^4 + 4z_1^3 z_2^3}{8} \]
and the number of distinct color patterns equals 102.

   13 14 15 16
    9 10 11 12         7 8 9
    5  6  7  8         4 5 6
    1  2  3  4         1 2 3

       4×4              2×2

Figure 11.7: Faces and Vertices of Squares

4. Determine the number of distinct color patterns when the edges of a cube are colored with 2
colors.
Answer: Using the group of symmetries of the cube given on Page 236, the cycle index polynomial
corresponding to the 12 edges equals
\[ P_G(z_1, \ldots, z_{12}) = \frac{z_1^{12} + 6z_4^3 + 3z_2^6 + 8z_3^4 + 6z_1^2 z_2^5}{24}. \]
Thus, the required number is 218.
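
The cube counts can be reproduced without writing down the cycle index by hand. The following sketch assumes the cube has vertices in {0, 1}^3 and that its rotation group is generated by quarter turns about the z-axis and the x-axis through the centre; vertices, edges and faces are all stored as sets of vertices so that one routine handles the three induced actions. The names used are illustrative.

```python
from itertools import combinations, product

verts = list(product((0, 1), repeat=3))
vertex_sets = [frozenset([v]) for v in verts]
edges = [frozenset(pair) for pair in combinations(verts, 2)
         if sum(a != b for a, b in zip(*pair)) == 1]
faces = [frozenset(v for v in verts if v[axis] == side)
         for axis in range(3) for side in (0, 1)]

def quarter_turn_z(v):
    x, y, z = v
    return (1 - y, x, z)

def quarter_turn_x(v):
    x, y, z = v
    return (x, 1 - z, y)

def rotation_group():
    """All vertex maps (dicts) obtained as products of the two quarter turns."""
    gens = [{v: turn(v) for v in verts} for turn in (quarter_turn_z, quarter_turn_x)]
    identity = {v: v for v in verts}
    key = lambda g: tuple(g[v] for v in verts)
    group, frontier = {key(identity): identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = {v: s[g[v]] for v in verts}      # apply g first, then s
            if key(h) not in group:
                group[key(h)] = h
                frontier.append(h)
    return list(group.values())

def num_cycles(g, objects):
    """Number of cycles of the permutation that g induces on `objects`."""
    seen, count = set(), 0
    for start in objects:
        if start not in seen:
            count += 1
            obj = start
            while obj not in seen:
                seen.add(obj)
                obj = frozenset(g[v] for v in obj)
    return count

def count_patterns(objects, k):
    """P_G(k, ..., k) for the action of the rotation group on `objects`."""
    G = rotation_group()
    return sum(k ** num_cycles(g, objects) for g in G) // len(G)

print(len(rotation_group()), count_patterns(edges, 2))   # 24 218
```

With two colors the same routine gives 23 patterns for the 8 vertices and 10 patterns for the 6 faces, so the value 218 belongs to the 12 edges.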

Exercise 11.4.10. Determine the number of distinct color patterns when

1. the faces of the 4 × 4 square given in Figure 11.7 are colored with 2 colors.

2. the faces of a cube are colored with 2 colors. Hint: The cycle index polynomial equals
\[ P_G(z_1, z_2, \ldots, z_6) = \frac{1}{24} \big( z_1^6 + 6z_1^2 z_4 + 3z_1^2 z_2^2 + 6z_2^3 + 8z_3^2 \big). \]

11.5 Polya’s inventory polynomial


In this section, the ideas of the previous section are generalized. This generalization allows us to
count the number of distinct necklaces even if there is not a sufficient number of beads of each color.
To do this, each element of C is assigned a weight, which in turn gives a weight to each color pattern. This
weight may be a number, a variable or, in general, an element of a commutative ring with identity.
The setup for our study remains the same. To start with, we have the following definitions.

Definition 11.5.1. Let A be a commutative ring with identity (the elements of A are called weights).
Let w : C → A be a map that assigns a weight to each color. Then the weight of a color pattern
φ : X → C, with respect to the weight function w, is given by $w(\phi) = \prod_{x \in X} w\big(\phi(x)\big)$.

Fix g ∈ G. Then we have seen that g fixes a color pattern φ ∈ Ω if and only if φ colors the elements
in a given cycle of g with the same color. Similarly, for each fixed g ∈ G and φ ∈ Ω, one has
\[ w(g \circledast \phi) = \prod_{x \in X} w\big((g \circledast \phi)(x)\big) = \prod_{x \in X} w\big(\phi(g^{-1} \star x)\big) = \prod_{y \in X} w\big(\phi(y)\big) = w(\phi), \tag{11.5} \]
as {g ⋆ x : x ∈ X} = X (see Remark 11.3.2). That is, for a fixed φ ∈ Ω, the weight of each element of
O(φ) = {g ⊛ φ : g ∈ G} is the same and it equals w(φ). That is, w(φ) = w(ψ), whenever ψ = g ⊛ φ,
for some g ∈ G.

Example 11.5.2. Let X consist of the set of faces of a cube, G be the group of symmetries of the
cube and let C consist of two colors ‘red’ and ‘blue’. Thus, if the weights R and B are assigned to the
two elements of C then the weight
1. B^6 corresponds to “all faces being colored ‘blue’”;
2. R^2 B^4 corresponds to “any two faces being colored ‘red’ and the remaining four faces being
colored ‘blue’”;
3. R^3 B^3 corresponds to “any three faces being colored ‘red’ and the remaining three faces being
colored ‘blue’”, and so on.

The above examples indicate that different color patterns need not have different weights. We also
need the following definition to state and prove results in this area.

Definition 11.5.3. Let G be a group acting on the set Ω, the set of color patterns, and let w : C → A
be a weight function. The pattern inventory, denoted I, under the action of G on Ω, with respect to
w, is the sum of the weights of the orbits. That is, $I = \sum_{\Delta} w(\Delta)$, where the sum runs over all the
distinct orbits ∆ obtained by the action of G on Ω, and w(∆) denotes the common weight w(φ) of the
color patterns φ ∈ ∆.

With the above definitions, we are ready to prove Polya’s Enumeration Theorem. To do so,
we first need to prove the weighted version of Burnside’s Lemma.

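Lemma 11.5.4. With the definitions and notations as above,
\[ I = \sum_{\Delta} w(\Delta) = \frac{1}{|G|} \sum_{g \in G} \sum_{\substack{\phi \in \Omega \\ g \circledast \phi = \phi}} w(\phi), \]
where the sum runs over all the distinct orbits ∆ obtained by the action of G on Ω.

Proof. As G acts on Ω, for each α ∈ Ω, the application of Theorem 11.3.7 gives |G_α| · |O(α)| = |G|.
Since ∆ is an orbit under the action of G, for each φ ∈ ∆, |G_φ| · |∆| = |G|. Also, by definition,
w(∆) = w(φ), for all φ ∈ ∆. Thus,
\[ w(\Delta) = \frac{1}{|\Delta|} \sum_{\phi \in \Delta} w(\phi) = \sum_{\phi \in \Delta} \frac{1}{|\Delta|} \, w(\phi) = \sum_{\phi \in \Delta} \frac{|G_\phi|}{|G|} \, w(\phi) = \frac{1}{|G|} \sum_{\phi \in \Delta} |G_\phi| \cdot w(\phi). \]
Let F_g = {φ ∈ Ω : g ⊛ φ = φ}. Then $\sum_{\phi \in \Omega} \sum_{g \in G_\phi} w(\phi) = \sum_{g \in G} \sum_{\phi \in F_g} w(\phi)$ and hence
\[ I = \sum_{\Delta} w(\Delta) = \sum_{\Delta} \frac{1}{|G|} \sum_{\phi \in \Delta} |G_\phi| \cdot w(\phi) = \frac{1}{|G|} \sum_{\Delta} \sum_{\phi \in \Delta} |G_\phi| \cdot w(\phi) = \frac{1}{|G|} \sum_{\phi \in \Omega} |G_\phi| \cdot w(\phi) \]
\[ = \frac{1}{|G|} \sum_{\phi \in \Omega} \sum_{g \in G_\phi} w(\phi) = \frac{1}{|G|} \sum_{g \in G} \sum_{\phi \in F_g} w(\phi). \]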

We are now in a position to prove Polya’s Enumeration Theorem. Before doing so, recall that
F_g consists precisely of those color schemes which color each cycle of g with just one color (see the
argument used in the second paragraph in the proof of Theorem 11.4.8).

Theorem 11.5.5. [Polya’s enumeration theorem] With the definitions and notations as above,
\[ I = \sum_{\Delta} w(\Delta) = P_G(x_1, x_2, \ldots, x_n), \]
where the sum runs over all the distinct orbits ∆ obtained by the action of G on Ω and
$x_i = \sum_{c \in C} w(c)^i$ is the i-th power sum of the weights of the colors. In particular,
I = P_G(|C|, |C|, . . . , |C|), if the weight of each color is 1.

Proof. Using the weighted Burnside Lemma 11.5.4, we need to prove that
\[ \sum_{g \in G} \sum_{\phi \in F_g} w(\phi) = \sum_{g \in G} x_1^{\ell_1(g)} x_2^{\ell_2(g)} \cdots x_n^{\ell_n(g)}, \]
where ℓ_i(g) is the number of cycles of length i in the cycle representation of g.


Now, fix a g ∈ G. Suppose g has exactly t disjoint cycles, say g_1, g_2, . . . , g_t. As F_g consists precisely
of those color schemes which color each cycle of g with just one color, we just need to determine the
weight of such a color pattern. To do so, for 1 ≤ i ≤ t, define X_i to be that subset of X whose
elements form the cycle g_i. Then, it is easy to see that X_1, X_2, . . . , X_t define a partition of X. Also,
since x and g ⋆ x belong to the same cycle of g, one has w(φ(s_i)) = w(φ(g ⋆ s_i)), for each
s_i ∈ X_i, 1 ≤ i ≤ t. Thus, for each φ ∈ F_g,
\[ w(\phi) = \prod_{x \in X} w(\phi(x)) = \prod_{i=1}^{t} \prod_{x \in X_i} w(\phi(x)) = \prod_{i=1}^{t} w(\phi(s_i))^{|X_i|}. \]
Note that if we pick a term from each factor in $\prod_{i=1}^{t} \big( \sum_{c \in C} w(c)^{|X_i|} \big)$ and take the product of these terms,
we obtain all the terms of the expansion of $\prod_{i=1}^{t} \big( \sum_{c \in C} w(c)^{|X_i|} \big)$. All these terms also appear in
$\sum_{\phi \in F_g} \prod_{i=1}^{t} w(\phi(s_i))^{|X_i|}$ because, as φ is allowed to vary over all elements of F_g, the images φ(s_i),
for 1 ≤ i ≤ t, take all values in C. The argument can also be reversed and hence it follows that
\[ \sum_{\phi \in F_g} w(\phi) = \sum_{\phi \in F_g} \prod_{i=1}^{t} w(\phi(s_i))^{|X_i|} = \prod_{i=1}^{t} \Big( \sum_{c \in C} w(c)^{|X_i|} \Big). \]
Now, assume that g has ℓ_k(g) cycles of length k, 1 ≤ k ≤ n. This means that in the collection
|X_1|, |X_2|, . . . , |X_t|, the number 1 appears ℓ_1(g) times, the number 2 appears ℓ_2(g) times and so on
till the number n appears ℓ_n(g) times (note that some of the ℓ_i(g)’s may be zero). Consequently,
$\prod_{i=1}^{t} \big( \sum_{c \in C} w(c)^{|X_i|} \big)$ equals $\prod_{k=1}^{n} x_k^{\ell_k(g)}$, as $x_1 = \sum_{c \in C} w(c)$, $x_2 = \sum_{c \in C} w(c)^2$ and so on till
$x_n = \sum_{c \in C} w(c)^n$. Hence, $\sum_{\phi \in F_g} w(\phi) = \prod_{k=1}^{n} x_k^{\ell_k(g)}$ and thus, the required result follows.

Example 11.5.6. 1. Consider a necklace consisting of 6 beads. If there are 3 color choices, say
R, B and G, then determine
(a) the number of necklaces that have at least one R bead.
(b) the number of necklaces that have three R, two B and one G bead.

Ans: Recall that D_6 acts on a regular hexagon and its cycle index polynomial equals
\[ P_{D_6}(z_1, z_2, \ldots, z_6) = \frac{1}{12} \big( z_1^6 + 4z_2^3 + 2z_3^2 + 2z_6 + 3z_1^2 z_2^2 \big). \]
So, for the first part, at least one R needs to be used and the remaining can be any number of
B and/or G. So, we define the weight of the color R as x and that of B and G as 1. Therefore,
by Polya’s Enumeration Theorem 11.5.5,
\[ I = \frac{1}{12} \Big( (x + 1 + 1)^6 + 4(x^2 + 1 + 1)^3 + 2(x^3 + 1 + 1)^2 + 2(x^6 + 1 + 1) + 3(x + 1 + 1)^2 (x^2 + 1 + 1)^2 \Big) \]
\[ = x^6 + 2x^5 + 9x^4 + 18x^3 + 29x^2 + 20x + 13. \]
The coefficient of x^k counts the necklaces with exactly k beads of color R. So, the required answer
is 1 + 2 + 9 + 18 + 29 + 20 = 79.

For the second part, define the weights as R, B and G itself. Then
\[ I = \frac{1}{12} \Big( (R + B + G)^6 + 4(R^2 + B^2 + G^2)^3 + 2(R^3 + B^3 + G^3)^2 + 2(R^6 + B^6 + G^6) + 3(R + B + G)^2 (R^2 + B^2 + G^2)^2 \Big). \]
The required answer equals the coefficient of R^3 B^2 G in I, which equals
\[ \frac{1}{12} \big( C(6; 3, 2, 1) + 3 \cdot 2 \cdot 2 \big) = \frac{1}{12} \Big( \frac{6!}{3! \, 2! \, 1!} + 12 \Big) = \frac{1}{12}(60 + 12) = 6. \]
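
Both parts of the example can be checked by listing one representative per necklace and tallying by color content. The sketch below assumes the realization of D_6 by the maps i ↦ i + s and i ↦ s − i modulo 6 and chooses the lexicographically smallest coloring in each orbit as representative; the names used are illustrative.

```python
from collections import Counter
from itertools import product

def dihedral(n):
    """The 2n symmetries of a regular n-gon (n at least 3) as permutations of 0..n-1."""
    rotations = [tuple((i + s) % n for i in range(n)) for s in range(n)]
    reflections = [tuple((s - i) % n for i in range(n)) for s in range(n)]
    return rotations + reflections

def inventory(n, colors):
    """Counter {(number of beads of each color): number of necklaces with that content}."""
    group = dihedral(n)
    representatives = {
        min(tuple(coloring[g[i]] for i in range(n)) for g in group)
        for coloring in product(colors, repeat=n)
    }
    return Counter(tuple(rep.count(c) for c in colors) for rep in representatives)

pattern_inventory = inventory(6, "RBG")

# Group the necklaces by the number of R beads: the coefficients of I in the first part.
by_red = Counter()
for (reds, blues, greens), count in pattern_inventory.items():
    by_red[reds] += count

print([by_red[k] for k in range(6, -1, -1)])        # [1, 2, 9, 18, 29, 20, 13]
print(sum(by_red[k] for k in range(1, 7)))          # 79
print(pattern_inventory[(3, 2, 1)])                 # 6
```

The first line lists the coefficients of x^6, x^5, . . . , x^0, so the 13 necklaces without an R bead are exactly the ones excluded in part (a).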

We end this chapter with a few Exercises. But before doing so, we give the following example with
which Polya started his classic paper on this subject.

Example 11.5.7. Suppose we are given 6 similar spheres in three different colors, say, three red, two
blue and one yellow (spheres of the same color being indistinguishable). In how many ways can we
distribute the six spheres on the 6 vertices of an octahedron freely movable in space?
Ans: Here X = {1, 2, 3, 4, 5, 6} and C = {R, B, Y}. Using Example 11.1.9.2b on Page 236, the
cycle index polynomial corresponding to the symmetry group of the octahedron that acts on the
vertices of the octahedron is given by
\[ \frac{1}{24} \big( z_1^6 + 6z_1^2 z_4 + 3z_1^2 z_2^2 + 8z_3^2 + 6z_2^3 \big). \]
Hence, the number of patterns of the required type is the coefficient of the term R^3 B^2 Y in
\[ I = \frac{1}{24} \Big( (R + B + Y)^6 + 6(R + B + Y)^2 (R^4 + B^4 + Y^4) + 3(R + B + Y)^2 (R^2 + B^2 + Y^2)^2 + 8(R^3 + B^3 + Y^3)^2 + 6(R^2 + B^2 + Y^2)^3 \Big). \]
Verify that this number equals 3.
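
The verification can be done by enumeration. The sketch below assumes the six vertices are numbered 0:+x, 1:−x, 2:+y, 3:−y, 4:+z, 5:−z (a labeling chosen here, not the one of Example 11.1.9.2b) and that the rotation group is generated by quarter turns about the z-axis and the x-axis; all names are illustrative.

```python
from collections import Counter
from itertools import permutations

def compose(p, q):
    """Permutation p∘q on 0..n-1: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def generate(gens):
    """All products of the generators; for a finite group this is the generated subgroup."""
    identity = tuple(range(len(gens[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def cycle_lengths(p):
    """Sorted tuple of the cycle lengths of p."""
    seen, lengths = set(), []
    for start in range(len(p)):
        if start not in seen:
            length, x = 0, start
            while x not in seen:
                seen.add(x)
                x = p[x]
                length += 1
            lengths.append(length)
    return tuple(sorted(lengths))

quarter_turn_z = (2, 3, 1, 0, 4, 5)   # +x -> +y -> -x -> -y -> +x
quarter_turn_x = (0, 1, 4, 5, 3, 2)   # +y -> +z -> -y -> -z -> +y
G = generate([quarter_turn_z, quarter_turn_x])

def count_arrangements(spheres):
    """Number of orbits of G on the distinct placements of the given spheres."""
    placements = set(permutations(spheres))
    representatives = {min(tuple(p[g[i]] for i in range(6)) for g in G) for p in placements}
    return len(representatives)

print(len(G), count_arrangements("RRRBBY"))   # 24 3
```

Tallying cycle_lengths over G gives 1, 6, 3, 8 and 6 elements of types 1^6, 1^2 4, 1^2 2^2, 3^2 and 2^3 respectively, in agreement with the cycle index polynomial quoted in the example.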


Exercise 11.5.8. 1. Three black and three white beads are strung together to form a necklace.
If the beads of the same color are indistinguishable, determine the number of distinct necklace
patterns, if the necklace can only be rotated. What is the number if the necklace can be rotated
and turned over?
2. Suppose the edges of a regular tetrahedron are being colored with white and black. Then determine
the number of patterns that have exactly four black edges and two white edges.

3. Consider the molecules CH_4, C_2H_6 and C_6H_6 given in Figure 11.8. In each case, determine the
number of possible molecules that can be formed, if the hydrogen atoms can be replaced by either
Fluorine, Chlorine or Bromine.
4. In essentially how many different ways can we color the vertices of a cube if n colors are available?
5. Three ear-rings are shown in Figure 11.8. In each case, the ear-ring can be rotated along the
horizontal axis passing through the central vertex (highlighted with dark circle). Then determine
the following:
(a) The group that acts on the ear-rings.
(b) Write the elements of the group as a subgroup of Sn , for a proper choice of n.
(c) Determine the number of distinct color patterns when there are sufficient number of beads
of both the colors “RED” and “BLUE”.

Figure 11.8: Three ear-rings and three molecules, CH_4, C_2H_6 and C_6H_6.

6. Let p be a prime. Suppose that we want to make a necklace consisting of p beads. If for each bead,
one has n choices of colors, then determine the number of distinct necklace patterns. Use this
number to prove Fermat’s little theorem.
7. Prove that the cycle index polynomials for the vertices, edges and faces of the octahedron are,
respectively, equal to
\[ P(z_1, \ldots, z_6) = \frac{1}{24} \big( z_1^6 + 6z_1^2 z_4 + 3z_1^2 z_2^2 + 8z_3^2 + 6z_2^3 \big), \]
\[ P(z_1, \ldots, z_{12}) = \frac{1}{24} \big( z_1^{12} + 6z_4^3 + 3z_2^6 + 8z_3^4 + 6z_1^2 z_2^5 \big), \]
\[ P(z_1, \ldots, z_8) = \frac{1}{24} \big( z_1^8 + 6z_4^2 + 9z_2^4 + 8z_1^2 z_3^2 \big). \]
8. Consider the following problems that appeared in Section 5.5. Note that we need to consider only
the rotations to form the group.
(a) Find the number of circular arrangements of {A, B, B, C, C, D, D, E, E}.

(b) Find the number of circular arrangements of S = {A, A, B, B, C, C, D, D, E, E}.


(c) How many circular arrangements of {A, A, A, B, B, B, C, C, C} are there?
(d) Determine the number of circular arrangements of size 5 using the alphabets A, B and C.
(e) Let us assume that any two garlands are same if one can be obtained from the other by
rotation. Then, determine the number of distinct garlands that can be formed using 6
flowers, in the following cases.
i. The flowers can have colors ‘red’ or ‘blue’.
ii. The flowers can have the colors ‘red’, ‘blue’ or ‘green’.
iii. The flowers can have k colors.
(f ) Determine the number of distinct garlands that can be formed using 6 flowers, 4 of which
are blue and 2 are red.
(g) Find the number of circular permutations of {A, A, B, B, C, C, C, C}.

T
AF
DR
Bibliography

[1] G. Agnarson and R. Greenlaw, Graph Theory: Modelling, Applications and Algorithm, Pearson
Education.

[2] R. B. Bapat, Graphs and Matrices, Hindustan Book Agency, New Delhi, 2010.

[3] D. M. Cvetkovic, Michael Doob and Horst Sachs, Spectra of Graphs: theory and applications,
Academic Press, New York, 1980.

[4] D. I. A. Cohen, Basic Techniques of Combinatorial Theory, John Wiley and Sons, New York,
1978.

[5] William Dunham, Euler: The Master of Us All, Published and Distributed by The Mathematical
Association of America, 1999.

[6] F. Harary, Graph Theory, Addison-Wesley Publishing Company, 1969.


[7] T. J. Jech, The Axiom of Choice, Dover, 2008.

[8] Victor J. Katz, A History of Mathematics: An Introduction, Harper Collins College Publishers,
New York, 1993.

[9] G. E. Martin, Counting: The Art of Enumerative Combinatorics, Undergraduate Texts in
Mathematics, Springer, 2001.

[10] R. Merris, Combinatorics, 2nd edition, Wiley-Interscience, 2003.

[11] G. H. Moore, Zermelo’s Axiom of Choice: Its Origins, Development and Influence, Dover, 2013.

[12] J. Riordan, Introduction to Combinatorial Analysis, John Wiley and Sons, New York, 1958.

[13] R. P. Stanley, Enumerative Combinatorics, vol. 2, Cambridge University Press, 1999.

[14] H. S. Wilf, Generatingfunctionology, Academic Press, 1990.

Index

C(G): Closure of G, 152
C(n, k), 74
$C(n; n_1, \ldots, n_k)$, 74
∆(G): Maximum degree of G, 134
α(G): Independence number of G, 132
$\mathrm{cf}[x^n, f]$: Coefficient of $x^n$ in $f$, 108
δ(G): Minimum degree of G, 134
diam(G): Diameter of G, 137
κ(G): Vertex connectivity of G, 161
⟨U⟩: Induced subgraph on U, 135
λ(G): Edge connectivity of G, 162
ω(G): Clique number of G, 139
ε(G): Edge density of G, 139
{−1, 0, 1} vertex-edge incidence matrix, 169
g(G): Girth of G, 137
k-Cycle permutation, 173

Absolute value in Z, 37
Addition function, 33
Addition rule, 70
algebraic expansion, 80
Algebraic number, 57
Alternative parts, 70
Arrangements, 73

Bézout’s identity, 60
Bell Numbers, 127
Bell numbers, 93
Bijective function, 15
Bipartite graph, 133
Blocks of a graph, 158
Bridge in a graph, 143
Burnside’s Lemma, 187

Cantor’s Diagonalization, 56
Cantor-Schröder-Bernstein Theorem, 50
Cardinality, 42
Cartesian Product, 10
Catalan number ($C_n$), 97
Cauchy product, 108
Cayley’s Theorem, 188
Chinese remainder theorem, 66
Circuit in a graph, 137
Clique in a graph, 139
Coin problem, 116
Compulsory parts, 70
connected permutation, 117
Counting
    Addition rule, 70
    Multiplication rule, 70
    Product rule, 70
CSB-theorem, 50
Cut edge, 143
Cut vertex, 142
Cycle in a graph, 137
    Chord, 138
Cycle index polynomial, 189
Cycle structure of a permutation, 188
Cycles
    Disjoint, 174
Cyclic decomposition, 174

Degree sequence, 167
    Graphic, 167
Derangement, 106
Difference equation, 117
    k-th difference, 117
    First difference, 117
Dihedral group $D_3$, 175
Dihedral group $D_4$, 175
Disconnected graph, 139
Division algorithm, 59
Durfee square, 117

Edge, 131
Empty set, 6
End vertex, 132
Equinumerous sets, 15
Equivalence relation, 18
Euclid’s Algorithm, 60, 61
Euclid’s lemma, 62
Euler’s Theorem, 183
Euler’s totient function (ϕ(n)), 106
Eulerian graph, 148
Eulerian tour, 148

Family of sets, 44
    Intersection, 44
    Product, 46
    Union, 44
Fermat’s Little Theorem, 183
Ferrer’s diagram, 94
Fibonacci sequence, 118
Fix of an element, 185
Forest, 143
Formal power series
    Cauchy product, 108
    differentiation, 112
    Equality, 108
    integration, 112
    Reciprocal, 111
    Sum, 108
Formal power series (Q[[x]]), 108
Frobenius number, 116
Function
    Addition, 33
    Bijective, 15
    Eventually constant, 57
    Identity, 14
    Image, 13
    Injective, 15
    Multiplication, 33
    Multiplicative, 107
    One-one, 15
    Onto, 15
    Partial, 13
    Power, 34
    Pre-image, 13
    Restriction, 15
    Surjective, 15
    Zero, 14
Fundamental theorem of arithmetic, 62

Generalized Pascal identity, 83
Generating function
    Bell numbers, 127
    Binomial coefficients, 125
    Catalan numbers, 124
    Stirling numbers (S(n, k)), 126
Generating functions
    Exponential (egf), 110
    Ordinary (ogf), 110
Graph, 132
    2-colorable, 154
    M-Alternating path, 163
    M-Augmenting path, 163
    k-colorable, 159
    k-factor, 135
    Acyclic, 138
    Addition of edge, 135
    Adjacency matrix, 168
    Adjacent vertices, 132
    Automorphism, 141
    Automorphism group, 141
    Bipartite, 133
    Blocks, 158
    Bridge, 143
    Cartesian product, 136
    Center, 137
    Chord, 138
    Chordal, 138
    Chromatic number (χ(G)), 159
    Clique, 139
    Clique number (ω(G)), 139
    Closed path, 137
    Closed trail, 137
    Closure (C(G)), 152
    Coloring, 159
    Complement, 135
    Complete ($K_n$), 133
    Complete bipartite ($K_{r,s}$), 133
    Component, 139
    Connected, 139
    Connected component, 139
    Covering, 165
    Cubic, 134
    Cut edge, 143
    Cut vertex, 142
    Cycle ($C_n$), 133
    Degree ($d(v)$, $d_G(v)$), 132
    Degree sequence, 167
    Diameter (diam(G)), 137
    Disconnected, 139
    Disjoint union, 136
    Distance, 137
    Edge connectivity (λ(G)), 162
    Edge deleted, 135
    Edge density (ε(G)), 139
    Edge set (E, E(G)), 131
    Embedding, 155
    End vertex, 132
    Eulerian, 148
    Forest, 143
    Girth (g(G)), 137
    Hamiltonian, 150
    Homeomorphic, 157
    Incident edge, 132
    Independence number (α(G)), 132
    Independent set, 132
    Induced subgraph (⟨U⟩), 135
    Intersection, 136
    Invariant, 141
    Isolated vertices, 132
    Isomorphism, 140
    Join, 136
    Length of path, 137
    Length of walk, 137
    Line graph, 153
    Loop, 132
    Matching, 163
    Maximal, 139
    Maximal planar, 158
    Maximum degree (∆(G)), 134
    Maximum matching, 163
    Minimal, 139
    Minimum covering, 165
    Minimum degree (δ(G)), 134
    Neighbor ($N(v)$, $N_G(v)$), 132
    Non-trivial, 132
    Path ($P_n$), 133
    Pendant, 132
    Perfect matching, 163
    Petersen, 134
    Planar, 155
    Radius, 137
    Regular, 134
    Self-complementary, 140
    Separating set, 161
    Simple, 132
    Spanning subgraph, 135
    Subdivision, 157
    Subgraph, 135
    Trail, 137
    Tree, 143
    Trivial, 132
    Unicyclic, 148
    Union, 136
    Vertex connectivity (κ(G)), 161
    Vertex deleted, 135
    Vertex set (V, V(G)), 131
    Walk, 137
Graphic sequence, 167
Greatest common divisor (gcd), 59
Group, 172
    Cayley’s Theorem, 188
    Dihedral group $D_3$, 175
    Dihedral group $D_4$, 175
    Lagrange theorem, 182
    Left coset of a subgroup, 181
    Permutation, 173
    Right coset of a subgroup, 181
    Subgroup, 178
    Symmetric, 173
Group action
    Fix, 185
    Orbit, 185
    Stabilizer, 185
Group: order, 182

Hamiltonian graph, 150
Hand shaking lemma, 133
Highest common factor, 59

Identity function, 14
Incident edge, 132
Index of a subgroup, 182
Injective function, 15
Integers
    Co-prime, 59
    Composite, 62
    Divisibility, 59
    Divisor, 59
    Greatest common divisor (gcd), 59
    Highest common factor, 59
    Least common multiple (lcm), 63
    Modular arithmetic, 63
    Multiple, 59
    Prime, 62
    Relatively prime, 59
    Unity, 62
Inverse relation, 12
Isomorphic graphs, 140
Isomorphism of two groups, 188

Join of two graphs, 136

Lagrange theorem, 182
Lattice path, 96
Law of trichotomy, 31
Lemma
    Hand shaking, 133
LHRC, 118
Line graph, 153
Linear congruence, 64
Linear Diophantine equation, 62
Linear recurrence relation, 118
    Homogeneous, 118
    Nonhomogeneous, 118
LNRC, 118

Matching
    Saturated vertex, 163
Modulus in Z, 37
Money changing problem, 116
Multigraph, 132
Multiplication function, 33
Multiplication rule, 70
Multiplicative function, 107
Multiset, 78

Natural numbers
    Addition, 24
    Multiplication, 24
Newton’s identity, 76
Non-negative integer solutions, 78
Non-trivial graph, 132
Null Set, 6
Number of circular permutations, 84
Number of subsets, 74

One-one correspondence, 15
One-one function, 15
Onto function, 15
Orbit, 85
Orbit of an element, 185
Orbit size, 85
Order of a group, 182
Order of an element, 182
Ordered pair, 10
Ordering
    Well ordering, 32
Ordering in N, 31
Ordinary Generating functions (ogf), 110

Partial function, 13
Partition of n ($\pi_n$), 93
Partition of n into k parts ($\pi_n(k)$), 93
Partition of a set, 19
Pascal’s identity, 74
Pascal: Generalized identity, 83
Path in a graph, 137
    End vertices, 137
    Internal vertices, 137
Pattern inventory, 192
Peano axioms, 23
    Addition in Q, 38
    Addition in Z, 35
    Construction of Q, 38
    Construction of Z, 34
    Division in Q, 39
    Multiplication in Q, 38
    Multiplication in Z, 35
    Non-negative elements in Z, 37
    Order in Q, 39
    Order in Z, 36
Permutation
    Cycle structure, 188
    Cyclic representation, 173
    Disjoint cycles, 174
permutation, 72
Permutation group, 173
Permutations
    Product, 173
Petersen graph, 134
PHP, 101
Pigeonhole Principle, 101
Pigeonhole principle (PHP), 101
Planar graph, 155
    Edges, 156
    Exterior face, 156
    Faces, 156
    Maximal, 158
    Regions, 156
Plane graph, 155
Positive elements in Z, 37
Power function, 34
Power set, 9
Prüfer code, 145
Principle
    Mathematical induction, 26
    Strong induction, 27
Principle of mathematical induction, 26
Principle of strong induction, 27
Product of permutations, 173
Product rule, 70
Pseudograph, 131

Ramsey number (r(m, n)), 166
Recurrence relation, 117
    Characteristic equation, 118
    General solution-Distinct roots, 118
    General solution-Multiple roots, 121
    Initial condition, 117
    Solution, 118
Recursion Theorem, 33
Relation, 10
    Domain, 12
    Equivalence, 18
    Inverse, 12
    Range, 12
    Reflexive, 17
    Symmetric, 17
    Transitive, 17
Restricted function, 15
Rotation, 85

Sequence, 53
Set
    Cartesian product, 10
    Complement, 9
    Composition of relations, 16
    Countable, 53
    Countably infinite, 53
    Denumerable, 53
    Difference, 8
    Disjoint, 7
    Empty, 6
    Enumeration, 53
    Equality, 7
    Finite, 42
    Identity relation, 14
    Infinite, 42
    Intersection, 7
    Multiset, 78
    Null, 6
    Partition, 19
    Power Set, 9
    Proper subset, 7
    Relation, 10
    Singleton, 6
    Subset, 7
    Symmetric difference, 8
    Uncountable, 53, 55
    Union, 7
Simple graph, 132
Singleton set, 6
Solution
    Non-negative integers, 78
Stabilizer of an element, 185
Stirling numbers
    Second kind (S(n, r)), 90
Stirling’s Identity, 127
Subgroup, 178
    Index, 182
    Left coset, 181
    Right coset, 181
Surjective function, 15
Symmetric group, 173

Trail in a graph, 137
Transcendental number, 57
Tree, 143
    Prüfer code, 145
Triangular numbers, 29
Trivial graph, 132

Uncountable set, 55
Unicyclic graph, 148

Vertex, 131
    Adjacent, 132

Walk in a graph, 137
Weight of a color pattern, 191
Well ordering principle, 32
word expansion, 81

Zero function, 14