Mira
preliminary edition
12 May 2019
Sheldon Axler
Measure, Integration & Real Analysis. Preliminary edition. 12 May 2019. ©2019 Sheldon Axler
Dedicated to
Sheldon Axler was valedictorian of his high school in Miami, Florida. He received his
AB from Princeton University with highest honors, followed by a PhD in Mathematics
from the University of California at Berkeley.
As a postdoctoral Moore Instructor at MIT, Axler received a university-wide
teaching award. Axler was then an assistant professor, associate professor, and
professor at Michigan State University, where he received the first J. Sutherland
Frame Teaching Award and the Distinguished Faculty Award.
Axler received the Lester R. Ford Award for expository writing from the Mathematical
Association of America in 1996. In addition to publishing numerous research
papers, Axler is the author of six mathematics textbooks, ranging from freshman to
graduate level. His book Linear Algebra Done Right has been adopted as a textbook
at over 300 universities and colleges.
Axler has served as Editor-in-Chief of the Mathematical Intelligencer and Associate
Editor of the American Mathematical Monthly. He has been a member of
the Council of the American Mathematical Society and a member of the Board of
Trustees of the Mathematical Sciences Research Institute. Axler has also served on
the editorial board of Springer’s series Undergraduate Texts in Mathematics, Graduate
Texts in Mathematics, Universitext, and Springer Monographs in Mathematics.
Axler has been honored by appointments as a Fellow of the American Mathematical
Society and as a Senior Fellow of the California Council on Science and
Technology.
Axler joined San Francisco State University as Chair of the Mathematics Department
in 1997. In 2002, Axler became Dean of the College of Science & Engineering
at San Francisco State University. After serving as Dean for thirteen years, Axler returned
to a regular faculty appointment as a professor in the Mathematics Department.
Contents (tentative beyond Chapter 10)
Acknowledgments
1 Riemann Integration
1A Review: Riemann Integral
Exercises 1A
2 Measures
2A Outer Measure on R
Motivation and Definition of Outer Measure
Good Properties of Outer Measure
Outer Measure of Closed Bounded Interval
Outer Measure is Not Additive
Exercises 2A
2D Lebesgue Measure
Additivity of Outer Measure on Borel Sets
Lebesgue Measurable Sets
Cantor Set
Exercises 2D
3 Integration
3A Integration with Respect to a Measure
Integration of Nonnegative Functions
Monotone Convergence Theorem
Integration of Real-Valued Functions
Exercises 3A
4 Differentiation
4A Hardy–Littlewood Maximal Function
Markov’s Inequality
Vitali Covering Lemma
Hardy–Littlewood Maximal Inequality
Exercises 4A
7 $L^p$ Spaces
7A $\mathcal{L}^p(\mu)$
Hölder’s Inequality
Minkowski’s Inequality
Exercises 7A
7B $L^p(\mu)$
Definition of $L^p(\mu)$
$L^p(\mu)$ is a Banach Space
Duality
Exercises 7B
8B Orthogonality
Orthogonal Projections
Orthogonal Complements
Riesz Representation Theorem
Exercises 8B
Exercises E
Index
Preface for Students
Acknowledgments
I owe a huge intellectual debt to the many mathematicians who created real analysis
over the past several centuries. The results in this book belong to the common heritage
of mathematics. A special case of a theorem may first have been proved by one
mathematician and then sharpened and improved by many other mathematicians.
Bestowing accurate credit for all the contributions would be a difficult task that I have
not undertaken. In no case should the reader assume that any theorem presented here
represents my original contribution. However, in writing this book I tried to think
about the best way to present real analysis and to prove its theorems, without regard
to the standard methods and proofs used in most textbooks.
More acknowledgments to come . . .
Sheldon Axler
Chapter 1
Riemann Integration
This chapter reviews Riemann integration. Riemann integration uses rectangles to
approximate areas under graphs. This chapter begins by carefully presenting the
definitions leading to the Riemann integral. The big result in the first section states
that a continuous real-valued function on a closed bounded interval is Riemann
integrable. The proof depends upon the theorem that continuous functions on closed
bounded intervals are uniformly continuous.
The second section of this chapter focuses on several deficiencies of Riemann
integration. As we will see, Riemann integration does not do everything that we
would like an integral to do. These deficiencies will provide motivation in future
chapters for the development of measures and integration with respect to measures.
The lower and upper Riemann sums, which we now define, approximate the
area under the graph of a nonnegative function (or, more generally, the signed area
corresponding to a real-valued function).
and
\[
U(f, P, [a, b]) = \sum_{j=1}^{n} (x_j - x_{j-1}) \sup_{[x_{j-1}, x_j]} f.
\]
Our intuition suggests that for a partition with only a small gap between consecutive
points, the lower Riemann sum should be a bit less than the area under the graph,
and the upper Riemann sum should be a bit more than the area under the graph.
The pictures in the next example help convey the idea of these approximations.
The base of the jth rectangle has length $x_j - x_{j-1}$ and has height $\inf_{[x_{j-1}, x_j]} f$ for the
lower Riemann sum and height $\sup_{[x_{j-1}, x_j]} f$ for the upper Riemann sum.
and
\[
U(x^2, P_n, [0, 1]) = \frac{1}{n} \sum_{j=1}^{n} \frac{j^2}{n^2} = \frac{2n^2 + 3n + 1}{6n^2},
\]
as you should verify [use the formula $1 + 4 + 9 + \cdots + n^2 = \frac{n(2n^2 + 3n + 1)}{6}$].
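The closed form above can also be checked by direct computation. The following is a minimal sketch, assuming (as the formula suggests) that $P_n$ is the partition $0, \frac{1}{n}, \frac{2}{n}, \dots, 1$ of [0, 1]. The function name is an illustrative choice, and the closed form used for the lower sum, $\frac{2n^2 - 3n + 1}{6n^2}$, is the one that appears later in this section. Exact rational arithmetic avoids rounding error.

```python
from fractions import Fraction


def riemann_sums_x_squared(n):
    """Lower and upper Riemann sums of x**2 for the partition 0, 1/n, ..., 1."""
    width = Fraction(1, n)
    # x**2 is increasing on [0, 1], so on [(j-1)/n, j/n] the infimum is
    # attained at the left endpoint and the supremum at the right endpoint.
    lower = sum(width * Fraction(j - 1, n) ** 2 for j in range(1, n + 1))
    upper = sum(width * Fraction(j, n) ** 2 for j in range(1, n + 1))
    return lower, upper


for n in (1, 2, 10, 100):
    lower, upper = riemann_sums_x_squared(n)
    # Compare with the closed forms (2n^2 -/+ 3n + 1) / (6n^2).
    assert lower == Fraction(2 * n * n - 3 * n + 1, 6 * n * n)
    assert upper == Fraction(2 * n * n + 3 * n + 1, 6 * n * n)
    print(n, lower, upper)
```

Both sums approach 1/3 as n grows, the lower sum from below and the upper sum from above.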
The next result states that adjoining more points to a partition increases the lower
Riemann sum and decreases the upper Riemann sum.
Proof To prove the first inequality, suppose P is the partition $x_0, \dots, x_n$ and $P'$ is the
partition $x'_0, \dots, x'_N$ of [a, b]. For each j = 1, . . . , n, there exist k ∈ {0, . . . , N − 1}
and a positive integer m such that $x_{j-1} = x'_k < x'_{k+1} < \cdots < x'_{k+m} = x_j$. We have
\begin{align*}
(x_j - x_{j-1}) \inf_{[x_{j-1}, x_j]} f
  &= \sum_{i=1}^{m} (x'_{k+i} - x'_{k+i-1}) \inf_{[x_{j-1}, x_j]} f \\
  &\le \sum_{i=1}^{m} (x'_{k+i} - x'_{k+i-1}) \inf_{[x'_{k+i-1}, x'_{k+i}]} f.
\end{align*}
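The inequality above says that splitting a subinterval can only raise the lower Riemann sum; symmetrically, it can only lower the upper Riemann sum. Here is a small numerical sketch of that effect. It assumes f is increasing, so that the infimum and supremum on each subinterval sit at its endpoints. The function name and the two partitions are illustrative choices, not taken from the text.

```python
from fractions import Fraction


def lower_upper_sums(f, partition):
    """Lower and upper Riemann sums of an INCREASING function f."""
    pairs = list(zip(partition, partition[1:]))
    lower = sum((right - left) * f(left) for left, right in pairs)
    upper = sum((right - left) * f(right) for left, right in pairs)
    return lower, upper


def square(x):
    return x * x


# P is a partition of [0, 1]; P_refined adjoins the points 1/4 and 2/3 to P.
P = [Fraction(0), Fraction(1, 2), Fraction(1)]
P_refined = [Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(2, 3), Fraction(1)]

lower_P, upper_P = lower_upper_sums(square, P)
lower_refined, upper_refined = lower_upper_sums(square, P_refined)

# Adjoining points raises the lower sum and lowers the upper sum.
assert lower_P <= lower_refined <= upper_refined <= upper_P
print(lower_P, lower_refined, upper_refined, upper_P)
```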
The following result states that if the function is fixed, then each lower Riemann
sum is less than or equal to each upper Riemann sum.
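Proof Let $P''$ be the partition of [a, b] obtained by merging the lists that define P
and $P'$. Then
\[
L(f, P, [a, b]) \le L(f, P'', [a, b]) \le U(f, P'', [a, b]) \le U(f, P', [a, b]),
\]
where the first and third inequalities follow from 1.5.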
We have been working with lower and upper Riemann sums. Now we define the
lower and upper Riemann integrals.
and
\[
U(f, [a, b]) = \inf_{P} U(f, P, [a, b]),
\]
where the supremum and infimum above are taken over all partitions P of [ a, b].
In the definition above, we take the supremum (over all partitions) of the lower
Riemann sums because adjoining more points to a partition increases the lower
Riemann sum (by 1.5) and should provide a more accurate estimate of the area under
the graph. Similarly, in the definition above, we take the infimum (over all partitions)
of the upper Riemann sums because adjoining more points to a partition decreases
the upper Riemann sum (by 1.5) and should provide a more accurate estimate of the
area under the graph.
Our first result about the lower and upper Riemann integrals is an easy inequality.
L( f , [ a, b]) ≤ U ( f , [ a, b]).
Proof The desired inequality follows from the definitions and 1.6.
The lower Riemann integral and the upper Riemann integral can both be reasonably
considered to be the area under the graph of a function. Which one should we use?
The pictures in Example 1.4 suggest that these two quantities are the same for the
function in that example; we will soon verify this suspicion. However, as we will see
in the next section, there are functions for which the lower Riemann integral does not
equal the upper Riemann integral.
Instead of choosing between the lower Riemann integral and the upper Riemann
integral, the standard procedure in Riemann integration is to consider only functions
for which those two quantities are equal. This decision has the huge advantage of
making the Riemann integral behave as we wish with respect to the sum of two
functions (see Exercise 5 in this section).
Let Z denote the set of integers and Z+ denote the set of positive integers.
\[
U(f, [0, 1]) \le \inf_{n \in \mathbf{Z}^+} \frac{2n^2 + 3n + 1}{6n^2} = \frac{1}{3} = \sup_{n \in \mathbf{Z}^+} \frac{2n^2 - 3n + 1}{6n^2} \le L(f, [0, 1]),
\]
where the two inequalities above come from Example 1.4 and the two equalities
easily follow from dividing the numerators and denominators of both fractions above
by $n^2$.
The paragraph above shows that $U(f, [0, 1]) \le \frac{1}{3} \le L(f, [0, 1])$. When combined
with 1.8, this shows that $L(f, [0, 1]) = U(f, [0, 1]) = \frac{1}{3}$. Thus f is Riemann
integrable and
\[
\int_0^1 f = \frac{1}{3}.
\]
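To spell out the division step used for the two equalities above, the two fractions can be rewritten as follows. This is only a worked restatement of the algebra, not an additional result.

```latex
\[
\frac{2n^2 + 3n + 1}{6n^2} = \frac{1}{3} + \frac{1}{2n} + \frac{1}{6n^2},
\qquad
\frac{2n^2 - 3n + 1}{6n^2} = \frac{1}{3} - \frac{3n - 1}{6n^2}.
\]
```

The first expression exceeds 1/3 for every n ∈ Z+ and decreases to 1/3 as n → ∞, so its infimum is 1/3. The second is less than 1/3 for every n ∈ Z+ (because 3n − 1 > 0) and tends to 1/3 as n → ∞, so its supremum is 1/3.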
\[
x_j - x_{j-1} = \frac{b - a}{n}
\]
for each j = 1, . . . , n. Then
≤ (b − a)ε,
where the first equality follows from the definitions of U ( f , [ a, b]) and L( f , [ a, b])
and the last inequality follows from 1.12.
We have shown that U ( f , [ a, b]) − L( f , [ a, b]) ≤ (b − a)ε for all ε > 0. Thus
1.8 implies that L( f , [ a, b]) = U ( f , [ a, b]). Hence f is Riemann integrable.
An alternative notation for $\int_a^b f$ is $\int_a^b f(x)\,dx$. Here x is a dummy variable, so
we could also write $\int_a^b f(t)\,dt$ or use another variable. This notation becomes useful
when we want to write something like $\int_0^1 x^2\,dx$ instead of using function notation.
The next result gives a frequently-used estimate for a Riemann integral.
EXERCISES 1A
1 Suppose f : [ a, b] → R is a bounded function such that
L( f , P, [ a, b]) = U ( f , P, [ a, b])
and
U ( f + g, [ a, b]) ≤ U ( f , [ a, b]) + U ( g, [ a, b]).
5 Suppose f , g : [ a, b] → R are Riemann integrable. Prove that f + g is Riemann
integrable on [ a, b] and
\[
\int_a^b (f + g) = \int_a^b f + \int_a^b g.
\]
1B Riemann Integral is Not Good Enough
because [ a, b] contains an irrational number (by 0.39) and a rational number (by
0.30). Thus L( f , P, [0, 1]) = 0 and U ( f , P, [0, 1]) = 1 for every partition P of [0, 1].
Hence L( f , [0, 1]) = 0 and U ( f , [0, 1]) = 1. Because L( f , [0, 1]) ≠ U ( f , [0, 1]),
we conclude that f is not Riemann integrable.
This example is disturbing because (as we will see later), there are far fewer
rational numbers than irrational numbers. Thus f should, in some sense, have
integral 0. However, the Riemann integral of f is not defined.
1.15 Example Riemann integration does not work with unbounded functions
Define f : [0, 1] → R by
\[
f(x) = \begin{cases} \dfrac{1}{\sqrt{x}} & \text{if } 0 < x \le 1, \\ 0 & \text{if } x = 0. \end{cases}
\]
If $x_0, x_1, \dots, x_n$ is a partition of [0, 1], then $\sup_{[x_0, x_1]} f = \infty$. Thus if we tried to apply
the definition of the upper Riemann sum to f , we would have U ( f , P, [0, 1]) = ∞
for every partition P of [0, 1].
However, we should consider the area under the graph of f to be 2, not ∞, because
\[
\lim_{a \downarrow 0} \int_a^1 f = \lim_{a \downarrow 0} (2 - 2\sqrt{a}) = 2.
\]
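The limit above can be illustrated numerically. The sketch below approximates the Riemann integral of 1/√x over [a, 1] with a midpoint rule and compares it with the exact value 2 − 2√a. The midpoint rule, the step count, and the function name are illustrative assumptions, and floating-point output is only approximate.

```python
import math


def integral_inv_sqrt(a, steps=200_000):
    """Midpoint-rule approximation of the integral of 1/sqrt(x) over [a, 1]."""
    width = (1.0 - a) / steps
    # Evaluate the integrand at the midpoint of each of the `steps` subintervals.
    return sum(width / math.sqrt(a + (i + 0.5) * width) for i in range(steps))


for a in (0.25, 0.01, 0.0001):
    exact = 2 - 2 * math.sqrt(a)
    print(a, integral_inv_sqrt(a), exact)
```

As a decreases to 0, both columns approach 2, even though f is unbounded on [0, 1].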
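Calculus courses deal with the previous example by defining $\int_0^1 \frac{1}{\sqrt{x}}\,dx$ to be
$\lim_{a \downarrow 0} \int_a^1 \frac{1}{\sqrt{x}}\,dx$. If using this approach and
\[
f(x) = \frac{1}{\sqrt{x}} + \frac{1}{\sqrt{1 - x}},
\]
then we would define $\int_0^1 f$ to be
\[
\lim_{a \downarrow 0} \int_a^{1/2} f + \lim_{b \uparrow 1} \int_{1/2}^{b} f.
\]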
However, the idea of taking Riemann integrals over subdomains and then taking
limits can fail with more complicated functions, as shown in the next example.
1.16 Example area seems to make sense, but Riemann integral is not defined
Let r1 , r2 , . . . be a sequence that includes each rational number in (0, 1) exactly
once and that includes no other numbers (0.57 implies that such a sequence exists).
For k ∈ Z+, define f k : [0, 1] → R by
\[
f_k(x) = \begin{cases} \dfrac{1}{\sqrt{x - r_k}} & \text{if } x > r_k, \\ 0 & \text{if } x \le r_k. \end{cases}
\]
Define f : [0, 1] → R by
\[
f(x) = \sum_{k=1}^{\infty} \frac{f_k(x)}{2^k}.
\]
Because every subinterval of [0, 1] with more than one element contains infinitely
many rational numbers (as follows from 0.30), f is unbounded on every such subinterval.
Thus the Riemann integral of f is undefined on every subinterval of [0, 1] with
more than one element.
However, the area under the graph of each f k is less than 2. The formula defining
f then shows that we should expect the area under the graph of f to be less than 2
rather than undefined.
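The bound on the area under each $f_k$ can be made explicit. The following sketch treats that area as an improper integral in the calculus sense, a limit of Riemann integrals over $[a, 1]$ as $a \downarrow r_k$. The term-by-term summation in the second line is the heuristic being appealed to here, not something that Riemann integration justifies.

```latex
\[
\lim_{a \downarrow r_k} \int_a^1 \frac{1}{\sqrt{x - r_k}}\,dx
  = \lim_{a \downarrow r_k} \bigl( 2\sqrt{1 - r_k} - 2\sqrt{a - r_k} \bigr)
  = 2\sqrt{1 - r_k} < 2,
\]
\[
\sum_{k=1}^{\infty} \frac{1}{2^k} \cdot 2\sqrt{1 - r_k}
  < \sum_{k=1}^{\infty} \frac{2}{2^k} = 2.
\]
```

The strict inequalities hold because each $r_k$ lies in (0, 1), so $\sqrt{1 - r_k} < 1$.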
The next example shows that the pointwise limit of a sequence of Riemann
integrable functions bounded by 1 need not be Riemann integrable.
1.17 Example Riemann integration does not work well with pointwise limits
Let r1 , r2 , . . . be a sequence that includes each rational number in [0, 1] exactly
once and that includes no other numbers (0.57 implies that such a sequence exists).
For k ∈ Z+, define f k : [0, 1] → R by
\[
f_k(x) = \begin{cases} 1 & \text{if } x \in \{r_1, \dots, r_k\}, \\ 0 & \text{otherwise.} \end{cases}
\]
Then each $f_k$ is Riemann integrable and $\int_0^1 f_k = 0$, as you should verify.
Define f : [0, 1] → R by
\[
f(x) = \begin{cases} 1 & \text{if } x \text{ is rational}, \\ 0 & \text{if } x \text{ is irrational.} \end{cases}
\]
Clearly
\[
\lim_{k \to \infty} f_k(x) = f(x) \quad \text{for each } x \in [0, 1].
\]
However, f is not Riemann integrable (see Example 1.14) even though f is the
pointwise limit of a sequence of integrable functions bounded by 1.
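To make the example concrete, here is a sketch built on one particular enumeration of the rationals in [0, 1], listed by increasing denominator. That enumeration and all the function names are our assumptions; the example only requires that some such sequence exists. The code evaluates $f_k$ at a rational point and computes an upper Riemann sum of $f_k$ exactly.

```python
from fractions import Fraction
from itertools import islice
from math import gcd


def rationals_in_unit_interval():
    """Yield each rational in [0, 1] exactly once: 0, 1, 1/2, 1/3, 2/3, ..."""
    yield Fraction(0)
    yield Fraction(1)
    q = 2
    while True:
        for p in range(1, q):
            if gcd(p, q) == 1:  # lowest terms only, so no value repeats
                yield Fraction(p, q)
        q += 1


def f_k(k, x):
    """Indicator function of {r_1, ..., r_k} for the enumeration above."""
    return 1 if x in set(islice(rationals_in_unit_interval(), k)) else 0


def upper_sum_f_k(k, n):
    """Upper Riemann sum of f_k for the partition 0, 1/n, ..., 1 of [0, 1]."""
    points = list(islice(rationals_in_unit_interval(), k))
    # sup of f_k over [(j-1)/n, j/n] is 1 exactly when some r_i lies in it.
    hits = sum(
        1
        for j in range(1, n + 1)
        if any(Fraction(j - 1, n) <= r <= Fraction(j, n) for r in points)
    )
    return Fraction(hits, n)


x = Fraction(3, 7)  # the 16th term of this enumeration
print([f_k(k, x) for k in (5, 15, 16, 100)])
print(upper_sum_f_k(5, 1000))  # at most 2 * 5 / 1000
```

For a fixed rational x, the values $f_k(x)$ are eventually 1. Each of the k points lies in at most two subintervals, so the upper sum is at most 2k/n. Every lower sum of $f_k$ is 0, because each subinterval contains a point outside the finite set.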
Because analysis relies heavily upon limits, a good theory of integration should
allow for interchange of limits and integrals, at least when the functions are appropriately
bounded. Thus the previous example points out a serious deficiency in Riemann
integration.
Now we come to a positive result, but as we will see, even this result indicates that
Riemann integration has some problems.
| f k ( x )| ≤ M
for all k ∈ Z+ and all x ∈ [ a, b]. Suppose limk→∞ f k ( x ) exists for each
x ∈ [ a, b]. Define f : [ a, b] → R by
\[
f(x) = \lim_{k \to \infty} f_k(x).
\]
The result above suffers from two problems. The first problem is the undesirable
hypothesis that the limit function f is Riemann integrable. Ideally, that property
would follow from the other hypotheses, but Example 1.17 shows that we must
explicitly include the assumption that f is Riemann integrable.
The second problem with the result above is that it does not seem to have a
reasonable proof using just the tools of Riemann integration. Thus a proof of the
result above will not be given here. A proof of a stronger result will be given later,
using the tools of measure theory that we will develop starting with the next chapter.
The lack of a good Riemann-integration-based proof of the result above indicates that
Riemann integration is not the ideal theory of integration.
We have not discussed differentiation (but we will do so in Chapter 4). However,
you should recall from your calculus class the following version of the Fundamental
Theorem of Calculus: if f is differentiable on an open interval containing [ a, b] and
f′ is continuous on [ a, b], then
\[
f(b) - f(a) = \int_a^b f'.
\]
Note the hypothesis above that f′ is continuous on [ a, b]. It would be nice not to
have that hypothesis. However, that hypothesis (or something close to it) is needed
because there exist functions f that are differentiable everywhere on an open interval
containing [ a, b] but whose derivative f′ is not Riemann integrable on [ a, b]. In other
words, if we use Riemann integration, then the right side of the equation above need
not make sense even if f′ is defined everywhere.
EXERCISES 1B
1 Define f : [0, 1] → R as follows:
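\[
f(a) = \begin{cases} 0 & \text{if } a \text{ is irrational}, \\ \dfrac{1}{n} & \text{if } a \text{ is rational and } n \text{ is the smallest positive integer such that } a = \dfrac{m}{n} \text{ for some integer } m. \end{cases}
\]
Show that f is Riemann integrable and compute $\int_0^1 f$.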
and
U ( f + g, [0, 1]) ≠ U ( f , [0, 1]) + U ( g, [0, 1]).
4 Give an example of a sequence of continuous real-valued functions f 1 , f 2 , . . .
on [0, 1] and a continuous real-valued function f on [0, 1] such that
\[
f(x) = \lim_{k \to \infty} f_k(x)
\]
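5 Show that
\[
\lim_{j \to \infty} \Bigl( \lim_{k \to \infty} \bigl( \cos(j!\,\pi x) \bigr)^{2k} \Bigr) = \begin{cases} 1 & \text{if } x \text{ is rational}, \\ 0 & \text{if } x \text{ is irrational} \end{cases}
\]
for every x ∈ R.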
[This example is due to Henri Lebesgue.]
Chapter 2
Measures
The last section of the previous chapter discusses several deficiencies of Riemann
integration. To remedy those deficiencies, in this chapter we will extend the notion of
the length of an interval to a larger collection of subsets of R. This will lead us to
measures and then in the next chapter to integration with respect to measures.
We begin this chapter by investigating outer measure, which looks promising but
fails to have a crucial property. That failure leads us to σ-algebras and measurable
spaces. Then we define measures, in a more abstract context that can be applied to
settings more general than R. Next, we will construct Lebesgue measure on R as our
desired extension of the notion of the length of an interval.
2A Outer Measure on R
Motivation and Definition of Outer Measure
The Riemann integral arises from approximating the area under the graph of a function
by sums of the areas of approximating rectangles. These rectangles have heights that
approximate the values of the function on subintervals of the function’s domain. The
width of each approximating rectangle is the length of the corresponding subinterval.
This length is the term x j − x j−1 in the definitions of the lower and upper Riemann
sums (see 1.3).
To extend integration to a larger class of functions than the Riemann integrable
functions, we will write the domain of a function as the union of subsets more
complicated than the subintervals used in Riemann integration. We will need to
assign a size to each of those subsets, where the size is an extension of the length of
intervals. For example, we expect the size of the set (1, 3) ∪ (7, 10) to be 5 (because
the first interval has length 2, the second interval has length 3, and 2 + 3 = 5).
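For a finite union of bounded open intervals, the expected size can be computed by sweeping from left to right and counting each covered stretch once. The sketch below is only that elementary computation; it is not the definition of outer measure, and the function name is an illustrative choice.

```python
def union_length(intervals):
    """Total length of the union of finitely many bounded open intervals."""
    total = 0
    covered_up_to = float("-inf")  # everything counted so far lies to the left
    for left, right in sorted(intervals):
        if right <= left:
            continue  # (left, right) is empty
        start = max(left, covered_up_to)  # skip the part already counted
        if right > start:
            total += right - start
            covered_up_to = right
    return total


print(union_length([(1, 3), (7, 10)]))  # disjoint intervals: lengths add
print(union_length([(1, 4), (3, 5)]))   # overlap (3, 4) is counted only once
```

The first call reproduces the value 5 from the paragraph above. The second shows that overlapping intervals contribute less than the sum of their lengths.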
Assigning a size to subsets of R that are more complicated than unions of open
intervals becomes a nontrivial task. This chapter focuses on that task and its extension
to other contexts. In the next chapter, we will see how to use the ideas developed in
this chapter to create a rich theory of integration.
We begin by giving the expected definition of the length of an open interval, along
with a notation for that length.
The definition of outer measure involves an infinite sum. The infinite sum
$\sum_{k=1}^{\infty} t_k$ of a sequence $t_1, t_2, \dots$ of elements of [0, ∞] is defined to be ∞ if some
$t_k = \infty$. Otherwise, $\sum_{k=1}^{\infty} t_k$ is defined to be the limit of the increasing sequence
$t_1, t_1 + t_2, t_1 + t_2 + t_3, \dots$ of partial sums; thus
\[
\sum_{k=1}^{\infty} t_k = \lim_{n \to \infty} \sum_{k=1}^{n} t_k.
\]
The result above along with the result that the set Q of rational numbers is
countable (see 0.57) implies that Q has outer measure 0. We will soon show that
there are far fewer rational numbers than real numbers (see 2.16). Thus the equation
|Q| = 0 indicates that outer measure has a good property that we want any reasonable
notion of size to possess.
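One standard way to see that a countable set has outer measure 0 is to cover its kth element by an open interval of length ε/2^k, so that the lengths sum to less than ε. The sketch below assumes that construction, since the proof of the result above is not reproduced in this excerpt. It applies it to the first few terms of one enumeration of the rationals in [0, 1]; the helper names and the enumeration are hypothetical choices.

```python
from fractions import Fraction
from math import gcd


def first_rationals(count):
    """First `count` terms of one enumeration of the rationals in [0, 1]."""
    found = [Fraction(0), Fraction(1)]
    q = 2
    while len(found) < count:
        found.extend(Fraction(p, q) for p in range(1, q) if gcd(p, q) == 1)
        q += 1
    return found[:count]


def shrinking_cover(points, eps):
    """Cover the k-th point (k = 1, 2, ...) by an open interval of length eps / 2**k."""
    cover = []
    for k, a in enumerate(points, start=1):
        half = eps / 2 ** (k + 1)  # half of the interval's length
        cover.append((a - half, a + half))
    return cover


eps = Fraction(1, 100)
points = first_rationals(20)
cover = shrinking_cover(points, eps)
total = sum(right - left for left, right in cover)
print(total, total < eps)
```

However many points are covered this way, the total length stays below ε, which is why the infimum over all covers is 0.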
The next result shows that outer measure does the right thing with respect to set
inclusion.
Taking the infimum over all sequences of open intervals whose union contains B, we
have | A| ≤ | B|, as desired.
We expect that the size of a subset of R should not change if the set is shifted to
the right or to the left. The next definition will allow us to be more precise.
t + A = { t + a : a ∈ A }.
If t > 0, then t + A is obtained by moving the set A to the right t units on the real
line; if t < 0, then t + A is obtained by moving the set A to the left |t| units.
Translation does not change the length of an open interval. Specifically, if t ∈ R
and a, b ∈ [−∞, ∞], then t + ( a, b) = (t + a, t + b) and thus ℓ(t + (a, b)) =
ℓ((a, b)). Here we are using the standard convention that t + (−∞) = −∞ and
t + ∞ = ∞.
The next result states that translation invariance carries over to outer measure.
The union of the intervals (1, 4) and (3, 5) is the interval (1, 5). Thus
ℓ((1, 4) ∪ (3, 5)) < ℓ((1, 4)) + ℓ((3, 5))
because the left side of the inequality above equals 4 and the right side equals 5. The
direction of the inequality above is explained by noting that the interval (3, 4), which
is the intersection of (1, 4) and (3, 5), has its length counted twice on the right side
of the inequality above.
The example of the paragraph above should provide intuition for the direction of
the inequality in the next result. The property of satisfying the inequality in the result
below is called countable subadditivity because it applies to sequences of subsets.
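Proof Let ε > 0. For each k ∈ Z+, let $I_{1,k}, I_{2,k}, \dots$ be a sequence of open intervals
whose union contains $A_k$ such that
\[
\sum_{j=1}^{\infty} \ell(I_{j,k}) \le \frac{\varepsilon}{2^k} + |A_k|.
\]
Thus
\[
\sum_{k=1}^{\infty} \sum_{j=1}^{\infty} \ell(I_{j,k}) \le \varepsilon + \sum_{k=1}^{\infty} |A_k|. \tag{2.9}
\]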
\[
\underbrace{I_{1,1}}_{2},\ \underbrace{I_{1,2}, I_{2,1}}_{3},\ \underbrace{I_{1,3}, I_{2,2}, I_{3,1}}_{4},\ \underbrace{I_{1,4}, I_{2,3}, I_{3,2}, I_{4,1}}_{5},\ \underbrace{I_{1,5}, I_{2,4}, I_{3,3}, I_{4,2}, I_{5,1}}_{\text{sum of the two indices is 6}},\ \dots
\]
Inequality 2.9 shows that the sum of the lengths of the intervals listed above is less
than or equal to $\varepsilon + \sum_{k=1}^{\infty} |A_k|$. Thus $\bigl|\bigcup_{k=1}^{\infty} A_k\bigr| \le \varepsilon + \sum_{k=1}^{\infty} |A_k|$. Because ε is an
arbitrary positive number, this implies that $\bigl|\bigcup_{k=1}^{\infty} A_k\bigr| \le \sum_{k=1}^{\infty} |A_k|$, as desired.
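The rearrangement of the doubly indexed intervals into a single sequence, used in the proof above, lists index pairs in groups according to the sum of the two indices. A minimal sketch of that ordering follows; the generator name is an illustrative choice.

```python
from itertools import count, islice


def diagonal_pairs():
    """Yield all pairs (j, k) of positive integers, grouped by j + k = 2, 3, 4, ..."""
    for total in count(2):
        for j in range(1, total):
            yield (j, total - j)  # the pair (j, k) with j + k == total


first_fifteen = list(islice(diagonal_pairs(), 15))
print(first_fifteen)

# Every pair with j + k <= 6 appears exactly once among the first 15 terms.
expected = {(j, k) for j in range(1, 6) for k in range(1, 6) if j + k <= 6}
assert set(first_fifteen) == expected and len(first_fifteen) == len(expected)
```

Because the group with index sum s contains only s − 1 pairs, every pair (j, k) is reached after finitely many steps, so the single sequence contains every interval.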
| A1 ∪ · · · ∪ A n | ≤ | A1 | + · · · + | A n |
for all A1 , . . . , An ⊂ R, because we can take Ak = ∅ for k > n in 2.8.
The finite and countable subadditivity of outer measure, as proved above, add to
our list of nice properties enjoyed by outer measure.
Suppose A ⊂ R.
• A collection C of open subsets of R is called an open cover of A if A is
contained in the union of all the sets in C .
• An open cover C of A is said to have a finite subcover if A is contained in
the union of some finite list of sets in C .
• The collection {(k, k + 2) : k ∈ Z+ } is an open cover of [2, 5] because
$[2, 5] \subset \bigcup_{k=1}^{\infty} (k, k + 2)$. This open cover has a finite subcover because [2, 5] ⊂
(1, 3) ∪ (2, 4) ∪ (3, 5) ∪ (4, 6).
• The collection {(k, k + 2) : k ∈ Z+ } is an open cover of [2, ∞) because
$[2, \infty) \subset \bigcup_{k=1}^{\infty} (k, k + 2)$. This open cover does not have a finite subcover
because there do not exist finitely many sets of the form (k, k + 2) [with k ∈ Z+ ]
whose union contains [2, ∞).
• The collection $\{(0, 2 - \frac{1}{k}) : k \in \mathbf{Z}^+\}$ is an open cover of (1, 2) because
$(1, 2) \subset \bigcup_{k=1}^{\infty} (0, 2 - \frac{1}{k})$. This open cover does not have a finite subcover
because there do not exist finitely many sets of the form $(0, 2 - \frac{1}{k})$ whose union
contains (1, 2).
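Whether a given finite list of open intervals covers a closed bounded interval can be checked mechanically. The sketch below is one simple way to do that, under the assumption that the candidate subcover is a finite list of bounded open intervals. The function name and the greedy left-to-right strategy are ours, not from the text.

```python
from fractions import Fraction


def covers_closed_interval(open_intervals, a, b):
    """Return True if finitely many open intervals (left, right) cover [a, b]."""
    point = a  # leftmost point of [a, b] not yet known to be covered
    while point <= b:
        # Right endpoints of the intervals that contain `point`.
        rights = [right for left, right in open_intervals if left < point < right]
        if not rights:
            return False  # `point` lies in [a, b] but in none of the intervals
        # Everything from `point` up to max(rights) is covered, except that
        # endpoint itself, so it becomes the next point to check.
        point = max(rights)
    return True


subcover = [(1, 3), (2, 4), (3, 5), (4, 6)]
print(covers_closed_interval(subcover, 2, 5))      # the finite subcover of [2, 5]
print(covers_closed_interval(subcover[:3], 2, 5))  # dropping (4, 6) misses the point 5

# Finitely many sets (0, 2 - 1/k) reach only up to 2 - 1/10 here.
finite_part = [(0, 2 - Fraction(1, k)) for k in range(1, 11)]
print(covers_closed_interval(finite_part, 1, Fraction(39, 20)))
```

The loop terminates because `point` strictly increases through the finitely many right endpoints.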
The next result will be our major tool in the proof that |[ a, b]| ≥ b − a. Although
we need only the result as stated, be sure to see Exercise 5 in this section, which
when combined with the next result gives a characterization of the closed bounded
subsets of R. Note that the following proof uses the completeness property of the real
numbers (by asserting that the supremum of a certain nonempty bounded set exists).
See Section D in the Appendix for a review of closed sets.
Now we can prove that closed intervals have the expected outer measure.
Proof See the first paragraph of this subsection for the proof that |[ a, b]| ≤ b − a.
To prove the inequality in the other direction, suppose $I_1, I_2, \dots$ is a sequence of
open intervals such that $[a, b] \subset \bigcup_{k=1}^{\infty} I_k$. By the Heine–Borel Theorem (2.12), there
exists n ∈ Z+ such that
\[
[a, b] \subset I_1 \cup \cdots \cup I_n. \tag{2.14}
\]
We will now prove by induction on n that the inclusion above implies that
\[
\sum_{k=1}^{n} \ell(I_k) \ge b - a. \tag{2.15}
\]
Hence
\[
[a, c] \subset I_1 \cup \cdots \cup I_n.
\]
By our induction hypothesis, we have $\sum_{k=1}^{n} \ell(I_k) \ge c - a$. Thus
\begin{align*}
\sum_{k=1}^{n+1} \ell(I_k) &\ge (c - a) + \ell(I_{n+1}) \\
  &= (c - a) + (d - c) \\
  &= d - a \\
  &\ge b - a,
\end{align*}
completing the proof.
Alice was beginning to get very tired of sitting by her sister on the bank, and of
having nothing to do: once or twice she had peeped into the book her sister was
reading, but it had no pictures or conversations in it, “and what is the use of a
book,” thought Alice “without pictures or conversation?”
–opening paragraph of Alice’s Adventures in Wonderland, by Lewis Carroll
The previous result has the following important corollary. You may be familiar
with Georg Cantor’s (1845–1918) original proof of the next result. The proof using
outer measure that is presented here gives an interesting alternative to Cantor’s proof.
| I | ≥ |[ a, b]| = b − a > 0,
where the first inequality above holds because outer measure preserves order (see 2.5)
and the equality above comes from 2.13. Because every countable subset of R has
outer measure 0 (see 2.4), we can conclude that I is uncountable.
| A ∪ B | ≠ | A | + | B |.
Proof For a ∈ [−1, 1], let ã be the set of numbers in [−1, 1] that differ from a by a
rational number. In other words,
ã = {c ∈ [−1, 1] : a − c ∈ Q}.
Clearly a ∈ ã for each a ∈ [−1, 1]. Thus
\[
[-1, 1] = \bigcup_{a \in [-1, 1]} \tilde{a}.
\]
Let V be a set that contains exactly one element in each of the distinct sets in
{ ã : a ∈ [−1, 1]}.
In other words, for every a ∈ [−1, 1], the set V ∩ ã has exactly one element.
This step involves the Axiom of Choice, as discussed after this proof. The set V
arises by choosing one element from each equivalence class.
Let r1 , r2 , . . . be a sequence of distinct rational numbers such that
We know that |[−1, 1]| = 2 (from 2.13). The translation invariance of outer measure
(2.7) thus allows us to rewrite the inequality above as
\[
2 \le \sum_{k=1}^{\infty} |V|.
\]
Thus |V | > 0.
Note that the sets r1 + V, r2 + V, . . . are disjoint. (Proof: Suppose there exists
t ∈ (r j + V ) ∩ (rk + V ). Then t = r j + v1 = rk + v2 for some v1 , v2 ∈ V, which
implies that v1 − v2 = rk − r j ∈ Q. Our construction of V now implies that v1 = v2 ,
which implies that r j = rk , which implies that j = k.)
Let n ∈ Z+. Clearly
\[
\bigcup_{k=1}^{n} (r_k + V) \subset [-3, 3]
\]
because V ⊂ [−1, 1] and each $r_k$ ∈ [−2, 2]. The set inclusion above implies that
\[
\Bigl| \bigcup_{k=1}^{n} (r_k + V) \Bigr| \le 6. \tag{2.18}
\]
However
\[
\sum_{k=1}^{n} |r_k + V| = \sum_{k=1}^{n} |V| = n\,|V|. \tag{2.19}
\]
Now 2.18 and 2.19 suggest that we choose n ∈ Z+ such that n|V | > 6. Thus
\[
\Bigl| \bigcup_{k=1}^{n} (r_k + V) \Bigr| < \sum_{k=1}^{n} |r_k + V|. \tag{2.20}
\]
The Axiom of Choice, which belongs to set theory, states that if E is a set whose
elements are disjoint sets, then there exists a set D that contains exactly one element
in each set that is an element of E . We used the Axiom of Choice to construct the set
V that was used in the last proof.
A small minority of mathematicians objects to the use of the Axiom of Choice.
Thus we will keep track of where we need to use it. Even if you do not like to use the
Axiom of Choice, the previous result warns us away from trying to prove that outer
measure is additive (any such proof would need to contradict the Axiom of Choice,
which is consistent with the standard axioms of set theory).
EXERCISES 2A
1 Prove that if A and B are subsets of R and | B| = 0, then | A ∪ B| = | A|.
2 Suppose A ⊂ R and t ∈ R. Let tA = {ta : a ∈ A}. Prove that |tA| = |t| | A|.
[Assume that 0 · ∞ is defined to be 0.]
3 Prove that if A, B ⊂ R and | A| < ∞, then | B \ A| ≥ | B| − | A|.
4 Prove that $|A| = \lim_{k \to \infty} |A \cap [-k, k]|$ for all A ⊂ R.
5 Suppose F is a subset of R with the property that every open cover of F has a
finite subcover. Prove that F is closed and bounded.
6 Suppose $\mathcal{A}$ is a set of closed subsets of R such that $\bigcap_{F \in \mathcal{A}} F = \emptyset$. Prove that if $\mathcal{A}$
contains at least one bounded set, then there exist n ∈ Z+ and $F_1, \dots, F_n \in \mathcal{A}$
such that $F_1 \cap \cdots \cap F_n = \emptyset$.
7 Prove that if a, b ∈ R and a < b, then
8 Suppose a, b, c, d are real numbers with a < b and c < d. Prove that
12 Suppose ε > 0. Prove that there exists a subset F of [0, 1] such that F is closed,
every element of F is an irrational number, and | F | > 1 − ε.
13 Consider the following figure, which is drawn accurately to scale.
(a) Show that the right triangle whose vertices are (0, 0), (20, 0), and (20, 9)
has area 90.
[We have not defined area yet, but just use the elementary formulas for the
areas of triangles and rectangles that you learned long ago.]
(b) Show that the yellow right triangle has area 27.5.
(c) Show that the red rectangle has area 45.
(d) Show that the blue right triangle has area 18.
(e) Add the results of parts (b), (c), and (d), showing that the area of the colored
region is 90.5.
(f) Seeing the figure above, most people expect that parts (a) and (e) will have
the same result. Yet in part (a) we found area 90, and in part (e) we found
area 90.5. Explain why these results differ.
[You may be tempted to think that what we have here is a two-dimensional
example similar to the result about the nonadditivity of outer measure (2.17).
However, examples of nonadditivity require much more complicated sets
than in this example.]
There does not exist a function µ with all the following properties:
µ ( B ) = µ ( A ) + µ ( B \ A ) + 0 + 0 + · · · = µ ( A ) + µ ( B \ A ) ≥ µ ( A ).
We have shown that µ has all the properties of outer measure that were used
in the proof of 2.17. Repeating the proof of 2.17, we see that there exist disjoint
subsets A, B of R such that µ(A ∪ B) ≠ µ(A) + µ(B). Thus the disjoint sequence
A, B, ∅, ∅, . . . does not satisfy the countable additivity property required by (c). This
contradiction completes the proof.
σ-Algebras
The last result shows that we need to give up one of the desirable properties in our
goal of extending the notion of size from intervals to more general subsets of R. We
cannot give up 2.21(b) because the size of an interval needs to be its length. We
cannot give up 2.21(c) because countable additivity is needed to prove theorems
about limits. We cannot give up 2.21(d) because a size that is not translation invariant
does not satisfy our intuitive notion of size as a generalization of length.
Thus we are forced to relax the requirement in 2.21(a) that the size is defined for
all subsets of R. Experience shows that to have a viable theory that allows for taking
limits, the collection of subsets for which the size is defined should be closed under
complementation and closed under countable unions. Thus we make the following
definition.
Make sure you verify that the examples in all three bullet points below are indeed
σ-algebras. The verification is obvious for the first two bullet points. For the third
bullet point, you need to use the result that the countable union of countable sets
is countable (see the proof of 2.8 for an example of how a doubly-indexed list can
be converted to a singly-indexed sequence). The exercises contain some additional
examples of σ-algebras.
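The conversion of a doubly-indexed list into a singly-indexed sequence can be sketched in code. The following is a minimal illustration, assuming one particular ordering (along the diagonals j + k = 2, 3, 4, ...); the proof of 2.8 may order the pairs differently, and the name `diagonal_pairs` is hypothetical.

```python
from itertools import count, islice

def diagonal_pairs():
    """Yield every pair (j, k) of positive integers exactly once.

    Pairs are listed along the diagonals j + k = 2, 3, 4, ...,
    so each pair is reached after finitely many steps.
    """
    for total in count(2):
        for j in range(1, total):
            yield (j, total - j)

# The first six pairs of the singly-indexed sequence.
first_six = list(islice(diagonal_pairs(), 6))
print(first_six)
```

If the k-th countable set is listed as a sequence indexed by j, then visiting the pairs (j, k) in this order lists every element of the union, which is why the union is countable.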
(a) X ∈ S ;
(b) if D, E ∈ S , then D ∪ E ∈ S and D ∩ E ∈ S and D \ E ∈ S ;
(c) if E1 , E2 , . . . is a sequence of elements of S , then $\bigcap_{k=1}^{\infty} E_k \in S$.
Proof Because ∅ ∈ S and X = X \ ∅, the first two bullet points in the definition
of σ-algebra (2.22) imply that X ∈ S , proving (a).
Suppose D, E ∈ S . Then D ∪ E is the union of the sequence D, E, ∅, ∅, . . . of
elements of S . Thus the third bullet point in the definition of σ-algebra (2.22) implies
that D ∪ E ∈ S .
De Morgan’s Laws (0.63) tell us that
X \ ( D ∩ E ) = ( X \ D ) ∪ ( X \ E ).
For example, if X = R and S is the set of all subsets of R that are countable or
have a countable complement, then the set of rational numbers is S -measurable but
the set of positive real numbers is not S -measurable.
Borel Subsets of R
The next result guarantees that there is a smallest σ-algebra on a set X containing a
given set A of subsets of X.
Proof There is at least one σ-algebra on X that contains A because the σ-algebra
consisting of all subsets of X contains A.
Let S be the intersection of all σ-algebras on X that contain A. Then ∅ ∈ S
because ∅ is an element of each σ-algebra on X that contains A.
Suppose E ∈ S . Thus E is in every σ-algebra on X that contains A. Thus X \ E
is in every σ-algebra on X that contains A. Hence X \ E ∈ S .
Suppose E1 , E2 , . . . is a sequence of elements of S . Thus each Ek is in every
σ-algebra on X that contains A. Thus $\bigcup_{k=1}^{\infty} E_k$ is in every σ-algebra on X that contains
A. Hence $\bigcup_{k=1}^{\infty} E_k \in S$, which completes the proof that S is a σ-algebra on X.
Using the terminology smallest for the intersection of all σ-algebras that contain
a set A of subsets of X makes sense because the intersection of those σ-algebras is
contained in every σ-algebra that contains A.
• Suppose X is a set and A is the set of subsets of X that consist of exactly one
element:
A = {{x} : x ∈ X}.
Then the smallest σ-algebra on X containing A is the set of all subsets E of X
such that E is countable or X \ E is countable, as you should verify.
• Suppose A = {(0, 1), (0, ∞)}. Then the smallest σ-algebra on R containing
A is {∅, (0, 1), (0, ∞), (−∞, 0] ∪ [1, ∞), (−∞, 0], [1, ∞), (−∞, 1), R}, as you
should verify.
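The second example can be checked mechanically on a finite model. The sketch below assumes that R is split into three pieces with the hypothetical labels "neg" = (−∞, 0], "mid" = (0, 1), and "pos" = [1, ∞), so that the two generators become {mid} and {mid, pos}. On a finite set, countable unions reduce to finite unions, so closing under complements and pairwise unions suffices.

```python
from itertools import combinations

def generate_sigma_algebra(universe, generators):
    """Smallest family containing the generators that is closed under
    complements and unions (finite universe only)."""
    universe = frozenset(universe)
    family = {frozenset(), universe} | {frozenset(g) for g in generators}
    while True:
        new_sets = {universe - e for e in family}
        new_sets |= {d | e for d, e in combinations(family, 2)}
        if new_sets <= family:
            return family          # nothing new: the family is closed
        family |= new_sets

# Hypothetical labels for the three pieces of R in the example:
# "neg" = (-inf, 0], "mid" = (0, 1), "pos" = [1, inf).
atoms = {"neg", "mid", "pos"}
sigma = generate_sigma_algebra(atoms, [{"mid"}, {"mid", "pos"}])
print(len(sigma))
```

The resulting family has eight members, one for each of the eight sets listed in the example.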
• Every closed subset of R is a Borel set because every closed subset of R is the
complement of an open subset of R.
• Every half-open interval [a, b) (where a, b ∈ R) is a Borel set because
$[a, b) = \bigcap_{k=1}^{\infty} \bigl(a - \tfrac{1}{k}, b\bigr)$.
• If f : R → R is a function, then the set of points at which f is continuous is the
intersection of a sequence of open sets (see Exercise 12 in this section) and thus
is a Borel set.
Inverse Images
The next definition will be used frequently in the rest of this chapter.
f −1 ( A ) = { x ∈ X : f ( x ) ∈ A } .
Inverse images have good algebraic properties, as is shown in the next two results.
Thus $f^{-1}\bigl(\bigcup_{A \in \mathcal{A}} A\bigr) = \bigcup_{A \in \mathcal{A}} f^{-1}(A)$, which proves (b).
Part (c) is proved in the same fashion as (b), with unions replaced by intersections
and for some replaced by for every.
$(g \circ f)^{-1}(A) = f^{-1}\bigl(g^{-1}(A)\bigr)$
for every A ⊂ W.
⇐⇒ f(x) ∈ g⁻¹(A)
⇐⇒ x ∈ f⁻¹(g⁻¹(A)).
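The algebraic properties of inverse images can be spot-checked on finite sets. The sketch below is only an illustration, not a proof; the functions f and g, the sets X, Y, A, B, C, and the helper `inverse_image` are all hypothetical choices made for the example.

```python
def inverse_image(f, domain, A):
    """Return the inverse image {x in domain : f(x) in A}."""
    return {x for x in domain if f(x) in A}

X = range(-3, 4)

def f(x):            # f : X -> Y
    return x * x

def g(y):            # g : Y -> W
    return y % 3

Y = {f(x) for x in X}
A, B = {0, 1}, {1, 4, 9}

# Inverse images commute with unions and intersections.
union_ok = inverse_image(f, X, A | B) == inverse_image(f, X, A) | inverse_image(f, X, B)
inter_ok = inverse_image(f, X, A & B) == inverse_image(f, X, A) & inverse_image(f, X, B)

# Inverse image of a composition: first pull back by g, then by f.
C = {1}
compose_ok = (inverse_image(lambda x: g(f(x)), X, C)
              == inverse_image(f, X, inverse_image(g, Y, C)))

print(union_ok, inter_ok, compose_ok)
```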
Measurable Functions
The next definition tells us which real-valued functions behave reasonably with
respect to a σ-algebra on their domain.
f −1 ( B ) ∈ S
The set X that contains E is not explicitly included in the notation χ E because X will
always be clear from the context.
f −1 ( a, ∞) ∈ S
Proof Let
T = { A ⊂ R : f −1 ( A) ∈ S}.
We want to show that every Borel subset of R is in T . To do this, we will first show
that T is a σ-algebra on R.
Certainly ∅ ∈ T , because f −1 (∅) = ∅ ∈ S .
If A ∈ T , then f −1 ( A) ∈ S ; hence
f −1 ( R \ A ) = X \ f −1 ( A ) ∈ S
The union inside the large parentheses above is an open subset of R [by 0.55(a)], and
hence its intersection with X is a Borel set. Thus we can conclude that f −1 ( a, ∞)
is a Borel set.
Now 2.38 implies that f is a Borel measurable function.
f −1 ( a, ∞) = (b, ∞) ∩ X or f −1 ( a, ∞) = [b, ∞) ∩ X.
The next result shows that measurability interacts well with composition.
$(g \circ f)^{-1}(B) = f^{-1}\bigl(g^{-1}(B)\bigr)$.
Measurability also interacts well with algebraic operations, as shown in the next
result.
Thus a < f(x) + g(x). Hence the open interval (a − g(x), f(x)) is nonempty, and
thus it contains some rational number r. This implies that r < f(x), which means
that x ∈ f⁻¹((r, ∞)), and a − g(x) < r, which implies that x ∈ g⁻¹((a − r, ∞)).
Thus x is an element of the right side of 2.46, completing the proof that the left side
of 2.46 is contained in the right side.
The proof of the inclusion in the other direction is easier. Specifically, suppose
x ∈ f⁻¹((r, ∞)) ∩ g⁻¹((a − r, ∞)) for some r ∈ Q. Thus
r < f(x) and a − r < g(x).
Adding these two inequalities, we see that a < f ( x ) + g( x ). Thus x is an element of
the left side of 2.46, completing the proof of 2.46. Hence f + g is an S -measurable
function.
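The display numbered 2.46 is not reproduced above. Judging from the two inclusions just proved, it presumably has the following form; this is a reconstruction from the proof, not a quotation of the original display.

```latex
\[
(f+g)^{-1}\bigl((a,\infty)\bigr)
  = \bigcup_{r \in \mathbf{Q}}
    \Bigl( f^{-1}\bigl((r,\infty)\bigr) \cap g^{-1}\bigl((a-r,\infty)\bigr) \Bigr)
\]
```

The right side is a countable union of intersections of two sets in S, which is why the identity shows that the left side is in S.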
Example 2.44 tells us that −g is an S-measurable function. Thus f − g, which
equals f + (−g), is an S-measurable function.
The easiest way to prove that fg is an S-measurable function uses the equation
$fg = \dfrac{(f+g)^2 - f^2 - g^2}{2}$.
The operation of squaring an S-measurable function produces an S-measurable
function (see Example 2.44), as does the operation of multiplication by 1/2 (again, see
Example 2.44). Thus the equation above implies that fg is an S-measurable function,
completing the proof of (a).
Suppose g(x) ≠ 0 for all x ∈ X. The function defined on R \ {0} (a Borel subset
of R) that takes x to 1/x is continuous and thus is a Borel measurable function (by
2.40). Now 2.43 implies that 1/g is an S-measurable function. Combining this result
with what we have already proved about the product of S-measurable functions, we
conclude that f/g is an S-measurable function, proving (b).
The next result shows that the pointwise limit of a sequence of S -measurable
functions is S -measurable. This is a highly desirable property (recall that the set of
Riemann integrable functions on some interval is not closed under taking pointwise
limits; see Example 1.17).
$f(x) = \lim_{k\to\infty} f_k(x)$.
f(x) > a + 1/j. The definition of limit now implies that there exists m ∈ Z+ such
that f_k(x) > a + 1/j for all k ≥ m. Thus x is in the right side of 2.48, proving that the
left side of 2.48 is contained in the right side.
To prove the inclusion in the other direction, suppose x is in the right side of 2.48.
Thus there exist j, m ∈ Z+ such that f_k(x) > a + 1/j for all k ≥ m. Taking the
limit as k → ∞, we see that f(x) ≥ a + 1/j > a. Thus x is in the left side of 2.48,
completing the proof of 2.48. Thus f is an S-measurable function.
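The display numbered 2.48 is not reproduced above. Reading off the two inclusions in the proof ("there exist j, m such that f_k(x) > a + 1/j for all k ≥ m"), it presumably has the following form; treat this as a reconstruction rather than a quotation.

```latex
\[
f^{-1}\bigl((a,\infty)\bigr)
  = \bigcup_{j=1}^{\infty} \bigcup_{m=1}^{\infty} \bigcap_{k=m}^{\infty}
    f_k^{-1}\Bigl(\bigl(a + \tfrac{1}{j}, \infty\bigr)\Bigr)
\]
```

Each set $f_k^{-1}((a + \tfrac{1}{j}, \infty))$ is in S, and the right side is built from these sets using only countable unions and countable intersections, so it is in S as well.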
Occasionally we need to consider functions that take values in [−∞, ∞]. For
example, even if we start with a sequence of real-valued functions in 2.52, we might
end up with functions with values in [−∞, ∞]. Thus we extend the notion of Borel
sets to subsets of [−∞, ∞], as follows.
A subset of [−∞, ∞] is called a Borel set if its intersection with R is a Borel set.
In other words, a set C ⊂ [−∞, ∞] is a Borel set if and only if there exists
a Borel set B ⊂ R such that C = B or C = B ∪ {∞} or C = B ∪ {−∞} or
C = B ∪ {∞, −∞}.
You should verify that with the definition above, the set of Borel subsets of
[−∞, ∞] is a σ-algebra on [−∞, ∞].
Next, we extend the definition of S -measurable functions to functions taking
values in [−∞, ∞].
The next result, which is analogous to 2.38, states that we need not consider all
Borel subsets of [−∞, ∞] when taking inverse images to determine whether or not a
function with values in [−∞, ∞] is S -measurable.
The proof of the result above is left to the reader (also see Exercise 22 in this
section).
We end this section by showing that the pointwise infimum and pointwise supremum
of a sequence of S-measurable functions are S-measurable.
k =1
as you should verify. The equation above, along with 2.51, implies that h is an
S -measurable function.
Note that
g( x ) = − sup{− f k ( x ) : k ∈ Z+ }
for all x ∈ X. Thus the result about the supremum implies that g is an S -measurable
function.
EXERCISES 2B
1 Show that $S = \bigl\{ \bigcup_{n \in K} (n, n+1] : K \subset \mathbf{Z} \bigr\}$ is a σ-algebra on R.
12 Suppose f : R → R is a function.
(a) For k ∈ Z+, let
Gk = {a ∈ R : there exists δ > 0 such that |f(b) − f(c)| < 1/k
for all b, c ∈ (a − δ, a + δ)}.
is an S -measurable subset of X.
15 Suppose X is a set and E1 , E2 , . . . is a disjoint sequence of subsets of X such
that $\bigcup_{k=1}^{\infty} E_k = X$. Let $S = \bigl\{ \bigcup_{k \in K} E_k : K \subset \mathbf{Z}^+ \bigr\}$.
S A = { E ∈ S : A ⊂ E or A ∩ E = ∅}.
21 Prove 2.51.
f : X → [−∞, ∞]
S -measurable function.
23 Suppose B ⊂ R and f : B → R is an increasing function. Prove that f is
continuous at every element of B except for a countable subset of B.
24 Suppose f : R → R is a strictly increasing function. Prove that the inverse
function f −1 : f (R) → R is a continuous function.
25 Suppose B ⊂ R is a Borel set and f : B → R is an increasing function. Prove
that f ( B) is a Borel set.
26 Suppose B ⊂ R and f : B → R is an increasing function. Prove that there exists
a sequence f 1 , f 2 , . . . of strictly increasing functions from B to R such that
f ( x ) = lim f k ( x )
k→∞
for every x ∈ B.
27 Suppose B ⊂ R and f : B → R is a bounded increasing function. Prove that
there exists an increasing function g : R → R such that g( x ) = f ( x ) for all
x ∈ B.
28 Suppose f : B → R is a Borel measurable function. Define g : R → R by
\[
g(x) = \begin{cases} f(x) & \text{if } x \in B, \\ 0 & \text{if } x \in \mathbf{R} \setminus B. \end{cases}
\]
$\mu(E) = \sum_{x \in E} w(x)$
for E ∈ S . [Here the sum is defined as the supremum of all finite subsums
$\sum_{x \in D} w(x)$ as D ranges over all finite subsets of E.]
• Suppose X is a set and S is the σ-algebra on X consisting of all subsets of X
that are either countable or have a countable complement in X. Define a measure
µ on (X, S) by
\[
\mu(E) = \begin{cases} 0 & \text{if } E \text{ is countable}, \\ 3 & \text{if } E \text{ is uncountable}. \end{cases}
\]
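The weighted-sum measure µ(E) = ∑ w(x) from the examples above is easy to experiment with when X is finite. The following sketch assumes a hypothetical three-point set with hypothetical weights; it illustrates additivity on disjoint sets in this one case and does not cover infinite sums or infinite weights.

```python
from fractions import Fraction

def weighted_measure(w, E):
    """Return mu(E), the sum of w(x) over x in E, for a finite set E."""
    return sum((w[x] for x in E), Fraction(0))

# Hypothetical weights on the three-point set X = {"a", "b", "c"}.
w = {"a": Fraction(1, 2), "b": Fraction(2), "c": Fraction(0)}

D, E = {"a"}, {"b", "c"}          # disjoint subsets of X
mu_union = weighted_measure(w, D | E)
mu_sum = weighted_measure(w, D) + weighted_measure(w, E)
print(mu_union, mu_sum)
```

Note that the empty set gets measure 0 and that a nonempty set such as {"c"} can also have measure 0.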
Properties of Measures
The hypothesis that µ( D ) < ∞ is needed in part (b) of the next result to avoid
undefined expressions of the form ∞ − ∞.
(a) µ( D ) ≤ µ( E);
(b) µ( E \ D ) = µ( E) − µ( D ) provided that µ( D ) < ∞.
µ ( E ) = µ ( D ) + µ ( E \ D ) ≥ µ ( D ),
which proves (a). If µ( D ) < ∞, then subtracting µ( D ) from both sides of the
equation above proves (b).
where the second line above follows from the countable additivity of µ and the last
line above follows from 2.56(a).
Proof If µ( Ek ) = ∞ for some k ∈ Z+, then the equation above holds because
both sides equal ∞. Hence we can consider only the case where µ( Ek ) < ∞ for all
k ∈ Z+.
$= \lim_{k\to\infty} \sum_{j=1}^{k} \bigl( \mu(E_j) - \mu(E_{j-1}) \bigr)$
$= \lim_{k\to\infty} \mu(E_k)$,
as desired.
Measures also behave well with respect to decreasing intersections (but see Exer-
cise 10, which shows that the hypothesis µ( E1 ) < ∞ is needed below).
The next result is intuitively plausible—we expect that the measure of the union of
two sets equals the measure of the first set plus the measure of the second set minus
the measure of the set that has been counted twice.
µ ( D ∪ E ) = µ ( D ) + µ ( E ) − µ ( D ∩ E ).
Proof We have
D ∪ E = (D \ (D ∩ E)) ∪ (E \ (D ∩ E)) ∪ (D ∩ E).
as desired.
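The formula µ(D ∪ E) = µ(D) + µ(E) − µ(D ∩ E) can be sanity-checked numerically. The sketch below takes µ to be total length on finite unions of closed intervals, which is an assumption made only for illustration (the text does not construct Lebesgue measure until Section 2D); the helper `total_length` and the sets D and E are hypothetical.

```python
from fractions import Fraction

def total_length(intervals):
    """Length of a finite union of closed intervals (a, b), merging overlaps."""
    total, right_end = Fraction(0), None
    for a, b in sorted(intervals):
        if right_end is None or a > right_end:
            total += b - a             # a new piece, disjoint from earlier ones
            right_end = b
        elif b > right_end:
            total += b - right_end     # count only the part not seen before
            right_end = b
    return total

D = [(Fraction(0), Fraction(2))]         # D = [0, 2]
E = [(Fraction(1), Fraction(3))]         # E = [1, 3]
D_cap_E = [(Fraction(1), Fraction(2))]   # D ∩ E = [1, 2], computed by hand

lhs = total_length(D + E)                # length of D ∪ E
rhs = total_length(D) + total_length(E) - total_length(D_cap_E)
print(lhs, rhs)
```

Here the overlap [1, 2] is counted twice in µ(D) + µ(E), and subtracting µ(D ∩ E) removes the double count.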
EXERCISES 2C
1 Explain why there does not exist a measure space ( X, S , µ) with the property
that {µ( E) : E ∈ S} = [0, 1).
Let $2^{\mathbf{Z}^+}$ denote the σ-algebra on Z+ consisting of all subsets of Z+.
2 Suppose µ is a measure on $(\mathbf{Z}^+, 2^{\mathbf{Z}^+})$. Prove that there is a sequence w1 , w2 , . . .
in [0, ∞] such that
$\mu(E) = \sum_{k \in E} w_k$
2D Lebesgue Measure
Additivity of Outer Measure on Borel Sets
Recall that there exist disjoint sets A, B ⊂ R such that |A ∪ B| ≠ |A| + |B| (see
2.17). Thus outer measure, despite its name, is not a measure on the σ-algebra of all
subsets of R.
Our main goal in this section is to prove that outer measure, when restricted to the
Borel subsets of R, is a measure. Throughout this section, be careful about trying to
simplify proofs by applying properties of measures to outer measure, even if those
properties seem intuitively plausible. For example, there are subsets A ⊂ B ⊂ R
with |A| < ∞ but |B \ A| ≠ |B| − |A| [compare to 2.56(b)].
The next result is our first step toward the goal of proving that outer measure
restricted to the Borel sets is a measure.
| A ∪ G | = | A | + | G |.
Then
ℓ(In) = ℓ(Jn) + ℓ(Kn) + ℓ(Ln).
Now J1 , L1 , J2 , L2 , . . . is a sequence of open intervals whose union contains A and
K1 , K2 , . . . is a sequence of open intervals whose union contains G. Thus
\[
\sum_{n=1}^{\infty} \ell(I_n) = \sum_{n=1}^{\infty} \bigl( \ell(J_n) + \ell(L_n) \bigr) + \sum_{n=1}^{\infty} \ell(K_n) \ge |A| + |G|.
\]
The inequality above implies that | A ∪ G | ≥ | A| + | G |, completing the proof that
| A ∪ G | = | A| + | G | in this special case.
Using induction on m, we can now conclude that if m ∈ Z+ and G is a union of
m disjoint open intervals that are all disjoint from A, then | A ∪ G | = | A| + | G |.
Now suppose G is an arbitrary open subset of R that is disjoint from A. Then
$G = \bigcup_{n=1}^{\infty} I_n$ for some sequence of disjoint open intervals I1 , I2 , . . . (see 0.59), each
of which is disjoint from A. Now for each m ∈ Z+ we have
\[
|A \cup G| \ge \Bigl| A \cup \bigcup_{n=1}^{m} I_n \Bigr| = |A| + \sum_{n=1}^{m} \ell(I_n).
\]
Thus
\[
|A \cup G| \ge |A| + \sum_{n=1}^{\infty} \ell(I_n) \ge |A| + |G|,
\]
completing the proof that | A ∪ G | = | A| + | G |.
The next result shows that the outer measure of the disjoint union of two sets is
what we expect if at least one of the two sets is closed.
| A ∪ F | = | A | + | F |.
implies that
2.63 | A | ≤ | G \ F |.
Because G \ F = G ∩ (R \ F ), we know that G \ F is an open set. Hence we can
apply 2.61 to the disjoint union G = F ∪ ( G \ F ), getting
| G | = | F | + | G \ F |.
Adding | F | to both sides of 2.63 and then using the equation above gives
\[
|A| + |F| \le |G| \le \sum_{k=1}^{\infty} \ell(I_k).
\]
Recall that the collection of Borel sets is the smallest σ-algebra on R that con-
tains all open subsets of R. The next result provides an extremely useful tool for
approximating a Borel set by a closed set.
Suppose B ⊂ R is a Borel set. Then for every ε > 0, there exists a closed set
F ⊂ B such that | B \ F | < ε.
Proof Let
The strategy of the proof is to show that L is a σ-algebra. Then because L contains
every closed subset of R (if D ⊂ R is closed, take F = D in the definition of L), by
taking complements we can conclude that L contains every open subset of R and
thus every Borel subset of R.
To get started with proving that L is a σ-algebra, we want to prove that L is closed
under countable intersections. Thus suppose D1 , D2 , . . . is a sequence in L. Let
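ε > 0. For each k ∈ Z+, there exists a closed set Fk such that
\[
F_k \subset D_k \quad\text{and}\quad |D_k \setminus F_k| < \frac{\varepsilon}{2^k}.
\]
Thus $\bigcap_{k=1}^{\infty} F_k$ is a closed set and
\[
\bigcap_{k=1}^{\infty} F_k \subset \bigcap_{k=1}^{\infty} D_k
\quad\text{and}\quad
\Bigl(\bigcap_{k=1}^{\infty} D_k\Bigr) \setminus \Bigl(\bigcap_{k=1}^{\infty} F_k\Bigr) \subset \bigcup_{k=1}^{\infty} (D_k \setminus F_k).
\]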
The last set inclusion and the countable subadditivity of outer measure (see 2.8) imply
that
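\[
\Bigl| \Bigl(\bigcap_{k=1}^{\infty} D_k\Bigr) \setminus \Bigl(\bigcap_{k=1}^{\infty} F_k\Bigr) \Bigr| < \varepsilon.
\]
Thus $\bigcap_{k=1}^{\infty} D_k \in L$, proving that L is closed under countable intersections.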
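For readers who want the estimate spelled out, the bound by ε comes from combining the set inclusion with countable subadditivity and then summing a geometric series. The following chain is a sketch of that step, using only the choice of the closed sets Fk made above.

```latex
\[
\Bigl| \Bigl(\bigcap_{k=1}^{\infty} D_k\Bigr) \setminus \Bigl(\bigcap_{k=1}^{\infty} F_k\Bigr) \Bigr|
  \le \sum_{k=1}^{\infty} |D_k \setminus F_k|
  < \sum_{k=1}^{\infty} \frac{\varepsilon}{2^k}
  = \varepsilon
\]
```

Choosing the tolerance ε/2^k for the k-th set, rather than a fixed tolerance, is what keeps the total error finite and below ε.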
(R \ D ) \ (R \ G ) ⊂ G \ D
⊂ G \ F.
Thus
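\begin{align*}
|(\mathbf{R} \setminus D) \setminus (\mathbf{R} \setminus G)| &\le |G \setminus F| \\
&= |G| - |F| \\
&= (|G| - |D|) + (|D| - |F|) \\
&< \frac{\varepsilon}{2} + |D \setminus F| \\
&< \varepsilon,
\end{align*}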
where the equality in the second line above comes from applying 2.62 to the disjoint
union G = ( G \ F ) ∪ F, and the fourth line above uses subadditivity applied to
the union D = ( D \ F ) ∪ F. The last inequality above shows that R \ D ∈ L, as
desired.
Now, still assuming that D ∈ L and ε > 0, we consider the case where | D | = ∞.
For k ∈ Z+, let Dk = D ∩ [−k, k]. Because Dk ∈ L and | Dk | < ∞, the previous
case implies that R \ Dk ∈ L. Clearly $D = \bigcup_{k=1}^{\infty} D_k$. Thus
\[
\mathbf{R} \setminus D = \bigcap_{k=1}^{\infty} (\mathbf{R} \setminus D_k).
\]
Because L is closed under countable intersections, the equation above implies that
R \ D ∈ L, which completes the proof that L is a σ-algebra.
Now we can prove that the outer measure of the disjoint union of two sets is what
we expect if at least one of the two sets is a Borel set.
| A ∪ B | = | A | + | B |.
Proof Let ε > 0. Let F be a closed set such that F ⊂ B and | B \ F | < ε (see 2.64).
Thus
| A ∪ B| ≥ | A ∪ F |
= | A| + | F |
= | A| + | B| − | B \ F |
≥ | A| + | B| − ε,
where the second and third lines above follow from 2.62 [use B = ( B \ F ) ∪ F for
the third line].
Because the inequality above holds for all ε > 0, we have | A ∪ B| ≥ | A| + | B|,
which implies that | A ∪ B| = | A| + | B|.
You have probably long suspected that not every subset of R is a Borel set. Now
we can prove this suspicion.
There exists a set B ⊂ R such that | B| < ∞ and B is not a Borel set.
Proof In the proof of 2.17, we showed that there exist disjoint sets A, B ⊂ R such
that |A ∪ B| ≠ |A| + |B|. For any such sets, we must have |B| < ∞ because
otherwise both | A ∪ B| and | A| + | B| equal ∞ (as follows from the inequality
| B| ≤ | A ∪ B|). Now 2.65 implies that B is not a Borel set.
The tools we have constructed now allow us to prove that outer measure, when
restricted to the Borel sets, is a measure.
Outer measure is a measure on (R, B), where B is the σ-algebra of Borel subsets
of R.
The result above implies that the next definition makes sense.
Lebesgue measure is the measure on (R, B), where B is the σ-algebra of Borel
subsets of R, that assigns to each Borel set its outer measure.
In other words, the Lebesgue measure of a set is the same as its outer measure,
except that the term Lebesgue measure should not be applied to arbitrary sets but
only to Borel sets (and also to what are called Lebesgue measurable sets, as we will
soon see). Unlike outer measure, Lebesgue measure is actually a measure, as shown
in 2.67. Lebesgue measure is named in honor of its inventor, Henri Lebesgue.
(e) For each ε > 0, there exists an open set G ⊃ A such that | G \ A| < ε.
(f) There exist open sets G1 , G2 , . . . containing A such that $\bigl| \bigl(\bigcap_{k=1}^{\infty} G_k\bigr) \setminus A \bigr| = 0$.
Proof Let L denote the collection of sets A ⊂ R that satisfy (b). We have already
proved that every Borel set is in L—see 2.64. As a key part of that proof, which we
will freely use in this proof, we showed that L is a σ-algebra on R (see the proof
of 2.64). In addition to containing the Borel sets, L contains every set with outer
measure 0 [because if | A| = 0, we can take F = ∅ in (b)].
(b) =⇒ (c): Suppose (b) holds. Thus for each n ∈ Z+, there exists a closed set
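Fn ⊂ A such that |A \ Fn| < 1/n. Now
\[
A \setminus \bigcup_{k=1}^{\infty} F_k \subset A \setminus F_n
\]
for each n ∈ Z+. Thus $\bigl|A \setminus \bigcup_{k=1}^{\infty} F_k\bigr| \le |A \setminus F_n| \le \frac{1}{n}$ for each n ∈ Z+. Hence
$\bigl|A \setminus \bigcup_{k=1}^{\infty} F_k\bigr| = 0$, completing the proof that (b) implies (c).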
(c) =⇒ (d): Because every countable union of closed sets is a Borel set, we see
that (c) implies (d).
(d) =⇒ (b): Suppose (d) holds. Thus there exists a Borel set B ⊂ A such that
| A \ B| = 0. Now
A = B ∪ ( A \ B ).
We know that B ∈ L (because B is a Borel set) and A \ B ∈ L (because A \ B has
outer measure 0). Because L is a σ-algebra, the displayed equation above implies
that A ∈ L. In other words, (b) holds, completing the proof that (d) implies (b).
At this stage of the proof, we now know that (b) ⇐⇒ (c) ⇐⇒ (d).
(b) =⇒ (e): Suppose (b) holds. Thus A ∈ L. Let ε > 0. Then because
R \ A ∈ L (which holds because L is closed under complementation), there exists a
closed set F ⊂ R \ A such that
|(R \ A) \ F | < ε.
Now R \ F is an open set with R \ F ⊃ A. Because (R \ F ) \ A = (R \ A) \ F,
the inequality above implies that |(R \ F ) \ A| < ε. Thus (e) holds, completing the
proof that (b) implies (e).
(e) =⇒ (f): Suppose (e) holds. Thus for each n ∈ Z+, there exists an open set
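Gn ⊃ A such that |Gn \ A| < 1/n. Now
\[
\Bigl(\bigcap_{k=1}^{\infty} G_k\Bigr) \setminus A \subset G_n \setminus A
\]
for each n ∈ Z+. Thus $\bigl|\bigl(\bigcap_{k=1}^{\infty} G_k\bigr) \setminus A\bigr| \le |G_n \setminus A| \le \frac{1}{n}$ for each n ∈ Z+. Hence
$\bigl|\bigl(\bigcap_{k=1}^{\infty} G_k\bigr) \setminus A\bigr| = 0$, completing the proof that (e) implies (f).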
(f) =⇒ (g): Because every countable intersection of open sets is a Borel set, we
see that (f) implies (g).
(g) =⇒ (b): Suppose (g) holds. Thus there exists a Borel set B ⊃ A such that
| B \ A| = 0. Now
A = B ∩ (R \ (B \ A)).
We know that B ∈ L (because B is a Borel set) and R \ ( B \ A) ∈ L (because this
set is the complement of a set with outer measure 0). Because L is a σ-algebra, the
displayed equation above implies that A ∈ L. In other words, (b) holds, completing
the proof that (g) implies (b).
Our chain of implications now shows that (b) through (g) are all equivalent.
Proof Because (a) and (b) are equivalent in 2.70, the set L of Lebesgue measurable
subsets of R is the collection of sets satisfying (b) in 2.70. As noted in the first
paragraph of the proof of 2.70, this set is a σ-algebra on R, proving (a).
To prove (b), suppose A1 , A2 , . . . is a disjoint sequence of
Lebesgue measurable sets. By the definition of Lebesgue measurable set (2.69), for
each k ∈ Z+ there exists a Borel set Bk ⊂ Ak such that | Ak \ Bk | = 0. Now
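\begin{align*}
\Bigl| \bigcup_{k=1}^{\infty} A_k \Bigr| &\ge \Bigl| \bigcup_{k=1}^{\infty} B_k \Bigr| \\
&= \sum_{k=1}^{\infty} |B_k| \\
&= \sum_{k=1}^{\infty} |A_k|,
\end{align*}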
where the second line above holds because B1 , B2 , . . . is a disjoint sequence of Borel
sets and outer measure is a measure on the Borel sets (see 2.67); the last line above
holds because Bk ⊂ Ak and by subadditivity of outer measure (see 2.8) we have
| Ak | = | Bk ∪ ( Ak \ Bk )| ≤ | Bk | + | Ak \ Bk | = | Bk |.
The inequality above, combined with countable subadditivity of outer measure
(see 2.8), implies that $\bigl| \bigcup_{k=1}^{\infty} A_k \bigr| = \sum_{k=1}^{\infty} |A_k|$, completing the proof of (b).
Lebesgue measure is the measure on (R, L), where L is the σ-algebra of Lebesgue
measurable subsets of R, that assigns to each Lebesgue measurable set its outer
measure.
The two definitions of Lebesgue measure disagree only on the domain of the
measure—is the σ-algebra the Borel sets or the Lebesgue measurable sets? You may
be able to tell which is intended from the context. In this book, the domain will be
specified unless it is irrelevant.
If you are reading a mathematics paper and the domain for Lebesgue measure
is not specified, then it probably does not matter whether you use the Borel sets
or the Lebesgue measurable sets (because every Lebesgue measurable set differs
from a Borel set by a set with outer measure 0, and when dealing with measures,
what happens on a set with measure 0 usually does not matter). Because all sets that
arise from the usual operations of analysis are Borel sets, you may want to assume
that Lebesgue measure means outer measure on the Borel sets, unless what you are
reading explicitly states otherwise.
A mathematics paper may also refer to a measurable subset of R, without further
explanation. Unless some other σ-algebra is clear from the context, the author
probably means the Borel sets or the Lebesgue measurable sets. Again, the choice
probably will not matter, but using the Borel sets can be cleaner and simpler.
[Margin note: The emphasis in some textbooks on Lebesgue measurable sets instead of
Borel sets probably stems from the historical development of the subject, rather than
from any serious use of Lebesgue measurable sets that are not Borel sets.]
Lebesgue measure on the Lebesgue measurable sets does have one small advantage
over Lebesgue measure on the Borel sets: Every subset of a set with (outer) measure
0 is Lebesgue measurable but is not necessarily a Borel set. However, any natural
process that produces a subset of R will produce a Borel set. Thus this small
advantage does not come up in practice.
Cantor Set
Every countable set has outer measure 0 (see 2.4). A reasonable question arises
about whether the converse holds. In other words, is every set with outer measure
0 countable? The Cantor set, which is introduced in this subsection, provides the
answer to this question.
The Cantor set also gives counterexamples to other reasonable conjectures. For
example, Exercise 17 in this section shows that the sum of two sets with Lebesgue
measure 0 can have positive Lebesgue measure.
One way to envision the Cantor set is to start with the interval [0, 1] and then
consider the process that removes at each step the middle-third open intervals of all
intervals left from the previous step. At the first step, we remove G1 = (1/3, 2/3).
G1 is shown in red.
After that first step, we have [0, 1] \ G1 = [0, 1/3] ∪ [2/3, 1]. Thus we take the
middle-third open intervals of [0, 1/3] and [2/3, 1]. In other words, we have
G2 = (1/9, 2/9) ∪ (7/9, 8/9).
G1 ∪ G2 is shown in red.
G1 ∪ G2 ∪ G3 is shown in red.
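The removal process can be carried out with exact rational arithmetic. The following sketch computes the open intervals that make up G_n; the function name `removed_intervals` is hypothetical, and intervals are represented simply as pairs of endpoints.

```python
from fractions import Fraction

def removed_intervals(n):
    """Return the open intervals making up G_n as (left, right) pairs, n >= 1."""
    remaining = [(Fraction(0), Fraction(1))]   # closed intervals left so far
    removed = []
    for _ in range(n):
        removed, next_remaining = [], []
        for a, b in remaining:
            third = (b - a) / 3
            removed.append((a + third, b - third))      # open middle third
            next_remaining.append((a, a + third))       # left closed third
            next_remaining.append((b - third, b))       # right closed third
        remaining = next_remaining
    return removed

for n in (1, 2, 3):
    print(n, removed_intervals(n))
```

For n = 1 and n = 2 this reproduces the sets G1 and G2 described above, and for n = 3 it lists the four intervals of G3.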
Base 3 representations provide a useful way to think about the Cantor set. Just
as 1/10 = 0.1 = 0.09999 . . . in the decimal representation, base 3 representations
are not unique for fractions whose denominator is a power of 3. For example,
1/3 = 0.1₃ = 0.02222 . . .₃ , where the subscript 3 denotes a base 3 representation.
Notice that G1 is the set of numbers in [0, 1] whose base 3 representations have
1 in the first digit after the decimal point (for those numbers that have two base 3
representations, this means that both such representations must have 1 in the first digit).
Also, G1 ∪ G2 is the set of numbers in [0, 1] whose base 3 representations have 1 in
the first digit or the second digit after the decimal point. And so on. Hence $\bigcup_{n=1}^{\infty} G_n$
is the set of numbers in [0, 1] whose base 3 representations have a 1 somewhere.
Thus we have the following description of the Cantor set. In the following
result, the phrase a base 3 representation indicates that if a number has two base 3
representations, then it is in the Cantor set if and only if at least one of them contains
no 1s. For example, both 1/3 (which equals 0.02222 . . .₃ ) and 2/3 (which equals 0.2₃ )
are in the Cantor set.
The Cantor set is the set of numbers in [0, 1] that have a base 3 representation
containing only 0s and 2s.
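For rational numbers, the base 3 description gives a membership test that terminates. The sketch below is one possible approach, not a method taken from the text: it reads off base 3 digits by repeatedly rescaling the left or right third back to [0, 1], and it stops when a value repeats, since the digits then cycle. The function name `in_cantor_set` is hypothetical.

```python
from fractions import Fraction

def in_cantor_set(x):
    """Decide whether a rational number x lies in the Cantor set."""
    x = Fraction(x)
    if not 0 <= x <= 1:
        return False
    seen = set()
    while x not in seen:
        seen.add(x)
        if x <= Fraction(1, 3):
            x = 3 * x          # a base 3 digit 0 is available here
        elif x >= Fraction(2, 3):
            x = 3 * x - 2      # a base 3 digit 2 is available here
        else:
            return False       # strictly inside a middle third: digit 1 is forced
    return True                # the digits cycle and never need a 1

print(in_cantor_set(Fraction(1, 3)), in_cantor_set(Fraction(1, 4)), in_cantor_set(Fraction(1, 2)))
```

The loop terminates because each rescaled value lies in [0, 1] and has a denominator dividing that of the original x, so only finitely many values can occur. For example, 1/4 = 0.020202 . . .₃ is in the Cantor set, while 1/2 = 0.1111 . . .₃ is not.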
Proof Each set Gn used in the definition of the Cantor set is a union of open intervals.
Thus each Gn is open. Thus $\bigcup_{n=1}^{\infty} G_n$ is open, and hence its complement is closed.
The Cantor set equals $[0, 1] \cap \bigl(\mathbf{R} \setminus \bigcup_{n=1}^{\infty} G_n\bigr)$, which is the intersection of two closed
sets. Thus the Cantor set is closed, completing the proof of (a).
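By induction on n, each Gn is the union of $2^{n-1}$ disjoint open intervals, each of
which has length $\frac{1}{3^n}$. Thus $|G_n| = \frac{2^{n-1}}{3^n}$. The sets G1 , G2 , . . . are disjoint. Hence
\begin{align*}
\Bigl| \bigcup_{n=1}^{\infty} G_n \Bigr| &= \frac{1}{3} + \frac{2}{9} + \frac{4}{27} + \cdots \\
&= \frac{1}{3} \Bigl( 1 + \frac{2}{3} + \frac{4}{9} + \cdots \Bigr) \\
&= \frac{1}{3} \cdot \frac{1}{1 - \frac{2}{3}} \\
&= 1.
\end{align*}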
2.56(b)]. In other words, the Cantor set has Lebesgue measure 0, completing the
proof of (b).
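The geometric series in the proof of (b) can be checked with exact rational arithmetic. This is a small sketch under the assumption $|G_n| = 2^{n-1}/3^n$ stated in the proof; the helper name `removed_length` is hypothetical.

```python
from fractions import Fraction

def removed_length(n_terms: int) -> Fraction:
    """Total length of G_1, ..., G_{n_terms}, using |G_n| = 2^(n-1) / 3^n."""
    return sum((Fraction(2 ** (n - 1), 3 ** n) for n in range(1, n_terms + 1)), Fraction(0))

for n_terms in (1, 2, 5, 20):
    # The partial sums equal 1 - (2/3)^n_terms, so they increase to 1.
    print(n_terms, removed_length(n_terms), 1 - Fraction(2, 3) ** n_terms)
```

The partial sums approach 1, so the length left over for the Cantor set after n steps, $(2/3)^n$, tends to 0.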
Each number in the Cantor set has a unique base 3 representation containing
only 0s and 2s (by 2.74; for those numbers that have two base 3 representations,
one of them must contain a 1). In that base 3 representation containing only 0s and
2s, replace each 2 by 1 and consider the resulting string of digits as representing
a base 2 number. This gives a mapping of the Cantor set onto [0, 1]. Because
[0, 1] is uncountable (see 2.16), this implies that the Cantor set is uncountable, thus
proving (c).
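The digit replacement in the proof of (c) can be sketched for finitely many leading digits. The function name `cantor_digits_to_unit_interval` and the list-of-digits input format are assumptions made for illustration only.

```python
from fractions import Fraction

def cantor_digits_to_unit_interval(digits):
    """Map leading base 3 digits (each 0 or 2) to the base 2 number with each 2 replaced by 1."""
    value = Fraction(0)
    for position, digit in enumerate(digits, start=1):
        if digit not in (0, 2):
            raise ValueError("only the digits 0 and 2 are allowed")
        value += Fraction(digit // 2, 2 ** position)
    return value

# 0.2_3 (which is 2/3) maps to 0.1_2 (which is 1/2).
print(cantor_digits_to_unit_interval([2]))
# Truncations of 0.0222..._3 (which is 1/3) map to truncations of 0.0111..._2, which approach 1/2.
print(cantor_digits_to_unit_interval([0] + [2] * 30))
```

The two calls illustrate that the mapping is onto [0, 1] but not one-to-one: the images of 1/3 and 2/3 both equal 1/2 in the limit.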
A set with Lebesgue measure 0 cannot contain an interval that has more than one
element. Thus (b) implies (d).
The proof of (e) is left as an exercise for the reader.
EXERCISES 2D
1 (a) Show that the set consisting of those numbers in (0, 1) that have a decimal
expansion containing one hundred consecutive 4s is a Borel subset of R.
(b) What is the Lebesgue measure of the set in part (a)?
2 Prove that there exists a bounded set A ⊂ R such that | F | ≤ | A| − 1 for every
closed set F ⊂ A.
3 Prove that there exists a set A ⊂ R such that | G \ A| = ∞ for every open set G
that contains A.
4 The phrase nontrivial interval is used to denote an interval of R that contains
more than one element. Recall that an interval might be open, closed, or neither.
(a) Prove that the union of each collection of nontrivial intervals of R is the
union of a countable subset of that collection.
(b) Prove that the union of each collection of nontrivial intervals of R is a Borel
set.
(c) Prove that there exists a collection of closed intervals of R whose union is
not a Borel set.
\[ \lim_{k \to \infty} f_k(x) = f(x) \]
for each x ∈ X.
In other words, f 1 , f 2 , . . . converges pointwise on X to f if for each x ∈ X
and every ε > 0, there exists n ∈ Z+ such that | f k ( x ) − f ( x )| < ε for all
integers k ≥ n.
• The sequence f 1 , f 2 , . . . converges uniformly on X to f if for every ε > 0,
there exists n ∈ Z+ such that | f k ( x ) − f ( x )| < ε for all integers k ≥ n and
all x ∈ X.
Like the difference between continuity and uniform continuity, the difference
between pointwise convergence and uniform convergence lies in the order of the
quantifiers. Take a moment to examine the definitions carefully. If a sequence of
functions converges uniformly on some set, then it also converges pointwise on the
same set; however, the converse is not true, as shown by Example 2.77.
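The role of the quantifier order can be seen numerically. The sketch below uses the hypothetical sequence $f_k(x) = x^k$ on [0, 1), whose pointwise limit is 0; this sequence is chosen only for illustration and is not claimed to be the one in Example 2.77. The grid maximum is only a lower bound for the true supremum.

```python
def f(k: int, x: float) -> float:
    """Hypothetical sequence f_k(x) = x^k on [0, 1); its pointwise limit is 0."""
    return x ** k

def sup_error_on_grid(k: int, points) -> float:
    """Largest value of |f_k(x) - 0| over the sample points (a lower bound for the supremum)."""
    return max(abs(f(k, x)) for x in points)

grid = [j / 10000 for j in range(10000)]  # sample points in [0, 1)

for k in (10, 100, 1000):
    # At the fixed point x = 0.9 the error shrinks, but the error over the whole grid does not.
    print(k, f(k, 0.9), sup_error_on_grid(k, grid))
```

For each fixed x the values tend to 0, yet no single n works for all x at once, which is exactly the failure of uniform convergence.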
Example 2.77 also shows that the pointwise limit of continuous functions need not
be continuous. However, the next result tells us that the uniform limit of continuous
functions is continuous.
Thus f is continuous at b.
Egorov’s Theorem
A sequence of functions that converges pointwise need not converge uniformly.
However, the next result says that a pointwise convergent sequence of functions on
a measure space almost converges uniformly, in the sense that it converges uniformly
except on a set that can have arbitrarily small measure.
Dmitri Egorov (1869–1931) proved the theorem below in 1911. You may
encounter some books that spell his last name as Egoroff.
As an example of the next result, consider Lebesgue measure λ on the interval
[−1, 1] and the sequence of functions $f_1, f_2, \ldots$ in Example 2.77 that converges
pointwise but not uniformly on [−1, 1]. Suppose ε > 0. Then taking
$E = [-1, -\frac{\varepsilon}{4}] \cup [\frac{\varepsilon}{4}, 1]$, we have λ([−1, 1] \ E) < ε and $f_1, f_2, \ldots$ converges
uniformly on E, as in the conclusion of the next result.
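The following sketch illustrates this kind of set E numerically. It assumes the tent functions $f_k(x) = \max(0, 1 - k|x|)$ on [−1, 1], which converge pointwise to the function that is 1 at 0 and 0 elsewhere; these functions are an assumption made for illustration and may differ from the sequence in Example 2.77. The names `tent` and `sup_error_on_E` are hypothetical.

```python
def tent(k: int, x: float) -> float:
    """Assumed sequence: f_k(x) = max(0, 1 - k|x|) on [-1, 1]."""
    return max(0.0, 1.0 - k * abs(x))

def sup_error_on_E(k: int, eps: float) -> float:
    """Supremum of |f_k - f| over E = [-1, -eps/4] U [eps/4, 1].

    On E the pointwise limit f is 0 and tent(k, x) decreases as |x| grows,
    so the supremum is attained at |x| = eps/4.
    """
    return tent(k, eps / 4)

eps = 0.1
for k in (10, 20, 50, 100):
    # The error on E is exactly 0 once k is comfortably larger than 4/eps,
    # while near 0 (for example at x = 1/(2k)) the error stays about 1/2.
    print(k, sup_error_on_E(k, eps), tent(k, 1 / (2 * k)))
```

The complement of E in [−1, 1] is the interval (−ε/4, ε/4), which has Lebesgue measure ε/2 < ε.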
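Then clearly $A_{1,n} \subset A_{2,n} \subset \cdots$ is an increasing sequence of sets and 2.80 can be
rewritten as
\[ \bigcup_{m=1}^{\infty} A_{m,n} = X. \]
The equation above implies (by 2.58) that $\lim_{m \to \infty} \mu(A_{m,n}) = \mu(X)$. Thus there
exists $m_n \in \mathbf{Z}^+$ such that
\[ \mu(X) - \mu(A_{m_n,n}) < \frac{\varepsilon}{2^n}. \]
Now let
\[ E = \bigcap_{n=1}^{\infty} A_{m_n,n}. \]
Then
\[
\begin{aligned}
\mu(X \setminus E) &= \mu\Bigl(X \setminus \bigcap_{n=1}^{\infty} A_{m_n,n}\Bigr) \\
&= \mu\Bigl(\bigcup_{n=1}^{\infty} (X \setminus A_{m_n,n})\Bigr) \\
&\le \sum_{n=1}^{\infty} \mu(X \setminus A_{m_n,n}) \\
&< \varepsilon.
\end{aligned}
\]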
Proof The idea of the proof is that for each k ∈ Z+ and n ∈ Z, the interval
[n, n + 1) is divided into $2^k$ equally sized half-open subintervals. If f(x) ∈ [0, k),
we define $f_k(x)$ to be the left endpoint of the subinterval into which f(x) falls; if
f(x) ∈ (−k, 0), we define $f_k(x)$ to be the right endpoint of the subinterval into
which f(x) falls; and if |f(x)| ≥ k, we define $f_k(x)$ to be ±k. Specifically, let
\[
f_k(x) =
\begin{cases}
\frac{m}{2^k} & \text{if } 0 \le f(x) < k \text{ and } m \in \mathbf{Z} \text{ is such that } f(x) \in \bigl[\frac{m}{2^k}, \frac{m+1}{2^k}\bigr), \\
\frac{m+1}{2^k} & \text{if } -k < f(x) < 0 \text{ and } m \in \mathbf{Z} \text{ is such that } f(x) \in \bigl[\frac{m}{2^k}, \frac{m+1}{2^k}\bigr), \\
k & \text{if } f(x) \ge k, \\
-k & \text{if } f(x) \le -k.
\end{cases}
\]
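The case definition above translates directly into code acting on a single value f(x). This is a minimal sketch; the function name `dyadic_approx` is hypothetical, and floating-point inputs are assumed only for convenience.

```python
import math

def dyadic_approx(value: float, k: int) -> float:
    """Return f_k(x) given value = f(x), following the four cases in the proof."""
    if value >= k:
        return float(k)
    if value <= -k:
        return float(-k)
    # m is the integer with value in [m / 2^k, (m + 1) / 2^k).
    m = math.floor(value * 2 ** k)
    if value >= 0:
        return m / 2 ** k        # left endpoint for nonnegative values
    return (m + 1) / 2 ** k      # right endpoint for negative values

for k in (1, 2, 5, 10):
    print(k, dyadic_approx(math.pi, k), dyadic_approx(-0.3, k))
```

Rounding toward 0 in both cases is what makes $|f_k(x)|$ increase with k while staying at most |f(x)|.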
Luzin’s Theorem
Our next result is surprising. It says that an arbitrary Borel measurable function is
almost continuous, in the sense that its restriction to a large closed set is continuous.
Here, the phrase large closed set means that we can take the complement of the closed
set to have arbitrarily small measure.
Nikolai Luzin (1883–1950) proved the theorem below in 1912. Most mathematics
literature in English refers to the result below as Lusin’s Theorem. However, Luzin is
the correct transliteration from Russian into English; Lusin is the transliteration into
German.
Be careful about the interpretation of
the conclusion of Luzin’s Theorem that f | B is a continuous function on B. This is
not the same as saying that f (on its original domain) is continuous at each point
of B. For example, χQ is discontinuous at every point of R. However, χQ|R\Q is a
continuous function on R \ Q (because this function is identically 0 on its domain).
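Suppose ε > 0. By the special case already proved, for each k ∈ Z+, there exists
a closed set $C_k \subset \mathbf{R}$ such that $|\mathbf{R} \setminus C_k| < \frac{\varepsilon}{2^{k+1}}$ and $g_k|_{C_k}$ is continuous. Let
\[ C = \bigcap_{k=1}^{\infty} C_k. \]
Thus C is a closed set and $g_k|_C$ is continuous for every k ∈ Z+. Note that
\[ \mathbf{R} \setminus C = \bigcup_{k=1}^{\infty} (\mathbf{R} \setminus C_k); \]
thus $|\mathbf{R} \setminus C| < \frac{\varepsilon}{2}$.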
For each m ∈ Z, the sequence g1 |(m,m+1) , g2 |(m,m+1) , . . . converges pointwise
on (m, m + 1) to g|(m,m+1) . Thus by Egorov’s Theorem (2.79), for each m ∈ Z,
there is a Borel set Em ⊂ (m, m + 1) such that g1 , g2 , . . . converges uniformly to g
on $E_m$ and
\[ |(m, m + 1) \setminus E_m| < \frac{\varepsilon}{2^{|m|+3}}. \]
Thus g1 , g2 , . . . converges uniformly to g on C ∩ Em for each m ∈ Z. Because each
gk |C is continuous, we conclude (using 2.78) that g|C∩Em is continuous for each
m ∈ Z. Thus g| D is continuous, where
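\[ D = \bigcup_{m \in \mathbf{Z}} (C \cap E_m). \]
Because
\[ \mathbf{R} \setminus D \subset \mathbf{Z} \cup \Bigl(\bigcup_{m \in \mathbf{Z}} \bigl((m, m + 1) \setminus E_m\bigr)\Bigr) \cup (\mathbf{R} \setminus C), \]
we have $|\mathbf{R} \setminus D| < \varepsilon$.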
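The last inequality comes from adding the measures of the three pieces. The computation below is a sketch of that step, using the bounds on the complement of C and on the sets (m, m + 1) \ E_m from earlier in this proof, together with the fact that the countable set Z has Lebesgue measure 0.

```latex
\[
\begin{aligned}
|\mathbf{R} \setminus D|
  &\le |\mathbf{Z}| + \sum_{m \in \mathbf{Z}} |(m, m+1) \setminus E_m| + |\mathbf{R} \setminus C| \\
  &< 0 + \frac{\varepsilon}{8}\Bigl(1 + 2\sum_{j=1}^{\infty} \frac{1}{2^{j}}\Bigr) + \frac{\varepsilon}{2} \\
  &= \frac{3\varepsilon}{8} + \frac{\varepsilon}{2}
   = \frac{7\varepsilon}{8} < \varepsilon.
\end{aligned}
\]
```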
There exists a closed set F ⊂ D such that | D \ F | < ε − |R \ D | (by 2.64). Now
|R \ F | = |(R \ D ) ∪ ( D \ F )| ≤ |R \ D | + | D \ F | < ε.
Because the restriction of a continuous function to a smaller domain is also continuous,
g| F is continuous, completing the proof.
We will need the following result to get another version of Luzin’s Theorem.
For each interval Ik of the form (b, c) with b < c and b, c ∈ R, define h on [b, c]
to be the linear function such that h(b) = g(b) and h(c) = g(c).
Define h( x ) = g( x ) for all x ∈ R for which h( x ) has not been defined by the
previous two paragraphs. Then h : R → R is continuous and h| F = g.
The next result gives a slightly modified way to state Luzin’s Theorem. You can
think of this version as saying that the value of a Borel measurable function can be
changed on a set with small Lebesgue measure to produce a continuous function.
By the first version of Luzin’s Theorem (2.84), there is a closed set C ⊂ R such
that |R \ C | < ε and g̃|C is a continuous function on C. There exists a closed set
F ⊂ C ∩ E such that |(C ∩ E) \ F | < ε − |R \ C | (by 2.64). Thus
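\[ |E \setminus F| \le \bigl|\bigl((C \cap E) \setminus F\bigr) \cup (\mathbf{R} \setminus C)\bigr| \le |(C \cap E) \setminus F| + |\mathbf{R} \setminus C| < \varepsilon. \]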
\[ |\{x \in \mathbf{R} : g(x) \ne f(x)\}| = 0. \]
For each j ∈ {1, . . . , n}, there exists a Borel set Bj ⊂ A j such that | A j \ Bj | = 0
[by the equivalence of (a) and (d) in 2.70]. Let
\[ g_k = c_1 \chi_{B_1} + \cdots + c_n \chi_{B_n}. \]
\[ g(x) = \lim_{k \to \infty} (\chi_E g_k)(x). \]
EXERCISES 2E
1 Suppose X is a finite set. Explain why a sequence of functions from X to R that
converges pointwise on X also converges uniformly on X.
2 Give an example of a sequence of functions from Z+ to R that converges
pointwise on Z+ but does not converge uniformly on Z+.
3 Give an example of a sequence of continuous functions f 1 , f 2 , . . . from [0, 1] to
R that converge pointwise to a function f : [0, 1] → R that is not a continuous
function.
4 Prove or give a counterexample: If A ⊂ R and f 1 , f 2 , . . . is a sequence of
uniformly continuous functions from A to R that converge uniformly to a
function f : A → R, then f is uniformly continuous on A.
5 Give an example to show that Egorov’s Theorem can fail without the hypothesis
that µ( X ) < ∞.
6 Suppose ( X, S , µ) is a measure space with µ( X ) < ∞. Suppose f 1 , f 2 , . . . is a
sequence of S -measurable functions from X to R such that limk→∞ f k ( x ) = ∞
for each x ∈ X. Prove that for every ε > 0, there exists a set E ∈ S such that
µ( X \ E) < ε and f 1 , f 2 , . . . converges uniformly to ∞ on E (meaning that for
every t > 0, there exists n ∈ Z+ such that f k ( x ) > t for all integers k ≥ n and
all x ∈ E).
[The exercise above is an Egorov-type theorem for sequences of functions that
converge pointwise to ∞.]
7 Suppose F is a closed bounded subset of R and g1 , g2 , . . . is an increasing
sequence of continuous real-valued functions on F (thus g1 ( x ) ≤ g2 ( x ) ≤ · · ·
for all x ∈ F) such that sup{ g1 ( x ), g2 ( x ), . . .} < ∞ for each x ∈ F. Define a
real-valued function g on F by
\[ g(x) = \lim_{k \to \infty} g_k(x). \]
Prove that for every ε > 0, there exists a set E ⊂ Z+ with µ(Z+ \ E) < ε
such that f 1 , f 2 , . . . converges uniformly on E for every sequence of functions
f 1 , f 2 , . . . from Z+ to R that converges pointwise on Z+.
[This result does not follow from Egorov’s Theorem because here we are asking
for E to depend only on ε. In Egorov’s Theorem, E depends on ε and on the
sequence f 1 , f 2 , . . . .]
g : F1 ∪ · · · ∪ Fn → R
f ( x ) = sup{ f t ( x ) : t ∈ R},
\[ |\{x \in B : g(x) \ne f(x)\}| = 0. \]
Chapter 3
Integration
To remedy deficiencies of Riemann integration that were discussed in Section 1B,
in the last chapter we developed measure theory as an extension of the notion of the
length of an interval. Having proved the fundamental results about measures, we are
now ready to use measures to develop integration with respect to a measure. As we
will see, this new method of integration fixes many of the problems with Riemann
integration.
\[ \le \mu(E). \]
Thus $\int \chi_E \, d\mu \le \mu(E)$, completing the proof.
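\[
\begin{aligned}
&= \sum_{j=1}^{m} \sum_{k=1}^{n} \mu(A_j \cap E_k) \min_{\{i \,:\, A_j \cap E_i \ne \emptyset\}} c_i \\
&\le \sum_{j=1}^{m} \sum_{k=1}^{n} \mu(A_j \cap E_k)\, c_k \\
&= \sum_{k=1}^{n} c_k \sum_{j=1}^{m} \mu(A_j \cap E_k) \\
&= \sum_{k=1}^{n} c_k \mu(E_k).
\end{aligned}
\]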
\[ \inf_{x \in A_j} f(x) \le \inf_{x \in A_j} g(x) \]
for each j = 1, . . . , m. Thus L(f, P) ≤ L(g, P). Hence $\int f \, d\mu \le \int g \, d\mu$.
Proof First note that the left side of 3.10 is bigger than or equal to the right side by
3.7 and 3.8.
To prove that the right side of 3.10 is bigger than or equal to the left side, first
assume that $\inf_{x \in A} f(x) < \infty$ for every A ∈ S with µ(A) > 0. Then for P an
S-partition $A_1, \ldots, A_m$ of X, take $c_j = \inf_{x \in A_j} f(x)$, which shows that L(f, P) is
in the set on the right side of 3.10. Thus the definition of $\int f \, d\mu$ shows that the right
side of 3.10 is bigger than or equal to the left side.
The only remaining case to consider is when there exists a set A ∈ S such that
µ(A) > 0 and $\inf_{x \in A} f(x) = \infty$ [which implies that f(x) = ∞ for all x ∈ A]. In
this case, for arbitrary t ∈ (0, ∞) we can take m = 1, A1 = A, and c1 = t. These
choices show that the right side of 3.10 is at least tµ( A). Because t is an arbitrary
positive number, this shows that the right side of 3.10 equals ∞, which of course is
greater than or equal to the left side, completing the proof.
The next result allows us to interchange limits and integrals in certain circum-
stances. We will see more theorems of this nature in the next section.
\[ f(x) = \lim_{k \to \infty} f_k(x). \]
Then
\[ \lim_{k \to \infty} \int f_k \, d\mu = \int f \, d\mu. \]
\[ \lim_{k \to \infty} \int f_k \, d\mu \ge \sum_{j=1}^{m} c_j \mu(A_j). \]
The proof that the integral is additive will use the Monotone Convergence Theorem
and our next result. The representation of a simple function h : X → [0, ∞] in the
form $\sum_{k=1}^{n} c_k \chi_{E_k}$ is not unique. Requiring the numbers $c_1, \ldots, c_n$ to be distinct and
$E_1, \ldots, E_n$ to be nonempty and disjoint with $E_1 \cup \cdots \cup E_n = X$ does produce what
is called the standard representation of a simple function [take $E_k = h^{-1}(\{c_k\})$,
where $c_1, \ldots, c_n$ are the distinct values of h]. The following lemma shows that all
representations (including representations with sets that are not disjoint) of a simple
measurable function give the same sum that we expect from integration.
\[ \sum_{j=1}^{m} a_j \mu(A_j) = \sum_{k=1}^{n} b_k \mu(B_k). \]
where the three sets appearing on the right side of the equation above are disjoint.
Now A1 = ( A1 \ A2 ) ∪ ( A1 ∩ A2 ) and A2 = ( A2 \ A1 ) ∪ ( A1 ∩ A2 ); each
of these unions is a disjoint union. Thus µ( A1 ) = µ( A1 \ A2 ) + µ( A1 ∩ A2 ) and
µ( A2 ) = µ( A2 \ A1 ) + µ( A1 ∩ A2 ). Hence
a1 µ ( A1 ) + a2 µ ( A2 ) = a1 µ ( A1 \ A2 ) + a2 µ ( A2 \ A1 ) + ( a1 + a2 ) µ ( A1 ∩ A2 ).
The equation above, in conjunction with 3.14, shows that if we replace the two
sets $A_1, A_2$ by the three disjoint sets $A_1 \setminus A_2$, $A_2 \setminus A_1$, $A_1 \cap A_2$ and make the
appropriate adjustments to the coefficients $a_1, \ldots, a_m$, then the value of the sum
$\sum_{j=1}^{m} a_j \mu(A_j)$ is unchanged (although m has increased by 1).
Repeating this process with all pairs of subsets among $A_1, \ldots, A_m$ that are
not disjoint after each step, in a finite number of steps we can convert the initial
list $A_1, \ldots, A_m$ into a disjoint list of subsets without changing the value of
$\sum_{j=1}^{m} a_j \mu(A_j)$.
The next step is to make the numbers $a_1, \ldots, a_m$ distinct. This is done by replacing
the sets corresponding to each $a_j$ by the union of those sets, and using finite additivity
of the measure µ to show that the value of the sum $\sum_{j=1}^{m} a_j \mu(A_j)$ does not change.
Finally, drop any terms for which $A_j = \emptyset$, getting the standard representation
for a simple function. We have now shown that the original value of $\sum_{j=1}^{m} a_j \mu(A_j)$
is equal to the value if we use the standard representation of the simple function
$\sum_{j=1}^{m} a_j \chi_{A_j}$. The same procedure can be used with the representation $\sum_{k=1}^{n} b_k \chi_{B_k}$ to
show that $\sum_{k=1}^{n} b_k \mu(B_k)$ equals what we would get with the standard representation.
Thus the equality of the functions $\sum_{j=1}^{m} a_j \chi_{A_j}$ and $\sum_{k=1}^{n} b_k \chi_{B_k}$ implies the equality
$\sum_{j=1}^{m} a_j \mu(A_j) = \sum_{k=1}^{n} b_k \mu(B_k)$.
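The conclusion can be checked on a small example. The sketch below assumes a hypothetical weighted counting measure on the six-point set {0, ..., 5}; the weights, the two representations, and all function names are made up for illustration.

```python
from fractions import Fraction

# Hypothetical measure on X = {0, ..., 5}: mu(A) is the sum of the weights of the points of A.
weights = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(2),
           3: Fraction(1, 5), 4: Fraction(4), 5: Fraction(1)}

def mu(subset):
    return sum((weights[x] for x in subset), Fraction(0))

def evaluate(representation, x):
    """Value at x of the simple function sum_j a_j * chi_{A_j}."""
    return sum(a for a, subset in representation if x in subset)

def weighted_sum(representation):
    """The sum sum_j a_j * mu(A_j) attached to a representation."""
    return sum((a * mu(subset) for a, subset in representation), Fraction(0))

overlapping = [(2, {0, 1, 2}), (3, {2, 3})]                 # the sets share the point 2
standard = [(2, {0, 1}), (5, {2}), (3, {3}), (0, {4, 5})]   # distinct values, disjoint sets

same_function = all(evaluate(overlapping, x) == evaluate(standard, x) for x in weights)
print(same_function, weighted_sum(overlapping), weighted_sum(standard))
```

Both representations describe the same function, and the two weighted sums agree, as the result asserts.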
Proof The desired result follows from writing the simple function $\sum_{k=1}^{n} c_k \chi_{E_k}$ in
the standard representation for a simple function and then using 3.7 and 3.13.
Proof The desired result holds for simple nonnegative S -measurable functions (by
3.15). Thus we approximate by such functions.
Specifically, let f 1 , f 2 , . . . and g1 , g2 , . . . be increasing sequences of simple non-
negative S -measurable functions such that
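\[ \lim_{k \to \infty} f_k(x) = f(x) \quad \text{and} \quad \lim_{k \to \infty} g_k(x) = g(x) \]
for all x ∈ X (see 2.82 for the existence of such increasing sequences). Then
\[
\begin{aligned}
\int (f + g) \, d\mu &= \lim_{k \to \infty} \int (f_k + g_k) \, d\mu \\
&= \lim_{k \to \infty} \int f_k \, d\mu + \lim_{k \to \infty} \int g_k \, d\mu \\
&= \int f \, d\mu + \int g \, d\mu,
\end{aligned}
\]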
where the first and third equalities follow from the Monotone Convergence Theorem
and the second equality holds by 3.15.
The lower Riemann integral is not additive, even for bounded nonnegative measurable
functions. For example, if $f = \chi_{\mathbf{Q} \cap [0, 1]}$ and $g = \chi_{[0, 1] \setminus \mathbf{Q}}$, then
3.17 Definition f +; f −
The next result says that the integral of a number times a function is exactly what
we expect.
Proof First consider the case where f is a nonnegative function and c ≥ 0. If P is
an S-partition of X, then clearly L(cf, P) = cL(f, P). Thus $\int cf \, d\mu = c \int f \, d\mu$.
Now consider the general case where f takes values in [−∞, ∞]. Suppose c ≥ 0.
Then
\[
\begin{aligned}
\int cf \, d\mu &= \int (cf)^+ \, d\mu - \int (cf)^- \, d\mu \\
&= \int c f^+ \, d\mu - \int c f^- \, d\mu \\
&= c \Bigl( \int f^+ \, d\mu - \int f^- \, d\mu \Bigr) \\
&= c \int f \, d\mu,
\end{aligned}
\]
where the third line follows from the first paragraph of this proof.
Finally, now suppose c < 0 (still assuming that f takes values in [−∞, ∞]). Then
−c > 0 and
\[
\begin{aligned}
\int cf \, d\mu &= \int (cf)^+ \, d\mu - \int (cf)^- \, d\mu \\
&= \int (-c) f^- \, d\mu - \int (-c) f^+ \, d\mu \\
&= (-c) \Bigl( \int f^- \, d\mu - \int f^+ \, d\mu \Bigr) \\
&= c \int f \, d\mu,
\end{aligned}
\]
Now we prove that integration with respect to a measure has the additive property
required for a good theory of integration.
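Proof Clearly
\[
\begin{aligned}
(f + g)^+ - (f + g)^- &= f + g \\
&= f^+ - f^- + g^+ - g^-.
\end{aligned}
\]
Thus
\[ (f + g)^+ + f^- + g^- = (f + g)^- + f^+ + g^+. \]
Both sides of the equation above are sums of nonnegative functions. Thus integrating
both sides with respect to µ and using 3.16 gives
\[ \int (f + g)^+ \, d\mu + \int f^- \, d\mu + \int g^- \, d\mu = \int (f + g)^- \, d\mu + \int f^+ \, d\mu + \int g^+ \, d\mu. \]
Rearranging the equation above gives
\[ \int (f + g)^+ \, d\mu - \int (f + g)^- \, d\mu = \int f^+ \, d\mu - \int f^- \, d\mu + \int g^+ \, d\mu - \int g^- \, d\mu, \]
where the left side is not of the form ∞ − ∞ because $(f + g)^+ \le f^+ + g^+$ and
$(f + g)^- \le f^- + g^-$. The equation above can be rewritten as
\[ \int (f + g) \, d\mu = \int f \, d\mu + \int g \, d\mu, \]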
The next result resembles 3.8, but now the functions are allowed to be real valued.
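Proof Because $\int f \, d\mu$ is defined, f is an S-measurable function and at least one of
$\int f^+ \, d\mu$ and $\int f^- \, d\mu$ is finite. Thus
\[
\begin{aligned}
\Bigl| \int f \, d\mu \Bigr| &= \Bigl| \int f^+ \, d\mu - \int f^- \, d\mu \Bigr| \\
&\le \int f^+ \, d\mu + \int f^- \, d\mu \\
&= \int (f^+ + f^-) \, d\mu \\
&= \int |f| \, d\mu,
\end{aligned}
\]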
as desired.
EXERCISES 3A
1 Suppose (X, S, µ) is a measure space and f : X → [0, ∞] is an S-measurable
function such that $\int f \, d\mu < \infty$. Explain why
\[ \inf_{x \in E} f(x) = 0 \]
\[ \mu(E) = \sum_{x \in E} w(x) \]
where the infinite sums above are defined as the supremum of all sums over
finite subsets of X.
8 Suppose λ denotes Lebesgue measure on R. Give an example of a sequence
$f_1, f_2, \ldots$ of simple Borel measurable functions from R to [0, ∞) such that
$\lim_{k \to \infty} f_k(x) = 0$ for every x ∈ R but $\lim_{k \to \infty} \int f_k \, d\lambda = 1$.
9 Suppose µ is a measure on a measurable space ( X, S) and f : X → [0, ∞] is an
S -measurable function. Define ν : S → [0, ∞] by
\[ \nu(A) = \int \chi_A f \, d\mu \]
11 Suppose (X, S, µ) is a measure space and $f_1, f_2, \ldots$ are S-measurable functions
from X to R such that $\sum_{k=1}^{\infty} \int |f_k| \, d\mu < \infty$. Prove that $\lim_{k \to \infty} f_k(x) = 0$ for
almost every x ∈ X.
12 Show that there exists a Borel measurable function f : R → (0, ∞) such that
$\int \chi_I f \, d\lambda = \infty$ for every nonempty open interval I ⊂ R, where λ denotes
Lebesgue measure on R.
13 Give an example to show that the Monotone Convergence Theorem (3.11) can
fail if the hypothesis that f 1 , f 2 , . . . are nonnegative functions is dropped.
14 Give an example to show that the Monotone Convergence Theorem fails if the
hypothesis of an increasing sequence of functions is replaced by a hypothesis of
a decreasing sequence of functions.
[This exercise shows that the Monotone Convergence Theorem should be called
the Increasing Convergence Theorem. However, see Exercise 20.]
15 Give an example of a sequence x1 , x2 , . . . of real numbers such that
\[ \lim_{n \to \infty} \sum_{k=1}^{n} x_k \text{ exists in } \mathbf{R}, \]
Note that inf{ xk , xk+1 , . . .} is an increasing function of k; thus the limit above
on the right exists in [−∞, ∞].
18 Suppose that (X, S, µ) is a measure space and $f_1, f_2, \ldots$ is a sequence of
nonnegative S-measurable functions on X. Define a function f : X → [0, ∞] by
$f(x) = \liminf_{k \to \infty} f_k(x)$. Prove that
\[ \int f \, d\mu \le \liminf_{k \to \infty} \int f_k \, d\mu. \]
[The result above is called Fatou’s Lemma. Some textbooks prove Fatou’s
Lemma and then use it to prove the Monotone Convergence Theorem. Here
we are taking the reverse approach—you should be able to use the Monotone
Convergence Theorem to give a clean proof of Fatou’s Lemma.]
\[ \lim_{k \to \infty} \int f_k \, d\mu = \int f \, d\mu. \]
= cµ( E),
where the second line comes from 3.23, the third line comes from 3.8, and the fourth
line comes from 3.15.
The next result could be proved as a special case of the Dominated Convergence
Theorem (3.30), which we will prove later in this section. Thus you could skip the
proof here. However, sometimes you get more insight by seeing an easier proof of an
important special case. Thus you may want to read the easy proof of the Bounded
Convergence Theorem that is presented next.
| f k ( x )| ≤ c
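\[
\begin{aligned}
\Bigl| \int f_k \, d\mu - \int f \, d\mu \Bigr| &= \Bigl| \int_{X \setminus E} f_k \, d\mu - \int_{X \setminus E} f \, d\mu + \int_E (f_k - f) \, d\mu \Bigr| \\
&\le \int_{X \setminus E} |f_k| \, d\mu + \int_{X \setminus E} |f| \, d\mu + \int_E |f_k - f| \, d\mu \\
&< \frac{\varepsilon}{2} + \mu(E) \sup_{x \in E} |f_k(x) - f(x)|,
\end{aligned}
\]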
where the last inequality follows from 3.25. Because f 1 , f 2 , . . . converges uniformly
to f on E and µ( E) < ∞, the right side of the inequality above is less than ε for k
sufficiently large, which completes the proof.
For example, almost every real number is irrational (with respect to the usual
Lebesgue measure on R) because |Q| = 0.
Theorems about integrals can almost always be relaxed so that the hypotheses
apply only almost everywhere instead of everywhere. For example, consider the
Bounded Convergence Theorem (3.26), one of whose hypotheses is that
\[ \lim_{k \to \infty} f_k(x) = f(x) \]
for all x ∈ X. Suppose that the hypotheses of the Bounded Convergence Theorem
hold except that the equation above holds only almost everywhere, meaning there
is a set E ∈ S such that µ(X \ E) = 0 and the equation above holds for all x ∈ E.
Define new functions $g_1, g_2, \ldots$ and g by
\[
g_k(x) =
\begin{cases}
f_k(x) & \text{if } x \in E, \\
0 & \text{if } x \in X \setminus E
\end{cases}
\quad \text{and} \quad
g(x) =
\begin{cases}
f(x) & \text{if } x \in E, \\
0 & \text{if } x \in X \setminus E.
\end{cases}
\]
Then
\[ \lim_{k \to \infty} g_k(x) = g(x) \]
for all x ∈ X. Hence the Bounded Convergence Theorem implies that
\[ \lim_{k \to \infty} \int g_k \, d\mu = \int g \, d\mu, \]
which immediately implies that
\[ \lim_{k \to \infty} \int f_k \, d\mu = \int f \, d\mu \]
because $\int g_k \, d\mu = \int f_k \, d\mu$ and $\int g \, d\mu = \int f \, d\mu$.
\[ \int_B g \, d\mu < \varepsilon \]
Some theorems, such as Egorov’s Theorem (2.79), have as a hypothesis that the
measure of the entire space is finite. The next result sometimes allows us to get
around this hypothesis by restricting attention to a key set of finite measure.
\[ \int_{X \setminus E} g \, d\mu < \varepsilon. \]
Let E be the union of those $A_j$ such that $\inf_{x \in A_j} g(x) > 0$. Then µ(E) < ∞
(because otherwise we would have L(g, P) = ∞, which contradicts the hypothesis
that $\int g \, d\mu < \infty$). Now
\[
\begin{aligned}
\int_{X \setminus E} g \, d\mu &= \int g \, d\mu - \int \chi_E g \, d\mu \\
&< \bigl( \varepsilon + L(g, P) \bigr) - L(\chi_E g, P) \\
&= \varepsilon,
\end{aligned}
\]
where the last equality holds because $\inf_{x \in A_j} g(x) = 0$ for each $A_j$ not contained
in E.
\[ \lim_{k \to \infty} f_k(x) = f(x) \]
|{ x ∈ [ a, b] : f is not continuous at x }| = 0.
Proof Suppose n ∈ Z+. Consider the partition $P_n$ that divides [a, b] into $2^n$
subintervals of equal size. Let $I_1, \ldots, I_{2^n}$ be the corresponding closed subintervals, each
of length $(b - a)/2^n$. Let
\[ \text{3.34} \qquad g_n = \sum_{j=1}^{2^n} \Bigl( \inf_{x \in I_j} f(x) \Bigr) \chi_{I_j} \quad \text{and} \quad h_n = \sum_{j=1}^{2^n} \Bigl( \sup_{x \in I_j} f(x) \Bigr) \chi_{I_j}. \]
The lower and upper Riemann sums of f for the partition $P_n$ are given by integrals.
Specifically,
\[ \text{3.35} \qquad L(f, P_n, [a, b]) = \int_{[a, b]} g_n \, d\lambda \quad \text{and} \quad U(f, P_n, [a, b]) = \int_{[a, b]} h_n \, d\lambda, \]
Taking the limit as n → ∞ of both equations in 3.35 and using the Bounded Convergence
Theorem (3.26) along with Exercise 8 in Section 1A, we see that $f^L$ and $f^U$
are Lebesgue measurable functions and
\[ \text{3.36} \qquad L(f, [a, b]) = \int_{[a, b]} f^L \, d\lambda \quad \text{and} \quad U(f, [a, b]) = \int_{[a, b]} f^U \, d\lambda. \]
We previously defined the notation $\int_a^b f$ to mean the Riemann integral of f.
Because the Riemann integral and Lebesgue integral agree for Riemann integrable
functions (see 3.33), we now redefine $\int_a^b f$ to denote the Lebesgue integral.
3.38 Definition $\int_a^b f$
The definition in the second bullet point above is made so that equations such as
\[ \int_a^b f = \int_a^c f + \int_c^b f \]
remain valid even if, for example, a < b < c.
3.39 Definition $\|f\|_1$; $L^1(\mu)$
The terminology and notation used above are convenient even though $\|\cdot\|_1$ might
not be a genuine norm (to be defined in Chapter 6).
3.40 Example $L^1(\mu)$ functions that take on only finitely many values
Suppose (X, S, µ) is a measure space and $E_1, \ldots, E_n$ are disjoint subsets of X.
Suppose $a_1, \ldots, a_n$ are distinct nonzero real numbers. Then
\[ a_1 \chi_{E_1} + \cdots + a_n \chi_{E_n} \in L^1(\mu) \]
3.41 Example $\ell^1$
If µ is counting measure on Z+ and $x = x_1, x_2, \ldots$ is a sequence of real numbers
(thought of as a function on Z+), then $\|x\|_1 = \sum_{k=1}^{\infty} |x_k|$. In this case, $L^1(\mu)$ is
often denoted by $\ell^1$ (pronounced little-el-one). In other words, $\ell^1$ is the set of all
sequences $x_1, x_2, \ldots$ of real numbers such that $\sum_{k=1}^{\infty} |x_k| < \infty$.
The next result states that every function in $L^1(\mu)$ can be approximated in $L^1$-norm
by measurable functions that take on only finitely many values.
Suppose µ is a measure and $f \in L^1(\mu)$. Then for every ε > 0, there exists a
simple function $g \in L^1(\mu)$ such that
\[ \|f - g\|_1 < \varepsilon. \]
Proof Suppose ε > 0. Then there exist simple functions $g_1, g_2 \in L^1(\mu)$ such that
$0 \le g_1 \le f^+$ and $0 \le g_2 \le f^-$ and
\[ \int (f^+ - g_1) \, d\mu < \frac{\varepsilon}{2} \quad \text{and} \quad \int (f^- - g_2) \, d\mu < \frac{\varepsilon}{2}, \]
where we have used 3.9 to provide the existence of $g_1, g_2$ with these properties.
Let $g = g_1 - g_2$. Then g is a simple function in $L^1(\mu)$ and
\[
\begin{aligned}
\|f - g\|_1 &= \|(f^+ - g_1) - (f^- - g_2)\|_1 \\
&= \int (f^+ - g_1) \, d\mu + \int (f^- - g_2) \, d\mu \\
&< \varepsilon,
\end{aligned}
\]
as desired.
3.44 Definition $L^1(\mathbf{R})$; $\|f\|_1$
\[ g = a_1 \chi_{I_1} + \cdots + a_n \chi_{I_n}, \]
Suppose g is a step function of the form above and the intervals I1 , . . . , In are
disjoint. Then
\[ \|g\|_1 = |a_1| \, |I_1| + \cdots + |a_n| \, |I_n|. \]
In particular, $g \in L^1(\mathbf{R})$ if and only if all the intervals $I_1, \ldots, I_n$ are bounded.
The intervals in the definition of a step function can be open intervals, closed
intervals, or half-open intervals. We will be using step functions in integrals, where
the inclusion or exclusion of the endpoints of the intervals does not matter.
Even though the coefficients $a_1, \ldots, a_n$ in the definition of a step function are
required to be nonzero, the function 0 that is identically 0 on R is a step function.
To see this, take n = 1, $a_1 = 1$, and $I_1 = \emptyset$.
Suppose $f \in L^1(\mathbf{R})$. Then for every ε > 0, there exists a step function
$g \in L^1(\mathbf{R})$ such that
\[ \|f - g\|_1 < \varepsilon. \]
Proof Suppose ε > 0. By 3.43, there exist Borel (or Lebesgue) measurable subsets
A1 , . . . , An of R and nonzero numbers a1 , . . . , an such that | Ak | < ∞ for all k ∈
{1, . . . , n} and
\[ \Bigl\| f - \sum_{k=1}^{n} a_k \chi_{A_k} \Bigr\|_1 < \frac{\varepsilon}{2}. \]
For each k ∈ {1, . . . , n}, there is an open subset Gk of R that contains Ak and
whose Lebesgue measure is as close as we want to | Ak | [by part (e) of 2.70]. Each
Gk is a countable union of disjoint open intervals (by 0.59 in the Appendix). Thus for
each k, there is a set Ek that is a finite union of bounded open intervals contained in
Gk whose Lebesgue measure is as close as we want to | Gk |. Hence for each k, there
is a set Ek that is a finite union of bounded intervals such that
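\[
\begin{aligned}
|E_k \setminus A_k| + |A_k \setminus E_k| &\le |G_k \setminus A_k| + |G_k \setminus E_k| \\
&< \frac{\varepsilon}{2 |a_k| n};
\end{aligned}
\]
in other words,
\[ \|\chi_{A_k} - \chi_{E_k}\|_1 < \frac{\varepsilon}{2 |a_k| n}. \]
Now
\[
\begin{aligned}
\Bigl\| f - \sum_{k=1}^{n} a_k \chi_{E_k} \Bigr\|_1 &\le \Bigl\| f - \sum_{k=1}^{n} a_k \chi_{A_k} \Bigr\|_1 + \Bigl\| \sum_{k=1}^{n} a_k \chi_{A_k} - \sum_{k=1}^{n} a_k \chi_{E_k} \Bigr\|_1 \\
&< \frac{\varepsilon}{2} + \sum_{k=1}^{n} |a_k| \, \|\chi_{A_k} - \chi_{E_k}\|_1 \\
&< \varepsilon.
\end{aligned}
\]
Each $E_k$ is a finite union of bounded intervals. Thus the inequality above completes
the proof because $\sum_{k=1}^{n} a_k \chi_{E_k}$ is a step function.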
Luzin’s Theorem (2.84 and 2.86) gives a spectacular way to approximate a Borel
measurable function by a continuous function. However, the following approximation
theorem is usually more useful than Luzin’s Theorem. For example, the next result
plays a major role in the proof of the Lebesgue Differentiation Theorem (4.10).
Suppose $f \in L^1(\mathbf{R})$. Then for every ε > 0, there exists a continuous function
g : R → R such that
\[ \|f - g\|_1 < \varepsilon \]
and $\{x \in \mathbf{R} : g(x) \ne 0\}$ is a bounded set.
EXERCISES 3B
1 Give an example of a sequence f 1 , f 2 , . . . of functions from Z+ to [0, ∞) such
that
\[ \lim_{k \to \infty} f_k(m) = 0 \]
for every m ∈ Z+ but
\[ \lim_{k \to \infty} \int f_k \, d\mu = 1, \]
where µ is counting measure on Z+.
defined.
Chapter 4
Differentiation
Consider the set E = [0, 1/8] ∪ [1/4, 3/8] ∪ [1/2, 5/8] ∪ [3/4, 7/8]. This set E has the property
that
\[ |E \cap [0, b]| = \frac{b}{2} \]
for b = 0, 1/4, 1/2, 3/4, 1. Does there exist a Lebesgue measurable set E ⊂ [0, 1], perhaps
constructed in a fashion similar to the Cantor set, such that the equation above holds
for all b ∈ [0, 1]?
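The property of the four-interval set E can be checked with exact arithmetic. The following is a minimal sketch; the helper name measure_up_to is hypothetical, and it assumes the intervals passed in are pairwise disjoint.

```python
from fractions import Fraction as Fr

# The four closed intervals making up E, as (left, right) endpoint pairs.
E = [(Fr(0), Fr(1, 8)), (Fr(1, 4), Fr(3, 8)),
     (Fr(1, 2), Fr(5, 8)), (Fr(3, 4), Fr(7, 8))]

def measure_up_to(intervals, b):
    """Total length of (a union of disjoint intervals) intersected with [0, b]."""
    return sum(max(Fr(0), min(r, b) - max(l, Fr(0))) for l, r in intervals)

checkpoints = [Fr(0), Fr(1, 4), Fr(1, 2), Fr(3, 4), Fr(1)]
for b in checkpoints:
    # Each line shows b, |E ∩ [0, b]|, and whether it equals b/2.
    print(b, measure_up_to(E, b), measure_up_to(E, b) == b / 2)

# At a point between the checkpoints, such as b = 1/8, the equation fails.
print(measure_up_to(E, Fr(1, 8)) == Fr(1, 16))
```

The equation holds at the five listed values of b but not at b = 1/8, which is why the question asks about a more intricate set.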
In this chapter we will see how to answer this question by considering differentia-
tion issues. We will begin by developing a powerful tool called the Hardy–Littlewood
Maximal Inequality. This tool will be used to prove an almost everywhere version
of the Fundamental Theorem of Calculus. These results will lead us to an important
theorem about the density of Lebesgue measurable sets.
Later, in Chapter 9, we will come back to examine other issues connected with
differentiation.
\[ \mu(\{x \in X : |h(x)| \ge c\}) \le \frac{1}{c} \|h\|_1 \]
for every c > 0.
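\begin{align*}
\mu(\{x \in X : |h(x)| \ge c\}) &= \frac{1}{c} \int_{\{x \in X : |h(x)| \ge c\}} c \, d\mu \\
&\le \frac{1}{c} \int_{\{x \in X : |h(x)| \ge c\}} |h| \, d\mu \\
&\le \frac{1}{c} \|h\|_1,
\end{align*}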
as desired.
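Markov's Inequality can be checked on a small example. The sketch below uses a hypothetical measure space with five points and arbitrarily chosen weights (both are assumptions made only for illustration) and compares the two sides of the inequality exactly.

```python
from fractions import Fraction as Fr

# Hypothetical finite measure space: five points, with mu[i] the measure of point i.
mu = [Fr(1, 2), Fr(1), Fr(1, 4), Fr(2), Fr(1, 3)]
# Values of a function h at the five points.
h = [Fr(-3), Fr(1, 2), Fr(5), Fr(0), Fr(-1)]

def l1_norm(h, mu):
    """The integral of |h| with respect to mu."""
    return sum(abs(v) * w for v, w in zip(h, mu))

def measure_of_large_values(h, mu, c):
    """mu of the set where |h| >= c."""
    return sum(w for v, w in zip(h, mu) if abs(v) >= c)

def markov_holds(h, mu, c):
    """True if mu({|h| >= c}) <= (1/c) * ||h||_1."""
    return measure_of_large_values(h, mu, c) <= l1_norm(h, mu) / c

for c in [Fr(1, 10), Fr(1, 2), Fr(1), Fr(3), Fr(5), Fr(10)]:
    print(c, measure_of_large_values(h, mu, c), l1_norm(h, mu) / c)
```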
St. Petersburg University along the Neva River in St. Petersburg, Russia.
Andrei Markov (1856–1922) was a student and then a faculty member here.
Section 4A Hardy–Littlewood Maximal Function 101
The next result will be a key tool in the proof of the Hardy–Littlewood Maximal
Inequality (4.8).
I1 ∪ · · · ∪ In ⊂ (3 ∗ Ik1 ) ∪ · · · ∪ (3 ∗ Ikm ).
Then
Thus
I1 ∪ I2 ∪ I3 ∪ I4 ⊂ (3 ∗ I1 ) ∪ (3 ∗ I4 ).
In this example, I1 , I4 is the only sublist of I1 , I2 , I3 , I4 that produces the conclusion
of the Vitali Covering Lemma.
Proof of 4.4 To avoid trivial exceptions in the proof (because the empty set is
disjoint from all other sets), assume that none of I1 , . . . , In is the empty set.
The desired sublist can be constructed using a greedy algorithm that inductively
selects at each stage a largest remaining interval that is disjoint from the previously
selected intervals. Specifically, let k1 be such that
Because we start with a finite list, the procedure must eventually terminate after
some number m of choices.
Suppose j ∈ {1, . . . , n}. To complete the proof, we must show that
Ij ⊂ (3 ∗ Ik1 ) ∪ · · · ∪ (3 ∗ Ikm ).
If j ∈ {k1 , . . . , k m }, then the inclusion above obviously holds.
Thus assume that $j \notin \{k_1, \dots, k_m\}$. Because the process terminated without
selecting j, the interval $I_j$ is not disjoint from all of $I_{k_1}, \dots, I_{k_m}$. Let $I_{k_L}$ be the first
interval on this list not disjoint from $I_j$; thus $I_j$ is disjoint from $I_{k_1}, \dots, I_{k_{L-1}}$. Because
j was not chosen in step L, we conclude that $|I_{k_L}| \ge |I_j|$. Because $I_{k_L} \cap I_j \ne \emptyset$, this
last inequality implies (easy exercise) that $I_j \subset 3 * I_{k_L}$, completing the proof.
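The greedy selection in the proof can be sketched in a few lines. In this sketch, bounded nonempty open intervals are represented as pairs (a, b) with a < b; the function names and the sample intervals are assumptions made for illustration. Processing the intervals in order of decreasing length and keeping each one that is disjoint from those already kept matches the proof's rule of selecting a largest remaining disjoint interval at each stage.

```python
def length(iv):
    return iv[1] - iv[0]

def disjoint(iv1, iv2):
    """Open intervals (a, b) and (c, d) are disjoint iff b <= c or d <= a."""
    return iv1[1] <= iv2[0] or iv2[1] <= iv1[0]

def triple(iv):
    """3 * I: the interval with the same center as I and three times the length."""
    a, b = iv
    return (2 * a - b, 2 * b - a)

def vitali_sublist(intervals):
    """Greedy choice: scan by decreasing length, keep intervals disjoint from those kept."""
    chosen = []
    for iv in sorted(intervals, key=length, reverse=True):
        if all(disjoint(iv, c) for c in chosen):
            chosen.append(iv)
    return chosen

def covers(intervals, chosen):
    """Check that every interval lies inside 3 * (some chosen interval)."""
    triples = [triple(c) for c in chosen]
    return all(any(t[0] <= iv[0] and iv[1] <= t[1] for t in triples)
               for iv in intervals)

# Sample list of four overlapping intervals (chosen for illustration).
sample = [(0, 10), (9, 15), (14, 22), (21, 31)]
chosen = vitali_sublist(sample)
print(chosen, covers(sample, chosen))
```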
In other words, h∗ (b) is the supremum over all bounded intervals centered at b of
the average of |h| on those intervals.
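\[
(\chi_{[0,1]})^*(b) =
\begin{cases}
\dfrac{1}{2(1-b)} & \text{if } b \le 0, \\
1 & \text{if } 0 < b < 1, \\
\dfrac{1}{2b} & \text{if } b \ge 1,
\end{cases}
\]
[Figure: The graph of (χ[0, 1])∗ on [−2, 3].]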
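The formula for the maximal function of χ[0, 1] can be compared with a brute-force computation. The sketch below evaluates the average of χ[0, 1] over (b − t, b + t) on a finite set of radii t; the grid size and function names are assumptions. The candidate radii include |b| and |1 − b|, where an endpoint of the interval reaches 0 or 1, since the average is monotone in t between such breakpoints.

```python
from fractions import Fraction as Fr

def overlap(b, t):
    """Length of [0, 1] intersected with (b - t, b + t)."""
    lo = max(Fr(0), b - t)
    hi = min(Fr(1), b + t)
    return max(Fr(0), hi - lo)

def maximal_chi01(b, grid=200):
    """Largest average of chi_[0,1] over (b - t, b + t) among the candidate radii t."""
    b = Fr(b)
    candidates = {abs(b), abs(1 - b)}
    candidates |= {Fr(4 * k, grid) for k in range(1, grid + 1)}
    return max(overlap(b, t) / (2 * t) for t in candidates if t > 0)

def formula(b):
    """The closed-form expression stated in the text."""
    b = Fr(b)
    if b <= 0:
        return 1 / (2 * (1 - b))
    if b < 1:
        return Fr(1)
    return 1 / (2 * b)

for b in [Fr(-2), Fr(-1, 2), Fr(0), Fr(1, 3), Fr(1), Fr(3)]:
    print(b, maximal_chi01(b), formula(b))
```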
\[ |\{b \in \mathbf{R} : h^*(b) > c\}| \le \frac{3}{c} \|h\|_1 \]
for every c > 0.
Proof Suppose F is a closed bounded subset of {b ∈ R : h∗ (b) > c}. We will
show that $|F| \le \frac{3}{c} \int_{-\infty}^{\infty} |h|$, which implies our desired result [see Exercise 20(a) in
Section 2D].
For each b ∈ F, there exists tb > 0 such that
\[ \frac{1}{2t_b} \int_{b-t_b}^{b+t_b} |h| > c. \tag{4.9} \]
Clearly
\[ F \subset \bigcup_{b \in F} (b - t_b, b + t_b). \]
The Heine–Borel Theorem (2.12) tells us that this open cover of a closed bounded set
has a finite subcover. In other words, there exist b1 , . . . , bn ∈ F such that
F ⊂ (b1 − tb1 , b1 + tb1 ) ∪ · · · ∪ (bn − tbn , bn + tbn ).
To make the notation cleaner, let’s relabel the open intervals above as I1 , . . . , In .
Now apply the Vitali Covering Lemma (4.4) to the list I1 , . . . , In , producing a
disjoint sublist Ik1 , . . . , Ikm such that
I1 ∪ · · · ∪ In ⊂ (3 ∗ Ik1 ) ∪ · · · ∪ (3 ∗ Ikm ).
Thus
\begin{align*}
|F| &\le |I_1 \cup \cdots \cup I_n| \\
&\le |(3 * I_{k_1}) \cup \cdots \cup (3 * I_{k_m})| \\
&\le |3 * I_{k_1}| + \cdots + |3 * I_{k_m}| \\
&= 3(|I_{k_1}| + \cdots + |I_{k_m}|) \\
&< \frac{3}{c} \Bigl( \int_{I_{k_1}} |h| + \cdots + \int_{I_{k_m}} |h| \Bigr) \\
&\le \frac{3}{c} \int_{-\infty}^{\infty} |h|,
\end{align*}
where the second-to-last inequality above comes from 4.9 (note that $|I_{k_j}| = 2t_b$ for
the choice of b corresponding to $I_{k_j}$) and the last inequality holds because $I_{k_1}, \dots, I_{k_m}$
are disjoint.
The last inequality completes the proof.
EXERCISES 4A
1 Suppose ( X, S , µ) is a measure space and h : X → R is an S -measurable
function. Prove that
\[ \mu(\{x \in X : |h(x)| \ge c\}) \le \frac{1}{c^p} \int |h|^p \, d\mu \]
for all positive numbers c and p.
2 Suppose ( X, S , µ) is a measure space with µ( X ) = 1 and h ∈ L1 (µ). Prove
that
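\[ \mu\Bigl(\Bigl\{ x \in X : \Bigl| h(x) - \int h \, d\mu \Bigr| \ge c \Bigr\}\Bigr) \le \frac{1}{c^2} \Bigl( \int h^2 \, d\mu - \Bigl( \int h \, d\mu \Bigr)^2 \Bigr) \]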
\[ \mu(\{x \in X : |h(x)| \ge c\}) = \frac{1}{c} \|h\|_1. \]
4 Show that the constant 3 in the Vitali Covering Lemma (4.4) cannot be replaced
by a smaller positive constant.
5 Prove the assertion left as an exercise in the last sentence of the proof of the
Vitali Covering Lemma (4.4).
6 Verify the formula in Example 4.7 for the Hardy–Littlewood maximal function
of χ[0, 1].
{b ∈ R : h∗ (b) > c}
13 Show that there exists h ∈ L1 (R) such that h∗ (b) = ∞ for every b ∈ Q.
14 Suppose h ∈ L1 (R). Prove that
\[ |\{b \in \mathbf{R} : h^*(b) \ge c\}| \le \frac{3}{c} \|h\|_1 \]
for every c > 0.
[This result slightly strengthens the Hardy–Littlewood Maximal Inequality (4.8)
because the set on the left side above includes those b ∈ R such that h∗ (b) = c.
A much deeper strengthening comes from replacing the constant 3 in the
Hardy–Littlewood Maximal Inequality with a smaller constant. In 2003, Antonios
Melas answered what had been an open question about the best constant by
proving that the smallest constant that can replace 3 in the Hardy–Littlewood
Maximal Inequality is (11 + √61)/12 ≈ 1.56752; see Annals of Mathematics
157 (2003), 647–688.]
4B Derivatives of Integrals
Lebesgue Differentiation Theorem
The next result states that the average amount by which a function in L1 (R) differs
from its values is small almost everywhere on small intervals. The 2 in the denomina-
tor of the fraction in the result below could be deleted, but its presence nicely makes
the length of the interval of integration match the denominator 2t.
The next result is called the Lebesgue Differentiation Theorem even though no
derivative is in sight. However, we will soon see how another version of this result
deals with derivatives. The hard work takes place in the proof of this first version.
Before getting to the formal proof of this first version of the Lebesgue Differen-
tiation Theorem, we pause to provide some motivation for the proof. If b ∈ R and
t > 0, then 3.25 gives the easy estimate
\[ \frac{1}{2t} \int_{b-t}^{b+t} |f - f(b)| \le \sup\{ |f(x) - f(b)| : |x - b| \le t \}. \]
\[ \|f - h_k\|_1 < \frac{\delta}{k 2^k}. \tag{4.11} \]
Let
\[ B_k = \{ b \in \mathbf{R} : |f(b) - h_k(b)| \le \tfrac{1}{k} \text{ and } (f - h_k)^*(b) \le \tfrac{1}{k} \}. \]
Then
Markov’s Inequality (4.1) as applied to the function f − hk and 4.11 imply that
\[ |\{ b \in \mathbf{R} : |f(b) - h_k(b)| > \tfrac{1}{k} \}| < \frac{\delta}{2^k}. \tag{4.13} \]
The Hardy–Littlewood Maximal Inequality (4.8) as applied to the function f − hk
and 4.11 imply that
\[ |\{ b \in \mathbf{R} : (f - h_k)^*(b) > \tfrac{1}{k} \}| < \frac{3\delta}{2^k}. \tag{4.14} \]
Now 4.12, 4.13, and 4.14 imply that
\[ |\mathbf{R} \setminus B_k| < \frac{\delta}{2^{k-2}}. \]
Let
\[ B = \bigcap_{k=1}^{\infty} B_k. \]
Then
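\[ |\mathbf{R} \setminus B| = \Bigl| \bigcup_{k=1}^{\infty} (\mathbf{R} \setminus B_k) \Bigr| \le \sum_{k=1}^{\infty} |\mathbf{R} \setminus B_k| < \sum_{k=1}^{\infty} \frac{\delta}{2^{k-2}} = 4\delta. \tag{4.15} \]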
Because hk is continuous, the last term is less than 1/k for all t > 0 sufficiently close
to 0 (with how close is sufficiently close depending upon k). In other words, for each
k ∈ Z+, we have
\[ \frac{1}{2t} \int_{b-t}^{b+t} |f - f(b)| < \frac{3}{k} \]
for all t > 0 sufficiently close to 0.
Hence we conclude that
\[ \lim_{t \downarrow 0} \frac{1}{2t} \int_{b-t}^{b+t} |f - f(b)| = 0 \]
for all b ∈ B.
Let A denote the set of numbers a ∈ R such that
\[ \lim_{t \downarrow 0} \frac{1}{2t} \int_{a-t}^{a+t} |f - f(a)| \]
either does not exist or is nonzero. We have shown that A ⊂ (R \ B). Thus
| A| ≤ |R \ B| < 4δ,
where the last inequality comes from 4.15. Because δ is an arbitrary positive number,
the last inequality implies that | A| = 0, completing the proof.
Derivatives
You should remember the following definition from your calculus course.
\[ g'(b) = \lim_{t \to 0} \frac{g(b+t) - g(b)}{t} \]
if the limit above exists, in which case g is called differentiable at b.
\[ g'(b) = f(b). \]
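Proof If t ≠ 0, then
\begin{align*}
\Bigl| \frac{g(b+t) - g(b)}{t} - f(b) \Bigr|
&= \Bigl| \frac{\int_{-\infty}^{b+t} f - \int_{-\infty}^{b} f}{t} - f(b) \Bigr| \\
&= \Bigl| \frac{\int_{b}^{b+t} f}{t} - f(b) \Bigr| \\
&= \Bigl| \frac{\int_{b}^{b+t} \bigl( f - f(b) \bigr)}{t} \Bigr| \tag{4.18} \\
&\le \sup_{\{x \in \mathbf{R} : |x - b| < |t|\}} |f(x) - f(b)|.
\end{align*}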
If ε > 0, then by the continuity of f at b, the last quantity is less than ε for t
sufficiently close to 0. Thus g is differentiable at b and g′(b) = f (b).
Now we can answer the question raised on the opening page of this chapter.
There does not exist a Lebesgue measurable set E ⊂ [0, 1] such that
\[ |E \cap [0, b]| = \frac{b}{2} \]
for all b ∈ [0, 1].
Proof Suppose there does exist a Lebesgue measurable set E ⊂ [0, 1] with the
property above. Define g : R → R by
\[ g(b) = \int_{-\infty}^{b} \chi_E. \]
Thus g(b) = b/2 for all b ∈ [0, 1]. Hence g′(b) = 1/2 for all b ∈ (0, 1).
The Lebesgue Differentiation Theorem (4.19) implies that g′(b) = χ E(b) for
almost every b ∈ R. However, χ E never takes on the value 1/2, which contradicts the
conclusion of the previous paragraph. This contradiction completes the proof.
The next result says that a function in L1 (R) is equal almost everywhere to the
limit of its average over small intervals. These two-sided results generalize more
naturally to higher dimensions (take the average over balls centered at b) than the
one-sided results.
Again, the conclusion of the result above holds at every number b at which f is
continuous. The remarkable part of the result above is that even if f is discontinuous
everywhere, the conclusion holds for almost every real number b.
Density
The next definition captures the notion of the proportion of a set in small intervals
centered at a number b.
\[ \lim_{t \downarrow 0} \frac{|E \cap (b - t, b + t)|}{2t} \]
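The quotient inside the limit above can be computed directly when E is a finite union of disjoint intervals. The following is a minimal sketch under that assumption; the function name density_quotient and the sample set E = [0, 1] are chosen only for illustration.

```python
from fractions import Fraction as Fr

def density_quotient(intervals, b, t):
    """|E ∩ (b - t, b + t)| / (2t) for E a union of disjoint intervals (l, r)."""
    lo, hi = b - t, b + t
    overlap = sum(max(Fr(0), min(r, hi) - max(l, lo)) for l, r in intervals)
    return overlap / (2 * t)

E = [(Fr(0), Fr(1))]  # the interval [0, 1]

# Shrinking radii t: the quotient stays at 1 for an interior point,
# at 1/2 for an endpoint, and at 0 for a point away from E.
for t in [Fr(1, 10), Fr(1, 100), Fr(1, 1000)]:
    print(t,
          density_quotient(E, Fr(1, 2), t),
          density_quotient(E, Fr(0), t),
          density_quotient(E, Fr(2), t))
```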
The next beautiful result shows the power of the techniques developed in this
chapter.
for every t > 0 and every b ∈ R, the desired result follows immediately from 4.21.
Now consider the case where | E| = ∞ [which means that χ E ∉ L1 (R) and hence
4.21 as stated cannot be used]. For k ∈ Z+, let Ek = E ∩ (−k, k). If |b| < k, then the
density of E at b equals the density of Ek at b. By the previous paragraph as applied
to Ek , there are sets Fk ⊂ Ek and Gk ⊂ R \ Ek such that | Fk | = | Gk | = 0 and the
density of Ek equals 1 at every element of Ek \ Fk and the density of Ek equals 0 at
every element of (R \ Ek ) \ Gk .
Let $F = \bigcup_{k=1}^{\infty} F_k$ and $G = \bigcup_{k=1}^{\infty} G_k$. Then | F | = | G | = 0 and the density of E is
The Lebesgue Density Theorem makes the example provided by the next result
somewhat surprising. Be sure to spend some time pondering why the next result does
not contradict the Lebesgue Density Theorem. Also, compare the next result to 4.20.
The bad Borel set provided by the next result leads to a bad Borel measurable
function. Specifically, let E be the bad Borel set in 4.25. Then χ E is a Borel
measurable function that is discontinuous everywhere. Furthermore, the function χ E
cannot be modified on a set of measure 0 to be continuous anywhere (in contrast to
the function χQ).
Even though the function χ E discussed in the paragraph above is continuous
nowhere and every modification of this function on a set of measure 0 is also continu-
ous nowhere, the function g defined by
\[ g(b) = \int_{0}^{b} \chi_E \]
0 < |E ∩ I | < | I |
4.26 Suppose G is a nonempty open subset of R. Then there exists a closed set
F ⊂ G \ Q such that | F | > 0.
To prove 4.26, let J be a closed interval contained in G such that 0 < |J |. Let
r1 , r2 , . . . be a list of all the rational numbers. Let
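\[ F = J \setminus \bigcup_{k=1}^{\infty} \Bigl( r_k - \frac{|J|}{2^{k+2}},\; r_k + \frac{|J|}{2^{k+2}} \Bigr). \]
Then F is a closed subset of R and F ⊂ J \ Q ⊂ G \ Q. Also, $|J \setminus F| \le \frac{1}{2}|J|$
because $J \setminus F \subset \bigcup_{k=1}^{\infty} \bigl( r_k - \frac{|J|}{2^{k+2}},\, r_k + \frac{|J|}{2^{k+2}} \bigr)$. Thus
\[ |F| = |J| - |J \setminus F| \ge \tfrac{1}{2}|J| > 0, \]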
In \ ( F̂0 ∪ . . . ∪ F̂n−1 )
In \ ( F0 ∪ . . . ∪ Fn ),
which is nonempty because it contains all rational numbers in In , we see that there is
a closed set F̂n contained in the set above such that F̂n contains no rational numbers
and | F̂n | > 0.
Now let
\[ E = \bigcup_{k=1}^{\infty} F_k. \]
Our construction implies that Fk ∩ F̂n = ∅ for all k, n ∈ Z+. Thus E ∩ F̂n = ∅ for
all n ∈ Z+. Hence F̂n ⊂ In \ E for all n ∈ Z+.
Suppose I is a nonempty bounded open interval. Then In ⊂ I for some n ∈ Z+.
Thus
0 < | Fn | ≤ | E ∩ In | ≤ | E ∩ I |.
Also,
| E ∩ I | = | I | − | I \ E| ≤ | I | − | In \ E| ≤ | I | − | F̂n | < | I |,
completing the proof.
EXERCISES 4B
For f ∈ L1 (R) and I an interval of R with 0 < | I | < ∞, let f_I denote the
average of f on I. In other words, $f_I = \frac{1}{|I|} \int_I f$.
| f (b)| ≤ f ∗ (b)
Chapter 5
Product Measures
Lebesgue measure on R generalizes the notion of the length of an interval. In this
chapter, we will see how two-dimensional Lebesgue measure on R2 generalizes the
notion of the area of a rectangle. More generally, we will construct new measures
that are the products of two measures.
Once these new measures have been constructed, the question arises of how to
compute integrals with respect to these new measures. Beautiful theorems proved
in the first decade of the twentieth century will allow us to compute integrals with
respect to product measures as iterated integrals involving the two measures that
produced the product.
Main building of Scuola Normale Superiore di Pisa, the university in Pisa, Italy,
where Guido Fubini (1879–1943) received his PhD in 1900. In 1907 Fubini proved
that under reasonable conditions, an integral with respect to a product measure can
be computed as an iterated integral and that the order of integration can be switched.
Leonida Tonelli (1885–1943) also taught for many years in Pisa; he also discovered
and proved an important theorem about interchanging the order of integration in an
iterated integral.
Section 5A Products of Measure Spaces 115
Suppose X and Y are sets and E ⊂ X × Y. Then for a ∈ X and b ∈ Y, the cross
sections $[E]_a$ and $[E]^b$ are defined by
Proof Let E denote the collection of subsets E of X × Y for which the conclusion
of this result holds. Then A × B ∈ E for all A ∈ S and all B ∈ T (by Example 5.5).
The collection E is closed under complementation and countable unions because
[( X × Y ) \ E] a = Y \ [ E] a
and
[ E1 ∪ E2 ∪ · · · ] a = [ E1 ] a ∪ [ E2 ] a ∪ · · ·
for all subsets E, E1 , E2 , . . . of X × Y and all a ∈ X, as you should verify, with
similar statements holding for cross sections with respect to all b ∈ Y.
Because E is a σ-algebra containing all the measurable rectangles in S ⊗ T , we
conclude that E contains S ⊗ T .
The next result shows that cross sections preserve measurability, this time in the
context of functions rather than sets.
and
$[f]^b$ is an S -measurable function on X for every b ∈ Y.
\begin{align*}
y \in ([f]_a)^{-1}(D) &\iff [f]_a(y) \in D \\
&\iff f(a, y) \in D \\
&\iff (a, y) \in f^{-1}(D) \\
&\iff y \in [f^{-1}(D)]_a.
\end{align*}
Thus
\[ ([f]_a)^{-1}(D) = [f^{-1}(D)]_a. \]
Because f is an S ⊗ T -measurable function, $f^{-1}(D)$ ∈ S ⊗ T . Thus the equation
above and 5.6 imply that $([f]_a)^{-1}(D)$ ∈ T . Hence $[f]_a$ is a T -measurable function.
The same ideas show that $[f]^b$ is an S -measurable function for every b ∈ Y.
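For finite sets, the cross-section identities used above can be verified by direct enumeration. The sketch below is only an illustration: the sets X, Y, D, the function f, and the helper names are all hypothetical choices, with section_x(E, a) playing the role of $[E]_a$ and section_y(E, b) the role of $[E]^b$.

```python
from itertools import product

X = {1, 2, 3}
Y = {10, 20, 30}

def section_x(E, a):
    """The cross section [E]_a = {y : (a, y) in E}."""
    return {y for (x, y) in E if x == a}

def section_y(E, b):
    """The cross section [E]^b = {x : (x, b) in E}."""
    return {x for (x, y) in E if y == b}

def f(x, y):
    return x + y

D = {11, 13, 22, 33}
# The inverse image f^{-1}(D), a subset of X x Y.
preimage = {(x, y) for x, y in product(X, Y) if f(x, y) in D}

for a in sorted(X):
    # Inverse image of D under the cross-section function y -> f(a, y).
    inverse_of_section = {y for y in Y if f(a, y) in D}
    print(a, inverse_of_section == section_x(preimage, a))

# Cross sections also commute with complements, as used in the proof of 5.6.
full = set(product(X, Y))
print(all(section_x(full - preimage, a) == Y - section_x(preimage, a) for a in X))
```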
The equation above writes the union of two measurable rectangles in S ⊗ T as the
union of three disjoint measurable rectangles in S ⊗ T .
Now consider any finite union of measurable rectangles in S ⊗ T . If this is not
a disjoint union, then choose any nondisjoint pair of measurable rectangles in the
union and replace those two measurable rectangles with the union of three disjoint
measurable rectangles as in 5.14. Iterate this process until obtaining a disjoint union
of measurable rectangles.
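One step of the replacement described above can be sketched with finite sets. The decomposition used here, (A × B) ∪ (C × D) = (A × B) ∪ (C × (D \ B)) ∪ ((C \ A) × (D ∩ B)), is one standard way to split such a union into three disjoint rectangles; it is an assumption that this has the same shape as 5.14, and the function names and sample sets are hypothetical.

```python
from itertools import product

def rect(A, B):
    """The rectangle A x B as a set of ordered pairs."""
    return set(product(A, B))

def split_union(A, B, C, D):
    """Sides of three pairwise disjoint rectangles whose union is (A x B) ∪ (C x D)."""
    return [(A, B), (C, D - B), (C - A, D & B)]

A, B = {1, 2, 3}, {1, 2}
C, D = {2, 3, 4}, {2, 3}

pieces = [rect(P, Q) for P, Q in split_union(A, B, C, D)]
union_of_pieces = set().union(*pieces)

print(union_of_pieces == rect(A, B) | rect(C, D))
print(all(pieces[i].isdisjoint(pieces[j])
          for i in range(3) for j in range(i + 1, 3)))
```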
Proof Let M denote the smallest monotone class containing A. Because every σ-
algebra is a monotone class, M is contained in the smallest σ-algebra containing A.
To prove the inclusion in the other direction, first suppose A ∈ A. Let
E = { E ∈ M : A ∪ E ∈ M}.
The previous paragraph shows that A ⊂ D . A moment’s thought again shows that D
is a monotone class. Thus, as in the previous paragraph, we conclude that M ⊂ D .
Hence we have proved that D ∪ E ∈ M for all D, E ∈ M.
The paragraph above shows that the monotone class M is closed under finite
unions. Now if E1 , E2 , . . . ∈ M, then
E1 ∪ E2 ∪ E3 ∪ · · · = E1 ∪ ( E1 ∪ E2 ) ∪ ( E1 ∪ E2 ∪ E3 ) ∪ · · · ,
Products of Measures
The following definitions will be useful.
The next result will allow us to define the product of two σ-finite measures.
ν([ E] x ) = ν([ E1 ∪ · · · ∪ En ] x )
= ν([ E1 ] x ∪ · · · ∪ [ En ] x )
= ν([ E1 ] x ) + · · · + ν([ En ] x ),
where the last equality holds because ν is a measure and [ E1 ] x , . . . , [ En ] x are disjoint.
The equation above, when combined with the conclusion of the previous paragraph,
shows that x ↦ ν([ E] x ) is a finite sum of S -measurable functions and thus is an
S -measurable function. Hence E ∈ M. We have now shown that A ⊂ M.
Our next goal is to show that M is a monotone class on X × Y. To do this, first
suppose that E1 ⊂ E2 ⊂ · · · is an increasing sequence of sets in M. Then
\[ \nu\Bigl( \Bigl[ \bigcup_{k=1}^{\infty} E_k \Bigr]_x \Bigr) = \nu\Bigl( \bigcup_{k=1}^{\infty} [E_k]_x \Bigr) = \lim_{k \to \infty} \nu([E_k]_x), \]
where we have used 2.58. Because the pointwise limit of S -measurable functions
is S -measurable (by 2.47), the equation above shows that $x \mapsto \nu\bigl( \bigl[ \bigcup_{k=1}^{\infty} E_k \bigr]_x \bigr)$ is
an S -measurable function. Hence $\bigcup_{k=1}^{\infty} E_k$ ∈ M. We have now shown that M is
closed under countable increasing unions.
where we have used 2.59 (this is where we use the assumption that ν is a finite
measure). Because the pointwise limit of S -measurable functions is S -measurable
(by 2.47), the equation above shows that $x \mapsto \nu\bigl( \bigl[ \bigcap_{k=1}^{\infty} E_k \bigr]_x \bigr)$ is an S -measurable
function. Hence $\bigcap_{k=1}^{\infty} E_k$ ∈ M. We have now shown that M is closed under
where dµ( x ) indicates that variables other than x should be treated as constants.
\[ = \frac{352}{3} \]
and
\[ \int_{[0,4]} \int_{[0,4]} (x^2 + y) \, d\lambda(x) \, d\lambda(y) = \int_{[0,4]} \Bigl( \frac{64}{3} + 4y \Bigr) d\lambda(y) = \frac{352}{3}. \]
The two iterated integrals in this example turned out to both equal 352/3, even though
they do not look alike in the intermediate step of the evaluation. As we will see in the
next section, this equality of integrals when changing the order of integration is not a
coincidence.
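The agreement of the two iterated integrals can also be seen numerically. The sketch below approximates each iterated integral of x² + y over [0, 4] × [0, 4] with a midpoint rule in exact rational arithmetic; the function names and the number of subdivisions are assumptions, and the result is an approximation of 352/3 rather than an exact evaluation.

```python
from fractions import Fraction as Fr

def midpoint_integral(g, a, b, n):
    """Midpoint-rule approximation of the integral of g over [a, b] with n subintervals."""
    h = Fr(b - a, n)
    return h * sum(g(a + (i + Fr(1, 2)) * h) for i in range(n))

def f(x, y):
    return x * x + y

n = 40

# Integrate first in y, then in x.
dy_then_dx = midpoint_integral(
    lambda x: midpoint_integral(lambda y: f(x, y), 0, 4, n), 0, 4, n)

# Integrate first in x, then in y.
dx_then_dy = midpoint_integral(
    lambda y: midpoint_integral(lambda x: f(x, y), 0, 4, n), 0, 4, n)

print(float(dy_then_dx), float(dx_then_dy), float(Fr(352, 3)))
```

Both approximations are the same finite double sum taken in a different order, so they agree exactly; each differs from 352/3 only by the discretization error of the midpoint rule.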
The definition of (µ × ν)( E) given below makes sense because the inner integral
below equals ν([ E] x ), which makes sense by 5.6 (or use 5.9), and then the outer
integral makes sense by 5.20(a).
The restriction in the definition below to σ-finite measures is not bothersome be-
cause the main results we seek are not valid without this hypothesis (see Example 5.30
in the next section).
= µ ( A ) ν ( B ).
In other words, the product measure of a rectangle is the product of the measures of
the corresponding sets.
\begin{align*}
&= \sum_{k=1}^{\infty} \int_X \nu([E_k]_x) \, d\mu(x) \\
&= \sum_{k=1}^{\infty} (\mu \times \nu)(E_k),
\end{align*}
where the fourth equality follows from the Monotone Convergence Theorem (3.11;
or see Exercise 10 in Section 3A). The equation above shows that µ × ν satisfies the
countable additivity condition required for a measure.
EXERCISES 5A
1 Suppose ( X, S) and (Y, T ) are measurable spaces. Prove that if A is a
nonempty subset of X and B is a nonempty subset of Y such that A × B ∈
S ⊗ T , then A ∈ S and B ∈ T .
2 Suppose ( X, S) is a measurable space. Prove that if E ∈ S ⊗ S , then
{ x ∈ X : ( x, x ) ∈ E} ∈ S .
3 Let B denote the σ-algebra of Borel subsets of R. Show that there exists a set
E ⊂ R × R such that $[E]_a$ ∈ B and $[E]^a$ ∈ B for every a ∈ R, but E ∉ B ⊗ B.
5B Iterated Integrals
Tonelli’s Theorem
Relook at Example 5.24 in the previous section and notice that the value of the
iterated integral was unchanged when we switched the order of integration, even
though switching the order of integration led to different intermediate results. Our
next result states that the order of integration can be switched if the function being
integrated is nonnegative and the measures are σ-finite.
and
\[ \int_{X \times Y} f \, d(\mu \times \nu) = \int_X \int_Y f(x, y) \, d\nu(y) \, d\mu(x) = \int_Y \int_X f(x, y) \, d\mu(x) \, d\nu(y). \]
The Monotone Class Theorem (5.17) implies that M contains the smallest σ-algebra
containing A. In other words, M contains S ⊗ T . Thus
\[ \int_X \int_Y \chi_E(x, y) \, d\nu(y) \, d\mu(x) = \int_Y \int_X \chi_E(x, y) \, d\mu(x) \, d\nu(y) \tag{5.29} \]
for every E ∈ S ⊗ T .
Now relax the assumption that µ and ν are finite measures. Write X as an
increasing union of sets X1 ⊂ X2 ⊂ · · · in S with finite measure, and write Y
as an increasing union of sets Y1 ⊂ Y2 ⊂ · · · in T with finite measure. Suppose
E ∈ S ⊗ T . Applying the finite-measure case to the situation where the measures
and the σ-algebras are restricted to X j and Yk , we can conclude that 5.29 holds
with E replaced by E ∩ ( X j × Yk ) for all j, k ∈ Z+. Fix k ∈ Z+ and use the
Monotone Convergence Theorem (3.11) to conclude that 5.29 holds with E replaced
by E ∩ ( X × Yk ) for all k ∈ Z+. One more use of the Monotone Convergence
Theorem then shows that
\[ \int_{X \times Y} \chi_E \, d(\mu \times \nu) = \int_X \int_Y \chi_E(x, y) \, d\nu(y) \, d\mu(x) = \int_Y \int_X \chi_E(x, y) \, d\mu(x) \, d\nu(y) \]
for all E ∈ S ⊗ T , where the first equality above comes from the definition of
(µ × ν)( E) (see 5.25).
Now we turn from characteristic functions to the general case of an S ⊗ T -
measurable function f : X × Y → [0, ∞]. Define a sequence f 1 , f 2 , . . . of simple
S ⊗ T -measurable functions from X × Y to [0, ∞) by
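\[
f_k(x, y) =
\begin{cases}
\dfrac{m}{2^k} & \text{if } f(x, y) < k \text{ and } m \text{ is the integer with } f(x, y) \in \Bigl[ \dfrac{m}{2^k}, \dfrac{m+1}{2^k} \Bigr), \\
k & \text{if } f(x, y) \ge k.
\end{cases}
\]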
Note that
\[ 0 \le f_1(x, y) \le f_2(x, y) \le f_3(x, y) \le \cdots \quad \text{and} \quad \lim_{k \to \infty} f_k(x, y) = f(x, y) \]
for all ( x, y) ∈ X × Y.
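The dyadic approximation above acts on each value of f separately, so it can be illustrated on single numbers. In this sketch the function name dyadic_approx is hypothetical; it takes a value v in [0, ∞] and returns what f_k would equal at a point where f takes the value v.

```python
import math

def dyadic_approx(v, k):
    """Value of f_k at a point where f equals v, for v in [0, inf]."""
    if v >= k:
        return k
    # m is the integer with v in [m / 2^k, (m + 1) / 2^k).
    m = math.floor(v * 2 ** k)
    return m / 2 ** k

for v in [0.3, math.pi, 7.5, math.inf]:
    values = [dyadic_approx(v, k) for k in range(1, 11)]
    print(v, values)
```

For each finite value v the printed list is nondecreasing and approaches v from below, and for v = ∞ the k-th entry is k, matching the two displayed properties.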
Each f k is a finite sum of functions of the form cχ E, where c ∈ R and E ∈ S ⊗ T .
Thus the conclusions of this theorem hold for each function f k .
The Monotone Convergence Theorem implies that
\[ \int_Y f(x, y) \, d\nu(y) = \lim_{k \to \infty} \int_Y f_k(x, y) \, d\nu(y) \]
for every x ∈ X. Thus the function $x \mapsto \int_Y f(x, y) \, d\nu(y)$ is the pointwise limit on
X of a sequence of S -measurable functions. Hence (a) holds, as does (b) for similar
reasons.
The last line in the statement of this theorem holds for each f k . The Monotone
Convergence Theorem now implies that the last line in the statement of this theorem
holds for f , completing the proof.
See Exercise 1 in this section for an example (with finite measures) showing that
Tonelli’s Theorem can fail without the hypothesis that the function being integrated
is nonnegative. The next example shows that the hypothesis of σ-finite measures also
cannot be eliminated.
5.30 Example Tonelli’s Theorem can fail without the hypothesis of σ-finite
Suppose B is the σ-algebra of Borel subsets of [0, 1], λ is Lebesgue measure on
([0, 1], B), and µ is counting measure on ([0, 1], B). Let D denote the diagonal of
[0, 1] × [0, 1]; in other words,
D = {( x, x ) : x ∈ [0, 1]}.
Then
\[ \int_{[0,1]} \int_{[0,1]} \chi_D(x, y) \, d\mu(y) \, d\lambda(x) = \int_{[0,1]} 1 \, d\lambda = 1, \]
but
\[ \int_{[0,1]} \int_{[0,1]} \chi_D(x, y) \, d\lambda(x) \, d\mu(y) = \int_{[0,1]} 0 \, d\mu = 0. \]
The following useful corollary of Tonelli’s Theorem states that we can switch the
order of summation in a double-sum of nonnegative numbers. Exercise 2 asks you
to find a double-sum of real numbers in which switching the order of summation
changes the value of the double sum.
Fubini’s Theorem
Our next goal is Fubini’s Theorem, which has the same conclusions as Tonelli’s
Theorem but has a different hypothesis. Tonelli’s Theorem requires that the function
being integrated is nonnegative. Fubini’s Theorem does not require nonnegativity but
instead requires that the absolute value of the function being integrated has a finite
integral. When using Fubini’s Theorem, you will usually first use Tonelli’s Theorem
as applied to | f | to verify the hypothesis of Fubini’s Theorem.
Historically, Fubini’s Theorem (proved in 1907) came before Tonelli’s Theorem
(proved in 1909). However, presenting Tonelli’s Theorem first, as is done here, seems
to lead to simpler proofs and better understanding. The hard work here went into
proving Tonelli’s Theorem; thus our proof of Fubini’s Theorem consists mainly of
bookkeeping details.
As you will see in the proof of Fubini’s Theorem, the function in 5.32(a) is defined
only for almost every x ∈ X and the function in 5.32(b) is defined only for almost
every y ∈ Y. For convenience, you can think of these functions as equaling 0 on the
sets of measure 0 on which they are otherwise undefined.
and
\[ \int_{X \times Y} f \, d(\mu \times \nu) = \int_X \int_Y f(x, y) \, d\nu(y) \, d\mu(x) = \int_Y \int_X f(x, y) \, d\mu(x) \, d\nu(y). \]
Proof Tonelli’s Theorem (5.28) applied to the nonnegative function | f | implies that
$x \mapsto \int_Y |f(x, y)| \, d\nu(y)$ is an S -measurable function on X. Hence
\[ \Bigl\{ x \in X : \int_Y |f(x, y)| \, d\nu(y) = \infty \Bigr\} \in \mathcal{S}. \]
are S -measurable functions from X to [0, ∞]. Because $f^+ \le |f|$ and $f^- \le |f|$, the
sets $\{ x \in X : \int_Y f^+(x, y) \, d\nu(y) = \infty \}$ and $\{ x \in X : \int_Y f^-(x, y) \, d\nu(y) = \infty \}$
have µ-measure 0. Thus the intersection of these two sets, which is the set of x ∈ X
such that $\int_Y f(x, y) \, d\nu(y)$ is not defined, also has µ-measure 0.
Subtracting the second function in 5.33 from the first function in 5.33, we see that
the function that we define to be 0 for those x ∈ X where we encounter ∞ − ∞ (a
set of µ-measure 0, as noted above) and that equals $\int_Y f(x, y) \, d\nu(y)$ elsewhere is an
S -measurable function on X.
Now
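\begin{align*}
\int_{X \times Y} f \, d(\mu \times \nu) &= \int_{X \times Y} f^+ \, d(\mu \times \nu) - \int_{X \times Y} f^- \, d(\mu \times \nu) \\
&= \int_X \int_Y f^+(x, y) \, d\nu(y) \, d\mu(x) - \int_X \int_Y f^-(x, y) \, d\nu(y) \, d\mu(x) \\
&= \int_X \int_Y \bigl( f^+(x, y) - f^-(x, y) \bigr) \, d\nu(y) \, d\mu(x) \\
&= \int_X \int_Y f(x, y) \, d\nu(y) \, d\mu(x),
\end{align*}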
where the first line above comes from the definition of the integral of a function that
is not nonnegative (note that neither of the two terms on the right side of the first
line equals ∞ because $\int_{X \times Y} |f| \, d(\mu \times \nu) < \infty$) and the second line comes from applying
Tonelli’s Theorem to f + and f − .
We have now proved all aspects of Fubini’s Theorem that involve integrating first
over Y. The same procedure provides proofs for the aspects of Fubini’s theorem that
involve integrating first over X.
Suppose X is a set and f : X → [0, ∞] is a function. Then the region under the
graph of f , denoted U f , is defined by
The first equality in the result below can be thought of as recovering Riemann’s
conception of the integral as the area under the graph (although now in a much more
general context with arbitrary σ-finite measures). The second equality in the result
below can be thought of as reinforcing Lebesgue’s conception of computing the area
under a curve by integrating in the direction perpendicular to Riemann’s.
Markov’s Inequality (4.1) implies that if f and µ are as in the result above, then
\[ \mu(\{x \in X : f(x) > t\}) \le \frac{\int_X f \, d\mu}{t} \]
for all t > 0. Thus if $\int_X f \, d\mu < \infty$, then the result above should be considered to be
EXERCISES 5B
1 (a) Let λ denote Lebesgue measure on [0, 1]. Show that
\[ \int_{[0,1]} \int_{[0,1]} \frac{x^2 - y^2}{(x^2 + y^2)^2} \, d\lambda(y) \, d\lambda(x) = \frac{\pi}{4} \]
and
\[ \int_{[0,1]} \int_{[0,1]} \frac{x^2 - y^2}{(x^2 + y^2)^2} \, d\lambda(x) \, d\lambda(y) = -\frac{\pi}{4}. \]
(b) Explain why (a) violates neither Tonelli’s Theorem nor Fubini’s Theorem.
2 (a) Give an example of a doubly-indexed collection { xm,n : m, n ∈ Z+ } of
real numbers such that
\[ \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} x_{m,n} = 0 \quad \text{and} \quad \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} x_{m,n} = \infty. \]
(b) Explain why (a) violates neither Tonelli’s Theorem nor Fubini’s Theorem.
3 Suppose ( X, S) is a measurable space and f : X → [0, ∞] is a function. Let B
denote the σ-algebra of Borel subsets of (0, ∞). Prove that U f ∈ S ⊗ B if and
only if f is an S -measurable function.
4 Suppose ( X, S) is a measurable space and f : X → R is a function. Let
graph( f ) ⊂ X × R denote the graph of f :
graph( f ) = {( x, f ( x )) : x ∈ X }.
5C Lebesgue Integration on Rn
Throughout this section, assume that m and n are positive integers. Thus, for example,
5.36 should include the hypothesis that m and n are positive integers, but theorems
and definitions become easier to state without explicitly repeating this hypothesis.
Borel Subsets of Rn
We begin with a quick review of notation and key concepts concerning Rn . See
Sections D and E of the Appendix for more details and results about Rn.
Recall that Rn is the set of all n-tuples of real numbers:
Rn = {( x1 , . . . , xn ) : x1 , . . . , xn ∈ R}.
For x ∈ Rn and δ > 0, the open cube B( x, δ) with side length 2δ is defined by
B(x, δ) = {y ∈ R^n : ‖y − x‖_∞ < δ}.
We can now prove that the product of two open sets is an open set.
When n = 1, the definition below of a Borel subset of R1 agrees with our previous
definition (2.28) of a Borel subset of R.
However, there are only countably many distinct cubes whose center has all rational
coordinates and whose side length is rational. Thus G is the countable union of open
cubes.
The next result tells us that the collections of Borel sets from various dimensions
fit together nicely.
5.39 The product of the Borel subsets of Rm and the Borel subsets of Rn
Bm ⊗ Bn = Bm+n .
Proof Suppose E is an open cube in R^{m+n}. Thus E is the product of an open cube
in R^m and an open cube in R^n. Hence E ∈ Bm ⊗ Bn . Thus the smallest σ-algebra
containing all the open cubes in Rm+n is contained in Bm ⊗ Bn . Now 5.38(b) implies
that Bm+n ⊂ Bm ⊗ Bn .
To prove a set inclusion in the other direction, temporarily fix an open set G in
Rn . Let
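E = {A ⊂ R^m : A × G ∈ B_{m+n}}.
Then E contains every open subset of R^m (as follows from 5.36). Also, E is closed
under countable unions because
\[ \Bigl( \bigcup_{k=1}^\infty A_k \Bigr) \times G = \bigcup_{k=1}^\infty (A_k \times G), \]
and E is closed under complementation because (R^m \ A) × G = (R^m × G) \ (A × G),
where R^m × G is open and hence is in B_{m+n}.
Thus E is a σ-algebra on R^m that contains all open subsets of R^m, which implies that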
Bm ⊂ E . In other words, we have proved that if A ∈ Bm and G is an open subset of
Rn , then A × G ∈ Bm+n .
Now temporarily fix a Borel subset A of Rm . Let
F = {B ⊂ R^n : A × B ∈ B_{m+n}}.
The conclusion of the previous paragraph shows that F contains every open subset of
Rn . As in the previous paragraph, we also see that F is a σ-algebra. Hence Bn ⊂ F .
In other words, we have proved that if A ∈ Bm and B ∈ Bn , then A × B ∈ Bm+n .
Thus Bm ⊗ Bn ⊂ Bm+n , completing the proof.
Lebesgue Measure on Rn
λ_n = λ_{n−1} × λ_1,
Proof Let
E = { E ∈ Bn : tE ∈ Bn }.
Then E contains every open subset of Rn (because if E is open in Rn then tE is open
in Rn ). Also, E is closed under complementation and countable unions because
\[ t(\mathbf{R}^n \setminus E) = \mathbf{R}^n \setminus (tE) \quad\text{and}\quad t\Bigl( \bigcup_{k=1}^\infty E_k \Bigr) = \bigcup_{k=1}^\infty (tE_k). \]
Hence E is a σ-algebra on Rn containing the open subsets of Rn. Thus E = Bn . In
other words, tE ∈ Bn for all E ∈ Bn .
To prove λ_n(tE) = t^n λ_n(E), first consider the case n = 1. Lebesgue measure on
R is a restriction of outer measure. The outer measure of a set is determined by the
sum of the lengths of countable collections of intervals whose union contains the set.
Multiplying the set by t corresponds to multiplying each such interval by t, which
multiplies the length of each such interval by t. In other words, λ1 (tE) = tλ1 ( E).
Now assume n > 1. We will use induction on n and assume that the desired result
holds for n − 1. If A ∈ Bn−1 and B ∈ B1 , then
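\[
\begin{aligned}
\lambda_n\bigl(t(A \times B)\bigr) &= \lambda_n\bigl((tA) \times (tB)\bigr) \\
&= \lambda_{n-1}(tA) \cdot \lambda_1(tB) \\
&= t^{n-1} \lambda_{n-1}(A) \cdot t \lambda_1(B) \\
&= t^n \lambda_n(A \times B),
\end{aligned}
\tag{5.42}
\]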
giving the desired result for A × B.
For m ∈ Z+, let Cm be the open cube in Rn centered at the origin and with side
length m. Let
E_m = {E ∈ B_n : E ⊂ C_m and λ_n(tE) = t^n λ_n(E)}.
From 5.42 and using 5.13(b), we see that finite unions of measurable rectangles
contained in Cm are in Em . You should verify that Em is closed under countable
increasing unions (use 2.58) and countable decreasing intersections (use 2.59, whose
finite measure condition holds because we are working inside Cm ). From 5.13 and
the Monotone Class Theorem (5.17), we conclude that Em is the σ-algebra on Cm
consisting of Borel subsets of C_m. Thus λ_n(tE) = t^n λ_n(E) for all E ∈ B_n such that
E ⊂ Cm .
Now suppose E ∈ Bn . Then 2.58 implies that
\[ \lambda_n(tE) = \lim_{m \to \infty} \lambda_n\bigl(t(E \cap C_m)\bigr) = t^n \lim_{m \to \infty} \lambda_n(E \cap C_m) = t^n \lambda_n(E), \]
as desired.
B_n = {(x_1, …, x_n) ∈ R^n : x_1^2 + ⋯ + x_n^2 < 1}.
The open unit ball B_n is open in R^n (as you should verify) and thus is in the
collection ℬ_n of Borel sets.
Proof Because λ1 (B1 ) = 2 and λ2 (B2 ) = π, the claimed formula is correct when
n = 1 and when n = 2.
Now assume that n > 2. We will use induction on n, assuming that the claimed for-
mula is true for smaller values of n. Think of Rn = R2 × Rn−2 and λn = λ2 × λn−2 .
Then
\[ \lambda_n(\mathbf{B}_n) = \int_{\mathbf{R}^2} \int_{\mathbf{R}^{n-2}} \chi_{\mathbf{B}_n}(x, y) \, dy \, dx. \tag{5.45} \]
To evaluate this integral, switch to the usual polar coordinates that you learned about
in calculus (dλ2 = r dr dθ), getting
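\[
\begin{aligned}
\lambda_n(\mathbf{B}_n) &= \lambda_{n-2}(\mathbf{B}_{n-2}) \int_{-\pi}^{\pi} \int_0^1 (1 - r^2)^{(n-2)/2} \, r \, dr \, d\theta \\
&= \frac{2\pi}{n} \lambda_{n-2}(\mathbf{B}_{n-2}).
\end{aligned}
\]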
The last equation and the induction hypothesis give the desired result.
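The recursion obtained in the proof can be checked numerically. The following is a minimal sketch, not part of the original text: it iterates λ_n(B_n) = (2π/n) λ_{n−2}(B_{n−2}) from the base cases λ_1(B_1) = 2 and λ_2(B_2) = π, and compares the result with the closed form π^{n/2}/Γ(n/2 + 1) that appears in the exercises of this section. The function names are hypothetical.

```python
import math


def ball_volume_recursive(n: int) -> float:
    """Volume of the open unit ball in R^n via V_n = (2*pi/n) * V_{n-2}."""
    if n == 1:
        return 2.0       # length of the interval (-1, 1)
    if n == 2:
        return math.pi   # area of the unit disk
    return (2.0 * math.pi / n) * ball_volume_recursive(n - 2)


def ball_volume_closed_form(n: int) -> float:
    """Closed form pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)


for n in range(1, 11):
    rec = ball_volume_recursive(n)
    closed = ball_volume_closed_form(n)
    # The two expressions should agree up to floating-point rounding.
    assert math.isclose(rec, closed, rel_tol=1e-12)
    print(n, rec)
```

The agreement follows from Γ(n/2 + 1) = (n/2) Γ((n − 2)/2 + 1), which is the same two-step recursion in disguise.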
\[ (D_1 f)(x, y) = \lim_{t \to 0} \frac{f(x + t, y) - f(x, y)}{t} \]
and
\[ (D_2 f)(x, y) = \lim_{t \to 0} \frac{f(x, y + t) - f(x, y)}{t} \]
if these limits exist.
Using the notation for the cross section of a function (see 5.7), we could write the
definitions of D1 and D2 in the following form:
(D_1 f)(x, y) = ([f]^y)′(x) and (D_2 f)(x, y) = ([f]_x)′(y).
(D_1 f)(x, y) = y x^{y−1} and (D_2 f)(x, y) = x^y ln x,
as you should verify. Taking partial derivatives of those partial derivatives, we have
(D_2(D_1 f))(x, y) = x^{y−1} + y x^{y−1} ln x
and
(D_1(D_2 f))(x, y) = x^{y−1} + y x^{y−1} ln x,
as you should also verify. The last two equations show that D_1(D_2 f) = D_2(D_1 f)
as functions on G.
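As a rough numerical sanity check of the formulas above, the sketch below approximates both mixed partial derivatives by nested central differences. It assumes f(x, y) = x^y, which is consistent with the derivative formulas displayed above, and evaluates at a point with x > 0. The helper names, the step size, and the tolerance are illustrative choices, not part of the text.

```python
import math


def f(x: float, y: float) -> float:
    # Assumed example function; its partials match the formulas in the text.
    return x ** y


def d1(g, h: float = 1e-3):
    """Central-difference approximation to the partial derivative in x."""
    return lambda x, y: (g(x + h, y) - g(x - h, y)) / (2 * h)


def d2(g, h: float = 1e-3):
    """Central-difference approximation to the partial derivative in y."""
    return lambda x, y: (g(x, y + h) - g(x, y - h)) / (2 * h)


def mixed_exact(x: float, y: float) -> float:
    """The common value x^(y-1) + y x^(y-1) ln x of the two mixed partials."""
    return x ** (y - 1) + y * x ** (y - 1) * math.log(x)


x0, y0 = 2.0, 3.0
d1_d2 = d1(d2(f))(x0, y0)
d2_d1 = d2(d1(f))(x0, y0)
exact = mixed_exact(x0, y0)

# Both orders of differentiation agree with the exact formula up to
# discretization error.
assert abs(d1_d2 - exact) < 1e-3
assert abs(d2_d1 - exact) < 1e-3
print(d1_d2, d2_d1, exact)
```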
In the example above, the two mixed partial derivatives turn out to be equal to each
other, even though the intermediate results look quite different. The next result shows
that the behavior in the example above is typical rather than a coincidence.
Some proofs of the result below do not use Fubini’s Theorem. However, Fubini’s
Theorem leads to the clean proof below.
There exist versions of the result below with slightly different hypotheses, but the
hypotheses used here are usually easy to verify in practice. Although the continuity
hypotheses used here can be slightly weakened, they cannot be eliminated, as shown
by Exercise 14 in this section.
The integrals in the proof below make sense because continuous real-valued
functions on R2 are measurable (because the inverse image of each open set is open;
see Exercise 18 in Section E of the Appendix) and continuous real-valued functions
on closed bounded subsets of R2 are bounded (see 0.80 in the Appendix).
D1 ( D2 f ) = D2 ( D1 f )
on G.
= f ( a + δ, b + δ) − f ( a + δ, b) − f ( a, b + δ) + f ( a, b),
where the first equality comes from Fubini’s Theorem (5.32) and the second and third
equalities come from the Fundamental Theorem of Calculus.
A similar calculation of ∫_{S_δ} D_2(D_1 f) dλ_2 yields the same result. Thus
\[ \int_{S_\delta} \bigl[ D_1(D_2 f) - D_2(D_1 f) \bigr] \, d\lambda_2 = 0 \]
for all δ such that S_δ ⊂ G. If (D_1(D_2 f))(a, b) > (D_2(D_1 f))(a, b), then by
the continuity of D_1(D_2 f) and D_2(D_1 f), the integrand in the equation above is
positive on S_δ for δ sufficiently small, which contradicts the integral above equaling 0.
Similarly, the inequality (D_1(D_2 f))(a, b) < (D_2(D_1 f))(a, b) also contradicts
the equation above for small δ. Thus we conclude that (D_1(D_2 f))(a, b) =
(D_2(D_1 f))(a, b), as desired.
EXERCISES 5C
1 Show that a set G ⊂ Rn is open in Rn if and only if for each (b1 , . . . , bn ) ∈ G,
there exists r > 0 such that
\[ \Bigl\{ (a_1, \dots, a_n) \in \mathbf{R}^n : \sqrt{(a_1 - b_1)^2 + \cdots + (a_n - b_n)^2} < r \Bigr\} \subset G. \]
{ A × B × C : A ∈ S , B ∈ T , C ∈ U }.
S ⊗ T ⊗ U = (S ⊗ T ) ⊗ U = S ⊗ (T ⊗ U ).
\[ \int_{\mathbf{R}^n} f_t \, d\lambda_n = \frac{1}{t^n} \int_{\mathbf{R}^n} f \, d\lambda_n. \]
\[ \lambda_n(\mathbf{B}_n) = \frac{\pi^{n/2}}{\Gamma\bigl(\frac{n}{2} + 1\bigr)} \]
Chapter 6
Banach Spaces
We begin this chapter with a quick review of the essentials of metric spaces. Then
we extend our results on measurable functions and integration to complex-valued
functions. After that, we rapidly review the framework of vector spaces, which will
allow us to consider natural collections of measurable functions that are closed under
addition and scalar multiplication.
Normed vector spaces and Banach spaces, which are introduced in the third
section of this chapter, play a hugely important role in modern analysis. Most interest
focuses on linear maps on these vector spaces. Key results about linear maps that we
will develop in this chapter include the Hahn–Banach Theorem, the Open Mapping
Theorem, the Closed Graph Theorem, and the Principle of Uniform Boundedness.
Market square in Lwów, a city that has been in several countries because of changing
international boundaries. Before World War I, Lwów was in Austria-Hungary.
During the period between World War I and World War II, Lwów was in Poland.
During this time, mathematicians in Lwów, particularly Stefan Banach (1892–1945)
and his colleagues, developed the basic results of modern functional analysis.
After World War II, Lwów was in the USSR. Now Lwów is in Ukraine and is called Lviv.
6A Metric Spaces
Open Sets, Closed Sets, and Continuity
Much of analysis takes place in the context of a metric space, which is a set with a
notion of distance that satisfies certain properties. The properties we would like a
distance function to have are captured in the next definition, where you should think
of d( f , g) as measuring the distance between f and g.
Specifically, we would like the distance between two elements of our metric space
to be a nonnegative number that is 0 if and only if the two elements are the same. We
would like the distance between two elements not to depend on the order in which
we list them. Finally, we would like a triangle inequality (the last bullet point below),
which states that the distance between two elements is less than or equal to the sum
of the distances obtained when we insert an intermediate element.
Now we are ready for the formal definition.
Abusing terminology, many books (including this one) include phrases such as
"suppose V is a metric space" without mentioning the metric d. When that happens,
you should assume that a metric d lurks nearby, even if it is not explicitly named.
Our next definition declares a subset of a metric space to be open if every element
in the subset is the center of an open ball that is contained in the set.
For example, each closed ball \overline{B}(f, r) in a metric space is closed, as you are asked
to prove in Exercise 3.
Now we define the closure of a subset of a metric space.
Limits in a metric space are defined by reducing to the context of real numbers,
where limits have already been defined.
6.9 Closure
The definition of continuity that follows uses the same pattern as the definition for
a function from a subset of Rm to Rn (see 0.75 in the Appendix).
The next result gives equivalent conditions for continuity. Recall that T^{−1}(E) is
called the inverse image of E and is defined to be {f ∈ V : T(f) ∈ E}. Thus the
equivalence of (a) and (c) below could be restated as saying that a function is
continuous if and only if the inverse image of every open set is open. The equivalence
of (a) and (d) below could be restated as saying that a function is continuous if
and only if the inverse image of every closed set is closed.
(a) T is continuous.
(b) lim_{k→∞} f_k = f in V implies lim_{k→∞} T(f_k) = T(f) in W.
(c) T^{−1}(G) is an open subset of V for every open set G ⊂ W.
(d) T^{−1}(F) is a closed subset of V for every closed set F ⊂ W.
Proof We first prove that (b) implies (d). Suppose (b) holds. Suppose F is a closed
subset of W. We need to prove that T^{−1}(F) is closed. To do this, suppose f_1, f_2, …
is a sequence in T^{−1}(F) and lim_{k→∞} f_k = f for some f ∈ V. Because (b) holds, we
know that lim_{k→∞} T(f_k) = T(f). Because f_k ∈ T^{−1}(F) for each k ∈ Z+, we know
that T(f_k) ∈ F for each k ∈ Z+. Because F is closed, this implies that T(f) ∈ F.
Thus f ∈ T^{−1}(F), which implies that T^{−1}(F) is closed [by 6.9(e)], completing the
proof that (b) implies (d).
The proof that (c) and (d) are equivalent follows from the equation
T^{−1}(W \ E) = V \ T^{−1}(E)
for every E ⊂ W and the fact that a set is open if and only if its complement (in the
appropriate metric space) is closed.
The proofs of the remaining parts of this result are left as an exercise that should
help strengthen understanding of these concepts.
Proof Suppose lim_{k→∞} f_k = f in a metric space (V, d). Suppose ε > 0. Then
there exists n ∈ Z+ such that d(f_k, f) < ε/2 for all k ≥ n. If j, k ∈ Z+ are such that
j ≥ n and k ≥ n, then
d(f_j, f_k) ≤ d(f_j, f) + d(f, f_k) < ε/2 + ε/2 = ε.
Thus f 1 , f 2 , . . . is a Cauchy sequence, completing the proof.
Metric spaces that satisfy the converse of the result above have a special name.
6.15 Example
• All five of the metric spaces in Example 6.2 are complete, as you should verify.
• The metric space Q, with metric defined by d( x, y) = | x − y|, is not complete.
To see this, for k ∈ Z+ let
\[ x_k = \frac{1}{10^{1!}} + \frac{1}{10^{2!}} + \cdots + \frac{1}{10^{k!}}. \]
If j < k, then
\[ |x_k - x_j| = \frac{1}{10^{(j+1)!}} + \cdots + \frac{1}{10^{k!}} < \frac{2}{10^{(j+1)!}}. \]
Thus x1 , x2 , . . . is a Cauchy sequence in Q. However, x1 , x2 , . . . does not con-
verge to an element of Q because the limit of this sequence would have a decimal
expansion 0.110001000000000000000001 . . . that is neither a terminating deci-
mal nor a repeating decimal. Thus Q is not a complete metric space.
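The estimate used in the second bullet point can be checked with exact rational arithmetic. The following is a minimal sketch (the function name x is an illustrative choice) that builds the partial sums x_k as elements of Q and verifies the bound |x_k − x_j| < 2/10^{(j+1)!} for a few small indices.

```python
from fractions import Fraction
from math import factorial


def x(k: int) -> Fraction:
    """x_k = 10^(-1!) + 10^(-2!) + ... + 10^(-k!), computed exactly in Q."""
    terms = (Fraction(1, 10 ** factorial(i)) for i in range(1, k + 1))
    return sum(terms, Fraction(0))


# Check the Cauchy estimate |x_k - x_j| < 2 / 10^((j+1)!) for j < k.
for j in range(1, 5):
    for k in range(j + 1, 6):
        bound = Fraction(2, 10 ** factorial(j + 1))
        assert abs(x(k) - x(j)) < bound

# Each x_k is rational; only the limit fails to be rational.
print(x(3))
```

Every x_k has a terminating decimal expansion, so the failure of completeness comes entirely from the limit, not from any individual term of the sequence.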
EXERCISES 6A
1 Verify that each of the claimed metrics in Example 6.2 is indeed a metric.
2 Prove that every finite subset of a metric space is closed.
3 Prove that every closed ball in a metric space is closed.
4 Suppose V is a metric space.
(a) Prove that the union of each collection of open subsets of V is an open
subset of V.
(b) Prove that the intersection of each finite collection of open subsets of V is
an open subset of V.
13 Prove the parts of 6.11 that were not proved in the text.
16 Suppose (U, d) is a metric space. Let W denote the set of all Cauchy sequences
of elements of U.
(a) For (f_1, f_2, …) and (g_1, g_2, …) in W, define (f_1, f_2, …) ≡ (g_1, g_2, …)
to mean that
lim_{k→∞} d(f_k, g_k) = 0.
Show that ≡ is an equivalence relation on W.
(b) Let V denote the set of equivalence classes of elements of W under the
equivalence relation above. For (f_1, f_2, …) ∈ W, let (f_1, f_2, …)ˆ denote
the equivalence class of (f_1, f_2, …). Define d_V : V × V → [0, ∞) by
d_V((f_1, f_2, …)ˆ, (g_1, g_2, …)ˆ) = lim_{k→∞} d(f_k, g_k).
for all f , g ∈ U.
(e) Explain why (d) shows that every metric space is a subset of some complete
metric space.
6B Vector Spaces
Integration of Complex-Valued Functions
Complex numbers were invented so that we can take square roots of negative numbers.
The idea is to assume we have a square root of −1, denoted i, that obeys the usual
rules of arithmetic. Here are the formal definitions:
C = { a + bi : a, b ∈ R}.
( a + bi ) + (c + di ) = ( a + c) + (b + d)i,
( a + bi )(c + di ) = ( ac − bd) + ( ad + bc)i;
here a, b, c, d ∈ R.
If a ∈ R, then we identify a + 0i with a. Thus we think of R as a subset of
C. We also usually write 0 + bi as bi, and we usually write 0 + 1i as i. You should
verify that i^2 = −1.
The symbol i was first used to denote √−1 by Leonhard Euler in 1777.
With the definitions as above, C satisfies the usual rules of arithmetic. Specifically,
with addition and multiplication defined as above, C is a field, as you should verify
(see 0.1 in the Appendix for the definition of field). Thus subtraction and division of
complex numbers are defined as in any field (see 0.4).
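The two defining formulas can be turned directly into code. The sketch below represents a + bi as the pair (a, b), which is an illustrative encoding rather than anything prescribed by the text, and checks that i^2 = −1 and that the product formula agrees with Python's built-in complex type on one sample.

```python
from typing import Tuple

Complex = Tuple[float, float]  # the pair (a, b) stands for a + bi


def add(z: Complex, w: Complex) -> Complex:
    """(a + bi) + (c + di) = (a + c) + (b + d)i."""
    (a, b), (c, d) = z, w
    return (a + c, b + d)


def mul(z: Complex, w: Complex) -> Complex:
    """(a + bi)(c + di) = (ac - bd) + (ad + bc)i."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)


i = (0.0, 1.0)
assert mul(i, i) == (-1.0, 0.0)  # i^2 = -1

# Agreement with the built-in complex type on a sample pair.
z, w = (1.0, 2.0), (3.0, -4.0)
assert complex(*mul(z, w)) == complex(*z) * complex(*w)
print(add(z, w), mul(z, w))
```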
The field C cannot be made into an ordered field (see Exercise 13 in Section A
of the Appendix). However, the useful concept of an absolute value can still be
defined on C.
Much of this section may be review for many readers.
For b a real number, the previous definition of |b| (see 0.9) is consistent with the
new definition just given of |b| with b thought of as a complex number. Note that if
z1 , z2 , . . . is a sequence of complex numbers and L ∈ C, then
lim_{k→∞} z_k = L ⇐⇒ lim_{k→∞} Re z_k = Re L and lim_{k→∞} Im z_k = Im L.
We will reduce questions concerning measurability and integration of a complex-
valued function to the corresponding questions about the real and imaginary parts of
the function. We begin this process with the following definition.
See Exercise 4 in this section for two natural conditions that are equivalent to
measurability for complex-valued functions.
We will make frequent use of the following result. See Exercise 5 in this section
for algebraic combinations of complex-valued measurable functions.
Proof The functions (Re f)^2 and (Im f)^2 are S-measurable because the square
of an S-measurable function is measurable (by Example 2.44). Thus the function
(Re f)^2 + (Im f)^2 is S-measurable (because the sum of two S-measurable functions
is S-measurable by 2.45). Now ((Re f)^2 + (Im f)^2)^{p/2} is S-measurable because it
is the composition of a continuous function on [0, ∞) and an S-measurable function
(see 2.43 and 2.40). In other words, |f|^p is an S-measurable function.
Then ∫ f dµ is defined by
\[ \int f \, d\mu = \int (\operatorname{Re} f) \, d\mu + i \int (\operatorname{Im} f) \, d\mu. \]
You can easily show that if f, g : X → C are S-measurable functions such that
∫ |f| dµ < ∞ and ∫ |g| dµ < ∞, then
\[ \int (f + g) \, d\mu = \int f \, d\mu + \int g \, d\mu. \]
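Proof The result clearly holds if ∫ f dµ = 0. Thus assume that ∫ f dµ ≠ 0.
Let
\[ \alpha = \frac{\bigl| \int f \, d\mu \bigr|}{\int f \, d\mu}. \]
Then
\[
\begin{aligned}
\Bigl| \int f \, d\mu \Bigr| &= \alpha \int f \, d\mu = \int \alpha f \, d\mu \\
&= \int \operatorname{Re}(\alpha f) \, d\mu + i \int \operatorname{Im}(\alpha f) \, d\mu \\
&= \int \operatorname{Re}(\alpha f) \, d\mu \\
&\le \int |\alpha f| \, d\mu \\
&= \int |f| \, d\mu,
\end{aligned}
\]
where the second equality holds by Exercise 7, the fourth equality holds because
|∫ f dµ| ∈ R, the inequality on the fourth line holds because Re z ≤ |z| for every
complex number z, and the equality in the last line holds because |α| = 1.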
Because of the result above, the Bounded Convergence Theorem (3.26) and the
Dominated Convergence Theorem (3.30) hold if the functions f 1 , f 2 , . . . and f in the
statements of those theorems are allowed to be complex valued.
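The chain of equalities in the proof above can be traced on a small example. The sketch below is an illustration under stated assumptions: the measure is a hypothetical finite measure on a three-point set (given by point masses), f is a hypothetical complex-valued function with ∫ f dµ ≠ 0, and the helper names are invented for this example.

```python
import math


def integral(f, weights):
    """Integral of f for the measure giving mass weights[x] to each point x."""
    return sum(f[x] * w for x, w in weights.items())


def three_quantities(f, weights):
    """Return (|∫f|, ∫Re(αf), ∫|f|) with α = |∫f| / ∫f; assumes ∫f is nonzero."""
    total = integral(f, weights)
    alpha = abs(total) / total  # |alpha| = 1 and alpha * total = |total|
    re_alpha_f = {x: (alpha * v).real for x, v in f.items()}
    abs_f = {x: abs(v) for x, v in f.items()}
    return abs(total), integral(re_alpha_f, weights), integral(abs_f, weights)


weights = {"a": 0.5, "b": 2.0, "c": 1.5}      # hypothetical point masses
f = {"a": 1 + 2j, "b": -3 + 1j, "c": 2 - 1j}  # hypothetical function values

lhs, middle, rhs = three_quantities(f, weights)
assert math.isclose(lhs, middle)  # |∫f| = ∫Re(αf)
assert middle <= rhs              # ∫Re(αf) ≤ ∫|f|
print(lhs, middle, rhs)
```

Multiplying by α rotates ∫ f dµ onto the nonnegative real axis, which is why the imaginary part of ∫ αf dµ drops out in the proof.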
Suppose w, z ∈ C. Then
product of z and \overline{z}
z \overline{z} = |z|^2;
sum and difference of z and \overline{z}
z + \overline{z} = 2 Re z and z − \overline{z} = 2(Im z)i;
additivity and multiplicativity of complex conjugate
\overline{w + z} = \overline{w} + \overline{z} and \overline{wz} = \overline{w} \overline{z};
complex conjugate of complex conjugate
\overline{\overline{z}} = z;
absolute value of complex conjugate
|\overline{z}| = |z|;
integral of complex conjugate of a function
∫ \overline{f} dµ = \overline{∫ f dµ} for every measure µ and every f ∈ L^1(µ).
as desired.
The straightforward proofs of the remaining items are left to the reader.
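The algebraic items in the list above (all but the integral identity) can be spot-checked with Python's built-in complex type. This is only a numerical illustration on sample values, not a proof; the function name is hypothetical, and cmath.isclose is used because floating-point products are subject to rounding.

```python
import cmath
import math


def check_conjugate_identities(w: complex, z: complex) -> bool:
    """Return True if the listed conjugate identities hold (up to rounding)."""
    zc, wc = z.conjugate(), w.conjugate()
    checks = [
        cmath.isclose(z * zc, abs(z) ** 2),           # product of z and conj z
        cmath.isclose(z + zc, 2 * z.real),            # sum
        cmath.isclose(z - zc, 2 * z.imag * 1j),       # difference
        cmath.isclose((w + z).conjugate(), wc + zc),  # additivity
        cmath.isclose((w * z).conjugate(), wc * zc),  # multiplicativity
        zc.conjugate() == z,                          # conj of conj
        math.isclose(abs(zc), abs(z)),                # absolute value
    ]
    return all(checks)


assert check_conjugate_identities(3 - 4j, 1 + 2j)
print(check_conjugate_identities(0.5 + 0j, -2.5j))
```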
6.25 Definition F
commutativity
f + g = g + f for all f , g ∈ V;
associativity
( f + g) + h = f + ( g + h) and (αβ) f = α( β f ) for all f , g, h ∈ V and α, β ∈ F;
additive identity
there exists an element 0 ∈ V such that f + 0 = f for all f ∈ V;
additive inverse
for every f ∈ V, there exists g ∈ V such that f + g = 0;
multiplicative identity
1 f = f for all f ∈ V;
distributive properties
α( f + g) = α f + αg and (α + β) f = α f + β f for all α, β ∈ F and f , g ∈ V.
Most vector spaces that you will encounter are subsets of the vector space F^X
presented in the next example.
for x ∈ X. Then, as you should verify, F^X is a vector space; the additive identity in
this vector space is the function 0 ∈ F^X defined by 0(x) = 0 for all x ∈ X.
6.29 Example F^n; F^{Z+}
Special case of the previous example: if n ∈ Z+ and X = {1, …, n}, then F^X is
the familiar space R^n or C^n, depending upon whether F = R or F = C.
Another special case: F^{Z+} is the vector space of all sequences of real numbers or
complex numbers, again depending upon whether F = R or F = C.
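For a finite set X, elements of F^X can be modeled as dictionaries keyed by X. The sketch below assumes the usual pointwise operations (f + g)(x) = f(x) + g(x) and (αf)(x) = α f(x), and spot-checks a few vector space axioms on sample data with X = {1, 2, 3} and F = R. The helper names are illustrative.

```python
def add(f: dict, g: dict) -> dict:
    """Pointwise sum: (f + g)(x) = f(x) + g(x)."""
    return {x: f[x] + g[x] for x in f}


def scale(alpha: float, f: dict) -> dict:
    """Pointwise scalar multiple: (alpha f)(x) = alpha * f(x)."""
    return {x: alpha * f[x] for x in f}


X = (1, 2, 3)
zero = {x: 0.0 for x in X}      # the additive identity: 0(x) = 0
f = {1: 2.0, 2: -1.0, 3: 0.5}   # an element of R^X, i.e. a point of R^3
g = {1: 1.0, 2: 4.0, 3: -0.5}

assert add(f, zero) == f                                           # additive identity
assert add(f, g) == add(g, f)                                      # commutativity
assert add(f, scale(-1.0, f)) == zero                              # additive inverse
assert scale(2.0, add(f, g)) == add(scale(2.0, f), scale(2.0, g))  # distributivity
print(add(f, g))
```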
The next result gives the easiest way to check whether a subset of a vector space
is a subspace.
EXERCISES 6B
1 Suppose z ∈ C. Prove that
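\[ \max\{|\operatorname{Re} z|, |\operatorname{Im} z|\} \le |z| \le \sqrt{2} \, \max\{|\operatorname{Re} z|, |\operatorname{Im} z|\}. \]
2 Suppose z ∈ C. Prove that
\[ \frac{|\operatorname{Re} z| + |\operatorname{Im} z|}{\sqrt{2}} \le |z| \le |\operatorname{Re} z| + |\operatorname{Im} z|. \]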
3 Suppose w, z ∈ C. Prove that
|wz| = |w| |z| and |w + z| ≤ |w| + |z|.
4 Suppose ( X, S) is a measurable space and f : X → C is a complex-valued
function. For conditions (b) and (c) below, identify C with R2 . Prove that the
following are equivalent:
(a) f is S -measurable;
(b) f^{−1}(G) ∈ S for every open set G in R^2;
(c) f^{−1}(B) ∈ S for every Borel set B ∈ B_2.
Sometimes examples that do not satisfy a definition help you gain understanding.
The next result shows that every normed vector space is also a metric space in a
natural fashion.
d(f, g) = ‖f − g‖.
Then d is a metric on V.
From now on, all metric space notions in the context of a normed vector space
should be interpreted with respect to the metric introduced in the previous result.
However, usually there is no need to introduce the metric d explicitly—just use the
norm of the difference of two elements. For example, suppose (V, ‖·‖) is a normed
vector space, f_1, f_2, … is a sequence in V, and f ∈ V. Then in the context of a
normed vector space, the definition of limit (6.8) becomes the following statement:
Every sequence in a normed vector space that has a limit is a Cauchy sequence
(see 6.13). Normed vector spaces that satisfy the converse have a special name.
• The vector space C([0, 1]) with the norm defined by ‖f‖ = sup_{x∈[0, 1]} |f(x)| is
a Banach space.
• The vector space ℓ^1 with the norm defined by ‖(a_1, a_2, …)‖_1 = ∑_{k=1}^∞ |a_k| is a
Banach space.
6.41 (∑_{k=1}^∞ ‖g_k‖ < ∞ =⇒ ∑_{k=1}^∞ g_k converges) ⇐⇒ Banach space
f_k = g_1 + ⋯ + g_k.
If k > j ≥ n, then
\[
\begin{aligned}
\|f_k - f_j\| &= \|g_{j+1} + \cdots + g_k\| \\
&\le \|g_{j+1}\| + \cdots + \|g_k\| \\
&\le \sum_{m=n}^\infty \|g_m\| \\
&< \varepsilon.
\end{aligned}
\]
Thus f 1 , f 2 , . . . is a Cauchy sequence in V. Because V is a Banach space, we conclude
that f 1 , f 2 , . . . converges to some element of V, which is precisely what it means for
∑_{k=1}^∞ g_k to converge, completing one direction of the proof.
To prove the other direction, suppose ∑_{k=1}^∞ g_k converges for every sequence
g_1, g_2, … in V such that ∑_{k=1}^∞ ‖g_k‖ < ∞. Suppose f_1, f_2, … is a Cauchy sequence
in V. We want to prove that f_1, f_2, … converges to some element of V. It suffices to
show that some subsequence of f_1, f_2, … converges (by Exercise 14 in Section 6A).
Dropping to a subsequence (but not relabeling) and setting f_0 = 0, we can assume
that
\[ \sum_{k=1}^\infty \|f_k - f_{k-1}\| < \infty. \]
Hence ∑_{k=1}^∞ (f_k − f_{k−1}) converges. The partial sum of this series after n terms is f_n.
Thus lim_{n→∞} f_n exists, completing the proof.
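The first direction of the proof rests on the estimate ‖f_k − f_j‖ ≤ ‖g_{j+1}‖ + ⋯ + ‖g_k‖ for the partial sums f_k = g_1 + ⋯ + g_k. The sketch below illustrates that estimate in the Banach space R^2 with the Euclidean norm, using a hypothetical sequence g_k = (2^{−k}, (−1/3)^k) chosen so that ∑ ‖g_k‖ < ∞. The names and the sample sequence are assumptions made for this example.

```python
import math


def norm(v):
    """Euclidean norm on R^2."""
    return math.hypot(v[0], v[1])


def g(k: int):
    """Hypothetical summand g_k = (2^-k, (-1/3)^k); the norms are summable."""
    return (0.5 ** k, (-1.0) ** k / 3.0 ** k)


def partial_sum(k: int):
    """f_k = g_1 + ... + g_k."""
    terms = [g(m) for m in range(1, k + 1)]
    return (sum(t[0] for t in terms), sum(t[1] for t in terms))


def norm_sum(j: int, k: int) -> float:
    """||g_{j+1}|| + ... + ||g_k||."""
    return sum(norm(g(m)) for m in range(j + 1, k + 1))


# Check ||f_k - f_j|| <= ||g_{j+1}|| + ... + ||g_k|| for many pairs j < k.
for j in range(1, 20):
    for k in range(j + 1, 25):
        fk, fj = partial_sum(k), partial_sum(j)
        diff = (fk[0] - fj[0], fk[1] - fj[1])
        assert norm(diff) <= norm_sum(j, k) + 1e-12

print(partial_sum(60))
```

Both coordinates are geometric series, so the partial sums approach (1, −1/4).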
The set of linear maps from a vector space V to a vector space W is itself a vector
space, using the usual operations of addition and scalar multiplication of functions.
Most attention in analysis focuses on the subset of bounded linear functions, defined
below, which we will see is itself a normed vector space.
In the next definition, we have two normed vector spaces, V and W, which may
have different norms. However, we use the same notation ‖·‖ for both norms (and
for the norm of a linear map from V to W) because the context makes the meaning
clear. For example, in the definition below, f is in V and thus ‖f‖ refers to the norm
in V. Similarly, Tf ∈ W and thus ‖Tf‖ refers to the norm in W.
(Tf)(x) = x^2 f(x).
The next result shows that if V and W are normed vector spaces, then B(V, W) is a
normed vector space with the norm defined above.
Be sure that you are comfortable using all four equivalent formulas for ‖T‖ shown
in Exercise 17. For example, you should often think of ‖T‖ as the smallest number
such that ‖Tf‖ ≤ ‖T‖ ‖f‖ for all f in the domain of T.
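To make the "smallest number" description concrete, here is a finite-dimensional sketch. It assumes T : R^3 → R^2 is given by a hypothetical matrix A and that both spaces carry the sup norm; in that setting the norm of T equals the largest absolute row sum of A, a standard fact that is not stated in the text. The code checks the inequality ‖Tf‖ ≤ ‖T‖ ‖f‖ on random vectors and exhibits a vector at which equality holds.

```python
import random

A = [[1.0, -2.0, 0.5],
     [0.0, 3.0, -1.0]]  # hypothetical matrix of T


def apply(A, f):
    """Compute Tf, the matrix-vector product."""
    return [sum(a * x for a, x in zip(row, f)) for row in A]


def sup_norm(v):
    return max(abs(x) for x in v)


def operator_norm(A):
    """Norm of T for sup norms on domain and range: the largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)


norm_T = operator_norm(A)

random.seed(0)
for _ in range(1000):
    f = [random.uniform(-1.0, 1.0) for _ in range(3)]
    assert sup_norm(apply(A, f)) <= norm_T * sup_norm(f) + 1e-12

# Equality holds at a sign vector matched to the row with the largest sum,
# so no smaller constant works.
f_star = [0.0, 1.0, -1.0]
assert sup_norm(apply(A, f_star)) == norm_T * sup_norm(f_star)
print(norm_T)
```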
Note that in the next result, the hypothesis requires that W be a Banach space but
there is no requirement for V to be a Banach space.
‖Tf‖ ≤ sup{‖T_k f‖ : k ∈ Z+}
≤ sup{‖T_k‖ : k ∈ Z+} ‖f‖
for each f ∈ V. The last supremum above is finite because every Cauchy sequence is
bounded (see Exercise 4). Thus T ∈ B(V, W).
We still need to show that lim_{k→∞} ‖T_k − T‖ = 0. To do this, suppose ε > 0. Let
n ∈ Z+ be such that ‖T_j − T_k‖ < ε for all j ≥ n and k ≥ n. Suppose j ≥ n and
suppose f ∈ V. Then
‖(T_j − T)f‖ = lim_{k→∞} ‖T_j f − T_k f‖
≤ ε ‖f‖.
The next result shows that the phrase bounded linear map means the same as the
phrase continuous linear map.
A linear map from one normed vector space to another normed vector space is
continuous if and only if it is bounded.
\[ \lim_{k \to \infty} \frac{f_k}{\|T f_k\|} = 0 \quad\text{and}\quad T\Bigl( \frac{f_k}{\|T f_k\|} \Bigr) = \frac{T f_k}{\|T f_k\|} \not\to 0, \]
‖T f_k − T f‖ = ‖T(f_k − f)‖
≤ ‖T‖ ‖f_k − f‖.
EXERCISES 6C
1 Show that the map f ↦ ‖f‖ from a normed vector space V to F is continuous
(where the norm on F is the usual absolute value).
2 Prove that if V is a normed vector space, f ∈ V, and r > 0, then
\overline{B(f, r)} = \overline{B}(f, r).
3 Show that the functions defined in the last two bullet points of Example 6.35 are
not norms.
4 Prove that each Cauchy sequence in a normed vector space is bounded (meaning
that there is a real number that is greater than the norm of every element in the
Cauchy sequence).
5 Show that if n ∈ Z+, then Fn is a Banach space with both the norms used in the
first bullet point of Example 6.34.
6 Suppose X is a nonempty set and b(X) is the vector space of bounded functions
from X to F. Prove that if ‖·‖ is defined on b(X) by ‖f‖ = sup_{x∈X} |f(x)|,
then b(X) is a Banach space.
7 Show that ℓ^1 with the norm defined by ‖(a_1, a_2, …)‖_∞ = sup_{k∈Z+} |a_k| is not
a Banach space.
8 Show that ℓ^1 with the norm defined by ‖(a_1, a_2, …)‖_1 = ∑_{k=1}^∞ |a_k| is a Banach
space.
9 Show that the vector space C([0, 1]) of continuous functions from [0, 1] to F
with the norm defined by ‖f‖ = ∫_0^1 |f| is not a Banach space.
10 Suppose U is a subspace of a normed vector space V such that some open ball
of V is contained in U. Prove that U = V.
11 Prove that the only subsets of a normed vector space V that are both open and
closed are ∅ and V.
12 Prove that if V is a normed vector space and U is a subspace of V, then \overline{U} is a
subspace of V.
13 Suppose that U is a normed vector space. Let d be the metric on U defined
by d(f, g) = ‖f − g‖ for f, g ∈ U. Let V be the complete metric space
constructed in Exercise 16 in Section 6A.
(a) Show that the set V is a vector space under natural operations of addition
and scalar multiplication.
(b) Show that there is a natural way to make V into a normed vector space and
that with this norm, V is a Banach space.
(c) Explain why (b) shows that every normed vector space is a subspace of
some Banach space.
15 Give an example to show that part (a) of the previous exercise can fail if the
assumption that W is a Banach space is replaced by the assumption that W is a
normed vector space.
16 For readers familiar with the quotient of a vector space and a subspace: Suppose
V is a normed vector space and U is a subspace of V. Define ‖·‖ on V/U by
‖f + U‖ = inf{‖f + g‖ : g ∈ U}.
(a) Prove that ‖·‖ is a norm on V/U if and only if U is a closed subspace of V.
(b) Prove that if V is a Banach space and U is a closed subspace of V, then
V/U (with the norm defined above) is a Banach space.
(c) Prove that if U is a Banach space (with the norm it inherits from V) and
V/U is a Banach space (with the norm defined above), then V is a Banach
space.
6D Linear Functionals
Bounded Linear Functionals
Linear maps into the scalar field F are so important that they get a special name.
When we think of the scalar field F as a normed vector space, as in the next
example, the norm ‖z‖ of a number z ∈ F is always intended to be just the usual
absolute value |z|. This norm makes F a Banach space.
Suppose V and W are vector spaces and T : V → W is a linear map. Then the
null space of T is denoted by null T and is defined by
null T = { f ∈ V : T f = 0}.
(d) The closure of null ϕ does not equal V.
Proof The equivalence of (a) and (b) is just a special case of 6.48.
To prove that (b) implies (c), suppose that ϕ is a continuous linear functional.
Then null ϕ, which is the inverse image of the closed set {0}, is a closed subset of V
by 6.11(d). Thus (b) implies (c).
To prove that (c) implies (a), we will show that the negation of (a) implies
the negation of (c). Thus suppose ϕ is not bounded. Then there is a sequence
f_1, f_2, … in V such that ‖f_k‖ ≤ 1 and |ϕ(f_k)| ≥ k for each k ∈ Z+. Now
A family {e_k}_{k∈Γ} in a set V is a function e from a set Γ to V, with the value of
the function e at k ∈ Γ denoted by e_k.
Even though a family in V is a function mapping into V and thus is not a subset
of V, the set terminology and the bracket notation {e_k}_{k∈Γ} are useful, and the range
of a family in V really is a subset of V.
We now restate some basic linear algebra concepts, but in the context of vector
spaces that might be infinite-dimensional. Note that only finite sums appear in the
definition below, even though we might be working with an infinite family.
• The span of {e_k}_{k∈Γ} is denoted by span{e_k}_{k∈Γ} and is defined to be the set
of all sums of the form
∑_{j∈Ω} α_j e_j,
Zorn’s Lemma now allows us to prove that every vector space has a basis. The
proof does not help us find a concrete basis because Zorn’s Lemma is an existence
result rather than a constructive technique.
Now we can prove the promised result about the existence of discontinuous linear
functionals on every infinite-dimensional normed vector space.
Hahn–Banach Theorem
In the last subsection, we showed that there exists a discontinuous linear functional
on each infinite-dimensional normed vector space. Now we turn our attention to the
existence of continuous linear functionals.
The existence of a nonzero continuous linear functional on each Banach space is
not obvious. For example, consider the Banach space ℓ^∞/c_0, where ℓ^∞ is the Banach
space of bounded sequences in F with
‖(a_1, a_2, …)‖_∞ = sup_{k∈Z+} |a_k|
and c_0 is the subspace of ℓ^∞ consisting of those sequences in F that have limit 0. The
quotient space ℓ^∞/c_0 is an infinite-dimensional Banach space (see Exercise 16 in
Section 6C). However, no one has ever exhibited a concrete nonzero linear functional
on the Banach space ℓ^∞/c_0.
In this subsection, we will show that infinite-dimensional normed vector spaces
have plenty of continuous linear functionals. We will do this by showing that a
bounded linear functional on a subspace of a normed vector space can be extended
to a bounded linear functional on the whole space without increasing its norm—this
result is called the Hahn–Banach Theorem (6.69).
Completeness plays no role in this topic. Thus this subsection deals with normed
vector spaces instead of Banach spaces.
Most of the work in proving the Hahn–Banach Theorem is done in the next lemma,
which shows that we can extend a linear functional to a subspace generated
by one additional element, without increasing the norm. This one-element-at-a-time
approach, when combined with a maximal object produced by Zorn’s Lemma, will
give us the desired extension to the full normed vector space.
U + Rh = {f + αh : f ∈ U and α ∈ R}.
ϕ(f + αh) = ψ(f) + αc
The desired result in the line above will follow from the inequality
6.66 sup_{f∈U} (−‖ψ‖ ‖f + h‖ − ψ(f)) ≤ inf_{g∈U} (‖ψ‖ ‖g + h‖ − ψ(g)).
−‖ψ‖ ‖f + h‖ − ψ(f) = −‖ψ‖ ‖(g + h) − (g − f)‖ + ψ(g − f) − ψ(g)
≤ ‖ψ‖(‖g + h‖ − ‖g − f‖) + ψ(g − f) − ψ(g)
≤ ‖ψ‖ ‖g + h‖ − ψ(g).
Because our simplified form of Zorn’s Lemma deals with set inclusions rather
than more general orderings, we will need to use the notion of the graph of a function.
Proof First we consider the case where F = R. Let A be the collection of subsets
E of V × R that satisfy all the following conditions:
• graph(ψ) ⊂ E;
• |α| ≤ ‖ψ‖ ‖f‖ for every (f, α) ∈ E;
• E = graph(ϕ) for some linear functional ϕ on some subspace of V.
Then A satisfies the hypothesis of Zorn’s Lemma (6.60). Thus A has a maximal
element. The Extension Lemma (6.63) implies that this maximal element is the graph
of a linear functional defined on all of V. This linear functional is an extension of ψ
to V and it has norm ‖ψ‖, completing the proof in the case where F = R.
Now consider the case where F = C. Define ψ_1 : U → R by
ψ_1(f) = Re ψ(f)
for f ∈ U. Then ψ_1 is an R-linear map from U to R and ‖ψ_1‖ ≤ ‖ψ‖ (actually
‖ψ_1‖ = ‖ψ‖, but we need only the inequality). Also,
ψ(f) = Re ψ(f) + i Im ψ(f)
= ψ_1(f) + i Im(−iψ(if))
= ψ_1(f) − i Re ψ(if)
6.70 = ψ_1(f) − iψ_1(if)
for all f ∈ U.
Temporarily forget that complex scalar multiplication makes sense on V and
temporarily think of V as a real normed vector space. The case of the result that
we have already proved then implies that there exists an extension ϕ_1 of ψ_1 to an
R-linear functional ϕ_1 : V → R with ‖ϕ_1‖ = ‖ψ_1‖ ≤ ‖ψ‖.
Motivated by 6.70, we define ϕ : V → C by
ϕ(f) = ϕ_1(f) − iϕ_1(if)
for f ∈ V. The equation above and 6.70 imply that ϕ is an extension of ψ to V. The
equation above also implies that ϕ(f + g) = ϕ(f) + ϕ(g) and ϕ(αf) = αϕ(f) for
all f, g ∈ V and all α ∈ R. Also,
ϕ(if) = ϕ_1(if) − iϕ_1(−f) = ϕ_1(if) + iϕ_1(f) = i(ϕ_1(f) − iϕ_1(if)) = iϕ(f).
The reader should use the equation above to show that ϕ is a C-linear map.
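One way to carry out this verification is sketched here. It uses only the additivity and R-homogeneity of ϕ noted above, together with ϕ(if) = iϕ(f). The real numbers a and b, with α = a + bi, are names introduced for this illustration.

```latex
\begin{align*}
\varphi\bigl((a+bi)f\bigr)
  &= \varphi(af) + \varphi\bigl(b(if)\bigr)  && \text{additivity} \\
  &= a\,\varphi(f) + b\,\varphi(if)          && \text{$\mathbf{R}$-homogeneity} \\
  &= a\,\varphi(f) + bi\,\varphi(f)          && \varphi(if) = i\varphi(f) \\
  &= (a+bi)\,\varphi(f).
\end{align*}
```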
The only part of the proof that remains is to show that ‖ϕ‖ ≤ ‖ψ‖. To do this,
note that
|ϕ(f)|^2 = ϕ(\overline{ϕ(f)} f) = ϕ_1(\overline{ϕ(f)} f) ≤ ‖ψ‖ ‖\overline{ϕ(f)} f‖ = ‖ψ‖ |ϕ(f)| ‖f‖
for all f ∈ V, where the second equality holds because ϕ(\overline{ϕ(f)} f) ∈ R. Dividing by
|ϕ(f)|, we see from the line above that |ϕ(f)| ≤ ‖ψ‖ ‖f‖ for all f ∈ V (no division
necessary if ϕ(f) = 0). This implies that ‖ϕ‖ ≤ ‖ψ‖, completing the proof.
We have given the special name linear functionals to linear maps into the scalar
field F. The vector space of bounded linear functionals now also gets a special name
and a special notation.
By 6.47, the dual space of every normed vector space is a Banach space.
The next result gives another beautiful application of the Hahn–Banach Theorem,
with a useful necessary and sufficient condition for an element of a normed vector
space to be in the closure of a subspace.
EXERCISES 6D
1 Suppose V is a normed vector space and ϕ is a linear functional on V. Suppose
α ∈ F \ {0}. Prove that the following are equivalent:
(a) ϕ is a bounded linear functional.
(b) ϕ^{−1}(α) is a closed subset of V.
(c) The closure of ϕ^{−1}(α) does not equal V.
For the next three exercises, F^n should be endowed with the norm ‖·‖_∞ as defined
in Example 6.34.
4 Suppose V is a normed vector space, n ∈ Z+, and T : V → F^n is linear. Prove
that T is a bounded linear map if and only if null T is a closed subspace of V.
5 Suppose n ∈ Z+ and V is a normed vector space. Prove that every linear map
from F^n to V is continuous.
6 Suppose n ∈ Z+, V is a normed vector space, and T : F^n → V is a linear map
that is one-to-one and onto V.
(a) Use the Bolzano–Weierstrass Theorem (see 0.73 in the Appendix) to show
that
inf{‖Tx‖ : x ∈ F^n and ‖x‖_∞ = 1} > 0.
(b) Prove that T^{−1} : V → F^n is a bounded linear map.
7 Suppose n ∈ Z+.
(a) Prove that all norms on F^n have the same convergent sequences, the same
open sets, and the same closed sets.
(b) Prove that all norms on F^n make F^n into a Banach space.
for every h ∈ V \ U.
for all (a_1, a_2, …) ∈ ℓ^∞ such that the limit above on the right exists.
15 Suppose B is an open ball in a normed vector space V such that 0 ∉ B. Prove
that there exists ϕ ∈ V′ such that
Re ϕ( f ) > 0
for all f ∈ B.
16 Show that the dual space of each infinite-dimensional normed vector space is
infinite-dimensional.
A normed vector space is called separable if it has a countable subset whose closure
equals the whole space. Probably most of the normed vector spaces that you
will encounter are separable.
17 Suppose V is a normed vector space that contains a countable set whose closure
is V. Explain how the Hahn–Banach Theorem (6.69) for V can be proved
without using any results (such as Zorn’s Lemma) that depend upon the Axiom
of Choice.
18 Suppose that V is a normed vector space such that the dual space V′ is a
separable Banach space. Prove that V is separable.
19 Prove that the dual of the Banach space C([0, 1]) is not separable; here the norm
on C([0, 1]) is defined by ‖f‖ = sup_{x∈[0,1]} |f(x)|.
The double dual space of a normed vector space is defined to be the dual space of
the dual space. If V is a normed vector space, then the double dual space of V is
denoted by V″; thus V″ = (V′)′. The norm on V″ is defined to be the norm it
receives as the dual space of V′.
20 Define Φ : V → V″ by
(Φf)(ϕ) = ϕ(f)
for f ∈ V and ϕ ∈ V′.
Show that ‖Φf‖ = ‖f‖ for every f ∈ V.
[The map Φ defined above is called the canonical isometry of V into V″.]
21 Suppose V is an infinite-dimensional normed vector space. Show that there is a
convex subset U of V such that the closure of U equals V and such that the
complement V \ U is also a convex subset of V whose closure equals V.
[This exercise should stretch your geometric intuition because this behavior
cannot happen in finite dimensions.]
Baire’s Theorem
We begin with some key topological notions.
You should verify the following elementary facts about the interior.
For example, Q and R \ Q are both dense in R, where R has its standard metric
d( x, y) = | x − y|.
You should verify the following elementary facts about dense subsets.
The proof of the next result uses the following fact, which you should first prove:
If G is an open subset of a metric space V and f ∈ G, then there exists r > 0 such
that B( f , r ) ⊂ G.
Section 6E Consequences of Baire’s Theorem 183
(a) A complete metric space is not the countable union of closed subsets with
empty interior.
(b) The countable intersection of dense open subsets of a complete metric space
is nonempty.
Proof We will prove (b) and then use (b) to prove (a).
To prove (b), suppose (V, d) is a complete metric space and G_1, G_2, … is a
sequence of dense open subsets of V. We need to show that ⋂_{k=1}^∞ G_k ≠ ∅.
6.77 B(f_1, r_1) ⊃ B(f_2, r_2) ⊃ ··· ⊃ B(f_n, r_n)
and
6.78 r_j ∈ (0, 1/j) and B(f_j, r_j) ⊂ G_j for j = 1, …, n.
Thus we inductively construct a sequence f_1, f_2, … that satisfies 6.77 and 6.78 for
all n ∈ Z+.
If j ∈ Z+, then 6.77 and 6.78 imply that
6.79 f_k ∈ B(f_j, r_j) and d(f_j, f_k) ≤ r_j < 1/j for all k > j.
Because
R = ⋃_{x∈R} {x}
and each set {x} has empty interior in R, Baire’s Theorem implies R is uncountable.
Thus we have yet another proof that R is uncountable, different from Cantor’s original
diagonal proof and different from the proof via measure theory (see 2.16).
The next result is another nice consequence of Baire’s Theorem.
6.80 The set of irrational numbers is not a countable union of closed sets
There does not exist a countable collection of closed subsets of R whose union
equals R \ Q.
The equation above writes the complete metric space R as a countable union of
closed sets with empty interior, which contradicts Baire’s Theorem [6.76(a)]. This
contradiction completes the proof.
Suppose V and W are Banach spaces and T is a bounded linear map of V onto W.
Then T ( G ) is an open subset of W for every open subset G of V.
Proof Let B denote the open unit ball B(0, 1) = {f ∈ V : ‖f‖ < 1} of V. For any
open ball B(f, a) in V, the linearity of T implies that
T(B(f, a)) = Tf + aT(B).
Thus W = ⋃_{k=1}^∞ \overline{kT(B)}. Baire’s Theorem [6.76(a)] now implies that \overline{kT(B)} has a
nonempty interior for some k ∈ Z+. The linearity of T allows us to conclude that
\overline{T(B)} has a nonempty interior.
Thus there exists g ∈ B such that Tg ∈ int \overline{T(B)}. Hence
Thus there exists r > 0 such that \overline{B}(0, 2r) ⊂ 2\overline{T(B)} [here \overline{B}(0, 2r) is the closed ball
in W centered at 0 with radius 2r]. Hence \overline{B}(0, r) ⊂ \overline{T(B)}. The definition of what it
means to be in the closure of T(B) [see 6.7] now shows that
h ∈ W and ‖h‖ ≤ r and ε > 0 ⟹ ∃ f ∈ B such that ‖h − Tf‖ < ε.
For arbitrary h ≠ 0 in W, applying the result in the line above to (r/‖h‖)h shows that
6.82 h ∈ W and ε > 0 ⟹ ∃ f ∈ (‖h‖/r)B such that ‖h − Tf‖ < ε.
here we are using 6.41 (this is the place in the proof where we use the hypothesis that
V is a Banach space). The inequality displayed above shows that ‖f‖ < 2/r.
Because
‖g − Tf_1 − Tf_2 − ··· − Tf_n‖ < 1/2^n
and because T is a continuous linear map, we have g = Tf.
We have now shown that B(0, 1) ⊂ (2/r)T(B). Thus (r/2)B(0, 1) ⊂ T(B), completing
the proof.
Suppose V and W are Banach spaces and T is a one-to-one bounded linear map
from V onto W. Then T^{−1} is a bounded linear map from W onto V.
Proof The verification that T^{−1} is a linear map from W to V is left to the reader.
To prove that T^{−1} is bounded, suppose G is an open subset of V. Then
(T^{−1})^{−1}(G) = T(G).
The result above shows that completeness for normed vector spaces plays a role
analogous to compactness for metric spaces (think of the theorem stating that a
continuous one-to-one function from a compact metric space onto another compact
metric space has an inverse that is also continuous).
The next result gives a terrific way to show that a linear map between Banach
spaces is bounded. The proof is remarkably clean because the hard work has been
done in the proof of the Open Mapping Theorem (which was used to prove the
Bounded Inverse Theorem).
S(f, Tf) = f.
Then
‖S(f, Tf)‖ = ‖f‖ ≤ max{‖f‖, ‖Tf‖} = ‖(f, Tf)‖
for all f ∈ V. Thus S is a bounded linear map from graph(T) onto V with ‖S‖ ≤ 1.
Clearly S is injective. Thus the Bounded Inverse Theorem (6.83) implies that S^{−1} is
bounded. Because S^{−1} : V → graph(T) satisfies the equation S^{−1}f = (f, Tf), we
have
‖Tf‖ ≤ max{‖f‖, ‖Tf‖}
= ‖(f, Tf)‖
= ‖S^{−1}f‖
≤ ‖S^{−1}‖ ‖f‖
for all f ∈ V. The inequality above implies that T is a bounded linear map with
‖T‖ ≤ ‖S^{−1}‖, completing the proof.
Then
sup{‖T‖ : T ∈ A} < ∞.
6.87 B(h, r) ⊂ V_n.
EXERCISES 6E
1 Suppose U is a subset of a metric space V. Show that U is dense in V if and
only if every nonempty open subset of V contains at least one element of U.
2 Suppose U is a subset of a metric space V. Show that U has an empty interior if
and only if V \ U is dense in V.
3 Prove or give a counterexample: If V is a metric space and U, W are subsets
of V, then (int U ) ∪ (int W ) = int(U ∪ W ).
4 Prove or give a counterexample: If V is a metric space and U, W are subsets
of V, then (int U ) ∩ (int W ) = int(U ∩ W ).
5 Suppose
X = {0} ∪ ⋃_{k=1}^∞ {1/k}
and d(x, y) = |x − y| for x, y ∈ X.
(a) Show that ( X, d) is a complete metric space.
(b) Each set of the form { x } for x ∈ X is a closed subset of R that has an
empty interior as a subset of R. Clearly X is a countable union of such sets.
Explain why this does not violate the statement of Baire’s Theorem that
a complete metric space is not the countable union of closed subsets with
empty interior.
6 Give an example of a metric space that is the countable union of closed subsets
with empty interior.
[This exercise shows that the completeness hypothesis in Baire’s Theorem
cannot be dropped.]
7 (a) Show that there does not exist a countable collection of open subsets of R
whose intersection equals Q.
(b) Show that there does not exist a function f from R to R such that f is
continuous at each element of Q and f is discontinuous at each element of
R \ Q.
[The function in Exercise 4 in Section E of the Appendix is continuous at each
element of R \ Q and is discontinuous at each element of Q.]
8 Suppose (X, d) is a complete metric space and G_1, G_2, … is a sequence of
dense open subsets of X. Prove that ⋂_{k=1}^∞ G_k is a dense subset of X.
9 Prove that there does not exist an infinite-dimensional Banach space with a
countable basis.
[This exercise implies, for example, that there is not a norm that makes the
vector space of polynomials with coefficients in F into a Banach space.]
10 Give an example of a Banach space V, a normed vector space W, a bounded
linear map T of V onto W, and an open subset G of V such that T ( G ) is not an
open subset of W.
[The exercise shows that the hypothesis in the Open Mapping Theorem that W is
a Banach space cannot be relaxed to the hypothesis that W is a normed vector
space.]
11 Show that there exists a normed vector space V, a Banach space W, a bounded
linear map T of V onto W, and an open subset G of V such that T ( G ) is not an
open subset of W.
[The exercise shows that the hypothesis in the Open Mapping Theorem that V is
a Banach space cannot be relaxed to the hypothesis that V is a normed vector
space.]
Chapter 7
L^p Spaces
Fix a measure space (X, S, µ) and a positive number p. We begin this chapter by
looking at the vector space of measurable functions f : X → F such that
∫ |f|^p dµ < ∞.
Important results called Hölder’s Inequality and Minkowski’s Inequality will help
us investigate this vector space. A useful class of Banach spaces appears when we
identify functions that differ only on a set of measure 0 and require p ≥ 1.
The main building of the Swiss Federal Institute of Technology (ETH Zürich).
Hermann Minkowski (1864–1909) taught at this university from 1896 to 1902.
During this time, Albert Einstein (1879–1955) was a student in several of
Minkowski’s mathematics classes. Minkowski later created mathematics that helped
explain Einstein’s special theory of relativity.
7A ℒ^p(µ)
Hölder’s Inequality
Our next major goal is to define an important class of vector spaces that generalize the
vector spaces ℒ^1(µ) and ℓ^1 introduced in the last two bullet points of Example 6.32.
We begin this process with the definition below. The terminology p-norm introduced
below is convenient even though it is not necessarily a norm.
7.1 Definition ‖f‖_p
The exponent 1/p appears in the definition of the p-norm ‖f‖_p because we want
the equation ‖αf‖_p = |α| ‖f‖_p to hold.
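The role of the exponent 1/p can be checked in one line. The sketch below assumes the standard definition of the p-norm for 0 < p < ∞, namely the 1/p power of the integral of |f|^p, which is what Definition 7.1 is understood to state.

```latex
\[
\|\alpha f\|_p
  = \Bigl(\int |\alpha f|^p \, d\mu\Bigr)^{1/p}
  = \Bigl(|\alpha|^p \int |f|^p \, d\mu\Bigr)^{1/p}
  = |\alpha| \, \|f\|_p .
\]
```

Without the outer exponent 1/p, the scalar would come out as |α|^p rather than |α|.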
For 0 < p < ∞, the p-norm ‖f‖_p does not change if f changes on a set of
µ-measure 0. By using the essential supremum rather than the supremum in the
definition of ‖f‖_∞, we arrange for the ∞-norm ‖f‖_∞ to enjoy this same property.
Also, Exercise 15 in this section shows why using the essential supremum rather than
the supremum is the right definition.
Note that for counting measure, the essential supremum and the supremum are the
same because in this case there are no sets of measure 0 other than the empty set.
Now we can define our generalization of ℒ^1(µ), which was defined in the last
bullet point of Example 6.32.
7.4 Example ℓ^p
When µ is counting measure on Z+, the set ℒ^p(µ) is often denoted by ℓ^p
(pronounced little ell-p). Thus if 0 < p < ∞, then
ℓ^p = {(a_1, a_2, …) : each a_k ∈ F and ∑_{k=1}^∞ |a_k|^p < ∞}
and
ℓ^∞ = {(a_1, a_2, …) : each a_k ∈ F and sup_{k∈Z+} |a_k| < ∞}.
Inequality 7.5(a) below will provide an easy proof that ℒ^p(µ) is closed under
addition. Soon we will prove Minkowski’s Inequality (7.14), which provides an
important improvement of 7.5(a) when p ≥ 1 but is more complicated to prove.
and
(b) ‖αf‖_p = |α| ‖f‖_p
for all f, g ∈ ℒ^p(µ) and all α ∈ F. Furthermore, with the usual operations of
addition and scalar multiplication of functions, ℒ^p(µ) is a vector space.
What we call the dual exponent in the definition below is often called the conjugate
exponent or the conjugate index. However, the terminology dual exponent conveys
more meaning because of results (7.24 and 7.25) that we will see in the next section.
1′ = ∞, ∞′ = 1, 2′ = 2, 4′ = 4/3, (4/3)′ = 4
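The values listed above can be reproduced with a few lines of code. This is a minimal sketch, assuming the usual defining relation 1/p + 1/p′ = 1 with the conventions 1′ = ∞ and ∞′ = 1; the function name `dual_exponent` is introduced here for illustration only.

```python
import math


def dual_exponent(p: float) -> float:
    """Return p' satisfying 1/p + 1/p' = 1, with 1' = inf and inf' = 1."""
    if p < 1:
        raise ValueError("the dual exponent is defined only for 1 <= p <= infinity")
    if p == 1:
        return math.inf
    if math.isinf(p):
        return 1.0
    # For 1 < p < infinity, solving 1/p + 1/p' = 1 gives p' = p / (p - 1).
    return p / (p - 1)


for p in (1, math.inf, 2, 4, 4 / 3):
    print(p, dual_exponent(p))
```

Because of floating-point rounding, the computed dual exponent of 4/3 agrees with 4 only approximately, so comparisons should use a tolerance such as `math.isclose`.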
The result below will be a key tool in proving Hölder’s Inequality (7.9).
increasing on the interval (b^{1/(p−1)}, ∞). Thus f has a global minimum at b^{1/(p−1)}.
A tiny bit of arithmetic [use p/(p − 1) = p′] shows that f(b^{1/(p−1)}) = 0. Thus
The important result below furnishes a key tool that is used in the proof of
Minkowski’s Inequality (7.14).
Proof Suppose 1 < p < ∞, leaving the cases p = 1 and p = ∞ as exercises for
the reader.
First consider the special case where ‖f‖_p = ‖h‖_{p′} = 1. Young’s Inequality
(7.8) tells us that
|f(x)h(x)| ≤ |f(x)|^p / p + |h(x)|^{p′} / p′
for all x ∈ X. Integrating both sides of the inequality above with respect to µ shows
that ‖fh‖_1 ≤ 1 = ‖f‖_p ‖h‖_{p′}, completing the proof in this special case.
If ‖f‖_p = 0 or ‖h‖_{p′} = 0, then ‖fh‖_1 = 0 and the desired inequality holds.
Similarly, if ‖f‖_p = ∞ or ‖h‖_{p′} = ∞, then the desired inequality clearly holds.
Thus we will assume that 0 < ‖f‖_p < ∞ and 0 < ‖h‖_{p′} < ∞.
[Hölder’s Inequality was proved in 1889 by Otto Hölder (1859–1937).]
Now define S-measurable functions f_1, h_1 : X → F by
f_1 = f / ‖f‖_p and h_1 = h / ‖h‖_{p′}.
Then ‖f_1‖_p = 1 and ‖h_1‖_{p′} = 1. By the result for our special case, we have
‖f_1 h_1‖_1 ≤ 1, which implies that ‖fh‖_1 ≤ ‖f‖_p ‖h‖_{p′}.
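Hölder’s Inequality can be sanity-checked numerically in the simplest setting, counting measure on a finite set, where functions are finite sequences and integrals are sums. The sketch below is an illustration under that assumption, not part of the proof; the helper names `p_norm` and `holder_gap` are hypothetical.

```python
import math
import random


def p_norm(a, p):
    """p-norm of a finite sequence with respect to counting measure."""
    if math.isinf(p):
        return max(abs(x) for x in a)
    return sum(abs(x) ** p for x in a) ** (1 / p)


def holder_gap(f, h, p):
    """Return ||f||_p * ||h||_{p'} - ||f h||_1, which Hölder says is >= 0."""
    if p == 1:
        p_dual = math.inf
    elif math.isinf(p):
        p_dual = 1.0
    else:
        p_dual = p / (p - 1)
    fh = [x * y for x, y in zip(f, h)]
    return p_norm(f, p) * p_norm(h, p_dual) - p_norm(fh, 1)


random.seed(0)
f = [random.uniform(-1, 1) for _ in range(50)]
h = [random.uniform(-1, 1) for _ in range(50)]
for p in (1, 1.5, 2, 3, math.inf):
    print(p, holder_gap(f, h, p))
```

The gap is nonnegative for every exponent tried, and it is zero (up to rounding) when p = 2 and h = f, which matches the equality condition explored in Exercise 5.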
The next result gives a key containment among Lebesgue spaces with respect to a
finite measure. Note the crucial role that Hölder’s Inequality plays in the proof.
Now raise both sides of the inequality above to the power 1/p, getting
(∫ |f|^p dµ)^{1/p} ≤ µ(X)^{(q−p)/(pq)} (∫ |f|^q dµ)^{1/q},
7.11 Example ℒ^p(E)
We adopt the common convention that if E is a Borel (or Lebesgue measurable)
subset of R and 0 < p ≤ ∞, then ℒ^p(E) means ℒ^p(λ_E), where λ_E denotes
Lebesgue measure λ restricted to the Borel (or Lebesgue measurable) subsets of R
that are contained in E.
With this convention, 7.10 implies that
if 0 < p < q < ∞, then ℒ^q([0, 1]) ⊂ ℒ^p([0, 1]) and ‖f‖_p ≤ ‖f‖_q
for f ∈ ℒ^q([0, 1]). See Exercises 12 and 13 in this section for related results.
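A concrete function shows that the containment above can be proper. The example below is an illustration chosen here, not taken from the text: the function f(x) = x^{-1/2} on (0, 1], with f(0) defined arbitrarily.

```latex
\[
f(x) = x^{-1/2} \quad (0 < x \le 1), \qquad
\int_{[0,1]} |f|^p \, d\lambda = \int_0^1 x^{-p/2} \, dx =
\begin{cases}
  \dfrac{2}{2-p} & \text{if } 0 < p < 2, \\[1ex]
  \infty & \text{if } p \ge 2.
\end{cases}
\]
```

Thus this f belongs to the p-th power integrable functions on [0, 1] for every p < 2 but not for p = 2, so the space for q = 2 is strictly smaller than the space for any p < 2.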
Minkowski’s Inequality
The next result will be used as a tool to prove Minkowski’s Inequality (7.14). Once
again, note the crucial role that Hölder’s Inequality plays in the proof.
Proof If ‖f‖_p = 0, then both sides of the equation in the conclusion of this result
equal 0. Thus we will also assume that ‖f‖_p ≠ 0.
Hölder’s Inequality (7.9) implies that if h ∈ ℒ^{p′}(µ) and ‖h‖_{p′} ≤ 1, then
|∫ fh dµ| ≤ ∫ |fh| dµ ≤ ‖f‖_p ‖h‖_{p′} ≤ ‖f‖_p.
Thus sup{|∫ fh dµ| : h ∈ ℒ^{p′}(µ) and ‖h‖_{p′} ≤ 1} ≤ ‖f‖_p.
‖f + g‖_p ≤ ‖f‖_p + ‖g‖_p.
Proof Assume that 1 ≤ p < ∞ (the case p = ∞ is left as an exercise for the reader).
Inequality 7.5(a) implies that f + g ∈ ℒ^p(µ).
Suppose h ∈ ℒ^{p′}(µ) and ‖h‖_{p′} ≤ 1. Then
|∫ (f + g)h dµ| ≤ ∫ |fh| dµ + ∫ |gh| dµ ≤ (‖f‖_p + ‖g‖_p)‖h‖_{p′}
≤ ‖f‖_p + ‖g‖_p,
where the second inequality comes from Hölder’s Inequality (7.9). Now take the
supremum of the left side of the inequality above over the set of h ∈ ℒ^{p′}(µ) such
that ‖h‖_{p′} ≤ 1. By 7.12, we get ‖f + g‖_p ≤ ‖f‖_p + ‖g‖_p, as desired.
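A small numerical experiment illustrates both Minkowski’s Inequality for p ≥ 1 and its failure for p < 1, which is why the p-norm is not necessarily a norm. This is a sketch for counting measure on a two-point set; the helper names `p_norm` and `minkowski_sides` are hypothetical.

```python
def p_norm(a, p):
    """(sum |a_k|^p)^(1/p) for a finite sequence and 0 < p < infinity."""
    return sum(abs(x) ** p for x in a) ** (1 / p)


def minkowski_sides(f, g, p):
    """Return the pair (||f + g||_p, ||f||_p + ||g||_p)."""
    total = [x + y for x, y in zip(f, g)]
    return p_norm(total, p), p_norm(f, p) + p_norm(g, p)


f = [1.0, 0.0]
g = [0.0, 1.0]
for p in (0.5, 1, 2, 3):
    lhs, rhs = minkowski_sides(f, g, p)
    print(f"p = {p}: ||f+g||_p = {lhs:.4f}, ||f||_p + ||g||_p = {rhs:.4f}")
```

For p = 0.5 the left side exceeds the right side, consistent with the reverse inequality for nonnegative functions in Exercise 21; for the exponents p ≥ 1 the triangle inequality holds.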
EXERCISES 7A
1 Suppose µ is a measure. Prove that
‖f + g‖_∞ ≤ ‖f‖_∞ + ‖g‖_∞ and ‖αf‖_∞ = |α| ‖f‖_∞
for all f, g ∈ ℒ^∞(µ) and all α ∈ F. Conclude that with the usual operations of
addition and scalar multiplication of functions, ℒ^∞(µ) is a vector space.
2 Suppose a ≥ 0, b ≥ 0, and 1 < p < ∞. Prove that
ab = a^p / p + b^{p′} / p′
if and only if a^p = b^{p′} [compare to Young’s Inequality (7.8)].
3 Suppose a_1, …, a_n are nonnegative numbers. Prove that
(a_1 + ··· + a_n)^5 ≤ n^4 (a_1^5 + ··· + a_n^5).
4 Prove Hölder’s Inequality (7.9) in the cases p = 1 and p = ∞.
5 Suppose that (X, S, µ) is a measure space, 1 < p < ∞, f ∈ ℒ^p(µ), and
h ∈ ℒ^{p′}(µ). Prove that Hölder’s Inequality (7.9) is an equality if and only if
there exist nonnegative numbers a and b, not both 0, such that
a|f(x)|^p = b|h(x)|^{p′}
for almost every x ∈ X.
|h(x)| = ‖h‖_∞
‖f_1 f_2 ··· f_n‖_1 ≤ ‖f_1‖_{p_1} ‖f_2‖_{p_2} ··· ‖f_n‖_{p_n}
11 Show that ⋂_{p>1} ℓ^p ≠ ℓ^1.
14 Suppose p, q ∈ (0, ∞], with p ≠ q. Prove that neither of the sets ℒ^p(R) and
ℒ^q(R) is a subset of the other.
15 Suppose (X, S, µ) is a finite measure space. Prove that
lim_{p→∞} ‖f‖_p = ‖f‖_∞
17 Suppose 0 < p < ∞ and f ∈ ℒ^p(R). Prove that for every ε > 0, there exists a
step function g ∈ ℒ^p(R) such that ‖f − g‖_p < ε.
[This exercise extends 3.46.]
18 Suppose 0 < p < ∞ and f ∈ ℒ^p(R). Prove that for every ε > 0, there
exists a continuous function g : R → R such that ‖f − g‖_p < ε and the set
{x ∈ R : g(x) ≠ 0} is bounded.
[This exercise extends 3.47.]
19 Suppose (X, S, µ) is a measure space, 1 < p < ∞, and f, g ∈ ℒ^p(µ). Prove
that Minkowski’s Inequality (7.14) is an equality if and only if there exist
nonnegative numbers a and b, not both 0, such that
af(x) = bg(x)
for almost every x ∈ X.
20 Suppose (X, S, µ) is a measure space and f, g ∈ ℒ^1(µ). Prove that
‖f + g‖_1 = ‖f‖_1 + ‖g‖_1
if and only if f(x)\overline{g(x)} ≥ 0 for almost every x ∈ X.
21 Suppose (X, S, µ) is a measure space and 0 < p < 1. Prove that the reverse
Minkowski inequality
‖f + g‖_p ≥ ‖f‖_p + ‖g‖_p
holds for all S-measurable functions f, g : X → [0, ∞).
22 Suppose (X, S, µ) and (Y, T, ν) are σ-finite measure spaces and 0 < p < ∞.
Prove that if f ∈ ℒ^p(µ × ν), then
[f]_x ∈ ℒ^p(ν) for almost every x ∈ X
and
[f]^y ∈ ℒ^p(µ) for almost every y ∈ Y,
where [f]_x and [f]^y are the cross sections of f as defined in 5.7.
23 Suppose 1 ≤ p < ∞ and f ∈ ℒ^p(R).
(a) For t ∈ R, define f_t : R → R by f_t(x) = f(x − t). Prove that
lim_{t→0} ‖f − f_t‖_p = 0.
7B L^p(µ)
Definition of L^p(µ)
Suppose (X, S, µ) is a measure space and 1 ≤ p ≤ ∞. If there exists a nonempty set
E ∈ S such that µ(E) = 0, then ‖χ_E‖_p = 0 even though χ_E ≠ 0; thus ‖·‖_p is not a
norm on ℒ^p(µ). The standard way to deal with this problem is to identify functions
that differ only on a set of µ-measure 0. To help make this process more rigorous, we
introduce the following definitions.
f̃ = {f + z : z ∈ Z(µ)}.
The set Z(µ) is clearly closed under scalar multiplication. Also, Z(µ) is closed
under addition because the union of two sets with µ-measure 0 is a set with
µ-measure 0. Thus Z(µ) is a subspace of ℒ^p(µ), as we had noted in the third bullet
point of Example 6.32.
Note that if f, F ∈ ℒ^p(µ), then f̃ = F̃ if and only if f(x) = F(x) for almost
every x ∈ X.
L^p(µ) = {f̃ : f ∈ ℒ^p(µ)}.
The last bullet point in the definition above requires a bit of care to verify that it
makes sense. The potential problem is that if Z(µ) ≠ {0}, then f̃ is not uniquely
represented by f. Thus suppose f, F, g, G ∈ ℒ^p(µ) and f̃ = F̃ and g̃ = G̃. For
the definition of addition in L^p(µ) to make sense, we must verify that (f + g)˜ =
(F + G)˜. This verification is left to the reader, as is the similar verification that the
scalar multiplication defined in the last bullet point above makes sense.
You might want to think of elements of L^p(µ) as equivalence classes of functions
in ℒ^p(µ), where two functions are equivalent if they agree almost everywhere.
‖f̃‖_p = ‖f‖_p
for f ∈ ℒ^p(µ).
The proof of the result above is left to the reader, who will surely use Minkowski’s
Inequality (7.14) to verify the triangle inequality. Note that the additive identity of
L^p(µ) is 0̃, which equals Z(µ).
For readers familiar with quotients of vector spaces: you may recognize that
L^p(µ) is the quotient space ℒ^p(µ)/Z(µ). For readers who want to learn about
quotients of vector spaces: see a textbook for a second course in linear algebra.
[If µ is counting measure on Z+, then L^p(µ) = ℒ^p(µ) = ℓ^p because counting
measure has no sets of measure 0 other than the empty set.]
The notation introduced below is commonly used in mathematics literature.
lim_{k→∞} ‖f_k − f‖_p = 0.
Proof The case p = ∞ is left as an exercise for the reader. Thus assume 1 ≤ p < ∞.
It suffices to show that lim_{m→∞} ‖f_{k_m} − f‖_p = 0 for some f ∈ ℒ^p(µ) and some
subsequence f_{k_1}, f_{k_2}, … (see Exercise 14 of Section 6A, whose proof does not require
the positive definite property of a norm).
Thus dropping to a subsequence (but not relabeling) and setting f_0 = 0, we can
assume that
∑_{k=1}^∞ ‖f_k − f_{k−1}‖_p < ∞.
= lim inf_{j→∞} ‖f_k − f_j‖_p
≤ ε,
where the second line above comes from Fatou’s Lemma (Exercise 18 in Section 3A).
Thus lim_{k→∞} ‖f_k − f‖_p = 0, as desired.
The proof that we have just completed contains within it the proof of a useful
result that is worth stating separately. A sequence can converge in p-norm without
converging pointwise anywhere (see, for example, Exercise 11). However, the next
result guarantees that some subsequence converges pointwise almost everywhere.
$$\lim_{m\to\infty} f_{k_m}(x) = f(x)$$
Proof This result follows immediately from 7.20 and the appropriate definitions.
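The remark before 7.22, that a sequence can converge in p-norm without converging pointwise anywhere while some subsequence still converges almost everywhere, can be illustrated with a small sketch. The example used here is the so-called typewriter sequence on [0, 1] with p = 1; it is a standard example and is not taken from the text above, and the indexing convention and helper names are assumptions made for this illustration.

```python
from fractions import Fraction

def typewriter_interval(k):
    """Return (left, right) such that f_k is the indicator of [left, right].

    Assumed indexing: k = 2**n + j with 0 <= j < 2**n, and
    f_k is the indicator function of [j / 2**n, (j + 1) / 2**n].
    """
    n = k.bit_length() - 1          # largest n with 2**n <= k
    j = k - 2**n
    return Fraction(j, 2**n), Fraction(j + 1, 2**n)

def f(k, x):
    left, right = typewriter_interval(k)
    return 1 if left <= x <= right else 0

def l1_norm(k):
    left, right = typewriter_interval(k)
    return right - left             # integral of an indicator = interval length

x = Fraction(1, 3)

# The 1-norms tend to 0 as k grows.
norms = [l1_norm(k) for k in (1, 2, 4, 8, 16, 1024)]

# Within one dyadic level, f_k(x) equals 1 for some k and 0 for others,
# so the full sequence f_1(x), f_2(x), ... cannot converge.
values_late = [f(k, x) for k in range(2**10, 2**11)]

# The subsequence f_2, f_4, f_8, ... is eventually 0 at this x.
subsequence_values = [f(2**n, x) for n in range(1, 12)]
```

Exact rational arithmetic is used so that interval membership is not affected by rounding. The same pattern holds for every x in (0, 1], not just the sample point chosen here.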
Duality
Recall that the dual space of a normed vector space V is denoted by V′ and is defined
to be the Banach space of bounded linear functionals on V; see 6.71.
In the statement and proof of the next result, an element of an $L^p$ space is denoted
by a symbol that makes it look like a function rather than like a collection of functions
that agree except on a set of measure 0. However, because integrals and $L^p$-norms
are unchanged when functions change only on a set of measure 0, this notational
convenience causes no problems.
7.24 Natural map of $L^{p'}(\mu)$ into $\bigl(L^p(\mu)\bigr)'$ preserves norms
Suppose µ is a measure and 1 < p ≤ ∞. For $h \in L^{p'}(\mu)$, define $\varphi_h : L^p(\mu) \to \mathbf{F}$
by
$$\varphi_h(f) = \int f h \, d\mu.$$
Then $h \mapsto \varphi_h$ is a one-to-one linear map from $L^{p'}(\mu)$ to $\bigl(L^p(\mu)\bigr)'$. Furthermore,
$\|\varphi_h\| = \|h\|_{p'}$ for all $h \in L^{p'}(\mu)$.
Proof Suppose $h \in L^{p'}(\mu)$ and $f \in L^p(\mu)$. Then Hölder’s Inequality (7.9) tells us
that $fh \in L^1(\mu)$ and that
$$\|fh\|_1 \le \|h\|_{p'} \|f\|_p.$$
Thus $\varphi_h$, as defined above, is a bounded linear map from $L^p(\mu)$ to F. Also, the map
$h \mapsto \varphi_h$ is clearly a linear map of $L^{p'}(\mu)$ into $\bigl(L^p(\mu)\bigr)'$. Now 7.12 (with the roles of
p and p′ reversed) shows that
7.25 Dual space of $\ell^p$ can be identified with $\ell^{p'}$
Suppose 1 ≤ p < ∞. For $b = (b_1, b_2, \dots) \in \ell^{p'}$, define $\varphi_b : \ell^p \to \mathbf{F}$ by
$$\varphi_b(a) = \sum_{k=1}^{\infty} a_k b_k,$$
where $a = (a_1, a_2, \dots)$. Then $b \mapsto \varphi_b$ is a one-to-one linear map from $\ell^{p'}$ onto
$(\ell^p)'$. Furthermore, $\|\varphi_b\| = \|b\|_{p'}$ for all $b \in \ell^{p'}$.
Proof For k ∈ Z+, let $e_k \in \ell^p$ be the sequence in which each term is 0 except that
the kth term is 1; thus $e_k = (0, \dots, 0, 1, 0, \dots)$.
Suppose $\varphi \in (\ell^p)'$. Define a sequence $b = (b_1, b_2, \dots)$ of numbers in F by
$$b_k = \varphi(e_k).$$
Suppose $a = (a_1, a_2, \dots) \in \ell^p$. Then
$$a = \sum_{k=1}^{\infty} a_k e_k,$$
where the infinite sum converges in the norm of $\ell^p$ (the proof would fail here if we
allowed p to be ∞). Because $\varphi$ is a bounded linear functional on $\ell^p$, applying $\varphi$ to
both sides of the equation above shows that
$$\varphi(a) = \sum_{k=1}^{\infty} a_k b_k.$$
We still need to prove that $b \in \ell^{p'}$. To do this, for n ∈ Z+ let $\mu_n$ be counting
measure on {1, 2, . . . , n}. We can think of $L^p(\mu_n)$ as a subspace of $\ell^p$ by identifying
each $(a_1, \dots, a_n) \in L^p(\mu_n)$ with $(a_1, \dots, a_n, 0, 0, \dots) \in \ell^p$. Restricting the
linear functional $\varphi$ to $L^p(\mu_n)$ gives the linear functional on $L^p(\mu_n)$ that satisfies the
following equation:
$$\varphi|_{L^p(\mu_n)}(a_1, \dots, a_n) = \sum_{k=1}^{n} a_k b_k.$$
Now 7.24 (also see Exercise 14 for the case where p = 1) gives
$$\|(b_1, \dots, b_n)\|_{p'} = \|\varphi|_{L^p(\mu_n)}\|$$
$$\le \|\varphi\|.$$
Because $\lim_{n\to\infty} \|(b_1, \dots, b_n)\|_{p'} = \|b\|_{p'}$, the inequality above implies the
inequality $\|b\|_{p'} \le \|\varphi\|$. Thus $b \in \ell^{p'}$, which implies that $\varphi = \varphi_b$, completing the
proof.
The previous result does not hold when p = ∞. In other words, the dual space
of $\ell^\infty$ cannot be identified with $\ell^1$. However, see Exercise 15, which shows that the
dual space of a natural subspace of $\ell^\infty$ can be identified with $\ell^1$.
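The norm identity $\|\varphi_b\| = \|b\|_{p'}$ in 7.25 can be checked numerically on finite sequences. The sketch below is only an illustration under stated assumptions: it takes p = 3, works with finitely many real terms, and uses the standard extremal choice $a_k = \operatorname{sign}(b_k)|b_k|^{p'-1}$, which is a well-known device and is not quoted from the proof above. All function names are hypothetical.

```python
def p_norm(seq, p):
    """The l^p norm of a finite sequence, for 1 <= p < infinity."""
    return sum(abs(t) ** p for t in seq) ** (1.0 / p)

def phi(b, a):
    """The functional phi_b applied to a: the sum of a_k * b_k."""
    return sum(ak * bk for ak, bk in zip(a, b))

def sign(t):
    return (t > 0) - (t < 0)

p = 3.0
p_dual = p / (p - 1.0)              # dual exponent p', here 1.5
b = [2.0, -1.0, 0.5, 0.0, -3.0]

# Hoelder's inequality: |phi_b(a)| <= ||b||_{p'} ||a||_p for any a.
a_test = [1.0, 4.0, -2.0, 0.5, 1.5]
holder_gap = p_norm(b, p_dual) * p_norm(a_test, p) - abs(phi(b, a_test))

# The extremal choice a_k = sign(b_k) |b_k|^(p'-1) attains equality,
# so phi_b(a) / ||a||_p equals ||b||_{p'} for this a.
a_star = [sign(bk) * abs(bk) ** (p_dual - 1.0) for bk in b]
ratio = phi(b, a_star) / p_norm(a_star, p)
```

Here `holder_gap` is nonnegative and `ratio` agrees with `p_norm(b, p_dual)` up to rounding, which is the finite-dimensional content of the norm equality.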
EXERCISES 7B
1 Suppose n > 1 and 0 < p < 1. Prove that if $\|\cdot\|$ is defined on $\mathbf{F}^n$ by
$$\|(a_1, \dots, a_n)\| = \bigl(|a_1|^p + \cdots + |a_n|^p\bigr)^{1/p},$$
then f = g.
(b) Give an example to show that the result in part (a) can fail if p = 1.
(c) Give an example to show that the result in part (a) can fail if p = ∞.
6 Suppose $(X, \mathcal{S}, \mu)$ is a measure space and 0 < p < 1. Show that
$$\|f + g\|_p^p \le \|f\|_p^p + \|g\|_p^p$$
11 Show that there exists a sequence $f_1, f_2, \dots$ of functions in $L^1([0, 1])$ such that
$\lim_{k\to\infty} \|f_k\|_1 = 0$ but
$$\sup\{f_k(x) : k \in \mathbf{Z}^+\} = \infty$$
for every x ∈ [0, 1].
[This exercise shows that the conclusion of 7.22 cannot be improved to conclude
that $\lim_{k\to\infty} f_k(x) = f(x)$ for almost every x ∈ X.]
12 Suppose $(X, \mathcal{S}, \mu)$ is a measure space, 1 ≤ p ≤ ∞, $f \in L^p(\mu)$, and $f_1, f_2, \dots$
is a sequence in $L^p(\mu)$ such that $\lim_{k\to\infty} \|f_k - f\|_p = 0$. Show that if
g : X → F is a function such that $\lim_{k\to\infty} f_k(x) = g(x)$ for almost every
x ∈ X, then f(x) = g(x) for almost every x ∈ X.
13 Suppose 1 ≤ p ≤ ∞. Prove that
$$\{(a_1, a_2, \dots) \in \ell^p : a_k \ne 0 \text{ for every } k \in \mathbf{Z}^+\}$$
is not an open subset of $\ell^p$.
14 (a) Give an example of a measure µ such that 7.24 fails for p = 1.
(b) Show that if µ is a σ-finite measure, then 7.24 holds for p = 1.
15 Let
$$c_0 = \{(a_1, a_2, \dots) \in \ell^\infty : \lim_{k\to\infty} a_k = 0\}.$$
Give $c_0$ the norm that it inherits as a subspace of $\ell^\infty$.
(a) Prove that $c_0$ is a Banach space.
(b) Prove that the dual space of $c_0$ can be identified with $\ell^1$.
16 Suppose 1 ≤ p ≤ 2.
(a) Prove that if w, z ∈ C, then
$$\frac{|w + z|^p + |w - z|^p}{2} \le |w|^p + |z|^p \le \frac{|w + z|^p + |w - z|^p}{2^{p-1}}.$$
(b) Prove that if µ is a measure and $f, g \in L^p(\mu)$, then
$$\frac{\|f + g\|_p^p + \|f - g\|_p^p}{2} \le \|f\|_p^p + \|g\|_p^p \le \frac{\|f + g\|_p^p + \|f - g\|_p^p}{2^{p-1}}.$$
17 Suppose 2 ≤ p < ∞.
(a) Prove that if w, z ∈ C, then
$$\frac{|w + z|^p + |w - z|^p}{2^{p-1}} \le |w|^p + |z|^p \le \frac{|w + z|^p + |w - z|^p}{2}.$$
(b) Prove that if µ is a measure and $f, g \in L^p(\mu)$, then
$$\frac{\|f + g\|_p^p + \|f - g\|_p^p}{2^{p-1}} \le \|f\|_p^p + \|g\|_p^p \le \frac{\|f + g\|_p^p + \|f - g\|_p^p}{2}.$$
[The inequalities in the two previous exercises are called Clarkson’s Inequalities.
They were discovered by James Clarkson in 1936.]
A Banach space is called reflexive if the canonical isometry of the Banach space
into its double dual space is surjective; see Exercise 20 in Section 6D for the
definitions of the double dual space and the canonical isometry.
18 Prove that if 1 < p < ∞, then ` p is reflexive.
for f ∈ V and g ∈ W.
Chapter 8
Hilbert Spaces
Normed vector spaces and Banach spaces, which were introduced in Chapter 6,
capture the notion of distance. In this chapter we introduce inner product spaces,
which capture the notion of angle. As we will see, the concept of orthogonality, which
corresponds to right angles in the familiar context of $\mathbf{R}^2$ or $\mathbf{R}^3$, plays a particularly
important role in inner product spaces.
Just as a Banach space is defined to be a normed vector space in which every
Cauchy sequence converges, a Hilbert space is defined to be an inner product space
that is a Banach space. Hilbert spaces are named in honor of David Hilbert (1862–
1943), who helped develop parts of the theory in the early twentieth century.
In this chapter, we will see a clean description of the bounded linear functionals
on a Hilbert space. We will also see that every Hilbert space has an orthonormal
basis, which makes Hilbert spaces look much like standard Euclidean spaces but with
infinite sums replacing finite sums.
for all $f, g \in L^2(\mu)$. Thus we can associate with each pair of functions $f, g \in L^2(\mu)$
a number $\int f g \, d\mu$. An inner product is almost a generalization of this pairing, with a
slight twist to get a closer connection to the $L^2(\mu)$-norm.
If g = f and F = R, then the left side of the inequality above is $\|f\|_2^2$. However,
if g = f and F = C, then the left side of the inequality above need not equal $\|f\|_2^2$.
Instead, we should take $g = \overline{f}$ to get $\|f\|_2^2$ above.
The observations above suggest that we should consider the pairing that takes f, g
to $\int f \overline{g} \, d\mu$. Then pairing f with itself gives $\|f\|_2^2$.
Now we are ready to define inner products, which abstract the key properties of
the pairing $f, g \mapsto \int f \overline{g} \, d\mu$ on $L^2(\mu)$, where µ is a measure.
An inner product on a vector space V is a function that takes each ordered pair
f, g of elements of V to a number $\langle f, g\rangle \in \mathbf{F}$ and has the following properties:
positivity
$\langle f, f\rangle \in [0, \infty)$ for all f ∈ V;
definiteness
$\langle f, f\rangle = 0$ if and only if f = 0;
linearity in first slot
$\langle f + g, h\rangle = \langle f, h\rangle + \langle g, h\rangle$ and $\langle \alpha f, g\rangle = \alpha\langle f, g\rangle$ for all f, g, h ∈ V and
all α ∈ F;
conjugate symmetry
$\langle f, g\rangle = \overline{\langle g, f\rangle}$ for all f, g ∈ V.
A vector space with an inner product on it is called an inner product space. The
terminology real inner product space indicates that F = R; the terminology
complex inner product space indicates that F = C.
If F = R, then the complex conjugate above can be ignored and the conjugate
symmetry property above can be rewritten more simply as $\langle f, g\rangle = \langle g, f\rangle$ for all
f, g ∈ V.
Although most mathematicians define an inner product as above, many physicists
use a definition that requires linearity in the second slot instead of the first slot.
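The four defining properties can be spot-checked for the standard inner product on $\mathbf{C}^n$. The following sketch is an illustration only: the sample vectors and helper names are hypothetical, and floating-point checks on a few vectors do not replace a proof. It also shows the effect of the convention noted above, since scaling in the second slot produces a complex conjugate.

```python
def inner(a, b):
    """Standard inner product on C^n: the sum of a_k * conjugate(b_k)."""
    return sum(x * complex(y).conjugate() for x, y in zip(a, b))

def add(u, v):
    return [x + y for x, y in zip(u, v)]

def scale(c, u):
    return [c * x for x in u]

f = [1 + 2j, -3j, 0.5]
g = [2 - 1j, 1 + 1j, -4]
h = [0.5j, 2, 1 - 1j]
alpha = 2 - 3j

# positivity: <f, f> is a nonnegative real number
ff = inner(f, f)

# linearity in the first slot (both differences should be 0)
lin_add = inner(add(f, g), h) - (inner(f, h) + inner(g, h))
lin_scale = inner(scale(alpha, f), g) - alpha * inner(f, g)

# conjugate symmetry: <f, g> equals the conjugate of <g, f>
conj_sym = inner(f, g) - inner(g, f).conjugate()

# second slot: a scalar comes out conjugated when F = C
second_slot = inner(f, scale(alpha, g)) - alpha.conjugate() * inner(f, g)
```

Each of `lin_add`, `lin_scale`, `conj_sym`, and `second_slot` is zero up to rounding, and `ff` has zero imaginary part.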
$$\langle (a_1, \dots, a_n), (b_1, \dots, b_n)\rangle = a_1\overline{b_1} + \cdots + a_n\overline{b_n}$$
for $f, g \in L^2(\mu)$. Hölder’s Inequality (7.9) with p = 2 implies that the integral
above makes sense as an element of F. When thinking of $L^2(\mu)$ as an inner product
space, we will always mean this inner product unless the context indicates
some other inner product.
Here we use $L^2(\mu)$ rather than $\mathcal{L}^2(\mu)$ because the definiteness requirement fails
on $\mathcal{L}^2(\mu)$ if there exist nonempty sets $E \in \mathcal{S}$ such that µ(E) = 0 (consider
$\langle \chi_E, \chi_E\rangle$ to see the problem).
The first two bullet points in this example are special cases of $L^2(\mu)$, taking µ to
be counting measure on either {1, . . . , n} or Z+.
As we will see, even though the main examples of inner product spaces are L2 (µ)
spaces, working with the inner product structure is often cleaner and simpler than
working with measures and integrals.
Proof
If F = R, then parts (b) and (c) of 8.3 imply that for f ∈ V, the function
$g \mapsto \langle f, g\rangle$ is a linear map from V to R. However, if F = C and f ≠ 0, then
the function $g \mapsto \langle f, g\rangle$ is not a linear map from V to C because of the complex
conjugate in part (c) of 8.3.
Thus the norm on Fn associated with the standard inner product is the usual
Euclidean norm.
• If $(a_1, a_2, \dots) \in \ell^2$, then
$$\|(a_1, a_2, \dots)\| = \Bigl(\sum_{k=1}^{\infty} |a_k|^2\Bigr)^{1/2}.$$
Thus the norm associated with the inner product on $\ell^2$ is just the standard norm
$\|\cdot\|_2$ on $\ell^2$ as defined in Example 7.2.
• If µ is a measure and $f \in L^2(\mu)$, then
$$\|f\| = \Bigl(\int |f|^2 \, d\mu\Bigr)^{1/2}.$$
Thus the norm associated with the inner product on $L^2(\mu)$ is just the standard
norm $\|\cdot\|_2$ on $L^2(\mu)$ as defined in 7.17.
The definition of an inner product (8.1) implies that if V is an inner product space
and f ∈ V, then
• $\|f\| \ge 0$;
• $\|f\| = 0$ if and only if f = 0.
The proof of the next result illustrates a frequently used property of the norm on
an inner product space: working with the square of the norm is often easier than
working directly with the norm.
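$$\|\alpha f\| = |\alpha| \, \|f\|.$$
Proof We have
$$\|\alpha f\|^2 = \langle \alpha f, \alpha f\rangle = \alpha\langle f, \alpha f\rangle = \alpha\overline{\alpha}\langle f, f\rangle = |\alpha|^2 \|f\|^2.$$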
The next definition plays a crucial role in the study of inner product spaces.
Two elements of an inner product space are called orthogonal if their inner
product equals 0.
In the definition above, the order of the two elements of the inner product space
does not matter because $\langle f, g\rangle = 0$ if and only if $\langle g, f\rangle = 0$. Instead of saying that
f and g are orthogonal, sometimes we say that f is orthogonal to g.
nal because
$$\int_{-\pi}^{\pi} \sin(3t)\cos(8t) \, dt = \Bigl[\frac{\cos(5t)}{10} - \frac{\cos(11t)}{22}\Bigr]_{t=-\pi}^{t=\pi} = 0,$$
Exercise 7 asks you to prove that if a and b are nonzero elements in $\mathbf{R}^2$, then
$$\langle a, b\rangle = \|a\| \, \|b\| \cos\theta,$$
where θ is the angle between a and b (thinking of a as the vector whose initial point is
the origin and whose end point is a, and similarly for b). Thus two elements of $\mathbf{R}^2$ are
orthogonal if and only if the cosine of the angle between them is 0, which happens if
and only if the vectors are perpendicular in the usual sense of plane geometry. Thus
you can think of the word orthogonal as a fancy word meaning perpendicular.
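Both observations above can be checked numerically. The sketch below is a rough illustration, assuming real scalars and a simple midpoint quadrature rule chosen for this example (the helper `integrate` is hypothetical): it approximates the integral of sin(3t) cos(8t) over [−π, π] and compares the two sides of the angle formula for one pair of vectors in $\mathbf{R}^2$.

```python
import math

def integrate(func, a, b, n=20000):
    """Composite midpoint rule on [a, b] with n subintervals."""
    width = (b - a) / n
    return width * sum(func(a + (i + 0.5) * width) for i in range(n))

# Inner product of sin(3t) and cos(8t) on [-pi, pi]; expected to be near 0.
inner_sin_cos = integrate(lambda t: math.sin(3 * t) * math.cos(8 * t),
                          -math.pi, math.pi)

# For comparison, sin(3t) paired with itself gives a value near pi.
norm_sq_sin = integrate(lambda t: math.sin(3 * t) ** 2, -math.pi, math.pi)

# Angle formula in R^2: <a, b> = ||a|| ||b|| cos(theta).
a = (3.0, 0.0)
b = (2.0, 2.0)                      # at 45 degrees from a
dot = a[0] * b[0] + a[1] * b[1]
theta = math.atan2(b[1], b[0]) - math.atan2(a[1], a[0])
rhs = math.hypot(*a) * math.hypot(*b) * math.cos(theta)
```

The quadrature only approximates the integrals, so the comparison should be read up to a small numerical tolerance.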
Mr. Friedman: I think that issue is entirely orthogonal to the issue here
because the Commonwealth is acknowledging—
Chief Justice Roberts: I’m sorry. Entirely what?
Mr. Friedman: Orthogonal. Right angle. Unrelated. Irrelevant.
Chief Justice Roberts: Oh.
Justice Scalia: What was that adjective? I liked that.
Mr. Friedman: Orthogonal.
Chief Justice Roberts: Orthogonal.
Mr. Friedman: Right, right.
Justice Scalia: Orthogonal, ooh. (Laughter.)
Justice Kennedy: I knew this case presented us a problem. (Laughter.)
The next theorem is over 2500 years old, although it was not originally stated in
the context of inner product spaces.
$$\|f + g\|^2 = \|f\|^2 + \|g\|^2.$$
Proof We have
$$\|f + g\|^2 = \langle f + g, f + g\rangle$$
$$= \langle f, f\rangle + \langle f, g\rangle + \langle g, f\rangle + \langle g, g\rangle$$
$$= \|f\|^2 + \|g\|^2,$$
as desired.
Exercise 2 shows that whether or not the converse of the Pythagorean Theorem
holds depends upon whether F = R or F = C.
Suppose f and g are elements of an inner product space V,
with g ≠ 0. Frequently it is useful to write f as some number c
times g plus an element h of V that is orthogonal to g. The figure
here suggests that such a decomposition should be possible. To
find the appropriate choice for c, note that if f = cg + h for
some c ∈ F and some h ∈ V with $\langle h, g\rangle = 0$, then we must
have
$$\langle f, g\rangle = \langle cg + h, g\rangle = c\|g\|^2,$$
which implies that $c = \frac{\langle f, g\rangle}{\|g\|^2}$, which then implies that
$h = f - \frac{\langle f, g\rangle}{\|g\|^2} g$. Hence we are led to the following result.
[Figure: Here f = cg + h, where h is orthogonal to g.]
Suppose f and g are elements of an inner product space V, with g ≠ 0. Then there
exists h ∈ V such that
$$\langle h, g\rangle = 0 \quad\text{and}\quad f = \frac{\langle f, g\rangle}{\|g\|^2} g + h.$$
Proof Set $h = f - \frac{\langle f, g\rangle}{\|g\|^2} g$. Then
$$\langle h, g\rangle = \Bigl\langle f - \frac{\langle f, g\rangle}{\|g\|^2} g, g\Bigr\rangle = \langle f, g\rangle - \frac{\langle f, g\rangle}{\|g\|^2}\langle g, g\rangle = 0,$$
giving the first equation in the conclusion. The second equation in the conclusion
follows immediately from the definition of h.
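The construction in the proof above translates directly into a computation. As a minimal sketch, assuming the standard inner product on $\mathbf{C}^3$ and hypothetical helper names, the code below computes $c = \langle f, g\rangle / \|g\|^2$ and $h = f - cg$ for one pair of vectors and records how far $\langle h, g\rangle$ is from 0.

```python
def inner(a, b):
    """Standard inner product on C^n, linear in the first slot."""
    return sum(x * complex(y).conjugate() for x, y in zip(a, b))

def decompose(f, g):
    """Return (c, h) with f = c*g + h and <h, g> = 0; requires g != 0."""
    norm_g_sq = inner(g, g).real
    if norm_g_sq == 0:
        raise ValueError("g must be nonzero")
    c = inner(f, g) / norm_g_sq
    h = [x - c * y for x, y in zip(f, g)]
    return c, h

f = [1 + 1j, 2, -1j]
g = [1, 1j, 0]
c, h = decompose(f, g)

residual = inner(h, g)                       # 0 up to rounding
rebuilt = [c * y + z for y, z in zip(g, h)]  # recovers f
```

For these particular vectors the coefficient works out to c = 0.5 − 0.5i; other inputs can be substituted as long as g is nonzero.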
The orthogonal decomposition 8.10 will be used in our proof of the next result,
which is one of the most important inequalities in mathematics.
$$|\langle f, g\rangle| \le \|f\| \, \|g\|,$$
Proof If g = 0, then both sides of the desired inequality equal 0. Thus we can
assume g ≠ 0. Consider the orthogonal decomposition
$$f = \frac{\langle f, g\rangle}{\|g\|^2} g + h$$
given by 8.10, where h is orthogonal to g. The Pythagorean Theorem (8.9) implies
$$\|f\|^2 = \Bigl\|\frac{\langle f, g\rangle}{\|g\|^2} g\Bigr\|^2 + \|h\|^2$$
$$= \frac{|\langle f, g\rangle|^2}{\|g\|^2} + \|h\|^2$$
8.12 $$\ge \frac{|\langle f, g\rangle|^2}{\|g\|^2}.$$
Multiplying both sides of this inequality by $\|g\|^2$ and then taking square roots gives
the desired inequality.
The proof above shows that the Cauchy–Schwarz Inequality is an equality if and
only if 8.12 is an equality. This happens if and only if h = 0. But h = 0 if and only
if f is a scalar multiple of g (see 8.10). Thus the Cauchy–Schwarz Inequality is an
equality if and only if f is a scalar multiple of g or g is a scalar multiple of f (or
both; the phrasing has been chosen to cover cases in which either f or g equals 0).
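A quick numerical experiment can make the inequality and its equality case concrete. The sketch below assumes the standard inner product on $\mathbf{C}^5$ and uses pseudorandom vectors with a fixed seed; the names are hypothetical, and a finite sample is evidence rather than proof.

```python
import math
import random

def inner(a, b):
    return sum(x * complex(y).conjugate() for x, y in zip(a, b))

def norm(a):
    return math.sqrt(inner(a, a).real)

random.seed(0)                      # fixed seed so the run is reproducible

def random_vector(n):
    return [complex(random.uniform(-1, 1), random.uniform(-1, 1))
            for _ in range(n)]

# Smallest value of ||f|| ||g|| - |<f, g>| over many random pairs;
# the Cauchy-Schwarz Inequality says this is never negative.
smallest_gap = min(
    norm(f) * norm(g) - abs(inner(f, g))
    for f, g in ((random_vector(5), random_vector(5)) for _ in range(1000))
)

# Equality case: f is a scalar multiple of g, so the gap vanishes.
g = random_vector(5)
f = [(2 - 3j) * y for y in g]
equality_gap = norm(f) * norm(g) - abs(inner(f, g))
```

The value of `smallest_gap` stays nonnegative (up to rounding), while `equality_gap` is zero up to rounding because f was built as a scalar multiple of g.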
$$\|f + g\| \le \|f\| + \|g\|,$$
Proof We have
$$\|f + g\|^2 = \langle f + g, f + g\rangle$$
$$= \langle f, f\rangle + \langle g, g\rangle + \langle f, g\rangle + \langle g, f\rangle$$
$$= \langle f, f\rangle + \langle g, g\rangle + \langle f, g\rangle + \overline{\langle f, g\rangle}$$
$$= \|f\|^2 + \|g\|^2 + 2\operatorname{Re}\langle f, g\rangle$$
8.16 $$\le \|f\|^2 + \|g\|^2 + 2|\langle f, g\rangle|$$
8.17 $$\le \|f\|^2 + \|g\|^2 + 2\|f\| \, \|g\|$$
$$= (\|f\| + \|g\|)^2,$$
where 8.17 follows from the Cauchy–Schwarz Inequality (8.11). Taking square roots
of both sides of the inequality above gives the desired inequality.
The proof above shows that the Triangle Inequality is an equality if and only if we
have equality in 8.16 and 8.17. Thus we have equality in the Triangle Inequality if
and only if
8.18 $$\langle f, g\rangle = \|f\| \, \|g\|.$$
If one of f , g is a nonnegative multiple of the other, then 8.18 holds, as you should
verify. Conversely, suppose 8.18 holds. Then the condition for equality in the Cauchy–
Schwarz Inequality (8.11) implies that one of f , g is a scalar multiple of the other.
Clearly 8.18 forces the scalar in question to be nonnegative, as desired.
Applying the previous result to the inner product space L2 (µ), where µ is a
measure, gives a new proof of Minkowski’s Inequality (7.14) for the case p = 2.
Now we can prove that what we have been calling a norm on an inner product
space is indeed a norm.
Proof The definition of an inner product implies that k·k satisfies the positive defi-
nite requirement for a norm. The homogeneity and triangle inequality requirements
for a norm are satisfied because of 8.6 and 8.15.
The next result has the geometric interpretation that the sum of the squares
of the lengths of the diagonals of a parallelogram equals the sum of the
squares of the lengths of the four sides.
$$\|f + g\|^2 + \|f - g\|^2 = 2\|f\|^2 + 2\|g\|^2.$$
Proof We have
$$\|f + g\|^2 + \|f - g\|^2 = \langle f + g, f + g\rangle + \langle f - g, f - g\rangle$$
$$= \|f\|^2 + \|g\|^2 + \langle f, g\rangle + \langle g, f\rangle + \|f\|^2 + \|g\|^2 - \langle f, g\rangle - \langle g, f\rangle$$
$$= 2\|f\|^2 + 2\|g\|^2,$$
as desired.
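The Parallelogram Equality separates norms that come from inner products from those that do not. As a small sketch (the vectors, the choice of $\mathbf{R}^2$, and the helper names are assumptions of this illustration), the code below evaluates the difference between the two sides for the Euclidean norm and for the sup norm $\max\{|x|, |y|\}$.

```python
import math

def norm2(v):
    """Euclidean norm, which comes from the standard inner product."""
    return math.sqrt(sum(abs(x) ** 2 for x in v))

def norm_sup(v):
    """Sup norm max |x_k|, shown here for contrast."""
    return max(abs(x) for x in v)

def parallelogram_defect(norm, f, g):
    """||f+g||^2 + ||f-g||^2 - 2||f||^2 - 2||g||^2 for the given norm."""
    plus = [x + y for x, y in zip(f, g)]
    minus = [x - y for x, y in zip(f, g)]
    return (norm(plus) ** 2 + norm(minus) ** 2
            - 2 * norm(f) ** 2 - 2 * norm(g) ** 2)

f = [1.0, 0.0]
g = [0.0, 1.0]
defect_euclidean = parallelogram_defect(norm2, f, g)   # 2 + 2 - 2 - 2
defect_sup = parallelogram_defect(norm_sup, f, g)      # 1 + 1 - 2 - 2
```

The Euclidean defect is zero up to rounding, while the sup-norm defect is −2, so the sup norm on $\mathbf{R}^2$ cannot be the norm associated with any inner product.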
EXERCISES 8A
1 Let V denote the vector space of bounded continuous functions from R to F.
Let $r_1, r_2, \dots$ be a list of the rational numbers. For f, g ∈ V, define
$$\langle f, g\rangle = \sum_{k=1}^{\infty} \frac{f(r_k)\overline{g(r_k)}}{2^k}.$$
9 (a) Suppose f and g are elements of a real inner product space. Prove that f
and g have the same norm if and only if f + g is orthogonal to f − g.
(b) Use part (a) to show that the diagonals of a parallelogram are perpendicular
to each other if and only if the parallelogram is a rhombus.
10 Suppose f and g are elements of an inner product space. Prove that $\|f\| = \|g\|$
if and only if $\|sf + tg\| = \|tf + sg\|$ for all s, t ∈ R.
11 Suppose f and g are elements of an inner product space and $\|f\| = \|g\| = 1$
and $\langle f, g\rangle = 1$. Prove that f = g.
12 Suppose f and g are elements of a real inner product space. Prove that
$$\langle f, g\rangle = \frac{\|f + g\|^2 - \|f - g\|^2}{4}.$$
13 Suppose f and g are elements of a complex inner product space. Prove that
$$\langle f, g\rangle = \frac{\|f + g\|^2 - \|f - g\|^2 + \|f + ig\|^2 i - \|f - ig\|^2 i}{4}.$$
14 Suppose f, g, h are elements of an inner product space. Prove that
$$\|h - \tfrac{1}{2}(f + g)\|^2 = \frac{\|h - f\|^2 + \|h - g\|^2}{2} - \frac{\|f - g\|^2}{4}.$$
15 Prove that a norm satisfying the parallelogram equality comes from an inner
product. In other words, show that if V is a normed vector space whose norm
$\|\cdot\|$ satisfies the parallelogram equality, then there is an inner product $\langle \cdot, \cdot\rangle$ on
V such that $\|f\| = \langle f, f\rangle^{1/2}$ for all f ∈ V.
16 Suppose $(X, \mathcal{S}, \mu)$ is a measure space. Let V be the subspace of $L^2(\mu)$ defined
by
$$V = \bigl\{f \in L^2(\mu) : \|f\|_\infty < \infty \text{ and } \mu(\{x \in X : f(x) \ne 0\}) < \infty\bigr\}.$$
For f, g ∈ V, define $\langle f, g\rangle$ by
$$\langle f, g\rangle = \int f \overline{g} \, d\mu.$$
The integral above makes sense without the use of Hölder’s Inequality because
of the definition of V.
(a) Show that the Cauchy–Schwarz Inequality implies that
$$\|fg\|_1 \le \|f\|_2 \|g\|_2$$
for all f, g ∈ V (again, without using Hölder’s Inequality).
(b) Now suppose $f, g \in L^2(\mu)$. Let $f_1, f_2, \dots$ and $g_1, g_2, \dots$ be the increasing
sequences of simple functions that approximate |f| and |g| as constructed
in 2.82. Show that each $f_k$ and each $g_k$ is in V. Then apply part (a) to $f_k$
and $g_k$ and use an appropriate limit theorem to conclude (without using
Hölder’s Inequality) that $\|fg\|_1 \le \|f\|_2 \|g\|_2$.
(b) Describe the set of Borel measurable functions f : [1, ∞) → [0, ∞) such
that the inequality in part (a) is an equality.
$$\langle (f_1, \dots, f_m), (g_1, \dots, g_m)\rangle = \langle f_1, g_1\rangle + \cdots + \langle f_m, g_m\rangle$$
$$a^2 + b^2 = \tfrac{1}{2}c^2 + 2d^2.$$
8B Orthogonality
Orthogonal Projections
The previous section developed inner product spaces following the pattern used in the
author’s textbook Linear Algebra Done Right and other linear algebra books. Linear
algebra focuses mainly on finite-dimensional vector spaces. Many interesting results
about infinite-dimensional inner product spaces require an additional hypothesis,
which we now introduce.
A Hilbert space is an inner product space that is a Banach space with the norm
determined by the inner product.
• Suppose µ is a measure. Then $L^2(\mu)$ with its usual inner product is a Hilbert
space (by 7.23).
• As a special case of the first bullet point, if n ∈ Z+ then taking µ to be counting
measure on {1, . . . , n} shows that $\mathbf{F}^n$ with its usual inner product is a Hilbert
space.
• As another special case of the first bullet point, taking µ to be counting measure
on Z+ shows that $\ell^2$ with its usual inner product is a Hilbert space.
• Every closed subspace of a Hilbert space is a Hilbert space [by 6.16(b)].
The next definition makes sense in the context of normed vector spaces.
$$\operatorname{distance}(f, U) = \inf\{\|f - g\| : g \in U\}.$$
• A subset of a vector space is called convex if the subset contains the line
segment connecting each pair of points in it.
The next example shows that the distance from an element of a Banach space to a
closed subspace is not necessarily attained by some element of the closed subspace.
After this example, we will prove that this behavior cannot happen in a Hilbert space.
$$g_k(x) = \frac{1}{2} - x + \frac{x^k}{2} + \frac{x - 1}{k + 1}.$$
In the next result, we use for the first time the hypothesis that V is a Hilbert space.
$$\|f - g\| = \operatorname{distance}(f, U).$$
Proof First we prove the existence of an element of U that attains the distance to f.
To do this, suppose $g_1, g_2, \dots$ is a sequence of elements of U such that
8.29 $$\lim_{k\to\infty} \|f - g_k\| = \operatorname{distance}(f, U).$$
$$= 2\|f - g_k\|^2 + 2\|f - g_j\|^2 - \|2f - (g_k + g_j)\|^2$$
$$= 2\|f - g_k\|^2 + 2\|f - g_j\|^2 - 4\Bigl\|f - \frac{g_k + g_j}{2}\Bigr\|^2$$
8.30 $$\le 2\|f - g_k\|^2 + 2\|f - g_j\|^2 - 4\bigl(\operatorname{distance}(f, U)\bigr)^2,$$
where the second equality comes from the Parallelogram Equality (8.20) and the
last line holds because the convexity of U implies that $(g_k + g_j)/2 \in U$. Now the
inequality above and 8.29 imply that $g_1, g_2, \dots$ is a Cauchy sequence. Thus there
exists g ∈ V such that
8.31 $$\lim_{k\to\infty} \|g_k - g\| = 0.$$
Because U is a closed subset of V and each $g_k \in U$, we know that g ∈ U. Now 8.29
and 8.31 imply that
$$\|f - g\| = \operatorname{distance}(f, U),$$
which completes the proof of the existence part of this result.
To prove the uniqueness part of this result, suppose $g$ and $\tilde{g}$ are elements of U
such that
8.32 $$\|f - g\| = \|f - \tilde{g}\| = \operatorname{distance}(f, U).$$
Then
$$\|g - \tilde{g}\|^2 \le 2\|f - g\|^2 + 2\|f - \tilde{g}\|^2 - 4\bigl(\operatorname{distance}(f, U)\bigr)^2$$
8.33 $$= 0,$$
where the first line above follows from 8.30 (with $g_j$ replaced by $g$ and $g_k$ replaced
by $\tilde{g}$) and the last line above follows from 8.32. Now 8.33 implies that $g = \tilde{g}$,
completing the proof of uniqueness.
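A finite-dimensional case shows what the result says in practice. In the sketch below, the Hilbert space is $\mathbf{R}^4$ with the Euclidean norm and the closed convex set U is the box $[0, 1]^4$; both choices, and the fact that the closest point in a box is obtained by clamping each coordinate, are assumptions of this illustration rather than statements from the text. Random sampling is used only as a sanity check that no sampled point of U is closer.

```python
import math
import random

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def closest_in_box(f, low=0.0, high=1.0):
    """Closest point to f in the box [low, high]^n (clamp each coordinate)."""
    return [min(max(x, low), high) for x in f]

random.seed(1)                      # fixed seed so the run is reproducible
f = [1.7, -0.4, 0.3, 2.5]
g = closest_in_box(f)               # the candidate minimizer in U
best = distance(f, g)

# No sampled point of the box should be closer to f than g is.
samples = [[random.uniform(0.0, 1.0) for _ in f] for _ in range(5000)]
closest_sample_distance = min(distance(f, s) for s in samples)
```

The sampled minimum is never smaller than `best`, consistent with g attaining distance(f, U).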
Example 8.27 showed that the existence part of the previous result can fail in a
Banach space. Exercise 13 shows that the uniqueness part can also fail in a Banach
space. These observations highlight the advantages of working in a Hilbert space.
The definition above makes sense because of 8.28. We will often use the notation
$P_U f$ instead of $P_U(f)$. To test your understanding of the definition above, make sure
that you can show that if U is a nonempty closed convex subset of a Hilbert space V,
then
• $P_U f = f$ if and only if f ∈ U;
• $P_U \circ P_U = P_U$.
$$P_U b = (b_1, 0, b_3, 0, b_5, 0, \dots),$$
The next result shows that the properties stated in the last two paragraphs of the
example above hold whenever U is a closed subspace of a Hilbert space.
Proof The figure below illustrates (a). To prove (a), suppose g ∈ U. Then for all
α ∈ F we have
$$\|f - P_U f\|^2 \le \|f - P_U f + \alpha g\|^2$$
$$= \langle f - P_U f + \alpha g, f - P_U f + \alpha g\rangle$$
$$= \|f - P_U f\|^2 + |\alpha|^2 \|g\|^2 + 2\operatorname{Re} \overline{\alpha}\langle f - P_U f, g\rangle.$$
Let $\alpha = -t\langle f - P_U f, g\rangle$ for t > 0. A tiny bit of algebra applied to the inequality
above implies
$$2|\langle f - P_U f, g\rangle|^2 \le t|\langle f - P_U f, g\rangle|^2 \|g\|^2$$
for all t > 0. Thus $\langle f - P_U f, g\rangle = 0$, completing the proof of (a).
To prove (b), suppose h ∈ U and f − h is orthogonal to g for every g ∈ U. If
g ∈ U, then h − g ∈ U and hence f − h is orthogonal to h − g. Thus
$$\|f - h\|^2 \le \|f - h\|^2 + \|h - g\|^2$$
$$= \|(f - h) + (h - g)\|^2$$
$$= \|f - g\|^2,$$
where the first equality above follows from the Pythagorean Theorem (8.9). Thus
$$\|f - h\| \le \|f - g\|$$
for all g ∈ U. Hence h is the element of U that minimizes the distance to f, which
implies that $h = P_U f$, completing the proof of (b).
[Figure: $f - P_U f$ is orthogonal to each element of U.]
To prove (c), suppose $f_1, f_2 \in V$. If g ∈ U, then (a) implies that $\langle f_1 - P_U f_1, g\rangle =
\langle f_2 - P_U f_2, g\rangle = 0$, and thus
$$\langle (f_1 + f_2) - (P_U f_1 + P_U f_2), g\rangle = 0.$$
The equation above and (b) now imply that
$$P_U(f_1 + f_2) = P_U f_1 + P_U f_2.$$
The equation above and the equation $P_U(\alpha f) = \alpha P_U f$ for α ∈ F (whose proof is left
to the reader) show that $P_U$ is a linear map, proving (c).
The proof of (d) is left as an exercise for the reader.
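The properties proved above can be observed in a small finite-dimensional example. The sketch below assumes real scalars, takes U to be a two-dimensional subspace of $\mathbf{R}^4$, and computes $P_U f$ as the sum of $\langle f, e\rangle e$ over an orthonormal basis of U obtained by Gram–Schmidt. That formula is the standard finite-dimensional one and is brought in here as an assumption; it is not derived in the passage above. All names are hypothetical.

```python
import math

def inner(a, b):
    return sum(x * y for x, y in zip(a, b))      # real scalars only

def norm(a):
    return math.sqrt(inner(a, a))

def orthonormalize(vectors):
    """Gram-Schmidt; assumes the input vectors are linearly independent."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            coeff = inner(w, e)
            w = [x - coeff * y for x, y in zip(w, e)]
        length = norm(w)
        basis.append([x / length for x in w])
    return basis

def project(f, basis):
    """P_U f for U = span(basis), where basis is orthonormal."""
    result = [0.0] * len(f)
    for e in basis:
        coeff = inner(f, e)
        result = [r + coeff * y for r, y in zip(result, e)]
    return result

spanning = [[1.0, 1.0, 0.0, 0.0], [0.0, 1.0, 1.0, 0.0]]
basis = orthonormalize(spanning)
f1 = [3.0, -1.0, 2.0, 5.0]
f2 = [0.5, 4.0, -2.0, 1.0]
p1 = project(f1, basis)

# (a) f - P_U f is orthogonal to U (checked on a spanning set of U)
residuals = [inner([x - y for x, y in zip(f1, p1)], g) for g in spanning]
# (c) additivity: P_U(f1 + f2) = P_U f1 + P_U f2
sum_then_project = project([x + y for x, y in zip(f1, f2)], basis)
project_then_sum = [x + y for x, y in zip(p1, project(f2, basis))]
# (d) ||P_U f|| <= ||f||
norm_gap = norm(f1) - norm(p1)
```

Because `f1` has a nonzero fourth coordinate it does not lie in U, so `norm_gap` is strictly positive here, matching the equality condition in (d).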
Orthogonal Complements
$$U^\perp = \{h \in V : \langle g, h\rangle = 0 \text{ for all } g \in U\}.$$
The results in the rest of this subsection have as a hypothesis that V is a Hilbert
space. These results do not hold when V is only an inner product space.
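$$\overline{U} = (U^\perp)^\perp.$$
$$f - P_{\overline{U}} f \in (U^\perp)^\perp.$$
Also,
$$f - P_{\overline{U}} f \in U^\perp$$
by 8.37(a) and 8.40(d). Hence
$$f - P_{\overline{U}} f \in U^\perp \cap (U^\perp)^\perp.$$
To prove the other direction, now suppose $U^\perp = \{0\}$. Then 8.41 implies that
$$\overline{U} = (U^\perp)^\perp = \{0\}^\perp = V,$$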
f = g + h,
$$f = P_U f + (f - P_U f),$$
$$f = g_1 + h_1 = g_2 + h_2,$$
In the next definition, the function I depends upon the vector space V. Thus a
notation such as $I_V$ might be more precise. However, the domain of I should always
be clear from the context.
Suppose V is a vector space. The identity map I is the linear map from V to V
defined by I f = f for f ∈ V.
The next result highlights the close relationship between orthogonal projections
and orthogonal complements.
$$P_{U^\perp} f = 0 = f - P_U f = (I - P_U) f,$$
where the first equality above holds because null $P_{U^\perp} = U$ [by (b)].
If $f \in U^\perp$, then
$$P_{U^\perp} f = f = f - P_U f = (I - P_U) f,$$
where the second equality above holds because null $P_U = U^\perp$ [by (a)].
The last two displayed equations show that $P_{U^\perp}$ and $I - P_U$ agree on U and agree
on $U^\perp$. Because $P_{U^\perp}$ and $I - P_U$ are both linear maps and because each element of
V equals some element of U plus some element of $U^\perp$ (by 8.43), this implies that
$P_{U^\perp} = I - P_U$, completing the proof of (c).
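The identity $P_{U^\perp} = I - P_U$ can be confirmed by computing both sides independently in a small case. The sketch below assumes real scalars and takes U to be the span of (1, 1, 1) in $\mathbf{R}^3$; the two vectors chosen to span $U^\perp$ and the helper name are assumptions of this example.

```python
def inner(a, b):
    return sum(x * y for x, y in zip(a, b))      # real scalars only

def project_onto_orthogonal_list(f, vectors):
    """Projection onto span(vectors); vectors must be mutually orthogonal."""
    result = [0.0] * len(f)
    for v in vectors:
        coeff = inner(f, v) / inner(v, v)
        result = [r + coeff * y for r, y in zip(result, v)]
    return result

# U = span{(1, 1, 1)}. The two vectors below are orthogonal to (1, 1, 1)
# and to each other, so they span the orthogonal complement of U.
u_vectors = [[1.0, 1.0, 1.0]]
u_perp_vectors = [[1.0, -1.0, 0.0], [1.0, 1.0, -2.0]]

f = [4.0, -2.0, 7.0]
p_u = project_onto_orthogonal_list(f, u_vectors)
p_u_perp = project_onto_orthogonal_list(f, u_perp_vectors)
identity_minus_p_u = [x - y for x, y in zip(f, p_u)]
```

For this f, both `p_u_perp` and `identity_minus_p_u` come out as (1, −5, 4) up to rounding.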
8.46 Example $P_{U^\perp} = I - P_U$
Suppose U is the closed subspace of $L^2(\mathbf{R})$ defined by
$$= \varphi(f),$$
where 8.48 holds because $f - \frac{\varphi(f)}{\|h\|^2} h \in \operatorname{null} \varphi$ and h is orthogonal to all elements of
$\operatorname{null} \varphi$.
We have now proved the existence of h ∈ V such that $\varphi(f) = \langle f, h\rangle$ for all
f ∈ V. To prove uniqueness, suppose $\tilde{h} \in V$ has the same property. Then
$$\langle h - \tilde{h}, h - \tilde{h}\rangle = \langle h - \tilde{h}, h\rangle - \langle h - \tilde{h}, \tilde{h}\rangle = \varphi(h - \tilde{h}) - \varphi(h - \tilde{h}) = 0,$$
which implies that $h = \tilde{h}$, which proves uniqueness.
The Cauchy–Schwarz Inequality implies that $|\varphi(f)| = |\langle f, h\rangle| \le \|f\| \, \|h\|$ for
all f ∈ V, which implies that $\|\varphi\| \le \|h\|$. Because $\varphi(h) = \langle h, h\rangle = \|h\|^2$, we also
have $\|\varphi\| \ge \|h\|$. Thus $\|\varphi\| = \|h\|$, completing the proof.
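In finite dimensions the representing element can be written down explicitly. As a hedged sketch, assume $V = \mathbf{C}^3$ with the standard inner product and a functional given by $\varphi(f) = \sum_k c_k f_k$; the coefficients and names are hypothetical. Because the inner product conjugates the second slot, the representing vector is $h_k = \overline{c_k}$.

```python
import math

def inner(a, b):
    return sum(x * complex(y).conjugate() for x, y in zip(a, b))

def norm(a):
    return math.sqrt(inner(a, a).real)

# A linear functional on C^3 given by coefficients (hypothetical example).
coeffs = [2 - 1j, 3j, -1.0]

def phi(f):
    return sum(c * x for c, x in zip(coeffs, f))

# Representing vector: h_k = conjugate(c_k), so that phi(f) = <f, h>.
h = [complex(c).conjugate() for c in coeffs]

f = [1 + 1j, -2.0, 0.5j]
representation_error = phi(f) - inner(f, h)   # 0 up to rounding

# phi(h) = ||h||^2, so the unit vector h / ||h|| gives |phi| = ||h||,
# while Cauchy-Schwarz gives |phi(f)| <= ||f|| ||h|| for every f.
unit = [x / norm(h) for x in h]
attained = abs(phi(unit))
```

The value `attained` equals `norm(h)` up to rounding, which mirrors the two inequalities at the end of the proof.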
we showed that this map preserves norms. In the special case where p = p′ = 2,
the Riesz Representation Theorem (8.47) shows that this map is surjective. In other
words, if $\varphi$ is a bounded linear functional on $L^2(\mu)$, then there exists $h \in L^2(\mu)$ such
that
$$\varphi(f) = \int f h \, d\mu$$
for all $f \in L^2(\mu)$ (take h to be the complex conjugate of the function given by 8.47).
Hence we can identify the dual of $L^2(\mu)$ with $L^2(\mu)$. In 9.42 we will deal with other
values of p. Also see Exercise 25 in this section.
EXERCISES 8B
1 Show that each of the inner product spaces in Example 8.23 is not a Hilbert
space.
2 Prove or disprove: The inner product space in Exercise 1 in Section 8A is a
Hilbert space.
3 Suppose $V_1, V_2, \dots$ are Hilbert spaces. Let
$$V = \Bigl\{(f_1, f_2, \dots) \in V_1 \times V_2 \times \cdots : \sum_{k=1}^{\infty} \|f_k\|^2 < \infty\Bigr\}.$$
5 Prove that if V is a normed vector space, f ∈ V, and r > 0, then the open ball
B( f , r ) centered at f with radius r is convex.
6 (a) Suppose V is an inner product space and B is the open unit ball in V (thus
$B = \{f \in V : \|f\| < 1\}$). Prove that if U is a subset of V such that
$B \subset U \subset \overline{B}$, then U is convex.
(b) Give an example to show that the result in part (a) can fail if the phrase
inner product space is replaced by Banach space.
7 Suppose V is a normed vector space and U is a closed subset of V. Prove that
U is convex if and only if
$$\frac{f + g}{2} \in U \text{ for all } f, g \in U.$$
8 Prove that if U is a convex subset of a normed vector space, then $\overline{U}$ is also
convex.
9 Prove that if U is a convex subset of a normed vector space, then the interior of
U is also convex.
[The interior of U is the set { f ∈ U : B( f , r ) ⊂ U for some r > 0}.]
10 Suppose V is a Hilbert space, U is a nonempty closed convex subset of V, and
g ∈ U is the unique element of U with smallest norm (obtained by taking f = 0
in 8.28). Prove that
Reh g, hi ≥ k gk2
for all h ∈ U.
11 Suppose V is a Hilbert space. A closed half-space of V is a set of the form
{ g ∈ V : Reh g, hi ≥ c}
for some h ∈ V and some c ∈ R. Prove that every closed convex subset of V is
the intersection of all the closed half-spaces that contain it.
12 Give an example of a nonempty closed subset U of the Hilbert space ℓ² and
a ∈ ℓ² such that there does not exist b ∈ U with ‖a − b‖ = distance(a, U).
[By 8.28, U cannot be a convex subset of ℓ².]
13 In the real Banach space R² with norm defined by ‖(x, y)‖∞ = max{|x|, |y|},
give an example of a closed convex set U ⊂ R² and z ∈ R² such that there
exist infinitely many choices of w ∈ U with ‖z − w‖ = distance(z, U).
14 Suppose f and g are elements of an inner product space. Prove that ⟨f, g⟩ = 0
if and only if
‖f‖ ≤ ‖f + αg‖
for all α ∈ F.
15 Suppose U is a closed subspace of a Hilbert space V and f ∈ V. Prove that
‖P_U f‖ ≤ ‖f‖, with equality if and only if f ∈ U.
[This exercise asks you to prove 8.37(d).]
(b) Same as part (a), but with the hypothesis that µ is a finite measure replaced
by the hypothesis that µ is a measure.
[See 7.24, which along with this exercise shows that we can identify the dual of
$L^p(\mu)$ with $L^{p'}(\mu)$ for 1 < p ≤ 2. See 9.42 for an extension to all p ∈ (1, ∞).]
26 Prove that if V is an infinite-dimensional Hilbert space, then the Banach space
B(V) is nonseparable.
8C Orthonormal Bases
Bessel’s Inequality
Recall that a family {ek }k∈Γ in a set V is a function e from a set Γ to V, with the
value of the function e at k ∈ Γ denoted by ek (see 6.53).
for all j, k ∈ Γ.
In other words, a family {ek}k∈Γ is an orthonormal family if ej and ek are orthogonal
for all distinct j, k ∈ Γ and ‖ek‖ = 1 for all k ∈ Γ.
• For k ∈ Z+, let ek be the element of ℓ² all of whose coordinates are 0 except for
the kth coordinate, which is 1:
ek = (0, . . . , 0, 1, 0, . . .).
Then {ek}k∈Z+ is an orthonormal family in ℓ². In this case, our family is a
sequence; thus we can call {ek}k∈Z+ an orthonormal sequence.
• More generally, suppose Γ is a nonempty set. The Hilbert space L²(µ), where
µ is counting measure on Γ, is often denoted by ℓ²(Γ). For k ∈ Γ, define a
function ek : Γ → F by
\[ e_k(j) = \begin{cases} 1 & \text{if } j = k, \\ 0 & \text{if } j \neq k. \end{cases} \]
(see Exercise 1 for useful formulas that will help in this verification).
This orthonormal family {ek }k∈Z leads to the classical theory of Fourier series,
as we will see in more depth in Chapter 11.
• Now we modify the example in the previous bullet point by translating the
functions in the previous bullet point by arbitrary integers. Specifically, for k a
nonnegative integer and m ∈ Z, define ek,m : R → F by
The next result gives our first indication of why orthonormal families are so useful.
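\begin{align*}
&= \sum_{j,k \in \Omega} \alpha_j \overline{\alpha_k}\, \langle e_j, e_k \rangle \\
&= \sum_{j \in \Omega} |\alpha_j|^2 ,
\end{align*}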
as desired.
Exercises at the end of this section ask you to develop basic properties of unordered
sums, including the following:
• Suppose { ak }k∈Γ is a family in R and ak ≥ 0 for each k ∈ Γ. Then the unordered
sum ∑k∈Γ ak converges if and only if
\[ \sup\Bigl\{ \sum_{j \in \Omega} a_j : \Omega \text{ is a finite subset of } \Gamma \Bigr\} < \infty. \]
Suppose {ek }k∈Γ is an orthonormal family in a Hilbert space V. Suppose {αk }k∈Γ
is a family in F. Then
Proof First suppose that ∑k∈Γ αk ek converges, with ∑k∈Γ αk ek = g. Suppose ε > 0.
Then there exists a finite set Ω ⊂ Γ such that
\[ \Bigl\| g - \sum_{j \in \Omega'} \alpha_j e_j \Bigr\| < \varepsilon \]
Thus $\|g\| = \bigl(\sum_{k\in\Gamma} |\alpha_k|^2\bigr)^{1/2}$, completing the proof of one direction of (a) and the
proof of (b).
To prove the other direction of (a), now suppose ∑k∈Γ |αk |2 < ∞. Thus there
exists an increasing sequence Ω1 ⊂ Ω2 ⊂ · · · of finite subsets of Γ such that for
each m ∈ Z+,
\[ \sum_{j \in \Omega' \setminus \Omega_m} |\alpha_j|^2 < \frac{1}{m^2} \tag{8.54} \]
for every finite set Ω′ such that Ωm ⊂ Ω′ ⊂ Γ. For each m ∈ Z+, let
\[ g_m = \sum_{j \in \Omega_m} \alpha_j e_j . \]
\[ \|g_n - g_m\|^2 = \sum_{j \in \Omega_n \setminus \Omega_m} |\alpha_j|^2 < \frac{1}{m^2}. \]
\begin{align*}
&= \frac{1}{m} + \Bigl( \sum_{j \in \Omega' \setminus \Omega_m} |\alpha_j|^2 \Bigr)^{1/2} \\
&< \varepsilon,
\end{align*}
where the third line comes from 8.51 and the last line comes from 8.54. Thus
∑k∈Γ αk ek = g, completing the proof.
\[ = \sum_{j \in \Omega} |\langle f, e_j \rangle|^2 , \]
where the last equality follows from 8.51. Because the inequality above holds for
every finite set Ω ⊂ Γ, we conclude that ‖f‖² ≥ ∑k∈Γ |⟨f, ek⟩|², as desired.
Recall that the span of a family {ek }k∈Γ in a vector space is the set of finite sums
of the form
\[ \sum_{j \in \Omega} \alpha_j e_j , \]
Furthermore,
(b) $f = \sum_{k \in \Gamma} \langle f, e_k \rangle e_k$
Proof The right side of (a) above makes sense because of 8.53(a). Furthermore, the
right side of (a) above is a subspace of V because ℓ²(Γ) [which equals L²(µ), where
µ is counting measure on Γ] is closed under addition and scalar multiplication by 7.5.
Suppose first {αk }k∈Γ is a family in F and ∑k∈Γ |αk |2 < ∞. Let ε > 0. Then
there is a finite subset Ω of Γ such that
\[ \sum_{j \in \Gamma \setminus \Omega} |\alpha_j|^2 < \varepsilon^2 . \]
The definition of the closure (see 6.7) now implies that $\sum_{k\in\Gamma} \alpha_k e_k \in \overline{\operatorname{span}}\,\{e_k\}_{k\in\Gamma}$,
showing that the right side of (a) is contained in the left side of (a).
To prove the inclusion in the other direction, now suppose $f \in \overline{\operatorname{span}}\,\{e_k\}_{k\in\Gamma}$. Let
\[ g = \sum_{k \in \Gamma} \langle f, e_k \rangle e_k , \tag{8.58} \]
where the sum above converges by Bessel’s Inequality (8.56) and by 8.53(a). The
direction of the inclusion that we just proved implies that $g \in \overline{\operatorname{span}}\,\{e_k\}_{k\in\Gamma}$. Thus
⟨g − f, ek⟩ = 0 for every k ∈ Γ.
where the equality above comes from 8.40(d). Now 8.59 and the inclusion above
imply that f = g [see 8.40(b)], which along with 8.58 implies that f is in the right
side of (a), completing the proof of (a).
The equations f = g and 8.58 also imply (b).
Parseval’s Identity
Note that 8.51 implies that every orthonormal family in an inner product space is
linearly independent (see 6.54 to review the definition of linearly independent and
basis). Linear algebra deals mainly with finite-dimensional vector spaces, but infinite-
dimensional vector spaces frequently appear in analysis. The notion of a basis is not
so useful when doing analysis with infinite-dimensional vector spaces because the
definition of span does not take advantage of the possibility of summing an infinite
number of elements.
However, 8.57 tells us that taking the closure of the span of an orthonormal
family can capture the sum of infinitely many elements. Thus we make the following
definition.
ek = (0, . . . , 0, 1, 0, . . . , 0).
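• Let $e_1 = \bigl(\tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}\bigr)$, $e_2 = \bigl(-\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}, 0\bigr)$, and $e_3 = \bigl(\tfrac{1}{\sqrt{6}}, \tfrac{1}{\sqrt{6}}, -\tfrac{2}{\sqrt{6}}\bigr)$. Then
{ek}k∈{1,2,3} is an orthonormal basis of F³, as you should verify.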
• All five examples of orthonormal families in Example 8.50 are orthonormal
bases. The exercises ask you to verify that we have an orthonormal basis in the
first, second, fourth, and fifth bullet points of Example 8.50. For the third bullet
point (trigonometric functions), see Exercise 7 in Section 10D or see Chapter 11.
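The verification requested in the second bullet point above can be sketched numerically. The following is a minimal illustration in the real case F = R, using floating-point arithmetic; the test vector `f` and all function names are assumptions made for this sketch and do not come from the text. It checks that the Gram matrix of e1, e2, e3 is the identity, that the squared coefficients ⟨f, ek⟩ sum to ‖f‖² when all three vectors are used, and that the sum can only shrink (as Bessel's Inequality predicts) when e3 is dropped.

```python
import math

def inner(u, v):
    """Standard inner product on R^3 (real case, so no conjugation is needed)."""
    return sum(a * b for a, b in zip(u, v))

s2, s3, s6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)
e1 = (1 / s3, 1 / s3, 1 / s3)
e2 = (-1 / s2, 1 / s2, 0.0)
e3 = (1 / s6, 1 / s6, -2 / s6)
basis = [e1, e2, e3]

# Gram matrix: entry (j, k) is <e_j, e_k>; orthonormality means it is the identity.
gram = [[inner(ej, ek) for ek in basis] for ej in basis]

f = (2.0, -1.0, 5.0)  # arbitrary test vector (an assumption for illustration)
coeffs = [inner(f, ek) for ek in basis]

# With the full basis, the squared coefficients sum to ||f||^2.
norm_sq = inner(f, f)
full_sum = sum(c * c for c in coeffs)

# Dropping e3 leaves an orthonormal family that is not a basis,
# so the sum of squared coefficients is at most ||f||^2.
partial_sum = sum(c * c for c in coeffs[:2])

# Reconstruct f from its coefficients: f = sum_k <f, e_k> e_k.
recon = tuple(sum(c * ek[i] for c, ek in zip(coeffs, basis)) for i in range(3))
```

Floating-point arithmetic means the identities hold only up to rounding error, so comparisons should use a small tolerance.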
The next result shows why orthonormal bases are so useful: a Hilbert space with
orthonormal basis {ek}k∈Γ behaves like ℓ²(Γ).
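(a) $f = \sum_{k \in \Gamma} \langle f, e_k \rangle e_k$;
(b) $\langle f, g \rangle = \sum_{k \in \Gamma} \langle f, e_k \rangle \overline{\langle g, e_k \rangle}$;
(c) $\|f\|^2 = \sum_{k \in \Gamma} |\langle f, e_k \rangle|^2$.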
Proof The equation in (a) follows immediately from 8.57(b) and the definition of an
orthonormal basis.
\[ = \sum_{k \in \Gamma} \langle f, e_k \rangle \overline{\langle g, e_k \rangle}, \]
where the first equation follows from (a) and the second equation follows from the
definition of an unordered sum and the Cauchy–Schwarz Inequality.
Equation (c) follows from setting g = f in (b). An alternative proof: equation (c)
follows from 8.53(b) and the equation $f = \sum_{k\in\Gamma} \langle f, e_k \rangle e_k$ from (a).
• Suppose n ∈ Z+. Then Fn with the usual Hilbert space norm is separable
because the closure of the countable set
n =1
is ℓ².
• The Hilbert spaces L²([0, 1]) and L²(R) are separable, as an exercise asks you
to verify [hint: consider finite linear combinations with rational coefficients of
functions of the form $\chi_{(c,d)}$, where c and d are rational numbers].
A moment’s thought about the definition of closure (see 6.7) shows that a normed
vector space V is separable if and only if there exists a countable subset C of V such
that every open ball in V contains at least one element of C.
• Suppose Γ is an uncountable set. Then the Hilbert space ℓ²(Γ) is not separable.
To see this, note that $\|\chi_{\{j\}} - \chi_{\{k\}}\| = \sqrt{2}$ for all j, k ∈ Γ with j ≠ k. Hence
\[ \Bigl\{ B\bigl(\chi_{\{k\}}, \tfrac{\sqrt{2}}{2}\bigr) : k \in \Gamma \Bigr\} \]
We will present two proofs about the existence of orthonormal bases of Hilbert
spaces. The first proof works only for separable Hilbert spaces, but it gives a useful
algorithm, called the Gram–Schmidt process, for constructing orthonormal sequences.
The second proof works for all Hilbert spaces, but it uses a result that depends upon
the Axiom of Choice.
Which proof should you read? In practice, the Hilbert spaces you will encounter
will almost certainly be separable. Thus the first proof suffices, and it has the
additional benefit of introducing you to a widely-used algorithm. The second proof
uses an entirely different approach and has the advantage of applying to separable
and nonseparable Hilbert spaces. For maximum learning, read both proofs!
for each n ∈ Z+. This will imply that $\overline{\operatorname{span}}\,\{e_k\}_{k\in\mathbf{Z}^+} = V$, which will mean that
{ek}k∈Z+ is an orthonormal basis of V.
To get started with the induction, set e1 = f1/‖f1‖ (we can assume that f1 ≠ 0).
\[ P_U f = \sum_{k \in \Gamma} \langle f, e_k \rangle e_k \]
for all f ∈ V.
where the first equality follows from Parseval’s Identity [8.62(a)] as applied to U and
its orthonormal basis {ek }k∈Γ , and the second equality follows from 8.71.
Solution We will work in the real Hilbert space L²([−1, 1]) with the usual inner
product $\langle g, h \rangle = \int_{-1}^{1} g h$. For k ∈ {0, 1, . . . , 10}, let fk ∈ L²([−1, 1]) be defined by
$f_k(x) = x^k$. Let U be the subspace of L²([−1, 1]) defined by
Apply the Gram–Schmidt process from the proof of 8.66 to { f k }k∈{0, ..., 10} , pro-
ducing an orthonormal basis {ek }k∈{0,...,10} of U, which is a closed subspace of
L²([−1, 1]) (see Exercise 8). The point here is that {ek}k∈{0, ..., 10} can be computed
explicitly and exactly by using 8.69 and evaluating some integrals (using software that
can do exact rational arithmetic will make the process easier), getting $e_0(x) = 1/\sqrt{2}$,
$e_1(x) = \sqrt{6}\,x/2$, . . . up to
\[ e_{10}(x) = \frac{\sqrt{42}}{512}\bigl(-63 + 3465x^2 - 30030x^4 + 90090x^6 - 109395x^8 + 46189x^{10}\bigr). \]
Define f ∈ L²([−1, 1]) by $f(x) = \sqrt{|x|}$. Because U is the subspace of
L²([−1, 1]) consisting of polynomials of degree at most 10 and P_U f equals the
element of U closest to f (see 8.34), the formula in 8.70 tells us that the solution g to
our minimization problem is given by the formula
\[ g = \sum_{k=0}^{10} \langle f, e_k \rangle e_k . \]
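The computation described above can be sketched with exact rational arithmetic. This is only one possible arrangement, not necessarily the one used to produce the numbers in the text: polynomials are stored as coefficient lists, and the sketch works with unnormalized orthogonal polynomials p_k (so that no square roots appear), using e_k = p_k / ‖p_k‖ and hence ⟨f, e_k⟩ e_k = (⟨f, p_k⟩ / ⟨p_k, p_k⟩) p_k. The function names and the closed form used for the integral of √|x| · x^j over [−1, 1] are introduced here for illustration.

```python
from fractions import Fraction

def inner(p, q):
    """<p, q> = integral over [-1, 1] of p(x) q(x) dx, for coefficient lists p and q."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:  # odd powers of x integrate to 0 on [-1, 1]
                total += a * b * Fraction(2, i + j + 1)
    return total

def sqrt_abs_moment(j):
    """Integral over [-1, 1] of sqrt(|x|) * x**j dx, which is 4/(2j+3) for even j."""
    return Fraction(4, 2 * j + 3) if j % 2 == 0 else Fraction(0)

def gram_schmidt(n):
    """Orthogonal (unnormalized, monic) polynomials p_0, ..., p_n built from 1, x, ..., x**n."""
    ortho = []
    for k in range(n + 1):
        p = [Fraction(0)] * k + [Fraction(1)]  # the monomial x**k
        for q in ortho:
            c = inner(p, q) / inner(q, q)
            for i, b in enumerate(q):  # subtract the component of p along q
                p[i] -= c * b
        ortho.append(p)
    return ortho

def project_sqrt_abs(n):
    """Coefficients of the degree <= n polynomial closest to sqrt(|x|) in L^2([-1, 1])."""
    ortho = gram_schmidt(n)
    g = [Fraction(0)] * (n + 1)
    for p in ortho:
        f_dot_p = sum(a * sqrt_abs_moment(j) for j, a in enumerate(p))
        c = f_dot_p / inner(p, p)
        for i, a in enumerate(p):
            g[i] += c * a
    return ortho, g

ortho, g = project_sqrt_abs(10)
```

Here `ortho[1]` is the polynomial x with squared norm 2/3, which matches e1(x) = √6 x/2 after normalization, and `ortho[10]` is a scalar multiple of the polynomial in parentheses in the formula for e10. The list `g` holds the exact rational coefficients of the minimizing polynomial.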
Using the explicit expressions for e0 , . . . , e10 and again evaluating some integrals,
this gives
Now we are ready to prove that every Hilbert space has an orthonormal basis.
Before reading the next proof, you may want to review the definition of a chain (6.58),
which is a collection of sets such that for each pair of sets in the collection, one of
them is contained in the other. You should also review Zorn’s Lemma (6.60), which
gives a way to show that a collection of sets contains a maximal element.
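\[ h = \sum_{k \in \Gamma} \overline{\varphi(e_k)}\, e_k . \tag{8.76} \]
Then
\[ \varphi(f) = \langle f, h \rangle \tag{8.77} \]
for all f ∈ V. Furthermore, $\|\varphi\| = \bigl(\sum_{k\in\Gamma} |\varphi(e_k)|^2\bigr)^{1/2}$.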
Proof First we must show that the sum defining h makes sense. To do this, suppose
Ω is a finite subset of Γ. Then
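\[ \sum_{j\in\Omega} |\varphi(e_j)|^2 = \varphi\Bigl(\sum_{j\in\Omega} \overline{\varphi(e_j)}\, e_j\Bigr) \le \|\varphi\| \, \Bigl\| \sum_{j\in\Omega} \overline{\varphi(e_j)}\, e_j \Bigr\| = \|\varphi\| \Bigl(\sum_{j\in\Omega} |\varphi(e_j)|^2\Bigr)^{1/2}, \]
where the last equality follows from 8.51. Dividing by $\bigl(\sum_{j\in\Omega} |\varphi(e_j)|^2\bigr)^{1/2}$ gives
\[ \Bigl(\sum_{j\in\Omega} |\varphi(e_j)|^2\Bigr)^{1/2} \le \|\varphi\|. \]
Because the inequality above holds for every finite subset Ω of Γ, we conclude that
\[ \sum_{k\in\Gamma} |\varphi(e_k)|^2 \le \|\varphi\|^2 . \]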
Thus the sum defining h makes sense (by 8.53) in equation 8.76.
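Now 8.76 shows that $\langle h, e_j \rangle = \overline{\varphi(e_j)}$ for each j ∈ Γ. Thus if f ∈ V then
\[ \varphi(f) = \varphi\Bigl(\sum_{k\in\Gamma} \langle f, e_k \rangle e_k\Bigr) = \sum_{k\in\Gamma} \langle f, e_k \rangle \varphi(e_k) = \sum_{k\in\Gamma} \langle f, e_k \rangle \overline{\langle h, e_k \rangle} = \langle f, h \rangle, \]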
where the first and last equalities follow from 8.62 and the second equality follows
from the boundedness/continuity of ϕ. Thus 8.77 holds.
Finally, the Cauchy–Schwarz Inequality, 8.77, and the equation ϕ(h) = ⟨h, h⟩
show that $\|\varphi\| = \|h\| = \bigl(\sum_{k\in\Gamma} |\varphi(e_k)|^2\bigr)^{1/2}$.
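The formula 8.76 can be tried out in a finite-dimensional setting. The sketch below works in C³ with the inner product that is linear in the first slot and conjugate-linear in the second, and uses a real orthonormal basis regarded as a basis of C³. The functional `phi`, its `weights`, and the test vector `f` are hypothetical choices made only for this illustration.

```python
import math

def inner(u, v):
    """Inner product on C^3: linear in the first slot, conjugate-linear in the second."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

s2, s3, s6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)
raw_basis = [
    (1 / s3, 1 / s3, 1 / s3),
    (-1 / s2, 1 / s2, 0.0),
    (1 / s6, 1 / s6, -2 / s6),
]
# Store the basis vectors with complex entries so .conjugate() behaves uniformly.
basis = [tuple(complex(x) for x in e) for e in raw_basis]

weights = (2 + 1j, -3j, 0.5 + 0j)  # hypothetical functional: phi(f) = sum_i weights[i] * f[i]

def phi(f):
    return sum(w * x for w, x in zip(weights, f))

# h = sum_k conj(phi(e_k)) e_k, as in 8.76
h = [0j, 0j, 0j]
for e in basis:
    c = phi(e).conjugate()
    for i in range(3):
        h[i] += c * e[i]

f = (1 - 2j, 4 + 0j, -1j)  # arbitrary test vector
lhs = phi(f)               # phi(f)
rhs = inner(f, h)          # <f, h>, which 8.77 says equals phi(f)

norm_h = math.sqrt(inner(h, h).real)
coeff_norm = math.sqrt(sum(abs(phi(e)) ** 2 for e in basis))
```

In this example `h` comes out as the entrywise complex conjugate of `weights`, which is what one expects from ϕ(f) = ⟨f, h⟩, and `norm_h` agrees with `coeff_norm` up to rounding.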
EXERCISES 8C
1 Verify that the family {ek}k∈Z as defined in the third bullet point of Example
8.50 is an orthonormal family in L²((−π, π]). The following formulas should
help:
\begin{align*}
(\sin x)(\cos y) &= \frac{\sin(x+y) + \sin(x-y)}{2}, \\
(\sin x)(\sin y) &= \frac{\cos(x-y) - \cos(x+y)}{2}, \\
(\cos x)(\cos y) &= \frac{\cos(x+y) + \cos(x-y)}{2}.
\end{align*}
Furthermore, prove that if ∑k∈Γ ak converges then it equals the supremum above.
3 Suppose { ak }k∈Γ is a family in R. Prove that the unordered sum ∑k∈Γ ak
converges if and only if ∑k∈Γ | ak | < ∞.
4 Suppose {ek}k∈Γ is an orthonormal family in an inner product space V. Prove
that if f ∈ V, then {k ∈ Γ : ⟨f, ek⟩ ≠ 0} is a countable set.
5 Suppose { f k }k∈Γ and { gk }k∈Γ are families in a normed vector space such that
∑k∈Γ f k and ∑k∈Γ gk converge. Prove that ∑k∈Γ ( f k + gk ) converges and
\[ \sum_{k\in\Gamma} (f_k + g_k) = \sum_{k\in\Gamma} f_k + \sum_{k\in\Gamma} g_k . \]
6 Suppose {fk}k∈Γ is a family in a normed vector space such that ∑k∈Γ fk converges.
Prove that if c ∈ F, then ∑k∈Γ (c fk) converges and
\[ \sum_{k\in\Gamma} (c f_k) = c \sum_{k\in\Gamma} f_k . \]
7 Suppose {fk}k∈Z+ is a family in a normed vector space. Prove that the unordered
sum ∑k∈Z+ fk converges if and only if the usual ordered sum $\sum_{k=1}^{\infty} f_{p(k)}$
converges for every injective function p : Z+ → Z+.
8 Explain why 8.57 implies that if Γ is a finite set and {ek }k∈Γ is an orthonormal
family in a Hilbert space V, then span{ek }k∈Γ is a closed subspace of V.
9 Suppose V is an infinite-dimensional Hilbert space. Prove that there does not
exist a basis of V that is an orthonormal family.
10 (a) Show that the orthonormal family given in the first bullet point of Example 8.50
is an orthonormal basis of ℓ².
(b) Show that the orthonormal family given in the second bullet point of Example 8.50
is an orthonormal basis of ℓ²(Γ).
(c) Show that the orthonormal family given in the fourth bullet point of Example 8.50
is an orthonormal basis of L²([0, 1)).
(d) Show that the orthonormal family given in the fifth bullet point of Example 8.50
is an orthonormal basis of L²(R).
11 Suppose µ is a σ-finite measure on ( X, S) and ν is a σ-finite measure on (Y, T ).
Suppose also that {e j } j∈Ω is an orthonormal basis of L2 (µ) and { f k }k∈Γ is an
orthonormal basis of L2 (ν). For j ∈ Ω and k ∈ Γ, define g j,k : X × Y → F by
g j,k ( x, y) = e j ( x ) f k (y).
12 Prove the converse of Parseval’s Identity. More specifically, prove that if {ek }k∈Γ
is an orthonormal family in a Hilbert space V and
\[ \|f\|^2 = \sum_{k\in\Gamma} |\langle f, e_k \rangle|^2 \]
For f, g ∈ D, define ⟨f, g⟩ to be $f(0)\overline{g(0)} + \int_{D} f'\,\overline{g'}\,d\lambda_2$.
25 (a) Prove that the Dirichlet space D is contained in the Bergman space $L^2_a(D)$.
(b) Prove that there exists a function $f \in L^2_a(D)$ such that f is uniformly
continuous on D and f ∉ D.
Chapter 9
Real and Complex Measures
Dome in the main building of the University of Vienna, where Johann Radon
(1887–1956) was a student and then later a faculty member. The Radon–Nikodym
Theorem provides information analogous to differentiation for measures.
9A Total Variation
Properties of Real and Complex Measures
Recall that a measurable space is a pair ( X, S), where S is a σ-algebra on X. Recall
also that a measure on ( X, S) is a countably additive function from S to [0, ∞] that
takes ∅ to 0. Countably additive functions that take values in R or C give us new
objects called real measures or complex measures.
Note that every real measure is a complex measure. Note also that by definition,
∞ is not an allowable value for a real or complex measure. Thus a (positive) measure
µ on ( X, S) is a real measure if and only if µ( X ) < ∞.
Some authors use the terminology signed measure instead of real measure; some
authors allow a real measure to take on the value ∞ or −∞ (but not both, because the
expression ∞ − ∞ must be avoided). However, real measures as defined here will
serve us better because we need to avoid ±∞ when considering the Banach space of
real or complex measures on a measurable space (see 9.18).
For (positive) measures, we had to make µ(∅) = 0 part of the definition to avoid
the function µ that assigns ∞ to all sets, including the empty set. But ∞ is not an
allowable value for real or complex measures. Thus ν(∅) = 0 is a consequence of
our definition rather than part of the definition, as shown in the next result.
(a) ν(∅) = 0;
(b) $\sum_{k=1}^{\infty} |\nu(E_k)| < \infty$ for every disjoint sequence E1, E2, . . . of sets in S.
and
\[ -\nu\Bigl( \bigcup_{\{k \,:\, \nu(E_k) < 0\}} E_k \Bigr) = -\sum_{\{k \,:\, \nu(E_k) < 0\}} \nu(E_k) = \sum_{\{k \,:\, \nu(E_k) < 0\}} |\nu(E_k)|. \]
Because ν(E) ∈ R for every E ∈ S, the right side of the last two displayed equations
is finite. Thus $\sum_{k=1}^{\infty} |\nu(E_k)| < \infty$, as desired.
Now consider the case where ν is a complex measure. Then
\[ \sum_{k=1}^{\infty} |\nu(E_k)| \le \sum_{k=1}^{\infty} \bigl( |(\operatorname{Re} \nu)(E_k)| + |(\operatorname{Im} \nu)(E_k)| \bigr) < \infty, \]
where the last inequality follows from applying the result for real measures to the
real measures Re ν and Im ν.
The next definition provides an important class of examples of real and complex
measures. If F = R, then the measure ν in the next result is a real measure (which is
also a complex measure).
where the first equality holds because the sets E1 , E2 , . . . are disjoint and the second
equality follows from the inequality
\[ \Bigl| \sum_{k=1}^{m} \chi_{E_k}(x)\, h(x) \Bigr| \le |h(x)|, \]
which along with the assumption that h ∈ L1 (µ) allows us to interchange the integral
and limit of the partial sums by the Dominated Convergence Theorem (3.30).
The countable additivity shown in 9.5 means ν is a complex measure.
In the notation that we are about to define, the symbol d has no separate meaning—
it functions to separate h and µ. The result above shows that h dµ as defined below is
indeed a real or complex measure.
9.6 Definition h dµ
Note that if a function h ∈ L1 (µ) takes values in [0, ∞), then h dµ is a finite
(positive) measure.
The next result shows some basic properties of complex measures. No proofs
are given because the proofs are the same as the proofs of the corresponding results
for (positive) measures. Specifically, see the proofs of 2.56, 2.60, 2.58, and 2.59.
Because complex measures cannot take on the value ∞, we do not need to worry
about hypotheses of finite measure that are required of the (positive) measure versions
of all but part (c).
are disjoint sets in S such that E1 ∪ · · · ∪ En ⊂ E .
To start getting familiar with the definition above, you should verify that if ν is a
complex measure on ( X, S) and E ∈ S , then
• |ν( E)| ≤ |ν|( E);
• |ν|( E) = ν( E) if ν is a finite (positive) measure;
• |ν|( E) = 0 if and only if ν( A) = 0 for every A ∈ S such that A ⊂ E.
The next result states that for real measures, we can consider only n = 2 in the
definition of the total variation measure.
|ν|( E) = sup{|ν( A)| + |ν( B)| : A, B are disjoint sets in S and A ∪ B ⊂ E}.
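On a finite set the supremum in the definition of the total variation measure can be computed by brute force, which gives a concrete way to see the statement above. The sketch below assumes a hypothetical real measure on a four-point set given by point masses (the dictionary `weights` is invented for illustration). It enumerates every way of choosing disjoint pieces inside E, with an optional cap on the number of pieces so that the n = 2 version can be compared with the unrestricted one.

```python
from fractions import Fraction
from itertools import product

# Hypothetical real measure on X = {a, b, c, d}, given by its point masses.
weights = {"a": Fraction(3), "b": Fraction(-2), "c": Fraction(1, 2), "d": Fraction(-5)}
X = list(weights)

def nu(E):
    """nu(E) is the sum of the point masses of the elements of E."""
    return sum((weights[x] for x in E), Fraction(0))

def total_variation(E, max_parts=None):
    """sup of |nu(E_1)| + ... + |nu(E_n)| over disjoint E_1, ..., E_n contained in E.

    If max_parts is given, at most that many pieces are allowed.
    """
    E = list(E)
    n = len(E) if max_parts is None else max_parts
    best = Fraction(0)
    # Label 0 means "left out"; labels 1..n name the disjoint pieces.
    for labels in product(range(n + 1), repeat=len(E)):
        pieces = [[x for x, lab in zip(E, labels) if lab == k] for k in range(1, n + 1)]
        best = max(best, sum((abs(nu(p)) for p in pieces), Fraction(0)))
    return best

tv_full = total_variation(X)              # unrestricted number of pieces
tv_two = total_variation(X, max_parts=2)  # only two pieces A and B
tv_one = total_variation(X, max_parts=1)  # a single piece: sup of |nu(A)|
```

For this example `tv_full` and `tv_two` agree, and both equal the sum of the absolute values of the point masses, while `tv_one` is strictly smaller. The enumeration is exponential in the size of E, so this approach is only meant for very small examples.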
The next result could be rephrased as stating that if h ∈ L1 (µ), then the total
variation measure of the measure h dµ is the measure |h| dµ. In the statement below,
the notation dν = h dµ means the same as ν = h dµ; the notation dν is commonly
used when considering expressions involving measures of the form h dµ.
Now
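\begin{align*}
\sum_{k=1}^{n} |\nu(E_k)| &= \sum_{k=1}^{n} \Bigl| \int_{E_k} h \, d\mu \Bigr| \\
&\ge \sum_{k=1}^{n} \Bigl| \int_{E_k} g \, d\mu \Bigr| - \sum_{k=1}^{n} \Bigl| \int_{E_k} (g - h) \, d\mu \Bigr| \\
&= \sum_{k=1}^{n} |c_k| \mu(E_k) - \sum_{k=1}^{n} \Bigl| \int_{E_k} (g - h) \, d\mu \Bigr| \\
&= \int_{E} |g| \, d\mu - \sum_{k=1}^{n} \Bigl| \int_{E_k} (g - h) \, d\mu \Bigr| \\
&\ge \int_{E} |g| \, d\mu - \sum_{k=1}^{n} \int_{E_k} |g - h| \, d\mu \\
&\ge \int_{E} |h| \, d\mu - 2\varepsilon.
\end{align*}
The inequality above implies that $|\nu|(E) \ge \int_E |h| \, d\mu - 2\varepsilon$. Because ε is an arbitrary
positive number, this implies $|\nu|(E) \ge \int_E |h| \, d\mu$, completing the proof.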
Taking the supremum of the left side of the inequality above over all choices of { Ej,k }
satisfying 9.12 shows that
\[ \sum_{k=1}^{m} |\nu|(A_k) \le |\nu|\Bigl( \bigcup_{k=1}^{\infty} A_k \Bigr). \]
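\begin{align*}
&= \sum_{j=1}^{n} \sum_{k=1}^{\infty} |\nu(E_j \cap A_k)| \\
&\ge \sum_{j=1}^{n} \Bigl| \sum_{k=1}^{\infty} \nu(E_j \cap A_k) \Bigr| \\
&= \sum_{j=1}^{n} |\nu(E_j)|,
\end{align*}
where the first line above follows from the definition of |ν|(Ak) and the last line
above follows from the countable additivity of ν.
The inequality above and the definition of $|\nu|\bigl(\bigcup_{k=1}^{\infty} A_k\bigr)$ imply that
\[ \sum_{k=1}^{\infty} |\nu|(A_k) \ge |\nu|\Bigl( \bigcup_{k=1}^{\infty} A_k \Bigr), \]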
You should verify that if ν, µ, and α are as above, then ν + µ and αν are complex
measures on ( X, S). You should also verify that these natural definitions of addition
and scalar multiplication make the set of complex (or real) measures on a measurable
space ( X, S) into a vector space. We now introduce notation for this vector space.
We use the terminology total variation norm below even though the object being
defined is not obviously a norm (especially because it is not obvious that ‖ν‖ < ∞
for every complex measure ν). Soon we will justify this terminology.
‖ν‖ = |ν|(X).
Proof First consider the case where F = R. Thus ν is a real measure on (X, S). To
begin this proof by contradiction, suppose ‖ν‖ = |ν|(X) = ∞.
We inductively choose a decreasing sequence E0 ⊃ E1 ⊃ E2 ⊃ · · · of sets in S
as follows: Start by choosing E0 = X. Now suppose n ≥ 0 and En ∈ S has been
chosen with |ν|( En ) = ∞ and |ν( En )| ≥ n. Because |ν|( En ) = ∞, 9.9 implies that
there exists A ∈ S such that A ⊂ En and |ν( A)| ≥ n + 1 + |ν( En )|, which implies
that
|ν( En \ A)| = |ν( En ) − ν( A)| ≥ |ν( A)| − |ν( En )| ≥ n + 1.
Now
|ν|( A) + |ν|( En \ A) = |ν|( En ) = ∞
because the total variation measure |ν| is a (positive) measure (by 9.11). The equation
above shows that at least one of |ν|( A) and |ν|( En \ A) is ∞. Let En+1 = A
if |ν|( A) = ∞ and let En+1 = En \ A if |ν|( A) < ∞. Thus En ⊃ En+1 ,
|ν|( En+1 ) = ∞, and |ν( En+1 )| ≥ n +1.
Now 9.7(d) implies that $\nu\bigl(\bigcap_{n=1}^{\infty} E_n\bigr) = \lim_{n\to\infty} \nu(E_n)$. However, |ν(En)| ≥ n
for each n ∈ Z+, and thus the limit in the last equation does not exist (in R). This
contradiction completes the proof in the case where ν is a real measure.
Consider now the case where F = C; thus ν is a complex measure on ( X, S).
Then
|ν|( X ) ≤ |Re ν|( X ) + |Im ν|( X ) < ∞,
where the last inequality follows from applying the real case to Re ν and Im ν.
The previous result tells us that if (X, S) is a measurable space, then ‖ν‖ < ∞
for all ν ∈ MF(S). This implies (as the reader should verify) that the total variation
norm ‖·‖ is a norm on MF(S). The next result shows that this norm makes MF(S)
into a Banach space (in other words, every Cauchy sequence in this norm converges).
\[ \nu(E) = \lim_{j\to\infty} \nu_j(E). \]
If n ∈ Z+ is such that
\[ \sum_{k=n}^{\infty} |\nu_m(E_k)| \le \varepsilon \tag{9.20} \]
\[ \le 2\varepsilon, \tag{9.21} \]
where the second line uses 9.20, the third line uses the countable additivity of the
measure |νj − νm | (see 9.11), and the fourth line uses 9.19.
where the last inequality follows from 9.22 and the definition of the total variation
norm. The inequality above implies that ‖ν − νk‖ ≤ ε, completing the proof.
EXERCISES 9A
1 Prove or give a counterexample: If ν is a real measure on a measurable
space ( X, S) and A, B ∈ S are such that ν( A) ≥ 0 and ν( B) ≥ 0, then
ν( A ∪ B) ≥ 0.
2 Suppose ν is a real measure on ( X, S). Define µ : S → [0, ∞) by
µ( E) = |ν( E)|.
9B Decomposition Theorems
Hahn Decomposition Theorem
The next result shows that a real measure on a measurable space ( X, S) decomposes
X into two disjoint measurable sets, on one of which all subsets have nonnegative
measure and on the other of which all subsets have nonpositive measure.
The decomposition in the result below is not unique because a subset D of X with
|ν|( D ) = 0 could be shifted from A to B or from B to A. However, Exercise 1 at
the end of this section shows that the Hahn decomposition is almost unique.
Suppose ν is a real measure on a measurable space ( X, S). Then there exist sets
A, B ∈ S such that
(a) A ∪ B = X and A ∩ B = ∅;
(b) ν( E) ≥ 0 for every E ∈ S with E ⊂ A;
(c) ν( E) ≤ 0 for every E ∈ S with E ⊂ B.
\[ \nu(A_j) \ge a - \frac{1}{2^j}. \tag{9.25} \]
Temporarily fix k ∈ Z+. We will show by induction on n that if n ∈ Z+ with
n ≥ k, then
\[ \nu\Bigl( \bigcup_{j=k}^{n} A_j \Bigr) \ge a - \sum_{j=k}^{n} \frac{1}{2^j}. \tag{9.26} \]
To get started with the induction, note that if n = k then 9.26 holds because in this
case 9.26 becomes 9.25. Now for the induction step, assume that n ≥ k and that 9.26
holds. Then
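\begin{align*}
\nu\Bigl( \bigcup_{j=k}^{n+1} A_j \Bigr) &= \nu\Bigl( \bigcup_{j=k}^{n} A_j \Bigr) + \nu(A_{n+1}) - \nu\Bigl( \Bigl( \bigcup_{j=k}^{n} A_j \Bigr) \cap A_{n+1} \Bigr) \\
&\ge \Bigl( a - \sum_{j=k}^{n} \frac{1}{2^j} \Bigr) + \Bigl( a - \frac{1}{2^{n+1}} \Bigr) - a \\
&= a - \sum_{j=k}^{n+1} \frac{1}{2^j},
\end{align*}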
where the first line follows from 9.7(b) and the second line follows from 9.25 and
9.26. We have now verified that 9.26 holds if n is replaced by n + 1, completing the
proof by induction of 9.26.
The sequence of sets Ak , Ak ∪ Ak+1 , Ak ∪ Ak+1 ∪ Ak+2 , . . . is increasing. Thus
taking the limit as n → ∞ of both sides of 9.26 and using 9.7(c) gives
\[ \nu\Bigl( \bigcup_{j=k}^{\infty} A_j \Bigr) \ge a - \frac{1}{2^{k-1}}. \tag{9.27} \]
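The constant on the right side of 9.27 comes from summing the geometric tail that appears in 9.26. As a short worked step (the index i = j − k is introduced here only for the computation):

```latex
\[
\sum_{j=k}^{\infty} \frac{1}{2^{j}}
  = \frac{1}{2^{k}} \sum_{i=0}^{\infty} \frac{1}{2^{i}}
  = \frac{1}{2^{k}} \cdot 2
  = \frac{1}{2^{k-1}} .
\]
```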
Now let
\[ A = \bigcap_{k=1}^{\infty} \bigcup_{j=k}^{\infty} A_j . \]
ν( A) = a.
• Every real measure is the difference of two finite (positive) measures that are
singular with respect to each other.
• More precisely, suppose ν is a real measure on a measurable space ( X, S).
Then there exist unique finite (positive) measures ν+ and ν− on ( X, S) such
that
9.31 ν = ν+ − ν− and ν+ ⊥ ν− .
Furthermore,
|ν| = ν+ + ν− .
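For a real measure on a finite set, both the Hahn decomposition and the Jordan decomposition can be written down directly, which may help make the statement above concrete. The sketch below assumes a hypothetical measure given by point masses (the dictionary `point_mass` is invented for illustration) and uses the fact that on a finite set the total variation of a set is the sum of the absolute values of its point masses. It takes A to be the set of points with nonnegative mass and B its complement, then sets ν+(E) = ν(E ∩ A) and ν−(E) = −ν(E ∩ B).

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical real measure on a five-point set, given by its point masses.
point_mass = {
    "a": Fraction(3),
    "b": Fraction(-2),
    "c": Fraction(1, 2),
    "d": Fraction(-5),
    "e": Fraction(0),
}
X = frozenset(point_mass)

def nu(E):
    return sum((point_mass[x] for x in E), Fraction(0))

def all_subsets(S):
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

# Hahn decomposition: every subset of A has nonnegative measure,
# every subset of B has nonpositive measure.
A = frozenset(x for x in X if point_mass[x] >= 0)
B = X - A

# Jordan decomposition built from the Hahn decomposition.
def nu_plus(E):
    return nu(E & A)

def nu_minus(E):
    return -nu(E & B)

def total_variation(E):
    # On a finite set, |nu|(E) is the sum of |point mass| over the points of E.
    return sum((abs(point_mass[x]) for x in E), Fraction(0))
```

The point "e" has mass 0 and could be moved from A to B without affecting any of the properties, which reflects the non-uniqueness of the Hahn decomposition noted earlier.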
The next result should help you think that absolute continuity and singularity are
two extreme possibilities for the relationship between two complex measures.
µ(E ∩ A) = µ((E ∩ A) ∩ B) = µ(∅) = 0.
νa ≪ µ and νs ⊥ µ.
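On a finite set this decomposition can be computed directly. The sketch below assumes hypothetical point masses for a positive measure µ and a real measure ν (both dictionaries are invented for illustration). It takes B to be the set of points of µ-measure 0 and A its complement, and sets νa(E) = ν(E ∩ A) and νs(E) = ν(E ∩ B).

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical point masses on X = {a, b, c, d}; mu is positive, nu is real.
mu_mass = {"a": Fraction(1), "b": Fraction(0), "c": Fraction(2), "d": Fraction(0)}
nu_mass = {"a": Fraction(-1), "b": Fraction(4), "c": Fraction(0), "d": Fraction(-3)}
X = frozenset(mu_mass)

def measure(mass, E):
    return sum((mass[x] for x in E), Fraction(0))

def mu(E):
    return measure(mu_mass, E)

def nu(E):
    return measure(nu_mass, E)

def all_subsets(S):
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

# B collects the points that mu does not see; A is the rest of X.
B = frozenset(x for x in X if mu_mass[x] == 0)
A = X - B

def nu_a(E):
    """Part of nu that lives on A (absolutely continuous with respect to mu)."""
    return nu(E & A)

def nu_s(E):
    """Part of nu that lives on B (singular with respect to mu)."""
    return nu(E & B)
```

Checking all subsets E of X confirms that ν = νa + νs, that νa(E) = 0 whenever µ(E) = 0, and that νs lives on B while µ lives on A.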
Proof Let
b = sup{|ν|( B) : B ∈ S and µ( B) = 0}.
For each k ∈ Z+, let Bk ∈ S be such that
\[ |\nu|(B_k) \ge b - \frac{1}{k} \quad \text{and} \quad \mu(B_k) = 0. \]
Let
\[ B = \bigcup_{k=1}^{\infty} B_k . \]
Then µ( B) = 0 and |ν|( B) = b.
Let A = X \ B. Define complex measures νa and νs on ( X, S) by
νa ( E) = ν( E ∩ A) and νs ( E) = ν( E ∩ B).
Clearly ν = νa + νs .
If E ∈ S , then
µ ( E ) = µ ( E ∩ A ) + µ ( E ∩ B ) = µ ( E ∩ A ),
where the last equality holds because µ( B) = 0. The equation above implies that
νs ⊥ µ.
To prove that νa ≪ µ, suppose E ∈ S and µ(E) = 0. Then µ(B ∪ E) = 0 and
hence
b ≥ |ν|(B ∪ E) = |ν|(B) + |ν|(E \ B) = b + |ν|(E \ B),
which implies that |ν|(E \ B) = 0. Thus
Radon–Nikodym Theorem
[The result below was first proved by Radon and Otto Nikodym (1887–1974).]
If µ is a (positive) measure, h ∈ L¹(µ), and dν = h dµ, then ν ≪ µ. The next
result gives the important converse: if µ is σ-finite, then every complex measure
that is absolutely continuous with respect to µ is of the form h dµ for some h ∈ L¹(µ).
The hypothesis that µ is σ-finite cannot be deleted.
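On a finite set, where every measure is determined by its point masses, the function h can be written down explicitly as a ratio of point masses. The sketch below assumes hypothetical point masses for µ and ν (both dictionaries are invented for illustration), with ν vanishing wherever µ does so that ν ≪ µ. It sets h(x) = ν({x})/µ({x}) where µ({x}) > 0 and h(x) = 0 elsewhere; the value on the µ-null points is an arbitrary choice.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical point masses on X = {a, b, c, d}; nu vanishes wherever mu does.
mu_mass = {"a": Fraction(1, 2), "b": Fraction(0), "c": Fraction(3), "d": Fraction(2)}
nu_mass = {"a": Fraction(-1), "b": Fraction(0), "c": Fraction(6), "d": Fraction(1, 3)}
X = frozenset(mu_mass)

def mu(E):
    return sum((mu_mass[x] for x in E), Fraction(0))

def nu(E):
    return sum((nu_mass[x] for x in E), Fraction(0))

def all_subsets(S):
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def h(x):
    """Density of nu with respect to mu at the point x (0 on mu-null points)."""
    return nu_mass[x] / mu_mass[x] if mu_mass[x] > 0 else Fraction(0)

def integral_h_dmu(E):
    """Integral of h over E with respect to mu, which is a finite sum here."""
    return sum((h(x) * mu_mass[x] for x in E), Fraction(0))
```

Checking every subset E of X confirms that ν(E) equals the integral of h over E with respect to µ, which is what dν = h dµ means in this setting.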
Proof First consider the case where both µ and ν are finite (positive) measures.
Define ϕ : L²(ν + µ) → R by
\[ \varphi(f) = \int f \, d\nu. \tag{9.37} \]
Thus we can modify g (for example by redefining g to be 1/2 on the two sets appearing
above; both those sets have ν-measure 0 and µ-measure 0) and from now on we can
assume that 0 ≤ g(x) < 1 for all x ∈ X and that 9.39 holds for all f ∈ L²(ν + µ).
Hence we can define h : X → [0, ∞) by
\[ h(x) = \frac{g(x)}{1 - g(x)}. \]
Suppose E ∈ S. For each k ∈ Z+, let
\[ f_k(x) = \begin{cases} \dfrac{\chi_E(x)}{1 - g(x)} & \text{if } \dfrac{\chi_E(x)}{1 - g(x)} \le k, \\[1ex] 0 & \text{otherwise.} \end{cases} \]
[Taking $f = \chi_E / (1 - g)$ in 9.39 would give $\nu(E) = \int_E h \, d\mu$, but this function f
might not be in L²(ν + µ) and thus we need to be a bit more careful.]
Then fk ∈ L²(ν + µ). Now 9.39 implies
\[ \int f_k (1 - g) \, d\nu = \int f_k \, g \, d\mu. \]
Taking the limit as k → ∞ and using the Monotone Convergence Theorem (3.11)
shows that
\[ \int_E 1 \, d\nu = \int_E h \, d\mu. \tag{9.40} \]
Thus dν = h dµ, completing the proof in the case where both ν and µ are (positive)
finite measures [note that h ∈ L1 (µ) because h is a nonnegative function and we can
take E = X in the equation above].
Now relax the assumption on µ to the hypothesis that µ is a σ-finite measure.
Thus there exists an increasing sequence X1 ⊂ X2 ⊂ · · · of sets in S such that
$\bigcup_{k=1}^{\infty} X_k = X$ and µ(Xk) < ∞ for each k ∈ Z+. For k ∈ Z+, let νk and µk denote
the restrictions of ν and µ to the σ-algebra on Xk consisting of those sets in S that
are subsets of Xk. Then νk ≪ µk. Thus by the case we have already proved, there
exists a nonnegative function hk ∈ L¹(µk) such that dνk = hk dµk. If j < k, then
\[ \int_E h_j \, d\mu = \nu(E) = \int_E h_k \, d\mu \]
for every k ∈ Z+. The Monotone Convergence Theorem (3.11) can now be used to
show that 9.40 holds for every E ∈ S . Thus dν = h dµ, completing the proof in the
case where ν is a (positive) finite measure.
Now relax the assumption on ν to the assumption that ν is a real measure. The
measure ν equals one-half the difference of the two positive (finite) measures |ν| + ν
and |ν| − ν, each of which is absolutely continuous with respect to µ. By the case
proved in the previous paragraph, there exist h+ , h− ∈ L1 (µ) such that
(a) Suppose ν is a real measure on a measurable space ( X, S). Then there exists
an S -measurable function h : X → {−1, 1} such that dν = h d|ν|.
(b) Suppose ν is a complex measure on a measurable space ( X, S). Then there
exists an S -measurable function h : X → {z ∈ C : |z| = 1} such that
dν = h d|ν|.
Proof Because $\nu \ll |\nu|$, the Radon–Nikodym Theorem (9.36) tells us that there
exists $h \in L^1(|\nu|)$ (with $h$ real valued if $\nu$ is a real measure) such that $d\nu = h \, d|\nu|$.
Now 9.10 implies that $d|\nu| = |h| \, d|\nu|$, which implies that $|h| = 1$ almost everywhere
(with respect to $|\nu|$). Redefine $h$ to be 1 on the set $\{x \in X : |h(x)| \ne 1\}$, which gives
the desired result.
We could have proved part (a) of the result above by taking h = χ A − χ B in the
Hahn Decomposition Theorem (9.23).
Conversely, we could give a new proof of the Hahn Decomposition Theorem by using
part (a) of the result above and taking
A = { x ∈ X : h ( x ) = 1} and B = { x ∈ X : h ( x ) = −1}.
We could also give a new proof of the Jordan Decomposition Theorem (9.30) by
using part (a) of the result above and taking
\[ \nu^+ = \chi_{\{x \in X : h(x) = 1\}} \, d|\nu| \quad \text{and} \quad \nu^- = \chi_{\{x \in X : h(x) = -1\}} \, d|\nu|. \]
9.42 Dual space of $L^p(\mu)$ is $L^{p'}(\mu)$
Then $h \mapsto \varphi_h$ is a one-to-one linear map from $L^{p'}(\mu)$ onto $\bigl(L^p(\mu)\bigr)'$. Furthermore,
$\|\varphi_h\| = \|h\|_{p'}$ for all $h \in L^{p'}(\mu)$.
Proof The case p = 1 will be left to the reader as an exercise. Thus assume
1 < p < ∞.
Suppose $\mu$ is a (positive) measure on a measurable space $(X, \mathcal{S})$ and $\varphi$ is a
bounded linear functional on $L^p(\mu)$; in other words, suppose $\varphi \in \bigl(L^p(\mu)\bigr)'$.
Consider first the case where µ is a finite (positive) measure. Define a function
ν : S → F by
ν ( E ) = ϕ ( χ E ).
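If $E_1, E_2, \ldots$ are disjoint sets in $\mathcal{S}$, then
\[ \nu\Bigl(\bigcup_{k=1}^\infty E_k\Bigr) = \varphi\bigl(\chi_{\bigcup_{k=1}^\infty E_k}\bigr) = \varphi\Bigl(\sum_{k=1}^\infty \chi_{E_k}\Bigr) = \sum_{k=1}^\infty \varphi(\chi_{E_k}) = \sum_{k=1}^\infty \nu(E_k), \]
where the infinite sum in the third term converges in the $L^p(\mu)$-norm to $\chi_{\bigcup_{k=1}^\infty E_k}$, and
the third equality holds because $\varphi$ is a continuous linear functional. The equation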
above shows that ν is countably additive. Thus ν is a complex measure on ( X, S)
[and is a real measure if F = R].
for every E ∈ S . The equation above, along with the linearity of ϕ, implies that
\[ \varphi(f) = \int f h \, d\mu \quad \text{for every simple $\mathcal{S}$-measurable function } f \colon X \to \mathbf{F}. \tag{9.43} \]
Now
\[ \int |h|^{p'} \chi_{E_k} \, d\mu = \varphi(f_k) \le \|\varphi\| \, \|f_k\|_p = \|\varphi\| \Bigl( \int |h|^{p'} \chi_{E_k} \, d\mu \Bigr)^{1/p}, \]
where the first equality follows from 9.44 and 9.45, and the last equality follows from
9.45 [which implies that $|f_k(x)|^p = |h(x)|^{p'} \chi_{E_k}(x)$ for $x \in X$]. After dividing by
$\bigl( \int |h|^{p'} \chi_{E_k} \, d\mu \bigr)^{1/p}$, the inequality between the first and last terms above becomes
\[ \|h \chi_{E_k}\|_{p'} \le \|\varphi\|. \]
Taking the limit as k → ∞ shows, via the Monotone Convergence Theorem (3.11),
that
\[ \|h\|_{p'} \le \|\varphi\|. \]
Thus $h \in L^{p'}(\mu)$. Because each $f \in L^p(\mu)$ can be approximated in the $L^p(\mu)$ norm
by functions in L∞ (µ), 9.44 now shows that ϕ = ϕh , completing the proof in the
case where µ is a finite (positive) measure.
Now relax the assumption that $\mu$ is a finite (positive) measure to the hypothesis
that $\mu$ is a (positive) measure. For $E \in \mathcal{S}$, let $\mathcal{S}_E = \{A \in \mathcal{S} : A \subset E\}$ and let $\mu_E$
be the (positive) measure on $(E, \mathcal{S}_E)$ defined by $\mu_E(A) = \mu(A)$ for $A \in \mathcal{S}_E$. We
can identify $L^p(\mu_E)$ with the subspace of functions in $L^p(\mu)$ that vanish (almost
everywhere) outside $E$. With this identification, let $\varphi_E = \varphi|_{L^p(\mu_E)}$. Then $\varphi_E$ is a
bounded linear functional on $L^p(\mu_E)$ and $\|\varphi_E\| \le \|\varphi\|$.
If $E \in \mathcal{S}$ and $\mu(E) < \infty$, then the finite measure case that we have already proved
as applied to $\varphi_E$ implies that there exists a unique $h_E \in L^{p'}(\mu_E)$ such that
\[ \varphi(f) = \int_E f h_E \, d\mu \quad \text{for all } f \in L^p(\mu_E); \tag{9.46} \]
the uniqueness of $h_E \in L^{p'}(\mu_E)$ holds because the equation $\|\varphi_h\| = \|h\|_{p'}$ implies
that the difference of two different choices for $h_E$ will have norm 0 in $L^{p'}(\mu_E)$. This
uniqueness implies that if $D, E \in \mathcal{S}$ and $D \subset E$, then $h_D(x) = h_E(x)$ for almost
every $x \in D$.
For each k ∈ Z+, there exists f k ∈ L p (µ) such that
where the first equality follows from the continuity of ϕ, the second equality follows
from 9.46 as applied to each Ek [valid because µ( Ek ) < ∞], and the third equality
follows from an application of Hölder’s Inequality.
If $D$ is an $\mathcal{S}$-measurable subset of $X \setminus E$ with $\mu(D) < \infty$, then $\|h_D\|_{p'} = 0$
because otherwise we would have $\|h + h_D\|_{p'} > \|h\|_{p'}$ and the linear functional on
$L^p(\mu)$ induced by $h + h_D$ would have norm larger than $\|\varphi\|$ even though it agrees
with $\varphi$ on $L^p(\mu_{E \cup D})$. Because $\|h_D\|_{p'} = 0$, we see from 9.50 that $\varphi(f) = \int f h \, d\mu$
for all $f \in L^p(\mu_{E \cup D})$.
Every element of $L^p(\mu)$ can be approximated in norm by elements of $L^p(\mu_E)$ plus
functions that live on subsets of $X \setminus E$ with finite measure. Thus $\varphi(f) = \int f h \, d\mu$
for all $f \in L^p(\mu)$, completing the proof.
EXERCISES 9B
1 Suppose $\nu$ is a real measure on a measurable space $(X, \mathcal{S})$. Prove that the Hahn
decomposition of $\nu$ is almost unique, in the sense that if $A, B$ and $A', B'$ are
pairs satisfying the Hahn Decomposition Theorem (9.23), then
ν+ ( E) = sup{ν( D ) : D ∈ S and D ⊂ E}
and
ν− ( E) = − inf{ν( D ) : D ∈ S and D ⊂ E}.
$\{\nu \in \mathcal{M}_{\mathbf{F}}(\mathcal{S}) : \nu \ll \mu\}$
15 Show that if p = 1, then 9.42 can fail without the extra hypothesis that µ is a
σ-finite (positive) measure.
16 Prove 9.42 [with the extra hypothesis that µ is a σ-finite (positive) measure] in
the case where p = 1.
17 Explain where the proof of 9.42 fails if p = ∞.
18 Prove that if µ is a (positive) measure and 1 < p < ∞, then L p (µ) is reflexive.
[See the definition before Exercise 18 in Section 7B for the meaning of reflexive.]
19 Prove that L1 (R) is not reflexive.
Chapter 10
Linear Maps on Hilbert Spaces
10A Adjoints and Invertibility
Suppose V and W are Hilbert spaces and T ∈ B(V, W ). The adjoint of T is the
function T ∗ : W → V such that
\[ \langle Tf, g \rangle = \langle f, T^* g \rangle \]
for $f \in L^2(\nu)$ and $x \in X$. To see that this definition makes sense, first note that
there are no worrisome measurability issues because for each $x \in X$, the function
$y \mapsto K(x, y)$ is a $\mathcal{T}$-measurable function on $Y$ (see 5.9).
Suppose $f \in L^2(\nu)$. Use the Cauchy–Schwarz Inequality (8.11) or Hölder’s
Inequality (7.9) to show that
\[ \int_Y |K(x, y)| \, |f(y)| \, d\nu(y) \le \Bigl( \int_Y |K(x, y)|^2 \, d\nu(y) \Bigr)^{1/2} \|f\|_{L^2(\nu)} \tag{10.7} \]
for every $x \in X$. Squaring both sides of the inequality above and then integrating on
$X$ with respect to $\mu$ gives
\[ \int_X \Bigl( \int_Y |K(x, y)| \, |f(y)| \, d\nu(y) \Bigr)^2 d\mu(x) \le \int_X \int_Y |K(x, y)|^2 \, d\nu(y) \, d\mu(x) \; \|f\|_{L^2(\nu)}^2 \]
for $f \in L^2(\nu)$. If we identify $L^2(\nu)$ and $L^2(\mu)$ with $\mathbf{F}^n$ and $\mathbf{F}^m$ and then think of
elements of $\mathbf{F}^n$ and $\mathbf{F}^m$ as column vectors, then the equation above shows that the
linear map $\mathcal{I}_K \colon \mathbf{F}^n \to \mathbf{F}^m$ is simply matrix multiplication by $K$.
In this setting, $K^*$ is called the conjugate transpose of $K$ because the $n$-by-$m$
matrix $K^*$ is obtained by interchanging the rows and the columns of $K$ and then
taking the complex conjugate of each entry.
The previous example now shows that
\[ \|\mathcal{I}_K\| \le \Bigl( \sum_{i=1}^m \sum_{j=1}^n |K(i, j)|^2 \Bigr)^{1/2}. \]
Furthermore, the previous example shows that the adjoint of the linear map of
multiplication by the matrix $K$ is the linear map of multiplication by the conjugate
transpose matrix $K^*$, a result that may be familiar to you from linear algebra.
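The two facts just stated for matrices can be checked numerically. The following is only an illustrative sketch in plain Python: the 2-by-3 matrix K and the vectors f and g are arbitrary values chosen for this illustration, and the helper names (mat_vec, conj_transpose, inner, hilbert_schmidt) are hypothetical. The inner product is taken to be linear in the first slot and conjugate-linear in the second.

```python
import math

def mat_vec(K, f):
    """Multiply the matrix K (a list of rows) by the column vector f."""
    return [sum(K[i][j] * f[j] for j in range(len(f))) for i in range(len(K))]

def conj_transpose(K):
    """Interchange rows and columns of K, then conjugate each entry."""
    m, n = len(K), len(K[0])
    return [[complex(K[i][j]).conjugate() for i in range(m)] for j in range(n)]

def inner(u, v):
    """Inner product <u, v>: linear in u, conjugate-linear in v."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def norm(u):
    return math.sqrt(inner(u, u).real)

def hilbert_schmidt(K):
    """Square root of the sum of |K(i, j)|^2 over all entries."""
    return math.sqrt(sum(abs(entry) ** 2 for row in K for entry in row))

# Arbitrary illustrative data: K is 2-by-3, f is in F^3, g is in F^2.
K = [[1 + 2j, 0, -1j], [3, 1 - 1j, 2]]
f = [1j, 2, -1 + 1j]
g = [2 - 1j, 1j]

lhs = inner(mat_vec(K, f), g)                  # <K f, g>
rhs = inner(f, mat_vec(conj_transpose(K), g))  # <f, K* g>
print(abs(lhs - rhs) < 1e-12)
print(norm(mat_vec(K, f)) <= hilbert_schmidt(K) * norm(f))
```

The first printed comparison tests the adjoint identity for this one pair of vectors; the second tests the norm bound for this one vector. Neither is a proof, of course.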
$T^* \in \mathcal{B}(W, V)$, $(T^*)^* = T$, and $\|T^*\| = \|T\|$.
\[ \langle (T^*)^* f, g \rangle = \overline{\langle g, (T^*)^* f \rangle} = \overline{\langle T^* g, f \rangle} = \langle f, T^* g \rangle = \langle Tf, g \rangle \]
Parts (a) and (b) of the next result show that if $V$ and $W$ are real Hilbert spaces,
then the function $T \mapsto T^*$ from $\mathcal{B}(V, W)$ to $\mathcal{B}(W, V)$ is a linear map. However,
if $V$ and $W$ are nonzero complex Hilbert spaces, then $T \mapsto T^*$ is not a linear map
because of the complex conjugate in (b).
Proof
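\begin{align*}
g \in \operatorname{null} T^* &\iff T^* g = 0 \\
&\iff \langle f, T^* g \rangle = 0 \text{ for all } f \in V \\
&\iff \langle Tf, g \rangle = 0 \text{ for all } f \in V \\
&\iff g \in (\operatorname{range} T)^\perp.
\end{align*}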
As a corollary of the result above, we have the following result which can give a
useful way to determine whether or not a linear map has a dense range.
Suppose V and W are Hilbert spaces and T ∈ B(V, W ). Then T has dense range
if and only if T ∗ is injective.
Proof From 10.13(d) we see that T has dense range if and only if (null T ∗ )⊥ = W,
which happens if and only if null T ∗ = {0}, which happens if and only if T ∗ is
injective.
The advantage of using the result above is that to determine whether or not a
bounded linear map T between Hilbert spaces has a dense range, we need only
determine whether or not 0 is the only solution to the equation T ∗ g = 0. The next
example illustrates this procedure.
for f ∈ L2 ([0, 1]) and x ∈ [0, 1]; here dy means dλ(y), where λ is the usual
Lebesgue measure on the interval [0, 1].
To show that V is a bounded linear map from L2 ([0, 1]) to L2 ([0, 1]), let K be the
function on [0, 1] × [0, 1] defined by
\[ K(x, y) = \begin{cases} 1 & \text{if } x > y, \\ 0 & \text{if } x \le y. \end{cases} \]
Invertibility of Operators
Linear maps from a vector space to itself are so important that they get a special name
and special notation.
The second bullet point above is equivalent to the first bullet point because if
a linear map T : V → V is one-to-one and surjective, then the inverse function
T −1 : V → V is automatically linear (as you should verify).
Also, if V is a Banach space and T is a bounded operator on V that is invertible,
then the inverse T −1 is automatically bounded, as follows from the Bounded Inverse
Theorem (6.83).
The next result shows that inverses and adjoints work well together. In the proof,
we use the common convention of writing composition of linear maps with the same
notation as multiplication. In other words, if S and T are linear maps such that S ◦ T
makes sense, then from now on
ST = S ◦ T.
Proof First suppose T is invertible. Taking the adjoint of all three sides of the
equation T −1 T = TT −1 = I, we get
T ∗ ( T −1 )∗ = ( T −1 )∗ T ∗ = I,
Norms work well with the composition of linear maps, as shown in the next result.
Proof If f ∈ U, then
Unlike linear maps from one vector space to a different vector space, operators on
the same vector space can be composed with each other and raised to powers.
10.21 Definition $T^k$
You should verify that powers of an operator satisfy the usual arithmetic rules:
$T^j T^k = T^{j+k}$ and $(T^j)^k = T^{jk}$ for $j, k \in \mathbf{Z}^+$. Also, if $V$ is a normed vector space
and $T \in \mathcal{B}(V)$, then
\[ \|T^k\| \le \|T\|^k \]
for every $k \in \mathbf{Z}^+$, as follows from using induction on 10.20.
Recall that if z ∈ C with |z| < 1, then the formula for the sum of a geometric
series shows that
\[ \frac{1}{1 - z} = \sum_{k=0}^\infty z^k. \]
The next result shows that this formula carries over to operators on Banach spaces.
10.22 Operators in the open unit ball centered at the identity are invertible
\[ (I - T) \sum_{k=0}^\infty T^k = \lim_{n \to \infty} (I - T) \sum_{k=0}^n T^k = \lim_{n \to \infty} (I - T^{n+1}) = I, \tag{10.23} \]
where the last equality holds because $\|T^{n+1}\| \le \|T\|^{n+1}$ and $\|T\| < 1$. Similarly,
\[ \Bigl( \sum_{k=0}^\infty T^k \Bigr) (I - T) = \lim_{n \to \infty} \sum_{k=0}^n T^k (I - T) = \lim_{n \to \infty} (I - T^{n+1}) = I. \tag{10.24} \]
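To see the operator version of the geometric series at work in the simplest setting, here is a small sketch in plain Python for a 2-by-2 real matrix. The matrix T below is an arbitrary choice made for this illustration; the sum of the squares of its entries is 0.30, so its operator norm is less than 1. The helper names are hypothetical.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(A, B):
    n = len(A)
    return [[A[i][j] + B[i][j] for j in range(n)] for i in range(n)]

def neumann_partial_sum(T, n):
    """Return I + T + T^2 + ... + T^n."""
    total = identity(len(T))
    power = identity(len(T))
    for _ in range(n):
        power = mat_mul(power, T)   # power is now the next power of T
        total = mat_add(total, power)
    return total

def distance_from_identity(A):
    """Largest absolute entry of A - I."""
    n = len(A)
    return max(abs(A[i][j] - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))

# Arbitrary illustrative matrix with norm less than 1.
T = [[0.2, 0.1], [-0.3, 0.4]]
I_minus_T = [[1.0 - T[i][j] if i == j else -T[i][j] for j in range(2)]
             for i in range(2)]

S = neumann_partial_sum(T, 60)
print(distance_from_identity(mat_mul(I_minus_T, S)) < 1e-9)
print(distance_from_identity(mat_mul(S, I_minus_T)) < 1e-9)
```

Both products equal $I - T^{61}$ exactly (up to rounding), which is why the partial sum with 60 terms already behaves like a two-sided inverse of $I - T$ here.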
Now we use the previous result to show that the set of invertible operators on a
Banach space is open.
One of the wonderful theorems of linear algebra states that left invertibility and
right invertibility and invertibility are all equivalent to each other for operators on
a finite-dimensional vector space. The next example shows that this result fails on
infinite-dimensional Hilbert spaces.
The result 10.29 below gives equivalent conditions for an operator on a Hilbert
space to be left invertible. From the finite-dimensional situation, you may be accustomed
to left invertibility being equivalent to injectivity. The example below shows
that this fails on infinite-dimensional Hilbert spaces. Thus we cannot eliminate the
closed range requirement in part (c) of 10.29.
The equation above implies that S is unbounded. Thus T is not left invertible, even
though T is injective.
Suppose V is a Hilbert space and T ∈ B(V ). Then the following are equivalent:
Proof First suppose (a) holds. Thus there exists S ∈ B(V ) such that ST = I. If
f ∈ V, then
\[ \|f\| = \|S(Tf)\| \le \|S\| \, \|Tf\|. \]
Thus (b) holds with $\alpha = \|S\|$, proving that (a) implies (b).
Now suppose (b) holds. Thus there exists α ∈ (0, ∞) such that
Now suppose (c) holds, so T is injective and has closed range. We want to prove
that (a) holds. Let $R \colon \operatorname{range} T \to V$ be the inverse of the one-to-one linear function
$f \mapsto Tf$ that maps $V$ onto $\operatorname{range} T$. Because $\operatorname{range} T$ is a closed subspace of $V$ and
thus is a Banach space [by 6.16(b)], the Bounded Inverse Theorem (6.83) implies
that R is a bounded linear map. Let P denote the orthogonal projection of V onto the
closed subspace range T. Define S : V → V by
\[ Sg = R(Pg). \]
Then for each $g \in V$, we have
\[ \|Sg\| = \|R(Pg)\| \le \|R\| \, \|Pg\| \le \|R\| \, \|g\|, \]
where the last inequality comes from 8.37(d). The inequality above implies that S is
a bounded operator on V. If f ∈ V, then
\[ S(Tf) = R\bigl(P(Tf)\bigr) = R(Tf) = f. \]
Thus ST = I, which means that T is left invertible, completing the proof that (c)
implies (a).
At this stage of the proof we know that (a), (b), and (c) are equivalent. To prove
that one of these implies (d), suppose (b) holds. Squaring the inequality in (b), we
see that if f ∈ V, then
\[ \|f\|^2 \le \alpha^2 \|Tf\|^2 = \alpha^2 \langle T^* T f, f \rangle \le \alpha^2 \|T^* T f\| \, \|f\|, \]
which implies that
\[ \|f\| \le \alpha^2 \|T^* T f\|. \]
In other words, (b) holds with $T$ replaced by $T^* T$ (and $\alpha$ replaced by $\alpha^2$). By the
equivalence we already proved between (a) and (b), we conclude that T ∗ T is left
invertible. Thus there exists S ∈ B(V ) such that S( T ∗ T ) = I. Taking adjoints of
both sides of the last equation shows that ( T ∗ T )S∗ = I. Thus T ∗ T is also right
invertible, which implies that T ∗ T is invertible. Thus (b) implies (d).
Finally, suppose (d) holds, so T ∗ T is invertible. Hence there exists S ∈ B(V )
such that I = S( T ∗ T ) = (ST ∗ ) T. Thus T is left invertible, showing that (d) implies
(a), completing the proof that (a), (b), (c), and (d) are equivalent.
You may be familiar with the finite-dimensional result that right invertibility is
equivalent to surjectivity. The next result shows that this equivalency also holds on
infinite-dimensional Hilbert spaces.
Suppose V is a Hilbert space and T ∈ B(V ). Then the following are equivalent:
(c) TT ∗ is invertible.
Proof Taking adjoints shows that an operator is right invertible if and only if its
adjoint is left invertible. Thus the equivalence of (a) and (c) in this result follows
immediately from the equivalence of (a) and (d) in 10.29 applied to T ∗ instead of T.
Suppose (a) holds, so T is right invertible. Hence there exists S ∈ B(V ) such
that TS = I. Thus T (S f ) = f for every f ∈ V, which implies that T is surjective,
completing the proof that (a) implies (b).
To prove that (b) implies (a), suppose T is surjective. Define R : (null T )⊥ → V
by R = T |(null T )⊥ . Clearly R is injective because
EXERCISES 10A
1 Define $T \colon \ell^2 \to \ell^2$ by $T(a_1, a_2, \ldots) = (0, a_1, a_2, \ldots)$. Find a formula for $T^*$.
2 Suppose V is a Hilbert space, U is a closed subspace of V, and T : U → V is
defined by T f = f . Describe the linear operator T ∗ : V → U.
3 Suppose $V$ and $W$ are Hilbert spaces, $g \in V$, $h \in W$, and $T \in \mathcal{B}(V, W)$ is
defined by $Tf = \langle f, g \rangle h$. Find a formula for $T^*$.
4 Suppose V and W are Hilbert spaces and T ∈ B(V, W ) has finite-dimensional
range. Prove that T ∗ also has finite-dimensional range.
5 Prove or give a counterexample: If V is a Hilbert space and T : V → V is a
bounded linear map such that dim null T < ∞, then dim null T ∗ < ∞.
6 Suppose $T$ is a bounded linear map from a Hilbert space $V$ to a Hilbert space $W$.
Prove that $\|T^* T\| = \|T\|^2$.
[This formula for $\|T^* T\|$ leads to the important subject of $C^*$-algebras.]
7 Suppose $V$ is a Hilbert space and $\operatorname{Inv}(V)$ is the set of invertible bounded operators
on $V$. Think of $\operatorname{Inv}(V)$ as a metric space with the metric it inherits as a
subset of $\mathcal{B}(V)$. Show that $T \mapsto T^{-1}$ is a continuous function from $\operatorname{Inv}(V)$ to
$\operatorname{Inv}(V)$.
8 Suppose T is a bounded operator on a Hilbert space.
(a) Prove that T is left invertible if and only if T ∗ is right invertible.
(b) Prove that T is invertible if and only if T is both left and right invertible.
inf{|bk | : k ∈ Z+ } > 0.
10B Spectrum
Spectrum of an Operator
The following definitions play key roles in operator theory.
\[ \|Tf\| = \|\alpha f\| = |\alpha| \, \|f\|, \]
which implies that $|\alpha| \le \|T\|$. The next result states that the same inequality holds
for elements of $\operatorname{sp}(T)$.
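Thus
\begin{align*}
\|(T - \alpha I)^{-1}\| &\le \frac{1}{|\alpha|} \sum_{k=0}^\infty \frac{\|T\|^k}{|\alpha|^k} \\
&= \frac{1}{|\alpha|} \cdot \frac{1}{1 - \frac{\|T\|}{|\alpha|}} \\
&= \frac{1}{|\alpha| - \|T\|}.
\end{align*}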
The inequality above implies (c), completing the proof.
Our next result will be the key tool in proving that the spectrum of a bounded
operator on a nonzero complex Hilbert space is nonempty (see 10.38). The statement
of the next result and the proofs of the next two results use a bit of basic complex
analysis. Because sp( T ) is a closed subset of C (by 10.36), C \ sp( T ) is an open
subset of C and thus it makes sense to ask whether the function in the result below is
analytic.
To keep things simple, the next two results are stated for complex Hilbert spaces.
See Exercise 6 for the analogous results for complex Banach spaces.
\[ \alpha \mapsto \bigl\langle (T - \alpha I)^{-1} f, g \bigr\rangle \]
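Multiplying both sides of the equation above by $(T - \beta I)^{-1}$ and using the equation
$A^{-1} B^{-1} = (BA)^{-1}$ for invertible operators $A$ and $B$, we get
\[ (T - \alpha I)^{-1} = \sum_{k=0}^\infty (\alpha - \beta)^k \bigl( (T - \beta I)^{-1} \bigr)^{k+1}. \]
Thus for $f, g \in V$, we have
\[ \bigl\langle (T - \alpha I)^{-1} f, g \bigr\rangle = \sum_{k=0}^\infty \Bigl\langle \bigl( (T - \beta I)^{-1} \bigr)^{k+1} f, g \Bigr\rangle (\alpha - \beta)^k. \]
The equation above shows that the function $\alpha \mapsto \langle (T - \alpha I)^{-1} f, g \rangle$ has a power series
expansion as powers of $\alpha - \beta$ for $\alpha$ near $\beta$. Thus this function is analytic near $\beta$.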
Proof Suppose $T \in \mathcal{B}(V)$, where $V$ is a complex Hilbert space with $V \ne \{0\}$, and
$\operatorname{sp}(T) = \emptyset$. Let $f \in V$ with $f \ne 0$. Take $g = T^{-1} f$ in 10.37. Because $\operatorname{sp}(T) = \emptyset$,
10.37 implies that the function
\[ \alpha \mapsto \bigl\langle (T - \alpha I)^{-1} f, T^{-1} f \bigr\rangle \]
is analytic on all of $\mathbf{C}$. The value of the function above at $\alpha = 0$ equals the average
value of the function on each circle in $\mathbf{C}$ centered at 0 (because analytic functions
satisfy the mean value property). But 10.34(c) implies that this function has limit 0
as $|\alpha| \to \infty$. Thus taking the average over large circles, we see that the value of the
function above at $\alpha = 0$ is 0. In other words,
\[ \langle T^{-1} f, T^{-1} f \rangle = 0. \]
10.39 Definition p( T )
p( T ) = b0 I + b1 T + · · · + bn T n .
You should verify that if p and q are polynomials with coefficients in F and T is
an operator, then
( pq)( T ) = p( T ) q( T ).
The next result provides a nice way to compute the spectrum of a polynomial
applied to an operator. For example, this result implies that if $T$ is a bounded operator
on a complex Banach space, then the spectrum of $T^2$ consists of the squares of all
numbers in the spectrum of $T$.
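A finite-dimensional sanity check may help here. On $\mathbf{C}^3$ the spectrum of a matrix is its set of eigenvalues, and for an upper-triangular matrix the eigenvalues are the diagonal entries. The sketch below, in plain Python, evaluates $p(T)$ for the polynomial $p(z) = 1 + z^2$ and an upper-triangular matrix $T$; both are arbitrary choices made for this illustration, and the function names are hypothetical.

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_of_matrix(coeffs, T):
    """Evaluate b0*I + b1*T + ... + bn*T^n by Horner's rule.

    coeffs is the list [b0, b1, ..., bn].
    """
    n = len(T)
    result = [[0j] * n for _ in range(n)]
    for b in reversed(coeffs):
        result = mat_mul(result, T)
        for i in range(n):
            result[i][i] += b
    return result

def p(z):
    return 1 + z ** 2

# Upper-triangular, so the spectrum of T is {2, i, -1}.
T = [[2, 1, 0],
     [0, 1j, 3],
     [0, 0, -1]]

P = poly_of_matrix([1, 0, 1], T)               # p(T) = I + T^2
diag_of_p_T = [P[i][i] for i in range(3)]      # spectrum of p(T)
p_of_diag = [p(T[i][i]) for i in range(3)]     # p applied to spectrum of T
print(diag_of_p_T)
print(p_of_diag)
```

Because $p(T)$ is again upper triangular, its diagonal lists its eigenvalues, and these agree with $p$ applied to the eigenvalues of $T$. Note that $p(i) = 0$, so $p(T)$ is not invertible in this example.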
As with the previous result, the next result fails on real Banach spaces. As you
can see, the proof below uses factorization of a polynomial with complex coefficients
as the product of polynomials with degree 1, which is not necessarily possible when
restricting to the field of real numbers.
Proof If p is a constant polynomial, then both sides of the equation above consist
of the set containing just that constant. Thus we can assume that p is a nonconstant
polynomial.
First suppose $\alpha \in \operatorname{sp}\bigl(p(T)\bigr)$. Thus $p(T) - \alpha I$ is not invertible. By the Fundamental
Theorem of Algebra, there exist $c, \beta_1, \ldots, \beta_n \in \mathbf{C}$ with $c \ne 0$ such that
\[ p(T) - \alpha I = c(T - \beta_1 I) \cdots (T - \beta_n I). \]
The left side of the equation above is not invertible. Hence $T - \beta_k I$ is not invertible
for some $k \in \{1, \ldots, n\}$. Thus $\beta_k \in \operatorname{sp}(T)$. Now 10.41 implies $p(\beta_k) = \alpha$. Hence
$\alpha \in p\bigl(\operatorname{sp}(T)\bigr)$, completing the proof that $\operatorname{sp}\bigl(p(T)\bigr) \subset p\bigl(\operatorname{sp}(T)\bigr)$.
To prove the inclusion in the other direction, now suppose $\alpha \in p\bigl(\operatorname{sp}(T)\bigr)$. Thus
there exists $\beta \in \operatorname{sp}(T)$ such that $\alpha = p(\beta)$. By the Fundamental Theorem of
Algebra, there exist $c, \beta_2, \ldots, \beta_n \in \mathbf{C}$ with $c \ne 0$ such that
\[ p(T) - \alpha I = c(T - \beta I)(T - \beta_2 I) \cdots (T - \beta_n I) \tag{10.42} \]
and
\[ p(T) - \alpha I = c(T - \beta_2 I) \cdots (T - \beta_n I)(T - \beta I). \tag{10.43} \]
Self-adjoint Operators
In this subsection, we look at a nice special class of bounded operators.
The definition of the adjoint implies that a bounded operator $T$ on a Hilbert space
$V$ is self-adjoint if and only if $\langle Tf, g \rangle = \langle f, Tg \rangle$ for all $f, g \in V$. See Exercise 7 for
an interesting result about this last condition.
For real Hilbert spaces, the next result requires the additional hypothesis that $T$
is self-adjoint. To see that this extra hypothesis cannot be eliminated, consider the
operator $T \colon \mathbf{R}^2 \to \mathbf{R}^2$ defined by $T(x, y) = (-y, x)$. Then $T \ne 0$, but with the
standard inner product on $\mathbf{R}^2$, we have $\langle Tf, f \rangle = 0$ for all $f \in \mathbf{R}^2$ (which you can
verify either algebraically or by thinking of $T$ as counterclockwise rotation by a right
angle).
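The algebraic verification takes one line, since the inner product of $T(x, y) = (-y, x)$ with $(x, y)$ is $-yx + xy = 0$. For readers who like to experiment, here is a short sketch in plain Python that checks this on a few integer vectors (arbitrary sample values) and also shows that this $T$ is not self-adjoint. The function names are hypothetical.

```python
def T(v):
    """Counterclockwise rotation of R^2 by a right angle."""
    x, y = v
    return (-y, x)

def inner(u, v):
    """Standard inner product on R^2."""
    return u[0] * v[0] + u[1] * v[1]

# Arbitrary sample vectors with integer entries, so the arithmetic is exact.
samples = [(1, 0), (0, 1), (3, -4), (-7, 2), (5, 5)]

quadratic_form_values = [inner(T(f), f) for f in samples]
print(quadratic_form_values)      # every entry is 0, although T is not 0

f, g = (1, 0), (0, 1)
print(inner(T(f), g), inner(f, T(g)))   # these differ, so T is not self-adjoint
```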
(a) If F = C, then T = 0.
(b) If F = R and T is self-adjoint, then T = 0.
\begin{align*}
\langle Tg, h \rangle &= \frac{\langle T(g + h), g + h \rangle - \langle T(g - h), g - h \rangle}{4} \\
&\quad + \frac{\langle T(g + ih), g + ih \rangle - \langle T(g - ih), g - ih \rangle}{4} \, i,
\end{align*}
as can be verified by computing the right side. Our hypothesis that $\langle Tf, f \rangle = 0$
for all $f \in V$ implies that the right side above equals 0. Thus $\langle Tg, h \rangle = 0$ for all
$g, h \in V$. Taking $h = Tg$, we can conclude that $T = 0$, which completes the proof
of (a).
Now suppose that $\mathbf{F} = \mathbf{R}$ and that $T$ is self-adjoint. Then
\[ \langle Tg, h \rangle = \frac{\langle T(g + h), g + h \rangle - \langle T(g - h), g - h \rangle}{4}; \tag{10.47} \]
this is proved by computing the right side using the equation
where the first equality holds because $T$ is self-adjoint and the second equality holds
because we are working in a real Hilbert space. Each term on the right side of 10.47
is of the form $\langle Tf, f \rangle$ for appropriate $f$. Thus $\langle Tg, h \rangle = 0$ for all $g, h \in V$. This
implies that $T = 0$ (take $h = Tg$), completing the proof of (b).
Some insight into the adjoint can be obtained by thinking of the operation $T \mapsto T^*$
on $\mathcal{B}(V)$ as analogous to the operation $z \mapsto \overline{z}$ on $\mathbf{C}$. Under this analogy, the
self-adjoint operators (characterized by $T^* = T$) correspond to the real numbers
(characterized by $\overline{z} = z$). The first two bullet points in Example 10.45 illustrate this
analogy, as we saw that a multiplication operator on $L^2(\mu)$ is self-adjoint if and only
if the multiplier is real-valued almost everywhere.
The next two results deepen the analogy between the self-adjoint operators and
the real numbers. First we see this analogy reflected in the behavior of $\langle Tf, f \rangle$, and
then we see this analogy reflected in the spectrum.
\[ \langle Tf, f \rangle - \overline{\langle Tf, f \rangle} = \langle Tf, f \rangle - \langle f, Tf \rangle = \langle Tf, f \rangle - \langle T^* f, f \rangle = \langle (T - T^*) f, f \rangle. \]
If $\langle Tf, f \rangle \in \mathbf{R}$ for every $f \in V$, then the left side of the equation above equals 0, so
$\langle (T - T^*) f, f \rangle = 0$ for every $f \in V$. This implies that $T - T^* = 0$ [by 10.46(a)].
Hence $T$ is self-adjoint.
Conversely, if $T$ is self-adjoint, then the right side of the equation above equals
0, so $\langle Tf, f \rangle = \overline{\langle Tf, f \rangle}$ for every $f \in V$. This implies that $\langle Tf, f \rangle \in \mathbf{R}$ for every
$f \in V$, as desired.
Proof The desired result holds if F = R because the spectrum of every operator on
a real Hilbert space is, by definition, contained in R.
Thus we assume that T is a bounded operator on a complex Hilbert space V.
Suppose $\alpha, \beta \in \mathbf{R}$, with $\beta \ne 0$. If $f \in V$, then
\begin{align*}
\bigl\| \bigl( T - (\alpha + \beta i) I \bigr) f \bigr\| \, \|f\| &\ge \bigl| \bigl\langle \bigl( T - (\alpha + \beta i) I \bigr) f, f \bigr\rangle \bigr| \\
&= \bigl| \langle Tf, f \rangle - \alpha \|f\|^2 - \beta \|f\|^2 i \bigr| \\
&\ge |\beta| \, \|f\|^2,
\end{align*}
where the first inequality comes from the Cauchy–Schwarz Inequality (8.11) and the
last inequality holds because $\langle Tf, f \rangle - \alpha \|f\|^2 \in \mathbf{R}$ (by 10.48).
The inequality above implies that
\[ \|f\| \le \frac{1}{|\beta|} \bigl\| \bigl( T - (\alpha + \beta i) I \bigr) f \bigr\| \]
for all $f \in V$. Now the equivalence of (a) and (b) in 10.29 shows that $T - (\alpha + \beta i) I$
is left invertible.
Because $T$ is self-adjoint, the adjoint of $T - (\alpha + \beta i) I$ is $T - (\alpha - \beta i) I$, which
is left invertible by the same argument as above (just replace $\beta$ by $-\beta$). Hence
$T - (\alpha + \beta i) I$ is right invertible (because its adjoint is left invertible). Because the
operator $T - (\alpha + \beta i) I$ is both left and right invertible, it is invertible. In other words,
$\alpha + \beta i \notin \operatorname{sp}(T)$. Thus $\operatorname{sp}(T) \subset \mathbf{R}$, as desired.
Normal Operators
Now we consider another nice special class of operators.
T ∗ T = TT ∗ .
Clearly every self-adjoint operator is normal, but there exist normal operators that
are not self-adjoint, as shown in the next example.
Proof If f ∈ V, then
\[ \|Tf\|^2 - \|T^* f\|^2 = \langle Tf, Tf \rangle - \langle T^* f, T^* f \rangle = \langle (T^* T - T T^*) f, f \rangle. \]
If $T$ is normal, then the right side of the equation above equals 0, which implies that
the left side also equals 0 and hence $\|Tf\| = \|T^* f\|$.
Conversely, suppose $\|Tf\| = \|T^* f\|$ for all $f \in V$. Then the left side of the
equation above equals 0, which implies that the right side also equals 0 for all f ∈ V.
Because T ∗ T − TT ∗ is self-adjoint, 10.46 now implies that T ∗ T − TT ∗ = 0. Thus
T is normal, completing the proof.
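The characterization just proved is easy to test on 2-by-2 real matrices, where the adjoint is the transpose. The sketch below, in plain Python with integer entries so that all arithmetic is exact, compares $\|Tf\|^2$ with $\|T^* f\|^2$ for one normal matrix that is not self-adjoint and for one matrix that is not normal. Both matrices and the test vectors are arbitrary choices made for this illustration, and the function names are hypothetical.

```python
def transpose(M):
    """Adjoint of a real matrix with respect to the standard inner product."""
    return [[M[j][i] for j in range(len(M))] for i in range(len(M[0]))]

def apply(M, f):
    return [sum(M[i][j] * f[j] for j in range(len(f))) for i in range(len(M))]

def norm_sq(f):
    return sum(x * x for x in f)

def norms_agree(M, vectors):
    """True if ||M f||^2 equals ||M* f||^2 for every f in vectors."""
    M_star = transpose(M)
    return all(norm_sq(apply(M, f)) == norm_sq(apply(M_star, f))
               for f in vectors)

vectors = [[1, 0], [0, 1], [2, 3], [-1, 4]]

normal_matrix = [[1, -1], [1, 1]]      # commutes with its transpose
non_normal_matrix = [[1, 1], [0, 1]]   # does not commute with its transpose

print(norms_agree(normal_matrix, vectors))       # True
print(norms_agree(non_normal_matrix, vectors))   # False
```

For the second matrix the vector $(1, 0)$ already shows the failure: $\|Tf\|^2 = 1$ while $\|T^* f\|^2 = 2$.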
Each complex number can be written in the form a + bi, where a and b are real
numbers. Part (a) of the next result gives the analogous result for bounded operators
on a complex Hilbert space, with self-adjoint operators playing the role of real
numbers. We could call the operators A and B in part (a) the real and imaginary parts
of the operator T. Part (b) below shows that normality depends upon whether these
real and imaginary parts commute.
10.54 Operator is normal if and only if its real and imaginary parts commute
Suppose V is a Hilbert space and T ∈ B(V ) is normal. Then the following are
equivalent:
(a) T is invertible;
(b) T is left invertible;
(c) T is right invertible;
(d) T is surjective;
(e) T is injective and has closed range;
(f) T ∗ T is invertible;
(g) TT ∗ is invertible.
Proof Because T is normal, (f) and (g) are clearly equivalent. From 10.29, we know
that (f), (b), and (e) are equivalent to each other. From 10.31, we know that (g),
(c), and (d) are equivalent to each other. Thus (b), (c), (d), (e), (f), and (g) are all
equivalent to each other.
Clearly (a) implies (b).
Suppose (b) holds. We already know that (b) and (c) are equivalent; thus T is left
invertible and T is right invertible. Hence T is invertible, proving that (b) implies (a)
and completing the proof that (a) through (g) are all equivalent with each other.
Because every self-adjoint operator is normal, the following result also holds for
self-adjoint operators.
\[ (Tf)(n) = f(n - 1) \]
for $f \colon \mathbf{Z} \to \mathbf{F}$ with $\sum_{k=-\infty}^\infty |f(k)|^2 < \infty$. Then $T$ is an isometry and is unitary.
By definition, isometries preserve norms. The equivalence of (a) and (b) in the
following result shows that isometries also preserve inner products.
(a) T is an isometry;
(b) $\langle Tf, Tg \rangle = \langle f, g \rangle$ for all $f, g \in V$;
(c) $T^* T = I$;
(d) $\{Te_k\}_{k \in \Gamma}$ is an orthonormal family for every orthonormal family $\{e_k\}_{k \in \Gamma}$
in $V$;
(e) $\{Te_k\}_{k \in \Gamma}$ is an orthonormal family for some orthonormal basis $\{e_k\}_{k \in \Gamma}$
of $V$.
Proof If f ∈ V, then
\[ \|Tf\|^2 - \|f\|^2 = \langle Tf, Tf \rangle - \langle f, f \rangle = \langle (T^* T - I) f, f \rangle. \]
Thus $\|Tf\| = \|f\|$ for all $f \in V$ if and only if the right side of the equation above
is 0 for all $f \in V$. Because $T^* T - I$ is self-adjoint, this happens if and only if
$T^* T - I = 0$ (by 10.46). Thus (a) is equivalent to (c).
If $T^* T = I$, then $\langle Tf, Tg \rangle = \langle T^* T f, g \rangle = \langle f, g \rangle$ for all $f, g \in V$. Thus (c)
implies (b).
Taking g = f in (b), we see that (b) implies (a). Hence we now know that (a), (b),
and (c) are equivalent to each other.
To prove that (b) implies (d), suppose (b) holds. If $\{e_k\}_{k \in \Gamma}$ is an orthonormal
family in $V$, then $\langle Te_j, Te_k \rangle = \langle e_j, e_k \rangle$ for all $j, k \in \Gamma$, and thus $\{Te_k\}_{k \in \Gamma}$ is an
orthonormal family in $V$. Hence (b) implies (d).
Because V has an orthonormal basis (see 8.66 or 8.74), (d) implies (e).
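Finally, suppose (e) holds. Thus $\{Te_k\}_{k\in\Gamma}$ is an orthonormal family for some orthonormal basis $\{e_k\}_{k\in\Gamma}$ of $V$. Suppose $f \in V$. Then by 8.62(a) we have
$f = \sum_{j\in\Gamma} \langle f, e_j \rangle e_j,$
which implies that $T^*Tf = \sum_{j\in\Gamma} \langle f, e_j \rangle T^*Te_j$. Thus if $k \in \Gamma$, then
$\langle T^*Tf, e_k \rangle = \sum_{j\in\Gamma} \langle f, e_j \rangle \langle T^*Te_j, e_k \rangle = \sum_{j\in\Gamma} \langle f, e_j \rangle \langle Te_j, Te_k \rangle = \langle f, e_k \rangle,$
where the last equality holds because $\langle Te_j, Te_k \rangle$ equals 1 if $j = k$ and equals 0 otherwise. Because the equality above holds for every $e_k$ in the orthonormal basis $\{e_k\}_{k\in\Gamma}$, we conclude that $T^*Tf = f$. Thus (e) implies (c), completing the proof.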
The equivalence between (a) and (c) in the previous result shows that every unitary
operator is an isometry.
Next we have a result giving conditions that are equivalent to being a unitary
operator. Notice that parts (d) and (e) of the previous result refer to orthonormal
families, while parts (f) and (g) of the following result refer to orthonormal bases.
(a) $T$ is unitary;
(b) $T$ is a surjective isometry;
(c) $T$ and $T^*$ are both isometries;
(d) $T^*$ is unitary;
(e) $T$ is invertible and $T^{-1} = T^*$;
(f) $\{Te_k\}_{k\in\Gamma}$ is an orthonormal basis of $V$ for every orthonormal basis $\{e_k\}_{k\in\Gamma}$ of $V$;
(g) $\{Te_k\}_{k\in\Gamma}$ is an orthonormal basis of $V$ for some orthonormal basis $\{e_k\}_{k\in\Gamma}$ of $V$.
Proof The equivalence of (a), (d), and (e) follows easily from the definition of
unitary.
The equivalence of (a) and (c) follows from the equivalence in 10.60 of (a) and (c).
To prove that (a) implies (b), suppose (a) holds, so T is unitary. As we have
already noted, this implies that T is an isometry. Also, the equation TT ∗ = I implies
that T is surjective. Thus (b) holds, proving that (a) implies (b).
Now suppose (b) holds, so $T$ is a surjective isometry. Because $T$ is surjective and
injective, $T$ is invertible. The equation $T^*T = I$ [which follows from the equivalence
in 10.60 of (a) and (c)] now implies that $T^{-1} = T^*$. Thus (b) implies (e). Hence at
this stage of the proof, we know that (a), (b), (c), (d), and (e) are all equivalent to
each other.
To prove that (b) implies (f), suppose (b) holds, so T is a surjective isometry.
Suppose {ek }k∈Γ is an orthonormal basis of V. The equivalence in 10.60 of (a) and (d)
implies that { Tek }k∈Γ is an orthonormal family. Because {ek }k∈Γ is an orthonormal
basis of V and T is surjective, the closure of the span of { Tek }k∈Γ equals V. Thus
{ Tek }k∈Γ is an orthonormal basis of V, which proves that (b) implies (f).
Obviously (f) implies (g).
Now suppose (g) holds. The equivalence in 10.60 of (a) and (e) implies that $T$
is an isometry, which implies that the range of $T$ is closed. Because $\{Te_k\}_{k\in\Gamma}$ is an
orthonormal basis of $V$, the closure of the range of $T$ equals $V$. Thus $T$ is a surjective
isometry, proving that (g) implies (b) and completing the proof that (a) through (g) are all
equivalent to each other.
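$(T - \alpha I)^*(T - \alpha I) = (1 + |\alpha|^2) I - (\alpha T^* + \bar{\alpha} T)$
10.63 $\qquad = (1 + |\alpha|^2)\Bigl( I - \dfrac{\alpha T^* + \bar{\alpha} T}{1 + |\alpha|^2} \Bigr).$
Looking at the last term in parentheses above, we have
10.64 $\qquad \Bigl\| \dfrac{\alpha T^* + \bar{\alpha} T}{1 + |\alpha|^2} \Bigr\| \le \dfrac{2|\alpha|}{1 + |\alpha|^2} < 1,$
where the last inequality holds because $|\alpha| \ne 1$. Now 10.64, 10.63, and 10.22 imply
that $(T - \alpha I)^*(T - \alpha I)$ is invertible. Thus $T - \alpha I$ is left invertible. Because $T - \alpha I$
is normal, this implies that $T - \alpha I$ is invertible (see 10.55). Hence $\alpha \notin \operatorname{sp}(T)$. Thus
$\operatorname{sp}(T) \subset \{\alpha \in \mathbf{F} : |\alpha| = 1\}$, as desired.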
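The two inequalities in 10.64 can be checked directly. The following worked step is a sketch that assumes, as holds for a unitary operator on a nonzero Hilbert space, that $\|T\| = \|T^*\| = 1$:

```latex
\[
\Bigl\| \frac{\alpha T^* + \bar{\alpha} T}{1 + |\alpha|^2} \Bigr\|
\le \frac{|\alpha|\,\|T^*\| + |\bar{\alpha}|\,\|T\|}{1 + |\alpha|^2}
= \frac{2|\alpha|}{1 + |\alpha|^2},
\qquad
1 + |\alpha|^2 - 2|\alpha| = (1 - |\alpha|)^2 > 0
\quad \text{if } |\alpha| \ne 1 .
\]
```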
As a special case of the next result, we can conclude (without doing any calculations!)
that the spectrum of the right shift on $\ell^2$ is $\{\alpha \in \mathbf{F} : |\alpha| \le 1\}$.
Proof Because T is an isometry but is not unitary, we know that T is not surjective
[by the equivalence of (a) and (b) in 10.61]. In particular, T is not invertible. Thus
T ∗ is not invertible.
Suppose α ∈ F with |α| < 1. Because T ∗ T = I, we have
T ∗ ( T − αI ) = I − αT ∗ .
The right side of the equation above is invertible (by 10.22). If $T - \alpha I$ were invertible,
then the equation above would imply $T^* = (I - \alpha T^*)(T - \alpha I)^{-1}$, which would
make $T^*$ invertible as the product of invertible operators. However, the paragraph
above shows $T^*$ is not invertible. Thus $T - \alpha I$ is not invertible. Hence $\alpha \in \operatorname{sp}(T)$.
Thus {α ∈ F : |α| < 1} ⊂ sp( T ). Because sp( T ) is closed (see 10.36), this
implies {α ∈ F : |α| ≤ 1} ⊂ sp( T ). The inclusion in the other direction follows
from 10.34(a). Thus sp( T ) = {α ∈ F : |α| ≤ 1}.
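The right shift is the standard example of an isometry that is not unitary. The sketch below illustrates this on finitely supported sequences, which are modeled as Python lists (an illustrative assumption: entries beyond the end of a list are taken to be 0). The function names are hypothetical, and the code uses the standard fact that the adjoint of the right shift is the left shift.

```python
# Finitely supported sequences in l^2 are modeled as Python lists
# (assumption: entries beyond the end of the list are 0).

def right_shift(a):
    """T(a1, a2, ...) = (0, a1, a2, ...)."""
    return [0] + list(a)

def left_shift(a):
    """T*(a1, a2, a3, ...) = (a2, a3, ...), the adjoint of the right shift."""
    return list(a[1:])

def norm_sq(a):
    """Square of the l^2 norm of a finitely supported sequence."""
    return sum(abs(x) ** 2 for x in a)

f = [3, -1, 4, 1, 5]

# T is an isometry: it preserves norms, and T*T = I.
isometry_ok = norm_sq(right_shift(f)) == norm_sq(f)
tstar_t = left_shift(right_shift(f))

# T is not unitary: TT* replaces the first coordinate by 0, so TT* != I.
t_tstar = right_shift(left_shift(f))

print(isometry_ok)  # True
print(tstar_t)      # [3, -1, 4, 1, 5]
print(t_tstar)      # [0, -1, 4, 1, 5]
```

Because the first coordinate of every sequence in the range of the right shift is 0, the right shift is not surjective, which is the property used at the start of the proof above.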
EXERCISES 10B
1 Verify all the assertions in Example 10.33.
2 Suppose T is a bounded operator on a Hilbert space V.
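(a) Prove that $\operatorname{sp}(S^{-1}TS) = \operatorname{sp}(T)$ for all bounded invertible operators $S$ on $V$.
(b) Prove that $\operatorname{sp}(T^*) = \{\bar{\alpha} : \alpha \in \operatorname{sp}(T)\}$.
(c) Prove that if $T$ is invertible, then $\operatorname{sp}(T^{-1}) = \bigl\{ \tfrac{1}{\alpha} : \alpha \in \operatorname{sp}(T) \bigr\}$.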
α 7→ ϕ ( T − αI )−1 f
TC ( f + ig) = T f + iTg
21 (a) Prove that if $T$ is a bounded operator on a Banach space $V$, then the infinite
sum above converges in $\mathcal{B}(V)$ and $\|e^T\| \le e^{\|T\|}$.
(b) Prove that if $S, T$ are bounded operators on a Banach space $V$ such that
$ST = TS$, then $e^S e^T = e^{S+T}$.
(c) Prove that if $T$ is a self-adjoint operator on a complex Hilbert space, then
$e^{iT}$ is unitary.
The next result provides a large class of examples of compact operators. We will
see more examples after proving a few more results.
Not every bounded operator is compact. For example, the identity map on an
infinite-dimensional Hilbert space is not compact (to see this, consider an orthonormal
sequence, which does not have a convergent subsequence because the distance
between any two distinct elements of the orthonormal sequence is $\sqrt{2}$).
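The distance claim follows from $\|e_j - e_k\|^2 = \|e_j\|^2 + \|e_k\|^2 = 2$ for $j \ne k$. As a small numerical illustration, the sketch below uses the standard basis of $\mathbf{R}^n$ as a finite stand-in for an orthonormal sequence (the helper names are hypothetical):

```python
import math

def e(k, n):
    """k-th standard basis vector of R^n (0-indexed)."""
    return [1.0 if i == k else 0.0 for i in range(n)]

def dist(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

n = 6
# Collect the distances between all pairs of distinct basis vectors.
distances = {round(dist(e(j, n), e(k, n)), 12)
             for j in range(n) for k in range(n) if j != k}

# Only one value occurs, namely sqrt(2) (up to rounding).
print(distances)
```

Since all pairwise distances equal $\sqrt{2}$, no subsequence can be a Cauchy sequence.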
Proof We will show that if $T$ is an operator that is not bounded, then $T$ is not compact.
To do this, suppose $V$ is a Hilbert space and $T$ is an operator on $V$ that is not bounded.
Thus there exists a bounded sequence $f_1, f_2, \dots$ in $V$ such that $\lim_{n\to\infty} \|Tf_n\| = \infty$.
Hence no subsequence of $Tf_1, Tf_2, \dots$ converges, which means $T$ is not compact.
The previous result now allows us to see many new examples of compact operators.
for all $g, h \in L^2(\mu)$, where we have used Tonelli's Theorem, Fubini's Theorem, and
the Cauchy–Schwarz Inequality. For fixed $h \in L^2(\mu)$, the right side above equalling
0 for all $g \in L^2(\mu)$ implies that
$\int h(y) F(x, y) \, d\mu(y) = 0$
As a special case of the previous result, we can now see that the Volterra operator
$V \colon L^2([0, 1]) \to L^2([0, 1])$ defined by
$(Vf)(x) = \int_0^x f$
is compact. This holds because, as shown in Example 10.15, the Volterra operator is
an integral operator of the type considered in the previous result.
The Volterra operator is injective [because differentiating both sides of the equation
$\int_0^x f = 0$ with respect to $x$ and using the Lebesgue Differentiation Theorem (4.19)
shows that $f = 0$]. Thus the Volterra operator is an example of a compact operator
with infinite-dimensional range. The next example provides another class of compact
operators that do not necessarily have finite-dimensional range.
The next result states that an operator is compact if and only if its adjoint is
compact.
$\|T^* f_{n_j} - T^* f_{n_k}\|^2 = \langle T^*(f_{n_j} - f_{n_k}), T^*(f_{n_j} - f_{n_k}) \rangle$
$\qquad = \langle TT^*(f_{n_j} - f_{n_k}), f_{n_j} - f_{n_k} \rangle$
$\qquad \le \|TT^*(f_{n_j} - f_{n_k})\| \, \|f_{n_j} - f_{n_k}\|.$
The inequality above implies that $T^* f_{n_1}, T^* f_{n_2}, \dots$ is a Cauchy sequence and hence
converges. Thus $T^*$ is a compact operator, completing the proof that if $T$ is compact,
then $T^*$ is compact.
Now suppose T ∗ is compact. By the result proved in the paragraph above, ( T ∗ )∗
is compact. Because ( T ∗ )∗ = T (see 10.11), we conclude that T is compact.
To prove the claim above, suppose that it is false. Then for each $n \in \mathbf{Z}^+$, there exists
$f_n \in \bigl(\operatorname{null}(T - \alpha I)\bigr)^\perp$ such that
$\|f_n\| = 1$ and $\|(T - \alpha I) f_n\| < \frac{1}{n}$.
$\lim_{k\to\infty} T f_{n_k} = g$
10.79 $\qquad \lim_{k\to\infty} (T - \alpha I) f_{n_k} = 0$
$\lim_{k\to\infty} f_{n_k} = \frac{1}{\alpha} g.$
The equation above implies $\|g\| = |\alpha|$; hence $g \ne 0$. Each $f_{n_k} \in \bigl(\operatorname{null}(T - \alpha I)\bigr)^\perp$;
hence we also conclude that $g \in \bigl(\operatorname{null}(T - \alpha I)\bigr)^\perp$. Applying $T - \alpha I$ to both sides of
the equation above and using 10.79 shows that $g \in \operatorname{null}(T - \alpha I)$. Thus $g$ is a nonzero
element of both $\operatorname{null}(T - \alpha I)$ and its orthogonal complement. This contradiction
completes the proof of the claim in 10.78.
To show that $\operatorname{range}(T - \alpha I)$ is closed, suppose $h_1, h_2, \dots$ is a sequence in
$\operatorname{range}(T - \alpha I)$ that converges to some $h \in V$. For each $n \in \mathbf{Z}^+$, there exists
$f_n \in \bigl(\operatorname{null}(T - \alpha I)\bigr)^\perp$ such that $(T - \alpha I) f_n = h_n$. Because $h_1, h_2, \dots$ is a Cauchy
sequence, 10.78 shows that $f_1, f_2, \dots$ is also a Cauchy sequence. Thus there exists
$f \in V$ such that $\lim_{n\to\infty} f_n = f$, which implies $h = (T - \alpha I) f \in \operatorname{range}(T - \alpha I)$.
Hence $\operatorname{range}(T - \alpha I)$ is closed.
Tg − αg = f
The next lemma will be used in our proof of the Fredholm Alternative (10.83).
Note that this lemma implies that every injective operator on a finite-dimensional
vector space is surjective (a finite-dimensional vector space cannot have an infinite
chain of strictly decreasing subspaces because the dimension decreases by at least
1 in each step). Also, see Exercise 9 for the analogous result implying that every
surjective operator on a finite-dimensional vector space is injective.
10.82 $\qquad f \notin \operatorname{range} T.$
(a) α ∈ sp( T );
(b) α is an eigenvalue of T;
(c) T − αI is not surjective.
Because $f_j$ and $f_k$ are both in $\operatorname{range}\,(T - \alpha I)^j$, the first two terms on the right side of
10.86 are in $\operatorname{range}\,(T - \alpha I)^{j+1}$. Because $j + 1 \le k$, the third term in 10.86 is also in
$\operatorname{range}\,(T - \alpha I)^{j+1}$. Now 10.85 implies that the last term in 10.86 is orthogonal to
the sum of the first three terms. Thus 10.86 leads to the inequality
$\|Tf_j - Tf_k\| \ge \|\alpha f_j\| = |\alpha|.$
The inequality above implies that $Tf_1, Tf_2, \dots$ has no convergent subsequence, which
contradicts the compactness of $T$. This contradiction means the assumption that $\alpha$ is
not an eigenvalue of $T$ was false, completing the proof that (a) implies (b).
At this stage, we know that (a) and (b) are equivalent and that (c) implies (a). To
prove that (a) implies (c), suppose $\alpha \in \operatorname{sp}(T)$. Thus $\bar{\alpha} \in \operatorname{sp}(T^*)$. Applying the
equivalence of (a) and (b) to $T^*$, we conclude that $\bar{\alpha}$ is an eigenvalue of $T^*$. Thus
applying 10.13(d) to $T - \alpha I$ shows that $T - \alpha I$ is not surjective, completing the proof
that (a) implies (c).
The previous result traditionally has the word alternative in its name because it
can be rephrased as follows:
If T is a compact operator on a Hilbert space V and α ∈ F \ {0}, then
exactly one of the following holds:
for almost every $x \in [0, 1]$. The left side of 10.88 is a continuous function of $x$ and
thus so is the right side, which implies that $f$ is continuous. The continuity of $f$
now implies that the left side of 10.88 has a continuous derivative, and thus $f$ has a
continuous derivative.
Now differentiate both sides of 10.88 with respect to $x$, getting
$f(x) = \alpha f'(x)$
for all $x \in (0, 1)$. Standard calculus shows that the equation above implies that
$f(x) = c e^{x/\alpha}$
for some constant $c$. However, 10.88 implies that the continuous function $f$ must
satisfy the equation $f(0) = 0$. Thus $c = 0$, which implies $f = 0$.
The conclusion of the last paragraph shows that $\alpha$ is not an eigenvalue of $V$. The
Fredholm Alternative (10.83) now shows that $\alpha \notin \operatorname{sp}(V)$. Thus $\operatorname{sp}(V) = \{0\}$.
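The "standard calculus" step can be spelled out as follows. This is only a sketch: the integrating factor $e^{-x/\alpha}$ is not named in the text, and $\alpha \ne 0$ is assumed as in the surrounding argument.

```latex
\[
\frac{d}{dx}\Bigl( e^{-x/\alpha} f(x) \Bigr)
= e^{-x/\alpha}\Bigl( f'(x) - \frac{1}{\alpha} f(x) \Bigr) = 0
\quad\Longrightarrow\quad
e^{-x/\alpha} f(x) = c
\quad\Longrightarrow\quad
f(x) = c\,e^{x/\alpha},
\]
\[
f(0) = 0 \quad\Longrightarrow\quad c = 0 \quad\Longrightarrow\quad f = 0 .
\]
```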
Our next result states that if $T$ is a compact operator and $\alpha \ne 0$, then $\operatorname{null}(T - \alpha I)$
and $\operatorname{null}(T^* - \bar{\alpha} I)$ have the same dimension (denoted dim). This result about
the dimensions of spaces of eigenvectors is easier to prove in finite dimensions.
Specifically, suppose $S$ is an operator on a finite-dimensional Hilbert space $V$ (you
can think of $S = T - \alpha I$). Then
$\dim \operatorname{null} S = \dim V - \dim \operatorname{range} S = \dim (\operatorname{range} S)^\perp = \dim \operatorname{null} S^*,$
where the justification for each step should be familiar to you from finite-dimensional
linear algebra. This finite-dimensional proof does not work in infinite dimensions
because the expression dim V − dim range S could be of the form ∞ − ∞.
Although the dimensions of the two null spaces in the result below are the same,
even in finite dimensions the two null spaces are not necessarily equal to each other
(but we do have equality of the two null spaces when T is normal; see 10.56).
Note that both dimensions in the result below are finite (by 10.80 and 10.73).
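The finite-dimensional statement can be illustrated with a small matrix. The sketch below is only an illustration under stated assumptions: the matrix `S` is a hypothetical example with real entries (so its adjoint is its transpose), and the rank is computed by Gaussian elimination over the rationals.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix, via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def transpose(rows):
    return [list(col) for col in zip(*rows)]

def apply(rows, v):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in rows]

# Hypothetical example operator on R^3.
S = [[0, 1, 0],
     [0, 0, 0],
     [0, 0, 2]]
S_adj = transpose(S)  # real entries, so the adjoint is the transpose

n = len(S)
dim_null_S = n - rank(S)
dim_null_S_adj = n - rank(S_adj)
print(dim_null_S, dim_null_S_adj)  # 1 1

# The null spaces have the same dimension but are not equal:
v = [1, 0, 0]
print(apply(S, v))      # [0, 0, 0], so v is in null S
print(apply(S_adj, v))  # [0, 1, 0], so v is not in null S*
```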
{α ∈ sp( T ) : |α| ≥ δ}
Proof Fix $\delta > 0$. Suppose there exist distinct $\alpha_1, \alpha_2, \dots$ in $\operatorname{sp}(T)$ with $|\alpha_n| \ge \delta$
for every $n \in \mathbf{Z}^+$. The Fredholm Alternative (10.83) implies that each $\alpha_n$ is an
eigenvalue of $T$. For $n \in \mathbf{Z}^+$, let
$U_n = \operatorname{null}\bigl( (T - \alpha_1 I) \cdots (T - \alpha_n I) \bigr).$
10.92 $\qquad f_n \in U_n \cap (U_{n-1}^{\perp})$
such that $\|f_n\| = 1$.
Now suppose $j, k \in \mathbf{Z}^+$ with $j < k$. Then
10.93 $\qquad Tf_j - Tf_k = (T - \alpha_j I) f_j - (T - \alpha_k I) f_k + \alpha_j f_j - \alpha_k f_k.$
Because $j + 1 \le k$, the first three terms on the right side of 10.93 are in $U_{k-1}$. Now
10.92 implies that the last term in 10.93 is orthogonal to the sum of the first three
terms. Thus 10.93 leads to the inequality
$\|Tf_j - Tf_k\| \ge \|\alpha_k f_k\| = |\alpha_k| \ge \delta.$
EXERCISES 10C
1 Prove that if $T$ is a compact operator on a Hilbert space $V$ and $e_1, e_2, \dots$ is an
orthonormal sequence in $V$, then $\lim_{n\to\infty} Te_n = 0$.
2 Prove that if $T$ is a compact operator on $L^2([0, 1])$, then $\lim_{n\to\infty} \sqrt{n}\, \|T(x^n)\|_2 = 0$,
where $x^n$ means the element of $L^2([0, 1])$ defined by $x \mapsto x^n$.
3 Suppose $T$ is a compact operator on a Hilbert space $V$ and $f_1, f_2, \dots$ is a
sequence in $V$ such that $\lim_{n\to\infty} \langle f_n, g \rangle = 0$ for every $g \in V$. Prove that
$\lim_{n\to\infty} \|Tf_n\| = 0$.
$T(a_1, a_2, \dots) = (a_1 b_1, a_2 b_2, \dots).$
$\sum_{k\in\Gamma} \|Te_k\|^2 < \infty,$
then $T$ is compact.
7 Suppose $T$ is a bounded operator on a Hilbert space $V$. Prove that if $\{e_k\}_{k\in\Gamma}$
and $\{f_j\}_{j\in\Omega}$ are orthonormal bases of $V$, then
$\sum_{k\in\Gamma} \|Te_k\|^2 = \sum_{j\in\Omega} \|Tf_j\|^2.$
15 Prove that if V is a separable Hilbert space, then the Banach space C(K ) is
separable.
10.95 $\qquad \lim_{n\to\infty} \|Tf_n\| = \|T\|.$
Then
$\bigl\| T^*Tf_n - \|T\|^2 f_n \bigr\|^2 = \|T^*Tf_n\|^2 - 2\|T\|^2 \langle T^*Tf_n, f_n \rangle + \|T\|^4$
$\qquad = \|T^*Tf_n\|^2 - 2\|T\|^2 \|Tf_n\|^2 + \|T\|^4$
10.96 $\qquad \le 2\|T\|^4 - 2\|T\|^2 \|Tf_n\|^2,$
The next result indicates one way in which self-adjoint compact operators behave
like self-adjoint operators on finite-dimensional Hilbert spaces.
$T^2 - \|T\|^2 I = \bigl( T - \|T\| I \bigr)\bigl( T + \|T\| I \bigr).$
$\langle (T|_U) f, g \rangle = \langle Tf, g \rangle = \langle f, Tg \rangle = \langle f, (T|_U) g \rangle$
for all f , g ∈ U. The next result shows that a bit more is true.
The next result is one of the major highlights of the theory of compact operators
on Hilbert spaces. The result as stated below applies to both real and complex Hilbert
spaces. In the case of a real Hilbert space, the result below can be combined with
10.101(a) to produce the following result: A compact operator on a real Hilbert
space is self-adjoint if and only if there is an orthonormal basis of the Hilbert space
consisting of eigenvectors of the operator.
$Tf = \sum_{k\in\Omega} \alpha_k \langle f, e_k \rangle e_k$
for every $f \in V$.
Proof Let $U$ denote the span of all the eigenvectors of $T$. Then $U$ is an invariant
subspace for $T$. Hence $U^\perp$ is also an invariant subspace for $T$ and $T|_{U^\perp}$ is a self-adjoint
operator on $U^\perp$ (by 10.100). However, $T|_{U^\perp}$ has no eigenvalues, because
all the eigenvectors of $T$ are in $U$. Because all self-adjoint compact operators on a
nonzero Hilbert space have an eigenvalue (by 10.97), this implies that $U^\perp = \{0\}$.
Hence $\overline{U} = V$ (by 8.42).
For each eigenvalue α of T, there is an orthonormal basis of null( T − αI ) consist-
ing of eigenvectors corresponding to the eigenvalue α. The union (over all eigenvalues
α of T) of all these orthonormal bases is an orthonormal family in V because eigen-
vectors corresponding to distinct eigenvalues are orthogonal (see 10.57). The previous
paragraph tells us that the closure of the span of this orthonormal family is V (here
we are using the set itself as the index set). Hence we have an orthonormal basis of
V consisting of eigenvectors of T, completing the proof of (a).
By part (a) of this result, there is an orthonormal basis $\{e_k\}_{k\in\Gamma}$ of $V$ and a family
$\{\alpha_k\}_{k\in\Gamma}$ in $\mathbf{R}$ such that $Te_k = \alpha_k e_k$ for each $k \in \Gamma$ (even if $\mathbf{F} = \mathbf{C}$, the eigenvalues
of $T$ are in $\mathbf{R}$ by 10.49). Thus if $f \in V$, then
$Tf = T\Bigl( \sum_{k\in\Gamma} \langle f, e_k \rangle e_k \Bigr) = \sum_{k\in\Gamma} \langle f, e_k \rangle Te_k = \sum_{k\in\Gamma} \alpha_k \langle f, e_k \rangle e_k.$
$Tf = \sum_{k\in\Omega} \alpha_k \langle f, e_k \rangle e_k$
for every $f \in V$. The set $\Omega$ is countable because $T$ has only countably many
eigenvalues (by 10.91) and each nonzero eigenvalue can appear only finitely many
times in the sum above (by 10.80), completing the proof of (b).
A normal compact operator on a nonzero real Hilbert space might have no eigenvalues
[consider, for example, the normal operator $T$ of counterclockwise rotation by
a right angle on $\mathbf{R}^2$ defined by $T(x, y) = (-y, x)$]. However, the next result shows
that normal compact operators on complex Hilbert spaces behave better. The key idea
in proving this result is that on a complex Hilbert space, the real and imaginary parts
of a normal compact operator are commuting self-adjoint compact operators, which
then allows us to apply the Spectral Theorem for self-adjoint compact operators.
Proof One direction of this result has already been proved as part (b) of 10.101.
To prove the other direction, suppose T is a normal compact operator. We can
write
T = A + iB,
where A and B are self-adjoint operators and, because T is normal, AB = BA (see
10.54). Because A = ( T + T ∗ )/2 and B = ( T − T ∗ )/(2i ), the operators A and B
are both compact.
If α ∈ R and f ∈ null( A − αI ), then
( A − αI )( B f ) = A( B f ) − αB f = B( A f ) − αB f = B ( A − αI ) f = B(0) = 0
and thus B f ∈ null( A − αI ). Hence null( A − αI ) is an invariant subspace for B.
Applying the Spectral Theorem for self-adjoint compact operators [10.104(a)] to
B|null( A−αI ) shows that for each eigenvalue α of A, there is an orthonormal basis of
null( A − αI ) consisting of eigenvectors of B. The union (over all eigenvalues α of
A) of all these orthonormal bases is an orthonormal family in V (use the set itself as
the index set) because eigenvectors of A corresponding to distinct eigenvalues of A
are orthogonal (see 10.57). The Spectral Theorem for self-adjoint compact operators
[10.104(a)] as applied to A tells us that the closure of the span of this orthonormal
family is V. Hence we have an orthonormal basis of V each of whose elements is an
eigenvector of A and an eigenvector of B.
If f ∈ V is an eigenvector of both A and B, then there exist α, β ∈ R such
that A f = α f and B f = β f . Thus T f = ( A + iB)( f ) = (α + βi ) f ; hence f is
an eigenvector of $T$. Thus the orthonormal basis of $V$ constructed in the previous
paragraph is an orthonormal basis consisting of eigenvectors of $T$, completing the
proof.
The following example shows the power of the Spectral Theorem for normal
compact operators. Finding the eigenvalues and eigenvectors of the normal compact
operator V − V ∗ in the next example leads us to an orthonormal basis of L2 ([0, 1]).
Easy calculus shows that the family {ek }k∈Z , where ek is defined as in 10.110, is
an orthonormal family in L2 ([0, 1]). The hard part of showing that {ek }k∈Z is an
orthonormal basis of L2 ([0, 1]) is to show that the closure of the span of this family
is L2 ([0, 1]). However, the Spectral Theorem for normal compact operators (10.105)
provides this information with no further work required.
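10.112 $\qquad Tf = \sum_{k\in\Omega} s_k \langle f, e_k \rangle h_k$
for every $f \in V$.
$\alpha \|f\|^2 = \langle \alpha f, f \rangle = \langle T^*Tf, f \rangle = \langle Tf, Tf \rangle = \|Tf\|^2.$
10.113 $\qquad (T^*T) f = \sum_{k\in\Omega} s_k^{\,2} \langle f, e_k \rangle e_k$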
Taking square roots of the positive eigenvalues of T ∗ T and then adjoining an infinite
string of 0’s shows that the singular values of T are 3 ≥ 3 ≥ 2 ≥ 0 ≥ 0 ≥ · · · .
Note that −3 and 0 are the only eigenvalues of T. Thus in this case, the list of
eigenvalues of T did not pick up the number 2 that appears in the definition (and
hence the behavior) of T, but the list of singular values of T does include 2.
10.119 $\qquad \mathcal{I}_K(f) = \sum_{k\in\Omega} s_k \langle f, e_k \rangle h_k$
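Let $X$ denote the set on which the measure $\mu$ lives. For $j \in \Gamma$ and $k \in \Gamma'$, define
$g_{j,k} \colon X \times X \to \mathbf{F}$ by
$g_{j,k}(x, y) = \overline{e_j(y)}\, h_k(x).$
Then $\{g_{j,k}\}_{j\in\Gamma,\, k\in\Gamma'}$ is an orthonormal basis of $L^2(\mu \times \mu)$, as you should verify. Thus
$\|K\|^2_{L^2(\mu\times\mu)} = \sum_{j\in\Gamma,\, k\in\Gamma'} \bigl| \langle K, g_{j,k} \rangle \bigr|^2$
$\qquad = \sum_{j\in\Gamma,\, k\in\Gamma'} \Bigl| \int\!\!\int K(x, y)\, e_j(y)\, \overline{h_k(x)} \, d\mu(y) \, d\mu(x) \Bigr|^2$
$\qquad = \sum_{j\in\Gamma,\, k\in\Gamma'} \Bigl| \int (\mathcal{I}_K e_j)(x)\, \overline{h_k(x)} \, d\mu(x) \Bigr|^2$
10.120 $\qquad = \sum_{j\in\Omega,\, k\in\Gamma'} \Bigl| s_j \int h_j(x)\, \overline{h_k(x)} \, d\mu(x) \Bigr|^2$
10.121 $\qquad = \sum_{j\in\Omega} s_j^{\,2}$
$\qquad = \sum_{n=1}^{\infty} \bigl( s_n(\mathcal{I}_K) \bigr)^2,$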
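10.122 Example $\quad \dfrac{1}{1^2} + \dfrac{1}{3^2} + \dfrac{1}{5^2} + \cdots = \dfrac{\pi^2}{8}$
Define $K \colon [0, 1] \times [0, 1] \to \mathbf{R}$ by
$K(x, y) = \begin{cases} 1 & \text{if } x > y, \\ 0 & \text{if } x = y, \\ -1 & \text{if } x < y. \end{cases}$
Letting $\mu$ be Lebesgue measure on $[0, 1]$, we note that $\mathcal{I}_K$ is the normal compact
operator $V - V^*$ examined in Example 10.116.
Clearly $\|K\|_{L^2(\mu\times\mu)} = 1$. Using the list of singular values for $\mathcal{I}_K$ obtained in
Example 10.116, the formula in 10.118 tells us that
$1 = 2 \sum_{k=0}^{\infty} \dfrac{4}{(2k+1)^2 \pi^2}.$
Thus
$\dfrac{1}{1^2} + \dfrac{1}{3^2} + \dfrac{1}{5^2} + \cdots = \dfrac{\pi^2}{8}.$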
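As a numerical sanity check (not a proof), the partial sums of the series can be compared with $\pi^2/8$. The sketch below uses only the Python standard library; the function name `partial_sum` is hypothetical.

```python
import math

def partial_sum(terms):
    """Sum of 1/(2k+1)^2 for k = 0, ..., terms - 1."""
    return sum(1.0 / (2 * k + 1) ** 2 for k in range(terms))

target = math.pi ** 2 / 8

# The partial sums increase toward pi^2 / 8; the gap is roughly 1/(4 * terms).
for terms in (10, 1000, 100000):
    s = partial_sum(terms)
    print(terms, s, target - s)

# Equivalent form of the identity used above: 1 = 2 * sum of 4 / ((2k+1)^2 pi^2).
hilbert_schmidt_check = 2 * 4 * partial_sum(100000) / math.pi ** 2
print(hilbert_schmidt_check)
```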
EXERCISES 10D
1 Prove that if $T$ is a compact operator on a nonzero Hilbert space, then $\|T\|^2$ is
an eigenvalue of $T^*T$.
2 Prove that if T is a self-adjoint operator on a nonzero Hilbert space V, then
for all f ∈ V.
14 Suppose that $T$ is an operator on a finite-dimensional Hilbert space $V$ with
$\dim V = n$.
(a) Prove that $T$ is invertible if and only if $s_n(T) \ne 0$.
(b) Suppose $T$ is invertible and $T$ has a singular value decomposition
$Tf = s_1(T) \langle f, e_1 \rangle h_1 + \cdots + s_n(T) \langle f, e_n \rangle h_n$
for all $f \in V$. Show that
$T^{-1} f = \dfrac{\langle f, h_1 \rangle}{s_1(T)} e_1 + \cdots + \dfrac{\langle f, h_n \rangle}{s_n(T)} e_n$
for all $f \in V$.
Appendix
The set Q of rational numbers, with the usual operations of addition and multi-
plication, is a field. As another example, the set {0, 1}, with the usual operations of
addition and multiplication except that 1 + 1 is defined to be 0, is a field.
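Because the set $\{0, 1\}$ is finite, the field properties can be checked by brute force. The sketch below is an illustration only: the function `is_field` and the particular list of properties it checks (closure, commutativity, associativity, identities, inverses, distributivity) are assumptions modeled on the usual definition of a field.

```python
from itertools import product

F2 = (0, 1)

def add(a, b):
    # Usual addition, except that 1 + 1 is defined to be 0.
    return (a + b) % 2

def mul(a, b):
    # Usual multiplication.
    return a * b

def is_field(elems, add, mul, zero, one):
    """Brute-force check of the field properties on a finite set."""
    pairs = list(product(elems, repeat=2))
    triples = list(product(elems, repeat=3))
    return all([
        zero != one,
        # closure
        all(add(a, b) in elems and mul(a, b) in elems for a, b in pairs),
        # commutativity
        all(add(a, b) == add(b, a) and mul(a, b) == mul(b, a) for a, b in pairs),
        # associativity
        all(add(add(a, b), c) == add(a, add(b, c)) for a, b, c in triples),
        all(mul(mul(a, b), c) == mul(a, mul(b, c)) for a, b, c in triples),
        # identities
        all(add(a, zero) == a and mul(a, one) == a for a in elems),
        # additive inverses, and multiplicative inverses of nonzero elements
        all(any(add(a, b) == zero for b in elems) for a in elems),
        all(any(mul(a, b) == one for b in elems) for a in elems if a != zero),
        # distributive property
        all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c)) for a, b, c in triples),
    ])

print(is_field(F2, add, mul, 0, 1))                  # True
print(is_field(F2, lambda a, b: a + b, mul, 0, 1))   # False: 1 + 1 = 2 leaves the set
```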
The familiar properties of arithmetic all follow from the field properties listed
above. For example, here are a few properties of the additive inverse.
You should be able to write down analogous properties for the multiplicative
inverse in a field. With the exception of the proof below, proofs of the easy and
familiar field properties are not provided here because we need to get to other topics.
Because 0 is the additive identity in a field, the result below connects addition and
multiplication. The only field property that connects addition and multiplication is
the distributive property. Thus the proof of the result below must use the distributive
property. With that hint, the main idea of the proof (writing 0 as 0 + 0 and then using
the distributive property) becomes easier to discover.
0.3 a0 = 0
a0 = a(0 + 0)
= a0 + a0.
Now add −( a0) to each side of the equation above, obtaining the equation 0 = a0,
as desired.
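The final step ("add $-(a0)$ to each side") can be written out in full. The following is a sketch in which each equality uses only associativity of addition, the definition of the additive inverse, and the additive identity:

```latex
\begin{align*}
0 &= a0 + \bigl(-(a0)\bigr) \\
  &= (a0 + a0) + \bigl(-(a0)\bigr) && \text{because } a0 = a0 + a0 \\
  &= a0 + \bigl(a0 + (-(a0))\bigr) && \text{associativity} \\
  &= a0 + 0 \\
  &= a0 .
\end{align*}
```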
Subtraction and division are defined in a field using the appropriate inverse, as
follows.
a − b = a + (−b).
• If $b \ne 0$, then the quotient $\frac{a}{b}$ (which is also denoted by $a/b$ and by $a \div b$) is
defined by the equation
$\frac{a}{b} = ab^{-1}.$
Ordered Fields
The usual arithmetic properties that we expect of the real numbers follow from
the definition of a field. However, more structure is needed to generate the order
properties that we expect of the real numbers. The easiest way to get at order
properties in a field comes from designating a subset to be thought of as the positive
numbers. Then the ordering a < b can be defined to mean that b − a is positive.
As motivation for the following definition, think of the properties we expect of the
positive numbers as a subset of the real numbers: every real number is either positive
or 0 or its additive inverse is positive; a real number and its additive inverse cannot
both be positive; the sum and product of two positive numbers are positive.
In the following definition, the symbol P should remind you of the positive
numbers.
An ordered field is a field F along with a subset P of F, called the positive subset,
with the following properties:
• if $a \in \mathbf{F}$, then $a \in P$ or $a = 0$ or $-a \in P$;
• if $a \in P$, then $-a \notin P$;
• if $a, b \in P$, then $a + b \in P$ and $ab \in P$.
For example, the field Q of rational numbers, with the usual operations of addition
and multiplication and with P denoting the usual set of positive rational numbers, is
an ordered field.
Because you are already familiar with the properties of the positive numbers, the
statements and easy proofs of results concerning the positive subset are left to you as
exercises. The following result and its proof give an example of how to work with
the definition of an ordered field.
(a) $1 \in P$;
(b) $a^{-1} \in P$ for every $a \in P$.
Proof To prove (a), note that the definition of an ordered field implies that either
1 ∈ P or −1 ∈ P. If 1 ∈ P, then we are done. If −1 ∈ P, then 1 = (−1)(−1) ∈ P
(because P is closed under multiplication). Either way, we conclude that 1 ∈ P.
To prove (b), suppose $a \in P$. If we had $-a^{-1} \in P$, then we would have
$-1 = (-a^{-1}) a \in P$ (because $P$ is closed under multiplication), which contradicts
the second bullet point (because $1 \in P$ by part (a)). Thus $a^{-1} \in P$, as desired.
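For the ordered field $\mathbf{Q}$ with its usual positive subset, the three defining properties and the two conclusions just proved can be spot-checked on a finite sample. The sketch below is only an illustration: the sample of rationals and the helper `in_P` are hypothetical, and a finite check is of course not a proof.

```python
from fractions import Fraction
from itertools import product

def in_P(a):
    """Membership in the usual positive subset of Q."""
    return a > 0

# A small, arbitrary sample of rational numbers.
samples = [Fraction(n, d) for n in range(-4, 5) for d in (1, 2, 3)]

# The three bullet points in the definition of an ordered field.
trichotomy = all(in_P(a) or a == 0 or in_P(-a) for a in samples)
exclusive = all(not in_P(-a) for a in samples if in_P(a))
closed = all(in_P(a + b) and in_P(a * b)
             for a, b in product(samples, repeat=2)
             if in_P(a) and in_P(b))

# The two conclusions of the result above.
one_positive = in_P(Fraction(1))
inverses_positive = all(in_P(1 / a) for a in samples if in_P(a))

print(trichotomy, exclusive, closed, one_positive, inverses_positive)
```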
Now we use the positive subset of an ordered field to define the order relations.
0.8 Transitivity
c − a = (c − b) + (b − a) ∈ P.
The observation that b ≤ |b| and −b ≤ |b| for every b in an ordered field F
provides the key to the proof of our next result.
0.10 | a + b| ≤ | a| + |b|
| a + b | ≤ | a | + | b |.
| a + b | = a + b ≤ | a | + | b |.
| a + b| = −( a + b) = (− a) + (−b) ≤ | a| + |b|,
The next result allows us to think of Q, the ordered field of rational numbers
with the usual operations of addition and multiplication and the usual ordering, as
contained in each ordered field.
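$\varphi\bigl(\tfrac{m}{n}\bigr) = (\underbrace{1 + \cdots + 1}_{m \text{ times}})(\underbrace{1 + \cdots + 1}_{n \text{ times}})^{-1}$
and let
$\varphi\bigl(-\tfrac{m}{n}\bigr) = \bigl(\underbrace{(-1) + \cdots + (-1)}_{m \text{ times}}\bigr)(\underbrace{1 + \cdots + 1}_{n \text{ times}})^{-1},$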
The properties of an ordered field imply that $\underbrace{1 + \cdots + 1}_{n \text{ times}} > 0$. In particular, $\underbrace{1 + \cdots + 1}_{n \text{ times}} \ne 0$, and thus the multiplicative inverses above make sense in $\mathbf{F}$.
Proof To show that $\varphi$ is a one-to-one function, suppose first that
$\varphi\bigl(\tfrac{m}{n}\bigr) = \varphi\bigl(\tfrac{p}{q}\bigr),$
where $m, n, p, q$ are positive integers and both fractions are in reduced form. The
equality above implies that
$(\underbrace{1 + \cdots + 1}_{m \text{ times}})(\underbrace{1 + \cdots + 1}_{q \text{ times}}) = (\underbrace{1 + \cdots + 1}_{p \text{ times}})(\underbrace{1 + \cdots + 1}_{n \text{ times}}).$
Repeated applications of the distributive property show that both sides of the equation
above are a sum, with 1 appearing $mq$ times on the left side and $pn$ times on the right
side. Thus
$mq = pn$
(because otherwise, after adding the additive inverse of the shorter side to both sides
of the equation above, we would have a sum of 1's equaling 0, which would violate
the order properties). Thus $\frac{m}{n} = \frac{p}{q}$, which shows that the restriction of $\varphi$ to the
positive rational numbers is a one-to-one function.
Using the ideas of the paragraph above, the reader should be able to show that ϕ
is a one-to-one function on all of Q. The reader should also be able to verify that ϕ
preserves all the ordered field properties. In other words, $\varphi(a + b) = \varphi(a) + \varphi(b)$,
$\varphi(ab) = \varphi(a)\varphi(b)$, $\varphi(-a) = -\varphi(a)$, $\varphi(a^{-1}) = \varphi(a)^{-1}$, and $\varphi(a) > 0$ if and
only if $a > 0$ for all $a, b \in \mathbf{Q}$ (with $a \ne 0$ for the multiplicative inverse condition).
The result above means that we can identify a ∈ Q with ϕ( a) ∈ F. Thus from
now on we think of Q as a subset of each ordered field.
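The construction of $\varphi$ uses only the element 1 of the target field together with addition, additive inverses, and multiplicative inverses. The sketch below mimics that construction; as a stand-in for an arbitrary ordered field it uses Python's `Fraction` type (an assumption made purely for illustration), and the names `repeated_sum` and `phi`, as well as the convention that 0 is sent to 0, are hypothetical.

```python
from fractions import Fraction

def repeated_sum(x, times):
    """x + ... + x with `times` summands (times >= 1), using only addition."""
    total = x
    for _ in range(times - 1):
        total = total + x
    return total

def phi(m, n, one):
    """Image of the rational m/n (with n > 0) in a field whose multiplicative
    identity is `one`, built from sums of 1's as in the definition above."""
    if m == 0:
        return one - one  # assumption: 0 is sent to the additive identity
    numerator = repeated_sum(one if m > 0 else -one, abs(m))
    denominator = repeated_sum(one, n)
    return numerator / denominator

one = Fraction(1)  # stand-in for the element 1 of an ordered field

print(phi(3, 4, one))                                    # 3/4
print(phi(2, 4, one) == phi(1, 2, one))                  # True: well defined
print(phi(1, 2, one) + phi(1, 3, one) == phi(5, 6, one)) # True: preserves sums
print(phi(2, 3, one) * phi(3, 5, one) == phi(2, 5, one)) # True: preserves products
print(phi(-1, 2, one) < 0 < phi(1, 2, one))              # True: preserves order
```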
Completeness
The Pythagorean Theorem implies that a right triangle with two legs of length 1 has a
hypotenuse whose length $c$ satisfies the equation $c^2 = 2$. The ancient Greeks discovered
that the rational numbers are not rich enough to have such a number, as shown by the
next result.
[Figure: An isosceles right triangle, with $c^2 = 2$.]
Intuitively, we expect that the length of any line segment (including the hypotenuse
of a right triangle with two legs of length 1) should be a real number. Thus the result
above tells us that there should be a real number $\sqrt{2}$ that is not rational. The rational
numbers $\mathbf{Q}$, with the usual operations of addition and multiplication, form an ordered
field. Hence we see that we need more than the properties of an ordered field to
describe our notion of the real numbers.
Experience has shown that the best way to capture the expected properties of the
real numbers comes through consideration of upper bounds.
In other words, a least upper bound of a set is an upper bound that is less than or
equal to all the other upper bounds of the set.
If b and d are both least upper bounds of a subset A of an ordered field F, then
b ≤ d and d ≤ b (by the second bullet point above), and hence b = d. In other
words, a least upper bound of a set, if it exists, is unique.
then 3 is the least upper bound of A₁ and 3 is the least upper bound of A₂. Note that
the least upper bound 3 of A₁ is an element of A₁ but the least upper bound 3 of A₂
is not an element of A₂.
The next example shows that a nonempty subset of an ordered field can have an
upper bound without having a least upper bound.
\[ \delta = \frac{2 - b^2}{5}. \]
Because b < 2 and 0 < δ < 1, we have 2b + δ < 5 and
The last example indicates that the absence of a least upper bound prevents the field
of rational numbers from having a square root of 2, motivating the next definition.
Example 0.18 shows that Q is not a complete ordered field. Our intuitive notion
of the real line as having no holes indicates that the field of real numbers should be a
complete ordered field. Thus we want to add completeness to the list of properties
that the field of real numbers should possess.
Experience shows that no additional properties beyond being a complete ordered
field are needed to prove all known properties of the real numbers. Thus we define
the real numbers (which we have not previously defined) to be a complete ordered
field.
Here is the formal definition, which is followed by a discussion of philosophical
issues that this definition raises.
The definition of R that we have just given raises the following three philosophical
issues:
EXERCISES A
1 Prove 0.2.
2 State and prove properties for the multiplicative inverse in a field that are
analogous to the properties in 0.2.
3 Suppose F is a field and a, b ∈ F. Prove that (− a)(−b) = ab.
4 Suppose F is a field and a, b, c ∈ F, with b ≠ 0 and c ≠ 0. Prove that
\[ \frac{ac}{bc} = \frac{a}{b}. \]
5 Suppose F is a field and a, b, c, d ∈ F, with b ≠ 0 and d ≠ 0. Prove that
\[ \frac{a}{b} - \frac{c}{d} = \frac{ad - bc}{bd}. \]
6 Suppose F is a field and a, b, c, d ∈ F, with b ≠ 0, c ≠ 0, and d ≠ 0. Prove that
\[ \frac{a}{b} \div \frac{c}{d} = \frac{ad}{bc}. \]
7 Suppose F is an ordered field and a, b, c, d ∈ F. Prove that if a < b and c ≤ d,
then a + c < b + d.
8 Suppose F is an ordered field and a, b, c, d ∈ F. Prove that if 0 ≤ a < b and
0 < c ≤ d, then ac < bd.
9 Suppose F is an ordered field and a, b ∈ F. Prove that if a < b and ab > 0,
then a⁻¹ > b⁻¹.
10 Prove that if a and b are elements of an ordered field, then | ab| = | a||b|.
11 Prove that if a and b are elements of an ordered field, then | a| − |b| ≤ | a − b|.
12 Prove that every ordered field has at most one positive element whose square
equals 2 (where 2 is defined to be 1 + 1).
13 Suppose F is an ordered field. Prove that there does not exist i ∈ F such
that i² = −1. (Thus the set of complex numbers, with its usual operation of
multiplication, cannot be made into an ordered field.)
14 Suppose F is the field of rational functions with coefficients in R. This means
that an element of F has the form p/q, where p and q are polynomials with real
coefficients and q is not the 0 polynomial. Rational functions p/q and r/s are
declared to be equal if ps = rq, and addition and multiplication are defined in F
as you would naturally assume.
(a) Let P denote the subset of F consisting of rational functions that can be written
in the form p/q, where the highest order terms of p and q both have positive
coefficients. Show that F is an ordered field with this definition of P.
(b) Show that F, with P defined as above, is not a complete ordered field.
Section B Construction of the Real Numbers: Dedekind Cuts
C + D = {c + d : c ∈ C, d ∈ D }.
0̃ = { a ∈ Q : a < 0}.
The next result states that addition is well defined, that addition is commutative,
that addition is associative, that 0̃ is the additive identity, and that −D is the additive
inverse of D for each Dedekind cut D. These properties are part of what is needed to
make D into a field.
with similarly appropriate definitions for the other three cases concerning C and D.
For details, see the instruction before Exercise 4 in this section. That exercise ends
with the conclusion that D (with the operations of addition and multiplication as
defined) is a field.
Now that we have made D into a field, we want to make it into an ordered
field. Thus we must define the positive subset of D , which is done in the next
definition. To motivate this definition, think of the intuitive notion of a Dedekind cut
as corresponding to what should be its right endpoint.
Exercise 5 asks you to verify that the definition above satisfies the requirements
for the positive subset of a field (see 0.5). In other words, the definition above makes
D into an ordered field.
Now that D is an ordered field, the positive subset of D defines the meaning of
inequalities in the usual way (see 0.7). For example, C ≤ D means that D + (−C )
is positive or C = D. The next result shows that this ordering of D has a particularly
nice interpretation. Again, the proof is left as an exercise.
Now we are ready to prove the main point about what we have been doing with
Dedekind cuts. Specifically, we will prove that the ordered field D of Dedekind cuts
is complete. The clean, easy proof of this result should be attributed to the cleverness
of the definition of Dedekind cuts.
0.27 Completeness of D
EXERCISES B
1 Prove 0.24 (the addition properties on the set of Dedekind cuts).
2 Prove that a Dedekind cut D is positive if and only if 0 ∈ D.
3 Prove 0.26. In other words, show that if C and D are Dedekind cuts, then
C ≤ D if and only if C is a subset of D.
Section C Supremum and Infimum
Z+ ⊂ Z ⊂ Q ⊂ R,
Proof Suppose there does not exist a positive integer n such that t < n. This implies
that t is an upper bound of Z+. Because R is complete, this implies that Z+ has a
least upper bound, which we will call b.
Now b − 1 is not an upper bound of Z+ (because b is the least upper bound of
Z+ ). Thus there exists m ∈ Z+ such that b − 1 < m. Thus b < m + 1. Because
m + 1 ∈ Z+, this contradicts the property that b is an upper bound of Z+. This
contradiction completes the proof.
Suppose ε ∈ R and ε > 0. Then there is a positive integer n such that 1/n < ε.
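For a concrete ε, a positive integer n with 1/n < ε can be computed directly. The following is only a sketch under stated assumptions: it handles rational ε (so that the arithmetic is exact), the function name `reciprocal_below` is hypothetical, and the choice n = ⌊1/ε⌋ + 1 is one convenient choice rather than the argument used in the text.

```python
from fractions import Fraction
import math


def reciprocal_below(eps: Fraction) -> int:
    """Return a positive integer n with 1/n < eps, for a rational eps > 0."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    # floor(1/eps) + 1 is an integer strictly greater than 1/eps,
    # so its reciprocal is strictly less than eps.
    return math.floor(1 / eps) + 1


n = reciprocal_below(Fraction(3, 1000))
```

The existence of such an n for every real ε > 0 is what the Archimedean Property provides; the code merely exhibits one for a given rational ε.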
Suppose a, b ∈ R, with a < b. Then there exists a rational number c such that
a < c < b.
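One standard way to produce such a rational number (not necessarily the argument used in the text) is to choose a positive integer n with 1/n < b − a and then take the smallest integer m with m > na, so that a < m/n ≤ a + 1/n < b. The sketch below carries this out with exact arithmetic. As an assumption made for the sake of the example, a and b are passed as `Fraction` values (a floating-point number can be converted exactly with `Fraction(x)`); the function name `rational_between` is hypothetical.

```python
from fractions import Fraction
import math


def rational_between(a: Fraction, b: Fraction) -> Fraction:
    """Return a rational c with a < c < b, assuming a < b."""
    if not a < b:
        raise ValueError("need a < b")
    n = math.floor(1 / (b - a)) + 1   # positive integer with 1/n < b - a
    m = math.floor(n * a) + 1         # smallest integer with m > n * a
    return Fraction(m, n)             # a < m/n <= a + 1/n < b


c = rational_between(Fraction(1, 3), Fraction(1, 2))
```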
If a subset of R has a greatest lower bound, then the subset has a unique greatest
lower bound. The uniqueness follows from the same reasoning as for the uniqueness
of the least upper bound—see the comment after 0.17.
The completeness property of the real numbers tells us that every nonempty
subset of R with an upper bound has a least upper bound. The result below is the
corresponding statement for lower bounds. Note that the upper bound property is
part of the definition of R, while the lower bound property below is a theorem.
Every nonempty subset of R that has a lower bound has a greatest lower bound.
The terminology defined below has wide usage in many areas of mathematics.
The term supremum, which comes from the same Latin root as the word superior,
should help remind you that sup A is trying to be the largest number in A (if
sup A ∈ A, then sup A is the largest number in A). Similarly, the term infimum,
which comes from the same Latin root as the word inferior, should help remind you
that inf A is trying to be the smallest number in A (if inf A ∈ A, then inf A is the
smallest number in A).
• If A = {1 − 1/n : n ∈ Z+} = {0, 1/2, 2/3, 3/4, . . . }, then inf A = 0 and sup A = 1.
The symbols ∞ and −∞ that appear in the definitions of supremum and infimum
do not represent real numbers. The equation sup A = ∞ is simply an abbreviation for
the statement A does not have an upper bound. Similarly, the equation inf A = −∞
is an abbreviation for the statement A does not have a lower bound.
Irrational Numbers
The completeness property of the real numbers implies the existence of a real number
whose square is 2.
0.37 Existence of √2
Proof Let
\[ b = \sup\{a \in \mathbf{R} : a^2 < 2\}. \]
The set {a ∈ R : a² < 2} has an upper bound (for example, 2 is an upper bound)
and thus b as defined above is a real number.
If b² < 2, then we can find a number slightly bigger than b in {a ∈ R : a² < 2}
(see the second paragraph of Example 0.18 for this calculation), which contradicts
the property that b is an upper bound of {a ∈ R : a² < 2}.
If b² > 2, then we can find a number slightly smaller than b that is an upper bound
of {a ∈ R : a² < 2} (see the third paragraph of Example 0.18 for this calculation),
which contradicts the property that b is the least upper bound of {a ∈ R : a² < 2}.
The two previous paragraphs imply that b² = 2, as desired.
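The first of the two calculations cited in the proof can be sketched as follows. We assume (as the surviving fragment of Example 0.18 suggests, although the full example is not reproduced here) that the number slightly bigger than b is b + δ with δ = (2 − b²)/5. If b² < 2, then b < 2 and 0 < δ < 1, so 2b + δ < 5 and

```latex
\[
(b + \delta)^2 = b^2 + (2b + \delta)\delta < b^2 + 5\delta = b^2 + (2 - b^2) = 2 .
\]
```

Thus b + δ is an element of {a ∈ R : a² < 2} that is larger than b, which is the contradiction used in the case b² < 2.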
Suppose a, b ∈ R, with a < b. Then there exists an irrational number c such that
a < c < b.
Intervals
We will find it useful sometimes to consider a set (not a field) consisting of R and
two additional elements called ∞ and −∞. We define an ordering on R ∪ {∞, −∞}
to behave exactly as you expect from the names of the two additional symbols.
If a > b, then all four of the sets listed above are the empty set. If a = b, then
[ a, b] is the set { a} containing only one element and the other three sets listed above
are the empty set.
The definition above implies that (−∞, ∞) equals R and that (0, ∞) is the set
of positive numbers. Also note that [−∞, ∞] = R ∪ {∞, −∞} and that [0, ∞] =
[0, ∞) ∪ {∞}; thus neither [−∞, ∞] nor [0, ∞] is a subset of R.
The next result gives a complete description of all intervals of [−∞, ∞].
Suppose I ⊂ [−∞, ∞] is an interval. Then I is one of the following sets for some
a, b ∈ [−∞, ∞]:
( a, b), [ a, b], ( a, b], [ a, b).
EXERCISES C
1 Suppose b ∈ R and |b| < 1/n for every positive integer n. Prove that b = 0.
3 Explain why it makes no sense to inquire about whether your current height as
measured in meters is a rational number or an irrational number.
A + B = { a + b : a ∈ A, b ∈ B}.
Define arithmetic with ∞ and −∞ as you would expect. For example, s + ∞ = ∞
for all s ∈ (−∞, ∞] and −∞ + t = −∞ for all t ∈ [−∞, ∞). Note, however,
that ∞ + (−∞) should remain undefined.
6 Prove that if A and B are nonempty subsets of R, then
\[ \sup(A + B) = \sup A + \sup B \]
and
\[ \inf(A + B) = \inf A + \inf B. \]
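Before attempting a proof, it can help to check the infimum identity in Exercise 6, and its supremum counterpart, on small finite sets, where the supremum is the maximum and the infimum is the minimum. The sketch below is a sanity check only, not a proof; the sets A and B and the helper name `sumset` are illustrative choices.

```python
from itertools import product


def sumset(A, B):
    """Return A + B = {a + b : a in A, b in B} for finite sets of numbers."""
    return {a + b for a, b in product(A, B)}


A = {-3, 0, 5}
B = {1, 2, 10}
S = sumset(A, B)

# For nonempty finite sets, sup is max and inf is min.
sup_matches = max(S) == max(A) + max(B)
inf_matches = min(S) == min(A) + min(B)
```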
(b) Give an example to show that the inequality above can be a strict inequality.
(b) Give an example to show that the inequality above can be a strict inequality.
9 Prove that the ordered field of rational functions with coefficients in R (see
Exercise 14 in Section A for the definition of this ordered field) does not satisfy
the Archimedean Property.
Section D Open and Closed Subsets of Rⁿ
0.44 Definition Rⁿ
\[ \mathbf{R}^n = \{(x_1, \dots, x_n) : x_1, \dots, x_n \in \mathbf{R}\}. \]
For (x₁, . . . , xₙ) ∈ Rⁿ, let
\[ \|(x_1, \dots, x_n)\| = \sqrt{x_1^2 + \cdots + x_n^2} \]
and
\[ \|(x_1, \dots, x_n)\|_\infty = \max\{|x_1|, \dots, |x_n|\}. \]
The reason for using the subscript ∞ here will become clear when we get to
Lᵖ-spaces in Chapter 7. For now, note that the triangle inequality
\[ \|x + y\|_\infty \le \|x\|_\infty + \|y\|_\infty \]
is an easy consequence of the definition of ‖·‖∞. The triangle inequality also holds
for ‖·‖ but its proof when n ≥ 3 is far from obvious. Later we will see two nice
proofs (7.14 with p = 2 and 8.15) of the triangle inequality for ‖·‖. Meanwhile, in
this Appendix we will use ‖·‖∞ for simpler proofs. Note that if n = 1, then ‖·‖ and
‖·‖∞ both equal the absolute value |·|.
Now we are ready to define what it means for a sequence of elements of Rn to
have a limit. The intuition concerning limits is that if we go far enough out in a
sequence, then all the terms beyond that will be as close as we wish to the limit. You
should have seen limits in previous courses. Thus some key properties of limits are
left as exercises.
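\[ \|a_k - L\|_\infty < \varepsilon \]
Because
\[ \|x\|_\infty \le \|x\| \le \sqrt{n}\,\|x\|_\infty \quad \text{for all } x \in \mathbf{R}^n, \]
we see that
\[ \lim_{k\to\infty} a_k = L \quad \text{if and only if} \quad \lim_{k\to\infty} \|a_k - L\| = 0. \]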
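The comparison between the two norms is easy to check numerically for a particular vector. The sketch below is illustrative only: the function names `norm2` and `norm_inf` and the sample vector are assumptions of this example, and floating-point arithmetic is used, so it checks the inequalities only up to rounding.

```python
import math


def norm2(x):
    """Euclidean norm: square root of the sum of squared coordinates."""
    return math.sqrt(sum(t * t for t in x))


def norm_inf(x):
    """Max norm: largest absolute value of a coordinate."""
    return max(abs(t) for t in x)


x = (3.0, -4.0, 12.0)
n = len(x)

# The chain of inequalities displayed in the text, for this particular x.
chain_holds = norm_inf(x) <= norm2(x) <= math.sqrt(n) * norm_inf(x)
```

Because each norm is bounded by a constant multiple of the other, a sequence is close to L in one norm exactly when it is close in the other, which is why either norm can be used to define limits.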
We will need the following useful terminology.
The next result states that a sequence of elements of Rn converges if and only if it
converges coordinatewise. Thus questions about convergence of sequences in Rn can
often be reduced to questions about convergence of sequences in R. The proof of this
next result is left to the reader.
\[ (a_{k,1}, \dots, a_{k,n}) = a_k, \]
\[ \lim_{k\to\infty} a_{k,j} = L_j \]
You should show that each sequence in Rⁿ has at most one limit. Thus the phrase
“a limit” in 0.46 can be replaced by “the limit”.
Open Subsets of Rn
If n = 3, then the open cube B(x, δ) defined below is the usual cube in R³ centered
at x with sides of length 2δ.
\[ B(x, \delta) = \{y \in \mathbf{R}^n : \|y - x\|_\infty < \delta\}. \]
As a test that you are comfortable with these concepts, be sure that you can verify
the following implication:
Make sure you take the time to understand why the definitions given by the two bullet
points above are equivalent (you will need to use 0.50).
Open sets could have been defined using the open balls {y ∈ Rⁿ : ‖y − x‖ < δ}
instead of the open cubes B(x, δ). These two possible approaches are equivalent
because every open cube contains an open ball with the same center, and every open
ball contains an open cube with the same center. Specifically, if x ∈ Rⁿ and δ > 0
then
\[ \{y \in \mathbf{R}^n : \|y - x\| < \delta\} \subset B(x, \delta) \subset \{y \in \mathbf{R}^n : \|y - x\| < \sqrt{n}\,\delta\}, \]
The union $\bigcup_{k=1}^{\infty} E_k$ of a sequence E₁, E₂, . . . of subsets of a set S is the set of
elements of S that are in at least one of the Eₖ. The intersection $\bigcap_{k=1}^{\infty} E_k$ is the set of
elements of S that are in all the Eₖ.
More generally, we can consider unions and intersections that are not indexed by
the positive integers.
• The intersection of the collection A, denoted $\bigcap_{E \in \mathcal{A}} E$, is defined by
\[ \bigcap_{E \in \mathcal{A}} E = \{x \in S : x \in E \text{ for every } E \in \mathcal{A}\}. \]
(a) The union of every collection of open subsets of Rn is an open subset of Rn.
δ = min{δ1 , . . . , δm }.
The next definition helps us distinguish some sets as having the same number of
elements as Z+.
The following two points follow easily from the definition of a countable set.
• Every finite set is countable. This holds because if C = {c1 , c2 , . . . , cn }, then
C = {c1 , c2 , . . . , cn , cn , cn , . . . } (repetitions do not matter for sets).
• If C is an infinite countable set, then C can be written in the form {b1 , b2 , . . .}
where b1 , b2 , . . . are all distinct. This holds because we can delete any terms in
the sequence c1 , c2 , . . . that appear earlier in the sequence.
We will use the next result to prove our description of open subsets of R (0.59).
0.57 Q is countable
Proof At step 1, start with the list −1, 0, 1. At step n, adjoin to the list in increasing
order the rational numbers in the interval [−n, n] that can be written in the form m/n
for some integer m. Thus halfway through step 3, the list is as follows:
−1, 0, 1, −2, −3/2, −1, −1/2, 0, 1/2, 1, 3/2, 2, −3, −8/3, −7/3, −2, −5/3, −4/3, −1, −2/3, −1/3, 0.
Continue in this fashion to produce a sequence that contains each rational number,
completing the proof.
Deleting the entries in the list that already appear earlier in the list produces a
sequence that contains each rational number exactly once.
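The listing procedure in the proof of 0.57 can be written as a short program. The sketch below follows the steps described in the proof (step n contributes m/n for each integer m with −n² ≤ m ≤ n², which for n = 1 gives the starting list −1, 0, 1); the generator names `rationals_with_repeats` and `rationals_once` are hypothetical.

```python
from fractions import Fraction
from itertools import count, islice


def rationals_with_repeats():
    """Yield the list from the proof: at step n, the numbers m/n in [-n, n]
    in increasing order, i.e. m = -n*n, ..., n*n."""
    for n in count(1):
        for m in range(-n * n, n * n + 1):
            yield Fraction(m, n)


def rationals_once():
    """Yield the same list with every entry that appeared earlier deleted."""
    seen = set()
    for r in rationals_with_repeats():
        if r not in seen:
            seen.add(r)
            yield r


first_entries = list(islice(rationals_with_repeats(), 22))
first_distinct = list(islice(rationals_once(), 10))
```

A rational number m/n with n a positive integer satisfies |m/n| ≤ |m|, so it is listed by step max{n, |m|} at the latest; thus every rational number appears in the sequence.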
Later we will see that R is uncountable (see 2.16). Similarly, the set of irra-
tional numbers is uncountable. Thus there are more irrational numbers than rational
numbers.
The following terminology will be useful.
In the result below, some (perhaps infinitely many) of the open intervals may be
the empty set.
Proof One direction of this result is easy: The union of every sequence (disjoint or
not) of open intervals is open [by 0.55(a)].
To prove the other direction, suppose G is an open subset of R. For each t ∈ G,
let Gt be the union of all the open intervals contained in G that contain t. A moment’s
thought shows that Gt is the largest open interval contained in G that contains t.
If s, t ∈ G and Gs ∩ Gt ≠ ∅, then Gs = Gt (because otherwise Gs ∪ Gt would
be an open interval strictly larger than at least one of Gs and Gt and containing both
s and t). In other words, any two intervals in the collection of intervals { Gt : t ∈ G }
are either disjoint or equal to each other.
Because t ∈ Gt for each t ∈ G, we see that the union of the collection of intervals
{ Gt : t ∈ G } is G.
Let r1 , r2 , . . . be a sequence of rational numbers that includes every rational
number (such a sequence exists by 0.57). Define a sequence of open intervals
I1 , I2 , . . . as follows:
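\[ I_k = \begin{cases} \emptyset & \text{if } r_k \notin G, \\ \emptyset & \text{if } r_k \in I_j \text{ for some } j < k, \\ G_{r_k} & \text{if } r_k \in G \text{ and } r_k \notin I_j \text{ for all } j < k. \end{cases} \]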
If t ∈ G, then the open interval Gt contains a rational number (by 0.30) and thus
Gt = Ik for some positive integer k. Thus G is the union of the disjoint sequence of
open intervals I1 , I2 , . . . .
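For an open set given as a finite union of open intervals, the disjoint intervals in the conclusion can be computed by merging overlapping intervals. The sketch below is a finite illustration only, not the construction in the proof (which must handle arbitrary open sets); the function name `disjoint_components` and the representation of an open interval (a, b) as a pair of numbers are assumptions of this example. Each interval returned plays the role of Gt for the points t that it contains.

```python
def disjoint_components(intervals):
    """Merge finitely many open intervals (a, b) into disjoint open intervals
    with the same union. Empty intervals (a >= b) are ignored."""
    nonempty = sorted((a, b) for a, b in intervals if a < b)
    components = []
    for a, b in nonempty:
        # Two open intervals overlap only if the new left endpoint lies
        # strictly inside the current component. For example, (0, 1) and
        # (1, 2) stay separate because the point 1 belongs to neither.
        if components and a < components[-1][1]:
            left, right = components[-1]
            components[-1] = (left, max(right, b))
        else:
            components.append((a, b))
    return components


parts = disjoint_components([(0, 1), (1, 2), (0.5, 0.8), (3, 5), (4, 6), (7, 7)])
```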
Closed Subsets of Rn
• If S and A are sets, then the set difference S \ A is defined to be the set of
elements of S that are not in A. In other words, S \ A = {s ∈ S : s ∉ A}.
• If A ⊂ S, then S \ A is called the complement of A in S.
For example, the interval [1, 4] is a closed subset of R because its complement in
R is the open set (−∞, 1) ∪ (4, ∞).
Unlike doors, a subset of Rn need not be either open or closed. For example, the
interval (3, 7] is neither an open nor a closed subset of R.
Closed sets can be more complicated than open sets. There exist closed subsets of
R that are not the union of a sequence of intervals (for example, see 2.75).
The following characterization of closed sets will frequently be useful.
\[ B(L, \delta) \not\subset \mathbf{R}^n \setminus A \]
for every δ > 0. Hence Rⁿ \ A is not an open subset of Rⁿ. Thus A is not a closed
subset of Rⁿ, completing the proof in one direction.
To prove the other direction, now suppose that A is a subset of Rⁿ that is not
closed. Thus Rⁿ \ A is not open. Hence there exists L ∈ Rⁿ \ A such that
\[ B(L, \tfrac{1}{k}) \not\subset \mathbf{R}^n \setminus A \]
for every k ∈ Z+. Thus for each k ∈ Z+, there exists aₖ ∈ A such that
\[ \|L - a_k\|_\infty < \tfrac{1}{k}. \]
The inequality above implies that the sequence a1 , a2 , . . . of elements of A has limit
L. Thus there exists a convergent sequence of elements of A whose limit is not in A,
completing the proof in the other direction.
Proof This result follows immediately from De Morgan’s Laws (0.63), the definition
of a closed set, and 0.55.
The only subsets of Rn that are both open and closed are ∅ and Rn.
EXERCISES D
1 Suppose a1 , a2 , . . . and c1 , c2 , . . . are convergent sequences in Rn. Prove that
\[ \lim_{k\to\infty} \frac{a_k}{c_k} = \frac{\lim_{k\to\infty} a_k}{\lim_{k\to\infty} c_k}. \]
{a ∈ Rⁿ : ‖a − b‖∞ ≤ δ} and {a ∈ Rⁿ : ‖a − b‖ ≤ δ}
Section E Sequences and Continuity
To test that you are comfortable with this terminology, make sure that you can
show that a set A ⊂ R is bounded if and only if A has an upper bound and A has a
lower bound.
As you should verify, if a sequence of real numbers converges, then it is bounded.
The next result states that the converse is true for monotone sequences. Note the
crucial role that the completeness of the field of real numbers plays in the proof of
the next result.
The next result is called a characterization of closed bounded sets because the
converse, although less important, is also true; see Exercise 3 in this section.
In the first bullet point above, continuity is defined at an element of the domain.
The second bullet point above establishes the convention that simply calling a function
continuous means that the function is continuous at every element of its domain.
The next result allows us to think about continuity in terms of limits of sequences.
You probably saw the result above in a previous course. Thus the proof is left as
an exercise.
The concept of uniform continuity also evolved in the nineteenth century.
\[ \|f(a) - f(b)\|_\infty < \varepsilon \]
Clearly every uniformly continuous function is continuous, but the converse is not
true, as shown by the following example.
\[ f\bigl(n + \tfrac{1}{n}\bigr) - f(n) = 2 + \frac{1}{n^2} > 2 \]
for every n ∈ Z+. The inequality above implies that f is not uniformly continuous.
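The displayed identity can be confirmed with exact arithmetic. The definition of f is not reproduced in the text here, so the sketch below assumes f(x) = x², which is consistent with the identity (n + 1/n)² − n² = 2 + 1/n²; the helper name `gap` is hypothetical.

```python
from fractions import Fraction


def f(x):
    # Assumption: f(x) = x^2, chosen because it matches the displayed identity.
    return x * x


def gap(n: int) -> Fraction:
    """Return f(n + 1/n) - f(n) exactly."""
    n = Fraction(n)
    return f(n + 1 / n) - f(n)


gaps = [gap(n) for n in range(1, 6)]
```

The points n + 1/n and n are within 1/n of each other, yet the values of f at these points differ by more than 2. Hence no single δ > 0 can work for ε = 2 on all of the domain, which is exactly the failure of uniform continuity.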
The following remarkable result states that for functions whose domain is a closed
bounded subset of Rm, continuity implies uniform continuity. This result plays a
crucial role in showing that continuous functions are Riemann integrable (1.11).
Thus
\[ \lim_{j\to\infty} \bigl( g(a_{k_j}) - g(b_{k_j}) \bigr) = 0. \]
The equation above contradicts the inequality ‖g(aₖ) − g(bₖ)‖∞ ≥ ε, which holds
for all k ∈ Z+. This contradiction means that our assumption that g is not uniformly
continuous is false, completing the proof.
\[ \lim_{k\to\infty} g(a_k) = \sup\{g(x) : x \in F\}. \]
\[ f(A) = \{f(a) : a \in A\}. \]
In the next proof, the Bolzano–Weierstrass Theorem will again play a key role.
EXERCISES E
1 Prove that every convergent sequence of elements of Rn is bounded.
2 Prove that a sequence of elements of Rn converges if and only if every subse-
quence of the sequence converges.
3 Prove the converse of 0.74. Specifically, prove that if F is a subset of Rn with the
property that every sequence of elements of F has a subsequence that converges
to an element of F, then F is closed and bounded.
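4 Define f : R → R as follows:
\[ f(a) = \begin{cases} 0 & \text{if } a \text{ is irrational}, \\ \frac{1}{n} & \text{if } a \text{ is rational and } n \text{ is the smallest positive integer such that } a = \frac{m}{n} \text{ for some integer } m. \end{cases} \]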
23 Prove that every continuous real-valued function on each closed subset of R can
be extended to a continuous real-valued function on R. More precisely, prove
that if F is a closed subset of R and g : F → R is continuous, then there exists a
continuous function h : R → R such that g( x ) = h( x ) for all x ∈ F.
24 Prove or give a counterexample: If G is a bounded open subset of R and
h : G → R is continuous, then h( G ) is an open subset of R.
Photo Credits
• page 351: Painting by Paul Barbotti in 1853; public domain image from
Wikipedia
• page 357: The School of Athens (detail) by Raphael; public domain image from
Wikipedia
• page 367: Painting by Salvator Rosa (1615–1673); public domain image from
Wikipedia
• page 387: Photo by Matěj Bat’ha; Creative Commons Attribution-Share Alike
2.5 Generic license
Notation Index
K ∗ , 280 S \ A, 380
∑k∈Γ f k , 237 sn ( T ), 330
span{ek }k∈Γ , 172
L1 (µ), 93, 159 sp( T ), 292
L1 (R), 95 S ⊗ T , 115
ℓ²(Γ), 235 sup, 369
λ, 61
λn, 137 ‖T‖, 165
L(f, [a, b]), 4 T∗, 279
L(f, P), 72 T⁻¹, 285
L(f, P, [a, b]), 2 t + A, 16
ℓ(I), 14 tA, 23
ℓ∞, 175, 193
U⊥, 227
ℓᵖ, 193
U f , 131
L p ( E), 201
U ( f , [ a, b]), 4
L p (µ), 192
U ( f , P, [ a, b]), 2
L p (µ), 200
V 0 , 178
MF (S), 260 V , 284
Mh , 279 VC , 232
µ × ν, 125
‖(x1, . . . , xn)‖, 375
|ν|, 257 ‖(x1, . . . , xn)‖∞, 134, 375
‖ν‖, 261
∫_X ∫_Y f(x, y) dν(y) dµ(x), 124
νa , 269
null T, 170 |z|, 153
ν− , 267 z, 156
ν µ, 268 Z (µ), 200
ν ⊥ µ, 266
ν+ , 267
νs , 269
p0 , 194
Pr , 342
p( T ), 295
PU , 225
Q, 352
R, 360
Re z, 153
Rn , 134, 375
Index
TC , 307
Tonelli’s Theorem, 127, 129
Tonelli, Leonida, 114
total variation measure, 257
total variation norm, 261
translation of set, 16, 59
Triangle Inequality, 161, 217
Trinity College, Cambridge, 99
two-sided ideal, 311