Metric Spaces (Bath Lecture Notes)
1.2 Examples: $\mathbb{R}^n$ with $d(x, y) = |x - y|$, the Euclidean metric; any nonempty set $X$ with $d(x, y) = 1$ if $x \neq y$, the discrete metric; $\mathbb{Z}$ with the metric $d(n, m) = \frac{1}{k+1}$ if $p^k$ divides $(n - m)$ and $p^{k+1}$ does not ($p$ a prime), the $p$-adic metric.
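The three example metrics can be written out directly. The following is a minimal sketch, not part of the notes: the function names are hypothetical, and the convention $d(n, n) = 0$ for the $p$-adic metric is an assumption (it is forced by the metric axioms but not stated in the example). The brute-force helper only tests the triangle inequality on a finite sample, so it illustrates the axiom rather than proving it.

```python
from fractions import Fraction
from itertools import product
import math

def euclidean(x, y):
    """Euclidean metric on R^n for points given as equal-length tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def discrete(x, y):
    """Discrete metric: distinct points are at distance 1."""
    return 0 if x == y else 1

def p_adic(n, m, p):
    """The notes' p-adic metric on Z: 1/(k+1) where p^k exactly divides n - m."""
    if n == m:
        return Fraction(0)  # assumed convention, forced by the metric axioms
    diff, k = abs(n - m), 0
    while diff % p == 0:
        diff //= p
        k += 1
    return Fraction(1, k + 1)

def triangle_holds(d, points):
    """Brute-force check of d(x, z) <= d(x, y) + d(y, z) over a finite sample."""
    return all(d(x, z) <= d(x, y) + d(y, z)
               for x, y, z in product(points, repeat=3))
```

For instance, `p_adic(0, 8, 2)` is `Fraction(1, 4)` because $2^3$ divides 8 and $2^4$ does not. Exact fractions are used so that the comparison in `triangle_holds` is not affected by rounding.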
1.3 A norm on a real vector space $V$ (possibly infinite-dimensional) is a function $\|\cdot\| : V \to \mathbb{R}$ such that
a) $\|0\| = 0$;
b) $\|v\| > 0$ if $v \neq 0$;
c) $\|\lambda v\| = |\lambda| \|v\|$ if $\lambda \in \mathbb{R}$;
d) $\|v + w\| \leq \|v\| + \|w\|$.
1.4 Examples: the Euclidean norm $\|x\| = |x|$ on $\mathbb{R}^n$; the max-norm $\|x\| = \max_i |x_i|$ on $\mathbb{R}^n$; the max- or sup-norm $\|f\| = \max\{|f(t)| \mid t \in [0, 1]\}$ on $C[0, 1]$ (the set of continuous functions $f : [0, 1] \to \mathbb{R}$).
notice that
$$d(x, y) = \|x - y\| = \|(-1)(y - x)\| = |-1| \|y - x\| = \|y - x\| = d(y, x).$$
For 1.1(d),
$$d(x, z) = \|x - z\| = \|(x - y) + (y - z)\| \leq \|x - y\| + \|y - z\| = d(x, y) + d(y, z).$$
$$d(x, y) - d(y, z) \leq d(x, z) \quad \text{and} \quad d(y, z) - d(x, y) \leq d(x, z).$$
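1.7 Definition. If $Y \subseteq X$ then the subspace metric on $Y$ is just the restriction of $d$ to $Y \times Y$.
1.8 Definition. If $(X, d_X)$ and $(Y, d_Y)$ are metric spaces then the product metric on $X \times Y$ is $d_\infty$ given by
$$d_\infty((x, y), (x', y')) = \max(d_X(x, x'), d_Y(y, y')).$$
It is a metric on $X \times Y$. To check the triangle inequality (the other three conditions are immediate), note that $d_X(x, x'') \leq d_X(x, x') + d_X(x', x'') \leq d_\infty((x, y), (x', y')) + d_\infty((x', y'), (x'', y''))$, and similarly for $d_Y(y, y'')$; taking the maximum of the two left-hand sides gives the result.
We can also do this for any finite product $X = X_1 \times \cdots \times X_n$ of metric spaces, setting $d_\infty(x, y) = \max_i d_{X_i}(x_i, y_i)$.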
1.9 The distance between two nonempty subsets $A$ and $B$ of a metric space $X$ is defined to be $d(A, B) = \inf\{d(a, b) \mid a \in A, b \in B\}$. If $A = \{a\}$ is a set with just one point we write $d(a, B)$ (the distance from $a$ to $B$) instead of $d(\{a\}, B)$.
The distance function $d(A, B)$ on sets is not a metric itself: it can be zero even though $A \neq B$. In fact it can be zero even when $A \cap B = \emptyset$, for instance $A = (-1, 0)$ and $B = (0, 1)$ in $\mathbb{R}$.
1.10 A subset $A \subseteq X$ is said to be bounded if there exists $M \in \mathbb{R}$ such that $d(x, y) < M$ for all $x, y \in A$. The diameter of $A$ is defined to be $\sup\{d(x, y) \mid x, y \in A\}$. A bounded metric space is one for which $X$ itself is bounded.
Bounded intervals in $\mathbb{R}$ are bounded sets. A discrete metric space is bounded (take $M = 2$: every distance is at most 1, and the inequality in the definition is strict).
1.12 Every ball is bounded: in fact the diameter of $B_r(c)$ is at most $2r$. Small balls around $c$ are contained in bigger ones, that is, $B_r(c) \subseteq B_R(c)$ if $R > r$.
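2.1 A sequence in a metric space $X$ is a function $f : \mathbb{N} \to X$. More usually one expresses this by writing $f(n) = a_n$ and calling the sequence $(a_n)$.
2.2 A sequence $(a_n)$ in $X$ tends or converges to $a \in X$ if
$$\forall \varepsilon > 0 \ \exists N \ \forall n > N \quad d(a_n, a) < \varepsilon.$$
2.4 Proposition. In a metric space $X$, a sequence $(a_n)$ tends to $a$ if and only if $d(a_n, a) \to 0$ in $\mathbb{R}$.
Proof: $d(a_n, a) \to 0$ means $\forall \varepsilon > 0 \ \exists N \ \forall n > N \ |d(a_n, a)| < \varepsilon$. But $d(a_n, a) \geq 0$ so we can drop the modulus signs, and what is left is the definition in 2.2.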
2.5 A sequence $(a_n)$ is said to be bounded if and only if $\{a_n\}$ is a bounded subset of $X$, that is, there exists $M \in \mathbb{R}$ such that $d(a_n, a_m) < M$ for all $n, m \in \mathbb{N}$.
2.6 Theorem. Let $X$ be a metric space and let $(a_n)$, $(b_n)$ be sequences in $X$.
a) if $a_n \to a$ and $b_n \to b$ then $d(a_n, b_n) \to d(a, b)$;
b) the sequence $(a_n)$ has at most one limit;
c) if $(a_n)$ is convergent then it is bounded;
d) if $(a_{n_i})$ is a subsequence of $(a_n)$ and $a_n \to a$ then $a_{n_i} \to a$.
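Proof: (a) We calculate directly, using 1.6:
$$|d(a_n, b_n) - d(a, b)| \leq |d(a_n, b_n) - d(a, b_n)| + |d(a, b_n) - d(a, b)| \leq d(a_n, a) + d(b_n, b) \to 0.$$
(b) If $a_n \to a$ and $a_n \to a'$ then by (a) $0 = d(a_n, a_n) \to d(a, a')$, so $d(a, a') = 0$ and $a = a'$.
(c) Suppose $a_n \to a$. Choose $N$ such that $d(a_n, a) < 1$ for $n > N$, and put $M = \max\big(2, \ \max_{i,j \leq N} d(a_i, a_j), \ 1 + \max_{i \leq N} d(a_i, a)\big)$.
I claim that for any $m, n \in \mathbb{N}$, $d(a_m, a_n) \leq M$. If $m$ and $n$ are both less than or equal to $N$ then $d(a_m, a_n) \leq \max_{i,j \leq N} d(a_i, a_j) \leq M$. If $m \leq N < n$ then $d(a_m, a_n) \leq d(a_m, a) + d(a, a_n) \leq 1 + \max_{i \leq N} d(a_i, a) \leq M$. If $m, n > N$ then $d(a_m, a_n) \leq d(a_m, a) + d(a, a_n) < 2 \leq M$. So $(a_n)$ is bounded (with bound $M + 1$ in the sense of 1.10).
(d) We must show that given $\varepsilon > 0$ there is an $N$ such that if $j > N$ then $d(a_{n_j}, a) < \varepsilon$. For this there is an $N'$ such that if $n > N'$ then $d(a_n, a) < \varepsilon$: but then if $j > N'$ we have $n_j \geq j > N'$ so $d(a_{n_j}, a) < \varepsilon$. So all we have to do is take $N = N'$.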
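2.7 We shall show that a sequence in $\mathbb{R}^k$ converges in the Euclidean metric if and only if each sequence of coefficients converges. Take a sequence $(a_n)$ with $a_n = (a_n^{(1)}, \ldots, a_n^{(k)})$. Then $a_n \to a = (a^{(1)}, \ldots, a^{(k)})$ if and only if
$$\forall \varepsilon > 0 \ \exists N \ \forall n > N \quad \sqrt{\sum_i (a_n^{(i)} - a^{(i)})^2} < \varepsilon.$$
I claim that this is the same as saying that $a_n^{(i)} \to a^{(i)}$ for each $i = 1, 2, \ldots, k$. If $\sqrt{\sum_i (a_n^{(i)} - a^{(i)})^2} < \varepsilon$ then $\sum_i (a_n^{(i)} - a^{(i)})^2 < \varepsilon^2$ so for each $i$ we have $(a_n^{(i)} - a^{(i)})^2 < \varepsilon^2$, and then $|a_n^{(i)} - a^{(i)}| < \varepsilon$. So
$$\forall \varepsilon > 0 \ \exists N \ \forall n > N \quad |a_n^{(i)} - a^{(i)}| < \varepsilon.$$
Conversely suppose $a_n^{(i)} \to a^{(i)}$ for each $i$. Then, given $\varepsilon > 0$, we can choose $N$ so large that $|a_n^{(i)} - a^{(i)}| < \varepsilon / \sqrt{k}$ for all $i$ if $n > N$. If we do that then as long as $n > N$ we have
$$\sqrt{\sum_i (a_n^{(i)} - a^{(i)})^2} < \sqrt{k (\varepsilon / \sqrt{k})^2} = \varepsilon.$$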
2.8 If $d$ is the discrete metric on $X$ then $a_n \to a$ if and only if there is an $N$ such that $a_n = a$ for $n > N$. This is because if $\varepsilon \leq 1$ then $d(a_n, a) < \varepsilon$ implies $a_n = a$.
2.9 Proposition: If $Y \subseteq X$, $a \in Y$ and $(a_n)$ is a sequence in $Y$, then $a_n \to a$ in $Y$ with the subspace metric if and only if $a_n \to a$ in $X$. If $X_1, \ldots, X_k$ are metric spaces and $X = X_1 \times \cdots \times X_k$ with the product metric, then the sequence $(a_n)$ in $X$ tends to $a \in X$ if and only if $a_n^{(i)} \to a^{(i)}$ in $X_i$ for each $i$.
Proof: $a_n \to a$ in $Y$ means $d_Y(a_n, a) \to 0$ as $n \to \infty$. But $d_Y(a_n, a) = d(a_n, a)$ so $d_Y(a_n, a) \to 0$ if and only if $d(a_n, a) \to 0$. Similarly, $a_n \to a$ means $d_\infty(a_n, a) \to 0$, that is $\max_i d_{X_i}(a_n^{(i)}, a^{(i)}) \to 0$. But $\max_i d_{X_i}(a_n^{(i)}, a^{(i)}) \to 0$ if and only if $d_{X_i}(a_n^{(i)}, a^{(i)}) \to 0$ for each $i$.
2.10 The max norm $\|\cdot\|_\infty$ on $C[0, 1]$ is defined by $\|f\|_\infty = \max |f(t)|$. Of course we can replace 0 and 1 by any $a, b \in \mathbb{R}$ with $a \leq b$. More generally we can define a norm $\|\cdot\|_\infty$, called the sup norm or the uniform norm, on the set of bounded and continuous (or even just bounded), real-valued functions on any interval $I$, closed or not. It is defined by $\|f\|_\infty = \sup |f(t)|$.
We need to check that $\|\cdot\|_\infty$ really is a norm. This means checking 1.3(a)-(d). First (a): $\|0\|_\infty = \max(0) = 0$. Then (b): if $f$ is not the zero function then there is some $\tau \in I$ such that $f(\tau) \neq 0$, and $\|f\|_\infty = \max |f(t)| \geq |f(\tau)| > 0$. For (c), note that $|\lambda f(t)| = |\lambda| |f(t)|$ so
$$\|\lambda f\|_\infty = \max |\lambda f(t)| = |\lambda| \max |f(t)| = |\lambda| \|f\|_\infty.$$
Finally, the triangle inequality, (d):
$$\|f + g\|_\infty = \max |f(t) + g(t)| \leq \max (|f(t)| + |g(t)|) \leq \max |f(t)| + \max |g(t)| = \|f\|_\infty + \|g\|_\infty.$$
$\|\cdot\|_\infty$ is called the uniform norm because $f_n \to f$ in this normed space if and only if $f_n$ tends uniformly to $f$ in the usual sense.
2.11 Two metrics $d_1$, $d_2$ on the same set $X$ are said to be equivalent if there are positive constants $c_1, c_2 > 0$ such that
$$\forall x, y \in X \quad c_1 d_1(x, y) \leq d_2(x, y) \leq c_2 d_1(x, y).$$
The idea here is that if two metrics are equivalent they will give us the same notion of convergence: see 2.13. It's not quite true that if two metrics give the same notion of convergence they are equivalent: however, equivalence has the virtue of being easy to check. In any case examples where two inequivalent metrics give the same notion of convergence are usually artificial.
In the same way, one says that two norms $\|\cdot\|_1$ and $\|\cdot\|_2$ on a real vector space $X$ are equivalent if there are positive constants $c_1, c_2 > 0$ such that for all $x \in X$
$$c_1 \|x\|_1 \leq \|x\|_2 \leq c_2 \|x\|_1.$$
To justify the use of the word "equivalent" we need to show that equivalence of metrics (resp. norms) is indeed an equivalence relation. We also need to show that the two uses of the word are compatible. Let us do that first.
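Lemma. Two norms on a vector space $X$ are equivalent if and only if the corresponding metrics are equivalent.
Proof: If the metrics $d_1(x, y) = \|x - y\|_1$ and $d_2(x, y) = \|x - y\|_2$ are equivalent then, taking $y = 0$, $c_1 \|x\|_1 = c_1 d_1(x, 0) \leq d_2(x, 0) = \|x\|_2 \leq c_2 d_1(x, 0) = c_2 \|x\|_1$, so the norms are equivalent. Conversely, if the norms are equivalent then applying the norm inequality to $x - y$ gives the inequality for the metrics.
Now we check that equivalence of metrics is an equivalence relation. Reflexivity: take $c_1 = c_2 = 1$.
Symmetry: if $d_1$ is equivalent to $d_2$ then, for all $x, y \in X$, $c_1 d_1(x, y) \leq d_2(x, y)$ so, since $c_1 > 0$, $d_1(x, y) \leq \frac{1}{c_1} d_2(x, y)$. Similarly $d_1(x, y) \geq \frac{1}{c_2} d_2(x, y)$ so $d_2$ is equivalent to $d_1$.
Transitivity: if $c_1 d_1(x, y) \leq d_2(x, y) \leq c_2 d_1(x, y)$ and $c_1' d_2(x, y) \leq d_3(x, y) \leq c_2' d_2(x, y)$ for all $x, y \in X$, so $d_1$ is equivalent to $d_2$ and $d_2$ is equivalent to $d_3$, then $c_1 c_1' d_1(x, y) \leq c_1' d_2(x, y) \leq d_3(x, y) \leq c_2' d_2(x, y) \leq c_2 c_2' d_1(x, y)$, so $d_1$ is equivalent to $d_3$.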
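2.12 Proposition. The norms $\|\cdot\|_k$ and $\|\cdot\|_\infty$ on $\mathbb{R}^n$ are equivalent, for any $k$. In particular the different $\|\cdot\|_k$ are all equivalent to each other.
Proof: Recall that $\|x\|_k = \sqrt[k]{\sum_i |x_i|^k}$, $k \in \mathbb{N}$, and $\|x\|_\infty = \max |x_i|$. Now $\|x\|_\infty^k \leq \sum_i |x_i|^k \leq n \|x\|_\infty^k$, so taking $k$-th roots gives $\|x\|_\infty \leq \|x\|_k \leq n^{1/k} \|x\|_\infty$.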
2.13 Theorem. If $d_1$ and $d_2$ are equivalent metrics on a non-empty set $X$ then $a_n \to a$ in $(X, d_1)$ if and only if $a_n \to a$ in $(X, d_2)$.
Proof: By symmetry it is enough to show this in one direction only. Suppose $a_n \to a$ in $(X, d_1)$. We want to show that $a_n \to a$ in $(X, d_2)$. Given $\varepsilon > 0$, we choose $N$ such that $d_1(a_n, a) < \varepsilon / c_2$ if $n > N$. Then $d_2(a_n, a) \leq c_2 d_1(a_n, a) < \varepsilon$, so we have what we want.
2.14 We should check that what we are doing is not trivial: how do we know that there are any inequivalent
pairs of metrics? In fact there are lots: here are two examples.
$X = \mathbb{R}$, $d_1$ is the Euclidean metric, $d_2$ is the discrete metric. We have seen that in the discrete metric the only convergent sequences are the eventually constant sequences (see 2.8). On the other hand there are plenty of convergent sequences in the Euclidean metric that are not eventually constant, for instance $a_n = \frac{1}{n}$. So $d_1$ and $d_2$ cannot be equivalent, because if they were then according to 2.13 they ought to give the same convergent sequences.
If $X = C[0, 1]$ then $\|\cdot\|_1$ and $\|\cdot\|_\infty$ are inequivalent, where $\|f\|_1 = \int_0^1 |f(t)| \, dt$. (It is easy to check that $\|\cdot\|_1$ is a norm.) Consider the function
$$a_n(t) = \begin{cases} 1 - nt & \text{if } 0 \leq t \leq \frac{1}{n}, \\ 0 & \text{if } \frac{1}{n} < t \leq 1. \end{cases}$$
This is the function that comes down from 1 to 0 linearly with slope $-n$ and then stays at 0. Clearly $\|a_n\|_\infty = 1$ for all $n$, so $a_n$ does not tend to 0 in the uniform norm, but $\|a_n\|_1 = \frac{1}{2n}$ so $a_n \to 0$ in $\|\cdot\|_1$. So again the two metrics give different convergent sequences and must be inequivalent.
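The two norms of these functions can be checked numerically. This is only a sketch: the helper names are hypothetical, the sup norm is approximated by sampling on a uniform grid, and the integral is approximated by the trapezoidal rule (neither discretisation comes from the notes). The grid is chosen so that the kink at $t = 1/n$ falls on a grid point for the values of $n$ tried here.

```python
def tent(n):
    """Return a_n on [0, 1]: equal to 1 - n*t on [0, 1/n] and 0 afterwards."""
    return lambda t: max(0.0, 1.0 - n * t)

def sup_norm(f, samples=10001):
    """Approximate the sup norm of f on [0, 1] by sampling a uniform grid."""
    return max(abs(f(i / (samples - 1))) for i in range(samples))

def l1_norm(f, samples=10001):
    """Approximate the integral of |f| over [0, 1] by the trapezoidal rule."""
    h = 1.0 / (samples - 1)
    values = [abs(f(i * h)) for i in range(samples)]
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))

for n in (1, 10, 100):
    # The sup norm stays at 1 while the integral norm shrinks like 1/(2n).
    print(n, sup_norm(tent(n)), l1_norm(tent(n)))
```

The sup norm does not move while the integral norm tends to 0, which is the numerical face of the inequivalence.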
Finally, a pathology (that is, an unpleasant counterexample that you don't really want to think about but you have to know exists). Take $X = \mathbb{N}$, $d_1$ the discrete metric and define $d_2$ by $d_2(n, n + 1) = \frac{1}{n}$, so that more generally, $d_2(n, m) = \sum_{r=n}^{m-1} \frac{1}{r}$ if $n < m$. Then the convergent sequences in both cases are the same, namely the eventually constant sequences. For $d_1$ we saw this in 2.8: for $d_2$, suppose $a_n \to a$. Choose $\varepsilon = \frac{1}{2a}$. Then $\exists N \ \forall n > N \ d_2(a, a_n) < \frac{1}{2a}$, but if $b \neq a$ then $d_2(a, b) \geq \frac{1}{a}$ (the nearest point to $a$ is $a + 1$). So $a_n = a$ for $n > N$, so the sequence $(a_n)$ is eventually constant. Looking at 2.13 you might hope that the converse would also be true, that if two metrics give the same convergent sequences then they are equivalent. But here $d_1$ and $d_2$ do give the same convergent sequences, as we've just checked, and they aren't equivalent. Suppose they were, so $\forall x, y \in X \ c_1 d_1(x, y) \leq d_2(x, y) \leq c_2 d_1(x, y)$. Choose $x > 1/c_1$ and $y = x + 1$. Then $d_2(x, y) = \frac{1}{x} < c_1 = c_1 d_1(x, y)$, a contradiction.
Fortunately this doesn't matter much because it practically never happens. In particular (we aren't going to prove this, though it's not especially hard) it doesn't happen for metrics given by norms.
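The failure of the lower bound in the pathology can be made concrete. In this sketch (the function names and the helper `witness` are hypothetical) the two metrics on $\mathbb{N} = \{1, 2, \ldots\}$ are computed exactly with fractions, and for any proposed constant $c_1 > 0$ the helper returns a point $x$ at which $d_2(x, x+1) < c_1 d_1(x, x+1)$.

```python
from fractions import Fraction

def d1(n, m):
    """Discrete metric on N."""
    return 0 if n == m else 1

def d2(n, m):
    """Metric with d2(n, n+1) = 1/n, extended additively along N."""
    lo, hi = min(n, m), max(n, m)
    return sum((Fraction(1, r) for r in range(lo, hi)), Fraction(0))

def witness(c1):
    """Return an x > 1/c1, so that d2(x, x+1) = 1/x < c1 = c1 * d1(x, x+1)."""
    return int(1 / c1) + 1

for c1 in (Fraction(1, 2), Fraction(1, 10), Fraction(1, 1000)):
    x = witness(c1)
    print(c1, x, d2(x, x + 1) < c1 * d1(x, x + 1))
```

No single $c_1$ survives, which is exactly why the metrics are inequivalent even though each point still has a positive distance to its nearest neighbour.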
3.2 Examples. $X \subseteq X$ is open: you may take $\varepsilon$ to be anything. $\emptyset \subseteq X$ is also open: this is true according to the definition above, because of a logical quibble about the meaning of $\forall x \in \emptyset$, but if you prefer you may modify your definition so that it says explicitly that $\emptyset \subseteq X$ is open. As a result, $\emptyset \subseteq X$ and $X \subseteq X$ are also both closed.
Open balls are open, as you would hope. But this is not quite trivial to prove. Take $U = B_r(c) \subseteq X$ and suppose $x \in U$. Then $d(x, c) < r$. Choose $\varepsilon = r - d(x, c)$, which is positive. I claim that $B_\varepsilon(x) \subseteq U$. Suppose $y \in B_\varepsilon(x)$: then
$$d(y, c) \leq d(y, x) + d(x, c) < \varepsilon + d(x, c) = r$$
so $y \in B_r(c) = U$.
If $X$ is a discrete metric space then every subset of $X$ is open, and therefore every subset of $X$ is closed as well. If $U \subseteq X$ and $X$ is discrete, suppose $x \in U$: then $B_1(x) = \{x\} \subseteq U$, so $U$ is open.
From (a) and (b) it follows (by De Morgan's rules) that
a') If $A$ is nonempty and $Z_\alpha \subseteq X$ is closed for every $\alpha \in A$ then $\bigcap_{\alpha \in A} Z_\alpha$ is closed in $X$.
b') If $Z_1, \ldots, Z_n$ are finitely many sets, each closed in $X$, then $\bigcup_i Z_i$ is closed in $X$.
3.4 The collection of all the open subsets of a metric space $X$ is called the topology of $(X, d)$.
Proposition. A nonempty set $U \subseteq X$ is open if and only if $U$ can be written as a union of open balls.
Proof: If $U$ can be written as a union of open balls then $U$ is open by 3.3(a), since the open balls are open themselves (3.2). Conversely, suppose $U$ is open. Then for each $x \in U$ there is a positive number $\varepsilon_x$ such that $B_{\varepsilon_x}(x) \subseteq U$. Consider $W = \bigcup_{x \in U} B_{\varepsilon_x}(x)$. By definition $W$ is a union of open balls: I claim $W = U$. If $w \in W$ then there is an $x \in U$ such that $w \in B_{\varepsilon_x}(x) \subseteq U$, so $W \subseteq U$. If $x \in U$ then $x \in B_{\varepsilon_x}(x) \subseteq W$, so $U \subseteq W$.
3.5 Theorem. If two metrics $d$, $d'$ on $X$ are equivalent then the metric spaces $(X, d)$ and $(X, d')$ have the same topology.
Proof: Suppose $U \subseteq (X, d)$ is open in the sense of $d$, and let $x \in U$. Then there exists $\varepsilon > 0$ such that $B_\varepsilon(x) \subseteq U$. We need to show that $U$ is also open in the sense of $d'$. Let us use $B'$ rather than $B$ to denote a ball in the sense of $d'$: thus $B'_r(c) = \{y \in X \mid d'(c, y) < r\}$. Then we must show that there exists $\varepsilon' > 0$ such that $B'_{\varepsilon'}(x) \subseteq U$. Since $d$ and $d'$ are equivalent, there exist $c_1, c_2 > 0$ such that $c_1 d(x, y) \leq d'(x, y) \leq c_2 d(x, y)$. Take $\varepsilon' = c_1 \varepsilon$. If $y \in B'_{\varepsilon'}(x)$ then $d(x, y) \leq d'(x, y) / c_1 < \varepsilon' / c_1 = \varepsilon$, so $y \in B_\varepsilon(x) \subseteq U$, as required. Thus every open subset in the sense of $d$ is also open in the sense of $d'$: the other way round follows by symmetry.
If two metrics $d$ and $d'$ on a set $X$ yield the same topology we say they are topologically equivalent. We have just proved that equivalent metrics are topologically equivalent. In order to prove this what we had to do was check that every ball in the sense of $d$ contained a ball in the sense of $d'$ (and vice versa).
It is not true that if $d$ and $d'$ are topologically equivalent then they are equivalent. The pathology in 2.14 is a counterexample: in that example the metrics are not equivalent but the topology is the same (every set is open) in both cases.
3.8 Theorem. If $Z \subseteq X$ then $Z$ is closed in $X$ if and only if the following holds: if $a_n \in Z$ for all $n \in \mathbb{N}$ and $a_n \to a$ in $X$, then $a \in Z$.
Proof: Suppose first that $Z$ is closed, so $U = X \setminus Z$ is open. If $a_n \to a$ but $a \notin Z$ then $a \in U$, so there exists $\varepsilon > 0$ such that $B_\varepsilon(a) \subseteq U$. So if $d(a_n, a) < \varepsilon$, which holds for all large $n$, then $a_n \in U$, and this is a contradiction since $a_n \in Z$. Therefore $a \in Z$.
Conversely, suppose $Z$ is not closed. Then $U$ is not open so $\exists a \in U \ \forall \varepsilon > 0 \ B_\varepsilon(a) \not\subseteq U$. Choose such an $a \in U$ and take $\varepsilon = \frac{1}{n}$. Then $B_{1/n}(a) \not\subseteq U$ so $B_{1/n}(a) \cap Z \neq \emptyset$. Choose $a_n \in B_{1/n}(a) \cap Z$. Then $a_n \to a$, but $a_n \in Z$ and $a \notin Z$.
This means that a set $Z$ is closed in $X$ if you don't fall out of it by taking limits. The limit of a sequence of points of $Z$ might not exist at all (in $X$), but if it does and $Z$ is closed then it's in $Z$.
then $d(a_n, a) < \varepsilon$. Then for $n > N$ we have $a_n \in B_\varepsilon(a) \subseteq U$, as required. For the converse, suppose that for every open neighbourhood $U$ of $a$ there exists $N$ such that $a_n \in U$ for $n > N$. Then in particular this is true for $U = B_\varepsilon(a)$ for every $\varepsilon > 0$, which is precisely what the definition of convergence to $a$ requires.
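3.11 If $X$ is a metric space and $D \subseteq A \subseteq X$, we say that $D$ is relatively open in $A$ if $D$ is open in $A$ with the subspace metric $d_A$.
For example if $D = (0, 1) \subseteq A = \mathbb{R} \subseteq X = \mathbb{C}$ (or $A$ is the $x$-axis in $\mathbb{R}^2$) then $D$ is relatively open in $A$, though $D$ is not open in $X$.
Theorem. If $D \subseteq A \subseteq X$ then $D$ is relatively open in $A$ if and only if $D = A \cap U$ for some open set $U \subseteq X$.
Proof: The point is that a ball in $(A, d_A)$ is a ball in $X$ intersected with $A$. More precisely, if $B^A_r(x) = \{y \in A \mid d_A(y, x) < r\}$ then $B^A_r(x) = B_r(x) \cap A$ where $B_r(x) = \{y \in X \mid d(y, x) < r\}$ is the usual ball in $X$.
If $D = A \cap U$ then $\forall x \in D \ \exists \varepsilon_x > 0 \ B_{\varepsilon_x}(x) \subseteq U$. So $B_{\varepsilon_x}(x) \cap A \subseteq D$; but $B_{\varepsilon_x}(x) \cap A$ is the ball in $(A, d_A)$ with centre $x$ and radius $\varepsilon_x$, so we have shown that $D$ is open in $A$.
Conversely, if $D$ is relatively open in $A$ then for each $x \in D$ there is a ball in the sense of $(A, d_A)$ centred at $x$ which is contained in $D$. So there is an $\varepsilon_x$ such that $B^A_{\varepsilon_x}(x) \subseteq D$, that is $B_{\varepsilon_x}(x) \cap A \subseteq D$. Now take $U = \bigcup_{x \in D} B_{\varepsilon_x}(x)$: we have
$$A \cap U = A \cap \bigcup_{x \in D} B_{\varepsilon_x}(x) = \bigcup_{x \in D} (A \cap B_{\varepsilon_x}(x)) \subseteq D,$$
and $D \subseteq A \cap U$ because each $x \in D$ lies in $B_{\varepsilon_x}(x)$, so $D = A \cap U$.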
4. Continuity.
4.1 Let $(X, d)$ and $(X', d')$ be metric spaces and $f : X \to X'$ a function. Then the following conditions are equivalent:
i) $\forall x \in X \ \forall \varepsilon > 0 \ \exists \delta > 0 \ \forall y \in X \quad (d(x, y) < \delta) \implies (d'(f(x), f(y)) < \varepsilon)$;
ii) for all $x \in X$, if $(x_n)$ is a sequence in $(X, d)$ and $x_n \to x$, then $f(x_n) \to f(x)$ in $(X', d')$;
iii) if $U \subseteq X'$ is open in $X'$ then $f^{-1}(U) \subseteq X$ is open in $X$.
If one of these conditions holds then $f$ is said to be continuous on $X$.
Proof: (i) $\implies$ (ii). Suppose $x_n \to x$ and $\varepsilon > 0$. Then there exists $\delta > 0$ such that if $d(x_n, x) < \delta$ then $d'(f(x_n), f(x)) < \varepsilon$. But $\exists N \ \forall n > N \ d(x_n, x) < \delta$ so $\exists N \ \forall n > N \ d'(f(x_n), f(x)) < \varepsilon$, i.e. $f(x_n) \to f(x)$.
(ii) $\implies$ (iii). If $U \subseteq X'$ is open then $Z = X' \setminus U$ is closed and $f^{-1}(Z) = X \setminus f^{-1}(U) = W$. It is enough to show that $W$ is closed. By 3.8 it is enough to show that if $w_n \in W$ and $w_n \to w \in X$ then $w \in W$. But if $w_n \to w$ then $f(w_n) \to f(w) \in X'$, and $f(w_n) \in Z$ and $Z$ is closed, so by 3.8 $f(w) \in Z$, i.e. $w \in W$.
(iii) $\implies$ (i). Choose $\varepsilon > 0$. $U = B_\varepsilon(f(x))$ is an open subset of $X'$ so $f^{-1}(U)$ is an open subset of $X$. Moreover $x \in f^{-1}(U)$, so by the definition of an open set $\exists \delta > 0 \ B_\delta(x) \subseteq f^{-1}(U)$. But that means precisely that if $d(y, x) < \delta$ (so $y \in B_\delta(x)$) then $d'(f(y), f(x)) < \varepsilon$ (so $f(y) \in U$).
One sometimes says that $f$ is continuous at a particular $x \in X$ if (i) or (ii) holds for that particular value of $x$ (strictly, we haven't checked that these two statements are equivalent, because we went via (iii) which doesn't mention a particular $x$). But continuity at just one point is almost never any use: what one needs is continuity in a neighbourhood of a point, and we say that $f : X \to X'$ is continuous near $x \in X$ if there is a neighbourhood $Y$ of $x$ such that the restriction of $f$ to $Y$ is continuous.
4.2 This agrees with all previous definitions of continuity of maps $\mathbb{R} \to \mathbb{R}$, $\mathbb{C} \to \mathbb{C}$, etc., because condition (i) is the usual definition in those cases.
If $t \in [0, 1]$ the evaluation map $\mathrm{ev}_t : C[0, 1] \to \mathbb{R}$ given by $\mathrm{ev}_t(f) = f(t)$ is continuous when $\mathbb{R}$ has the Euclidean metric and $C[0, 1]$ has the uniform metric (or anything else sensible). It is easiest to check this using (ii): if $f_n \to f$ in $C[0, 1]$ with the uniform metric then $f_n \to f$ pointwise, so $f_n(t) = \mathrm{ev}_t(f_n) \to f(t) = \mathrm{ev}_t(f)$, for any $t \in [0, 1]$.
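The map $I : C[0, 1] \to C[0, 1]$ (with the uniform metric) given by $I(f)(t) = \int_0^t f(x) \, dx$ is continuous. This is most easily checked using (i): if $\|f - g\|_\infty < \varepsilon$ then
$$\begin{aligned}
\|I(f) - I(g)\|_\infty &= \sup_{0 \leq t \leq 1} \left| \int_0^t f(x) \, dx - \int_0^t g(x) \, dx \right| \\
&= \sup_{0 \leq t \leq 1} \left| \int_0^t (f(x) - g(x)) \, dx \right| \\
&\leq \sup_{0 \leq t \leq 1} \int_0^t |f(x) - g(x)| \, dx \\
&\leq \int_0^1 |f(x) - g(x)| \, dx \\
&\leq \sup |f(x) - g(x)| \\
&= \|f - g\|_\infty < \varepsilon.
\end{aligned}$$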
4.8 Definition. A continuous map $f : (X, d) \to (X', d')$ is called an isometry if $d'(f(x), f(y)) = d(x, y)$ for all $x, y \in X$.
5. Completeness.
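$$M = 1 + \max\{d(a_n, a_m) \mid n, m \leq N + 1\}.$$
Then, if $n, m \leq N$ we have $d(a_n, a_m) \leq \max\{d(a_n, a_m) \mid n, m \leq N + 1\} < M$; if $n \leq N < m$ we have
choose $N$ such that $d(a_n, a_m) < \varepsilon / 2$ if $n, m > N$ and $d(a_{n_i}, a) < \varepsilon / 2$ if $i > N$. Now suppose $n > N$ and choose $i > n$, so that $n_i > n$ also. Then
$$d(a_n, a) \leq d(a_n, a_{n_i}) + d(a_{n_i}, a) < \varepsilon / 2 + \varepsilon / 2 = \varepsilon,$$
so $a_n \to a$.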
5.4 A metric space $(X, d)$ is said to be complete if every Cauchy sequence in $(X, d)$ is convergent.
This means, roughly, that everything that ought to have a limit, does. If we know that a metric space is complete then we can tell that a sequence is convergent without having to know first what the limit is going to be. This means we can use convergence as a way of constructing or finding things we didn't previously know about: completeness actually asserts that under certain circumstances some point of $X$ exists, rather than just telling us more things about points we already knew existed.
5.5 $\mathbb{R}$ with the usual metric is complete. It is probably best to regard this as an axiom: in one rather popular approach this is pretty close to being the definition of $\mathbb{R}$. $\mathbb{R}^n$ is complete (with the Euclidean, or equivalently with the product, metric: if two metrics are equivalent then they give the same Cauchy sequences). If $(a_n)$ is a Cauchy sequence in $\mathbb{R}^n$ with respect to the product metric $d_\infty$ then for $m, n > N$ we have $\max_i |a_n^{(i)} - a_m^{(i)}| = d_\infty(a_n, a_m) < \varepsilon$; but then for each $i$ we get $|a_n^{(i)} - a_m^{(i)}| < \varepsilon$, so the sequence $(a_n^{(i)})$ is a Cauchy sequence in $\mathbb{R}$ and therefore converges. So $a_n$ converges also.
$\mathbb{Q}$ is not complete, because $3, 3.1, 3.14, 3.141, 3.1415, \ldots$ is obviously Cauchy but does not converge in $\mathbb{Q}$ since $\pi \notin \mathbb{Q}$. $\mathbb{R} \setminus \{0\}$ (with the usual metric) is also not complete, because the Cauchy sequence $1/n$ no longer converges (somebody has taken the limit). The $p$-adic metrics on $\mathbb{Z}$ and $\mathbb{Q}$ are not complete metrics (that is, they do not yield complete metric spaces) either: you can easily check that $a_n = p^n$ is Cauchy but not
5.6 Proposition. If $(X, d)$ is a complete metric space and $\emptyset \neq A \subseteq X$ then $(A, d_A)$ is complete if and only if $A$ is a closed subset of $X$.
Proof: If $A$ is complete, take a sequence $(a_n)$ in $A$ that converges to $a \in X$. We need to show that in fact $a \in A$. But since $(a_n)$ is convergent in $X$ it is Cauchy in $X$ and therefore it is Cauchy in $A$ too, since $d(a_m, a_n) < \varepsilon$ if and only if $d_A(a_m, a_n) < \varepsilon$. So $(a_n)$ must converge in $A$, and the limit it has in $A$ must be the same as its limit in $X$, since a sequence has at most one limit. So $a \in A$.
Conversely, suppose $A$ is closed and $(a_n)$ is a Cauchy sequence in $A$. Then $(a_n)$ is a Cauchy sequence in $X$ also, so it has a limit $a \in X$. Since $A$ is closed, $a \in A$, but now $a_n \to a \in A$ so $(a_n)$ is a convergent sequence in $A$ and $A$ is complete.
5.7 Theorem. If $a, b \in \mathbb{R}$ and $a < b$ then the space $C[a, b]$ with the uniform metric is complete.
For this we need the following lemma from real analysis.
Lemma. If $(f_n)$ is a sequence of continuous functions on some interval $[a, b]$ and $f_n$ tends to $f$ uniformly (that is, $\sup |f_n(t) - f(t)| \to 0$ as $n \to \infty$), then $f$ is also continuous on $[a, b]$.
One expresses this informally by saying that "a uniform limit of continuous functions is continuous".
Proof of Lemma: Choose $N$ so large that $\forall n > N \ \forall t \in [a, b] \ |f_n(t) - f(t)| < \varepsilon / 3$ (by the uniform convergence this is possible). If $n > N$ we have
$(a_n) \sim (b_n)$ (where $(a_n)$ and $(b_n)$ are Cauchy sequences in $(X, d)$) if and only if $\lim_{n \to \infty} d(a_n, b_n) = 0$, and elements of $\hat{X}$ are equivalence classes: $X$ is inside $\hat{X}$ because you think of $x \in X$ as corresponding to the constant sequence at $x$. To define $\hat{d}$ you simply say that $\hat{d}([(a_n)], [(b_n)]) = \lim d(a_n, b_n)$. You then have to check that this all makes sense and really defines a metric space, etc. It isn't hard.
5.9 Completeness, unlike most other things we've mentioned, is not a topological property. That is, in order to tell whether a metric space is complete or not it's not enough to know what the open sets are: you really have to know what the metric is. For instance, take $\mathbb{N}$ with the discrete metric. That is complete, because the only Cauchy sequences are the eventually constant sequences and they converge. Every subset is open because $\{n\} = B_1(n)$ for any $n \in \mathbb{N}$ and if $A \subseteq \mathbb{N}$ then $A = \bigcup_{n \in A} \{n\} = \bigcup_{n \in A} B_1(n)$, which is open. But if we take the metric on $\mathbb{N}$ where the distance between successive points is $d(n, n + 1) = 1/n^2$, so $d(n, m) = \sum_{r=n}^{m-1} 1/r^2$, then the sequence $(a_n) = n$ is Cauchy (because $\sum_{r=n}^{m-1} 1/r^2 < \sum_{r=n}^{\infty} 1/r^2 < \varepsilon$ if $n$ is large enough). On the other hand every set is still open, because $\{n\} = B_{1/2n^2}(n)$. So this new metric space has exactly the same open sets as the old one, and therefore exactly the same convergent sequences, but not the same Cauchy sequences: in particular, $(n)$ does not converge though it is Cauchy.
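Both claims about the $1/n^2$ metric can be spot-checked with exact arithmetic. This is an illustrative sketch with hypothetical names; the bound $\sum_{r \geq n} 1/r^2 < 1/(n-1)$ used below is the standard telescoping estimate (from $1/r^2 < 1/(r-1) - 1/r$) and is an addition, not something stated in the notes.

```python
from fractions import Fraction

def d(n, m):
    """Metric on N with d(n, n+1) = 1/n^2, extended additively."""
    lo, hi = min(n, m), max(n, m)
    return sum((Fraction(1, r * r) for r in range(lo, hi)), Fraction(0))

def ball(centre, radius, universe):
    """Open ball of the given radius, restricted to a finite sample of N."""
    return [n for n in universe if d(centre, n) < radius]

# Cauchy: all distances beyond n are below 1/(n - 1), which tends to 0.
print(all(d(n, m) < Fraction(1, n - 1)
          for n in range(2, 30) for m in range(n + 1, 60)))

# Singletons are open: the ball of radius 1/(2 n^2) around n contains only n.
print(ball(5, Fraction(1, 50), range(1, 100)))
```

So the points crowd together in the metric sense while each one remains isolated in the topological sense.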
6. Contractions.
6.1 Suppose $(X, d)$ and $(X', d')$ are metric spaces. We say that a map $f : X \to X'$ satisfies a Lipschitz condition if there is a constant $k$, $0 \leq k < \infty$ (the Lipschitz constant) such that
$$\forall x, y \in X \quad d'(f(x), f(y)) \leq k \, d(x, y).$$
For instance, if $f$ is an isometry then $f$ satisfies a Lipschitz condition with $k = 1$ (and moreover the inequality is actually an equality).
If $f$ satisfies a Lipschitz condition then $f$ is automatically continuous, so we don't need to specify this in the definition. It is easy to see this: if $k = 0$ then $f$ is constant and therefore continuous, and if $k > 0$ then, given $\varepsilon > 0$, we take $\delta = \varepsilon / k$. Then if $d(x, y) < \delta$ we have $d'(f(x), f(y)) < k \delta = \varepsilon$ so $f$ is continuous.
6.2 A contraction or contraction mapping is a map $f : X \to X'$ which satisfies a Lipschitz condition with $0 \leq k < 1$.
To say that $f$ is a contraction is not the same as saying $\forall x, y \in X \ d'(f(x), f(y)) < d(x, y)$: it's stronger than that. If all we know is that $d'(f(x), f(y)) < d(x, y)$ then possibly $\sup_{x \neq y} \frac{d'(f(x), f(y))}{d(x, y)} = 1$, but for a contraction we have $\sup_{x \neq y} \frac{d'(f(x), f(y))}{d(x, y)} < 1$.
6.3 If $f : \mathbb{R} \to \mathbb{R}$ is differentiable and $|f'(t)| \leq k$ for all $t \in \mathbb{R}$ then $f$ satisfies a Lipschitz condition with constant $k$. This follows from the Mean Value Theorem: if $a < b$ then
$$\frac{d(f(a), f(b))}{d(a, b)} = \frac{|f(b) - f(a)|}{b - a} = |f'(\xi)|$$
for some $\xi \in (a, b)$, so $\frac{d(f(a), f(b))}{d(a, b)} \leq k$. If $k < 1$ then $f$ is a contraction.
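The derivative bound can be watched at work on a concrete function. The sketch below uses $f(x) = \frac{1}{2} \sin x$, a hypothetical example chosen here because $|f'(x)| = \frac{1}{2} |\cos x| \leq \frac{1}{2}$; it computes the largest difference quotient over a finite grid of pairs, which should never exceed $k = \frac{1}{2}$. A finite sample illustrates the bound but of course does not prove it.

```python
import math

def f(x):
    """Example map with |f'(x)| = 0.5 * |cos(x)| <= k = 0.5."""
    return 0.5 * math.sin(x)

def max_difference_quotient(func, points):
    """Largest value of |func(a) - func(b)| / |a - b| over distinct sample points."""
    return max(abs(func(a) - func(b)) / abs(a - b)
               for a in points for b in points if a != b)

grid = [i / 10 for i in range(-100, 101)]   # sample points in [-10, 10]
worst = max_difference_quotient(f, grid)
print(worst)   # stays at or below the Lipschitz constant 0.5
```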
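6.4 If $X$ is any set and $f : X \to X$ is any map then a fixed point of $f$ is an element $x \in X$ such that $f(x) = x$.
6.5 Theorem. (The Contraction Mapping Theorem, also known as the Banach Fixed-Point Theorem.) If $(X, d)$ is a complete metric space and $f : X \to X$ is a contraction then
(i) $f$ has a unique fixed point $\alpha \in X$;
(ii) if $a_0 \in X$ and $a_n = f(a_{n-1})$ for $n \in \mathbb{N}$ then $a_n \to \alpha$ as $n \to \infty$.
Proof: Let $k < 1$ be the Lipschitz constant (we may as well assume $k > 0$ as the case $k = 0$ is the trivial case of a constant map). Since $d(a_r, a_{r+1}) = d(f(a_{r-1}), f(a_r)) \leq k \, d(a_{r-1}, a_r)$, induction gives $d(a_r, a_{r+1}) \leq k^r d(a_0, a_1)$. So for $m > n$
$$\begin{aligned}
d(a_n, a_m) &\leq \sum_{r=n}^{m-1} d(a_r, a_{r+1}) \\
&\leq \sum_{r=n}^{m-1} k^r d(a_0, a_1) \\
&= k^n d(a_0, a_1) \sum_{r=0}^{m-n-1} k^r \\
&\leq \frac{k^n d(a_0, a_1)}{1 - k}
\end{aligned}$$
which is less than $\varepsilon$ if $n$ is large enough.
So, since $(X, d)$ is complete, $(a_n)$ converges and we may write $\alpha = \lim a_n$. We need to check that $\alpha$ is a fixed point. But
$$\begin{aligned}
d(\alpha, f(\alpha)) &\leq d(\alpha, a_n) + d(a_n, f(\alpha)) \\
&= d(\alpha, a_n) + d(f(a_{n-1}), f(\alpha)) \\
&\leq d(\alpha, a_n) + k \, d(a_{n-1}, \alpha) \\
&< \varepsilon \quad \text{if } n \text{ is sufficiently large.}
\end{aligned}$$
So $d(\alpha, f(\alpha)) = 0$, so $\alpha = f(\alpha)$.
It remains to show that there is no other fixed point. If $\alpha'$ is a fixed point then
$$d(\alpha, \alpha') = d(f(\alpha), f(\alpha')) \leq k \, d(\alpha, \alpha'),$$
and since $k < 1$ this forces $d(\alpha, \alpha') = 0$, that is, $\alpha' = \alpha$.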
words $f : [0, 1] \to [0, 1]$ is a contraction mapping. Take $a_0 = \frac{1}{2}$; then $a_1 = f(\frac{1}{2}) = \frac{1}{3} e^{1/2} \approx 0.5496$, etc. But it's rather slow: ten iterations gives $a_{10} \approx 0.6182$ and the correct solution is $0.61906\ldots$.
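The iteration is easy to reproduce. The sketch below assumes the map is $f(x) = \frac{1}{3} e^x$; the definition of $f$ is not visible in this part of the notes, and this form is inferred from the quoted value $a_1 = \frac{1}{3} e^{1/2}$. The helper name `iterate` is hypothetical.

```python
import math

def f(x):
    """Assumed map, inferred from a_1 = e^(1/2) / 3 in the text."""
    return math.exp(x) / 3.0

def iterate(func, a0, steps):
    """Return [a_0, a_1, ..., a_steps] with a_n = func(a_{n-1})."""
    a = a0
    history = [a]
    for _ in range(steps):
        a = func(a)
        history.append(a)
    return history

history = iterate(f, 0.5, 10)
for n, a in enumerate(history):
    print(n, round(a, 4))
```

Running more steps, for example `iterate(f, 0.5, 200)[-1]`, gives a value that agrees with the quoted solution to the digits shown, which illustrates part (ii) of the Contraction Mapping Theorem.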
6.8 The Newton-Raphson method. This is another way of solving equations iteratively and tends to be quite quick. We are now in a position to say when it works (and why). Suppose $f : \mathbb{R} \to \mathbb{R}$ is twice differentiable. Suppose that $f'(x) \neq 0$ for all $x \in \mathbb{R}$ and that there is a $k$ with $0 \leq k < 1$ such that
$$\left| \frac{f(x) f''(x)}{f'(x)^2} \right| \leq k$$
for all $x \in \mathbb{R}$. Then $f(x) = 0$ has a unique solution $\alpha \in \mathbb{R}$, obtained by taking some starting value $a_0$ and putting $a_{n+1} = a_n - \frac{f(a_n)}{f'(a_n)}$ (and then $\alpha = \lim a_n$).
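The recursion can be tried on a small example. The sketch below applies it to $f(x) = e^x - 3x$, a hypothetical choice made here so that the root is the solution of $3x = e^x$ near 0.619. Note the assumption: this $f$ does not satisfy the global hypotheses stated above (its derivative vanishes at $x = \ln 3$), so the code only illustrates the iteration from a starting value close to the root, not the theorem itself.

```python
import math

def newton(func, dfunc, a0, steps):
    """Run a_{n+1} = a_n - func(a_n) / dfunc(a_n) for a fixed number of steps."""
    a = a0
    for _ in range(steps):
        a = a - func(a) / dfunc(a)
    return a

def f(x):
    return math.exp(x) - 3.0 * x      # illustrative function, root near 0.619

def df(x):
    return math.exp(x) - 3.0          # derivative of f

root = newton(f, df, 0.5, 5)
print(root, f(root))
```

Five steps from $a_0 = 0.5$ already give a residual at the level of rounding error, in line with the remark that the method tends to be quite quick.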
The point is that $g : \mathbb{R} \to \mathbb{R}$ given by $g(x) = x - \frac{f(x)}{f'(x)}$ is a contraction mapping because $g'(x) = \frac{f(x) f''(x)}{f'(x)^2}$, so $|g'(x)| \leq k < 1$ and 6.3 applies.
$$D = [t_0 - 1, t_0 + 1] \times [x_0 - 1, x_0 + 1] \subseteq \mathbb{R}^2;$$
$$c = 1 + \max \{ |f(t, x)| : (t, x) \in D \};$$
$$c' = 1 + \max \left\{ \left| \frac{\partial f}{\partial x}(t, x) \right| : (t, x) \in D \right\}.$$
These maxima exist (because $D$ is compact, see the next section): this is why we need to restrict attention to $D$ rather than working with the whole of $\mathbb{R}^2$. However there is nothing very special about the precise choice of $D$ we have made. Choose $\delta$ such that $0 < \delta < \min(\frac{1}{c}, \frac{1}{c'})$: note that $\delta < 1$. Now we look at a closed
Thus $Z = \bar{B}_{c\delta}(x_0)$, the closed ball in $C[t_0 - \delta, t_0 + \delta]$ of radius $c\delta$ with centre the constant function $x_0$.
Because $Z$ is closed, it is complete. We define a map $T : Z \to Z$ by
$$T(x)(t) = x_0 + \int_{t_0}^t f(s, x(s)) \, ds$$
for $x \in Z$, $t \in [t_0 - \delta, t_0 + \delta]$. If we can prove that $T$ has a fixed point in $Z$ we shall have solved our differential equation, because $T(x) = x$ if and only if $\dot{x} = f(t, x)$ and $x(t_0) = x_0$. So we need to check that $T : Z \to Z$ is a contraction mapping.
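First of all we need to check that $T$ is indeed a map from $Z$ to $Z$. So we must check that $T(x) \in Z$ if $x \in Z$. But
$$|T(x)(t) - x_0| = \left| \int_{t_0}^t f(s, x(s)) \, ds \right| \leq \left| \int_{t_0}^t |f(s, x(s))| \, ds \right| \leq \left| \int_{t_0}^t c \, ds \right| = c |t - t_0| \leq c\delta,$$
so $T(x) \in Z$.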
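Then we need to check that $T$ is a contraction mapping. If $x, y \in Z$ then for some $t \in [t_0 - \delta, t_0 + \delta]$
$$\begin{aligned}
\|T(x) - T(y)\|_\infty &= |T(x)(t) - T(y)(t)| \\
&= \left| \int_{t_0}^t \big( f(s, x(s)) - f(s, y(s)) \big) \, ds \right| \\
&= \left| \int_{t_0}^t \left\{ \int_{y(s)}^{x(s)} \frac{\partial f}{\partial w}(s, w) \, dw \right\} ds \right| \\
&\leq \left| \int_{t_0}^t \left| \int_{y(s)}^{x(s)} \frac{\partial f}{\partial w}(s, w) \, dw \right| ds \right| \\
&\leq \left| \int_{t_0}^t \left| \int_{y(s)}^{x(s)} \left| \frac{\partial f}{\partial w}(s, w) \right| dw \right| ds \right| \\
&\leq \left| \int_{t_0}^t \left| \int_{y(s)}^{x(s)} c' \, dw \right| ds \right| \\
&= \left| \int_{t_0}^t c' |x(s) - y(s)| \, ds \right| \\
&\leq \left| \int_{t_0}^t c' \|x - y\|_\infty \, ds \right| \\
&\leq \delta c' \|x - y\|_\infty
\end{aligned}$$
and $\delta c' < 1$ so $T$ is a contraction.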
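The map $T$ can be iterated numerically to see the contraction at work. Everything specific in this sketch is an assumption made for illustration: the equation $\dot{x} = x$ with $t_0 = 0$, $x_0 = 1$ (so that $c = 3$, $c' = 2$ on $D = [-1, 1] \times [0, 2]$ and $\delta = 0.3$ is admissible), the uniform grid, the trapezoidal rule for the integral, and all function names.

```python
import math

def picard_step(f, ts, xs, i0, x0):
    """Apply T on a grid: T(x)(t) = x0 + integral from ts[i0] to t of f(s, x(s)) ds."""
    g = [f(t, x) for t, x in zip(ts, xs)]
    out = [0.0] * len(ts)
    out[i0] = x0
    for i in range(i0 + 1, len(ts)):          # integrate forwards from t0
        out[i] = out[i - 1] + 0.5 * (ts[i] - ts[i - 1]) * (g[i] + g[i - 1])
    for i in range(i0 - 1, -1, -1):           # integrate backwards from t0
        out[i] = out[i + 1] - 0.5 * (ts[i + 1] - ts[i]) * (g[i] + g[i + 1])
    return out

def sup_dist(xs, ys):
    """Uniform distance between two grid functions."""
    return max(abs(a - b) for a, b in zip(xs, ys))

def picard(f, t0, x0, delta, steps=200, iterations=12):
    """Iterate T starting from the constant function x0 on [t0 - delta, t0 + delta]."""
    ts = [t0 - delta + 2 * delta * i / (2 * steps) for i in range(2 * steps + 1)]
    xs = [x0] * len(ts)
    gaps = []                                  # sup distance between successive iterates
    for _ in range(iterations):
        new = picard_step(f, ts, xs, steps, x0)
        gaps.append(sup_dist(new, xs))
        xs = new
    return ts, xs, gaps

ts, xs, gaps = picard(lambda t, x: x, 0.0, 1.0, 0.3)
print(max(abs(x - math.exp(t)) for t, x in zip(ts, xs)))
print(gaps[:4])
```

The successive gaps shrink by at least the factor $\delta c' = 0.6$ each time, and the iterates approach $e^t$, the solution of this particular equation, up to the discretisation error of the grid.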
7. Compactness.
7.1 The Bolzano-Weierstrass theorem in real analysis says that a bounded sequence in $\mathbb{R}$ has a convergent subsequence.
7.2 A metric space $(X, d)$ is said to be sequentially compact if every sequence in $X$ has a convergent subsequence. If $A \subseteq X$ we say that $A$ is sequentially compact if $A$ is empty or if $(A, d_A)$ is sequentially compact, where $d_A$ is the subspace metric.
7.3 Any bounded closed interval $[a, b] \subseteq \mathbb{R}$ is sequentially compact. This is one way of stating the Bolzano-Weierstrass theorem: if $a \leq a_n \leq b$ and $(a_{n_j})$ is convergent then $a \leq \lim_{j \to \infty} a_{n_j} \leq b$, so any sequence in $[a, b]$ has a subsequence which converges in $[a, b]$. But $(0, 1]$ is not sequentially compact because the sequence $a_n = 1/n$ does not have a subsequence which converges in $(0, 1]$.
7.4 Proposition. Suppose $(X, d)$ is sequentially compact. Then
(i) $X$ is complete;
(ii) $X$ is bounded;
(iii) if $Z \subseteq X$ is closed then $Z$ is sequentially compact.
Proof: (i) If $(a_n)$ is a Cauchy sequence in $X$, then by sequential compactness it has a convergent subsequence
by 3.8. If $Z$ is sequentially compact then there is a subsequence $(a_{n_j})$ with $a_{n_j} \to a' \in Z$. But by 2.6(d), $a_{n_j} \to a$, so $a = a'$, so $a \in Z$. This is a contradiction.
7.6 A complete bounded metric space need not be sequentially compact: for instance $\mathbb{N}$ with the discrete metric is complete and bounded but $a_n = n$ is an example of a sequence with no convergent subsequence.
7.7 For subsets of $\mathbb{R}^n$ (with the usual metric) things are simpler. By the Bolzano-Weierstrass theorem, a closed bounded subset of $\mathbb{R}$ is sequentially compact: indeed, by 7.1 and 7.4 a subset $Z \subseteq \mathbb{R}$ is sequentially compact if and only if it is closed and bounded (and closed subsets of $\mathbb{R}$ are the same as complete subsets of $\mathbb{R}$, by 5.6). It is easy to see that $\mathbb{R}$ can be replaced by $\mathbb{R}^n$ here (see 7.8 below). Thus "sequentially compact" is the same as "closed and bounded" for subsets of $\mathbb{R}^n$. However, this is a special fact about $\mathbb{R}^n$ (or $\mathbb{C}^n$): it doesn't work for all complete metric spaces.
7.8 $Z_1 \times Z_2 \subseteq X_1 \times X_2$ is sequentially compact if and only if $Z_1$ and $Z_2$ are both sequentially compact.
This is quite straightforward. We may assume $Z_1$ and $Z_2$ are non-empty. If $(a_n, b_n)$ is a sequence in $Z_1 \times Z_2$ and $(a_{n_i}, b_{n_i})$ is a convergent subsequence then $(a_{n_i})$ is a convergent subsequence of $(a_n)$ in $Z_1$. So if $Z_1 \times Z_2$ is sequentially compact and $(a_n)$ is a sequence in $Z_1$, we take any point $b \in Z_2$ and take a convergent subsequence $(a_{n_i}, b)$ of the sequence $(a_n, b)$ in $Z_1 \times Z_2$: then $(a_{n_i})$ is a convergent subsequence of $(a_n)$ in $Z_1$.
Conversely, suppose $Z_1$ and $Z_2$ are sequentially compact: let $(a_n, b_n)$ be a sequence in $Z_1 \times Z_2$. Then $(a_n)$ has a convergent subsequence $(a_{n_i})$ tending to $a$, and $(b_{n_i})$ has a convergent subsequence $(b_{n_{i_j}})$ tending to $b$. The subsequence $(a_{n_{i_j}}, b_{n_{i_j}})$ tends to $(a, b)$ in $Z_1 \times Z_2$ so $Z_1 \times Z_2$ is sequentially compact.
7.9 Definition. An open cover of a metric space $(X, d)$ is a collection $\{U_\alpha\}_{\alpha \in A}$ of open subsets of $X$ such that $\bigcup_{\alpha \in A} U_\alpha = X$.
If $Z \subseteq X$ we can say that an open cover of $Z$ is an open cover of $(Z, d_Z)$; but it is sometimes more useful to think of an open cover of $Z$ in this case as being a collection $\{U_\alpha\}_{\alpha \in A}$ of open subsets of $X$ such that $\bigcup_{\alpha \in A} U_\alpha \supseteq Z$. By 3.12 this is essentially the same thing.
7.10 Definition. A metric space $(X, d)$ is said to be compact if every open cover has a finite subcover. That is, given any open cover $\{U_\alpha\}$ of $X$ there is a finite collection $\{U_{\alpha_1}, \ldots, U_{\alpha_N}\}$ such that $\bigcup_{i=1}^N U_{\alpha_i} = X$.
This means that whenever you have an open cover you only actually need finitely many of the sets in it. It cannot be emphasised strongly enough that this has to be true for every open cover, not just one you happen to know about.
7.11 The way to use this is very often as follows: you have a condition which involves the choice of some number $\delta$, depending on where in your metric space you are. You would like to be able to make the choice uniformly, without having to know where you are first (compare being able to choose $N$ first in the definition of uniform convergence). You do know that once you've made a choice of $\delta$, say $\delta_x$, at a point $x \in X$ it will work in some small neighbourhood of $x$ (usually a ball), and you take these neighbourhoods to be your $U_\alpha$s. (For this reason the index set $A$ is most often simply $X$.) Then you want to take the inf (or sup) of the $\delta_x$s: in general you can't (or you can but you might get zero). But if you know that in practice you only need finitely many of the $U_\alpha$s you can take the sup or inf of those finitely many $\delta_x$s, which is all right.
7.12 Theorem. A metric space $(X, d)$ is compact if and only if it is sequentially compact.
We'll prove this later on (7.17).
7.13 Proposition. If Z is a compact metric space and f : Z ! R is continuous then f is bounded and
attains its bounds.
That is, there exists 2 Z such that f ( ) = sup f (x).
Proof: If $f$ is unbounded we can find $a_n \in Z$ such that $|f(a_n)| \to \infty$. But then we can find a convergent
subsequence $(a_{n_i})$ with limit $a \in Z$, and $f(a_{n_i}) \to f(a) < \infty$, which is a contradiction.
If no point $\xi \in Z$ with $f(\xi) = \sup_t f(t)$ exists then $g(x) = (f(x) - \sup_t f(t))^{-1}$ is a continuous (negative) function, so it is bounded
below, by $-M$ say. But then $f(x) - \sup_t f(t) \le -1/M$ for all $x \in Z$, and that contradicts the definition of the
supremum.
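Two standard examples (added as a sketch, not part of the original notes) suggest why compactness cannot simply be dropped from 7.13: on a non-compact domain a continuous function may be unbounded, or bounded without attaining its supremum.

```latex
\begin{align*}
f &: (0,1] \to \mathbb{R}, & f(x) &= 1/x, & \sup f &= \infty \quad \text{(unbounded)}, \\
g &: (0,1) \to \mathbb{R}, & g(x) &= x, & \sup g &= 1, \quad \text{but } g(x) < 1 \text{ for all } x \in (0,1).
\end{align*}
```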
7.14 A continuous image of a compact set is compact. That is, if $f : X \to Y$ is continuous and $Z \subseteq X$ is
compact then $f(Z) \subseteq Y$ is compact.
This is easiest via sequential compactness. If $(b_n)$ is a sequence in $f(Z)$ then choose $a_n \in Z$ such that
$f(a_n) = b_n$. Then $(a_n)$ has a convergent subsequence $a_{n_i} \to a \in Z$ and, applying $f$, we have $b_{n_i} \to f(a) \in f(Z)$.
7.15 The Heine-Borel theorem states (in the language we have now) that $[0, 1]$ is a compact subset of $\mathbb{R}$.
7.16 A circle is compact; so is a sphere or a torus. A sphere with a point missing is not compact. A closed
disc in $\mathbb{C}$ is compact; an open disc is not.
7.17 We shall now prove 7.12.
(Compact $\Rightarrow$ sequentially compact.) Suppose $(a_n)$ is a sequence in a compact metric space $X$ and $(a_n)$
has no convergent subsequence. Then for each $x \in X$ there is an $r_x > 0$ such that $\{n \in \mathbb{N} \mid a_n \in B_{r_x}(x)\}$
is finite: if this were not true for some $x$ we could choose a subsequence tending to $x$. The $B_{r_x}(x)$ between
them cover $X$, so we choose a finite subcover, $X = \bigcup_{i=1}^{N} B_{r_{x_i}}(x_i)$. If
$$m > \max_{1 \le i \le N} \left( \max\{n \in \mathbb{N} \mid a_n \in B_{r_{x_i}}(x_i)\} \right),$$
which exists since the sets are finite, then $a_m \notin X$, which is a contradiction.
(Sequentially compact $\Rightarrow$ compact.) Take any open covering $\{U_\alpha \mid \alpha \in A\}$ of $X$. For each $x \in X$ we define
$$r(x) = \sup\{r > 0 \mid B_r(x) \subseteq U_\alpha \text{ for some } \alpha \in A\}.$$
This means that $B_{r(x)}(x)$ is (roughly speaking) the biggest ball around $x$ that can be fitted into one of the
patches in the cover. A really big ball might spread outside all the patches that contain $x$. It's possible that
$r(x) = \infty$, that is, the set is unbounded, but that will make no difference.
We then put $s(x) = \min(\tfrac12 r(x), 1)$. The 1 is put there to make sure that we are writing something finite:
the $\tfrac12$ (which could as well be any constant less than 1) is there to make sure $s(x)$ isn't as big as $r(x)$. By
definition there are $U_\alpha$s which contain $B_{s(x)}(x)$: choose one of them for each $x \in X$ and call it $U_{\alpha(x)}$.
The idea is that we shall cover $X$ with finitely many of the $B_{s(x)}(x)$s and therefore with finitely many of the
$U_{\alpha(x)}$s. If we can do this we'll have shown that $\{U_\alpha\}$ has a finite subcover.
Suppose we can't do this. Then we can pick a sequence $(a_n)$ in $X$ by starting with any $a_1$ and requiring that
$a_n \notin \bigcup_{i<n} B_{s(a_i)}(a_i)$. Such an $a_n$ must always exist, otherwise the first $n - 1$ of the $B_{s(a_i)}(a_i)$ have covered $X$,
which we supposed couldn't be done. Having got this sequence we can apply sequential compactness to
extract a subsequence $(a_{n_j})$ tending to $a \in X$.
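If $m > n$ we have $d(a_m, a_n) \ge s(a_n)$, since if $d(a_m, a_n) < s(a_n)$ then $a_m \in B_{s(a_n)}(a_n)$, which we forbade. So
$$s(a_{n_j}) \le d(a_{n_{j+1}}, a_{n_j}) \le d(a_{n_{j+1}}, a) + d(a, a_{n_j}),$$
so $s(a_{n_j}) \to 0$ as $j \to \infty$.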
On the other hand we can choose $j \in \mathbb{N}$ such that $d(a_{n_j}, a) < \tfrac12 s(a)$, and if we do that then
$B_{s(a)/2}(a_{n_j}) \subseteq B_{s(a)}(a) \subseteq U_{\alpha(a)}$. (Check: if $y \in B_{s(a)/2}(a_{n_j})$ then $d(y, a) \le d(y, a_{n_j}) + d(a_{n_j}, a) < \tfrac12 s(a) + \tfrac12 s(a) = s(a)$, so
$y \in B_{s(a)}(a)$.) This means that $r(a_{n_j}) \ge \tfrac12 s(a)$, since $r(a_{n_j})$ is as big as a ball centred at $a_{n_j}$ can be without
needing another patch to cover it, and the $\tfrac12 s(a)$-ball, we have just checked, needs only one patch, namely
$U_{\alpha(a)}$. So
$$s(a_{n_j}) = \min\left(r(a_{n_j})/2,\ 1\right) \ge \min\left(s(a)/4,\ 1\right),$$
but this is a positive constant so $s(a_{n_j})$ does not tend to zero.
Now we have reached a contradiction, so our assumption that we couldn't cover $X$ with the $B_{s(x)}(x)$s must
have been false. This gives a finite subcover of $\{U_\alpha\}$, so $X$ is compact.
8. Connectedness.
8.1 Theorem. The following conditions on a metric space $(X, d)$ are equivalent:
(i) $X = X_0 \cup X_1$ with $X_0, X_1$ open and nonempty and $X_0 \cap X_1 = \emptyset$.
(ii) $X = X_0 \cup X_1$ with $X_0, X_1$ closed and nonempty and $X_0 \cap X_1 = \emptyset$.
(iii) There exists a continuous function $f : X \to E$, where $E$ has the discrete metric, such that $f$ is not
constant.
Proof: (i) is equivalent to (ii) because if $X_0$ and $X_1$ are both open (or both closed) then $X_0 = X \setminus X_1$ and
Proof: If $X \times Y$ is disconnected let $f : X \times Y \to E$ be continuous and nonconstant, $E$ discrete. Choose $y \in Y$ and consider
$g_y(x) = f(x, y)$. This is a continuous function $g_y : X \to E$, so it is constant, say $g_y(x) = h(y)$. But now
$h : Y \to E$ is continuous, as $h(y) = f(x, y)$ for any $x \in X$, and so it is also constant. But that means $f$ is
constant, which is a contradiction.
8.5 If $X' \subseteq X$ is closed, open and connected then $X'$ is said to be a connected component of $X$. Any locally
connected metric space $X$ can be written uniquely as a disjoint union of connected components, namely as
the union of all the connected components of $X$. Any two different connected components $X'$, $X''$ of $X$
must be disjoint, as if $X' \cap X''$ is neither $\emptyset$ nor $X'$ then $X'$ is disconnected as $X' = (X' \cap X'') \cup (X' \setminus X'')$. By "locally
connected" I mean that every point $x \in X$ has a connected neighbourhood.
8.6 A metric space $X$ is path-connected if given any two points $x_0, x_1 \in X$ there is a continuous map
$\gamma : [0, 1] \to X$ such that $\gamma(0) = x_0$ and $\gamma(1) = x_1$.
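As a simple illustration (a standard example, added here as a sketch; the symbol $\gamma$ for the path is an assumed name), $\mathbb{R}^n$ with the Euclidean metric is path-connected, because any two points are joined by the straight-line path:

```latex
\begin{align*}
\gamma(t) &= (1 - t)\,x_0 + t\,x_1, \qquad t \in [0, 1], \\
\gamma(0) &= x_0, \qquad \gamma(1) = x_1, \\
|\gamma(t) - \gamma(s)| &= |t - s|\,|x_1 - x_0| \quad \text{(so $\gamma$ is continuous)}.
\end{align*}
```

The same formula works in any convex subset of $\mathbb{R}^n$, since the segment between two points of a convex set stays inside the set.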
8.7 Proposition. If $X$ is path-connected then it is connected.
Proof: If $X$ is disconnected then there is a nonconstant continuous function $f : X \to E$. Put $h(x) = 1$ if
$\{(x, y) \in \mathbb{R}^2 \mid x \ge 0,\ y = \sin(1/x) \text{ if } x \ne 0\}$
is connected but not path-connected. But this doesn't often happen.