GTM298 - More Explorations in Complex Functions (2023)


Graduate Texts in Mathematics

Richard Beals
Roderick S. C. Wong

More Explorations
in Complex Functions
Graduate Texts in Mathematics 298
Graduate Texts in Mathematics

Series Editors:
Patricia Hersh, University of Oregon
Ravi Vakil, Stanford University
Jared Wunsch, Northwestern University

Advisory Board:
Alexei Borodin, Massachusetts Institute of Technology
Richard D. Canary, University of Michigan
David Eisenbud, University of California, Berkeley & SLMath
Brian C. Hall, University of Notre Dame
June Huh, Princeton University
Akhil Mathew, University of Chicago
Peter J. Olver, University of Minnesota
John Pardon, Princeton University
Jeremy Quastel, University of Toronto
Wilhelm Schlag, Yale University
Barry Simon, California Institute of Technology
Melanie Matchett Wood, Harvard University
Yufei Zhao, Massachusetts Institute of Technology

Graduate Texts in Mathematics bridge the gap between passive study and creative
understanding, offering graduate-level introductions to advanced topics in mathemat-
ics. The volumes are carefully written as teaching aids and highlight characteristic
features of the theory. Although these books are frequently used as textbooks in
graduate courses, they are also suitable for individual study.
Richard Beals • Roderick S. C. Wong

Richard Beals
Department of Mathematics
Yale University
New Haven, CT, USA

Roderick S. C. Wong
City University of Hong Kong
Kowloon Tong, Hong Kong

Graduate Texts in Mathematics
ISSN 0072-5285 (print); ISSN 2197-5612 (electronic)
ISBN 978-3-031-28287-4 (print); ISBN 978-3-031-28288-1 (eBook)
https://doi.org/10.1007/978-3-031-28288-1

Mathematics Subject Classification: 30-01, 33-01, 30D35, 33E05, 11M06

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Switzerland AG 2023
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface

In the preface to Explorations in Complex Functions, the authors noted that “A first
course in complex analysis introduces keys that open many doors. … The doors open
on many subjects of interest. Too many subjects, in fact, to cover in a single follow-
up course. … Our purpose is to provide brief, but self-contained introductions to
many of the subjects alluded to above.” We felt that such a book could be useful – for
independent reading, as a source of material for presentation in a seminar, or as a text
for a second course in the subject. The first author used the material of Explorations
in this way in a one-semester course for two successive years. The courses covered
two different (though overlapping) subsets of roughly half of the chapters past the
“basics.”
Anyone familiar with complex analysis could see that Explorations did not exhaust
the topics that such a book might cover. Eventually the authors decided that the book
did not completely exhaust themselves, either. We envision the same kind of uses
for the present book – independent reading, seminar topics, or for a semester or
year-long second course in complex analysis that gives a broad overview of some
important parts of the subject.
The present book is independent of, and has minimal overlap with, Explorations,
and is essentially self-contained. We begin with two chapters that are meant to be
used as a resource, rather than as regular reading – sections of these chapters can be
drawn on as needed as background for later chapters. Both the introductory chapters
contain proofs, or sketches of proofs, of all the material that they cover. The first
chapter is (almost) the same as in Explorations, reviewing material that is standard
in an introductory course. The second chapter covers, quickly, some topics from
that book and some additional topics that will be used in more than one chapter in
this book. These additions include Carathéodory’s theorem that a conformal map
between Jordan domains extends to the boundaries, and an introduction to weak
solutions and Weyl’s lemma.
The order of the remaining chapters is somewhat arbitrary. Chapter 3 and Chapter 4
are stand-alone introductions to complex dynamics and to univalent function theory,
respectively. Chapter 3 treats iteration of a rational function. It covers basic facts
about the Fatou and Julia sets and the roles played by different types of fixed points.


Chapter 4 begins with a capsule history of the Bieberbach conjecture, introduces the
basic results of Koebe and Bieberbach, and continues through Carathéodory conver-
gence and Loewner’s equation. After covering the Robertson and Milin conjec-
tures, the chapter ends with Weinstein’s short proof of de Branges’s theorem: the
verification of the Bieberbach conjecture.
The next three chapters can be treated as a unit leading to the uniformization
theorem: the characterization of simply connected Riemann surfaces. Chapter 5
follows Perron’s approach to the Dirichlet problem via subharmonic functions.
General Riemann surfaces, universal covers, cover transformations, and some conse-
quences of the uniformization theorem are covered in Chapter 6. Chapter 7 contains
the proof of the uniformization theorem itself.
Chapter 8 and Chapter 9 carry the theory of Riemann surfaces further. Chapter 8
is a stand-alone introduction to quasiconformal mapping through modules, extremal
ring domains, the Beurling–Ahlfors extension, Hölder continuity, and the Beltrami
equation. This chapter paves the way for the use of the uniformization theorem and
quasiconformal equivalence to attack the problem of moduli of Riemann surfaces in
Chapter 9 on Teichmüller theory.
The remaining five chapters are (largely) stand-alone introductions to topics of
both theoretical and applied interest. Chapter 10 treats the Bergman kernel and the
Bergman metric, with applications to conformal mapping for simply connected and
multiply connected plane domains and to the Dirichlet problem. Chapter 11 intro-
duces theta functions, particularly for hyperelliptic curves, and the approaches of
Riemann and Weierstrass to the Jacobi inversion problem.
The final three chapters have applications to approximation theory and to asymp-
totics. Chapter 12 deals with Padé approximants and the connections with continued
fractions, orthogonal polynomials, and the Stieltjes transform. Chapter 13 treats
the original Riemann–Hilbert problem and some of its generalizations and appli-
cations, such as integral transforms and integral equations. Chapter 14 covers
Darboux’s method for computing asymptotics of Maclaurin expansions, and some
recent generalizations.
Altogether, there is more material here than one could expect to cover in a year-
long course in complex analysis. How much can be covered in one or two semesters
will depend on the degree of preparation of the class. The authors hesitate, therefore,
to make specific suggestions – especially since the choice of topics will depend very
much on the interests of the instructor and/or the students. It has been pointed out to
the authors that material selected from Chapters 2, 4, 5, 7, and 8 contains the function
theory background for some stochastic equations of current interest, such as SLE.
For an overview of dependence relations of chapters, see Fig. 0.1.
The authors gratefully acknowledge the patience and support of their wives,
Nancy Beals and Edwina Wong, and the encouragement and helpfulness of their
Springer editors, Loretta Bartolini and Elizabeth Loew. The authors are also grateful
to Professor Wei-Yuan Qiu for many helpful comments on Chapter 3.

Fig. 0.1 Chart of dependence relations among chapters

New Haven, Connecticut, USA Richard Beals


Kowloon Tong, Hong Kong Roderick S. C. Wong
Contents

1 Basics
1.1 Introduction; notation
1.2 The Cauchy–Riemann equations and Cauchy’s integral theorem
1.3 The Cauchy integral formula and applications
1.4 Change of contour, isolated singularities, residues
1.5 The logarithm and powers
1.6 Infinite products
1.7 Reflection principles
1.8 Analytic continuation
1.9 Harmonic functions
Remarks and further reading

2 Further preliminaries
2.1 Linear fractional transformations
2.2 Geometries
2.3 Normal families
2.4 Conformal equivalence and the Riemann mapping theorem
2.5 The triply-punctured sphere, Montel, and Picard
2.6 Jordan domains and Carathéodory’s extension theorem
2.7 Hilbert spaces
2.8 $L^p$ spaces and measure
2.9 Convolution, approximation, and weak solutions
2.10 The gamma function
Remarks and further reading

3 Complex dynamics
3.1 Fatou sets and Julia sets; some examples
3.2 Julia sets: invariance, density, and self-similarity
3.3 Fixed points and periodic points
3.4 Attracting, super-attracting, and repelling fixed points
3.5 Neutral fixed points
3.6 Parabolic fixed points
3.7 Perspectives: classification and the Mandelbrot set
Exercises
Remarks and further reading

4 Univalent functions and de Branges’s theorem
4.1 Bieberbach’s theorem and some consequences
4.2 The Bieberbach conjecture: history and strategy
4.3 The Carathéodory convergence theorem
4.4 Slit mappings and Loewner’s equation
4.5 The Robertson and Milin conjectures
4.6 Preparation for the proof of de Branges’s theorem
4.7 Proof of de Branges’s Theorem
Exercises
Remarks and further reading

5 Harmonic and subharmonic functions; the Dirichlet problem
5.1 Harmonic functions and the Poisson integral formula
5.2 Harnack’s principle; removable singularities
5.3 Subharmonic functions and Perron’s principle
5.4 Regular points and the solution of the Dirichlet problem
5.5 The $L^2$ approach to the Dirichlet problem
Exercises
Remarks and further reading

6 General Riemann surfaces
6.1 Abstract Riemann surfaces
6.2 The universal cover
6.3 Automorphism groups and cover transformations
Exercises
Remarks and further reading

7 The uniformization theorem
7.1 Green’s functions and harmonic measure
7.2 Uniformization: the hyperbolic case
7.3 An analogue of the Green’s function
7.4 Proof of the uniformization theorem, completed
Exercises
Remarks and further reading

8 Quasiconformal mapping
8.1 Quadrilaterals
8.2 Quasiconformal mappings
8.3 Regular quasiconformal maps
8.4 Ring domains
8.5 Extremal ring domains
8.6 Distortion properties and Hölder continuity
8.7 Quasisymmetry and quasi-isometry
8.8 Complex dilatation; the Beltrami equation
8.9 The Calderón–Zygmund inequality
Exercises
Remarks and further reading

9 Introduction to Teichmüller theory
9.1 Coverings, quotients, and moduli of compact Riemann surfaces
9.2 Homeomorphisms of Riemann surfaces
9.3 Homeomorphisms of compact Riemann surfaces
9.4 The Teichmüller space of a Riemann surface
9.5 The universal Teichmüller space
9.6 The Bers embedding
9.7 Further developments
9.8 Higher Teichmüller theory
Exercises
Remarks and further reading

10 The Bergman kernel
10.1 The reproducing kernel
10.2 Orthonormal bases
10.3 Conformal mapping, I
10.4 Conformal invariance and the Bergman metric
10.5 Conformal mapping, II
10.6 The kernel function and partial differential equations
Exercises
Remarks and further reading

11 Theta functions
11.1 Hyperelliptic curves
11.2 Cycles and differentials
11.3 Theta functions and Abel’s theorem
11.4 Jacobi inversion
Exercises
Remarks and further reading

12 Padé approximants and continued fractions
12.1 Padé approximants and Taylor series
12.2 Padé approximation and continued fractions
12.3 Another view of Padé approximants and continued fractions
12.4 The Stieltjes transform, Padé approximants, and orthogonal polynomials
12.5 Characterization of Stieltjes transforms
12.6 Stieltjes functions and Padé approximants
12.7 Generalized Shanks Transformation
12.8 Examples
12.9 Continued fraction expansions of $e^x$
Exercises
Remarks and further reading

13 Riemann–Hilbert problems
13.1 The Sokhotski–Plemelj formula
13.2 Riemann–Hilbert Problems
13.3 The Radon Transform and the Fourier transform
13.4 Integral Equations with Cauchy Kernels
13.5 Integral Equations with Algebraic Kernels
13.6 Integral Equations with Logarithmic Kernels
13.7 Singular Integral Equations
13.8 The other Riemann–Hilbert problem
Exercises
Remarks and further reading

14 Asymptotics and Darboux’s method
14.1 Algebraic singularities
14.2 Logarithmic singularities
14.3 Two coalescing singularities
14.4 Asymptotic nature of the expansion (14.3.24)
14.5 Heisenberg polynomials
Exercises
Remarks and further reading

References
Index
Chapter 1
Basics

This chapter begins with a brief summary of facts from a standard introductory com-
plex variables course: Cauchy’s formula and consequences, isolated singularities,
residues, and the complex logarithm. Also included are four topics that are not as
standard for an elementary course, but are used in the following chapters: reflection
properties, infinite products, analytic continuation, and harmonic functions. For all
this material we give brief discussions and sketches of proofs.

1.1 Introduction; notation

Throughout, a domain $\Omega$ is a connected non-empty open subset of the complex plane
$\mathbb{C}$. A function $f : \Omega \to \mathbb{C}$ is said to be $C^n$, or a $C^n$ function, if all partial derivatives
of $f$ of order $\le n$ exist and are continuous. The space of such functions is denoted
$C^n(\Omega)$. The space $C^\infty(\Omega)$ of $C^\infty$ functions is defined similarly. Partial derivatives
are often denoted by subscripts: $f_x$, $f_y$, $f_{xx}$, $f_{xy}$, etc.
A (parametrized) curve in Ω is a continuous function γ defined on a real interval
I = [a, b] and having values in Ω. We commit the usual abuse of terminology by
using the term “curve” interchangeably for the continuous function γ : I → Ω and
for the image γ (I ) in C. The image carries an orientation from the parametrization.
The curve $\gamma$ is said to be smooth if $\gamma$ is a $C^1$ function on the closed interval. The curve
$\gamma$ is said to be piecewise smooth if it is smooth on each of finitely many subintervals
whose union is $[a, b]$. In some contexts a curve, or a part of a curve, may be referred
to as an arc or a path.
Similarly, a curve $\gamma$ is said to be analytic if it is real-analytic, i.e. for each
$t_0 \in [a, b]$, $\gamma$ is given for nearby values by a convergent power series
\[ \gamma(t) = \sum_{n=0}^{\infty} a_n (t - t_0)^n . \]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
R. Beals and R. S. C. Wong, More Explorations in Complex Functions, Graduate Texts
in Mathematics 298, https://doi.org/10.1007/978-3-031-28288-1_1

Note that this means that γ can be extended so as to be defined on an open neigh-
borhood of its (original) domain, defined by these same power series for complex
values of t.
A curve $\gamma : [a, b] \to \mathbb{C}$ is said to be closed if the endpoints coincide: $\gamma(a) =
\gamma(b)$. A curve is said to be simple if its image has no self-intersections.
In this chapter it is assumed that any domain Ω that occurs is bounded and that the
boundary ∂Ω is the union of finitely many pairwise disjoint simple smooth closed
curves, oriented so that Ω lies to the left of each boundary curve.
If $\gamma : [a, b] \to \Omega$ is a curve and $f$ is a continuous function defined on the image
of $\gamma$, then the integral
\[ \int_\gamma f = \int_\gamma f(z)\,dz \]
is defined to be the limit, as $\max_j |x_{j+1} - x_j|$ tends to zero, of the Riemann sums over
partitions $a = x_0 < x_1 < \cdots < x_{n+1} = b$,
\[ \sum_{j=0}^{n} f(\gamma(x_j))\,[\gamma(x_{j+1}) - \gamma(x_j)]. \]
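As a purely illustrative numerical sketch, the Riemann-sum definition can be tried out directly in Python. The helper name `contour_integral`, the uniform partition, and the test function $1/z$ on the unit circle are choices made here, not taken from the text; left-endpoint sums converge slowly, so only rough agreement with the exact value $2\pi i$ should be expected.

```python
import cmath
import math

def contour_integral(f, gamma, a, b, n=20000):
    """Riemann sum  sum_j f(gamma(x_j)) * (gamma(x_{j+1}) - gamma(x_j))
    over the uniform partition of [a, b] into n subintervals."""
    total = 0j
    h = (b - a) / n
    prev = gamma(a)
    for j in range(n):
        nxt = gamma(a + (j + 1) * h)
        total += f(prev) * (nxt - prev)
        prev = nxt
    return total

def circle(t):
    """Unit circle, traversed once counter-clockwise for t in [0, 2*pi]."""
    return cmath.exp(1j * t)

# f(z) = 1/z on the unit circle; the exact integral is 2*pi*i.
approx = contour_integral(lambda z: 1 / z, circle, 0.0, 2 * math.pi)
print(approx)
```

Refining the partition (larger `n`) should bring the printed value closer to $2\pi i$.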

In the case of double integrals it will often be convenient to write $dm(z)$ for $dx\,dy$:
\[ \iint_\Omega f(x + iy)\,dx\,dy = \iint_\Omega f(z)\,dm(z). \]

(As usual, it is understood that x, y are real, and that a function of z = x + i y can
be considered as a function of x and y, and conversely.)
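If $z = x + iy$, the complex conjugate $\bar z$ is $x - iy$. Thus the real part $\operatorname{Re} z$ and
imaginary part $\operatorname{Im} z$ of $z = x + iy$ are
\[ x = \operatorname{Re} z = \frac{1}{2}(z + \bar z); \qquad y = \operatorname{Im} z = \frac{1}{2i}(z - \bar z). \]
The polar decomposition of $z = x + iy$ is essentially the representation of $z$ in polar
coordinates
\[ z = r e^{i\theta} = r(\cos\theta + i\sin\theta), \tag{1.1.1} \]
so
\[ r = \sqrt{x^2 + y^2}, \qquad \theta = \tan^{-1}\frac{y}{x}. \]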
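A small computational aside, offered as a sketch: the relation $\theta = \tan^{-1}(y/x)$ determines $\theta$ only up to a multiple of $\pi$, so in practice the quadrant of $(x, y)$ has to be taken into account. Python's standard `cmath.polar` does this and returns the principal argument in $(-\pi, \pi]$; the sample point below is an arbitrary choice.

```python
import cmath
import math

z = complex(-1.0, 1.0)           # a point in the second quadrant
r, theta = cmath.polar(z)        # r = |z|, theta = principal argument in (-pi, pi]

# Reassemble z from (1.1.1): z = r (cos(theta) + i sin(theta)).
z_back = r * complex(math.cos(theta), math.sin(theta))

# tan^{-1}(y/x) taken on its own lands in (-pi/2, pi/2): the wrong quadrant here.
naive_theta = math.atan(z.imag / z.real)

print(r, theta, naive_theta)
```

For this point the principal argument is $3\pi/4$, while the naive arctangent gives $-\pi/4$; the two differ by $\pi$.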

1.2 The Cauchy–Riemann equations and Cauchy’s integral theorem

Consider a function
\[ f(x + iy) = u(x, y) + i\,v(x, y), \qquad x + iy \in \Omega, \]
where $u$ and $v$ are real-valued $C^1$ functions. The complex-valued function $f$ is said
to be holomorphic (differentiable in the complex sense) if and only if $u$ and $v$ satisfy
the Cauchy–Riemann equations:
\[ u_x = v_y, \qquad u_y = -v_x, \tag{1.2.1} \]
where the subscripts denote partial differentiation.


Green’s theorem (or an argument due to Goursat that uses only pointwise differ-
entiability) yields the basic theorem of the subject.

Theorem 1.2.1. (Cauchy integral theorem) If $f$ is holomorphic in a domain $\Omega$,
and continuous on the closure of $\Omega$, then
\[ \int_{\partial\Omega} f(\zeta)\,d\zeta = 0. \tag{1.2.2} \]

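Let us pause to look at the Cauchy–Riemann equations and Cauchy’s theorem
from the point of view of differential forms and Green’s theorem. The pairs of 1-forms
$dz$, $d\bar z$ and $dx$, $dy$ are related by
\[ dz = dx + i\,dy, \qquad d\bar z = dx - i\,dy; \]
\[ dx = \frac{dz + d\bar z}{2}, \qquad dy = \frac{dz - d\bar z}{2i}. \]
Thus
\[ df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy
= \frac{1}{2}\left(\frac{\partial f}{\partial x} - i\frac{\partial f}{\partial y}\right) dz
+ \frac{1}{2}\left(\frac{\partial f}{\partial x} + i\frac{\partial f}{\partial y}\right) d\bar z. \]
It is natural to express this as
\[ df = \frac{\partial f}{\partial z}\,dz + \frac{\partial f}{\partial \bar z}\,d\bar z = \partial f\,dz + \bar\partial f\,d\bar z, \]
where
\[ \partial = \frac{\partial}{\partial z} = \frac{1}{2}\left(\frac{\partial}{\partial x} - i\frac{\partial}{\partial y}\right); \qquad
\bar\partial = \frac{\partial}{\partial \bar z} = \frac{1}{2}\left(\frac{\partial}{\partial x} + i\frac{\partial}{\partial y}\right). \tag{1.2.3} \]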

With $f = u + iv$ we find that
\[ \partial f = \frac{1}{2}\bigl[(u_x + v_y) + i(v_x - u_y)\bigr], \qquad
\bar\partial f = \frac{1}{2}\bigl[(u_x - v_y) + i(v_x + u_y)\bigr]. \tag{1.2.4} \]
Thus the Cauchy–Riemann equations (1.2.1) are equivalent to the single equation
$\bar\partial f = 0$. Moreover they imply that for holomorphic $f = u + iv$,
\[ \partial f = u_x + i v_x = f', \qquad \bar\partial \bar f = u_x - i v_x = \overline{f'}. \tag{1.2.5} \]
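The operators in (1.2.3) lend themselves to a quick numerical sanity check. The following is a minimal sketch using central differences; the function name `wirtinger`, the step size, and the two sample functions are assumptions made for illustration.

```python
def wirtinger(f, z, h=1e-6):
    """Central-difference approximations to (d f, dbar f) at z, following (1.2.3)."""
    fx = (f(z + h) - f(z - h)) / (2 * h)             # partial derivative in x
    fy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)   # partial derivative in y
    d_f = 0.5 * (fx - 1j * fy)
    dbar_f = 0.5 * (fx + 1j * fy)
    return d_f, dbar_f

z0 = 0.3 + 0.7j

# Holomorphic example: f(z) = z**3, so dbar f = 0 and d f = f'(z) = 3 z**2.
d_cube, dbar_cube = wirtinger(lambda z: z**3, z0)

# Non-holomorphic example: f(z) = conj(z), so d f = 0 and dbar f = 1.
d_conj, dbar_conj = wirtinger(lambda z: z.conjugate(), z0)

print(d_cube, dbar_cube, d_conj, dbar_conj)
```

Up to discretization and rounding error, the first pair should be close to $(3z_0^2, 0)$ and the second to $(0, 1)$, in line with (1.2.5) and with the equivalence of (1.2.1) and $\bar\partial f = 0$.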


A standard form of Green’s theorem is that if $\Omega$ is a domain, then
\[ \int_{\partial\Omega} [P\,dx + Q\,dy] = \iint_\Omega [Q_x - P_y]\,dx\,dy. \tag{1.2.6} \]

It is an exercise, using the identities above, to show that (1.2.6) is equivalent to the
equation
\[ \int_{\partial\Omega} [f\,dz + g\,d\bar z] = 2i \iint_\Omega [\bar\partial f - \partial g]\,dx\,dy. \tag{1.2.7} \]
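One possible way to carry out the computation in one direction is sketched here, using only $dz = dx + i\,dy$, $d\bar z = dx - i\,dy$ and the definitions (1.2.3):

```latex
\begin{align*}
f\,dz + g\,d\bar z &= (f + g)\,dx + i(f - g)\,dy,
  \quad\text{so } P = f + g,\ Q = i(f - g); \\
Q_x - P_y &= i(f_x - g_x) - (f_y + g_y)
  = i(f_x + i f_y) - i(g_x - i g_y)
  = 2i\,\bar\partial f - 2i\,\partial g .
\end{align*}
```

Substituting these expressions for $P$, $Q$, and $Q_x - P_y$ into (1.2.6) gives the stated identity.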

In particular, taking $g = 0$ we obtain a result known as the Cauchy–Green formula
\[ \int_{\partial\Omega} f\,dz = 2i \iint_\Omega \bar\partial f\,dx\,dy. \tag{1.2.8} \]
The case $\bar\partial f = 0$ is Cauchy’s theorem (1.2.2).
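The Cauchy–Green formula can be illustrated numerically on the unit disk. This is a sketch under assumptions chosen here: the boundary is the unit circle, the quadrature is an equally spaced sum in the angle, and the sample functions are $e^z$ (holomorphic) and $\bar z$ (for which $\bar\partial f = 1$, so the right side of (1.2.8) is $2i$ times the area $\pi$).

```python
import cmath
import math

def circle_integral(f, n=2000):
    """Approximate the integral of f(z) dz over the unit circle, counter-clockwise."""
    total = 0j
    for k in range(n):
        z = cmath.exp(2j * math.pi * k / n)
        dz = 1j * z * (2 * math.pi / n)      # dz = i e^{it} dt
        total += f(z) * dz
    return total

holomorphic = circle_integral(cmath.exp)                      # dbar f = 0
non_holomorphic = circle_integral(lambda z: z.conjugate())    # dbar f = 1 on the disk

print(holomorphic, non_holomorphic)
```

The first value should be negligibly small, as (1.2.2) predicts, and the second should be close to $2\pi i$.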


Another application of these identities is to the calculation of the area of the
image of a domain $\Omega$ under an injective holomorphic function $f$ whose first and
second partial derivatives are continuous up to the boundary. If $f = u + iv$, the area
is, taking account of the Cauchy–Riemann equations:
\[ \iint_\Omega \begin{vmatrix} u_x & v_x \\ u_y & v_y \end{vmatrix} dx\,dy
= \iint_\Omega \begin{vmatrix} u_x & v_x \\ -v_x & u_x \end{vmatrix} dx\,dy
= \iint_\Omega [u_x^2 + v_x^2]\,dx\,dy. \]

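Taking into account (1.2.5) we have two area formulas for injective holomorphic $f$:
\[ \operatorname{Area}\{f(\Omega)\} = \iint_\Omega |f'|^2\,dx\,dy = \iint_\Omega \partial f\,\overline{\partial f}\,dx\,dy. \tag{1.2.9} \]
Since $\bar\partial\partial f = 0$,
\[ \partial f\,\overline{\partial f} = \bar\partial(f'\,\bar f). \]
It follows from (1.2.8) that
\[ \operatorname{Area}\{f(\Omega)\} = \frac{1}{2i} \int_{\partial\Omega} f'(z)\,\overline{f(z)}\,dz. \tag{1.2.10} \]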
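The two area formulas can be compared numerically. The sketch below assumes a specific example chosen here, $f(z) = z + z^2/4$ on the unit disk, which is injective there because $f(z) - f(w) = (z - w)(1 + (z + w)/4)$; the function names and grid sizes are likewise illustrative. A direct calculation gives the area $9\pi/8$.

```python
import cmath
import math

def f(z):
    return z + z * z / 4        # injective on the unit disk (see the lead-in)

def fprime(z):
    return 1 + z / 2

def area_boundary(n=2000):
    """(1/2i) * integral over |z| = 1 of f'(z) * conj(f(z)) dz, as in (1.2.10)."""
    total = 0j
    for k in range(n):
        z = cmath.exp(2j * math.pi * k / n)
        total += fprime(z) * f(z).conjugate() * (1j * z) * (2 * math.pi / n)
    return (total / 2j).real

def area_double(nr=400, nt=400):
    """Midpoint rule in polar coordinates for the integral of |f'|^2, as in (1.2.9)."""
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) / nr
        for k in range(nt):
            z = cmath.rect(r, 2 * math.pi * (k + 0.5) / nt)
            total += abs(fprime(z)) ** 2 * r     # area element r dr dtheta
    return total * (1.0 / nr) * (2 * math.pi / nt)

print(area_boundary(), area_double(), 9 * math.pi / 8)
```

The three printed numbers should agree to several digits; the boundary integral needs far fewer function evaluations than the double integral.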

1.3 The Cauchy integral formula and applications

Much of basic complex function theory consists of exploring (fairly immediate)
consequences of the Cauchy integral theorem, Theorem 1.2.1. One such consequence
is the Cauchy integral formula. If $f$ is holomorphic in a general domain $\Omega$, and
continuous on the closure, we can apply (1.2.2) to the function
\[ g(w) = \frac{1}{2\pi i} \cdot \frac{f(w)}{w - z}, \qquad w \in \Omega, \]
on the domain $\Omega_\varepsilon$ formed by removing from $\Omega$ a small disk centered at $z$,
\[ D_\varepsilon(z) = \{w : w = z + r e^{i\theta},\ 0 \le r < \varepsilon,\ 0 \le \theta \le 2\pi\}. \]
The integral of $g$ over the boundary of $D_\varepsilon$, oriented in the positive (counter-clockwise)
direction, approaches $f(z)$ as $\varepsilon \to 0$; see the calculation (1.3.4). Taking the limit
yields the formula (1.3.1). This formula can be differentiated arbitrarily often.
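Theorem 1.3.1. (Cauchy integral formula) If $f$ is holomorphic in a domain $\Omega$,
and continuous on the closure of $\Omega$, then for each $z \in \Omega$,
\[ f(z) = \frac{1}{2\pi i} \int_{\partial\Omega} \frac{f(\zeta)}{\zeta - z}\,d\zeta. \tag{1.3.1} \]
More generally, each derivative can be written as an integral:
\[ f^{(n)}(z) = \frac{n!}{2\pi i} \int_{\partial\Omega} \frac{f(\zeta)\,d\zeta}{(\zeta - z)^{n+1}}. \tag{1.3.2} \]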

Thus a holomorphic function is infinitely differentiable. Moreover, if
\[ |z - z_0| < r = \inf_{\zeta \in \partial\Omega} |\zeta - z_0|, \]
then the expansion
\[ \frac{1}{\zeta - z} = \frac{1}{(\zeta - z_0)\left(1 - \dfrac{z - z_0}{\zeta - z_0}\right)} = \sum_{n=0}^{\infty} \frac{(z - z_0)^n}{(\zeta - z_0)^{n+1}} \]
converges uniformly for $\zeta \in \partial\Omega$. This gives:

Theorem 1.3.2. (Taylor expansion) If $f$ is holomorphic in a disk $D_r(z_0)$, then $f$
has a convergent Taylor expansion
\[ f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n, \quad |z - z_0| < r; \qquad a_n = \frac{f^{(n)}(z_0)}{n!}. \tag{1.3.3} \]

In other words, a holomorphic function is an analytic function of z.
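To see (1.3.2) and (1.3.3) at work, the Taylor coefficients can be approximated by discretizing the contour integral. This is only a sketch: the function name `taylor_coefficient`, the circle of radius 1 about $z_0$, the number of sample points, and the test function $e^z$ are all assumptions made for the example.

```python
import cmath
import math

def taylor_coefficient(f, z0, order, radius=1.0, n=400):
    """a_n = f^{(n)}(z0)/n!, computed from (1.3.2) on the circle |zeta - z0| = radius."""
    total = 0j
    for k in range(n):
        w = radius * cmath.exp(2j * math.pi * k / n)    # w = zeta - z0
        dzeta = 1j * w * (2 * math.pi / n)
        total += f(z0 + w) / w ** (order + 1) * dzeta
    return total / (2j * math.pi)

z0 = 0.5j
coeffs = [taylor_coefficient(cmath.exp, z0, k) for k in range(20)]

z = z0 + 0.3 - 0.2j                       # a point with |z - z0| < radius
partial_sum = sum(a * (z - z0) ** k for k, a in enumerate(coeffs))

print(partial_sum, cmath.exp(z))
```

For $f(z) = e^z$ the exact coefficients are $e^{z_0}/n!$, so both the individual coefficients and the partial sum can be compared against known values.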


Remark. If $f$ is holomorphic in a neighborhood of 0, the Taylor expansion centered
at $z = 0$,
\[ f(z) = \sum_{n=0}^{\infty} a_n z^n, \]
is often referred to as the Maclaurin expansion.

Other easy consequences of the Cauchy integral formula are various mean value
and maximum principles. For example, if $f$ is holomorphic in a domain that includes
the closure of a disk $D_r(z)$, then a change of variables
\[ \zeta = z + r e^{i\theta} \]
gives
\[ f(z) = \frac{1}{2\pi i} \int_{|\zeta - z| = r} \frac{f(\zeta)}{\zeta - z}\,d\zeta = \frac{1}{2\pi} \int_0^{2\pi} f(z + r e^{i\theta})\,d\theta. \tag{1.3.4} \]

One can also take the real or imaginary part of this formula.
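For readers who want the change of variables behind (1.3.4) spelled out, one way to write it is the following:

```latex
\[
\zeta = z + r e^{i\theta}, \quad d\zeta = i r e^{i\theta}\,d\theta, \quad \zeta - z = r e^{i\theta}
\;\Longrightarrow\; \frac{d\zeta}{\zeta - z} = i\,d\theta ,
\]
\[
\frac{1}{2\pi i} \int_{|\zeta - z| = r} \frac{f(\zeta)}{\zeta - z}\,d\zeta
= \frac{1}{2\pi i} \int_0^{2\pi} f(z + r e^{i\theta})\, i\,d\theta
= \frac{1}{2\pi} \int_0^{2\pi} f(z + r e^{i\theta})\,d\theta .
\]
```

The factor $r e^{i\theta}$ cancels, which is why the result is an unweighted average over the circle.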

Theorem 1.3.3. (Mean value property) If $f$ is holomorphic in a domain $\Omega$, then
the value of $f$ at each point $z_0 \in \Omega$ is the mean of the values on any circle
$\{z : |z - z_0| = r\}$ that is small enough so that $D_r(z_0)$ is contained in $\Omega$. The real and
imaginary parts of $f$ have the same property.

It is an easy consequence of Theorem 1.3.3 that the maximum value of the modulus
| f (z)|, or of the real or imaginary part of f , occurs at the boundary of Ω. A closer
examination of (1.3.4), taking into account the Taylor expansion, shows that no such
maximum value can occur at an interior point of Ω, unless f is constant near the
point.

Theorem 1.3.4. (Maximum modulus principle) If f is holomorphic in Ω and
continuous on the closure of Ω, then the maximum value of the modulus |f(z)| is
attained on the boundary. The same is true for the real and imaginary parts of f.

By assumption a domain Ω is connected, so it is easily seen that if f is constant
near a point, it is constant throughout Ω. Therefore Theorem 1.3.4 has a more precise
form.

Theorem 1.3.5. (Strong maximum modulus principle) If f is holomorphic in a
domain Ω, and the maximum modulus is attained at a point of Ω itself, then f is
constant. The same is true for the real and imaginary parts of f.

We note here another frequently used consequence of the Cauchy integral formula.

Proposition 1.3.6. Suppose that $\{f_n\}_{n=1}^{\infty}$ is a sequence of functions holomorphic in
a domain Ω, and suppose that the sequence converges to a function f, uniformly on
each compact subset of Ω. Then f is holomorphic in Ω.

In fact if z ∈ Ω, the convergence is uniform on a small circle Γ in Ω that encloses
z. Therefore in the disk bounded by Γ the limit function f is given by the Cauchy
integral formula, from which it follows that f is holomorphic in that disk.

An entire function is a function f that is holomorphic in the entire plane C. For
each R > 0 and each z ∈ C, (1.3.1) and (1.3.2) give

\[
f(z) = \frac{1}{2\pi i} \int_{|\zeta - z| = R} \frac{f(\zeta)}{\zeta - z}\, d\zeta
\]

and, more generally,

\[
f^{(n)}(z) = \frac{n!}{2\pi i} \int_{|\zeta - z| = R} \frac{f(\zeta)}{(\zeta - z)^{n+1}}\, d\zeta.
\]

Since the circle of integration has length 2πR and the modulus of the denominator
is $R^{n+1}$, it is easy to see that constraints on the growth of f can imply vanishing of
high order derivatives.

Theorem 1.3.7. (Liouville’s theorem) If f is entire and bounded, then f is constant.

Theorem 1.3.8. (Extended Liouville theorem) If f is entire and

\[
|f(z)| \le C(|z|^n + 1)
\]

for some integer n ≥ 0, then f is a polynomial of degree ≤ n.

1.4 Change of contour, isolated singularities, residues

The Cauchy integral theorem is often used to justify a change of contour in an
integration. This is particularly useful in the rest of this section. Rather than formulate
a general theorem, we illustrate with an example. Suppose that the domain Ω is
bounded by one large circle Γ and two smaller, disjoint circles, $\Gamma_1$, $\Gamma_2$, that are
enclosed by Γ, as in Figure 1.1 on the left. Suppose that f is holomorphic in Ω and
continuous on the closure. Then

\[
\int_{\Gamma} f(z)\, dz = \int_{\Gamma_1} f(z)\, dz + \int_{\Gamma_2} f(z)\, dz, \tag{1.4.1}
\]

where each circle is oriented in the positive (counter-clockwise) direction.

In fact, Theorem 1.2.1 implies that the integral of f over the contour on the right
in Figure 1.1 is zero. In the limit, as the gap is closed, the integrals over the flat parts
of the contour cancel, and we are left with (1.4.1) in the form

\[
\int_{\Gamma} f(z)\, dz - \int_{\Gamma_1} f(z)\, dz - \int_{\Gamma_2} f(z)\, dz = 0.
\]

An isolated singularity of a holomorphic function f is a point $z_0$ such that f is
holomorphic in a punctured disk $\Omega = \{z : 0 < |z - z_0| < r\}$.

Fig. 1.1 Change of contour in integration.

An isolated singularity $z_0$ is said to be a removable singularity if a value $f(z_0)$
can be assigned to f at $z_0$ in such a way that the extended function is holomorphic
in some disk $\{z : |z - z_0| < r\}$.
An isolated singularity $z_0$ is said to be a pole if there is some integer n > 0 such
that

\[
f(z) = \frac{a_{-n}}{(z - z_0)^n} + \frac{a_{1-n}}{(z - z_0)^{n-1}} + \cdots + a_0 + a_1 (z - z_0) + \cdots \tag{1.4.2}
\]

in some punctured disk $\{0 < |z - z_0| < r\}$, with $a_{-n} \ne 0$. The expansion (1.4.2) is
called the Laurent expansion of f at $z_0$. The order of the pole is n. A simple pole is
a pole of order 1.
Suppose that the function f is bounded and holomorphic in the punctured disk
$\{z : 0 < |z - z_0| < R\}$. Choosing a smaller radius r, we may assume that f is continuous
up to the circle $\{z : |z - z_0| = r\}$. Let $g(z) = (z - z_0) f(z)$ and $g(z_0) = 0$, so g is
continuous at $z_0$. Using the Cauchy integral formula for $\{z : \varepsilon < |z - z_0| < r\}$ and
letting ε → 0, we find that g is given by the Cauchy integral formula and is therefore
holomorphic near $z_0$. Since $g(z_0) = 0$, it follows that the same is true for $f = g/(z - z_0)$. Thus

Proposition 1.4.1. Suppose that $z_0$ is an isolated singularity of f and suppose that
f(z) is bounded for $0 < |z - z_0| < r$. Then $z_0$ is a removable singularity: f(z) has
a limit at $z = z_0$ and extends to be holomorphic in $D_r(z_0)$.

Corollary 1.4.2. Suppose that $z_0$ is an isolated singularity of f. Suppose that for
some integer n, $g(z) = (z - z_0)^n f(z)$ is bounded as $z \to z_0$, and suppose that n is
the least such integer. If n is negative, it follows that $z_0$ is a removable singularity, at
which f has a zero of order −n. If n is positive, then f has a pole of order n at $z_0$.

An isolated singularity that is neither removable nor a pole is called an essential
singularity. An example is the function g(z) = exp(1/z) on Ω = C \ {0}. In this
case the behavior near 0 is quite different. It is an exercise to show that g takes any
given non-zero value a infinitely often in each neighborhood of 0. A weaker version
of this is easily proved for essential singularities in general. (For a stronger version,
see Theorem 2.5.2.)

Theorem 1.4.3. (Casorati–Weierstrass theorem) Suppose that f is holomorphic
in a domain Ω except for an essential singularity at $z_0$ ∈ Ω. In each punctured
neighborhood $D_\varepsilon = \{z : 0 < |z - z_0| < \varepsilon\}$, f comes arbitrarily close to any given
complex number a.

Proof: Suppose, to the contrary, that $|f(z) - a| \ge \delta > 0$ in $D_\varepsilon$. Then
$g(z) = 1/[f(z) - a]$ has an isolated singularity at $z_0$. Moreover, g is bounded as $z \to z_0$, so
the singularity is removable. If $g(z_0) \ne 0$, then f has a removable singularity at $z_0$.
If g has a zero of order n > 0 at $z_0$, then f has a pole of order n at $z_0$. In either case
the singularity of f is not essential, a contradiction.
Let us return to the Laurent expansion (1.4.2). Suppose that f is holomorphic in
$\{z : 0 < |z - z_0| < R\}$. Then $(z - z_0)^{-1-n} f(z)$ can be integrated term-by-term over
the boundary of the domain $\{z : \varepsilon < |z - z_0| < r < R\}$. Taking ε → 0, we find that

\[
a_n = \frac{1}{2\pi i} \int_{|z - z_0| = r} \frac{f(z)\, dz}{(z - z_0)^{n+1}}. \tag{1.4.3}
\]

In particular, the coefficient $a_{-1}$ is defined to be the residue $\operatorname{res}(f, z_0)$ of f at $z_0$:

\[
\operatorname{res}(f, z_0) = \frac{1}{2\pi i} \int_{|z - z_0| = r} f(z)\, dz. \tag{1.4.4}
\]

A function f is said to be meromorphic in a domain Ω if f is holomorphic except
at isolated points that are poles of f. An application of Cauchy’s theorem to the
domain minus sufficiently small disks centered at the poles gives the following.

Theorem 1.4.4. (Residue theorem) If f has finitely many poles in Ω and is continuous
on the closure, then

\[
\frac{1}{2\pi i} \int_{\partial\Omega} f(\zeta)\, d\zeta = \sum_{z \in \Omega} \operatorname{res}(f, z). \tag{1.4.5}
\]

The residue theorem can be used to count poles and zeros (taking into account
multiplicities). In fact, suppose that near $z = z_0$, $f(z) = (z - z_0)^n g(z)$, where n is
an integer, g is holomorphic and $g(z_0) \ne 0$. Then

\[
\frac{f'(z)}{f(z)} = \frac{n}{z - z_0} + \frac{g'(z)}{g(z)}
\]

has residue n at $z_0$. As a consequence:

Theorem 1.4.5. (Counting zeros and poles) If f is meromorphic in Ω, and
continuous and nowhere zero at the boundary, then

\[
\frac{1}{2\pi i} \int_{\partial\Omega} \frac{f'(\zeta)}{f(\zeta)}\, d\zeta
= \text{number of zeros minus number of poles of } f \text{ in } \Omega, \tag{1.4.6}
\]

where the zeros and poles are counted according to multiplicity.
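
The count in (1.4.6) can be checked numerically. The following Python sketch approximates the integral over a circle by an equally spaced Riemann sum. The function name `zeros_minus_poles`, the rational test function, and the two radii are assumptions chosen for illustration only.

```python
import cmath
import math

def zeros_minus_poles(f, df, center, radius, n=2000):
    """Approximate (1/(2*pi*i)) * integral of f'/f over the circle |z - center| = radius."""
    total = 0j
    for k in range(n):
        theta = 2 * math.pi * k / n
        e = cmath.exp(1j * theta)
        zeta = center + radius * e
        dzeta = 1j * radius * e * (2 * math.pi / n)  # d(zeta) = i r e^{i theta} d(theta)
        total += df(zeta) / f(zeta) * dzeta
    return total / (2j * math.pi)

# Hypothetical test function: a double zero at 0, a simple zero at 1, a simple pole at -1/2.
f = lambda z: z**2 * (z - 1) / (z + 0.5)
df = lambda z: (2*z*(z - 1) + z**2) / (z + 0.5) - z**2 * (z - 1) / (z + 0.5)**2

count = zeros_minus_poles(f, df, center=0.0, radius=2.0)
print(round(count.real))  # zeros (3, with multiplicity) minus poles (1)
```

Shrinking the radius to 0.75 excludes the zero at 1, and the computed value drops by one accordingly.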

Corollary 1.4.6. If f is meromorphic in Ω and continuous on the boundary, then
it takes each value in the complement of f(∂Ω) the same number of times (counting
multiplicity) in each connected component of this complement.

Proof: If f does not take the value a on the boundary, then the integral

\[
N(a) = \frac{1}{2\pi i} \int_{\partial\Omega} \frac{f'(\zeta)}{f(\zeta) - a}\, d\zeta
\]

counts the number of times f takes the value a minus the number of poles. The number
of poles is constant, and N(a), being integer-valued and continuous with respect
to a, is also constant on the connected component of the complement that contains
a.

Here are two more applications of these ideas.

Theorem 1.4.7. (Rouché’s theorem) Suppose that f and g are holomorphic in Ω
and continuous on the closure. If |f(z) − g(z)| < |f(z)| on the boundary ∂Ω, then
f and g have the same number of zeros in Ω.

In fact the function $f_s(z) = (1 - s) f(z) + s g(z) = f(z) - s[f(z) - g(z)]$, 0 ≤ s ≤ 1,
has no zeros on ∂Ω, so the number of zeros in Ω is

\[
\frac{1}{2\pi i} \int_{\partial\Omega} \frac{f_s'(\zeta)}{f_s(\zeta)}\, d\zeta.
\]

This is an integer-valued continuous function of s, so it has the same value at s = 0
and at s = 1. But $f_0 = f$, $f_1 = g$.
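
A small numerical experiment may make Rouché’s theorem concrete. In the sketch below Ω is the unit disk, and the pair f(z) = 3z, g(z) = z^5 + 3z + 1 is a hypothetical example of our own choosing; on |z| = 1 one has |f − g| = |z^5 + 1| ≤ 2 < 3 = |f|. The zeros are counted through the total change of argument around the boundary, which equals the integral in (1.4.6) for a function without poles. The helper name `winding_count` is likewise an assumption.

```python
import cmath
import math

def winding_count(h, n=2000):
    """Net number of turns of h(z) around 0 as z traverses the unit circle once."""
    total = 0.0
    prev = h(1.0 + 0j)
    for k in range(1, n + 1):
        cur = h(cmath.exp(2j * math.pi * k / n))
        total += cmath.phase(cur / prev)  # small change of argument on each step
        prev = cur
    return round(total / (2 * math.pi))

f = lambda z: 3 * z
g = lambda z: z**5 + 3 * z + 1

# Hypothesis of the theorem, sampled at 360 boundary points.
boundary = [cmath.exp(2j * math.pi * k / 360) for k in range(360)]
hypothesis_holds = all(abs(f(z) - g(z)) < abs(f(z)) for z in boundary)

print(hypothesis_holds, winding_count(f), winding_count(g))
```

Both counts agree, so g has exactly one zero in the unit disk, just as f does.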

Theorem 1.4.8. (Inverse function theorem) Suppose that f is holomorphic near
$z_0$ and $f'(z_0) \ne 0$. Then f has an inverse that is holomorphic near $f(z_0)$.

In fact it follows from the series expansion at $z_0$ that for small r > 0, $f(z) \ne f(z_0)$
if $z \ne z_0$ is inside or on the curve $\Gamma = \{z : |z - z_0| = r\}$. Therefore if a is close enough
to $f(z_0)$, the integral

\[
\frac{1}{2\pi i} \int_{\Gamma} \frac{\zeta f'(\zeta)}{f(\zeta) - a}\, d\zeta
\]

is the unique value of z inside the curve such that f(z) = a. This expression is a
holomorphic function of a.
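
The integral formula for the inverse can be tried out directly. The Python sketch below evaluates it by an equally spaced sum on the circle. The name `inverse_by_contour`, the function f(z) = z + z^2, the radius 0.4, and the target value a are assumptions made for the example.

```python
import cmath
import math

def inverse_by_contour(f, df, a, z0, r, n=256):
    """Approximate (1/(2*pi*i)) * integral over |zeta - z0| = r of zeta f'(zeta)/(f(zeta) - a)."""
    total = 0j
    for k in range(n):
        e = cmath.exp(2j * math.pi * k / n)
        zeta = z0 + r * e
        dzeta = 1j * r * e * (2 * math.pi / n)
        total += zeta * df(zeta) / (f(zeta) - a) * dzeta
    return total / (2j * math.pi)

# Hypothetical example: f(z) = z + z^2 near z0 = 0, where f'(0) = 1.
f = lambda z: z + z * z
df = lambda z: 1 + 2 * z

a = 0.1 + 0.05j
z = inverse_by_contour(f, df, a, z0=0.0, r=0.4)
print(abs(f(z) - a))  # expected to be tiny
```

For this f the equation f(z) = a has a second root near −1, which lies outside the circle and therefore does not contribute.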

1.5 The logarithm and powers

In view of (1.1.1), the complex logarithm log z, z ≠ 0, is defined by

\[
\log z = \log(|z| e^{i \arg z}) = \log |z| + i \arg z. \tag{1.5.1}
\]



Here log |z| denotes the usual choice for positive argument; thus log |z| is real. Of
course arg z is defined only up to addition of an integer multiple of 2π . By a branch
of the logarithm in a domain Ω, we mean a choice that is holomorphic throughout
Ω. (Such a choice may not be possible, e.g. in a deleted neighborhood of the origin
{z : 0 < |z| < r }.) A branch is called the principal branch if Ω ∩ R is not empty and
log z is real on this intersection.
An important concept here is that of a simply connected domain, usually defined
to be one that is connected and in which each closed curve can be continuously
shrunk to a point. An equivalent definition is that Ω is connected and, given two
curves $\gamma_0$ and $\gamma_1$ in Ω that join points z and w, there is a family of curves
$\gamma_t : [0, 1] \to \Omega$, 0 < t < 1, such that $\gamma_t(0) = z$, $\gamma_t(1) = w$, and the map $(s, t) \mapsto \gamma_t(s)$
is continuous, 0 ≤ s, t ≤ 1. (Showing that the two definitions are equivalent is an
interesting exercise.)
Suppose that Ω is a simply connected domain. Suppose also that 0 is not in Ω.
Then a branch of the logarithm may be obtained by choosing $z_0$ ∈ Ω, choosing $\log z_0$,
and setting

\[
\log z = \log z_0 + \int_{z_0}^{z} \frac{d\zeta}{\zeta}. \tag{1.5.2}
\]

Because of the assumption that Ω is simply connected, the integral is independent
of the path of integration from $z_0$ to z: see Section 1.8 for details.
Corresponding to a branch of the logarithm, and to each α ∈ C, there is a branch
of the power $z^\alpha$:

\[
z^\alpha = e^{\alpha \log z}. \tag{1.5.3}
\]

This is independent of the branch of the logarithm if and only if α is an integer.

The next result is a generalization of Theorem 1.4.4 to the case in which the
function f is defined only in a neighborhood of the curve of integration.
Theorem 1.5.1. (Argument principle) Suppose that f is holomorphic in a neighborhood
of a closed curve Γ, and suppose that $z_0$ is not in the image f(Γ). Then
the integral

\[
n(z_0) = \frac{1}{2\pi i} \int_{\Gamma} \frac{f'(\zeta)\, d\zeta}{f(\zeta) - z_0}
\]

is an integer: the number of times that the curve f(Γ) wraps around $z_0$ in the positive
direction.
Proof: Let $g(\zeta) = f(\zeta) - z_0$. By assumption, g ≠ 0 for ζ ∈ Γ, so we may choose
a branch of the logarithm of g at a point $\zeta_0$ ∈ Γ and follow the logarithm continuously
along the curve. When we return to the starting point $\zeta_0$, the logarithm will have the
same real part as initially, but the imaginary part will differ by 2πn, where the integer
n can be interpreted as the number of times that g(Γ) wraps around the origin in the
positive direction. Equivalently, n is the number of times that f(Γ) wraps around $z_0$
in the positive direction. Thus

\[
\frac{1}{2\pi i} \int_{\Gamma} \frac{f'(\zeta)\, d\zeta}{f(\zeta) - z_0} = \frac{1}{2\pi i} \int_{\Gamma} \frac{g'(\zeta)\, d\zeta}{g(\zeta)} = \frac{2n\pi i}{2\pi i} = n.
\]

1.6 Infinite products

Infinite products are often written in the form

\[
\prod_{n=1}^{\infty} (1 - a_n), \tag{1.6.1}
\]

where the $a_n$ are complex numbers. The key tool to be used is the following estimate.

Lemma 1.6.1. Suppose |z| ≤ 1/2. Then the principal branch of log(1 − z) satisfies

\[
|\log(1 - z) + z| \le |z|^2 \le \frac{|z|}{2}. \tag{1.6.2}
\]

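Proof: Integrating along the line segment from 1 to 1 − z,

\[
\log(1 - z) = \int_1^{1-z} \frac{ds}{s} = -\int_0^{z} \frac{dt}{1 - t}
= -\int_0^{z} (1 + t + \cdots)\, dt = -z - \frac{z^2}{2} - \frac{z^3}{3} - \cdots,
\]

so

\[
|\log(1 - z) + z| \le \frac{|z|^2}{2} \left(1 + |z| + |z|^2 + \cdots\right) = \frac{|z|^2}{2} \cdot \frac{1}{1 - |z|} \le |z|^2.
\]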

The (formal) product (1.6.1) is said to converge if

\[
\lim_{M, N \to \infty} \prod_{n=M}^{N} (1 - a_n) = 1. \tag{1.6.3}
\]

This implies that the partial products $\prod_{n=M}^{N} (1 - a_n)$ have a non-zero limit as N → ∞, as soon
as M is large enough that n ≥ M implies $1 - a_n \ne 0$. In particular, a necessary
condition for convergence is that $1 - a_n \to 1$, i.e. $a_n \to 0$. Suppose that $|a_n| \le 1/2$
for n ≥ M. Then, taking the principal branch of the logarithm,

\[
\left| \log \prod_{n=M}^{N} (1 - a_n) \right| \le \sum_{n=M}^{N} |\log(1 - a_n)|.
\]

The product is said to be absolutely convergent if

\[
\prod_{n=1}^{\infty} (1 + |a_n|)
\]

converges. Absolute convergence implies convergence. It follows from (1.6.2) that
for large enough n,

\[
\frac{|a_n|}{2} \le |\log(1 + |a_n|)| \le \frac{3|a_n|}{2}.
\]

Therefore the product converges absolutely if and only if $\sum_{n=1}^{\infty} |a_n| < \infty$.
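
The criterion can be watched at work on two standard examples, which are our own choice and not taken from the text: $a_n = 1/n^2$, whose sum converges, and $a_n = 1/n$, whose sum diverges. Both products are started at n = 2 so that no factor vanishes. The partial products telescope to (N + 1)/(2N) and 1/N respectively. The helper name `partial_product` is an assumption.

```python
def partial_product(a, M, N):
    """Product of (1 - a(n)) for n = M, ..., N."""
    p = 1.0
    for n in range(M, N + 1):
        p *= 1.0 - a(n)
    return p

# Hypothetical examples, indexed from n = 2 so that no factor vanishes.
square_terms = lambda n: 1.0 / n**2    # the sum of |a_n| converges
harmonic_terms = lambda n: 1.0 / n     # the sum of |a_n| diverges

for N in (10, 1000, 100000):
    print(N, partial_product(square_terms, 2, N), partial_product(harmonic_terms, 2, N))
```

The first column of products approaches the non-zero limit 1/2, while the second tends to 0, so the second product does not converge in the sense of (1.6.3).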

1.7 Reflection principles

Theorem 1.7.1. Suppose that Ω is a domain that is symmetric under reflection about
the real axis: $\Omega = \overline{\Omega}$. Suppose also that f is holomorphic on the intersection $\Omega_+$ of Ω
with the upper half-plane H = {z : Im z > 0}, continuous up to I = Ω ∩ R, and
real on I. Then f has a holomorphic extension to the remainder of Ω, with

\[
f(\bar{z}) = \overline{f(z)}, \quad z \in \Omega_+. \tag{1.7.1}
\]

Proof: The prescription (1.7.1) defines f so as to be holomorphic in $\Omega \cap \mathbb{C}_-$, and
continuous in all of Ω. As a domain, Ω is connected, so I is not empty. We need to
show that f is holomorphic near I. Consider a complex neighborhood $D_r(x_0)$ of a
point $x_0$ ∈ I, whose closure is contained in Ω. Let

\[
\Delta_\pm = D_r(x_0) \cap \{z : \pm \operatorname{Im} z > 0\}, \tag{1.7.2}
\]

and

\[
g(z) = \frac{1}{2\pi i} \int_{|\zeta - x_0| = r} \frac{f(\zeta)\, d\zeta}{\zeta - z}, \quad |z - x_0| < r.
\]

This function is holomorphic in $D_r(x_0)$. For $z \in \Delta_+$ the lower semicircle of the
contour can be moved to the x-axis, showing that g = f on $\Delta_+$. Similarly, g = f
on $\Delta_-$ if we use (1.7.1) to define f on $\Delta_-$. It follows that (1.7.1) extends f
holomorphically across I.

Theorem 1.7.2. Suppose that Ω and I are as in the previous theorem. Suppose that
f is holomorphic in Ω ∩ H, nowhere zero, and continuous up to I. Suppose also
that |f(x)| = 1 for x ∈ I. Then f has a holomorphic extension to the remainder of
Ω, with

\[
f(\bar{z}) = 1/\overline{f(z)}. \tag{1.7.3}
\]

Proof: As in the previous proof, it is sufficient to work in a small disk $D_r(x_0)$. For
small r a branch g of log f can be chosen in $\Delta_+ = D_r(x_0) \cap \{z : \operatorname{Im} z > 0\}$. By the assumption on |f|, the
limit of ig is real on $D_r(x_0) \cap \mathbb{R}$. Therefore ig can be continued to all of $D_r(x_0)$.
The continuation of g, given by (1.7.1) for ig, exponentiates to the continuation of
f given by (1.7.3).

1.8 Analytic continuation

There are two situations that give rise to the consideration of analytic continuation.
An example of one such situation is the function f defined by the series

\[
f(z) = 1 + z + z^2 + z^3 + \cdots + z^n + \cdots. \tag{1.8.1}
\]

The series converges if and only if |z| < 1. On the other hand, the sum is 1/(1 − z),
which is holomorphic in the complement of the point z = 1. It is natural to consider
1/(1 − z) as a continuation of f : the extension of f to a function holomorphic on a
larger domain. A natural question: is such an extension unique?
An example of a second such situation is the logarithm. Starting with the usual
choice in a neighborhood of z = 2, and following along a curve that circles the origin
in the positive direction, one comes back not to log 2 but to log 2 + 2πi – but it is
natural to think of this “branch” as an analytic continuation of the original. For a
visualization, see Figure 1.2.

[Figure: successive sheets of the logarithm, labeled by the values log 1 = −2πi, 0, 2πi, 4πi and log(−1) = −πi, πi, 3πi.]

Fig. 1.2 Analytic continuation of the logarithm.
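
The passage from log 2 to log 2 + 2πi can be reproduced numerically by following the logarithm in small steps around the circle |z| = 2. The sketch below is our own illustration. The helper name `continue_log` and the step count are assumptions. Each increment uses the principal logarithm of a ratio close to 1, which is exactly the local branch that continues the previous value.

```python
import cmath
import math

def continue_log(path_points, start_value):
    """Follow a branch of log along a polygonal path of non-zero points."""
    value = start_value
    for prev, cur in zip(path_points, path_points[1:]):
        # cur/prev is close to 1, so its principal logarithm is the small increment.
        value += cmath.log(cur / prev)
    return value

n = 1000
circle = [2 * cmath.exp(2j * math.pi * k / n) for k in range(n + 1)]  # once around 0, from z = 2
end_value = continue_log(circle, math.log(2))
print(end_value - math.log(2))  # approximately 2*pi*i
```

Traversing the circle a second time adds another 2πi, which corresponds to moving up one more sheet in Figure 1.2.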

In general, suppose that $f_0$ is holomorphic in an open disk $D_0$ centered at $z_0$,
suppose that γ : [0, 1] → C is a curve with $\gamma(0) = z_0$, and suppose that $D_0$ does not
contain γ (we are systematically conflating γ as a mapping and γ as a set of points,
i.e. the image of the mapping). It may still be the case that we can find successive
points $z_j = \gamma(t_j)$ along the curve and functions $f_j$ holomorphic in disks $D_j$ centered
at $z_j$ such that $D_j \cap D_{j+1} \ne \emptyset$, $f_j = f_{j+1}$ on $D_j \cap D_{j+1}$, and the union of the $D_j$
covers γ. The result is a function f, holomorphic in a neighborhood of the curve γ,
that agrees with $f_0$ near $z_0$. The function f is said to be a continuation of $f_0$ along
the curve γ.
Proposition 1.8.1. (Uniqueness of analytic continuation) If two functions that are
holomorphic in a connected domain Ω agree on a non-empty open subset of Ω, then
they agree on all of Ω.

Proof: It suffices to prove that if f is holomorphic in Ω and vanishes near a point
$z_0$ ∈ Ω, then f is identically zero. Let z be another point of Ω and let γ : [0, 1] → Ω
be a smooth curve with $\gamma(0) = z_0$ and γ(1) = z. If f(γ(s)) = 0 for 0 ≤ s ≤ t, then
it follows that each derivative of f vanishes at γ(t). Thus the Taylor expansion
of f vanishes at γ(t), so f vanishes in a neighborhood of γ(t). It follows from this
argument that f vanishes along the entire curve, so f(z) = 0.

Recall from Section 1.5: a domain Ω ⊂ C is said to be simply connected if each
closed curve in Ω can be deformed continuously to a point (a constant curve). For
example, the plane C is simply connected, but the complement of any non-empty
bounded subset A is not. As noted above, an equivalent definition is that any two
curves from a point $z_0$ to a point $z_1$ can be deformed continuously from one to the
other.
Theorem 1.8.2. (Monodromy theorem) Suppose that the domain Ω is simply connected.
Suppose that $f_0$ is holomorphic in a domain $\Omega_0 \subset \Omega$, and suppose that $f_0$
can be continued along each curve in Ω. Then $f_0$ has a unique holomorphic extension
to all of Ω.

Proof: Take $z_0 \in \Omega_0$. It is enough to show that the continuation of $f_0$ along a curve
γ : [0, 1] → Ω that starts at $z_0$ leads to a value f(γ(1)) that depends only on $z_1 = \gamma(1)$,
not on the particular curve γ. Suppose that $\gamma_0$ and $\gamma_1$ are two such curves from
$z_0$ to $z_1$. Then there is a family of curves $\gamma_t$ from $z_0$ to $z_1$, 0 < t < 1, that interpolates
continuously from $\gamma_0$ to $\gamma_1$.
Suppose that $f_0$ is continued along each curve $\gamma_t$, and write $F_t(z_1)$ for the resulting
value at $z_1$. Let T be the supremum of those t such that $F_t(z_1) = F_0(z_1)$. It follows
from Proposition 1.8.1 that T is positive. It follows from continuity that $F_T(z_1) = F_0(z_1)$.
Then T = 1, since, otherwise, Proposition 1.8.1 implies that equality at $z_1$ can be
extended past t = T.

1.9 Harmonic functions

A function f that maps a domain Ω → C is said to be harmonic if it belongs to
$C^2(\Omega)$ and satisfies Laplace’s equation, the differential equation

\[
\Delta f \equiv f_{xx} + f_{yy} = 0. \tag{1.9.1}
\]

As an example, suppose that f : U → C is holomorphic. Writing f(x + iy) =
u(x, y) + iv(x, y), where u and v are real-valued, we note that the Cauchy–Riemann
equations $u_x = v_y$, $u_y = -v_x$ imply

\[
u_{xx} + u_{yy} = (v_y)_x - (v_x)_y = 0; \qquad v_{xx} + v_{yy} = -(u_y)_x + (u_x)_y = 0.
\]

Thus the real and imaginary parts of a holomorphic function are harmonic. This
proves the first half of the following proposition.
Proposition 1.9.1. Suppose that U is a simply connected domain in C. If f : U → C
is holomorphic, then its real part u is harmonic. Conversely, if U is simply connected
and a $C^2$ function u : U → R is harmonic, then u is the real part of a holomorphic
function f : U → C.

Proof: It is enough to prove the second statement for a disk D whose closure lies
in U; then the result follows by analytic continuation. We may translate and take
$D = D_r(0)$. If u is the real part of f, the Cauchy–Riemann equations tell us that
the gradient of the imaginary part v is given by $v_x = -u_y$, $v_y = u_x$. Therefore, for
z = x + iy ∈ D,

\[
\begin{aligned}
v(x, y) &= v(0) + \int_0^1 \frac{d}{ds}\, v(sx, sy)\, ds \\
&= v(0) + \int_0^1 \left[ -x\, u_y(sx, sy) + y\, u_x(sx, sy) \right] ds.
\end{aligned} \tag{1.9.2}
\]

Conversely, choose v(0) arbitrarily and define v by (1.9.2). The assumption that u is
harmonic implies that u and v, so defined, satisfy the Cauchy–Riemann equations.
Therefore f = u + iv is holomorphic.
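
Formula (1.9.2) is easy to test on an explicit harmonic function. The sketch below is an illustration under our own assumptions: it takes u = Re(z^3) = x^3 − 3xy^2, supplies its partial derivatives by hand, and evaluates the integral in (1.9.2) by the midpoint rule. The result should approximate Im(z^3) = 3x^2 y − y^3. The function name `harmonic_conjugate` and the sample point are hypothetical.

```python
def harmonic_conjugate(u_x, u_y, x, y, v0=0.0, n=200):
    """Midpoint-rule approximation of formula (1.9.2) for v(x, y), with v(0) = v0."""
    total = 0.0
    for k in range(n):
        s = (k + 0.5) / n
        total += -x * u_y(s * x, s * y) + y * u_x(s * x, s * y)
    return v0 + total / n

# Hypothetical example: u = x^3 - 3 x y^2 = Re(z^3).
u_x = lambda x, y: 3 * x**2 - 3 * y**2
u_y = lambda x, y: -6 * x * y

x, y = 0.3, 0.7
v = harmonic_conjugate(u_x, u_y, x, y)
print(v, (complex(x, y) ** 3).imag)
```

The two printed numbers agree up to the quadrature error of the midpoint rule.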

The function v is called a harmonic conjugate of u, and is often denoted $u^*$. It is
unique up to an additive constant.

Corollary 1.9.2. A harmonic function u is infinitely differentiable, and its Taylor
series sums to u in any disk that is contained in the domain of u.

Corollary 1.9.3. If u is harmonic and g is holomorphic, then u ◦ g is harmonic


where it is defined. In particular, dilations u (r ) (x, y) = u(r x, r y) are harmonic.

Proof: Locally u is the real part of a holomorphic function f, so u ◦ g is the real part of f ◦ g.

Remarks and further reading

Most undergraduate textbooks on complex analysis cover the basic complex analysis
in this chapter. Three classic complex analysis texts (Ahlfors [6], Hille [107], and
Titchmarsh [206]) cover, in addition, several of the topics in later chapters. For a
discussion of the development of the subject through the work of Cauchy, Riemann,
Weierstrass, and others, see Neuenschwander [153].
The special issue of the journal PRIMUS, vol. 27, issue 8-9 (2017) on “Revitalizing
Complex Analysis” contains a number of papers that explore topics in this chapter
and their applications.
Chapter 2
Further preliminaries

This chapter covers additional material that is used in more than one subsequent
chapter. The various sections here are meant to be read or consulted as needed for later
chapters, so that those chapters or sequence of chapters can be read independently.
Four fundamental domains in complex function theory are: the complex plane C
itself; the unit disk D,

\[
\mathbb{D} = \{z \in \mathbb{C} : |z| < 1\};
\]

the upper half-plane H,

\[
\mathbb{H} = \{z \in \mathbb{C} : \operatorname{Im} z > 0\};
\]

and the Riemann sphere S:

\[
\mathbb{S} = \mathbb{C} \cup \{\infty\}.
\]

The Riemann sphere is given a complex structure by taking a base of neighborhoods
of ∞ to be the sets

\[
\{z \in \mathbb{C} : |z| > R \ge 0\}.
\]

A function f is said to be holomorphic at ∞ if it is holomorphic in a neighborhood
of ∞ and g(z) = f(1/z) has a removable singularity at 0; equivalently, f is holomorphic
and bounded in some neighborhood of ∞. Poles and essential singularities
at ∞ are defined similarly.
It is also useful to have a special notation for the unit circle T, the boundary of
the unit disk:

\[
\mathbb{T} = \partial\mathbb{D} = \{\zeta \in \mathbb{C} : |\zeta| = 1\}.
\]

Sections 2.1 and 2.2 cover the automorphisms (bijective holomorphic self-maps)
of C, D, H, and S, and the geometries associated to these domains.
Section 2.3 introduces normal families and theorems of Ascoli–Arzelà and Montel.
In Section 2.4 this material is applied to the proof of the Riemann mapping
theorem.
The triply-punctured sphere and theorems of Picard and of Montel are introduced
in Section 2.5. Carathéodory’s extension theorem is proved in Section 2.6.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
R. Beals and R. S. C. Wong, More Explorations in Complex Functions, Graduate Texts
in Mathematics 298, https://doi.org/10.1007/978-3-031-28288-1_2

Sections 2.7 and 2.8 cover basic facts about Hilbert spaces and $L^p$ spaces and
measure. Convolution, approximation, and weak solutions are covered in Section
2.9. The gamma function and some of its properties are introduced in Section 2.10.

2.1 Linear fractional transformations

A linear fractional transformation, or Möbius transformation, is a function of the
form

\[
f(z) = \frac{az + b}{cz + d}, \qquad ad - bc \ne 0. \tag{2.1.1}
\]

Computing the composition of two such functions shows that the group of such
transformations is a homomorphic image of the group GL(2, C) of invertible 2 × 2
complex matrices:

\[
A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \mapsto f_A, \qquad f_A(z) = \frac{az + b}{cz + d}.
\]

Note that if α is a non-zero constant, then α A and A induce the same mapping. In
particular α can be chosen so that α A has determinant 1.
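
The correspondence between composition and matrix multiplication can be confirmed on sample data. In the Python sketch below the matrices A and B, the point z, and the helper names `mobius` and `matmul` are assumptions chosen for illustration. It checks that $f_A \circ f_B = f_{AB}$ at the sample point and that a scalar multiple of A induces the same map.

```python
def mobius(A, z):
    """Apply f_A(z) = (a z + b)/(c z + d) for A = ((a, b), (c, d)) at a finite z with c z + d != 0."""
    (a, b), (c, d) = A
    return (a * z + b) / (c * z + d)

def matmul(A, B):
    """Product of two 2 x 2 matrices given as nested tuples."""
    (a, b), (c, d) = A
    (p, q), (r, s) = B
    return ((a * p + b * r, a * q + b * s), (c * p + d * r, c * q + d * s))

# Hypothetical sample matrices with non-zero determinant, and a sample point.
A = ((1 + 2j, 3), (0.5j, 1))
B = ((2, -1j), (1, 4))
z = 0.7 - 0.2j

composed = mobius(A, mobius(B, z))
from_product = mobius(matmul(A, B), z)
scaled = mobius(tuple(tuple(5j * entry for entry in row) for row in A), z)
print(abs(composed - from_product), abs(scaled - mobius(A, z)))
```

Both printed differences are at the level of rounding error.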

Proposition 2.1.1. (a) Each linear fractional transformation is a bijective map from
the Riemann sphere S to itself.
(b) Each linear fractional transformation f other than the identity has either one or
two fixed points, i.e. solutions of f(z) = z, z ∈ S.
(c) Given any two ordered triples of distinct points $(z_1, z_2, z_3)$ and $(w_1, w_2, w_3)$, there
is a unique linear fractional transformation f such that $f(z_j) = w_j$, j = 1, 2, 3.
(d) Each linear fractional transformation is conformal: if two smooth curves in S
meet at an angle, then the images under f meet at the same angle.
(e) Each linear fractional transformation f has the property that if Γ is either a
straight line or a circle in C, then f(Γ) ∩ C is either a straight line or a circle.

Proof: Parts (a) and (b) are easily checked. For (c), it is enough to show that given
$(z_1, z_2, z_3)$, there is a unique linear fractional transformation f such that $f(z_1) = 0$,
$f(z_2) = 1$, and $f(z_3) = \infty$. (See below.)
Part (d) is clear geometrically. Given $z_0$ in the domain of f, let us define
$g(w) = f(z_0 + w) - f(z_0)$, so that g(0) = 0. In the limit as w → 0, g is multiplication by
$f'(z_0) \ne 0$. So letting $f'(z_0) = r e^{i\theta}$, in the limit g dilates by a factor r and rotates
by θ, both of which actions preserve angles. (If $z_0$ or $f(z_0)$ equals ∞, this argument
can be modified accordingly.)
In part (e), note that the statement is not that lines are taken to lines and circles
are taken to circles. One way to verify the result is to note that any linear fractional
transformation is a product of linear fractional transformations of the form f(z) =
az + b and, if necessary, the inversion R(z) = 1/z. The first type maps lines to lines
and circles to circles, so the problem reduces to the study of R. Taking into account
rotations, it is enough to consider R(Γ) when Γ is a vertical line or a circle with
center on the real axis. In each case, consideration of where R(Γ) intersects R ∪ {∞}
will identify the nature of the image, and aid in verifying that it is indeed a straight
line or circle. Details are left to the reader.

Three distinct points in S determine a unique line or circle in the plane; if one
of the points is the point at ∞, then they determine a line. Suppose that the three
points $z_1, z_2, z_3$ lie in C. The unique linear fractional transformation that takes
these points, in order, to (0, 1, ∞) is

\[
f(z) = \frac{z - z_1}{z - z_3} \cdot \frac{z_2 - z_3}{z_2 - z_1}.
\]

The expression on the right is called the cross ratio of the quadruple $(z, z_1, z_2, z_3)$. It
is commonly denoted $[z, z_1, z_2, z_3]$. The following is a consequence of Proposition
2.1.1.

Corollary 2.1.2. (a) The cross ratio is invariant under linear fractional transformations:
given four distinct points $z_0, z_1, z_2, z_3$ in C and a linear fractional transformation
g,

\[
[g(z_0), g(z_1), g(z_2), g(z_3)] = [z_0, z_1, z_2, z_3].
\]

(b) A point z ∈ C lies on the line or circle determined by three distinct points $z_1, z_2, z_3$
in S if and only if the cross ratio $[z, z_1, z_2, z_3]$ is real.
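
Both parts of the corollary can be tried on concrete points. The following sketch is our own illustration: the four angles on the unit circle, the transformation g, the off-circle point, and the name `cross_ratio` are all assumptions.

```python
import cmath

def cross_ratio(z, z1, z2, z3):
    """[z, z1, z2, z3] = (z - z1)/(z - z3) * (z2 - z3)/(z2 - z1), for finite points with z != z3."""
    return (z - z1) / (z - z3) * (z2 - z3) / (z2 - z1)

def g(z):
    """A hypothetical linear fractional transformation, with ad - bc = 6 + 2i != 0."""
    return ((2 + 1j) * z + 1) / (1j * z + 3)

# Four points on the unit circle (hypothetical sample angles).
points = [cmath.exp(1j * t) for t in (0.3, 1.1, 2.5, 4.0)]
before = cross_ratio(*points)
after = cross_ratio(*[g(p) for p in points])
off_circle = cross_ratio(0.2 + 0.1j, *points[1:])
print(before, after, off_circle)
```

The first two values coincide and are real up to rounding, as (a) and (b) predict, while the cross ratio formed with a point off the circle has a non-zero imaginary part.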
By an automorphism of a domain Ω with a complex structure, we mean a bijective
holomorphic map of the domain to itself. The set of such mappings is a group Aut(Ω)
under composition.

Proposition 2.1.3. The automorphism group of C is the set of linear fractional
transformations of the form f(z) = az + b, a ≠ 0.

Proof: Obviously any such linear fractional transformation is an automorphism of C.
Conversely, suppose that f is an automorphism of C. Then f has an isolated singularity
at ∞. Since f is injective, this singularity is not a multiple pole nor (by
Casorati–Weierstrass) an essential singularity. Moreover f is not bounded, so it is
not bounded in a neighborhood of ∞. Therefore the singularity is a simple pole with
residue a ≠ 0. Then f(z) − az is entire and bounded near ∞, hence constant.

Proposition 2.1.4. The automorphism group of S is the set of all linear fractional
transformations.

Proof: Any linear fractional transformation f is an automorphism of S. Bijectivity is
easy to check, and holomorphy needs only to be checked at z = ∞ and at $f^{-1}(\infty)$.
Conversely, if g is an automorphism of S, compose with a linear fractional transformation
f such that f(g(∞)) = ∞. Then h = f ◦ g restricted to C is an automorphism
of C, hence a linear fractional transformation, so $g = f^{-1} \circ h$ is also a linear fractional
transformation.

Lemma 2.1.5. (Schwarz’s lemma) If f is an automorphism of D such that f(0) = 0,
then f is a rotation: f(z) = ωz, |ω| = 1.

Proof: Let g(z) = f(z)/z. Then g : D → D is holomorphic, so the maximum principle
implies |f(z)/z| ≤ 1/r for |z| ≤ r < 1. Taking the limit as r → 1 and noting that
the same argument applies to $f^{-1}$, we find that |f(z)| = |z|. By the strong maximum
principle, f(z)/z is constant.

Proposition 2.1.6. The automorphism group of the unit disk D consists of linear
fractional transformations of the form

\[
f(z) = \omega\, \frac{z - a}{1 - \bar{a} z}, \qquad a \in \mathbb{D}, \quad |\omega| = 1. \tag{2.1.2}
\]

Proof: If f has the form (2.1.2), then |z| = 1 implies |f(z)| = 1, so f maps each
component of the complement of the unit circle onto such a component. Since f(a) =
0, it follows that f(D) = D.
Conversely, suppose that g is an automorphism of D. Let f be given by (2.1.2) with
a = g(0) and ω = 1. Then h = f ◦ g is an automorphism with h(0) = 0. By Lemma
2.1.5, h is a rotation, so $g = f^{-1} \circ h$ is a linear fractional transformation.

The linear fractional transformations

\[
C(z) = \frac{z - i}{z + i}, \qquad C^{-1}(w) = i\, \frac{1 + w}{1 - w} \tag{2.1.3}
\]

are the Cayley transform and its inverse. It is easily seen that C maps the real line to
the unit circle. Since C(i) = 0, C maps the upper half-plane to the unit disk and the
lower half-plane to {z : |z| > 1}.

Proposition 2.1.7. The automorphism group of the upper half-plane H consists of
the linear fractional transformations that have real coefficients and positive determinants.

Proof: If f is an automorphism of H, then $g = C \circ f \circ C^{-1}$ is an automorphism of D.
Therefore g is a linear fractional transformation and so is $f = C^{-1} \circ g \circ C$. If f has
the form (2.1.1), we may assume that ad − bc is real, and then computing Im f(i)
shows that ad − bc must be positive. Now f maps R ∪ {∞} to itself. Checking
f(∞), f(0), and f(1) shows that each coefficient is real.

2.2 Geometries

Each of the domains C, S, D, and H carries one or more natural metrics and geometries.
A. Geometries on C
There are two commonly used geometries and three commonly used metrics on C.
The first is the euclidean geometry of C as identified with $\mathbb{R}^2$, with metric |z − w|.
The second geometry and a related metric come from the identification of C as a
subset of the Riemann sphere via stereographic projection. This standard pictorial
representation is obtained by considering C as the x, y plane of the three-dimensional
space

\[
\mathbb{R}^3 = \mathbb{C} \times \mathbb{R} = \{(w, t) : w \in \mathbb{C},\ t \in \mathbb{R}\},
\]

and relating it to the 2-sphere of radius 1 centered at the origin:

\[
S = \{(w, t) : |w|^2 + t^2 = 1\}.
\]

A point ω = (w, t) on S is mapped to a point z = π(ω) in C by following the line
from the north pole N = (0, 1) ∈ S through (w, t) to its intersection (z, 0) with
C × {0}; see Figure 2.1.

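[Figure: points $\omega_1$, $\omega_2$ on the sphere and their projections $\pi(\omega_1)$, $\pi(\omega_2)$ in the plane.]

Fig. 2.1 Stereographic projection.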

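The line determined by (0, 1) and ω = (w, t) is the set of points

\[
(1 - \lambda)(0, 1) + \lambda (w, t), \qquad \lambda \in \mathbb{R}.
\]

This line meets C × {0} where 1 − λ + λt = 0, i.e. λ = 1/(1 − t). Thus for t ≠ 1,

\[
\pi(w, t) = \frac{w}{1 - t}. \tag{2.2.1}
\]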

It follows that

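\[
|\pi(w, t)|^2 = \frac{|w|^2}{(1 - t)^2} = \frac{1 - t^2}{(1 - t)^2} = \frac{1 + t}{1 - t}.
\]

Therefore

\[
t = \frac{|z|^2 - 1}{|z|^2 + 1}; \qquad 1 - t = \frac{2}{|z|^2 + 1},
\]

and

\[
\pi^{-1}(z) = \left( \frac{2z}{|z|^2 + 1},\ \frac{|z|^2 - 1}{|z|^2 + 1} \right). \tag{2.2.2}
\]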

As ω ∈ S approaches the north pole, π(ω) approaches ∞, so we let π(0, 1) = ∞.

We define a second distance function in the plane by using the euclidean distance
of the pull-back to C,

\[
d(z_1, z_2) = \frac{1}{2} \left| \pi^{-1}(z_1) - \pi^{-1}(z_2) \right|. \tag{2.2.3}
\]
The computation is made simpler by noting that for $(w_j, t_j)$ in the sphere,

\[
|(w_1, t_1) - (w_2, t_2)|^2 = 2 - 2 \operatorname{Re}(\bar{w}_1 w_2 + t_1 t_2). \tag{2.2.4}
\]

Using this and (2.2.2) we find that


|z 1 − z 2 |
d(z 1 , z 2 ) =   . (2.2.5)
|z 1 |2 + 1 |z 2 |2 + 1

Taking the limit as z 2 → ∞ gives


1
d(z, ∞) =  . (2.2.6)
1 + |z|2
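
The formulas (2.2.1)–(2.2.6) can be checked numerically. The following is a minimal sketch, not part of the original text; the function names `inv_stereo`, `stereo`, `chordal` and `chordal_from_sphere` are illustrative choices. It compares the closed form (2.2.5) with the definition (2.2.3).

```python
import math

def inv_stereo(z: complex):
    """Return pi^{-1}(z) = (w, t) on the unit sphere, following (2.2.2)."""
    s = abs(z) ** 2 + 1
    return 2 * z / s, (abs(z) ** 2 - 1) / s

def stereo(w: complex, t: float) -> complex:
    """Stereographic projection (2.2.1), valid for t != 1."""
    return w / (1 - t)

def chordal(z1: complex, z2: complex) -> float:
    """Closed form (2.2.5) for the distance d(z1, z2)."""
    return abs(z1 - z2) / (math.sqrt(abs(z1) ** 2 + 1) * math.sqrt(abs(z2) ** 2 + 1))

def chordal_from_sphere(z1: complex, z2: complex) -> float:
    """Definition (2.2.3): half the euclidean distance in R^3 = C x R."""
    w1, t1 = inv_stereo(z1)
    w2, t2 = inv_stereo(z2)
    return 0.5 * math.sqrt(abs(w1 - w2) ** 2 + (t1 - t2) ** 2)

z1, z2 = 0.3 + 1.2j, -2.0 + 0.5j
w1, t1 = inv_stereo(z1)
print(abs(w1) ** 2 + t1 ** 2)                        # squared norm of the lifted point
print(abs(stereo(w1, t1) - z1))                      # round-trip error of pi(pi^{-1}(z))
print(chordal(z1, z2), chordal_from_sphere(z1, z2))  # the two expressions for d
```

Evaluating `chordal(z, R)` for a very large real R gives a value close to the limit (2.2.6).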

B. Hyperbolic geometry in the disk


The fundamental idea for geometry in D is that the metric ρD should be invariant
under Aut(D) and that the diameters should be geodesics. The geometry is then
uniquely determined by setting a scale factor. Different sources use different scale
factors. In [22] we followed [107] and chose the scaling so that in the limit at the
origin, the metric is euclidean:
$$\lim_{\varepsilon \to 0} \frac{\rho_D(\varepsilon, 0)}{|\varepsilon|} = 1. \tag{2.2.7}$$

This choice also fits nicely with the Teichmüller metric in Section 9.4.
Since the diameter (−1, 1) is to be a geodesic, we want additivity:
ρD (r, t) = ρD (r, s) + ρD (s, t) if − 1 < r < s < t < 1. (2.2.8)

By a slight abuse of notation, write ρD(z, 0) = ρD(z). Note that invariance implies
that ρD(z) = ρD(|z|). The automorphism f(z) = (z − r)/(1 − rz) and the invariance
assumption reduce (2.2.8) to the case of the triple 0 < r < r + δ, thus to
(f(0), 0, f(r + δ)). Letting δ → 0 and taking into account (2.2.7), we find that
$$\rho_D'(r) = \frac{1}{1 - r^2} = \frac{1}{2} \left( \frac{1}{1 + r} + \frac{1}{1 - r} \right),$$
so
$$\rho_D(r, 0) = \rho_D(r) = \frac{1}{2} \log \frac{1 + r}{1 - r}, \qquad -1 < r < 1. \tag{2.2.9}$$

In general, suppose that z 1 , z 2 are distinct points of D. There is a (unique) automor-


phism that takes z 1 to 0 and z 2 to an element of (0, 1). Invariance then gives the
general formula for the hyperbolic metric, or Poincaré metric

$$\rho_D(z_1, z_2) = \frac{1}{2} \log \frac{|1 - \bar{z}_1 z_2| + |z_1 - z_2|}{|1 - \bar{z}_1 z_2| - |z_1 - z_2|} = \tanh^{-1} \frac{|z_1 - z_2|}{|1 - \bar{z}_1 z_2|}. \tag{2.2.10}$$
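
The defining properties of ρD can be tested numerically against (2.2.10). The sketch below is an illustration only; the helper names `rho_D` and `disk_automorphism` and the sample points are assumptions of the example. It checks invariance under an automorphism of D, additivity (2.2.8) along the diameter (−1, 1), and agreement with (2.2.9).

```python
import math

def rho_D(z1: complex, z2: complex) -> float:
    """Hyperbolic distance (2.2.10) in the unit disk, with the normalization (2.2.7)."""
    return math.atanh(abs(z1 - z2) / abs(1 - z1.conjugate() * z2))

def disk_automorphism(a: complex, theta: float = 0.0):
    """Return z -> e^{i theta} (z - a) / (1 - conj(a) z), an element of Aut(D) for |a| < 1."""
    rot = complex(math.cos(theta), math.sin(theta))
    return lambda z: rot * (z - a) / (1 - a.conjugate() * z)

z1, z2 = 0.2 + 0.3j, -0.5 + 0.1j
phi = disk_automorphism(0.4 - 0.25j, theta=1.1)

# Invariance under Aut(D).
invariance_gap = abs(rho_D(phi(z1), phi(z2)) - rho_D(z1, z2))

# Additivity (2.2.8) for the triple r < s < t on the real diameter.
r, s, t = -0.3 + 0j, 0.1 + 0j, 0.6 + 0j
additivity_gap = abs(rho_D(r, t) - (rho_D(r, s) + rho_D(s, t)))

# Agreement with the logarithmic form (2.2.9) at the point 0.7.
log_form_gap = abs(rho_D(0.7 + 0j, 0j) - 0.5 * math.log((1 + 0.7) / (1 - 0.7)))

print(invariance_gap, additivity_gap, log_form_gap)
```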

The set of geodesics is, by construction, invariant under Aut(D). We know that
the image of a diameter must be an arc of a circle, and since the automorphisms are
conformal, such an arc must meet the boundary T in two right angles. Conversely,
given such a circular arc, there is an element of Aut(D) that moves the endpoints to
−1, 1. Therefore

Proposition 2.2.1. The geodesics for the hyperbolic metric ρD in D are the diameters
of D and the circular arcs that meet the boundary in two right angles.

Two infinitesimal versions are the line element
$$ds = (ds)_D = \frac{|dz|}{1 - |z|^2} \tag{2.2.11}$$
and the Riemann metric
$$ds^2 = \frac{dx^2 + dy^2}{(1 - r^2)^2}, \qquad r = |z|. \tag{2.2.12}$$

The Poincaré density ηD(z) is, by definition, the limiting ratio between the Poincaré
distance and the euclidean distance at z ∈ D:
$$\eta_D(z) = \frac{ds}{|dz|} = \frac{1}{1 - r^2}. \tag{2.2.13}$$

C. Hyperbolic geometry in the upper half-plane


The fundamental idea is the same as in the case of the disk. The metric ρH, known
as the Poincaré metric, is invariant under Aut(H). It can be constructed by the
same method used for ρD, starting with the positive imaginary axis and deriving a
differential equation that determines ρH on that half-line, up to a scale factor. Instead,
we shall take advantage of the Cayley transform C : H → D to transplant ρD:
$$\rho_H(z_1, z_2) = \rho_D(C(z_1), C(z_2)) = \rho_D\left( \frac{z_1 - i}{z_1 + i},\ \frac{z_2 - i}{z_2 + i} \right) = \frac{1}{2} \log \frac{|\bar{z}_1 - z_2| + |z_1 - z_2|}{|\bar{z}_1 - z_2| - |z_1 - z_2|}. \tag{2.2.14}$$

In particular
$$\rho_H(it, is) = \frac{1}{2} \log \frac{t}{s} \quad \text{if } 0 < s < t.$$
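
As a sanity check on (2.2.14), the following sketch (an illustration; the function names and sample points are assumptions of the example) compares the closed form for ρH with the transplanted distance ρD(C(z1), C(z2)) and with the special case on the imaginary axis.

```python
import math

def cayley(z: complex) -> complex:
    """Cayley transform C(z) = (z - i)/(z + i), mapping H onto D."""
    return (z - 1j) / (z + 1j)

def rho_D(z1: complex, z2: complex) -> float:
    """Hyperbolic distance in the disk, formula (2.2.10)."""
    return math.atanh(abs(z1 - z2) / abs(1 - z1.conjugate() * z2))

def rho_H(z1: complex, z2: complex) -> float:
    """Closed form for the hyperbolic distance in H, last expression in (2.2.14)."""
    a = abs(z1.conjugate() - z2)
    b = abs(z1 - z2)
    return 0.5 * math.log((a + b) / (a - b))

z1, z2 = 0.5 + 2.0j, -1.5 + 0.25j
transplant_gap = abs(rho_H(z1, z2) - rho_D(cayley(z1), cayley(z2)))

t, s = 3.0, 0.5
axis_gap = abs(rho_H(1j * t, 1j * s) - 0.5 * math.log(t / s))

print(transplant_gap, axis_gap)
```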
The infinitesimal version is
$$(ds)_H = \frac{|dz|}{2\,\mathrm{Im}\, z}; \qquad ds^2 = \frac{dx^2 + dy^2}{4y^2}. \tag{2.2.15}$$

Thus the Poincaré density ηH for ρH is
$$\eta_H(z) = \frac{(ds)_H}{|dz|} = \frac{1}{2\,\mathrm{Im}\, z}. \tag{2.2.16}$$

It follows from this construction that the geodesics, as images under C −1 of


geodesics in D, are lines and half-circles that meet the real axis at right angles.
Remark. There seem to be two common normalizations for ρH . The other is twice
this one. We have chosen to use the normalization (2.2) since this is what is used in
our sources for Chapter 9.

This construction can be carried over to any conformal image of D. If f : Ω → D


is a conformal map, then we set

ρΩ (z 1 , z 2 ) = ρD ( f (z 1 ), f (z 2 )) .

Thus the infinitesimal distance dsΩ is
$$ds_\Omega(z) = \lim_{\varepsilon \to 0} \frac{\rho_D(f(z + \varepsilon), f(z))}{\varepsilon} = \frac{|f'(z)\, dz|}{1 - |f(z)|^2},$$
and the associated metric density is
$$\eta_\Omega(z) = \frac{|f'(z)|}{1 - |f(z)|^2}. \tag{2.2.17}$$

In Section 9.4 we will need the following two results.


Proposition 2.2.2. Metric density decreases with respect to set inclusion: if Ω1 and
Ω2 are simply connected domains with Ω1 ⊂ Ω2 ⊂ C and Ω1 ≠ Ω2 ≠ C, then

ηΩ2 (z) < ηΩ1 (z), z ∈ Ω1 . (2.2.18)

Proof. Given z ∈ Ω1, let $f_j$ be a conformal map of $\Omega_j$ onto D that takes z to 0, such
that $f_j'(z) > 0$. Then $f = f_2 \circ f_1^{-1}$ is a conformal map of D onto a proper subset of D,
and $f(0) = 0$, $f'(0) > 0$. By Schwarz's lemma, $f'(0) < 1$. Near z, $f_2 = f \circ f_1$, so
$$|f_2'(z)| = |f'(0)\, f_1'(z)| < |f_1'(z)|.$$
Since $f_j(z) = 0$, (2.2.17) gives $\eta_{\Omega_j}(z) = |f_j'(z)|$, and (2.2.18) follows.
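
A concrete instance of Proposition 2.2.2 is the pair of concentric disks of radius 1 and 2. The sketch below is an illustration under that assumption; the function name `eta_disk` is hypothetical. It evaluates (2.2.17) with the conformal map f(z) = z/R of the disk of radius R onto D and compares the two densities at a few points of the smaller disk.

```python
def eta_disk(z: complex, R: float = 1.0) -> float:
    """Density (2.2.17) for the disk |z| < R, using the conformal map f(z) = z/R onto D."""
    f = z / R              # value of the conformal map at z
    f_prime = 1.0 / R      # its derivative, constant in z
    return abs(f_prime) / (1 - abs(f) ** 2)

points = [0j, 0.5 + 0j, 0.3 - 0.6j, 0.99j]   # sample points of the unit disk
for z in points:
    # The density of the larger domain should be the smaller one.
    print(z, eta_disk(z, 1.0), eta_disk(z, 2.0))
```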

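Proposition 2.2.3. The metric density ηΩ satisfies
$$\frac{1}{4\, d(z, \partial\Omega)} \le \eta_\Omega(z) \le \frac{1}{d(z, \partial\Omega)}, \tag{2.2.19}$$
where d(z, ∂Ω) is the (euclidean) distance to the boundary:
$$d(z, \partial\Omega) = \inf_{\zeta \in \partial\Omega} |z - \zeta|.$$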

Proof. Given z ∈ Ω, let φ be the conformal map of D onto Ω such that φ(0) = z,
$\varphi'(0) > 0$. Let Ω1 be the disk of radius r = d(z, ∂Ω) centered at z. Then
$f(w) = \varphi^{-1}(rw + z)$ maps D into D and f(0) = 0, so by Schwarz's lemma
$$1 \ge |f'(0)| = r\, [\varphi^{-1}]'(z) = \frac{d(z, \partial\Omega)}{\varphi'(0)} = d(z, \partial\Omega) \cdot \eta_\Omega(z),$$
which proves the second inequality in (2.2.19).

Now $\psi(w) = [\varphi(w) - z]/\varphi'(0)$ is a normalized map of D into C. The Koebe
one-quarter theorem, Theorem 4.1.4, says that ψ(D) contains $D_{1/4}(0)$. The largest
disk about 0 contained in ψ(D) has radius $r/\varphi'(0)$, so
$$\frac{1}{4} \le \frac{r}{\varphi'(0)} = d(z, \partial\Omega)\, \eta_\Omega(z),$$

which proves the first inequality in (2.2.19).
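
The two-sided bound (2.2.19) can be observed directly for the domains D and H, where both the density and the distance to the boundary are explicit: (2.2.13) and d(z, ∂D) = 1 − |z| for the disk, (2.2.16) and d(z, ∂H) = Im z for the half-plane. This is a sketch only; the function names and the tolerance `tol` are choices made for the example.

```python
def eta_D(z: complex) -> float:
    """Poincare density of the unit disk, (2.2.13)."""
    return 1.0 / (1 - abs(z) ** 2)

def dist_D(z: complex) -> float:
    """Euclidean distance from z to the unit circle."""
    return 1 - abs(z)

def eta_H(z: complex) -> float:
    """Poincare density of the upper half-plane, (2.2.16)."""
    return 1.0 / (2 * z.imag)

def dist_H(z: complex) -> float:
    """Euclidean distance from z to the real axis."""
    return z.imag

def satisfies_bounds(eta: float, d: float, tol: float = 1e-12) -> bool:
    """Check 1/(4d) <= eta <= 1/d, allowing a small rounding tolerance."""
    return 1 / (4 * d) <= eta + tol and eta <= 1 / d + tol

disk_points = [0j, 0.5 + 0j, -0.2 + 0.7j, 0.999j]
half_plane_points = [1j, 3 + 0.01j, -7 + 25j]

disk_ok = all(satisfies_bounds(eta_D(z), dist_D(z)) for z in disk_points)
half_plane_ok = all(satisfies_bounds(eta_H(z), dist_H(z)) for z in half_plane_points)
print(disk_ok, half_plane_ok)
```

At the center of the disk the upper bound is attained, which is why the comparison allows a tolerance.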

2.3 Normal families

A collection F of real or complex-valued functions defined on a domain Ω in C is


said to be a normal family if each sequence { f n } in F contains a subsequence that
converges uniformly on each compact subset of Ω. A normal family F is said to be
complete if the limit of any such convergent sequence belongs to F.
The standard criterion for a normal family is the following, specialized to domains
in C. A family F of real or complex-valued functions defined on a domain Ω ⊂ C
is said to be bounded on compact subsets if for each compact subset C ⊂ Ω, the
supremum of | f (z)|, f ∈ F , z ∈ C, is finite. The family F is said to be equicon-
tinuous on compact subsets if for each compact subset C ⊂ Ω and each ε > 0,
there is a δ > 0 such that for each f ∈ F and points z, w ∈ C, if |z − w| < δ then
| f (z) − f (w)| < ε.

Theorem 2.3.1. (Ascoli–Arzelà) Suppose that F is a family of real or complex-


valued functions defined on a domain Ω ⊂ C, that is bounded and equicontinuous
on compact subsets. Then F is a normal family.

Proof: Let {z m } be a countable dense subset of Ω. The boundedness assumption


implies, in particular, that for any sequence { f n } in F , there is a subsequence { f 1,n }
that converges at z 1 . Some subsequence of { f 1,n } converges at z 2 ; denote this sub-
sequence by { f 2,n }, and find a subsequence of { f 2,n } that converges at z 3 . Having
found nested subsequences { f m+1,n } ⊂ { f m,n } for each m, we note that the sequence
{ f n,n } converges at each z n .
Suppose now that C is a compact subset of C that is the closure of an open set. Then
the z n that lie in C are dense in C. It is an exercise to show, using the assumptions of
equicontinuity and boundedness, that { f n,n } converges uniformly on C to a bounded
continuous function f .

Corollary 2.3.2. (Montel’s theorem) Suppose that F is a family of holomorphic


functions on a domain Ω ⊂ C, that is bounded on compact subsets. Then F is a
normal family.

Proof: It is enough to prove equicontinuity on compact subsets, and for that it is


enough to prove that the family of derivatives

$$F' = \{ f' : f \in F \}$$

is bounded on compact sets. To see that this is true, given z 0 ∈ Ω, choose r > 0 such
that {z : |z − z 0 | ≤ r } is contained in Ω and use the estimate (1.3.2) to show that the
derivative is bounded on each disk Ds (z 0 ), 0 < s < r .

Anticipating a bit, we note here that analogous results are true for harmonic
functions.

Corollary 2.3.3. Suppose that F is a family of harmonic functions on a domain
Ω ⊂ C that is bounded on compact subsets. Then F is a normal family.

Proof: The proof is essentially the same as for Corollary 2.3.2, with the (scaled) version
of the Poisson integral formula (5.1.6) in place of the Cauchy integral formula.

It is important to know something about the limit functions in the holomorphic or


harmonic case.

Proposition 2.3.4. If { f m } is a uniformly convergent sequence of holomorphic (resp.


harmonic) functions on a domain Ω ⊂ C, then the limit f is holomorphic (resp.
harmonic).

Proof: Given z 0 ∈ Ω, choose r as in the proof of Corollary 2.3.3. The f n are eventually
defined on Dr (z 0 ). In the holomorphic case, the Cauchy integral formula for f n (z),
z ∈ Dr (z 0 ), gives a formula for f (z). The same argument, using the Poisson formula,
applies to the harmonic case.
Corollary 2.3.5. (Vitali’s theorem) If the sequence { f n } of holomorphic functions
on a domain Ω ⊂ C is uniformly bounded on compact subsets and converges at
each point of a set with an accumulation point in Ω, then it converges uniformly on
compact sets in Ω.

Proof. The sequence is a normal family. The limit of any convergent subsequence is
determined uniquely by its values {an } on a set with an accumulation point, since all
derivatives at that point are uniquely determined by the {an }. Therefore the sequence
itself converges.
Proposition 2.3.6. If { f m } is a uniformly convergent sequence of injective holomor-
phic functions on a domain Ω ⊂ C, then the limit f is either constant or injective.

Proof: Suppose that f is not constant. Given z 0 ∈ Ω, choose r as before, but in such
a way that | f (z) − f (z 0 )| ≥ δ > 0 for |z − z 0 | = r . By Rouché’s theorem, 1.4.7,
eventually f n has the same number of values f (z 0 ) in Dr (z 0 ) as f does. By assump-
tion each f n is injective, so f must also be injective.

The special properties of holomorphic and harmonic functions allow a useful


generalization of these criteria. The proofs make new use of the same ideas.
Proposition 2.3.7. Suppose that F is a family of holomorphic or harmonic func-
tions on a domain Ω ⊂ C, and suppose that for any compact C ⊂ Ω,

$$\sup_{f \in F} \iint_C |f| \, dx\, dy < \infty. \tag{2.3.1}$$

Then F is a normal family.

Proof: Consider the holomorphic case. Let $z_0$ be any point of Ω and again let r > 0
be such that the closure of $D_r(z_0)$ is contained in Ω. Given any f holomorphic on
a neighborhood of the closure of $D_r(z_0)$ and any z in $D_{2r/3}(z_0)$, we can write a
version of the Cauchy integral formula by smearing the original formula over the
circles of radius 2r/3 to r. Writing ζ = x + iy,
$$f(z) = \frac{3}{r} \int_{2r/3}^{r} \left[ \frac{1}{2\pi i} \int_{|\zeta - z_0| = s} \frac{f(\zeta)\, d\zeta}{\zeta - z} \right] ds
= \frac{3}{2\pi r} \iint_{2r/3 < |\zeta - z_0| < r} \frac{f(\zeta)}{\zeta - z}\, \frac{\zeta - z_0}{|\zeta - z_0|}\, dx\, dy.$$

Differentiating with respect to z, and using the assumption (2.3.1), gives us uniform
estimates for $f'$ on $D_{r/3}(z_0)$, f ∈ F. The same idea, using the Poisson formula,
applies in the harmonic case.
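
The smearing step can be illustrated numerically: averaging the Cauchy integral over the radii from 2r/3 to r reproduces f(z). The sketch below is an illustration, not the book's argument; the quadrature rules, the sample function exp, and the names `cauchy_on_circle` and `smeared_cauchy` are assumptions of the example.

```python
import cmath
import math

def cauchy_on_circle(f, z0: complex, s: float, z: complex, n_theta: int = 400) -> complex:
    """(1/(2 pi i)) * integral over |zeta - z0| = s of f(zeta)/(zeta - z) d zeta (trapezoid rule)."""
    total = 0j
    for k in range(n_theta):
        e = cmath.exp(2j * math.pi * k / n_theta)
        zeta = z0 + s * e
        dzeta = 1j * s * e * (2 * math.pi / n_theta)   # d zeta = i s e^{i theta} d theta
        total += f(zeta) / (zeta - z) * dzeta
    return total / (2j * math.pi)

def smeared_cauchy(f, z0: complex, r: float, z: complex, n_s: int = 20) -> complex:
    """(3/r) * integral from 2r/3 to r of the circle integrals (midpoint rule in s)."""
    a, b = 2 * r / 3, r
    h = (b - a) / n_s
    return (3 / r) * h * sum(cauchy_on_circle(f, z0, a + (j + 0.5) * h, z) for j in range(n_s))

z0, r = 0.5 + 0.5j, 1.0
z = z0 + 0.2                      # a point of D_{r/3}(z0)
smearing_error = abs(smeared_cauchy(cmath.exp, z0, r, z) - cmath.exp(z))
print(smearing_error)
```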

Remark. This section exemplifies different mathematical traditions. Texts on com-


plex analysis speak of normal families and Montel, and give the above criterion and
proof, without mentioning Ascoli or Arzelà. Functional analysis and real analysis
texts often contain the Ascoli–Arzelà theorem, by name, but do not mention Montel
or give a name to the “normal family” concept.

For a pictorial proof of the Ascoli–Arzelà theorem, see [22], Section 5.3.

2.4 Conformal equivalence and the Riemann mapping theorem

A conformal mapping of one complex domain to another is a bijective C 1 function


that preserves, at each point, the size and the orientation of the angle between any
two C 1 curves passing through that point. Some calculation shows that the necessary
and sufficient condition for this, pointwise, is that the Cauchy–Riemann equations
hold, and that the derivative with respect to z be non-zero. Thus a conformal mapping
is a bijective holomorphic function.
Given two domains in C or S, a natural question is whether they are conformally
equivalent: does there exist a bijective holomorphic map from one onto the other?
Since linear fractional transformations are conformal maps, we know that a
half-plane and a disk are conformally equivalent. On the other hand, by Liouville’s
theorem, a holomorphic map from C to a disk in C must be constant, so C and D are
not conformally equivalent. Moreover it is easily seen that a bijective holomorphic
image of a simply connected domain is simply connected. Therefore the plane minus
a single point is also not conformally equivalent to the unit disk.
The original version of the following theorem was formulated by Riemann for
domains with some assumptions about the boundary. The definitive result is the
following, due to Koebe [122].

Theorem 2.4.1. (Riemann mapping theorem) If Ω is an open, simply connected,


proper subset of the plane, then there is a bijective holomorphic map f that maps
Ω onto the unit disk D.
Given a point $z_0 \in \Omega$, we may specify $f(z_0) = 0$, $f'(z_0) > 0$. These conditions
determine f uniquely.

Remark. Given that Ω ⊂ C is a simply connected domain, the assumption that it


is a proper subset is equivalent to the assumption that the boundary ∂Ω contains at
least two points. Therefore the theorem is often stated with this as the extra condition
on Ω.
The proof involves a number of steps. The first step is a reduction to the case of
bounded Ω. Choose a not in Ω. Then z − a is never zero on the simply connected
domain Ω, so we may choose a branch of $\sqrt{z - a}$ that is holomorphic on Ω; see
Section 1.5. This branch maps Ω bijectively onto a domain Ω1. Choose a point
b ∈ Ω1. For some ε > 0, the disk {z : |z − b| < ε} is contained in Ω1. If z is in Ω1
then −z is not, so Ω1 lies outside the disk {z : |z + b| < ε}. The map z → 1/(z + b)
takes Ω1 bijectively onto a bounded domain Ω2. Thus we may replace Ω by Ω2,
and assume that Ω itself is bounded.
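
To make the reduction concrete, take the slit plane Ω = C \ (−∞, 0] with a = 0. This choice of domain is an assumption of the example, not something from the text. The principal branch of the square root maps Ω onto the right half-plane Ω1, the point b = 1 lies in Ω1 with ε = 1, and z → 1/(z + 1) then maps Ω1 into the unit disk. The sketch below samples points of Ω and records the largest modulus of the image.

```python
import cmath
import random

def to_bounded(z: complex, a: complex = 0.0, b: complex = 1.0) -> complex:
    """Compose the principal branch of sqrt(z - a) with w -> 1/(w + b)."""
    return 1 / (cmath.sqrt(z - a) + b)

random.seed(0)
samples = []
while len(samples) < 1000:
    z = complex(random.uniform(-50, 50), random.uniform(-50, 50))
    if not (z.imag == 0 and z.real <= 0):   # stay off the slit (-inf, 0]
        samples.append(z)

# Every sqrt value has positive real part, so |sqrt(z) + 1| > 1 and the image is bounded by 1.
largest = max(abs(to_bounded(z)) for z in samples)
print(largest)
```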
For the next step, choose a point $z_0 \in \Omega$ and let F be the family of injective
holomorphic maps f from Ω into the unit disk D such that $f(z_0) = 0$ and $f'(z_0) > 0$.
Note that this family is not empty: $f(z) = \varepsilon(z - z_0)$ will belong to F if ε > 0 is
small enough. Note also that
$$\sup_{f \in F} f'(z_0) \le \frac{1}{r}$$
if r is such that the closure of $D_r(z_0)$ is contained in Ω. It follows from Corollary
2.3.2 that F is a normal family. Therefore F contains an element f such that
$$f'(z_0) = \sup_{g \in F} g'(z_0).$$

By Proposition 2.3.6, f is injective. We need to show that f is surjective. Suppose
f omits a point a ∈ D, and suppose first, for simplicity, that a > 0. A branch of the
square root can be chosen so that
$$g(z) = \sqrt{\frac{z - a}{az - 1}}$$
is holomorphic on f(Ω) ⊂ D. The linear fractional transformation under the radical
sign maps D to D, so the composition g ∘ f maps Ω into D. Note that $g(0) = \sqrt{a}$.
Let
$$h(z) = \frac{z - \sqrt{a}}{\sqrt{a}\, z - 1}, \tag{2.4.1}$$
and let $f_1 = h \circ g \circ f$. Then $f_1$ is injective from Ω into D, and
$$f_1'(z_0) = h'(\sqrt{a})\, g'(0)\, f'(z_0). \tag{2.4.2}$$

But
$$g'(z) = \frac{1}{2g(z)}\, \frac{(az - 1) - a(z - a)}{(az - 1)^2} = \frac{1}{2g(z)}\, \frac{a^2 - 1}{(az - 1)^2},$$
and
$$h'(z) = \frac{(\sqrt{a}\, z - 1) - \sqrt{a}\,(z - \sqrt{a})}{(\sqrt{a}\, z - 1)^2} = \frac{a - 1}{(\sqrt{a}\, z - 1)^2},$$
so
$$g'(0) = \frac{a^2 - 1}{2\sqrt{a}}, \qquad h'(\sqrt{a}) = \frac{1}{a - 1}$$
and
$$f_1'(z_0) = \frac{a + 1}{2\sqrt{a}}\, f'(z_0).$$
But since 0 < a < 1 we have $a + 1 - 2\sqrt{a} = (1 - \sqrt{a})^2 > 0$, so $f_1'(z_0) > f'(z_0)$,
contradicting the assumption that $f'(z_0)$ is maximal.
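
The derivative computation can be double-checked numerically. The sketch below is an illustration only; the principal branch of the square root, the value a = 0.25, and the step size are assumptions of the example. It differentiates h ∘ g at 0 by a central difference and compares the result with the factor (a + 1)/(2√a).

```python
import cmath
import math

def gain(a: float) -> float:
    """Factor (a + 1)/(2 sqrt(a)) relating f_1'(z_0) to f'(z_0)."""
    return (a + 1) / (2 * math.sqrt(a))

def g(z: complex, a: float) -> complex:
    """sqrt((z - a)/(a z - 1)); near z = 0 the principal branch gives g(0) = sqrt(a)."""
    return cmath.sqrt((z - a) / (a * z - 1))

def h(z: complex, a: float) -> complex:
    """The linear fractional map (2.4.1)."""
    ra = math.sqrt(a)
    return (z - ra) / (ra * z - 1)

def derivative_at_zero(a: float, eps: float = 1e-6) -> complex:
    """Central-difference approximation of (h o g)'(0)."""
    F = lambda z: h(g(z, a), a)
    return (F(eps) - F(-eps)) / (2 * eps)

a = 0.25
print(derivative_at_zero(a), gain(a))
print([gain(x) > 1 for x in (0.01, 0.3, 0.9, 0.999)])
```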
The preceding argument assumed that f (Ω) omitted a point a ∈ D and that a > 0.
Otherwise, we may assume that the omitted point has the form ωa, where |ω| = 1
and a > 0, and take
$$f_1(z) = \omega\, h(g(\bar{\omega} f(z)))$$

with g and h defined as before. Again we find that $f_1'(z_0) > f'(z_0)$, a contradiction.


This contradiction shows that our function f of (2.4.2) maps Ω onto D.
Finally, uniqueness follows easily from Lemma 2.1.5.

2.5 The triply-punctured sphere, Montel, and Picard

By the triply-punctured sphere we mean the Riemann sphere with the points 0, 1, ∞
removed. We will denote it by S \ 3:

S \ 3 = S \ {0, 1, ∞} = C \ {0, 1}. (2.5.1)

The study of S \ 3 is closely connected to the elliptic modular function λ. Theorem


2.4.1 makes possible a quick conceptual construction of λ. Let Ω be the domain

$$\Omega = \{ z : 0 < \mathrm{Re}\, z < 1,\ |z - \tfrac{1}{2}| > \tfrac{1}{2} \}. \tag{2.5.2}$$

See Figure 2.2.

Fig. 2.2 Fundamental domain of λ.

Let Φ be a conformal map of D onto Ω. Since the boundary of Ω consists of
analytic arcs, Theorem 1.7.2, together with some consideration of the images under
Φ of circles {z : |z| = r} as r ↑ 1, shows that Φ has a continuous extension to the
boundary. (Or see Section 2.6.) By composing with an automorphism of D, we
may suppose that Φ(−1) = 0, Φ(−i) = 1, and Φ(1) = ∞. Let C : H → D be the
Cayley transform. Then Ψ = Φ ∘ C is a conformal map of H onto Ω that extends
continuously to map R onto the boundary of Ω, fixing 0, 1, and ∞:
$$\Psi : H \to \Omega; \qquad \lim_{z \to w} \Psi(z) = w, \quad w = 0, 1, \infty. \tag{2.5.3}$$

Let λ be the map
$$\lambda = \Psi^{-1} : \Omega \to H. \tag{2.5.4}$$

As we shall see, λ extends to a covering map of H onto S \ 3. This means that any
point z ∈ S \ 3 has a connected neighborhood U with the property that λ maps each
connected component of λ−1 (U ) conformally onto U .
Theorem 2.5.1. The function λ is the unique conformal map of the domain (2.5.2)
onto H whose extension to the boundary fixes 0, 1, and ∞. Moreover, λ extends to
H as a covering map of H onto S \ 3.

Proof. Suppose that μ : Ω → H is another such map. Then $\mu \circ \lambda^{-1}$ is an
automorphism of H whose extension fixes three distinct points. Therefore
$\mu \circ \lambda^{-1} = 1$, the identity map, so μ = λ.
Since λ extends to map the semicircle that is the lower boundary Γ2 onto the
real interval [0, 1], it extends across by reflection to the reflection r (Ω) through the
semicircle Γ2 . By following the boundary and using conformality, it is easy to see
that the lower boundary of r (Ω) consists of the semicircles in H centered at 1/4 and
at 3/4 and having radius 1/4. This gives us an extension that is a conformal map

λ : Ω ∪ Γ2 ∪ r (Ω) → C \ {0, 1} = S\3.

This can be extended again across each of its lower boundary arcs, and the process
continued; see Figure 2.3.

Fig. 2.3 Extension to a half strip.

Each of the bounded, three-sided subdomains in this figure, when reflected across
one of its lower boundary arcs, gives an extension that is defined on a similar domain
having half the diameter and maps to the opposite half-plane of C. Continuing
indefinitely results in a countably-many-to-one locally conformal map of the strip,

λ : {z : Im z > 0, 0 < Re z < 1} → S \ 3.

Finally, continued reflection through the vertical boundaries results in an extension

λ : H → C \ {0, 1} = S \ 3.

This is a covering map: tracing through the construction shows that each point of
S \ 3 has a neighborhood U whose inverse image is a disjoint union of open sets Uα such that
λ : Uα → U is conformal.

For our purpose here it is convenient to replace λ by λ∗ = λ ◦ C−1 , C the Cayley


transform, so that λ∗ is a covering map from D onto S \ 3.
Remark. The classical elliptic modular function has (2.5.2) as fundamental domain,
but maps 0 → 1, 1 → ∞, ∞ → 0. Therefore it is h ◦ λ, where h ∈ Aut(H) maps
0 → 1, etc.

Let us pass to two important consequences of the existence of such maps λ or λ∗ .


The first is Picard’s theorem.
Theorem 2.5.2. If an entire function f : C → C omits two points of C, then f is
constant.

Proof. If f omits two points of C, choose a linear fractional transformation h such that
$f^* = h \circ f$ omits 0, 1, ∞, and consider $f^*$ as a holomorphic function from C to S \ 3.
We know that there is a covering map $\lambda^*$ from D to S \ 3, and the identity map C → C
is also a covering map. Both D and C are simply connected, so the monodromy
theorem, Theorem 1.8.2, implies that the map $f^*$ can be lifted to a map $f_*$ from one
covering space to another, in such a way that the diagram
$$\begin{array}{ccc}
C & \xrightarrow{\ f_*\ } & D \\
\big\downarrow{\scriptstyle 1} & & \big\downarrow{\scriptstyle \lambda^*} \\
C & \xrightarrow{\ f^*\ } & S \setminus 3
\end{array} \tag{2.5.5}$$
is commutative. In particular, $f_*$ is a bounded holomorphic function on C, so it is
constant by Liouville's theorem. Therefore $f^* = \lambda^* \circ f_*$ is constant, and so is
$f = h^{-1} \circ f^*$.

Exactly the same procedure reduces the following theorem of Montel to Montel’s
theorem, Corollary 2.3.2.
Theorem 2.5.3. Suppose that F is a family of rational functions on some domain,
and suppose that there are at least three points in S that are omitted by each f ∈ F .
Then F is a normal family.

Proof. Choose a linear fractional transformation h such that each f ∗ = h ◦ f omits


0, 1, ∞, and restrict to a simply connected subdomain Ω. Passing to the diagram
(2.5.5) with Ω in place of C, and any f ∈ F , we find that the family of lifts { f ∗ } is a
normal family, by Corollary 2.3.2. It follows easily that F is a normal family.

2.6 Jordan domains and Carathéodory’s extension theorem

A Jordan curve in C is a closed curve γ that is simple, i.e. does not intersect itself; see
Figure 2.4. According to the Jordan curve theorem, the complement of γ consists of
two simply connected components, one that is bounded and one that is unbounded.
We follow here the usual practice of remarking that this is intuitively obvious, and
referring elsewhere for a proof, e.g. Newman [155].

Fig. 2.4 Two Jordan curves.

The bounded component of the complement of a Jordan curve γ is called a Jordan


domain. In other words, a Jordan domain in C is a bounded domain whose boundary
is a Jordan curve. In view of Theorem 2.4.1, any two Jordan domains are conformally
equivalent. What one would wish is that a conformal map from one Jordan domain
to another extends continuously to the boundary curves, yielding a homeomorphism
between the boundaries. In fact, the wish has been granted: [39].
Theorem 2.6.1. (Carathéodory) A conformal mapping f from one Jordan domain
onto another has a continuous extension to the closures. The restriction to the bound-
ary is a homeomorphism.

Proof: We may assume that one of the domains is the unit disk. Suppose that f is
a conformal map of D onto a Jordan domain Ω. Note that the images of the disks
$D_r(0)$, 0 < r < 1, fill out Ω and, therefore, eventually cover any given compact
subset of Ω. Therefore if a sequence {z n } in D converges to a point ζ ∈ ∂D, some
subsequence of { f (z n )} converges to a point of ∂Ω.
The diameter of a bounded set S ⊂ C is defined to be the supremum of the distance
between two points of S. By assumption the boundary Γ is homeomorphic to the
unit circle, from which it follows that, given ε > 0, if two points of Γ are sufficiently
close together, then exactly one of the two subarcs of Γ determined by them has
diameter ≤ ε.
Let us look closely at f near a point ζ of the boundary of D. For 0 < r < 1, let
$\gamma_r = \gamma_r(\zeta)$ and $B_r = B_r(\zeta)$ be the arc and domain that are the parts of the circle
{z : |z − ζ| = r} and of the disk $D_r(\zeta)$ that are contained in D:
$$\gamma_r = \{ z : |z - \zeta| = r \} \cap D, \qquad B_r = \{ z : |z - \zeta| < r \} \cap D;$$
see Figure 2.5.

Fig. 2.5 Domain $B_r(\zeta)$ and arc $\gamma_r(\zeta)$, r = .4.

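The image $f(\gamma_r)$ of $\gamma_r$ has length L(r) that can be estimated using the
Cauchy–Schwarz inequality. Since $\int_{\gamma_r} |dz| < \pi r$, we obtain
$$L(r)^2 = \left( \int_{\gamma_r} |f'(z)|\, |dz| \right)^2 < \pi r \int_{\gamma_r} |f'(z)|^2\, |dz|
= \pi r \int_{\gamma_r} |f'(\zeta + re^{i\theta})|^2\, r\, d\theta.$$

Therefore
$$\int_0^\delta L(r)^2\, \frac{dr}{r} < \pi \int_0^\delta \int_{\gamma_r} |f'(\zeta + re^{i\theta})|^2\, r\, d\theta\, dr = \pi\, \mathrm{Area}(f(B_\delta)) < \infty.$$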

The fact that the integral on the left is finite implies that there is a sequence rn → 0
such that L(rn ) → 0.
Now the closure of γrn meets ∂D in two points αn and βn . The finiteness of
L(rn ) implies that as one approaches αn and βn on γrn , the image points converge to
limits an and bn on ∂Ω. Clearly |an − bn | ≤ L(rn ) → 0. Passing to a subsequence,
if necessary, we may assume that an and bn converge to a point w ∈ ∂Ω.
By the remarks above, this means that, given ε > 0, for large n exactly one of
the two subarcs of ∂Ω determined by $a_n$, $b_n$ has diameter < ε. Denote this subarc
by $\eta_n$. Together $f(\gamma_{r_n})$ and $\eta_n$ form a Jordan curve that encloses a domain $\Omega_n \subset \Omega$.
We claim that $\Omega_n = f(B_{r_n})$. In fact $\gamma_{r_n}$ divides D into two disjoint simply connected
subdomains and the images under f are two connected subdomains of Ω whose
complement in Ω is $f(\gamma_{r_n})$. Note also that since the diameters of the two parts of
the bounding curve approach 0, the diameter of $\Omega_n$ → 0. In fact every point of the
curve $f(\gamma_{r_n}) \cup \eta_n$ is within distance M of $a_n$, where M is the maximum of
diam $f(\gamma_{r_n})$ and diam $\eta_n$, so the curve lies in the closed disk of radius M about $a_n$,
and so does the domain $\Omega_n$ that it encloses. Therefore $\Omega_n$ has diameter ≤ 2M.
To complete the proof, it is enough to show that f is uniformly continuous on D.
In fact this will imply that it extends to be continuous on the closure. A conformal
map allows us to interchange the roles of the two domains Ω and D and conclude
that f −1 also extends continuously to the boundary, from which it follows that f is
a homeomorphism from one boundary onto the other.
If f were not uniformly continuous, there would be two sequences {sm }, {tm } in
D and a constant ε > 0 such that |sm − tm | → 0 but | f (sm ) − f (tm )| ≥ ε. Passing
to subsequences, we may assume that both the original sequences converge to a
point ζ which is necessarily on the boundary ∂D. Then, given n, both sequences will
eventually belong to Brn , so their images will belong to Ωn . Since the diameter of Ωn is
eventually < ε, this is a contradiction.

2.7 Hilbert spaces

For some applications we need the basics of Hilbert space theory. The starting point
is an inner product space. This is a complex vector space H , equipped with an inner
product (u, w), defined for each pair u, w in H and having the properties

$$(a_1 u_1 + a_2 u_2, w) = a_1 (u_1, w) + a_2 (u_2, w), \quad a_j \in \mathbb{C},\ u_j, w \in H; \tag{2.7.1}$$
$$(u, w) = \overline{(w, u)}, \quad u, w \in H; \tag{2.7.2}$$
$$(u, u) > 0 \ \text{ if } u \in H \text{ and } u \ne 0. \tag{2.7.3}$$

Let
$$\|u\| = \sqrt{(u, u)}.$$

A basic property is the Cauchy–Schwarz inequality

|(u, w)| ≤ ||u|| ||w||. (2.7.4)

The proof can be reduced to the case ||u|| = ||w|| = 1. Then for each a ∈ C with
|a| = 1,

$$\begin{aligned}
0 \le \|u - aw\|^2 &= (u - aw, u - aw) \\
&= \|u\|^2 - (u, aw) - (aw, u) + \|aw\|^2 \\
&= 2 - 2\,\mathrm{Re}\,\{\bar{a}(u, w)\}.
\end{aligned}$$

We may choose a with |a| = 1 in such a way that Re {ā(u, w)} = |(u, w)|.
The Cauchy–Schwarz inequality implies the triangle inequality

||u + w|| ≤ ||u|| + ||w||.



This and the positivity property (2.7.3) imply that d(u, w) = ||u − w|| is a metric.
The space H is said to be a Hilbert space if H is complete with respect to this metric.
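
For a finite-dimensional model, the inner product, the Cauchy–Schwarz inequality (2.7.4), and the triangle inequality can be checked directly. The sketch below uses C^5 with the standard inner product; this choice of space, the random vectors, and the helper names are assumptions of the example. It also checks the choice of the unimodular constant a used in the proof.

```python
import math
import random

def inner(u, w):
    """Inner product on C^n: linear in the first slot, conjugate-linear in the second."""
    return sum(a * b.conjugate() for a, b in zip(u, w))

def norm(u) -> float:
    """Norm induced by the inner product."""
    return math.sqrt(inner(u, u).real)

def random_vector(n: int):
    return [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]

random.seed(1)
u, w = random_vector(5), random_vector(5)

cs_ok = abs(inner(u, w)) <= norm(u) * norm(w)
triangle_ok = norm([a + b for a, b in zip(u, w)]) <= norm(u) + norm(w)

# With a = (u, w)/|(u, w)| we get Re{conj(a) (u, w)} = |(u, w)|, as in the proof of (2.7.4).
a = inner(u, w) / abs(inner(u, w))
phase_gap = abs((a.conjugate() * inner(u, w)).real - abs(inner(u, w)))

print(cs_ok, triangle_ok, phase_gap)
```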
First example: the space $l^2(\mathbb{Z})$ of two-sided complex sequences $x = (x_n)_{n=-\infty}^{\infty}$ such
that
$$\sum_{n=-\infty}^{\infty} |x_n|^2 < \infty,$$
with inner product
$$(x, y) = \sum_{n=-\infty}^{\infty} x_n \bar{y}_n.$$

The Cauchy–Schwarz inequality, applied to partial sums, implies that the inner prod-
uct is well defined. This space is easily shown to be complete.
Second example: the space of continuous functions u : R → C that are periodic,
with period 2π:
$$u(x + 2\pi) = u(x), \qquad (u, w) = \frac{1}{2\pi} \int_{-\pi}^{\pi} u(x)\, \overline{w(x)}\, dx.$$

The completion of this inner product space with respect to the associated metric is
L 2per (R); it can be identified with the corresponding space for the interval [0, 2π ],

L 2 ([0, 2π ]).

Two elements u, w of an inner product space are said to be orthogonal, written
u ⊥ w, if (u, w) = 0. Note that
$$u \perp w \ \Rightarrow\ \|u + w\|^2 = \|u\|^2 + \|w\|^2.$$

An orthonormal set in an inner product space H is a subset consisting of elements
$\{\varphi_j\}$ such that
$$(\varphi_j, \varphi_k) = \begin{cases} 1 & \text{if } j = k; \\ 0 & \text{if } j \ne k. \end{cases}$$

For our purposes the index set { j} here is finite or countable. Let us suppose that it
is the integers Z. If {ϕn } is an orthonormal set in H , let

$$u_n = \sum_{|j| \le n} (u, \varphi_j)\, \varphi_j.$$

An easy calculation shows that $u_n$ and $u - u_n$ are orthogonal, so
$$\sum_{|j| \le n} |(u, \varphi_j)|^2 = \|u_n\|^2 = \|u\|^2 - \|u - u_n\|^2.$$

This implies Bessel’s inequality:
$$\sum_{|j| \le n} |(u, \varphi_j)|^2 \le \|u\|^2, \tag{2.7.5}$$
and also Bessel’s equality:
$$\|u - u_n\| \to 0 \iff \sum_{j=-\infty}^{\infty} |(u, \varphi_j)|^2 = \|u\|^2. \tag{2.7.6}$$

The orthonormal set {ϕ j } is said to be complete, or an orthonormal basis if u n


converges to u for every u ∈ H. Note that $u_n$ is the element closest to u in the
subspace $H_n$ spanned by $\{\varphi_j\}_{j=-n}^{n}$. In fact if w belongs to $H_n$, then
$$\|u - (u_n + w)\|^2 = \|u - u_n\|^2 + \|w\|^2$$
is minimal when w = 0.
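
These identities can be observed in a finite-dimensional model. The sketch below works in C^16 with the discrete Fourier vectors as orthonormal set; the dimension N, the test vector u, and the function names are assumptions of the example. It checks the orthogonality of $u_n$ and $u - u_n$, the identity preceding (2.7.5), and Bessel's inequality.

```python
import cmath
import math

N = 16  # dimension of the model space C^N (an assumption of this sketch)

def inner(u, w):
    """Standard inner product on C^N, conjugate-linear in the second slot."""
    return sum(a * b.conjugate() for a, b in zip(u, w))

def norm_sq(v) -> float:
    return inner(v, v).real

def phi(j: int):
    """Discrete Fourier vector; the vectors phi(j), |j| <= 7, are orthonormal in C^16."""
    return [cmath.exp(2j * math.pi * j * k / N) / math.sqrt(N) for k in range(N)]

def partial_sum(u, n: int):
    """u_n = sum over |j| <= n of (u, phi_j) phi_j."""
    total = [0j] * N
    for j in range(-n, n + 1):
        c = inner(u, phi(j))
        total = [t + c * p for t, p in zip(total, phi(j))]
    return total

u = [complex((k * k) % 7, (-1) ** k) for k in range(N)]
n = 3
u_n = partial_sum(u, n)
residual = [a - b for a, b in zip(u, u_n)]
coeff_energy = sum(abs(inner(u, phi(j))) ** 2 for j in range(-n, n + 1))

print(abs(inner(u_n, residual)))                      # orthogonality of u_n and u - u_n
print(coeff_energy, norm_sq(u_n), norm_sq(u) - norm_sq(residual))
print(coeff_energy <= norm_sq(u))                     # Bessel's inequality (2.7.5)
```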
An element w of H induces a linear transformation

f w (u) = (u, w)

from H to C. This transformation is bounded:

| f w (u)| ≤ C ||u||

where the constant C can be taken to be ||w||. It is an important fact about Hilbert
spaces that the converse is true. A helpful identity here is the parallelogram identity:
the sum of the squares of the diagonals of a parallelogram equals the sum of the
squares of the sides:
$$\|u + w\|^2 + \|u - w\|^2 = 2\|u\|^2 + 2\|w\|^2. \tag{2.7.7}$$

Proposition 2.7.1. If f : H → C is a bounded linear transformation, then there is


a unique element w of H such that

f (u) = (u, w), all u ∈ H.

Proof: We may assume that f is not identically zero. Let H1 be the null space:
H1 = {u ∈ H : f (u) = 0}. Since f is bounded, H1 is closed. If w exists, it must be
orthogonal to H1 . Take any w0 that is not in H1 and look for an element u 0 of H1
that is closest to w0 . Then, as argued above, w1 = w0 − u 0 is orthogonal to H1 , so
w should be a multiple of w1 . To put this argument into effect, we need to show that
there is indeed a u 0 ∈ H1 closest to w0 . We choose a sequence {u n } in H1 such that

$$\lim_{n \to \infty} \|u_n - w_0\| = \inf_{u \in H_1} \|u - w_0\|.$$

Then applying (2.7.7) with $u_n - w_0$ and $u_m - w_0$ in place of u and w,
$$\begin{aligned}
\|u_n - u_m\|^2 &= 2\|u_n - w_0\|^2 + 2\|u_m - w_0\|^2 - \|(u_n + u_m) - 2w_0\|^2 \\
&= 2\|u_n - w_0\|^2 + 2\|u_m - w_0\|^2 - 4 \left\| \tfrac{1}{2}(u_n + u_m) - w_0 \right\|^2.
\end{aligned}$$

Considering how the $u_n$ were chosen, and that $\tfrac{1}{2}(u_n + u_m)$ belongs to $H_1$, we see
that $\{u_n\}$ is a Cauchy sequence. Since H is complete and $H_1$ is closed, the sequence
converges to an element $u_0 \in H_1$. The argument shows that $u_0$ is unique,
so we see that the orthogonal complement of $H_1$ is one-dimensional. For any λ ∈ C,
$w = \lambda(w_0 - u_0)$ is orthogonal to $H_1$. Therefore the function
$$f_w(u) = (u, w)$$
agrees with f on $H_1$. If we choose λ so that $\bar{\lambda}\, \|w_0 - u_0\|^2 = f(w_0 - u_0)$, then $f_w$ and f agree
on the orthogonal complement as well.

2.8 L p spaces and measure

Suppose that Ω is either a domain in C or an open interval in R. For convenience


we denote the variable in Ω by z in either case. We denote by C(Ω) the space of
continuous functions f : Ω → C. The support of a function f in any of these spaces is
the smallest relatively closed subset S of Ω such that f = 0 on the complement Ω \ S. The subspace
of C(Ω) that consists of functions with compact support is denoted Cc (Ω).
Given an index p, 1 ≤ p ≤ ∞, the $L^p$ norm of a function $f \in C_c(\Omega)$ is defined
to be
$$\|f\|_p = \left( \int_\Omega |f(z)|^p\, dm(z) \right)^{1/p}, \quad 1 \le p < \infty; \qquad \|f\|_\infty = \sup_z |f(z)|. \tag{2.8.1}$$

Here dm(z) is a convenient shorthand for dx dy in the two-dimensional case
z = x + iy, and dm(x) = dx in the one-dimensional case. Eventually m will be interpreted
as a measure.
The defining properties of a norm in a vector space of functions are

(i) || f || > 0 unless f = 0;


(ii) ||a f || = |a| || f || if a ∈ C;
(iii) || f + g|| ≤ || f || + ||g||.

It is clear that (2.8.1) satisfies the first two of these properties, but (iii) is not obvious
except for p = 1 and p = ∞. To prove (iii) for 1 < p < ∞ we introduce the concept
of a dual index, and prove an important inequality. The dual index for p, 1 ≤ p ≤ ∞,
is the index q such that
$$\frac{1}{p} + \frac{1}{q} = 1. \tag{2.8.2}$$

Thus 1 and ∞ are dual. Otherwise q = p/(p − 1). The key inequality (when
generalized to the completions $L^p$, $L^q$) is Hölder’s inequality.

Proposition 2.8.1. If f and g belong to $C_c(\Omega)$ and p, q are dual indices, then
$$\left| \int_\Omega f(z)\, g(z)\, dm(z) \right| \le \|f\|_p\, \|g\|_q. \tag{2.8.3}$$

Proof: We may normalize and assume that $\|f\|_p = \|g\|_q = 1$. We use the
elementary inequality
$$ab \le \frac{a^p}{p} + \frac{b^q}{q} \quad \text{if } a, b > 0,\ p, q > 0, \text{ and } \frac{1}{p} + \frac{1}{q} = 1.$$

Integrating this inequality, with a = |f(z)|, b = |g(z)|, gives (2.8.3).

Proposition 2.8.2. If f belongs to C_c(Ω) and p, q are dual indices, then
\[
\|f\|_p = \sup \left\{ \left| \int_\Omega f(z)\, g(z) \, dm(z) \right| : g \in C_c(\Omega), \ \|g\|_q = 1 \right\}. \tag{2.8.4}
\]

Proof: Assume 1 < p < ∞ and normalize again with || f ||_p = 1. It follows from
(2.8.3) that the right side of (2.8.4) is at most equal to || f ||_p. To prove equality,
define g(z) = 0 if f (z) = 0, and otherwise let
\[
g(z) = \frac{\overline{f(z)}}{|f(z)|} \, |f(z)|^{p-1}.
\]
Then ||g||_q = || f ||_p = 1 and f (z)g(z) = | f (z)|^p, so equality holds in (2.8.4).

Corollary 2.8.3. For 1 ≤ p ≤ ∞,

|| f + g|| p ≤ || f || p + ||g|| p .
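The corollary follows from Proposition 2.8.2 in one line. A sketch for 1 < p < ∞, where the suprema run over h ∈ C_c(Ω) with ||h||_q = 1 (the letter h is introduced here only for illustration); the cases p = 1 and p = ∞ follow directly from | f + g| ≤ | f | + |g|:

```latex
\begin{aligned}
\|f+g\|_p
  &= \sup_{\|h\|_q = 1} \left| \int_\Omega (f+g)\, h \, dm \right| \\
  &\le \sup_{\|h\|_q = 1} \left| \int_\Omega f\, h \, dm \right|
     + \sup_{\|h\|_q = 1} \left| \int_\Omega g\, h \, dm \right|
   = \|f\|_p + \|g\|_p .
\end{aligned}
```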

For 1 ≤ p < ∞, the space L p (Ω) is defined to be the completion of Cc (Ω) with
respect to the norm (2.8.1). The elements of L p (Ω) can therefore be considered as
equivalence classes of Cauchy sequences of continuous functions. They can also
be considered equivalence classes of measurable functions. Let us briefly discuss
Lebesgue measure in C; the transfer to R will be clear.
The outer measure m^*(S) of a set S ⊂ C is the infimum of the sum of the areas
A(R_n), where {R_n} is a sequence of open rectangles with sides parallel to the coordinate
axes that covers S: S ⊂ ∪ R_n. It is an exercise to show that the outer measure
of a countable set is zero, and the outer measure of a disk or a rectangle is its area.
The inner measure m_*(S) of a bounded set S is defined to be m^*(R) − m^*(R \ S) for
any rectangle R that contains S. This is independent of the choice of R. A bounded
set S is said to be measurable if m_*(S) = m^*(S), and the common value is the measure
m(S). An unbounded set S is said to be measurable if its intersection with each
rectangle R is measurable; m(S) is the supremum of m(S ∩ R). Two things are especially
worth noting here. One is that not every set is measurable. The other is that
non-measurable sets do not occur easily; it takes some ingenuity to prove that they
exist. Some facts: The complement of a measurable set is measurable. Open sets
and closed sets are measurable. The union or intersection of a countable collection
of measurable sets is measurable. If they are pairwise disjoint, the measure of the
union is the sum of the measures (countable additivity). The measure of the intersection
of a decreasing sequence of measurable sets of finite measure is the limit of their measures.
A statement about C is said to hold almost everywhere (abbreviated a.e.) if it is
true of every z ∈ C except (possibly) for a set having measure zero.
By definition, a function f : C → C is measurable if S ⊂ C open implies f^{-1}(S)
is measurable. The elements of each L^p space, 1 ≤ p < ∞, are measurable functions;
two such functions represent the same element of L p if and only if they are equal
a.e. If { f n } is a Cauchy sequence in L p , then it may not converge a.e. but some
subsequence converges a.e.
We defined the L ∞ norm of a function belonging to Cc (Ω). The completion of
this space with respect to this norm is the space C0 of continuous functions with
limit 0 at ∂Ω. The space L ∞ (Ω) is defined to be the space of measurable functions
f with the property that for some M, | f (z)| ≤ M a.e.

2.9 Convolution, approximation, and weak solutions

Limits of nice functions (holomorphic, harmonic, …) may give rise to functions that
satisfy the same equations ( f z̄ = 0, u x x + u yy = 0, …) in a “weak” sense, or in “the
sense of distributions.” In this case and some other important cases, weak solutions
are necessarily “strong solutions”: solutions in the ordinary sense. A tool for proving
this—approximation by convolution—occurs in many other contexts as well.
The functions considered in this section map a domain Ω ⊂ C to C. Given an
integer n > 0, let C_c^n(Ω) denote the space of such functions whose partial derivatives
of order up to n exist and belong to C_c(Ω). Similarly, C_c^∞(Ω) is the space of functions
that belong to C_c^n(Ω) for every n. It is not obvious that C_c^∞ = C_c^∞(C) contains
any non-zero functions. However, as we shall see, it is dense in each of the spaces
L^p = L^p(C), 1 ≤ p < ∞. A starting point is the function

\[
G(z) = \begin{cases} c \exp\left( -\dfrac{1}{1 - |z|^2} \right), & |z| < 1, \\[1ex] 0, & |z| \ge 1. \end{cases} \tag{2.9.1}
\]

It is an exercise to show that, for any c > 0, this function belongs to C_c^∞. It is positive
for |z| < 1 and otherwise vanishes. We choose c so that
\[
\int_{\mathbf{C}} G(z) \, dm(z) = 1. \tag{2.9.2}
\]

Given 0 < ε ≤ 1, let G_ε(z) = ε^{-2} G(z/ε). Then the support of G_ε is {z : |z| ≤ ε}
and
\[
\int_{\mathbf{C}} G_\varepsilon(z) \, dm(z) = 1.
\]

Thus G ε has L 1 norm 1, but is more and more concentrated near 0. As we shall see,
convolution with G ε is a systematic way of taking smooth averages of translates of
a function, in such a way that the averages get close to the function itself.
The convolution of two functions f, g ∈ C_c is the function f ∗ g defined by
\[
f * g(z) = \int_{\mathbf{C}} f(z - w)\, g(w) \, dm(w).
\]

A change of variables shows that

f ∗ g = g ∗ f.

Consider a Riemann sum approximation
\[
f * g(z) = g * f(z) \sim \sum_j g(z - x_j)\, f(x_j)\, (x_{j+1} - x_j). \tag{2.9.3}
\]
Each translate g_j(z) = g(z − x_j) has the same L^p norm as g, so the L^p norm of the
approximation (2.9.3) is at most
\[
\Big[ \sum_j |f(x_j)| \, (x_{j+1} - x_j) \Big] \, \|g\|_p .
\]

Taking a limit gives

|| f ∗ g|| p ≤ || f ||1 ||g|| p , 1 ≤ p < ∞. (2.9.4)

Since C_c is dense in each L^p, 1 ≤ p < ∞, it follows that convolution can be extended
to pairs f ∈ L^1, g ∈ L^p, and (2.9.4) remains valid.
If f belongs to Cc1 , then the partial derivatives satisfy

( f ∗ g)x = f x ∗ g, ( f ∗ g) y = f y ∗ g. (2.9.5)

Theorem 2.9.1. Let {G_ε} ⊂ C_c^∞ be the family of functions defined above. Then for
each f ∈ L^p, 1 ≤ p < ∞, the functions f_ε = G_ε ∗ f belong to C^∞ and
\[
\lim_{\varepsilon \to 0} \|f_\varepsilon - f\|_p = 0. \tag{2.9.6}
\]

Proof: For p > 1, each G_ε belongs to the dual space L^q, 1/p + 1/q = 1, so Hölder’s
inequality (2.8.3) gives a bound for G_ε ∗ f, and translations of G_ε ∗ f involve translations
of G_ε. For p = 1, the same argument works using boundedness of G_ε. Combining
this argument with (2.9.5) and iterations, we see that each f_ε is infinitely
differentiable.
It is enough to prove (2.9.6) on a dense subset. For f ∈ Cc , a change of variables
implies that
\[
f_\varepsilon(z) - f(z) = G_\varepsilon * f(z) - f(z) = \int_{\mathbf{C}} G(w)\, [f(z - \varepsilon w) - f(z)] \, dm(w). \tag{2.9.7}
\]
This identity, which uses (2.9.2), implies that f_ε converges uniformly to f, since f is
uniformly continuous. Moreover, for 0 < ε ≤ 1, f_ε vanishes outside a fixed bounded set
(the set of points at distance ≤ 1 from the support of f ). Therefore (2.9.6) holds.
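The convergence f_ε → f can be watched numerically. The following is a minimal sketch under two assumptions that are not in the text: it works on the real line rather than in C (so the scaling is G_ε(x) = ε^{-1} G(x/ε)), and it takes G to be the standard bump exp(−1/(1 − x^2)) on (−1, 1), normalized by a midpoint rule. All function names are illustrative.

```python
import math

def bump(x):
    """Unnormalised smooth bump: exp(-1/(1 - x^2)) on (-1, 1), zero elsewhere."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def midpoint(func, a, b, n=2000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

# Normalising constant, playing the role of c in (2.9.2).
C_NORM = 1.0 / midpoint(bump, -1.0, 1.0)

def mollify(f, eps, x, n=400):
    """f_eps(x) = integral over |w| < 1 of G(w) f(x - eps*w) dw, with G = C_NORM * bump."""
    return midpoint(lambda w: C_NORM * bump(w) * f(x - eps * w), -1.0, 1.0, n)

def hat(x):
    """A continuous, compactly supported test function with corners at 0 and at +-1."""
    return max(1.0 - abs(x), 0.0)

def sup_error(eps, grid=201):
    """Maximum of |f_eps - f| over an equally spaced grid on [-2, 2]."""
    pts = [-2.0 + 4.0 * k / (grid - 1) for k in range(grid)]
    return max(abs(mollify(hat, eps, x) - hat(x)) for x in pts)
```

For this test function the error is concentrated at the corners, where averaging smooths the kink, and it shrinks in proportion to ε, in line with the uniform convergence used in the proof.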
Corollary 2.9.2. The space Cc∞ is dense in each L p , 1 ≤ p < ∞.
Now let us turn to weak solutions. The idea is that if, say, f is a C^1 function and
φ ∈ C_c^1, then
\[
\int_{\mathbf{C}} g\, \varphi = - \int_{\mathbf{C}} f\, \varphi_x, \qquad g = f_x. \tag{2.9.8}
\]

If we assume that f and g are functions that belong to the space of locally integrable
functions L^1_loc, meaning that ∫_B |g| < ∞ and ∫_B | f | < ∞ for each bounded set B,
then both sides of (2.9.8) are well defined for each φ ∈ C_c^1. If equality holds for each
such “test function” φ, then f is said to satisfy f_x = g in the weak sense, or f_x = g
weakly. Thus f can in principle be a weak solution without having a continuous
derivative. However if the partial derivative f_x exists and is continuous, then an
integration by parts in (2.9.8) leads to
\[
\int_{\mathbf{C}} (f_x - g)\, \varphi = 0
\]
for every test function φ, from which it follows that f_x − g = 0 a.e., so essentially f
is an ordinary (“strong”) solution of f_x = g. Equations of higher order are treated in
the same way.
Without loss of generality, and to accommodate equations of any order, we may
take the set of test functions in any case to be Cc∞ .
The three cases to be considered here, for later use, are the weak solution of f_z̄ = 0
in a domain Ω, characterized by
\[
\int_\Omega f\, \varphi_{\bar z} = 0, \quad \text{all } \varphi \in C_c^\infty(\Omega), \tag{2.9.9}
\]
the weak solution of Δf ≡ f_xx + f_yy = 0 in a domain Ω, characterized by
\[
\int_\Omega f\, \Delta\varphi = 0, \quad \text{all } \varphi \in C_c^\infty(\Omega), \tag{2.9.10}
\]
and the system f_z = p, f_z̄ = q, under the assumption that p and q are weak solutions
of p_z̄ = q_z. The results concerning (2.9.9) and (2.9.10) are both known as Weyl’s
lemma.
Theorem 2.9.3. Suppose f ∈ L^p_loc(Ω) is a weak solution of f_z̄ = 0 in Ω. Then f
is holomorphic in Ω.

Proof: Extend f to C by setting f = 0 on the complement of Ω. The extended
f still satisfies (2.9.9) for each φ ∈ C_c^∞(Ω). Given such a function φ, note that for
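sufficiently small ε > 0, the support of the convolution G_ε ∗ φ is also a compact
subset of Ω. Let f_ε = G_ε ∗ f. Then for small ε, using the fact that G_ε is an even
function, we have
\[
\begin{aligned}
\int_\Omega (f_\varepsilon)_{\bar z}(z)\, \varphi(z) \, dm(z)
  &= \int_\Omega [(G_\varepsilon)_{\bar z} * f](z)\, \varphi(z) \, dm(z) \\
  &= \int_\Omega \int_\Omega (G_\varepsilon)_{\bar z}(z - w)\, f(w)\, \varphi(z) \, dm(w) \, dm(z) \\
  &= \int_\Omega f(w) \left[ \int_\Omega (G_\varepsilon)_{\bar z}(z - w)\, \varphi(z) \, dm(z) \right] dm(w) \\
  &= - \int_\Omega f(w)\, [G_\varepsilon * \varphi_{\bar w}](w) \, dm(w) = 0.
\end{aligned}
\]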

It follows that for any given open disk D whose closure is in Ω, f ε is eventually holo-
morphic in D. The restrictions f ε | D converge to f | D in L 1 (D). By Proposition 2.3.7,
f is holomorphic in D. Thus f is holomorphic in Ω.

The proof of Theorem 2.9.3 adapts readily to the harmonic case.

Theorem 2.9.4. Suppose f ∈ L^p_loc(Ω) is a weak solution of Δf = 0 in Ω. Then f
is harmonic in Ω.
Now we come to the third case mentioned above. Suppose here that p and q are
functions in C 1 . We can construct a function f ∈ C 2 such that f z = p, f z̄ = q if and
only if p and q satisfy the equality-of-mixed-partials condition pz̄ = qz . In fact the
unique solution with f (0) = 0 must be given by the formula
\[
\begin{aligned}
f(z, \bar z) &= \int_0^1 \frac{\partial}{\partial s} \big[ f(sz, s\bar z) \big] \, ds \\
  &= \int_0^1 [\, z f_z(sz, s\bar z) + \bar z f_{\bar z}(sz, s\bar z) \,] \, ds \\
  &= \int_0^1 [\, z\, p(sz, s\bar z) + \bar z\, q(sz, s\bar z) \,] \, ds.
\end{aligned} \tag{2.9.11}
\]
It is an exercise to verify that the function defined by the last integral satisfies f_z = p,
f_z̄ = q. Now we want to carry this over to the case of weak solutions of p_z̄ = q_z.
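Formula (2.9.11) can be tried on a concrete pair. The sketch below is illustrative only: it assumes the example f(z) = z^2 z̄ + z (so p = f_z = 2 z z̄ + 1 and q = f_z̄ = z^2, which satisfy p_z̄ = 2z = q_z), and it evaluates the integral over s with the composite Simpson rule, which is exact here because the integrand is quadratic in s. The function name is not from the text.

```python
def line_integral_potential(p, q, z, n=200):
    """Approximate the integral over 0 <= s <= 1 of z*p(s z) + conj(z)*q(s z), as in (2.9.11).

    p and q take a complex point w and may use w.conjugate(); n must be even.
    """
    zb = z.conjugate()

    def integrand(s):
        w = s * z
        return z * p(w) + zb * q(w)

    h = 1.0 / n
    total = integrand(0.0) + integrand(1.0)
    for k in range(1, n):
        # Simpson weights: 4 at odd nodes, 2 at even interior nodes.
        total += (4 if k % 2 else 2) * integrand(k * h)
    return total * h / 3.0

# Assumed example: f = z^2 * conj(z) + z, so p = f_z and q = f_zbar are as follows.
p_example = lambda w: 2 * w * w.conjugate() + 1
q_example = lambda w: w * w
f_example = lambda z: z * z * z.conjugate() + z
```

Calling `line_integral_potential(p_example, q_example, z)` at any complex z should reproduce `f_example(z)` up to rounding.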
Theorem 2.9.5. Suppose that p and q are continuous functions whose weak derivatives
p_z̄ and q_z belong to L^1_loc and are weak solutions of p_z̄ = q_z, i.e. for each φ ∈ C_c^∞,
\[
\int_{\mathbf{C}} [\, p\, \varphi_{\bar z} - q\, \varphi_z \,] = 0. \tag{2.9.12}
\]
Then there is a C^1 function f such that f_z = p, f_z̄ = q.

Proof: With G_ε as above, let p_ε = G_ε ∗ p and q_ε = G_ε ∗ q. The same type of argument
that we used in the proof of Theorem 2.9.3 shows that (p_ε)_z̄ = (q_ε)_z. Therefore
we may construct f ε by using the integral in (2.9.11) with p and q replaced by pε and
qε . Since p and q are assumed to be continuous, the formula (2.9.7), with f replaced
by p or q shows that pε and qε converge to p and q uniformly on bounded sets. There-
fore f ε converges to f in C 1 , with f z = p, f z̄ = q.

2.10 The gamma function

The gamma function Γ(z) is usually defined for Re z > 0 by
\[
\Gamma(z) = \int_0^\infty e^{-t}\, t^{z-1} \, dt. \tag{2.10.1}
\]

An integration by parts yields the functional equation

Γ (z + 1) = z Γ (z). (2.10.2)

In particular, Γ(1) = 1 and Γ(n + 1) = n!, n = 1, 2, 3, . . . . Moreover, since the
left side of (2.10.2) is defined for Re z > −1, we may use (2.10.2) to extend the
definition of Γ(z) to the strip −1 < Re z ≤ 0. Continuing this process allows us to
extend Γ to the complement of the non-positive integers. The result is a function
meromorphic in C with simple poles at the non-positive integers. The residue of Γ at
−n, n = 0, 1, 2, 3, . . . , is (−1)^n / n!.
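The residue formula is easy to test numerically, since at a simple pole the residue is the limit of ε Γ(−n + ε) as ε → 0. A small sketch using the standard-library `math.gamma`; the step ε = 10^{-6} and the function names are arbitrary choices made for illustration.

```python
import math

def residue_estimate(n, eps=1e-6):
    """Estimate the residue of Gamma at -n as eps * Gamma(-n + eps)."""
    return eps * math.gamma(-n + eps)

def residue_formula(n):
    """The value (-1)^n / n! stated in the text."""
    return (-1) ** n / math.factorial(n)
```

The two functions should agree to roughly six significant digits for small n; the discrepancy is of order ε.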
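A particularly important case is Γ(1/2). Now
\[
\Gamma(1/2) = \int_0^\infty e^{-t}\, t^{-1/2} \, dt = 2 \int_0^\infty e^{-t} \, d(t^{1/2}) = 2 \int_0^\infty e^{-s^2} \, ds = \int_{-\infty}^\infty e^{-s^2} \, ds
\]
and
\[
\begin{aligned}
\left( \int_{-\infty}^\infty e^{-s^2} \, ds \right)^2
  &= \iint e^{-(x^2 + y^2)} \, dx \, dy = \iint e^{-r^2}\, r \, dr \, d\theta \\
  &= 2\pi \int_0^\infty e^{-r^2}\, r \, dr = \pi \int_0^\infty e^{-u} \, du = \pi,
\end{aligned} \tag{2.10.3}
\]
so
\[
\Gamma(1/2) = \sqrt{\pi}. \tag{2.10.4}
\]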

The asymptotics of the gamma function are given by Stirling’s approximation:

Theorem 2.10.1.
\[
\Gamma(x) = \left( \frac{x}{e} \right)^x \left[ \left( \frac{2\pi}{x} \right)^{1/2} + O(x^{-3/2}) \right] \tag{2.10.5}
\]
as x → +∞.

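Proof. By (2.10.2),
\[
\Gamma(x) = \frac{\Gamma(x+1)}{x} = \frac{1}{x} \int_0^\infty e^{-t}\, t^x \, dt.
\]
The integrand is maximal at t = x, so with t = xu we have
\[
\Gamma(x) = \frac{1}{x} \int_0^\infty e^{-xu} (xu)^x \, x \, du
          = \left( \frac{x}{e} \right)^x \int_0^\infty \left( u\, e^{1-u} \right)^x du.
\]
This integrand decays exponentially away from u = 1. Comparing Taylor expansions
at u = 1 shows that
\[
u\, e^{1-u} = e^{-(u-1)^2/2} \left[ 1 + \tfrac{1}{3} (u-1)^3 + O((u-1)^4) \right].
\]
Therefore, setting u − 1 = s/√x, we have
\[
\int_0^\infty \left( u\, e^{1-u} \right)^x du
  = \int_{-\infty}^\infty e^{-s^2/2} \left[ 1 + x \left\{ \tfrac{1}{3} s^3 x^{-3/2} + O(s^4 x^{-2}) \right\} \right] \frac{ds}{\sqrt{x}}.
\]
Since e^{−s^2/2} s^3 is an odd function, its integral vanishes and we obtain
\[
\int_0^\infty \left( u\, e^{1-u} \right)^x du
  = \int_{-\infty}^\infty e^{-s^2/2} \left[ 1 + O(s^4 x^{-1}) \right] \frac{ds}{\sqrt{x}}.
\]
The identity (2.10.3) is equivalent to
\[
\int_{-\infty}^\infty e^{-s^2/2} \, ds = \sqrt{2\pi}.
\]
Combining these estimates, we get (2.10.5).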
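The estimate (2.10.5) says that the relative error of the leading term (x/e)^x (2π/x)^{1/2} is O(1/x). A short numerical sketch, assuming the standard-library functions `math.lgamma` and `math.expm1` and working through logarithms so that large x does not overflow; the function names are illustrative.

```python
import math

def stirling(x):
    """Leading term (x/e)^x * sqrt(2*pi/x) of (2.10.5)."""
    return (x / math.e) ** x * math.sqrt(2.0 * math.pi / x)

def relative_error(x):
    """Gamma(x) / stirling(x) - 1, computed via logarithms to avoid overflow."""
    log_stirling = x * (math.log(x) - 1.0) + 0.5 * math.log(2.0 * math.pi / x)
    return math.expm1(math.lgamma(x) - log_stirling)
```

Multiplying `relative_error(x)` by x should give values close to 1/12 for large x, which is the known coefficient of the next term in Stirling's series and is consistent with the O(x^{-3/2}) remainder inside the bracket.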


Remark. The estimate (2.10.5) can be shown to hold uniformly in any closed sector
that omits the negative half-axis:
\[
\Gamma(z) = \left( \frac{z}{e} \right)^z \left[ \left( \frac{2\pi}{z} \right)^{1/2} + O(|z|^{-3/2}) \right]
\]
as z → ∞, uniformly for | arg z| ≤ π − δ, for any δ > 0. See [21] or [22].

Remarks and further reading

The material in this chapter is covered in many standard texts on complex analy-
sis, functional analysis, or distribution theory, according to the topic. The topics in
Sections 2.1 and 2.2 are treated more leisurely in [22]. Normal families are covered
extensively in Schiff [184]. There are many texts on conformal mapping; two classics
are Cohn [46] and Nehari [152].
Riemann’s original argument for the Riemann mapping theorem came in for much
criticism from Schwarz and others. Subsequent arguments were given, under various
assumptions on the boundary of the simply connected domain, before Koebe’s
definitive result; see Tazzioli [200] and Gray [92].
Boundary behavior of conformal maps is treated comprehensively in Pommerenke
[171]. For much more on the material in Sections 2.7 and 2.8 see, for example, Folland
[78], Rudin [182], or Stein and Shakarchi [195], [196]. For Section 2.9, see Georgiev
[90] or Duistermaat and Kolk [61]. The gamma function is treated in any book on
special functions, e.g. [21].
Chapter 3
Complex dynamics

Mathematically, a dynamical system generally consists of a product space X × T
and a function f : X × T → X, with X thought of as “space” and T as “time.”
Time is usually taken to be continuous, with T = R or T = [0, ∞), or discrete, with
T = Z or {0, 1, 2, . . . }. The simplest situation is the autonomous discrete case with
f independent of t. In other words, f : X → X and one studies the iterates f, f ◦ f,
f ◦ f ◦ f, … . To have a convenient notation that will not be confused with powers
of f (if X has a multiplication), we set f^{∘0} = 1, the identity map, and

f^{∘(n+1)} = f ◦ f^{∘n}, n = 0, 1, 2, . . . .

If f is invertible, this can be extended to f^{∘n}, n = −1, −2, . . . . (We shall always
assume that f itself is not the identity map.)
One wants to understand the orbits

{ f (x), f ◦2 (x), f ◦3 (x), . . . }, x ∈ X.

It is useful also to consider the backward orbit of a point z 0 : the set

{z : f ◦n (z) = z 0 , some n ≥ 1.} (3.0.1)

Two maps f and g from X to itself are said to be conjugate if there is an invertible
map φ : X → X such that
f ◦ φ = φ ◦ g.

Then
f ◦n = φ ◦ g ◦n ◦ φ −1 ,

so the dynamics of g can be read off from the dynamics of f and conversely.
Among the most-studied cases are those in which X is a space of one real or one
complex dimension and f is a rational function—even a polynomial. These cases
allow surprisingly intricate behavior, as anyone knows who has encountered the
Mandelbrot set. This study began in earnest in the 19th century, with the examination
of Newton’s method for approximating zeros. It flourished in the early 20th century
with work by Fatou, Julia, and others. It revived later in the 20th century, sparked
in part by connections with chaos theory and fractals, and was given striking visual
form by the development of powerful computers and computational techniques.
In this chapter we introduce the subject, concentrating on the case X = S, the Rie-
mann sphere, and f a rational function: f = P/Q where P and Q are polynomials
with no common zero. The degree of f is

deg f = max{deg P, deg Q}.

The rational functions of degree 1 are the linear fractional transformations.


A residue calculus argument shows that f takes each value in S deg f times,
counting multiplicity. This shows that degree is multiplicative:

deg( f ◦ g) = deg f · deg g. (3.0.2)

In particular,
deg f^{∘n} = (deg f )^n. (3.0.3)

This fact suggests (or even insists) that the study of rational dynamics is much simpler
in the case of degree 1. We deal with this case immediately.
As noted above, the case deg f = 1 is the case f ∈ Aut(S). To normalize, we
note that f has exactly one or two fixed points. Consider first the case of a single
fixed point, which we may take to be z = ∞. Then

f (z) = z + b, b ≠ 0, f^{∘n}(z) = z + nb,

and for each z ∈ S, f^{∘n}(z) → ∞ as n → ±∞.

Now suppose f has two fixed points, which we may take to be 0 and ∞. Then
f (z) = az for some a ≠ 0. If |a| < 1, then | f^{∘n}(z)| = |a|^n |z| shrinks any bounded
set to {0} at a geometric rate. If |a| > 1, f^{∘n} shrinks closed sets not containing the
origin to the point at ∞. Inversion z → 1/z conjugates one case to the other; thus as
dynamical systems they are identical.
If |a| = 1, the situation is more interesting. Here a = e^{2πiθ} and the behavior is
very different depending on whether θ is rational or not: whether some iterate f^{∘m}
is the identity map (so that the orbit of each point consists of finitely many points),
or not.
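The dichotomy for |a| = 1 can be seen by iterating f(z) = az directly. This is a small sketch, assuming floating-point arithmetic and a tolerance for "returns to the starting point"; the function names, the tolerance, and the cutoff of 1000 steps are illustrative choices, not part of the text.

```python
import cmath

def rotate(theta, z, n=1):
    """Apply f(z) = a*z with a = exp(2*pi*i*theta), n times."""
    a = cmath.exp(2j * cmath.pi * theta)
    for _ in range(n):
        z = a * z
    return z

def first_return(theta, z0=1.0, max_steps=1000, tol=1e-9):
    """Smallest m >= 1 with |f^{(m)}(z0) - z0| < tol, or None if none occurs within max_steps."""
    a = cmath.exp(2j * cmath.pi * theta)
    z = z0
    for m in range(1, max_steps + 1):
        z = a * z
        if abs(z - z0) < tol:
            return m
    return None
```

For θ = 3/7 the orbit closes up after 7 steps, while for the irrational value θ = (√5 − 1)/2 no return is detected within the cutoff. A finite computation cannot prove irrationality, so the second case is only suggestive.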
As we shall see, things can become much more interesting than this, even for
f a quadratic. In Section 3.1 we define the Fatou set F( f ) and the Julia set J( f ),
and examine the examples f (z) = z^2 + c with c = 0, c = −2, and c = −6. General
properties of F and J are developed in Section 3.2.
Fixed points and periodic points of f play a dominant role in the study of dynam-
ics. Section 3.3 establishes general properties for rational f of degree ≥ 2. In Section
3.4 we look more closely at the general features of attracting and repelling fixed
points. Section 3.5 introduces the basic theory of “neutral” fixed points. The special
case of “parabolic” fixed points is examined in more detail in Section 3.6.
Finally, in Section 3.7 we describe the classification theorem, which characterizes
the connected components of the Fatou set. We also give one more illustration of
dependence on parameters by a brief discussion of the Mandelbrot set.

3.1 Fatou sets and Julia sets; some examples

We begin here a systematic consideration of discrete dynamics with f a rational function
of degree ≥ 1. For practical purposes this means deg f > 1, since we obtained a
complete picture (up to normalization) for degree 1 in the introduction to this chapter.
Even the case of quadratic polynomials illustrates many of the general phenomena. We
note that a given quadratic polynomial f can be conjugated by an affine map φ(ζ) = aζ + b
to either of the canonical forms

f_c(ζ) = ζ^2 + c; f_c(ζ) = ζ(ζ + c); (3.1.1)

see Exercise 2.
The Fatou set F = F( f ) of f is the set of points z ∈ S such that the family of
iterates { f^{∘n} }, n = 1, 2, . . . , when restricted to some small enough neighborhood U of z, is
a normal family: for any subsequence { f^{∘n_k} } some subsequence { f^{∘n_{k_j}} } converges
uniformly (in S) on compact subsets of U. The complement J = J( f ) is the Julia set
of f. Thus, by definition, F is open and J is closed.
Two maps f : Ω → Ω and g : Ω′ → Ω′ are said to be conformally conjugate if
there is a conformal map Φ : Ω′ → Ω such that

g = Φ^{-1} ◦ f ◦ Φ. (3.1.2)

Again this means that the dynamics are the same. In fact

g^{∘n} = Φ^{-1} ◦ f^{∘n} ◦ Φ. (3.1.3)

Remark. We may always study the dynamics of f at up to three prescribed
points z_j ∈ S by using a Möbius transformation Φ so that this amounts to studying
the dynamics of g at prescribed points w_j. In particular, behavior near a pole or at
infinity can be reduced to behavior at a regular point or at a finite point.

Example. Consider the triple

f : Ω → Ω, Ω = {z : z ∉ [−2, 2]}, f (z) = z^2 − 2;
g : Ω′ → Ω′, Ω′ = {z : |z| > 1}, g(z) = z^2;
Φ : Ω′ → Ω, Φ(z) = z + z^{-1}. (3.1.4)

Then (3.1.2) is satisfied; see Exercise 3. The same is true if we replace the domain Ω′
by Ω″ = {z : |z| < 1}. Now it is easily checked that the Julia sets of g and f are
J(g) = {z : |z| = 1}, J( f ) = [−2, 2]; (3.1.5)

Exercise 4.
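The conjugacy in this example amounts to the identity f(Φ(z)) = Φ(g(z)) for f(z) = z^2 − 2, g(z) = z^2, Φ(z) = z + 1/z, which can be spot-checked in floating point. A minimal sketch; the helper names `conjugacy_defect` and `circle_image` are introduced here for illustration only.

```python
import cmath

def f(z):
    return z * z - 2

def g(z):
    return z * z

def Phi(z):
    return z + 1 / z

def conjugacy_defect(z, n=1):
    """|f^{(n)}(Phi(z)) - Phi(g^{(n)}(z))| at a sample point z != 0."""
    left, right = Phi(z), z
    for _ in range(n):
        left, right = f(left), g(right)
    return abs(left - Phi(right))

def circle_image(theta):
    """Phi(exp(i*theta)), which equals 2*cos(theta) and so lies in [-2, 2]."""
    return Phi(cmath.exp(1j * theta))
```

The second helper illustrates why the unit circle J(g) corresponds to the segment J(f) = [−2, 2]: Φ maps the circle onto that segment.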
These very simple Julia sets are not representative of the general situation, even
for quadratic polynomials.
A more instructive example is

f (z) = z^2 − 6, (3.1.6)

with the two-valued inverse map

g(z) = √(6 + z).

Proposition 3.1.1. If f (z) = z^2 − 6, then

(a) The fixed points of f in S are {−2, 3, ∞}.
(b) If |z| ≥ 3 then | f (z)| ≥ 3, with equality only if z = ±3.
(c) If |z| > 3 then | f (z)| − 3 > 2(|z| − 3), so f^{∘n}(z) → ∞.
(d) If |z| ≤ √3 then | f (z)| ≥ 3.
(e) If |z| ≤ 3, then |Im g(z)| ≤ |Im z| / (2√3).

Proof. This is left as Exercise 5.


By following up properties (a) to (e), we can show that the Julia set J( f ) is a
Cantor set. Note first that it follows from (b), (c), (d) that, with the exception of ±3
and ±√3,
\[
f^{\circ n}(z) \to \infty \quad \text{if } |z| \ge 3 \text{ or } |z| \le \sqrt{3}. \tag{3.1.7}
\]
It follows from (e) that if |z| < 3 and Im z ≠ 0, then |Im f^{∘n}(z)| increases geometrically
so long as f^{∘n}(z) remains in D_3(0). Because of this we can sharpen (3.1.7):
\[
f^{\circ n}(z) \to \infty \quad \text{if } z \notin [-3, 3]. \tag{3.1.8}
\]
For real x, we know that f^{∘n}(x) → ∞ if
\[
x \in A_0 = (-\sqrt{3}, \sqrt{3}).
\]
But then the same is true if x ∈ g(A_0), and so on. Let
\[
A_n = g^{\circ n}(A_0) = \{ x \in (-3, 3) : |f^{\circ n}(x)| < 3, \ |f^{\circ (n+1)}(x)| > 3 \}.
\]
Then the sets A_n are disjoint, and for real x in [−3, 3],
\[
f^{\circ n}(x) \to \infty \quad \text{if and only if} \quad x \in \bigcup_{n=0}^{\infty} A_n. \tag{3.1.9}
\]
Now (each branch of) g is continuous and strictly monotone on [−3, 3], so we simply
need to track the endpoints ±√3. The first stage is
\[
A_1 = g(A_0) = \left( -(6+\sqrt{3})^{1/2}, \, -(6-\sqrt{3})^{1/2} \right) \cup \left( (6-\sqrt{3})^{1/2}, \, (6+\sqrt{3})^{1/2} \right).
\]
Note that these are subintervals of [−3, −√3] and [√3, 3], respectively.
Let B_n = ∪_{j=0}^{n} A_j.

Lemma 3.1.2. B_n consists of 2^{n+1} − 1 disjoint intervals. The 2^n intervals of A_n
interlace the 2^n − 1 intervals of B_{n−1}.
Proof. This is clearly the case at n = 1. Suppose that the assertion is true for n. Then
the positive and negative parts of g(A_n) each consist of 2^n disjoint intervals, so A_{n+1}
consists of 2^{n+1} disjoint intervals. The positive part of g(B_{n−1}) consists of 2^n − 1 disjoint
intervals, interlaced by the intervals of the positive part of g(A_n). These last are the
positive part of A_{n+1}. By symmetry the intervals of A_{n+1} interlace those of
g(B_{n−1}) = B_n \ A_0. Since A_{n+1} is disjoint from A_0, and all the intervals in the positive
part of g(B_{n−1}) lie to the right of A_0, the interlacing property carries over to all of B_n.
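The counting and interlacing in Lemma 3.1.2 can be checked by tracking endpoints, as the text suggests. The sketch below assumes floating-point endpoints and represents each A_n as a sorted list of open intervals (a, b); the function names are illustrative. Both branches ±√(6 + x) are applied to each interval, the negative branch reversing the order of the endpoints.

```python
import math

def preimage(intervals):
    """Apply both branches of g(x) = ±sqrt(6 + x) to a list of open intervals in (-3, 3)."""
    pos = [(math.sqrt(6 + a), math.sqrt(6 + b)) for a, b in intervals]
    neg = [(-hi, -lo) for lo, hi in pos]
    return sorted(neg + pos)

def stages(n_max):
    """Return [A_0, A_1, ..., A_{n_max}], each as a sorted list of intervals."""
    s3 = math.sqrt(3)
    result = [[(-s3, s3)]]
    for _ in range(n_max):
        result.append(preimage(result[-1]))
    return result

def interlaces(a_intervals, b_intervals):
    """True if, left to right, the intervals alternate A, B, A, ..., A without overlapping."""
    tagged = sorted([(iv, "A") for iv in a_intervals] + [(iv, "B") for iv in b_intervals])
    tags_ok = all(tag == ("A" if k % 2 == 0 else "B") for k, (_, tag) in enumerate(tagged))
    gaps_ok = all(left[0][1] < right[0][0] for left, right in zip(tagged, tagged[1:]))
    return tags_ok and gaps_ok and len(tagged) % 2 == 1
```

With `S = stages(6)`, each `S[n]` should contain 2^n intervals, and `interlaces(S[n], B)` should hold when `B` collects the intervals of `S[0]`, …, `S[n-1]`.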
Recall that the original Cantor set C is obtained by the process of successively
removing the open middle third of each interval in a union of closed intervals, starting
with the unit interval. In general, a Cantor set is any set that is homeomorphic to C.
Proposition 3.1.3. Let f (z) = z^2 − 6. The Julia set J( f ) is a Cantor set contained
in the interval [−3, 3]. For every z in the Fatou set, f^{∘n}(z) → ∞.

Proof. Since B_∞ = ∪ B_n is an open set in R, it follows from (3.1.8) and (3.1.9) that
the Fatou set contains
B_∞ ∪ (S \ [−3, 3]).

By Lemma 3.1.2, to show that C = [−3, 3] \ B_∞ is a Cantor set it is enough to show
that the lengths of the intervals in the complement of B_n in [−3, 3] shrink to zero.
This follows immediately from the fact that
\[
|g'(x)| = \frac{1}{2\sqrt{6+x}} \le \frac{1}{2\sqrt{3}}.
\]
It follows that points whose orbits converge to 3 are dense in the complement of B_∞
in [−3, 3]. Therefore

F( f ) = B_∞ ∪ (S \ [−3, 3]), J( f ) = C ≡ [−3, 3] \ B_∞.

We close this with three images that give some idea of the possible variations in
form of J ( f ), even for quadratic polynomials. The first is a Cantor set: the appearance
of some connectedness is deceiving and is violated on a small enough scale. The
second is a case of a figure known as a Douady rabbit. The third begins to show how
elaborate Julia sets of the second type can become (Figures 3.1, 3.2, and 3.3).

Fig. 3.1 Julia set for f (z) = z 2 + (−.766 + .083i): a Cantor set.

Fig. 3.2 Julia set for f (z) = z 2 + (−.122 + .745i): a Douady rabbit.

Fig. 3.3 Julia set for f (z) = z 2 + (.125 + .604i): an elaborated rabbit.

These three figures were obtained from the JavaScript Julia set generator at
https://marksmath.org/visualization/julia_sets. The reader is invited to explore the site for
other examples with f (z) = z^2 + c, for one’s choice of c.
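Rough pictures of this kind can also be produced with the standard escape-time test. The sketch below is not the method of the generator cited above (which is not described in the text); it is a generic illustration with hypothetical function names. It marks a point as "bounded" if its orbit under z → z^2 + c stays within radius max(2, |c|) for a fixed number of steps, which approximates the filled Julia set rather than J( f ) itself.

```python
def escape_time(z, c, max_iter=200, radius=None):
    """Number of steps of w -> w*w + c before |w| exceeds the radius; max_iter if it never does."""
    if radius is None:
        # If |w| > R = max(2, |c|), then |w^2 + c| >= |w|^2 - |c| > |w|(|w| - 1) > |w|,
        # so the orbit tends to infinity once it leaves the disk of radius R.
        radius = max(2.0, abs(c))
    w = z
    for n in range(max_iter):
        if abs(w) > radius:
            return n
        w = w * w + c
    return max_iter

def ascii_julia(c, width=61, height=25, extent=1.6, max_iter=60):
    """Character picture of the points in [-extent, extent]^2 whose orbits have not escaped."""
    rows = []
    for j in range(height):
        y = extent * (1 - 2 * j / (height - 1))
        row = ""
        for i in range(width):
            x = extent * (2 * i / (width - 1) - 1)
            row += "#" if escape_time(complex(x, y), c, max_iter) == max_iter else "."
        rows.append(row)
    return "\n".join(rows)
```

For instance, `print(ascii_julia(-0.122 + 0.745j))` gives a coarse view of the rabbit of Fig. 3.2. For c = −6 the fixed points −2 and 3 of Proposition 3.1.1 never escape, while 0 escapes after two steps.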
In subsequent sections we shall see how the general theory accounts for some of
the properties of these figures, such as self-similarity.

3.2 Julia sets: invariance, density, and self-similarity

We assume throughout that f : S → S is a rational map of degree d ≥ 2. Let f^{-1}
denote the d-valued inverse. In this section we begin discussion of the general properties
of the Julia set J = J( f ).

Proposition 3.2.1. (a) The Julia set J = J( f ) is not empty.
(b) J is fully invariant for f :

f (J) = J = f^{-1}(J). (3.2.1)

(c) For any k ≥ 1, J( f^{∘k} ) = J( f ).

Proof. Part (a) is proved by contradiction. If J = ∅ then { f^{∘n} } is a normal family on S,
so some subsequence { f^{∘n_j} } converges uniformly on S to a function h : S → S that
is everywhere meromorphic, hence rational. The limit h cannot be constant, since
the f^{∘n_j} are surjective. Since the convergence is uniform, eventually the number of
zeros of h will be the number of zeros of f^{∘n_j}. But f^{∘n_j} has degree d^{n_j}, which is
unbounded since we have assumed d ≥ 2.
It is enough to prove statements (b) and (c) for the Fatou set. Now z_0 ∈ F( f ) if
and only if { f^{∘n} } is a normal family on some neighborhood of z_0. This is true if and
only if { f^{∘(n±1)} } has this property on some neighborhood of f^{±1}(z_0). This proves (b).
Any subfamily of a normal family on a given domain is normal, so z_0 ∈ F( f )
implies z_0 ∈ F( f^{∘k} ). Conversely, suppose z_0 ∈ F( f^{∘k} ). Then
\[
\{ f^{\circ n} \} = \bigcup_{j=0}^{k-1} \{ f^{\circ j} ( f^{\circ nk} ) \}.
\]
Given a sequence in { f^{∘n} }, some subsequence belongs to one
of the families on the right, each of which is normal in a neighborhood of z_0. Therefore
{ f^{∘n} } itself is normal in that neighborhood, showing that z_0 ∈ F( f ).
Let us look more closely at points z ∈ J( f ). Let U be a neighborhood of such a
point, and consider the family of functions G = {g_n}, g_n = f^{∘n}|_U. By assumption
G is not a normal family. By Montel’s theorem, Theorem 2.5.3, the exceptional
set E_z = S \ ∪ g_n(U) of values omitted by every g_n contains at most two points.
Suppose that E_z consists of a single point a. Replacing f by h^{-1} ◦ f ◦ h, where h is
a linear fractional transformation that takes ∞ to a, we may assume that E_z = {∞},
and take U = C. By definition, f^{-1}(E_z) ⊂ E_z, so f^{-1}(∞) = ∞ and there are no
other poles, so f is a polynomial. If E_z consists of two points, we may take them
to be 0 and ∞ and take U = C \ {0}. Then either f (0) = 0 and f (∞) = ∞, so
f (z) = C z^n, or f (0) = ∞ and f (∞) = 0, so f (z) = C z^{−n}. In either case we have
proved the first part of the following.

Proposition 3.2.2. (a) The exceptional set E_z, z ∈ J( f ), is independent of z, and is
contained in the Fatou set.
(b) If z_0 belongs to J( f ), then the backward orbit (3.0.1) of z_0 is dense in J( f ).
(c) Any completely invariant non-empty subset of J( f ) is dense in J( f ).
(d) If U is a union of connected components of F( f ) that is completely invariant,
then J( f ) = ∂U.

Proof. As noted, the preceding discussion proves (a).
Given z ∈ J( f ) and a neighborhood U of z, ∪_n f^{∘n}(U) is the complement of E, so
by (a) it contains J( f ). Therefore for any z_0 ∈ J, some point of U is in the backward
orbit of z_0. This proves (b), and (c) is a consequence of (b).
Complete invariance of U implies complete invariance of the boundary, and the
assumption that U is made up of connected components of F( f ) implies that ∂U ⊂ J,
so (d) follows from (c).

Corollary 3.2.3. If J has an interior point, then J = S.

Proof. Suppose that the open set U is contained in J. By invariance, the same is true of
∪_n f^{∘n}(U). But this set is S \ E. Since J is closed, it must be all of S.
We supplement part (b) of Proposition 3.2.2 by considering forward orbits. Con-
sider J( f ) as a metric space relative to the spherical metric of S. It is closed in S,
and therefore complete. It is a general property of complete metric spaces that the
intersection of a countable family of dense open subsets is itself dense; see Exercise
9. We refer to such an intersection as a generic subset.

Proposition 3.2.4. There is a generic subset V ⊂ J such that for each z in V , the
forward orbit { f ◦n (z)} is dense in J.

Proof. Given j = 1, 2, . . . , let U_{j1}, U_{j2}, …, U_{j m_j} be an open cover of J by disks of
radius 1/j. Let V_{jk} consist of all pre-images of points of U_{jk}. Then each V_{jk} is open and
dense in J, so the intersection V is generic. If z belongs to V, then each U_{jk} contains
some point of the forward orbit of z. Therefore the orbit of z is dense in J.
In the next section we begin a general discussion of fixed points of f. It is useful
here to discuss one case. A finite fixed point z_0 of f is said to be super-attracting
if f′(z_0) = 0. If ∞ is fixed, it is super-attracting for f if 0 is super-attracting for
g(z) = 1/ f (1/z), i.e. f (z)/z → ∞ as z → ∞.
For example, each of the examples in Section 3.1 has a super-attracting fixed point
at ∞. In fact at ∞ we consider the expansion
\[
\frac{1}{f(1/z)} = \frac{1}{\dfrac{1}{z^2} + O(1)} = \frac{z^2}{1 + O(z^2)} = z^2 + O(z^4).
\]

Lemma 3.2.5. If z_0 is a super-attracting fixed point of f, then z_0 belongs to the
Fatou set F( f ).

Proof. For small enough δ, if z ∈ U = D_δ(z_0), then | f (z) − z_0| ≤ C|z − z_0|^2. If δ is
chosen so that Cδ^2 < δ, then U is invariant for f and { f^{∘n} } is a normal family.

Theorem 3.2.6. The Julia set J( f ) contains no isolated points.

Proof. Suppose z_0 ∈ J. We may assume that z_0 is finite. If no backward iterate
( f^{-1} )^{∘n}(z_0) is equal to z_0, then, since these pre-images are dense in J, it follows
that z_0 is not isolated. Otherwise z_0 is a periodic point for f : f^{∘m}(z_0) = z_0, for
some m > 0 that we may assume to be minimal. There are (deg f )^m solutions to
f^{∘m}(z) = z_0, counting multiplicity, so if z_0 were the only solution it would have
multiplicity greater than 1. (Recall that we are assuming that deg f > 1.) But then the
orbit would be super-attracting and belong to F. This leaves us with the possibility
that f^{∘m}(z) = z_0 has distinct solutions z_0 and z_1. Then f^{∘j}(z_0) ≠ z_1 for all j, since
otherwise, because of the m-periodicity of z_0, it would be true for some 0 < j < m.
But then

f^{∘j}(z_0) = f^{∘(m+j)}(z_0) = f^{∘m}(z_1) = z_0,

contradicting the minimality of m. Then again no preimage of z_1 is equal to z_0, but
they are dense in J, so z_0 is not isolated.

Corollary 3.2.7. If S ⊂ J is countable, then the complement J \ S is dense in J.

Proof. Let {z_n}, n = 1, 2, . . . , be an enumeration of S. Given U ⊂ S open, with U ∩ J not empty,
we may find a sequence of open disks D_n such that D_1 ⊃ D_2 ⊃ . . . , the closure of
D_1 is contained in U, and the closure of D_n contains a point of J but does not contain z_n.
Then the intersection of J with all of the closed disks D̄_n is non-empty and does not
contain any of the z_n.
We are now in a position to address self-similarity, the “fractal” nature of Julia
sets. We say that two points z 1 , z 2 of J have conformally equivalent locations in J if
there are neighborhoods U_j of z_j, j = 1, 2, and a conformal map Φ of U_2 onto U_1, such that
Φ(z 2 ) = z 1 and Φ(U2 ∩ J) = U1 ∩ J.
The critical points of a rational map f : S → S are those points where f is not
locally injective. At finite points of C where f is finite, these are the zeros of f  .
At poles, they are the points where the pole has order ≥ 2. The number of critical
points, counting multiplicity, for f of degree d, is 2d − 2; see Exercise 8.
Let us say that a point z in J( f ) is atypical if every backward orbit contains a
critical point; otherwise we say that z is typical. Since there are finitely many critical
points, and each orbit is countable, there are only countably many atypical points. It
follows from Corollary 3.2.7 that typical points are dense in J.

Theorem 3.2.8. Suppose that z is a typical point of J( f ). Then the set of points z′
that have a conformally equivalent location is dense in J( f ).

Proof. We know from Proposition 3.2.2 (b) that the backward orbit of z is dense in J.
Suppose f ◦n (z′) = z for some n ≥ 1. Because there are no critical points in the
backward orbit, the derivative [ f ◦n ]′(z′) ≠ 0. Therefore f ◦n restricted to some
neighborhood U′ of z′ serves as the desired conformal map.

3.3 Fixed points and periodic points

It is clear from the examples in Section 3.1 that fixed points of f play a key role
in the dynamics. A first classification of a fixed point f (z 0 ) = z 0 is based on the
multiplier. For a finite fixed point z 0 ,

λ = f′(z 0 ).

We leave the determination of λ in the case z 0 = ∞ to the reader.

A fixed point z 0 is said to be

attracting if |λ| < 1;
super-attracting if λ = 0;
repelling if |λ| > 1;
neutral if |λ| = 1.

Obviously super-attracting is a special case of attracting. A fixed point that is
attracting but not super-attracting is termed geometrically attracting.
(It might seem more natural to use a different definition of “attracting fixed point”
that does not reference f′(z 0 ), but see Exercise 10.) The neutral case needs further
refinement; see Section 3.5.
In the example (3.1.6), −2 and 3 are repelling fixed points. As noted above, ∞ is
a super-attracting fixed point.
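
As a hedged illustration of this classification, the following Python sketch computes the finite fixed points of the quadratic family f (z) = z² + c and labels each one by its multiplier 2z. The function names `classify`, `quadratic_fixed_points` and `classify_quadratic`, and the tolerance `tol`, are hypothetical choices made for this sketch; they are not notation from the text.

```python
import cmath

def classify(multiplier, tol=1e-12):
    """Label a fixed point by its multiplier lambda = f'(z0)."""
    m = abs(multiplier)
    if m <= tol:
        return "super-attracting"
    if m < 1 - tol:
        return "attracting"
    if m > 1 + tol:
        return "repelling"
    return "neutral"

def quadratic_fixed_points(c):
    """Finite fixed points of f(z) = z**2 + c, i.e. roots of z**2 - z + c = 0."""
    root = cmath.sqrt(1 - 4 * c)
    return [(1 + root) / 2, (1 - root) / 2]

def classify_quadratic(c):
    """Pairs (fixed point, label); the multiplier of f(z) = z**2 + c is 2z."""
    return [(z, classify(2 * z)) for z in quadratic_fixed_points(c)]
```

For instance, with c = −6 the fixed points are 3 and −2, with multipliers 6 and −4, so both are labeled repelling; with c = 0 the fixed points are 1 (repelling) and 0 (super-attracting).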

Proposition 3.3.1. (a) If z 0 is an attracting fixed point of f , then z 0 belongs to
the Fatou set F( f ).
(b) If z 0 is a repelling fixed point of f , then z 0 belongs to the Julia set J( f ).

Proof. (a) By assumption f (z 0 ) = z 0 and | f′(z 0 )| < 1. Then there is a bounded
neighborhood U of z 0 such that f : U → U . Therefore each iterate maps U into U ,
so { f ◦n } is a normal family on U .
(b) If z 0 belonged to F( f ), then a subsequence of { f ◦n } would converge uniformly
near z 0 . In particular, by the Cauchy estimate for derivatives, the derivatives would
converge at z 0 . But
\[ [f^{\circ n}]'(z_0) = f'(z_0)^n \to \infty. \]

Corollary 3.3.2. (a) If f ◦n (z) = z 0 is an attracting fixed point of f , then z belongs
to the Fatou set.
(b) If f ◦n (z) = z 0 is a repelling fixed point of f , then z belongs to the Julia set.

Periodic points play a similar role. A point z 0 ∈ S is said to be periodic with
period k > 1 if the partial orbit

{z 0 , z 1 , z 2 , . . . , z k = z 0 }, z j = f ◦ j (z 0 ), (3.3.1)

contains k distinct points. Clearly each z j is also periodic with period k. Equivalently,
z 0 is periodic with period k if z 0 is a fixed point for f ◦k , but not for any f ◦ j , 0 < j < k.
The set (3.3.1) is called a periodic orbit, or a cycle. In the case (3.3.1), the derivative

\[ [f^{\circ k}]'(z_0) = f'(z_0)\, f'(z_1) \cdots f'(z_{k-1}) \]

is independent of the choice of the point z j in the periodic orbit. We have the same
classification for periodic orbits as for fixed points: if z is periodic with period k and
multiplier λ = [ f ◦k ]′(z), then, as in the case of a fixed point, the orbit is said to be

attracting if |λ| < 1;
super-attracting if λ = 0;
repelling if |λ| > 1;
neutral if |λ| = 1.
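
The chain-rule formula for the multiplier of a cycle can be checked numerically. The sketch below is only an illustration under stated assumptions: the helper name `cycle_multiplier` and its argument order are hypothetical, and the caller is assumed to supply a point that really is periodic with the given period.

```python
def cycle_multiplier(fprime, start, f, period):
    """Product f'(z_0) f'(z_1) ... f'(z_{k-1}) along the orbit of `start`.

    `start` is assumed to be periodic for f with period `period`.
    """
    z, product = start, 1
    for _ in range(period):
        product *= fprime(z)   # one chain-rule factor per point of the cycle
        z = f(z)               # move to the next point of the orbit
    return product
```

For f (z) = z² − 1 the points 0 and −1 form a cycle of period 2, and the product is 0 · (−2) = 0 from either starting point, so the cycle is super-attracting. For f (z) = z² + c the two points of period 2 are the roots of z² + z + c + 1 = 0, so the multiplier is 4(c + 1), whichever point one starts from.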

The analogues of Proposition 3.3.1 and its corollary are valid for periodic orbits.
Proposition 3.3.3. (a) Any attracting or super-attracting periodic orbit of f
belongs to F( f ).
(b) Any repelling periodic orbit of f belongs to J( f ).
Proof. Since elements of periodic orbits are fixed points for some iterate f ◦k of f , this
follows from Proposition 3.3.1 and the fact that F( f ◦k ) = F( f ).
Corollary 3.3.4. (a) If f ◦n (z) belongs to an attracting or super-attracting orbit of
f , then z belongs to F( f ).
(b) If f ◦n (z) belongs to a repelling orbit of f , then z belongs to J( f ).

3.4 Attracting, super-attracting, and repelling fixed points

We continue to assume that f is rational, deg f = d > 1. The next result goes back
to Kœnigs [123] in 1884.
Theorem 3.4.1. Suppose that z 0 is a geometrically attracting fixed point. Then there
is a conformal map φ defined on a neighborhood U of z 0 to a disk Dρ (0) such that
φ(z 0 ) = 0 and
φ( f (z)) = λφ(z), z ∈ U. (3.4.1)

The map φ is unique up to multiplication by a constant.


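Proof. For convenience, we conjugate so that $z_0 = 0$. Choose $\delta > 0$ and a constant $C$
such that $f$ is holomorphic and injective on $V = D_\delta(0)$ and

\[ |f(z) - \lambda z| \le C|z|^2, \qquad z \in V. \tag{3.4.2} \]

Then $|f(z)| \le |\lambda z| + C|z|^2 \le (|\lambda| + C\delta)|z|$ for $z \in V$, so, writing
$\rho = |\lambda| + C\delta$ and choosing $\delta$ small enough that $\rho < 1$, we also have
$f(V) \subset V$. Take $U = V$ and let
\[ \varphi_n(z) = \lambda^{-n} f^{\circ n}(z), \qquad z \in U. \]

Then
\[ \varphi_n \circ f = \lambda^{-n} f^{\circ (n+1)} = \lambda \varphi_{n+1}. \]

If we show that the $\varphi_n$ converge to $\varphi$ on $U$, it follows that

\[ \varphi(f(z)) = \lambda \varphi(z). \tag{3.4.3} \]

Iterating the bound $|f(z)| \le \rho |z|$ gives

\[ |f^{\circ n}(z)| \le \rho^n |z|, \qquad z \in U. \tag{3.4.4} \]

Taking (3.4.2) into account, we get

\[ |\varphi_{n+1}(z) - \varphi_n(z)| = \left| \frac{f(f^{\circ n}(z)) - \lambda f^{\circ n}(z)}{\lambda^{n+1}} \right|
 \le \frac{C|f^{\circ n}(z)|^2}{|\lambda|^{n+1}} \le \frac{C\rho^{2n}|z|^2}{|\lambda|^{n+1}}. \]

Thus, choosing $\delta$ so small that $\rho^2 < |\lambda|$, we get uniform convergence of $\varphi_n$ on $U$.
Since $f$ is injective in $D_\delta(0)$, the same is true of $f^{\circ n}$ and, therefore, of $\varphi_n$.
It follows from Proposition 2.3.6 that $\varphi$ is either conformal or constant. But

\[ \varphi_n'(0) = \lambda^{-n} [f^{\circ n}]'(0) = \lambda^{-n} f'(0)^n = 1, \]

so $\varphi'(0) = 1$ and $\varphi$ is conformal.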


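Finally, suppose that $\psi$ is a second such map. Then

\[ \varphi \circ f = \lambda \varphi, \qquad \psi \circ f = \lambda \psi, \]

so, writing $w = \psi(z)$, we have

\[ \lambda\, \varphi \circ \psi^{-1}(w) = \varphi \circ \psi^{-1}(\lambda w). \]

Expanding $\varphi \circ \psi^{-1}(w)$ gives

\[ \lambda (a_1 w + a_2 w^2 + \dots) = a_1 \lambda w + a_2 (\lambda w)^2 + \dots, \]

so $a_j(\lambda - \lambda^j) = 0$, hence $a_j = 0$ for $j > 1$, and $\varphi \circ \psi^{-1}(w) = a_1 w$.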
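
The construction in the proof is easy to try numerically. The following Python sketch approximates the linearizing map by $\varphi_n(z) = \lambda^{-n} f^{\circ n}(z)$ for a fixed, moderately large $n$. The function name `koenigs`, the default `n=40` and the sample map $f(z) = \lambda z + z^2$ in the usage note are assumptions made for illustration only.

```python
def koenigs(f, lam, z, n=40):
    """Approximate phi(z) by phi_n(z) = lam**(-n) * f^{on}(z).

    Assumes 0 is a fixed point of f with multiplier lam, 0 < |lam| < 1,
    and that z is close enough to 0 for the orbit to converge to 0.
    """
    for _ in range(n):
        z = f(z)
    return z / lam**n
```

With $\lambda = 0.5$ and $f(z) = \lambda z + z^2$, one can check that `koenigs(f, lam, f(z))` and `lam * koenigs(f, lam, z)` agree to many digits for small $z$, which is the functional equation (3.4.1), and that `koenigs(f, lam, z) / z` is close to 1 for very small $z$, reflecting $\varphi'(0) = 1$.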



The basin of attraction of an attracting or super-attracting fixed point z 0 is the set
A = A(z 0 ) of points z such that the f ◦n (z) converge to z 0 . Because f is continuous
on S, this is an open set. The connected component A0 = A0 (z 0 ) of A that contains
z 0 is the immediate basin of attraction of z 0 . Clearly A is invariant under f and f −1 .
Theorem 3.4.1 can be extended.
Proposition 3.4.2. The map φ of Theorem 3.4.1 can be extended to a holomorphic
map on the basin of attraction A(z 0 ) that intertwines f and multiplication by λ:

φ( f (z)) = λφ(z), z ∈ A(z 0 ). (3.4.5)

Proof. Formally, (3.4.5) extends to give

\[ \varphi(z) = \lambda^{-1} \varphi(f(z)) = \lambda^{-1}[\lambda^{-1} \varphi(f(f(z)))] = \dots = \lambda^{-n} \varphi \circ f^{\circ n}(z). \tag{3.4.6} \]

Given $z \in A(z_0)$, $f^{\circ n}(z)$ will eventually belong to the original domain of $\varphi$. Thus
(3.4.6) serves to define $\varphi$ throughout $A(z_0)$ and to extend the intertwining identity
(3.4.5) one step at a time.
Theorem 3.4.3. The immediate basin of attraction A0 of an attracting fixed point
z 0 of f contains a critical point of f .
Proof. If $z_0$ is super-attracting, then $z_0$ itself is a critical point. Suppose that $z_0$ is
geometrically attracting and let $\varphi : A_0 \to \mathbb{C}$ be the map in Proposition 3.4.2. Then
$\varphi$ has a local inverse $\psi$ defined in a disk $D_\rho = D_\rho(0)$. There is some maximal disk
$D_r$ such that $\psi$ has a holomorphic extension, again denoted $\psi$, to $D_r$. Since
$\varphi \circ \psi$ and $\psi \circ \varphi$ are the identity map near $0$ and $z_0$ respectively, they continue
to be the identity on $D_r$ and $\psi(D_r)$, respectively. It follows from (3.4.5) that

\[ f(z) = \psi(\lambda \psi^{-1}(z)), \qquad z \in \psi(D_r). \tag{3.4.7} \]

Therefore $f$ is injective on $\psi(D_r)$. Our standing assumption is that $f$ has degree $\ge 2$,
hence is not injective on $\mathbb{C}$, so $D_r \ne \mathbb{C}$.
Writing (3.4.7) as
\[ f(\psi(w)) = \psi(\lambda w), \qquad w \in D_r, \tag{3.4.8} \]

we note that the right side is holomorphic in a neighborhood of the closure $\overline{D}_r$. Near
any point $w_0$ of $\partial D_r$ that is not a critical point of $f$, $f$ has a local inverse $g$ and we
get a holomorphic extension $\psi(w) = g(\psi(\lambda w))$ to a neighborhood of $w_0$. It follows
that there is a critical point of $f$ on $\partial D_r$. Otherwise $\psi$ can be extended across $\partial D_r$,
contradicting the assumption of maximality. (We assumed here that the critical point
was not a pole of $f$. The case of a pole can be dealt with by passing from $f$ to a
conformally equivalent function. The same remark applies to the rest of the proof.)
Now (3.4.8) extends by continuity to $w \in \overline{D}_r$, so

\[ f^{\circ n}(\psi(w)) = f^{\circ (n-1)}(\psi(\lambda w)) = \dots = \psi(\lambda^n w) \to \psi(0) = z_0, \qquad w \in \overline{D}_r. \tag{3.4.9} \]

In particular, the critical points on ∂ Dr belong to A0 .


Remark. The basin of attraction of a periodic orbit of f is defined to be the set
of points z such that { f ◦n (z)} tends to the orbit. Recall that each periodic orbit
corresponds to a fixed point of some iterate of f , and that the Julia set of an iterate
is the same as the Julia set of f itself. It follows that each of the preceding results in
this chapter has a corresponding extension to attracting periodic orbits.
Moreover, critical points of iterates of f are critical points of f . Therefore Theo-
rem 3.4.3 puts an absolute limit on the total number of attracting points and attracting
periodic orbits of f .

Proposition 3.4.4. Let A be the basin of attraction of an attracting or super-attracting
fixed point z 0 of f . Then J( f ) = ∂ A.

Proof. Boundary points of A belong to J; see Exercise 16. Thus A is a union of
components of F. It is easily seen that A is completely invariant for f , so the result
follows from Proposition 3.2.2 (d).
The following result helps show why Julia sets can be so complicated.

Corollary 3.4.5. Suppose that f has a repelling fixed or periodic point z 0 with mul-
tiplier λ that is not real. Then J( f ) is not contained in a proper smooth submanifold
of S.

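Proof. We may assume that $z_0$ is a fixed point. There is a branch $g$ of $f^{-1}$ defined in
a neighborhood of $z_0$, and $z_0$ is a fixed point of $g$ with multiplier $1/\lambda$. By Theorem
3.4.1, applied to $g$, there is a conformal map $\varphi$ defined in a neighborhood $U$ of $z_0$ that
locally conjugates $g$ to multiplication by $\lambda^{-1}$: in the coordinate $\zeta = \varphi(z)$, $g$ becomes
$\zeta \mapsto \lambda^{-1}\zeta$. Choose $z_1 \in J \cap U$, $z_1 \ne z_0$, and let $\zeta_1 = \varphi(z_1)$. The points
$\zeta_j = \lambda^{1-j}\zeta_1$, which are the images under $\varphi$ of the points $g^{\circ (j-1)}(z_1) \in J$, lie on a
logarithmic spiral centered at $\varphi(z_0) = 0$. Thus $J$ is not smooth at $z_0$.
Since $J$ is not smooth at the center $z_0$ of the spiral, and $z_0$ is not in the exceptional set
$E \subset F$, for any point $z_1 \in J$ and any small neighborhood $U_1$ of $z_1$, some image
$f^{\circ n}(U_1)$ contains $z_0$. Therefore $U_1$ contains a preimage under $f^{\circ n}$ of a neighborhood
of $z_0$. By shrinking $U_1$ to $z_1$ itself, we see that $J$ is not smooth at $z_1$.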
We end this section with a quick look at the super-attracting case. The analog of
Theorem 3.4.1 was proved by Boettcher [28] in 1904.

Theorem 3.4.6. Suppose that $z_0$ is a super-attracting fixed point of the rational
function $f$. Then there is a neighborhood $U$ of $z_0$ and a conformal map
$\varphi : U \to D_\rho(0)$ such that $\varphi(z_0) = 0$ and

\[ \varphi(f(z)) = \varphi(z)^p, \qquad z \in U, \tag{3.4.10} \]

where $p \ge 2$. The map $\varphi$ is unique up to multiplication by a $(p-1)$-th root of unity.



Proof. We may translate coordinates so that $z_0 = 0$, and change scale, $z \to cz$, so
that in $D_\delta(0)$,
\[ f(z) = z^p[1 + \varepsilon(z)], \qquad |\varepsilon(z)| \le C|z|. \tag{3.4.11} \]

Choose $0 < \delta < 1/2$ so that $|f(z)| \le (2|z|)^p$ if $z \in U = D_\delta(0)$. Then $f : U \to U$
and inductively
\[ |f^{\circ n}(z)| \le (2|z|)^{p^n}, \qquad z \in U. \]

Let
\[ \varphi_n(z) = [f^{\circ n}(z)]^{1/p^n} = [z^{p^n}(1 + \dots)]^{1/p^n} = z\,(1 + \dots)^{1/p^n}. \]

This is well defined and conformal in a neighborhood of 0. Then

\[ \varphi_n \circ f = [f^{\circ n} \circ f]^{1/p^n} = \varphi_{n+1}^{\,p}. \]

With $\varphi_0$ the identity map,

\[ \varphi_n = \varphi_0 \prod_{j=1}^{n} \frac{\varphi_j}{\varphi_{j-1}} \tag{3.4.12} \]

and

\begin{align*}
\frac{\varphi_{n+1}}{\varphi_n} &= \frac{[f \circ f^{\circ n}]^{1/p^{n+1}}}{[f^{\circ n}]^{1/p^n}} \\
 &= \frac{\{[f^{\circ n}]^p\,[1 + \varepsilon(f^{\circ n})]\}^{1/p^{n+1}}}{[f^{\circ n}]^{1/p^n}} \\
 &= [1 + \varepsilon(f^{\circ n})]^{1/p^{n+1}}.
\end{align*}

Therefore
\[ 1 \le \left| \frac{\varphi_{n+1}(z)}{\varphi_n(z)} \right| \le 1 + |\varepsilon(f^{\circ n}(z))| = 1 + O\big((2|z|)^{p^n}\big), \]

so the product (3.4.12) converges uniformly in a neighborhood of 0; see Section 1.6.
The product is either conformal or constant. If $\varepsilon(z)$ in (3.4.11) is identically zero then
there is nothing more to prove. Otherwise $|\varepsilon(z)| \ge \delta|z|^k$ for $z$ near 0, some positive
$\delta$ and $k$, and this can be used to show that the product is not constant.
Finally, suppose that $\psi$ is another such conformal map. Then $\varphi \circ \psi^{-1}$ commutes
with taking the $p$-th power. Expanding

\[ \varphi \circ \psi^{-1}(z) = a_1 z + a_2 z^2 + \dots, \]

we have
\[ (a_1 z + a_2 z^2 + \dots)^p = a_1 z^p + a_2 z^{2p} + \dots, \]
so $a_1^p = a_1$ and it is easily seen by induction that $a_n = 0$ for $n > 1$.
Remark. Since $\log|\varphi(f(z))| = p \log|\varphi(z)|$, we may extend the harmonic function
$G(z) = \log|\varphi(z)|$ to the entire basin of attraction $A(z_0)$ by

\[ G(z) = p^{-n}\, G(f^{\circ n}(z)) \tag{3.4.13} \]

for $z$ such that $f^{\circ n}(z)$ is in the domain of $\varphi$.
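
Formula (3.4.13) can be turned into a simple numerical recipe in one concrete situation, chosen here as an assumption for illustration: the super-attracting fixed point at ∞ of a quadratic polynomial $f(z) = z^2 + c$, where $p = 2$ and $\log|\varphi(z)|$ is close to $\log|z|$ for large $|z|$. The function name `green` and the parameters `escape` and `max_iter` are hypothetical.

```python
import math

def green(z, c, escape=1e100, max_iter=200):
    """Approximate G(z) = 2**(-n) * log|f^{on}(z)| for f(z) = z**2 + c.

    The orbit is followed until |f^{on}(z)| exceeds `escape`, where
    log|phi| is approximated by log|.|. Returns 0.0 if the orbit has
    not escaped after max_iter steps (the point is then treated as
    lying outside the basin of infinity).
    """
    n = 0
    while abs(z) <= escape:
        if n == max_iter:
            return 0.0
        z = z * z + c
        n += 1
    return math.log(abs(z)) / 2**n
```

For $c = 0$ this returns $\log|z|$ for $|z| > 1$, and in general the computed values satisfy $G(f(z)) = 2G(z)$ on escaping orbits, which is (3.4.13) with $p = 2$.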

3.5 Neutral fixed points

Once again we concentrate on the dynamics of a rational map f of degree d ≥ 2.
Recall that a neutral fixed point or periodic point of f is one whose multiplier λ has
modulus |λ| = 1, so λ = e^{2πiθ} for some real θ . A fundamental distinction here is
whether θ is rational or irrational. A fixed point or periodic point with θ rational
is said to be parabolic. A more fundamental distinction, from the point of view of
dynamics, is whether the point belongs to the Julia set or not.
Proposition 3.5.1. Any parabolic fixed point or periodic point z 0 of f belongs to
J( f ).
Proof. We know that some $\lambda^j = 1$, so the iterate $g = f^{\circ j}$ has multiplier $\lambda^j = 1$. We
are assuming that $d = \deg f \ge 2$. Therefore $g$ has degree $d^j$, so $g$ is not the identity
map. Expanding in an affine coordinate $\zeta$ centered at $z_0$, we have

\[ g(\zeta) = \zeta + a_k \zeta^k + a_{k+1} \zeta^{k+1} + \dots, \qquad a_k \ne 0, \quad \zeta = z - z_0. \]

Then
\[ g^{\circ n}(\zeta) = \zeta + n a_k \zeta^k + O(\zeta^{k+1}), \]

so $[g^{\circ n}]^{(k)}(0) = n\, k!\, a_k \to \infty$ as $n \to \infty$. Therefore $z_0$ belongs to
$J(g) = J(f)$.
Suppose now that $z_0$ is an irrationally neutral fixed point, i.e. $\lambda = e^{2\pi i\theta}$, with $\theta$
not rational. This implies that powers of $\lambda$ are dense in the unit circle. (In fact a
much stronger statement is true: see Exercise 17.) We may assume that $z_0 = 0$. We
want to know whether there is a local conjugation by $\zeta = h(z)$ such that in some
neighborhood of 0

\[ f(h(z)) = h(\lambda z), \qquad h(0) = 0, \quad h'(0) = 1. \tag{3.5.1} \]

Lemma 3.5.2. A solution to (3.5.1) in a disk Dr (0), r > 0, is injective in the disk.
Proof. Suppose that $h(z_1) = h(z_2)$. Then

\[ h(\lambda z_1) = f(h(z_1)) = f(h(z_2)) = h(\lambda z_2), \ \dots, \ h(\lambda^n z_1) = h(\lambda^n z_2) \]

for all $n$. Since the $\lambda^n$ are dense in the circle, $h(e^{i\psi} z_1) = h(e^{i\psi} z_2)$ for all $\psi$. Therefore
the functions $g_j(w) = h(w z_j)$ agree in the interior of $D$ as well: $g_1(w) = g_2(w)$ if
$|w| < 1$. Since $h'(0) = 1$, we get $z_1 = g_1'(0) = g_2'(0) = z_2$.

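Lemma 3.5.3. A solution to (3.5.1) exists if and only if $\{f^{\circ n}\}$ is uniformly bounded
in some neighborhood of 0.

Proof. If there is such a solution in a neighborhood $U$ of 0, then

\[ g(z) = h^{-1} \circ f \circ h(z) = \lambda z, \qquad z \in U, \]

so the functions $g^{\circ n}$ are uniformly bounded in $U$, and the same is true for the iterates
$f^{\circ n} = h \circ g^{\circ n} \circ h^{-1}$.
Conversely, suppose boundedness. Then let
\[ \phi_n(z) = \frac{1}{n} \sum_{j=0}^{n-1} \lambda^{-j} f^{\circ j}(z) = z + \dots, \qquad z \in U. \]

Note that
\[ \phi_n'(0) = \frac{1}{n} \sum_{j=0}^{n-1} \lambda^{-j} f'(0)^j = 1. \]

The family $\{\phi_n\}$ is bounded and

\[ \phi_n \circ f(z) = \lambda \phi_n(z) + \frac{1}{n}\left[ \lambda^{-(n-1)} f^{\circ n}(z) - \lambda z \right]. \]

Since the bracketed term is bounded, the last term tends to 0 as $n \to \infty$. Therefore a
subsequence converges, its limit $\phi$ satisfies $\phi \circ f = \lambda\phi$, and $\phi$ can be taken as $h^{-1}$.
In fact $\phi'(0) = 1$, so $h'(0) = 1$.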
It was shown by Pfeifer [165] in 1917 that there is a choice of θ such that no solution
of (3.5.1) exists. In fact it was shown in 1938 by Cremer [49] that no solution exists
if lim inf n→∞ |1 − λ^n|^{1/n} = 0. The question whether there is a choice of θ such that
(3.5.1) does have a solution was finally settled by Siegel in 1942. (The question of
exactly which θ permit a solution was settled by Brjuno [34] (sufficiency) in 1965
and Yoccoz [222] (necessity) in 1988.)

Theorem 3.5.4. There is a $\lambda = e^{2\pi i\theta}$ such that the equation (3.5.1) has no solution
for any polynomial $f$.

Proof. Let $f$ be a polynomial with $f'(0) = \lambda$: $f(z) = z^d + \dots + \lambda z$, and suppose
that (3.5.1) has a solution $h$ in $D_\delta(0)$. Then $f^{\circ n}(z) = z$ has $d^n$ solutions, the zeros
$\{z_j\}$ of
\[ 0 = f^{\circ n}(z) - z = z^{d^n} + \dots + (\lambda^n - 1)z = h(\lambda^n h^{-1}(z)) - z. \]

Since there is only one zero in $D_\delta(0)$, namely $z = 0$, we have

\[ |1 - \lambda^n| = \prod_{z_j \ne 0} |z_j| \ge \delta^{d^n}. \]

But suppose that $\lambda = e^{2\pi i\theta}$, where

\[ \theta = \sum_{k=1}^{\infty} 2^{-q_k}, \tag{3.5.2} \]

and $\{q_k\}$ is a strictly increasing sequence of positive integers. Then

\[ |1 - \lambda^{2^{q_k}}| \sim 2^{q_k - q_{k+1}}; \tag{3.5.3} \]

see Exercise 18. Combining this with the lower bound above for $n = 2^{q_k}$ gives

\[ q_{k+1} \le C(\delta)\, d^{2^{q_k}}. \tag{3.5.4} \]

But by choosing $q_k$ to grow rapidly enough, we may be sure that, for any given degree
$d$ and any given $\delta > 0$, (3.5.4) eventually fails.
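
The asymptotic relation (3.5.3) can be explored numerically. The sketch below is a hedged illustration: it truncates the series (3.5.2) to finitely many exponents, uses exact rational arithmetic for the fractional part of $2^{q_k}\theta$, and returns the ratio of $|1 - \lambda^{2^{q_k}}|$ to $2^{q_k - q_{k+1}}$. The function name `small_divisor_ratios` and the sample exponents are assumptions made for this example.

```python
import math
from fractions import Fraction

def small_divisor_ratios(q):
    """Ratios |1 - lam**(2**q_k)| / 2**(q_k - q_{k+1}) for theta = sum 2**(-q_k).

    `q` is a strictly increasing list of positive integers; the series for
    theta is truncated to these exponents, and lam = exp(2*pi*i*theta).
    """
    ratios = []
    for i in range(len(q) - 1):
        # Fractional part of 2**q_i * theta: terms with j <= i are integers.
        frac = sum(Fraction(1, 2 ** (q[j] - q[i])) for j in range(i + 1, len(q)))
        value = 2 * math.sin(math.pi * float(frac))   # |1 - exp(2*pi*i*frac)|
        ratios.append(value / 2.0 ** (q[i] - q[i + 1]))
    return ratios
```

For exponents such as `[1, 3, 7, 15, 31]` the ratios stay bounded above and below, and they approach $2\pi$ as the gaps $q_{k+1} - q_k$ grow, which is the sense of "∼" in (3.5.3).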
Note that $\theta$, as constructed here, is an irrational number that can be approximated
very efficiently by rationals. Siegel showed that when this condition fails badly, it
is possible to solve (3.5.1). A Diophantine number is a number $\theta \in \mathbb{R}$ such that for
some positive constants $c$ and $m$,
\[ \left| \theta - \frac{p}{q} \right| \ge \frac{c}{q^m}, \tag{3.5.5} \]

for all integers $p$, $q$ with $q > 0$. In particular, Liouville showed that if $\theta$ is irrational
but algebraic, then $\theta$ is Diophantine; see Exercise 19.
Let us express (3.5.5) in terms of $\lambda = e^{2\pi i\theta}$. Given an integer $n > 0$, the distance
$|\lambda^n - 1|$ is

\[ |e^{i\psi} - 1| = 2\left| \sin\frac{\psi}{2} \right|, \qquad |\psi| = \inf_{k \in \mathbb{Z}} |2n\pi\theta - 2k\pi|. \]

Since $|\psi| \le \pi$, we have $2|\sin(\psi/2)| \sim |\psi|$, so, with a possibly different constant $c$,

\[ |\lambda^n - 1| \ge c\, n^{1-m}. \]

For use in the proof to follow, it will be convenient to rewrite this in the form
\[ \frac{1}{|\lambda^n - 1|} \le c_0 \frac{n^\mu}{\mu!}, \tag{3.5.6} \]

where we take $\mu$ to be some positive integer.


Theorem 3.5.5. (Siegel) Suppose that f is holomorphic in a neighborhood of 0,
and f (0) = 0. Suppose that the multiplier λ = f  (0) is λ = e2πiθ , where θ is Dio-
phantine. Then there is a solution to (3.5.1) in a neighborhood of 0.
Proof. We follow the proof as presented in [42]. Given any function $h$, we will write
approximations in a form like $h = k + \hat{h}$, where $k$ is some more-or-less explicit first
approximation, and $\hat{h}$ is smaller than $h$ in some suitable sense. In particular we begin
with $f(z) = \lambda z + \hat{f}(z)$, and we want to take $h(z) = z + \hat{h}(z)$ and solve (3.5.1) in
the form

\[ \hat{h}(\lambda z) - \lambda \hat{h}(z) = \hat{f}(h(z)). \tag{3.5.7} \]

The first step is to look for a conjugation $\psi$, $\psi(z) = z + \hat{\psi}(z)$, that gives an
approximate solution to (3.5.1) in some disk:

\[ \psi^{-1} \circ f \circ \psi(z) = g(z) = \lambda z + \hat{g}(z), \qquad z \in D_r(0). \tag{3.5.8} \]

The idea is to replace $f$ by $g$ and work on a slightly smaller disk. When $\hat{g} = 0$, we
have reached our goal.
Replacing $h(z)$ by $z$ in the right side of (3.5.7) leads to

\[ \hat{\psi}(\lambda z) - \lambda \hat{\psi}(z) = \hat{f}(z) = \sum_{n=2}^{\infty} b_n z^n, \tag{3.5.9} \]

which has the solution

\[ \hat{\psi}(z) = \sum_{n=2}^{\infty} \frac{b_n}{\lambda^n - \lambda}\, z^n. \]

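Using this, we want to compare $\hat{g}$, as defined by (3.5.8), to $\hat{f}$ in (3.5.9). For this
purpose we use (3.5.6) and the assumptions
\[ |\hat{f}'(z)| \le \delta \ \text{ for } z \in D_r(0); \qquad \frac{1}{|\lambda^n - 1|} \le c_0 \frac{n^\mu}{\mu!}. \]

This allows us to use Cauchy estimates for the coefficients of $\hat{f}'$,

\[ |n b_n| \le \frac{\delta}{r^{n-1}}, \]

together with (3.5.6) to estimate $\hat{\psi}'$ in a smaller disk of radius $(1-\eta)r$:

\begin{align*}
|\hat{\psi}'(z)| &\le \sum_{n=2}^{\infty} \frac{n|b_n|}{|\lambda^n - \lambda|}\, r^{n-1} (1-\eta)^{n-1} \\
 &\le \frac{c_0 \delta}{\mu!} \sum_{n=2}^{\infty} n^\mu (1-\eta)^{n-1} \\
 &< c_0 \delta \sum_{n=1}^{\infty} \binom{n+\mu}{\mu} (1-\eta)^n < \frac{c_0 \delta}{\eta^{\mu+1}}.
\end{align*}

Let us assume now that $\delta$ is chosen so that $c_0 \delta \le \eta^{\mu+2}$, so

\[ |\hat{\psi}'(z)| \le \eta, \qquad z \in D_{(1-\eta)r}(0). \]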

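This estimate implies that $\psi : D_{(1-4\eta)r}(0) \to D_{(1-3\eta)r}(0)$. The argument principle
and the fact that
\[ |\psi(z)| \ge (1 - 2\eta)r, \qquad z \in \partial D_{(1-\eta)r}(0), \]

while $\psi(z) = 0$ in $D_{(1-\eta)r}(0)$ only for $z = 0$, imply that $\psi$ takes every value in
$D_{(1-2\eta)r}(0)$ exactly once in $D_{(1-\eta)r}(0)$.
Now consider
\[ g = \psi^{-1} \circ f \circ \psi, \qquad z \in D_{(1-4\eta)r}(0). \]

Each factor enlarges the radius by at most $\eta r$. By (3.5.8) and (3.5.9),

\begin{align*}
\hat{g}(z) + \hat{\psi}(\lambda z + \hat{g}(z)) &= \lambda \hat{\psi}(z) + \hat{f}(z + \hat{\psi}(z)), \\
\hat{g}(z) &= \hat{\psi}(\lambda z) - \hat{\psi}(\lambda z + \hat{g}(z)) + \hat{f}(z + \hat{\psi}(z)) - \hat{f}(z).
\end{align*}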

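Let $C = \sup\{|\hat{g}(z)| : z \in D_{(1-4\eta)r}(0)\}$. Then

\begin{align*}
C &\le \sup |\hat{\psi}'(z)|\, C + \sup |\hat{f}(z + \hat{\psi}(z)) - \hat{f}(z)| \\
 &\le \eta C + \delta\, \frac{c_0 \delta}{\eta^{\mu+1}}\, r,
\end{align*}

which gives
\[ C \le \frac{c_0 \delta^2 r}{\eta^{\mu+1}}\, \frac{1}{1-\eta}. \]

Therefore Cauchy’s estimate gives

\[ |\hat{g}'(z)| \le \frac{c_0 \delta^2 r}{\eta^{\mu+2}}\, \frac{1}{1-\eta}, \qquad z \in D_{(1-5\eta)r}(0). \tag{3.5.10} \]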

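Recapitulating, we have passed from $f$ with $|\hat{f}'| \le \delta$ in $D_r(0)$ to $g$ with the
estimate (3.5.10), using the assumptions: $f$ defined on $D_{r_0}(0)$, $r \le r_0$, and
\[ 0 < \eta < \frac{1}{5}, \qquad c_0 \delta < \eta^{\mu+2}, \qquad \delta < \eta. \tag{3.5.11} \]
If we require that $\eta \le c_1$ for small enough $c_1$ then $\eta < 1/5$ and the second inequality
in (3.5.11) will imply the third. Starting from $\eta_0$ and $\delta_0$ that satisfy (3.5.11), we
choose inductively

\begin{align*}
r_{n+1} &= (1 - 5\eta_n) r_n; \\
\eta_{n+1} &= \tfrac{1}{2} \eta_n = \eta_0\, 2^{-n-1}; \\
\delta_{n+1} &= c_0 \delta_n^2 (2\eta_n)^{-(\mu+2)}.
\end{align*}

If $c_0 \delta_n < \eta_n^{\mu+2}$, then

\begin{align*}
c_0 \delta_{n+1} &= c_0^2 \delta_n^2 (2\eta_n)^{-(\mu+2)} \\
 &\le (\eta_n^{2\mu+4})\, \eta_n^{-(\mu+2)}\, 2^{-(\mu+2)} = \eta_{n+1}^{\mu+2}.
\end{align*}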

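In the process we choose functions $\psi_n$, $g_n$,

\[ g_0 = f, \qquad g_n = \psi_n^{-1} \circ g_{n-1} \circ \psi_n, \]

i.e.
\[ g_n = \psi_n^{-1} \circ \dots \circ \psi_1^{-1} \circ f \circ \psi_1 \circ \dots \circ \psi_n. \]

The limiting radius of definition is

\[ R = r_0 \prod_{n=0}^{\infty} (1 - 5\eta_n) = r_0 \prod_{n=0}^{\infty} \left( 1 - \frac{5\eta_0}{2^n} \right) > r_0\, e^{-10\eta_0} \]

and
\[ |\hat{g}_n(z)| \le \frac{\delta_n r_n}{1 - \eta_n} \to 0 \]

on $D_R(0)$. Thus $g_n(z) \to \lambda z$ uniformly on $D_R(0)$,

\[ \psi_1 \circ \psi_2 \circ \dots \circ \psi_n \to \psi \]

uniformly, and $\psi$ conjugates $f$ to multiplication by $\lambda$, i.e.

\[ \psi^{-1} \circ f \circ \psi(z) = \lambda z. \]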

The connected component of the Fatou set that contains a neutral fixed point for
which (3.5.1) has a solution is called a Siegel disk.
A reward for all this effort is Figure 3.4; the larger region on the lower left is a
Siegel disk.

Fig. 3.4 Julia set for $f(z) = z^2 + e^{i 2\pi\xi} z$, $\xi = (1/4)^{1/3}$, with a Siegel disk. (Figure reproduced
from [141] with the permission of Princeton University Press.)

3.6 Parabolic fixed points

In this section we return for a closer look at how iterates of a rational function f
behave near a parabolic fixed point in the Julia set. We assume again that deg f ≥ 2.
We begin with the case when the multiplier λ = 1, and work in a local coordinate
with the fixed point at the origin. Then for some n,

\[ f(z) = z[1 + a z^n + O(z^{n+1})], \qquad a \ne 0. \]

We assume that the multiplicity $n + 1$ of the fixed point is $\ge 2$. To get some
perspective, suppose that some sequence $\{z_k = f^{\circ k}(z_0)\}$ converges to 0. Then
\begin{align*}
z_{k+1} &= z_k \left( 1 + a z_k^n + \dots \right) \sim z_k \left( 1 + n a z_k^n + \dots \right)^{1/n} \\
 &= z_k \left( 1 + \left( \frac{z_k}{\omega} \right)^{n} + \dots \right)^{-1/n}, \tag{3.6.1}
\end{align*}

for any choice of $\omega$ with $\omega^n = -1/(an)$. Set $z_k = \omega (c_k)^{-1/n}$. Then (3.6.1), if we ignore
the remainder term, leads to the functional equation
\[ c_{k+1} = c_k \left( 1 + \frac{1}{c_k} \right) = c_k + 1 \]

with the solution $c_k = k$. Thus we might expect convergent sequences to look like
\[ z_k \sim \frac{\omega}{k^{1/n}}, \qquad \omega^n = -\frac{1}{na}. \tag{3.6.2} \]

Similarly $f^{-1}$ has a branch $g$ defined near 0 with

\[ g(z) = z[1 - a z^n + O(z^{n+1})]. \]

Therefore a sequence $\{z_k = g^{\circ k}(z_0)\}$ converging to 0 can be expected to have the
form
\[ z_k \sim \frac{\omega'}{k^{1/n}}, \qquad (\omega')^n = \frac{1}{na}. \tag{3.6.3} \]
In view of these remarks we refer to the $n$ solutions $\omega_j$, $j = 1, \dots, n$, of the
equation $\omega^n = -1/(an)$ as attraction directions for $f$ at the fixed point, and the $n$
solutions $\omega_j'$ of $(\omega')^n = 1/(an)$ as repulsion directions for $f$ at the fixed point.
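The preceding construction showed that what behaves nicely under $f$ is not the
$z_k$, but rather
\[ c_k = \left( \frac{\omega}{z_k} \right)^{n} = \frac{c}{z_k^n}, \qquad c = -\frac{1}{na}. \]

This suggests a change of variables that puts the fixed point at ∞:

\[ w = \varphi(z) = \frac{c}{z^n}; \qquad z = \varphi^{-1}(w) = \left( \frac{c}{w} \right)^{1/n}, \tag{3.6.4} \]

for some choice of the branch. Under this coordinate change, the attraction and
repulsion directions become
\[ \varphi(\omega_j) = 1, \qquad \varphi(\omega_j') = -1. \]

Under $\varphi$, the function $f$ is conjugated to $F$, i.e.

\begin{align*}
F(w) &= \varphi \circ f \circ \varphi^{-1}(w) \\
 &= \varphi\left( \left( \frac{c}{w} \right)^{1/n} \left[ 1 + a\frac{c}{w} + O(w^{-1-1/n}) \right] \right) \\
 &= w \left[ 1 + a\frac{c}{w} + O(w^{-1-1/n}) \right]^{-n} \\
 &= w \left[ 1 - \frac{nac}{w} + O(w^{-1-1/n}) \right].
\end{align*}
Since $c = -1/na$,
\[ F(w) = w + 1 + O(|w|^{-1/n}). \tag{3.6.5} \]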

Similarly, the branch $g$ of $f^{-1}$ that fixes 0 conjugates to $G = \varphi \circ g \circ \varphi^{-1}$ and

\[ G(w) = w - 1 + O(|w|^{-1/n}). \tag{3.6.6} \]

It is especially simple to analyze the behavior in certain sector-like regions. First,
we may choose $R > 0$ large enough so that
\[ |w| \ge \frac{R}{\sqrt{2}} \quad \text{implies} \quad |F(w) - (w + 1)| \le \frac{1}{2}. \tag{3.6.7} \]

Lemma 3.6.1. The domain

\[ \Omega_R = \{ w = x + iy : x + |y| > R \} \tag{3.6.8} \]

is mapped into itself by $F$. For any $w \in \Omega_R$, $F^{\circ k}(w) \to \infty$.

Proof. If $w = x + iy$ then $2|w|^2 - (x + |y|)^2 = (x - |y|)^2 \ge 0$. Therefore $w \in \Omega_R$
implies $|w| \ge R/\sqrt{2}$. Setting $F(w) = x' + iy'$, we have
\[ |x' - (x + 1)| \le \frac{1}{2}, \qquad |y' - y| \le \frac{1}{2}, \]

so $x' + |y'| \ge (x + \tfrac{1}{2}) + (|y| - \tfrac{1}{2}) > R$. Moreover

\[ \operatorname{Re} F^{\circ k}(w) \ge \operatorname{Re} w + \frac{k}{2}, \]

so $F^{\circ k}(w) \to \infty$.
The various inverse maps $\psi$ can be realized on the complement of the real interval
$(-\infty, 0]$ by taking

\[ \psi(w) = \omega\, w^{-1/n}, \qquad w \notin (-\infty, 0], \tag{3.6.9} \]

as $\omega$ runs through the roots of $\omega^n = -1/na$.


For each choice of ω with ω^n = −1/an, the map w → ψ(w) = ωw^{−1/n} , taking
the principal branch of the n-th root, takes the domain Ω R to one petal of a petal-
shaped region P = PR as shown in Figure 3.5. The angle of each petal at the origin is
π/n. The arrows on the left indicate the direction of travel under F. It is not difficult
to see that any orbit in Ω R maps to a sequence in P that converges to 0 along a curve
that is tangent to the corresponding petal at the origin. More precisely:

Lemma 3.6.2. Suppose that $w_0 \in \Omega_R$. Let $w_k = F^{\circ k}(w_0)$ and $z_k = \omega w_k^{-1/n}$,
$\omega^n = -1/an$. Then
\[ z_k = \omega k^{-1/n} (1 + o(1)). \tag{3.6.10} \]

Proof. Note that for any fixed $m$, $(k + m)^{1/n} = k^{1/n}[1 + O(m/k)]$, so we may replace
$z_0$ by any later point in the orbit and number from there. In particular, we may assume
that we have reached the region where $|F(w)| \ge |w|$. Moreover, given $\varepsilon > 0$, we may
assume that we have reached the region where

\[ F(w) = w + 1 + \eta(w), \qquad |\eta(w)| \le \varepsilon. \]

Then, renumbering the sequence from here,

\[ w_k = w_0 + k + \eta_k, \qquad |\eta_k| \le k\varepsilon. \]

Therefore, for large $k$,
\[ z_k = \omega w_k^{-1/n} = \omega k^{-1/n} \left( 1 + \frac{w_0}{k} + \frac{\eta_k}{k} \right)^{-1/n} = \omega k^{-1/n} [1 + O(\varepsilon)]. \]
Since $\varepsilon$ is arbitrary, this proves (3.6.10).
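
The rate (3.6.10) can be observed numerically in the simplest case. The sketch below assumes the sample map $f(z) = z - z^2 = z[1 - z]$, so that $a = -1$, $n = 1$ and the single attraction direction is $\omega = -1/(an) = 1$; the prediction is then $k z_k \to 1$. The function name `scaled_iterate` is hypothetical.

```python
def scaled_iterate(z0, k):
    """Return k * z_k for the orbit z_{j+1} = z_j - z_j**2 starting at z0.

    For z0 in the attracting petal of the parabolic fixed point 0,
    (3.6.10) with n = 1 and omega = 1 predicts that the result tends to 1.
    """
    z = z0
    for _ in range(k):
        z = z - z * z
    return k * z
```

The convergence is slow: the error in $k z_k$ behaves roughly like $(\log k)/k$, so about $10^5$ iterations are needed for three correct digits.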
Under various choices of the inverse map $\psi$ given by (3.6.9), the petal $P$ is mapped
by $F$ to rotations of a scaled copy of $P$. The arrows on the right in Figure 3.5 indicate
the direction of travel under $f$. Similarly, the maps

\[ w \to \omega' w^{-1/n}, \qquad (\omega')^n = \frac{1}{an}, \tag{3.6.11} \]
take $P$ to rotations of a scaled copy of $P$. Reversing the arrows in Figure 3.5 shows
the direction of travel under $G$ and $g$.
Putting all this together yields the Leau–Fatou flower, a covering of a neigh-
borhood of the parabolic fixed point by overlapping petals that alternate between
attraction and repulsion directions. The case n = 3 is indicated in Figure 3.5. The
arrows indicate the direction of travel of points under f .
Fig. 3.5 Leau–Fatou flower, n = 3.

It is clear from this that any z with the property that the orbit of z enters P j
approaches the fixed point 0, in the limit, from the direction ω j of the axis of symmetry
of P j . The set of such points is denoted A j . Clearly, each A j belongs to the Fatou
set, as does the entire basin of attraction of 0.
Proposition 3.6.3. The boundary ∂ A j (0) of each basin of attraction A j (0) belongs
to the Julia set.
Proof. Consider the orbit z 0 → z 1 → . . . of a point z 0 ∈ ∂ A j . If the orbit reaches
0 in finitely many steps, then since 0 ∈ J it follows that z 0 ∈ J. Otherwise, z 0 is not in
any of the A j , so its orbit does not converge to the fixed point 0. Therefore there is a
subsequence that is bounded away from 0. But f ◦k converges to 0 at each point
of A j , so { f ◦k } cannot be a normal family in any neighborhood of z 0 . Therefore
z 0 ∈ J.
It is easily seen, in light of this discussion, and taking account of Lemma 3.6.2,
that we have confirmed (3.6.2) and (3.6.3):
Proposition 3.6.4. (a) Suppose that a sequence $\{z_k = f^{\circ k}(z_0)\}_{k=0}^{\infty}$ converges to 0
and no $z_k = 0$. Then for some $j$,
\[ \lim_{k \to \infty} k^{1/n} z_k = \omega_j. \]

(b) Suppose that a sequence $\{z_k = g^{\circ k}(z_0)\}_{k=0}^{\infty}$ converges to 0, where $g = f^{-1}$, and
no $z_k = 0$. Then for some $j$,
\[ \lim_{k \to \infty} k^{1/n} z_k = \omega_j'. \]

All attraction and repulsion directions occur.


Remarks. 1. The flower grew through work of Leau [130], Julia [118], and Fatou
[69].
2. To this point we have been considering only the case of a parabolic fixed point
z 0 with multiplier 1. Suppose that the multiplier λ ≠ 1, λ^m = 1. Then z 0 is a fixed
point of f ◦m with multiplier 1. Similarly, any point in a parabolic periodic orbit is a
parabolic fixed point of some iterate of f , and therefore a fixed point with multiplier
1 of some further iterate. Recall that the Fatou and Julia sets are unchanged under
iteration.

3.7 Perspectives: classification and the Mandelbrot set

One long-time goal of the theory is to understand the connected components of the
Fatou set of a rational f of degree ≥ 2. In principle there are three possibilities for
such a component U : it might be periodic: f ◦m (U ) = U for some minimal m ≥ 1, or
it might be pre-periodic: some f ◦k (U ), k ≥ 1 is periodic, or it might be wandering:
the images { f ◦n (U )} are all distinct. Much of the progress was made by Fatou and
Julia. The final pieces were supplied by Siegel, Sullivan, and Herman.
Sullivan [198] introduced methods of quasiconformal mapping to prove in 1985
that for rational f there are no wandering components. In 1984 Herman [105] pro-
duced a new type, now known as a Herman ring. By definition this is a periodic
component U of the Fatou set that is doubly connected and f ◦n |U is conjugate to
either a rotation on an annulus or a rotation followed by an inversion. Figure 3.6
shows the Julia set of a cubic rational map that contains a Herman ring.

Fig. 3.6 Julia set that contains a Herman ring. Figure reproduced from [141] with the permission
of Princeton University Press.

Theorem 3.7.1. (Classification Theorem) If U is a periodic component of the Fatou
set of a rational function f , then either
(a) U contains an attracting or super-attracting fixed point or periodic point;
(b) U is the basin of attraction of a parabolic fixed point with multiplier 1;
(c) U is a Siegel disk; or
(d) U is a Herman ring.

For the proofs of Sullivan’s non-wandering theorem and the existence of Herman
rings, and for a thorough discussion of components of the Fatou set, see Chapter IV
of Carleson and Gamelin [42].
Finally, we discuss dependence on parameters and the Mandelbrot set. We saw in
Section 3.1 that even for quadratics f (z) = z 2 + c, the Julia set varies considerably
with c. In this case the critical points are 0 and ∞, and the fixed points are the
solutions of z 2 − z + c = 0.

Proposition 3.7.2. If $f(z) = f_c(z) = z^2 + c$, then $J(f)$ is connected if and only if
$\{f^{\circ n}(0)\}$ is bounded.

Proof. By the maximum principle, iterates of $f$ are bounded on bounded components
of the Fatou set $F$, so the basin of attraction $A$ of ∞ is connected. By Theorem
3.4.6 there is a conformal map $\varphi$ defined in a neighborhood of ∞ such that
$\varphi(z) = z + O(1)$ and

\[ \varphi(f(z)) = \varphi(z)^2; \qquad \log|\varphi(f(z))| = 2\log|\varphi(z)|. \]

By the remark following Theorem 3.4.6, the harmonic function $G(z) = \log|\varphi(z)|$
can be extended to $A$. We may define $U_r = \{z : G(z) > r\}$. Then $f : U_r \to U_{2r}$. For
sufficiently large $r$, $\varphi$ is defined on $U_r$. We may extend $\varphi$ further by

\[ \varphi(z) = [\varphi(f(z))]^{1/2}, \qquad z \in U_{r/2}, \]

so long as $U_{r/2}$ does not contain the critical point 0 of $f$. This extension is injective
on $U_{r/2}$. The process can be continued as long as we do not reach 0.
Therefore if $\{f^{\circ n}(0)\}$ is bounded, i.e. $0 \notin A(\infty)$, it follows that $\varphi$ extends to all
of $A$, $A$ is simply connected, and its boundary $\partial A$ is connected. But by Proposition
3.4.4, $\partial A = J$.
Suppose that f ◦n (0) → ∞. We shall show that J is totally disconnected. We know
that J is bounded. Choose a disk U ⊃ J, and choose N so that f ◦n (0) is not in the
closure of U for n ≥ N . Given z 0 ∈ J, there is an N such that for n ≥ N there is a branch
gn of ( f ◦n )−1 , holomorphic on U , with gn ( f ◦n (z 0 )) = z 0 . The functions {gn } are a
normal family. Any limit point of {gn (z)}, z ∈ A(∞) ∩ U belongs to J, so any limit
of a subsequence of the {gn } maps A(∞) ∩ U into J. Since J contains no open sets of
C, the limit is constant. Therefore the diameter of gn (U ) tends to 0. Since gn (∂U ) is
disjoint from J, it follows that {z 0 } is a connected component of J.
The set of $c$ such that $z^2 + c$ has a connected Julia set,

\[ M = \{ c : \sup_n |f_c^{\circ n}(0)| < \infty, \ f_c(z) = z^2 + c \} \]

is called the Mandelbrot set, Figure 3.7. It was studied also by Brooks and Matelski
[33], but it was the computer images in Mandelbrot [139] that showed the complexity
of this set and made it famous as a “fractal.” The term “Mandelbrot set” is due to
Douady and Hubbard [59]. Douady and Hubbard [58] proved that M is connected.

Fig. 3.7 The Mandelbrot set.

Theorem 3.7.3. The Mandelbrot set is a closed, simply connected subset of the
closed disk $D_2(0)$, consisting of those c such that

$$|f_c^{\circ n}(0)| \le 2, \qquad n = 1, 2, 3, \ldots. \tag{3.7.1}$$

Moreover

$$M \cap \mathbb{R} = [-2, 1/4]. \tag{3.7.2}$$

Proof. Suppose |c| > 2. Then $|f_c^{\circ 2}(0)| \ge |c|^2 - |c| = |c|(|c| - 1)$ and by induction

$$|f_c^{\circ n}(0)| \ge |c|(|c| - 1)^{n-1} \to \infty,$$

so $M \subset D_2(0)$.
Suppose m ≥ 1 and $|f_c^{\circ m}(0)| = 2 + \delta > 2$. If |c| > 2, then $c \notin M$. If |c| ≤ 2, then
$|f_c^{\circ(m+1)}(0)| \ge (2 + \delta)^2 - 2 \ge 2 + 4\delta$, and inductively

$$|f_c^{\circ(m+k)}(0)| \ge 2 + 4^k\delta \to \infty,$$

so again $c \notin M$. This proves that c ∈ M satisfies (3.7.1), and this characterization
implies that M is closed. Then C \ M is open. For any n ≥ 1, $f_c^{\circ n}(0)$ is holomorphic
in c, so by the maximum principle, if (3.7.1) holds on the boundary of a bounded open
set, it holds throughout that set. Therefore
C \ M has no bounded components, so M is simply connected.
Any finite limit point z of $f_c^{\circ m}(0)$ would be a fixed point of $f_c$. For c > 1/4, $f_c$
is strictly increasing on [0, ∞) and has no real fixed points, so $f_c^{\circ n}(0) \to \infty$ and
$c \notin M$. We know that $c \notin M$ if c < −2, so suppose that −2 ≤ c ≤ 1/4. Let a be the
larger of the two real roots of $f_c(z) = z$,

$$a = \frac{1}{2} + \frac{1}{2}\sqrt{1 - 4c}.$$

Note that $a^2 + c = a$ and that $|c| = |f_c(0)| \le a$. Inductively,

$$|f_c^{\circ(n+1)}(0)| = |[f_c^{\circ n}(0)]^2 + c| \le |a^2 + c| = a,$$

so c ∈ M.
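The characterization (3.7.1) gives a finite test for non-membership: once an iterate of 0 leaves the closed disk of radius 2, c is not in M. The following Python sketch illustrates this. The function name `in_mandelbrot` and the iteration cap `max_iter` are illustrative choices, not from the text, and a return value of True only means that no escape was detected within `max_iter` steps.

```python
def in_mandelbrot(c: complex, max_iter: int = 500) -> bool:
    """Apply criterion (3.7.1) to the critical orbit of f_c(z) = z^2 + c.

    Returns False as soon as |f_c^{on}(0)| > 2, which proves c is not in M.
    Returns True if no iterate among the first max_iter leaves the disk;
    this is evidence of membership, not a proof.
    """
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return False
    return True


# Sample points on the real axis, on both sides of the endpoints in (3.7.2).
real_samples = {c: in_mandelbrot(c) for c in (-2.01, -2.0, 0.0, 0.25, 0.26)}
```

The real samples are consistent with (3.7.2): the endpoints −2 and 1/4 pass the test, while −2.01 and 0.26 escape.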
The next result accounts for the large smooth blob that is the dominant part of M
and shows that the smaller pieces immediately next to the blob are tangent to it.

Theorem 3.7.4. M contains the cardioid

$$C = \left\{ \frac{2\lambda - \lambda^2}{4} : |\lambda| < 1 \right\}. \tag{3.7.3}$$

Moreover ∂C ⊂ ∂M.

Proof. As noted before, a fixed point $z_c$ for $f_c$ is $z_c = \frac12 \pm \frac12\sqrt{1 - 4c}$. The
multiplier is therefore $\lambda = 2z_c = 1 \pm \sqrt{1 - 4c}$, so the parameter associated to a given
multiplier λ is

$$c = c(\lambda) = \frac{2\lambda - \lambda^2}{4}.$$

Therefore $f_c$ has a (finite) attracting fixed point $z_c$ if and only if c ∈ C. If so, then by
Theorem 3.4.3, the immediate basin of attraction $A_0(z_c)$ contains the unique finite
critical point 0. Thus $f_c^{\circ n}(0) \to z_c$, so c ∈ M.
Let Ω be the component of the interior of M that contains C. Now $f_c^{\circ n}(0)$ is
a polynomial $P_n(c)$, and $\{P_n\}$ is a normal family on Ω. On the interior of C the
$P_n$ converge to the fixed point $z_c$, so by analyticity this is true on Ω. If $c \notin \overline{C}$ then
|λ| > 1, so $P_n(c)$ cannot converge to $z_c$ unless $P_n(c) = z_c$ for sufficiently large n. But
for each n, $P_{n+1}(c) = P_n(c)$ has only finitely many solutions, so the set of such c is at
most countable. Therefore Ω = C.
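The parametrization $c(\lambda) = (2\lambda - \lambda^2)/4$ can be checked numerically. The sketch below picks a multiplier with |λ| < 1, forms c(λ), and confirms that $z_c = \lambda/2$ is a fixed point of $f_c$ and that the critical orbit converges to it. The helper names and the sample value of λ are illustrative assumptions, not from the text.

```python
import cmath


def c_of_multiplier(lam: complex) -> complex:
    """Parameter c whose map f_c(z) = z^2 + c has a fixed point of multiplier lam."""
    return (2 * lam - lam * lam) / 4


def critical_orbit(c: complex, n: int) -> complex:
    """Return the n-th iterate of 0 under f_c."""
    z = 0j
    for _ in range(n):
        z = z * z + c
    return z


lam = 0.5 * cmath.exp(0.7j)          # sample multiplier inside the unit disk
c = c_of_multiplier(lam)
z_c = lam / 2                        # fixed point, since the multiplier is 2 * z_c
fixed_point_residual = abs(z_c * z_c + c - z_c)
distance_to_fixed_point = abs(critical_orbit(c, 200) - z_c)
```

Both `fixed_point_residual` and `distance_to_fixed_point` should be at the level of rounding error, in line with the claim that $f_c^{\circ n}(0) \to z_c$ for c in the cardioid.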
It is not difficult to account for the next most obvious feature on M, the disk-like
piece to the left of the main cardioid; see Exercise 20. This suggests a way to account
for further features of M as well.
For much more information about quadratic polynomial dynamics and the structure
of M, see Chapter VIII of Carleson and Gamelin [42] and the references
there.

Exercises

1. Determine the Julia set for each of the cases f ∈ Aut(S) discussed in the intro-
duction.
2. Verify that any quadratic can be put into a form (3.1.1).
3. Show that (3.1.4) satisfies (3.1.2).
4. Verify (3.1.5).
5. Verify Proposition 3.1.1.
6. Show that the points in the Cantor set associated with (3.1.6) are the points that
have the form

$$x = \sqrt{6 + \eta_1\sqrt{6 + \eta_2\sqrt{6 \pm \cdots}}}, \qquad \eta_k = \pm 1,$$

for any choice of the signs $\eta_k$.
7. Verify the assertions in Proposition 3.2.2 for the examples in (3.1.4).
8. Prove that the number of critical points of a rational function of degree d is
2d − 2.
9. Prove that if $\{U_n\}$ is a sequence of dense open sets in a complete metric space
X, then $\bigcap U_n$ is dense in X.
10. Suppose that f is a rational map, deg f ≥ 1, and suppose that $z_0$ is a fixed point
with the property that some subsequence of $\{f^{\circ n}\}$ converges to $z_0$ uniformly in
some neighborhood of $z_0$. Prove that $|f'(z_0)| < 1$.
11. Suppose f is rational, degree ≥ 2, with a fixed point at infinity. Determine the
multiplier λ.
12. Show that for any n > 0, there is a c such that f (z) = z 2 + c has a parabolic
fixed point with multiplier λ = exp(2πi/n).
13. Show that a fixed point z 0 of a rational map has the Liapunov stability property—
that for any ε > 0, if z is sufficiently close to z 0 then | f ◦n (z) − z 0 | < ε for all
n ≥ 0—if and only if z 0 belongs to F( f ).
14. Newton’s method for approximating the roots of a real polynomial P: given a
point x ∈ R, go to the point (x, P(x)), and follow the tangent line to the point
$x'$ where it meets the real axis.
(a) Show that $x' = x - P(x)/P'(x)$.
(b) Given a general complex polynomial P, let f be the rational function

$$f(z) = z - \frac{P(z)}{P'(z)}.$$

Show that the fixed points of f are ∞, which is repelling, and the zeros of P,
which are attracting.
(c) Suppose that P is a quadratic with two distinct zeros. Show that J( f ) consists
of a straight line and ∞.
(d) Determine the Julia set of a quadratic with a double zero.
15. Given |a| < 1, the linear fractional transformation

$$b_a(z) = \frac{z - a}{1 - \bar{a} z}$$

takes the unit disk D to itself. A Blaschke product is a product $f = \omega \prod_{j=1}^{m} b_{a_j}$,
where |ω| = 1.
(a) Show that J(f) is a subset of {z : |z| = 1}.
(b) Suppose that one of the factors is b0 . Show that 0 and ∞ are attracting fixed
points and J( f ) is the entire unit circle.
16. Suppose that z lies in the closure of the basin of attraction A of an attracting fixed
point or attracting periodic orbit of a rational function f . Prove that if z ∈ F( f ),
then z ∈ A.
17. Suppose that θ ∈ R is irrational. A result of Kronecker says that the powers of
ω = e2πiθ are equidistributed in the unit circle ∂D. An equivalent formulation is
in terms of the fractional parts {2nπ θ }. Here {x} is defined to be x − m, where
m is the integer such that m ≤ x < m + 1. Then the theorem says that for any
subinterval [a, b) ⊂ [0, 1), as N → ∞, the average number of values {2nπ θ },
|n| ≤ N, (counting multiplicity) that lie in [a, b) approaches b − a. This assertion
has a number of equivalent formulations. First, let f be the characteristic
function of the interval [a, b): f(x) = 1 if x ∈ [a, b), otherwise f(x) = 0. Then

$$\lim_{N\to\infty} \frac{1}{2N+1} \sum_{n=-N}^{N} f(\{2n\pi\theta\}) = \int_a^b f(x)\,dx. \tag{3.7.4}$$

Second, (3.7.4) is true for all such characteristic functions if and only if it is true
for all real linear combinations of such functions. Show that, in turn, (3.7.4) is true
for all such combinations if and only if (3.7.4) is true for all continuous functions
f : [0, 1) → R. Use the Weierstrass polynomial approximation theorem (see
the Remark after Corollary 5.1.8) to conclude that (3.7.4) is true for all such
continuous functions f if and only if it is true for each power function f m (x) =
x m , m = 0, 1, 2, . . . . Finally, prove Kronecker’s theorem by verifying that (3.7.4)
is indeed true for each power f m . (It is only at this last step that we use the
assumption that θ is irrational.)
18. Suppose that {qk } is an increasing sequence of positive integers. Prove that (3.5.4)
implies (3.5.2).
19. Suppose that θ is an algebraic irrational: i.e. θ is a zero of a polynomial P with
integer coefficients, and the minimum degree of such a polynomial is
m > 2. Prove Liouville’s result that there is a constant c > 0 such that

$$\left|\theta - \frac{p}{q}\right| \ge \frac{c}{q^m}$$

for every pair of integers p, q, q > 0. (Suppose that P(θ) = 0, where P has
integer coefficients and has minimal degree m. Then $P'(\theta) \ne 0$ and

$$\left| q^m \left[ P\!\left(\frac{p}{q}\right) - P(\theta) \right] \right|$$

is a non-zero integer. By the Mean Value Theorem, this expression is

$$q^m\, |P'(\theta^*)| \left| \frac{p}{q} - \theta \right|$$

for some $\theta^*$ between θ and p/q.)



20. Let $f_c(z) = z^2 + c$. Find the attracting fixed points of $f_c \circ f_c$ that are not fixed
points of $f_c$. Hint: the solutions of $f_c(z) = z$ are solutions of $f_c \circ f_c(z) = z$, so
$f_c(z) - z$ divides $f_c \circ f_c(z) - z$:

$$f_c \circ f_c(z) - z = [f_c(z) - z][g(z)],$$

where g is a quadratic in z. Compute g, and use g(z) = 0 to show that the multiplier
$\lambda = (f_c \circ f_c)'(z)$ satisfies |λ| < 1 if and only if |c + 1| < 1/4. Adapt the argument
in Theorem 3.7.4 to show that the disk |c + 1| < 1/4 is contained in the Mandelbrot
set M, and that the boundary of the disk is contained in the boundary of
M.

Remarks and further reading

The second flourishing of complex dynamics in the 20th century was celebrated in
a number of expositions, including books by Beardon [23] and Steinmetz [197] and
a review article by Lyubich [138]. Our presentation here relied mainly on the sys-
tematic and thorough treatment by Carleson and Gamelin [42], and the discursive
and profusely illustrated notes of Milnor [141]. The book by Carleson and Gamelin
contains proofs of all major results; that by Milnor is particularly complete on historical
detail and anecdote. Both treat topics beyond rational dynamics, such as entire
functions and dynamics on Riemann surfaces, as well as a more thorough treatment
of polynomial dynamics. Sullivan’s work and subsequent developments have made
use of quasiconformal mapping and methods of Teichmüller theory; see the expo-
sition by Shishikura in one of the supplementary chapters in the second edition of
Ahlfors’s lectures [5].
Chapter 4
Univalent functions and de Branges’s theorem

The Riemann mapping theorem says that any simply connected domain Ω ⊂ C that
is not all of C can be mapped conformally onto D. Moreover if we fix φ(0) ∈ Ω and
require $\varphi'(0) > 0$, then the conformal map φ : D → Ω is unique.
The study of univalent (injective) functions turns this around, by looking at injective
holomorphic functions f : D → C. Here the usual normalization is f(0) = 0,
$f'(0) = 1$. This fixes the position, orientation, and scale of the image f(D). Then
the series expansion of f at 0 has the form

$$f(z) = z + a_2 z^2 + a_3 z^3 + \cdots. \tag{4.0.1}$$

In principle, the coefficients {an } encode all the information about the conformal
image Ω = f (D). The set of such normalized conformal maps of D is denoted S.
(The S stands for the German schlicht, meaning “simple.” The functions in S are
often called “schlicht functions.”)
A particularly important example comes about as follows. The linear fractional
transformation

$$h(z) = \frac{1 + z}{1 - z}$$

maps D to the right half-plane {z : Re z > 0}. Therefore $h^2$ maps D to the complement
of the half-line {x : x ≤ 0}. We can adjust this to get a function in S by a translation
and dilation. The result is the Koebe function

$$K(z) = \frac{1}{4}\,[h(z)^2 - h(0)^2] = \frac{z}{(1 - z)^2} = z + 2z^2 + 3z^3 + 4z^4 + \cdots. \tag{4.0.2}$$

The image of D under K is the complement of the half-line {x : x ≤ −1/4}; see
Exercise 1. More generally, given θ ∈ R, define the rotated Koebe function

$$K_\theta(z) = e^{-i\theta} K(e^{i\theta} z) = z + \sum_{n=2}^{\infty} a_n z^n, \qquad a_n = n\,e^{i(n-1)\theta}. \tag{4.0.3}$$
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 79
R. Beals and R. S. C. Wong, More Explorations in Complex Functions, Graduate Texts
in Mathematics 298, https://doi.org/10.1007/978-3-031-28288-1_4

Then |an | = n, and the complement of the image K θ (D) is a rotation around the
origin of the half-line {x : x ≤ −1/4}.
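The identities in (4.0.2) and (4.0.3) lend themselves to a quick numerical check. The following sketch computes the Taylor coefficients of $z/(1-z)^2$ by squaring the geometric series, compares the two closed forms of K at a sample point, and evaluates K on the unit circle, where its values should be real and at most −1/4. All function names are illustrative, not from the text.

```python
import cmath


def koebe(z: complex) -> complex:
    """Closed form K(z) = z / (1 - z)^2."""
    return z / (1 - z) ** 2


def koebe_coefficients(n_terms: int) -> list:
    """First n_terms Taylor coefficients of K at 0, starting with the constant term."""
    geom = [1] * n_terms  # coefficients of 1/(1 - z)
    # Cauchy product of the geometric series with itself gives 1/(1 - z)^2.
    square = [sum(geom[j] * geom[k - j] for j in range(k + 1)) for k in range(n_terms)]
    return [0] + square[: n_terms - 1]  # multiplying by z shifts the coefficients


def rotated_koebe(z: complex, theta: float) -> complex:
    """K_theta(z) = exp(-i theta) K(exp(i theta) z), as in (4.0.3)."""
    return cmath.exp(-1j * theta) * koebe(cmath.exp(1j * theta) * z)


def boundary_value(theta: float) -> complex:
    """K(exp(i theta)) for theta not a multiple of 2 pi."""
    return koebe(cmath.exp(1j * theta))


z = 0.3 + 0.2j
h = (1 + z) / (1 - z)
closed_form_gap = abs((h * h - 1) / 4 - koebe(z))  # uses h(0) = 1
```

Here `koebe_coefficients(6)` reproduces the coefficients 0, 1, 2, 3, 4, 5 of (4.0.2), and `boundary_value(theta)` lands on the omitted half-line, which is one way to see why the radius 1/4 appears below.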
Koebe proved that there is a δ > 0 such that for each f ∈ S, the image f (D)
necessarily contains the disk Dδ (0). The example f = K shows that δ ≤ 1/4, and
Koebe conjectured that δ = 1/4 is sharp, i.e. f ∈ S implies f (D) ⊃ D1/4 (0). This
is correct. The result is referred to as “Koebe’s one-quarter theorem,” although it
was Bieberbach who proved it. Given that any such image must contain D1/4 (0) and
must be simply connected, we might consider the “largest” possible such domain to
be one whose complement is a ray running to ∞ from a point a with |a| = 1/4—in
other words, the image of a Koebe function. As a result, we might suspect that for
any univalent function with expansion (4.0.1), the coefficients must satisfy |an | ≤ n,
with equality only for the Koebe functions.
Bieberbach [27] proved in 1916 that this is true for n = 2, i.e. |a2 | ≤ 2, with equal-
ity only for the Koebe functions. He went on to pose the full Bieberbach conjecture:
each coefficient in the expansion (4.0.1) of a normalized conformal map f of D into
C satisfies
|an | ≤ n, (4.0.4)

with equality only for the Koebe functions.


Proving (or disproving) the Bieberbach conjecture was the outstanding challenge
of research on univalent functions for much of the 20th century. Aside from some spe-
cial cases, progress was mainly made on one coefficient at a time. The full conjecture
was finally proved by de Branges in 1984 [52].
In Section 4.1 we give Bieberbach’s proof of the result for n = 2 and derive some
of the important consequences for the general theory. These include the proof of the
one-quarter theorem, and proofs of the growth and distortion theorems of Koebe.
The rest of this chapter is devoted to the theory that leads up to the proof of the
full Bieberbach conjecture. Section 4.2 begins with an outline of progress before
1984, and its relation to the proof of the full conjecture. Section 4.3 introduces
slit mappings and gives Carathéodory’s proof that they are dense in S. To prove
Bieberbach’s conjecture it is enough to prove it for slit mappings. Section 4.4 treats
Loewner’s theory of slit mappings, which enabled him to prove the conjecture for
the third coefficient.
Section 4.5 introduces the conjectures of Robertson and of Milin. It is shown
that the Robertson conjecture implies the Bieberbach conjecture and that the Milin
conjecture implies the Robertson conjecture.
The achievement of de Branges was to prove the Bieberbach conjecture by proving
the Milin conjecture. In Section 4.7 we give an expanded version of Weinstein’s later,
very condensed, proof of the Milin conjecture [213].
Weinstein’s proof relies on some particular facts from the Loewner theory, on
the generating function representation of the Legendre polynomials, and on Leg-
endre’s addition formula. In the preparatory section, Section 4.6, we develop the
needed version of the Loewner theory. We also derive some general properties of
the Legendre polynomials from the generating function representation, and outline
the relation to the addition theorem and to its interpretation in connection with
surface harmonics.
Some further history is outlined in the section on remarks and further reading.

4.1 Bieberbach’s theorem and some consequences

As noted in the introduction to this chapter, the term univalent means injective, i.e.
not taking the same value twice. In complex function theory the term is primarily
used for holomorphic functions, also (especially in the older literature) called schlicht
functions.
If g : D → C is univalent, then there is a unique affine map h(z) = az + b such
that f = h ◦ g satisfies the normalization conditions

$$f(0) = 0, \qquad f'(0) = 1. \tag{4.1.1}$$

As we noted in the introduction, the set of univalent maps f that are defined on D
and satisfy (4.1.1) is denoted S.
If f belongs to S, then the function

$$g(z) = \frac{1}{f(1/z)} = z\left[1 + a_2 z^{-1} + a_3 z^{-2} + \cdots\right]^{-1}, \qquad |z| > 1, \tag{4.1.2}$$

is univalent and has a simple pole at ∞. Let us consider functions of this type. By a
translation we may get rid of the constant term in the expansion and have

$$h(z) = z + b_1 z^{-1} + b_2 z^{-2} + b_3 z^{-3} + \cdots, \qquad |z| > 1. \tag{4.1.3}$$

A key result due to Gronwall [93] is known as the Gronwall area theorem.

Theorem 4.1.1. (Area Theorem) If a function h given by the formula (4.1.3) is
univalent, then

$$\sum_{n=1}^{\infty} n|b_n|^2 \le 1. \tag{4.1.4}$$

Proof: For r > 1, let $E_r$ be the complement of the image {h(z) : |z| ≥ r}, and let
$\Gamma_r = \{h(z) : |z| = r\}$.

Univalence implies that $\Gamma_r$ is a simple closed curve that encloses $E_r$. By (1.2.10),
the area of $E_r$ is

$$A_r = \frac{1}{2i}\int_{|z|=r} \overline{h(z)}\,h'(z)\,dz
     = \frac{1}{2i}\int_{|z|=r} \left[\bar{z} + \sum_{n=1}^{\infty} \bar{b}_n \bar{z}^{-n}\right]\left[1 - \sum_{m=1}^{\infty} m b_m z^{-m-1}\right] dz.$$

On the circle |z| = r, write $z = re^{i\theta}$ so that the integral becomes

$$\frac{1}{2}\int_0^{2\pi} \left[re^{-i\theta} + \sum_{n=1}^{\infty} \bar{b}_n r^{-n} e^{in\theta}\right]\left[re^{i\theta} - \sum_{m=1}^{\infty} m b_m r^{-m} e^{-im\theta}\right] d\theta.$$

The series converge uniformly, so we may take the product and integrate term-by-term.
Since

$$\frac{1}{2}\int_0^{2\pi} e^{ip\theta}\,d\theta = \begin{cases} \pi & \text{if } p = 0, \\ 0 & \text{if } p = \pm 1, \pm 2, \ldots, \end{cases} \tag{4.1.5}$$

it follows that

$$A_r = \pi\left[r^2 - \sum_{n=1}^{\infty} n|b_n|^2 r^{-2n}\right].$$

Letting r decrease to 1, the limit of the left side is the outer measure $m^*(E)$ of the
complement E of the image of the map h. Therefore

$$0 \le m^*(E) = \pi\left[1 - \sum_{n=1}^{\infty} n|b_n|^2\right].$$

In particular, equality holds in (4.1.4) if and only if the complement of the image
of h has measure zero. If h = g has the form (4.1.2), where f belongs to S, then this
is equivalent to saying that the complement of the image of f has measure zero.
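The area formula $A_r = \pi\,[r^2 - \sum n|b_n|^2 r^{-2n}]$ from the proof can be compared with a direct polygon (shoelace) approximation of the region enclosed by the image of the circle |z| = r. The sketch below does this for a map with finitely many coefficients. The function names and the sample coefficients are assumptions made for illustration; the sample satisfies $\sum n|b_n| \le 1$, a standard sufficient condition for univalence in |z| > 1.

```python
import cmath
import math


def h(z: complex, b: list) -> complex:
    """Evaluate h(z) = z + b_1 z^-1 + b_2 z^-2 + ... for a finite coefficient list b."""
    return z + sum(bn * z ** (-n) for n, bn in enumerate(b, start=1))


def enclosed_area(b: list, r: float, samples: int = 20000) -> float:
    """Shoelace approximation of the area enclosed by the curve h(r e^{i theta})."""
    pts = [h(r * cmath.exp(2j * math.pi * k / samples), b) for k in range(samples)]
    twice_area = sum(
        (pts[k].conjugate() * pts[(k + 1) % samples]).imag for k in range(samples)
    )
    return 0.5 * twice_area


def area_formula(b: list, r: float) -> float:
    """Closed form pi * (r^2 - sum n |b_n|^2 r^(-2n)) from the proof of the area theorem."""
    return math.pi * (
        r ** 2 - sum(n * abs(bn) ** 2 * r ** (-2 * n) for n, bn in enumerate(b, start=1))
    )


b_sample = [0.3, 0.1j]     # hypothetical coefficients b_1, b_2
r_sample = 1.5
numeric = enclosed_area(b_sample, r_sample)
exact = area_formula(b_sample, r_sample)
```

For $h(z) = z + b_1/z$ the image of the circle is an ellipse with semi-axes $r \pm |b_1|/r$, so the formula reduces to the familiar $\pi(r^2 - |b_1|^2/r^2)$.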
Corollary 4.1.2. If h given by (4.1.3) is univalent then for each n, $|b_n|^2 \le 1/n$.
The remaining ingredient in Bieberbach’s proof of his conjecture in the case n = 2
is the square root transformation. Suppose f belongs to S. Then

$$f(z^2) = z^2\left[1 + \sum_{n=1}^{\infty} a_{n+1} z^{2n}\right], \qquad |z| < 1.$$

By assumption, f is univalent, so the term in brackets is never 0. Therefore we may
choose a branch of the square root that is 1 at z = 0 and define

$$f_2(z) \equiv f(z^2)^{1/2} = z\left[1 + \frac{a_2}{2} z^2 + \cdots\right], \qquad |z| < 1. \tag{4.1.6}$$

The function $f_2$ is single-valued (Exercise 2) so it belongs to S.
Theorem 4.1.3. (Bieberbach) If f belongs to S and has the expansion (4.0.1), then
$|a_2| \le 2$. The inequality is strict unless $f = K_\theta$ for some θ.

Proof: Let

$$g(z) = \frac{1}{f_2(1/z)} = \frac{1}{f(1/z^2)^{1/2}} = z - \frac{a_2}{2} z^{-1} + \cdots. \tag{4.1.7}$$

By Corollary 4.1.2, $|a_2| \le 2$. Equality implies that the remaining coefficients are
zero, so

$$g(z) = z - \frac{e^{i\theta}}{z}, \tag{4.1.8}$$

for some θ ∈ R. It follows that $f = K_\theta$.


We are also in a position to prove Koebe’s conjecture.
Theorem 4.1.4. (One-quarter theorem) If f belongs to S, then the image f (D)
contains D1/4 (0), the disk of radius 1/4 centered at the origin.

Proof: Suppose that f omits the value w. The function

$$g(z) = \frac{w f(z)}{w - f(z)}$$

is the composition h ◦ f of a linear fractional transformation with f, so it is also
univalent and is easily seen to belong to S. The coefficient of $z^2$ in its expansion is

$$\frac{g''(0)}{2} = a_2 + \frac{1}{w},$$

where $a_2$ is the coefficient of $z^2$ in the expansion of f. Therefore

$$\left|\frac{1}{w}\right| = \left|a_2 + \frac{1}{w} - a_2\right| \le \left|a_2 + \frac{1}{w}\right| + |a_2| \le 4, \tag{4.1.9}$$

showing that no value in $D_{1/4}(0)$ can be omitted.


In fact equality can happen in (4.1.9) only if |a2 | = 2, so any function in S that is
not a Koebe function has a disk Dr (0) larger than D1/4 (0) in its image.
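The coefficient formula used in the proof of the one-quarter theorem can be verified by a short expansion. The following is a sketch, writing $f(z) = z + a_2 z^2 + O(z^3)$ as in (4.0.1) and expanding the geometric series in $f(z)/w$, which is valid near z = 0:

```latex
g(z) = \frac{w f(z)}{w - f(z)}
     = f(z)\left[1 - \frac{f(z)}{w}\right]^{-1}
     = f(z) + \frac{f(z)^2}{w} + O(z^3)
     = z + \left(a_2 + \frac{1}{w}\right) z^2 + O(z^3).
```

In particular g(0) = 0 and g′(0) = 1, so g is normalized, and Bieberbach's theorem applied to g gives $|a_2 + 1/w| \le 2$, which is the bound used in (4.1.9).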
Bieberbach’s theorem has other consequences for the general theory of univalent
functions. The following simple lemmas are key.
Lemma 4.1.5. For f in S and g in Aut(D) the function

$$h = \frac{f \circ g - f(g(0))}{f'(g(0))\,g'(0)} \tag{4.1.10}$$

belongs to S.

Proof: This function is the composition of f ◦ g with an affine map, so it is univalent.
Clearly h(0) = 0, and the denominator is chosen so that $h'(0) = 1$.

Lemma 4.1.6. For f in S and t in D,

$$\left|\frac{f''(t)}{f'(t)}\,(1 - |t|^2) - 2\bar{t}\right| \le 4. \tag{4.1.11}$$

Proof: In (4.1.10) we choose t ∈ D and take

$$g(z) = \frac{z + t}{1 + \bar{t} z}.$$

Then

$$g(0) = t, \qquad g'(0) = 1 - |t|^2, \qquad g''(0) = -2\bar{t}\,(1 - |t|^2),$$

so

$$h''(0) = \frac{(f \circ g)''(0)}{f'(g(0))\,g'(0)}
        = \frac{f''(g(0))\,g'(0)^2 + f'(g(0))\,g''(0)}{f'(g(0))\,g'(0)}
        = \frac{f''(t)}{f'(t)}\,g'(0) + \frac{g''(0)}{g'(0)}
        = \frac{f''(t)}{f'(t)}\,(1 - |t|^2) - 2\bar{t}.$$

But $h''(0)$ is twice the second coefficient in the expansion of h, so Bieberbach’s
theorem gives (4.1.11).
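Theorem 4.1.7. (Koebe’s distortion theorem) For f in S,

$$\frac{1 - \rho}{(1 + \rho)^3} \le |f'(z)| \le \frac{1 + \rho}{(1 - \rho)^3}, \qquad \rho = |z|. \tag{4.1.12}$$

Proof: Replacing t by z in (4.1.11) and multiplying by $|z|/(1 - \rho^2)$ gives the estimate

$$\left|\frac{z f''(z)}{f'(z)} - \frac{2\rho^2}{1 - \rho^2}\right| \le \frac{4\rho}{1 - \rho^2}.$$

Thus

$$\frac{2\rho^2 - 4\rho}{1 - \rho^2} \le \operatorname{Re}\left[\frac{z f''(z)}{f'(z)}\right] \le \frac{2\rho^2 + 4\rho}{1 - \rho^2}. \tag{4.1.13}$$

Since f is univalent and $f'(0) = 1$, we may take the principal branch of $\log f'$.
Writing $z = \rho e^{i\theta}$, we have

$$\frac{\partial}{\partial\rho} \log f'(z) = \frac{z\,f''(z)}{|z|\,f'(z)}.$$

Multiplying by ρ = |z| and taking the real part gives

$$\rho\,\operatorname{Re}\left[\frac{\partial}{\partial\rho} \log f'(z)\right] = \operatorname{Re}\left[\frac{z f''(z)}{f'(z)}\right].$$

From this and (4.1.13) we obtain

$$\frac{2\rho - 4}{1 - \rho^2} \le \frac{\partial}{\partial\rho} \log|f'(\rho e^{i\theta})| \le \frac{2\rho + 4}{1 - \rho^2}.$$

Integrating from 0 to ρ gives

$$\log\frac{1 - \rho}{(1 + \rho)^3} \le \log|f'(\rho e^{i\theta})| \le \log\frac{1 + \rho}{(1 - \rho)^3},$$

and exponentiating yields (4.1.12).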

Theorem 4.1.8. (Growth theorem) For f in S,

$$\frac{\rho}{(1 + \rho)^2} \le |f(z)| \le \frac{\rho}{(1 - \rho)^2}, \qquad \rho = |z|. \tag{4.1.14}$$

Proof: Let $z = \rho e^{i\theta}$ be fixed, 0 < ρ < 1. Since f(0) = 0,

$$f(z) = \int_0^{\rho} f'(\sigma e^{i\theta})\,e^{i\theta}\,d\sigma.$$

By the distortion theorem, we have

$$|f(z)| \le \int_0^{\rho} |f'(\sigma e^{i\theta})|\,d\sigma \le \int_0^{\rho} \frac{1 + \sigma}{(1 - \sigma)^3}\,d\sigma = \frac{\rho}{(1 - \rho)^2}.$$

This gives the upper bound in (4.1.14).

To establish the lower bound, since $\rho(1 + \rho)^{-2} < \frac14$ for 0 ≤ ρ < 1, we only
need to consider the case $|f(z)| < \frac14$. Then by the Koebe one-quarter theorem, the
straight line segment from 0 to w = f(z) lies entirely within f(D). Let Γ be the
pre-image of this segment. Then Γ is a simple arc from 0 to z, and

$$f(z) = \int_{\Gamma} f'(\zeta)\,d\zeta.$$

But $\arg f'(\zeta)\,d\zeta = \arg dw$ = constant along Γ. Therefore

$$|f(z)| = \left|\int_{\Gamma} f'(\zeta)\,d\zeta\right| = \int_{\Gamma} |f'(\zeta)|\,|d\zeta|.$$

By the distortion theorem,

$$|f(z)| \ge \int_0^{\rho} \frac{1 - \sigma}{(1 + \sigma)^3}\,d\sigma = \frac{\rho}{(1 + \rho)^2},$$

completing the proof.


The bounds in (4.1.12) and (4.1.14) are sharp. They are attained by the Koebe
function; see Exercise 3.
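The bounds (4.1.12) and (4.1.14) can be spot-checked on a grid of points. The sketch below tests them for the Koebe function, where equality occurs on the real axis, and for $f(z) = z + z^2/2$, which is univalent on D because $f(z_1) - f(z_2) = (z_1 - z_2)(1 + (z_1 + z_2)/2)$. The helper `bounds_hold`, the grid, and the relative tolerance are illustrative assumptions, and a grid check is of course not a proof.

```python
import cmath
import math


def koebe(z: complex) -> complex:
    return z / (1 - z) ** 2


def koebe_prime(z: complex) -> complex:
    """Derivative K'(z) = (1 + z) / (1 - z)^3."""
    return (1 + z) / (1 - z) ** 3


def bounds_hold(f, df, radii, angles: int = 360, tol: float = 1e-9) -> bool:
    """Check the distortion bound (4.1.12) and growth bound (4.1.14) on circles |z| = rho."""
    for rho in radii:
        lo_d, hi_d = (1 - rho) / (1 + rho) ** 3, (1 + rho) / (1 - rho) ** 3
        lo_g, hi_g = rho / (1 + rho) ** 2, rho / (1 - rho) ** 2
        for k in range(angles):
            z = rho * cmath.exp(2j * math.pi * k / angles)
            if not lo_d * (1 - tol) <= abs(df(z)) <= hi_d * (1 + tol):
                return False
            if not lo_g * (1 - tol) <= abs(f(z)) <= hi_g * (1 + tol):
                return False
    return True


radii = [0.1, 0.5, 0.9]
koebe_ok = bounds_hold(koebe, koebe_prime, radii)
quadratic_ok = bounds_hold(lambda z: z + 0.5 * z * z, lambda z: 1 + z, radii)
# z + 3 z^2 is not univalent on D (its derivative vanishes at z = -1/6).
non_univalent_ok = bounds_hold(lambda z: z + 3 * z * z, lambda z: 1 + 6 * z, radii)
```

The third test function is included only as a contrast: it violates the lower distortion bound already on the circle of radius 0.1.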

4.2 The Bieberbach conjecture: history and strategy

The attack on the Bieberbach conjecture is a case study in research work on a natural
and difficult problem.
One approach is to start with some subclass of S. The conjecture was proved in
1921, for f with range f (D) that is starlike with respect to 0, by Nevanlinna [154],
and in 1931-32, for f with real coefficients, independently by Rogosinski [180] and
Dieudonné [55].
Another approach is to look for the sharpest uniform upper bound that one can
find for all f in S. Successive results were
$|a_n| \le e \cdot n$                       Littlewood [134], 1925
$|a_n| \le \frac34 e \cdot n$               Bazilevic [18], 1951
$|a_n| \le (\frac12 e + 1.51) \cdot n$      Baernstein [14], 1974
$|a_n| \le \frac12 e \cdot n$               Milin [140], 1965
$|a_n| \le 1.243 \cdot n$                   Fitzgerald [72], 1972
$|a_n| \le \sqrt{7/6} \cdot n$              Horowitz [110], 1976
We mention also an asymptotic result of Hayman [97] in 1955:

$$\lim_{n\to\infty} \frac{|a_n|}{n} = \alpha(f) \le 1,$$

with equality if and only if f is a Koebe function.
Finally, one can attack one coefficient at a time:

$|a_3| \le 3$     Loewner [136], 1923
$|a_4| \le 4$     Garabedian and Schiffer [81], 1955
$|a_6| \le 6$     Ozawa [161], Pederson [163], 1976
$|a_5| \le 5$     Pederson and Schiffer [164], 1980

The Koebe functions are examples of slit mappings: functions f ∈ S with the
property that the complement of f (D) is a Jordan path from some finite point in C
to the point at ∞. Carathéodory [38] introduced a notion of convergence of domains
that allowed him to show that slit mappings are dense in S, in the sense of uniform
convergence on compact subsets of D. Therefore attacks on the Bieberbach conjecture
can focus on slit mappings. Loewner used this fact, and a construction of a well-
chosen family of slit mappings, to prove his result for the third coefficient. Loewner’s
method came to play an important part in the proof of the full conjecture. (The
original paper [136] was written by Karl Löwner. After emigrating to the U.S., the
author became Charles Loewner.)

4.3 The Carathéodory convergence theorem

A function f in S is called a slit mapping if the complement of f(D) in C is a Jordan
path. An example is the Koebe function (4.0.2), where the complement of K(D) is a
half-line. Since f(D) is simply connected, this curve must tend to infinity. The image
f(D) itself is termed a slit domain. As we shall see, slit mappings are dense in S. A
preliminary result is Carathéodory’s convergence theorem. This involves a particular
notion of convergence of a sequence of simply connected domains $\Omega_n \subset \mathbb{C}$. If 0 is an
interior point of $\bigcap \Omega_n$ then the kernel of $\{\Omega_n\}$ is the largest domain Ω that contains
0 and has the property that each compact subset of Ω lies in all but finitely many $\Omega_n$.
(It is important to remember that a domain is, by definition, a connected set.) Any
domain that is the union of such domains has this property: see Exercise 9; therefore
there is a largest such domain. If 0 is not an interior point of $\bigcap \Omega_n$, then the kernel
is taken to be {0}. In either case, $\{\Omega_n\}$ is said to converge to its kernel in the sense of
Carathéodory if every subsequence of $\{\Omega_n\}$ has the same kernel.
This clearly needs some illustration. Let Ωn be the complement of the path con-
sisting of a half-line and a portion of the unit circle

$$\Gamma_n = [1, \infty) \cup \{e^{i\theta} : 0 \le \theta \le 2\pi(1 - 1/n)\}.$$

It is easily checked that the kernel is D and that Ωn converges to D; see Figure 4.1.
This example also hints at how to accomplish an approximation by slit mappings.

Fig. 4.1 Slit domain approximation to a disk.

Theorem 4.3.1. (Carathéodory) Let $\{\Omega_n\}$ be a sequence of simply connected plane
domains not equal to C, such that 0 is an interior point of $\bigcap \Omega_n$. Suppose the kernel
Ω of $\{\Omega_n\}$ is not all of C. Let $f_n : D \to \Omega_n$ be conformal, with $f_n(0) = 0$ and
$f_n'(0) > 0$. Then $f_n \to f$ uniformly on each compact subset of D if and only if $\{\Omega_n\}$
converges to Ω in the sense of Carathéodory. In the case of convergence, Ω is simply
connected, and the inverse maps $f_n^{-1}$ converge to $f^{-1}$ uniformly on each compact
subset of Ω.

Proof: Suppose that $f_n \to f$ uniformly on compact subsets of D. By Proposition
2.3.6, f is either constant or univalent. Some disk $D = D_\rho(0)$ belongs to all $\Omega_n$.
Then the functions $g_n = f_n^{-1} : \Omega_n \to D$ map 0 to 0 and $g_n(\Omega_n)$ contains D, so by
Schwarz’s lemma, $|g_n'(0)| < 1/\rho$. Therefore $f_n'(0) > \rho$, so the limit f is univalent.
We need to show that f (D) = Ω, and that {Ωn } converges to Ω in the sense of
Carathéodory.
We first show that f(D) ⊂ Ω. Let E be a compact subset of f(D), and let Γ
be a smooth Jordan curve in f(D) that encloses E. Let δ > 0 be the distance from E to
Γ, and $\Gamma_1 = f^{-1}(\Gamma)$. We will show that $E \subset \Omega_n$ for all sufficiently large n. Fix
$z_0 \in E$. Then $|f(z) - z_0| \ge \delta$ for $z \in \Gamma_1$. By the uniform convergence of $\{f_n\}$ on $\Gamma_1$,
$|f_n(z) - f(z)| < \delta$ for all $z \in \Gamma_1$ and sufficiently large n. In view of

$$|f_n(z) - f(z)| < |f(z) - z_0|, \qquad z \in \Gamma_1,$$

Rouché’s theorem implies that $f_n(z) - z_0 = f(z) - z_0 + [f_n(z) - f(z)]$ has the
same number of zeros inside $\Gamma_1$ as $f(z) - z_0$, namely, one. This shows that $z_0$ is
in $\Omega_n$ for all $n > n_0$, where $n_0$ depends on E but not on $z_0$. In other words, $E \subset \Omega_n$
for all $n > n_0$. By the definition of the kernel Ω, this means that f(D) ⊂ Ω.
The inverse functions $g_n = f_n^{-1}$ are defined on E for all $n \ge n_0$, and $|g_n(w)| \le 1$.
Therefore the $g_n$ are a normal family. Renumbering a convergent subsequence,
we get $\{g_n\}$ that converges uniformly on compact subsets of f(D) to a function
g holomorphic on f(D) with g(0) = 0 and $g'(0) > 0$. Indeed, restricting n to the
subsequence,

$$0 < \frac{1}{f'(0)} = \lim_{n\to\infty} \frac{1}{f_n'(0)} = \lim_{n\to\infty} g_n'(0) = g'(0).$$

Thus, g is univalent.
The next step is to show that $g = f^{-1}$. Fix $z_0 \in D$ and let $w_0 = f(z_0)$. Choose
ε > 0 so that the circle $\Gamma = \{z : |z - z_0| = \varepsilon\}$ lies in D, and let $\Gamma_1 = f(\Gamma)$. Let δ be
the distance of $w_0$ from $\Gamma_1$. Then $|f(z) - w_0| \ge \delta$ for z ∈ Γ, while $|f_n(z) - f(z)| < \delta$
on Γ for all large n. As above, it follows by Rouché’s theorem that for large n
there is precisely one $z_n$ inside Γ such that $f_n(z_n) = w_0$. Thus $|z_n - z_0| < \varepsilon$ and
$z_n = g_n(w_0)$. Therefore, if n is so large that $|g_n(w_0) - g(w_0)| < \varepsilon$, then for $g_n$ in the
convergent subsequence

$$|g(w_0) - z_0| \le |g(w_0) - g_n(w_0)| + |z_n - z_0| < 2\varepsilon.$$

Since ε > 0 is arbitrary, $g(w_0) = z_0$. Since $z_0 \in D$ is arbitrary, $g = f^{-1}$.


We know now that any convergent subsequence of {gn } converges, uniformly on
compact subsets of f (D), to f −1 . A further application of Montel’s theorem shows
that the whole sequence {gn } converges to f −1 .
Now let Ω be the kernel of {Ωn }, and let F be a compact subset of Ω. All but
finitely many gn are defined on F. Since f (D) ⊂ Ω, Theorem 2.3.5 applies, so {gn }
converges (to f −1 ) uniformly on F. Since gn (F) ⊂ D, we have F ⊂ f (D). This is
true for every such F, so we have proved that Ω = f (D). The preceding argument
applies to any subsequence of the original {Ωn }, showing that its kernel is f (D).
Thus, all subsequences of {Ωn } have the same kernel.
Suppose now that $\{\Omega_n\}$ converges in the sense of Carathéodory to a domain
Ω = f(D). Then the sequence $\{f_n'(0)\}$ is bounded. Indeed, if $|f_n'(0)| > n$ for some
(renumbered) subsequence, the Koebe one-quarter theorem shows that $f_n(D)$ contains
$D = D_{n/4}(0)$, and it follows that the subsequence has kernel C. This contradiction
shows that there exists c ∈ R such that $f_n'(0) < c$ for all n. By Theorem
4.1.8

$$|f_n(z)| \le f_n'(0)\,\frac{|z|}{(1 - |z|)^2}, \qquad z \in D,$$

which shows that the sequence $\{f_n\}$ is uniformly bounded on each compact subset,
hence is normal. Some subsequence converges to a holomorphic f, uniformly on
compact subsets of D. By the first part of this proof, f maps D onto Ω. Again it
follows that the whole sequence $\{f_n\}$ converges to f, uniformly on compact subsets
of D.
We now reach the punch line.

Theorem 4.3.2. If f is in S, there is a sequence of slit mappings $\{f_n\}$ in S such that
$f_n \to f$ uniformly on compact subsets of D.

Proof: Suppose first that f extends holomorphically to a larger disk $D_{1+\delta}(0)$. Then
f(∂D) is an analytic Jordan curve. Let

$$w_0 = f(1), \qquad w_n = f(e^{2\pi i(1 - 1/n)}),$$

and let $\Gamma_n$ be the path that consists of an arc in the complement of f(D) running
from ∞ to $w_0$ and the arc

$$w = f(e^{i\theta}), \qquad 0 \le \theta \le 2\pi\left(1 - \frac{1}{n}\right);$$

see Figure 4.2.

Fig. 4.2 Approximating a general univalent function by slit maps.

The complement $\Omega_n$ of $\Gamma_n$ is simply connected. Let $g_n$ be the conformal map
of D onto $\Omega_n$ with $g_n(0) = 0$, $g_n'(0) > 0$. It is geometrically clear that Q = f(D)
is the kernel of the family $\{\Omega_n\}$ and that $\Omega_n \to Q$ in the sense of Carathéodory.
Therefore Theorem 4.3.1 implies that $g_n \to f$ uniformly on compact subsets of
D. By Cauchy’s formula for the derivative, this implies that $g_n'(0) \to f'(0) = 1$.
Therefore the functions

$$h_n(z) = \frac{g_n(z)}{g_n'(0)}$$

are slit mappings in S that converge to f uniformly on compact subsets of D.


For a general f in S, let $f_\sigma(z) = f(\sigma z)/\sigma$, 0 < σ < 1. Then $f_\sigma$ extends to
$D_{1/\sigma}(0)$, so it can be approximated by slit mappings. As σ → 1, $f_\sigma \to f$, so f can be
approximated by slit mappings.

The importance of slit mappings for the main subject of this chapter is made clear
by the following.
Corollary 4.3.3. If Bieberbach’s conjecture is true for the subset Ss of S consisting
of slit mappings, then it is true for S.
The proof is left as Exercise 10.

4.4 Slit mappings and Loewner’s equation

In Section 4.3 we saw that slit mappings are dense in S. Given such a mapping f ∈ S,
the associated slit Γ is C \ f (D), the set of values that are not attained by f .
Loewner [136] attacked the converse problem of determining such a mapping f
from knowledge of the slit Γ . After a suitable parametrization of the equation of the
slit, he was able to determine the associated map f ∈ S from the limiting value of
the solution of a certain differential equation. In fact we shall see that

$$f(z) = \lim_{t\to\infty} e^t g(z, t),$$

where $g(z, t) = g_t(z)$ is the solution of a first-order differential equation in t with
initial condition g(z, 0) = z for z ∈ D.
The first steps in the argument are to shrink the map f to a family of maps $\{f_t\}_{t \ge 0}$
in S by shrinking the slit toward ∞, and to choose a canonical parametrization.
Then $f_0 = f$ and $f_t$ converges to the identity map as t → ∞. The decisive step is
to examine $g_t = f_t^{-1} \circ f$ and show that $g_t$ satisfies a first-order differential equation
with respect to t.
Suppose that f ∈ S is a slit mapping. Let Γ be the Jordan arc that is the complement
of Ω = f(D), parametrized by a map

$$t \mapsto \sigma(t), \quad 0 \le t < b, \qquad \lim_{t\to b} \sigma(t) = \infty.$$

Let

$$\Gamma_t = \{\sigma(s) : t \le s < b\}, \qquad \Omega_t = \mathbb{C} \setminus \Gamma_t.$$

Thus $\Omega_0 = \Omega$, the domains $\Omega_t$ increase with t, and $\bigcup_t \Omega_t = \mathbb{C}$. Let $f_t$,

$$f_t(z) = \beta(t)\,[z + b_2(t) z^2 + b_3(t) z^3 + \cdots],$$

be the conformal map from D onto $\Omega_t$ with $f_t'(0) = \beta(t) > 0$. Then $f_0 = f$. Given
t ∈ [0, b), the Carathéodory convergence theorem says that $f_s \to f_t$ uniformly on
compact subsets of D as s → t. It follows that the coefficients $b_n(t)$ are continuous functions
of t.
Suppose that s < t. Then the function $f_t^{-1}\circ f_s$ maps D to a proper subset of
itself and fixes z = 0. By Schwarz's lemma, its derivative at z = 0, which is positive,
is < 1. Therefore $\beta(t) = f_t'(0)$ is strictly increasing with t. Since $\beta(0) = 1$, we
may choose the parametrization σ(t) so that $\beta(t) = e^t$. This is called the standard
parametrization of Γ.
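We claim that in the standard parametrization, $b = \infty$. In fact, given $r > 0$, the arc
$\Gamma_t$ lies outside the disk $D_r(0)$ for t close to b. The boundary values of $f_t$ lie on $\Gamma_t$,
so the maximum principle applied to $z/f_t(z)$ gives
\[ \left|\frac{z}{f_t(z)}\right| \le \frac{1}{r} \]
for $z \in D$. In particular, $r \le |f_t'(0)| = e^t$ for t close to b. Since r is arbitrary, this
shows that $e^t \to \infty$ as $t \to b$, so $b = \infty$. Thus our parametrization is
\[ f_t(z) = e^t[z + b_2(t)z^2 + b_3(t)z^3 + \dots], \quad 0 \le t < \infty. \]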

The function $g_t = f_t^{-1}\circ f$ maps D conformally onto D minus the pre-image
of $\Gamma_t$, which is an arc that extends inward from the boundary. This function has an
expansion
\[ g_t(z) = e^{-t}[z + a_2(t)z^2 + a_3(t)z^3 + \dots], \tag{4.4.1} \]
where each $a_n(t)$ is a polynomial in $b_2(t), \dots, b_n(t)$; Exercise 11. In particular,
$g_0(z) = z$.
Next, following Duren [64], we prove convergence of gt to f and establish
Loewner’s differential equation, (4.4.3).

Theorem 4.4.1. Let f be a slit map, let σ(t) be the standard parametrization of the
omitted path Γ, and let the functions $f_t$ and $g_t$ be defined as above. Then
\[ \lim_{t\to\infty} e^t g_t(z) = f(z) \tag{4.4.2} \]
uniformly on compact subsets of D. There is a continuous $k : [0,\infty) \to \partial D$ such that $g_t$
satisfies Loewner's differential equation
\[ \frac{\partial g_t}{\partial t}(z) = -g_t(z)\,\frac{1 + k(t)\,g_t(z)}{1 - k(t)\,g_t(z)}. \tag{4.4.3} \]

Proof: Since $g_t = f_t^{-1}\circ f$ and f maps compact subsets of D to compact subsets of
$f(D) = \Omega$, to prove the first statement of the theorem it is enough to show that
\[ \lim_{t\to\infty} e^t f_t^{-1}(w) = w \]
uniformly on compact sets in C. Theorem 4.1.8 gives
\[ \frac{e^t|z|}{(1+|z|)^2} \le |f_t(z)| \le \frac{e^t|z|}{(1-|z|)^2}. \]
With $z = f_t^{-1}(w)$, this leads to
\[ [1 - |f_t^{-1}(w)|]^2 \le \left|e^t\,\frac{f_t^{-1}(w)}{w}\right| \le [1 + |f_t^{-1}(w)|]^2. \tag{4.4.4} \]
Therefore $|f_t^{-1}(w)| \le 4|w e^{-t}|$, so $f_t^{-1} \to 0$ uniformly on bounded sets. Hence
(4.4.4) implies
\[ \left|e^t\,\frac{f_t^{-1}(w)}{w}\right| \to 1. \tag{4.4.5} \]
It follows that the functions
\[ h_t(w) = e^t\,\frac{f_t^{-1}(w)}{w}, \quad 0 \le t < \infty, \]
are a normal family. Any convergent subsequence has holomorphic limit h with
$|h(w)| = 1 = h(0)$, so, by the strong maximum principle, $h \equiv 1$. Therefore the $h_t$
converge to 1 as $t \to \infty$, uniformly on compact sets. This proves (4.4.2).
Now for $0 \le s < t < \infty$, let
\[ h_{st}(z) = f_t^{-1}(f_s(z)) = e^{s-t}[z + c_2(s,t)z^2 + \dots]. \]
This function maps D conformally onto D minus a Jordan arc $J_{st}$ that extends inward
from a point $\lambda(t) = f_t^{-1}(\sigma(t))$ on $\partial D$. Let $B_{st}$ be the portion of $\partial D$ that maps to $J_{st}$.
By the Carathéodory extension theorem, Theorem 2.6.1, $f_s$ maps $B_{st}$ onto the (two-sided)
slit $\Gamma_s \setminus \Gamma_t$, so $\lambda(s) = f_s^{-1}(\sigma(s))$ is an interior point of the arc $B_{st}$. As $s \uparrow t$
or $t \downarrow s$, $B_{st}$ shrinks to $\lambda(t)$ or to $\lambda(s)$, respectively.
We claim that λ is continuous. The function $h_{st}$ can be continued by reflection
across the complement of $B_{st}$ in $\partial D$. The continuation maps the (full) complement
of $B_{st}$ onto the complement of the union of $J_{st}$ and its reflection $J_{st}^*$. By Koebe's
one-quarter theorem, $J_{st}$ lies outside the disk $D_r(0)$, $r = e^{s-t}/4$. Therefore $J_{st}^*$ lies
in the disk
\[ \{z : |z| < 4e^{t-s}\}. \]
Since $z/h_{st}(z) \to e^{t-s}$ as $z \to 0$, the reflection satisfies
\[ \lim_{z\to\infty}\frac{h_{st}(z)}{z} = e^{t-s}. \]
By the maximum modulus theorem,
\[ \left|\frac{h_{st}(z)}{z}\right| \le 4e^{t-s}, \quad z \notin B_{st}. \]
Letting t decrease to s, a normal families argument shows that $h_{st}(z)/z$ converges to a
function that is holomorphic and bounded on the complement of λ(s) with limit 1 at
∞. Thus
\[ \lim_{t\downarrow s} h_{st}(z) = z, \]
uniformly on compact sets not containing λ(s).


Now given $s \ge 0$ and $\varepsilon > 0$, choose $\delta > 0$ such that if $s < t < s + \delta$, then the
circle $C = \{z : |z - \lambda(s)| = \varepsilon\}$ encloses $B_{st}$. The image $C'$ of C under $h_{st}$
encloses $J_{st} \cup J_{st}^*$, so in particular it encloses λ(t). Since $h_{st}(z) \to z$ uniformly on C
as $t \to s$, it follows that for t sufficiently close to s, the diameter of $C'$ is < 3ε. Thus
for any $z_0 \in C$, as $t \downarrow s$,
\begin{align*}
|\lambda(s) - \lambda(t)| &\le |\lambda(s) - z_0| + |z_0 - h_{st}(z_0)| + |h_{st}(z_0) - \lambda(t)| \\
&\le \varepsilon + \varepsilon + 3\varepsilon.
\end{align*}
This proves continuity from the right, and the same constructions prove continuity
from the left.
Finally, note that $h_{st}(z)/z$ has no zeros, and extends to have value $e^{s-t}$ at z = 0.
Therefore we may choose a branch of the logarithm so that
\[ \Phi(z) = \Phi(z,s,t) = \log\frac{h_{st}(z)}{z}, \qquad \Phi(0) = s - t. \]
Now Φ is holomorphic in D and continuous in the closure. The properties of $h_{st}$
imply that
\[ \operatorname{Re}\Phi(z) = 0 \ \text{ for } |z| = 1,\ z \notin B_{st}; \qquad \operatorname{Re}\Phi(z) < 0 \ \text{ for } z \in B_{st}. \tag{4.4.6} \]

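Therefore the extended Poisson integral formula of Theorem 5.1.6 gives
\[ \Phi(z) = \frac{1}{2\pi}\int_\alpha^\beta \operatorname{Re}\Phi(e^{i\theta})\,\frac{e^{i\theta}+z}{e^{i\theta}-z}\,d\theta, \tag{4.4.7} \]
where $e^{i\alpha}$ and $e^{i\beta}$ are the endpoints of $B_{st}$ with the positive orientation. Then
\[ s - t = \Phi(0) = \frac{1}{2\pi}\int_\alpha^\beta \operatorname{Re}\Phi(e^{i\theta})\,d\theta. \tag{4.4.8} \]
By definition, $h_{st}(g_s(z)) = g_t(z)$. Therefore if we replace z in (4.4.7) by $g_s(z)$ we
get
\[ \log\frac{g_t(z)}{g_s(z)} = \frac{1}{2\pi}\int_\alpha^\beta \operatorname{Re}\Phi(e^{i\theta})\,\frac{e^{i\theta}+g_s(z)}{e^{i\theta}-g_s(z)}\,d\theta. \tag{4.4.9} \]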

As $t \downarrow s$ the interval shrinks. We may apply the mean value theorem to the real part
and the imaginary part of (4.4.9) separately in order to replace the variable $e^{i\theta}$ by
some intermediate values, divide by t − s, and take advantage of (4.4.8) to conclude
that the derivative from the right is
\[ \frac{\partial}{\partial s}\log g_s(z) = -\frac{\lambda(s)+g_s(z)}{\lambda(s)-g_s(z)}, \tag{4.4.10} \]
recalling that $B_{st}$ contracts to λ(s). The same argument applies to the derivative from
the left, taking $s \uparrow t$. Setting $k(t) = 1/\lambda(t)$, we have obtained (4.4.3).
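As a numerical illustration of Theorem 4.4.1, the following sketch integrates Loewner's equation (4.4.3) in the special case of Exercise 12, where k(t) is identically −1, and compares $e^t g_t(z)$ for large t with the Koebe function $z/(1-z)^2$. The function names, the sample point, the step count, and the choice of the classical Runge–Kutta scheme are illustrative assumptions, not part of the text.

```python
import math

def loewner_rhs(g, k):
    """Right-hand side of (4.4.3) for a constant driving value k: -g (1 + k g) / (1 - k g)."""
    return -g * (1 + k * g) / (1 - k * g)

def solve_loewner(z, t_end, k=-1.0, steps=4000):
    """Integrate dg/dt = loewner_rhs(g, k) from g(0) = z with classical RK4 (hypothetical helper)."""
    h = t_end / steps
    g = complex(z)
    for _ in range(steps):
        k1 = loewner_rhs(g, k)
        k2 = loewner_rhs(g + 0.5 * h * k1, k)
        k3 = loewner_rhs(g + 0.5 * h * k2, k)
        k4 = loewner_rhs(g + h * k3, k)
        g += (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return g

def koebe(z):
    """The Koebe function z / (1 - z)^2."""
    return z / (1 - z) ** 2

z = 0.3 + 0.4j                      # sample point in the unit disk
t_end = 20.0                        # "large" t standing in for the limit in (4.4.2)
approx = math.exp(t_end) * solve_loewner(z, t_end)
print(approx, koebe(z))
```

The two printed values should agree to many digits, since $e^t g/(1-g)^2$ is constant along solutions when k = −1.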
Let us look more closely at the family of functions f t .

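Theorem 4.4.2. Let $f_t$, $0 \le t < \infty$, be the normalized conformal map of D onto
$\Omega_t = \mathbb{C}\setminus\Gamma_t$,
\[ f_t(z) = e^t[z + b_2(t)z^2 + b_3(t)z^3 + \dots], \quad 0 \le t < \infty. \tag{4.4.11} \]
Then $f_0(z) = f(z)$, the normalized conformal map with $f(D) = \mathbb{C}\setminus\Gamma$, and
\[ \lim_{t\to\infty}\frac{f_t(z)}{e^t z} = 1, \quad z \in D, \tag{4.4.12} \]
uniformly on compact subsets of D. Moreover
\[ \frac{\partial f_t}{\partial t}(z) = z f_t'(z)\,\frac{1+k(t)\,z}{1-k(t)\,z}, \qquad k(t) = \frac{1}{\lambda(t)}, \tag{4.4.13} \]
and
\[ \operatorname{Re}\left[\frac{\partial f_t(z)/\partial t}{z f_t'(z)}\right] > 0. \tag{4.4.14} \]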

Proof: It is clear that $f_0(z) = f(z)$, since $\Gamma_0 = \Gamma$. The assertion (4.4.12) follows
from (4.4.5) by taking w = f(z).
By definition,
\[ f_t(g_t(w)) = f(w). \]
Differentiating with respect to t gives
\[ f_t'(g_t(w))\,\frac{\partial}{\partial t}\bigl(g_t(w)\bigr) + \frac{\partial f_t}{\partial t}(g_t(w)) = 0. \tag{4.4.15} \]
Using (4.4.3), and replacing $g_t(w)$ by z, converts (4.4.15) to (4.4.13). Then (4.4.14)
follows, since |k| = 1, |z| < 1 implies
\[ \operatorname{Re}\frac{1+kz}{1-kz} = \frac{1-|z|^2}{|1-kz|^2} > 0. \]

Remarks. An important modification of Loewner's equation (4.4.3) is a stochastic
version proposed by Schramm [187], who designated it SLE for stochastic Loewner
evolution. It is now generally called the Schramm–Loewner evolution. Loewner's
equation can be written in an equivalent form
\[ \frac{\partial g_t}{\partial t} = -g_t\,\frac{\zeta(t)+g_t}{\zeta(t)-g_t}, \]
where the “driving function” ζ is a continuous mapping to ∂D. Schramm's version
replaces the deterministic term ζ by a scaled Brownian motion on ∂D. The resulting
equation, denoted $\mathrm{SLE}_\kappa$, is
\[ \frac{\partial g_t}{\partial t} = -g_t\,\frac{\sqrt{\kappa}\,B(t)+g_t}{\sqrt{\kappa}\,B(t)-g_t}. \]
(This is chordal SLE; there are other versions, including radial SLE.)
There are a number of deep mathematical and physical applications. See Lawler
and Limic [129] and Kemppainen [119].

4.5 The Robertson and Milin conjectures

Suppose that f belongs to S, with expansion

f (z) = z + a2 z 2 + · · ·

Let h be the square root transform of f; i.e.
\[ h(z) = [f(z^2)]^{1/2} = z[1 + b_2 z^2 + b_4 z^4 + \cdots]. \]
Comparing coefficients of $z^n$ in the equation
\[ a_1 z + a_2 z^2 + a_3 z^3 + \cdots = z(b_0 + b_2 z + b_4 z^2 + \cdots)^2, \qquad a_1 = b_0 = 1, \]
we find that
\[ a_n = b_0 b_{2n-2} + b_2 b_{2n-4} + \cdots + b_{2n-2} b_0, \qquad n = 1, 2, \cdots. \]
By Schwarz's inequality,
\[ |a_n|^2 \le (|b_0|^2 + |b_2|^2 + \cdots + |b_{2n-2}|^2)^2. \tag{4.5.1} \]
The Robertson conjecture is that the Bieberbach conjecture is true because of this
inequality: i.e. that for n = 1, 2, 3, ...,
\[ |b_0|^2 + |b_2|^2 + \cdots + |b_{2n-2}|^2 \le n, \tag{4.5.2} \]
with equality for some n if and only if f is a Koebe function. Thus

Theorem 4.5.1. If Robertson's conjecture is true, then Bieberbach's conjecture is
true.
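The relation between the coefficients of f and those of its square root transform can be checked on explicit examples. The sketch below computes the $b_{2j}$ from the $a_n$ by a power-series square root, then compares $a_n$ with the convolution of the $b_{2j}$ and evaluates the sum of squares in Robertson's inequality. The helper names and the two sample functions (the Koebe function and $z/(1-z)$, both with real coefficients) are illustrative choices, not taken from the text.

```python
from fractions import Fraction

def sqrt_series(c):
    """Return b with b[0] = 1 and (sum b[j] z^j)^2 = sum c[j] z^j, assuming c[0] == 1."""
    b = [Fraction(1)]
    for n in range(1, len(c)):
        cross = sum(b[j] * b[n - j] for j in range(1, n))
        b.append((Fraction(c[n]) - cross) / 2)
    return b

def robertson_table(a):
    """a = [a_1, ..., a_N] with a_1 = 1. Rows are (n, convolution of the b's, sum of squares)."""
    b = sqrt_series(a)              # b[j] stands for b_{2j}; f(z)/z = (sum_j b_{2j} z^j)^2
    rows = []
    for n in range(1, len(a) + 1):
        conv = sum(b[j] * b[n - 1 - j] for j in range(n))
        squares = sum(x * x for x in b[:n])   # real coefficients, so |b|^2 = b^2
        rows.append((n, conv, squares))
    return rows

koebe = list(range(1, 9))           # a_n = n for z/(1-z)^2
half_plane = [1] * 8                # a_n = 1 for z/(1-z)
for name, a in (("Koebe", koebe), ("z/(1-z)", half_plane)):
    for n, conv, squares in robertson_table(a):
        print(name, n, conv, squares, squares <= n)
```

For the Koebe function every $b_{2j}$ equals 1, so the sum of squares equals n, which is the equality case.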

Robertson's conjecture is itself a consequence of a conjecture of Milin. For this
we need first

Lemma 4.5.2. (Lebedev–Milin) Let

P = p1 x + p2 x 2 + · · ·

be an element of the ring P of formal power series over C, and let
\[ Q = E \circ P = q_0 + q_1 x + q_2 x^2 + \cdots, \tag{4.5.3} \]
where E is the exponential series $\sum_{n=0}^\infty x^n/n!$. Then for n = 0, 1, 2, ...,
\[ |q_0|^2 + |q_1|^2 + \cdots + |q_n|^2 \le (n+1)\exp\left\{\frac{1}{n+1}\sum_{k=1}^n (n+1-k)\left(k|p_k|^2 - \frac{1}{k}\right)\right\}. \tag{4.5.4} \]

Proof: Formal differentiation of (4.5.3) gives


q1 + 2q2 x + 3q3 x 2 + . . .
= (q0 + q1 x + q2 x 2 + · · · )( p1 + 2 p2 x + 3 p3 x 2 + · · · ).

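A comparison of the coefficients yields
\[ n q_n = \sum_{k=0}^{n-1} (n-k)\,p_{n-k}\,q_k, \qquad n = 1, 2, \cdots. \]
By the Schwarz inequality,
\[ n^2|q_n|^2 \le \sum_{k=1}^n k^2|p_k|^2 \sum_{k=0}^{n-1} |q_k|^2. \tag{4.5.5} \]
For n = 1, 2, ..., let
\[ \pi_n = \sum_{k=1}^n k^2|p_k|^2, \qquad \gamma_n = \sum_{k=0}^n |q_k|^2. \]
Then (4.5.5) can be written as
\[ \gamma_n - \gamma_{n-1} \le \frac{1}{n^2}\,\pi_n\gamma_{n-1}. \]
Using $1 + x \le e^x$, we obtain
\begin{align*}
\gamma_n &\le \Bigl(1 + \frac{\pi_n}{n^2}\Bigr)\gamma_{n-1} = \frac{n+1}{n}\Bigl(\frac{n}{n+1} + \frac{\pi_n}{n(n+1)}\Bigr)\gamma_{n-1} \\
&= \frac{n+1}{n}\Bigl(1 + \frac{\pi_n - n}{n(n+1)}\Bigr)\gamma_{n-1} \le \frac{n+1}{n}\exp\Bigl(\frac{\pi_n - n}{n(n+1)}\Bigr)\gamma_{n-1}.
\end{align*}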

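Repeated application of this inequality yields
\begin{align*}
\gamma_n &\le (n+1)\exp\Bigl\{\sum_{k=1}^n \frac{\pi_k - k}{k(k+1)}\Bigr\} \\
&= (n+1)\exp\Bigl\{\sum_{k=1}^n \frac{\pi_k}{k(k+1)} + 1 - \sum_{k=1}^{n+1}\frac{1}{k}\Bigr\}.
\end{align*}
Since
\[ \frac{1}{k(k+1)} = \frac{1}{k} - \frac{1}{k+1}, \]
it follows from summation by parts that
\begin{align*}
\sum_{k=1}^n \frac{\pi_k}{k(k+1)} &= \sum_{k=1}^n \pi_k\Bigl(\frac{1}{k} - \frac{1}{k+1}\Bigr) = \sum_{k=1}^n \frac{1}{k}(\pi_k - \pi_{k-1}) - \frac{\pi_n}{n+1} \\
&= \sum_{k=1}^n k|p_k|^2 - \frac{1}{n+1}\sum_{k=1}^n k^2|p_k|^2.
\end{align*}
Therefore
\begin{align*}
\gamma_n &\le (n+1)\exp\Bigl\{\sum_{k=1}^n \Bigl(1 - \frac{k}{n+1}\Bigr)k|p_k|^2 + 1 - \sum_{k=1}^{n+1}\frac{1}{k}\Bigr\} \\
&= (n+1)\exp\Bigl\{\frac{1}{n+1}\sum_{k=1}^n (n+1-k)\Bigl(k|p_k|^2 - \frac{1}{k}\Bigr)\Bigr\},
\end{align*}
thus proving (4.5.4).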
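The Lebedev–Milin inequality (4.5.4) can be tested numerically. The sketch below builds the coefficients $q_n$ of $\exp(P)$ from the recursion $n q_n = \sum_{k<n}(n-k)p_{n-k}q_k$ used in the proof and returns both sides of (4.5.4). The helper names, the random test data, and the seed are illustrative assumptions; the case $p_k = 1/k$, where $Q = 1/(1-x)$, gives equality.

```python
import math
import random

def exp_series(p, n):
    """q_0..q_n for Q = exp(P), where P = p[1] x + p[2] x^2 + ... (p[0] is ignored)."""
    q = [1.0 + 0j]
    for m in range(1, n + 1):
        q.append(sum((m - k) * p[m - k] * q[k] for k in range(m)) / m)
    return q

def lebedev_milin_sides(p, n):
    """Return (left side, right side) of (4.5.4) for the given p and n."""
    q = exp_series(p, n)
    lhs = sum(abs(x) ** 2 for x in q)
    expo = sum((n + 1 - k) * (k * abs(p[k]) ** 2 - 1.0 / k)
               for k in range(1, n + 1)) / (n + 1)
    return lhs, (n + 1) * math.exp(expo)

n = 12
equality_case = [0.0] + [1.0 / k for k in range(1, n + 1)]   # P = -log(1 - x)
print(lebedev_milin_sides(equality_case, n))

rng = random.Random(0)                                       # fixed seed, illustrative data
random_case = [0.0] + [complex(rng.gauss(0, 1), rng.gauss(0, 1)) / k
                       for k in range(1, n + 1)]
print(lebedev_milin_sides(random_case, n))
```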


Now let h be any odd function in S,
\[ h(z) = z + b_2 z^3 + b_4 z^5 + \cdots, \]
and let $f(z) = [h(\sqrt{z})]^2$. Then f belongs to S: see Exercise 5. Using (4.5.4) we
convert Robertson's inequality (4.5.2) into an inequality for the coefficients $c_k$ in
\[ \log\frac{f(z)}{z} = \sum_{k=1}^\infty c_k z^k. \tag{4.5.6} \]
Clearly this has the form
\[ \log\frac{f(z)}{z} = \log\frac{[h(\sqrt{z})]^2}{z} = 2\log(1 + b_2 z + b_4 z^2 + \cdots). \]
Therefore
\[ 1 + b_2 z + b_4 z^2 + \cdots = \exp\Bigl\{\frac{1}{2}\sum_{k=1}^\infty c_k z^k\Bigr\}. \]
Using Lemma 4.5.2 with $q_k = b_{2k}$ and $p_k = \frac{1}{2}c_k$, we obtain
\[ 1 + |b_2|^2 + \cdots + |b_{2n}|^2 \le (n+1)\exp\Bigl\{\frac{1}{4(n+1)}\sum_{k=1}^n (n+1-k)\Bigl(k|c_k|^2 - \frac{4}{k}\Bigr)\Bigr\}. \]
If the exponent is nonpositive, then the above exponential is ≤ 1. This, together with
Theorem 4.5.1 gives the following result.

Theorem 4.5.3. If, for each f in S, the coefficients $c_k$ defined by (4.5.6) satisfy
\[ \sum_{k=1}^n (n+1-k)\Bigl(k|c_k|^2 - \frac{4}{k}\Bigr) \le 0, \quad n = 1, 2, 3, \ldots, \tag{4.5.7} \]

then the Bieberbach conjecture is true.

Milin conjectured in 1971 that the inequality (4.5.7) actually holds.

4.6 Preparation for the proof of de Branges’s theorem

In this section we adapt the constructions in Section 4.4 to obtain the specific results
that are the basis from which Weinstein’s proof of de Branges’s theorem proceeds.
Here we start with a function that is not a slit mapping and approximate it by slit
mappings. For convenience we repeat some steps of the arguments in Section 4.4.
Suppose that f : D → C belongs to S, and
\[ f(z) = z + \sum_{n=2}^\infty a_n z^n. \tag{4.6.1} \]
Given 0 < r < 1, the function
\[ f_r(z) = \frac{f(rz)}{r} = z + \sum_{n=2}^\infty (r^{n-1}a_n)z^n, \]
restricted to D, belongs to S and has a holomorphic extension to $D_{1/r}(0)$. The coefficients
$a_n(r) = r^{n-1}a_n$ converge to $a_n$ as $r \to 1$, so for our purposes we may replace f by $f_r$
and assume that f extends smoothly to ∂D.

Theorem 4.6.1. Suppose that f ∈ S extends smoothly to the boundary ∂D. Then
there is a family $\{f_t\}_{t\ge 0}$ ⊂ S with the properties
(a) $f_0(z) = f(z)$;
(b) $f_t(z) = e^t z + \sum_{n=2}^\infty a_n(t)z^n$;
(c) $\log\dfrac{f_t(z)}{e^t z} = \sum_{k=1}^\infty c_k(t)z^k$, $c_k(\infty) = \dfrac{2}{k}$;
(d) $\operatorname{Re}\left[\dfrac{\partial f_t(z)/\partial t}{z f_t'(z)}\right] > 0$.
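Proof. Let Ω = f(D). The boundary curve ∂Ω = f(∂D) encloses $D_{1/4}(0)$ and meets
the half-line (−∞, −1/4]. Let
\[ s_0 = \sup\{s > 0 : -s \in f(\partial D)\}. \]
We may choose a parametrization σ(τ) of ∂Ω, $0 \le \tau \le s_0$, with $\sigma(0) = \sigma(s_0) = -s_0$.
We then parametrize the curve
\[ \Gamma = \partial\Omega \cup (-\infty, -s_0] \tag{4.6.2} \]
by
\[ \Gamma(\tau) = \begin{cases} \sigma(\tau), & 0 \le \tau \le s_0; \\ -\tau, & s_0 < \tau < \infty. \end{cases} \]
Now define a family of slit domains for s > 0 by
\[ \Omega_s = \mathbb{C}\setminus\Gamma_s, \qquad \Gamma_s = \{\Gamma(\tau) : \tau \ge s\}; \]
see Figure 4.3. Then $\Omega_s$ is simply connected, and the $\Omega_s$ converge to $\Omega_0 = \Omega$ in the
sense of Carathéodory as s → 0. Note that
\[ \Omega_t \subset \Omega_s \quad \text{if } t < s, \]
and
\[ \Omega_s = \mathbb{C}\setminus(-\infty, -s], \quad \text{if } s \ge s_0. \tag{4.6.3} \]

[Figure 4.3: the point Γ(s) on the boundary curve, with −s_0, −1/4, and 0 marked on the real axis.]
Fig. 4.3 The approximating curve $\Gamma_s$, s close to 0.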

Let $\tilde f_t$ be the conformal map of D onto $\Omega_t$ that satisfies $\tilde f_t(0) = 0$, $\tilde f_t'(0) > 0$. For
t < s,
\[ \tilde f_t(D) = \Omega_t \subset \Omega_s = \tilde f_s(D), \]
so $\tilde f_s^{-1}\circ\tilde f_t : D \to D$ is well defined. By Schwarz's lemma, it follows that
\[ (\tilde f_s^{-1}\circ\tilde f_t)'(0) = (\tilde f_s^{-1})'(\tilde f_t(0))\,\tilde f_t'(0) < 1, \]
so $\tilde f_t'(0) < \tilde f_s'(0)$. Thus $\tilde f_t'(0)$ is strictly increasing. Moreover (4.6.3) implies that
\[ \tilde f_s(z) = \frac{4sz}{(1-z)^2}, \quad \text{for } s \ge s_0; \tag{4.6.4} \]
see Exercise 15. Therefore $\tilde f_s'(0) = 4s$ for $s \ge s_0$. Note that $\tilde f_0 = f$, so $\tilde f_0'(0) = 1$.
It follows from these considerations that we may reparametrize by taking
\[ f_t = \tilde f_s, \qquad t = \log\tilde f_s'(0), \qquad 0 \le t < \infty. \tag{4.6.5} \]
This produces an expansion in the form (b).


Since $f_t$ vanishes only at 0 ∈ D, we may choose the branch of the logarithm so
that $\log(f_t(z)/e^t z)$ is holomorphic in D and equals 0 at the origin. Therefore $f_t$ has
an expansion of the form in (c). Let t and s be related as in (4.6.5), and set $t = t_0$ if
$s = s_0$. For $t \ge t_0$ it follows from (4.6.4) that t = log 4s and that
\[ \frac{f_t(z)}{e^t z} = \frac{1}{(1-z)^2}, \quad \text{for } t \ge t_0. \]
Therefore
\[ \log\frac{f_t(z)}{e^t z} = -2\log(1-z) = \sum_{k=1}^\infty \frac{2}{k}z^k, \quad t \ge t_0. \]
This proves part (c). Part (d) is contained in (4.4.14).


Another important ingredient involves the Legendre polynomials $\{P_n\}_0^\infty$. These
can be defined by the generating function
\[ G(x,s) = \sum_{n=0}^\infty P_n(x)\,s^n = \frac{1}{(1-2xs+s^2)^{1/2}}, \quad |x| < 1. \tag{4.6.6} \]
Expanding the right side gives
\[ G(x,s) = \sum_{n=0}^\infty \frac{1\cdot 3\cdots(2n-1)}{2^n\,n!}\,(2xs-s^2)^n. \tag{4.6.7} \]
Collecting coefficients of $s^n$ shows that $P_n(x)$ is a polynomial of degree n.
The representation (4.6.6) can be used to show that
\[ \frac{d}{dx}\Bigl[(1-x^2)\frac{d}{dx}P_n(x)\Bigr] = -(n+1)n\,P_n(x); \tag{4.6.8} \]
see Exercise 18. In other words, the $P_n$ are eigenfunctions of the operator
\[ L = \frac{d}{dx}(1-x^2)\frac{d}{dx} \]
acting on functions in $L^2(I)$, I = (−1, 1).
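The generating function (4.6.6) is easy to check numerically. The sketch below computes $P_0, \dots, P_N$ from the three-term recursion of Exercise 17 and compares the partial sum of $\sum P_n(x)s^n$ with the closed form. The function name, the sample values of x and s, and the truncation order are illustrative assumptions.

```python
def legendre(n_max, x):
    """P_0(x), ..., P_{n_max}(x) via (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}."""
    P = [1.0, x]
    for n in range(1, n_max):
        P.append(((2 * n + 1) * x * P[n] - n * P[n - 1]) / (n + 1))
    return P[: n_max + 1]

x, s = 0.3, 0.4                     # sample point with |x| < 1 and |s| < 1
P = legendre(60, x)
series = sum(P[n] * s ** n for n in range(61))
closed = (1 - 2 * x * s + s * s) ** -0.5
print(series, closed)
print(legendre(4, 1.0), legendre(4, -1.0))   # endpoint values, compare Exercise 16
```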


Beyond the equation (4.6.8), we need Legendre's addition formula
\begin{align*}
&P_n(\cos\varphi\,\sin\theta\,\sin\theta' + \cos\theta\,\cos\theta') \tag{4.6.9} \\
&\quad = P_n(\cos\theta)P_n(\cos\theta') + 2\sum_{k=1}^n \frac{(n-k)!}{(n+k)!}\cos(k\varphi)\,P_n^k(\cos\theta)\,P_n^k(\cos\theta').
\end{align*}
The functions $P_n^k$ that occur in (4.6.9) are known as the associated Legendre functions.
They are closely related to the derivatives of the Legendre polynomials:
\[ P_n^k(x) = (-1)^k(1-x^2)^{k/2}P_n^{(k)}(x). \]
Here we give a brief outline of the discussion in [21], which contains the details of
the proof of the addition formula (4.6.9).
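In spherical coordinates
\[ (x, y, z) = (r\cos\varphi\sin\theta,\ r\sin\varphi\sin\theta,\ r\cos\theta) \tag{4.6.10} \]
the Laplacian in $\mathbb{R}^3$ is
\begin{align*}
\Delta &= \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \\
&= \frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r} + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\varphi^2} + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\Bigl(\sin\theta\,\frac{\partial}{\partial\theta}\Bigr).
\end{align*}
With r = 1, the last two terms constitute the Laplacian $L_S$ on the unit
sphere in $\mathbb{R}^3$ with respect to the coordinates φ, θ. Solutions of $L_S Y = -(n+1)n\,Y$ are known
as spherical harmonics. Separation of variables, i.e. looking for solutions having the
form $Y(\theta,\varphi) = \Phi(\varphi)\Theta(\theta)$, leads one to choose $\Phi(\varphi) = e^{im\varphi}$, m ∈ Z. Then the
equation for Θ becomes
\[ \frac{1}{\sin\theta}\frac{d}{d\theta}\Bigl(\sin\theta\,\frac{d\Theta}{d\theta}\Bigr) + \Bigl[(n+1)n - \frac{m^2}{\sin^2\theta}\Bigr]\Theta = 0. \tag{4.6.11} \]
Letting x = cos θ converts (4.6.11) to the spherical harmonic equation
\[ \frac{d}{dx}\Bigl[(1-x^2)\frac{du}{dx}\Bigr] + \Bigl[(n+1)n - \frac{m^2}{1-x^2}\Bigr]u(x) = 0. \]
For m = 0 the solution is a multiple of $P_n$, and in general the solution $\Theta_{nm}$ is a
multiple of $P_n^m$.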
With suitable normalization constants $\{c_{nm}\}$, the resulting functions
\[ Y_{nm} = c_{nm}\,e^{im\varphi}\,P_n^m(\cos\theta), \quad |m| \le n, \ n = 0, 1, 2, \ldots, \]
are an orthonormal basis for the $L^2$ space of the sphere, consisting of eigenfunctions
of $L_S$. In particular, the function on the left in (4.6.9) can be expanded with respect
to the $\{Y_{nm}\}$, and the right side of (4.6.9) is the expansion.
In the preceding discussion we used the notation of [21]. The version used (implicitly)
by Weinstein [213] and expounded in the next section uses the version with θ
and φ interchanged, and also sets two of the variables equal. The result is the identity
\[ P_n(\cos^2\varphi + \sin^2\varphi\cos\theta) = [P_n(\cos\varphi)]^2 + 2\sum_{m=1}^n \frac{(n-m)!}{(n+m)!}\cos(m\theta)\,[P_n^m(\cos\varphi)]^2. \tag{4.6.12} \]

4.7 Proof of de Branges’s Theorem

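Let $\alpha_0 = \beta_0 = 0$ and $\alpha_k = \frac{4}{k} - k|c_k(0)|^2$, $\beta_k = k$, k = 1, 2, .... The convolution
\[ \gamma_n = \sum_{k=0}^n \alpha_k\beta_{n-k} \]
from the product
\[ \Bigl(\sum_{k=1}^\infty \alpha_k z^k\Bigr)\Bigl(\sum_{k=1}^\infty \beta_k z^k\Bigr) = \sum_{n=2}^\infty \gamma_n z^n \]
suggests that the finite sum in the Milin conjecture is just the coefficient of $z^{n+1}$ in
the product of the two series
\[ \sum_{k=1}^\infty \Bigl(\frac{4}{k} - k|c_k(0)|^2\Bigr)z^k, \qquad \sum_{k=1}^\infty k z^k = \frac{z}{(1-z)^2}. \]
Indeed, we have
\[ \sum_{n=1}^\infty \Bigl[\sum_{k=1}^n \Bigl(\frac{4}{k} - k|c_k(0)|^2\Bigr)(n-k+1)\Bigr]z^{n+1} = \frac{z}{(1-z)^2}\sum_{k=1}^\infty \Bigl(\frac{4}{k} - k|c_k(0)|^2\Bigr)z^k. \tag{4.7.1} \]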

This is the first step in Weinstein's argument. Let us denote the left-hand side of
(4.7.1) by Φ(z):
\[ \Phi(z) = \sum_{n=1}^\infty \Bigl[\sum_{k=1}^n \Bigl(\frac{4}{k} - k|c_k(0)|^2\Bigr)(n-k+1)\Bigr]z^{n+1}. \tag{4.7.2} \]
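The identity (4.7.1) is a statement about Cauchy products and can be checked with exact arithmetic. The sketch below compares the Milin sums with the coefficients of the product of $\sum k z^k$ and $\sum \alpha_k z^k$. The helper names and the sample data are illustrative: the values $c_k(0) = 1/k$ correspond to $f(z) = z/(1-z)$, for which $\log(f(z)/z) = -\log(1-z)$, and $c_k(0) = 2/k$ to the Koebe function.

```python
from fractions import Fraction

def milin_sums(alpha, N):
    """sum_{k=1}^n alpha_k (n - k + 1) for n = 1..N; alpha[0] is unused."""
    return [sum(alpha[k] * (n - k + 1) for k in range(1, n + 1)) for n in range(1, N + 1)]

def product_coeffs(alpha, N):
    """Coefficients of z^2, ..., z^(N+1) in (sum_{k>=1} k z^k)(sum_{k>=1} alpha_k z^k)."""
    return [sum(j * alpha[m - j] for j in range(1, m)) for m in range(2, N + 2)]

N = 10
# alpha_k = 4/k - k |c_k(0)|^2 with c_k(0) = 1/k, i.e. f(z) = z/(1-z)
alpha = [Fraction(0)] + [Fraction(4, k) - k * Fraction(1, k) ** 2 for k in range(1, N + 1)]
print(milin_sums(alpha, N))
print(product_coeffs(alpha, N))

# Koebe function: c_k(0) = 2/k, so every alpha_k vanishes (the equality case)
alpha_koebe = [Fraction(0)] + [Fraction(4, k) - k * Fraction(2, k) ** 2 for k in range(1, N + 1)]
print(milin_sums(alpha_koebe, N))
```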

The next step is to show that Φ(z) can be expressed as
\[ \Phi(z) = \sum_{n=1}^\infty \Bigl[\int_0^\infty h_n(t)\,dt\Bigr]z^{n+1}, \tag{4.7.3} \]
where $h_n(t) \ge 0$ for all $t \ge 0$ and n = 1, 2, ....


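In the following, we keep z fixed and define $w = w_t(z)$ by
\[ \frac{z}{(1-z)^2} = \frac{e^t w}{(1-w)^2}, \quad t \ge 0, \tag{4.7.4} \]
so that $w_0(z) = z$. Recall from Theorem 4.6.1 (c) that $|c_k(\infty)| = 2/k$. Hence
\begin{align*}
\int_0^\infty \frac{d}{dt}\Bigl[\sum_{k=1}^\infty \Bigl(\frac{4}{k} - k|c_k(t)|^2\Bigr)w_t^k\Bigr]dt
&= \Bigl[\sum_{k=1}^\infty \Bigl(\frac{4}{k} - k|c_k(t)|^2\Bigr)w_t^k\Bigr]_0^\infty \\
&= \sum_{k=1}^\infty \Bigl(\frac{4}{k} - k|c_k(\infty)|^2\Bigr)w_\infty^k - \sum_{k=1}^\infty \Bigl(\frac{4}{k} - k|c_k(0)|^2\Bigr)w_0^k.
\end{align*}
Since $|w_\infty| < \infty$ and $w_0 = z$, the identities (4.7.1) and (4.7.2) imply that
\[ \Phi(z) = -\int_0^\infty \frac{z}{(1-z)^2}\,\frac{d}{dt}\Bigl[\sum_{k=1}^\infty \Bigl(\frac{4}{k} - k|c_k(t)|^2\Bigr)w^k\Bigr]dt. \tag{4.7.5} \]
It follows from (4.7.4) that
\[ \frac{dw}{dt} = -w\,\frac{1-w}{1+w}, \tag{4.7.6} \]
so
\[ \Phi(z) = \int_0^\infty \frac{e^t w}{1-w^2}\,\frac{1+w}{1-w}\Bigl[\sum_{k=1}^\infty k\,[c_k(t)\overline{c_k(t)}]'\,w^k + \sum_{k=1}^\infty (4 - k^2|c_k(t)|^2)\,w^k\,\frac{1-w}{1+w}\Bigr]dt. \tag{4.7.7} \]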
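Write $z = re^{i\theta}$. From Theorem 4.6.1 (c), we have
\[ \frac{\partial f_t(z)/\partial t}{f_t(z)} = 1 + \sum_{k=1}^\infty c_k'(t)\,r^k e^{ik\theta}, \]
which is a Fourier expansion. By (4.1.5) the coefficients are given by
\[ r^k c_k'(t) = \frac{1}{2\pi}\int_0^{2\pi} \frac{\partial f_t(z)/\partial t}{f_t(z)}\,e^{-ik\theta}\,d\theta, \]
which in turn gives
\[ r^{2k} c_k'(t) = \frac{1}{2\pi}\int_0^{2\pi} \frac{\partial f_t(z)/\partial t}{f_t(z)}\,\bar z^k\,d\theta. \]
Taking the limit r → 1 gives
\[ c_k'(t) = \lim_{r\to 1}\frac{1}{2\pi}\int_0^{2\pi} \frac{\partial f_t(z)/\partial t}{f_t(z)}\,\bar z^k\,d\theta \]
and
\[ k\,c_k'(t)\,\overline{c_k(t)} = \lim_{r\to 1}\frac{1}{2\pi}\int_0^{2\pi} \frac{\partial f_t(z)/\partial t}{f_t(z)}\,k\,\overline{c_k(t)z^k}\,d\theta. \]
Similarly, we have
\[ k\,c_k(t)\,\overline{c_k'(t)} = \lim_{r\to 1}\frac{1}{2\pi}\int_0^{2\pi} \overline{\Bigl(\frac{\partial f_t(z)/\partial t}{f_t(z)}\Bigr)}\,k\,c_k(t)z^k\,d\theta. \]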

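With these representations, equation (4.7.7) becomes
\begin{align*}
\Phi(z) = \int_0^\infty \frac{e^t w}{1-w^2}\Biggl\{
&\frac{1+w}{1-w}\Biggl[1 + \sum_{k=1}^\infty \Bigl(\lim_{r\to 1}\frac{1}{2\pi}\int_0^{2\pi} \frac{\partial f_t(z)/\partial t}{f_t(z)}\,k\,\overline{c_k(t)z^k}\,d\theta\Bigr)w^k\Biggr] \\
&+ \frac{1+w}{1-w}\Biggl[1 + \sum_{k=1}^\infty \Bigl(\lim_{r\to 1}\frac{1}{2\pi}\int_0^{2\pi} \overline{\Bigl(\frac{\partial f_t(z)/\partial t}{f_t(z)}\Bigr)}\,k\,c_k(t)z^k\,d\theta\Bigr)w^k\Biggr] \\
&- 2\,\frac{1+w}{1-w} + \frac{4w}{1-w} - \sum_{k=1}^\infty k^2|c_k(t)|^2 w^k\Biggr\}\,dt. \tag{4.7.8}
\end{align*}
Denote the first term inside the curly brackets in (4.7.8) by A. Then
\[ A = \frac{1+w}{1-w}\Bigl[1 + \sum_{k=1}^\infty d_k w^k\Bigr], \qquad d_k = \lim_{r\to 1}\frac{1}{2\pi}\int_0^{2\pi} \frac{\partial f_t(z)/\partial t}{f_t(z)}\,k\,\overline{c_k(t)z^k}\,d\theta. \]
Simple calculation yields
\[ A = 1 + \sum_{k=1}^\infty [2(1 + d_1 + \cdots + d_k) - d_k]\,w^k. \]
Similarly, we put
\[ B = \frac{1+w}{1-w}\Bigl[1 + \sum_{k=1}^\infty e_k w^k\Bigr], \qquad e_k = \lim_{r\to 1}\frac{1}{2\pi}\int_0^{2\pi} \overline{\Bigl(\frac{\partial f_t(z)/\partial t}{f_t(z)}\Bigr)}\,k\,c_k(t)z^k\,d\theta, \]
and
\[ B = 1 + \sum_{k=1}^\infty [2(1 + e_1 + \cdots + e_k) - e_k]\,w^k. \]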

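It follows from (4.7.8) that
\begin{align*}
\Phi(z) = \int_0^\infty \frac{e^t w}{1-w^2}\Biggl\{
&1 + \sum_{k=1}^\infty \lim_{r\to 1}\frac{1}{2\pi}\int_0^{2\pi} \frac{\partial f_t(z)/\partial t}{f_t(z)}\,\overline{[2(1 + \cdots + kc_k(t)z^k) - kc_k(t)z^k]}\,d\theta\;w^k \\
&+ 1 + \sum_{k=1}^\infty \lim_{r\to 1}\frac{1}{2\pi}\int_0^{2\pi} \overline{\Bigl(\frac{\partial f_t(z)/\partial t}{f_t(z)}\Bigr)}\,[2(1 + \cdots + kc_k(t)z^k) - kc_k(t)z^k]\,d\theta\;w^k \\
&- 2 - \sum_{k=1}^\infty k^2|c_k(t)|^2 w^k\Biggr\}\,dt. \tag{4.7.9}
\end{align*}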

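To proceed further, we differentiate the equation in Theorem 4.6.1 (c) with respect
to z, and obtain
\[ z\,\frac{\partial f_t(z)/\partial z}{f_t(z)} = 1 + \sum_{k=1}^\infty k\,c_k(t)z^k, \]
so that (4.7.9) can be written as
\begin{align*}
\Phi(z) = \int_0^\infty \frac{e^t w}{1-w^2}\Biggl\{
&\sum_{k=1}^\infty \lim_{r\to 1}\frac{1}{2\pi}\int_0^{2\pi} \Bigl[\frac{\partial f_t(z)/\partial t}{f_t(z)}\Big/ z\,\frac{\partial f_t(z)/\partial z}{f_t(z)}\Bigr]\Bigl(1 + \sum_{l=1}^\infty l c_l(t)z^l\Bigr) \\
&\qquad\times \overline{[2(1 + \cdots + kc_k(t)z^k) - kc_k(t)z^k]}\,d\theta\;w^k \\
&+ \sum_{k=1}^\infty \lim_{r\to 1}\frac{1}{2\pi}\int_0^{2\pi} \overline{\Bigl[\frac{\partial f_t(z)/\partial t}{f_t(z)}\Big/ z\,\frac{\partial f_t(z)/\partial z}{f_t(z)}\Bigr]\Bigl(1 + \sum_{l=1}^\infty l c_l(t)z^l\Bigr)} \\
&\qquad\times [2(1 + \cdots + kc_k(t)z^k) - kc_k(t)z^k]\,d\theta\;w^k \\
&- \sum_{k=1}^\infty k^2|c_k(t)|^2 w^k\Biggr\}\,dt. \tag{4.7.10}
\end{align*}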

To simplify (4.7.10), we put
\[ F(z,t) = \frac{\partial f_t(z)/\partial t}{z\,\partial f_t(z)/\partial z}. \]
From Theorem 4.6.1 (b), we have
\[ \frac{\partial f_t(z)}{\partial t} = e^t z + \sum_{k=2}^\infty a_k'(t)z^k. \]
Then
\[ F(z,t) = \frac{e^t + \sum_{k=2}^\infty a_k'(t)z^{k-1}}{\partial f_t(z)/\partial z}. \]

Since $\partial f_t(z)/\partial z$ is holomorphic and nonvanishing in the unit disk, the function
F(z, t) is holomorphic. Furthermore, since $\partial f_t(z)/\partial z = e^t + \cdots$, we have F(0, t) =
1. (This fact will be used later.) The first inner integral in (4.7.10) is equal to
\begin{align*}
&\frac{1}{2\pi}\int_0^{2\pi} F(z,t)\Bigl[1 + \sum_{l=1}^\infty l\,c_l(t)z^l\Bigr]\overline{[2(1 + \cdots + kc_k(t)z^k) - kc_k(t)z^k]}\,d\theta \\
&= \frac{1}{2\pi}\int_0^{2\pi} F(z,t)\Bigl[1 + \cdots + k\,c_k(t)z^k - \tfrac{1}{2}k\,c_k(t)z^k + \tfrac{1}{2}k\,c_k(t)z^k + \sum_{l=k+1}^\infty l\,c_l(t)z^l\Bigr] \\
&\qquad\times \overline{[2(1 + \cdots + kc_k(t)z^k) - kc_k(t)z^k]}\,d\theta \\
&= \frac{1}{2\pi}\int_0^{2\pi} F(z,t)\,\tfrac{1}{2}\,\bigl|2(1 + \cdots + kc_k(t)z^k) - kc_k(t)z^k\bigr|^2\,d\theta \\
&\quad + \frac{1}{2\pi}\int_0^{2\pi} F(z,t)\,\tfrac{1}{2}\,kc_k(t)z^k\,\overline{[2(1 + \cdots + kc_k(t)z^k) - kc_k(t)z^k]}\,d\theta \\
&\quad + \frac{1}{2\pi}\int_0^{2\pi} F(z,t)\Bigl(\sum_{l=k+1}^\infty l\,c_l(t)z^l\Bigr)\overline{[2(1 + \cdots + kc_k(t)z^k) - kc_k(t)z^k]}\,d\theta. \tag{4.7.11}
\end{align*}
There are now three terms on the right-hand side of equation (4.7.11). In the second
term, for m ≤ k − 1, since $z = re^{i\theta}$ we have
\[ \frac{1}{2\pi}\int_0^{2\pi} F(z,t)\,z^k\bar z^m\,d\theta = \frac{r^{2m}}{2\pi i}\int_{|z|=r} F(z,t)\,z^{k-m-1}\,dz = 0 \]
by Cauchy's theorem. Similarly, for m = k we have
\[ \frac{1}{2\pi}\int_0^{2\pi} F(z,t)\,z^k\bar z^k\,d\theta = \frac{r^{2k}}{2\pi i}\int_{|z|=r} \frac{F(z,t)}{z}\,dz = r^{2k}F(0,t) = r^{2k} \]
since, as noted above, F(0, t) = 1. Taking the limit r → 1 in (4.7.10) shows that the
second term on the right-hand side of (4.7.11) is $\frac{1}{2}k^2|c_k(t)|^2$. Similarly the third term
in (4.7.11) is zero.
Summarizing, the first inner integral in (4.7.10) is equal to the sum of the first
term in (4.7.11) and the contribution $\frac{1}{2}k^2|c_k(t)|^2$ from the second term. The second
inner integral in (4.7.10) is the complex conjugate of the first inner integral, so it
also contributes $\frac{1}{2}k^2|c_k(t)|^2$. Summing with respect to k cancels the last series
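in (4.7.10). Thus (4.7.10) becomes
\begin{align*}
\Phi(z) = \int_0^\infty \frac{e^t w}{1-w^2}\sum_{k=1}^\infty \lim_{r\to 1}\frac{1}{2\pi}\int_0^{2\pi} &\operatorname{Re}\Bigl[\frac{\partial f_t(z)}{\partial t}\Big/ z\,\frac{\partial f_t(z)}{\partial z}\Bigr] \\
&\times \bigl|2(1 + \cdots + kc_k(t)z^k) - k\,c_k(t)z^k\bigr|^2\,d\theta\;w^k\,dt. \tag{4.7.12}
\end{align*}
Denote the term being summed in (4.7.12) by $A_k(t)$. Then (4.7.12) becomes
\[ \Phi(z) = \int_0^\infty \frac{e^t w}{1-w^2}\Bigl[\sum_{k=1}^\infty A_k(t)\,w^k\Bigr]dt. \tag{4.7.13} \]
By Theorem 4.6.1 (d), we have Re F ≥ 0, from which it follows that
\[ A_k(t) \ge 0 \quad \text{for } t \ge 0, \ k = 1, 2, \cdots. \tag{4.7.14} \]
If we show that
\[ \frac{e^t w^{k+1}}{1-w^2} = \sum_{n=0}^\infty \Lambda_k^n(t)\,z^{n+1} \tag{4.7.15} \]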
with $\Lambda_k^n(t) \ge 0$ for $t \ge 0$, then we have proved the Milin conjecture (4.5.7). Indeed,
from (4.7.13) and (4.7.15), we have
\[ \Phi(z) = \sum_{n=0}^\infty \Bigl[\int_0^\infty \sum_{k=1}^\infty A_k(t)\Lambda_k^n(t)\,dt\Bigr]z^{n+1}. \tag{4.7.16} \]
The function $h_n(t)$ in (4.7.3) is explicitly given by
\[ h_n(t) = \sum_{k=1}^\infty A_k(t)\Lambda_k^n(t). \tag{4.7.17} \]
If $\Lambda_k^n(t) \ge 0$, then it follows from (4.7.2) and (4.7.16) that
\[ \sum_{k=1}^n \Bigl(\frac{4}{k} - k|c_k(0)|^2\Bigr)(n-k+1) = \int_0^\infty h_n(t)\,dt \ge 0. \]

To show that $\Lambda_k^n(t) \ge 0$ for $t \ge 0$, we first establish the equation
\[ \frac{z}{1 - 2z(\cos^2\varphi + \sin^2\varphi\cos\theta) + z^2} = \frac{e^t w}{1-w^2} + 2\sum_{k=1}^\infty \frac{e^t w^{k+1}}{1-w^2}\cos k\theta, \tag{4.7.18} \]
where $\sin\varphi = e^{-t/2}$ and $z/(1-z)^2 = e^t w/(1-w)^2$. Note that the right-hand side
of this equation is a Fourier cosine series. Thus, we may write it as
\[ \frac{z}{1 - 2z(\cos^2\varphi + \sin^2\varphi\cos\theta) + z^2} = \frac{a_0}{2} + \sum_{k=1}^\infty a_k\cos k\theta. \]
From (4.1.5) again, we see that the coefficient $a_k$ has the integral representation
\[ a_k = \frac{2}{\pi}\int_0^\pi \frac{z\cos k\theta}{1 - 2z(\cos^2\varphi + \sin^2\varphi\cos\theta) + z^2}\,d\theta. \tag{4.7.19} \]
We first consider the special case t = 0; i.e. when $\sin^2\varphi = 1$, $\cos^2\varphi = 0$ and w = z
(see (4.7.4)). In this case
\[ \frac{2}{\pi}\int_0^\pi \frac{w\cos k\theta}{1 - 2w\cos\theta + w^2}\,d\theta = \frac{2w^{k+1}}{1-w^2}; \tag{4.7.20} \]
see Exercise 19.
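The integral (4.7.20) can be checked numerically for real w with |w| < 1. The sketch below applies the midpoint rule, which converges very quickly here because the integrand extends to a smooth even periodic function. The function names, the restriction to real w, and the number of nodes are illustrative assumptions.

```python
import math

def lhs_4720(w, k, n=2000):
    """Midpoint-rule value of (2/pi) * int_0^pi w cos(k t) / (1 - 2 w cos t + w^2) dt, real |w| < 1."""
    h = math.pi / n
    total = 0.0
    for j in range(n):
        th = (j + 0.5) * h
        total += w * math.cos(k * th) / (1 - 2 * w * math.cos(th) + w * w)
    return (2 / math.pi) * total * h

def rhs_4720(w, k):
    """Closed form 2 w^(k+1) / (1 - w^2) from (4.7.20)."""
    return 2 * w ** (k + 1) / (1 - w * w)

for w, k in ((0.5, 3), (-0.3, 2), (0.8, 5)):
    print(w, k, lhs_4720(w, k), rhs_4720(w, k))
```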


For the general case, we just need the identities
\begin{align*}
\frac{z\cos k\theta}{1 - 2z(\cos^2\varphi + \sin^2\varphi\cos\theta) + z^2}
&= \frac{\cos k\theta}{(1-z)^2/z + 2\sin^2\varphi\,(1-\cos\theta)} \\
&= \frac{\cos k\theta}{e^{-t}(1-w)^2/w + 2e^{-t}(1-\cos\theta)} = \frac{e^t w\cos k\theta}{1 - 2w\cos\theta + w^2}.
\end{align*}
Substituting this into (4.7.19), we obtain from (4.7.20)
\[ a_k = \frac{2e^t w^{k+1}}{1-w^2}, \]
proving (4.7.18).
Inserting (4.7.15) into (4.7.18) gives
\[ \frac{z}{1 - 2z(\cos^2\varphi + \sin^2\varphi\cos\theta) + z^2} = \sum_{n=0}^\infty \Lambda_0^n(t)z^{n+1} + 2\sum_{k=1}^\infty \sum_{n=0}^\infty \Lambda_k^n(t)z^{n+1}\cos k\theta. \tag{4.7.21} \]
Here we use the generating function for the Legendre polynomials (4.6.6):
\[ \frac{1}{\sqrt{1 - 2z(\cos^2\varphi + \sin^2\varphi\cos\theta) + z^2}} = \sum_{n=0}^\infty P_n(\cos^2\varphi + \sin^2\varphi\cos\theta)\,z^n \tag{4.7.22} \]
and the addition formula (4.6.12):
\[ P_n(\cos^2\varphi + \sin^2\varphi\cos\theta) = [P_n(\cos\varphi)]^2 + 2\sum_{k=1}^n \frac{(n-k)!}{(n+k)!}\,[P_n^k(\cos\varphi)]^2\cos k\theta. \tag{4.7.23} \]
Applying (4.7.23) to (4.7.22) gives
\begin{align*}
&\frac{1}{\sqrt{1 - 2z(\cos^2\varphi + \sin^2\varphi\cos\theta) + z^2}} \tag{4.7.24} \\
&\quad = \sum_{n=0}^\infty [P_n(\cos\varphi)]^2 z^n + 2\sum_{n=0}^\infty \Bigl(\sum_{k=1}^n \frac{(n-k)!}{(n+k)!}\,[P_n^k(\cos\varphi)]^2\cos k\theta\Bigr)z^n.
\end{align*}

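Since $2\cos k\theta = e^{ik\theta} + e^{-ik\theta}$, we have
\[ \sum_{k=-\infty}^\infty \Lambda_{|k|}^n(t)\,e^{ik\theta} = \Lambda_0^n(t) + 2\sum_{k=1}^\infty \Lambda_k^n(t)\cos k\theta. \]
Therefore, equation (4.7.21) can be written as
\[ \frac{1}{1 - 2z(\cos^2\varphi + \sin^2\varphi\cos\theta) + z^2} = \sum_{n=0}^\infty \Bigl(\sum_{k=-\infty}^\infty \Lambda_{|k|}^n(t)\,e^{ik\theta}\Bigr)z^n. \tag{4.7.25} \]
Similarly,
\[ \sum_{k=-n}^n \frac{(n-|k|)!}{(n+|k|)!}\,[P_n^{|k|}(\cos\varphi)]^2 e^{ik\theta} = [P_n(\cos\varphi)]^2 + 2\sum_{k=1}^n \frac{(n-k)!}{(n+k)!}\,[P_n^k(\cos\varphi)]^2\cos k\theta, \]
and (4.7.24) becomes
\[ \frac{1}{\sqrt{1 - 2z(\cos^2\varphi + \sin^2\varphi\cos\theta) + z^2}} = \sum_{n=0}^\infty \Bigl(\sum_{k=-n}^n \frac{(n-|k|)!}{(n+|k|)!}\,[P_n^{|k|}(\cos\varphi)]^2 e^{ik\theta}\Bigr)z^n. \]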
Squaring both sides of the last equation gives
\begin{align*}
&\frac{1}{1 - 2z(\cos^2\varphi + \sin^2\varphi\cos\theta) + z^2} \\
&= \sum_{n=0}^\infty \sum_{m=0}^n \sum_{j=-m}^m \sum_{l=-n+m}^{n-m} \frac{(m-|j|)!}{(m+|j|)!}\,\frac{(n-m-|l|)!}{(n-m+|l|)!}\,[P_m^{|j|}(\cos\varphi)]^2\,[P_{n-m}^{|l|}(\cos\varphi)]^2\,e^{i(j+l)\theta}z^n \\
&= \sum_{n=0}^\infty \sum_{k=-n}^n \sum_{j=-n}^n \sum_{m=|j|}^{n-|k-j|} \frac{(m-|j|)!}{(m+|j|)!}\,\frac{(n-m-|k-j|)!}{(n-m+|k-j|)!}\,[P_m^{|j|}(\cos\varphi)]^2\,[P_{n-m}^{|k-j|}(\cos\varphi)]^2\,e^{ik\theta}z^n. \tag{4.7.26}
\end{align*}
Comparing (4.7.25) with (4.7.26) gives
\[ \Lambda_k^n(t) = 0 \quad \text{for } k > n, \]
since the summation on k in equation (4.7.25) ranges from −∞ to +∞ while the
same summation in (4.7.26) ranges from −n to n. For k = 0, 1, ..., n, we have
\[ \Lambda_k^n(t) = \sum_{j=-n}^n \sum_{m=|j|}^{n-|k-j|} \frac{(m-|j|)!\,(n-m-|k-j|)!}{(m+|j|)!\,(n-m+|k-j|)!}\,[P_m^{|j|}(\cos\varphi)]^2\,[P_{n-m}^{|k-j|}(\cos\varphi)]^2, \]
which is clearly non-negative.
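The non-negativity of the coefficients in (4.7.15) can also be observed numerically, without the Legendre machinery. The sketch below solves (4.7.4) for w in closed form (choosing the root that lies in the unit disk) and recovers the coefficient of $z^{n+1}$ in $e^t w^{k+1}/(1-w^2)$ by the trapezoid rule on a circle |z| = r. The function names, the radius, the node count, and the tolerance are illustrative assumptions, and the check is a numerical spot test rather than a proof.

```python
import cmath
import math

def w_of_z(z, t):
    """Root w in the unit disk of z/(1-z)^2 = e^t w/(1-w)^2, i.e. of (4.7.4)."""
    u = math.exp(-t) * z / (1 - z) ** 2
    return 2 * u / (2 * u + 1 + cmath.sqrt(1 + 4 * u))

def lam(n, k, t, r=0.5, N=256):
    """Coefficient of z^(n+1) in e^t w^(k+1) / (1 - w^2), via the trapezoid rule on |z| = r."""
    total = 0j
    for j in range(N):
        z = r * cmath.exp(2j * math.pi * j / N)
        w = w_of_z(z, t)
        total += math.exp(t) * w ** (k + 1) / (1 - w * w) / z ** (n + 1)
    return (total / N).real

# At t = 0 we have w = z, so the coefficient is 1 when n - k is even and >= 0, else 0.
print(lam(1, 1, 0.0), lam(2, 1, 0.0), lam(3, 1, 0.0))

# Spot check of the sign for a few t > 0.
smallest = min(lam(n, k, t) for t in (0.5, 2.0) for n in range(1, 9) for k in range(1, 9))
print(smallest)
```

Up to rounding error the printed minimum should not be negative, and the entries with k > n should vanish.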


To this point we have proved the inequality $|a_n| \le n$, n = 2, 3, .... If equality
holds for any n, then this argument shows that it holds for all n. In particular, it
holds for n = 2, so Bieberbach's theorem implies that f is a Koebe function. This
completes Weinstein's proof of Bieberbach's conjecture.

Exercises

1. Prove that K (∂D) = (−∞, −1/4], where K is the Koebe function.


2. Prove that the function f 2 of (4.1.6) is single-valued.
3. Use the Koebe function to prove that the bounds in (4.1.12) and (4.1.14) are
sharp.
4. (a) Prove that if f belongs to S, then $f'$ has no zeros in D.
(b) Is the converse true?
5. Suppose that h ∈ S is an odd function of z. Show that $h(z) = \sqrt{f(z^2)}$ where f
belongs to S.
6. (a) Suppose f ∈ S, and m is a positive integer. Show that, for a suitable choice
of the m-th root, the function
\[ g(z) = f(z^m)^{1/m} \tag{4.7.27} \]
belongs to S, and has the symmetry property $g(e^{2\pi i/m}z) = e^{2\pi i/m}g(z)$.
(b) Conversely, show that if g ∈ S has the preceding symmetry property, then g
has the form (4.7.27) for some f ∈ S. Is f unique?
7. Prove the sharper version of the Koebe one-quarter theorem: if f ∈ S omits the
value w; then |w| ≥ 1/(2 + |a2 |).
8. Suppose f ∈ S. Prove that for any compact set K ⊂ D and any ε > 0 there is
a polynomial p such that $p|_D \in S$ and |p(z) − f(z)| < ε for z ∈ K. In other
words, polynomials are dense in S.
9. Let {Ωn } be a sequence of domains, and consider the family of domains Ω with
the property that any compact subset of Ω is contained in all but finitely many
Ωn . Prove that any domain that is the union of domains with this property also
has this property.
10. Prove Corollary 4.3.3.
11. Prove the assertion about the form of the coefficients of $g_t$ in (4.4.1).
12. Suppose that the function k(t) in Loewner’s equation (4.4.3) is identically −1.
Show that the function f generated by (4.4.3) is the Koebe function. In fact,
given z ∈ D, let $g(t) = g_t(z)$. Write (4.4.3) in the form $[\log F]'(g)\,g' = 1$ for a
certain choice of the function F, so that
\[ F(g(t)) = c\,e^t \]
where the constant of integration c depends on z. The initial condition g(0) = z
determines c(z), and the asymptotic condition $e^t g(t) \to f(z)$ determines f(z).
13. (a) Suppose that the function k(t) in (4.4.3) generates f ∈ S. Show that for any
real θ , the function eiθ k(t) generates the function

f θ (z) = e−iθ f (eiθ z).

(b) Show that $f_\theta$ also belongs to S and is a slit map.


14. Suppose that k(t) in (4.4.3) is constant. What f is generated? (Remember that
k(t) ∈ ∂D.)
15. Prove (4.6.4).
16. Prove that $P_n(1) = 1$ and $P_n(-1) = (-1)^n$.
17. Differentiate both sides of (4.6.7) with respect to s and equate coefficients of $s^n$
to prove the recursion relation
\[ (n+1)P_{n+1}(x) = (2n+1)\,x\,P_n(x) - n\,P_{n-1}(x). \]
18. Prove (4.6.8). Hint: use (4.6.8) to derive a partial differential equation for G(x, s),
and show that the right-hand side of (4.6.7) satisfies that equation.
19. Prove (4.7.20). Hint: The integrand is even in θ, so convert the left side into
an integral from −π to π, use the identity $1 - 2w\cos\theta + w^2 = (1 - we^{i\theta})(1 - we^{-i\theta})$
to write the integrand as the sum of two (two-sided) series in $e^{i\theta}$, integrate
term-by-term, and sum.

Remarks and further reading

Standard references for univalent functions, pre-de Branges, are Duren [64] and
Pommerenke [170]. De Branges's original manuscript ran to 385 typed pages and
made much use of ideas from the theory of operators in Hilbert space. As recounted in
[52], de Branges's participation in a seminar in Leningrad that included Milin led to the
simplified proof in [52]. Though much shorter, this proof still gives some indication
of the operator-theoretic considerations that led to de Branges's new approach to the
problem. Another short version is due to Fitzgerald and Pommerenke [73], and has
even made its way into textbooks, e.g. [47] and [102].
Some of the history surrounding de Branges’s proof is recounted in the sympo-
sium volume [14]. For a comprehensive current account of the theory, see Thomas,
Tuneski, and Vasudevarao [205]. For a somewhat different approach, see Rosenblum
and Rovnyak [181].
Weinstein's proof of de Branges's theorem, presented here, depends on a positivity
result involving Legendre polynomials $P_n$. These are among the simplest cases of
Jacobi polynomials $P_n^{(\alpha,\beta)}$. De Branges's argument, and that of Fitzgerald and
Pommerenke, depended on a positivity result for certain sums of Jacobi polynomials
proved by Askey and Gasper [12]. To prove their result, Askey and Gasper
established positivity for a still more esoteric class of hypergeometric functions:
\[ {}_3F_2\bigl(n - r,\ r + n + 2,\ n + \tfrac{1}{2};\ 2n + 1,\ n + \tfrac{3}{2};\ s\bigr) > 0, \quad 0 < s < 1. \]
Wilf [216] has pointed out that Weinstein's argument actually gives an independent
proof of the non-negativity of the Askey–Gasper polynomials.
Chapter 5
Harmonic and subharmonic functions;
the Dirichlet problem

Harmonic and subharmonic functions play an important role in many developments
in complex analysis. As we noted in Section 1.9, a real harmonic function is, locally,
the real part of a holomorphic function and has some of the same properties.
The Dirichlet problem is the problem of finding a function that is harmonic on a
given domain U and has prescribed values on the boundary ∂U . An important special
case with U = D has an explicit solution given by Poisson’s integral formula. This
formula is derived in Section 5.1. A number of consequences, such as maximum
principles and the Harnack inequalities, are worked out in Sections 5.1 and 5.2.
Subharmonic functions and Perron’s principle provide a mechanism of attack
for the Dirichlet problem in a general domain. These are introduced in Section 5.3
and applied to the solution of the Dirichlet problem in Section 5.4. An alternative
approach to the Dirichlet problem is outlined in Section 5.5.
The results and techniques introduced in this chapter lead eventually to the uni-
formization theorem for Riemann surfaces, the subject of Chapters 6 and 7.

5.1 Harmonic functions and the Poisson integral formula

As noted above, the Dirichlet problem is the problem of finding a function that is
harmonic in a given domain and that has prescribed values on the boundary of the
domain. As we shall see, if the domain is a disk, the problem has a very explicit
solution.

Theorem 5.1.1. Suppose that $D \subset \mathbb{C}$ is a disk and $f$ is a continuous real function on
the boundary $\partial D$. Then there is a unique function $u$, harmonic in $D$ and continuous
on the closure of $D$, such that $u = f$ on $\partial D$.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 113
R. Beals and R. S. C. Wong, More Explorations in Complex Functions, Graduate Texts
in Mathematics 298, https://doi.org/10.1007/978-3-031-28288-1_5

Proof: In view of Corollary 1.9.3, it is enough to consider $D = \mathbb{D}$, the unit disk. The
strategy of the proof is to assume we have a solution, derive an explicit formula that
it must satisfy, and then show that the formula does indeed provide the solution.
Suppose that $u$ is a solution. By Proposition 1.9.1, there is a holomorphic function
$g : \mathbb{D} \to \mathbb{C}$ having real part $u$. We may assume that $g(0) = u(0)$. The function $g$ has
an expansion
\[ g(z) = \sum_{n=0}^{\infty} \alpha_n z^n \tag{5.1.1} \]

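that converges uniformly on disks $D_r(0)$, $r < 1$. The real part $u$ has the expansion
\[ u(re^{i\theta}) = \sum_{n=0}^{\infty} r^n \operatorname{Re}(\alpha_n e^{in\theta}) = \sum_{n=-\infty}^{\infty} a_n r^{|n|} e^{in\theta}, \qquad a_n = \begin{cases} \alpha_n/2, & n > 0; \\ \operatorname{Re}\alpha_0, & n = 0; \\ \bar\alpha_{-n}/2, & n < 0. \end{cases} \tag{5.1.2} \]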

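Given $\varepsilon > 0$, the dilated function $g_\varepsilon(z) = g(z/(1 + \varepsilon))$ is holomorphic in $D_{1+\varepsilon}(0)$.
By assumption, the restriction of $u_\varepsilon = \operatorname{Re} g_\varepsilon$ to $\partial\mathbb{D}$ converges uniformly to $f$ as
$\varepsilon \to 0$. Convergence of the sum shows that
\[ u_\varepsilon(e^{i\theta}) = u\!\left(\frac{e^{i\theta}}{1+\varepsilon}\right) = \sum_{n=-\infty}^{\infty} a_n (1+\varepsilon)^{-|n|} e^{in\theta}. \]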

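By assumption, this series converges uniformly, so we can identify the coefficients
$a_n (1+\varepsilon)^{-|n|}$ by integrating term by term, using the identity
\[ \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{in\theta} e^{-im\theta}\, d\theta = \begin{cases} 1 & \text{if } m = n; \\ 0 & \text{if } m \ne n. \end{cases} \tag{5.1.3} \]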

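This gives
\[ \frac{a_n}{(1+\varepsilon)^{|n|}} = \frac{1}{2\pi}\int_{-\pi}^{\pi} u_\varepsilon(e^{i\varphi})\, e^{-in\varphi}\, d\varphi. \]

Convergence of $u_\varepsilon$ to $f$ on $\partial\mathbb{D}$ gives
\[ a_n = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{i\varphi})\, e^{-in\varphi}\, d\varphi. \tag{5.1.4} \]

Note that
\[ |a_n| \le \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(e^{i\theta})|\, d\theta. \]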

Therefore the series
\[ \sum_{n=-\infty}^{\infty} a_n r^{|n|} e^{in\theta} \tag{5.1.5} \]
converges uniformly for $0 \le r \le R < 1$. Since $a_{-n} = \bar a_n$, the terms
\[ r^{n}\,(a_n e^{in\theta} + a_{-n} e^{-in\theta}), \qquad n \ge 1, \]
are real and harmonic, so (5.1.4) and (5.1.5) together define a function $u$ that is real
and harmonic in $\mathbb{D}$.
We have shown that if $u$ is a solution to the Dirichlet problem with boundary value
$f$, then it is necessarily given on $\mathbb{D}$ by the formula
\[ u(re^{i\theta}) = \frac{1}{2\pi}\int_{-\pi}^{\pi} P_r(\theta - \varphi)\, f(e^{i\varphi})\, d\varphi, \qquad 0 \le r < 1, \tag{5.1.6} \]

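where $P_r$ is the Poisson kernel
\[ \begin{aligned} P_r(\theta) &= \sum_{n=-\infty}^{\infty} r^{|n|} e^{in\theta} = 1 + \sum_{n=1}^{\infty} \left[ (re^{i\theta})^n + (re^{-i\theta})^n \right] \\ &= 1 + \frac{re^{i\theta}}{1 - re^{i\theta}} + \frac{re^{-i\theta}}{1 - re^{-i\theta}} \\ &= \frac{1 - r^2}{1 - 2r\cos\theta + r^2}. \end{aligned} \tag{5.1.7} \]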

This proves uniqueness. We have also shown that the function u defined by (5.1.6)
is harmonic in D.
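To see that $u$ is continuous up to the boundary $\partial\mathbb{D}$ and equal to $f$ on $\partial\mathbb{D}$, note that
$P_r$, $0 \le r < 1$, has the properties

(i) $P_r > 0$;
(ii) $\dfrac{1}{2\pi}\displaystyle\int_{-\pi}^{\pi} P_r(\theta)\, d\theta = 1$;
(iii) $\displaystyle\lim_{r \to 1} \frac{1}{2\pi}\int_{\delta \le |\theta| \le \pi} P_r(\theta)\, d\theta = 0$ for each $\delta > 0$;

see Exercise 6. Using these properties and the continuity of $f$, it is not
difficult to prove that $u(re^{i\theta})$ converges uniformly to $f(e^{i\theta})$ as $r \to 1-$. (See the proof
of Theorem 2.9.1.) □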


The formula (5.1.6) is known as the Poisson integral formula.
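
The two expressions for the Poisson kernel in (5.1.7), and the integral formula (5.1.6) itself, can be checked numerically. The following is a minimal sketch, not part of the original text: the function names, the sample point, and the choice of boundary data $f(e^{i\varphi}) = \cos\varphi$ (the boundary values of $\operatorname{Re} z$, so that the solution should be $r\cos\theta$) are assumptions made for illustration.

```python
import numpy as np

def poisson_kernel(r, theta):
    """Closed form (1 - r^2) / (1 - 2 r cos(theta) + r^2) from (5.1.7)."""
    return (1.0 - r**2) / (1.0 - 2.0 * r * np.cos(theta) + r**2)

def poisson_kernel_series(r, theta, terms=200):
    """Truncated two-sided series: 1 + 2 * sum_{n=1}^{terms} r^n cos(n theta)."""
    n = np.arange(1, terms + 1)
    return 1.0 + 2.0 * np.sum(r**n * np.cos(n * theta))

def poisson_integral(f, r, theta, samples=2048):
    """Approximate (5.1.6) by the rectangle rule on a uniform periodic grid."""
    phi = -np.pi + 2.0 * np.pi * np.arange(samples) / samples
    return np.mean(poisson_kernel(r, theta - phi) * f(phi))

r, theta = 0.6, 0.9
closed = poisson_kernel(r, theta)
series = poisson_kernel_series(r, theta)
# Boundary data cos(phi) should reproduce the harmonic function Re z = r cos(theta).
u_value = poisson_integral(np.cos, r, theta)
expected = r * np.cos(theta)
print(closed, series, u_value, expected)
```

The rectangle rule is used because it is very accurate for smooth periodic integrands; any other quadrature would serve equally well here.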


Corollary 5.1.2. (Weierstrass approximation theorem) If $f : \partial\mathbb{D} \to \mathbb{C}$ is continuous, then for any $\varepsilon > 0$ there is a trigonometric polynomial, i.e. a function of the
form
\[ g(\theta) = \sum_{|k| \le m} a_k e^{ik\theta}, \tag{5.1.8} \]
such that $|g(\theta) - f(e^{i\theta})| < \varepsilon$, all $\theta$.

Proof. In view of the previous discussion, the functions
\[ u(re^{i\theta}) = \frac{1}{2\pi}\int_{-\pi}^{\pi} P_r(\theta - \varphi)\, f(e^{i\varphi})\, d\varphi = \sum_{n=-\infty}^{\infty} a_n r^{|n|} e^{in\theta}, \]
where the bounded sequence $\{a_n\}$ is defined by (5.1.4), converge to $f$ uniformly as
$r \to 1$. For any given $0 < r < 1$, the partial sums of the series on the right are trigonometric polynomials, and they converge uniformly to $u(re^{i\theta})$. □
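
The approximation scheme in this proof can be tried numerically. The sketch below is an illustration under stated assumptions rather than part of the text: it takes the sample function $f(e^{i\theta}) = |\theta|$ on $[-\pi, \pi]$, computes the coefficients (5.1.4) by a simple quadrature, and forms the trigonometric polynomial $\sum_{|n| \le m} a_n r^{|n|} e^{in\theta}$. The names, the value $r = 0.99$, and the degree $m = 400$ are arbitrary choices.

```python
import numpy as np

def abel_poisson_polynomial(f, r, m, theta, samples=4096):
    """Evaluate sum_{|n| <= m} a_n r^|n| e^{i n theta}, with a_n as in (5.1.4)."""
    phi = -np.pi + 2.0 * np.pi * np.arange(samples) / samples
    values = f(phi)
    total = np.zeros_like(theta, dtype=complex)
    for n in range(-m, m + 1):
        # Rectangle-rule approximation of (1 / 2 pi) * integral of f e^{-i n phi}.
        a_n = np.mean(values * np.exp(-1j * n * phi))
        total += a_n * r ** abs(n) * np.exp(1j * n * theta)
    return total.real

theta = np.linspace(-np.pi, np.pi, 721)
approx = abel_poisson_polynomial(np.abs, 0.99, 400, theta)
max_error = float(np.max(np.abs(approx - np.abs(theta))))
print(max_error)
```

Taking $r$ closer to 1 (and $m$ correspondingly larger) should reduce the uniform error further, in line with the corollary.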


Remark. The other well-known Weierstrass approximation theorem, that a continuous function on a bounded closed interval can be approximated uniformly by
polynomials, is a consequence. In fact the interval can be rescaled to $[0, \pi]$, and
the function reflected about $\pi$ so that $f(2\pi) = f(0)$ and $f$ can be considered as an
element of $C(\partial\mathbb{D})$. Then $f$ can be approximated within $\varepsilon/2$ by a trigonometric polynomial (5.1.8), and each $a_k e^{ik\theta}$ can be approximated within $\varepsilon/4m$ by a polynomial,
by taking enough terms of the series expansion of $e^{ik\theta}$.

Corollary 5.1.3. (Mean value property) If $u$ is harmonic in a neighborhood of a
point $z \in \mathbb{C}$, then for sufficiently small $r > 0$,
\[ u(z) = \frac{1}{2\pi}\int_{-\pi}^{\pi} u(z + re^{i\theta})\, d\theta, \tag{5.1.9} \]
i.e. $u(z)$ is the mean value of $u$ over any sufficiently small circle centered at $z$.

Proof: After a translation and dilation, we may assume that $z = 0$ and $r = 1$. In this
case, the result follows from Theorem 5.1.1. □
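
As a quick numerical illustration of (5.1.9), one can average a known harmonic function over a circle and compare with its value at the center. This is a sketch under assumptions not taken from the text: the test function $\operatorname{Re} e^{z} = e^{x}\cos y$, the center, and the radius are arbitrary.

```python
import numpy as np

def u(z):
    """A sample harmonic function: the real part of exp(z)."""
    return np.exp(z.real) * np.cos(z.imag)

def circle_mean(func, center, radius, samples=512):
    """Mean of func over the circle |w - center| = radius (uniform sampling in angle)."""
    theta = 2.0 * np.pi * np.arange(samples) / samples
    return float(np.mean(func(center + radius * np.exp(1j * theta))))

center = 0.3 + 0.2j
mean_value = circle_mean(u, center, 0.75)
center_value = float(u(center))
print(mean_value, center_value)
```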


Corollary 5.1.4. (strict maximum principle) If $u$ is harmonic in a bounded domain
$U \subset \mathbb{C}$ and $u$ has a local maximum at a point $z \in U$, then $u$ is constant.

Proof: The mean value property and the assumption that $u$ has a local maximum at $z$
imply that $u$ has this same value on each sufficiently small circle centered at $z$. Thus $u$
is constant, hence holomorphic, near $z$. If $w$ is any other point of $U$ we may find a curve
that joins $z$ to $w$ and a simply connected neighborhood $V$ of the curve with $V \subset U$.
By uniqueness of analytic continuation, $u$ is constant in $V$, so $u(w) = u(z)$. □

Let us pass to consideration of the Dirichlet problem for a Jordan domain: a
domain in C whose boundary is a curve with no self-intersections.

Theorem 5.1.5. Suppose that $\Omega \subset \mathbb{C}$ is a Jordan domain with boundary $\Gamma$. For any
continuous function $f : \Gamma \to \mathbb{C}$, the Dirichlet problem has a unique solution.

Proof. By the Riemann mapping theorem, there is a conformal map $\Phi$ that maps $\mathbb{D}$
onto $\Omega$. By Theorem 2.6.1, $\Phi$ extends to a bijective continuous map from the closure of $\mathbb{D}$
to the closure of $\Omega$. Therefore $\Phi^{-1}$ and $\Phi$ can be used to transfer the Dirichlet problem for
$\Omega$ to the Dirichlet problem for $\mathbb{D}$. The details are left as Exercise 4. □

More general domains are considered in later sections.

We know that a real function $u$ that is harmonic in $\mathbb{D}$ is the real part of a function $\phi$
that is holomorphic in $\mathbb{D}$. Moreover, if we require that $\phi(0)$ be real, then $\phi$ is unique.
We may extend the Poisson integral formula to exhibit $\phi$ in the case when $u = f$ on
$\partial\mathbb{D}$. In fact for $z \in \mathbb{D}$,
\[ \operatorname{Re} \frac{e^{i\theta} + z}{e^{i\theta} - z} = \frac{1 - |z|^2}{|e^{i\theta} - z|^2}. \]

If $z = re^{i\varphi}$, then the last expression on the right is $P_r(\theta - \varphi)$. Since the quotient
on the left is holomorphic in $z$, $z \in \mathbb{D}$, the proof of Theorem 5.1.1 also proves the
following extended Poisson formula:

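Theorem 5.1.6. If $f : \partial\mathbb{D} \to \mathbb{R}$ is continuous, then the function
\[ \phi(z) = \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{e^{i\theta} + z}{e^{i\theta} - z}\, f(e^{i\theta})\, d\theta \]
is holomorphic in $\mathbb{D}$ and the real part is continuous and equal to $f$ at the boundary.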

5.2 Harnack’s principle; removable singularities

We begin with a simple consequence of the Poisson formula.

Lemma 5.2.1. Suppose that $u$ is harmonic and non-negative in $\mathbb{D}$. Then for $z \in \mathbb{D}$
\[ \frac{1-r}{1+r}\, u(0) \le u(z) \le \frac{1+r}{1-r}\, u(0), \qquad r = |z|. \tag{5.2.1} \]

Proof: Suppose first that $u$ is continuous on the closure of $\mathbb{D}$. The Poisson kernel
(5.1.7) satisfies
\[ \frac{1-r}{1+r} = \frac{1-r^2}{(1+r)^2} \le \frac{1-r^2}{1 - 2r\cos\theta + r^2} \le \frac{1-r^2}{(1-r)^2} = \frac{1+r}{1-r}. \]

Then the Poisson formula (5.1.6) gives (5.2.1). If $u$ is not continuous on the closure,
we approximate $u$ as in the proof of Theorem 5.1.1. □


The inequalities (5.2.1) are the simplest case of the Harnack inequalities for
solutions of elliptic equations.
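
The inequalities (5.2.1) can be tested on a concrete positive harmonic function. The sketch below is illustrative only: the choice $u(z) = \operatorname{Re}\,\frac{1+z}{1-z}$ (which equals the Poisson kernel and has $u(0) = 1$), the random sample points, and the small tolerance are assumptions. For this particular $u$ the upper bound is attained on the positive real axis, which the last line checks at $z = 1/2$.

```python
import numpy as np

def u(z):
    """Positive harmonic function on the unit disk: Re((1 + z) / (1 - z))."""
    return ((1 + z) / (1 - z)).real

rng = np.random.default_rng(0)
radii = rng.uniform(0.0, 0.95, size=1000)
angles = rng.uniform(-np.pi, np.pi, size=1000)
z = radii * np.exp(1j * angles)

u0 = u(0j)                                   # value at the center, here 1.0
lower = (1 - radii) / (1 + radii) * u0       # left side of (5.2.1)
upper = (1 + radii) / (1 - radii) * u0       # right side of (5.2.1)
values = u(z)

tol = 1e-12                                  # guards against rounding only
harnack_holds = bool(np.all((lower <= values + tol) & (values <= upper + tol)))
extremal_value = u(0.5 + 0j)                 # compare with (1 + 0.5) / (1 - 0.5)
print(harnack_holds, extremal_value)
```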

Theorem 5.2.2. (Harnack's principle) If $\{u_n\}$ is a sequence of harmonic functions
on a domain $U$, with $u_1 \le u_2 \le u_3 \le \dots$, then $u_\infty(p) = \lim_{n\to\infty} u_n(p)$ is either harmonic or identically infinite.

Proof. Given $p \in U$, let us rescale coordinates, for convenience, so that $p = 0$ and
$\mathbb{D} \subset U$. Let $u_\infty(0) = \lim_n u_n(0)$. If $u_\infty(0)$ is finite, then inequalities (5.2.1) show
that $\lim_{n\to\infty} u_n$ is finite in $\mathbb{D}$. Similarly, if $u_\infty(0) = \infty$, these inequalities show that
the limit is identically $\infty$ in $\mathbb{D}$. Since domains are, by assumption, connected, this
implies that the set where $\lim_{n\to\infty} u_n(p) = \infty$ is either empty or all of $U$. □


An important example of a harmonic function is $\log |z|$ in the punctured plane
$\mathbb{C} \setminus \{0\}$. In fact $\log |z| = \operatorname{Re} \log z$ for any determination of $\log z$, $z \ne 0$.
As in the case of holomorphic functions, there is a removable singularity theorem for harmonic functions. The singularity of $\log |z|$ at the origin is obviously not
removable, but $\log |z|$ allows us to prove a type of one-sided singularity result that
will be useful later.

Lemma 5.2.3. Suppose that $u$ is real-valued and harmonic in the punctured disk
$\mathbb{D} \setminus \{0\}$ and is continuous at the boundary of $\mathbb{D}$. If $u$ is bounded above, then $u \le v$,
where $v$ is the function harmonic on $\mathbb{D}$, continuous on the closure, and equal to $u$ on
the boundary.

Proof: We may subtract $v$ from $u$ and reduce to the case that $u \equiv 0$ on the boundary
of $\mathbb{D}$. Then $v \equiv 0$, and we want to show that $u \le 0$. Let $h > 0$ be an upper bound for
$u$. For any $0 < r < 1$, let
\[ u_r(z) = \frac{h \log |z|}{\log r}. \]

Then $u_r$ is harmonic and $u \le u_r$ on the boundary of the annulus $\{z : r < |z| < 1\}$, so
by the maximum principle $u \le u_r$ on the annulus. As $r \to 0$, $u_r \to 0$ pointwise on
the punctured disk, so $u \le 0$ on the punctured disk. □


Corollary 5.2.4. If $u$ is bounded and harmonic in a punctured neighborhood of a
point, then $u$ extends to be harmonic in a full neighborhood of that point.

Proof. Apply Lemma 5.2.3 to $u$ and to $-u$. □




5.3 Subharmonic functions and Perron’s principle

A real-valued function $u$ defined on a domain $U$ in $\mathbb{C}$ is said to be subharmonic if in
each coordinate disk $D$ with closure in $U$, $u \le h$, where $h$ is the harmonic function
such that $h = u$ on $\partial D$.
It may be helpful in the arguments that follow to think of the one-dimensional
analogue. A real-valued solution of $u_{xx} = 0$ is a linear function $u(x) = \alpha x + \beta$. A
convex function $v : (a, b) \to \mathbb{R}$ is characterized by the property that for any subinterval $(c, d)$, $a < c < d < b$, if $u$ is linear, and if $v \le u$ at the endpoints $c$, $d$, then
$v \le u$ on all of $(c, d)$; see the left part of Figure 5.1.
If $u$ and $v$ are subharmonic on a domain, then so is the maximum $u \vee v$,

\[ [u \vee v](p) = \max\{u(p), v(p)\}. \]

Fig. 5.1 One-dimensional analogues of subharmonicity and of harmonic regularization.

If $u$ is subharmonic in $U$ and $p_0$ belongs to $U$, a harmonic regularization $\hat u$ of $u$ at
$p_0$ is obtained by replacing $u$ in a coordinate neighborhood $D = \{p : |p - p_0| < r\}$
centered at $p_0$ by the harmonic function that agrees with $u$ on $\partial D$. The new function
$\hat u$ is subharmonic and $u \le \hat u$. For the analogous construction in the one-dimensional
case, see the right part of Figure 5.1.
The usefulness of subharmonic functions for attacking the Dirichlet problem and
proving the uniformization theorem was established by Perron [166]. A Perron family
is a non-empty family F of subharmonic functions such that:
(a) if u and v belong to F then so does u ∨ v;
(b) if u belongs to F , so does each harmonic regularization of u.

Theorem 5.3.1. (Perron's principle) Suppose that $F$ is a Perron family of functions
on a domain $U$. Then $\bar u = \sup\{u : u \in F\}$ is either harmonic or identically infinite.

Proof: Let $V$ be a coordinate neighborhood in $U$. Suppose that $p$, $q$ are two points of
$V$. We may choose a sequence in $F$ that converges to $\bar u(p)$ and another that converges
to $\bar u(q)$. Taking advantage of properties (a) and (b) of the definition, we may replace
these with a single sequence that is non-decreasing at $p$ and at $q$ and is harmonic in
$V$. By Theorem 5.2.2, $\bar u(p)$ is infinite if and only if $\bar u(q)$ is infinite. Since $p$ and $q$
were arbitrary, $\bar u$ is either infinite in all of $V$ or finite in all of $V$. The result follows
from connectedness of $U$. □


Suppose that $U$ is a bounded domain with boundary $\partial U$, and that $g : \partial U \to \mathbb{R}$
is continuous. Perron's approach to the Dirichlet problem was to define the family
$F(g)$ to consist of all subharmonic functions $u$ that are continuous on the closure $\overline{U}$
and $\le g$ on the boundary. This is clearly a Perron family. Each element of $F(g)$ is bounded
by $\sup g$, so the supremum of the family is harmonic. We shall refer to this supremum $u$ as the Perron function
for the Dirichlet problem for the pair $(U, g)$. The question is whether the Perron
function is a solution, i.e. whether $u = g$ on $\partial U$. This is not necessarily the case;
indeed there may not be a solution: see Exercise 11.

5.4 Regular points and the solution of the Dirichlet problem

A boundary point $p_0$ of a domain $U$ is said to have a barrier if there is a subharmonic
function $v$ such that $v$ is continuous on $\overline{U}$, $v \le 0$, and $v = 0$ only at $p_0$. A local barrier
can be converted to a barrier:
Lemma 5.4.1. If $p_0$ is a boundary point of $U$ and there is a continuous, subharmonic
function $u \le 0$ in the intersection of a neighborhood of $p_0$ with $U$, then $p_0$ has a
barrier.

Proof: Choose $r > 0$ so that $u$ is defined on $D_r(p_0) \cap U$. Let $c = \sup\{u(p) : |p - p_0| = r\}$, and let $v = \max\{u, c\}$ on the domain of $u$ and $v = c$ on the remainder
of $U$. □


A boundary point $p_0$ of a domain is said to be regular if there exists a barrier
at $p_0$. The extreme example of a point that is not regular is an isolated point of the
boundary; see Exercise 12. See Figure 5.2.

Fig. 5.2 Two domains for which every boundary point is regular.

As we shall see, if $u$ is the Perron function for the pair $(U, g)$, then $u \to g$ at
each regular point of $\partial U$. This is not very useful unless we can identify at least some
class of regular points. How does one tell when a barrier, or local barrier, exists? The
perfect candidate for a local barrier would seem to be
\[ v(p_0 + re^{i\theta}) = \operatorname{Re} \frac{1}{\log(re^{i\theta})} = \frac{\log r}{(\log r)^2 + \theta^2}. \tag{5.4.1} \]

This function is harmonic and negative for $0 < r < 1$, but $\theta^2$ is not single-valued.
However this suggests a way to remedy the situation.
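
The closed form in (5.4.1), its sign, and its dependence on the choice of $\theta$ can be checked directly. This is a minimal sketch with hypothetical function names and arbitrary sample points; it takes $\log(re^{i\theta}) = \log r + i\theta$ for whichever value of $\theta$ is supplied.

```python
import math

def barrier_via_log(r, theta):
    """Re(1 / log(r e^{i theta})), using the branch log r + i theta."""
    return (1.0 / complex(math.log(r), theta)).real

def barrier_closed_form(r, theta):
    """Right-hand side of (5.4.1)."""
    return math.log(r) / (math.log(r) ** 2 + theta ** 2)

samples = [(0.1, 0.0), (0.5, 1.0), (0.9, -2.5), (0.3, 3.0)]
max_gap = max(abs(barrier_via_log(r, t) - barrier_closed_form(r, t)) for r, t in samples)
all_negative = all(barrier_closed_form(r, t) < 0 for r, t in samples)

# Replacing theta by theta + 2*pi describes the same point but changes the value,
# which is the failure of single-valuedness noted in the text.
branch_gap = abs(barrier_closed_form(0.5, 1.0) - barrier_closed_form(0.5, 1.0 + 2 * math.pi))
print(max_gap, all_negative, branch_gap)
```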
Proposition 5.4.2. Suppose that $p_0$ is a boundary point of a bounded domain $U$ and
suppose that for some $0 < R < 1$ and some real $\theta_0$ the line segment
\[ L = \{p = p_0 + re^{i\theta_0} : 0 < r < R\} \tag{5.4.2} \]
is contained in the complement of $U \setminus \{p_0\}$. Then $p_0$ is a regular point.

Proof: In this case there is a single-valued branch of $\log((p - p_0)/R)$, holomorphic
on the complement of the segment (5.4.2) in $D_R(p_0)$, so (5.4.1) is a local barrier and
Lemma 5.4.1 applies. □


Remark. It is clear that in place of a straight segment in the complement of $U \setminus \{p_0\}$,
it is enough to have any $C^1$ curve that has $p_0$ as one endpoint and otherwise lies in
the complement of $U$.
Theorem 5.4.3. (Perron) If $U$ is a bounded domain, $g : \partial U \to \mathbb{R}$ a continuous
function, and $p_0$ a regular point of $\partial U$, then the Perron function $u$ for $(U, g)$ converges
to $g(p_0)$ at $p_0$.

Proof: Let $F(U, g)$ be the Perron family. Given $\varepsilon > 0$, there is a $\delta > 0$ such that
$|g(p) - g(p_0)| < \varepsilon$ if $g(p)$ is defined and $|p - p_0| < \delta$. Let $v$ be a barrier at $p_0$. Let
$\|g\| = \max |g|$. For sufficiently large $M$,
\[ Mv + 2\|g\| < 0 \quad \text{on } \partial U \setminus D_\delta(p_0). \]

Let $w = Mv + g(p_0) - \varepsilon$. This function is subharmonic, continuous on $\partial U$, and
$w(p_0) = g(p_0) - \varepsilon$. In $D_\delta(p_0) \cap U$,
\[ w = Mv + g(p_0) - \varepsilon < g(p_0) - \varepsilon \le g \]
by the choice of $\delta$. By the choice of $M$, on the complement of $U \setminus D_\delta(p_0)$ we have
\[ w = Mv + 2\|g\| + g(p_0) - 2\|g\| - \varepsilon < g(p_0) - 2\|g\| - \varepsilon < g. \]

Therefore $w$ belongs to $F(U, g)$. Note that $w(p_0) = g(p_0) - \varepsilon$. We have shown
that the Perron function $u$ satisfies
\[ \liminf_{p \to p_0} u(p) \ge g(p_0) - \varepsilon. \tag{5.4.3} \]

Let us apply this argument in the case of $-g$ to produce $w^* \in F(U, -g)$ such that
$w^*(p_0) = -g(p_0) - \varepsilon$. Now let $v$ be any element of $F(U, g)$. Then $v + w^*$ is continuous on
$\overline{U}$, subharmonic in $U$, and $\le 0$ on $\partial U$. Therefore $v \le -w^*$ in $U$. Since $v \in F(U, g)$
was arbitrary, $u \le -w^*$. Therefore
\[ \limsup_{p \to p_0} u(p) \le -\liminf_{p \to p_0} w^*(p) \le -(-g(p_0) - \varepsilon) = g(p_0) + \varepsilon. \tag{5.4.4} \]

Since $\varepsilon > 0$ is arbitrary, (5.4.3) and (5.4.4) together show that $u$ has limit $g(p_0)$ at
$p_0$. □


5.5 The $L^2$ approach to the Dirichlet problem

Given a bounded domain $\Omega$, consider the space $C^1(\overline{\Omega})$ of real functions $u$ such that
$u$ and its first derivatives are continuous on the closure $\overline{\Omega}$. The associated Dirichlet
integral is
\[ D(u) = \iint_\Omega [u_x^2 + u_y^2]\, dx\, dy. \tag{5.5.1} \]

In physical problems, integrals like this occur as energy integrals, e.g. as the kinetic
energy of a vibrating surface. A natural problem is to try to minimize $D(u)$ among
those functions on $\Omega$ that have a specified value $f$ on the boundary $\Gamma$. This is a
problem in the calculus of variations. Let $F$ be the family of $C^1$ functions that have
finite Dirichlet integral and are equal to $f$ on the boundary. If this family is not empty,
then there is a sequence $\{u_n\} \subset F$ such that
\[ \lim_{n \to \infty} D(u_n) = \inf_{u \in F} D(u). \]

The terms $u_n$ can be viewed as elements of the Hilbert space $H$ that has inner product
\[ \langle u, v \rangle = \iint_\Omega [u_x v_x + u_y v_y]\, dx\, dy. \]

(Note that $\langle u, u \rangle = 0$ if and only if $u$ is constant. Therefore elements of $H$ are
determined only up to an additive constant. Of course fixing the value at the boundary
fixes the constant.) We might expect the sequence $\{u_n\}$ to converge to a unique
element $u \in H$.
A standard argument from the calculus of variations puts a constraint on such an
element $u$. If $w$ is any element of $C^1(\overline{\Omega})$ that vanishes on the boundary, then we
should have, for all $\varepsilon$,
\[ D(u) \le D(u + \varepsilon w) = \langle u + \varepsilon w, u + \varepsilon w \rangle = D(u) + 2\varepsilon \langle u, w \rangle + \varepsilon^2 D(w). \]

Differentiating the last expression with respect to $\varepsilon$ at $\varepsilon = 0$, we see that the necessary
condition is that $\langle u, w \rangle = 0$. Therefore, formally,

\[ 0 = \langle u, w \rangle = \iint_\Omega [u_x w_x + u_y w_y] = -\iint_\Omega (u_{xx} + u_{yy})\, w, \tag{5.5.2} \]
where the (formal) integration by parts is (formally) justified by the assumption
that $w = 0$ on the boundary. Since (5.5.2) is supposed to hold for each such $w$, the
conclusion is that (in some sense) $u$ is harmonic in $\Omega$ and is the solution of the
Dirichlet problem for the pair $(\Omega, f)$.
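
A discrete analogue makes the variational argument concrete. The sketch below is an illustration under several assumptions that are not in the text: the domain is the unit square, derivatives are replaced by forward differences on a uniform grid, $u = x^2 - y^2$ plays the role of the harmonic function, and $w = x(1-x)y(1-y)$ is a perturbation vanishing on the boundary. For this particular $u$ the five-point discrete Laplacian vanishes, so summation by parts gives $\langle u, w \rangle = 0$ up to rounding, and the energy of $u + \varepsilon w$ exceeds that of $u$ by $\varepsilon^2 D(w)$.

```python
import numpy as np

def dirichlet_form(u, v, h):
    """Discrete analogue of <u, v>: products of forward differences summed over the grid."""
    ux, vx = np.diff(u, axis=0) / h, np.diff(v, axis=0) / h
    uy, vy = np.diff(u, axis=1) / h, np.diff(v, axis=1) / h
    return h * h * (np.sum(ux * vx) + np.sum(uy * vy))

n = 41
x = np.linspace(0.0, 1.0, n)
h = x[1] - x[0]
X, Y = np.meshgrid(x, x, indexing="ij")

u = X**2 - Y**2                    # harmonic; its discrete Laplacian also vanishes
w = X * (1 - X) * Y * (1 - Y)      # vanishes on the boundary of the square
eps = 0.3

cross = dirichlet_form(u, w, h)                       # discrete <u, w>
energy_u = dirichlet_form(u, u, h)                    # discrete D(u)
energy_perturbed = dirichlet_form(u + eps * w, u + eps * w, h)
expansion = energy_u + 2 * eps * cross + eps**2 * dirichlet_form(w, w, h)
print(cross, energy_u, energy_perturbed)
```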
To show that the sequence $\{u_n\}$ above does converge in $H$ and that the limit $u$
is continuous on $\overline{\Omega}$ and equal to $f$ on the boundary requires some assumptions on
the nature of the boundary. If this is the case, though, one can show that $u$ is indeed
harmonic in $\Omega$. In fact integration by parts in (5.5.2) can be done in the other direction:
\[ 0 = -\iint_\Omega u\, (w_{xx} + w_{yy}), \qquad w \in C^2(\overline{\Omega}),\ w = 0 \text{ on } \partial\Omega. \]

This says that $u$ is a weak solution of $\Delta u = 0$, and is therefore an actual solution of
$\Delta u = 0$: see Theorem 2.9.4.
This approach to the Dirichlet problem has roots in the work of Gauss, Green,
Dirichlet, Riemann, Schwarz, and Hilbert, and was brought to fruition by Weyl [214].
(Weyl’s work can be thought of as the beginning of the theory of distributions.) Some
regularity of the boundary is needed. In fact Prym [174] gave an example of a pair
$(U, g)$ such that for any $f$ that is continuous on $\overline{U}$ and equal to $g$ on $\partial U$, the formal
integral $\langle f, f \rangle$ is infinite.

Exercises

1. Prove that the harmonic function u and the function v of (1.9.2) satisfy the
Cauchy–Riemann equations.
2. Suppose that $f : \Omega_1 \to \Omega_2$ is holomorphic and $u : \Omega_2 \to \mathbb{R}$ is harmonic. Show
that $u \circ f$ is harmonic.
3. Prove that if u is harmonic on the half-disk D+ = {z : |z| < 1, Im z > 0}, con-
tinuous on the closure of D+ , and vanishes on [−1, 1], then u can be continued
as a harmonic function to all of D, with u(z̄) = −u(z).
4. Fill in the details in the proof of Theorem 5.1.5.
5. Prove that the assumption in Theorem 1.7.2 that f is continuous up to I and
| f (z)| = 1 on I can be replaced by the weaker assumption that | f (z)| → 1 as z
approaches I . This is the Schwarz reflection principle, which plays a major role
in the study of conformal mapping.
6. Complete the proof of Theorem 5.1.1 by using the properties (i), (ii), (iii) of the
Poisson kernel to prove that $u(re^{i\theta}) \to f(e^{i\theta})$ as $r \to 1$.
7. Use the Cayley transform and its inverse to find a solution to the problem: u
harmonic in H, u = f on the boundary R, where f is a bounded continuous
function on R. What is the value of u at z = i?
8. Show that the boundedness condition in Lemma 5.2.3 can be weakened to
max{u(z), 0} = o(− log |z|) as |z| → 0.
9. Show that the boundedness condition in Corollary 5.2.4 can be weakened to
|u(z)| = o(− log |z|) as |z| → 0.
10. Suppose K ⊂ C is the union of finitely many disks. Show that the Dirichlet
problem is solvable for K .
11. Use Corollary 5.2.4 to show that the Dirichlet problem on the punctured disk
U = D \ {0} may not have a solution.
12. Suppose that p0 is an isolated point of the boundary of a bounded domain. Show
that p0 is not a regular point.
13. (a) Suppose that $\Omega \subset \mathbb{C}$ is bounded and $p_0 \in \Omega$. Show (without using conformal
mapping) that there is a function $u$, harmonic in $\Omega \setminus \{p_0\}$, such that
\[ u + \log |z - p_0| \]
is harmonic near $p_0$.

(b) Show that if $\Omega = \mathbb{C}$, then there is no such harmonic function. Hint: let $F$
be the largest set of subharmonic functions in $\Omega \setminus \{p_0\}$ that has the properties:
(i) if $u, v \in F$ then $u \vee v \in F$; (ii) each harmonic regularization of an element
of $F$ belongs to $F$; (iii) if $u \in F$ then
\[ u(z) + \log |z - p_0| \text{ is bounded in a neighborhood of } p_0. \]

(Here we take the branch of $\log |z - p_0|$ that is negative near $p_0$.)

(c) Prove (a) without the restriction on using conformal mapping.

Remarks and further reading

The Laplacian $\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$ in $\mathbb{R}^2$ has an obvious analogue in $\mathbb{R}^n$, and indeed in any
Riemannian manifold, and solutions of $\Delta u = 0$ are the associated harmonic functions.
For more on harmonic functions in Rn , see Axler, Bourdon, and Ramey [13]. For
harmonic, subharmonic, and plurisubharmonic functions in one and several complex
variables, see Hayman and Kennedy [99], Hayman [98], and texts on several complex
variables, such as Krantz [125], Ohsawa [156], and Hörmander [109].
Maximum principles for solutions of general partial differential equations are
treated by Protter and Weinberger [172] and Pucci and Serrin [173].
Chapter 6
General Riemann surfaces

In this chapter, we introduce the idea of an abstract Riemann surface $S$, construct a
simply connected covering surface for $S$, and derive some consequences.
An example is the Riemann sphere S = C ∪ {∞}, the one-point compactification
of C. The complex structure at ∞ is transferred from that of C by the inversion
z → 1/z.
Another example of a Riemann surface is any domain U in S: a two (real)-
dimensional manifold with a conformal structure. Recall that, by definition, a domain
U is connected. If U ⊂ C is also simply connected and omits at least two points, the
Riemann mapping theorem says that U can be mapped conformally to the unit disk.
If U omits only one point, then a linear fractional transformation with a pole at that
point maps U to C.
A third type of example is provided by the equation
\[ z^2 + w^2 = 1. \tag{6.0.1} \]

This equation defines a complex curve $C \subset \mathbb{C}^2$:
\[ C \equiv \{(z, w) \in \mathbb{C}^2 : z^2 + w^2 = 1\}. \]

Note that as $z \to \infty$, the two choices for $w$ are asymptotic to $\pm iz$. This suggests
considering an appropriate extension of $C$ to a subset $\widetilde{C} \subset S \times S$. It can be shown
that $\widetilde{C}$ can be considered as a Riemann surface; it is equivalent to $S$.
Remark. The terminology here unfortunately conflicts with the common usage of
“curve” to refer to a map from a real interval into C, or into a Riemann surface, or to
the image of such a map. We rely on the context to make clear which use is intended.
As conceived originally by Riemann and Weierstrass, a Riemann surface was the
appropriate domain of definition of a possibly multiple-valued function $f$, extended
as far as possible by analytic continuation. This is connected to the third example,
with $f(z) = \sqrt{z^2 - 1}$. The general intrinsic concept, due to Weyl [214], is treated
in Section 6.1.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 125
R. Beals and R. S. C. Wong, More Explorations in Complex Functions, Graduate Texts
in Mathematics 298, https://doi.org/10.1007/978-3-031-28288-1_6

The remainder of the chapter leads part of the way to a general classification of
Riemann surfaces as quotients of $S$, $\mathbb{C}$, or $\mathbb{D}$ by certain equivalence relations. The first
two steps are taken in Section 6.2. The first step is the construction of the universal
cover $S^u$ of a Riemann surface $S$. The second step is the identification of a group $G$
of automorphisms of $S^u$ that correspond to equivalence classes of curves in $S$ that
have a fixed endpoint. Then $S$ itself can be identified with the quotient $G \backslash S^u$ of $S^u$
by the equivalence relation induced by $G$.
In Section 6.3, we assume the uniformization theorem. This theorem, which is
proved in Chapter 7, says that any simply connected Riemann surface is equivalent
to one of $\mathbb{D}$, $\mathbb{C}$, or $S$. The relevance to the preceding discussion is that any universal
cover $S^u$ is simply connected. Therefore the group $G$ can be taken to be a (certain
type of) group of linear fractional transformations. The possibilities when $S^u$ is $\mathbb{C}$ or
$S$ are determined explicitly, and the much more variegated case $S^u \equiv \mathbb{D}$ is discussed.

6.1 Abstract Riemann surfaces

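An (abstract) Riemann surface is a set $S$ provided with a conformal structure by a
collection of non-empty subsets $U_\alpha$ such that $S = \bigcup_\alpha U_\alpha$ and:
(a) for each index $\alpha$ there is a bijection $\phi_\alpha$ mapping $U_\alpha$ onto $\mathbb{D}$;
(b) if $U_\alpha \cap U_\beta$ is not empty, then $D_{\alpha\beta} \equiv \phi_\alpha(U_\alpha \cap U_\beta)$ is an open subset of $\mathbb{D}$ and
$\phi_\beta \circ \phi_\alpha^{-1} : D_{\alpha\beta} \to \mathbb{D}$ is holomorphic;
(c) if $p$ and $q$ are two points of $S$, then there is a sequence $U_{\alpha_j}$, $1 \le j \le n$, such
that $p \in U_{\alpha_1}$, $q \in U_{\alpha_n}$, and $U_{\alpha_j} \cap U_{\alpha_{j+1}}$ is not empty, $1 \le j < n$.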
Assumptions (a) and (b) provide S with a topology: the open sets of S are the
subsets U with the property that each image φα (U ∩ Uα ) is open in D. Assumption
(c) implies that S is pathwise connected: given points z, w of S, there is a continuous
curve that joins z to w. Note that a domain, i.e. a connected open subset U , of a
Riemann surface is a Riemann surface, using the intersections U ∩ Uα that are not
empty to provide the complex structure.
Suppose that $U \subset S$ is a domain. A function $f : U \to \mathbb{C}$ is, by definition, holomorphic (resp. meromorphic) if each function $f \circ \phi_\alpha^{-1}$ is holomorphic (resp. meromorphic), where defined, on $\mathbb{D}$. In particular, each coordinate mapping $\phi_\alpha$ is holomorphic on $U_\alpha$. More generally, a map $f$ from $U$ to a Riemann surface $S'$ is defined to
be holomorphic (resp. meromorphic) if $g \circ f$ is holomorphic (resp. meromorphic)
on $S$ for each $g$ that is holomorphic on $f(U) \subset S'$.
The sets Uα are referred to as coordinate neighborhoods, and the mappings φα as
coordinate charts or simply as coordinates. More generally, any holomorphic map
z that is defined on a simply connected open neighborhood U of p ∈ S and maps U
injectively into C is said to be a coordinate at p, and U is said to be a coordinate
neighborhood.
As a first example, let us take the Riemann sphere $S = \mathbb{C} \cup \{\infty\} = U_1 \cup U_2$,
where
\[ U_1 = \mathbb{C}, \qquad U_2 = (\mathbb{C} \setminus \{0\}) \cup \{\infty\}, \]
with $\phi_1(z) = z$, $\phi_2(z) = 1/z$.

As a second example, consider the curve $\widetilde{C}$ from the introduction to this chapter:
\[ \widetilde{C} = \{(z, w) \in \mathbb{C} \times \mathbb{C} : z^2 + w^2 = 1\} \cup \{(\infty, \infty)\} \subset S \times S; \tag{6.1.1} \]
see Exercise 1.
The definitions, results, and proofs in Section 1.8 carry over immediately to Rie-
mann surfaces. In particular:

Theorem 6.1.1. If $S$ is simply connected, $f$ is holomorphic in some coordinate
neighborhood, and $f$ can be continued along every curve in $S$, then $f$ has a unique
single-valued extension to $S$.

Remark. With suitable modifications, we can define analytic continuation for mero-
morphic functions, and obtain an analogue of Theorem 6.1.1 for a meromorphic
function defined on a coordinate neighborhood. Note that the Riemann sphere S car-
ries no holomorphic functions, but carries many meromorphic functions: the rational
functions.

Two Riemann surfaces $S$ and $S'$ are said to be equivalent if there is a holomorphic
bijection $\phi$ from $S$ onto $S'$. As shown in Chapter 1 for the case of domains in $\mathbb{C}$,
in local coordinates injectivity implies that the derivative of $\phi$ is non-zero and the
local inverse is holomorphic. Therefore the inverse map is also holomorphic, so this
is indeed an equivalence relation.
The uniformization theorem says that any simply connected Riemann surface
is equivalent to either the unit disk, the complex plane, or the Riemann sphere.
Note that these three surfaces are themselves inequivalent: see Exercise 2. This
theorem indicates the importance of the construction, for any $S$, of a simply connected
covering surface. This construction is the subject of Section 6.2.

6.2 The universal cover

We begin with some remarks about curves in a Riemann surface S. Again a curve
in S is a continuous map γ from a real interval I = [a, b] into S. The endpoints are
the ordered pair (γ (a), γ (b)). Two curves γ1 and γ2 are taken to be the same if they
differ only by parametrization: γ j : I j → S and γ2 = γ1 ◦ φ, where φ is an injective
increasing map from I2 onto I1 . Two curves γ0 and γ1 with the same domain I are
said to be homotopic if there is a continuous mapping γ : I × [0, 1] → S such that

γ (s, 0) = γ0 (s), γ (s, 1) = γ1 (s), s ∈ I.



More generally, two curves are said to be homotopic if they are equivalent to a pair
of curves that are homotopic in this sense. In other words, two curves are homotopic
if one can be continuously deformed into the other. Canonical examples are curves
in the punctured plane S = C \ {0}:

γ1 (θ ) = eiθ , γ2 (θ ) = 2eiθ , γ3 (θ ) = 4 + eiθ , |θ | ≤ π.

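Then $\gamma_1$ and $\gamma_2$ are homotopic in $S$, but neither is homotopic in $S$ to $\gamma_3$; see Figure
6.1. We write $\gamma \sim \tilde\gamma$ if the curves $\gamma$ and $\tilde\gamma$ are homotopic (in a specified domain),
and we write $\gamma \nsim \tilde\gamma$ otherwise.

Fig. 6.1 In $\mathbb{C} \setminus \{0\}$, $\gamma_2 \sim \gamma_1$, but $\gamma_2 \nsim \gamma_3$.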
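
One way to see numerically that $\gamma_3$ cannot be deformed into $\gamma_1$ or $\gamma_2$ within the punctured plane is to count how many times each closed curve turns around the origin. The winding number is a standard homotopy invariant for closed curves in $\mathbb{C} \setminus \{0\}$; it is not introduced in the text above, and the sketch below (function name and sampling density are assumptions) uses it only as an illustration.

```python
import numpy as np

def winding_number(curve, samples=4000):
    """Net number of turns about 0 of the closed curve theta -> curve(theta), |theta| <= pi."""
    theta = np.linspace(-np.pi, np.pi, samples)
    # Follow the argument continuously along the curve, then count full turns.
    angles = np.unwrap(np.angle(curve(theta)))
    return int(round((angles[-1] - angles[0]) / (2.0 * np.pi)))

gamma1 = lambda t: np.exp(1j * t)          # unit circle about 0
gamma2 = lambda t: 2.0 * np.exp(1j * t)    # circle of radius 2 about 0
gamma3 = lambda t: 4.0 + np.exp(1j * t)    # unit circle about 4, not enclosing 0

windings = [winding_number(g) for g in (gamma1, gamma2, gamma3)]
print(windings)
```

Curves with different winding numbers about 0 cannot be homotopic in the punctured plane, which is consistent with the figure.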

In particular, any two constant curves, curves whose range is a single point, are
equivalent; this is a consequence of pathwise connectedness.
The equivalence classes of curves in a Riemann surface S can be given a group
structure. If γ2 starts at the (second) endpoint of γ1 , then after a reparametrization, γ1
followed by γ2 is a curve γ1 · γ2 from the first endpoint of γ1 to the second endpoint
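of $\gamma_2$. If $\gamma$ is a curve, let $\gamma^{-1}$ denote the curve obtained by reversing the direction of
travel. Then $\gamma \cdot \gamma^{-1}$ is homotopic to a constant curve; see Exercise 4. It is easy to check
that $(\gamma_1 \cdot \gamma_2) \cdot \gamma_3 \sim \gamma_1 \cdot (\gamma_2 \cdot \gamma_3)$ and that if $\gamma_j \sim \hat\gamma_j$, $j = 1, 2$, then $\gamma_1 \cdot \gamma_2 \sim \hat\gamma_1 \cdot \hat\gamma_2$.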
The equivalence classes form a (non-commutative) group, the fundamental group
H1 (S); Exercise 5.
Suppose that S and S̃ are Riemann surfaces. S̃ is said to be a cover of S if there is
a mapping π : S̃ → S with the property that π is holomorphic, and for each p ∈ S,
there is an open neighborhood U of p such that π^{−1}(U) consists of disjoint open sets
each of which is mapped bijectively by π onto U. For example, Figure 1.2 shows a
portion of the domain of the logarithm as a cover of the punctured plane C \ {0}.
By a lift of a curve γ in S to a curve in a cover S̃, we mean a curve γ̃ such that
π ◦ γ̃ = γ ◦ π, as in the commutative diagram

          γ̃
    S̃ −−−−→ S̃
    ⏐          ⏐
    π          π
    ↓          ↓
          γ
    S −−−−→ S.

Lemma 6.2.1. Suppose that S̃ is a cover of S and suppose that γ is a curve in S
that begins at p. For each p̃ ∈ π^{−1}(p), there is a unique lift γ̃ in S̃ that starts at p̃.

Proof: Suppose that γ is parametrized by the interval [0, 1]. It is easily seen that for t
close to zero there is a unique such lift of the restriction of γ to the interval [0, t]. It is
also easily seen that the set of t such that γ has a unique lift from [0, t] to S̃ is both open
and closed in [0, 1], and is therefore all of [0, 1].
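
The step-by-step nature of this argument can be imitated numerically. The following sketch assumes the familiar cover exp : C → C \ {0} (used here only as an illustration) and lifts a sampled curve by applying, on each short step, the local inverse of exp. The function name lift_through_exp and the sample count are hypothetical choices.

```python
import cmath
import math

def lift_through_exp(curve, start, n=2000):
    """Lift a curve in C minus {0} to the cover exp: C -> C minus {0}.

    `curve` maps t in [0, 1] to a nonzero complex number and `start` is a
    chosen point with exp(start) == curve(0). Returns the sampled lift.
    """
    lifted = [start]
    prev = curve(0.0)
    for k in range(1, n + 1):
        cur = curve(k / n)
        # On a short step the local inverse of exp is the principal log of the ratio.
        lifted.append(lifted[-1] + cmath.log(cur / prev))
        prev = cur
    return lifted

circle = lambda t: cmath.exp(2j * math.pi * t)   # closed curve starting at 1
lift0 = lift_through_exp(circle, 0.0)            # the lift that starts at 0
lift1 = lift_through_exp(circle, 2j * math.pi)   # the lift that starts at 2*pi*i
```

Each choice of starting point over γ(0) = 1 produces its own lift, and the lift of the closed circle is not closed: it ends 2πi above where it began.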


Lemma 6.2.2. If curves γ0 and γ1 in S are homotopic, and γ̃0, γ̃1 are lifts to a cover
S̃ that have the same starting point, then γ̃0 and γ̃1 are homotopic.

Proof: Let γs, 0 ≤ s ≤ 1, provide a continuous deformation from γ0 to γ1. Then the
lifts γ̃s to S̃ provide a continuous deformation from γ̃0 to γ̃1.


A universal cover of S is a cover S^u that is simply connected. If S^u is a universal
cover of S, then it is a cover for any cover S̃ of S. In particular, S^u is unique up to
equivalence; see Exercise 6.
The notion of a universal cover, and the first constructions, go back to the work of
Schwarz, Klein, Poincaré, and others on understanding and classifying the Riemann
surfaces of algebraic functions. One line of attack is to build up from the original
surface by cutting and pasting. A first step is illustrated schematically in Figure 6.2.
Starting with a compact Riemann surface S of genus 2 (like the surface of a two-holed
doughnut), S is cut along one curve that is not homotopic to a constant. Two copies
of S are joined together along the two sides of the cut, to form a new surface S̃. Each
point of S, such as p in the figure, corresponds naturally to two points, such as p′
and p′′ in S̃. This can be done in such a way as to preserve conformal structures, so
that there is a two-to-one covering map π from S̃ onto S. Similar constructions can
be carried out indefinitely, in such a way that the limiting manifold S̃ is no longer
compact, and such that curves like γ that are not homotopic to constant curves
in S become open curves in S̃. Thus, S̃ is simply connected.
The more modern approach to the construction of a universal cover is simpler and
more conceptual.
Theorem 6.2.3. A Riemann surface S has a universal cover.

Proof: Fix a point p0 in S. Consider the set S̃ consisting of all pairs (p, γ), where p
is a point of S and γ is a curve from p0 to p. We define an equivalence relation ∼ in
the set {(p, γ)} by

Fig. 6.2 The construction of a two-fold cover.

( p, γ1 ) ∼ ( p, γ2 ) if and only if γ1 is homotopic to γ2 .

Let [p, γ] denote the equivalence class of the pair (p, γ). Let S^u be the set of all
equivalence classes [p, γ]. If p′ lies in a coordinate neighborhood U of p, any curve
from p0 to p can be extended within U so as to reach p′. Extensions of two such
curves will be homotopic if and only if the original curves are homotopic. Therefore
we may sort the coordinate neighborhoods U of p into equivalence classes [U, γ].
Each equivalence class [U, γ] can be considered as a coordinate neighborhood Ũ
of [p, γ], and any coordinate φ on U induces a coordinate φ̃ on each [U, γ]. This gives us
a covering of S^u and a corresponding set of mappings φ̃ that satisfy the properties
(a), (b), (c) stated at the beginning of Section 6.1; see Exercise 7. Therefore their union
S^u is a Riemann surface. The map

π : [p, γ] → p, p ∈ S (6.2.1)

is a covering map.
Finally, we need to show that S^u is simply connected. Let p̃0 be the equivalence
class of (p0, γ0), where γ0 is the constant curve at p0. If γ is a curve from p0 to p,
denote by γ̃ the lift of γ to S^u that begins at p̃0. Then γ̃ ends at [p, γ]; Exercise
11. Suppose that γ̃1 is a closed curve in S^u that begins and ends at [p, γ]. It is the
lift to S^u, starting at [p, γ], of the projection γ1 = π ◦ γ̃1 in S. Moreover, γ̃1 · γ̃ is
the lift to S^u of γ1 · γ. Since γ̃1 · γ̃ and γ̃ have the same endpoint, it follows that
γ1 · γ and γ are homotopic. Therefore, by Lemma 6.2.2, γ̃1 · γ̃ · γ̃^{−1} and γ̃ · γ̃^{−1} are
homotopic. The former of these last two curves is homotopic to γ̃1 and the latter is
homotopic to a constant. Therefore the lift γ̃1 is homotopic to a constant, and we have
shown that S^u is simply connected.


Suppose that S^u is the universal cover of S as constructed. Choose a point p0 of S.
We shall associate to each closed curve γ starting at p0 a map of S^u to itself. Given
[p, γ_p] in S^u, let

A_γ([p, γ_p]) = [p, γ_p · γ]. (6.2.2)

Theorem 6.2.4. (a) π ◦ A_γ = π.
(b) A_γ is an automorphism of S^u.
(c) If γ is the constant curve at p0, then A_γ is the identity map of S^u.
(d) A_{γ1} = A_{γ2} if and only if γ1 and γ2 are homotopic.
(e) A_γ has a fixed point if and only if A_γ is the identity map.
(f) A_{γ1·γ2} = A_{γ2} A_{γ1}.

Proof: (a), (c), and (f) are immediate from the definitions.
(b) It follows readily from the definition that A_γ maps a coordinate neighborhood
of [p, γ_p] holomorphically to the corresponding coordinate neighborhood of
[p, γ_p · γ].
(d) This follows from the fact that γ1 and γ2 are homotopic if and only if γ1 · γ
and γ2 · γ are homotopic.
(e) By (d), A_γ([p, γ_p]) = [p, γ_p] if and only if γ_p and γ_p · γ are homotopic,
which is the case if and only if γ is homotopic to a constant, which implies that A_γ is
the identity map. Conversely, the identity map fixes every point of S^u.
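
A concrete instance may help fix ideas. The sketch below assumes the familiar cover exp : C → C \ {0}, for which the maps w → w + 2πin play the role of the A_γ, with n the number of times γ winds around 0. That identification is an assumption of this illustration and is not derived in the text. The name deck is hypothetical.

```python
import cmath
import math

def deck(n):
    """Deck transformation w -> w + 2*pi*i*n of the cover exp: C -> C minus {0}."""
    return lambda w: w + 2j * math.pi * n

w = 0.3 + 0.7j
A1, A2, A3 = deck(1), deck(2), deck(3)

# (a) projecting after a deck transformation changes nothing
same_projection = abs(cmath.exp(A1(w)) - cmath.exp(w)) < 1e-12
# (f) composing the maps for winding numbers 1 and 2 gives the map for 3
composition_ok = abs(A2(A1(w)) - A3(w)) < 1e-12
# (e) a non-identity deck transformation moves every point, here by 2*pi
displacement = abs(A1(w) - w)
```

In this example the group is commutative, so the order of composition in (f) cannot be seen; in general the order matters.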


The automorphisms A_γ are called cover transformations, or deck transformations
(from the German decken, to cover). A group G of automorphisms of a Riemann
surface S is said to be properly discontinuous if, given two compact subsets C1, C2
of S, the intersection g(C1) ∩ C2, g ∈ G, is empty except for finitely many g.
Proposition 6.2.5. The group G of cover transformations of S^u is properly
discontinuous.

Proof: Suppose that p1 and p2 are distinct points of S. Choose disjoint connected
neighborhoods U1, U2. Then Ũj = π^{−1}(Uj) are disjoint open sets in S^u, each of
which is invariant under G. The sets Ũj themselves are unions of disjoint preimages
Ũjα of Uj, and any element of G permutes these preimages. Therefore the intersection
under the action of an element of G on two such preimages is empty if they are
distinct, and is empty for all but the identity element if they coincide. The extension
to disjoint compact sets is immediate.


As we shall see, Aut(S^u) has a natural topology. A consequence of Proposition
6.2.5 is that no sequence of non-identity cover transformations can have the identity
transformation as limit:
Corollary 6.2.6. The group of cover transformations of S u is a discrete group.
The importance of the group G is that it allows S to be recovered from its universal
cover S u . In fact, G induces an equivalence relation in S u : two points are equivalent
if some element of G takes one to the other. The quotient space, often written as
G\S u , can be naturally identified with S: see Exercise 12.

6.3 Automorphism groups and cover transformations

We assume now the uniformization theorem of Chapter 7, so any simply connected
Riemann surface can be taken to be one of D ≅ H, C, or S. As noted in Chapter 2,
in each case, the automorphism group is a subgroup of the group of linear fractional
transformations

f(z) = (az + b)/(cz + d), ad − bc ≠ 0. (6.3.1)

We may multiply numerator and denominator by 1/√(ad − bc) and reduce to

f(z) = (az + b)/(cz + d), ad − bc = 1. (6.3.2)

This representation is still not unique: we can multiply both numerator and
denominator by −1. In group theoretic terms, this means we are looking at SL(2, C)/{±1},
the quotient of the group of 2 × 2 complex matrices of determinant 1 by the subgroup
consisting of ±1, where 1 is the identity matrix. This group is commonly written PSL(2, C).
It inherits a topology from C⁴.
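
The normalization and the remaining sign ambiguity can be checked on an example. This is a minimal sketch; the function names normalize and mobius, and the sample coefficients, are hypothetical choices made for illustration.

```python
import cmath

def normalize(a, b, c, d):
    """Scale (a, b, c, d) so that ad - bc = 1; requires ad - bc != 0."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("ad - bc must be nonzero")
    s = cmath.sqrt(det)  # either square root works; the other one gives -m
    return a / s, b / s, c / s, d / s

def mobius(coeffs, z):
    """Evaluate z -> (az + b)/(cz + d)."""
    a, b, c, d = coeffs
    return (a * z + b) / (c * z + d)

original = (2, 1j, 1, 3)
m = normalize(*original)
minus_m = tuple(-x for x in m)
z = 0.4 - 0.2j
```

Here m and minus_m are different matrices of determinant 1 that define the same transformation, which is why the natural group is the quotient by {±1}.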
Specifically, the results from Chapter 2 are the following.

Proposition 6.3.1. (a) The automorphism group Aut(C) consists of the affine
mappings f(z) = az + b, a ≠ 0.
(b) The automorphism group Aut(S) consists of all linear fractional transformations
(6.3.1).
(c) The automorphism group Aut(H) consists of the transformations (6.3.2) with
real coefficients a, b, c, d and positive determinant.
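
As a reminder of why real coefficients and positive determinant are the right conditions in (c), here is the short computation of the imaginary part of f(z). It is a standard identity, recorded as a sketch and not quoted from Chapter 2.

```latex
% For real a, b, c, d and z in H:
\operatorname{Im} f(z)
  = \operatorname{Im} \frac{(az+b)(c\bar z + d)}{|cz+d|^{2}}
  = \frac{(ad-bc)\,\operatorname{Im} z}{|cz+d|^{2}} ,
% so f maps H into H exactly when ad - bc > 0.
```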

In view of Theorem 6.2.4 and the remarks which precede it, we would like to
identify the candidates for cover transformations, as subgroups of the automorphism
group of D, C or S as the case may be.

Proposition 6.3.2. If the universal cover S^u of S is equivalent to the Riemann sphere,
then S^u = S.

Proof: Any linear fractional transformation has at least one fixed point in S. Therefore,
by Theorem 6.2.4, there are no non-identity cover transformations and therefore no
closed curves in S that are not homotopic to a constant. Therefore the construction of
S^u simply gives a bijection.


Proposition 6.3.3. Suppose that the universal cover of S is equivalent to C. The
group of cover transformations, carried over to C, is generated by either one or two
translations z → z + b.

Proof: Let G be the group of cover transformations. Every element of G is an affine
transformation f(z) = az + b. Such a transformation has a fixed point in C unless
a = 1. Thus, every non-identity cover transformation is a translation T_b z = z + b,
b ≠ 0. The group generated by a single translation is clearly discrete.
Suppose that a has minimum modulus among the values b ≠ 0 such that T_b is in G.
Let G_a denote the subgroup generated by T_a. Suppose that T_b is another element of
G. If b lies on the line through 0 and a, then some translate b′ = [T_a]^n(b) lies in the
interval from 0 to a. By the minimality assumption b′ = 0 or b′ = a, so T_b is in G_a.
If G_a ≠ G, let b have minimum modulus among the c such that T_c is in G \ G_a,
and let G_{a,b} be the group generated by T_a and T_b. The image of 0 under G_{a,b} is the
lattice

Λ = {ma + nb : m, n ∈ Z}. (6.3.3)

Translations of the parallelogram

Π = {ra + sb : 0 ≤ r, s < 1}

form a partition of C. Thus, given any point p of C, some combination of T_a and T_b
will move that point into Π, and further translation by (T_a T_b)^{−1} will move p into the
reflection −Π = {−z : z ∈ Π}. In particular, any point c in the triangle with vertices
a, b, a + b maps to a point c′ in the triangle with vertices −b, 0, −a; see Figure 6.3.
Suppose now that T_c belongs to G. The preceding observations show that some
element of G_{a,b} will move c into a point c′ that lies either in the triangle with vertices
0, a, b, or in the triangle with vertices 0, −a, −b; see Figure 6.3. The minimality
assumptions on a and b imply that c′ must be one of the vertices 0, ±a, ±b, so T_c belongs
to G_{a,b}.


Fig. 6.3 Translation by −a − b.
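
The claim that translates of the fundamental parallelogram partition C amounts to a small computation: write p = ra + sb with real r, s and subtract the integer parts. The sketch below does this; the function name and the sample periods are hypothetical choices.

```python
import math

def reduce_to_parallelogram(p, a, b):
    """Return (m, n, q) with q = p - m*a - n*b in {r*a + s*b : 0 <= r, s < 1}.

    a and b are complex periods that are not real multiples of each other.
    """
    # Solve p = r*a + s*b for real r, s (a 2x2 real linear system, Cramer's rule).
    det = a.real * b.imag - a.imag * b.real
    if det == 0:
        raise ValueError("a and b must be linearly independent over R")
    r = (p.real * b.imag - p.imag * b.real) / det
    s = (a.real * p.imag - a.imag * p.real) / det
    m, n = math.floor(r), math.floor(s)
    return m, n, p - m * a - n * b

a, b = 1 + 0j, 0.5 + 1j
m, n, q = reduce_to_parallelogram(3.7 - 2.2j, a, b)
```

For these sample values the point 3.7 − 2.2i is moved by the lattice element 4a − 3b to q = 1.2 + 0.8i, which has coordinates r = s = 0.8 with respect to a and b.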

Suppose, finally, that the covering manifold for S is the upper half-plane H. If
T is a linear fractional transformation with real coefficients, then the fixed points
come in complex conjugate pairs. Therefore the only candidates for non-identity
cover transformations are those whose fixed points are in R ∪ {∞}. This is true of
the fixed points if and only if c = 0 or (a + d)² ≥ 4; see Exercise 15. Beyond this, it
is not easy to give a simple description of a covering group. Proposition 6.2.5 gives
a necessary condition. A subgroup of Aut(H) is said to be Fuchsian if it is properly
discontinuous. Thus, any group of cover transformations is a Fuchsian group, with
the additional constraint that non-identity elements have no fixed points in H.
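
The condition on a + d comes from the quadratic equation for the fixed points. The following computation is a sketch under the normalization ad − bc = 1 of (6.3.2), with c ≠ 0.

```latex
% Fixed points of f(z) = (az+b)/(cz+d), with a, b, c, d real, ad - bc = 1, c \ne 0:
z = \frac{az+b}{cz+d}
  \iff cz^{2} + (d-a)z - b = 0 ,
\qquad
(d-a)^{2} + 4bc = (a+d)^{2} - 4(ad-bc) = (a+d)^{2} - 4 .
% The roots are real exactly when (a+d)^2 \ge 4; otherwise they form a
% conjugate pair, one member of which lies in H.
```

When c = 0 the transformation is affine, and its fixed points are ∞ together with at most one real point.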
Conversely, it can be shown that any such group G is the group of covering
transformations of a Riemann surface S = G\H. The points of S are equivalence
classes of points of H, where two points zj ∈ H are equivalent if and only if z2 =
g(z1) for some g ∈ G, i.e. they belong to the same orbit of G. The proof is left as
Exercise 13.
Some standard examples of Fuchsian groups are G0 = SL(2, Z), the group of
linear fractional transformations (2.1.1) having integer coefficients, and Gp, the
subgroup of G0 consisting of those transformations equal to the identity 1 modulo p, p
a prime, i.e.

b, c ≡ 0 mod p and a ≡ d ≡ 1 mod p or a ≡ d ≡ −1 mod p.

There are non-identity elements of G 0 that have fixed points in H, but this is not the
case for the non-identity elements of G p ; see Exercise 17.
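
A brute-force search over small matrices gives some evidence for this before attempting Exercise 17. The sketch relies on the criterion that, for ad − bc = 1, a non-identity real transformation has a fixed point in H exactly when |a + d| < 2. The search bound 6 and the primes tested are arbitrary choices, and a finite search is of course not a proof.

```python
from itertools import product

def unimodular(N):
    """Integer matrices (a, b, c, d) with entries in [-N, N] and ad - bc = 1."""
    rng = range(-N, N + 1)
    return [(a, b, c, d) for a, b, c, d in product(rng, repeat=4)
            if a * d - b * c == 1]

def has_fixed_point_in_H(m):
    """With ad - bc = 1, there is a fixed point in H exactly when |a + d| < 2."""
    a, b, c, d = m
    return abs(a + d) < 2

def in_G_p(m, p):
    """Membership in G_p: congruent to +identity or -identity modulo p."""
    a, b, c, d = m
    plus = (a - 1) % p == 0 and (d - 1) % p == 0
    minus = (a + 1) % p == 0 and (d + 1) % p == 0
    return b % p == 0 and c % p == 0 and (plus or minus)

mats = unimodular(6)
elliptic_in_G0 = [m for m in mats if has_fixed_point_in_H(m)]
elliptic_in_Gp = {p: [m for m in elliptic_in_G0 if in_G_p(m, p)] for p in (2, 3, 5)}
```

The list elliptic_in_G0 contains, for instance, (0, −1, 1, 0), the map z → −1/z that fixes i, while the lists for p = 2, 3, 5 come out empty in this range.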

Exercises

1. Show that the curve (6.1.1) has a natural structure as a Riemann surface. Hint:
for each finite point (z 0 , w0 ), show that either φ(z, w) = ε(z − z 0 ) or φ(z, w) =
ε(w − w0 ) maps a neighborhood onto D; this leaves (∞, ∞) to be considered.
2. Show that D, C, and S are inequivalent Riemann surfaces.
3. If γ is a curve, show that γ · γ −1 is homotopic to a constant curve.
4. If γ1 , γ2 , γ3 are curves, show that γ1 · (γ2 · γ3 ) ∼ (γ1 · γ2 ) · γ3 .
5. Fill in the details to prove that the equivalence classes of curves in a Riemann
surface S form a group.
6. Show that if S^u is a universal cover of S, then it is a cover for any cover S̃ of S.
Show that S^u is unique up to equivalence.
7. Show that the set of equivalence classes {[U, γ ]} introduced in the proof of
Theorem 6.2.3 has the properties (a), (b), (c) at the beginning of Section 6.1.
8. Show that the universal cover of the punctured plane C \ {0} can be taken to be
the upper half-plane H. Hint: start with the representation

{z : z = r eiθ , r > 0, θ ∈ R}.

9. The universal cover constructed in Section 6.2 involves the choice of a point p0 .
Show that the choice of any other point as starting point gives rise to the same
equivalence classes and the same conformal structure.

10. (a) Suppose that S is a Riemann surface and f : S → C is a meromorphic
function. Show that f lifts to a meromorphic function f̃ on the universal cover
S^u, i.e. f ◦ π = π ◦ f̃.
(b) Conversely, suppose g : S^u → C is meromorphic. Under what condition is
g the lift f̃ of a meromorphic function on S?
11. Prove that if γ is a curve from p0 to p, and γ̃ is the lift of γ to S^u that begins
at p̃0, then γ̃ ends at [p, γ].
12. Suppose that G is the group of cover transformations of a Riemann surface S.
Show that the points of S are equivalence classes of points of S u , where such
points are equivalent if and only if they belong to the same orbit of G.
13. Prove that for any fixed-point-free Fuchsian group G, the space G\H has a
complex structure such that the projection π taking z to its equivalence class [z],
is locally conformal.
14. Using Proposition 6.3.3, discuss the determination of the equivalence classes
of Riemann surfaces with universal cover C, where “equivalent” means being
related by a holomorphic bijection.
15. Verify that T ∈ Aut(H) with real coefficients has no fixed point in H if and only
if c = 0 or (a + d)² ≥ 4.
16. Verify that G p , the subgroup of Aut(H) consisting of transformations with integer
coefficients that are equal to 1 modulo the prime p, is a group.
17. Show that G 0 , the subgroup of Aut(H) consisting of transformations with integer
coefficients, has non-identity elements with fixed point in H, but G p does not.
18. Show that the group of conformal self-maps of a compact Riemann surface is
finite.

Remarks and further reading

The definitive formulation of the general concept of a Riemann surface, and the basic
theory of such surfaces, go back to Weyl [214]. There are many modern treatments,
e.g. Donaldson [56], Schlag [185]. Farkas and Kra [79] is particularly comprehensive.
Siegel [191], [192] has an efficient treatment of covering spaces and the basics of
automorphic function theory.
Chapter 7
The uniformization theorem

This chapter is devoted to the proof of the uniformization theorem, and a discussion
of its consequences. The theorem says that a simply connected Riemann surface is
biholomorphically equivalent either to the unit disk D (or, equivalently, the upper
half-plane H), the complex plane C, or the Riemann sphere S. As shown in Chapter
6, every Riemann surface has a simply connected cover and is invariant under certain
automorphisms of the cover. Therefore the uniformization theorem opens the way to
a trove of information about general Riemann surfaces.
The first theorem of this type is the theorem known as the Riemann mapping
theorem: a simply connected domain U in C whose boundary consists of more than
one point is biholomorphically equivalent to the unit disk D. Riemann’s argument
assumed that U was a Jordan domain, that is, a domain bounded by a simple closed curve
Γ = ∂U. On physical grounds, given a point p0 ∈ U there should be a point
potential g = g(p, p0): a function harmonic in U \ {p0} that vanishes on Γ and has a
singularity like log r near p0, where r(p) = |p − p0|. Then g is the real part of a
function f that is holomorphic on U , and the function F = exp(− f ) would map
U biholomorphically onto D. Riemann’s argument was not a proof; see Section 5.5.
However the idea can be made a proof by making more direct use of the solvability
of the Dirichlet problem: see Exercises 13 – 15.
The proof of the Riemann mapping theorem that is usually presented now looks for
F directly as the solution of a certain extremal problem. However one of the standard
approaches to the general problem goes directly back to constructing a harmonic
function u with a singularity – either like log r or like Re (1/z). The particular
version we follow in this chapter is known as the Perron method. In the case of a
singularity like log r , finding a harmonic conjugate v to u and exponentiating u + iv
leads to a conformal map onto D. In the case of a singularity like 1/z, u + iv itself
leads to a conformal map onto C or S.
Sections 7.1 and 7.2 treat the hyperbolic case: S^u ≅ H, singularity like log r.
Sections 7.3 and 7.4 treat the remaining cases: parabolic (S^u ≅ C) and elliptic (S^u ≅ S).

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 137
R. Beals and R. S. C. Wong, More Explorations in Complex Functions, Graduate Texts
in Mathematics 298, https://doi.org/10.1007/978-3-031-28288-1_7

7.1 Green’s functions and harmonic measure

In this section we deal with a simply connected open Riemann surface S. It is
convenient to introduce some terminology. If p0 is a point of S, a coordinate map z defined
in a neighborhood U of p0 is said to be a standard coordinate at p0 if z(p0) = 0
and U contains the set {p : |z(p)| ≤ 1} as a compact subset. With z understood, we
let Dr = Dr(p0), 0 < r ≤ 1, be the disk {p : |z(p)| < r}. By a coordinate disk in S
we mean any such Dr(p0), 0 < r < 1, always with the assumption that the closure
is compact in S.
A real-valued function u defined in an open subset U ⊂ S is said to be harmonic
if each point p ∈ U has a neighborhood in which u is the real part of a holomorphic
function. This is equivalent to saying that for any coordinate disk D ⊂ U, there is a
real-valued function v such that u + iv is holomorphic in D. Any such function v is
called a harmonic conjugate of u.
As in the case S = C of Chapter 5, a real-valued function v defined in an open
set U ⊂ S is said to be subharmonic if for each coordinate disk D whose closure D̄
is in U, if u is harmonic in D and continuous on the closure, and v ≤ u on ∂D, then
v ≤ u in D.
We are now in a position to carry over from Section 5.3 the concept of a Perron
family: a non-empty family F of subharmonic functions such that
(a) if u and v belong to F then so does the maximum u ∨ v;
(b) if u belongs to F , so does each harmonic regularization of u.
The first example that is of interest here is the following. Given p0 ∈ S, let the
family F(p0) consist of all non-negative functions u, subharmonic in the punctured
surface S′ = S \ {p0}, such that
(a) u(p) = 0 if p is in the complement of some compact set K (that depends on u);
(b) u(p) + log |z(p)| is bounded in a coordinate neighborhood centered at p0.
It is easily seen that if F(p0) is non-empty, then it is a Perron family. But we have

Lemma 7.1.1. For each p0 ∈ S, the family F ( p0 ) is non-empty.

Proof. Let z be a coordinate centered at p0 such that the closure of D1(p0) is compact
in S. Define

u(p) = −log |z(p)|, if |z(p)| ≤ 1;
u(p) = 0, otherwise.

Then u belongs to F(p0).

Remark. The preceding concepts are conformally invariant: see Exercises 1 – 4.


If the supremum u of the Perron family F(p0) is finite, then u is called a Green’s
function for S with pole at p0, and is denoted g(p, p0).

Theorem 7.1.2. A Green’s function g(p, p0) on S has the properties
(a) g(·, p0) is harmonic on S \ {p0};
(b) g(p, p0) > 0;
(c) g( p, p0 ) + log |z( p)| has a harmonic extension to a neighborhood of p0 , where
z is a standard coordinate at p0 ;
(d) inf p {g( p, p0 )} = 0.

Proof. Property (a) is Perron’s principle, and property (b) follows from the fact that
u ≡ 0 belongs to F ( p0 ), together with the strict maximum principle for −g(·, p0 ).
(c) Let h denote the maximum value of g(p, p0) for |z(p)| = 1. Given p
with |z(p)| < 1, choose a sequence {u_n} in F(p0) such that u_n is nondecreasing
and u n ( p) → g( p, p0 ). Let vn be the harmonic function equal to u n + log |z| at
|z| = 1. By Lemma 5.2.3, u n + log |z| ≤ vn for 0 < |z| < 1. By Theorem 5.2.2,
the vn converge to a function v, harmonic for |z| < 1 and ≤ g for |z| = 1. There-
fore v ≤ v̄, where v̄ is the harmonic function equal to g for |z| = 1. Moreover
v( p) = g( p, p0 ) + log |z( p)|. Since p was arbitrary, g(·, p0 ) + log |z| ≤ v̄. But also
g(·, p0 ) ≥ v, so p0 is a removable singularity.
(d) Let c = inf_p g(p, p0) and suppose u ∈ F(p0). By assumption, u ≤ 0 ≤ g − c
outside some compact subset. Moreover u + log |z| is bounded in a neighborhood of
p0. We know that g + log |z| is also bounded near p0. Therefore, for each ε > 0,
(1 − ε)u ≤ g − c in a small enough neighborhood of p0. Since g is harmonic and (1 − ε)u is
subharmonic, this implies that (1 − ε)u ≤ g − c everywhere. Therefore (1 − ε)g ≤
g − c everywhere, so c = 0.
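
For S = D the properties (a) to (d) can be checked numerically against the classical closed form g(p, p0) = −log |(p − p0)/(1 − p̄0 p)|. That formula is an assumption of this sketch (it is the standard Green’s function of the unit disk and is not derived in this section); the sample points are arbitrary.

```python
import cmath
import math

def green_disk(p, p0):
    """Classical Green's function of the unit disk with pole at p0 (assumed closed form)."""
    return -math.log(abs((p - p0) / (1 - p0.conjugate() * p)))

p0 = 0.3 + 0.2j

# (b) positivity at a few interior points
interior = [r * cmath.exp(1j * t) for r in (0.2, 0.5, 0.9) for t in (0.1, 1.3, 2.9, 4.4)]
positive = all(green_disk(p, p0) > 0 for p in interior)

# (c) g + log|p - p0| stays bounded near the pole; its limit is log(1 - |p0|^2)
near_pole = p0 + 1e-8
regular_part = green_disk(near_pole, p0) + math.log(abs(near_pole - p0))

# (d) g tends to 0 at the boundary, so the infimum is 0
near_boundary = max(green_disk((1 - 1e-9) * cmath.exp(1j * t), p0)
                    for t in (0.0, 1.0, 2.0, 3.0))

# (a) mean value property on a small circle away from the pole
q, rad, n = -0.4 + 0.1j, 0.05, 360
mean = sum(green_disk(q + rad * cmath.exp(2j * math.pi * k / n), p0)
           for k in range(n)) / n
```

The mean over the small circle agrees with the value at its center, as it must for a harmonic function.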
If K is a non-empty subset of S having compact closure, let H (K ) denote the
family of subharmonic functions u defined on S \ K such that 0 ≤ u ≤ 1, and u
vanishes outside some compact set. Let u K = sup{u; u ∈ H (K )}. Then 0 ≤ u K ≤ 1
on the complement of K . If u K is not identically 0 or 1, then K is said to have harmonic
measure u K . Since u K cannot attain its supremum or infimum, it follows that in this
case 0 < u K < 1.

Lemma 7.1.3. If ∅ ≠ K1 ⊂ K2 and the closure of K2 in S is compact, then

u_{K1}|_{S\K2} ≤ u_{K2}. (7.1.1)

Proof. The restriction to S \ K2 of an element of H(K1) belongs to H(K2), which
implies (7.1.1).

Lemma 7.1.4. If D1 is a standard coordinate neighborhood of p0 and 0 < r < 1,
then u_{Dr} is not identically zero, and has limit 1 as p → ∂Dr.

Proof. Choose r < s < 1 and define

u(p) = log(s/|z(p)|) / log(s/r), p ∈ Ds \ Dr;
u(p) = 0, p ∉ Ds(p0).

Then u belongs to H(Dr) and u > 0 on Ds \ Dr. Moreover, u(p) → 1 as p → ∂Dr.
But 1 ≥ u_{Dr} ≥ u.
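
The comparison functions in this proof also suggest why harmonic measure can fail to exist. The sketch below evaluates them at a fixed point as the outer radius s grows, in two model situations chosen here for illustration: S = D, where the coordinate forces s < 1, and S = C, where s may be taken arbitrarily large. The numbers only indicate the trend and do not replace the arguments of this chapter.

```python
import math

def u_s(rho, r, s):
    """log(s/rho)/log(s/r) for r <= rho < s, and 0 for rho >= s."""
    if rho >= s:
        return 0.0
    return math.log(s / rho) / math.log(s / r)

r, rho = 0.25, 0.5   # inner radius and the radius of the evaluation point

in_disk = [u_s(rho, r, s) for s in (0.6, 0.9, 0.99, 0.999999)]
in_plane = [u_s(rho, r, s) for s in (1e1, 1e3, 1e6, 1e12, 1e100)]
disk_limit = math.log(1 / rho) / math.log(1 / r)   # limit as s -> 1
```

In the disk the values increase to log(1/ρ)/log(1/r) = 1/2, which is strictly less than 1, while in the plane they creep up toward 1, the situation in which the supremum is identically 1 and there is no harmonic measure.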
The following is an easy consequence; see Exercise 5.
Corollary 7.1.5. If K has harmonic measure and K′ ⊂ K has non-empty interior,
then K′ has harmonic measure.
Proposition 7.1.6. Suppose that some coordinate disk D in S has harmonic
measure. Then S has a Green’s function with pole in D.
Proof. Let D be D1 ( p0 ) with respect to a local coordinate z. Choose 0 < r < s < 1
and let Dr = Dr ( p0 ), Ds = Ds ( p0 ). By Corollary 7.1.5, Dr has harmonic measure
ur .
Suppose that v belongs to the Perron family F ( p0 ). Let c be the maximum value
for v on ∂ Dr . Then v ≤ c = cu r on ∂ Dr and v ≡ 0 at ∞, so

v ≤ cu r on the complement of Dr . (7.1.2)

Now choose ε > 0 and consider

w = v + (1 + ε) log |z| for |z| ≤ s.

By assumption, v + log |z| is bounded as z → 0. Since ε log |z| → −∞ as z → 0,
it follows that the maximum of w is attained on ∂Ds. Therefore, on ∂Ds,

c + (1 + ε) log r ≤ c u_r + (1 + ε) log s,

so

c ≤ [(1 + ε)/(1 − u_r)] log(s/r) on ∂Ds.

Taking ε → 0 we get

max_{|z|≤s} [v + log |z|] ≤ (1 − max_{|z|=s} u_r)^{−1} log(s/r)

on Ds. It follows that sup v, v ∈ F(p0), is finite for 0 < |z| < s, hence finite
everywhere.

We shall want a partial converse.


Proposition 7.1.7. Suppose that S has a Green’s function g with pole at p0 and
suppose that D is a disk contained in the set

{p : 0 < ε < |z(p)| < r < 1},

where z is a standard coordinate at p0. Then D has harmonic measure.

Proof. Each function v in H(D) is bounded near ∂D, and on the complement of some
compact set, by the harmonic function g/m + ε, where m is the minimum of g on
∂D and ε > 0 is arbitrary. Therefore the supremum u is bounded by g/m + ε. The
infimum 0 of g/m is not attained, so there is some point p ∈ S \ D where g/m is
< ε. Therefore each v is < 2ε at p. It follows that the supremum u is ≤ 2ε at p and
therefore not identically 1.

Theorem 7.1.8. Suppose that S has a Green’s function at a point p0 . Then S has a
Green’s function at every point of S.

Proof. It follows from Proposition 7.1.7 and Proposition 7.1.6 that every point in a
standard coordinate neighborhood of p0 is the pole of a Green’s function. Therefore
the set of points that can serve as poles of Green’s functions is both open and closed,
hence is all of the connected set S. 

Remark. The reason for the terminology “Green’s function” here is that if L is a
linear differential operator, a Green’s function for L is a function G such that

u(x) = ∫_Ω G(x, y) ϕ(y) dy

is a solution of Lu = ϕ (subject to some conditions at the boundary of the domain
of definition Ω). In particular, for L = Δ, the Laplacian in R², the Green’s function
is

G(x, y) = −log |x − y| / (2π), x, y ∈ R².

7.2 Uniformization: the hyperbolic case

A Riemann surface that carries a Green’s function is said to be hyperbolic.

Theorem 7.2.1. (Uniformization, part I). If S is a simply connected hyperbolic
Riemann surface, then there is a conformal map of S onto the disk D.

Proof. Let g = g(·, p0) be the Green’s function with pole at p0, and let z be a
standard coordinate at p0. By Theorem 7.1.2 (c), g + log |z| is harmonic near z = 0.
Let h_{p0} be a harmonic conjugate defined in D1, with h_{p0}(0) = 0. Define f_{p0} for
|z| < 1 by

f_{p0} = z exp(−g − log |z| − i h_{p0}).

This function is bounded near p0 and the exponential factor has a non-zero limit at
p0.
Given p ≠ p0, choose a standard coordinate at p, with p0 ∉ D1(p). Let h be a
harmonic conjugate for g in D1(p). Note that h is unique up to an additive constant,
so f_p = exp(−g − i h) is unique up to a multiplicative constant of modulus 1. If two such
neighborhoods overlap, the second constant (say) can be adjusted so that exp(−g − i h)
is holomorphic on the union of the two neighborhoods. Similarly, if D1(p) and
D1(p0) overlap, then the modulus of the quotient f_{p0}/f_p is

| z exp(−g − log |z| − i h_{p0}) / exp(−g − i h_p) | = 1,

so the quotient is a constant of modulus 1.


It follows that f_{p0} can be continued along each curve in S. Since S is assumed to
be simply connected, the continuation is a single-valued holomorphic map f(p, p0)
of S into D. The next step is to show that f(p, p0) is injective. Suppose that p1 ≠ p0
and p ≠ p0. Set

F(p) = T(f(p, p0)), where T(w) = (w − a)/(1 − ā w) and a = f(p1, p0).

The linear fractional transformation T maps D to itself, so F is holomorphic on S. Let
z be a standard coordinate at p1. Since F(p1) = 0, it follows that log |F(p)| ∼ log |z|
near p1. Suppose now that v belongs to the Perron family F(p1) that defines g(p, p1).
Take ε > 0 and consider the subharmonic function

v + (1 + ε) log |F( p)|. (7.2.1)

Near p1 this is similar to ε log |z|, so it has limit −∞. But by assumption v vanishes
near ∞, so by the maximum principle

v + (1 + ε) log |F( p)| ≤ 0.

Taking the supremum over v ∈ F ( p1 ) and letting ε → 0, we have

g( p, p1 ) + log |F( p)| ≤ 0. (7.2.2)

Exponentiating,

|F(p)| ≤ exp(−g(p, p1)) = |f(p, p1)|.

But F( p0 ) = − f ( p1 , p0 ), so

| f ( p1 , p0 )| ≤ | f ( p0 , p1 )|.

Since p0 and p1 are interchangeable in this argument,

| f ( p0 , p1 )| = | f ( p1 , p0 )|.

Then (7.2.2) gives

g(p0, p1) + log |F(p0)| = 0.

The left-hand side, regarded as a function of p in place of p0, is ≤ 0 by (7.2.2) and
takes the value 0 at p = p0, so by the strict maximum principle it is constant, hence
identically zero:

g(p, p1) + log |F(p)| = 0. (7.2.3)

Now F(p) = 0 when f(p, p0) = f(p1, p0), and (7.2.3) shows that this implies that
p = p1. Thus f(p, p0) is injective.
We have now shown that f ( p) = f ( p, p0 ) is a conformal map from S to a (nec-
essarily simply connected) subset of D. By the Riemann mapping theorem, there is a
conformal map from the image f (S) onto D. 
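
In the model case S = D the objects in this proof have closed forms, and the identity (7.2.3) can be tested directly. The sketch assumes the classical formulas f(p, p0) = (p − p0)/(1 − p̄0 p) and g = −log |f| for the unit disk; these are not derived in the text, and the sample points are arbitrary.

```python
import math

def f(p, p0):
    """A conformal map of the unit disk onto itself sending p0 to 0 (assumed closed form)."""
    return (p - p0) / (1 - p0.conjugate() * p)

def g(p, p0):
    """Green's function of the unit disk, g = -log|f|."""
    return -math.log(abs(f(p, p0)))

p0, p1, p = 0.3 + 0.2j, -0.5 + 0.1j, 0.1 - 0.6j

# F = T(f(., p0)), with T the disk automorphism sending a = f(p1, p0) to 0
a = f(p1, p0)
F = (f(p, p0) - a) / (1 - a.conjugate() * f(p, p0))

residual = g(p, p1) + math.log(abs(F))           # (7.2.3): should vanish
symmetry_gap = abs(f(p0, p1)) - abs(f(p1, p0))   # |f(p0, p1)| = |f(p1, p0)|
```

Both quantities vanish up to rounding, in agreement with (7.2.3) and with the symmetry used just before it.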
It follows from Theorem 7.2.1 that a simply connected open Riemann surface that
is hyperbolic, i.e. carries a Green’s function, also carries a non-constant bounded
harmonic function. In the next section we will have use for the converse.
Proposition 7.2.2. If S carries a non-constant real-valued bounded harmonic func-
tion, then S has a Green’s function.
Proof. Take u to be a harmonic function with sup u = 1, inf u = 0, and use the proof
of Proposition 7.1.7 with u in place of g/m. 

7.3 An analogue of the Green’s function

We assume throughout this section that S is a simply connected Riemann surface
that is not hyperbolic, i.e. S has no Green’s function. Such a surface is said to be
parabolic if it is open, i.e. not compact. It is said to be elliptic if it is compact. As in
the case of conic sections, these can be thought of as limiting cases of the hyperbolic
case; see Exercise 12.
Proposition 7.3.1. If S is parabolic then:
(a) no coordinate disk in S has harmonic measure;
(b) S carries no non-constant bounded harmonic functions;
(c) for any non-empty compact subset K, the maximum principle holds in S \ K, in
the sense that if u is bounded above and harmonic in S \ K, then

sup_{p ∈ S\K} u(p) = lim sup_{p→∂K} u(p).

Proof. Parts (a) and (b) follow from Proposition 7.1.6 and Proposition 7.2.2. For (c),
as before we let H(K) be the Perron family consisting of subharmonic functions v on
S \ K, such that 0 ≤ v ≤ 1, v is not identically 0, and v vanishes outside some
compact set. Suppose that u is harmonic in S \ K, 0 ≤ u ≤ 1, and lim sup_{p→∂K} u(p) = 0.
Then

lim sup_{p→∂K} [u(p) + v(p)] ≤ 1, lim sup_{p→∞} [u(p) + v(p)] ≤ 1.

Therefore u + v ≤ 1. If K does not have harmonic measure, then v can be chosen to be
arbitrarily close to 1. Therefore u ≤ 0, and we have proved the maximum principle for
S \ K.

In place of a Green’s function – a harmonic function with a pole like log (1/r ) in
some coordinate neighborhood of a point p0 – we look for a harmonic function on
S with a pole like Re (1/z).
We could take a corresponding Perron family to be the family of functions v
that are subharmonic on S \ { p0 }, vanish outside some compact set, and such that
v − Re (1/z) is bounded, where z is a standard coordinate at p0 . However it is far
from obvious that there are any such functions. Instead, the basic idea of the proof
is to construct a harmonic function with a singularity at a point p0 by working with
a family of functions defined outside successively smaller coordinate disks centered
at p0 .

Lemma 7.3.2. Let z be a standard coordinate at p0 ∈ S. Given 0 < ρ < 1, there is
a unique bounded harmonic function u_ρ on S \ D_ρ that is equal to Re (1/z) on ∂D_ρ.

Proof. In the compact case, u ρ is simply the solution of the corresponding Dirichlet
problem. If S is not compact, we let G be the family of subharmonic functions on
S \ Dρ that are continuous, bounded above by Re (1/z) at the boundary, and van-
ish outside some compact set. The supremum u ρ is harmonic and bounded above
by Re (1/z). On the other hand, G contains the function v obtained by solving the
Dirichlet problem with value Re (1/z) on ∂ Dρ and 0 on ∂ D1 , extended to be zero
outside ∂ D1 , so u ρ is continuous and has the correct boundary value on ∂ Dρ . Bound-
edness and uniqueness follow from the maximum principle for S \ Dρ , applied to u
and to −u. 

The next sequence of lemmas aims to estimate the behavior of u_ρ as ρ → 0,
starting with the oscillation on the circle |z| = r, ρ ≤ r ≤ 1:

\[ M_r(u_\rho) = \max_{|z|=r} u_\rho - \min_{|z|=r} u_\rho. \]

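Lemma 7.3.3. Let v be the solution to the Dirichlet problem on D with boundary
values

\[ v(e^{i\theta}) = \begin{cases} 1, & 0 < \theta < \pi; \\ -1, & \pi < \theta < 2\pi. \end{cases} \]

Then

\[ |v(re^{i\theta})| \le c_0(r), \quad \text{where } c_0(r) \le 1 \text{ and } c_0(r) = O(r) \text{ as } r \to 0. \tag{7.3.1} \]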

Proof. Since v(−z) + v(z) = 0 and v is positive for Re z > 0, it is enough to bound
v(z) for Re z > 0. The Poisson integral formula 5.1.6 here becomes
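
\[
\begin{aligned}
v(re^{i\theta}) &= \frac{1}{2\pi} \int_0^{\pi} \left[ \frac{1 - r^2}{1 - 2r\cos(\theta - \varphi) + r^2} - \frac{1 - r^2}{1 - 2r\cos(\theta + \varphi) + r^2} \right] d\varphi \\
&= \frac{1}{2\pi} \int_0^{\pi} \frac{(1 - r^2)\, 4r \sin\theta \sin\varphi}{(1 - 2r\cos(\theta - \varphi) + r^2)(1 - 2r\cos(\theta + \varphi) + r^2)} \, d\varphi.
\end{aligned}
\]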

The integrand is bounded by 4(1 + r)r(1 − r)^{−3} sin θ sin ϕ. We also know from the
maximum principle that v(z) < 1 for Re z > 0, so integrating gives (7.3.1) with

\[ c_0(r) = \min\left\{ 1, \; \frac{4r(1 + r)\,|\sin\theta|}{(1 - r)^3} \right\}. \tag{7.3.2} \]
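The estimate of Lemma 7.3.3 can be checked numerically. The following is a minimal sketch, not part of the text: it evaluates the Poisson integral above by the midpoint rule and compares it with the closed form (2/π) arg((1 + z)/(1 − z)), which is the bounded harmonic function on D equal to 1 on the upper semicircle and −1 on the lower one. The function names `poisson_v`, `closed_form_v` and `c0_bound` are hypothetical, and the closed form is a standard fact brought in for comparison rather than something stated in the proof.

```python
import cmath
import math


def poisson_v(r, theta, n=20000):
    """Midpoint-rule value of (1/2pi) * int_0^pi [P(theta - phi) - P(theta + phi)] dphi."""
    def kernel(t):
        # Poisson kernel for the unit disk at radius r
        return (1 - r * r) / (1 - 2 * r * math.cos(t) + r * r)

    h = math.pi / n
    total = 0.0
    for k in range(n):
        phi = (k + 0.5) * h
        total += kernel(theta - phi) - kernel(theta + phi)
    return total * h / (2 * math.pi)


def closed_form_v(r, theta):
    """Harmonic function with boundary values +1 / -1 on the upper / lower semicircle."""
    z = cmath.rect(r, theta)
    return (2 / math.pi) * cmath.phase((1 + z) / (1 - z))


def c0_bound(r, theta):
    """Right-hand side of (7.3.2)."""
    return min(1.0, 4 * r * (1 + r) * abs(math.sin(theta)) / (1 - r) ** 3)


if __name__ == "__main__":
    for r in (0.01, 0.1, 0.5):
        print(r, poisson_v(r, 1.0), closed_form_v(r, 1.0), c0_bound(r, 1.0))
```

For small r the computed values shrink linearly in r, in line with c0(r) = O(r).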

Lemma 7.3.4. Suppose that u is harmonic for r0 < |z| < 1, continuous for r0 ≤
|z| ≤ 1, and constant for |z| = r0 . Let

\[ M_r(u) = \max_{|z|=r} u(z) - \min_{|z|=r} u(z), \qquad r_0 \le r \le 1. \]

Then

\[ M_r(u) \le c(r)\, M_1(u), \tag{7.3.3} \]

where c(r ) = π c0 (r ), with c0 given by (7.3.2).

Proof. Up to a rotation and multiplication by a constant, we may assume that
M_1(u) = 1 and, for a given value of r, that the maximum and minimum values
of u(z) for |z| = r occur at complex conjugate points z0 and z̄0 respectively. The
function ũ(z) = u(z) − u(z̄) is harmonic in the intersection U of the annulus with
the upper half-plane, continuous on the closure of U, equal to zero on the lower
boundary and is ≤ 1 on the upper boundary. See Figure 7.1.
Let v be the function of Lemma 7.3.3. Then v ≥ 0 on the lower boundary of U,
so ũ ≤ v on U and (7.3.1) applies.

Fig. 7.1 The domain U.


The next several lemmas are aimed at estimating the mean value and the oscilla-
tion, and therefore the size, of u ρ − Re (1/z). The goal is to show convergence of
u ρ − Re (1/z), as ρ → 0, to a harmonic function with the desired singularity at p0 .
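We begin with Green’s identity for functions u, v that are smooth on the closure
of a bounded domain U ⊂ S having smooth boundary:

\[ \iint_U [v\,\Delta u - u\,\Delta v] \, dm = \int_{\partial U} \left( v \frac{\partial u}{\partial n} - u \frac{\partial v}{\partial n} \right) ds, \tag{7.3.4} \]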

where dm is the area measure, ∂/∂n is the outer normal derivative, and ds is arc-length
measure on ∂U. If u and v are harmonic, this becomes

\[ \int_{\partial U} \left( v \frac{\partial u}{\partial n} - u \frac{\partial v}{\partial n} \right) ds = 0. \tag{7.3.5} \]

In particular, suppose that u is harmonic and v ≡ 1, and U is an annulus, in a standard
coordinate, bounded by the circles |z| = r1 and |z| = r2, r1 < r2. Then (7.3.5) implies

\[ \int_0^{2\pi} \frac{\partial u}{\partial r}(r_1 e^{i\theta}) \, d\theta = \int_0^{2\pi} \frac{\partial u}{\partial r}(r_2 e^{i\theta}) \, d\theta. \tag{7.3.6} \]

Lemma 7.3.5. The function u_ρ of Lemma 7.3.2 satisfies

\[ \int_0^{2\pi} \frac{\partial u_\rho}{\partial r}(re^{i\theta}) \, d\theta = 0. \tag{7.3.7} \]

Proof. Suppose that D1 ⊂ K , where K has compact closure and smooth boundary.
Then each point of ∂ K is regular, so Dr has harmonic measure v in K , namely
the solution of the Dirichlet problem that is 1 on ∂ Dr and 0 on ∂ K . Then (7.3.5)
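specializes to

\[ \int_{\partial D_\rho} \left( v \frac{\partial u_\rho}{\partial n} - u_\rho \frac{\partial v}{\partial n} \right) ds = \int_{\partial K} \left( v \frac{\partial u_\rho}{\partial n} - u_\rho \frac{\partial v}{\partial n} \right) ds. \tag{7.3.8} \]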

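By the maximum principle, u_ρ ≤ 1/ρ. Therefore (7.3.8) leads to

\[ \left| \int_{\partial D_\rho} \frac{\partial u_\rho}{\partial n} \, ds \right| \le \frac{1}{\rho} \left| \int_{\partial D_\rho} \frac{\partial v}{\partial n} \, ds \right| + \frac{1}{\rho} \left| \int_{\partial K} \frac{\partial v}{\partial n} \, ds \right|. \tag{7.3.9} \]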

We know that ∂D_r does not have harmonic measure, so as we take larger sets K′ ⊃ K
the harmonic measures v_{K′} for ∂D_r on K′ increase to 1 uniformly on K. Replacing
v by v_{K′} we find that

\[ \int_{\partial D_\rho} \frac{\partial u_\rho}{\partial n} \, ds = 0. \]

For the standard coordinate z, this is (7.3.7) at r = ρ. By (7.3.6), the integral in (7.3.7)
is independent of r . 

Lemma 7.3.6. The mean value of u_ρ over any circle {z : |z| = r}, 0 < r < 1, is
zero.

Proof. By (7.3.7), it is enough to prove this at r = ρ. On ∂D_ρ, u_ρ = Re (1/z), which
has mean value zero.
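Two elementary facts about Re (1/z) are used in this section: its mean over a circle |z| = r is zero, and its oscillation on that circle is 2/r. The following minimal sketch checks both by sampling the circle at equally spaced points; the helper name `mean_and_oscillation` is hypothetical.

```python
import cmath
import math


def mean_and_oscillation(r, n=3600):
    """Sample Re(1/z) at n equally spaced points of |z| = r.

    Returns (mean value, max - min); on the circle Re(1/z) = cos(theta) / r.
    """
    values = [(1 / cmath.rect(r, 2 * math.pi * k / n)).real for k in range(n)]
    return sum(values) / n, max(values) - min(values)


if __name__ == "__main__":
    for r in (0.25, 1.0):
        print(r, mean_and_oscillation(r))
```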

Lemma 7.3.7. As ρ → 0, u_ρ tends to a function u that is harmonic on S \ {p0}, and
bounded on the complement of each neighborhood of p0. Moreover in a standard
coordinate at p0,

\[ \lim_{z \to 0} \left[ u - \mathrm{Re}\, \frac{1}{z} \right] = 0. \]

Proof. By Lemma 7.3.4 applied to u − Re (1/z), the oscillation satisfies

\[ M_r\left( u - \mathrm{Re}\, \frac{1}{z} \right) \le c(r)\, M_1\left( u - \mathrm{Re}\, \frac{1}{z} \right). \tag{7.3.10} \]

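Now M_r(Re (1/z)) = 2/r, and the maximum principle implies that M_1(u_ρ) ≤
M_r(u_ρ). Therefore

\[
\begin{aligned}
M_1(u_\rho) - \frac{2}{r} &\le M_r\left( u_\rho - \mathrm{Re}\, \frac{1}{z} \right) \\
&\le c(r)\, M_1\left( u_\rho - \mathrm{Re}\, \frac{1}{z} \right) \\
&\le c(r)\, [M_1(u_\rho) + 2].
\end{aligned}
\]

Choose r1 so that c(r1) < 1. Then

\[ M_1(u_\rho) \le 2\, \frac{c(r_1) + 1/r_1}{1 - c(r_1)} = C_1, \tag{7.3.11} \]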

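independent of ρ < r1. Returning to Lemma 7.3.4, we see that by the scaling z →
z/r1 we obtain a standard coordinate at p0, and a corresponding version of (7.3.3)
with c1(r) = c(r/r1) as the multiplier. Therefore for ρ < r < 1,

\[
\begin{aligned}
M_r\left( u_\rho - \mathrm{Re}\, \frac{1}{z} \right) &\le c\left( \frac{r}{r_1} \right) M_1\left( u_\rho - \mathrm{Re}\, \frac{r_1}{z} \right) \\
&\le c_1(r) \left[ C_1 + \frac{2r}{r_1} \right] \\
&= C_2\, c_1(r).
\end{aligned}
\tag{7.3.12}
\]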

By Lemma 7.3.6, the mean value of u_ρ − Re (1/z) over a circle is zero. It follows
that

\[ \min_{|z|=r} \left( u_\rho - \mathrm{Re}\, \frac{1}{z} \right) \le 0 \le \max_{|z|=r} \left( u_\rho - \mathrm{Re}\, \frac{1}{z} \right). \]

Therefore (7.3.12) implies that

\[ \max_{|z|=r} \left| u_\rho - \mathrm{Re}\, \frac{1}{z} \right| \le C_2\, c_1(r), \tag{7.3.13} \]

and so

\[ \max_{|z|=r} |u_\rho - u_{\rho'}| \le C_2\, c_1(r), \qquad \rho, \rho' < r < r_1. \tag{7.3.14} \]

By the maximum principle, (7.3.14) is also valid outside D_{r1}. It follows that u_ρ
has limit u as ρ → 0, uniformly on the complement of any neighborhood of p0.
Moreover (7.3.13) and (7.3.14) imply that as r → 0,

\[ \max_{|z|=r} \left| u - \mathrm{Re}\, \frac{1}{z} \right| = o(r), \qquad \max_{|z|=r} |u_\rho - u| = o(r). \]

We are now in a position to complete the proof of the uniformization theorem.

7.4 Proof of the uniformization theorem, completed

We continue to assume that S is either parabolic or elliptic. We have established
that for each point p0 of S and each standard coordinate z at p0, there is a function
u, harmonic in S \ {p0}, bounded outside any neighborhood of p0, such that u −
Re (1/z) has limit 0 at p0 and lim inf_{p→∞} u(p) = 0. For each point of S \ {p0}
and standard neighborhood D1 ⊂ S \ {p0}, we may choose a harmonic conjugate
v, unique up to an additive constant, such that f = u + iv is holomorphic in D1.
The same is true in D1(p0), modulo the pole at p0: choose w a harmonic conjugate
to u − Re (1/z) with w(0) = 0, and let v = w + Im (1/z), so that f = u + iv is
meromorphic near p0 with expansion

\[ f(p) = \frac{1}{z} + az + \dots. \tag{7.4.1} \]

This function may be continued along any curve in S by adjusting the additive
constant along an overlapping chain of coordinate neighborhoods that cover the
curve. As in the hyperbolic case, simple connectedness tells us that, starting from
D1 ( p0 ), f has a unique analytic continuation to all of S.
In the previous standard neighborhood of p0 we can take z̃ = −iz as our standard
coordinate and construct the analogous function f̃, with an expansion

\[ \tilde f(p) = \frac{i}{z} + \tilde a z + \dots. \tag{7.4.2} \]

We know that u = Re f is bounded outside each neighborhood of p0. We do not yet
know that this is true of v = Im f and, therefore, of f itself. The following proposition
settles this point.

Proposition 7.4.1. Let f and f̃ be the functions constructed above with expansions
(7.4.1) and (7.4.2). Then f̃ − i f is constant.

Proof. Near p0 we work in a standard coordinate chart. Since f and f̃ each have
a simple pole at 0, it follows that if ρ is small enough, and p1 ∈ D_ρ, then f takes
the value f(p1) exactly once in D_ρ, and the same is true for f̃. It will be useful to
choose p1 so that we also have f(p) ≠ f(p1) for p in the complement of D_ρ.
To accomplish this, choose M so that

\[ |\mathrm{Re}\, f(p)| \le M, \qquad |\mathrm{Re}\, \tilde f(p)| \le M, \qquad p \in S \setminus D_\rho, \]

and choose p1 = z1 = (1 + i)ε, where ε > 0 is small enough that

\[ |\mathrm{Re}\, f(p_1)| > M, \qquad |\mathrm{Re}\, \tilde f(p_1)| > M. \]

Therefore we also have f(p) ≠ f(p1) and f̃(p) ≠ f̃(p1) if p is in the complement
of D_ρ. It follows that the functions

\[ F(p) = \frac{1}{f(p) - f(p_1)}, \qquad \tilde F(p) = \frac{1}{\tilde f(p) - \tilde f(p_1)} \tag{7.4.3} \]

are holomorphic except for simple poles at p1, and vanish at p0. Therefore near p1
they have expansions

\[ F(p) = \frac{A}{z - z_1} + B + O(z - z_1), \qquad \tilde F(p) = \frac{\tilde A}{z - z_1} + \tilde B + O(z - z_1). \tag{7.4.4} \]

Then G = AF̃ − ÃF is holomorphic on S. On the complement of D_ρ,

\[ |G(p)| \le C\,(|F(p)| + |\tilde F(p)|) \le \frac{C}{|\mathrm{Re}\, f(p_1)| - M} + \frac{C}{|\mathrm{Re}\, \tilde f(p_1)| - M}, \]

so G is a bounded holomorphic function on S, hence is a constant C1. Therefore

\[ A[f(p) - f(p_1)] - \tilde A[\tilde f(p) - \tilde f(p_1)] = C_1 [f(p) - f(p_1)][\tilde f(p) - \tilde f(p_1)]. \]

Since the left-hand side has at most a simple pole at p0, it follows that C1 = 0. The
expansions (7.4.1) and (7.4.2) show that Ã = −iA, so

\[ \tilde f(p) = i f(p) + [\tilde f(p_1) - i f(p_1)]. \]

But the expansions (7.4.1) and (7.4.2) show that the term in brackets is zero.

Let us denote the function f with pole at p0 by f(p; p0) and the corresponding
function with pole at the point p1 of Proposition 7.4.1 by f(p; p1).

Proposition 7.4.2. The function f ( p; p0 ) is injective.



Proof. Let F be the function of (7.4.3). Then F and f ( p; p1 ) are both meromor-
phic in S with a simple pole at p1 and bounded outside any neighborhood of p1 .
Therefore F( p) = a f ( p; p1 ) + b, where a and b are constants. Since F is a linear
fractional transformation of f ( p; p0 ), it follows that f ( p; p1 ) is a linear fractional
transformation of f ( p; p0 ). This is true for any p1 in Dρ . Continuing this argument
along an overlapping chain of neighborhoods, we find that each f ( p; q) is a linear
fractional transformation T = Tq of f ( p; p0 ).
Suppose now that f ( p1 ; p0 ) = f ( p2 ; p0 ). Choose T so that f ( p; p2 ) =
T f ( p; p0 ). Then

f(p1; p2) = T f(p1; p0) = T f(p2; p0) = f(p2; p2) = ∞.

But the only pole of f ( p; p2 ) is at p2 , so p1 = p2 . 

Theorem 7.4.3. (Uniformization: the parabolic and elliptic cases) A simply connected
parabolic or elliptic Riemann surface is biholomorphically equivalent to C
or to the Riemann sphere 𝕊, respectively.

Proof. The function f(p; p0) is an injective holomorphic map to an open subset U
of 𝕊. In the elliptic case the image must be compact, hence all of 𝕊. Otherwise, if f
omitted more than one point, then the Riemann mapping theorem would provide an
equivalence with the disk and, therefore, a bounded holomorphic function. Therefore
in the parabolic case f omits a single point a ∈ C. Then f(p) − a reaches every
z ∈ C, z ≠ 0, so

\[ g(p) = \frac{1}{f(p) - a}, \]

which vanishes at p0, is an equivalence of S and C.

Exercises

In the following exercises, S is a Riemann surface, U is a non-empty open subset of


S, and Φ is a conformal map of S onto itself.
1. Suppose that u : U → R is harmonic. Show that u ◦ Φ^{−1} is harmonic on Φ(U).
2. Suppose that v : U → R is subharmonic. Show that v ◦ Φ^{−1} is subharmonic on
Φ(U).
3. Suppose that F is a Perron family on U. Show that

{u ◦ Φ^{−1} : u ∈ F}

is a Perron family on Φ(U).


4. Given p0 ∈ S, show that

{u ◦ Φ^{−1} : u ∈ F(p0)} = F(Φ(p0)).

5. Prove Corollary 7.1.5.
6. Show that D has a Green’s function with pole at 0.
7. Show that H has a Green’s function with pole at i.
8. Given R > 0, show that there is a subharmonic function u : C → R such that
u + log |z| is bounded for 0 < |z| < R and u is harmonic and positive for 0 <
|z| < R, u = 0 for |z| ≥ R.
9. Show that C does not have a Green’s function with pole at 0.
10. Show that C does not have any Green’s function.
11. Show that S does not have any Green’s function.
12. Show how elliptic and parabolic simply connected Riemann surfaces can be
treated as limiting cases of hyperbolic simply connected Riemann surfaces.
13. Suppose that Ω is a Jordan domain in C with a barrier at each point of the
boundary Γ , and z 0 ∈ Ω. Let u : Ω → R be the solution of the Dirichlet problem
with u = − log |z − z 0 | on Γ .
(a) Show that the harmonic conjugate u ∗ is single-valued in Ω. Hint: Ω is simply
connected. Choose u ∗ with u ∗ (z 0 ) = 0.
(b) Show that f = exp(u + iu∗)(z − z0) is a conformal map of Ω to D and
|f(z)| → 1 as z → ∂Ω.
(c) Show that f : Ω → D is conformal. Hint: f has a unique zero in Ω.
14. (a) Suppose that Ω ⊂ C is an arbitrary Jordan domain. Show that for each n =
1, 2, . . . there is a domain Ωn ⊂ Ω whose boundary Γn is a polygon contained
in a 1/n neighborhood of Γ = ∂Ω.
(b) Show that there is a conformal map from Ω onto D.
15. Suppose that Ω ⊂ C is a domain whose boundary contains at least two points.
Use the results of the preceding exercises to show that there is a conformal map
of Ω onto D. (Note that we may assume that Ω is bounded: see the first step in
the proof of Theorem 2.4.1.)
16. Exercises 12 – 13 of Chapter 8 show that any domain Ω ⊂ C whose complement
has two connected pieces, one bounded and one unbounded, can be mapped
conformally to a unique open annulus A(1, r) = {z : 1 < |z| < r}. The purpose
of this and the following exercises is to prove a mapping theorem applicable to
plane domains with any connectivity.
Suppose that the complement of Ω ⊂ C consists of m disjoint connected sets,
Ω1 , . . . , Ωm , Ωm unbounded and the others bounded.
Continuing, using inversions, show that we may assume inductively that each
Γ j = ∂Ω j is an analytic curve, and that Γ1 is a circle enclosing Ω, with the usual
orientation.
17. Let z0 be a point of Ω and let u be the harmonic function that is the solution of
the Dirichlet problem with value −Re (1/(z − z0)) on ∂Ω.
(a) Show that u has a harmonic extension to a neighborhood of ∂Ω. Hint: use
analyticity of the Γ_j.
(b) Let u∗ be a harmonic conjugate of u. Note that it may not be single-valued:
it may have period a_k on Γ_k, i.e. a gain a_k as it is continued around Γ_k in the
positive direction. Show that locally

\[ u + iu^* + \frac{1}{z - z_0} \]

is holomorphic in Ω \ {z0}, with a simple pole at z0, extends locally to be holomorphic
in a neighborhood of ∂Ω, and has real part zero on ∂Ω.
18. With Ω as in Exercise 16, let u j , 1 ≤ j ≤ m be the harmonic function on Ω that
is the solution of the Dirichlet problem with value 1 on Γ j and 0 on the other
Γk . As in Exercise 16 (a), each u j has a harmonic extension to a neighborhood
of Ω.
(a) Let a_{jk} be the period of u_j on Γ_k. Prove that the homogeneous system

\[ \sum_{j=1}^{m} \lambda_j a_{jk} = 0, \qquad k = 1, 2, \dots, m \]

has only the trivial solution all λ_j = 0. Hint: consider the real and imaginary
parts separately.
(b) Show that there is a linear combination of u∗ above and the u_j such that

\[ u^* + \sum_{j=1}^{m} \lambda_j u_j \]

has period zero on each Γ_k.
(c) Show that

\[ f = u + iu^* + \sum_{j=1}^{m} \lambda_j u_j + \frac{1}{z - z_0} \]

is single-valued and meromorphic in a neighborhood of Ω.


19. Show that the function f of Exercise 18 is a conformal map of Ω onto the
complement of a set of disjoint horizontal slits {z : b j ≤ Re z ≤ c j , Im z = d j },
j = 1, . . . , m.

Remarks and further reading

The formulation and proof of the uniformization theorem involved many of the
leading analysts of the late 19th and early 20th centuries, including Schwarz, Klein,
Poincaré and Koebe. This history is summarized, and an alternative proof is sketched,
in Abikoff’s Monthly article [2]; see also the discussion in §20 of Weyl [214]. Gray
[92] has a detailed history of the proof, with discussion of the work of the previously
mentioned authors as well as Osgood, Carathéodory, Bieberbach, and others. The
theorem is covered in most texts on Riemann surfaces; see the references at the end
of Chapter 6. Our presentation here mainly follows Ahlfors [7].
Chapter 8
Quasiconformal mapping

Conformal equivalence is a basic concept in complex analysis. In connection with
simply connected proper domains in C, it is very flexible, as shown by the Riemann
mapping theorem. But, for domains that are not simply connected, it is much more
rigid. As we shall see, two annuli A j = {z : 1 < |z| < R j }, j = 1, 2, are conformally
equivalent if and only if R1 = R2 . A more flexible, and very useful, concept is that of
quasiconformal equivalence. This chapter covers the basic theory of quasiconformal
mapping.
Section 8.1 introduces a general notion of a quadrilateral and the fundamental
concept of the module of a quadrilateral. This provides the basis for the definition of
a quasiconformal map in Section 8.2. Regular, i.e. C^1, quasiconformal maps are
characterized in Section 8.3.
Section 8.4 introduces ring domains, i.e. domains quasiconformally equivalent
to annuli. The importance of ring domains comes from their separation property –
separating the region surrounded by the ring from the region external to the ring.
Of particular importance are extremal ring domains, the subject of Section 8.5. An
extremal ring domain is one that has maximal module among all ring domains that
have a specified separation property. As shown in Section 8.6, results on such domains
are powerful tools for establishing such properties as Hölder continuity of quasicon-
formal mappings.
The question of the relation of a quasiconformal map of the closed upper half-
plane to its restriction to the real line is treated in Section 8.7. Quasisymmetry,
quasi-isometry, and the Beurling–Ahlfors extension of a map of R → R to a quasi-
conformal map H → H are covered.
Section 8.8 deals with the existence of quasiconformal maps with given dilatation
via the Beltrami equation. The existence theory uses a special case of the Calderón-
Zygmund inequality, which is proved in Section 8.9.
A principal motivation for the development of this theory was its use in the prob-
lem of finding a space of moduli to characterize Riemann surfaces – the subject of
Chapter 9.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 153
R. Beals and R. S. C. Wong, More Explorations in Complex Functions, Graduate Texts
in Mathematics 298, https://doi.org/10.1007/978-3-031-28288-1_8

8.1 Quadrilaterals

By a quadrilateral in C, we mean a Jordan domain Q with four distinguished points
p1, p2, p3, p4 on the boundary, numbered in the positive direction and referred to
as the vertices of Q. The boundary arcs between p1 and p2 and between p3 and p4
are referred to as the “a sides” of Q, and the other two arcs as the “b sides.” As
a first example, consider the unit disk D with four such points designated on the
boundary. There is a unique conformal map f ∈ Aut(D) that maps the ordered
triple (p1, p2, p3) to (−1, −i, 1). Then f(p4) is uniquely determined; see Exercise
1. This shows that the conformal class of such a quadrilateral can be parametrized
by a single real variable, e.g. Re f ( p4 ). A second natural example is a rectangle,
with the usual vertices numbered in the positive direction from some choice of initial
vertex. Here, as we shall see, a natural parametrization of the conformal class is the
ratio of the a side lengths to the b side lengths.
We shall immediately broaden the definition to allow what can be thought of as
Jordan domains in the Riemann sphere. The principal example is H, with boundary
R ∪ {∞}, and vertices numbered in the positive direction. Usually, we shall take
these vertices to be (−1/k, −1, 1, 1/k), where 0 < k < 1. Given any quadrilateral
Q( p1 , p2 , p3 , p4 ), there is a unique choice of k such that

Q( p1 , p2 , p3 , p4 ) is conformally equivalent to H(−1/k, −1, 1, 1/k); (8.1.1)

see Exercise 2. The quadrilateral H(−1/k, −1, 1, 1/k) can be mapped to a rectangle
with vertices −K, K, K + iK′, −K + iK′, for suitable K, K′ > 0, by the function

\[ F(z) = \int_0^z \frac{d\zeta}{\sqrt{(1 - \zeta^2)(1 - k^2\zeta^2)}}; \tag{8.1.2} \]

see Exercise 3. In view of this, it is easily seen that any quadrilateral is conformally
equivalent to a rectangle, as illustrated in Figure 8.1.
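The side lengths of the rectangle can be computed for a given k. The sketch below rests on an identification that the text leaves to Exercise 3 and that is assumed here: K = F(1) is the complete elliptic integral of the first kind with modulus k, and K′ is the same integral with the complementary modulus √(1 − k²). Both are evaluated with the arithmetic-geometric mean, and K is cross-checked by a midpoint rule after the substitution ζ = sin t. All function names are hypothetical.

```python
import math


def agm(a, b, iterations=40):
    """Arithmetic-geometric mean of two positive numbers."""
    for _ in range(iterations):
        a, b = (a + b) / 2, math.sqrt(a * b)
    return a


def complete_K(k):
    """K(k) = int_0^1 d zeta / sqrt((1 - zeta^2)(1 - k^2 zeta^2)), for 0 <= k < 1."""
    return math.pi / (2 * agm(1.0, math.sqrt(1 - k * k)))


def K_by_quadrature(k, n=20000):
    """Midpoint rule for int_0^{pi/2} dt / sqrt(1 - k^2 sin^2 t)."""
    h = (math.pi / 2) / n
    return h * sum(
        1 / math.sqrt(1 - (k * math.sin((j + 0.5) * h)) ** 2) for j in range(n)
    )


def rectangle_sides(k):
    """Width 2K and height K' of the image rectangle, for 0 < k < 1."""
    return 2 * complete_K(k), complete_K(math.sqrt(1 - k * k))


if __name__ == "__main__":
    for k in (0.1, 0.5, 0.9):
        print(k, rectangle_sides(k), K_by_quadrature(k))
```

As k increases from 0 to 1 the width 2K grows without bound while the height K′ decreases to π/2, so the ratio of the sides sweeps out all positive values, as one expects from the one-parameter family (8.1.1).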

Fig. 8.1 A quadrilateral and a conformally equivalent rectangle.

Remark. Because we have assumed a quadrilateral to be a Jordan domain, such a
conformal map to a rectangle extends to a homeomorphism (i.e. a continuous map
with continuous inverse) from the closure of Q to the closure of R. This is really
the necessary feature, so we may build it into the definition of a quadrilateral and of
maps between quadrilaterals.
Two rectangles with specified a and b sides are conformally equivalent if and only
if they are similar; see Exercise 5.

Proposition 8.1.1. Two quadrilaterals Q, Q′ in C are conformally equivalent if and
only if the canonical images R, R′ are similar: a/b = a′/b′.

We have shown, in effect, that any quadrilateral Q has a canonical image with
vertices 0, a, a + bi, bi for some choice of a, b. We will also refer to a canonical
image of Q as a model of Q. With Proposition 8.1.1 in mind, and abusing notation a
bit, we note that for a rectangle whose a-sides have length a and whose b-sides have
length b, the similarity class is determined by the ratio a/b. In general, the module
m(Q) of a quadrilateral Q is defined to be a/b, where a and b are the appropriate
side lengths of a model of Q. The module is, by definition, a conformal invariant.
By changing the numbering of the vertices of Q, we can convert to a quadrilateral
Q∗ whose a-sides are the b-sides of Q. Clearly,

\[ m(Q^*) = \frac{1}{m(Q)}. \tag{8.1.3} \]

It is useful to have a second characterization of m(Q).

Proposition 8.1.2. For any quadrilateral Q,

\[ m(Q) = \sup_{\rho} \frac{L(\rho)^2}{A(\rho)}, \tag{8.1.4} \]

where ρ runs through the non-zero functions that are non-negative on Q and have
finite integral

\[ A(\rho) = \iint_Q \rho(x + iy)^2 \, dx \, dy, \]

and L(ρ) is the infimum of

\[ L_\gamma(\rho) = \int_\gamma \rho(z) \, |dz| \]

over rectifiable curves γ that join the two b-sides of Q.

Proof: It is enough to pass to a model R with side lengths a, b. Then

\[ L(\rho) \le \int_0^a \rho(x + iy) \, dx, \]

so

\[ b\, L(\rho) \le \iint_R \rho \, dx \, dy, \]

and the Cauchy–Schwarz inequality gives

\[ b^2 L(\rho)^2 \le \iint_R \rho^2 \, dx \, dy \iint_R dx \, dy = A(\rho)\, ab. \tag{8.1.5} \]

Thus, the expression on the right-hand side of (8.1.4) is ≤ a/b = m(Q). Conversely,
returning to R with ρ = 1, it is clear that we obtain equality in (8.1.4).
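The inequality can be tried out on a model rectangle [0, a] × [0, b] for densities that depend on x alone. For such ρ, every curve joining the two vertical sides satisfies ∫ρ|dz| ≥ ∫ρ(x)|dx| ≥ ∫_0^a ρ(x) dx, with equality on horizontal segments, so L(ρ) is the one-dimensional integral and A(ρ) = b ∫_0^a ρ(x)² dx. The following minimal sketch, with the hypothetical name `quotient_for_density`, evaluates L(ρ)²/A(ρ) under that assumption.

```python
def quotient_for_density(rho, a, b, n=20000):
    """L(rho)^2 / A(rho) on [0, a] x [0, b] for a density rho = rho(x).

    Assumes rho depends on x only, so that L(rho) = int_0^a rho(x) dx.
    Both integrals are approximated by the midpoint rule.
    """
    h = a / n
    xs = [(j + 0.5) * h for j in range(n)]
    length = h * sum(rho(x) for x in xs)
    area = b * h * sum(rho(x) ** 2 for x in xs)
    return length ** 2 / area


if __name__ == "__main__":
    a, b = 2.0, 1.0
    print(quotient_for_density(lambda x: 1.0, a, b))      # the extremal density
    print(quotient_for_density(lambda x: 1.0 + x, a, b))  # a non-constant density
```

With a = 2 and b = 1 the constant density returns the module a/b, while the non-constant one gives a strictly smaller quotient.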

Remarks. The quotient on the right in (8.1.4) is unchanged if ρ is multiplied by a
positive constant. Moreover, equality occurs in (8.1.5) if and only if ρ is a constant
multiple of 1. Thus, the choice ρ = 1 is essentially the only choice that gives equality
in (8.1.4). As we shall see, this has important consequences.

It will be useful to check what happens if we split a quadrilateral into “vertical”
or “horizontal” strips.

Proposition 8.1.3. Suppose that the quadrilateral Q is divided into quadrilaterals
Q1, Q2 by a curve κ that joins the two a sides of Q. Then

\[ m(Q) \ge m(Q_1) + m(Q_2). \tag{8.1.6} \]

Equality holds if and only if the image of κ in any model R of Q is a vertical line
segment.

Proof: It is enough, once again, to work directly with a model with side lengths a, b.
By the remark, we may work with ρ = 1. Let l1 be the minimal distance from κ to
the left side of R and l2 the minimal distance to the right side of R. Then the area
A_j = A(Q_j) is ≥ l_j b, with equality if and only if κ is a vertical segment. Then

\[ m(Q_1) + m(Q_2) = \frac{l_1^2}{A_1} + \frac{l_2^2}{A_2} \le \frac{l_1^2}{l_1 b} + \frac{l_2^2}{l_2 b} = \frac{l_1 + l_2}{b} \le \frac{a}{b}, \]

with equality if and only if κ is a vertical segment.

The assumptions of the following useful lemma are illustrated in Figure 8.2.

Fig. 8.2 Estimating the module of Q from below.

Lemma 8.1.4. Suppose that the quadrilateral Q is contained in a horizontal strip
of width β and the b sides of Q are separated by a vertical strip of width α. Then the
module of Q is ≥ α/β.

Proof: Define ρ on Q to be 1 between the vertical strips and zero elsewhere. Then
L(ρ) ≥ α and A(ρ) ≤ αβ, so m(Q) ≥ α²/(αβ) = α/β.

We say that a sequence of quadrilaterals {Q_n} converges uniformly to a bounded
quadrilateral Q if for each ε > 0, n ≥ n(ε) implies that each a side of Q_n lies in
an ε neighborhood of the corresponding a side of Q, and the same for the b sides.
Consequently, Q_n lies in an ε neighborhood of Q and conversely.

Theorem 8.1.5. If the sequence of quadrilaterals Q_n ⊂ Q converges uniformly to
the bounded quadrilateral Q, then

\[ \lim_{n \to \infty} m(Q_n) = m(Q). \]

Proof: Replace Q by a model and use Lemma 8.1.4 to estimate m(Q n ).

8.2 Quasiconformal mappings

We can now define the subject of this chapter. A homeomorphism f from a domain
Ω ⊂ C to a domain Ω′ ⊂ C is K-quasiconformal, K < ∞, if for each quadrilateral
Q ⊂ Ω whose boundary is contained in Ω,

\[ m(f(Q)) \le K\, m(Q). \]

In view of (8.1.3), this implies that also 1/m(f(Q)) ≤ K/m(Q), so f is
K-quasiconformal if and only if for each such quadrilateral Q ⊂ Ω,

\[ \frac{1}{K}\, m(Q) \le m(f(Q)) \le K\, m(Q). \tag{8.2.1} \]


Proposition 8.2.1. (a) If f is K-quasiconformal, so is its inverse.
(b) If f_j is K_j-quasiconformal, j = 1, 2, then f_1 ◦ f_2 is K-quasiconformal, with
K ≤ K_1 K_2.
(c) If f is K-quasiconformal, then K ≥ 1.
(d) f is 1-quasiconformal if and only if it is conformal.

Proof: (a) and (c) follow from (8.2.1), while (b) is obvious. The first part of (d) is
clear, since a conformal map preserves modules.
Conversely, suppose that f is 1-quasiconformal. It is enough to show that it is
conformal on each quadrilateral Q to f(Q), and by composing with conformal maps,
we may reduce to the case of a 1-quasiconformal map g that maps a model R to a
model R′. Since R and R′ have the same module, we may take R′ = R and assume
that the lower side of R is the interval [0, a]. Given 0 < x < a, let γ be the vertical
segment from x to x + ib and let κ = g(γ). An application of Proposition 8.1.3
shows that κ must also be a vertical segment – in fact the same vertical segment. Thus,
Re g(x + iy) = x. The same argument applied to horizontal segments implies that
Im g(x + iy) = y. Thus, f, composed with certain conformal maps, is the identity.
It follows that f is conformal.
We now pass to some consequences of the approximation in Lemma 8.1.4 and
Proposition 8.1.3.

Theorem 8.2.2. Suppose that the quadrilateral Q is mapped to a quadrilateral Q′
by a continuous map f that is K-quasiconformal on the interior of Q. Then f is
K-quasiconformal on Q.

Proof: Because of conformal invariance, we may replace Q and Q′ by models R,
R′ with sides a, b and a′, b′. Let R̃ be a rectangle whose closure is contained in the
interior of R, with sides ã, b̃. By continuity, we may assume that the image R̃′ is
contained in a vertical strip in R′ of width ≥ a′ − ε. By Lemma 8.1.4, the module

\[ \frac{a' - \varepsilon}{b'} \le m(\tilde R') \le K\, m(\tilde R). \]

As R̃ increases to R, this shows that a′/b′ ≤ K · a/b.

Theorem 8.2.3. Suppose that a map f is continuous on a domain Ω and
K-quasiconformal on Ω \ γ, where γ is an analytic arc. Then f is K-quasiconformal
on Ω.

Proof: From conformal invariance and the proof of Theorem 8.2.2, we may replace
any quadrilateral in Ω by a smaller quadrilateral. The smaller quadrilateral intersects
γ in a finite set of disjoint analytic arcs. Thus, we may reduce the problem to two
rectangles R and R′, and we may reverse the viewpoint and take γ′ in R′ to be analytic.
It is enough to remove subarcs of γ′ ∩ R′ one at a time. If such a subarc is contained
in a vertical line, then Proposition 8.1.3 allows us to remove that line and reduce
the problem to consideration of each of the two resulting rectangles. Continuing this
process, we may remove all such arcs and assume that γ′ ∩ R′ can be decomposed
into arcs that intersect each vertical line at most once. Passing vertical lines through
each endpoint of such an arc allows us to invoke Proposition 8.1.3 again, and reduce
to the case that γ′ runs from one vertical side of R′ to the other.
Under this assumption, divide R′ into vertical strips R′_j, which are divided into
parts Q′_{j1}, Q′_{j2} by γ′. These are the images of Q_{j1}, Q_{j2} in R, with moduli m_{j1}, m_{j2}.
From Proposition 8.1.2

\[ \frac{1}{m_{j1}} \ge \frac{b_{j1}^2}{A_{j1}}, \qquad \frac{1}{m_{j2}} \ge \frac{b_{j2}^2}{A_{j2}}, \tag{8.2.2} \]

where A_{j1}, A_{j2} are the areas and b_{j1}, b_{j2} are the shortest distances from γ to the
horizontal sides of R. If the strips Q_j are narrow enough, b_{j1}² + b_{j2}² ≥ (b − ε)²,
where a, b are the horizontal and vertical side lengths for R. Then (8.1.6) gives

\[ \frac{1}{m_{j1}} + \frac{1}{m_{j2}} \ge \frac{b_{j1}^2}{A_{j1}} + \frac{b_{j2}^2}{A_{j2}} \ge \frac{b_{j1}^2 + b_{j2}^2}{A_{j1} + A_{j2}} \ge \frac{(b - \varepsilon)^2}{A_{j1} + A_{j2}}. \]

But also

\[ \frac{1}{m'_j} \ge \frac{1}{m'_{j1}} + \frac{1}{m'_{j2}} \ge \frac{1}{K} \left[ \frac{1}{m_{j1}} + \frac{1}{m_{j2}} \right], \]

so

\[ m'_j \le K\, \frac{A_{j1} + A_{j2}}{(b - \varepsilon)^2}. \]

Therefore

\[ m' = \sum_j m'_j \le K \sum_j \frac{A_{j1} + A_{j2}}{(b - \varepsilon)^2} \le K\, \frac{ab}{(b - \varepsilon)^2}, \]

which, in the limit, gives m′ ≤ K m.


As we shall see, quasiconformality, like continuity, is a local concept. We take
the maximal dilatation of a quasiconformal map f : Ω → Ω′ to be

\[ K_f(\Omega) = K(\Omega) = \sup_{Q \subset \Omega} \frac{m(f(Q))}{m(Q)}. \tag{8.2.3} \]

The maximal dilatation of f at a point p ∈ Ω is

\[ K_f(p) = \inf\{ K(U) : U \text{ a neighborhood of } p \}. \tag{8.2.4} \]

Both these concepts are conformal invariants.

Proposition 8.2.4. If f : Ω → Ω′ is a K-quasiconformal map, then the maximal
dilatation satisfies

\[ K(\Omega) = \sup_{p \in \Omega} K(p). \tag{8.2.5} \]

Proof: Clearly, K_f(p) ≤ K_f(Ω). To prove the converse, it is enough to consider the
case of a square Ω. We want to show that there is a point p ∈ Ω such that K_f(p) ≥
K_f(Ω). Now Ω is the union of four disjoint subsquares and some line segments. In
view of Theorem 8.2.3, at least one of these subsquares Ω_1 has K_f(Ω_1) = K_f(Ω).
Continuing, we get a nested sequence of squares Ω_n whose sides decrease by 1/2 at
each stage. The intersection ⋂ Ω_n is a single point p with K_f(p) = K_f(Ω).

One more analogy of quasiconformal maps with conformal maps concerns
convergence.

Theorem 8.2.5. If {f_n} is a sequence of K-quasiconformal maps from Ω to Ω′ that
converges uniformly on each compact subset of Ω, then the limit f is a homeomorphism,
and is also K-quasiconformal.

Proof: Given a quadrilateral Q contained in a compact subset of Ω, we may construct
a sequence of quadrilaterals Q_n that converges uniformly to Q in the sense used in
Theorem 8.1.5. For large enough k, f_k(Q_n) is contained in f(Q). Passing to a
subsequence of {f_n} and renumbering, we can obtain f_n(Q_n) ⊂ f(Q). Since f is
uniformly continuous on Q, the f_n(Q_n) converge uniformly to f(Q). Therefore
Theorem 8.1.5 gives m(f_n(Q_n)) → m(f(Q)), and the inequalities

\[ \frac{1}{K}\, m(f_n(Q_n)) \le m(Q_n) \le K\, m(f_n(Q_n)) \]

carry over to f.

8.3 Regular quasiconformal maps

We say that a map f : Ω → Ω′ is regular if it is an orientation-preserving
homeomorphism that is of class C^1, i.e. the coordinate functions have continuous first
partial derivatives. It will be useful to put this in terms of the complex derivatives of
(1.2.3) and (1.2.4). Suppose that

f(x + iy) = u(x, y) + iv(x, y)

where u and v are real-valued and have continuous first partial derivatives. Let

\[ p = \frac{\partial f}{\partial z} = \frac{1}{2} \left( \frac{\partial f}{\partial x} - i \frac{\partial f}{\partial y} \right), \qquad q = \frac{\partial f}{\partial \bar z} = \frac{1}{2} \left( \frac{\partial f}{\partial x} + i \frac{\partial f}{\partial y} \right). \]

Then some calculation shows that


8.3 Regular quasiconformal maps 161

u x = Re ( p + q), u y = Im (q − p),
vx = Im ( p + q), v y = Re ( p − q). (8.3.1)

Some further calculation shows that the Jacobian of the map f is

\[ J_f = \begin{vmatrix} u_x & u_y \\ v_x & v_y \end{vmatrix} = |p|^2 - |q|^2. \tag{8.3.2} \]
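The relations (8.3.1) and (8.3.2) are easy to check numerically. The following is a minimal sketch, assuming a sample map f(z) = z + z̄²/4 that is our own illustrative choice and does not come from the text; the helper names are likewise hypothetical. It approximates p and q by central differences and compares the real Jacobian with |p|² − |q|².

```python
def wirtinger(f, z, h=1e-6):
    """Approximate p = f_z and q = f_zbar at z by central differences."""
    fx = (f(z + h) - f(z - h)) / (2 * h)            # partial derivative in x
    fy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)  # partial derivative in y
    p = 0.5 * (fx - 1j * fy)
    q = 0.5 * (fx + 1j * fy)
    return p, q, fx, fy


def sample_map(z):
    # Hypothetical test map (not from the text): f(z) = z + conj(z)^2 / 4,
    # for which p = 1 and q = conj(z) / 2.
    return z + 0.25 * z.conjugate() ** 2


z0 = 0.3 + 0.4j
p, q, fx, fy = wirtinger(sample_map, z0)
ux, vx = fx.real, fx.imag   # f_x = u_x + i v_x
uy, vy = fy.real, fy.imag   # f_y = u_y + i v_y
jacobian = ux * vy - uy * vx
print(jacobian, abs(p) ** 2 - abs(q) ** 2)
```

For this map |q| < |p| holds on the disk |z| < 2, so the Jacobian is positive there.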

Since we are assuming that f preserves orientation, we have that |q| < |p| at each
point of Ω. Let us consider a uniform condition

\[ |q| = \left|\frac{\partial f}{\partial \bar z}\right| \le k\,|p| = k\left|\frac{\partial f}{\partial z}\right| \quad \text{in } \Omega, \tag{8.3.3} \]

where k is a constant, 0 ≤ k < 1. We want to relate this condition to quasiconformality.
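For this purpose we consider the directional derivatives

\[ \partial_\alpha f = \cos\alpha\,\frac{\partial f}{\partial x} + \sin\alpha\,\frac{\partial f}{\partial y} = e^{i\alpha} p + e^{-i\alpha} q. \tag{8.3.4} \]

The dilatation quotient of the map f at a point z of Ω is defined to be

\[ D_f(z) = \frac{\sup_\alpha |\partial_\alpha f(z)|}{\inf_\alpha |\partial_\alpha f(z)|}. \tag{8.3.5} \]

Let us compute the numerator and denominator, assuming p(z)q(z) ≠ 0. Let θ be
the argument of q/p. Then (8.3.4) shows that

\[ |\partial_\alpha f(z)| = |p + e^{-2i\alpha} q|. \]

The supremum |p| + |q| is attained when 2α = θ, and the infimum |p| − |q| when
2α = θ + π. It follows that

\[ D_f(z) = \frac{|p| + |q|}{|p| - |q|} = \frac{1 + |q|/|p|}{1 - |q|/|p|}. \tag{8.3.6} \]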
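As a sanity check on (8.3.5) and (8.3.6), one can sample the modulus of the directional derivative over many directions and compare the ratio of the largest to the smallest value with the closed form. This is only a sketch: the values of p and q below are arbitrary illustrative numbers with |q| < |p|, and the function names are our own.

```python
import cmath
import math


def dilatation_quotient_sampled(p, q, n=3600):
    """Ratio max/min of |e^{ia} p + e^{-ia} q| over n equally spaced directions a."""
    values = [
        abs(cmath.exp(1j * a) * p + cmath.exp(-1j * a) * q)
        for a in (2 * math.pi * j / n for j in range(n))
    ]
    return max(values) / min(values)


def dilatation_quotient_closed(p, q):
    """Closed form (|p| + |q|) / (|p| - |q|), valid when |q| < |p|."""
    return (abs(p) + abs(q)) / (abs(p) - abs(q))


p, q = 1.0 + 0.5j, 0.3 - 0.2j   # illustrative values, not from the text
sampled = dilatation_quotient_sampled(p, q)
closed = dilatation_quotient_closed(p, q)
print(sampled, closed)
```

The sampled ratio can only underestimate the true quotient, since a finite grid of directions may miss the exact extremal directions.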

Theorem 8.3.1. Suppose that f : Ω → Ω′ is a regular map. Then f is K-quasiconformal
if and only if the dilatation quotient $D_f(z)$ is bounded. If so, then

\[ K_f(\Omega) = \sup_z D_f(z). \tag{8.3.7} \]

Proof: We show first that K ≤ sup D(z). The dilatation quotient is a conformal
invariant, so we may consider a model rectangle R with side lengths m, 1 mapped by
g onto a model rectangle R′ with side lengths m′, 1, where g has dilatation quotient
≤ K at each point. By (8.3.2),

\[ J_g = (|p| + |q|)(|p| - |q|) \ge \frac1K (|p| + |q|)^2 \ge \frac1K |g_x|^2. \]

Therefore for the area of R′ we have

\[ m' = \iint_R J_g \,dx\,dy \ge \frac1K \int_0^1\!\!\int_0^m |g_x|^2 \,dx\,dy. \tag{8.3.8} \]

On the other hand

\[ m' \le \int_0^m |g_x(x + iy)|\,dx \]

since the integral is the length of a curve joining the b sides of R′, so

\[ \frac{m'^2}{m} \le \frac1m \left(\int_0^m |g_x(x + iy)|\,dx\right)^2 \le \int_0^m |g_x(x + iy)|^2\,dx. \tag{8.3.9} \]

The inequalities (8.3.8) and (8.3.9) imply that m′ ≤ K m.


Conversely, by Proposition 8.2.4, to show that sup D f (z) ≤ K , it is enough to
show that D f (z) ≤ K f (z). Given z, let dα and dβ be the directional derivatives
where
dα f = | p| + |q|, dβ f = | p| − |q|.

A calculation shows that these directions are at right angles. As before, we may
reduce to the case z = 0, and g(0) = 0, where g is the transplanted map, and we may
take dα = dx , dβ = d y . Then for sufficiently small ε > 0, ε R will lie in the domain
of the transplanted map g. For any z = x + i y ∈ ε R,

g(x + i y) = gx (0)x + g y (0)(i y) + O(ε2 ).

By assumption, gx (0) and g y (0) are positive, so for some constant c the strip

c ε2 gx (0) ≤ x ≤ gx (0)ε − c ε2

separates the sides of g(ε R), while

−c ε2 ≤ Im g(ε R) ≤ g y (0)ε + c ε2 .

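It follows from Proposition 8.1.4 that

\[ m(g(\varepsilon R)) \ge \frac{g_x(0)\varepsilon - 2c\varepsilon^2}{g_y(0)\varepsilon + 2c\varepsilon^2} = \frac{|p| + |q| - 2c\varepsilon}{|p| - |q| + 2c\varepsilon} = D_g(0) + O(\varepsilon). \]

Therefore $D_f(z) \le K_f(z)$.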

Corollary 8.3.2. If f : Ω → Ω′ is a regular map with bounded maximal dilatation
K(Ω), then

\[ K(\Omega) = \frac{1+k}{1-k}, \qquad k = \sup_{z\in\Omega} \left|\frac{f_{\bar z}(z)}{f_z(z)}\right|. \tag{8.3.10} \]
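A worked instance of (8.3.10) may help. The sketch below uses the affine stretch f(x + iy) = Kx + iy, our own illustrative choice rather than an example from the text, for which p = (K + 1)/2 and q = (K − 1)/2 are constant, so the supremum defining k is attained at every point.

```python
def maximal_dilatation_from_k(k):
    """K = (1 + k) / (1 - k) for 0 <= k < 1, as in (8.3.10)."""
    if not 0 <= k < 1:
        raise ValueError("k must satisfy 0 <= k < 1")
    return (1 + k) / (1 - k)


def affine_stretch_coefficients(stretch):
    """p = f_z and q = f_zbar for the stretch f(x + iy) = stretch * x + i y."""
    p = (stretch + 1) / 2
    q = (stretch - 1) / 2
    return p, q


p, q = affine_stretch_coefficients(3.0)
k = abs(q) / abs(p)              # constant in z for an affine map
K = maximal_dilatation_from_k(k)
print(k, K)
```

With stretch factor 3 this gives k = 1/2 and recovers K = 3, as expected for a map that stretches one direction by a factor of 3.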

The following result could be deduced from Theorem 8.3.1 and earlier results,
but a direct proof is simpler.
Proposition 8.3.3. The dilatation quotient D f (z) of a regular map f is the same as
the dilatation quotient of the inverse map at f (z).

Proof: Denote f(z) by ζ. The identity dζ = p dz + q dz̄ and its complex conjugate
can be solved for dz to obtain

\[ dz = \frac{\bar p\,d\zeta - q\,d\bar\zeta}{|p|^2 - |q|^2}, \tag{8.3.11} \]

which implies the stated result.


Remark. The inequality $D_f(z) \le K$ holds if we simply assume that the K-quasiconformal
map f is differentiable at the point z. The proof is the same as in the proof
of Theorem 8.3.1, with o(ε) in place of O(ε²) in the remainder terms.

8.4 Ring domains

By a ring domain we mean a bounded domain B, the complement of whose closure
consists of one (non-empty) bounded component and one unbounded component.
For a suitable choice of r1 , r2 there is a conformal map of B onto an annulus

A = A(r1 , r2 ) = {z : 0 < r1 < |z| < r2 < ∞}; (8.4.1)

see Exercises 12 – 14. The annulus A is called a canonical image of B. The annulus
A itself is, up to the segment [r1 , r2 ], the bijective image under the exponential map
of the rectangle

R = R(r1 , r2 ) = {(x + iθ ) : 0 < θ < 2π, log r1 < x < log r2 }. (8.4.2)

Note that we are considering the segments with fixed r to be joining the a sides of
this rectangle, so the identity in Proposition 8.4.1 below takes the opposite form from
Proposition 8.1.2.
The rectangle has module m(R) = log(r2/r1)/2π. We follow custom and normalize
by setting

\[ m(B) = \log\frac{r_2}{r_1}. \tag{8.4.3} \]

Note that if A = A(r1, r2) is the canonical image of B, then

\[ m(B) = \int_{r_1}^{r_2} \frac{r\,dr}{r^2} = \frac{1}{2\pi}\iint_A \frac{dx\,dy}{|z|^2}. \tag{8.4.4} \]

We have the following analogue of Proposition 8.1.2:

Proposition 8.4.1. The module of a ring domain B is

\[ m(B) = 2\pi \inf_\rho \frac{A(\rho)}{L(\rho)^2}, \tag{8.4.5} \]

where ρ runs through the non-zero functions that are non-negative on B and have
finite integral

\[ A(\rho) = \iint_B \rho(x + iy)^2\,dx\,dy, \]

and L(ρ) is the infimum of

\[ L_\gamma(\rho) = \int_\gamma \rho(z)\,|dz| \]

over closed rectifiable curves γ in B that separate the two components of the
complement of B.

Proof: It follows from the remarks above that there is a conformal map g from a
rectangle (8.4.2) onto B, minus an analytic arc.
If γ is a curve as above, then $\tilde\gamma = g^{-1}\circ\gamma$ is a curve that joins the b sides of
R, and conversely. With the convention that for ζ = u + iv ∈ R we have g(ζ) =
z = x + iy ∈ B, then |dz| = |g′(ζ)||dζ| and dx dy = |g′(ζ)|² du dv. Thus, if we set
$\tilde\rho(\zeta) = \rho(g(\zeta))\,|g'(\zeta)|$, then

\[ \int_{\tilde\gamma} \tilde\rho(\zeta)\,|d\zeta| = \int_\gamma \rho(z)\,|dz|; \qquad \iint_R \tilde\rho(\zeta)^2\,du\,dv = \iint_B \rho(z)^2\,dx\,dy. \]

Therefore (8.4.5) follows from (8.1.4) applied to R.


Remark. In the notation of the previous proof, equality holds in (8.1.4) only if $\tilde\rho$ is
constant. Thus, for equality we need ρ(g(ζ)) = 1/|g′(ζ)|. If f : B → A(r1, r2) is the
canonical map, then g is the inverse of log f, so 1/|g′| = |f′|/|f|.

The next result gives another upper bound for the module of a ring domain.
Proposition 8.4.2. If B is a ring domain that encloses the origin, then

\[ m(B) \le \frac{1}{2\pi}\iint_B \frac{dx\,dy}{|z|^2}. \tag{8.4.6} \]

Equality holds if and only if B is an annulus centered at the origin.

Proof: We use the notation of the proof of Proposition 8.4.1. Thus, g : R → B and

\[ \iint_B \frac{dx\,dy}{|z|^2} = \iint_R \frac{|g'|^2}{|g|^2}\,du\,dv. \tag{8.4.7} \]

Now for fixed u, the integral of g′/g is the change in the argument of g(u, ·), so

\[ \int_0^{2\pi} \frac{g'(u,v)}{g(u,v)}\,dv = 2\pi. \tag{8.4.8} \]

Therefore

\[ \iint_R \frac{g'}{g}\,du\,dv = 2\pi \log\frac{r_2}{r_1} = 2\pi\, m(B). \tag{8.4.9} \]

But also, by Cauchy–Schwarz, since R has area 2π m(B),

\[ \left|\iint_R \frac{g'}{g}\,du\,dv\right|^2 \le \iint_R \frac{|g'|^2}{|g|^2}\,du\,dv \cdot 2\pi\, m(B). \tag{8.4.10} \]

Combining (8.4.7), (8.4.9), and (8.4.10), we obtain the inequality (8.4.6).

Suppose equality holds in (8.4.6). Then equality holds in (8.4.10). This implies
that g′/g is constant. In view of (8.4.8), the constant is 1. Therefore $g(\zeta) = e^{c} e^{\zeta}$
for some constant c. Recall that g is the inverse of log f, where f : B → A is the
canonical map. Therefore

\[ f(z) = e^{-c} z, \]

so B is the image of the annulus A under multiplication by $e^{c}$, which is again an
annulus centered at the origin.

Corollary 8.4.3. If B is a ring domain that encloses 0 and $\{B_n\}$ is a sequence of
disjoint ring domains contained in B that enclose 0, then

\[ \sum_n m(B_n) \le m(B). \]

Moreover, if B is an annulus centered at the origin, then equality holds only if the
$B_n$ are also annuli centered at the origin, and their union is dense in B.
In particular, the module of ring domains that enclose the origin is strictly increasing
with respect to set inclusion.

Let us look at the connection with quasiconformal mapping.

Proposition 8.4.4. If f is a homeomorphism from a domain Ω onto a domain Ω′,
then f is K-quasiconformal if and only if for each ring domain B ⊂ Ω,

\[ \frac1K\, m(B) \le m(B') \le K\, m(B), \qquad B' = f(B). \tag{8.4.11} \]

Proof: Suppose that f is K-quasiconformal and B ⊂ Ω is a ring domain with image
B′. We may pass to the canonical image and assume that B is an annulus A =
A(r1, r2). Removing the segment (r1, r2) from A leaves two quadrilaterals $A_1$, $A_2$.
Taking the logarithm gives rectangles with modulus 2π/m(B). The images $B_j = f(A_j)$
are disjoint quadrilaterals in B′, so

\[ \frac{2\pi}{m(B')} = m(B_1) + m(B_2) \le K\,[m(A_1) + m(A_2)] = K\,\frac{2\pi}{m(B)}, \]

so m(B)/K ≤ m(B′). The other inequality in (8.4.11) follows, since $f^{-1}$ is also
K-quasiconformal.
The converse may be proved by reverse engineering: given a quadrilateral Q,
divide Q (minus an analytic arc) into two quadrilaterals $Q_j$ with half the module, and
use the canonical images to map conformally to a ring domain. Then use (8.4.11) to
prove that

\[ \frac1K\, m(Q) \le m(Q') \le K\, m(Q). \tag{8.4.12} \]

8.5 Extremal ring domains

We consider now a question studied by Grötzsch [94]: what is the maximum module
of a ring domain that separates a Jordan curve γ from two distinct points that lie in
one component of the complement of γ? By a conformal map of the component that
contains the two points, we may assume that γ is the boundary of the unit disk D and
that the two points are 0, r, 0 < r < 1. Grötzsch showed that the maximal modulus
is attained by Grötzsch’s extremal domain: the complement in the unit disk of the
segment [0, r].
As we shall see, questions of this type are important for the understanding of the
possible behaviour of a quasiconformal map. Questions about the extent to which a
K-quasiconformal map can change the distance between two points can be attacked
by looking at separation problems of this type, and using the maximum modulus of
a separating ring in order to calculate an upper bound to the distortion or to a Hölder
continuity norm.

Fig. 8.3 Grötzsch’s extremal domain.

In order to consider the domain in Figure 8.3 as a ring domain, we need to extend
the definition to allow the complement of the domain to consist of sets that may
have empty interior. Specifically, we allow any domain B∞ that is the union of an
increasing sequence of ring domains {Bn }, and take m(B∞ ) = limn→∞ m(Bn ).

The following result is known as Grötzsch’s module theorem:

Theorem 8.5.1. The domain B indicated in Figure 8.3 has the greatest module of
any ring domain that separates the points 0 and r , 0 < r < 1, from the unit circle
∂D.

Proof: Consider Q, the upper half of B, as a quadrilateral with vertices (0, r, 1, −1).
Given R > 1, consider the upper half of the annulus A(1, R) to be a quadrilateral
$Q_R$ with vertices (1, R, −R, −1). There is a unique conformal map f of Q onto
$Q_R$ that takes the ordered triple (0, r, 1) to (1, R, −R). If we choose R so that
$m(Q_R) = m(Q)$, then the map takes −1 to itself, and thus is a conformal map from
Q to $Q_R$ as quadrilaterals. Extend f to D by reflection across the real axis. This
conformal image has the same modulus, so $R = e^{\mu}$, μ = μ(B). Now suppose that
γ is any curve that separates 0 and r from the unit circle. Then the portions of γ in
the upper and lower halves of the disk must meet both intervals (−1, 0) and (r, 1),
so the length of f(γ) is at least 2π. The result follows from Proposition 8.4.1 and
the remark that follows it.
The module of Grötzsch’s domain B = B(r ) is commonly denoted by μ(r ).
We turn next to three similar problems, in the formulation given by Ahlfors [5].
Consider a ring domain A in the plane whose complement consists of a bounded
region C1 and an unbounded region C2 . What is the maximum modulus of A in the
three cases?
I. (Grötzsch) C1 = D, C2 = {R}, R > 1.
In this case inversion z → 1/z sends the problem into the one in Theorem 8.5.1 with
r = 1/R and extremal domain

\[ B_I = \{z : |z| > 1,\ z \notin [R, \infty)\}. \]

Therefore the module $m_I(R) = \mu(1/R)$.


II. (Teichmüller) C1 contains −1 and 0; C2 contains a point P > 0.
III. (Mori) C1 ∩ D contains two points z 1 , z 2 with 2 > |z 1 − z 2 | ≥ λ > 0, C2
contains the origin.
In case III there is an automorphism of the closed disk that moves {z 1 , z 2 } to a pair of
points {w, w̄} with Re w < 0 and |w − w̄| = λ; see Exercise 17. Therefore we may
assume this configuration.
Extremal domains for these three cases are shown in Figure 8.4.

Theorem 8.5.2. (Teichmüller) The extremal domain for Question II is $B_{II}$ in Figure
8.4. The module is

\[ m_{II}(P) = 2\, m_I\!\left(\sqrt{P+1}\right) = 2\mu\!\left(\frac{1}{\sqrt{P+1}}\right). \tag{8.5.1} \]

Fig. 8.4 Extremal domains of Grötzsch, Teichmüller and Mori.

Proof: Consider the circle Γ with center −1 and radius ρ > 1. Reflection through
this circle maps 0 to ρ² − 1. Therefore if we choose $\rho = \sqrt{P+1}$, Γ separates the
plane into two components, each of which is conformally equivalent to Grötzsch’s
domain in Figure 8.3, with r = 1/ρ. Map one of these components conformally
onto an annulus centered at the origin. Reflection maps the other component onto
an annulus centered at the origin, so altogether $B_{II}$ is mapped conformally to the
union of these two annuli, together with the circle that separates them. By Corollary
8.4.3, the module of the union is the sum of the moduli, which is $2\mu(1/\sqrt{P+1})$.
This proves the statement about the module $m_{II}(P)$.
To show that $B_{II}$ is extremal, suppose that A is a ring domain that separates {−1, 0}
from P > 0. Let f be the conformal map of $B_{II}$ onto the annulus that was constructed
in the previous paragraph. As in the proof of Theorem 8.5.1, we conclude that the
module of A is at most $m_{II}(P)$.
Teichmüller considered the general problem of a ring domain that separates two
distinct points of the sphere S from two other distinct points. We may normalize
and consider the case of separating {0, z 1 } from {z 2 , ∞}. The proof depends on two
results from Chapter 4.
Theorem 8.5.3. If the ring domain A separates 0 and z1 from z2 and ∞, then

\[ m(A) \le 2\mu\!\left(\sqrt{\frac{|z_1|}{|z_1| + |z_2|}}\right). \tag{8.5.2} \]

Proof: Let C2 be the component of the complement of A that contains {z2, ∞} and let
ϕ be the conformal map from the complement Ω of C2 onto D, for which ϕ(0) = 0
and ϕ(z1) = ζ1 > 0. Then the function

\[ g(z) = -\frac{4|z_2|\,\varphi(z)}{(1 - \varphi(z))^2} \]

maps Ω conformally onto the plane, slit along the real axis from |z2| to ∞. The
domain A is mapped onto a ring domain A′ that separates |z2| and ∞ from 0 and
g(z1),

\[ g(z_1) = -\frac{4|z_2|\,\zeta_1}{(1 - \zeta_1)^2} < 0. \]

By Theorem 8.5.2, applied after rescaling by 1/|g(z1)|,

\[ m(A) = m(A') \le 2\mu\!\left(\sqrt{\frac{-g(z_1)}{-g(z_1) + |z_2|}}\right). \tag{8.5.3} \]

Now μ is a decreasing function, so we want to show that −g(z1) ≥ |z1|. Applying
the Koebe distortion theorem, Theorem 4.1.7, to $\varphi^{-1}$, we get

\[ |z_1| = |\varphi^{-1}(\zeta_1)| \le \frac{|(\varphi^{-1})'(0)|\,\zeta_1}{(1 - \zeta_1)^2}. \tag{8.5.4} \]

By Koebe’s one-quarter theorem, Theorem 4.1.4,

\[ |(\varphi^{-1})'(0)| \le 4|z_2|. \tag{8.5.5} \]

The inequalities (8.5.4), (8.5.5) give the desired inequality −g(z1) ≥ |z1|.


We turn now to Mori’s problem.

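Theorem 8.5.4. (Mori) The extremal domain for Question III is $B_{III}$ in Figure 8.4.
The module is

\[ m_{III}(\lambda) = \frac12\, m_{II}\!\left(\frac{(2 + \sqrt{4 - \lambda^2})^2}{\lambda^2}\right) = m_I\!\left(\frac{\sqrt{4 + 2\lambda} + \sqrt{4 - 2\lambda}}{\lambda}\right). \tag{8.5.6} \]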

Proof: Let A be a ring domain that separates {z1, z2} from 0. The idea of the proof is
to convert this to case (II).
There are two single-valued branches of the square root, $\pm\sqrt z$, defined in the
complement of the unbounded component C2. The pre-image in the ζ plane of
C2 separates the two components of the pre-image of C1. We choose square roots
$\zeta_1 = \sqrt{z_1}$ and $\zeta_2 = \sqrt{z_2}$. Let ϕ be the linear fractional transformation

\[ \varphi(\zeta) = \frac{\zeta + \zeta_1}{\zeta - \zeta_1} \cdot \frac{\zeta_1 + \zeta_2}{\zeta_1 - \zeta_2}, \]

and let u = (ζ1 + ζ2)/(ζ2 − ζ1). Then

\[ \varphi(-\zeta_1) = 0, \quad \varphi(-\zeta_2) = -1, \quad \varphi(\zeta_1) = \infty, \quad \varphi(\zeta_2) = -u^2. \]

A simple calculation shows that u is imaginary, so −u² > 0. Also

\[ u = \frac{(\zeta_1 + \zeta_2)^2}{\zeta_2^2 - \zeta_1^2} = \frac{z_1 + z_2 + 2\zeta_1\zeta_2}{z_2 - z_1}. \tag{8.5.7} \]

Since

\[ |z_2 + z_1|^2 = 2(|z_2|^2 + |z_1|^2) - |z_2 - z_1|^2 \le 4 - \lambda^2, \tag{8.5.8} \]
it follows from (8.5.7) that

\[ |u| \le \frac{2 + \sqrt{4 - \lambda^2}}{\lambda}. \]

Therefore the ζ plane corresponds to case II, with

\[ P = -u^2 \le \frac{(2 + \sqrt{4 - \lambda^2})^2}{\lambda^2}. \tag{8.5.9} \]

Equality holds in (8.5.8) if and only if |z1| = |z2| = 1 and |z2 − z1| = λ. Note
that in this case the $\pm\zeta_j$ all lie on the unit circle, so their images under ϕ lie on R.
Therefore ϕ maps the circle to R. For equality to hold in (8.5.9) we need

\[ 2 + |z_1 + z_2| = 2 + \sqrt{4 - \lambda^2} = |(\zeta_1 + \zeta_2)^2| = |z_1| + |z_2| + \zeta_1\bar\zeta_2 + \bar\zeta_1\zeta_2. \]

This is only possible if z1 + z2 and the term $\bar\zeta_1\zeta_2$ and its conjugate all lie on the same
line, which must be R. This, in turn, implies that z1 = z̄2 and that each of the other
terms is ±1, according to the sign of z1 + z2. If A = $B_{III}$ and we take ζ1 = −z̄2,
then all these conditions are fulfilled, equality holds in (8.5.9), and the image of A
is $B_{II}$. Therefore $B_{III}$ is extremal for Mori’s problem and

\[ m_{III}(\lambda) = \frac12\, m_{II}(P); \qquad P = \frac{(2 + \sqrt{4 - \lambda^2})^2}{\lambda^2} = \frac{(\sqrt{4 + 2\lambda} + \sqrt{4 - 2\lambda})^2}{\lambda^2} - 1. \]

In view of (8.5.1), this proves (8.5.6).
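The algebraic identity relating the two expressions for P can be confirmed numerically. The sketch below (function names are our own) checks that P + 1 equals the square of the argument of $m_I$ in (8.5.6), and also that λ is recovered from P by λ = 4√P/(P + 1), a relation obtained by solving √P = (2 + √(4 − λ²))/λ for λ.

```python
import math


def mori_P(lam):
    """P = (2 + sqrt(4 - lam^2))^2 / lam^2 for 0 < lam < 2."""
    return (2 + math.sqrt(4 - lam * lam)) ** 2 / (lam * lam)


def grotzsch_argument(lam):
    """(sqrt(4 + 2 lam) + sqrt(4 - 2 lam)) / lam, the argument of m_I in (8.5.6)."""
    return (math.sqrt(4 + 2 * lam) + math.sqrt(4 - 2 * lam)) / lam


def lam_from_P(P):
    """Inverse relation lam = 4 sqrt(P) / (P + 1), valid for P >= 1."""
    return 4 * math.sqrt(P) / (P + 1)


checks = []
for lam in (0.1, 0.5, 1.0, 1.5, 1.9):
    P = mori_P(lam)
    # Each tuple holds lam and two residuals that should vanish up to rounding.
    checks.append((lam, P + 1 - grotzsch_argument(lam) ** 2, lam_from_P(P) - lam))
print(checks)
```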
One standard notation for the modules of these domains is used in Künzi [127].
With our normalization of ring modules, it is

m(B I ) = log Φ(R); (8.5.10)


m(B I I ) = log Ψ (P); (8.5.11)
m(B I I I ) = log X (λ). (8.5.12)

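The module calculations above can be translated into relations for these functions:

\[ \Psi(P) = \left[\Phi\!\left(\sqrt{P+1}\right)\right]^2; \qquad [\Phi(R)]^2 = \Psi(R^2 - 1) \tag{8.5.13} \]

and

\[ X(\lambda)^2 = \Psi\!\left(\left(\frac{2 + \sqrt{4 - \lambda^2}}{\lambda}\right)^{2}\right); \qquad \Psi(P) = X\!\left(\frac{4\sqrt P}{P+1}\right)^{2}. \tag{8.5.14} \]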

Another such relation is obtained by noting that

\[ w(z) = \frac14\left(z + \frac1z\right) - \frac12 = \frac14\left(\sqrt z - \frac{1}{\sqrt z}\right)^2 \]

is a conformal map of the Grötzsch domain $B_I$ onto the Teichmüller domain $B_{II}$
with $P = \frac14(\sqrt R - 1/\sqrt R)^2$. Therefore

\[ \Phi(R) = \Psi\!\left(\frac{(\sqrt R - 1/\sqrt R)^2}{4}\right). \tag{8.5.15} \]

Together with (8.5.13), (8.5.15) implies

\[ \Phi(R) = \left[\Phi\!\left(\frac{\sqrt R}{2} + \frac{1}{2\sqrt R}\right)\right]^2. \tag{8.5.16} \]

This last identity can be converted to a functional equation for Grötzsch’s module
function μ:

\[ \mu(r) = 2\mu\!\left(\frac{2\sqrt r}{1+r}\right). \tag{8.5.17} \]

This can be written in an equivalent form by solving $r_1 = 2\sqrt r/(1+r)$ for r:

\[ \mu(r) = \frac12\,\mu\!\left(\frac{(1 - \sqrt{1 - r^2})^2}{r^2}\right). \tag{8.5.18} \]

Estimates of μ(r ) are important.

Proposition 8.5.5. For 0 < r < 1,

\[ \log\frac{(1 + \sqrt{1 - r^2})^2}{r} < \mu(r) < \log\frac4r. \tag{8.5.19} \]

Proof: The function ϕ(z) = (k − z)/(kz − 1) is an automorphism of D that takes
[0, r] to [−k, k] if $k = (1 - \sqrt{1 - r^2})/r$. Then $k^{-1} = (1 + \sqrt{1 - r^2})/r$, so
$\psi(z) = k^{-1}\varphi(z)$ maps $B_r$ conformally onto B′, the complement of [−1, 1] in the
disk $D_R(0)$, where

\[ R = k^{-1} = \frac{1 + \sqrt{1 - r^2}}{r}. \]

The function $\chi(z) = (z + z^{-1})/2$ maps the annulus A = A(1, ρ) conformally onto
the ellipse $E_\rho$ with semi-axes $(\rho \pm \rho^{-1})/2$, slit along [−1, 1]; see Exercise 19. This
ellipse contains B′ if $\rho - \rho^{-1} > 2R$, which is true if ρ = 4/r. The ellipse is contained
in B′ if $\rho + \rho^{-1} \le 2R$, and since 1/(2R − r) < r, this is true if

\[ \rho = 2R - r = \frac{(1 + \sqrt{1 - r^2})^2}{r}. \]

Since μ(r) = m(B′), these considerations give us (8.5.19).

Corollary 8.5.6. The module μ satisfies

\[ \mu(r) \sim \log\frac4r \quad \text{as } r \to 0. \tag{8.5.20} \]

We conclude this section with an exact formula for μ(r).

Proposition 8.5.7. For 0 ≤ k ≤ 1, let

\[ K(k) = \int_0^1 \frac{d\zeta}{\sqrt{(1 - \zeta^2)(1 - k^2\zeta^2)}}. \tag{8.5.21} \]

Then

\[ \mu(r) = \frac\pi2\,\frac{K(\sqrt{1 - r^2})}{K(r)}. \tag{8.5.22} \]

Proof: As in the proof of Theorem 8.5.1, we begin with the map f from Q, the
upper half of D, to the upper half of the annulus $A(1, e^{\mu})$, where μ = μ(r), that
takes (0, r, 1, −1) to $(1, e^{\mu}, -e^{\mu}, -1)$. Let $g = e^{-\mu} f$, so g maps Q to the annulus
$A(e^{-\mu}, 1)$. Then g can be continued across the upper boundary of Q, giving a map
of H onto the annulus $A = A(e^{-\mu}, e^{\mu})$; see Figure 8.5.

Fig. 8.5 Mapping the upper half plane to $A(e^{-\mu}, e^{\mu})$.

Thus, A is the image of H considered as a quadrilateral $Q_1 = R(0, r, 1/r, \infty)$. The
image A has module 2μ(r)/π, so

\[ \mu(r) = \frac\pi2\, m(Q_1). \tag{8.5.23} \]

For z ∈ H, let

\[ G(z) = \int_{-\infty}^{z} \frac{dt}{\sqrt{-t(1 - rt)(1 - r^{-1}t)}}. \]
As in the discussion in Exercise 3, G maps $Q_1$ to the rectangle

\[ R(0,\ G(r),\ G(r) + iG(1/r),\ iG(1/r)). \tag{8.5.24} \]

The function $\varphi(z) = (1 + r^{-1})\,z\,(z + 1)^{-1}$ is an automorphism of H that maps $Q_1$
to $Q_2 = R(0, 1, 1/r, \infty)$. As in Exercise 3 again, for 0 < k < 1, the function

\[ F(z) = \int_0^z \frac{d\xi}{\sqrt{(1 - \xi^2)(1 - k^2\xi^2)}} \]

maps $Q_2$ to the rectangle

\[ Q_3 = R(0,\ F(1),\ F(1) + iF(1/k),\ iF(1/k)) = R(0,\ K,\ K + iK',\ iK'), \tag{8.5.25} \]

where K is defined by (8.5.21) and

\[ K' = \int_1^{1/k} \frac{d\xi}{\sqrt{(\xi^2 - 1)(1 - k^2\xi^2)}}. \]

The change of variables $x \to \sqrt{1 - k^2x^2}/\sqrt{1 - k^2}$ shows that

\[ K'(k) = K(k'), \qquad k' = \sqrt{1 - k^2}. \tag{8.5.26} \]

Therefore the rectangles (8.5.24) and (8.5.25) will coincide if we take k′ = r, so
$k = \sqrt{1 - r^2}$. Then

\[ m(Q_1) = m(Q_2) = m(Q_3) = \frac{K(\sqrt{1 - r^2})}{K(r)}, \]

so (8.5.23) gives (8.5.22).

Corollary 8.5.8. For 0 < r ≤ 1, μ(r) is a continuous decreasing function of r.

Combining (8.5.21) with (8.5.18), taking $r = 1/\sqrt2$, we find that

\[ \mu\!\left(\frac{1}{\sqrt2}\right) = \frac\pi2. \tag{8.5.27} \]

The formula (8.5.22) gives an alternative way to derive the functional equation
(8.5.17) and asymptotics like (8.5.19). For the functional equation, see Exercise 20;
for asymptotics, see Exercises 21 – 23.
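For numerical experiments, the formula (8.5.22) can be evaluated with the arithmetic-geometric mean. The sketch below relies on Gauss's classical identity K(k) = π/(2 agm(1, √(1 − k²))), which is not used in the text and is swapped in here only as a convenient way to compute K; the function names are our own. It then checks the value (8.5.27), the functional equation (8.5.17), and the bounds (8.5.19) at a few sample points.

```python
import math


def agm(a, b, iterations=40):
    """Arithmetic-geometric mean of a > 0 and b >= 0 (fixed iteration count)."""
    for _ in range(iterations):
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return a


def elliptic_K(k):
    """K(k) of (8.5.21) for 0 <= k < 1, via K(k) = pi / (2 agm(1, sqrt(1 - k^2)))."""
    return math.pi / (2.0 * agm(1.0, math.sqrt(1.0 - k * k)))


def mu(r):
    """Groetzsch module mu(r) = (pi/2) K(sqrt(1 - r^2)) / K(r) for 0 < r < 1."""
    return 0.5 * math.pi * elliptic_K(math.sqrt(1.0 - r * r)) / elliptic_K(r)


r = 0.3
print(mu(1 / math.sqrt(2)), math.pi / 2)             # compare with (8.5.27)
print(mu(r), 2 * mu(2 * math.sqrt(r) / (1 + r)))     # compare with (8.5.17)
```

The quadratic convergence of the arithmetic-geometric mean makes a fixed, small number of iterations sufficient in double precision.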

8.6 Distortion properties and Hölder continuity

Our first application of the results in Section 8.5 is to circular distortion.

Theorem 8.6.1. Suppose that f is a K-quasiconformal homeomorphism of C, and
f(0) = 0. Then there is a constant c(K) such that for each r > 0,

\[ \frac{\sup_\theta |f(re^{i\theta})|}{\inf_\theta |f(re^{i\theta})|} \le c(K). \tag{8.6.1} \]

Proof: For a given r, let z1 and z2 be the points on the circle of radius r centered
at 0 at which |f| attains its minimum and maximum values, respectively. Let A′ be
the annulus A(|f(z1)|, |f(z2)|), and let $A = f^{-1}(A')$. Then the ring domain A separates the
set {0, z1} from {z2, ∞}. Since |z1| = |z2| = r, Theorem 8.5.3 and (8.5.27) imply

\[ m(A) \le 2\mu\!\left(\sqrt{\frac{|z_1|}{|z_1| + |z_2|}}\right) = 2\mu\!\left(\frac{1}{\sqrt2}\right) = \pi. \]

Therefore m(A′) ≤ Kπ. Since m(A′) = log(|f(z2)|/|f(z1)|), we may take $c(K) = e^{K\pi}$ in (8.6.1).


In the remainder of this section we investigate properties of K -quasiconformal
maps from D to D. The culminating result is that such a map has a strong uniform
continuity property.

Proposition 8.6.2. Suppose that f is a K-quasiconformal map from D into itself,
such that f(0) = 0. Then for z ∈ D,

\[ |f(z)| \le \varphi_K(|z|), \qquad \text{where } \varphi_K(r) = \mu^{-1}(\mu(r)/K). \tag{8.6.2} \]

Proof: Slit the disk along the segment from 0 to z. The slit disk has module μ(|z|),
and its image under f has module ≤ μ(|f(z)|). Therefore

\[ \mu(|z|) \le K\,\mu(|f(z)|), \]

which is (8.6.2).
It can be shown that equality in (8.6.2) can be attained if f (D) = D; see Exercise
24.
The next step is to estimate the distortion function ϕ K .

Proposition 8.6.3. The distortion function $\varphi_K$ satisfies

\[ \varphi_K(r) \le 4^{1 - 1/K}\, r^{1/K}. \tag{8.6.3} \]

Proof: Suppose 0 < r < r′ < 1. The Grötzsch domain $B_r$ contains the annulus A =
A(r/r′, 1) and the ring domain

\[ R = \{z : |z| < r/r'\} \setminus [0, r]. \]

Since dilation by r′/r maps R onto $B_{r'}$, it follows that

\[ m(A) + m(R) = \log\frac{r'}{r} + \mu(r') \le \mu(r), \]

or

\[ \log(1/r) - \mu(r) \le \log(1/r') - \mu(r'). \]

But (8.5.19) shows that log(4/r) − μ(r) is positive for 0 < r < 1, so

\[ \frac1K\left[\log\frac4r - \mu(r)\right] \le \log\frac{4}{r'} - \mu(r'). \tag{8.6.4} \]

Let $r' = \varphi_K(r)$. Then μ(r)/K = μ(r′), so (8.6.4) becomes (8.6.3).
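The bound (8.6.3) can be explored numerically. The following sketch computes $\varphi_K(r) = \mu^{-1}(\mu(r)/K)$ by bisection, using the fact that μ is decreasing, and compares it with $4^{1-1/K} r^{1/K}$. As an assumption outside the text, μ is evaluated through the arithmetic-geometric mean, in the form μ(r) = (π/2) agm(1, √(1 − r²))/agm(1, r), which is equivalent to (8.5.22) by Gauss's identity for K; the function names are our own.

```python
import math


def agm(a, b, iterations=40):
    """Arithmetic-geometric mean of a > 0 and b >= 0 (fixed iteration count)."""
    for _ in range(iterations):
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return a


def mu(r):
    """Groetzsch module, written as (pi/2) agm(1, sqrt(1 - r^2)) / agm(1, r)."""
    return 0.5 * math.pi * agm(1.0, math.sqrt(1.0 - r * r)) / agm(1.0, r)


def phi_K(r, K, iterations=200):
    """Solve mu(s) = mu(r) / K for s in [r, 1) by bisection (mu is decreasing)."""
    target = mu(r) / K
    lo, hi = r, 1.0              # K >= 1 forces the solution s to satisfy s >= r
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mu(mid) > target:
            lo = mid             # mu(mid) too large, so the solution lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


for r in (0.05, 0.3, 0.7):
    for K in (1.0, 1.5, 2.0):
        print(r, K, phi_K(r, K), 4 ** (1 - 1 / K) * r ** (1 / K))
```

For K = 1 the two columns agree, and for K > 1 the computed value of $\varphi_K(r)$ should stay below the bound.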

Proposition 8.6.4. If f : D → D is K-quasiconformal, then for any z1, z2 ∈ D,

\[ \left|\frac{f(z_2) - f(z_1)}{1 - \overline{f(z_1)}\,f(z_2)}\right| \le \varphi_K\!\left(\left|\frac{z_2 - z_1}{1 - \bar z_1 z_2}\right|\right). \tag{8.6.5} \]

Proof: The disk automorphisms

\[ z \mapsto \frac{z - z_1}{1 - \bar z_1 z}, \qquad w \mapsto \frac{w - f(z_1)}{1 - \overline{f(z_1)}\,w} \]

map z1 to 0 and f(z1) to zero, respectively, so the result follows from Proposition
8.6.2.
The inequality (8.6.5) can be rephrased in terms of the hyperbolic metric on D,
(2.2.10):

\[ \rho(z_1, z_2) = \frac12 \log\frac{|1 - \bar z_1 z_2| + |z_1 - z_2|}{|1 - \bar z_1 z_2| - |z_1 - z_2|} = \tanh^{-1}\left|\frac{z_1 - z_2}{1 - \bar z_1 z_2}\right|. \]

Thus, we may restate Proposition 8.6.4:

Proposition 8.6.5. If f : D → D is K-quasiconformal, then for any z1, z2 ∈ D, the
hyperbolic distances satisfy

\[ \rho(f(z_1), f(z_2)) \le C(K)\,\rho(z_1, z_2), \tag{8.6.6} \]

for a suitable constant C(K).

Combining Propositions 8.6.3 and 8.6.4 with a local change of scale, we get an
important regularity result for quasiconformal maps.
Theorem 8.6.6. If f : Ω → Ω′ is K-quasiconformal, then f is locally Hölder
continuous with exponent 1/K, i.e. for each z0 in Ω there are constants δ > 0 and C such
that if $|z_j - z_0| < \delta$, j = 1, 2, then

\[ |f(z_1) - f(z_2)| \le C\,|z_1 - z_2|^{1/K}. \tag{8.6.7} \]

The next step is a more precise but more specialized result on Hölder continuity.
Theorem 8.6.7. Suppose that f is a quasiconformal map of D onto itself with maximal
dilatation K, and f(0) = 0. Then f satisfies a Hölder continuity condition: if
z1, z2 ∈ D, then

\[ |f(z_1) - f(z_2)| \le 16\,|z_1 - z_2|^{1/K}. \tag{8.6.8} \]

Proof: The inequality (8.6.8) is automatic if |z1 − z2| ≥ 1/8, so we assume that
|z1 − z2| < 1/8. Suppose first that |z1 + z2| ≤ 1. Then

\[ 4|z_1 z_2| = |(z_1 + z_2)^2 - (z_1 - z_2)^2| \le |z_1 + z_2|^2 + |z_1 - z_2|^2 < 2, \]

so |z1 z2| < 1/2, and |1 − z̄1 z2| > 1/2. Also $|1 - \overline{f(z_1)}\,f(z_2)| \le 2$. The estimates
(8.6.5) and (8.6.3) imply that

\[ |f(z_1) - f(z_2)| \le 2\varphi_K(2|z_1 - z_2|) \le 8\,(2|z_1 - z_2|)^{1/K}. \]

Now suppose that |z1 + z2| > 1, and, for the moment, assume that f extends to
the boundary of D. Then we may extend f to C by reflection across the unit circle.
The annulus

\[ A = \left\{ z : \frac{|z_1 - z_2|}{2} < \left| z - \frac{z_1 + z_2}{2} \right| < \frac12 \right\} \]

has module m(A) = log(1/|z1 − z2|). Since |z1 + z2| > 1, both 0 and ∞ are in the
unbounded component of the complement of A. The points $z_j$ belong to the closure
of A, so $\tilde A = f(A)$ separates f(z1) and f(z2) from 0 = f(0) and ∞ = f(∞). Let
λ = |f(z1) − f(z2)|. By Theorem 8.5.4, we have

\[ m(\tilde A) \le m_I\!\left(\frac{\sqrt{4 + 2\lambda} + \sqrt{4 - 2\lambda}}{\lambda}\right) \le m_I\!\left(\frac4\lambda\right). \tag{8.6.9} \]

Now $m_I(R) = \mu(1/R)$, so Proposition 8.5.5 implies that the estimate (8.6.9) gives
$m(\tilde A) \le \log(16/|f(z_1) - f(z_2)|)$. Thus,

\[ \frac1K \log\frac{1}{|z_1 - z_2|} = \frac1K\, m(A) \le m(\tilde A) \le \log\frac{16}{|f(z_1) - f(z_2)|}, \]

which is (8.6.8).
To complete the proof, we must drop the assumption that the map f continues
to the boundary. Let fr (z) = f (r z), 0 < r < 1. The image fr (D) may be mapped
conformally onto D by a unique map gr that satisfies gr (0) = 0, gr′ (0) > 0. The gr
are a family of holomorphic functions to D whose domains increase to fill D. Some
subsequence converges uniformly on compact subsets of D to an automorphism of
D. The conditions on gr imply that this automorphism is the identity. Therefore the
gr themselves converge to the identity.
It follows from Theorem 8.6.6 that the boundary of fr (D) is a Jordan curve, so gr
is continuous to the boundary by Theorem 2.6.1. Therefore each K -quasiconformal
map gr ◦ fr is continuous to the boundary and satisfies estimates (2.8.1). The fr
converges pointwise to f ; it follows from this and the convergence of gr that f itself
satisfies (2.8.1).
The Hölder exponent α = 1/K cannot be improved, in general; see Exercise 27.
Remark. The estimate (2.8.1) shows that f is uniformly continuous in D. It
follows that it extends continuously to the boundary. In other words, once we have
assumed that it extends, then we have the means to prove that it does. Note also that
f −1 is also K -quasiconformal and extends to the boundary, so the extensions are
bijective from D to itself.

Corollary 8.6.8. A surjective K-quasiconformal map f : D → D extends to a
K-quasiconformal map of C onto C.

Proof: Extend f by

\[ f(z) = \frac{1}{\overline{f(1/\bar z)}} = r \circ f \circ r(z), \qquad |z| > 1, \]

where r is the composition of the linear fractional transformation z → 1/z with the
orientation-reversing map z → z̄. Therefore f is K-quasiconformal on the exterior
region. It agrees with the original f on the unit circle. By Theorem 8.2.3, f is
K-quasiconformal on C.

Corollary 8.6.9. The family of K-quasiconformal maps f of D onto D such that
f(0) = 0 is a complete normal family.

8.7 Quasisymmetry and quasi-isometry

We know now that a quasiconformal homeomorphism of the disk extends to the
boundary. In this section, we consider the properties of the boundary homeomorphism,
and the relation between the boundary map and the map of the interior.

Suppose that $f_D$ is a K-quasiconformal map of D onto D. Up to composing with a
rotation, we may normalize by setting $f_D(1) = 1$. It is convenient now to compose with
the Cayley transform C : H → D and its inverse and examine $f = C^{-1} \circ f_D \circ C$.
Then f is a K -quasiconformal map of H onto H that extends by continuity to
h : R → R. Then h is a strictly increasing function from R onto R. The goal here
is to characterize the functions of this type that arise as boundary values of K -
quasiconformal maps of H onto H.
A strictly increasing function h from R onto R is said to be quasisymmetric if
there is a constant M such that for every x ∈ R and t > 0,

\[ \frac1M \le \frac{h(x + t) - h(x)}{h(x) - h(x - t)} \le M. \tag{8.7.1} \]
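To get a feeling for condition (8.7.1), it can be tested on a concrete function. The sketch below uses h(x) = x³, our own illustrative choice rather than an example from the text. For this h the ratio in (8.7.1) depends only on s = x/t and equals (3s² + 3s + 1)/(3s² − 3s + 1), whose maximum over s is (2 + √3)² = 7 + 4√3, attained at s = 1/√3; the code compares a sampled maximum with this value.

```python
import math


def quasisymmetry_ratio(h, x, t):
    """(h(x + t) - h(x)) / (h(x) - h(x - t)) for an increasing h and t > 0."""
    return (h(x + t) - h(x)) / (h(x) - h(x - t))


def cube(x):
    # Illustrative choice (not from the text): h(x) = x^3 is an increasing
    # homeomorphism of the real line.
    return x ** 3


# For x^3 the ratio depends only on s = x / t, so it suffices to fix t = 1
# and sample s on a grid.
samples = [quasisymmetry_ratio(cube, s / 1000.0, 1.0) for s in range(-20000, 20001)]
M_sampled = max(max(samples), 1.0 / min(samples))
M_exact = 7.0 + 4.0 * math.sqrt(3.0)
print(M_sampled, M_exact)
```

So x³ satisfies (8.7.1) with M = 7 + 4√3, although it is not normalized by h(1) = 1 in any special way that matters for this check.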

We shall see that this condition characterizes the boundary values in question. The
first half of this characterization is:

Theorem 8.7.1. If f is a K-quasiconformal map of H → H that fixes ∞, then the
boundary value h : R → R is quasisymmetric.

Proof: Suppose that x1 < x2 < x3 are three points on R, with images $\tilde x_j = h(x_j)$.
Consider the quadrilaterals

\[ Q = H(\infty, x_1, x_2, x_3), \qquad \tilde Q = H(\infty, \tilde x_1, \tilde x_2, \tilde x_3) = f(Q). \]

By assumption on f, $m(Q)/K \le m(\tilde Q) \le K\,m(Q)$.

The linear fractional transformation ϕ(z) = (z − x1)/(x2 − x1) maps (x1, x2, ∞)
to (0, 1, ∞) and takes x3 to 1/k, where

\[ k = \frac{x_2 - x_1}{x_3 - x_1}. \]

As we know from earlier calculations, the module

\[ m(H(0, 1, 1/k, \infty)) = \frac2\pi\,\mu(k). \]

Therefore

\[ m(Q) = \frac2\pi\,\mu(k); \qquad m(\tilde Q) = \frac2\pi\,\mu(\tilde k), \qquad \tilde k = \frac{\tilde x_2 - \tilde x_1}{\tilde x_3 - \tilde x_1}. \]

We now specialize to the case x1 = x − t < x2 = x < x + t = x3, so that k = 1/2.
Using the inequality (8.5.19), which implies log(1/r) < μ(r) < log(4/r), together
with

\[ \frac1K\, m(\tilde Q) \le m(Q) = \frac2\pi\,\mu(1/2) \le K\, m(\tilde Q), \]

we find that

\[ \frac14\, e^{\mu(1/2)/K} \le \frac{1}{\tilde k} = \frac{h(x + t) - h(x)}{h(x) - h(x - t)} + 1 \le e^{K\mu(1/2)}. \tag{8.7.2} \]

Since

\[ \frac{h(x + t) - h(x)}{h(x) - h(x - t)} = \frac{1}{\tilde k} - 1, \]

the estimate (8.7.2) implies an estimate of the form (8.7.1), where M depends only
on K.
We take H (M) to be the set of quasisymmetric functions h that satisfy the nor-
malization conditions
h(0) = 0, h(1) = 1.

Beurling and Ahlfors [26] showed that each h in H (M) is the boundary value of a
K -quasiconformal map of H onto H. We follow the (much simpler) argument given
in [132] and [131].
Lemma 8.7.2. The family H (M) is a complete normal family on R.

Proof: The left-hand inequality in (8.7.1) implies that

\[ h(2^{1-n}) - h(2^{-n}) \ge \frac{h(2^{-n})}{M}, \qquad n = 0, 1, 2, \dots. \]

Therefore

\[ h(2^{-n}) \le \left(\frac{M}{M+1}\right)^n. \tag{8.7.3} \]

If $0 \le x < 2^{-n}$, then

\[ 0 \le h(a + x) - h(a) \le \left(\frac{M}{M+1}\right)^n [h(a + 1) - h(a)]. \]

Thus, for m ≤ a < m + 1,

\[ h(a + 1) - h(a) \le M^m\,[h(a + 1 - m) - h(a - m)] \le M^m h(2) \le M^m (M + 1). \]

Thus, (8.7.3) implies equicontinuity. Therefore H(M) is a normal family. If $\{h_n\}$ is
a sequence that converges uniformly on each bounded interval, then clearly the limit
h satisfies (8.7.1).

Lemma 8.7.3. For any bounded interval [a, b] ⊂ R and any ε > 0, there is a δ > 0
such that if h ∈ H (1 + δ), then x ∈ [a, b] implies |h(x) − x| < ε.

Proof: Suppose that for some ε > 0 there is a sequence of functions h n ∈ H (1 + 1/n)
and a sequence of points xn ∈ [a, b] such that |h n (xn ) − xn | ≥ ε. By Lemma 8.7.2,
H (2) is a normal family, so there is a subsequence of {h n } that converges uniformly
on [a, b]. The limit is in H (1), so it is the identity.

The following important construction is the Beurling–Ahlfors extension.


Theorem 8.7.4. If h belongs to H (M), then there is a K -quasiconformal map of
H onto H whose extension to the boundary is h, where K depends only on M, and
K → 1 as M → 1. The map extends to a quasiconformal map of C onto C.

Proof: Define

\[ f(x + iy) = \frac12 \int_0^1 [h(x + ty) + h(x - ty)]\,dt + \frac i2 \int_0^1 [h(x + ty) - h(x - ty)]\,dt. \tag{8.7.4} \]

Then f|R = h. Note that if h is the identity map on R, the extension is the identity
map on C.
For y ≠ 0, let

\[ \alpha(x, y) = \int_0^1 h(x + ty)\,dt = \frac1y \int_x^{x+y} h(\xi)\,d\xi; \]
\[ \beta(x, y) = \int_0^1 h(x - ty)\,dt = \frac1y \int_{x-y}^{x} h(\xi)\,d\xi. \]

Then

\[ f(x + iy) = u(x, y) + iv(x, y) = \tfrac12(\alpha + \beta) + \tfrac12 i(\alpha - \beta), \qquad y \ne 0. \]

Clearly, α and β are $C^1$ on C \ R. Moreover, α(x, −y) = β(x, y), from which it
follows that

\[ f(\bar z) = \overline{f(z)}. \tag{8.7.5} \]

Since h is strictly increasing, f maps the upper half plane H into itself, and the lower
half-plane into itself. We concentrate for now on H: y > 0.
Note that α and β represent the mean value of h over the intervals [x, x + y] and
[x − y, x], respectively. Since h is strictly increasing, this implies that

α > β; αx > βx > 0; α y > −β y > 0. (8.7.6)

We know that f is single-valued from R to R. It is also single-valued on C \ R.


It is enough to prove this in H. Suppose that z j = x j + i y j ∈ H, j = 1, 2, with
α(x1 , y1 ) = α(x2 , y2 ). Assume x1 ≤ x2 . From the equality of the two mean values,
we see that y1 ≥ y2 . Turning to β, we see that this implies that x1 ≥ x2 . Thus, x1 = x2
which implies that y1 = y2 .
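The inequalities (8.7.6) imply that the Jacobian of f in H satisfies
\[
\begin{vmatrix} u_x & u_y \\ v_x & v_y \end{vmatrix}
= \frac{1}{4}\begin{vmatrix} \alpha_x + \beta_x & \alpha_y + \beta_y \\ \alpha_x - \beta_x & \alpha_y - \beta_y \end{vmatrix}
= \frac{\alpha_y\beta_x - \alpha_x\beta_y}{2} > 0. \tag{8.7.7}
\]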

Therefore f preserves orientation. It follows also that if a sequence $\{z_n\} \subset H$ converges to a point $z \in H$, then f(H) contains a neighborhood of f(z). Therefore the boundary of f(H) consists only of R, so f : H → H is bijective. By (8.7.5), f : C → C is bijective.
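The positivity of the Jacobian in (8.7.7) can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the sample function h(x) = x + ½ sin x (an increasing homeomorphism of R chosen for convenience) and all helper names are hypothetical, not taken from the text. It evaluates α and β in closed form from an antiderivative of h, forms f = ½(α + β) + ½ i(α − β), and compares a finite-difference Jacobian with (α_y β_x − α_x β_y)/2.

```python
import math

def h(x):
    # Sample increasing homeomorphism of R (an assumption for illustration).
    return x + 0.5 * math.sin(x)

def H(x):
    # Antiderivative of h, so the mean values alpha and beta have closed forms.
    return 0.5 * x * x - 0.5 * math.cos(x)

def alpha(x, y):
    # Mean value of h over [x, x + y].
    return (H(x + y) - H(x)) / y

def beta(x, y):
    # Mean value of h over [x - y, x].
    return (H(x) - H(x - y)) / y

def extension(x, y):
    # f = (alpha + beta)/2 + i (alpha - beta)/2 for y != 0.
    a, b = alpha(x, y), beta(x, y)
    return complex(0.5 * (a + b), 0.5 * (a - b))

def jacobian_formula(x, y):
    # (alpha_y beta_x - alpha_x beta_y) / 2, with the partials written out.
    a, b = alpha(x, y), beta(x, y)
    ax = (h(x + y) - h(x)) / y
    bx = (h(x) - h(x - y)) / y
    ay = (h(x + y) - a) / y
    by = (h(x - y) - b) / y
    return 0.5 * (ay * bx - ax * by)

def jacobian_fd(x, y, d=1e-5):
    # Central differences for u_x v_y - u_y v_x.
    fx = (extension(x + d, y) - extension(x - d, y)) / (2 * d)
    fy = (extension(x, y + d) - extension(x, y - d)) / (2 * d)
    return fx.real * fy.imag - fy.real * fx.imag

for point in [(0.3, 0.7), (-2.0, 1.5), (5.0, 0.2)]:
    print(point, jacobian_formula(*point), jacobian_fd(*point))
```

At each sample point in the upper half-plane the two values should agree closely and be positive, in line with the sign conditions on α_x, β_x, α_y, β_y.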
To this point, we have not used the assumption of quasisymmetry, or of the normalization h ∈ H(M); h could be any increasing homeomorphism of R. As we know, the Jacobian of f can be expressed as $|f_z|^2 - |f_{\bar z}|^2$. The inequality (8.7.7) shows that f is quasiconformal on each compact subset of C \ R.
We now invoke the assumption h ∈ H(M). Note that for any affine maps A j (z) =
a j z + b j with a j > 0 and b j ∈ R, f 1 = A1 ◦ f ◦ A2 is the Beurling–Ahlfors exten-
sion of h 1 = A1 ◦ h ◦ A2 . Given any point z 0 ∈ H, we may choose A1 and A2 in
such a way that h 1 is normalized and f (z 0 ) = f 1 (i).
Suppose that f is not quasiconformal. Then there is a sequence of points {z n } in H
such that the dilatation D f (z n ) → ∞. As just noted, we may normalize and obtain a
sequence { f n } of normalized maps such that f n (i) = f (z n ). Passing to a subsequence
and renumbering, we may assume that {h n = f n |R } converges uniformly on bounded
intervals in R. This implies that the Jacobians of the Beurling–Ahlfors extensions
converge uniformly in a neighborhood of i. Therefore the limit has finite dilatation
at i, contradicting the assumption.
Finally, suppose that the last assertion in the statement of the theorem is not true. Then there is a sequence $h_n \in H(1 + 1/n)$ such that the maximal dilatation $K_{f_n} \ge 1 + \varepsilon > 1$, where $f_n$ is the extension of $h_n$. Once again we may renormalize so that $D_{f_n}(i) \ge 1 + \varepsilon$, pass to a subsequence and assume uniform convergence on bounded intervals. The extension f of the limit function h has $D_f(i) \ge 1 + \varepsilon$. However, by Lemma 8.7.3, the limit function h is the identity, so the extension f is the identity, and $D_f(i) = 1$.
Note that the Beurling–Ahlfors extension can be transferred to D by means of the
Cayley transform. A different extension from the boundary of D to D, with better
invariance properties, is due to Douady and Earle [57].
An important property of the extension is its relation to the hyperbolic metric.
A homeomorphism f : H → H is said to be a quasi-isometry if there is a constant C > 0 such that
\[
\frac{1}{C}\,\frac{|dz|}{\operatorname{Im} z} \le \frac{|df(z)|}{\operatorname{Im} f(z)} \le C\,\frac{|dz|}{\operatorname{Im} z}. \tag{8.7.8}
\]

Theorem 8.7.5. The Beurling–Ahlfors extension f of a quasisymmetric function h


on R is a quasi-isometry.

Proof: The map f is K-quasiconformal for some K. Given z ∈ H,
\[
\frac{|df(z)|}{|dz|} \le \sup_a |\partial_a f(z)| \le \sqrt{K\,J_f(z)},
\]
where $J_f$ is the Jacobian. Therefore the right side of (8.7.8) will follow from
\[
K\,J_f(z)\,(\operatorname{Im} z)^2 \le C^2\,(\operatorname{Im} f(z))^2, \qquad z \in H. \tag{8.7.9}
\]



Once again we assume that (8.7.9) is false, and choose a sequence of normalized k-quasisymmetric functions $h_n$ that converge uniformly on compact sets to a k-quasisymmetric function h, such that the Beurling–Ahlfors extensions $f_n$ satisfy
\[
\frac{J_{f_n}(i)}{(\operatorname{Im} f_n(i))^2} \to \infty. \tag{8.7.10}
\]
But the Beurling–Ahlfors extension f of h satisfies $J_{f_n}(i) \to J_f(i)$ and $\operatorname{Im} f_n(i) \to \operatorname{Im} f(i)$, contradicting (8.7.10). The same kind of argument proves the other half of (8.7.8).

8.8 Complex dilatation; the Beltrami equation

Throughout this section, we denote by $C_0^m$ the space of functions g : C → C such that g belongs to $C^m$ and g has compact support, i.e. g vanishes outside some bounded set.
In Section 8.1, we considered maps f that satisfy
\[
\left|\frac{f_{\bar z}(z)}{f_z(z)}\right| \le k, \tag{8.8.1}
\]

where k < 1 is a constant. The argument in Section 8.1 shows that at any point where
the derivative of f exists and satisfies (8.8.1), the dilatation of f at that point is K (z) ≤
(1 + k)/(1 − k). Therefore, if f : Ω → Ω  is a C 1 homeomorphism that satisfies
(8.8.1) at each point, then f is K -quasiconformal, with K ≤ (1 + k)/(1 − k).
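For an R-linear map the bound K ≤ (1 + k)/(1 − k) is attained, which gives a quick numerical sanity check. The sketch below is an illustration only; the function name `dilatation_linear` and the sample coefficients are assumptions. It treats z ↦ az + b z̄ (so f_z = a, f_z̄ = b) as a real 2×2 matrix and compares the ratio of its singular values with (1 + |μ|)/(1 − |μ|), where μ = b/a.

```python
import numpy as np

def dilatation_linear(a, b):
    """Ratio of singular values of z -> a*z + b*conj(z), viewed as a real 2x2 matrix."""
    c1 = a + b            # image of 1
    c2 = 1j * (a - b)     # image of i
    m = np.array([[c1.real, c2.real],
                  [c1.imag, c2.imag]])
    s = np.linalg.svd(m, compute_uv=False)  # singular values, largest first
    return s[0] / s[1]

# Sample coefficients with |b| < |a| (hypothetical values).
a, b = 2.0 + 1.0j, 0.5 - 0.3j
mu = b / a
K_svd = dilatation_linear(a, b)
K_formula = (1 + abs(mu)) / (1 - abs(mu))
print(K_svd, K_formula)
```

The two printed numbers should coincide up to rounding, since the singular values of this map are |a| + |b| and |a| − |b|.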
Let us rewrite (8.8.1) in the form of a differential equation, of a type known as a Beltrami equation:
\[
\frac{\partial f}{\partial \bar z} = \mu(z)\,\frac{\partial f}{\partial z}, \qquad |\mu(z)| \le k < 1. \tag{8.8.2}
\]

In this section, we discuss the existence of a solution to (8.8.2), given some conditions
on the function μ.
Remark. The function μ here is not to be confused with the Grötzsch module function
of Section 8.5 and Section 8.6. (In both cases we are following standard usage.)

The strategy is to convert the Beltrami equation into a system
\[
f_{\bar z} = g, \qquad f_z = Tg, \tag{8.8.3}
\]
so that (8.8.2) becomes the 2-step process:
\[
f = Pg, \qquad g = \mu\,Tg.
\]
The first step is then to solve $f_{\bar z} = g$ for some reasonably large class of functions g. This is accomplished by the (two-dimensional) Cauchy transform P:
\begin{align}
Pg(z) &= -\frac{1}{\pi}\int_{C} g(w)\left[\frac{1}{w - z} - \frac{1}{w}\right] dx\,dy \notag\\
&= \frac{1}{2\pi i}\int_{C} g(w)\left[\frac{1}{w - z} - \frac{1}{w}\right] dw \wedge d\bar w \tag{8.8.4}\\
&= \frac{1}{2\pi i}\int_{C} g(z + w)\left[\frac{1}{w} - \frac{1}{z + w}\right] dw \wedge d\bar w. \tag{8.8.5}
\end{align}

As we shall see, for appropriate g, f = Pg is a solution to $f_{\bar z} = g$. Moreover, $f_z = Tg$, where T is the Hilbert transform (in the plane),
\begin{align}
Tg(z) &= -\frac{1}{\pi}\lim_{\varepsilon\to 0}\int_{\varepsilon<|w-z|<1/\varepsilon} \frac{g(w)}{(w - z)^2}\,dx\,dy \notag\\
&= \frac{1}{2\pi i}\lim_{\varepsilon\to 0}\int_{\varepsilon<|w-z|<1/\varepsilon} \frac{g(w)}{(w - z)^2}\,dw \wedge d\bar w. \tag{8.8.6}
\end{align}
Transforming to polar coordinates centered at z shows that T vanishes on constant functions. Therefore, if g belongs to $C_0^1$ and R is large enough,
\[
Tg(z) = T(g - g(z))(z) = \frac{1}{2\pi i}\int_{|w|<2R} \frac{g(z + w) - g(z)}{w^2}\,dw \wedge d\bar w.
\]

Moreover, in this case, T g is easily seen to be continuous.

Lemma 8.8.1. Suppose that g belongs to C01 . Then Pg is a C 1 function and

(Pg)z̄ = g; (Pg)z = T g. (8.8.7)

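Proof: Differentiate under the integral sign to obtain
\begin{align*}
(Pg)_{\bar z} &= \frac{1}{2\pi i}\int_{C} \frac{g_{\bar z}(z + w)}{w}\,dw \wedge d\bar w \\
&= \frac{1}{2\pi i}\int_{C} \frac{g_{\bar w}(w)}{w - z}\,dw \wedge d\bar w \\
&= -\frac{1}{2\pi i}\int_{C} \frac{dg(w)}{w - z} \wedge dw.
\end{align*}
Let $\Omega_\varepsilon = \{w : |w| > \varepsilon\}$. Applying Stokes's theorem, we obtain
\begin{align*}
(Pg)_{\bar z} &= -\lim_{\varepsilon\to 0}\frac{1}{2\pi i}\int_{\Omega_\varepsilon} d\left[\frac{g(w + z)}{w}\,dw\right] \\
&= \lim_{\varepsilon\to 0}\frac{1}{2\pi i}\int_{|w|=\varepsilon} \frac{g(w + z)}{w}\,dw = g(z).
\end{align*}
A similar calculation shows that
\begin{align*}
(Pg)_z &= \lim_{\varepsilon\to 0}\left[-\frac{1}{2\pi i}\int_{|w|=\varepsilon} \frac{g(w + z)}{w}\,d\bar w + \frac{1}{2\pi i}\int_{|w|>\varepsilon} \frac{g(w + z)}{w^2}\,dw \wedge d\bar w\right] \\
&= Tg(z).
\end{align*}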

Lemma 8.8.2. Suppose that g belongs to C02 . Then


P(gz )(z) = T g(z) − T g(0). (8.8.8)

Proof: Following the procedure in the proof of Lemma 8.8.1, we may write
\begin{align*}
P(g_z)(z) &= -\frac{1}{\pi}\int_{C} g_w(w)\left[\frac{1}{w - z} - \frac{1}{w}\right] dx\,dy \\
&= -\frac{1}{2\pi}\int_{C} [g_x(w) - i g_y(w)]\left[\frac{1}{w - z} - \frac{1}{w}\right] dx\,dy.
\end{align*}
The last line gives two integrals. One can be integrated by parts in x so long as $y \ne 0$, and the other can be integrated by parts in y if $x \ne 0$. Integration over C does not see the exceptional lines, so integration by parts leads immediately to (8.8.8).

Lemma 8.8.3. Suppose g belongs to $C_0^3$. Then Tg is a $C^1$ function and
\[
\int_{C} |Tg(z)|^2\,dx\,dy = \int_{C} |g(z)|^2\,dx\,dy. \tag{8.8.9}
\]

Proof: Apply Lemmas 8.8.1 and 8.8.2 to $g_z$ to find that
\begin{align*}
(Tg)_{\bar z} &= (Pg)_{z\bar z} = (Pg_z)_{\bar z} = g_z; \\
(Tg)_z &= (Pg_z)_z = T(g_z) = P(g_{zz})(z) + T(g_z)(0).
\end{align*}
The assumption implies that $g_{zz}$ belongs to $C_0^1$. Therefore both $(Tg)_{\bar z}$ and $(Tg)_z$ are continuous. Thus, Tg is in $C^1$. The assumption that g has compact support implies that $Tg(z) = O(|z|^{-2})$ as z → ∞. Therefore both sides of (8.8.9) are finite. Moreover, the following integrations-by-parts are justified:
\[
\int_{C} Tg\,\overline{Tg} = \int_{C} (Pg)_z\,\overline{(Pg)_z} = -\int_{C} Pg\,\overline{(Pg)_{z\bar z}} = -\int_{C} Pg\,(\bar g)_{\bar z} = \int_{C} g\,\bar g.
\]

Since C02 is dense in L 2 (C), we have



Corollary 8.8.4. T extends to an isometry of L 2 (C).


On the other hand, as we shall see, the integral defining Pg is only guaranteed
to converge if g belongs to L p (C) for 2 < p ≤ ∞. However, this difficulty can be
overcome. The key is the Calderón–Zygmund inequality:
Theorem 8.8.5. The operator T extends to a bijective map from L p to L p , 1 < p <
∞, and
||T g|| p ≤ C p ||g|| p , (8.8.10)

where 1 ≤ C p < ∞ and C p → 1 as p → 2.


A proof is given in the next section. Let us turn to the operator P.
Theorem 8.8.6. For $g \in L^p$, 2 < p < ∞, Pg vanishes at z = 0 and satisfies a Hölder continuity condition
\[
|Pg(z_1) - Pg(z_2)| \le K_p\,\|g\|_p\,|z_1 - z_2|^{1-2/p}. \tag{8.8.11}
\]

Proof: Given z ∈ C, the function $h_z(w) = |(w - z)^{-1} - w^{-1}| = |z|\,|(w - z)w|^{-1}$ is $O(|w|^{-1} + |w - z|^{-1})$ near the singularities and is $O(|w|^{-2})$ as w → ∞. Therefore it belongs to $L^q$ for 1 < q < 2. By Hölder's inequality, the integral Pg(z) converges so long as $g \in L^p(C)$, 2 < p < ∞. In fact
\[
|Pg(z)| \le \|g\|_p\,\|h_z\|_q, \qquad \frac{1}{p} + \frac{1}{q} = 1, \quad p > 2,
\]
i.e. q = p/(p − 1). Writing $w = x + iy = |z|\zeta = |z|(\xi + i\eta)$, we have
\begin{align*}
\|h_z\|_q^q &= |z|^q \int_{C} \frac{dx\,dy}{|(w - z)w|^q} \\
&= |z|^{2-q} \int_{C} \frac{d\xi\,d\eta}{|(\zeta - 1)\zeta|^q} \\
&= |z|^{2-q}\,(K_p)^q,
\end{align*}

where $K_p$ is constant. Since (2 − q)/q = 1 − 2/p, we have
\[
|Pg(z)| \le |z|^{1-2/p}\,K_p\,\|g\|_p, \qquad p > 2, \quad q = p/(p - 1). \tag{8.8.12}
\]
Then
\begin{align*}
Pg(z_2) - Pg(z_1) &= -\frac{1}{\pi}\int_{C} g(w)\left[\frac{1}{w - z_2} - \frac{1}{w - z_1}\right] dx\,dy \\
&= -\frac{1}{\pi}\int_{C} g(w + z_1)\left[\frac{1}{w - (z_2 - z_1)} - \frac{1}{w}\right] dx\,dy \\
&= P\tilde g(z_2 - z_1), \qquad \tilde g(z) = g(z + z_1).
\end{align*}
Now $\|\tilde g\|_p = \|g\|_p$, so (8.8.12) gives (8.8.11).

We know now that for g ∈ C03 , (Pg)z̄ = g, (Pg)z = T g, and Pg is Hölder con-
tinuous. These results can be carried over, in a certain sense, for g ∈ L p : the “weak”
sense, or the sense of distribution theory, as in Section 2.9.

Proposition 8.8.7. If $g \in L^p$ for some 2 < p < ∞, then f = Pg satisfies the Hölder condition (8.8.11) and is a weak solution of equations (8.8.3).

Proof: We know that these equations are true in the usual (“strong”) sense if g is
in C01 . The space C01 is dense in L p ; Theorems 8.8.5 and 8.8.6 allow passage to the
limit.
We are now prepared to solve the Beltrami equation, in several steps. Throughout,
μ will be a measurable function with |μ(z)| ≤ k < 1, all z ∈ C. Since (8.8.2) can
only specify f up to an additive constant and a multiplicative constant, we shall
normalize by requiring f (0) = 0 and f (1) = 1.
Given k < 1, we fix p = p(k) > 2 with k C p < 1, where C p is the constant in
Theorem 8.8.5.

Theorem 8.8.8. If μ has compact support, then (8.8.2) has a unique normalized solution f such that $f_z - 1$ belongs to $L^p$. Moreover f is Hölder continuous:
\[
|f(z_1) - f(z_2)| \le \frac{K_p}{1 - k\,C_p}\,\|\mu\|_p\,|z_1 - z_2|^{1-2/p} + |z_1 - z_2|. \tag{8.8.13}
\]

Proof: Suppose first that f is a solution of (8.8.2). Then

f z̄ = μf z = μ( f z − 1) + μ

belongs to $L^p$. The function $F = f - P(f_{\bar z})$ is a weak solution of $F_{\bar z} = 0$. By Theorem 2.9.3, F is an entire function. But
\[
F' - 1 = (f_z - 1) - T(f_{\bar z}) \in L^p,
\]
so F(z) = z + c. The normalization implies that the constant c = 0, so we must have
\[
f(z) = P(f_{\bar z})(z) + z = P(\mu f_z) + z; \qquad f_z = T(\mu f_z) + 1. \tag{8.8.14}
\]

If g is another solution, then f z − gz = ( f z − 1) − (gz − 1) belongs to L p and

|| f z − gz || p = ||T (μ( f z − gz ))|| p ≤ k C p || f z − gz || p .

Therefore gz − f z = 0 a.e. Then the Beltrami equation implies ( f − g)z̄ = 0 a.e.,


and all this is true of ( f¯ − ḡ) as well, so f − g and f¯ − ḡ are analytic. Therefore
f − g is constant, and the normalization gives f = g.
We have proved uniqueness, but the argument tells us how to prove existence.
According to (8.8.14), we want

f z − 1 = T (μ( f z − 1)) + T μ.

The operator Sg = T (μg) has a norm less than 1 as an operator in L p . Therefore for
h ∈ L p , the series

T μ + S(T μ) + S 2 (T μ) + · · · + S n (T μ) + . . . (8.8.15)

converges in norm to the unique solution h ∈ L p of h = Sh + T μ = T (μ(h + 1))


(apply I − S to the series). Since μ has compact support, μ(h + 1) is also in $L^p$, and the construction shows that
\[
\|\mu(h + 1)\|_p \le \frac{1}{1 - k\,C_p}\,\|\mu\|_p. \tag{8.8.16}
\]

Thus, we may define


f = P(μ(h + 1)) + z. (8.8.17)

Then
f z̄ = μ(h + 1), f z = T (μ(h + 1)) + 1 = h + 1. (8.8.18)

Therefore f z − 1 = h is in L p and f is a (distribution) solution of (8.8.2). The


estimate (8.8.13) follows from (8.8.16), (8.8.17), and (8.8.11).
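The convergence of the series (8.8.15) is the usual Neumann-series argument for an operator of norm less than 1. The following sketch is a finite-dimensional stand-in only: the matrix `S` and vector `b` are random placeholders playing the roles of the operator S and of Tμ, and the name `neumann_sum` is hypothetical. It does not discretize the actual transform T.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
A = rng.standard_normal((n, n))
S = 0.6 * A / np.linalg.norm(A, 2)   # operator (spectral) norm 0.6 < 1
b = rng.standard_normal(n)            # stands in for T(mu)

def neumann_sum(S, b, terms):
    """Partial sum b + S b + S^2 b + ... with the given number of terms."""
    total = np.zeros_like(b)
    term = b.copy()
    for _ in range(terms):
        total += term
        term = S @ term
    return total

h_series = neumann_sum(S, b, 80)
h_exact = np.linalg.solve(np.eye(n) - S, b)   # solves h = S h + b directly
print(np.max(np.abs(h_series - h_exact)))
```

The partial sums approach the solution of h = Sh + b geometrically, and the norm of the limit is at most ‖b‖/(1 − ‖S‖), which is the finite-dimensional analogue of the estimate behind (8.8.16).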
The solution f just constructed is termed the normal solution of the Beltrami
equation. We want to show that the normal solution is a K -quasiconformal map,
K = (1 + k)/(1 − k). We begin by showing that if μ above has some regularity,
then f is C 1 .

Lemma 8.8.9. Suppose that μ in Theorem 8.8.8 has a distribution derivative μz ∈


L p , p > 2. Then the normal solution f is a C 1 function.

Proof: Consider the system f z = λ, f z̄ = λμ. By Theorem 2.9.5, this system has a
C 1 solution f if λ has a weak derivative λz̄ ∈ L p , λμ has a weak derivative (λμ)z
in L p , and
λz̄ = (μλ)z = λz μ + λμz . (8.8.19)

Dividing by λ, we want
(log λ)z̄ = μ(log λ)z + μz .

We can solve q = T (μq) + T (μz ) for q ∈ L p . Let

σ = P(μq + μz ) + c,

where the constant c is chosen so that lim z→∞ σ (z) = 0. Then σ is continuous and

σz̄ = μq + μz ; σz = q.

Therefore λ = eσ satisfies (8.8.19), and f z = λ, f z̄ = λμ has a C 1 solution.



Lemma 8.8.10. Suppose that the normal solution f in Theorem 8.8.8 is a $C^1$ function. Then f is K-quasiconformal, K = (1 + k)/(1 − k).

Proof: We know from (8.3.11) that f is locally invertible and that the local inverse satisfies
\[
(f^{-1})_\zeta = \frac{\overline{f_z}}{J_f}, \qquad (f^{-1})_{\bar\zeta} = -\frac{f_{\bar z}}{J_f},
\]
where we have written ζ = f(z) and $J_f$ is the Jacobian of f: $J_f = |f_z|^2 - |f_{\bar z}|^2$. Therefore $f^{-1}$ is a (local) solution of the Beltrami equation with coefficient
\[
\tilde\mu(\zeta) = -\frac{f_{\bar z}(z)}{\overline{f_z(z)}}.
\]

Then, writing ζ = s + it and using Hölder's inequality, we obtain
\begin{align*}
\int_{C} |\tilde\mu(\zeta)|^p\,ds\,dt &= \int_{C} |\mu(z)|^p\,|J_f(z)|\,dx\,dy \\
&= \int_{C} |\mu|^p\,(|f_z|^2 - |f_{\bar z}|^2)\,dx\,dy \\
&\le \int_{C} |\mu|^p\,|f_z|^2\,dx\,dy = \int_{C} |\mu|^{p-2}\,|f_{\bar z}|^2\,dx\,dy \\
&\le \left(\int_{C} |\mu|^p\,dx\,dy\right)^{(p-2)/p} \left(\int_{C} |f_{\bar z}|^p\,dx\,dy\right)^{2/p}.
\end{align*}
Therefore, by (8.8.16),
\[
\|\tilde\mu\|_p \le (1 - k\,C_p)^{-2/p}\,\|\mu\|_p.
\]
It follows that the normal solution to the Beltrami equation with coefficient $\tilde\mu$ satisfies
\[
|z_1 - z_2| \le K_p\,(1 - k\,C_p)^{-1-2/p}\,\|\mu\|_p\,|f(z_1) - f(z_2)|^{1-2/p} + |f(z_1) - f(z_2)|.
\]
Therefore f is globally invertible, hence a K-quasiconformal map.


We now remove the assumption that the normal solution is a C 1 function.

Theorem 8.8.11. The normal solution of the Beltrami equation is K -quasiconfor-


mal.

Proof: Let $\{\mu_n\}$ be a sequence of functions with distribution derivatives $(\mu_n)_z \in L^p$, supported in some compact subset of C, uniformly bounded by k, and such that $\|\mu_n - \mu\|_p \to 0$. Let $\{f_n\}$ and f be the corresponding normal solutions of the Beltrami equation. Then $\|(f_n)_z - f_z\|_p \to 0$. Consider the associated functions $h_n$ and h constructed in the proof of Theorem 8.8.8: $(I - S)h_n = T\mu_n$. Then the series solution shows that
\begin{align*}
\|h_n - h\|_p &\le \|T(\mu_n - \mu)\|_p + k\,C_p\,\|T(\mu_n - \mu)\|_p + (k\,C_p)^2\,\|T(\mu_n - \mu)\|_p + \dots \\
&\le \frac{C_p}{1 - k\,C_p}\,\|\mu_n - \mu\|_p \to 0.
\end{align*}
Therefore (8.8.17) and Theorem 8.8.6 imply uniform convergence on bounded sets. By Theorem 8.2.5, f is K-quasiconformal.
We would like to drop the requirement that μ has compact support. Let us start
with an observation about composition of regular quasiconformal maps. Suppose
that f and g are such maps, with Beltrami coefficients μ f and μg . Write ζ for f (z).
Then

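\begin{align*}
(g \circ f)_{\bar z} &= (g_\zeta \circ f)\,f_{\bar z} + (g_{\bar\zeta} \circ f)\,\overline{f_z}; \\
(g \circ f)_z &= (g_\zeta \circ f)\,f_z + (g_{\bar\zeta} \circ f)\,\overline{f_{\bar z}}.
\end{align*}
This system can be solved for $\mu_g \circ f$:
\[
\mu_g \circ f = \frac{f_z}{\overline{f_z}} \cdot \frac{\mu_{g\circ f} - \mu_f}{1 - \bar\mu_f\,\mu_{g\circ f}}. \tag{8.8.20}
\]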
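Formula (8.8.20) can be tested on R-linear maps, where every quantity is a constant. The sketch below is an illustration under assumptions: the helper names and the sample coefficients are hypothetical. For f(z) = az + b z̄ and g(w) = cw + d w̄ it computes the Beltrami coefficient of g ∘ f directly and checks that the right-hand side of (8.8.20) returns μ_g = d/c.

```python
def beltrami_linear(a, b):
    # Complex dilatation of z -> a*z + b*conj(z): f_zbar / f_z = b / a.
    return b / a

def compose_linear(c, d, a, b):
    # g(w) = c*w + d*conj(w), f(z) = a*z + b*conj(z).
    # Returns (A, B) with (g o f)(z) = A*z + B*conj(z).
    return c * a + d * b.conjugate(), c * b + d * a.conjugate()

# Sample coefficients with |b| < |a| and |d| < |c| (hypothetical values).
a, b = 1.5 + 0.5j, 0.2 - 0.4j
c, d = 0.8 - 1.1j, 0.3 + 0.1j

mu_f = beltrami_linear(a, b)
mu_g = beltrami_linear(c, d)
A, B = compose_linear(c, d, a, b)
mu_gf = B / A

# Right-hand side of (8.8.20); here f_z = a is constant.
rhs = (a / a.conjugate()) * (mu_gf - mu_f) / (1 - mu_f.conjugate() * mu_gf)
print(mu_g, rhs)
```

Because μ_g is constant in this example, μ_g ∘ f = μ_g, so the two printed values should agree up to rounding.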

Theorem 8.8.12. For any measurable function μ with |μ(z)| ≤ k < 1 a.e., there is a unique K-quasiconformal map f : C → C, K = (1 + k)/(1 − k), such that f is a weak solution of the Beltrami equation (8.8.2), and f(0) = 0, f(1) = 1.

Proof: We assume here that various coefficients μ are sufficiently regular that the
solutions are C 1 . The general result follows by an approximation argument.
If μ has compact support, we need only divide the normal solution f by f (1).
Suppose that $\mu = \mu_1 + \mu_2$, where $\mu_1$ has compact support and $\mu_2$ is ≡ 0 in a neighborhood of 0. Let
\[
\nu(z) = \mu_2\!\left(\frac{1}{z}\right)\frac{z^2}{\bar z^2}.
\]
Then ν has compact support, so there is a corresponding normalized solution $f^\nu$. A computation shows that
\[
f^{\mu_2}(z) = \frac{1}{f^\nu(1/z)}.
\]
The next step is to assume that there is a solution $f^\mu$ and see how it can be written as a composition
\[
f^\mu = f^\lambda \circ f^{\mu_2}.
\]
The preceding computation for composite functions shows that the coefficient λ should be
\[
\lambda \circ f^{\mu_2} = \frac{f_z^{\mu_2}}{\overline{f_z^{\mu_2}}} \cdot \frac{\mu - \mu_2}{1 - \mu\,\bar\mu_2} = \frac{f_z^{\mu_2}}{\overline{f_z^{\mu_2}}} \cdot \frac{\mu_1}{1 - \mu\,\bar\mu_2}.
\]
Then λ has compact support, so $f^\mu$ is well-determined.


It is important for later use to consider quasiconformal maps of H to itself. Such
a map can be extended to C by setting μ ≡ 0 on the lower half-plane. By Theorem
8.2.3, the extension is quasiconformal on C.
Theorem 8.8.13. Given μ : H → C with ||μ||∞ < 1, there is a μ-quasiconformal
map f μ of H onto itself with 0, 1, ∞ as fixed points.

Proof: Extend μ to the lower half-plane by setting $\mu(z) = \overline{\mu(\bar z)}$ if Im z < 0. Then, by uniqueness, $f^\mu(\bar z) = \overline{f^\mu(z)}$. Therefore $f^\mu$ maps R onto R. Starting with $\mu \in C_0^1$, so that f(z) ∼ z as z → ∞, and proceeding by approximations, we see that $f^\mu$ must map H onto itself.
The following result uses the same kind of construction to decompose a quasi-
conformal map.
Theorem 8.8.14. Given 0 < t < 1, the map $f = f^\mu$ can be decomposed as
\[
f = f_{1-t} \circ f_t, \qquad \text{with } K_{f_t} = K_f^{\,t},\ K_{f_{1-t}} = K_f^{\,1-t}, \text{ so that } K_f = K_{f_t}\,K_{f_{1-t}}.
\]
Proof: Let L be the length of the hyperbolic geodesic in D from 0 to μ. Let $\mu_t$ be the point on that geodesic at hyperbolic distance tL from 0 and let $f_t = f^{\mu_t}$. Define $\mu_{1-t}$ and $f_{1-t}$ similarly. Then
\[
K_{f_t} = \frac{1 + |\mu_t|}{1 - |\mu_t|} = \left(\frac{1 + |\mu|}{1 - |\mu|}\right)^{t} = K_f^{\,t},
\]
and similarly for $f_{1-t}$.
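The identity (1 + |μ_t|)/(1 − |μ_t|) = K^t is easy to check for a constant coefficient. The sketch below is an illustration under assumptions: the helper names and the sample value of μ are hypothetical, and μ_t is taken to be the point of the radial geodesic for which the displayed identity holds, namely |μ_t| = tanh(t · artanh|μ|) with the same argument as μ. The choice of normalization of the hyperbolic metric does not affect this, since only the fraction t of the total length enters.

```python
import math

def point_on_geodesic(mu, t):
    # Point on the radial geodesic from 0 to mu with artanh|mu_t| = t * artanh|mu|.
    r = abs(mu)
    return math.tanh(t * math.atanh(r)) * mu / r

def K_of(mu):
    # Dilatation associated with a constant Beltrami coefficient mu.
    return (1 + abs(mu)) / (1 - abs(mu))

mu = 0.3 + 0.4j      # sample coefficient, |mu| = 0.5
t = 0.25
mu_t = point_on_geodesic(mu, t)
mu_s = point_on_geodesic(mu, 1 - t)
print(K_of(mu_t), K_of(mu) ** t)
print(K_of(mu_t) * K_of(mu_s), K_of(mu))
```

Each printed pair should agree up to rounding, which reflects the multiplicative splitting of the dilatation along the geodesic.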


Any quasiconformal homeomorphism of C to C can be shown to be differentiable at a.e. point of C, so that the quotient $\mu = f_{\bar z}/f_z$ is defined and |μ| < 1 almost everywhere. In fact, the converse of Theorem 8.8.12 is true.
Theorem 8.8.15. Suppose that f : C → C is a K -quasiconformal homeomor-
phism, normalized so that f (0) = 0, f (1) = 1. Then f = f μ for a unique μ in
L ∞ (C), with ||μ||∞ < 1.
For the proof, see [5] or [132].

8.9 The Calderón–Zygmund inequality

We know by Lemma 8.8.3 that T extends to an isometry of L 2 . We only need (8.8.10)


in the range 2 ≤ p < ∞, but a duality argument, using the adjoint T ∗ , shows that
the result extends to the remainder of the range 1 < p < ∞.
The inequality (8.8.10) is a special case of a very general theory. However, the
complex-variable context here allows some special arguments, as in Vekua [210].
The idea is that
\[
Tf(z) \equiv -\frac{1}{\pi}\lim_{\varepsilon\to 0}\int_{|w|>\varepsilon} \frac{f(z - w)}{w^2}\,dx\,dy = -T_1^2 f(z),
\]

where T1 is bounded from L p to L p , 1 < p < ∞, and that this boundedness property
of T1 can be obtained easily from the corresponding boundedness in L p (R) of the
one-dimensional Hilbert transform H. Starting with $f \in C_0^1(R)$,
\[
Hf(x) = \frac{1}{\pi}\lim_{\varepsilon\to 0}\int_{|y|>\varepsilon} \frac{f(x - y)}{y}\,dy.
\]
Notice that since 1/y is an odd function, we may rewrite this as
\[
Hf(x) = \frac{1}{\pi}\lim_{\varepsilon\to 0}\int_{|y|>\varepsilon} \frac{f(x - y) - f(x)}{y}\,dy,
\]

and conclude from the assumption on f that the limit exists, uniformly with respect
to x.
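The principal-value integral defining H can be approximated numerically by pairing y with −y, which removes the singularity at y = 0. The sketch below is an illustration under assumptions: the function name `hilbert_pv`, the cutoff, and the grid size are hypothetical choices, and the test function f(x) = 1/(1 + x²) is not compactly supported but decays fast enough for the truncated integral to be accurate. For this f, the function F(z) = i/(z + i) is holomorphic in the upper half-plane with real part f on R, so one expects Hf(x) = x/(1 + x²).

```python
import numpy as np

def hilbert_pv(f, x, cutoff=200.0, n=400_000):
    """Approximate (1/pi) PV int f(x - y)/y dy via the symmetric form
    (1/pi) int_0^cutoff [f(x - y) - f(x + y)] / y dy (midpoint rule)."""
    step = cutoff / n
    y = (np.arange(n) + 0.5) * step          # midpoints, so y is never 0
    integrand = (f(x - y) - f(x + y)) / y
    return float(np.sum(integrand) * step / np.pi)

def f(s):
    return 1.0 / (1.0 + s * s)

for x in (-2.0, 0.0, 0.5, 1.0, 3.0):
    print(x, hilbert_pv(f, x), x / (1.0 + x * x))
```

The two columns of output should agree to several decimal places; the remaining discrepancy comes from the truncation at the cutoff and from the quadrature.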
We claim that

||H f || p ≤ A p || f || p , f ∈ L p (R), 1 < p < ∞. (8.9.1)

One proof of (8.9.1) starts with a complex transform of a real-valued function $f \in C_0^1(R)$. Define
\[
F(z) = \frac{1}{\pi i}\int_{-\infty}^{\infty} \frac{f(t)\,dt}{t - z}, \qquad \operatorname{Im} z > 0.
\]
A straightforward calculation shows that the real part
\[
u(x, y) = \operatorname{Re} F(x + iy) = \frac{1}{\pi}\int_{-\infty}^{\infty} \frac{f(x + ys)\,ds}{s^2 + 1},
\]
so
\[
\lim_{y\to 0} u(x, y) = f(x). \tag{8.9.2}
\]
Similarly,
\[
v(x, y) = \operatorname{Im} F(x + iy) = \frac{1}{\pi}\int_{-\infty}^{\infty} \frac{f(x - s)\,s\,ds}{s^2 + y^2},
\]
so
\[
\lim_{y\to 0} v(x, y) = Hf(x). \tag{8.9.3}
\]
Direct calculation, using the Cauchy–Riemann equations for u, v shows that
\begin{align*}
\Delta(|u|^p) &= p(p - 1)\,|u|^{p-2}\,(v_x^2 + v_y^2); \\
\Delta(|v|^p) &= p(p - 1)\,|v|^{p-2}\,(v_x^2 + v_y^2); \\
\Delta(|F|^p) &= p^2\,|F|^{p-2}\,(v_x^2 + v_y^2).
\end{align*}
Therefore
\[
\Delta\left(|F|^p - \frac{p}{p - 1}\,|v|^p\right) = p^2\,(|F|^{p-2} - |v|^{p-2})\,(v_x^2 + v_y^2), \qquad p \ge 2. \tag{8.9.4}
\]

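For R > 1, let $\gamma_R$ be the curve consisting of the horizontal diameter and upper semicircle of the circle in the upper half-plane with center iy and radius R, with the usual orientation. Let $\Omega_R$ be the domain enclosed by $\gamma_R$. Applying Green's identity
\[
\int_{\Omega_R} (g\,\Delta h - h\,\Delta g) = \int_{\gamma_R} \left(g\,\frac{\partial h}{\partial n} - h\,\frac{\partial g}{\partial n}\right)
\]
with g = 1 and $h = |F|^p - p/(p - 1)\cdot|v|^p$, and letting R → ∞ shows that
\[
\frac{\partial}{\partial y}\int_{-\infty}^{\infty} \left[|F(x + iy)|^p - \frac{p}{p - 1}\,|v(x, y)|^p\right] dx \le 0.
\]
The integral has limit 0 as y → +∞, so
\[
\int_{-\infty}^{\infty} |F(x + iy)|^p\,dx \ge \frac{p}{p - 1}\int_{-\infty}^{\infty} |v(x, y)|^p\,dx, \qquad y > 0, \quad p \ge 2.
\]
Now
\[
\left(\int_{-\infty}^{\infty} |F|^p\right)^{2/p} = \|\,|F|^2\,\|_{p/2} = \|u^2 + v^2\|_{p/2} \le \|u^2\|_{p/2} + \|v^2\|_{p/2}.
\]
Therefore
\begin{align*}
\left(\frac{p}{p - 1}\right)^{2/p} \|v^2\|_{p/2} &\le \|u^2\|_{p/2} + \|v^2\|_{p/2}; \\
\|v^2\|_{p/2} &\le \left[\left(\frac{p}{p - 1}\right)^{2/p} - 1\right]^{-1} \|u^2\|_{p/2}.
\end{align*}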

In view of (8.9.2) and (8.9.3), raising this inequality to the p/2 power and taking the limit as y → 0 gives (8.9.1) with
\[
A_p = \left[\left(\frac{p}{p - 1}\right)^{2/p} - 1\right]^{-p/2}.
\]
(Note that $A_2 = 1$.) This proves (8.9.1) for p ≥ 2. A duality argument using Hölder's inequality shows that it is also true for 1 < p ≤ 2 with $A_p = A_q$, where 1/p + 1/q = 1.
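In the following calculations, we write z = x + iy, ζ = ξ + iη. Then we define $T_1$ for $f \in C_0^2$:
\begin{align*}
T_1 f(z) &= \lim_{\varepsilon\to 0+}\int_{|\zeta|>\varepsilon} \frac{1}{2\pi}\,\frac{f(\zeta + z)}{\zeta\,|\zeta|}\,dm(\zeta) \\
&= \frac{1}{2}\int_0^{\pi} \left[\frac{1}{\pi}\int_0^{\infty} \frac{f(z + re^{i\theta}) - f(z - re^{i\theta})}{r}\,dr\right] e^{-i\theta}\,d\theta.
\end{align*}
Therefore
\begin{align*}
\|T_1 f\|_p &\le \frac{\pi}{2}\,\sup_\theta \left\|\frac{1}{\pi}\int_0^{\infty} \frac{f(z + re^{i\theta}) - f(z - re^{i\theta})}{r}\,dr\right\|_p \\
&= \frac{\pi}{2}\,\sup_\theta \|H f_\theta\|_p, \qquad f_\theta(x) = f(xe^{i\theta}) \\
&\le \frac{\pi}{2}\,A_p\,\|f\|_p.
\end{align*}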

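The next step is to show that $-T_1^2 = T$. For $f \in C_0^1$,
\begin{align}
T_1 f(z) &= -\frac{1}{\pi}\int [f(z + \zeta) - f(z)]\,\frac{\partial}{\partial\zeta}\frac{1}{|\zeta|}\,d\xi\,d\eta \notag\\
&= \frac{1}{\pi}\int f_\zeta(z + \zeta)\,\frac{1}{|\zeta|}\,d\xi\,d\eta \tag{8.9.5}\\
&= \frac{1}{\pi}\,\frac{\partial}{\partial z}\int f(\zeta)\left[\frac{1}{|z - \zeta|} - \frac{1}{|\zeta|}\right] d\xi\,d\eta. \tag{8.9.6}
\end{align}
Therefore, for any test function $\varphi \in C_0^1$,
\[
\int T_1 f(z)\,\varphi(z)\,dx\,dy = -\frac{1}{\pi}\int\!\!\int f(z)\left[\frac{1}{|z - \zeta|} - \frac{1}{|z|}\right] \varphi_\zeta(\zeta)\,d\xi\,d\eta\,dx\,dy.
\]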

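The integral on the right converges for $f \in L^p$, so (8.9.6) is true in the weak sense for $f \in L^p$. Then
\begin{align*}
T_1^2 f(w) &= \frac{1}{\pi}\,\frac{\partial}{\partial w}\int T_1 f(z)\left[\frac{1}{|z - w|} - \frac{1}{|z|}\right] dx\,dy \\
&= \frac{1}{\pi^2}\,\frac{\partial}{\partial w}\int \left[\frac{1}{|\zeta - w|} - \frac{1}{|\zeta|}\right] \left[\int \frac{f_z(z)}{|z - \zeta|}\,dm(z)\right] dm(\zeta) \\
&= \frac{1}{\pi^2}\,\frac{\partial}{\partial w}\int f_z(z) \left[\int \frac{1}{|z - \zeta|}\left(\frac{1}{|\zeta - w|} - \frac{1}{|\zeta|}\right) dm(\zeta)\right] dm(z) \\
&= -\frac{1}{\pi^2}\,\frac{\partial}{\partial w}\int f(z)\,\frac{\partial}{\partial z}\left[\int \frac{1}{|z - \zeta|}\left(\frac{1}{|\zeta - w|} - \frac{1}{|\zeta|}\right) dm(\zeta)\right] dm(z).
\end{align*}
Differentiation and integration can be exchanged and the expression replaced by
\[
\lim_{R\to\infty} \frac{\partial}{\partial z}\left[\int_{|\zeta - w|<R} \frac{d\xi\,d\eta}{|z - \zeta|\,|\zeta - w|} - \int_{|\zeta|<R} \frac{d\xi\,d\eta}{|\zeta|\,|z - \zeta|}\right]. \tag{8.9.7}
\]
The first integral here gives
\begin{align*}
\frac{\partial}{\partial z}\int_{|\zeta|<R/|z-w|} \frac{d\xi\,d\eta}{|\zeta|\,|1 - \zeta|} &= \frac{\partial}{\partial z}\int_0^{R/|z-w|}\!\!\int_0^{2\pi} \frac{dr\,d\theta}{|1 - re^{i\theta}|} \\
&= -\frac{1}{2}\,\frac{R}{(z - w)\,|z - w|}\int_0^{2\pi} \frac{d\theta}{\left|1 - \dfrac{Re^{i\theta}}{|z - w|}\right|}.
\end{align*}
The limit is clearly −π/(z − w) and similarly the limit of the second integral in (8.9.7) is −π/z. Thus,
\[
T_1^2 f(w) = \frac{1}{\pi}\,\frac{\partial}{\partial w}\int f(z)\left[\frac{1}{z - w} - \frac{1}{z}\right] dx\,dy = -Tf(w).
\]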

The last step is to verify that the best constant C p of (8.8.10) has limit 1 as
p → 2. By Corollary 8.8.4, C2 = 1. The rest follows from a particular case of the
Riesz–Thorin convexity theorem.
Theorem 8.9.1. The constant $C_p$ in the estimate
\[
\|Tf\|_p \le C_p\,\|f\|_p, \qquad 2 \le p < \infty
\]
can be chosen so that for any $p_1 > 2$,
\[
C_p = C_2^{1-t}\,C_{p_1}^{t}, \qquad \frac{1}{p} = \frac{1 - t}{2} + \frac{t}{p_1}, \quad 0 \le t \le 1. \tag{8.9.8}
\]
Proof: We use Hölder's inequality (2.8.1) with dual exponents 2, q, $q_1$,
\[
1 = \frac{1}{2} + \frac{1}{2} = \frac{1}{p} + \frac{1}{q} = \frac{1}{p_1} + \frac{1}{q_1}.
\]
It is enough to consider functions in $C_0^0$, since such functions are dense in each $L^p$. Given such functions f and g, define functions $F_\zeta$ and $G_\zeta$ for ζ in the strip 0 ≤ Re ζ ≤ 1 by $F_\zeta(z) = 0$ if f(z) = 0, $G_\zeta(z) = 0$ if g(z) = 0, and otherwise
\[
F_\zeta = |f|^{a(\zeta)}\,\frac{f}{|f|}, \qquad G_\zeta = |g|^{b(\zeta)}\,\frac{g}{|g|},
\]
where
\[
a(\zeta) = (1 - \zeta)\,\frac{p}{2} + \zeta\,\frac{p}{p_1}, \qquad b(\zeta) = (1 - \zeta)\,\frac{q}{2} + \zeta\,\frac{q}{q_1}.
\]
We may also normalize so that $\|f\|_p = 1 = \|g\|_q$. We want to show that under these assumptions, necessarily
\[
\left|\int_{C} (Tf)\,g\right| \le C_2^{1-t}\,C_{p_1}^{t}. \tag{8.9.9}
\]
The function $\Phi(\zeta) = \int_{C} (TF_\zeta)\,G_\zeta$ is holomorphic in the strip. For Re ζ = 0, Re a(ζ) = p/2, Re b(ζ) = q/2,
\[
\|F_\zeta\|_2 = (\|f\|_p)^{p/2} = 1 = \|G_\zeta\|_2, \qquad \operatorname{Re}\zeta = 0.
\]
Similarly,
\[
\|F_\zeta\|_{p_1} = (\|f\|_p)^{p/p_1} = 1 = \|G_\zeta\|_{q_1}, \qquad \operatorname{Re}\zeta = 1.
\]
Therefore
\[
|\Phi(\zeta)| \le C_2 = 1 \ \text{ if } \operatorname{Re}\zeta = 0; \qquad |\Phi(\zeta)| \le C_{p_1} \ \text{ if } \operatorname{Re}\zeta = 1.
\]
Since Φ is bounded in the strip, $\Phi_\varepsilon = e^{\varepsilon(\zeta^2 - 1)}\Phi \to 0$ as ζ → ∞ in the strip, for any ε > 0, so $|\Phi_\varepsilon|$ is bounded by $\max\{C_2, C_{p_1}\}$. Taking ε → 0, it follows that |Φ| has the same bounds. But then $C_2^{\zeta - 1}\,C_{p_1}^{-\zeta}\,\Phi(\zeta)$ is bounded by 1. Finally
\[
\left|\int_{C} (Tf)\,g\right| = |\Phi(t)| \le C_2^{1-t}\,C_{p_1}^{t}.
\]
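The interpolation bound makes the limit C_p → 1 as p → 2 explicit. The sketch below is an illustration under assumptions: the function names are hypothetical, the parameter t is chosen so that t = 0 corresponds to p = 2 and t = 1 to p = p_1 (that is, 1/p = (1 − t)/2 + t/p_1), and the value used for the bound at p_1 is a placeholder, not a constant computed in the text.

```python
def interpolation_parameter(p, p1):
    # Solve 1/p = (1 - t)/2 + t/p1 for t, with 2 <= p <= p1 and p1 > 2.
    return (0.5 - 1.0 / p) / (0.5 - 1.0 / p1)

def interpolated_bound(p, p1, c_p1, c_2=1.0):
    # Log-convex bound C_2^(1 - t) * C_p1^t at the exponent p.
    t = interpolation_parameter(p, p1)
    return c_2 ** (1 - t) * c_p1 ** t

# Placeholder bound at p1 = 4 (hypothetical value).
p1, c_p1 = 4.0, 10.0
for p in (2.0, 2.01, 2.1, 2.5, 3.0, 4.0):
    print(p, interpolated_bound(p, p1, c_p1))
```

Whatever finite bound is available at p_1, the interpolated bound equals 1 at p = 2 and increases continuously with p, so it tends to 1 as p → 2.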

Remarks. Ahlfors cites the first edition of Zygmund’s treatise on the Fourier Series
for the proof of the boundedness of the one-dimensional Hilbert transform H . In
fact, it was to get away from the very special complex-analytic arguments that are
used in this section, and to gain an understanding of the real-variable nature of these
integral transforms, that Zygmund and Calderón developed their far-reaching theory
of “singular integrals” [37]. The Calderón–Zygmund arguments apply to much more
general integral transforms; see Stein [194], Christ [44], or Peyrière [167].

Exercises

1. Prove that given a triple ( p1 , p2 , p3 ) of points in the unit circle ∂D, ordered in the
positive direction, there is a unique f ∈ Aut(D) such that f takes ( p1 , p2 , p3 )
to (−1, −i, 1).
2. Show that a quadrilateral can be mapped conformally to some H(q1 , q2 , q3 , q4 ),
such that the vertices can be taken to be −1/k, −1, 1, 1/k for some (unique) k > 1.
3. For Re z > 0 and 0 < k < 1, let
\[
f(z) = \int_0^z \frac{dx}{\sqrt{(1 - x^2)(1 - k^2x^2)}}.
\]
(Note that the integral is independent of the path of integration in the upper half-space H. Note also that f extends to be continuous on the closure of H.)
(a) Show that the image f(H) is bounded.
(b) Show that f maps the interval [0, 1] to an interval [0, K], for some finite K(k) > 0 and that f maps the interval [1, 1/k] to the vertical interval [K, K + iK′] for some finite K′(k) > 0. Show that the image of [1/k, ∞] is a finite horizontal segment extending to the left from K + iK′ and the image of {iy : y > 0} is a finite vertical segment extending upward from the origin. Conclude from this that the image of the upper half plane is the interior of a rectangle with vertices 0, K, K + iK′, iK′.
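4. For what values of a, b, c, d ∈ R, ad − bc = 1, is the map z = x + iy → ζ = u + iv quasiconformal, and what is the dilatation quotient, if
\[
\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}?
\]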

5. Prove that two rectangles R and R′ are conformally equivalent if and only if a/b = a′/b′. Hint: reduce to the case R = R′ = D.
6. Verify (8.3.2).
7. Let $R_a$ be the rectangle $R_a = \{z = x + iy : 0 < x < a,\ 0 < y < 1\}$, and let $R_b$, b > a, be defined in the same way, so that $m(R_a) = a$, $m(R_b) = b$, b > a. Let f(x + iy) = ρx + iy with ρ = b/a, so that $f(R_a) = R_b$. Verify that f is b/a-quasiconformal.
8. (Grötzsch) Let $R_a$ and $R_b$ be as in Exercise 7 and let g be a K-quasiconformal map of $R_a$ onto $R_b$.
Adapt the proof of Theorem 8.3.1 to show directly that K ≥ b/a. By considering
carefully the inequalities involved, show that if K = b/a, then g is the map f
of Exercise 7.
9. Verify (8.3.11).
10. Show that the maps f nm (z) = z n |z|m , n, m = 1, 2, 3, . . . are quasiconformal.
11. Let f (r eiθ ) = r a eiθ , with a ≥ 1.
(a) Show that f is a regular K -quasiconformal map, and compute K .
(b) Show that for a suitable value of a, $f = f_a$ maps the annulus A(1, r) onto A(1, s), where s ≥ r > 1 are specified.
(c) Show that f in part (b) has the smallest maximal dilatation of any quasicon-
formal map from A(1, r ) to A(1, s).
12. Suppose that Ω is a domain in C with the property that the complement in C
of the closure Ω consists of one bounded component Ω0 and one unbounded
component Ω∞ . Use the Riemann mapping theorem and inversion to prove:
(a) Up to conformal equivalence we may assume that Ω∞ is the complement of
D.
(b) Up to a further conformal equivalence we may assume that the inner boundary Γ1 of Ω is the unit circle, and the outer boundary Γ2 is an analytic curve.
13. Suppose that Ω is the domain in Exercise 12 (b). We want to construct a confor-
mal map F from Ω to some annulus Rr , taking inner boundary to inner boundary.
Since F has no zeros, it can be written as F = exp( f + ig), where f is a real
harmonic function such that f = 0 on Γ1 and f is a positive constant on Γ2 ,
while g is a harmonic conjugate of f that gains 2π around a curve in Ω that is
homotopic to the Γ j .
(a) Let $f_1$ be the harmonic function on Ω that vanishes on the inner boundary Γ1 and equals 1 on the outer boundary Γ2. Locally, either boundary can be straightened, so $f_1$ can be extended across. Thus, $f_1$ is harmonic in a neighborhood of the closure of Ω. Show that the outer normal derivative $(f_1)_n = \partial f_1/\partial n$ is nonnegative.
(b) Let $g_1$ be a harmonic conjugate of $f_1$, defined first in a neighborhood of a point of Γ2. Use the fact that $f_1 + ig_1$ is a non-constant holomorphic function to
show that $(f_1)_n$ is positive at all but finitely many points of Γ2. Use the Cauchy–
Riemann equations to show that where ( f 1 )n is positive, the tangential derivative
(g1 )τ = ∂g1 /∂τ in the positive direction on Γ2 is positive. Thus, g1 is strictly
increasing in the positive direction on Γ2 .
(c) Conclude that there is a positive multiple f = c f 1 such that the harmonic
conjugate g gains 2π on a full circuit of Γ2 (or any curve in Ω homotopic to
Γ2 .) Thus, F = exp( f + ig) is a (single-valued) holomorphic map of Ω onto
Rr , r = ec . Note that F has a holomorphic extension to a neighborhood of Ω.
14. The object here is to show that the map F of Exercise 13 (c) is injective, and thus conformal. The number of times $N(z_0)$ that F takes the value $z_0$ is given by the usual formula
\[
N(z_0) = \frac{1}{2\pi i}\int_{\partial\Omega} \frac{F'(\zeta)\,d\zeta}{F(\zeta) - z_0} = N_2(z_0) - N_1(z_0),
\]
where
\[
N_j(z_0) = \frac{1}{2\pi i}\int_{\Gamma_j} \frac{F'(\zeta)\,d\zeta}{F(\zeta) - z_0}, \qquad j = 1, 2.
\]

Here, we take the positive (domain to the left) orientation for each of the curves
Γ j , so that, symbolically, ∂Ω = Γ2 − Γ1 .
(a) Apply the argument principle, Theorem 1.5.1, to conclude that N1 (z 0 ) is
constant for |z 0 | < 1 and for |z 0 | > 1 and that N2 (z 0 ) is constant for |z 0 | < r
and for |z 0 | > r .
(b) Show by direct computation that N j (0) = 1.
(c) Use (a) and (b) to complete the proof.
15. Suppose that S is any Riemann surface and Ω ⊂ S is a domain such that the
complement of Ω in S has two open components. Show that Ω is conformally
equivalent to an annulus A(r, 1).
16. (a) Show that the domains

Ω0 = A(0, 2) = {z : 0 < |z| < 2}, Ω1 = A(1, 2) = {z : 1 < |z| < 2}

are homeomorphic but not conformally equivalent.


(b) Show that Ω0 and Ω1 are not quasiconformally equivalent: there is no
quasiconformal map of Ω0 onto Ω1 .
17. Suppose that $z_1$ and $z_2$ are distinct points in D with $|z_1 - z_2| < 2$. Show that there
is an automorphism of D that takes the pair {z 1 , z 2 } to {w, w̄}, with Re w < 0
and |w − w̄| = |z 1 − z 2 |.
18. Prove the assertion about the function ϕ in the proof of Proposition 8.5.7.
19. Prove the assertion about the function χ in the proof of Proposition 8.5.5.
20. (a) Use the change of variables ζ = (1 + k)t/(1 + kt²) in the integrand for $K(k_1)$, where $k_1 = 2\sqrt{k}/(1 + k)$, to prove that
\[
K\!\left(\frac{2\sqrt{k}}{1 + k}\right) = (1 + k)\,K(k).
\]

(b) Use (8.5.22) and part (a) to prove the functional equation (8.5.17).
21. The aim of this and the two following exercises is to prove a version of the
estimate (8.5.19):

μ(r ) = O (r log r )) as r → 1; (8.9.10)


4
μ(r ) = log + O (r log r ) as r → 0. (8.9.11)
r
Prove that
 1
dx π
K (k) = √ [1 + O(x 2 )] = [1 + O(x 2 )].
0 1−x 2 2

Use this fact to show that (8.9.11) is equivalent to the estimate


\[ K(k) = \log\left\{\frac{2\sqrt{2}}{\sqrt{1-k}}\right\} + O\!\left(\sqrt{1-k}\,\log\sqrt{1-k}\right). \tag{8.9.12} \]

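22. (a) Let $\mathrm{ch} = \tfrac12(k + 1/k)$ and $\mathrm{sh} = \tfrac12(1/k - k)$. Verify that
\[ (1 - x^2)(1 - k^2x^2) = (\mathrm{ch} - kx^2)^2 - \mathrm{sh}^2. \]
(b) Use (a) to verify
\[ \frac{1}{(1-x^2)(1-k^2x^2)} = \frac{1}{(\mathrm{ch} - kx^2)^2}\left[1 + \frac{\mathrm{sh}^2}{(1-x^2)(1-k^2x^2)}\right]. \]
(c) Show that for $k \ge 1/2$ the last quotient in (b) is $\le 4(1-k)^2/(1-x)^2$, so
that for $1 - x \ge \sqrt{1-k}$, this term is bounded by $4(1-k) \le 4\sqrt{1-k}$.
(d) Set $x(k) = 1 - \sqrt{1-k}$. Writing $\sqrt{1-k} = \varepsilon$, verify
\[ I_k = \int_0^{x(k)} \frac{d\xi}{\sqrt{(1-\xi^2)(1-k^2\xi^2)}} = (1 + O(\varepsilon))\int_0^{x(k)} \frac{d\xi}{\mathrm{ch} - k\xi^2} \]
\[ \phantom{I_k} = \frac12\log\frac{1+x(k)}{1-x(k)}\,[1 + O(\varepsilon)] = \frac12\log\frac{2}{\varepsilon} + O(\varepsilon\log\varepsilon). \tag{8.9.13} \]

Let $t = \mathrm{ch} - kx^2$, so $dx = -dt/2kx$. Then, with the limits referring to the values of x,
\[ II_k = \int_{x(k)}^{1} \frac{d\xi}{\sqrt{(1-\xi^2)(1-k^2\xi^2)}} = -\frac{1}{2k}\int_{x(k)}^{1} \frac{dt}{\sqrt{t^2 - \mathrm{sh}^2}\;x} \]
\[ \phantom{II_k} = -\frac{1}{2k}\int_{x(k)}^{1} \frac{dt}{\sqrt{t^2 - \mathrm{sh}^2}}\,[1 + O(\varepsilon)]. \]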

23. The indefinite integral of the integrand in the last line of Exercise 22 is
$\log\bigl(t + \sqrt{t^2 - \mathrm{sh}^2}\bigr)$.

(a) Verify that the upper and lower limits of the integral satisfy
\[ \mathrm{ch} - k = \mathrm{sh} = \frac{1+k}{2k}\,(1-k) = (1-k) + O((1-k)^2), \]
\[ \mathrm{ch} - kx^2 = \mathrm{ch} - k(1-\varepsilon)^2 = 2\varepsilon\,[1 + O(\varepsilon)]. \]

(b) Use (a) to verify
\[ II_k = -\log\varepsilon + \frac12\log 4\varepsilon + O(\varepsilon\log\varepsilon). \tag{8.9.14} \]
(c) Combine (8.9.13) and (8.9.14) to obtain
\[ K(k) = -\log\varepsilon + \frac12\log 4\varepsilon - \frac12\log\frac{\varepsilon}{2} + O(\varepsilon\log\varepsilon) = -\log\varepsilon + \log 2\sqrt{2} + O(\varepsilon\log\varepsilon), \]
and check that this is (8.9.12).
24. Show that equality in (8.6.2) can be attained if f (D) = D.
25. Write out a proof of Theorem 8.6.6.
26. Use Theorem 8.6.6 to show that a quasiconformal map takes sets of measure
zero to sets of measure zero.
27. Use the map $f(z) = z|z|^{(1-K)/K}$ to show that the Hölder exponent of continuity
of a K -quasiconformal map cannot in general be increased.
28. Show that the set H (M) of normalized quasisymmetric functions on R is closed
under convex combination: if h 0 , h 1 ∈ H (M), then

h t = (1 − t)h 0 + th 1 belongs to H (M), 0 ≤ t ≤ 1.

A consequence is that for any two Beurling–Ahlfors extensions of elements of
H (M), there is a continuous deformation from one to the other.
29. Verify (8.8.20).
30. Carry through the proof of (8.9.1) in the case p = 2.
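
The identity in Exercise 20(a) lends itself to a quick numerical sanity check. The sketch below is not part of the text. It assumes the classical arithmetic-geometric mean formula K(k) = π/(2 AGM(1, √(1 − k²))) for the complete elliptic integral of the first kind, and the names `agm`, `K`, and `landen_gap` are illustrative choices. It also compares K(k) near k = 1 with the classical leading term log(4/√(1 − k²)), which is the behaviour that Exercises 21–23 make precise.

```python
import math


def agm(a: float, b: float, steps: int = 40) -> float:
    """Arithmetic-geometric mean of two positive numbers (fixed iteration count)."""
    for _ in range(steps):
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return a


def K(k: float) -> float:
    """Complete elliptic integral of the first kind with modulus k, 0 <= k < 1.

    Assumption: the classical formula K(k) = pi / (2 * AGM(1, sqrt(1 - k^2))).
    """
    return math.pi / (2.0 * agm(1.0, math.sqrt(1.0 - k * k)))


def landen_gap(k: float) -> float:
    """K(2*sqrt(k)/(1+k)) - (1+k)*K(k); this vanishes if the identity holds."""
    k1 = 2.0 * math.sqrt(k) / (1.0 + k)
    return K(k1) - (1.0 + k) * K(k)


if __name__ == "__main__":
    for k in (0.1, 0.5, 0.9):
        print(f"k = {k}: gap = {landen_gap(k):.2e}")
    k = 0.9999
    print("K(k) vs leading term:", K(k), math.log(4.0 / math.sqrt(1.0 - k * k)))
```

A check of this kind is no substitute for the change of variables requested in the exercise, but it catches sign and normalization slips quickly.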

Remarks and further reading

The study of quasiconformal mappings in the plane was begun by Grötzsch [94] and
continued by Morrey [143]. It was further developed by Teichmüller [201], [202],
[203] in his study of the moduli problem for Riemann surfaces. A related subject is
that of “generalized analytic functions,” where the Beltrami equation again plays a
leading role; see Vekua [210] and Rodin [179].
The presentation here relies largely on two standard references for quasiconformal
maps in the plane: the notes of Ahlfors [5], and the comprehensive and careful book of
Lehto and Virtanen [132]. The Ahlfors notes, and most of the subsequent literature on
quasiconformal maps in one complex variable, are focused on its use in Teichmüller
theory. For this, we refer to Chapter 9 and the references at the end of that chapter.
The additional chapters in the second edition of [5] give some overview of further
development of the theory.
The theory has been generalized to higher dimensions, starting with the work
of Loewner [137] and Gehring and Väisälä; see [87]. One striking application is
the Mostow rigidity theorem [144] concerning the analogue of Teichmüller theory
in higher dimensions. Anticipating Chapter 9, the homotopy class of a compact
Riemann surface of genus g ≥ 2 is characterized by 6g − 6 real parameters. Mostow
proved that, in the analogous case for higher dimension, the moduli space is a point: if
M and M′ are closed hyperbolic manifolds of (real) dimension ≥ 3 and f : M → M′
is a homeomorphism, then f is homotopic to an isometry.
The multidimensional theory remains an active area of research. See, for example,
Iwaniec and Martin [116], Heinonen et al. [101], Gehring, Martin, and Palka [88].
Chapter 9
Introduction to Teichmüller theory

We consider here the problem of classifying Riemann surfaces. The most obvious
classification is topological. For example, a compact Riemann surface is character-
ized topologically by its genus: the number of holes in the doughnut. See Figure 9.1
for genus 0, genus 1, and genus 2.

Fig. 9.1 Compact surfaces of genus 0, genus 1, and genus 2.

Conformal equivalence implies topological equivalence, so, for compact manifolds,
the question is: given a topological surface of genus g, what inequivalent
conformal structures can it carry?
For simply connected Riemann surfaces (genus 0), these questions are settled
by the uniformization theorem, Theorems 7.2.1 and 7.4.3. Any such surface is
conformally equivalent to the unit disk D, the complex plane C, or the Riemann
sphere S. (In place of D we shall usually consider the conformally equivalent upper
half-plane H.) Now C and D are topologically (even real-analytically) equivalent:
for example, $re^{i\theta} \to \tan(\pi r/2)e^{i\theta}$ maps D onto C. But C and D are conformally
distinct. For example, D carries non-constant bounded holomorphic functions, but C
does not. Neither D nor C is compact, but S is compact. Thus, in the simply connected
case, the answer to the question above is that a topological open disk or plane carries
exactly two distinct conformal structures, while the sphere carries exactly one.
Restricting our attention to compact surfaces, we shall see that in the case of
genus 1, there is a natural parametrization of the inequivalent complex structures by
C. Riemann [175] argued that for genus g > 1, these structures can be parametrized
using 3g − 3 complex parameters. However Riemann’s argument does not provide
the kind of geometric insight that the parameter space C provides for g = 1, for
example—a measure of how close two structures are.
The problem of giving a good geometric description of Riemann’s moduli space
was revived in the 1940s by Teichmüller. Teichmüller’s ideas have since been
extended to the study of “bordered” Riemann surfaces, such as the closed disc $\overline{D}$,
and to non-compact Riemann surfaces, such as those obtained by removing closed
sets (e.g. collections of isolated points and/or disjoint closed analytic disks) from a
compact Riemann surface.
Teichmüller’s fundamental idea was to start with maps from one Riemann surface
to another. For example, if S and S′ are compact Riemann surfaces that are homeomorphic,
we shall see that there is a quasiconformal homeomorphism f : S → S′.
The maximal dilatation $K_f$ is a measure of how far f deviates from a conformal
map (for which $K_f = 1$). Therefore f provides an upper estimate $\log K_f$ on how
far apart we should consider the conformal structures to be. We look for such f with
minimal $K_f$, and consider $\log K_f$ as the relevant distance. Carrying this program
through in a satisfactory way is not a simple matter. In particular, it relies heavily on
the theory of quasiconformal maps as developed in Chapter 8.
In Section 9.1, we recall from Chapters 6 and 7 the basic machinery used to con-
struct and classify Riemann surfaces: universal covers, the uniformization theorem,
and groups of covering transformations. Moduli spaces are computed for spaces
whose universal cover is conformally equivalent to C.
Section 9.2 begins the study of maps from one Riemann surface to another, and
their lifts to the universal cover. In Section 9.3, the developments in Section 9.2 are
used to initiate the study of the Teichmüller space T (S) of a Riemann surface S.
This is a space of quasiconformal maps from S to topologically equivalent surfaces
S′. More precisely, it is a space of equivalence classes of such maps. Minimizing
the dilatation within the equivalence classes provides a natural metric. Section 4
examines T (S) more closely for compact S.
In Section 9.4, we turn to T (H). Aside from the torus (genus 1), all the interesting
Riemann surfaces are (up to conformal equivalence) quotients of H by a subgroup
G ⊂ Aut(H). Quasiconformal maps of surfaces S, S′ of the same genus lift to quasiconformal
maps of H to itself. Conversely, if a quasiconformal homeomorphism of
H to H is compatible with the action of the group of cover transformations G, then
it induces a map of G to a group G′, and projects to a map S → S′. Therefore T (H)
is known as the universal Teichmüller space.
There are several different natural parametrizations of T (H). One comes from
considering the normalized quasiconformal maps $f^\mu$. Another comes from considering
certain modifications $f_\mu$ and their Schwarzian derivatives. This is the subject
of Section 9.5. In Section 9.6, we indicate briefly how the theory proceeds from this
point.
A very active area that has its roots in Teichmüller theory has come to be known
as “higher Teichmüller theory.” A brief description is given in Section 9.8.

9.1 Coverings, quotients, and moduli of compact Riemann surfaces

Here, we summarize results from Chapter 6 and Chapter 7, and indicate the direction
of further developments. We then sketch proofs that the moduli space for compact
surfaces of genus 1 has complex dimension one, and that the moduli space for compact
surfaces of genus g > 1 has real dimension 6g − 6.
• As noted before, up to conformal equivalence, the only simply connected Riemann
surfaces are the plane C, the Riemann sphere S, and the unit disk D. In place of D,
we will work with the conformally equivalent upper half-plane H.
• Any Riemann surface S has a simply connected universal covering surface: a
Riemann surface $S^u$ and a projection π of $S^u$ onto S with the properties that π
is holomorphic, and that each point p ∈ S has an open neighborhood U such that
$\pi^{-1}(U)$ consists of disjoint open sets, each of which is mapped conformally to U
by π. Moreover, any closed curve γ beginning and ending at p lifts to a curve $\tilde\gamma$ in
$S^u$ that begins and ends in $\pi^{-1}(p)$. The lift is closed if and only if γ is homotopic
to a constant curve. A Riemann surface S is said to be elliptic if $S^u$ is conformally
equivalent to the Riemann sphere S, parabolic if $S^u \cong$ C, and hyperbolic if $S^u \cong$ H.
• Let Aut(S u ) be the group of conformal self-maps of S u . The group Aut(S u ) contains
a subgroup G, the group of cover transformations (often called deck transformations,
from decken, “to cover”). The automorphisms g belonging to G are characterized by
the property that
π ◦ g = π. (9.1.1)

Thus, a point p of S corresponds to the orbit under G of any point in $\pi^{-1}(p)$, so S
can be identified with the quotient of $S^u$ by the equivalence relation induced by G. This
quotient is usually denoted $G\backslash S^u$.
Since S u can be taken to be one of C, S, or H, the automorphism group Aut(S u ) is
isomorphic to a group of linear fractional transformations, and has a natural topology.
The group G is a properly discontinuous subgroup of Aut(S u ), and g ∈ G has no fixed
point unless g is the identity. Such a group is called a Fuchsian group. Conversely,
if G ⊂ Aut(S u ) is such a group, then G\S u is a Riemann surface.
• The cover transformations can be generated as follows. The covering surface itself
is constructed by choosing a point $p_0$ in S. The elements of $S^u$ are equivalence
classes $[\gamma_p]$ of curves $\gamma_p$ from $p_0$ to p, two such curves being equivalent if they
are homotopic. Then each closed curve γ beginning and ending at $p_0$ generates a
cover transformation $g_{[\gamma]}$ by $g_{[\gamma]}([\gamma_p]) = [\gamma_p \cdot \gamma]$. This is a homomorphism from
the fundamental group $\pi_1(S)$ into the automorphism group Aut($S^u$).
• In the elliptic case, $S^u \cong$ S, each element of Aut(S) has a fixed point. It follows
that G contains only the identity map, and $S = S^u$. In the parabolic case, $S^u \cong$ C,
the fixed-point-free automorphisms are translations and the group of cover transformations
is generated by either one or two translations: Proposition 6.3.3.

Theorem 9.1.1. Suppose that $S_1 = G_1\backslash S_1^u$ and $S_2 = G_2\backslash S_2^u$.

(a) If $S_1$ and $S_2$ are conformally equivalent, then any conformal equivalence $\phi : S_1 \to S_2$
can be lifted to a conformal equivalence $\tilde\phi$ of the universal covers $S_j^u$.
(b) The lift $\tilde\phi$ induces an isomorphism of the groups $G_1$ and $G_2$ of cover transformations.
(c) Any other lift of $\phi$ to a conformal map of $S_1^u$ onto $S_2^u$ has the form $h \circ \tilde\phi$, for some
$h \in G_2$.

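Proof: The proof of Theorem 6.2.4 can be adapted to prove (a). In fact, fix points
$z_0$ in $S_1^u$ and $w_0$ in $\pi_2^{-1} \circ \phi \circ \pi_1(z_0)$. If γ is a path in $S_1^u$ from $z_0$ to z, take $\tilde\phi(z)$ to
be the endpoint of the path obtained by lifting $\phi \circ \pi_1 \circ \gamma$ from $w_0$. Since $S_1^u$ is simply
connected, any other such path γ′ is homotopic to γ, so $\phi \circ \pi_1 \circ \gamma'$ lifts to the same
endpoint. The result can be illustrated by the commutative diagram

\[
\begin{array}{ccc}
S_1^u & \xrightarrow{\ \tilde\phi\ } & S_2^u \\
\pi_1 \downarrow & & \downarrow \pi_2 \\
S_1 & \xrightarrow{\ \phi\ } & S_2.
\end{array}
\tag{9.1.2}
\]

More concisely,
\[ \phi \circ \pi_1 = \pi_2 \circ \tilde\phi. \tag{9.1.3} \]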

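To prove (b), we let θ(g) be defined by

\[ \theta(g) = \tilde\phi \circ g \circ \tilde\phi^{-1}, \qquad g \in G_1. \]

Then θ(g) is a conformal map of $S_2^u$ onto $S_2^u$. We need to show that θ(g) belongs to
$G_2$, i.e. that $\pi_2 \circ \theta(g) = \pi_2$. But (9.1.3) and (9.1.1) imply that
\[ \pi_2 \circ \theta(g) = \pi_2 \circ \tilde\phi \circ g \circ \tilde\phi^{-1} = \phi \circ \pi_1 \circ g \circ \tilde\phi^{-1} = \phi \circ \pi_1 \circ \tilde\phi^{-1} = \pi_2. \]

Conversely, for $h \in G_2$, let $g = \tilde\phi^{-1} \circ h \circ \tilde\phi$. Then

\[ \pi_1 \circ g = \pi_1 \circ \tilde\phi^{-1} \circ h \circ \tilde\phi = \phi^{-1} \circ \pi_2 \circ h \circ \tilde\phi = \phi^{-1} \circ \pi_2 \circ \tilde\phi = \pi_1. \]

Thus, θ is an isomorphism from $G_1$ to $G_2$.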


To prove (c), note that for any $h \in G_2$, $h \circ \tilde\phi$ is also a lift. Conversely, suppose f is a
lift of φ. Let $h = f \circ \tilde\phi^{-1}$. Then

\[ \pi_2 \circ h = \pi_2 \circ f \circ \tilde\phi^{-1} = \phi \circ \pi_1 \circ \tilde\phi^{-1} = \pi_2 \circ \tilde\phi \circ \tilde\phi^{-1} = \pi_2, \]

so h is in $G_2$.

Note that one consequence is that conformally equivalent Riemann surfaces have
conformally equivalent universal covers.

Corollary 9.1.2. Discrete groups $G_1$, $G_2$ of fixed-point-free automorphisms of $S^u$,
where $S^u$ is C or H, generate conformally equivalent Riemann surfaces if and only
if they are related by an inner automorphism of Aut($S^u$), i.e.

\[ g \in G_1 \to h \circ g \circ h^{-1} \tag{9.1.4} \]

for some h in Aut($S^u$).

Proof: Suppose that $\phi : S_1 \to S_2$ is a conformal equivalence. We may identify both
$S_j^u$ with C or H, as the case may be, and let $\tilde\phi$ be the lift of φ. Then for $g \in G_1$

\[ \pi_2 \circ \tilde\phi \circ g \circ \tilde\phi^{-1} = \phi \circ \pi_1 \circ g \circ \tilde\phi^{-1} = \phi \circ \pi_1 \circ \tilde\phi^{-1} = \phi \circ \phi^{-1} \circ \pi_2 = \pi_2, \]

so $\tilde\phi \circ g \circ \tilde\phi^{-1}$ belongs to $G_2$. The mapping is clearly a homomorphism, and a similar
calculation shows that the inverse $h \to \tilde\phi^{-1} \circ h \circ \tilde\phi$ maps $G_2$ to $G_1$.
Conversely, suppose that $S_1$ is defined by the automorphism group $G_1 \subset$ Aut($S^u$),
and suppose that h belongs to Aut($S^u$). Let $G_2 = h \circ G_1 \circ h^{-1}$ be the group defined
by the map $g \to h \circ g \circ h^{-1}$. This defines a Riemann surface $S_2$. The map
$\phi = \pi_2 \circ h \circ \pi_1^{-1}$ is easily seen to be well defined and holomorphic, with inverse
$\pi_1 \circ h^{-1} \circ \pi_2^{-1}$. Therefore it is conformal.

Let us return to the parabolic case. We may take S = G\C. We know that any
element of G is a translation $T_a(z) = z + a$, a ≠ 0. As noted above, by Proposition 6.3.3
G has either one or two generators. Suppose first that G has the single generator $T_a$.
It is easily verified that any other translation can be obtained as $B T_a B^{-1}$ for some
choice of B ∈ Aut(C), so all such surfaces are conformally equivalent. Note that
they are not compact: topologically they are open cylinders obtained by identifying
z and z + a.
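
The "easily verified" step can be spelled out in a few lines. The following sketch is an illustration rather than part of the text: it takes B(z) = βz + c with β = b/a (the helper names `translation` and `conjugate_by_affine` and the sample values are hypothetical) and checks that B∘T_a∘B⁻¹ is the translation T_b.

```python
def translation(a: complex):
    """The translation T_a(z) = z + a."""
    return lambda z: z + a


def conjugate_by_affine(beta: complex, c: complex, T):
    """Return B∘T∘B^{-1} for the affine automorphism B(z) = beta*z + c, beta != 0."""
    def B(z):
        return beta * z + c

    def B_inv(z):
        return (z - c) / beta

    return lambda z: B(T(B_inv(z)))


# Illustrative values: conjugating T_a by B(z) = (b/a) z + c should give T_b,
# because B(B^{-1}(z) + a) = z + beta*a.
a, b = 1 + 2j, -3 + 0.5j
conjugated = conjugate_by_affine(b / a, 0.7 - 0.2j, translation(a))

if __name__ == "__main__":
    for z in (0j, 1.5 - 2j, -4 + 0.25j):
        print(z, conjugated(z), z + b)
```

The constant c plays no role in the result, which reflects the fact that translations commute with one another.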
Suppose now that G is generated by two translations, Ta and Tb . In the proof of
Proposition 6.3.3, it was shown that b/a is not real. The resulting surface is a torus
that is obtained by identifying opposite sides of a period parallelogram of a lattice,
as in Figure 6.3.
The parameter space in this case can be taken to be the plane C. An analytic
proof of this fact takes some heavy machinery. Let us denote the generators of the
lattice by $\omega_1$, $\omega_2$ rather than a, b, with the standard normalization Im $(\omega_2/\omega_1) > 0$.
Suppose that A is an automorphism of C that takes one such period parallelogram
$\Lambda_1$ bijectively to another, $\Lambda_2$, where primes denote the generators for $\Lambda_2$. Then

\[ A\omega_j = m_{j1}\omega_1' + m_{j2}\omega_2'; \qquad A^{-1}\omega_j' = n_{j1}\omega_1 + n_{j2}\omega_2, \]

j = 1, 2, where the $m_{jk}$ and $n_{jk}$ are integers. Thus, A corresponds to a matrix in
the modular group SL(2, Z) of invertible matrices with integer coefficients and
determinant 1. More precisely, A belongs to the projective version PSL(2, Z): the
quotient of SL(2, Z) by the subgroup {1, −1}.
Thus, the moduli space for conformal structures on the torus is the quotient of H
by PSL(2, Z). Some elementary calculations show that, by choosing periods $\omega_1$ and
$\omega_2$ with $|\omega_j|$ minimal, we can guarantee that $\tau = \omega_2/\omega_1$ belongs to the region

\[ \Delta = \{\tau : -1/2 < \mathrm{Re}\,\tau \le 1/2;\ |\tau| \ge 1;\ \mathrm{Re}\,\tau \ge 0 \text{ if } |\tau| = 1\}; \]

see Figure 2.2. A further argument shows that this choice of τ ∈ Δ is unique. There-
fore we may take Δ itself as the moduli space for conformal structures on a topological
torus. Finally, the J -function, a holomorphic function on H that is invariant under
the modular group (and is closely related to the elliptic modular function λ discussed
in Section 2.5) maps Δ conformally onto C. (For details, see Hille [107], Section
13.6, or [22], Chapter 17.)
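
The "elementary calculations" that move a period ratio into Δ amount to alternating integer translations τ → τ − n with the inversion τ → −1/τ. The sketch below illustrates that standard reduction; it is not the construction of the text, the names `reduce_tau` and `mobius` are hypothetical, and floating-point arithmetic makes the boundary conventions of Δ only approximate.

```python
import math


def reduce_tau(tau: complex, max_steps: int = 10_000) -> complex:
    """Move tau (Im tau > 0) into the region Delta by translations and inversions."""
    if tau.imag <= 0:
        raise ValueError("tau must lie in the upper half-plane")
    for _ in range(max_steps):
        # Translate by an integer so that -1/2 < Re(tau) <= 1/2.
        tau = tau - math.ceil(tau.real - 0.5)
        if abs(tau) < 1.0:
            # The inversion tau -> -1/tau increases Im(tau) when |tau| < 1.
            tau = -1.0 / tau
        else:
            break
    # On the arc |tau| = 1 the convention is Re(tau) >= 0.
    if abs(tau) == 1.0 and tau.real < 0:
        tau = -1.0 / tau
    return tau


def mobius(m, tau: complex) -> complex:
    """Action of an integer matrix m = ((a, b), (c, d)) of determinant 1 on tau."""
    (a, b), (c, d) = m
    return (a * tau + b) / (c * tau + d)


if __name__ == "__main__":
    tau = 0.3 + 0.2j
    print(reduce_tau(tau))
    print(reduce_tau(mobius(((2, 1), (1, 1)), tau)))
```

Two period ratios related by an element of the modular group should reduce to the same point of Δ, which is what the two printed values illustrate.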
To conclude this section, we sketch an argument concerning the moduli of a
compact surface S of genus g > 1. As discussed in Section 11.2, cycles (simple
closed curves) α j , β j , j = 1, 2, . . . g, can be chosen in such a way that the only
intersections are a single intersection of each pair α j , β j , and such that if S is cut
along these cycles, the result is topologically a 4g-sided polygon whose boundary
consists of the sides in the order

\[ \alpha_1, \beta_1, \alpha_1^{-1}, \beta_1^{-1}, \dots, \alpha_g, \beta_g, \alpha_g^{-1}, \beta_g^{-1}. \]

By identifying the sides $\alpha_j$, $\alpha_j^{-1}$ and the sides $\beta_j$, $\beta_j^{-1}$ one reconstructs, topologically,
the surface S. Figure 11.2 illustrates this in the case g = 2. Choose a base point $p_0$ in S,
and g curves that join $p_0$ to the points of intersection of the cycles. The closed curve
that starts at $p_0$, follows $\alpha_j$ and returns to $p_0$ determines an automorphism $A_j$ of
$S^u \cong$ H, and similarly for $\beta_j$ and $B_j$: Theorem 6.2.4. The cycles $\alpha_j$, $\beta_j$ generate the
fundamental group of S, so the $A_j$ and $B_j$ generate the group of cover transformations
of H over S.
Each of these 2g transformations corresponds to a linear fractional transformation
with real entries and determinant 1, unique up to multiplication by −1. Therefore the
particular surface S has at most 6g “degrees of freedom.” Looking at the boundary
of the 4g-gon, we see that

\[ A_1 B_1 A_1^{-1} B_1^{-1} \cdots A_g B_g A_g^{-1} B_g^{-1} = I, \tag{9.1.5} \]

the identity transformation. The product on the left in (9.1.5) has determinant 1, so
(9.1.5) puts 3 constraints on the $A_j$, $B_j$. Moreover, if B is any element of Aut(H),
then the map $A \to B^{-1}AB$, A ∈ G, carries G to a conjugate group, which by
Corollary 9.1.2 defines a conformally equivalent surface. This allows us to eliminate
3 more degrees of freedom from the remaining 6g − 3, leaving 6g − 6.
To this point, the argument only gives 6g − 6 as an upper bound for the dimension
of the parameter space. However, the cycles $\alpha_j$, $\beta_j$ are determined only up to
homotopy and intersection conditions, so the associated matrices have some room to
move. Thus, the parameter space does actually have 6g − 6 real dimensions as a subset
of $R^{6g}$ (modulo an equivalence relation corresponding to the freedom of multiplying
any of the matrices by −1).
Riemann’s argument, unlike this one, produced 3g − 3 complex parameters. However,
neither argument gives a clear geometric picture. For example, is there any natural
parametrization that leads to an open ball in $R^{6g-6}$ or in $C^{3g-3}$ as the parameter
set? This is the type of question that led to Teichmüller’s work.

9.2 Homeomorphisms of Riemann surfaces

The case of annuli, considered in the light of Exercise 11 of Chapter 8, suggests that
one way to relate inequivalent Riemann surfaces, and measure the extent to which
they differ, is by the use of quasiconformal maps. In this section, we do not need to
make use of quasiconformality, so we deal with general homeomorphisms (always
assumed to be orientation-preserving).
The proof of Theorem 9.1.1 can be extended to give the following.

Theorem 9.2.1. Suppose that $S_1 = G_1\backslash S_1^u$ and $S_2 = G_2\backslash S_2^u$.

(a) If f is a homeomorphism of $S_1$ onto $S_2$, then f can be lifted to a homeomorphism
$\tilde f$ of the universal cover $S_1^u$ onto the universal cover $S_2^u$. Moreover, if f is
K-quasiconformal, the same is true of the lift $\tilde f$.
(b) The map
\[ \theta_f(g) = \tilde f \circ g \circ \tilde f^{-1} \]
is an isomorphism of $G_1$ onto $G_2$.
(c) Any other lift of f has the form $h \circ \tilde f$ for some $h \in G_2$.

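Proof. Pictorially, we have again

\[
\begin{array}{ccc}
S_1^u & \xrightarrow{\ \tilde f\ } & S_2^u \\
\pi_1 \downarrow & & \downarrow \pi_2 \\
S_1 & \xrightarrow{\ f\ } & S_2,
\end{array}
\tag{9.2.1}
\]

so again
\[ \pi_2 \circ \tilde f = f \circ \pi_1. \tag{9.2.2} \]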
Since f maps closed curves in $S_1$ to closed curves in $S_2$, composition with $\tilde f$ maps
cover transformations to cover transformations. For $g \in G_1$, the map
\[ \theta_f(g) = \tilde f \circ g \circ \tilde f^{-1} \]
is a homeomorphism of $S_2^u$ onto itself.
Given $g_1 \in G_1$, let $g_2 = \theta_f(g_1)$. Then, referring to (9.2.2), we see that again

\[ \pi_2 \circ g_2 = \pi_2 \circ \tilde f \circ g_1 \circ \tilde f^{-1} = f \circ \pi_1 \circ g_1 \circ \tilde f^{-1} = f \circ \pi_1 \circ \tilde f^{-1} = \pi_2. \]

Thus, $g_2$ is a cover transformation, i.e. $g_2$ belongs to $G_2$. Clearly, $\theta_f$ is a
homomorphism, and $\theta_{f^{-1}}$ inverts it.

Since the universal covers of homeomorphic Riemann surfaces are homeomorphic
and, in the interesting case, equivalent to C or H, we may always take a Riemann
surface S to be $G\backslash S^u$ with $S^u$ equal to C or H, and with covering group G a subgroup
of Aut($S^u$).
Theorem 9.2.2. Suppose that S and S′ are homeomorphic Riemann surfaces. Two
homeomorphisms $f_0 : S \to S'$ and $f_1 : S \to S'$ induce the same isomorphism of the
covering groups if and only if they are homotopic.

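Proof. Suppose that $f_0$ and $f_1$ induce the same isomorphism, and suppose that
$S^u$ = H. We define a homotopy $\{\tilde f_t\}$ from $\tilde f_0$ to $\tilde f_1$ by taking $\tilde f_t(z)$, z ∈ H, to be the
point on the (hyperbolic) line segment from $\tilde f_0(z)$ to $\tilde f_1(z)$ such that

\[ \rho_{\mathbb H}(\tilde f_0(z), \tilde f_t(z)) = t\,\rho_{\mathbb H}(\tilde f_0(z), \tilde f_1(z)), \]

where $\rho_{\mathbb H}$ is the hyperbolic distance. We want to show that the projection

\[ f_t = \pi_2 \circ \tilde f_t \circ \pi_1^{-1} \]

is a well-defined map. Given g ∈ G, let g′ be its image under the common isomorphism,
so that $\tilde f_j \circ g = g' \circ \tilde f_j$ for j = 0, 1. Since g′ is an isometry, it carries the segment
from $\tilde f_0(z)$ to $\tilde f_1(z)$ onto the segment from $\tilde f_0(g(z))$ to $\tilde f_1(g(z))$, preserving distances.
Hence, if w = g(z), then $\tilde f_t(w) = g'(\tilde f_t(z))$, i.e. $\tilde f_t \circ g = g' \circ \tilde f_t$. Therefore $f_t$ is well
defined and $\{f_t\}$ is a homotopy from $f_0$ to $f_1$. If $S^u$ = C, the same argument works,
with the euclidean metric in place of the hyperbolic metric.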
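
The key point in this construction is that the point at fraction t along a hyperbolic segment is carried by any isometry to the corresponding point of the image segment. The following sketch checks this numerically in H. It is an illustration under stated assumptions, not the text's construction: the function names are hypothetical, the hyperbolic metric is normalized to curvature −1, and the segment is computed by transporting the first endpoint to the center of the unit disk.

```python
import math


def rho_H(z: complex, w: complex) -> float:
    """Hyperbolic distance in the upper half-plane (curvature -1 normalization)."""
    return math.acosh(1.0 + abs(z - w) ** 2 / (2.0 * z.imag * w.imag))


def geodesic_point(z0: complex, z1: complex, t: float) -> complex:
    """Point on the hyperbolic segment from z0 to z1 at fraction t of its length."""
    if z0 == z1:
        return z0
    # Map H to the unit disk with z0 -> 0; geodesics through 0 are radii,
    # and the distance from 0 to w is 2*artanh(|w|).
    w1 = (z1 - z0) / (z1 - z0.conjugate())
    r = math.tanh(t * math.atanh(abs(w1)))
    wt = (w1 / abs(w1)) * r
    # Map back to H with the inverse transformation.
    return (z0 - z0.conjugate() * wt) / (1.0 - wt)


def isometry(a: float, b: float, c: float, d: float):
    """z -> (a z + b)/(c z + d) with real coefficients and ad - bc > 0."""
    return lambda z: (a * z + b) / (c * z + d)


if __name__ == "__main__":
    z0, z1, t = 1j, 2 + 3j, 0.3
    p = geodesic_point(z0, z1, t)
    g = isometry(2.0, 1.0, 1.0, 1.0)
    print(rho_H(z0, p), t * rho_H(z0, z1))
    print(g(p), geodesic_point(g(z0), g(z1), t))
```

Each pair of printed values should agree up to rounding, which is the equivariance used to show that the projected maps are well defined.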
Conversely, suppose that $\{f_t\}$ is a homotopy from $f_0$ to $f_1$. It can be lifted to a
homotopy $\{\tilde f_t\}$ of $\tilde f_0$ to $\tilde f_1$. For g ∈ G and z ∈ H, the paths

\[ t \to \tilde f_t \circ g(z), \qquad t \to \tilde f_0 \circ g \circ \tilde f_0^{-1}(\tilde f_t(z)) \]

have the same starting point $\tilde f_0(g(z))$ in H and the same projections $\pi_2 \circ \tilde f_t(z)$ on
S′. Therefore they agree at t = 1, giving

\[ \tilde f_0 \circ g \circ \tilde f_0^{-1} = \tilde f_1 \circ g \circ \tilde f_1^{-1}, \qquad \text{all } g \in G. \]

Thus, $f_0$ and $f_1$ give the same isomorphism.

The following is an equivalent way to state Theorem 9.2.2. We leave the proof as
Exercise 1.

Corollary 9.2.3. Two homeomorphisms $f_j : S \to S'$, j = 0, 1, induce the same
isomorphisms G → G′ if and only if $f_1^{-1} \circ f_0$ is homotopic to the identity map of S.

In anticipation of the standard construction of Teichmüller spaces, let us push this
idea one step further.

Corollary 9.2.4. Suppose that S, $S_1$, and $S_2$ are homeomorphic compact Riemann
surfaces with cover groups G, $G_1$, and $G_2$ in Aut($S^u$). Suppose also that $f_j : S \to S_j$,
j = 1, 2, are quasiconformal homeomorphisms.
(a) Suppose that $f_1 \circ f_2^{-1} : S_2 \to S_1$ is homotopic to a conformal map φ. Then the
lifts of $f_1 \circ f_2^{-1}$ and φ induce the same isomorphism of $G_2$ to $G_1$.
(b) Suppose that $\phi : S_2 \to S_1$ is a conformal map. Then $\tilde f_1$ and $\tilde\phi \circ \tilde f_2$ induce the
same isomorphism of G to $G_1$ if and only if $f_1 \circ f_2^{-1}$ is homotopic to φ.

Proof: (a) Under this assumption, $f_1$ and $f = \phi \circ f_2$ are homotopic homeomorphisms
from S to $S_1$, so $\tilde f_1$ and $\tilde f$ induce the same isomorphism from G to $G_1$.
(b) Under this assumption, $\tilde\phi$ induces an isomorphism of $G_2$ to $G_1$. Again, let
$f = \phi \circ f_2$. Then $\tilde f_1$ and $\tilde f$ induce the same isomorphism if and only if the map

\[ f_1 \circ f^{-1} = (f_1 \circ f_2^{-1}) \circ \phi^{-1} \]

is homotopic to the identity, which is true if and only if $f_1 \circ f_2^{-1}$ is homotopic to φ.

There is a different way to phrase this result.

Proposition 9.2.5. Suppose that S, $S_1$, and $S_2$ are Riemann surfaces, and $f_j : S \to S_j$
are homeomorphisms. Then $S_1$ and $S_2$ are conformally equivalent if and only if
$f_1 \circ f_2^{-1} : S_2 \to S_1$ is homotopic to a conformal map.

9.3 Homeomorphisms of compact Riemann surfaces

The discussion in the previous section shows that in considering homeomorphisms
from one Riemann surface to another, we are principally interested in homotopy
types. In the introduction to this chapter, we suggested that one should focus on
quasiconformal maps. For compact surfaces, up to homotopy, we lose nothing by
restricting to quasiconformal homeomorphisms.

In preparation for the proof of the last statement, we recall some facts about the
Beurling–Ahlfors extension in Theorem 8.7.4, transplanted, via the Cayley trans-
form, to an extension of an orientation-preserving homeomorphism h of ∂D (not
necessarily normalized). The homeomorphism f of D constructed from h is C 1 on
D, with positive Jacobian, so it is quasiconformal on each compact subset of D. (The
maximal dilatation will grow indefinitely as one approaches some boundary point
unless h satisfies a quasi-symmetry condition.)
Theorem 9.3.1. Suppose that the Riemann surfaces S and S′ are compact, and
f is a homeomorphism of S onto S′. Then f is homotopic to a quasiconformal
homeomorphism of S onto S′.

Proof: We may choose domains $U_j$, j = 1, 2, . . . , n, in S such that: (a) there are
conformal maps $\psi_j$ of D onto $U_j$; (b) these maps extend to the boundary; (c) for
some 0 < r < 1 the images $V_j = \psi_j(D_r(0))$ cover S and have the property that
successive intersections $V_{j-1} \cap V_j$ are not empty. Define successive maps $f_0 = f$,
and $f_j = f_{j-1}$ on $S \setminus U_j$. Extend $f_j$ to $U_j$ as the Beurling–Ahlfors extension of
$f_{j-1}|_{\partial U_j}$. Then $f_{j-1}$ and $f_j$ agree except on $U_j$. We define a homotopy on $U_j$ by

\[ f_{t,j} = (1 - t) f_{j-1} + t f_j. \]

Thus, $f = f_0$ is homotopic to each $f_k$. By construction, $f_k$ is quasiconformal on
$\bigcup_{j \le k} V_j$, so $f = f_0$ is homotopic to the quasiconformal map $f_n : S \to S'$.

We assume throughout this section that all surfaces are hyperbolic: they have H
as universal cover. One question is: how to recognize the compact case among all the
cases G\H? We start by looking for a fundamental domain for G: a domain Ω ⊂ H
that is minimal with respect to the condition that H is covered by the closures (in H)
of the images g(Ω), g ∈ G. Equivalently, the condition is that π(Ω) = S and Ω is
minimal among domains with this property.
Suppose that G ⊂ Aut(H) is a properly discontinuous group, whose non-identity
elements have no fixed points. A Dirichlet domain for G is defined by choosing a
point a ∈ H and defining $N_a$ to be the set of points of H that are closer to a than to any
of its images g(a), for every non-identity element g ∈ G:

\[ N = N_a = \{z \in \mathbb H : \rho_{\mathbb H}(z, a) < \rho_{\mathbb H}(z, g(a)),\ \text{all } g \in G,\ g \ne 1\}, \tag{9.3.1} \]

where $\rho_{\mathbb H}$ is again the hyperbolic metric in H. The point a is called the center of $N_a$.
Clearly, $N_a$ is open and non-empty. If b = g(a) for some g ∈ G, g ≠ 1, then $N_a$ and
$N_b$ are disjoint.
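
As a concrete (non-compact) illustration of definition (9.3.1), consider the cyclic group of dilations z → λⁿz, whose non-identity elements have no fixed points in H. The sketch below is an assumption-laden illustration, not part of the text: the function names are hypothetical, the hyperbolic distance uses the curvature −1 normalization, and only the group elements with 0 < |n| ≤ n_max are tested, which suffices here because the images λⁿa recede from a as |n| grows. For a = i and λ = 2, a short computation shows that the set of points equidistant from i and 2i is the semicircle |z| = √2, so the domain is the half-annulus 1/√2 < |z| < √2.

```python
import math


def rho_H(z: complex, w: complex) -> float:
    """Hyperbolic distance in the upper half-plane (curvature -1 normalization)."""
    return math.acosh(1.0 + abs(z - w) ** 2 / (2.0 * z.imag * w.imag))


def in_dirichlet_domain(z: complex, a: complex, lam: float, n_max: int = 50) -> bool:
    """Membership test for the Dirichlet domain N_a of the group {z -> lam**n * z}.

    Only the elements with 0 < |n| <= n_max are checked (a truncation).
    """
    d = rho_H(z, a)
    return all(
        d < rho_H(z, (lam ** n) * a)
        for n in range(-n_max, n_max + 1)
        if n != 0
    )


if __name__ == "__main__":
    a, lam = 1j, 2.0
    for z in (1.2j, 1.5j, 1 + 0.5j, 0.5 + 0.3j):
        print(z, abs(z), in_dirichlet_domain(z, a, lam))
```

For a general Fuchsian group the same test applies once a finite list of candidate group elements is available; deciding which elements can be ignored is the substantive part.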
Lemma 9.3.2. The closure $\overline{N}$ of $N_a$ in H projects onto S.

Proof: Every point of H either lies in $N_{g(a)}$ for some g ∈ G, or lies on the
boundary of (at least) two of these domains. Therefore the closures $\{\overline{N}_{g(a)}\}$ cover H.
Any two such closures project to the same set, so $\pi(\overline{N}) = S$.

The elements of Aut(H) are isometries with respect to the hyperbolic metric, so
the Nb , for b = g(a), g ∈ G, are each congruent to N = Na . In particular, either no
Nb has finite diameter, or they all have the same finite diameter. This means that
each has a boundary point (with respect to C) on R, or none do. This dichotomy is
independent of the choice of starting point a, as shown by the following.

Theorem 9.3.3. The Dirichlet domains of a Riemann surface S = G\H are bounded
(with respect to the hyperbolic metric) if and only if S is compact.

Proof: Suppose that S is compact. Choose a ∈ H and let $D_n$ be the hyperbolic disk
\[ D_n = \{z \in \mathbb H : \rho_{\mathbb H}(a, z) < n\}. \]

The projections $\pi(D_n)$ are open sets that cover S, so there is some m such that
$\pi(D_m) = S$. Thus, for every z ∈ H, there is a g ∈ G such that $\rho_{\mathbb H}(a, g(z)) < m$. Equivalently,
$N_a$ has diameter < 2m. Conversely, if $N_a$ is bounded, then its closure $\overline{N}$ is compact.
Therefore $S = \pi(\overline{N})$ is also compact.

We do not need the following description of Dirichlet domains, but it is easily
established in the compact case. For convenience, we refer to the hyperbolic lines
(geodesics) simply as “lines,” and intervals on geodesics as “segments.”

Proposition 9.3.4. Any bounded Dirichlet domain N is a (hyperbolic) convex polygon,
i.e. the boundary is the union of finitely many segments that meet at interior
angles < π. The number of such sides is even, and there is a natural pairing of
opposite sides.

Proof: Any point on the boundary of $N = N_a$ is equidistant from a and some point
b = g(a), g ∈ G, g ≠ 1. Since there is a bound to the distance from a to
the boundary of N, there is a bound to the distance from a of the b that can occur
in this way. Therefore there are finitely many. The set of points equidistant from a
and b is a line; see Exercise 2. Therefore the boundary of N is a union of finitely
many segments $E_j$. Wherever two such edges meet, N lies in the intersection of the
half-planes determined by the lines that contain these edges, so the interior angle is
< π.
Finally, if $E_j$ is associated to $g_j(a)$, $g_j \in G$, then reflecting through a takes $E_j$
to the side associated to $g_j^{-1}(a)$.

Corollary 9.3.5. In the compact case, G is finitely generated.

Proof: As noted in the preceding proof, each side $E_j$ of a Dirichlet region $N_a$ corresponds
to a point $b_j = g_j(a)$, $g_j \in G$. We claim that these finitely many elements
$\{g_j\}$ generate G. In fact, $g_j$ maps $N_a$ to $N_{b_j}$, $b_j = g_j(a)$. This image is a reflection
through $E_j$. It follows that $g_k \circ g_j$ maps the $g_j(E_j)$ to the corresponding side of
$N_{b_j}$. Clearly, any of the images under G of $N_a$ can be reached in this way, by means of
an element of the group G′ generated by the $g_j$. There is a bijective correspondence
between these images and the elements of G, so G′ = G.

For an example of all this, set in the disk D rather than in H, see Figure 9.2.
The yellow region in the disk on the left is the Dirichlet region N0 with center 0 for
the Bolza curve, a compact genus 2 surface. The right side of the figure shows the
decomposition of D into Dirichlet regions congruent to the region on the left.

Fig. 9.2 Dirichlet regions for the Bolza curve.

Lemma 9.3.6. Suppose that the Riemann surface S = G\H is compact. Then the
set of x ∈ R ∪ {∞} such that x is a fixed point of some g ∈ G, g ≠ 1, contains at
least three points.

Proof: Suppose that there are at most two such points. Replacing G by $hGh^{-1}$ for
some h ∈ Aut(H) yields an equivalent surface. Assuming that there is only one
such point, choose h so that the fixed point is the point at ∞. Then G consists of
translations $g_b(z) = z + b$, b ∈ R. Since G is discrete, there is a minimal such b > 0.
Then S = G\H is homeomorphic to the vertical strip

{z ∈ H : −b/2 ≤ Re z ≤ b/2}

with the edges |Re z| = b/2 identified. Thus, S is an open cylinder, contradicting the
assumption that S is compact.
Assuming that there are exactly two fixed points, we may take them to be 0 and
∞. Then each element of G is a map z → az for some a > 0. Since G is discrete,
there is a smallest such a > 1. Then

\[ G = \{g_n(z) = a^n z,\ n = 0, \pm 1, \pm 2, \dots\}. \]

The associated surface G\H can be taken to be the closure of the annulus A(1, a)
with the inner and outer boundaries identified. This is a torus, so the universal cover
would be conformal to C rather than to H.

It is useful to be able to extend a homeomorphism H → H continuously to a
homeomorphism of the closure H ∪ R. This is not always possible; see Exercise 3.
However, we know from Section 8.6 that it is possible for quasiconformal maps,
and we also know that the lift of a K -quasiconformal homeomorphism of Riemann
surfaces is a K -quasiconformal map of the covering spaces.

Theorem 9.3.7. Suppose that S = G\H and S′ = G′\H are compact. Two quasiconformal
homeomorphisms $f_1$ and $f_2$ from S to S′ induce the same isomorphism of
G and G′ if and only if there are lifts $\tilde f_1$, $\tilde f_2$ that coincide on R.

Proof: Suppose first that such lifts exist. The $\tilde f_j$ map
fixed points of G to fixed points of G′. Therefore for any g ∈ G, the maps

\[ \tilde f_1 \circ g \circ \tilde f_1^{-1}, \qquad \tilde f_2 \circ g \circ \tilde f_2^{-1} \]

agree on R. Since both maps belong to Aut(H), they must be identical.


Conversely, suppose that $\tilde f_1$ and $\tilde f_2$ induce the same isomorphism, i.e.

\[ \tilde f_1 \circ g \circ \tilde f_1^{-1} = \tilde f_2 \circ g \circ \tilde f_2^{-1}, \quad \text{all } g \in G. \]

Let $\psi = \tilde f_1^{-1} \circ \tilde f_2$, so that

\[ g^n \circ \psi = \psi \circ g^n, \quad g \in G, \ n = 0, \pm 1, \pm 2, \dots. \]

If x is fixed by g ∈ G, then so is ψ(x). Now $g^n(z) \to x$ either as n → ∞ or as
n → −∞ (or both), so, for the correct choice of sign, we have

\[ \psi(x) = \lim \psi(g^n(z)) = \lim g^n(\psi(z)) = x. \]

Thus, each fixed point of G is a fixed point of ψ. Since there are at least three such
points, ψ = 1.
Remarks. Theorem 9.3.7 is true for a much wider class of hyperbolic Riemann
surfaces, classified according to the properties of the covering group G. To be specific
here, we need some definitions. The limit set of a Fuchsian group G is the set L of
points x ∈ R ∪ {∞} with the property that there is a sequence of points $\{z_n\}$ ⊂ H
and a sequence $\{g_n\}$ of distinct elements of G such that $g_n(z_n) \to x$. In particular,
any fixed point of an element g ∈ G belongs to L. The Fuchsian group G is said
to be of the first kind if the limit set L is all of R ∪ {∞}. Otherwise, G is said to be
of the second kind. Theorem 9.3.7 carries over to any G of the first kind. It is not
difficult to show that, in the compact case, G is of the first kind; see Exercise 5. For
any such group, the set of fixed points is dense in R.
For a full treatment of this topic, see Lehner [133].

9.4 The Teichmüller space of a Riemann surface

to consider only the hyperbolic case: Riemann surfaces that can be taken to be of
the form G\H. Throughout this section, we also assume that the Riemann surfaces
under consideration are of the form G\H with covering group G that is of the first
kind. As noted in the previous section, this includes all compact hyperbolic surfaces.
The deformation space Def(S) of a Riemann surface S is defined to be the collection
of pairs (S′, f), where S′ is a Riemann surface and f : S → S′ is a quasiconformal
homeomorphism. The Teichmüller space T(S) is defined to be the quotient
of Def(S) by a certain equivalence relation ∼:

\[ T(S) = \mathrm{Def}(S)/\!\sim. \tag{9.4.1} \]

The relation ∼ is defined as follows:

\[ f_1 \sim f_2 \ \text{ if and only if } \ f_2 \circ f_1^{-1} : S_1 \to S_2 \ \text{ is homotopic to a conformal map } \phi : S_1 \to S_2. \tag{9.4.2} \]

In light of Proposition 9.2.5, this is the same as saying that the images S1 and S2 are
conformally equivalent.
Proposition 9.4.1. If f belongs to Def(S), then the equivalence class [f] contains
an extremal: a map $f_0$ whose maximal dilatation is minimal:

\[ K_{f_0} = \inf_{g \sim f} K_g. \]

The proof is left as Exercise 4.


The Teichmüller distance $D_T$ between two functions f, g in Def(S) is defined to be

\[ D_T(f, g) = \tfrac12 \log K_{g \circ f^{-1}}. \]

Thus, $D_T(f, g) \ge 0$ and $D_T(f, g) = 0$ if and only if f and g are conformally
equivalent. Moreover, $D_T(f, g) = D_T(g, f)$, since $K_h = K_{h^{-1}}$.
The Teichmüller metric on T (S) is defined by
dT ([ f 1 ], [ f 2 ]) = inf{DT ( f, g) : f ∼ f 1 , g ∼ f 2 }. (9.4.3)

We show next that the term “metric” is justified.


Proposition 9.4.2. The formula (9.4.3) defines a metric on T (S).

Proof: Clearly, $d_T([f], [g]) = d_T([g], [f]) \ge 0$. The triangle inequality follows
from Proposition 8.2.1 (b):

\[ K_{f \circ h^{-1}} \le K_{f \circ g^{-1}} K_{g \circ h^{-1}}. \]

To complete the argument, we need to show that $d_T([f], [g]) = 0$ implies [f] = [g].
If $d_T([f], [g]) = 0$ then, as in the proof of Proposition 9.4.1, there are sequences

\[ \{f_n\} \subset [f], \quad \{g_n\} \subset [g], \quad K_{f_n \circ g_n^{-1}} \to 1 \]

that can be lifted to H and yield limits $\tilde f_0$, $\tilde g_0$ that are lifts of $f_0 \in [f]$, $g_0 \in [g]$.
Moreover, $K_{f_0 \circ g_0^{-1}} = 1$. Thus, $f_0 \circ g_0^{-1}$ is conformal, so $[f_0] = [g_0]$.

Suppose that f is a K-quasiconformal homeomorphism of S to S′. If S = G\H
and S′ = G′\H, then the lift $\tilde f$ is a K-quasiconformal homeomorphism of H. If we
normalize it, then by Theorems 8.8.13 and 8.8.15 it is $f^\mu$ for some unique $\mu = \mu_{\tilde f}$ in
the unit ball B of $L^\infty(H)$:

\[ B = \{\mu \in L^\infty(H) : \|\mu\|_\infty < 1\}. \tag{9.4.4} \]

Note: to simplify notation in the following, we write $\mu_f$ for $\mu_{\tilde f}$.


We take advantage of the fact that the maximal dilatation of f is the same as
the maximal dilatation of the normalized lift $f^\mu$, and find a formula for $d_T(f, g)$
as follows. Given two quasiconformal homeomorphisms f and g from S to S′, let
$\tilde f$ and $\tilde g$ be the normalized lifts to H. Let $h = g \circ f^{-1}$, so the normalized lift is
$\tilde h = \tilde g \circ \tilde f^{-1}$. The computation (8.8.20) can be rewritten, using $\tilde g \circ \tilde f^{-1}$ in place of
g, as

\[ \mu_h \circ \tilde f = \frac{\tilde f_z}{\overline{\tilde f_z}} \cdot \frac{\mu_g - \mu_f}{1 - \overline{\mu_f}\,\mu_g}. \tag{9.4.5} \]

It follows that, in the distance calculation, we may replace the maximal dilatation of
$h = g \circ f^{-1}$ by that of $\tilde h$, giving

\[ \frac{1 + |\mu_h|}{1 - |\mu_h|} = \frac{|1 - \overline{\mu_f}\,\mu_g| + |\mu_g - \mu_f|}{|1 - \overline{\mu_f}\,\mu_g| - |\mu_g - \mu_f|}. \]

Compare this to the hyperbolic distance between points a, b ∈ D,

\[ \rho(a, b) = \frac12 \log \frac{|1 - \bar a b| + |a - b|}{|1 - \bar a b| - |a - b|}. \]

Taking the supremum of $D_{g \circ f^{-1}}$ over D gives the maximal dilatation $K_{\tilde g \circ \tilde f^{-1}} = K_{g \circ f^{-1}}$:

\[ D_T(f, g) = \rho_D(\mu_f, \mu_g). \tag{9.4.6} \]
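
The comparison between the dilatation quotient and the hyperbolic distance can be checked numerically at a single point. The following is a minimal sketch, not part of the text: the helper names `rho_disk` and `dilatation_quotient` and the sample values of the two Beltrami coefficients are assumptions chosen for illustration, and the modulus of the coefficient of h is taken from (9.4.5).

```python
import math

def rho_disk(a: complex, b: complex) -> float:
    """Hyperbolic distance in the unit disk, as in the displayed formula for rho(a, b)."""
    num = abs(1 - a.conjugate() * b) + abs(a - b)
    den = abs(1 - a.conjugate() * b) - abs(a - b)
    return 0.5 * math.log(num / den)

def dilatation_quotient(mu_f: complex, mu_g: complex) -> float:
    """(1 + |mu_h|) / (1 - |mu_h|), with |mu_h| read off from (9.4.5)."""
    mu_h = abs(mu_g - mu_f) / abs(1 - mu_f.conjugate() * mu_g)
    return (1 + mu_h) / (1 - mu_h)

# Hypothetical sample values of the Beltrami coefficients at one point.
mu_f, mu_g = 0.3 + 0.2j, -0.1 + 0.5j
lhs = 0.5 * math.log(dilatation_quotient(mu_f, mu_g))
rhs = rho_disk(mu_f, mu_g)
print(abs(lhs - rhs) < 1e-12)
```

The check is pointwise only; the distance in (9.4.6) involves the supremum over the domain.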

In view of Proposition 9.4.1, we have proved the following:

Theorem 9.4.3. If f and g belong to Def(S), then

dT ([ f ], [g]) = ρD (μ f0 , μg0 ), (9.4.7)


where f 0 and g0 are extremal elements of [ f ] and [g].


These results show that we may study T (S) as a metric space by studying the set of
Beltrami coefficients that correspond to lifts from S = G\H. These are characterized
among elements of the unit ball (9.4.4) by the condition

μ ◦ g = μ, all g ∈ G. (9.4.8)

Such a coefficient μ is said to be extremal if μ = μ f , where f is extremal.


Corollary 9.4.4. The space T (S) is pathwise connected.

Proof: It is sufficient to show that every f ∈ Def(S) is homotopic to the identity
map. We may take f to be extremal in its equivalence class, and let $\mu = \mu_f$. In view
of Theorem 9.4.3, it is natural to define $\mu_t$, 0 ≤ t ≤ 1, to be the point

\[ \mu_t = \frac{(1 + |\mu|)^t - (1 - |\mu|)^t}{(1 + |\mu|)^t + (1 - |\mu|)^t} \cdot \frac{\mu}{|\mu|} \tag{9.4.9} \]

on the geodesic from 0 to μ in D. This is a homotopy from the identity map $f^0$ to
$f^\mu$. Since μ satisfies (9.4.8), it follows that $\mu_t$ does also. Therefore $\{f^{\mu_t}\}$ is the lift
of a homotopy $\{f_t\}$ from the identity map 1 of S to the given map f : S → S′. Then
$K_{f_t} = K_f^t$ and $K_{f \circ f_t^{-1}} = K_f^{1-t}$. If $g \in [f_t]$, then $f_t \circ f_{1-t} \sim g \circ f_{1-t}$, so

\[ K_f = K_f^t K_f^{1-t} \le K_{g \circ f_{1-t}} \le K_g K_{f_{1-t}} = K_g K_f^{1-t}. \]

Therefore $K_{f_t} \le K_g$, showing that $f_t$ is extremal. Thus, $\{[f_t]\}$ is a path in T(S) from
the identity map to [f].
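
The radial geodesic used in this proof can be tested numerically. The sketch below is an illustration under stated assumptions: it takes the point at fraction t of the hyperbolic distance from 0 to μ to have modulus equal to the difference of the two t-th powers divided by their sum (the form that keeps the modulus below 1), and the names `mu_t` and `rho0` are hypothetical helpers. It checks that the hyperbolic distance from 0 scales linearly in t and that the dilatation quotient is the t-th power of the original one.

```python
import math

def mu_t(mu: complex, t: float) -> complex:
    """Point at fraction t of the hyperbolic distance from 0 to mu on the radial geodesic."""
    r = abs(mu)
    if r == 0:
        return 0j
    p, q = (1 + r) ** t, (1 - r) ** t
    return (p - q) / (p + q) * (mu / r)

def rho0(a: complex) -> float:
    """Hyperbolic distance from 0 to a in the unit disk."""
    return 0.5 * math.log((1 + abs(a)) / (1 - abs(a)))

def quotient(a: complex) -> float:
    """Dilatation quotient (1 + |a|) / (1 - |a|)."""
    return (1 + abs(a)) / (1 - abs(a))

mu = 0.4 - 0.3j   # hypothetical sample coefficient, |mu| = 0.5
for t in (0.0, 0.25, 0.5, 1.0):
    linear = abs(rho0(mu_t(mu, t)) - t * rho0(mu)) < 1e-12
    power = abs(quotient(mu_t(mu, t)) - quotient(mu) ** t) < 1e-12
    print(t, linear, power)
```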

This argument shows that

dT (1, f ) = t dT (1, f t ) + (1 − t) dT (1, f ). (9.4.10)

Now it follows from the definition that

dT ( f, g) = dT (1, f ◦ g −1 ). (9.4.11)

Combining this with the additive property (9.4.10), we can show that for any partition
{t j } of the interval [0, 1], we have

\[ d_T(1, f) = \sum_j d_T(f_{t_j}, f_{t_{j+1}}). \]

Therefore the path {[ f t ]} is a geodesic in T (S).


The construction here can be generalized. Given extremal elements f 0 and f 1 of
Def(S), let μt (z) be the point on the hyperbolic geodesic from μ0 (z) = μ f0 (z) to
μ1 (z) = μ f1 (z) such that

ρD (μ0 (z), μt (z)) = t ρD (μ0 (z), μ1 (z)). (9.4.12)

Then again $\mu_t$ satisfies (9.4.8) and $\{[\pi \circ f^{\mu_t} \circ \pi^{-1}]\}$ is a geodesic path from $[f_0]$ to
$[f_1]$ in T(S). This sketch gives the following.
Theorem 9.4.5. The space T (S) is geodesically convex: for extremal elements f , g
in Def(S), the path t → π ◦ f μt from μ f to μg in D, where μt is defined by (9.4.9),
corresponds to a geodesic from [ f ] to [g] in Def(S).
We complete this discussion of T (S) with two more results about T (S) as a metric
space. The proof of the first of the two results is left as Exercise 7.
Theorem 9.4.6. The space T (S), with metric (9.4.3), is complete.
Theorem 9.4.7. If two surfaces in S are quasiconformally equivalent, then their
Teichmüller spaces are isometric.

Proof: Suppose h : S → S′ is a quasiconformal homeomorphism. Then $f \mapsto f \circ h^{-1}$
maps the family of quasiconformal self-maps of S to the corresponding family
for S′. If $g_j = f_j \circ h^{-1}$, j = 1, 2, then

\[ g_2 \circ g_1^{-1} = f_2 \circ f_1^{-1}. \tag{9.4.13} \]

Thus, $[f_1] = [f_2] \in T(S)$ if and only if $[g_1] = [g_2] \in T(S')$. Therefore $f \mapsto f \circ h^{-1}$
is a bijection from T(S) to T(S′). It follows from (9.4.13) lifted to H that this
map is an isometry.

9.5 The universal Teichmüller space

We have noted that for all but some well-understood examples, the universal cover of
a Riemann surface can be taken to be H, and that any quasiconformal homeomorphism
of Riemann surfaces can be lifted to a quasiconformal homeomorphism of the covers.
For this reason, T (H) is called the universal Teichmüller space. The question is: how
to define T (H)?
We need a stronger definition of equivalence of quasiconformal self-maps. In fact,
with the definition (9.4.2), T (H) would consist of a single point:
Proposition 9.5.1. Any two quasiconformal homeomorphisms of H to H are homo-
topic.
The proof is left as Exercise 8.
As noted above, each normalized K -quasiconformal homeomorphism of H to H
is f μ for a unique μ in the unit ball B of L ∞ (H). In view of Theorem 9.3.7, it is
natural to take the equivalence relation for quasiconformal homeomorphisms f, g
of H to be: f ∼ g if there is a homotopy from f to g that is constant on R. However,
note that if we simply assume that f |R = g|R , then the homotopy constructed in
(9.4.9) is constant on R. Therefore we may define

f ∼ g if and only if f |R = g|R . (9.5.1)

Then T(H) is the quotient of the family F of normalized quasiconformal
homeomorphisms f : H → H by the equivalence relation (9.5.1):

\[ T(H) = F/\!\sim \ = \{f^\mu : \mu \in B\}/\!\sim. \tag{9.5.2} \]

The metric (9.4.7), as well as Theorems 9.4.5 and 9.4.6, carry over to T(H):

dT ([ f ], [g]) = ρD (μ f0 , μg0 ), (9.5.3)

where f 0 ∈ [ f ] and g0 ∈ [g] are extremal.

Theorem 9.5.2. The space T (H) is geodesically convex and complete.


We may also identify T(H) with the quotient of B by an equivalence relation:

\[ T(H) \cong B/\!\sim, \qquad \mu \sim \nu \iff f^\mu \sim f^\nu. \tag{9.5.4} \]

Let Q S denote the collection of normalized quasi-symmetric maps of R. The
equivalence class of any f μ is uniquely determined by its restriction to R, so we also
have
T (H) ∼= Q S. (9.5.5)

The family $F = \{f^\mu\}$ of normalized quasiconformal homeomorphisms of H is a
group under composition, as is QS, and the map $f \mapsto f|_R$ is a group homomorphism.
We may also make B a group by defining

\[ \mu \circ \nu = \mu_{f^\mu \circ f^\nu}. \tag{9.5.6} \]

After these general remarks, we pass to a construction that leads to a new and
very fruitful way to parametrize T(H). Given μ ∈ B, we can define a new Beltrami
coefficient on C by

\[ \mu^*(z) = \begin{cases} 0, & z \in H, \\ \mu(\bar z), & z \in H^*, \end{cases} \tag{9.5.7} \]

where H* is the lower half-plane {z ∈ C : Im z < 0}. Let $f_\mu = f^{\mu^*}$. Then $f_\mu$ maps
H conformally onto a domain bounded by the curve $L = f_\mu(R)$, and is a sense-reversing
quasiconformal map of H* onto the other component of the complement
of L, with

\[ \frac{(f_\mu)_{\bar z}(z)}{(f_\mu)_z(z)} = -\mu(z), \quad z \in H^*. \tag{9.5.8} \]
Theorem 9.5.3. If μ and ν belong to B, then $f^\mu \sim f^\nu$ if and only if $f_\mu = f_\nu$ on H.


Proof: Suppose that $f_\mu = f_\nu$ on H, and therefore on R as well. The maps of H defined
by

\[ g_\mu = f_\mu \circ (f^\mu)^{-1}|_H, \qquad g_\nu = f_\nu \circ (f^\nu)^{-1}|_H \tag{9.5.9} \]

are conformal and have the same image, so $g_\mu \circ g_\nu^{-1}$ belongs to Aut(H). This map
fixes 0, 1, and ∞, so it is the identity. Therefore $f^\mu = g_\mu^{-1} \circ f_\mu$ and $f^\nu = g_\nu^{-1} \circ f_\nu$
agree on R.
Conversely, suppose that $f^\mu \sim f^\nu$. Define a map g : C → C by

\[ g(z) = \begin{cases} f_\mu \circ f_\nu^{-1}(z), & z \in f_\nu(H \cup R); \\ f_\mu \circ (f^\mu)^{-1} \circ f^\nu \circ f_\nu^{-1}(z), & z \in f_\nu(H^*). \end{cases} \]

Since $f^\mu$ and $f^\nu$ agree on R, g is continuous on $f_\nu(R)$ and thus is a homeomorphism
of C. Now $g|_H$ is conformal. By (9.5.8),

μ fμ |H∗ = μ−1
f μ | H∗ , μ fν |H∗ = μ−1
f ν | H∗ .

By (9.4.5), $f_\mu \circ (f^\mu)^{-1}$ and $f^\nu \circ f_\nu^{-1}$ are both conformal on H*. Therefore g is
conformal, hence belongs to Aut(H). But g fixes 0, 1, ∞, so g is the identity. Thus,
$f_\mu = f_\nu$ on H.

Theorem 9.5.3 gives us another way to characterize T(H). The map from equivalence
classes to functions

\[ [f^\mu] \mapsto f_\mu|_H \]

is well defined, so

\[ T(H) \cong \{f_\mu|_H : \mu \in B\}. \tag{9.5.10} \]

There is a conformal invariant associated with conformal homeomorphisms
from H into C, namely the Schwarzian derivative, or simply the Schwarzian. The
Schwarzian {f, z} of a holomorphic function f is defined to be

\[ \{f, z\} = \left(\frac{f''(z)}{f'(z)}\right)' - \frac12 \left(\frac{f''(z)}{f'(z)}\right)^2. \tag{9.5.11} \]

To simplify the following statements, we assume that the functions in question are
defined in a simply connected domain Ω in S, which we generally take to be H.
The formula (9.5.11) extends to meromorphic functions, and in any case defines
a meromorphic function. In particular, we may compute the Schwarzian of a linear
fractional transformation f ∈ Aut(S) and check that

{ f, z} ≡ 0 if and only if f ∈ Aut(S). (9.5.12)

Proposition 9.5.4. (a) If f is holomorphic and f′ has no zeros, then {f, z} is
holomorphic.
(b) For meromorphic f and g,

\[ \{g \circ f, z\} = \{g, f(z)\}\, f'(z)^2 + \{f, z\}. \tag{9.5.13} \]

(c) If g is a linear fractional transformation,

{g ◦ f, z} = { f, z}. (9.5.14)

(d) Schwarzians { f 1 , z}, { f 2 , z} are identical if and only if

f2 = h ◦ f1 (9.5.15)

for some linear fractional transformation h.

We leave the proof of Proposition 9.5.4 as Exercise 9.
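
Parts (b) and (c) can be sanity-checked numerically at a point before working through the exercise. The sketch below is illustrative only: the helper names `schwarzian` and `mobius_derivs`, the choice f = exp, and the coefficients of the linear fractional map are assumptions made for the example. It uses the expanded form f‴/f′ − (3/2)(f″/f′)² of the Schwarzian, which agrees with (9.5.11) after differentiating the first term.

```python
import cmath

def schwarzian(d1: complex, d2: complex, d3: complex) -> complex:
    """Schwarzian from the values f', f'', f''' at a point: f'''/f' - (3/2)(f''/f')^2."""
    return d3 / d1 - 1.5 * (d2 / d1) ** 2

def mobius_derivs(a, b, c, d, z):
    """Value and first three derivatives of h(z) = (a z + b) / (c z + d)."""
    det, w = a * d - b * c, c * z + d
    return (a * z + b) / w, det / w**2, -2 * c * det / w**3, 6 * c**2 * det / w**4

z = 0.3 + 0.8j
# f = exp: every derivative equals exp(z), so S_f = 1 - 3/2 = -1/2.
f0 = f1 = f2 = f3 = cmath.exp(z)
S_f = schwarzian(f1, f2, f3)

# g is a linear fractional map, evaluated at the point f(z).
_, g1, g2, g3 = mobius_derivs(2, 1j, 1, 3, f0)
S_g = schwarzian(g1, g2, g3)

# Derivatives of the composition g∘f by the chain rule, up to order three.
c1 = g1 * f1
c2 = g2 * f1**2 + g1 * f2
c3 = g3 * f1**3 + 3 * g2 * f1 * f2 + g1 * f3
S_gf = schwarzian(c1, c2, c3)

print(abs(S_g) < 1e-12)                              # Schwarzian of g vanishes
print(abs(S_gf - (S_g * f1**2 + S_f)) < 1e-12)       # composition rule (9.5.13)
```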


It is common in this context to change the standard notation by considering z →
{f, z} as the definition of the function $S_f$. Then (9.5.13) and (9.5.14) are

\[ S_{g \circ f} = (S_g \circ f)\,(f')^2 + S_f; \tag{9.5.16} \]
\[ S_{g \circ f} = S_f, \quad g \in \mathrm{Aut}(S). \tag{9.5.17} \]

The next result tells how to recover f from its Schwarzian.

Theorem 9.5.5. If g is holomorphic and g  has no zeros, then solutions ϕ1 , ϕ2 of


the equation
\[ \varphi'' + \tfrac12 g\,\varphi = 0 \tag{9.5.18} \]

can be chosen in such a way that the quotient f = ϕ1 /ϕ2 satisfies S f = g.

Proof: We rely on some basic facts about linear differential equations. (Standard
proofs for functions of a real variable carry over to the complex case, working with
holomorphic coefficients and solutions.) Equation (9.5.18) has a two-dimensional
space of solutions. If $\varphi_1$ and $\varphi_2$ are two solutions, then a simple computation shows
that the Wronskian

\[ \varphi_1' \varphi_2 - \varphi_1 \varphi_2' \tag{9.5.19} \]

is constant. If the $\varphi_j$ are chosen to be independent, then (9.5.19) is not zero, and we
may normalize so that

\[ \varphi_1' \varphi_2 - \varphi_1 \varphi_2' = 1. \tag{9.5.20} \]

Then (9.5.20) implies that with $f = \varphi_1/\varphi_2$ we have $f' = 1/\varphi_2^2$. A further computation,
again using (9.5.20), shows that the Schwarzian of f is g.
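
A concrete instance may help in following the computation. The sketch below is an assumed example, not taken from the text: the coefficient is chosen as g(z) = −4/z² on the upper half-plane, for which two solutions of the equation can be written down by hand, and the code checks the normalization of the Wronskian, the identity f′ = 1/φ₂², and the conclusion that the Schwarzian of the quotient is g.

```python
def schwarzian(d1, d2, d3):
    """Schwarzian from f', f'', f''' at a point: f'''/f' - (3/2)(f''/f')^2."""
    return d3 / d1 - 1.5 * (d2 / d1) ** 2

z = 0.4 + 0.9j                 # sample point in the upper half-plane
g = -4 / z**2                  # assumed sample coefficient, holomorphic in H

# Two solutions of phi'' + (1/2) g phi = 0, that is, phi'' = 2 phi / z^2.
phi1, dphi1, ddphi1 = z**2, 2 * z, 2
phi2, dphi2, ddphi2 = 1 / (3 * z), -1 / (3 * z**2), 2 / (3 * z**3)

residual1 = ddphi1 + 0.5 * g * phi1
residual2 = ddphi2 + 0.5 * g * phi2
wronskian = dphi1 * phi2 - phi1 * dphi2

# f = phi1 / phi2 = 3 z^3 and its first three derivatives.
d1, d2, d3 = 9 * z**2, 18 * z, 18

print(abs(residual1) < 1e-12, abs(residual2) < 1e-12)
print(abs(wronskian - 1) < 1e-12, abs(d1 - 1 / phi2**2) < 1e-12)
print(abs(schwarzian(d1, d2, d3) - g) < 1e-12)
```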

Consider S fμ in H. For any h ∈ Aut(H), Sh = 0, so Proposition 9.5.4 (b) shows


that under the change of coordinates ζ = h(z), the expression
ω = S f (z) (dz)2

transforms as

\[ \omega(z) \to S_f(h(z))\,(h')^2 (dz)^2 = S_f(\zeta)\,(d\zeta)^2 = \omega(\zeta). \]

Thus, ω is invariant under any conformal change of variables in H. It is referred


to as a quadratic differential. Since S f is holomorphic, ω is termed a holomorphic
differential.

9.6 The Bers embedding

This section uses some results from Sections 2.2 and 4.1.
We saw in Section 9.5 that T (H) can be identified with the family of holomorphic
maps { f μ |H }. Each such map is a homeomorphism onto some domain Ω ⊂ C. Theo-
rem 9.5.5 shows that f μ |H can be reconstructed uniquely from its Schwarzian (taking
into account the normalization of f μ ). Therefore we have one more identification:
Let
\[ s_\mu = S_{f_\mu|_H}. \tag{9.6.1} \]

Then [ f μ ] → sμ is well defined, and

T (H) ∼
= {sμ : μ ∈ B}. (9.6.2)

This identification makes possible another natural choice of metric on T (H).


Recall that the hyperbolic distance element at a point z in H is |dz|/(2 Im z). Therefore,
if h ∈ Aut(H), then

\[ \frac{|dz|}{2\,\mathrm{Im}\, z} = \frac{|h'(z)|\,|dz|}{2\,\mathrm{Im}\, h(z)}. \tag{9.6.3} \]

By (9.5.16) with f = f μ |H and h ∈ Aut(H),

\[ S_{f \circ h} = (S_f \circ h)\,(h')^2. \tag{9.6.4} \]

Combining (9.6.4) and (9.6.3), we get

\[ 4 (\mathrm{Im}\, z)^2 |s_\mu(z)| = 4 (\mathrm{Im}\, h(z))^2 |s_\mu \circ h(z)|. \tag{9.6.5} \]

In other words, the expression on the left in (9.6.5) is a conformal invariant in H. Set

\[ \|S_{f_\mu}\|_H = \sup_{z \in H} 4 (\mathrm{Im}\, z)^2 |s_\mu(z)|. \]
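
The invariance of the hyperbolic distance element that underlies this norm can be illustrated with a short computation. This is a sketch under stated assumptions: `mobius_real` is a hypothetical helper, the automorphism is taken in the form (az + b)/(cz + d) with real coefficients and ad − bc = 1, and the sample coefficients and point are arbitrary.

```python
def mobius_real(a, b, c, d):
    """Return h(z) = (a z + b)/(c z + d) and h'(z) for real a, b, c, d with ad - bc = 1."""
    if abs(a * d - b * c - 1) > 1e-12:
        raise ValueError("coefficients must satisfy ad - bc = 1")
    h = lambda z: (a * z + b) / (c * z + d)
    dh = lambda z: 1 / (c * z + d) ** 2
    return h, dh

h, dh = mobius_real(2.0, 1.0, 3.0, 2.0)     # 2*2 - 1*3 = 1
z = -0.7 + 0.25j                            # sample point in the upper half-plane
density_at_z = 1 / (2 * z.imag)
pulled_back = abs(dh(z)) / (2 * h(z).imag)  # density at h(z), multiplied by |h'(z)|
print(h(z).imag > 0, abs(density_at_z - pulled_back) < 1e-12)
```

The same identity, |h′(z)| = Im h(z) / Im z, is what converts the transformation rule for the Schwarzian into a statement about the weighted modulus.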

It will be helpful to note that conformal equivalence of H and D implies that


the invariance in (9.6.5) carries over to D, where the hyperbolic distance element is
dz/(1 − |z|2 ). Thus, if f : D → C is conformal, then
\[ (1 - |z|^2)^2 S_f(z) = (1 - |h(z)|^2)^2 S_f(h(z)), \quad h \in \mathrm{Aut}(D). \tag{9.6.6} \]

It is convenient to define

\[ \|S_f\|_D = \sup_{z \in D} (1 - |z|^2)^2 |S_f(z)|. \]

Theorem 9.6.1. (Nehari) Suppose that f : D → C is holomorphic.


(a) If f is conformal, then ||S f ||D ≤ 6.
(b) If ||S f ||D ≤ 2, then f is conformal.

We prove part (a) now. Part (b) is a consequence of Lemma 9.6.3 below. Suppose
that f is conformal and the maximum value of $(1 - |z|^2)^2 |S_f(z)|$ is attained at $z_0$.
We may take advantage of invariance under z → h(z), h ∈ Aut(D), to take $z_0 = 0$
and assume also that f(0) = 0 and f′(0) = 1. Thus,

\[ f(z) = z + a_2 z^2 + a_3 z^3 + \dots, \qquad S_f(0) = 6(a_3 - a_2^2). \]

The function

\[ g(z) = \frac{1}{f(1/z)} = z + \sum_{n=0}^{\infty} b_n z^{-n} \]

satisfies the conditions of the Area Theorem, Theorem 4.1.1, so $|b_1| \le 1$. But a
calculation shows that $b_1 = a_2^2 - a_3$, so we have $|S_f(0)| \le 6$.
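
The bound in part (a) can be seen to be attained in a familiar example. The sketch below is not from the text: it assumes the Koebe function k(z) = z/(1 − z)² as the test case, with its derivatives written out by hand, and the helper names are hypothetical. It evaluates the Schwarzian at 0, compares it with the coefficient formula, checks the expansion of 1/k(1/z), and samples the weighted modulus.

```python
def koebe_derivs(z):
    """k'(z), k''(z), k'''(z) for the Koebe function k(z) = z / (1 - z)^2."""
    return (1 + z) / (1 - z) ** 3, (4 + 2 * z) / (1 - z) ** 4, (18 + 6 * z) / (1 - z) ** 5

def schwarzian(d1, d2, d3):
    """Schwarzian from f', f'', f''' at a point: f'''/f' - (3/2)(f''/f')^2."""
    return d3 / d1 - 1.5 * (d2 / d1) ** 2

def weighted(z):
    """(1 - |z|^2)^2 |S_k(z)|, the quantity bounded by 6 in Theorem 9.6.1 (a)."""
    return (1 - abs(z) ** 2) ** 2 * abs(schwarzian(*koebe_derivs(z)))

def reciprocal_map(z):
    """g(z) = 1 / k(1/z)."""
    w = 1 / z
    return (1 - w) ** 2 / w

a2, a3 = 2, 3                          # k(z) = z + 2 z^2 + 3 z^3 + ...
S0 = schwarzian(*koebe_derivs(0))
b1 = a2**2 - a3                        # coefficient of 1/z in g

zeta = 2 + 1j
print(S0, 6 * (a3 - a2**2))
print(abs(reciprocal_map(zeta) - (zeta - 2 + b1 / zeta)) < 1e-12)
print(weighted(0.5), weighted(0.3 + 0.4j))
```

For this function the weighted modulus equals 6 along the real diameter and is smaller elsewhere, which is consistent with the constant 6 being sharp.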
Remark. Part (a) was proved by Kraus [126] and rediscovered by Nehari [152]. Part
(b) is deeper, and Nehari clearly considered it to be the principal result of his paper.
Theorem 9.6.1 (a) carries over to H, using

\[ \|S_f\|_H = \sup_{z \in H} 4 (\mathrm{Im}\, z)^2 |S_f(z)|. \]

We extend this expression to the space Q(H) of holomorphic functions ϕ on H:

\[ \|\varphi\|_H = \sup_{z \in H} 4 (\mathrm{Im}\, z)^2 |\varphi(z)|. \tag{9.6.7} \]

It clearly has the properties of a norm:

||aϕ||H = |a| ||ϕ||H , a ∈ C; ||ϕ + ψ||H ≤ ||ϕ||H + ||ψ||H .

Then Q(H) is a complete normal family. In particular, it is complete with respect to


the norm and is therefore a Banach space. Let Δ be the image of B in Q(H):

Δ = {sμ : μ ∈ B}. (9.6.8)


The rest of this section is devoted to the proof of the following result due to Ahlfors
[4].
Theorem 9.6.2. The image Δ of B under the map μ → sμ is open in Q(H).
We begin with the proof of Theorem 9.6.1 (b).
Lemma 9.6.3. If φ ∈ Q(H) and ||φ||H < 2, then φ ∈ Δ.

Proof: We know from Theorem 9.5.5 that $\phi = S_f$, where $f = v_1/v_2$ and the $v_j$ are
solutions of

\[ v'' + \tfrac12 \phi v = 0 \tag{9.6.9} \]

such that

\[ v_1' v_2 - v_2' v_1 = 1. \tag{9.6.10} \]

We want to extend f to the lower half-plane H* by finding a function that coincides
with f when Im z = 0 and whose maximal dilatation is finite. We take the extension
to be $F(\bar z)$, where

\[ F(z) = \frac{v_1(z) + (\bar z - z)\, v_1'(z)}{v_2(z) + (\bar z - z)\, v_2'(z)}, \quad z \in H. \]

By (9.6.10), the numerator multiplied by $v_2$, minus the denominator multiplied by
$v_1$, gives $\bar z - z$. Therefore the numerator and denominator have no common zeros.
Some computation shows that

\[ F_{\bar z} = \frac{1}{[v_2(z) + (\bar z - z)\, v_2'(z)]^2}; \tag{9.6.11} \]
\[ F_z = \frac{\tfrac12 \phi(z)(z - \bar z)^2}{[v_2(z) + (\bar z - z)\, v_2'(z)]^2}. \tag{9.6.12} \]

Therefore

\[ |F_z/F_{\bar z}| \le \tfrac12 \|\phi\|_H = k < 1. \]

It follows that the extension $f = F(\bar z)$ is quasiconformal but sense reversing, with

\[ \mu_f(z)^{-1} = -2\phi(\bar z)(\mathrm{Im}\, z)^2, \quad z \in H. \]
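
The two partial derivatives of F can be checked by finite differences. The sketch below rests on assumptions that are not in the text: it uses the constant coefficient φ = 2k², whose normalized solutions are sin(kz)/k and cos(kz); such a φ does not have finite norm, but the identities being tested are pointwise. The helper `wirtinger` is hypothetical and estimates the derivatives with respect to z and its conjugate from central differences in x and y.

```python
import cmath

k = 1.0
phi = 2 * k**2   # assumed constant coefficient: v'' + (phi/2) v = 0

def F(z: complex) -> complex:
    """F(z) = (v1 + (zbar - z) v1') / (v2 + (zbar - z) v2') with v1 = sin(kz)/k, v2 = cos(kz)."""
    zb = z.conjugate()
    v1, dv1 = cmath.sin(k * z) / k, cmath.cos(k * z)
    v2, dv2 = cmath.cos(k * z), -k * cmath.sin(k * z)
    return (v1 + (zb - z) * dv1) / (v2 + (zb - z) * dv2)

def wirtinger(fn, z, h=1e-6):
    """Central-difference estimates of (d/dz, d/dzbar) of fn at z."""
    fx = (fn(z + h) - fn(z - h)) / (2 * h)
    fy = (fn(z + 1j * h) - fn(z - 1j * h)) / (2 * h)
    return (fx - 1j * fy) / 2, (fx + 1j * fy) / 2

z = 0.3 + 0.7j
Fz, Fzbar = wirtinger(F, z)
denom = cmath.cos(k * z) + (z.conjugate() - z) * (-k * cmath.sin(k * z))
ratio = Fz / Fzbar

print(abs(Fzbar - 1 / denom**2))              # small: matches 1 / denominator^2
print(abs(ratio + 2 * phi * z.imag**2))       # small: F_z / F_zbar = -2 phi (Im z)^2
```

The ratio test is the one that matters for the dilatation estimate, since its modulus is 2|φ|(Im z)².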

If f extends continuously to the boundary from H, then f is continuous on C, and


we may conclude that, after normalizing, f = f μ with μ = μ f in H∗ .
Suppose first that φ is analytic on R and has a zero of order ≥ 4 at ∞. The
function f is globally continuous on S and locally single-valued on C. The vanishing
assumption on φ implies that solutions of (9.6.9) have the form

\[ v(\zeta) = \frac{a}{\zeta} + b + \zeta \psi(\zeta), \quad \zeta = \frac1z, \]

where ψ is analytic in ζ. In fact (9.6.9) becomes an equation

\[ \zeta^2 \psi'' + 2\zeta \psi' + \frac{\phi}{2\zeta^4}\, \psi = \zeta g(\zeta), \quad g \text{ holomorphic at } 0, \]

which has a regular singular point at ζ = 0. Therefore

\[ \varphi_j = a_j z + b_j + O(z^{-1}), \quad a_1 b_2 - a_2 b_1 = 1, \]

and

\[ \lim_{z \to \infty} f(z) = \frac{a_1}{a_2} = \lim_{z \to \infty} F(z). \]

The fact that f is injective on C follows from the monodromy theorem. Composing
with a linear transformation will normalize f .
To this point, we have shown that if $\|\phi\|_H < 2$ and φ is analytic on R and has a
zero of order ≥ 4 at ∞, then φ is in Δ. We pass to the general case by approximation,
using a sequence of linear fractional transformations $\tau_n$ with the property that the
$\tau_n(H)$ expand to exhaust H and fix ∞. Such transformations are easily obtained in
D, fixing 1, in the form $h_n(z) = \rho_n z$ where $\rho_n \uparrow 1$, and then transplanted to H by
using the Cayley transform C. Let $\tau_n = C^{-1} \circ h_n \circ C$. If we set $\rho_n = (n - 1/2)/(n + 1/2)$, the
result is

\[ \tau_n(z) = \frac{2nz + i}{2n - iz}. \]
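
The formula for the transplanted map can be compared with a direct conjugation. The sketch below is an illustration under explicit assumptions: the Cayley transform is taken to be C(z) = (z − i)/(z + i), which may differ from the convention used elsewhere in the book, and the contraction factor is taken to be (2n − 1)/(2n + 1); both are choices made so that the conjugated map matches the displayed formula.

```python
def cayley(z):
    """Assumed convention: C maps the upper half-plane onto the unit disk, C(i) = 0."""
    return (z - 1j) / (z + 1j)

def cayley_inv(w):
    """Inverse of the assumed Cayley transform."""
    return 1j * (1 + w) / (1 - w)

def tau(n, z):
    """The map (2 n z + i) / (2 n - i z)."""
    return (2 * n * z + 1j) / (2 * n - 1j * z)

def tau_via_disk(n, z):
    """Conjugate the dilation w -> rho w of the disk back to the half-plane."""
    rho = (2 * n - 1) / (2 * n + 1)     # assumed contraction factor
    return cayley_inv(rho * cayley(z))

z = -1.3 + 0.4j
for n in (1, 5, 50):
    print(n, abs(tau(n, z) - tau_via_disk(n, z)), tau(n, z).imag > 0)
```

With these choices the image of the half-plane is the preimage of a smaller concentric disk, so it stays inside the half-plane and grows with n.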

Let

\[ \phi_n = (\phi \circ \tau_n) \cdot (\tau_n')^2. \]

Now $\phi_n$ is analytic up to R. Moreover, $|\tau_n'| \le 1$ and $|\tau_n'(z)| = O(|z|^{-2})$ as |z| → ∞.


In addition,
||φn ||H ≤ ||φ||H < 2.

Therefore we can find $\{f_n\}$ with $S_{f_n} = \phi_n$ in H and with a fixed bound for $K_{f_n}$.
Normalizing, there is a subsequence that converges in C to a solution that is of the
form $f_\mu$, μ ∈ B.

We turn now to the proof of Theorem 9.6.2. We begin with some remarks about
the quasi-isometry property of the Beurling–Ahlfors extension, Theorem 8.7.8. If
the K -quasiconformal map ϕ : H → H is such an extension, then
\[ \frac{1}{c_1(K)} \frac{|dz|}{(\mathrm{Im}\, z)^2} \le \frac{|d\varphi(z)|}{(\mathrm{Im}\, \varphi(z))^2} \le c_1(K) \frac{|dz|}{(\mathrm{Im}\, z)^2}. \tag{9.6.13} \]

This may be rewritten in a conformally invariant form by noting that the density for
the hyperbolic metric on H is ηH (z) = 1/(2Im z). Therefore (9.6.13) is
\[ \frac{1}{c_1(K)}\, \eta_H(z)\, |dz| \le \eta_H(\varphi(z))\, |d\varphi(z)| \le c_1(K)\, \eta_H(z)\, |dz|. \tag{9.6.14} \]

We need a general fact about the hyperbolic density of a general simply connected
domain Ω, from Proposition 2.2.3:

\[ \frac{1}{4\, d(z, \partial\Omega)} \le \eta_\Omega(z) \le \frac{1}{d(z, \partial\Omega)}, \tag{9.6.15} \]

where d(z, ∂Ω) is the (euclidean) distance to the boundary.
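
The two-sided bound is easy to test in the two model domains where the density is explicit. The sketch below assumes the densities 1/(2 Im z) for the upper half-plane and 1/(1 − |z|²) for the unit disk, as used earlier in this section; `check_bounds` is a hypothetical helper and the sample heights and radii are arbitrary.

```python
def check_bounds(eta: float, dist: float) -> bool:
    """True if 1/(4 d) <= eta <= 1/d, the two-sided bound (9.6.15)."""
    return 1 / (4 * dist) <= eta <= 1 / dist

# Upper half-plane: density 1/(2 Im z), euclidean distance to the boundary is Im z.
half_plane = [check_bounds(1 / (2 * y), y) for y in (1e-3, 0.5, 2.0, 40.0)]

# Unit disk: density 1/(1 - |z|^2), euclidean distance to the unit circle is 1 - |z|.
disk = [check_bounds(1 / (1 - r * r), 1 - r) for r in (0.0, 0.3, 0.9, 0.999)]

print(all(half_plane), all(disk))
```

In the disk the upper bound is attained at the center and the ratio of the density to 1/d tends to 1/2 near the boundary, so neither model domain reaches the lower constant 1/4.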


The following theorem, due to Ahlfors, is key to the proof of Theorem 9.6.2. We
use the argument in [131].
If L is a Jordan curve in S, then a K -quasiconformal reflection in L is a K -
quasiconformal homeomorphism g : C → C that is the identity on L, is sense-
preserving on one component of the complement of L, sense-reversing on the other
component, and is an involution, which means that g ◦ g is the identity.
Theorem 9.6.4. Suppose that L is an unbounded Jordan curve in S that admits a
K -quasiconformal reflection g. Let Ω1 be one of the components of the complement
of L. Then there is a c(K )-quasiconformal reflection λ in L that is C 1 on Ω1 ∪ Ω2
and satisfies
|dλ(z)| ≤ C(K ) |dz|, z ∈ Ω1 . (9.6.16)

Proof: Let $h_1 : H \to \Omega_1$ and $h_2 : H^* \to \Omega_2$ be conformal maps. The assumptions
imply that the $\Omega_j$ are Jordan domains in S, so that the $h_j$ extend to homeomorphisms of R
onto the closure of $\Omega_j$. Assume that orientations are chosen so that $h_2^{-1} \circ h_1 : R \to R$
is increasing. Let $j(z) = \bar z$ and

\[ \psi = j \circ h_2^{-1} \circ g \circ h_1. \]

Then $\psi|_H$ is a K-quasiconformal homeomorphism of H. Since g and j are the identity
on R, it follows that $h_2^{-1} \circ h_1 = \psi$ on R. Therefore $h = h_2^{-1} \circ h_1$ is quasisymmetric.
Let ϕ be the Beurling–Ahlfors extension of h to H, and define

\[ \lambda = \begin{cases} h_2 \circ j \circ \varphi \circ h_1^{-1} & \text{on } \Omega_1 \cup L; \\ h_1 \circ \varphi^{-1} \circ j \circ h_2^{-1} & \text{in } \Omega_2. \end{cases} \tag{9.6.17} \]

Then λ is a c(K)-quasiconformal reflection in L that is $C^1$ on $\Omega_1 \cup \Omega_2$. The inequalities
(9.6.14) carry over to give, in particular,

\[ \frac{1}{c_1(K)}\, \eta_1(z)|dz| \le \eta_2(\lambda(z))|d\lambda(z)| \le c_1(K)\, \eta_1(z)|dz|, \quad z \in \Omega_1. \tag{9.6.18} \]

For the purpose of this proof, we shall abbreviate inequalities like (9.6.18), with
a constant that depends only on K , as η1 |dz| ∼ η2 |dλ(z)|. We want to show that
η1 (z) ∼ η2 (λ(z)). In view of (9.6.15), this amounts to showing that

d(z, L) ∼ d(λ(z), L). (9.6.19)

For this purpose, we use the circular distortion theorem, Theorem 8.6.1. Let ψ = h 1
on H ∪ R and ψ = λ ◦ h 1 ◦ j in H∗ . Then ψ : C → C is K -quasiconformal. Given
$z \in \Omega_1$ and $z_0 \in L$, let C be the circle in C,

\[ C = \{w \in \mathbb{C} : |w - h_1^{-1}(z_0)| = |h_1^{-1}(z) - h_1^{-1}(z_0)|\}. \]

Then λ(C) passes through λ(z) and λ(z 0 ) = z 0 . It follows from Theorem 8.6.1 that

|z − z 0 | ∼ |λ(z) − z 0 |.

Taking z 0 so that |z − z 0 | = d(z, L) and, separately, so that |λ(z) − z 0 | = d(λ(z), L),


we obtain (9.6.19).
Now we proceed to the proof of Theorem 9.6.2. Given $\phi_0 \in Q(H)$, with $\phi_0 = S_{f_0}$,
$f_0 = f_{\mu_0}$. Suppose that $f_0$ is K-quasiconformal. Let $\Omega = f_0(H)$, $L = f_0(R)$, $\Omega^* = f_0(H^*)$.
By Theorem 9.6.4, L admits a $c_0(K)$-quasiconformal reflection λ such that
|dz| ∼ |dλ(z)|. This implies inequalities

\[ \frac{1}{c_1(K)} \le |\lambda_{\bar z}| \le c_1(K). \]

If φ ∈ Q(H), with $\phi = S_f$, let $g = f \circ f_0^{-1}$. Then by (9.5.16)

\[ \phi - \phi_0 = S_{g \circ f_0} - S_{f_0} = (S_g \circ f_0)\,(f_0')^2. \]

The hyperbolic metric in Ω is given by


|dz|
ηλ(z) |dλ(z)| = .
2Im z

Therefore ||φ − φ0 ||H ≤ ε implies

|g(ζ )| ≤ εη(ζ )2 . (9.6.20)

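We want to show that for small ε, g has a quasiconformal extension. Let $\psi = S_g$ and
let $v_1$, $v_2$ be normalized solutions of $v'' + \tfrac12 \psi v = 0$:

\[ v_j'' + \tfrac12 \psi v_j = 0, \qquad v_1' v_2 - v_2' v_1 = 1. \tag{9.6.21} \]

Set

\[ g(\zeta) = \frac{v_1(\zeta)}{v_2(\zeta)}, \quad \zeta \in \Omega; \]
\[ g(\zeta) = \frac{v_1(\zeta^*) + (\zeta - \zeta^*)\, v_1'(\zeta^*)}{v_2(\zeta^*) + (\zeta - \zeta^*)\, v_2'(\zeta^*)}, \quad \zeta \in \Omega^*, \ \zeta^* = \lambda(\zeta). \]

Let $\Delta_j = v_j(\zeta^*) + (\zeta - \zeta^*)\, v_j'(\zeta^*)$. Then, using (9.6.21), we see that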


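\[ g_{\bar\zeta} = \frac{(\zeta - \zeta^*)[v_1'' \Delta_2 - v_2'' \Delta_1]\, \lambda_{\bar\zeta}}{\Delta_2^2} = \frac{(\zeta - \zeta^*)^2\, \tfrac12 \psi(\zeta^*)\, \lambda_{\bar\zeta}(\zeta)}{\Delta_2^2}; \tag{9.6.22} \]
\[ g_\zeta = \frac{[v_1' + (\zeta - \zeta^*)\, v_1'' \lambda_\zeta]\, \Delta_2 - [v_2' + (\zeta - \zeta^*)\, v_2'' \lambda_\zeta]\, \Delta_1}{\Delta_2^2} \tag{9.6.23} \]
\[ \phantom{g_\zeta} = \frac{1 + (\zeta - \zeta^*)^2\, \tfrac12 \psi(\zeta^*)\, \lambda_\zeta(\zeta)}{\Delta_2^2}. \tag{9.6.24} \]

Therefore

\[ \mu_g(\zeta) = \frac{\tfrac12 (\zeta - \zeta^*)^2\, \psi(\zeta^*)\, \lambda_{\bar\zeta}(\zeta)}{1 + \tfrac12 (\zeta - \zeta^*)^2\, \psi(\zeta^*)\, \lambda_\zeta(\zeta)}, \quad \zeta \in \Omega^*. \tag{9.6.25} \]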

Now $|\lambda_z| < |\lambda_{\bar z}| < c(K)$ and $|\zeta - \zeta^*| < C/\eta(\zeta^*)$, so

\[ |\mu_g| \le \frac{\varepsilon\, c(K)}{1 - \varepsilon\, c(K)} < 1 \tag{9.6.26} \]

for sufficiently small ε. We need to show that g is continuous and injective. Once
again this is true if L is an analytic curve and ψ is analytic on L with a zero of
order 4 at ∞. Therefore we proceed again by approximation. With τn as before, let
f n = f 0 ◦ τn and L n = f n (R). Then L n admits a quasiconformal reflection and ψ
is analytic on L n . Since τn (H∗ ) ⊃ H∗ , by Proposition 2.2.2 the hyperbolic density
ηn of f n (H∗ ) is ≥ η = ηH∗ , so |ψ| ≤ εη implies |ψ| ≤ εηn . The associated maps gn
with Sgn = ψ in Ωn satisfy (9.6.26) uniformly. Therefore a subsequence converges
to a quasiconformal g that equals g in H∗ .

9.7 Further developments

Pushing these results much further requires many additional technical steps, and is
beyond the scope of this book. In this section, we give a very brief look at some more
of the theory.
Let us mention first what is referred to as Teichmüller’s theorem. Opinions seem
to differ about how close Teichmüller came to a rigorous proof of this, especially
the statement of existence. The theorem—and the theory—have been generalized to
many kinds of non-compact Riemann surfaces, and in other ways as well; see the
references in the last section.
The exact statement of the theorem varies somewhat from monograph to mono-
graph, but the following is the gist.

Theorem 9.7.1. If S is a compact Riemann surface, then in every homotopy class
of sense-preserving homeomorphisms of S onto another Riemann surface, there is a
unique map whose maximal dilatation k is the smallest. The Beltrami coefficient has
the form

\[ \mu(z) = k\, \frac{\overline{\phi(z)}}{|\phi(z)|}, \tag{9.7.1} \]

where

\[ \phi\, (dz)^2 \tag{9.7.2} \]

is a holomorphic quadratic differential on S.

We encountered a holomorphic quadratic differential at the end of Section 9.5 in
the form (9.7.2), with $\phi = S_f$. The theory leading up to Teichmüller’s theorem and
its generalizations relies heavily on the study of such differentials, as is evident from
some of the titles of the references for this chapter. The study focuses particularly on
the “trajectories” of such differentials. Some neighborhood U of a point $z_0$ where φ
is injective can be parametrized by

\[ \zeta = \int_{z_0}^{z} \sqrt{\phi(t)}\, dt. \]

Then $(d\zeta)^2 = \phi(z)(dz)^2$. A curve γ : [a, b] → U is said to be a horizontal trajectory
of φ if $\arg[\phi(\gamma(t))(\gamma'(t))^2] = 0$, a vertical trajectory if $\arg[\phi(\gamma(t))(\gamma'(t))^2] = \pi$.
Carefully chosen horizontal and vertical trajectories eventually provide the desired
type of parametrization of the moduli space.

9.8 Higher Teichmüller theory

To put this subject in context, we follow [215] and begin with a bird’s eye view
of Teichmüller theory itself. Let S be a compact Riemann surface of genus g > 1.
The Teichmüller space T (S) consists of equivalence classes of pairs (S  , f ), where
S  is a Riemann surface and f : S → S  is a quasiconformal orientation-preserving
homeomorphism. The equivalence relation is:

(S1 , f 1 ) ∼ (S2 , f 2 ) ⇔ f 2 ◦ f 1−1 is homotopic to a conformal map.

The universal cover of S′ can be taken to be H. Choosing a base point in S′, there
is a homomorphism

\[ \pi_1(S') \to \mathrm{Aut}(H) = PSL(2, R) \]

from the fundamental group $\pi_1(S')$ to the group of deck transformations. Given
(S′, f) as above, there is an injective map f∗:

\[ f∗ : \pi_1(S) \to \pi_1(S') \to \mathrm{Aut}(H) \cong PSL(2, R). \]

This can be seen to induce an injective map, the holonomy, from T(S) to
a set of homomorphisms:

\[ \mathrm{hol} : T(S) \to \mathrm{Hom}\left(\pi_1(S), PSL(2, R)/PSL(2, R)\right), \tag{9.8.1} \]

where PSL(2, R)/PSL(2, R) denotes the space Aut(H) = PSL(2, R) up to inner
automorphisms.
The Teichmüller space T (S) is a connected component of the target space on the
right in (9.8.1). There is a second connected component, the image of T ( S̄), suitably
constructed, where S̄ is the topological surface S with the opposite orientation.
The idea behind higher Teichmüller theory is to replace PSL(2, R) in this picture
by a Lie group G of higher rank, in such a way that the associated (higher)
Teichmüller space of S is a union of connected components of

Hom(π1 (S), G)/G

that consists of discrete faithful representations of π1 (S). This happens only for
special choices of G.
In retrospect, the theory seems to have begun in 1992 with results of Hitchin [108].
The fact that Hitchin’s construction fits into the theory as described here was proved
in 2006 by Fock and Goncharov [77] and by Labourie [128]. A second family of
higher Teichmüller spaces was defined in a different way, and was shown to fit the
definition above by Burger, Iozzi, Labourie, and Wienhard [36].
This history, various points of view, and further developments of the subject
are described in Wienhard’s survey article [215]. There are many interactions with
other research areas, such as Lie theory, representation theory, and ergodic theory.
However, as noted in [215],
In classical Teichmüller theory complex analytic methods and the theory of quasiconformal
mappings play a crucial role. These aspects are so far largely absent from higher Teichmüller
theory.

An exception to this is Dumas and Sanders [62].

Exercises

1. Prove Corollary 9.2.3.


2. Show that the set of points in H that are equidistant from two distinct points of
H with respect to the hyperbolic metric is a geodesic.
3. Construct a homeomorphism of H onto itself that cannot be extended continu-
ously to a homeomorphism of the boundary.
4. Prove Proposition 9.4.1. Hint: transfer the problem to D.
5. Show that every point of R is a limit point of G if G\H is compact. Hint: the
ratio between the euclidean and the hyperbolic diameter of a compact subset of
H goes to zero as the set approaches R.
6. Show directly that the set of quasisymmetric functions f : R → C, normalized
by f (0) = 0, f (1) = 1, is a group under composition.
7. Prove Theorem 9.4.6.

8. Prove Proposition 9.5.1. Hint: suppose that f and g are normalized, and use
(9.4.9).
9. Verify Proposition 9.5.4.
10. Fill in the details in the proof of Theorem 9.5.5.
11. Prove that every element of Q(H) is the Schwarzian of some function that is
meromorphic in H.
12. The universal Teichmüller space T(H) is a group under the composition
[f][g] = [f ◦ g]. The aim of this exercise is to show that the group composition
is not continuous with respect to the Teichmüller metric, following the
proof in [84]. Normalize elements [f] ∈ T(H) by requiring that −1, 1, ∞ are
fixed by f_R.

Show that there is a sequence { f n } ⊂ T (H) of normalized maps such that the
distance dT ( f n , 1) to the identity map 1 converges to zero and such that each f n
is asymptotically conformal, but dT (g ◦ f n , g) does not converge to zero, where
g(z) = z|z|. You may use the fact that to have dT ( f n , 1) → 0, it is enough to
have the restrictions to R satisfy
 
f n (x + t) − f n (x)
sup , f n (x) − f n (x − t) f n (x + t) − f n (x) → 0.
f n (x) − f n (x − t)

13. Verify (9.6.11) and (9.6.12).


14. Verify (9.6.22) and (9.6.24).

Remarks and further reading

The history of this subject exhibits gaps and then bursts of activity. Moreover, the
subject seems to have inspired an unusual number of pithy comments. We cannot
resist the temptation to summarize through some quotations. According to Weyl
[214], footnote, p.176, Fricke [79], with the aid of his study of canonical polygons,
succeeded in formulating and proving rigorously the statement of Riemann: the Riemann
surfaces of genus p ( p > 1) form a 6 p − 6-dimensional manifold.

The subject seems to have rested there until revived by Teichmüller in the late 1930s
and the early 1940s. Ahlfors [3] attacked the problem a decade later and wrote
In a systematic way the problem of extremal quasiconformal mapping was taken up by
Teichmüller in a brilliant and unconventional paper ... . He formulates the general problem
and, although unable to give a binding proof, is led by heuristic arguments to a highly elegant
conjectured solution. The paper contains numerous fundamental applications which clearly
show the importance of the problem. In a later publication [8] Teichmüller has offered a proof
of his main conjecture. In many respects this proof is an anticlimax when compared with
the original article. It is based on the method of continuity, which of all classical methods is
the least satisfactory, because of the nature of a posteriori verification.

Ahlfors goes on to give a variational proof of Teichmüller’s theorem. This paper
by Ahlfors led to an explosion of activity by Ahlfors, Bers, Ahlfors–Bers, and many
others. See, in particular, the introduction of a complex structure in Teichmüller space
by Bers [25]. For more on the history of the subject and on the two decades after [3],
see the books by Abikoff [1], Gardiner [82], Lehto [131], and Nag [150].
Here is the classic beginning of Abikoff’s review of [131], MR0867407:
In the late 1930s the study of variational techniques in complex analysis received a tremen-
dous boost from the work of M. M. Schiffer on the Bieberbach conjecture. Schiffer’s approach
emphasized the role of quadratic differentials as nonlinear differential equations satisfied by
any extremal mapping. Previously Grötzsch had studied the most nearly conformal maps
between plane domains. The deviation of these maps from conformality is measured by the
L ∞ -norm of the logarithm of the local distortion. The class of homeomorphisms of finite
norm consists of the quasiconformal mappings. In a pair of brilliant, but marginally read-
able, papers, Teichmüller completely revised the deformation theory of Riemann surfaces.
His methods were a brilliant merging and extension of the methods and ideas of Schiffer
and Grötzsch. The deformation theory he obtained is now called Teichmüller theory.
By now, Teichmüller theory has developed to a point of mathematical maturity. By this
term I mean five distinct qualities. First, its major practitioners speak so vastly different
languages that they can barely understand one another. It is being used as a tool in a wide
variety of mathematical and physical disciplines. It is serving as a model for new areas of
mathematical research. Several books have recently been written on the subject. Last, the
discipline is probably named after the wrong person.

Our exposition here is based mainly on the rather terse notes of Ahlfors [5] and the
expansive book of Lehto [131]. The state of the theory in the mid-1980s is described
well by the books of Abikoff, Gardiner, Lehto, and Nag mentioned above. Some of
the developments of the following two decades are contained in the books of Fletcher
and Markovic [76], Hubbard [111], [112], and Gardiner and Lakic [83].
Work in several directions is summarized in the chapters that have been added
in the second edition of [5]. Earle and Kra cover further work along the lines pioneered
by Ahlfors and Bers. Shishikura describes the work of Sullivan, Thurston, and
others relating quasiconformal mapping and complex dynamics. Hubbard outlines
Thurston’s remarkable work on 3-manifolds. For more on this and related topics,
including higher Teichmüller theory, see the expository article of Wolpert [217].
For still more, see the Handbook of Teichmüller Theory [95]. (Volume V of the
Handbook includes translations of Teichmüller’s papers [202], [203].) For a glimpse
into the different state of affairs in higher dimensions, see the remarks concerning
the Mostow rigidity theorem at the end of Chapter 8.
Chapter 10
The Bergman kernel

If Ω is a domain in C, the set H(Ω) of functions f that are holomorphic in Ω and
square-integrable with respect to the area measure $dm(z) = dx\,dy = \frac{i}{2}\,dz\,d\bar z$,

$$\int_\Omega |f(z)|^2\,dm(z) < \infty,$$

is a Hilbert space. The Bergman kernel K has the property that for f in H(Ω) and
z in Ω,

$$f(z) = \int_\Omega K(z,w)\,f(w)\,dm(w).$$

The kernel K itself is introduced in Section 10.1. Section 10.2 looks at its expan-
sion with respect to an orthonormal basis.
The Bergman kernel is a conformal invariant of the domain Ω. As such it can be
expected to be closely related to other such invariants. For simply connected Ω, one
such invariant is the inverse of the Riemann map from D to Ω. Section 10.3 exhibits
the relation of this inverse map to K .
The kernel function also has natural geometric significance. Section 10.4 covers
conformal invariance and the Bergman metric. Conformal invariance suggests that
the kernel K is closely related to other natural geometric features of a domain. We
have already seen this in Section 10.3 in connection with the Riemann map, and we
return to this theme in Section 10.5, in relation to conformal maps from domains that
are not simply connected, and in Section 10.6 in connection with natural boundary
problems for the Laplacian Δ in Ω.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 233
R. Beals and R. S. C. Wong, More Explorations in Complex Functions, Graduate Texts
in Mathematics 298, https://doi.org/10.1007/978-3-031-28288-1_10

10.1 The reproducing kernel

Suppose that Ω is a domain in C whose boundary contains more than one point. As
in the introduction, we denote by H (Ω) the space of functions f , holomorphic on
Ω, that are square-integrable on Ω. The corresponding inner product is

$$(f,g) = \int_\Omega f(z)\,\overline{g(z)}\,dm(z). \tag{10.1.1}$$

We assume throughout that Ω is such that H (Ω) contains non-zero functions. Note
that this is not obvious if Ω is unbounded (and is not true if the boundary contains
only one point; see Exercise 1).
Our first step here is to show that H(Ω) is a Hilbert space, i.e. that it is complete
with respect to the metric induced by the norm

$$\|f\| = (f,f)^{1/2} = \left(\int_\Omega |f(z)|^2\,dm(z)\right)^{1/2}.$$

Lemma 10.1.1. If f belongs to H(Ω) and z is a point of Ω, then

$$|f(z)| \le \frac{1}{\sqrt{\pi}\,R}\,\|f\| \tag{10.1.2}$$

where R is the distance from z to the boundary ∂Ω.

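Proof: For convenience, we translate coordinates so that z = 0. Then for 0 ≤ r < R,

$$|f(re^{i\theta})| = \Big|\sum_{n=0}^{\infty} a_n r^n e^{in\theta}\Big|, \qquad a_n = \frac{f^{(n)}(0)}{n!}.$$

We square and average with respect to θ to get

$$\frac{1}{2\pi}\int_0^{2\pi} |f(re^{i\theta})|^2\,d\theta
= \frac{1}{2\pi}\int_0^{2\pi} \sum_{n,m=0}^{\infty} a_n \bar a_m\, r^{n+m} e^{i(n-m)\theta}\,d\theta
= \sum_{n=0}^{\infty} |a_n|^2 r^{2n},$$

since

$$\frac{1}{2\pi}\int_0^{2\pi} e^{i(n-m)\theta}\,d\theta =
\begin{cases} 1, & m = n; \\ 0, & m \ne n. \end{cases}$$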

The disk $D_R(0)$ is contained in Ω, so

$$\begin{aligned}
\|f\|^2 &\ge \int_0^R\!\!\int_0^{2\pi} |f(re^{i\theta})|^2\, r\,d\theta\,dr \\
&= 2\pi \int_0^R \sum_{n=0}^{\infty} |a_n|^2 r^{2n+1}\,dr \\
&= 2\pi \sum_{n=0}^{\infty} \frac{|a_n|^2}{2n+2}\,R^{2n+2} \\
&\ge 2\pi\,\frac{|a_0|^2}{2}\,R^2 = \pi\,|f(0)|^2 R^2.
\end{aligned}$$ □

Note that the example Ω = D, f = 1, z = 0 shows that the estimate (10.1.2) is
sharp.

Proposition 10.1.2. H (Ω) is a Hilbert space.

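Proof: We must show that H(Ω) is complete. Suppose that $\{f_n\}$ is a Cauchy sequence
in H(Ω). It follows from Lemma 10.1.1 that the functions $\{f_n\}$ converge uniformly
on each compact subset of Ω. Therefore the limit function f is holomorphic. Since
$\{f_n\}$ is also a Cauchy sequence in $L^2(\Omega)$, which is complete, it converges in norm
to an element of $L^2(\Omega)$ that agrees with f almost everywhere. Thus f belongs to
H(Ω) and $f_n \to f$ in H(Ω). □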


Proposition 10.1.3. Given z ∈ Ω, there is a unique element $k_z$ ∈ H(Ω) such that
for each f in H(Ω),

$$f(z) = (f, k_z). \tag{10.1.3}$$

Proof: The estimate (10.1.2) shows that the linear map f → f(z) is bounded with
respect to the Hilbert norm. Since H(Ω) is a Hilbert space, any such map can be
written uniquely as an inner product with an element of H(Ω); this is Proposition
2.7.1. □


Proposition 10.1.4. The element $k_z$ satisfies the estimate

$$\|k_z\| \le \frac{1}{\sqrt{\pi}\,R} \tag{10.1.4}$$

where R is the distance from z to ∂Ω.

Proof. Apply (10.1.3) with $f(w) = k_z(w)$, and then (10.1.2):

$$\|k_z\|^2 = (k_z, k_z) = k_z(z) \le \frac{\|k_z\|}{\sqrt{\pi}\,R}.$$ □

The Bergman kernel for the domain Ω is defined by

$$K(z,w) = K(z,w;\Omega) = \overline{k_z(w)} = \overline{(k_z, k_w)} = (k_w, k_z), \tag{10.1.5}$$

where $k_z$ is the element of H(Ω) that is defined by (10.1.3). By definition, the kernel
K has the reproducing property for H(Ω): for each f ∈ H(Ω), we have

$$\int_\Omega K(z,w)\,f(w)\,dm(w) = f(z). \tag{10.1.6}$$

In fact, the integral (10.1.6) is just $(f, k_z)$, rewritten.

Proposition 10.1.5. The Bergman kernel is holomorphic as a function of z, and
anti-holomorphic as a function of w. Moreover, K has hermitian symmetry

$$K(z,w) = \overline{K(w,z)}, \tag{10.1.7}$$

and satisfies the estimate

$$|K(z,w)|^2 \le K(z,z)\,K(w,w) \le \frac{1}{\pi^2 R_z^2 R_w^2}, \tag{10.1.8}$$

where $R_a$ is the distance from a ∈ Ω to ∂Ω.

Proof: By definition, $k_z(w)$ ∈ H(Ω) is holomorphic in w, so the complex conjugate
K(z, w) is anti-holomorphic in w. The assertion (10.1.7) will then imply that K(z, w)
is holomorphic with respect to z. The identity (10.1.7) follows from (10.1.5). The
estimate (10.1.8) follows from (10.1.5), the Cauchy–Schwarz inequality, and (10.1.4),
since $K(z,z) = \|k_z\|^2$. □

The kernel K has a certain extremal property that is important for applications,
e.g. in Section 10.2.

Proposition 10.1.6. Suppose $z_0$ is a point of Ω. The unique solution to the problem
of minimizing ||f|| for f in H(Ω) such that $f(z_0) = 1$ is given by the function

$$f(z) = \frac{K(z, z_0)}{K(z_0, z_0)}. \tag{10.1.9}$$

Proof: We note first that the assumption that H(Ω) ≠ (0), together with (10.1.8),
implies that $K(z_0, z_0) > 0$, so f in (10.1.9) is well defined and satisfies $f(z_0) = 1$.
Now f is a scalar multiple of $k_{z_0}$, so any g ∈ H(Ω) can be written as

$$g = c f + h,$$

where c is constant and h is orthogonal to $k_{z_0}$. Then $h(z_0) = (h, k_{z_0}) = 0$, so $g(z_0) = 1$
implies that c = 1. Then

$$\|g\|^2 = (f + h, f + h) = \|f\|^2 + \|h\|^2.$$

Therefore the desired minimum is obtained where h = 0 and thus g = f. □




For a generalization of Proposition 10.1.6, see Exercise 2.


The assertion above is that K is separately holomorphic and anti-holomorphic in
its arguments, but not that it is even continuous as a function of the two variables
jointly. As we shall see, much more is true.
Lemma 10.1.7. If f belongs to H(Ω) and z is a point of Ω, then there is a constant
C, independent of f, z, and n, such that the derivatives of f satisfy

$$|f^{(n)}(z)| \le C\,\frac{n!\,2^n}{R^{n+1}}\,\|f\|, \tag{10.1.10}$$

where R is the distance from z to ∂Ω.

Proof: Use the Cauchy integral formula to write f (n) (z) as an integral over the circle
{ζ : |ζ − z| = R/2}, and use (10.1.2) to estimate | f (ζ )|. 


Corollary 10.1.8. The map $z \mapsto k_z$ is continuous from Ω to H(Ω).

Proof: Given z, z′ in Ω,

$$\|k_{z'} - k_z\| = \sup_{\|f\|=1} |(k_{z'} - k_z, f)| = \sup_{\|f\|=1} |f(z') - f(z)|.$$

The estimate (10.1.10) with n = 1 gives local estimates |f(z) − f(z′)| = O(|z′ − z|). □


Corollary 10.1.9. K(z, w) is jointly continuous in (z, w).

Proof: This follows from Corollary 10.1.8 and the estimate (10.1.2). □


Corollary 10.1.9 allows us to combine separate Cauchy integral formulas for K
with respect to z and to w:
Proposition 10.1.10. For $z_0$, $w_0$ in Ω, and sufficiently small r > 0, s > 0, the kernel
function K satisfies a double integral formula

$$K(z_0, w_0) = \frac{1}{(2\pi i)^2} \int_{|z - z_0| = r} \int_{|w - w_0| = s} \frac{K(z,w)\,dz\,dw}{(z - z_0)(w - w_0)}. \tag{10.1.11}$$

Expanding the denominator of the integrand gives the power series expansion:
Corollary 10.1.11. For $z_0$, $w_0$ in Ω, and sufficiently small r > 0, s > 0, the kernel
function K has a series expansion for $|z - z_0| < r$, $|w - w_0| < s$:

$$K(z, w) = \sum_{m,n=0}^{\infty} a_{mn}\,(z - z_0)^m\,\overline{(w - w_0)}^{\,n}. \tag{10.1.12}$$

Remark. The Bergman kernel is one example of what was later termed a reproducing
kernel; see the survey by Aronszajn [11]. An earlier example was examined by Szegő
[199] in connection with the Hardy space H 2 ; see Exercise 14.

10.2 Orthonormal bases

Suppose that $\{\varphi_n\}_{n=1}^{\infty}$ is an orthonormal basis for the Hilbert space H(Ω). Then each
f in H(Ω) has an expansion

$$f = \sum_{n=1}^{\infty} (f, \varphi_n)\,\varphi_n,$$

and

$$\|f\|^2 = \sum_{n=1}^{\infty} |(f, \varphi_n)|^2.$$

The series for f converges in norm, so by (10.1.2) it converges pointwise, uniformly on
compact sets. Thus for z in Ω,

$$f(z) = \sum_{n=1}^{\infty} (f, \varphi_n)\,\varphi_n(z). \tag{10.2.1}$$

In particular,

$$k_z(w) = \sum_{n=1}^{\infty} (k_z, \varphi_n)\,\varphi_n(w) = \sum_{n=1}^{\infty} \overline{\varphi_n(z)}\,\varphi_n(w).$$

Since $K(z,w) = \overline{k_z(w)}$, we have proved

Proposition 10.2.1. The Bergman kernel has an expansion

$$K(z,w) = \sum_{n=1}^{\infty} \varphi_n(z)\,\overline{\varphi_n(w)}, \tag{10.2.2}$$

where $\{\varphi_n\}_{1}^{\infty}$ is any orthonormal basis for H(Ω).

Let us consider two examples. The proofs are left as Exercises 4 and 5.

Proposition 10.2.2. The functions

$$\varphi_n(z) = \frac{n^{1/2}}{\pi^{1/2} R^n}\, z^{n-1}, \qquad n = 1, 2, \ldots \tag{10.2.3}$$

are an orthonormal basis for $H(D_R)$, where $D_R = D_R(0)$ is the disk of radius R
centered at the origin.

Corollary 10.2.3. The kernel function K for the disk $D_R(0)$ is

$$K(z,w) = \frac{1}{\pi R^2}\left(1 - \frac{z\bar w}{R^2}\right)^{-2} = \frac{R^2}{\pi (R^2 - z\bar w)^2}. \tag{10.2.4}$$
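The passage from the basis (10.2.3) to the closed form (10.2.4) can be checked numerically. The following is a minimal sketch, not part of the text: the function names, the sample points, and the truncation at 200 terms are illustrative assumptions.

```python
import math

def bergman_disk_closed(z, w, R=1.0):
    """Closed form (10.2.4): R^2 / (pi * (R^2 - z * conj(w))^2)."""
    return R**2 / (math.pi * (R**2 - z * w.conjugate())**2)

def bergman_disk_series(z, w, R=1.0, terms=200):
    """Partial sum of (10.2.2) using the orthonormal basis (10.2.3)."""
    total = 0j
    for n in range(1, terms + 1):
        coeff = math.sqrt(n / math.pi) / R**n   # normalizing constant of phi_n
        phi_z = coeff * z**(n - 1)
        phi_w = coeff * w**(n - 1)
        total += phi_z * phi_w.conjugate()
    return total

# Sample points inside the disk of radius 2 (arbitrary choices).
z, w = 0.3 + 0.4j, -0.2 + 0.5j
series_value = bergman_disk_series(z, w, R=2.0)
closed_value = bergman_disk_closed(z, w, R=2.0)
print(abs(series_value - closed_value))
```

The series converges geometrically with ratio $|z\bar w|/R^2$, so a modest number of terms suffices away from the boundary.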

Proposition 10.2.4. The functions

$$\psi_n(z) = \begin{cases}
\left[\dfrac{|n|}{\pi\,|R^{2n} - \rho^{2n}|}\right]^{1/2} z^{n-1}, & |n| - 1 = 0, 1, 2, \ldots; \\[2ex]
\left[2\pi \log(R/\rho)\right]^{-1/2} z^{-1}, & n = 0,
\end{cases} \tag{10.2.5}$$

are an orthonormal basis for $H(A_{\rho,R})$, where $A_{\rho,R}$ is the annulus

$$A_{\rho,R} = \{z : \rho < |z| < R\}. \tag{10.2.6}$$

The basis (10.2.5) leads to the formula

$$K(z,w) = \frac{1}{2\pi \log(R/\rho)}\,(z\bar w)^{-1} + \sum_{n \ne 0} \frac{n}{\pi (R^{2n} - \rho^{2n})}\,(z\bar w)^{n-1}. \tag{10.2.7}$$

This series can be summed to give an explicit formula for the Bergman kernel for
$A_{\rho,R}$ in terms of the Weierstrass function ℘; see [24], Section 1.4.
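The normalizing constants for the annulus can be checked directly: in polar coordinates the squared norm of $z^{n-1}$ on $\{\rho < |z| < R\}$ is $2\pi\int_\rho^R r^{2n-1}\,dr$, which equals $\pi (R^{2n} - \rho^{2n})/n$ for $n \ne 0$ and $2\pi\log(R/\rho)$ for $n = 0$. The sketch below compares this with a midpoint-rule quadrature. The function name, the radii 0.5 and 2, and the step count are assumptions made for illustration.

```python
import math

def psi_norm_sq(n, rho, R, steps=20000):
    """Midpoint-rule value of ||psi_n||^2 on the annulus rho < |z| < R.

    Uses the constants c_n^2 = |n| / (pi * |R^(2n) - rho^(2n)|) for n != 0
    and c_0^2 = 1 / (2 * pi * log(R / rho)).
    """
    if n == 0:
        c_sq = 1.0 / (2.0 * math.pi * math.log(R / rho))
    else:
        c_sq = abs(n) / (math.pi * abs(R**(2 * n) - rho**(2 * n)))
    h = (R - rho) / steps
    # |z^(n-1)|^2 * r = r^(2n-1) after the angular integration contributes 2*pi.
    radial = sum((rho + (k + 0.5) * h)**(2 * n - 1) for k in range(steps)) * h
    return 2.0 * math.pi * c_sq * radial

norms = {n: psi_norm_sq(n, rho=0.5, R=2.0) for n in (-3, -1, 0, 1, 2)}
print(norms)
```

Orthogonality of distinct powers follows from the angular integration, exactly as in the proof of Lemma 10.1.1, so only the norms need a numerical check.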
We turn now to a result that will allow us to prove a monotonicity result for
kernels. Suppose that Ω1 ⊂ Ω. Let ( , )1 and || ||1 denote the inner product and norm
in H (Ω1 ). Given u ∈ H (Ω), the restriction (also denoted u) belongs to H (Ω1 ) and
||u||1 ≤ ||u||.

Proposition 10.2.5. If Ω1 ⊂ Ω, there is an orthonormal basis $\{\varphi_n\}$ for H(Ω) such that
the restrictions $\{\varphi_n|_{\Omega_1}\}$ are orthogonal in H(Ω1).

Proof: We define an orthonormal set in H(Ω1) inductively. Choose $\psi_1$ ∈ H(Ω)
such that $\|\psi_1\|$ is minimal, subject to the constraint $\|\psi_1\|_1 = 1$. Having chosen
$\psi_1, \ldots, \psi_{n-1}$, choose $\psi_n$ ∈ H(Ω) such that $\|\psi_n\|$ is minimal, subject to the
constraints

$$(\psi_n, \psi_m)_1 = 0, \quad m = 1, \ldots, n-1; \qquad \|\psi_n\|_1 = 1. \tag{10.2.8}$$

We claim that $\{\psi_n\}$ is an orthogonal set in H(Ω). Suppose that we have shown that
$\{\psi_1, \ldots, \psi_{n-1}\}$ is an orthogonal set in H(Ω). Suppose that $\psi_n$ is not orthogonal in
H(Ω) to all the preceding $\psi_k$. Then we may choose some element g in the span of
these $\psi_k$, normalized with

$$\|g\|_1 = 1, \qquad (\psi_n, g) = -a < 0.$$

Given 0 < ε < 1, let $f_\varepsilon = \sqrt{1 - \varepsilon^2}\,\psi_n + \varepsilon g$. Then

$$\begin{aligned}
\|\psi_n\|^2 \le \|f_\varepsilon\|^2 &= (1 - \varepsilon^2)\|\psi_n\|^2 - 2a\varepsilon\sqrt{1 - \varepsilon^2} + \varepsilon^2 \|g\|^2 \\
&= \|\psi_n\|^2 - 2a\varepsilon + O(\varepsilon^2).
\end{aligned}$$

For small ε, the right-hand side is less than $\|\psi_n\|^2$, a contradiction. Therefore the set
$\{\psi_1, \ldots, \psi_n\}$ is orthogonal in H(Ω). It follows that the set

$$\{\varphi_n = k_n^{-1}\psi_n\}_{n=1}^{\infty}, \qquad k_n = \|\psi_n\|,$$

is orthonormal in H(Ω). Choosing a maximal such set and renumbering if necessary,
we obtain the desired orthonormal basis. □


Corollary 10.2.6. If Ω1 ⊂ Ω, then the kernels K 1 for Ω1 and K for Ω satisfy

K 1 (z, z) ≥ K (z, z), z ∈ Ω1 , (10.2.9)

with strict inequality unless H (Ω1 ) = H (Ω).

Proof: Propositions 10.2.1 and 10.2.5 imply the inequality (10.2.9). Equality can
only hold if the orthogonal basis for H (Ω) that is constructed in Proposition 10.2.5
is not only orthogonal in H (Ω1 ) but also complete in H (Ω1 ). 


For an example in which {ωm } is orthogonal but not complete in H (Ω1 ), see
Exercise 8.

10.3 Conformal mapping, I

We have assumed throughout that Ω is a domain in C with the property that ∂Ω
contains more than one point. In this section, we assume, in addition, that Ω is simply
connected. The Riemann mapping theorem says that in this case, given a point $z_0$
of Ω, there is a unique conformal map $f_1$ of Ω onto the unit disk D having the
properties

$$f_1(z_0) = 0, \qquad f_1'(z_0) > 0.$$

Let us renormalize by choosing $R = 1/f_1'(z_0)$, so that

$$f_R = R f_1 : \Omega \to D_R = D_R(0) \tag{10.3.1}$$

is the unique conformal map of Ω onto the disk $D_R(0) = \{w : |w| < R\}$ that satisfies

$$f_R(z_0) = 0, \qquad f_R'(z_0) = 1. \tag{10.3.2}$$

We may use Proposition 10.4.1 with $D_R$ as the target domain, together with Proposition
10.2.2, to construct an orthonormal basis for H(Ω) and compute the Bergman kernel K for
Ω. The basis is

$$\omega_n(z) = \varphi_n(f_R(z))\,f_R'(z) = \frac{n^{1/2}}{\pi^{1/2} R^n}\, f_R(z)^{n-1} f_R'(z), \qquad n = 1, 2, \ldots,$$

where $\varphi_n$ is given by (10.2.3), and the formula for K is

$$K(z,w) = \sum_{n=1}^{\infty} \omega_n(z)\,\overline{\omega_n(w)}. \tag{10.3.3}$$

The conditions (10.3.2) tell us that

$$\omega_1(z_0) = \frac{1}{\pi^{1/2} R}\, f_R'(z_0) = \frac{1}{\pi^{1/2} R}; \qquad \omega_n(z_0) = 0, \quad n = 2, 3, \ldots,$$

so, when one argument is $z_0$, the sum (10.3.3) collapses to

$$K(w, z_0) = \frac{1}{\pi R^2}\, f_R'(w). \tag{10.3.4}$$

In view of this, we have

$$K(z_0, z_0) = \frac{1}{\pi R^2}$$

and

$$f_R'(w) = \frac{1}{K(z_0, z_0)}\, K(w, z_0). \tag{10.3.5}$$

Since $f_R(z_0) = 0$, we may integrate (10.3.5) to obtain a formula for the conformal
map in terms of the Bergman kernel.

Theorem 10.3.1. Suppose that Ω is a simply connected domain in C such that ∂Ω
has more than one point. Given a point $z_0$ in Ω, the function

$$\Phi(z) = \frac{1}{K(z_0, z_0)} \int_{z_0}^{z} K(\zeta, z_0)\,d\zeta \tag{10.3.6}$$

is a conformal map of Ω onto the disk $D_R(0)$, where

$$R = \pi^{-1/2} K(z_0, z_0)^{-1/2}. \tag{10.3.7}$$

The map Φ is uniquely determined by the conditions

$$\Phi(z_0) = 0, \qquad \Phi'(z_0) = 1. \tag{10.3.8}$$
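For the unit disk D with base point $z_0 = a$, the integral (10.3.6) can be evaluated in closed form from the kernel $K(z,w) = 1/(\pi(1 - z\bar w)^2)$ of Corollary 10.2.3: it gives $\Phi(z) = (1 - |a|^2)(z - a)/(1 - \bar a z)$, with $R = 1 - |a|^2$. The sketch below checks this numerically. It is only an illustration: the function names, the base point, and the midpoint-rule quadrature along a straight segment (legitimate here because the disk is convex) are assumptions.

```python
import cmath
import math

def K_disk(z, w):
    """Bergman kernel of the unit disk, (10.2.4) with R = 1."""
    return 1.0 / (math.pi * (1.0 - z * w.conjugate())**2)

def riemann_map(z, z0, steps=4000):
    """Formula (10.3.6): integrate K(zeta, z0) along the segment from z0 to z."""
    dz = (z - z0) / steps
    total = sum(K_disk(z0 + (k + 0.5) * dz, z0) for k in range(steps)) * dz
    return total / K_disk(z0, z0)

z0 = 0.3 + 0.2j                                   # arbitrary base point in D
R = (math.pi * K_disk(z0, z0).real) ** -0.5       # radius from (10.3.7)

z = 0.1 - 0.6j                                    # arbitrary interior point
numeric = riemann_map(z, z0)
closed = (1 - abs(z0)**2) * (z - z0) / (1 - z0.conjugate() * z)

boundary_value = riemann_map(cmath.exp(1.0j), z0)  # image of a boundary point
print(abs(numeric - closed), abs(boundary_value), R)
```

The last line illustrates that boundary points of D are sent to the circle of radius R, in agreement with the theorem.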

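Three consequences of this result are the estimate

$$\left|\int_{z_0}^{z} K(\zeta, z_0)\,d\zeta\right| < \pi^{-1/2} K(z_0, z_0)^{1/2}, \tag{10.3.9}$$

the limit

$$\lim_{z \to \partial\Omega} \left|\int_{z_0}^{z} K(\zeta, z_0)\,d\zeta\right| = \pi^{-1/2} K(z_0, z_0)^{1/2}, \tag{10.3.10}$$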

and the fact that the kernel function has no zeros:

Corollary 10.3.2. The kernel function K has no zeros.

Proof: Since f R of (10.3.1) is conformal, its derivative has no zeros. 




For a simple, and much more direct, proof of this last result, see Exercise 3.
In light of Proposition 10.1.6 and the first area identity (1.2.9), we may look at
Theorem 10.3.1 in a different way.

Theorem 10.3.3. Suppose that $z_0$ is a point of Ω. The problem of finding a conformal
map Φ : Ω → C such that

$$\Phi(z_0) = 0, \qquad \Phi'(z_0) = 1$$

and Φ(Ω) has minimal area, has a unique solution

$$\Phi(z) = \frac{1}{K(z_0, z_0)} \int_{z_0}^{z} K(\zeta, z_0)\,d\zeta.$$

The image Φ(Ω) is the disk $D_R(0)$, where

$$R = \pi^{-1/2} K(z_0, z_0)^{-1/2}.$$

Remark. Theorem 10.3.3 is one of many examples in Bergman [24] recasting a
classical problem as an extremal problem.

10.4 Conformal invariance and the Bergman metric

Suppose that Ω and Ω′ are domains that are conformally equivalent, i.e. there is a
bijective holomorphic map Φ : Ω′ → Ω.

Proposition 10.4.1. The map $f \mapsto \tilde f$,

$$\tilde f(z) = f(\Phi(z))\,\Phi'(z), \qquad z \in \Omega',$$

is a unitary map from H(Ω) onto H(Ω′).

Proof: Note that the inverse map has the same form:

$$f(w) = \tilde f(\Phi^{-1}(w))\,[\Phi^{-1}]'(w), \qquad w \in \Omega.$$

Distinguishing the inner products by subscripts, we only need to show that for f, g
in H(Ω),

$$(\tilde f, \tilde g)_{\Omega'} = (f, g)_{\Omega}.$$

This is true if and only if

$$|\Phi'(z')|^2\,dx'\,dy' = dx\,dy \tag{10.4.1}$$

where we write z′ = x′ + iy′, z′ ∈ Ω′, and w = x + iy, w = Φ(z′) ∈ Ω. But this is
the local form of the first area formula in (1.2.9). Thus, we have established (10.4.1). □



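Proposition 10.4.2. If Φ is a conformal map of Ω′ onto Ω, then the Bergman kernels
K′ of Ω′ and K of Ω are related by

$$K'(z,w) = K(\Phi(z), \Phi(w))\,\Phi'(z)\,\overline{\Phi'(w)}. \tag{10.4.2}$$

Proof: Given z in Ω′ and w in Ω, let $k'_z$ and $k_w$ be the corresponding elements
that evaluate functions at z and w, respectively. Then if f belongs to H(Ω), and
w = Φ(z), then

$$(\tilde f, k'_z)_{\Omega'} = \tilde f(z) = f(\Phi(z))\,\Phi'(z) = (f, k_w)_{\Omega}\,\Phi'(z) = (f, \overline{\Phi'(z)}\,k_w)_{\Omega}.$$

Since $f \mapsto \tilde f$ is unitary, the right side equals $(\tilde f, \overline{\Phi'(z)}\,\tilde k_w)_{\Omega'}$. Thus,

$$k'_z = \overline{\Phi'(z)}\,\tilde k_w \quad \text{if } w = \Phi(z). \tag{10.4.3}$$

Then by (10.1.5) and (10.4.3), if Φ(z) = z′ and Φ(w) = w′, then

$$K'(z,w) = (k'_w, k'_z)_{\Omega'} = \Phi'(z)\,\overline{\Phi'(w)}\,(\tilde k_{w'}, \tilde k_{z'})_{\Omega'} = \Phi'(z)\,\overline{\Phi'(w)}\,(k_{w'}, k_{z'})_{\Omega} = K(z', w')\,\Phi'(z)\,\overline{\Phi'(w)}.$$ □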
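The transformation rule (10.4.2) can be tested in the simplest case, where both domains are the unit disk D (so the two kernels coincide) and Φ is a disk automorphism $\Phi(z) = (z - a)/(1 - \bar a z)$. This is a hedged numerical sketch: the function names, the parameter a, and the sample points are illustrative assumptions, and the kernel is the one from Corollary 10.2.3 with R = 1.

```python
import math

def K_disk(z, w):
    """Bergman kernel of the unit disk: 1 / (pi * (1 - z * conj(w))^2)."""
    return 1.0 / (math.pi * (1.0 - z * w.conjugate())**2)

a = 0.4 - 0.3j  # parameter of the automorphism (arbitrary point of D)

def phi(z):
    """Automorphism of the unit disk sending a to 0."""
    return (z - a) / (1.0 - a.conjugate() * z)

def dphi(z):
    """Derivative of phi: (1 - |a|^2) / (1 - conj(a) z)^2."""
    return (1.0 - abs(a)**2) / (1.0 - a.conjugate() * z)**2

z, w = 0.2 + 0.5j, -0.6 + 0.1j
lhs = K_disk(z, w)
rhs = K_disk(phi(z), phi(w)) * dphi(z) * dphi(w).conjugate()
print(abs(lhs - rhs))
```

Agreement here reflects the identity $1 - \Phi(z)\overline{\Phi(w)} = (1 - |a|^2)(1 - z\bar w)/\big((1 - \bar a z)(1 - a\bar w)\big)$.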


Bergman introduced a Riemannian metric on the domain Ω by setting the distance
element at z in Ω to be

$$ds^2 = K(z,z)\,(dx^2 + dy^2). \tag{10.4.4}$$

This is known as the Bergman metric.


We do not need to assume any knowledge of differential geometry to explain
what this means, and to prove the conformal invariance of the metric. The point of
an expression like (10.4.4) is to indicate how to compute distance, and one computes
distance by first computing the length of smooth curves. Suppose that γ : [a, b] → Ω
is such a curve: γ(t) = x(t) + iy(t), where x, y are differentiable functions of t. Then
γ′ = x′ + iy′ and the length of γ with reference to the metric (10.4.4) is

$$L(\gamma) = \int_a^b K(\gamma(t), \gamma(t))^{1/2}\,[x'(t)^2 + y'(t)^2]^{1/2}\,dt = \int_a^b K(\gamma(t), \gamma(t))^{1/2}\,|\gamma'(t)|\,dt.$$

The associated distance between two points z and w of Ω is the minimum length
among such curves that begin at one point and end at the other. (Conversely, one
can recover the metric from the distance function, using the observation that for
points very close to one another, the distance is approximately a constant times the
euclidean distance.)

Theorem 10.4.3. The Bergman metric is a conformal invariant: suppose that Φ is
a conformal map of Ω′ onto Ω. Then a smooth curve γ in Ω′ and the induced curve
$\tilde\gamma = \Phi \circ \gamma$ in Ω have the same length as measured by the Bergman metrics on Ω′
and Ω, respectively.

Proof: The length of $\tilde\gamma$ : [a, b] → Ω is

$$\int_a^b K(\Phi(\gamma(t)), \Phi(\gamma(t)))^{1/2}\,|[\Phi \circ \gamma]'(t)|\,dt
= \int_a^b \left\{ K(\Phi(\gamma(t)), \Phi(\gamma(t)))^{1/2}\,|\Phi'(\gamma(t))| \right\} |\gamma'(t)|\,dt. \tag{10.4.5}$$

By Proposition 10.4.2, the expression in braces is the square root of K′(γ(t), γ(t)),
where K′ is the Bergman kernel for Ω′. Therefore the right side of (10.4.5) is the
length of the curve γ. □


A geodesic for the Bergman metric is a curve γ that is locally of shortest length,
meaning that for some δ > 0, if z and w are points on γ and |z − w| ≤ δ, then no
curve from z to w has length shorter than the portion of γ from z to w. It is convenient
to renormalize the parametrization of the curve so that |γ′(t)| ≡ 1, so that (with some
new choice of a and b)

$$L(\gamma) = \int_a^b K(z(t), z(t))^{1/2}\,dt, \qquad |\gamma'(t)| \equiv 1. \tag{10.4.6}$$

With this parametrization, the necessary and sufficient conditions for γ to be a
geodesic are the Euler–Lagrange equations

∂   ∂  
x (t) = K (z(t), z(t))1/2 ) , y (t) = K (z(t), z(t))1/2 ) ,
∂x ∂y
(10.4.7)
see Exercise 15.

As an example, consider the unit disk D, for which the Bergman kernel is

$$K(z,w) = \frac{1}{\pi (1 - z\bar w)^2};$$

see Corollary 10.2.3. Thus the Bergman metric is

$$ds^2 = \frac{dx^2 + dy^2}{\pi (1 - r^2)^2}, \qquad r = |z|. \tag{10.4.8}$$

Up to a multiplicative constant, this is the Poincaré metric for the unit disk as a model
of hyperbolic geometry; see Section 2.2.
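As a small illustration of the length formula, the Bergman length of the radial segment from 0 to r in D is $\int_0^r \pi^{-1/2}(1 - t^2)^{-1}\,dt = \frac{1}{2\sqrt{\pi}}\log\frac{1 + r}{1 - r}$. The sketch below evaluates the general length integral by the midpoint rule and compares it with this closed form, for one segment along the real axis and one rotated segment. The function names, the choice r = 0.9, and the rotation angle are assumptions made for the example.

```python
import cmath
import math

def K_disk(z, w):
    """Bergman kernel of the unit disk: 1 / (pi * (1 - z * conj(w))^2)."""
    return 1.0 / (math.pi * (1.0 - z * w.conjugate())**2)

def bergman_length(gamma, dgamma, a, b, steps=20000):
    """Midpoint-rule value of the integral of K(gamma, gamma)^(1/2) |gamma'| dt."""
    h = (b - a) / steps
    total = 0.0
    for k in range(steps):
        t = a + (k + 0.5) * h
        p = gamma(t)
        total += math.sqrt(K_disk(p, p).real) * abs(dgamma(t)) * h
    return total

r = 0.9
closed = math.log((1 + r) / (1 - r)) / (2 * math.sqrt(math.pi))

# Radial segment along the real axis, parametrized by t in [0, r].
radial = bergman_length(lambda t: complex(t, 0.0), lambda t: 1.0 + 0j, 0.0, r)

# The same segment rotated about the origin; rotations preserve the metric.
rot = cmath.exp(0.7j)
rotated = bergman_length(lambda t: rot * t, lambda t: rot, 0.0, r)
print(radial, rotated, closed)
```

The length grows without bound as r → 1, which is the logarithmic blow-up expected of a hyperbolic metric.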
The interval (−1, 1) is a geodesic for this metric; see Exercise 16. As in Section
2.2, the remaining facts about the geometry follow by conformal invariance: rotations
around the origin are conformal maps of D to itself, so each diameter of D is a
geodesic. More generally, the linear fractional transformations

$$\Phi(z) = \omega \cdot \frac{z - a}{\bar a z - 1}, \qquad a \in D, \quad |\omega| = 1,$$

are conformal maps of D to itself. Linear fractional transformations map any line
to either a line or a circle. It can be deduced from these remarks that the remaining
geodesics of D are precisely the circular arcs in D that meet the boundary at right
angles. These are the "lines" for the hyperbolic geometry.
Remark. It follows from (10.4.8) that the Bergman metric for D blows up, as one
approaches a boundary point $z_0$, like $1/(2\sqrt{\pi}\,\rho)$, where ρ is the distance to the boundary.
This is true, not only qualitatively ($O(\rho^{-1})$) but also quantitatively, whenever
the boundary is smooth in a neighborhood of $z_0$; see Exercise 10.

10.5 Conformal mapping, II

We saw in Section 10.3 that the Bergman kernel can be used to give an explicit
formula for a conformal map of a simply connected domain onto a disk. A modified
version of this statement is also true for domains that are not simply connected. In this
case, the mapping takes such a domain to the plane minus a collection of horizontal
slits. (For a different proof of this mapping theorem, see Exercises 16–19 of Chapter
7.)

Fig. 10.1 Multiply connected domains.
In this section, we take Ω to be a bounded domain whose boundary ∂Ω consists
of smooth simple closed curves $\Gamma_1, \Gamma_2, \ldots, \Gamma_p$, where Ω lies in the bounded domain
enclosed by $\Gamma_1$; see the left side of Figure 10.1.
As we shall show,

Theorem 10.5.1. There is a conformal map Φ of Ω onto C \ S, where $S = S_1 \cup \cdots \cup S_p$
is a union of non-overlapping horizontal slits

$$S_j = \{a_j + s : 0 \le s \le s_j\}, \qquad j = 1, 2, \ldots, p.$$

See the right side of Figure 10.1.


The conformal map Φ will be constructed using a certain modification of the
Bergman kernel for Ω. We shall want a kernel with a single-valued integral in Ω.
Let $\tilde H(\Omega)$ be the subspace of H(Ω) consisting of functions that are derivatives
of functions that are holomorphic in Ω.

Lemma 10.5.2. For any $z_0$ in Ω and f in $\tilde H(\Omega)$, the integral

$$I_f(z) = \int_{z_0}^{z} f(\zeta)\,d\zeta, \qquad z \in \Omega, \tag{10.5.1}$$

is independent of the path of integration.

Proof: Suppose that f = g′ in Ω. Then $I_f - g$ is constant on any path that lies
in a simply connected subdomain of Ω. It follows that $I_f - g$ is constant in each
coordinate disk, hence constant. □

Corollary 10.5.3. The subspace $\tilde H(\Omega)$ is closed in H(Ω).

Proof: If $\{f_n\}$ in $\tilde H(\Omega)$ converges to f ∈ H(Ω), then the $\{f_n\}$ converge uniformly
to f on compact subsets of Ω. The corresponding integrals $\{I_{f_n}\}$, defined by (10.5.1),
converge uniformly on compact subsets to a holomorphic function whose derivative
is f. □

Corresponding to the closed subspace $\tilde H(\Omega)$ is a Bergman kernel $\tilde K$ defined as
before via the reproducing property

$$\tilde K(z,w) = \overline{\tilde k_z(w)}, \tag{10.5.2}$$

where $\tilde k_z$ is the unique element of $\tilde H(\Omega)$ such that for each f ∈ $\tilde H(\Omega)$,

$$f(z) = (f, \tilde k_z).$$

Exactly as for K, we have $\tilde K(z,z) > 0$ and

$$\tilde K(z,w) = \overline{\tilde K(w,z)}; \qquad |\tilde K(z,w)|^2 \le \tilde K(z,z)\,\tilde K(w,w). \tag{10.5.3}$$

Moreover, $\tilde k_z$ is the orthogonal projection onto $\tilde H(\Omega)$ of $k_z$ in H(Ω), so

$$\tilde K(z,z) \le K(z,z). \tag{10.5.4}$$

As with K, $\tilde K(z,w)$ is holomorphic in z and anti-holomorphic in w.
The principal roles are played by

$$M(t,z) = M(t,t) + \int_t^z \tilde K(\tau, z)\,d\tau \tag{10.5.5}$$

and a certain auxiliary function

$$N(t,z) = \frac{1}{\pi}\left[\frac{1}{t - z} + \lambda(z,t)\right],$$

where the function λ is holomorphic with respect to z ∈ Ω. In fact, for any choice
of z in Ω, the function
Φ(t) = M(t, z) + N (t, z) (10.5.6)

can be taken as the desired conformal map. The rest of this section is devoted to the
proof of this statement.

Lemma 10.5.4. There is a constant δ > 0 such that at each point z of the boundary
of Ω there is a disk of radius δ contained in Ω and tangent to ∂Ω at z, and also a
disk of radius δ contained in the complement of the closure of Ω and tangent to ∂Ω
at z.

Proof: At any point z of the boundary, there are two such disks of maximal radius
$r_1(z)$, $r_2(z)$. Let r(z) be the smaller of the $r_j(z)$. Then r(z) is continuous on ∂Ω, so
it has a minimum δ > 0. □

Lemma 10.5.5. Given t ∈ Ω, there are constants $C_1(t)$, $C_2(t)$ such that, for each
z in Ω such that the distance ρ(z) from z to ∂Ω is ≤ 1, we have

$$|M(z,t)| \le C_1(t) + C_2(t)\,|\log \rho(z)|. \tag{10.5.7}$$

Proof: In view of (10.5.5) and (10.5.3),

$$|M(t,z)| \le |M(z,z)| + \tilde K(z,z)^{1/2} \int_z^t \tilde K(\tau,\tau)^{1/2}\,|d\tau|.$$

Therefore it is enough to estimate the integral, and, in view of (10.5.4), we may
replace $\tilde K(\tau,\tau)$ by K(τ, τ). Let δ be as in Lemma 10.5.4. Let D be any disk of
radius δ contained in Ω and tangent to the boundary. The set of points in Ω whose
distance from the boundary is at least δ/2 is compact, so it is enough to estimate the
integral from the center of D along the radius ending at the boundary. By Corollary
10.2.6, K(τ, τ) for τ in D is dominated by the corresponding value of the Bergman
kernel for D. Up to a translation and rotation, we may assume that $D = D_\delta(0)$. Then,
by Corollary 10.2.3, we have

$$K(\tau,\tau) \le \frac{\delta^2}{\pi(\delta^2 - |\tau|^2)^2} = \frac{\delta^2}{\pi(\delta + |\tau|)^2(\delta - |\tau|)^2}, \qquad \tau \in D.$$

Therefore we want to estimate, for 0 < t < δ,

$$\int_0^t K(s,s)^{1/2}\,ds \le \pi^{-1/2} \int_0^t \frac{\delta\,ds}{(\delta + s)(\delta - s)}
= \frac{1}{2\pi^{1/2}} \log\frac{\delta + t}{\delta - t} = O(|\log(\delta - t)|).$$

But δ − t is the distance ρ(t) from t to the boundary. □



The construction of $\tilde K$ shows that if f belongs to $\tilde H(\Omega)$ then

$$\int_\Omega \tilde K(z,w)\,f(w)\,dm(w) = f(z).$$

Given t not in the closure of Ω, we can take $f(z) = (t - z)^{-1}$ and

$$\int_\Omega \frac{\tilde K(z,w)}{t - w}\,dm(w) = \frac{1}{t - z}.$$

It follows that

$$\frac{d}{dt} \int_\Omega \frac{\tilde K(z,w)}{w - t}\,dm(w) = \frac{1}{(t - z)^2}.$$

Therefore

$$I(t,z) \equiv \int_\Omega \frac{\tilde K(z,w)}{w - t}\,dm(w) = \frac{1}{z - t} + c_k(z) \tag{10.5.8}$$

where $c_k$ depends on the choice of z and on which component of the complement of
the closure of Ω contains t, i.e. which of the curves $\Gamma_k$ encloses t. (If $\Gamma_k$ encloses
the unbounded component, then by taking t → ∞ we see that $c_k = 0$.)
Since 1/(w − t) is integrable, we may define I (t, z) for t ∈ Ω by the same for-
mula:  
K (z, w) 1
I (t, z) ≡ dm(w) = , t ∈ Ω. (10.5.9)
Ω w−t z−t

This converges (consider radial coordinates centered at t).


Lemma 10.5.6. For t in Ω,

I (t, z) = −π M(z, t) + λ(z, t), (10.5.10)

where λ is holomorphic in t.

Proof: Let Ω be a domain with smooth boundary whose closure is contained in Ω,


such that t is in Ω . The integrand in (10.5.9) is regular in ∂Ω except at ζ = t. Let
Dε = Dε (t). For sufficiently small ε > 0, the closure Dε is contained in Ω . Let
Ωε = Ω \ Dε .
By the Cauchy–Green formula (1.2.8),
 (z, w)     
K K (z, w) K (z, w)
2i dm(w) = ∂w̄ dz + 2i dm(w).
Ω w−t ∂Ωε w − t Dε w − t

Since the integrand is integrable over Ω, the second integral on the right goes to zero
with ε. Now ∂w̄ {(w − t)−1 } = 0 and

(z, w) = ∂w K
∂w̄ K (w, z) = M(w, z).

Therefore
 (z, w)  
K K (z, w)
2i dm(w) − 2i dm(w)
Ω t −w Dε w − t

M(z, w)
= dz + o(1). (10.5.11)
∂Ω t −w

The second integral on the left is a holomorphic function of z, while the term on the
right has limit −π M(z, t).
Suppose now that {Ωn } is an increasing sequence of smoothly bounded domains,
such that ∂Ωn is in the 1/n neighborhood of ∂Ω. For each such n, the preceding
argument shows that
 (z, w)  (z, w)
K 1 K
dm(w) = −π M(z, t) + |dw|.
Ωn w−t 2i ∂Ωn w − t

The left side converges to I (z, t), so the integrals on the right have a limit λ(z, t)
that is holomorphic in z. 


Lemma 10.5.7. For fixed z in Ω, I (z, t) is a continuous function of t ∈ C.

Proof: Let t0 be a boundary point of Ω. Let δ be as in Lemma 10.5.4 and let D1


contained in Ω and D2 contained in the complement of the closure Ω be disks of
radius δ that are tangent to ∂Ω at t0 .
Write
 (z, w)  
K K (z, w)
I (t, z) = dm(w) + dm(w)
Ω\D1 w − t D1 w − t
= I1 (t, z) + I2 (t, z).

The Cauchy–Schwarz inequality shows that

\[ |I_1(t_2, z) - I_1(t_1, z)|^2 \le \|k_z\|^2 \int_{\Omega\setminus D_1} \frac{|t_1 - t_2|^2}{|w - t_1|^2\,|w - t_2|^2}\,dm(w). \tag{10.5.12} \]

For t1 , t2 ∈ D1 , the integral is bounded by integration over a larger set, the comple-
ment of D1 ∪ D2 .
Equation (10.5.8) implies that I (z, t) is uniformly continuous on the complement of Ω. Therefore it is enough to consider I (z, t) − I (z, t0) as t → t0 along the radius of D_j that ends at t0. We change coordinates and take δ = 1 and t0 = 0, with the boundary of Ω vertical at 0; see Figure 10.2.

Fig. 10.2 Estimating I1(t, z) − I1(t0, z): the disks D1 ⊂ Ω and D2, tangent to ∂Ω at t0 = 0, with t on the radius of D1 that ends at t0.



Note that the integrand in (10.5.12) is O(|w|−4 ) as |w| → ∞. It follows from this
and symmetry that it is enough to consider the case Im w > 0 and to estimate the
integral over the region that is shaded gray in the figure.
With w = x + iy, we have

\[ 1 \le |w+1|^2 = (x+1)^2 + y^2, \qquad |x| \le 1 - \sqrt{1-y^2} = \frac{y^2}{2} + O(y^4). \]

At a given value of y, the range of x is contained in

\[ -\frac{y^2}{2} + O(y^4) < x < \frac{y^2}{2} + O(y^4). \]

Therefore

\[ |w-t|^2 = (x-t)^2 + y^2 = (x^2 - 2xt) + t^2 + y^2 \ge \frac{y^2 t}{2} + t^2 + y^2 + O(y^4) \sim y^2 + t^2. \]

Moreover, |w|2 ≥ y 2 . Therefore the integral we are estimating is dominated by


 1  1/|t|  ∞
t 2 y 2 dy t ds ds tπ
= ≤ t = .
0 t 2 + y2 0 1 + s2 0 1+s 2 2

This completes the argument for I1 (z, t).


In the case of I2, let us take D1 to be the unit disk, and t0 = 1. Here, it is enough
to consider
  
(z, w) 1 1
I2 (t, z) − I2 (t −1 , z) = K − dm(w),
D w−t w − t −1

for 0 < t < 1. We want to invoke the Cauchy–Green formula (10.5.11), which takes
the form here
  
(z, w) 1 1
2i K − dm(w)
D w−t w − t −1
  
1 1
= M(w, z) − dw. (10.5.13)
|z|=1 w−t w − t −1

To justify this, we need to consider the singularity of M at the boundary point 1. The
formula is valid if we replace D by the portion of D that lies to the left of the arc of a
circle of radius ε centered at 1. Lemma 10.5.5 implies that the integral over the arc
is less than some constant times

\[ \int_0^\varepsilon \log s\,ds = \varepsilon\log\varepsilon - \varepsilon. \]

The limit as ε → 0 is (10.5.13). To complete the proof, we take a small disk D =


Dε (t) and use (10.5.13) to write
  
1 1 1
I2 (t, z) − I2 (t −1 , z) = M(w, z) − dw
2i |w|=1 w−t w − t −1
  
 1 1
+ K (z, w) − dm(w)
D w−t w − t −1
ε  
1 1 1
− M(w, z) − dw
2i |w|=ε w−t w − t −1

As ε → 0, the last two integrals on the right converge to 0 and to −π M(t, z), respectively.
On the unit circle z = z̄ −1 , so dz = −z −2 d z̄. Therefore, using the previous results
and taking the complex conjugate, we have

I2 (t, z) − I2 (t −1 , z)
  
1 1 1
=− M(w, z) − −1 dw − π M(t, z).
2i |w|=1 w−1 − t w − t −1

Since M(w, z) is holomorphic with respect to w ∈ D, and the residue at the pole
w = t −1 is −π M(w, z), we have I2 (t, z) = I2 (t −1 , z). 

Let

\[ N(t,z) = \frac{1}{\pi}\left[\frac{1}{t-z} + \lambda(z,t)\right], \tag{10.5.14} \]

where λ is the function in (10.5.10).

Lemma 10.5.8. Let Γk be one of the curves that bound Ω. Then

\[ \lim_{t\to\Gamma_k} N(t,z) = \lim_{t\to\Gamma_k} \overline{M(t,z)} + \frac{c_k}{\pi}. \tag{10.5.15} \]

Proof: Since the function I (t, z) is continuous at Γk, the formulas (10.5.8) and (10.5.10) must give the same value at any point of Γk. Thus,

\[ \lim_{t\to\Gamma_k}\left[\frac{1}{z-t} + c_k\right] = \lim_{t\to\Gamma_k}\left[-\pi\,\overline{M(t,z)} + \lambda(z,t)\right], \]

which is the same as (10.5.15).




We are now in a position to prove Theorem 10.5.1 in a more complete formulation.


Fix some z ∈ Ω and let
Φ(t) = M(t, z) + N (t, z). (10.5.16)

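Then (10.5.15) implies that

\[ \lim_{t\to\Gamma_k} \operatorname{Im}\Phi(t) = \lim_{t\to\Gamma_k} \frac{1}{2i}\left[M(t,z) + N(t,z) - \overline{M(t,z)} - \overline{N(t,z)}\right] = \frac{\operatorname{Im} c_k}{\pi}. \tag{10.5.17} \]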

Theorem 10.5.9. For any choice of z ∈ Ω, let Φ(t) = M(t, z) + N (t, z), where M
is defined by (10.5.5) and N is defined by (10.5.14) and (10.5.10). Then Φ is a
conformal map onto the complement in C of the union of disjoint horizontal slits
S1 ,…,S p .

Proof: By construction, Φ is holomorphic on Ω, except for a simple pole at t = z. Lemma 10.5.8 shows that Φ is continuous up to the boundary. For any complex value a not in Φ(∂Ω), integrating Φ′(t)/[Φ(t) − a] over the boundary shows that Φ takes the value a exactly once in Ω. Therefore Φ is a conformal map of Ω onto the complement of the union of the images Φ(Γk). By (10.5.17), each such image is a horizontal slit.
To complete the proof, we need to show that distinct boundary curves Γk have distinct images Sk = Φ(Γk). If j ≠ k, then we may find disjoint curves Γ′j, Γ′k homotopic to Γj, Γk, respectively. The images are disjoint and enclose Sj and Sk, respectively.



Remarks. 1. Theorem 10.5.9 contains the Riemann mapping theorem for domains
with smooth boundaries. In fact, suppose that Ω has a single boundary curve, so that
Φ(Ω) is the plane with a single slit. This domain can be mapped onto the unit disk
by an explicit conformal map; see Exercise 17.
2. The assumption that Ω has a smooth boundary in order for there to be a
conformal map as described in Theorem 10.5.1 can be very much weakened; see
Exercise 19.

10.6 The kernel function and partial differential equations

Suppose that Ω is a domain bounded by p analytic closed curves Γk, as in the left side of Figure 10.1. Two classic problems in partial differential equations associated with such a domain are the Dirichlet problem: given a continuous function f on the boundary ∂Ω, find a function u, continuous on the closure, such that

\[ \Delta u \equiv u_{xx} + u_{yy} = 0 \ \text{ in } \Omega, \qquad u = f \ \text{ on } \partial\Omega, \tag{10.6.1} \]

and the Poisson problem with Dirichlet boundary condition: given a bounded function f on Ω, find a function u, continuous on the closure, such that

\[ \Delta u = f \ \text{ in } \Omega, \qquad u = 0 \ \text{ on } \partial\Omega. \tag{10.6.2} \]

A function u that satisfies Δu = 0, as in (10.6.1), is said to be harmonic.


One useful tool here is the following form of Green’s identity for functions that are smooth in Ω and continuous on the closure:

\[ \int_{\partial\Omega}\left(\frac{\partial u}{\partial n}\,v - u\,\frac{\partial v}{\partial n}\right)|dz| = \int_\Omega (v\,\Delta u - u\,\Delta v)\,dm, \tag{10.6.3} \]

where ∂/∂n denotes differentiation in the direction of the outward normal vector.
By definition, a Green’s function for Ω is a function G(z, ζ), z, ζ ∈ Ω, with the properties that as a function of z it is harmonic in Ω \ {ζ}, continuous and equal to zero at the boundary ∂Ω, and has a logarithmic singularity at ζ:

\[ G(z,\zeta) = \log\frac{1}{|z-\zeta|} + h(z,\zeta), \tag{10.6.4} \]

where h is harmonic with respect to z in all of Ω. The maximum principle for harmonic functions implies that the Green’s function is unique.
A calculation shows that G(z, ζ ) is harmonic with respect to z, apart from z = ζ .
It is also harmonic with respect to ζ , so by uniqueness

G(z, ζ ) = G(ζ, z) (10.6.5)

and h(z, ζ ) = h(ζ, z) is harmonic in each variable.
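The properties just listed can be checked numerically in the simplest case. The sketch below assumes the standard closed form G(z, ζ) = log|1 − z ζ̄| − log|z − ζ| for the unit disk, which is not derived in the text above; the function name `green_disk` and the sample points are illustrative choices only.

```python
import math


def green_disk(z: complex, zeta: complex) -> float:
    """Green's function of the unit disk, normalized as in (10.6.4) (assumed closed form)."""
    return math.log(abs(1 - z * zeta.conjugate())) - math.log(abs(z - zeta))


z, zeta = 0.3 + 0.2j, -0.2 - 0.1j

# Symmetry (10.6.5): G(z, zeta) - G(zeta, z) should vanish up to rounding.
symmetry_gap = abs(green_disk(z, zeta) - green_disk(zeta, z))

# Boundary condition: G(z, zeta) = 0 when |z| = 1.
boundary_value = green_disk(complex(math.cos(1.0), math.sin(1.0)), zeta)

# Regular part h = G - log(1/|z - zeta|); for the disk it tends to
# log(1 - |zeta|^2) as z -> zeta, so the singularity is purely logarithmic.
delta = 1e-9
h_near_pole = green_disk(zeta + delta, zeta) + math.log(delta)
```

For this domain the regular part is h(z, ζ) = log|1 − z ζ̄|, which is visibly symmetric and harmonic in each variable.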


The existence of G can be established by solving the Dirichlet problem for h(·, ζ), with boundary condition

\[ h(z,\zeta) = -\log\frac{1}{|z-\zeta|}, \qquad z\in\partial\Omega. \]

For the existence of a solution, see Section 5.4.


The principal aim of this section is to connect Green’s function for Ω and the Bergman kernel for Ω. As we shall see,

\[ K(z,\zeta) = -\frac{2}{\pi}\,\frac{\partial^2 G}{\partial z\,\partial\bar\zeta}. \tag{10.6.6} \]

The importance of G is that it provides the solution to both the Dirichlet problem
(10.6.1) and the Poisson problem (10.6.2).
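The identity (10.6.6) can be tested numerically on the unit disk. The sketch below is an illustration under two assumptions that are not established at this point of the text: the closed form G(z, ζ) = log|1 − z ζ̄| − log|z − ζ| for the Green's function, and the closed form K(z, ζ) = 1/(π(1 − z ζ̄)²) for the Bergman kernel. The Wirtinger derivatives are approximated by central differences; all function names are hypothetical.

```python
import math


def green_disk(z: complex, zeta: complex) -> float:
    """Assumed closed form of the Green's function of the unit disk."""
    return math.log(abs(1 - z * zeta.conjugate())) - math.log(abs(z - zeta))


def bergman_disk(z: complex, zeta: complex) -> complex:
    """Assumed closed form of the Bergman kernel of the unit disk."""
    return 1 / (math.pi * (1 - z * zeta.conjugate()) ** 2)


def d_dz(f, z: complex, h: float = 1e-3) -> complex:
    """Central-difference approximation of the Wirtinger derivative (d/dx - i d/dy)/2."""
    fx = (f(z + h) - f(z - h)) / (2 * h)
    fy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)
    return 0.5 * (fx - 1j * fy)


def d_dzbar(f, z: complex, h: float = 1e-3) -> complex:
    """Central-difference approximation of the Wirtinger derivative (d/dx + i d/dy)/2."""
    fx = (f(z + h) - f(z - h)) / (2 * h)
    fy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)
    return 0.5 * (fx + 1j * fy)


def kernel_from_green(z: complex, zeta: complex) -> complex:
    """Right side of (10.6.6): -(2/pi) d^2 G / (dz d zeta-bar), by finite differences."""
    def dG_dz(s: complex) -> complex:
        return d_dz(lambda w: green_disk(w, s), z)
    return -(2 / math.pi) * d_dzbar(dG_dz, zeta)


z, zeta = 0.3 + 0.2j, -0.2 - 0.1j
approx = kernel_from_green(z, zeta)
exact = bergman_disk(z, zeta)
```

The two values agree to within the accuracy of the difference quotients, provided z and ζ are not too close to each other or to the boundary.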
Lemma 10.6.1. G extends to be harmonic in a neighborhood of the boundary ∂Ω.

Proof: The assumption that the boundary curves Γ are analytic (i.e. Γ(t) is an analytic function of t and Γ′ ≠ 0) implies that Γ extends to a coordinate chart in a neighborhood of any given boundary point Γ(t0). In this chart, the intersection with the nearby portion of ∂Ω becomes part of the real axis. Since the harmonic function G is zero on the boundary, the reflection principle says that it extends across.


Proposition 10.6.2. Given any continuous function f on ∂Ω, the unique solution to the problem (10.6.1) is

\[ u(z) = \frac{1}{2\pi}\int_{\partial\Omega}\frac{\partial G}{\partial n}(z,\zeta)\,f(\zeta)\,|d\zeta|, \qquad z\in\Omega. \tag{10.6.7} \]

Proof: Let u be the solution of (10.6.1). For small ε > 0, the closure of the disk Dε = Dε(z) is contained in Ω. Let Ωε = Ω \ Dε. Then G and u are both harmonic in Ωε, so Green’s formula (10.6.3) gives

\[ \int_{\partial\Omega_\varepsilon}\frac{\partial G}{\partial n}(z,\zeta)\,u(\zeta)\,|d\zeta| = \int_{\partial\Omega_\varepsilon} G(z,\zeta)\,\frac{\partial u}{\partial n}(\zeta)\,|d\zeta| = -\int_{\partial D_\varepsilon} G(z,\zeta)\,\frac{\partial u}{\partial n}(\zeta)\,|d\zeta|. \tag{10.6.8} \]

As ε → 0, on ∂Dε, we may replace ∂G/∂n by ∂h/∂n = 1/ε, so the left side of (10.6.8) has limit

\[ \int_{\partial\Omega}\frac{\partial G}{\partial n}(z,\zeta)\,f(\zeta)\,|d\zeta| - 2\pi u(z). \]

Similarly, the right side of (10.6.8) is O(ε|log ε|). This proves (10.6.7).


Proposition 10.6.3. Given any bounded continuous function f on Ω, the unique solution to the Poisson problem (10.6.2) is

\[ u(z) = \frac{1}{2\pi}\int_\Omega G(z,\zeta)\,f(\zeta)\,dm(\zeta). \tag{10.6.9} \]

Proof: With u defined by (10.6.9), the obstacle to a simple computation of Δu is the behavior at the singularity at ζ = z. We will approximate G by a family of smoother functions {Gε}. To this end, we first choose a non-decreasing twice-differentiable function ϕ of one variable with the properties

ϕ(r) = 0 if r ≤ 1/2; ϕ(r) = 1 if r ≥ 1.

Let

l(r) = ϕ(r) log r; lε(r) = ε² l(r/ε).

Then with s = r/ε, we have

\[ \Delta l_\varepsilon(r) = \left[\frac{d^2}{dr^2} + \frac{1}{r}\frac{d}{dr}\right] l_\varepsilon(r) = [\varphi\log]''(s) + \frac{1}{s}\,l'(s), \]

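so the integral over Dε(0) ⊂ R² is

\[ 2\pi\int_0^1 s\,[\varphi(s)\log s]''\,ds + 2\pi\int_0^1 [\varphi(s)\log s]'\,ds = 2\pi\int_0^1 \big[s\,(\varphi(s)\log s)'\big]'\,ds = 2\pi\Big[s\,(\varphi(s)\log s)'\Big]_0^1 = 2\pi\Big[s\,\varphi'(s)\log s + \varphi(s)\Big]_0^1 = 2\pi\varphi(1) = 2\pi. \]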

With this in mind, we approximate G by

G ε (z, ζ ) = −lε (|z − ζ |) + h(z, ζ ).

Now Gε is harmonic with respect to z for |z − ζ| > ε, so with u given by (10.6.9),

\[ \Delta u(z) = \frac{1}{2\pi}\int_{\Omega\cap D_\varepsilon(z)} \Delta l_\varepsilon(|z-\zeta|)\,f(\zeta)\,dm(\zeta). \tag{10.6.10} \]

For sufficiently small ε, the disk Dε is contained in Ω. Taking into account the calculation of the integral of Δlε and the continuity of f, we see that the limit as ε → 0 of the right side of (10.6.10) is f(z). Since Gε decreases to G, we obtain Δu(z) = f(z).


The principal step in getting to the identity (10.6.6) is to show that the function −(2/π)∂²G/∂z∂ζ̄ has the reproducing property.

Proposition 10.6.4. If f is holomorphic in Ω and continuous on the closure, then

\[ -\frac{2}{\pi}\int_\Omega \frac{\partial^2 G}{\partial z\,\partial\bar\zeta}(z,\zeta)\,f(\zeta)\,dm(\zeta) = f(z). \tag{10.6.11} \]

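Proof: Since

\[ G(z,\zeta) = \log\frac{1}{|z-\zeta|} + h(z,\zeta), \]

it follows that

\[ \frac{\partial G}{\partial z}(z,\zeta) = -\frac{1}{2}\,\frac{1}{z-\zeta} + \frac{\partial h}{\partial z}(z,\zeta); \qquad \frac{\partial^2 G}{\partial z\,\partial\bar\zeta}(z,\zeta) = \frac{\partial^2 h}{\partial z\,\partial\bar\zeta}(z,\zeta). \]

Therefore ∂²G/∂z∂ζ̄ has no singularities. Moreover, since 4∂z∂z̄ = Δ, it follows that ∂²G/∂z∂ζ̄ (z, ζ) is holomorphic with respect to z.
Given z ∈ Ω, the Cauchy–Green formula (1.2.8) gives

\[ \frac{1}{2i}\int_{\partial\Omega}\frac{\partial G}{\partial z}(z,\zeta)\,f(\zeta)\,d\zeta = \int_\Omega \frac{\partial^2 G}{\partial z\,\partial\bar\zeta}(z,\zeta)\,f(\zeta)\,dm(\zeta). \]

The left side is

\[ \frac{1}{2i}\int_{\partial\Omega}\left[-\frac{1}{2}\,\frac{1}{z-\zeta} + \frac{\partial h}{\partial z}(z,\zeta)\right] f(\zeta)\,d\zeta. \]

The integrand is meromorphic in Ω, so the value is π times the residue, i.e. −(π/2) f(z).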



Proposition 10.6.5. The Bergman kernel K (z, ζ) for Ω is continuous up to the boundary in either variable.

In fact, the kernel K extends analytically across the boundary. This uses again
the assumption that the boundary curves are analytic arcs. For the lengthy proof, we
refer to Bergman [24], Chapter 5, Section 3. This being the case, we can use K in
place of f in (10.6.11) to prove the identity (10.6.6):

2 ∂2G
K (z, η) = − (z, ζ )K (ζ, η) dm(ζ )
π Ω ∂z∂ζ

2 ∂2G
=− K (ζ, η) dm(ζ ).
π Ω ∂z∂η

Taking the complex conjugate of the last integral gives



2 ∂2G 2 ∂2G
− K (η, ζ ) (z, ζ ) dm(ζ ) = − (η, z).
π ∂ζ ∂z π ∂η∂z

Remark. The assumption that the domain Ω is bounded by analytic curves is not,
up to conformal equivalence, at all special; see Exercise 19.

Exercises

1. Suppose ∂Ω contains only one point. Show that if f is holomorphic on Ω and \(\int_\Omega |f|^2 < \infty\), then f ≡ 0.

2. Given points z1, . . . , zn in Ω and constants a1, . . . , an, consider the interpolation problem: find a function f, holomorphic in Ω, such that f(zj) = aj, j = 1, . . . , n, and || f || is minimal among such functions. Show that the problem has a unique solution and describe its form.

3. Suppose that H(Ω) contains a non-zero function.
(a) Show that for each z ∈ Ω there is a function in H(Ω) that does not vanish at z.
(b) Show that the kernel K has no zeros.
4. Prove Proposition 10.2.2 (including the completeness of the set {φn}).
5. Prove Proposition 10.2.4 (including the completeness of the set {ψn}).
6. Find necessary and sufficient conditions that the function with Laurent expansion

\[ f(z) = \sum_{n=-\infty}^{\infty} c_n z^n \]

belong to H(A(r, R)), where A(r, R) is the annulus (10.2.6).


7. Use Corollary 10.2.3 to prove that for each disk, the Bergman kernel K has the boundary behavior

\[ K(z,z) = \frac{1}{4\pi}\,\rho^{-2} + O(\rho^{-1}) \]

as the distance ρ from z to the boundary goes to zero. Note that this is independent of the radius of the disk.
8. Let Ω be the unit disk D and Ω1 be D with the segment [0, 1) removed. Show that each orthonormal basis for H(Ω) is an orthonormal set for H(Ω1), but is not complete in H(Ω1).
9. Suppose K is the kernel function for a domain Ω in C. Show that for any set {z1, z2, . . . , zn} of distinct points of Ω, and any set of constants {t1, t2, . . . , tn} in C,

\[ \sum_{1\le j,k\le n} K(z_j, z_k)\,t_j\,\bar t_k \ge 0. \]

10. Suppose that the domain Ω has a smooth boundary. Use Corollary 10.2.6 and
Exercise 7 to show that the kernel function K (z, z) has the same boundary
behavior, pointwise, as a disk. Hint: if z 0 ∈ ∂Ω, then there are disks D1 and D2
such that D1 is contained in Ω, D2 is disjoint from Ω, and the boundaries meet
at z 0 . An inversion Φ(z) = 1/(z − z 1 ), where z 1 is not in the closure of Ω, is a
conformal map with the property that D3 = Φ(D2 ) is a disk that contains Φ(Ω)
and is tangent to Φ(Ω) at Φ(z 0 ).
11. A linear map U from a Hilbert space H to itself is said to be unitary if it is
surjective and satisfies

(U f, U g) = ( f, g), all f, g ∈ H.

(a) Suppose that H has an orthonormal basis {φn}. Show that a linear map U : H → H is unitary if and only if {ψn} is an orthonormal basis, where ψn = Uφn.
(b) Suppose that U : H(Ω) → H(Ω) is unitary. Prove that U has a kernel K_U, i.e. a continuous function K_U(z, w) such that for any f ∈ H(Ω),

\[ Uf(z) = \int_\Omega K_U(z,w)\,f(w)\,dm(w). \]

12. An orthogonal projection in a Hilbert space H is a linear map P : H → H with the property that P(H) = H1 is a closed subspace of H, and Pf = f if f ∈ H1 and Pg = 0 if g is orthogonal to H1 (meaning that (f, g) = 0 for each f ∈ H1). Prove that if P : H(Ω) → H(Ω) is a projection, then P has a kernel K_P:

\[ Pf(z) = \int_\Omega K_P(z,w)\,f(w)\,dm(w). \]

13. Compute the Bergman kernel for the upper half-plane H.


14. The Hardy space H² is defined to be the space of functions f that are holomorphic in D and satisfy

\[ \sup_{0\le r<1} \frac{1}{2\pi}\int_0^{2\pi} |f(re^{i\theta})|^2\,d\theta < \infty. \tag{10.6.12} \]

(a) Show that this is a Hilbert space, and the square of the norm, || f ||²_H, is (10.6.12).
(b) Verify that if \(f(z) = \sum_{n=0}^{\infty} a_n z^n\), then

\[ \|f\|_H^2 = \sum_{n=0}^{\infty} |a_n|^2. \]

(c) Deduce that the inner product in H² of \(f = \sum a_n z^n\) and \(g = \sum b_n z^n\) is

\[ (f,g)_H = \sum_{n=0}^{\infty} a_n\,\bar b_n. \]

(d) Use (c) to select an orthonormal basis {φn}, n = 0, 1, 2, . . . , for H², and compute

\[ K_H(z,w) = \sum_{n=0}^{\infty} \overline{\phi_n(w)}\,\phi_n(z). \]

(e) Show that K_H has the reproducing kernel property: if f ∈ H², then

\[ f(z) = (f, K_H(z,\cdot))_H. \]

(f) Prove that

\[ K_H(z,w) = \overline{K_H(w,z)}; \qquad |K_H(z,w)|^2 \le K_H(z,z)\,K_H(w,w). \]


15. (a) Given a positive smooth function F(x, y, ẋ, ẏ), consider the problem of minimizing the integral

\[ \int_0^1 F(x(t), y(t), x'(t), y'(t))\,dt \]

for curves γ(t) = (x(t), y(t)) in the plane that have fixed endpoints (x0, y0) and (x1, y1). If such a curve is minimal, and ν(t) = (ξ(t), η(t)) is a smooth curve that begins and ends at (0, 0), then the value of the integral is not decreased by replacing (x, y) by (x + εξ, y + εη). Therefore a necessary condition on γ is that

\[ \frac{d}{d\varepsilon}\bigg|_{\varepsilon=0} \int_0^1 F(x+\varepsilon\xi,\ y+\varepsilon\eta,\ \dot x+\varepsilon\dot\xi,\ \dot y+\varepsilon\dot\eta)\,dt = 0. \]

Deduce from this, integration by parts, and the boundary conditions, the Euler–Lagrange equations

\[ \frac{d}{dt}\{F_{\dot x}\} = F_x; \qquad \frac{d}{dt}\{F_{\dot y}\} = F_y. \]
(b) Suppose that the function in part (a) has the form

F(x, y, ẋ, ẏ) = G(x, y)(ẋ 2 + ẏ 2 )1/2 .

For the minimization problem we may, and shall, choose the parametrization of the curve (x(t), y(t)) to satisfy

\[ [x'(t)]^2 + [y'(t)]^2 = 1. \]

Find the form of the Euler–Lagrange equations for curves with this parametriza-
tion and verify (10.4.7).
16. Calculate the Euler–Lagrange equations (10.4.7) for Ω = D and verify that the
interval (−1, 1) is a geodesic.
17. Map the plane minus a single (straight) slit conformally onto the unit disk. (One
may as well take the slit to be the interval [-1,1].)
18. (a) Compute the Green’s function for the unit disk. Hint: what is G(z, 0) in this
case?
(b) Use the result in (a) to verify (10.6.6) for D.
19. Suppose that Ω is a bounded domain in the plane, and suppose that the complement of Ω consists of an unbounded component Ω1 and p − 1 bounded components Ω2, …, Ωp. Show that Ω is conformally equivalent to a domain whose boundary consists of disjoint closed analytic curves. Hint: in case p = 1 there are two ways to proceed: (i) use the Riemann mapping theorem; (ii) choose a point z1 ∈ Ω, invert the plane by z → (z − z1)⁻¹, map the image of Ω to D, and follow this by the inversion w → w⁻¹.

20. Prove, by applying the Riemann mapping theorem p times, the converse of
Theorem 10.5.1: If Ω is the complement in the plane of p disjoint finite vertical
closed slits, then there is a conformal map of Ω to a domain bounded by p
analytic Jordan curves.

Remarks and further reading

The classic exposition of this material is in Bergman [24]. It covers the material in
this chapter much more completely. As we noted in the introduction, the Bergman
kernel has close connections to other natural conformal invariants associated to a
domain in C. These connections are exhaustively investigated in [24]. Other topics
there include further applications to partial differential equations, potential theory,
and functions of two complex variables. Krantz [125] contains some more recent
developments.
The Hilbert space H(D) of square-integrable holomorphic functions in the disk is often denoted A². It has Banach space generalizations A^p, the holomorphic functions that are p-th power integrable, 1 ≤ p < ∞, as well as non-Banach space versions with 0 < p < 1. These are known as Bergman spaces. There is now an extensive body of knowledge concerning their structure and properties; see Duren and Schuster [65].
Chapter 11
Theta functions

A polynomial P(z, w) in two complex variables induces a complex curve

C P = {(z, w) ∈ C × C : P(z, w) = 0}.

This can generally be extended in a natural way to a subset of S × S. If P is irreducible, then C_P ⊂ S × S is a compact Riemann surface. Conversely, every compact Riemann surface arises in this way; see [6] or [22], for the case of the Riemann surface of a function defined on a subset of C. This surface carries a complex structure, so an object of natural interest is the associated function field: the field of meromorphic functions on C_P. The study of this function field leads naturally to the study of the associated theta functions.
This chapter is a brief introduction to a vast topic. The basic classification of the curves C_P is the topological invariant called the genus. The curve in S × S is a compact oriented smooth manifold of real dimension two. Therefore, topologically, it is either a sphere (genus 0), a torus (genus 1), or a torus-like figure with more than one hole (genus > 1). Figure 9.1 in Chapter 9 illustrates the cases genus 0, 1, and 2.
Genus 0 presents no particular difficulty. The case of genus 1 is the case of
elliptic curves, which are treated in some detail in many presentations, including
[21] and [22]. In this chapter, we discuss the general case, concentrating on genus
≥ 2. However, we discuss in detail only the case of hyperelliptic curves, where the
necessary machinery can be calculated explicitly. This avoids long excursions into
more general and more abstract issues while providing some insight into the overall
picture.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 263
R. Beals and R. S. C. Wong, More Explorations in Complex Functions, Graduate Texts
in Mathematics 298, https://doi.org/10.1007/978-3-031-28288-1_11

11.1 Hyperelliptic curves

A hyperelliptic curve is one that has one of the two forms

\[ w^2 = P_{2g+1}(z), \qquad P_{2g+1}(z) = \prod_{j=1}^{2g+1}(z - r_j) \tag{11.1.1} \]

or

\[ w^2 = P_{2g+2}(z), \qquad P_{2g+2}(z) = \prod_{j=1}^{2g+2}(z - r_j), \tag{11.1.2} \]

where in either case the roots r j are assumed to be distinct. Here, g ≥ 2; the analogous
curve with g = 1 is termed an elliptic curve. It is easy to construct the Riemann
surface. The general process will be clear if we take the case g = 2 and specialize to
the case of real roots. In the degree six case, we have six real roots

r1 < r2 < r3 < r4 < r5 < r6 .

Slit the plane from r_{2j−1} to r_{2j}, j = 1, 2, 3, and let C+ denote the plane with the slits [r_{2j−1}, r_{2j}] removed. It is not difficult to see that a branch of the square root

\[ \sqrt{P_6(z)} \]

can be chosen on C+. We choose the branch that is positive as z → +∞, z ∈ R. We then take a second copy C− of the slit plane, and choose the other branch of √P on C− (Figure 11.1). Extend these choices to the corresponding slit spheres S±. The value of √P varies smoothly if we start in one copy of the slit sphere, cross one of the slits, and continue on from the corresponding slit on the other copy. Therefore it is natural to join the two copies across the slits and consider √P as a continuous function on the resulting figure. This figure is, topologically, a surface with two holes; see Figure 11.1. (Compare this with the analogous construction for P of degree 2.) In the case (11.1.1) with g = 2 and five distinct real roots r1 < · · · < r5, we set r6 = ∞ and proceed in the same way.
When the roots are not assumed real, then in place of the slits we find three non-
intersecting curves that join pairs of roots. For general g, this construction produces
a surface with g holes, i.e. with genus g—often described as a “sphere with g handles
attached.”
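The count of holes can be cross-checked with a short Euler characteristic computation. The sketch below is illustrative only and assumes the Riemann–Hurwitz formula for a two-sheeted cover of the sphere, which is a standard fact not proved in this chapter; the function name `hyperelliptic_genus` is hypothetical.

```python
def hyperelliptic_genus(degree: int) -> int:
    """Genus of w^2 = P(z) for a polynomial P of the given degree with distinct roots.

    Assumption (Riemann-Hurwitz, not proved in the text): a two-sheeted cover of
    the sphere with b simple branch points has Euler characteristic 2*2 - b.
    """
    if degree < 1:
        raise ValueError("degree must be at least 1")
    # For odd degree the point at infinity is an additional branch point.
    branch_points = degree if degree % 2 == 0 else degree + 1
    euler_characteristic = 2 * 2 - branch_points
    # A compact oriented surface of genus g has Euler characteristic 2 - 2g.
    return (2 - euler_characteristic) // 2


genera = {d: hyperelliptic_genus(d) for d in range(1, 9)}
```

Degrees 2g + 1 and 2g + 2 both give genus g, in agreement with the two forms (11.1.1) and (11.1.2).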
To complete the picture, we want to have a complex structure on the resulting
surface. At any point z 0 of S± , minus the slits, we have the usual choice of coordinate
z in a neighborhood. A slit can be considered as having two sides, corresponding
to the opening depicted in Figure 11.1. For a point on either side of the slit, there
is a neighborhood that lies partially in S+ and partially in S− . The local coordinate
z works in that neighborhood. Finally, near a root r_j that is the endpoint of a slit, a determination of w = √P, continued around r_j, takes us from one copy of S onto the other and back, once again providing a coordinate. Covering the curve with appropriate coordinate neighborhoods shows that it is a Riemann surface in the sense of Chapter 6.

Fig. 11.1 Slit planes; connecting the slit spheres. (The copy C+ carries the branch with w → +∞, the copy C− the branch with w → −∞.)
In general, if P(z, w) is irreducible, the corresponding curve has the same topology
as one of the hyperelliptic curves just described—a surface with g holes. However,
the various curves for a given genus g > 2 may have different complex structures:
see Farkas and Kra [67]. The point of Teichmüller theory is to construct a natu-
ral parametrization of the family of complex structures on a surface with a given
topology; see Chapter 9.

11.2 Cycles and differentials

We return here to a hyperelliptic curve Γg of genus g > 1, associated to the polynomial P_{2g+1}. We state without proof a number of facts about the differential topology of Γg. Each will be illustrated in the case g = 2; the reader may construct similar illustrations to verify the statements in other cases.
Our terminology “curve” here conflicts with our previous use of “curve” as a mapping γ from a real interval into C or into a Riemann surface, or to the image of γ (with an orientation coming from the parametrization). In this chapter, we will term such a map γ, or its image, a path. A smooth, closed path that does not intersect itself will be called a cycle.
It is possible to find cycles a1, a2, . . . , ag and b1, b2, . . . , bg that have intersection numbers

\[ a_j\cdot b_j = 1, \qquad a_j\cdot a_k = a_j\cdot b_k = b_j\cdot b_k = 0 \ \text{ if } j\ne k. \tag{11.2.1} \]

The intersection number a · b of two cycles a and b is defined as follows: a · b = 0 if the cycles do not meet; if a and b meet at a single point p, then it is assumed that the tangents ȧ and ḃ in the forward direction along a and b meet at an angle, and a · b is ±1 according to the sign of the angle from ȧ to ḃ (in local coordinates).

For a curve of genus 2, this is illustrated on the left in Figure 11.2. Such cycles are a homology basis for Γg, which means that every closed path in Γg is homologous to a path Σ [m_j a_j + n_j b_j], where the integer coefficients indicate repeating the cycle that number of times (in the opposite direction if the coefficient is negative), and the addition refers to concatenating the paths, following one by another. Homologous means homotopic, cycle by cycle, but in homology, the order does not matter since we will be concerned only with integration of holomorphic functions along the path (or, more generally, closed 1-forms). Then homotopic paths yield the same integral, and addition is commutative.
Let us spell this out in the case illustrated in Figure 11.2. Cutting the curve Γg along the cycles a_j, b_j, i.e. going to the complement of the union of the (images of) the a_j, b_j, leads to a simply connected region Γg that is topologically a 4g-sided polygon with boundary

\[ a_1 b_1 a_1^{-1} b_1^{-1}\, a_2 b_2 a_2^{-1} b_2^{-1} \cdots a_g b_g a_g^{-1} b_g^{-1}. \]

This is illustrated on the right in Figure 11.2, for the case g = 2. Points on a_j^{-1} are identified with points on a_j, and so on, to form Γg topologically.
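The boundary word of the polygon is easy to generate for any g, and it gives a quick consistency check on the genus. The sketch below is illustrative; it assumes the standard fact (not stated in the text) that under the side identifications all 4g corners of the polygon become a single point. The names `polygon_word` and `euler_characteristic` are hypothetical.

```python
def polygon_word(g: int) -> list[str]:
    """Boundary word a1 b1 a1^-1 b1^-1 ... ag bg ag^-1 bg^-1 of the 4g-sided polygon."""
    word = []
    for j in range(1, g + 1):
        word += [f"a{j}", f"b{j}", f"a{j}^-1", f"b{j}^-1"]
    return word


def euler_characteristic(g: int) -> int:
    """Euler characteristic V - E + F of the surface obtained from the polygon.

    Assumption: after identification there is 1 vertex, 2g edges
    (each cycle a_j, b_j appears twice on the boundary), and 1 face.
    """
    vertices, edges, faces = 1, 2 * g, 1
    return vertices - edges + faces


octagon = polygon_word(2)
```

For g = 2 this produces the eight sides of the octagon in Figure 11.2, and the Euler characteristic 2 − 2g = −2 of a surface with two holes.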

Fig. 11.2 A homology basis and the representation as a polygon.

Suppose that γ is a closed path in the curve Γg on the left in Figure 11.2. Let Ω
denote the interior of the octagon on the right. If the path γ stays in Ω, then it is
homotopic to a constant map, and counts as zero. Otherwise, each connected part of
the curve that lies in the interior is homotopic to a path that lies on the boundary.
Thus, γ is homotopic to a path that lies on the union of the cycles. Starting at the
intersection of two cycles, the path is homologous to a path that follows one of those
cycles, or its inverse, an integer number of times. Thus, eventually, we have a unique
homology representation of all of the path γ in the form m 1 a1 + m 2 a2 + n 1 b1 + n 2 b2
for some integers m j , n j , not necessarily positive.
A fundamental question concerning a curve Γg is to determine the field of functions
that are meromorphic on Γg . A general picture is given by the following result.

Proposition 11.2.1. If ϕ is a non-constant meromorphic function on Γg, then the number of zeros equals the number of poles (each counted according to multiplicity), and the number of poles is at least two.

Proof: Since Γg is compact, ϕ must have at least one pole. We may assume that there are no zeros or poles on the cycles {a_j} and {b_j}. (Otherwise, simply move the offending cycles slightly.) Integrate ϕ′/ϕ over the boundary ∂Γg. The integral over a_j and the integral over a_j^{-1} cancel, and the same is true for the b_j. Therefore the number of zeros equals the number of poles. (In particular, there is at least one zero.) Integrate ϕ itself over the boundary to see that the sum of the residues is zero. Therefore the number of poles, counting multiplicity, must be greater than one.
Replacing ϕ in the previous argument by ϕ − c, any c ∈ C, we obtain

Corollary 11.2.2. A non-constant meromorphic function takes each value (finite or infinite) the same number of times (counting multiplicity).

We are still some distance from showing that non-constant meromorphic functions
exist. For this, we need an excursion into differentials and theta functions.
In this context, the term used for a 1-form on Γg is a differential. In local coordinates, a differential ω has the form

\[ \omega = f(z)\,dz + g(z)\,d\bar z. \tag{11.2.2} \]

The differential ω is said to be holomorphic, or Abelian of the first kind, if, in each such local representation, f is holomorphic and g = 0. In particular, as we shall see, the differentials

\[ \eta_j = \frac{z^{j-1}}{\sqrt{P}}\,dz, \qquad j = 1, 2, \dots, g, \tag{11.2.3} \]

P = P_{2g+1} or P = P_{2g+2}, are holomorphic.
A holomorphic differential ω = f(z)dz is clearly closed: dω = 0; see Section 1.2. The same is true of its complex conjugate \(\overline{f(z)}\,d\bar z\).
Let us see that the differentials (11.2.3) are actually holomorphic, despite the apparent singularities at the zeros of w = √P(z). Recall that, at these zeros, w itself can be taken as the local coordinate. Now

\[ dz = \frac{dz}{dw}\,dw = \left(\frac{dw}{dz}\right)^{-1} dw = \frac{w}{\tfrac12 P'}\,dw. \]

Thus, near the zeros of P,

\[ \eta_j = \frac{2z^{j-1}}{P'(z)}\,dw. \]

The roots are assumed to be distinct, so P′(z) does not vanish near a root.
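The change of coordinate from z to w can be checked numerically. The sketch below is an illustration with a hypothetical choice of roots (the list `ROOTS`, a degree-five polynomial, so g = 2); it compares a difference quotient for dw/dz with P′(z)/(2w), and evaluates the coefficient 2z^{j−1}/P′(z) of dw at a root, where it stays finite.

```python
import cmath

# Hypothetical distinct real roots; degree 2g + 1 with g = 2.
ROOTS = [-2.0, -1.0, 0.0, 1.0, 2.0]


def P(z: complex) -> complex:
    """P(z) = product of (z - r_j)."""
    value = 1.0 + 0j
    for r in ROOTS:
        value *= (z - r)
    return value


def dP(z: complex) -> complex:
    """P'(z) by the product rule."""
    total = 0j
    for k in range(len(ROOTS)):
        term = 1.0 + 0j
        for j, r in enumerate(ROOTS):
            if j != k:
                term *= (z - r)
        total += term
    return total


def w_near(z: complex, w_ref: complex) -> complex:
    """Square root of P(z), on the branch closest to the reference value w_ref."""
    w = cmath.sqrt(P(z))
    return w if abs(w - w_ref) <= abs(w + w_ref) else -w


def eta_coefficient_in_w(z: complex, j: int) -> complex:
    """Coefficient of dw in eta_j = z^(j-1) dz / w, using dz = (2w / P'(z)) dw."""
    return 2 * z ** (j - 1) / dP(z)


z0, h = 0.5 + 0.3j, 1e-5
w0 = cmath.sqrt(P(z0))
dw_dz_numeric = (w_near(z0 + h, w0) - w_near(z0 - h, w0)) / (2 * h)
dw_dz_formula = dP(z0) / (2 * w0)

# At the root z = 1, P'(1) = 3 * 2 * 1 * (-1) = -6, so eta_1 = (-1/3) dw there.
coefficient_at_root = eta_coefficient_in_w(1.0, 1)
```

The finite value at the root reflects the fact that the zero of w in the denominator of (11.2.3) is cancelled by the zero of dz/dw.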

A natural question is: why stop at j = g? We need to examine what happens as z → ∞. If P = P_{2g+2}, then as z → ∞, w = ±z^{g+1} + . . . . Taking ζ = 1/z as an appropriate coordinate,

\[ \eta_j = \frac{z^{j-1}}{w}\,dz = \pm\frac{\zeta^{1-j}\,[\zeta^{g+1} + \dots]}{\zeta^2}\,d\zeta = \pm\zeta^{g-j}\,[1 + O(\zeta)]\,d\zeta, \]

so the condition j ≤ g is necessary for holomorphy at z = ∞. In the case P = P_{2g+1}, the point at ∞ is itself the endpoint of a slit; in this case, ζ = z^{−1/2} can be taken as a local coordinate, and a similar calculation shows that j ≤ g is precisely the necessary and sufficient condition for regularity at ∞; see Exercise 2.
Fix a point p0 that does not lie on any of the cycles a_j, b_j. If ω is a holomorphic differential, we may define

\[ f(p) = \int_{p_0}^{p}\omega. \tag{11.2.4} \]

Lemma 11.2.3. Suppose that ω and ω′ are closed differentials. Then

\[ \int_{\Gamma_g}\omega\wedge\omega' = \int_{\partial\Gamma_g} f\,\omega' = \sum_{j=1}^{g}\,[A_j B'_j - A'_j B_j], \tag{11.2.5} \]

where f is defined by (11.2.4) and

\[ A_j = \int_{a_j}\omega, \quad B_j = \int_{b_j}\omega, \quad A'_j = \int_{a_j}\omega', \quad B'_j = \int_{b_j}\omega'. \tag{11.2.6} \]

Proof: Since ω ∧ ω′ = d(fω′), the first equality in (11.2.5) follows from Stokes’s theorem. Next,

\[ \int_{\partial\Gamma_g} f\omega' = \sum_{j=1}^{g}\left[\int_{a_j} + \int_{a_j^{-1}} + \int_{b_j} + \int_{b_j^{-1}}\right] f\omega'. \tag{11.2.7} \]

Given corresponding points p_j on a_j and p′_j on a_j^{-1}, the segment from p_j to p′_j is a cycle in Γg that is homologous to b_j; see Figure 11.3, so

\[ f(p_j) - f(p'_j) = -\int_{b_j}\omega = -B_j. \tag{11.2.8} \]

Similarly, if q_j and q′_j are corresponding points on b_j and b_j^{-1},

\[ f(q_j) - f(q'_j) = \int_{a_j}\omega = A_j. \tag{11.2.9} \]

Fig. 11.3 Integrating ω ∧ ω′.

Therefore (11.2.7) is

\[ \int_{\partial\Gamma_g} f\omega' = \sum_{j=1}^{g}\left[-B_j\int_{a_j}\omega' + A_j\int_{b_j}\omega'\right] = \sum_{j=1}^{g}\,[A_j B'_j - A'_j B_j]. \]

The numbers A_j, B_j in (11.2.6) are called the a periods and b periods of ω, respectively.
Suppose that ω = f dz is holomorphic. Then ω ∧ ω̄ = | f |² dz ∧ dz̄. In local coordinates z = x + iy,

\[ dz\wedge d\bar z = (dx + i\,dy)\wedge(dx - i\,dy) = -2i\,dx\wedge dy. \]

Therefore (11.2.5) implies that if ω ≠ 0,

\[ 0 < \frac{i}{2}\int_{\Gamma_g}\omega\wedge\bar\omega = \frac{i}{2}\sum_{j=1}^{g}\,[A_j\bar B_j - \bar A_j B_j] = -\operatorname{Im}\left[\sum_{j=1}^{g} A_j\bar B_j\right]. \tag{11.2.10} \]

We have proved

Corollary 11.2.4. (a) If ω is a non-zero holomorphic differential on Γg, with a and b periods A_j, B_j, then

\[ \operatorname{Im}\left[\sum_{j=1}^{g} A_j\bar B_j\right] < 0. \tag{11.2.11} \]

(b) If the a periods of a holomorphic differential ω all vanish, then ω = 0.



Proposition 11.2.5. The space of holomorphic differentials on Γg has dimension g.

Proof: The g holomorphic differentials η_j of (11.2.3) are linearly independent, so the dimension is at least g. The matrix A with entries

\[ A_{jk} = \int_{a_j}\eta_k \tag{11.2.12} \]

is non-singular, since otherwise some non-trivial linear combination of the η_j would have a periods zero, a contradiction. But this implies that given any holomorphic differential ω, there is a linear combination ω′ of the η_j having the same a periods as ω, so ω = ω′. Thus, the η_j are a basis for the holomorphic differentials.

We define a new basis $\{\omega_j\}$ of holomorphic differentials by setting

\[ \omega_j = 2\pi i\sum_{k=1}^{g}(A^{-1})_{kj}\,\eta_k. \tag{11.2.13} \]

The $\{\omega_j\}$ are canonically dual to the basis of cycles $\{a_j\}$ in the sense that

\[ \oint_{a_j}\omega_j = 2\pi i;\qquad \oint_{a_j}\omega_k = 0 \ \text{ if } j \ne k. \tag{11.2.14} \]

Having chosen this canonically dual basis, we let

\[ B_{jk} = \oint_{b_k}\omega_j. \tag{11.2.15} \]

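Theorem 11.2.6. The matrix $B = (B_{jk})$ satisfies

(a) $B$ is symmetric: $B_{jk} = B_{kj}$;
(b) $\operatorname{Re} B < 0$, i.e. $\operatorname{Re} B$ is negative definite.

Proof: (a) Both $\omega_j$ and $\omega_k$ are holomorphic, so $\omega_j\wedge\omega_k = 0$. Therefore (11.2.5), applied to $\omega = \omega_j$ and $\omega' = \omega_k$, becomes

\[ 0 = 2\pi i\,B_{kj} - 2\pi i\,B_{jk}. \]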

(b) Suppose that $\eta = \sum_{j=1}^{g} c_j\omega_j \ne 0$, where the coefficients $c_j$ are real. The a and b periods of $\eta$ are

\[ A_j = 2\pi i\,c_j,\qquad B_j = \sum_{k=1}^{g} c_k B_{kj}. \]

Therefore (11.2.10) for $\eta$, $\bar\eta$ becomes

\[ 0 > \operatorname{Im}\left[\sum_{k=1}^{g} A_k\bar B_k\right] = \operatorname{Im}\left[\sum_{j,k=1}^{g} 2\pi i\,c_k c_j\,\bar B_{kj}\right] = 2\pi\operatorname{Re}\left[\sum_{j,k=1}^{g} B_{kj}\,c_j c_k\right]. \]

A symmetric matrix $B = (B_{jk})$ such that $\operatorname{Re} B < 0$ is called a Riemann matrix.

As noted above, holomorphic differentials are called Abelian differentials of the first kind. In addition, there are Abelian differentials of the second kind $\omega_p^{(n)}$, $n \ge 1$. These are meromorphic with a pole of order $n + 1$ at $p$ and an expansion of the form

\[ \left(\frac{1}{(z-p)^{n+1}} + \frac{a_n}{(z-p)^{n}} + \dots\right) dz \]

if $p$ is not a zero of $P$, and is not $\infty$ if $P$ has odd degree. In the excluded cases, the expansion near $p$ can be written in terms of $w$. Finally, there are Abelian differentials of the third kind $\omega_{pq}$, where $p$ and $q$ are distinct points of $\Gamma_g$ near which the coefficient has a simple pole with residues 1 and −1, respectively. For the existence of the differentials $\omega_p^{(n)}$ and $\omega_{pq}$, see Exercises 3 and 4.
Each such differential is unique up to the addition of a holomorphic differential.
We have already chosen a basis of differentials of the first kind, normalized as in (11.2.14) by

\[ \oint_{a_j}\omega_k = 2\pi i\,\delta_{jk}. \tag{11.2.16} \]

Recall (11.2.15): the b-periods are

\[ \oint_{b_k}\omega_j = B_{jk}. \tag{11.2.17} \]

We normalize the differentials of second and third kinds by

\[ \oint_{a_j}\omega_p^{(n)} = 0;\qquad \oint_{a_j}\omega_{pq} = 0,\qquad j = 1, 2, \dots, g. \tag{11.2.18} \]

Let $\omega_j = f_j(z)\,dz$ in a neighborhood of the point $p$. Then the $b_j$ periods are

\[ \oint_{b_j}\omega_p^{(n)} = \frac{1}{n!}\,\frac{d^{\,n-1}f_j}{dz^{\,n-1}}(p);\qquad \oint_{b_j}\omega_{pq} = \int_q^p\omega_j,\qquad j = 1, 2, \dots, g. \tag{11.2.19} \]

The proof is similar to the proof of Lemma 11.2.3; see Exercise 5.

11.3 Theta functions and Abel’s theorem

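Let $B$ be the Riemann matrix associated to the cycles $\{a_j\}$, $\{b_j\}$ on the hyperelliptic curve $\Gamma_g$. The associated theta function $\theta(z)$, $z \in \mathbb{C}^g$, is

\[ \theta(z) = \theta(z|B) = \sum_{N\in\mathbb{Z}^g}\exp\left(\tfrac12\langle BN, N\rangle + \langle N, z\rangle\right), \tag{11.3.1} \]

where

\[ \langle BN, N\rangle = \sum_{j,k=1}^{g} B_{jk}N_kN_j,\qquad \langle N, z\rangle = \sum_{j=1}^{g} N_j z_j. \]

Let $-b < 0$ be the largest (i.e. least negative) eigenvalue of the real symmetric matrix $\operatorname{Re} B$. Then

\[ \operatorname{Re}\langle BN, N\rangle \le -b\,|N|^2 = -b\sum_{j=1}^{g} N_j^2;\qquad |\langle N, z\rangle| \le |N|\,|z|. \]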

The sum in the definition (11.3.1) converges uniformly on compact sets in Cg ; see
Exercise 6. Therefore θ is an entire function of z. Note that θ is an even function of
z. Change N to −N in (11.3.1):

θ (−z) = θ (z). (11.3.2)

Let $\{e_j\}$ be the standard basis vectors in $\mathbb{C}^g$, and let $f_j = Be_j$.

Proposition 11.3.1. The function $\theta$ satisfies

\[ \theta(z + 2\pi i\,e_k) = \theta(z); \tag{11.3.3} \]

\[ \theta(z + f_k) = \exp\left(-\tfrac12 B_{kk} - z_k\right)\theta(z). \tag{11.3.4} \]

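Proof: The identity

\[ \langle N, z + 2\pi i\,e_j\rangle = \langle N, z\rangle + 2N_j\pi i \]

implies (11.3.3). Next,

\[
\begin{aligned}
\tfrac12\langle BN, N\rangle + \langle N, z + f_k\rangle
&= \tfrac12\langle BN, N\rangle + \langle N, z\rangle + \langle N, Be_k\rangle \\
&= \tfrac12\langle B(N + 2e_k), N\rangle + \langle N, z\rangle \\
&= \tfrac12\langle B(N + e_k), N + e_k\rangle - \tfrac12\langle Be_k, e_k\rangle + \langle N + e_k, z\rangle - \langle e_k, z\rangle \\
&= \left[\tfrac12\langle B(N + e_k), N + e_k\rangle + \langle N + e_k, z\rangle\right] - \tfrac12 B_{kk} - z_k.
\end{aligned}
\]

Since summing a function of $N$ over $N$ and summing over $N + e_k$ give the same result, this proves (11.3.4).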
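The two transformation laws of Proposition 11.3.1 can be checked numerically on a truncated version of the sum (11.3.1). The following is only a sketch: the matrix `B`, the point `z` and the truncation parameter `cutoff` are illustrative choices that do not come from the text, and `B` is simply a symmetric matrix with negative definite real part rather than the period matrix of a specific curve.

```python
import cmath
from itertools import product

def theta(z, B, cutoff=8):
    """Truncated version of the sum (11.3.1): N ranges over [-cutoff, cutoff]^g."""
    g = len(z)
    total = 0j
    for N in product(range(-cutoff, cutoff + 1), repeat=g):
        quad = sum(B[j][k] * N[j] * N[k] for j in range(g) for k in range(g))
        lin = sum(N[j] * z[j] for j in range(g))
        total += cmath.exp(0.5 * quad + lin)
    return total

# Hypothetical Riemann matrix for g = 2: symmetric, Re B negative definite.
B = [[-2.0 + 0.3j, 0.5 - 0.2j],
     [0.5 - 0.2j, -3.0 + 0.1j]]
z = [0.3 + 0.1j, -0.2 + 0.4j]

for k in range(2):
    # z + 2*pi*i*e_k and z + f_k, where f_k = B e_k is the k-th column of B.
    z_period = [z[j] + (2j * cmath.pi if j == k else 0) for j in range(2)]
    z_quasi = [z[j] + B[j][k] for j in range(2)]
    factor = cmath.exp(-0.5 * B[k][k] - z[k])
    print(abs(theta(z_period, B) - theta(z, B)))          # defect in (11.3.3)
    print(abs(theta(z_quasi, B) - factor * theta(z, B)))  # defect in (11.3.4)
```

Both printed defects should be at the level of floating-point rounding, because the omitted terms of the series are far smaller than machine precision for this choice of `B`.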
These equations show that the theta function has periods $\{2\pi i\,e_k\}$ and quasiperiods $\{f_k\}$. More generally, any element of the period lattice

\[ \Lambda = \Lambda(B) = \{2\pi i\,N + BM : M, N \in \mathbb{Z}^g\} \tag{11.3.5} \]

is a period or quasiperiod of $\theta$. The transformation laws (11.3.3), (11.3.4) generalize to

\[ \theta(z + 2\pi i\,N + BM) = \exp\left(-\tfrac12\langle BM, M\rangle - \langle M, z\rangle\right)\theta(z),\qquad M, N \in \mathbb{Z}^g. \tag{11.3.6} \]

The Jacobi variety, or Jacobian, $J(\Gamma_g)$ of the curve $\Gamma_g$ is the torus of real dimension $2g$ that is the quotient of $\mathbb{C}^g$ by the lattice (11.3.5):

\[ J(\Gamma_g) = \mathbb{C}^g/\Lambda. \]

Put differently, $J(\Gamma_g)$ is the set of equivalence classes of elements of $\mathbb{C}^g$ under the equivalence relation

\[ w \equiv w' \ \text{ if and only if } \ w - w' \text{ belongs to } \Lambda. \tag{11.3.7} \]

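More generally, we may consider a theta function with characteristics $\alpha, \beta \in \mathbb{R}^g$:

\[ \theta[\alpha, \beta](z) = \exp\left(\tfrac12\langle B\alpha, \alpha\rangle + \langle\alpha, z + 2\pi i\beta\rangle\right)\theta(z + 2\pi i\beta + B\alpha). \tag{11.3.8} \]

This has a representation similar to (11.3.1):

\[ \theta[\alpha, \beta](z) = \sum_{N\in\mathbb{Z}^g}\exp\left(\tfrac12\langle B(N + \alpha), N + \alpha\rangle + \langle N + \alpha, z + 2\pi i\beta\rangle\right). \tag{11.3.9} \]

The transformation law generalizes (11.3.6):

\[ \theta[\alpha, \beta](z + 2\pi i\,N + BM) = \exp\left(-\tfrac12\langle BM, M\rangle - \langle M, z\rangle + 2\pi i\,[\langle N, \alpha\rangle - \langle M, \beta\rangle]\right)\theta[\alpha, \beta](z). \tag{11.3.10} \]

The proof is left as Exercise 7.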


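A case of particular interest is that of a half-period: when each $\alpha_j$ and $\beta_j$ is either 0 or 1/2 but not all are 0. A half-period is said to be even or odd if $4\langle\alpha, \beta\rangle$ is even or odd.

Proposition 11.3.2. If $(\alpha, \beta)$ is a half-period, then $\theta[\alpha, \beta]$ is even (resp. odd) if $4\langle\alpha, \beta\rangle$ is even (resp. odd).

Proof: Replacing $N$ by $-N - 2\alpha$ in (11.3.9), which permutes $\mathbb{Z}^g$ because $2\alpha \in \mathbb{Z}^g$, shows that $\theta[\alpha, \beta](-z) = \exp(4\pi i\langle\alpha, \beta\rangle)\,\theta[\alpha, \beta](z)$, and $\exp(4\pi i\langle\alpha, \beta\rangle) = (-1)^{4\langle\alpha, \beta\rangle}$.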
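As an illustration of Proposition 11.3.2, the parity can be tested numerically in the simplest case $g = 1$, using a truncation of the series (11.3.9). This is a sketch under stated assumptions: the value of `B` (any complex number with negative real part), the test point `z` and the truncation `cutoff` are arbitrary choices, and the characteristic $(0, 0)$, which gives $\theta$ itself, is included only for comparison.

```python
import cmath

def theta_char(alpha, beta, z, B, cutoff=12):
    """Truncated series (11.3.9) for g = 1, summing over N in [-cutoff, cutoff]."""
    total = 0j
    for N in range(-cutoff, cutoff + 1):
        m = N + alpha
        total += cmath.exp(0.5 * B * m * m + m * (z + 2j * cmath.pi * beta))
    return total

B = -2.0 + 0.5j      # hypothetical 1x1 Riemann matrix: Re B < 0
z = 0.3 + 0.2j       # arbitrary test point

for alpha in (0.0, 0.5):
    for beta in (0.0, 0.5):
        sign = (-1) ** int(round(4 * alpha * beta))   # +1 if even, -1 if odd
        defect = abs(theta_char(alpha, beta, -z, B) - sign * theta_char(alpha, beta, z, B))
        print(alpha, beta, sign, defect)
```

Only the characteristic $(1/2, 1/2)$ has $4\alpha\beta$ odd, so it is the only one of the four for which the sign is $-1$.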

Fix a point $p_0 \in \Gamma_g$ and define $A(p)$ for $p \in \Gamma_g$ by

\[ A_j(p) = \int_{p_0}^{p}\omega_j,\qquad j = 1, \dots, g. \tag{11.3.11} \]

We take the path of integration to be the same for each index $j = 1, 2, \dots, g$. The Abel map $A : \Gamma_g \to J(\Gamma_g)$ is the map

\[ A(p) = \left(A_1(p), A_2(p), \dots, A_g(p)\right). \tag{11.3.12} \]

This is independent of the chosen path (so long as the path is independent of the index $j$). In fact, any two paths differ by a path that is homologous to some cycle

\[ c = \sum_{j=1}^{g}\,[n_j a_j + m_j b_j], \]

and

\[ \oint_c\omega_k = 2n_k\pi i + \sum_{j=1}^{g} B_{jk}m_j, \]

which is the $k$-th component of a point of the lattice $\Lambda$.


We are now in a position to determine the meromorphic functions on $\Gamma_g$.

Theorem 11.3.3. (Abel) The distinct points $p_1, p_2, \dots, p_n$ and $q_1, q_2, \dots, q_n$ in $\Gamma_g$ are, respectively, the (simple) zeros and poles of a meromorphic function on $\Gamma_g$ if and only if

\[ \sum_{j=1}^{n} A(p_j) - \sum_{j=1}^{n} A(q_j) \ \text{ belongs to } \Lambda. \tag{11.3.13} \]

Proof: Suppose that $f$ is meromorphic on $\Gamma_g$ with the prescribed zeros and poles. Then $\Omega = d\log f = df/f$ is a meromorphic differential with simple poles at the zeros and poles of $f$. It has residue 1 at each zero and residue −1 at each pole. Therefore it has an expansion

\[ \Omega = \sum_{j=1}^{n}\omega_{p_jq_j} + \sum_{j=1}^{g} c_j\omega_j. \tag{11.3.14} \]

We are assuming that $f$ is single-valued on $\Gamma_g$, so the integral over any closed cycle is an integer multiple of $2\pi i$:

\[ \oint_{a_j}\Omega = 2n_j\pi i,\qquad \oint_{b_j}\Omega = 2m_j\pi i. \tag{11.3.15} \]
Taking into account (11.3.14), (11.3.15), and the normalizations (11.2.17), (11.2.18), it follows that

\[ 2\pi i\,n_k = \oint_{a_k}\Omega = 2\pi i\,c_k; \]

\[ 2\pi i\,m_k = \oint_{b_k}\Omega = \sum_{j=1}^{n}\int_{q_j}^{p_j}\omega_k + \sum_{j=1}^{g} n_jB_{jk}. \]

Therefore the $k$-th component of the difference in (11.3.13) is

\[ \sum_{j=1}^{n}\,[A_k(p_j) - A_k(q_j)] = \sum_{j=1}^{n}\int_{q_j}^{p_j}\omega_k = 2\pi i\,m_k - \sum_{j=1}^{g} n_jB_{jk}. \tag{11.3.16} \]

The right-hand side is the $k$-th component of a point of the lattice $\Lambda$, so (11.3.13) is true.

Conversely, suppose that (11.3.13) is true. Then there are integers $\{n_k\}$, $\{m_k\}$ such that (11.3.16) is true. Let $c_k = n_k$ and use these coefficients to define $\Omega$ in (11.3.14). Then

\[ F(p) = \exp\left(\int_{p_0}^{p}\Omega\right) \]

is single-valued on $\Gamma_g$ and has the $\{p_j\}$ and $\{q_j\}$ as its zeros and poles, respectively.

11.4 Jacobi inversion

Let $S^g(\Gamma_g)$ be the $g$-th symmetric product of the curve $\Gamma_g$. This means that its elements are the unordered $g$-tuples $(p_1, \dots, p_g)$ of points of $\Gamma_g$. The Abel map extends to a map

\[ A : S^g(\Gamma_g) \to J(\Gamma_g) = \mathbb{C}^g/\Lambda \]

defined by

\[ A(p_1, \dots, p_g) = A(p_1) + \dots + A(p_g). \tag{11.4.1} \]

The problem of inverting the Abel map is known as the Jacobi inversion problem. Thus, given

\[ z = (z_1, z_2, \dots, z_g) \in J(\Gamma_g) = \mathbb{C}^g/\Lambda \]

we want to find points $p_1, p_2, \dots, p_g$ in $\Gamma_g$ such that

\[ \sum_{j=1}^{g}\int_{p_0}^{p_j}\omega_k \equiv z_k,\qquad k = 1, \dots, g, \]

where the $\omega_k$ are the holomorphic differentials dual to the standard a-cycles of $\Gamma_g$, and the equivalence relation is (11.3.7): $b \equiv c$ means that $b - c$ belongs to $\Lambda$.
Let $\vec e = (e_1, \dots, e_g)$ be a fixed element of $\mathbb{C}^g$ and set

\[ F(p) = \theta(A(p) - \vec e\,). \tag{11.4.2} \]

This function is holomorphic on the cut surface $\Gamma_g$. Changing $\vec e$ slightly, if necessary, we may assume that $F$ is not identically zero. Recall that $\partial\Gamma_g$ is the union of the cycles $a_k$, $b_k$, $a_k^{-1}$, $b_k^{-1}$. Changing the cycles slightly, if necessary, we may assume that $F$ has no zeros on the boundary $\partial\Gamma_g$.

Lemma 11.4.1. $F$ defined by (11.4.2) has $g$ zeros, counting multiplicity, on $\Gamma_g$.

Proof: The number of zeros is

\[ \frac{1}{2\pi i}\oint_{\partial\Gamma_g}\frac{dF}{F} = \frac{1}{2\pi i}\oint_{\partial\Gamma_g} d\log F. \tag{11.4.3} \]

Let $F^+$ denote $F$ on the union of the $a_k$ and $b_k$, and let $F^-$ denote $F$ on the union of the inverses $a_k^{-1}$, $b_k^{-1}$, and similarly for $A^\pm$. Then (11.4.3) is

\[ \frac{1}{2\pi i}\sum_{k=1}^{g}\left(\int_{a_k} + \int_{b_k}\right)\left[d\log F^+ - d\log F^-\right]. \tag{11.4.4} \]

It follows from (11.2.8) and (11.2.9) that if $p$ is a point of $a_k$, then

\[ A_j^-(p) = A_j^+(p) + B_{jk} \tag{11.4.5} \]

and if $p$ is a point of $b_k$ then

\[ A_j^+(p) = A_j^-(p) + 2\pi i\,\delta_{jk}. \tag{11.4.6} \]

From these equations and (11.3.3), (11.3.4), it follows that

\[ \log F^-(p) = -\tfrac12 B_{kk} - A_k(p) + e_k + \log F^+(p),\qquad p \in a_k; \tag{11.4.7} \]

\[ \log F^-(p) = \log F^+(p),\qquad p \in b_k. \tag{11.4.8} \]

But $dA_k(p) = \omega_k$, so

\[ d\log F^-(p) = d\log F^+(p) - \omega_k,\qquad p \in a_k; \tag{11.4.9} \]

\[ d\log F^-(p) = d\log F^+(p),\qquad p \in b_k. \tag{11.4.10} \]

Therefore (11.4.4) is

\[ \frac{1}{2\pi i}\oint_{\partial\Gamma_g} d\log F = \frac{1}{2\pi i}\sum_{k=1}^{g}\oint_{a_k}\omega_k = g. \]

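Theorem 11.4.2. Suppose that the zeros of $F$ on $\Gamma_g$ are $p_1, \dots, p_g$. Then

\[ A(p_1, \dots, p_g) \equiv \vec e - K, \tag{11.4.11} \]

where

\[ K_j = \frac{2\pi i + B_{jj}}{2} - \frac{1}{2\pi i}\sum_{k\ne j}\oint_{a_k}\left(\omega_k(p)\int_{p_0}^{p}\omega_j\right). \tag{11.4.12} \]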



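Proof: Let $\zeta$ be defined by

\[ \zeta_j = A_j(p_1) + \dots + A_j(p_g). \tag{11.4.13} \]

The sum on the right can be viewed as the sum of residues

\[ \zeta_j = \frac{1}{2\pi i}\oint_{\partial\Gamma_g} A_j(p)\,\frac{dF(p)}{F(p)} = \frac{1}{2\pi i}\oint_{\partial\Gamma_g} A_j(p)\,d\log F(p). \tag{11.4.14} \]

In view of the calculations in the proof of Lemma 11.4.1, this integral is

\[
\begin{aligned}
\zeta_j &= \frac{1}{2\pi i}\sum_{k=1}^{g}\left(\int_{a_k} + \int_{b_k}\right)\left[A_j^+\,d\log F^+ - A_j^-\,d\log F^-\right] \\
&= \frac{1}{2\pi i}\sum_{k=1}^{g}\int_{a_k}\left[A_j^+\,d\log F^+ - (A_j^+ + B_{jk})(d\log F^+ - \omega_k)\right] \\
&\quad + \frac{1}{2\pi i}\sum_{k=1}^{g}\int_{b_k}\left[A_j^+\,d\log F^+ - (A_j^+ - 2\pi i\,\delta_{jk})\,d\log F^+\right] \\
&= \frac{1}{2\pi i}\sum_{k=1}^{g}\left[\int_{a_k} A_j^+\omega_k - B_{jk}\int_{a_k} d\log F^+ + 2\pi i\,B_{jk}\right] + \int_{b_j} d\log F^+.
\end{aligned}
\]

Because of the way that the $a_k$ are chosen, $F$ takes the same value at the ends, so

\[ \int_{a_k} d\log F^+ = 2\pi i\,n_k,\qquad n_k \in \mathbb{Z}. \tag{11.4.15} \]

Let $q_j$ and $\tilde q_j$ be the initial and final points of $b_j$. Then

\[
\begin{aligned}
\int_{b_j} d\log F^+ &= \log F^+(\tilde q_j) - \log F^+(q_j) + 2\pi i\,m_j,\qquad m_j \in \mathbb{Z} \\
&= \log\theta(A(q_j) + f_j - \vec e\,) - \log\theta(A(q_j) - \vec e\,) + 2\pi i\,m_j \\
&= -\tfrac12 B_{jj} + e_j - A_j(q_j) + 2\pi i\,m_j.
\end{aligned}
\tag{11.4.16}
\]

As before, $f_j = (B_{j1}, \dots, B_{jg})$ belongs to the lattice $\Lambda$. Therefore

\[
\begin{aligned}
\zeta_j &\equiv e_j - \tfrac12 B_{jj} - A_j(q_j) + \frac{1}{2\pi i}\sum_{k=1}^{g}\int_{a_k} A_j\,\omega_k + 2\pi i\,m_j + \sum_{k} B_{jk}(-n_k + 1) \\
&\equiv e_j - \tfrac12 B_{jj} - A_j(q_j) + \frac{1}{2\pi i}\sum_{k=1}^{g}\int_{a_k} A_j\,\omega_k.
\end{aligned}
\tag{11.4.17}
\]

Now $q_j$, the beginning of $b_j$, is the end of $a_j$, so

\[ \int_{a_j} A_j\,\omega_j = \int_{a_j} d\left[\tfrac12 A_j^2\right] = \tfrac12\left[A_j^2(q_j) - A_j^2(r_j)\right], \]

where $r_j$ is the beginning of $a_j$, and $A_j(q_j) - A_j(r_j) = 2\pi i$. Therefore

\[ \int_{a_j} A_j\,\omega_j = \tfrac12\cdot 2\pi i\,[2A_j(q_j) - 2\pi i], \]

and

\[ -A_j(q_j) + \frac{1}{2\pi i}\sum_{k=1}^{g}\int_{a_k} A_j\,\omega_k = -\pi i + \frac{1}{2\pi i}\sum_{k\ne j}\int_{a_k} A_j\,\omega_k. \tag{11.4.18} \]

Combining (11.4.15) – (11.4.17), we obtain (11.4.11).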

The constants K j of (11.4.12) are known as the Riemann constants associated to


the given cycles and the choice of p0 .
For the following result, we refer to Farkas and Kra [67].

Theorem 11.4.3. The function $\theta(A(p) - \vec e\,)$ is identically zero on $\Gamma_g$ if and only if $\vec e$ can be written as

\[ \vec e = A(q_1) + \dots + A(q_g) + K, \tag{11.4.19} \]

where the points $q_j$ are the unique poles (counting multiplicity) of a meromorphic function on $\Gamma_g$.

Corollary 11.4.4. If $\zeta \in \mathbb{C}^g$ has the property that $F(p) = \theta(A(p) - \zeta - K)$ does not vanish identically on $\Gamma_g$, then $F$ has $g$ zeros $p_j$ on $\Gamma_g$ that are the solution of the Jacobi inversion problem

\[ A(p_1) + \dots + A(p_g) \equiv \zeta. \]

Moreover, the $p_j$ are uniquely determined by these equations.

Proof: The first statement is just Theorem 11.4.2 with $\vec e = \zeta + K$. Suppose that $\{q_1, \dots, q_g\}$ is disjoint from $\{p_1, \dots, p_g\}$ and

\[ A(q_1) + \dots + A(q_g) \equiv \zeta. \]

Then by Theorem 11.3.3, there is a meromorphic function with poles precisely at the $p_j$ and zeros at the $q_j$. This contradicts Theorem 11.4.3.

Corollary 11.4.5. The zeros $\vec e$ of $\theta$ can be parametrized by $S^{g-1}$:

\[ \vec e = A(p_1) + \dots + A(p_{g-1}) + K, \tag{11.4.20} \]

where $p_1, \dots, p_{g-1}$ (counting multiplicity) are any points of $\Gamma_g$.

Proof: If $\theta(\vec e\,) = 0$, let $F(p) = \theta(A(p) - \vec e\,)$, and suppose first that $F(p)$ is not identically 0. Let $p_0$ be the lower limit of the integration that defines $A$, so $A(p_0) = 0$. It follows from the definition (11.3.1) that $\theta$ is an even function, so $F(p_0) = \theta(-\vec e\,) = 0$. Lemma 11.4.1 and Theorem 11.4.2 imply that there are unique points $p_1, \dots, p_g$ such that

\[ \vec e = A(p_1) + \dots + A(p_g) + K. \tag{11.4.21} \]

We know that one of these points, say $p_g$, is $p_0$, so that (11.4.21) reduces to (11.4.20).
If $F(p)$ is identically zero, then by Theorem 11.4.3,

\[ \vec e = A(q_1) + \dots + A(q_g) + K, \tag{11.4.22} \]

where the $q_j$ are the unique poles of a function $f$, meromorphic on $\Gamma_g$. We may choose the integration limit $p_0$ to be one of the zeros of $f$. Let $p_1, \dots, p_{g-1}$ be the remaining zeros. Again $A(p_0) = 0$. By Theorem 11.3.3,

\[ \vec e = A(q_1) + \dots + A(q_g) + K = A(p_1) + \dots + A(p_{g-1}) + K, \]

so again we obtain (11.4.20).

The approach to the Jacobi inversion problem that we have just described is due to Riemann. A second approach is due to Weierstrass. We illustrate the Weierstrass approach for genus $g = 2$, with defining equation

\[ w^2 = P_5(z). \]

We work with the original differentials

\[ \eta_1 = \frac{dz}{w},\qquad \eta_2 = \frac{z\,dz}{w}. \]

The corresponding modified Abel map is

\[ A(z) = \left(\int_{z_0}^{z}\eta_1,\ \int_{z_0}^{z}\eta_2\right) \]

so

\[ A(z_1, z_2) = A(z_1) + A(z_2) = (\zeta_1, \zeta_2) \]

with

\[ \zeta_1 = \int_{z_0}^{z_1}\eta_1 + \int_{z_0}^{z_2}\eta_1; \tag{11.4.23} \]

\[ \zeta_2 = \int_{z_0}^{z_1}\eta_2 + \int_{z_0}^{z_2}\eta_2. \tag{11.4.24} \]

Thus,

\[ \frac{\partial\zeta_1}{\partial z_j} = \frac{1}{w(z_j)},\qquad \frac{\partial\zeta_2}{\partial z_j} = \frac{z_j}{w(z_j)}. \tag{11.4.25} \]

The idea is to relate two systems of differential equations for $(z_1, z_2) \in \Gamma_2\times\Gamma_2$ to the corresponding system of equations for the image $(\zeta_1, \zeta_2)$ under the Abel map (11.4.23), (11.4.24). The systems for $(z_1, z_2)$ are

\[ \frac{dz_1}{ds} = \frac{w(z_1)}{z_1 - z_2},\qquad \frac{dz_2}{ds} = \frac{w(z_2)}{z_2 - z_1}, \tag{11.4.26} \]

\[ \frac{dz_1}{dt} = \frac{z_2\,w(z_1)}{z_1 - z_2},\qquad \frac{dz_2}{dt} = \frac{z_1\,w(z_2)}{z_2 - z_1}. \tag{11.4.27} \]

Proposition 11.4.6. Under the Abel map $A : S^2\Gamma_2 \to J(\Gamma_2)$, the systems (11.4.26), (11.4.27) become

\[ \frac{d\zeta_1}{ds} = 0,\qquad \frac{d\zeta_2}{ds} = 1; \tag{11.4.28} \]

\[ \frac{d\zeta_1}{dt} = -1,\qquad \frac{d\zeta_2}{dt} = 0. \tag{11.4.29} \]

Proof: The equations (11.4.28) and (11.4.29) follow readily from the equations (11.4.23)–(11.4.27); see Exercise 9.
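The chain-rule computation behind Proposition 11.4.6 can be sanity-checked numerically at a single point. The sketch below is not a proof and makes several assumptions that are not in the text: the quintic `P5` is a hypothetical example, `w` takes the principal branch of the square root (the branch cancels), and the sample points are arbitrary.

```python
import cmath

def P5(z):
    # Hypothetical quintic with distinct roots; any such P5 would do.
    return z * (z - 1) * (z - 2) * (z - 3) * (z - 4)

def w(z):
    # One branch of the square root; the choice of branch cancels below.
    return cmath.sqrt(P5(z))

def zeta_rates(z1, z2):
    """Return ((dzeta1/ds, dzeta2/ds), (dzeta1/dt, dzeta2/dt)) via the chain rule."""
    w1, w2 = w(z1), w(z2)
    dz_ds = (w1 / (z1 - z2), w2 / (z2 - z1))               # system (11.4.26)
    dz_dt = (z2 * w1 / (z1 - z2), z1 * w2 / (z2 - z1))     # system (11.4.27)
    grad_zeta1 = (1 / w1, 1 / w2)                          # from (11.4.25)
    grad_zeta2 = (z1 / w1, z2 / w2)
    dot = lambda u, v: u[0] * v[0] + u[1] * v[1]
    return ((dot(grad_zeta1, dz_ds), dot(grad_zeta2, dz_ds)),
            (dot(grad_zeta1, dz_dt), dot(grad_zeta2, dz_dt)))

d_ds, d_dt = zeta_rates(0.5 + 0.3j, 2.5 - 0.7j)
print(d_ds)   # approximately (0, 1), as in (11.4.28)
print(d_dt)   # approximately (-1, 0), as in (11.4.29)
```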

We cannot resist closing this section with a remark of Weyl, [214], footnote, p.
144:
The principal significance of the inversion problem to us today lies primarily, not in its
intrinsic value, but in the splendid development created by Riemann and Weierstrass in their
efforts to solve the problem.

Exercises

1. (a), (b), (c), (d): section by section, work out the results in the case of a torus: $g = 1$.
2. Show that $j \le g$ is also the necessary and sufficient condition that $\eta_j$ be holomorphic at $\infty$ in the case when $P$ has degree $2g + 1$.
3. Prove the existence of the differentials of the second kind. Hint: look for

\[ \omega_p^{(n)} = \frac{g(w)\,dz}{(z - p)^{n+1}}. \]

What conditions are needed on $g(w)$ to guarantee that $\omega_p^{(n)}$ has the correct behavior at the point $p$ on each sheet $\mathbb{C}_\pm$?
4. Prove the existence of the differentials of the third kind $\omega_{pq}$. (Hint: look at Exercise 3.)
5. Prove (11.2.19).
6. Prove that (11.3.1) converges uniformly on compact sets in $\mathbb{C}^g$.
7. Prove the transformation law (11.3.10).
8. Show that $\theta$ has $2^{g-1}(2^g + 1)$ even periods and $2^{g-1}(2^g - 1)$ odd periods.
9. Prove (11.4.28) and (11.4.29).
10. Use the Weierstrass system of differential equations to show that the map to $J(\Gamma_2)$ is surjective and can be inverted by following trajectories of the two systems. (Start from $p_0$.)

Remarks and further reading

The theory of theta functions was initiated by Jacobi and advanced by Riemann and
Weierstrass. It is still a large and active area of research, with connections to alge-
braic geometry, analytic number theory, representation theory, algebraic topology,
nonlinear partial differential equations, and quantum physics.
The classical theory of theta functions is treated exhaustively by Baker in [17].
Baker’s monograph has been reissued, with a foreword by Krichever that outlines the
theory and delves into its application to the study of “completely integrable” nonlinear
partial differential equations, such as the Korteweg–de Vries (KdV) equation for
waves in a channel. Our presentation is based on Dubrovin’s exposition [60], which
goes on to treat these applications in detail. These applications are also among the
(very many) topics treated in Mumford’s lectures [145], [146], [147].
Various versions of theta functions are associated with the names of Ramanujan
[48] and Siegel and Ruelle [35]. There are connections with modular forms [48],
[68], quantum field theory [208], moduli spaces [113], eta functions [124], all of the
above [148], and knot theory [89].
Chapter 12
Padé approximants and continued
fractions

The Taylor series of a function that is holomorphic in a neighborhood of a given point


provides an approximation of the function by polynomials: the successive partial
sums of the Taylor series. For both theoretical and practical reasons, it can be useful
to approximate instead by general rational functions, i.e. quotients of polynomials.
Systematic use of this idea goes back at least to Frobenius [80] in 1881. Padé [162]
treated some exceptional cases in his thesis a decade later. Both theory and practice –
and the theory behind the practice – have developed greatly since then. In this chapter
we touch on the main theoretical questions and practical methods, and exhibit some
interesting examples.
In Section 12.1 we introduce the general terminology, notation, and concepts, as
well as the basic existence theorems. Sections 12.2 and 12.3 give three connections
of Padé theory to continued fractions.
The connection of Padé approximants to the Stieltjes transform and to orthogonal
polynomials is introduced in Section 12.4. Stieltjes transforms and related functions
are characterized in Section 12.5. The Padé approximants of these transforms are
examined in Section 12.6.
Two continued fraction expansions of the exponential function are examined in
Section 12.9. Section 12.8 contains several examples illustrating the theory, with
some specific numerical results.
Section 12.7 describes the basic theory behind practical methods of computation:
Shanks’ method and generalizations.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 283
R. Beals and R. S. C. Wong, More Explorations in Complex Functions, Graduate Texts
in Mathematics 298, https://doi.org/10.1007/978-3-031-28288-1_12

12.1 Padé approximants and Taylor series

Suppose that $f$ is holomorphic in a neighborhood of 0:

\[ f(z) = \sum_{k=0}^{\infty} a_k z^k. \tag{12.1.1} \]

Summing the series exactly is often not practical, but one can resort to the partial sums as approximations to the value $f(z)$. The partial sums are the Taylor polynomials

\[ T_m(z) = \sum_{k=0}^{m} a_k z^k, \]

which are uniquely determined, among polynomials of degree at most $m$, by the property that $f(z) = T_m(z) + O(|z|^{m+1})$ as $z \to 0$. One can obtain this same approximation property with rational functions. For a pair of integers $m \ge 0$, $n \ge 0$, the $[m, n]$ Padé approximant to $f$ at 0 is the quotient

\[ \frac{P_m(z)}{Q_n(z)} = \frac{\sum_{k=0}^{m} p_k z^k}{\sum_{k=0}^{n} q_k z^k} \]

with the property that

\[ f(z) - \frac{P_m(z)}{Q_n(z)} = O(|z|^{m+n+1}) \quad\text{as } z \to 0. \tag{12.1.2} \]

For a given $m$ and $n$, (12.1.2) is equivalent to

\[ T_{m+n}(z) - \frac{P_m(z)}{Q_n(z)} = O(|z|^{m+n+1}) \quad\text{as } z \to 0. \tag{12.1.3} \]

This is unique if we normalize by taking $Q_n(0) = q_0 = 1$.


The quotient $P_m/Q_n$ is sometimes denoted $[m, n]_f$. The Padé table of $f$ is the infinite matrix $([m, n]_f)_{m,n=0}^{\infty}$. The leftmost column $\{[m, 0]_f\}$ consists of the Taylor polynomials $T_m$. If $a_0 \ne 0$, then the first row $\{[0, n]_f\}$ consists of the reciprocals of the Taylor polynomials for $1/f$. In applications one often uses the main diagonal, i.e. the approximants $[n, n]_f$, or some nearby diagonal, such as $[n, n \pm 1]_f$.
As we shall see, there are a number of reasons for going beyond the Taylor
polynomials for the purpose of approximation. The Taylor polynomials can only
converge uniformly on disks Dr (0) with r less than the radius of convergence of the
series (12.1.1). If f is holomorphic in a larger region, rational approximations may
converge in larger subsets of that region; see Exercises 1 and 2. Even in disks where
the Taylor series converges, some Padé approximants may converge more rapidly
than the Taylor polynomials. Furthermore, Padé approximation is, in a number of
ways, more flexible and adaptable to special circumstances, such as dealing with
asymptotic series or simultaneous convergence near two or more points.
The approximation property (12.1.2) or (12.1.3) can be put in a linear form:

\[ T_{n+m}(z)\,Q_n(z) - P_m(z) = O(|z|^{n+m+1}) \quad\text{as } z \to 0. \tag{12.1.4} \]

This is a system of $m + n + 1$ linear equations for the $m + 1$ coefficients of $P_m$ and the $n$ coefficients $q_k$, $k > 0$, of $Q_n$. In fact

\[ T_{n+m}(z)\,Q_n(z) - P_m(z) = \sum_{k=0}^{\infty}\left[\sum_{j=0}^{k} a_{k-j}\,q_j - p_k\right] z^k, \tag{12.1.5} \]

where $a_k = 0$ for $k > m + n$, $q_j = 0$ for $j > n$, and $p_k = 0$ for $k > m$. We set the terms in brackets in (12.1.5) equal to zero for $k \le m + n$. As we show next, this system always has a solution. However, the solution may not be unique; see Exercise 1.
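The linear system obtained from (12.1.5) can be solved directly in exact arithmetic. The sketch below assumes the normalization $q_0 = 1$ and assumes that the resulting $n \times n$ system for $q_1, \dots, q_n$ is non-singular, which need not hold in general; the function names `solve` and `pade` are illustrative only. The equations with $m < k \le m + n$ determine the $q_j$, and those with $k \le m$ then give the $p_k$.

```python
from fractions import Fraction

def solve(M, rhs):
    """Gauss-Jordan elimination over the rationals; assumes M is non-singular."""
    n = len(M)
    aug = [list(row) + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("singular system: the normalization q0 = 1 may fail")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        piv = aug[col][col]
        aug[col] = [x / piv for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]

def pade(a, m, n):
    """Coefficients (p, q) of the [m, n] approximant from Taylor coefficients a[0..m+n]."""
    coef = lambda i: a[i] if i >= 0 else Fraction(0)
    # Bracket in (12.1.5) set to zero for k = m+1, ..., m+n (there p_k = 0), with q_0 = 1.
    M = [[coef(k - j) for j in range(1, n + 1)] for k in range(m + 1, m + n + 1)]
    rhs = [-coef(k) for k in range(m + 1, m + n + 1)]
    q = [Fraction(1)] + solve(M, rhs)
    # Bracket set to zero for k = 0, ..., m gives the numerator coefficients.
    p = [sum(coef(k - j) * q[j] for j in range(min(k, n) + 1)) for k in range(m + 1)]
    return p, q

# Taylor coefficients 1/k! of the exponential function, k = 0..4.
a = [Fraction(1), Fraction(1), Fraction(1, 2), Fraction(1, 6), Fraction(1, 24)]
p, q = pade(a, 2, 2)
print(p)   # coefficients of 1 + z/2 + z^2/12
print(q)   # coefficients of 1 - z/2 + z^2/12
```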
Theorem 12.1.1. The problem (12.1.2) has a solution with $P_m$ a polynomial of degree $\le m$ and $Q_n$ a polynomial of degree $\le n$.

Proof. The idea is to use the extended Euclidean algorithm, starting with the polynomials $A(z) = z^{n+m+1}$, $B(z) = T_{n+m}(z)$. The algorithm constructs polynomials $R_k$, $S_k$:

\[ R_0 = A,\quad R_1 = B,\quad R_{k-1} = S_kR_k + R_{k+1},\quad \deg R_{k+1} < \deg R_k. \tag{12.1.6} \]

The third equation in (12.1.6) determines polynomials $S_k$ and $R_{k+1}$ up to a constant multiple. The extension of the algorithm uses the $S_k$ to compute polynomials $U_k$, $V_k$:

\[
\begin{aligned}
U_0 &= 1,\quad U_1 = 0,\quad U_{k+1} = U_{k-1} - S_kU_k; \\
V_0 &= 0,\quad V_1 = 1,\quad V_{k+1} = V_{k-1} - S_kV_k.
\end{aligned}
\tag{12.1.7}
\]

It follows by induction that we have the Bezout identities

\[ AU_k + BV_k = R_k. \tag{12.1.8} \]

It follows from (12.1.6) that

\[ \deg S_k = \deg R_{k-1} - \deg R_k \]

and from (12.1.7) that

\[ \deg V_{k+1} = \deg S_k + \deg S_{k-1} + \dots + \deg S_1 = \deg A - \deg R_k. \tag{12.1.9} \]

Let us stop the process as soon as $\deg V_{k+1} > n$. Then $\deg V_k \le n$ and (12.1.9) implies that $\deg R_k \le m$. The Bezout identity (12.1.8) gives us

\[ z^{m+n+1}\,U_k(z) + T_{m+n}(z)\,V_k(z) = R_k(z). \]

Therefore taking $P_m = R_k$, $Q_n = V_k$ gives us our approximant.
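The construction in the proof can be traced in code. In this sketch, polynomials are represented as lists of rational coefficients ordered from degree 0 upward; the helper names (`trim`, `deg`, `sub`, `mul`, `divide`, `pade_euclid`) are illustrative and not from the text, and only the $V_k$ are tracked, since the $U_k$ are not needed for the approximant. The loop condition $\deg R_k > m$ is equivalent, by (12.1.9), to $\deg V_{k+1} \le n$.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients (lists are ordered from degree 0 up)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def deg(p):
    return len(trim(p)) - 1          # the zero polynomial gets degree -1

def sub(p, q):
    n = max(len(p), len(q))
    p = list(p) + [Fraction(0)] * (n - len(p))
    q = list(q) + [Fraction(0)] * (n - len(q))
    return trim([x - y for x, y in zip(p, q)])

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return trim(out)

def divide(a, b):
    """Polynomial division: returns (S, R) with a = S*b + R and deg R < deg b."""
    a, b = trim(a), trim(b)
    quot = [Fraction(0)] * max(len(a) - len(b) + 1, 0)
    while len(a) >= len(b):
        shift = len(a) - len(b)
        c = a[-1] / b[-1]
        quot[shift] = c
        a = sub(a, [Fraction(0)] * shift + [c * x for x in b])
    return trim(quot), a

def pade_euclid(taylor, m, n):
    R_prev = [Fraction(0)] * (m + n + 1) + [Fraction(1)]   # R_0 = A = z^{m+n+1}
    R = trim(Fraction(c) for c in taylor[:m + n + 1])      # R_1 = B = T_{m+n}
    V_prev, V = [], [Fraction(1)]                          # V_0 = 0, V_1 = 1
    while deg(R) > m:                                      # i.e. deg V_{k+1} <= n
        S, R_next = divide(R_prev, R)                      # (12.1.6)
        V_prev, V = V, sub(V_prev, mul(S, V))              # (12.1.7)
        R_prev, R = R, R_next
    return R, V                                            # P_m = R_k, Q_n = V_k

a = [Fraction(1), Fraction(1), Fraction(1, 2), Fraction(1, 6), Fraction(1, 24)]
P, Q = pade_euclid(a, 2, 2)
P, Q = [c / Q[0] for c in P], [c / Q[0] for c in Q]        # normalize so Q(0) = 1
print(P, Q)
```

For the exponential series this reproduces, after normalization, the numerator $1 + z/2 + z^2/12$ and denominator $1 - z/2 + z^2/12$. The final normalization assumes $Q(0) \ne 0$, which holds in this example but not in every case.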




Remarks. 1. In the previous discussion we made no use of the convergence of the series (12.1.1), but only that it is an asymptotic series for $f$:

\[ f(z) - \sum_{k=0}^{n} a_k z^k = O(|z|^{n+1}). \tag{12.1.10} \]

2. We might also assume that (12.1.10) is valid only in some subset of a neighborhood of 0, e.g. in the sector $\{z : |\arg z| < \alpha\}$.
3. A typical situation in which one might have an asymptotic expansion valid in some sector concerns behavior as $z \to \infty$:

\[ g(z) \sim b_0 + \frac{b_1}{z} + \frac{b_2}{z^2} + \dots \quad\text{as } z \to \infty \tag{12.1.11} \]

in some sector.
4. If $g$ satisfies (12.1.11) with $b_0 \ne 0$, then $f(z) = g(z^{-1})$ has an expansion (12.1.10). More generally, if $g$ has an expansion with $b_0 = \dots = b_{k-1} = 0$ and $b_k \ne 0$, then $f(z) = z^{-k}g(z^{-1})$ has an expansion (12.1.10).
Theorem 12.1.2. Suppose that $g$ has an asymptotic expansion (12.1.11) in some sector. Then for any non-negative integers $m$ and $n$ there is an $[m, n]$-Padé approximant $P_m/Q_n$ such that

\[ g(z) - \frac{P_m(z)}{Q_n(z)} = O(z^{-m-n-1}) \quad\text{as } z \to \infty \text{ in the sector.} \]

Proof. We may assume that not every term in the expansion (12.1.11) is zero. Then for some $k$, $f(z) = z^{-k}g(1/z)$ has an expansion (12.1.10) in a sector at 0. By Theorem 12.1.1 and the preceding remarks, $f$ has a Padé approximant $[n - k, m]_f = R(z)$, where $R$ is a rational function with at most $n - k$ zeros and at most $m$ poles. Then

\[ g(z) = z^{-k}f(z^{-1}) = z^{-k}R(z^{-1}) + O(z^{-k-(n-k+m)-1}), \]

and $z^{-k}R(z^{-1})$ is a rational function of $z$ with at most $n - k$ zeros and at most $m$ poles. Therefore $z^{-k}R(z^{-1})$ is the quotient of polynomials $P$ and $Q$ of degrees $\le m$ and $n$, respectively.


The method as described so far could be called “one-point” Padé approximation. It relies on information about behavior of a function $f$ at a single point, which we have taken to be the origin or the point at infinity. This can easily be generalized. Suppose that $f(z)$ has asymptotic expansions at distinct points $z_0$, $z_1$:

\[ f(z) \sim \sum_{n=0}^{\infty} a_n(z - z_0)^n \quad\text{as } z \to z_0, \tag{12.1.12} \]

\[ f(z) \sim \sum_{n=0}^{\infty} b_n(z - z_1)^n \quad\text{as } z \to z_1. \tag{12.1.13} \]

A two-point Padé approximation to $f(z)$ is a rational function $R(z) = P_m(z)/Q_n(z)$, where $Q_n(z_0) = 1$, while $P_m(z)$ and $Q_n(z)$ are polynomials of degrees $m$ and $n$, respectively. The $n + m + 1$ coefficients (after the normalization $Q_n(z_0) = 1$) are chosen so that

\[ f(z) = \frac{P_m(z)}{Q_n(z)} + O((z - z_0)^k) \quad\text{as } z \to z_0; \]

\[ f(z) = \frac{P_m(z)}{Q_n(z)} + O((z - z_1)^l) \quad\text{as } z \to z_1, \]

where $k + l = m + n + 1$. We leave the formulation of the general equations for the coefficients of the polynomials $P_m(z)$ and $Q_n(z)$, as well as the development of efficient numerical techniques, to Exercises 5 and 6.

12.2 Padé approximation and continued fractions

Often Padé approximations are made for functions whose asymptotic behavior at $\infty$ is known, say

\[ F(z) \sim \frac{c_0}{z} + \frac{c_1}{z^2} + \dots + \frac{c_n}{z^{n+1}} + \dots \tag{12.2.1} \]

as $z \to \infty$ in some sector.
Suppose $c_0 \ne 0$. Then we may write (12.2.1) as

\[ F(z) \sim \frac{c_0}{z}\left[1 + \frac{c_1}{c_0z} + \frac{c_2}{c_0z^2} + \dots\right]. \]

A key observation is that this implies that

\[ \frac{1}{F(z)} \sim \frac{z}{c_0} - \frac{b_1}{c_0} + \frac{F_1(z)}{c_0},\qquad b_1 = \frac{c_1}{c_0}, \]

where $F_1$ has an expansion similar to the expansion (12.2.1) of $F$. Thus

\[ F(z) \sim \frac{a_1}{z - b_1 + F_1(z)},\qquad a_1 = c_0. \tag{12.2.2} \]

If the leading coefficient in the expansion of $F_1$ is not zero, this can be continued:

\[ F(z) \sim \cfrac{a_1}{z - b_1 + \cfrac{a_2}{z - b_2 + F_2(z)}}, \]

where $F_2$ has an expansion of the form (12.2.1). So long as leading coefficients do not vanish we get a continued fraction expansion

\[ F(z) \sim \cfrac{a_1}{z - b_1 + \cfrac{a_2}{z - b_2 + \cfrac{a_3}{z - b_3 + \dots}}}. \tag{12.2.3} \]

Suppose we truncate this, so that the last denominator is $z - b_n$. The resulting expression $R_n$ is called the $n$-th convergent of the continued fraction (12.2.3). It is easily seen that $R_n$ is a rational function that approximates $F$ to degree $z^{-n-1}$. As we shall see, $R_n = P_n/Q_n$, where $P_n$ and $Q_n$ are polynomials and $\deg P_n = n - 1$, $\deg Q_n = n$. We shall show that $R_n$ is a Padé approximant to $F$ at $\infty$.
In discussing the theory further, it will be convenient to ease notation by looking first at the numerical case of continued fractions:

\[ \cfrac{a_1}{b_1 + \cfrac{a_2}{b_2 + \cfrac{a_3}{b_3 + \cfrac{a_4}{b_4 + \dots}}}}. \tag{12.2.4} \]

This is sometimes written

\[ \frac{a_1}{b_1 +}\ \frac{a_2}{b_2 +}\ \frac{a_3}{b_3 +}\ \frac{a_4}{b_4 +}\ \dots. \tag{12.2.5} \]

We assume throughout this discussion that the coefficients $\{a_n\}$, $\{b_n\}$ are each non-zero.
Let us look at the successive convergents: the successive truncations $t_n$ that end at a denominator $b_n$:

\[ t_1 = \frac{a_1}{b_1},\qquad t_2 = \cfrac{a_1}{b_1 + \cfrac{a_2}{b_2}} = \frac{b_2a_1}{b_2b_1 + a_2}, \]

\[ t_3 = \cfrac{a_1}{b_1 + \cfrac{a_2}{b_2 + \cfrac{a_3}{b_3}}} = \frac{b_3b_2a_1 + a_3a_1}{b_3b_2b_1 + b_3a_2 + b_1a_3}. \]

Let us write these as

\[ \frac{p_1}{q_1} = \frac{a_1}{b_1},\qquad \frac{p_2}{q_2} = \frac{b_2a_1}{b_2b_1 + a_2},\qquad \frac{p_3}{q_3} = \frac{b_3b_2a_1 + a_3a_1}{b_3(b_2b_1 + a_2) + a_3b_1}. \]

Identifying numerators with numerators and denominators with denominators in each equation here, we see that

\[ p_2 = b_2p_1,\qquad q_2 = b_2q_1 + a_2; \]
\[ p_3 = b_3p_2 + a_3p_1,\qquad q_3 = b_3q_2 + a_3q_1. \]

If we set $p_{-1} = 1$, $p_0 = 0$, $q_{-1} = 0$, and $q_0 = 1$, then these equations take the form of three-term recursions:

\[ p_k = b_kp_{k-1} + a_kp_{k-2},\qquad q_k = b_kq_{k-1} + a_kq_{k-2}, \tag{12.2.6} \]

$k = 1, 2, 3$. Replacing $b_k$ in these equations by $b_k + a_{k+1}/b_{k+1}$ in the quotient $p_k/q_k$ leads to the corresponding equations for $p_{k+1}$ and $q_{k+1}$; see Exercise 7.

Proposition 12.2.1. The equations (12.2.6), extended for all values of k, produce
the convergents rk = pk /qk of the continued fraction (12.2.4).
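The recursions (12.2.6) translate directly into a short loop. The following sketch, with illustrative function names and arbitrary sample coefficients, computes the pairs $(p_k, q_k)$ in exact rational arithmetic and compares each quotient with the truncated fraction evaluated from the bottom up.

```python
from fractions import Fraction

def convergents(a, b):
    """Pairs (p_k, q_k), k = 1..n, from (12.2.6); a[k-1] = a_k and b[k-1] = b_k."""
    p_prev, p = Fraction(1), Fraction(0)   # p_{-1} = 1, p_0 = 0
    q_prev, q = Fraction(0), Fraction(1)   # q_{-1} = 0, q_0 = 1
    out = []
    for ak, bk in zip(a, b):
        p_prev, p = p, bk * p + ak * p_prev
        q_prev, q = q, bk * q + ak * q_prev
        out.append((p, q))
    return out

def truncation(a, b):
    """Evaluate a_1/(b_1 + a_2/(b_2 + ... + a_n/b_n)) directly, innermost term first."""
    value = Fraction(0)
    for ak, bk in zip(reversed(a), reversed(b)):
        value = Fraction(ak) / (bk + value)
    return value

a = [1, 2, 3, 4]      # arbitrary sample numerators a_1..a_4
b = [2, 3, 5, 7]      # arbitrary sample denominators b_1..b_4
for n, (p, q) in enumerate(convergents(a, b), start=1):
    print(n, p / q, truncation(a[:n], b[:n]))
```

The two columns printed for each $n$ should coincide, which is the content of Proposition 12.2.1 for this sample.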

The equations (12.2.6) have powerful consequences.


Lemma 12.2.2. The convergents of the continued fraction (12.2.4) satisfy the relation

\[ \frac{p_n}{q_n} - \frac{p_{n-1}}{q_{n-1}} = \frac{(-1)^{n-1}a_1a_2\cdots a_n}{q_nq_{n-1}}. \tag{12.2.7} \]

Proof. Multiplying the left side of (12.2.7) by $q_nq_{n-1}$, and using the equations (12.2.6), we obtain

\[
\begin{aligned}
p_nq_{n-1} - p_{n-1}q_n &= (b_np_{n-1} + a_np_{n-2})\,q_{n-1} - p_{n-1}\,(b_nq_{n-1} + a_nq_{n-2}) \\
&= -a_n\,[p_{n-1}q_{n-2} - p_{n-2}q_{n-1}].
\end{aligned}
\]

The term in square brackets is the numerator of $p_{n-1}/q_{n-1} - p_{n-2}/q_{n-2}$. Therefore we may continue the calculation back to $p_0q_{-1} - p_{-1}q_0 = -1$. Dividing by $q_nq_{n-1}$ gives (12.2.7).

Now

\[ \frac{p_n}{q_n} = \left(\frac{p_n}{q_n} - \frac{p_{n-1}}{q_{n-1}}\right) + \dots + \left(\frac{p_1}{q_1} - \frac{p_0}{q_0}\right) + \frac{p_0}{q_0}. \]

Combining this with (12.2.7) gives

Corollary 12.2.3. The $n$-th convergent of the continued fraction (12.2.4) can be written as a sum

\[ \frac{p_n}{q_n} = (-1)^{n-1}\frac{a_1a_2\cdots a_n}{q_nq_{n-1}} + (-1)^{n-2}\frac{a_1a_2\cdots a_{n-1}}{q_{n-1}q_{n-2}} + \dots + \frac{a_1}{q_1q_0}. \tag{12.2.8} \]

Returning now to the continued fraction (12.2.3), we will assume that the coeffi-
cients {ak } are non-zero.

Proposition 12.2.4. The $[n, n]$ Padé approximant for the continued fraction (12.2.3) is the rational function $R_n$ obtained by truncating at denominator $b_n$. It has the form

\[ R_n(z) = \frac{P_n(z)}{Q_n(z)}, \tag{12.2.9} \]

where $P_n$ and $Q_n$ are polynomials of degree $n - 1$ and $n$, respectively. They are defined recursively by

\[
\begin{aligned}
P_{-1} &= 1,\quad P_0 = 0,\quad P_k(z) = (z - b_k)P_{k-1}(z) + a_kP_{k-2}(z); \\
Q_{-1} &= 0,\quad Q_0 = 1,\quad Q_k(z) = (z - b_k)Q_{k-1}(z) + a_kQ_{k-2}(z).
\end{aligned}
\tag{12.2.10}
\]

Proof. The equations (12.2.10) are an immediate consequence of Proposition 12.2.1. The determination of the degrees of $P_n$ and $Q_n$ follows by induction, using the assumption that the $a_n$ are non-zero. It follows from (12.2.8) that the first omitted term in the expansion of (12.2.3) has degree $(n + 1) + n$, so $R_n$ agrees with the formal expansion of (12.2.3) through terms of order $z^{-2n}$.

Multiplying numerator and denominator by $\lambda_1 = 1/a_1$ produces the same formal fraction, with $a_1$, $b_1$ and $a_2$ replaced by 1, $b_1/a_1$, and $a_2/a_1$. Then multiplying both terms in the fraction with numerator $a_2$ by $\lambda_2 = a_1/a_2$ changes the numerator to 1 and $b_2$ to $\lambda_2b_2$. Continuing this process leads to

\[ \cfrac{1}{\hat b_1 + \cfrac{1}{\hat b_2 + \cfrac{1}{\hat b_3 + \dots}}}, \tag{12.2.11} \]

where

\[ \hat b_n = \lambda_nb_n,\qquad \lambda_{2m-1} = \frac{a_2a_4\cdots a_{2m-2}}{a_1a_3\cdots a_{2m-1}},\qquad \lambda_{2m} = \frac{a_1a_3\cdots a_{2m-1}}{a_2a_4\cdots a_{2m}}. \]

The basic analytic question concerning continued fractions is: do the “conver-
gents” actually converge? One of the most elegant results is the following theorem
of Seidel [188].
Theorem 12.2.5. Suppose b̂n > 0 for n = 1, 2, · · · . The continued fraction (12.2.11)

converges if and only if the series b̂n diverges.

Proof. It follows from (12.2.8) that the convergents of (12.2.11) are the partial sums
of the series
1 1 1
− + · · · + (−1)n + ... (12.2.12)
q1 q1 q2 qn−1 qn

where the qn are the denominators of the convergents. They are all positive. By
definition q0 = 1, and clearly q1 = b̂1 . We claim that for all n,
n
qn < (1 + b̂k ).
k=1
12.2 Padé approximation and continued fractions 291

This is clearly true for n = 0 and n = 1. If it is true for n − 1 and n − 2, then by


(12.2.6)
n−1 n−2 n−1
qn < b̂n (1 + b̂k ) + (1 + b̂k ) < (1 + b̂n ) (1 + b̂k ).
k=1 k=1 k=1


Suppose first that ∞ n=1 b̂k < ∞. Then the infinite product

n=1 (1 + b̂n ) has a finite
limit. (See Section 1.6, or use the inequality 1 + b̂n < e ). This implies that the
b̂n

terms in the series (12.2.12) are bounded away from zero, so the convergents fail to
converge. 
Suppose instead that ∞ n=1 b̂n diverges. We claim that for all n ≥ 1,

qn ≥ ρqn−1 , ρ = min{1, b̂1 }.

By definition q−1 = 0, so this is true for n = 1 and n = 2. If it is true for n − 1 and


n − 2, then
qn = b̂n qn−1 + qn−2 ≥ ρ b̂n ρ + ρ > ρ.

Now
\[
q_n q_{n-1} = (\hat b_n q_{n-1} + q_{n-2})\, q_{n-1} \ge \hat b_n \rho^2 + q_{n-1} q_{n-2},
\]

so

\[
q_n q_{n-1} = (q_n q_{n-1} - q_{n-1} q_{n-2}) + \cdots + (q_2 q_1 - q_1 q_0) + q_1 q_0
\ge \hat b_n \rho^2 + \cdots + \hat b_2 \rho^2 + \hat b_1 \rho^2.
\]

Therefore qn qn−1 → ∞, so the terms of (12.2.12) decrease to zero, and the series
converges. 
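The dichotomy in Theorem 12.2.5 is easy to observe numerically. The sketch below is illustrative only: it assumes the standard initial values p−1 = 1, p0 = 0 for the numerators (the text states only q−1 = 0, q0 = 1), and the two sample sequences are chosen for the example. For b̂n = 1/n² the series converges and successive convergents stay apart; for b̂n = 1 the series diverges and the convergents settle down.

```python
def convergents(b_hat):
    """Convergents p_n/q_n of 1/(b1 + 1/(b2 + ...)) using
    p_n = b_n p_{n-1} + p_{n-2} and q_n = b_n q_{n-1} + q_{n-2}."""
    p_prev, p = 1.0, 0.0   # p_{-1}, p_0 (assumed standard initial values)
    q_prev, q = 0.0, 1.0   # q_{-1}, q_0 as in the text
    out = []
    for bn in b_hat:
        p_prev, p = p, bn * p + p_prev
        q_prev, q = q, bn * q + q_prev
        out.append(p / q)
    return out

N = 200
summable = convergents([1.0 / n**2 for n in range(1, N + 1)])   # sum of b_n converges
constant = convergents([1.0] * N)                               # sum of b_n diverges

gap_summable = abs(summable[-1] - summable[-2])   # stays bounded away from zero
gap_constant = abs(constant[-1] - constant[-2])   # tends to zero
```

In the first case the gap between consecutive convergents is 1/(qn qn−1), which the proof bounds below using the finite product; in the second the convergents approach (√5 − 1)/2.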


As an example, consider
\[
\cfrac{1}{2x + \cfrac{1}{2x + \cfrac{1}{2x + \cfrac{1}{2x + \cdots}}}}, \qquad x > 0.
\]

By Theorem 12.2.5 the convergents converge to some limit L = L(x) > 0. Clearly

\[
L = \frac{1}{2x + L}, \tag{12.2.13}
\]

so

\[
L^2 + 2xL - 1 = 0, \qquad L = -x + \sqrt{x^2 + 1}. \tag{12.2.14}
\]

Conversely, (12.2.14) leads to (12.2.13), which identifies the continued fraction expansion of the function $-x + \sqrt{x^2 + 1}$. It is a general fact that if the sequence {b̂n} is eventually periodic, then the limit of the convergents of (12.2.11) is the solution of a quadratic with coefficients that are rational functions of the {b̂n}; see [120].
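As a quick numerical check of this example, the following sketch evaluates a truncated version of the periodic continued fraction and compares it with the positive root of L² + 2xL − 1 = 0. The function name `convergent`, the sample value x = 0.75 and the truncation depth are choices made for the illustration.

```python
import math

def convergent(x, depth):
    """Evaluate 1/(2x + 1/(2x + ...)) truncated after `depth` partial denominators."""
    value = 0.0
    for _ in range(depth):
        value = 1.0 / (2.0 * x + value)
    return value

x = 0.75
limit = -x + math.sqrt(x * x + 1.0)   # positive root of L^2 + 2xL - 1 = 0
approx = convergent(x, 60)
```

For x = 0.75 the root is exactly 1/2, which makes the agreement easy to see.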
We conclude this section with another method for determining a partial fractions
expansion.
Theorem 12.2.6. For any n ≥ 1, we have

\[
\sum_{k=1}^{n} \frac{1}{D_k}
= \cfrac{1}{D_1 - \cfrac{D_1^2}{D_1 + D_2 - \cfrac{D_2^2}{D_2 + D_3 - \cdots - \cfrac{D_{n-1}^2}{D_{n-1} + D_n}}}}. \tag{12.2.15}
\]

Proof. The identity (12.2.15) is true for n = 1. For n = 2 we have

\[
\cfrac{1}{D_1 - \cfrac{D_1^2}{D_1 + D_2}}
= \frac{D_1 + D_2}{(D_1 + D_2) D_1 - D_1^2}
= \frac{D_1 + D_2}{D_1 D_2}
= \frac{1}{D_1} + \frac{1}{D_2}.
\]

If (12.2.15) is true for n, then changing the last denominator Dn−1 + Dn to

\[
D_{n-1} + D_n - \frac{D_n^2}{D_n + D_{n+1}}
\]

amounts to replacing the last term in the sum 1/D1 + · · · + 1/Dn by

\[
\cfrac{1}{D_n - \cfrac{D_n^2}{D_n + D_{n+1}}}
= \frac{D_n + D_{n+1}}{D_n D_{n+1}}
= \frac{1}{D_n} + \frac{1}{D_{n+1}}. \qquad \square
\]
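Identity (12.2.15) can be verified in exact rational arithmetic. The sketch below is only an illustration: `cf_side` is a hypothetical helper that evaluates the right-hand side from the innermost denominator outward, and the sample values of Dk are arbitrary.

```python
from fractions import Fraction

def cf_side(D):
    """Right-hand side of (12.2.15) for D = [D_1, ..., D_n], evaluated inside out."""
    n = len(D)
    if n == 1:
        return 1 / D[0]
    tail = D[n - 2] + D[n - 1]                 # innermost denominator D_{n-1} + D_n
    for j in range(n - 2, 0, -1):              # build D_j + D_{j+1} - D_{j+1}^2 / tail
        tail = D[j - 1] + D[j] - D[j] ** 2 / tail
    return 1 / (D[0] - D[0] ** 2 / tail)

D = [Fraction(k) for k in (2, 3, 5, 7, 11)]
lhs = sum(1 / d for d in D)
rhs = cf_side(D)
```

With D = (2, 3, 5) both sides equal 31/30, which can be confirmed by hand.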

12.3 Another view of Padé approximants and continued fractions

Let us take another look at the Padé approximants of a function f with Taylor
expansion
f (z) = a0 + a1 z + a2 z 2 + a3 z 3 + . . . . (12.3.1)

Assuming that a0 ≠ 0, set f0 = f and define f1 by

\[
\frac{a_0}{f_0(z)} = 1 + z f_1(z).
\]

Then f1 has a Taylor expansion similar to (12.3.1), and

\[
f_0(z) = \frac{c_0}{1 + z f_1(z)}; \qquad
f_1(z) = \left( \frac{c_0}{f_0} - 1 \right) \cdot \frac{1}{z} = c_1 + O(z), \qquad c_0 = a_0.
\]

If the constant term of f1, c1 = −a1/a0, is not zero, then this process can be continued:

\[
f_1(z) = \frac{c_1}{1 + z f_2(z)}.
\]

For brevity, we say that f is normal if this iterative process

\[
f_{n+1} = \left( \frac{c_n}{f_n} - 1 \right) \cdot \frac{1}{z} \tag{12.3.2}
\]

leads to cn = fn(0) ≠ 0 for all n, and thus to the continued fraction representation

\[
f(z) \sim \cfrac{c_0}{1 + \cfrac{c_1 z}{1 + \cfrac{c_2 z}{1 + \ddots\, \cfrac{c_{2n} z}{1 + \cdots}}}} \tag{12.3.3}
\]

The coefficients ck can be determined by expanding the convergents of (12.3.3)


into power series and comparing the coefficients with those of the power series to be
approximated. This procedure is very similar to the Padé approximation as discussed
in Section 12.1. As we shall see, in this case there is a simple algorithm.
Proposition 12.3.1. Suppose that f is normal. The n-th convergent of the continued
fraction (12.3.3) is the [n, n] Padé approximant of f if n is odd, and the [n − 1, n]
approximant if n is even.

Proof. As we showed in Section 12.2, the n-th convergent is Pn/Qn, where

P−1 = 0, P0 = c0 , Pn (z) = Pn−1 (z) + cn z Pn−2 (z); (12.3.4)


Q −1 = 1, Q 0 = 1, Q n (z) = Q n−1 (z) + cn z Q n−2 (z). (12.3.5)

It follows inductively that for m ≥ 0, P2m+1 and Q2m+1 have degrees m and m + 1, respectively, while P2m+2 and Q2m+2 both have degree m + 1. Thus Qn Qn−1 has degree n, and the argument in Proposition 12.2.4 shows that the n-th convergent agrees with f up to an error of order O(z^{n+1}). □


For convenience, we introduce the notation $R^m_n(z)$ for the [m, n] Padé approximant to the function (12.3.1):

\[
R^m_n(z) = \frac{P_m(z)}{Q_n(z)} = \frac{\sum_{k=0}^{m} p_k z^k}{\sum_{k=0}^{n} q_k z^k}. \tag{12.3.6}
\]

Consider the Padé approximants $R^m_m(z)$ and $R^m_{m+1}(z)$. The Padé sequence $R^0_0(z)$, $R^0_1(z)$, $R^1_1(z)$, $R^1_2(z)$, $R^2_2(z)$, $R^2_3(z)$, . . . , is said to be normal if every member of the sequence exists and no two members are identically equal. This sequence is normal if and only if the function f is normal; see Exercise 8. The coefficients {ck} are the same for every term of the Padé sequence. Thus, $R^m_{m+1}$, m ≥ 1, is obtained from $R^m_m$ by simply replacing cn z by cn z/(1 + cn+1 z), where n = 2m, and $R^{m+1}_{m+1}$, m ≥ 0, is obtained from $R^m_{m+1}$ by replacing cn z by cn z/(1 + cn+1 z), where n = 2m + 1.

Note that when Padé approximants are written as ratios of polynomials, every
coefficient in the rational fraction must be recomputed as we go from one member
of the normal sequence to the next. However, the entire normal sequence may be
rewritten as a simple continued fraction and only one new coefficient needs to be
computed as we go from one member to the next. Moreover, as we mentioned above,
there is a simple algorithm for computing the cn . Then the iterations (12.3.4), (12.3.5)
yield the polynomials Pn , Q n .
The algorithm for computing the cn proceeds as follows. Let us write the functions fn in the construction (12.3.2) as quotients:

\[
f_k(z) = \frac{a^{(k)}}{b^{(k)}} = \frac{\sum_{n=0}^{\infty} a^{(k)}_n z^n}{\sum_{n=0}^{\infty} b^{(k)}_n z^n}.
\]

In particular we take

\[
a^{(0)} = f_0, \qquad b^{(0)} = 1. \tag{12.3.7}
\]

Given any such power series $d(z) = \sum_{n=0}^{\infty} d_n z^n$ with leading coefficient $d(0) = d_0$, let us write $\bar d(z) = d(z)/d(0)$ for the normalized series with leading term 1. Then the algorithm (12.3.2) can be written as

\[
\frac{a^{(k+1)}}{b^{(k+1)}}
= \left( \frac{\bar b^{(k)}}{\bar a^{(k)}} - 1 \right) \cdot \frac{1}{z}
= \frac{\bar b^{(k)} - \bar a^{(k)}}{\bar a^{(k)}} \cdot \frac{1}{z}. \tag{12.3.8}
\]

Thus we may define $b^{(k)}$ and $a^{(k)}$ recursively from (12.3.8) by

\[
b^{(k+1)}(z) = \bar a^{(k)}(z), \qquad a^{(k+1)} = \frac{\bar b^{(k)} - \bar a^{(k)}}{z}.
\]

Then $\bar b^{(k)} = b^{(k)}$ for all k. Since ck is the constant term of $f_k = a^{(k)}/b^{(k)}$, we have

\[
c_k = a^{(k)}(0). \tag{12.3.9}
\]

In fact we may eliminate the $b^{(k)}$ entirely by setting

\[
a^{(-1)} = 1, \qquad a^{(0)} = f, \qquad
a^{(k+1)}(z) = \frac{\bar a^{(k-1)}(z) - \bar a^{(k)}(z)}{z}, \quad k \ge 0, \tag{12.3.10}
\]

where again $\bar a^{(k)}(z) = a^{(k)}(z)/a^{(k)}(0)$.
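The recursion (12.3.10) translates directly into a short routine on truncated power series. The following is a sketch under stated assumptions, not a definitive implementation: series are stored as lists of exact rational coefficients, each step consumes one coefficient, and the function name `cf_coefficients` is introduced here for illustration.

```python
from fractions import Fraction

def cf_coefficients(taylor, count):
    """Return c_0, ..., c_{count-1} from Taylor coefficients a_0, a_1, ...
    using a^{(k+1)} = (abar^{(k-1)} - abar^{(k)}) / z, where abar = a / a(0)."""
    a = [Fraction(t) for t in taylor]
    prev_bar = [Fraction(1)] + [Fraction(0)] * (len(a) - 1)   # abar^{(-1)} = 1
    cs = []
    for _ in range(count):
        if not a or a[0] == 0:
            raise ValueError("series is not normal to this order, or too few terms")
        c = a[0]                       # c_k = a^{(k)}(0), as in (12.3.9)
        cs.append(c)
        a_bar = [t / c for t in a]
        # Both normalized series start with 1, so the difference is divisible by z;
        # dividing by z simply drops the (zero) constant term.
        a_next = [p - q for p, q in zip(prev_bar, a_bar)][1:]
        prev_bar = a_bar[:len(a_next)]
        a = a_next
    return cs

# (sqrt(1 + 4z) - 1)/(2z) = 1 - z + 2z^2 - 5z^3 + 14z^4 - 42z^5 + ...
equal_case = cf_coefficients([1, -1, 2, -5, 14, -42], 6)
# (1 + 2z)/(1 + 3z) = 1/(1 + z/(1 + 2z)) = 1 - z + 3z^2 - 9z^3 + ...
short_case = cf_coefficients([1, -1, 3, -9], 3)
```

The first series is the function (12.3.14) with c = 1, so every coefficient should equal 1; the second is a terminating fraction with c0 = 1, c1 = 1, c2 = 2, after which the process stops being normal.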


This method can be extended to derive a continued fraction expansion for other Padé approximants $R^m_n(z)$ for f. Given J > 0, the Padé approximants $R^{J+k}_k$, $R^{J+k}_{k+1}$ of the function f of (12.3.1) can be represented as convergents of

\[
\sum_{k=0}^{J-1} a_k z^k + \cfrac{c_0 z^J}{1 + \cfrac{c_1 z}{1 + \cfrac{c_2 z}{1 + \ddots\, \cfrac{c_{2n} z}{1 + \cdots}}}} \tag{12.3.11}
\]

provided that the Padé sequence R0J , R1J , R1J +1 , R2J +1 , … is normal, i.e. no two
members of the sequence are identically equal, so that all the coefficients cn are
non-zero. Any convergent of the form (12.3.11) is the ratio of a polynomial of
degree J + m to a polynomial of degree J + m or J + m + 1. The coefficients
of z p in the expansion of the convergent whose last denominator is cn z are a p
for p = 0, 1, · · · , J − 1. The coefficients c p involve only the coefficients of z k for
k = J, J + 1, · · · , J + p. Thus, the coefficients c p can be determined by a slight
modification of the formulas in (12.3.9).
To represent the members of the Padé sequence R 0J , R 1J , R 1J +1 , R 2J +1 , R 2J +2 , · · ·
with J ≥ 0 as continued fractions, we only need to observe that the Rmn Padé approx-
imant to 1/ f (z) is identical to the Rnm Padé approximant to f , evaluated at 1/z; see
Exercise 7. Therefore, assuming normality, the desired sequence of Padé approxi-
mants can be represented as the inverse of the expressions (12.3.11) with the coeffi-
cients ak of f (z) replaced by the expansion coefficients of 1/ f (z).
Let us consider the question of convergence of the convergents of the continued fraction (12.3.3) in the special case when the cn are equal:

\[
f(z) \sim \cfrac{c}{1 + \cfrac{cz}{1 + \cfrac{cz}{1 + \ddots\, \cfrac{cz}{1 + \cdots}}}} \tag{12.3.12}
\]

If there is convergence, then

\[
f(z) = \frac{c}{1 + z f(z)}. \tag{12.3.13}
\]

This gives a quadratic equation in f(z), and considering z → 0 identifies the solution as

\[
f(z) = \frac{\sqrt{1 + 4cz} - 1}{2z}. \tag{12.3.14}
\]

We know that the convergents have the form Pn/Qn, where

\[
P_{-1} = 0, \quad P_0 = c, \quad P_n(z) = P_{n-1}(z) + cz\, P_{n-2}(z);
\]
\[
Q_{-1} = 1, \quad Q_0 = 1, \quad Q_n(z) = Q_{n-1}(z) + cz\, Q_{n-2}(z).
\]

Writing S for the shift operator on sequences x = {xn}, Sx = {xn+1}, the sequences {Pn} and {Qn} are solutions of (S² − S − cz)x = 0. Now

\[
S^2 - S - cz\,1 = (S - \lambda_+ 1)(S - \lambda_- 1), \qquad
\lambda_\pm = \frac{1}{2} \pm \frac{1}{2} \sqrt{1 + 4cz}.
\]

(Here we take the branch of the square root that is positive for cz > 0 and holomorphic on the complement of {z : Re cz ≤ 0}.) Therefore the sequences {Pn} and {Qn} are linear combinations of the sequences $\{\lambda_+^{n+1}\}$, $\{\lambda_-^{n+1}\}$. Taking into account the initial conditions at n = −1, n = 0, and using λ+ − λ− = √(1 + 4cz) and λ+ + λ− = 1, we find that

\[
P_n(z) = \frac{c}{\sqrt{1 + 4cz}} \left[ \lambda_+^{n+1} - \lambda_-^{n+1} \right]; \qquad
Q_n(z) = \frac{1}{\sqrt{1 + 4cz}} \left[ \lambda_+^{n+2} - \lambda_-^{n+2} \right].
\]

Now throughout the sector on which we have defined λ±, the principal branch of log λ± is positive for λ+ and negative for λ−, so $\lambda_-^n$ is exponentially small as n → ∞. Therefore, since λ+λ− = −cz,

\[
\frac{P_n(z)}{Q_n(z)} \sim \frac{c}{\lambda_+} = -\frac{\lambda_-}{z} = \frac{\sqrt{1 + 4cz} - 1}{2z}.
\]

Thus we have proved convergence and verified (12.3.14).
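The convergence just proved can be observed directly from the two recursions. The sketch below is illustrative: the values of c and z are arbitrary sample choices with 1 + 4cz in the right half-plane, and `cmath.sqrt` supplies the principal branch of the square root.

```python
import cmath

def convergent(c, z, n):
    """P_n/Q_n from P_n = P_{n-1} + c z P_{n-2} and Q_n = Q_{n-1} + c z Q_{n-2}."""
    p_prev, p = 0.0, c        # P_{-1}, P_0
    q_prev, q = 1.0, 1.0      # Q_{-1}, Q_0
    for _ in range(n):
        p_prev, p = p, p + c * z * p_prev
        q_prev, q = q, q + c * z * q_prev
    return p / q

c, z = 1.0, 0.5 + 0.5j
closed_form = (cmath.sqrt(1 + 4 * c * z) - 1) / (2 * z)   # right-hand side of (12.3.14)
approx = convergent(c, z, 80)
```

The error decays like |λ−/λ+|ⁿ, so a few dozen steps already reach machine precision for this choice of z.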

12.4 The Stieltjes transform, Padé approximants, and orthogonal polynomials

Suppose that μ is a measure defined on an interval I = (a, b) ⊂ R. For our purpose here we may assume that μ is defined by a non-negative density function w : I → [0, ∞). The associated Hilbert space L²(I, w(t) dt) is the completion of the space of continuous functions ϕ such that $\int_a^b |\varphi(t)|^2 w(t)\,dt < \infty$ with respect to the norm $\|\varphi\| = (\varphi, \varphi)_w^{1/2}$ that corresponds to the inner product

\[
(\varphi, \psi)_w = \int_a^b \varphi(t) \psi(t) w(t)\,dt.
\]

Two other objects associated to w are the Stieltjes transform

\[
F(z) = \int_a^b \frac{w(t)\,dt}{z - t}, \qquad z \notin [a, b], \tag{12.4.1}
\]

and the set of moments of w. We assume that $\int_a^b x^{2n} w(x)\,dx$, n = 0, 1, 2, . . . is finite; then the moments are the constants

\[
c_n = \int_a^b x^n w(x)\,dx. \tag{12.4.2}
\]

It is convenient to assume that the measure is normalized with

\[
c_0 = \int_a^b w(x)\,dx = 1. \tag{12.4.3}
\]

Let us look for the [n − 1, n] Padé approximants Pn/Qn to the Stieltjes transform F, where deg Pn = n − 1, deg Qn = n. Since we have assumed c0 = 1, we may take Pn and Qn to be monic: having leading coefficients equal to 1. The Padé condition is

\[
Q_n(z) F(z) = P_n(z) + O(z^{-n-1}). \tag{12.4.4}
\]

For any monic polynomial Q of degree n,

\[
Q(z) F(z) = \int_a^b \frac{Q(z)}{z - t} w(t)\,dt = P(z) + R(z),
\]

where

\[
P(z) = \int_a^b \frac{Q(z) - Q(t)}{z - t} w(t)\,dt \tag{12.4.5}
\]

has degree n − 1. Expanding (z − t)⁻¹ as z → ∞ along a non-real ray,

\[
R(z) = \int_a^b \frac{Q(t)}{z - t} w(t)\,dt
= \sum_{k=0}^{n-1} \left( \int_a^b Q(t)\, t^k w(t)\,dt \right) z^{-k-1} + O(z^{-n-1}).
\]

It follows that Qn = Q satisfies (12.4.4) if and only if Qn is orthogonal to every polynomial of lower degree:

\[
(Q_n, P)_w = 0 \quad \text{if } \deg P < n.
\]

If so, then Pn = P is defined by (12.4.5).
It is not difficult to see that there is a unique sequence {Qn} of monic polynomials that are mutually orthogonal:

\[
(Q_n, Q_m)_w = 0, \qquad n \ne m.
\]

In fact the n linear equations (Q n , t k )w = 0, 0 ≤ k < n determine the n lower degree


coefficients of Q n . Note that Q 0 = 1, Q 1 = z − b, where b = c1 − 1. For n ≥ 2,
Q n − z Q n−1 has degree n − 1 and is orthogonal to all polynomials of degree ≤ n − 3,
so it is a linear combination of Q n−1 and Q n−2 :

Q n = (z − bn )Q n−1 + an Q n−2 , n ≥ 2. (12.4.6)

It follows from (12.4.5) that Pn satisfies the same three-term recursion, with

P0 = 0, P1 = 1.

Since (Qn, Qn−2)w = 0 = (Qn−1, Qn−2)w, (12.4.6) implies

\[
a_n \|Q_{n-2}\|_w^2 = (a_n Q_{n-2}, Q_{n-2})_w = -(z Q_{n-1}, Q_{n-2})_w
= -(Q_{n-1}, z Q_{n-2})_w = -\|Q_{n-1}\|_w^2,
\]

since zQn−2 differs from Qn−1 by a polynomial of lower degree. Therefore an < 0.
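The construction of the Qn from the moments can be carried out exactly for a concrete weight. The sketch below is an illustration under assumptions not in the text: it takes w = 1 on (0, 1), so that ck = 1/(k + 1), solves the n linear equations (Qn, t^k)w = 0 by Gauss-Jordan elimination in rational arithmetic, and evaluates a2 = −‖Q1‖²/‖Q0‖² from the identity above. The helper names are hypothetical.

```python
from fractions import Fraction

def monic_orthogonal(moments, n):
    """Coefficients [q_0, ..., q_{n-1}, 1] of the monic Q_n (lowest degree first),
    from the n equations sum_j q_j c_{j+k} = -c_{n+k}, 0 <= k < n."""
    rows = [[Fraction(moments[j + k]) for j in range(n)] + [-Fraction(moments[n + k])]
            for k in range(n)]
    for col in range(n):                       # Gauss-Jordan elimination, exact arithmetic
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [v - factor * w for v, w in zip(rows[r], rows[col])]
    return [rows[k][n] for k in range(n)] + [Fraction(1)]

def inner(p, q, moments):
    """(p, q)_w computed from the moments: sum of p_i q_j c_{i+j}."""
    return sum(pi * qj * moments[i + j]
               for i, pi in enumerate(p) for j, qj in enumerate(q))

moments = [Fraction(1, k + 1) for k in range(8)]   # assumed weight: w = 1 on (0, 1)
Q0, Q1, Q2 = (monic_orthogonal(moments, n) for n in range(3))
a2 = -inner(Q1, Q1, moments) / inner(Q0, Q0, moments)
```

For this weight the results are the monic shifted Legendre polynomials, Q1 = t − 1/2 and Q2 = t² − t + 1/6, and a2 = −1/12 is negative as the argument predicts.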
Comparing this with the discussion at the beginning of Section 12.2, we see that if we take a1 = 1, the [n − 1, n] Padé approximant of F is the n-th convergent of the continued fraction

\[
F(z) \sim \cfrac{1}{z - b_1 + \cfrac{a_2}{z - b_2 + \cfrac{a_3}{z - b_3 + \cdots}}}. \tag{12.4.7}
\]

Let us estimate the remainder term Rn = F − Pn/Qn:

\[
F(z) - \frac{P_n(z)}{Q_n(z)}
= \int_a^b \frac{w(t)\,dt}{z - t} - \frac{1}{Q_n(z)} \int_a^b \frac{Q_n(z) - Q_n(t)}{z - t} w(t)\,dt
= \frac{1}{Q_n(z)} \int_a^b \frac{Q_n(t)}{z - t} w(t)\,dt. \tag{12.4.8}
\]

Since we have assumed that $\int_a^b w(t)\,dt = 1$, the Cauchy–Schwarz inequality yields

\[
\int_a^b |Q_n(t)|\, w(t)\,dt \le \|Q_n\|_w. \tag{12.4.9}
\]

In this connection, we may use the following to make more explicit estimates.

Lemma 12.4.1. The monic orthogonal polynomial Q n has minimal L 2 norm || ||w
among all monic polynomials of degree n.

Proof. Suppose that Q is a monic polynomial of degree n. Then P = Q − Qn has degree < n, so (Qn, P)w = 0. Therefore

\[
\|Q\|_w^2 = (Q, Q)_w = (Q_n + P, Q_n + P)_w = (Q_n, Q_n)_w + (P, P)_w = \|Q_n\|_w^2 + \|P\|_w^2,
\]

so ||Q||w > ||Qn||w unless Q = Qn. □




Using the monic polynomials xⁿ in the general case, and the polynomials (x − ½[a + b])ⁿ, centered at the midpoint, in the case of a finite interval, we see that (12.4.8), (12.4.9), and Lemma 12.4.1 imply the following estimates.

Theorem 12.4.2. Let d(z) be the distance from z ∈ C to the interval I = (a, b). Then d(z) ≥ |Im z| and

\[
\left| F(z) - \frac{P_n(z)}{Q_n(z)} \right|
\le \frac{1}{d(z)^n} \left( \int_a^b t^{2n} w(t)\,dt \right)^{1/2}
= \frac{\sqrt{c_{2n}}}{d(z)^n}. \tag{12.4.10}
\]

If the interval is finite, then d(z) ∼ |z| as z → ∞, and

\[
\left| F(z) - \frac{P_n(z)}{Q_n(z)} \right|
\le \frac{1}{d(z)^n} \cdot \left( \frac{b - a}{2} \right)^n. \tag{12.4.11}
\]

Remark. The discussion and the results of this section are unchanged if we consider a general Borel measure μ in place of one determined by a weight function w, so long as all the moments $\int_I t^{2n}\,d\mu(t)$ are finite. The most general type, in the case of R, is a Riemann–Stieltjes integral.
Such an integral is determined by a bounded, non-decreasing function ψ on R. We may normalize it by assuming that $\lim_{x \to -\infty} \psi(x) = 0$ and that ψ is continuous from the left. The measure of an interval (a, b) is ψ(b) − ψ(a). The integral of a continuous function that vanishes outside a bounded interval [a, b] is the limit of the sums

\[
\sum_{k=1}^{n} f(x_k) [\psi(x_k) - \psi(x_{k-1})]
\]

over partitions x0 < a < x1 < · · · < xn = b as sup{|xk − xk−1|} → 0. The integral is extended to more general functions f by taking appropriate limits.
As an example, consider the step function

φ(x) = 0, x < 0. φ(x) = 1, x ≥ 0.

For any continuous function f : R → C,

\[
\int_{-\infty}^{\infty} f(x - t)\,d\phi(t) = f(x).
\]

The corresponding measure μ is often denoted δ(t)dt, where δ is the Dirac delta “function”, which vanishes outside the origin and is thought of as being infinite at the origin to just the correct amount so that $\int_{\mathbb{R}} \delta(t)\,dt = 1$. (This is the reason for taking partitions that start to the left of a, above, so that one can have a jump “at” x = a even if the interval in question starts at a.)

12.5 Characterization of Stieltjes transforms

In general, the question of convergence of a Padé sequence can be difficult. In this section we consider a class of functions for which the theory is quite complete. As we showed in Section 12.4, if F is the Stieltjes transform of a positive measure dψ on the line, all of whose moments are finite, then the Padé approximants converge to F on the complement of the support of the measure. In this section we ask how to characterize such Stieltjes transforms F, and how to determine the measure from the moments

\[
c_k = \int t^k\,d\psi(t), \qquad k = 0, 1, 2, \ldots. \tag{12.5.1}
\]

We make some changes from the discussion in Section 12.4. First, we assume that the interval I = (a, b) is not all of R, so up to a translation and, if necessary, a change of orientation, we assume I ⊂ (0, ∞). Second, we change some signs in the Stieltjes transform, and characterize functions

\[
f(z) = T(\mu) = \int_0^\infty \frac{d\mu(t)}{z + t}, \qquad z \notin (-\infty, 0], \tag{12.5.2}
\]

where μ is a measure all of whose moments are finite. Let us note the particular case: μ supported at the origin, with μ({0}) = c > 0. Then c0 = c, cn = 0 for n > 0, and

\[
f(z) = \frac{c}{z}. \tag{12.5.3}
\]

It will be convenient to consider (12.5.2) as a Riemann–Stieltjes integral with respect to the non-decreasing function

\[
\psi(t) = 0, \quad t < 0; \qquad \psi(t) = \mu([0, t)), \quad t \ge 0.
\]

Thus

\[
f(z) = \int_0^\infty \frac{d\psi(t)}{z + t}, \qquad z \notin (-\infty, 0]. \tag{12.5.4}
\]

The function ψ can be recovered from f. We leave the following as Exercise 10.
Proposition 12.5.1. The function (12.5.4) is the limit

\[
\psi(x) = \lim_{\varepsilon \downarrow 0} \frac{1}{2\pi i} [ f(-x - i\varepsilon) - f(-x + i\varepsilon) ].
\]

The characterization theorem is a consequence of a classic result of Herglotz [104] and Riesz [176]:
Theorem 12.5.2. Suppose that g : D → C is holomorphic and g(0) > 0. Then g has positive real part if and only if there is a positive measure μ on [0, 2π] such that

\[
g(z) = \int_0^{2\pi} \frac{e^{i\theta} + z}{e^{i\theta} - z}\,d\mu(\theta). \tag{12.5.5}
\]

Proof. Suppose that g is defined by (12.5.5), with μ such a measure. Then g is holomorphic in D. For z = 0 we get $g(0) = \int_0^{2\pi} d\mu(t) > 0$. Taking the real part of (12.5.5) gives

\[
\operatorname{Re} g(z) = \int_0^{2\pi} \frac{1 - |z|^2}{|e^{i\theta} - z|^2}\,d\mu(\theta), \qquad 0 \le r < 1. \tag{12.5.6}
\]

The integrand is positive, so Re g > 0.


Conversely, suppose that g is holomorphic in D, with positive real part. Let gε(z) = g(z/(1 + ε)), 0 < ε < 1. Then gε is continuous on the closure of D. By Theorem 5.1.6,

\[
g_\varepsilon(z) = \int_0^{2\pi} \frac{e^{i\theta} + z}{e^{i\theta} - z}\, h_\varepsilon(e^{i\theta})\,d\theta, \qquad h_\varepsilon(z) = \operatorname{Re} g_\varepsilon(z).
\]

By assumption, each hε is positive. Let us use hε to define a non-decreasing function and a corresponding Riemann–Stieltjes measure

\[
\phi_\varepsilon(\theta) = \int_0^{\theta} h_\varepsilon(e^{is})\,ds, \quad 0 \le \theta \le 2\pi; \qquad \mu_\varepsilon = d\phi_\varepsilon.
\]

Then με(∂D) = φε(2π) > 0. Note that φε(2π) = g(0).
By the usual diagonal process we may find a subsequence {εn} of the sequence {1/n} such that each limit

\[
a_m = \lim_{n \to \infty} \int_0^{2\pi} e^{im\theta}\,d\phi_{\varepsilon_n}, \qquad m \in \mathbb{Z} \tag{12.5.7}
\]

exists. By the Weierstrass approximation theorem, Corollary 5.1.2, linear combinations of the {e^{imθ}} are dense in C(∂D). Since each με has total mass g(0), it follows that the linear maps λn : C(∂D) → C defined by

\[
\lambda_n(u) = \int_0^{2\pi} u(\theta)\,d\phi_{\varepsilon_n}(\theta)
\]

converge to a linear map λ such that λ(u) ≥ 0 if u ≥ 0, and

\[
|\lambda(u)| \le g(0) \sup_\theta |u(\theta)|, \qquad \lambda(u) = g(0) \ \text{if } u \equiv 1.
\]

This is one characterization of a measure μ of total mass g(0) on ∂D:

\[
\int_0^{2\pi} u(\theta)\,d\mu(\theta) = \lambda(u).
\]

On any given compact subset of D, the gεn converge uniformly to g, so

\[
g(z) = \lim_{n \to \infty} g_{\varepsilon_n}(z)
= \lim_{n \to \infty} \int_0^{2\pi} \frac{e^{i\theta} + z}{e^{i\theta} - z}\,d\mu_{\varepsilon_n}(\theta)
= \int_0^{2\pi} \frac{e^{i\theta} + z}{e^{i\theta} - z}\,d\mu(\theta). \qquad \square
\]

Theorem 12.5.3. Let f be a function defined on the complement of the real interval (−∞, 0]. Then f has the form (12.5.2), where μ is a positive measure on [0, ∞), all of whose moments are finite, if and only if the following conditions hold:
(i) f is holomorphic;
(ii) f(x) > 0 for x > 0;
(iii) for any x ∈ R and y ≠ 0, y Im f(x + iy) < 0;
(iv) there is a sequence of constants c0, c1, c2, … such that for each ε > 0, the function $f(z) \sim \sum_{n=0}^{\infty} c_n z^{-n-1}$ as z → ∞ in the sector |arg z| ≤ π − ε.

Proof. Suppose that f is given by (12.5.2), where μ is such a measure. It is easily checked that f satisfies conditions (i), (ii), and (iii). To verify condition (iv), we expand

\[
\frac{1}{z + t} = \sum_{k=0}^{n-1} (-1)^k \frac{t^k}{z^{k+1}} + (-1)^n \frac{t^n}{z^n} \cdot \frac{1}{z + t}. \tag{12.5.8}
\]

Therefore as z → ∞ in the sector,

\[
f(z) = \sum_{k=0}^{n-1} (-1)^k \frac{c_k}{z^{k+1}} + O(z^{-n-1}),
\]

where the ck are the moments of μ.


Conversely, suppose that f satisfies conditions (i), (ii), and (iii). We use the inverse of the Cayley transform, C⁻¹ : D → H, to transfer f to the function

\[
g(w) = i f \circ C^{-1}(w) = i f\!\left( i\, \frac{1 + w}{1 - w} \right).
\]

Then g is holomorphic on D and has positive real part. Moreover, f is holomorphic across (0, ∞) and is real on this half-line. Theorem 12.5.2 tells us that there is a positive measure ν = dφ on ∂D such that

\[
g(w) = ia + \int_{-\pi}^{\pi} \frac{e^{i\theta} + w}{e^{i\theta} - w}\,d\phi(\theta),
\]

where a is the imaginary part of g(0). Then

\[
\begin{aligned}
f(z) = -i g\!\left( \frac{z - i}{z + i} \right)
&= a - i \int_{-\pi}^{\pi} \frac{e^{i\theta}(z + i) + (z - i)}{e^{i\theta}(z + i) - (z - i)}\,d\phi(\theta) \\
&= a - i \int_{-\pi}^{\pi} \frac{e^{i\theta/2}(z + i) + e^{-i\theta/2}(z - i)}{e^{i\theta/2}(z + i) - e^{-i\theta/2}(z - i)}\,d\phi(\theta) \\
&= a - i \int_{-\pi}^{\pi} \frac{z \cos(\theta/2) - \sin(\theta/2)}{i z \sin(\theta/2) + i \cos(\theta/2)}\,d\phi(\theta).
\end{aligned}
\]

Let t = cot(θ/2) and ψ(t) = ½ φ(2 cot⁻¹ t). Then

\[
d\phi(\theta) = \frac{d\psi(t)}{1 + t^2},
\]

so

\[
f(z) = a + \int_0^\infty \frac{1 - tz}{t + z} \cdot \frac{d\psi(t)}{1 + t^2}. \tag{12.5.9}
\]

By assumption, f(z) → 0 as z → ∞ in any sector |arg z| < π. Therefore (12.5.9) implies that

\[
a = \int_0^\infty \frac{t\,d\psi(t)}{1 + t^2}.
\]

Since t + (1 − tz)/(t + z) = (1 + t²)/(t + z), (12.5.9) becomes

\[
f(z) = \int_0^\infty \left[ t + \frac{1 - tz}{t + z} \right] \frac{d\psi(t)}{1 + t^2} = \int_0^\infty \frac{d\psi(t)}{z + t}. \tag{12.5.10}
\]

We have proved this for Im z > 0, but continuation across (−∞, 0) establishes it for all z in the complement of (−∞, 0].
It remains to show that the moments of μ = dψ are finite. This is a consequence of assumption (iv), the asymptotic expansion; see Exercise 11. □


12.6 Stieltjes functions and Padé approximants

Given a modified Stieltjes transform

\[
f = T(\mu) = \int_0^\infty \frac{d\mu(t)}{z + t},
\]

we define the associated Stieltjes function F = Fμ in the complement of (−∞, 0] by

\[
F(z) = \frac{1}{z} f\!\left( \frac{1}{z} \right) = \int_0^\infty \frac{d\mu(t)}{1 + zt} \tag{12.6.1}
\]

where again all moments of μ are assumed to be finite. An example is a constant


function F(t) ≡ c > 0. In fact this corresponds to f of (12.5.2), where the measure
μ has total mass c supported on {0}.
The relation (12.6.1) is reciprocal: f (z) = z −1 F(z −1 ). This relation between
F and f , and the integral formula (12.6.1), makes it easy to verify the following
characterization; see Exercise 12.
Theorem 12.6.1. A function F, defined for z in the complement of (−∞, 0), is the Stieltjes function for a positive measure on [0, ∞) if and only if it satisfies the properties
(i) F is holomorphic on the complement of (−∞, 0];
(ii) F(x) > 0 for x ≥ 0;
(iii) Im (z F(z)) and Im z have the same sign, for Im z ≠ 0;
(iv) there is a sequence of constants a0 > 0, a1, a2, … such that for any ε > 0, $F(z) \sim \sum_{n=0}^{\infty} (-1)^n a_n z^n$ as z → 0, |arg z| ≤ π − ε.
It is easily checked that the coefficients in the expansion (iv) of a Stieltjes function are the moments

\[
a_k = \int_0^\infty t^k\,d\mu(t).
\]

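Let us turn to the Padé approximants of Stieltjes functions.

Lemma 12.6.2. Suppose that μ is a positive measure on [0, ∞) and $\int_0^\infty d\mu(t)$ is finite. Then the functions

\[
f(z) = \int_0^\infty \frac{d\mu(t)}{z + t}, \qquad F(z) = \int_0^\infty \frac{d\mu(t)}{1 + zt}, \qquad \operatorname{Im} z \ne 0
\]

satisfy

\[
\operatorname{Im} z \,\operatorname{Im} f(z) < 0, \qquad \operatorname{Im} z \,\operatorname{Im} F(z) < 0; \tag{12.6.2}
\]
\[
\operatorname{Im} z \,\operatorname{Im} (z f(z)) > 0, \qquad \operatorname{Im} z \,\operatorname{Im} (z F(z)) > 0. \tag{12.6.3}
\]

Proof. This follows immediately from

\[
\frac{1}{z + t} = \frac{\bar z + t}{|z + t|^2}, \qquad \frac{z}{z + t} = \frac{|z|^2 + tz}{|z + t|^2}
\]

and the analogous identities for the integrand of F, since these integrands are bounded functions of t and therefore integrable with respect to μ. □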


Lemma 12.6.3. Suppose that the function F is a Stieltjes function. Then the same is true of the functions G1 and G2 defined by

\[
\frac{F(z)}{F(0)} = \frac{1}{1 + z G_1(z)}; \qquad G_2(z) = \frac{c}{1 + z F(z)}, \quad c > 0. \tag{12.6.4}
\]

Proof. First

\[
G_1(z) = \left( \frac{F(0)}{F(z)} - 1 \right) z^{-1}. \tag{12.6.5}
\]

It follows from (12.6.5) and from the second equation in (12.6.4) that, since F is a Stieltjes function, G1 and G2 satisfy conditions (i) and (iv) of Theorem 12.6.1. Condition (ii) is immediate for G2, and for G1 it follows from the fact that for x > 0,

\[
F(0) - F(x) = \int_0^\infty \frac{xt}{1 + xt}\,d\mu(t) > 0.
\]

As for condition (iii),

\[
z G_1(z) = \frac{F(0)}{F(z)} - 1; \qquad z G_2(z) = \frac{c}{z^{-1} + F(z)},
\]

and Lemma 12.6.2 shows that the imaginary part of zGk(z) has the same sign as Im z. □


Theorem 12.6.4. If F is a Stieltjes function then all the constants cn in the continued fraction expansion

\[
\cfrac{c_0}{1 + \cfrac{c_1 z}{1 + \ddots\, \cfrac{c_{2n} z}{1 + \cdots}}} \tag{12.6.6}
\]

of F are positive.

Proof. Let G0 = F and c0 = F(0), and define Gn inductively:

\[
G_k(z) = \frac{c_k}{1 + z G_{k+1}(z)}, \qquad c_k = G_k(0), \quad k = 1, 2, 3, \ldots.
\]

By Lemma 12.6.3, these functions are Stieltjes functions. Therefore (12.6.6) with coefficients

\[
c_k = G_k(0) = \int_0^\infty d\mu_k(t) > 0
\]

is the continued fraction associated with G0 = F.
Conversely, suppose that the coefficients in the continued fraction (12.6.6) are all positive. For any given positive integer n let $G^{(n)}_0 \equiv c_n$. This is a Stieltjes function. Then by Lemma 12.6.3 so are the functions defined iteratively by

\[
G^{(n)}_k(z) = \frac{c_{n-k}}{1 + z G^{(n)}_{k-1}(z)}, \qquad k = 1, 2, \ldots, n.
\]

Then $F_n = G^{(n)}_n$ is the n-th convergent of (12.6.6), and has a representation

\[
F_n(z) = \int_0^\infty \frac{d\mu_n(t)}{1 + zt},
\]

where μn is a positive measure on [0, ∞). It has total mass

\[
\int_0^\infty d\mu_n(t) = F_n(0) = c_0.
\]

Therefore, as in the proof of Theorem 12.5.2, some subsequence $\{\mu_{2n_k}\}$ of $\{\mu_{2n}\}$ converges to a positive measure μ with total mass c0.

We know that Fn = Pn/Qn where Pn and Qn are polynomials that satisfy

\[
P_n = P_{n-1} + c_n z P_{n-2}, \qquad P_{-1} = 0, \quad P_0 = c_0; \tag{12.6.7}
\]
\[
Q_n = Q_{n-1} + c_n z Q_{n-2}, \qquad Q_{-1} = Q_0 = 1. \tag{12.6.8}
\]

In particular, positivity of the ck implies that for x > 0

\[
1 = Q_0(x) < Q_1(x) = 1 + c_1 x < Q_2(x) < \cdots. \tag{12.6.9}
\]

As in (12.2.7), the recursion (12.6.8) leads to

\[
F_n(z) - F_{n-1}(z) = \frac{(-1)^n c_0 c_1 \cdots c_n z^n}{Q_n(z) Q_{n-1}(z)}.
\]

The same kind of calculation leads to

\[
F_{n+1}(z) - F_{n-1}(z) = \frac{(-1)^n c_0 c_1 \cdots c_n z^n}{Q_{n+1}(z) Q_{n-1}(z)}.
\]

It follows from these identities and (12.6.9) that for x > 0,

\[
\frac{c_0}{1 + c_1 x} = F_1(x) < F_3(x) < F_5(x) < \cdots < F_4(x) < F_2(x) < F_0(x) = c_0.
\]

Now

\[
\lim_{k \to \infty} F_{2n_k}(z) = \lim_{k \to \infty} \int_0^\infty \frac{d\mu_{2n_k}}{1 + zt} = \int_0^\infty \frac{d\mu(t)}{1 + zt}.
\]

Thus the limiting function F is Stieltjes. It has the continued fraction expansion whose 2n-th convergent is F2n. Now for each fixed x > 0, F2n(x) decreases, so

\[
\lim_{n \to \infty} F_{2n}(x) = \lim_{k \to \infty} F_{2n_k}(x) = \int_0^\infty \frac{d\mu(t)}{1 + xt}.
\]

Thus F has continued fraction expansion (12.6.6). □




We know from Proposition 12.3.1 that the convergent Fn is the Padé approximant [n, n − 1]F if n is odd, and is [n, n]F if n is even. Therefore Theorem 12.6.4 and its proof give us the following.

Corollary 12.6.5. For any Stieltjes function F, the Padé approximants satisfy

\[
0 < [1, 0]_F(x) < [3, 2]_F(x) < [5, 4]_F(x) < \cdots < [4, 4]_F(x) < [2, 2]_F(x) < [0, 0]_F(x) = F(0)
\]

for x ≥ 0.
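The interlacing of odd and even convergents depends only on the positivity of the cn, so it can be checked with any positive coefficients. In the sketch below the coefficient list and the evaluation point are arbitrary illustrative choices; the convergents are computed exactly from the recursions (12.6.7) and (12.6.8).

```python
from fractions import Fraction

def convergents(c, x):
    """F_0(x), ..., F_N(x) for coefficients c = [c_0, ..., c_N], via (12.6.7)-(12.6.8)."""
    p_prev, p = Fraction(0), Fraction(c[0])    # P_{-1}, P_0
    q_prev, q = Fraction(1), Fraction(1)       # Q_{-1}, Q_0
    values = [p / q]
    for cn in c[1:]:
        p_prev, p = p, p + cn * x * p_prev
        q_prev, q = q, q + cn * x * q_prev
        values.append(p / q)
    return values

c = [3, 1, 4, 1, 5, 9, 2, 6]                   # arbitrary positive coefficients
F = convergents(c, Fraction(1, 2))
odd, even = F[1::2], F[0::2]                   # F_1, F_3, ... and F_0, F_2, ...
```

The odd-indexed values should increase, the even-indexed values should decrease, and every odd value should lie below every even value.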

Remarks. Recall that the coefficients in the expansion

\[
F(z) \sim \sum_{n=0}^{\infty} (-1)^n a_n z^n, \qquad F(z) = \int_0^\infty \frac{d\mu(t)}{1 + zt}
\]

are the moments $a_n = \int_0^\infty t^n\,d\mu(t)$. These can be computed from the convergents Fn of the continued fraction (12.6.6), and conversely. One question is: do the moments determine μ uniquely? The answer is no, in general; see [9], [186]. For example, the measures with density functions

\[
w(t) = \exp(-t^{1/4}) [1 - a \sin(t^{1/4})], \qquad 0 \le a < 1
\]

all have the same moments an = 4 · (4n + 3)!. One positive result is due to Carleman [40].

Theorem 12.6.6. If the moments an of the positive measure μ on [0, ∞) satisfy

\[
\sum_{n=0}^{\infty} a_n^{-1/2n} = \infty,
\]

then μ is uniquely determined.

12.7 Generalized Shanks Transformation

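There is no general method for transforming the terms of a convergent series

\[
A = \sum_{n=0}^{\infty} a_n = \lim_{n \to \infty} A_n, \qquad A_n = \sum_{k=0}^{n} a_k,
\]

so that the transformed series converges more rapidly. However if something is known about the (approximate) form of the remainder terms A − An, such a transformation may be possible. Let us write An = A + εn. By assumption, εn → 0. The system

\[
A_{n+1} = A + \varepsilon_{n+1}, \qquad A_n = A + \varepsilon_n, \qquad A_{n-1} = A + \varepsilon_{n-1}
\]

gives

\[
A = \frac{\Delta_n A_{n-1} - A_n \Delta_{n-1}}{\Delta_n - \Delta_{n-1}} + \frac{\varepsilon_n^2 - \varepsilon_{n+1} \varepsilon_{n-1}}{\Delta_n - \Delta_{n-1}}, \qquad \Delta_n = A_{n+1} - A_n. \tag{12.7.1}
\]

This produces the limit A as an explicit function of any three successive An, provided the denominator does not vanish and the εn ≠ 0 satisfy $\varepsilon_n^2 - \varepsilon_{n-1} \varepsilon_{n+1} = 0$, i.e.

\[
\frac{\varepsilon_{n+1}}{\varepsilon_n} = \frac{\varepsilon_n}{\varepsilon_{n-1}} = \lambda \ne 0.
\]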

If this is the case, then

\[
\varepsilon_n = \frac{\varepsilon_n}{\varepsilon_{n-1}} \cdot \frac{\varepsilon_{n-1}}{\varepsilon_{n-2}} \cdots \frac{\varepsilon_1}{\varepsilon_0}\, \varepsilon_0 = \lambda^n \varepsilon_0.
\]

The Aitken Δ² process [8] converts the sequence {An} to the sequence {T(A)n},

\[
\begin{aligned}
T(A)_n &= \frac{A_{n+1} A_{n-1} - A_n^2}{A_{n+1} - 2A_n + A_{n-1}} \\
&= A_{n+1} - \frac{(A_{n+1} - A_n)^2}{(A_{n+1} - A_n) - (A_n - A_{n-1})} \\
&= A_{n+1} - \frac{\Delta_n^2}{\Delta_n - \Delta_{n-1}}. 
\end{aligned} \tag{12.7.2}
\]

(Note that if the denominator is 0 for n ≥ 1, then {An} is constant.) Thus, if An = A + αλⁿ for some (possibly unknown) constants α, λ, with 0 < |λ| < 1, then the sequence {An} converges and the transformed sequence (12.7.2) is constant and immediately gives the limit A. Conversely, if the transformed sequence is constant, then An = A + αλⁿ for some α and λ, 0 < |λ| < 1. More generally, if An = A + αλⁿ + εn, where εn/λⁿ is small, then the sequence {T(A)n} can be expected to converge more rapidly than {An}; see Exercise 15.
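The last form of (12.7.2) is convenient for computation. The sketch below is an illustration with two sample sequences chosen for the purpose: a sequence with a single geometric transient, for which the transform is exact, and the partial sums of the Leibniz series for π/4, for which it only accelerates convergence. The function name `aitken` is introduced here.

```python
from fractions import Fraction
import math

def aitken(A):
    """T(A)_n = A_{n+1} - Delta_n^2 / (Delta_n - Delta_{n-1}) for n = 1, ..., len(A) - 2."""
    out = []
    for n in range(1, len(A) - 1):
        d_n, d_prev = A[n + 1] - A[n], A[n] - A[n - 1]
        out.append(A[n + 1] - d_n * d_n / (d_n - d_prev))
    return out

# One pure transient: A_n = 2 - (1/2)^n, so every transformed term equals the limit 2.
geometric = [2 - Fraction(1, 2) ** n for n in range(6)]
exact = aitken(geometric)

# Leibniz partial sums for pi/4: not a pure transient, but convergence improves.
leibniz, total = [], 0.0
for k in range(12):
    total += (-1) ** k / (2 * k + 1)
    leibniz.append(total)
accelerated = aitken(leibniz)
```

The function raises `ZeroDivisionError` when Δn = Δn−1, which is the vanishing-denominator case excluded in the text.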
Aitken’s process (12.7.2) is sometimes called the Shanks transformation, because of Shanks’s generalization to the case of more than one “transient,” as in the case of the series

\[
A = \sum_{n=0}^{\infty} [\alpha_1 \lambda_1^n + \alpha_2 \lambda_2^n]. \tag{12.7.3}
\]

To improve the convergence of series like (12.7.3), we look for a generalization of


the nonlinear transformation T .
We call a term in the remainder of the power series a transient, if it decays like λⁿ for some 0 < λ < 1 and other parts of the remainder decay more rapidly. If An has k distinct transient terms

\[
A_n = A + \sum_{j=1}^{k} \alpha_j \lambda_j^n,
\]

then A can be determined by the (2k + 1) terms An−k, An−k+1, · · · , An+k. The solution A of this system of 2k + 1 equations with 2k + 1 unknowns is the kth-order Shanks transformation, given by a ratio of determinants

\[
A = S_k(A)_n =
\frac{
\begin{vmatrix}
A_{n-k} & \cdots & A_{n-1} & A_n \\
\Delta A_{n-k} & \cdots & \Delta A_{n-1} & \Delta A_n \\
\Delta A_{n-k+1} & \cdots & \Delta A_n & \Delta A_{n+1} \\
\vdots & & \vdots & \vdots \\
\Delta A_{n-1} & \cdots & \Delta A_{n+k-2} & \Delta A_{n+k-1}
\end{vmatrix}
}{
\begin{vmatrix}
1 & \cdots & 1 & 1 \\
\Delta A_{n-k} & \cdots & \Delta A_{n-1} & \Delta A_n \\
\Delta A_{n-k+1} & \cdots & \Delta A_n & \Delta A_{n+1} \\
\vdots & & \vdots & \vdots \\
\Delta A_{n-1} & \cdots & \Delta A_{n+k-2} & \Delta A_{n+k-1}
\end{vmatrix}
},
\]

where ΔAp = Ap+1 − Ap. Note that S1(A)n = T(A)n.


The Taylor series for the function f(z) = 1/(z + 1)(z + 2) is a very slowly convergent series. The nth partial sum of this Taylor series is

\[
f_n(z) = \sum_{k=0}^{n} (-1)^k \left( 1 - \frac{1}{2^{k+1}} \right) z^k
= \frac{1}{(z + 2)(z + 1)} - \frac{(-z)^{n+1}}{z + 1} + \frac{(-z/2)^{n+1}}{z + 2}.
\]

The poles of f(z) at z = −1 and z = −2 affect the rate of convergence of fn(z) to f(z), and they are the origin of the two transients of fn(z). When Sk is applied to a sequence of partial sums whose convergence is governed by two transients, the result is exact, that is, Sk[fn(z)] = f(z) for all k ≥ 2. If the function f(z) has p simple poles, then its partial sums have p transient terms; see Exercise 16. Moreover Sk applied to the partial sums is exact for k ≥ p.
The higher order Shanks transformations Sk are closely related to Padé approximants. In fact, this treatment can be regarded as an alternative derivation of the Padé approximants. It can be shown that if k ≤ n, then Sk(An) is identical to the Padé approximant Pn(z)/Qk(z) of the series $A_1 + \sum_{j=1}^{\infty} (A_{j+1} - A_j) z^j$ evaluated at z = 1; see Exercise 17.


For large k, the determinants in the transformations Sk are not easily computed.
The ε-algorithm developed by Wynn [221], with techniques for implementation due
to Wynn and others, is used to make computation efficient.
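The text does not state the recursion, so the following sketch relies on the standard form of Wynn's ε-algorithm as an assumption: ε₋₁⁽ⁿ⁾ = 0, ε₀⁽ⁿ⁾ = An, and εₖ₊₁⁽ⁿ⁾ = εₖ₋₁⁽ⁿ⁺¹⁾ + 1/(εₖ⁽ⁿ⁺¹⁾ − εₖ⁽ⁿ⁾), with the even columns ε₂ₖ giving the Shanks transforms Sk. It is applied to five partial sums of the Taylor series of 1/((z + 1)(z + 2)) at the sample point z = 1, where the series itself diverges but the two-transient structure still holds.

```python
from fractions import Fraction

def wynn_epsilon(A):
    """Return the columns eps_0, eps_1, ... of the epsilon table built from A, using
    eps_{k+1}^{(n)} = eps_{k-1}^{(n+1)} + 1 / (eps_k^{(n+1)} - eps_k^{(n)})."""
    previous = [Fraction(0)] * (len(A) + 1)     # eps_{-1}^{(n)} = 0
    current = [Fraction(a) for a in A]          # eps_0^{(n)} = A_n
    columns = [current]
    while len(current) > 1:
        nxt = [previous[n + 1] + 1 / (current[n + 1] - current[n])
               for n in range(len(current) - 1)]
        previous, current = current, nxt
        columns.append(current)
    return columns

# Partial sums f_0(1), ..., f_4(1) of the Taylor series of 1/((z+1)(z+2)) at z = 1.
z = Fraction(1)
partial, total = [], Fraction(0)
for k in range(5):
    total += (-1) ** k * (1 - Fraction(1, 2 ** (k + 1))) * z ** k
    partial.append(total)

columns = wynn_epsilon(partial)
shanks_2 = columns[4][0]      # eps_4 corresponds to S_2 applied to the five partial sums
```

With two transients the second-order transform should reproduce f(1) = 1/6 exactly. A production implementation would also need to guard against equal neighbouring entries, which make the reciprocal undefined.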

12.8 Examples

The Stieltjes function

\[
f(z) = \int_0^\infty \frac{e^{-t}\,dt}{1 + zt}, \qquad |\arg z| < \pi
\]

has asymptotic expansion

\[
f(z) \sim \sum_{n=0}^{\infty} \left( \int_0^\infty t^n e^{-t}\,dt \right) (-z)^n = \sum_{n=0}^{\infty} n!\,(-z)^n; \tag{12.8.1}
\]

see Exercise 18. The series diverges for z ≠ 0. The standard way to obtain approximate values is by truncating the series after the minimal term. Since

\[
\frac{(n + 1)!\, z^{n+1}}{n!\, z^n} = (n + 1) z,
\]

this means summing to n ∼ |z|⁻¹.


Table 12.1 illustrates convergence of the diagonal Padé approximants for this
series.
Table 12.1 Stieltjes Series
n Pn (1)/Q n (1) Pn (10)/Q n (10)
6 0.59682 0.24256
7 0.59657 0.23284
8 0.59646 0.22593
9 0.59641 0.22086
10 0.59638 0.21706
50 0.59635 0.20156
∞ 0.59635 0.20146
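The limiting value in the first column can be bracketed with the continued fraction of Section 12.6. The sketch below assumes the classical coefficients c0 = 1 and cn = ⌈n/2⌉ for the series (12.8.1), that is, 1, 1, 1, 2, 2, 3, 3, …; the first six agree with what the recursion (12.3.10) produces from the coefficients (−1)ⁿ n!. By Corollary 12.6.5, odd convergents lie below f(1) and even convergents lie above it.

```python
from fractions import Fraction

def stieltjes_convergents(z, count):
    """F_0, ..., F_count for the assumed coefficients c_0 = 1, c_n = ceil(n/2), n >= 1."""
    p_prev, p = Fraction(0), Fraction(1)       # P_{-1}, P_0 = c_0
    q_prev, q = Fraction(1), Fraction(1)       # Q_{-1}, Q_0
    values = [p / q]
    for n in range(1, count + 1):
        cn = (n + 1) // 2                      # ceil(n/2)
        p_prev, p = p, p + cn * z * p_prev
        q_prev, q = q, q + cn * z * q_prev
        values.append(p / q)
    return values

F = stieltjes_convergents(Fraction(1), 21)
lower, upper = float(F[21]), float(F[20])      # odd convergent below, even convergent above
```

The interval from `lower` to `upper` should contain the tabulated limit 0.59635, and it shrinks as more coefficients are used.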

The next example exhibits a two-point expansion. Consider the function f(z) given by the integral

\[
f(z) = \frac{1}{2\sqrt{z}}\, e^{-z} \int_0^z \frac{e^t}{\sqrt{t}}\,dt, \tag{12.8.2}
\]

which is a solution to the differential equation 2z f′(z) = −(1 + 2z) f(z) + 1. This solution has power series expansions at both z = 0 and z = ∞. The differential equation allows one to compute the coefficients recursively, by plugging a proposed expansion into the equation and looking at the coefficient of z^k or z^{−k}. At z = 0 we obtain the Taylor series

\[
f(z) = \sum_{n=0}^{\infty} a_n z^n, \qquad a_n = \frac{(-4)^n n!}{(2n + 1)!}, \tag{12.8.3}
\]

which converges for all finite z. At z = ∞, we obtain the divergent asymptotic expansion

\[
f(z) \sim \sum_{n=1}^{\infty} \frac{b_n}{z^n}, \qquad b_n = \frac{2 (2n - 2)!}{4^n (n - 1)!}, \quad n \ge 1. \tag{12.8.4}
\]

Here we discuss only the diagonal Padé sequence Pn(z)/Qn(z), and use as input k = n + 1 terms of the Taylor series (12.8.3) at z = 0 and l = n terms of the asymptotic series (12.8.4) at z = ∞. Write

\[
P_n(z) = \sum_{k=0}^{n} A_k z^k, \qquad Q_n(z) = 1 + \sum_{k=1}^{n} B_k z^k.
\]

The approximation property (12.1.2) at z0 = 0 becomes the system of equations

\[
A_k = \sum_{j=0}^{k} a_{k-j} B_j, \qquad 0 \le k \le n. \tag{12.8.5}
\]

Similarly, the approximation property (12.1.13) at z1 = ∞ becomes, by looking at powers z^{−j}, the system

\[
A_k = \sum_{j=n-k}^{n} B_j b_{j-k}, \qquad 1 \le k \le n. \tag{12.8.6}
\]

In Table 12.2 we take $z = x$ to be real. The values of the diagonal Padé $R_n/S_n$
are given for the two-point approximants about $x = 0$ and $x = \infty$, the one-point
approximant $P_n/Q_n$ about $x = 0$, and $x^{-1}$ times the one-point approximant to $x f(x)$
at $x = \infty$. Note that for small $x$ ($< 5$) the two-point Padé is significantly more accurate
than the one-point Padé at $x = \infty$, while it is only slightly less accurate than
the one-point Padé at $x = 0$. For large $x$ ($> 50$), the two-point Padé is significantly
more accurate than the one-point Padé at $x = 0$, while it is only slightly less accurate
than the one-point Padé at $\infty$. In general, the two-point approximant gives a more
uniform approximation to $f(x)$.

Table 12.2 Two-point Padé approximation of f

x = 1
n     Two-point Padé   One-point Padé at 0   One-point Padé at ∞
5     0.538045407      0.538079506           -1.436
6     0.538069836      0.538079506           1.783
7     0.538078314      0.538079506           0.973
8     0.538079573      0.538079506           0.745
Exact value of f(1) = 0.538079506

x = 16
n     Two-point Padé   One-point Padé at 0   One-point Padé at ∞
7     0.03237          0.03203               0.032336
8     0.03069          0.03239               0.032337
9     0.03240          0.03233               0.032336
10    0.03235          0.03234               0.032337
Exact value of f(16) = 0.032337000

x = 256
n     Two-point Padé   One-point Padé at 0   One-point Padé at ∞
5     0.001956983      -0.284                0.001956962
6     0.001956964      0.242                 0.001956962
7     0.001956962      -0.198                0.001956962
8     0.001956963      0.167                 0.001956962
Exact value of f(256) = 0.001956962

The next example is the Stirling series. It is well known that the gamma function has
the Stirling series expansion

\[
\Gamma(z) \sim e^{-z} z^{z} \left(\frac{2\pi}{z}\right)^{1/2} \left[1 + \frac{1}{12z} + \frac{1}{288z^2} + \cdots\right] \tag{12.8.7}
\]

as $z \to \infty$ in $|\arg z| \le \pi - \delta < \pi$; see [160, p. 294]. By transforming this series into
a sequence of Padé approximants, the applicability of the Stirling series is increased.
It can now be used to compute $\Gamma(z)$ to greater accuracy than can be obtained by the
optimal truncation of the Stirling series. Let $P_n(x)/Q_n(x)$ denote the diagonal Padé
approximants of the Stirling series in (12.8.7). Table 12.3 provides some numerical
values of the function

\[
\left(\frac{x}{e}\right)^{x} \sqrt{\frac{2\pi}{x}}\; \frac{P_n(1/x)}{Q_n(1/x)}.
\]

Table 12.3 Stirling Series

n          x = 0.2    x = 0.5
10         4.46010    1.77180
11         4.69419    1.77297
12         4.47753    1.77199
13         4.68203    1.77283
14         4.49052    1.77211
15         4.67269    1.77274
Γ_opt(x)   4.71183    1.76224
Γ(x)       4.59084    1.77245

Our final example here is the Bessel function

\[
J_0(2x) = 1 - x^2 + \frac{x^4}{4} - \frac{x^6}{36} + \cdots, \tag{12.8.8}
\]

and its Padé approximation $P_n(x)/Q_m(x)$. When $n = m = 1$, the inequality (12.8.9)
written out in full gives

\[
\left| \left(1 - x^2 + \frac{x^4}{4} - \frac{x^6}{36} + \cdots\right)(1 + q_1 x) - (p_0 + p_1 x) \right| \le M|x|^{\rho}, \tag{12.8.9}
\]

where we have as usual $q_0 = 1$. After regrouping, we obtain

\[
|(1 - p_0) + (q_1 - p_1)x - x^2 + \cdots| \le M|x|^{\rho}.
\]

Taking $p_0 = 1$ and $p_1 = q_1$, the left-hand side is $O(x^2)$ for small $x$. Hence, the exponent
$\rho$ on the right-hand side can be at most 2, and is one less than the expected value
$n + m + 1 = 3$. Furthermore, the Padé approximation in this case simply reduces to
$J_0(2x) \approx 1$.
Let us write the quantity inside the absolute value sign on the left-hand side of
(12.8.9) as

\[
f(z) Q_m(z) - P_n(z) = \sum_{k=0}^{\infty} c_k z^k,
\]

which is possible as long as $f(z)$ has a power series expansion. The inequality
in (12.8.9) requires $c_0 = c_1 = \cdots = c_{\rho-1} = 0$ with $\rho$ as big as possible. This new
representation gives an indication of the error in the Padé approximation. Once the
coefficients in the power series $\sum c_k z^k$ are known, we have

\[
f(z) - \frac{P_n(z)}{Q_m(z)} = \frac{c_\rho z^\rho + \cdots}{Q_m(z)}. \tag{12.8.10}
\]

In our case, $f(z) = J_0(2z)$ and we take $(n, m) = (2, 4)$. The last equation becomes

\[
\left(1 - x^2 + \frac{x^4}{4} - \frac{x^6}{36} + \frac{x^8}{576} - \cdots\right)(1 + q_1 x + q_2 x^2 + q_3 x^3 + q_4 x^4) - (p_0 + p_1 x + p_2 x^2)
= c_0 + c_1 x + c_2 x^2 + \cdots + c_{\rho-1} x^{\rho-1} + c_\rho x^\rho + \cdots.
\]

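Collecting terms on the left-hand side gives

\begin{align*}
&(1 - p_0) + (q_1 - p_1)x + (q_2 - 1 - p_2)x^2 + (q_3 - q_1)x^3 \\
&\quad + \left(q_4 - q_2 + \frac{1}{4}\right)x^4 + \left(\frac{q_1}{4} - q_3\right)x^5 + \left(-\frac{1}{36} + \frac{q_2}{4} - q_4\right)x^6 \\
&\quad + \left(-\frac{q_1}{36} + \frac{q_3}{4}\right)x^7 + \left(\frac{1}{576} - \frac{q_2}{36} + \frac{q_4}{4}\right)x^8 + \cdots.
\end{align*}

Taking $q_0 = p_0 = 1$, $q_1 = p_1 = q_3 = 0$, $q_2 = 8/27$, $p_2 = -19/27$, $q_4 = 5/108$
results in coefficients of $x^0, x^1, \cdots, x^7$ all vanishing, and we have

\[
c_8 = \frac{1}{576} - \frac{q_2}{36} + \frac{q_4}{4} = \frac{79}{15{,}552}.
\]

In fact, we can compute as many $c_k$ as we wish. The Padé approximant $P_2(x)/Q_4(x)$
of the function $J_0(2x)$ is thus given by

\[
J_0(2x) = \frac{1 - \frac{19}{27}x^2}{1 + \frac{8}{27}x^2 + \frac{5}{108}x^4} + \frac{\frac{79}{15{,}552}x^8 + \cdots}{1 + \frac{8}{27}x^2 + \frac{5}{108}x^4}.
\]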

As an application of this approximation, let us try to approximate the first zero of
$J_0(2x)$, which is known to be $\pm 1.202$. By solving the simple equation

\[
1 - \frac{19}{27}x^2 = 0,
\]

we obtain $x = \pm 1.192$, whereas the first positive zero obtained from the first three
terms of the power series in (12.8.8) gives 1.414.
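The coefficient matching for the $(2, 4)$ approximant is easy to confirm in exact arithmetic. The sketch below (names such as `j0_coeff` and `residual_coeffs` are illustrative) multiplies the Taylor series of $J_0(2x)$ by $Q_4$, subtracts $P_2$, and inspects the resulting coefficients $c_k$.

```python
from fractions import Fraction
from math import factorial, sqrt

def j0_coeff(power):
    """Coefficient of x^power in J0(2x) = sum (-1)^k x^(2k) / (k!)^2."""
    if power % 2:
        return Fraction(0)
    k = power // 2
    return Fraction((-1) ** k, factorial(k) ** 2)

# Coefficients of P_2 and Q_4 as given in the text.
p = [Fraction(1), Fraction(0), Fraction(-19, 27)]
q = [Fraction(1), Fraction(0), Fraction(8, 27), Fraction(0), Fraction(5, 108)]

def residual_coeffs(n_terms):
    """Coefficients c_k of f(x) Q_4(x) - P_2(x) for 0 <= k < n_terms."""
    c = []
    for k in range(n_terms):
        conv = sum(q[j] * j0_coeff(k - j) for j in range(min(k, 4) + 1))
        c.append(conv - (p[k] if k < len(p) else 0))
    return c

c = residual_coeffs(9)
zero_estimate = sqrt(27 / 19)   # positive root of 1 - (19/27) x^2 = 0
print(c[8], zero_estimate)
```

The first eight coefficients vanish and `c[8]` is the fraction 79/15552 quoted above.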

12.9 Continued fraction expansions of e x

At the end of Section 12.2 we established the identity

\[
\sum_{k=1}^{n} \frac{1}{D_k} = \cfrac{1}{D_1 - \cfrac{D_1^2}{D_1 + D_2 - \cfrac{D_2^2}{D_2 + D_3 - \cdots - \cfrac{D_{n-1}^2}{D_{n-1} + D_n}}}}.
\]

It can be used to compute a continued fraction expansion for any power series. As
an example, let us compute such an expansion of the exponential function:

\[
e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + \frac{1}{x^{-1}} + \frac{1}{2x^{-2}} + \cdots.
\]

By Theorem 12.2.6, we have

\[
e^x = \cfrac{1}{1 - \cfrac{1}{1 + x^{-1} - \cfrac{x^{-2}}{x^{-1} + 2x^{-2} - \cdots - \cfrac{(n!\,x^{-n})^2}{n!\,x^{-n} + (n+1)!\,x^{-n-1} - \cdots}}}}.
\]

Multiplying the first denominator by $x/x$ changes it to

\[
\cfrac{x}{x + 1 - \cfrac{1/x}{1/x + 2/x^2 - \cfrac{4/x^4}{2/x^2 + 6/x^3 - \cdots}}}
= \cfrac{x}{x + 1 - \cfrac{x}{x + 2 - \cfrac{4/x^2}{2/x^2 + 6/x^3 - \cdots}}}.
\]

Continuing to simplify in this way leads to

\[
e^x = \cfrac{1}{1 - \cfrac{x}{x + 1 - \cfrac{x}{x + 2 - \cfrac{2x}{x + 3 - \cfrac{3x}{x + 4 - \cdots}}}}} \tag{12.9.1}
\]

This converges for each $x \in \mathbb{C}$; see Exercise 19.
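A truncated version of (12.9.1) can be evaluated from the bottom up. The sketch below is illustrative only: the function name `exp_cf` and the truncation rule (innermost denominator replaced by $x + \text{depth}$) are choices made here. With that rule the truncation reproduces the Taylor partial sum through $x^{\text{depth}}/\text{depth}!$, since the continued fraction was derived from the partial sums.

```python
import math

def exp_cf(x, depth):
    """Evaluate (12.9.1) truncated so that the innermost denominator
    is x + depth, working from the innermost level outward."""
    tail = x + depth
    for k in range(depth - 1, 0, -1):
        # Level k of the fraction: x + k - k x / (level k + 1).
        tail = x + k - k * x / tail
    return 1.0 / (1.0 - x / tail)

print(exp_cf(1.0, 1), exp_cf(1.0, 2), exp_cf(1.0, 25), math.e)
```

For $x = 1$, depths 1 and 2 give $1 + x = 2$ and $1 + x + x^2/2 = 2.5$. Some small depths hit a zero denominator for particular $x$ (for example $x = -2$ with depth 2), so in practice one would guard the division.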


Consider now the expansion of the form (12.3.3) for the exponential function.
Computing the constants $\{c_k\}$ as above, it can be shown that

\[
c_0 = -c_1 = 1, \qquad c_{2n} = \frac{1}{4n-2}, \qquad c_{2n+1} = -\frac{1}{4n+2}, \qquad n = 1, 2, 3, \ldots.
\]

For these coefficients, (12.3.5) is

\[
Q_{2m+1} = Q_{2m} - \frac{z}{4m+2}\, Q_{2m-1}; \tag{12.9.2}
\]
\[
Q_{2m} = Q_{2m-1} + \frac{z}{4m-2}\, Q_{2m-2}, \qquad m \ge 1. \tag{12.9.3}
\]

Replacing $m$ by $m + 1$ in (12.9.3) gives

\[
Q_{2m+1} = Q_{2m+2} - \frac{z}{4m+2}\, Q_{2m}.
\]

Substituting this equation into (12.9.2), and using (12.9.3) once more to eliminate $Q_{2m-1}$, we obtain

\[
Q_{2m+2} - Q_{2m} = \frac{z^2}{16m^2 - 4}\, Q_{2m-2}, \qquad m \ge 1, \tag{12.9.4}
\]

which is a second-order linear difference equation in $T_m = Q_{2m}$. We look for an
asymptotic solution of the form

\[
Q_{2m} \sim 1 + \sum_{k=1}^{\infty} \frac{a_k}{m^k}, \qquad m \to \infty.
\]

The coefficients $a_k$ can be determined by a recurrence relation; see Exercise 21. One
can find $a_1$ by asymptotic matching. Since

\[
Q_{2m+2} - Q_{2m} \sim -\frac{a_1}{m^2} + \cdots,
\]

it follows from (12.9.4) that $a_1 = -z^2/16$. Hence, we have

\[
Q_{2m} = 1 - \frac{z^2}{16m} + O\!\left(\frac{1}{m^2}\right).
\]
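The value $a_1 = -z^2/16$ can be observed numerically by iterating (12.9.4). The sketch below is an illustration under an explicit assumption: the starting values $Q_0 = 1$ and $Q_2 = 1 - z/2$ are chosen here for the experiment and are not taken from the text. Any solution with a nonzero limit behaves like a constant multiple of $1 - z^2/(16m)$, so the quantity $m\,(Q_{2m}/Q_{2M} - 1)$ with $M \gg m$ should be close to $-z^2/16$.

```python
def q_even(z, m_max):
    """Iterate (12.9.4), Q_{2m+2} = Q_{2m} + z^2/(16 m^2 - 4) Q_{2m-2},
    and return [Q_0, Q_2, ..., Q_{2 m_max}]."""
    q = [1.0, 1.0 - z / 2.0]   # assumed starting values Q_0 and Q_2
    for m in range(1, m_max):
        q.append(q[m] + z * z / (16 * m * m - 4) * q[m - 1])
    return q

def a1_estimate(z, m=200, big_m=20000):
    """Estimate a_1 from m * (Q_{2m} / Q_{2M} - 1) with M much larger than m."""
    q = q_even(z, big_m)
    return m * (q[m] / q[big_m] - 1.0)

print(a1_estimate(1.0), -1.0 / 16.0)
```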

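For any non-zero function $C(z)$, the product $C(z)Q_{2m}$ is also a solution of the difference
equation (12.9.4). Therefore

\[
Q_{2m}(z) = C(z)\left[1 - \frac{z^2}{16m} + O\!\left(\frac{1}{m^2}\right)\right], \qquad m \to \infty. \tag{12.9.5}
\]

Coupling (12.9.5) and (12.9.3) gives

\[
Q_{2m-1} = C(z)\left[1 - \frac{z}{4m} - \frac{z^2}{16m} + O\!\left(\frac{1}{m^2}\right)\right], \qquad m \to \infty. \tag{12.9.6}
\]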

From (12.9.5), (12.9.6), and the identity (12.2.7), and with the aid of Stirling's approximation
(2.10.5) and the duplication formula for the gamma function:

\[
(n-1)! = \Gamma(n) \sim \left(\frac{n}{e}\right)^{n} \left(\frac{2\pi}{n}\right)^{1/2}; \qquad
\Gamma(2n) = \frac{2^{2n-1}}{\sqrt{\pi}}\, \Gamma\!\left(n + \tfrac{1}{2}\right) \Gamma(n),
\]

we obtain

\[
\frac{P_n}{Q_n} - \frac{P_{n-1}}{Q_{n-1}} \sim D(z)\, \frac{\sigma_n z^n \sqrt{n}}{2^n\, n!}, \qquad n \to \infty, \tag{12.9.7}
\]

where $D(z)$ is a function of $z$ (independent of $n$) and $\sigma_{4n} = \sigma_{4n+1} = 1$, $\sigma_{4n+2} = \sigma_{4n+3} = -1$. In comparison, if $T_n(z)$ is the $n$th partial sum of the Taylor series of $e^z$,

\[
T_n - T_{n-1} = \frac{z^n}{n!}. \tag{12.9.8}
\]

There is an extra factor of $2^{-n}$ in (12.9.7) relative to (12.9.8), so the Padé approximants
$R_n = P_n/Q_n$ converge to their limit much faster than the Taylor series $T_n(z)$.
However, we have not shown that the limit of $R_n(z)$ is indeed $e^z$; this problem is left
as Exercise 22.

Exercises

1. Suppose f (z) = 1/(1 − z). (a) Prove that, for n ≥ 1, the [m, n] Padé approxi-
mant of f at 0 is f .
(b) Find all solutions of (12.1.4) in the case m = n = 2.
2. Suppose f is a rational function. Prove that for m and n sufficiently large, the
[m, n] Padé approximant of f at 0 is f .
3. Let $f(z) \sim \sum_{n=0}^{\infty} a_n z^n$ as $z \to 0$, with $a_0 \ne 0$. Let $P_m^n(z)$ be the Padé approximant
to $f(z)$. Prove that $1/P_m^n(z)$ is the Padé approximant to $1/f(z)$.

4. Verify the steps in the proof of Theorem 12.1.1.


5. Formulate a set of equations for the coefficients of a two-point Padé approxima-
tion Pn (z)/Q n (z) to a function f (z) having the asymptotic expansions (12.1.12),
(12.1.13). Hint: The result is analogous to (12.1.5).
6. Derive an efficient numerical method for computing two-point Padé approximants
as in Exercise 5. Hint: Modify the continued fraction development of
the one-point Padé approximation in Section 12.3.
7. Verify Proposition 12.2.1.
8. Prove that the function $f$ of (12.3.1) is normal if and only if the sequence of
Padé approximants $\{R_m^m, R_{m+1}^m\}$ is normal.
9. Provide a proof for Theorem 12.4.2.
10. Prove Proposition 12.5.1.
11. Prove that the moments μn in Theorem (12.5.3) are finite and given by
μn = (−1)n cn+1 for each n = 0, 1, 2, · · · , where cn are the coefficients in the
asymptotic expansion of f (z) in that theorem. Hint: Use induction.
12. Verify the characterization of Stieltjes functions, Theorem 12.6.1.
13. Verify the inequalities (12.6.2) and (12.6.3).

14. Verify (12.7.1).


15. Suppose that εn /λ → 0. Estimate the rate of convergence of (12.7.2) compared
to that of An .
16. Show that if the only singularities of f (z) in the finite z-plane are l simple poles,
then the remainder in the Taylor series for f (z) has l transient terms.
17. Show that if $k \le n$, the $k$th-order Shanks transform $S_k(A_n)$ is identical to $P_k^n(1)$,
where $P_k^n(z)$ is the Padé approximant of the series $A_1 + \sum_{j=1}^{\infty} (A_{j+1} - A_j)\, z^j$.

18. Verify the asymptotic expansion (12.8.1).


19. Prove convergence of (12.9.1).
20. Find an asymptotic expansion for the solutions to the linear difference equation
in (12.9.4).
21. Use (12.9.4) to derive a recurrence relation for the coefficients of Q 2m .
22. (a) Show that the Padé approximants $P_n^n(z)$ and $P_{n+1}^n(z)$ of $e^z$ converge to
$e^z$ as $n \to \infty$. Hint: Let $F_n(z) = R_n(z)/S_n(z)$ be the $n$th member of the Padé
sequence $P_0^0, P_1^0, P_1^1, \cdots$. Since $S_n(z)e^z - R_n(z) = O(z^n)$ as $z \to 0$ and $S_n$
and $R_n$ are polynomials, use Cauchy's theorem to show that

\[
e^z - F_n(z) = \frac{z^n}{2\pi i\, S_n(z)} \int_L \frac{S_n(t)\, e^t}{(t - z)\, t^n}\,dt,
\]

where $L$ is any contour on which $|z| < |t|$. Then, use (12.9.6) to show that
$S_n(z) \to C(z)$ as $n \to \infty$, where $C(z)$ is a finite function. Use this result to
prove that $F_n(z) \to e^z$ as $n \to \infty$, provided that $C(z) \ne 0$.
(b) Show that $C(z) = e^{z/2}$ in (12.9.5).

Remarks and further reading

The standard treatise on Padé approximants is Baker and Graves-Morris [16]. For a
full, up-to-date account of the computational aspects of the subject, its history, and
related developments, see Brezinski and Redivo-Zaglia [31], [32]. A version of Padé
approximants was developed by Hermite to prove that e is transcendental, and further
adapted by Lindemann to prove that π is transcendental. See Van Assche [209] for
a discussion of Padé and Hermite–Padé approximation.
The topics in this chapter have many applications. For more on orthogonal poly-
nomials, see Khrushchev [121] and Ismail [114]. Khinchine [120] is the standard
introduction to continued fractions. A more contemporary reference is Hensley [103].
Sauer [183] treats continued fractions and signal processing. Cuyt et al. [50] contain
continued fraction expansions of many functions, and related information important
for applications.
The classic text on the moment problem and its ramifications is Akhiezer [9].
Schmüdgen [186] contains recent developments.
Chapter 13
Riemann–Hilbert problems

In his thesis, Riemann considered the following problem: given a Jordan curve $\Gamma$
in $\mathbb{C}$ that bounds a domain $\Omega$, and given real-valued functions $a$, $b$, $c$ on $\Gamma$, find a
function $W = U + iV$ holomorphic in $\Omega$ and continuous to the boundary $\Gamma$, such
that

\[
aU = bV + c \tag{13.0.1}
\]

on $\Gamma$. Hilbert later generalized the problem by allowing the functions $a$, $b$, $c$ to be
complex-valued.
A related problem is known as the Riemann–Hilbert factorization problem:
given a matrix-valued function $V$ on $\Gamma$, find $M_+$ holomorphic on the unbounded
component of the complement of $\Gamma$ and $M_-$ holomorphic on the bounded component,
such that on $\Gamma$ we have

\[
M_+ = V M_-. \tag{13.0.2}
\]

In this chapter we focus on the classical Riemann–Hilbert problem (13.0.1), and
its relation to integral transforms and integral equations. The key ingredient is the
Cauchy transform

\[
F(z) = \frac{1}{2\pi i} \int_\Gamma \frac{f(t)}{t - z}\,dt, \qquad z \notin \Gamma,
\]

and its limits F± (t) as z approaches the curve Γ from one side or the other. The for-
mulas of Sokhotski and Plemelj for these limits are proved in Section 13.1. Section
13.2 covers Carleman’s approach to the Riemann–Hilbert problem. The remain-
ing sections give some applications of the Riemann–Hilbert problem: to integral
transforms in Section 13.3, and to integral equations in Sections 13.4, 13.5, 13.6,
and 13.7.
This chapter could as well have been titled “A Riemann–Hilbert problem.” A
rather different problem, which is also commonly referred to as “the Riemann–Hilbert
problem,” comes up as number 21 in Hilbert’s famous list of 23 problems
[106]. In Section 13.8 we describe this other problem.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
R. Beals and R. S. C. Wong, More Explorations in Complex Functions, Graduate Texts
in Mathematics 298, https://doi.org/10.1007/978-3-031-28288-1_13

13.1 The Sokhotski–Plemelj formula

Consider the Cauchy integral

\[
F(z) = \frac{1}{2\pi i} \int_\Gamma \frac{f(t)}{t - z}\,dt, \tag{13.1.1}
\]

where $\Gamma$ is a finite simple oriented $C^1$ curve in the complex plane, and $f : \Gamma \to \mathbb{C}$
is piecewise continuous. The curve $\Gamma$ may be either a finite arc or a closed contour.
Then F defined by (13.1.1) is holomorphic on the complement of Γ . Let t0 be a point
on Γ , but not an end point. We wish to determine the value of the limit of F(z) as
z → t0 . The answer was found by Sokhotski [193] in 1873. It was rediscovered by
Plemelj [168] in 1908, in his work on the Riemann–Hilbert problem.
Specifically, let n = n(t0 ) be a vector normal to Γ at t0 and pointing to the left
with respect to the direction along the curve. We start by considering F(t0 ± εn) as
ε ↓ 0.
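Theorem 13.1.1. Under the preceding conditions, suppose that $f$ is Hölder continuous
at $t_0$, i.e. there are constants $C > 0$ and $0 < \alpha < 1$ such that for $t \in \Gamma$,

\[
|f(t) - f(t_0)| \le C|t - t_0|^{\alpha}.
\]

Then the limits $F_\pm(t_0) = \lim_{\varepsilon \to 0} F(t_0 \pm \varepsilon n)$ exist, and

\[
F_\pm(t_0) = \pm\frac{1}{2} f(t_0) + \frac{1}{2\pi i}\, \mathrm{p.v.}\!\int_\Gamma \frac{f(t)}{t - t_0}\,dt. \tag{13.1.2}
\]

The principal value integral here is defined by

\[
\mathrm{p.v.}\!\int_\Gamma \frac{f(t)}{t - t_0}\,dt = \lim_{\delta \to 0} \int_{t \in \Gamma,\ |t - t_0| > \delta} \frac{f(t)}{t - t_0}\,dt.
\]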

Proof. For convenience we translate and rotate the coordinate system so that $t_0 = 0$
and $\Gamma$ is tangent to the real axis in the positive direction. The unit normal is $n = i$.
It follows that

\[
F(t_0 \pm \varepsilon n) = F(\pm\varepsilon i) = \frac{1}{2\pi i} \int_\Gamma \frac{f(t)}{t \mp i\varepsilon}\,dt.
\]

A simple calculation gives

\[
F(\varepsilon i) - F(-\varepsilon i) = \frac{1}{\pi} \int_\Gamma \frac{\varepsilon f(t)}{t^2 + \varepsilon^2}\,dt.
\]

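Given $\varepsilon > 0$, let $\Gamma_\varepsilon = \Gamma \cap \{t : |t - t_0| < \varepsilon^{1/4}\}$. Then on $\Gamma \setminus \Gamma_\varepsilon$, the last integrand
is dominated by $\varepsilon^{1/2}$, so we may concentrate on $\Gamma_\varepsilon$. For small $\varepsilon$, $t \in \Gamma_\varepsilon$ implies that
$\operatorname{Im} t = o(t)$, so

\begin{align*}
\frac{1}{\pi} \int_{\Gamma_\varepsilon} \frac{\varepsilon f(t)\,dt}{t^2 + \varepsilon^2}
&\sim \frac{1}{\pi} \int_{-\varepsilon^{1/4}}^{\varepsilon^{1/4}} \frac{\varepsilon f(t)\,dt}{t^2 + \varepsilon^2}
= \frac{1}{\pi} \int_{-\varepsilon^{-3/4}}^{\varepsilon^{-3/4}} \frac{f(\varepsilon x)}{x^2 + 1}\,dx \\
&\sim f(0) \left[\frac{1}{\pi} \int_{-\infty}^{\infty} \frac{dx}{x^2 + 1}\right] = f(0)
\end{align*}

as $\varepsilon \to 0$. (Note that here we used only continuity at $t_0$, not the Hölder condition.)
Similarly

\[
F(\varepsilon i) + F(-\varepsilon i) = \frac{1}{i\pi} \int_\Gamma \frac{t f(t)\,dt}{t^2 + \varepsilon^2}.
\]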

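Here we set $\Gamma_\delta = \{t \in \Gamma : |t| < \delta\}$. Clearly

\[
\lim_{\varepsilon \to 0} \frac{1}{i\pi} \int_{\Gamma \setminus \Gamma_\delta} \frac{t f(t)\,dt}{t^2 + \varepsilon^2} = \frac{1}{\pi i} \int_{\Gamma \setminus \Gamma_\delta} \frac{f(t)\,dt}{t}.
\]

Now

\[
\int_{\Gamma_\delta} \frac{t f(t)\,dt}{t^2 + \varepsilon^2} = \int_{\Gamma_\delta} \frac{t\,[f(t) - f(0)]\,dt}{t^2 + \varepsilon^2} + f(0) \int_{\Gamma_\delta} \frac{t\,dt}{t^2 + \varepsilon^2}.
\]

Because of the Hölder continuity assumption, the first integral on the right has a
limit $l(\delta)$ as $\varepsilon \to 0$, and $l(\delta) \to 0$ as $\delta \to 0$. In the second integral on the right, we
note that the integrand is the derivative of $\log\sqrt{t^2 + \varepsilon^2}$. For small $\delta$ the endpoints of the
path of integration are $\pm\delta + r_\pm$, where $r_\pm = o(\delta)$. Therefore as $\varepsilon \to 0$ the second
integral is

\[
\log(\delta + r_+) - \log(\delta + r_-) = \log\frac{\delta + r_+}{\delta + r_-} = \log(1 + o(\delta)) = o(\delta).
\]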

At this point we have proved that the difference and the sum of $F(t_0 \pm \varepsilon n)$ have
limits. Therefore the individual limits $F_\pm(t_0)$ exist and satisfy

\[
F_+(t_0) - F_-(t_0) = f(t_0), \qquad F_+(t_0) + F_-(t_0) = \frac{1}{\pi i}\, \mathrm{p.v.}\!\int_\Gamma \frac{f(t)\,dt}{t - t_0}. \tag{13.1.3}
\]

Solving (13.1.3) for $F_+$ and $F_-$ gives (13.1.2).
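The jump and sum relations can be observed numerically in a simple case. The sketch below is an illustration, not part of the proof: it takes $\Gamma = [-1, 1]$ oriented from $-1$ to $1$ (so the left side is the upper half-plane), $f(t) = e^t$, $t_0 = 0$, and approximates the Cauchy integral by a midpoint rule; the function name `cauchy_integral`, the step count and the value of $\varepsilon$ are choices made here.

```python
import math

def cauchy_integral(f, z, n=200_000):
    """Midpoint-rule approximation of (1/(2 pi i)) * integral over [-1, 1]
    of f(t)/(t - z) dt, for z off the segment."""
    h = 2.0 / n
    total = 0j
    for k in range(n):
        t = -1.0 + (k + 0.5) * h
        total += f(t) / (t - z)
    return total * h / (2j * math.pi)

eps = 1e-3
f_plus = cauchy_integral(math.exp, 1j * eps)    # approach from the left side
f_minus = cauchy_integral(math.exp, -1j * eps)  # approach from the right side
jump = f_plus - f_minus     # should be close to f(0) = 1
avg_sum = f_plus + f_minus  # should be close to (1/(pi i)) p.v. integral of e^t / t
print(jump, avg_sum)
```

Here the principal value integral of $e^t/t$ over $[-1, 1]$ equals $2\,\mathrm{Shi}(1) \approx 2.1145$, so the sum should be near $-0.673\,i$; both quantities differ from their limits by terms of order $\varepsilon$.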

If we assume that f is uniformly Hölder continuous on the curve,

| f (t) − f (s)| ≤ C|t − s|α , all t, s ∈ Γ, (13.1.4)

where again C, α > 0, then the pointwise result above can be converted to a uniform
result.

Theorem 13.1.2. Under the assumptions of Theorem 13.1.1 and the additional
assumption (13.1.4), if Γ is a closed curve then F is continuous from either side
of Γ up to Γ itself. If Γ is an arc, then F is continuous up to Γ minus the endpoints.
The formulas (13.1.3) hold at each point t0 ∈ Γ that is not an endpoint of Γ .

Proof. Suppose that t0 ∈ Γ is not an endpoint. For some sufficiently small disk
D2r (t0 ), and for ε < r , the sets
I± (ε) = {t ± εn(t) : t ∈ Γ, |t − t0 | < r }

are arcs that are approximately parallel to Γ at distance ε. The function F is uniformly
continuous along each such arc. The previous argument shows that F converges at
a uniform rate along each of the normal vectors ±n(t), so F is continuous up to
{t ∈ Γ : |t − t0 | < r }.

Let us consider the behavior of (13.1.1) at the endpoint a of a C 1 arc Γ .

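Theorem 13.1.3. Suppose that $\Gamma$ is a simple $C^1$ arc from $a$ to $b$ and $f : \Gamma \to \mathbb{C}$ is
piecewise continuous on $\Gamma$ and continuous at the endpoints. Then

\[
F(z) \sim
\begin{cases}
-\dfrac{1}{2\pi i} \log(a - z) \cdot f(a), & \text{as } z \to a, \\[2ex]
\dfrac{1}{2\pi i} \log(b - z) \cdot f(b), & \text{as } z \to b.
\end{cases} \tag{13.1.5}
\]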

Proof. Consider the endpoint $a$. Let $\tilde\Gamma$ be an extension of $\Gamma$ past $b$ to a simple $C^1$
curve from $a$ to $\infty$. We choose a branch of the logarithm on the complement of $\tilde\Gamma$
and extend it to $\tilde\Gamma$ by taking the limit from the left. Then

\[
F(z) = \frac{1}{2\pi i} \int_\Gamma \frac{f(t)}{t - z}\,dt = f(a)\, \frac{1}{2\pi i} \int_\Gamma \frac{dt}{t - z} + \frac{1}{2\pi i} \int_\Gamma \frac{f(t) - f(a)}{t - z}\,dt.
\]

The first integral on the right is

\[
\frac{1}{2\pi i} \log\frac{b - z}{a - z} \cdot f(a) + O(1)
\]

for any choice of the branch of the logarithm. To estimate the second integral, we ease
notation by assuming that the coordinates were chosen so that $a = 0$. The integral is

\[
\frac{1}{2\pi i} \int_\Gamma \frac{g(t)}{t - z}\,dt, \qquad g(t) = f(t) - f(0).
\]

Then $g$ is also Hölder continuous, and $g(0) = 0$. Given $z \in \mathbb{C}$, if $|z - t| > |t|/3$ for
all $t \in \Gamma$, then Hölder continuity implies that

\[
\left| \int_\Gamma \frac{g(t)\,dt}{t - z} \right| < C \int_\Gamma \frac{|t|^{\alpha}}{|t|/3}\,|dt| = C_1,
\]

a constant. Otherwise there is a $t \in \Gamma$ such that $|z - t| \le |t|/3$. Let $s = s(z)$ be the point
of $\Gamma$ that is closest to $z$. Then $|s| \ge |t| - |t|/3$ and

\[
|z - s| \le |z - t| \le \frac{|t|}{3} \le \frac{|s|}{2}; \tag{13.1.6}
\]

see Figure 13.1.

Fig. 13.1 a = 0, |z − s| ≤ |t|/3

Suppose for now that $\Gamma$ is a straight line segment, which we may take to be
$[0, c] \subset \mathbb{R}$. Then $t \in \Gamma$ implies $|z - t|^2 = |t - s|^2 + |z - s|^2$, so $|z - t| \ge |t - s|$
and

\[
\left| \int_\Gamma \frac{g(t)\,dt}{t - z} \right| \le \int_\Gamma \frac{|g(t) - g(s)|}{|t - s|}\,|dt| + \left| \int_\Gamma \frac{g(s)\,dt}{t - z} \right|.
\]

Again, Hölder continuity of $g$ yields a bound for the first integral that is independent
of $s$. The second integral is

\[
\log(t - z)\Big|_0^c\; g(s) = O(s^{\alpha} \log(-z)) = O(|z|^{\alpha} \log|z|),
\]

since $|z| \sim s$, by (13.1.6).

In the general case, it is enough to restrict attention to a small neighborhood of
$a$ in which $\Gamma$ is sufficiently close to an interval so that we can conclude that $t \in \Gamma$
implies $|z - t| \ge \frac{1}{2}|t - s|$, where $s$ is again the point of $\Gamma$ that is closest to $z$. Then
the previous argument goes through. Let us note explicitly that this argument allows
for $|z - s| = 0$, i.e. $z \in \Gamma$, $z \ne a$.
The argument for the endpoint b is the same: simply reverse the direction of travel
on Γ and extend Γ past a to select a branch of the logarithm.

If we allow $\Gamma$ to be an infinite contour, then some restriction on $f$ needs to be
made to ensure that $F$ is defined on the complement of $\Gamma$, such as

\[
\int_\Gamma \frac{|f(t)|}{1 + |t|}\,|dt| < \infty. \tag{13.1.7}
\]

With such a restriction, the previous results hold. In (13.1.4) we may allow the
constant $C$ to grow as $|t| \to \infty$.
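Remark. As another generalization, we can permit the curve $\Gamma$ to have a corner
at $t_0$; see Figure 13.2. If the angle is $\theta$, then

\[
F_+(t_0) = \left(1 - \frac{\theta}{2\pi}\right) f(t_0) + \frac{1}{2\pi i}\, \mathrm{p.v.}\!\int_\Gamma \frac{f(t)}{t - t_0}\,dt;
\]
\[
F_-(t_0) = -\frac{\theta}{2\pi}\, f(t_0) + \frac{1}{2\pi i}\, \mathrm{p.v.}\!\int_\Gamma \frac{f(t)}{t - t_0}\,dt. \tag{13.1.8}
\]

See Exercise 2.

Fig. 13.2 Corner at t0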

A problem sometimes encountered is to find a function $G$ that is holomorphic
on the complement of a finite curve $\Gamma$, such that the discontinuity $G_+(t) - G_-(t)$,
$t \in \Gamma$, $t$ not an endpoint, is a prescribed function $f$. If $f$ is continuous on $\Gamma$, the proof
of Theorem 13.1.1 shows that the Cauchy integral $F$ is a solution. A natural question
is that of uniqueness of the solution. Clearly $F(z) \to 0$ as $z \to \infty$. If $G$ is another
solution, then $G - F$ is continuous on $\Gamma$ except possibly at the endpoints. If $f$ is
continuous at the endpoints, and $G$ has at most the same kind of logarithmic growth
as $F$ at the endpoints, then $G - F$ has removable singularities at the endpoints, and
is entire. More generally, if $f$ is such that at an endpoint $a$ of $\Gamma$,

\[
F(z) = O(|z - a|^{r}), \quad r > -1 \quad \text{as } z \to a,\ z \notin \Gamma, \tag{13.1.9}
\]

and $G$ is required to obey the same estimate, then the singularities of $G - F$ at $a$ are
removable.
We have proved one version of the discontinuity theorem [43]:

Theorem 13.1.4. Suppose that $\Gamma$ is a finite simple $C^1$ curve and $f : \Gamma \to \mathbb{C}$ is
continuous except possibly at the endpoints. Suppose that $G$ is holomorphic on the
complement of $\Gamma$, and $G_+ - G_- = f$ on $\Gamma$. Suppose finally that at the endpoints, if
any, both the Cauchy integral (13.1.1) and $G$ satisfy estimates of the form (13.1.9).
Then $G - F$ is an entire function.

Remark. If $G(z) = O(z^n)$ as $|z| \to \infty$ for some integer $n \ge 0$, it follows that $G - F$
is a polynomial of degree $\le n$.
Theorem 13.1.4 is one example of solving a problem by turning the Sokhotski–Plemelj
formula around. The following is a different example. The problem is to
evaluate the principal value integral

\[
I(x) = \mathrm{p.v.}\!\int_{-1}^{1} \frac{(1 - t)^{\alpha - 1}}{(1 + t)^{\alpha}\,(t - x)}\,dt, \qquad |x| < 1, \tag{13.1.10}
\]

where $0 < \alpha < 1$. Consider the function

\[
G(t) = \frac{(t - 1)^{\alpha - 1}}{(t + 1)^{\alpha}} \tag{13.1.11}
\]

with the branch cut $(-1, 1)$ and the branch chosen to correspond to principal values
for $t$ real and $t > 1$. For $t_0 \in (-1, 1)$, it follows from (13.1.11) that

\[
G_+(t_0) = \frac{(1 - t_0)^{\alpha - 1}}{(1 + t_0)^{\alpha}}\, e^{i(\alpha - 1)\pi}
\]

and

\[
G_-(t_0) = \frac{(1 - t_0)^{\alpha - 1}}{(1 + t_0)^{\alpha}}\, e^{-i(\alpha - 1)\pi}.
\]

Thus, $G_+(t_0) - G_-(t_0) = (1 - t_0)^{\alpha - 1} (1 + t_0)^{-\alpha} (-2i \sin \alpha\pi)$. In view of Theorem
13.1.4, we obtain

\[
G(z) = \frac{1}{2\pi i} \int_{-1}^{1} (-2i \sin \alpha\pi)\, \frac{(1 - t)^{\alpha - 1}}{(1 + t)^{\alpha}}\, \frac{dt}{t - z}.
\]

From (13.1.3) it follows that

\[
I(x) = \pi \cot \alpha\pi\, (1 - x)^{\alpha - 1} (1 + x)^{-\alpha}. \tag{13.1.12}
\]

(Note the interesting special case $\alpha = \frac{1}{2}$.)

We end this section with an extension of Theorem 13.1.3 to the case of singularities
at the endpoints.
Theorem 13.1.5. Suppose that $\Gamma$ is a finite simple $C^1$ arc from $a$ to $b$. Suppose that
$f : \Gamma \to \mathbb{C}$ is continuous except possibly at the endpoints, and suppose that at the
endpoint $c = a$ or $c = b$ it has the form

\[
f(t) = \frac{\tilde f(t)}{(t - c)^{\sigma}}, \qquad \sigma = \alpha + i\beta \ne 0, \quad 0 \le \alpha < 1. \tag{13.1.13}
\]

Here $\alpha$ and $\beta$ are real, and $\tilde f(t)$ is continuous. Then the Cauchy integral (13.1.1)
satisfies
(a) as $z \to c$, with $z$ not on the arc,

\[
F(z) = \pm\frac{e^{\pm\sigma\pi i}}{2i \sin \sigma\pi}\, \frac{\tilde f(c)}{(z - c)^{\sigma}} + \delta(z); \tag{13.1.14}
\]

(b) as $t \to c$, with $t$ on the arc,

\[
F(t) = \pm\frac{\cot \sigma\pi}{2i}\, \frac{\tilde f(c)}{(t - c)^{\sigma}} + \rho(t), \tag{13.1.15}
\]

where the positive and negative signs correspond to $c = a$ and $c = b$, respectively. If
$\alpha = \operatorname{Re} \sigma = 0$, then $\delta(z)$ and $\rho(t)$ are bounded functions with limits at $c$. If $\alpha > 0$,
then

\[
|\delta(z)| \le \frac{M_0}{|z - c|^{\alpha_0}}, \qquad |\rho(t)| \le \frac{\tilde\rho(t)}{|t - c|^{\alpha_0}}, \qquad \alpha_0 < \alpha,
\]

where $\tilde\rho(t)$ is continuous near $c$. The function $(z - c)^{\sigma}$ is any branch that is single-valued
near $c$ with the branch cut taken along the arc with the value $(t - c)^{\sigma}$ on the
left side of the curve.

Proof. We only present a sketch of the proof by using the Sokhotski–Plemelj formula,
and refer the readers to [149] for details. Consider the case $c = a$. Take the
branch cut of $(z - a)^{\sigma}$ from the endpoint $a$ to $\infty$ going through $b$; see Figure 13.3.
Select the branch that tends to $(t - a)^{\sigma}$ on the left side of the cut, i.e.

\[
(t - a)^{\sigma} = (t - a)^{\sigma}_{+}. \tag{13.1.16a}
\]

To find the value of $(t - a)^{\sigma}$ on the right of the cut, we follow the contour in Figure
13.3.

Fig. 13.3 The cut from a to ∞ through b

Thus

\[
(t - a)^{\sigma}_{-} = e^{2\pi\sigma i}\, (t - a)^{\sigma}_{+}. \tag{13.1.16b}
\]
Equations (13.1.16a) and (13.1.16b) can be written as

\[
(t - a)^{-\sigma}_{+} - (t - a)^{-\sigma}_{-} = (1 - e^{-2\pi\sigma i})\,(t - a)^{-\sigma}
\]

or equivalently

\[
\frac{e^{i\pi\sigma}}{2i \sin \pi\sigma}\, (t - a)^{-\sigma}_{+} - \frac{e^{i\pi\sigma}}{2i \sin \pi\sigma}\, (t - a)^{-\sigma}_{-} = (t - a)^{-\sigma}. \tag{13.1.17}
\]

Equation (13.1.17) shows that the function $(t - a)^{-\sigma}$ can be written as a difference
function of a “+” and a “−” function. Since it is expected that the major contribution
to the Cauchy integral (13.1.1) will come from the locations where $f(t)$ is singular
(i.e. the endpoints), it follows from (13.1.13) that as $z \to a$,

\[
F(z) \sim \frac{\tilde f(a)}{2\pi i} \int_a^b \frac{(t - a)^{-\sigma}}{t - z}\,dt.
\]

On account of (13.1.17), we obtain

\[
F(z) \sim \tilde f(a)\, \frac{e^{i\pi\sigma}}{2i \sin \sigma\pi} \left[ \frac{1}{2\pi i} \int_a^b \frac{(t - a)^{-\sigma}_{+}}{t - z}\,dt - \frac{1}{2\pi i} \int_a^b \frac{(t - a)^{-\sigma}_{-}}{t - z}\,dt \right].
\]

From (13.1.13), it follows that

\[
F(z) \sim \frac{e^{i\pi\sigma}}{2i \sin \sigma\pi}\, [F_+(z) - F_-(z)].
\]

By the Sokhotski–Plemelj formulas we have, for $z$ not on the curve of integration,

\[
F(z) \sim \frac{e^{i\pi\sigma}}{2i \sin \sigma\pi}\, \frac{\tilde f(a)}{(z - a)^{\sigma}}.
\]

In view of (13.1.16a) and (13.1.16b), for $z = t$ on the path of integration we have

\begin{align*}
F(t) &= \frac{1}{2}\,[F_+(t) + F_-(t)] \\
&\sim \frac{e^{i\pi\sigma}}{2i \sin \sigma\pi}\, \frac{\tilde f(a)}{2}\, \left[(t - a)^{-\sigma}_{+} + (t - a)^{-\sigma}_{-}\right] \\
&= \frac{\cot \sigma\pi}{2i}\, \frac{\tilde f(a)}{(t - a)^{\sigma}}.
\end{align*}


13.2 Riemann–Hilbert Problems

As we noted in the introduction, the problem originally posed by Riemann was to find
a function W = U + i V , holomorphic inside a bounded domain Ω and continuous
to the boundary, that satisfies a linear relation between the boundary values of its real
and imaginary parts. Up to conformal equivalence we may take Ω = D and look for

a(ζ )U (ζ ) + b(ζ )V (ζ ) = c(ζ ), |ζ | = 1, (13.2.1)

where $a$, $b$, and $c$ are given real-valued functions. If we set

\[
W_-(z) = \overline{W_+\!\left(\frac{1}{\bar z}\right)}, \qquad |z| > 1, \tag{13.2.2}
\]

then $\overline{W_-} = W_+$ on $\Gamma = \partial\mathbb{D}$. Therefore we may rewrite (13.2.1) as

\[
\frac{a(\zeta) - ib(\zeta)}{2}\, W_+(\zeta) + \frac{a(\zeta) + ib(\zeta)}{2}\, W_-(\zeta) = c(\zeta). \tag{13.2.3}
\]

Thus, we can reformulate Riemann’s problem in the form: find two functions $W_+(z)$
and $W_-(z)$, holomorphic inside and outside of the unit circle, respectively, such that
their boundary values on the unit circle satisfy the linear relation (13.2.3). With this
formulation, $W$ is unique only up to multiplication by an entire function, so we
also specify the behavior of $W_-(z)$ at $\infty$; for instance, from (13.2.2), we require
$W_-(z) \to \overline{W_+(0)}$ as $z \to \infty$.
As a generalization, Hilbert posed the problem of finding a function $W(z)$, holomorphic
on the complement of a closed curve $\Gamma$ such that for all $\zeta \in \Gamma$,

\[
W_+(\zeta) = g(\zeta)\, W_-(\zeta) + f(\zeta), \tag{13.2.4}
\]

where $g(\zeta)$ and $f(\zeta)$ are two given complex-valued functions. In Hilbert’s original
problem, $\Gamma$ is a closed curve; the general problem (13.2.4), whether $\Gamma$ is open or
closed, has become known as the Riemann–Hilbert problem. Again, for uniqueness,
the behavior of $W(z)$ at $\infty$ is required. If $\Gamma$ is an open arc, then the endpoint behavior
should also be prescribed.
In his work on singular integral equations (see Section 13.7), Carleman [41] found
an effective method of attack. First find a function $L(z)$ that satisfies

\[
L_+(\zeta) = g(\zeta)\, L_-(\zeta), \tag{13.2.5}
\]

where $L_+(\zeta)$, $L_-(\zeta)$, and $L(\zeta)$ have no zeros. Substituting (13.2.5) into (13.2.4)
yields

\[
\frac{W_+(\zeta)}{L_+(\zeta)} - \frac{W_-(\zeta)}{L_-(\zeta)} = \frac{f(\zeta)}{L_+(\zeta)}. \tag{13.2.6}
\]

Note that the function $W(z)/L(z)$ is holomorphic for $z$ not on $\Gamma$, since $L(z) \ne 0$.
Hence the conditions in Theorem 13.1.4 are met, and the general function satisfying
(13.2.6) is given; see the remark following Theorem 13.1.4. Hence, if $L(z)$ is known
then $W(z)$ has been found.
Before proceeding to find $L(z)$, we observe that solving equation (13.2.6) is
equivalent to solving

\[
\frac{f(\zeta)}{L_+(\zeta)} = F_+(\zeta) - F_-(\zeta)
\]

and then defining

\[
F(z) = \frac{1}{2\pi i} \int_\Gamma \frac{f(\zeta)/L_+(\zeta)}{\zeta - z}\,d\zeta; \tag{13.2.7}
\]

see (13.1.2).
Equation (13.2.6) can be written as

\[
\frac{W_+(\zeta)}{L_+(\zeta)} - F_+(\zeta) = \frac{W_-(\zeta)}{L_-(\zeta)} - F_-(\zeta).
\]

The function

\[
\frac{W(z)}{L(z)} - F(z) \tag{13.2.8}
\]

has the same boundary values on each side of $\Gamma$, so it is an entire function. The
function $W(z)$ is thus determined, up to addition of an entire function. In the case
when $\Gamma$ is an infinite straight line parallel to the real axis, this method is known as
the Wiener–Hopf technique; see [22].
We now return to the problem of finding a function $L(z)$ that satisfies (13.2.5).
Assuming that $g(\zeta) \ne 0$ for $\zeta \in \Gamma$, we take logarithms on both sides of (13.2.5).
This gives

\[
\log L_+(\zeta) - \log L_-(\zeta) = \log g(\zeta). \tag{13.2.9}
\]

For now we assume that $\Gamma$ is an arc, and that $g(\zeta)$ is continuous to the end points $a$,
$b$ of the arc. By the discontinuity theorem, Theorem 13.1.4, a particular solution of
(13.2.9) is

\[
G(z) = \log L(z) = \frac{1}{2\pi i} \int_\Gamma \frac{\log g(\zeta)}{\zeta - z}\,d\zeta. \tag{13.2.10}
\]

Thus, $L(z) = e^{G(z)}$ and $L(z)$ is non-zero. Furthermore,

\[
\frac{L_+(\zeta)}{L_-(\zeta)} = e^{[G_+(\zeta) - G_-(\zeta)]} = e^{\log g(\zeta)} = g(\zeta),
\]

i.e. (13.2.5) is satisfied. Here, the second equality again follows from Theorem 13.1.4.
From (13.2.10), we have $L(z) \to 1$ as $z \to \infty$. The behavior of $L(z)$ as $z \to a$ or $b$
may not be appropriate for the application of Theorem 13.1.4. Fortunately, we can
adjust the behavior of $L(z)$ by incorporating an integral power of $z - a$ or $z - b$ into
$L(z)$. For instance, we know from Theorem 13.1.5 that

\[
G(z) \sim -\frac{1}{2\pi i}\, \log g(a)\, \log(z - a)
\]

as $z \to a$; see (13.1.5). Hence

\[
L(z) \sim (z - a)^{-\log g(a)/2\pi i} \sim (z - a)^{\alpha + i\beta},
\]

where $\alpha$ and $\beta$ are real numbers. In this case, we can revise $L(z)$ by multiplying it by
a factor $(z - a)^{p_1}$, where $p_1$ is an integer, with $-1 < \alpha + p_1 < 0$. A similar factor
can be incorporated to yield the desired behavior at the other endpoint.
If $\Gamma$ is a closed curve, equation (13.2.9) is usually not useful, since $\log g(z)$ will not
in general return to its initial value after a complete circuit. Thus the function $\log g(\zeta)$
in the integral defining $G(z)$ in (13.2.10) has a discontinuity, and the Sokhotski–Plemelj
formulas are not valid. Let $\log g(\zeta)$ increase by $2\pi n i$, $n$ an integer, during
a circuit of $\Gamma$. We can avoid this difficulty by defining

\[
g_0(\zeta) = (\zeta - z_0)^{-n}\, g(\zeta),
\]

where $z_0$ is a point inside $\Gamma$. Now, define

\[
N(z) =
\begin{cases}
L(z) & \text{for } z \text{ inside } \Gamma, \\
(z - z_0)^{n} L(z) & \text{for } z \text{ outside}.
\end{cases} \tag{13.2.11}
\]

Our problem is now to solve

\[
N_+(\zeta) = g_0(\zeta)\, N_-(\zeta),
\]

where $g_0(\zeta)$ is single-valued, and the procedure for the arc can be used.
Example. Find a function $W(z)$ satisfying

\[
W_+(\zeta) + W_-(\zeta) = f(\zeta) \tag{13.2.12}
\]

for $\zeta$ on an arc $\Gamma$, with $W(z)$ being of finite degree at $\infty$ and having singularities
near endpoints $a$ and $b$ which are no worse than algebraic with degree $> -1$. The
function $f(\zeta)$ may have integrable singularities at the endpoints $a$ and $b$.
From (13.2.4) with $g(\zeta) = -1$, we obtain one solution, namely,

\[
\log L(z) = \frac{1}{2\pi i} \int_a^b \frac{i\pi}{\zeta - z}\,d\zeta,
\]

i.e.

\[
L(z) = \sqrt{\frac{z - b}{z - a}}. \tag{13.2.13}
\]

It is easily shown that equation (13.2.5) is satisfied. For W (z)/L(z) not to grow
too fast as z → a or b, we need to make L(z) grow algebraically (with exponent
> −1) as z → a, b. Therefore we use for L(z) a function obtained by multiplying
the right-hand side of (13.2.13) by $1/(z-b)$. That is, we choose
\[ L(z) = \frac{1}{\sqrt{(z-a)(z-b)}} \tag{13.2.14} \]
with the branch cut along the arc and $L(z) \sim z^{-1}$ as $z \to \infty$. For $\zeta \in \Gamma$, $L_+(\zeta)$ and
$L_-(\zeta)$ can easily be calculated from (13.2.14). For instance, if $\Gamma$ is the line segment
$(-1, 1)$ of the real line, then
\[ L_+(\zeta) = \frac{-i}{\sqrt{1-\zeta^2}} \quad\text{and}\quad L_-(\zeta) = \frac{i}{\sqrt{1-\zeta^2}}. \tag{13.2.15} \]
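The boundary values in (13.2.15) can be checked numerically. The sketch below assumes $a=-1$, $b=1$ and realizes the branch of (13.2.14) with cut on $[-1,1]$ and $L(z)\sim 1/z$ by writing $L(z) = 1/\bigl(z\sqrt{1-1/z^2}\bigr)$ with the principal square root; this particular representation is an implementation choice, not taken from the text.

```python
import cmath
import math

def L(z):
    """1/sqrt((z-1)(z+1)) with the cut on [-1, 1] and L(z) ~ 1/z at infinity."""
    return 1.0 / (z * cmath.sqrt(1.0 - 1.0 / (z * z)))

def L_plus(x, eps=1e-9):
    # limit from above the segment (-1, 1)
    return L(complex(x, eps))

def L_minus(x, eps=1e-9):
    # limit from below the segment (-1, 1)
    return L(complex(x, -eps))

x = 0.3
expected_plus = -1j / math.sqrt(1.0 - x * x)
error_plus = abs(L_plus(x) - expected_plus)
jump_sum = abs(L_plus(x) + L_minus(x))   # g = -1 means L_+ = -L_-
decay = abs(1.0e6 * L(1.0e6) - 1.0)      # L(z) ~ 1/z for large z
```

The quantity `jump_sum` being small reflects $L_+ = -L_-$, which is the homogeneous problem with $g = -1$.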

Equation (13.2.6) now gives
\[ \frac{W(z)}{L(z)} = \frac{1}{2\pi i}\int_\Gamma \frac{f(\zeta)}{L_+(\zeta)(\zeta - z)}\,d\zeta + p_n(z), \tag{13.2.16} \]
where $p_n(z)$ is a polynomial of degree $< n$. The function given in (13.2.16) is the
most general solution for which $W(z)/z^n \to 0$ as $z \to \infty$.

13.3 The Radon Transform and the Fourier transform

The Radon transform is defined by
\[ \hat q(k, p) = \int_L q(x_1, x_2)\,d\tau, \]
where the integral is taken along a line $L$ with direction determined by the unit vector
$\mathbf{k} = \bigl(1/\sqrt{1+k^2},\; k/\sqrt{1+k^2}\bigr)$, at a distance $p$ from the origin, and $\tau$ is a parameter on this
line; see Figure 13.4. This transform plays a fundamental role in the mathematical
formulation of computerized tomography (CT): the reconstruction of a function from
the knowledge of its line integrals, irrespective of the particular field of application.
The most prominent application of CT is in diagnostic radiology. Here a cross section
of the human body is scanned by a thin X-ray beam whose intensity loss is recorded
by a detector and processed by a computer to produce a two-dimensional image that
in turn is displayed on a screen.
A simple physical model is as follows; see Figure 13.5. Let f (x1 , x2 ) be the X-ray
attenuation coefficient of the tissue at the point x = (x1 , x2 ). This means that X-ray
traversing a small distance Δτ along the line L suffers the relative intensity loss
\[ \frac{\Delta I}{I} = -f(x_1, x_2)\,\Delta\tau. \]

Fig. 13.4 Line L, distance p

Let $I_0$ and $I_1$ be the initial and final intensities of the beam, before entering and after
leaving the body, respectively. In the limit $\Delta\tau \to 0$, it follows from the above equation that
\[ \frac{I_1}{I_0} = e^{-\int_L f(x_1, x_2)\,d\tau}, \]

that is, the scanning process determines an integral of the function f (x1 , x2 ) along
each line L. Given all these integrals, one wishes to reconstruct the function f .
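The passage from the relative loss per step to the exponential law can be illustrated numerically. This is a minimal sketch, assuming a hypothetical attenuation profile `f` along a ray of length 2; the profile, the ray length and the step count are illustrative values only.

```python
import math

def attenuated_fraction(f, length, steps):
    """Multiply the per-step factors (1 - f(tau) * d_tau) along a ray of the given length."""
    d_tau = length / steps
    fraction = 1.0
    for j in range(steps):
        tau = (j + 0.5) * d_tau          # midpoint of the j-th step
        fraction *= 1.0 - f(tau) * d_tau
    return fraction

f = lambda tau: 0.5 + 0.3 * math.sin(tau)   # hypothetical attenuation coefficient
length = 2.0

# exact line integral of f over [0, length], then the exponential law
line_integral = 0.5 * length + 0.3 * (1.0 - math.cos(length))
exact = math.exp(-line_integral)
approx = attenuated_fraction(f, length, 200000)
```

As the step size shrinks, the product of the factors $1 - f\,\Delta\tau$ approaches $e^{-\int_L f\,d\tau}$, so a measured ratio $I_1/I_0$ determines the line integral of $f$.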

Fig. 13.5 Simple physical model of CT (source and detector)

Let $\mathbf{k} = \bigl(1/\sqrt{1+k^2},\; k/\sqrt{1+k^2}\bigr)$ be a unit vector along $L$ and let $\mathbf{k}^{\perp}$ be the unit
vector orthogonal to $\mathbf{k}$, that is, $\mathbf{k}^{\perp} = \bigl(-k/\sqrt{1+k^2},\; 1/\sqrt{1+k^2}\bigr)$. Then any point
$x = (x_1, x_2)$ can be written as $x = p\,\mathbf{k}^{\perp} + \tau\,\mathbf{k}$. For fixed $k$ and $p$, we write


\[ x_1(\tau) = \frac{\tau - pk}{\sqrt{1+k^2}}; \qquad x_2(\tau) = \frac{\tau k + p}{\sqrt{1+k^2}}. \]

Note that as $\tau$ varies, $x = (x_1(\tau), x_2(\tau))$ moves along the line as depicted in Figure
13.5. Therefore the Radon transform can also be written as
\[ \hat q(k, p) = \int_{-\infty}^{\infty} q\left(\frac{\tau - pk}{\sqrt{1+k^2}},\, \frac{\tau k + p}{\sqrt{1+k^2}}\right) d\tau. \tag{13.3.1} \]
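Formula (13.3.1) can be evaluated directly by quadrature. The sketch below assumes a Gaussian test function $q(x_1,x_2) = e^{-(x_1^2+x_2^2)}$, for which $x_1^2 + x_2^2 = \tau^2 + p^2$ along the line and the transform is $\sqrt{\pi}\,e^{-p^2}$ for every $k$; the function name `radon`, the truncation width and the step count are illustrative choices.

```python
import math

def radon(q, k, p, half_width=8.0, steps=4000):
    """Trapezoidal approximation of (13.3.1) along the line with slope k at distance p."""
    norm = math.sqrt(1.0 + k * k)
    d_tau = 2.0 * half_width / steps
    total = 0.0
    for j in range(steps + 1):
        tau = -half_width + j * d_tau
        x1 = (tau - p * k) / norm        # x_1(tau)
        x2 = (tau * k + p) / norm        # x_2(tau)
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * q(x1, x2)
    return total * d_tau

gaussian = lambda x1, x2: math.exp(-(x1 * x1 + x2 * x2))

value = radon(gaussian, k=0.7, p=0.4)
expected = math.sqrt(math.pi) * math.exp(-0.4 ** 2)
```

Because the Gaussian is rotation invariant, the result does not depend on $k$, which gives a simple sanity check on the parametrization of the line.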

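Along the line of integration in (13.3.1), the derivative of a function $\mu(\tau)$ is
\[ \frac{d\mu}{d\tau} = \frac{1}{\sqrt{1+k^2}}\left(\frac{\partial\mu}{\partial x_1} + k\,\frac{\partial\mu}{\partial x_2}\right), \]
so in order to calculate $\hat q$ in (13.3.1), we are led to the partial differential equation
\[ \frac{\partial\mu}{\partial x_1} + k\,\frac{\partial\mu}{\partial x_2} = q(x_1, x_2). \tag{13.3.2} \]
Then
\[ \hat q(k, p) = \sqrt{1+k^2}\,\cdot\,\Bigl[\mu\bigl(x_1(\tau), x_2(\tau)\bigr)\Bigr]_{\tau=-\infty}^{\infty}. \tag{13.3.3} \]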

As we shall see, equation (13.3.2) leads naturally to a Riemann–Hilbert problem.


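Let us begin with a simpler model, the differential equation
\[ \frac{d\mu}{dx}(x) - ik\mu = q(x), \qquad -\infty < x < \infty,\quad k \in \mathbb{C}. \tag{13.3.4} \]
Assuming that $q$ and $q_x$ belong to $L^1$, we have the following particular solutions:
\[
\begin{aligned}
\mu^{+}(x, k) &= \int_{-\infty}^{x} q(\xi)\,e^{ik(x-\xi)}\,d\xi,\\
\mu^{-}(x, k) &= -\int_{x}^{\infty} q(\xi)\,e^{ik(x-\xi)}\,d\xi.
\end{aligned} \tag{13.3.5}
\]
We define a solution of (13.3.4) by
\[
\mu(x, k) = \begin{cases} \mu^{+}(x, k), & k_I \ge 0,\\ \mu^{-}(x, k), & k_I \le 0, \end{cases} \qquad k = k_R + i k_I. \tag{13.3.6}
\]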

Taking into account (13.3.5), it is readily seen that $\mu^{+}$ is holomorphic in the upper
half-plane ($k_I > 0$) and $\mu^{-}$ is holomorphic in the lower half-plane ($k_I < 0$). Furthermore,
the large $x$ behavior of both $\mu^{+}$ and $\mu^{-}$ is uniquely determined by $\hat q(k)$,
which is defined by
\[ \hat q(k) = \int_{-\infty}^{\infty} q(x)\,e^{-ikx}\,dx, \qquad k \in \mathbb{R}. \tag{13.3.7} \]
Indeed,
\[ \lim_{x\to-\infty}\bigl(e^{-ikx}\mu^{-}\bigr) = -\hat q(k), \qquad \lim_{x\to\infty}\bigl(e^{-ikx}\mu^{+}\bigr) = \hat q(k). \tag{13.3.8} \]

Equation (13.3.7) defines $\hat q$ in terms of $q$. To invert this relationship we will formulate
the problem as a Riemann–Hilbert problem. Taking the difference of the two
equations in (13.3.5), we have
\[ \mu^{+}(x, k) - \mu^{-}(x, k) = e^{ikx}\,\hat q(k), \qquad k \in \mathbb{R}. \tag{13.3.9} \]
By integrating by parts, it can be seen from (13.3.5) that
\[ \mu = O\!\left(\frac{1}{k}\right) \quad\text{as } k \to \infty. \]
Then equation (13.3.9), with $\mu \to 0$ as $k \to \infty$, defines a Riemann–Hilbert problem
for the function $\mu(x, k)$; see (13.2.4) and the following remark. The solution is given
by
\[ \mu(x, k) = \frac{1}{2\pi i}\int_{-\infty}^{\infty} \frac{e^{ixl}\,\hat q(l)}{l - k}\,dl, \qquad k \in \mathbb{C}. \tag{13.3.10} \]

Given $\hat q(l)$, equation (13.3.10) yields $\mu(x, k)$, which then gives $q(x)$ through
equation (13.3.4). An elegant formula for $q$ can be obtained by comparing the
large $k$ asymptotics of equations (13.3.4) and (13.3.10). Equation (13.3.4) implies
$q = -i\lim_{k\to\infty}(k\mu)$, while (13.3.10) yields
\[ \lim_{k\to\infty} k\mu = -\frac{1}{2\pi i}\int_{-\infty}^{\infty} e^{ixl}\,\hat q(l)\,dl. \]
Hence
\[ q(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ixk}\,\hat q(k)\,dk. \tag{13.3.11} \]

Equations (13.3.7) and (13.3.11) are the usual formulas for the direct and inverse
Fourier transform.
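The pair (13.3.7), (13.3.11) can be checked numerically on a concrete function. The following is a minimal sketch, assuming the Gaussian $q(x) = e^{-x^2/2}$, for which $\hat q(k) = \sqrt{2\pi}\,e^{-k^2/2}$; the helper `integrate`, the truncation interval and the step count are illustrative choices.

```python
import cmath
import math

def integrate(func, a, b, steps):
    """Composite trapezoidal rule for a complex-valued integrand."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for j in range(1, steps):
        total += func(a + j * h)
    return total * h

q = lambda x: math.exp(-0.5 * x * x)

def q_hat(k):
    # direct transform (13.3.7), truncated to [-10, 10]
    return integrate(lambda x: q(x) * cmath.exp(-1j * k * x), -10.0, 10.0, 400)

def q_rec(x):
    # inverse transform (13.3.11) applied to the numerically computed q_hat
    inner = lambda k: cmath.exp(1j * x * k) * q_hat(k)
    return integrate(inner, -10.0, 10.0, 400) / (2.0 * math.pi)

forward_error = abs(q_hat(1.0) - math.sqrt(2.0 * math.pi) * math.exp(-0.5))
roundtrip_error = abs(q_rec(0.5) - q(0.5))
```

The trapezoidal rule is used here because it is very accurate for rapidly decaying smooth integrands; any other quadrature would serve equally well.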
Let us now turn this argument around. To solve (13.3.4), write the proposed
solution $\mu$ and the right-hand side in terms of their Fourier transforms:
\[ \mu(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ilx}\,\hat\mu(l)\,dl. \tag{13.3.12} \]
Then the differential equation (13.3.4) becomes
\[ \frac{d\mu}{dx}(x) - ik\mu(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} i(l-k)\,e^{ilx}\,\hat\mu(l)\,dl = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ilx}\,\hat q(l)\,dl. \]
Thus we expect $\hat\mu(l) = \hat q(l)/(il - ik)$. With this choice, the inversion formula gives
(13.3.10).
Let us take a second look at this, writing the solution in terms of a Green’s function
$G$ for the operator $d/dx - ik$, i.e. we want to obtain the solution $\mu$ as an integral
\[ \mu(x) = \int_{-\infty}^{\infty} G(x - y)\,q(y)\,dy. \]
(The form $G(x, y) = G(x - y)$ reflects the fact that the operator is invariant under
translation.) A simple computation shows that taking the Fourier transform gives
\[ \hat\mu = \hat G\cdot\hat q. \]
In view of this and (13.3.12), we want $\hat G(l) = 1/\bigl(i(l - k)\bigr)$, so
\[ G(x, k) = \frac{1}{2\pi i}\int_{-\infty}^{\infty} \frac{e^{ixl}}{l - k}\,dl, \qquad k \notin \mathbb{R}. \]
The limits as $\pm\operatorname{Im} k \downarrow 0$ can be computed (see Exercise 3), and we recover (13.3.5).
Making use of the analogy with (13.3.4), let us return to equation (13.3.2), with
$k$ allowed to be complex:
\[ \frac{\partial\mu}{\partial x_1} + k\,\frac{\partial\mu}{\partial x_2} = q, \qquad -\infty < x_1, x_2 < \infty,\quad k \in \mathbb{C}. \tag{13.3.13} \]
As in the case of (13.3.4), we make use of the Fourier transform, this time in two
dimensions. Treating one variable at a time, it is easy to see that under appropriate
assumptions on $q(x) = q(x_1, x_2)$ we have the relation between $q$ and its Fourier
transform $\hat q$:
\[
\begin{aligned}
\hat q(l) &= \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} e^{-i(l_1 x_1 + l_2 x_2)}\,q(x)\,dx_1\,dx_2;\\
q(x) &= \frac{1}{(2\pi)^2}\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} e^{i(l_1 x_1 + l_2 x_2)}\,\hat q(l)\,dl_1\,dl_2.
\end{aligned}
\]

In analogy with the argument given above with respect to (13.3.4), we look for a
Green’s function $G$ for the equation (13.3.13):
\[ \mu(x) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} G(x - y)\,q(y)\,dy_1\,dy_2, \tag{13.3.14} \]
and derive the equation
\[ G(x_1, x_2, k) = \frac{1}{i(2\pi)^2}\iint_{\mathbb{R}^2} \frac{e^{i(x_1\xi_1 + x_2\xi_2)}}{\xi_1 + k\xi_2}\,d\xi_1\,d\xi_2. \tag{13.3.15} \]
The above integral can be evaluated by using contour integration (Exercise 3), and
we have
\[ G(x_1, x_2, k) = \frac{\operatorname{sgn}(\operatorname{Im} k)}{2\pi i\,(x_2 - k x_1)}, \qquad \operatorname{Im} k \neq 0. \tag{13.3.16} \]
Putting (13.3.16) into (13.3.14) we obtain
\[ \mu^{\pm}(x_1, x_2, k) = \pm\frac{1}{2\pi i}\iint_{\mathbb{R}^2} \frac{q(y_1, y_2)}{(x_2 - y_2) - k(x_1 - y_1)}\,dy_1\,dy_2, \qquad k \in \mathbb{C}^{\pm},\ \operatorname{Im} k \neq 0. \]

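Applying the Sokhotski–Plemelj formulas, we obtain
\[
\begin{aligned}
\mu^{\pm}(x_1, x_2, k) ={}& \pm\frac{1}{2\pi i}\,\mathrm{p.v.}\!\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \frac{q(y_1, y_2)}{(x_2 - y_2) - k(x_1 - y_1)}\,dy_2\,dy_1\\
&+ \frac{1}{2}\left(\int_{-\infty}^{x_1} - \int_{x_1}^{\infty}\right) q\bigl(y_1,\, x_2 - k(x_1 - y_1)\bigr)\,dy_1, \qquad k \in \mathbb{R};
\end{aligned}
\]
see Exercise 4. The difference of these two equations gives
\[ (\mu^{+} - \mu^{-})(x_1, x_2, k) = \frac{1}{\pi i}\,\mathrm{p.v.}\!\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \frac{q(y_1, y_2)}{(x_2 - y_2) - k(x_1 - y_1)}\,dy_2\,dy_1, \qquad k \in \mathbb{R}. \tag{13.3.17} \]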

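The right-hand side of this equation can be written in terms of the Radon transform
of the function $q(x_1, x_2)$ defined by
\[ \hat q(k, p) = \int_{-\infty}^{\infty} q\left(\frac{\tau - pk}{\sqrt{1+k^2}},\, \frac{\tau k + p}{\sqrt{1+k^2}}\right) d\tau. \tag{13.3.18} \]
Indeed, changing variables from $(y_1, y_2)$ to $(p', \tau')$ where
\[ y_1 = \frac{\tau' - p'k}{\sqrt{1+k^2}}, \qquad y_2 = \frac{\tau' k + p'}{\sqrt{1+k^2}} \]
and using equation (13.3.17) and the Jacobian of the transformation
\[ J = \det\begin{pmatrix} \dfrac{\partial y_1}{\partial \tau'} & \dfrac{\partial y_2}{\partial \tau'}\\[2mm] \dfrac{\partial y_1}{\partial p'} & \dfrac{\partial y_2}{\partial p'} \end{pmatrix} = 1, \]
it follows that
\[ \mu^{+}(x_1, x_2, k) - \mu^{-}(x_1, x_2, k) = \frac{1}{i\pi}\,\mathrm{p.v.}\!\int_{-\infty}^{\infty} \frac{\hat q(k, p')}{x_2 - k x_1 - p'\sqrt{1+k^2}}\,dp', \qquad k \in \mathbb{R}. \tag{13.3.19} \]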

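Equation (13.3.14) implies that
\[ \mu = O\!\left(\frac{1}{k}\right), \qquad k \to \infty, \tag{13.3.20} \]
so equations (13.3.19) and (13.3.20) define a Riemann–Hilbert problem for the function
$\mu(x_1, x_2, k)$. Its unique solution, for $\operatorname{Im} k \neq 0$, is
\[ \mu(x_1, x_2, k) = \frac{1}{2\pi i}\int_{-\infty}^{\infty}\left[\frac{1}{\pi i}\,\mathrm{p.v.}\!\int_{-\infty}^{\infty} \frac{\hat q(k', p')\,dp'}{x_2 - k' x_1 - p'\sqrt{1+k'^2}}\right]\frac{dk'}{k' - k}; \]
see (13.3.19). Comparing the large-$k$ asymptotics of equations (13.3.13), (13.3.14)
and (13.3.20), it follows that
\[ q = \lim_{k\to\infty} \frac{\partial}{\partial x_2}(k\mu) \]
or
\[ q(x_1, x_2) = \frac{1}{2\pi^2}\,\frac{\partial}{\partial x_2}\int_{-\infty}^{\infty}\mathrm{p.v.}\!\int_{-\infty}^{\infty} \frac{\hat q(k, p)}{x_2 - k x_1 - p\sqrt{1+k^2}}\,dp\,dk. \tag{13.3.21} \]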

Equations (13.3.18) and (13.3.21) are the usual formulas for the direct and inverse
Radon transform.

13.4 Integral Equations with Cauchy Kernels

A typical integral equation in one variable has the form
\[ m(x)u(x) = \lambda\int_{-\infty}^{\infty} K(x, y)\,u(y)\,dy, \]
where $m$ is a given function, $\lambda$ is a complex parameter, and various assumptions are
made about the kernel $K$, such as
\[ K(x, y) = K(x - y), \qquad K(y, x) = -K(x, y), \qquad\text{or}\qquad K(x, y) = 0 \ \text{if } y > x. \]

In this and subsequent sections we examine some cases where the problem can be
treated by Riemann–Hilbert methods.
In this section we consider the case
\[ m(x)u(x) = \lambda\,\mathrm{p.v.}\!\int_{-1}^{1} \frac{u(\tau)}{\tau - x}\,d\tau + k(x), \qquad |x| < 1, \tag{13.4.1} \]
where $\lambda$ is real and positive, and where $m(x)$, $k(x)$ are given real-valued functions.
Define
\[ U(z) = \frac{1}{2\pi i}\int_{-1}^{1} \frac{u(\tau)}{\tau - z}\,d\tau. \tag{13.4.2} \]
(Here we allow $U(z)$ to have an algebraic singularity of degree $> -1$ at the endpoints
$-1$ and $1$.) From the Sokhotski–Plemelj formulas (13.1.2), we have
\[ [m(x) - \lambda\pi i]\,U_+(x) = [m(x) + \lambda\pi i]\,U_-(x) + k(x), \tag{13.4.3} \]
which is of the form discussed in Section 13.2; see (13.2.3).

First we look for a non-zero function $L(z)$ such that
\[ \frac{L_+(x)}{L_-(x)} = \frac{m(x) + \lambda\pi i}{m(x) - \lambda\pi i}. \]

A suitable choice is given by
\[ L(z) = \frac{1}{z - 1}\,e^{G(z)}, \tag{13.4.4} \]
where
\[ G(z) = \frac{1}{2\pi i}\int_{-1}^{1} \frac{1}{\tau - z}\,\log\frac{m(\tau) + \lambda\pi i}{m(\tau) - \lambda\pi i}\,d\tau; \tag{13.4.5} \]
see (13.2.10). Note that
\[ \frac{1}{2\pi i}\,\log\frac{m(\tau) + \lambda\pi i}{m(\tau) - \lambda\pi i} \]
is purely real, and we take it to lie in the range $(0, 1)$. The factor $(z - 1)^{-1}$ in (13.4.4)
has been inserted to make sure that $L(z)$ grows algebraically, with index between
$-1$ and $0$, as $z$ tends to either endpoint $-1$ or $1$.
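For $x$ in $(-1, 1)$, equation (13.4.3) gives
\[ \frac{U_+(x)}{L_+(x)} - \frac{U_-(x)}{L_-(x)} = \frac{k(x)}{L_+(x)\,[m(x) - \lambda\pi i]}, \tag{13.4.6} \]
where
\[ L_+(x) = \frac{1}{x - 1}\sqrt{\frac{m(x) + \lambda\pi i}{m(x) - \lambda\pi i}}\;e^{w(x)}, \tag{13.4.7a} \]
\[ L_-(x) = \frac{1}{x - 1}\sqrt{\frac{m(x) - \lambda\pi i}{m(x) + \lambda\pi i}}\;e^{w(x)}, \tag{13.4.7b} \]
and
\[ w(x) = \frac{1}{2\pi i}\,\mathrm{p.v.}\!\int_{-1}^{1} \frac{1}{\tau - x}\,\log\frac{m(\tau) + \lambda\pi i}{m(\tau) - \lambda\pi i}\,d\tau. \tag{13.4.8} \]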

(To derive these formulas, first write (13.4.3) in the form of (13.2.4), and then follow
the steps in equations (13.2.4)–(13.2.8).) On account of the behavior of $U(z)$
and $L(z)$ as $z \to \infty$, and by Theorem 13.1.4, the most general solution of (13.4.6) is
\[
\begin{aligned}
u(x) ={}& \frac{m(x)k(x)}{m^2(x) + \lambda^2\pi^2} + \frac{\lambda e^{w(x)}}{\sqrt{m^2(x) + \lambda^2\pi^2}}\,\mathrm{p.v.}\!\int_{-1}^{1} \frac{k(\tau)\,e^{-w(\tau)}}{(\tau - x)\sqrt{m^2(\tau) + \lambda^2\pi^2}}\,d\tau\\
&+ \frac{C e^{w(x)}}{(1 - x)\sqrt{m^2(x) + \lambda^2\pi^2}},
\end{aligned} \tag{13.4.9}
\]

where C is constant and w(x) is given in (13.4.8); see Exercise 4. The singularity at
x = 1 in the last term of (13.4.9) is offset by the factor ew(x) , so that the last term
is integrable. In the case when k(x) = 0 in (13.4.1), the resulting homogeneous
equation has a solution for all λ, i.e. the spectrum is continuous.
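In the case when $m(x) = 0$ and $k(x) = -\lambda l(x)$, equation (13.4.1) reduces to
\[ \mathrm{p.v.}\!\int_{-1}^{1} \frac{u(\tau)}{\tau - x}\,d\tau = l(x), \tag{13.4.10} \]
and its solution is given by
\[ u(x) = -\frac{1}{\pi^2}\sqrt{\frac{1 - x}{1 + x}}\;\mathrm{p.v.}\!\int_{-1}^{1} \frac{l(\tau)\sqrt{1 + \tau}}{\sqrt{1 - \tau}\,(\tau - x)}\,d\tau + \frac{C}{\sqrt{1 - x^2}}; \tag{13.4.11} \]
see Exercise 6. If $l(x) = 1$ in (13.4.10), then the solution further simplifies to
\[ u(x) = -\frac{1}{\pi}\sqrt{\frac{1 - x}{1 + x}} + \frac{C_1}{\sqrt{1 - x^2}}, \]
where $C_1$ is a new constant.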
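The closed form for $l(x) = 1$ can be checked by evaluating the principal value integral in (13.4.10) numerically. The sketch below assumes a standard Gauss–Chebyshev-type rule for Cauchy principal values (nodes at the zeros of $T_n$, evaluation points at the zeros of $U_{n-1}$), which is not taken from the text; the value of `C1` is arbitrary. Since $u(t)\sqrt{1-t^2} = -(1-t)/\pi + C_1$ is a polynomial, the rule is exact up to rounding.

```python
import math

def pv_cauchy_chebyshev(phi, i, n):
    """Approximate p.v. int_{-1}^{1} phi(t) / (sqrt(1 - t^2) (t - x_i)) dt at x_i = cos(i*pi/n)."""
    x = math.cos(i * math.pi / n)
    total = 0.0
    for j in range(1, n + 1):
        t = math.cos((2 * j - 1) * math.pi / (2 * n))   # zeros of T_n
        total += phi(t) / (t - x)
    return x, math.pi * total / n

C1 = 0.25                                    # arbitrary constant in the solution
phi = lambda t: -(1.0 - t) / math.pi + C1    # equals u(t) * sqrt(1 - t^2)

x, value = pv_cauchy_chebyshev(phi, i=3, n=16)
```

The computed value is the left-hand side of (13.4.10) at the point `x`, and it should equal $l(x) = 1$ regardless of `C1`, since the term $C_1/\sqrt{1-x^2}$ solves the homogeneous equation.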

13.5 Integral Equations with Algebraic Kernels

Consider the Abel-type integral equation
\[ \int_{0}^{1} \frac{u(\tau)}{|\tau - x|^{\alpha}}\,d\tau = k(x) \tag{13.5.1} \]
for $x \in (0, 1)$, where $0 < \alpha < 1$. This equation is not of Cauchy type but Carleman
[41] showed that it is still useful to introduce a function
\[ U(z) = \int_{0}^{1} \frac{u(\tau)}{(z - \tau)^{\alpha}}\,d\tau, \tag{13.5.2} \]
analogous to that used for the Cauchy type in Section 13.4; cf. (13.4.1)-(13.4.2). This
function is defined for all $z \notin (-\infty, 1)$; for $z$ real and $z > 1$, we use principal values
in (13.5.2). For $x \in (0, 1)$, it is easily seen that
\[ U_+(x) = \int_{0}^{x} \frac{u(\tau)}{(x - \tau)^{\alpha}}\,d\tau + e^{-i\alpha\pi}\int_{x}^{1} \frac{u(\tau)}{(\tau - x)^{\alpha}}\,d\tau, \tag{13.5.3} \]
\[ U_-(x) = \int_{0}^{x} \frac{u(\tau)}{(x - \tau)^{\alpha}}\,d\tau + e^{i\alpha\pi}\int_{x}^{1} \frac{u(\tau)}{(\tau - x)^{\alpha}}\,d\tau. \tag{13.5.4} \]

Here, as before, $U_+(x)$ and $U_-(x)$ denote the limits of $U(z)$ as $z \to x$ from above or
below, respectively. Equations (13.5.3) and (13.5.4) may be viewed as the appropriate
replacement for the Sokhotski–Plemelj formulas for Cauchy integrals. Since
\[ e^{i\alpha\pi}U_+(x) - e^{-i\alpha\pi}U_-(x) = 2i\sin\alpha\pi\int_{0}^{x} \frac{u(\tau)}{(x - \tau)^{\alpha}}\,d\tau, \tag{13.5.5} \]

the function u(x) can be determined from the knowledge of U+ (x) and U− (x), by
using the solution of a conventional Abel equation; see [207].
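Solving (13.5.3) and (13.5.4), we obtain
\[ \int_{0}^{x} \frac{u(\tau)}{(x - \tau)^{\alpha}}\,d\tau, \qquad \int_{x}^{1} \frac{u(\tau)}{(\tau - x)^{\alpha}}\,d\tau \]
in terms of $U_+(x)$ and $U_-(x)$, and use (13.5.1) to obtain
\[ U_+(x) = -e^{-i\alpha\pi}\,U_-(x) + (1 + e^{-i\alpha\pi})\,k(x) \tag{13.5.6} \]
for $x \in (0, 1)$. For $x \in (-\infty, 0)$, equation (13.5.2) gives
\[ U_+(x) = e^{-2i\alpha\pi}\,U_-(x). \tag{13.5.7} \]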

This is again a Riemann–Hilbert problem, but it involves two arcs $(-\infty, 0)$ and $(0, 1)$,
and the above-mentioned method no longer works. Fortunately, the coefficients in
(13.5.6) and (13.5.7) are constants. Trying a factor of the form $z^{\nu}(z - 1)^{\mu}$, we find
that the new function
\[ V(z) = z^{(\alpha-1)/2}(z - 1)^{(\alpha-1)/2}\,U(z) \]
reduces (13.5.7) to
\[ V_+(x) = V_-(x) \tag{13.5.8} \]
for $x \in (-\infty, 0)$. Furthermore, equation (13.5.6) becomes
\[ V_+(x) = V_-(x) - 2i\cos\frac{\alpha\pi}{2}\;x^{(\alpha-1)/2}(1 - x)^{(\alpha-1)/2}\,k(x) \tag{13.5.9} \]
for $x$ in $(0, 1)$, with
\[
\begin{aligned}
V_+(x) &= x^{(\alpha-1)/2}(1 - x)^{(\alpha-1)/2}\,e^{i\pi(\alpha-1)/2}\,U_+(x),\\
V_-(x) &= x^{(\alpha-1)/2}(1 - x)^{(\alpha-1)/2}\,e^{-i\pi(\alpha-1)/2}\,U_-(x)
\end{aligned} \tag{13.5.10}
\]
for $x \in (0, 1)$.


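The solution of (13.5.9) is
\[ V(z) = -\frac{1}{\pi}\cos\frac{\alpha\pi}{2}\int_{0}^{1} \frac{[\tau(1 - \tau)]^{(\alpha-1)/2}\,k(\tau)}{\tau - z}\,d\tau, \tag{13.5.11} \]
where we have allowed $U(z)$ to have algebraic singularities near the points $0$, $1$
of order not greater than $-\tfrac{1}{2}(\alpha + 1)$, which is equivalent to allowing $u(\tau)$ to have
nothing worse than an integrable algebraic singularity at each point. Computing
$V_+(x)$ and $V_-(x)$ and using equations (13.5.10), (13.5.5) and Exercise 7, we obtain
\[
\begin{aligned}
u(x) ={}& \frac{\sin\alpha\pi}{2\pi}\,\frac{d}{dx}\int_{0}^{x} \frac{k(t)}{(x - t)^{1-\alpha}}\,dt\\
&- \frac{\cos^2(\alpha\pi/2)}{\pi^2}\,\frac{d}{dx}\int_{0}^{x} \frac{[\tau(1 - \tau)]^{(1-\alpha)/2}}{(x - \tau)^{1-\alpha}}\left(\mathrm{p.v.}\!\int_{0}^{1} \frac{k(t)\,[t(1 - t)]^{(\alpha-1)/2}}{t - \tau}\,dt\right)d\tau.
\end{aligned}
\]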

13.6 Integral Equations with Logarithmic Kernels

Consider the integral equation
\[ \int_{-1}^{1} \log|x - t|\,u(t)\,dt = k(x), \qquad x \in (-1, 1). \tag{13.6.1} \]
To solve this equation, we define the function
\[ U(z) = \int_{-1}^{1} \log(z - t)\,u(t)\,dt. \tag{13.6.2} \]
For $x < -1$, we have
\[
\begin{aligned}
U_+(x) &= \int_{-1}^{1} \log|x - t|\,u(t)\,dt + i\pi\int_{-1}^{1} u(t)\,dt,\\
U_-(x) &= \int_{-1}^{1} \log|x - t|\,u(t)\,dt - i\pi\int_{-1}^{1} u(t)\,dt.
\end{aligned} \tag{13.6.3}
\]
But, for $x \in (-1, 1)$, there is a discontinuity and we have
\[
\begin{aligned}
U_+(x) &= \int_{-1}^{1} \log|x - t|\,u(t)\,dt + i\pi\int_{x}^{1} u(t)\,dt,\\
U_-(x) &= \int_{-1}^{1} \log|x - t|\,u(t)\,dt - i\pi\int_{x}^{1} u(t)\,dt;
\end{aligned} \tag{13.6.4}
\]
Exercise 8. To avoid the discontinuity, we can use $U'(z)$ instead of $U(z)$. Indeed, for
$x \in (-1, 1)$, we have
\[ U'_+(x) + U'_-(x) = 2k'(x); \]

Exercise 9. In terms of the function
\[ V(z) = U'(z)\sqrt{z^2 - 1}, \]
the last equation becomes
\[ V_+(x) - V_-(x) = 2i\sqrt{1 - x^2}\;k'(x) \]
for $x \in (-1, 1)$. The solution is
\[ V(z) = \frac{1}{\pi}\int_{-1}^{1} \frac{\sqrt{1 - t^2}\;k'(t)}{t - z}\,dt + \int_{-1}^{1} u(t)\,dt. \]
Note that the last term is a constant; cf. Theorem 13.1.4 and the following remark.
(In considering the behavior of $V(z)$ near $-1$ and $+1$, we have allowed $u(t)$ to have
an integrable singularity at each end point.) Since $U'_+(x) - U'_-(x) = -2\pi i\,u(x)$ by
(13.6.4), it follows that
\[ u(x) = \frac{1}{\sqrt{1 - x^2}}\left[\frac{1}{\pi^2}\,\mathrm{p.v.}\!\int_{-1}^{1} \frac{\sqrt{1 - t^2}\;k'(t)}{t - x}\,dt + \frac{1}{\pi}\int_{-1}^{1} u(t)\,dt\right]. \tag{13.6.5} \]

To obtain an expression for the second integral in (13.6.5), we first note that if
$k(x) \equiv 1$, then (13.6.1) can be used to show that the integral
\[ \int_{-1}^{1} \frac{\log|x - t|}{\sqrt{1 - t^2}}\,dt \]
is a constant. Setting $x = 0$ shows that the value of the integral is $-\pi\log 2$; Exercise
10. Multiplying (13.6.1) by $(1 - x^2)^{-1/2}$ and integrating from $-1$ to $1$, we obtain
\[ \int_{-1}^{1} u(t)\,dt = -\frac{1}{\pi\log 2}\int_{-1}^{1} \frac{k(t)}{\sqrt{1 - t^2}}\,dt. \]
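The claim that the logarithmic integral is independent of $x$ and equals $-\pi\log 2$ can be tested numerically. This is a rough sketch, assuming the substitution $t = \cos\theta$ and a plain midpoint rule in $\theta$; the function name and the node count are illustrative, and the rule converges only slowly because of the logarithmic singularity.

```python
import math

def log_integral(x, n=2000):
    """Midpoint rule in theta for int_{-1}^{1} log|x - t| / sqrt(1 - t^2) dt, with t = cos(theta)."""
    total = 0.0
    for j in range(1, n + 1):
        theta = (2 * j - 1) * math.pi / (2 * n)
        total += math.log(abs(x - math.cos(theta)))
    return math.pi * total / n

target = -math.pi * math.log(2.0)
at_zero = log_integral(0.0)
at_half = log_integral(0.5)
```

Both values should agree with `target` to within the accuracy of the quadrature, consistent with the integral being constant on $(-1, 1)$.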

Inserting this into (13.6.5) yields Carleman’s formula
\[ u(x) = \frac{1}{\pi^2\sqrt{1 - x^2}}\left[\mathrm{p.v.}\!\int_{-1}^{1} \frac{\sqrt{1 - t^2}\;k'(t)}{t - x}\,dt - \frac{1}{\log 2}\int_{-1}^{1} \frac{k(t)}{\sqrt{1 - t^2}}\,dt\right]. \tag{13.6.6} \]
As an extension of equation (13.6.1), we now consider the more general equation
\[ \int_{-1}^{1} \bigl[\log|x - t|\;p(x - t) + q(x - t)\bigr]\,u(t)\,dt = k(x) \tag{13.6.7} \]
for $x \in (-1, 1)$, where $p(x)$ and $q(x)$ are polynomials. As in the previous case, we
first define the function
\[ U(z) = \frac{1}{\sqrt{z^2 - 1}}\int_{-1}^{1} \left[p(z - t)\log\frac{z - t}{z + 1} + q(z - t)\right] u(t)\,dt, \tag{13.6.8} \]

which is single-valued in the $z$-plane with a cut along the real axis from $-1$ to $1$; see
(13.6.2). For $x \in (-1, 1)$, we have
\[ U_+(x) = \frac{-i}{\sqrt{1 - x^2}}\left[k(x) - \log(x + 1)\int_{-1}^{1} p(x - t)u(t)\,dt + i\pi\int_{x}^{1} p(x - t)u(t)\,dt\right], \tag{13.6.9} \]
\[ U_-(x) = \frac{i}{\sqrt{1 - x^2}}\left[k(x) - \log(x + 1)\int_{-1}^{1} p(x - t)u(t)\,dt - i\pi\int_{x}^{1} p(x - t)u(t)\,dt\right], \tag{13.6.10} \]
and hence
\[ U_+(x) - U_-(x) = \frac{-2i}{\sqrt{1 - x^2}}\left[k(x) - \log(x + 1)\int_{-1}^{1} p(x - t)u(t)\,dt\right]. \tag{13.6.11} \]
Examining the behavior of $U(z)$ at $\infty$ as well as at the endpoints $-1$ and $1$, we
conclude that
\[ U(z) = -\frac{1}{\pi}\int_{-1}^{1} \frac{1}{\sqrt{1 - t^2}}\left[k(t) - \log(1 + t)\int_{-1}^{1} p(t - r)u(r)\,dr\right]\frac{dt}{t - z} + R(z); \tag{13.6.12} \]
cf. (13.1.1) and (13.6.11). Here, $R(z)$ is that part of the Laurent series for $U(z)$, in
the region outside the unit circle, which does not involve negative powers of $z$. From
(13.6.8), $R(z)$ may be expressed in terms of a finite number of unknown constants
$c_n$ defined by
\[ c_n = \int_{-1}^{1} t^{n}u(t)\,dt, \qquad n \ge 0. \tag{13.6.13} \]
These same constants $c_n$ also occur in the term coming from $\int_{-1}^{1} p(t - r)u(r)\,dr$
in (13.6.12). Hence, except for a finite number of these $c_n$, $U(z)$ is known. From
(13.6.9) and (13.6.10), we also have
\[ U_+(x) + U_-(x) = \frac{2\pi}{\sqrt{1 - x^2}}\int_{x}^{1} p(x - t)u(t)\,dt. \tag{13.6.14} \]

By using Laplace transforms, one can show that
\[ u(x) = \frac{d}{dx}\int_{x}^{1} M(t - x)\left\{\frac{1}{2\pi}\sqrt{1 - t^2}\,\bigl[U_+(t) + U_-(t)\bigr]\right\}'\,dt, \tag{13.6.15} \]
where $M(t)$ is the inverse transform of $[s^2 P_1(s)]^{-1}$, $P_1(s)$ being the transform of
$p(-t)$; Exercise 11. From (13.6.12), simple calculation shows that
\[ U_+(x) + U_-(x) = -\frac{2}{\pi}\,\mathrm{p.v.}\!\int_{-1}^{1} \frac{1}{\sqrt{1 - t^2}}\left\{k(t) - \log(1 + t)\int_{-1}^{1} p(t - r)u(r)\,dr\right\}\frac{dt}{t - x} + 2R(x). \tag{13.6.16} \]

Thus, the solution is complete, except for the evaluation of the constants cn . A set of
linear algebraic equations for the cn may also be obtained by multiplying equation
(13.6.15) by appropriate powers of t and integrating over (−1, 1).
For the special case $p(t) = 1$ and $q(t) = 0$, the result is
\[ u(x) = \frac{d}{dx}\left[\sqrt{1 - x^2}\;\frac{1}{\pi^2}\,\mathrm{p.v.}\!\int_{-1}^{1} \frac{k(t) - c_0\log(1 + t)}{\sqrt{1 - t^2}\,(t - x)}\,dt\right], \tag{13.6.17} \]
where $c_0$ is a constant. The value of the constant $c_0$ can be determined from the
condition that $U(z)$, as given in (13.6.12), has no term of order $1/z$ as $z \to \infty$ [cf. (13.6.8)
with $p(t) = 1$ and $q(t) = 0$]. This yields
\[ c_0 = -\frac{1}{\pi\log 2}\int_{-1}^{1} \frac{k(t)}{\sqrt{1 - t^2}}\,dt \tag{13.6.18} \]
as before; Exercise 14.

13.7 Singular Integral Equations

We conclude this chapter with a discussion of the singular integral equation
\[ a(x)u(x) + \frac{b(x)}{\pi i}\,\mathrm{p.v.}\!\int_{L} \frac{u(t)}{t - x}\,dt = c(x), \tag{13.7.1} \]
where $a(x)$, $b(x)$, $c(x)$ satisfy a Hölder condition on $L$, and $a \pm b \neq 0$ on $L$. Solving
this equation is equivalent to finding the function defined by the Cauchy integral
\[ U(z) = \frac{1}{2\pi i}\int_{L} \frac{u(t)}{t - z}\,dt \tag{13.7.2} \]
associated with the Riemann–Hilbert problem
\[ U_+(t) = g(t)\,U_-(t) + f(t), \quad t \in L; \qquad U_-(\infty) = 0, \tag{13.7.3} \]
where
\[ g(t) \equiv \frac{a(t) - b(t)}{a(t) + b(t)}, \qquad f(t) \equiv \frac{c(t)}{a(t) + b(t)}. \tag{13.7.4} \]

To show that finding a solution to equation (13.7.1) reduces to solving the Riemann–Hilbert
problem (13.7.3), one can use the Sokhotski–Plemelj formulas for $U(z)$, that
is,
\[ U_+(t) - U_-(t) = u(t), \qquad U_+(t) + U_-(t) = \frac{1}{\pi i}\,\mathrm{p.v.}\!\int_{L} \frac{u(\tau)}{\tau - t}\,d\tau. \tag{13.7.5} \]
Substituting these equations into (13.7.1), we obtain (13.7.3). The converse is also
true, that is, if the Cauchy integral $U(z)$ in (13.7.2) is the solution of the Riemann–Hilbert
problem (13.7.3) with boundary condition $U_-(\infty) = 0$, then the function
$u(t)$ in (13.7.5) is a solution of the integral equation (13.7.1); see Muskhelishvili
[149].
Singular integral equations of the form (13.7.1) play an important role in studying
the more general equation
\[ a(t)u(t) + \frac{1}{\pi i}\,\mathrm{p.v.}\!\int_{L} \frac{K(t, \tau)\,u(\tau)}{\tau - t}\,d\tau = c(t). \tag{13.7.6} \]
Writing $K(t, \tau) = K(t, t) + [K(t, \tau) - K(t, t)]$, and letting $b(t) \equiv K(t, t)$ and
$F(t, \tau) \equiv \frac{1}{i\pi}[K(t, \tau) - K(t, t)]/(\tau - t)$, we get
\[ a(t)u(t) + \frac{b(t)}{i\pi}\,\mathrm{p.v.}\!\int_{L} \frac{u(\tau)}{\tau - t}\,d\tau + \int_{L} F(t, \tau)\,u(\tau)\,d\tau = c(t). \tag{13.7.7} \]
Equations of the type (13.7.7) are much more complicated to study than equation
(13.7.1). Here we only note that if $F(t, \tau)$ is degenerate, i.e. if $F(t, \tau) = \sum_{j=1}^{n} H_j(t)H_j(\tau)$,
then equation (13.7.7) can also be solved in closed form.

Example. Consider the singular integral equation
\[ (t + t^{-1})\,u(t) + \frac{t - t^{-1}}{\pi i}\,\mathrm{p.v.}\!\int_{\Gamma} \frac{u(\tau)}{\tau - t}\,d\tau - \frac{1}{2\pi i}\int_{\Gamma} (t + t^{-1})(\tau + \tau^{-1})\,u(\tau)\,d\tau = 2t^2, \tag{13.7.8} \]
where $\Gamma$ is the unit circle. The kernel $(t + t^{-1})(\tau + \tau^{-1})$ is degenerate. Hence,
according to the remark above, it is expected that equation (13.7.8) is solvable in
closed form. Let
\[ A = \frac{1}{2\pi i}\int_{\Gamma} (\tau + \tau^{-1})\,u(\tau)\,d\tau. \]
Equation (13.7.8) can then be written as
\[ (t + t^{-1})\,u(t) + \frac{t - t^{-1}}{\pi i}\,\mathrm{p.v.}\!\int_{\Gamma} \frac{u(\tau)}{\tau - t}\,d\tau = 2t^2 + (t + t^{-1})\cdot A. \]
By the Sokhotski–Plemelj formula in (13.7.5), the above equation is equivalent to
the Riemann–Hilbert problem
\[ (t + t^{-1})\bigl[U_+(t) - U_-(t)\bigr] + (t - t^{-1})\bigl[U_+(t) + U_-(t)\bigr] = 2t^2 + (t + t^{-1})\cdot A, \]
which can be reduced to
\[ U_+(t) = t^{-2}\,U_-(t) + t + \tfrac{1}{2}(1 + t^{-2})\cdot A, \qquad U_-(\infty) = 0; \tag{13.7.9} \]
see (13.7.3).
We now return to the homogeneous Riemann–Hilbert problem (13.2.5):
\[ L_+(\zeta) = g(\zeta)\,L_-(\zeta) \tag{13.7.10} \]
and the nonhomogeneous Riemann–Hilbert problem (13.2.4):
\[ W_+(\zeta) = g(\zeta)\,W_-(\zeta) + f(\zeta). \tag{13.7.11} \]
In our case, $g(\zeta) = \zeta^{-2}$ and
\[ f(\zeta) = \zeta + \tfrac{1}{2}(1 + \zeta^{-2})A. \tag{13.7.12} \]
With $g(\zeta) = \zeta^{-2}$, the homogeneous problem is simply
\[ L_+(\zeta) = \zeta^{-2}\,L_-(\zeta). \tag{13.7.13} \]
By inspection we can take
\[ L_+(\zeta) = 1, \qquad L_-(\zeta) = \zeta^{2}. \tag{13.7.14} \]
Substituting (13.7.13) into (13.7.11) (i.e. replacing $g(\zeta)$ by $L_+(\zeta)/L_-(\zeta)$) gives
\[ \frac{W_+(\zeta)}{L_+(\zeta)} - \frac{W_-(\zeta)}{L_-(\zeta)} = \frac{f(\zeta)}{L_+(\zeta)}; \tag{13.7.15} \]

see (13.2.6). From (13.1.2) it follows that the function
\[ F(\zeta) = \frac{1}{2\pi i}\int_{\Gamma} \frac{f(\tau)/L_+(\tau)}{\tau - \zeta}\,d\tau \tag{13.7.16} \]
can be written as
\[ \frac{f(\zeta)}{L_+(\zeta)} = F_+(\zeta) - F_-(\zeta). \tag{13.7.17} \]
Coupling (13.7.15) and (13.7.17) yields
\[ \frac{W(\zeta)}{L(\zeta)} = F(\zeta) + p_{n-1}(\zeta), \tag{13.7.18} \]
where $p_{n-1}(\zeta)$ is a polynomial; see Theorem 13.1.4 and the following remark. Note that
in our case, the $W$ in (13.7.11) is just the $U$ in (13.7.9). Thus, the boundary condition
$U_-(\infty) = 0$ and the function $L_-(\zeta) = \zeta^{2}$ in (13.7.14) imply that the left-hand side
of (13.7.18) is of the order $o(\zeta^{-2})$. From (13.7.16), it is easily seen that the function
$F(\zeta)$ on the right-hand side of (13.7.18) has the asymptotic expansion
\[
F(\zeta) \sim \frac{i}{2\pi}\int_{\Gamma} \frac{f(\tau)}{L_+(\tau)}\left[\frac{1}{\zeta} + \frac{\tau}{\zeta^{2}} + \frac{\tau^{2}}{\zeta^{3}} + \cdots\right]d\tau \sim \sum_{s=0}^{\infty} \frac{c_s}{\zeta^{s+1}}, \qquad \zeta \to \infty. \tag{13.7.19}
\]
Balancing the terms on both sides, it follows readily that the polynomial $p_{n-1}(\zeta)$ in
(13.7.18) is zero and the coefficients $c_0$ and $c_1$ in (13.7.19) must vanish, i.e.
\[ \int_{\Gamma} \left[\tau + \frac{A}{2}(1 + \tau^{-2})\right]d\tau = 0, \qquad \int_{\Gamma} \left[\tau + \frac{A}{2}(1 + \tau^{-2})\right]\tau\,d\tau = 0; \]
see (13.7.12) and (13.7.14). The first equation automatically holds, but the second
equation requires that $A = 0$. Thus, from (13.7.12), we have $f(\zeta) = \zeta$. With
$p_{n-1}(\zeta) = 0$, $f(\tau) = \tau$ and $W(\zeta) = U(\zeta)$, we obtain from (13.7.18), (13.7.16),
and (13.7.14)
\[ U(\zeta) = \frac{L(\zeta)}{2\pi i}\int_{\Gamma} \frac{\tau}{\tau - \zeta}\,d\tau = \begin{cases} \zeta, & \zeta \text{ inside the circle},\\ 0, & \zeta \text{ outside the circle}. \end{cases} \]

Returning to (13.7.9) and (13.7.12), we have $U_+(t) = t$, $U_-(t) = 0$ and $f(\zeta) = \zeta$.
Therefore, we conclude from (13.7.5) that equation (13.7.8) has the unique solution
$u(t) = t$ if the constant $A$ defined above is zero, which is indeed the case since
\[ A = \frac{1}{2\pi i}\int_{\Gamma} (\tau + \tau^{-1})\,u(\tau)\,d\tau = \frac{1}{2\pi i}\int_{\Gamma} (\tau + \tau^{-1})\,\tau\,d\tau = 0. \]
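The solution $u(t) = t$ can be substituted back into (13.7.8) numerically. The sketch below assumes a simple quadrature on the unit circle with $N$ equally spaced nodes and evaluates the principal value at a point lying midway between two nodes, a discretization chosen here for illustration and not taken from the text; for Laurent polynomials of low degree this rule reproduces the principal value up to rounding.

```python
import cmath
import math

N = 64
nodes = [cmath.exp(2j * math.pi * j / N) for j in range(N)]
d_theta = 2.0 * math.pi / N

def u(tau):
    return tau                           # candidate solution u(t) = t

def pv_cauchy(t):
    """(1/(pi i)) p.v. int_Gamma u(tau)/(tau - t) d tau, for t midway between two nodes."""
    s = sum(u(tau) * 1j * tau * d_theta / (tau - t) for tau in nodes)
    return s / (math.pi * 1j)

def degenerate_term(t):
    """(1/(2 pi i)) int_Gamma (t + 1/t)(tau + 1/tau) u(tau) d tau."""
    s = sum((tau + 1 / tau) * u(tau) * 1j * tau * d_theta for tau in nodes)
    return (t + 1 / t) * s / (2j * math.pi)

def residual(t):
    lhs = (t + 1 / t) * u(t) + (t - 1 / t) * pv_cauchy(t) - degenerate_term(t)
    return lhs - 2 * t * t

t = cmath.exp(1j * math.pi * 11 / N)     # midway between nodes 5 and 6
res = abs(residual(t))
```

A residual at rounding level indicates that the principal value term equals $t$ and the degenerate term vanishes, in agreement with $U_+(t) = t$, $U_-(t) = 0$ and $A = 0$.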

13.8 The other Riemann–Hilbert problem

There is another problem that was studied in various forms by Riemann and, later,
Hilbert. It is (at least) equally well known by the term Riemann–Hilbert problem.
Consider a linear differential equation
\[ f^{(p)}(z) + q_1(z)\,f^{(p-1)}(z) + \cdots + q_p(z)\,f(z) = 0, \tag{13.8.1} \]
where the coefficients $\{q_k\}$ are rational functions. Let $P$ be the set of poles of the
$\{q_k\}$ in $S$. Fix a coordinate disk $D$ in $\Omega = S \setminus P$, centered at a point $z_0$. There is a
basis $f_1, f_2, \ldots, f_p$ of solutions of (13.8.1) defined in $D$. If $\gamma$ is a closed curve in $\Omega$
that begins and ends at $z_0$, then each $f_j$ can be continued along $\gamma$ to another solution
$\tilde f_j$, giving a second basis of solutions defined in $D$, related to the original set by a
matrix $A_\gamma$ in the group $GL(n, \mathbb{C})$ of $n \times n$ invertible complex matrices. This gives
a homomorphism $\chi$ from the fundamental group to the $n \times n$ matrices:
\[ \chi : \pi_1(\Omega) \to GL(n). \tag{13.8.2} \]

The image is called the monodromy group of the equation.


The equation (13.8.1) can be reformulated as a system of equations of first order.
If the resulting singular points (including the point at $\infty$, if it is singular) are simple
poles, then (13.8.1) is said to be of Fuchsian type. After changing coordinates by an
element of $\mathrm{Aut}(S)$ if necessary, we may assume that $\infty$ is a regular point. Then the
first-order system has the form
\[ f_j'(z) = \sum_{k=1}^{n} \frac{1}{z - a_k}\,B_{jk}, \qquad 1 \le j \le p, \tag{13.8.3} \]
where the $a_k$ are distinct, the $B_{jk}$ are constant, and
\[ \sum_{k=1}^{n} B_{jk} = 0. \tag{13.8.4} \]

Then Problem XXI in Hilbert’s famous list of problems [106] can be formulated as
follows:
Let the representation (13.8.2) be given. Prove that there is always a system (13.8.3), (13.8.4)
with the given monodromy (13.8.2).

As it turns out, this can be done for equations of degree ≤ 3 or with ≤ 3 sin-
gularities. Bolibrukh [29] showed that otherwise there are counter-examples, so the
problem, in Hilbert’s formulation, has a negative solution. For a full treatment, see
Anosov and Bolibrukh [10].

Exercises

1. Prove that the Hölder continuity condition at $t_0$ in Theorem 13.1.1 can be replaced
by the weaker condition
\[ \int_{\Gamma} \frac{|f(t) - f(t_0)|}{|t - t_0|}\,|dt| < \infty. \]

2. Prove (13.1.8).
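3. Let $\Gamma_y = \{x + iy : -\infty < x < \infty\}$, $\Gamma_+ = \lim_{y\to 0+}\Gamma_y$ and $\Gamma_- = \lim_{y\to 0-}\Gamma_y$.
Consider the contour integrals
\[ I_{\pm} = \int_{\Gamma_{\pm}} \frac{e^{iz}}{z}\,dz. \]
(a) Show that
\[ I_{\pm} = \int_{\Gamma_{\pm}} \frac{e^{iz}}{i z^{2}}\,dz, \]
thus proving that the contour integrals are convergent and well defined.
(b) Use Cauchy’s integral formula to show that
\[ I_- - I_+ = 2\pi i, \]
and
\[ I_+ = \lim_{R\to\infty}\int_{\pi}^{0} \frac{e^{iR(\cos\theta + i\sin\theta)}}{R e^{i\theta}}\,d\theta = 0, \]
hence $I_- = 2\pi i$.
(c) For any $\pm\operatorname{Re} w > 0$, we have
\[ \int_{\mathbb{R}} \frac{e^{ix}}{x + iw}\,dx = \int_{\mathbb{R} + iw} \frac{e^{i(z - iw)}}{z}\,dz = \begin{cases} 0, & \operatorname{Re} w > 0,\\ 2\pi i\,e^{w}, & \operatorname{Re} w < 0. \end{cases} \]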

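4. (a) Use the results in Exercise 3 to show that if $x_1 > 0$ then
\[ \int_{\mathbb{R}} \frac{e^{i x_1\xi_1}}{\xi_1 + i\xi_2}\,d\xi_1 = \begin{cases} 0, & \operatorname{Re}\xi_2 > 0,\\ 2\pi i\,e^{x_1\xi_2}, & \operatorname{Re}\xi_2 < 0, \end{cases} \]
and if $x_1 < 0$ then
\[ \int_{\mathbb{R}} \frac{e^{i x_1\xi_1}}{\xi_1 + i\xi_2}\,d\xi_1 = \begin{cases} -2\pi i\,e^{x_1\xi_2}, & \operatorname{Re}\xi_2 > 0,\\ 0, & \operatorname{Re}\xi_2 < 0. \end{cases} \]
(b) Show that for $a \in \mathbb{R}$ and $b > 0$,
\[ \iint_{\mathbb{R}^2} \frac{e^{i(x_1\xi_1 + x_2\xi_2)}}{\xi_1 + (a \pm ib)\xi_2}\,d\xi_1\,d\xi_2 = \frac{\pm 2\pi}{x_2 - (a \pm ib)x_1}. \]
(c) Use (b) to conclude that
\[ \iint_{\mathbb{R}^2} \frac{e^{i(x_1\xi_1 + x_2\xi_2)}}{\xi_1 + k\xi_2}\,d\xi_1\,d\xi_2 = \frac{2\pi\,\operatorname{sgn}(\operatorname{Im} k)}{x_2 - k x_1}, \qquad \operatorname{Im} k \neq 0, \]
which proves (13.3.16).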


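5. (a) Show that equation (13.3.14) can be written as
\[ \mu^{\pm}(x_1, x_2, k) = \mp\int_{-\infty}^{\infty}\left[\frac{1}{2\pi i}\int_{-\infty}^{\infty} \frac{q(y_1, y_2)}{y_2 - [x_2 - k(x_1 - y_1)]}\,dy_2\right]dy_1. \]
(b) By applying the Sokhotski–Plemelj formula to the equations in (a), prove
that for $k \in \mathbb{R}$,
\[
\begin{aligned}
\mu^{\pm}(x_1, x_2, k) ={}& \pm\frac{1}{2\pi i}\,\mathrm{p.v.}\!\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \frac{q(y_1, y_2)}{(x_2 - y_2) - k(x_1 - y_1)}\,dy_2\,dy_1\\
&+ \frac{1}{2}\left(\int_{-\infty}^{x_1} - \int_{x_1}^{\infty}\right) q\bigl(y_1,\, x_2 - k(x_1 - y_1)\bigr)\,dy_1,
\end{aligned}
\]
which gives the formula in (13.3.17).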


6. Prove (13.4.11).
7. Prove the two equations in (13.6.4).
8. Prove the identity
\[ U'_+(x) + U'_-(x) = 2k'(x), \qquad x \in (-1, 1), \]
where $U(z)$ is defined in (13.6.2).

9. (a) By taking $k = 1$ in (13.6.1) and (13.6.5), show that the integral
\[ \int_{-1}^{1} \frac{\log|x - t|}{\sqrt{1 - t^2}}\,dt \]
is a constant.
(b) By setting $x = 0$ in (a), show that the value of the integral in (a) is $-\pi\log 2$.
10. When the parameters are positive, the usual Laplace transform is defined by
 ∞
F(x) = L f (s) = f (t)e−st dt, s > 0. (1)
0
When the parameters are negative, one can use a different notation. For instance,
for θ < 0, we define
 0
F(θ ) = L f (θ ) = f (x)e−θ x d x. (2)
−∞

Define
fˆ(y) = f (−y). (3)

Show that

L f (θ ) = L fˆ (−θ ). (4)

11. Let $g(x) = \frac{1}{2\pi}\sqrt{1 - x^2}\,[U_+(x) + U_-(x)]$, so that equation (13.6.14) becomes
\[ g(x) = \int_{x}^{1} p(x - t)\,u(t)\,dt. \tag{5} \]

Define
ũ(t) = u(t + 1) and g̃(x − 1) = g(x). (6)

(a) Show that  0


g̃(x − 1) = p(x − 1 − t)ũ(t) dt, (7)
x−1

and
g̃(0) = g(1) = 0. (8)

(b) With the Laplace transforms defined in Exercise 10, show that
L g̃ (θ ) = L ũ (θ ) · L p (θ ); (9)

equivalently,
G̃(θ ) = Ũ (θ )P(θ ). (10)

(c) Using integration by parts, show that


1
L g̃ (θ ) = L g̃ (θ ). (11)
θ

If I (x) denotes the integral


 0
I (x) = ũ(t) dt, (12)
x

then show
1
L I (θ ) = − L ũ (θ ). (13)
θ
(d) Use equation (9) to conclude that
1 L g̃ (θ )
L I (θ ) = − . (14)
θ 2 L p (θ )

12. Recall equation (13.6.15): $M(t)$ is the inverse Laplace transform of $[s^2 P_1(s)]^{-1}$,
where $P_1(s)$ is the Laplace transform of $p(-t)$, i.e.
(a) 1 (b)
L M (s) = , P1 (s) = L p(−t) (s). (15)
s 2 P1 (s)

(a) Show that


L p̂ (−θ ) = P1 (θ ).

(b) Use (4), (14) and (15) to show that


L I (θ ) = −L M̂ (θ )L g̃ (θ ), (16)

where M̂(x) = M(−x).


13. (a) By interchanging the order of integration, show that
 0  0
−xθ
e M̂(x − t)g̃(t) dt d x = L M̂ (θ )L g̃ (θ )
−∞ x

(b) Use the results in (a) and Exercise 12 (b) to conclude


 0  0
ũ(t) dt = − M(t − x)g̃ (t) dt.
x x

(c) Now prove the formula in (13.6.15).

14. Prove (13.6.18). Hint: use the result in Exercise 9.

Remarks and further reading

The Riemann–Hilbert problem and applications to singular integral equations in C are


treated in depth in Vekua [210], [211]. An important generalization of the Riemann–
Hilbert factorization problem takes the function to be factored to be a matrix-valued
function. This makes it possible to treat matrix-valued singular integral equations; see
Clancey and Gohberg [45]. Calderón and Zygmund [37] developed a far-reaching gen-
eralization of the theory of singular integral equations in the plane; see Stein [194],
Christ [44], or Peyrière [167]. The Riemann–Hilbert problem plays a crucial part in
several areas of asymptotic analysis, including random matrices; see Deift [53].
An active area of application of both Riemann–Hilbert problems is the study of
inverse scattering and integrable systems of nonlinear partial differential equations.
For use of the first version of Riemann–Hilbert, see Beals, Deift, and Zhou [19]
and Deift and Zhou [54]. For use of the second version of Riemann–Hilbert, see the
expository article by Its [115].
Chapter 14
Asymptotics and Darboux’s method

Suppose that $f$ is holomorphic in a domain that includes the unit disk $\mathbb{D}$. Then its
Maclaurin expansion
\[ f(z) = \sum_{n=0}^{\infty} a_n z^{n} \tag{14.0.1} \]
converges in $\mathbb{D}$. The coefficients $a_n$ are determined by $f$: on any smaller circle
centered at the origin we have the integral representation
\[ a_n = \frac{1}{2\pi i}\int_{|z|=r} \frac{f(z)}{z^{n+1}}\,dz. \]

A problem that arises frequently in number theory [96], combinatorics [75] and
orthogonal polynomials [114] is to determine the asymptotic behavior of the an .
One such problem that we treat in this chapter involves the Legendre polynomials
$\{P_n\}$ that play a role in Chapter 4. The generating function for these polynomials can
be written as
\[ f_\theta(z) = \sum_{n=0}^{\infty} P_n(\cos\theta)\,z^{n} = \frac{1}{(e^{i\theta} - z)^{1/2}(e^{-i\theta} - z)^{1/2}}, \tag{14.0.2} \]
where the branches are chosen such that $(e^{\pm i\theta} - z)^{1/2} \to e^{\pm i\theta/2}$ as $z \to 0$. For fixed
$\theta$, $0 < \theta < \pi$, $f_\theta(z)$ is holomorphic in $\mathbb{D}$ and the restriction of $f_\theta$ to $\Gamma = \partial\mathbb{D}$ has two
algebraic singularities that coalesce as $\theta \to 0$. Thus the asymptotics of the Maclaurin
coefficients of $f_\theta$ are the asymptotics of $P_n(\cos\theta)$.
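The generating function (14.0.2) can be compared with a truncated series for real $z$ in $(-1, 1)$, where the product of the two factors is $1 - 2z\cos\theta + z^2 > 0$. This is a minimal sketch, assuming the standard three-term recurrence for the Legendre polynomials; the values of `theta`, `z` and the truncation order are illustrative.

```python
import math

def legendre(n_max, x):
    """P_0(x), ..., P_{n_max}(x) via (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}."""
    values = [1.0, x]
    for n in range(1, n_max):
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values[: n_max + 1]

theta, z = 1.1, 0.4
x = math.cos(theta)

# truncated Maclaurin series with coefficients P_n(cos(theta))
series = sum(p * z ** n for n, p in enumerate(legendre(80, x)))

# closed form: (e^{i theta} - z)(e^{-i theta} - z) = 1 - 2 z cos(theta) + z^2
closed = 1.0 / math.sqrt(1.0 - 2.0 * x * z + z * z)
```

Since $|P_n(\cos\theta)| \le 1$, the truncation error for $|z| = 0.4$ after 80 terms is far below rounding level.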
As this example suggests, we might want to extract information about the an from
f on Γ = ∂D, under the assumption that the singularities of f on Γ are somehow
manageable. Darboux [51] was the first to consider problems of this nature. Darboux
considered the case of finitely many distinct algebraic singularities. This work is
described in Section 14.1. Recent extensions of the Darboux method are the subject
of the remaining sections: logarithmic singularities in Section 14.2 and coalescing
singularities in Section 14.3. Section 14.4 is devoted to showing that the result on
coalescing singularities gives an asymptotic expansion for the coefficients. In Section
14.5, these results are applied to the case of the Heisenberg polynomials.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
R. Beals and R. S. C. Wong, More Explorations in Complex Functions, Graduate Texts
in Mathematics 298, https://doi.org/10.1007/978-3-031-28288-1_14

14.1 Algebraic singularities

Suppose that $f$ is holomorphic on $D$ and on $\Gamma = \partial D$ except at finitely many distinct
points $s_j$ where, in a neighborhood,
\[ f(z) = (s_j - z)^{\alpha_j} g_j(z), \qquad 1 \le j \le l, \tag{14.1.1} \]
where $g_j(z)$ is holomorphic at $z = s_j$, $\alpha_j$ is a complex number, and the branch of
$(s_j - z)^{\alpha_j}$ is holomorphic on $D$.
For simplicity, we look first at the case of a single singularity at $a \in \Gamma$. The
associated function $g$ has an expansion
\[ g(z) = \sum_{r=0}^{\infty} c_r (a - z)^r. \tag{14.1.2} \]
The $m$th Darboux approximant of $f(z)$ is defined by
\[ f_m(z) = \sum_{r=0}^{m} c_r (a - z)^{r+\alpha}. \tag{14.1.3} \]
Since $f_m(z)$ is holomorphic in $D$, it has a Maclaurin expansion
\[ f_m(z) = \sum_{n=0}^{\infty} b_{mn} z^n. \tag{14.1.4} \]
From (14.1.3), a simple calculation gives
\[ b_{mn} = \frac{1}{n!} f_m^{(n)}(0) = (-1)^n \sum_{r=0}^{m} c_r \binom{r+\alpha}{n} a^{r+\alpha-n}. \tag{14.1.5} \]

By Cauchy’s theorem, we have from (14.1.4) and (14.0.1)
\[ a_n - b_{mn} = \frac{1}{2\pi i} \int_{\Gamma} \frac{f(z) - f_m(z)}{z^{n+1}}\, dz, \tag{14.1.6} \]
where $\Gamma$ is any closed contour that encloses the origin and lies in $D$. For convenience, we
let
\[ \varepsilon_m(z) = f(z) - f_m(z) \tag{14.1.7} \]
and
\[ \delta_m(n) = \frac{1}{2\pi i} \int_{\Gamma} \varepsilon_m(z)\, z^{-n-1}\, dz. \tag{14.1.8} \]
In view of (14.1.5), equation (14.1.6) can be written as
\[ a_n = (-1)^n \sum_{r=0}^{m} c_r \binom{r+\alpha}{n} a^{r+\alpha-n} + \delta_m(n), \tag{14.1.9} \]
where
\[ \binom{r+\alpha}{n} = \frac{(r+\alpha)(r+\alpha-1)\cdots(r+\alpha-n+1)}{n!}. \]

The functional equation for the gamma function is $\Gamma(z+1) = z\Gamma(z)$, so (14.1.9)
can be rewritten as
\[ a_n = \sum_{r=0}^{m} c_r\, \frac{\Gamma(n-\alpha-r)}{n!\,\Gamma(-\alpha-r)}\, a^{r+\alpha-n} + \delta_m(n). \tag{14.1.10} \]
We claim that
\[ \delta_m(n) = o(n^{-\alpha-m-1}) \quad \text{as } n \to \infty. \tag{14.1.11} \]

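Integration by parts $N$ times gives
\[ \delta_m(n) = \frac{(n-N)!}{n!}\, \frac{1}{2\pi i} \int_{\Gamma} \varepsilon_m^{(N)}(z)\, z^{-(n-N+1)}\, dz. \tag{14.1.12} \]
Since
\[ \varepsilon_m(z) = c_{m+1}(a-z)^{m+\alpha+1} + c_{m+2}(a-z)^{m+\alpha+2} + \cdots \]
in a neighborhood of $z = a$, we have
\[ \varepsilon_m^{(N)}(z) = O\big((a-z)^{m+\alpha+1-N}\big) \tag{14.1.13} \]
as $z \to a$. As long as $N$ satisfies
\[ m + \operatorname{Re}\alpha + 1 \le N < m + \operatorname{Re}\alpha + 2, \tag{14.1.14} \]
the function $(a-z)^{m+\alpha+1-N}$ is integrable on $\partial D$, so the contour $\Gamma$ can be expanded to $\partial D$
and (14.1.12) becomes
\[ \delta_m(n) = \frac{1}{2\pi}\, \frac{(n-N)!}{n!} \int_0^{2\pi} \varepsilon_m^{(N)}(e^{i\theta})\, e^{-i(n-N)\theta}\, d\theta. \tag{14.1.15} \]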

Let us emphasize here that $N = N(m) \sim m$. Since the integrand in (14.1.15) is absolutely
integrable, it follows from the Riemann–Lebesgue lemma (Exercise 2) that
\[ \delta_m(n) = o\left(\frac{(n-N)!}{n!}\right) = o(n^{-N}) \quad \text{as } n \to \infty, \tag{14.1.16} \]
which in view of (14.1.14) establishes our claim in (14.1.11). Thus, (14.1.10) gives
\[ a_n \sim \sum_{r=0}^{\infty} c_r\, \frac{\Gamma(n-\alpha-r)}{n!\,\Gamma(-\alpha-r)}\, a^{r+\alpha-n}, \qquad n \to \infty. \tag{14.1.17} \]
Let us remark that as $n \to \infty$,
\[ \frac{\Gamma(n-\alpha-r)}{n!} = \frac{\Gamma(n-\alpha-r)}{\Gamma(n+1)} \sim n^{-\alpha-r-1}; \tag{14.1.18} \]
see Exercise 1.
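The limit in (14.1.18) is easy to probe numerically. The following is a minimal sketch, not part of the text: the function names `gamma_ratio` and `scaled_ratio` are hypothetical, real $\alpha$ is assumed, and the quotient is evaluated through `math.lgamma` to avoid overflow. If (14.1.18) holds, the scaled ratio should approach 1 as $n$ grows.

```python
import math

def gamma_ratio(n: int, alpha: float, r: int) -> float:
    """Gamma(n - alpha - r) / n!, evaluated through log-gamma (needs n - alpha - r > 0)."""
    return math.exp(math.lgamma(n - alpha - r) - math.lgamma(n + 1))

def scaled_ratio(n: int, alpha: float, r: int) -> float:
    """Quotient of the exact ratio by the claimed asymptotic form n**(-alpha - r - 1)."""
    return gamma_ratio(n, alpha, r) * n ** (alpha + r + 1)

if __name__ == "__main__":
    # alpha = -1/2 is the exponent that occurs in the Legendre example.
    for n in (10, 100, 1000, 10000):
        print(n, scaled_ratio(n, -0.5, 0), scaled_ratio(n, -0.5, 2))
```

The printed values can be compared with 1 for increasing $n$; the deviation is expected to shrink roughly like $1/n$.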
Let us return to the case (14.1.1) with finitely many singularities $\{s_j\}$. We can apply
the same derivation to each $s_j$. The end result is a sum of asymptotic expansions.
Theorem 14.1.1. Suppose that $f$ is holomorphic on $D$ and has distinct singularities
$s_1, s_2, \dots, s_l$ on $\partial D$, and that in a neighborhood of $s_j$,
\[ f(z) = \sum_{r=0}^{\infty} c_{jr} (s_j - z)^{\alpha_j + r}. \tag{14.1.19} \]
Then the Maclaurin coefficients $a_n$ in (14.0.1) have the asymptotic expansion
\[ \begin{aligned}
a_n &\sim \sum_{r=0}^{\infty} \sum_{j=1}^{l} c_{jr}\, (-1)^n \binom{r+\alpha_j}{n}\, s_j^{\alpha_j + r - n} \\
    &\sim \sum_{r=0}^{\infty} \sum_{j=1}^{l} c_{jr}\, s_j^{\alpha_j + r - n}\, \frac{\Gamma(n - \alpha_j - r)}{n!\,\Gamma(-\alpha_j - r)}
\end{aligned} \tag{14.1.20} \]
as $n \to \infty$.

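Now let us return to the example in the introduction:
\[ f_\theta(z) = \sum_{n=0}^{\infty} P_n(\cos\theta)\, z^n = \frac{1}{(e^{i\theta} - z)^{1/2}(e^{-i\theta} - z)^{1/2}}, \]
where the branches are chosen such that $(e^{\pm i\theta} - z)^{-1/2} \to e^{\mp i\theta/2}$ as $z \to 0$. The algebraic
singularities are at $a_1 = e^{i\theta}$ and at $a_2 = e^{-i\theta}$. Since
\[ \frac{1}{(e^{i\theta} - z)^{1/2}(e^{-i\theta} - z)^{1/2}} = \frac{e^{i\pi/4}}{\sqrt{2\sin\theta}} \sum_{r=0}^{\infty} \binom{-\frac12}{r} \frac{(e^{i\theta} - z)^{r-\frac12}}{(-2i\sin\theta)^r} \tag{14.1.21} \]
for $|e^{i\theta} - z| < 2\sin\theta$, and
\[ \frac{1}{(e^{i\theta} - z)^{1/2}(e^{-i\theta} - z)^{1/2}} = \frac{e^{-i\pi/4}}{\sqrt{2\sin\theta}} \sum_{r=0}^{\infty} \binom{-\frac12}{r} \frac{(e^{-i\theta} - z)^{r-\frac12}}{(2i\sin\theta)^r} \tag{14.1.22} \]
for $|e^{-i\theta} - z| < 2\sin\theta$, the constants $c_{jr}$ and $\alpha_j$ in (14.1.19) are given by $\alpha_1 = \alpha_2 = -\frac12$ and
\[ c_{1r} = \frac{e^{i\pi/4}}{\sqrt{2\sin\theta}} \binom{-\frac12}{r} \frac{1}{(-2i\sin\theta)^r}; \qquad
   c_{2r} = \frac{e^{-i\pi/4}}{\sqrt{2\sin\theta}} \binom{-\frac12}{r} \frac{1}{(2i\sin\theta)^r}. \]
From (14.1.20), it now follows that
\[ P_n(\cos\theta) \sim \left(\frac{2}{\sin\theta}\right)^{\frac12} \sum_{r=0}^{\infty} \binom{-\frac12}{r} \binom{r-\frac12}{n} \frac{\cos\theta_{n,r}}{(2\sin\theta)^r} \tag{14.1.23} \]
as $n \to \infty$, where $\theta_{n,r} = (n - r + \frac12)\theta + (n - \frac12 r - \frac14)\pi$.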
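The leading ($r = 0$) term of (14.1.23) can be compared with the exact Legendre polynomial. The sketch below is illustrative only: `legendre` and `darboux_leading` are hypothetical names, $P_n$ is generated by the standard three-term recurrence, and $\binom{-1/2}{n}$ is evaluated as $(-1)^n\Gamma(n+\frac12)/(n!\,\Gamma(\frac12))$.

```python
import math

def legendre(n: int, x: float) -> float:
    """P_n(x) from the recurrence (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def darboux_leading(n: int, theta: float) -> float:
    """r = 0 term of (14.1.23): (2/sin theta)^(1/2) * binom(-1/2, n) * cos(theta_{n,0})."""
    # binom(-1/2, n) = (-1)^n Gamma(n + 1/2) / (n! Gamma(1/2)), computed via log-gamma.
    log_abs = math.lgamma(n + 0.5) - math.lgamma(n + 1) - math.lgamma(0.5)
    binom = (-1) ** n * math.exp(log_abs)
    theta_n0 = (n + 0.5) * theta + (n - 0.25) * math.pi
    return math.sqrt(2.0 / math.sin(theta)) * binom * math.cos(theta_n0)

if __name__ == "__main__":
    theta = 1.0
    for n in (10, 50, 200):
        print(n, legendre(n, math.cos(theta)), darboux_leading(n, theta))
```

For fixed $\theta$ away from $0$ and $\pi$ the two columns should agree increasingly well as $n$ grows; the agreement degrades as $\theta \to 0$, which is the breakdown addressed in Section 14.3.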
Olver [157] pointed out an interesting paradox associated to (14.1.23). It is easily
verified that the series on the right-hand side of (14.1.23) converges when $2\sin\theta > 1$;
that is, $\frac16\pi < \theta < \frac56\pi$. Thus, it is natural to expect that the sum is $P_n(\cos\theta)$. But
from (14.1.22), we have
\[ \frac{1}{\sqrt{1 - 2z\cos\theta + z^2}} = \frac{e^{-i\pi/4}}{\sqrt{2\sin\theta}} \sum_{r=0}^{\infty} \binom{-\frac12}{r} \frac{(e^{-i\theta} - z)^{r-\frac12}}{(2i\sin\theta)^r}, \tag{14.1.24} \]
which converges uniformly when $|e^{-i\theta} - z| \le 2\sin\theta - \delta$, $\delta > 0$. If $2\sin\theta > 1$, then
$z = 0$ lies inside the region of uniform convergence. According to (14.0.2), $P_n(\cos\theta)$
is the $n$th Maclaurin coefficient of the function on the left-hand side of (14.1.24).
Hence, differentiating (14.1.24) $n$ times, setting $z = 0$, and equating real parts, we
obtain
\[ P_n(\cos\theta) = \frac{1}{\sqrt{2\sin\theta}} \sum_{r=0}^{\infty} \binom{-\frac12}{r} \binom{r-\frac12}{n} \frac{\cos\theta_{n,r}}{(2\sin\theta)^r}. \tag{14.1.25} \]
However, comparing (14.1.23) with (14.1.25) leads to the paradoxical statement
\[ P_n(\cos\theta) \sim 2P_n(\cos\theta), \qquad \tfrac16\pi < \theta < \tfrac56\pi, \quad n \to \infty. \tag{14.1.26} \]

Example. In [177], Robinson considered the following problem: “Let there be $n$
straight lines in a plane, no three of which meet at a point. Determine the number,
$g_n$, of groups of $n$ of their points of intersection such that no three of the points of
the group be on one of the straight lines.”
Although Robinson did not find an explicit form for $g_n$, he showed that $g_n$ satisfies
the recurrence relation
\[ g_{n+1} = n g_n + \binom{n}{2} g_{n-2}, \qquad n \ge 3, \tag{14.1.27} \]
where $g_1 = g_2 = 0$ and $g_3 = 1$. He also proved that the limit
\[ \lim_{n\to\infty} \frac{g_n}{n^n e^{-n}} = B \tag{14.1.28} \]
exists, and conjectured that $B$ might be a new geometric constant. Although several
solutions were given to show that the conjecture is false, none of these used the
method of Darboux; cf. an editorial note in [178].
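If we multiply (14.1.27) by $z^n/n!$ and sum from $n = 3$ to $\infty$, we obtain
\[ \sum_{n=3}^{\infty} g_{n+1} \frac{z^n}{n!} = \sum_{n=2}^{\infty} g_{n+1} \frac{z^{n+1}}{n!} + \frac12 \sum_{n=3}^{\infty} g_n \frac{z^{n+2}}{n!}. \]
Hence, if we define $f(z) = \sum_{n=3}^{\infty} g_n z^n/n!$, we obtain
\[ (1 - z) f'(z) - \frac12 z^2 f(z) = \frac12 z^2. \tag{14.1.29} \]
The solution of this first-order equation is
\[ f(z) = \frac{c}{\sqrt{1-z}} \exp\left[-\left(\frac{z^2}{4} + \frac{z}{2}\right)\right] - 1. \]
Since $f(0) = 0$, $c = 1$, and
\[ f(z) = \frac{1}{\sqrt{1-z}} \exp\left[-\left(\frac{z^2}{4} + \frac{z}{2}\right)\right] - 1. \tag{14.1.30} \]
From the Darboux result (14.1.17) with $a = 1$, $\alpha = -\frac12$, and $c_0 = e^{-3/4}$, the leading
term together with (14.1.18) gives $g_n/n! \sim e^{-3/4}/\sqrt{\pi n}$. By Stirling's formula, we have
\[ \frac{g_n}{n^n e^{-n}} \sim \sqrt{2}\, e^{-3/4}, \tag{14.1.31} \]
and hence the constant $B$ in (14.1.28) is given by $B = \sqrt{2}\, e^{-3/4}$.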
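The limit (14.1.28) can be checked directly from the recurrence (14.1.27) using exact integer arithmetic. This is a minimal sketch with hypothetical helper names (`robinson_counts`, `scaled`); the scaled quantity is evaluated through logarithms because $g_n$ quickly exceeds floating-point range. It assumes Python 3.8 or later for `math.comb`.

```python
import math

def robinson_counts(n_max: int) -> list:
    """Return g with g[n] = g_n for 1 <= n <= n_max, from (14.1.27); g[0] is padding."""
    g = [0] * (n_max + 1)
    if n_max >= 3:
        g[3] = 1                      # initial values g_1 = g_2 = 0, g_3 = 1
    for n in range(3, n_max):
        g[n + 1] = n * g[n] + math.comb(n, 2) * g[n - 2]
    return g

def scaled(n: int, g: list) -> float:
    """g_n / (n^n e^{-n}), computed as exp(log g_n - n log n + n)."""
    return math.exp(math.log(g[n]) - n * math.log(n) + n)

if __name__ == "__main__":
    g = robinson_counts(400)
    limit = math.sqrt(2.0) * math.exp(-0.75)
    for n in (10, 50, 100, 400):
        print(n, scaled(n, g), limit)
```

The scaled values can be compared with $\sqrt{2}\,e^{-3/4}$; the remaining discrepancy is expected to decay like a constant times $1/n$, reflecting the next terms of (14.1.17) and of Stirling's formula.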

14.2 Logarithmic singularities

In this section we consider an extension of Darboux’s method to deal with a general
version of the type of singularity that occurs in the following example:
\[ L(z) = \frac{1}{\log(1-z)} + \frac{1}{z} = \sum_{n=0}^{\infty} l_n z^n. \tag{14.2.1} \]
A second form of the same example is
\[ M(z) = \frac{z}{\log(1+z)} = \sum_{n=0}^{\infty} A_n \frac{z^n}{n!}, \qquad |z| < 1; \tag{14.2.2} \]
in fact
\[ z L(z) = 1 - M(-z), \]
so $A_n/n! = (-1)^{n-1} l_{n-1}$, $n \ge 1$.
In 2004 Donald Knuth asked Frank Olver about the asymptotics of the
coefficients $l_n$ in (14.2.1). Pólya [169], pp. 8–9, gives the first few coefficients $A_n$ in
(14.2.2):
\[ \begin{gathered}
A_0 = 1, \quad A_1 = 1, \quad A_2 = 1, \quad A_3 = 2, \quad A_4 = 4, \\
A_5 = 14, \quad A_6 = 38, \quad A_7 = 216, \quad A_8 = 600, \quad A_9 = 6240,
\end{gathered} \tag{14.2.3} \]
and asks for a conjecture on $A_n$. In the solution section, after noting that (14.2.3)
makes it reasonable to conjecture that $A_n$ is positive and increasing, Pólya points out
that asymptotically
\[ \frac{A_n}{n!} \sim (-1)^{n-1} \frac{1}{n \log^2 n}. \tag{14.2.4} \]

The extension that we discuss now, taken from [220], is apparently considered
useful in combinatorics; see a remark in [75], p. 438, line 12.
Let $f(z)$ be holomorphic, with Maclaurin expansion
\[ f(z) = \sum_{n=0}^{\infty} a_n z^n, \qquad |z| < 1. \tag{14.2.5} \]
Assume that $f(z)$ has a singularity at $z = 1$, and is holomorphic within and on the
contour $\Gamma$ shown in Figure 14.1, for some $\delta > 0$. In a neighborhood of $z = 1$, $f(z)$
is assumed to have the form
\[ f(z) = (1-z)^{\lambda-1} (\log(1-z))^{\mu} g(z), \tag{14.2.6} \]
where $\lambda$ and $\mu$ are complex numbers, $g(z)$ is holomorphic at $z = 1$, and $\log(1-z)$
has its principal value, which is real when $z$ is real and $z < 1$.

Fig. 14.1 Contour $\Gamma$; $\delta > 0$

From (14.2.5) we have
\[ \begin{aligned}
2\pi i\, a_n &= \int_{\Gamma} f(z)\, z^{-n-1}\, dz \\
             &= \int_{|z|=1+\delta} f(z)\, z^{-n-1}\, dz + \int_{|z-1|=\delta} f(z)\, z^{-n-1}\, dz.
\end{aligned} \tag{14.2.7} \]
We will vary $\delta$, with
\[ \delta = \delta_n = n^{-\frac12}, \tag{14.2.8} \]
and assume that on the larger circle $|z| = 1 + \delta_n$, $f$ satisfies
\[ f(z) = O(n^s), \qquad n \to \infty, \tag{14.2.9} \]
for some fixed $s$. A simple estimation then gives
\[ \int_{|z| = 1+\delta_n} f(z)\, z^{-n-1}\, dz = O\left(\frac{n^s}{(1 + 1/\sqrt{n})^n}\right) = O\big(\exp(-\varepsilon\sqrt{n})\big), \qquad n \to \infty, \]
for some fixed $\varepsilon > 0$. Since this integral is exponentially small, it is clear that the
asymptotic behavior of $a_n$ will be determined by the asymptotic behavior of the
integral
\[ I_n = \frac{i}{2\pi} \int_{|z-1|=\delta_n} f(z)\, z^{-n-1}\, dz, \tag{14.2.10} \]
where the path of integration on $|z - 1| = \delta_n$ is now oriented in the positive direction.


Before we begin the study of the behavior of the integral $I_n$, we shall digress briefly
to discuss the function
\[ M(\lambda, \mu; n) = \frac{i}{2\pi} \int_{\infty}^{(0+)} (-z)^{\lambda-1} (\log(-z))^{\mu} e^{-(n+1)z}\, dz, \tag{14.2.11} \]
where the loop contour of integration and the cuts in the $z$-plane are illustrated in
Figure 14.2. If $\mu = 0$, then the integral in (14.2.11) can be expressed in terms of the
gamma function; see Section 2.10:
\[ \frac{1}{\Gamma(1-\lambda)} = \frac{i}{2\pi} \int_{\infty}^{(0+)} (-u)^{\lambda-1} e^{-u}\, du, \qquad |\arg(-u)| \le \pi. \tag{14.2.12} \]
Differentiating both sides $k$ times with respect to $\lambda$ gives
\[ D^k\left[\frac{1}{\Gamma(1-\lambda)}\right] = \frac{i}{2\pi} \int_{\infty}^{(0+)} (-u)^{\lambda-1} (\log(-u))^k e^{-u}\, du, \tag{14.2.13} \]
where $D^k = d^k/d\lambda^k$.
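The derivatives $D^k[1/\Gamma(1-\lambda)]$ are the only special-function input needed later, and they can be approximated by finite differences. The sketch below is an illustration, not the text's method: `recip_gamma`, `h`, `d1`, and `d2` are hypothetical names, the reciprocal gamma function is extended by $0$ at the poles of $\Gamma$, and central differences with a fixed step are assumed to be accurate enough for a sanity check at $\lambda = 1$.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant

def recip_gamma(x: float) -> float:
    """1/Gamma(x), set to 0 at the poles x = 0, -1, -2, ... of Gamma."""
    if x <= 0 and x == math.floor(x):
        return 0.0
    return 1.0 / math.gamma(x)

def h(lam: float) -> float:
    """The function lambda -> 1/Gamma(1 - lambda) appearing in (14.2.12)."""
    return recip_gamma(1.0 - lam)

def d1(lam: float, step: float = 1e-3) -> float:
    """Central-difference approximation to D h(lambda)."""
    return (h(lam + step) - h(lam - step)) / (2.0 * step)

def d2(lam: float, step: float = 1e-3) -> float:
    """Central-difference approximation to D^2 h(lambda)."""
    return (h(lam + step) - 2.0 * h(lam) + h(lam - step)) / step ** 2

if __name__ == "__main__":
    print("h(1)    =", h(1.0))
    print("D h(1)  =", d1(1.0), " compare with -1")
    print("D^2 h(1)=", d2(1.0), " compare with 2*gamma =", 2 * EULER_GAMMA)
```

At $\lambda = 1$ the reflection formula gives $1/\Gamma(1-\lambda) = \Gamma(\lambda)\sin(\pi\lambda)/\pi$, from which one expects the values $0$, $-1$, and $2\gamma$ for $k = 0, 1, 2$; the finite differences can be compared against these.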
Fig. 14.2 Loop contour

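Lemma 14.2.1. For any fixed integer $N \ge 0$,
\[ M(\lambda, \mu; n) = \frac{(-\log(n+1))^{\mu}}{(n+1)^{\lambda}} \left[ \sum_{k=0}^{N} \binom{\mu}{k} \frac{D^k[1/\Gamma(1-\lambda)]}{(-\log(n+1))^k} + O\left(\frac{1}{(\log(n+1))^{N+1}}\right) \right] \tag{14.2.14} \]
as $n \to \infty$.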

Proof. In (14.2.11), we make the change of variable $u = (n+1)z$ and obtain
\[ M(\lambda, \mu; n) = \frac{1}{(n+1)^{\lambda}}\, \frac{i}{2\pi} \int_{\infty}^{(0+)} (-u)^{\lambda-1} \left(\log\frac{-u}{n+1}\right)^{\mu} e^{-u}\, du. \tag{14.2.15} \]
Divide the loop path of integration into two parts $A$ and $B$, where $A$ is the portion
contained in $|u| \le (n+1)^{\rho}$ for some fixed $\rho$ in $(0, 1)$, and $B$ is the remaining portion
of the path (i.e. two half-lines extending to $\infty$). Since $\arg(-u) = \pm\pi$ on $B$,
$\log(-u/(n+1))$ satisfies the inequalities
\[ \pi \le \left|\log\left(-\frac{u}{n+1}\right)\right| \le \left|\log\left|\frac{u}{n+1}\right|\right| + \pi. \tag{14.2.16} \]
Hence, $|\log(-u/(n+1))|$ is uniformly bounded away from zero. Although this
function becomes unbounded on $B$, it is bounded by the larger of $\log|u|$ and $\log(n+1)$.
An easy estimation shows that for $|u| \ge (n+1)^{\rho}$, there must exist an $\varepsilon > 0$ such
that
\[ \int_{B} (-u)^{\lambda-1} \left(\log\frac{-u}{n+1}\right)^{\mu} e^{-u}\, du = O\big(\exp(-\varepsilon n^{\rho})\big) \tag{14.2.17} \]
as $n \to \infty$, with $\lambda$ and $\mu$ unrestricted, and the order relation holds uniformly.


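On the part $A$ of the loop, we have
\[ \left(1 - \frac{\log(-u)}{\log(n+1)}\right)^{\mu} = \sum_{k=0}^{N} \binom{\mu}{k} \frac{(\log(-u))^k}{(-\log(n+1))^k} + O\left(\frac{(\log u)^{N+1}}{(\log(n+1))^{N+1}}\right) \tag{14.2.18} \]
as $n \to \infty$, for every fixed integer $N \ge 0$. Since $\int_A (-u)^{\lambda-1} (\log(-u))^k e^{-u}\, du$ exists
as an absolutely convergent integral for each fixed $k \ge 0$, it follows that
\[ \begin{aligned}
\int_{A} (-u)^{\lambda-1} & \left(\log\frac{-u}{n+1}\right)^{\mu} e^{-u}\, du \\
 &= (-\log(n+1))^{\mu} \left[ \sum_{k=0}^{N} \binom{\mu}{k} \frac{1}{(-\log(n+1))^k} \int_{A} (-u)^{\lambda-1} (\log(-u))^k e^{-u}\, du
    + O\left(\frac{1}{(\log(n+1))^{N+1}}\right) \right], \quad \text{as } n \to \infty.
\end{aligned} \tag{14.2.19} \]
By the argument used to obtain (14.2.17), we also have
\[ \int_{A} (-u)^{\lambda-1} (\log(-u))^k e^{-u}\, du = \int_{\infty}^{(0+)} (-u)^{\lambda-1} (\log(-u))^k e^{-u}\, du + O\big(\exp(-\varepsilon n^{\rho})\big). \tag{14.2.20} \]
Hence, on account of (14.2.13),
\[ \frac{i}{2\pi} \int_{A} (-u)^{\lambda-1} (\log(-u))^k e^{-u}\, du = D^k\left[\frac{1}{\Gamma(1-\lambda)}\right] + O\big(\exp(-\varepsilon n^{\rho})\big) \tag{14.2.21} \]
as $n \to \infty$. A combination of the results (14.2.15), (14.2.17), (14.2.19), and (14.2.21)
yields the desired result (14.2.14).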

The result in Lemma 14.2.1 will be used in a slightly different form. First, we
note that
\[ \begin{aligned}
E(w, u) &= \exp\{-(n+1)[\log(1+u) - u]\} \\
        &= \exp\left\{-\frac12 w u\, \frac{2(\log(1+u) - u)}{u^2}\right\},
\end{aligned} \tag{14.2.22} \]
where
\[ w = (n+1)u. \tag{14.2.23} \]
(The first equality will be used later in (14.2.35).) Now, let $P_m(w)$ be the polynomials
defined by
\[ g(u+1) E(w, u) = \sum_{m=0}^{\infty} P_m(w)\, u^m, \tag{14.2.24} \]
where $g(z)$ is the function given in (14.2.6). Thus we have
\[ P_m(w) = \frac{1}{m!} \left. \frac{d^m}{du^m} [g(u+1) E(w, u)] \right|_{u=0}. \tag{14.2.25} \]

Consider the integral
\[ J_m(n) = \frac{i}{2\pi} \int_{\gamma_n} (-u)^{\lambda+m-1} (\log(-u))^{\mu} P_m((n+1)u)\, e^{-(n+1)u}\, du, \tag{14.2.26} \]
where $\gamma_n$ is the contour which traverses the circle $|u| = \delta_n$ in the positive direction,
and begins and ends on the positive half of the real axis. The polynomial $P_m((n+1)u)$
may be written as
\[ P_m((n+1)u) = \sum_{s=0}^{m} p_s (n+1)^s u^s, \tag{14.2.27} \]
where the $p_s$ are fixed numbers. Hence,
\[ J_m(n) = \sum_{s=0}^{m} (-1)^s p_s (n+1)^s\, \frac{i}{2\pi} \int_{\gamma_n} (-u)^{\lambda+m+s-1} (\log(-u))^{\mu} e^{-(n+1)u}\, du. \tag{14.2.28} \]
Since the error incurred by extending the circular paths of integration to infinite loops
is exponentially small, we have
\[ J_m(n) = \sum_{s=0}^{m} (-1)^s p_s (n+1)^s M(\lambda+m+s, \mu; n) + O\big(\exp(-\varepsilon n^{\frac12})\big) \tag{14.2.29} \]
as $n \to \infty$; see (14.2.17). By Lemma 14.2.1,
\[ J_m(n) \sim \frac{(-\log(n+1))^{\mu}}{(n+1)^{\lambda+m}} \sum_{k=0}^{\infty} \binom{\mu}{k} A_k(\lambda, m)\, (-\log(n+1))^{-k}, \tag{14.2.30} \]
where
\[ A_k(\lambda, m) = \sum_{s=0}^{m} (-1)^s p_s\, D^k\left[\frac{1}{\Gamma(1-\lambda-m-s)}\right]. \tag{14.2.31} \]
Returning to (14.2.10), we replace $z - 1$ by $u$ and obtain
\[ I_n = \frac{i}{2\pi} \int_{\gamma_n} f(u+1)\, (u+1)^{-n-1}\, du, \tag{14.2.32} \]
where $\gamma_n$ is the contour described in (14.2.26).

Theorem 14.2.2. If the function in (14.2.5) is holomorphic within and on the contour
$\Gamma$ shown in Figure 14.1, and if $f(z)$ satisfies the conditions in (14.2.6) and
(14.2.9), then for any fixed integer $N \ge 0$ the Maclaurin coefficients of $f(z)$ have
the asymptotic expansion
\[ a_n = \sum_{m=0}^{N} (-1)^m J_m(n) + O\left(\frac{(\log n)^{\mu}}{n^{\lambda+N+1}}\right) \tag{14.2.33} \]
as $n \to \infty$, where $J_m(n)$ is defined in (14.2.26) and its asymptotic behavior is given
in (14.2.30).

Proof. Substituting (14.2.6) into (14.2.32) gives
\[ I_n = \frac{i}{2\pi} \int_{\gamma_n} (-u)^{\lambda-1} (\log(-u))^{\mu} g(u+1)\, (u+1)^{-n-1}\, du. \tag{14.2.34} \]
By (14.2.22) and (14.2.24),
\[ g(u+1) \exp\{-(n+1)[\log(u+1) - u]\} = \sum_{m=0}^{N} P_m(w)\, u^m + R_N(n, u), \tag{14.2.35} \]
where $N \ge 0$ is any fixed integer. The error term $R_N(n, u)$ in (14.2.35) can be
expressed as
\[ R_N(n, u) = \frac{1}{2\pi i} \int_{|\zeta| = 2K/n^{1/2}} g(\zeta+1) E(w, \zeta)\, \frac{u^{N+1}}{\zeta^{N+1} (\zeta - u)}\, d\zeta, \]
using Taylor’s formula with remainder, where $K$ is a positive constant and $E(w, \zeta)$
is given in (14.2.22). A simple estimation gives
\[ R_N(n, u) = O\big(n^{(N+1)/2} u^{N+1}\big) \quad \text{as } n \to \infty, \tag{14.2.36} \]
provided $|u| \le K/n^{\frac12}$. Coupling (14.2.26) and (14.2.35), we obtain
\[ I_n = \sum_{m=0}^{N} (-1)^m J_m(n) + E_N(n), \tag{14.2.37} \]
where
\[ E_N(n) = \frac{i}{2\pi} \int_{\gamma_n} (-u)^{\lambda-1} (\log(-u))^{\mu} R_N(n, u)\, e^{-(n+1)u}\, du. \tag{14.2.38} \]
Now, choose $N$ large enough so that $\operatorname{Re}(\lambda + N - 1) > 0$. The circular path of integration
can then be replaced by two line segments joining $u = 0$ to $u = \delta_n$, one on
the upper side of the cut in the $u$-plane, and the other on the lower side of this cut.
Hence,
\[ E_N(n) = O\left( n^{(N+1)/2} \left| \int_{L} (-u)^{\lambda+N} (\log(-u))^{\mu} e^{-(n+1)u}\, du \right| \right), \tag{14.2.39} \]
where $L$ is the integration path shown in Figure 14.3. By the argument given in
Lemma 14.2.1, it can be shown that the integral in (14.2.39) is $O\big((\log n)^{\mu}/n^{\lambda+N+1}\big)$.
Thus,
\[ E_N(n) = O\big((\log n)^{\mu}/n^{\lambda+(N+1)/2}\big) \quad \text{as } n \to \infty. \tag{14.2.40} \]

From (14.2.37), it follows that
\[ I_n = \sum_{m=0}^{N} (-1)^m J_m(n) + O\big((\log n)^{\mu}/n^{\lambda+(N+1)/2}\big). \tag{14.2.41} \]

Fig. 14.3 Integration path $L$

This is short of the claim in (14.2.33). However, the order of the terms $J_m(n)$, given
in (14.2.30), indicates that the result in (14.2.41) can be improved: applying (14.2.41)
with $N$ replaced by $2N + 1$ and absorbing the terms $J_m(n)$ with $m > N$ into the error
term, we obtain
\[ I_n = \sum_{m=0}^{N} (-1)^m J_m(n) + O\big((\log n)^{\mu}/n^{\lambda+N+1}\big) \tag{14.2.42} \]
as $n \to \infty$, for any fixed integer $N \ge 0$. This is essentially the statement of the
theorem, on account of (14.2.7) and (14.2.10).

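When $\mu = 0$, the canonical form (14.2.6) reduces to the Darboux condition
(14.1.1) with $\alpha$ replaced by $\lambda - 1$, and our expansion (14.2.33) is equivalent to the
result given in (14.1.17).
From (14.2.30), we have
\[ J_0(n) \sim \frac{(-\log(n+1))^{\mu}}{(n+1)^{\lambda}} \sum_{k=0}^{\infty} \binom{\mu}{k} (-\log(n+1))^{-k}\, p_0\, D^k\left[\frac{1}{\Gamma(1-\lambda)}\right], \]
where $p_0 = P_0(w) = g(1)$ by (14.2.25), and
\[ J_1(n) \sim \frac{(-\log(n+1))^{\mu}}{(n+1)^{\lambda+1}} \sum_{k=0}^{\infty} \binom{\mu}{k} (-\log(n+1))^{-k} \sum_{s=0}^{1} (-1)^s p_s\, D^k\left[\frac{1}{\Gamma(-\lambda-s)}\right], \]
where the $p_s$ are now the coefficients of $P_1$ in (14.2.27). Hence, for any integer $N \ge 0$,
\[ J_0(n) - J_1(n) = g(1)\, \frac{(-\log(n+1))^{\mu}}{(n+1)^{\lambda}} \left[ \sum_{k=0}^{N} \binom{\mu}{k} (-\log(n+1))^{-k} D^k\left[\frac{1}{\Gamma(1-\lambda)}\right]
 + O\big((\log(n+1))^{-N-1}\big) + O\big(n^{-1}\big) \right] \]
as $n \to \infty$. Clearly, none of the terms of $J_1(n)$ can contribute to the asymptotic
expansion for $a_n$ unless the infinite asymptotic expansion for $J_0(n)$ terminates after
a finite number of terms (e.g. when $\mu$ is a positive integer). The same will be true for
$J_m(n)$ for $m \ge 1$. Hence, the general situation is
\[ a_n \sim g(1)\, \frac{(-\log(n+1))^{\mu}}{(n+1)^{\lambda}} \sum_{k=0}^{\infty} \binom{\mu}{k} D^k\left[\frac{1}{\Gamma(1-\lambda)}\right] (-\log(n+1))^{-k} \tag{14.2.43} \]
as $n \to \infty$.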
Returning to (14.2.1), we note that
\[ z L(z) - 1 = f(z), \]
where $f(z)$ is given in (14.2.6) with $\lambda = 1$, $\mu = -1$ and $g(z) = z$. Thus,
\[ l_n = a_{n+1}, \qquad n = 0, 1, 2, \cdots \]
Since $\Gamma(1-\lambda)\Gamma(\lambda) = \pi/\sin \pi\lambda$, a straightforward calculation gives
\[ l_n \sim \frac{1}{(n+2) \log^2(n+2)} \left[ 1 - \frac{2\gamma}{\log(n+2)} + \cdots \right], \tag{14.2.44} \]
where $\gamma = -\Gamma'(1)$ is the Euler constant.
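Because (14.2.44) is an expansion in inverse powers of $\log n$, its convergence is slow, and a numerical look at the coefficients is instructive. The following sketch is an illustration under stated assumptions: the helper name `log_series_coeffs` is hypothetical, and the $l_n$ are generated in floating point by inverting the power series $-\log(1-z)/z = \sum_{k\ge0} z^k/(k+1)$.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant

def log_series_coeffs(n_max: int) -> list:
    """Coefficients l_0, ..., l_{n_max} of 1/log(1-z) + 1/z."""
    size = n_max + 2
    # c_k: coefficients of -log(1-z)/z; d_k: coefficients of its reciprocal series.
    c = [1.0 / (k + 1) for k in range(size)]
    d = [1.0] + [0.0] * (size - 1)
    for n in range(1, size):
        d[n] = -sum(c[k] * d[n - k] for k in range(1, n + 1))
    # 1/log(1-z) = -(1/z) * sum_k d_k z^k, hence l_n = -d_{n+1}.
    return [-d[n + 1] for n in range(n_max + 1)]

if __name__ == "__main__":
    l = log_series_coeffs(500)
    for n in (10, 100, 500):
        log_n = math.log(n + 2)
        scaled = l[n] * (n + 2) * log_n ** 2           # should tend (slowly) to 1
        two_term = 1.0 - 2.0 * EULER_GAMMA / log_n      # bracket in (14.2.44), truncated
        print(n, scaled, two_term)
```

The first coefficients are $l_0 = \frac12$ and $l_1 = \frac1{12}$. Since the neglected terms are only $O(1/\log^2 n)$, the scaled values and the two-term bracket are expected to agree to roughly one digit in this range of $n$, not more.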

14.3 Two coalescing singularities

Returning to (14.0.2), we note that the generating function for the Legendre polynomials
has two algebraic singularities, one at $z = e^{i\theta}$ and the other at $z = e^{-i\theta}$. As
$\theta \to 0^+$, these two singularities coalesce at $z = 1$ and the asymptotic expansion
of the Legendre polynomials, given in (14.1.23), breaks down; that is, Darboux’s
method fails when two or more singularities coalesce.
Fields [71] in 1967 presented a uniform treatment of Darboux’s method when two
or three singularities coalesce. He considered the case
\[ f(z, \theta) = (1-z)^{-\lambda} [(e^{i\theta} - z)(e^{-i\theta} - z)]^{-\alpha} g(z, \theta) = \sum_{n=0}^{\infty} a_n(\theta)\, z^n, \tag{14.3.1} \]
where the Maclaurin expansion converges for $|z| < 1$ uniformly for $\theta \in [0, \pi]$, the
branches of $(1-z)^{-\lambda}$ and $[(e^{i\theta} - z)(e^{-i\theta} - z)]^{-\alpha}$ are chosen such that each is holomorphic
on $D$ and equals 1 at $z = 0$, and $g(z, \theta)$ is holomorphic in $|z| \le e^{\eta}$ ($\eta > 0$).
Fields derived an expansion which is uniform in certain $\theta$-intervals depending
on $n$. However, his result seems too complicated for practical applications; see, e.g.
Erdélyi [66], p.167, Olver [159], pp.112–113, and Wong [218], p.145. In response to
the comments by Erdélyi and Olver, Wong and Zhao [219] found a way to derive a
simpler form of uniform asymptotic expansion for the Maclaurin coefficients $a_n(\theta)$
in (14.3.1) when two or three algebraic singularities on the circle of convergence
coalesce. (Neither the series in (14.1.17) given by Darboux nor Fields’s result is an
asymptotic (power) expansion in the usual sense.)
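To begin, we start with the simple case of two singularities, namely,
\[ f(z, \theta) = [(e^{i\theta} - z)(e^{-i\theta} - z)]^{-\alpha} g(z, \theta) = \sum_{n=0}^{\infty} a_n(\theta)\, z^n, \tag{14.3.2} \]
where $g(z, \theta)$ is holomorphic in $|z| \le e^{\eta}$, $\eta > 0$; this is (14.3.1) with $\lambda = 0$. The
contribution to the large-$n$ behavior of $a_n(\theta)$ still comes from the singular points $z = e^{\pm i\theta}$,
which are now allowed to vary as $\theta \to 0^+$. We shall show that the approximants in
this case are
\[ T_1(x) = \frac{1}{2\pi i} \int_{\Gamma_0} (s^2 + 1)^{-\alpha} e^{xs}\, ds, \qquad
   T_2(x) = \frac{1}{2\pi i} \int_{\Gamma_0} s (s^2 + 1)^{-\alpha} e^{xs}\, ds, \tag{14.3.3} \]
where $\Gamma_0$ is a Hankel-type loop which starts and ends at $-\infty$, and encircles $s = \pm i$
in the positive sense. It is easily verified that $(d/dx) T_1(x) = T_2(x)$, and it can also
be shown that
\[ T_1(x) = \frac{\sqrt{\pi}}{\Gamma(\alpha)} \left(\frac{x}{2}\right)^{\alpha-\frac12} J_{\alpha-\frac12}(x), \qquad
   T_2(x) = \frac{\sqrt{\pi}}{\Gamma(\alpha)} \left(\frac{x}{2}\right)^{\alpha-\frac12} J_{\alpha-\frac32}(x), \tag{14.3.4} \]
where $J_{\nu}(x)$ is the Bessel function; see Exercise 3. Our ultimate goal here is to
establish that the Maclaurin coefficients in (14.3.2) have an asymptotic expansion of
the form
\[ a_n(\theta) \sim \frac{\sqrt{\pi}}{\Gamma(\alpha)} \left(\frac{n}{2\theta}\right)^{\alpha-\frac12}
   \left[ J_{\alpha-\frac12}(n\theta) \sum_{k=0}^{\infty} \frac{\alpha_k(\theta)}{n^k} + J_{\alpha-\frac32}(n\theta) \sum_{k=0}^{\infty} \frac{\beta_k(\theta)}{n^k} \right] \tag{14.3.5} \]
as $n \to \infty$, holding uniformly for $\theta \in [0, \pi - \delta]$, $\delta > 0$, with coefficients $\alpha_k(\theta)$ and
$\beta_k(\theta)$ determined recursively.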
To prove (14.3.5), we start with the Cauchy formula
\[ a_n(\theta) = \frac{1}{2\pi i} \int_{\Gamma} g(z, \theta) (1 - 2z\cos\theta + z^2)^{-\alpha}\, \frac{dz}{z^{n+1}}, \tag{14.3.6} \]
where $\Gamma$ is a simple closed contour which encloses $z = 0$ but not $z = e^{\pm i\theta}$ and lies
in the domain of $z$-holomorphy of $f(z, \theta)$. We may choose $\Gamma$ so that it consists of
two portions $\Gamma_I$ and $\Gamma_E$, where $\Gamma_I$ is a curve starting from $z = e^{-0i} e^{\eta}$, enclosing
$z = e^{\pm i\theta}$ but not $z = 0$ in clockwise orientation, and ending at $z = e^{0i} e^{\eta}$, while $\Gamma_E$
is the circle $|z| = e^{\eta}$, oriented anticlockwise; see Figure 14.4.

Fig. 14.4 Contour in (14.3.6)

We first show that the contribution from $\Gamma_E$ is exponentially small. Indeed, let us
define
\[ A_n(\theta) = \frac{1}{2\pi i} \int_{\Gamma_I} g(z, \theta) (1 - 2z\cos\theta + z^2)^{-\alpha}\, \frac{dz}{z^{n+1}} \tag{14.3.7} \]
and
\[ \varepsilon_E(\theta) = \frac{1}{2\pi i} \int_{\Gamma_E} g(z, \theta) (1 - 2z\cos\theta + z^2)^{-\alpha}\, \frac{dz}{z^{n+1}}. \tag{14.3.8} \]
On $\Gamma_E$, we have
\[ (e^{\eta} - 1)^2 \le |1 - 2z\cos\theta + z^2| \le (e^{\eta} + 1)^2. \]
From (14.3.8), it follows that
\[ |\varepsilon_E(\theta)| \le c(g, \eta)\, e^{-\eta n}, \tag{14.3.9} \]
where $c(g, \eta)$ is a positive constant. In fact, one may choose
\[ c(g, \eta) = \max_{|z| = e^{\eta}} \{|g(z, \theta)|\} \cdot \max\{(e^{\eta} - 1)^{-2\alpha}, (e^{\eta} + 1)^{-2\alpha}\}. \]
From (14.3.6) to (14.3.9), we obtain
\[ a_n(\theta) = A_n(\theta) + \varepsilon_E(\theta), \tag{14.3.10} \]
where $|\varepsilon_E| \le c(g, \eta)\, e^{-\eta n}$.


Now we consider the behavior of $A_n$. The change of variable
\[ z = e^{-\theta s} \tag{14.3.11} \]
in (14.3.7) gives
\[ A_n(\theta) = \frac{\theta^{1-2\alpha}}{2\pi i} \int_{\Gamma} h_0(s, \theta) (s^2 + 1)^{-\alpha} e^{n\theta s}\, ds, \tag{14.3.12} \]
where
\[ h_0(s, \theta) = g(e^{-\theta s}, \theta) \left[ \frac{e^{-s\theta} - e^{i\theta}}{(-s - i)\theta} \cdot \frac{e^{-s\theta} - e^{-i\theta}}{(-s + i)\theta} \right]^{-\alpha} \tag{14.3.13} \]
is holomorphic in $s$ for $\operatorname{Re} s \ge -\eta/\theta$ and $|s \pm i| < 2\pi/\theta$. In (14.3.12), $\Gamma$ is the image
of $\Gamma_I$ under the transformation (14.3.11). That is, $\Gamma$ is the positively oriented curve
in the $s$-plane which starts at $e^{-i\pi}\eta/\theta$, ends at $e^{i\pi}\eta/\theta$, and encloses both $s = \pm i$.
To pick up the first-level contribution from the integral in (14.3.12), we write
\[ h_0(s, \theta) = \alpha_0(\theta) + s\beta_0(\theta) + (s^2 + 1) g_0(s, \theta), \tag{14.3.14} \]
where the coefficients $\alpha_0(\theta)$ and $\beta_0(\theta)$ are determined by setting $s = \pm i$. More
precisely, we have
\[ \alpha_0(\theta) = \frac12 [h_0(i, \theta) + h_0(-i, \theta)], \qquad \beta_0(\theta) = \frac{1}{2i} [h_0(i, \theta) - h_0(-i, \theta)]. \tag{14.3.15} \]

Note that g0 (s, θ ) in (14.3.14) has the same domain of s-holomorphy as h 0 (s, θ ).
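For the Legendre case ($\alpha = \frac12$, $g \equiv 1$) the first-level coefficients can be made explicit and tested numerically. The sketch below rests on assumptions that are not stated in the text and should be rechecked: evaluating (14.3.13) at $s = \pm i$ by hand suggests $h_0(\pm i, \theta) = (\theta/\sin\theta)^{1/2} e^{\pm i\theta/2}$, so that $\alpha_0 = (\theta/\sin\theta)^{1/2}\cos(\theta/2)$ and $\beta_0 = (\theta/\sin\theta)^{1/2}\sin(\theta/2)$, while (14.3.4) gives $T_1 = J_0$ and $T_2 = J_{-1} = -J_1$. The function names are hypothetical, and the Bessel functions are computed from Bessel's integral with a midpoint rule rather than from a library.

```python
import math

def bessel_j(order: int, x: float, steps: int = 2000) -> float:
    """Integer-order J_order(x) = (1/pi) * int_0^pi cos(order*t - x*sin t) dt (midpoint rule)."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += math.cos(order * t - x * math.sin(t))
    return total * h / math.pi

def legendre(n: int, x: float) -> float:
    """P_n(x) from the recurrence (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def uniform_leading(n: int, theta: float) -> float:
    """Candidate k = 0 term for alpha = 1/2, g = 1: alpha_0 T_1(n theta) + beta_0 T_2(n theta)."""
    scale = math.sqrt(theta / math.sin(theta))
    alpha0 = scale * math.cos(theta / 2.0)   # assumed value of alpha_0(theta), see lead-in
    beta0 = scale * math.sin(theta / 2.0)    # assumed value of beta_0(theta), see lead-in
    x = n * theta
    return alpha0 * bessel_j(0, x) - beta0 * bessel_j(1, x)   # T_1 = J_0, T_2 = -J_1

if __name__ == "__main__":
    n = 100
    for theta in (0.01, 0.05, 0.3, 1.0):
        print(theta, legendre(n, math.cos(theta)), uniform_leading(n, theta))
```

Unlike the leading term of (14.1.23), this approximation is meant to remain usable when $\theta$ is small compared with $1/n$, which is the point of the uniform treatment.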
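Inserting (14.3.14) into (14.3.12) and integrating the last term by parts give
\[ A_n(\theta) = \theta^{1-2\alpha} \alpha_0(\theta) [T_1(n\theta) - \varepsilon_{T_1}] + \theta^{1-2\alpha} \beta_0(\theta) [T_2(n\theta) - \varepsilon_{T_2}] + \frac{1}{n} \varepsilon_1, \tag{14.3.16} \]
where $T_1(x)$ and $T_2(x)$ are given in (14.3.3),
\[ \varepsilon_{T_l} = \int_{e^{-i\pi}\infty}^{e^{-i\pi}\eta/\theta} s^{l-1} (s^2 + 1)^{-\alpha} e^{n\theta s}\, ds + \int_{e^{i\pi}\eta/\theta}^{e^{i\pi}\infty} s^{l-1} (s^2 + 1)^{-\alpha} e^{n\theta s}\, ds, \tag{14.3.17} \]
$l = 1, 2$, and
\[ \varepsilon_1 = \Sigma_1 + \varepsilon_{1,E}. \tag{14.3.18} \]
In (14.3.18),
\[ \varepsilon_{1,E} = \theta^{-2\alpha} \cdot \frac{1}{2\pi i} \left. g_0(s, \theta) (s^2 + 1)^{1-\alpha} e^{n\theta s} \right|_{s = e^{-i\pi}\eta/\theta}^{s = e^{i\pi}\eta/\theta} \tag{14.3.19} \]
represents the endpoint contribution and
\[ \Sigma_1 = \frac{\theta^{1-2\alpha}}{2\pi i} \int_{\Gamma} h_1(s, \theta) (s^2 + 1)^{-\alpha} e^{n\theta s}\, ds, \tag{14.3.20} \]
where
\[ \begin{aligned}
h_1(s, \theta) &= -\frac{1}{\theta} (s^2 + 1)^{\alpha} \frac{d}{ds} \left[ g_0(s, \theta) (s^2 + 1)^{1-\alpha} \right] \\
               &= -\frac{1}{\theta} \left[ (s^2 + 1) \frac{d}{ds} + 2(1-\alpha) s \right] g_0(s, \theta).
\end{aligned} \tag{14.3.21} \]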

It can be seen from (14.3.21) that h 1 (s, θ ) has the same domain of s-holomorphy as
g0 (s, θ ), and hence as h 0 (s, θ ). It can also be seen that the integral representation for
Σ1 is of the same form as (14.3.12) for An (θ ). Thus, the procedure can be repeated.
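Define inductively
\[ h_k(s, \theta) = \alpha_k + s\beta_k + (s^2 + 1) g_k(s, \theta), \qquad k = 0, 1, 2, \cdots, \tag{14.3.22} \]
and
\[ h_{k+1}(s, \theta) = -\frac{1}{\theta} \left[ (s^2 + 1) \frac{d}{ds} + 2(1-\alpha) s \right] g_k(s, \theta), \qquad k = 0, 1, 2, \cdots \tag{14.3.23} \]
Repeated application of integration by parts as above gives the expansion
\[ a_n(\theta) = \theta^{1-2\alpha} T_1(n\theta) \sum_{k=0}^{m-1} \frac{\alpha_k(\theta)}{n^k}
 + \theta^{1-2\alpha} T_2(n\theta) \sum_{k=0}^{m-1} \frac{\beta_k(\theta)}{n^k} + \varepsilon(\theta, m) \tag{14.3.24} \]
for $m = 1, 2, \cdots$, where
\[ \varepsilon(\theta, m) = \varepsilon_E + \sum_{k=1}^{m} \frac{\varepsilon_{k,E}}{n^k}
 - \theta^{1-2\alpha} \sum_{k=0}^{m-1} \frac{\alpha_k(\theta) \varepsilon_{T_1} + \beta_k(\theta) \varepsilon_{T_2}}{n^k}
 + \frac{1}{n^m} \Sigma_m. \tag{14.3.25} \]
In (14.3.25), $\varepsilon_E$ is given in (14.3.8),
\[ \varepsilon_{k,E} = \theta^{-2\alpha}\, \frac{1}{2\pi i} \left. g_{k-1}(s, \theta) (s^2 + 1)^{1-\alpha} e^{ns\theta} \right|_{s = e^{-i\pi}\eta/\theta}^{s = e^{i\pi}\eta/\theta}, \qquad k = 1, 2, \cdots \tag{14.3.26} \]
and
\[ \Sigma_m = \frac{\theta^{1-2\alpha}}{2\pi i} \int_{\Gamma} h_m(s, \theta) (s^2 + 1)^{-\alpha} e^{n\theta s}\, ds, \qquad m = 1, 2, \cdots \tag{14.3.27} \]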
One can see from (14.3.22) and (14.3.23) that h k (s, θ ) and gk (s, θ ) have the same
domain of s-holomorphy as h 0 (s, θ ).
To conclude this section, we show that $\varepsilon_{T_1}$ and $\varepsilon_{T_2}$ in (14.3.17) are exponentially
small. Set
\[ I = \int_{\eta/\theta}^{\infty} (s^2 + 1)^{-\alpha} e^{-n\theta s}\, ds, \tag{14.3.28} \]
and make the change of variables $s = (t + 1)\eta/\theta$. The integral in (14.3.28) becomes
\[ I = \eta^{1-2\alpha} \theta^{2\alpha-1} e^{-\eta n} \int_0^{\infty} \left[ (t+1)^2 + \frac{\theta^2}{\eta^2} \right]^{-\alpha} e^{-\eta n t}\, dt. \tag{14.3.29} \]
Note that $\theta^2/\eta^2 \ge 0$, and
\[ \left[ (t+1)^2 + \frac{\theta^2}{\eta^2} \right]^{-\alpha} \le (t+1)^{-2\alpha} \]
for $\theta \in [0, 2\pi]$ and $t \ge 0$. Hence
\[ |I| \le C(\eta)\, \theta^{2\alpha-1} e^{-\eta n} \int_0^{\infty} (t+1)^{-2\alpha} e^{-\eta n t}\, dt \le C(\eta)\, \theta^{2\alpha-1}\, \frac{1}{n}\, e^{-\eta n}, \tag{14.3.30} \]
where we have used $C(\eta)$ as a generic symbol to denote positive constants, independent
of $\theta$ and $n$, the value of which may differ in different places. From (14.3.17)
and (14.3.30), it follows that
\[ \theta^{1-2\alpha} |\varepsilon_{T_1}| \le C(\eta)\, \frac{1}{n}\, e^{-\eta n} \tag{14.3.31} \]
and
\[ \theta^{1-2\alpha} |\varepsilon_{T_2}| \le C(\eta)\, e^{-\eta n} \tag{14.3.32} \]
for $\theta \in [0, \pi]$. The last inequality is obtained by combining (14.3.17) with (14.3.30)
and integrating by parts in both integrals in (14.3.17).

14.4 Asymptotic nature of the expansion (14.3.24)

In the previous section we derived the expansion (14.3.24) for the Maclaurin coeffi-
cients an (θ ) in (14.3.2). To show that (14.3.24) is an asymptotic expansion, we must
estimate the remainder term ε(θ, m) and demonstrate that the coefficients αk (θ ) and
βk (θ ) are bounded. In this section we do this step by step.

Theorem 14.4.1. Assume that $g(z, \theta)$ in (14.3.2) is uniformly bounded for $\theta \in [0, \pi]$,
and is $z$-holomorphic in $|z| \le e^{\eta}$ ($\eta > 0$). For any integer $m \ge 1$, we
have
\[ a_n(\theta) = \theta^{1-2\alpha} T_1(n\theta) \sum_{k=0}^{m-1} \frac{\alpha_k(\theta)}{n^k}
 + \theta^{1-2\alpha} T_2(n\theta) \sum_{k=0}^{m-1} \frac{\beta_k(\theta)}{n^k} + \varepsilon(\theta, m), \tag{14.4.1} \]
where
\[ |\alpha_k(\theta)| \le M_k, \qquad |\beta_k(\theta)/\theta| \le M_k \tag{14.4.2} \]
for $k = 0, 1, 2, \cdots$, and
\[ |\varepsilon(\theta, m)| \le M_m\, \frac{\theta^{1-2\alpha}}{n^m} \big[ |T_1(n\theta)| + |T_2(n\theta)| \big] \tag{14.4.3} \]
for $m = 1, 2, 3, \cdots$. The positive constants $M_k$, $k = 0, 1, 2, \cdots$, are independent
of $\theta \in [0, \pi - \delta]$, $\delta > 0$. The coefficients $\alpha_k(\theta)$ and $\beta_k(\theta)$ are defined successively by
(14.3.13), (14.3.22), and (14.3.23). The remainder term $\varepsilon(\theta, m)$ in (14.4.1) involves
$\varepsilon_E$, $\varepsilon_{T_l}$, $\varepsilon_{k,E}$, and $\Sigma_m$, which are explicitly given in (14.3.8), (14.3.17), (14.3.26), and
(14.3.27), respectively.

Fig. 14.5 Domain $D$ and contour $\Gamma$
Step 1. Proof of (14.4.2). Define the region
\[ D = \left\{ s \;\Big|\; \operatorname{Re} s \ge -\frac{\eta}{\theta}, \; |s \pm i| < \frac{2\pi}{\theta} \right\}, \tag{14.4.4} \]
and let $\Gamma$ be a contour in $D$ which encloses $s = \pm i$ in the positive sense; see Figure
14.5. Using Cauchy’s integral formula and the fact that $h_0(s, \theta)$ is $s$-holomorphic in
the region $D$, we have from (14.3.15)
\[ \alpha_0(\theta) = \frac{1}{2\pi i} \int_{\Gamma} A_0(s, \theta) h_0(s, \theta)\, ds, \qquad
   \beta_0(\theta) = \frac{1}{2\pi i} \int_{\Gamma} B_0(s, \theta) h_0(s, \theta)\, ds, \tag{14.4.5} \]
where
\[ A_0(s, \theta) = \frac{s}{s^2 + 1} \quad \text{and} \quad B_0(s, \theta) = \frac{1}{s^2 + 1}. \tag{14.4.6} \]
Define inductively
\[ A_k(s, \theta) = (1 + s^2)^{-1}\, \frac{1}{\theta} \left[ (s^2 + 1) \frac{d}{ds} + 2\alpha s \right] A_{k-1}(s, \theta) \tag{14.4.7} \]
and
\[ B_k(s, \theta) = (1 + s^2)^{-1}\, \frac{1}{\theta} \left[ (s^2 + 1) \frac{d}{ds} + 2\alpha s \right] B_{k-1}(s, \theta) \tag{14.4.8} \]
for $k = 1, 2, 3, \cdots$. The differential operator in (14.4.7) and (14.4.8) can be written
as
\[ \frac{1}{\theta} (s^2 + 1)^{-\alpha} \frac{d}{ds} \left[ (s^2 + 1)^{\alpha} A_{k-1}(s, \theta) \right]. \tag{14.4.9} \]
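In terms of these rational functions, we obtain
Lemma 14.4.2. For $\theta \in [0, \pi]$ and $k = 0, 1, 2, \cdots$, we have
\[ \alpha_k(\theta) = (1 - 2\alpha) \frac{\beta_{k-1}(\theta)}{\theta} + \frac{1}{2\pi i} \int_{\Gamma} A_k(s, \theta) h_0(s, \theta)\, ds \tag{14.4.10} \]
and
\[ \beta_k(\theta) = \frac{1}{2\pi i} \int_{\Gamma} B_k(s, \theta) h_0(s, \theta)\, ds, \tag{14.4.11} \]
where $\Gamma$ is the same contour given in (14.4.5) and for convenience, we have set
$\beta_{-1} = 0$.

Proof. We demonstrate only the result in (14.4.10). The corresponding result in
(14.4.11) can be established in a similar manner. The case $k = 0$ is already given in
(14.4.5). For $k \ge 1$, we have from (14.3.22) and (14.3.23)
\[ \begin{aligned}
\alpha_k(\theta) &= \frac{1}{2\pi i} \int_{\Gamma} A_0(s, \theta) h_k(s, \theta)\, ds \\
 &= \frac{1}{2\pi i} \int_{\Gamma} A_0(s, \theta) \left\{ -\frac{1}{\theta} (s^2 + 1)^{\alpha} \frac{d}{ds} \left[ g_{k-1}(s, \theta) (s^2 + 1)^{1-\alpha} \right] \right\} ds \\
 &= \frac{1}{2\pi i} \int_{\Gamma} A_1(s, \theta) (s^2 + 1) g_{k-1}(s, \theta)\, ds \\
 &= \frac{1}{2\pi i} \int_{\Gamma} A_1(s, \theta) [h_{k-1}(s, \theta) - \alpha_{k-1}(\theta) - s\beta_{k-1}(\theta)]\, ds \\
 &= \frac{1}{2\pi i} \int_{\Gamma} A_1(s, \theta) h_{k-1}(s, \theta)\, ds + (1 - 2\alpha) \frac{\beta_{k-1}(\theta)}{\theta} \\
 &\;\;\vdots \\
 &= \frac{1}{2\pi i} \int_{\Gamma} A_k(s, \theta) h_0(s, \theta)\, ds + (1 - 2\alpha) \frac{\beta_{k-1}(\theta)}{\theta},
\end{aligned} \]
thus proving (14.4.10). Here we have repeatedly used
\[ \frac{1}{2\pi i} \int_{\Gamma} A_k(s, \theta)\, ds = 0, \qquad k = 1, 2, \cdots, \]
\[ \frac{1}{2\pi i} \int_{\Gamma} \theta s A_1(s, \theta)\, ds = 2\alpha - 1, \]
and
\[ \frac{1}{2\pi i} \int_{\Gamma} \theta s A_k(s, \theta)\, ds = 0, \qquad k = 2, 3, \cdots, \]
which follows from (14.4.6), (14.4.7), and (14.4.13) below.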
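As a check on the $k = 1$ identities, the first iterates can be computed by hand from (14.4.6)–(14.4.8). The following worked computation is a sketch added for illustration; evaluating the contour integrals through the coefficient of $1/s$ at infinity is our choice of method, valid because $\Gamma$ encloses both poles $s = \pm i$ and the integrands are rational.

```latex
\begin{align*}
A_1(s,\theta) &= \frac{1}{\theta}\left[\frac{d}{ds}\frac{s}{s^2+1}
                 + \frac{2\alpha s}{s^2+1}\cdot\frac{s}{s^2+1}\right]
               = \frac{1}{\theta}\,\frac{1+(2\alpha-1)s^2}{(s^2+1)^2},\\[4pt]
B_1(s,\theta) &= \frac{1}{\theta}\left[\frac{d}{ds}\frac{1}{s^2+1}
                 + \frac{2\alpha s}{(s^2+1)^2}\right]
               = \frac{1}{\theta}\,\frac{2(\alpha-1)s}{(s^2+1)^2}.
\end{align*}
% Behavior at infinity:
%   A_1(s,\theta) = O(s^{-2}),  \theta s A_1(s,\theta) = (2\alpha-1)/s + O(s^{-3}),
% so the coefficient of 1/s gives
\begin{align*}
\frac{1}{2\pi i}\int_\Gamma A_1(s,\theta)\,ds = 0,
\qquad
\frac{1}{2\pi i}\int_\Gamma \theta s\,A_1(s,\theta)\,ds = 2\alpha-1 .
\end{align*}
```

These expressions have the shape asserted in (14.4.13) and (14.4.14) for $k = 1$, with $p_2(s) = 1 + (2\alpha-1)s^2$ and $q_1(s) = 2(\alpha-1)s$.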

Lemma 14.4.3. For $\theta \in (0, \pi]$, there exists a constant $M_k > 0$, independent of $\theta$,
such that
\[ |A_k(s, \theta)| \le M_k \theta \quad \text{and} \quad |B_k(s, \theta)| \le M_k \theta^2 \tag{14.4.12} \]
for $|s| \le M/\theta$, $|s - i| \ge L/\theta$ and $|s + i| \ge L/\theta$, where $L$ and $M$ are positive
constants.
Proof. By induction, one can use (14.4.6) and (14.4.7) to write
\[ A_k(s, \theta) = \frac{1}{\theta^k}\, \frac{p_{k+1}(s)}{(1 + s^2)^{k+1}} \tag{14.4.13} \]
for $k = 0, 1, 2, \cdots$, where $p_{k+1}(s)$ is a polynomial of degree $k + 1$, with coefficients
independent of $\theta$. It can also be shown from (14.4.6) and (14.4.8) that
\[ B_k(s, \theta) = \frac{1}{\theta^k}\, \frac{q_k(s)}{(1 + s^2)^{k+1}} \tag{14.4.14} \]
for $k = 0, 1, 2, \cdots$, where $q_k(s)$ is a polynomial of degree $k$, with coefficients
independent of $\theta$. The two inequalities in (14.4.12) now follow from (14.4.13) and
(14.4.14), respectively.

To estimate the function $h_0(s, \theta)$ in (14.3.13), we first recall that $g(e^{-\theta s}, \theta)$ is
uniformly bounded for $\theta \in [0, \pi]$ and $\operatorname{Re} s \ge -\eta/\theta$. Hence there exists a constant
$M_g$, independent of $\theta$ and $s$, such that
\[ |g(e^{-\theta s}, \theta)| \le M_g \quad \text{for } \operatorname{Re} s \ge -\eta/\theta. \tag{14.4.15} \]
Next, since $(e^z - 1)/z$ has no zero and is bounded in the disk $|z| \le b$ for $0 < b < 2\pi$,
there exist positive constants $m_b$ and $M_b$ such that
\[ m_b \le \left| \frac{e^z - 1}{z} \right| \le M_b \quad \text{for } |z| \le b. \]
Hence, for $0 < b < 2\pi$, we have
\[ m_b \le \left| \frac{e^{-s\theta} - e^{i\theta}}{(-s - i)\theta} \right| \le M_b \quad \text{for } |s + i| \le \frac{b}{\theta} \tag{14.4.16a} \]
and
\[ m_b \le \left| \frac{e^{-s\theta} - e^{-i\theta}}{(-s + i)\theta} \right| \le M_b \quad \text{for } |s - i| \le \frac{b}{\theta}. \tag{14.4.16b} \]
Combining (14.4.15), (14.4.16), and (14.3.13) gives
Lemma 14.4.4. For $\theta \in (0, \pi]$, there exists a constant $M_D > 0$, independent of $s$
and $\theta$, such that
\[ |h_0(s, \theta)| \le M_D \quad \text{for } \operatorname{Re} s \ge -\frac{\eta}{\theta}, \; |s + i| \le \frac{b}{\theta} \text{ and } |s - i| \le \frac{b}{\theta}. \tag{14.4.17} \]
Now, for $\theta \in [0, \pi - \delta]$, one may specify $b = 2\pi - \delta$ in (14.4.17). Without loss
of generality, we may always assume that $\eta < \sqrt{\pi(3\pi - 2\delta)}$. The contour $\Gamma$ in
(14.4.5) may be deformed so that it consists of
(i) $|s + i| = b/\theta$, $\operatorname{Im} s \ge 0$ and $\operatorname{Re} s \ge -\eta/\theta$;
(ii) $|s - i| = b/\theta$, $\operatorname{Im} s \le 0$ and $\operatorname{Re} s \ge -\eta/\theta$; and
(iii) the segment of $\operatorname{Re} s = -\eta/\theta$ joining (i) and (ii); see Figure 14.6.

Fig. 14.6 The deformed contour $\Gamma$

The constants $M$ and $L$ in Lemma 14.4.3 may be chosen to be $M = \max\{\eta, 3\pi - 2\delta\}$
and $L = \min\{\eta, \delta\}$. A combination of Lemmas 14.4.2–14.4.4 gives the boundedness
of the coefficients $\alpha_k(\theta)$ and $\beta_k(\theta)/\theta$, thus proving (14.4.2).
Step 2. Bounds for $\varepsilon_E$ and $\varepsilon_{k,E}$. To describe the behavior of $T_1(n\theta)$ and $T_2(n\theta)$,
we make use of (14.3.4). From the behavior of $J_{\alpha-\frac12}(n\theta)$ and $J_{\alpha-\frac32}(n\theta)$ when $n\theta$ is
small, we have
\[ T_1(n\theta) \sim \frac{1}{\Gamma(2\alpha)} (n\theta)^{2\alpha-1} \quad \text{as } n\theta \to 0^+, \]
and
\[ T_2(n\theta) \sim \frac{1}{\Gamma(2\alpha-1)} (n\theta)^{2\alpha-2} \quad \text{as } n\theta \to 0^+; \]
see, e.g. [22], p.225. Hence
\[ |T_1(n\theta)| + |T_2(n\theta)| \ge C (n\theta)^{2\alpha-1} \tag{14.4.18} \]
for $n\theta \in [0, \varepsilon]$ and $\varepsilon$ small, where $C$ depends only on $\varepsilon$. The interval of validity for
(14.4.18) can of course be extended to $n\theta \in [0, B]$ for a finite $B$, since $J_{\alpha-\frac12}(\tau)$ and
$J_{\alpha-\frac32}(\tau)$ have no common zero. The constant $C$ may then depend on $B$.
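The small-argument behavior of $T_1$ and $T_2$ can be checked from the Bessel representation (14.3.4). The sketch below is illustrative: `bessel_series`, `t1`, and `t2` are hypothetical names, $J_\nu$ is summed from its power series (adequate for small and moderate $x > 0$), and the sample value $\alpha = 0.8$ is an arbitrary choice that avoids poles of the gamma function. Since $T_2 = T_1'$, differentiating $x^{2\alpha-1}/\Gamma(2\alpha)$ suggests the leading power $x^{2\alpha-2}/\Gamma(2\alpha-1)$ for $T_2$.

```python
import math

def bessel_series(nu: float, x: float, terms: int = 40) -> float:
    """J_nu(x) = sum_k (-1)^k (x/2)^(2k+nu) / (k! Gamma(k+nu+1)), truncated; x > 0."""
    half = x / 2.0
    return sum((-1) ** k * half ** (2 * k + nu)
               / (math.factorial(k) * math.gamma(k + nu + 1))
               for k in range(terms))

def t1(alpha: float, x: float) -> float:
    """T_1(x) from (14.3.4)."""
    return (math.sqrt(math.pi) / math.gamma(alpha)
            * (x / 2.0) ** (alpha - 0.5) * bessel_series(alpha - 0.5, x))

def t2(alpha: float, x: float) -> float:
    """T_2(x) from (14.3.4)."""
    return (math.sqrt(math.pi) / math.gamma(alpha)
            * (x / 2.0) ** (alpha - 0.5) * bessel_series(alpha - 1.5, x))

if __name__ == "__main__":
    alpha = 0.8
    for x in (1e-1, 1e-2, 1e-3):
        r1 = t1(alpha, x) * math.gamma(2 * alpha) / x ** (2 * alpha - 1)
        r2 = t2(alpha, x) * math.gamma(2 * alpha - 1) / x ** (2 * alpha - 2)
        print(x, r1, r2)   # both ratios should approach 1 as x -> 0+
```

The same functions can be used to confirm $T_1' = T_2$ by a central difference at a moderate value of $x$.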
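In view of the behavior of the Bessel function (see, e.g. [158], p.133), we again have from (14.3.4)
\[
T_1(n\theta) \sim \frac{1}{\Gamma(\alpha)} \left(\frac{n\theta}{2}\right)^{\alpha-1} \cos\left(n\theta - \frac12\alpha\pi\right) \quad \text{as } n\theta \to \infty,
\]
and
\[
T_2(n\theta) \sim \frac{1}{\Gamma(\alpha)} \left(\frac{n\theta}{2}\right)^{\alpha-1} \cos\left(n\theta - \frac12\alpha\pi + \frac{\pi}{2}\right) \quad \text{as } n\theta \to \infty.
\]
Hence
\[
|T_1(n\theta)| + |T_2(n\theta)| \ge C\,(n\theta)^{\alpha-1} \tag{14.4.19}
\]
for $n\theta \in [B, \infty)$, where $B$ is a large but fixed number.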
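
The large-argument form of $T_1$ can be checked in one line. The sketch below assumes that (14.3.4), which is not reproduced in this section, expresses $T_1$ through $J_{\alpha-\frac12}$ as in its first line (this is consistent with (14.4.39)), and it uses the standard leading-order asymptotic formula for $J_\nu$.

```latex
\begin{align*}
T_1(x) &= \frac{\sqrt{\pi}}{\Gamma(\alpha)} \Bigl(\frac{x}{2}\Bigr)^{\alpha-\frac12} J_{\alpha-\frac12}(x)
  && \text{(assumed form of (14.3.4))} \\
J_\nu(x) &\sim \sqrt{\frac{2}{\pi x}}\,
  \cos\Bigl(x - \tfrac12 \nu\pi - \tfrac14 \pi\Bigr), \qquad x \to \infty,
  && \text{(standard)} \\
T_1(x) &\sim \frac{\sqrt{\pi}}{\Gamma(\alpha)} \Bigl(\frac{x}{2}\Bigr)^{\alpha-\frac12}
  \sqrt{\frac{2}{\pi x}}\,
  \cos\Bigl(x - \tfrac12\bigl(\alpha - \tfrac12\bigr)\pi - \tfrac14\pi\Bigr) \\
 &= \frac{1}{\Gamma(\alpha)} \Bigl(\frac{x}{2}\Bigr)^{\alpha-1}
  \cos\Bigl(x - \tfrac12 \alpha\pi\Bigr).
\end{align*}
```

Here $(x/2)^{\alpha-\frac12}\sqrt{2/(\pi x)} = (x/2)^{\alpha-1}/\sqrt{\pi}$, and the two phase shifts combine to $\frac12\alpha\pi$.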


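To estimate the error terms, we note from (14.3.31) that
\[
|\varepsilon_{T_1}| \le \frac{C(\eta)}{n}\,\theta^{2\alpha-1} e^{-\eta n}
= C(\eta)\,\bigl\{ n^{m-2\alpha} e^{-\eta n} \bigr\}\,\frac{(n\theta)^{2\alpha-1}}{n^m}
\le C\,\frac{(n\theta)^{2\alpha-1}}{n^m}
\]
for $n\theta \in [0, B]$, and from (14.3.32) that
\[
\theta\,|\varepsilon_{T_2}| \le C\,\frac{(n\theta)^{2\alpha-1}}{n^m}
\]
also for $n\theta \in [0, B]$. Here, $C(\eta)$ is the generic symbol for positive constants, independent of $\theta$ and $n$, used in (14.3.30), (14.3.31), and (14.3.32).
When $n\theta \in [B, \infty)$, and hence $\theta \in [B/n, \pi]$, it follows from (14.3.31) that
\[
|\varepsilon_{T_1}| \le C\,\frac{\theta^{\alpha}}{n^{\alpha}}\, e^{-\eta n} (n\theta)^{\alpha-1}
\le C\,\bigl\{ n^{m+|\alpha|-\alpha} e^{-\eta n} \bigr\}\,\frac{(n\theta)^{\alpha-1}}{n^m}
\le C\,\frac{(n\theta)^{\alpha-1}}{n^m}.
\]
Similarly, from (14.3.32) we have
\[
\theta\,|\varepsilon_{T_2}| \le C\,\frac{(n\theta)^{\alpha-1}}{n^m}.
\]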
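Summarizing the last four inequalities, we obtain, in view of (14.4.18) and (14.4.19),
\[
\theta^{\,l-1}\,|\varepsilon_{T_l}| \le C_m\,\frac{1}{n^m}\,\bigl\{ |T_1(n\theta)| + |T_2(n\theta)| \bigr\}, \qquad l = 1, 2,
\]
where $C_m$ is a constant independent of $n$ and $\theta$. Accordingly,
\[
\left| \theta^{1-2\alpha} \sum_{k=0}^{m-1} \frac{\alpha_k(\theta)\,\varepsilon_{T_1} + \beta_k(\theta)\,\varepsilon_{T_2}}{n^k} \right|
\le C_m\,\frac{\theta^{1-2\alpha}}{n^m}\,\bigl\{ |T_1(n\theta)| + |T_2(n\theta)| \bigr\} \tag{14.4.20}
\]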

for all $n$ and $\theta$, where use has been made of the estimates in (14.4.2). Note that the quantity on the left-hand side of this inequality is exactly the third member in the remainder $\varepsilon(\theta, m)$ given in (14.3.25) and (14.4.1).
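An estimate for $\varepsilon_E$ can be obtained by comparing (14.3.9) with (14.4.18) and (14.4.19). Since
\[
e^{-\eta n} = \bigl\{ n^{m-2\alpha+1} e^{-\eta n} \bigr\}\,\frac{\theta^{1-2\alpha}(n\theta)^{2\alpha-1}}{n^m}
\le C\,\frac{\theta^{1-2\alpha}(n\theta)^{2\alpha-1}}{n^m}
\]
for $n\theta \in [0, B]$, and
\[
e^{-\eta n} \le C\,\bigl\{ n^{m-\alpha+|\alpha|+1} e^{-\eta n} \bigr\}\,\frac{\theta^{1-2\alpha}(n\theta)^{\alpha-1}}{n^m}
\le C\,\frac{\theta^{1-2\alpha}(n\theta)^{\alpha-1}}{n^m}
\]
for $n\theta \in [B, \infty)$ (and hence $\theta \in [B/n, \pi]$), it follows that
\[
|\varepsilon_E| \le M_m\,\frac{\theta^{1-2\alpha}}{n^m}\,\bigl[\, |T_1(n\theta)| + |T_2(n\theta)| \,\bigr]. \tag{14.4.21}
\]
Note that $\varepsilon_E$ is the first member in the remainder term $\varepsilon(\theta, m)$ given in (14.3.25).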
To investigate $\varepsilon_{k,E}$, we first analyze $h_k(s, \theta)$ and $g_k(s, \theta)$. Analogous to the sequences $\{A_k(s, \theta)\}$ and $\{B_k(s, \theta)\}$ defined inductively in (14.4.7) and (14.4.8), we now introduce another sequence of rational functions associated with (14.3.22) and (14.3.23). By Cauchy's theorem,
\[
h_0(s, \theta) = \frac{1}{2\pi i} \int_{C_u} \frac{h_0(u, \theta)}{u - s}\, du,
\]
where the integration path $C_u$ is a contour that lies in the domain $D$ of the $u$-holomorphy (see Figure 14.5), and encloses $u = s$ and $u = \pm i$ in the anticlockwise direction. Set
\[
Q_0(u, s, \theta) = \frac{1}{u - s}. \tag{14.4.22}
\]
Then
\[
h_0(s, \theta) = \frac{1}{2\pi i} \int_{C_u} Q_0(u, s, \theta)\, h_0(u, \theta)\, du. \tag{14.4.23}
\]

We further define
\[
Q_k(u, s, \theta) = \frac{1}{\theta} \left[ \frac{d}{du} + 2\alpha\,\frac{u}{u^2 + 1} \right] Q_{k-1}(u, s, \theta), \qquad k = 1, 2, 3, \cdots; \tag{14.4.24}
\]
see the comment following (14.4.8).
In view of (14.4.22) and (14.4.24), it can be shown by induction that
\[
Q_k(u, s, \theta) = \frac{1}{\theta^k} \sum_{l=0}^{k} \frac{P_l(u)}{(u - s)^{k-l+1} (1 + u^2)^l}, \tag{14.4.25}
\]

where $P_l(u)$ is a polynomial of degree $l$ whose coefficients are independent of $s$ and $\theta$. The last equation suggests that
\[
\frac{1}{2\pi i} \int_{C_u} Q_k(u, s, \theta)\, du = 0, \qquad k = 1, 2, 3, \cdots,
\]
\[
\frac{1}{2\pi i} \int_{C_u} \theta\, u\, Q_1(u, s, \theta)\, du = 2\alpha - 1,
\]
\[
\frac{1}{2\pi i} \int_{C_u} u\, Q_k(u, s, \theta)\, du = 0, \qquad k = 2, 3, 4, \cdots.
\]

(These are similar to the last three equations in the proof of Lemma 14.4.2; see also
the proof of Lemma 14.4.3.)
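
The middle identity can be verified by hand. The following is a minimal sketch, assuming only (14.4.22), (14.4.24), and that $C_u$ encloses all finite singularities $u = s, \pm i$ of the integrand, so that the integral equals the coefficient of $1/u$ in the expansion at infinity.

```latex
\begin{align*}
\theta\, Q_1(u,s,\theta)
  &= \Bigl[\frac{d}{du} + \frac{2\alpha u}{u^2+1}\Bigr] \frac{1}{u-s}
   = -\frac{1}{(u-s)^2} + \frac{2\alpha u}{(u^2+1)(u-s)}, \\
\theta\, u\, Q_1(u,s,\theta)
  &= -\frac{u}{(u-s)^2} + \frac{2\alpha u^2}{(u^2+1)(u-s)}
   = \frac{2\alpha - 1}{u} + O\bigl(u^{-2}\bigr) \qquad (u \to \infty), \\
\frac{1}{2\pi i} \oint_{C_u} \theta\, u\, Q_1(u,s,\theta)\, du
  &= 2\alpha - 1 .
\end{align*}
```

The same argument applied to (14.4.25) gives $Q_k(u, s, \theta) = O(u^{-k-1})$ as $u \to \infty$, which accounts for the two vanishing integrals.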
Similar to the derivation of (14.4.10) and (14.4.11), we have

Lemma 14.4.5. For $\theta \in [0, \pi]$ and $k = 0, 1, 2, \cdots$, we have
\[
h_k(s, \theta) = (1 - 2\alpha)\,\frac{\beta_{k-1}(\theta)}{\theta} + \frac{1}{2\pi i} \int_{C_u} Q_k(u, s, \theta)\, h_0(u, \theta)\, du, \tag{14.4.26}
\]
where, for convenience, we have set $\beta_{-1}(\theta) = 0$.

From (14.4.25), one can also see that the following estimates hold:

Lemma 14.4.6. For $\theta \in (0, \pi]$, $|u| \le M/\theta$, $|s| \le M/\theta$, $|u - s| \ge L/\theta$, $|u - i| \ge L/\theta$, and $|u + i| \ge L/\theta$, there exist constants $M_k$, $k = 0, 1, 2, \cdots$, such that
\[
|Q_k(u, s, \theta)| \le M_k\, \theta. \tag{14.4.27}
\]

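Choose an $s$-contour $\Gamma_s$ similar to the contour $\Gamma$ described in the paragraph following Lemma 14.4.4: $\Gamma_s$ consists of
(i) $|s + i| = b/\theta$, $\operatorname{Im} s \ge 0$ and $\operatorname{Re} s \ge -(\eta - \varepsilon)/\theta$;
(ii) $|s - i| = b/\theta$, $\operatorname{Im} s \le 0$ and $\operatorname{Re} s \ge -(\eta - \varepsilon)/\theta$; and
(iii) the segment of $\operatorname{Re} s = -(\eta - \varepsilon)/\theta$ joining (i) and (ii), where $\varepsilon$ is a positive number which is sufficiently small so that $\Gamma_s$ encloses $\pm i$; see Figure 14.7.
Similarly, we define $\Gamma_u$, consisting of
(i) $|u + i| = (b + \varepsilon)/\theta$, $\operatorname{Im} u \ge 0$ and $\operatorname{Re} u \ge -\eta/\theta$;
(ii) $|u - i| = (b + \varepsilon)/\theta$, $\operatorname{Im} u \le 0$ and $\operatorname{Re} u \ge -\eta/\theta$; and
(iii) the segment of $\operatorname{Re} u = -\eta/\theta$ joining (i) and (ii).
Denote by $D_s$ the domain bounded by $\Gamma_s$. If $s \in D_s$ and $u \in \Gamma_u$, then Lemma 14.4.4 holds with $s$ replaced by $u$, and Lemmas 14.4.5 and 14.4.6 hold since $\Gamma_u$ encloses $u = s$ and $u = \pm i$, and lies in $D$, and since it follows from the previous description that $|u - s| \ge \varepsilon/\theta$.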

Fig. 14.7 Contours Γu and Γs

We notice that in the previous derivation leading to (14.3.24) and the estimation leading to (14.4.2), (14.4.20), and (14.4.21), we only require that $\eta$ be a fixed positive number. Hence, in these cases we can replace $\eta$ by a smaller number, say $\eta' = \eta - \varepsilon$, and the previous results remain valid. For convenience, we continue to denote the smaller number $\eta'$ by $\eta$. With this understanding, one obtains the following result by combining Lemmas 14.4.4–14.4.6 and using the fact that $\int_{\Gamma_u} |du| = O(1/\theta)$.

Lemma 14.4.7. For $\theta \in (0, \pi]$, and $k = 0, 1, 2, \cdots$, we have
\[
|h_k(s, \theta)| \le M_k, \qquad s \in D_s, \tag{14.4.28}
\]
where $D_s$ is the domain bounded by
(i) $|s + i| = b/\theta$, $\operatorname{Re} s \ge -\eta/\theta$ and $\operatorname{Im} s \ge 0$;
(ii) $|s - i| = b/\theta$, $\operatorname{Re} s \ge -\eta/\theta$ and $\operatorname{Im} s \le 0$; and
(iii) $\operatorname{Re} s = -\eta/\theta$, $|\operatorname{Im} s| \le \sqrt{b^2 - \eta^2}/\theta$.

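We are now ready to consider the term $\varepsilon_{k,E}$ given in (14.3.26). By (14.3.22),
\[
g_{k-1}(s, \theta)\,(s^2 + 1)^{1-\alpha} e^{ns\theta}
= \left[ h_{k-1}(s, \theta) - \alpha_{k-1}(\theta) - s\theta\,\frac{\beta_{k-1}(\theta)}{\theta} \right] (s^2 + 1)^{-\alpha} e^{ns\theta}.
\]
Since
\[
\eta^2/\theta^2 < \eta^2/\theta^2 + 1 < (\eta^2 + \pi^2)/\theta^2,
\]
it follows that $(s^2 + 1)^{-\alpha}\big|_{s = e^{\pm i\pi}\eta/\theta}$ is bounded by $C(\eta)\,\theta^{2\alpha}$. In view of the boundedness of $h_{k-1}$, $\alpha_{k-1}(\theta)$, and $\beta_{k-1}(\theta)/\theta$, it follows that
\[
|\varepsilon_{k,E}| \le C(\eta, M_{k-1})\, e^{-\eta n}. \tag{14.4.29}
\]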

Using the inequalities preceding (14.4.21), one can show that the estimate for $\varepsilon_E$ in (14.4.21) also holds for $\varepsilon_{k,E}$, $k = 1, 2, \cdots$. Hence, we have
\[
\left| \sum_{k=1}^{m} \frac{\varepsilon_{k,E}}{n^k} \right| \le M_m\,\frac{\theta^{1-2\alpha}}{n^m}\,\bigl[\, |T_1(n\theta)| + |T_2(n\theta)| \,\bigr]. \tag{14.4.30}
\]
Note that this is an estimate for the second member in the error term $\varepsilon(\theta, m)$ given in (14.3.24) and (14.4.1).
Step 3. An Estimate for $\Sigma_m(\theta)$. The only remaining task in this section is to estimate $\Sigma_m$ given in (14.3.27). For $n\theta \in [0, B]$, we deform the integration path $\Gamma$ so that it starts from $e^{-i\pi}\eta/\theta$ and ends at $e^{i\pi}\eta/\theta$, and so that there are positive constants $L$ and $M$ such that $|s \pm i| \ge L/\theta$ and $|s| \le M/\theta$ along $\Gamma$; for an example of such a path, see the paragraph following Lemma 14.4.4. Now make the change of variable $n\theta s = t$, and denote the image of the $s$-curve $\Gamma$ by $\tilde\Gamma_t$. It is readily seen that $\tilde\Gamma_t$ is a curve which starts at $e^{-i\pi}\eta n$ and ends at $e^{i\pi}\eta n$; along $\tilde\Gamma_t$, we have $|t \pm in\theta| \ge nL$, $|t| \le nM$, and
\[
\Sigma_m = \theta^{1-2\alpha} (n\theta)^{2\alpha-1}\,\frac{1}{2\pi i} \int_{\tilde\Gamma_t} h_m(s, \theta)\,\bigl(t^2 + (n\theta)^2\bigr)^{-\alpha} e^{t}\, dt. \tag{14.4.31}
\]
We further deform $\tilde\Gamma_t$ so that it traverses from $e^{-i\pi}\eta n$ to $e^{-i\pi}(2B)$ along the lower edge of the negative real line, moves to $e^{i\pi}(2B)$ on the circle $|t| = 2B$ in the anticlockwise direction, and then proceeds along the upper edge of the negative real line to $e^{i\pi}\eta n$. The deformed curve will still be denoted by $\tilde\Gamma_t$. Along this new curve, $|(t^2 + (n\theta)^2)^{-\alpha}| \le C(B)\,|t^{-2\alpha}|$ and $|h_m(s, \theta)| \le M_m$; see (14.4.28). Thus
\[
|\Sigma_m| \le \theta^{1-2\alpha}\, C(M_m, B)\,(n\theta)^{2\alpha-1} \int_{\tilde\Gamma_t} |t^{-2\alpha} e^{t}|\,|dt|
\le C(M_m, B)\,\theta^{1-2\alpha} (n\theta)^{2\alpha-1} \tag{14.4.32}
\]
for $n\theta \in [0, B]$. In view of (14.4.18), we obtain
\[
|\Sigma_m| \le M_m\,\theta^{1-2\alpha}\,\bigl[\, |T_1(n\theta)| + |T_2(n\theta)| \,\bigr] \tag{14.4.33}
\]
for $n\theta \in [0, B]$, $m = 1, 2, 3, \cdots$.


Finally we consider the case when $n\theta \to +\infty$. First, we introduce a curve $\Gamma_c$ depending on $n\theta$, which starts at $e^{-i\pi}\eta/\theta$, moves to $e^{-i\pi}/(n\theta)$ along the lower edge of the negative real axis, encircles the origin along the circle $|s| = 1/(n\theta)$ in the positive sense, and then proceeds from $e^{i\pi}/(n\theta)$ to $e^{i\pi}\eta/\theta$ along the upper edge of the negative real line. We now deform the path of integration in (14.3.27), and split it into three parts: $\Gamma_i = \Gamma_c + i$, $\Gamma_{-i} = \Gamma_c - i$, and $\Gamma_r$, where $\Gamma_r$ consists of three segments on $\operatorname{Re} s = -\eta/\theta$ connecting:
(i) $e^{i\pi}\eta/\theta - i$ and $e^{-i\pi}\eta/\theta + i$;
(ii) $e^{i\pi}\eta/\theta + i$ and $e^{i\pi}\eta/\theta$;
(iii) $e^{-i\pi}\eta/\theta - i$ and $e^{-i\pi}\eta/\theta$; see Figure 14.8.

Fig. 14.8 Contour Γ = Γi ∪ Γ−i ∪ Γr
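We know from Lemma 14.4.7 that $h_m(s, \theta)$ is bounded on $\Gamma = \Gamma_i \cup \Gamma_{-i} \cup \Gamma_r$, and that the bound is uniform in $\theta \in [0, \pi - \delta]$. Consider
\[
\begin{aligned}
I_i &\equiv \frac{1}{2\pi i} \int_{\Gamma_i} h_m(s, \theta)\,(s^2 + 1)^{-\alpha} e^{n\theta s}\, ds \\
&= \frac{e^{in\theta}}{2\pi i} \int_{\Gamma_c} \bigl\{ h_m(s + i, \theta)\,(s + 2i)^{-\alpha} \bigr\}\, s^{-\alpha} e^{n\theta s}\, ds.
\end{aligned}
\]
In the last integral we put $v(s) \equiv h_m(s + i, \theta)(s + 2i)^{-\alpha}$ and make the change of variable $t = n\theta s$. Since $v(s)$ is uniformly bounded on $\{\Gamma_c : \operatorname{Re} s \ge -3\}$, we have
\[
\begin{aligned}
\left| \frac{e^{in\theta}}{2\pi i} \int_{\{\Gamma_c :\, \operatorname{Re} s \ge -3\}} v(s)\, s^{-\alpha} e^{n\theta s}\, ds \right|
&= (n\theta)^{\alpha-1} \left| \frac{e^{in\theta}}{2\pi i} \int_{-3n\theta}^{(0+)} v(s(t))\, t^{-\alpha} e^{t}\, dt \right| \\
&\le C \left[ \int_{-\infty}^{(0+)} |t|^{-\alpha} e^{\operatorname{Re} t}\, |dt| \right] (n\theta)^{\alpha-1}.
\end{aligned}
\]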

On the other hand,
\[
\left| \frac{e^{in\theta}}{2\pi i} \int_{\{\Gamma_c :\, \operatorname{Re} s \le -3\}} w(s)\, e^{n\theta s}\, ds \right|
\le \frac{M_m C(\alpha)}{\pi} \int_{3}^{\eta/\theta} t^{-2\alpha} e^{-n\theta t}\, dt
\le C e^{-3n\theta},
\]
which is in turn bounded by $(n\theta)^{\alpha-1}$ for $n\theta \ge B$. The second to last inequality follows from the fact that $w(s) \equiv h_m(s + i, \theta)(s + 2i)^{-\alpha} s^{-\alpha}$ is uniformly bounded by $C|s|^{-2\alpha}$ on that part of $\Gamma_c$. Hence
\[
|I_i| \le C (n\theta)^{\alpha-1}. \tag{14.4.34}
\]
Similarly, we have
\[
|I_{-i}| \le C (n\theta)^{\alpha-1}. \tag{14.4.35}
\]
The estimate of the integral $I_r$ over $\Gamma_r$ can be obtained by taking the absolute value of the integrand. Indeed, we have
\[
|I_r| \le C e^{-\eta n}\, \theta^{-2\alpha} \le C (n\theta)^{\alpha-1}. \tag{14.4.36}
\]
To obtain the last inequality, we have used the fact that $\theta \in [B/n, \pi]$ for $n\theta \in [B, \infty)$. A combination of (14.4.34), (14.4.35), (14.4.36), and the fact that $\Sigma_m = \theta^{1-2\alpha}(I_i + I_{-i} + I_r)$ gives
\[
|\Sigma_m(\theta)| \le C\,\theta^{1-2\alpha} (n\theta)^{\alpha-1} \le C\,\theta^{1-2\alpha}\,\bigl\{ |T_1(n\theta)| + |T_2(n\theta)| \bigr\} \tag{14.4.37}
\]
for $n\theta \in [B, +\infty)$, $B$ being sufficiently large, $m = 1, 2, 3, \cdots$; see (14.4.19). The results in (14.4.33) and (14.4.37) imply that there exists a constant $C$ such that
\[
|\Sigma_m(\theta)| \le C\,\theta^{1-2\alpha}\,\bigl\{ |T_1(n\theta)| + |T_2(n\theta)| \bigr\} \tag{14.4.38}
\]
for all $n$ and $\theta$. The desired result (14.4.3) now follows from (14.4.20), (14.4.21), (14.4.30), (14.4.38), and (14.3.25).
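Remark. To conclude this section, we note that in addition to the expression in (14.3.4), the approximant $T_2(x)$ in (14.4.1) can also be expressed as
\[
T_2(n\theta) = \frac{(\alpha - \frac12)\sqrt{\pi}}{\Gamma(\alpha)} \left( \frac{n\theta}{2} \right)^{\alpha-\frac32} J_{\alpha-\frac12}(n\theta)
- \frac{\sqrt{\pi}}{\Gamma(\alpha)} \left( \frac{n\theta}{2} \right)^{\alpha-\frac12} J_{\alpha+\frac12}(n\theta). \tag{14.4.39}
\]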
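As a consequence, we have the following useful corollary.

Corollary 14.4.8. Under the same assumption as Theorem 14.4.1, the following holds:
\[
a_n(\theta) = \left( \frac{n}{2\theta} \right)^{\alpha-\frac12} J_{\alpha-\frac12}(n\theta) \sum_{k=0}^{m-1} \frac{\tilde\alpha_k(\theta)}{n^k}
+ \left( \frac{n}{2\theta} \right)^{\alpha-\frac12} J_{\alpha+\frac12}(n\theta) \sum_{k=0}^{m-1} \frac{\tilde\beta_k(\theta)}{n^k}
+ \tilde\varepsilon(\theta, m). \tag{14.4.40}
\]
With $\alpha_k$, $\beta_k$, and $\varepsilon(\theta, m)$ replaced by $\tilde\alpha_k$, $\tilde\beta_k$, and $\tilde\varepsilon(\theta, m)$, respectively, the estimates in (14.4.2) and (14.4.3) remain valid.

Indeed, inserting (14.3.4) and (14.4.39) into (14.4.1) gives, explicitly,
\[
\begin{aligned}
\tilde\alpha_0(\theta) &= \sqrt{\pi}\,\alpha_0(\theta)/\Gamma(\alpha), \\
\tilde\alpha_k(\theta) &= \bigl(\sqrt{\pi}/\Gamma(\alpha)\bigr)\bigl[\alpha_k(\theta) + (2\alpha - 1)\,\beta_{k-1}(\theta)/\theta\bigr], \qquad k = 1, 2, 3, \cdots, \\
\tilde\beta_k(\theta) &= -\sqrt{\pi}\,\beta_k(\theta)/\Gamma(\alpha), \qquad k = 0, 1, 2, \cdots,
\end{aligned} \tag{14.4.41}
\]
and
\[
\tilde\varepsilon(\theta, m) = \varepsilon(\theta, m) + \frac{(2\alpha - 1)\sqrt{\pi}}{\Gamma(\alpha)} \left( \frac{n}{2\theta} \right)^{\alpha-\frac12} \frac{\beta_{m-1}(\theta)}{\theta}\,\frac{1}{n^m}\, J_{\alpha-\frac12}(n\theta). \tag{14.4.42}
\]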

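The above result is immediately applicable to the ultraspherical polynomials $P_n^{(\lambda)}(x)$ defined by
\[
\bigl[(e^{i\theta} - z)(e^{-i\theta} - z)\bigr]^{-\lambda} = \sum_{n=0}^{\infty} P_n^{(\lambda)}(\cos\theta)\, z^n. \tag{14.4.43}
\]
With $\alpha = \lambda$ and $a_n(\theta) = P_n^{(\lambda)}(\cos\theta)$, one can immediately write down the asymptotic expansion
\[
P_n^{(\lambda)}(\cos\theta) \sim \left( \frac{n}{2\theta} \right)^{\lambda-\frac12} J_{\lambda-\frac12}(n\theta) \sum_{k=0}^{\infty} \frac{\tilde\alpha_k(\theta)}{n^k}
+ \left( \frac{n}{2\theta} \right)^{\lambda-\frac12} J_{\lambda+\frac12}(n\theta) \sum_{k=0}^{\infty} \frac{\tilde\beta_k(\theta)}{n^k} \tag{14.4.44}
\]
uniformly for $\theta \in [0, \pi - \delta]$, $\delta > 0$. Here
\[
\tilde\alpha_0(\theta) = \frac{\sqrt{\pi}}{\Gamma(\lambda)} \left( \frac{\sin\theta}{\theta} \right)^{-\lambda} \cos\lambda\theta, \qquad
\tilde\beta_0(\theta) = -\frac{\sqrt{\pi}}{\Gamma(\lambda)} \left( \frac{\sin\theta}{\theta} \right)^{-\lambda} \sin\lambda\theta,
\]
\[
\tilde\alpha_1(\theta) = \frac{\sqrt{\pi}}{\Gamma(\lambda)} \left( \frac{\sin\theta}{\theta} \right)^{-\lambda} \lambda \left\{ \frac{\lambda - 1}{2} \left[ -\frac{\theta\cos\theta - \sin\theta}{\theta\sin\theta}\, \sin\lambda\theta + 2\cos\lambda\theta \right] + \frac{\sin\lambda\theta}{\theta} \right\},
\]
and
\[
\tilde\beta_1(\theta) = -\frac{\sqrt{\pi}}{\Gamma(\lambda)}\,\frac12\,\lambda(\lambda - 1) \left( \frac{\sin\theta}{\theta} \right)^{-\lambda} \left[ \frac{\theta\cos\theta - \sin\theta}{\theta\sin\theta}\, \cos\lambda\theta + 2\sin\lambda\theta \right].
\]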

Setting $\lambda = \frac12$, the above result reduces to a uniform asymptotic expansion for the Legendre polynomials defined in (14.0.2).
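
A quick sanity check of (14.4.44) is possible for $\lambda = 1$, where everything is elementary. The sketch below is an illustration, not part of the original argument. It uses the standard closed forms of $J_{1/2}$ and $J_{3/2}$ and the fact that (14.4.43) with $\lambda = 1$ generates $\sin((n+1)\theta)/\sin\theta$. For $\lambda = 1$ the coefficients reduce to $\tilde\alpha_0 = \sqrt{\pi}\,\theta\cot\theta$, $\tilde\beta_0 = -\sqrt{\pi}\,\theta$, $\tilde\alpha_1 = \sqrt{\pi}$, and $\tilde\beta_1 = 0$. The function names are hypothetical.

```python
import math


def bessel_j_half(x):
    """Closed form of the Bessel function J_{1/2}(x) for x > 0."""
    return math.sqrt(2.0 / (math.pi * x)) * math.sin(x)


def bessel_j_three_halves(x):
    """Closed form of the Bessel function J_{3/2}(x) for x > 0."""
    return math.sqrt(2.0 / (math.pi * x)) * (math.sin(x) / x - math.cos(x))


def two_term_expansion(n, theta):
    """Right-hand side of (14.4.44) for lambda = 1, truncated after k = 1."""
    sqrt_pi = math.sqrt(math.pi)         # Gamma(1) = 1
    ratio = theta / math.sin(theta)      # (sin(theta)/theta)^(-1)
    alpha0 = sqrt_pi * ratio * math.cos(theta)
    beta0 = -sqrt_pi * ratio * math.sin(theta)
    alpha1 = sqrt_pi                     # the (lambda - 1) terms vanish
    beta1 = 0.0
    x = n * theta
    scale = math.sqrt(n / (2.0 * theta))  # (n / (2 theta))^(lambda - 1/2)
    return scale * (
        bessel_j_half(x) * (alpha0 + alpha1 / n)
        + bessel_j_three_halves(x) * (beta0 + beta1 / n)
    )


def ultraspherical_lambda_one(n, theta):
    """P_n^{(1)}(cos theta) = sin((n + 1) theta) / sin(theta)."""
    return math.sin((n + 1) * theta) / math.sin(theta)


if __name__ == "__main__":
    for n in (1, 5, 40):
        for theta in (0.3, 1.0, 2.5):
            exact = ultraspherical_lambda_one(n, theta)
            approx = two_term_expansion(n, theta)
            print(n, theta, exact, approx)
```

In this special case the two-term truncation agrees with the polynomial up to rounding, since the sum collapses to $(\sin n\theta\cos\theta + \cos n\theta\sin\theta)/\sin\theta$. For general $\lambda$ one should expect agreement only up to the error bound of Corollary 14.4.8.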

14.5 Heisenberg polynomials

The Heisenberg polynomials are polynomials in $z$ and $\bar z$, defined by
\[
C_n^{(\alpha,\beta)}(z) = \sum_{j=0}^{n} \frac{(\alpha)_j\,(\beta)_{n-j}}{j!\,(n-j)!}\, \bar z^{\,j} z^{\,n-j}, \qquad n = 0, 1, 2, \cdots, \tag{14.5.1}
\]
where $\alpha$ and $\beta$ are real numbers, and $(\gamma)_k$ is the Pochhammer symbol defined by $(\gamma)_0 = 1$ and $(\gamma)_k = \gamma(\gamma + 1) \cdots (\gamma + k - 1)$. This representation can be derived from the generating function
\[
(1 - w\bar z)^{-\alpha} (1 - wz)^{-\beta} = \sum_{n=0}^{\infty} C_n^{(\alpha,\beta)}(z)\, w^n, \qquad |wz| < 1. \tag{14.5.2}
\]

The notation $C_n^{(\alpha,\beta)}(z)$ was used by Gasper [86] in the sense of (14.5.1) and (14.5.2), but the term Heisenberg polynomials was first used by Dunkl [63].
From (14.5.2), it is readily seen that the Heisenberg polynomials have the property $C_n^{(\alpha,\beta)}(\rho e^{i\theta}) = \rho^n C_n^{(\alpha,\beta)}(e^{i\theta})$. Hence, to study the behavior of these polynomials as $n \to \infty$, it suffices to consider the polynomials on the unit circle. The generating function in (14.5.2) now takes the form
\[
(e^{i\theta} - z)^{-\alpha} (e^{-i\theta} - z)^{-\beta}\, e^{i\theta(\alpha-\beta)} = \sum_{n=0}^{\infty} C_n^{(\alpha,\beta)}(e^{i\theta})\, z^n. \tag{14.5.3}
\]
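
The definition (14.5.1), the generating function (14.5.2), and the homogeneity property can be checked numerically. The following is a minimal sketch with hypothetical function names and arbitrarily chosen sample parameters; it evaluates the finite sum directly and compares a partial sum of the series with the closed form on the principal branch.

```python
import cmath
from math import factorial


def pochhammer(g, k):
    """Pochhammer symbol (g)_k = g (g + 1) ... (g + k - 1), with (g)_0 = 1."""
    result = 1.0
    for i in range(k):
        result *= g + i
    return result


def heisenberg(n, a, b, z):
    """C_n^{(a,b)}(z) evaluated from the finite sum (14.5.1)."""
    z_bar = z.conjugate()
    return sum(
        pochhammer(a, j) * pochhammer(b, n - j)
        / (factorial(j) * factorial(n - j))
        * z_bar ** j * z ** (n - j)
        for j in range(n + 1)
    )


def generating_function_partial_sum(a, b, z, w, terms):
    """Partial sum of the right-hand side of (14.5.2)."""
    return sum(heisenberg(n, a, b, z) * w ** n for n in range(terms))


if __name__ == "__main__":
    a, b = 0.7, 1.3                    # sample parameters (arbitrary)
    z = 0.8 * cmath.exp(0.7j)
    w = 0.3 + 0.2j                     # |w z| < 1, so the series converges
    closed = (1 - w * z.conjugate()) ** (-a) * (1 - w * z) ** (-b)
    print(abs(generating_function_partial_sum(a, b, z, w, 60) - closed))

    # Homogeneity: C_n(rho e^{i theta}) = rho^n C_n(e^{i theta}).
    e = cmath.exp(0.7j)
    print(abs(heisenberg(5, a, b, 2.0 * e) - 2.0 ** 5 * heisenberg(5, a, b, e)))
```

Both printed differences should be at the level of rounding error. The homogeneity is visible in the sum itself, since every term has total degree $n$ in $z$ and $\bar z$.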

Note that the exponents of the two factors on the left-hand side of the above equation are different; hence, the result of Theorem 14.4.1 is not directly applicable to Heisenberg polynomials. However, the arguments in the last two sections can be modified to deal with the current situation. To this end, we define
\[
T_l(x) = \frac{1}{2\pi i} \int_{\Gamma_0} s^{l-1} (s - i)^{-\beta} (s + i)^{-\alpha} e^{xs}\, ds, \qquad l = 1, 2, \tag{14.5.4}
\]
where $\Gamma_0$ is a Hankel-type loop which starts and ends at $-\infty$ and encircles $s = \pm i$ in the positive sense; cf. (14.3.3). We further introduce an auxiliary function
\[
h_0(s, \theta) = e^{i\theta(\alpha-\beta)} \left[ \frac{e^{i\theta} - e^{-\theta s}}{(s + i)\theta} \right]^{-\alpha} \left[ \frac{e^{-i\theta} - e^{-\theta s}}{(s - i)\theta} \right]^{-\beta} \tag{14.5.5}
\]
and a sequence $\{h_k(s, \theta)\}_{k=0}^{\infty}$ defined by
\[
h_k(s, \theta) = \alpha_k(\theta) + s\beta_k(\theta) + (s^2 + 1)\, g_k(s, \theta), \tag{14.5.6a}
\]
\[
h_{k+1}(s, \theta) = \frac{s^2 + 1}{\theta} \left[ \frac{\alpha - 1}{s + i} + \frac{\beta - 1}{s - i} - \frac{d}{ds} \right] g_k(s, \theta) \tag{14.5.6b}
\]
for $k = 0, 1, 2, \cdots$; cf. (14.3.13), (14.3.14), (14.3.22), and (14.3.23). The coefficients $\alpha_k(\theta)$ and $\beta_k(\theta)$ are determined by requiring all $h_k(s, \theta)$ and $g_k(s, \theta)$ to be holomorphic in $D = \{s : \operatorname{Re} s \ge -\frac{\eta}{\theta},\ |s \pm i| < \frac{2\pi}{\theta}\}$; see Fig. 14.5.
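Theorem 14.5.1. For $\theta \in [0, \pi - \delta]$ with arbitrary $\delta > 0$, we have
\[
C_n^{(\alpha,\beta)}(e^{i\theta}) = \theta^{1-\alpha-\beta}\, T_1(n\theta) \sum_{k=0}^{m-1} \frac{\alpha_k(\theta)}{n^k}
+ \theta^{1-\alpha-\beta}\, T_2(n\theta) \sum_{k=0}^{m-1} \frac{\beta_k(\theta)}{n^k} + \varepsilon_{\theta,m}, \tag{14.5.7}
\]
where $|\alpha_k(\theta)| \le M_k$, $|\beta_k(\theta)/\theta| \le M_k$ for $k = 0, 1, 2, \cdots$, and
\[
|\varepsilon_{\theta,m}| \le M_m\, \theta^{1-\alpha-\beta}\, n^{-m}\, \bigl\{ |T_1(n\theta)| + |T_2(n\theta)| \bigr\}, \tag{14.5.8}
\]
for $m = 1, 2, \cdots$. The positive constants $M_k$, $k = 1, 2, \cdots$, are independent of $\theta$ for $\theta \in [0, \pi - \delta]$. The coefficients $\alpha_k(\theta)$ and $\beta_k(\theta)$ are given by (14.5.6), with
\[
\alpha_0(\theta) = \frac{e^{i\theta\alpha}}{2} \left( \frac{\sin\theta}{\theta} \right)^{-\alpha} + \frac{e^{-i\theta\beta}}{2} \left( \frac{\sin\theta}{\theta} \right)^{-\beta},
\]
\[
\beta_0(\theta) = \frac{e^{i\theta\alpha}}{2i} \left( \frac{\sin\theta}{\theta} \right)^{-\alpha} - \frac{e^{-i\theta\beta}}{2i} \left( \frac{\sin\theta}{\theta} \right)^{-\beta}.
\]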

By expanding the slowly varying factor in the integrand of (14.5.4) in a uniformly convergent power series of $1/s$ and integrating term by term, we obtain
\[
T_1(x) = x^{\alpha+\beta-1} \sum_{k=0}^{\infty} \sum_{l=0}^{k} \frac{i^k (-1)^l (\alpha)_l\, (\beta)_{k-l}}{l!\,(k - l)!\,\Gamma(\alpha + \beta + k)}\, x^k, \tag{14.5.9}
\]
where $x^{\alpha+\beta-1}$ is positive for real positive $x$. It is easily seen that $x^{-\alpha-\beta+1} T_1(x)$ is an entire function. From (14.5.4), it is also readily verified that $T_2(x) = T_1'(x)$ in the cut plane $\mathbb{C} \setminus (-\infty, 0]$. Moreover, with $a = 2 - \alpha - \beta$ and $b = \beta - \alpha$, $T_1(x)$ satisfies the differential equation
\[
x T_1'' + a T_1' + (x - bi)\, T_1 = 0. \tag{14.5.10}
\]
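
The differential equation can be derived directly from the integral (14.5.4). The following sketch assumes $x > 0$, so that the boundary terms vanish at the ends of $\Gamma_0$ where $\operatorname{Re} s \to -\infty$; primes denote $d/dx$, and $f$ is shorthand introduced here for the algebraic factor of the integrand.

```latex
\begin{align*}
f(s) &:= (s-i)^{-\beta}(s+i)^{-\alpha}, \qquad
T_1(x) = \frac{1}{2\pi i}\int_{\Gamma_0} f(s)\, e^{xs}\, ds, \\
x T_1'' + a T_1' + (x - bi)\, T_1
 &= \frac{1}{2\pi i}\int_{\Gamma_0} \bigl[\, x(s^2+1) + a s - bi \,\bigr] f(s)\, e^{xs}\, ds, \\
\frac{1}{2\pi i}\int_{\Gamma_0} x (s^2+1) f(s)\, e^{xs}\, ds
 &= -\frac{1}{2\pi i}\int_{\Gamma_0} \frac{d}{ds}\bigl[(s^2+1) f(s)\bigr]\, e^{xs}\, ds
 \qquad \text{(integration by parts)}, \\
\frac{d}{ds}\bigl[(s^2+1) f(s)\bigr]
 &= \frac{d}{ds}\bigl[(s-i)^{1-\beta}(s+i)^{1-\alpha}\bigr]
  = \bigl[(2-\alpha-\beta)\, s + i(\alpha-\beta)\bigr] f(s)
  = (a s - bi)\, f(s).
\end{align*}
```

Substituting the last two lines into the second shows that the integrand cancels identically, which gives (14.5.10).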

Furthermore, making the change of variables
\[
T_1(x) = x^{1-a} e^{ix} y(x), \qquad z = -2ix \tag{14.5.11}
\]
yields the confluent hypergeometric equation
\[
z\,\frac{d^2 y}{dz^2} + \bigl[(2 - a) - z\bigr] \frac{dy}{dz} - \frac{2 - a - b}{2}\, y = 0. \tag{14.5.12}
\]
Taking into account the first two terms of the infinite series in (14.5.9), one obtains
\[
T_1(x) = \frac{1}{\Gamma(\alpha + \beta)}\, x^{\alpha+\beta-1} e^{ix} M(\alpha, \alpha + \beta, -2ix), \tag{14.5.13}
\]
where $M$ is the Kummer function [22], p.201. Since $T_2(x) = T_1'(x)$, it also follows that
\[
T_2(x) = \frac{(\alpha + \beta - 1 + ix)\, x^{\alpha+\beta-2} e^{ix}}{\Gamma(\alpha + \beta)}\, M(\alpha, \alpha + \beta, -2ix)
- \frac{2i\, x^{\alpha+\beta-1} e^{ix}}{\Gamma(\alpha + \beta)}\, M'(\alpha, \alpha + \beta, -2ix), \tag{14.5.14}
\]
where $M'(\gamma, \delta, z) = \frac{d}{dz} M(\gamma, \delta, z)$. Substituting (14.5.13) and (14.5.14) into (14.5.7), we obtain an asymptotic expansion of the Heisenberg polynomials in terms of the Kummer function.
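
The agreement between the double series (14.5.9) and the closed form (14.5.13) can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, the Kummer function is summed from its defining power series (adequate for moderate $|z|$, not a general-purpose evaluator), and the sample parameters are arbitrary.

```python
import cmath
import math


def pochhammer(g, k):
    """Pochhammer symbol (g)_k, with (g)_0 = 1."""
    result = 1.0
    for i in range(k):
        result *= g + i
    return result


def t1_series(x, a, b, terms=80):
    """T_1(x) from the double series (14.5.9), for real x > 0."""
    total = 0j
    for k in range(terms):
        inner = sum(
            (-1) ** l * pochhammer(a, l) * pochhammer(b, k - l)
            / (math.factorial(l) * math.factorial(k - l))
            for l in range(k + 1)
        )
        total += (1j ** k) * inner * x ** k / math.gamma(a + b + k)
    return x ** (a + b - 1) * total


def kummer_m(a, c, z, terms=200):
    """Kummer function M(a, c, z) summed from its power series."""
    term = 1 + 0j
    total = term
    for k in range(terms):
        term *= (a + k) / (c + k) * z / (k + 1)
        total += term
    return total


def t1_kummer(x, a, b):
    """T_1(x) from the closed form (14.5.13), for real x > 0."""
    return (
        x ** (a + b - 1) * cmath.exp(1j * x)
        * kummer_m(a, a + b, -2j * x) / math.gamma(a + b)
    )


if __name__ == "__main__":
    x, a, b = 2.0, 0.7, 1.3            # sample values (arbitrary)
    print(t1_series(x, a, b), t1_kummer(x, a, b))
```

The two evaluations should agree to rounding error for moderate $x$. For large $x$ both power series suffer from cancellation, and an asymptotic evaluation of $M$ would be preferable.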

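Theorem 14.5.2. Assume that $\alpha$ and $\beta$ are real and fixed and that $z = \rho e^{i\theta}$ with $\rho > 0$ and $\theta$ real. Then we have the compound asymptotic expansion [158], p.118:
\[
C_n^{(\alpha,\beta)}(z) \sim n^{\alpha+\beta-1} z^n \left[ M(\alpha, \alpha + \beta, -2in\theta) \sum_{k=0}^{\infty} \frac{c_k(\theta)}{n^k}
+ M'(\alpha, \alpha + \beta, -2in\theta) \sum_{k=0}^{\infty} \frac{d_k(\theta)}{n^k} \right] \tag{14.5.15}
\]
as $n \to \infty$, uniformly with respect to $\rho \in (0, \infty)$ and $\theta \in [0, \pi - \delta]$, where $0 < \delta \le \pi$. The coefficients are given by
\[
c_k(\theta) = \frac{\alpha_k(\theta) + i\beta_k(\theta)}{\Gamma(\alpha + \beta)} + \frac{\beta_{k-1}(\theta)/\theta}{\Gamma(\alpha + \beta - 1)}, \qquad
d_k(\theta) = -\frac{2i\beta_k(\theta)}{\Gamma(\alpha + \beta)} \tag{14.5.16}
\]
for $k = 0, 1, 2, \cdots$, with $\beta_{-1}(\theta) = 0$, $\alpha_k(\theta)$ and $\beta_k(\theta)$ being defined in (14.5.6). In particular,
\[
\begin{aligned}
c_0(\theta) &= \frac{e^{i\theta\alpha}}{\Gamma(\alpha + \beta)} \left( \frac{\sin\theta}{\theta} \right)^{-\alpha}, \\
d_0(\theta) &= \frac{e^{-i\theta\beta}}{\Gamma(\alpha + \beta)} \left( \frac{\sin\theta}{\theta} \right)^{-\beta} - \frac{e^{i\theta\alpha}}{\Gamma(\alpha + \beta)} \left( \frac{\sin\theta}{\theta} \right)^{-\alpha}.
\end{aligned} \tag{14.5.17}
\]
When the parameters $\alpha$ and $\beta$ in the above theorem are non-positive integers, the coefficients $c_k$ and $d_k$ in (14.5.15) all vanish; see (14.5.16). However, the asymptotic relation remains valid, since the polynomials also vanish for large values of $n$; that is, the result is trivial in this case.
The results in this section are taken from [135]. For a proof of Theorem 14.5.1, we refer to [135].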

Exercises

1. Use Stirling's approximation (2.10.5) to prove (Re 4.1.19a).
2. Prove the Riemann–Lebesgue lemma: if $\int_{-\infty}^{\infty} |f(\theta)|\, d\theta < \infty$, then
\[
\lim_{n \to \infty} \int_{-\infty}^{\infty} e^{in\theta} f(\theta)\, d\theta = 0.
\]
Hint: by a density argument, it is enough to verify this for the characteristic function of a bounded interval.
3. Prove the two identities in (14.3.4).
4. For $n = 1, 2, 3, \ldots$, let
\[
a_n = 1 + \frac{1 - n}{1} + \frac{(2 - n)^2}{2} + \cdots + \frac{(-1)^{n-1}}{n - 1}.
\]
Find the generating function of the sequence $\{a_k\}$ and show that $a_n \to 0$ as $n \to \infty$.
5. Let $q_n$ denote the probability that in $n$ tosses of an ideal coin, no run of three consecutive heads appears. Clearly $q_0 = q_1 = q_2 = 1$, and in probability theory it is established that $q_n = \frac12 q_{n-1} + \frac14 q_{n-2} + \frac18 q_{n-3}$. Show that the $\{q_n\}$ have generating function
\[
\sum_{n=0}^{\infty} q_n t^n = \frac{2t^2 + 4t + 8}{8 - 4t - 2t^2 - t^3}
\]
and deduce the asymptotic formula
\[
q_n \sim \frac{1.2368398446}{(1.0873780254)^{n+1}} \quad \text{as } n \to \infty.
\]
(See Feller [70], p.278.)


6. The Charlier polynomials $C_n^{(a)}(x)$ can be defined by the generating function
\[
e^{-a\omega} (1 + \omega)^x = \sum_{n=0}^{\infty} C_n^{(a)}(x)\, \frac{\omega^n}{n!}.
\]
For fixed $x > 0$, find an asymptotic expansion for $C_n^{(a)}(x)$ as $n \to \infty$. Write out the first two terms of the expansion. (See [20], equations (5.3.6) and (5.3.12).)
7. The Meixner polynomials $m_n(x; \beta, c)$ have generating function
\[
\left( 1 - \frac{\omega}{c} \right)^{x} (1 - \omega)^{-x-\beta} = \sum_{n=0}^{\infty} m_n(x; \beta, c)\, \frac{\omega^n}{n!}.
\]
For fixed $x > 0$, find an asymptotic expansion for $m_n(x; \beta, c)$ as $n \to \infty$. Calculate the coefficients of the two leading terms. (See [20], equation (5.5.7) and Exercise 10.21, pp.262–264.)
8. The Stirling numbers of the first kind $S_n^m$, $m, n = 0, 1, 2, \ldots$, have generating function
\[
[\log(1 + t)]^m = m! \sum_{n=0}^{\infty} S_n^m\, \frac{t^n}{n!}.
\]
Use the results in Section 14.2 to write out an asymptotic expansion for $S_n^m$ as $n \to \infty$. (This formula was first given in Jordan [117].)
9. The Jacobi polynomials $P_n^{(\alpha,\beta)}$ have the generating function
\[
\sum_{n=0}^{\infty} P_n^{(\alpha,\beta)}(z)\, t^n = 2^{\alpha+\beta} R^{-1} (1 - t + R)^{-\alpha} (1 + t + R)^{-\beta},
\]
where $R = R(z, t) = (1 - 2zt + t^2)^{1/2}$. The branch of the square root is chosen so that $R(z, 0) = 1$. Use Darboux's method to show that:
(a) For $\theta \in [\varepsilon, \pi - \varepsilon]$, $\varepsilon > 0$, we have
\[
P_n^{(\alpha,\beta)}(\cos\theta) = n^{-1/2} k(\theta) \cos(N\theta + \gamma) + O(n^{-3/2}),
\]
where
\[
k(\theta) = \pi^{-1/2} \bigl(\sin\tfrac12\theta\bigr)^{-\alpha-\frac12} \bigl(\cos\tfrac12\theta\bigr)^{-\beta-\frac12}, \qquad
N = n + \tfrac12(\alpha + \beta + 1), \qquad \gamma = -\tfrac12\pi\bigl(\alpha + \tfrac12\bigr),
\]
and the $O(n^{-3/2})$ term is uniform on the interval $[\varepsilon, \pi - \varepsilon]$. (See [20], equations (4.6.7) and (4.6.11).)
(b) For fixed $x > 1$ we have
\[
P_n^{(\alpha,\beta)}(x) \sim (x - 1)^{-\alpha/2} (x + 1)^{-\beta/2} \bigl[\sqrt{x - 1} + \sqrt{x + 1}\,\bigr]^{\alpha+\beta}
\times (x^2 - 1)^{-1/4} (2\pi n)^{-1/2} \bigl[x + \sqrt{x^2 - 1}\,\bigr]^{n+\frac12}.
\]
10. The Heisenberg polynomials $C_n^{(\alpha,\beta)}$ have the generating function given in (14.5.2):
\[
(1 - \omega\bar z)^{-\alpha} (1 - \omega z)^{-\beta} = \sum_{n=0}^{\infty} C_n^{(\alpha,\beta)}(z)\, \omega^n, \qquad |\omega z| < 1.
\]
As noted at the beginning of Section 14.5, the homogeneity property of these polynomials allows us, in studying asymptotics, to restrict attention to the unit circle and thus to the restricted generating function (14.5.3):
\[
(e^{i\theta} - z)^{-\alpha} (e^{-i\theta} - z)^{-\beta}\, e^{i\theta(\alpha-\beta)} = \sum_{n=0}^{\infty} C_n^{(\alpha,\beta)}(e^{i\theta})\, z^n.
\]
By modifying the arguments given in this chapter, prove the result stated in Theorem 14.5.1.
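
Before attempting Exercise 5 analytically, the stated formulas can be checked numerically. The sketch below is a sanity check only, not a solution; the function names are hypothetical. It builds $q_n$ exactly from the recurrence, compares a partial sum of the generating series with the rational function, and compares $q_n$ with the asymptotic formula using the constants quoted in the exercise.

```python
from fractions import Fraction


def no_run_probabilities(n_max):
    """Exact q_0, ..., q_{n_max} from q_n = q_{n-1}/2 + q_{n-2}/4 + q_{n-3}/8."""
    q = [Fraction(1), Fraction(1), Fraction(1)]
    for n in range(3, n_max + 1):
        q.append(q[n - 1] / 2 + q[n - 2] / 4 + q[n - 3] / 8)
    return q[: n_max + 1]


def generating_function(t):
    """Rational generating function stated in Exercise 5."""
    return (2 * t ** 2 + 4 * t + 8) / (8 - 4 * t - 2 * t ** 2 - t ** 3)


def asymptotic(n):
    """Asymptotic formula of Exercise 5, with the constants quoted there."""
    return 1.2368398446 / 1.0873780254 ** (n + 1)


if __name__ == "__main__":
    q = no_run_probabilities(200)
    t = 0.5
    partial = sum(float(q[n]) * t ** n for n in range(len(q)))
    print(partial, generating_function(t))
    for n in (5, 10, 30):
        print(n, float(q[n]), asymptotic(n))
```

The constant $1.0873780254$ is the smallest positive root of the denominator $8 - 4t - 2t^2 - t^3$, which is why it governs the decay rate. The other two roots have larger modulus, so the relative error of the asymptotic formula decays geometrically in $n$.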

Remarks and further reading

The original presentation of Darboux’s method was in Darboux [51]. The discussion
at the start of this chapter is mainly based on Carrier, Krook, and Pearson [43].
Since its inception, Darboux’s method has been used extensively to compute
asymptotics of orthogonal polynomials. An excellent source for results of this type
is Ismail [114].
For some recent applications, see Bai and Zhao [15] and Wang and Zhao [212].
For other recent developments, in addition to the work described in this chapter, see
Flajolet et al. [74], Temme [204], Boyd [30].
References

1. Abikoff, W.: The Real Analytic Theory of Teichmüller Spaces. Springer, New York, Heidel-
berg, Berlin (1980)
2. Abikoff, W.: The uniformization theorem. Am. Math. Mon. 88, 574–592 (1981)
3. Ahlfors, L.V.: On quasiconformal mapping. J. d’Analyse 3, 1–58 (1953/54)
4. Ahlfors, L.V.: Quasiconformal reflections. Acta Math. 109, 291–301 (1963)
5. Ahlfors, L.V.: Lectures on Quasiconformal Mappings, 2nd edn. American Mathematical Soci-
ety, Providence (2006)
6. Ahlfors, L.V.: Complex Analysis, 3rd edn. McGraw-Hill, New York (1978)
7. Ahlfors, L.V.: Conformal Invariants. McGraw-Hill, New York (1973)
8. Aitken, A.C.: On Bernoulli’s numerical solution of algebraic equations. Proc. R. Soc. Edinb.
46, 289–305 (1925)
9. Akhiezer, N.I.: The Classical Moment Problem and Some Related Questions in Analysis.
Reprint of the 1965 edition. SIAM, Philadelphia (2021)
10. Anosov, D.V., Bolibruch, A.A.: The Riemann-Hilbert Problem. Vieweg, Braunschweig, Wies-
baden (1994)
11. Aronszajn, N.: Theory of reproducing kernels. Trans. Am. Math. Soc. 68, 337–404 (1950)
12. Askey, R., Gasper, G.: Positive Jacobi polynomial sums II. Am. J. Math. 98, 709–737 (1976)
13. Axler, S., Bourdon, P., Ramey, W.: Harmonic Function Theory, 2nd edn. Springer, New York,
Heidelberg, Berlin (2001)
14. Baernstein, A., et al.: The Bieberbach Conjecture. American Mathematical Society, Providence (1986)
15. Bai, X., Zhao, Y.: A uniform asymptotic expansion for Jacobi polynomials via uniform treat-
ment of Darboux’s method. J. Approx. Theory 148, 1–11 (2007)
16. Baker, G.A., Jr., Graves-Morris, P.: Padé Approximants, 2nd edn. Cambridge University Press,
Cambridge (1996)
17. Baker, H.F.: Abelian Functions. Abel’s Theorem and the Allied Theory of Theta Functions.
Cambridge University Press, Cambridge (1897, reissued 1995)
18. Bazilevich, I.E.: On distortion theorems in the theory of univalent functions. Mat. Sb. 28,
147–164 (Russian) (1951)
19. Beals, R., Deift, P., Zhou, X.: The inverse scattering transform on the line. Important Devel-
opments in Soliton Theory, pp. 7–32. Springer, Berlin (1993)
20. Beals, R., Wong, R.S.C.: Special Functions: A Graduate Text. Cambridge University Press
(2010)

21. Beals, R., Wong, R.S.C.: Special Functions and Orthogonal Polynomials. Cambridge Univer-
sity Press (2016)
22. Beals, R., Wong, R.S.C.: Explorations in Complex Functions. Springer, New York, Heidel-
berg, Berlin (2020)
23. Beardon, A.F.: Iteration of Rational Functions. Springer, New York, Heidelberg, Berlin (1991)
24. Bergman, S.: The Kernel Function and Conformal Mapping, 2nd edn. American Mathematical
Society, Providence (1970)
25. Bers, L.: Quasiconformal mappings and Teichmüller’s theorem, Analytic Functions, pp. 89–
119. Princeton University Press, Princeton (1960)
26. Beurling, A., Ahlfors, L.V.: The boundary correspondence under quasiconformal mapping.
Acta Math. 96, 125–142 (1956)
27. Bieberbach, L.: Über die Koeffizienten derjenigen Potenzreihen, welche eine schlichte Abbil-
dung des Einheitskreise vermitteln. S.-B. Preuss. Akad, Wiss., 940–955 (1916)
28. Boettcher, L.E.: The principal laws of convergence of iterates and their applications to analysis
(Russian). Izv. Kazan. Fiz-Mat. Obshch. 14, 155–234 (1904)
29. Bolibrukh, A.A.: The Riemann-Hilbert problem on the complex projective line. (Russian)
Mat. Zametki 46, 118–120 (1989)
30. Boyd, J.P.: The breakdown of Darboux’s principle and natural boundaries for a function
periodised from a Ramanujan Fourier transform pair. East Asian J. Appl. Math. 9, 409–423
(2019)
31. Brezinski, C., Redivo-Zaglia, M.: Extrapolation and Rational Approximation. The Works of
the Main Contributors. Springer, New York, Heidelberg, Berlin (2020)
32. Brezinski, C., Redivo-Zaglia, M.: A survey of Shanks’ extrapolation methods and their appli-
cations. Comput. Math. Math. Phys.61 (2021)
33. Brooks, R., Matelski, J.P.: The dynamics of 2-generator subgroups of P S L(2, C). In: Riemann
Surfaces and Related Topics: Proceedings of the 1978 Stony Brook Conference. Ann. Math.
Stud. 97. Princeton University Press, Princeton (1978)
34. Brjuno, A.D.: On convergence of transforms of differential equations to the normal form.
(Russian) Dokl. Akad. Nauk SSSR 165, 987–989 (1965)
35. Bunke, U., Olbrich, M.: Selberg Theta and Eta Functions. A Differential Operator Approach.
Akademie-Verlag, Berlin (1995)
36. Burger, M., Iozzi, A., Labourie, F., Wienhard, A.: Maximal representations of surface groups: symplectic Anosov structures. Pure Appl. Math. Q. 1. Special Issue: In memory of Armand Borel, 543–590 (2005)
37. Calderón, A.P., Zygmund, A.: On the existence of certain singular integrals. Acta Math. 88, 85–139 (1952)
38. Carathéodory, C.: Untersuchungen über die konformen Abbildungen von festen und verän-
derlichen Gebieten. Math. Ann. 72, 107–144 (1912)
39. Carathéodory, C.: Zur Ränderzuordnung bei konformer Abbildung. Nachr. Königl. Ges. Wiss. Göttingen, Math.-Phys. Kl. 509–518 (1913)
40. Carleman, T.: Sur la résolution de certaines équations intégrales. Ark. Mat. Astronom. Fys.
16, 1–19 (1921)
41. Carleman, T.: La théorie des équations intégrales singuliéres et ses applications. Ann. Inst.
Henri Poincaré 1, 401–430 (1930)
42. Carleson, L., Gamelin, T.W.: Complex Dynamics. Springer, New York, Heidelberg, Berlin
(1993)
43. Carrier, G.F., Krook, M., Pearson, C.E.: Functions of a Complex Variable. McGraw-Hill, New
York (1966)
44. Christ, M.: Lectures on Singular Integral Operators. American Mathematical Society, Provi-
dence (1990)
45. Clancey, K.F., Gohberg, I.: Factorization of Matrix Functions and Singular Integral Operators. Birkhäuser, Boston (1981)
46. Cohn, H.: Conformal Mapping on Riemann Surfaces. Reprint of the 1967 edition. Dover,
New York (1980)

47. Conway, J.B.: Functions of One Complex Variable II. Springer, New York, Heidelberg, Berlin
(1995)
48. Cooper, S.: Ramanujan’s Theta Functions. Springer, Cham (2017)
49. Cremer, H.: Über die Häufigkeit der Nichtzentren. Math. Ann. 115, 573–580 (1938)
50. Cuyt, A., Petersen, V.B., Verdonk, B., Waadeland, H., Jones, W.B.: Handbook of Continued
Fractions for Special Functions. Springer, New York, Heidelberg, Berlin (2008)
51. Darboux, G.: Mémoire sur l’approximation des fonctions de très grands nombres, et sur une
classe étendue de développements en série. J. Math. Pures Appl. 4(5–56), 377–416 (1878)
52. de Branges, L.: A proof of the Bieberbach conjecture. Acta Math. 154, 137–152 (1985)
53. Deift, P.A.: Orthogonal polynomials and random matrices: a Riemann-Hilbert approach.
Courant Lecture Notes in Mathematics, 3. New York University, Courant Institute of Mathe-
matical Sciences, New York, American Mathematical Society, Providence (1999)
54. Deift, P., Zhou, X.: A steepest descent method for oscillatory Riemann-Hilbert problems.
Asymptotics for the MKdV equation. Ann. Math. 137, 295–368 (1993)
55. Dieudonné, J.: Sur les fonctions univalentes. C. R. Acad. Sci. Paris 192, 1148–1150 (1931)
56. Donaldson, S.: Riemann Surfaces. Oxford University Press, Oxford (2011)
57. Douady, A., Earle, C.: Conformally natural extensions of homeomorphisms of the circle. Acta
Math. 157, 23–48 (1986)
58. Douady, A., Hubbard, J.H: Itération des polynômes quadratiques complexes. Ann. Sci. École
Norm. Sup. 18 (1982)
59. Douady, A., Hubbard, J.H.: The dynamics of polynomial-like mappings. Ann. Sci. École
Norm. Sup. Paris 18, 287–343 (1982)
60. Dubrovin, B.A.: Theta functions and nonlinear equations. Uspekhi Mat. Nauk 36(2), 11–80
(1981), Russ. Math. Surv. 36(2), 11–92 (1981)
61. Duistermaat, J.J., Kolk, J.A.C.: Distributions. Theory and applications. Birkhäuser, Boston (2010)
62. Dumas, D., Sanders, A.: Geometry of compact complex manifolds associated to generalized
quasi-Fuchsian representations. Geom. Topol. 24, 1615–1693 (2020)
63. Dunkl, C.F.: The Poisson kernel for Heisenberg polynomials on the disk. Math. Z. 187, 527–
547 (1984)
64. Duren, P.: Univalent Functions. Springer, New York, Heidelberg, Berlin (1983)
65. Duren, P., Schuster, A.: Bergman Spaces. American Mathematical Society, Providence (2004)
66. Erdélyi, A.: Uniform asymptotic expansion of integrals. Analytic Methods in Mathematical
Physics, pp. 149–168. Gordon and Breach, New York (1970)
67. Farkas, H.M., Kra, I.: Riemann Surfaces, 2nd edn. Springer, New York, Heidelberg, Berlin
(1992)
68. Farkas, H.M., Kra, I.: Theta Constants, Riemann Surfaces, and the Modular Group. Springer,
New York, Heidelberg, Berlin (2001)
69. Fatou, P.: Sur les équations fonctionelles. Bull. Soc. Math. Fr. 47, 161–271; 48, 33–94, 208–314 (1919–20)
70. Feller, W.: An Introduction to Probability Theory and Its Applications, vol. I, 3rd edn. Wiley,
New York, London, Sydney (1968)
71. Fields, L.: A uniform treatment of Darboux's method. Arch. Rat. Mech. Anal. 27, 289–305 (1967)
72. FitzGerald, C.H.: Quadratic inequalities and coefficient estimates for Schlicht functions. Arch.
Rat. Mech. Anal. 46, 356–368 (1972)
73. FitzGerald, C.M., Pommerenke, C.: The de Branges theorem on univalent functions. Trans.
Am. Math. Soc. 290, 683–690 (1985)
74. Flajolet, P., Fusy, E., Gourdon, X., Panario, D., Pouyanne, N.: A hybrid of Darboux’s method
and singularity analysis in combinatorial asymptotics. Electron. J. Comb. 13(1), Research
Paper 103, 35 pp (2006)
75. Flajolet, P., Sedgewick, R.: Analytic Combinatorics. Cambridge University Press, Cambridge
(2009)

76. Fletcher, A., Markovic, V.: Quasiconformal Maps and Teichmüller Theory. Oxford University
Press (2007)
77. Fock, V., Goncharov, A.: Moduli spaces of local systems and higher Teichmüller theory. Publ.
Math. I. H. É. S. 103, 1–211 (2006)
78. Folland, G.B.: Real Analysis: Modern Techniques and Their Applications, 2nd edn. Wiley,
New York (1999)
79. Fricke, R., Klein, F.: Vorlesung über die Theorie der automorphen Funktionen. Teubner,
Stuttgart (1926)
80. Frobenius, G.: Über Relationen zwischen den Näherungsbruchen von Potenzreihen. J. für
Math. 90, 1–17 (1881)
81. Garabedian, P.R., Schiffer, M.: A proof of the Bieberbach conjecture for the fourth coefficient.
J. Rat. Mech. Anal. 4, 427–465 (1955)
82. Gardiner, F.: Teichmüller Theory and Quadratic Differentials. Wiley, New York (1987)
83. Gardiner, F., Lakic, N.: Quasiconformal Teichmüller Theory. American Mathematical Society,
Providence (2000)
84. Gardiner, F., Sullivan, D.: Symmetric structures on a closed curve. Am. J. Math. 114, 683–736
(1992)
85. Garnett, J., Marshall, D.: Harmonic Measure. Cambridge University Press, Cambridge (2005)
86. Gasper, G.: Orthogonality of certain functions with respect to complex valued weights. Can.
J. Math. 33, 1261–1270 (1981)
87. Gehring, F.W.: Quasiconformal mappings in space. Bull. Am. Math. Soc. 69, 146–164 (1963)
88. Gehring, F., Martin, G., Palka, B.: An Introduction to the Theory of Higher-Dimensional Quasiconformal Mappings. American Mathematical Society, Providence (2017)
89. Gelca, R.: Theta Functions and Knots. World Scientific, Hackensack, N.J. (2014)
90. Georgiev, S.G.: Theory of Distributions, 2nd edn. Springer, Cham (2021)
91. Goluzin, G.M.: On the coefficients of univalent functions. Mat. Sb. 22, 373–380 (1948)
92. Gray, J.: On the history of the Riemann mapping theorem. Rend. Circ. Mat. Palermo 2(Suppl.
34), 47–94 (1994)
93. Gronwall, T.H.: Some remarks on conformal representation. Ann. Math. 16, 72–76 (1914–
1915)
94. Grötzsch, H.: Über möglichst konforme Abbildungen von schlichten Bereichen. Ber. Verh.
Sächs. Akad. Wiss. Leipzig 84 (1932)
95. Handbook of Teichmüller Theory, A. Papadopoulos, ed. Eur. Math. Soc. Zürich, vols. I–VI
(2007–2016)
96. Hardy, G.H., Wright, E.M.: An Introduction to the Theory of Numbers, 6th edn. Oxford
University Press, Oxford (2008)
97. Hayman, W.K.: The asymptotic behavior of p-valent functions. Proc. Lond. Math. Soc. 5,
257–284 (1955)
98. Hayman, W.K.: Subharmonic Functions, vol. 2. Academic Press, London (1989)
99. Hayman, W.K., Kennedy, P.B.: Subharmonic Functions, vol. 1. Academic Press, London,
New York (1976)
100. Hedenmalm, H., Korenblum, B.: Theory of Bergman Spaces. Springer, New York, Heidelberg,
Berlin (2000)
101. Heinonen, J., Koskela, P., Shanmugalingam, N., Tyson, J.T.: Sobolev Spaces on Metric Mea-
sure Spaces. An Approach Based on Upper Gradients. Cambridge University Press, Cam-
bridge (2015)
102. Henrici, P.: Applied and Computational Complex Analysis, vol. III. Wiley, New York (1986)
103. Hensley, D.: Continued Fractions. World Scientific, Hackensack, NJ (2006)
104. Herglotz, G.: Über Potenzreihen mit positivem, reellen Teil im Einheitskreis. Ber. Verh. Sächs.
Akad. Wiss. Leipzig 63, 501–511 (1911)
105. Herman, M.: Exemples de fractions rationnelles ayant une orbite dense sur la sphère de
Riemann. Bull. Soc. Math. Fr. 112, 93–142 (1984)
106. Hilbert, D.: Mathematical problems. Bull. Am. Math. Soc. 8, 437–479 (1902). Reprinted in
Bull. Am. Math. Soc. 37, 407–436 (2000)
107. Hille, E.: Analytic Function Theory, vol. II. Ginn & Co., Boston (1962)
108. Hitchin, N.: Lie groups and Teichmüller space. Topology 31, 449–473 (1992)
109. Hörmander, L.: An Introduction to Complex Analysis in Several Variables, 3rd edn. North-
Holland, Amsterdam (1990)
110. Horowitz, D.: A further refinement for coefficient estimates of univalent functions. Proc. Am.
Math. Soc. 71, 217–221 (1978)
111. Hubbard, J.: Teichmüller Theory and Applications to Geometry, Topology and Dynamics,
vol. 1. Matrix Editions, Ithaca, N.Y. (2006)
112. Hubbard, J.: Teichmüller Theory and Applications to Geometry, Topology and Dynamics,
vol. 2. Matrix Editions, Ithaca, N.Y. (2016)
113. Hulek, K., Kahn, C., Weintraub, S.: Moduli Spaces of Abelian Surfaces: Compactification,
Degenerations, and Theta Functions. De Gruyter, Berlin (1993)
114. Ismail, M.E.H.: Classical and Quantum Orthogonal Polynomials in One Variable. Cambridge
University Press, Cambridge (2005)
115. Its, A.R.: The Riemann-Hilbert problem and integrable systems. Not. Am. Math. Soc. 50,
1389–1400 (2003)
116. Iwaniec, T., Martin, G.: Geometric Function Theory and Non-linear Analysis. Oxford Uni-
versity Press, New York (2001)
117. Jordan, C.: The Calculus of Finite Differences, 2nd edn. Chelsea, New York (1947)
118. Julia, G.: Mémoire sur l’itération des fonctions rationnelles. J. Math. Pures Appl. 8, 47–245
(1918)
119. Kemppainen, A.: Schramm-Loewner Evolution. Springer, Cham (2017)
120. Khinchin, A.Y.: Continued Fractions. University of Chicago Press (1964)
121. Khrushchev, S.: Orthogonal Polynomials and Continued Fractions. From Euler’s point of
view. Cambridge University Press, Cambridge (2008)
122. Koebe, P.: Über die Uniformisierung beliebiger analytischer Kurven. Göttinger Nachr. 191–
210 (1907)
123. Kœnigs, G.: Recherches sur les intégrales de certaines équations fonctionnelles. Ann. Sci. ÉNS
Paris 1, supplém. 1–4 (1884)
124. Köhler, G.: Eta Products and Theta Series Identities. Springer, Heidelberg (2011)
125. Krantz, S.: Geometric Analysis of the Bergman Kernel. Springer, New York, Heidelberg,
Berlin (2013)
126. Kraus, W.: Über den Zusammenhang einiger Charakteristiken eines einfach zusammenhängenden
Bereiches mit der Kreisabbildung. Mitt. Math. Sem. Giessen 21, 1–28 (1932)
127. Künzi, H.P.: Quasikonforme Abbildungen. Springer, Berlin (1960)
128. Labourie, F.: Anosov flows, surface groups and curves in projective space. Invent. Math. 165,
51–114 (2006)
129. Lawler, G.F., Limic, V.: Random Walk: A Modern Introduction. Cambridge University Press,
Cambridge (2010)
130. Leau, L.: Études sur les équations fonctionnelles à une ou plusieurs variables. Ann. Fac. Sci.
Toulouse 11, E1–E110 (1897)
131. Lehto, O.: Univalent Functions and Teichmüller Spaces. Springer, New York, Heidelberg,
Berlin (1987)
132. Lehto, O., Virtanen, K.I.: Quasiconformal Mappings in the Plane. Springer, New York, Hei-
delberg, Berlin (1973)
133. Lehner, J.: Discontinuous Groups and Automorphic Functions. American Mathematical Soci-
ety, Providence (1964)
134. Littlewood, J.E.: On inequalities in the theory of functions. Proc. Lond. Math. Soc. 23, 481–
519 (1925)
135. Liu, S.-Y., Wong, R., Zhao, Y.-Q.: Uniform treatment of Darboux’s method and the Heisenberg
polynomials. Proc. Am. Math. Soc. 141, 2683–2691 (2013)
136. Löwner, K.: Untersuchungen über schlichte konforme Abbildungen des Einheitskreises I.
Math. Ann. 89, 103–121 (1923)
137. Loewner, C.: On the conformal capacity in space. J. Math. Mech. 8, 411–414 (1959)
138. Lyubich, M.Y.: Dynamics of rational transformations (Russian). Uspehi Mat. Nauk 41, 35–95,
235, Russian Math. Surv. 41, 43–117 (1986)
139. Mandelbrot, B.: Fractal aspects of the iteration of z → λz(1 − z) for complex λ, z. Ann. N. Y.
Acad. Sci. 357, 249–259 (1980)
140. Milin, I.M.: Estimation of coefficients of univalent functions. Dokl. Akad. Nauk SSSR 160,
769–771 (1965), Soviet Math. Dokl. 6, 196–198 (1965)
141. Milnor, J.: Dynamics in One Complex Variable. Princeton University Press, Princeton and
Oxford (2006)
142. Mori, A.: On an absolute constant in the theory of quasiconformal mappings with prescribed
complex dilatation. J. Math. Soc. Jpn. 8, 156–166 (1956)
143. Morrey, C.B.: On the solution of quasilinear elliptic partial differential equations. Trans. Am.
Math. Soc. 43, 156–166 (1938)
144. Mostow, G.D.: Strong Rigidity of Locally Symmetric Spaces. Princeton University Press,
Princeton, N.J., University of Tokyo Press, Tokyo (1973)
145. Mumford, D.: Tata Lectures on Theta. I. Reprint of the 1983 edition. Birkhäuser, Boston (2007)
146. Mumford, D.: Tata Lectures on Theta. II. Reprint of the 1984 edition. Jacobian Theta Functions
and Differential Equations. Birkhäuser, Boston (2007)
147. Mumford, D.: Tata Lectures on Theta. III. Reprint of the 1991 original. Birkhäuser, Boston
(2007)
148. Murty, M.R. (ed.): Theta Functions: From the Classical to the Modern. American Mathemat-
ical Society, Providence (1993)
149. Muskhelishvili, N.I.: Singular Integral Equations. Noordhoff N.V., Groningen (1953)
150. Nag, S.: The Complex Analytic Theory of Teichmüller Spaces. Wiley, New York (1988)
151. Nehari, Z.: The Schwarzian derivative and schlicht functions. Bull. Am. Math. Soc. 55, 545–
551 (1949)
152. Nehari, Z.: Conformal Mapping. McGraw Hill, London (1952), Dover, New York (1975)
153. Neuenschwander, E.: Studies in the history of complex function theory II. Interactions among
the French school, Riemann, and Weierstrass. Bull. Am. Math. Soc. 5, 87–105 (1981)
154. Nevanlinna, R.: Über die konforme Abbildung von Sterngebieten. Översikt Finska
Vetenskaps-Soc. Förh. 63A(6), 1–21 (1920–21)
155. Newman, M.H.A.: Elements of the Topology of Plane Sets of Points. Cambridge University
Press, Cambridge (1961)
156. Ohsawa, T.: Analysis of Several Complex Variables. American Mathematical Society, Prov-
idence (2002)
157. Olver, F.W.J.: A paradox in asymptotics. SIAM J. Math. Anal. 1, 533–534 (1970)
158. Olver, F.W.J.: Asymptotics and Special Functions. Harcourt Brace Jovanovich, New York-
London (1974)
159. Olver, F.W.J.: Unsolved problems in the asymptotic estimation of special functions. In: Askey,
R. (ed.) Theory and Applications of Special Functions, pp. 99–142. Academic Press, New
York (1975)
160. Olver, F.W.J., Lozier, D.W., Boisvert, R.F., Clark, C.W.: NIST Handbook of Mathematical
Functions. Cambridge University Press, Cambridge (2010)
161. Ozawa, M.: On the Bieberbach conjecture for the sixth coefficient. Kōdai Math. Sem. Rep.
21, 97–128 (1969)
162. Padé, H.: Sur la représentation approchée d’une fonction par des fractions rationnelles, Thesis,
Ann. École Norm. (3), 9, 1–93 supplement (1892)
163. Pederson, R.N.: A proof of the Bieberbach conjecture for the sixth coefficient. Arch. Rat.
Mech. Anal. 31, 331–351 (1968-69)
164. Pederson, R.N., Schiffer, M.: A proof of the Bieberbach conjecture for the fifth coefficient.
Arch. Rat. Mech. Anal. 45, 161–193 (1972)
165. Pfeiffer, G.A.: On the conformal mapping of curvilinear angles. The functional equation
φ[f(x)] = α₁φ(x). Trans. Am. Math. Soc. 18, 185–198 (1917)
166. Perron, O.: Eine neue Behandlung der ersten Randwertaufgabe für Δu = 0. Math. Z. 18,
42–54 (1923)
167. Peyrière, J.: An Introduction to Singular Integrals. Soc. Ind. Appl. Mathematics, Philadelphia,
PA (2018)
168. Plemelj, J.: Problems in the Sense of Riemann and Klein. Interscience, New York (1964)
169. Pólya, G.: Mathematics and Plausible Reasoning, Vol. 1: Induction and Analogy in Mathe-
matics. Princeton University Press, Princeton (1954)
170. Pommerenke, C.: Univalent Functions. Vandenhoeck und Ruprecht, Göttingen (1975)
171. Pommerenke, C.: Boundary Behavior of Conformal Maps. Springer, New York, Heidelberg,
Berlin (1992)
172. Protter, M.H., Weinberger, H.F.: Maximum Principles in Differential Equations. Springer,
New York, Heidelberg, Berlin (1984)
173. Pucci, P., Serrin, J.: The Maximum Principle. Birkhäuser, Basel (2007)
174. Prym, F.E.: Zur Integration der Differentialgleichung ∂²u/∂x² + ∂²u/∂y² = 0. J. Math. 73,
340–364 (1871)
175. Riemann, B.: Theorie der Abel’schen Functionen. J. Reine Angew. Math. 54, 115–155 (1857).
Collected Works, 2nd edn, pp. 88–144. Dover, New York (1953)
176. Riesz, F.: Sur certains systèmes singuliers d’équations intégrales. Ann. Sci. Éc. Norm. Sup.
28, 33–62 (1911)
177. Robinson, R.: A new absolute geometric constant? Am. Math. Mon. 58, 442–469 (1951)
178. Robinson’s constant. Am. Math. Mon. 59, 296–297 (1952)
179. Rodin, Y.L.: Generalized Analytic Functions on Riemann Surfaces. Springer, New York,
Heidelberg, Berlin (1987)
180. Rogosinski, W.: Über positive harmonische Entwicklungen und typisch-reelle Potenzreihen.
Math. Z. 35, 93–121 (1932)
181. Rosenblum, M., Rovnyak, J.: Topics in Hardy Classes and Univalent Functions. Birkhäuser,
Basel (1994)
182. Rudin, W.: Functional Analysis, 2nd edn. McGraw-Hill, New York (1991)
183. Sauer, T.: Continued Fractions and Signal Processing. Springer, New York, Heidelberg, Berlin
(2021)
184. Schiff, J.L.: Normal Families. Springer, New York (1993)
185. Schlag, W.: A Course in Complex Analysis and Riemann Surfaces. American Mathematical
Society, Providence (2014)
186. Schmüdgen, K.: The Moment Problem. Springer, New York, Heidelberg, Berlin (2017)
187. Schramm, O.: Scaling limits of loop-erased random walks and uniform spanning trees. Isr. J.
Math. 118, 221–288 (2000)
188. Seidel, L.: Untersuchungen über die Konvergenz und Divergenz der Kettenbrüche.
Habilschrift München (1846)
189. Shanks, D.: Nonlinear transformations of divergent and slowly convergent sequences. J. Math.
Phys. 34, 1–42 (1955)
190. Siegel, C.L.: Iteration of analytic functions. Ann. Math. 43, 607–612 (1942)
191. Siegel, C.L.: Topics in Complex Function Theory, vol. I. Wiley-Interscience, New York (1969)
192. Siegel, C.L.: Topics in Complex Function Theory, vol. II. Wiley-Interscience, New York
(1971)
193. Sokhotski, Y.W.: On definite integrals and functions used in series expansions. Doctoral thesis,
Saint Petersburg (1873)
194. Stein, E.M.: Singular Integrals and Differentiability Properties of Functions. Princeton Uni-
versity Press, Princeton (1970)
195. Stein, E.M., Shakarchi, R.: Real Analysis. Measure Theory, Integration, and Hilbert Spaces.
Princeton University Press, Princeton (2005)
196. Stein, E.M., Shakarchi, R.: Functional Analysis. Introduction to Further Topics in Analysis.
Princeton University Press, Princeton (2011)
197. Steinmetz, N.: Rational Iteration. Complex Analytical Dynamical Systems. De Gruyter, Berlin
(1993)
198. Sullivan, D.: Conformal dynamical systems. In: Geometric Dynamics. Lecture Notes in
Mathematics, vol. 1007, pp. 725–752. Springer, New York, Heidelberg, Berlin (1983)
199. Szegő, G.: Über orthogonale Polynome, die zu einer gegebenen Kurve der komplexen Ebene
gehören. Math. Z. 9, 218–270 (1921)
200. Tazzioli: Schwarz’s critique and interpretation of the Riemann representation theorem. (Ital-
ian) Rend. Circ. Mat. Palermo (2) Suppl. 34, 95–132 (1994)
201. Teichmüller, O.: Untersuchungen über konforme und quasikonforme Abbildung. Dtsch. Math.
3, 621–678 (1938)
202. Teichmüller, O.: Extremale quasikonforme Abbildungen und quadratische Differentiale.
Abh. Preuss. Akad. Math.-Nat. Kl. 1939, no. 22
203. Teichmüller, O.: Bestimmung der extremalen quasikonformen Abbildungen bei geschlossenen
orientierten Riemannschen Flächen. Abh. Preuss. Akad. Math.-Nat. Kl. 1943, no. 4
204. Temme, N.M.: Asymptotic Methods for Integrals. World Scientific, Hackensack, N.J. (2015)
205. Thomas, D., Tuneski, N., Vasudevarao, A.: Univalent Functions. A Primer. De Gruyter, Berlin
(2018)
206. Titchmarsh, E.C.: The Theory of Functions, 2nd edn. Oxford University Press, Oxford (1939)
207. Titchmarsh, E.C.: Introduction to the Theory of Fourier Integrals, 3rd edn. Chelsea, New York
(1986)
208. Tyurin, A.: Quantization. Classical and Quantum Field Theory and Theta Functions. American
Mathematical Society, Providence (2003)
209. Van Assche, W.: Padé and Hermite-Padé approximation and orthogonality. Surv. Approx.
Theory 2, 61–91 (2006)
210. Vekua, I.N.: Generalized Analytic Functions. Addison-Wesley, Reading, Mass (1962)
211. Vekua, N.P.: Systems of Singular Integral Equations. P. Noordhoff Ltd., Groningen (1967)
212. Wang, H., Zhao, Y.: Uniform asymptotics and zeros of a system of orthogonal polynomials
defined via a difference equation. J. Math. Anal. Appl. 369, 453–472 (2010)
213. Weinstein, L.: The Bieberbach conjecture. Duke Math. J. 64, 61–64 (1991)
214. Weyl, H.: Die Idee der Riemannschen Fläche. Teubner, Stuttgart (1913). Translation: The
Concept of a Riemann Surface, 3rd edn. Dover, New York (2009)
215. Wienhard, A.: An invitation to higher Teichmüller theory. In: Proceedings of the International
Congress of Mathematicians-Rio de Janeiro, vol. 2, pp. 1031–1058 (2018)
216. Wilf, H.: A footnote on two proofs of the Bieberbach-de Branges theorem. Bull. Lond. Math.
Soc. 26, 61–63 (1994)
217. Wolpert, S.: Counting geodesics, Teichmüller space, and random hyperbolic surfaces. Not.
Am. Math. Soc. 68, 1890–1899 (2021)
218. Wong, R.S.C.: Asymptotic Approximations of Integrals. Academic Press, Boston (1989)
219. Wong, R.S.C., Zhao, Y.-Q.: On a uniform treatment of Darboux’s method. Constr. Approx.
21, 225–255 (2005)
220. Wong, R.S.C., Wyman, M.: The method of Darboux. J. Approx. Theory 10, 159–171 (1974)
221. Wynn, P.: On a device for computing the e_m(S_n) transformation. Math. Tables Aids Comput.
10, 91–96 (1956)
222. Yoccoz, J.-C.: Linéarisation des germes de difféomorphismes holomorphes de (C,0). C. R.
Acad. Sci. Paris Sér. I Math. 306, 55–58 (1988)
Index

A
Abel map, 274
Abelian differential, 267
  first kind, 267
  second kind, 267
  third kind, 267
absolute convergence of product, 12
Aitken’s process, 308
almost everywhere, 40
analytic continuation, 14–15
  uniqueness, 14
arc, 1
area theorem, 81
argument principle, 11
Ascoli–Arzelà theorem, 26
attracting fixed point, 56
attracting periodic orbit, 57
automorphism, 19

B
backward orbit, 47
basin of attraction, 59, 60
  immediate, 59
Beltrami coefficient, 189
Beltrami equation, 182
  normal solution, 187
Bergman kernel, 235
Bergman metric, 244
Bergman spaces, 261
Bessel equality, 37
Bessel inequality, 36
Beurling–Ahlfors extension, 180, 181
Bieberbach conjecture, 80
Bieberbach’s theorem, 82
Blaschke product, 76
boundary condition, Dirichlet, 253
branch
  of logarithm, 11
  of power, 11

C
Calderón–Zygmund inequality, 185
canonical image, of a quadrilateral, 155
canonical image, of a ring domain, 163
Casorati–Weierstrass theorem, 8
Cauchy integral formula, 4, 5
Cauchy integral theorem, 3
Cauchy transform
  one-dimensional, 319, 320
  two-dimensional, 183
Cauchy–Green formula, 4
Cauchy–Riemann equations, 3
Cauchy–Schwarz inequality, 35
Cayley transform, 20
change of contour, 7
Charlier polynomials, 387, 388
circular distortion, 174
complete orthonormal set, 37
complex curve, 125
complex logarithm, 10
conformal equivalence
  of domains, 28
conformal mapping, 28
conformal structure, 126
conformally conjugate maps, 49
conjugate maps, 47
conjugation of maps, 67, 69
continuation along a curve, 14

convergence of product, 12
convolution, 41
coordinate chart, 126
coordinate disk, 138
coordinate neighborhood, 126
coordinates, 126
cover, 128
  universal, 129
cover transformation, 131, 203
covering map, 31
critical points, of a rational map, 55
cross ratio, 19
curve, 1
cycle
  on a Riemann surface, 206
  periodic orbit, 57

D
Darboux approximant, 354
Darboux’s method, 353
deck transformation, 131, 203
degree, of a rational function, 48
diameter, 33
differential, 267
  Abelian
    first kind, 267, 271, 275
    second kind, 271
    third kind, 271
  holomorphic, 221, 267
  quadratic, 221
dilatation quotient, 161
dilatation, maximal, 159
Diophantine number, 64
Dirichlet boundary condition, 253
Dirichlet domain, 210
Dirichlet integral, 122
Dirichlet problem, 113, 253
discontinuity theorem, 324
domain, 1
  Dirichlet, 210
  simply connected, 11
Douady rabbit, 51
dual index, 38

E
elliptic curve, 264
elliptic modular function, 30
elliptic Riemann surface, 143, 203
entire function, 7
equivalence, of Riemann surfaces, 127
essential singularity, 8
Euler–Lagrange equations, 245
expansion
  Laurent, 8
  Taylor, 5
extended Liouville theorem, 7
extended Poisson formula, 117
extremal domain
  Grötzsch, 166, 167
  Mori, 167
  Teichmüller, 167

F
Fatou set, 49
fixed point
  attracting, 56
  geometrically attracting, 56
  neutral, 56
  parabolic, 62
  repelling, 56
  super-attracting, 54, 56
Fourier transform, 334
Fuchsian group, 134, 203
  first kind, 213
  second kind, 213
function field, 263
fundamental group, 128

G
gamma function, 44–45
  functional equation, 44
  Stirling’s approximation, 44
genus, 201, 263, 264
geodesic, 244
geometrically attracting fixed point, 56
Grötzsch’s extremal domain, 166
Grötzsch’s module theorem, 167
Green’s function, 138, 141, 254
Green’s identity, 146, 254
Green’s theorem, 4
Gronwall area theorem, 81

H
half plane, upper, 17
half-period, of theta function, 273
Hardy space, 238, 259
harmonic conjugate, 16, 138
harmonic function, 138, 254
harmonic measure, 139
Harnack inequalities, 117
Harnack principle, 117
Heisenberg polynomials, 384–389
Herman ring, 72
Higher Teichmüller theory, 228
Hilbert space, 35–36
Hilbert transform
  in the plane, 183
  one-dimensional, 191
holomorphic, 3
  differential, 221
holonomy, 228
homology, 266
homotopic curves, 127
Hölder
  continuity, 186, 321
Hölder continuity, 176, 185, 320
Hölder’s inequality, 38
hyperbolic metric, 23
hyperbolic Riemann surface, 141, 203
hyperelliptic curve, 264

I
immediate basin of attraction, 59
infinite products, 12–13
inner measure, 40
inner product, 35
inner product space, 35
isolated singularity, 7

J
Jacobi inversion problem, 275
Jacobi polynomials, 388
Jacobi variety, 273
Jacobian, of a curve, 273
Jordan curve, 33
Jordan domain, 33
Julia set, 49

K
K-quasiconformal map, 157
Koebe function, 79

L
Laplace equation, 253
Laplace transform, 350
Laurent expansion, 8
Leau–Fatou flower, 70
Legendre polynomials, 100
lift, 128
limit set, of Fuchsian group, 213
linear fractional transformation, 18
Liouville theorem, 7
  extended, 7
Loewner’s equation, 91
logarithm
  branch, 11
  complex, 10
  principal branch, 11

M
Maclaurin expansion, 5
Mandelbrot set, 73
mapping theorem, Riemann, 28
maximal dilatation, 159
maximum modulus principle, 6
  strong, 6
mean value property
  harmonic functions, 116
  holomorphic functions, 6
measurable, 40
measure, 40
  harmonic, 139
  inner, 40
  outer, 40
Meixner polynomials, 388
meromorphic function, 9
metric density, 24
model, of a quadrilateral, 155
modular group, 206
module
  of a quadrilateral, 155
  of a ring domain, 163
module theorem, Grötzsch’s, 167
moments, 297
monic polynomial, 297
monodromy theorem, 15
Möbius transformation, 18
multiplier, of a fixed point, 56

N
neutral fixed point, 56
neutral periodic orbit, 57
Newton’s method, 76
normal family, 25
  complete, 25
normal function (for Padé approximation), 293
normal Padé sequence, 294
normal solution, of Beltrami equation, 187

O
orbit, 47
  backward, 47
order, of pole, 8
orthogonal, 36
  projection, 259
orthonormal basis, 37
orthonormal set, 36
  complete, 37
outer measure, 40

P
Padé approximant, 284
Padé table, 284
parabolic fixed point, 62
parabolic Riemann surface, 143, 203
parallelogram identity, 37
path, 1
pathwise connected, 126
period lattice, theta function, 273
periodic orbit, 57
  attracting, 57
  neutral, 57
  repelling, 57
  super-attracting, 57
periodic point, 57
periods, of theta function, 273
Perron family, 119, 138
Perron function, 119
Perron method, 137
Poincaré density, 23
Poincaré metric, 23
Poisson formula, 115
  extended, 117
Poisson kernel, 115
pole, 8
products, infinite, 12–13
properly discontinuous group, 131

Q
quadratic differential, 221
quadrilateral, 154
quasi-isometry, 181
quasiconformal map, 157
quasiconformal reflection, 225
quasiperiod, theta function, 273
quasisymmetric, 178

R
Radon transform, 331, 337
reflection principles, 13–14
regular boundary point, 120
regular map, 160
removable singularity, 8
repelling fixed point, 56
repelling periodic orbit, 57
reproducing kernel, 238
reproducing property, 236
residue, 9
  theorem, 9
Riemann constants, 278
Riemann mapping theorem, 28
Riemann sphere, 17
Riemann surface
  elliptic, 143, 203
  hyperbolic, 141, 203
  parabolic, 143, 203
Riemann–Hilbert
  factorization problem, 319
  problem I, 319
  problem II, 348
Riemann–Stieltjes integral, 299
Riesz–Thorin convexity theorem, 194
ring domain, 163
  module, 163
Robertson conjecture, 95
Rouché’s theorem, 10

S
schlicht function, 79, 81
Schwarz reflection principle, 123
Schwarz’s lemma, 20
Schwarzian, Schwarzian derivative, 219
Schramm–Loewner evolution, 94
self-similarity, 55
Shanks transformation, 308
sides, of a quadrilateral, 154
Siegel disk, 67
simple pole, 8
simply connected domain, 11
singularity
  essential, 8
  isolated, 7
  removable, 8
SLE, 94
slit domain, 87
slit mapping, 86, 87
Sokhotski–Plemelj formula, 320
standard coordinate, 138
stereographic projection, 21
Stieltjes function, 303–307
Stieltjes transform, 296, 300–303
  modified, 303
Stirling numbers, 388
Stirling’s approximation, 44
stochastic Loewner evolution, 94
strong maximum modulus principle, 6
subharmonic function, 118, 138
super-attracting fixed point, 54, 56
super-attracting periodic orbit, 57
support, 38

T
Taylor expansion, 5
Teichmüller distance, 214
Teichmüller metric, 214
Teichmüller space, 214
  universal, 202, 217
Teichmüller theory, higher, 228
Teichmüller’s theorem, 227
theta function, 272
  with characteristics, 273
triply punctured sphere, 30

U
uniform convergence of quadrilaterals, 157
uniformization theorem, 126, 127, 137, 141
unit circle, 17
unit disk, 17
unitary map, 258
univalent function, 79, 81
upper half-plane, 17

W
Weierstrass approximation theorems, 116
Weierstrass polynomial approximation theorem, 115
Weyl’s lemma, 42
