Classical Mechanics Systems of Particles and Hamiltonian Dynamics, Second Edition, Walter Greiner 9783642034688

Classical Mechanics
Second Edition
Greiner: Quantum Mechanics, An Introduction, 4th Edition
Greiner: Quantum Mechanics, Special Chapters
Greiner, Müller: Quantum Mechanics, Symmetries, 2nd Edition
Greiner: Relativistic Quantum Mechanics, Wave Equations, 3rd Edition
Greiner, Reinhardt: Field Quantization
Greiner, Reinhardt: Quantum Electrodynamics, 4th Edition
Greiner, Schramm, Stein: Quantum Chromodynamics, 3rd Edition
Greiner, Maruhn: Nuclear Models
Greiner, Müller: Gauge Theory of Weak Interactions, 4th Edition
Greiner: Classical Mechanics, Systems of Particles and Hamiltonian Dynamics, 2nd Edition
Greiner: Classical Mechanics, Point Particles and Relativity
Greiner: Classical Electrodynamics
Greiner, Neise, Stocker: Thermodynamics and Statistical Mechanics
Walter Greiner
Classical Mechanics
Systems of Particles and
Hamiltonian Dynamics
With a Foreword by
D.A. Bromley
Second Edition
With 280 Figures,
and 167 Worked Examples and Exercises
Prof. Dr. Walter Greiner
Frankfurt Institute
for Advanced Studies (FIAS)
Johann Wolfgang Goethe-Universität
Ruth-Moufang-Str. 1
60438 Frankfurt am Main
Germany
Translated from the German Mechanik: Teil 2, by Walter Greiner, published by Verlag Harri Deutsch, Thun,
Frankfurt am Main, Germany, © 1989
ISBN 978-3-642-03433-6 e-ISBN 978-3-642-03434-3

DOI 10.1007/978-3-642-03434-3
Springer Heidelberg Dordrecht London New York
Library of Congress Control Number: 2009940125
© Springer-Verlag Berlin Heidelberg 1992, 2010

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting,
reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or
parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in
its current version, and permission for use must always be obtained from Springer. Violations are liable to
prosecution under the German Copyright Law.
The use of general descriptive names, registered names, trademarks, etc. in this publication does not
imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
Cover design: eStudio Calamar S.L., Spain
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)

Foreword
More than a generation of German-speaking students around the world have worked
their way to an understanding and appreciation of the power and beauty of modern
theoretical physics (with mathematics, the most fundamental of sciences) using Walter
Greiner’s textbooks as their guide.
The idea of developing a coherent, complete presentation of an entire field of
science in a series of closely related textbooks is not a new one. Many older physicists
remember with real pleasure their sense of adventure and discovery as they worked
their way through the classic series by Sommerfeld, by Planck, and by Landau and
Lifshitz. From the students’ viewpoint, there are a great many obvious advantages to
be gained through the use of consistent notation, logical ordering of topics, and
coherence of presentation; beyond this, the complete coverage of the science provides a
unique opportunity for the author to convey his personal enthusiasm and love for his
subject.
These volumes on classical physics, finally available in English, complement
Greiner’s texts on quantum physics, most of which have been available to
English-speaking audiences for some time. The complete set of books will thus provide a
coherent view of physics that includes, in classical physics, thermodynamics and
statistical mechanics, classical dynamics, electromagnetism, and general relativity; and
in quantum physics, quantum mechanics, symmetries, relativistic quantum mechanics,
quantum electro- and chromodynamics, and the gauge theory of weak interactions.
What makes Greiner’s volumes of particular value to the student and professor alike
is their completeness. Greiner avoids the all too common “it follows that . . . ,” which
conceals several pages of mathematical manipulation and confounds the student. He
does not hesitate to include experimental data to illuminate or illustrate a theoretical
point, and these data, like the theoretical content, have been kept up to date and
topical through frequent revision and expansion of the lecture notes upon which these
volumes are based.
Moreover, Greiner greatly increases the value of his presentation by including
something like one hundred completely worked examples in each volume. Nothing is
of greater importance to the student than seeing, in detail, how the theoretical concepts
and tools under study are applied to actual problems of interest to working physicists.
And, finally, Greiner adds brief biographical sketches to each chapter covering the
people responsible for the development of the theoretical ideas and/or the
experimental data presented. It was Auguste Comte (1798–1857) in his Positive Philosophy who
noted, “To understand a science it is necessary to know its history.” This is all too
often forgotten in modern physics teaching, and the bridges that Greiner builds to the
pioneering figures of our science upon whose work we build are welcome ones.
Greiner’s lectures, which underlie these volumes, are internationally noted for their
clarity, for their completeness, and for the effort that he has devoted to making physics
an integral whole. His enthusiasm for his science is contagious and shines through
almost every page.
These volumes represent only a part of a unique and Herculean effort to make all
of theoretical physics accessible to the interested student. Beyond that, they are of
enormous value to the professional physicist and to all others working with quantum
phenomena. Again and again, the reader will find that, after dipping into a particular
volume to review a specific topic, he or she will end up browsing, caught up by often
fascinating new insights and developments with which he or she had not previously
been familiar.
Having used a number of Greiner’s volumes in their original German in my
teaching and research at Yale, I welcome these new and revised English translations and
would recommend them enthusiastically to anyone searching for a coherent overview
of physics.
D. Allan Bromley
Henry Ford II Professor of Physics
Yale University
New Haven, Connecticut, USA
Preface to the Second Edition
I am pleased to note that our text Classical Mechanics: Systems of Particles and
Hamiltonian Dynamics has found many friends among physics students and
researchers, and that a second edition has become necessary. We have taken this
opportunity to make several amendments and improvements to the text. A number of
misprints and minor errors have been corrected and explanatory remarks have been
supplied at various places.
New examples have been added in Chap. 19 on canonical transformations,
discussing the harmonic oscillator (19.3), the damped harmonic oscillator (19.4),
infinitesimal time steps as canonical transformations (19.5), the general form of Liouville’s
theorem (19.6), the canonical invariance of the Poisson brackets (19.7), Poisson’s
theorem (19.8), and the invariants of the plane Kepler system (19.9).
It may come as a surprise that even for a time-honored subject such as
Classical Mechanics in the formulation of Lagrange and Hamilton, new aspects may
emerge. But this has indeed been the case, resulting in new chapters on the “Extended
Hamilton–Lagrange formalism” (Chap. 21) and the “Extended Hamilton–Jacobi
equation” (Chap. 22). These topics are discussed here for the first time in a textbook, and
we hope that they will help to convince students that even Classical Mechanics can
still be an active area of ongoing research.
I would especially like to thank Dr. Jürgen Struckmeier for his help in constructing
the new chapters on the Extended Hamilton–Lagrange–Jacobi formalism, and
Dr. Stefan Scherer for his help in the preparation of this new edition. Finally, I appreciate the
agreeable collaboration with the team at Springer-Verlag, Heidelberg.
Walter Greiner
Frankfurt am Main
September 2009
Preface to the First Edition
Theoretical physics has become a many-faceted science. For the young student, it
is difficult enough to cope with the overwhelming amount of new material that has
to be learned, let alone obtain an overview of the entire field, which ranges from
mechanics through electrodynamics, quantum mechanics, field theory, nuclear and
heavy-ion science, statistical mechanics, thermodynamics, and solid-state theory to
elementary-particle physics; and this knowledge should be acquired in just eight to ten
semesters, during which, in addition, a diploma or master’s thesis has to be worked on
or examinations prepared for. All this can be achieved only if the university teachers
help to introduce the student to the new disciplines as early on as possible, in order to
create interest and excitement that in turn set free essential new energy.
At the Johann Wolfgang Goethe University in Frankfurt am Main, we therefore
confront the student with theoretical physics immediately, in the first semester.
Theoretical Mechanics I and II, Electrodynamics, and Quantum Mechanics I: An
Introduction are the courses during the first two years. These lectures are supplemented
with many mathematical explanations and much support material. After the fourth
semester of studies, graduate work begins, and Quantum Mechanics II: Symmetries,
Statistical Mechanics and Thermodynamics, Relativistic Quantum Mechanics,
Quantum Electrodynamics, Gauge Theory of Weak Interactions, and Quantum
Chromodynamics are obligatory. Apart from these, a number of supplementary courses on
special topics are offered, such as Hydrodynamics, Classical Field Theory, Special
and General Relativity, Many-Body Theories, Nuclear Models, Models of Elementary
Particles, and Solid-State Theory.
This volume of lectures, Classical Mechanics: Systems of Particles and
Hamiltonian Dynamics, deals with the second and more advanced part of the important field
of classical mechanics. We have tried to present the subject in a manner that is both
interesting to the student and easily accessible. The main text is therefore
accompanied by many exercises and examples that have been worked out in great detail. This
should make the book useful also for students wishing to study the subject on their
own.
Beginning the education in theoretical physics at the first university semester, and
not as dictated by tradition after the first one and a half years in the third or fourth
semester, has brought along quite a few changes as compared to the traditional courses
in that discipline. Especially necessary is a greater amalgamation between the
actual physical problems and the necessary mathematics. Therefore, we treat in the first
semester vector algebra and analysis, the solution of ordinary, linear differential
equations, Newton’s mechanics of a mass point, and the mathematically simple mechanics
of special relativity.
Many explicitly worked-out examples and exercises illustrate the new concepts
and methods and deepen the interrelationship between physics and mathematics. As a
matter of fact, the first-semester course in theoretical mechanics is a precursor to
theoretical physics. This changes significantly the content of the lectures of the second
semester addressed here. Theoretical mechanics is extended to systems of mass points,
vibrating strings and membranes, rigid bodies, the spinning top, and the discussion of
formal (analytical) aspects of mechanics, that is, Lagrange’s and Hamilton’s formalisms
and the Hamilton–Jacobi formulation of mechanics. Considered from the mathematical
point of view, the new features are partial differential equations, Fourier expansion,
and eigenvalue problems. These new tools are explained and exercised in many
physical examples. In lecturing practice, the material presented is deepened
in a three-hour-per-week theoretica, that is, group exercises where eight to ten
students solve the given exercises under the guidance of a tutor.
We have added some chapters on modern developments of nonlinear mechanics
(dynamical systems, stability of time-dependent orbits, bifurcations, Lyapunov
exponents and chaos, systems with chaotic dynamics), being well aware that all this
material cannot be taught in a one-semester course. It is meant to stimulate interest in that
field and to encourage the students’ further (private) studies.
The last chapter is devoted to the history of mechanics. It also contains remarks on
the lives and work of outstanding philosophers and scientists who contributed
importantly to the development of science in general and mechanics in particular.
Biographical and historical footnotes anchor the scientific development within the
general context of scientific progress and evolution. In this context, I thank the
publishers Harri Deutsch and F.A. Brockhaus (Brockhaus Enzyklopädie, F.A. Brockhaus,
Wiesbaden, marked by [BR]) for giving permission to extract the biographical data
of physicists and mathematicians from their publications.
We should also mention that in preparing some early sections and exercises of our
lectures we relied on the book Theory and Problems of Theoretical Mechanics, by
Murray R. Spiegel, McGraw-Hill, New York, 1967.
Over the years, we enjoyed the help of several former students and
collaborators, in particular, H. Angermüller, P. Bergmann, H. Betz, W. Betz, G. Binnig,
J. Briechle, M. Bundschuh, W. Caspar, C. v. Charewski, J. v. Czarnecki,
R. Fickler, R. Fiedler, B. Fricke, C. Greiner, M. Greiner, W. Grosch, R. Heuer,
E. Hoffmann, L. Kohaupt, N. Krug, P. Kurowski, H. Leber, H.J. Lustig, A. Mahn, B. Moreth,
R. Mörschel, B. Müller, H. Müller, H. Peitz, G. Plunien, J. Rafelski, J. Reinhardt,
M. Rufa, H. Schaller, D. Schebesta, H.J. Scheefer, H. Schwerin, M. Seiwert, G. Soff,
M. Soffel, E. Stein, K.E. Stiebing, E. Stämmler, H. Stock, J. Wagner, and
R. Zimmermann. They all made their way in science and society, and now work as
professors at universities, as leaders in industry, and in other places. We particularly
acknowledge the recent help of Dr. Sven Soff during the preparation of the English
manuscript. The figures were drawn by Mrs. A. Steidl.
The English manuscript was copy-edited by Heather Jones, and the production of
the book was supervised by Francine McNeill of Springer-Verlag New York, Inc.
Walter Greiner
Johann Wolfgang Goethe-Universität
Frankfurt am Main, Germany
Contents
Part I Newtonian Mechanics in Moving Coordinate Systems
1 Newton’s Equations in a Rotating Coordinate System
1.1 Introduction of the Operator D
1.2 Formulation of Newton’s Equation in the Rotating Coordinate System
1.3 Newton’s Equations in Systems with Arbitrary Relative Motion
2 Free Fall on the Rotating Earth
2.1 Perturbation Calculation
2.2 Method of Successive Approximation
2.3 Exact Solution
3 Foucault’s Pendulum
3.1 Solution of the Differential Equations
3.2 Discussion of the Solution
Part II Mechanics of Particle Systems
4 Degrees of Freedom
4.1 Degrees of Freedom of a Rigid Body
5 Center of Gravity
6 Mechanical Fundamental Quantities of Systems of Mass Points
6.1 Linear Momentum of the Many-Body System
6.2 Angular Momentum of the Many-Body System
6.3 Energy Law of the Many-Body System
6.4 Transformation to Center-of-Mass Coordinates
6.5 Transformation of the Kinetic Energy
Part III Vibrating Systems
7 Vibrations of Coupled Mass Points
7.1 The Vibrating Chain
8 The Vibrating String
8.1 Solution of the Wave Equation
8.2 Normal Vibrations
9 Fourier Series
10 The Vibrating Membrane
10.1 Derivation of the Differential Equation
10.2 Solution of the Differential Equation
10.3 Inclusion of the Boundary Conditions
10.4 Eigenfrequencies
10.5 Degeneracy
10.6 Nodal Lines
10.7 General Solution
10.8 Superposition of Node Line Figures
10.9 The Circular Membrane
10.10 Solution of Bessel’s Differential Equation
Part IV Mechanics of Rigid Bodies
11 Rotation About a Fixed Axis
11.1 Moment of Inertia
11.2 The Physical Pendulum
12 Rotation About a Point
12.1 Tensor of Inertia
12.2 Kinetic Energy of a Rotating Rigid Body
12.3 The Principal Axes of Inertia
12.4 Existence and Orthogonality of the Principal Axes
12.5 Transformation of the Tensor of Inertia
12.6 Tensor of Inertia in the System of Principal Axes
12.7 Ellipsoid of Inertia
13 Theory of the Top
13.1 The Free Top
13.2 Geometrical Theory of the Top
13.3 Analytical Theory of the Free Top
13.4 The Heavy Symmetric Top: Elementary Considerations
13.5 Further Applications of the Top
13.6 The Euler Angles
13.7 Motion of the Heavy Symmetric Top
Part V Lagrange Equations
14 Generalized Coordinates
14.1 Quantities of Mechanics in Generalized Coordinates
15 D’Alembert Principle and Derivation of the Lagrange Equations
15.1 Virtual Displacements
16 Lagrange Equation for Nonholonomic Constraints
17 Special Problems
17.1 Velocity-Dependent Potentials
17.2 Nonconservative Forces and Dissipation Function (Friction Function)
17.3 Nonholonomic Systems and Lagrange Multipliers
Part VI Hamiltonian Theory
18 Hamilton’s Equations
18.1 The Hamilton Principle
18.2 General Discussion of Variational Principles
18.3 Phase Space and Liouville’s Theorem
18.4 The Principle of Stochastic Cooling
19 Canonical Transformations
20 Hamilton–Jacobi Theory
20.1 Visual Interpretation of the Action Function S
20.2 Transition to Quantum Mechanics
21 Extended Hamilton–Lagrange Formalism
21.1 Extended Set of Euler–Lagrange Equations
21.2 Extended Set of Canonical Equations
21.3 Extended Canonical Transformations
22 Extended Hamilton–Jacobi Equation
Part VII Nonlinear Dynamics
23 Dynamical Systems
23.1 Dissipative Systems: Contraction of the Phase-Space Volume
23.2 Attractors
23.3 Equilibrium Solutions
23.4 Limit Cycles
24 Stability of Time-Dependent Paths
24.1 Periodic Solutions
24.2 Discretization and Poincaré Cuts
25 Bifurcations
25.1 Static Bifurcations
25.2 Bifurcations of Time-Dependent Solutions
26 Lyapunov Exponents and Chaos
26.1 One-Dimensional Systems
26.2 Multidimensional Systems
26.3 Stretching and Folding in Phase Space
26.4 Fractal Geometry
27 Systems with Chaotic Dynamics
27.1 Dynamics of Discrete Systems
27.2 One-Dimensional Mappings
Part VIII On the History of Mechanics
28 Emergence of Occidental Physics in the Seventeenth Century
Notes
Recommendations for Further Reading on Theoretical Mechanics
Index
Contents of Examples and Exercises
1.1 Angular Velocity Vector ω
1.2 Position Vector r
2.1 Eastward Deflection of a Falling Body
2.2 Eastward Deflection of a Thrown Body
2.3 Superelevation of a River Bank
2.4 Difference of Sea Depth at the Pole and Equator
3.1 Chain Fixed to a Rotating Bar
3.2 Pendulum in a Moving Train
3.3 Formation of Cyclones
3.4 Movable Mass in a Rotating Tube
5.1 Center of Gravity for a System of Three Mass Points
5.2 Center of Gravity of a Pyramid
5.3 Center of Gravity of a Semicircle
5.4 Center of Gravity of a Circular Cone
5.5 Momentary Center and Pole Path
5.6 Scattering in a Central Field
5.7 Rutherford Scattering Cross Section
5.8 Scattering of a Particle by a Spherical Square Well Potential
5.9 Scattering of Two Atoms
6.1 Conservation of the Total Angular Momentum of a Many-Body System: Flattening of a Galaxy
6.2 Conservation of Angular Momentum of a Many-Body Problem: The Pirouette
6.3 Reduced Mass
6.4 Movement of Two Bodies Under the Action of Mutual Gravitation
6.5 Atwood’s Fall Machine
6.6 Our Solar System in the Milky Way
7.1 Two Equal Masses Coupled by Two Equal Springs
7.2 Coupled Pendulums
7.3 Eigenfrequencies of the Vibrating Chain
7.4 Vibration of Two Coupled Mass Points, Two Dimensional
7.5 Three Masses on a String
7.6 Eigenvibrations of a Three-Atom Molecule
8.1 Kinetic and Potential Energy of a Vibrating String
8.2 Three Different Masses Equidistantly Fixed on a String
8.3 Complicated Coupled Vibrational System
8.4 The Cardano Formula
9.1 Inclusion of the Initial Conditions for the Vibrating String by Means of the Fourier Expansion
9.2 Fourier Series of the Sawtooth Function
9.3 Vibrating String with a Given Velocity Distribution
9.4 Fourier Series for a Step Function
9.5 On the Unambiguousness of the Tautochrone Problem
10.1 The Longitudinal Chain: Poincaré Recurrence Time
10.2 Orthogonality of the Eigenmodes
11.1 Moment of Inertia of a Homogeneous Circular Cylinder
11.2 Moment of Inertia of a Thin Rectangular Disk
11.3 Moment of Inertia of a Sphere
11.4 Moment of Inertia of a Cube
11.5 Vibrations of a Suspended Cube
11.6 Roll off of a Cylinder: Rolling Pendulum
11.7 Moments of Inertia of Several Rigid Bodies About Selected Axes
11.8 Cube Tilts over the Edge of a Table
11.9 Hockey Puck Hits a Bar
11.10 Cue Pushes a Billiard Ball
11.11 Motion with Constraints
11.12 Bar Vibrates on Springs
12.1 Tensor of Inertia of a Square Covered with Mass
12.2 Transformation of the Tensor of Inertia of a Square Covered with Mass
12.3 Rolling Circular Top
12.4 Ellipsoid of Inertia of a Quadratic Disk
12.5 Symmetry Axis as a Principal Axis
12.6 Tensor of Inertia and Ellipsoid of Inertia of a System of Three Masses
12.7 Friction Forces and Acceleration of a Car
13.1 Nutation of the Earth
13.2 Ellipsoid of Inertia of a Regular Polyhedron
13.3 Rotating Ellipsoid
13.4 Torque of a Rotating Plate
13.5 Rotation of a Vibrating Neutron Star
13.6 Pivot Forces of a Rotating Circular Disk
13.7 Torque on an Elliptic Disk
13.8 Gyrocompass
13.9 Tidal Forces, and Lunar and Solar Eclipses: The Saros Cycle
13.10 The Sleeping Top
13.11 The Heavy Symmetric Top
13.12 Stable and Unstable Rotations of the Asymmetric Top
14.1 Small Sphere Rolls on a Large Sphere
14.2 Body Glides on an Inclined Plane
14.3 Wheel Rolls on a Plane
14.4 Generalized Coordinates
14.5 Cylinder Rolls on an Inclined Plane
14.6 Classification of Constraints
15.1 Two Masses on Concentric Rollers
15.2 Two Masses Connected by a Rope on an Inclined Plane
15.3 Equilibrium Condition of a Bascule Bridge
15.4 Two Blocks Connected by a Bar
15.5 Ignorable Coordinate
15.6 Sphere in a Rotating Tube
15.7 Upright Pendulum
15.8 Stable Equilibrium Position of an Upright Pendulum
15.9 Vibration Frequencies of a Three-Atom Symmetric Molecule
15.10 Normal Frequencies of a Triangular Molecule
15.11 Normal Frequencies of an Asymmetric Linear Molecule
15.12 Double Pendulum
15.13 Mass Point on a Cycloid Trajectory
15.14 String Pendulum
15.15 Coupled Mass Points on a Circle
15.16 Lagrangian of the Asymmetric Top
16.1 Cylinder Rolls down an Inclined Plane
16.2 Particle Moves in a Paraboloid
16.3 Three Masses Coupled by Rods Glide in a Circular Tire
17.1 Charged Particle in an Electromagnetic Field
17.2 Motion of a Projectile in Air
17.3 Circular Disk Rolls on a Plane
17.4 Centrifugal Force Governor
18.1 Central Motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
18.2 The Pendulum in the Newtonian, Lagrangian, and Hamiltonian Theories 332
18.3 Hamiltonian and Canonical Equations of Motion . . . . . . . . . . . . 334
18.4 A Variational Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . 338
18.5 Catenary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
18.6 Brachistochrone: Construction of an Emergency Chute . . . . . . . . . 344
18.7 Derivation of the Hamiltonian Equations . . . . . . . . . . . . . . . . . 349
18.8 Phase Diagram of a Plane Pendulum . . . . . . . . . . . . . . . . . . . 351
18.9 Phase-Space Density for Particles in the Gravitational Field . . . . . . . 354
18.10 Cooling of a Particle Beam . . . . . . . . . . . . . . . . . . . . . . . . 359
19.1 Example of a Canonical Transformation . . . . . . . . . . . . . . . . . 370
19.2 Point Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . 370
19.3 Harmonic Oscillator . . . . . . . . . . . . . . . . . . . . . . . . . . . . 370
19.4 Damped Harmonic Oscillator . . . . . . . . . . . . . . . . . . . . . . . 373
19.5 Infinitesimal Time Step . . . . . . . . . . . . . . . . . . . . . . . . . . 375
19.6 General Form of Liouville’s Theorem . . . . . . . . . . . . . . . . . . 376
19.7 Canonical Invariance of the Poisson Brackets . . . . . . . . . . . . . . 377
19.8 Poisson’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379
19.9 Invariants of the Plane Kepler System . . . . . . . . . . . . . . . . . . 380
20.1 The Hamilton–Jacobi Differential Equation . . . . . . . . . . . . . . . 385
20.2 Angle Variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 389
20.3 Solution of the Kepler Problem by the Hamilton–Jacobi Method . . . . 389
20.4 Formulation of the Hamilton–Jacobi Differential Equation for Particle
Motion in a Potential with Azimuthal Symmetry . . . . . . . . . . . . . 392
20.5 Solution of the Hamilton–Jacobi Differential Equation of Exercise 20.4 393
20.6 Formulation of the Hamilton–Jacobi Differential Equation for the Slant
Throw . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395
20.7 Illustration of the Action Waves . . . . . . . . . . . . . . . . . . . . . 398
20.8 Periodic and Multiply Periodic Motions . . . . . . . . . . . . . . . . . 400
20.9 The Bohr–Sommerfeld Hydrogen Atom . . . . . . . . . . . . . . . . . 408
20.10 On Poisson Brackets . . . . . . . . . . . . . . . . . . . . . . . . . . . 410
20.11 Total Time Derivative of an Arbitrary Function Depending on q, p, and t 412
21.1 Extended Lagrangian for a Relativistic Free Particle . . . . . . . . . . . 417
21.2 Extended Lagrangian for a Relativistic Particle in an External
Electromagnetic Field . . . . . . . . . . . . . . . . . . . . . . . . . . . 418
21.3 Trivial Extended Hamiltonian . . . . . . . . . . . . . . . . . . . . . . . 421
21.4 Hamiltonian of a Free Relativistic Particle . . . . . . . . . . . . . . . . 422
21.5 Hamiltonian of a Relativistic Particle in a Potential V (q, t) . . . . . . . 423
21.6 Relativistic “Harmonic Oscillator” . . . . . . . . . . . . . . . . . . . . 425
21.7 Extended Hamiltonian for a Relativistic Particle in an External
Electromagnetic Field . . . . . . . . . . . . . . . . . . . . . . . . . . . 426
21.8 Identical Canonical Transformation . . . . . . . . . . . . . . . . . . . . 432
21.9 Identical Time Transformation, Conventional Canonical Transformations 432
21.10 Extended Point Transformations . . . . . . . . . . . . . . . . . . . . . 433
21.11 Time-Energy Transformations . . . . . . . . . . . . . . . . . . . . . . 433
21.12 Liouville’s Theorem in the Extended Hamilton Description . . . . . . . 434
21.13 Extended Poisson Brackets . . . . . . . . . . . . . . . . . . . . . . . . 434
21.14 Canonical Quantization in the Extended Hamilton Formalism . . . . . . 435
21.15 Regularization of the Kepler System . . . . . . . . . . . . . . . . . . . 437
21.16 Time-Dependent Damped Harmonic Oscillator . . . . . . . . . . . . . 440
21.17 Galilei Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . 444
21.18 Lorentz Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . 445
21.19 Infinitesimal Canonical Transformations, Generalized Noether Theorem 447
21.20 Infinitesimal Point Transformations, Conventional Noether Theorem . . 450
21.21 Runge–Lenz Vector of the Plane Kepler System as a Generalized
Noether Invariant . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451
22.1 Time Dependent Harmonic Oscillator . . . . . . . . . . . . . . . . . . 456
23.1 Linear Stability in Two Dimensions . . . . . . . . . . . . . . . . . . . 471
23.2 The Nonlinear Oscillator with Friction . . . . . . . . . . . . . . . . . . 473
23.3 The van der Pol Oscillator with Weak Nonlinearity . . . . . . . . . . . 480
23.4 Relaxation Vibrations . . . . . . . . . . . . . . . . . . . . . . . . . . . 482
24.1 Floquet’s Theory of Stability . . . . . . . . . . . . . . . . . . . . . . . 489
24.2 Stability of a Limit Cycle . . . . . . . . . . . . . . . . . . . . . . . . . 492
26.1 The Baker Transformation . . . . . . . . . . . . . . . . . . . . . . . . 515
27.1 The Logistic Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . 519
27.2 Logistic Mapping and the Bernoulli Shift . . . . . . . . . . . . . . . . 527
27.3 The Periodically Kicked Rotator . . . . . . . . . . . . . . . . . . . . . 530
27.4 The Periodically Driven Pendulum . . . . . . . . . . . . . . . . . . . . 537
27.5 Chaos in Celestial Mechanics: The Staggering of Hyperion . . . . . . . 544
Part I
Newtonian Mechanics in Moving Coordinate Systems
1 Newton’s Equations in a Rotating Coordinate System
In classical mechanics, Newton’s laws hold in all systems moving uniformly relative
to each other (i.e., inertial systems) if they hold in one system. However, this is no
longer valid if a system undergoes accelerations. The new relations are obtained by
establishing the equations of motion in a fixed system and transforming them into the
accelerated system.
We first consider the rotation of a coordinate system $(x', y', z')$ about the origin
of the inertial system $(x, y, z)$, where the two coordinate origins coincide. The inertial
system is denoted by L (“laboratory system”) and the rotating system by M (“moving
system”).
Fig. 1.1. Relative position of the coordinate systems $x, y, z$ and $x', y', z'$
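In the primed system the vector $\mathbf{A}(t) = A'_1\mathbf{e}'_1 + A'_2\mathbf{e}'_2 + A'_3\mathbf{e}'_3$ changes with time.
For an observer resting in this system this can be represented as follows:
\[
\left.\frac{d\mathbf{A}}{dt}\right|_M = \frac{dA'_1}{dt}\mathbf{e}'_1 + \frac{dA'_2}{dt}\mathbf{e}'_2 + \frac{dA'_3}{dt}\mathbf{e}'_3 .
\]
The index M means that the derivative is being calculated from the moving system.
In the inertial system $(x, y, z)$, $\mathbf{A}$ is also time dependent. Because of the rotation of the
primed system the unit vectors $\mathbf{e}'_1, \mathbf{e}'_2, \mathbf{e}'_3$ also vary with time; i.e., when differentiating
the vector $\mathbf{A}$ from the inertial system, the unit vectors must be differentiated too:
\[
\begin{aligned}
\left.\frac{d\mathbf{A}}{dt}\right|_L
&= \frac{dA'_1}{dt}\mathbf{e}'_1 + \frac{dA'_2}{dt}\mathbf{e}'_2 + \frac{dA'_3}{dt}\mathbf{e}'_3
 + A'_1\dot{\mathbf{e}}'_1 + A'_2\dot{\mathbf{e}}'_2 + A'_3\dot{\mathbf{e}}'_3 \\
&= \left.\frac{d\mathbf{A}}{dt}\right|_M + A'_1\dot{\mathbf{e}}'_1 + A'_2\dot{\mathbf{e}}'_2 + A'_3\dot{\mathbf{e}}'_3 .
\end{aligned}
\]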
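Generally the following holds: $(d/dt)(\mathbf{e}'_\gamma \cdot \mathbf{e}'_\gamma) = \mathbf{e}'_\gamma \cdot \dot{\mathbf{e}}'_\gamma + \dot{\mathbf{e}}'_\gamma \cdot \mathbf{e}'_\gamma = (d/dt)(1) = 0$.
Hence, $\mathbf{e}'_\gamma \cdot \dot{\mathbf{e}}'_\gamma = 0$. The derivative $\dot{\mathbf{e}}'_\gamma$ of a unit vector is always orthogonal to the vector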
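itself. Therefore the derivative of a unit vector can be written as a linear combination
of the two other unit vectors:
\[
\begin{aligned}
\dot{\mathbf{e}}'_1 &= a_1\mathbf{e}'_2 + a_2\mathbf{e}'_3 ,\\
\dot{\mathbf{e}}'_2 &= a_3\mathbf{e}'_1 + a_4\mathbf{e}'_3 ,\\
\dot{\mathbf{e}}'_3 &= a_5\mathbf{e}'_1 + a_6\mathbf{e}'_2 .
\end{aligned}
\]
Only 3 of these 6 coefficients are independent. To show this, we first differentiate
$\mathbf{e}'_1 \cdot \mathbf{e}'_2 = 0$, and obtain
\[
\dot{\mathbf{e}}'_1 \cdot \mathbf{e}'_2 = -\dot{\mathbf{e}}'_2 \cdot \mathbf{e}'_1 .
\]
Multiplying $\dot{\mathbf{e}}'_1 = a_1\mathbf{e}'_2 + a_2\mathbf{e}'_3$ by $\mathbf{e}'_2$ and correspondingly $\dot{\mathbf{e}}'_2 = a_3\mathbf{e}'_1 + a_4\mathbf{e}'_3$ by $\mathbf{e}'_1$,
one obtains
\[
\mathbf{e}'_2 \cdot \dot{\mathbf{e}}'_1 = a_1 \quad\text{and}\quad \mathbf{e}'_1 \cdot \dot{\mathbf{e}}'_2 = a_3 ,
\]
and hence $a_3 = -a_1$. Analogously one finds $a_6 = -a_4$ and $a_5 = -a_2$.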

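The derivative of the vector $\mathbf{A}$ in the inertial system can now be written as follows:
\[
\begin{aligned}
\left.\frac{d\mathbf{A}}{dt}\right|_L
&= \left.\frac{d\mathbf{A}}{dt}\right|_M + A'_1(a_1\mathbf{e}'_2 + a_2\mathbf{e}'_3) + A'_2(-a_1\mathbf{e}'_1 + a_4\mathbf{e}'_3) + A'_3(-a_2\mathbf{e}'_1 - a_4\mathbf{e}'_2) \\
&= \left.\frac{d\mathbf{A}}{dt}\right|_M + \mathbf{e}'_1(-a_1A'_2 - a_2A'_3) + \mathbf{e}'_2(a_1A'_1 - a_4A'_3) + \mathbf{e}'_3(a_2A'_1 + a_4A'_2).
\end{aligned}
\]
From the evaluation rule for the vector product,
\[
\mathbf{C}\times\mathbf{A} =
\begin{vmatrix}
\mathbf{e}'_1 & \mathbf{e}'_2 & \mathbf{e}'_3 \\
C_1 & C_2 & C_3 \\
A'_1 & A'_2 & A'_3
\end{vmatrix}
= \mathbf{e}'_1(C_2A'_3 - C_3A'_2) - \mathbf{e}'_2(C_1A'_3 - C_3A'_1) + \mathbf{e}'_3(C_1A'_2 - C_2A'_1),
\]
it follows by setting $\mathbf{C} = (a_4, -a_2, a_1)$ that
\[
\left.\frac{d\mathbf{A}}{dt}\right|_L = \left.\frac{d\mathbf{A}}{dt}\right|_M + \mathbf{C}\times\mathbf{A}.
\]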
We still have to show the physical meaning of the vector C. For this purpose we con-
sider the special case dA/dt|M = 0; i.e., the derivative of the vector A in the moving
system vanishes. A moves (rotates) with the moving system; it is tightly “mounted”
in the system. Let ϕ be the angle between the axis of rotation (in our special case the
z-axis) and A. The component parallel to the angular velocity ω is not changed by the
rotation.
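The change of $\mathbf{A}$ in the laboratory system is then given by
\[
dA = \omega\,dt\,A\sin\varphi \quad\text{or}\quad \left.\frac{dA}{dt}\right|_L = \omega A\sin\varphi .
\]
This can also be written as
\[
\left.\frac{d\mathbf{A}}{dt}\right|_L = \boldsymbol{\omega}\times\mathbf{A}.
\]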
Fig. 1.2. Change of an arbitrary vector A tightly fixed to a rotating system
The orientation of (ω × A) dt also coincides with dA (see Fig. 1.2). Since the
(fixed) vector A can be chosen arbitrarily, the vector C must be identical with the
angular velocity ω of the rotating system M. By insertion we obtain

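\[
\left.\frac{d\mathbf{A}}{dt}\right|_L = \left.\frac{d\mathbf{A}}{dt}\right|_M + \boldsymbol{\omega}\times\mathbf{A}. \tag{1.1}
\]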
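Relation (1.1) can be checked numerically for the special case discussed above, a vector that is rigidly mounted in the moving system, so that its derivative in M vanishes. The following is a minimal sketch, assuming a rotation about the e3-axis with constant rate; the helper names and the numerical values are illustrative choices, not part of the text.

```python
import math


def cross(a, b):
    """Vector product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def body_fixed_vector(t, omega, a_body):
    """Lab-frame components at time t of a vector fixed in a frame
    that rotates about e3 with the constant rate omega."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    x, y, z = a_body
    return (c * x - s * y, s * x + c * y, z)


def lab_derivative(t, omega, a_body, dt=1e-6):
    """Central-difference estimate of dA/dt in the laboratory system."""
    plus = body_fixed_vector(t + dt, omega, a_body)
    minus = body_fixed_vector(t - dt, omega, a_body)
    return tuple((p - m) / (2 * dt) for p, m in zip(plus, minus))


OMEGA = 0.7                 # assumed rotation rate in rad/s
A_BODY = (1.0, 2.0, 3.0)    # assumed components in the moving system
T0 = 0.4                    # assumed time of comparison in s

numeric = lab_derivative(T0, OMEGA, A_BODY)
analytic = cross((0.0, 0.0, OMEGA), body_fixed_vector(T0, OMEGA, A_BODY))
max_error = max(abs(n - a) for n, a in zip(numeric, analytic))
print(max_error)
```

Because dA/dt in M is zero here, the finite-difference derivative in L should agree with ω × A up to discretization and rounding errors.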
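This can also be seen as follows (see Fig. 1.3): If the rotational axis of the primed
system coincides during a time interval $dt$ with one of the coordinate axes of the
nonprimed system, e.g., $\boldsymbol{\omega} = \dot\varphi\,\mathbf{e}_3$, then
\[
\dot{\mathbf{e}}'_1 = \dot\varphi\,\mathbf{e}'_2 \quad\text{and}\quad \dot{\mathbf{e}}'_2 = -\dot\varphi\,\mathbf{e}'_1 ,
\]
i.e.,
\[
a_1 = \dot\varphi, \quad a_2 = a_4 = 0, \quad\text{and hence}\quad \mathbf{C} = \dot\varphi\,\mathbf{e}_3 = \boldsymbol{\omega}.
\]
Fig. 1.3. $|d\mathbf{e}'_1| = |d\mathbf{e}'_2| = \dot\varphi \cdot dt$

In the general case $\boldsymbol{\omega} = \omega_1\mathbf{e}_1 + \omega_2\mathbf{e}_2 + \omega_3\mathbf{e}_3$, one decomposes $\boldsymbol{\omega} = \sum_i \boldsymbol{\omega}_i$ with
$\boldsymbol{\omega}_i = \omega_i\mathbf{e}_i$, and by the preceding consideration one finds
\[
\mathbf{C}_i = \boldsymbol{\omega}_i ; \quad\text{i.e.,}\quad \mathbf{C} = \sum_i \mathbf{C}_i = \sum_i \boldsymbol{\omega}_i = \boldsymbol{\omega}.
\]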

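1.1 Introduction of the Operator $\hat{D}$
To shorten the expression $\partial F(x, \ldots, t)/\partial t = \partial F/\partial t$, we introduce the operator
$\hat{D} = \partial/\partial t$. The inertial system and the accelerated system will be distinguished by the
indices L and M, so that
\[
\hat{D}_L = \left.\frac{\partial}{\partial t}\right|_L \quad\text{and}\quad \hat{D}_M = \left.\frac{\partial}{\partial t}\right|_M .
\]
The equation
\[
\left.\frac{d\mathbf{A}}{dt}\right|_L = \left.\frac{d\mathbf{A}}{dt}\right|_M + \boldsymbol{\omega}\times\mathbf{A}
\]
then simplifies to
\[
\hat{D}_L\mathbf{A} = \hat{D}_M\mathbf{A} + \boldsymbol{\omega}\times\mathbf{A}.
\]
If the vector $\mathbf{A}$ is omitted, one obtains the operator equation
\[
\hat{D}_L = \hat{D}_M + \boldsymbol{\omega}\times ,
\]
which can operate on arbitrary vectors.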
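EXAMPLE
1.1 Angular Velocity Vector ω
\[
\left.\frac{d\boldsymbol{\omega}}{dt}\right|_L = \left.\frac{d\boldsymbol{\omega}}{dt}\right|_M + \boldsymbol{\omega}\times\boldsymbol{\omega}.
\]
Since $\boldsymbol{\omega}\times\boldsymbol{\omega} = 0$, it follows that
\[
\left.\frac{d\boldsymbol{\omega}}{dt}\right|_L = \left.\frac{d\boldsymbol{\omega}}{dt}\right|_M .
\]
The two derivatives are evidently identical for all vectors that are parallel to the
axis of rotation, since then the vector product with $\boldsymbol{\omega}$ vanishes.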
EXAMPLE
1.2 Position Vector r
\[
\left.\frac{d\mathbf{r}}{dt}\right|_L = \left.\frac{d\mathbf{r}}{dt}\right|_M + \boldsymbol{\omega}\times\mathbf{r};
\]
in operator notation this becomes
\[
\hat{D}_L\mathbf{r} = \hat{D}_M\mathbf{r} + \boldsymbol{\omega}\times\mathbf{r},
\]
where $(d\mathbf{r}/dt)|_M$ is called the virtual velocity and $(d\mathbf{r}/dt)|_M + \boldsymbol{\omega}\times\mathbf{r}$ the true velocity.
The term $\boldsymbol{\omega}\times\mathbf{r}$ is called the rotational velocity.
1.2 Formulation of Newton’s Equation in the Rotating Coordinate System
Newton’s law mr̈ = F holds only in the inertial system. In accelerated systems, there
appear additional terms. First we consider again a pure rotation.
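For the acceleration we have
\[
\begin{aligned}
\ddot{\mathbf{r}}_L = \frac{d}{dt}(\dot{\mathbf{r}})_L = \hat{D}_L(\hat{D}_L\mathbf{r})
&= (\hat{D}_M + \boldsymbol{\omega}\times)(\hat{D}_M\mathbf{r} + \boldsymbol{\omega}\times\mathbf{r}) \\
&= \hat{D}_M^2\mathbf{r} + \hat{D}_M(\boldsymbol{\omega}\times\mathbf{r}) + \boldsymbol{\omega}\times\hat{D}_M\mathbf{r} + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}) \\
&= \hat{D}_M^2\mathbf{r} + (\hat{D}_M\boldsymbol{\omega})\times\mathbf{r} + 2\boldsymbol{\omega}\times\hat{D}_M\mathbf{r} + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}).
\end{aligned}
\]
We replace the operator by the differential quotient:
\[
\left.\frac{d^2\mathbf{r}}{dt^2}\right|_L = \left.\frac{d^2\mathbf{r}}{dt^2}\right|_M + \left.\frac{d\boldsymbol{\omega}}{dt}\right|_M\times\mathbf{r} + 2\boldsymbol{\omega}\times\left.\frac{d\mathbf{r}}{dt}\right|_M + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}). \tag{1.2}
\]
The expression $(d\boldsymbol{\omega}/dt)|_M\times\mathbf{r}$ is called the linear acceleration, $2\boldsymbol{\omega}\times(d\mathbf{r}/dt)|_M$
the Coriolis acceleration, and $\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r})$ the centripetal acceleration.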
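Multiplication by the mass $m$ yields the force $\mathbf{F}$:
\[
m\left.\frac{d^2\mathbf{r}}{dt^2}\right|_M + m\left.\frac{d\boldsymbol{\omega}}{dt}\right|_M\times\mathbf{r} + 2m\boldsymbol{\omega}\times\left.\frac{d\mathbf{r}}{dt}\right|_M + m\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}) = \mathbf{F}.
\]
The basic equation of mechanics in the rotating coordinate system therefore reads
(with the index M being omitted):
\[
m\frac{d^2\mathbf{r}}{dt^2} = \mathbf{F} - m\frac{d\boldsymbol{\omega}}{dt}\times\mathbf{r} - 2m\boldsymbol{\omega}\times\mathbf{v} - m\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}). \tag{1.3}
\]
The additional terms on the right-hand side of (1.3) are virtual forces of a dynamical
nature, but actually they are due to the acceleration term. For experiments on the earth
the additional terms can often be neglected, since the angular velocity of the earth
$\omega = 2\pi/T$ ($T = 24$ h) is only $7.27 \cdot 10^{-5}\,\mathrm{s}^{-1}$.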
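To get a feeling for the size of the additional terms in (1.3), they can be evaluated per unit mass for a given ω, r, and v. The sketch below assumes T = 24 h as in the text and, purely for illustration, a point at rest on the equator of a sphere of radius 6370 km; the function and variable names are hypothetical.

```python
import math


def cross(a, b):
    """Vector product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def virtual_accelerations(omega, omega_dot, r, v):
    """Additional terms on the right-hand side of (1.3), divided by m.

    Returns the terms named in the text as linear, Coriolis, and
    centripetal contributions, each with the sign it carries in (1.3)."""
    linear = tuple(-c for c in cross(omega_dot, r))
    coriolis = tuple(-2.0 * c for c in cross(omega, v))
    centripetal = tuple(-c for c in cross(omega, cross(omega, r)))
    return linear, coriolis, centripetal


# Angular velocity of the earth for T = 24 h.
OMEGA_EARTH = 2.0 * math.pi / (24 * 3600)

# Assumed illustration: a point at rest on the equator, rotation axis along e3.
EARTH_RADIUS = 6.37e6  # m
linear, coriolis, centripetal = virtual_accelerations(
    omega=(0.0, 0.0, OMEGA_EARTH),
    omega_dot=(0.0, 0.0, 0.0),
    r=(EARTH_RADIUS, 0.0, 0.0),
    v=(0.0, 0.0, 0.0),
)
print(OMEGA_EARTH, centripetal)
```

For this choice only the last term survives; it points radially outward and is small compared with the gravitational acceleration.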
1.3 Newton’s Equations in Systems with Arbitrary Relative Motion
We now drop the condition that the origins of the two coordinate systems coincide.
The general motion of a coordinate system is composed of a rotation of the system
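and a translation of the origin. If $\mathbf{R}$ points to the origin of the primed system, then the
position vector in the nonprimed system is $\mathbf{r} = \mathbf{R} + \mathbf{r}'$.
Fig. 1.4. Relative position of the coordinate systems $x, y, z$ and $x', y', z'$
For the velocity we have $\dot{\mathbf{r}} = \dot{\mathbf{R}} + \dot{\mathbf{r}}'$, and in the inertial system we have as before
\[
m\left.\frac{d^2\mathbf{r}}{dt^2}\right|_L = \mathbf{F}|_L = \mathbf{F}.
\]
By inserting $\mathbf{r}$ and differentiating, we obtain
\[
m\left.\frac{d^2\mathbf{r}'}{dt^2}\right|_L + m\left.\frac{d^2\mathbf{R}}{dt^2}\right|_L = \mathbf{F}.
\]
The transition to the accelerated system is performed as above (see (1.3)), but here
we still have the additional term $m\ddot{\mathbf{R}}$:
\[
m\left.\frac{d^2\mathbf{r}'}{dt^2}\right|_M = \mathbf{F} - m\left.\frac{d^2\mathbf{R}}{dt^2}\right|_L - m\left.\frac{d\boldsymbol{\omega}}{dt}\right|_M\times\mathbf{r}' - 2m\boldsymbol{\omega}\times\mathbf{v}_M - m\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}'). \tag{1.4}
\]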
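2 Free Fall on the Rotating Earth
On the earth, the previously derived form of the basic equation of mechanics holds if
we neglect the rotation about the sun and therefore consider a coordinate system at the
earth center as an inertial system:
\[
m\ddot{\mathbf{r}}'|_M = \mathbf{F} - m\ddot{\mathbf{R}}|_L - m\dot{\boldsymbol{\omega}}\times\mathbf{r}'|_M - 2m\boldsymbol{\omega}\times\dot{\mathbf{r}}'|_M - m\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}'). \tag{2.1}
\]
The rotational velocity $\boldsymbol{\omega}$ of the earth about its axis can be considered constant in time;
therefore, $m\dot{\boldsymbol{\omega}}\times\mathbf{r}' = 0$.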
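The motion of the point $\mathbf{R}$, i.e., the motion of the coordinate origin of the system
$(x', y', z')$, still has to be recalculated in the moving system. According to (1.2), we
have
\[
\ddot{\mathbf{R}}|_L = \ddot{\mathbf{R}}|_M + \dot{\boldsymbol{\omega}}|_M\times\mathbf{R} + 2\boldsymbol{\omega}\times\dot{\mathbf{R}}|_M + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{R}).
\]
Since $\mathbf{R}$ as seen from the moving system is a time-independent quantity and since
$\boldsymbol{\omega}$ is constant, this equation finally reads
\[
\ddot{\mathbf{R}}|_L = \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{R}).
\]
This is the centripetal acceleration due to the earth’s rotation that acts on a body
moving on the earth’s surface. For the force equation (2.1) one gets
\[
m\ddot{\mathbf{r}}' = \mathbf{F} - m\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{R}) - 2m\boldsymbol{\omega}\times\dot{\mathbf{r}}' - m\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}').
\]
Hence, in free fall on the earth, contrary to the inertial system, there appear virtual
forces that deflect the body in the $x'$- and $y'$-directions.
If only gravity acts, the force $\mathbf{F}$ in the inertial system is $\mathbf{F} = -\gamma Mm\,\mathbf{r}/r^3$. By insertion
we obtain
\[
m\ddot{\mathbf{r}}' = -\gamma\frac{Mm}{r^3}\mathbf{r} - m\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{R}) - 2m\boldsymbol{\omega}\times\dot{\mathbf{r}}' - m\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}').
\]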
We now introduce the experimentally determined value for the gravitational acceleration $\mathbf{g}$:
\[
\mathbf{g} = -\gamma\frac{M}{R^3}\mathbf{R} - \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{R}).
\]
Here we have inserted in the gravitational acceleration $-\gamma M\mathbf{r}/r^3$ the radius
$\mathbf{r} = \mathbf{R} + \mathbf{r}'$ and kept the approximation $\mathbf{r} \approx \mathbf{R}$, which is reasonable near the earth’s
surface. The second term is the centripetal acceleration due to the earth’s rotation,
which leads to a decrease of the gravitational acceleration (as a function of the
geographical latitude). The reduction is included in the experimental value for $\mathbf{g}$. We thus
obtain
\[
m\ddot{\mathbf{r}}' = m\mathbf{g} - 2m\boldsymbol{\omega}\times\dot{\mathbf{r}}' - m\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}').
\]
Fig. 2.1. Octant of the globe: Position of the various coordinate systems
In the vicinity of the earth’s surface ($r' \ll R$) the last term can be neglected, since
$\omega^2$ enters and $|\boldsymbol{\omega}|$ is small compared to 1/s. Thus the equation simplifies to
\[
\ddot{\mathbf{r}}' = \mathbf{g} - 2(\boldsymbol{\omega}\times\dot{\mathbf{r}}') \quad\text{or}\quad \ddot{\mathbf{r}}' = -g\mathbf{e}'_3 - 2(\boldsymbol{\omega}\times\dot{\mathbf{r}}'). \tag{2.2}
\]
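The vector equation is solved by decomposing it into its components. First one
evaluates the vector product. From Fig. 2.1 one obtains, with $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3$ the unit
vectors of the inertial system and $\mathbf{e}'_1, \mathbf{e}'_2, \mathbf{e}'_3$ the unit vectors of the moving system, the
following relation:
\[
\begin{aligned}
\mathbf{e}_3 &= (\mathbf{e}_3\cdot\mathbf{e}'_1)\mathbf{e}'_1 + (\mathbf{e}_3\cdot\mathbf{e}'_2)\mathbf{e}'_2 + (\mathbf{e}_3\cdot\mathbf{e}'_3)\mathbf{e}'_3 \\
&= (-\sin\lambda)\mathbf{e}'_1 + 0\,\mathbf{e}'_2 + (\cos\lambda)\mathbf{e}'_3 .
\end{aligned}
\]
Because $\boldsymbol{\omega} = \omega\mathbf{e}_3$, one gets the component representation of $\boldsymbol{\omega}$ in the moving system:
\[
\boldsymbol{\omega} = -\omega\sin\lambda\,\mathbf{e}'_1 + \omega\cos\lambda\,\mathbf{e}'_3 .
\]
Then for the vector product we get
\[
\boldsymbol{\omega}\times\dot{\mathbf{r}}' = (-\omega\dot{y}'\cos\lambda)\mathbf{e}'_1 + (\dot{z}'\omega\sin\lambda + \dot{x}'\omega\cos\lambda)\mathbf{e}'_2 - (\omega\dot{y}'\sin\lambda)\mathbf{e}'_3 .
\]
The vector equation (2.2) can now be decomposed into the following three component
equations:
\[
\begin{aligned}
\ddot{x}' &= 2\dot{y}'\omega\cos\lambda, \\
\ddot{y}' &= -2\omega(\dot{z}'\sin\lambda + \dot{x}'\cos\lambda), \\
\ddot{z}' &= -g + 2\omega\dot{y}'\sin\lambda .
\end{aligned} \tag{2.3}
\]
This is a system of three coupled differential equations with $\omega$ as the coupling parameter.
For $\omega = 0$, we get the free fall in an inertial system. The solution of such a system
can also be obtained in an analytical way. It is, however, useful to learn various
approximation methods from this example. We will first outline these methods and then
work out the exact analytical solution and compare it with the approximations.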
In the present case, the perturbation calculation and the method of successive ap-
proximation offer themselves as approximations. Both of these methods will be pre-
sented here. The primes on the coordinates will be omitted below.
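Before turning to the approximation methods, the coupled system (2.3) can also be integrated directly on a computer. The following is a minimal sketch using a classical fourth-order Runge-Kutta step (a technique chosen here for illustration, not one used in the text); the release height, the angle λ, and all function names are assumptions of the example.

```python
import math


def rhs(state, omega, lam, g):
    """First-order form of (2.3); state = (x, y, z, vx, vy, vz)."""
    x, y, z, vx, vy, vz = state
    ax = 2.0 * omega * vy * math.cos(lam)
    ay = -2.0 * omega * (vz * math.sin(lam) + vx * math.cos(lam))
    az = -g + 2.0 * omega * vy * math.sin(lam)
    return (vx, vy, vz, ax, ay, az)


def rk4_step(state, dt, omega, lam, g):
    """One classical Runge-Kutta step of size dt."""
    def shifted(k, factor):
        return tuple(s + factor * ki for s, ki in zip(state, k))

    k1 = rhs(state, omega, lam, g)
    k2 = rhs(shifted(k1, dt / 2), omega, lam, g)
    k3 = rhs(shifted(k2, dt / 2), omega, lam, g)
    k4 = rhs(shifted(k3, dt), omega, lam, g)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def fall(h, omega, lam, g=9.81, steps=2000):
    """Release a body at rest from height h and integrate for the time
    sqrt(2h/g), i.e., the fall time in the absence of rotation."""
    dt = math.sqrt(2.0 * h / g) / steps
    state = (0.0, 0.0, h, 0.0, 0.0, 0.0)
    for _ in range(steps):
        state = rk4_step(state, dt, omega, lam, g)
    return state


no_rotation = fall(100.0, 0.0, math.pi / 2)
with_rotation = fall(100.0, 7.27e-5, math.pi / 2)
print(no_rotation[:3])
print(with_rotation[:3])
```

With ω = 0 the body drops straight down; with the earth's angular velocity switched on, a small positive y-displacement appears, which the following sections compute analytically.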
2.1 Perturbation Calculation
Here one starts from a system that is mathematically more tractable, and then one
accounts for the forces due to the perturbation which are small compared to the re-
maining forces of the system.
We first integrate (2.3):
ẋ = 2ωy cos λ + c1 ,
ẏ = −2ω(x cos λ + z sin λ) + c2 , (2.4)
ż = −gt + 2ωy sin λ + c3 .
In free fall on the earth the body is released from the height h at time t = 0; i.e., for
our problem, the initial conditions are
z(0) = h, ż(0) = 0,
y(0) = 0, ẏ(0) = 0,
x(0) = 0, ẋ(0) = 0.
From this we get the integration constants
c1 = 0, c2 = 2ωh sin λ, c3 = 0,
and obtain
ẋ = 2ωy cos λ,
ẏ = −2ω(x cos λ + (z − h) sin λ), (2.5)
ż = −gt + 2ωy sin λ.
The terms proportional to $\omega$ are small compared to the term $gt$. They represent the
perturbation. The deviation $y$ from the origin of the moving system is a function of
$\omega$ and $t$; i.e., in the first approximation the term $y_1(\omega, t) \sim \omega$ appears. Inserting this
into the first differential equation, we find an expression involving $\omega^2$. For consistency
in $\omega$ we can neglect all terms with $\omega^2$; i.e., we obtain to first order in $\omega$
\[
\dot{x}(t) = 0, \qquad \dot{z}(t) = -gt,
\]
and after integration with the initial conditions we get
\[
x(t) = 0, \qquad z(t) = -\frac{g}{2}t^2 + h.
\]
Because $x(t) = 0$, in this approximation the term $2\omega x\cos\lambda$ drops out from the
second differential equation (2.5); there remains
\[
\dot{y} = -2\omega(z - h)\sin\lambda .
\]
Inserting $z$ leads to
\[
\dot{y} = -2\omega\left(h - \frac{1}{2}gt^2 - h\right)\sin\lambda = \omega g t^2\sin\lambda .
\]
Integration with the initial condition yields
\[
y = \frac{\omega g\sin\lambda}{3}t^3 .
\]
The solutions of the system of differential equations in the approximation $\omega^n \approx 0$
with $n \ge 2$ (i.e., consistent up to linear terms in $\omega$) thus read
\[
x(t) = 0, \qquad y(t) = \frac{\omega g\sin\lambda}{3}t^3, \qquad z(t) = h - \frac{g}{2}t^2 .
\]
The fall time $T$ is obtained from $z(t = T) = 0$:
\[
T^2 = \frac{2h}{g}.
\]
From this one finds the eastward deflection ($\mathbf{e}'_2$ points east) as a function of the fall
height:
\[
y(t = T) = y(h) = \frac{\omega g\sin\lambda}{3}\,\frac{2h}{g}\sqrt{\frac{2h}{g}} = \frac{2\omega h\sin\lambda}{3}\sqrt{\frac{2h}{g}} .
\]
2.2 Method of Successive Approximation

If one starts from the known system (2.5) of coupled differential equations, these
equations can be transformed by integration to integral equations:
\[
\begin{aligned}
x(t) &= 2\omega\cos\lambda\int_0^t y(u)\,du + c_1, \\
y(t) &= 2\omega h t\sin\lambda - 2\omega\cos\lambda\int_0^t x(u)\,du - 2\omega\sin\lambda\int_0^t z(u)\,du + c_2, \\
z(t) &= -\frac{1}{2}gt^2 + 2\omega\sin\lambda\int_0^t y(u)\,du + c_3 .
\end{aligned}
\]
Taking into account the initial conditions
x(0) = 0, ẋ(0) = 0,
y(0) = 0, ẏ(0) = 0,
z(0) = h, ż(0) = 0,
the integration constants are
c1 = 0, c2 = 0, c3 = h.
The iteration method is based on replacing the functions x(u), y(u), z(u) under
the integral sign by appropriate initial functions. In the first approximation, one de-
termines the functions x(t), y(t), z(t) and then inserts them as x(u), y(u), z(u) on the
right-hand side to get the second approximation. In general there results a successive
approximation to the exact solution if ω · t = 2πt/T (T = 24 hours) is sufficiently
small.
By setting $x(u)$, $y(u)$, $z(u)$ to zero in the above example in the zero-order
approximation, one obtains in the first approximation
\[
x^{(1)}(t) = 0, \qquad y^{(1)}(t) = 2\omega h t\sin\lambda, \qquad z^{(1)}(t) = h - \frac{g}{2}t^2 .
\]
To check the consistency of these solutions up to terms linear in $\omega$, we have to check
only the second approximation. If there is consistency, there must not appear further terms
that involve $\omega$ linearly:
\[
\begin{aligned}
x^{(2)}(t) &= 2\omega\cos\lambda\int_0^t y^{(1)}(u)\,du = 2\omega\cos\lambda\int_0^t 2\omega h(\sin\lambda)u\,du \\
&= 4\omega^2 h\cos\lambda\sin\lambda\,\frac{t^2}{2} = f(\omega^2) \approx 0.
\end{aligned}
\]
Like $x^{(1)}(t)$, $z^{(1)}(t)$ is consistent to first order in $\omega$:
\[
\begin{aligned}
z^{(2)}(t) &= h - \frac{1}{2}gt^2 + 2\omega\sin\lambda\int_0^t y^{(1)}(u)\,du \\
&= h - \frac{g}{2}t^2 + 2\omega\sin\lambda\int_0^t 2\omega h(\sin\lambda)u\,du \\
&= h - \frac{g}{2}t^2 + i(\omega^2).
\end{aligned}
\]
On the contrary, $y^{(1)}(t)$ is not consistent in $\omega$, since
\[
\begin{aligned}
y^{(2)}(t) &= 2\omega h t\sin\lambda - 2\omega\cos\lambda\int_0^t x^{(1)}(u)\,du - 2\omega\sin\lambda\int_0^t z^{(1)}(u)\,du \\
&= 2\omega h\sin\lambda\cdot t - 2\omega h\sin\lambda\cdot t + g\omega(\sin\lambda)\frac{t^3}{3} \\
&= g\omega(\sin\lambda)\frac{t^3}{3} \ne 2\omega h(\sin\lambda)t + k(\omega^2).
\end{aligned}
\]
We see that in this second step the terms linear in $\omega$ once again changed greatly. The
term $2\omega h t\sin\lambda$ obtained in the first iteration step cancels completely and is finally
replaced by $g\omega(\sin\lambda)t^3/3$. A check of $y^{(3)}(t)$ shows that $y^{(2)}(t)$ is consistent up to
first order in $\omega$.
Just as in the perturbation method discussed above, we get up to first order in $\omega$ the
solution
\[
x(t) = 0, \qquad y(t) = \frac{g\omega\sin\lambda}{3}t^3, \qquad z(t) = h - \frac{g}{2}t^2 .
\]
We have of course noted long ago that the method of successive approximation (itera-
tion) is equivalent to the perturbation calculation and basically represents its concep-
tually clean formulation.
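The iteration can be carried out mechanically, because every iterate is a polynomial in t. The sketch below represents a polynomial by its list of coefficients and applies the three integral equations, with c1 = c2 = 0 and c3 = h, twice starting from zero. The helper names and the numerical values are assumptions of this illustration.

```python
import math


def integrate(poly):
    """Integral from 0 to t of a polynomial [c0, c1, ...] in t."""
    return [0.0] + [c / (n + 1) for n, c in enumerate(poly)]


def add(*polys):
    """Sum of polynomials given as coefficient lists."""
    size = max(len(p) for p in polys)
    return [sum(p[n] if n < len(p) else 0.0 for p in polys)
            for n in range(size)]


def scale(factor, poly):
    """Multiply every coefficient by a constant factor."""
    return [factor * c for c in poly]


def iteration_step(x, y, z, omega, lam, h, g):
    """Insert the previous iterate into the three integral equations."""
    s, c = math.sin(lam), math.cos(lam)
    x_new = scale(2 * omega * c, integrate(y))
    y_new = add([0.0, 2 * omega * h * s],
                scale(-2 * omega * c, integrate(x)),
                scale(-2 * omega * s, integrate(z)))
    z_new = add([h, 0.0, -g / 2], scale(2 * omega * s, integrate(y)))
    return x_new, y_new, z_new


OMEGA, LAM, H, G = 7.27e-5, math.pi / 2, 400.0, 9.81  # assumed values

x, y, z = [0.0], [0.0], [0.0]   # zero-order approximation
iterates = []
for _ in range(2):
    x, y, z = iteration_step(x, y, z, OMEGA, LAM, H, G)
    iterates.append((x, y, z))

y1, y2 = iterates[0][1], iterates[1][1]
print(y1)   # coefficients of y^(1)
print(y2)   # coefficients of y^(2)
```

In the second iterate the coefficient of t in y vanishes and a t³ term with coefficient ωg sin λ/3 appears, which mirrors the cancellation described above.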
2.3 Exact Solution

The equations of motion (2.3) can also be solved exactly. For that purpose, we start
again from
ẍ = 2ω cos λẏ, (2.3a)
ÿ = −2ω(sin λż + cos λẋ), (2.3b)
z̈ = −g + 2ω sin λẏ. (2.3c)
By integrating (2.3a) to (2.3c) with the above initial conditions, one gets
ẋ = 2ω cos λy, (2.5a)
ẏ = −2ω(sin λz + cos λx) + 2ω sin λh, (2.5b)
ż = −gt + 2ω sin λy. (2.5c)
Insertion of (2.5a) and (2.5c) into (2.3b) yields
\[
\ddot{y} + 4\omega^2 y = 2\omega g\sin\lambda\,t \equiv ct. \tag{2.6}
\]
The general solution of (2.6) is the sum of the general solution of the homogeneous
equation and one particular solution of the inhomogeneous equation, i.e.,
\[
y = \frac{c}{4\omega^2}t + A\sin 2\omega t + B\cos 2\omega t .
\]
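The initial conditions at the time $t = 0$ are $x = y = 0$, $z = h$, and $\dot{x} = \dot{y} = \dot{z} = 0$. It
follows that $B = 0$ and $2\omega A = -c/4\omega^2$, i.e., $A = -c/8\omega^3$ and therefore
\[
y = \frac{c}{4\omega^2}t - \frac{c}{8\omega^3}\sin 2\omega t = \frac{c}{4\omega^2}\left(t - \frac{\sin 2\omega t}{2\omega}\right),
\]
i.e.,
\[
y = \frac{g\sin\lambda}{2\omega}\left(t - \frac{\sin 2\omega t}{2\omega}\right). \tag{2.7}
\]
Insertion of (2.7) into (2.5a) yields
\[
\dot{x} = g\sin\lambda\cos\lambda\left(t - \frac{\sin 2\omega t}{2\omega}\right).
\]
From the initial conditions, it follows that
\[
x = g\sin\lambda\cos\lambda\left(\frac{t^2}{2} - \frac{1 - \cos 2\omega t}{4\omega^2}\right). \tag{2.8}
\]
Equation (2.7) inserted into (2.5c) yields
\[
\begin{aligned}
\dot{z} &= -gt + 2\omega\sin\lambda\,\frac{g\sin\lambda}{2\omega}\left(t - \frac{\sin 2\omega t}{2\omega}\right), \\
\dot{z} &= -gt + g\sin^2\lambda\left(t - \frac{\sin 2\omega t}{2\omega}\right),
\end{aligned}
\]
and integration with the initial conditions yields
\[
z = -\frac{g}{2}t^2 + g\sin^2\lambda\left(\frac{t^2}{2} - \frac{1 - \cos 2\omega t}{4\omega^2}\right) + h. \tag{2.9}
\]
Summarizing, one finally has
\[
\begin{aligned}
x &= g\sin\lambda\cos\lambda\left(\frac{t^2}{2} - \frac{1 - \cos 2\omega t}{4\omega^2}\right), \\
y &= \frac{g\sin\lambda}{2\omega}\left(t - \frac{\sin 2\omega t}{2\omega}\right), \\
z &= h - \frac{g}{2}t^2 + g\sin^2\lambda\left(\frac{t^2}{2} - \frac{1 - \cos 2\omega t}{4\omega^2}\right).
\end{aligned} \tag{2.10}
\]
Since $\omega t = 2\pi\cdot\text{fall time}/1\ \text{day}$, i.e., very small ($\omega t \ll 1$), one can expand (2.10):
\[
\begin{aligned}
x &= \frac{gt^2}{6}\sin\lambda\cos\lambda\,(\omega t)^2, \\
y &= \frac{gt^2}{3}\sin\lambda\,(\omega t), \\
z &= h - \frac{gt^2}{2}\left(1 - \frac{\sin^2\lambda}{3}(\omega t)^2\right).
\end{aligned} \tag{2.11}
\]
If one considers only terms of first order in $\omega t$, then $(\omega t)^2 \approx 0$, and (2.11) becomes
\[
x(t) = 0, \qquad y(t) = \frac{g\omega t^3\sin\lambda}{3}, \qquad z(t) = h - \frac{g}{2}t^2 . \tag{2.12}
\]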
This is identical with the results obtained by means of perturbation theory. However,
(2.10) is exact!
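How close the first-order result is to the exact solution can be seen by evaluating both numerically. The sketch below codes (2.10) and (2.12) side by side; the identity 1 − cos 2ωt = 2 sin² ωt is used only to reduce rounding errors for small ωt. The fall height, the angle λ, and the function names are assumptions of the example.

```python
import math


def exact_solution(t, h, omega, lam, g=9.81):
    """Exact solution (2.10) of the free-fall equations."""
    s, c = math.sin(lam), math.cos(lam)
    # t^2/2 - (1 - cos 2wt)/(4 w^2), with 1 - cos 2wt = 2 sin^2(wt)
    bracket = t * t / 2 - (math.sin(omega * t) ** 2) / (2 * omega ** 2)
    x = g * s * c * bracket
    y = g * s / (2 * omega) * (t - math.sin(2 * omega * t) / (2 * omega))
    z = h - g * t * t / 2 + g * s * s * bracket
    return x, y, z


def first_order_solution(t, h, omega, lam, g=9.81):
    """First-order result (2.12)."""
    return 0.0, g * omega * t ** 3 * math.sin(lam) / 3, h - g * t * t / 2


H = 400.0                  # assumed fall height in m
LAM = math.pi / 4          # assumed angle lambda
OMEGA = 7.27e-5            # angular velocity of the earth in 1/s
T_FALL = math.sqrt(2 * H / 9.81)

x_ex, y_ex, z_ex = exact_solution(T_FALL, H, OMEGA, LAM)
x_1, y_1, z_1 = first_order_solution(T_FALL, H, OMEGA, LAM)
print(x_ex, y_ex, z_ex)
print(x_1, y_1, z_1)
```

For fall times of a few seconds the two y-values differ only by a relative amount of order (ωt)², while the exact x is small but not zero.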
The eastward deflection of a falling mass seems at first paradoxical, since the earth
rotates toward the east too. However, it becomes transparent if one considers that the
mass at the height h at the time t = 0 has, in the inertial system, a larger velocity
component toward the east (due to the earth’s rotation) than an observer on the earth’s
surface. It is just this “excessive” velocity toward the east which, for an observer on the
earth, lets the stone fall toward the east rather than vertically downward. For the throw
upward the situation is the opposite (see Exercise 2.2).
Fig. 2.2. Cut through the earth in the equatorial plane viewed from the North Pole:
M is the earth center, and ω the angular velocity
EXAMPLE
2.1 Eastward Deflection of a Falling Body
As an example, we calculate the eastward deflection of a body that falls at the equator
from a height of 400 m.
The eastward deflection of a body falling from the height $h$ is given by
\[
y(h) = \frac{2\omega\sin\lambda\,h}{3}\sqrt{\frac{2h}{g}} .
\]
The height is $h = 400$ m, the angular velocity of the earth is $\omega = 7.27 \cdot 10^{-5}\,\mathrm{rad\,s^{-1}}$,
and the gravitational acceleration is $g = 9.81\,\mathrm{m\,s^{-2}}$. At the equator $\lambda = 90^\circ$, so that $\sin\lambda = 1$.
Inserting the values in $y(h)$ yields
\[
y(h) = \frac{2 \cdot 7.27 \cdot 400}{3 \cdot 10^5}\,\frac{\mathrm{rad\,m}}{\mathrm{s}}\,\sqrt{\frac{2 \cdot 400}{9.81}\,\mathrm{s}^2},
\]
where rad is a dimensionless quantity. The result is
\[
y(h) \approx 17.5\ \mathrm{cm}.
\]
Thus, the body will be deflected toward the east by about 17.5 cm.
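The arithmetic of this example is easy to reproduce. The following lines are a small sketch that evaluates the deflection formula with the values quoted in the example; the function name is hypothetical, and the last digit of the result depends on the constants and the rounding used.

```python
import math


def eastward_deflection(h, lam, omega=7.27e-5, g=9.81):
    """Eastward deflection y(h) = (2 w h sin(lam) / 3) * sqrt(2h/g) in meters.

    lam is the angle lambda of the text (90 degrees at the equator)."""
    return 2 * omega * h * math.sin(lam) / 3 * math.sqrt(2 * h / g)


deflection_m = eastward_deflection(400.0, math.pi / 2)
print(f"{100 * deflection_m:.2f} cm")
```

Changing `h` or `lam` shows how the deflection scales with the 3/2 power of the height and with sin λ.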
EXERCISE
2.2 Eastward Deflection of a Thrown Body
Problem. An object is thrown vertically upward with the initial velocity $v_0$. Find the
eastward deflection.
Solution. If we put the coordinate system at the starting point of the motion, the
initial conditions read
\[
\begin{aligned}
z(t = 0) &= 0, & \dot{z}(t = 0) &= v_0, \\
y(t = 0) &= 0, & \dot{y}(t = 0) &= 0, \\
x(t = 0) &= 0, & \dot{x}(t = 0) &= 0.
\end{aligned}
\]
The deflection to the east is given by $y$, the deflection to the south by $x$; $z$ denotes
the height above the earth’s surface, with $z = 0$ at the starting point.
For the motion in $y$-direction we have, as has been shown (see (2.4)),
\[
\frac{dy}{dt} = -2\omega(x\cos\lambda + z\sin\lambda) + C_2 ,
\]
where $C_2 = 0$ for the present initial conditions.
The motion of the body in $x$-direction can be neglected; $x \approx 0$. If one further
neglects the influence of the eastward deflection on $z$, one immediately arrives at the
equation
\[
z = -\frac{g}{2}t^2 + v_0 t,
\]
which is already known from the treatment of the free fall without accounting for the
earth’s rotation. Insertion into the above differential equation yields
\[
\frac{dy}{dt} = 2\omega\left(\frac{g}{2}t^2 - v_0 t\right)\sin\lambda, \qquad
y(t) = 2\omega\left(\frac{g}{6}t^3 - \frac{v_0}{2}t^2\right)\sin\lambda .
\]
At the turning point (after the time of ascent $T = v_0/g$), the deflection is
\[
y(T) = -\frac{2}{3}\,\omega\sin\lambda\,\frac{v_0^3}{g^2}.
\]
It points toward the west, as expected.
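As a quick plausibility check, the expression for y(t) can be evaluated at the time of ascent and compared with the closed form for y(T). This is a sketch only; the launch speed of 20 m/s, the angle λ, and the function name are assumed values for illustration.

```python
import math


def deflection_during_ascent(t, v0, lam, omega=7.27e-5, g=9.81):
    """y(t) = 2 w (g t^3 / 6 - v0 t^2 / 2) sin(lam) for the upward throw."""
    return 2 * omega * (g * t ** 3 / 6 - v0 * t ** 2 / 2) * math.sin(lam)


V0 = 20.0            # assumed launch speed in m/s
LAM = math.pi / 2    # assumed angle lambda (equator)
G = 9.81
OMEGA = 7.27e-5

t_ascent = V0 / G
y_apex = deflection_during_ascent(t_ascent, V0, LAM)
y_closed_form = -2.0 / 3.0 * OMEGA * math.sin(LAM) * V0 ** 3 / G ** 2
print(y_apex, y_closed_form)
```

Both numbers are negative, i.e., the deflection at the turning point is directed toward the west.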
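EXERCISE
2.3 Superelevation of a River Bank
Problem. A river of width $D$ flows on the northern hemisphere at the geographical
latitude $\varphi$ toward the north with a flow velocity $v_0$. By which amount is the right bank
higher than the left one?
Evaluate the numerical example $D = 2$ km, $v_0 = 5$ km/h, and $\varphi = 45^\circ$.
Solution. For the earth, we have
\[
m\frac{d^2\mathbf{r}}{dt^2} = -mg\mathbf{e}_3 - 2m\boldsymbol{\omega}\times\mathbf{v} \quad\text{with}\quad \boldsymbol{\omega} = -\omega\sin\lambda\,\mathbf{e}_1 + \omega\cos\lambda\,\mathbf{e}_3 .
\]
Here $\lambda$ is measured from the pole, so that $\cos\lambda = \sin\varphi$. The flow velocity is
$\mathbf{v} = -v_0\mathbf{e}_1$, and hence,
\[
\boldsymbol{\omega}\times\mathbf{v} = -\omega v_0\sin\varphi\,\mathbf{e}_2 .
\]
Then the force is
\[
m\ddot{\mathbf{r}} = \mathbf{F} = -mg\mathbf{e}_3 + 2m\omega v_0\sin\varphi\,\mathbf{e}_2 = F_3\mathbf{e}_3 + F_2\mathbf{e}_2 .
\]
$\mathbf{F}$ must be perpendicular to the water surface (see Fig. 2.3). With the magnitude of
the force
\[
F = \sqrt{4m^2\omega^2v_0^2\sin^2\varphi + m^2g^2}
\]
one can, from Fig. 2.3, determine $H = D\sin\alpha$ and $\sin\alpha = F_2/F$. For the desired
height $H$ one obtains
\[
H = D\,\frac{2\omega v_0\sin\varphi}{\sqrt{4\omega^2v_0^2\sin^2\varphi + g^2}} \approx \frac{2D\omega v_0\sin\varphi}{g}.
\]
Fig. 2.3.
For the numerical example one gets a bank superelevation of $H \approx 2.9$ cm.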
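The numerical example can be reproduced with a few lines. The sketch below evaluates the unapproximated expression for H; ω = 7.27 · 10⁻⁵ s⁻¹ and g = 9.81 m/s² are taken from the earlier examples, and the function name is hypothetical.

```python
import math


def bank_superelevation(width, speed, latitude, omega=7.27e-5, g=9.81):
    """Height difference H = D * F2 / F between the two river banks.

    width in m, speed in m/s, latitude in radians."""
    horizontal = 2 * omega * speed * math.sin(latitude)   # F2 / m
    return width * horizontal / math.hypot(horizontal, g)


height_m = bank_superelevation(
    width=2000.0,                 # D = 2 km
    speed=5000.0 / 3600.0,        # v0 = 5 km/h in m/s
    latitude=math.radians(45.0),
)
print(f"{100 * height_m:.1f} cm")
```

Because the Coriolis term is tiny compared with g, the exact expression and the approximation 2Dωv0 sin φ/g agree to many digits.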
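EXERCISE
2.4 Difference of Sea Depth at the Pole and Equator
Problem. Let a uniform spherical earth be covered by water. The sea surface takes
the shape of an oblate spheroid if the earth rotates with the angular velocity $\omega$.
Find an expression that approximately describes the difference of the sea depth at
the pole and equator, respectively. Assume that the sea surface is a surface of constant
potential energy. Neglect the corrections to the gravitational potential due to the
deformation.
Solution.
Fig. 2.4.
The effective force on a mass $m$ is
\[
\mathbf{F}_{\mathrm{eff}}(r) = -\frac{\gamma mM}{r^2}\mathbf{e}_r + m\omega^2 r\sin\vartheta\,\mathbf{e}_x ,
\]
where $r\sin\vartheta$ is the distance from the axis of rotation. The potential difference is
\[
\begin{aligned}
V\big|_{r_1}^{r_2} &= -\int_{r_1}^{r_2}\mathbf{F}_{\mathrm{eff}}(r)\cdot d\mathbf{r} \\
&= -\int_{r_1}^{r_2}\left(-\frac{\gamma mM}{r^2}\mathbf{e}_r + m\omega^2 r\sin\vartheta\,\mathbf{e}_x\right)\cdot\mathbf{e}_r\,dr \\
&= -\left[\frac{\gamma mM}{r}\right]_{r_1}^{r_2} - \left[\frac{m\omega^2r^2\sin^2\vartheta}{2}\right]_{r_1}^{r_2} .
\end{aligned}
\]
We therefore define
\[
V_{\mathrm{eff}}(r) = -\frac{\gamma mM}{r} - \frac{m\omega^2r^2}{2}\sin^2\vartheta . \tag{2.13}
\]
Let
\[
r = R + \Delta r(\vartheta); \qquad \Delta r(\vartheta) \ll R.
\]
The potential at the surface of the rotating sphere is constant by definition:
\[
V(r) = -\frac{\gamma mM}{R} + V_0 .
\]
According to the formulation of the problem, the earth’s surface is an equipotential
surface. From this it follows that the attractive force acts normal to this surface.
Because of the constancy of the potential along the surface, no tangential force can
arise.
\[
V(r) = -\frac{\gamma mM}{R}\left(1 - \frac{\Delta r}{R}\right) - \frac{m}{2}\omega^2R^2\left(1 + 2\frac{\Delta r}{R}\right)\sin^2\vartheta
\overset{!}{=} -\frac{\gamma mM}{R} + V_0 .
\]
From this it follows that
\[
V_0 = \frac{\gamma mM}{R^2}\Delta r - \frac{m}{2}\omega^2R^2\sin^2\vartheta - m\omega^2R\,\Delta r\sin^2\vartheta .
\]
As can be seen by inserting the given values, the last term can be neglected:
\[
\frac{\gamma mM}{R^2} \gg m\omega^2R\sin^2\vartheta .
\]
From this it follows that
\[
\frac{\gamma mM}{R^2}\Delta r = V_0 + \frac{m}{2}\omega^2R^2\sin^2\vartheta ,
\]
or explicitly for the difference $\Delta r(\vartheta)$,
\[
\Delta r(\vartheta) = \frac{R^2}{\gamma mM}\left(V_0 + \frac{m}{2}\omega^2R^2\sin^2\vartheta\right). \tag{2.14}
\]
The second requirement for the evaluation of the deformation is the volume conservation.
Since one can assume $\Delta r \ll R$, we can write this requirement as a simple surface
integral
\[
\int_{\vartheta=0}^{\pi/2}\int_{\varphi=0}^{2\pi} da\cdot\Delta r(\vartheta) = 0, \tag{2.15}
\]
and hence, because of the rotational symmetry in $\varphi$,
\[
\int_0^{\pi/2}\left(V_0 + \frac{m\omega^2R^2\sin^2\vartheta}{2}\right)2\pi R\cdot R\sin\vartheta\,d\vartheta = 0,
\]
from which follows
\[
\int_0^{\pi/2}\left(V_0\sin\vartheta + \frac{m\omega^2R^2\sin^3\vartheta}{2}\right)d\vartheta = 0.
\]
With
\[
\int_0^{\pi/2}\sin\vartheta\,d\vartheta = 1 \quad\text{and}\quad \int_0^{\pi/2}\sin^3\vartheta\,d\vartheta = \frac{2}{3},
\]
one gets
\[
V_0 + \frac{m}{3}\omega^2R^2 = 0, \qquad V_0 = -\frac{m}{3}\omega^2R^2 .
\]
By inserting this result into (2.14), one obtains
\[
\Delta r(\vartheta) = \frac{R^4\omega^2}{2\gamma M}\left(\sin^2\vartheta - \frac{2}{3}\right).
\]
In the last step $\gamma M/R^2$ is to be replaced by $g$; thus we have found an approximate
expression for the difference of the sea depth:
\[
\Delta r(\vartheta) = \frac{\omega^2R^2}{2g}\left(\sin^2\vartheta - \frac{2}{3}\right). \tag{2.16}
\]
By inserting the given values
\[
R = 6370\ \mathrm{km}, \qquad g = 9.81\ \frac{\mathrm{m}}{\mathrm{s}^2}, \qquad \omega = \frac{2\pi}{T} = 7.2722\cdot10^{-5}\ \frac{1}{\mathrm{s}},
\]
we get
\[
d = \Delta r\!\left(\frac{\pi}{2}\right) - \Delta r(0) \approx 10.94\ \mathrm{km}.
\]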
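The final number follows directly from (2.16) and can be checked in a few lines. The sketch below assumes T = 24 h for the rotation period, as in Chap. 1; the function name is hypothetical.

```python
import math

EARTH_RADIUS = 6.37e6                    # R in m
G = 9.81                                 # g in m/s^2
OMEGA = 2 * math.pi / (24 * 3600)        # omega = 2 pi / T with T = 24 h


def surface_shift(theta, radius=EARTH_RADIUS, omega=OMEGA, g=G):
    """Radial shift of the sea surface according to (2.16), in meters.

    theta is the polar angle (0 at the pole, pi/2 at the equator)."""
    return omega ** 2 * radius ** 2 / (2 * g) * (math.sin(theta) ** 2 - 2.0 / 3.0)


depth_difference = surface_shift(math.pi / 2) - surface_shift(0.0)
print(f"{depth_difference / 1000:.2f} km")
```

The difference reduces to ω²R²/(2g), so the constant −2/3 fixed by volume conservation drops out of d.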
If one wants to include the influence of the deformation on the gravitational po-
tential, one needs the so-called spherical surface harmonics. They will be outlined in
detail in the lectures on classical electrodynamics.1
1 See W. Greiner: Classical Electrodynamics, 1st ed., Springer, Berlin (1998).

3 Foucault’s Pendulum
In 1851, Foucault¹ found a simple and convincing proof of the earth’s rotation: A
pendulum tends to maintain its plane of motion, independent of any rotation of the
suspension point. If such a rotation is nevertheless observed in a laboratory, one can only
conclude that the laboratory (i.e., the earth) rotates.
Figure 3.1 shows the arrangement of the pendulum and fixes the axes of the coor-
dinate system.
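Fig. 3.1. Principle of the Foucault pendulum
We first derive the equation of motion of the Foucault pendulum. For the mass point
we have
\[
\mathbf{F} = \mathbf{T} + m\mathbf{g}, \tag{3.1}
\]
where $\mathbf{T}$ is a still unknown tension force along the pendulum string. In the basic
equation that holds for moving reference frames,
\[
m\ddot{\mathbf{r}} = \mathbf{F} - m\frac{d\boldsymbol{\omega}}{dt}\times\mathbf{r} - 2m\boldsymbol{\omega}\times\mathbf{v} - m\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}), \tag{3.2}
\]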
1 Jean Bernard Léon Foucault [fuk’o], French physicist, b. Sept. 18, 1819 Paris–d. Feb. 11, 1868.
In 1851, Foucault performed his famous pendulum experiment in the Panthéon in Paris as a proof of
the earth’s rotation. In the same year he proved by means of a rotating mirror that light propagates
in water more slowly than in air, which was important for confirming the wave theory of light. He
investigated the eddy currents in metals detected by D.F. Arago (Foucault-currents), and also studied
light and heat radiation together with A.H.L. Fizeau.

the linear forces and the centripetal forces can be neglected, because for the earth’s ro-
tation dω/dt = 0 and t · |ω| 1, t 2 ω2 ≈ 0 (t ≈ pendulum period). By inserting (3.1)
into the simplified equation (3.2), we get
mr̈ = T + mg − 2mω × v. (3.3)
As is obvious from this equation, the earth’s rotation is expressed for the moving
observer by the appearance of a virtual force, the Coriolis force. The Coriolis force
causes a rotation of the vibrational plane of the pendulum. The string tension T can
be determined from (3.3) by noting that
T
T = (T · e1 )e1 + (T · e2 )e2 + (T · e3 )e3 = T
T
(−x, −y, l − z)
=T
x 2 + y 2 + (l − z)2
(−x, −y, l − z)
≈T . (3.4)
l
In the last step, we presupposed a very large pendulum length l, so that x/ l 1,

y/ l 1, and z/ l ≪ 1. Evaluation of the scalar product therefore yields

x y z−l
T = T − e1 − e2 + e . (3.5)
l l l 3
Before inserting (3.5) into (3.3), it is practical to decompose (3.3) into individual com-
ponents. For this purpose one has to evaluate the vector product ω × v:

e e2 e3
1
ω × v = −ω sin λ 0 ω cos λ
ẋ ẏ ż
= −ω cos λẏe1 + ω(cos λẋ + sin λż)e2 − ω sin λẏe3 . (3.6)
By inserting (3.5) and (3.6) into (3.3) with mg = −mge3 , we obtain a coupled
system of differential equations:
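\[
\begin{aligned}
m\ddot{x} &= -\frac{x}{l}\,T + 2m\omega\cos\lambda\,\dot{y},\\
m\ddot{y} &= -\frac{y}{l}\,T - 2m\omega(\cos\lambda\,\dot{x} + \sin\lambda\,\dot{z}),\\
m\ddot{z} &= \frac{l-z}{l}\,T - mg + 2m\omega\sin\lambda\,\dot{y}.
\end{aligned} \tag{3.7}
\]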
To eliminate the unknown string tension T from the system (3.7), we adopt the already
mentioned approximations:
Fig. 3.2. Projection of the string tension T onto the axes ei
The pendulum string shall be very long, but the pendulum shall oscillate with small
amplitudes only. From this it follows that x/l ≪ 1, y/l ≪ 1, and z/l ≪ 1, since the
mass point moves almost in the x, y-plane. Hence, for calculating the string tension
we use the approximation
\[
\frac{l-z}{l} = 1, \qquad m\ddot{z} = 0, \tag{3.8}
\]
l
and obtain from the third equation (3.7)
T = mg − 2mω sin λẏ. (3.9)
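Insertion of (3.9) into (3.7) after division by the mass m yields
\[
\begin{aligned}
\ddot{x} &= -\frac{g}{l}\,x + \frac{2\omega\sin\lambda}{l}\,x\dot{y} + 2\omega\cos\lambda\,\dot{y},\\
\ddot{y} &= -\frac{g}{l}\,y + \frac{2\omega\sin\lambda}{l}\,y\dot{y} - 2\omega\cos\lambda\,\dot{x}.
\end{aligned} \tag{3.10}
\]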
Equation (3.10) represents a system of nonlinear, coupled differential equations; nonlinear
since the mixed terms xẏ and yẏ appear. Since the products of the small numbers
ω, x, and ẏ (or ω, y, and ẏ) are negligible compared to the other terms, (3.10)
can be considered equivalent to
\[
\ddot{x} = -\frac{g}{l}\,x + 2\omega\cos\lambda\,\dot{y}, \qquad \ddot{y} = -\frac{g}{l}\,y - 2\omega\cos\lambda\,\dot{x}. \tag{3.11}
\]
These two linear (but coupled) differential equations describe the vibrations of a pen-
dulum under the influence of the Coriolis force to a good approximation. In the fol-
lowing we will describe a method of solving (3.11).
3.1 Solution of the Differential Equations
For solving (3.11), we introduce the abbreviations g/l = k² and ω cos λ = α, multiply
ÿ by the imaginary unit i = √−1, and obtain
\[
\begin{aligned}
\ddot{x} &= -k^2 x - 2\alpha i^2\,\dot{y},\\
i\ddot{y} &= -k^2\, i y - 2\alpha i\,\dot{x},\\
\ddot{x} + i\ddot{y} &= -k^2(x + iy) - 2\alpha i(\dot{x} + i\dot{y}).
\end{aligned} \tag{3.12}
\]
The abbreviation u = x + iy is obvious:
\[
\ddot{u} = -k^2 u - 2\alpha i\,\dot{u} \quad\text{or}\quad 0 = \ddot{u} + 2\alpha i\,\dot{u} + k^2 u. \tag{3.13}
\]
Equation (3.13) is solved by the ansatz useful for all vibration processes,
\[
u = C\,e^{\gamma t}, \tag{3.14}
\]
where γ is to be determined by inserting the derivatives into (3.13):
\[
C\gamma^2 e^{\gamma t} + 2\alpha i\,C\gamma\,e^{\gamma t} + k^2 C e^{\gamma t} = 0 \quad\text{or}\quad \gamma^2 + 2i\alpha\gamma + k^2 = 0. \tag{3.15}
\]
The two solutions of (3.15) are
\[
\gamma_{1/2} = -i\alpha \pm ik\sqrt{1 + \alpha^2/k^2}. \tag{3.16}
\]
Since α² = ω² cos² λ ≪ k², because ω²/k² is small compared to 1 (ω²/k² = T²pend/T²earth ≪ 1,
where Tpend is the pendulum period and Tearth = 1 day), it further follows that
\[
\gamma_{1/2} = -i\alpha \pm ik. \tag{3.17}
\]
The most general solution of the differential equation (3.13) is the linear combination
of the linearly independent solutions
\[
u = A\,e^{\gamma_1 t} + B\,e^{\gamma_2 t}, \tag{3.18}
\]
where A and B must be fixed by the initial conditions and are of course complex, i.e.,
can be decomposed into a real and an imaginary part:
\[
u = (A_1 + iA_2)\,e^{-i(\alpha-k)t} + (B_1 + iB_2)\,e^{-i(\alpha+k)t}. \tag{3.19}
\]
The Euler relation e^(−iϕ) = cos ϕ − i sin ϕ allows one to split (3.19) into u = x + iy:
\[
x + iy = (A_1 + iA_2)[\cos(\alpha-k)t - i\sin(\alpha-k)t] + (B_1 + iB_2)[\cos(\alpha+k)t - i\sin(\alpha+k)t], \tag{3.20}
\]
from which it follows after separating the real and the imaginary parts
\[
\begin{aligned}
x &= A_1\cos(\alpha-k)t + A_2\sin(\alpha-k)t + B_1\cos(\alpha+k)t + B_2\sin(\alpha+k)t,\\
y &= -A_1\sin(\alpha-k)t + A_2\cos(\alpha-k)t - B_1\sin(\alpha+k)t + B_2\cos(\alpha+k)t.
\end{aligned} \tag{3.21}
\]
Let the initial conditions be
x0 = 0, ẋ0 = 0,
y0 = L, ẏ0 = 0,
i.e., the pendulum is displaced by the distance L toward the east and released at the
time t = 0 without initial velocity. Inserting x0 = 0 in (3.21), one gets
B1 = −A1 .
Differentiating (3.21) and setting ẋ0 = 0 yields
\[
B_2 = A_2\,\frac{k-\alpha}{k+\alpha}.
\]
As already noted in (3.16), α ≪ k and thus B2 ≈ A2. From (3.21) one now obtains
x = A1 cos(α − k)t + A2 sin(α − k)t − A1 cos(α + k)t + A2 sin(α + k)t,

(3.22)
y = −A1 sin(α − k)t + A2 cos(α − k)t + A1 sin(α + k)t + A2 cos(α + k)t.
We still have to include the initial conditions for y0 and ẏ0 . From ẏ0 = 0 and (3.22)
we get
−A1 (α − k) + A1 (α + k) = 0 ⇒ A1 = 0.
From y0 = L and (3.22) we get
\[
2A_2 = L \;\Rightarrow\; A_2 = \frac{L}{2}.
\]
By inserting these values one obtains
\[
x = \frac{L}{2}\sin(\alpha-k)t + \frac{L}{2}\sin(\alpha+k)t, \qquad
y = \frac{L}{2}\cos(\alpha-k)t + \frac{L}{2}\cos(\alpha+k)t.
\]
Using the trigonometric formulae
sin(α ± k) = sin α cos k ± cos α sin k, cos(α ± k) = cos α cos k ∓ sin α sin k,
it follows that
x = L sin αt cos kt, y = L cos αt cos kt.
The two equations can be combined into a vector equation:
r = L cos kt[sin(αt)e1 + cos(αt)e2 ]. (3.23)
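The quality of the closed form (3.23) can be checked numerically. The following is a minimal sketch, not part of the original derivation: it integrates the linearized equations (3.11) with a classical Runge-Kutta scheme (a choice made here for illustration) and records the largest distance from (3.23). The values k = 1, α = 0.01, L = 1 are assumed and deliberately exaggerate α/k; for the real earth α/k is far smaller.

```python
import math

def foucault_rhs(state, k, alpha):
    """Right-hand side of the linearized equations (3.11), with k^2 = g/l and alpha = omega*cos(lambda)."""
    x, y, vx, vy = state
    ax = -k**2 * x + 2.0 * alpha * vy
    ay = -k**2 * y - 2.0 * alpha * vx
    return (vx, vy, ax, ay)

def rk4_step(state, dt, k, alpha):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, ds, h):
        return tuple(si + h * dsi for si, dsi in zip(s, ds))
    k1 = foucault_rhs(state, k, alpha)
    k2 = foucault_rhs(shifted(state, k1, dt / 2), k, alpha)
    k3 = foucault_rhs(shifted(state, k2, dt / 2), k, alpha)
    k4 = foucault_rhs(shifted(state, k3, dt), k, alpha)
    return tuple(
        s + dt / 6.0 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def max_deviation(k=1.0, alpha=0.01, L=1.0, t_end=50.0, dt=0.01):
    """Largest distance between the numerical path and the closed form (3.23)."""
    state = (0.0, L, 0.0, 0.0)  # x0 = 0, y0 = L, released at rest
    steps = int(round(t_end / dt))
    worst = 0.0
    for n in range(1, steps + 1):
        state = rk4_step(state, dt, k, alpha)
        t = n * dt
        x_ref = L * math.sin(alpha * t) * math.cos(k * t)
        y_ref = L * math.cos(alpha * t) * math.cos(k * t)
        worst = max(worst, math.hypot(state[0] - x_ref, state[1] - y_ref))
    return worst

deviation = max_deviation()
print(f"largest deviation from (3.23): {deviation:.4f} for alpha/k = 0.01")
```

The deviation stays of the order α/k times L, which is the size of the terms dropped when B2 ≈ A2 and γ ≈ −iα ± ik were used.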

Fig. 3.3. The unit vector n(t) rotates in the x, y-plane
3.2 Discussion of the Solution
The first factor in (3.23) describes the motion of a pendulum that vibrates with the amplitude
L and the frequency k = √(g/l). The second term is a unit vector n that rotates
with the frequency α = ω cos λ and describes the rotation of the vibration plane:
r = L cos kt n(t),
n(t) = sin αt e1 + cos αt e2 .
Equation (3.23) also tells us in what direction the vibrational plane rotates. For the
northern hemisphere cos λ > 0, and after a short time sin αt > 0 and cos αt > 0, i.e.,
the vibrational plane rotates clockwise. An observer in the southern hemisphere will
see his pendulum rotate counter-clockwise, since cos λ < 0.
At the equator the experiment fails, since cos λ = 0. Although the component ωx =
−ω sin λ takes its maximum value there, it cannot be demonstrated by means of the
Foucault pendulum.
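To get a feeling for the magnitude of α = ω cos λ, one can compute the time 2π/|α| that the unit vector n(t) needs for one full turn. The sketch below is illustrative only. It assumes a sidereal day of 86164 s for the earth's rotation and treats λ as the colatitude, which is inferred here from cos λ = 0 at the equator; the colatitude 41.15° (roughly that of Paris) is likewise an assumed example value.

```python
import math

OMEGA_EARTH = 2.0 * math.pi / 86164.0  # rad/s, assumed: one sidereal day = 86164 s

def plane_rotation_period_hours(colatitude_deg):
    """Time for one full turn of n(t): 2*pi/|alpha| with alpha = omega*cos(lambda)."""
    c = math.cos(math.radians(colatitude_deg))
    if abs(c) < 1e-12:
        return math.inf  # equator: cos(lambda) = 0, the plane does not rotate
    return 2.0 * math.pi / (OMEGA_EARTH * abs(c)) / 3600.0

pole = plane_rotation_period_hours(0.0)
paris = plane_rotation_period_hours(41.15)
equator = plane_rotation_period_hours(90.0)
print(f"pole: {pole:.2f} h, colatitude 41.15 deg: {paris:.2f} h, equator: {equator}")
```

At the pole the plane turns once per day, at intermediate latitudes more slowly, and at the equator not at all, in line with the discussion above.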
Following the path of the mass point of a Foucault pendulum, one finds rosette
trajectories. Note that the shape of the trajectories essentially depends on the initial
conditions (see Fig. 3.4). The left side shows a rosette path for a pendulum released at
the maximum displacement; the pendulum shown on the right side was pushed out of
the rest position.
Fig. 3.4. Rosette paths of the Foucault pendulum
Because of the assumption α ≪ k in (3.16), (3.23) does not describe either of the
two rosettes exactly. According to (3.23), the pendulum always passes the rest position,
although the initial conditions were adopted as in the left figure.
EXERCISE
3.1 Chain Fixed to a Rotating Bar
Problem. A vertical bar AB rotates with constant angular velocity ω. A light non-
stretchable chain of length l is fixed at one end to the point O of the bar, while the
mass m is fixed at its other end. Find the chain tension and the angle between chain
and bar in the state of equilibrium.
Solution. Three forces act on the body, viz.
(1) the gravitation (weight): Fg = −mge3 ;
(2) the centrifugal force: Fz = −mω × (ω × r);
(3) the chain tension force: T = −T sin ϕe1 + T cos ϕe3 .
Fig. 3.5. e1, e2, e3 are the unit vectors of a rectangular coordinate system rotating with the bar; T is the chain tension force; Fg is the weight of the mass m; Fz is the centrifugal force
Since the angular velocity has only one component in the e3 -direction, ω = ωe3 , and
r = l(sin ϕe1 + (1 − cos ϕ)e3 ),
we find for the centrifugal force
Fz = −m(ω × (ω × r))
the expression
Fz = +mω2 l sin ϕ e1 .
If the body is in equilibrium, the sum of the three forces equals zero:
0 = −mge3 + mω2 l sin ϕe1 − T sin ϕe1 + T cos ϕe3 .
When ordering by components, we obtain
0 = (mω2 l sin ϕ − T sin ϕ)e1 + (T cos ϕ − mg)e3 .

Since a vector vanishes only if every component equals zero, we can set up the following
component equations:
mω2 l sin ϕ − T sin ϕ = 0, (3.24)

T cos ϕ − mg = 0. (3.25)
One solution of (3.24) is sin ϕ = 0. It represents a state of unstable equilibrium that
happens if the body rotates on the axis AB. In this case the centrifugal force component
vanishes. A second solution of the system is found by assuming sin ϕ ≠ 0. We
can then divide (3.24) by sin ϕ and get the tension force T :
T = mω2 l (3.26)
and after elimination of T from (3.25) we get the angle ϕ between the chain and the
bar:
\[
\cos\varphi = \frac{g}{\omega^2 l}.
\]
Since the chain OP with the mass m in P moves on the surface of a cone, this arrange-
ment is called the cone pendulum.
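A short numerical sketch of the result may help. The function name and the sample values m = 1 kg, l = 1 m, ω = 4 s⁻¹ are assumptions made for illustration. Note that cos ϕ = g/(ω² l) has a solution only if ω² l ≥ g; otherwise only the hanging state sin ϕ = 0 remains, with T = mg from (3.25).

```python
import math

def conical_pendulum(m, l, omega, g=9.81):
    """Return (tension in N, angle phi in rad) for the equilibrium of Exercise 3.1."""
    ratio = g / (omega**2 * l)
    if ratio >= 1.0:
        # No inclined equilibrium: only sin(phi) = 0 is left, and (3.25) gives T = m g.
        return m * g, 0.0
    # Inclined equilibrium: T from (3.26), angle from cos(phi) = g / (omega^2 l).
    return m * omega**2 * l, math.acos(ratio)

m, l, omega = 1.0, 1.0, 4.0
tension, phi = conical_pendulum(m, l, omega)
print(f"T = {tension:.2f} N, phi = {math.degrees(phi):.1f} deg")
```

The pair returned for the inclined case satisfies the vertical balance (3.25), T cos ϕ = mg, which is a convenient consistency check.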
EXERCISE
3.2 Pendulum in a Moving Train
Problem. The period of a pendulum of length l is given by T . How will the period
change if the pendulum is suspended at the ceiling of a train that moves with the
velocity v along a curve with radius R?
(a) Neglect the Coriolis force. Why can you do that?
(b) Solve the equations of motion (with Coriolis force!) nearly exactly (analogous to
Foucault’s pendulum).
Solution. (a) The backdriving force is
\[
F_R = -mg\sin\varphi + \frac{mv^2}{R}\cos\varphi.
\]
One has v(x) = ω(R + x) = ω(R + l sin ϕ), and the radius R has to be replaced by R + x. Hence, it follows that
FR = −mg sin ϕ + mω2 (R + l sin ϕ) cos ϕ.
The differential equation for the motion therefore reads
ms̈ = −mg sin ϕ + mω2 (R + l sin ϕ) cos ϕ.

Fig. 3.6.
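Since s = lϕ, s̈ = lϕ̈, it follows that lϕ̈ = −g sin ϕ + ω²(R + l sin ϕ) cos ϕ, or
\[
\ddot{\varphi} + \frac{g}{l}\sin\varphi - \omega^2\left(\frac{R}{l} + \sin\varphi\right)\cos\varphi = 0. \tag{3.27}
\]
For small amplitudes, cos ϕ ≈ 1 and sin ϕ ≈ ϕ, i.e.,
\[
\ddot{\varphi} + \frac{g}{l}\,\varphi - \omega^2\left(\frac{R}{l} + \varphi\right) = 0
\]
or
\[
\ddot{\varphi} + \left(\frac{g}{l} - \omega^2\right)\varphi - \omega^2\,\frac{R}{l} = 0. \tag{3.28}
\]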
Here, the Coriolis force was neglected, since the angular velocity ϕ̇ and hence ẋ is
small compared to the rotational velocity v = ω(R + x), i.e., ω × ẋ ≅ 0. The solution
of the homogeneous differential equation is
\[
\varphi_h = \sin\left(\sqrt{\frac{g}{l} - \omega^2}\;t\right).
\]
The particular solution of the inhomogeneous differential equation is
\[
\varphi_i = \frac{\omega^2 (R/l)}{(g/l) - \omega^2}.
\]
The general solution of (3.28) is therefore
\[
\varphi = \varphi_h + \varphi_i = \sin\left(\sqrt{\frac{g}{l} - \omega^2}\;t\right) + \frac{\omega^2 (R/l)}{(g/l) - \omega^2}.
\]
Hence, the vibration period is
\[
T = \frac{2\pi}{\sqrt{(g/l) - \omega^2}}.
\]
For ω = √(g/l) the period becomes infinite, since the centrifugal force exceeds the
gravitational force. This interpretation gets to the core of the matter, although the
formula (3.28) holds only for small angular velocities: For large angular velocities the
approximation of small vibration amplitudes x in (3.27) is no longer allowed, because
the pendulum mass is being pressed outward due to the centrifugal force, i.e., to large
values of x.
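(b) The equations of motion read
\[
m\ddot{\mathbf{r}} = \mathbf{F} - m\,\frac{d\boldsymbol{\omega}}{dt}\times\mathbf{r} - 2m\,\boldsymbol{\omega}\times\mathbf{v} - m\,\boldsymbol{\omega}\times(\boldsymbol{\omega}\times(\mathbf{r} + \mathbf{R})).
\]
With
\[
\boldsymbol{\omega} = -\omega\,\mathbf{e}_z, \qquad \mathbf{R} = R\,\mathbf{e}_x, \qquad -2m\,\boldsymbol{\omega}\times\mathbf{v} = 2m\omega(-\dot{y},\,\dot{x},\,0),
\]
and
\[
-m\,\boldsymbol{\omega}\times(\boldsymbol{\omega}\times(\mathbf{r} + \mathbf{R})) = m\omega^2(R + x,\,y,\,0),
\]
one finds
\[
\begin{aligned}
m\ddot{x} &= -T_x - 2m\omega\dot{y} + m\omega^2(R + x) = -\frac{x}{l}\,T - 2m\omega\dot{y} + m\omega^2(R + x),\\
m\ddot{y} &= -T_y + 2m\omega\dot{x} + m\omega^2 y = -\frac{y}{l}\,T + 2m\omega\dot{x} + m\omega^2 y,\\
m\ddot{z} &= -T_z + mg = -\frac{z}{l}\,T + mg.
\end{aligned} \tag{3.29}
\]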
In the following, we will assume that the pendulum length is large, i.e., for small
amplitudes z ≈ l (ż = z̈ = 0). The string tension is then given by T = mg:
\[
\ddot{x} = \left(\omega^2 - \frac{g}{l}\right)x - 2\omega\dot{y} + \omega^2 R, \qquad
\ddot{y} = \left(\omega^2 - \frac{g}{l}\right)y + 2\omega\dot{x}.
\]
With the substitution u = x + iy, it follows further that
\[
\ddot{u} = \left(\omega^2 - \frac{g}{l}\right)u + 2i\omega\,\dot{u} + \omega^2 R. \tag{3.30}
\]
For the homogeneous solution of the differential equation (3.30) one gets with the
ansatz u_hom = c e^(γt) the characteristic polynomial
\[
\gamma^2 - \left(\omega^2 - \frac{g}{l}\right) - 2i\omega\gamma = 0.
\]
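The homogeneous solution then takes the form
\[
u_{\mathrm{hom}} = c_1\exp\left[i\left(\omega + \sqrt{g/l}\right)t\right] + c_2\exp\left[i\left(\omega - \sqrt{g/l}\right)t\right].
\]
The particular solution is simply obtained as
\[
u_{\mathrm{part}} = \frac{\omega^2 R}{(g/l) - \omega^2}.
\]
From this it follows that
\[
u = u_{\mathrm{hom}} + u_{\mathrm{part}}
= c_1\exp\left[i\left(\omega + \sqrt{g/l}\right)t\right] + c_2\exp\left[i\left(\omega - \sqrt{g/l}\right)t\right] + \frac{\omega^2 R}{(g/l) - \omega^2}. \tag{3.31}
\]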
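With the initial condition x(0) = x0 and y(0) = ẋ(0) = ẏ(0) = 0, it follows for c1
and c2
\[
\begin{aligned}
c_1 &= \frac{\sqrt{g/l} - \omega}{2\sqrt{g/l}}\left(x_0 + \frac{\omega^2 R}{\omega^2 - g/l}\right),\\
c_2 &= \frac{\sqrt{g/l} + \omega}{2\sqrt{g/l}}\left(x_0 + \frac{\omega^2 R}{\omega^2 - g/l}\right).
\end{aligned} \tag{3.32}
\]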
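By decomposing the solution (3.31) into real and imaginary parts, the solutions for
x(t) and y(t) can be found.
\[
\begin{aligned}
x &= c_1\cos\left[\left(\omega + \sqrt{\tfrac{g}{l}}\right)t\right] + c_2\cos\left[\left(\omega - \sqrt{\tfrac{g}{l}}\right)t\right] + \frac{\omega^2 R}{g/l - \omega^2}\\
&= \sqrt{\frac{l}{g}}\left(x_0 + \frac{\omega^2 R}{\omega^2 - g/l}\right)\left\{\sqrt{\frac{g}{l}}\cos\sqrt{\tfrac{g}{l}}\,t\,\cos\omega t + \omega\sin\sqrt{\tfrac{g}{l}}\,t\,\sin\omega t\right\} + \frac{\omega^2 R}{g/l - \omega^2}\\
&= \left(x_0 + \frac{\omega^2 R}{\omega^2 - g/l}\right)\left\{\cos\sqrt{\tfrac{g}{l}}\,t\,\cos\omega t + \omega\sqrt{\frac{l}{g}}\sin\sqrt{\tfrac{g}{l}}\,t\,\sin\omega t\right\} + \frac{\omega^2 R}{g/l - \omega^2},
\end{aligned} \tag{3.33}
\]
\[
\begin{aligned}
y &= c_1\sin\left[\left(\omega + \sqrt{\tfrac{g}{l}}\right)t\right] + c_2\sin\left[\left(\omega - \sqrt{\tfrac{g}{l}}\right)t\right]\\
&= \left(x_0 + \frac{\omega^2 R}{\omega^2 - g/l}\right)\left\{\sin\omega t\,\cos\sqrt{\tfrac{g}{l}}\,t - \omega\sqrt{\frac{l}{g}}\sin\sqrt{\tfrac{g}{l}}\,t\,\cos\omega t\right\}.
\end{aligned} \tag{3.34}
\]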
Because ω ≪ √(g/l), we have ω√(l/g) ≪ 1. From this it follows that
\[
x = x_0\cos\sqrt{\tfrac{g}{l}}\,t\,\cos\omega t, \qquad
y = x_0\cos\sqrt{\tfrac{g}{l}}\,t\,\sin\omega t.
\]
This describes a rotation of the pendulum plane with the frequency ω (as for Foucault’s
pendulum).
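The pendulum period T can now be obtained from the following consideration: For
t = 0, the brace in (3.33) equals 1. For t = (π/2)√(l/g) + t′, where t′ ≪ (π/2)√(l/g),
the brace vanishes for the first time, which corresponds to a quarter of T. By expanding
the brace, for t′ we find
\[
t' = \frac{\pi}{2}\left(\frac{l}{g}\right)^{3/2}\omega^2,
\]
\[
T = 4\left[\frac{\pi}{2}\sqrt{\frac{l}{g}} + \frac{\pi}{2}\left(\frac{l}{g}\right)^{3/2}\omega^2\right]
= 2\pi\sqrt{\frac{l}{g}}\left(1 + \frac{\omega^2}{g/l}\right).
\]
On the other hand, in part (a) we found
\[
T = \frac{2\pi}{\sqrt{g/l - \omega^2}} \approx 2\pi\sqrt{\frac{l}{g}}\left(1 + \frac{1}{2}\,\frac{\omega^2}{g/l}\right),
\]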
which suggests the conclusion that the Coriolis force should not be neglected from the
outset in this consideration.
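The two period formulas can be compared numerically. The following sketch is an illustration under assumed values (l = 1 m and ω equal to 5 % of √(g/l)); the bisection search and all function names are choices made here, not part of the original solution. It locates the first zero of the brace in (3.33), takes four times that value as the period of part (b), and prints the ratios to the unperturbed period 2π√(l/g).

```python
import math

def brace(t, k, omega):
    """Curly-bracket factor of x(t) in (3.33), with k = sqrt(g/l)."""
    return (math.cos(k * t) * math.cos(omega * t)
            + (omega / k) * math.sin(k * t) * math.sin(omega * t))

def first_zero(k, omega, tol=1e-14):
    """Bisection for the first zero of the brace, bracketed in [pi/(2k), pi/k]."""
    lo, hi = math.pi / (2 * k), math.pi / k
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if brace(mid, k, omega) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

g, l = 9.81, 1.0            # assumed pendulum data
k = math.sqrt(g / l)
omega = 0.05 * k            # assumed, small compared to sqrt(g/l)

T0 = 2 * math.pi / k                                      # period without rotation
T_b_numeric = 4 * first_zero(k, omega)                    # part (b), from the brace
T_b_formula = T0 * (1 + omega**2 / k**2)                  # part (b), expanded result
T_a_formula = 2 * math.pi / math.sqrt(k**2 - omega**2)    # part (a)

print(T_b_numeric / T0, T_b_formula / T0, T_a_formula / T0)
```

For these values the relative period shift of part (b) is close to ω²/(g/l), about twice the shift obtained in part (a), which is the discrepancy the text attributes to the neglected Coriolis force.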
EXERCISE
3.3 Formation of Cyclones
Problem. Explain to which directions the winds from north, east, south, and west
will be deflected in the northern hemisphere. Explain the formation of cyclones.
Solution. We derive the equation of motion for a parcel of air P that moves near the
earth’s surface. The X, Y, Z system is considered as an inertial system; i.e., we shall
not take the rotation of the earth about the sun into account. Moreover, we assume the
air mass is moving at constant height; i.e., there is no velocity component along the
z-direction (ż = 0). The centrifugal acceleration shall also be neglected.
With the assumptions mentioned above, the equation of motion of the particle is
defined by the differential equation
\[
\ddot{\mathbf{r}} = \mathbf{g} - 2(\boldsymbol{\omega}\times\dot{\mathbf{r}}) = \mathbf{g} - 2(\boldsymbol{\omega}_{\perp}\times\dot{\mathbf{r}}) - 2(\boldsymbol{\omega}_{\parallel}\times\dot{\mathbf{r}}),
\]
where ω × ṙ = (ω⊥ + ω∥) × ṙ. Let ω∥ be the component of ω within the tangential
plane at the point Q of the earth’s surface (see Fig. 3.7); it points to the negative
e1-direction. ω⊥ points to the e3-direction.
We consider the dominant term −2ω⊥ × ṙ: An air parcel moving in x-direction
(south) is accelerated toward the negative y-axis, and a motion along the y-direction
causes an acceleration in x-direction. The deflection proceeds from the direction of
motion toward the right side. The wind from the west is deflected toward the south,
the north wind toward the west, the east wind toward the north, and the south wind
toward the east.
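The four deflection directions can be reproduced by evaluating the dominant term −2ω⊥ × ṙ directly. The sketch below assumes the axes e1 = south, e2 = east, e3 = up (consistent with "x-direction (south)" above) and sets ω⊥ = 1 in arbitrary units, since only directions matter; the helper names are hypothetical.

```python
def cross(a, b):
    """Vector product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def deflection(velocity, omega_perp=1.0):
    """Acceleration -2 * omega_perp * e3 x v; omega_perp > 0 in the northern hemisphere."""
    c = cross((0.0, 0.0, omega_perp), velocity)
    return (-2 * c[0], -2 * c[1], -2 * c[2])

def compass(vec):
    """Name the horizontal direction of a vector lying along +-e1 or +-e2."""
    x, y, _ = vec
    if abs(x) >= abs(y):
        return "south" if x > 0 else "north"
    return "east" if y > 0 else "west"

# A wind is named after the direction it comes from: a north wind moves south (+e1).
winds = {
    "north": (1.0, 0.0, 0.0),
    "south": (-1.0, 0.0, 0.0),
    "west": (0.0, 1.0, 0.0),
    "east": (0.0, -1.0, 0.0),
}
result = {name: compass(deflection(v)) for name, v in winds.items()}
for name, direction in result.items():
    print(f"{name} wind is deflected toward the {direction}")
```

Passing a negative `omega_perp` reverses every deflection, which corresponds to the southern hemisphere, where cos λ is negative.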
Fig. 3.7. Definition of the coordinates: 0 = origin of the inertial systems X, Y, Z; Q = origin of the moving system x, y, z; P = a point with mass m; ρ = position vector in the system X, Y, Z; and r = position vector in the x, y, z-system
Fig. 3.8. If air flows in the northern hemisphere from a high-pressure region to a low-pressure region, a left-rotating cyclone arises in the low-pressure region, and a right-rotating anticyclone is formed in the high-pressure region
The force −2mω∥ × ṙ for the north and south winds exactly equals zero. For the
west or east winds the force points along e3 or the opposite direction. Accordingly
the air masses are pushed away from or toward the ground. This force component is
however very small compared to the gravitational force mg, which also points toward
the negative e3-direction.
If we consider an air parcel moving in the southern hemisphere, then λ > π/2 and
cos λ is negative. Thus, a west wind is here deflected to the north, a north wind toward
the east, and a south wind toward the west.
EXERCISE
3.4 Movable Mass in a Rotating Tube
Problem. A tube rotates with constant angular velocity ω (relative motion) and is
inclined from the rotational axis by the angle α. A mass m inside of the tube is pulled
inward with constant velocity c by a string.
(a) What forces act on the mass?
(b) What work is performed by these forces while the mass moves from x1 to x2 ?
(Calculate the energy balance!) Numerical values: m = 5 kg, α = 45°, x1 = 1 m,
x2 = 0.5 m, ω = 2 s⁻¹, c = 5 m/s, g = 9.81 m/s².
Solution. (a) The mass m within the tube performs a relative motion with constant
velocity c = c(−1, 0, 0), and thus the resulting acceleration is composed of the guiding
acceleration af in the tube, the relative acceleration ar, and the Coriolis acceleration ac:
a = af + ar + ac.
The guiding acceleration consists in general of a translational acceleration a0 of the
vehicle, and of the accelerations at (tangential acceleration) and an (normal acceleration)
due to a rotational motion.
Fig. 3.9.
In the present problem, we are dealing with a rotation about a fixed axis eω =
(cos α, 0, sin α) with a constant angular velocity ω, so that a0 = at = 0, and the guiding
acceleration is therefore
af = a0 + at + an = an = x sin α ω² (−sin α, 0, cos α),
i.e., the guiding acceleration obviously consists of the centripetal acceleration an only.
The relative acceleration ar = 0, since the relative velocity is constant. The Coriolis
acceleration ac, defined by
ac = 2ω eω × c,
is therefore
ac = −2ωc sin α (0, 1, 0).
Therefore, for the total acceleration a we have
a = af + ac = (−xω² sin² α, −2ωc sin α, xω² sin α cos α). (3.35)
a is the result of the following forces acting on the mass m (see Fig. 3.9):
S = S(−1, 0, 0), G = mg(−cos α, 0, −sin α),
N1 = N1(0, 0, 1), N2 = N2(0, −1, 0).
The resulting total force is therefore
F = (−S − mg cos α, −N2 , N1 − mg sin α). (3.36)
With Newton’s equation
F = m · a, (3.37)
one can determine the unknown quantities S, N1 and N2: From (3.35), (3.36)
and (3.37) it follows that
(−S − mg cos α, −N2, N1 − mg sin α) = m(−xω² sin² α, −2ωc sin α, xω² sin α cos α)
⇒ S = m(xω² sin² α − g cos α),
N1 = m(xω² sin α cos α + g sin α),
N2 = 2ωmc sin α.
S becomes negative if x < (g cos α)/(ω² sin² α); i.e., the mass m would have to be
decelerated additionally within the tube if a constant velocity is to be maintained.
(b) During the motion, work is performed by the string force S, by the gravitational
force G, and by the Coriolis force N2 ; N1 is the normal force. The work performed by
the string force is
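\[
W_S = \int_{x_1}^{x_2} dW_S = -\int_{x_1}^{x_2} S(x)\,dx
= \frac{m}{2}\,\omega^2\sin^2\alpha\,(x_1^2 - x_2^2) - mg\cos\alpha\,(x_1 - x_2). \tag{3.38}
\]
The work performed by the weight force is
\[
W_G = \int_{x_1}^{x_2} dW_G = mg\cos\alpha\,(x_1 - x_2). \tag{3.39}
\]
The work performed by the Coriolis force, taking into account dx/dt = −c, is
\[
W_{N_2} = \int dW_{N_2} = -\int N_2\,ds = -\int N_2\,x\sin\alpha\,d\varphi
= -\int_{x_1}^{x_2} N_2\,x\sin\alpha\,\frac{d\varphi}{dt}\,\frac{dt}{dx}\,dx
= -m\omega^2\sin^2\alpha\,(x_1^2 - x_2^2). \tag{3.40}
\]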
Insertion of the numerical values given in the formulation of the problem yields
WS = (3.75 − 17.34) Nm = −13.59 Nm,
WG = 17.34 Nm, WN2 = −7.5 Nm.
To check the results, one uses the fact that the sum of the work performed by the
external forces must be equal to the difference of the kinetic energies (energy balance)
ΔE = WS + WN2 + WG,
where
\[
\Delta E = \frac{m}{2}\left(c^2 + x_2^2\sin^2\alpha\,\omega^2\right) - \frac{m}{2}\left(c^2 + x_1^2\sin^2\alpha\,\omega^2\right)
= -\frac{m}{2}\,\omega^2\sin^2\alpha\,(x_1^2 - x_2^2),
\]
and according to (3.38), (3.39), and (3.40)
\[
\begin{aligned}
W_S + W_G + W_{N_2} &= \frac{m}{2}\,\omega^2\sin^2\alpha\,(x_1^2 - x_2^2) - mg\cos\alpha\,(x_1 - x_2)\\
&\quad - m\omega^2\sin^2\alpha\,(x_1^2 - x_2^2) + mg\cos\alpha\,(x_1 - x_2)\\
&= -\frac{m}{2}\,\omega^2\sin^2\alpha\,(x_1^2 - x_2^2).
\end{aligned}
\]
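The energy balance can be verified with a few lines of arithmetic. The sketch below evaluates (3.38), (3.39), and (3.40) as written and compares their sum with the change in kinetic energy. It assumes x2 = 0.5 m, because that value reproduces the quoted contributions 3.75 Nm and 17.34 Nm for a mass pulled inward from x1 = 1 m; the variable names are chosen here for illustration.

```python
import math

# Numerical values of Exercise 3.4; x2 = 0.5 m is an assumption (see lead-in).
m, alpha, x1, x2 = 5.0, math.radians(45.0), 1.0, 0.5
omega, c, g = 2.0, 5.0, 9.81

s2 = math.sin(alpha) ** 2
W_S = 0.5 * m * omega**2 * s2 * (x1**2 - x2**2) - m * g * math.cos(alpha) * (x1 - x2)  # (3.38)
W_G = m * g * math.cos(alpha) * (x1 - x2)                                              # (3.39)
W_N2 = -m * omega**2 * s2 * (x1**2 - x2**2)                                            # (3.40)

def kinetic(x):
    """Kinetic energy: relative speed c plus the guiding speed x*sin(alpha)*omega."""
    return 0.5 * m * (c**2 + (x * math.sin(alpha) * omega) ** 2)

delta_E = kinetic(x2) - kinetic(x1)
total_work = W_S + W_G + W_N2

print(f"W_S = {W_S:.2f} Nm, W_G = {W_G:.2f} Nm, W_N2 = {W_N2:.2f} Nm")
print(f"sum = {total_work:.2f} Nm, delta E = {delta_E:.2f} Nm")
```

The weight terms cancel between W_S and W_G, so the balance reduces to the rotational part of the kinetic energy, as in the closing formula above.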
Part II
Mechanics of Particle Systems
So far we have considered only the mechanics of a mass point. We now proceed to
describe systems of mass points. A particle system is called a continuum if it consists
of so great a number of mass points that a description of the individual mass points is
not feasible. On the other hand, a particle system is called discrete if it consists of a
manageable number of mass points.
An idealization of a body (continuum) is the rigid body. The notion of a rigid body
implies that the distances between the individual points of the body are fixed, so that
these points cannot move relative to each other. If one considers the relative motion of
the points of a body, one speaks of a deformable medium.
4 Degrees of Freedom
The number of degrees of freedom f of a system represents the number of coordinates
that are necessary to describe the motion of the particles of the system. A mass point
that can freely move in space has 3 translational degrees of freedom: (x, y, z). If there
are n mass points freely movable in space, this system has 3n degrees of freedom:
(xi , yi , zi ), i = 1, . . . , n.
4.1 Degrees of Freedom of a Rigid Body
We look for the number of degrees of freedom of a rigid body that can freely move.
To describe a rigid body in space, one must know 3 noncollinear points of it. Hence,
one has 9 coordinates:
r1 = (x1 , y1 , z1 ), r2 = (x2 , y2 , z2 ), r3 = (x3 , y3 , z3 ),
which, however, are mutually dependent. Since by definition we are dealing with a
rigid body, the distances between any two points are constant. One obtains
(x1 − x2)² + (y1 − y2)² + (z1 − z2)² = C1² = constant,
(x1 − x3)² + (y1 − y3)² + (z1 − z3)² = C2² = constant,
(x2 − x3)² + (y2 − y3)² + (z2 − z3)² = C3² = constant.
Three coordinates can be eliminated by means of these 3 equations. The remaining 6
coordinates represent the 6 degrees of freedom. These are the 3 degrees of freedom of
translation and the 3 degrees of freedom of rotation. The motion of a rigid body can
always be understood as a translation of any of its points relative to an inertial system
and a rotation of the body about this point (Chasles’1 theorem). This is illustrated by
Fig. 4.1: ABC → A″B″C″, namely, by translation into A′B′C′ and by rotation
about the point E into A″B″C″.
1 Michel Chasles, French mathematician, b. in Épernon Nov. 15, 1793–d. Dec. 18, 1880, Paris.
Banker in Chartres; from 1841 to 1851 professor at the École Polytechnique; after 1846 professor
at the Sorbonne in Paris. Chasles is independently of J. Steiner one of the founders of the synthetic
geometry. His Aperçu historique by far surpassed the older representations of the development of
geometry and stimulated new geometrical research in his age.

Fig. 4.1. Chasles’ theorem: The translation vector depends on the rotation, and vice versa
We now consider the rigid body with one point fixed in space. The motion is com-
pletely described if we know the coordinates of two points
r1 = (x1 , y1 , z1 ) and r2 = (x2 , y2 , z2 )
and adopt the fixed point as the origin of the coordinate system. Since the body is
rigid, we have
x1² + y1² + z1² = constant, x2² + y2² + z2² = constant,
(x1 − x2)² + (y1 − y2)² + (z1 − z2)² = constant.
From these 3 equations one can eliminate 3 coordinates, so that the remaining 3 coor-
dinates describe the 3 degrees of freedom of rotation.
If a particle moves along a given curve in space, the number of degrees of freedom
is f = 1. The curve can be written in the parametric form
x = x(s), y = y(s), z = z(s),
i.e., for a given curve the position of the particle is fully determined by specifying one
parameter value s.
Fig. 4.2. Example of the parametric form: A caterpillar creeps on a blade of grass
A deformable medium or a fluid has an infinite number of degrees of freedom (e.g.,
a vibrating string, a flexible bar, a drop of fluid).
5 Center of Gravity
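Definition Let a system consist of n particles with the position vectors rν and the
masses mν for ν = 1, . . . , n. The center of gravity of this system is defined as the point S
with the position vector rs:
\[
\mathbf{r}_s = \frac{m_1\mathbf{r}_1 + m_2\mathbf{r}_2 + \cdots + m_n\mathbf{r}_n}{m_1 + m_2 + \cdots + m_n}
= \frac{\sum_{\nu=1}^{n} m_\nu\mathbf{r}_\nu}{\sum_{\nu=1}^{n} m_\nu},
\qquad
\mathbf{r}_s = \frac{1}{M}\sum_{\nu=1}^{n} m_\nu\mathbf{r}_\nu,
\]
where M = Σν mν is the total mass of the system, and
\[
M\mathbf{r}_s = \sum_{\nu=1}^{n} m_\nu\mathbf{r}_\nu
\]
is the mass moment. For systems with a continuous mass distribution over a volume V
with the volume density ρ(r), the sum Σi mi ri becomes an integral, and one obtains
\[
\mathbf{r}_s = \frac{\int_V \mathbf{r}\,\rho(\mathbf{r})\,dV}{\int_V \rho(\mathbf{r})\,dV}.
\]
The individual components are
\[
x_s = \frac{\sum_\nu m_\nu x_\nu}{M}, \qquad y_s = \frac{\sum_\nu m_\nu y_\nu}{M}, \qquad z_s = \frac{\sum_\nu m_\nu z_\nu}{M},
\]
and for a continuous mass distribution
\[
x_s = \frac{\int_V \rho\,x\,dV}{M}, \qquad y_s = \frac{\int_V \rho\,y\,dV}{M}, \qquad z_s = \frac{\int_V \rho\,z\,dV}{M},
\]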
Fig. 5.1. Definition of the center of gravity

where the total mass is given by
\[
M = \sum_\nu m_\nu \quad\text{or}\quad M = \int_V \rho\,dV.
\]
We consider three systems of masses with the centers of gravity r1 , r2 , r3 and the total
masses M1 , M2 , M3 . The system 1 consists of the mass M1 = (m11 + m12 + m13 +· · · )
with the position vectors r11 , r12 , r13 , . . .; the systems 2 and 3 are analogous. Then by
definition the centers of gravity are

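\[
\text{system 1:}\quad \mathbf{r}_{s1} = \frac{\sum_i m_{1i}\mathbf{r}_{1i}}{\sum_i m_{1i}}, \qquad
\text{system 2:}\quad \mathbf{r}_{s2} = \frac{\sum_i m_{2i}\mathbf{r}_{2i}}{\sum_i m_{2i}}, \qquad
\text{system 3:}\quad \mathbf{r}_{s3} = \frac{\sum_i m_{3i}\mathbf{r}_{3i}}{\sum_i m_{3i}}.
\]
For the center of gravity of the total system we have the same relation:
\[
\mathbf{r}_s = \frac{\sum_i m_{1i}\mathbf{r}_{1i} + \sum_i m_{2i}\mathbf{r}_{2i} + \sum_i m_{3i}\mathbf{r}_{3i}}{\sum_i m_{1i} + \sum_i m_{2i} + \sum_i m_{3i}}
= \frac{M_1\mathbf{r}_{s1} + M_2\mathbf{r}_{s2} + M_3\mathbf{r}_{s3}}{M_1 + M_2 + M_3}.
\]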
Hence, for composite systems we can determine the centers of gravity and masses of
the partial systems, and from them calculate the center of gravity of the total system.
The calculation can thereby be much simplified. This fact is often referred to as the
cluster property of the center of gravity.
The linear momentum of a particle system is the sum of the momenta of the individual
particles:
\[
\mathbf{P} = \sum_{\nu=1}^{n}\mathbf{p}_\nu = \sum_{\nu=1}^{n} m_\nu\dot{\mathbf{r}}_\nu.
\]
If we introduce the center of gravity by M rs = Σi mi ri, we see that P = M ṙs, i.e.,
the total momentum of a particle system equals the product of the total mass M united
in the center of gravity and its velocity ṙs. This means that the translation of a body
can be described by the motion of the center of gravity.
EXERCISE
5.1 Center of Gravity for a System of Three Mass Points
Problem. Find the coordinates of the center of gravity for a system of 3 mass points.
m1 = 1 g, m2 = 3 g, m3 = 10 g,
r1 = (1, 5, 7) cm, r2 = (−1, 2, 3) cm, r3 = (0, 4, 5) cm.
Solution. For the center of gravity, one finds
\[
\mathbf{r}_s = \frac{1}{14}\,(1 - 3,\; 5 + 3\cdot 2 + 10\cdot 4,\; 7 + 3\cdot 3 + 10\cdot 5)\ \text{cm}
\]
or, recalculated,
\[
\mathbf{r}_s = \frac{1}{14}\,(-2,\; 51,\; 66)\ \text{cm}.
\]
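The same numbers can be reproduced programmatically, and the cluster property of the center of gravity can be checked at the same time. This is a minimal sketch; the function name `center_of_gravity` and the use of exact fractions are choices made here for illustration.

```python
from fractions import Fraction

def center_of_gravity(masses, positions):
    """Return (M, r_s) with r_s = sum(m_nu * r_nu) / M, using exact arithmetic."""
    total = sum(masses)
    r_s = tuple(
        sum(Fraction(m) * p[i] for m, p in zip(masses, positions)) / total
        for i in range(3)
    )
    return total, r_s

masses = [1, 3, 10]                              # in g
positions = [(1, 5, 7), (-1, 2, 3), (0, 4, 5)]   # in cm

M, r_s = center_of_gravity(masses, positions)
print(M, r_s)

# Cluster property: first combine particles 1 and 2 into a partial system,
# then combine that partial system with particle 3.
M12, r12 = center_of_gravity(masses[:2], positions[:2])
M_total, r_cluster = center_of_gravity([M12, masses[2]], [r12, positions[2]])
print(M_total, r_cluster)
```

Both routes give the same point, (−2, 51, 66)/14 cm, so the partial system of particles 1 and 2 may indeed be replaced by its total mass placed at its own center of gravity.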
EXERCISE
5.2 Center of Gravity of a Pyramid
Problem. Find the center of gravity of a pyramid with edge length a and a homoge-
neous mass distribution.
Solution. Because of the homogeneous mass distribution, the mass density ρ(r) =
ρ0 = constant. The base of the pyramid is represented by the equation
x + y + z = a.
The coordinate axes are the edges, and the origin is the top. Then

\[
\mathbf{r}_s = \frac{\int_V \rho_0\,\mathbf{r}\,dV}{\int_V \rho_0\,dV} = \frac{\int_V \mathbf{r}\,dV}{\int_V dV}, \qquad dV = dx\,dy\,dz.
\]
The integration limits are evident from Fig. 5.2. The integration runs over z along
the column from z = 0 to z = a − x − y; over y along the prism from y = 0 to
y = a − x; and over x along the pyramid from x = 0 to x = a:
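\[
\mathbf{r}_s = \frac{\int_V \mathbf{r}\,dV}{\int_V dV}
= \frac{\int_{x=0}^{a}\int_{y=0}^{a-x}\int_{z=0}^{a-x-y}\mathbf{r}\,dz\,dy\,dx}{\int_{x=0}^{a}\int_{y=0}^{a-x}\int_{z=0}^{a-x-y}dz\,dy\,dx},
\]
Fig. 5.2.
with
\[
\mathbf{r} = (x, y, z) \;\Rightarrow\;
\int_V \mathbf{r}\,dV = \int_{x=0}^{a}\int_{y=0}^{a-x}\left[\left(xz,\; yz,\; \tfrac{1}{2}z^2\right)\right]_{z=0}^{z=a-x-y} dy\,dx,
\]
\[
\int_V \mathbf{r}\,dV = \int_{0}^{a}\int_{0}^{a-x}\left(x(a-x-y),\; y(a-x-y),\; \tfrac{1}{2}(a-x-y)^2\right) dy\,dx.
\]
The corresponding integration over y and x yields
\[
\int_V \mathbf{r}\,dV = \frac{a^4}{24}\,(1, 1, 1), \qquad \int_V dV = V = \frac{a^3}{6}.
\]
Thus, the center of gravity is at
\[
\mathbf{r}_s = \frac{\int_V \mathbf{r}\,dV}{\int_V dV} = \frac{a}{4}\,(1, 1, 1).
\]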
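The integration over y and x is not spelled out in the solution. As a sketch of the intermediate steps, here is the x-component together with the volume; the y- and z-components give the same value because the integration region is symmetric under permutation of the coordinates.

```latex
\[
\int_0^a\!\!\int_0^{a-x} x\,(a-x-y)\,dy\,dx
  = \int_0^a \frac{x\,(a-x)^2}{2}\,dx
  = \frac{1}{2}\left(\frac{a^4}{2} - \frac{2a^4}{3} + \frac{a^4}{4}\right)
  = \frac{a^4}{24},
\]
\[
\int_0^a\!\!\int_0^{a-x} (a-x-y)\,dy\,dx
  = \int_0^a \frac{(a-x)^2}{2}\,dx
  = \frac{a^3}{6},
\qquad
x_s = \frac{a^4/24}{a^3/6} = \frac{a}{4}.
\]
```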
EXERCISE
5.3 Center of Gravity of a Semicircle
Problem. Find the center of gravity of a semicircular disk of radius a. The surface
density is constant.
Fig. 5.3.
Solution. The surface density σ = constant. (The surface density is defined by
σ(r) = lim ΔA→0 Δm(x, y, z)/ΔA.) xs and ys represent the coordinates of the center
of gravity. We use polar coordinates for calculating the center of gravity. The equation
of the semicircle then reads
r = a, 0 ≤ ϕ ≤ π.
Because of symmetry, xs = 0, and for ys, we have
\[
y_s = \frac{\int_A \sigma\,y\,dA}{\int_A \sigma\,dA}
= \frac{\int_{\varphi=0}^{\pi}\int_{r=0}^{a}(r\sin\varphi)\,r\,dr\,d\varphi}{A}.
\]
The evaluation of the integral yields
\[
y_s = \frac{2a^3/3}{\pi a^2/2} = \frac{4a}{3\pi},
\]
i.e., the center of gravity lies at rs = (0, 4a/(3π)).
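The value 4a/(3π) can be confirmed by evaluating the polar-coordinate integral numerically. The sketch below uses a midpoint rule (a choice made here, with an assumed radius a = 2) and exploits the fact that the integrand r² sin ϕ factorizes into a radial and an angular part.

```python
import math

def semicircle_centroid_y(a, n=400):
    """Midpoint-rule estimate of y_s = (1/A) * integral of (r sin(phi)) r dr dphi."""
    dr, dphi = a / n, math.pi / n
    radial = sum(((i + 0.5) * dr) ** 2 for i in range(n)) * dr           # integral of r^2 dr
    angular = sum(math.sin((j + 0.5) * dphi) for j in range(n)) * dphi   # integral of sin(phi) dphi
    area = 0.5 * math.pi * a**2
    return radial * angular / area

a = 2.0
y_numeric = semicircle_centroid_y(a)
y_exact = 4 * a / (3 * math.pi)
print(y_numeric, y_exact)
```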
EXERCISE
5.4 Center of Gravity of a Circular Cone
Problem. Determine the center of gravity of

(a) a homogeneous circular cone with base radius a and height h; and
(b) a circular cone as in (a), with a hemisphere of radius a set onto its base.
Fig. 5.4.
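Solution. (a) Because of symmetry, the center of gravity is on the z-axis, i.e., xs =
ys = 0. For the z-component, we have
\[
z_s = \frac{\int_K z\,dV}{\int_K dV} = \frac{\int_K z\,dV}{(1/3)\pi a^2 h}.
\]
We adopt cylindrical coordinates for evaluating the integral:
\[
\begin{aligned}
\int_K z\,dV &= \int_{\varphi=0}^{2\pi}\int_{\rho=0}^{a}\int_{z=0}^{h(1-\rho/a)} z\,\rho\,d\rho\,d\varphi\,dz\\
&= 2\pi\int_{\rho=0}^{a}\frac{1}{2}\,h^2\left(1 - \frac{\rho}{a}\right)^2\rho\,d\rho\\
&= \pi h^2\left[\frac{1}{2}\rho^2 - \frac{2\rho^3}{3a} + \frac{\rho^4}{4a^2}\right]_0^a = \pi\,\frac{a^2h^2}{12},
\end{aligned}
\]
\[
z_s = \frac{\pi h^2 a^2\cdot 3}{12\,\pi a^2 h} = \frac{1}{4}\,h.
\]
Thus, the center of gravity of a circular cone is independent of the radius of the base.
(b) See Fig. 5.5. One then has
\[
z_s = \frac{\int_{\mathrm{cone}} z\,dV + \int_{\mathrm{hemisphere}} z\,dV}{V_{\mathrm{cone}} + V_{\mathrm{hemisphere}}}
= \frac{\frac{1}{12}\pi h^2a^2 + \int_{\mathrm{hemisphere}} z\,dV}{(\pi/3)(h + 2a)\,a^2},
\]
\[
\begin{aligned}
\int_{\mathrm{hemisphere}} z\,dV &= \int_{\varphi=0}^{2\pi}\int_{\rho=0}^{a}\int_{z=-\sqrt{a^2-\rho^2}}^{0} z\,\rho\,d\varphi\,d\rho\,dz\\
&= \pi\int_{\rho=0}^{a}(\rho^2 - a^2)\,\rho\,d\rho
= \pi\left[\frac{\rho^4}{4} - \frac{a^2\rho^2}{2}\right]_0^a
= -\frac{\pi a^4}{4}.
\end{aligned}
\]
Fig. 5.5. Because of symmetry the center of gravity is again on the z-axis
Hence, the center of gravity is given by
\[
z_s = \frac{\frac{1}{12}\pi a^2h^2 - \frac{1}{4}\pi a^4}{\frac{a^2\pi}{3}(h + 2a)} = \frac{1}{4}\,\frac{h^2 - 3a^2}{h + 2a},
\qquad y_s = 0, \quad x_s = 0.
\]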
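Both results can be cross-checked by slicing the bodies into thin discs perpendicular to the z-axis. The sketch below assumes the geometry used in the integrals above (base of the cone at z = 0, apex at z = h, hemisphere below the base) and the sample dimensions a = 1, h = 3; the function name and the midpoint rule are illustrative choices.

```python
import math

def moment_and_volume(area, z_min, z_max, n=4000):
    """Midpoint rule for (integral of z*A(z) dz, integral of A(z) dz) of a sliced solid."""
    dz = (z_max - z_min) / n
    moment = volume = 0.0
    for i in range(n):
        z = z_min + (i + 0.5) * dz
        moment += z * area(z) * dz
        volume += area(z) * dz
    return moment, volume

a, h = 1.0, 3.0  # assumed dimensions

def cone_area(z):
    """Disc of radius a*(1 - z/h) at height z, for 0 <= z <= h."""
    return math.pi * (a * (1.0 - z / h)) ** 2

def hemisphere_area(z):
    """Disc of radius sqrt(a^2 - z^2) at height z, for -a <= z <= 0."""
    return math.pi * (a**2 - z**2)

m_cone, v_cone = moment_and_volume(cone_area, 0.0, h)
m_hemi, v_hemi = moment_and_volume(hemisphere_area, -a, 0.0)

zs_cone = m_cone / v_cone
zs_both = (m_cone + m_hemi) / (v_cone + v_hemi)
print(zs_cone, h / 4)
print(zs_both, (h**2 - 3 * a**2) / (4 * (h + 2 * a)))
```

For h = √3·a the combined center of gravity lies exactly in the common base plane z = 0, as the closed formula shows.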
EXERCISE
5.5 Momentary Center and Pole Path
Problem.
(a) Show that any positional variation of a rigid disk in the plane can be represented
by a pure rotation about a point at a finite distance or at infinity. (Hint: The position
of the disk is already fixed by specifying two points A and B.)
(b) Show by “differential” variation of position: The planar motion of a rigid disk
can be described at any moment by a pure rotation about a point varying with the
motion, the so-called momentary center. The geometric locus of these momentary
centers is called the pole path or the fixed pole curve.
(c) Calculate the fixed pole curve r(ϕ) for a ladder sliding on two perpendicular walls.
(d) Calculate the fixed pole curve r(ϕ) for a bar of length l that can move in the guide
shown in Fig 5.6.
Fig. 5.6.
Fig. 5.7.
Solution. (a) For describing the motion of the disk we take the (arbitrary) straight line
AB; it turns into the straight line A1 B1 . The intersection M of the mid-perpendiculars
onto AA1 and BB1 is the desired center of rotation.
Argument: The triangles ABM and A1 B1 M are congruent. Hence the motion can
be considered as a rotation of the triangle ABM (involving the straight line AB) about
M by the angle ϕ.
(b) For an infinitely small rotation by dϕ the same considerations hold. But now
the individual turning points vary. These are the so-called momentary centers. In a
differential rotation about a momentary center M, for any point the path element dr
and the velocity vector v point along the same direction and are perpendicular to the
connecting lines to M (see Fig. 5.8). The geometric locus of the momentary centers is
called the pole curve.
Fig. 5.8.
(c) According to (b), one gets Fig. 5.9. The straight line l = AB forms a diagonal
of the rectangle OBMA. Since the diagonals of a rectangle are equal, OM = AB = l, and M must move along
a circle of radius l.
Fig. 5.9.
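The construction in (c) is easy to reproduce numerically. The sketch assumes that the walls lie along the coordinate axes with the corner O at the origin, that end A slides on the vertical wall and end B on the floor, and that the ladder position is parametrized by the angle θ between ladder and floor; these labels and the function name are assumptions made for illustration.

```python
import math

def momentary_center(l, theta):
    """Momentary center M of a ladder of length l inclined by theta against the floor."""
    A = (0.0, l * math.sin(theta))   # upper end, slides along the wall (y-axis)
    B = (l * math.cos(theta), 0.0)   # lower end, slides along the floor (x-axis)
    # The velocity of A is vertical, so M lies on the horizontal line through A;
    # the velocity of B is horizontal, so M lies on the vertical line through B.
    return (B[0], A[1])

l = 2.0
radii = [math.hypot(*momentary_center(l, math.radians(deg))) for deg in range(5, 90, 5)]
print(min(radii), max(radii))
```

Every computed distance OM equals the ladder length l, so the fixed pole curve is an arc of the circle of radius l about O.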
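(d) According to (b), one can construct Fig. 5.10. Evidently,
\[
\sin\alpha = \frac{a}{l}\,(1 - \sin\varphi) \quad\text{and}\quad \cos\alpha = \sqrt{1 - \frac{a^2}{l^2}\,(1 - \sin\varphi)^2},
\]
\[
AC = l\cos\alpha - a\cos\varphi,
\]
\[
OM = \frac{AC}{\cos\varphi} = l\,\frac{\cos\alpha}{\cos\varphi} - a = l\,\sqrt{\frac{1 - (a^2/l^2)(1 - \sin\varphi)^2}{\cos^2\varphi}} - a.
\]
Thus, in polar coordinates the equation of the fixed pole curve r(ϕ) reads
\[
r = r(\varphi) = OM = -a + l\,\sqrt{\frac{1 - (a^2/l^2)(1 - \sin\varphi)^2}{\cos^2\varphi}}.
\]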
Fig. 5.10.
EXAMPLE
5.6 Scattering in a Central Field
(1) The problem

The two-body problem appeared for the first time in recent physics in investigations
of planetary motion. However, the classical formulation of the two-body problem provides
information both on the bound state as well as on the unbound state (scattering
state) of a system.
The study of the unbound states of a system became of great importance in modern
physics. One learns about the mutual interaction of two objects by scattering them
off each other and observing the path of the scattered particles as a function of the
incident energy and of other path parameters. The objects studied in this way are usu-
ally molecules, atoms, atomic nuclei, and elementary particles. Scattering processes
in these microscopic regions must be described by quantum mechanics. However, one
can obtain information on scattering processes by means of classical mechanics which
is confirmed by a quantum mechanical calculation. Moreover, one may learn the meth-
ods for describing scattering phenomena by studying the classical case.
The schematic arrangement of a scattering experiment is shown in Fig. 5.11. We
consider a homogeneous beam of incoming particles (projectiles) of the same mass
and energy. The force acting on a particle is assumed to drop to zero at large distances
from the scattering center. This guarantees that the interaction is somehow localized.
Let the initial velocity v0 of each projectile relative to the force center be so large that
the system is in the unbound state, i.e., for t → ∞ the distance between the two scat-
tering particles shall become arbitrarily large. For a repulsive potential this happens
for any value of v0 ; this does not hold for an attractive potential.
Fig. 5.11. Schematic setup of a scattering experiment
The interaction of a projectile with the target particle manifests itself by the fact
that the flight direction after the collision differs from that before the collision (the
usage of the words “before” and “after” in this context presupposes a more or less
finite range of the interaction potential).
(2) Definition of the cross section

The measured quantities are count rates (number of particles per second arriving in the
detector, which is assumed to be small). These count rates depend first on physical data
such as the kind of projectile and target, the incident energy, and the scattering
direction, and second on the specific experimental conditions such as detector size,
distance between target and detector, number of scattering centers, or incident intensity.
In order to have a quantity that is independent of the latter features, one defines the
differential cross section
\[ \frac{d\sigma}{d\Omega}(\vartheta,\varphi) := \frac{(\text{number of particles scattered into } d\Omega)/\mathrm{s}}{d\Omega \cdot n \cdot I}. \tag{5.1} \]
Here, n is the number of scattering centers and I is the beam intensity, which is given
by (number of projectiles)/(s·m²). The scattering direction is represented here by ϑ
and ϕ. ϑ is the angle between the asymptotic scattering direction and the incidence
direction; it is called the scattering angle. ϕ is the azimuth angle. dΩ denotes the
solid-angle element covered by the detector. Since we assumed the detector to be small,
we have
\[ d\Omega = \sin\vartheta \, d\vartheta \, d\varphi, \tag{5.2} \]
where dϑ and dϕ specify the detector size. Note that (dσ/dΩ)(ϑ, ϕ) is defined
by (5.1) and is not a derivative of a quantity σ with respect to Ω. Obviously,
(dσ/dΩ)(ϑ, ϕ) has the dimension of an area. The standard unit is
\[ 1\,\mathrm{b} = 1\,\mathrm{barn} = 100\,\mathrm{fm}^2 = 10^{-28}\,\mathrm{m}^2. \tag{5.3} \]
Often the differential cross section will be independent of the azimuth angle ϕ (we
shall restrict ourselves to this case), and one can define
\[ \frac{d\sigma}{d\vartheta} := 2\pi \sin\vartheta \, \frac{d\sigma}{d\Omega}(\vartheta,\varphi); \tag{5.4} \]
see Fig. 5.12.
Fig. 5.12. Definition of the cross section
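Correspondingly,
\[ \frac{d\sigma}{d\vartheta}(\vartheta) = \frac{(\text{number of particles scattered into } [\vartheta, \vartheta + d\vartheta])/\mathrm{s}}{d\vartheta \cdot n \cdot I}. \tag{5.5} \]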
Finally, we introduce the total cross section, defined by
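\[ \sigma_{\mathrm{tot}} = \int d\Omega \, \frac{d\sigma}{d\Omega}(\vartheta,\varphi) = \int_0^{2\pi} d\varphi \int_0^{\pi} d\vartheta \, \sin\vartheta \, \frac{d\sigma}{d\Omega} = \int_0^{\pi} d\vartheta \, \frac{d\sigma}{d\vartheta}(\vartheta). \tag{5.6} \]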
It depends only on the kinds of particles, and possibly on the incidence energy:
\[ \sigma_{\mathrm{tot}} = \frac{(\text{number of scattered particles})/\mathrm{s}}{n \cdot I}. \tag{5.7} \]
Like dσ/dΩ, it has the dimension of an area. It equals the size of the (fictitious) area
of a scattering center which must be traversed perpendicularly by the projectiles in order
to be deflected at all.
(3) Introduction of the collision parameter, its relation to the scattering angle,
and the formula for the differential cross section
It is clear that the scattering angle ϑ at fixed energy can depend only on the collision
parameter b, since the initial position and the initial velocity of the particle are then
specified. The collision parameter is defined as the perpendicular distance of the
asymptotic incidence direction of the projectile from the initial position of the scatterer.
Hence, for E = constant the scattering angle is
\[ \vartheta = \vartheta(b). \tag{5.8} \]
Since motion in classical mechanics is deterministic, this relation is unique. (This
statement is no longer valid in quantum mechanics.) Thus,
\[ b = b(\vartheta), \tag{5.9} \]
which means that, by observing an arbitrary particle at a definite scattering angle ϑ,
one can determine in a straightforward manner the value of the collision parameter b
of the incident particle. This fact allows the following consideration. The number dN
of projectiles per second that move toward a scattering center with values b′ of the
collision parameter in the range
\[ b \le b' \le b + db \]
is
\[ dN = I \cdot 2\pi b \, db \quad\text{or}\quad dN = I \cdot 2\pi b \left|\frac{db}{d\vartheta}\right| d\vartheta. \]
The absolute value is taken because the number dN by definition cannot become
negative. Exactly this number of particles is scattered into the solid-angle element
\[ dR = 2\pi \sin\vartheta \, d\vartheta. \]
By inserting this into (5.5) (with n = 1 for a single scattering center), we get
\[ \frac{d\sigma}{d\vartheta}(\vartheta) = 2\pi b \left|\frac{db}{d\vartheta}\right|, \tag{5.10} \]
and for the differential cross section
\[ \frac{d\sigma}{d\Omega}(\vartheta) = \frac{b(\vartheta)}{\sin\vartheta} \left|\frac{db}{d\vartheta}\right|. \tag{5.11} \]
This is just the desired relation. The function b(ϑ) is determined by the force law that
governs the particular case. One realizes that the knowledge of the differential cross
section allows one to determine the interaction potential between the projectile and
the target particle.
In general, the scattering angle will depend not only on the collision parameter but
also on the incident energy. As a consequence, the differential cross section also be-
comes energy dependent. Hence, one can measure the differential cross section as a
function of the projectile energy by observing the scattered particles at a fixed scatter-
ing angle.
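Relation (5.11) can be tried out numerically. The following is a minimal sketch, not part of the original example: it evaluates (5.11) with a central difference for db/dϑ. The test case b(ϑ) = a cos(ϑ/2) (a hard sphere of radius a) and all function names are assumptions chosen for illustration; for this case the result should be the constant a²/4.

```python
import math

def dsigma_domega(b_of_theta, theta, h=1e-6):
    """Evaluate (5.11): b(theta)/sin(theta) * |db/dtheta|, using a central difference."""
    db_dtheta = (b_of_theta(theta + h) - b_of_theta(theta - h)) / (2.0 * h)
    return b_of_theta(theta) / math.sin(theta) * abs(db_dtheta)

def hard_sphere_b(theta, a=1.0):
    # Assumed test case (not from the text): hard sphere of radius a.
    return a * math.cos(theta / 2.0)

# For the hard sphere the cross section is isotropic and equals a^2/4.
for theta in (0.3, 1.0, 2.5):
    print(theta, dsigma_domega(hard_sphere_b, theta))
```

Any other monotonic function b(ϑ) can be passed in the same way; the finite-difference step h is a tuning choice, not something fixed by the theory.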
(4) Transition to the center-of-mass system, and transformation of the differential

cross section from the center-of-mass system to the laboratory system
The considerations of the last section are to some extent independent of the reference
system. If we move from the laboratory system S to another system S′ that moves
with constant velocity V parallel to the beam axis, the scattering angle and the
differential cross section (5.5) will change, but the derivation in the last section remains
unchanged, so that the relation (5.11) remains valid.
Fig. 5.13. Scattering in the laboratory system (S) and in the center-of-mass system (S′)
This has a practical meaning inasmuch as cross sections are always measured in
the laboratory system S where the target is at rest, but the calculation of b(ϑ′) often
simplifies in the center-of-mass system S′. We therefore derive a relation between
these two cross sections. In the following, the nonprimed and primed quantities shall
always refer to these two systems, respectively.
First, we investigate the relation between the scattering angles ϑ and ϑ′. Let $\mathbf{v}_1^{f}$
and $\mathbf{v}_1'^{f}$ be the asymptotic final velocities (f = final) of the projectile of mass m1 in the
systems S and S′, respectively. V is the relative velocity of the two systems.
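Fig. 5.14.
From Fig. 5.14, one immediately sees that
\[ \tan\vartheta = \frac{v_1'^{f} \sin\vartheta'}{v_1'^{f} \cos\vartheta' + V} = \frac{\sin\vartheta'}{\cos\vartheta' + V/v_1'^{f}}, \]
where V stands for the magnitude of $\mathbf{V}$, and analogously for $v_1'^{f}$. Furthermore,
\[ m_1 v_1^{i} = (m_1 + m_2) V, \]
where $v_1^{i}$ is the initial velocity of the projectile in the laboratory system (i = initial),
and
\[ v_1^{i} = V + v_1'^{i}. \]
Because $m_1 v_1'^{i} = m_2 v_2'^{i}$ and $m_1 v_1'^{f} = m_2 v_2'^{f}$, we have for elastic scattering
($E_{\mathrm{kin}}'^{i} = E_{\mathrm{kin}}'^{f}$) that $v_1'^{i} = v_1'^{f}$, and therefore,
\[ \frac{V}{v_1'^{f}} = \frac{m_1}{m_2}. \]
Hence,
\[ \tan\vartheta = \frac{\sin\vartheta'}{\cos\vartheta' + m_1/m_2}. \tag{5.12} \]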
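This relation defines the function ϑ′(ϑ); we will not give it explicitly. If a projectile
in S is scattered into the ring dR with the "radius" ϑ and the width dϑ (see
Fig. 5.12), it will in S′ be scattered into a ring dR′ with the "radius" ϑ′(ϑ) and the
width dϑ′ = (dϑ′/dϑ) dϑ. The number of particles scattered into dR in S and into dR′
in S′ is therefore identical, and with (5.5), we get
\[ \frac{d\sigma}{d\vartheta}(\vartheta) \cdot d\vartheta = \frac{d\sigma'}{d\vartheta'}(\vartheta') \cdot d\vartheta' = \frac{d\sigma'}{d\vartheta'}(\vartheta') \, \frac{d\vartheta'}{d\vartheta} \, d\vartheta, \]
thus,
\[ \frac{d\sigma}{d\vartheta}(\vartheta) = \frac{d\sigma'}{d\vartheta'}(\vartheta') \, \frac{d\vartheta'}{d\vartheta}, \tag{5.13} \]
or
\[ \frac{d\sigma}{d\Omega}(\vartheta) = \frac{d\sigma'}{d\Omega'}(\vartheta') \, \frac{\sin\vartheta'}{\sin\vartheta} \, \frac{d\vartheta'}{d\vartheta}. \tag{5.14} \]
This is the desired connection.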
The difference between the scattering angles and the cross sections, respectively, is
obviously determined by the mass ratio of projectile and target particle (see (5.12)).
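The transformation (5.12) and (5.14) can be sketched numerically as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the derivative dϑ/dϑ′ is taken by a central difference and inverted, and the mapping ϑ′ → ϑ is assumed to be one-to-one (which holds for m1 ≤ m2). The isotropic center-of-mass cross section used at the end is an assumed test input.

```python
import math

def lab_angle(theta_cm, mass_ratio):
    """Laboratory angle from (5.12); mass_ratio = m1/m2.
    atan2 keeps the result in [0, pi] because sin(theta_cm) >= 0."""
    return math.atan2(math.sin(theta_cm), math.cos(theta_cm) + mass_ratio)

def lab_cross_section(dsigma_cm, theta_cm, mass_ratio, h=1e-6):
    """Laboratory dsigma/dOmega from (5.14), evaluated at lab_angle(theta_cm)."""
    theta = lab_angle(theta_cm, mass_ratio)
    # d(theta)/d(theta') by central difference; (5.14) needs its inverse.
    dtheta_dthetacm = (lab_angle(theta_cm + h, mass_ratio)
                       - lab_angle(theta_cm - h, mass_ratio)) / (2.0 * h)
    return (dsigma_cm(theta_cm) * math.sin(theta_cm) / math.sin(theta)
            / abs(dtheta_dthetacm))

# Equal masses: the laboratory angle is half the center-of-mass angle.
print(lab_angle(1.0, 1.0))
# Assumed isotropic center-of-mass cross section of unit size.
print(lab_cross_section(lambda t: 1.0, 1.0, 1.0))
```

For equal masses and an isotropic center-of-mass distribution, the laboratory distribution comes out proportional to cos ϑ, which shows how strongly the mass ratio shapes what is measured.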
EXERCISE
5.7 Rutherford Scattering Cross Section

Problem. A particle of mass m moves from infinity with the collision parameter b
toward a force center. The central force is inversely proportional to the square of the
distance:
\[ F = k r^{-2}. \]
(a) Calculate the scattering angle as a function of b and of the initial velocity of the
particle.
(b) What are the differential and the total cross sections?
Solution. (a) From the discussion of the Kepler problem, we know that the underlying
force law has the form
\[ F = -\frac{k}{r^2}. \tag{5.15} \]
The minus sign means that the force is attractive. The path equation reads¹
\[ \frac{1}{r} = \frac{mk}{l^2}\left(1 + \sqrt{1 + \frac{2El^2}{mk^2}}\,\cos(\theta - \theta')\right) \tag{5.16} \]
(E = initial energy, l = angular momentum, m = mass of the particle, θ′ = integration
constant). With the standard abbreviation
\[ \varepsilon = \sqrt{1 + \frac{2El^2}{mk^2}}, \tag{5.17} \]
one can write for (5.16)
\[ \frac{1}{r} = \frac{mk}{l^2}\bigl(1 + \varepsilon\cos(\theta - \theta')\bigr). \tag{5.18} \]
¹ See W. Greiner: Classical Mechanics: Point Particles and Relativity, 1st ed., Springer, Berlin
(2004).
The path is characterized by ε:
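\[ \begin{aligned}
\varepsilon > 1,\quad & E > 0: && \text{hyperbola}, \\
\varepsilon = 1,\quad & E = 0: && \text{parabola}, \\
\varepsilon < 1,\quad & E < 0: && \text{ellipse}, \\
\varepsilon = 0,\quad & E = -\frac{mk^2}{2l^2}: && \text{circle}.
\end{aligned} \tag{5.19} \]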
In the given problem the force law is
\[ F = \frac{k}{r^2}. \tag{5.20} \]
The force is repulsive. For illustration, we consider the scattering of charged particles
by a Coulomb field (e.g., atomic nuclei by atomic nuclei, protons by nuclei, or electrons
by electrons, etc.). The scattering force center is created by a fixed charge −Ze
and acts on the particle with the charge −Z′e. The force is then
\[ F = \frac{ZZ'e^2}{r^2}. \tag{5.21} \]
If we set k = −ZZ′e², we can directly take over the equations for an attractive potential.
The path equation (5.18) now reads
\[ \frac{1}{r} = -\frac{mZZ'e^2}{l^2}\,(1 + \varepsilon\cos\theta). \tag{5.22} \]
The coordinates were rotated so that θ′ = 0. For ε (see (5.17)) it follows that
\[ \varepsilon = \sqrt{1 + \frac{2El^2}{m(ZZ'e^2)^2}} = \sqrt{1 + \left(\frac{2Eb}{ZZ'e^2}\right)^2}. \tag{5.23} \]
Here, we used the relation
\[ l = m b v_\infty = b\sqrt{2mE}, \qquad E = \frac{1}{2} m v_\infty^2 \tag{5.24} \]
between angular momentum (l) and collision parameter (b).

Since ε > 1, (5.22) represents a hyperbola (see (5.19)). Because of the minus sign,
the values of θ for the path are restricted to values for which
\[ \cos\theta < -\frac{1}{\varepsilon} \tag{5.25} \]
(see Fig. 5.15). Note that the force center for repulsive forces is in the outer focal point
(see Fig. 5.16).
Fig. 5.15. Region of θ for repulsive Coulomb scattering
The change of θ that occurs if the particle comes from infinity, is scattered, and
moves to infinity again equals the angle φ between the asymptotes, which is the
supplement of the scattering angle θ (see Fig. 5.16). In what follows, θ without an index
denotes the scattering angle, while θ1 and θ2 denote limiting values of the polar angle.
Fig. 5.16. Illustration of the hyperbolic path of a particle that is pushed off by a force
center. The force center lies in the outer focal point
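From Fig. 5.15 and (5.25), it follows that
\[ \cos\left(\frac{\pi}{2} - \frac{\theta}{2}\right) = \sin\frac{\theta}{2} = \cos\frac{\varphi}{2} = \frac{1}{\varepsilon}. \tag{5.26} \]
The relation cos(φ/2) = 1/ε can be proved as follows: The two limiting angles θ1 and
θ2 satisfy the condition
\[ \cos\theta_1 = -\frac{1}{\varepsilon}, \qquad \cos\theta_2 = -\frac{1}{\varepsilon}. \tag{5.27} \]
From this it follows that (see Fig. 5.17):
\[ \sin\theta_1 = -\sin\theta_2, \qquad \cos\frac{\theta_1}{2} = -\cos\frac{\theta_2}{2}. \tag{5.28} \]
Fig. 5.17. The limiting angles θ1 and θ2 have the same cosine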
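The first of these equations can be rewritten as
\[ 2\cos\frac{\theta_1}{2}\sin\frac{\theta_1}{2} = \sin\theta_1 = -\sin\theta_2 = -2\cos\frac{\theta_2}{2}\sin\frac{\theta_2}{2}, \tag{5.29} \]
and therefore,
\[ \sin\frac{\theta_1}{2} = \sin\frac{\theta_2}{2}. \tag{5.30} \]
We look for cos(φ/2) = cos((θ2 − θ1)/2), with φ = θ2 − θ1:
\[ \cos\frac{\varphi}{2} = \cos\left(\frac{\theta_2}{2} - \frac{\theta_1}{2}\right) = \cos\frac{\theta_2}{2}\cos\frac{\theta_1}{2} + \sin\frac{\theta_2}{2}\sin\frac{\theta_1}{2} = -\cos^2\frac{\theta_1}{2} + \sin^2\frac{\theta_1}{2}. \tag{5.31} \]
From cos θ1 = −1/ε, it follows that
\[ -\frac{1}{\varepsilon} = \cos\theta_1 = \cos^2\frac{\theta_1}{2} - \sin^2\frac{\theta_1}{2} \tag{5.32} \]
\[ \Rightarrow\quad \sin^2\frac{\theta_1}{2} = \cos^2\frac{\theta_1}{2} + \frac{1}{\varepsilon}. \tag{5.33} \]
Insertion into (5.31) yields
\[ \cos\frac{\varphi}{2} = -\cos^2\frac{\theta_1}{2} + \cos^2\frac{\theta_1}{2} + \frac{1}{\varepsilon} = \frac{1}{\varepsilon}. \tag{5.34} \]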
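From (5.26) and (5.23) we find
\[ \frac{1}{\sin^2(\theta/2)} = 1 + \left(\frac{2Eb}{ZZ'e^2}\right)^2 \]
\[ \Leftrightarrow\quad \frac{1}{\sin^2(\theta/2)} - 1 = \frac{1 - \sin^2(\theta/2)}{\sin^2(\theta/2)} = \cot^2\frac{\theta}{2} = \left(\frac{2Eb}{ZZ'e^2}\right)^2 \]
\[ \Rightarrow\quad \frac{\theta}{2} = \operatorname{arccot}\left(\frac{2Eb}{ZZ'e^2}\right). \tag{5.35} \]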
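(b) From (5.35) it follows that
\[ b = \frac{ZZ'e^2}{2E}\cot\frac{\theta}{2} \quad\Rightarrow\quad \frac{db}{d\theta} = -\frac{ZZ'e^2}{4E}\,\frac{1}{\sin^2(\theta/2)}. \tag{5.36} \]
The differential cross section as a function of θ is given by
\[ \frac{d\sigma}{d\Omega} = -\frac{b}{\sin\theta}\,\frac{db}{d\theta}. \tag{5.37} \]
Thus, one obtains
\[ \frac{d\sigma}{d\Omega} = \frac{(ZZ'e^2)^2}{2E\sin\theta}\cdot\frac{1}{4E\sin^2(\theta/2)}\,\cot\frac{\theta}{2} = \frac{1}{2}\left(\frac{ZZ'e^2}{2E}\right)^2 \frac{\cot(\theta/2)}{\sin\theta\,\sin^2(\theta/2)}, \tag{5.38} \]
and with the identity
\[ \sin\theta = 2\sin\frac{\theta}{2}\cos\frac{\theta}{2} \tag{5.39} \]
it follows that
\[ \frac{d\sigma}{d\Omega} = \frac{1}{4}\left(\frac{ZZ'e^2}{2E}\right)^2 \frac{1}{\sin^4(\theta/2)}. \tag{5.40} \]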
This is the well-known Rutherford scattering formula. The total cross section is
calculated according to
\[ \sigma_{\mathrm{total}} = \int \frac{d\sigma}{d\Omega}(\Omega)\,d\Omega = 2\pi\int \frac{d\sigma}{d\Omega}(\theta)\,\sin\theta\,d\theta. \tag{5.41} \]
By inserting (dσ/dΩ)(θ) from (5.40), one quickly realizes that the expression diverges
because of the strong singularity at θ = 0. This is due to the long-range nature of the
Coulomb force. If one uses potentials which decrease faster than 1/r, this singularity
disappears.
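The steps from (5.36) to (5.40) can be cross-checked numerically. The sketch below is an illustration only: k stands for the product ZZ′e², the units are arbitrary, and the function names are hypothetical. It compares the closed form (5.40) with a finite-difference evaluation of (5.37), and it shows the divergence of the total cross section through the cross section for scattering by more than a minimum angle, which is the area π b(θ_min)² of the corresponding disk of collision parameters.

```python
import math

def impact_parameter(theta, k, energy):
    """b(theta) from (5.36); k stands for Z Z' e^2 (arbitrary units)."""
    return k / (2.0 * energy) / math.tan(theta / 2.0)

def rutherford(theta, k, energy):
    """Closed form (5.40)."""
    return 0.25 * (k / (2.0 * energy)) ** 2 / math.sin(theta / 2.0) ** 4

def from_impact_parameter(theta, k, energy, h=1e-6):
    """(5.37) with db/dtheta taken by a central difference."""
    db = (impact_parameter(theta + h, k, energy)
          - impact_parameter(theta - h, k, energy)) / (2.0 * h)
    return -impact_parameter(theta, k, energy) / math.sin(theta) * db

def cross_section_beyond(theta_min, k, energy):
    """Cross section for deflection by more than theta_min: pi * b(theta_min)^2."""
    return math.pi * impact_parameter(theta_min, k, energy) ** 2

for theta in (0.5, 1.5, 3.0):
    print(theta, rutherford(theta, 1.0, 1.0), from_impact_parameter(theta, 1.0, 1.0))
# Grows without bound as theta_min -> 0, reflecting the divergence of (5.41).
for theta_min in (0.1, 0.01, 0.001):
    print(theta_min, cross_section_beyond(theta_min, 1.0, 1.0))
```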
EXERCISE
5.8 Scattering of a Particle by a Spherical Square Well Potential
Problem. A particle is scattered by a spherical square well potential with radius a
and depth U0:
\[ U = 0 \quad (r > a), \qquad U = -U_0 \quad (r \le a). \]
Calculate the differential and the total cross section.
Hint: Use the refraction law for particles at sharp surfaces, which results from the
following consideration: Let the velocity of the particle before scattering by a sharp
potential well be v1 = v∞ and after scattering v2. Due to momentum conservation
perpendicular to the normal at the point of incidence ("transverse momentum
conservation") one has
\[ v_\infty \sin\alpha = v_2 \sin\beta \tag{5.42} \]
\[ \Rightarrow\quad \frac{\sin\alpha}{\sin\beta} = \frac{v_2}{v_\infty}. \tag{5.43} \]
From the energy conservation law it follows that
\[ E = T + U = \frac{1}{2} m v_\infty^2 + U_1 = \frac{1}{2} m v_2^2 + U_2. \tag{5.44} \]
Solving for v2 yields
\[ v_2 = \sqrt{v_\infty^2 + \frac{2}{m}(U_1 - U_2)} = \sqrt{v_\infty^2 + \frac{2}{m}U_0}. \tag{5.45} \]
Insertion into (5.43) finally yields
\[ n = \frac{\sin\alpha}{\sin\beta} = \frac{\sqrt{v_\infty^2 + (2/m)U_0}}{v_\infty} = \sqrt{1 + \frac{2U_0}{m v_\infty^2}}. \tag{5.46} \]
Solution. The straight path of the particle is refracted when entering and leaving the
field. We have the relation
\[ \frac{\sin\alpha}{\sin\beta} = n, \tag{5.47} \]
where according to (5.46)
\[ n = \sqrt{1 + \frac{2U_0}{m v_\infty^2}}. \]
The deflection angle is (see Fig. 5.18)
\[ \chi = 2(\alpha - \beta), \]
so that β = α − χ/2 and
\[ \frac{\sin\beta}{\sin\alpha} = \frac{\sin(\alpha - \chi/2)}{\sin\alpha} = \frac{\sin\alpha\cos(\chi/2) - \cos\alpha\sin(\chi/2)}{\sin\alpha} = \cos\frac{\chi}{2} - \cot\alpha\,\sin\frac{\chi}{2} = \frac{1}{n}. \tag{5.48} \]
Fig. 5.18. In the inner and outer region of the spherical potential well the particle
moves along straight lines. When passing the surface it will be refracted
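From Fig. 5.18, we have
\[ a\sin\alpha = \rho, \tag{5.49} \]
where ρ is the collision parameter. Because sin²α + cos²α = 1, it follows that
\[ \cos\alpha = \sqrt{1 - \left(\frac{\rho}{a}\right)^2}. \tag{5.50} \]
Now we can eliminate α from (5.48):
\[ \frac{\cos(\chi/2) - 1/n}{\sin(\chi/2)} = \cot\alpha = \frac{\cos\alpha}{\sin\alpha} = \frac{a\cos\alpha}{\rho}, \tag{5.51} \]
and with (5.50), we get
\[ \rho = a\sqrt{1 - (\rho/a)^2}\;\frac{\sin(\chi/2)}{\cos(\chi/2) - 1/n}, \qquad \rho^2 = \frac{a^2\sin^2(\chi/2) - \rho^2\sin^2(\chi/2)}{(\cos(\chi/2) - 1/n)^2}. \]
Solving for ρ² gives
\[ \rho^2 = \frac{a^2\sin^2(\chi/2)}{(\cos(\chi/2) - 1/n)^2 + \sin^2(\chi/2)} = \frac{a^2\sin^2(\chi/2)}{1 - (2/n)\cos(\chi/2) + 1/n^2} \]
\[ \Rightarrow\quad \rho^2 = a^2\,\frac{n^2\sin^2(\chi/2)}{n^2 - 2n\cos(\chi/2) + 1}. \tag{5.52} \]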
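To get the cross section, we differentiate
\[ \rho = a\,\frac{n\sin(\chi/2)}{(n^2 - 2n\cos(\chi/2) + 1)^{1/2}} \tag{5.53} \]
with respect to χ:
\[ \begin{aligned}
\frac{d\rho}{d\chi} &= \frac{\frac{an}{2}\cos(\chi/2)}{(n^2 - 2n\cos(\chi/2) + 1)^{1/2}} - \frac{1}{2}\,\frac{an\sin(\chi/2)\cdot n\sin(\chi/2)}{(n^2 - 2n\cos(\chi/2) + 1)^{3/2}} \\
&= \frac{\frac{an}{2}\cos\frac{\chi}{2}\left(n^2 + 1 - 2n\cos\frac{\chi}{2}\right) - \frac{1}{2}an^2\sin^2\frac{\chi}{2}}{\left(n^2 + 1 - 2n\cos\frac{\chi}{2}\right)^{3/2}} \\
&= \frac{\frac{a}{2}n^3\cos\frac{\chi}{2} + \frac{an}{2}\cos\frac{\chi}{2} - an^2\cos^2\frac{\chi}{2} - \frac{1}{2}an^2\sin^2\frac{\chi}{2}}{\left(n^2 + 1 - 2n\cos\frac{\chi}{2}\right)^{3/2}} \\
&= \frac{an}{2}\,\frac{n^2\cos\frac{\chi}{2} + \cos\frac{\chi}{2} - n - n\cos^2\frac{\chi}{2}}{\left(n^2 + 1 - 2n\cos\frac{\chi}{2}\right)^{3/2}} \\
&= \frac{an}{2}\,\frac{\left(n\cos\frac{\chi}{2} - 1\right)\left(n - \cos\frac{\chi}{2}\right)}{\left(n^2 + 1 - 2n\cos\frac{\chi}{2}\right)^{3/2}}
\end{aligned} \tag{5.54} \]
\[ \begin{aligned}
\Rightarrow\quad \frac{d\sigma}{d\Omega}(\chi) &= \frac{\rho}{\sin\chi}\left|\frac{d\rho}{d\chi}\right| = \frac{a^2 n^2\sin(\chi/2)}{2\sin\chi}\,\frac{|(n\cos(\chi/2) - 1)(n - \cos(\chi/2))|}{(n^2 + 1 - 2n\cos(\chi/2))^2} \\
&= \frac{a^2 n^2}{4}\,\frac{1}{\cos(\chi/2)}\,\frac{|(n\cos(\chi/2) - 1)(n - \cos(\chi/2))|}{(n^2 + 1 - 2n\cos(\chi/2))^2}.
\end{aligned} \tag{5.55} \]
Here, we utilized
\[ \sin\chi = 2\cos\frac{\chi}{2}\sin\frac{\chi}{2}. \tag{5.56} \]
The angle χ takes the values from zero (for ρ = 0) up to the value χmax (for ρ = a)
which is determined by the equation
\[ \cos\frac{\chi_{\max}}{2} = \frac{1}{n}. \tag{5.57} \]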
The total cross section, obtained by integrating (dσ/dΩ)(χ) over all angles within
the cone χ < χmax, must equal the geometrical cross section πa²: for r > a we have
U = 0, so particles with ρ > a are not scattered at all. We now verify this explicitly.
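We start from (5.55),
\[ \frac{d\sigma}{d\Omega}(\chi) = \frac{\rho}{\sin\chi}\left|\frac{d\rho}{d\chi}\right| = \frac{a^2 n^2}{4}\,\frac{1}{\cos(\chi/2)}\,\frac{[n\cos(\chi/2) - 1][n - \cos(\chi/2)]}{[n^2 + 1 - 2n\cos(\chi/2)]^2}, \tag{5.58} \]
and integrate over all angles χ from 0 to χmax (dΩ = 2π sin χ dχ):
\[ \begin{aligned}
\sigma_{\mathrm{tot}} &= \int \frac{d\sigma}{d\Omega}(\chi)\,d\Omega = \pi a^2 n^2 \int_0^{\chi_{\max}} \sin\frac{\chi}{2}\,\frac{[n\cos(\chi/2) - 1][n - \cos(\chi/2)]}{[n^2 + 1 - 2n\cos(\chi/2)]^2}\,d\chi \\
&= \pi a^2 \int_0^{\chi_{\max}} \frac{n^2}{(1 + n^2 - 2n\cos(\chi/2))^2}
\Bigl[\underbrace{(n^2 + 1)\cos\frac{\chi}{2}\sin\frac{\chi}{2}}_{\mathrm{I}}\;
\underbrace{-\,n\cos^2\frac{\chi}{2}\sin\frac{\chi}{2}}_{\mathrm{II}}\;
\underbrace{-\,n\sin\frac{\chi}{2}}_{\mathrm{III}}\Bigr]\,d\chi.
\end{aligned} \tag{5.59} \]
Part III can be integrated at once; I and II are transformed by integrating by parts:
\[ \begin{aligned}
\sigma_{\mathrm{tot}} ={}& \pi a^2 n^2 \left[\left(n^2 + 1 - 2n\cos\frac{\chi}{2}\right)^{-1}\right]_0^{\chi_{\max}} \\
&- \pi a^2 n(n^2 + 1)\left[\frac{\cos(\chi/2)}{1 + n^2 - 2n\cos(\chi/2)}\right]_0^{\chi_{\max}}
 - \pi a^2 n(n^2 + 1)\int_0^{\chi_{\max}} \frac{(1/2)\sin(\chi/2)}{1 + n^2 - 2n\cos(\chi/2)}\,d\chi \\
&+ n^2\pi a^2\left[\frac{\cos^2(\chi/2)}{1 + n^2 - 2n\cos(\chi/2)}\right]_0^{\chi_{\max}}
 - \pi a^2 n^2\int_0^{\chi_{\max}} \frac{-\cos(\chi/2)\sin(\chi/2)}{1 + n^2 - 2n\cos(\chi/2)}\,d\chi.
\end{aligned} \tag{5.60} \]
In the last integral, we substitute
\[ y := \cos\frac{\chi}{2}, \qquad dy = -\frac{1}{2}\sin\frac{\chi}{2}\,d\chi \tag{5.61} \]
and obtain
\[ \begin{aligned}
\sigma_{\mathrm{tot}} = \pi a^2\Biggl\{&\left[\frac{n^2(1 + \cos^2(\chi/2)) - n(n^2 + 1)\cos(\chi/2)}{1 + n^2 - 2n\cos(\chi/2)}\right]_0^{\chi_{\max}}
 - \int_0^{\chi_{\max}} \frac{n(n^2 + 1)\sin(\chi/2)}{2\,(1 + n^2 - 2n\cos(\chi/2))}\,d\chi \\
&- 2n^2\int_1^{\cos(\chi_{\max}/2)} \frac{y}{1 + n^2 - 2ny}\,dy\Biggr\} \\
= \pi a^2\Biggl\{&\left[\frac{n^2(1 + \cos^2(\chi/2)) - n(n^2 + 1)\cos(\chi/2)}{1 + n^2 - 2n\cos(\chi/2)}\right]_0^{\chi_{\max}}
 - \left[\frac{n^2 + 1}{2}\ln\left(1 + n^2 - 2n\cos\frac{\chi}{2}\right)\right]_0^{\chi_{\max}} \\
&+ \left[ny + \frac{n^2 + 1}{2}\ln(1 + n^2 - 2ny)\right]_1^{\cos(\chi_{\max}/2)}\Biggr\};
\end{aligned} \]
and finally, with χmax = 2 arccos(1/n),
\[ \begin{aligned}
\sigma_{\mathrm{tot}} &= \pi a^2\Biggl[\underbrace{\frac{n^2(1 + 1/n^2) - n(n^2 + 1)(1/n)}{1 + n^2 - 2}}_{=0} - \frac{n^2(1 + 1) - n(n^2 + 1)\cdot 1}{1 + n^2 - 2n} + 1 - n\Biggr] \\
&= \pi a^2\,\frac{n - 2n^2 + n^3 - n^3 + 2n^2 - n + n^2 - 2n + 1}{(n - 1)^2} = \pi a^2.
\end{aligned} \tag{5.62} \]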
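The closed form (5.53) can be checked against the refraction geometry directly. The following sketch is illustrative only, with hypothetical function names and arbitrary units: it computes the deflection χ = 2(α − β) from a sin α = ρ and sin α / sin β = n, and then inverts it with (5.53). The numerical values of U0, m, and v∞ are assumed test inputs.

```python
import math

def refractive_index(u0, mass, v_inf):
    """n from (5.46)."""
    return math.sqrt(1.0 + 2.0 * u0 / (mass * v_inf ** 2))

def deflection_angle(rho, a, n):
    """chi = 2(alpha - beta) with a*sin(alpha) = rho and sin(alpha)/sin(beta) = n."""
    alpha = math.asin(rho / a)
    beta = math.asin(rho / (a * n))
    return 2.0 * (alpha - beta)

def impact_parameter(chi, a, n):
    """rho(chi) from (5.53)."""
    half = chi / 2.0
    return a * n * math.sin(half) / math.sqrt(n * n - 2.0 * n * math.cos(half) + 1.0)

n = refractive_index(1.25, 2.0, 1.0)      # assumed values giving n = 1.5
chi = deflection_angle(1.2, 2.0, n)
print(n, chi, impact_parameter(chi, 2.0, n))   # last value reproduces rho = 1.2
chi_max = deflection_angle(2.0, 2.0, n)
print(math.cos(chi_max / 2.0), 1.0 / n)        # both sides of (5.57)
```

Since ρ(χ) runs monotonically from 0 to a as χ goes from 0 to χmax, the total cross section is the area of the disk ρ ≤ a, in agreement with (5.62).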
EXERCISE
5.9 Scattering of Two Atoms
Problem. A hydrogen atom moves along the x-axis with a velocity vH = 1.78 · 10² m·s⁻¹.
It reacts with a chlorine atom that moves perpendicular to the x-axis with
vCl = 3.2 · 10¹ m·s⁻¹. Calculate the angle and the velocity of the HCl molecule. The
atomic weights are H = 1.00797 and Cl = 35.453.
Fig. 5.19.
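Solution. We utilize momentum conservation. The initial momenta are
\[ \mathbf{P}_1 = m_1 v_1 \mathbf{e}_x, \quad m_1 = A_1 \cdot 1\,\mathrm{amu}, \qquad
   \mathbf{P}_2 = m_2 v_2 \mathbf{e}_y, \quad m_2 = A_2 \cdot 1\,\mathrm{amu}. \tag{5.63} \]
Here, A1, A2 are the atomic weights, and 1 amu ("atomic mass unit") = (1/12) m(¹²C).
We require
\[ \mathbf{P} = \mathbf{P}' = (m_1 v_1, m_2 v_2) \quad\text{with}\quad \mathbf{P}' = (m_1 + m_2)\mathbf{v}', \tag{5.64} \]
from which we get
\[ \mathbf{v}' = \frac{1}{m_1 + m_2}(m_1 v_1, m_2 v_2) = \frac{m_1 m_2}{m_1 + m_2}\left(\frac{v_1}{m_2}, \frac{v_2}{m_1}\right) = \mu\left(\frac{v_1}{m_2}, \frac{v_2}{m_1}\right). \tag{5.65} \]
Here, μ is the reduced mass. It is calculated as
\[ \mu = \frac{m_1 m_2}{m_1 + m_2} = 0.9801\,\mathrm{amu}. \tag{5.66} \]
Thus, one obtains
\[ \mathbf{v}' = (4.9208,\ 31.1154)\,\mathrm{m\,s^{-1}} \quad\Rightarrow\quad v' = 31.502\,\mathrm{m\,s^{-1}}. \tag{5.67} \]
The angle θ is found from tan θ = v′y/v′x to be θ = 81.013°.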
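The arithmetic of this exercise is easy to reproduce. The short sketch below simply evaluates (5.65) to (5.67) with the numbers given in the problem; the function name is hypothetical, and the atomic mass unit cancels out of the velocity, so masses are entered as atomic weights.

```python
import math

def molecule_velocity(a1, v1, a2, v2):
    """Velocity of the combined particle from momentum conservation, cf. (5.65).
    Particle 1 moves along x, particle 2 along y; a1, a2 are atomic weights."""
    total = a1 + a2
    vx = a1 * v1 / total
    vy = a2 * v2 / total
    speed = math.hypot(vx, vy)
    angle_deg = math.degrees(math.atan2(vy, vx))
    return vx, vy, speed, angle_deg

vx, vy, speed, angle = molecule_velocity(1.00797, 1.78e2, 35.453, 3.2e1)
reduced_mass = 1.00797 * 35.453 / (1.00797 + 35.453)   # in amu, cf. (5.66)
print(vx, vy, speed, angle, reduced_mass)
```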

6 Mechanical Fundamental Quantities of Systems of Mass Points
6.1 Linear Momentum of the Many-Body System
If we consider a system of mass points, for the total force acting on the νth particle
we have
\[ \mathbf{F}_\nu + \sum_\lambda \mathbf{f}_{\nu\lambda} = \dot{\mathbf{p}}_\nu. \tag{6.1} \]
The force $\mathbf{f}_{\nu\lambda}$ is the force of the particle λ on the particle ν; $\mathbf{F}_\nu$ is the force acting on
the particle ν from outside the system; $\sum_\lambda \mathbf{f}_{\nu\lambda}$ is the resulting internal force of
all other particles on the particle ν.
The resulting force acting on the system is obtained by summing over the individual
forces:
\[ \sum_\nu \dot{\mathbf{p}}_\nu = \sum_\nu \mathbf{F}_\nu + \sum_\nu\sum_\lambda \mathbf{f}_{\nu\lambda} = \dot{\mathbf{P}}. \]
Since action equals minus reaction (here Newton's third law becomes operative), it
follows that $\mathbf{f}_{\nu\lambda} + \mathbf{f}_{\lambda\nu} = 0$, so that the terms of the above double sum cancel pairwise.
One thus obtains for the total force acting on the system
\[ \dot{\mathbf{P}} = \mathbf{F} = \sum_\nu \mathbf{F}_\nu. \]
If no external force acts on the system, one has
\[ \mathbf{F} = \dot{\mathbf{P}} = 0, \quad\text{i.e.,}\quad \mathbf{P} = \text{constant}. \]
The total momentum $\mathbf{P} = \sum_\nu \mathbf{p}_\nu$ of the particle system is thus conserved if the sum of
the external forces acting on the system vanishes.
6.2 Angular Momentum of the Many-Body System
The situation is similar for the angular momentum if the internal forces are assumed
to be central forces.
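The angular momentum of the νth particle with respect to the coordinate origin is
\[ \mathbf{l}_\nu = \mathbf{r}_\nu \times \mathbf{p}_\nu. \]
The angular momentum of a single particle is defined with respect to the origin. The
same holds for the total angular momentum. The angular momentum of the system
then equals the sum over all individual angular momenta,
\[ \mathbf{L} = \sum_\nu \mathbf{l}_\nu. \]
Fig. 6.1.
Analogously, the torque acting on the νth particle is
\[ \mathbf{d}_\nu = \mathbf{r}_\nu \times \mathbf{F}_\nu, \]
and the total torque is given by
\[ \mathbf{D} = \sum_\nu \mathbf{d}_\nu. \]
The internal forces $\mathbf{f}_{\nu\lambda}$ do not contribute to the total torque, since we assumed them to be
central forces. This can be seen as follows: For the force acting on the νth particle,
according to (6.1) we have
\[ \mathbf{F}_\nu + \sum_\lambda \mathbf{f}_{\nu\lambda} = \frac{d}{dt}\mathbf{p}_\nu. \]
By vectorial multiplication of the equation from the left by $\mathbf{r}_\nu$, we obtain
\[ \mathbf{r}_\nu \times \mathbf{F}_\nu + \mathbf{r}_\nu \times \sum_\lambda \mathbf{f}_{\nu\lambda} = \mathbf{r}_\nu \times \frac{d}{dt}\mathbf{p}_\nu = \frac{d}{dt}(\mathbf{r}_\nu \times \mathbf{p}_\nu) = \dot{\mathbf{l}}_\nu. \]
The time derivative can be pulled in front of the vector product because $\dot{\mathbf{r}}_\nu \times \mathbf{p}_\nu = 0$.
Summation over ν yields
\[ \underbrace{\sum_\nu \mathbf{r}_\nu \times \mathbf{F}_\nu}_{\mathbf{D}} + \underbrace{\sum_\nu\sum_\lambda \mathbf{r}_\nu \times \mathbf{f}_{\nu\lambda}}_{0} = \dot{\mathbf{L}}, \qquad \mathbf{D} = \dot{\mathbf{L}} = \sum_\nu \dot{\mathbf{l}}_\nu. \]
Here, $\sum_\nu\sum_\lambda \mathbf{r}_\nu \times \mathbf{f}_{\nu\lambda} = 0$, since the terms of the double sum cancel pairwise, e.g.,
\[ \mathbf{r}_\nu \times \mathbf{f}_{\nu\lambda} + \mathbf{r}_\lambda \times \mathbf{f}_{\lambda\nu} = (\mathbf{r}_\nu - \mathbf{r}_\lambda) \times \mathbf{f}_{\nu\lambda}. \]
Since for central forces $(\mathbf{r}_\nu - \mathbf{r}_\lambda)$ is parallel to $\mathbf{f}_{\nu\lambda}$, the vector product vanishes.
The total torque on a system is given by the sum of the external torques
\[ \mathbf{D} = \dot{\mathbf{L}}. \]
For $\mathbf{D} = 0$, it follows that $\mathbf{L}$ = constant. If no external torques act on a system, the
total angular momentum is conserved.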
EXAMPLE
6.1 Conservation of the Total Angular Momentum of a Many-Body System: Flattening of a Galaxy
Fig. 6.2. Formation of a galaxy from a cloud of gas with angular momentum L: (a) The gas con-
tracts due to the mutual gravitational attraction between its constituents. (b) The gas contracts
faster along the direction of the angular momentum L than in the plane perpendicular to L, since
the angular momentum must be conserved. In this way a flattening appears. (c) The galaxy in
equilibrium: In the plane perpendicular to L, the gravitational force balances the centrifugal
force due to the rotational motion
Fig. 6.3. Demonstration of angular momentum conservation in the absence of external
torques. A person stands on a platform that rotates about a vertical axis
EXAMPLE
6.2 Conservation of Angular Momentum of a Many-Body Problem: The Pirouette
(a) The person holds two weights and is set into uniform circular motion with angular
velocity ω. The arms are stretched out, so that the moment of inertia is large.
(b) If the person pulls the arms towards the body, the moment of inertia (see Chap. 11)
decreases. Since angular momentum is conserved, the angular velocity ω increases
significantly. Skaters exploit this effect when performing a pirouette.
6.3 Energy Law of the Many-Body System
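Let $\mathbf{f}_{\nu\lambda}$ be the force of the λth particle on the νth particle. According to (6.1), we have
\[ \mathbf{F}_\nu + \sum_\lambda \mathbf{f}_{\nu\lambda} = \frac{d}{dt}(m_\nu \dot{\mathbf{r}}_\nu). \]
Scalar multiplication of the equation by $\dot{\mathbf{r}}_\nu$, with
\[ \dot{\mathbf{r}}_\nu \cdot \frac{d}{dt}(m_\nu \dot{\mathbf{r}}_\nu) = \frac{d}{dt}\left(\frac{1}{2} m_\nu \dot{\mathbf{r}}_\nu^2\right), \]
leads to
\[ \mathbf{F}_\nu \cdot \dot{\mathbf{r}}_\nu + \sum_\lambda \mathbf{f}_{\nu\lambda} \cdot \dot{\mathbf{r}}_\nu = \frac{d}{dt}\left(\frac{1}{2} m_\nu \dot{\mathbf{r}}_\nu^2\right). \]
However, $\frac{1}{2} m_\nu \dot{\mathbf{r}}_\nu^2$ is the kinetic energy $T_\nu$ of the νth particle. By summation over ν,
we obtain
\[ \sum_\nu \mathbf{F}_\nu \cdot \dot{\mathbf{r}}_\nu + \sum_\nu\sum_\lambda \mathbf{f}_{\nu\lambda} \cdot \dot{\mathbf{r}}_\nu = \sum_\nu \frac{d}{dt}\left(\frac{1}{2} m_\nu \dot{\mathbf{r}}_\nu^2\right) = \sum_\nu \dot{T}_\nu = \frac{d}{dt}\sum_\nu T_\nu. \]
$\sum_\nu \dot{T}_\nu$ is the time derivative of the total kinetic energy of the system. By integration
from t1 to t2, with
\[ \dot{\mathbf{r}}_\nu\,dt = d\mathbf{r}_\nu, \]
we get
\[ T(t_2) - T(t_1) = \underbrace{\sum_\nu \int_{t_1}^{t_2} \mathbf{F}_\nu \cdot d\mathbf{r}_\nu}_{A_a} + \underbrace{\sum_{\nu,\lambda} \int_{t_1}^{t_2} \mathbf{f}_{\nu\lambda} \cdot d\mathbf{r}_\nu}_{A_i}. \tag{6.2} \]
T is the total kinetic energy, $A_a$ is the work performed by external forces, and $A_i$ is
the work performed by internal forces over the time interval t2 − t1.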
If we assume that the forces can be derived from a potential, we can express the
performed internal and external work by potential differences.
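For the external work, we have
\[ \begin{aligned}
A_a &= \sum_\nu \int_{t_1}^{t_2} \mathbf{F}_\nu \cdot d\mathbf{r}_\nu = -\sum_\nu \int_{t_1}^{t_2} \nabla_\nu V_\nu^{a} \cdot d\mathbf{r}_\nu = -\sum_\nu \int_{t_1}^{t_2} dV_\nu^{a} = -\sum_\nu \bigl(V_\nu^{a}(t_2) - V_\nu^{a}(t_1)\bigr), \\
A_a &= V^{a}(t_1) - V^{a}(t_2).
\end{aligned} \]
$V_\nu^{a}$ is the potential of the particle ν in an external field. By summing over all particles,
one obtains the total external potential $V^{a} = \sum_\nu V_\nu^{a}$.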
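The force acting between two particles ν and λ is assumed to be a central force.
For the "internal" potential, we set
\[ V^{i}_{\lambda\nu}(\mathbf{r}_{\lambda\nu}) = V^{i}_{\lambda\nu}(r_{\lambda\nu}) = V^{i}_{\nu\lambda}(r_{\nu\lambda}). \]
The mutual potential depends only on the absolute value of the distance:
\[ r_{\nu\lambda} = |\mathbf{r}_\nu - \mathbf{r}_\lambda| = \sqrt{(x_\nu - x_\lambda)^2 + (y_\nu - y_\lambda)^2 + (z_\nu - z_\lambda)^2}. \]
Thus, the principle of action and reaction is satisfied, since from this it follows
automatically that the force $\mathbf{f}_{\nu\lambda}$ is equal and opposite to the counterforce $\mathbf{f}_{\lambda\nu}$:
\[ \mathbf{f}_{\nu\lambda} = -\nabla_\nu V^{i}_{\nu\lambda} = +\nabla_\lambda V^{i}_{\nu\lambda} = -\mathbf{f}_{\lambda\nu}. \]
The index ν on the gradient indicates that the gradient is to be calculated with respect
to the components of the position vector $\mathbf{r}_\nu$ of the particle ν. Hence,
\[ \nabla_\nu = \left(\frac{\partial}{\partial x_\nu}, \frac{\partial}{\partial y_\nu}, \frac{\partial}{\partial z_\nu}\right), \qquad \nabla_\lambda = \left(\frac{\partial}{\partial x_\lambda}, \frac{\partial}{\partial y_\lambda}, \frac{\partial}{\partial z_\lambda}\right). \]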
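Hence, for the internal work we can write
\[ A_i = \sum_{\nu,\lambda} \int \mathbf{f}_{\nu\lambda} \cdot d\mathbf{r}_\nu = \frac{1}{2}\left(\sum_{\nu,\lambda} \int \mathbf{f}_{\nu\lambda} \cdot d\mathbf{r}_\nu + \sum_{\lambda,\nu} \int \mathbf{f}_{\lambda\nu} \cdot d\mathbf{r}_\lambda\right) = \frac{1}{2}\sum_{\nu,\lambda} \int \mathbf{f}_{\nu\lambda} \cdot (d\mathbf{r}_\nu - d\mathbf{r}_\lambda). \]
We now replace the difference of the position vectors by the vector $\mathbf{r}_{\nu\lambda} = \mathbf{r}_\nu - \mathbf{r}_\lambda$ and
introduce the operator $\nabla_{\nu\lambda}$, which forms the gradient with respect to this difference.
We get
\[ A_i = -\frac{1}{2}\sum_{\nu,\lambda} \int \nabla_{\nu\lambda} V^{i}_{\nu\lambda} \cdot d\mathbf{r}_{\nu\lambda} = -\frac{1}{2}\sum_{\nu,\lambda} \int dV^{i}_{\nu\lambda} = -\frac{1}{2}\sum_{\nu,\lambda} \bigl(V^{i}_{\nu\lambda}(t_2) - V^{i}_{\nu\lambda}(t_1)\bigr), \]
where
\[ \nabla_{\nu\lambda} = \left(\frac{\partial}{\partial(x_\nu - x_\lambda)}, \frac{\partial}{\partial(y_\nu - y_\lambda)}, \frac{\partial}{\partial(z_\nu - z_\lambda)}\right). \]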
Hence, the internal work is the difference of the internal potential energy. This quantity
is significant for deformable media (deformation energy).
For rigid bodies where the differences (distances) |rν − rλ | are invariant, the inter-
nal work vanishes. Changes drνλ can occur only perpendicular to rν − rλ and hence
perpendicular to the direction of force, i.e., the scalar products fνλ · drνλ vanish.
If we set for the total potential energy
\[ V = \sum_\nu V_\nu^{a} + \frac{1}{2}\sum_{\nu,\lambda} V^{i}_{\nu\lambda}, \]
for (6.2) we find
\[ T(t_2) - T(t_1) = V(t_1) - V(t_2) \]
or
\[ V(t_1) + T(t_1) = V(t_2) + T(t_2); \tag{6.3} \]
the sum of potential and kinetic energy of the total system is conserved. Since
energy can be transferred by the interaction of the particles (e.g., collisions between
gas molecules), energy conservation need not hold for an individual particle, but it
must hold for all particles together, i.e., for the entire system.
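The three conservation laws of Sects. 6.1 to 6.3 can be observed together in a small numerical experiment. The sketch below is an illustration under assumptions that are not part of the text: two particles in a plane, no external forces, an internal central force $\mathbf{f}_{12} = -k(\mathbf{r}_1 - \mathbf{r}_2)$ derived from the potential ½k r₁₂², arbitrary masses and initial values, and a velocity-Verlet time step. Total momentum and angular momentum should be conserved up to rounding, the total energy up to the discretization error of the integrator.

```python
def simulate(steps=2000, dt=1.0e-3, k=4.0):
    """Two particles coupled by a central spring force, no external forces."""
    m = [1.0, 3.0]
    r = [[1.0, 0.0], [-0.5, 0.2]]
    v = [[0.0, 0.8], [0.3, -0.1]]

    def forces():
        # f_12 = -k (r_1 - r_2) points along the connecting line; f_21 = -f_12.
        dx, dy = r[0][0] - r[1][0], r[0][1] - r[1][1]
        return [[-k * dx, -k * dy], [k * dx, k * dy]]

    def observables():
        px = sum(m[i] * v[i][0] for i in range(2))
        py = sum(m[i] * v[i][1] for i in range(2))
        lz = sum(m[i] * (r[i][0] * v[i][1] - r[i][1] * v[i][0]) for i in range(2))
        kinetic = sum(0.5 * m[i] * (v[i][0] ** 2 + v[i][1] ** 2) for i in range(2))
        dx, dy = r[0][0] - r[1][0], r[0][1] - r[1][1]
        potential = 0.5 * k * (dx * dx + dy * dy)
        return px, py, lz, kinetic + potential

    before = observables()
    f = forces()
    for _ in range(steps):
        for i in range(2):
            for c in range(2):
                v[i][c] += 0.5 * dt * f[i][c] / m[i]   # first half kick
                r[i][c] += dt * v[i][c]                 # drift
        f = forces()
        for i in range(2):
            for c in range(2):
                v[i][c] += 0.5 * dt * f[i][c] / m[i]   # second half kick
    return before, observables()

before, after = simulate()
print("P_x, P_y, L_z, E before:", before)
print("P_x, P_y, L_z, E after: ", after)
```

The energies of the individual particles change during the run, since the spring transfers energy between them; only the sum is conserved, as stated after (6.3).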
6.4 Transformation to Center-of-Mass Coordinates

When investigating the motion of particle systems, one often disregards the common
translation of the system in space, since only the motions of particles relative to the
center of gravity of the system are of interest. One therefore transforms the quantities
characterizing the particles to a system whose origin is the center of gravity.
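According to Fig. 6.4, the origin of the primed coordinate system is the center
of gravity; the position, velocity, and mass R, V, and M of the center of gravity are
denoted by capital letters. One has
\[ \mathbf{r}_\nu = \mathbf{R} + \mathbf{r}'_\nu, \qquad \dot{\mathbf{r}}_\nu = \mathbf{V} + \mathbf{v}'_\nu = \dot{\mathbf{R}} + \dot{\mathbf{r}}'_\nu. \]
Fig. 6.4.
According to the definition of the center of gravity, we have
\[ M\mathbf{R} = \sum_\nu m_\nu \mathbf{r}_\nu = \sum_\nu m_\nu(\mathbf{R} + \mathbf{r}'_\nu), \qquad M\mathbf{R} = M\mathbf{R} + \sum_\nu m_\nu \mathbf{r}'_\nu, \]
where $M = \sum_\nu m_\nu$ is the total mass of the system.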
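From the last equation, it follows that
\[ \sum_\nu m_\nu \mathbf{r}'_\nu = 0. \tag{6.4} \]
Thus, the sum of the mass moments relative to the center of gravity vanishes. If a
constant external force acts, as for example gravity $\mathbf{F}_\nu = m_\nu \mathbf{g}$, then it also follows
that
\[ \mathbf{D} = \sum_\nu \mathbf{r}'_\nu \times \mathbf{F}_\nu = \left(\sum_\nu m_\nu \mathbf{r}'_\nu\right) \times \mathbf{g} = 0. \]
A body in the earth's field is therefore in equilibrium if it is supported at the center of
gravity.
Differentiation of (6.4) with respect to time yields
\[ \sum_\nu m_\nu \mathbf{v}'_\nu = 0, \tag{6.5} \]
i.e., in the center-of-mass system the sum of the momenta vanishes. In relativistic
physics this statement is often used as the definition of the "center-of-momentum"
system; there it is not possible to introduce the notion of the center of mass, as defined
above, in a consistent way. Only the "center-of-momentum" system can be formulated
in a relativistically consistent way.
The equivalent transformation of the angular momentum leads to
\[ \begin{aligned}
\mathbf{L} &= \sum_\nu m_\nu(\mathbf{r}_\nu \times \mathbf{v}_\nu) = \sum_\nu m_\nu(\mathbf{R} + \mathbf{r}'_\nu) \times (\mathbf{V} + \mathbf{v}'_\nu), \\
\mathbf{L} &= \sum_\nu m_\nu(\mathbf{R} \times \mathbf{V}) + \sum_\nu m_\nu(\mathbf{R} \times \mathbf{v}'_\nu) + \sum_\nu m_\nu(\mathbf{r}'_\nu \times \mathbf{V}) + \sum_\nu m_\nu(\mathbf{r}'_\nu \times \mathbf{v}'_\nu).
\end{aligned} \]
By appropriate grouping, one obtains
\[ \mathbf{L} = M(\mathbf{R} \times \mathbf{V}) + \mathbf{R} \times \sum_\nu m_\nu \mathbf{v}'_\nu + \left(\sum_\nu m_\nu \mathbf{r}'_\nu\right) \times \mathbf{V} + \sum_\nu m_\nu(\mathbf{r}'_\nu \times \mathbf{v}'_\nu) \]
and sees that the two middle terms disappear because of the relations (6.4) and (6.5) for
the center-of-mass coordinates. Hence,
\[ \mathbf{L} = M(\mathbf{R} \times \mathbf{V}) + \sum_\nu m_\nu(\mathbf{r}'_\nu \times \mathbf{v}'_\nu) = \mathbf{L}_s + \sum_\nu \mathbf{l}'_\nu. \tag{6.6} \]
Thus, the angular momentum L can be decomposed into the angular momentum of
the center of gravity $\mathbf{L}_s = M\mathbf{R} \times \mathbf{V}$ with the total mass M, and the sum of the angular
momenta of the individual particles about the center of gravity.
For the torque as the derivative of the angular momentum, the same decomposition
holds:
\[ \mathbf{D} = \mathbf{D}_s + \sum_\nu \mathbf{d}'_\nu. \tag{6.7} \]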
6.5 Transformation of the Kinetic Energy
We have
\[ T = \frac{1}{2}\sum_\nu m_\nu \mathbf{v}_\nu^2 = \frac{1}{2}\sum_\nu m_\nu \mathbf{V}^2 + \mathbf{V} \cdot \sum_\nu m_\nu \mathbf{v}'_\nu + \frac{1}{2}\sum_\nu m_\nu \mathbf{v}'^{2}_\nu. \]
Because $\sum_\nu m_\nu \mathbf{v}'_\nu = 0$, the middle term again vanishes, and we find
\[ T = \frac{1}{2} M\mathbf{V}^2 + \frac{1}{2}\sum_\nu m_\nu \mathbf{v}'^{2}_\nu = T_s + T'. \tag{6.8} \]
The total kinetic energy T is thus composed of the kinetic energy of a virtual particle
of mass M with the position vector R(t) (the center of gravity), and the kinetic energy
of the individual particles relative to the center of gravity. Mixed terms, e.g., of the
form $\mathbf{V} \cdot \mathbf{v}'_\nu$, do not appear! This remarkable property of the center-of-mass
coordinates is the basis of their importance.
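The decompositions (6.6) and (6.8) can be verified numerically for an arbitrary set of particles. The following is a minimal sketch; the helper names, the number of particles, and the randomly chosen masses, positions, and velocities are assumptions for illustration only.

```python
import random

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def scale(s, a):
    return tuple(s * x for x in a)

def decompose(masses, positions, velocities):
    """Return (L, L_s + sum l'_nu, T, T_s + T') for comparison with (6.6) and (6.8)."""
    zero = (0.0, 0.0, 0.0)
    total_mass = sum(masses)
    R, V, L, T = zero, zero, zero, 0.0
    for m, r, v in zip(masses, positions, velocities):
        R = add(R, scale(m / total_mass, r))     # center of gravity
        V = add(V, scale(m / total_mass, v))     # its velocity
        L = add(L, scale(m, cross(r, v)))        # total angular momentum
        T += 0.5 * m * dot(v, v)                 # total kinetic energy
    L_split = scale(total_mass, cross(R, V))     # L_s
    T_split = 0.5 * total_mass * dot(V, V)       # T_s
    for m, r, v in zip(masses, positions, velocities):
        r_rel, v_rel = sub(r, R), sub(v, V)      # primed quantities
        L_split = add(L_split, scale(m, cross(r_rel, v_rel)))
        T_split += 0.5 * m * dot(v_rel, v_rel)
    return L, L_split, T, T_split

rng = random.Random(0)
masses = [rng.uniform(0.5, 2.0) for _ in range(5)]
positions = [tuple(rng.uniform(-1.0, 1.0) for _ in range(3)) for _ in range(5)]
velocities = [tuple(rng.uniform(-1.0, 1.0) for _ in range(3)) for _ in range(5)]
L, L_split, T, T_split = decompose(masses, positions, velocities)
print(L, L_split)
print(T, T_split)
```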
EXERCISE
6.3 Reduced Mass
Problem. Show that the kinetic energy of two particles with the masses m1 , m2
splits into the energy of the center of gravity and the kinetic energy of relative motion.
Fig. 6.5. Center of gravity and relative coordinates of two masses
Solution. The total kinetic energy is
\[ T = \frac{1}{2} m_1 \mathbf{v}_1^2 + \frac{1}{2} m_2 \mathbf{v}_2^2. \tag{6.9} \]
The center of gravity is defined by
\[ \mathbf{R} = \frac{m_1 \mathbf{r}_1 + m_2 \mathbf{r}_2}{m_1 + m_2}, \]
and its velocity is
\[ \dot{\mathbf{R}} = \frac{1}{m_1 + m_2}(m_1 \mathbf{v}_1 + m_2 \mathbf{v}_2). \tag{6.10} \]
The velocity of relative motion is denoted by v. We have
\[ \mathbf{v} = \mathbf{v}_1 - \mathbf{v}_2. \tag{6.11} \]
We now express the particle velocities by the center-of-gravity velocity and the relative
velocity. By inserting v2 from (6.11) into (6.10), we have
\[ (m_1 + m_2)\dot{\mathbf{R}} = m_1 \mathbf{v}_1 + m_2 \mathbf{v}_1 - m_2 \mathbf{v}. \]
From this, it follows that
\[ \mathbf{v}_1 = \dot{\mathbf{R}} + \frac{m_2}{m_1 + m_2}\mathbf{v}. \]
Analogously, we get
\[ \mathbf{v}_2 = \dot{\mathbf{R}} - \frac{m_1}{m_1 + m_2}\mathbf{v}. \]
Inserting the two particle velocities into (6.9), we obtain
\[ T = \frac{1}{2} m_1\left(\dot{\mathbf{R}} + \frac{m_2}{m_1 + m_2}\mathbf{v}\right)^2 + \frac{1}{2} m_2\left(\dot{\mathbf{R}} - \frac{m_1}{m_1 + m_2}\mathbf{v}\right)^2 \]
or
\[ T = \frac{1}{2} M\dot{\mathbf{R}}^2 + \frac{1}{2}\frac{m_1 m_2^2\,\mathbf{v}^2}{(m_1 + m_2)^2} + \frac{1}{2}\frac{m_2 m_1^2\,\mathbf{v}^2}{(m_1 + m_2)^2}, \qquad T = \frac{1}{2} M\dot{\mathbf{R}}^2 + \frac{1}{2}\mu v^2. \]
The mixed terms cancel. The mass related to the center-of-mass motion is the total
mass M = m1 + m2; the mass related to the relative motion is the reduced mass
\[ \mu = \frac{m_1 m_2}{m_1 + m_2}. \]
The reduced mass is often written in the form
\[ \frac{1}{\mu} = \frac{1}{m_1} + \frac{1}{m_2}. \]
It is remarkable that the kinetic energy for two bodies decomposes into the kinetic
energies of the motion of the center of gravity and of the relative motion. There are no
mixed terms, e.g., of the form $\dot{\mathbf{R}} \cdot \mathbf{v}$, which considerably simplifies the solution of the
two-body problem (see the next problem).
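As a quick numerical illustration of this result, the sketch below evaluates the kinetic energy once directly from (6.9) and once as center-of-gravity energy plus relative energy with the reduced mass. The function name and the sample masses and velocities are assumptions chosen for illustration.

```python
def kinetic_energy_split(m1, m2, v1, v2):
    """Return (T from (6.9), (1/2) M Rdot^2 + (1/2) mu v^2) for two particles."""
    total_mass = m1 + m2
    mu = m1 * m2 / total_mass                               # reduced mass
    v_cm = [(m1 * a + m2 * b) / total_mass for a, b in zip(v1, v2)]   # (6.10)
    v_rel = [a - b for a, b in zip(v1, v2)]                 # (6.11)
    t_direct = 0.5 * m1 * sum(a * a for a in v1) + 0.5 * m2 * sum(b * b for b in v2)
    t_split = (0.5 * total_mass * sum(c * c for c in v_cm)
               + 0.5 * mu * sum(c * c for c in v_rel))
    return t_direct, t_split

print(kinetic_energy_split(2.0, 3.0, (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)))
```

For these sample values the center-of-gravity part is 4 and the relative part is 3, which add up to the directly computed value 7.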
EXERCISE
6.4 Movement of Two Bodies Under the Action of Mutual Gravitation
Problem. Two bodies of masses m1 and m2 move under the action of their mutual
gravitation. Let r1 and r2 be the position vectors in a space-fixed coordinate system,
and r = r1 − r2. Find the equations of motion for r1, r2, and r in the center-of-gravity
system. What do the trajectories in the space-fixed system and in the center-of-mass
system look like?
Fig. 6.6. Laboratory system
Fig. 6.7.
Solution. Newton's gravitational law immediately yields
\[ \ddot{\mathbf{r}}_1 = -\frac{Gm_2\mathbf{r}}{r^3}, \qquad \ddot{\mathbf{r}}_2 = \frac{Gm_1\mathbf{r}}{r^3}. \]
With the relative coordinate r = r1 − r2, it follows that
\[ \ddot{\mathbf{r}}_1 = -\frac{Gm_2(\mathbf{r}_1 - \mathbf{r}_2)}{r^3} \quad\text{and}\quad \ddot{\mathbf{r}}_2 = \frac{Gm_1(\mathbf{r}_1 - \mathbf{r}_2)}{r^3}. \]
In the center-of-mass system, we have $m_1\mathbf{r}_1 = -m_2\mathbf{r}_2$, and hence
\[ \ddot{\mathbf{r}}_1 = -\frac{G(m_1 + m_2)\mathbf{r}_1}{r^3} \quad\text{and}\quad \ddot{\mathbf{r}}_2 = -\frac{G(m_1 + m_2)\mathbf{r}_2}{r^3}. \]
Subtraction yields
\[ \ddot{\mathbf{r}} = \ddot{\mathbf{r}}_1 - \ddot{\mathbf{r}}_2 = -\frac{G(m_1 + m_2)\mathbf{r}}{r^3}. \]
Since the magnitudes satisfy
\[ r_1 = \frac{m_2}{m_1 + m_2}\,r \quad\text{and}\quad r_2 = \frac{m_1}{m_1 + m_2}\,r, \]
it follows that
\[ \ddot{\mathbf{r}}_1 = -\frac{Gm_2^3\,\mathbf{r}_1}{(m_1 + m_2)^2\,r_1^3} \quad\text{and}\quad \ddot{\mathbf{r}}_2 = -\frac{Gm_1^3\,\mathbf{r}_2}{(m_1 + m_2)^2\,r_2^3}. \]
Hence, Newton's gravitational law holds with respect to the center of gravity, but
with modified mass factors. This means that the trajectories are conic sections as before
(paths relative to the center of gravity S). Because of the superimposed translation of the
center of gravity, the trajectories become spirals in space.
EXERCISE
6.5 Atwood's Fall Machine
Problem. Two masses (m1 = 2 kg and m2 = 4 kg) are connected by a massless rope
that runs without sliding over a disk of mass M = 2 kg and radius R = 0.4 m, which turns
without friction about its axis (Atwood's machine). Find the acceleration of the mass
m2 = 4 kg if the system moves under the influence of gravitation.
Fig. 6.8.
Solution. For the given masses m1 = 2 kg, m2 = 4 kg and the tension forces at the
rope ends N1 and N2 , it follows that
m1 a1 = N1 − m1 g, m2 a2 = m2 g − N2 , (6.12)
and for the torques acting on the disk, we get
D1 + D2 = −N1 R + N2 R = R(N2 − N1 ) = ω̇θs , (6.13)
since the disk is accelerated. θs is the moment of inertia of the disk. From this, it
follows that N2 = N1 ; otherwise, there is no motion at all. For the accelerations, we
have
a = a1 = a2 = ω̇R, (6.14)
since the rope is tight and does not slide, i.e., it adheres to the disk.
Inserting the moment of inertia of the disk θs = MR 2 /2 (see Example 11.7)
into (6.13) and using (6.14) yields for the acceleration
N1 N2 R2
a= −g=g− = ω̇R = (N2 − N1 ). (6.15)
m1 m2 MR 2 /2
Inserting (6.12) and performing the algebraic steps yields
\[
a = \frac{2}{M}(N_2 - N_1) = \frac{g(m_2 - m_1) - m_2a_2 - m_1a_1}{M/2}
\]
and, because $a = a_1 = a_2$,
\[
0 = \frac{aM/2 - g(m_2 - m_1) + a(m_2 + m_1)}{M/2}
  = \frac{a(m_1 + m_2 + M/2) - g(m_2 - m_1)}{M/2}
\]
\[
\Rightarrow\quad a = \frac{g(m_2 - m_1)}{m_1 + m_2 + M/2}.
\]
The Atwoods machine serves as a transparent and easily controllable demonstration
of the laws of free fall. By varying the difference of the masses (m2 − m1 ), the accel-
eration a can be varied.
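As a numerical illustration of the result above, the following minimal sketch evaluates the acceleration for the masses given in the problem. The function name and the value g = 9.81 m/s^2 are assumptions made here for illustration; the text does not fix a value for g. The radius R does not appear because it cancels in the final formula.

```python
def atwood_acceleration(m1, m2, disk_mass, g=9.81):
    """Acceleration a = g (m2 - m1) / (m1 + m2 + M/2) of the heavier mass m2."""
    return g * (m2 - m1) / (m1 + m2 + disk_mass / 2.0)


# Masses from the problem statement; g = 9.81 m/s^2 is an assumed value.
a_disk = atwood_acceleration(m1=2.0, m2=4.0, disk_mass=2.0)

# Comparison case (not in the text): an idealized massless disk, M = 0.
a_massless = atwood_acceleration(m1=2.0, m2=4.0, disk_mass=0.0)

print(f"a with M = 2 kg disk: {a_disk:.3f} m/s^2")
print(f"a with massless disk: {a_massless:.3f} m/s^2")
```

With these numbers the acceleration is 2g/7, roughly 2.8 m/s^2; the rotational inertia of the disk reduces it compared with the massless case g/3.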
EXERCISE
6.6 Our Solar System in the Milky Way
Problem. Our solar system is about $r_0 \approx 5 \cdot 10^{20}$ m away from the center of the Milky
Way, and its orbital velocity relative to the galactic center is $v_0 \approx 3 \cdot 10^{5}$ m/s. This is
schematically shown in Fig. 6.9.
(a) Determine the mass M of our galaxy.
(b) Discuss the hypothesis that the motion of our solar system is a consequence of the
contraction of our Milky Way (see Fig. 6.9), and then verify $r_0 = GM/v_0^2$. Here
$G = 6.7 \cdot 10^{-11}\ \mathrm{m^3\,s^{-2}\,kg^{-1}}$ is the gravitational constant.
Fig. 6.9.
Solution. (a) If a mass point moves on a circular path, then according to Newton
the force per unit mass equals the acceleration. Since our sun (mass m) is at the pe-
riphery of our Milky Way, the attractive force toward the center can approximately be
represented by
\[
F = G\,\frac{mM}{r_0^2}, \tag{6.16}
\]
where m is the solar mass and M is the mass of the Milky Way. The acceleration
points toward the center,
\[
a = \frac{v_0^2}{r_0} = \frac{F}{m}, \tag{6.17}
\]
from which it follows that
\[
\frac{v_0^2}{r_0} = \frac{GM}{r_0^2} \quad\text{or}\quad r_0 = \frac{GM}{v_0^2}. \tag{6.18}
\]
Using the numbers given in the formulation of the problem, one gets from equation
(6.18) the mass of our Milky Way:
\[
M = \frac{r_0 v_0^2}{G} \approx \frac{5 \cdot 10^{20} \cdot 9 \cdot 10^{10}}{6.7 \cdot 10^{-11}}\ \mathrm{kg} = 6.7 \cdot 10^{41}\ \mathrm{kg}.
\]
This means that the mass of the Milky Way is
\[
M \approx 3 \cdot 10^{11}\, m,
\]
where m is the solar mass.
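The arithmetic of part (a) can be retraced with a short sketch. The constants r0, v0, and G are the rounded values from the problem statement; the solar mass of 2e30 kg is an assumed round value that the text does not state, and the function name is chosen here for illustration only.

```python
G = 6.7e-11        # gravitational constant in m^3 s^-2 kg^-1 (value from the text)
R0 = 5e20          # distance of the solar system from the galactic center in m
V0 = 3e5           # orbital velocity in m/s
SOLAR_MASS = 2e30  # kg; assumed round value, not given in the text


def central_mass(r0, v0, g_const=G):
    """Mass M from the circular-orbit condition v0^2 / r0 = G M / r0^2."""
    return r0 * v0**2 / g_const


galaxy_mass = central_mass(R0, V0)
mass_in_suns = galaxy_mass / SOLAR_MASS

print(f"M = {galaxy_mass:.2e} kg = {mass_in_suns:.1e} solar masses")
```

The result reproduces the order of magnitude quoted above, about 6.7e41 kg or roughly 3e11 solar masses.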

(b) If r, v are the initial values for the distance and velocity of our sun, for the
available energies we have
\[
V_{\mathrm{pot}} = -\frac{GMm}{r} \quad\text{and}\quad T_{\mathrm{kin}} = \frac{1}{2}mv^2, \tag{6.19}
\]
where M is the mass of the Milky Way, and G is the gravitational constant. If the
sun moves with decreasing radius about the center of the Milky Way, the angular
momentum about the center remains constant; however, the orbital velocity increases.
Hence, the kinetic energy Tkin can be given as a function of the radius
\[
T = \frac{1}{2}m\frac{l^2}{m^2r^2} = \frac{1}{2}\frac{l^2}{m}\frac{1}{r^2}, \tag{6.20}
\]
where we used $l = (mr^2)\omega = mvr = \text{constant}$.
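The assumption is now that at the present distance r the increase in the kinetic
energy $\Delta T_{\mathrm{kin}}$ is balanced by the decrease in the potential energy if r is reduced by $\Delta r$.
Differentiation of (6.19) and (6.20) with respect to r yields:
\[
\Delta T_{\mathrm{kin}} = \frac{dT_{\mathrm{kin}}}{dr}\,\Delta r = -\frac{l^2}{m}\frac{1}{r^3}\,\Delta r,
\qquad \Delta T_{\mathrm{kin}} > 0 \ \text{if}\ \Delta r < 0,
\]
\[
\Delta V_{\mathrm{pot}} = \frac{dV_{\mathrm{pot}}}{dr}\,\Delta r = \frac{GMm}{r^2}\,\Delta r,
\qquad \Delta V_{\mathrm{pot}} < 0 \ \text{if}\ \Delta r < 0.
\]
In the equilibrium, however, $\Delta T_{\mathrm{kin}} + \Delta V_{\mathrm{pot}} = 0$. Replacing l by $l = mv_0r_0$ yields
\[
\frac{m^2v_0^2r_0^2}{mr_0^3} = G\,\frac{Mm}{r_0^2} \quad\text{or}\quad r_0v_0^2 = MG. \tag{6.21}
\]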
Equation (6.21) again corresponds exactly to the result of problem (a).

Part III
Vibrating Systems
7 Vibrations of Coupled Mass Points
As the first and simplest system of vibrating mass points, we consider the free
vibration of two mass points, fixed to two walls by springs of equal spring constant,
as shown in Fig. 7.1.
Fig. 7.1. Mass points coupled by springs
The two mass points shall have equal masses. The displacements from the rest
positions are denoted by x1 and x2 , respectively. We consider only vibrations along
the line connecting the mass points.
When mass 1 is displaced from its rest position, the spring fixed to the wall exerts
the force −kx1 on it, and the spring connecting the two mass points exerts the force
+k(x2 − x1 ). Thus, mass point 1 obeys the equation of motion
mẍ1 = −kx1 + k(x2 − x1 ). (7.1a)
Analogously, for the mass point 2 we have
mẍ2 = −kx2 − k(x2 − x1 ). (7.1b)
We first determine the possible frequencies of common vibration of the two particles.
The frequencies that are equal for all particles are called eigenfrequencies. The
related vibrational states are called eigenvibrations or normal vibrations. These
definitions generalize correspondingly to an N-particle system. We use the ansatz
x1 = A1 cos ωt, x2 = A2 cos ωt, (7.2)
i.e., both particles shall vibrate with the same frequency ω. The specific type of the
ansatz, be it a sine or cosine function or a superposition of both, is not essential.
We would always get the same condition for the frequency, as can be seen from the
following calculation.

Insertion of the ansatz into the equations of motion yields two linear homogeneous
equations for the amplitudes:
\[
\begin{aligned}
A_1(-m\omega^2 + 2k) - A_2k &= 0,\\
-A_1k + A_2(-m\omega^2 + 2k) &= 0.
\end{aligned}
\tag{7.3}
\]
The system of equations has nontrivial solutions for the amplitudes only if the deter-
minant of coefficients D vanishes:

\[
D = \begin{vmatrix} -m\omega^2 + 2k & -k \\ -k & -m\omega^2 + 2k \end{vmatrix}
  = (-m\omega^2 + 2k)^2 - k^2 = 0.
\]
We thus obtain an equation for determining the frequencies:
\[
\omega^4 - 4\frac{k}{m}\omega^2 + 3\frac{k^2}{m^2} = 0.
\]
The positive solutions of the equation are the frequencies

\[
\omega_1 = \sqrt{\frac{3k}{m}} \quad\text{and}\quad \omega_2 = \sqrt{\frac{k}{m}}.
\]
These frequencies are called eigenfrequencies of the system; the corresponding vibra-
tions are called eigenvibrations or normal vibrations. To get an idea about the type
of the normal vibrations, we insert the eigenfrequency into the system (7.3). For the
amplitudes, we find

\[
A_1 = -A_2 \quad\text{for}\quad \omega_1 = \sqrt{\frac{3k}{m}}
\]
and
\[
A_1 = A_2 \quad\text{for}\quad \omega_2 = \sqrt{\frac{k}{m}}.
\]
The two mass points vibrate in-phase with the lower frequency ω2 , and with
the higher frequency ω1 against each other. The two vibration modes are illustrated
by Fig. 7.2.
Fig. 7.2.
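The steps above (frequency equation, then amplitude ratios from (7.3)) can be retraced numerically. The following is a minimal sketch; the function name and the sample values k = 2, m = 0.5 are arbitrary choices made here, not values from the text.

```python
import math


def normal_modes(k, m):
    """Solve w^4 - 4 (k/m) w^2 + 3 (k/m)^2 = 0 and return (omega, A2/A1) pairs."""
    s = k / m
    disc = math.sqrt((4 * s) ** 2 - 4 * 3 * s**2)  # discriminant of the quadratic in w^2
    modes = []
    for w_squared in ((4 * s + disc) / 2, (4 * s - disc) / 2):
        # First line of (7.3): A1 (-m w^2 + 2k) - A2 k = 0  =>  A2/A1 = (2k - m w^2)/k
        ratio = (2 * k - m * w_squared) / k
        modes.append((math.sqrt(w_squared), ratio))
    return modes


modes = normal_modes(k=2.0, m=0.5)
for omega, ratio in modes:
    print(f"omega = {omega:.4f}, A2/A1 = {ratio:+.1f}")
```

The higher frequency comes out with A2/A1 = −1 (opposite phase) and the lower one with A2/A1 = +1 (in phase), in agreement with the discussion above.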
The number of normal vibrations equals the number of coordinates (degrees of
freedom) that are necessary for a complete description of the system. This is a
consequence of the fact that for N degrees of freedom there appear N equations of the
kind (7.2) and N equations of motion of the kind (7.1a), (7.1b). This leads to an
N × N determinant, whose vanishing is an equation of degree N in ω², and therefore in
general to N normal frequencies. Since
we have restricted ourselves in the example to the vibrations along the x-axis, the two
coordinates x1 and x2 are sufficient to describe the system, and we obtain the two
eigenvibrations with the frequencies ω1 , ω2 .
In our example, the normal vibrations mean in-phase or opposite-phase (= in-phase
with different sign of the amplitudes) oscillations of the mass points. The amplitudes
of equal size are related to the equality of masses (m1 = m2 ). The general motion
of the mass points corresponds to a superposition of the normal modes with different
phase and amplitude.
The differential equations (7.1a), (7.1b) are linear. The general form of the vibration
is therefore the superposition of the normal modes. It reads
\[
\begin{aligned}
x_1(t) &= C_1\cos(\omega_1t + \varphi_1) + C_2\cos(\omega_2t + \varphi_2),\\
x_2(t) &= -C_1\cos(\omega_1t + \varphi_1) + C_2\cos(\omega_2t + \varphi_2).
\end{aligned}
\tag{7.4}
\]
Here, we have already used the result that x1 and x2 have equal and opposite
amplitudes for a pure ω1 -vibration, and equal amplitudes for a pure ω2 -vibration. This
ensures that the special cases of the pure normal vibrations with C2 = 0, C1 ≠ 0 and
C1 = 0, C2 ≠ 0 are included in the ansatz (7.4). Equation (7.4) is the most general
ansatz since it involves 4 free constants. Thus one can incorporate any initial values
for x1 (0), x2 (0), ẋ1 (0), ẋ2 (0).
For example, let the initial conditions be
x1 (0) = 0, x2 (0) = a, ẋ1 (0) = ẋ2 (0) = 0.
To determine the 4 free constants C1 , C2 , ϕ1 , ϕ2 , we insert the initial conditions
into the equations (7.4) and their derivatives:
x1 (0) = C1 cos ϕ1 + C2 cos ϕ2 = 0, (7.5)

x2 (0) = −C1 cos ϕ1 + C2 cos ϕ2 = a, (7.6)
ẋ1 (0) = −C1 ω1 sin ϕ1 − C2 ω2 sin ϕ2 = 0, (7.7)
ẋ2 (0) = C1 ω1 sin ϕ1 − C2 ω2 sin ϕ2 = 0. (7.8)
Addition of (7.7) and (7.8) yields
C2 sin ϕ2 = 0.
Subtraction of (7.7) and (7.8) yields
C1 sin ϕ1 = 0.
From addition and subtraction of (7.5) and (7.6), it follows that
2C2 cos ϕ2 = a and 2C1 cos ϕ1 = −a.
Thus, one obtains

\[
\varphi_1 = \varphi_2 = 0, \quad C_1 = -\frac{a}{2}, \quad C_2 = \frac{a}{2}.
\]
The overall solution therefore reads

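\[
x_1(t) = \frac{a}{2}(-\cos\omega_1t + \cos\omega_2t)
       = a\sin\left(\frac{\omega_1 - \omega_2}{2}\,t\right)\sin\left(\frac{\omega_1 + \omega_2}{2}\,t\right),
\]
\[
x_2(t) = \frac{a}{2}(\cos\omega_1t + \cos\omega_2t)
       = a\cos\left(\frac{\omega_1 - \omega_2}{2}\,t\right)\cos\left(\frac{\omega_1 + \omega_2}{2}\,t\right).
\]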
For t = 0: x1 (0) = 0, x2 (0) = a, as required. The second mass plucks at the first one
and causes it to vibrate. These are beat vibrations (see Exercise 7.2).
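The equality of the sum form and the product form of this solution can be checked numerically. The sketch below is only an illustration: the function names and the sample values a = 1, k = 1, m = 1 are assumptions made here.

```python
import math


def motion_sum_form(t, a, k, m):
    """x1, x2 as a superposition of the two normal modes (x1(0) = 0, x2(0) = a)."""
    w1, w2 = math.sqrt(3 * k / m), math.sqrt(k / m)
    x1 = 0.5 * a * (-math.cos(w1 * t) + math.cos(w2 * t))
    x2 = 0.5 * a * (math.cos(w1 * t) + math.cos(w2 * t))
    return x1, x2


def motion_product_form(t, a, k, m):
    """The same motion written as a product of slow and fast oscillations."""
    w1, w2 = math.sqrt(3 * k / m), math.sqrt(k / m)
    slow = 0.5 * (w1 - w2) * t
    fast = 0.5 * (w1 + w2) * t
    return a * math.sin(slow) * math.sin(fast), a * math.cos(slow) * math.cos(fast)


# Largest deviation between the two forms on a grid of sample times.
max_difference = max(
    abs(p - q)
    for i in range(200)
    for p, q in zip(motion_sum_form(0.1 * i, 1.0, 1.0, 1.0),
                    motion_product_form(0.1 * i, 1.0, 1.0, 1.0))
)
print(f"largest deviation between the two forms: {max_difference:.2e}")
```

Both forms agree to rounding accuracy, and at t = 0 they reproduce the initial displacements x1 = 0, x2 = a.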
EXERCISE
7.1 Two Equal Masses Coupled by Two Equal Springs
Problem. Two equal masses move without friction on a plate. They are connected
to each other and to the wall by two springs, as is indicated by Fig. 7.3. The two
spring constants are equal, and the motion shall be restricted to a straight line
(one-dimensional motion).
Find
(a) the equations of motion,
(b) the normal frequencies, and
(c) the amplitude ratios of the normal vibrations and the general solution.
Fig. 7.3.
Solution. (a) Let x1 and x2 be the displacements from the rest positions. The equa-
tions of motion then read
mẍ1 = −kx1 + k(x2 − x1 ), (7.9)

mẍ2 = −k(x2 − x1 ). (7.10)
(b) For determining the normal frequencies, we use the ansatz
x1 = A1 cos ωt, x2 = A2 cos ωt
and thereby get from (7.9) and (7.10) the equations
\[
\begin{aligned}
(2k - m\omega^2)A_1 - kA_2 &= 0,\\
-kA_1 + (k - m\omega^2)A_2 &= 0.
\end{aligned}
\tag{7.11}
\]
From the requirement for nontrivial solutions of the system of equations, it follows
that the determinant of coefficients vanishes:

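\[
D = \begin{vmatrix} 2k - m\omega^2 & -k \\ -k & k - m\omega^2 \end{vmatrix} = 0.
\]
From this follows the determining equation for the eigenfrequencies,
\[
\omega^4 - 3\frac{k}{m}\omega^2 + \frac{k^2}{m^2} = 0,
\]
with the positive solutions
\[
\omega_1 = \frac{\sqrt{5} + 1}{2}\sqrt{\frac{k}{m}} \quad\text{and}\quad
\omega_2 = \frac{\sqrt{5} - 1}{2}\sqrt{\frac{k}{m}}, \qquad \omega_1 > \omega_2.
\]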
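(c) By inserting the eigenfrequencies in (7.11) one sees that the higher frequency
ω1 corresponds to the opposite-phase mode, and the lower frequency ω2 to the
equal-phase normal vibration:
\[
\text{with}\quad \omega_1^2 = \frac{1}{2}\left(3 + \sqrt{5}\right)\frac{k}{m},
\quad\text{it follows from (7.11) that}\quad A_2 = -\frac{\sqrt{5} - 1}{2}A_1,
\]
\[
\text{with}\quad \omega_2^2 = \frac{1}{2}\left(3 - \sqrt{5}\right)\frac{k}{m},
\quad\text{it follows from (7.11) that}\quad A_2 = \frac{\sqrt{5} + 1}{2}A_1.
\]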
Since the two mass points are fixed in different ways, we find amplitudes of different
magnitudes.
The general solution is obtained as a superposition of the normal vibrations, using
the calculated amplitude ratios:
\[
x_1(t) = C_1\cos(\omega_1t + \varphi_1) + C_2\cos(\omega_2t + \varphi_2),
\]
\[
x_2(t) = -\frac{\sqrt{5} - 1}{2}C_1\cos(\omega_1t + \varphi_1) + \frac{\sqrt{5} + 1}{2}C_2\cos(\omega_2t + \varphi_2).
\]
The 4 free constants are determined from the initial conditions of the specific case.
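The eigenfrequencies and amplitude ratios of this exercise can be cross-checked against both lines of (7.11). The following sketch assumes sample values k = m = 1 and a function name chosen here for illustration.

```python
import math


def exercise_7_1_modes(k, m):
    """Return (omega, A2/A1) for w^2 = (3 +/- sqrt(5))/2 * k/m."""
    modes = []
    for sign in (+1, -1):
        w_squared = 0.5 * (3 + sign * math.sqrt(5)) * k / m
        # First line of (7.11): (2k - m w^2) A1 - k A2 = 0
        ratio = (2 * k - m * w_squared) / k
        modes.append((math.sqrt(w_squared), ratio))
    return modes


k, m = 1.0, 1.0
modes = exercise_7_1_modes(k, m)

# Residual of the second line of (7.11), -k A1 + (k - m w^2) A2, with A1 = 1.
residuals = [-k + (k - m * omega**2) * ratio for omega, ratio in modes]

for (omega, ratio), res in zip(modes, residuals):
    print(f"omega = {omega:.4f}, A2/A1 = {ratio:+.4f}, residual = {res:.1e}")
```

For k = m = 1 the frequencies are (√5 + 1)/2 and (√5 − 1)/2, and the second equation of (7.11) is satisfied to rounding accuracy for both modes.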
EXERCISE
7.2 Coupled Pendulums
Problem. Two pendulums of equal mass and length are connected by a spiral spring.
They vibrate in a plane. The coupling is weak (i.e., the two eigenmodes are not very
different). Find the motion with small amplitudes.
Fig. 7.4.
Solution. The initial conditions are
x1 (0) = 0, x2 (0) = A, ẋ1 (0) = ẋ2 (0) = 0.

We start from the vibrational equation of the simple pendulum:
ml α̈ = −mg sin α.
For small amplitudes, we set $\sin\alpha \approx \alpha = x/l$ and obtain
\[
m\ddot{x} = -m\frac{g}{l}x.
\]
Fig. 7.5.
For the coupled pendulums, the spring force ∓k(x1 − x2 ) enters in addition,
which leads to the equations
\[
\begin{aligned}
\ddot{x}_1 &= -\frac{g}{l}x_1 - \frac{k}{m}(x_1 - x_2),\\
\ddot{x}_2 &= -\frac{g}{l}x_2 + \frac{k}{m}(x_1 - x_2).
\end{aligned}
\tag{7.12}
\]
This coupled set of differential equations can be decoupled by introducing the coordi-
nates
u1 = x1 − x2 and u2 = x1 + x2 .
Subtraction and addition of the equations (7.12) yield

\[
\ddot{u}_1 = -\frac{g}{l}u_1 - 2\frac{k}{m}u_1 = -\left(\frac{g}{l} + 2\frac{k}{m}\right)u_1,
\]
\[
\ddot{u}_2 = -\frac{g}{l}u_2.
\]
These two equations can be solved immediately:
u1 = A1 cos ω1 t + B1 sin ω1 t,
(7.13)
u2 = A2 cos ω2 t + B2 sin ω2 t,
where $\omega_1 = \sqrt{g/l + 2(k/m)}$, $\omega_2 = \sqrt{g/l}$ are the eigenfrequencies of the two
vibrations. The coordinates u1 , u2 are called normal coordinates. Normal coordinates are
often introduced to decouple a coupled system of differential equations. The coor-
dinate u1 = x1 − x2 describes the opposite-phase and u2 = x1 + x2 the equal-phase
normal vibration. The equal-phase normal mode proceeds as if the coupling were ab-
sent.
For the sake of simplicity, we incorporate the initial conditions in (7.13). For the
normal coordinates we then have
u1 (0) = −A, u2 (0) = A, u̇1 (0) = u̇2 (0) = 0,
so that
A1 = −A, A2 = A, B1 = B2 = 0,
and thus,
u1 = −A cos ω1 t, u2 = A cos ω2 t.
Returning to the coordinates x1 and x2 :
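\[
x_1 = \frac{1}{2}(u_1 + u_2) = \frac{A}{2}(-\cos\omega_1t + \cos\omega_2t),
\]
\[
x_2 = \frac{1}{2}(u_2 - u_1) = \frac{A}{2}(\cos\omega_1t + \cos\omega_2t).
\]
After transforming the angular functions, one has
\[
x_1 = A\sin\left(\frac{\omega_1 - \omega_2}{2}\,t\right)\sin\left(\frac{\omega_1 + \omega_2}{2}\,t\right),
\]
\[
x_2 = A\cos\left(\frac{\omega_1 - \omega_2}{2}\,t\right)\cos\left(\frac{\omega_1 + \omega_2}{2}\,t\right).
\]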
We have presupposed the coupling of the two pendulums to be weak, i.e.,

\[
\omega_2 = \sqrt{\frac{g}{l}} \approx \omega_1 = \sqrt{\frac{g}{l} + 2\frac{k}{m}},
\]
hence, the frequency ω1 − ω2 is small. The vibrations x1 (t) and x2 (t) can then be
interpreted as follows: The amplitude factor of the pendulum vibrating with the fre-
quency ω1 + ω2 is slowly modulated by the frequency ω1 − ω2 . This process is called
beat vibration. Figure 7.6 illustrates the process. The two pendulums exchange their
energy with the amplitude modulation frequency ω1 − ω2 . If one pendulum reaches
its maximum amplitude (energy), the other pendulum comes to rest. This complete
energy transfer occurs only for identical pendulums. If the pendulums differ in mass
or length, the energy transfer becomes incomplete; the pendulums vary in amplitudes
but without coming to rest.
Fig. 7.6.
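The beat behavior described above can be made concrete with a short sketch. All numerical values (g = 9.81 m/s^2, l = 1 m, k = 0.5 N/m, m = 1 kg) and the function names are assumptions chosen here so that the coupling is weak; they do not come from the text.

```python
import math


def coupled_pendulum_frequencies(g, l, k, m):
    """Return (omega_1, omega_2): opposite-phase and equal-phase eigenfrequencies."""
    return math.sqrt(g / l + 2.0 * k / m), math.sqrt(g / l)


def displacements(t, amplitude, w1, w2):
    """x1, x2 for x1(0) = 0, x2(0) = A and vanishing initial velocities."""
    slow = 0.5 * (w1 - w2) * t
    fast = 0.5 * (w1 + w2) * t
    return (amplitude * math.sin(slow) * math.sin(fast),
            amplitude * math.cos(slow) * math.cos(fast))


w1, w2 = coupled_pendulum_frequencies(g=9.81, l=1.0, k=0.5, m=1.0)

# Time at which the slowly varying amplitude factor of pendulum 2 vanishes,
# i.e. the energy has been handed over completely to pendulum 1.
t_transfer = math.pi / (w1 - w2)
x1_at_transfer, x2_at_transfer = displacements(t_transfer, 1.0, w1, w2)

print(f"omega_1 = {w1:.4f} 1/s, omega_2 = {w2:.4f} 1/s")
print(f"complete transfer after t = {t_transfer:.2f} s, x2 = {x2_at_transfer:.1e}")
```

For weak coupling the two frequencies differ by only a few percent, so the hand-over of energy takes many individual swings.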
7.1 The Vibrating Chain¹
We consider another vibrating mass system: the vibrating chain. The “chain” is a mass-
less thread set with N mass points. All mass points have the mass m and are fixed to
the thread at equal distances a. The points 0 and N + 1 at the ends of the thread are
tightly fixed and do not participate in the vibration. The displacement from the rest po-
sition in y-direction is assumed to be relatively small, so that the minor displacement
in x-direction is negligible. The total string tension T is only due to the clamping of
the end points and is constant over the entire thread.
If one picks out the νth particle, the forces acting on this particle are due to the dis-
placements of the particles (ν − 1) and (ν + 1). According to Fig. 7.7 the backdriving
forces are given by
Fν−1 = −(T · sin α)e2 ,

Fν+1 = −(T · sin β)e2 .
Fig. 7.7.
Since the displacement in y-direction is small by definition, α and β are small

angles, and hence, one has, to a good approximation,
sin α = tan α and sin β = tan β.
From Fig. 7.7, one sees that
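\[
\tan\alpha = \frac{y_\nu - y_{\nu-1}}{a} \quad\text{and}\quad \tan\beta = \frac{y_\nu - y_{\nu+1}}{a}.
\]
Hence, the forces are given by
\[
\mathbf{F}_{\nu-1} = -T\,\frac{y_\nu - y_{\nu-1}}{a}\,\mathbf{e}_2,
\qquad
\mathbf{F}_{\nu+1} = -T\,\frac{y_\nu - y_{\nu+1}}{a}\,\mathbf{e}_2.
\]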
¹ It is recommended that the reader go through Chap. 8 (“The Vibrating String”) before studying
this section. The concepts presented here will be more easily understood, and the mathematical
approaches will be more transparent in their physical motivation.
The total backdriving force is the sum Fν−1 + Fν+1 , i.e., the equation of motion for
the particle reads

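\[
m\frac{d^2y_\nu}{dt^2}\,\mathbf{e}_2 = -T\,\frac{y_\nu - y_{\nu-1}}{a}\,\mathbf{e}_2 - T\,\frac{y_\nu - y_{\nu+1}}{a}\,\mathbf{e}_2
\]
or
\[
\frac{d^2y_\nu}{dt^2} = \frac{T}{ma}\,(y_{\nu-1} - 2y_\nu + y_{\nu+1}). \tag{7.14}
\]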
Since the index ν runs from ν = 1 to ν = N , one obtains a system of N coupled dif-
ferential equations. Considering that the endpoints are fixed, by setting for the indices
ν = 0 and ν = N + 1
y0 = 0 and yN+1 = 0 (boundary condition),
one obtains from the differential equation (7.14) with the indices ν = 1 and ν = N the
differential equation for the first and last particle that can participate in the vibration:
\[
\begin{aligned}
m\frac{d^2y_1}{dt^2} &= \frac{T}{a}\,(-2y_1 + y_2),\\
m\frac{d^2y_N}{dt^2} &= \frac{T}{a}\,(y_{N-1} - 2y_N).
\end{aligned}
\tag{7.15}
\]
We now look for the eigenfrequencies of the particle system, i.e., the frequencies
of vibration common to all particles. To get a determining equation for the eigenfre-
quency ωn , we introduce in (7.14) the ansatz
yν = Aν cos ωt. (7.16)
We obtain
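\[
-m\omega^2 A_\nu\cos\omega t = \frac{T}{a}\,(A_{\nu-1} - 2A_\nu + A_{\nu+1})\cos\omega t,
\]
and after rewriting,
\[
-A_{\nu-1} + \left(2 - \frac{ma\omega^2}{T}\right)A_\nu - A_{\nu+1} = 0, \qquad \nu = 2, \ldots, N - 1. \tag{7.17a}
\]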
By insertion of (7.16) into (7.15), we get the equations for the first and the last vibrat-
ing particle:

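\[
\begin{aligned}
\left(2 - \frac{ma\omega^2}{T}\right)A_1 - A_2 &= 0,\\
-A_{N-1} + \left(2 - \frac{ma\omega^2}{T}\right)A_N &= 0.
\end{aligned}
\tag{7.17b}
\]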
With the abbreviation
\[
\frac{2T - ma\omega^2}{T} = c, \tag{7.18}
\]
(7.17a) and (7.17b) can be rewritten as follows:
\[
\begin{aligned}
cA_1 - A_2 &= 0\\
-A_1 + cA_2 - A_3 &= 0\\
-A_2 + cA_3 - A_4 &= 0\\
&\;\;\vdots\\
-A_{N-1} + cA_N &= 0.
\end{aligned}
\]
This is a system of homogeneous linear equations for the coefficients Aν . For any
nontrivial solution of the equation system (not all Aν = 0) the determinant of coeffi-
cients must vanish. This determinant has the form

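\[
D_N = \begin{vmatrix}
c & -1 & 0 & 0 & \cdots & 0 & 0 & 0\\
-1 & c & -1 & 0 & \cdots & 0 & 0 & 0\\
0 & -1 & c & -1 & \cdots & 0 & 0 & 0\\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots\\
0 & 0 & 0 & 0 & \cdots & -1 & c & -1\\
0 & 0 & 0 & 0 & \cdots & 0 & -1 & c
\end{vmatrix}.
\]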
It has N rows and N columns. The eigenfrequencies are obtained as solution of the
equation
DN = 0.
Expanding DN with respect to the first row, we get

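\[
D_N = c \cdot \begin{vmatrix}
c & -1 & 0 & \cdots & 0 & 0 & 0\\
-1 & c & -1 & \cdots & 0 & 0 & 0\\
0 & -1 & c & \cdots & 0 & 0 & 0\\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots\\
0 & 0 & 0 & \cdots & c & -1 & 0\\
0 & 0 & 0 & \cdots & -1 & c & -1\\
0 & 0 & 0 & \cdots & 0 & -1 & c
\end{vmatrix}
+ \begin{vmatrix}
-1 & -1 & 0 & 0 & \cdots & 0 & 0\\
0 & c & -1 & 0 & \cdots & 0 & 0\\
0 & -1 & c & -1 & \cdots & 0 & 0\\
0 & 0 & -1 & c & \cdots & 0 & 0\\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots\\
0 & 0 & 0 & 0 & \cdots & c & -1\\
0 & 0 & 0 & 0 & \cdots & -1 & c
\end{vmatrix}.
\]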
The left-hand determinant has exactly the same form as DN , but is lower by one
order (N − 1 rows, N − 1 columns). It would be the determinant of coefficients for
a similar system with one mass point less, i.e., DN−1 . The right-hand determinant is
now expanded with respect to the first column, which leads to

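\[
D_N = cD_{N-1} + (-1) \cdot \begin{vmatrix}
c & -1 & 0 & \cdots & 0 & 0\\
-1 & c & -1 & \cdots & 0 & 0\\
0 & -1 & c & \cdots & 0 & 0\\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots\\
0 & 0 & 0 & \cdots & c & -1\\
0 & 0 & 0 & \cdots & -1 & c
\end{vmatrix}.
\]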
The last determinant is just DN−2 . Hence we get the determinant recursion equa-
tion
DN = cDN−1 − DN−2 , if N ≥ 2. (7.19)
Moreover,

\[
D_1 = |c| = c \quad\text{and}\quad D_2 = \begin{vmatrix} c & -1 \\ -1 & c \end{vmatrix} = c^2 - 1. \tag{7.20}
\]
By setting N = 2 in (7.19), we recognize that (7.19) combined with (7.20) is satisfied

only if we formally set
D0 = 1. (7.21)
Our problem is now to solve the determinant equation (7.19). We use the ansatz
\[
D_N = p^N,
\]
where the constant p must be determined. Insertion into (7.19) yields
\[
p^N = cp^{N-1} - p^{N-2},
\]
and after division by $p^{N-2}$,
\[
p^2 - cp + 1 = 0 \quad\text{or}\quad p = \frac{c \pm \sqrt{c^2 - 4}}{2}.
\]
The mathematical possibility $p^{N-2} = 0$, which leads to $p \equiv 0$, does not obey the
boundary condition $D_0 = 1$ and is therefore inapplicable. Substituting $c = 2\cos\Theta$, we obtain
for p
\[
p = \cos\Theta \pm \sqrt{\cos^2\Theta - 1} = \cos\Theta \pm i\sin\Theta = e^{\pm i\Theta}.
\]
The solutions of (7.19) are then
\[
D_N = p^N = (e^{i\Theta})^N = e^{iN\Theta} = \cos N\Theta + i\sin N\Theta
\]
and
\[
D_N = (e^{-i\Theta})^N = e^{-iN\Theta} = \cos N\Theta - i\sin N\Theta.
\]
Since the equation system (7.19) is homogeneous and linear, the general solution is a
linear combination of $\cos N\Theta$ and $\sin N\Theta$:
\[
D_N = G\cos N\Theta + H\sin N\Theta. \tag{7.22}
\]
Since $D_0 = 1$ and $D_1 = c = 2\cos\Theta$ (see above), G and H are determined as
\[
G = 1, \qquad H = \cot\Theta,
\]
so that
\[
D_N = \cos N\Theta + \frac{\sin N\Theta\cos\Theta}{\sin\Theta} = \frac{\sin(N + 1)\Theta}{\sin\Theta},
\]
because $\sin\Theta\cos N\Theta + \sin N\Theta\cos\Theta = \sin(N + 1)\Theta$.
For any nontrivial solution of the equation system we must have $D_N = 0$, i.e., the
determinant must vanish for the given number N of mass points; it follows that
\[
\sin((N + 1)\Theta) = 0,
\]
or
\[
\Theta = \Theta_n = \frac{n\pi}{N + 1}, \qquad n = 1, \ldots, N. \tag{7.23}
\]
n = 0 drops out since it leads to the solution $\Theta_0 = 0$, and hence to $D_N = N + 1 \neq 0$,
and thus does not lead to a solution of the equation $D_N = 0$. For c we then get
according to (7.18):
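\[
c = 2 - \frac{\omega^2ma}{T} = 2\cos\frac{n\pi}{N + 1},
\]
and ω is calculated from
\[
\omega^2 = \omega_{(n)}^2 = \frac{2T}{ma}\left(1 - \cos\frac{n\pi}{N + 1}\right) \tag{7.24a}
\]
as
\[
\omega_{(n)} = \sqrt{\frac{2T}{ma}\left(1 - \cos\frac{n\pi}{N + 1}\right)}. \tag{7.24b}
\]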
ma N +1
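These are the eigenfrequencies of the system; the fundamental frequency is obtained
for n = 1 as the lowest eigenfrequency. There are exactly N eigenfrequencies, as is
seen from (7.23): For n ≥ N + 1, we set n = (N + 1) + τ and find
\[
\Theta_n = \pi + \frac{\tau\pi}{N + 1}.
\]
Since $\cos(\pi + \tau\pi/(N + 1)) = \cos(\pi - \tau\pi/(N + 1)) = \cos\Theta_{N+1-\tau}$, the values
τ = 1, . . . , N only reproduce frequencies that were already obtained.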
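The eigenfrequency formula (7.24b) can be checked against the determinant recursion (7.19): for each ω(n), the value c from (7.18) should make D_N vanish. The following is a minimal sketch; the function names and the sample values N = 5, T = 2, m = 0.1, a = 0.25 are assumptions chosen here for illustration.

```python
import math


def chain_eigenfrequencies(num_masses, tension, mass, spacing):
    """omega_(n) from (7.24b) for n = 1, ..., N."""
    prefactor = 2.0 * tension / (mass * spacing)
    return [
        math.sqrt(prefactor * (1.0 - math.cos(n * math.pi / (num_masses + 1))))
        for n in range(1, num_masses + 1)
    ]


def chain_determinant(num_masses, c):
    """D_N from the recursion D_N = c D_{N-1} - D_{N-2} with D_0 = 1, D_1 = c."""
    if num_masses == 0:
        return 1.0
    d_prev, d_curr = 1.0, c
    for _ in range(2, num_masses + 1):
        d_prev, d_curr = d_curr, c * d_curr - d_prev
    return d_curr


N, T, m, a = 5, 2.0, 0.1, 0.25
omegas = chain_eigenfrequencies(N, T, m, a)

# c = 2 - m a omega^2 / T according to (7.18); D_N should vanish for every mode.
determinants = [chain_determinant(N, 2.0 - m * a * w**2 / T) for w in omegas]

for n, (w, d) in enumerate(zip(omegas, determinants), start=1):
    print(f"n = {n}: omega = {w:.4f}, D_N = {d:.1e}")
```

The N frequencies come out in increasing order, and each of them makes the determinant vanish to rounding accuracy.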
If one inserts the above expression into (7.17a) and (7.17b) for ω and c, respectively,
one obtains for the amplitudes of the normal vibration
\[
\begin{aligned}
-A_{\nu-1}^{(n)} + 2A_\nu^{(n)}\cos\frac{n\pi}{N + 1} - A_{\nu+1}^{(n)} &= 0,\\
2A_1^{(n)}\cos\frac{n\pi}{N + 1} &= A_2^{(n)},\\
2A_N^{(n)}\cos\frac{n\pi}{N + 1} &= A_{N-1}^{(n)},
\end{aligned}
\tag{7.25}
\]
where the $A_\nu$ depend on n ($A_\nu = A_\nu^{(n)}$). The system of equations (7.25) for the $A_\nu$
is the same as that for the determinants $D_N$ (equation (7.19)), with the same
coefficient $c = 2\cos(n\pi/(N + 1)) = 2\cos\Theta_n$. Only the boundary conditions (7.25) do not
correspond to those for the $D_N$ (see (7.20) and (7.21)). The general solution for the
coefficients $A_\nu$ is therefore obtained from (7.22) with at first arbitrary coefficients
$E^{(n)}$:
\[
A_\nu^{(n)} = E_1^{(n)}\cos\nu\Theta_n + E_2^{(n)}\sin\nu\Theta_n,
\]
or, in detail,
\[
A_\nu^{(n)} = E_1^{(n)}\cos\frac{n\pi\nu}{N + 1} + E_2^{(n)}\sin\frac{n\pi\nu}{N + 1}. \tag{7.26}
\]
Since the points ν = 0 and ν = N + 1 are tightly clamped, for all eigenmodes n we
have $y_0 = y_{N+1} = 0$, or
\[
A_0^{(n)} = A_{N+1}^{(n)} = 0 \quad\text{(boundary condition)}.
\]
Then one obtains for ν = 0 in (7.26):
\[
E_1^{(n)} = 0, \quad\text{i.e.,}\quad A_\nu^{(n)} = E_2^{(n)}\sin\frac{n\pi\nu}{N + 1}.
\]
After insertion into (7.16), one gets
\[
y_\nu^{(n)} = E_2^{(n)}\sin\frac{n\pi\nu}{N + 1}\cos\omega_{(n)}t. \tag{7.27}
\]
If one inserts $y_\nu = B_\nu\sin\omega t$ instead of (7.16) into (7.14), one determines $B_\nu$ by the
same method as $A_\nu$ and obtains
\[
B_\nu^{(n)} = E_4^{(n)}\sin\frac{n\pi\nu}{N + 1} \qquad (E_3^{(n)} = 0);
\]
hence, the solutions for the $y_\nu$ read
\[
y_\nu^{(n)} = E_2^{(n)}\sin\frac{n\pi\nu}{N + 1}\cos\omega_{(n)}t \tag{7.28a}
\]
and
\[
y_\nu^{(n)} = E_4^{(n)}\sin\frac{n\pi\nu}{N + 1}\sin\omega_{(n)}t. \tag{7.28b}
\]
The sum of these individual solutions yields the general solution, which therefore
reads
\[
\begin{aligned}
y_\nu &= \sum_{n=1}^{N}\sin\frac{n\pi\nu}{N + 1}\left(E_4^{(n)}\sin\omega_{(n)}t + E_2^{(n)}\cos\omega_{(n)}t\right)\\
      &= \sum_{n=1}^{N}\sin\frac{n\pi\nu}{N + 1}\left(a_n\sin\omega_{(n)}t + b_n\cos\omega_{(n)}t\right),
\end{aligned}
\tag{7.29}
\]
where the constants $E_2^{(n)}$ and $E_4^{(n)}$ were renamed $b_n$ and $a_n$, respectively. They are
determined from the initial conditions.
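That the sine-shaped amplitudes really solve the amplitude equations (7.17a), (7.17b) with clamped ends can be verified numerically. The sketch below sets the free constant E2 to 1 and uses N = 6; both choices and the function names are assumptions made here for illustration.

```python
import math


def mode_shape(n, num_masses):
    """Amplitudes A_0, ..., A_{N+1} of the n-th normal mode with E_2 = 1."""
    return [math.sin(n * math.pi * nu / (num_masses + 1))
            for nu in range(num_masses + 2)]


def max_residual(n, num_masses):
    """Largest |-A_{nu-1} + c A_nu - A_{nu+1}| over nu = 1..N, c = 2 cos(n pi/(N+1))."""
    c = 2.0 * math.cos(n * math.pi / (num_masses + 1))
    amp = mode_shape(n, num_masses)
    return max(abs(-amp[nu - 1] + c * amp[nu] - amp[nu + 1])
               for nu in range(1, num_masses + 1))


N = 6
worst = max(max_residual(n, N) for n in range(1, N + 1))
end_values = [abs(mode_shape(n, N)[N + 1]) for n in range(1, N + 1)]

print(f"largest residual over all modes: {worst:.1e}")
print(f"largest amplitude at the clamped end nu = N + 1: {max(end_values):.1e}")
```

Because the list includes the clamped points ν = 0 and ν = N + 1, the same residual expression also covers the first and the last movable mass.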
The equation of the vibrating chord must follow from the limit for N → ∞ and
a → 0 (continuous mass distribution):
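\[
\begin{aligned}
\sin\frac{n\pi\nu}{N + 1} &= \sin\frac{n\pi a\nu}{(N + 1)a} && (x_\nu = a\nu \text{ takes only discrete values})\\
&= \sin\frac{\pi n(a\nu)}{l + a} && (l = Na \text{ is the length of the chord}),
\end{aligned}
\]
\[
\lim_{\substack{N\to\infty\\ a\to 0}}\sin\frac{\pi nx}{l + a} = \sin\frac{\pi nx}{l} \qquad (x \text{ continuous}).
\]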
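$\omega_{(n)}^2$ becomes (expansion of the cosine in (7.24a) in a Taylor series):
\[
\omega_{(n)}^2 = \frac{2T}{ma}\left(1 - 1 + \frac{1}{2}\left(\frac{n\pi}{N + 1}\right)^2 - \cdots\right)
\approx \frac{T}{(m/a)}\frac{(n\pi)^2}{(N + 1)^2a^2},
\]
and with σ = m/a = mass density of the chord,
\[
\lim_{\substack{N\to\infty\\ a\to 0}}\frac{T}{\sigma}\frac{(n\pi)^2}{(N + 1)^2a^2} = \frac{T}{\sigma}\frac{(n\pi)^2}{l^2},
\]
i.e.,
\[
\omega_{(n)} = \sqrt{\frac{T}{\sigma}}\,\frac{n\pi}{l}.
\]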
Hence, one has as a limit

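\[
y_n(x) = \sin\frac{n\pi x}{l}\left(a_n\sin\left(\sqrt{\frac{T}{\sigma}}\,\frac{n\pi}{l}\,t\right) + b_n\cos\left(\sqrt{\frac{T}{\sigma}}\,\frac{n\pi}{l}\,t\right)\right). \tag{7.30}
\]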
This is the equation for the nth eigenmode of the vibrating chord (l is the chord length).
It will be derived once again in the next chapter in a different way and will then be
discussed in more detail.
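The limit N → ∞, a → 0 can be followed numerically by keeping the chord length l and the mass density σ fixed while increasing N. In the sketch below, a = l/N and m = σa follow the definitions in the text; the function names and the values T = σ = l = 1 are assumptions made here.

```python
import math


def discrete_frequency(n, num_masses, tension, sigma, length):
    """omega_(n) of the chain from (7.24b) with a = l/N and m = sigma * a."""
    spacing = length / num_masses
    mass = sigma * spacing
    return math.sqrt(2.0 * tension / (mass * spacing)
                     * (1.0 - math.cos(n * math.pi / (num_masses + 1))))


def string_frequency(n, tension, sigma, length):
    """Continuum limit omega_(n) = sqrt(T/sigma) * n pi / l."""
    return math.sqrt(tension / sigma) * n * math.pi / length


limit_value = string_frequency(1, 1.0, 1.0, 1.0)
errors = [abs(discrete_frequency(1, N, 1.0, 1.0, 1.0) - limit_value)
          for N in (10, 100, 1000)]

for N, err in zip((10, 100, 1000), errors):
    print(f"N = {N:4d}: |omega_chain - omega_string| = {err:.4f}")
```

The deviation of the fundamental frequency from the string value shrinks roughly in proportion to 1/N.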
EXERCISE
7.3 Eigenfrequencies of the Vibrating Chain
Problem. When solving the determinant equation (7.19), we have made a mathematical
restriction for c by setting c = 2 cos Θ.
Show that for the cases
(a) |c| = 2,
(b) c < −2
the eigenvalue equation DN = 0 cannot be satisfied. Clarify that thereby the special
choice of the constant c is justified.
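Solution. (a)
\[
D_n = cD_{n-1} - D_{n-2}, \qquad D_1 = c = \pm 2, \qquad D_0 = 1. \tag{7.31}
\]
We assert and prove by induction
\[
|D_n| \geq |D_{n-1}|. \tag{7.32}
\]
Induction start: n = 2, |D0 | = 1, |D1 | = 2, |D2 | = 3.
Induction conclusion from n − 1, n − 2 to n:
\[
\begin{aligned}
|D_n|^2 &= 4|D_{n-1}|^2 \pm 4|D_{n-1}||D_{n-2}| + |D_{n-2}|^2\\
        &\geq 4|D_{n-1}|^2 + |D_{n-2}|^2 - 4|D_{n-1}||D_{n-2}|
\end{aligned}
\]
\[
\Rightarrow\quad |D_n|^2 - |D_{n-1}|^2 \geq 3|D_{n-1}|^2 + |D_{n-2}|^2 - 4|D_{n-1}||D_{n-2}|.
\]
According to the induction condition,
\[
|D_{n-1}| = |D_{n-2}| + \varepsilon \quad\text{with}\quad \varepsilon \geq 0,
\]
so that
\[
\begin{aligned}
|D_n|^2 - |D_{n-1}|^2 &\geq 4|D_{n-2}|^2 + 6|D_{n-2}|\varepsilon + 3\varepsilon^2 - 4|D_{n-2}|\varepsilon - 4|D_{n-2}|^2\\
&\geq 2|D_{n-2}|\varepsilon\\
&\geq 0
\end{aligned}
\]
\[
\Rightarrow\quad |D_n| \geq |D_{n-1}|. \tag{7.33}
\]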
Since $|D_n|$ monotonically increases in n, and $|D_1| = 2 > 0$, we have $|D_N| > 0$. Therefore
$D_N = 0$ cannot be satisfied. ω = 0 and $\omega = 2\sqrt{T/ma}$ are not eigenfrequencies
of the vibrating chain.
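(b) By inserting the ansatz $D_n = Ap^n$, $p \neq 0$, we also find the solution of the
recursion formula $D_n = cD_{n-1} - D_{n-2}$, $D_1 = c$, $D_0 = 1$:
\[
\begin{aligned}
p_1 &= \tfrac{1}{2}\left[c + (c^2 - 4)^{1/2}\right] < 0,\\
p_2 &= \tfrac{1}{2}\left[c - (c^2 - 4)^{1/2}\right] < 0,
\end{aligned}
\qquad 0 > p_1 > p_2. \tag{7.34}
\]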
The general solution for incorporating the boundary conditions D0 = 1, D1 = c reads
Dn = A1 p1n + A2 p2n . (7.35)
With D0 = 1, D1 = c, it follows that
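\[
A_1 + A_2 = 1,
\]
\[
\frac{A_1}{2}\left[c + (c^2 - 4)^{1/2}\right] + \frac{A_2}{2}\left[c - (c^2 - 4)^{1/2}\right] = c,
\]
\[
A_1 = \frac{c + (c^2 - 4)^{1/2}}{2(c^2 - 4)^{1/2}} \quad\Leftrightarrow\quad A_2 = \frac{-c + (c^2 - 4)^{1/2}}{2(c^2 - 4)^{1/2}}. \tag{7.36}
\]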
One then has
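\[
\begin{aligned}
D_n &= \frac{1}{2}\,\frac{c + (c^2 - 4)^{1/2}}{(c^2 - 4)^{1/2}}\,p_1^n + \frac{1}{2}\,\frac{(c^2 - 4)^{1/2} - c}{(c^2 - 4)^{1/2}}\,p_2^n\\
    &= \frac{1}{(c^2 - 4)^{1/2}}\left(p_1^{n+1} - p_2^{n+1}\right).
\end{aligned}
\tag{7.37}
\]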
To determine the physically possible vibration modes, we had required that $D_N = 0$:
\[
D_N = 0 \quad\Rightarrow\quad \left(\frac{p_2}{p_1}\right)^{N+1} = 1. \tag{7.38}
\]
But now $0 > p_1 > p_2$, hence $(p_2/p_1)^{N+1} > 1$. Thus, for the case c < −2 no
eigenfrequencies exist either.
These supplementary investigations can be summarized as follows: The possible
eigenfrequencies of the vibrating chain lie between 0 and $2\sqrt{T/ma}$:
\[
0 < |\omega| < 2\sqrt{\frac{T}{ma}}. \tag{7.39}
\]
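The statement of this exercise can be illustrated by iterating the recursion (7.31) directly for values of c outside the open interval (−2, 2). This is only a numerical spot check under the assumptions stated in the comments, not a replacement for the induction proof; the function name and the sample values of c are chosen here.

```python
def chain_determinant(num_masses, c):
    """D_N from D_N = c D_{N-1} - D_{N-2} with D_0 = 1 and D_1 = c."""
    if num_masses == 0:
        return 1.0
    d_prev, d_curr = 1.0, c
    for _ in range(2, num_masses + 1):
        d_prev, d_curr = d_curr, c * d_curr - d_prev
    return d_curr


# Case (a): |c| = 2. The absolute values grow as N + 1 and never vanish.
values_plus_two = [chain_determinant(N, 2.0) for N in range(1, 8)]
values_minus_two = [chain_determinant(N, -2.0) for N in range(1, 8)]

# Case (b): a sample value c < -2 (c = -3 is an arbitrary choice).
values_minus_three = [chain_determinant(N, -3.0) for N in range(1, 5)]

print("c = +2:", values_plus_two)
print("c = -2:", values_minus_two)
print("c = -3:", values_minus_three)
```

In all three cases the sequence |D_N| increases monotonically, so no N gives D_N = 0.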
EXERCISE
7.4 Vibration of Two Coupled Mass Points, Two Dimensional
Problem. Two mass points (equal mass m) lie on a frictionless horizontal plane and
are fixed to each other and to two fixed points A and B by means of springs (spring
tension T , length l).
(a) Establish the equation of motion.
(b) Find the normal vibrations and frequencies and describe the motions.
Fig. 7.8.
Fig. 7.9.
Solution. (a) For the vibrating chain with n mass points, which are equally spaced
by the distance l, the equations of motion
\[
\frac{d^2y_N}{dt^2} = \frac{T}{ml}\,(y_{N-1} - 2y_N + y_{N+1}) \qquad (N = 1, \ldots, n)
\]
were established. For the first and second mass point, we have
\[
\begin{aligned}
\ddot{y}_1 &= k(y_0 - 2y_1 + y_2) = k(y_2 - 2y_1),\\
\ddot{y}_2 &= k(y_1 - 2y_2 + y_3) = k(y_1 - 2y_2)
\end{aligned}
\tag{7.40}
\]
with k = T /ml; the chain is clamped at the points A and B, i.e., y0 = y3 = 0.

(b) Solution ansatz: y1 = A1 cos ωt, y2 = A2 cos ωt (ω = eigenfrequency). Inser-
tion into (7.40) yields
\[
\begin{aligned}
(2k - \omega^2)A_1 - kA_2 &= 0,\\
(2k - \omega^2)A_2 - kA_1 &= 0.
\end{aligned}
\tag{7.41}
\]
To get the nontrivial solution, the determinant of coefficients must vanish, i.e.,

\[
D = \begin{vmatrix} 2k - \omega^2 & -k \\ -k & 2k - \omega^2 \end{vmatrix} = 0;
\]
i.e., $\omega^4 + 3k^2 - 4k\omega^2 = 0$, from which it follows that $\omega_1^2 = 3k$, $\omega_2^2 = k$.
Insertion in (7.41) yields A1 = A2 for ω2 and A1 = −A2 for ω1 . These are an
equal-phase and an opposite-phase vibration, respectively. We note that the vibration
with the higher frequency has opposite phases and a “node,” while the vibration with
lower frequency has equal phases and a “vibration antinode.”
EXERCISE
7.5 Three Masses on a String
Problem. Three mass points are fixed equidistantly on a string that is fixed at its
endpoints.
(a) Determine the eigenfrequencies of this system if the string tension T can be con-
sidered constant (this holds for small amplitudes).
(b) Discuss the eigenvibrations of the system. Hint: Note Exercises 8.1 and 8.2 in
Chap. 8.
Fig. 7.10.
Solution. (a) For the equations of motion of the system, one finds straightaway

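\[
\begin{aligned}
m\ddot{x}_1 + \frac{2T}{L}x_1 - \frac{T}{L}x_2 &= 0,\\
m\ddot{x}_2 + \frac{2T}{L}x_2 - \frac{T}{L}x_3 - \frac{T}{L}x_1 &= 0,\\
m\ddot{x}_3 + \frac{2T}{L}x_3 - \frac{T}{L}x_2 &= 0.
\end{aligned}
\tag{7.42}
\]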
Assuming periodic oscillations, i.e., solutions of the form
x1 = A sin(ωt + ψ), ẍ1 = −ω2 A sin(ωt + ψ),

x2 = B sin(ωt + ψ), ẍ2 = −ω2 B sin(ωt + ψ),
x3 = C sin(ωt + ψ), ẍ3 = −ω2 C sin(ωt + ψ),
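we get after insertion into (7.42)
\[
\begin{aligned}
\left(\frac{2T}{L} - \omega^2m\right)A - \frac{T}{L}B &= 0,\\
-\frac{T}{L}A + \left(\frac{2T}{L} - \omega^2m\right)B - \frac{T}{L}C &= 0,\\
-\frac{T}{L}B + \left(\frac{2T}{L} - \omega^2m\right)C &= 0.
\end{aligned}
\tag{7.43}
\]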
As in Exercise 8.2, one gets the equation for the frequencies of the system from the
expansion of the determinant of coefficients:
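\[
\left(\frac{Lm}{T}\right)^3\omega^6 - 6\left(\frac{Lm}{T}\right)^2\omega^4 + \frac{10Lm}{T}\,\omega^2 - 4 = 0
\]
or
\[
\left(\frac{Lm}{T}\right)^3\Omega^3 - 6\left(\frac{Lm}{T}\right)^2\Omega^2 + \frac{10Lm}{T}\,\Omega - 4 = 0 \tag{7.44}
\]
with $\Omega = \omega^2$. This cubic equation with the coefficients
\[
a = \left(\frac{Lm}{T}\right)^3, \quad b = -6\left(\frac{Lm}{T}\right)^2, \quad c = \frac{10Lm}{T}, \quad d = -4
\]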
can be solved by Cardano’s method.

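With the substitutions
\[
y = \Omega + \frac{b}{3a}, \qquad
3p = -\frac{1}{3}\frac{b^2}{a^2} + \frac{c}{a} = -2\frac{T^2}{L^2m^2}, \qquad
2q = \frac{2}{27}\frac{b^3}{a^3} - \frac{1}{3}\frac{bc}{a^2} + \frac{d}{a} = 0
\]
we get $q^2 + p^3 < 0$, i.e., there are three real solutions which by using the auxiliary
quantities
\[
\cos\varphi = -\frac{q}{\sqrt{-p^3}} = 0, \qquad
y_1 = -2\sqrt{-p}\,\cos\left(\frac{\varphi}{3} - \frac{\pi}{3}\right) = -\sqrt{2}\,\frac{T}{Lm},
\]
\[
y_2 = -2\sqrt{-p}\,\cos\left(\frac{\varphi}{3} + \frac{\pi}{3}\right) = 0,
\]
\[
y_3 = 2\sqrt{-p}\,\cos\frac{\varphi}{3} = \sqrt{2}\,\frac{T}{Lm}
\]
can be calculated as
\[
\omega_1 = \sqrt{0.6\,\frac{T}{Lm}}, \qquad \omega_2 = \sqrt{\frac{2T}{Lm}}, \qquad \omega_3 = \sqrt{3.4\,\frac{T}{Lm}}.
\]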
(b) From the first and third equation of (7.43), one finds for the amplitude ratios
\[
\frac{B}{A} = \frac{B}{C} = 2 - \frac{mL\omega^2}{T}. \tag{7.45}
\]
Discussion of the modes:

(1) ω = ω1 = (0.6T /Lm)1/2 inserted into (7.45) ⇒ B1 /A1 = B1 /C1 = 1.4 or B1 =
1.4A1 = 1.4C1 .
All three masses are deflected in the same direction, where the first and third
mass have equal amplitudes, and the second mass has a larger amplitude.
Fig. 7.11.
(2) ω = ω2 = (2T /Lm)1/2 inserted into (7.45) ⇒ B2 /A2 = B2 /C2 = 0 and A2 =
−C2 from the second equation of (7.43). The central mass is at rest, while the first
and third mass are vibrating in opposite directions but with equal amplitude.
Fig. 7.12.
(3) ω = ω3 = (3.4T /Lm)1/2 inserted into (7.45) ⇒ B3 /A3 = B3 /C3 = −1.4, i.e.,
B3 = −1.4A3 = −1.4C3 . The first and the last mass are deflected in the same direction,
while the central mass vibrates with different amplitude in the opposite direction.
Fig. 7.13.
The system discussed here has three vibration modes with 0, 1, and 2 nodes,
respectively. For a system with n mass points, both the number of modes and
the number of possible nodes (n − 1) increase. A system with n → ∞ is called a
“vibrating string.”
A comparison of the figures clearly shows the approximation of the vibrating string
by the system of three mass points.
Fig. 7.14.
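The rounded values 0.6, 2, and 3.4 can be cross-checked without Cardano's method: in the dimensionless variable x = Lmω²/T, equation (7.44) reads x³ − 6x² + 10x − 4 = 0, and its exact roots are 2 − √2, 2, and 2 + √2. The sketch below verifies this; the variable names are chosen here for illustration.

```python
import math


def frequency_polynomial(x):
    """Left-hand side of (7.44) in the dimensionless variable x = L m omega^2 / T."""
    return x**3 - 6.0 * x**2 + 10.0 * x - 4.0


# Exact roots of the cubic; y = x - 2 reproduces the values -sqrt(2), 0, +sqrt(2).
roots = [2.0 - math.sqrt(2.0), 2.0, 2.0 + math.sqrt(2.0)]

# Amplitude ratios B/A = B/C = 2 - x according to (7.45).
ratios = [2.0 - x for x in roots]

for x, ratio in zip(roots, ratios):
    print(f"x = {x:.4f}, polynomial = {frequency_polynomial(x):+.1e}, B/A = {ratio:+.4f}")
```

Rounded to one decimal place, the roots and the ratios B/A agree with the values 0.6, 2, 3.4 and 1.4, 0, −1.4 used in the discussion of the modes.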
EXERCISE
7.6 Eigenvibrations of a Three-Atom Molecule
Problem. Discuss the eigenvibrations of a three-atom molecule. In the equilibrium
state of the molecule, the two atoms of mass m are at the same distance from the
atom of mass M. For simplicity, one should consider only vibrations along the molecule
axis connecting the three atoms, where the complicated interatomic potential is
approximated by two springs (with spring constant k).
(a) Establish the equation of motion.
(b) Calculate the eigenfrequencies and discuss the eigenvibrations of the system.
Solution. (a) Let x1 , x2 , x3 be the displacements of the atoms from the equilibrium
positions at time t. From Newton’s equations and Hooke’s law it then follows that
\[
\begin{aligned}
m\ddot{x}_1 &= -k(x_1 - x_2),\\
M\ddot{x}_2 &= -k(x_2 - x_3) - k(x_2 - x_1) = k(x_3 + x_1 - 2x_2),\\
m\ddot{x}_3 &= -k(x_3 - x_2).
\end{aligned}
\tag{7.46}
\]
Fig. 7.15.
(b) By inserting the ansatz x1 = a1 cos ωt, x2 = a2 cos ωt, and x3 = a3 cos ωt
into (7.46), one obtains
\[
\begin{aligned}
(m\omega^2 - k)a_1 + ka_2 &= 0,\\
ka_1 + (M\omega^2 - 2k)a_2 + ka_3 &= 0,\\
ka_2 + (m\omega^2 - k)a_3 &= 0.
\end{aligned}
\tag{7.47}
\]
Fig. 7.16.
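The eigenfrequencies of this system are obtained by setting the determinant of
coefficients equal to zero:
\[
\begin{vmatrix}
m\omega^2 - k & k & 0\\
k & M\omega^2 - 2k & k\\
0 & k & m\omega^2 - k
\end{vmatrix} = 0. \tag{7.48}
\]
Expanding the determinant yields
\[
(m\omega^2 - k)\left[\omega^4mM - \omega^2(kM + 2km)\right] = 0 \tag{7.49}
\]
or
\[
\omega^2(m\omega^2 - k)\left[\omega^2mM - k(M + 2m)\right] = 0.
\]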
By factorization of (7.49) with respect to ω, one obtains for the eigenfrequencies of the
system:
\[
\omega_1 = 0, \qquad \omega_2 = \sqrt{\frac{k}{m}}, \qquad \omega_3 = \sqrt{\frac{k}{m}\left(1 + \frac{2m}{M}\right)}.
\]
m m M
Discussion of the vibration modes:

(1) Insertion of ω = ω1 = 0 into (7.47) yields a1 = a2 = a3 . The eigenfrequency
ω1 = 0 does not correspond to a vibrational motion, but represents only a uni-
form translation of the entire molecule: •→ ◦→ •→.
(2) Inserting ω = ω2 = (k/m)1/2 into (7.47) yields a1 = −a3 , a2 = 0; i.e., the central
atom is at rest, while the outer atoms vibrate against each other: ←• ◦ •→.
(3) Inserting ω = ω3 = {k/m(1 + 2m/M)}1/2 into (7.47) yields a1 = a3 , a2 =
−(2m/M)a1 , i.e., the two outer atoms vibrate in phase, while the central atom
vibrates with opposite phase and with another amplitude: •→ ←◦ •→.
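The three modes can be checked by inserting the frequencies and amplitude vectors into (7.47). The sketch below uses k = 1, m = 16, M = 12 (mass ratios loosely resembling an O-C-O arrangement); these numbers and the function names are assumptions made here and do not come from the text.

```python
import math


def triatomic_modes(k, m, M):
    """Return (omega, (a1, a2, a3)) for the three longitudinal modes."""
    return [
        (0.0, (1.0, 1.0, 1.0)),                                  # uniform translation
        (math.sqrt(k / m), (1.0, 0.0, -1.0)),                    # central atom at rest
        (math.sqrt(k / m * (1.0 + 2.0 * m / M)), (1.0, -2.0 * m / M, 1.0)),
    ]


def residuals(k, m, M, omega, amplitudes):
    """Left-hand sides of the three equations (7.47)."""
    a1, a2, a3 = amplitudes
    return (
        (m * omega**2 - k) * a1 + k * a2,
        k * a1 + (M * omega**2 - 2.0 * k) * a2 + k * a3,
        k * a2 + (m * omega**2 - k) * a3,
    )


k, m, M = 1.0, 16.0, 12.0
modes = triatomic_modes(k, m, M)
worst = max(abs(r) for omega, amp in modes for r in residuals(k, m, M, omega, amp))

# For the two genuine vibrations the center of mass stays at rest.
com_shifts = [m * a1 + M * a2 + m * a3 for _, (a1, a2, a3) in modes[1:]]

print(f"largest residual of (7.47): {worst:.1e}")
print(f"center-of-mass shifts of the vibrating modes: {com_shifts}")
```

The vanishing center-of-mass shift for modes (2) and (3) is consistent with the remark that only the mode with ω1 = 0 describes a translation of the whole molecule.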
8 The Vibrating String
A string of length l is fixed at both ends. This gives rise to tension forces T that are
constant in time and independent of the position. The string tension acts as a backdriving
force when the string is displaced out of the rest position. A string element Δs at the
position x experiences the force
\[
F_y(x) = -T\sin\Theta(x)
\]
in y-direction. At the position x + Δx there acts in y-direction the force
\[
F_y(x + \Delta x) = T\sin\Theta(x + \Delta x).
\]
In y-direction, the string element Δs experiences the total force
\[
F_y = T\sin\Theta(x + \Delta x) - T\sin\Theta(x). \tag{8.1}
\]
Accordingly, along the x-direction the string element Δs is pulled by the force
\[
F_x = T\cos\Theta(x + \Delta x) - T\cos\Theta(x).
\]
In a first approximation we assume that the displacement in x-direction is zero.
A displacement of the string in y-direction causes only a very small motion in the
x-direction. This displacement is negligible compared to the displacement in
y-direction, i.e.,
\[
F_x = 0.
\]
Since we neglect the displacement in x-direction, the only acceleration component of
the string element is given by $\partial^2y/\partial t^2$. The mass of the element is $\Delta m = \sigma\Delta s$, where
σ represents the line density. From that and by means of (8.1) we obtain the equation
of motion:
\[
F_y = \sigma\Delta s\,\frac{\partial^2y}{\partial t^2} = T\sin\Theta(x + \Delta x) - T\sin\Theta(x). \tag{8.2}
\]
Fig. 8.1. The string tension T

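Both sides are divided by Δx:
\[
\frac{\sigma\Delta s}{\Delta x}\frac{\partial^2y}{\partial t^2} = \frac{T\sin\Theta(x + \Delta x) - T\sin\Theta(x)}{\Delta x}. \tag{8.3}
\]
Inserting for Δs in the left-hand side of (8.3)
\[
\Delta s = \sqrt{\Delta x^2 + \Delta y^2},
\]
one has
\[
\sigma\,\frac{\sqrt{\Delta x^2 + \Delta y^2}}{\Delta x}\frac{\partial^2y}{\partial t^2}
= \sigma\sqrt{1 + \frac{\Delta y^2}{\Delta x^2}}\,\frac{\partial^2y}{\partial t^2}
= \frac{T\sin\Theta(x + \Delta x) - T\sin\Theta(x)}{\Delta x}. \tag{8.4}
\]
By forming the limit for Δx, Δy → 0 on both sides of (8.4), we obtain
\[
\sigma\sqrt{1 + \left(\frac{\partial y}{\partial x}\right)^2}\,\frac{\partial^2y}{\partial t^2} = T\,\frac{\partial}{\partial x}(\sin\Theta). \tag{8.5}
\]
For sin Θ we have $\sin\Theta = \tan\Theta/\sqrt{1 + \tan^2\Theta}$. Since $\tan\Theta = \partial y/\partial x$ (inclination of
the curve), we write
\[
\sin\Theta = \frac{\partial y/\partial x}{\sqrt{1 + (\partial y/\partial x)^2}}. \tag{8.6}
\]
By means of relation (8.6) the equation (8.5) can be transformed as follows:
\[
\sigma\sqrt{1 + \left(\frac{\partial y}{\partial x}\right)^2}\,\frac{\partial^2y}{\partial t^2}
= T\,\frac{\partial}{\partial x}\left(\frac{\partial y/\partial x}{\sqrt{1 + (\partial y/\partial x)^2}}\right). \tag{8.7}
\]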
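In order to simplify the equation, we again consider only small displacements of the string in the $y$-direction. Then $\partial y/\partial x \ll 1$, and $(\partial y/\partial x)^2$ can be neglected as well. Thus, we obtain
\[ \sigma\,\frac{\partial^2 y}{\partial t^2} = T\,\frac{\partial}{\partial x}\left(\frac{\partial y}{\partial x}\right) \tag{8.8} \]
or
\[ \sigma\,\frac{\partial^2 y}{\partial t^2} = T\,\frac{\partial^2 y}{\partial x^2}. \tag{8.9} \]
We set $c^2 = T/\sigma$ ($c$ has the dimension of a velocity). The desired differential equation (also called the wave equation) then reads
\[ \frac{\partial^2 y}{\partial t^2} = c^2\,\frac{\partial^2 y}{\partial x^2} \quad\text{or}\quad \left(\frac{\partial^2}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right) y(x,t) = 0. \tag{8.10} \]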
8.1 Solution of the Wave Equation

The wave equation (8.10) is solved for given boundary conditions and initial conditions. The boundary conditions state that the string is tightly clamped at both ends $x = 0$ and $x = l$, i.e.,
\[ y(0,t) = 0, \qquad y(l,t) = 0 \qquad \text{(boundary conditions)}. \]
The initial conditions specify the state of the string at the time $t = 0$ (initial excitation). The excitation is performed by a displacement of the form $f(x)$,
\[ y(x,0) = f(x) \qquad \text{(first initial condition)}, \]
and the velocity of the string is zero,
\[ \left.\frac{\partial}{\partial t}\,y(x,t)\right|_{t=0} = 0 \qquad \text{(second initial condition)}. \]
For solving the partial differential equation (PDE), we use the product ansatz $y(x,t) = X(x)\cdot T(t)$. Such an approach is natural, since we are looking for eigenvibrations. These are defined so that all mass points (i.e., any string element at any position $x$) vibrate with the same frequency. By the ansatz $y(x,t) = X(x)\cdot T(t)$, the time behavior is decoupled from the spatial one. Thus we try to split the partial differential equation into one equation for a function of the position, $X(x)$, and one for a function of the time, $T(t)$. (Here $T(t)$ denotes the time-dependent factor and must not be confused with the string tension $T$.) Inserting $y(x,t) = X(x)\cdot T(t)$ into the differential equation (8.10) yields
\[ X(x)\,\ddot T(t) = c^2 X''(x)\,T(t), \]
where $\ddot T = \partial^2 T/\partial t^2$ and $X'' = \partial^2 X/\partial x^2$. The above equation can be rewritten as
\[ \frac{\ddot T(t)}{T(t)} = c^2\,\frac{X''(x)}{X(x)}. \]
Since one side depends only on $x$ and the other side only on $t$, while $x$ and $t$ are independent of each other, there is only one possibility: both sides are constant. The constant will be denoted by $-\omega^2$:
\[ \frac{\ddot T}{T} = -\omega^2 \quad\text{or}\quad \ddot T + \omega^2 T = 0, \tag{8.11} \]
and
\[ \frac{X''}{X} = -\frac{\omega^2}{c^2} \quad\text{or}\quad X'' + \frac{\omega^2}{c^2}\,X = 0. \tag{8.12} \]
The solutions of the differential equations (continuous harmonic vibrations) have the form
\[ T(t) = A\sin\omega t + B\cos\omega t, \]
\[ X(x) = C\sin\frac{\omega}{c}x + D\cos\frac{\omega}{c}x. \]
The general solution then reads
\[ y(x,t) = (A\sin\omega t + B\cos\omega t)\cdot\left(C\sin\frac{\omega}{c}x + D\cos\frac{\omega}{c}x\right). \tag{8.13} \]
The constants $A$, $B$, $C$, and $D$ are determined from the boundary and initial conditions.
From the first boundary condition, it follows for (8.13) that
\[ y(0,t) = 0 = D(A\sin\omega t + B\cos\omega t). \]
Since the expression in brackets does not vanish identically, we must have $D = 0$. Then (8.13) simplifies to
\[ y(x,t) = C\sin\frac{\omega}{c}x\,(A\sin\omega t + B\cos\omega t). \]
With the second boundary condition, we get
\[ y(l,t) = 0 = C\sin\frac{\omega}{c}l\,(A\sin\omega t + B\cos\omega t) \quad\Rightarrow\quad 0 = C\sin\frac{\omega}{c}l. \]
This equation will be satisfied if either of the following holds:
(a) $C = 0$, which means that the string is not displaced at all,
or
(b) $\sin(\omega l/c) = 0$. The sine equals zero if $(\omega/c)l = n\pi$, i.e., if $\omega = \omega_n = n\pi c/l$, where $n = 1, 2, 3, \ldots$ ($n = 0$ would lead to case (a)).
From the boundary conditions, we thus obtain the eigenfrequencies $\omega_n = n\pi c/l$ of the string. Since the string is a continuous system, there are infinitely many eigenfrequencies. The solution for an eigenfrequency, the normal vibration, is labeled by the index $n$. Equation (8.13) becomes
\[ y_n(x,t) = C\cdot\sin\frac{n\pi}{l}x\left(A_n\sin\frac{n\pi c}{l}t + B_n\cos\frac{n\pi c}{l}t\right), \]
\[ y_n(x,t) = \sin\frac{n\pi}{l}x\left(a_n\sin\frac{n\pi c}{l}t + b_n\cos\frac{n\pi c}{l}t\right), \]
where we set $C\cdot A_n = a_n$ and $C\cdot B_n = b_n$.

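From the second initial condition, we have
\[ \left.\frac{\partial}{\partial t}\,y_n(x,t)\right|_{t=0} = 0 = \left.\frac{n\pi c}{l}\sin\frac{n\pi}{l}x\left(a_n\cos\frac{n\pi c}{l}t - b_n\sin\frac{n\pi c}{l}t\right)\right|_{t=0}. \]
Then
\[ a_n\cdot\frac{n\pi c}{l}\cdot\sin\frac{n\pi}{l}x = 0 \]
is satisfied for all $x$ only if $a_n = 0$. Thus, the solution of the differential equation is
\[ y_n(x,t) = b_n\cdot\sin\frac{n\pi}{l}x\,\cos\frac{n\pi c}{l}t. \tag{8.14} \]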
l l
The parameter $n$ describes the excitation states of a system, in this case those of the vibrating string. In quantum physics such a discrete parameter $n$ is called a quantum number.
Interjection: If we had selected a positive separation constant in (8.11), i.e., $+\omega^2$ instead of $-\omega^2$, we would have arrived at the solution
\[ y(x,t) = (Ae^{\omega t} + Be^{-\omega t})\left(Ce^{\frac{\omega}{c}x} + De^{-\frac{\omega}{c}x}\right). \]
The boundary conditions $y(0,t) = y(l,t) = 0$ would have led to the conditions
\[ C + D = 0, \qquad Ce^{\frac{\omega}{c}l} + De^{-\frac{\omega}{c}l} = 0, \]
with the only solution $C = D = 0$. The string would have remained at rest, which is not the desired solution.
Since the one-dimensional wave equation is a linear differential equation, one can obtain the most general solution, according to the superposition principle, by the superposition (addition) of the particular solutions:
\[ y(x,t) = \sum_{n=1}^{\infty} b_n\sin\frac{n\pi x}{l}\cos\frac{n\pi c}{l}t = \sum_{n=1}^{\infty} b_n\sin k_n x\,\cos\omega_n t. \]
The coefficients $b_n$ can be calculated from the given initial curve by using the considerations on the Fourier series (see the next chapter):
\[ y(x,0) \equiv f(x) = \sum_{n=1}^{\infty} b_n\sin\frac{n\pi x}{l}. \]
The calculation of the Fourier coefficients $b_n$ will be shown in the next chapter. One then gets the following general solution of the differential equation:
\[ y(x,t) = \sum_{n=1}^{\infty}\left(\frac{2}{l}\int_0^l f(x')\sin\frac{n\pi x'}{l}\,dx'\right)\sin\frac{n\pi x}{l}\cos\frac{n\pi c t}{l}. \tag{8.15} \]
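As a numerical illustration of (8.15), the following Python sketch sums the first terms of the series. The function names, the midpoint-rule quadrature, the truncation at `n_max`, and the sample values are assumptions chosen for this illustration; they are not part of the derivation above.

```python
import math

def fourier_sine_coefficient(f, n, length, samples=2000):
    """b_n = (2/l) * integral_0^l f(x') sin(n pi x'/l) dx', midpoint rule."""
    h = length / samples
    total = 0.0
    for i in range(samples):
        x = (i + 0.5) * h
        total += f(x) * math.sin(n * math.pi * x / length)
    return 2.0 / length * total * h

def string_displacement(f, x, t, length, c, n_max=50):
    """Truncated series (8.15) for a string released from rest with shape f."""
    y = 0.0
    for n in range(1, n_max + 1):
        b_n = fourier_sine_coefficient(f, n, length)
        y += (b_n * math.sin(n * math.pi * x / length)
              * math.cos(n * math.pi * c * t / length))
    return y

# Hypothetical initial shape: a pure second normal vibration of amplitude 0.01.
length, c = 1.0, 2.0
shape = lambda x: 0.01 * math.sin(2 * math.pi * x / length)

y_start = string_displacement(shape, 0.25, 0.0, length, c)
# Half a period of the n = 2 mode later (omega_2 = 2 pi c / l, period l/c = 0.5)
y_half_period = string_displacement(shape, 0.25, 0.25, length, c)
```

For this initial shape only $b_2$ differs from zero, so the displacement at $x = l/4$ simply changes sign after half a period of the second mode.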
8.2 Normal Vibrations
Normal vibrations are described by the following equation:
\[ y_n(x,t) = C_n\sin(k_n x)\cos(\omega_n t). \tag{8.16} \]
For a fixed time $t$, the spatial variation (positional dependence) of the normal vibration is given by the factor $\sin(n\pi x/l)$ (for $n > 1$, $\sin(n\pi x/l)$ has exactly $n - 1$ nodes between the fixed ends). All mass points (positions $x$) vibrate with the same frequency $\omega_n$.
At a definite position $x$, the time dependence of the normal vibration is given by the factor $\cos(n\pi c t/l)$. The wave number $k_n$ is defined as
\[ k_n \equiv \frac{\omega_n}{c} = \frac{n\pi}{l} = \frac{2\pi}{\lambda_n}, \tag{8.17} \]
where $\lambda_n = 2l/n$ is the wavelength.
The angular frequency is defined as follows:
\[ \omega_n \equiv \frac{n\pi c}{l} = 2\pi\nu_n. \tag{8.18} \]
Solving (8.18) for $\nu_n$, we obtain for the frequency
\[ \nu_n = \frac{nc}{2l}, \tag{8.19} \]
i.e., the frequencies increase with increasing index $n$. By definition,
\[ c = \sqrt{\frac{T}{\sigma}}; \tag{8.20} \]
$c$ can be interpreted as the “sound” velocity in the string, as we shall see below. $T$ is the tension in the string, $\sigma$ is the mass per unit length. From (8.19) and (8.20) we find
\[ \nu_n = \frac{n}{2l}\sqrt{\frac{T}{\sigma}}, \tag{8.21} \]
i.e., the longer and thicker a string is, the smaller the frequency, and the frequency increases with the string tension $T$. This agrees with our experience that long, thick strings sound deeper than short, thin ones. The dependence on the tension is utilized when tuning a violin.
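The dependence of (8.21) on length, tension, and line density can be tried out with a few lines of Python. The function name and the numerical values of length, tension, and line density are assumptions chosen only for illustration.

```python
import math

def normal_frequency(n, length, tension, line_density):
    """nu_n = n / (2 l) * sqrt(T / sigma), equation (8.21), in SI units."""
    return n / (2.0 * length) * math.sqrt(tension / line_density)

# Assumed sample values: l = 0.33 m, T = 60 N, sigma = 0.6 g/m.
nu_1 = normal_frequency(1, 0.33, 60.0, 0.0006)

# Doubling the length halves the frequency; fourfold tension doubles it.
nu_long = normal_frequency(1, 0.66, 60.0, 0.0006)
nu_tight = normal_frequency(1, 0.33, 240.0, 0.0006)
```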
Multiplication of the wavelength by the frequency yields a constant $c$ which has the dimension of a velocity:
\[ \lambda_n\nu_n = \frac{2l}{n}\,\frac{nc}{2l} = c \qquad \text{(dispersion law)}. \tag{8.22} \]
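$c$ is the velocity (phase velocity) with which the wave propagates in a medium. This can be seen as follows: If an initial perturbation $y(x,0) = f(x)$ is given as in Fig. 8.2, $f(x - ct)$ is also a solution of the wave equation (8.10), because with $z = x - ct$ we have
\[ \frac{\partial f}{\partial t} = \frac{\partial f}{\partial z}\frac{\partial z}{\partial t} = -c\,\frac{\partial f}{\partial z}, \qquad \frac{\partial^2 f}{\partial t^2} = -c\,\frac{\partial^2 f}{\partial z^2}\frac{\partial z}{\partial t} = c^2\,\frac{\partial^2 f}{\partial z^2}, \]
and
\[ \frac{\partial f}{\partial x} = \frac{\partial f}{\partial z}, \qquad \frac{\partial^2 f}{\partial x^2} = \frac{\partial^2 f}{\partial z^2}. \]
Hence,
\[ \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\,f(x - ct) = \frac{c^2}{c^2}\frac{\partial^2 f}{\partial z^2} = \frac{\partial^2 f}{\partial z^2} = \frac{\partial^2 f(x - ct)}{\partial x^2}. \]
$f(x - ct)$ thus satisfies the wave equation (8.10).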
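The statement that $f(x - ct)$ solves (8.10) can also be checked numerically with central differences. In the following sketch the Gaussian pulse, the step size `h`, and the evaluation point are assumptions made for illustration; any smooth profile would serve equally well.

```python
import math

def pulse(z):
    """Hypothetical smooth perturbation f(z): a Gaussian bump."""
    return math.exp(-z * z)

def wave_residual(x, t, c, h=1e-3):
    """Central-difference estimate of (d^2/dx^2 - (1/c^2) d^2/dt^2) f(x - c t)."""
    y = lambda xx, tt: pulse(xx - c * tt)
    d2x = (y(x + h, t) - 2.0 * y(x, t) + y(x - h, t)) / h**2
    d2t = (y(x, t + h) - 2.0 * y(x, t) + y(x, t - h)) / h**2
    return d2x - d2t / c**2

# The residual vanishes up to the discretization error of order h^2.
residual = wave_residual(0.3, 0.1, c=2.0)
```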
Fig. 8.2. Propagation of a perturbation f (x) along a long string: After the time t, the perturba-
tion has moved away by ct; it is then described by f (x − ct)
Let the maximum of the perturbation $f(x)$ be at $x_0$. After the time $t$, it lies at the position $x$ given by
\[ x - ct = x_0. \]
The perturbation $f(x)$ thus moves along the string, namely to the right (positive $x$-direction), with the velocity
\[ \frac{dx}{dt} = c. \tag{8.23} \]
The propagation velocity of small perturbations is called the sound velocity. One easily verifies as above that $f(x + ct)$ is also a solution of the wave equation and represents a perturbation that moves to the left (negative $x$-direction). We are dealing here with running waves, while for the tightly clamped string we have standing waves.
If a string is excited with an arbitrary normal frequency, there are points on the string that remain at rest at all times (nodes).
The wavelength, the number of nodes, and the shape of normal vibrations can be represented as a function of the index $n$ (see Fig. 8.3).
Fig. 8.3. The lowest normal vibrations of a string

n     Wavelength    Number of nodes    Figure
1     2l            0                  (a)
2     l             1                  (b)
3     (2/3) l       2                  (c)
...   ...           ...
n     (2/n) l       n − 1
EXERCISE
8.1 Kinetic and Potential Energy of a Vibrating String
Problem. Consider a string of density σ that is stretched between two points and is
excited with small amplitudes.
(a) Calculate in general the kinetic and potential energy of the string.
(b) Calculate the kinetic and potential energy for waves of the form
\[ y = C\cos\left(\frac{\omega(x - ct)}{c}\right) \]
with $T_0 = 500$ N, $C = 0.01$ m, and $\lambda = 0.1$ m.
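Solution. (a) The part $PQ$ of the string has the mass $\sigma\,\Delta x$ and the velocity $\partial y/\partial t$. Its kinetic energy is then
\[ \Delta T = \frac{1}{2}\,\sigma\,\Delta x\left(\frac{\partial y}{\partial t}\right)^2. \tag{8.24} \]
The total kinetic energy of the string between $x = a$ and $x = b$ is
\[ T = \frac{1}{2}\int_a^b \sigma\left(\frac{\partial y}{\partial t}\right)^2 dx. \tag{8.25} \]
The work which is needed to elongate the string element from $\Delta x$ to $\Delta l$ is
\[ dP = T_0(\Delta l - \Delta x), \qquad \frac{\Delta l}{\Delta x} \sim 1. \tag{8.26} \]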
Fig. 8.4. Displacement and deformation (elongation, compression) of the string element $\Delta x$
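For small displacements, we have
\[ \Delta l = (\Delta x^2 + \Delta y^2)^{1/2} = \Delta x\left[1 + \left(\frac{\partial y}{\partial x}\right)^2\right]^{1/2} \approx \Delta x\left[1 + \frac{1}{2}\left(\frac{\partial y}{\partial x}\right)^2\right]. \tag{8.27} \]
The potential energy for the region $x = a$ to $x = b$ is then
\[ P = \frac{1}{2}\,T_0\int_a^b\left(\frac{\partial y}{\partial x}\right)^2 dx. \tag{8.28} \]
For a wave $y = F(x - ct)$ propagating in one direction, we have
\[ T = P = \frac{1}{2}\,T_0\int_a^b [F'(x - ct)]^2\,dx, \qquad c^2 = \frac{T_0}{\sigma}. \tag{8.29} \]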
Hence, the kinetic and potential energy are equal. If $a$, $b$ are fixed points, then $T$ and $P$ vary with time. But if we let $a$ and $b$ move with the sound velocity $c$, so that
\[ a = A + ct \quad\text{and}\quad b = B + ct, \tag{8.30} \]
then $P$ and $T$ are constant:
\[ T = P = \frac{1}{2}\,T_0\int_A^B (F'(x))^2\,dx. \tag{8.31} \]
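(b) Differentiation of the given wave with respect to time yields
\[ \frac{\partial y}{\partial t} = C\omega\sin\left(\frac{\omega}{c}x - \omega t\right) \quad\Rightarrow\quad \left(\frac{\partial y}{\partial t}\right)^2 = C^2\omega^2\sin^2\left(\frac{\omega}{c}x - \omega t\right). \tag{8.32} \]
Insertion into (8.25) yields (with $a = 0$, $b = \lambda$, and $\sigma = T_0/c^2$)
\[ T = \frac{1}{2}\frac{T_0}{c^2}\,C^2\omega^2\int_0^{\lambda}\sin^2\left(\frac{\omega}{c}x - \omega t\right)dx = \frac{1}{2}\frac{T_0}{c^2}\,C^2\omega^2\cdot I. \tag{8.33} \]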
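With the substitution $z = (\omega/c)x - \omega t$ for the integral $I$, and using that the integration interval covers full periods of the integrand and may therefore be shifted, we find
\[ I = \frac{c}{\omega}\int_{-\omega t}^{(\omega/c)\lambda - \omega t}\sin^2 z\,dz = \frac{c}{\omega}\int_0^{(\omega/c)\lambda}\sin^2 z\,dz = \frac{c}{\omega}\int_0^{2\pi}\sin^2 z\,dz = \frac{c}{\omega}\left[\frac{1}{2}z - \frac{1}{4}\sin(2z)\right]_0^{2\pi} = \frac{c}{\omega}\,\pi \tag{8.34} \]
\[ \Rightarrow\quad T = \frac{1}{2}\frac{T_0}{c^2}\,C^2\omega^2\,\frac{c}{\omega}\,\pi = \frac{\pi^2 C^2 T_0}{\lambda}, \qquad \lambda = 2\pi\frac{c}{\omega}. \tag{8.35} \]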
One gets the same expression for the potential energy. Insertion of the numerical values yields
\[ T = P = \frac{500\ \text{N}}{0.1\ \text{m}}\,(0.01)^2\cdot\pi^2\ \text{m}^2 \approx 4.9\ \text{N m} \sim 5\ \text{N m}. \]
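The numerical value can be reproduced with a short Python sketch that evaluates (8.35) and, as a cross-check, integrates (8.25) directly over one wavelength. The function name, the midpoint rule, and the trial values of the wave velocity `c` are assumptions of this sketch; the exercise does not specify $c$, and the result does not depend on it.

```python
import math

T0 = 500.0   # string tension in N (from the exercise)
C = 0.01     # amplitude in m
lam = 0.1    # wavelength in m

closed_form = math.pi**2 * C**2 * T0 / lam   # equation (8.35), in N m

def kinetic_energy_numeric(t, c=1.0, samples=10000):
    """Midpoint-rule value of (8.25) over one wavelength, with sigma = T0 / c^2."""
    omega = 2.0 * math.pi * c / lam
    sigma = T0 / c**2
    h = lam / samples
    total = 0.0
    for i in range(samples):
        x = (i + 0.5) * h
        dydt = C * omega * math.sin(omega / c * x - omega * t)
        total += 0.5 * sigma * dydt**2 * h
    return total

energy_at_t = kinetic_energy_numeric(0.3)
energy_other_c = kinetic_energy_numeric(0.0, c=7.0)
```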
EXERCISE
8.2 Three Different Masses Equidistantly Fixed on a String
Problem. Calculate the eigenfrequencies of the system of three different masses that
are fixed equidistantly on a stretched string, as is shown in Fig. 8.5.
Hint: For small amplitudes, the string tension T does not change!
Fig. 8.5.
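Solution. From Fig. 8.6, we extract for the equations of motion
\[ \begin{aligned} 2m\ddot x_1 &= T\,\frac{(x_2 - x_1)}{L} - T\,\frac{x_1}{L}, \\ m\ddot x_2 &= -T\,\frac{(x_2 - x_1)}{L} - T\,\frac{(x_2 - x_3)}{L}, \\ 3m\ddot x_3 &= T\,\frac{(x_2 - x_3)}{L} - T\,\frac{x_3}{L}. \end{aligned} \tag{8.36} \]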
Fig. 8.6.
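We look for the eigenvibrations. All mass points must then vibrate with the same frequency. We therefore start with
\[ \begin{aligned} x_1 &= A\sin(\omega t + \psi), & \ddot x_1 &= -\omega^2 A\sin(\omega t + \psi), \\ x_2 &= B\sin(\omega t + \psi), & \ddot x_2 &= -\omega^2 B\sin(\omega t + \psi), \\ x_3 &= C\sin(\omega t + \psi), & \ddot x_3 &= -\omega^2 C\sin(\omega t + \psi). \end{aligned} \]
Hence, after insertion into equation (8.36) one gets
\[ \begin{aligned} \left(\frac{2T}{L} - 2m\omega^2\right)A - \frac{T}{L}\,B &= 0, \\ -\frac{T}{L}\,A + \left(\frac{2T}{L} - m\omega^2\right)B - \frac{T}{L}\,C &= 0, \\ -\frac{T}{L}\,B + \left(\frac{2T}{L} - 3m\omega^2\right)C &= 0. \end{aligned} \tag{8.37} \]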
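For evaluating the eigenfrequencies of the system, i.e., for obtaining a nontrivial solution of equation (8.37), the determinant of coefficients must vanish:
\[ \begin{vmatrix} 2\frac{T}{L} - 2m\omega^2 & -\frac{T}{L} & 0 \\ -\frac{T}{L} & 2\frac{T}{L} - m\omega^2 & -\frac{T}{L} \\ 0 & -\frac{T}{L} & 2\frac{T}{L} - 3m\omega^2 \end{vmatrix} = 0. \]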
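Expansion of the determinant leads to
\[ 0 = 6m^3\omega^6 - \frac{22Tm^2}{L}\,\omega^4 + \frac{19T^2m}{L^2}\,\omega^2 - \frac{4T^3}{L^3} \]
or
\[ 0 = 6m^3\Omega^3 + \frac{-22Tm^2}{L}\,\Omega^2 + \frac{19T^2m}{L^2}\,\Omega + \frac{-4T^3}{L^3}, \tag{8.38} \]
where we substituted $\Omega = \omega^2$. This leads to the cubic equation
\[ a\Omega^3 + b\Omega^2 + c\Omega + d = 0, \]
where
\[ a = 6m^3, \qquad b = \frac{-22Tm^2}{L}, \qquad c = \frac{19T^2m}{L^2}, \qquad d = \frac{-4T^3}{L^3}. \]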
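It can be transformed to the representation (reduction of the cubic equation)
\[ y^3 + 3py + 2q = 0, \tag{8.39} \]
where
\[ y = \Omega + \frac{b}{3a} = \Omega - \frac{11}{9}\frac{T}{Lm} \]
and
\[ 3p = -\frac{1}{3}\frac{b^2}{a^2} + \frac{c}{a} \quad\text{and}\quad 2q = \frac{2}{27}\frac{b^3}{a^3} - \frac{1}{3}\frac{bc}{a^2} + \frac{d}{a}. \]
Insertion leads to
\[ 3p = -\frac{71}{54}\frac{T^2}{L^2m^2}, \qquad 2q = -\frac{653}{1458}\frac{T^3}{L^3m^3}, \]
so that
\[ q^2 + p^3 < 0, \]
i.e., there exist 3 real solutions of the cubic equation (8.39).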

For the case $q^2 + p^3 \le 0$, the solutions $y_1$, $y_2$, $y_3$ can be calculated using tabulated auxiliary quantities (see Mathematical Supplement 8.4). Direct application of Cardano’s formula would lead to complex expressions for the real roots, hence the above method is convenient.
After insertion one obtains for the auxiliary quantities
\[ \cos\varphi = \frac{-q}{\sqrt{-p^3}}, \qquad y_1 = 2\sqrt{-p}\,\cos\frac{\varphi}{3}, \]
\[ y_2 = -2\sqrt{-p}\,\cos\left(\frac{\varphi}{3} + \frac{\pi}{3}\right), \]
\[ y_3 = -2\sqrt{-p}\,\cos\left(\frac{\varphi}{3} - \frac{\pi}{3}\right), \]
and finally, for the eigenfrequencies of the system
\[ \omega_1 = 0.563\sqrt{\frac{T}{Lm}}, \qquad \omega_2 = 0.916\sqrt{\frac{T}{Lm}}, \qquad \omega_3 = 1.585\sqrt{\frac{T}{Lm}}. \]
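The quoted eigenfrequencies can be reproduced with a short Python sketch that follows the reduction (8.39) and the auxiliary quantities given above. The function name is hypothetical, and the coefficients are entered in units where $T/(Lm) = 1$, which is an assumption made only to keep the numbers dimensionless.

```python
import math

def real_roots_trigonometric(a, b, c, d):
    """Roots of a*x^3 + b*x^2 + c*x + d = 0 for the case q^2 + p^3 <= 0,
    using the reduced form y^3 + 3*p*y + 2*q = 0 with x = y - b/(3a)."""
    p = (-(b * b) / (3.0 * a * a) + c / a) / 3.0
    q = (2.0 * b**3 / (27.0 * a**3) - b * c / (3.0 * a * a) + d / a) / 2.0
    if q * q + p**3 > 0:
        raise ValueError("only one real root; the trigonometric form does not apply")
    shift = b / (3.0 * a)
    r = math.sqrt(-p)
    if r == 0.0:                      # p = q = 0: triple root
        return [-shift] * 3
    cos_phi = max(-1.0, min(1.0, -q / r**3))
    phi = math.acos(cos_phi)
    ys = (2.0 * r * math.cos(phi / 3.0),
          -2.0 * r * math.cos(phi / 3.0 + math.pi / 3.0),
          -2.0 * r * math.cos(phi / 3.0 - math.pi / 3.0))
    return sorted(y - shift for y in ys)

# Coefficients of (8.38) in units of T/(L m): a = 6, b = -22, c = 19, d = -4.
omega_squared = real_roots_trigonometric(6.0, -22.0, 19.0, -4.0)
omegas = [math.sqrt(value) for value in omega_squared]
```

The list `omegas` then holds the three eigenfrequencies in units of $\sqrt{T/(Lm)}$, sorted in ascending order.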
EXERCISE
8.3 Complicated Coupled Vibrational System
Problem. Determine the eigenfrequencies of the system of three equal masses sus-
pended between springs with the spring constant k, as is shown in Fig 8.7.
Hint: Consider the solution method of the preceding Exercise 8.2 and Mathematical
Supplement 8.4.
Fig. 8.7. Vibrating coupled masses
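Solution. From Fig. 8.7, we extract for the equations of motion
\[ \begin{aligned} m\ddot x_1 &= -kx_1 - k(x_1 - x_2) - k(x_1 - x_3), \\ m\ddot x_2 &= -kx_2 - k(x_2 - x_1) - k(x_2 - x_3), \\ m\ddot x_3 &= -kx_3 - k(x_3 - x_1) - k(x_3 - x_2), \end{aligned} \tag{8.40} \]
or
\[ \begin{aligned} m\ddot x_1 + 3kx_1 - kx_2 - kx_3 &= 0, \\ m\ddot x_2 + 3kx_2 - kx_3 - kx_1 &= 0, \\ m\ddot x_3 + 3kx_3 - kx_1 - kx_2 &= 0. \end{aligned} \tag{8.41} \]
We look for the eigenvibrations. All mass points must vibrate with the same frequency. Thus, we adopt the ansatz
\[ \begin{aligned} x_1 &= A\cos(\omega t + \psi), & \ddot x_1 &= -\omega^2 A\cos(\omega t + \psi), \\ x_2 &= B\cos(\omega t + \psi), & \ddot x_2 &= -\omega^2 B\cos(\omega t + \psi), \\ x_3 &= C\cos(\omega t + \psi), & \ddot x_3 &= -\omega^2 C\cos(\omega t + \psi), \end{aligned} \]
and after insertion into (8.41), we get
\[ \begin{aligned} (3k - m\omega^2)A - kB - kC &= 0, \\ -kA + (3k - m\omega^2)B - kC &= 0, \\ -kA - kB + (3k - m\omega^2)C &= 0. \end{aligned} \tag{8.42} \]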
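To get a nontrivial solution of (8.42), the determinant of coefficients must vanish:
\[ \begin{vmatrix} (3k - m\omega^2) & -k & -k \\ -k & (3k - m\omega^2) & -k \\ -k & -k & (3k - m\omega^2) \end{vmatrix} = 0. \]
Expansion of the determinant leads to
\[ 0 = \omega^6 - \frac{9k}{m}\,\omega^4 + \frac{24k^2}{m^2}\,\omega^2 - \frac{16k^3}{m^3} \]
or
\[ 0 = \Omega^3 - \frac{9k}{m}\,\Omega^2 + \frac{24k^2}{m^2}\,\Omega - \frac{16k^3}{m^3}, \]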
where we substituted $\Omega = \omega^2$ (see Exercise 8.2). The general cubic equation $a\Omega^3 + b\Omega^2 + c\Omega + d = 0$ (in our case $a = 1$, $b = -9k/m$, $c = 24k^2/m^2$, $d = -16k^3/m^3$) can according to Mathematical Supplement 8.4 be reduced to
\[ y^3 + 3py + 2q = 0, \]
where
\[ y = \Omega + \frac{b}{3a}, \qquad 3p = -\frac{1}{3}\frac{b^2}{a^2} + \frac{c}{a}, \qquad 2q = \frac{2}{27}\frac{b^3}{a^3} - \frac{1}{3}\frac{bc}{a^2} + \frac{d}{a}. \]
Insertion leads to
\[ 3p = -3\,\frac{k^2}{m^2}, \qquad 2q = 2\,\frac{k^3}{m^3} \quad\Rightarrow\quad q^2 + p^3 = 0, \]
i.e., there exist 3 real roots, 2 of which coincide. Hence, the vibrating system being treated here is degenerate. As in Exercise 8.2, the solutions can be calculated using tabulated auxiliary quantities. For these, we obtain
\[ \cos\varphi = \frac{-q}{\sqrt{-p^3}}, \qquad y_1 = 2\sqrt{-p}\,\cos\frac{\varphi}{3}, \]
\[ y_2 = -2\sqrt{-p}\,\cos\left(\frac{\varphi}{3} + \frac{\pi}{3}\right), \]
\[ y_3 = -2\sqrt{-p}\,\cos\left(\frac{\varphi}{3} - \frac{\pi}{3}\right), \]
and, after insertion, for the eigenfrequencies of the system
\[ \omega_3 = \sqrt{\frac{k}{m}}, \qquad \omega_1 = \omega_2 = 2\sqrt{\frac{k}{m}}. \]
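As a cross-check of this result, the determinant of coefficients of (8.42) can be evaluated directly. The sketch below uses the assumed units $k = m = 1$ and a hypothetical function name; it shows that the determinant vanishes at $\omega^2 = k/m$ and $\omega^2 = 4k/m$, and that it does not change sign at the latter value, as expected for a double root.

```python
def coefficient_determinant(omega_sq, k=1.0, m=1.0):
    """Determinant of the coefficient matrix of (8.42) for a given omega^2."""
    d = 3.0 * k - m * omega_sq    # diagonal entry
    o = -k                        # off-diagonal entry
    # Cofactor expansion of [[d, o, o], [o, d, o], [o, o, d]] along the first row.
    return d * (d * d - o * o) - o * (o * d - o * o) + o * (o * o - d * o)

simple_root = coefficient_determinant(1.0)    # omega^2 = k/m
double_root = coefficient_determinant(4.0)    # omega^2 = 4k/m
below, above = coefficient_determinant(3.9), coefficient_determinant(4.1)
```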
MATHEMATICAL SUPPLEMENT
8.4 The Cardano Formula1
In theoretical physics, one often meets the problem of solving a cubic equation, just
as in the Exercises 8.2 and 8.3. We now will clarify this problem.
1 We follow the exposition of E. v. Hanxleben and R. Hentze, Lehrbuch der Mathematik, Friedrich
Vieweg & Sohn 1952, Braunschweig–Berlin–Stuttgart.
Reduction of the general cubic equation: If the general cubic equation
\[ x^3 + ax^2 + bx + c = 0 \tag{8.43} \]
with nonvanishing coefficients $a$, $b$, and $c$ is to be solved, one must first eliminate the quadratic term of the equation, i.e., reduce the equation. If the unknown $x$ is replaced by $y + \lambda$, where $y$ and $\lambda$ are new, unknown quantities, (8.43) turns into
\[ (y^3 + 3y^2\lambda + 3y\lambda^2 + \lambda^3) + (ay^2 + 2ay\lambda + a\lambda^2) + (by + b\lambda) + c = 0, \]
\[ y^3 + (3\lambda + a)y^2 + (3\lambda^2 + 2a\lambda + b)y + (\lambda^3 + a\lambda^2 + b\lambda + c) = 0. \tag{8.44} \]
Since we have replaced one unknown quantity $x$ by two unknown ones, $y$ and $\lambda$, we can freely dispose of one of the two unknown quantities. This freedom is exploited so as to let the quadratic term of the equation disappear. This is achieved by setting the coefficient of $y^2$, that is, $3\lambda + a$, equal to zero, i.e., $\lambda = -a/3$. By inserting this value, (8.44) changes to
\[ y^3 + \left(-\frac{a^2}{3} + b\right)y + \left(\frac{2a^3}{27} - \frac{ab}{3} + c\right) = 0. \tag{8.45} \]
If we abbreviate the expressions determined by the known coefficients $a$, $b$, and $c$ of the cubic equation by
\[ -\frac{a^2}{3} + b = p \quad\text{and}\quad \frac{2a^3}{27} - \frac{ab}{3} + c = q, \tag{8.46} \]
the cubic equation takes the form
\[ y^3 + py + q = 0 \qquad \text{(reduced cubic equation)}. \tag{8.47} \]
Result: To reduce the cubic equation given in the normal form, one sets $x = y - a/3$. Then (8.47) follows from (8.43).
Example: $x^3 - 9x^2 + 33x - 65 = 0$.
(1) Solution: Set $x = y - (-3) = y + 3$.
\[ (y + 3)^3 - 9(y + 3)^2 + 33(y + 3) - 65 = 0, \]
\[ (y^3 + 9y^2 + 27y + 27) - 9(y^2 + 6y + 9) + 33(y + 3) - 65 = 0, \]
\[ y^3 + 6y - 20 = 0. \]
(2) Solution: Insert the values calculated from (8.46) into (8.47).
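The second route, inserting into (8.46), is easily carried out in exact arithmetic. In the following Python sketch the function name and the use of `fractions.Fraction` are choices made for this illustration only.

```python
from fractions import Fraction

def reduce_cubic(a, b, c):
    """Return (p, q, shift) such that x = y + shift turns
    x^3 + a x^2 + b x + c = 0 into y^3 + p y + q = 0, following (8.46)."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    p = -a * a / 3 + b
    q = 2 * a**3 / 27 - a * b / 3 + c
    return p, q, -a / 3

# Example from the text: x^3 - 9x^2 + 33x - 65 = 0
p, q, shift = reduce_cubic(-9, 33, -65)
```

For the example this gives $p = 6$, $q = -20$, and the shift $x = y + 3$, in agreement with the first solution.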
Special case: If in the general cubic equation the linear term is missing ($b = 0$), i.e., the cubic equation is given in the form
\[ x^3 + ax^2 + c = 0, \tag{8.48} \]
the reduction can also be performed by inserting
\[ x = \frac{c}{y}. \tag{8.49} \]
From (8.48) and (8.49), we obtain the reduced equation
\[ \frac{c^3}{y^3} + a\,\frac{c^2}{y^2} + c = 0 \quad\text{or}\quad y^3 + acy + c^2 = 0. \tag{8.50} \]
Solution of the reduced cubic equation: If one sets in the reduced cubic equation
\[ y^3 + py + q = 0, \qquad y = u + v, \tag{8.51} \]
one obtains
\[ u^3 + 3u^2v + 3uv^2 + v^3 + p(u + v) + q = 0, \]
\[ (u^3 + v^3 + q) + 3uv(u + v) + p(u + v) = 0, \]
\[ (u^3 + v^3 + q) + (3uv + p)(u + v) = 0. \tag{8.52} \]
Since one can freely dispose of one of the unknown quantities $u$ or $v$ (justification?), these are suitably chosen so that the coefficient of $(u + v)$ vanishes. We therefore set
\[ 3uv + p = 0, \quad\text{i.e.,}\quad uv = -\frac{p}{3}. \tag{8.53} \]
Equation (8.52) simplifies to
\[ u^3 + v^3 + q = 0 \quad\text{or}\quad u^3 + v^3 = -q. \tag{8.54} \]
$u$ and $v$ are determined by (8.53) and (8.54); they can no longer be chosen arbitrarily. By raising (8.54) to the second power and (8.53) to the third power (and multiplying the latter by 4), one obtains
\[ u^6 + 2u^3v^3 + v^6 = q^2, \]
\[ 4u^3v^3 = -4\left(\frac{p}{3}\right)^3. \]
Subtraction of the two equations yields
\[ (u^3 - v^3)^2 = q^2 + 4\left(\frac{p}{3}\right)^3, \]
\[ u^3 - v^3 = \pm\sqrt{q^2 + 4\left(\frac{p}{3}\right)^3}. \tag{8.55} \]
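By addition and subtraction of equations (8.54) and (8.55), one obtains
\[ u^3 = \frac{1}{2}\left(-q \pm\sqrt{q^2 + 4\left(\frac{p}{3}\right)^3}\right) \quad\text{and}\quad v^3 = \frac{1}{2}\left(-q \mp\sqrt{q^2 + 4\left(\frac{p}{3}\right)^3}\right), \]
\[ u = \sqrt[3]{-\frac{q}{2} \pm\sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3}} \quad\text{and}\quad v = \sqrt[3]{-\frac{q}{2} \mp\sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3}}. \tag{8.56} \]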
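If one sets
\[ \sqrt[3]{-\frac{q}{2} + \sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3}} = m \quad\text{and}\quad \sqrt[3]{-\frac{q}{2} - \sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3}} = n, \]
one gets
\[ u_1 = m, \qquad u_2 = m\varepsilon_2, \qquad u_3 = m\varepsilon_3, \]
\[ v_1 = n, \qquad v_2 = n\varepsilon_2, \qquad v_3 = n\varepsilon_3. \]
Here, the $\varepsilon_i$ are the cube roots of unity, i.e., the roots of the cubic equation $x^3 = 1$, which, as is evident, read
\[ \varepsilon_1 = 1, \qquad \varepsilon_2 = -\frac{1}{2} + \frac{i}{2}\sqrt{3}, \qquad \varepsilon_3 = -\frac{1}{2} - \frac{i}{2}\sqrt{3}. \]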
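Since now $y = u + v$, one can actually form 9 values for $y$ (why?). But since the quantities $u$ and $v$ must satisfy the determining equation (8.53), the number of possible combinations of $u$ and $v$ is restricted to 3, namely,
\[ y_1 = u_1 + v_1, \qquad y_2 = u_2 + v_3, \qquad y_3 = u_3 + v_2; \]
hence,
\[ \begin{aligned} y_1 &= m + n = \sqrt[3]{-\frac{q}{2} + \sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3}} + \sqrt[3]{-\frac{q}{2} - \sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3}}, \\ y_2 &= m\varepsilon_2 + n\varepsilon_3 = -\frac{m + n}{2} + \frac{m - n}{2}\,i\sqrt{3}, \\ y_3 &= m\varepsilon_3 + n\varepsilon_2 = -\frac{m + n}{2} - \frac{m - n}{2}\,i\sqrt{3}. \end{aligned} \tag{8.57} \]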
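The real root of the cubic equation, i.e., the root
\[ y_1 = \sqrt[3]{-\frac{q}{2} + \sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3}} + \sqrt[3]{-\frac{q}{2} - \sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3}} \]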
is known as the “Cardano formula.” It was named in honor of the Italian Hieronimo
Cardano2 to whom the discovery of the formula was falsely ascribed. Actually, the
2 Hieronimo Cardano, Italian physicist, mathematician, and astrologer, b. Sept. 24, 1501, Pavia–
d. Sept. 20, 1576, Rome. Cardano was the illegitimate son of Fazio (Bonifacius) Cardano, a friend
of Leonardo da Vinci. He studied at the universities of Pavia and Padua, and in 1526 he graduated in
medicine. In 1532, he went to Milan, where he lived in deep poverty, until he got a position teaching
in mathematics. In 1539, he worked at a high school of physics, where he soon became the director.
In 1543, he accepted a professorship for medicine in Pavia.
As a mathematician, Cardano was the most prominent personality of his age. In 1539, he published
two books on arithmetic methods. At this time, the discovery of a solution method for the cubic
equation became known. Nicolo Tartaglia, a Venetian mathematician, was the owner. Cardano tried in
vain to get permission to publish it. Tartaglia left the method to him under the condition that he keeps
it secret. In 1545, Cardano’s book Artis magnae sive de regulis algebraicis, one of the cornerstones of
the history of algebra, was published. The book contained, besides many other new facts, the method
of solving cubic equations. The publication caused a serious controversy with Tartaglia.
formula is due to the Bolognese professor of mathematics Scipione del Ferro,3 who found this ingenious algorithm.
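Example: $y^3 - 15y - 126 = 0$. Here,
\[ p = -15, \qquad q = -126, \qquad \frac{p}{3} = -5, \qquad \frac{q}{2} = -63. \]
By inserting into the Cardano formula, one obtains
\[ \begin{aligned} y_1 &= \sqrt[3]{63 + \sqrt{63^2 - 5^3}} + \sqrt[3]{63 - \sqrt{63^2 - 5^3}} \\ &= \sqrt[3]{63 + \sqrt{3844}} + \sqrt[3]{63 - \sqrt{3844}} \\ &= \sqrt[3]{63 + 62} + \sqrt[3]{63 - 62} \\ &= \sqrt[3]{125} + \sqrt[3]{1} \quad (= m + n) \\ &= 6, \\ y_2 &= -\frac{5 + 1}{2} + \frac{5 - 1}{2}\,i\sqrt{3} = -3 + 2i\sqrt{3}, \\ y_3 &= -\frac{5 + 1}{2} - \frac{5 - 1}{2}\,i\sqrt{3} = -3 - 2i\sqrt{3}. \end{aligned} \]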
Check the validity of the roots by insertion!
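The check by insertion can also be left to a few lines of Python. The sketch below evaluates (8.56) and (8.57) for a nonnegative radicand; the function name and the helper for the real cube root are assumptions of this illustration.

```python
import math

def cardano_roots(p, q):
    """Roots of y^3 + p*y + q = 0 from (8.56) and (8.57).
    Assumes the radicand (q/2)^2 + (p/3)^3 is nonnegative."""
    radicand = (q / 2.0) ** 2 + (p / 3.0) ** 3
    if radicand < 0:
        raise ValueError("casus irreducibilis: use the trigonometric solution")
    root = math.sqrt(radicand)
    # Real cube root that also works for negative arguments.
    real_cbrt = lambda v: math.copysign(abs(v) ** (1.0 / 3.0), v)
    m = real_cbrt(-q / 2.0 + root)
    n = real_cbrt(-q / 2.0 - root)
    y1 = m + n
    y2 = complex(-(m + n) / 2.0, (m - n) / 2.0 * math.sqrt(3.0))
    y3 = y2.conjugate()
    return y1, y2, y3

y1, y2, y3 = cardano_roots(-15.0, -126.0)
residuals = [abs(y**3 - 15.0 * y - 126.0) for y in (y1, y2, y3)]
```

All three entries of `residuals` vanish up to rounding errors.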
Discussion of Cardano’s formula: The square root appearing in the Cardano formula only yields a real value if the radicand $(q/2)^2 + (p/3)^3 \ge 0$. If the radicand is negative, the three values for $y$ appear as expressions containing complex numbers. We consider the possible cases:

Case                                 $\sqrt{(q/2)^2 + (p/3)^3}$    Form of the roots
(1) $p > 0$                          Real                          A real value, two complex conjugate values
(2) $p < 0$, namely,
    (a) $|(p/3)^3| < (q/2)^2$        Real                          As in (1)
    (b) $|(p/3)^3| = (q/2)^2$        $= 0$                         Three real values, among them a double root
    (c) $|(p/3)^3| > (q/2)^2$        Imaginary                     All three roots appear in imaginary form
The case (2c) was of particular interest to the mathematicians of the Middle Ages. Since any cubic equation has at least one real root, but they could not find it by means of Cardano’s formula, the case was called the casus irreducibilis.4 The first to solve this case was the French politician and mathematician Vieta.5 He proved by using trigonometry that this case was solvable too, and that in this case the equation has three real roots.
3 Scipione del Ferro, b. 1465(?)–d. 1526 (?). About his life we know only that he lectured from 1496 to 1526 at the university of Bologna. By 1500, he discovered the method of solving the cubic equation but did not publish it. Tartaglia rediscovered the method in 1535.
4 Casus irreducibilis (Lat.) = “the nonreducible case”.
Trigonometric solution of the irreducible case: Since $p$ is negative in this case, one starts from the reduced cubic equation
\[ y^3 - py + q = 0, \tag{8.58} \]
where $p$ now denotes the absolute value of the coefficient, so that $p > 0$. According to the trigonometric formulae we have
\[ \begin{aligned} \cos 3\alpha &= \cos(2\alpha + \alpha) = \cos 2\alpha\cos\alpha - \sin 2\alpha\sin\alpha \\ &= (\cos^2\alpha - \sin^2\alpha)\cos\alpha - 2\sin^2\alpha\cos\alpha \\ &= \cos^3\alpha - \sin^2\alpha\cos\alpha - 2\sin^2\alpha\cos\alpha \\ &= \cos^3\alpha - (1 - \cos^2\alpha)\cos\alpha - 2(1 - \cos^2\alpha)\cos\alpha \\ &= \cos^3\alpha - \cos\alpha + \cos^3\alpha - 2\cos\alpha + 2\cos^3\alpha \\ &= 4\cos^3\alpha - 3\cos\alpha, \end{aligned} \]
thus,
\[ \cos^3\alpha - \frac{3}{4}\cos\alpha - \frac{1}{4}\cos 3\alpha = 0. \tag{8.59} \]
If one considers $\cos\alpha$ to be the unknown, (8.59) has the same form as (8.58). But since the value of the cosine varies only between the limits $-1$ and $+1$, while $y$, according to the values of $p$ and $q$, can take any value, one cannot simply set $\cos\alpha = y$. By multiplying (8.59) by a still undetermined positive factor $\varrho^3$, one obtains
\[ \varrho^3\cos^3\alpha - \frac{3}{4}\,\varrho^2\cdot\varrho\cos\alpha - \frac{1}{4}\,\varrho^3\cos 3\alpha = 0. \tag{8.60} \]
By setting $\varrho\cdot\cos\alpha = y$, $p = (3/4)\varrho^2$, and $q = -(1/4)\varrho^3\cos 3\alpha$, (8.60) turns into (8.58). From this, we find
\[ \varrho = 2\cdot\sqrt{\frac{p}{3}} \tag{8.61} \]
and
\[ \cos 3\alpha = -\frac{4q}{\varrho^3} = \frac{-4q}{8\cdot(p/3)\sqrt{p/3}} = -\frac{q/2}{\sqrt{(p/3)^3}}. \tag{8.62} \]
Equation (8.62) is ambiguous, since the cosine is a periodic function. Denoting by $\varphi$ one angle whose cosine has the value (8.62), one has
\[ 3\alpha = \varphi + k\cdot 360^\circ, \quad\text{where } k = 0, 1, 2, 3, \ldots. \tag{8.63} \]
5 François Vieta, French mathematician, b. 1540, Fontenay-le-Comte–d. Dec. 13, 1603, Paris. Ad-
vocate and adviser of Parliament in the Bretagne. His greatest achievements were in the theory of
equations and algebra, where he introduced and systematically used letter notations. He established
the rules for the rectangular spherical triangle which are often ascribed to Neper. In his Canon math-
ematicus, a table of angular functions (1571), he emphasized the advantages of decimal notation.
[BR]
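From this, we find for $\alpha$
\[ \alpha_1 = \frac{\varphi}{3}, \qquad \alpha_2 = \frac{\varphi}{3} + 120^\circ, \qquad \alpha_3 = \frac{\varphi}{3} + 240^\circ. \]
Compare this consideration with the problem of cyclotomy! Which values are obtained for $\alpha$ if $k = 3, 4, \ldots$?
For $y$, one obtains
\[ y_1 = 2\sqrt{\frac{p}{3}}\cos\frac{\varphi}{3}, \qquad y_2 = 2\sqrt{\frac{p}{3}}\cos\left(\frac{\varphi}{3} + 120^\circ\right), \qquad y_3 = 2\sqrt{\frac{p}{3}}\cos\left(\frac{\varphi}{3} + 240^\circ\right). \]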
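Now
\[ \cos\left(\frac{\varphi}{3} + 120^\circ\right) = -\cos\left(60^\circ - \frac{\varphi}{3}\right) \]
and
\[ \cos\left(\frac{\varphi}{3} + 240^\circ\right) = -\cos\left(60^\circ + \frac{\varphi}{3}\right), \]
so that the roots of the cubic equation are
\[ \begin{aligned} y_1 &= 2\sqrt{\frac{p}{3}}\cos\frac{\varphi}{3}, \\ y_2 &= -2\sqrt{\frac{p}{3}}\cos\left(60^\circ - \frac{\varphi}{3}\right), \\ y_3 &= -2\sqrt{\frac{p}{3}}\cos\left(60^\circ + \frac{\varphi}{3}\right). \end{aligned} \tag{8.64} \]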
Comment: The formulas of the casus irreducibilis can also be derived by means of de Moivre’s theorem.
Example: Calculate the roots of the equation
\[ y^3 - 981y - 11340 = 0. \]
Solution: Since $p < 0$ and
\[ \left|\frac{p}{3}\right|^3 = 327^3, \qquad \log\left|\frac{p}{3}\right|^3 = 3\cdot\log 327 = 7.5436, \]
\[ \left(\frac{q}{2}\right)^2 = 5670^2, \qquad \log\left(\frac{q}{2}\right)^2 = 2\cdot\log 5670 = 7.5072, \]
by comparing the logarithms it follows that $|(p/3)^3| > (q/2)^2$. Thus, the condition of the casus irreducibilis is fulfilled. According to (8.62)
\[ \cos 3\alpha = +\frac{5670}{\sqrt{327^3}}, \]
\[ \log\cos 3\alpha = 3.7536 - 3.7718 = 9.9818 - 10, \]
\[ \varphi = 3\alpha \approx 16^\circ 30', \quad\text{hence,}\quad \frac{\varphi}{3} = \alpha = 5^\circ 30'. \]
From (8.64), we obtain $y_1 = 36$, $y_2 = -21$, $y_3 = -15$. Check the root values by insertion!
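Instead of logarithm tables, the same calculation can be done with a short Python sketch based on (8.62) and (8.64). The function name is hypothetical, and the argument `p` is understood as the positive coefficient of (8.58).

```python
import math

def irreducible_case_roots(p, q):
    """Roots of y^3 - p*y + q = 0 with p > 0, as in (8.58),
    from cos(phi) in (8.62) and the formulas (8.64)."""
    r = math.sqrt(p / 3.0)
    cos_phi = -(q / 2.0) / r**3
    if abs(cos_phi) > 1.0:
        raise ValueError("not the irreducible case")
    third = math.acos(cos_phi) / 3.0       # phi / 3 in radians
    sixty = math.pi / 3.0                  # 60 degrees in radians
    return (2.0 * r * math.cos(third),
            -2.0 * r * math.cos(sixty - third),
            -2.0 * r * math.cos(sixty + third))

# Example from the text: y^3 - 981 y - 11340 = 0, i.e., p = 981, q = -11340.
roots = irreducible_case_roots(981.0, -11340.0)
```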
9 Fourier Series
When setting the initial conditions for the problem of the vibrating string, a trigono-
metric series was set equal to a given function f (x). The expansion coefficients of
the series had to be determined. To solve the problem, the function f (x) should also
be represented by a trigonometric series. These trigonometric series are called Fourier
series.1 The conditions that allow an expansion of a function into a Fourier series are
summarized as follows:
(1) $f(x)$ is defined in the interval $a \le x < a + 2l$,
(2) $f(x)$ and $f'(x)$ are piecewise continuous on $a \le x < a + 2l$,
(3) $f(x)$ has a finite number of discontinuities which are finite jump discontinuities, and
(4) $f(x)$ has the period $2l$, i.e., $f(x + 2l) = f(x)$.
These conditions (Dirichlet conditions) are sufficient to represent $f(x)$ by a Fourier series:
\[ f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\left(a_n\cos\frac{n\pi x}{l} + b_n\sin\frac{n\pi x}{l}\right). \tag{9.1} \]
1 Jean Baptiste Joseph Fourier, b. March 21, 1768, Auxerre, son of a tailor–d. May 16, 1830, Paris.
Fourier attended the home École Militaire. Because of his origin he was excluded from an officer’s
career. Fourier decided to join the clergy, but did not take a vow because of the outbreak of the rev-
olution of 1789. Fourier first took a teaching position in Auxerre. Soon he turned to politics and
was arrested several times. In 1795, he was sent to Paris to study at the École Normale. He soon
became member of the teaching staff of the newly founded École Polytechnique. In 1798, he be-
came director of the Institut d’Egypte in Cairo. Only in 1801 did he return to Paris, where he was
appointed by Napoleon as a prefect of the departement Isère. During his term of office from 1802
to 1815, he arranged the drainage of the malaria-infested marshes of Bourgoin. After the downfall
of Napoleon, Fourier was dismissed from all posts by the Bourbons. However, in 1817 the king had
to agree to Fourier’s election to the Academy of Sciences, where he became permanent secretary
in 1822. Fourier’s most important mathematical achievement was his treatment of the notion of the
function. The problem of the vibrating string had already been treated by D’Alembert, Euler, and
Lagrange, and had been solved in 1755 by D. Bernoulli by means of a trigonometric series. The subsequent
question of whether an “arbitrary” function can be represented by such a series was answered 1807/12
by Fourier in the affirmative. The question about the conditions for such a representation could be
answered only by his friend Dirichlet. Fourier became known mainly by his Théorie analytique de la
chaleur (1822) which deals mainly with the discussion of the equation of heat propagation in terms of
Fourier-series. This work represents the starting point for treating partial differential equations with
boundary conditions by means of trigonometric series. Fourier also made important contributions to the
theory of solving equations and to the probability calculus.

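The Fourier coefficients $a_n$, $b_n$, and $a_0$ are determined as follows:
\[ \begin{aligned} a_n &= \frac{1}{l}\int_a^{a+2l} f(x)\cos\frac{n\pi x}{l}\,dx, \\ b_n &= \frac{1}{l}\int_a^{a+2l} f(x)\sin\frac{n\pi x}{l}\,dx, \\ a_0 &= \frac{1}{l}\int_a^{a+2l} f(x)\,dx. \end{aligned} \tag{9.2} \]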
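To prove these formulas, one needs the so-called orthogonality relations of the trigonometric functions:
\[ \begin{aligned} \int_0^{2l}\cos\frac{n\pi x}{l}\cos\frac{m\pi x}{l}\,dx &= l\,\delta_{nm}, \\ \int_0^{2l}\sin\frac{n\pi x}{l}\sin\frac{m\pi x}{l}\,dx &= l\,\delta_{nm}, \\ \int_0^{2l}\sin\frac{n\pi x}{l}\cos\frac{m\pi x}{l}\,dx &= 0. \end{aligned} \tag{9.3} \]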
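The first relation can be proven by means of the theorem
\[ \cos A\cos B = \frac{1}{2}\left[\cos(A + B) + \cos(A - B)\right]: \]
\[ \int_0^{2l}\cos\frac{n\pi x}{l}\cos\frac{m\pi x}{l}\,dx = \frac{1}{2}\int_0^{2l}\left[\cos\frac{(n + m)\pi x}{l} + \cos\frac{(n - m)\pi x}{l}\right]dx = 0, \]
if $n \ne m$, since the integral of the cosine function over a full period vanishes. For $n = m$ we have
\[ \int_0^{2l}\cos\frac{n\pi x}{l}\cos\frac{m\pi x}{l}\,dx = \frac{1}{2}\int_0^{2l}\left[1 + \cos\frac{2n\pi x}{l}\right]dx = l. \]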
The other relations can be proved in an analogous way.
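The orthogonality relations (9.3) are easily checked numerically. In the following Python sketch the function names, the midpoint rule, and the value chosen for $l$ are assumptions made for illustration.

```python
import math

def inner_product(f, g, half_period, samples=4000):
    """Midpoint-rule estimate of the integral of f(x) g(x) over [0, 2l]."""
    h = 2.0 * half_period / samples
    return sum(f((i + 0.5) * h) * g((i + 0.5) * h) for i in range(samples)) * h

half_period = 1.5   # assumed value of l

def cos_n(n):
    return lambda x: math.cos(n * math.pi * x / half_period)

def sin_n(n):
    return lambda x: math.sin(n * math.pi * x / half_period)

same_index = inner_product(cos_n(3), cos_n(3), half_period)        # n = m
different_index = inner_product(cos_n(3), cos_n(5), half_period)   # n != m
mixed = inner_product(sin_n(2), cos_n(2), half_period)             # sine with cosine
```

The three values agree with $l\,\delta_{nm}$ and $0$ up to rounding errors.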

Formula (9.2) for calculating the Fourier coefficients can be proved by means of
the orthogonality relations.
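To determine the $a_n$, one multiplies the equation
\[ f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n\cos\frac{n\pi x}{l} + \sum_{n=1}^{\infty} b_n\sin\frac{n\pi x}{l} \]
by $\cos(m\pi x/l)$ and then integrates over the interval 0 to $2l$:
\[ \begin{aligned} \int_0^{2l} f(x)\cos\frac{m\pi x}{l}\,dx &= \frac{a_0}{2}\int_0^{2l}\cos\frac{m\pi x}{l}\,dx + \sum_{n=1}^{\infty} a_n\int_0^{2l}\cos\frac{n\pi x}{l}\cos\frac{m\pi x}{l}\,dx \\ &\quad + \sum_{n=1}^{\infty} b_n\int_0^{2l}\sin\frac{n\pi x}{l}\cos\frac{m\pi x}{l}\,dx \\ &= \sum_{n=1}^{\infty} a_n\,l\,\delta_{nm} = l\,a_m, \end{aligned} \]
and therefore,
\[ a_m = \frac{1}{l}\int_0^{2l} f(x)\cos\frac{m\pi x}{l}\,dx, \tag{9.4} \]
as is given by (9.2).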
The analogous relation for the bm can be confirmed by multiplication of (9.1) by
sin(mπx/ l) and integration from 0 to 2l; the same holds for the calculation of a0 .
Functions that satisfy
\[ f(x) = f(-x) \]
are called even functions; functions with the property
\[ f(x) = -f(-x) \]
are called odd functions. For instance, $f(x) = \cos x$ evidently is an even function and $f(x) = \sin x$ an odd function. The part of (9.1)
\[ \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n\cos\frac{n\pi x}{l} \]
is obviously even, while
\[ \sum_{n=1}^{\infty} b_n\sin\frac{n\pi x}{l} \]
represents the odd part of the series expansion (9.1). Therefore, for even functions all $b_n = 0$, and for odd functions $a_0$ and all $a_n$ are equal to zero.
Any function $f(x)$ can be decomposed into an even and an odd part. Thus, $(f(x) + f(-x))/2$ is the even part and $(f(x) - f(-x))/2$ the odd part of $f(x) = (f(x) + f(-x))/2 + (f(x) - f(-x))/2$.
EXAMPLE
9.1 Inclusion of the Initial Conditions for the Vibrating String by Means of the
Fourier Expansion
A string is fixed at both ends. The center is displaced from the equilibrium position by the distance $H$ and then released. From Fig. 9.1 we see that the initial displacement is given by
\[ y(x,0) = f(x) = \begin{cases} \dfrac{2Hx}{l}, & 0 \le x \le \dfrac{l}{2}, \\[2mm] \dfrac{2H(l - x)}{l}, & \dfrac{l}{2} \le x \le l. \end{cases} \]
Fig. 9.1.
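If we continue $f(x)$ as an odd function (dashed line), we then obtain
\[ b_n = \frac{2}{l}\int_0^l f(x)\sin\frac{n\pi x}{l}\,dx = \frac{2}{l}\left[\int_0^{l/2}\frac{2Hx}{l}\sin\frac{n\pi x}{l}\,dx + \int_{l/2}^{l}\frac{2H}{l}(l - x)\sin\frac{n\pi x}{l}\,dx\right], \]
\[ \begin{aligned} \int_0^{l/2}\frac{2Hx}{l}\sin\frac{n\pi x}{l}\,dx &= \frac{2H}{l}\left[-x\,\frac{l}{n\pi}\cos\frac{n\pi x}{l} + \frac{l^2}{n^2\pi^2}\sin\frac{n\pi x}{l}\right]_0^{l/2} \\ &= \frac{2lH}{n^2\pi^2}\sin\frac{n\pi}{2} - \frac{Hl}{n\pi}\cos\frac{n\pi}{2}, \end{aligned} \]
\[ \begin{aligned} \int_{l/2}^{l}\frac{2H}{l}(l - x)\sin\frac{n\pi x}{l}\,dx &= \frac{2H}{l}\left[l\int_{l/2}^{l}\sin\frac{n\pi x}{l}\,dx - \int_{l/2}^{l}x\sin\frac{n\pi x}{l}\,dx\right] \\ &= \frac{2H}{l}\left[-\frac{l^2}{n\pi}\cos\frac{n\pi x}{l} + \frac{xl}{n\pi}\cos\frac{n\pi x}{l} - \frac{l^2}{n^2\pi^2}\sin\frac{n\pi x}{l}\right]_{l/2}^{l} \\ &= \frac{2lH}{n^2\pi^2}\sin\frac{n\pi}{2} + \frac{lH}{n\pi}\cos\frac{n\pi}{2}, \end{aligned} \]
\[ b_n = \frac{2}{l}\left[\frac{2lH}{n^2\pi^2}\sin\frac{n\pi}{2} + \frac{2lH}{n^2\pi^2}\sin\frac{n\pi}{2}\right] = \frac{8H}{n^2\pi^2}\sin\frac{n\pi}{2}. \]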
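By inserting the solution for the Fourier coefficient $b_n$ into the general solution of the differential equation (8.15), we get the equation that describes the vibrations of a string:
\[ \begin{aligned} y(x,t) &= \sum_{n=1}^{\infty}\frac{8H}{n^2\pi^2}\sin\frac{n\pi}{2}\,\sin\frac{n\pi x}{l}\cos\frac{n\pi c t}{l} \\ &= \frac{8H}{\pi^2}\left[\frac{1}{1^2}\sin\frac{\pi x}{l}\cos\frac{\pi c t}{l} - \frac{1}{3^2}\sin\frac{3\pi x}{l}\cos\frac{3\pi c t}{l} + \frac{1}{5^2}\sin\frac{5\pi x}{l}\cos\frac{5\pi c t}{l} - \cdots\right]. \end{aligned} \]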
Thus, by plucking the string in the center one essentially excites the fundamental
mode (lowest eigenvibration) sin(πx/l) cos(πct/l). Only the odd overtones (n = 3, 5, ...)
are admixed, with amplitudes that fall off as 1/n², since sin(nπ/2) vanishes for even n.
The initial displacement evidently resembles the fundamental vibration. If one wants to
excite pure overtones, the initial displacement must
be selected according to the desired higher harmonic vibration (compare Fig. 8.3).
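The closed form for the coefficients of the plucked string can be cross-checked by numerical integration. This is only a sketch: the function names and the values chosen for l and H are illustrative assumptions, and the integral is approximated by a simple midpoint rule.

```python
import math

def pluck_shape(x, l, H):
    """Triangular initial displacement of the string plucked at its center."""
    return 2 * H * x / l if x <= l / 2 else 2 * H * (l - x) / l

def b_numeric(n, l, H, steps=2000):
    """Midpoint-rule estimate of (2/l) * integral_0^l f(x) sin(n pi x / l) dx."""
    dx = l / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += pluck_shape(x, l, H) * math.sin(n * math.pi * x / l)
    return 2.0 / l * total * dx

def b_closed(n, H):
    """Closed form 8H / (n^2 pi^2) * sin(n pi / 2) derived in Example 9.1."""
    return 8 * H / (n * n * math.pi ** 2) * math.sin(n * math.pi / 2)

l, H = 1.0, 0.01                  # assumed sample values
for n in range(1, 6):
    print(n, b_numeric(n, l, H), b_closed(n, H))
```

The even coefficients come out as zero (up to rounding), and the odd ones alternate in sign and decrease as 1/n².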
EXERCISE
9.2 Fourier Series of the Sawtooth Function
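Problem. Find the Fourier series of the function
$$f(x) = 4x, \quad 0 \le x \le 10, \quad \text{with period } 2l = 10,\ l = 5.$$
Solution. The Fourier coefficients are
$$a_0 = \frac{1}{5}\int_0^{10} 4x\,dx = \frac{2}{5}x^2\Big|_0^{10} = 40,$$
$$\begin{aligned}
a_n &= \frac{1}{5}\int_0^{10} 4x\cos\frac{n\pi x}{5}\,dx = \frac{4x}{n\pi}\sin\frac{n\pi x}{5}\Big|_0^{10} - \frac{4}{n\pi}\int_0^{10}\sin\frac{n\pi x}{5}\,dx \\
&= 0 + \frac{20}{n^2\pi^2}\cos\frac{n\pi x}{5}\Big|_0^{10} = 0,
\end{aligned}$$
$$\begin{aligned}
b_n &= \frac{4}{5}\int_0^{10} x\sin\frac{n\pi x}{5}\,dx = -\frac{4x}{n\pi}\cos\frac{n\pi x}{5}\Big|_0^{10} + \frac{4}{n\pi}\int_0^{10}\cos\frac{n\pi x}{5}\,dx \\
&= -\frac{40}{n\pi} + \frac{20}{n^2\pi^2}\sin\frac{n\pi x}{5}\Big|_0^{10} = -\frac{40}{n\pi}.
\end{aligned}$$
Hence, the Fourier series reads
$$f(x) = 20 - \frac{40}{\pi}\sum_{n=1}^{\infty}\frac{1}{n}\sin\frac{n\pi x}{5}.$$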
Fig. 9.2.
The first partial sums Sn of this series are drawn in Fig. 9.2. A comparison of this
series with the starting curve f (x) illustrates the convergence of this Fourier series.
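The convergence of the partial sums can also be inspected numerically. The sketch below evaluates the partial sums S_N of the series found above; the function name and the sample point x = 2.5 (where f = 10) are chosen for illustration only.

```python
import math

def sawtooth_partial_sum(x, N):
    """Partial sum S_N(x) = 20 - (40/pi) * sum_{n=1}^{N} sin(n pi x / 5) / n."""
    series = sum(math.sin(n * math.pi * x / 5.0) / n for n in range(1, N + 1))
    return 20.0 - (40.0 / math.pi) * series

x = 2.5                           # here f(x) = 4x = 10
for N in (1, 5, 50, 500):
    print(N, sawtooth_partial_sum(x, N))

# At the jump x = 0 every partial sum equals the mean of the one-sided limits 0 and 40.
print(sawtooth_partial_sum(0.0, 10))
```

The approach to the value 10 is slow (the error decreases roughly like 1/N), which is typical for a function with jump discontinuities.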
EXERCISE
9.3 Vibrating String with a Given Velocity Distribution
Problem. Find the transverse displacement of a vibrating string of length l with
fixed endpoints if the string is initially in its rest position and has a velocity
distribution g(x).
Solution. We look for the solution of the boundary value problem
$$\frac{\partial^2 y}{\partial t^2} = c^2\frac{\partial^2 y}{\partial x^2}, \tag{9.5}$$
where y = y(x, t), with
$$y(0, t) = 0, \quad y(l, t) = 0, \quad y(x, 0) = 0, \quad \frac{\partial}{\partial t}y(x, t)\Big|_{t=0} = g(x). \tag{9.6}$$
We use the separation ansatz y = X(x) · T(t). By inserting it into (9.5), one obtains
$$X\cdot\ddot{T} = c^2 X'' T \quad\text{or}\quad \frac{X''}{X}(x) = \frac{\ddot{T}}{c^2 T}(t). \tag{9.7}$$
Since the left-hand side of (9.7) depends only on x, the right-hand side only on t, and x and
t are independent of each other, the equation can be satisfied only if both sides are
constant. The constant is denoted by $-\lambda^2$:
$$\frac{X''}{X} = -\lambda^2 \quad\text{and}\quad \frac{\ddot{T}}{c^2 T} = -\lambda^2,$$
or, transformed,
$$X'' + \lambda^2 X = 0 \quad\text{and}\quad \ddot{T} + \lambda^2 c^2 T = 0. \tag{9.8}$$
The two equations have the solutions
$$X = A_1\cos\lambda x + B_1\sin\lambda x, \qquad T = A_2\cos\lambda ct + B_2\sin\lambda ct.$$
Since y = X · T, we have
$$y(x, t) = (A_1\cos\lambda x + B_1\sin\lambda x)(A_2\cos\lambda ct + B_2\sin\lambda ct). \tag{9.9}$$
From the condition y(0, t) = 0, it follows that $A_1(A_2\cos\lambda ct + B_2\sin\lambda ct) = 0$. This
condition is satisfied by $A_1 = 0$. Then
$$y(x, t) = B_1\sin\lambda x\,(A_2\cos\lambda ct + B_2\sin\lambda ct).$$
We now set
$$B_1 A_2 = a, \qquad B_1 B_2 = b,$$
and it follows that
$$y(x, t) = \sin\lambda x\,(a\cos\lambda ct + b\sin\lambda ct). \tag{9.10}$$
From the condition y(l, t) = 0, it follows that sin λl = 0. This happens if
$$\lambda l = n\pi \quad\text{or}\quad \lambda = \frac{n\pi}{l}. \tag{9.11}$$
Here, n = 1, 2, 3, . . . . The value n = 0, which seems possible at first sight, leads to
y(x, t) ≡ 0 and must be excluded. The relation (9.11) is inserted into (9.10). The
normal vibration will be labeled by the index n:
$$y_n(x, t) = \sin\frac{n\pi x}{l}\left(a_n\cos\frac{n\pi ct}{l} + b_n\sin\frac{n\pi ct}{l}\right). \tag{9.12}$$
Because y(x, 0) = 0, all $a_n = 0$, and we have
$$y_n(x, t) = b_n\sin\frac{n\pi x}{l}\sin\frac{n\pi ct}{l}. \tag{9.13}$$
By differentiation of (9.13), we get
$$\frac{\partial y_n}{\partial t} = b_n\frac{n\pi c}{l}\sin\frac{n\pi x}{l}\cos\frac{n\pi ct}{l}. \tag{9.14}$$
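For linear differential equations, the superposition principle holds, so that the time
derivative of the entire solution looks as follows:
$$\frac{\partial y}{\partial t} = \sum_{n=1}^{\infty}\frac{n\pi c\,b_n}{l}\sin\frac{n\pi x}{l}\cos\frac{n\pi ct}{l}. \tag{9.15}$$
Because
$$\frac{\partial}{\partial t}y(x, t)\Big|_{t=0} = g(x),$$
it follows that
$$g(x) = \sum_{n=1}^{\infty}\frac{n\pi c\,b_n}{l}\sin\frac{n\pi x}{l}. \tag{9.16}$$
The Fourier coefficients then follow from
$$\frac{n\pi c\,b_n}{l} = \frac{2}{l}\int_0^l g(x)\sin\frac{n\pi x}{l}\,dx \tag{9.17}$$
or
$$b_n = \frac{2}{n\pi c}\int_0^l g(x)\sin\frac{n\pi x}{l}\,dx. \tag{9.18}$$
By inserting (9.18) into (9.13) and summing over n, we obtain the final solution for y(x, t):
$$y(x, t) = \sum_{n=1}^{\infty}\left[\frac{2}{n\pi c}\int_0^l g(x')\sin\frac{n\pi x'}{l}\,dx'\right]\sin\frac{n\pi x}{l}\sin\frac{n\pi ct}{l}. \tag{9.19}$$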
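As an illustration of (9.18) and (9.19), the following sketch evaluates the coefficients by a midpoint rule for an assumed velocity distribution g(x) = v0 sin(πx/l). This choice of g, the parameter values, and the function names are illustrative assumptions. For this g only b_1 = v0 l/(πc) should survive.

```python
import math

def b_coeff(n, g, l, c, steps=2000):
    """Coefficient b_n from (9.18), with the integral done by the midpoint rule."""
    dx = l / steps
    integral = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        integral += g(x) * math.sin(n * math.pi * x / l)
    return 2.0 / (n * math.pi * c) * integral * dx

def displacement(x, t, g, l, c, n_max=20):
    """Truncated series (9.19) for the displacement y(x, t)."""
    return sum(
        b_coeff(n, g, l, c)
        * math.sin(n * math.pi * x / l)
        * math.sin(n * math.pi * c * t / l)
        for n in range(1, n_max + 1)
    )

l, c, v0 = 1.0, 2.0, 0.5          # assumed sample values

def g(x):
    """Assumed initial velocity distribution (pure fundamental mode)."""
    return v0 * math.sin(math.pi * x / l)

print(b_coeff(1, g, l, c), v0 * l / (math.pi * c))   # only n = 1 contributes
print(b_coeff(2, g, l, c))                           # vanishes up to rounding
print(displacement(l / 2, l / (2 * c), g, l, c))     # maximum excursion at the center
```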
EXERCISE
9.4 Fourier Series for a Step Function
Problem. Given the function
$$f(x) = \begin{cases} 0, & \text{for } -5 \le x \le 0, \\ 3, & \text{for } 0 \le x \le 5, \end{cases} \qquad \text{period } 2l = 10.$$
(a) Sketch the function.

(b) Determine its Fourier series.
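Solution. (a)
$$f(x) = \begin{cases} 0, & \text{for } -5 \le x \le 0, \\ 3, & \text{for } 0 \le x \le 5, \end{cases} \qquad \text{period } 2l = 10.$$
Fig. 9.3.
(b) For period 2l = 10 and l = 5, we choose the interval a to a + 2l to be −5 to 5,
i.e., a = −5:
$$\begin{aligned}
a_n &= \frac{1}{l}\int_a^{a+2l} f(x)\cos\frac{n\pi x}{l}\,dx = \frac{1}{5}\int_{-5}^{5} f(x)\cos\frac{n\pi x}{5}\,dx \\
&= \frac{1}{5}\left[\int_{-5}^{0}(0)\cos\frac{n\pi x}{5}\,dx + \int_0^5 3\cos\frac{n\pi x}{5}\,dx\right] = \frac{3}{5}\int_0^5\cos\frac{n\pi x}{5}\,dx \\
&= \frac{3}{5}\left(\frac{5}{n\pi}\sin\frac{n\pi x}{5}\right)\Big|_0^5 = 0 \quad\text{for } n \neq 0.
\end{aligned}$$
For n = 0, one has $a_n = a_0 = (3/5)\int_0^5\cos(0\pi x/5)\,dx = (3/5)\int_0^5 dx = 3$.
Furthermore,
$$\begin{aligned}
b_n &= \frac{1}{l}\int_a^{a+2l} f(x)\sin\frac{n\pi x}{l}\,dx = \frac{1}{5}\int_{-5}^{5} f(x)\sin\frac{n\pi x}{5}\,dx \\
&= \frac{1}{5}\left[\int_{-5}^{0}(0)\sin\frac{n\pi x}{5}\,dx + \int_0^5 3\sin\frac{n\pi x}{5}\,dx\right] = \frac{3}{5}\int_0^5\sin\frac{n\pi x}{5}\,dx \\
&= \frac{3}{5}\left(-\frac{5}{n\pi}\cos\frac{n\pi x}{5}\right)\Big|_0^5 = \frac{3}{n\pi}(1 - \cos n\pi).
\end{aligned}$$
Thus,
$$f(x) = \frac{3}{2} + \sum_{n=1}^{\infty}\frac{3}{n\pi}(1 - \cos n\pi)\sin\frac{n\pi x}{5},$$
i.e.,
$$f(x) = \frac{3}{2} + \frac{6}{\pi}\left(\sin\frac{\pi x}{5} + \frac{1}{3}\sin\frac{3\pi x}{5} + \frac{1}{5}\sin\frac{5\pi x}{5} + \cdots\right).$$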
EXERCISE
9.5 On the Unambiguousness of the Tautochrone Problem
Problem. Which trajectory of the mass of a mathematical pendulum yields a pendulum
period that is independent of the amplitude?
Solution. We consider Fig. 9.4. From energy conservation, we have
$$\frac{m}{2}\dot{s}^2(y) + gmy = mgh \tag{9.20}$$
or
$$\dot{s}(y) = \sqrt{2g(h - y)}. \tag{9.21}$$
Fig. 9.4.
From this, one can calculate the period by separation of the variables:
$$\frac{1}{4}T = \int_0^{T/4} dt = \int_0^{s(h)}\frac{ds}{\sqrt{2g(h - y)}} = \int_0^h\frac{(ds/dy)\,dy}{\sqrt{2g(h - y)}}. \tag{9.22}$$
Using the variable u = y/h, (9.22) changes to
$$\frac{T}{4} = \int_0^1\frac{(ds/dy)\sqrt{h}\,du}{\sqrt{2g(1 - u)}}. \tag{9.23}$$
We now require that T be independent of the maximum height h:
$$\frac{dT}{dh} = 0 \quad\text{for all } h. \tag{9.24}$$
Thus, we get from (9.23) ($s' \equiv ds/dy$)
$$\frac{d}{dh}\int_0^1\frac{s'\sqrt{h}\,du}{\sqrt{2g(1 - u)}} = \int_0^1\frac{du}{\sqrt{2g(1 - u)}}\left(\frac{1}{2}h^{-1/2}s' + \sqrt{h}\,\frac{ds'}{dh}\right) = 0 \quad\text{for all } h. \tag{9.25}$$
With the condition that we keep the dimensionless variable u = y/h constant, we can
rewrite the derivative with respect to h as a derivative with respect to y,
$$\frac{ds'}{dh} = \frac{u\,ds'}{d(uh)} = u\frac{ds'}{dy} = us'', \tag{9.26}$$
and thus, we can transform (9.25) into
$$\int_0^1\frac{du}{\sqrt{8g(1 - u)}}\,(s' + 2ys'')\,\frac{1}{\sqrt{h}} = 0 \quad\text{for all } h. \tag{9.27}$$
Any periodic function f(u) satisfying $\int_0^1 f(u)\,du = 0$ can generally be expanded into
a Fourier series:
$$f(u) = \sum_{m=1}^{\infty}\left[a_m\sin(2\pi mu) + b_m\cos(2\pi mu)\right]. \tag{9.28}$$
Therefore, from (9.27) it follows that
$$\begin{aligned}
s'' + \frac{1}{2y}s' &= \frac{\sqrt{8gh(1 - u)}}{2y}\sum_{m=1}^{\infty}\left[a_m\sin(2\pi mu) + b_m\cos(2\pi mu)\right] \\
&= \frac{\sqrt{8g(h - y)}}{2y}\sum_{m=1}^{\infty}\left[a_m\sin\left(2\pi m\frac{y}{h}\right) + b_m\cos\left(2\pi m\frac{y}{h}\right)\right].
\end{aligned} \tag{9.29}$$
This holds for all values of h. The left-hand side of (9.29) does not contain h; therefore,
the right-hand side must be independent of h too. This holds only for $a_m = b_m = 0$
(for all m), as we shall prove now.
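To have the right-hand side of (9.29) independent of h, we must have
$$\sum_{m=1}^{\infty}\left[a_m\sin\left(2\pi m\frac{y}{h}\right) + b_m\cos\left(2\pi m\frac{y}{h}\right)\right] = \frac{\text{constant}\cdot(y/h)\,h^{1/2}}{\sqrt{8g(1 - y/h)}} \tag{9.30}$$
or
$$\sum_{m=1}^{\infty}\left[a_m\sin(2\pi mu) + b_m\cos(2\pi mu)\right] = \frac{u}{\sqrt{1 - u}}\,\frac{h^{1/2}}{\sqrt{8g}}\,C. \tag{9.31}$$
By integrating (9.31) from 0 to 1, we obtain
$$0 = \frac{h^{1/2}}{\sqrt{8g}}\,C\int_0^1\frac{u}{\sqrt{1 - u}}\,du = \frac{4}{3}\,\frac{h^{1/2}}{\sqrt{8g}}\,C; \tag{9.32}$$
thus, C = 0. (This reflects the fact that $u/\sqrt{1 - u}$ cannot be expanded into a Fourier
series à la (9.31).)
Inserting this result C = 0 again into (9.30), we have $a_m = b_m = 0\ \forall m$, and thus,
from (9.29)
$$s'' + \frac{s'}{2y} = 0. \tag{9.33}$$
From this, one finds by integrating once
$$\frac{s''}{s'} = -\frac{1}{2y} \quad\Rightarrow\quad s' \equiv \frac{ds}{dy} = \tilde{C}e^{-(1/2)\ln y} = \frac{\tilde{C}}{\sqrt{y}}. \tag{9.34}$$
The constant is usually denoted by
$$\tilde{C} = \sqrt{\frac{l}{2}}, \tag{9.35}$$
so that we have to solve
$$\frac{ds}{dy} = \sqrt{\frac{l}{2}}\,\frac{1}{\sqrt{y}}. \tag{9.36}$$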
This is the differential equation of a cycloid.2
(2004), Problem 24.4.
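The amplitude independence can be checked numerically from (9.22). The sketch below is an illustration under stated assumptions: the substitution y = h sin²θ (used only to remove the integrable singularity at y = h), the comparison with a circular arc of radius l, and all function names and parameter values are not taken from the text. With ds/dy = √(l/(2y)) the period should come out as 2π√(l/g) for every h, whereas for the circular pendulum it grows with h.

```python
import math

def period(ds_dy, h, g=9.81, steps=4000):
    """Period T = 4 * integral_0^h (ds/dy) dy / sqrt(2 g (h - y)), cf. (9.22).

    With y = h sin^2(theta): dy / sqrt(2 g (h - y)) = 2 sqrt(h) sin(theta) dtheta / sqrt(2 g).
    """
    dtheta = (math.pi / 2) / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * dtheta            # midpoint rule, never hits y = 0 or y = h
        y = h * math.sin(theta) ** 2
        total += ds_dy(y) * 2.0 * math.sqrt(h) * math.sin(theta) / math.sqrt(2.0 * g)
    return 4.0 * total * dtheta

l = 1.0                                        # assumed length parameter

def cycloid(y):
    """Slope ds/dy from (9.36)."""
    return math.sqrt(l / (2.0 * y))

def circle(y):
    """Slope ds/dy for a circular arc of radius l (ordinary pendulum), for comparison."""
    return l / math.sqrt(y * (2.0 * l - y))

for h in (0.1, 0.5, 0.9):
    print(h, period(cycloid, h), period(circle, h))
print(2 * math.pi * math.sqrt(l / 9.81))       # amplitude-independent reference value
```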
10 The Vibrating Membrane
We consider a two-dimensional system: the vibrating membrane. We shall see that the
methods applied for the treatment of a vibrating string can be simply transferred in
many respects.
The membrane is a skin without an elasticity of its own. The stretching of the
membrane along the edge leads to a tension force that acts as a restoring force on a
deformed membrane.
Let the tangential tension in the membrane be spatially constant and time indepen-
dent. We consider only vibrations with amplitudes so small that displacements within
the membrane plane can be neglected.
10.1 Derivation of the Differential Equation
We introduce the following notations: σ is the surface density of the membrane, and
the membrane tension is T (force per unit length). Let the coordinate system be ori-
ented so that the membrane lies in the x,y-plane. The displacements perpendicular to
this plane are denoted by u = u(x, y, t).
To set up the equation of motion, we imagine a cut of length Δx through the membrane
parallel to the x-axis, and a cut Δy parallel to the y-axis. The force acting on
the membrane element ΔxΔy in the x-direction is the product of the tension and the
length of the cut: $F_x = T\Delta y$. Analogously for the y-component we have $F_y = T\Delta x$.
The surface element ΔxΔy is pulled by the sum of the two forces. If the membrane
is displaced, the u-component of this sum acts on it.
From Fig. 10.1, we see
$$F_u = T\Delta x\,\bigl(\sin\varphi(y + \Delta y) - \sin\varphi(y)\bigr) + T\Delta y\,\bigl(\sin\vartheta(x + \Delta x) - \sin\vartheta(x)\bigr). \tag{10.1}$$
Since we restrict ourselves to small amplitudes and angles, the sine can be replaced
by the tangent. For the tangent we then insert the differential quotient, e.g.,
$$\tan\varphi(x, y + \Delta y) = \frac{\partial u}{\partial y}(x, y + \Delta y),$$
i.e., the partial derivative with respect to y at the point y + Δy.
Equation (10.1) then takes the form
$$F_u = T\Delta x\left[\frac{\partial u}{\partial y}(x, y + \Delta y) - \frac{\partial u}{\partial y}(x, y)\right] + T\Delta y\left[\frac{\partial u}{\partial x}(x + \Delta x, y) - \frac{\partial u}{\partial x}(x, y)\right].$$

Fig. 10.1. The vibrating membrane seen in perspective (a), various cuts through the membrane (b), and from the top (c)
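Factoring out the product TΔxΔy, one has
$$F_u = T\Delta x\Delta y\left[\frac{\frac{\partial u}{\partial y}(x, y + \Delta y) - \frac{\partial u}{\partial y}(x, y)}{\Delta y} + \frac{\frac{\partial u}{\partial x}(x + \Delta x, y) - \frac{\partial u}{\partial x}(x, y)}{\Delta x}\right].$$
We replace the area ΔxΔy of the membrane element by Δm/σ, where Δm is its
mass, and σ = Δm/(ΔxΔy) is the mass density per unit surface. Turning now to the
differentials, Δx, Δy → 0, we find
$$\lim_{\Delta x\to 0}\frac{\frac{\partial u}{\partial x}(x + \Delta x, y) - \frac{\partial u}{\partial x}(x, y)}{\Delta x} = \frac{\partial^2 u}{\partial x^2}(x, y)$$
or
$$F_u = T\frac{\Delta m}{\sigma}\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}\right).$$
With this force, we arrive at the equation of motion
$$\Delta m\frac{\partial^2 u}{\partial t^2} = T\frac{\Delta m}{\sigma}\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}\right).$$
With the abbreviation $T/\sigma = c^2$ and the Laplace operator Δ, one obtains
$$\Delta u - \frac{1}{c^2}\frac{\partial^2 u}{\partial t^2} = 0. \tag{10.2}$$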
This form of the wave equation is independent of the dimension of the vibrating
medium. If we insert the three-dimensional Laplace operator and set u = u(x, y, z, t),
(10.2) also holds for sound vibrations (u then represents the density variation of the
air). c is the propagation velocity of small perturbations (velocity of sound)—similar
to the case of the vibrating string.
10.2 Solution of the Differential Equation: Rectangular Membrane

We will now solve the two-dimensional wave equation (10.2) for the example of the
rectangular membrane.
We have the boundary conditions, which mean that the membrane cannot vibrate at
the boundary: u(0, y, t) = u(a, y, t) = u(x, 0, t) = u(x, b, t) = 0. To solve the
equation, we again use the product ansatz
$$u(x, y, t) = V(x, y)\cdot Z(t),$$
by means of which we first of all separate the space variables from the time variable.
Fig. 10.2. A rectangular membrane
Normal vibrations are of this type. All points x, y (mass points) then have the same
time behavior. This is typical for eigenvibrations. By insertion into the wave equation,
we obtain
$$\frac{\ddot{Z}(t)}{Z(t)} = c^2\frac{\Delta V(x, y)}{V(x, y)}.$$
Here, one has a function of only the position equal to a function that depends only on
the time. Thus, this identity is only valid if both functions are constants, i.e.,
unchanging with respect to space and time. The constant that equals these functions is denoted
by $-\omega^2$, the quotient $\omega^2/c^2$ by $k^2$.
One then has
$$\frac{\ddot{Z}}{Z} = -\omega^2, \tag{10.3}$$
$$\frac{\Delta V(x, y)}{V(x, y)} = -k^2, \qquad k^2 = \frac{\omega^2}{c^2}. \tag{10.4}$$
We can at once write down the general solution of (10.3):
$$Z(t) = A\sin(\omega t + \delta).$$
If we had selected a positive separation constant, i.e., $+\omega^2$ in (10.3), the solution
would have been $Z(t) = e^{\pm\omega t}$. This means that the solution would either explode
with the time ($e^{+\omega t}$) or fade away ($e^{-\omega t}$). The negative separation constant in (10.3)
obviously guarantees harmonic solutions.
In order to separate the two space variables, we use a further separation ansatz:
$$V(x, y) = X(x)\cdot Y(y).$$
Inserting it into (10.4) gives
$$\frac{\partial^2 X}{\partial x^2}Y + X\frac{\partial^2 Y}{\partial y^2} + k^2 XY = 0.$$
From this, it follows after division by X(x)Y(y) that
$$\frac{1}{X(x)}\frac{\partial^2 X(x)}{\partial x^2} + \frac{1}{Y(y)}\frac{\partial^2 Y(y)}{\partial y^2} + k^2 = 0, \qquad k^2 = \frac{\omega^2}{c^2}.$$
Here again, a function of x can equal a function of y only if both are constants.
We split the constant $k^2$ into
$$k^2 = k_x^2 + k_y^2$$
and thus obtain
$$\frac{1}{X}\frac{\partial^2 X}{\partial x^2} = -k_x^2, \qquad \frac{1}{Y}\frac{\partial^2 Y}{\partial y^2} = -k_y^2.$$
Therefore, one has
$$\frac{\partial^2 X}{\partial x^2} + k_x^2 X = 0, \quad\text{solution: } X(x) = A_1\sin(k_x x + \delta_1),$$
$$\frac{\partial^2 Y}{\partial y^2} + k_y^2 Y = 0, \quad\text{solution: } Y(y) = A_2\sin(k_y y + \delta_2).^{1}$$
By multiplying the partial solutions and combining the constants, one obtains the
complete solution of the two-dimensional wave equation:
$$u(x, y, t) = B\sin(k_x x + \delta_1)\sin(k_y y + \delta_2)\sin(\omega t + \delta).$$
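That this product solution satisfies the wave equation (10.2) whenever ω² = c²(k_x² + k_y²) can be confirmed with a finite-difference check. The sketch below uses arbitrarily assumed values for c, k_x, k_y and the phases; the helper names are illustrative.

```python
import math

c = 3.0                                   # assumed propagation velocity
kx, ky = 2.0, 1.5                         # assumed wave numbers
omega = c * math.sqrt(kx ** 2 + ky ** 2)  # omega = c k with k^2 = kx^2 + ky^2

def u(x, y, t, B=1.0, d1=0.3, d2=0.1, d=0.7):
    """Product solution B sin(kx x + d1) sin(ky y + d2) sin(omega t + d)."""
    return B * math.sin(kx * x + d1) * math.sin(ky * y + d2) * math.sin(omega * t + d)

def second_diff(f, s, h=1e-4):
    """Central finite-difference approximation of the second derivative f''(s)."""
    return (f(s + h) - 2.0 * f(s) + f(s - h)) / h ** 2

def wave_residual(x, y, t):
    """u_xx + u_yy - u_tt / c^2, which should vanish up to discretization error."""
    u_xx = second_diff(lambda s: u(s, y, t), x)
    u_yy = second_diff(lambda s: u(x, s, t), y)
    u_tt = second_diff(lambda s: u(x, y, s), t)
    return u_xx + u_yy - u_tt / c ** 2

print(u(0.4, 0.9, 0.2), wave_residual(0.4, 0.9, 0.2))
```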
10.3 Inclusion of the Boundary Conditions
With the given boundary conditions for u, we obtain
$$u(0, y, t) = B\sin\delta_1\sin(k_y y + \delta_2)\sin(\omega t + \delta) = 0,$$
$$u(x, 0, t) = B\sin(k_x x + \delta_1)\sin\delta_2\sin(\omega t + \delta) = 0.$$
Both equations are only satisfied for all values of the variables x, y, t if
$$\sin\delta_1 = \sin\delta_2 = 0,$$
which is for example correct for $\delta_1 = \delta_2 = 0$.
From this, we obtain the other boundary conditions:
$$u(a, y, t) = B\sin(k_x a)\sin(k_y y)\sin(\omega t + \delta) = 0,$$
$$u(x, b, t) = B\sin(k_x x)\sin(k_y b)\sin(\omega t + \delta) = 0.$$
From considerations similar to those above, we find
$$\sin(k_x a) = \sin(k_y b) = 0,$$
¹ One of the two separation constants $k_x^2$ or $k_y^2$ could in principle be chosen to be negative, so that
e.g., $k_x^2 - k_y^2 = k^2$. In this case we would get $Y = Ae^{k_y y} + Be^{-k_y y}$, and the boundary conditions
u(x, 0, t) = u(x, b, t) = 0 could be satisfied only by A = B = 0.
from which we get
$$k_x a = n_x\pi, \qquad k_y b = n_y\pi, \qquad\text{with } n_x, n_y = 1, 2, \ldots.$$
The values $n_x = 0$ or $n_y = 0$ must be excluded, since they lead to u(x, y, t) = 0, as for
the vibrating string.
Now we have
$$k^2 = k_x^2 + k_y^2 = \left(\frac{\pi}{a}\right)^2 n_x^2 + \left(\frac{\pi}{b}\right)^2 n_y^2,$$
and because ω = k · c, we find for the eigenfrequency
$$\omega_{n_x n_y} = c\pi\sqrt{\frac{n_x^2}{a^2} + \frac{n_y^2}{b^2}}.$$
10.4 Eigenfrequencies
Thus, the eigenfrequencies of the rectangular membrane are
$$\omega_{n_x n_y} = c\pi\sqrt{\frac{n_x^2}{a^2} + \frac{n_y^2}{b^2}},$$
where the lowest frequency is the fundamental harmonic:
$$\omega_{11} = c\pi\sqrt{\frac{1}{a^2} + \frac{1}{b^2}}.$$
For the string, we have $\omega_n = n\omega_1$, i.e., the higher harmonics are integer multiples
of the fundamental frequency. This is no longer valid in the two-dimensional case.
Contrary to the harmonic frequency spectrum ($\omega_n = n\omega_1$) of the string, the membrane
has an anharmonic spectrum ($\omega_{n_x n_y} \neq n\omega_{11}$).
10.5 Degeneracy
If in the special case of a square membrane, the edges have equal length, a = b, then
it follows that
$$\omega_{n_x n_y} = \sqrt{\frac{n_x^2 + n_y^2}{2}}\;\omega_{11}, \qquad \omega_{11} = \frac{\sqrt{2}\,c\pi}{a}.$$
The table of the ratios $\omega_{n_x n_y}/\omega_{11}$ for several values of the “quantum numbers”
$n_x$, $n_y$ of a square membrane shows (see Table 10.1) that for different pairs of “quantum
numbers” there exist the same eigenvalues, i.e., there are different possible
eigenvibrations with the same frequency. Such states are called degenerate. For a square
membrane, which is symmetric under interchange of the x- and y-coordinates,
all states $n_x n_y$ arranged symmetrically with respect to the main diagonal of the table
are degenerate.
Table 10.1. The ratio ω_{nx ny}/ω_11 as a function of n_x and n_y
n_y \ n_x     1      2      3      4
    1       1.00   1.58   2.24   2.92
    2       1.58   2.00   2.55   3.16
    3       2.24   2.55   3.00   3.54
    4       2.92   3.16   3.54   4.00
The degeneracy is removed at once if a ≠ b. Generally degeneracies appear only
in systems with definite symmetries.
We further recognize that the square membrane contains a fraction of harmonic
overtones (diagonal elements of the table).
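The entries of Table 10.1 follow directly from the ratio √((n_x² + n_y²)/2). A short sketch that rebuilds the table and collects modes of equal frequency is given below; the function and variable names are illustrative.

```python
import math
from collections import defaultdict

def ratio(nx, ny):
    """Ratio omega_{nx ny} / omega_11 for the square membrane (a = b)."""
    return math.sqrt((nx * nx + ny * ny) / 2.0)

# Rebuild Table 10.1: rows are n_y = 1..4, columns are n_x = 1..4.
table = [[round(ratio(nx, ny), 2) for nx in range(1, 5)] for ny in range(1, 5)]
for row in table:
    print(row)

# Modes with equal n_x^2 + n_y^2 share the same frequency, i.e., they are degenerate.
groups = defaultdict(list)
for nx in range(1, 5):
    for ny in range(1, 5):
        groups[nx * nx + ny * ny].append((nx, ny))
degenerate = {s: modes for s, modes in groups.items() if len(modes) > 1}
print(degenerate)
```

The diagonal entries (n_x = n_y = n) give the integer ratios n, which are the harmonic overtones mentioned above.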
10.6 Nodal Lines

At the points where the position-dependent part of the wave motion vanishes, the
string has a node, and the membrane correspondingly has a nodal line.
The position-dependent part reads
$$\sin\frac{n_x\pi x}{a}\,\sin\frac{n_y\pi y}{b}.$$
Then for $n_x = 2$ and $n_y = 1$ we have
$$\sin\frac{2\pi x}{a}\,\sin\frac{\pi y}{b} = 0$$
as the condition for a nodal line.
Away from the edges this condition is still satisfied for the straight line x = a/2,
which represents a nodal line for $(n_x, n_y) = (2, 1)$. In general all straight lines of the
form
$$x = \frac{ma}{n_x}; \qquad y = \frac{nb}{n_y} \qquad (m = 1, 2, \ldots,\ m < n_x;\quad n = 1, 2, \ldots,\ n < n_y)$$
are nodal lines.
10.7 General Solution (Inclusion of the Initial Conditions)

The general solution of the wave equation for the rectangular membrane, since it is a
linear differential equation, is obtained as a sum of the particular solutions
(superposition principle):
$$u(x, y, t) = \sum_{n_x=1}^{\infty}\sum_{n_y=1}^{\infty} c_{n_x n_y}\sin\frac{n_x\pi x}{a}\sin\frac{n_y\pi y}{b}\sin(\omega_{n_x n_y}t + \varphi_{n_x n_y}).$$
We now can evaluate the $c_{n_x n_y}$ and the $\varphi_{n_x n_y}$ from the initial conditions
$$u(x, y, t = 0) = u_0(x, y), \qquad \dot{u}(x, y, t = 0) = v_0(x, y).$$
Fig. 10.3. Nodal lines of several eigenvibrations
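For t = 0, the general solution and its time derivative read as follows:
$$u_0(x, y) = \sum_{n_x, n_y = 1}^{\infty} c_{n_x n_y}\sin\varphi_{n_x n_y}\cdot\sin\frac{n_x\pi x}{a}\cdot\sin\frac{n_y\pi y}{b},$$
$$v_0(x, y) = \sum_{n_x, n_y = 1}^{\infty}\omega_{n_x n_y}c_{n_x n_y}\cos\varphi_{n_x n_y}\cdot\sin\frac{n_x\pi x}{a}\cdot\sin\frac{n_y\pi y}{b}.$$
We redefine the constants:
$$A_{n_x n_y} = c_{n_x n_y}\sin\varphi_{n_x n_y}, \tag{10.5}$$
$$B_{n_x n_y} = \omega_{n_x n_y}c_{n_x n_y}\cos\varphi_{n_x n_y}. \tag{10.6}$$
The above equations then change to
$$u_0(x, y) = \sum_{n_x, n_y = 1}^{\infty} A_{n_x n_y}\sin\frac{n_x\pi x}{a}\sin\frac{n_y\pi y}{b}, \tag{10.7}$$
$$v_0(x, y) = \sum_{n_x, n_y = 1}^{\infty} B_{n_x n_y}\sin\frac{n_x\pi x}{a}\sin\frac{n_y\pi y}{b}. \tag{10.8}$$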
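The coefficients $A_{n_x n_y}$ and $B_{n_x n_y}$ can be determined by means of the orthogonality
relations. These read
$$\int_{-a}^{a}\sin\frac{\bar{n}_x\pi x}{a}\sin\frac{n_x\pi x}{a}\,dx = a\,\delta_{\bar{n}_x n_x}, \qquad \int_{-b}^{b}\sin\frac{\bar{n}_y\pi y}{b}\sin\frac{n_y\pi y}{b}\,dy = b\,\delta_{\bar{n}_y n_y}. \tag{10.9}$$
We assume (10.7) to be continued across the borders as an odd function, multiply
(10.7) by $\sin(\bar{n}_x\pi x/a)$, and integrate over x from −a to a. Next we multiply by
$\sin(\bar{n}_y\pi y/b)$ and integrate over y from −b to b:
$$\begin{aligned}
\int_{-a}^{a}\int_{-b}^{b} u_0(x, y)\sin\frac{\bar{n}_x\pi x}{a}\sin\frac{\bar{n}_y\pi y}{b}\,dx\,dy
&= 4\int_0^a\int_0^b u_0(x, y)\sin\frac{\bar{n}_x\pi x}{a}\sin\frac{\bar{n}_y\pi y}{b}\,dx\,dy \\
&= \sum_{n_x, n_y}^{\infty} A_{n_x n_y}\int_{-a}^{a}\sin\frac{n_x\pi x}{a}\sin\frac{\bar{n}_x\pi x}{a}\,dx\int_{-b}^{b}\sin\frac{\bar{n}_y\pi y}{b}\sin\frac{n_y\pi y}{b}\,dy \\
&= \sum_{n_x, n_y}^{\infty} A_{n_x n_y}\,\delta_{\bar{n}_x n_x}a\,\delta_{\bar{n}_y n_y}b = ab\,A_{\bar{n}_x\bar{n}_y}.
\end{aligned}$$
Likewise, we treat (10.8) to evaluate the coefficients $B_{n_x n_y}$. One then obtains
$$\begin{aligned}
A_{n_x n_y} &= \frac{4}{ab}\int_0^a\int_0^b u_0(x, y)\sin\frac{n_x\pi x}{a}\sin\frac{n_y\pi y}{b}\,dx\,dy, \\
B_{n_x n_y} &= \frac{4}{ab}\int_0^a\int_0^b v_0(x, y)\sin\frac{n_x\pi x}{a}\sin\frac{n_y\pi y}{b}\,dx\,dy.
\end{aligned} \tag{10.10}$$
With the knowledge of the $A_{n_x n_y}$ and $B_{n_x n_y}$, one now can calculate the $c_{n_x n_y}$ and $\varphi_{n_x n_y}$
from (10.5) and (10.6).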
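The procedure can be sketched numerically as follows. The example assumes an initial shape u0(x, y) = x(a − x) y(b − y), for which the integral in (10.10) can be done by hand and gives A_11 = 64a²b²/π⁶. This choice of u0, the midpoint-rule integration, and the function names are illustrative assumptions, not part of the text.

```python
import math

def mode_coefficient(f0, nx, ny, a, b, steps=200):
    """(4 / ab) * double integral of f0 * sin(nx pi x / a) * sin(ny pi y / b), cf. (10.10)."""
    dx, dy = a / steps, b / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        sx = math.sin(nx * math.pi * x / a)
        for j in range(steps):
            y = (j + 0.5) * dy
            total += f0(x, y) * sx * math.sin(ny * math.pi * y / b)
    return 4.0 / (a * b) * total * dx * dy

def amplitude_and_phase(A, B, omega):
    """Invert (10.5) and (10.6): A = c sin(phi), B = omega c cos(phi)."""
    c = math.hypot(A, B / omega)
    phi = math.atan2(A, B / omega)
    return c, phi

a, b = 1.0, 2.0                              # assumed edge lengths

def u0(x, y):
    """Assumed initial displacement, vanishing on the boundary."""
    return x * (a - x) * y * (b - y)

A11 = mode_coefficient(u0, 1, 1, a, b)
print(A11, 64 * a ** 2 * b ** 2 / math.pi ** 6)
print(mode_coefficient(u0, 2, 1, a, b))      # vanishes by symmetry about x = a/2

# Membrane released from rest (v0 = 0, hence B = 0): the phase becomes pi/2.
print(amplitude_and_phase(A11, 0.0, 1.0))
```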
10.8 Superposition of Node Line Figures
In the case of degenerate vibrations of the membrane, there can also appear node lines
that arise by superposition of the node line figures of the degenerate normal vibrations.
As an example we consider the position dependence of the degenerate normal
vibrations of the square membrane
$$u_{12} = \sin\frac{\pi x}{a}\sin\frac{2\pi y}{a}\sin\omega_{12}t \quad\text{and}\quad u_{21} = \sin\frac{2\pi x}{a}\sin\frac{\pi y}{a}\sin\omega_{21}t. \tag{10.11}$$
For the superposition of the two normal vibrations, we write
$$u = u_{12} + Cu_{21}.$$
The constant C specifies the particular kind of superposition. The equation of the
nodal line is obtained from u = 0. The common time factor $\sin\omega_{12}t = \sin\omega_{21}t$
obviously factors out. For the special case C = ±1, we find
$$\sin\frac{\pi x}{a}\sin\frac{2\pi y}{a} \pm \sin\frac{2\pi x}{a}\sin\frac{\pi y}{a} = 0$$
or, rewritten with sin 2α = 2 sin α cos α,
$$\sin\frac{\pi x}{a}\sin\frac{\pi y}{a}\left(\cos\frac{\pi y}{a} \pm \cos\frac{\pi x}{a}\right) = 0. \tag{10.12}$$
By setting the bracket equal to zero, we get the equations for the two nodal lines:
$$y = x \quad\text{for } C = -1 \qquad\text{and}\qquad y = a - x \quad\text{for } C = +1.$$
Fig. 10.4 illustrates the nodal lines.
Fig. 10.4. Nodal lines for degenerate eigenvibrations
We recognize that new vibrations with new kinds of nodal lines can be constructed
by superposing appropriate normal vibrations. One can excite such specific superpo-
sitions of normal vibrations by stretching wires along the nodal lines (right figure) so
that the membrane remains at rest along these lines.
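The diagonal nodal lines of the superposed vibrations can be verified by direct evaluation. The following sketch (with an assumed edge length a = 1 and an illustrative function name) evaluates the position-dependent part of u12 + C u21 along both diagonals.

```python
import math

def superposed_shape(x, y, C, a=1.0):
    """Position-dependent part of u12 + C * u21 for the square membrane."""
    u12 = math.sin(math.pi * x / a) * math.sin(2 * math.pi * y / a)
    u21 = math.sin(2 * math.pi * x / a) * math.sin(math.pi * y / a)
    return u12 + C * u21

a = 1.0
samples = [0.1 * i for i in range(1, 10)]

# C = -1: the diagonal y = x is a nodal line.
on_diagonal = max(abs(superposed_shape(x, x, -1.0, a)) for x in samples)
# C = +1: the other diagonal y = a - x is a nodal line.
on_antidiagonal = max(abs(superposed_shape(x, a - x, +1.0, a)) for x in samples)

print(on_diagonal, on_antidiagonal)
print(superposed_shape(0.3, 0.6, -1.0, a))   # a point off the nodal line
```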
10.9 The Circular Membrane
In the case of the circular membrane, it is convenient to change from the Cartesian
coordinates to polar coordinates, i.e., from u = f(x, y, t) to u = ψ(r, ϕ, t).
Fig. 10.5. Circular membrane (drum)
For this recalculation, we have
$$x = r\cos\varphi, \qquad y = r\sin\varphi, \qquad \tan\varphi = \frac{y}{x}, \qquad r = \sqrt{x^2 + y^2}. \tag{10.13}$$
For the transformation of the Laplace operator, we need the derivatives
$$\frac{\partial r}{\partial x} = \frac{x}{r} = \cos\varphi, \qquad \frac{\partial r}{\partial y} = \frac{y}{r} = \sin\varphi. \tag{10.14}$$
By differentiating the tangent, we get
$$\frac{\partial\tan\varphi}{\partial x} = \frac{\partial\tan\varphi}{\partial\varphi}\frac{\partial\varphi}{\partial x} = \frac{1}{\cos^2\varphi}\frac{\partial\varphi}{\partial x} = -\frac{y}{x^2}. \tag{10.15}$$
By inserting the polar representations for x and y, one gets ∂ϕ/∂x = −(sin ϕ)/r. The
corresponding differentiation of tan ϕ with respect to y yields ∂ϕ/∂y = (cos ϕ)/r. To
get the two-dimensional vibration equation in polar coordinates, we first transform the
Laplace operator Δ(x, y) to polar coordinates Δ(r, ϕ). The differential quotients are
interpreted as operators.
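We demonstrate the calculation for the x-component; the recalculation of the
y-component then runs likewise. According to the chain rule, we have
$$\frac{\partial}{\partial x} = \frac{\partial}{\partial r}\frac{\partial r}{\partial x} + \frac{\partial}{\partial\varphi}\frac{\partial\varphi}{\partial x}. \tag{10.16}$$
After insertion of the above results, we obtain
$$\frac{\partial}{\partial x} = \cos\varphi\frac{\partial}{\partial r} - \frac{\sin\varphi}{r}\frac{\partial}{\partial\varphi}. \tag{10.17}$$
We square this result, taking into account that the terms act on each other as operators.
(The square of an operator means double application.)
$$\frac{\partial^2}{\partial x^2} = \left(\cos\varphi\frac{\partial}{\partial r} - \frac{1}{r}\sin\varphi\frac{\partial}{\partial\varphi}\right)\left(\cos\varphi\frac{\partial}{\partial r} - \frac{1}{r}\sin\varphi\frac{\partial}{\partial\varphi}\right). \tag{10.18}$$
By multiplying out, one first gets the four terms
$$\frac{\partial^2}{\partial x^2} = \cos\varphi\frac{\partial}{\partial r}\cdot\cos\varphi\frac{\partial}{\partial r} + \frac{\sin\varphi}{r}\frac{\partial}{\partial\varphi}\cdot\frac{\sin\varphi}{r}\frac{\partial}{\partial\varphi} - \cos\varphi\frac{\partial}{\partial r}\cdot\frac{\sin\varphi}{r}\frac{\partial}{\partial\varphi} - \frac{\sin\varphi}{r}\frac{\partial}{\partial\varphi}\cdot\cos\varphi\frac{\partial}{\partial r}. \tag{10.19}$$
We now treat the individual terms according to the product rule:
$$\cos\varphi\frac{\partial}{\partial r}\cdot\cos\varphi\frac{\partial}{\partial r} = \cos^2\varphi\frac{\partial^2}{\partial r^2},$$
$$\frac{\sin\varphi}{r}\frac{\partial}{\partial\varphi}\cdot\frac{\sin\varphi}{r}\frac{\partial}{\partial\varphi} = \frac{\sin\varphi\cos\varphi}{r^2}\frac{\partial}{\partial\varphi} + \frac{\sin^2\varphi}{r^2}\frac{\partial^2}{\partial\varphi^2},$$
$$\cos\varphi\frac{\partial}{\partial r}\cdot\frac{\sin\varphi}{r}\frac{\partial}{\partial\varphi} = -\frac{\cos\varphi\sin\varphi}{r^2}\frac{\partial}{\partial\varphi} + \frac{\cos\varphi\sin\varphi}{r}\frac{\partial}{\partial r}\frac{\partial}{\partial\varphi},$$
$$\frac{\sin\varphi}{r}\frac{\partial}{\partial\varphi}\cdot\cos\varphi\frac{\partial}{\partial r} = -\frac{\sin^2\varphi}{r}\frac{\partial}{\partial r} + \frac{\sin\varphi\cos\varphi}{r}\frac{\partial}{\partial\varphi}\frac{\partial}{\partial r}.$$
From this, one obtains
$$\frac{\partial^2}{\partial x^2} = \cos^2\varphi\frac{\partial^2}{\partial r^2} + \frac{\sin^2\varphi}{r^2}\left(r\frac{\partial}{\partial r} + \frac{\partial^2}{\partial\varphi^2}\right) + \frac{2\sin\varphi\cos\varphi}{r^2}\left(\frac{\partial}{\partial\varphi} - r\frac{\partial}{\partial\varphi}\frac{\partial}{\partial r}\right).$$
Analogously, one gets for the y-component
$$\frac{\partial^2}{\partial y^2} = \sin^2\varphi\frac{\partial^2}{\partial r^2} + \frac{\cos^2\varphi}{r^2}\left(r\frac{\partial}{\partial r} + \frac{\partial^2}{\partial\varphi^2}\right) - \frac{2\sin\varphi\cos\varphi}{r^2}\left(\frac{\partial}{\partial\varphi} - r\frac{\partial}{\partial\varphi}\frac{\partial}{\partial r}\right).$$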
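By adding both expressions, we obtain the Laplace operator in polar coordinates:
$$\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} = \Delta = \frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\varphi^2}. \tag{10.20}$$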
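The polar form (10.20) can be tested on a function whose Cartesian Laplacian is known. The sketch below uses the assumed test function f = x³y, for which ∂²f/∂x² + ∂²f/∂y² = 6xy, and approximates the polar derivatives by central finite differences. The test function and helper names are illustrative assumptions.

```python
import math

def f_cart(x, y):
    """Assumed test function with known Cartesian Laplacian 6 x y."""
    return x ** 3 * y

def f_polar(r, phi):
    """The same function expressed through polar coordinates."""
    return f_cart(r * math.cos(phi), r * math.sin(phi))

def laplacian_polar(g, r, phi, h=1e-3):
    """Finite-difference version of g_rr + g_r / r + g_phiphi / r^2, cf. (10.20)."""
    g_rr = (g(r + h, phi) - 2.0 * g(r, phi) + g(r - h, phi)) / h ** 2
    g_r = (g(r + h, phi) - g(r - h, phi)) / (2.0 * h)
    g_pp = (g(r, phi + h) - 2.0 * g(r, phi) + g(r, phi - h)) / h ** 2
    return g_rr + g_r / r + g_pp / r ** 2

r, phi = 1.3, 0.7
x, y = r * math.cos(phi), r * math.sin(phi)
print(laplacian_polar(f_polar, r, phi), 6.0 * x * y)
```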
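The vibration equation then takes the following form:
$$\frac{\partial^2 u(r, \varphi, t)}{\partial r^2} + \frac{1}{r}\frac{\partial u(r, \varphi, t)}{\partial r} + \frac{1}{r^2}\frac{\partial^2 u(r, \varphi, t)}{\partial\varphi^2} = \frac{1}{c^2}\frac{\partial^2 u(r, \varphi, t)}{\partial t^2}. \tag{10.21}$$
The equation of motion is solved again by separation of the variables. We use a product
ansatz for separating the position and time functions:
$$u(r, \varphi, t) = V(r, \varphi)\cdot Z(t). \tag{10.22}$$
By insertion into the wave equation, we obtain
$$Z(t)\left[\frac{\partial^2 V}{\partial r^2} + \frac{1}{r}\frac{\partial V}{\partial r} + \frac{1}{r^2}\frac{\partial^2 V}{\partial\varphi^2}\right] = \frac{1}{c^2}V\frac{\partial^2 Z}{\partial t^2}. \tag{10.23}$$
We divide both sides by V(r, ϕ) · Z(t):
$$\frac{\dfrac{\partial^2 V}{\partial r^2} + \dfrac{1}{r}\dfrac{\partial V}{\partial r} + \dfrac{1}{r^2}\dfrac{\partial^2 V}{\partial\varphi^2}}{V(r, \varphi)} = \frac{1}{c^2}\frac{\ddot{Z}(t)}{Z(t)}. \tag{10.24}$$
As a separation constant, we choose
$$\frac{1}{c^2}\frac{\ddot{Z}}{Z} = -k^2 \tag{10.25}$$
and introduce the angular frequency ω by
$$\omega = ck. \tag{10.26}$$
From this, we get
$$\ddot{Z} + \omega^2 Z = 0 \tag{10.27}$$
with the solution
$$Z(t) = C\sin(\omega t + \delta). \tag{10.28}$$
By insertion of the constant $-k^2$, the equation of motion takes the form
$$\frac{\partial^2 V}{\partial r^2} + \frac{1}{r}\frac{\partial V}{\partial r} + \frac{1}{r^2}\frac{\partial^2 V}{\partial\varphi^2} + k^2 V = 0. \tag{10.29}$$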
We separate the radial and angular functions by a second product ansatz:
$$V(r, \varphi) = R(r)\cdot\phi(\varphi). \tag{10.30}$$
Hence, we obtain
$$\frac{\dfrac{d^2 R}{dr^2} + \dfrac{1}{r}\dfrac{dR}{dr}}{R(r)} + \frac{\dfrac{1}{r^2}\dfrac{d^2\phi}{d\varphi^2}}{\phi(\varphi)} + k^2 = 0. \tag{10.31}$$
We separate the variables by multiplying by $r^2$:
$$\frac{r^2(d^2 R/dr^2) + r(dR/dr)}{R(r)} + k^2 r^2 + \frac{d^2\phi/d\varphi^2}{\phi(\varphi)} = 0. \tag{10.32}$$
Here again, the equation can be valid only if both functions are constants. Hence, we
choose
$$\frac{1}{\phi}\frac{d^2\phi}{d\varphi^2} = -\sigma, \tag{10.33}$$
from which one obtains as a solution for φ(ϕ)
$$\phi(\varphi) = Ae^{i\sqrt{\sigma}\varphi} + Be^{-i\sqrt{\sigma}\varphi} = C\sin(m\varphi + \delta) \quad\text{with } m = \pm\sqrt{\sigma},\ m = 0, 1, 2, 3, \ldots. \tag{10.34}$$
m must take only integer values to get the periodicity of the solution. At the angle
2π + ϕ, the solution must be identical with that for the angle ϕ. This fact is often
described by the phrase periodic boundary conditions.
Now we can admit, without restricting the problem, only nonnegative m, since with
negative m only the sense of rotation is inverted.
Thus, the equation of motion for the radial function R looks as follows:
$$r^2\frac{d^2 R}{dr^2} + r\frac{dR}{dr} + k^2 r^2 R - \sigma R = 0$$
or
$$\frac{d^2 R}{dr^2} + \frac{1}{r}\frac{dR}{dr} + \left(k^2 - \frac{m^2}{r^2}\right)R = 0. \tag{10.35}$$
We substitute z = kr, dr = dz/k. Then we get
$$k^2\frac{d^2 R}{dz^2} + \frac{k^2}{z}\frac{dR}{dz} + \left(k^2 - \frac{m^2 k^2}{z^2}\right)R = 0,$$
$$\frac{d^2 R}{dz^2} + \frac{1}{z}\frac{dR}{dz} + \left(1 - \frac{m^2}{z^2}\right)R = 0. \tag{10.36}$$
In this form, the equation is called Bessel’s differential equation. This differential
equation and its solutions appear in many problems of mathematical physics.
10.10 Solution of Bessel’s Differential Equation²

The solution of our differential equation
$$\frac{d^2 g(z)}{dz^2} + \frac{1}{z}\frac{dg(z)}{dz} + \left(1 - \frac{m^2}{z^2}\right)g(z) = 0 \tag{10.37}$$
cannot be found by integration. Approaches using elementary functions also fail. We
therefore try with the most general power series expansion:
$$g(z) = z^{\mu}\sum_{n=0}^{\infty} a_n z^n. \tag{10.38}$$
² Friedrich Wilhelm Bessel, b. July 22, 1784, Minden–d. March 17, 1846, Königsberg (Kaliningrad).
Bessel was first a trade apprentice in Bremen, then until 1809 an assistant at the observatory in
Lilienthal, and then professor of astronomy in Königsberg and director of the observatory there. In
1838 he succeeded in measuring the annual parallax of the star 61 Cygni, thus becoming the first to
determine the distance to a fixed star. As a mathematician Bessel was best known for his investigations
on differential equations and on Bessel functions.
The separation of a power factor is not necessary, but will prove to be very convenient.
Since in the center of our membrane the vibration remains always finite, g(z) must
not have a singularity at z = 0. But since for z → 0 we have
$$g(z) \approx a_0 z^{\mu}, \tag{10.39}$$
for these physical reasons we must have μ ≥ 0. To get a more general statement, we
consider the asymptotic behavior of Bessel’s differential equation for z → 0 for at first
arbitrary μ.
We then can set as above
$$g(z) \approx a_0 z^{\mu} \tag{10.40}$$
and obtain by inserting:
$$\mu(\mu - 1)z^{\mu-2} + \mu z^{\mu-2} + z^{\mu} - m^2 z^{\mu-2} = \left[\mu(\mu - 1) + \mu + z^2 - m^2\right]z^{\mu-2} \approx (\mu^2 - m^2)z^{\mu-2} = 0, \tag{10.41}$$
since for z → 0 we also have $z^2 \to 0$. We thus have the condition
$$\mu^2 - m^2 = 0. \tag{10.42}$$
For the above-mentioned reasons, which are of a purely physical nature, it follows that
$$\mu = m, \qquad m \in \mathbb{N}_0. \tag{10.43}$$
The constant m is itself an integer. To see this, we remind ourselves of the angular
dependence of the total solution, namely,
$$f(\varphi) = \sin(m\varphi + \delta). \tag{10.44}$$
Since after a full revolution we return again to the same point of the membrane, the
solution function must have the period 2π. But this holds only if m is an integer!
We now try to determine the coefficients of our ansatz
$$g_m(z) = z^m(a_0 + a_1 z + a_2 z^2 + \cdots), \qquad m = 0, 1, 2, \ldots. \tag{10.45}$$

For this purpose, we insert the ansatz in the Bessel equation. The individual terms of
this equation then have the following form:
$$\frac{d^2 g}{dz^2} = z^{m-2}\left[a_0 m(m - 1) + a_1(m + 1)mz + a_2(m + 2)(m + 1)z^2 + a_3(m + 3)(m + 2)z^3 + \cdots\right],$$
$$\frac{1}{z}\frac{dg}{dz} = z^{m-2}\left[a_0 m + a_1(m + 1)z + a_2(m + 2)z^2 + a_3(m + 3)z^3 + \cdots\right],$$
$$g(z) = z^{m-2}(a_0 z^2 + a_1 z^3 + \cdots),$$
$$-\frac{m^2}{z^2}g(z) = z^{m-2}(-a_0 m^2 - a_1 m^2 z - a_2 m^2 z^2 - a_3 m^2 z^3 - \cdots).$$
The sum of the coefficients for each power of z must vanish, i.e., $a_0(m(m - 1) + m - m^2) = 0$.
Since the bracket vanishes, $a_0$ can be arbitrary.
For $a_1$, we get
$$a_1\left[m(m + 1) + (m + 1) - m^2\right] = 0, \qquad a_1(2m + 1) = 0, \quad\text{i.e., } a_1 = 0. \tag{10.46}$$
From the coefficient of $z^m$, it follows that
$$a_2\left[(m + 2)(m + 1) + (m + 2) - m^2\right] + a_0 = 0$$
or
$$a_2(4m + 4) = -a_0. \tag{10.47}$$
Furthermore, we get
$$a_3\left[(m + 3)(m + 2) + (m + 3) - m^2\right] + a_1 = 0, \qquad a_3(6m + 9) = -a_1, \quad\text{i.e., } a_3 = 0. \tag{10.48}$$
Generally, we find the condition equation
$$a_{p+2}\left[(m + p + 2)(m + p + 1) + (m + p + 2) - m^2\right] + a_p = 0,$$
$$a_{p+2}\left[(m + p + 2)^2 - m^2\right] = -a_p,$$
$$a_{p+2} = \frac{-a_p}{(m + p + 2)^2 - m^2} = \frac{-a_p}{(p + 2)(2m + p + 2)}. \tag{10.49}$$
This recursion formula allows one to determine the coefficient $a_{p+2}$ from the preceding
$a_p$. Because $a_1 = 0$, it follows that all $a_{2n-1}$ vanish, i.e., in the series expansion
of the solution function there appear only even exponents. For these one obtains with
$a_0 \neq 0$:
$$a_{2n} = \frac{-a_{2n-2}}{2n(2m + 2n)} = \frac{-a_{2n-2}}{2n\cdot 2(m + n)}. \tag{10.50}$$
In the next step, we replace $a_{2n-2}$ by $a_{2n-4}$ and obtain
$$a_{2n} = \frac{+a_{2n-4}}{2n(2n - 2)(2m + 2n)(2m + 2n - 2)} = \frac{a_{2n-4}}{2^2 n(n - 1)\,2^2(m + n)(m + n - 1)}. \tag{10.51}$$
By continuing this way, we can relate $a_{2n}$ back to $a_0$. We obtain
$$a_{2n} = \frac{(-1)^n a_0}{2^n n(n - 1)\cdots 1\cdot 2^n(m + n)(m + n - 1)\cdots(m + 1)} = \frac{(-1)^n a_0}{2^{2n}\,n!\,(m + n)!/m!}. \tag{10.52}$$
Thereby, we obtain the following solution functions:
$$g_m(z) = a_0 z^m m!\sum_{n=0}^{\infty}\frac{(-1)^n}{n!\,(m + n)!}\,\frac{z^{2n}}{2^{2n}}. \tag{10.53}$$
With the special choice $a_0\cdot m! = 2^{-m}$, we obtain the Bessel functions:
$$J_m(z) = \left(\frac{z}{2}\right)^m\sum_{n=0}^{\infty}\frac{(-1)^n}{n!\,(m + n)!}\left(\frac{z}{2}\right)^{2n} = \sum_{n=0}^{\infty}\frac{(-1)^n}{n!\,(m + n)!}\left(\frac{z}{2}\right)^{2n+m}. \tag{10.54}$$
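The series (10.54) can be summed directly for moderate arguments. The following sketch truncates the series after a fixed number of terms, which is an assumption made for simplicity; for large z the alternating terms cancel strongly and this naive summation loses accuracy. The function name is illustrative.

```python
import math

def bessel_j(m, z, terms=60):
    """Partial sum of the power series (10.54) for integer m >= 0."""
    half = z / 2.0
    total = 0.0
    for n in range(terms):
        total += (-1) ** n / (math.factorial(n) * math.factorial(m + n)) * half ** (2 * n + m)
    return total

print(bessel_j(0, 0.0))     # J_0(0) = 1
print(bessel_j(1, 0.0))     # J_m(0) = 0 for m >= 1
print(bessel_j(0, 1.0))     # approximately 0.7652
print(bessel_j(1, 1.0))     # approximately 0.4401
```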
The graph of the first Bessel functions is given in Fig. 10.6. We see that for large
arguments the Bessel functions vary like the trigonometric functions sine or cosine.
Now we can immediately write down the solutions of our differential equation:
$$V_m(r, \varphi) = c_m J_m(kr)\sin(m\varphi + \delta_m). \tag{10.55}$$
Fig. 10.6. Graphical representation of the lowest Bessel functions
The membrane cannot vibrate at the border r = a, i.e., the boundary condition reads
$$V(a, \varphi) = 0 \quad\text{for all } \varphi.$$
From this, we obtain the condition
$$J_m(k\cdot a) = 0,$$
from which the eigenfrequencies can be determined. For this purpose we must find the
zeros of the Bessel function:
$$\begin{aligned}
J_0(z) &= 1 - \frac{z^2}{4} + \frac{z^4}{64} - + \cdots = 0, \\
J_1(z) &= \frac{z}{2} - \frac{z^3}{16} + \frac{z^5}{384} - + \cdots = 0, \quad\text{etc.}
\end{aligned} \tag{10.56}$$
These zeros, except for the trivial ones for z = 0, can in general not be determined
exactly; they must be calculated by numerical methods. If we denote the nth node of
the function $J_m(z)$ by $z_n^{(m)}$, we obtain the following table for the values of the first
$z_n^{(m)}$:
Table 10.2. Zeros z_n^(m) of the Bessel functions J_m(z)
n \ m     0       1       2       3       4       5
  1      2.41    3.83    5.14    6.38    7.59    8.77
  2      5.52    7.02    8.42    9.76   11.06   12.34
  3      8.65   10.17   11.62   13.02   14.37   15.70
  4     11.79   13.32   14.80   16.22   17.62   18.98
  5     14.93   16.47   17.96   19.41   20.83   22.22
  6     18.07   19.62   21.12   22.58   24.02   25.43
  7     21.21   22.76   24.27   25.75   27.20   28.63
  8     24.35   25.90   27.42   28.91   30.37   31.81
  9     27.49   29.05   30.57   32.07   33.54   34.99
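One possible numerical method is sketched below: the series (10.54) is scanned for sign changes, and each zero is then refined by bisection. The step width, the truncation of the series, and the function names are assumptions of this sketch. Because the plain series suffers from cancellation at large z, it is only meant for the first few zeros, which should agree with the upper rows of Table 10.2 to about two decimal places.

```python
import math

def bessel_j(m, z, terms=60):
    """Partial sum of the power series (10.54) for integer m >= 0."""
    half = z / 2.0
    return sum(
        (-1) ** n / (math.factorial(n) * math.factorial(m + n)) * half ** (2 * n + m)
        for n in range(terms)
    )

def first_zeros(m, count, z_max=20.0, step=0.1):
    """First `count` positive zeros of J_m, found by sign-change scan plus bisection."""
    zeros = []
    z = step                              # start above the trivial zero at z = 0
    prev = bessel_j(m, z)
    while len(zeros) < count and z < z_max:
        nxt = bessel_j(m, z + step)
        if prev * nxt < 0:                # sign change brackets a zero
            lo, hi = z, z + step
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                if bessel_j(m, lo) * bessel_j(m, mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            zeros.append(0.5 * (lo + hi))
        z += step
        prev = nxt
    return zeros

for m in range(3):
    print(m, [round(zero, 2) for zero in first_zeros(m, 3)])
```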
Useful approximate solutions may also be obtained by considering the asymptotes
of the Bessel functions for z → ∞. Then
$$J_m(z) \to \sqrt{\frac{2}{\pi z}}\cos\left(z - \frac{m\pi}{2} - \frac{\pi}{4}\right). \tag{10.57}$$
We give this without proof. A look at the graphs of the Bessel functions
shows the close analogy with the cosine function for large arguments.
From this one can determine zeros:
$$\begin{aligned}
&\cos\left(\bar{z}_n^{(m)} - \frac{m\pi}{2} - \frac{\pi}{4}\right) = 0 \\
&\Rightarrow\quad \bar{z}_n^{(m)} - \frac{m\pi}{2} - \frac{\pi}{4} = n\pi - \frac{\pi}{2}, \\
&\bar{z}_n^{(m)} = n\pi + \frac{m\pi}{2} - \frac{\pi}{4} = (4n + 2m - 1)\frac{\pi}{4}.
\end{aligned} \tag{10.58}$$
A comparison of these values with the exact ones from Table 10.2 shows that
particularly for n large compared to m one obtains good approximate values:
Table 10.3. Comparison of the exact zeros z_n^(m) of the Bessel functions with the values z̄_n^(m) obtained from the
asymptotic approximation
              m = 0                  m = 5
  n      z_n^(0)   z̄_n^(0)     z_n^(5)   z̄_n^(5)
  1       2.41      2.36         8.77     10.21
  2       5.52      5.49        12.34     13.35
  ...      ...       ...          ...       ...
  9      27.49     27.49        34.99     35.34
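The asymptotic estimate (10.58) is simple enough to evaluate directly. In the sketch below the "exact" values are copied from the tables above; the dictionary layout and the function name are illustrative. The printed relative deviations show how the approximation improves as n grows relative to m.

```python
import math

def asymptotic_zero(m, n):
    """Approximate n-th zero of J_m from (10.58): (4n + 2m - 1) * pi / 4."""
    return (4 * n + 2 * m - 1) * math.pi / 4.0

# Reference values z_n^(m) as quoted in Tables 10.2 and 10.3, keyed by (m, n).
quoted = {(0, 1): 2.41, (0, 2): 5.52, (0, 9): 27.49,
          (5, 1): 8.77, (5, 2): 12.34, (5, 9): 34.99}

for (m, n), z_exact in sorted(quoted.items()):
    z_bar = asymptotic_zero(m, n)
    rel_dev = abs(z_bar - z_exact) / z_exact
    print(m, n, z_exact, round(z_bar, 2), round(rel_dev, 4))
```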
With the exact solutions $z_n^{(m)}$, the boundary condition is
$$k_n^{(m)}\cdot a = z_n^{(m)}, \qquad k_n^{(m)} = \frac{1}{a}\cdot z_n^{(m)}.$$
For the eigenfrequencies, we get
$$\omega_n^{(m)} = k_n^{(m)}\cdot c = \frac{c}{a}\cdot z_n^{(m)} = \omega_0\cdot z_n^{(m)}. \tag{10.59}$$
Thus, Table 10.2 also shows the values for the ratio $\omega_n^{(m)}/\omega_0$, where $\omega_0 = c/a$. By drawing all these
eigenfrequencies along an axis, one arrives at Fig. 10.7. The spacings between the
individual eigenfrequencies are highly irregular. Thus, we are dealing with extremely
anharmonic overtones. This is the reason why drums are badly suited as melodic
instruments!
Fig. 10.7. Linear representation of the eigenfrequencies of the circular membrane
The general solution of the vibration equation is the superposition of the normal
vibrations. It now reads
$$u(r, \varphi, t) = \sum_{m,n} c_n^{(m)}J_m(k_n^{(m)}r)\cdot\sin(m\varphi + \delta_m)\cdot\sin(\omega_n^{(m)}t + \delta_n^{(m)}). \tag{10.60}$$
In analogy to the Fourier analysis, the $c_n^{(m)}$ can be found so that u(r, ϕ, t) can be
adjusted to any given initial condition u(r, ϕ, 0) or u̇(r, ϕ, 0).
Finally, we want to get a survey of the nodal lines of the vibrating membrane. On
these lines we must have
$$u_{m,n}(r, \varphi) = c_n^{(m)}J_m(k_n^{(m)}r)\cdot\sin(m\varphi + \delta_m) = 0. \tag{10.61}$$
Then we get nodal lines if either
$$J_m(k_n^{(m)}r) = 0; \tag{10.62}$$
this is realized for
$$r = \frac{z_i^{(m)}}{k_n^{(m)}}, \qquad i = 1, 2, \ldots, n - 1; \tag{10.63}$$
or if
$$\sin(m\varphi + \delta_m) = 0, \tag{10.64}$$
i.e., for angles
$$\varphi = \frac{\nu\pi - \delta_m}{m}, \qquad \nu = 1, 2, \ldots, m. \tag{10.65}$$
For the first nodal lines, we get Fig. 10.8 (with $\delta_m = 0$).
Fig. 10.8. Nodal lines of the circular membrane
EXAMPLE
10.1 The Longitudinal Chain: Poincaré Recurrence Time
The equations of motion for a system with n vibrating mass points which are connected by n + 1 springs of equal spring constant k read
$$\begin{aligned}
m\ddot x_1 &= -k x_1 + k(x_2 - x_1),\\
m\ddot x_2 &= -k(x_2 - x_1) + k(x_3 - x_2),\\
m\ddot x_3 &= -k(x_3 - x_2) + k(x_4 - x_3),\\
&\;\;\vdots\\
m\ddot x_{n-1} &= -k(x_{n-1} - x_{n-2}) + k(x_n - x_{n-1}),\\
m\ddot x_n &= -k(x_n - x_{n-1}) - k x_n.
\end{aligned} \tag{10.66}$$
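With
$$\mathbf r = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix},$$
this can be written succinctly as
$$m\ddot{\mathbf r} = \hat C k\, \mathbf r, \tag{10.67}$$
where
$$\hat C = \begin{pmatrix}
-2 & 1 & & & & \\
1 & -2 & 1 & & & \\
 & 1 & -2 & 1 & & \\
 & & \ddots & \ddots & \ddots & \\
 & & & 1 & -2 & 1 \\
 & & & & 1 & -2
\end{pmatrix} \quad\text{and}\quad \mathbf r = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}. \tag{10.68}$$
Fig. 10.9.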
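With the ansatz
$$\mathbf r = \mathbf a \cos\omega t, \qquad \mathbf a = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix},$$
we look for the normal modes:
$$\underbrace{(k\hat C + m\omega^2 \hat E_n)}_{=\,\hat D_n(\omega)}\, \mathbf a \cos\omega t = 0, \qquad
\hat D_n(\omega) = \begin{pmatrix}
m\omega^2 - 2k & k & & \\
k & m\omega^2 - 2k & k & \\
 & \ddots & \ddots & \ddots
\end{pmatrix}. \tag{10.69}$$
Here,
$$\hat E_n = \begin{pmatrix}
1 & 0 & \ldots & 0 \\
0 & 1 & \ldots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \ldots & 1
\end{pmatrix}$$
represents the unit matrix.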

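For nontrivial solutions for $\mathbf a$, the determinant $D_n(\omega)$ of the matrix $\hat D_n(\omega)$ vanishes. Furthermore, we use for the coefficients $\mathbf a$ an ansatz with phase $\delta$ and $\gamma$ to be determined:
$$\mathbf a := (a_j = \sin(j\gamma - \delta),\ j = 1, 2, \ldots, n). \tag{10.70}$$
The evaluation of the line $j$ yields
$$\begin{aligned}
& k a_{j-1} + (m\omega^2 - 2k) a_j + k a_{j+1} = 0,\\
& k \sin((j - 1)\gamma - \delta) + (m\omega^2 - 2k) \sin(j\gamma - \delta) + k \sin((j + 1)\gamma - \delta) = 0,\\
\Rightarrow\ & k \cos\gamma + (m\omega^2 - 2k) + k \cos\gamma = 0,\\
\Leftrightarrow\ & \omega^2 = 2\frac{k}{m}(1 - \cos\gamma), \qquad \omega = 2\sqrt{\frac{k}{m}}\,\sin\frac{\gamma}{2}.
\end{aligned} \tag{10.71}$$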
We know that the characteristic polynomial has n zeros:
$$\omega_i = 2\sqrt{\frac{k}{m}}\,\sin\frac{\gamma_i}{2}, \qquad i = 1, 2, \ldots, n. \tag{10.72}$$
The boundary conditions $a_0 = a_{n+1} = 0$ must be satisfied. The first one requires that
$$\sin\delta = 0, \quad\text{hence } \delta = l\pi,\ l \in \mathbb Z, \text{ and therefore w.l.o.g. } l = 0. \tag{10.73}$$
From the second boundary condition, it further follows that
$$\sin((n + 1)\gamma_i) = 0 \;\Rightarrow\; \gamma_i = \frac{i\pi}{n+1}, \quad\text{for all } i \in \{1, \ldots, n\}. \tag{10.74}$$
We summarize the result for the ith eigenmode:
$$\mathbf r_i(t) = \left(\sin\left(j\,\frac{i\pi}{n+1}\right) \cdot \cos\omega_i t,\ j = 1, 2, \ldots, n\right) \tag{10.75}$$
with
$$\omega_i = 2\sqrt{\frac{k}{m}}\,\sin\frac{i\pi}{2n+2}. \tag{10.76}$$
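The eigenmodes can be checked numerically by inserting them into row j of the homogeneous system, $k a_{j-1} + (m\omega^2 - 2k)a_j + k a_{j+1} = 0$, with the fixed ends $a_0 = a_{n+1} = 0$. The following is a minimal sketch; the function names and the sample values of n, k, and m are illustrative assumptions, not part of the text.

```python
import math

def mode_frequency(i: int, n: int, k: float = 1.0, m: float = 1.0) -> float:
    """Eigenfrequency (10.76) of the i-th mode, i = 1..n."""
    return 2.0 * math.sqrt(k / m) * math.sin(i * math.pi / (2 * n + 2))

def mode_shape(i: int, n: int) -> list:
    """Amplitudes a_j = sin(j*i*pi/(n+1)), j = 1..n (phase delta = 0)."""
    return [math.sin(j * i * math.pi / (n + 1)) for j in range(1, n + 1)]

def residual(i: int, n: int, k: float = 1.0, m: float = 1.0) -> float:
    """Largest |k a_{j-1} + (m w^2 - 2k) a_j + k a_{j+1}| over all rows j."""
    w2 = mode_frequency(i, n, k, m) ** 2
    a = [0.0] + mode_shape(i, n) + [0.0]  # fixed ends: a_0 = a_{n+1} = 0
    return max(abs(k * a[j - 1] + (m * w2 - 2 * k) * a[j] + k * a[j + 1])
               for j in range(1, n + 1))

if __name__ == "__main__":
    n = 6
    for i in range(1, n + 1):
        print(i, round(mode_frequency(i, n), 4), residual(i, n))
```

The residuals are at the level of floating-point rounding for every mode, as expected if (10.75) and (10.76) solve the equations of motion.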
The general solution of (10.66) is a superposition of the various eigenmodes, i.e., a vector $\mathbf r(t)$ with the components $x_j(t)$:
$$x_j(t) = \sum_{i=1}^{n} (c_i \cos\omega_i t + b_i \sin\omega_i t) \cdot \sin\left(j\,\frac{i\pi}{n+1}\right), \quad\text{for } j = 1, 2, \ldots, n. \tag{10.77}$$
The coefficients $\sin(j\,i\pi/(n + 1))$ are, according to (10.70), (10.73) and (10.76), the components of the eigenvector to the ith mode, and since $\hat D_n(\omega)$ is symmetric, the latter ones represent an orthogonal basis in $\mathbb R^n$:
$$\sum_{j=1}^{n} \sin\left(j\,\frac{i\pi}{n+1}\right) \sin\left(j\,\frac{l\pi}{n+1}\right) = \frac{n+1}{2}\,\delta_{il}. \tag{10.78}$$
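We explicitly check this relation in the following Exercise 10.2. We define the orthonormal eigenmodes $\mathbf a_i$:
$$\mathbf a_i = \sqrt{\frac{2}{n+1}} \left(\sin\left(j\,\frac{i\pi}{n+1}\right),\ j = 1, 2, \ldots, n\right), \tag{10.79}$$
or in detail
$$\mathbf a_i = \sqrt{\frac{2}{n+1}} \left(\sin\frac{\pi i}{n+1},\ \sin\frac{2\pi i}{n+1},\ \ldots,\ \sin\frac{n\pi i}{n+1}\right).$$
The general solution can then be written as follows:
$$\mathbf r(t) = \sum_{i=1}^{n} (c_i \cos\omega_i t + b_i \sin\omega_i t)\,\mathbf a_i. \tag{10.80}$$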
The following interesting question arises: Let the system of n mass points (degrees of freedom) at the time $t_0$ be at $\mathbf r(t_0) = \mathbf r_0$ with the velocity $\dot{\mathbf r}(t_0) = \dot{\mathbf r}_0$. The system moves away from this configuration, but after a certain time $\tau$ it can closely approach the initial configuration and possibly return exactly into the initial configuration. We call this time $\tau$ the Poincaré recurrence time.³ One looks for the difference between the actual time-dependent state vector in the phase space $(\mathbf r(t), \dot{\mathbf r}(t))$ and the start vector $(\mathbf r_0, \dot{\mathbf r}|_{t=0})$:

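$$\varepsilon(t) := \sqrt{\,\|\mathbf r(t) - \mathbf r_0\|^2 + \|\dot{\mathbf r}(t) - \dot{\mathbf r}|_{t=0}\|^2_{\Omega}\,}. \tag{10.81}$$
The index $\Omega$ at the second scalar product for the velocities indicates a diagonal weight matrix which is suitably included into this normalization:
$$\Omega = \begin{pmatrix}
\dfrac{1}{\omega_1^2} & & & \\
 & \dfrac{1}{\omega_2^2} & & \\
 & & \ddots & \\
 & & & \dfrac{1}{\omega_n^2}
\end{pmatrix}; \tag{10.82}$$
$\omega_i$ are the eigenfrequencies. In this way the factors $\omega_i$ obtained by differentiation of $\mathbf r$ in (10.80) cancel out. So it is guaranteed that both terms under the root in (10.81) have the same dimension. We formulate with (10.80) the following initial value problem:
$$\mathbf r(0) = \mathbf r_0 = \sum_{i=1}^{n} c_i \mathbf a_i,$$
$$\dot{\mathbf r}(0) = \dot{\mathbf r}_0 = 0 = \sum_{i=1}^{n} b_i \mathbf a_i \omega_i \;\Rightarrow\; b_i = 0 \text{ for all } i.$$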
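For this choice, the distance $\varepsilon(t)$ in phase space, given by (10.81), is
$$\varepsilon(t) = \sqrt{\,\mathbf r_0 \cdot \mathbf r_0 - 2\mathbf r_0 \cdot \sum_i c_i \cos\omega_i t\, \mathbf a_i + \sum_i c_i^2 \cos^2\omega_i t + \sum_i c_i^2 \sin^2\omega_i t\,}. \tag{10.83}$$
Because $\mathbf r_0 = \sum_i c_i \mathbf a_i$, this turns into
$$\varepsilon(t) = \sqrt{\,\sum_{i=1}^{n} c_i^2 \bigl(1 - 2\cos\omega_i t + \underbrace{\cos^2\omega_i t + \sin^2\omega_i t}_{=1}\bigr)}
= 2\sqrt{\,\sum_{i=1}^{n} c_i^2 \sin^2\frac{\omega_i t}{2}\,}. \tag{10.84}$$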
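The phase-space distance can be explored numerically. The sketch below reads (10.81) as the square root of the sum of both squared norms, as the remark about "both terms under the root" indicates, and it assumes that the weight matrix acts in the basis of the orthonormal eigenmodes, so that each mode velocity is divided by its own $\omega_i$. The function names, the sample coefficients $c_i$, and the grid scan are illustrative assumptions.

```python
import math

def frequencies(n: int, k: float = 1.0, m: float = 1.0) -> list:
    """Eigenfrequencies (10.76) of the chain with n mass points."""
    return [2.0 * math.sqrt(k / m) * math.sin(i * math.pi / (2 * n + 2))
            for i in range(1, n + 1)]

def eps_closed(t: float, c: list, w: list) -> float:
    """Closed form (10.84): 2 * sqrt(sum c_i^2 sin^2(w_i t / 2))."""
    return 2.0 * math.sqrt(sum(ci ** 2 * math.sin(wi * t / 2) ** 2
                               for ci, wi in zip(c, w)))

def eps_direct(t: float, c: list, w: list) -> float:
    """(10.81) in mode coordinates for b_i = 0: position part plus
    velocity part weighted by 1/w_i^2."""
    pos = sum((ci * math.cos(wi * t) - ci) ** 2 for ci, wi in zip(c, w))
    vel = sum((-ci * wi * math.sin(wi * t)) ** 2 / wi ** 2 for ci, wi in zip(c, w))
    return math.sqrt(pos + vel)

def closest_return(c: list, w: list, t_min: float, t_max: float, steps: int = 20000):
    """Scan a uniform time grid and return (t, eps) with the smallest eps."""
    best = None
    for s in range(steps + 1):
        t = t_min + (t_max - t_min) * s / steps
        e = eps_closed(t, c, w)
        if best is None or e < best[1]:
            best = (t, e)
    return best

if __name__ == "__main__":
    w = frequencies(3)
    c = [1.0, 0.5, 0.25]
    print(closest_return(c, w, 1.0, 200.0))
```

For generic frequencies the scan finds times at which ε becomes small without vanishing exactly, which is the near recurrence discussed in the text.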
3 Jules-Henri Poincaré. See also footnote on page 461.
It is easily seen that for t > 0 this expression vanishes again only if the eigenfrequencies $\omega_i$ are related by rational fractions. In the general case $\varepsilon(t)$ is only conditionally periodic. This notion will be explained more precisely as follows: We first consider the purely periodic case. The period is then determined by the lowest of the frequencies that are related by rational fractions:

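$$\omega_i = 2\sqrt{\frac{k}{m}}\,\sin\frac{i\pi}{2n+2}. \tag{10.85}$$
We denote it by $\tilde\omega = q \cdot \omega_1$ ($q \in \mathbb Z^+$),
$$\tilde\omega = 2q\sqrt{\frac{k}{m}}\,\sin\frac{\pi}{2n+2} \approx 2q\sqrt{\frac{k}{m}}\,\frac{\pi}{2n+2}. \tag{10.86}$$
The last approximation holds for $n \gg 1$ (many mass points). To this frequency corresponds the time
$$\tau = \frac{2\pi}{q\omega_1} = 2\sqrt{\frac{m}{k}} \cdot \frac{n+1}{q}. \tag{10.87}$$
This is the Poincaré recurrence time, since after this time $\varepsilon(t)$ vanishes again, i.e., the initial configuration in the phase space is reached again after the time $\tau$. For very many mass points ($n \to \infty$), this time tends to infinity:
$$\tau \to \infty, \quad\text{for } n \to \infty. \tag{10.88}$$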
This is an important and physically plausible result: After preparation of an initial configuration, the system develops with time away from this configuration. At some time, namely after the Poincaré recurrence time, the state of motion of the system returns to the initial state (or in the general case very close to it). However, in the case of a great many degrees of freedom, the system “escapes” and the recurrence time becomes infinite. For example, if one of the n masses coupled by springs (say, the first one) is being pushed (this corresponds to setting $\mathbf r_0$ and $\dot{\mathbf r}_0$), the energy of this motion will spread more and more over the other masses. After the Poincaré time $\tau$ the first mass will have regained the entire energy. Only in the case of infinitely many coupled masses will this no longer happen, since $\tau \to \infty$. This is of great importance for the statistical behavior of systems of particles.
Addendum: Periodic systems with several degrees of freedom: Conditionally periodic systems. Let us define a periodic system with several degrees of freedom as a system for which according to (10.83) the orthogonal coordinates used in the description are periodic functions:
$$\mathbf r_i = \mathbf a_i \cos\omega_i t. \tag{10.89}$$
The quantities $\tau_i = 2\pi/\omega_i$ are the periods belonging to the coordinates $x_i$. In analogy to (10.77) and (10.80), we expand the general configuration vector $\mathbf r(t)$ in a Fourier series
$$\mathbf r(t) = \sum_i (c_i \cos\omega_i t + b_i \sin\omega_i t)\,\mathbf a_i. \tag{10.90}$$
The Fourier series (10.90) is in general no longer a periodic function with respect to the time t, although every individual term is periodic. The periodicity is assured only for such degrees of freedom whose frequencies $\omega_1, \omega_2, \ldots$ are related by rational fractions. Therefore systems with several degrees of freedom are called conditionally periodic systems.
The number of frequencies which are related by rational fractions determines the
degree of degeneracy of the system. If there are no relations of this kind, the system
is non-degenerate. If all frequencies are rationally related to each other, the system is
called fully degenerate. In this case we observe a periodic time function.
The Kepler problem treated earlier is an example of a system with two degrees of freedom $(r, \varphi)$ which is degenerate and thus has only one frequency. By introducing a perturbation, say a quadrupole-like potential with the typical variation $1/r^3$, the degeneracy can be removed, which causes a rosette-like motion.
As an example of a conditionally periodic motion, we note the anisotropic linear harmonic oscillator, which is a mass point with different spring constants in the various Cartesian directions. The trajectory of the mass point is a Lissajous figure which never closes on itself and in the course of time densely covers the area given by the amplitudes. Only in the case of degeneracy are there periodicities in the motion.
In the discussion of the Poincaré recurrence time, we assumed a periodic motion.
In the case of a conditionally periodic motion, the situation is—as was expected—
completely analogous. In this case, after the Poincaré recurrence time τ the configu-
ration vectors r(t), ṙ(t) come very close to the initial configuration r0 , ṙ0 . The initial
configuration will not be reached again, but will be reached “nearly” again after the
time τ . For further discussion, we refer to the literature.
EXERCISE
10.2 Orthogonality of the Eigenmodes
Problem. Prove the orthogonality relation for the eigenmodes
$$\sum_{j=1}^{n} \sin\left(j\,\frac{i\pi}{n+1}\right) \sin\left(j\,\frac{l\pi}{n+1}\right) = \frac{n+1}{2}\,\delta_{il}, \tag{10.91}$$
which is explicitly used in (10.78) of the last example.

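Solution. Proof of orthogonality of the eigenvectors (from here on the mode index i is written as k):
$$\begin{aligned}
d_{kl} &= \sum_{j=1}^{n} \sin\left(\frac{\pi k}{n+1}\,j\right) \sin\left(\frac{\pi l}{n+1}\,j\right)\\
&= \frac{1}{2} \sum_{j=1}^{n} \left[\cos\left(\frac{(k - l)\pi}{n+1}\,j\right) - \cos\left(\frac{(k + l)\pi}{n+1}\,j\right)\right].
\end{aligned}$$
Before continuing the exposition, we evaluate the sum of the following series (here k denotes the summation index):
$$\sum_{k=1}^{n} \cos kx = \frac{\sin(xn/2)\,\cos\bigl(x(n + 1)/2\bigr)}{\sin(x/2)} = \frac{\cos(xn/2)\,\sin\bigl(x(n + 1)/2\bigr)}{\sin(x/2)} - 1. \tag{10.92}$$
This result is easily obtained by writing the cosine in terms of exponential functions and then evaluating the sum as a geometrical series.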
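The case k = l yields
$$d_{kk} = \frac{1}{2}\left(n - \sum_{j=1}^{n} \cos\left(\frac{2k\pi}{n+1}\,j\right)\right)
= \frac{1}{2}\left(n + 1 - \frac{\cos\frac{kn\pi}{n+1}\,\sin k\pi}{\sin\frac{k\pi}{n+1}}\right) = \frac{1}{2}(n + 1), \tag{10.93}$$
since $\sin k\pi = 0$ for all k.
In the case $k \neq l$, we calculate the sums in both alternatives given in (10.92):
$$d_{kl} = \frac{1}{2}\left[\frac{\cos\frac{(k-l)\pi n}{2(n+1)}\,\sin\frac{(k-l)\pi}{2}}{\sin\frac{(k-l)\pi}{2(n+1)}} - \frac{\cos\frac{(k+l)\pi n}{2(n+1)}\,\sin\frac{(k+l)\pi}{2}}{\sin\frac{(k+l)\pi}{2(n+1)}}\right], \tag{10.94}$$
$$d_{kl} = \frac{1}{2}\left[\frac{\sin\frac{(k-l)\pi n}{2(n+1)}\,\cos\frac{(k-l)\pi}{2}}{\sin\frac{(k-l)\pi}{2(n+1)}} - \frac{\sin\frac{(k+l)\pi n}{2(n+1)}\,\cos\frac{(k+l)\pi}{2}}{\sin\frac{(k+l)\pi}{2(n+1)}}\right]. \tag{10.95}$$
The vanishing of $d_{kl}$ ($k \neq l$) is immediately seen from (10.94) for even k − l and k + l, and from (10.95) for odd k − l and k + l.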
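As a complement to the analytic proof, the orthogonality relation can be checked numerically for small n. This is only a sanity check under floating-point arithmetic, not a substitute for the proof; the function names are illustrative.

```python
import math

def overlap(i: int, l: int, n: int) -> float:
    """Left-hand side of (10.91): sum over j of sin(j i pi/(n+1)) sin(j l pi/(n+1))."""
    return sum(math.sin(j * i * math.pi / (n + 1)) * math.sin(j * l * math.pi / (n + 1))
               for j in range(1, n + 1))

def max_deviation(n: int) -> float:
    """Largest |overlap - (n+1)/2 * delta_il| over all pairs of modes."""
    return max(abs(overlap(i, l, n) - ((n + 1) / 2 if i == l else 0.0))
               for i in range(1, n + 1) for l in range(1, n + 1))

if __name__ == "__main__":
    for n in (2, 5, 12):
        print(n, max_deviation(n))
```

The deviations stay at rounding level, for equal and for unequal mode indices alike.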
Supplement: Everything Happened Already—A Physical Theorem?
Near the end of the nineteenth century, many physicists discussed the hypothesis
that the course of the world repeats in eternal cycles. This interest was stimulated
mainly by the works of Henri Poincaré (1854–1912). Also a philosopher like Friedrich
Nietzsche (1844–1900) was tempted by this theorem to a short guest performance in
physics. The first speculations along these lines originated from almost nonscientific
attempts to explain the phenomenon of heat. The Lord of Verulam (1561–1626), Fran-
cis Bacon, had already identified heat as a form of motion but had failed to construct
a quantitative theory. For lack of systematic investigations he had included the de-
velopment of heat in dung hills in his considerations. The topic attracted more and
more actors from physics, metaphysics, philosophy, politics and theology. We repro-
duce here two quotations from Poincaré’s works from this era. Henri Poincaré in 1893
wrote in “Review of metaphysics and morality”:
Everybody knows the mechanistic world view that tempted so many good people, and the vari-
ous forms in which it comes up. Some people imagine the material world as being composed of
atoms which move along straight lines because of their inertia, and change their velocity only
if two atoms collide. Other people assume that the atoms perform an attraction or repulsion
on each other, which depends on their distance. The following considerations will meet both
points of view.
It would possibly be appropriate to dispute here the metaphysical difficulties that are related
to these opinions, but I don’t have the necessary expert knowledge. Therefore I will deal here
only with the difficulties the mechanists met when they tried to reconcile their system with the
experimental facts, and with the efforts they made to overcome or to elude these difficulties.
According to the mechanistic hypothesis all phenomena must be reversible; the stars for
example could move along their orbits also in the opposite sense, without conflicting with
Newton’s laws. Reversibility is a consequence of all mechanistic hypotheses.
A theorem that can easily be proved tells us that a restricted world, which is governed only
by the laws of mechanics, will pass again and again a state that is very close to its initial state.
On the other hand, according to the assumed experimental laws the universe tends towards a
certain final state which it never will leave. In this final state which represents some kind of
death, all bodies will be at rest at the same temperature.
The doubts provoked this way, accompanying the developing theory of heat based
upon an irreversible motion of atomic particles, have not yet been clearly removed.
A classical illustrative example of Poincaré’s “recurrence objection” is the parti-
tioned box, one half filled with gas which uniformly distributes over the entire box
after removal of the membrane. Experience tells us what happens, and an inversion of
this “irreversible process” is never observed in practice. But Poincaré did not think at
all of an inversion, but rather of the chance that brought the particles into the ini-
tially empty half. This chance—after some “appropriate time”—should also bring
them back again to the initial half.
In 1955, Enrico Fermi,4 John Pasta,5 and Stanislaw Ulam6 considered a problem
which corresponds to our Example 10.1, except for the additional inclusion of a non-
linear coupling term. Their interest focused on finding, by means of the first com-
puters, recurring processes such as we looked for in our purely linear problem. Sur-
prisingly, they found an almost perfect recurrence of the initial conditions after large
numbers of oscillations. The investigations and reflections of such properties of non-
linear wave equations continue to this day and have been introduced into the theory of
elementary particles (solitons).
4 Enrico Fermi, Italian physicist, b. September 29, 1901, Rome–d. November 28, 1954, Chicago.
Following studies in Pisa and research stays in Göttingen, Leiden and Rome, Fermi became a profes-
sor of physics in Rome in 1925. He built up a research group which achieved leading experimental
and theoretical results in nuclear physics. In 1938, Fermi was awarded the Nobel Prize in physics for
his work on radioactive elements created by neutron bombardment and nuclear reactions triggered
by slow neutrons. In the same year he left Italy, worked at Columbia University in New York and
finally became professor in Chicago. During the war, Fermi was involved in the Manhattan Project to
produce the nuclear bomb, and was the driving force in the development of the first nuclear reactor in
1942. Fermi did seminal work in both experimental and theoretical physics, contributing to statistical
mechanics, the general theory of relativity, and the theory of weak interactions.
5 John Pasta, American physicist and computer scientist, b. 1918, New York–d. 1984, Chicago.
Following different employments with police and the military, Pasta obtained a Ph.D. in theoretical
physics and joined the Los Alamos National Laboratory, where he worked in the group of Nicholas
Metropolis, constructing the MANIAC I computer. Later he became expert for computing with the
Atomic Energy Commission, and in 1964 professor for computer science at the University of Illinois
in Urbana-Champaign.
6 Stanislaw Marcin Ulam, Polish mathematician, b. April 13, 1909, Lemberg (today Lviv, Ukraine)–
d. May 13, 1984, in Santa Fé, New Mexico. Ulam was a student of mathematics with Stefan Banach
and contributed to measure theory, topology, and ergodic theory. In 1938 he came to the US as a Har-
vard Junior Fellow, and later joined the Manhattan project, on the intervention of John von Neumann.
Together with von Neumann, he developed the Monte Carlo method to solve numerical problems
using random numbers. Ulam suggested the functional principle of the first hydrogen bomb.
Part IV
Mechanics of Rigid Bodies
11 Rotation About a Fixed Axis
As we have seen in Chap. 4, a rigid body has 6 degrees of freedom, 3 of translation

and 3 of rotation. The most general motion of a rigid body can be separated into the
translation of a body point and the rotation about an axis through this point (Chasles’
theorem). In the general case the rotation axis will change its orientation too. The
meaning of the 6 degrees of freedom becomes clear once again: The 3 translational
degrees of freedom give the coordinates of the particular body point, 2 of the rotational
degrees of freedom determine the orientation of the rotation axis, and the third one
fixes the rotation angle about this axis.
If a point of the rigid body is kept fixed, then any displacement corresponds to a
rotation of the body about an axis through this fixed point (Euler’s theorem). Hence,
there exists an axis (through the fixed point) such that the result of several consecutive
rotations can be replaced by a single rotation about this axis.
For an extended body the vanishing of the sum of all acting forces is no longer
sufficient as an equilibrium condition.
Two oppositely oriented equal forces −F and F that act at two points of a body
separated by the distance vector l are called a couple. A couple causes, independent
of the reference point, the torque
D = l × F.
Fig. 11.1. A couple causes a
torque
While the torque on a mass point is always related to a fixed point, the torque of a
couple is completely free and can be shifted in space.
The forces acting on a rigid body can always be replaced by a total force acting on
an arbitrary point, and a couple. This can easily be shown by the following example:
At the point P1 , the force F1 acts. Nothing is changed if we let the forces −F1 and
F1 act at O . The force F1 acting on P1 and the force −F1 acting on O represent a
couple, and there remains the force F1 acting on O .
If there are several forces acting, we combine them into the resultant force $\mathbf F = \sum_i \mathbf F_i$. The torque is then given by $\mathbf D = \sum_i \mathbf r_i \times \mathbf F_i$.
An extended body is in equilibrium if both the total force and the total torque vanish:
$$\sum_i \mathbf F_i = 0 \quad\text{and}\quad \sum_i \mathbf r_i \times \mathbf F_i = 0$$
(equilibrium condition at the point $O$).
For the calculation of the equilibrium condition, the origin of the vectors $\mathbf r_i$ (reference point of the moments) is arbitrary. Actually, for the point $O'$ it follows that (see Fig. 11.2)
$$\sum_i \mathbf F_i = 0$$
and
$$\sum_i \mathbf r'_i \times \mathbf F_i = \sum_i (\mathbf a + \mathbf r_i) \times \mathbf F_i = \mathbf a \times \sum_i \mathbf F_i + \sum_i \mathbf r_i \times \mathbf F_i = 0,$$
i.e., the condition that the sum of all forces and the sum of all torques must vanish.
Fig. 11.2. The forces acting on a rigid body are equivalent to a total force and a couple
11.1 Moment of Inertia (Elementary Consideration)
A rigid body rotates about a rotation axis z fixed in space. By substituting $v_i = \omega \cdot r_i$ for the velocity in the kinetic energy, one obtains
$$T = \sum_i \frac{1}{2} m_i v_i^2 = \frac{1}{2}\omega^2 \sum_i m_i r_i^2 = \frac{1}{2}\Theta\omega^2.$$
Analogously, for the angular momentum in z-direction we have
$$L_z = \sum_i m_i r_i v_i = \omega \sum_i m_i r_i^2 = \Theta\omega.$$
Fig. 11.3. Rotation about the fixed z-axis with the angular velocity $\omega$
Here, $r_i$ is the distance of the ith mass element from the z-axis.
The sum appearing in both relations is called the moment of inertia with respect to the rotation axis. One has
$$\Theta = \sum_i m_i r_i^2.$$
To calculate the moments of inertia of extended continuous systems, we change from the sum to the integral, i.e.,
$$\Theta = \int_{\text{body}} r^2\, dm = \int_{\text{body}} \varrho\, r^2\, dV,$$
with $\varrho$ representing the density.
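For a spatially extended, not axially symmetric rigid body which rotates about the z-axis, there can also appear components of the angular momentum perpendicular to the z-axis:
$$\begin{aligned}
\mathbf L &= \sum_\nu m_\nu\, \mathbf r_\nu \times \mathbf v_\nu = \sum_\nu m_\nu\, \mathbf r_\nu \times (\boldsymbol\omega \times \mathbf r_\nu)\\
&= \sum_\nu m_\nu\, \omega\, (x_\nu, y_\nu, z_\nu) \times (-y_\nu, x_\nu, 0)\\
&= \omega \sum_\nu (-x_\nu z_\nu,\ -y_\nu z_\nu,\ x_\nu^2 + y_\nu^2)\, m_\nu.
\end{aligned}$$
Since the body is supported in such a way that the rotation axis is constantly fixed, torques (bearing moments) $\mathbf D = \dot{\mathbf L}$ appear in the bearings. They can be compensated by “balancing,” i.e., by attaching additional masses so that the deviation moments
$$-\sum_\nu x_\nu z_\nu m_\nu \quad\text{and}\quad -\sum_\nu y_\nu z_\nu m_\nu$$
vanish.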
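The effect of balancing can be illustrated with point masses. The sketch below evaluates $\mathbf L = \sum_\nu m_\nu \mathbf r_\nu \times (\boldsymbol\omega \times \mathbf r_\nu)$ for a rotation about the z-axis; the sample masses and positions are made-up illustration values, and the helper names are assumptions.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def angular_momentum(masses, positions, omega):
    """L = sum m r x (omega e_z x r) for a rotation about the z-axis."""
    total = [0.0, 0.0, 0.0]
    for m, r in zip(masses, positions):
        v = cross((0.0, 0.0, omega), r)   # velocity omega * (-y, x, 0)
        l = cross(r, v)
        for d in range(3):
            total[d] += m * l[d]
    return tuple(total)

if __name__ == "__main__":
    # One mass off the plane z = 0: L has a component perpendicular to the axis.
    print(angular_momentum([1.0], [(1.0, 0.0, 1.0)], 2.0))
    # A second, equal mass at (-1, 0, 1) cancels the deviation moment sum x z m.
    print(angular_momentum([1.0, 1.0], [(1.0, 0.0, 1.0), (-1.0, 0.0, 1.0)], 2.0))
```

In the second call only the z-component survives, so $\mathbf L$ is parallel to the axis and no bearing moments arise at constant ω.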
EXAMPLE
11.1 Moment of Inertia of a Homogeneous Circular Cylinder
We determine the moment of inertia of a homogeneous circular cylinder with density $\varrho$ about its symmetry axis. Adapted to the problem, we use cylindrical coordinates. The volume element then reads $dV = r\, dr\, d\varphi\, dz$, and $dm = \varrho\, dV$. The moment of inertia about the z-axis is then given by
$$\Theta = \int_{\text{cylinder}} r^2\, dm = \varrho \int_0^{2\pi} d\varphi \int_0^{h} dz \int_0^{R} r^3\, dr;$$
integration over the angle and the z-coordinate yields
$$\Theta = 2\pi\varrho h \int_0^{R} r^3\, dr.$$
Integration over the radius yields
$$\Theta = \frac{\pi}{2}\varrho h R^4 = \varrho\pi R^2 h\, \frac{R^2}{2} = \frac{1}{2} M R^2.$$
Fig. 11.4. A homogeneous cylinder rotates about its axis
Steiner’s Theorem1
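If the moment of inertia $\Theta_s$ with respect to an axis through the center of gravity $S$ of a rigid body is known, the moment of inertia $\Theta$ for an arbitrary parallel axis with the distance $b$ from the center of gravity is given by the relation
$$\Theta = \Theta_s + M b^2.$$
If $AB$ is the axis through the center of gravity and $A'B'$ the parallel one with the unit vector $\mathbf e$ along the axis, this can be shown as follows:
$$\Theta_{AB} = \sum_\nu m_\nu (\mathbf r_\nu \times \mathbf e)^2, \qquad \Theta_{A'B'} = \sum_\nu m_\nu (\mathbf r'_\nu \times \mathbf e)^2.$$
The relation between $\mathbf r_\nu$ and $\mathbf r'_\nu$ is given by Fig. 11.5. Obviously $\mathbf r'_\nu = -\mathbf b + \mathbf r_\nu$, and therefore,
$$\begin{aligned}
\Theta_{A'B'} &= \sum_\nu m_\nu \bigl((-\mathbf b + \mathbf r_\nu) \times \mathbf e\bigr)^2\\
&= \sum_\nu m_\nu \bigl[(-\mathbf b \times \mathbf e) + (\mathbf r_\nu \times \mathbf e)\bigr]^2\\
&= \sum_\nu m_\nu (-\mathbf b \times \mathbf e)^2 + 2\sum_\nu m_\nu (-\mathbf b \times \mathbf e) \cdot (\mathbf r_\nu \times \mathbf e) + \sum_\nu m_\nu (\mathbf r_\nu \times \mathbf e)^2\\
&= M b^2 + \Theta_{AB}.
\end{aligned}$$
Fig. 11.5. On Steiner’s theorem
The middle term vanishes because
$$2(-\mathbf b \times \mathbf e) \cdot \Bigl(\sum_\nu m_\nu \mathbf r_\nu\Bigr) \times \mathbf e = 0,$$
since $S$ is the center of gravity and hence $\sum_\nu m_\nu \mathbf r_\nu = 0$.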
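Steiner’s theorem can be checked for an arbitrary set of point masses by evaluating $\sum_\nu m_\nu (\mathbf r_\nu \times \mathbf e)^2$ once about the axis through the center of gravity and once about a parallel axis shifted by $\mathbf b \perp \mathbf e$. This is a minimal sketch; the masses, positions, and shift are made-up illustration values.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def centre_of_mass(masses, points):
    total = sum(masses)
    return tuple(sum(m * p[d] for m, p in zip(masses, points)) / total
                 for d in range(3))

def moment_of_inertia(masses, points, origin, e):
    """Theta = sum m |(r - origin) x e|^2 for an axis through origin along unit vector e."""
    theta = 0.0
    for m, p in zip(masses, points):
        r = tuple(pi - oi for pi, oi in zip(p, origin))
        c = cross(r, e)
        theta += m * sum(ci * ci for ci in c)
    return theta

if __name__ == "__main__":
    masses = [1.0, 2.0, 3.0]
    points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 1.0)]
    e = (0.0, 0.0, 1.0)
    b = (0.7, -1.2, 0.0)                      # shift perpendicular to e
    s = centre_of_mass(masses, points)
    shifted = tuple(si + bi for si, bi in zip(s, b))
    lhs = moment_of_inertia(masses, points, shifted, e)
    rhs = moment_of_inertia(masses, points, s, e) + sum(masses) * (0.7 ** 2 + 1.2 ** 2)
    print(lhs, rhs)
```

Both printed numbers agree up to rounding, independent of the direction of the shift in the plane perpendicular to the axis.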
If for a planar mass distribution the moments of inertia $\Theta_{xx}$, $\Theta_{yy}$ in the x,y-plane are known, for the moment of inertia $\Theta_{zz}$ with respect to the z-axis we have
$$\Theta_{zz} = \Theta_{xx} + \Theta_{yy}.$$
If $r_\nu = \sqrt{x_\nu^2 + y_\nu^2}$ is the distance of the mass element from the z-axis, we have
$$\Theta_{zz} = \sum_\nu m_\nu r_\nu^2 = \sum_\nu m_\nu x_\nu^2 + \sum_\nu m_\nu y_\nu^2,$$
i.e.,
$$\Theta_{zz} = \Theta_{yy} + \Theta_{xx}.$$
1 Jacob Steiner, b. March 18, 1796, Utzenstorf–d. April 1, 1863, Bern. Steiner was the son of a peasant and grew up without education. He received his first education from Pestalozzi in Yverdon. Subsequently Steiner studied in Heidelberg, and then he served as a teacher of mathematics in Berlin; in 1834, he became an associate professor at the university there. Steiner is considered the founder of synthetic geometry, which was systematically developed by him. He worked on geometric constructions and isoperimetric problems. A peculiar feature of his work is that he almost completely avoided analytic and algebraic methods in geometric investigations.
EXAMPLE
11.2 Moment of Inertia of a Thin Rectangular Disk
We consider the moment of inertia of a thin rectangular disk of density $\varrho$. For the calculation of the moment of inertia about the x-axis, we take as the mass element $dm = \varrho\, a\, dy$. We then obtain
$$\Theta_{xx} = \int_0^{b} \varrho\, y^2 a\, dy = \varrho\, a\, \frac{b^3}{3} = \frac{1}{3} M b^2.$$
The moment about the y-axis follows likewise:
$$\Theta_{yy} = \frac{1}{3} M a^2.$$
From $\Theta_{zz} = \Theta_{xx} + \Theta_{yy}$, we then get
$$\Theta_{zz} = \frac{1}{3} M (a^2 + b^2).$$
Fig. 11.6. A rectangular planar mass distribution
The moment of inertia about a perpendicular axis through the center of gravity is found, according to Steiner’s theorem, from the moment of inertia about the z-axis:
$$\Theta_{zz} = \Theta_s + M\left[\left(\frac{a}{2}\right)^2 + \left(\frac{b}{2}\right)^2\right] = \Theta_s + M\,\frac{a^2 + b^2}{4},$$
$$\Theta_s = \Theta_{zz} - \frac{M}{4}(a^2 + b^2) = M(a^2 + b^2)\left(\frac{1}{3} - \frac{1}{4}\right),$$
$$\Theta_s = \frac{M}{12}(a^2 + b^2).$$
11.2 The Physical Pendulum
An arbitrary rigid body with the center of gravity $S$ is suspended so that it can rotate about an axis through the point $P$. The distance vector $\overrightarrow{PS}$ is $\mathbf r$. Let $\Theta_0$ be the moment of inertia of the body about a horizontal axis through $P$, and let $M$ be the total mass. If the body in the gravitation field is now displaced from its rest position, it performs pendulum motions.
Fig. 11.7. A body of mass M is suspended at the point P
If the body is displaced, there is a torque
$$\mathbf D = \sum_\nu \mathbf r_\nu \times m_\nu \mathbf g = \Bigl(\sum_\nu m_\nu \mathbf r_\nu\Bigr) \times \mathbf g = M\mathbf r \times \mathbf g = -aMg \sin\varphi\, \mathbf k,$$
where $\mathbf k$ is a unit vector pointing out of the page in Fig. 11.7, and $|\mathbf r| = a$. The angular velocity is then
$$\boldsymbol\omega = +\mathbf k\, \frac{d\varphi}{dt}.$$
From the relation $\mathbf D = \dot{\mathbf L} = \Theta_0 \dot{\boldsymbol\omega}$, we then obtain
$$-aMg \sin\varphi = \Theta_0 \frac{d^2\varphi}{dt^2} \quad\text{or}\quad \frac{d^2\varphi}{dt^2} + \frac{aMg}{\Theta_0} \sin\varphi = 0.$$
For small amplitudes, we replace $\sin\varphi$ by $\varphi$. With the abbreviation $\Omega = \sqrt{aMg/\Theta_0}$, we obtain the differential equation
$$\frac{d^2\varphi}{dt^2} + \Omega^2 \varphi = 0,$$
with the solution
$$\varphi = A \sin(\Omega t + \delta).$$
So one also obtains the period of the physical pendulum:
$$T = \frac{2\pi}{\Omega} = 2\pi\sqrt{\frac{\Theta_0}{Mag}}.$$
Since for the thread pendulum (mathematical pendulum) we have $T = 2\pi\sqrt{l/g}$, it follows that both periods coincide if the thread pendulum has the length $l = \Theta_0/Ma$.
If we replace the moment of inertia $\Theta_0$ by the moment of inertia $\Theta_s$ about the center of gravity, then according to Steiner’s theorem we have
$$T = T(a) = 2\pi\sqrt{\frac{\Theta_s + Ma^2}{Mag}} = 2\pi\sqrt{\frac{\Theta_s}{Mag} + \frac{a}{g}}.$$
From this, it follows that the period becomes a minimum if the vibration axis is a distance $a = \sqrt{\Theta_s/M}$ from the center of gravity. From this relation one can experimentally determine the moment of inertia $\Theta_s$.
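The dependence of the period on the distance a between suspension axis and center of gravity is easy to tabulate. The sketch below implements $T(a) = 2\pi\sqrt{(\Theta_s + Ma^2)/(Mag)}$ and the reduced length $l = \Theta_0/(Ma)$; the value g = 9.81 m/s² and the sample body are assumptions chosen only for illustration.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)

def period(theta_s: float, M: float, a: float, g: float = G) -> float:
    """Small-amplitude period of a physical pendulum suspended at distance a from S."""
    return 2.0 * math.pi * math.sqrt((theta_s + M * a * a) / (M * a * g))

def reduced_length(theta_s: float, M: float, a: float) -> float:
    """Length of the thread pendulum with the same period, l = Theta_0 / (M a)."""
    return (theta_s + M * a * a) / (M * a)

def distance_of_minimum_period(theta_s: float, M: float) -> float:
    """Distance a = sqrt(Theta_s / M) at which T(a) is smallest."""
    return math.sqrt(theta_s / M)

if __name__ == "__main__":
    theta_s, M = 0.02, 1.5      # kg m^2 and kg, made-up sample body
    a0 = distance_of_minimum_period(theta_s, M)
    for a in (0.5 * a0, a0, 2.0 * a0):
        print(f"a={a:.4f} m  T={period(theta_s, M, a):.4f} s  "
              f"l={reduced_length(theta_s, M, a):.4f} m")
```

The period is the same for a = a0/2 and a = 2·a0 in this example and smaller in between, which reflects the minimum at $a = \sqrt{\Theta_s/M}$.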
EXERCISE
11.3 Moment of Inertia of a Sphere
Problem. Find the moment of inertia of a sphere about an axis through its center. The radius of the sphere is $a$, and the homogeneous density is $\varrho$.
Solution. We use cylindrical coordinates $(r, \varphi, z)$. The z-axis is the rotation axis. For the corresponding moment of inertia, we have
$$\Theta = \varrho \int_{\text{sphere}} r^2\, dV.$$
The center of the sphere is at z = 0. The equation for the spherical surface then reads
$$x^2 + y^2 + z^2 = a^2 \quad\text{or}\quad r^2 + z^2 = a^2.$$
We write out the integration limits:
$$\Theta = \varrho \int_0^{2\pi} d\varphi \int_{-a}^{a} dz \int_0^{\sqrt{a^2 - z^2}} r^3\, dr$$
or
$$\Theta = 2\pi\varrho \int_{-a}^{a} \frac{1}{4} r^4 \Big|_0^{\sqrt{a^2 - z^2}}\, dz = \frac{\pi\varrho}{2} \int_{-a}^{a} (a^2 - z^2)^2\, dz.$$
Integration over z yields
$$\Theta = \frac{8}{15}\pi\varrho\, a^5 = \varrho\, \frac{4}{3}\pi a^3 \cdot \frac{2}{5} a^2.$$
Since the total mass of the sphere is given by $M = (4/3)\pi\varrho\, a^3$, it follows that
$$\Theta = \frac{2}{5} M a^2.$$
EXERCISE
11.4 Moment of Inertia of a Cube
Problem. Calculate the moment of inertia of a homogeneous solid cube about one of its edges.
Solution. Let $\varrho$ be the density and $s$ the edge length of the cube. A mass element is then given by
$$dm = \varrho\, dV = \varrho\, dx\, dy\, dz.$$
The moment of inertia about $AB$ (see Fig. 11.8) is evaluated as
$$\Theta_{AB} = \varrho \int_0^{s}\!\!\int_0^{s}\!\!\int_0^{s} (x^2 + y^2)\, dx\, dy\, dz = \frac{2}{3}\varrho\, s^5 = \frac{2}{3} M s^2.$$
Fig. 11.8. Calculation of the moment of inertia of a cube
EXERCISE
11.5 Vibrations of a Suspended Cube
Problem. A cube of edge length $s$ and mass $M$ hangs vertically down from one of its edges. Find the period for small vibrations about the equilibrium position. How long is the equivalent thread pendulum?
Fig. 11.9. Rotation axes of the suspended cube
Solution. The moment of inertia of the cube about $AB$ is (see Exercise 11.4)
$$\Theta_{AB} = \frac{2}{3} M s^2.$$
The center of gravity is in the center of the cube, i.e., for the distance $a$ of the center of gravity $S$ from the axis $AB$ we have
$$a = \frac{1}{2} s\sqrt{2}.$$
The equation of motion of the physical pendulum for small angle amplitudes was
$$\ddot\varphi + \frac{Mga}{\Theta_{AB}}\,\varphi = 0$$
with the angular frequency
$$\omega = \sqrt{\frac{Mga}{\Theta_{AB}}}$$
and the period
$$T = \frac{2\pi}{\omega} = 2\pi\sqrt{\frac{\Theta_{AB}}{Mga}} = 2\pi\sqrt{\frac{2Ms^2 \cdot 2}{3Mgs\sqrt{2}}} = 2\pi\sqrt{\frac{2\sqrt{2}\, s}{3g}}.$$
The length of the equivalent thread pendulum is calculated as
$$T = T' = 2\pi\sqrt{\frac{l}{g}},$$
which just defines the equivalence of the pendulums. By insertion of $T$ one obtains
$$2\pi\sqrt{\frac{2\sqrt{2}\, s}{3g}} = 2\pi\sqrt{\frac{l}{g}},$$
or resolved,
$$l = \frac{2}{3}\sqrt{2}\, s.$$
This equivalent length of the thread pendulum is also called the reduced pendulum length.
Fig. 11.10. Physical pendulum and reduced pendulum length
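For a concrete cube the numbers follow directly from $\Theta_{AB} = \tfrac{2}{3}Ms^2$ and $a = s\sqrt{2}/2$. The sketch below is only a numerical illustration; the edge length and g = 9.81 m/s² are assumed sample values, and the function name is hypothetical.

```python
import math

def cube_edge_pendulum(s: float, g: float = 9.81):
    """Return (period, reduced_length) for a homogeneous cube of edge s hanging from an edge.

    The mass M cancels, so only Theta_AB / M = (2/3) s^2 is needed.
    """
    theta_over_mass = 2.0 / 3.0 * s * s      # Theta_AB / M from Exercise 11.4
    a = s * math.sqrt(2.0) / 2.0             # distance of S from the edge AB
    reduced_length = theta_over_mass / a     # l = Theta_AB / (M a)
    period = 2.0 * math.pi * math.sqrt(reduced_length / g)
    return period, reduced_length

if __name__ == "__main__":
    T, l = cube_edge_pendulum(0.3)
    print(f"T = {T:.4f} s, l = {l:.4f} m")
```

The reduced length comes out as $(2/3)\sqrt{2}\,s$, slightly shorter than the edge itself.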
EXAMPLE
11.6 Roll off of a Cylinder: Rolling Pendulum
We consider a cylinder with a horizontal axis that can roll down an inclined plane. The system has one degree of freedom; hence an energy consideration is sufficient. The velocity of each point of the cylinder may be thought of as being composed of the velocity $\mathbf v_1$ due to the translational motion and of the velocity $\mathbf v_{2\nu}$ due to the rotation. The energy of motion is then given by
$$\sum_\nu \frac{m_\nu}{2} v_\nu^2 = \frac{1}{2} v_1^2 \sum_\nu m_\nu + \sum_\nu \frac{m_\nu}{2} v_{2\nu}^2 + \mathbf v_1 \cdot \sum_\nu m_\nu \mathbf v_{2\nu}. \tag{11.1}$$
For a symmetric mass distribution, the last term drops out, and we have
$$T = \frac{M}{2} v_1^2 + \frac{\Theta}{2} \dot\varphi^2, \tag{11.2}$$
i.e., the energy of motion is additive in translational and rotation energy. For the cylinder (with symmetric mass distribution) on the inclined plane we have
$$\frac{M}{2} \dot s^2 + \frac{\Theta}{2} \dot\varphi^2 - Mgs \sin\alpha = E \tag{11.3}$$
(s measures the distance along the inclined plane). “Rolling off” without gliding means that the axis always moves just as much as corresponds to the rotation of the cylinder surface:
$$\dot s = R\dot\varphi,$$
where $R$ is the cylinder radius. We thus obtain the equation
$$\frac{1}{2}\left(M + \frac{\Theta}{R^2}\right) \dot s^2 - Mgs \sin\alpha = E, \qquad
\ddot s = \frac{1}{1 + \Theta/MR^2}\, g \sin\alpha. \tag{11.4}$$
The acceleration of the cylinder rolling off is smaller than that of a gliding mass point. If the total mass of the cylinder is (approximately) concentrated on the axis, then
$$\frac{\Theta}{MR^2} = 0, \qquad \ddot s = g \sin\alpha,$$
and the acceleration is the same as for a gliding mass point. For a homogeneous cylinder, we have
$$\frac{\Theta}{MR^2} = \frac{1}{2}, \qquad \ddot s = \frac{2}{3}\, g \sin\alpha.$$
For a hollow cylinder with all mass on the surface, we have
$$\frac{\Theta}{MR^2} = 1, \qquad \ddot s = \frac{1}{2}\, g \sin\alpha;$$
the acceleration is only half of that for a gliding mass point. If we fix a circular disk concentric onto the cylinder, which extends beyond the base (like a wheel rim over the rail), then $\Theta/MR^2 > 1$, i.e., the acceleration can be even lower.
An investigation of the force balance lets us elucidate this problem once again from another point of view. At the point $S$, gravity acts and exerts a torque with respect to the point $A$ (see Fig. 11.11)
$$D_A = |\mathbf D_A| = R \cdot Mg \sin\alpha, \tag{11.5}$$
while the constraints do not create a torque. The angular acceleration at the point $A$ is therefore
$$\ddot\varphi = \dot\omega = \frac{D_A}{\Theta_A} = \frac{RMg \sin\alpha}{(3/2)MR^2} = \frac{2}{3}\frac{g}{R} \sin\alpha. \tag{11.6}$$
Fig. 11.11. Rolling cylinder on an inclined plane
The moment of inertia $\Theta_A$ of a homogeneous cylinder is easily found by means of Steiner’s theorem. Since the moment of inertia with respect to the center of gravity is $\Theta_s = MR^2/2$, it follows immediately that
$$\Theta_A = \Theta_s + MR^2 = \frac{3}{2} MR^2.$$
If the cylinder rolls without gliding, for the linear acceleration of the center of gravity, we find
$$|\mathbf a_s| = |\dot{\boldsymbol\omega} \times \mathbf r_A| = \dot\omega R = \frac{2}{3}\, g \sin\alpha. \tag{11.7}$$
The cylinder gets only 2/3 of the acceleration which it would get when gliding. Equation (11.7) is found from simple considerations: Since the instantaneous velocity of the contact point $A$ equals zero, one can consider $A$ as instantaneously at rest. But this means that the rigid body instantaneously performs a rotation about the contact point $A$, with an angular velocity $\omega$. The velocity of an arbitrary point of the body is then given by (see Fig. 11.11)
$$\mathbf v = \boldsymbol\omega \times \mathbf r_A.$$
Besides the gravitation force there acts the reaction force $\mathbf N$ (to balance the normal component of $M\mathbf g$)
$$|\mathbf N| = |Mg \cos\alpha|, \tag{11.8}$$
and the friction force $F_f$. The latter one is calculated from the balance
$$Mg \sin\alpha + F_f = M a_s \tag{11.9}$$
and with (11.7),
$$-F_f = Mg \sin\alpha - \frac{2}{3} Mg \sin\alpha = \frac{1}{3} Mg \sin\alpha. \tag{11.10}$$
Thus, the friction force acts opposite to the direction of motion. The condition for a rolling motion of the cylinder is
$$|F_f| \le \mu N, \tag{11.11}$$
where $\mu$ is the friction coefficient.
Since
$$N = Mg \cos\alpha \quad\text{and}\quad |F_f| = \frac{1}{3} Mg \sin\alpha,$$
we have
$$|F_f| \le \mu Mg \cos\alpha \quad\text{or}\quad \frac{1}{3} Mg \sin\alpha \le \mu Mg \cos\alpha. \tag{11.12}$$
That means that a rolling motion exists only for $\tan\alpha \le 3\mu$.
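The results for the rolling cylinder can be collected in a few helper functions. The acceleration follows (11.4); the rolling condition is written for a general ratio $c = \Theta/MR^2$, which is an extension of the text (repeating the steps (11.9) to (11.12) gives $|F_f| = Mg\sin\alpha \cdot c/(1+c)$) and reduces to $\tan\alpha \le 3\mu$ for the homogeneous cylinder with c = 1/2. The function names and g = 9.81 m/s² are assumptions.

```python
import math

def rolling_acceleration(alpha: float, c: float, g: float = 9.81) -> float:
    """Acceleration along the incline, eq. (11.4), with c = Theta / (M R^2)."""
    return g * math.sin(alpha) / (1.0 + c)

def friction_force(alpha: float, c: float, M: float, g: float = 9.81) -> float:
    """Magnitude of the friction force needed for rolling: M g sin(alpha) - M s''."""
    return M * g * math.sin(alpha) - M * rolling_acceleration(alpha, c, g)

def rolls_without_gliding(alpha: float, c: float, mu: float) -> bool:
    """|F_f| <= mu N with N = M g cos(alpha); M and g cancel."""
    return math.tan(alpha) * c / (1.0 + c) <= mu

if __name__ == "__main__":
    alpha = math.radians(20.0)
    for name, c in (("mass on axis", 0.0), ("homogeneous", 0.5), ("hollow", 1.0)):
        print(f"{name:13s} s'' = {rolling_acceleration(alpha, c):.3f} m/s^2")
    print(rolls_without_gliding(alpha, 0.5, mu=0.2))
```

For the homogeneous cylinder and μ = 0.2 the limiting angle is arctan(0.6), about 31 degrees, so the 20 degree incline in the example still allows pure rolling.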
A cylinder with asymmetrical mass distribution, which under the influence of grav-
itation can vibrate by rolling on a horizontal base, is called a rolling pendulum. It
represents a system with one degree of freedom; the position of the rolling pendu-
lum can be specified by the rotation angle ϕ or by the coordinate x of the cylin-
der axis (measured perpendicularly to the axis, see Fig. 11.12). “Rolling off” means
Fig. 11.12. Rolling pendulum that
ẋ = R ϕ̇. (11.13)
Since there is only one degree of freedom, the energy law is sufficient for the de-
scription. The motion is composed of a translational and a rotational motion. When
applying (11.1), we have to account for the asymmetrical mass distribution. The ex-

pression mν v2ν , as a momentum due to the rotational motion, can be calculated
by assuming that the total mass M is concentrated at the center of gravity, which is
located off the axis by the distance s: |s ϕ̇| is then the velocity |v2ν | of this mass on
rotation, and π − ϕ is the angle between v1 and v2ν . According to (11.1), we then
have
M 2 2
T= ẋ + ϕ̇ − ẋ · Ms ϕ̇ cos ϕ,
2 2
where is the moment of inertia about the cylinder axis. With the condition (11.13)
for rolling follows
1
T = (MR 2 + − 2MRs cos ϕ)ϕ̇ 2 . (11.14)
2
This expression can be interpreted also in a different way: If $\Theta_s$ is the moment of inertia about an axis through the center of gravity which is parallel to the cylinder axis, then according to Steiner’s theorem we have
\[ \Theta = \Theta_s + Ms^2. \]
Equation (11.14) thus turns into
\[ T = \frac{1}{2}\left[M(R^2 + s^2 - 2Rs\cos\varphi) + \Theta_s\right]\dot{\varphi}^2 \]
or
\[ T = \frac{1}{2}\left(Mr^2 + \Theta_s\right)\dot{\varphi}^2, \]
where r is the distance of the center of gravity from the contact line of the cylinder with the base. According to Steiner’s theorem,
\[ \Theta_u = Mr^2 + \Theta_s \]
is the moment of inertia about the contact line, which changes with time, and (11.14) takes the form
\[ T = \frac{\Theta_u}{2}\dot{\varphi}^2. \]
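If we now abbreviate (11.14) by
\[ T = \frac{1}{2}\left(B - 2MRs\cos\varphi\right)\dot{\varphi}^2, \qquad B = MR^2 + \Theta, \]
and add the potential energy
\[ U = Mgs(1 - \cos\varphi), \]
we then get the energy law
\[ \frac{1}{2}\left(B - 2MRs\cos\varphi\right)\dot{\varphi}^2 + Mgs(1 - \cos\varphi) = E. \tag{11.15} \]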
The equation differs from that for the physical pendulum (compare the section on the physical pendulum). For small angles ϕ, we obtain
\[ \Theta_u\dot{\varphi}^2 + Mgs\varphi^2 = Mgs\alpha^2, \tag{11.16} \]
where we replaced the arbitrary constant E by the arbitrary constant α. Equation (11.16) is solved by
\[ \varphi = \alpha\cos(\omega t + \delta), \tag{11.17} \]
with
\[ \omega^2 = \frac{Mgs}{\Theta_u} = \frac{Mgs}{MR^2 + \Theta - 2MRs}. \tag{11.18} \]
In the limit of a symmetrical mass distribution (s = 0), one has ω = 0. If the center of gravity moves to the cylinder surface (s → R), then
\[ \omega^2 = \frac{MgR}{\Theta_s}. \]
If the mass is limited to a more restricted region, ω becomes very large. If we imagine that a part of the mass is shifted by an appropriate device to the outside of the rolling cylinder (see Fig. 11.13) and that s is large compared to R, then the vibration turns into the vibration of a physical pendulum.
Fig. 11.13. Transition from the rolling pendulum to the physical pendulum
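The small-angle frequency (11.18) and its two limits can be evaluated with a few lines of code. This is only a sketch: the function name and the default value of g are assumptions, and the moment of inertia about the cylinder axis is built from the center-of-gravity value via Steiner's theorem, as in the text.

```python
import math

def rolling_pendulum_omega(M, R, s, theta_s, g=9.81):
    """Small-oscillation angular frequency of the rolling pendulum, eq. (11.18).

    M       total mass
    R       cylinder radius
    s       distance of the center of gravity from the cylinder axis
    theta_s moment of inertia about the axis through the center of gravity
    """
    theta_axis = theta_s + M * s**2                    # Steiner: about the cylinder axis
    theta_u = M * R**2 + theta_axis - 2.0 * M * R * s  # about the contact line at phi = 0
    return math.sqrt(M * g * s / theta_u)
```

With s = 0 the function returns zero, and with s = R the denominator reduces to the moment of inertia about the center of gravity, reproducing the two limits quoted above.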
EXAMPLE
11.7 Moments of Inertia of Several Rigid Bodies About Selected Axes
Figure 11.14 shows the moments of inertia of (a) a disk, (b) a cylinder, (c) a rec-
tangular plate, (d) a spherical shell, (e) a solid sphere, and (f) a cube about different
selected axes.
Fig. 11.14.
EXERCISE
11.8 Cube Tilts over the Edge of a Table
Problem. A cube with the edge length 2a and mass M glides with constant velocity
v0 on a frictionless plate. At the end of the plate, it bumps against an obstacle and tilts
over the edge (see Fig. 11.15). Find the minimum velocity v0 for which the cube still
falls from the plate!
Fig. 11.15. Cube tilting over
an edge
Solution. We look for the velocity v0 for which the cube can tilt over its edge, as is
represented in (c). If it bumps into the obstacle at the edge of the plate, it is set into
rotation about the axis A. At the time of collision all external forces act along this
axis, and the angular momentum of the cube is conserved. Before hitting the obstacle,
the cube has—due to the translational motion—the angular momentum
L = |r × p| = p · a = Mv0 a. (11.19)
Immediately after the collision, the angular momentum appears as rotational motion of the cube
\[ L = \Theta_A\omega_0 = Mv_0a, \]
or
\[ \omega_0 = \frac{Mv_0a}{\Theta_A}. \tag{11.20} \]
If the cube begins to lift off, the gravitational force causes a torque about the axis A
that counteracts the lifting process.
For the kinetic energy of the cube immediately after the collision, one has, for given $\omega_0$,
\[ T_0 = \frac{1}{2}\Theta_A\omega_0^2 = \frac{1}{2}\frac{M^2v_0^2a^2}{\Theta_A}. \tag{11.21} \]
The potential energy difference between position a and position c is
\[ \Delta V = M(h_2 - h_1)g = M(\sqrt{2}a - a)g = Mag(\sqrt{2} - 1), \tag{11.22} \]
and from the energy conservation law, immediately it follows that
\[ Mag(\sqrt{2} - 1) = \frac{1}{2}\frac{M^2v_0^2a^2}{\Theta_A}. \tag{11.23} \]
The moment of inertia $\Theta_A$ of the cube is easy to calculate:
\[ \Theta_A = \varrho\int_V r^2\,dV = \varrho\int_0^{2a}\!\int_0^{2a}\!\int_0^{2a}(x^2 + y^2)\,dx\,dy\,dz = \frac{8}{3}Ma^2. \tag{11.24} \]
From (11.23), it follows that
\[ ag(\sqrt{2} - 1) = \frac{1}{2}\frac{Ma^2}{(8/3)Ma^2}v_0^2, \]
and from this, we find
\[ v_0 = \sqrt{\frac{16}{3}ag(\sqrt{2} - 1)}. \tag{11.25} \]
This is the correct result. We emphasize this because one could easily come to another
result by a false consideration: The kinetic energy is (1/2)Mv02 , and from the energy
conservation combined with (11.22) it follows that
\[ \frac{1}{2}Mv_0^2 = Mag(\sqrt{2} - 1). \]
This leads to
\[ v_0 = \sqrt{2ag(\sqrt{2} - 1)}, \tag{11.26} \]
i.e., a value that is smaller than the correct result (11.25) by the factor $\sqrt{3/8}$. The
result (11.26) is wrong since the cube loses part of its kinetic energy in the collision
because of its inelasticity. The correct result (11.25) is based upon the conservation of
angular momentum, which acts “more strongly” than the conservation of energy.
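The difference between the two results can be made concrete with a short numerical sketch. The function names and the default g are assumptions; the only input taken from the text is the moment of inertia (11.24) about the edge A.

```python
import math

THETA_A_OVER_MA2 = 8.0 / 3.0  # Theta_A in units of M a^2, eq. (11.24)

def min_speed(a, g=9.81):
    """Minimum v0 from angular momentum and energy conservation, eq. (11.25)."""
    return math.sqrt(2.0 * THETA_A_OVER_MA2 * a * g * (math.sqrt(2.0) - 1.0))

def naive_speed(a, g=9.81):
    """Value (11.26), obtained if the energy loss in the collision is ignored."""
    return math.sqrt(2.0 * a * g * (math.sqrt(2.0) - 1.0))

def retained_energy_fraction():
    """T0 / (M v0^2 / 2) = M a^2 / Theta_A, from (11.21)."""
    return 1.0 / THETA_A_OVER_MA2
```

Dividing (11.21) by the initial kinetic energy shows that only 3/8 of it survives the inelastic collision, which is the same factor that appears (under the square root) between (11.26) and (11.25).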
EXERCISE
11.9 Hockey Puck Hits a Bar
Problem. A thin bar of length l and mass M lies on a frictionless plate (the x, y-plane
in Fig. 11.16). A hockey puck of mass m and velocity v strikes the bar elastically
at a right angle (90°) at the distance d from the center of gravity. After the collision the puck is
at rest.
(a) Determine the motion of the bar.
(b) Calculate the ratio m/M, accounting for the fact that the puck is at rest.
Fig. 11.16. A hockey puck hits a bar on a frictionless plate
Solution. (a) Since the collision is elastic, momentum and energy conservation hold, where momentum conservation refers both to linear and angular momentum. The bar acquires both a translational and a rotational motion from the collision with the puck. Conservation of the linear momentum immediately leads to
\[ P_s = Mv_s = mv, \tag{11.27} \]
and the velocity of the center of gravity is
\[ v_s = \frac{mv}{M}. \tag{11.28} \]
Likewise, from the conservation of angular momentum it follows that
\[ L_s = \Theta_s\omega = mvd = D, \tag{11.29} \]
and for the angular velocity of the bar relative to the center of gravity,
\[ \omega = \frac{mvd}{Ml^2/12}, \tag{11.30} \]
where
\[ \Theta_s = \varrho\int_{-l/2}^{l/2} r^2\,dV = \frac{1}{12}Ml^2. \]
Thus, the center of gravity of the bar moves uniformly with $v_s$ along the y-axis, while the bar rotates with the angular velocity ω about the center of gravity. Figure 11.17 illustrates several stages of the motion.
Fig. 11.17. The motion of the bar
(b) The kinetic energy of the bar can be determined by means of the energy conservation law. Before the collision, the kinetic energy of the puck is
\[ T = \frac{1}{2}mv^2, \tag{11.31} \]
while the kinetic energy after the collision consists of two components:
\[ T_t = \frac{1}{2}Mv_s^2 \quad \text{(“translation energy of the center of gravity”)} \]
and
\[ T_r = \frac{1}{2}\Theta_s\omega^2 \quad \text{(“rotation energy about the center of gravity”).} \]
Since the potential energy remains unchanged, it immediately follows that
\[ T = \frac{1}{2}mv^2 = T_t + T_r = \frac{1}{2}\left(Mv_s^2 + \Theta_s\omega^2\right) \]
or
\[ mv^2 = Mv_s^2 + \frac{Ml^2}{12}\omega^2. \tag{11.32} \]
Insertion of (11.28) and (11.30) into (11.32) finally yields
\[ mv^2 = \frac{m^2v^2}{M} + \frac{m^2v^2d^2(12)^2}{M^2l^4}\,\frac{Ml^2}{12}, \]
\[ 1 = \frac{m}{M} + 12\,\frac{m}{M}\,\frac{d^2}{l^2}, \]
or
\[ \frac{m}{M} = \frac{1}{1 + 12(d/l)^2} \]
for the mass ratio. If the puck hits the bar at the center of gravity, d = 0, no rotation appears. In order to make the collision elastic, m = M must be satisfied.
If the puck hits the bar at the point d = l/2, the collision is elastic only if M = 4m. In this case the rotation velocity is $\omega = 6mv/(Ml) = 6v_s/l$.
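The mass ratio and the resulting motion of the bar can be cross-checked against the energy balance (11.32). The sketch below uses hypothetical function names and simply evaluates (11.28), (11.30) and the final formula for m/M.

```python
def mass_ratio(d, l):
    """m/M for which the puck stops after an elastic hit at distance d from the center."""
    return 1.0 / (1.0 + 12.0 * (d / l) ** 2)

def bar_motion(m, M, v, d, l):
    """Return (v_s, omega): center-of-gravity speed (11.28) and angular velocity (11.30)."""
    v_s = m * v / M
    omega = m * v * d / (M * l**2 / 12.0)
    return v_s, omega
```

For a hit at the end of the bar (d = l/2) the ratio evaluates to 1/4, and the translational plus rotational energy of the bar then adds up to the initial kinetic energy of the puck.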
EXERCISE
11.10 Cue Pushes a Billiard Ball
Problem. A billiard ball of mass M and radius R is pushed by a cue so that the center
of gravity of the ball gets the velocity v0 . The momentum direction passes through the
center of gravity. The friction coefficient between table and ball is μ. How far does
the ball move before the initial gliding motion changes to a pure rolling motion?
Fig. 11.18. A cue pushes a billiard ball
Solution. Since the momentum direction passes through the center of gravity, the
angular momentum with respect to the center of gravity at the time t = 0 equals zero.
The friction force f points opposite to the direction of motion (see Fig. 11.18) and
causes a torque about the center of gravity
Ds = f · R = μMgR. (11.33)
The result is an angular acceleration of the ball, so that
\[ \dot{\omega} = \frac{\mu MgR}{\Theta_s} = \frac{\mu MgR}{(2/5)MR^2} = \frac{5}{2}\frac{\mu g}{R}. \tag{11.34} \]
Moreover, the friction force causes a deceleration of the center of gravity, i.e.,
\[ Ma_s = -f \quad\text{or}\quad a_s = -\frac{f}{M} = -\frac{\mu gM}{M}. \tag{11.35} \]
$a_s$ is the acceleration of the center of gravity.
For the rotation velocity of the ball, one gets from (11.34), after performing the integration,
\[ \omega = \int_0^t \dot{\omega}\,dt = \frac{5}{2}\frac{\mu g}{R}\,t. \tag{11.36} \]
The linear velocity of the center of gravity follows from (11.35), again after integration, as
\[ v_s = \int a_s\,dt = v_0 - \mu gt. \tag{11.37} \]
The billiard ball begins rolling when $v_s = \omega R$, or
\[ \frac{5}{2}\mu\frac{g}{R}\,tR = v_0 - g\mu t \tag{11.38} \]
or when
\[ v_0 = \frac{7}{2}\mu gt \quad\text{and}\quad t = \frac{2}{7}\frac{v_0}{\mu g}. \tag{11.39} \]
The distance passed before rolling starts is obtained by integrating (11.37) as
\[ s = \int_0^t v_s\,dt = v_0t - \frac{\mu gt^2}{2} \tag{11.40} \]
and with t from (11.39) finally as
\[ s = \frac{2}{7}\frac{v_0^2}{\mu g} - \frac{v_0^2}{2\mu g}\left(\frac{2}{7}\right)^2 = \frac{12}{49}\frac{v_0^2}{\mu g}. \tag{11.41} \]
If the ball is struck at a distance h above the center of gravity, besides the linear
motion there appears a rotational motion with the angular velocity
\[ \omega = \frac{Mv_0h}{\Theta_s} = \frac{5}{2}\frac{v_0h}{R^2}. \tag{11.42} \]
If h = (2/5)R, the rolling motion of the ball starts immediately. For h < (2/5)R, one
has ω < v0 /R, and for h > (2/5)R correspondingly ω > v0 /R; in the second case the
friction force points forward.
Figure 11.19 shows the change of vs and ωR as a function of time for h = 0. If
vs = ωR, the rolling motion begins, the friction vanishes, and then vs and ω remain
constant.
Fig. 11.19.
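The chain (11.37) to (11.41) for a central hit (h = 0) can be evaluated directly. The following sketch uses a hypothetical function name and an assumed default for g; it returns the time, the distance and the speed at the moment pure rolling sets in.

```python
def rolling_onset(v0, mu, g=9.81):
    """Return (t, s, v) at the onset of pure rolling for a ball struck centrally."""
    t = 2.0 * v0 / (7.0 * mu * g)      # (11.39)
    s = v0 * t - 0.5 * mu * g * t**2   # (11.40)
    v = v0 - mu * g * t                # (11.37)
    return t, s, v
```

Inserting t from (11.39) into (11.37) shows that the ball rolls on with 5/7 of its initial speed, independently of μ; the friction coefficient only sets how long and how far the sliding phase lasts.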
EXERCISE
11.11 Motion with Constraints
Problem. A bar of length 2l and mass M is fixed at point A, so that it can rotate only
in the vertical plane (see Fig. 11.20). The external force F acts on the center of gravity.
Calculate the reaction force Fr at the point A!
Fig. 11.20. The bar rotates about the point A
Solution. In order to determine Fr , one calculates the torque Ds with respect to the
center of gravity of the bar, caused by Fr .
The torque with respect to the fixed point A is
\[ D_A = -Fl = \Theta_A\dot{\omega}, \tag{11.43} \]
since the constraints do not contribute to $D_A$. The angular acceleration of the bar $\dot{\omega}$ then follows from (11.43):
\[ \dot{\omega} = \frac{D_A}{\Theta_A} = -\frac{Fl}{\Theta_A}, \tag{11.44} \]
where $\Theta_A$ is the moment of inertia of the bar with respect to A. Since the moment of inertia $\Theta_s$ with respect to the center of gravity S is easily calculated as
\[ \Theta_s = \varrho\int_{-l}^{l} r^2\,dV = \frac{1}{3}Ml^2, \tag{11.45} \]
one immediately gets for $\Theta_A$ by means of Steiner’s theorem
\[ \Theta_A = \Theta_s + Ml^2 = \frac{1}{3}Ml^2 + Ml^2 = \frac{4}{3}Ml^2. \tag{11.46} \]
Equation (11.46) inserted into (11.44) leads to
\[ \dot{\omega} = -\frac{Fl}{\Theta_A} = -\frac{3}{4}\frac{F}{Ml}. \tag{11.47} \]
Since (11.47) must be correct, independent of the point from which the torque is being calculated, from the knowledge of the torque with respect to the center of gravity S,
\[ D_s = -F_rl, \tag{11.48} \]
and hence of the angular acceleration
\[ \dot{\omega} = \frac{D_s}{\Theta_s} = -\frac{3F_rl}{Ml^2} = -\frac{3F_r}{Ml}, \tag{11.49} \]
one can calculate the reaction force $F_r$, by comparing (11.47) and (11.49):
\[ -\frac{3}{4}\frac{F}{Ml} = -\frac{3F_r}{Ml} \;\Rightarrow\; F_r = \frac{1}{4}F. \]
EXERCISE
11.12 Bar Vibrates on Springs
Problem.
(a) Find the moment of inertia of a thin homogeneous bar of length L with respect to
an axis perpendicular to the bar.
(b) A homogeneous bar of length L and mass m is supported at the ends by identical
springs (spring constant k). The bar is moved at one end by a small displacement
a and then released.
Solve the equation of motion and determine the normal frequencies and normal
vibrations. Sketch the normal vibrations.
Fig. 11.21. A bar is supported
by two identical springs
Solution. (a) If the bar is divided into small segments of length dx with the cross section f, we have elementary volumes dV = f dx. Let $\varrho$ be the constant density of the bar; then we have
\[ \Theta_A = \int_0^L \varrho x^2\,(f\,dx) = \varrho f\int_0^L x^2\,dx = \frac{1}{3}\varrho fL^3. \]
Fig. 11.22.
Since $m = \varrho fL$ is the total mass of the bar, it follows that
\[ \Theta_A = \frac{1}{3}mL^2. \]
According to Steiner’s theorem, the moment of inertia about an axis through the center of gravity is
\[ \Theta_A = \Theta_s + m\left(\frac{L}{2}\right)^2 \;\Rightarrow\; \Theta_s = \frac{1}{12}mL^2. \]
(b) Let b be the length of the spring before the motion (b is not the natural length
of the spring, because of the existence of the gravitation field), and x1 , x2 , x be the
lengths of the first and second spring, and the height of the center of gravity of the bar
at the time t . Since the bar is rigid, we have x1 + x2 = 2x. Newton’s second law leads
to
mẍ = −k(x1 − b) − k(x2 − b)
or
mẍ = −k(x1 + x2 ) + 2kb.
The constraint condition leads to

\[ m\ddot{x} = -2kx + 2kb \;\Rightarrow\; \ddot{x} = -\frac{2k}{m}(x - b). \tag{11.50} \]
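We assume that there are only small displacements, so that sin ϑ ≈ ϑ. Then
\[ x_2 = x + \frac{L}{2}\vartheta, \qquad x_1 = x - \frac{L}{2}\vartheta. \]
For the torque, we get
\[ \Theta\ddot{\vartheta} = -\frac{k}{2}L(x_2 - x_1) = -\frac{1}{2}kL^2\vartheta, \quad\text{since } x_2 - x_1 = L\vartheta. \]
Fig. 11.23.
From (a), $\Theta = (1/12)mL^2$, we conclude
\[ \ddot{\vartheta} = -\frac{6k}{m}\vartheta. \tag{11.51} \]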
The solutions of (11.50) and (11.51) are
x = A cos(ω1 t + B) + b
and
ϑ = C cos(ω2 t + D)
with
\[ \omega_1 = \sqrt{\frac{2k}{m}} \quad\text{and}\quad \omega_2 = \sqrt{\frac{6k}{m}}. \]
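The initial conditions at the time t = 0 are
\[ x = b - \frac{a}{2}, \quad \vartheta = \frac{a}{L}, \quad \dot{x} = 0, \quad \dot{\vartheta} = 0. \]
Thus follows
\[ \left.\begin{aligned} b - \frac{a}{2} &= A\cos(B) + b,\\ 0 &= -A\omega_1\sin(B),\\ \frac{a}{L} &= C\cos(D),\\ 0 &= -C\omega_2\sin(D) \end{aligned}\right\} \;\Rightarrow\; B = D = 0,\quad A = -\frac{a}{2},\quad C = \frac{a}{L}, \]
and, hence,
\[ x = b - \frac{a}{2}\cos\left(\sqrt{\frac{2k}{m}}\,t\right), \qquad \vartheta = \frac{a}{L}\cos\left(\sqrt{\frac{6k}{m}}\,t\right). \]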
The normal modes are
\[ X_1 = x_1 + x_2 = 2b - a\cos\left(\sqrt{\frac{2k}{m}}\,t\right), \]
\[ X_2 = x_1 - x_2 = -a\cos\left(\sqrt{\frac{6k}{m}}\,t\right). \]
Fig. 11.24. The normal vibrations
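The solution can be checked by evaluating the spring lengths as functions of time and forming the two normal coordinates. This is a minimal sketch with a hypothetical function name; it evaluates the closed-form solution of (11.50) and (11.51) rather than integrating the equations of motion.

```python
import math

def bar_on_springs(t, a, b, L, k, m):
    """Spring lengths (x1, x2) at time t for the bar released with one end displaced by a."""
    w1 = math.sqrt(2.0 * k / m)              # translation of the center of gravity
    w2 = math.sqrt(6.0 * k / m)              # rotation about the center of gravity
    x = b - 0.5 * a * math.cos(w1 * t)       # height of the center of gravity
    theta = (a / L) * math.cos(w2 * t)       # small tilt angle
    return x - 0.5 * L * theta, x + 0.5 * L * theta
```

At t = 0 this gives one spring at length b − a and the other at b, matching the initial conditions; the sum and difference of the two lengths each oscillate with a single frequency.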
12 Rotation About a Point
The general motion of a rigid body can be described as a translation and a rotation
about a point of the body. This is just the content of Chasles’ theorem, discussed at the
beginning of Chap. 4. If the origin of the body-fixed coordinate system is set at the center
of gravity of the body, one can separate the center-of-mass motion and the rotation
in all practical cases (compare Chap. 6, equations (6.4)–(6.8)). For this reason, the
rotation of a rigid body about a fixed point is of particular significance.
12.1 Tensor of Inertia
We first consider the angular momentum of a rigid body that rotates with angular
velocity ω about the fixed point 0 (see Fig. 12.1):

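\[ \mathbf{L} = \sum_\nu m_\nu(\mathbf{r}_\nu \times \mathbf{v}_\nu) = \sum_\nu m_\nu\,\mathbf{r}_\nu \times (\boldsymbol{\omega} \times \mathbf{r}_\nu) = \sum_\nu m_\nu\left[\boldsymbol{\omega}r_\nu^2 - \mathbf{r}_\nu(\mathbf{r}_\nu\cdot\boldsymbol{\omega})\right]; \]
the latter relation holds according to the expansion rule. We decompose $\mathbf{r}_\nu$ and $\boldsymbol{\omega}$ into components and insert
\[ \mathbf{L} = \sum_\nu m_\nu\left[(x_\nu^2 + y_\nu^2 + z_\nu^2)(\omega_x, \omega_y, \omega_z) - (x_\nu\omega_x + y_\nu\omega_y + z_\nu\omega_z)(x_\nu, y_\nu, z_\nu)\right]. \]
Ordering by components leads to
\[ \begin{aligned} \mathbf{L} = \sum_\nu m_\nu\Big\{ &\left[(x_\nu^2 + y_\nu^2 + z_\nu^2)\omega_x - x_\nu^2\omega_x - x_\nu y_\nu\omega_y - x_\nu z_\nu\omega_z\right]\mathbf{e}_x \\ + &\left[(x_\nu^2 + y_\nu^2 + z_\nu^2)\omega_y - y_\nu^2\omega_y - x_\nu y_\nu\omega_x - z_\nu y_\nu\omega_z\right]\mathbf{e}_y \\ + &\left[(x_\nu^2 + y_\nu^2 + z_\nu^2)\omega_z - z_\nu^2\omega_z - x_\nu z_\nu\omega_x - y_\nu z_\nu\omega_y\right]\mathbf{e}_z \Big\}. \end{aligned} \]
Fig. 12.1. A rigid body rotates with ω about the fixed point 0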

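For the components of the angular momentum one thus obtains
\[ \begin{aligned} L_x &= \Big[\sum_\nu m_\nu(y_\nu^2 + z_\nu^2)\Big]\omega_x + \Big[-\sum_\nu m_\nu x_\nu y_\nu\Big]\omega_y + \Big[-\sum_\nu m_\nu x_\nu z_\nu\Big]\omega_z, \\ L_y &= \Big[-\sum_\nu m_\nu x_\nu y_\nu\Big]\omega_x + \Big[\sum_\nu m_\nu(x_\nu^2 + z_\nu^2)\Big]\omega_y + \Big[-\sum_\nu m_\nu y_\nu z_\nu\Big]\omega_z, \\ L_z &= \Big[-\sum_\nu m_\nu x_\nu z_\nu\Big]\omega_x + \Big[-\sum_\nu m_\nu y_\nu z_\nu\Big]\omega_y + \Big[\sum_\nu m_\nu(x_\nu^2 + y_\nu^2)\Big]\omega_z. \end{aligned} \]
Fig. 12.2. The angular momentum is in general not parallel to the angular velocity ω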
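The individual sums are abbreviated by
\[ \begin{aligned} L_x &= \Theta_{xx}\omega_x + \Theta_{xy}\omega_y + \Theta_{xz}\omega_z, \\ L_y &= \Theta_{yx}\omega_x + \Theta_{yy}\omega_y + \Theta_{yz}\omega_z, \\ L_z &= \Theta_{zx}\omega_x + \Theta_{zy}\omega_y + \Theta_{zz}\omega_z, \end{aligned} \]
or
\[ L_\mu = \sum_\nu \Theta_{\mu\nu}\omega_\nu, \]
or, written in terms of vectors (matrix notation),
\[ \mathbf{L} = \hat{\Theta}\cdot\boldsymbol{\omega}. \]
The quantities $\Theta_{\mu\nu}$ are the elements of the tensor of inertia $\hat{\Theta}$ that can be written as a 3 × 3 matrix:
\[ \hat{\Theta} = \begin{pmatrix} \Theta_{xx} & \Theta_{xy} & \Theta_{xz} \\ \Theta_{yx} & \Theta_{yy} & \Theta_{yz} \\ \Theta_{zx} & \Theta_{zy} & \Theta_{zz} \end{pmatrix}. \]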
The elements in the main diagonal are called moments of inertia, the remaining ones are called deviation moments. The matrix is symmetric, i.e., $\Theta_{\nu\mu} = \Theta_{\mu\nu}$. Thus the tensor of inertia has 6 independent components. If the mass is continuously distributed, one changes from summation to integration for calculating the matrix elements. For example,
\[ \Theta_{xy} = -\int_V \varrho(\mathbf{r})\,xy\,dV, \]
\[ \Theta_{xx} = \int_V \varrho(\mathbf{r})\,(y^2 + z^2)\,dV, \]
where $\varrho(\mathbf{r})$ is the space-dependent density.

At each point $0_0, 0_1, 0_2, \ldots$ the tensor of inertia $\Theta_{\mu\nu}$ is different. At a fixed point 0, $\Theta_{\mu\nu}$ also depends on the coordinate system.
As follows from their definition, the $\Theta_{\mu\nu}$ are constants if one selects a body-fixed coordinate system. The tensor of inertia is, however, dependent on the position of the coordinate system relative to the body and will change if the origin is shifted or the orientation of the axes is changed. The tensor of inertia is usually understood as the tensor in a coordinate system with the origin in the center of gravity (center-of-mass system). The corresponding principal moments of inertia (see Fig. 12.3) are accordingly referred to as the moments of inertia of the body.
Fig. 12.3. Possible rotation points and coordinate systems of the rigid body
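For a finite set of mass points the tensor of inertia can be assembled directly from the sums given above. The following is a minimal sketch in plain Python; the function names and the two-particle test body are illustrative assumptions, not taken from the text.

```python
def inertia_tensor(masses, positions):
    """Tensor of inertia Theta_ij = sum_nu m_nu (r_nu^2 delta_ij - r_nu,i r_nu,j)."""
    theta = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, positions):
        r2 = sum(c * c for c in r)
        for i in range(3):
            for j in range(3):
                delta = 1.0 if i == j else 0.0
                theta[i][j] += m * (r2 * delta - r[i] * r[j])
    return theta

def angular_momentum(theta, omega):
    """L_mu = sum_nu Theta_mu,nu omega_nu."""
    return [sum(theta[i][j] * omega[j] for j in range(3)) for i in range(3)]
```

For a unit mass at (1, 0, 0) and a mass 2 at (0, 1, 1), a rotation about the z-axis gives L = (0, −2, 3), which illustrates that L is in general not parallel to ω when deviation moments are present.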
12.2 Kinetic Energy of a Rotating Rigid Body

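Quite generally the kinetic energy of a system of mass points is
\[ T = \frac{1}{2}\sum_\nu m_\nu v_\nu^2. \]
We decompose the motion of the rigid body into the translation of a point and the rotation about this point, so that $\mathbf{v}_\nu = \mathbf{V} + \boldsymbol{\omega}\times\mathbf{r}_\nu$, and we obtain
\[ T = \frac{1}{2}\sum_\nu m_\nu(\mathbf{V} + \boldsymbol{\omega}\times\mathbf{r}_\nu)^2 = \frac{1}{2}MV^2 + \mathbf{V}\cdot\Big(\boldsymbol{\omega}\times\sum_\nu m_\nu\mathbf{r}_\nu\Big) + \frac{1}{2}\sum_\nu m_\nu(\boldsymbol{\omega}\times\mathbf{r}_\nu)^2. \]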
The first and the last term correspond to pure translational and rotational energy, re-
spectively. The mixed term can be made to vanish in two different ways.
If one point is fixed, and if we put it at the origin of the body-fixed coordinate
system, then V = 0. Otherwise the origin is put at the center of gravity, so that

\[ \sum_\nu m_\nu\mathbf{r}_\nu = 0. \]
The rotation point is in this case the center of gravity. We now consider the pure
rotation energy
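\[ \begin{aligned} T &= \frac{1}{2}\sum_\nu m_\nu(\boldsymbol{\omega}\times\mathbf{r}_\nu)\cdot(\boldsymbol{\omega}\times\mathbf{r}_\nu) = \frac{1}{2}\sum_\nu m_\nu\,\boldsymbol{\omega}\cdot\left(\mathbf{r}_\nu\times(\boldsymbol{\omega}\times\mathbf{r}_\nu)\right) \\ &= \frac{1}{2}\boldsymbol{\omega}\cdot\sum_\nu m_\nu(\mathbf{r}_\nu\times\mathbf{v}_\nu) = \frac{1}{2}\boldsymbol{\omega}\cdot\sum_\nu \mathbf{r}_\nu\times\mathbf{p}_\nu = \frac{1}{2}\boldsymbol{\omega}\cdot\sum_\nu \mathbf{l}_\nu. \end{aligned} \]
Hence,
\[ T = \frac{1}{2}\boldsymbol{\omega}\cdot\mathbf{L}. \]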

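We can substitute the angular momentum $L_\mu = \sum_\nu \Theta_{\mu\nu}\omega_\nu$ (μ, ν = 1, 2, 3):
\[ T = \frac{1}{2}\boldsymbol{\omega}\cdot\mathbf{L} = \frac{1}{2}\sum_\mu \omega_\mu\sum_\nu \Theta_{\mu\nu}\omega_\nu = \frac{1}{2}\sum_{\mu,\nu}\Theta_{\mu\nu}\omega_\mu\omega_\nu. \tag{12.1a} \]
Because $\Theta_{\mu\nu} = \Theta_{\nu\mu}$, the sum reads
\[ T = \frac{1}{2}\left(\Theta_{xx}\omega_x^2 + \Theta_{yy}\omega_y^2 + \Theta_{zz}\omega_z^2 + 2\Theta_{xy}\omega_x\omega_y + 2\Theta_{xz}\omega_x\omega_z + 2\Theta_{yz}\omega_y\omega_z\right). \tag{12.1b} \]
Using tensor notation, the rotation energy reads
\[ T = \frac{1}{2}\boldsymbol{\omega}^T\cdot\hat{\Theta}\cdot\boldsymbol{\omega}. \tag{12.1c} \]
The vector ω on the right-hand side of the tensor $\hat{\Theta}$ must be given as a column vector, and on the left-hand side as a row vector:
\[ T = \frac{1}{2}(\omega_x, \omega_y, \omega_z)\,\hat{\Theta}\begin{pmatrix}\omega_x \\ \omega_y \\ \omega_z\end{pmatrix}. \tag{12.1d} \]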
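The equivalence of the double sum (12.1a) and the expanded form (12.1b) is easy to confirm numerically for any symmetric tensor. The sketch below uses hypothetical function names and an arbitrary symmetric test matrix.

```python
def rotational_energy(theta, omega):
    """T = (1/2) sum_{mu,nu} Theta_mu,nu omega_mu omega_nu, eq. (12.1a)."""
    return 0.5 * sum(theta[i][j] * omega[i] * omega[j]
                     for i in range(3) for j in range(3))

def rotational_energy_expanded(theta, omega):
    """The same energy written out for a symmetric tensor, eq. (12.1b)."""
    wx, wy, wz = omega
    return 0.5 * (theta[0][0] * wx**2 + theta[1][1] * wy**2 + theta[2][2] * wz**2
                  + 2.0 * theta[0][1] * wx * wy
                  + 2.0 * theta[0][2] * wx * wz
                  + 2.0 * theta[1][2] * wy * wz)
```

The two functions agree only if the tensor passed in is symmetric, which is exactly the property used in going from (12.1a) to (12.1b).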
12.3 The Principal Axes of Inertia
The elements of the tensor of inertia depend on the position of the origin and on the
orientation of the (body-fixed) coordinate system. It is now possible for a fixed origin
to orient the coordinate system in such a way that the deviation moments vanish. Such
a special coordinate system is called a system of principal axes. The tensor of inertia
then has diagonal form with respect to this system of axes:
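\[ \hat{\Theta} = \begin{pmatrix} \Theta_1 & 0 & 0 \\ 0 & \Theta_2 & 0 \\ 0 & 0 & \Theta_3 \end{pmatrix} \quad\text{or}\quad \Theta_{\mu\nu} = \Theta_\mu\delta_{\mu\nu}. \tag{12.2} \]
For angular momenta and rotation energy in the system of principal axes, we have the especially simple relations ($\omega_\nu$ are the components of the angular velocity ω with respect to the principal axes)
\[ L_\mu = \sum_\nu \Theta_{\mu\nu}\omega_\nu = \sum_\nu \Theta_\mu\delta_{\mu\nu}\omega_\nu = \Theta_\mu\omega_\mu, \tag{12.3} \]
\[ T = \frac{1}{2}\boldsymbol{\omega}\cdot\mathbf{L} = \frac{1}{2}\sum_\mu \omega_\mu L_\mu = \frac{1}{2}\sum_\mu \Theta_\mu\omega_\mu^2, \tag{12.4a} \]
or written out,
\[ T = \frac{1}{2}\left(\Theta_1\omega_1^2 + \Theta_2\omega_2^2 + \Theta_3\omega_3^2\right). \tag{12.4b} \]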
Because of the tensorial relation $\mathbf{L} = \hat{\Theta}\boldsymbol{\omega}$, the angular momentum and the angular velocity in general have different orientations.
If the body rotates about one of the principal axes of inertia, e.g., about the μ-axis, so that $\boldsymbol{\omega} = \omega\mathbf{e}_\mu$, then according to (12.3) the angular momentum L and the angular velocity ω have the same orientation. The vector ω then has only one component, e.g., ω = (0, ω2, 0) if the rotation is about the second principal axis. The same holds for the angular momentum: L = (0, L2, 0). This property of parallelism between the angular momentum and the angular velocity allows one to determine the principal axes. The question is how to choose ω = {ω1, ω2, ω3} (i.e., about which axis the body must rotate) so that the angular momentum $\mathbf{L} = \hat{\Theta}\boldsymbol{\omega}$ and the angular velocity are parallel to each other, i.e., $\mathbf{L} = \Theta\boldsymbol{\omega}$ with $\Theta$ a scalar.
From the combination of the relations $\mathbf{L} = \hat{\Theta}\boldsymbol{\omega}$ ($\hat{\Theta}$ is a tensor) and $\mathbf{L} = \Theta\boldsymbol{\omega}$ ($\Theta$ is a scalar), we obtain the equation
\[ \mathbf{L} = \hat{\Theta}\cdot\boldsymbol{\omega} = \Theta\boldsymbol{\omega}, \tag{12.5} \]
which is an eigenvalue equation. In this equation, the scalar $\Theta$ and the related components ωx, ωy, ωz, i.e., the rotation axis, are unknown. The equation physically states that the angular momentum L and the rotation velocity ω are parallel to each other. This is fulfilled for certain directions ω that, as stated above, must be determined. All values $\Theta$ that satisfy (12.5) are called eigenvalues of the tensor $\hat{\Theta}$; the corresponding vectors $\boldsymbol{\omega} \neq 0$ are eigenvectors.
Fig. 12.4. Special case: If ω is parallel to a principal axis, then L is parallel to ω
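Equation (12.5) is a shortened notation for the system of equations
\[ \begin{aligned} \Theta_{xx}\omega_x + \Theta_{xy}\omega_y + \Theta_{xz}\omega_z &= \Theta\omega_x, \\ \Theta_{yx}\omega_x + \Theta_{yy}\omega_y + \Theta_{yz}\omega_z &= \Theta\omega_y, \\ \Theta_{zx}\omega_x + \Theta_{zy}\omega_y + \Theta_{zz}\omega_z &= \Theta\omega_z, \end{aligned} \tag{12.6} \]
or
\[ \begin{aligned} (\Theta_{xx} - \Theta)\omega_x + \Theta_{xy}\omega_y + \Theta_{xz}\omega_z &= 0, \\ \Theta_{yx}\omega_x + (\Theta_{yy} - \Theta)\omega_y + \Theta_{yz}\omega_z &= 0, \\ \Theta_{zx}\omega_x + \Theta_{zy}\omega_y + (\Theta_{zz} - \Theta)\omega_z &= 0. \end{aligned} \tag{12.7} \]
This system of homogeneous linear equations has nontrivial solutions if its determinant of coefficients vanishes:
\[ \begin{vmatrix} \Theta_{xx} - \Theta & \Theta_{xy} & \Theta_{xz} \\ \Theta_{yx} & \Theta_{yy} - \Theta & \Theta_{yz} \\ \Theta_{zx} & \Theta_{zy} & \Theta_{zz} - \Theta \end{vmatrix} = 0. \tag{12.8} \]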
The expansion of the determinant leads to an equation of third order in $\Theta$, the characteristic equation. Its three roots are the desired principal moments of inertia (eigenvalues) $\Theta_1$, $\Theta_2$, and $\Theta_3$. By inserting $\Theta_i$ into the system of (12.5), one can calculate the ratio $\omega_x^{(i)} : \omega_y^{(i)} : \omega_z^{(i)}$ of the components of the vector $\boldsymbol{\omega}^{(i)}$. Thereby the orientation of the ith principal axis is determined.
Fig. 12.5. General case: The angular momentum L is not parallel to the rotation velocity ω
Since one can find a tensor of inertia for any possible position of the body-fixed
coordinate system, there exists also a system of principal axes at each point of the
body. The orientations of these axes will however not coincide in general.
12.4 Existence and Orthogonality of the Principal Axes
In principle, it would be possible for the cubic equation (12.8) to have two complex
solutions. We therefore have to prove that a system of real orthogonal principal axes
generally exists.
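In order to apply a shortened summation notation, we number the coordinates (x = 1, y = 2, z = 3) and denote them by Latin letters. Greek letters are indices for the three different eigenvalues. We multiply the eigenvalue equation (12.5) for $\Theta_\lambda$ by the complex conjugate of $\omega_i^{(\mu)}$ and sum over i.
The equation for the component i reads
\[ \sum_k \Theta_{ik}\omega_k^{(\lambda)} = L_i^{(\lambda)} = \Theta_\lambda\omega_i^{(\lambda)}. \tag{12.9} \]
This leads to
\[ \sum_{i,k}\Theta_{ik}\omega_k^{(\lambda)}\omega_i^{(\mu)*} = \Theta_\lambda\sum_i \omega_i^{(\lambda)}\omega_i^{(\mu)*} = \Theta_\lambda\,\boldsymbol{\omega}^{(\lambda)}\cdot\boldsymbol{\omega}^{(\mu)*}. \tag{12.10} \]
In the same way, we form the complex conjugate of the equation corresponding to (12.9) for $\Theta_\mu$, multiply by $\omega_k^{(\lambda)}$, and sum over k:
\[ \sum_i \Theta_{ki}\omega_i^{(\mu)} = \Theta_\mu\omega_k^{(\mu)}, \qquad \sum_i \Theta_{ki}^*\omega_i^{(\mu)*} = \Theta_\mu^*\omega_k^{(\mu)*}, \tag{12.11} \]
\[ \sum_{i,k}\Theta_{ki}^*\omega_i^{(\mu)*}\omega_k^{(\lambda)} = \Theta_\mu^*\sum_k \omega_k^{(\mu)*}\omega_k^{(\lambda)} = \Theta_\mu^*\,\boldsymbol{\omega}^{(\mu)*}\cdot\boldsymbol{\omega}^{(\lambda)}. \tag{12.12} \]
Now we utilize the property of the tensor of inertia to be real and symmetric. We have $\Theta_{ik} = \Theta_{ki} = \Theta_{ki}^*$, and the left-hand sides of (12.10) and (12.12) are equal to each other. We subtract (12.12) from (12.10):
\[ (\Theta_\lambda - \Theta_\mu^*)\,\boldsymbol{\omega}^{(\lambda)}\cdot\boldsymbol{\omega}^{(\mu)*} = 0. \tag{12.13} \]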
This equation allows two conclusions:

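(1) Setting λ = μ, then for the eigenvalues of $\hat{\Theta}$
\[ (\Theta_\lambda - \Theta_\lambda^*)\,\boldsymbol{\omega}^{(\lambda)}\cdot\boldsymbol{\omega}^{(\lambda)*} = 0 \tag{12.14} \]
follows the relation $\Theta_\lambda = \Theta_\lambda^*$, since the scalar product of two complex conjugated quantities is positive definite.
We thus proved that $\Theta_\lambda$ is real. Hence, any body always has three real principal moments of inertia and therefore also three real principal axes $\boldsymbol{\omega}^{(\lambda)}$. This is of course physically clear from the outset, since the principal moments of inertia are nothing else but the moments of inertia about the principal axes, and therefore they are always real.
(2) We now consider the case λ ≠ μ: Since all $\Theta_\nu$ and therefore also all $\boldsymbol{\omega}^{(\nu)}$ are real, (12.13) reads
\[ (\Theta_\lambda - \Theta_\mu)\,\boldsymbol{\omega}^{(\lambda)}\cdot\boldsymbol{\omega}^{(\mu)} = 0. \tag{12.15} \]
(a) If $\Theta_\lambda \neq \Theta_\mu$, then $\boldsymbol{\omega}^{(\lambda)}\cdot\boldsymbol{\omega}^{(\mu)} = 0$, and therefore, $\boldsymbol{\omega}^{(\lambda)}$ and $\boldsymbol{\omega}^{(\mu)}$ are orthogonal.
(b) If, e.g., $\Theta_1 = \Theta_2 = \Theta$, i.e., if two of the three eigenvalues are equal, then besides $\boldsymbol{\omega}^{(1)}$ and $\boldsymbol{\omega}^{(2)}$ all linear combinations of these two vectors are eigenvectors, too:
\[ \hat{\Theta}\cdot\boldsymbol{\omega}^{(1)} = \Theta\boldsymbol{\omega}^{(1)}, \quad \hat{\Theta}\cdot\boldsymbol{\omega}^{(2)} = \Theta\boldsymbol{\omega}^{(2)} \;\Rightarrow\; \hat{\Theta}\cdot(\alpha\boldsymbol{\omega}^{(1)} + \beta\boldsymbol{\omega}^{(2)}) = \Theta(\alpha\boldsymbol{\omega}^{(1)} + \beta\boldsymbol{\omega}^{(2)}). \]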
Thus, we can arbitrarily select two orthogonal vectors from the plane spanned
this way and consider them as directions of principal axes. The third principal
axis is by (12.15) fixed orthogonally to the two other axes. If two principal
moments of inertia with respect to the center of gravity as rotation point are
equal, the body is called a symmetric top.
(c) If all three moments of inertia are equal ($\Theta_1 = \Theta_2 = \Theta_3$), then any arbitrary
orthogonal set of axes is a system of principal axes. If this holds with respect
to the center of gravity, the body is called a spherical top.
If a body has rotational symmetry about one axis, then we are dealing with case (b),
and the rotation axis is a principal axis. For other kinds of symmetries the symmetry
axis also coincides with the principal axis.
EXAMPLE
12.1 Tensor of Inertia of a Square Covered with Mass
We calculate the tensor of inertia and the principal axes of inertia of a square covered
with mass for a corner of the square. We put the square in the x, y-plane of the coor-
dinate system, as is shown in Fig. 12.6. The components of the tensor of inertia are
obtained with z = 0 by integration over the area:
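\[ \Theta_{xx} = \sigma\int_{y=0}^{a}\int_{x=0}^{a} y^2\,dx\,dy = \frac{1}{3}Ma^2, \]
\[ \Theta_{yy} = \sigma\int_{y=0}^{a}\int_{x=0}^{a} x^2\,dx\,dy = \frac{1}{3}Ma^2, \]
\[ \Theta_{zz} = \sigma\int_{y=0}^{a}\int_{x=0}^{a} (x^2 + y^2)\,dx\,dy = \frac{2}{3}Ma^2. \]
Fig. 12.6. The angular velocity ω is arbitrary; however, it passes through the coordinate origin
Likewise,
\[ \Theta_{xy} = \Theta_{yx} = -\sigma\int_{y=0}^{a}\int_{x=0}^{a} xy\,dx\,dy = -\frac{1}{4}Ma^2. \]
The remaining deviation moments contain the factor z in the integrand and therefore vanish:
\[ \Theta_{yz} = \Theta_{zy} = \Theta_{xz} = \Theta_{zx} = 0. \]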
Thus, in the selected coordinate system the plate has the following tensor of inertia:
\[ \hat{\Theta} = \begin{pmatrix} \frac{1}{3}Ma^2 & -\frac{1}{4}Ma^2 & 0 \\ -\frac{1}{4}Ma^2 & \frac{1}{3}Ma^2 & 0 \\ 0 & 0 & \frac{2}{3}Ma^2 \end{pmatrix}. \]
We now calculate the orientations of the principal axes.
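In accordance with the described approach, we first determine the eigenvalues of the tensor of inertia. We introduce the abbreviation $\Theta_0 = Ma^2$. Then we have the determinant
\[ \begin{vmatrix} \frac{1}{3}\Theta_0 - \Theta & -\frac{1}{4}\Theta_0 & 0 \\ -\frac{1}{4}\Theta_0 & \frac{1}{3}\Theta_0 - \Theta & 0 \\ 0 & 0 & \frac{2}{3}\Theta_0 - \Theta \end{vmatrix} = 0 \]
or
\[ \left(\Theta^2 - \frac{2}{3}\Theta_0\Theta + \frac{7}{144}\Theta_0^2\right)\left(\frac{2}{3}\Theta_0 - \Theta\right) = 0. \]
The roots of this characteristic equation
\[ \Theta_1 = \frac{1}{12}\Theta_0, \quad \Theta_2 = \frac{7}{12}\Theta_0, \quad \Theta_3 = \frac{2}{3}\Theta_0 \]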
are the principal moments of inertia with respect to the origin.
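For the principal moment of inertia $\Theta_\nu$, the orientation of the axis $\boldsymbol{\omega}^{(\nu)}$ results from the eigenvalue equation
\[ \hat{\Theta}\boldsymbol{\omega}^{(\nu)} = \Theta_\nu\boldsymbol{\omega}^{(\nu)}. \]
Written out for ν = 1,
\[ \begin{pmatrix} \frac{1}{3}\Theta_0 & -\frac{1}{4}\Theta_0 & 0 \\ -\frac{1}{4}\Theta_0 & \frac{1}{3}\Theta_0 & 0 \\ 0 & 0 & \frac{2}{3}\Theta_0 \end{pmatrix}\begin{pmatrix}\omega_x^{(1)} \\ \omega_y^{(1)} \\ \omega_z^{(1)}\end{pmatrix} = \frac{1}{12}\Theta_0\begin{pmatrix}\omega_x^{(1)} \\ \omega_y^{(1)} \\ \omega_z^{(1)}\end{pmatrix}. \]
By multiplying out, we get a vector equation; after splitting into the three components, we obtain the three equations
\[ \begin{aligned} \frac{1}{3}\Theta_0\omega_x^{(1)} - \frac{1}{4}\Theta_0\omega_y^{(1)} &= \frac{1}{12}\Theta_0\omega_x^{(1)}, \\ -\frac{1}{4}\Theta_0\omega_x^{(1)} + \frac{1}{3}\Theta_0\omega_y^{(1)} &= \frac{1}{12}\Theta_0\omega_y^{(1)}, \\ \frac{2}{3}\Theta_0\omega_z^{(1)} &= \frac{1}{12}\Theta_0\omega_z^{(1)}. \end{aligned} \]
It follows that
\[ \omega_y^{(1)} = \omega_x^{(1)}, \quad \omega_z^{(1)} = 0, \]
and thus, the orientation of the first principal axis is
\[ \mathbf{e}_1 = \frac{\boldsymbol{\omega}^{(1)}}{|\boldsymbol{\omega}^{(1)}|} = \frac{1}{\sqrt{2}}\begin{pmatrix}1 \\ 1 \\ 0\end{pmatrix}. \]
Fig. 12.7. The principal axes for rotations about the point 0
Analogously, we obtain for the two other directions
\[ \mathbf{e}_2 = \frac{\boldsymbol{\omega}^{(2)}}{|\boldsymbol{\omega}^{(2)}|} = \frac{1}{\sqrt{2}}\begin{pmatrix}-1 \\ 1 \\ 0\end{pmatrix} \quad\text{and}\quad \mathbf{e}_3 = \frac{\boldsymbol{\omega}^{(3)}}{|\boldsymbol{\omega}^{(3)}|} = \begin{pmatrix}0 \\ 0 \\ 1\end{pmatrix}. \]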
Evidently, the principal axes are orthogonal to each other, as is demanded by the gen-
eral theory. For a rotation about the point 0 around one of the principal axes, the
angular momentum L is parallel to ω, but in general the center of gravity then also
moves. Such a motion can be maintained only by the action of a force; thus it is not a free
motion. Force-free rotations (in short: free rotations) take place only about the center of
gravity. The principal axes and principal moments of inertia about the center of gravity
are called the principal axes and principal moments of inertia of the body. In our
example the orientations of the principal axes coincide with those at the point 0.
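The eigenvalues and principal axes found in Example 12.1 can be verified exactly with rational arithmetic. The sketch below works in units of Ma² and uses unnormalized axis vectors; the helper names are hypothetical.

```python
from fractions import Fraction as F

# Tensor of inertia of the square about its corner, in units of Theta_0 = M a^2.
THETA = [[F(1, 3), F(-1, 4), F(0)],
         [F(-1, 4), F(1, 3), F(0)],
         [F(0), F(0), F(2, 3)]]

# (principal moment, unnormalized principal axis) as found in the example.
PRINCIPAL = [(F(1, 12), (1, 1, 0)),
             (F(7, 12), (-1, 1, 0)),
             (F(2, 3), (0, 0, 1))]

def matvec(mat, vec):
    """Matrix-vector product for a 3 x 3 matrix."""
    return [sum(mat[i][j] * vec[j] for j in range(3)) for i in range(3)]

def is_eigenpair(mat, value, vec):
    """True if mat * vec equals value * vec exactly."""
    return matvec(mat, vec) == [value * c for c in vec]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))
```

Each pair satisfies the eigenvalue equation exactly, and the three axes are mutually orthogonal, as the general theory of Sect. 12.4 requires. The sum of the three principal moments also equals the trace of the tensor.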
12.5 Transformation of the Tensor of Inertia

We investigate how the elements of the tensor $\hat{\Theta}$ behave under a rotation of the coordinate system. The transformation of a vector under rotation of the coordinate system is described by¹
\[ \mathbf{x}' = \hat{A}\mathbf{x} \]
or
\[ x_i' = \sum_j a_{ij}x_j, \tag{12.16} \]
or for the basis vectors
\[ \mathbf{e}_i' = \sum_j a_{ij}\mathbf{e}_j, \tag{12.17} \]
where the components $a_{ij}$ of the rotation matrix $\hat{A}$ are the direction cosines between the rotated and the old axes. The inverse of this transformation reads
\[ \mathbf{x} = \hat{A}^{-1}\mathbf{x}' \quad\text{or}\quad x_i = \sum_j a_{ji}x_j'. \tag{12.18} \]
The inverse rotation matrix $(a^{-1})_{ij} = (a_{ji})$ is found by exchanging rows and columns (transposition), since the rotation is an orthogonal transformation which satisfies
\[ \sum_j a_{ij}a_{kj} = \delta_{ik} \quad\text{or}\quad \sum_i a_{ij}a_{ik} = \delta_{jk}. \tag{12.19} \]
¹ (2004), Chapter 6.
We require for the tensor of inertia that a vector equation of the form
\[ L_k = \sum_l \Theta_{kl}\omega_l \tag{12.20} \]
exists also in the rotated system:
\[ L_i' = \sum_j \Theta_{ij}'\omega_j'. \tag{12.21} \]
Thus, we can determine the transformation behavior of the tensor from the behavior of the vectors. The vectors L and ω obey the transformation equation (12.18). If we replace $L_k$ and $\omega_l$ in (12.20) by the primed quantities, we obtain
\[ \sum_l \Theta_{kl}\sum_j a_{jl}\omega_j' = \sum_j a_{jk}L_j'. \]
Multiplication by $a_{ik}$ and summation over k yields
\[ \sum_j\Big(\sum_{k,l} a_{ik}a_{jl}\Theta_{kl}\Big)\omega_j' = \sum_j\Big(\sum_k a_{jk}a_{ik}\Big)L_j' = \sum_j \delta_{ij}L_j' = L_i'. \tag{12.22} \]
For the components of $\hat{\Theta}'$ follows by comparison with (12.21)
\[ \Theta_{ij}' = \sum_{k,l} a_{ik}a_{jl}\Theta_{kl}. \tag{12.23} \]
This transformation relation is the reason for denoting $\hat{\Theta}$ as a “tensor.” A tensor of rank m is generally defined as any quantity which under orthogonal transformations behaves according to the logical extension of (12.23) (summation over m indices), e.g., a tensor of third rank
\[ A_{ijk}' = \sum_{i',j',k'} a_{ii'}a_{jj'}a_{kk'}A_{i'j'k'}. \tag{12.24} \]
$\hat{\Theta}$ is a tensor of second rank; because of (12.16), a vector can be considered as a tensor of first rank, and a scalar accordingly as a tensor of rank 0. One can easily memorize the transformation law of a tensor: each index of the tensor transforms like the index of a vector component (see (12.16)).
For the tensor of inertia, (12.23) can be more clearly represented in matrix notation:
\[ \hat{\Theta}' = \hat{A}\hat{\Theta}\hat{A}^{-1}. \tag{12.25} \]
This is a similarity transformation.

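The matrices $\hat{A}$ ($\hat{A}^{-1}$) reduce to row vectors (column vectors) if we want to determine only the moment of inertia $\Theta_{ii}'$ about a given axis $\mathbf{e}_i'$ from the tensor of inertia $\Theta_{kl}$ in the coordinate system $\mathbf{e}_k$. According to (12.23), the desired moment of inertia $\Theta_{ii}'$ is
\[ \Theta_{ii}' = \sum_{j,l} a_{ij}a_{il}\Theta_{jl}. \]
Now, according to (12.17), $\mathbf{e}_i' = \{a_{i1}, a_{i2}, a_{i3}\}$ is the vector $\mathbf{e}_i'$ in the basis $\mathbf{e}_j$. Hence, the moment of inertia about the rotation axis $\mathbf{e}_i' = \mathbf{n} = (n_1, n_2, n_3)$ can obviously be written as follows:
\[ \begin{aligned} \Theta_n &= \sum_{j,l} a_{ij}\Theta_{jl}a_{il} = \sum_{j,l} a_{ij}\Theta_{jl}(a^T)_{li} = \sum_{j,l} n_j\Theta_{jl}n_l \\ &= (n_1, n_2, n_3)\,\hat{\Theta}\begin{pmatrix}n_1 \\ n_2 \\ n_3\end{pmatrix} = \mathbf{n}^T\cdot\hat{\Theta}\cdot\mathbf{n} \\ &= \sum_{i,j}\Theta_{ij}n_in_j. \end{aligned} \tag{12.26} \]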
This relation will be derived more clearly in the context of the subsequent equation
(12.33). It allows one to calculate the moment of inertia about an arbitrary rotation
axis n rather quickly.
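Relation (12.26) translates into a one-line double sum. The sketch below uses a hypothetical function name and normalizes the direction internally, so that n need not be given as a unit vector; the test tensor is that of the square plate of Example 12.1 in units of Ma².

```python
import math

def moment_about_axis(theta, n):
    """Theta_n = sum_{i,j} Theta_ij n_i n_j, eq. (12.26), for a direction n."""
    norm = math.sqrt(sum(c * c for c in n))
    u = [c / norm for c in n]  # unit vector along the rotation axis
    return sum(theta[i][j] * u[i] * u[j] for i in range(3) for j in range(3))
```

For the square plate the moment about the diagonal through the corner, n ∝ (1, 1, 0), comes out as 1/12, i.e., the smallest principal moment, while the x-axis simply returns the diagonal element 1/3.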
12.6 Tensor of Inertia in the System of Principal Axes
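If the three orientations of the principal axes $\mathbf{e}_i' = \boldsymbol{\omega}^{(i)}$ are selected as coordinate axes, then
\[ \mathbf{e}_i' = \omega_1^{(i)}\mathbf{e}_1 + \omega_2^{(i)}\mathbf{e}_2 + \omega_3^{(i)}\mathbf{e}_3 = \sum_j \omega_j^{(i)}\mathbf{e}_j. \]
A comparison with (12.17) shows that in this case
\[ a_{ij} = \omega_j^{(i)}. \]
Hence, according to (12.23) the tensor of inertia in the system of principal axes reads
\[ \Theta_{ij}' = \sum_{k,l} a_{ik}a_{jl}\Theta_{kl} = \sum_{k,l}\omega_k^{(i)}\omega_l^{(j)}\Theta_{kl} = \sum_k \omega_k^{(i)}\sum_l \Theta_{kl}\omega_l^{(j)}. \tag{12.27} \]
Since $\boldsymbol{\omega}^{(j)}$ is an eigenvector of the matrix $\hat{\Theta}$ with the eigenvalue $\Theta_j$, according to (12.5), we have
\[ \hat{\Theta}\boldsymbol{\omega}^{(j)} = \Theta_j\boldsymbol{\omega}^{(j)} \tag{12.28} \]
or, explicitly,
\[ \sum_l \Theta_{kl}\omega_l^{(j)} = \Theta_j\omega_k^{(j)}. \]
Therefore, (12.27) turns into
\[ \Theta_{ij}' = \sum_k \omega_k^{(i)}\Theta_j\omega_k^{(j)} = \Theta_j\sum_k \omega_k^{(i)}\omega_k^{(j)} = \Theta_j\,\boldsymbol{\omega}^{(i)}\cdot\boldsymbol{\omega}^{(j)} = \Theta_j\delta_{ij}. \tag{12.29} \]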
We thus used the orthonormality of the principal axes vectors ω(i) (their orthogonality follows from (12.15)). The ω(i)
were assumed to be normalized, which is possible because of the linearity of the eigen-
value equation (12.28) with respect to ω. Equation (12.29) expresses the interesting
and important fact that the tensor of inertia in its eigenrepresentation (i.e., in the coor-
dinate system with the principal axes ω(i) as coordinate axes) is diagonal and exactly
of the form (12.2). This was to be expected, but it is satisfactory to see how everything
fits together consistently.
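The statement (12.29) can be illustrated numerically by applying the transformation law (12.23) with the principal axes as rows of the rotation matrix. The sketch below reuses the square plate of Example 12.1 (in units of Ma²); the function name is hypothetical and floating-point arithmetic is used, so the off-diagonal elements vanish only up to rounding.

```python
import math

def transform(theta, a):
    """Theta'_ij = sum_{k,l} a_ik a_jl Theta_kl, eq. (12.23)."""
    return [[sum(a[i][k] * a[j][l] * theta[k][l]
                 for k in range(3) for l in range(3))
             for j in range(3)] for i in range(3)]

s = 1.0 / math.sqrt(2.0)
theta = [[1.0 / 3.0, -0.25, 0.0], [-0.25, 1.0 / 3.0, 0.0], [0.0, 0.0, 2.0 / 3.0]]
axes = [[s, s, 0.0], [-s, s, 0.0], [0.0, 0.0, 1.0]]  # rows: e1, e2, e3 of Example 12.1
theta_principal = transform(theta, axes)
```

The result is diagonal with the entries 1/12, 7/12 and 2/3, which is the form (12.2) with the principal moments found in the example.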
12.7 Ellipsoid of Inertia
We define a rotation axis by the unit vector n with the direction cosines n = (cos α, cos β, cos γ). According to (12.26), the moment of inertia about this axis is
\[ \Theta = \Theta_n = (\cos\alpha, \cos\beta, \cos\gamma)\begin{pmatrix} \Theta_{xx} & \Theta_{xy} & \Theta_{xz} \\ \Theta_{xy} & \Theta_{yy} & \Theta_{yz} \\ \Theta_{xz} & \Theta_{yz} & \Theta_{zz} \end{pmatrix}\begin{pmatrix}\cos\alpha \\ \cos\beta \\ \cos\gamma\end{pmatrix}. \]
Fig. 12.8. n characterizes the rotation axis
Multiplying out, we obtain
\[ \Theta_n = \Theta_{xx}\cos^2\alpha + \Theta_{yy}\cos^2\beta + \Theta_{zz}\cos^2\gamma + 2\Theta_{xy}\cos\alpha\cos\beta + 2\Theta_{xz}\cos\alpha\cos\gamma + 2\Theta_{yz}\cos\beta\cos\gamma. \tag{12.30} \]
By defining a vector $\boldsymbol{\varrho} = \mathbf{n}/\sqrt{\Theta_n}$, we can rewrite the equation as
\[ \Theta_{xx}\varrho_x^2 + \Theta_{yy}\varrho_y^2 + \Theta_{zz}\varrho_z^2 + 2\Theta_{xy}\varrho_x\varrho_y + 2\Theta_{xz}\varrho_x\varrho_z + 2\Theta_{yz}\varrho_y\varrho_z = 1. \tag{12.31} \]
This equation represents an ellipsoid in the coordinates $(\varrho_x, \varrho_y, \varrho_z)$, the so-called ellipsoid of inertia.
The distance $\varrho$ from the center of rotation 0 along the direction n to the ellipsoid of inertia is $\varrho = 1/\sqrt{\Theta}$. This allows us to write down at once the moment of inertia if the ellipsoid of inertia is known. Each ellipsoid can now be brought to its normal form by a rotation of the coordinate system, i.e., the mixed terms can be made to vanish. We then obtain the form of the ellipsoid of inertia
\[ \Theta_1\varrho_1^2 + \Theta_2\varrho_2^2 + \Theta_3\varrho_3^2 = 1. \tag{12.32} \]
Fig. 12.9. The ellipsoid of inertia
This transformation of the ellipsoid of inertia obviously corresponds to the transforma-

tion of the tensor of inertia to principal axes. This becomes clear by comparing (12.31)
and (12.32) with (12.1b) and (12.4a). The principal moments of inertia are given by
the squares of the reciprocal axis lengths of the ellipsoid. For two equal principal mo-
ments of inertia, the ellipsoid of inertia is a rotation ellipsoid, for three equal moments
a sphere.
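There is also a physical approach to the ellipsoid of inertia which will be presented now. Let $\mathbf{n} = \{\cos\alpha, \cos\beta, \cos\gamma\}$ be a unit vector pointing along the direction of the angular velocity $\boldsymbol{\omega}$, so that
\[
\boldsymbol{\omega} = \omega\mathbf{n} = \omega\{\cos\alpha, \cos\beta, \cos\gamma\} = \omega\{n_1, n_2, n_3\} = \{\omega_1, \omega_2, \omega_3\} .
\]
For the kinetic rotation energy, we then obtain according to (12.1a)
\[
\begin{aligned}
T_{\mathrm{rot}} &= \frac{1}{2}\sum_{i,k}\Theta_{ik}\,\omega_i\omega_k \\
&= \frac{1}{2}\omega^2\left(\Theta_{11}\cos^2\alpha + \Theta_{22}\cos^2\beta + \Theta_{33}\cos^2\gamma + 2\Theta_{12}\cos\alpha\cos\beta + 2\Theta_{13}\cos\alpha\cos\gamma + 2\Theta_{23}\cos\beta\cos\gamma\right) \\
&= \frac{1}{2}\Theta_n\,\omega^2 .
\end{aligned}
\]
$\Theta_n$ denotes the moment of inertia about the axis $\mathbf{n}$. Hence, the moment of inertia about an axis with the orientation $\mathbf{n}$ is given by
\[
\Theta_n = \Theta_{11}\cos^2\alpha + \Theta_{22}\cos^2\beta + \Theta_{33}\cos^2\gamma + 2\Theta_{12}\cos\alpha\cos\beta + 2\Theta_{13}\cos\alpha\cos\gamma + 2\Theta_{23}\cos\beta\cos\gamma .
\]
This agrees with the already known result (12.30). With the coordinates $\boldsymbol{\varrho} = \mathbf{n}/\sqrt{\Theta_n} = (\varrho_1, \varrho_2, \varrho_3)$, we thus obtain the ellipsoid of inertia
\[
\Theta_{11}\varrho_1^2 + \Theta_{22}\varrho_2^2 + \Theta_{33}\varrho_3^2 + 2\Theta_{12}\varrho_1\varrho_2 + 2\Theta_{13}\varrho_1\varrho_3 + 2\Theta_{23}\varrho_2\varrho_3 = 1 . \tag{12.33}
\]
The radius of the ellipsoid in the direction $\mathbf{n}$ is $\varrho_n = 1/\sqrt{\Theta_n}$.
Fig. 12.10. The rigid body rotates about the axis n: $d_\nu$ is the distance of the mass $m_\nu$ from the rotation axis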
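Finally, there is still a third approach to the ellipsoid of inertia: According to Fig. 12.10, the moment of inertia about the axis $\mathbf{n}$ is given by
\[
\Theta_n = \sum_\nu m_\nu d_\nu^2 = \sum_\nu m_\nu\,|\mathbf{r}_\nu\times\mathbf{n}|^2 . \tag{12.34}
\]
We check
\[
\mathbf{r}_\nu\times\mathbf{n} =
\begin{vmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ x_\nu & y_\nu & z_\nu \\ \cos\alpha & \cos\beta & \cos\gamma \end{vmatrix}
= (y_\nu\cos\gamma - z_\nu\cos\beta)\mathbf{e}_1 + (z_\nu\cos\alpha - x_\nu\cos\gamma)\mathbf{e}_2 + (x_\nu\cos\beta - y_\nu\cos\alpha)\mathbf{e}_3
\]
and
\[
\begin{aligned}
d_\nu^2 = |\mathbf{r}_\nu\times\mathbf{n}|^2
&= (y_\nu\cos\gamma - z_\nu\cos\beta)^2 + (z_\nu\cos\alpha - x_\nu\cos\gamma)^2 + (x_\nu\cos\beta - y_\nu\cos\alpha)^2 \\
&= (y_\nu^2 + z_\nu^2)\cos^2\alpha + (x_\nu^2 + z_\nu^2)\cos^2\beta + (x_\nu^2 + y_\nu^2)\cos^2\gamma \\
&\quad - 2x_\nu y_\nu\cos\alpha\cos\beta - 2x_\nu z_\nu\cos\alpha\cos\gamma - 2y_\nu z_\nu\cos\beta\cos\gamma .
\end{aligned} \tag{12.35}
\]
Inserting this into (12.34) immediately yields
\[
\Theta_n = \sum_{i,j}\Theta_{ij}\,n_i n_j , \tag{12.36}
\]
i.e., again the known ellipsoid of inertia.

One should realize at this point that the ellipsoid of inertia for a given tensor of inertia $\Theta_{ik}$ can immediately be written down and drawn according to (12.31). We use this method of evaluating moments of inertia in an arbitrary direction in Exercise 12.4.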
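The equivalence of (12.34) and (12.36) can be checked numerically. The following is a minimal sketch in Python (standard library only); the three masses, their positions, and the axis direction are arbitrary illustrative values, not data from the text, and the function names are hypothetical.

```python
import math

def inertia_tensor(masses, positions):
    """3x3 tensor of inertia of point masses about the origin (nested lists)."""
    theta = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, positions):
        r2 = sum(c * c for c in r)
        for i in range(3):
            for j in range(3):
                delta = 1.0 if i == j else 0.0
                theta[i][j] += m * (r2 * delta - r[i] * r[j])
    return theta

def moment_about_axis(theta, n):
    """Theta_n = sum_ij Theta_ij n_i n_j, cf. (12.36)."""
    return sum(theta[i][j] * n[i] * n[j] for i in range(3) for j in range(3))

def moment_from_distances(masses, positions, n):
    """Theta_n = sum_nu m_nu |r_nu x n|^2, cf. (12.34)."""
    total = 0.0
    for m, (x, y, z) in zip(masses, positions):
        cx = y * n[2] - z * n[1]
        cy = z * n[0] - x * n[2]
        cz = x * n[1] - y * n[0]
        total += m * (cx * cx + cy * cy + cz * cz)
    return total

# Illustrative (assumed) data: three masses and a unit vector n.
masses = [1.0, 2.0, 0.5]
positions = [(1.0, 0.0, 2.0), (-1.0, 3.0, 0.5), (0.0, -2.0, 1.0)]
n = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)

theta = inertia_tensor(masses, positions)
theta_n = moment_about_axis(theta, n)
theta_n_direct = moment_from_distances(masses, positions, n)
rho_n = 1.0 / math.sqrt(theta_n)  # radius of the ellipsoid of inertia along n
print(theta_n, theta_n_direct, rho_n)
```

Both routes give the same $\Theta_n$ up to rounding, and $\varrho_n = 1/\sqrt{\Theta_n}$ is the corresponding radius of the ellipsoid of inertia.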
EXAMPLE
12.2 Transformation of the Tensor of Inertia of a Square Covered with Mass
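The tensor of inertia of the square covered with mass in the x, y-plane was given by (compare Example 12.1)
\[
\hat{\Theta} = \Theta_0
\begin{pmatrix} \frac{1}{3} & -\frac{1}{4} & 0 \\ -\frac{1}{4} & \frac{1}{3} & 0 \\ 0 & 0 & \frac{2}{3} \end{pmatrix} .
\]
The rotation of the coordinate system by $\varphi = \pi/4$ about the z-axis must bring $\hat{\Theta}$ to diagonal form, because the angle bisectors of the x, y-plane, as was shown (compare Exercise 12.1), are principal axes. The corresponding rotation matrix reads
\[
\hat{A} =
\begin{pmatrix} \cos\varphi & \sin\varphi & 0 \\ -\sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{pmatrix}
=
\begin{pmatrix} \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} & 0 \\ -\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} & 0 \\ 0 & 0 & 1 \end{pmatrix} .
\]
Obviously,
\[
\hat{A}^{-1} = \hat{A}^{T} .
\]
Performing the matrix multiplication yields in accordance with the former result
\[
\hat{\Theta}' = \hat{A}\,\hat{\Theta}\,\hat{A}^{-1} = \Theta_0
\begin{pmatrix} \frac{1}{12} & 0 & 0 \\ 0 & \frac{7}{12} & 0 \\ 0 & 0 & \frac{2}{3} \end{pmatrix} .
\]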
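The matrix product of this example is easily reproduced by machine. The following Python sketch (standard library only, with hypothetical helper names `matmul` and `transpose`) works in units of $\Theta_0$ and uses $\hat{A}^{-1} = \hat{A}^{T}$.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    """Transpose of a 3x3 matrix."""
    return [[a[j][i] for j in range(3)] for i in range(3)]

phi = math.pi / 4
c, s = math.cos(phi), math.sin(phi)

# Rotation about the z-axis by phi, as in the example.
A = [[c, s, 0.0],
     [-s, c, 0.0],
     [0.0, 0.0, 1.0]]

# Tensor of inertia of the square in units of Theta_0.
theta = [[1 / 3, -1 / 4, 0.0],
         [-1 / 4, 1 / 3, 0.0],
         [0.0, 0.0, 2 / 3]]

# Theta' = A Theta A^(-1), with A^(-1) = A^T for a rotation.
theta_rot = matmul(matmul(A, theta), transpose(A))
for row in theta_rot:
    print([round(x, 12) for x in row])
```

Up to rounding, the diagonal of `theta_rot` is $(1/12, 7/12, 2/3)$ and the off-diagonal elements vanish.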
EXERCISE
12.3 Rolling Circular Top

Problem. Find the kinetic energy of a homogeneous circular top (density $\varrho$, mass $m$, height $h$, vertex angle $2\alpha$),
(a) rolling on a plane, and
(b) whose base circle rolls on a plane while its longitudinal axis is parallel to the plane and the vertex is fixed at a point.
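Solution. For the calculation of the tensor of inertia, we choose the coordinate system so that the longitudinal axis coincides with the z-axis.
Obviously,
\[
\begin{aligned}
\Theta_{xx} &= \varrho\int_V (y^2 + z^2)\,dV = \varrho\int_V (r^2\sin^2\varphi + z^2)\,r\,dz\,dr\,d\varphi \\
&= \varrho\int_0^{2\pi} d\varphi\int_0^R r\,dr\int_{h(r/R)}^{h} (r^2\sin^2\varphi + z^2)\,dz = \frac{\pi}{20}\varrho h R^2(R^2 + 4h^2),
\end{aligned}
\]
\[
\Theta_{xx} = \frac{3}{20}mh^2(\tan^2\alpha + 4) . \tag{12.37}
\]
Fig. 12.11. From the figure it is seen that $m = (1/3)\pi\varrho h R^2$, $R = h\tan\alpha$, $s = R/\sin\alpha$
For reasons of symmetry, we have
\[
\Theta_{yy} = \Theta_{xx} .
\]
Likewise,
\[
\Theta_{zz} = \varrho\int_V (x^2 + y^2)\,dV = \varrho\int_V r^3\,dz\,dr\,d\varphi
= \varrho\int_0^{2\pi} d\varphi\int_0^R r^3\,dr\int_{h(r/R)}^{h} dz = \frac{\pi}{10}\varrho h R^4,
\]
\[
\Theta_{zz} = \frac{3}{10}mh^2\tan^2\alpha . \tag{12.38}
\]
Since the integrals over $\varphi$ of the integrands $r^2\cos\varphi\sin\varphi$ (for $\Theta_{xy}$), $rz\cos\varphi$ (for $\Theta_{xz}$), and $rz\sin\varphi$ (for $\Theta_{yz}$) with the limits 0 and $2\pi$ vanish, it follows that $\Theta_{xy} = \Theta_{xz} = \Theta_{yz} = 0$. The adopted system is a system of principal axes. We therefore set $\Theta_1 = \Theta_{xx} = \Theta_2$, $\Theta_3 = \Theta_{zz}$.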
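(a) The kinetic energy in the representation of principal axes reads
\[
T = \frac{1}{2}\Theta_1\omega_1^2 + \frac{1}{2}\Theta_2\omega_2^2 + \frac{1}{2}\Theta_3\omega_3^2 . \tag{12.39}
\]
Since we already know the principal axes of inertia and moments of inertia, it remains only to express the motion of the cone by the corresponding angular velocities. The momentary rotation of the cone happens with the angular velocity $\omega$ about a line of support. We can express $\omega$ by $\dot{\varphi}$ by considering the velocity of the point B. On the one hand $v_B = \dot{\varphi}\,h\cos\alpha$, and on the other hand $v_B = \omega R\cos\alpha$. From this, we find
\[
\omega = \dot{\varphi}\,\frac{h}{R} . \tag{12.40}
\]
Fig. 12.12. Rolling cone
$\varphi$ is the polar angle of the figure axis (or, equivalently, the tangential line) in the $x', y'$-plane; $\dot{\varphi}$ is the corresponding angular velocity.
A decomposition of $\boldsymbol{\omega}$ in the system of principal axes, where $\boldsymbol{\omega}$ lies in the x,z-plane, leads to $\omega_2 = 0$ and
\[
\omega_3 = \omega\cos\alpha \quad\text{and}\quad \omega_1 = \omega\sin\alpha . \tag{12.41}
\]
For the kinetic energy, we thus obtain from (12.39)
\[
\begin{aligned}
T &= \frac{1}{2}\Theta_1\omega_1^2 + \frac{1}{2}\Theta_2\omega_2^2 + \frac{1}{2}\Theta_3\omega_3^2 \\
&= \frac{1}{2}\,\frac{3}{20}mh^2(\tan^2\alpha + 4)\,\omega_1^2 + \frac{1}{2}\,\frac{3}{10}mh^2\tan^2\alpha\;\omega_3^2 \\
&= \frac{3}{40}mh^2\omega^2\sin^2\alpha\,(\tan^2\alpha + 4) + \frac{3}{20}mh^2\omega^2\sin^2\alpha \\
&= \frac{3}{40}mh^2\omega^2\left(\frac{\sin^4\alpha}{\cos^2\alpha} + 6\sin^2\alpha\right) .
\end{aligned} \tag{12.42}
\]
If we replace $\omega$ by (12.40) and employ $R/h = \tan\alpha = \sin\alpha/\cos\alpha$, then
\[
T = \frac{3}{40}mh^2\dot{\varphi}^2\,\frac{h^2}{R^2}\left(\frac{R^2}{h^2}\sin^2\alpha + 6\,\frac{R^2}{h^2}\cos^2\alpha\right) = \frac{3}{40}mh^2\dot{\varphi}^2\,(1 + 5\cos^2\alpha) . \tag{12.43}
\]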
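(b) The momentary rotation axis $\boldsymbol{\omega}$ is again the connecting line between the fixed vertex and the point of support. The relation between $\omega$ and $\dot{\varphi}$ is likewise obtained by considering the velocity of point A.
Fig. 12.13. $x', y', z'$ labels the laboratory system, $x, y, z$ the system of principal axes
We have $v_A = h\dot{\varphi} = \omega R\cos\alpha$, from which it follows that $\omega = \dot{\varphi}/\sin\alpha$. The projection of $\boldsymbol{\omega}$ onto the principal axes yields
\[
\omega_1 = \omega\sin\alpha = \dot{\varphi}, \qquad
\omega_2 = 0, \qquad
\omega_3 = \omega\cos\alpha = \dot{\varphi}\,\frac{h}{R} . \tag{12.44}
\]
Fig. 12.14. Cone rolling on the edge of its base
Hence, for the kinetic energy, it follows from (12.39) that
\[
\begin{aligned}
T &= \frac{1}{2}\,\frac{3}{20}mh^2(\tan^2\alpha + 4)\,\omega_1^2 + \frac{1}{2}\,\frac{3}{10}mh^2\tan^2\alpha\;\omega_3^2 \\
&= \frac{3}{40}mh^2\dot{\varphi}^2\left(\frac{R^2}{h^2} + 4 + 2\,\frac{R^2}{h^2}\cdot\frac{h^2}{R^2}\right) = \frac{3}{40}mh^2\dot{\varphi}^2\left(6 + \frac{R^2}{h^2}\right) .
\end{aligned} \tag{12.45}
\]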
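The closed forms (12.43) and (12.45) can be cross-checked against the general expression (12.39). The following Python sketch does this for one arbitrary set of parameters; the numerical values of $m$, $h$, $\alpha$, and $\dot{\varphi}$ are assumptions chosen only for illustration, and the function names are hypothetical.

```python
import math

def cone_principal_moments(m, h, alpha):
    """Principal moments of the cone about its vertex, (12.37) and (12.38)."""
    tan2 = math.tan(alpha) ** 2
    theta1 = 3.0 / 20.0 * m * h * h * (tan2 + 4.0)
    theta3 = 3.0 / 10.0 * m * h * h * tan2
    return theta1, theta3

def kinetic_energy(theta1, theta3, w1, w3):
    """Equation (12.39) with omega_2 = 0 and Theta_1 = Theta_2."""
    return 0.5 * theta1 * w1 ** 2 + 0.5 * theta3 * w3 ** 2

# Assumed sample values (SI units), not taken from the text.
m, h, alpha, phidot = 2.0, 0.3, math.radians(25.0), 1.7
theta1, theta3 = cone_principal_moments(m, h, alpha)
R = h * math.tan(alpha)

# Case (a): omega = phidot * h / R, see (12.40) and (12.41).
omega_a = phidot * h / R
T_a = kinetic_energy(theta1, theta3,
                     omega_a * math.sin(alpha), omega_a * math.cos(alpha))
T_a_closed = 3.0 / 40.0 * m * h * h * phidot ** 2 * (1.0 + 5.0 * math.cos(alpha) ** 2)

# Case (b): omega = phidot / sin(alpha), see (12.44).
omega_b = phidot / math.sin(alpha)
T_b = kinetic_energy(theta1, theta3,
                     omega_b * math.sin(alpha), omega_b * math.cos(alpha))
T_b_closed = 3.0 / 40.0 * m * h * h * phidot ** 2 * (6.0 + (R / h) ** 2)

print(T_a, T_a_closed)
print(T_b, T_b_closed)
```

In both cases the value obtained from the principal moments agrees with the closed form up to rounding.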
EXERCISE
12.4 Ellipsoid of Inertia of a Quadratic Disk
Problem. Determine the ellipsoid of inertia for the rotation of a quadratic disk about the origin, as described in Example 12.1. Find the moments of inertia of the disk for rotation about (a) the x-axis, (b) the y-axis, (c) the z-axis, (d) the three principal axes, and (e) the axis $\{\cos 45^\circ, \cos 45^\circ, \cos 45^\circ\}$.
Solution. The ellipsoid of inertia reads
\[
\frac{\Theta_0}{3}\varrho_x^2 - \frac{\Theta_0}{2}\varrho_x\varrho_y + \frac{\Theta_0}{3}\varrho_y^2 + \frac{2\Theta_0}{3}\varrho_z^2 = 1 . \tag{12.46}
\]
(a) For rotation about the x-axis $\mathbf{n} = \{1, 0, 0\}$, and thus $\boldsymbol{\varrho} = \{1/\sqrt{\Theta_x}, 0, 0\}$. Insertion into (12.46) yields
\[
\frac{\Theta_0}{3}\cdot\frac{1}{\Theta_x} = 1 \quad\Rightarrow\quad \Theta_x = \frac{\Theta_0}{3},
\]
as expected.
(b) Here, $\mathbf{n} = \{0, 1, 0\}$, and following the procedure in (a), we find
\[
\Theta_y = \frac{\Theta_0}{3} .
\]
(c) Here, $\mathbf{n} = \{0, 0, 1\}$, and following the procedure in (a), we find
\[
\Theta_z = \frac{2}{3}\Theta_0 .
\]
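(d) The third principal axis is identical with the z-axis, which corresponds to (c). The first two principal axes are given by
\[
\mathbf{n}_1 = \left\{\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}, 0\right\} \quad\text{and}\quad \mathbf{n}_2 = \left\{-\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}, 0\right\},
\]
respectively. Therefore,
\[
\boldsymbol{\varrho}_1 = \frac{\mathbf{n}_1}{\sqrt{\Theta_1}} = \left\{\frac{1}{\sqrt{2\Theta_1}}, \frac{1}{\sqrt{2\Theta_1}}, 0\right\}
\]
and
\[
\boldsymbol{\varrho}_2 = \frac{\mathbf{n}_2}{\sqrt{\Theta_2}} = \left\{-\frac{1}{\sqrt{2\Theta_2}}, \frac{1}{\sqrt{2\Theta_2}}, 0\right\} .
\]
Insertion into (12.46) yields
\[
\frac{\Theta_0}{3}\,\frac{1}{2\Theta_1} - \frac{\Theta_0}{2}\,\frac{1}{2\Theta_1} + \frac{\Theta_0}{3}\,\frac{1}{2\Theta_1} + 0 = 1 \quad\Rightarrow\quad \Theta_1 = \frac{\Theta_0}{12},
\]
and
\[
\frac{\Theta_0}{3}\,\frac{1}{2\Theta_2} + \frac{\Theta_0}{2}\,\frac{1}{2\Theta_2} + \frac{\Theta_0}{3}\,\frac{1}{2\Theta_2} + 0 = 1 \quad\Rightarrow\quad \Theta_2 = \frac{7}{12}\Theta_0 .
\]
These are the principal moments of inertia, as was expected.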
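(e) In this case, $\mathbf{n}$ is proportional to $\{\cos 45^\circ, \cos 45^\circ, \cos 45^\circ\}$. Thus,
\[
\mathbf{n} = \left\{\frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}\right\},
\]
and therefore,
\[
\boldsymbol{\varrho} = \frac{\mathbf{n}}{\sqrt{\Theta}} = \left\{\frac{1}{\sqrt{3\Theta}}, \frac{1}{\sqrt{3\Theta}}, \frac{1}{\sqrt{3\Theta}}\right\} .
\]
Insertion into (12.46) yields
\[
\frac{\Theta_0}{3}\,\frac{1}{3\Theta} - \frac{\Theta_0}{2}\,\frac{1}{3\Theta} + \frac{\Theta_0}{3}\,\frac{1}{3\Theta} + \frac{2\Theta_0}{3}\,\frac{1}{3\Theta} = 1,
\]
from which we find
\[
\Theta = \frac{10}{36}\Theta_0 .
\]
This problem demonstrates the simple handling and the usefulness of the ellipsoid of inertia.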
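The results (a) to (e) can be reproduced exactly with rational arithmetic. The following Python sketch evaluates $\Theta_n = \sum_{ij}\Theta_{ij} n_i n_j$ in units of $\Theta_0$; to avoid square roots, it accepts an unnormalized direction $\mathbf{d}$ and divides by $\mathbf{d}\cdot\mathbf{d}$, which is a convenience introduced here and not part of the text. The function name is hypothetical.

```python
from fractions import Fraction as F

# Tensor of inertia of the quadratic disk in units of Theta_0 (Example 12.1).
THETA = [[F(1, 3), F(-1, 4), F(0)],
         [F(-1, 4), F(1, 3), F(0)],
         [F(0), F(0), F(2, 3)]]

def moment_about_direction(theta, d):
    """Theta_n for an unnormalized direction d: (d . Theta d) / (d . d)."""
    num = sum(theta[i][j] * d[i] * d[j] for i in range(3) for j in range(3))
    den = sum(c * c for c in d)
    return num / den

results = {
    "x-axis": moment_about_direction(THETA, (1, 0, 0)),
    "y-axis": moment_about_direction(THETA, (0, 1, 0)),
    "z-axis": moment_about_direction(THETA, (0, 0, 1)),
    "n1": moment_about_direction(THETA, (1, 1, 0)),
    "n2": moment_about_direction(THETA, (-1, 1, 0)),
    "space diagonal": moment_about_direction(THETA, (1, 1, 1)),
}
for axis, value in results.items():
    print(axis, value)
```

The fraction $5/18$ obtained for the space diagonal is the reduced form of $10/36$.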
EXERCISE
12.5 Symmetry Axis as a Principal Axis
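Problem. Demonstrate that an n-fold rotational symmetry axis is a principal axis of inertia, and that in the case $n \ge 3$, the two other principal axes can be freely chosen in the plane perpendicular to the first axis.
Solution. If a body has an n-fold symmetry axis, then the tensor of inertia must be equal in two coordinate systems rotated from each other by $\varphi = 2\pi/n$:
\[
\hat{\Theta}' = \hat{\Theta} = \hat{A}\,\hat{\Theta}\,\hat{A}^{-1} .
\]
If we select the z-axis as a rotation axis, the rotation matrix reads
\[
\hat{A} = \begin{pmatrix} \cos\varphi & \sin\varphi & 0 \\ -\sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{pmatrix} .
\]
Multiplying the matrices out, one obtains the components $\Theta'_{ij}$ of the new tensor of inertia which shall coincide with $\Theta_{ij}$:
\[
\begin{aligned}
\Theta'_{11} = \Theta_{11} &= \Theta_{11}\cos^2\varphi + \Theta_{22}\sin^2\varphi + 2\Theta_{12}\sin\varphi\cos\varphi, \\
\Theta'_{22} = \Theta_{22} &= \Theta_{11}\sin^2\varphi + \Theta_{22}\cos^2\varphi - 2\Theta_{12}\sin\varphi\cos\varphi, \\
\Theta'_{12} = \Theta_{12} &= -\Theta_{11}\cos\varphi\sin\varphi + \Theta_{22}\cos\varphi\sin\varphi + \Theta_{12}(1 - 2\sin^2\varphi), \\
\Theta'_{13} = \Theta_{13} &= +\Theta_{13}\cos\varphi + \Theta_{23}\sin\varphi, \\
\Theta'_{23} = \Theta_{23} &= -\Theta_{13}\sin\varphi + \Theta_{23}\cos\varphi .
\end{aligned}
\]
The determinant of the system of the last two equations,
\[
\begin{vmatrix} \cos\varphi - 1 & \sin\varphi \\ -\sin\varphi & \cos\varphi - 1 \end{vmatrix} = 2(1 - \cos\varphi),
\]
vanishes only for $\varphi = 0, 2\pi, \ldots$. If there is symmetry ($n \ge 2$), then we must have $\Theta_{13} = \Theta_{23} = 0$, i.e., the z-axis must be a principal axis.
Two of the remaining three equations are identical, and there remains the system of equations
\[
\begin{aligned}
(\Theta_{22} - \Theta_{11})\sin^2\varphi + 2\Theta_{12}\sin\varphi\cos\varphi &= 0, \\
(\Theta_{22} - \Theta_{11})\cos\varphi\sin\varphi - 2\Theta_{12}\sin^2\varphi &= 0 .
\end{aligned}
\]
The determinant of coefficients has the value
\[
D = -2\sin^4\varphi - 2\sin^2\varphi\cos^2\varphi = -2\sin^2\varphi .
\]
There holds $D = 0$ for $\varphi = 0, \pi, 2\pi, \ldots$. Hence, $\Theta_{11} = \Theta_{22}$ and $\Theta_{12} = 0$, if $n > 2$. If the axis of rotational symmetry z is at least 3-fold, the tensor of inertia is diagonal for each orthogonal pair of axes in the x, y-plane.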
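The statement can be illustrated with a simple body that has an n-fold axis: $n$ equal point masses at the corners of a regular polygon, placed at a common height above the x, y-plane. The following Python sketch is only such an illustration; the radius, height, and angular offset are arbitrary assumptions, and the function name is hypothetical.

```python
import math

def polygon_tensor(n, radius=1.0, height=0.5, offset=0.3, mass=1.0):
    """Tensor of inertia (about the origin) of n equal masses on a regular
    n-gon around the z-axis, all at the same height z = height."""
    theta = [[0.0] * 3 for _ in range(3)]
    for k in range(n):
        ang = offset + 2.0 * math.pi * k / n
        r = (radius * math.cos(ang), radius * math.sin(ang), height)
        r2 = sum(c * c for c in r)
        for i in range(3):
            for j in range(3):
                theta[i][j] += mass * ((r2 if i == j else 0.0) - r[i] * r[j])
    return theta

t2 = polygon_tensor(2)  # 2-fold axis
t3 = polygon_tensor(3)  # 3-fold axis
t5 = polygon_tensor(5)  # 5-fold axis

# For every n >= 2 the z-axis is a principal axis (Theta_13 = Theta_23 = 0).
# Only for n >= 3 do we also get Theta_11 = Theta_22 and Theta_12 = 0.
for name, t in (("n=2", t2), ("n=3", t3), ("n=5", t5)):
    print(name, round(t[0][2], 12), round(t[1][2], 12),
          round(t[0][0] - t[1][1], 12), round(t[0][1], 12))
```

For $n = 2$ the deviation moment $\Theta_{12}$ stays finite for a generic offset angle, which corresponds to the exceptional case $\varphi = \pi$ of the determinant $D$.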
EXERCISE
12.6 Tensor of Inertia and Ellipsoid of Inertia of a System of Three Masses
Problem. A rigid body consists of three mass points that are connected to the z-axis by rigid massless bars (see Fig. 12.15).
(a) Find the elements of the tensor of inertia relative to the x, y, z-system.
(b) Calculate the ellipsoid of inertia with respect to the origin 0, and the moment of inertia of the entire body with respect to the axis 0a.
Fig. 12.15. Rigid body consisting of three mass points
Solution. (a) The elements of the tensor of inertia relative to the x,y,z-system are
\[
\Theta_{xx} = \sum_i m_i (y_i^2 + z_i^2) = m_1(y_1^2 + z_1^2) + m_2(y_2^2 + z_2^2) + m_3(y_3^2 + z_3^2),
\]
and after inserting the numerical values from Fig. 12.15, one has
\[
\Theta_{xx} = 100(144 + 25) + 200(64 + 225) + 150(144 + 196)\ (\mathrm{g\,cm^2}) = 125.7\ (\mathrm{kg\,cm^2}) .
\]
Likewise, one obtains
\[
\Theta_{yy} = 117.25\ (\mathrm{kg\,cm^2}) \quad\text{and}\quad \Theta_{zz} = 104.75\ (\mathrm{kg\,cm^2}) .
\]
For the deviation moments of the tensor of inertia, it follows that
\[
\Theta_{xy} = -\sum_i m_i (x_i y_i) = 100(12\cdot 10) - 200(10\cdot 8) + 150(11\cdot 14)\ (\mathrm{g\,cm^2}) = 19.1\ (\mathrm{kg\,cm^2}),
\]
and likewise,
\[
\Theta_{xz} = -44.8\ (\mathrm{kg\,cm^2}) \quad\text{and}\quad \Theta_{yz} = 4.800\ (\mathrm{kg\,cm^2}) .
\]
(b) From (a) one now immediately obtains for the ellipsoid of inertia with respect to the origin 0 (see (12.30))
\[
\Theta = \Theta_{xx}\cos^2\alpha + \Theta_{yy}\cos^2\beta + \Theta_{zz}\cos^2\gamma + 2\Theta_{xy}\cos\alpha\cos\beta + 2\Theta_{xz}\cos\alpha\cos\gamma + 2\Theta_{yz}\cos\beta\cos\gamma . \tag{12.47}
\]
To calculate the moment of inertia $\Theta_{0a}$, we evaluate the direction cosines with the coordinates given in Fig. 12.15,
\[
\cos\alpha = \frac{-6}{\sqrt{6^2 + 8^2 + 20^2}} = -0.268, \qquad
\cos\beta = \frac{8}{\sqrt{6^2 + 8^2 + 20^2}} = 0.358,
\]
and
\[
\cos\gamma = \frac{20}{\sqrt{6^2 + 8^2 + 20^2}} = 0.895 .
\]
By inserting into (12.47) for the moment of inertia, we obtain
\[
\begin{aligned}
\Theta_{0a} &= (0.268)^2\cdot 125.7 + (0.358)^2\cdot 117.25 + (0.895)^2\cdot 104.75 \\
&\quad - 2(0.268)(0.358)\cdot 19.1 + 2(0.268)(0.895)\cdot 44.8 + 2(0.358)(0.895)\cdot 4.800\ (\mathrm{kg\,cm^2}) \\
&= 128.87\ (\mathrm{kg\,cm^2}) .
\end{aligned}
\]
EXERCISE
12.7 Friction Forces and Acceleration of a Car
Problem. A car of mass $M$ is driven by a motor that exerts the torque $2D$ on the wheel axis. The radius of the wheels is $R$, and their moment of inertia is $\Theta = mR^2$ ($m$ is the reduced mass of the wheels).
(a) Determine the friction force $f$ which acts on each wheel and causes the acceleration of the car. The street is assumed to be planar.
(b) Calculate the acceleration of the car if the torque $2D = 10^3$ N m, $M = 2\cdot 10^3$ kg, $R = 0.5$ m and $m = 12.5$ kg.
Solution. (a) Fig. 12.16 shows one of the wheels and the force $f$ acting on it. Since the linear acceleration of the wheel center is the same as that of the center of gravity of the car, $a_s$, one has
\[
M a_s = 2f - F . \tag{12.48}
\]
The factor 2 accounts for the fact that a car in general is driven by two wheels. $F$ is a possible external force which impedes the motion (air resistance), and $a_s$ is the acceleration of the car. For the torque relative to the axis, one obtains
\[
4\Theta\dot{\omega} = 2(D - fR), \tag{12.49}
\]
where $\Theta$ is the moment of inertia of each of the four wheels, $D$ is the accelerating torque, and $-fR$ is the torque exerted by the friction force on each wheel. The moment of inertia is $\Theta = mR^2$. If the car does not glide, one has
\[
\dot{\omega}R = a_s, \tag{12.50}
\]
and with (12.49) and (12.48), it immediately follows that
\[
\dot{\omega}R = a_s = \frac{1}{2}\,\frac{DR - fR^2}{mR^2} = \frac{2f - F}{M} . \tag{12.51}
\]
Fig. 12.16. Wheel, acceleration $a_s$, and friction force $f$
Equation (12.51) finally yields for the friction force
\[
f = \frac{1}{2}\,\frac{(2D/R)M + 4mF}{M + 4m} \tag{12.52}
\]
and by neglecting the backdriving force $F$ ($F = 0$), we have
\[
f = \frac{D/R}{1 + (4m/M)} . \tag{12.53}
\]
(b) By replacing $f$ in (12.49) by (12.53) and solving for $a_s$ ($F = 0$), one finds the acceleration of the car
\[
a_s = \frac{2D/R}{M + 4m} = \frac{10^3/0.5}{2\cdot 10^3 + 4\cdot 12.5}\ \frac{\mathrm{m}}{\mathrm{s}^2} = \frac{10^3}{1025}\ \frac{\mathrm{m}}{\mathrm{s}^2} \approx 1\ \frac{\mathrm{m}}{\mathrm{s}^2} .
\]
With the numerical values from (b), the friction force $f$ is given by
\[
f \approx \frac{D}{R} = 1000\ \mathrm{N} .
\]
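The numbers of part (b) follow directly from (12.48) and (12.52). The Python sketch below evaluates them for the data of the problem; the function names are hypothetical, and the torque per driven wheel is taken as $D = 500$ N m, i.e., half of the given $2D$.

```python
def friction_force(D, R, M, m, F=0.0):
    """Friction force on each driven wheel, equation (12.52)."""
    return 0.5 * ((2.0 * D / R) * M + 4.0 * m * F) / (M + 4.0 * m)

def acceleration(D, R, M, m, F=0.0):
    """Acceleration of the car from (12.48): M a_s = 2 f - F."""
    return (2.0 * friction_force(D, R, M, m, F) - F) / M

# Data of part (b): 2D = 1000 N m, M = 2000 kg, R = 0.5 m, m = 12.5 kg.
D, R, M, m = 500.0, 0.5, 2.0e3, 12.5

f = friction_force(D, R, M, m)
a_s = acceleration(D, R, M, m)
print(f"f   = {f:.1f} N")
print(f"a_s = {a_s:.4f} m/s^2")
```

The exact values are slightly below the rounded ones quoted in the text, because $4m/M = 0.025$ is small but not zero.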
13 Theory of the Top
13.1 The Free Top
A rigid, rotating body is called a top. A top is called symmetric if two of its principal moments of inertia are equal. If $\Theta_1 = \Theta_2$, we further distinguish
(a) $\Theta_3 > \Theta_1$: oblate top or flattened top, e.g., a disk;
(b) $\Theta_3 < \Theta_1$: prolate top or cigar top, e.g., an (extended) cylinder; and
(c) $\Theta_3 = \Theta_1$: spherical top, e.g., a cube.
The third principal axis of inertia, which is related to $\Theta_3$, is called the figure axis. It specifies the spatial orientation of the top. For rotationally symmetric bodies it coincides with their symmetry axis. Hence, the center of gravity of a rotational body always lies on the figure axis. Moreover, we must distinguish between the free top and the heavy top. For the free top one assumes that no external forces act on the body, so that the torque with respect to the fixed point vanishes. On the heavy top forces act, for example gravity. One can however imagine other forces (centrifugal forces, friction forces, etc.). For an experimental realization of a free top we only have to support an arbitrary body at the center of gravity. The body is then in an indifferent equilibrium, and there is no torque acting on it.
Fig. 13.1. (i) Possible real form of the top. (ii) Ellipsoid of inertia
Fig. 13.2. Model of a free top supported at the center of gravity S: The construction is so that S is also the supporting point
For a theoretical description of the top, we start from the basic equations
\[
\mathbf{L} = \hat{\Theta}\cdot\boldsymbol{\omega} = \text{constant} \quad\text{(conservation of angular momentum)}, \tag{13.1}
\]
\[
T = \frac{1}{2}\,\boldsymbol{\omega}\cdot\mathbf{L} = \text{constant} \quad\text{(conservation of kinetic energy)}. \tag{13.2}
\]
The angular momentum $\mathbf{L}$ and the kinetic energy $T$ of the free top are constant in time. This is the content of the last two equations.
13.2 Geometrical Theory of the Top
We will first derive the laws governing the free top from geometrical considerations. The geometrical theory of the top is based on Poinsot's¹ ellipsoid (also called the energy ellipsoid):
\[
\Theta_{xx}\omega_x^2 + \Theta_{yy}\omega_y^2 + \Theta_{zz}\omega_z^2 + 2\Theta_{xy}\omega_x\omega_y + 2\Theta_{xz}\omega_x\omega_z + 2\Theta_{yz}\omega_y\omega_z = 2T = \text{constant}. \tag{13.3}
\]
This ellipsoid in the $\boldsymbol{\omega}$-space is immediately obtained from (13.2). It is similar to the ordinary ellipsoid of inertia and has the same body-fixed axes.
In the subsequent considerations, we shall utilize the property of (13.3) that the endpoint of the vector $\boldsymbol{\omega}$ lies just on the surface of the ellipsoid.
Now follows Poinsot's construction of the motion of the free top. The angular momentum vector is constant and defines an orientation in space. The straight line determined by $\mathbf{L}$ is therefore called the invariable straight line. Moreover, the kinetic energy is constant, hence $2T = \boldsymbol{\omega}\cdot\mathbf{L} = \text{constant}$; from the definition of the scalar product it immediately follows that
\[
\omega\cos(\boldsymbol{\omega}, \mathbf{L}) = \text{constant}. \tag{13.4}
\]
In other words, the projection of $\boldsymbol{\omega}$ onto $\mathbf{L}$ is constant. If one now considers $\boldsymbol{\omega}$ as the position vector for points in space, the endpoints of $\boldsymbol{\omega}(t)$ all lie in one plane, which is called the invariable plane. The invariable straight line is then perpendicular to the invariable plane.
Now one can describe the motion of the top by the rolling of the Poinsot ellipsoid on the invariable plane. This is allowed since the endpoint of $\boldsymbol{\omega}$, as is evident from (13.4), lies on the surface of the ellipsoid and moves in the invariable plane. The invariable plane is also a tangent plane of Poinsot's ellipsoid, since there is only one common vector $\boldsymbol{\omega}$, and hence the ellipsoid and the plane have a common point. To prove this, we show that at the point $\boldsymbol{\omega}$ the gradient of the ellipsoid is parallel to $\mathbf{L}$. From vector analysis we know that the gradient of the function defining a surface is perpendicular to this surface. The surface of the ellipsoid $F$ is described by (13.3).
¹ Louis Poinsot, French mathematician and physicist, b. Jan. 3, 1777, Paris, d. Dec. 5, 1859, Paris. Professor in Paris, introduced in his Eléments de statique (Paris, 1804) the concept of the couple to mechanics and used it to represent the motion of the top. Poinsot-motion means the motion of a free top.
Fig. 13.3. Invariable straight line and invariable plane
Because²
\[
\nabla_\omega F = \left\{\frac{\partial F}{\partial\omega_x}, \frac{\partial F}{\partial\omega_y}, \frac{\partial F}{\partial\omega_z}\right\},
\]
we obtain
\[
\frac{1}{2}\nabla_\omega F =
\begin{pmatrix} \Theta_{xx}\omega_x + \Theta_{xy}\omega_y + \Theta_{xz}\omega_z \\ \Theta_{xy}\omega_x + \Theta_{yy}\omega_y + \Theta_{yz}\omega_z \\ \Theta_{xz}\omega_x + \Theta_{yz}\omega_y + \Theta_{zz}\omega_z \end{pmatrix}
= \hat{\Theta}\cdot\boldsymbol{\omega} = \mathbf{L},
\]
i.e., $\nabla_\omega F$ is parallel to $\mathbf{L}$, or $F \perp \mathbf{L}$; therefore, the tangent plane of $F$ at the point $\boldsymbol{\omega}$ is parallel to the invariable plane.
Since the center of the ellipsoid is a constant distance from the invariable plane (see (13.4)), the motion of the top can be described as follows: The body-fixed Poinsot ellipsoid rolls without gliding on the invariable plane, where the center of the ellipsoid is fixed. The instantaneous value of the angular velocity is then given by the distance from the center to the contact point of the ellipsoid.
The ellipsoid rolls but does not glide. This follows from the fact that all points along the $\boldsymbol{\omega}$-axis are momentarily at rest; hence the contact point is also momentarily at rest. Rolling without gliding means that the changes of the rotation vector $\boldsymbol{\omega}$ measured from the laboratory system and from the body-fixed system are equal. Actually,
\[
\left.\frac{d\boldsymbol{\omega}}{dt}\right|_L = \left.\frac{d\boldsymbol{\omega}}{dt}\right|_K + \boldsymbol{\omega}\times\boldsymbol{\omega}, \quad\text{i.e.,}\quad \left.\frac{d\boldsymbol{\omega}}{dt}\right|_L = \left.\frac{d\boldsymbol{\omega}}{dt}\right|_K .
\]
Concerning the difference between gliding and rolling, if a wheel rolls on a plane, the velocities of change of the contact point P in the body-fixed and in the laboratory system are equal. If the wheel glides, the contact point in the body-fixed system is fixed; in the laboratory system its position changes permanently.
Fig. 13.4. On the condition of rolling
² Since the surface (13.3) is defined in the $\boldsymbol{\omega}$-space, we mean by gradient the $\boldsymbol{\omega}$-gradient, i.e., $\nabla_\omega = \{\partial/\partial\omega_x, \partial/\partial\omega_y, \partial/\partial\omega_z\}$.
The trajectory of $\boldsymbol{\omega}$ on the invariable plane is called the herpolhodie or trace trajectory, the corresponding curve on the ellipsoid is called the polhodie or pole trajectory. See Fig. 13.5.
Fig. 13.5. The Poinsot ellipsoid rolls on the invariable plane
The polhodie and the herpolhodie are in general complicated, not closed curves.
For the special case of a symmetric top, the Poinsot ellipsoid turns into a rotation ellipsoid, and by rolling of the rotation ellipsoid there arise circles. $\boldsymbol{\omega}$ has constant magnitude but permanently changes its direction, i.e., $\boldsymbol{\omega}$ rotates on a cone about the angular momentum axis. This cone is called the herpolhodie cone or trace cone. For a symmetric top it is efficient to use the symmetry axis (figure axis) as third axis for describing the motion. The figure axis, which is tightly fixed to the ellipsoid, rotates about $\mathbf{L}$ just as $\boldsymbol{\omega}$ does. The cone resulting this way is called the nutation cone. The motion of the figure axis of the top in space is called nutation. (The term precession used in the American literature makes little sense, since the term means a motion of the heavy top that is of a completely different origin.)
An observer who is in the system of the top and considers the figure axis as fixed will find that $\boldsymbol{\omega}$ and $\mathbf{L}$ rotate about this axis. For the cone arising by the rotation of $\boldsymbol{\omega}$ the term polhodie cone or pole cone is introduced. The precise orientation of the axes and cones depends essentially on the shape of the rotation ellipsoid. This is shown by the following two diagrams on the orientation of the axes. Note that a large principal moment of inertia $\Theta_3$ corresponds to a small radius of the Poinsot ellipsoid, namely, $\sqrt{2T/\Theta_3}$. The other axes of the Poinsot ellipsoid accordingly have the lengths $\sqrt{2T/\Theta_1}$ and $\sqrt{2T/\Theta_2}$, respectively.
This is immediately seen from the form of (13.3) in terms of the principal axes:
\[
\frac{\omega_1^2}{1/\Theta_1} + \frac{\omega_2^2}{1/\Theta_2} + \frac{\omega_3^2}{1/\Theta_3} = 2T = \text{constant}.
\]
Figure 13.6(a) shows the ellipsoid of a flattened (oblate) top; Fig. 13.6(b) represents a prolate top. In the first case, the axes have the sequence $\boldsymbol{\omega}$, $\mathbf{L}$, figure axis; in the second case, the sequence is $\mathbf{L}$, $\boldsymbol{\omega}$, figure axis.
The cones introduced above follow the same sequence. Figure 13.6(c) shows the case of an oblate top, and Fig. 13.6(d) that of a prolate top. We note that the three axes lie in a plane.
Fig. 13.6. (a) Oblate symmetric top. (b) Prolate symmetric top. (c) Oblate symmetric top: The pole cone rolls inside of the trace cone. (d) Prolate symmetric top: The pole cone rolls outside of the trace cone
13.3 Analytical Theory of the Free Top

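We consider the motion of the angular momentum and angular velocity vectors from a coordinate system that is tightly fixed to the top and moves with it. For the angular velocity, we have
\[
\boldsymbol{\omega} = \omega_1\mathbf{e}_1 + \omega_2\mathbf{e}_2 + \omega_3\mathbf{e}_3,
\]
where $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$ are body-fixed principal axes of the top. We now investigate the angular momentum of the top no longer in the moving coordinate system, i.e., in the system of the top that rotates with $\boldsymbol{\omega}$ in the laboratory system, but transformed into the laboratory system, using our knowledge of moving coordinate systems. We then obtain
\[
\dot{\mathbf{L}}|_{\mathrm{lab}} = \dot{\mathbf{L}}|_{\mathrm{top}} + \boldsymbol{\omega}\times\mathbf{L} .
\]
Because $\dot{\mathbf{L}}|_{\mathrm{top}} = \hat{\Theta}\cdot\dot{\boldsymbol{\omega}}$, for the component in the laboratory system we have
\[
\dot{\mathbf{L}}|_{\mathrm{lab}} = \Theta_1\dot{\omega}_1\mathbf{e}_1 + \Theta_2\dot{\omega}_2\mathbf{e}_2 + \Theta_3\dot{\omega}_3\mathbf{e}_3 +
\begin{vmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ \omega_1 & \omega_2 & \omega_3 \\ \Theta_1\omega_1 & \Theta_2\omega_2 & \Theta_3\omega_3 \end{vmatrix} .
\]
Solved for the components $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$ and combined, this reads
\[
\begin{aligned}
\dot{\mathbf{L}}|_{\mathrm{lab}} &= (\Theta_1\dot{\omega}_1 + \Theta_3\omega_2\omega_3 - \Theta_2\omega_2\omega_3)\mathbf{e}_1 \\
&\quad + (\Theta_2\dot{\omega}_2 + \Theta_1\omega_1\omega_3 - \Theta_3\omega_1\omega_3)\mathbf{e}_2 \\
&\quad + (\Theta_3\dot{\omega}_3 + \Theta_2\omega_1\omega_2 - \Theta_1\omega_1\omega_2)\mathbf{e}_3 .
\end{aligned}
\]
Since the laboratory system is an inertial frame of reference, we have the relation
\[
\dot{\mathbf{L}} = \mathbf{D} .
\]
The torque is again expressed by the body-fixed coordinates, and we obtain
\[
\dot{\mathbf{L}}|_{\mathrm{lab}} = D_1\mathbf{e}_1 + D_2\mathbf{e}_2 + D_3\mathbf{e}_3 .
\]
Thus, we find the Euler equations:
\[
\begin{aligned}
D_1 &= \Theta_1\dot{\omega}_1 + (\Theta_3 - \Theta_2)\,\omega_2\omega_3, \\
D_2 &= \Theta_2\dot{\omega}_2 + (\Theta_1 - \Theta_3)\,\omega_1\omega_3, \\
D_3 &= \Theta_3\dot{\omega}_3 + (\Theta_2 - \Theta_1)\,\omega_1\omega_2 .
\end{aligned} \tag{13.5}
\]
These three coupled differential equations for $\omega_1(t)$, $\omega_2(t)$, and $\omega_3(t)$ are not linear. This suggests that in general the solutions $\omega_i(t)$ are rather complicated functions of time. Only in the case of free motion ($\mathbf{D} = 0$) can one obtain a transparent solution which will be discussed now. Later we shall deal with the heavy top for which $\mathbf{D} \neq 0$.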
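We choose the body-fixed coordinate system so that the $\mathbf{e}_3$-axis corresponds to the figure axis. Since we will restrict the analytical consideration of the theory of the top to a free symmetric top that shall be symmetric about the figure axis, the following conditions hold:
\[
\dot{\mathbf{L}}|_{\mathrm{lab}} = \mathbf{D} = 0, \quad\text{i.e.,}\quad D_1 = D_2 = D_3 = 0, \quad\text{and}\quad \Theta_1 = \Theta_2 .
\]
We show that for a symmetric top $\mathbf{e}_3$, $\boldsymbol{\omega}$, and $\mathbf{L}$ lie in a plane. For this we have to calculate the scalar triple product of the three vectors which must vanish:
\[
\mathbf{e}_3\cdot(\boldsymbol{\omega}\times\mathbf{L}) = \mathbf{e}_3\cdot
\begin{vmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ \omega_1 & \omega_2 & \omega_3 \\ \Theta_1\omega_1 & \Theta_2\omega_2 & \Theta_3\omega_3 \end{vmatrix}
= (\Theta_2 - \Theta_1)\,\omega_1\omega_2 = 0,
\]
because $\Theta_1 = \Theta_2$.
With the conditions for the free symmetric top, the Euler equations read
\[
\begin{aligned}
\Theta_3\dot{\omega}_3 &= 0 \quad\Rightarrow\quad \omega_3 = \text{constant}, \\
\Theta_1\dot{\omega}_1 + (\Theta_3 - \Theta_1)\,\omega_2\omega_3 &= 0, \\
\Theta_1\dot{\omega}_2 + (\Theta_1 - \Theta_3)\,\omega_1\omega_3 &= 0 .
\end{aligned}
\]
Thus, the component of $\boldsymbol{\omega}$ along the figure axis is constant. To show this in the subsequent calculation, we set
\[
\omega_3 = u .
\]
To solve the two differential equations, we differentiate the second equation with respect to time:
\[
\Theta_1\ddot{\omega}_1 + (\Theta_3 - \Theta_1)\,u\,\dot{\omega}_2 = 0, \qquad \Theta_1\dot{\omega}_2 + (\Theta_1 - \Theta_3)\,u\,\omega_1 = 0 .
\]
By solving the last equation for $\dot{\omega}_2$ and inserting into the first one, we obtain
\[
\ddot{\omega}_1 + \frac{(\Theta_3 - \Theta_1)^2}{\Theta_1^2}\,u^2\,\omega_1 = 0 .
\]
This form of the differential equation is already known: setting
\[
\frac{|\Theta_3 - \Theta_1|}{\Theta_1}\,u = k,
\]
we see that $\ddot{\omega}_1 + k^2\omega_1 = 0$ is exactly the differential equation of the harmonic oscillator, which is solved by
\[
\omega_1 = B\sin kt + C\cos kt .
\]
Considering the initial condition $\omega_1(t = 0) = 0$, it follows that $\omega_1 = B\sin kt$, or from the second equation, $\omega_2 = -B\cos kt$.
The result means that $\boldsymbol{\omega}$ moves on a circle about the figure axis, as seen from the system of the top:
\[
\boldsymbol{\omega} = B(\sin kt\,\mathbf{e}_1 - \cos kt\,\mathbf{e}_2) + u\,\mathbf{e}_3 .
\]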
The rotational frequency is thereby given by $k$; for $k > 0$ the rotation proceeds in the mathematically positive sense. The cone arising in the rotation is again called the pole cone. The angular momentum, which is given by $\mathbf{L} = \hat{\Theta}\cdot\boldsymbol{\omega}$, also changes with time:
\[
\mathbf{L} = \Theta_1 B\sin kt\,\mathbf{e}_1 - \Theta_1 B\cos kt\,\mathbf{e}_2 + \Theta_3 u\,\mathbf{e}_3,
\]
i.e., the $\mathbf{L}$-axis rotates with the same frequency $k$ but with a different amplitude about the figure axis (nutation). This is no contradiction to the statement $\mathbf{L}|_{\mathrm{lab}} = \text{constant}$, since we measure the angular momentum from the system of the top.
Fig. 13.7. Motion of $\boldsymbol{\omega}(t)$ in the $\mathbf{e}_1, \mathbf{e}_2$-plane
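The analytical solution can be compared with a direct numerical integration of the torque-free Euler equations (13.5). The following Python sketch uses a classical fourth-order Runge-Kutta step; the moments of inertia, the amplitude $B$, and the spin $u$ are arbitrary assumed values for an oblate top ($\Theta_3 > \Theta_1$, so that $k > 0$ without the absolute value), and the function names are hypothetical.

```python
import math

def euler_rhs(w, theta):
    """Time derivatives of (w1, w2, w3) from (13.5) with D = 0."""
    t1, t2, t3 = theta
    w1, w2, w3 = w
    return ((t2 - t3) * w2 * w3 / t1,
            (t3 - t1) * w3 * w1 / t2,
            (t1 - t2) * w1 * w2 / t3)

def rk4_step(w, theta, dt):
    """One classical Runge-Kutta step of size dt."""
    def shifted(base, slope, scale):
        return tuple(x + scale * s for x, s in zip(base, slope))
    k1 = euler_rhs(w, theta)
    k2 = euler_rhs(shifted(w, k1, dt / 2), theta)
    k3 = euler_rhs(shifted(w, k2, dt / 2), theta)
    k4 = euler_rhs(shifted(w, k3, dt), theta)
    return tuple(x + dt / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(w, k1, k2, k3, k4))

# Assumed sample values for a symmetric (oblate) top.
theta = (1.0, 1.0, 1.5)
B, u = 0.4, 2.0
k = (theta[2] - theta[0]) * u / theta[0]

# Initial condition of the text: w1(0) = 0, w2(0) = -B, w3 = u.
w = (0.0, -B, u)
dt, n_steps = 1.0e-3, 5000
for _ in range(n_steps):
    w = rk4_step(w, theta, dt)

t = n_steps * dt
w_exact = (B * math.sin(k * t), -B * math.cos(k * t), u)
max_error = max(abs(x - y) for x, y in zip(w, w_exact))
print(w, w_exact, max_error)
```

The integrated components stay on the circle $\omega_1^2 + \omega_2^2 = B^2$ and follow $B\sin kt$ and $-B\cos kt$ to within the accuracy of the integrator, while $\omega_3 = u$ remains constant.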
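Finally, we determine the angles between the three axes. We set
\[
\angle(\mathbf{e}_3, \mathbf{L}) = \alpha, \qquad \angle(\mathbf{e}_3, \boldsymbol{\omega}) = \beta,
\]
and scalar multiply $\mathbf{e}_3$ and $\mathbf{L}$; this yields
\[
\mathbf{e}_3\cdot\mathbf{L} = L\cos\alpha = \sqrt{(\Theta_1 B)^2 + (\Theta_3 u)^2}\,\cos\alpha
\]
or
\[
\mathbf{e}_3\cdot\mathbf{L} = \mathbf{e}_3\cdot(\Theta_1\omega_1\mathbf{e}_1 + \Theta_2\omega_2\mathbf{e}_2 + \Theta_3\omega_3\mathbf{e}_3) = \Theta_3\omega_3 = \Theta_3 u .
\]
Equating both equations leads to
\[
\cos\alpha = \frac{\Theta_3 u}{\sqrt{(\Theta_1 B)^2 + (\Theta_3 u)^2}} = \frac{1}{\sqrt{(\Theta_1 B/\Theta_3 u)^2 + 1}}
\]
or
\[
\cos\alpha\,\sqrt{\left(\frac{\Theta_1 B}{\Theta_3 u}\right)^2 + 1} = 1 .
\]
Comparison of the coefficients with the trigonometric formula
\[
\cos x\,\sqrt{\tan^2 x + 1} = 1
\]
yields $\tan\alpha = \Theta_1 B/\Theta_3 u = \text{constant}$.
Performing the same calculation for $\mathbf{e}_3\cdot\boldsymbol{\omega}$, for $\beta$ we find
\[
\tan\beta = \frac{B}{u} = \text{constant}.
\]
The comparison of the last two results shows the dependence of the orientation of the axes on $\Theta_1$ and $\Theta_3$: One has
\[
\tan\alpha/\tan\beta = \Theta_1/\Theta_3,
\]
from which it follows that
(1) $\Theta_1 > \Theta_3$ (prolate top) $\Rightarrow$ $\alpha > \beta$ for $\alpha, \beta < \pi/2$; sequence of axes: $\mathbf{e}_3 - \boldsymbol{\omega} - \mathbf{L}$;
(2) $\Theta_1 < \Theta_3$ (oblate top) $\Rightarrow$ $\alpha < \beta$ for $\alpha, \beta < \pi/2$; sequence of axes: $\mathbf{e}_3 - \mathbf{L} - \boldsymbol{\omega}$;
(3) $\Theta_1 = \Theta_3$ (spherical top) $\Rightarrow$ $\alpha = \beta$ for $\alpha, \beta < \pi$; $\boldsymbol{\omega}$ lies on the $\mathbf{L}$-axis. Since the $\mathbf{e}_3$-axis of a spherical top can be chosen arbitrarily, there is no loss of generality if we set $\alpha = \beta = 0$.
For (3) we note that, for the spherical top, $k = u(\Theta_3 - \Theta_1)/\Theta_1 = 0$ because $\Theta_1 = \Theta_3$. For the spherical top, as was discussed above, we can set the figure axis (i.e., the $\mathbf{e}_3$-axis) arbitrarily, e.g., also along the $\mathbf{L}$- or $\boldsymbol{\omega}$-axis. The result $\alpha = \beta$ would follow also from
\[
\boldsymbol{\omega}\times\mathbf{L} = \boldsymbol{\omega}\times\Theta_1\boldsymbol{\omega} = 0 \quad\text{because}\quad
\hat{\Theta} = \begin{pmatrix} \Theta_1 & 0 & 0 \\ 0 & \Theta_1 & 0 \\ 0 & 0 & \Theta_1 \end{pmatrix} .
\]
Fig. 13.8.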
EXAMPLE
13.1 Nutation of the Earth
The earth is not a spherical top but a flattened rotation ellipsoid. The half-axes are $a = b = 6378$ km (equator) and $c = 6357$ km.
If the angular momentum axis and the figure axis do not coincide, the figure axis performs nutations about the angular momentum axis. The angular velocity of the nutations is
\[
k = \frac{\Theta_3 - \Theta_1}{\Theta_1}\,\omega_3 .
\]
The third axis is the principal axis of inertia (pole axis). If we consider the earth as a homogeneous ellipsoid of mass $M$, we obtain the two moments of inertia:
\[
\Theta_1 = \Theta_2 = \frac{M}{5}(b^2 + c^2), \qquad \Theta_3 = \frac{M}{5}(a^2 + b^2) .
\]
From this, we obtain
\[
k = \frac{a^2 - c^2}{b^2 + c^2}\,\omega_3 .
\]
Since the half-axes differ only a little, we set $a = b \approx c$, and thus,
\[
k = \frac{(a - c)(a + c)}{b^2 + c^2}\,\omega_3 \approx \frac{a - c}{a}\,\omega_3 .
\]
The rotation velocity of the earth is $\omega_3 = 2\pi/\text{day}$. Thus, we obtain for the period of nutation
\[
T = \frac{2\pi}{k} = 304\ \text{days}.
\]
The figure axis of the earth (geometrical north pole) and the rotation axis $\boldsymbol{\omega}$ of the earth (kinematical north pole) rotate about each other. The measured period (the so-called Chandler period)³ is 433 days. The deviation is essentially caused by the fact that the earth is not rigid. The amplitude of this nutation is about $\pm 0.2''$. The kinematical north pole moves along a spiral trajectory within a circle of 10 m radius in the sense of the earth's rotation.
³ Seth Carlo Chandler, b. Sept. 17, 1846, Boston, Mass., d. Dec. 31, 1913, Wellesley Hills, Mass. American astronomer, detected the Chandler period of 14 months in the pole height fluctuations. He observed variable stars, and for a long time he edited the Astronomical Journal.
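The period quoted in this example follows from a two-line calculation. The Python sketch below evaluates both the approximate ratio $(a - c)/a$ and the ratio $(a^2 - c^2)/(b^2 + c^2)$ with $b = a$, for the half-axes given above; the variable names are chosen here for illustration. Since $\omega_3 = 2\pi/\text{day}$, the period in days is simply the reciprocal of $k/\omega_3$.

```python
# Half-axes of the earth in km, as given in the example (b = a).
a = 6378.0
c = 6357.0

# k / omega_3 for the homogeneous ellipsoid and in the approximation a = b ~ c.
ratio_ellipsoid = (a * a - c * c) / (a * a + c * c)
ratio_approx = (a - c) / a

# With omega_3 = 2 pi / day, the period T = 2 pi / k in days is 1 / (k / omega_3).
period_ellipsoid_days = 1.0 / ratio_ellipsoid
period_approx_days = 1.0 / ratio_approx

print(f"T (homogeneous ellipsoid) = {period_ellipsoid_days:.1f} days")
print(f"T (approximation)         = {period_approx_days:.1f} days")
```

Both values round to roughly 303 to 304 days, so the approximation $a = b \approx c$ changes the period by less than one day.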
EXAMPLE
13.2 Ellipsoid of Inertia of a Regular Polyhedron
The ellipsoid of inertia of any regular polyhedron is a sphere, which will be shown by the example of the tetrahedron; the reasoning for the octahedron, dodecahedron, and icosahedron is analogous. Suppose there were a principal axis of inertia with a moment which differs from those of the two other principal axes of inertia. In a rotation by $120^\circ$ about the axis g (perpendicular from point C on the opposite plane; see Fig. 13.9), this axis of inertia must turn into itself, since the tetrahedron is transferred into itself. It is easily seen that only the axis g has this property and therefore must be the distinguished axis of inertia. But since h is a symmetry axis too, a rotation by $120^\circ$ about h must also transfer the axis of inertia g into itself, which however is not true. Assuming the existence of a distinct axis of inertia leads to a contradiction, and hence the ellipsoid of inertia of a tetrahedron must be a sphere.
Fig. 13.9. Regular tetrahedron: g and h are straight lines (axes) that are perpendicular to the planes opposite C and D
EXERCISE
13.3 Rotating Ellipsoid
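Problem. A homogeneous three-axial ellipsoid with the moments of inertia $\Theta_1$, $\Theta_2$, $\Theta_3$ rotates with the angular velocity $\dot{\varphi}$ about the principal axis of inertia 3. The axis 3 rotates with $\dot{\vartheta}$ about the axis AB. The axis AB passes through the center of gravity and is perpendicular to 3. Find the kinetic energy.
Fig. 13.10. Homogeneous three-axial ellipsoid: (a) side view, and (b) top view
Solution. We decompose the angular velocity $\boldsymbol{\omega}$ into its components along the principal axes of inertia:
\[
\boldsymbol{\omega} = (\omega_1, \omega_2, \omega_3), \quad\text{where}\quad \omega_1 = \dot{\vartheta}\cos\varphi, \quad \omega_2 = -\dot{\vartheta}\sin\varphi, \quad \omega_3 = \dot{\varphi} .
\]
The kinetic energy is then
\[
T = \frac{1}{2}\sum_i \Theta_i\omega_i^2 = \frac{1}{2}\left(\Theta_1\cos^2\varphi + \Theta_2\sin^2\varphi\right)\dot{\vartheta}^2 + \frac{1}{2}\Theta_3\dot{\varphi}^2 .
\]
The ellipsoid shall now be symmetric, $\Theta_1 = \Theta_2$; the axis AB is tilted from the third axis by the angle $\alpha$. For the total angular velocity, we have
\[
\boldsymbol{\omega} = \dot{\varphi}\,\mathbf{e}_3 + \dot{\vartheta}\,\mathbf{e}_{AB} .
\]
Fig. 13.11. The axis AB tilted from the third axis by $\alpha$: Positions of the axes are shown in perspective
We decompose the unit vector $\mathbf{e}_{AB}$ along the axis AB with respect to the principal axes
\[
\mathbf{e}_{AB} = \mathbf{e}_3\cos\alpha + (\cos\varphi\,\mathbf{e}_1 - \sin\varphi\,\mathbf{e}_2)\sin\alpha .
\]
Thus, the components of $\boldsymbol{\omega}$ along the directions of the principal axes are
\[
\omega_1 = \sin\alpha\cos\varphi\;\dot{\vartheta}, \qquad
\omega_2 = -\sin\alpha\sin\varphi\;\dot{\vartheta}, \qquad
\omega_3 = \dot{\varphi} + \cos\alpha\;\dot{\vartheta} .
\]
Hence, the kinetic energy reads
\[
T = \frac{1}{2}\Theta_1\sin^2\alpha\;\dot{\vartheta}^2 + \frac{1}{2}\Theta_3(\dot{\varphi} + \dot{\vartheta}\cos\alpha)^2 .
\]
For $\alpha = 90^\circ$, we recover the result of the first case, specialized to a rotation ellipsoid ($\Theta_1 = \Theta_2$).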
EXERCISE
13.4 Torque of a Rotating Plate
Problem. Find the torque that is needed to rotate a rectangular plate (edges a and b)
with constant angular velocity ω about a diagonal.
Fig. 13.12. The rectangular
plate rotates about the diago-
nal axis
Solution. The principal moments of inertia of the rectangle are already known from
Example 11.7:
1 1 1
I1 = Ma 2 , I2 = Mb2 , I3 = M(a 2 + b2 ). (13.6)
12 12 12
Exercise 13.4 The angular velocity is
ω = (ω · ex )ex + (ω · ey )ey ,
i.e.,
ωb ωa
ω = −√ ex + √ ey
a2 + b2 a 2 + b2
−ωb +ωa
⇒ ω1 = √ , ω2 = √ , ω3 = 0. (13.7)
a +b
2 2 a 2 + b2
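Inserting (13.6) and (13.7) into the Euler equations yields
\[ I_1\dot\omega_1 + (I_3 - I_2)\,\omega_2\omega_3 = D_1, \]
\[ I_2\dot\omega_2 + (I_1 - I_3)\,\omega_3\omega_1 = D_2, \]
\[ I_3\dot\omega_3 + (I_2 - I_1)\,\omega_2\omega_1 = D_3, \]
and furthermore, D1 = 0, D2 = 0, and
\[ D_3 = \frac{-M(b^2 - a^2)\,ab\,\omega^2}{12\,(a^2 + b^2)} . \]
Hence, the torque is
\[ \mathbf{D} = \frac{-M(b^2 - a^2)\,ab\,\omega^2}{12\,(a^2 + b^2)}\,\mathbf{e}_z . \]
For a = b (square), D = 0!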
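As a numerical cross-check of this result, the Euler equations can be evaluated directly for constant ω. This is a small sketch under the conventions of (13.6) and (13.7); the function name `plate_torque` and the test values are assumptions chosen for illustration.

```python
import math

def plate_torque(M, a, b, omega):
    """Body-frame torque (D1, D2, D3) needed to spin a rectangular plate
    (edges a, b, mass M) at constant omega about its diagonal."""
    I1 = M * a ** 2 / 12.0
    I2 = M * b ** 2 / 12.0
    I3 = M * (a ** 2 + b ** 2) / 12.0
    diag = math.hypot(a, b)
    w1, w2, w3 = -omega * b / diag, omega * a / diag, 0.0
    # Euler equations with all time derivatives of omega set to zero
    D1 = (I3 - I2) * w2 * w3
    D2 = (I1 - I3) * w3 * w1
    D3 = (I2 - I1) * w2 * w1
    return D1, D2, D3
```

Calling `plate_torque(2.0, 1.0, 3.0, 4.0)` gives a nonzero third component only, and `plate_torque(M, a, a, omega)` returns zero torque for a square plate, in line with the remark above.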
EXERCISE
13.5 Rotation of a Vibrating Neutron Star
Problem. The surface of a neutron star (sphere) vibrates slowly, so that the principal moments of inertia are harmonic functions of time:
\[ I_{zz} = \frac{2}{5}mr^2\,(1 + \varepsilon\cos\omega t), \]
\[ I_{xx} = I_{yy} = \frac{2}{5}mr^2\left(1 - \varepsilon\,\frac{\cos\omega t}{2}\right), \qquad \varepsilon \ll 1. \]
The star simultaneously rotates with the angular velocity Ω(t).

(a) Show that the z-component of Ω remains nearly constant!
(b) Show that Ω(t) nutates about the z-axis and determine the nutation frequency for Ωz ≫ ω.
Solution. (a) If the total angular momentum is given in an inertial system, then
\[ \left.\frac{d\mathbf{L}}{dt}\right|_{\text{inertial}} = 0. \]
The principal moments of inertia are, however, given in a body-fixed system that itself rotates with the angular velocity Ω with respect to the inertial system. Then
\[ \left.\frac{d\mathbf{L}}{dt}\right|_{\text{inertial}} = \left.\frac{d\mathbf{L}}{dt}\right|_{k} + \boldsymbol{\Omega}\times\mathbf{L} = 0. \]
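One therefore gets in the body-fixed system (Euler equations)
\[ \frac{d}{dt}(I_{zz}\Omega_z) = 0, \tag{13.8} \]
\[ \frac{d}{dt}(I_{xx}\Omega_x) + \frac{3}{2}I_0\,\Omega_y\Omega_z\,\varepsilon\cos\omega t = 0, \tag{13.9} \]
\[ \frac{d}{dt}(I_{yy}\Omega_y) - \frac{3}{2}I_0\,\Omega_x\Omega_z\,\varepsilon\cos\omega t = 0, \tag{13.10} \]
where I0 = (2/5)mr² is the moment of inertia of the sphere. (13.8) has the solution
\[ \Omega_z = \frac{\Omega_{0z}}{1 + \varepsilon\cos\omega t}, \]
where Ω0z follows from the initial conditions; this means that Ωz is only very weakly time dependent.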
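(b) We suppose that ω ≪ Ωz, i.e.,
\[ \frac{dI_{xx}}{dt} \approx 0 \quad\text{and}\quad \frac{dI_{yy}}{dt} \approx 0. \]
From this, we find
\[ I_{xx}\dot\Omega_x + \frac{3}{2}I_0\,\Omega_z\,\varepsilon\cos\omega t\;\Omega_y = 0, \qquad
   I_{yy}\dot\Omega_y - \frac{3}{2}I_0\,\Omega_z\,\varepsilon\cos\omega t\;\Omega_x = 0. \tag{13.11} \]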
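Differentiating again and inserting (13.8), (13.9), and (13.10) yield
\[ I_{xx}\ddot\Omega_x + \frac{1}{I_{yy}}\left(\frac{3}{2}I_0\,\Omega_z\,\varepsilon\cos\omega t\right)^2\Omega_x = 0, \qquad
   I_{yy}\ddot\Omega_y + \frac{1}{I_{xx}}\left(\frac{3}{2}I_0\,\Omega_z\,\varepsilon\cos\omega t\right)^2\Omega_y = 0. \tag{13.12} \]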
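If Ixx = Iyy ≈ I0, then
\[ \ddot\Omega_x + \left(\frac{3}{2}\,\varepsilon\,\Omega_z\cos\omega t\right)^2\Omega_x = 0, \qquad
   \ddot\Omega_y + \left(\frac{3}{2}\,\varepsilon\,\Omega_z\cos\omega t\right)^2\Omega_y = 0. \]
Since ω ≪ Ωz (we further assume that ω ≪ εΩz), we find
\[ \omega_n = \frac{3}{2}\,\varepsilon\,\Omega_z\cos\omega t \quad\text{(nutation frequency)}, \]
i.e., Ωx and Ωy perform a nutation motion with ωn.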
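To get a feeling for the two results of this exercise, the solution of (13.8) and the nutation frequency can be evaluated numerically. The sketch below only restates the formulas derived above; the function names and all numerical values in the usage note are hypothetical.

```python
import math

def omega_z(t, omega_0z, eps, omega):
    """Solution of (13.8): Omega_z(t) = Omega_0z / (1 + eps*cos(omega*t))."""
    return omega_0z / (1.0 + eps * math.cos(omega * t))

def nutation_frequency(t, omega_0z, eps, omega):
    """omega_n = (3/2) * eps * Omega_z * cos(omega*t), valid for
    omega << eps*Omega_z so that cos(omega*t) varies slowly."""
    return 1.5 * eps * omega_z(t, omega_0z, eps, omega) * math.cos(omega * t)
```

With, say, `eps = 1e-3`, the relative variation of `omega_z` over a vibration period stays of order `eps`, which is the sense in which the z-component of Ω is nearly constant.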
EXERCISE
13.6 Pivot Forces of a Rotating Circular Disk
Problem. A homogeneous circular disk (mass M, radius R) rotates with constant

angular velocity ω about a body-fixed axis passing through the center. The axis is
inclined by the angle α from the surface normal and is pivoted at both sides of the disk
center with spacing d. Determine the forces acting on the pivots.
Solution. The Euler equations read
I1 ω̇1 − ω2 ω3 (I2 − I3 ) = D1 , (13.13)
I2 ω̇2 − ω1 ω3 (I3 − I1 ) = D2 , (13.14)
I3 ω̇3 − ω1 ω2 (I1 − I2 ) = D3 , (13.15)
where D = {D1, D2, D3} represents the torque in the body-fixed system. We choose the body-fixed coordinate system in such a way that n = e3 and e1 lies in the plane spanned by n, ω. For the principal moment of inertia I1, we have
\[ I_1 = \sigma\int_0^{2\pi}\!\!\int_0^R y^2\, r\,dr\,d\varphi
       = \sigma\int_0^{2\pi}\!\!\int_0^R r^3\sin^2\varphi\,dr\,d\varphi
       = \frac{1}{4}\sigma R^4\int_0^{2\pi}\sin^2\varphi\,d\varphi
       = \frac{1}{4}\sigma R^4\pi = \frac{1}{4}\frac{M}{\pi R^2}R^4\pi = \frac{1}{4}MR^2, \tag{13.16} \]
since the surface density σ is given by σ = M/F = M/πR². Likewise for I2 and I3:
\[ I_1 = I_2 = \frac{1}{2}I_3 = \frac{1}{4}MR^2. \tag{13.17} \]
The components of the angular velocity vector are given by
ω = {ω1 = ω sin α, ω2 = 0, ω3 = ω cos α}. (13.18)
Because ω̇ = 0, inserting (13.17) and (13.18) into (13.13) to (13.15) yields
\[ D_1 = D_3 = 0 \quad\text{and}\quad D_2 = -\omega^2\sin\alpha\cos\alpha\;\frac{1}{4}MR^2. \tag{13.19} \]
Because D = r × F, equal but oppositely directed forces act at the pivots, with magnitude
\[ |\mathbf{F}| = \frac{|D_2|}{2d} = \frac{1}{4d}\,\frac{1}{4}MR^2\omega^2\sin 2\alpha = MR^2\omega^2\,\frac{\sin 2\alpha}{16d} \tag{13.20} \]
(see Fig. 13.13).
Fig. 13.13. Geometry and pivoting of the rotating circular disk
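The pivot force can be evaluated step by step from the Euler equations and compared with the closed form (13.20). This is a minimal sketch assuming 0 < α < π/2 and pivots at distance d on either side of the disk center; the function name is an illustrative choice.

```python
import math

def pivot_force(M, R, omega, alpha, d):
    """Magnitude of the force on each pivot of a disk (mass M, radius R)
    spinning at constant omega about an axis tilted by alpha from the normal."""
    I1 = 0.25 * M * R ** 2          # moment about an in-plane axis
    I3 = 0.5 * M * R ** 2           # moment about the surface normal
    w1 = omega * math.sin(alpha)
    w3 = omega * math.cos(alpha)
    D2 = -w1 * w3 * (I3 - I1)       # Euler equation (13.14) with omega_dot = 0
    return abs(D2) / (2.0 * d)
```

The result should coincide with `M * R**2 * omega**2 * math.sin(2*alpha) / (16*d)` and vanishes for α = 0, where the rotation is about a principal axis.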
EXERCISE
13.7 Torque on an Elliptic Disk
Problem. What torque is needed to rotate an elliptic disk with the half-axes a and
b about the rotation axis 0A with constant angular velocity ω0 ? The rotation axis is
tilted from the large half-axis a by the angle α.
Fig. 13.14.
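Solution. We choose the e1-axis orthogonal to the plane of the drawing, e2 along the small half-axis b, and e3 along the large half-axis. The principal moments of inertia are then (M = σπab, dM = σ dF)
\[ I_2 = \sigma\int_{-a}^{+a} z^2\,dz\int_{-\varphi(z)}^{\varphi(z)} dy \quad\text{with}\quad \varphi(z) = b\sqrt{1 - \frac{z^2}{a^2}} \]
from the ellipse equation z²/a² + y²/b² = 1.
\[ I_2 = \sigma\int_{-a}^{+a} z^2\,\Big[y\Big]_{-b\sqrt{1-z^2/a^2}}^{\,b\sqrt{1-z^2/a^2}}\,dz
       = 2\sigma b\int_{-a}^{+a} z^2\sqrt{1 - \frac{z^2}{a^2}}\,dz \]
\[ \phantom{I_2} = 2\sigma\frac{b}{a}\left[\frac{z}{8}\,(2z^2 - a^2)\sqrt{a^2 - z^2} + \frac{a^4}{8}\arcsin\frac{z}{|a|}\right]_{-a}^{+a}
       = \frac{1}{4}\sigma b a^3\pi = \frac{1}{4}Ma^2; \tag{13.21} \]
accordingly,
\[ I_3 = \sigma\int_{-b}^{+b} y^2\,dy\int_{-\varphi(y)}^{\varphi(y)} dz \quad\text{with}\quad \varphi(y) = a\sqrt{1 - \frac{y^2}{b^2}}
   \;\Rightarrow\; I_3 = \frac{1}{4}Mb^2. \tag{13.22} \]
We can immediately write down I1 (because I1 = I2 + I3 for thin plates):
\[ I_1 = \frac{1}{4}M(a^2 + b^2). \tag{13.23} \]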
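For ω, we obtain
\[ \boldsymbol{\omega} = 0\cdot\mathbf{e}_1 - \omega_0\sin\alpha\cdot\mathbf{e}_2 + \omega_0\cos\alpha\cdot\mathbf{e}_3 . \]
We insert into the Euler equations of the top:
\[ D_1 = I_1\dot\omega_1 + (I_3 - I_2)\,\omega_2\omega_3 = -\frac{1}{4}M(b^2 - a^2)\sin\alpha\cos\alpha\cdot\omega_0^2, \]
\[ D_2 = I_2\dot\omega_2 + (I_1 - I_3)\,\omega_1\omega_3 = 0, \tag{13.24} \]
\[ D_3 = I_3\dot\omega_3 + (I_2 - I_1)\,\omega_1\omega_2 = 0. \]
Thus, we obtain for the desired torque D
\[ \mathbf{D} = -\omega_0^2\cdot\frac{M}{8}\,(b^2 - a^2)\sin 2\alpha\cdot\mathbf{e}_1 . \tag{13.25} \]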
It is obvious that
(1) for α = 0, π/2, π, . . . , the torque vanishes, since the rotation is performed about
a principal axis of inertia; and
(2) for b2 = a 2 , i.e., the case of a circular disk, the torque vanishes for all angles α.
We will consider once again the last conclusions: Given an elliptic disk with half-axes
a and b, for α = 0◦ , 180◦ , or for α = 90◦ , 270◦ , the rotation axis coincides with one of
the principal axes of inertia along the half-axes. In this case the orientation of the angu-
lar momentum is identical with the momentary rotation axis. Because ω0 = constant,
we also have L = constant, and therefore the resulting torque vanishes. This also re-
sults by insertion into the Euler equations of the top: ω̇ = (0, 0, 0), ω = (0, 0, ω0 ) or
ω = (0, ω0 , 0)
⇒ D1 = I1 ω̇1 + (I3 − I2 )ω2 ω3 = 0,

D2 = I2 ω̇2 + (I1 − I3 )ω1 ω3 = 0, (13.26)
D3 = I3 ω̇3 + (I2 − I1 )ω1 ω2 = 0.
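Both ingredients of this exercise, the moment of inertia (13.21) and the torque (13.25), lend themselves to a quick numerical check. The sketch below uses a simple midpoint rule for the integral; the function names, the number of integration steps, and the sample values are assumptions made for illustration.

```python
import math

def ellipse_I2_numeric(M, a, b, n=20000):
    """Midpoint-rule estimate of I2 = sigma * integral of z^2 over the ellipse."""
    sigma = M / (math.pi * a * b)
    h = 2.0 * a / n
    total = 0.0
    for k in range(n):
        z = -a + (k + 0.5) * h
        strip = 2.0 * b * math.sqrt(max(0.0, 1.0 - z * z / (a * a)))  # extent in y
        total += z * z * strip * h
    return sigma * total

def elliptic_disk_torque(M, a, b, omega0, alpha):
    """D1 from the Euler equations for constant omega0 about the axis 0A."""
    I2 = 0.25 * M * a ** 2
    I3 = 0.25 * M * b ** 2
    w2 = -omega0 * math.sin(alpha)
    w3 = omega0 * math.cos(alpha)
    return (I3 - I2) * w2 * w3
```

`ellipse_I2_numeric(M, a, b)` should approach `M * a**2 / 4`, and `elliptic_disk_torque` should vanish both for a = b and for α = 0 or π/2, matching conclusions (1) and (2).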
13.4 The Heavy Symmetric Top: Elementary Considerations
We now consider the motion of the top under the action of gravity. If the bearing point O of the top does not coincide with the center of gravity S, gravity exerts a torque. To distinguish it from the freely moving top, such a top is called the heavy top. First we restrict ourselves to the symmetric top which rotates with the angular velocity ω about its figure axis. The origin of the space-fixed coordinate system is placed at the bearing point O; the negative z-axis points along the gravity force. Let the vector from O to S be $\overrightarrow{OS} = \mathbf{l}$; gravity acts on the top of mass m with the torque D = l × mg. Hence, the angular momentum vector is not constant in time:
\[ \dot{\mathbf{L}} = \mathbf{D} \quad\text{or}\quad d\mathbf{L} = \mathbf{D}\cdot dt. \]
This differential form of the equation of motion expresses that the torque causes a change dL of the angular momentum which is parallel to the torque D.
Fig. 13.15. A heavy top: The center of gravity S and the bearing point O are sketched in
Fig. 13.16. Position of the heavy top in the coordinate system
Sommerfeld4 and Klein5 in their book Theory of the Top called this phenomenon—
philosophizing—“Die Tendenz zum gleichsinnigen Parallelismus.” The z-component
of the angular momentum is however conserved. This results from the following con-
sideration: Because g = −gez , we have D = mgez × l, i.e., the torque has no com-
ponent along the z-direction, and hence Lz is constant. Hence the torque D causes a
motion of the angular momentum vector L on a cone about the z-axis; this motion
of the heavy top is called precession. The precession frequency of the top is thereby
constant by reason of symmetry; the relative orientation of the torque and angular
momentum vectors is also constant.
We now calculate the precession frequency. For this purpose we start from the radial component Lr of the angular momentum:
\[ L_r = L\sin\vartheta. \]
The angle covered by Lr in the time dt is
\[ d\alpha = \frac{dL_r}{L_r} = \frac{dL}{L_r} = \frac{D\,dt}{L\sin\vartheta}. \]
For the precession frequency ωp = dα/dt, we thus have, with D = |l × mg| = mgl sin ϑ,
\[ \omega_p = \frac{D}{L\sin\vartheta} = \frac{mgl}{L} \]
or in vector notation
\[ \boldsymbol{\omega}_p\times\mathbf{L} = \mathbf{D}. \]
Fig. 13.17. On the calculation of the precession frequency ωp
4 Arnold Sommerfeld, b. Dec. 5, 1868, Königsberg–d. April 26, 1951, Munich, physicist. From 1897,
professor in Clausthal-Zellerfeld and from 1900 in Aachen, successfully tried for a mathematical
backup of technique. In 1906, Sommerfeld became professor for theoretical physics in Munich,
where he was an excellent academic teacher for generations of physicists (among others P. Debye,
P.P. Ewald, W. Heisenberg, W. Pauli, and H.A. Bethe). He extended Bohr’s ideas in 1915 to the
“Bohr–Sommerfeld theory of atom” and discovered many of the laws for the number, wavelength,
and intensity of spectral lines. His work Atomic Structure and Spectral Lines (Vol. 1, 1919; Vol. 2,
1929) was accepted for decades as a standard work of atomic physics. Further works: Lectures on
Theoretical Physics, six volumes (1942–1962).
5 Felix Klein, b. April 25, 1849, Düsseldorf–d. June 22, 1925, Göttingen. Klein studied from 1865 to
1870 in Bonn. During a stay in 1870 in Paris, he became familiar with the rapidly developing group
theory. From 1871, Klein was a private lecturer in Göttingen, from 1872, professor in Erlangen, from
1875, in Munich, from 1880, in Leipzig and from 1886, in Göttingen. He contributed fundamental
papers on function theory, geometry, and algebra. In particular, group theory and its applications
attracted his interest. In 1872, he published the Program of Erlangen. In later life Klein became
interested in pedagogical and historical problems.
Hence, the precession frequency is independent of the inclination ϑ of the top, provided that ϑ ≠ 0. In the general case nutation motions are superimposed on the precession, so that the tip of the figure axis F no longer moves on a circle but on a much more complicated trajectory about the z-axis. The angle ϑ varies between two extreme values
\[ \vartheta_0 - \Delta\vartheta \le \vartheta \le \vartheta_0 + \Delta\vartheta \]
(see Fig. 13.18). If D = 0, there is only the nutation of the figure axis F about the then-invariable straight line L. For D ≠ 0, the precession of L about the z-axis dominates. The nutation is superimposed on this precession.
Fig. 13.18. In general, a nutation is superimposed on the precession
Since in the special case considered here the vectors of angular momentum and angular velocity coincide with the figure axis of the top, we can write for the angular momentum
\[ \mathbf{L} = \Theta_3\,\boldsymbol{\omega}, \]
where Θ3 is the moment of inertia about the figure axis. For the torque we then have the relation
\[ \mathbf{D} = \Theta_3\,\boldsymbol{\omega}_p\times\boldsymbol{\omega}. \]
The negative of this moment, D′ = −D = Θ3 ω × ωp, is called the top moment. It is the torque that the top exerts on its bearing if it is turned with the angular velocity ωp. This top moment (sometimes also called the deviation resistance) can reach very large values for a quickly rotating top and a sudden turn of its rotation axis. We feel it, for example, if we suddenly turn the axis of a quickly rotating flywheel held in the hand (e.g., the wheel of a bicycle). The top moment observed from a reference system rotating with the angular velocity ωp is identical with the moment of the Coriolis forces, as can be shown.
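The relations ωp = mgl/L and ωp × L = D can be made concrete with a few lines of code. The following sketch assumes a fast top with L = Θ3 ω along the figure axis, a figure axis lying in the x,z-plane, and a precession about the vertical; the helper names are hypothetical.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def precession_frequency(m, g, l, theta3, omega):
    """omega_p = m*g*l / L with L = Theta_3 * omega."""
    return m * g * l / (theta3 * omega)

def torque_and_precession_term(m, g, l, theta3, omega, incl):
    """Return (D, omega_p x L) for a figure axis tilted by incl from the z-axis."""
    n = (math.sin(incl), 0.0, math.cos(incl))        # unit vector along figure axis
    lever = tuple(l * c for c in n)                  # vector from O to S
    D = cross(lever, (0.0, 0.0, -m * g))             # D = l x m g
    L = tuple(theta3 * omega * c for c in n)
    wp = (0.0, 0.0, precession_frequency(m, g, l, theta3, omega))
    return D, cross(wp, L)
```

For any inclination other than zero the two returned vectors should coincide, which illustrates that the precession frequency does not depend on ϑ.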
The earth also precesses under the action of the gravitational attraction of the sun
and moon. The earth is a top with a fully free rotation axis, but it is not free of forces.
As a result of its flattening and of the tilted ecliptic, the attraction of the sun and moon
generates a torque. We imagine the earth as an ideal sphere with a bulge upon it, which
is largest at the equator, and we first consider only the action of the sun. In the center
of earth (center of gravity), the attraction exactly balances the centrifugal forces due
to the orbit of the earth about the sun. We divide the bulge into the halves pointing to
the sun and away from it, respectively. The attraction of the sun on the former half is
larger than at the earth’s center, because of the smaller distance. The centrifugal force,
however, is by the same reason smaller. At the center of gravity S1 of the half-bulge
results a force K pointing toward the sun.
On the side away from the sun, the situation is reversed. Here the centrifugal force dominates over the attraction by the sun, and at the center of gravity S2 of the half-bulge there is a force −K pointing away from the sun, which, because of the inclination of the ecliptic, forms a couple with the former force. The couple tries to turn the earth axis about the axis which is perpendicular to the earth orbit radius and lies in the orbital plane. From this follows the precession motion about the axis perpendicular to the earth orbit. The moon acts in the same sense, but even more strongly than the sun, because of its small distance. The earth axis rotates in 25800 years (“Platonic year”) once on a conical surface with the vertex angle of twice the inclination of the ecliptic,
Fig. 13.19. Mass ring caused by the orbiting sun as seen from the earth6
i.e., 47◦ ; it therefore changes its orientation over the millennia. This precession motion
must be distinguished from the nutations of earth (Chandler’s nutations) discussed in
Example 13.1. The latter are superimposed on the precession motion.
Possibly the most important practical application of the top is the gyrocompass. The
idea goes back to Foucault (1852). The gyrocompass consists in principle of a quickly
rotating, semi-cardanic suspended top, with the rotation axis kept in the horizontal
plane by the suspension.
The earth is not an inertial system; it rotates with the angular velocity ωE. Since the top wants to preserve the orientation of its angular momentum, it is forced to precess with ωE. Hence, there is a top moment D′:
\[ \mathbf{D}' = \Theta_3\,\boldsymbol{\omega}_K\times\boldsymbol{\omega}_E, \]
where we set
\[ \boldsymbol{\omega}_E = \omega_E\sin\varphi\,\mathbf{e}_z + \omega_E\cos\varphi\,\mathbf{e}_N \equiv \boldsymbol{\omega}_{EZ} + \boldsymbol{\omega}_{EN} \]
with ϕ as the geographic latitude. eN is a unit vector pointing along the meridian. By splitting ωE one obtains
\[ \mathbf{D}' = \Theta_3\,(\boldsymbol{\omega}_K\times\boldsymbol{\omega}_{EZ} + \boldsymbol{\omega}_K\times\boldsymbol{\omega}_{EN}). \]
The first term is compensated by the bearing of the top. This part of the top moment tends to turn the AB-axis (see Fig. 13.20(b)). The second term causes a rotation of the top about the z-axis. The splitting of ωK leads to the acting torque
Fig. 13.20. (a) Decomposition of the angular frequency ωE into a vertical and a horizontal component. (b) Semicardanic suspension: This top can freely rotate about the AB-axis
(2004), Chapter 28.
Fig. 13.21. Decomposition of the angular velocity of the earth (ωE) and of the top (ωK)
\[ \mathbf{D}' = \Theta_3\,\omega_K\sin\alpha\;\omega_E\cos\varphi\;\mathbf{e}_z . \]
Hence, a torque arises which always tends to turn the top along the meridian (α = 0).
If the suspension of the top is damped, the top adjusts along the north–south direction provided that it is not just at one of the two poles (ϕ = ±90°). Otherwise, it performs damped pendulum vibrations about the north–south direction. One can therefore use the top as a direction indicator if one is not close to a pole.
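The dependence of the aligning torque on the heading α and on the latitude ϕ can be tabulated directly from the last formula. This is a rough sketch; the value used for the earth's angular velocity (one turn per sidereal day of about 86164 s) and the function name are assumptions added here, not data from the text.

```python
import math

# Assumed value: one revolution per sidereal day (about 86164 s), in rad/s.
OMEGA_EARTH = 2.0 * math.pi / 86164.0

def gyrocompass_torque(theta3, omega_k, alpha, latitude, omega_e=OMEGA_EARTH):
    """z-component of the top moment, Theta_3*omega_K*sin(alpha)*omega_E*cos(latitude).
    alpha is the angle between the top axis and the meridian (radians)."""
    return theta3 * omega_k * math.sin(alpha) * omega_e * math.cos(latitude)
```

The torque vanishes for α = 0 (axis along the meridian) and fades out as cos ϕ toward the poles, which is why the instrument loses its directive force at high latitudes.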
Foucault’s experiments with a “gyroscope” led only to indications of the described
effect. Anschütz-Kaempfe succeeded in constructing the first useful gyrocompass
(1908). To reduce the friction, the gyroscope body—a three-phase current motor—
hangs at a float that floats in a basin of mercury. The top axis is kept horizontal by
placing the center of gravity of the top lower than the buoyancy center (corresponding
to the suspension point). In this setup the gyroscope axis vibrates under the influ-
ence of the moment not only in the horizontal, but also in the vertical plane about the
north–south direction.
Fig. 13.22. Principle of the
gyroscope
By an appropriate damping of the latter of these coupled vibrations, one can also
reach a damping of the vibrations in the horizontal plane, which is needed for the
adjustment. The deviations arising from ship vibrations and from other effects could
be removed in more recent construction (multiple gyrocompass), or accounted for by
calculation.
13.5 Further Applications of the Top
In order to stabilize free motions of bodies, e.g., of a disk or a projectile, these are
set into rapid eigenrotation (spin). A disk thereby maintains its tilted position almost
unchanged, and therefore gets a buoyancy similarly to the wing of a plane and thus
reaches a much larger range of flight than without rotation. A prolate projectile rotat-
ing about its longitudinal axis experiences a torque from the air resistance which tends
to turn the projectile about a center-of-gravity axis perpendicular to the flight direc-
tion. The projectile responds with a kind of precession motion which is very intricate
because of the variable air resistance. The vibrations of the projectile tip remain close
to the tangent of the trajectory, but for a projectile with “right-hand spin” drift off to
the right of the shot plane. The projectile therefore hits the target with the tip ahead,
but on firing one has to account for a right-hand deviation.
The gyroscope torques acting on guided tops tend to turn up the axes of the wheels
of a car passing a bend, which causes an additional pressure on the outer wheels and a
relief of the inner ones. The same gyroscope effect provides an increase of the milling
pressure in grinding mills and finds further application in the turn and bank indicator
of airplanes. If the plane performs a turn, the gyroscope actions on the propeller must
be taken into account.
The top can also serve to stabilize systems (cars) which by their nature are unstable,
as in the one-rail track, or to reduce the vibrations of an internally stable system,
as in Schlick’s ship gyroscope. In the latter device a heavy top with a vertical axis,
driven by a motor, is set into a frame that can rotate about the transverse horizontal
axis. During ship vibrations about the longitudinal axis—these “rolling vibrations”
shall be damped—the top performs vibrations because of the precession about the axis
lying across the ship; ship vibrations and top vibrations represent coupled vibrations.
The top vibrations are appropriately damped by a brake. By the coupling the released
energy is pulled out of the ship vibrations; as a result these are considerably reduced.
As is seen from the above example, for stabilization by a top it is generally essential
that its rotation axis is not fixed relative to the body, but that all degrees of freedom
are available. For this reason a bicycle with a tightly mounted front wheel would not
be stable. Moreover, riding a bicycle without support is also partly based on the laws
of top motion.
An indirect stabilization is used by the devices which control the straight motion of
torpedoes. On deviation from the shot direction, the top activates a relay which causes
the adjustment of the corresponding rudder.
An important problem is to stabilize a horizontal plane so that it remains horizontal
on a moving ship or airplane. This so-called artificial horizon (gyroscope horizon,
flight horizon) could, according to Schuler, be realized by a gravitation pendulum
with a period of 84 minutes (pendulum length = earth radius), since such a pendulum,
even when the suspension point is moved, points to the earth’s center. Useful artificial
horizons could be realized by tops with cardanic suspension (“top pendulum,” center
of gravity below the rotation point).
We finally note that an ordinary play top that moves with a rounded tip on a hori-
zontal plane, and thus has five degrees of freedom, does not fit the definition of a top,
since in general no point remains fixed during its motion; i.e., translation and rotation
motion cannot be dynamically separated. The fact that a play top with an initially tilted
axis straightens up under sufficiently fast rotation—which also happens, for example,
with a cooked egg—can be explained by a torque created by the friction.
EXERCISE
13.8 Gyrocompass
Problem. A simple gyrocompass consists of a gyroscope that rotates about its axis with the angular velocity ω. Let the moment of inertia about this axis be C and the moment of inertia about a perpendicular axis be A. The suspension of the gyroscope floats on mercury, hence the only acting torque forces the gyroscope axis to stay in the horizontal plane. The gyroscope is brought to the equator. Let the angular velocity of the earth be Ω. What is the response of the gyroscope?
Solution. Since the earth rotates with angular velocity Ω, the angular momentum in the earth system satisfies
\[ \frac{d\mathbf{L}}{dt} = \mathbf{D} - \boldsymbol{\Omega}\times\mathbf{L}, \]
where D is the total torque. At the equator, Ω points along the y-axis, and the z-axis is perpendicular to it.
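Fig. 13.23. Ω is the angular velocity of the earth, ω that of the gyroscope
The components of the angular momentum are
\[ L_x = C\omega\sin\varphi, \qquad L_y = C\omega\cos\varphi, \qquad L_z = -A\dot\varphi, \qquad \frac{\Omega}{\omega} \ll 1. \]
We suppose that ϕ is small. Then
\[ L_x \cong C\omega\varphi, \qquad L_y \cong C\omega, \qquad L_z \cong -A\dot\varphi, \qquad \frac{\Omega}{\omega} \ll 1. \]
Since there are no forces acting in the x,y-plane, Dz = 0. Hence, the equation for Lz is
\[ A\ddot\varphi = -C\omega\Omega\,\varphi \]
or
\[ \ddot\varphi + \frac{C}{A}\,\omega\Omega\,\varphi = 0; \quad\text{i.e.,}\quad \ddot\varphi + \omega_r^2\,\varphi = 0, \qquad \omega_r^2 = \frac{C}{A}\,\omega\Omega. \]
ϕ oscillates with the frequency
\[ \omega_r = \left(\frac{C}{A}\,\omega\Omega\right)^{1/2} \]
about the north–south direction!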
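To see what oscillation periods this result implies, one can evaluate ωr for sample data. The sketch below is only an illustration: the function names are hypothetical, and the numbers in the usage note (a thin-disk rotor with C/A = 2, a spin of 100 revolutions per second, and the earth's rotation rate of roughly 7.29·10⁻⁵ rad/s) are assumptions, not values from the exercise.

```python
import math

def gyrocompass_frequency(C, A, omega, Omega):
    """omega_r = sqrt((C/A) * omega * Omega) for small deflections at the equator."""
    return math.sqrt(C / A * omega * Omega)

def gyrocompass_period(C, A, omega, Omega):
    """Oscillation period 2*pi/omega_r of the axis about the meridian."""
    return 2.0 * math.pi / gyrocompass_frequency(C, A, omega, Omega)
```

For instance, `gyrocompass_period(2.0, 1.0, 2 * math.pi * 100.0, 7.29e-5)` gives a period of the order of some tens of seconds; a faster spin or a larger ratio C/A shortens it.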
EXERCISE
13.9 Tidal Forces, and Lunar and Solar Eclipses: The Saros Cycle7
Problem. The ancient Chinese court astronomers were able to predict lunar and solar eclipses with great reliability. The fact that such eclipses arise only occasionally (while otherwise we have a full moon or a new moon) is caused by the inclination of the orbital plane of the earth–moon system from the ecliptic, i.e., the orbital plane of the motion of the common center of gravity about the sun. This inclination is about 5.15°. It is not fixed in space but precesses because of the tidal forces exerted by the sun. This leads to the so-called Saros cycle, which is of great importance for the prediction of eclipses.
7 The name goes back to the Chaldeans, a Babylonian tribe. Thales presumably used Babylonian tables for predicting the solar eclipse in 585 B.C. The knowledge of natural science of the Babylonians was highly developed. They had tables for square roots and powers, approximated the number π by 3 1/8, and could solve quadratic equations. The subdivision of the celestial circle into 12 zodiacal symbols and the 360° division of the circle are modern examples of Babylonian nomenclature.
Fig. 13.24.
Consider the earth–moon system as a dumbbell-shaped top which rotates about
its center of gravity Sp ; the center of gravity orbits about the sun on a circular path.
The gravitation force between the earth and moon just balances the centrifugal force
resulting from the eigenrotation of the system and thus fixes the almost rigid dumbbell
length r0 . The gravitation of the sun and the centrifugal force due to orbiting about the
sun don’t compensate for each body independently but lead to resulting tidal forces.
These forces create a torque M0 on the top. Calculate M0 for the sketched position
where it just takes its maximum value. Realize that M0 on the (monthly and annual)
average has a quarter of this value. Calculate from this the precession period Tp . Can
you find arguments for why the actual Saros cycle of 18.3 years is notably longer?
Hint: The only data you need for the calculation are the distances r0 , the length of
year, and the length of the sidereal month.
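Solution. R0 is defined as the vector pointing from the center of gravity of the sun to the center of gravity of the earth–moon system. The coordinate origin of the system is in the center of gravity of the sun.
Let R be given in cylindrical coordinates:
\[ \mathbf{R} = \mathbf{R}_0 + \Delta\mathbf{R} = R_0\,\mathbf{e}_r + \Delta R_r\,\mathbf{e}_r + \Delta R_\varphi\,\mathbf{e}_\varphi + \Delta R_z\,\mathbf{e}_z \]
with
\[ |\Delta\mathbf{R}| \ll |\mathbf{R}_0|. \]
We then have
\[ |\mathbf{R}| \approx R_0\left(1 + \frac{2\,\Delta R_r}{R_0}\right)^{1/2}. \]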
Fig. 13.25. The adopted unit vectors
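Hence, we can write for the gravitational force
\[ \mathbf{F}_{Gr}(\mathbf{R}) = -\frac{\gamma mM}{|\mathbf{R}|^3}\,\mathbf{R}
   \approx -\frac{\gamma mM}{R_0^3}\left(1 - \frac{3\,\Delta R_r}{R_0}\right)\left(R_0\,\mathbf{e}_r + \Delta R_r\,\mathbf{e}_r + \Delta R_\varphi\,\mathbf{e}_\varphi + \Delta R_z\,\mathbf{e}_z\right) \]
\[ \phantom{\mathbf{F}_{Gr}(\mathbf{R})} \approx \mathbf{e}_r\left[-\frac{\gamma mM}{R_0^3}\,(R_0 - 2\,\Delta R_r)\right]
   + \mathbf{e}_\varphi\left[-\frac{\gamma mM}{R_0^3}\,\Delta R_\varphi\right]
   + \mathbf{e}_z\left[-\frac{\gamma mM}{R_0^3}\,\Delta R_z\right]. \tag{13.27} \]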
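The quality of the linearization (13.27) can be tested by comparing it with the exact inverse-square force for a small displacement. In this sketch the local unit vectors e_r, e_ϕ, e_z at the point R0 are treated as Cartesian axes, and units are chosen such that γmM = 1 and R0 = 1; these choices and the function names are assumptions for illustration only.

```python
def gravity_exact(d_r, d_phi, d_z, r0=1.0, k=1.0):
    """Exact force -k*R/|R|^3 at R = (r0 + d_r, d_phi, d_z); k stands for gamma*m*M."""
    x, y, z = r0 + d_r, d_phi, d_z
    dist3 = (x * x + y * y + z * z) ** 1.5
    return (-k * x / dist3, -k * y / dist3, -k * z / dist3)

def gravity_linearized(d_r, d_phi, d_z, r0=1.0, k=1.0):
    """First-order expansion (13.27) in the small displacement."""
    c = k / r0 ** 3
    return (-c * (r0 - 2.0 * d_r), -c * d_phi, -c * d_z)
```

For displacements of relative size 10⁻³ the two functions should differ only at second order, i.e., by a few times 10⁻⁶ in these units. For comparison, r0/R0 ≈ 0.384/149.6 ≈ 2.6·10⁻³ for the earth–moon system.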
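The two masses m and M don’t yet have a special meaning. We now consider the motion of the earth–moon system as a two-body problem with an external force:
\[ \mathbf{R}_{CM} := \mathbf{R}_0 = \frac{m_E\mathbf{R}_E + m_M\mathbf{R}_M}{m_E + m_M} \quad\text{(CM means center of mass)}, \]
\[ \mathbf{V}_{CM} = \frac{m_E\mathbf{V}_E + m_M\mathbf{V}_M}{m_E + m_M}, \]
\[ \mathbf{B}_{CM} = \frac{m_E\mathbf{B}_E + m_M\mathbf{B}_M}{m_E + m_M}
   = \frac{\mathbf{F}_{ES} + \mathbf{F}_{EM} + \mathbf{F}_{ME} + \mathbf{F}_{MS}}{m_E + m_M} \quad\text{(S means sun)}. \]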
According to (13.27),
\[ \Delta\mathbf{R}_E = \mathbf{r}_E, \qquad \Delta\mathbf{R}_M = \mathbf{r}_M, \qquad m_E\mathbf{r}_E = -m_M\mathbf{r}_M; \]
further from (13.27) we have
\[ \mathbf{B}_{CM} = -\gamma\,\frac{1}{m_E + m_M}\,\frac{(m_E + m_M)M_S}{R_0^2}\cdot\mathbf{e}_r
   = -\frac{\gamma M_S}{R_0^2}\cdot\mathbf{e}_r = -\omega_{CM}^2 R_0\cdot\mathbf{e}_r \quad\text{(circular acceleration)}. \tag{13.28} \]
The last equality follows from the equilibrium condition for the center of gravity. The magnitude of the gravitational acceleration must be equal to the magnitude of the circular acceleration. From this, it follows that the center of mass CM rotates with the frequency ωCM about the sun at the distance R0.
We further know the following values:
\[ T_{CM} = \frac{2\pi}{\omega_{CM}} = 365\ \text{days}; \qquad R_0 = 149.6\cdot 10^6\ \text{km}. \tag{13.29} \]
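We are mainly interested in the motion of the earth–moon system. To this end we consider the relative distance r_rel between the earth and moon.
\[ \mathbf{r}_{rel} = \mathbf{R}_E - \mathbf{R}_M, \qquad |\mathbf{r}_{rel}| = r_0, \]
\[ \mathbf{P}_{rel} = \mu(\mathbf{V}_E - \mathbf{V}_M) \quad\text{with}\quad \mu = \frac{m_E\cdot m_M}{m_E + m_M}, \]
\[ \frac{d\mathbf{P}_{rel}}{dt} = \mu(\mathbf{B}_E - \mathbf{B}_M) = \mu\left(\frac{\mathbf{F}_{ES}}{m_E} + \frac{\mathbf{F}_{EM}}{m_E} - \frac{\mathbf{F}_{ME}}{m_M} - \frac{\mathbf{F}_{MS}}{m_M}\right), \]
\[ \mathbf{F}_{EM} = -\gamma\,\frac{m_E m_M}{r_0^3}\,\mathbf{r}_{rel} = -\mu\,\omega_M^2\,\mathbf{r}_{rel}. \]
The last equality holds because of the equilibrium condition, as for the center-of-mass acceleration.
The relative position vector also describes a circle, of radius r0 and with the sidereal period of the moon. One has
\[ T_M = \frac{2\pi}{\omega_M} = 27\ \text{days} + 8\ \text{hours}, \qquad r_0 = r_E + r_M = 0.384\cdot 10^6\ \text{km}. \tag{13.30} \]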
We thus obtain the following combined motion (epicycle motion):
\[ \mathbf{R}_E = \mathbf{R}_0 + \frac{m_M}{m_M + m_E}\,\mathbf{r}_{rel}, \qquad
   \mathbf{R}_M = \mathbf{R}_0 - \frac{m_E}{m_M + m_E}\,\mathbf{r}_{rel}. \tag{13.31} \]
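The angular momentum with respect to the center of mass CM is given by
\[ \mathbf{L}_0 = \mathbf{r}_E\times m_E(\mathbf{V}_E - \mathbf{V}_{CM}) + \mathbf{r}_M\times m_M(\mathbf{V}_M - \mathbf{V}_{CM}) \]
\[ \phantom{\mathbf{L}_0} = m_E\,\frac{m_M}{m_E + m_M}\,\mathbf{r}_{rel}\times\frac{m_M}{m_E + m_M}\,\mathbf{v}_{rel}
   + m_M\,\frac{m_E}{m_E + m_M}\,\mathbf{r}_{rel}\times\frac{m_E}{m_E + m_M}\,\mathbf{v}_{rel} \]
\[ \phantom{\mathbf{L}_0} = \frac{m_E m_M}{m_E + m_M}\,\mathbf{r}_{rel}\times\mathbf{v}_{rel} = \mu\cdot\mathbf{r}_{rel}\times\mathbf{v}_{rel}
   \approx \frac{m_E m_M}{m_E + m_M}\,\omega_M\cdot r_0^2\cdot\mathbf{l}_{EM}, \tag{13.32} \]
where l_EM represents a normal vector to the orbital plane of the earth–moon system. The relation (13.32) does not hold exactly, since the motion of the earth–moon system is not perfectly circular, but can be well approximated by a circle.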
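The coordinate system at the center of gravity is oriented just as at the origin. The total angular momentum with respect to the sun is evaluated as
\[ \mathbf{L}_{tot} = m_E\,\mathbf{R}_E\times\mathbf{V}_E + m_M\cdot\mathbf{R}_M\times\mathbf{V}_M \]
\[ \phantom{\mathbf{L}_{tot}} = m_E\left(\mathbf{R}_0 + \frac{m_M}{m_M + m_E}\,\mathbf{r}_{rel}\right)\times\left(\mathbf{V}_{CM} + \frac{m_M}{m_M + m_E}\,\mathbf{v}_{rel}\right)
   + m_M\left(\mathbf{R}_0 - \frac{m_E}{m_M + m_E}\,\mathbf{r}_{rel}\right)\times\left(\mathbf{V}_{CM} - \frac{m_E}{m_M + m_E}\,\mathbf{v}_{rel}\right) \]
\[ \phantom{\mathbf{L}_{tot}} = (m_E + m_M)\,\mathbf{R}_0\times\mathbf{V}_{CM} + \mu\cdot\mathbf{r}_{rel}\times\mathbf{v}_{rel}
   = (m_E + m_M)\,\omega_{CM}\cdot R_0^2\cdot\mathbf{l}_{S-CM} + \mathbf{L}_0 = \mathbf{L}_{CM} + \mathbf{L}_0. \tag{13.33} \]
As in (13.32), l_{S−CM} is a vector normal to the orbital plane defined by the sun and the center of gravity.
\[ \frac{d\mathbf{L}_{tot}}{dt} = \mathbf{R}_E\times\mathbf{F}_{ES} + \mathbf{R}_M\times\mathbf{F}_{MS} + (\mathbf{R}_E - \mathbf{R}_M)\times\mathbf{F}_{EM} = 0; \]
this means
\[ \dot{\mathbf{L}}_{CM} = -\dot{\mathbf{L}}_0. \tag{13.34} \]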
We now consider the resulting torque M0 with respect to the center of mass CM:
\[ \mathbf{M}_0 = \mathbf{r}_E\times(\mathbf{F}_{ES} + \mathbf{F}_{EM} - m_E\mathbf{B}_{CM}) + \mathbf{r}_M\times(\mathbf{F}_{MS} + \mathbf{F}_{ME} - m_M\mathbf{B}_{CM})
   = \mathbf{r}_E\times\mathbf{F}_{ES} + \mathbf{r}_M\times\mathbf{F}_{MS}. \]
The second term in each bracket drops out because the force and the position vector have the same orientation. The two third terms cancel because
\[ \mathbf{r}_E m_E = -\mathbf{r}_M m_M. \]
By inserting r_rel, it follows that
\[ \mathbf{M}_0 = \frac{m_M}{m_E + m_M}\,\mathbf{r}_{rel}\times\mathbf{F}_{ES} - \frac{m_E}{m_E + m_M}\,\mathbf{r}_{rel}\times\mathbf{F}_{MS}. \]
Using (13.27) for the two force vectors F_ES and F_MS, one obtains by simplifying M0 with respect to cylindrical coordinates:
\[ \mathbf{M}_0 = \frac{3\gamma m_E M}{R_0^3}\,\bigl(\Delta R_{rE}\,(\mathbf{r}_{rel}\times\mathbf{e}_r)\bigr). \tag{13.35} \]
Here, ΔR_ϕE and ΔR_zE were set to zero, according to the definition of the problem. In order not to complicate the problem unnecessarily, we orient the coordinate system at the center of gravity, and with it the one at the origin, such that the angular momentum lies on the average in the x,z-plane. This approach is justified since the precession frequency to be calculated is notably less than ωCM. During one revolution about the sun the angular momentum has changed only insignificantly (by about 20°), so that the ecliptic of the earth–moon system has turned only slightly.
Fig. 13.26. (a) The center-of-mass motion about the sun is parametrized by the angle γ. (b) The ecliptic plane is tilted by α from the plane spanned by the center-of-mass motion about the sun
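Ansatz (in the primed system):
\[ \mathbf{r}'_{rel} = r_0\begin{pmatrix}\cos\beta\\ \sin\beta\\ 0\end{pmatrix}, \quad\text{where}\quad \beta \sim \omega_M t. \]
In the nonprimed system the vector has the components
\[ \mathbf{r}_{rel} = \begin{pmatrix}\cos\alpha & 0 & -\sin\alpha\\ 0 & 1 & 0\\ \sin\alpha & 0 & \cos\alpha\end{pmatrix}\mathbf{r}'_{rel}, \qquad
   \mathbf{r}_{rel} = r_0\begin{pmatrix}\cos\alpha\cos\beta\\ \sin\beta\\ \sin\alpha\cos\beta\end{pmatrix}. \]
Ansatz:
\[ \mathbf{e}_r = \begin{pmatrix}\cos(\gamma + \varphi_0)\\ \sin(\gamma + \varphi_0)\\ 0\end{pmatrix}, \quad\text{where}\quad \gamma \sim \omega_{CM} t. \]
Using the relation
\[ \Delta R_{rE} = \frac{m_M}{m_E + m_M}\,(\mathbf{r}_{rel}\cdot\mathbf{e}_r), \]
one obtains, by using the two approaches, the following new formulation of (13.35):
\[ \mathbf{M}_0 = \frac{3\gamma m_E M}{R_0^3}\,\frac{m_M}{m_E + m_M}\,r_0^2\,\mathbf{v} \]
with
\[ \mathbf{v} = \begin{pmatrix}
 -\sin\alpha\cos\alpha\cos^2\beta\,\sin(\gamma+\varphi_0)\cos(\gamma+\varphi_0) - \sin\alpha\cos\beta\sin\beta\,\sin^2(\gamma+\varphi_0)\\[4pt]
 \sin\alpha\cos\alpha\cos^2\beta\,\cos^2(\gamma+\varphi_0) + \sin\alpha\cos\beta\sin\beta\,\sin(\gamma+\varphi_0)\cos(\gamma+\varphi_0)\\[4pt]
 \cos^2\alpha\cos^2\beta\,\sin(\gamma+\varphi_0)\cos(\gamma+\varphi_0) - \sin\beta\cos\beta\cos\alpha\,\cos^2(\gamma+\varphi_0)\\
 \quad {}+ \cos\alpha\cos\beta\sin\beta\,\sin^2(\gamma+\varphi_0) - \sin^2\beta\,\sin(\gamma+\varphi_0)\cos(\gamma+\varphi_0)
 \end{pmatrix}. \]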
This clumsy expression can be significantly simplified, assuming again that ωp ≪ ωCM. This means that the angular momentum L0 changes only slightly during one revolution about the sun.
We first consider β. The revolution period of the moon about the earth is about 28 days. The moment M0 changes its orientation with varying β; for the “inert” angular momentum, however, only the average moment ⟨M0⟩β counts; this is obtained by averaging over a full period of β. The angular impulse (the time integral of the moment, analogous to the impulse of a force) of M0 and of ⟨M0⟩β has the same value, because of the linearity β = ωM t.
\[ \langle\mathbf{M}_0\rangle_\beta = \frac{3\gamma M\mu}{R_0^3}\,r_0^2\times\begin{pmatrix}
 -\frac{1}{2}\sin\alpha\cos\alpha\,\sin(\gamma+\varphi_0)\cos(\gamma+\varphi_0)\\[4pt]
 \frac{1}{2}\sin\alpha\cos\alpha\,\cos^2(\gamma+\varphi_0)\\[4pt]
 \frac{1}{2}\cos^2\alpha\,\sin(\gamma+\varphi_0)\cos(\gamma+\varphi_0) - \frac{1}{2}\sin(\gamma+\varphi_0)\cos(\gamma+\varphi_0)
 \end{pmatrix}. \]
The same consideration can be made for the “rotating” angle γ ∼ ωCM t, since we assume that ωp ≪ ωCM. We therefore average over ⟨M0⟩β:
\[ \langle\mathbf{M}_0\rangle_{\beta\gamma} = \frac{3\gamma M\mu}{R_0^3}\,r_0^2\begin{pmatrix}0\\ \frac{1}{4}\sin\alpha\cos\alpha\\ 0\end{pmatrix}
   = \frac{3\gamma M\mu}{R_0^3}\,r_0^2\cdot\frac{1}{4}\sin\alpha\cos\alpha\cdot\mathbf{e}_y := \mathbf{M}_0. \tag{13.36} \]
Not very much is left over from the extended expression; the resulting acting moment M0 is exactly perpendicular to L0 (see Fig. 13.27) and points along the y-direction. If the angular momentum moved (slowly), we imagine the shifted angular momentum as being embedded in a fixed coordinate system and thus get, according to (13.36), the same result. M0 is constant and always perpendicular to L0. It therefore causes a precession.
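The double average leading to (13.36) can be reproduced numerically without writing out the long vector v. The sketch below builds v as (r·e_r)(r × e_r) from the unit vectors of the two ansatzes and averages it over full periods of β and γ on a uniform grid; the function names and the grid size are assumptions, and the constant ϕ0 is absorbed into γ.

```python
import math

def moment_shape(alpha, beta, gamma):
    """Dimensionless v = (r . e_r) * (r x e_r) for unit vectors r_rel/r0 and e_r."""
    r = (math.cos(alpha) * math.cos(beta), math.sin(beta), math.sin(alpha) * math.cos(beta))
    e = (math.cos(gamma), math.sin(gamma), 0.0)
    dot = sum(ri * ei for ri, ei in zip(r, e))
    cross = (r[1] * e[2] - r[2] * e[1],
             r[2] * e[0] - r[0] * e[2],
             r[0] * e[1] - r[1] * e[0])
    return tuple(dot * c for c in cross)

def averaged_shape(alpha, n=16):
    """Average of v over one period of beta and of gamma (uniform n x n grid)."""
    angles = [2.0 * math.pi * k / n for k in range(n)]
    total = [0.0, 0.0, 0.0]
    for beta in angles:
        for gamma in angles:
            v = moment_shape(alpha, beta, gamma)
            for i in range(3):
                total[i] += v[i]
    return tuple(t / (n * n) for t in total)
```

`averaged_shape(alpha)` should return a vector whose only nonvanishing component is the y-component, equal to `0.25 * math.sin(alpha) * math.cos(alpha)`. A uniform grid is sufficient here because v contains only low harmonics of β and γ.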
Fig. 13.27. Meaning of the angles α, β, γ at one glance: β describes the rotation of the earth–moon axis in the primed coordinate system
But first we will illustrate (13.36), which has been obtained in a rather mathematical way: In this situation, we get a maximum moment. The moment M(β) now points in the y′-direction for all possible β. For β = 90° or β = 270°, the vector M vanishes. On the average, we thus obtain
\[ \langle\mathbf{M}(\beta)\rangle_\beta = 2\,\mathbf{M}_0. \]
Fig. 13.28. Orientation of e_r as a function of γ + ϕ0: The averaged moment ⟨M⟩β always points along the y′-direction
The diagram shows that there are also two “maximum” orientations with respect to (γ + ϕ0). Between these positions ⟨M(β)⟩ must vanish. One thus obtains altogether
\[ \langle\mathbf{M}(\beta, \gamma)\rangle_{\beta\gamma} = \mathbf{M}_0. \]
Thus, the result (13.36) is also clearly understood.

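With (13.32) in our defined coordinate system, we have
\[ \mathbf{L}_0 = \mu\cdot\omega_M\cdot r_0^2\begin{pmatrix}-\sin\alpha\\ 0\\ \cos\alpha\end{pmatrix}. \]
The angular momentum L0 now precesses (with M = MS the mass of the sun in (13.36)):
\[ \omega_p = \frac{M}{L\cdot\sin\alpha} = \frac{M_0}{L_0\cdot\sin\alpha}
   = \frac{3}{4}\,\frac{\gamma M_S}{R_0^3}\cdot\frac{\mu r_0^2\sin\alpha\cos\alpha}{\mu\,\omega_M r_0^2\sin\alpha}
   = \frac{3}{4}\,\frac{\gamma M_S}{R_0^3}\cos\alpha\cdot\frac{1}{\omega_M}. \tag{13.37} \]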
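It has been shown at the beginning (13.28) that
\[ \frac{\gamma M_S}{R_0^2} = \omega_{CM}^2 R_0 \;\Rightarrow\; \frac{\gamma M_S}{R_0^3} = \omega_{CM}^2 = \left(\frac{2\pi}{T_{CM}}\right)^2, \]
and with
\[ \omega_M = \frac{2\pi}{T_M}, \]
one gets
\[ \omega_p = \frac{3}{4}\cos\alpha\,\frac{4\pi^2}{T_{CM}^2}\cdot\frac{T_M}{2\pi} = \frac{3}{2}\,\pi\cos\alpha\,\frac{T_M}{T_{CM}^2}, \tag{13.38} \]
and for Tp
\[ T_p = \frac{2\pi}{\omega_p} = \frac{4}{3}\cdot\frac{1}{\cos\alpha}\cdot\frac{T_{CM}^2}{T_M}. \tag{13.39} \]
With TCM ≈ 365.25 days, TM ≈ 27.3 days, and α ≈ 5.5°, one obtains
\[ T_p \approx 17.9\ \text{years}. \tag{13.40} \]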
The fact that the actual Saros cycle is larger by about 2% is partly due to the approximation when averaging γ (the angular momentum actually moves slightly), and possibly also due to the elliptic path of the moon about the earth. In any case, the result is relatively accurate, considering the approximations made.
From (13.34), it further follows that
\[ \dot{\mathbf{L}}_{CM} = -\dot{\mathbf{L}}_0, \]
i.e., the “large” angular momentum vector LCM runs through an opposite precession cone.
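The numerical value of the precession period follows from (13.39) with only three inputs. The short sketch below evaluates it with the data quoted in the solution; the function name is an illustrative assumption.

```python
import math

def precession_period(t_cm, t_m, alpha):
    """T_p = (4/3) * T_CM**2 / (T_M * cos(alpha)), in the same time unit as the inputs."""
    return 4.0 / 3.0 * t_cm ** 2 / (t_m * math.cos(alpha))

# Data from the solution: T_CM ~ 365.25 d, T_M ~ 27.3 d, alpha ~ 5.5 degrees
t_p_days = precession_period(365.25, 27.3, math.radians(5.5))
t_p_years = t_p_days / 365.25
```

Since cos α differs from 1 by less than half a percent for such small inclinations, the result is dominated by the ratio T_CM²/T_M and is insensitive to whether 5.15° or 5.5° is used for α.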
13.6 The Euler Angles

The motion of the heavy top suspended at a point can be described by specifying the orientation of a body-fixed coordinate system (x′, y′, z′) relative to a space-fixed system (x, y, z). The two coordinate systems have a common origin at the suspension point of the top. To establish the relation between the two coordinate systems, one usually adopts the Euler angles.8
The coordinate system (x′, y′, z′) is obtained from the system (x, y, z) by three subsequent rotations about defined axes. The corresponding rotation angles are called Euler angles. The sequence of the rotations is important, since rotations by finite angles are not commutative. In Fig. 13.29 we see at once that a permutation of the sequence of two rotations about different axes leads to a different result.
Fig. 13.29. Demonstration of
the noncommutativity of finite
rotations
A rotation by 90° about the x-axis, followed by a 90° rotation about the y-axis (upper figure), leads to a different result than rotating first about the y-axis and then about the x-axis (lower figure) (noncommutativity of finite rotations).
The Euler angles are defined as follows: The first rotation is performed about the z-axis by the angle α. The x- and y-axis turn into the X- and Y-axis. The Z-axis coincides with the z-axis. The X, Y, Z system so defined is a first intermediate system which is only used to keep the calculation transparent. For the unit vectors, we have
\[ \begin{aligned}
\mathbf{i} &= (\mathbf{i}\cdot\mathbf{I})\mathbf{I} + (\mathbf{i}\cdot\mathbf{J})\mathbf{J} + (\mathbf{i}\cdot\mathbf{K})\mathbf{K} = \cos\alpha\,\mathbf{I} - \sin\alpha\,\mathbf{J},\\
\mathbf{j} &= (\mathbf{j}\cdot\mathbf{I})\mathbf{I} + (\mathbf{j}\cdot\mathbf{J})\mathbf{J} + (\mathbf{j}\cdot\mathbf{K})\mathbf{K} = \sin\alpha\,\mathbf{I} + \cos\alpha\,\mathbf{J},\\
\mathbf{k} &= (\mathbf{k}\cdot\mathbf{I})\mathbf{I} + (\mathbf{k}\cdot\mathbf{J})\mathbf{J} + (\mathbf{k}\cdot\mathbf{K})\mathbf{K} = \mathbf{K}.
\end{aligned} \tag{13.41a} \]
8 Leonhard Euler, Swiss mathematician, b. April 15, 1707, Basel, Switzerland–d. September 18, 1783, Saint Petersburg, Russia. Son of a clergyman, Euler studied mathematics in Basel with Johann Bernoulli, and in 1727 was appointed professor of physics and mathematics at the university of Saint Petersburg. With an extended break from 1741 to 1766 as a member of the Berlin Academy of Sciences, he spent the rest of his life in Saint Petersburg. Euler authored more than 800 scientific papers on nearly all subjects in mathematics, especially on calculus, calculus of variations, and the theory of complex functions. He contributed eminently also to the theory of numbers, and his solution to the problem of the Seven Bridges of Königsberg laid the foundation of graph theory and topology. In physics, Euler worked mainly in hydrodynamics, the theory of elasticity, and the theory of the top.
Fig. 13.30. The first Euler angle
The second rotation is performed about the (new) X-axis by the angle β; the Y- and Z-axis turn into the Y′- and the Z′-axis. The X′-axis coincides with the X-axis. The X′, Y′, Z′ system fixed in this way is a second intermediate system. It serves for mathematical clarity and transparency, just as the first intermediate system did. An analogous calculation yields for the unit vectors
\[ \begin{aligned}
\mathbf{I} &= \mathbf{I}',\\
\mathbf{J} &= \cos\beta\,\mathbf{J}' - \sin\beta\,\mathbf{K}',\\
\mathbf{K} &= \sin\beta\,\mathbf{J}' + \cos\beta\,\mathbf{K}'.
\end{aligned} \tag{13.41b} \]
The third rotation is performed about the Z′-axis by the angle γ; the X′- and Y′-axis then turn into the x′- and y′-axis, respectively. The z′-axis is identical with the Z′-axis. The x′, y′, z′ system constructed this way is the desired body-fixed coordinate system. For the unit vectors, one obtains
\[ \begin{aligned}
\mathbf{I}' &= \cos\gamma\,\mathbf{i}' - \sin\gamma\,\mathbf{j}',\\
\mathbf{J}' &= \sin\gamma\,\mathbf{i}' + \cos\gamma\,\mathbf{j}',\\
\mathbf{K}' &= \mathbf{k}'.
\end{aligned} \tag{13.41c} \]
Fig. 13.31. The first two Euler angles
Fig. 13.32. All three Euler angles
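Using the relations between the unit vectors, we now determine the unit vectors i, j, k as functions of i′, j′, k′. For this purpose, we insert (13.41b) and (13.41c) into (13.41a):
\[ \begin{aligned}
\mathbf{i} &= \cos\alpha\,\mathbf{I} - \sin\alpha\,\mathbf{J}\\
&= \cos\alpha\,\mathbf{I}' - \sin\alpha\cos\beta\,\mathbf{J}' + \sin\alpha\sin\beta\,\mathbf{K}'\\
&= \cos\alpha\cos\gamma\,\mathbf{i}' - \cos\alpha\sin\gamma\,\mathbf{j}' - \sin\alpha\cos\beta\sin\gamma\,\mathbf{i}' - \sin\alpha\cos\beta\cos\gamma\,\mathbf{j}' + \sin\alpha\sin\beta\,\mathbf{k}'\\
&= (\cos\alpha\cos\gamma - \sin\alpha\cos\beta\sin\gamma)\,\mathbf{i}' + (-\cos\alpha\sin\gamma - \sin\alpha\cos\beta\cos\gamma)\,\mathbf{j}' + \sin\alpha\sin\beta\,\mathbf{k}'.
\end{aligned} \tag{13.41d} \]
An analogous calculation yields
\[ \begin{aligned}
\mathbf{j} &= (\sin\alpha\cos\gamma + \cos\alpha\cos\beta\sin\gamma)\,\mathbf{i}' + (-\sin\alpha\sin\gamma + \cos\alpha\cos\beta\cos\gamma)\,\mathbf{j}' - \cos\alpha\sin\beta\,\mathbf{k}',\\
\mathbf{k} &= \sin\beta\sin\gamma\,\mathbf{i}' + \sin\beta\cos\gamma\,\mathbf{j}' + \cos\beta\,\mathbf{k}'.
\end{aligned} \tag{13.41e} \]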
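The rotations can also be expressed by the corresponding rotation matrices. For the first rotation, we have
\[ \mathbf{r} = \hat{A}\,\mathbf{R}, \]
where
\[ \hat{A} = \begin{pmatrix} \cos\alpha & -\sin\alpha & 0\\ \sin\alpha & \cos\alpha & 0\\ 0 & 0 & 1 \end{pmatrix} \quad\text{and}\quad \mathbf{R} = \begin{pmatrix} X\\ Y\\ Z \end{pmatrix}. \]
The matrices for the rotations by the angles β and γ accordingly read
\[ \hat{B} = \begin{pmatrix} 1 & 0 & 0\\ 0 & \cos\beta & -\sin\beta\\ 0 & \sin\beta & \cos\beta \end{pmatrix}, \qquad \hat{C} = \begin{pmatrix} \cos\gamma & -\sin\gamma & 0\\ \sin\gamma & \cos\gamma & 0\\ 0 & 0 & 1 \end{pmatrix}. \]
The matrix $\hat{D}$ of the entire rotation is the product of the three matrices, $\hat{D} = \hat{A}\hat{B}\hat{C}$. Hence,
\[ \mathbf{r} = \hat{D}\,\mathbf{r}' \quad\text{or}\quad \mathbf{r}' = \hat{D}^{-1}\mathbf{r} = \hat{D}^{\mathrm{T}}\mathbf{r}. \]
Since the rotation matrices are orthogonal, the inverse matrix equals the transposed one. By calculating the matrix product, one can easily show that the matrix $\hat{D}$ agrees with the relations derived for the unit vectors. (This agrees with the general considerations from Chap. 30 of Classical Mechanics: Point Particles and Relativity of the Lectures on Theoretical Physics.)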
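The statement that the product of the three rotation matrices reproduces the unit-vector relations (13.41d) and (13.41e) can be checked numerically. The following sketch uses plain Python lists; the helper names `matmul`, `euler_matrix`, and `unit_vector_rows` are illustrative choices, not notation from the text.

```python
import math

def matmul(p, q):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def euler_matrix(alpha, beta, gamma):
    """D = A B C for the rotation sequence z, X, Z' described in the text."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    a = [[ca, -sa, 0.0], [sa, ca, 0.0], [0.0, 0.0, 1.0]]
    b = [[1.0, 0.0, 0.0], [0.0, cb, -sb], [0.0, sb, cb]]
    c = [[cg, -sg, 0.0], [sg, cg, 0.0], [0.0, 0.0, 1.0]]
    return matmul(matmul(a, b), c)

def unit_vector_rows(alpha, beta, gamma):
    """Components of i, j, k along i', j', k' as given in (13.41d), (13.41e)."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    return [
        [ca * cg - sa * cb * sg, -ca * sg - sa * cb * cg, sa * sb],
        [sa * cg + ca * cb * sg, -sa * sg + ca * cb * cg, -ca * sb],
        [sb * sg, sb * cg, cb],
    ]

# Arbitrary test angles (radians), chosen only for illustration
angles = (0.4, 1.1, -0.7)
d = euler_matrix(*angles)
rows = unit_vector_rows(*angles)
max_diff = max(abs(d[i][j] - rows[i][j]) for i in range(3) for j in range(3))

# Orthogonality: D times its transpose should give the unit matrix
d_t = [[d[j][i] for j in range(3)] for i in range(3)]
ddt = matmul(d, d_t)
max_orth = max(abs(ddt[i][j] - (1.0 if i == j else 0.0))
               for i in range(3) for j in range(3))
print(max_diff, max_orth)
```

Both printed deviations should be at the level of floating-point rounding, which illustrates that the rows of the product matrix are the coefficients of i, j, k and that the inverse equals the transpose.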
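We first calculate the angular velocity ω of the top as a function of the Euler angles. If (i, j, k) define the laboratory system and (i′, j′, k′) a body-fixed system of principal axes, for the angular velocity we have
\[ \boldsymbol{\omega} = \omega_\alpha\mathbf{k} + \omega_\beta\mathbf{I} + \omega_\gamma\mathbf{K}' = \dot\alpha\,\mathbf{k} + \dot\beta\,\mathbf{I} + \dot\gamma\,\mathbf{K}', \]
where we presuppose that k, I, and K′ are not coplanar. We utilize the derived relations between the unit vectors and obtain
\[ \begin{aligned}
\boldsymbol{\omega} &= \dot\alpha\sin\beta\sin\gamma\,\mathbf{i}' + \dot\alpha\sin\beta\cos\gamma\,\mathbf{j}' + \dot\alpha\cos\beta\,\mathbf{k}' + \dot\beta\cos\gamma\,\mathbf{i}' - \dot\beta\sin\gamma\,\mathbf{j}' + \dot\gamma\,\mathbf{k}'\\
&= (\dot\alpha\sin\beta\sin\gamma + \dot\beta\cos\gamma)\,\mathbf{i}' + (\dot\alpha\sin\beta\cos\gamma - \dot\beta\sin\gamma)\,\mathbf{j}' + (\dot\alpha\cos\beta + \dot\gamma)\,\mathbf{k}'.
\end{aligned} \]
Setting $\boldsymbol{\omega} = \omega_{x'}\mathbf{i}' + \omega_{y'}\mathbf{j}' + \omega_{z'}\mathbf{k}'$, we get the components of the angular velocity relative to the body-fixed coordinate system:
\[ \begin{aligned}
\omega_{x'} &\equiv \omega_1 = \dot\alpha\sin\beta\sin\gamma + \dot\beta\cos\gamma,\\
\omega_{y'} &\equiv \omega_2 = \dot\alpha\sin\beta\cos\gamma - \dot\beta\sin\gamma,\\
\omega_{z'} &\equiv \omega_3 = \dot\alpha\cos\beta + \dot\gamma.
\end{aligned} \tag{13.42} \]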
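The kinetic energy T of the top is then
\[ \begin{aligned}
T &= \frac{1}{2}\left(\Theta_1\omega_1^2 + \Theta_2\omega_2^2 + \Theta_3\omega_3^2\right)\\
&= \frac{1}{2}\Theta_1(\dot\alpha\sin\beta\sin\gamma + \dot\beta\cos\gamma)^2 + \frac{1}{2}\Theta_2(\dot\alpha\sin\beta\cos\gamma - \dot\beta\sin\gamma)^2 + \frac{1}{2}\Theta_3(\dot\alpha\cos\beta + \dot\gamma)^2.
\end{aligned} \tag{13.43} \]
If $\Theta_1 = \Theta_2$, i.e., the top is symmetric, the above expression simplifies to
\[ T = \frac{1}{2}\Theta_1(\dot\alpha^2\sin^2\beta + \dot\beta^2) + \frac{1}{2}\Theta_3(\dot\alpha\cos\beta + \dot\gamma)^2. \tag{13.44} \]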
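As a quick plausibility check, the general expression (13.43) can be evaluated with equal moments of inertia about the first two axes and compared with the simplified form (13.44). The sketch below does this for arbitrary test values; all function names and numbers are illustrative assumptions.

```python
import math

def body_angular_velocity(alpha_dot, beta_dot, gamma_dot, beta, gamma):
    """Body-fixed components (omega_1, omega_2, omega_3) from (13.42)."""
    w1 = alpha_dot * math.sin(beta) * math.sin(gamma) + beta_dot * math.cos(gamma)
    w2 = alpha_dot * math.sin(beta) * math.cos(gamma) - beta_dot * math.sin(gamma)
    w3 = alpha_dot * math.cos(beta) + gamma_dot
    return w1, w2, w3

def kinetic_energy_general(thetas, omegas):
    """T = 1/2 * sum_i Theta_i * omega_i**2, first line of (13.43)."""
    return 0.5 * sum(t * w * w for t, w in zip(thetas, omegas))

def kinetic_energy_symmetric(theta1, theta3, alpha_dot, beta_dot, gamma_dot, beta):
    """Simplified form (13.44), valid for Theta_1 = Theta_2."""
    return (0.5 * theta1 * (alpha_dot**2 * math.sin(beta)**2 + beta_dot**2)
            + 0.5 * theta3 * (alpha_dot * math.cos(beta) + gamma_dot)**2)

# Arbitrary illustrative values (no particular top is implied)
theta1, theta3 = 2.0, 0.8
alpha_dot, beta_dot, gamma_dot = 0.3, -0.5, 7.0
beta, gamma = 0.9, 2.2

omegas = body_angular_velocity(alpha_dot, beta_dot, gamma_dot, beta, gamma)
t_general = kinetic_energy_general((theta1, theta1, theta3), omegas)
t_symmetric = kinetic_energy_symmetric(theta1, theta3, alpha_dot, beta_dot, gamma_dot, beta)
print(t_general, t_symmetric)
```

The angle γ drops out of the symmetric result, which is why (13.44) depends only on β and the three angular rates.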
13.7 Motion of the Heavy Symmetric Top
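For the special case of the heavy symmetric top, we will determine the explicit equations of motion and the constants of motion, starting from the Euler equations. For simplification, we note that for the symmetric top the orientations of the two principal axes $\mathbf{e}_{x'}$, $\mathbf{e}_{y'}$ can be chosen arbitrarily in the plane perpendicular to $\mathbf{e}_{z'}$. We therefore choose a coordinate system where the angle γ always vanishes. This system is then no longer body-fixed (it does not rotate with the top about the $\mathbf{e}_{z'}$-axis). The axes $\mathbf{e}_z$, $\mathbf{e}_{z'}$, $\mathbf{e}_{y'}$ are then coplanar, as are $\mathbf{e}_x$, $\mathbf{e}_{x'}$, $\mathbf{e}_y$. This is illustrated in Fig. 13.33.
Fig. 13.33. Heavy top in various coordinate systems
Analytically, this follows from the fact that
\[ \begin{aligned}
\mathbf{e}_{x'} &= \mathbf{I}' = \mathbf{I} = \cos\alpha\,\mathbf{i} + \sin\alpha\,\mathbf{j} = \cos\alpha\,\mathbf{e}_x + \sin\alpha\,\mathbf{e}_y,\\
\mathbf{e}_{y'} &= \mathbf{J}' = \cos\beta\,\mathbf{J} + \sin\beta\,\mathbf{K} = \cos\beta(-\sin\alpha\,\mathbf{i} + \cos\alpha\,\mathbf{j}) + \sin\beta\,\mathbf{K}\\
&= -\sin\alpha\cos\beta\,\mathbf{e}_x + \cos\alpha\cos\beta\,\mathbf{e}_y + \sin\beta\,\mathbf{e}_z,\\
\mathbf{e}_{z'} &= \mathbf{K}' = (-\sin\beta)\,\mathbf{J} + \cos\beta\,\mathbf{K} = -\sin\beta(-\sin\alpha\,\mathbf{i} + \cos\alpha\,\mathbf{j}) + \cos\beta\,\mathbf{K}\\
&= \sin\alpha\sin\beta\,\mathbf{e}_x - \cos\alpha\sin\beta\,\mathbf{e}_y + \cos\beta\,\mathbf{e}_z.
\end{aligned} \tag{13.45} \]
We thereby have inverted the relations (13.41a) and (13.41b). By means of the expressions for $\mathbf{e}_{x'}$, $\mathbf{e}_{y'}$, $\mathbf{e}_{z'}$, one can now easily check the triple scalar products $\mathbf{e}_z\cdot(\mathbf{e}_{z'}\times\mathbf{e}_{y'})$ and $\mathbf{e}_x\cdot(\mathbf{e}_{x'}\times\mathbf{e}_y)$, e.g.,
\[ \mathbf{e}_x\cdot(\mathbf{e}_{x'}\times\mathbf{e}_y) = \det\begin{pmatrix} 1 & 0 & 0\\ \cos\alpha & \sin\alpha & 0\\ 0 & 1 & 0 \end{pmatrix} = 0. \]
Similarly, one shows the vanishing of the other triple scalar product and thus confirms that the corresponding vectors are coplanar.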
The coordinate system thus follows the precession (with α̇) and the nutation (with β̇) of the top, but not its eigenrotation. To realize that β̇ describes the nutation, we note that a nutation motion of the figure axis is superimposed onto the precession (compare the discussion in the section “Elementary considerations on the heavy top”). This manifests itself for β in an up-and-down motion (vibration) about a fixed value β0 (see Fig. 13.34).
Fig. 13.34. Precession and nutation of the angular momentum: The figure axis points along $L_{z'}$
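For the angular velocities (13.42) of the $\mathbf{e}_{x'}$, $\mathbf{e}_{y'}$, $\mathbf{e}_{z'}$ system (which is only partly body-fixed) relative to the laboratory system $\mathbf{e}_x$, $\mathbf{e}_y$, $\mathbf{e}_z$, we have in this system (γ = 0)
\[ \begin{aligned}
\omega_1 &= \omega_{x'} = \dot\beta,\\
\omega_2 &= \omega_{y'} = \dot\alpha\sin\beta,\\
\omega_3 &= \omega_{z'} = \dot\alpha\cos\beta,
\end{aligned} \tag{13.46a} \]
or
\[ \boldsymbol{\omega} = \dot\beta\,\mathbf{e}_{x'} + \dot\alpha\sin\beta\,\mathbf{e}_{y'} + \dot\alpha\cos\beta\,\mathbf{e}_{z'}. \tag{13.46b} \]
The angular velocity of the top, on the other hand, is
\[ \begin{aligned}
\boldsymbol{\omega}_K &= \omega_{x'}\mathbf{e}_{x'} + \omega_{y'}\mathbf{e}_{y'} + (\omega_{z'} + \omega_0)\,\mathbf{e}_{z'}\\
&= \dot\beta\,\mathbf{e}_{x'} + \dot\alpha\sin\beta\,\mathbf{e}_{y'} + (\dot\alpha\cos\beta + \omega_0)\,\mathbf{e}_{z'}.
\end{aligned} \tag{13.47} \]
Here, $\omega_0$ is the additional angular velocity of the top relative to the $\mathbf{e}_{x'}$, $\mathbf{e}_{y'}$, $\mathbf{e}_{z'}$ system. The angular velocity $\omega_0(t)$ in general depends on the time. We must take care also when calculating the angular momentum, because in this particular $\mathbf{e}_{x'}$, $\mathbf{e}_{y'}$, $\mathbf{e}_{z'}$ system the rigid body still rotates with the angular velocity $\omega_0\mathbf{e}_{z'}$. We can call this additional rotation spin. It is due to the particular choice of our (not exactly body-fixed) system $\mathbf{e}_{x'}$, $\mathbf{e}_{y'}$, $\mathbf{e}_{z'}$. The angular momentum is then
\[ \begin{aligned}
\mathbf{L} &= \hat\Theta\cdot\boldsymbol{\omega}_K = \Theta_1\omega_1\mathbf{e}_{x'} + \Theta_2\omega_2\mathbf{e}_{y'} + \Theta_3(\omega_3 + \omega_0)\,\mathbf{e}_{z'}\\
&= \Theta_1\dot\beta\,\mathbf{e}_{x'} + \Theta_2\dot\alpha\sin\beta\,\mathbf{e}_{y'} + \Theta_3(\dot\alpha\cos\beta + \omega_0)\,\mathbf{e}_{z'}\\
&= \{L_{x'}, L_{y'}, L_{z'}\},
\end{aligned} \tag{13.48} \]
and the Euler equations read
\[ \dot{\mathbf{L}}\big|_{\text{lab}} = \dot{\mathbf{L}}\big|_{e'} + \boldsymbol{\omega}\times\mathbf{L} = \mathbf{D}. \tag{13.49} \]
Note that the spin rotation $\omega_0$ appears in L but not in ω!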
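The torque about the origin of the space-fixed system is
\[ \mathbf{D} = (l\,\mathbf{e}_{z'})\times(-mg\,\mathbf{e}_z) = mgl\sin\beta\,\mathbf{e}_{x'}, \]
because $\mathbf{e}_{z'}\times\mathbf{e}_z = -\sin\beta\,\mathbf{e}_{x'}$, as is easily seen from (13.45). By inserting this into the Euler equations (13.5) and (13.49) and noting that $\Theta_1 = \Theta_2$, we obtain
\[ \begin{aligned}
mgl\sin\beta &= \Theta_1\ddot\beta + (\Theta_3 - \Theta_1)\dot\alpha^2\sin\beta\cos\beta + \Theta_3\omega_0\dot\alpha\sin\beta,\\
0 &= \Theta_1(\ddot\alpha\sin\beta + \dot\alpha\dot\beta\cos\beta) + (\Theta_1 - \Theta_3)\dot\alpha\dot\beta\cos\beta - \Theta_3\omega_0\dot\beta,\\
0 &= \Theta_3(\ddot\alpha\cos\beta - \dot\alpha\dot\beta\sin\beta + \dot\omega_0).
\end{aligned} \tag{13.50} \]
From the above system of equations, α(t), β(t), and $\omega_0(t)$ can be determined. From the third equation, we have, because $\Theta_3 \neq 0$,
\[ \ddot\alpha\cos\beta - \dot\alpha\dot\beta\sin\beta + \dot\omega_0 = \frac{d}{dt}(\dot\alpha\cos\beta + \omega_0) = 0 \tag{13.51a} \]
or
\[ \dot\alpha\cos\beta + \omega_0 = A = \text{constant}, \tag{13.51b} \]
i.e., the angular momentum component $L_{z'} = \Theta_3 A$ (see (13.48)) about the figure axis is constant.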
We therefore set $\dot\alpha\cos\beta + \omega_0 = A$, calculate $\omega_0$ from this and insert it into the first two equations. We then obtain two coupled differential equations for precession (α) and nutation (β), respectively:
\[ mgl\sin\beta = \Theta_1\ddot\beta - \Theta_1\dot\alpha^2\sin\beta\cos\beta + \Theta_3 A\sin\beta\cdot\dot\alpha, \tag{13.51c} \]
\[ 0 = \Theta_1(\ddot\alpha\sin\beta + 2\dot\alpha\dot\beta\cos\beta) - \Theta_3 A\dot\beta. \tag{13.51d} \]
We first investigate this system for the case that the top performs no nutation. Then $\ddot\beta = \dot\beta = 0$, and β > 0. By insertion, we obtain
\[ mgl = -\Theta_1\dot\alpha^2\cos\beta + \Theta_3 A\dot\alpha, \qquad \ddot\alpha = 0. \tag{13.52} \]
The second equation means that the precession is stationary.
From the first equation, we determine the precession velocity α̇:
\[ \dot\alpha = \frac{\Theta_3 A}{2\Theta_1\cos\beta}\left(1 \pm \sqrt{1 - \frac{4mgl\,\Theta_1\cos\beta}{\Theta_3^2 A^2}}\right). \tag{13.53} \]
For a top rotating quickly about the $\mathbf{e}_{z'}$-axis, A becomes very large, and the fraction in the radicand becomes very small. We terminate the expansion of the root after the second term and get as solutions to first order for $\dot\alpha_{\text{small}}$:
\[ \dot\alpha_{\text{small}} = \frac{mgl}{\Theta_3 A}; \tag{13.54a} \]
to zeroth order for $\dot\alpha_{\text{large}}$, we have
\[ \dot\alpha_{\text{large}} = \frac{\Theta_3}{\Theta_1\cos\beta}\,A. \tag{13.54b} \]
A stationary precession without nutation (regular precession) occurs only if the heavy symmetric top is given a certain precession velocity ($\dot\alpha_{\text{small}}$ or $\dot\alpha_{\text{large}}$) by an impact. In the general case the precession is always coupled with a nutation. The heavy top will always begin its motion with a deviation toward the direction of the gravitational force, i.e., with a nutation. We further note that $\dot\alpha_{\text{small}}$ agrees with the precession frequency
\[ \omega_p = \frac{mgl}{L} = \frac{mgl}{\Theta_3\omega_0} \approx \frac{mgl}{\Theta_3 A} \]
obtained in the section “The heavy symmetric top: Elementary considerations.”
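To see how good the fast-top approximations (13.54a) and (13.54b) are, one can evaluate the exact roots (13.53) of the quadratic equation (13.52) and compare. The following is a minimal sketch; the parameter values are invented for illustration, and the function name `precession_rates` is not from the text.

```python
import math

def precession_rates(theta1, theta3, a_spin, mgl, beta):
    """Both roots of (13.52) for alpha_dot, written as in (13.53).

    Requires cos(beta) != 0 and a non-negative radicand.
    """
    radicand = 1.0 - 4.0 * mgl * theta1 * math.cos(beta) / (theta3**2 * a_spin**2)
    if radicand < 0.0:
        raise ValueError("spin too small for regular precession at this angle")
    prefactor = theta3 * a_spin / (2.0 * theta1 * math.cos(beta))
    root = math.sqrt(radicand)
    return prefactor * (1.0 - root), prefactor * (1.0 + root)

# Illustrative values only: a fast top, so that the radicand is close to 1
theta1, theta3, a_spin, mgl, beta = 1.0, 0.5, 200.0, 1.0, math.pi / 3

slow, fast = precession_rates(theta1, theta3, a_spin, mgl, beta)
slow_approx = mgl / (theta3 * a_spin)                      # (13.54a)
fast_approx = theta3 * a_spin / (theta1 * math.cos(beta))  # (13.54b)
print(slow, slow_approx)
print(fast, fast_approx)
```

For these values the exact and approximate rates differ only in the fifth significant digit; lowering `a_spin` makes the deviation grow, until the radicand becomes negative and no regular precession exists at that angle.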
Before we continue the discussion on the general motion of the top, we determine additional constants of motion. We have already seen that from the last equation of the system (13.50) we have
\[ \dot\alpha\cos\beta + \omega_0 = A = \text{constant}, \tag{13.51b} \]
hence, the corresponding part of the kinetic energy is
\[ T_3 = \frac{1}{2}\Theta_3(\dot\alpha\cos\beta + \omega_0)^2 = \frac{1}{2}\Theta_3 A^2 = \text{constant}. \tag{13.55} \]
Multiplying the first of the Euler equations (13.50) by β̇ and the second one by α̇ sin β, after addition one gets the total differential
\[ mgl\sin\beta\cdot\dot\beta = \Theta_1\ddot\beta\dot\beta + \Theta_1(\ddot\alpha\dot\alpha\sin^2\beta + \dot\alpha^2\dot\beta\sin\beta\cos\beta) \]
or
\[ \frac{d}{dt}(-mgl\cos\beta) = \frac{d}{dt}\left(\frac{1}{2}\Theta_1\dot\beta^2 + \frac{1}{2}\Theta_1\dot\alpha^2\sin^2\beta\right). \tag{13.56} \]
This means that the energy (more precisely, the sum of the kinetic parts $T_1 + T_2$ plus the potential energy)
\[ E' = \frac{1}{2}\Theta_1(\dot\beta^2 + \dot\alpha^2\sin^2\beta) + mgl\cos\beta \tag{13.57} \]
is also a constant of motion. This must be so, of course, since the total energy of the top must be constant.
The total energy of the top is then
\[ E = E' + T_3 = \frac{1}{2}\Theta_1(\dot\beta^2 + \dot\alpha^2\sin^2\beta) + \frac{1}{2}\Theta_3(\dot\alpha\cos\beta + \omega_0)^2 + mgl\cos\beta. \tag{13.58} \]
The last term obviously describes the potential energy of the top in the gravitational field. The energy law (13.58) must of course hold in general. We could have written it down immediately and skipped the derivation (13.56) from the Euler equations. Nevertheless, it is instructive to see that it also follows from the equations of motion.
In the second Euler equation (13.51d), we insert
\[ L_{z'} = \Theta_3 A = \text{constant} \tag{13.59} \]
and multiply by sin β. This yields
\[ \Theta_1(\ddot\alpha\sin^2\beta + 2\dot\alpha\dot\beta\sin\beta\cos\beta) - L_{z'}\sin\beta\cdot\dot\beta = 0. \tag{13.60} \]
Since $L_{z'}$ is constant, this is a total differential, and then it follows that
\[ \Theta_1(\dot\alpha\sin^2\beta) + L_{z'}\cos\beta = \text{constant}. \tag{13.61} \]
This constant is the z-component of the angular momentum in the space-fixed system. This is seen immediately if we multiply the angular momentum
\[ \mathbf{L} = \Theta_1(\omega_{x'}\mathbf{e}_{x'} + \omega_{y'}\mathbf{e}_{y'}) + L_{z'}\mathbf{e}_{z'} \tag{13.62} \]
by $\mathbf{e}_z$. From (13.45) or from Fig. 13.33, we see that
\[ \mathbf{e}_{x'}\cdot\mathbf{e}_z = 0, \qquad \mathbf{e}_{y'}\cdot\mathbf{e}_z = \sin\beta, \qquad \mathbf{e}_{z'}\cdot\mathbf{e}_z = \cos\beta. \tag{13.63} \]
By noting that $\omega_{y'} = \dot\alpha\sin\beta$, we see that
\[ \mathbf{L}\cdot\mathbf{e}_z = L_z = \Theta_1\dot\alpha\sin^2\beta + L_{z'}\cos\beta = \text{constant}. \tag{13.64} \]
The two angular momentum components $L_z$ and $L_{z'}$ are constant, because the moment of the gravitational force acts only in the $\mathbf{e}_{x'}$-direction, i.e., perpendicular both to the z- and to the z′-axis. The conditions $L_z$ = constant and $L_{z'}$ = constant can be realized by a precession of L about the z-axis, and an additional rotation of the z′-axis about the L-axis. The latter motion is the nutation. This obviously means that the angular momentum L precesses about the laboratory axis $\mathbf{e}_z$, and the figure axis $\mathbf{e}_{z'}$ simultaneously performs a nutation about the angular momentum L.
With the constants of motion we will now further discuss the motion of the top. From the equation of the angular momentum component $L_z$ in the laboratory system,
\[ \Theta_1\dot\alpha\sin^2\beta + L_{z'}\cos\beta = L_z, \tag{13.65} \]
we determine α̇:
\[ \dot\alpha = \frac{L_z - L_{z'}\cos\beta}{\Theta_1\sin^2\beta}, \tag{13.66} \]
and insert this into (13.58):
\[ \frac{1}{2}\Theta_1\dot\beta^2 + \frac{(L_z - L_{z'}\cos\beta)^2}{2\Theta_1\sin^2\beta} + T_3 + mgl\cos\beta = E. \]
Since $L_z$, $L_{z'}$, $T_3$, and E are constants of motion, this is a differential equation for the nutation β(t). We now substitute
\[ u = \cos\beta, \tag{13.67} \]
then $\dot u = -\sin\beta\cdot\dot\beta$ and $\sin^2\beta = 1 - u^2$. From this, we get
\[ \frac{1}{2}\Theta_1\frac{\dot u^2}{1 - u^2} + \frac{(L_z - L_{z'}u)^2}{2\Theta_1(1 - u^2)} + mglu = E - T_3 \tag{13.68} \]
or
\[ \dot u^2 + \frac{(L_z - L_{z'}u)^2}{\Theta_1^2} + \frac{2mglu(1 - u^2)}{\Theta_1} = \frac{2(1 - u^2)}{\Theta_1}(E - T_3). \tag{13.69} \]
With the abbreviations
\[ \varepsilon = 2\,\frac{E - T_3}{\Theta_1}, \qquad \xi = \frac{2mgl}{\Theta_1}, \qquad \gamma = \frac{L_z}{\Theta_1}, \qquad \delta = \frac{L_{z'}}{\Theta_1}, \tag{13.70} \]
this equation can be written as follows:
\[ \dot u^2 = (\varepsilon - \xi u)(1 - u^2) - (\gamma - \delta u)^2. \tag{13.71} \]
It cannot be solved by elementary methods. We therefore give the graphical representation of the functional dependence. In the following we use the abbreviation $\dot u^2 = f(u)$. For large u the leading term is $u^3$, i.e., the curve approaches $f(u) = \xi u^3$. For f(1) and f(−1), we have
\[ f(1) = -(\gamma - \delta)^2 \le 0, \qquad f(-1) = -(\gamma + \delta)^2 < 0. \tag{13.72} \]
From this, we obtain Fig. 13.35.
Fig. 13.35. Qualitative trend of the function f(u)
In general, the function f(u) has three zeros. Because of its asymptotic behavior for large, positive u and because f(1) < 0, for one zero we have $u_3 > 1$.
For the motion of the top, we must have $\dot u^2 \ge 0$. Since 0 ≤ β ≤ π/2, we have 0 ≤ u ≤ 1. To ensure that the top moves at all in the physically relevant region 0 ≤ u ≤ 1, in a certain interval of this region we must have $\dot u^2 = f(u) > 0$. Hence, for physical reasons two physically interesting zeros $u_1$, $u_2$ must exist between zero and unity. Therefore in the general case there are two corresponding angles $\beta_1$ and $\beta_2$ with
\[ \cos\beta_1 = u_1 \quad\text{and}\quad \cos\beta_2 = u_2. \tag{13.73} \]
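Since (13.71) cannot be solved in closed form, the turning points are in practice found numerically. The following sketch locates the zeros of f(u) between −1 and 1 by a grid scan followed by bisection. The parameter values are made up purely for illustration (they are not taken from the text), and the function names are hypothetical.

```python
import math

def f(u, eps, xi, gam, delta):
    """Right-hand side of (13.71): f(u) = (eps - xi*u)(1 - u^2) - (gam - delta*u)^2."""
    return (eps - xi * u) * (1.0 - u * u) - (gam - delta * u) ** 2

def turning_points(eps, xi, gam, delta, n_grid=2000):
    """Zeros of f in (-1, 1), found by sign changes on a grid plus bisection.

    A zero of even multiplicity (u1 = u2) produces no sign change and is missed.
    """
    zeros = []
    for k in range(n_grid):
        a = -1.0 + 2.0 * k / n_grid
        b = -1.0 + 2.0 * (k + 1) / n_grid
        fa, fb = f(a, eps, xi, gam, delta), f(b, eps, xi, gam, delta)
        if fa * fb < 0.0:
            for _ in range(60):  # bisection down to rounding accuracy
                mid = 0.5 * (a + b)
                fm = f(mid, eps, xi, gam, delta)
                if fa * fm <= 0.0:
                    b = mid
                else:
                    a, fa = mid, fm
            zeros.append(0.5 * (a + b))
    return zeros

# Illustrative parameters, chosen so that f(u) > 0 somewhere in 0 < u < 1
eps, xi, gam, delta = 1.0, 0.5, 1.5, 3.0
u1, u2 = turning_points(eps, xi, gam, delta)
beta_low, beta_high = math.acos(u2), math.acos(u1)  # nutation limits in radians
print(u1, u2, beta_low, beta_high)
```

With these numbers the figure axis nutates between the two angles printed at the end; the third zero of the cubic lies at u > 1 and is therefore not found by the scan, in line with the discussion above.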
In special cases, we can have (1) $u_1 = u_2$ and (2) $u_1 = u_2 = 1$. We first consider these special cases:
(1) $u_1 = u_2 \neq 1$: The tip of the figure axis orbits on a circle (this is called stationary precession); no nutation occurs (the angle β has a fixed value). According to (13.66) the precession velocity reads
\[ \dot\alpha = \frac{\gamma - \delta u}{1 - u^2} \tag{13.74} \]
and is constant.
(2) $u_1 = u_2 = 1$: In this case, the figure axis points vertically upward. The top performs neither nutation nor precession motion (sleeping top). This is obviously a special case of the stationary precession (compare Exercise 13.8).
In the general case ($u_1 \neq u_2$), a nutation of the top is superimposed on the precession between the angles $\beta_1$ and $\beta_2$. According to the angular momentum law (13.65), (13.66) for the precession velocity, we have
\[ \dot\alpha = \frac{L_z - L_{z'}\cos\beta}{\Theta_1\sin^2\beta} = \frac{\gamma - \delta u}{1 - u^2}. \tag{13.75} \]
The zeros of this equation, i.e., the solution of α̇(u) = 0, specify those angles β at which the precession velocity α̇ momentarily vanishes. In order to illustrate the gyroscope motion, we give the curve described by the intersection point of the figure axis on a sphere centered about the bearing point.
There are three different types of motion, as illustrated in Figs. 13.36, 13.37, and 13.38.
(1) $\gamma/\delta = u_2$: The precession velocity just vanishes at $\beta_2$; hence, the peaks appear.
(2) $u_1 < \gamma/\delta < u_2$: The upper peaks at $\beta_2$ extend to loops. The precession velocity vanishes between $\beta_2$ and $\beta_1$.
(3) $\gamma/\delta > u_2$: The precession velocity would vanish beyond $\beta_2$ (as indicated in Fig. 13.38). A peak cannot arise.
Fig. 13.36. $\gamma/\delta = u_2$
Fig. 13.37. $u_1 < \gamma/\delta < u_2$
Fig. 13.38. $\gamma/\delta > u_2$

EXAMPLE
13.10 The Sleeping Top
In the case of the so-called “sleeping top,” the figure axis points up vertically, so that neither nutation nor precession occurs. For this special case, we must have β = 0 and β̇ = 0.
From energy conservation, we obtain
\[ \frac{1}{2}\Theta_1(\dot\beta^2 + \dot\alpha^2\sin^2\beta) + \frac{1}{2}\Theta_3 A^2 + mgl\cos\beta = E, \tag{13.76} \]
and because
\[ \beta = 0, \qquad \dot\beta = 0, \]
it follows that
\[ \Theta_3 A^2 = 2(E - mgl). \tag{13.77} \]
Constancy of the z-component of the angular momentum yields
\[ \Theta_1\dot\alpha\sin^2\beta + \Theta_3 A\cos\beta = \text{constant} = K, \tag{13.78} \]
from which follows ($A = \omega_3 + \omega_0$ = constant)
\[ \Theta_3 A = K. \tag{13.79} \]
For the quantities ε, ξ, γ, and δ in the differential equation for the nutation motion in u = cos β, we have
\[ \begin{aligned}
\varepsilon &= \frac{2\left(E - \tfrac{1}{2}\Theta_3 A^2\right)}{\Theta_1} = \frac{2mgl}{\Theta_1},\\
\xi &= \frac{2mgl}{\Theta_1},\\
\gamma &= \frac{K}{\Theta_1} = \frac{\Theta_3 A}{\Theta_1},\\
\delta &= \frac{\Theta_3 A}{\Theta_1},\\
&\Rightarrow\quad \varepsilon = \xi \quad\text{and}\quad \gamma = \delta.
\end{aligned} \tag{13.80} \]
Inserting this into the differential equation (13.71) for u, one obtains
\[ \dot u^2 = f(u) = \varepsilon(1 - u)(1 - u)(1 + u) - \gamma^2(1 - u)^2 \tag{13.81} \]
\[ \Rightarrow\quad f(u) = (1 - u)^2\left[\varepsilon(1 + u) - \gamma^2\right]. \tag{13.82} \]
Equation (13.82) has a twofold zero, which will be denoted by $u_2 = u_3 = 1$ or $u_1 = u_2 = 1$ (compare Fig. 13.39). The third zero is at
\[ \bar u = \frac{\gamma^2}{\varepsilon} - 1 = \frac{\Theta_3^2 A^2}{\Theta_1\,2mgl} - 1. \tag{13.83} \]
Accordingly, f(u) has one of the two courses (see Fig. 13.39).
Fig. 13.39.
For Fig. 13.39(a), β̇ actually vanishes since f(u) has a zero. Thus, we can have β = constant ≠ 0, i.e., the case of stationary precession. But since we also require β = 0 for the “sleeping top,” only Fig. 13.39(b) is left over where β ≠ 0 does not exist as a solution ($u_1 \ge 1$).
Hence, from (13.83) we obtain as condition for the “sleeping top”:
\[ \frac{\Theta_3^2 A^2}{\Theta_1\,2mgl} - 1 \ge 1 \quad\Leftrightarrow\quad A^2 \ge \frac{4mgl\,\Theta_1}{\Theta_3^2}. \tag{13.84} \]
Equation (13.84) will be satisfied only in the initial phase of the gyroscope motion. Because of friction, $A^2 = (\omega_3 + \omega_0)^2$ decreases, so that
\[ A^2 < \frac{4mgl\,\Theta_1}{\Theta_3^2} \]
and therefore one observes precession with overlaid nutation. Further energy loss inevitably causes the top to tilt down.
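The condition (13.84) is easy to evaluate for a concrete top. The sketch below computes the critical value of A and the third zero (13.83); the numerical values and function names are assumptions made for illustration only.

```python
import math

def third_zero(theta1, theta3, a_spin, mgl):
    """Third zero of f(u) for the upright top, eq. (13.83)."""
    return theta3**2 * a_spin**2 / (2.0 * theta1 * mgl) - 1.0

def critical_spin(theta1, theta3, mgl):
    """Smallest A satisfying (13.84): A^2 = 4*mgl*Theta_1 / Theta_3^2."""
    return 2.0 * math.sqrt(mgl * theta1) / theta3

def is_sleeping(theta1, theta3, a_spin, mgl):
    """True if the upright position satisfies the sleeping-top condition (13.84)."""
    return a_spin**2 >= 4.0 * mgl * theta1 / theta3**2

# Illustrative values only
theta1, theta3, mgl = 1.0, 0.5, 1.0
a_crit = critical_spin(theta1, theta3, mgl)
print(a_crit, third_zero(theta1, theta3, a_crit, mgl))
print(is_sleeping(theta1, theta3, 1.5 * a_crit, mgl),
      is_sleeping(theta1, theta3, 0.5 * a_crit, mgl))
```

At the critical spin the third zero coincides with u = 1; above it the top sleeps, and below it (for instance after friction has reduced A) the upright position no longer satisfies (13.84).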
EXAMPLE
13.11 The Heavy Symmetric Top
(a) Write the total energy of the top as a function of the Euler angles.
(b) Determine the constants of motion, and use them to eliminate the Euler angles α and γ from the energy law. Propose approaches for solving the resulting one-dimensional differential equation:
\[ E = \frac{1}{2}\Theta_1\dot\beta(t)^2 + V_{\text{eff}}(\beta). \tag{13.85} \]
(c) Discuss the effective potential $V_{\text{eff}}(\beta)$, and solve the differential equation of the heavy top for infinitesimal displacements from the stable position in the minimum of the potential:
\[ \beta(t) = \beta_0 + \eta(t). \]
Fig. 13.40.
We consider the symmetric top bound to a fixed point in the gravitational field. The energy law reads
\[ E = T + V, \tag{13.86} \]
where
\[ T = \frac{1}{2}\sum_i \Theta_i\omega_i^2 \quad\text{and}\quad V = M\cdot g\cdot h. \tag{13.87} \]
M is the mass of the top, and h = l cos β is the distance between the center of gravity and the bearing plane.
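In order to express the angular velocity $\boldsymbol{\omega} = (\omega_{x'}, \omega_{y'}, \omega_{z'})$ by the Euler angles and their time derivatives, we note that α̇, β̇, γ̇ are rotation velocities by themselves. The rotation velocity of the body is obtained as the vector sum
\[ \boldsymbol{\omega} = \boldsymbol{\omega}_\alpha + \boldsymbol{\omega}_\beta + \boldsymbol{\omega}_\gamma = \dot\alpha\,\mathbf{e}_\alpha + \dot\beta\,\mathbf{e}_\beta + \dot\gamma\,\mathbf{e}_\gamma. \tag{13.88} \]
The vectors $\mathbf{e}_\gamma$, $\mathbf{e}_\alpha$, $\mathbf{e}_\beta$ follow from the definition of the Euler angles; their components are given in the body-fixed system:
γ: Rotation about the new (body-fixed) z′-axis:
\[ \mathbf{e}_\gamma = \mathbf{e}_{z'} = (0, 0, 1). \tag{13.89} \]
α: Rotation about the space-fixed z-axis:
\[ \mathbf{e}_\alpha = \mathbf{e}_z = (\sin\beta\sin\gamma, \sin\beta\cos\gamma, \cos\beta). \tag{13.90} \]
β: Rotation about the nodal line (the X-axis):
\[ \mathbf{e}_\beta = \mathbf{e}_X = (\cos\gamma, -\sin\gamma, 0). \tag{13.91} \]
One thus obtains the components of the rotation velocity in the body-fixed coordinate system
\[ \begin{aligned}
\omega_{x'} &= \omega_1 = \dot\alpha\sin\beta\sin\gamma + \dot\beta\cos\gamma,\\
\omega_{y'} &= \omega_2 = \dot\alpha\sin\beta\cos\gamma - \dot\beta\sin\gamma,\\
\omega_{z'} &= \omega_3 = \dot\alpha\cos\beta + \dot\gamma.
\end{aligned} \tag{13.92} \]
By inserting this into (13.87) and using $\Theta_1 = \Theta_2$, we obtain for the kinetic energy
\[ T = \frac{1}{2}\Theta_1(\dot\alpha^2\sin^2\beta + \dot\beta^2) + \frac{1}{2}\Theta_3(\dot\alpha\cos\beta + \dot\gamma)^2. \tag{13.93} \]
Since the gravitational force acts only along the z-direction, the torque acts only along the nodal line $\mathbf{e}_\beta$:
\[ \mathbf{D} = \mathbf{r}\times\mathbf{F} = -Mgl\,\mathbf{e}_{z'}\times\mathbf{e}_z = Mgl\sin\beta\,\mathbf{e}_\beta. \]
Thus, the angular momentum components in the $\mathbf{e}_z$, $\mathbf{e}_{z'}$-plane remain unchanged. $L_{z'}$ and $L_z$ are constants of motion:
\[ \begin{aligned}
L_{z'} &= \Theta_3\omega_3 = \Theta_3(\dot\alpha\cos\beta + \dot\gamma) = \text{constant},\\
L_z &= \mathbf{L}\cdot\mathbf{e}_z = \text{constant}.
\end{aligned} \tag{13.94} \]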
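We evaluate the scalar product in the body-fixed coordinates:
\[ \begin{aligned}
L_z &= \mathbf{L}\cdot\mathbf{e}_z\\
&= \Theta_1(\dot\alpha\sin\beta\sin\gamma + \dot\beta\cos\gamma)(\sin\gamma\sin\beta) + \Theta_1(\dot\alpha\sin\beta\cos\gamma - \dot\beta\sin\gamma)(\cos\gamma\sin\beta) + \Theta_3(\dot\alpha\cos\beta + \dot\gamma)(\cos\beta)\\
&= \Theta_1(\dot\alpha\sin^2\beta) + \Theta_3\cos\beta(\dot\alpha\cos\beta + \dot\gamma).
\end{aligned} \tag{13.95} \]
Here, we utilized (13.92). Equations (13.94) and (13.95) can be inverted, i.e., solved for γ̇ and α̇:
\[ \dot\alpha = \frac{L_z - L_{z'}\cos\beta}{\Theta_1\sin^2\beta}, \tag{13.96} \]
\[ \dot\gamma = L_{z'}\left(\frac{1}{\Theta_3} + \frac{\cot^2\beta}{\Theta_1}\right) - \frac{L_z\cos\beta}{\Theta_1\sin^2\beta}. \tag{13.97} \]
We insert the relations (13.96) and (13.97) obtained this way into the expression for the kinetic energy (13.93) and obtain
\[ \begin{aligned}
E &= \frac{1}{2}\Theta_1\dot\beta^2 + \frac{1}{2}\Theta_1\left(\frac{L_z - L_{z'}\cos\beta}{\Theta_1\sin^2\beta}\right)^2\sin^2\beta + \frac{1}{2}\Theta_3(\dot\alpha^2\cos^2\beta + \dot\gamma^2 + 2\dot\alpha\dot\gamma\cos\beta) + Mgl\cos\beta\\
&= \frac{1}{2}\Theta_1\dot\beta^2 + \frac{(L_z - L_{z'}\cos\beta)^2}{2\Theta_1\sin^2\beta} + \frac{1}{2}\frac{L_{z'}^2}{\Theta_3} + Mgl\cos\beta\\
&= \frac{1}{2}\Theta_1\dot\beta^2 + V_{\text{eff}}(\beta).
\end{aligned} \tag{13.98} \]
We used the constancy of $L_z$ and $L_{z'}$ to eliminate the two Euler angles α and γ. This simplified the problem greatly. From the energy law (13.98) we can in principle determine β(t) and then obtain α(t) and γ(t) via (13.96) and (13.97).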
To proceed further, various possibilities offer themselves:
(a) We can establish the equation of motion for β(t):
\[ \frac{dE}{dt} = 0 = \Theta_1\dot\beta\ddot\beta + \frac{\partial V_{\text{eff}}(\beta)}{\partial\beta}\dot\beta. \tag{13.99} \]
Hence, energy conservation leads to the equation of motion:
\[ \begin{aligned}
\Theta_1\ddot\beta &= -\frac{\partial}{\partial\beta}V_{\text{eff}}(\beta)\\
&= (L_z - L_{z'}\cos\beta)^2\frac{\cos\beta}{\Theta_1\sin^3\beta} - (L_z - L_{z'}\cos\beta)\frac{L_{z'}}{\Theta_1\sin\beta} + Mgl\sin\beta\\
&= (L_z^2 - 2L_zL_{z'}\cos\beta + L_{z'}^2)\frac{\cos\beta}{\Theta_1\sin^3\beta} - \frac{L_zL_{z'}}{\Theta_1\sin\beta} + Mgl\sin\beta.
\end{aligned} \tag{13.100} \]
This is a one-dimensional differential equation for β, although highly nonlinear. For a given solution β(t), α(t) and γ(t) can be found by integration of (13.96) and (13.97).
(b) Another approach in principle is to solve the differential equation (13.98) by separation of variables and integration:
\[ \frac{d\beta}{dt} = \sqrt{\frac{2}{\Theta_1}\bigl(E - V_{\text{eff}}(\beta)\bigr)}, \qquad t - t_0 = \int_{\beta_0}^{\beta}\frac{d\beta'}{\sqrt{(2/\Theta_1)\bigl(E - V_{\text{eff}}(\beta')\bigr)}}. \tag{13.101} \]
Thus, the time dependence of β can be determined by integration.
Since (a) and (b) are equally complicated, we restrict ourselves to a discussion of the effective potential, in order to understand the essentials of gyroscopic motion.
Discussion of the effective potential
\[ V_{\text{eff}} = \frac{(L_z - L_{z'}\cos\beta)^2}{2\Theta_1\sin^2\beta} + \frac{1}{2}\frac{L_{z'}^2}{\Theta_3} + Mgl\cos\beta. \tag{13.102} \]
The effective potential is composed of three terms which we will discuss separately.
Fig. 13.41. The effective potential is composed of three terms
Spin term
\[ \frac{1}{2}\frac{L_{z'}^2}{\Theta_3}. \tag{13.103} \]
The second term is constant and is due to the energy of the eigenrotation of the top about its figure axis. It shifts the zero point of the energy scale and is independent of β.
Angular momentum barrier
\[ \frac{(L_z - L_{z'}\cos\beta)^2}{2\Theta_1\sin^2\beta}. \tag{13.104} \]
The first term is understood by analogy to the $l^2/2mr^2$ angular momentum term, which appeared in the effective potential when treating the central force problem. It is positive and vanishes for $L_z/L_{z'} = \cos\beta$. Then β is only a physically meaningful angle if $L_z < L_{z'}$. This is in general fulfilled for a top with fast eigenrotation. For β = 0, β = π, the first term diverges because of the factor $\sin^2\beta$ in the denominator. As a consequence, the term then has a minimum which lies at $\beta = \arccos(L_z/L_{z'}) < \pi/2$. For β → 0 and β → π the potential rises steeply.
Gravitation term
\[ Mgl\cos\beta. \tag{13.105} \]
The last contribution is caused by the gravitational potential. It is antisymmetric about the center π/2 and shifts the minimum of the effective potential to the right side of $\arccos(L_z/L_{z'})$ without changing its qualitative form.
For a given energy E ($> L_{z'}^2/2\Theta_3$), the motion is restricted to the region $E > V_{\text{eff}}(\beta)$ with reversal points $\beta_\pm$; these points are defined by $E = V_{\text{eff}}(\beta_\pm)$.
For a more precise analysis of the motion, we determine the stationary solution for which β = β0 = constant. It is located exactly in the minimum of the effective potential, so that the reversal points $\beta_\pm$ coincide, and $E = V_{\text{eff}}(\beta_0)$. β0 is then determined from the minimum property:
\[ \left.\frac{\partial V_{\text{eff}}}{\partial\beta}\right|_{\beta=\beta_0} = 0. \tag{13.106} \]
Hence, (13.100) leads to
\[ L_z^2 - 2L_zL_{z'}\cos\beta_0 + L_{z'}^2 = \frac{L_zL_{z'}\sin^2\beta_0}{\cos\beta_0} - \frac{\Theta_1 Mgl\sin^4\beta_0}{\cos\beta_0}. \tag{13.107} \]
This equation fixes β0 for given values of $L_z$ and $L_{z'}$.
Equations (13.96), (13.97) imply for α̇ and γ̇ constant values $\dot\alpha_0$ and $\dot\gamma_0$. Hence, in dynamic equilibrium the top performs a constant rotation about its own axis, $\gamma(t) = \dot\gamma_0 t$, at fixed angle β0, as well as a precession motion $\alpha(t) = \dot\alpha_0 t$ with constant precession frequency $\dot\alpha_0$.
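Small oscillations about the dynamic equilibrium position
In order to investigate the motion in the vicinity of β0, we consider small displacements from the equilibrium. Instead of explicitly solving (13.100), we write
\[ \beta(t) = \beta_0 + \eta(t) \tag{13.108} \]
with an infinitesimal displacement η(t). We expand the potential into a Taylor series
\[ V_{\text{eff}}(\beta) = V_{\text{eff}}(\beta_0) + \eta\left.\frac{\partial V_{\text{eff}}}{\partial\beta}\right|_{\beta=\beta_0} + \frac{1}{2}\eta^2\left.\frac{\partial^2 V_{\text{eff}}}{\partial\beta^2}\right|_{\beta=\beta_0} + \cdots. \tag{13.109} \]
The linear term vanishes by construction, and the quadratic term follows by differentiation of the negative right side of (13.100):
\[ \frac{\partial^2 V_{\text{eff}}}{\partial\beta^2} = -Mgl\cos\beta - \frac{3L_zL_{z'}\cos\beta}{\Theta_1\sin^2\beta} + (L_z^2 - 2L_zL_{z'}\cos\beta + L_{z'}^2)\frac{3 - 2\sin^2\beta}{\Theta_1\sin^4\beta}. \tag{13.110} \]
By inserting (13.107) for the last term, one obtains
\[ \left.\frac{\partial^2 V_{\text{eff}}}{\partial\beta^2}\right|_{\beta=\beta_0} = \frac{L_zL_{z'} - \Theta_1 Mgl(4 - 3\sin^2\beta_0)}{\Theta_1\cos\beta_0}. \tag{13.111} \]
For the total energy, one then obtains likewise
\[ E = \frac{1}{2}\Theta_1\dot\eta^2 + \frac{1}{2}\eta^2\left.\frac{\partial^2 V_{\text{eff}}}{\partial\beta^2}\right|_{\beta=\beta_0} + V_{\text{eff}}(\beta_0). \tag{13.112} \]
Differentiation with respect to the time finally leads to the differential equation of the harmonic oscillator:
\[ \ddot\eta + \Omega^2\eta = 0 \tag{13.113} \]
with
\[ \Omega^2 = \frac{1}{\Theta_1}\left.\frac{\partial^2 V_{\text{eff}}}{\partial\beta^2}\right|_{\beta=\beta_0} = \frac{L_zL_{z'} - \Theta_1 Mgl(4 - 3\sin^2\beta_0)}{\Theta_1^2\cos\beta_0}. \tag{13.114} \]
The corresponding motion
\[ \beta(t) = \beta_0 + \eta_0\cos(\Omega t + \varphi_0) \tag{13.115} \]
is stable if $\Omega^2 > 0$. Obviously the product $L_zL_{z'}$ must be sufficiently large to ensure stable vibrations.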
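The closed form (13.114) can be cross-checked against a direct numerical treatment of the effective potential (13.102). The sketch below finds the minimum β0 by bisection on a finite-difference slope, evaluates the curvature by a second difference, and compares it with Θ1Ω². All parameter values, step sizes, and function names are assumptions made for this illustration; the bisection assumes a single minimum in the search interval.

```python
import math

def v_eff(beta, l_z, l_zp, theta1, theta3, mgl):
    """Effective potential (13.102); l_zp stands for the body-axis component L_z'."""
    barrier = (l_z - l_zp * math.cos(beta)) ** 2 / (2.0 * theta1 * math.sin(beta) ** 2)
    spin = l_zp ** 2 / (2.0 * theta3)
    return barrier + spin + mgl * math.cos(beta)

def equilibrium_angle(l_z, l_zp, theta1, theta3, mgl, lo=0.05, hi=3.0, h=1e-6):
    """Bisect on the numerical slope of V_eff (assumes one minimum in [lo, hi])."""
    def slope(b):
        return (v_eff(b + h, l_z, l_zp, theta1, theta3, mgl)
                - v_eff(b - h, l_z, l_zp, theta1, theta3, mgl)) / (2.0 * h)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if slope(mid) < 0.0:
            lo = mid   # potential still falling: minimum lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

def omega_squared(beta0, l_z, l_zp, theta1, mgl):
    """Nutation frequency squared from (13.114)."""
    return ((l_z * l_zp - theta1 * mgl * (4.0 - 3.0 * math.sin(beta0) ** 2))
            / (theta1 ** 2 * math.cos(beta0)))

# Illustrative values only
l_z, l_zp, theta1, theta3, mgl = 3.0, 5.0, 1.0, 0.5, 1.0
beta0 = equilibrium_angle(l_z, l_zp, theta1, theta3, mgl)

h2 = 1e-4  # step for the second difference
curvature = (v_eff(beta0 + h2, l_z, l_zp, theta1, theta3, mgl)
             - 2.0 * v_eff(beta0, l_z, l_zp, theta1, theta3, mgl)
             + v_eff(beta0 - h2, l_z, l_zp, theta1, theta3, mgl)) / h2 ** 2
omega2 = omega_squared(beta0, l_z, l_zp, theta1, mgl)
print(beta0, curvature, theta1 * omega2)
```

For these values the minimum lies slightly to the right of arccos(L_z/L_z′), as stated in the discussion of the gravitation term, and the numerical curvature agrees with Θ1Ω² from (13.114) to within the finite-difference error.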
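Precession and nutation
We insert the explicit solution of (13.113) into (13.96) and (13.97), and expand with respect to η(t):
\[ \dot\alpha(t) \approx \frac{L_z - L_{z'}\cos\beta_0}{\Theta_1\sin^2\beta_0} + \eta(t)\left.\frac{\partial}{\partial\beta}\left(\frac{L_z - L_{z'}\cos\beta}{\Theta_1\sin^2\beta}\right)\right|_{\beta=\beta_0} + \cdots \equiv \dot\alpha_0 + \eta(t)\,\dot\alpha_1, \tag{13.116} \]
\[ \dot\gamma(t) \approx \dot\gamma_0 + \eta(t)\,\dot\gamma_1, \tag{13.117} \]
where $\dot\alpha_0$, $\dot\alpha_1$, $\dot\gamma_0$, $\dot\gamma_1$ are constants which depend on $L_z$, $L_{z'}$, and E (through β0). For a qualitative investigation of the superposition of nutation (β(t)) and precession (α(t)), we start from (13.116):
\[ \dot\alpha(t) \approx \dot\alpha_0 + \eta_0\dot\alpha_1\cos(\Omega t + \varphi_0) = \dot\alpha_1\eta_0\left[\frac{\dot\alpha_0}{\dot\alpha_1\eta_0} + \cos(\Omega t + \varphi_0)\right]. \tag{13.118} \]
For $\dot\alpha_0/(\dot\alpha_1\eta_0) > 1$, α̇ always remains larger than zero (Fig. 13.42(a)). For $\dot\alpha_0/(\dot\alpha_1\eta_0) = 1$, the precession frequency may become equal to zero (Fig. 13.42(b)). For $\dot\alpha_0/(\dot\alpha_1\eta_0) < 1$, we have a backward motion in parts (Fig. 13.42(c)).
Fig. 13.42.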
EXERCISE
13.12 Stable and Unstable Rotations of the Asymmetric Top
Problem. Use the Euler equations to show that for an asymmetric top the rotations about the axes of the largest and smallest moment of inertia are stable; however, the rotation about the axis of the intermediate moment of inertia is unstable.
Solution. We start from the Euler equations for the free top:
\[ \dot\omega_1 = \frac{\Theta_2 - \Theta_3}{\Theta_1}\,\omega_2\omega_3, \tag{13.119} \]
\[ \dot\omega_2 = \frac{\Theta_3 - \Theta_1}{\Theta_2}\,\omega_1\omega_3, \tag{13.120} \]
\[ \dot\omega_3 = \frac{\Theta_1 - \Theta_2}{\Theta_3}\,\omega_1\omega_2. \tag{13.121} \]
Let the top rotate about the body-fixed z-axis, i.e., $\omega_3 = \omega_0$ = constant and $\omega_1 = \omega_2 = 0$. To investigate the stability of the rotation about this principal axis, we tilt the rotation axis by a small amount, so that new components $\delta\omega_1$, $\delta\omega_2$ and an additional $\delta\omega_3$ arise. For $\delta\dot\omega_3$, we have from the Euler equation
\[ \delta\dot\omega_3 = \frac{\Theta_1 - \Theta_2}{\Theta_3}\,\delta\omega_1\,\delta\omega_2 \approx 0. \tag{13.122} \]
Neglecting quadratically small terms, we can set $\omega_3 = \omega_0$. From the other two Euler equations, we then obtain
\[ \delta\dot\omega_1 + \frac{\Theta_3 - \Theta_2}{\Theta_1}\,\delta\omega_2\,\omega_0 = 0, \tag{13.123} \]
\[ \delta\dot\omega_2 + \frac{\Theta_1 - \Theta_3}{\Theta_2}\,\delta\omega_1\,\omega_0 = 0. \tag{13.124} \]
To solve this coupled system, we use the ansatz
\[ \delta\omega_1 = Ae^{\lambda t}, \qquad \delta\omega_2 = Be^{\lambda t}. \tag{13.125} \]
This leads to a linear set of equations in A and B, where the determinant must vanish for nontrivial solutions:
\[ \begin{vmatrix} \lambda & \dfrac{\Theta_3 - \Theta_2}{\Theta_1}\,\omega_0\\[2mm] \dfrac{\Theta_1 - \Theta_3}{\Theta_2}\,\omega_0 & \lambda \end{vmatrix} = 0. \tag{13.126} \]
From this, we find the characteristic equation
\[ \lambda^2 = \frac{(\Theta_3 - \Theta_2)(\Theta_1 - \Theta_3)}{\Theta_1\Theta_2}\,\omega_0^2. \tag{13.127} \]
For the rotation about the axis of the smallest moment of inertia, $\Theta_3 < \Theta_1, \Theta_2$, and for the rotation about the axis of the largest moment of inertia, $\Theta_3 > \Theta_1, \Theta_2$, equation (13.127) leads to a purely imaginary λ:
\[ \lambda^2 < 0, \tag{13.128} \]
and therefore to vibration solutions for $\delta\omega_1$ and $\delta\omega_2$. The rotation about the axis of the largest and smallest moment of inertia, respectively, is therefore stable.
The rotation about the axis of the intermediate moment of inertia
\[ \Theta_1 > \Theta_3 > \Theta_2 \quad\text{or}\quad \Theta_2 > \Theta_3 > \Theta_1 \tag{13.129} \]
leads to a real λ and thus to a time evolution of $\delta\omega_1$ and $\delta\omega_2$ according to
\[ \delta\omega_{1,2} = C_{1,2}\cosh\lambda t + D_{1,2}\sinh\lambda t. \tag{13.130} \]
The rotation axis turns away exponentially from the initial position. The rotation about the axis of the intermediate moment of inertia is not stable!
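The sign of λ² in (13.127) can be tabulated for the three possible choices of the rotation axis. The following sketch does this for one made-up set of principal moments; the function names and numbers are illustrative assumptions, and the axis of rotation is always passed as the third argument, matching the labeling used in the solution.

```python
def lambda_squared(theta1, theta2, theta3, omega0):
    """lambda^2 from the characteristic equation (13.127), rotation about axis 3."""
    return (theta3 - theta2) * (theta1 - theta3) / (theta1 * theta2) * omega0 ** 2

def is_stable(theta1, theta2, theta3, omega0=1.0):
    """Small perturbations oscillate (stable rotation) if lambda^2 < 0."""
    return lambda_squared(theta1, theta2, theta3, omega0) < 0.0

# Illustrative principal moments: smallest 1, intermediate 2, largest 3
smallest, intermediate, largest = 1.0, 2.0, 3.0

# The moment about the rotation axis goes in the third slot
about_largest = is_stable(smallest, intermediate, largest)
about_smallest = is_stable(intermediate, largest, smallest)
about_intermediate = is_stable(smallest, largest, intermediate)
print(about_largest, about_smallest, about_intermediate)
```

Only the rotation about the axis with the intermediate moment yields a positive λ², in agreement with (13.129) and (13.130).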
Part V Lagrange Equations
14 Generalized Coordinates
In many cases, the motion of bodies considered in mechanics is not free but is re-
stricted by certain constraint conditions. The constraints can take different forms. For
instance, a mass point can be bound to a space curve or to a surface. The constraints
for a rigid body state that the distances between the individual points are constant.
If one considers gas molecules in a vessel, the constraints specify that the molecules
cannot penetrate the wall of the vessel. Since the constraints are important for solv-
ing a mechanical problem, mechanical systems are classified according to the type
of constraints. A system is called holonomic if the constraints can be represented by
equations of the form
fk (r1 , r2 , . . . , t) = 0, k = 1, 2, . . . , s. (14.1)
This form of the constraints is important since it can be used for eliminating dependent coordinates. For a pendulum of length l, (14.1) reads $x^2 + y^2 - l^2 = 0$ if we put the coordinate origin at the suspension point. By means of this equation, one of the coordinates x and y can be expressed in terms of the other.
We already met another simple example of holonomic constraints in the context of the rigid body, i.e., the constancy of the distances between two points: $(\mathbf{r}_i - \mathbf{r}_j)^2 - C_{ij}^2 = 0$. In this case the constraints served to reduce the 3N degrees of freedom of a system of N mass points to the 6 degrees of freedom of the rigid body.
All constraints that cannot be represented in the form (14.1) are called nonholonomic. These are conditions that cannot be expressed in closed form, or that are given by inequalities. An example of this type of constraint is the system of gas molecules enclosed in a sphere of radius R. Their coordinates must satisfy the conditions $r_i \le R$.
A further classification of the constraint conditions is made based on their time
dependence. If the constraint is an explicit function of the time, then it is called
rheonomic. If the time does not enter explicitly, the constraint is called scleronomic.
A rheonomic constraint appears if a mass point moves along a moving space curve, or
if gas molecules are enclosed in a sphere with a time-dependent radius.
In certain cases the constraints may also be given in differential form, for example
if there is a condition on velocities, e.g., for the rolling of a wheel. The constraints
then have the form

$$ \sum_{k}^{N} a_k(x_1, x_2, \ldots, x_N)\, dx_k = 0, \tag{14.2} $$
where the $x_k$ represent the various coordinates, and the $a_k$ are functions of these
coordinates. We now have to distinguish between two cases.

If (14.2) represents the total differential of a function $U$, we can integrate it immediately
and obtain an equation of the form of (14.1). In this case the constraints are
holonomic. If (14.2) is not a total differential, we can integrate it only after having
solved the full problem. Then (14.2) is not suitable for eliminating dependent coordinates;
it is nonholonomic.
From the requirement that (14.2) be a total differential, one can derive a criterion
for the holonomity of differential constraints. One must have
$$ \sum_k a_k\, dx_k = dU \quad \text{with} \quad a_k = \frac{\partial U}{\partial x_k}. $$
This leads to
$$ \frac{\partial a_k}{\partial x_i} = \frac{\partial^2 U}{\partial x_i\, \partial x_k} = \frac{\partial a_i}{\partial x_k}. $$
Thus, (14.2) represents a holonomic constraint if the coefficients obey the integrability
conditions
$$ \frac{\partial a_k}{\partial x_i} = \frac{\partial a_i}{\partial x_k}. $$
These conditions simply mean that the “vector” $\mathbf{a} = \{a_1, a_2, \ldots, a_N\}$ must be
rotation-free (irrotational); in three dimensions this reads $\operatorname{curl}\mathbf{a} = 0$.
In $N$-dimensional space, the situation is analogous.
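The integrability conditions can be checked numerically at a given point. The following is only a minimal sketch: the helper name `exactness_defect`, the step size, and the wheel radius are assumptions made for illustration. It tests the criterion exactly as stated above (equality of the cross derivatives) for the pendulum constraint $x\,dx + y\,dy = 0$ and for the first rolling condition of the wheel treated in Example 14.3.

```python
import math

def exactness_defect(a, x, h=1e-5):
    """Largest |da_k/dx_i - da_i/dx_k| at the point x, using central differences.

    a : function mapping a coordinate list x to the list of coefficients a_k(x)
    x : list of coordinate values at which the criterion is tested
    """
    n = len(x)

    def partial(k, i):
        # central-difference estimate of d a_k / d x_i
        xp = list(x)
        xm = list(x)
        xp[i] += h
        xm[i] -= h
        return (a(xp)[k] - a(xm)[k]) / (2.0 * h)

    return max(abs(partial(k, i) - partial(i, k))
               for k in range(n) for i in range(n))

# Pendulum: x dx + y dy = 0 is the differential of (x^2 + y^2 - l^2)/2.
pendulum = lambda x: [x[0], x[1]]

# Rolling wheel: dx_M + a sin(psi) dphi = 0 in the coordinates (x_M, phi, psi).
WHEEL_RADIUS = 0.5  # assumed value, for illustration only
wheel = lambda x: [1.0, WHEEL_RADIUS * math.sin(x[2]), 0.0]

print(exactness_defect(pendulum, [0.3, 0.4]))    # practically zero: total differential
print(exactness_defect(wheel, [0.0, 0.2, 0.7]))  # clearly nonzero: not a total differential
```

A vanishing defect at sampled points is of course only a necessary indication, not a proof, that the form is a total differential.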
To classify a mechanical system, we additionally specify whether the system is
conservative or not.
EXAMPLE
14.1 Small Sphere Rolls on a Large Sphere
A sphere in the gravitational field rolls without friction from the upper pole of a larger
sphere. The system is conservative. The constraint changes completely once the small sphere
leaves the large one and cannot be represented in the closed form of (14.1);
therefore the system is nonholonomic. Since the time does not enter explicitly, the
system is scleronomic.
Fig. 14.1. A small sphere rolls on a large sphere
EXAMPLE
14.2 Body Glides on an Inclined Plane
A body glides with friction down on an inclined plane (see Fig. 14.2). The inclination
angle of the plane varies with time. The coordinates and the inclination angle are
related by
$$ \frac{y}{x} - \tan\omega t = 0. $$
Fig. 14.2.
Thus, the time occurs explicitly in the constraint. The system is holonomic and rheonomic.
Since friction occurs, the system is furthermore not conservative.
EXAMPLE
14.3 Wheel Rolls on a Plane
An example of a system with differential constraints is a wheel that rolls on a plane
without gliding. The wheel cannot fall over. The radius of the wheel is $a$.
Fig. 14.3.
For the calculation, we use the coordinates $x_M$, $y_M$ of the center, the angle $\varphi$ that
describes the rotation, and the angle $\psi$ that characterizes the orientation of the wheel
plane relative to the $y$-axis.
The velocity $v$ of the wheel center and the rotation velocity are related by the rolling
condition
$$ v = a\dot\varphi. $$
The components of the velocity are
$$ \dot x_M = -v\sin\psi, \qquad \dot y_M = v\cos\psi. $$
By inserting $v$, we obtain
$$ dx_M + a\sin\psi\, d\varphi = 0, \qquad dy_M - a\cos\psi\, d\varphi = 0, $$
i.e., a constraint of the type of (14.2).

Since the angle ψ is known only after solving the problem, the equations are not
integrable. Hence, the problem is nonholonomic, scleronomic, and conservative.
If a body moves along a trajectory specified (or restricted) by constraints, there ap-
pear constraint reactions that keep it on this trajectory. Such constraint reactions are
support forces, bearing forces (-moments), string tensions, etc. If one is not especially
interested in the load of a string or a bearing, one tries to formulate the problem in
such a way that the constraint (and thus the constraint reaction) no longer appears in
the equations to be solved. We have tacitly used this approach in the problems treated
so far. A simple example is the plane pendulum. Instead of the formulation in Cartesian
coordinates, where the constraint $x^2 + y^2 = l^2$ must be considered explicitly, we
use polar coordinates $(r, \varphi)$.
The constancy of the pendulum length means that the r-coordinate remains con-
stant and that the motion of the pendulum can be completely described by the angle
coordinate alone. This procedure—the transformation to coordinates adapted to the
problem—shall now be formulated more generally.
If we consider a system of $n$ mass points, then it is described by $3n$ coordinates
$\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_n$. The number of degrees of freedom also equals $3n$.
If there are $s$ constraints, the number of degrees of freedom reduces to $3n - s$. The set of
originally $3n$ independent coordinates now involves $s$ dependent coordinates. Now the meaning
of the holonomic constraints becomes transparent. If the constraints are expressed by
equations of the form (14.1), the dependent coordinates can be eliminated. We can
transform to $3n - s$ coordinates $q_1, q_2, \ldots, q_{3n-s}$ that implicitly incorporate the
constraints and that are independent of each other. The old coordinates $\mathbf{r}_i$ are expressed
by the new coordinates $q_j$ by means of equations of the form
$$ \begin{aligned}
\mathbf{r}_1 &= \mathbf{r}_1(q_1, q_2, \ldots, q_{3n-s}, t), \\
\mathbf{r}_2 &= \mathbf{r}_2(q_1, q_2, \ldots, q_{3n-s}, t), \\
&\;\;\vdots \\
\mathbf{r}_n &= \mathbf{r}_n(q_1, q_2, \ldots, q_{3n-s}, t).
\end{aligned} \tag{14.3} $$
These coordinates qi , which now can be considered free, are called generalized coor-
dinates. In the practical cases considered here, the choice of the generalized coordi-
nates is already suggested by the formulation of the problem, and the transformation
equations (14.3) need not be established explicitly. Using generalized coordinates is
also helpful for problems without constraint conditions. For instance, a central force
problem can be described more simply and completely by the coordinates (r, ϑ, ϕ)
instead of the (x, y, z).
Fig. 14.4. Ellipse: $y = b\sin\varphi$, $x = a\cos\varphi$
As a rule, lengths and angles serve as generalized coordinates. As will be seen
below, moments and energies etc. can also be used as generalized coordinates.
EXAMPLE
14.4 Generalized Coordinates
An ellipse is given in the x,y-plane. A particle moving on the ellipse has the coordi-
nates (x, y).
The Cartesian coordinates can be expressed by the parameter ϕ:
y = b sin ϕ, x = a cos ϕ.
Thus, the motion of the particle can be completely described by the angle ϕ (the
generalized coordinate ϕ).
EXAMPLE
14.5 Cylinder Rolls on an Inclined Plane
The position of a cylinder on an inclined plane is completely specified by the distance $l$
from the origin to the center of mass and by the rotational angle $\varphi$ of the cylinder about
its axis.
If the cylinder glides on the plane, both generalized coordinates are significant.
If the cylinder does not glide, $l$ depends on $\varphi$ through a rolling condition. Only one
of the two generalized coordinates will then be needed for a complete description of
the motion of the cylinder.
Fig. 14.5. A cylinder rolls on an inclined plane
EXERCISE
14.6 Classification of Constraints
Problem. Classify the following systems according to whether or not they are scle-
ronomic or rheonomic, holonomic or nonholonomic, and conservative or nonconserv-
ative:
(a) a sphere rolling downward without friction on a fixed sphere;
(b) a cylinder rolling down on a rough inclined plane (inclination angle α);
(c) a particle gliding on the rough inner surface of a rotation paraboloid; and
(d) a particle moving without friction along a very long bar. The bar rotates with the
angular velocity ω in the vertical plane about a horizontal axis.
Solution. (a) Scleronomic, since the constraint is not an explicit function of time.
Nonholonomic, since the rolling sphere leaves the fixed sphere. Conservative, since
the gravitational force can be derived from a potential.
(b) Scleronomic, holonomic, nonconservative: The equation of the constraint rep-
resents either a line or a surface. Since the surface is rough, friction occurs. Therefore
this system is not conservative.
(c) Scleronomic, holonomic, but not conservative, since the friction force does not
result from a potential!
(d) Rheonomic: The constraint is an explicit function of time. Holonomic: The
equation of the constraint is a straight line that contains the time explicitly; conserva-
tive.
14.1 Quantities of Mechanics in Generalized Coordinates

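The velocity of the mass point $i$ can, according to the transformation equation
$$ \mathbf{r}_i = \mathbf{r}_i(q_1, \ldots, q_\nu, t), $$
be represented as
$$ \dot{\mathbf{r}}_i = \frac{\partial \mathbf{r}_i}{\partial q_1}\frac{dq_1}{dt} + \cdots + \frac{\partial \mathbf{r}_i}{\partial q_\nu}\frac{dq_\nu}{dt} + \frac{\partial \mathbf{r}_i}{\partial t}. $$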
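In the scleronomic case, the last term drops. The velocity can also be written in the
form
$$ \dot{\mathbf{r}}_i = \sum_{\alpha}^{f} \frac{\partial \mathbf{r}_i}{\partial q_\alpha}\,\dot q_\alpha + \frac{\partial \mathbf{r}_i}{\partial t}, \quad \text{where} \quad \dot q_\alpha = \frac{dq_\alpha}{dt}, \tag{14.4} $$
and $\dot q_\alpha$ denotes the generalized velocity. In the following, we restrict ourselves to
the $x$-component. Moreover, we consider only the scleronomic case and write for the
$x$-component of (14.4)
$$ \dot x_i = \sum_\alpha \frac{\partial x_i}{\partial q_\alpha}\,\dot q_\alpha. \tag{14.5} $$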
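By differentiating (14.5) once again with respect to time, we obtain for the Cartesian
components of the acceleration
$$ \ddot x_i = \sum_\alpha \frac{d}{dt}\left(\frac{\partial x_i}{\partial q_\alpha}\right)\dot q_\alpha + \sum_\alpha \frac{\partial x_i}{\partial q_\alpha}\,\ddot q_\alpha. $$
The total derivative in the first term is written as usual:
$$ \frac{d}{dt}\left(\frac{\partial x_i}{\partial q_\alpha}\right) = \sum_\beta \frac{\partial^2 x_i}{\partial q_\beta\,\partial q_\alpha}\,\dot q_\beta. $$
The index to be summed over is denoted here by the letter $\beta$, to avoid confusion
with the summation index $\alpha$. Then we have
$$ \ddot x_i = \sum_{\alpha,\beta} \frac{\partial^2 x_i}{\partial q_\beta\,\partial q_\alpha}\,\dot q_\beta\,\dot q_\alpha + \sum_\alpha \frac{\partial x_i}{\partial q_\alpha}\,\ddot q_\alpha. $$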
The first term involves double summation over α and β.
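As an illustration of (14.5) and of the acceleration formula, one may take plane polar coordinates, $x = r\cos\varphi$ with $q_1 = r$, $q_2 = \varphi$. The sketch below is an assumed example, not part of the original text: the trajectory $r(t)$, $\varphi(t)$ and all function names are chosen freely. It compares the chain-rule expressions with finite differences of $x(t)$.

```python
import math

def x_of_q(r, phi):
    return r * math.cos(phi)

def xdot(r, phi, rdot, phidot):
    # (14.5): sum over alpha of (dx/dq_alpha) * qdot_alpha
    return math.cos(phi) * rdot - r * math.sin(phi) * phidot

def xddot(r, phi, rdot, phidot, rddot, phiddot):
    # double sum over the second derivatives (the r,r term vanishes) ...
    double_sum = (-2.0 * math.sin(phi) * rdot * phidot
                  - r * math.cos(phi) * phidot ** 2)
    # ... plus the single sum containing the generalized accelerations
    single_sum = math.cos(phi) * rddot - r * math.sin(phi) * phiddot
    return double_sum + single_sum

# Assumed test trajectory: r(t) = 1 + 0.1 t, phi(t) = t^2
def r_t(t):
    return 1.0 + 0.1 * t

def phi_t(t):
    return t * t

t, h = 0.8, 1e-4
x = lambda s: x_of_q(r_t(s), phi_t(s))
fd_v = (x(t + h) - x(t - h)) / (2.0 * h)            # finite-difference velocity
fd_a = (x(t + h) - 2.0 * x(t) + x(t - h)) / h ** 2  # finite-difference acceleration
v = xdot(r_t(t), phi_t(t), 0.1, 2.0 * t)
a = xddot(r_t(t), phi_t(t), 0.1, 2.0 * t, 0.0, 2.0)
print(v - fd_v, a - fd_a)  # both differences should be small
```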

Let a system have the generalized coordinates $q_1, \ldots, q_f$ that now shall be
increased by $dq_1, \ldots, dq_f$. We will determine the work performed by this infinitesimal
displacement. For an infinitesimal displacement of the particle $i$, we have
$$ d\mathbf{r}_i = \sum_{\alpha=1}^{f} \frac{\partial \mathbf{r}_i}{\partial q_\alpha}\, dq_\alpha. \tag{14.6} $$
From this, we obtain the work performed:
$$ dW = \sum_{i=1}^{n} \mathbf{F}_i \cdot d\mathbf{r}_i = \sum_{i=1}^{n}\sum_{\alpha=1}^{f} \mathbf{F}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_\alpha}\, dq_\alpha = \sum_\alpha Q_\alpha\, dq_\alpha, $$
where
$$ Q_\alpha = \sum_i \mathbf{F}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_\alpha}. \tag{14.7} $$
$Q_\alpha$ is called the generalized force. Since the generalized coordinate need not have
the dimension of a length, $Q_\alpha$ need not have the dimension of a force. The product
$Q_\alpha q_\alpha$, however, always has the dimension of work.
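In conservative systems, i.e., if $W$ does not depend on time, one has
$$ dW = \sum_\alpha \frac{\partial W}{\partial q_\alpha}\, dq_\alpha \quad \text{and} \quad dW = \sum_\alpha Q_\alpha\, dq_\alpha. $$
Then we must have
$$ dW - dW = 0 = \sum_\alpha \left(Q_\alpha - \frac{\partial W}{\partial q_\alpha}\right) dq_\alpha. $$
Since the $q_\alpha$ are generalized coordinates, they are independent of each other, and
therefore each coefficient $(Q_\alpha - \partial W/\partial q_\alpha)$ must vanish separately in order
to satisfy this equation. This holds only if
$$ Q_\alpha = \frac{\partial W}{\partial q_\alpha}. $$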
The components of the generalized force are thus obtained as the derivative of the
work with respect to the corresponding generalized coordinate.
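The definition (14.7) can be evaluated numerically for a simple case. The sketch below is an assumed illustration: a plane pendulum with the angle $\varphi$ measured from the downward vertical, $\mathbf{r} = (l\sin\varphi, -l\cos\varphi)$, under gravity $\mathbf{F} = (0, -mg)$. The helper `generalized_forces` and the numerical values are hypothetical. For this case one expects $Q_\varphi = -mgl\sin\varphi$, which has the dimension of a torque rather than of a force.

```python
import math

def generalized_forces(forces, r_of_q, q, h=1e-6):
    """Q_alpha = sum_i F_i . (d r_i / d q_alpha), cf. (14.7).

    forces : list of force vectors F_i (tuples) at the configuration q
    r_of_q : function q -> list of position vectors r_i (tuples)
    The partial derivatives are approximated by central differences.
    """
    Q = []
    for alpha in range(len(q)):
        qp = list(q)
        qm = list(q)
        qp[alpha] += h
        qm[alpha] -= h
        r_plus, r_minus = r_of_q(qp), r_of_q(qm)
        total = 0.0
        for F, rp, rm in zip(forces, r_plus, r_minus):
            total += sum(Fc * (pc - mc) / (2.0 * h)
                         for Fc, pc, mc in zip(F, rp, rm))
        Q.append(total)
    return Q

m, g, l = 2.0, 9.81, 0.5  # assumed values

def pendulum_position(q):
    phi = q[0]
    return [(l * math.sin(phi), -l * math.cos(phi))]

phi0 = 0.3
Q = generalized_forces([(0.0, -m * g)], pendulum_position, [phi0])
print(Q[0], -m * g * l * math.sin(phi0))  # the two numbers should agree closely
```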
15 D’Alembert Principle and Derivation of the Lagrange Equations
15.1 Virtual Displacements
A virtual displacement $\delta\mathbf{r}$ is an infinitesimal displacement of the system that is
compatible with the constraints. Contrary to the case of a real infinitesimal displacement
$d\mathbf{r}$, in a virtual displacement the forces and constraints acting on the system do not
change. A virtual displacement will be characterized by the symbol $\delta$, a real
displacement by $d$. Mathematically we operate with the element $\delta$ just as with a differential.
For example,
$$ \delta \sin x = \frac{\delta \sin x}{\delta x}\,\delta x = (\cos x)\,\delta x, \quad \text{etc.} $$
We consider a system of mass points in equilibrium. Then the total force $\mathbf{F}_i$ acting
on each individual mass point vanishes; hence, $\mathbf{F}_i = 0$. The product of force and virtual
displacement $\mathbf{F}_i \cdot \delta\mathbf{r}_i$ is called the virtual work. Since the force for each
individual mass point vanishes, the sum over the virtual work performed on the individual mass
points also equals zero:
$$ \sum_i \mathbf{F}_i \cdot \delta\mathbf{r}_i = 0. \tag{15.1} $$
The force $\mathbf{F}_i$ will now be subdivided into the constraint reaction $\mathbf{F}^z_i$ and the acting
(imposed) force $\mathbf{F}^a_i$:
$$ \sum_i (\mathbf{F}^a_i + \mathbf{F}^z_i) \cdot \delta\mathbf{r}_i = 0. \tag{15.2} $$
We now restrict ourselves to such systems where the work performed by the constraint
reactions vanishes. In many cases (except, e.g., for those with friction) the constraint
reaction is perpendicular to the direction of motion, and the product $\mathbf{F}^z \cdot \delta\mathbf{r}$ vanishes.
For instance, if a mass point is forced to move along a given spatial curve, its direction
of motion is always tangential to the curve; the constraint reaction points perpendicular
to the curve. There are, however, examples where the individual constraint reactions
perform work, while the sum of the works of all constraint forces vanishes; thus,
$$ \sum_i \mathbf{F}^z_i \cdot \delta\mathbf{r}_i = 0. $$
The string tensions of two masses hanging on a roller represent such a case. We
refer to Example 15.1. This is the proper, true meaning of the d’Alembert$^1$ principle:
The constraint reactions in total do not perform work. We always have
$$ \sum_i \mathbf{F}^z_i \cdot \delta\mathbf{r}_i = 0. $$
This is the fundamental characteristic of the constraint reactions. One can, of course,
trace this presupposition back to Newton’s axiom “action equals reaction,” as we just
have seen in the example of the string tensions between two masses. But in general
it does not follow from Newton’s axioms alone. The assumption that the total virtual
work of the constraint reactions vanishes can be considered to be a new postulate. It
accounts for systems of not freely movable mass points and can be expressed by the
forces imposed on the system, as we shall see below (see (15.5)). Then the constraint
reaction drops out from (15.2), and one has
$$ \sum_i \mathbf{F}^a_i \cdot \delta\mathbf{r}_i = 0. \tag{15.3} $$
While in (15.1) each term vanishes individually, now only the sum in total vanishes.
The statement of (15.3) is called the principle of virtual work. It says that a system is
only in equilibrium if the entire virtual work of the imposed (external) forces vanishes.
In the next chapter (equations (16.8) and (16.9)) the principle of virtual work (the total
virtual work vanishes) will be established by the Lagrangian formalism.
For holonomic constraints, the effect of the constraint reactions can be elucidated
by the following: If we consider the $i$th constraint in the form
$$ g_i(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N, t) = 0, $$
then the change of $g_i$ with respect to a change of the position vector $\mathbf{r}_j$ must be
a measure of the constraint reaction $\mathbf{F}^z_{ji}$ on the $j$th particle due to the constraint
$g_i(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N, t) = 0$. We thus can write
$$ \mathbf{F}^z_{ji} = \lambda_i \frac{\partial g_i(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N, t)}{\partial \mathbf{r}_j} = \lambda_i \nabla_j g_i(\mathbf{r}_1, \ldots, t). $$
Here $\lambda_i$ is an unknown factor, since the constraints $g_i(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N, t) = 0$
are known only up to a nonvanishing factor. The total constraint reaction on the $j$th particle is
then the sum over all constraint reactions originating from the individual $k$ constraints; hence,
$$ \mathbf{F}^z_j = \sum_{i=1}^{k} \mathbf{F}^z_{ji} = \sum_{i=1}^{k} \lambda_i \frac{\partial g_i(\mathbf{r}_1, \ldots, \mathbf{r}_N, t)}{\partial \mathbf{r}_j}. $$
1 Jean le Rond d’Alembert, b. Nov. 16 or 17, 1717, Paris, as the son of a general–d. Oct. 29, 1783,
Paris. D’Alembert, who was abandoned by his mother, was found near the church Jean le Rond and
was brought up by the family of a glazier. Later he was educated according to his social status,
supported by grants. He studied at the Collège des Quatre Nations, and in 1741, he became a member
of the Académie des sciences. In mechanics, the d’Alembert principle is named after him; moreover,
he worked on the theory of analytic functions (1746), on partial differential equations (1747), and on
the foundations of algebra. D’Alembert is the author of the mathematical articles of the Encyclopédie.
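The virtual work performed by all constraints is then
$$ \delta W = \sum_{j=1}^{N} \mathbf{F}^z_j \cdot \delta\mathbf{r}_j = \sum_{i=1}^{k} \lambda_i \sum_{j=1}^{N} \frac{\partial g_i}{\partial \mathbf{r}_j}(\mathbf{r}_1, \ldots, \mathbf{r}_N, t) \cdot \delta\mathbf{r}_j = \sum_{i=1}^{k} \lambda_i\, \delta g_i(\mathbf{r}_1, \ldots, \mathbf{r}_N, t), $$
where
$$ \delta g_i(\mathbf{r}_1, \ldots, \mathbf{r}_N, t) = \sum_{j=1}^{N} \frac{\partial g_i}{\partial \mathbf{r}_j} \cdot \delta\mathbf{r}_j. $$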
This is just the change of $g_i$ caused by the virtual displacements $\delta\mathbf{r}_j$. Since the virtual
displacements are by assumption compatible with the constraints, i.e., the $\delta\mathbf{r}_j$ satisfy
the constraints, we must have
$$ \delta g_i(\mathbf{r}_1, \ldots, \mathbf{r}_N, t) = 0. $$
From this, we see immediately that
$$ \mathbf{F}^z_i \cdot \delta\mathbf{r}_i = 0 \tag{15.4a} $$
and therefore also
$$ \delta W = \sum_{j=1}^{N} \mathbf{F}^z_j \cdot \delta\mathbf{r}_j = 0. \tag{15.4b} $$
Hence, for holonomic constraints the constraint reactions are perpendicular to the
displacements that are compatible with the constraints, and the virtual work of the
individual constraint reactions vanishes. In Chap. 16, equations (16.8) and (16.9), we shall
understand from a very general point of view that in the general case (hence including
the case of nonholonomic constraints), the sum of the virtual work of all constraint
reactions must vanish. Therefore, $\sum_i \mathbf{F}^z_i \cdot \delta\mathbf{r}_i = 0$ always holds, while
$\mathbf{F}^z_i \cdot \delta\mathbf{r}_i = 0$ holds only in special (holonomic) cases.
The principle of virtual work at first only allows us to treat problems of statics. By
introducing the inertial force according to Newton’s axiom
$$ \mathbf{F}_i = \dot{\mathbf{p}}_i, \tag{15.5} $$
D’Alembert succeeded in applying the principle of virtual work to problems of
dynamics as well. We proceed in a way analogous to the derivation of the principle of virtual
work. Because of (15.5), in the sum
$$ \sum_i (\mathbf{F}_i - \dot{\mathbf{p}}_i) \cdot \delta\mathbf{r}_i = 0, \tag{15.6a} $$
every individual term vanishes. If we again subdivide the total force $\mathbf{F}_i$ into the
imposed force $\mathbf{F}^a_i$ and the constraint reaction $\mathbf{F}^z_i$, with the same restriction as above we
find the equation
$$ \sum_i (\mathbf{F}^a_i - \dot{\mathbf{p}}_i) \cdot \delta\mathbf{r}_i = 0, \tag{15.6b} $$
where the individual terms can differ from zero; only the sum in (15.6b) vanishes. This
equation expresses the d’Alembert principle.
EXAMPLE
15.1 Two Masses on Concentric Rollers

Two masses m1 and m2 hang on two concentrically fixed rollers with the radii R1
and R2 . The mass of the rollers can be neglected. The equilibrium condition shall be
determined by means of the principle of virtual work.
For the conservative system under consideration (where no friction appears), the
total work performed by the constraint reactions vanishes, i.e.,
$$ \sum_i \mathbf{F}^z_i \cdot \delta\mathbf{r}_i = 0. $$
Fig. 15.1. Two masses on concentric rollers: The string tensions $\mathbf{F}^z_1$ and $\mathbf{F}^z_2$ are parallel
but have different magnitudes
In the present example, the constraint forces are the string tensions $\mathbf{F}^z_1$ and $\mathbf{F}^z_2$.
The vanishing of $\sum_i \mathbf{F}^z_i \cdot \delta\mathbf{r}_i$ in the equilibrium state is equivalent to the equality
of the torques imposed by the string tensions $F^z_1$, $F^z_2$ through the radii $R_1$, $R_2$:
$$ D_1 = R_1 F^z_1 = D_2 = R_2 F^z_2. $$
By means of the constraint, it follows with $\delta z_1 = R_1\,\delta\varphi$, $\delta z_2 = -R_2\,\delta\varphi$ that
$$ F^z_1\,\delta z_1 + F^z_2\,\delta z_2 = (F^z_1 R_1 - F^z_2 R_2)\,\delta\varphi = (D_1 - D_2)\,\delta\varphi = 0. $$
In the case of equal radii ($R_1 = R_2$), the string tensions are equal.
From
$$ \sum_i \mathbf{F}^a_i \cdot \delta\mathbf{r}_i = 0, $$
it follows that
$$ m_1 g\,\delta z_1 + m_2 g\,\delta z_2 = 0. $$
The displacements are correlated by the constraint condition; we have
$$ \delta z_1 = R_1\,\delta\varphi, \qquad \delta z_2 = -R_2\,\delta\varphi. $$
Hence, we obtain
$$ (m_1 R_1 - m_2 R_2)\,\delta\varphi = 0 $$
or
$$ m_1 R_1 = m_2 R_2 $$
as the equilibrium condition.
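The equilibrium condition can be checked with a few lines of arithmetic. The sketch below is an assumed illustration with hypothetical function names and numbers: it evaluates the virtual work of the imposed forces, $m_1 g\,\delta z_1 + m_2 g\,\delta z_2$, for a small rotation $\delta\varphi$ and shows that it vanishes when $m_1 R_1 = m_2 R_2$.

```python
def virtual_work(m1, m2, R1, R2, dphi, g=9.81):
    """Virtual work of the imposed (gravitational) forces for a rotation dphi."""
    dz1 = R1 * dphi    # constraint: delta z1 =  R1 * delta phi
    dz2 = -R2 * dphi   # constraint: delta z2 = -R2 * delta phi
    return m1 * g * dz1 + m2 * g * dz2

def balancing_mass(m1, R1, R2):
    """Mass m2 that satisfies m1 R1 = m2 R2."""
    return m1 * R1 / R2

m1, R1, R2 = 3.0, 0.2, 0.5                 # assumed values
m2 = balancing_mass(m1, R1, R2)
print(virtual_work(m1, m2, R1, R2, 1e-3))      # vanishes up to rounding
print(virtual_work(m1, 2 * m2, R1, R2, 1e-3))  # nonzero: no equilibrium
```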

EXAMPLE
15.2 Two Masses Connected by a Rope on an Inclined Plane
In the setup shown in Fig. 15.2, two masses connected by a rope move without friction.
The equation of motion shall be established by means of the d’Alembert principle. For
the two masses, this principle reads
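$$ (\mathbf{F}^a_1 - \dot{\mathbf{p}}_1) \cdot \delta\mathbf{l}_1 + (\mathbf{F}^a_2 - \dot{\mathbf{p}}_2) \cdot \delta\mathbf{l}_2 = 0. \tag{15.7} $$
Fig. 15.2. Two masses on an inclined plane connected by a rope
The length of the rope is constant (constraint):
$$ l_1 + l_2 = l. $$
This leads to
$$ \delta l_1 = -\delta l_2 \quad \text{and} \quad \ddot l_1 = -\ddot l_2. $$
The inertial forces are
$$ \dot p_1 = m_1 \ddot l_1 \quad \text{and} \quad \dot p_2 = m_2 \ddot l_2. $$
By inserting this into (15.7) and taking into account that the accelerations are
parallel to the displacements, we have
$$ (m_1 g \sin\alpha - m_1 \ddot l_1)\,\delta l_1 + (m_2 g \sin\beta - m_2 \ddot l_2)\,\delta l_2 = 0, $$
$$ (m_1 g \sin\alpha - m_1 \ddot l_1 - m_2 g \sin\beta - m_2 \ddot l_1)\,\delta l_1 = 0, $$
or
$$ \ddot l_1 = \frac{m_1 \sin\alpha - m_2 \sin\beta}{m_1 + m_2}\, g. $$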
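A short numerical check of this result, under assumed values for the masses and angles (all names below are hypothetical): the acceleration is inserted back into the d’Alembert expression, with $\delta l_2 = -\delta l_1$ and $\ddot l_2 = -\ddot l_1$, and the residual is evaluated.

```python
import math

def rope_acceleration(m1, m2, alpha, beta, g=9.81):
    """Acceleration of mass 1 along its incline, from the d'Alembert principle."""
    return (m1 * math.sin(alpha) - m2 * math.sin(beta)) * g / (m1 + m2)

def dalembert_residual(m1, m2, alpha, beta, a1, g=9.81):
    """Coefficient of delta l1 after eliminating delta l2 = -delta l1, l2'' = -a1."""
    a2 = -a1
    return (m1 * g * math.sin(alpha) - m1 * a1) - (m2 * g * math.sin(beta) - m2 * a2)

a1 = rope_acceleration(2.0, 1.0, 0.6, 0.3)          # assumed masses and angles
print(a1, dalembert_residual(2.0, 1.0, 0.6, 0.3, a1))  # residual vanishes up to rounding
```

For $\alpha = \beta = \pi/2$ the formula reduces to the familiar Atwood result $(m_1 - m_2)g/(m_1 + m_2)$.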
EXERCISE
15.3 Equilibrium Condition of a Bascule Bridge

Problem. Find by means of the d’Alembert principle the equilibrium condition for
(a) a lever of length $l_1$, with a mass $m$ at a distance $l_2$ from the bearing point, and
with a force $F_1$ acting vertically upward at its end; and
(b) the bascule bridge in Fig. 15.4, with the forces $G$ and $Q$ acting.
Fig. 15.3. Lever with mass $m$ and force $F_1$
Fig. 15.4. Geometry of the bascule bridge described in the problem
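Solution. (a) The d’Alembert principle yields
$$ \sum_\nu \mathbf{F}_\nu \cdot \delta\mathbf{r}_\nu = 0. $$
We have
$$ \mathbf{F}_1 = F_1 \mathbf{e}_y, \qquad \mathbf{F}_2 = -mg\,\mathbf{e}_y $$
and
$$ \mathbf{r}_1 = l_1 \cos\varphi\,\mathbf{e}_x + l_1 \sin\varphi\,\mathbf{e}_y, \qquad \mathbf{r}_2 = l_2 \cos\varphi\,\mathbf{e}_x + l_2 \sin\varphi\,\mathbf{e}_y = \frac{l_2}{l_1}\,\mathbf{r}_1. $$
Furthermore,
$$ \delta\mathbf{r}_1 = (-l_1 \sin\varphi\,\mathbf{e}_x + l_1 \cos\varphi\,\mathbf{e}_y)\,\delta\varphi \quad \text{and} \quad \delta\mathbf{r}_2 = \frac{l_2}{l_1}\,\delta\mathbf{r}_1. $$
This leads to
$$ \sum_{\nu=1}^{2} \mathbf{F}_\nu \cdot \delta\mathbf{r}_\nu = (F_1 l_1 \cos\varphi - mg l_2 \cos\varphi)\,\delta\varphi = 0, $$
i.e., the equilibrium condition reads
$$ F_1 = \frac{l_2}{l_1}\, mg \quad \text{for} \quad \varphi \neq \frac{\pi}{2}, \frac{3\pi}{2}, \ldots. $$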
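(b) The forces acting at the points 1 and 2 are
$$ \mathbf{F}_2 = -G\,\mathbf{e}_y, \qquad \mathbf{F}_1 = -Q\,\mathbf{e}_y. $$
Furthermore,
$$ \mathbf{r}_1 = -a \cos\varphi\,\mathbf{e}_x + (d - a \sin\varphi)\,\mathbf{e}_y $$
and
$$ \mathbf{r}_2 = (b + c) \cos\varphi\,\mathbf{e}_x + (b + c) \sin\varphi\,\mathbf{e}_y; $$
i.e.,
$$ \delta\mathbf{r}_1 = (a \sin\varphi\,\mathbf{e}_x - a \cos\varphi\,\mathbf{e}_y)\,\delta\varphi $$
and
$$ \delta\mathbf{r}_2 = (-(b + c) \sin\varphi\,\mathbf{e}_x + (b + c) \cos\varphi\,\mathbf{e}_y)\,\delta\varphi. $$
The d’Alembert principle reads
$$ 0 = \sum_{\nu=1}^{2} \mathbf{F}_\nu \cdot \delta\mathbf{r}_\nu = (Qa \cos\varphi - G(b + c) \cos\varphi)\,\delta\varphi = [Qa - G(b + c)] \cos\varphi\,\delta\varphi. $$
The equilibrium condition
$$ Q = G\,\frac{b + c}{a} $$
is independent of the angle $\varphi$!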
As is seen in Examples 15.1 and 15.2, the drawback of the principle of virtual
displacements is that one still must eliminate displacements that are dependent
through constraints before one can find an equation of motion. We therefore introduce
generalized coordinates $q_i$. If we transform the $\delta\mathbf{r}_i$ in (15.6a) to $\delta q_i$, the coefficients
of the $\delta q_i$ can immediately be set to zero.
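Starting from (15.6a), we introduce in the first sum, according to (14.6) and (14.7),
the generalized forces:
$$ \sum_{i=1}^{n} \mathbf{F}_i \cdot \delta\mathbf{r}_i = \sum_{i=1}^{n}\sum_{\alpha=1}^{f} \mathbf{F}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_\alpha}\,\delta q_\alpha = \sum_{\alpha=1}^{f} Q_\alpha\,\delta q_\alpha. \tag{15.8} $$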
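We now consider the other term in (15.6a):
$$ \sum_i \dot{\mathbf{p}}_i \cdot \delta\mathbf{r}_i = \sum_i m_i \ddot{\mathbf{r}}_i \cdot \delta\mathbf{r}_i. $$
If we express $\delta\mathbf{r}_i$ according to (14.6) by the $\delta q_\nu$, we obtain
$$ \sum_i \dot{\mathbf{p}}_i \cdot \delta\mathbf{r}_i = \sum_{i,\nu} m_i \ddot{\mathbf{r}}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_\nu}\,\delta q_\nu. \tag{15.9} $$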
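By adding and simultaneously subtracting equal terms, we rewrite the right-hand side
of the equation:
$$ \begin{aligned}
\sum_i m_i \ddot{\mathbf{r}}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_\nu}
&= \sum_i \left[ \frac{d}{dt}(m_i \dot{\mathbf{r}}_i) \cdot \frac{\partial \mathbf{r}_i}{\partial q_\nu} + m_i \dot{\mathbf{r}}_i \cdot \frac{d}{dt}\frac{\partial \mathbf{r}_i}{\partial q_\nu} \right] - \sum_i m_i \dot{\mathbf{r}}_i \cdot \frac{d}{dt}\frac{\partial \mathbf{r}_i}{\partial q_\nu} \\
&= \sum_i \left[ \frac{d}{dt}\left( m_i \dot{\mathbf{r}}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_\nu} \right) - m_i \dot{\mathbf{r}}_i \cdot \frac{d}{dt}\frac{\partial \mathbf{r}_i}{\partial q_\nu} \right].
\end{aligned} \tag{15.10} $$
To derive the expression for the kinetic energy, we change the order of differentiation
with respect to $t$ and $q_\nu$ in the last term of (15.10):
$$ \frac{d}{dt}\frac{\partial \mathbf{r}_i}{\partial q_\nu} = \frac{\partial}{\partial q_\nu}\frac{d}{dt}\mathbf{r}_i = \frac{\partial}{\partial q_\nu}\mathbf{v}_i. \tag{15.11} $$
Insertion in (15.10) yields
$$ \sum_i m_i \ddot{\mathbf{r}}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_\nu} = \sum_i \left[ \frac{d}{dt}\left( m_i \dot{\mathbf{r}}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_\nu} \right) - m_i \mathbf{v}_i \cdot \frac{\partial}{\partial q_\nu}\mathbf{v}_i \right]. \tag{15.12} $$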
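We can rewrite the expression $\partial \mathbf{r}_i/\partial q_\nu$ in the first term of the right side of (15.12) by
partially differentiating (14.4) with respect to $\dot q_\nu$:
$$ \frac{\partial \mathbf{v}_i}{\partial \dot q_\nu} = \frac{\partial \mathbf{r}_i}{\partial q_\nu}, $$
since $(\partial/\partial \dot q_\nu)(\partial \mathbf{r}_i/\partial t) = 0$ and only the coefficient of $\dot q_\nu$ remains from the sum.
By inserting this relation into (15.12), we obtain
$$ \begin{aligned}
\sum_i m_i \ddot{\mathbf{r}}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_\nu}
&= \sum_i \frac{d}{dt}\left( m_i \mathbf{v}_i \cdot \frac{\partial \mathbf{v}_i}{\partial \dot q_\nu} \right) - \sum_i m_i \mathbf{v}_i \cdot \frac{\partial \mathbf{v}_i}{\partial q_\nu} \\
&= \sum_i \frac{d}{dt}\frac{\partial}{\partial \dot q_\nu}\left( \frac{1}{2} m_i v_i^2 \right) - \sum_i \frac{\partial}{\partial q_\nu}\left( \frac{1}{2} m_i v_i^2 \right).
\end{aligned} $$
Here, $\sum_i (1/2) m_i v_i^2$ is the kinetic energy $T$:
$$ \sum_i m_i \ddot{\mathbf{r}}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_\nu} = \frac{d}{dt}\left( \frac{\partial T}{\partial \dot q_\nu} \right) - \frac{\partial T}{\partial q_\nu}. $$
Insertion into (15.9) leads to
$$ \sum_i \dot{\mathbf{p}}_i \cdot \delta\mathbf{r}_i = \sum_\nu \left[ \frac{d}{dt}\left( \frac{\partial T}{\partial \dot q_\nu} \right) - \frac{\partial T}{\partial q_\nu} \right] \delta q_\nu. \tag{15.13} $$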
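Using (15.8) and (15.13), we can express the d’Alembert principle by generalized
coordinates. Insertion of
$$ \sum_i \mathbf{F}_i \cdot \delta\mathbf{r}_i = \sum_\nu Q_\nu\,\delta q_\nu \quad \text{(compare (15.8))} $$
into (15.6a) yields
$$ \sum_\nu \left[ \frac{d}{dt}\left( \frac{\partial T}{\partial \dot q_\nu} \right) - \frac{\partial T}{\partial q_\nu} - Q_\nu \right] \delta q_\nu = 0. \tag{15.14} $$
The $q_\nu$ are generalized coordinates; thus, the $q_\nu$ and the related $\delta q_\nu$ are independent
of each other. Therefore, (15.14) is satisfied only if the individual coefficients vanish,
i.e., for any coordinate $q_\nu$ we must have
$$ \frac{d}{dt}\left( \frac{\partial T}{\partial \dot q_\nu} \right) - \frac{\partial T}{\partial q_\nu} - Q_\nu = 0, \qquad \nu = 1, \ldots, f. \tag{15.15} $$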
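As a further simplification, we assume that all forces $\mathbf{F}_i$ can be derived from a potential
$V$ (conservative force field):
$$ \mathbf{F}_i = -\operatorname{grad}_i(V) = -\nabla_i(V). $$
In this case, the generalized forces $Q_\nu$ can be written as
$$ Q_\nu = \sum_i \mathbf{F}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_\nu} = -\sum_i \nabla_i V \cdot \frac{\partial \mathbf{r}_i}{\partial q_\nu} = -\frac{\partial V}{\partial q_\nu}, $$
because
$$ \begin{aligned}
&\sum_i \left( \frac{\partial V}{\partial x_i}\mathbf{e}_x + \frac{\partial V}{\partial y_i}\mathbf{e}_y + \frac{\partial V}{\partial z_i}\mathbf{e}_z \right) \cdot \left( \frac{\partial x_i}{\partial q_\nu}\mathbf{e}_x + \frac{\partial y_i}{\partial q_\nu}\mathbf{e}_y + \frac{\partial z_i}{\partial q_\nu}\mathbf{e}_z \right) \\
&\qquad = \sum_i \left( \frac{\partial V}{\partial x_i}\frac{\partial x_i}{\partial q_\nu} + \frac{\partial V}{\partial y_i}\frac{\partial y_i}{\partial q_\nu} + \frac{\partial V}{\partial z_i}\frac{\partial z_i}{\partial q_\nu} \right)
= \frac{\partial V}{\partial q_\nu}.
\end{aligned} $$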
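By inserting $Q_\nu = -\partial V/\partial q_\nu$ into (15.15), we obtain
$$ \frac{d}{dt}\left( \frac{\partial T}{\partial \dot q_\nu} \right) - \frac{\partial T}{\partial q_\nu} + \frac{\partial V}{\partial q_\nu} = 0 $$
and
$$ \frac{d}{dt}\left( \frac{\partial T}{\partial \dot q_\nu} \right) - \frac{\partial (T - V)}{\partial q_\nu} = 0. $$
$V$ is independent of the generalized velocity; i.e., $V$ is only a function of the position:
$$ \frac{\partial V}{\partial \dot q_\nu} = 0. $$
Therefore, we can write
$$ \frac{d}{dt}\frac{\partial}{\partial \dot q_\nu}(T - V) - \frac{\partial}{\partial q_\nu}(T - V) = 0, \tag{15.16} $$
or, by defining a new function, the Lagrangian$^2$
$$ L = T - V, \tag{15.17} $$
$$ \frac{d}{dt}\frac{\partial L}{\partial \dot q_\nu} - \frac{\partial L}{\partial q_\nu} = 0, \qquad \nu = 1, \ldots, f. \tag{15.18} $$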
2 Joseph Louis Lagrange, b. Jan. 25, 1736, Torino–d. April 10, 1813, Paris. Lagrange came from a
French–Italian family and in 1755 became professor in Torino. In 1766, he went to Berlin as director
of the mathematical-physical class of the academy. In 1786, after the death of Friedrich II, he went to
Paris. There he essentially supported the reformation of the system of measures and was a professor
at various universities. His very extensive work includes a new foundation of variational calculus
(1760) and its application to dynamics, contributions to the three-body problem (1772), application
of the theory of continued fractions to the solution of equations (1767), number-theoretical problems,
and an unsuccessful reduction of infinitesimal calculus to algebra. With his Mécanique Analytique
(1788), Lagrange became the founder of analytical mechanics.
These equations are called Lagrange equations, and the quantities $\partial L/\partial \dot q_\nu$ are called
generalized momenta. In Newton’s formulation of mechanics, the equations of motion
are established directly. The forces are thus put in the foreground; they must be
specified for a given problem and inserted into the basic dynamic equations
$$ \dot{\mathbf{p}}_i = \mathbf{F}_i, \qquad i = 1, \ldots, N. $$
In the Lagrangian formulation the Lagrangian is the central quantity, and L includes
both the kinetic energy T and the potential energy V . The latter one implicitly in-
volves the forces. After L is established, the Lagrange equations can be established
and solved. Both methods are equivalent to each other, as can be seen by stepwise
inversion of the steps leading from (15.6a) to (15.18).
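As a brief worked illustration of (15.17) and (15.18), consider the plane pendulum of length $l$ already mentioned in Chap. 14. The choice of the angle $\varphi$ measured from the downward vertical and of the zero of the potential at the suspension point are assumptions made here for the sketch.

```latex
% Plane pendulum, generalized coordinate \varphi (measured from the downward vertical)
\begin{align*}
  T &= \tfrac{1}{2} m l^2 \dot\varphi^2, \qquad
  V = -m g l \cos\varphi, \qquad
  L = T - V = \tfrac{1}{2} m l^2 \dot\varphi^2 + m g l \cos\varphi, \\
  \frac{\partial L}{\partial \dot\varphi} &= m l^2 \dot\varphi, \qquad
  \frac{\partial L}{\partial \varphi} = -m g l \sin\varphi, \\
  \frac{d}{dt}\frac{\partial L}{\partial \dot\varphi} - \frac{\partial L}{\partial \varphi}
    &= m l^2 \ddot\varphi + m g l \sin\varphi = 0
  \quad\Longrightarrow\quad
  \ddot\varphi + \frac{g}{l}\sin\varphi = 0 .
\end{align*}
```

The constraint reaction (the tension in the bar or string) never appears; the same equation follows in Newton’s formulation only after this force has been eliminated.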
EXAMPLE
15.4 Two Blocks Connected by a Bar
Two blocks of equal mass that are connected by a rigid bar of length l move without
friction along a given path (compare Fig. 15.5). The attraction of the earth acts along
the negative y-axis. The generalized coordinate is the angle α (corresponding to the
single degree of freedom of the system).
Fig. 15.5. Two blocks are
connected by a bar
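For the relative distances $x$ and $y$ of the two blocks, we have
$$ x = l \cos\alpha, \qquad y = l \sin\alpha. $$
The constraint is holonomic and scleronomic. We will determine the Lagrangian
$$ L = T - V. $$
The kinetic energy of the system is
$$ T = \frac{1}{2} m (\dot x^2 + \dot y^2). $$
For this purpose, we form $\dot x$ and $\dot y$:
$$ \dot x = -l(\sin\alpha)\,\dot\alpha, \qquad \dot y = l(\cos\alpha)\,\dot\alpha. $$
Thus, we get for $T$
$$ T = \frac{1}{2} m \left( l^2 (\sin^2\alpha)\,\dot\alpha^2 + l^2 (\cos^2\alpha)\,\dot\alpha^2 \right) = \frac{1}{2} m l^2 \dot\alpha^2. $$
For the potential, we have (conservative system)
$$ V = mgy = mgl \sin\alpha. $$
The Lagrangian therefore reads
$$ L = T - V = \frac{1}{2} m l^2 \dot\alpha^2 - mgl \sin\alpha. $$
We insert $L$ into the Lagrange equation (15.18):
$$ \frac{d}{dt}\frac{\partial L}{\partial \dot\alpha} - \frac{\partial L}{\partial \alpha} = \frac{d}{dt}(m l^2 \dot\alpha) + mgl \cos\alpha = 0 $$
and
$$ m l^2 \ddot\alpha + mgl \cos\alpha = 0, \qquad \ddot\alpha + \frac{g}{l} \cos\alpha = 0. $$
Multiplication by $\dot\alpha$ yields
$$ \ddot\alpha\,\dot\alpha + \frac{g}{l}\,\dot\alpha \cos\alpha = 0. $$
This equation can be integrated directly. One obtains
$$ \frac{1}{2}\dot\alpha^2 + \frac{g}{l} \sin\alpha = \text{constant} = c $$
or
$$ \dot\alpha = \sqrt{2\left( c - \frac{g}{l} \sin\alpha \right)}. $$
Separation of the variables $\alpha$ and $t$ leads to the equation
$$ dt = \frac{d\alpha}{\sqrt{2(c - (g/l) \sin\alpha)}}, \qquad t - t_0 = \int_{\alpha_0}^{\alpha} \frac{d\alpha}{\sqrt{2(c - (g/l) \sin\alpha)}}. $$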
The constants c and t0 are determined from the given initial conditions.
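Since the remaining integral has no elementary closed form, the equation of motion is in practice often integrated numerically. The following is a minimal sketch under assumed initial conditions and step size (the function names are hypothetical): it integrates $\ddot\alpha + (g/l)\cos\alpha = 0$ with the classical fourth-order Runge-Kutta scheme and monitors the first integral $c = \tfrac{1}{2}\dot\alpha^2 + (g/l)\sin\alpha$, which should stay constant along the solution.

```python
import math

def simulate_bar(alpha0, alphadot0, g_over_l, dt, steps):
    """Integrate alpha'' = -(g/l) cos(alpha) with the classical RK4 scheme."""
    def deriv(a, w):
        return w, -g_over_l * math.cos(a)

    a, w = alpha0, alphadot0
    for _ in range(steps):
        k1a, k1w = deriv(a, w)
        k2a, k2w = deriv(a + 0.5 * dt * k1a, w + 0.5 * dt * k1w)
        k3a, k3w = deriv(a + 0.5 * dt * k2a, w + 0.5 * dt * k2w)
        k4a, k4w = deriv(a + dt * k3a, w + dt * k3w)
        a += dt * (k1a + 2.0 * k2a + 2.0 * k3a + k4a) / 6.0
        w += dt * (k1w + 2.0 * k2w + 2.0 * k3w + k4w) / 6.0
    return a, w

def first_integral(a, w, g_over_l):
    """c = (1/2) alpha'^2 + (g/l) sin(alpha)."""
    return 0.5 * w * w + g_over_l * math.sin(a)

g_over_l = 9.81                      # assumed: l = 1 m
alpha_end, alphadot_end = simulate_bar(1.0, 0.0, g_over_l, 1e-3, 400)
print(first_integral(1.0, 0.0, g_over_l),
      first_integral(alpha_end, alphadot_end, g_over_l))  # nearly equal
```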
EXAMPLE
15.5 Ignorable Coordinate
We will use the following example for the Lagrangian formalism to explain the con-
cept of the ignorable coordinate. The arrangement is shown in Fig. 15.6.
Fig. 15.6. Two masses $m$ and $M$ are connected by a string
Two masses $m$ and $M$ are connected by a string of constant total length $l = r + s$.
The string mass is negligibly small compared to $m + M$. The mass $m$ can rotate with
the string (with varying partial length $r$) on the plane. The string leads from $m$ through
a hole in the plane to $M$, where the mass $M$ hangs from the tightly stretched string
(with the also variable partial length $s = l - r$). Depending on the values $\omega$ of the
rotation of $m$ on the plane, the arrangement can glide upward or downward. Thus, the
mass $M$ moves only along the $z$-axis. The constraints characterizing the system are
holonomic and scleronomic. This arrangement has two degrees of freedom. The two
corresponding generalized coordinates $\varphi$ and $s$ uniquely describe the state of motion
of this conservative system.
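We have
$$ x = r \cos\varphi = (l - s) \cos\varphi, \qquad y = r \sin\varphi = (l - s) \sin\varphi. $$
For the kinetic energy $T$ of the system, we obtain
$$ \begin{aligned}
T &= \frac{1}{2} m \left( \frac{d}{dt}(l - s) \right)^2 + \frac{1}{2}(l - s)^2 m \dot\varphi^2 + \frac{1}{2} M \dot s^2 \\
  &= \frac{1}{2}(m + M)\dot s^2 + \frac{1}{2}(l - s)^2 m \dot\varphi^2.
\end{aligned} $$
The potential $V$ reads
$$ V = -Mgs. $$
For the Lagrangian $L$, we get
$$ L = T - V = \frac{1}{2}(m + M)\dot s^2 + \frac{1}{2}(l - s)^2 m \dot\varphi^2 + Mgs. $$
We now form
$$ \frac{d}{dt}\frac{\partial L}{\partial \dot s} = (m + M)\ddot s, \qquad \frac{\partial L}{\partial s} = -(l - s) m \dot\varphi^2 + Mg, $$
$$ \frac{d}{dt}\frac{\partial L}{\partial \dot\varphi} = \frac{d}{dt}\left( (l - s)^2 m \dot\varphi \right), \qquad \frac{\partial L}{\partial \varphi} = 0. $$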
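Because $\partial L/\partial \varphi = 0$, $\varphi$ is called an ignorable or cyclic coordinate. The Lagrange
equation for $\varphi$ then reduces to
$$ \frac{d}{dt}\frac{\partial L}{\partial \dot\varphi} = \frac{d}{dt}\left( (l - s)^2 m \dot\varphi \right) = 0 $$
or
$$ (l - s)^2 \dot\varphi\, m = \tilde L = \text{constant}. $$
Here, $\tilde L$ (written with a tilde to distinguish it from the Lagrangian $L$) is the angular
momentum of the rotating mass $m$.
This first integral of motion is the angular momentum conservation law. Generally
speaking, the Lagrangian equation of motion
$$ \frac{d}{dt}\frac{\partial L}{\partial \dot q_j} - \frac{\partial L}{\partial q_j} = 0 $$
for an ignorable (cyclic) variable reduces to
$$ \frac{d}{dt}\frac{\partial L}{\partial \dot q_j} = 0 \quad \text{or} \quad \frac{dp_j}{dt} = 0. $$
Here, $p_j = \partial L/\partial \dot q_j$ is the generalized momentum. The generalized momentum
related to the cyclic coordinate is thus constant in time. Therefore, the general
conservation law holds: The generalized momentum related to a cyclic coordinate is conserved.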
The Lagrange equation for $s$ reads
$$ (m + M)\ddot s + (l - s) m \dot\varphi^2 - Mg = 0 $$
or, after multiplication by $\dot s$,
$$ (m + M)\ddot s\,\dot s + \frac{\tilde L^2\,\dot s}{(l - s)^3 m} - Mg\,\dot s = 0, \quad \text{with} \quad \tilde L = (l - s)^2 m \dot\varphi. $$
The last equation can be integrated immediately, and we obtain as a second integral
of motion
$$ \frac{1}{2}(m + M)\dot s^2 + \frac{\tilde L^2}{2(l - s)^2 m} - Mgs = \text{constant} = T + V = E; $$
i.e., the total energy of the system is conserved. The given system is in a state of
equilibrium (gravitation force = centrifugal force) for vanishing acceleration, $d^2 s/dt^2 = 0$:
$$ 0 = \ddot s = \frac{1}{m + M}\left[ Mg - (l - s) m \left( \frac{\tilde L}{(l - s)^2 m} \right)^2 \right] = \frac{1}{m + M}\left[ Mg - \frac{\tilde L^2}{(l - s)^3 m} \right]. $$
The result states that $s$ must be constant. For a fixed distance $s_0$, equilibrium therefore
appears for a definite angular momentum $\tilde L = \tilde L_0$, which corresponds to a definite
angular velocity $\omega = \dot\varphi$:
$$ \tilde L_0^2 = M m g (l - s_0)^3. $$
For $\tilde L > \tilde L_0$, the entire arrangement glides upward; for $\tilde L < \tilde L_0$, the string with the
two masses $m$ and $M$ glides downward. For $\tilde L = \tilde L_0$, the system is in an equilibrium
state. For the special case $\tilde L = 0$ (i.e., $\dot\varphi = 0$, no rotation on the plane), one simply has
the retarded free fall of the mass $M$.
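The three cases can be made concrete with a small numerical sketch. The masses, the string length, and the names `s_acceleration` and `equilibrium_angular_momentum` are assumptions for illustration. The acceleration of $s$ follows from the Lagrange equation for $s$ after $\dot\varphi$ has been eliminated through the conserved angular momentum of the mass $m$; a negative value means that $s$ shrinks, i.e., that $M$ is pulled upward.

```python
import math

def s_acceleration(s, L_ang, m, M, l, g=9.81):
    """s'' for given s and conserved angular momentum L_ang = (l - s)^2 m phi'."""
    return (M * g - L_ang ** 2 / ((l - s) ** 3 * m)) / (m + M)

def equilibrium_angular_momentum(s0, m, M, l, g=9.81):
    """Angular momentum for which s = s0 stays constant: L0^2 = M m g (l - s0)^3."""
    return math.sqrt(M * m * g * (l - s0) ** 3)

m, M, l, s0 = 0.5, 2.0, 1.0, 0.4          # assumed values
L0 = equilibrium_angular_momentum(s0, m, M, l)
print(s_acceleration(s0, L0, m, M, l))        # vanishes up to rounding: equilibrium
print(s_acceleration(s0, 1.1 * L0, m, M, l))  # negative: M moves upward
print(s_acceleration(s0, 0.9 * L0, m, M, l))  # positive: M moves downward
print(s_acceleration(s0, 0.0, m, M, l))       # M g / (m + M): retarded free fall
```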
EXAMPLE
15.6 Sphere in a Rotating Tube
As a further example of the Lagrangian formalism, we discuss a problem with
a holonomic rheonomic constraint. A sphere moves in a tube that rotates in the
$x,y$-plane about the $z$-axis with constant angular velocity $\omega$.
Fig. 15.7. Sphere in a rotating tube
This arrangement has one degree of freedom. Accordingly, we need only one gen-
eralized coordinate for a complete description of the state of motion of the system: the
radial distance r of the sphere from the rotation center.
One has
$$ x = r \cos\omega t, \qquad y = r \sin\omega t. $$
The Lagrangian $L = T - V$ then reads
$$ L = \frac{1}{2} m (\dot x^2 + \dot y^2) = \frac{1}{2} m (\dot r^2 + \omega^2 r^2), $$
if we take into account that for this arrangement the potential $V = 0$.
We now form
$$ \frac{d}{dt}\frac{\partial L}{\partial \dot r} = m \ddot r, \qquad \frac{\partial L}{\partial r} = m \omega^2 r. $$
Then we obtain the Lagrange equation
$$ m \ddot r - m \omega^2 r = 0, $$
or
$$ \ddot r - \omega^2 r = 0. $$
This differential equation corresponds, up to the minus sign, to the equation for the
nondamped harmonic oscillator. It has a general solution of the type
$$ r(t) = A e^{\omega t} + B e^{-\omega t}. $$
With increasing time $t$, this expression for $r(t)$ also increases; i.e.,
$$ \lim_{t \to \infty} r(t) = \infty \quad \text{for} \quad A > 0. $$
From the physical point of view, this means that the sphere is hurled outward by the
centrifugal force that results from the rotation of the arrangement.
The energy of the sphere increases. The reason is that the constraint reaction
performs work on the sphere. Although the constraint force is perpendicular to the tube
wall, it is not perpendicular to the trajectory of the sphere. Hence, the product
$\mathbf{F}^z \cdot \delta\mathbf{s}$ does not vanish.
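The growth of the energy can be made explicit with a short sketch. The initial conditions $r(0) = r_0$, $\dot r(0) = v_0$ and all names below are assumptions for illustration; they fix $A = (r_0 + v_0/\omega)/2$ and $B = (r_0 - v_0/\omega)/2$ in the general solution. The kinetic energy $\tfrac{1}{2} m (\dot r^2 + \omega^2 r^2)$ is then compared at two times.

```python
import math

def radial_solution(r0, v0, omega):
    """Return r(t) and r'(t) for r'' = omega^2 r with r(0) = r0, r'(0) = v0."""
    A = 0.5 * (r0 + v0 / omega)
    B = 0.5 * (r0 - v0 / omega)

    def r(t):
        return A * math.exp(omega * t) + B * math.exp(-omega * t)

    def rdot(t):
        return omega * (A * math.exp(omega * t) - B * math.exp(-omega * t))

    return r, rdot

def kinetic_energy(r, rdot, omega, m=1.0):
    """T = (m/2)(r'^2 + omega^2 r^2); equal to the total energy, since V = 0."""
    return 0.5 * m * (rdot ** 2 + (omega * r) ** 2)

r, rdot = radial_solution(1.0, 0.0, 2.0)  # assumed: r0 = 1, v0 = 0, omega = 2
print(kinetic_energy(r(0.0), rdot(0.0), 2.0))
print(kinetic_energy(r(1.0), rdot(1.0), 2.0))  # larger than the initial value
```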
EXERCISE
15.7 Upright Pendulum
Problem. Determine the Lagrangian and the equation of motion of the following
system: Let m be a point mass on a massless bar of length l which in turn is fixed to
a hinge. The hinge oscillates in the vertical direction according to h(t) = h0 cos ωt.
The only degree of freedom is the angle ϑ between the bar and the vertical (upright
pendulum).
Fig. 15.8. A mass m is fixed to one end of a bar; the other end of the bar is fixed to an oscillating hinge
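Solution. The position of the point mass m is (x, y):
$$x = l\sin\vartheta, \qquad y = h(t) + l\cos\vartheta = h_0\cos\omega t + l\cos\vartheta.$$
Differentiation of these equations yields
$$\dot{x} = \dot{\vartheta}\, l\cos\vartheta, \qquad \dot{y} = -(\omega h_0\sin\omega t + \dot{\vartheta}\, l\sin\vartheta).$$
Hence, the kinetic energy T becomes
$$T = \tfrac{1}{2} m(\dot{x}^2 + \dot{y}^2) = \tfrac{1}{2} m\left(\dot{\vartheta}^2 l^2 + \omega^2 h_0^2\sin^2\omega t + 2\omega h_0\dot{\vartheta}\, l\sin\vartheta\sin\omega t\right),$$
and the potential energy reads
$$V = mgy = mg(h_0\cos\omega t + l\cos\vartheta).$$
Then the Lagrangian becomes
$$L = T - V = \frac{m}{2}\left[\dot{\vartheta}^2 l^2 + \omega^2 h_0^2\sin^2\omega t + 2\omega h_0\dot{\vartheta}\, l\sin\vartheta\sin\omega t - 2g(h_0\cos\omega t + l\cos\vartheta)\right].$$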
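The Lagrange equation reads
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{\vartheta}} - \frac{\partial L}{\partial\vartheta} = 0,$$
with
$$\frac{\partial L}{\partial\dot{\vartheta}} = ml^2\dot{\vartheta} + m\omega h_0 l\sin\vartheta\sin\omega t,$$
$$\frac{\partial L}{\partial\vartheta} = m\omega h_0\dot{\vartheta}\, l\cos\vartheta\sin\omega t + mgl\sin\vartheta,$$
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{\vartheta}} = ml^2\ddot{\vartheta} + m\omega h_0 l\dot{\vartheta}\cos\vartheta\sin\omega t + m\omega^2 h_0 l\sin\vartheta\cos\omega t.$$
Inserting these expressions and dividing by m gives
$$l^2\ddot{\vartheta} + \omega h_0 l\dot{\vartheta}\cos\vartheta\sin\omega t + \omega^2 h_0 l\sin\vartheta\cos\omega t - \omega h_0 l\dot{\vartheta}\cos\vartheta\sin\omega t - gl\sin\vartheta = 0,$$
or
$$l\ddot{\vartheta} + \omega^2 h_0\sin\vartheta\cos\omega t - g\sin\vartheta = 0.$$
With the substitution $\vartheta' = \vartheta - \pi$ we have $\sin\vartheta = -\sin\vartheta'$; for small displacements, $-\sin\vartheta' \approx -\vartheta'$, i.e.,
$$l\ddot{\vartheta}' + (g - \omega^2 h_0\cos\omega t)\vartheta' = 0.$$
This is the desired equation of motion. If the hinge is at rest, i.e., $h_0 = 0$, we
get
$$\ddot{\vartheta}' + \frac{g}{l}\vartheta' = 0.$$
This is the equation of motion of the ordinary pendulum!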
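The step from the Lagrangian to the equation of motion can be cross-checked numerically: along any smooth test trajectory, the Euler-Lagrange expression evaluated by finite differences must agree with the closed form $ml\,(l\ddot{\vartheta} + \omega^2 h_0\sin\vartheta\cos\omega t - g\sin\vartheta)$. The sketch below assumes illustrative parameter values and an arbitrary test trajectory, which is not a solution of the equation of motion; all names are hypothetical.

```python
import math

m, l, g, h0, w = 1.0, 0.5, 9.81, 0.05, 10.0   # illustrative values (assumed)

def lagrangian(th, thd, t):
    """L(theta, theta_dot, t) of Exercise 15.7; theta is measured from the upward vertical."""
    T = 0.5*m*(thd**2*l**2 + w**2*h0**2*math.sin(w*t)**2
               + 2*w*h0*thd*l*math.sin(th)*math.sin(w*t))
    V = m*g*(h0*math.cos(w*t) + l*math.cos(th))
    return T - V

# arbitrary smooth test trajectory with its exact derivatives
def theta(t):    return 0.3*math.sin(2.0*t) + 0.1*t
def theta_d(t):  return 0.6*math.cos(2.0*t) + 0.1
def theta_dd(t): return -1.2*math.sin(2.0*t)

def dL_dthd(t, eps=1e-3):
    """Central difference for dL/d(theta_dot) along the test trajectory."""
    th, thd = theta(t), theta_d(t)
    return (lagrangian(th, thd + eps, t) - lagrangian(th, thd - eps, t)) / (2*eps)

def euler_lagrange_numeric(t, eps=1e-4):
    """d/dt (dL/d theta_dot) - dL/d theta, evaluated by central differences."""
    ddt = (dL_dthd(t + eps) - dL_dthd(t - eps)) / (2*eps)
    th, thd = theta(t), theta_d(t)
    dL_dth = (lagrangian(th + eps, thd, t) - lagrangian(th - eps, thd, t)) / (2*eps)
    return ddt - dL_dth

def euler_lagrange_closed(t):
    """Closed form of the same expression as derived in the solution."""
    th = theta(t)
    return m*l*(l*theta_dd(t) + w**2*h0*math.sin(th)*math.cos(w*t) - g*math.sin(th))
```

Because the identity holds for every trajectory, agreement at a few sample times is a quick sanity check on the signs of the individual terms.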
EXERCISE
15.8 Stable Equilibrium Position of an Upright Pendulum
Problem. Find the position of stable equilibrium of the pendulum of Exercise 15.7
if the hinge oscillates with the frequency $\omega \gg \sqrt{g/l}$.
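Solution. We first rewrite the Lagrangian of the pendulum of Exercise 15.7 as follows: The terms
$$\frac{m\omega^2}{2}h_0^2\sin^2\omega t \quad\text{and}\quad -mgh_0\cos\omega t$$
can be written as total derivatives with respect to time:
$$\frac{m\omega^2}{2}h_0^2\sin^2\omega t = -\frac{d}{dt}\left(\frac{1}{4}m\omega h_0^2\sin\omega t\cos\omega t\right) + C,$$
$$-mgh_0\cos\omega t = -\frac{d}{dt}\left(\frac{mgh_0}{\omega}\sin\omega t\right).$$
We can omit these terms, since Lagrangians that differ only by a total derivative with
respect to time are equivalent according to the Hamilton principle $\delta\int_{t_1}^{t_2} L\,dt = 0$.
Hence,
$$L = \frac{m}{2}\left[\dot{\vartheta}^2 l^2 + \omega^2 h_0^2\sin^2\omega t + 2\omega h_0\dot{\vartheta}\, l\sin\vartheta\sin\omega t - 2g(h_0\cos\omega t + l\cos\vartheta)\right]$$
$$= \frac{m}{2}\left[\dot{\vartheta}^2 l^2 + 2\omega h_0\dot{\vartheta}\, l\sin\vartheta\sin\omega t - 2gl\cos\vartheta\right]. \tag{15.19}$$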
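Another transformation yields
$$m\omega h_0\dot{\vartheta}\, l\sin\vartheta\sin\omega t = -\frac{d}{dt}\left(m\omega h_0 l\cos\vartheta\sin\omega t\right) + m\omega^2 h_0 l\cos\vartheta\cos\omega t,$$
so that the Lagrangian finally reads
$$L = \frac{m}{2}\left[\dot{\vartheta}^2 l^2 + 2\omega^2 h_0 l\cos\vartheta\cos\omega t - 2gl\cos\vartheta\right]. \tag{15.20}$$
From this, one obtains of course the equation of motion as in Exercise 15.7.
We consider ϑ as a generalized coordinate with the appropriate mass coefficient
$ml^2$. The equation of motion then reads
$$ml^2\ddot{\vartheta} = mgl\sin\vartheta - m\omega^2 h_0 l\sin\vartheta\cos\omega t = -\frac{dU}{d\vartheta} + f \tag{15.21}$$
with $U = mgl\cos\vartheta$ and $f = -m\omega^2 h_0 l\sin\vartheta\cos\omega t$. The additional force f is due to
the motion of the hinge. For very fast oscillations of the hinge, we assume that the
motion of the pendulum in the potential U is superposed by quick oscillations ξ:
$$\vartheta(t) = \bar{\vartheta}(t) + \xi(t).$$
The average value of the oscillations over a period 2π/ω equals zero, while $\bar{\vartheta}$ changes
only slowly; therefore,
$$\overline{\vartheta(t)} = \frac{\omega}{2\pi}\int_0^{2\pi/\omega}\vartheta(t)\,dt = \bar{\vartheta}(t). \tag{15.22}$$
Equations (15.21) with (15.22) can then be written as
$$ml^2\ddot{\bar{\vartheta}}(t) + ml^2\ddot{\xi}(t) = -\frac{dU}{d\vartheta} + f(\vartheta).$$
Because $f(\vartheta) = f(\bar{\vartheta} + \xi) = f(\bar{\vartheta}) + \xi\, df/d\bar{\vartheta}$, an expansion up to first order in ξ
yields
$$ml^2\ddot{\bar{\vartheta}} + ml^2\ddot{\xi} = -\frac{dU}{d\bar{\vartheta}} - \xi\frac{d^2U}{d\bar{\vartheta}^2} + f(\bar{\vartheta}) + \xi\frac{df}{d\bar{\vartheta}}. \tag{15.23}$$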
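The dominant terms for the oscillations are $ml^2\ddot{\xi}$ and $f(\bar{\vartheta})$:
$$ml^2\ddot{\xi} = f(\bar{\vartheta}) \quad\Rightarrow\quad \ddot{\xi} = -\frac{\omega^2 h_0}{l}\sin\bar{\vartheta}\cos\omega t,$$
and from this, we obtain
$$\xi = \frac{h_0}{l}\sin\bar{\vartheta}\cos\omega t = -\frac{f}{m\omega^2 l^2}. \tag{15.24}$$
We now calculate an effective potential created by the oscillations, and for this purpose
we average (15.23) over a period 2π/ω (the mean values over ξ and f vanish):
$$ml^2\ddot{\bar{\vartheta}} = -\frac{dU}{d\bar{\vartheta}} + \overline{\xi\frac{df}{d\bar{\vartheta}}} = -\frac{dU}{d\bar{\vartheta}} - \frac{1}{m\omega^2 l^2}\,\overline{f\frac{df}{d\bar{\vartheta}}}.$$
This can be written as
$$ml^2\ddot{\bar{\vartheta}} = -\frac{dU_{\text{eff}}}{d\bar{\vartheta}} \quad\text{with}\quad U_{\text{eff}} = U + \frac{1}{2m\omega^2 l^2}\,\overline{f^2}. \tag{15.25}$$
Because $\overline{\cos^2\omega t} = 1/2$, we get
$$U_{\text{eff}} = U + \frac{m\omega^2 h_0^2}{4}\sin^2\bar{\vartheta} = mgl\cos\bar{\vartheta} + \frac{m\omega^2 h_0^2}{4}\sin^2\bar{\vartheta}. \tag{15.26}$$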
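The minima of $U_{\text{eff}}$ give the stable equilibrium positions. The stationary points follow from
$$\frac{dU_{\text{eff}}}{d\vartheta} = -mgl\sin\vartheta + \frac{m\omega^2 h_0^2}{2}\sin\vartheta\cos\vartheta \overset{!}{=} 0$$
$$\Rightarrow\quad \sin\vartheta = 0 \quad\text{or}\quad \cos\vartheta = \frac{2gl}{\omega^2 h_0^2}. \tag{15.27}$$
Their character is decided by the second derivative $d^2U_{\text{eff}}/d\vartheta^2 = -mgl\cos\vartheta + (m\omega^2 h_0^2/2)\cos 2\vartheta$.
For any ω the position vertically downwards (ϑ = π) is stable. The upright position ϑ = 0
is a maximum of $U_{\text{eff}}$ as long as $\omega^2 < 2gl/h_0^2$, but it becomes a minimum, and hence stable,
for $\omega^2 > 2gl/h_0^2$. The additional stationary points with the angle given above exist only for
$\omega^2 > 2gl/h_0^2$; there the second derivative is negative, so they are maxima of $U_{\text{eff}}$ that
separate the two stable positions.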
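The stationary points of the effective potential (15.26) can be classified by the sign of its second derivative, $d^2U_{\text{eff}}/d\vartheta^2 = -mgl\cos\vartheta + (m\omega^2 h_0^2/2)\cos 2\vartheta$. The following sketch does this for given parameters; the function names and the sample values are illustrative assumptions.

```python
import math

def d2_ueff(theta, m, g, l, h0, w):
    """Second derivative of U_eff = m g l cos(theta) + (m w^2 h0^2 / 4) sin(theta)^2."""
    return -m*g*l*math.cos(theta) + 0.5*m*w**2*h0**2*math.cos(2.0*theta)

def stationary_points(m, g, l, h0, w):
    """Zeros of dU_eff/dtheta in [0, pi], each labelled 'stable' or 'unstable'."""
    points = {"up": 0.0, "down": math.pi}
    c = 2.0*g*l / (w**2 * h0**2)          # cos(theta) of the tilted stationary point
    if c < 1.0:
        points["tilted"] = math.acos(c)
    return {name: (th, "stable" if d2_ueff(th, m, g, l, h0, w) > 0 else "unstable")
            for name, th in points.items()}
```

For a slowly driven hinge only the hanging position is a minimum; for $\omega^2 h_0^2 > 2gl$ the upright position is a minimum as well, and the tilted stationary point lies on the barrier between the two.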
EXERCISE
15.9 Vibration Frequencies of a Three-Atom Symmetric Molecule
Problem. Find the vibration frequencies of a linear three-atom symmetric molecule
ABA. We assume that the potential energy of the molecule depends only on the distances
AB and BA and the angle ABA. Write the Lagrangian of the molecule in
appropriate coordinates (normal coordinates) where the Lagrangian has the form
$$L = \sum_\alpha \frac{m_\alpha}{2}\left(\dot{\Theta}_\alpha^2 - \omega_\alpha^2\Theta_\alpha^2\right).$$
Fig. 15.9. Linear three-atom symmetric molecule
The $\omega_\alpha$ are the desired vibration frequencies of the normal modes. If one cannot
find the normal coordinates of the system, one can proceed as follows: If a system has
s degrees of freedom and does not vibrate, then the Lagrangian generally reads
$$L = \frac{1}{2}\sum_{i,k}\left(m_{ik}\dot{x}_i\dot{x}_k - k_{ik}x_i x_k\right).$$
The eigenfrequencies of the system are then determined by the so-called characteristic
equation
$$\det|k_{ik} - \omega^2 m_{ik}| = 0.$$
Solution. We describe the geometry of the molecule in the x,y-plane. Let the displacement
of the atom α from the rest position $\mathbf{r}_{\alpha 0}$ be denoted by $\mathbf{x}_\alpha = (x_\alpha, y_\alpha)$, i.e.,
$\mathbf{r}_\alpha = \mathbf{r}_{\alpha 0} + \mathbf{x}_\alpha$. The forces that keep the atoms together are assumed to be to first order
linear in the displacement from the rest position, i.e.,
$$L = \frac{m_A}{2}(\dot{x}_1^2 + \dot{x}_3^2) + \frac{m_B}{2}\dot{x}_2^2 - \frac{K_L}{2}\left[(x_1 - x_2)^2 + (x_3 - x_2)^2\right],$$
if we consider longitudinal vibrations. For these modes the conservation of the center
of gravity can be written as follows:
$$m_A(x_1 + x_3) + m_B x_2 = 0, \qquad \sum_\alpha m_\alpha\mathbf{r}_\alpha = \sum_\alpha m_\alpha\mathbf{r}_{\alpha 0},$$
and we can eliminate $x_2$ from L:
$$L = \frac{m_A}{2}(\dot{x}_1^2 + \dot{x}_3^2) + \frac{m_A^2}{2m_B}(\dot{x}_1 + \dot{x}_3)^2 - \frac{K_L}{2}\left[x_1^2 + x_3^2 + 2\frac{m_A}{m_B}(x_1 + x_3)^2 + \frac{2m_A^2}{m_B^2}(x_1 + x_3)^2\right].$$
Hence, only two normal coordinates for the longitudinal motion can exist, because of
the conservation of the center of gravity.
Let $\Theta_1 = x_1 + x_3$, $\Theta_2 = x_1 - x_3$. L can then be written as
$$L = \frac{m_A}{4}\dot{\Theta}_2^2 + \frac{m_A\mu}{4m_B}\dot{\Theta}_1^2 - \frac{K_L}{4}\Theta_2^2 - \frac{K_L\mu^2}{4m_B^2}\Theta_1^2, \qquad \mu \equiv 2m_A + m_B,$$
i.e., $\Theta_1$ and $\Theta_2$ are the two normal coordinates of the longitudinal vibration (μ represents
the total mass of the molecule).
(a) For $x_1 = x_3$, $\Theta_2$ vanishes; i.e., $\Theta_1$ describes antisymmetric longitudinal vibrations
(Fig. 15.10).
(b) For $x_1 = -x_3$, $\Theta_1$ vanishes; i.e., $\Theta_2$ describes symmetric longitudinal vibrations
(Fig. 15.11).
A comparison of kinetic and potential energy yields the normal frequencies
$$\omega_a = \sqrt{\frac{K_L\mu}{m_A m_B}}, \quad\text{antisymmetric vibration},$$
$$\omega_s = \sqrt{\frac{K_L}{m_A}}, \quad\text{symmetric vibration}.$$
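For transverse vibrations (see Fig. 15.12) of the form in Fig. 15.13, we set
$$L = \frac{m_A}{2}(\dot{y}_1^2 + \dot{y}_3^2) + \frac{m_B}{2}\dot{y}_2^2 - \frac{K_T}{2}(l\delta)^2,$$
where δ is the deviation of the angle ∢(ABA) from π. For small values of δ, we can
set
$$\delta = \left(\frac{\pi}{2} - \alpha_1\right) + \left(\frac{\pi}{2} - \alpha_2\right) \approx \sin\left(\frac{\pi}{2} - \alpha_1\right) + \sin\left(\frac{\pi}{2} - \alpha_2\right) = \cos\alpha_1 + \cos\alpha_2 = \frac{y_2 - y_1}{l} + \frac{y_2 - y_3}{l}.$$
We utilize the conservation of the center of gravity and angular momentum conservation
to eliminate $y_2$ and $y_3$ from L:
$$m_A(y_1 + y_3) + m_B y_2 = 0 \quad\text{(conservation of the center of gravity)}.$$
To exclude rotation of the molecule, the total angular momentum must vanish:
$$\mathbf{D} = \sum_\alpha m_\alpha[\mathbf{r}_\alpha\times\mathbf{v}_\alpha] \approx \sum_\alpha m_\alpha[\mathbf{r}_{\alpha 0}\times\dot{\mathbf{x}}_\alpha] = \frac{d}{dt}\sum_\alpha m_\alpha[\mathbf{r}_{\alpha 0}\times\mathbf{x}_\alpha],$$
which can be achieved by
$$\sum_\alpha m_\alpha[\mathbf{r}_{\alpha 0}\times\mathbf{x}_\alpha] = 0.$$
For our case, it thus follows that $y_1 = y_3$. Then we get
$$(l\dot{\delta})^2 = \frac{4\mu^2\dot{y}_1^2}{m_B^2} \quad\text{and}\quad L = \frac{m_A m_B}{4\mu}\,l^2\dot{\delta}^2 - \frac{K_T l^2}{2}\delta^2.$$
We thus obtain the eigenfrequency of the transverse vibration:
$$\omega_T = \sqrt{\frac{2K_T\mu}{m_A m_B}}.$$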
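The longitudinal frequencies can be cross-checked against the characteristic equation $\det|k_{ik} - \omega^2 m_{ik}| = 0$ without introducing normal coordinates. The sketch below builds $k_{ik}$ and $m_{ik}$ directly from the longitudinal Lagrangian in the coordinates $x_1, x_2, x_3$ and evaluates the determinant at the claimed values of $\omega^2$; the helper names and the sample masses are assumptions for illustration.

```python
def det3(a):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (a[0][0]*(a[1][1]*a[2][2] - a[1][2]*a[2][1])
          - a[0][1]*(a[1][0]*a[2][2] - a[1][2]*a[2][0])
          + a[0][2]*(a[1][0]*a[2][1] - a[1][1]*a[2][0]))

def char_det(omega2, mA, mB, KL):
    """det(k_ik - omega^2 m_ik) for the longitudinal motion of the chain A-B-A."""
    k = [[KL, -KL, 0.0],
         [-KL, 2.0*KL, -KL],
         [0.0, -KL, KL]]
    masses = [mA, mB, mA]          # m_ik is diagonal in the coordinates x1, x2, x3
    a = [[k[i][j] - (omega2*masses[i] if i == j else 0.0) for j in range(3)]
         for i in range(3)]
    return det3(a)

def longitudinal_frequencies_sq(mA, mB, KL):
    """Squared frequencies of the antisymmetric and symmetric modes from the text."""
    mu = 2.0*mA + mB
    return {"antisymmetric": KL*mu/(mA*mB), "symmetric": KL/mA}
```

The third root, $\omega^2 = 0$, is the uniform translation that the text removes through conservation of the center of gravity.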
EXERCISE
15.10 Normal Frequencies of a Triangular Molecule
Problem. Calculate the normal frequencies of a symmetric molecule ABA of triangular
shape:
Fig. 15.14. Triangular molecule
Solution. Conservation of the center of gravity here reads
$$m_A(x_1 + x_3) + m_B x_2 = 0, \qquad m_A(y_1 + y_3) + m_B y_2 = 0.$$
For angular momentum conservation, we go to the rest position of atom B, and because
$m_1 = m_3 = m_A$, it follows that
$$\mathbf{r}_{10}\times\mathbf{x}_1 + \mathbf{r}_{30}\times\mathbf{x}_3 = 0.$$
We have
$$\mathbf{r}_{10}\times\mathbf{x}_1 = |\mathbf{r}_{10}|(-x_1\cos\alpha + y_1\sin\alpha)\,\mathbf{e},$$
$$\mathbf{r}_{30}\times\mathbf{x}_3 = |\mathbf{r}_{30}|(-x_3\cos\alpha - y_3\sin\alpha)\,\mathbf{e}.$$
Because $|\mathbf{r}_{10}| = |\mathbf{r}_{30}|$, the angular momentum conservation law follows:
$$\sin\alpha\,(y_1 - y_3) = \cos\alpha\,(x_1 + x_3) \quad\text{or}\quad y_1 - y_3 = \cot\alpha\,(x_1 + x_3).$$
Fig. 15.15. The various coordinates for Exercise 15.10
The changes $\delta l_1$ and $\delta l_2$ of the distances AB and BA result by projection of the
vectors $\mathbf{x}_1 - \mathbf{x}_2$ and $\mathbf{x}_3 - \mathbf{x}_2$ onto the directions of the lines AB and BA:
$$\delta l_1 = (x_1 - x_2)\sin\alpha + (y_1 - y_2)\cos\alpha,$$
$$\delta l_2 = -(x_3 - x_2)\sin\alpha + (y_3 - y_2)\cos\alpha.$$
The change of the angle 2α = ∢(ABA) is found by projection of the vectors $\mathbf{x}_1 - \mathbf{x}_2$
and $\mathbf{x}_3 - \mathbf{x}_2$ onto the directions orthogonal to the line segments AB and BA:
$$\delta = \frac{1}{l}\left[(x_1 - x_2)\cos\alpha - (y_1 - y_2)\sin\alpha\right] + \frac{1}{l}\left[-(x_3 - x_2)\cos\alpha - (y_3 - y_2)\sin\alpha\right].$$
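We write the Lagrangian of the molecule as
$$L = \frac{m_A}{2}(\dot{\mathbf{x}}_1^2 + \dot{\mathbf{x}}_3^2) + \frac{m_B}{2}\dot{\mathbf{x}}_2^2 - \frac{K_1}{2}\left[(\delta l_1)^2 + (\delta l_2)^2\right] - \frac{K_2}{2}(l\delta)^2.$$
Here, $(K_1/2)[(\delta l_1)^2 + (\delta l_2)^2]$ is the potential energy of the stretching, and $K_2(l\delta)^2/2$ is
the potential energy of the bending of the molecule. We adopt as new coordinates
$$Q_\alpha = x_1 + x_3, \qquad q_{s1} = x_1 - x_3, \qquad q_{s2} = y_1 + y_3,$$
and then have
$$x_1 = \frac{1}{2}(Q_\alpha + q_{s1}), \qquad x_2 = -\frac{m_A}{m_B}Q_\alpha, \qquad x_3 = \frac{1}{2}(Q_\alpha - q_{s1}),$$
$$y_1 = \frac{1}{2}(q_{s2} + Q_\alpha\cot\alpha), \qquad y_2 = -\frac{m_A}{m_B}q_{s2}, \qquad y_3 = \frac{1}{2}(q_{s2} - Q_\alpha\cot\alpha).$$
Because $y_1 - y_3 = Q_\alpha\cot\alpha$, we find for L
$$L = \frac{m_A}{4}\left(\frac{2m_A}{m_B} + \frac{1}{\sin^2\alpha}\right)\dot{Q}_\alpha^2 + \frac{m_A}{4}\dot{q}_{s1}^2 + \frac{m_A\mu}{4m_B}\dot{q}_{s2}^2$$
$$\quad - \frac{K_1}{4}Q_\alpha^2\left(\frac{2m_A}{m_B} + \frac{1}{\sin^2\alpha}\right)\left(1 + \frac{2m_A}{m_B}\sin^2\alpha\right)$$
$$\quad - \frac{q_{s1}^2}{4}\left(K_1\sin^2\alpha + 2K_2\cos^2\alpha\right) - q_{s2}^2\,\frac{\mu^2}{4m_B^2}\left(K_1\cos^2\alpha + 2K_2\sin^2\alpha\right)$$
$$\quad + q_{s1}q_{s2}\,\frac{\mu}{2m_B}(2K_2 - K_1)\sin\alpha\cos\alpha.$$
Obviously, $Q_\alpha$ is a normal coordinate, with the vibration frequency
$$\omega_\alpha^2 = \frac{K_1}{m_A}\left(1 + \frac{2m_A}{m_B}\sin^2\alpha\right).$$
Pure $Q_\alpha$-vibrations occur for $x_1 = x_3$, $y_1 = -y_3$; i.e., $Q_\alpha$ describes antisymmetric
vibrations with respect to the y-axis in Fig. 15.16.
The eigenfrequencies $\omega_{s1}$ and $\omega_{s2}$ of the normal vibrations for $q_{s1}$ and $q_{s2}$ must be
determined by the characteristic equation
$$\omega^4 - \omega^2\left[\frac{K_1}{m_A}\left(1 + \frac{2m_A}{m_B}\cos^2\alpha\right) + \frac{2K_2}{m_A}\left(1 + \frac{2m_A}{m_B}\sin^2\alpha\right)\right] + \frac{2\mu K_1 K_2}{m_B m_A^2} = 0.$$
The coordinates $q_{s1}$ and $q_{s2}$ correspond to vibrations that are symmetric about the
y-axis (Fig. 15.17):
$$(x_1 = -x_3, \quad Q_\alpha = 0 \;\Rightarrow\; y_1 = y_3).$$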
EXERCISE
15.11 Normal Frequencies of an Asymmetric Linear Molecule

Problem. Find the normal frequencies for a linear, asymmetric molecule with the
shape in Fig. 15.18.
Solution. Conservation of the center of gravity and of angular momentum now read
$$m_A x_1 + m_B x_2 + m_C x_3 = 0 \quad (x\text{-center of gravity}),$$
$$m_A y_1 + m_B y_2 + m_C y_3 = 0 \quad (y\text{-center of gravity}),$$
$$m_A l_1 y_1 = m_C l_2 y_3 \quad \text{(angular momentum conservation)}.$$
For the potential energy of bending, we write
$$V = \frac{K_2}{2}(l\delta)^2, \qquad (2l = l_1 + l_2);$$
for that of stretching,
$$V = \frac{K_1}{2}(x_1 - x_2)^2 + \frac{K_1'}{2}(x_2 - x_3)^2.$$
The analogous calculation as for Exercise 15.9 after some effort yields
$$\omega_T^2 = \frac{K_2 l^2}{l_1^2 l_2^2}\left(\frac{l_1^2}{m_C} + \frac{l_2^2}{m_A} + \frac{4l^2}{m_B}\right)$$
for the frequency of the transverse vibration, and also the equation quadratic in $\omega^2$
$$\omega^4 - \omega^2\left[K_1\left(\frac{1}{m_A} + \frac{1}{m_B}\right) + K_1'\left(\frac{1}{m_B} + \frac{1}{m_C}\right)\right] + \frac{\mu K_1 K_1'}{m_A m_B m_C} = 0$$
for the frequencies $\omega_{L1}$, $\omega_{L2}$ of the two longitudinal vibrations (here $\mu = m_A + m_B + m_C$ is the total mass of the molecule).
EXERCISE
15.12 Double Pendulum
Problem. Determine
(a) the generalized coordinates of the double pendulum;
(b) the Lagrangian of the system;
(c) the equations of motion;
(d) the equations of motion for m1 = m2 = m and l1 = l2 = l;
(e) as (d) for small amplitudes; and
(f) for the case (e) the normal vibrations and frequencies.
Fig. 15.19. Coordinates of the double pendulum
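Solution. (a) The appropriate generalized coordinates are the two angles $\vartheta_1$ and $\vartheta_2$
that are related to the Cartesian coordinates by
$$x_1 = l_1\cos\vartheta_1, \qquad y_1 = l_1\sin\vartheta_1,$$
$$x_2 = l_1\cos\vartheta_1 + l_2\cos\vartheta_2, \qquad y_2 = l_1\sin\vartheta_1 + l_2\sin\vartheta_2. \tag{15.28}$$
(b) From (15.28), it follows by differentiation that
$$\dot{x}_1 = -l_1\dot{\vartheta}_1\sin\vartheta_1, \qquad \dot{y}_1 = l_1\dot{\vartheta}_1\cos\vartheta_1,$$
$$\dot{x}_2 = -l_1\dot{\vartheta}_1\sin\vartheta_1 - l_2\dot{\vartheta}_2\sin\vartheta_2, \qquad \dot{y}_2 = l_1\dot{\vartheta}_1\cos\vartheta_1 + l_2\dot{\vartheta}_2\cos\vartheta_2.$$
The kinetic energy of the system is
$$T = \frac{1}{2}m_1(\dot{x}_1^2 + \dot{y}_1^2) + \frac{1}{2}m_2(\dot{x}_2^2 + \dot{y}_2^2)$$
$$= \frac{1}{2}m_1 l_1^2\dot{\vartheta}_1^2 + \frac{1}{2}m_2\left[l_1^2\dot{\vartheta}_1^2 + l_2^2\dot{\vartheta}_2^2 + 2l_1 l_2\dot{\vartheta}_1\dot{\vartheta}_2\cos(\vartheta_1 - \vartheta_2)\right].$$
(Addition theorem!)
To get the potential energy, we adopt a plane as a reference height, at the distance
$l_1 + l_2$ below the suspension point:
$$V = m_1 g\left[l_1 + l_2 - l_1\cos\vartheta_1\right] + m_2 g\left[l_1 + l_2 - (l_1\cos\vartheta_1 + l_2\cos\vartheta_2)\right].$$
The Lagrangian then becomes
$$L = T - V = \frac{1}{2}m_1 l_1^2\dot{\vartheta}_1^2 + \frac{1}{2}m_2\left[l_1^2\dot{\vartheta}_1^2 + l_2^2\dot{\vartheta}_2^2 + 2l_1 l_2\dot{\vartheta}_1\dot{\vartheta}_2\cos(\vartheta_1 - \vartheta_2)\right]$$
$$\quad - m_1 g\left[l_1 + l_2 - l_1\cos\vartheta_1\right] - m_2 g\left[l_1 + l_2 - (l_1\cos\vartheta_1 + l_2\cos\vartheta_2)\right]. \tag{15.29}$$
(c) The Lagrange equations with $\vartheta_1$ and $\vartheta_2$ read
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{\vartheta}_1} - \frac{\partial L}{\partial\vartheta_1} = 0, \qquad \frac{d}{dt}\frac{\partial L}{\partial\dot{\vartheta}_2} - \frac{\partial L}{\partial\vartheta_2} = 0.$$
One has
$$\frac{\partial L}{\partial\vartheta_1} = -m_2 l_1 l_2\dot{\vartheta}_1\dot{\vartheta}_2\sin(\vartheta_1 - \vartheta_2) - m_1 g l_1\sin\vartheta_1 - m_2 g l_1\sin\vartheta_1,$$
$$\frac{\partial L}{\partial\dot{\vartheta}_1} = m_1 l_1^2\dot{\vartheta}_1 + m_2 l_1^2\dot{\vartheta}_1 + m_2 l_1 l_2\dot{\vartheta}_2\cos(\vartheta_1 - \vartheta_2),$$
$$\frac{\partial L}{\partial\vartheta_2} = m_2 l_1 l_2\dot{\vartheta}_1\dot{\vartheta}_2\sin(\vartheta_1 - \vartheta_2) - m_2 g l_2\sin\vartheta_2,$$
$$\frac{\partial L}{\partial\dot{\vartheta}_2} = m_2 l_2^2\dot{\vartheta}_2 + m_2 l_1 l_2\dot{\vartheta}_1\cos(\vartheta_1 - \vartheta_2).$$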
Thus, the Lagrange equations read
m1 l12 ϑ̈1 + m2 l12 ϑ̈1 + m2 l1 l2 ϑ̈2 cos(ϑ1 − ϑ2 ) − m2 l1 l2 ϑ̇2 (ϑ̇1 − ϑ̇2 ) sin(ϑ1 − ϑ2 )
= −m2 l1 l2 ϑ̇1 ϑ̇2 sin(ϑ1 − ϑ2 ) − m1 gl1 sin ϑ1 − m2 gl1 sin ϑ1
and
m2 l22 ϑ̈2 + m2 l1 l2 ϑ̈1 cos(ϑ1 − ϑ2 ) − m2 l1 l2 ϑ̇1 (ϑ̇1 − ϑ̇2 ) sin(ϑ1 − ϑ2 )

= m2 l1 l2 ϑ̇1 ϑ̇2 sin(ϑ1 − ϑ2 ) − m2 gl2 sin ϑ2 ,
or
(m1 + m2 )l12 ϑ̈1 + m2 l1 l2 ϑ̈2 cos(ϑ1 − ϑ2 ) + m2 l1 l2 ϑ̇22 sin(ϑ1 − ϑ2 )

= −(m1 + m2 )gl1 sin ϑ1 (15.30)
and
m2 l22 ϑ̈2 + m2 l1 l2 ϑ̈1 cos(ϑ1 − ϑ2 ) − m2 l1 l2 ϑ̇12 sin(ϑ1 − ϑ2 )

= −m2 gl2 sin ϑ2 .
These are the desired equations of motion.

(d) For the case
m1 = m2 = m and l1 = l2 = l,
(15.30) reduce to
2l ϑ̈1 + l ϑ̈2 cos(ϑ1 − ϑ2 ) + l ϑ̇22 sin(ϑ1 − ϑ2 ) = −2g sin ϑ1 ,

(15.31)
l ϑ̈1 cos(ϑ1 − ϑ2 ) + l ϑ̈2 − l ϑ̇12 sin(ϑ1 − ϑ2 ) = −g sin ϑ2 .
(e) If moreover the oscillations are small, then sin ϑ = ϑ, cos ϑ = 1, and terms
proportional to ϑ̇ 2 are negligible, which leads to
2l ϑ̈1 + l ϑ̈2 = −2gϑ1 , l ϑ̈1 + l ϑ̈2 = −gϑ2 . (15.32)
(f) With the ansatz
$$\vartheta_1 = A_1 e^{i\omega t}, \qquad \vartheta_2 = A_2 e^{i\omega t},$$
we then obtain
$$2(g - l\omega^2)A_1 - l\omega^2 A_2 = 0, \qquad -l\omega^2 A_1 + (g - l\omega^2)A_2 = 0. \tag{15.33}$$
To ensure that $A_1$ and $A_2$ do not vanish simultaneously, the determinant of the coefficients
must vanish:
$$\begin{vmatrix} 2(g - l\omega^2) & -l\omega^2 \\ -l\omega^2 & g - l\omega^2 \end{vmatrix} = 0,$$
and therefore,
$$l^2\omega^4 - 4lg\omega^2 + 2g^2 = 0$$
with the solutions
$$\omega^2 = \frac{4lg \pm \sqrt{16l^2g^2 - 8l^2g^2}}{2l^2} = (2 \pm \sqrt{2})\frac{g}{l};$$
i.e.,
$$\omega_1^2 = (2 + \sqrt{2})\frac{g}{l}, \qquad \omega_2^2 = (2 - \sqrt{2})\frac{g}{l}. \tag{15.34}$$
By inserting (15.34) into (15.33), we obtain
$\omega_1^2$: $A_2 = -\sqrt{2}A_1$, i.e., the pendulums oscillate out of phase,
$\omega_2^2$: $A_2 = \sqrt{2}A_1$, i.e., the pendulums oscillate in phase.
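The normal frequencies and amplitude ratios of part (f) can be reproduced in a few lines. The sketch below solves the quadratic in $\omega^2$ and reads the ratio $A_2/A_1$ off the second equation of (15.33); the function name and the sample values of g and l are assumptions for illustration.

```python
import math

def double_pendulum_modes(g, l):
    """Solve l^2 w^4 - 4 l g w^2 + 2 g^2 = 0; return [(w^2, A2/A1), ...]."""
    disc = math.sqrt(16.0*l**2*g**2 - 8.0*l**2*g**2)
    modes = []
    for sign in (+1.0, -1.0):
        w2 = (4.0*l*g + sign*disc) / (2.0*l**2)
        # second equation of (15.33): -l w^2 A1 + (g - l w^2) A2 = 0
        ratio = l*w2 / (g - l*w2)
        modes.append((w2, ratio))
    return modes
```

A negative ratio corresponds to the out-of-phase mode with the higher frequency, a positive ratio to the in-phase mode with the lower frequency.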
EXERCISE
15.13 Mass Point on a Cycloid Trajectory
Problem. A mass point glides without friction on a cycloid, which is given by x =

a(ϑ − sin ϑ) and y = a(1 + cos ϑ) (with 0 ≤ ϑ ≤ 2π ). Determine
(a) the Lagrangian, and
(b) the equation of motion.
(c) Solve the equation of motion.
Fig. 15.20. Mass point glides

on a cycloid
Solution. The cycloid is represented by
x = a(ϑ − sin ϑ), y = a(1 + cos ϑ),

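where 0 ≤ ϑ ≤ 2π. The kinetic energy is
$$T = \frac{1}{2}m(\dot{x}^2 + \dot{y}^2) = \frac{1}{2}ma^2\left\{\left[(1 - \cos\vartheta)\dot{\vartheta}\right]^2 + \left[-(\sin\vartheta)\dot{\vartheta}\right]^2\right\},$$
and the potential energy is
$$V = mgy = mga(1 + \cos\vartheta).$$
The Lagrangian is given by
$$L = T - V = ma^2(1 - \cos\vartheta)\dot{\vartheta}^2 - mga(1 + \cos\vartheta).$$
The equation of motion then reads
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{\vartheta}} - \frac{\partial L}{\partial\vartheta} = 0,$$
i.e.,
$$\frac{d}{dt}\left[2ma^2(1 - \cos\vartheta)\dot{\vartheta}\right] - ma^2(\sin\vartheta)\dot{\vartheta}^2 - mga\sin\vartheta = 0$$
or
$$\frac{d}{dt}\left[(1 - \cos\vartheta)\dot{\vartheta}\right] - \frac{1}{2}(\sin\vartheta)\dot{\vartheta}^2 - \frac{g}{2a}\sin\vartheta = 0,$$
i.e.,
$$(1 - \cos\vartheta)\ddot{\vartheta} + \frac{1}{2}(\sin\vartheta)\dot{\vartheta}^2 - \frac{g}{2a}\sin\vartheta = 0. \tag{15.35}$$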
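By setting u = cos(ϑ/2), one has
$$\frac{du}{dt} = -\frac{1}{2}\sin\frac{\vartheta}{2}\,\dot{\vartheta}$$
and
$$\frac{d^2u}{dt^2} = -\frac{1}{2}\sin\frac{\vartheta}{2}\,\ddot{\vartheta} - \frac{1}{4}\cos\frac{\vartheta}{2}\,\dot{\vartheta}^2.$$
Since cot(ϑ/2) = sin ϑ/(1 − cos ϑ), we can write (15.35) as
$$\ddot{\vartheta} + \frac{1}{2}\cot\frac{\vartheta}{2}\,\dot{\vartheta}^2 - \frac{g}{2a}\cot\frac{\vartheta}{2} = 0,$$
and therefore,
$$\frac{d^2u}{dt^2} + \frac{g}{4a}u = 0. \tag{15.36}$$
The solution of this differential equation is
$$u = \cos\frac{\vartheta}{2} = C_1\cos\sqrt{\frac{g}{4a}}\,t + C_2\sin\sqrt{\frac{g}{4a}}\,t.$$
The motion is just like the vibration of an ordinary pendulum of length l = 4a. The
arrangement is therefore called a “cycloid pendulum.”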
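The substitution u = cos(ϑ/2) can be checked by integrating (15.35) directly and comparing cos(ϑ/2) with the harmonic solution of (15.36). The sketch below assumes a start from rest at $\vartheta_0$, so that $C_1 = \cos(\vartheta_0/2)$ and $C_2 = 0$, and uses a classical RK4 integrator; the names and sample values are illustrative.

```python
import math

def theta_ddot(th, thd, g, a):
    """theta'' from (15.35): (1 - cos th) th'' + (1/2) sin(th) th'^2 - (g/2a) sin(th) = 0."""
    return (0.5*(g/a)*math.sin(th) - 0.5*math.sin(th)*thd**2) / (1.0 - math.cos(th))

def integrate_cycloid(th0, g, a, t_end, steps=20000):
    """RK4 integration of (15.35), starting at rest from th0; returns theta(t_end)."""
    dt = t_end / steps
    th, thd = th0, 0.0
    for _ in range(steps):
        k1t, k1v = thd, theta_ddot(th, thd, g, a)
        k2t, k2v = thd + 0.5*dt*k1v, theta_ddot(th + 0.5*dt*k1t, thd + 0.5*dt*k1v, g, a)
        k3t, k3v = thd + 0.5*dt*k2v, theta_ddot(th + 0.5*dt*k2t, thd + 0.5*dt*k2v, g, a)
        k4t, k4v = thd + dt*k3v, theta_ddot(th + dt*k3t, thd + dt*k3v, g, a)
        th += dt*(k1t + 2*k2t + 2*k3t + k4t)/6.0
        thd += dt*(k1v + 2*k2v + 2*k3v + k4v)/6.0
    return th

def u_analytic(th0, g, a, t):
    """u(t) = cos(th0/2) cos(sqrt(g/4a) t), the solution of (15.36) for a start from rest."""
    return math.cos(0.5*th0) * math.cos(math.sqrt(g/(4.0*a))*t)
```

Integrating over one period $2\pi\sqrt{4a/g}$ from different starting angles returns each run to its starting angle, which is the amplitude-independent period that gives the cycloid pendulum its name.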
EXERCISE
15.14 Spring Pendulum
Problem. A mass m is suspended by a spring with spring constant k in the grav-

itational field. Besides the longitudinal spring vibration, the spring performs a plane
pendulum motion (Fig. 15.21). Find the Lagrangian, derive the equations of motion,
and discuss the resulting terms.
Solution. We introduce plane polar coordinates for solving the problem and adopt
the radius r and the polar angle ϕ as generalized coordinates:
$$y = r\cos\varphi, \qquad \dot{y} = \dot{r}\cos\varphi - r\dot{\varphi}\sin\varphi,$$
$$x = r\sin\varphi, \qquad \dot{x} = \dot{r}\sin\varphi + r\dot{\varphi}\cos\varphi. \tag{15.37}$$
The kinetic energy is given by
$$T = \frac{1}{2}m(\dot{x}^2 + \dot{y}^2) = \frac{1}{2}m(\dot{r}^2 + r^2\dot{\varphi}^2). \tag{15.38}$$
The length of the spring in its rest position, i.e., without the displacement caused by
the mass m, is denoted by $r_0$. The potential energy then reads
$$V = -mgy + \frac{k}{2}(r - r_0)^2 = -mgr\cos\varphi + \frac{k}{2}(r - r_0)^2. \tag{15.39}$$
The Lagrangian is then
$$L = T - V = \frac{1}{2}m(\dot{r}^2 + r^2\dot{\varphi}^2) + mgr\cos\varphi - \frac{k}{2}(r - r_0)^2. \tag{15.40}$$
The equations of motion of the system are obtained immediately via the Lagrange
equations. For the coordinate ϕ, one obtains
$$\frac{d}{dt}(mr^2\dot{\varphi}) = -mgr\sin\varphi. \tag{15.41}$$
This is just the angular momentum law with reference to the coordinate origin. If we
take the time dependence of r into account, then we have
$$mr\ddot{\varphi} = -mg\sin\varphi - 2m\dot{r}\dot{\varphi}. \tag{15.42}$$
The last term on the right-hand side is the Coriolis force caused by the time variation
of the pendulum length r.
For the coordinate r, one obtains
$$m\ddot{r} = mr\dot{\varphi}^2 + mg\cos\varphi - k(r - r_0). \tag{15.43}$$
The first term on the right side represents the radial acceleration, the second term
follows from the radial component of the weight force, and the last term represents
Hooke’s law. For small amplitudes ϕ the motion appears as a superposition of harmonic
vibrations in the r, ϕ-plane.
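The coupled equations (15.42) and (15.43) have no simple closed-form solution, but they are easy to integrate numerically. As a plausibility check, the sketch below integrates them with a classical RK4 step and monitors the total energy T + V from (15.38) and (15.39), which should stay constant; the parameter values and function names are assumptions for illustration.

```python
import math

def rhs(state, m, k, r0, g):
    """Time derivative of (r, r_dot, phi, phi_dot) from (15.43) and (15.42)."""
    r, rd, phi, phid = state
    rdd = r*phid**2 + g*math.cos(phi) - (k/m)*(r - r0)       # (15.43) divided by m
    phidd = (-g*math.sin(phi) - 2.0*rd*phid) / r             # (15.42) divided by m r
    return (rd, rdd, phid, phidd)

def rk4_step(state, dt, m, k, r0, g):
    """One classical Runge-Kutta step for the four-component state."""
    def shifted(s, d, h):
        return tuple(si + h*di for si, di in zip(s, d))
    k1 = rhs(state, m, k, r0, g)
    k2 = rhs(shifted(state, k1, 0.5*dt), m, k, r0, g)
    k3 = rhs(shifted(state, k2, 0.5*dt), m, k, r0, g)
    k4 = rhs(shifted(state, k3, dt), m, k, r0, g)
    return tuple(s + dt*(a + 2*b + 2*c + d)/6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def energy(state, m, k, r0, g):
    """Total energy T + V with T from (15.38) and V from (15.39)."""
    r, rd, phi, phid = state
    return 0.5*m*(rd**2 + r**2*phid**2) - m*g*r*math.cos(phi) + 0.5*k*(r - r0)**2
```

A drift in `energy` over a run would point to a sign error in the equations of motion or to a step size that is too large.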
EXERCISE
15.15 Coupled Mass Points on a Circle
Problem. Four mass points of mass m move on a circle of radius R. Each mass point
is coupled to its two neighboring points by a spring with spring constant k (Fig. 15.22).
Find the Lagrangian of the system, and derive the equations of motion of the system.
Calculate the eigenfrequencies of the system, and discuss the related eigenvibrations.
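Solution. The kinetic energy of the system is given by
$$T = \frac{1}{2}m\sum_{\nu=1}^{4}\dot{s}_\nu^2. \tag{15.44}$$
For small displacements from the equilibrium position, the potential reads
$$V = \frac{1}{2}k\sum_{\nu=1}^{4}(s_{\nu+1} - s_\nu)^2, \qquad s_{4+1} = s_1. \tag{15.45}$$
We set $s_\nu = R\varphi_\nu$, and take the angles $\varphi_\nu$ as generalized coordinates. Then the Lagrangian
is
$$L = T - V = \frac{1}{2}mR^2\sum_{\nu=1}^{4}\dot{\varphi}_\nu^2 - \frac{1}{2}kR^2\sum_{\nu=1}^{4}(\varphi_{\nu+1} - \varphi_\nu)^2. \tag{15.46}$$
From the Lagrange equations
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{\varphi}_\nu} = \frac{\partial L}{\partial\varphi_\nu}, \tag{15.47}$$
we find the equations of motion:
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{\varphi}_\nu} = mR^2\ddot{\varphi}_\nu = -\frac{1}{2}kR^2\left[2(\varphi_\nu - \varphi_{\nu+1}) + 2(\varphi_\nu - \varphi_{\nu-1})\right] = \frac{\partial L}{\partial\varphi_\nu}. \tag{15.48}$$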
For the case of four mass points, we then obtain
$$\ddot{\varphi}_1 = \frac{k}{m}(\varphi_2 - 2\varphi_1 + \varphi_4),$$
$$\ddot{\varphi}_2 = \frac{k}{m}(\varphi_3 - 2\varphi_2 + \varphi_1),$$
$$\ddot{\varphi}_3 = \frac{k}{m}(\varphi_4 - 2\varphi_3 + \varphi_2),$$
$$\ddot{\varphi}_4 = \frac{k}{m}(\varphi_1 - 2\varphi_4 + \varphi_3). \tag{15.49}$$
With the ansatz $\varphi_\nu = A_\nu\cos\omega t$, $\ddot{\varphi}_\nu = -A_\nu\omega^2\cos\omega t$, we are led to the following linear
system of equations:
$$\begin{pmatrix} 2\frac{k}{m} - \omega^2 & -\frac{k}{m} & 0 & -\frac{k}{m} \\ -\frac{k}{m} & 2\frac{k}{m} - \omega^2 & -\frac{k}{m} & 0 \\ 0 & -\frac{k}{m} & 2\frac{k}{m} - \omega^2 & -\frac{k}{m} \\ -\frac{k}{m} & 0 & -\frac{k}{m} & 2\frac{k}{m} - \omega^2 \end{pmatrix}\begin{pmatrix} A_1 \\ A_2 \\ A_3 \\ A_4 \end{pmatrix} = 0. \tag{15.50}$$
For the nontrivial solutions, the determinant of the coefficient matrix must vanish. This
condition leads to the determining equation for the eigenfrequencies:
$$\left(2\frac{k}{m} - \omega^2\right)^2\left(4\frac{k}{m} - \omega^2\right)(-\omega^2) = 0. \tag{15.51}$$
The frequencies are
$$\omega_1^2 = 0, \qquad \omega_2^2 = 4\frac{k}{m}, \qquad \omega_3^2 = \omega_4^2 = 2\frac{k}{m}. \tag{15.52}$$
To calculate the related eigenvibrations, we insert these frequencies into the system of
equations (15.50).
(1) $\omega_1^2 = 0$: $A_1 = A_2 = A_3 = A_4$. The system does not vibrate but performs a uniform
rotation (Fig. 15.23(a)).
(2) $\omega_2^2 = 4k/m$: $A_1 = A_3 = -A_2 = -A_4$. Two neighboring mass points perform an
out-of-phase vibration (Fig. 15.23(b)).
Fig. 15.23. Uniform rotation and out-of-phase vibration
(3) $\omega_3^2 = \omega_4^2 = 2k/m$: $A_1 = A_2 = -A_3 = -A_4$ or $A_1 = A_4 = -A_2 = -A_3$. Two
neighboring mass points vibrate in phase (Fig. 15.24(a,b)).
Fig. 15.24. In-phase vibration
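The eigenfrequencies and amplitude patterns can be verified by inserting them into the linear system (15.50). The sketch below builds the coefficient matrix for ω = 0 (so that $\ddot{\varphi} = -D\varphi$) and checks that $(D - \omega^2)A$ vanishes for each mode; the function names and the value of k/m are illustrative assumptions.

```python
def ring_matrix(k_over_m, n=4):
    """Matrix D with phi_ddot = -D phi for n equal masses coupled to their neighbours on a ring."""
    D = [[0.0]*n for _ in range(n)]
    for i in range(n):
        D[i][i] += 2.0*k_over_m
        D[i][(i + 1) % n] -= k_over_m     # coupling to the next mass
        D[i][(i - 1) % n] -= k_over_m     # coupling to the previous mass
    return D

def residual(D, omega2, A):
    """Largest component of (D - omega^2 * 1) A; zero if A is an eigenvector for omega^2."""
    n = len(A)
    return max(abs(sum(D[i][j]*A[j] for j in range(n)) - omega2*A[i]) for i in range(n))
```

The two patterns for $\omega^2 = 2k/m$ are linearly independent, which reflects the twofold degeneracy in (15.52).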
EXERCISE
15.16 Lagrangian of the Asymmetric Top
Problem. Write down the Lagrangian of the heavy asymmetric top. Use the Euler
angles as generalized coordinates and determine the related generalized momenta.
Which coordinate is cyclic? Which further cyclic coordinate appears for a symmetric
top?
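Solution. In the system of principal axes, the kinetic energy of motion is given by
$$T = \frac{1}{2}\Theta_1\omega_1^2 + \frac{1}{2}\Theta_2\omega_2^2 + \frac{1}{2}\Theta_3\omega_3^2.$$
The potential energy is
$$V = Mgh = Mgl\cos\beta.$$
We take the Euler angles (α, β, γ) as generalized coordinates. The angular velocities
expressed by these coordinates read (see (13.43)),
$$\omega_1 = \dot{\alpha}\sin\beta\sin\gamma + \dot{\beta}\cos\gamma,$$
$$\omega_2 = \dot{\alpha}\sin\beta\cos\gamma - \dot{\beta}\sin\gamma,$$
$$\omega_3 = \dot{\alpha}\cos\beta + \dot{\gamma}.$$
By inserting this into the Lagrangian L = T − V, we get
$$L = \frac{1}{2}\Theta_1\left(\dot{\alpha}^2\sin^2\beta\sin^2\gamma + \dot{\beta}^2\cos^2\gamma + 2\dot{\alpha}\dot{\beta}\sin\beta\sin\gamma\cos\gamma\right)$$
$$\quad + \frac{1}{2}\Theta_2\left(\dot{\alpha}^2\sin^2\beta\cos^2\gamma + \dot{\beta}^2\sin^2\gamma - 2\dot{\alpha}\dot{\beta}\sin\beta\sin\gamma\cos\gamma\right)$$
$$\quad + \frac{1}{2}\Theta_3(\dot{\alpha}\cos\beta + \dot{\gamma})^2 - Mgl\cos\beta.$$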
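The Euler angles as generalized coordinates obey the Euler–Lagrange equations
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{\alpha}} = \frac{\partial L}{\partial\alpha}$$
and the analogous equations for β and γ. The Lagrangian does not depend on the
angle α; hence, this coordinate is cyclic, and the related generalized momentum is
conserved.
We determine the generalized momenta:
$$p_\alpha = \frac{\partial L}{\partial\dot{\alpha}} = \Theta_1\omega_1\frac{\partial\omega_1}{\partial\dot{\alpha}} + \Theta_2\omega_2\frac{\partial\omega_2}{\partial\dot{\alpha}} + \Theta_3\omega_3\frac{\partial\omega_3}{\partial\dot{\alpha}}$$
$$= \Theta_1(\dot{\alpha}\sin\beta\sin\gamma + \dot{\beta}\cos\gamma)\sin\beta\sin\gamma + \Theta_2(\dot{\alpha}\sin\beta\cos\gamma - \dot{\beta}\sin\gamma)\sin\beta\cos\gamma + \Theta_3(\dot{\alpha}\cos\beta + \dot{\gamma})\cos\beta$$
$$= \dot{\alpha}\sin^2\beta\,(\Theta_1\sin^2\gamma + \Theta_2\cos^2\gamma) + (\Theta_1 - \Theta_2)\dot{\beta}\cos\gamma\sin\beta\sin\gamma + \Theta_3(\dot{\alpha}\cos\beta + \dot{\gamma})\cos\beta.$$
In an analogous way, and because $\omega_3$ does not depend on $\dot{\beta}$, we obtain
$$p_\beta = \frac{\partial L}{\partial\dot{\beta}} = \Theta_1\dot{\beta}\cos^2\gamma + \Theta_2\dot{\beta}\sin^2\gamma + (\Theta_1 - \Theta_2)\dot{\alpha}\cos\gamma\sin\beta\sin\gamma,$$
$$p_\gamma = \frac{\partial L}{\partial\dot{\gamma}} = \Theta_3(\dot{\alpha}\cos\beta + \dot{\gamma}).$$
For the symmetric top, $\Theta_1 = \Theta_2$, and thus, the Lagrangian simplifies considerably:
$$L = \frac{1}{2}\Theta_1(\dot{\alpha}^2\sin^2\beta + \dot{\beta}^2) + \frac{1}{2}\Theta_3(\dot{\alpha}\cos\beta + \dot{\gamma})^2 - Mgl\cos\beta.$$
The Lagrangian of the symmetric top no longer depends on the angle γ; therefore, the
angle γ becomes cyclic too. Hence, the momentum $p_\gamma$ is also conserved.
The generalized momenta then read
$$p_\alpha = \dot{\alpha}\sin^2\beta\,\Theta_1 + \Theta_3(\dot{\alpha}\cos\beta + \dot{\gamma})\cos\beta = \text{constant},$$
$$p_\beta = \dot{\beta}\,\Theta_1,$$
$$p_\gamma = \Theta_3(\dot{\alpha}\cos\beta + \dot{\gamma}) = \text{constant}.$$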
The generalized momenta, being the projection of the total angular momentum onto
the rotational axis related to the particular Euler angle, have a direct physical meaning.
$p_\alpha$ is the projection of the total angular momentum onto the space-fixed z-axis (see
Exercise 13.12):
$$p_\alpha = \mathbf{L}\cdot\mathbf{e}_\alpha = \mathbf{L}\cdot\mathbf{e}_z.$$
This projection is a conserved quantity for the asymmetric and the symmetric top.
Since the gravitational force acts only along the z-direction, the angular momentum
about this axis remains unchanged.
$p_\beta$ is the projection of the total angular momentum onto the nodal line, i.e., the
axis about which the second Euler rotation is being performed:
pβ = L · eβ = L · ex .
This momentum is not conserved.
$p_\gamma$ can be interpreted as the projection of the total angular momentum onto the
body-fixed $\mathbf{e}_{z'}$-axis:
$$p_\gamma = \mathbf{L}\cdot\mathbf{e}_\gamma = \mathbf{L}\cdot\mathbf{e}_{z'}.$$
For a symmetric top, the body-fixed z′-axis is a symmetry axis, and the angular momentum
projection $\mathbf{L}\cdot\mathbf{e}_{z'}$ is conserved.
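The generalized momenta can be checked by differentiating the Lagrangian numerically with respect to the angular velocities. In the sketch below the closed form for $p_\beta$ contains no $\Theta_3$ term, since $\omega_3 = \dot{\alpha}\cos\beta + \dot{\gamma}$ does not depend on $\dot{\beta}$. The numerical values of the moments of inertia and of the product Mgl are assumptions for illustration, as are the function names.

```python
import math

TH1, TH2, TH3 = 1.0, 2.0, 3.0     # principal moments of inertia (assumed values)
MGL = 1.5 * 9.81 * 0.4            # product M g l (assumed value)

def lagrangian(q, qd):
    """L of the heavy asymmetric top in Euler angles q = (alpha, beta, gamma)."""
    al, be, ga = q
    ald, bed, gad = qd
    w1 = ald*math.sin(be)*math.sin(ga) + bed*math.cos(ga)
    w2 = ald*math.sin(be)*math.cos(ga) - bed*math.sin(ga)
    w3 = ald*math.cos(be) + gad
    return 0.5*(TH1*w1**2 + TH2*w2**2 + TH3*w3**2) - MGL*math.cos(be)

def momenta_closed(q, qd):
    """Closed forms of (p_alpha, p_beta, p_gamma)."""
    al, be, ga = q
    ald, bed, gad = qd
    sb, cb, sg, cg = math.sin(be), math.cos(be), math.sin(ga), math.cos(ga)
    p_ga = TH3*(ald*cb + gad)
    p_al = ald*sb**2*(TH1*sg**2 + TH2*cg**2) + (TH1 - TH2)*bed*cg*sb*sg + p_ga*cb
    p_be = bed*(TH1*cg**2 + TH2*sg**2) + (TH1 - TH2)*ald*cg*sb*sg
    return (p_al, p_be, p_ga)

def momenta_numeric(q, qd, eps=1e-4):
    """Central differences of L with respect to each angular velocity."""
    out = []
    for i in range(3):
        up, dn = list(qd), list(qd)
        up[i] += eps
        dn[i] -= eps
        out.append((lagrangian(q, up) - lagrangian(q, dn)) / (2.0*eps))
    return tuple(out)
```

Since L is quadratic in the velocities, the central differences are exact up to rounding, so any mismatch with the closed forms points to an algebra error rather than to discretization.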
16 Lagrange Equation for Nonholonomic Constraints
For systems with holonomic constraints, the dependent coordinates can be eliminated
by introducing generalized coordinates. If the constraints are nonholonomic, this approach
does not work. There is no general method for treating nonholonomic problems.
Only for those special nonholonomic constraints that can be given in differential
form can one eliminate the dependent equations by the method of Lagrange multipliers.
We therefore consider a system with constraints given in the form
$$\sum_{\nu=1}^{n} a_{l\nu}\,dq_\nu + a_{lt}\,dt = 0 \tag{16.1}$$
(ν = 1, 2, . . . , n = number of coordinates; n > s; l = 1, 2, . . . , s = number of constraints).
The following considerations do not depend on whether the equations (16.1) are
integrable or not; i.e., they hold both for holonomic as well as for nonholonomic constraints.
Therefore, the method of Lagrange multipliers derived below can be used also for
holonomic constraints, if it is inconvenient to reduce all $q_\nu$ to independent coordinates
or if one wants to keep the constraint reactions. Equation (16.1) is not the most general
type of a nonholonomic constraint; e.g., constraints in the form of inequalities are not
covered.
In our considerations, we start again (as in deriving the Lagrange equations) from
the d’Alembert principle. According to (15.13), it reads in generalized coordinates as
follows:
$$\sum_{\nu=1}^{n}\left[\frac{d}{dt}\frac{\partial T}{\partial\dot{q}_\nu} - \frac{\partial T}{\partial q_\nu} - Q_\nu\right]\delta q_\nu = 0. \tag{16.2}$$
This equation holds for constraints of any kind.

The $q_\nu$ shall now depend on each other. Therefore, the virtual displacements $\delta q_\nu$
cannot be freely chosen as earlier (compare (15.13)). To reduce the number of virtual
displacements to the number of independent displacements, we introduce the (for
the present freely chosen) Lagrange multipliers $\lambda_l$. In the general case, the Lagrange
multipliers $\lambda_l$ with l = 1, 2, . . . , s are functions of the time and of the $q_\nu$ and $\dot{q}_\nu$.
Virtual displacements $\delta q_\nu$ are performed at fixed time, i.e., with δt = 0. Then (16.1)
changes to
$$\sum_{\nu=1}^{n} a_{l\nu}\,\delta q_\nu = 0.$$
These are also called instantaneous (belonging to a fixed time) constraints. This in
turn leads to
$$\sum_{l=1}^{s}\lambda_l\sum_{\nu=1}^{n} a_{l\nu}\,\delta q_\nu = 0$$
or
$$\sum_{\nu=1}^{n}\sum_{l=1}^{s}\lambda_l a_{l\nu}\,\delta q_\nu = 0. \tag{16.3}$$
Equation (16.3) is now subtracted from (16.2):
$$\sum_{\nu=1}^{n}\left[\frac{d}{dt}\frac{\partial T}{\partial\dot{q}_\nu} - \frac{\partial T}{\partial q_\nu} - Q_\nu - \sum_{l=1}^{s}\lambda_l a_{l\nu}\right]\delta q_\nu = 0 \quad\text{for } \nu = 1, \ldots, s, \ldots, n. \tag{16.4}$$
These equations involve in total n of the variables $q_\nu$; s of them are dependent $q_\nu$
which are connected with the independent ones through the constraints, and n − s are
independent $q_\nu$. For the dependent $q_\nu$, the index ν shall run from ν = 1 to ν = s, for
the independent $q_\nu$ from ν = s + 1 to ν = n. Through the s Lagrange multipliers
$\lambda_l$ (l = 1, . . . , s), the coefficients of the $\delta q_\nu$ in (16.4) can be adjusted as freely as
the s constraint equations allow. Since the $\lambda_l$ can take any
value, we can choose them in such a way that
$$\sum_{l=1}^{s}\lambda_l a_{l\nu} = \frac{d}{dt}\frac{\partial T}{\partial\dot{q}_\nu} - \frac{\partial T}{\partial q_\nu} - Q_\nu \qquad (\nu = 1, \ldots, s),$$
i.e., the first s coefficients in (16.4) that correspond to the dependent $q_\nu$ are set to zero:
$$\frac{d}{dt}\frac{\partial T}{\partial\dot{q}_\nu} - \frac{\partial T}{\partial q_\nu} - Q_\nu - \sum_{l}\lambda_l a_{l\nu} = 0 \quad\text{for } \nu = 1, \ldots, s.$$
From (16.4) then remains
$$\sum_{\nu=s+1}^{n}\left[\frac{d}{dt}\frac{\partial T}{\partial\dot{q}_\nu} - \frac{\partial T}{\partial q_\nu} - Q_\nu - \sum_{l}\lambda_l a_{l\nu}\right]\delta q_\nu = 0.$$
These $\delta q_\nu$ (for ν = s + 1, . . . , n) are no longer subject to constraints. This means that
these $\delta q_\nu$ are independent of each other. One then must set the coefficients of the $\delta q_\nu$
(ν = s + 1, . . . , n) equal to zero, just as in the derivation of the Lagrange equation for
holonomic systems.
This leads, together with the s equations for the dependent $q_\nu$, to n equations in
total:
$$\frac{d}{dt}\frac{\partial T}{\partial\dot{q}_\nu} - \frac{\partial T}{\partial q_\nu} - Q_\nu - \sum_{l=1}^{s}\lambda_l a_{l\nu} = 0 \quad\text{for } \nu = 1, \ldots, s, s + 1, \ldots, n. \tag{16.5}$$
For conservative systems, the $Q_\nu$ can be derived from a potential:
$$Q_\nu = -\frac{\partial V}{\partial q_\nu}.$$
As in the derivation of the Lagrange equation for holonomic systems, we can reformulate
(16.5) with the Lagrangian L = T − V as follows:
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_\nu} - \frac{\partial L}{\partial q_\nu} - \sum_{l=1}^{s}\lambda_l a_{l\nu} = 0, \qquad \nu = 1, \ldots, n. \tag{16.6}$$
These n equations involve n + s unknown quantities, namely the n coordinates $q_\nu$
and the s Lagrange multipliers $\lambda_l$. The additionally needed equations are just the s
constraints (16.1) which couple the $q_\nu$; however, these are now to be considered as
differential equations:
$$\sum_{\nu} a_{l\nu}\dot{q}_\nu + a_{lt} = 0, \qquad l = 1, 2, \ldots, s.$$
Thus, we have in total n + s equations for n + s unknowns. We thereby obtain both
the $q_\nu$ we were looking for, and also the s quantities $\lambda_l$.
To understand the physical meaning of the $\lambda_l$, we assume that the constraints of the
system are removed, but are replaced by external forces $Q^*_\nu$ which act in such a way
that the motion of the system remains unchanged. The equations of motion would
then also remain the same. These additional forces must be equal to the constraint
reactions, since they act on the system in such a way that the constraint conditions are
being fulfilled. With regard to these forces $Q^*_\nu$ the equations of motion read
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_\nu} - \frac{\partial L}{\partial q_\nu} = Q^*_\nu, \tag{16.7}$$
where the $Q^*_\nu$ enter in addition to the $Q_\nu$. The equations (16.6) and (16.7) must be
identical. This leads to
$$Q^*_\nu = \sum_{l}\lambda_l a_{l\nu}; \tag{16.8}$$
i.e., the Lagrange multipliers $\lambda_l$ determine the generalized constraint reactions $Q^*_\nu$;
they will not be eliminated but are part of the solution of the problem (see also the
statements in Chap. 17 on this topic). The relation (16.3) thus changes to
$$\sum_{\nu} Q^*_\nu\,\delta q_\nu = 0, \tag{16.9}$$
implying that the total virtual work performed by all constraint reactions vanishes.
This can be considered as the general proof of the thesis introduced in (15.3), that
constraint reactions do not perform work.
EXAMPLE
16.1 Cylinder Rolls down an Inclined Plane
As an example of the method of Lagrange multipliers, we consider a solid cylinder
that rolls down without gliding on an inclined plane with height h and inclination
angle α. This rolling condition is a holonomic constraint, but this is immaterial for the
demonstration of the method.
Fig. 16.1. A cylinder rolls without gliding on an inclined plane
The two generalized coordinates are s, ϕ. The constraint reads
$$R\dot{\varphi} = \dot{s} \quad\text{or}\quad R\,d\varphi - ds = 0.$$
These equations can, of course, be integrated immediately, and with Rϕ = s +
constant, the constraint is holonomic. But we stick to the differential form of the constraint
and demonstrate the method of Lagrange multipliers. In this way we even find
the constraint reactions.
The coefficients occurring in the constraint are
$$a_s = -1, \qquad a_\varphi = R,$$
as is seen by comparison of coefficients with (16.1):
$$\sum_{\nu} a_{l\nu}\,\delta q_\nu = 0,$$
where l = 1 is the number of constraints, and δt = 0. The kinetic energy T can be
represented as the sum of the kinetic energy of the center-of-mass motion and of the
kinetic energy of the motion about the center of mass:
$$T = \frac{1}{2}m\dot{s}^2 + \frac{1}{2}\Theta\dot{\varphi}^2 = \frac{m}{2}\left(\dot{s}^2 + \frac{R^2}{2}\dot{\varphi}^2\right),$$
with the mass moment of inertia of the solid cylinder
$$\Theta_{\text{solcyl}} = \frac{1}{2}mR^2.$$
The potential energy V is
$$V = mgh - mgs\sin\alpha.$$
The Lagrangian reads
$$L = T - V = \frac{m}{2}\left(\dot{s}^2 + \frac{R^2}{2}\dot{\varphi}^2\right) - mg(h - s\sin\alpha).$$
One should note that this Lagrangian cannot be used directly to derive the equation of
motion according to (15.17). The reason is that the two coordinates s and ϕ are not
independent of each other. Thus, ϕ is not an ignorable coordinate, although it does not
explicitly appear in the Lagrangian.
Since there is only one constraint, only one Lagrange multiplier λ is needed. With
the coefficients
$$a_s = -1, \qquad a_\varphi = R,$$
we obtain for the Lagrange equations
$$m\ddot{s} - mg\sin\alpha + \lambda = 0, \tag{16.10}$$
$$\frac{m}{2}R^2\ddot{\varphi} - \lambda R = 0, \tag{16.11}$$
which together with the constraint
$$R\dot{\varphi} = \dot{s} \tag{16.12}$$
represent three equations for three unknown quantities ϕ, s, λ. Differentiation of
(16.12) with respect to the time yields
$$R\ddot{\varphi} = \ddot{s}.$$
From this, it follows, together with (16.11), that
$$m\ddot{s} = 2\lambda.$$
Hence, (16.10) changes to
$$mg\sin\alpha = 3\lambda.$$
From this equation, we obtain for the Lagrange multiplier
$$\lambda = \frac{1}{3}mg\sin\alpha.$$
The generalized constraint reactions are
$$a_s\lambda = -\frac{1}{3}mg\sin\alpha, \qquad a_\varphi\lambda = \frac{1}{3}Rmg\sin\alpha.$$
Here, $a_s\lambda$ is the constraint reaction caused by the friction; $a_\varphi\lambda$ is the torque generated
by this force, which causes the rolling of the cylinder. One should clearly understand
that the constraint “rolling” demands a particular constraint reaction (friction force).
We have evaluated it here. We further note that the gravity is reduced exactly by the
amount of the constraint reaction $a_s\lambda$.
Inserting the Lagrange multiplier λ into (16.10), we obtain the differential equation
for s:
$$\ddot{s} = \frac{2}{3}g\sin\alpha.$$
The differential equation for ϕ is obtained from this by inserting
$$R\ddot{\varphi} = \ddot{s}, \qquad \ddot{\varphi} = \frac{2g}{3R}\sin\alpha.$$
We have seen by this example that the method of Lagrange multipliers yields not only
the desired equations of motion but also the constraint reactions, which otherwise do
not appear in the Lagrangian.
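The elimination above can be cross-checked numerically. The following is a minimal sketch, not part of the original example: it treats (16.10), (16.11), and the differentiated constraint $R\ddot{\varphi} = \ddot{s}$ as a linear system for $\ddot{s}$, $\ddot{\varphi}$, $\lambda$ and solves it by Gauss-Jordan elimination. The function name `rolling_cylinder` and the numerical parameter values are assumptions chosen for illustration only.

```python
import math

def rolling_cylinder(m, R, g, alpha):
    """Solve (16.10), (16.11) and R*phi_dd = s_dd for (s_dd, phi_dd, lam)."""
    # Augmented matrix; unknowns are ordered as (s_dd, phi_dd, lam).
    a = [
        [m, 0.0, 1.0, m * g * math.sin(alpha)],  # m s_dd + lam = m g sin(alpha)
        [0.0, 0.5 * m * R**2, -R, 0.0],          # (m/2) R^2 phi_dd - lam R = 0
        [-1.0, R, 0.0, 0.0],                     # R phi_dd - s_dd = 0
    ]
    n = 3
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(n):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [x - factor * y for x, y in zip(a[row], a[col])]
    return tuple(a[i][n] / a[i][i] for i in range(n))

# Hypothetical parameters: 2 kg cylinder, radius 0.1 m, 30 degree incline.
s_dd, phi_dd, lam = rolling_cylinder(m=2.0, R=0.1, g=9.81, alpha=math.radians(30))
print(s_dd, phi_dd, lam)
```

For any choice of parameters, the returned values should agree with $\ddot{s} = \tfrac{2}{3}g\sin\alpha$ and $\lambda = \tfrac{1}{3}mg\sin\alpha$ up to rounding.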
EXERCISE
16.2 Particle Moves in a Paraboloid
Problem. A particle of mass m moves without friction under the action of gravitation
on the inner surface of a paraboloid, which is given by
$$x^2 + y^2 = az.$$
(a) Determine the Lagrangian and the equation of motion.

(b) Show that the particle moves on a horizontal circle in the plane z = h, provided
that it gets an initial angular velocity. Find this angular velocity.
(c) Show that the particle oscillates about the circular orbit if it is displaced only
weakly. Determine the oscillation frequency.
Solution. (a) The appropriate coordinates are the cylindrical coordinates r, ϕ, z. The
kinetic energy expressed in cylindrical coordinates reads
$$T = \frac{1}{2}m(\dot{r}^2 + r^2\dot{\varphi}^2 + \dot{z}^2).$$
Hence, the Lagrangian is
$$L = \frac{1}{2}m(\dot{r}^2 + r^2\dot{\varphi}^2 + \dot{z}^2) - mgz. \tag{16.13}$$
The constraint is $x^2 + y^2 = az$. Since $x^2 + y^2 = r^2$, we have $r^2 - az = 0$, or in differential form, $2r\,\delta r - a\,\delta z = 0$.
Adopting the notation $r = q_1$, $\varphi = q_2$, $z = q_3$, from
$$\sum_\alpha A_\alpha\,\delta q_\alpha = 0$$
we find that $A_1 = 2r$, $A_2 = 0$, $A_3 = -a$.

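The Lagrange equations read
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_\alpha} - \frac{\partial L}{\partial q_\alpha} = \lambda_1 A_\alpha, \qquad \alpha = 1, 2, 3;$$
i.e.,
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{r}} - \frac{\partial L}{\partial r} = 2\lambda_1 r,$$
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{\varphi}} - \frac{\partial L}{\partial\varphi} = 0,$$
and
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{z}} - \frac{\partial L}{\partial z} = -\lambda_1 a. \tag{16.14}$$
With (16.13), we obtain
$$m(\ddot{r} - r\dot{\varphi}^2) = 2\lambda_1 r, \qquad m\frac{d}{dt}(r^2\dot{\varphi}) = 0, \qquad m\ddot{z} = -mg - \lambda_1 a, \tag{16.15}$$
and the constraint $2r\dot{r} - a\dot{z} = 0$. From this system, we can determine $r$, $\varphi$, $z$, $\lambda_1$.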
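(b) The circle arising from the intersection of the plane $z = h$ with the paraboloid $r^2 = az$ has the radius
$$r_0 = \sqrt{ah}.$$
From $m\ddot{z} = -mg - \lambda_1 a$, it follows with $z = h$ (and hence $\ddot{z} = 0$) that
$$\lambda_1 = -\frac{mg}{a}.$$
From $m(\ddot{r} - r\dot{\varphi}^2) = 2\lambda_1 r$, it follows with $\dot{\varphi} = \omega$, $r = r_0$ (and hence $\ddot{r} = 0$) that
$$m(-r_0\omega^2) = 2\left(-\frac{mg}{a}\right)r_0 \qquad\text{or}\qquad \omega^2 = \frac{2g}{a};$$
i.e.,
$$\omega = \sqrt{\frac{2g}{a}}$$
is the desired initial angular velocity.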
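(c) From $m\,d(r^2\dot{\varphi})/dt = 0$, it follows that $r^2\dot{\varphi} = \text{constant} = A$. We suppose that the particle has the initial angular velocity $\omega$ on the circle $r = r_0$; i.e.,
$$A = r_0^2\omega = ah\omega, \qquad\text{and therefore}\qquad \dot{\varphi} = \frac{ah\omega}{r^2}.$$
Since the particle oscillates about $z = h$ with only small amplitude, we use $\lambda_1 = -mg/a$, which holds for $z = h$, and we obtain
$$m(\ddot{r} - r\dot{\varphi}^2) = -\frac{2mg}{a}r \quad\Rightarrow\quad \ddot{r} - \frac{a^2h^2\omega^2}{r^3} = -\frac{2gr}{a}.$$
Since the oscillation is small, we set $r = r_0 + u$; i.e.,
$$\ddot{u} - \frac{a^2h^2\omega^2}{(r_0 + u)^3} = -\frac{2g}{a}(r_0 + u). \tag{16.16}$$
We have
$$\frac{1}{(r_0 + u)^3} = \frac{1}{r_0^3}\frac{1}{(1 + u/r_0)^3} = \frac{1}{r_0^3}\left(1 + \frac{u}{r_0}\right)^{-3} \approx \left(1 - \frac{3u}{r_0}\right)\frac{1}{r_0^3},$$
since $u/r_0 \ll 1$ (power series expansion).
Thus, from (16.16), we obtain with $r_0 = \sqrt{ah}$, $\omega = \sqrt{2g/a}$ the differential equation
$$\ddot{u} + \frac{8g}{a}u = 0 \tag{16.17}$$
with the solution
$$u = \varepsilon_1\cos\sqrt{\frac{8g}{a}}\,t + \varepsilon_2\sin\sqrt{\frac{8g}{a}}\,t,$$
and thus,
$$r = r_0 + u = \sqrt{ah} + \varepsilon_1\cos\sqrt{\frac{8g}{a}}\,t + \varepsilon_2\sin\sqrt{\frac{8g}{a}}\,t;$$
i.e., $r$ oscillates with the angular frequency $\sqrt{8g/a}$ about the equilibrium value $r_0 = \sqrt{ah}$. The oscillation period is
$$T_0 = \pi\sqrt{\frac{a}{2g}},$$
while the orbital period is
$$T_u = 2\pi\sqrt{\frac{a}{2g}} = 2T_0.$$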
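The linearization can be checked against a direct numerical integration of the radial equation used in part (c), $\ddot{r} = A^2/r^3 - 2gr/a$ with $A = ah\omega$. The sketch below is an illustration under assumptions: the function name `radial_period`, the fourth-order Runge-Kutta scheme, and the parameter values are ours and do not come from the text. It starts at the outer turning point and measures the time needed to reach the inner turning point, which is half an oscillation period.

```python
import math

def radial_period(a, g, h, eps=1e-3, dt=1e-4):
    """Integrate r_dd = A^2/r^3 - 2*g*r/a with RK4 and return the radial period."""
    r0 = math.sqrt(a * h)
    omega = math.sqrt(2.0 * g / a)
    A = a * h * omega  # conserved quantity r^2 * phi_dot

    def acc(r):
        return A**2 / r**3 - 2.0 * g * r / a

    # Start slightly outside the circular orbit, at rest in the radial direction.
    r, v, t = r0 + eps, 0.0, 0.0
    while True:
        k1r, k1v = v, acc(r)
        k2r, k2v = v + 0.5 * dt * k1v, acc(r + 0.5 * dt * k1r)
        k3r, k3v = v + 0.5 * dt * k2v, acc(r + 0.5 * dt * k2r)
        k4r, k4v = v + dt * k3v, acc(r + dt * k3r)
        r_new = r + dt * (k1r + 2 * k2r + 2 * k3r + k4r) / 6.0
        v_new = v + dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        if v < 0.0 <= v_new:
            # Radial velocity changes sign: inner turning point (half a period).
            t_half = t + dt * (-v) / (v_new - v)
            return 2.0 * t_half
        r, v, t = r_new, v_new, t + dt

# Hypothetical paraboloid parameter a = 1 m and orbit height h = 0.5 m.
T_numeric = radial_period(a=1.0, g=9.81, h=0.5)
T_formula = math.pi * math.sqrt(1.0 / (2.0 * 9.81))
print(T_numeric, T_formula)
```

For small initial displacements `eps`, the measured period should approach $T_0 = \pi\sqrt{a/2g}$; larger displacements probe the anharmonic corrections that the expansion neglects.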
EXERCISE
16.3 Three Masses Coupled by Rods Glide in a Circular Tire
Problem. Three mass points m1 , m2 , m3 are fixed to the ends of two massless rods
and glide without friction in a circular tire of radius R, which stands vertically in the
gravitational field of the earth. Find the equations of motion by means of Lagrange
multipliers, and determine the equilibrium position. Find the frequency of small oscil-
lations about the equilibrium position.
Solution. We use the angles ϕ1 , ϕ2 , and ϕ3 as generalized coordinates. The angles
are not independent of each other, but are coupled by the rigid rods connecting the
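mass points, via the constraints
$$\varphi_3 - \varphi_2 = \alpha = \text{constant}, \qquad \varphi_2 - \varphi_1 = \beta = \text{constant}. \tag{16.18}$$
In differential form, $\sum_\nu a_{l\nu}\,\delta q_\nu = 0$, they read
$$\delta\varphi_3 - \delta\varphi_2 = 0, \qquad \delta\varphi_2 - \delta\varphi_1 = 0. \tag{16.19}$$
Fig. 16.2.
The Lagrangian of the system can be immediately given in these coordinates:
$$L = \sum_\nu (T_\nu - V_\nu) = \sum_\nu\left[\frac{1}{2}m_\nu R^2\dot{\varphi}_\nu^2 - m_\nu gR(1 - \cos\varphi_\nu)\right]. \tag{16.20}$$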
The Euler–Lagrange equations, generalized to nonintegrable constraints, i.e., constraints that are given only in the form (16.19), can be formulated by means of the Lagrange multipliers $\lambda_l$ (16.6):
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{\varphi}_\nu} - \frac{\partial L}{\partial\varphi_\nu} = \sum_{l=1}^{s}\lambda_l a_{l\nu}. \tag{16.21}$$
The number of constraints in the case considered here is $s = 2$. We thus obtain the three equations of motion:
$$m_\nu R^2\ddot{\varphi}_\nu + m_\nu gR\sin\varphi_\nu = \lambda_1 a_{1\nu} + \lambda_2 a_{2\nu}, \qquad \nu = 1, 2, 3. \tag{16.22}$$
From (16.19), we obtain by comparison the coefficients $a_{l\nu}$:
$$\begin{aligned} a_{11} &= 0, & a_{12} &= -1, & a_{13} &= 1,\\ a_{21} &= -1, & a_{22} &= 1, & a_{23} &= 0. \end{aligned} \tag{16.23}$$
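Then (16.18) implies that
$$\ddot{\varphi}_3 = \ddot{\varphi}_2 = \ddot{\varphi}_1. \tag{16.24}$$
By inserting this into (16.22), we get
$$\frac{g}{R}\sin\varphi_1 + \frac{\lambda_2}{m_1R^2} = \frac{g}{R}\sin\varphi_2 + \frac{\lambda_1}{m_2R^2} - \frac{\lambda_2}{m_2R^2} = \frac{g}{R}\sin\varphi_3 - \frac{\lambda_1}{m_3R^2}. \tag{16.25}$$
Solving for $\lambda_1$ and $\lambda_2$ leads to
$$\begin{aligned} \frac{\lambda_1}{R^2} &= m_3\frac{\omega_0^2}{M}\left[m_1(\sin\varphi_3 - \sin\varphi_1) + m_2(\sin\varphi_3 - \sin\varphi_2)\right],\\ \frac{\lambda_2}{R^2} &= m_1\frac{\omega_0^2}{M}\left[m_2(\sin\varphi_2 - \sin\varphi_1) + m_3(\sin\varphi_3 - \sin\varphi_1)\right]. \end{aligned} \tag{16.26}$$
Here, we have set $M = m_1 + m_2 + m_3$ and $\omega_0^2 = g/R$. The angles $\varphi_3$ and $\varphi_2$ can be expressed by $\varphi_1$ via the constraint (16.18), so that one differential equation in the variable $\varphi_1$ describes the entire system. Hence, from (16.22) and (16.26) we obtain
$$\begin{aligned} \ddot{\varphi}_1 &= -\frac{\omega_0^2}{M}\left[m_1\sin\varphi_1 + m_2\sin\varphi_2 + m_3\sin\varphi_3\right]\\ &= -\frac{\omega_0^2}{M}\left[m_1\sin\varphi_1 + m_2\sin(\varphi_1 + \beta) + m_3\sin(\varphi_1 + \alpha + \beta)\right]. \end{aligned} \tag{16.27}$$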
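The equilibrium position is at the point of vanishing acceleration $\ddot{\varphi}_1 = 0$:
$$m_1\sin\varphi_1 + m_2(\sin\varphi_1\cos\beta + \cos\varphi_1\sin\beta) + m_3\left[\sin\varphi_1\cos(\alpha + \beta) + \cos\varphi_1\sin(\alpha + \beta)\right] = 0. \tag{16.28}$$
Solving for $\varphi_1$ yields
$$\varphi_1\big|_{\ddot{\varphi}_1 = 0} = \varphi_1^0 = \arctan\left(-\frac{m_2\sin\beta + m_3\sin(\alpha + \beta)}{m_1 + m_2\cos\beta + m_3\cos(\alpha + \beta)}\right). \tag{16.29}$$
We now consider small vibrations $\vartheta$ about the equilibrium position determined by (16.29):
$$\varphi_1 = \varphi_1^0 + \vartheta \qquad\text{with}\qquad |\vartheta| \ll 1. \tag{16.30}$$
By means of the expansion $\sin(\Phi + \vartheta) \approx \sin\Phi + \vartheta\cos\Phi$ and $\ddot{\varphi}_1 = \ddot{\vartheta}$, we obtain from (16.27) the desired frequency; the terms without $\vartheta$ cancel because of (16.28):
$$\ddot{\vartheta} = -\frac{\omega_0^2}{M}\left[m_1\cos\varphi_1^0 + m_2\cos(\varphi_1^0 + \beta) + m_3\cos(\varphi_1^0 + \alpha + \beta)\right]\vartheta, \qquad \ddot{\vartheta} \equiv -\Omega^2\vartheta. \tag{16.31}$$
For small amplitudes, this differential equation describes the vibrations of a physical pendulum. For $\alpha = \beta = 0$, and, hence, $\varphi_1^0 = 0$, it turns into the equation of the mathematical pendulum.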
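As a numerical illustration, the equilibrium angle (16.29) and the squared frequency of small vibrations can be evaluated for given masses and rod angles. The sketch below rests on assumptions: the function names and the parameter values are hypothetical, and the angles are taken as $\varphi_2 = \varphi_1 + \beta$, $\varphi_3 = \varphi_1 + \alpha + \beta$ in accordance with (16.18) and (16.27). It also assumes that the denominator in (16.29) is positive, so that the arctangent branch gives the stable equilibrium.

```python
import math

def angular_acceleration(phi1, m1, m2, m3, alpha, beta, g, R):
    """Right-hand side of (16.27)."""
    M = m1 + m2 + m3
    return -(g / R) / M * (m1 * math.sin(phi1)
                           + m2 * math.sin(phi1 + beta)
                           + m3 * math.sin(phi1 + alpha + beta))

def equilibrium_and_frequency(m1, m2, m3, alpha, beta, g, R):
    """Return (phi10, Omega^2): equilibrium angle (16.29) and squared frequency."""
    M = m1 + m2 + m3
    num = m2 * math.sin(beta) + m3 * math.sin(alpha + beta)
    den = m1 + m2 * math.cos(beta) + m3 * math.cos(alpha + beta)  # assumed > 0
    phi10 = math.atan(-num / den)
    # Linearizing (16.27) about phi10 gives theta_dd = -Omega^2 * theta.
    omega_sq = (g / R) / M * (m1 * math.cos(phi10)
                              + m2 * math.cos(phi10 + beta)
                              + m3 * math.cos(phi10 + alpha + beta))
    return phi10, omega_sq

# Hypothetical masses (kg), rod angles (rad), and tire radius (m).
phi10, omega_sq = equilibrium_and_frequency(1.0, 2.0, 3.0, 0.4, 0.3, 9.81, 0.5)
print(phi10, omega_sq)
print(angular_acceleration(phi10, 1.0, 2.0, 3.0, 0.4, 0.3, 9.81, 0.5))
```

The last printed value should vanish up to rounding, which confirms that (16.29) is a root of (16.28). For $\alpha = \beta = 0$ the function returns $\varphi_1^0 = 0$ and $\Omega^2 = g/R$, the mathematical pendulum.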
17 Special Problems
17.1 Velocity-Dependent Potentials
So far, we defined conservative forces F by the condition that they can be derived from
a potential V (r) by forming gradients, i.e.,
F(r, t) = −∇V (r, t). (17.1)
The potential V (r, t) is a function of the position and in general also of the time. This
is possible as long as the forces do not depend on velocities or accelerations. There are,
however, such cases: for instance, the Lorentz force which acts on a charged particle
in the electromagnetic field is velocity-dependent:
$$\mathbf{F}^{(e)} = e\left(\mathbf{E} + \frac{\mathbf{v}}{c}\times\mathbf{B}\right). \tag{17.2}$$
Here, e is the charge of the particle, and E and B are the electric and magnetic field
strength, respectively. F (e) indicates that this shall be an external force.
If external forces depend on the velocity or the acceleration, we shall call them
conservative as well if they can be expressed by a potential V that depends on the
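generalized coordinates $q_j$, the generalized velocities $\dot{q}_j$, and the time $t$, according to
$$Q_j = -\frac{\partial V}{\partial q_j} + \frac{d}{dt}\frac{\partial V}{\partial\dot{q}_j} \tag{17.3}$$
with $V = V(q_j, \dot{q}_j, t)$.

In some cases, such a representation is also possible for the ordinary coordinates $\mathbf{r}_i$ and the velocities $\mathbf{v}_i$. The relation for $V = V(\mathbf{r}_i, \mathbf{v}_i, t)$ analogous to (17.3) then reads
$$\mathbf{F}_i = -\nabla_i V + \frac{d}{dt}\nabla_{v_i}V = -\frac{\partial V}{\partial\mathbf{r}_i} + \frac{d}{dt}\frac{\partial V}{\partial\mathbf{v}_i}.$$
Here,
$$\nabla_{v_i} = \left(\frac{\partial}{\partial v_{ix}}, \frac{\partial}{\partial v_{iy}}, \frac{\partial}{\partial v_{iz}}\right)$$
means the gradient vector with respect to the components of the velocity of the $i$th particle.
The velocity-dependent potential
$$V = V(q_j, \dot{q}_j, t) \tag{17.4}$$
is sometimes called the generalized potential. We know from (15.14) that the kinetic energy $T$ and the generalized forces $Q_j$ are related by
$$\frac{d}{dt}\frac{\partial T}{\partial\dot{q}_j} - \frac{\partial T}{\partial q_j} = Q_j. \tag{17.5}$$
Now using (17.3), we obtain
$$\frac{d}{dt}\frac{\partial T}{\partial\dot{q}_j} - \frac{\partial T}{\partial q_j} = -\frac{\partial V}{\partial q_j} + \frac{d}{dt}\frac{\partial V}{\partial\dot{q}_j}$$
or
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_j} - \frac{\partial L}{\partial q_j} = 0,$$
if we define the generalized Lagrangian $L$ by
$$L = T - V$$
with the generalized potential $V(q_j, \dot{q}_j, t)$.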

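Sometimes, it is desirable to use another set of coordinates $\tilde{q}_j$ instead of the set of generalized coordinates $q_j$. We now show that this potential also represents a generalized potential in the new coordinates $\tilde{q}_j$ and the related velocities $\dot{\tilde{q}}_j$. This property is therefore independent of the particular coordinates selected.
As in (14.7), the generalized forces $\tilde{Q}_j$ belonging to $\tilde{q}_j$ and the forces $Q_j$ from (17.3) are related by
$$\tilde{Q}_j = \sum_{\nu=1}^{3N}\frac{\partial q_\nu}{\partial\tilde{q}_j}Q_\nu. \tag{17.6}$$
We have to show that
$$\tilde{Q}_j = -\frac{\partial V}{\partial\tilde{q}_j} + \frac{d}{dt}\frac{\partial V}{\partial\dot{\tilde{q}}_j}. \tag{17.7}$$
For the proof, we need the relation
$$\frac{\partial\dot{q}_k}{\partial\dot{\tilde{q}}_j} = \frac{\partial q_k}{\partial\tilde{q}_j}, \tag{17.8}$$
which immediately follows from
$$\dot{q}_k = \frac{d}{dt}(q_k) = \sum_{j=1}^{3N}\frac{\partial q_k}{\partial\tilde{q}_j}\dot{\tilde{q}}_j + \frac{\partial q_k}{\partial t}.$$
Then
$$\begin{aligned} \frac{\partial V}{\partial\tilde{q}_j} &= \sum_{\nu=1}^{3N}\frac{\partial V}{\partial q_\nu}\frac{\partial q_\nu}{\partial\tilde{q}_j} + \sum_{\nu=1}^{3N}\frac{\partial V}{\partial\dot{q}_\nu}\frac{\partial\dot{q}_\nu}{\partial\tilde{q}_j} + \underbrace{\frac{\partial V}{\partial t}\frac{\partial t}{\partial\tilde{q}_j}}_{=0}\\ &= \sum_{\nu=1}^{3N}\frac{\partial V}{\partial q_\nu}\frac{\partial q_\nu}{\partial\tilde{q}_j} + \sum_{\nu=1}^{3N}\frac{\partial V}{\partial\dot{q}_\nu}\frac{\partial}{\partial\tilde{q}_j}\left(\sum_{\alpha=1}^{3N}\frac{\partial q_\nu}{\partial\tilde{q}_\alpha}\dot{\tilde{q}}_\alpha + \frac{\partial q_\nu}{\partial t}\right). \end{aligned} \tag{17.9}$$
Because
$$Q_\nu = -\frac{\partial V}{\partial q_\nu} + \frac{d}{dt}\frac{\partial V}{\partial\dot{q}_\nu},$$
we write the generalized force $\tilde{Q}_j$ (17.6) as follows:
$$\begin{aligned} \tilde{Q}_j &= -\sum_{\nu=1}^{3N}\frac{\partial V}{\partial q_\nu}\frac{\partial q_\nu}{\partial\tilde{q}_j} + \sum_{\nu=1}^{3N}\left(\frac{d}{dt}\frac{\partial V}{\partial\dot{q}_\nu}\right)\frac{\partial q_\nu}{\partial\tilde{q}_j}\\ &= -\sum_{\nu=1}^{3N}\frac{\partial V}{\partial q_\nu}\frac{\partial q_\nu}{\partial\tilde{q}_j} + \sum_{\nu=1}^{3N}\frac{d}{dt}\left(\frac{\partial V}{\partial\dot{q}_\nu}\cdot\frac{\partial q_\nu}{\partial\tilde{q}_j}\right) - \sum_{\nu=1}^{3N}\frac{\partial V}{\partial\dot{q}_\nu}\frac{d}{dt}\frac{\partial q_\nu}{\partial\tilde{q}_j}. \end{aligned}$$
Eliminating the first sum by means of (17.9), we get
$$\tilde{Q}_j = -\frac{\partial V}{\partial\tilde{q}_j} + \sum_{\nu=1}^{3N}\frac{\partial V}{\partial\dot{q}_\nu}\frac{\partial}{\partial\tilde{q}_j}\left(\sum_{\alpha=1}^{3N}\frac{\partial q_\nu}{\partial\tilde{q}_\alpha}\dot{\tilde{q}}_\alpha + \frac{\partial q_\nu}{\partial t}\right) + \sum_{\nu=1}^{3N}\frac{d}{dt}\left(\frac{\partial V}{\partial\dot{q}_\nu}\frac{\partial q_\nu}{\partial\tilde{q}_j}\right) - \sum_{\nu=1}^{3N}\frac{\partial V}{\partial\dot{q}_\nu}\frac{d}{dt}\frac{\partial q_\nu}{\partial\tilde{q}_j}.$$
The third term yields with (17.8)
$$\sum_{\nu=1}^{3N}\frac{d}{dt}\left(\frac{\partial V}{\partial\dot{q}_\nu}\cdot\frac{\partial\dot{q}_\nu}{\partial\dot{\tilde{q}}_j}\right) = \frac{d}{dt}\frac{\partial V}{\partial\dot{\tilde{q}}_j}.$$
Thus, $\tilde{Q}_j$ becomes
$$\begin{aligned} \tilde{Q}_j = {} & -\frac{\partial V}{\partial\tilde{q}_j} + \frac{d}{dt}\frac{\partial V}{\partial\dot{\tilde{q}}_j}\\ & + \sum_{\nu=1}^{3N}\frac{\partial V}{\partial\dot{q}_\nu}\frac{\partial}{\partial\tilde{q}_j}\left(\sum_{\alpha=1}^{3N}\frac{\partial q_\nu}{\partial\tilde{q}_\alpha}\dot{\tilde{q}}_\alpha + \frac{\partial q_\nu}{\partial t}\right)\\ & - \sum_{\nu=1}^{3N}\frac{\partial V}{\partial\dot{q}_\nu}\left(\sum_{\alpha=1}^{3N}\frac{\partial}{\partial\tilde{q}_\alpha}\frac{\partial q_\nu}{\partial\tilde{q}_j}\dot{\tilde{q}}_\alpha + \frac{\partial}{\partial t}\frac{\partial q_\nu}{\partial\tilde{q}_j}\right). \end{aligned}$$
Since $\dot{\tilde{q}}_\alpha$ does not depend on $\tilde{q}_j$, the last two terms cancel each other, and we find that (17.7) is valid. Thus, it has been shown that
$$V(q_j, \dot{q}_j, t) = V\big(q_j(\tilde{q}_j, t), \dot{q}_j(\tilde{q}_j, \dot{\tilde{q}}_j, t), t\big) \equiv \tilde{V}(\tilde{q}_j, \dot{\tilde{q}}_j, t)$$
also represents a generalized potential in the new coordinates $\tilde{q}_j$.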
EXAMPLE
17.1 Charged Particle in an Electromagnetic Field
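In the lectures on electrodynamics¹ we shall show that the electric field strength $\mathbf{E}$ and the magnetic field strength $\mathbf{B}$ can be derived from the scalar potential $\phi(\mathbf{r}, t)$ and the vector potential $\mathbf{A}(\mathbf{r}, t)$, namely,
$$\mathbf{E} = -\nabla\phi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}, \qquad \mathbf{B} = \nabla\times\mathbf{A}. \tag{17.10}$$
In other words, the electromagnetic phenomena can be described by $\phi$, $\mathbf{A}$ instead of $\mathbf{E}$, $\mathbf{B}$. Now we show that within the Lagrangian formalism the Lorentz force (17.2) can be described by the velocity-dependent potential
$$V = e\phi - \frac{e}{c}\mathbf{A}\cdot\mathbf{v}. \tag{17.11}$$
The Lagrangian then reads
$$L = T - V = \frac{1}{2}m\mathbf{v}^2 - e\phi + \frac{e}{c}\mathbf{A}\cdot\mathbf{v}. \tag{17.12}$$
We restrict ourselves to the Lagrange equation for the $x$-component:
$$\frac{d}{dt}\frac{\partial L}{\partial v_x} - \frac{\partial L}{\partial x} = 0. \tag{17.13}$$
The other components follow likewise. We calculate
$$\frac{\partial L}{\partial x} = -e\frac{\partial\phi}{\partial x} + \frac{e}{c}\frac{\partial\mathbf{A}}{\partial x}\cdot\mathbf{v}, \qquad \frac{\partial L}{\partial v_x} = mv_x + \frac{e}{c}A_x,$$
and furthermore according to (17.13),
$$\frac{d}{dt}mv_x = -e\frac{\partial\phi}{\partial x} + \frac{e}{c}\frac{\partial\mathbf{A}}{\partial x}\cdot\mathbf{v} - \frac{e}{c}\frac{dA_x}{dt}. \tag{17.14}$$
For the last term, we obtain
$$\begin{aligned} \frac{dA_x}{dt} &= \frac{\partial A_x}{\partial t} + \frac{\partial A_x}{\partial x}\frac{dx}{dt} + \frac{\partial A_x}{\partial y}\frac{dy}{dt} + \frac{\partial A_x}{\partial z}\frac{dz}{dt}\\ &= \frac{\partial A_x}{\partial t} + \frac{\partial A_x}{\partial x}v_x + \frac{\partial A_x}{\partial y}v_y + \frac{\partial A_x}{\partial z}v_z, \end{aligned} \tag{17.15}$$
and for the intermediate term,
$$\frac{\partial\mathbf{A}}{\partial x}\cdot\mathbf{v} = \frac{\partial A_x}{\partial x}v_x + \frac{\partial A_y}{\partial x}v_y + \frac{\partial A_z}{\partial x}v_z. \tag{17.16}$$
Equations (17.15) and (17.16) are now inserted into (17.14) and yield
$$\begin{aligned} \frac{d(mv_x)}{dt} &= e\left(-\frac{\partial\phi}{\partial x} - \frac{1}{c}\frac{\partial A_x}{\partial t}\right) + \frac{e}{c}\left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right)v_y - \frac{e}{c}\left(\frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}\right)v_z\\ &= eE_x + \frac{e}{c}(B_zv_y - B_yv_z)\\ &= e\left(\mathbf{E} + \frac{1}{c}\mathbf{v}\times\mathbf{B}\right)_x. \end{aligned}$$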
1 See W. Greiner: Classical Electrodynamics, 1st ed., Springer, Berlin (1998), Chapter 23.
Corresponding expressions are obtained for the $y$- and $z$-components, so that we get in total
$$\frac{d}{dt}(m\mathbf{v}) = e\left(\mathbf{E} + \frac{1}{c}\mathbf{v}\times\mathbf{B}\right), \tag{17.17}$$
i.e., Newton’s equation of motion with the Lorentz force.
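The equivalence can also be checked numerically for a simple field configuration. The following sketch is an illustration under assumptions: it uses static, uniform fields $\mathbf{E} = (E_0, 0, 0)$ and $\mathbf{B} = (0, 0, B_0)$, represented by the hypothetical choices $\phi = -E_0x$ and $\mathbf{A} = \tfrac{1}{2}B_0(-y, x, 0)$, and all numerical values and function names are ours. The force is computed from the generalized potential (17.11) by central differences and compared with the Lorentz force (17.2).

```python
E0, B0 = 2.0, 3.0   # hypothetical uniform field strengths
e, c = 1.5, 10.0    # hypothetical charge and speed of light (arbitrary units)

def phi(r):
    return -E0 * r[0]                       # gives E = -grad(phi) = (E0, 0, 0)

def A(r):
    return (-0.5 * B0 * r[1], 0.5 * B0 * r[0], 0.0)  # gives B = curl A = (0, 0, B0)

def V(r, v):
    """Generalized potential (17.11): V = e*phi - (e/c) A.v"""
    return e * phi(r) - (e / c) * sum(ai * vi for ai, vi in zip(A(r), v))

def shifted(r, k, d):
    s = list(r)
    s[k] += d
    return tuple(s)

def force_from_potential(r, v, d=1e-5):
    """F_i = -dV/dr_i + d/dt (dV/dv_i), evaluated for a static vector potential."""
    force = []
    for i in range(3):
        dV_dri = (V(shifted(r, i, d), v) - V(shifted(r, i, -d), v)) / (2 * d)
        # dV/dv_i = -(e/c) A_i, and for static A: dA_i/dt = sum_k v_k dA_i/dr_k.
        dAi_dt = sum(v[k] * (A(shifted(r, k, d))[i] - A(shifted(r, k, -d))[i]) / (2 * d)
                     for k in range(3))
        force.append(-dV_dri - (e / c) * dAi_dt)
    return tuple(force)

def lorentz_force(v):
    E = (E0, 0.0, 0.0)
    B = (0.0, 0.0, B0)
    cross = (v[1] * B[2] - v[2] * B[1],
             v[2] * B[0] - v[0] * B[2],
             v[0] * B[1] - v[1] * B[0])
    return tuple(e * (E[i] + cross[i] / c) for i in range(3))

r, v = (0.3, -0.7, 1.1), (1.2, 0.4, -0.9)
print(force_from_potential(r, v))
print(lorentz_force(v))
```

Both printed vectors should agree up to the rounding error of the finite differences. Time-dependent potentials would require the additional term $\partial\mathbf{A}/\partial t$ in `dAi_dt`, which this sketch omits.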
17.2 Nonconservative Forces and Dissipation Function (Friction Function)
So far, the discussion was restricted to conservative forces only. We now consider
systems with conservative and nonconservative forces. Such systems are first of all
systems with friction. They play an important role in classical physics and recently
also in heavy-ion physics. If two atomic nuclei collide, many internal degrees of free-
dom are excited; one can say that the nuclei are being heated up. Energy of relative
motion is lost. This is a signature for friction forces, which are generally considered
as being responsible for the energy loss.
Fig. 17.1. (a) Trajectory of nucleus 2 in the Coulomb field of nucleus 1. (b) Trajectory of nucleus 2 in the Coulomb plus nuclear field of nucleus 1
We begin our discussion of nonconservative (e.g., friction) forces with the Lagrange equations in the form
$$\frac{d}{dt}\frac{\partial T}{\partial\dot{q}_j} - \frac{\partial T}{\partial q_j} = Q_j, \qquad j = 1, 2, \ldots, n, \tag{17.18}$$
and split the generalized forces $Q_j$ into a conservative part $Q_j^{(c)}$ and a nonconservative part $Q_j^{(f)}$ ($f$ for friction):
$$Q_j = Q_j^{(c)} + Q_j^{(f)}. \tag{17.19}$$
Since $Q_j^{(c)}$ can be derived by definition from a potential according to (17.3), we can introduce $L = T - V$ and bring (17.18) into the form
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_j} - \frac{\partial L}{\partial q_j} = Q_j^{(f)}, \qquad j = 1, 2, \ldots, n. \tag{17.20}$$
If the nonconservative forces are friction forces, only these friction forces $Q_j^{(f)}$ appear on the right-hand side. For these, we make the ansatz
$$Q_j^{(f)} = -\sum_{k=1}^{n}f_{jk}\dot{q}_k, \tag{17.21}$$
where the $f_{jk}$ are the friction coefficients. If the friction tensor $f_{jk}$ is symmetric, i.e., $f_{jk} = f_{kj}$, the friction forces $Q_j^{(f)}$ can be obtained by partial differentiation with respect to the generalized velocities $\dot{q}_j$ from the function
$$D = \frac{1}{2}\sum_{k,l=1}^{n}f_{kl}\dot{q}_k\dot{q}_l \tag{17.22}$$
according to
$$Q_j^{(f)} = -\frac{\partial D}{\partial\dot{q}_j}.$$
$D$ is called the dissipation function (friction function). The Lagrange equations (17.20) can now be written as
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_j} + \frac{\partial D}{\partial\dot{q}_j} - \frac{\partial L}{\partial q_j} = 0. \tag{17.23}$$
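In order to understand the physical meaning of the dissipation function, we calculate the work performed by the friction forces $Q_j^{(f)}$ per unit time,
$$\frac{dW^{(r)}}{dt} = \sum_j Q_j^{(f)}\dot{q}_j = -\sum_{j,k}f_{jk}\dot{q}_j\dot{q}_k = -2D, \tag{17.24}$$
i.e., the energy consumed by the friction forces per unit time is twice the dissipation function:
$$\frac{dE}{dt} = \frac{d}{dt}(T + V) = -2D. \tag{17.25}$$
This can also be derived directly from the Lagrange equations:
$$\frac{d}{dt}(T + V) = \sum_i\frac{\partial T}{\partial q_i}\dot{q}_i + \sum_i\frac{\partial T}{\partial\dot{q}_i}\ddot{q}_i + \frac{d}{dt}V. \tag{17.26}$$
With (17.23) we find
$$\begin{aligned} \sum_i\frac{\partial T}{\partial\dot{q}_i}\ddot{q}_i &= \frac{d}{dt}\left(\sum_i\frac{\partial T}{\partial\dot{q}_i}\dot{q}_i\right) - \sum_i\left(\frac{d}{dt}\frac{\partial T}{\partial\dot{q}_i}\right)\dot{q}_i\\ &= \frac{d}{dt}(2T) + \sum_i\frac{\partial D}{\partial\dot{q}_i}\dot{q}_i - \sum_i\frac{\partial T}{\partial q_i}\dot{q}_i + \sum_i\frac{\partial V}{\partial q_i}\dot{q}_i\\ &= \frac{d}{dt}(2T) + 2D - \sum_i\frac{\partial T}{\partial q_i}\dot{q}_i + \frac{d}{dt}V. \end{aligned} \tag{17.27}$$
By inserting this into (17.26), we obtain
$$\frac{d(T + V)}{dt} = \frac{d(2T + 2V)}{dt} + 2D$$
or
$$\frac{dE}{dt} = -2D,$$
i.e., the result (17.25).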
EXAMPLE
17.2 Motion of a Projectile in Air
The particle shall be under the action of the conservative gravitational force with the
potential
V = mgz
and the nonconservative friction resistance of air. The air resistance depends on the
projectile velocity. We suppose the friction force to be proportional to the velocity. It
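can then be derived from the dissipation function
$$D = \frac{1}{2}\alpha(\dot{x}^2 + \dot{y}^2 + \dot{z}^2),$$
and the Lagrange equations follow with $L = \frac{1}{2}m(\dot{x}^2 + \dot{y}^2 + \dot{z}^2) - mgz$ according to (17.23) as
$$m\ddot{x} + \alpha\dot{x} = 0, \qquad m\ddot{y} + \alpha\dot{y} = 0, \qquad m\ddot{z} + \alpha\dot{z} + mg = 0.$$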
These equations of motion are known from the lectures on classical mechanics.2
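The energy balance (17.25) can be verified for this example by numerical integration. The sketch below is a minimal illustration under assumptions: the function name `simulate_projectile`, the Runge-Kutta scheme, and all parameter values are hypothetical. Alongside the equations of motion it integrates the auxiliary quantity $w$ with $\dot{w} = 2D$, so that the loss of $E = T + V$ can be compared with the accumulated dissipation.

```python
def simulate_projectile(m=1.0, alpha=0.3, g=9.81, v0=(3.0, 0.0, 4.0),
                        dt=1e-3, steps=2000):
    """Integrate m*v_dot = -alpha*v - m*g*e_z with RK4.

    Returns (E_start, E_end, dissipated), where dissipated is the time
    integral of 2*D accumulated along the trajectory.
    """
    def rhs(s):
        z, vx, vy, vz, w = s
        two_D = alpha * (vx * vx + vy * vy + vz * vz)  # 2*D for D = alpha*v^2/2
        return [vz, -alpha * vx / m, -alpha * vy / m, -alpha * vz / m - g, two_D]

    def energy(s):
        z, vx, vy, vz, _ = s
        return 0.5 * m * (vx * vx + vy * vy + vz * vz) + m * g * z

    s = [0.0, v0[0], v0[1], v0[2], 0.0]  # state: height, velocity, dissipated energy
    e_start = energy(s)
    for _ in range(steps):
        k1 = rhs(s)
        k2 = rhs([x + 0.5 * dt * k for x, k in zip(s, k1)])
        k3 = rhs([x + 0.5 * dt * k for x, k in zip(s, k2)])
        k4 = rhs([x + dt * k for x, k in zip(s, k3)])
        s = [x + dt * (a + 2 * b + 2 * c + d) / 6.0
             for x, a, b, c, d in zip(s, k1, k2, k3, k4)]
    return e_start, energy(s), s[4]

e_start, e_end, dissipated = simulate_projectile()
print(e_start - e_end, dissipated)
```

The two printed numbers should coincide up to the integration error, in accordance with $dE/dt = -2D$.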
17.3 Nonholonomic Systems and Lagrange Multipliers
In the preceding text, we have already discussed holonomic and nonholonomic systems. A brief recapitulation seems appropriate: For holonomic systems, the supplementary conditions can be expressed in the closed form
$$g_i(\mathbf{r}_\nu, t) = 0, \qquad i = 1, 2, \ldots, s, \qquad \nu = 1, 2, \ldots, N, \tag{17.28}$$
where $N$ is the number of particles. We can therefore eliminate $s$ coordinates and express the $\mathbf{r}_\nu$ as functions of $n = 3N - s$ independent generalized coordinates $q_i$. For nonholonomic systems, this is not possible, since the supplementary conditions appear in the differential form
$$\sum_{l=1}^{N}\mathbf{g}_{il}(\mathbf{r}_\nu, t)\cdot d\mathbf{r}_l + g_{it}(\mathbf{r}_\nu, t)\,dt = 0, \qquad i = 1, 2, \ldots, s. \tag{17.29}$$
(2004), Chapter 20.
Since these equations are assumed to be nonintegrable, one cannot use (17.29) to eliminate $s$ dependent coordinates. One therefore simply expresses the $\mathbf{r}_i$ as functions of $3N$ generalized coordinates $q_i$. The $q_i$ are of course not all independent, but are subject to supplementary conditions which are obtained by rewriting (17.29) in terms of the $q_i$:
$$\sum_{l=1}^{3N}a_{il}(q, t)\,dq_l + a_{it}(q, t)\,dt = 0, \qquad i = 1, 2, \ldots, s. \tag{17.30}$$
For virtual displacements $\delta q_l$, i.e., $\delta t = 0$, these supplementary conditions change to
$$\sum_{l=1}^{3N}a_{il}(q, t)\,\delta q_l = 0, \qquad i = 1, 2, \ldots, s. \tag{17.31}$$
l=1
In this form, the supplementary conditions can be combined with the Lagrange equations written in the same form, namely,
$$\sum_{j=1}^{3N}\left(Q_j^{(r)} + \frac{\partial L}{\partial q_j} - \frac{d}{dt}\frac{\partial L}{\partial\dot{q}_j}\right)\delta q_j = 0. \tag{17.32}$$
The conservative forces were taken into account in the Lagrangian $L$. Because of the conditions (17.31), not all $\delta q_i$ in (17.32) are independent. To take this fact into account, one multiplies (17.31) by the factors $\lambda_i$, which at the moment are still unknown, and sums over $i$:
$$\sum_{i=1}^{s}\sum_{l=1}^{3N}\lambda_i a_{il}(q, t)\,\delta q_l = 0. \tag{17.33}$$
Addition of (17.32) and (17.33) then yields
$$\sum_{j=1}^{3N}\left(Q_j^{(r)} + \frac{\partial L}{\partial q_j} - \frac{d}{dt}\frac{\partial L}{\partial\dot{q}_j} + \sum_{i=1}^{s}\lambda_i a_{ij}(q, t)\right)\delta q_j = 0. \tag{17.34}$$
The factors $\lambda_i$ are called Lagrange multipliers. They can be chosen arbitrarily in (17.34). Among the $3N$ quantities $\delta q_j$, however, only $3N - s$ can be chosen arbitrarily, since the $s$ supplementary conditions (17.31) still must be satisfied. We number the $\delta q_j$ so that the first $s$ of them are just the dependent ones; the last $(3N - s)$ of the $\delta q_j$ can be freely chosen.
Now we utilize the free choice of the $s$ Lagrange parameters $\lambda_i$, which are determined in such a way that the coefficients of the first $s$ variations $\delta q_j$ in (17.34) vanish. This obviously leads to the $s$ equations
$$Q_j^{(r)} + \frac{\partial L}{\partial q_j} - \frac{d}{dt}\frac{\partial L}{\partial\dot{q}_j} + \sum_{i=1}^{s}\lambda_i a_{ij}(q, t) = 0, \qquad j = 1, 2, \ldots, s, \tag{17.35}$$
and (17.34) reduces to
$$\sum_{j=s+1}^{3N}\left(Q_j^{(r)} + \frac{\partial L}{\partial q_j} - \frac{d}{dt}\frac{\partial L}{\partial\dot{q}_j} + \sum_{i=1}^{s}\lambda_i a_{ij}(q, t)\right)\delta q_j = 0. \tag{17.36}$$
In (17.36), all of the $\delta q_j$ can now be freely chosen. Therefore, the expression in the round bracket must vanish for every individual $j$; i.e., it follows that
$$Q_j^{(r)} + \frac{\partial L}{\partial q_j} - \frac{d}{dt}\frac{\partial L}{\partial\dot{q}_j} + \sum_{i=1}^{s}\lambda_i a_{ij}(q, t) = 0, \qquad j = s + 1, s + 2, \ldots, 3N. \tag{17.37}$$
Now we see that the two sets of equations (17.35) and (17.37) have the same form and can be simply combined to
$$\frac{\partial L}{\partial q_j} - \frac{d}{dt}\frac{\partial L}{\partial\dot{q}_j} + Q_j^{(r)} + \sum_{i=1}^{s}\lambda_i a_{ij}(q, t) = 0, \qquad j = 1, 2, \ldots, 3N. \tag{17.38}$$
These are $3N$ equations which, together with the $s$ supplementary conditions in the form
$$\sum_{l=1}^{3N}a_{il}(q, t)\dot{q}_l + a_{it}(q, t) = 0, \qquad i = 1, 2, \ldots, s, \tag{17.39}$$
determine the $3N + s$ unknown quantities, namely the $3N$ coordinates $q_j$ and the $s$ Lagrange multipliers $\lambda_i$. Hence, the total number of desired quantities $(q_j, \lambda_i)$ is $3N + s$. This is also the number of equations (17.38) and (17.39) which determine these quantities.
The meaning of the Lagrange multipliers can be understood even more precisely if we interpret the last term in (17.38) as an additional force $Q_j^{(z)}$, namely,
$$Q_j^{(z)} = \sum_{i=1}^{s}\lambda_i a_{ij}(q, t). \tag{17.40}$$
These forces $Q_j^{(z)}$ are constraint reactions which appear since the motion of the system is restricted by supplementary conditions. Indeed, if the supplementary conditions disappear ($a_{ij} = 0$), the constraint reactions also vanish: $Q_j^{(z)} = 0$. The former equation (17.33) can now be written as
$$\sum_{i=1}^{3N}Q_i^{(z)}\,\delta q_i = 0 \tag{17.41}$$
and can be interpreted as the vanishing of the virtual work of the constraint reactions.
It is clear that the method of Lagrange multipliers developed here for nonholonomic systems can be applied to holonomic systems, too. The holonomic constraints (17.28)
$$g_i(\mathbf{r}_\nu, t) = 0, \qquad i = 1, 2, \ldots, s, \qquad \nu = 1, 2, \ldots, N,$$
can immediately be written in differential form:
$$\sum_{l=1}^{N}\frac{\partial g_i}{\partial\mathbf{r}_l}\cdot d\mathbf{r}_l + \frac{\partial g_i}{\partial t}\,dt = 0, \qquad i = 1, 2, \ldots, s. \tag{17.42}$$
This is exactly the form (17.29) for nonholonomic systems. From here on, the approach with Lagrange multipliers proceeds as explained above. We then obtain $(3N + s)$ coupled equations, while the former solution method for holonomic systems (based on the elimination of $s$ coordinates from (17.28)) leads only to $(3N - s)$ coupled equations. With the additional $2s$ equations, the procedure has become considerably more complicated. However, this complication also has a great advantage: We can now determine the constraint reactions $Q_j^{(z)}$ according to (17.40) without difficulty (by solving the $3N + s$ equations).
EXERCISE
17.3 Circular Disk Rolls on a Plane
Problem. Determine the equations of motion and the constraint reactions of a cir-
cular disk of mass M and radius R that rolls without gliding on the x, y-plane (see
Fig. 17.2). The disk shall always stand perpendicular to the x, y-plane.
Fig. 17.2. A circular disk rolls on the x,y-plane
Solution. We first consider how to formulate mathematically the constraints “without gliding” and “always stands perpendicular to the x, y-plane.” This actually means that the center of the disk is exactly above the contact point $(x, y)$ (the disk stands perpendicular), and that the velocity $R\dot{\Phi}$ of the disk edge equals the velocity of the contact point in the x, y-plane ($\Phi$ is the rotation angle of the disk around its axis). The latter means that there is no gliding. If we introduce the angle $\Theta$ between the disk axis and the x-axis (see Fig. 17.3), the condition “no gliding” mathematically reads
$$\dot{x} = R\dot{\Phi}\sin\Theta, \qquad \dot{y} = -R\dot{\Phi}\cos\Theta. \tag{17.43}$$
In another formulation, these differential supplementary conditions read
$$dx - R\sin\Theta\,d\Phi = 0, \qquad dy + R\cos\Theta\,d\Phi = 0. \tag{17.44}$$
Fig. 17.3. Projection onto the x, y-plane
In the form (17.30), these conditions thus read
$$a_{11}\,dx + a_{12}\,dy + a_{13}\,d\Phi + a_{14}\,d\Theta = 0,$$
$$a_{21}\,dx + a_{22}\,dy + a_{23}\,d\Phi + a_{24}\,d\Theta = 0,$$
where
$$\begin{aligned} a_{11} &= 1, & a_{12} &= 0, & a_{13} &= -R\sin\Theta, & a_{14} &= 0,\\ a_{21} &= 0, & a_{22} &= 1, & a_{23} &= R\cos\Theta, & a_{24} &= 0. \end{aligned}$$
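Thus, according to (17.40) the constraint reactions are
$$\begin{aligned} Q_x^{(z)} &= \lambda_1,\\ Q_y^{(z)} &= \lambda_2,\\ Q_\Phi^{(z)} &= -\lambda_1R\sin\Theta + \lambda_2R\cos\Theta,\\ Q_\Theta^{(z)} &= 0. \end{aligned} \tag{17.45}$$
The kinetic energy of the disk is
$$T = \frac{1}{2}I_1\dot{\Phi}^2 + \frac{1}{2}I_2\dot{\Theta}^2 + \frac{1}{2}M\dot{x}^2 + \frac{1}{2}M\dot{y}^2, \tag{17.46}$$
where $I_1$ is the moment of inertia of the disk about the axis perpendicular to the disk through the center, and $I_2$ is the moment about the axis through the center and the contact point $(x, y)$.
The Lagrange equations (17.38) now read explicitly
$$\begin{aligned} M\ddot{x} &= Q_x + \lambda_1,\\ M\ddot{y} &= Q_y + \lambda_2,\\ I_1\ddot{\Phi} &= Q_\Phi - \lambda_1R\sin\Theta + \lambda_2R\cos\Theta,\\ I_2\ddot{\Theta} &= Q_\Theta. \end{aligned} \tag{17.47}$$
$Q_x$, $Q_y$, $Q_\Phi$, $Q_\Theta$ are possible external forces. We study the case without such forces and therefore set them equal to zero. This transforms (17.47) into
$$\begin{aligned} M\ddot{x} &= \lambda_1,\\ M\ddot{y} &= \lambda_2,\\ I_1\ddot{\Phi} &= -\lambda_1R\sin\Theta + \lambda_2R\cos\Theta,\\ I_2\ddot{\Theta} &= 0, \end{aligned} \tag{17.48}$$
which must be supplemented by the constraints (17.43), in accordance with (17.39):
$$\dot{x} = R\dot{\Phi}\sin\Theta, \qquad \dot{y} = -R\dot{\Phi}\cos\Theta. \tag{17.49}$$
The last equation of (17.48) can be immediately integrated, leading to
$$\Theta = \omega t + \Theta_0.$$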
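By inserting this into (17.49), one can calculate $\ddot{x}$ and $\ddot{y}$, which determine $\lambda_1$ and $\lambda_2$ through the first two equations of (17.48):
$$\begin{aligned} \lambda_1 &= M\ddot{x} = M\big(R\ddot{\Phi}\sin(\omega t + \Theta_0) + \omega R\dot{\Phi}\cos(\omega t + \Theta_0)\big),\\ \lambda_2 &= M\ddot{y} = -M\big(R\ddot{\Phi}\cos(\omega t + \Theta_0) - \omega R\dot{\Phi}\sin(\omega t + \Theta_0)\big). \end{aligned} \tag{17.50}$$
This in turn is now inserted into the third equation of (17.48), which then reads
$$\begin{aligned} I_1\ddot{\Phi} = {} & -MR\big(R\ddot{\Phi}\sin(\omega t + \Theta_0) + \omega R\dot{\Phi}\cos(\omega t + \Theta_0)\big)\sin(\omega t + \Theta_0)\\ & - MR\big(R\ddot{\Phi}\cos(\omega t + \Theta_0) - \omega R\dot{\Phi}\sin(\omega t + \Theta_0)\big)\cos(\omega t + \Theta_0)\\ = {} & -MR^2\ddot{\Phi}; \end{aligned}$$
i.e.,
$$(I_1 + MR^2)\ddot{\Phi} = 0.$$
This leads to $\ddot{\Phi} = 0$ and hence $\dot{\Phi} = \text{constant}$. Therefore, we can explicitly write down the constraint reactions (17.45):
$$\begin{aligned} Q_x^{(z)} &= M\omega R\dot{\Phi}\cos(\omega t + \Theta_0),\\ Q_y^{(z)} &= M\omega R\dot{\Phi}\sin(\omega t + \Theta_0),\\ Q_\Phi^{(z)} &= 0, \qquad Q_\Theta^{(z)} = 0. \end{aligned} \tag{17.51}$$
These constraint reactions must act to keep the disk vertical on the x,y-plane. If the disk rolls along a straight line ($\omega = 0$), the constraint reactions disappear.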
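The result (17.51) can be cross-checked numerically. In the sketch below, which is an illustration with hypothetical function names and parameter values, the multipliers $\lambda_1 = M\ddot{x}$ and $\lambda_2 = M\ddot{y}$ are obtained by differentiating the constraint velocities (17.49) numerically for constant $\dot{\Phi}$ and $\Theta = \omega t + \Theta_0$, and are then compared with the closed-form expressions.

```python
import math

def multipliers_numeric(M, R, Phi_dot, omega, Theta0, t, d=1e-4):
    """lambda_1 = M*x_dd, lambda_2 = M*y_dd from central differences of (17.49)."""
    def velocity(tt):
        Theta = omega * tt + Theta0
        return (R * Phi_dot * math.sin(Theta), -R * Phi_dot * math.cos(Theta))

    (vxp, vyp), (vxm, vym) = velocity(t + d), velocity(t - d)
    return M * (vxp - vxm) / (2 * d), M * (vyp - vym) / (2 * d)

def multipliers_closed_form(M, R, Phi_dot, omega, Theta0, t):
    """Constraint reactions Q_x, Q_y as given in (17.51)."""
    Theta = omega * t + Theta0
    return (M * omega * R * Phi_dot * math.cos(Theta),
            M * omega * R * Phi_dot * math.sin(Theta))

# Hypothetical disk: M = 2 kg, R = 0.3 m, spin rate 5 rad/s, turning rate 0.8 rad/s.
args = (2.0, 0.3, 5.0, 0.8, 0.2, 1.7)
print(multipliers_numeric(*args))
print(multipliers_closed_form(*args))
```

The magnitude of the reaction, $M\omega R\dot{\Phi}$, equals $Mv^2/\rho$ with the speed $v = R\dot{\Phi}$ and the radius $\rho = v/\omega$ of the circle traced by the contact point, i.e., it is the centripetal force required for the curved path.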
EXERCISE
17.4 Centrifugal Force Governor
Problem. Consider the degrees of freedom, and determine the equation of motion
of the centrifugal force governor (Fig. 17.4) through the Lagrangian.
Fig. 17.4.
Solution. The principle of the centrifugal governor is applied, e.g., in automobiles. The distributor drive shaft is rigidly fixed to the carrier plate of a centrifugal governor, which is attached below the interrupter plate. At higher speeds the centrifugal masses on their carrier plate press against a “cog.” Thus, the distributor shaft set into the driving shaft is additionally moved in the rotation direction by a cam. This mechanism provides the ignition advance needed at higher speeds. In more advanced motors with “transistor ignition,” this mechanism is dropped.
Fig. 17.5. Centrifugal governor in automobiles
The system has two degrees of freedom, which can be described by the angles θ
and ϕ. The motion of m, M is restricted by the constraints represented by the four rigid
rods and the rotation axis. Hence, θ and ϕ offer themselves as generalized coordinates.
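We first determine the kinetic energy. The moment of inertia of the cylinder is
$$\Theta_{ZZ} = \frac{1}{2}MR^2,$$
and therefore,
$$T_{\text{rot}} = \frac{1}{2}\left(\frac{1}{2}MR^2 + 2ml^2\sin^2\theta\right)\dot{\varphi}^2. \tag{17.52}$$
The kinetic energy due to the motion in the x, y-plane is
$$T_{\text{plane}} = 2\,\frac{m}{2}v_m^2 + \frac{1}{2}Mv_M^2, \tag{17.53}$$
$$v_m = l\dot{\theta}, \qquad v_M = \frac{d}{dt}(-2l\cos\theta) = 2l\dot{\theta}\sin\theta, \tag{17.54}$$
$$T_{\text{plane}} = (m + 2M\sin^2\theta)l^2\dot{\theta}^2. \tag{17.55}$$
With the potential energy $V = -2gl(m + M)\cos\theta$, we can write down the Lagrangian:
$$\begin{aligned} L &= T_{\text{rot}} + T_{\text{plane}} - V\\ &= \frac{1}{2}(\Theta_{ZZ} + 2ml^2\sin^2\theta)\dot{\varphi}^2 + (m + 2M\sin^2\theta)l^2\dot{\theta}^2 + 2gl(m + M)\cos\theta. \end{aligned} \tag{17.56}$$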
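The Lagrange equations
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_\nu} - \frac{\partial L}{\partial q_\nu} = 0$$
immediately yield the equations of motion. We need the derivatives
$$\frac{\partial L}{\partial\varphi} = 0, \qquad \frac{\partial L}{\partial\dot{\varphi}} = (\Theta_{ZZ} + 2ml^2\sin^2\theta)\dot{\varphi}, \tag{17.57}$$
$$\begin{aligned} \frac{\partial L}{\partial\theta} &= 2ml^2\sin\theta\cos\theta\,\dot{\varphi}^2 + 4M\sin\theta\cos\theta\,l^2\dot{\theta}^2 - 2gl(m + M)\sin\theta,\\ \frac{\partial L}{\partial\dot{\theta}} &= 2(m + 2M\sin^2\theta)l^2\dot{\theta}. \end{aligned} \tag{17.58}$$
From (17.57), we get
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{\varphi}} = \frac{d}{dt}\left[(\Theta_{ZZ} + 2ml^2\sin^2\theta)\dot{\varphi}\right] = 0. \tag{17.59}$$
For the Lagrange equation in $\theta$, we need
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{\theta}} = (2m + 4M\sin^2\theta)l^2\ddot{\theta} + 8M\sin\theta\cos\theta\,l^2\dot{\theta}^2.$$
Then
$$(2m + 4M\sin^2\theta)l^2\ddot{\theta} + 4Ml^2\dot{\theta}^2\sin\theta\cos\theta - 2ml^2\dot{\varphi}^2\sin\theta\cos\theta + 2gl(m + M)\sin\theta = 0, \tag{17.60}$$
and hence, we obtain the following equations of motion:
$$\begin{aligned} &(2m + 4M\sin^2\theta)l^2\ddot{\theta} + 2l^2(2M\dot{\theta}^2 - m\dot{\varphi}^2)\sin\theta\cos\theta + 2gl(m + M)\sin\theta = 0,\\ &\frac{d}{dt}\left[\left(\frac{1}{2}MR^2 + 2ml^2\sin^2\theta\right)\dot{\varphi}\right] = 0 \quad\Leftrightarrow\quad \dot{\varphi} = \frac{C}{\frac{1}{2}MR^2 + 2ml^2\sin^2\theta}. \end{aligned} \tag{17.61}$$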
From these equations of motion, the advantage of the Lagrangian formalism becomes
evident. To account for the complicated constraint reactions in Newton’s formulation
would be much more laborious.
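As a brief illustration of how (17.61) is used, one may ask for the stationary opening angle of the governor at a given rotation rate. This is a sketch under the assumption of steady rotation, $\dot{\varphi} = \omega_0 = \text{constant}$ and $\dot{\theta} = \ddot{\theta} = 0$; the symbols $\omega_0$ and $\theta_0$ are introduced here for illustration and do not appear in the exercise.

```latex
% Steady state of the first equation of (17.61): theta_dot = theta_ddot = 0, phi_dot = omega_0
\begin{align*}
  -2ml^2\omega_0^2\sin\theta_0\cos\theta_0 + 2gl(m+M)\sin\theta_0 &= 0 \\
  \Rightarrow\quad \sin\theta_0\,\bigl[\,g(m+M) - ml\,\omega_0^2\cos\theta_0\,\bigr] &= 0 .
\end{align*}
% Besides the trivial solution theta_0 = 0, there is an opened position
\begin{equation*}
  \cos\theta_0 = \frac{g\,(m+M)}{m\,l\,\omega_0^2},
  \qquad\text{which requires}\qquad
  \omega_0^2 \ge \frac{g\,(m+M)}{m\,l}.
\end{equation*}
```

Below this threshold rotation rate only $\theta_0 = 0$ remains; above it, the opening angle grows with $\omega_0$, which is the behavior a governor exploits.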
Part VI
Hamiltonian Theory
18 Hamilton’s Equations
The variables of the Lagrangian are the generalized coordinates qα and the accom-
panying generalized velocities q̇α . In Hamilton’s theory,1 the generalized coordinates
and the corresponding momenta are used as independent variables. In this theory the
position coordinates and the “momentum coordinates” are treated on an equal ba-
sis. Hamiltonian theory leads to an essential understanding of the formal structure of
mechanics and is of basic importance for the transition from classical mechanics to
quantum mechanics.
We now look for a transition from the Lagrangian L(qi , q̇i , t) to the Hamiltonian
$H(q_i, p_i, t)$ and remember that the generalized momenta are given by
$$p_i = \frac{\partial L}{\partial\dot{q}_i}.$$
We look for a transformation
$$L(q_i, \dot{q}_i, t) \;\Rightarrow\; H\left(q_i, \frac{\partial L}{\partial\dot{q}_i}, t\right) = H(q_i, p_i, t). \tag{18.1}$$
The question is how to construct $H$. The recipe is simple and will be formulated in equation (18.2) below. The mathematical background of such a transformation (Legendre² transformation) can be easily demonstrated by a two-dimensional example. We change from the function $f(x, y)$ to the function $g(x, u) = g(x, \partial f/\partial y)$:
$$f(x, y) \;\Rightarrow\; g(x, u) \qquad\text{with}\qquad u = \frac{\partial f}{\partial y},$$
where $g(x, u)$ is defined by
$$g(x, u) = uy - f(x, y).$$
1 Sir William Rowan Hamilton, b. Aug. 4, 1805, Dublin–d. Sept. 2, 1865, Dunsink. Hamilton began
his studies in 1824 in Dublin. In 1827, before finishing his studies, he became professor of astronomy
and King’s astronomer of Ireland. Hamilton contributed important papers on algebra and invented
the quaternion calculus. His contributions to geometrical optics and classical mechanics, e.g., the
canonical equations and the Hamilton principle, are of extraordinary importance.
2 Adrien Marie Legendre, b. Sept. 18, 1752–d. Jan. 10, 1833, Paris. Legendre made essential contri-
butions to the foundation and development of number theory and geodesy. He also found important
results on elliptic integrals, on foundations and methods of Euclidean geometry, on variational cal-
culus, and on theoretical astronomy. For instance, he first applied the method of least squares and
calculated voluminous tables. Legendre dealt with many problems that Gauss was also interested
in, but he never reached his perfection. Beginning in 1775, Legendre served as professor at various
universities at Paris and published excellent textbooks which had a long-lasting influence.

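By forming the total differential, we realize that the function $g$ formed this way no longer contains $y$ as an independent variable:
$$\begin{aligned} dg &= y\,du + u\,dy - df\\ &= y\,du + u\,dy - \frac{\partial f}{\partial x}dx - \frac{\partial f}{\partial y}dy\\ &= y\,du - \frac{\partial f}{\partial x}dx, \end{aligned}$$
where now $y = \partial g/\partial u$ and $\partial g/\partial x = -\partial f/\partial x$.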
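A concrete function makes the two relations tangible. The sketch below is an illustration under assumptions: the function $f(x, y) = \tfrac{1}{2}y^2(1 + x^2)$ and all names are hypothetical choices, not taken from the text. Here $u = \partial f/\partial y = y(1 + x^2)$ can be inverted explicitly, and the relations $\partial g/\partial u = y$ and $\partial g/\partial x = -\partial f/\partial x$ are checked by central differences.

```python
def f(x, y):
    return 0.5 * y**2 * (1.0 + x**2)

def g(x, u):
    """Legendre transform of f with respect to y: g = u*y - f, with y = y(x, u)."""
    y = u / (1.0 + x**2)  # inverts u = df/dy = y * (1 + x^2)
    return u * y - f(x, y)

def partial(func, a, b, index, d=1e-6):
    """Central-difference partial derivative of func(a, b) w.r.t. argument `index`."""
    if index == 0:
        return (func(a + d, b) - func(a - d, b)) / (2 * d)
    return (func(a, b + d) - func(a, b - d)) / (2 * d)

x, u = 0.7, 1.3
y = u / (1.0 + x**2)

print(partial(g, x, u, 1), y)                      # dg/du   versus  y
print(partial(g, x, u, 0), -partial(f, x, y, 0))   # dg/dx   versus  -df/dx at fixed y
```

Each printed pair should agree up to the finite-difference error. Note that $\partial f/\partial x$ is taken at fixed $y$, whereas $\partial g/\partial x$ is taken at fixed $u$.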
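After this short digression, we now construct the Hamiltonian from the Lagrangian. We write for the Hamiltonian
$$H(q_i, p_i, t) = \sum_i p_i\dot{q}_i - L(q_i, \dot{q}_i, t). \tag{18.2}$$
We look for those equations of motion which are equivalent to the Lagrange equations based on the Lagrangian $L$. To this end, we form the total differential:
$$dH = \sum_i p_i\,d\dot{q}_i + \sum_i\dot{q}_i\,dp_i - dL. \tag{18.3}$$
The total differential of the Lagrangian reads
$$dL = \sum_i\frac{\partial L}{\partial q_i}dq_i + \sum_i\frac{\partial L}{\partial\dot{q}_i}d\dot{q}_i + \frac{\partial L}{\partial t}dt. \tag{18.4}$$
We now utilize the definition of the generalized momentum, $p_i = \partial L/\partial\dot{q}_i$, and the Lagrange equation in the form
$$\frac{d}{dt}p_i - \frac{\partial L}{\partial q_i} = 0.$$
Inserting both into (18.4) yields
$$dL = \sum_i\dot{p}_i\,dq_i + \sum_i p_i\,d\dot{q}_i + \frac{\partial L}{\partial t}dt.$$
By insertion of $dL$ into (18.3), it follows that
$$dH = \sum_i p_i\,d\dot{q}_i + \sum_i\dot{q}_i\,dp_i - \sum_i\dot{p}_i\,dq_i - \sum_i p_i\,d\dot{q}_i - \frac{\partial L}{\partial t}dt.$$
Since the first and fourth terms cancel each other, there remains
$$dH = \sum_i\dot{q}_i\,dp_i - \sum_i\dot{p}_i\,dq_i - \frac{\partial L}{\partial t}dt.$$
Therefore, $H$ depends only on $p_i$, $q_i$, and $t$; thus, $H = H(q_i, p_i, t)$, and we have
$$dH = \sum_i\frac{\partial H}{\partial q_i}dq_i + \sum_i\frac{\partial H}{\partial p_i}dp_i + \frac{\partial H}{\partial t}dt = \sum_i\dot{q}_i\,dp_i - \sum_i\dot{p}_i\,dq_i - \frac{\partial L}{\partial t}dt.$$
From this immediately follow the Hamilton equations:
$$\dot{q}_i = \frac{\partial H}{\partial p_i}, \qquad \dot{p}_i = -\frac{\partial H}{\partial q_i}, \qquad \frac{\partial H}{\partial t} = -\frac{\partial L}{\partial t}. \tag{18.5}$$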
They are now the fundamental equations of motion in this formulation of mechan-
ics. The Hamiltonian H here plays the central role, similar to the Lagrangian L in
Lagrange’s formulation of mechanics. This Hamiltonian H is constructed according
to (18.2); but with the prescription that all velocities q̇i are expressed by the general-
ized momenta pi and the generalized coordinates qi through (18.1). In other words,
the equations (18.1) for the definition of the generalized momenta
$$p_i = \frac{\partial L(q_i, \dot{q}_i, t)}{\partial\dot{q}_i}$$
are solved for the generalized velocities $\dot{q}_i$, so that
$$\dot{q}_i = \dot{q}_i(q_i, p_i, t).$$
The q̇i obtained this way are inserted into the definition of H (see (18.2)), so that the
Hamiltonian H finally depends only on qi , pi , and the time t ; hence, H = H (qi , pi , t).
From this, the Hamilton equations (18.5) are established and solved.
The Lagrange equations provide a set of n differential equations of second order in
the time for the position coordinates. The Hamiltonian formalism yields 2n coupled
differential equations of first order for the momentum and position coordinates. In any
case, there are 2n integration constants when solving the system of equations.
From (18.5), it is seen that for a coordinate that does not enter the Hamiltonian, the corresponding change of the momentum with time vanishes:
$$\frac{\partial H}{\partial q_i} = 0 \quad\Rightarrow\quad p_i = \text{constant}.$$
If the Hamiltonian (the Lagrangian) is not explicitly time dependent, then $H$ is a constant of motion since
$$\frac{dH}{dt} = \sum_i\frac{\partial H}{\partial q_i}\dot{q}_i + \sum_i\frac{\partial H}{\partial p_i}\dot{p}_i + \frac{\partial H}{\partial t},$$
and with (18.5) this leads to
$$\frac{dH}{dt} = \frac{\partial H}{\partial t}.$$
Now it is clear that with $\partial H/\partial t = 0$ (since $H$ shall not be explicitly time dependent) it follows that $dH/dt = 0$, and thus, $H = \text{constant}$.
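The conservation of $H$ can be observed numerically by integrating the Hamilton equations (18.5) for a simple system. The sketch below uses the plane pendulum as an assumed example, with $H = p^2/(2ml^2) - mgl\cos q$; the function names, the Runge-Kutta scheme, and the parameter values are illustrative choices and not part of the text.

```python
import math

def hamiltonian(q, p, m=1.0, l=1.0, g=9.81):
    """H = p^2 / (2 m l^2) - m g l cos(q) for a plane pendulum (assumed example)."""
    return p * p / (2.0 * m * l * l) - m * g * l * math.cos(q)

def hamilton_rhs(q, p, m=1.0, l=1.0, g=9.81):
    q_dot = p / (m * l * l)             # q_dot =  dH/dp
    p_dot = -m * g * l * math.sin(q)    # p_dot = -dH/dq
    return q_dot, p_dot

def integrate(q, p, dt=1e-3, steps=5000):
    """Advance (q, p) with the classical fourth-order Runge-Kutta method."""
    for _ in range(steps):
        k1q, k1p = hamilton_rhs(q, p)
        k2q, k2p = hamilton_rhs(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
        k3q, k3p = hamilton_rhs(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
        k4q, k4p = hamilton_rhs(q + dt * k3q, p + dt * k3p)
        q += dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6.0
        p += dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0
    return q, p

q0, p0 = 1.0, 0.0
q1, p1 = integrate(q0, p0)
print(hamiltonian(q0, p0), hamiltonian(q1, p1))
```

Since this $H$ has no explicit time dependence, the two printed values should agree up to the small integration error. The two first-order equations for $(q, p)$ replace the single second-order Lagrange equation for $q$.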
What is the meaning of the Hamiltonian; how can it be interpreted physically?
To see that, we consider a special case: For a system with holonomic, scleronomic
constraints and conservative internal forces, the Hamiltonian H represents the energy
of the system.
To clarify this, we first consider the kinetic energy:
1
T= mν ṙν2 , ν = 1, 2, . . . , N (N = number of particles).
2 ν
If the constraints are holonomic and not time-dependent, there exist transformation
equations rν = rν (qi ), and therefore,
∂rν
ṙν = q̇i .
∂qi
i
Insertion into the kinetic energy yields

∂rν ∂rν
T = mν q̇i · q̇k
ν
∂qi ∂qk
i,k
1 ∂rν ∂rν
= · q̇i q̇k
2 ν ∂qi ∂qk
i,k

= aik q̇i q̇k .
i,k
Thus, the kinetic energy is a homogeneous quadratic function of the generalized ve-
locities. The arising mass coefficients
$$a_{ik} = \frac12 \sum_\nu m_\nu \frac{\partial \mathbf r_\nu}{\partial q_i}\cdot\frac{\partial \mathbf r_\nu}{\partial q_k}$$
are symmetric; i.e., aik = aki .

Now we can apply Euler's theorem on homogeneous functions. If f is a homogeneous function of degree n, i.e., if
$$f(\lambda x_1, \lambda x_2, \ldots, \lambda x_k) = \lambda^n f(x_1, x_2, \ldots, x_k),$$
then also
$$\sum_{i=1}^{k} \frac{\partial f}{\partial x_i}\,x_i = n f.$$
This can be shown by differentiating the upper equation with respect to λ; thus,
$$\frac{\partial f}{\partial(\lambda x_1)}\,x_1 + \cdots + \frac{\partial f}{\partial(\lambda x_k)}\,x_k = n\lambda^{n-1} f.$$
By setting λ = 1, the assertion follows. Euler’s theorem, applied to the kinetic energy
(n = 2), means
$$\sum_i \frac{\partial T}{\partial \dot q_i}\,\dot q_i = 2T. \tag{18.6}$$
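Since the forces are presupposed to be conservative, there exists a velocity-independent potential $V(q_i)$, so that
$$\frac{\partial L}{\partial \dot q_i} = \frac{\partial T}{\partial \dot q_i} = p_i,$$
and therefore,
$$H = \sum_i p_i \dot q_i - L = \sum_i \frac{\partial T}{\partial \dot q_i}\,\dot q_i - L.$$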
By using the relation (18.6) and the definition of the Lagrangian, we see that
H = 2T − (T − V ) = T + V = E.
Thus, under the given conditions the Hamiltonian represents the total energy. The
energy T − V represented by the Lagrangian is sometimes called the free energy.
One should note that H does not include a possible work performed by the con-
straint reactions.
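The use of Euler's theorem in (18.6) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the mass coefficients `a` and the velocities `qdot` are arbitrary illustrative values, and the derivative ∂T/∂q̇_i is approximated by central differences.

```python
def kinetic_energy(a, qdot):
    """T = sum_{i,k} a[i][k] * qdot[i] * qdot[k] for symmetric coefficients a."""
    n = len(qdot)
    return sum(a[i][k] * qdot[i] * qdot[k] for i in range(n) for k in range(n))


def euler_sum(a, qdot, h=1e-6):
    """sum_i qdot_i * dT/dqdot_i, with dT/dqdot_i from central differences."""
    total = 0.0
    for i in range(len(qdot)):
        up = list(qdot)
        dn = list(qdot)
        up[i] += h
        dn[i] -= h
        dT = (kinetic_energy(a, up) - kinetic_energy(a, dn)) / (2.0 * h)
        total += qdot[i] * dT
    return total


# Hypothetical symmetric mass coefficients a_ik and generalized velocities.
a = [[2.0, 0.3], [0.3, 1.5]]
qdot = [0.7, -1.2]

T = kinetic_energy(a, qdot)
lhs = euler_sum(a, qdot)   # should equal 2*T according to (18.6)
```

Since T is exactly quadratic in the velocities, the central difference is exact up to rounding, so `lhs` and `2*T` agree to high accuracy.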
The Hamiltonian formulation of mechanics emerges via the Lagrange equations
from Newton’s equations. This became evident in deriving (18.5), where we explicitly
used the Lagrange equations. The latter ones are however equivalent to Newton’s for-
mulation of mechanics (see d’Alembert’s principle and following text). Conversely,
one can easily derive Newton’s equations from Hamilton’s equations and thus show
the equivalence of both formulations. It is sufficient to consider a single particle in
a conservative force field and to use the Cartesian coordinates as generalized coordi-
nates. Then
$$p_i = m\dot x_i, \qquad H = \frac12 m \sum_i \dot x_i^2 + V(x_i) \quad (i = 1, 2, 3),$$
or
$$H = \frac12 \sum_i \frac{p_i^2}{m} + V(q_i).$$
This leads to the Hamilton equations (qi = xi ):

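$$\dot q_i = \frac{\partial H}{\partial p_i} = \frac{p_i}{m} \qquad\text{and}\qquad \dot p_i = -\frac{\partial H}{\partial q_i} = -\frac{\partial V}{\partial q_i},$$
or in vector notation
$$\dot{\mathbf p} = -\operatorname{grad} V.$$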
These are Newton’s equations of motion.
EXAMPLE
18.1 Central Motion
Let a particle perform a planar motion under the action of a potential that depends
only on the distance from the origin. It is obvious that we should use plane polar
coordinates (r, ϕ) as generalized coordinates.
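$$L = T - V = \frac12 m v^2 - V = \frac12 m(\dot r^2 + r^2\dot\varphi^2) - V(r).$$
With $p_\alpha = \partial L/\partial \dot q_\alpha$, we get the momenta
$$p_r = \frac{\partial L}{\partial \dot r} = m\dot r \quad\text{or}\quad \dot r = \frac{p_r}{m},$$
$$p_\varphi = \frac{\partial L}{\partial \dot\varphi} = m r^2 \dot\varphi \quad\text{or}\quad \dot\varphi = \frac{p_\varphi}{m r^2}.$$
Thus, the Hamiltonian reads
$$H = p_r \dot r + p_\varphi \dot\varphi - L = \frac{p_r^2}{2m} + \frac{p_\varphi^2}{2 m r^2} + V(r).$$
The Hamilton equations then yield
$$\dot r = \frac{\partial H}{\partial p_r} = \frac{p_r}{m}, \qquad \dot\varphi = \frac{\partial H}{\partial p_\varphi} = \frac{p_\varphi}{m r^2},$$
and
$$\dot p_r = -\frac{\partial H}{\partial r} = \frac{p_\varphi^2}{m r^3} - \frac{\partial V}{\partial r}, \qquad \dot p_\varphi = -\frac{\partial H}{\partial \varphi} = 0.$$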
ϕ is a cyclic coordinate. From this follows the conservation of the angular momentum
in the central potential.
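As an illustration, the four Hamilton equations of this example can be integrated numerically. The sketch below is an assumption-laden example, not taken from the text: it picks the potential V(r) = −k/r, units with m = k = 1, arbitrary initial values, and a classical fourth-order Runge-Kutta step. It lets one observe that p_ϕ stays constant and that H is conserved up to the integration error.

```python
def derivs(state, m=1.0, k=1.0):
    """Right-hand sides of the Hamilton equations for V(r) = -k/r (assumed)."""
    r, phi, pr, pphi = state
    dr = pr / m
    dphi = pphi / (m * r**2)
    dpr = pphi**2 / (m * r**3) - k / r**2   # p_phi^2/(m r^3) - dV/dr
    dpphi = 0.0                             # phi is cyclic
    return (dr, dphi, dpr, dpphi)


def rk4_step(state, dt):
    """One classical Runge-Kutta step for the state (r, phi, p_r, p_phi)."""
    def shifted(s, d, f):
        return tuple(si + f * di for si, di in zip(s, d))

    k1 = derivs(state)
    k2 = derivs(shifted(state, k1, dt / 2.0))
    k3 = derivs(shifted(state, k2, dt / 2.0))
    k4 = derivs(shifted(state, k3, dt))
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def hamiltonian(state, m=1.0, k=1.0):
    r, phi, pr, pphi = state
    return pr**2 / (2.0 * m) + pphi**2 / (2.0 * m * r**2) - k / r


state0 = (1.0, 0.0, 0.0, 1.2)   # illustrative initial values (bound orbit)
state = state0
for _ in range(2000):
    state = rk4_step(state, 0.005)

energy_drift = abs(hamiltonian(state) - hamiltonian(state0))
```

The angular momentum `state[3]` is unchanged because its time derivative vanishes identically, which is the numerical counterpart of ϕ being cyclic.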
EXAMPLE
18.2 The Pendulum in the Newtonian, Lagrangian, and Hamiltonian Theories
The equation of motion of the pendulum shall be derived within the frames of New-
ton’s, Lagrange’s, and Hamilton’s theory.
Newtonian theory: We begin with Newton's axiom
$$\dot{\mathbf p} = \mathbf K.$$
The arclength of the displacement is denoted by s, and the tangent unit vector by $\mathbf T$. Then (see Fig. 18.1)
$$\mathbf K = -mg \sin\Theta\,\mathbf T,$$
and thus,
$$m\ddot s\,\mathbf T = -mg \sin\Theta\,\mathbf T.$$
Fig. 18.1. On pendulum motion
With $s = l\Theta$, we have $\ddot s = l\ddot\Theta$. We therefore get the equation of motion
$$\ddot\Theta + \frac{g}{l}\sin\Theta = 0.$$
For small displacements ($\sin\Theta \approx \Theta + \cdots$), this becomes
$$\ddot\Theta + \frac{g}{l}\,\Theta = 0.$$
This differential equation has the general solution
$$\Theta = A\cos\sqrt{\frac{g}{l}}\,t + B\sin\sqrt{\frac{g}{l}}\,t,$$
where the constants A and B are to be determined from the initial conditions.
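Lagrangian theory:
$$T = \frac12 m v^2 = \frac12 m (l\dot\Theta)^2 = \frac12 m l^2 \dot\Theta^2,$$
$$V = mgh = mg(l - l\cos\Theta) = mgl(1 - \cos\Theta).$$
Hence, the Lagrangian for this conservative system reads
$$L = T - V = \frac12 m l^2 \dot\Theta^2 - mgl(1 - \cos\Theta).$$
Now we use the Lagrange equation
$$\frac{d}{dt}\frac{\partial L}{\partial \dot\Theta} - \frac{\partial L}{\partial \Theta} = 0.$$
With
$$\frac{\partial L}{\partial \Theta} = -mgl\sin\Theta \qquad\text{and}\qquad \frac{\partial L}{\partial \dot\Theta} = m l^2 \dot\Theta,$$
we have
$$m l^2 \ddot\Theta + mgl\sin\Theta = 0 \qquad\text{or}\qquad \ddot\Theta + \frac{g}{l}\sin\Theta = 0.$$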
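Hamiltonian theory: Using the generalized momentum
$$p_\Theta = \frac{\partial L}{\partial \dot\Theta} = m l^2 \dot\Theta,$$
the kinetic energy can be written as
$$T = \frac12 \frac{p_\Theta^2}{m l^2}.$$
Since the total energy of the system is constant, the Hamiltonian reads
$$H = T + V = \frac12 \frac{p_\Theta^2}{m l^2} + mgl(1 - \cos\Theta).$$
The Hamilton equations yield
$$\dot p_\Theta = -\frac{\partial H}{\partial \Theta} = -mgl\sin\Theta \qquad\text{and}\qquad \dot\Theta = \frac{\partial H}{\partial p_\Theta} = \frac{p_\Theta}{m l^2}.$$
The last equation gives
$$p_\Theta = m l^2 \dot\Theta.$$
Differentiation yields
$$\dot p_\Theta = m l^2 \ddot\Theta.$$
By comparing this with the above expression for $\dot p_\Theta$, we finally get again
$$\ddot\Theta + \frac{g}{l}\sin\Theta = 0.$$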
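The two first-order Hamilton equations of the pendulum are well suited for direct numerical integration. The following sketch is only an illustration under assumed values (m = l = 1, g = 9.81, a small initial angle of 0.01 rad, Runge-Kutta steps); it integrates over one small-angle period 2π√(l/g) and lets one check that the pendulum returns to its starting point, as the harmonic solution predicts.

```python
import math


def pendulum_rhs(theta, p, m=1.0, l=1.0, g=9.81):
    """Hamilton equations: theta_dot = p/(m l^2), p_dot = -m g l sin(theta)."""
    return p / (m * l**2), -m * g * l * math.sin(theta)


def integrate(theta, p, t_end, steps):
    """Integrate the Hamilton equations with classical Runge-Kutta steps."""
    dt = t_end / steps
    for _ in range(steps):
        k1 = pendulum_rhs(theta, p)
        k2 = pendulum_rhs(theta + 0.5 * dt * k1[0], p + 0.5 * dt * k1[1])
        k3 = pendulum_rhs(theta + 0.5 * dt * k2[0], p + 0.5 * dt * k2[1])
        k4 = pendulum_rhs(theta + dt * k3[0], p + dt * k3[1])
        theta += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        p += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return theta, p


theta_start = 0.01                                # small displacement (rad)
small_angle_period = 2.0 * math.pi * math.sqrt(1.0 / 9.81)
theta_end, p_end = integrate(theta_start, 0.0, small_angle_period, 2000)
```

For larger initial angles the true period exceeds the small-angle value, and the same routine then shows a visible phase lag after the time `small_angle_period`.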
EXERCISE
18.3 Hamiltonian and Canonical Equations of Motion
Problem. A mass point m shall move in a cylindrically symmetric potential $V(\varrho, z)$.
Determine the Hamiltonian and the canonical equations of motion with respect to a
coordinate system that rotates with constant angular velocity ω about the symmetry
axis,
(a) in Cartesian coordinates, and
(b) in cylindrical coordinates.
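Solution. (a) The coordinates of the inertial system (x, y, z) and those of the rotating reference system (x', y', z') are related by
$$\begin{aligned}
x &= \cos(\omega t)\,x' - \sin(\omega t)\,y', \\
y &= \sin(\omega t)\,x' + \cos(\omega t)\,y', \\
z &= z'.
\end{aligned} \tag{18.7}$$
Differentiation of the coordinates with respect to time yields
$$\begin{aligned}
\dot x &= \cos(\omega t)\,\dot x' - \sin(\omega t)\,\dot y' - \omega\big(\sin(\omega t)\,x' + \cos(\omega t)\,y'\big), \\
\dot y &= \sin(\omega t)\,\dot x' + \cos(\omega t)\,\dot y' + \omega\big(\cos(\omega t)\,x' - \sin(\omega t)\,y'\big), \\
\dot z &= \dot z'.
\end{aligned} \tag{18.8}$$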
In the primed coordinate system, the Lagrangian takes the form
$$L = \frac12 m\big[\dot x'^2 + \dot y'^2 + \dot z'^2 + \omega^2(x'^2 + y'^2) + 2\omega(\dot y' x' - \dot x' y')\big] - V(x', y', z'). \tag{18.9}$$
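From (18.9), we calculate the generalized momenta as
$$\begin{aligned}
p_{x'} &= \frac{\partial L}{\partial \dot x'} = m(\dot x' - \omega y'), \\
p_{y'} &= \frac{\partial L}{\partial \dot y'} = m(\dot y' + \omega x'), \\
p_{z'} &= \frac{\partial L}{\partial \dot z'} = m\dot z'.
\end{aligned} \tag{18.10}$$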
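Now we solve (18.10) for the velocity components $\dot x'$, $\dot y'$, $\dot z'$:
$$\begin{aligned}
\dot x' &= \frac{p_{x'}}{m} + \omega y', \\
\dot y' &= \frac{p_{y'}}{m} - \omega x', \\
\dot z' &= \frac{p_{z'}}{m}
\end{aligned} \tag{18.11}$$
and calculate the Hamiltonian according to
$$H = \sum_i \dot q_i p_i - L. \tag{18.12}$$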
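This yields
$$\begin{aligned}
H &= m(\dot x'^2 - \omega \dot x' y') + m(\dot y'^2 + \omega \dot y' x') + m\dot z'^2 - L \\
&= \frac12 m\Big[2\dot x'^2 - 2\omega\dot x' y' + 2\dot y'^2 + 2\omega\dot y' x' + 2\dot z'^2 \\
&\qquad\quad - \big(\dot x'^2 + \dot y'^2 + \dot z'^2 + \omega^2(x'^2 + y'^2) + 2\omega(\dot y' x' - \dot x' y')\big)\Big] + V \\
&= \frac12 m\big[\dot x'^2 + \dot y'^2 + \dot z'^2 - \omega^2(x'^2 + y'^2)\big] + V \\
&= \frac12 m\Big[\frac{p_{x'}^2}{m^2} + 2\frac{\omega}{m}\,y' p_{x'} + \omega^2 y'^2 + \frac{p_{y'}^2}{m^2} - 2\frac{\omega}{m}\,x' p_{y'} + \omega^2 x'^2 \\
&\qquad\quad + \frac{p_{z'}^2}{m^2} - \omega^2 x'^2 - \omega^2 y'^2\Big] + V \\
&= \frac{1}{2m}\big[p_{x'}^2 + p_{y'}^2 + p_{z'}^2\big] - \omega\big[x' p_{y'} - y' p_{x'}\big] + V\big(\sqrt{x'^2 + y'^2},\, z'\big).
\end{aligned} \tag{18.13}$$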
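H is explicitly time-independent and is therefore a constant of motion. The canonical equations of motion read
$$\begin{aligned}
\dot x' &= \frac{\partial H}{\partial p_{x'}} = \frac{1}{m}\,p_{x'} + \omega y', \\
\dot y' &= \frac{1}{m}\,p_{y'} - \omega x', \\
\dot z' &= \frac{1}{m}\,p_{z'},
\end{aligned} \tag{18.14}$$
$$\begin{aligned}
\dot p_{x'} &= -\frac{\partial H}{\partial x'} = \omega p_{y'} - \frac{\partial V}{\partial x'}, \\
\dot p_{y'} &= -\omega p_{x'} - \frac{\partial V}{\partial y'}, \\
\dot p_{z'} &= -\frac{\partial V}{\partial z'}.
\end{aligned} \tag{18.15}$$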
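(b) For the transition to cylindrical coordinates, we differentiate the transformation equations
$$x' = \varrho' \cos\varphi', \qquad y' = \varrho' \sin\varphi' \tag{18.16}$$
with respect to the time:
$$\begin{aligned}
\dot x' &= \dot\varrho' \cos\varphi' - \varrho' \dot\varphi' \sin\varphi', \\
\dot y' &= \dot\varrho' \sin\varphi' + \varrho' \dot\varphi' \cos\varphi'.
\end{aligned} \tag{18.17}$$
From (18.9) and (18.17), we calculate the generalized momenta:
$$p_{\varrho'} = \frac{\partial L}{\partial \dot\varrho'} = \frac{\partial L}{\partial \dot x'}\frac{\partial \dot x'}{\partial \dot\varrho'} + \frac{\partial L}{\partial \dot y'}\frac{\partial \dot y'}{\partial \dot\varrho'} = p_{x'} \cos\varphi' + p_{y'} \sin\varphi', \tag{18.18}$$
$$p_{\varphi'} = \frac{\partial L}{\partial \dot\varphi'} = \frac{\partial L}{\partial \dot x'}\frac{\partial \dot x'}{\partial \dot\varphi'} + \frac{\partial L}{\partial \dot y'}\frac{\partial \dot y'}{\partial \dot\varphi'} = -p_{x'}\,\varrho' \sin\varphi' + p_{y'}\,\varrho' \cos\varphi'. \tag{18.19}$$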
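Now we solve for $p_{x'}$ and $p_{y'}$. From (18.18), it follows that
$$p_{x'} = \frac{p_{\varrho'} - p_{y'} \sin\varphi'}{\cos\varphi'}, \tag{18.20}$$
and from (18.19) (with (18.20)),
$$p_{y'} = \frac{p_{\varphi'} \cos\varphi' + (p_{\varrho'} - p_{y'} \sin\varphi')\,\varrho' \sin\varphi'}{\varrho' \cos^2\varphi'}$$
$$\Rightarrow\quad \frac{p_{y'}\,\varrho'(\cos^2\varphi' + \sin^2\varphi')}{\varrho' \cos^2\varphi'} = \frac{p_{\varphi'} \cos\varphi' + \varrho' p_{\varrho'} \sin\varphi'}{\varrho' \cos^2\varphi'}$$
$$\Rightarrow\quad p_{y'} = \frac{1}{\varrho'}\,p_{\varphi'} \cos\varphi' + p_{\varrho'} \sin\varphi'. \tag{18.21}$$
Analogously, we obtain
$$p_{x'} = p_{\varrho'} \cos\varphi' - \frac{1}{\varrho'}\,p_{\varphi'} \sin\varphi'. \tag{18.22}$$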
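Now we insert (18.21) and (18.22) into (18.13) and obtain
$$\begin{aligned}
H &= \frac{1}{2m}\Big[p_{\varrho'}^2 \cos^2\varphi' - \frac{2}{\varrho'}\,p_{\varrho'} \cos\varphi'\,p_{\varphi'} \sin\varphi' + \frac{1}{\varrho'^2}\sin^2\varphi'\,p_{\varphi'}^2 \\
&\qquad\quad + p_{\varrho'}^2 \sin^2\varphi' + \frac{2}{\varrho'}\,p_{\varrho'} \sin\varphi'\,p_{\varphi'} \cos\varphi' + \frac{1}{\varrho'^2}\,p_{\varphi'}^2 \cos^2\varphi' + p_{z'}^2\Big] \\
&\quad - \omega(x' p_{y'} - y' p_{x'}) + V(\varrho', z') \\
&= \frac{1}{2m}\Big[p_{\varrho'}^2 + \frac{1}{\varrho'^2}\,p_{\varphi'}^2 + p_{z'}^2\Big] - \omega\Big[\varrho' \cos\varphi'\,p_{\varrho'} \sin\varphi' + \cos\varphi'\,p_{\varphi'} \cos\varphi' \\
&\qquad\quad - \varrho' \sin\varphi'\,p_{\varrho'} \cos\varphi' + \sin\varphi'\,p_{\varphi'} \sin\varphi'\Big] + V(\varrho', z') \\
&= \frac{1}{2m}\Big[p_{\varrho'}^2 + \frac{1}{\varrho'^2}\,p_{\varphi'}^2 + p_{z'}^2\Big] - \omega p_{\varphi'} + V(\varrho', z').
\end{aligned} \tag{18.23}$$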
A comparison of (18.13) and (18.23) shows that the Hamiltonian becomes especially
simple if it is represented in coordinates adapted to the symmetry of the problem.
From (18.23) we see that H does not depend on the angle ϕ (ϕ is a cyclic coordinate),
hence the angular momentum component pϕ is a constant of the motion.
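The canonical equations of motion read
$$\begin{aligned}
\dot\varrho' &= \frac{1}{m}\,p_{\varrho'}, & \dot\varphi' &= \frac{1}{m\varrho'^2}\,p_{\varphi'} - \omega, & \dot z' &= \frac{1}{m}\,p_{z'}, \\
\dot p_{\varrho'} &= \frac{1}{m\varrho'^3}\,p_{\varphi'}^2 - \frac{\partial V}{\partial \varrho'}, & \dot p_{\varphi'} &= 0, & \dot p_{z'} &= -\frac{\partial V}{\partial z'}.
\end{aligned} \tag{18.24}$$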
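The agreement between the two forms (18.13) and (18.23) of the Hamiltonian can be spot-checked numerically. The sketch below is an illustration only: the potential `potential`, the values m = 1 and ω = 0.3, and the sample phase-space point are hypothetical choices, while the momentum transformation is taken from (18.21) and (18.22).

```python
import math


def potential(rho, z):
    # Hypothetical cylindrically symmetric potential, chosen for illustration.
    return 0.5 * rho**2 + z**2


def h_cartesian(x, y, z, px, py, pz, m=1.0, omega=0.3):
    """Hamiltonian (18.13) in rotating Cartesian coordinates."""
    kinetic = (px**2 + py**2 + pz**2) / (2.0 * m)
    return kinetic - omega * (x * py - y * px) + potential(math.hypot(x, y), z)


def h_cylindrical(rho, phi, z, p_rho, p_phi, pz, m=1.0, omega=0.3):
    """Hamiltonian (18.23) in rotating cylindrical coordinates."""
    kinetic = (p_rho**2 + p_phi**2 / rho**2 + pz**2) / (2.0 * m)
    return kinetic - omega * p_phi + potential(rho, z)


def to_cartesian(rho, phi, z, p_rho, p_phi, pz):
    """Coordinates from (18.16), momenta from (18.21) and (18.22)."""
    c, s = math.cos(phi), math.sin(phi)
    px = p_rho * c - p_phi * s / rho
    py = p_phi * c / rho + p_rho * s
    return rho * c, rho * s, z, px, py, pz


# Arbitrary sample point (rho, phi, z, p_rho, p_phi, p_z).
point = (1.7, 0.9, -0.4, 0.6, 1.1, -0.2)
h_cyl = h_cylindrical(*point)
h_cart = h_cartesian(*to_cartesian(*point))
```

Both values coincide up to rounding, and `h_cylindrical` does not use `phi` at all, which reflects that ϕ' is cyclic.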
18.1 The Hamilton Principle
The laws of mechanics can be expressed in two ways by variational principles that
are independent of the coordinate system. The first of these are the differential prin-
ciples. In this approach, one compares an arbitrarily selected momentary state of the
system with (virtual) infinitesimal neighbor states. One example of this method is the
d’Alembert principle. Another possibility is to vary a finite path element of the sys-
tem. Such principles are called integral principles. The “path” is not understood as the
trajectory of a point of the system in the three-dimensional position space, but rather
as the path in a multidimensional space where the motion of the entire system is com-
pletely fixed. For a system with f degrees of freedom, this space is f -dimensional.
In all integral principles the quantity to be varied has the dimension of an action
(= energy · time); therefore, they are also called principles of minimum action. As
an example we will consider the Hamilton principle. The Hamilton principle requires
that a system moves in such a way that the time integral over the Lagrangian takes an
extreme value:
$$I = \int_{t_1}^{t_2} L\,dt$$
shall have an extremum, which can also be expressed as follows:
$$\delta \int_{t_1}^{t_2} L\,dt = 0. \tag{18.25}$$
The path equation of the system can be determined by applying this principle.
Before considering (18.25) in more detail, we will briefly deal in general with the
variational problem.
EXAMPLE
18.4 A Variational Problem
As an example for substituting a description in terms of coordinates by a description

independent of coordinates, one can consider the definition of a straight line in the
plane. The straight line is uniquely determined by fixing two of its points, and it can
also be described by a linear equation between the coordinates x and y. It can further
be described by the differential equation
$$\frac{d^2 y}{dx^2} = 0 \tag{18.26}$$
with the further prescription that the values of the desired function y(x) for x = x1
and x = x2 are given numbers. These are descriptions using rectangular coordinates.
The straight line can however also be described as the shortest connection between
two points, i.e., by

$$\int ds = \text{minimum}. \tag{18.27}$$
One may imagine the two given points as being connected by all possible curves, and
among these curves that curve be selected which yields the minimum value for the
given integral. This description of the straight line is independent of the choice of
particular coordinates.
As a preparation for the following, we show how the search for the shortest connec-
tion between two points of the plane can be reduced mathematically to (18.26). After
introducing rectangular coordinates x and y, the problem is to look for a function y(x)
for which y(x1 ) and y(x2 ) have given values and the integral
$$I = \int_{x_1}^{x_2} \sqrt{1 + y'(x)^2}\,dx \tag{18.28}$$
takes a minimum value. Similar problems do not need to have a solution. So one
could put the problem (18.27) or (18.28) and prescribe not only the start point and the
endpoint, but also the direction of the curve at the start and endpoint, respectively. One
easily recognizes that under these conditions there is no shortest connection, unless
both of the given directions incidentally coincide with the straight connection.
The problem (18.28) has some similarity with the search for the minimum of a given function f(x). There one considers a small change of x and forms
$$df(x) = f'(x)\,dx.$$
If $f'(x) \neq 0$, f(x) can increase or decrease for small changes of x, and thus, there is no minimum at the point x. A necessary condition for a minimum is therefore $f'(x) = 0$. This condition is not sufficient; it is also fulfilled for a maximum.
In the problem (18.28), we do not have to change a variable but a function y(x). We replace y(x) by a "neighboring" function $y_0(x) + \varepsilon\eta(x)$ of the desired function $y_0$, where we will afterward assume the number ε is arbitrarily small. We must have $\eta(x_1) = \eta(x_2) = 0$. $y'$ is then replaced by $y_0' + \varepsilon\eta'$, and instead of the integrand $\sqrt{1 + y'^2}$ we obtain the Taylor series expansion into powers of ε:
$$\sqrt{1 + (y_0' + \varepsilon\eta')^2} = \sqrt{1 + y_0'^2} + \varepsilon\,\frac{y_0'}{\sqrt{1 + y_0'^2}}\,\eta' + \varepsilon^2(\ldots),$$
where the term indicated by $\varepsilon^2(\ldots)$ can be neglected for sufficiently small |ε|. Therefore, we have
$$I(\varepsilon) = \int_{x_1}^{x_2} \sqrt{1 + (y_0' + \varepsilon\eta')^2}\,dx \approx \int_{x_1}^{x_2} \sqrt{1 + y_0'^2}\,dx + \varepsilon \int_{x_1}^{x_2} \frac{y_0'}{\sqrt{1 + y_0'^2}}\,\eta'\,dx,$$
which shall take a minimum for ε = 0. If the integral in the second term does not vanish, the integral
$$\int_{x_1}^{x_2} \sqrt{1 + y_0'^2}\,dx$$
can increase or decrease by changing the function $y_0(x)$, depending on the sign of ε. Hence, $y_0(x)$ does not provide a minimum of this integral. A necessary condition for a minimum is therefore
$$\int_{x_1}^{x_2} \frac{y_0'}{\sqrt{1 + y_0'^2}}\,\eta'\,dx = 0 \tag{18.29}$$
for any function η(x) that vanishes at $x_1$ and $x_2$. To be able to exploit the far-reaching arbitrariness of the function η(x), we transform (18.29) by integration by parts:
$$\left[\eta\,\frac{y_0'}{\sqrt{1 + y_0'^2}}\right]_{x_1}^{x_2} - \int_{x_1}^{x_2} \eta\,\frac{d}{dx}\frac{y_0'}{\sqrt{1 + y_0'^2}}\,dx = 0.$$
Because $\eta(x_1) = \eta(x_2) = 0$, the first term drops. The second term
$$\int_{x_1}^{x_2} \eta \cdot \frac{d}{dx}\frac{y_0'}{\sqrt{1 + y_0'^2}}\,dx \tag{18.30}$$
becomes zero for all allowed functions η(x) if and only if everywhere between $x_1$ and $x_2$ we have
$$\frac{d}{dx}\frac{y_0'}{\sqrt{1 + y_0'^2}} = 0. \tag{18.31}$$
If this equation were not satisfied everywhere, we could choose η(x) so that it is always positive where
$$\frac{d}{dx}\frac{y_0'}{\sqrt{1 + y_0'^2}}$$
is positive, and choose it as negative where this expression is negative, and in this way establish a contradiction. We can also conclude this way: If (18.31) were not fulfilled at some place, one could set η(x) equal to zero everywhere, except for a certain interval about this place. But then the integral (18.30) does not vanish. We could not choose the quantity $\eta'$ in (18.29) in this way; thus we could not draw the corresponding conclusion directly from (18.29). From (18.31) now follows $y_0' = \text{constant}$ or $y_0'' = 0$; that means the former description (18.26). Thus, our calculation has replaced the requirement that a definite integral be minimized by a function, by a differential equation for this function.
Equation (18.31) allows yet another interpretation. We have
$$\frac{d}{dx}\frac{y'}{\sqrt{1 + y'^2}} = \frac{y''}{\big(\sqrt{1 + y'^2}\big)^3}.$$
As is shown in the theory of curves, this is an expression for the curvature of a curve.
Equation (18.31) thus states that the desired curve everywhere has the curvature 0.
We just have treated a simple problem of the “variational calculus.” Problems of
the type (18.27) or (18.28) are called variational problems. In Exercises 18.5 and 18.6
we shall meet further, less trivial variational problems.
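The variational argument of this example can be made concrete with a small numerical experiment. The sketch below is an illustration under assumed data: the endpoints (0, 0) and (1, 2), the comparison function η(x) = sin(πx), and the approximation of the integral (18.28) by a sum of chord lengths are all choices made here, not taken from the text.

```python
import math


def path_length(y, x1=0.0, x2=1.0, n=2000):
    """Approximate I = integral of sqrt(1 + y'^2) dx by summing chord lengths."""
    h = (x2 - x1) / n
    total = 0.0
    for j in range(n):
        xa = x1 + j * h
        xb = x1 + (j + 1) * h
        total += math.hypot(xb - xa, y(xb) - y(xa))
    return total


def straight(x):
    """Candidate extremal y0(x): straight line from (0, 0) to (1, 2)."""
    return 2.0 * x


def eta(x):
    """Comparison function with eta(0) = eta(1) = 0."""
    return math.sin(math.pi * x)


def varied(eps):
    """Neighboring function y0(x) + eps * eta(x)."""
    return lambda x: straight(x) + eps * eta(x)


lengths = {eps: path_length(varied(eps)) for eps in (-0.1, -0.01, 0.0, 0.01, 0.1)}
```

The entry for ε = 0 equals √5, and every varied path is longer, with an excess that grows roughly like ε², as expected when the first-order term (18.29) vanishes.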
18.2 General Discussion of Variational Principles

Given the integrable function $F = F(y(x), y'(x))$, we look for a function y = y(x), so that the integral
$$I = \int_{x_1}^{x_2} F\big(y(x), y'(x)\big)\,dx$$
takes an extremum value.

This problem is transformed into an elementary extremum value problem by cov-
ering the ensemble of all physically meaningful paths by a parametric representation:
y(x, ε) = y0 (x) + εη(x),

where ε means a parameter that is constant for every path, η(x) is an arbitrary differ-
entiable function that vanishes at the endpoints:
η(x1 ) = η(x2 ) = 0.
The desired curve is given by y0 (x) = y(x, 0).

Fig. 18.2. Possible paths from (x1, y1) to (x2, y2)
The condition for an extremum value of the integral I is then
$$\left.\frac{dI}{d\varepsilon}\right|_{\varepsilon=0} = 0.$$
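The differentiation under the integral symbol (allowed if F is continuously differentiable with respect to ε) yields
$$\frac{dI}{d\varepsilon} = \int_{x_1}^{x_2} \left(\frac{\partial F}{\partial y}\frac{\partial y}{\partial \varepsilon} + \frac{\partial F}{\partial y'}\frac{\partial y'}{\partial \varepsilon}\right) dx = \int_{x_1}^{x_2} \left(\frac{\partial F}{\partial y}\,\eta + \frac{\partial F}{\partial y'}\,\eta'\right) dx.$$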
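The second integrand can be integrated by parts:
$$\int_{x_1}^{x_2} \frac{\partial F}{\partial y'}\frac{\partial \eta}{\partial x}\,dx = \left[\frac{\partial F}{\partial y'}\,\eta\right]_{x_1}^{x_2} - \int_{x_1}^{x_2} \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right)\eta\,dx.$$
Since the endpoints shall be fixed, the term integrated out vanishes, and the extremum condition reads
$$\int_{x_1}^{x_2} \left(\frac{\partial F}{\partial y} - \frac{d}{dx}\frac{\partial F}{\partial y'}\right)\eta\,dx = 0.$$
Since η(x) can be an arbitrary function, this equation is generally satisfied only if
$$\frac{d}{dx}\frac{\partial F(y(x), y'(x))}{\partial y'} - \frac{\partial F(y(x), y'(x))}{\partial y} = 0. \tag{18.32}$$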
This relation (18.32) is called the Euler–Lagrange equation. It is a necessary condition
for an extremum value of the integral I . The solution of the Euler–Lagrange equation,
a differential equation of second order, together with the boundary conditions yields
the wanted path. To simplify notation, we define the variation of a function y(x, ε) as the difference between y(x, ε) and y(x, 0),
$$\delta y = y(x, \varepsilon) - y(x, 0) = \left.\frac{\partial y}{\partial \varepsilon}\right|_{\varepsilon=0} \cdot \varepsilon$$
for very small ε. Then the variational problem can be formulated as
$$\delta \int_{x_1}^{x_2} F\big(y(x), y'(x)\big)\,dx = 0.$$
F can also include constraints by means of Lagrange multipliers (compare Chap. 16).
EXERCISE
18.5 Catenary
Problem. This is an example with a constraint. A chain of constant density σ (mass per unit length: σ = dm/ds) and length l hangs in the gravitational field between two points P1(x1, y1) and P2(x2, y2). We look for the form of the curve, assuming that the potential energy of the chain takes a minimum.
Fig. 18.3. A chain hangs in the gravitational field
Solution. The potential energy of a chain element is
$$dV = g\sigma y\,ds.$$
The total potential energy is then
$$V = g\sigma \int_{x_1}^{x_2} y\,ds,$$
where the line element is given by
$$ds = \sqrt{1 + y'^2}\,dx, \qquad y' = \frac{dy}{dx}.$$
The constraint of given length l is represented by
$$0 = \int_{x_1}^{x_2} ds - l = \int_{x_1}^{x_2} \sqrt{1 + y'^2}\,dx - l.$$
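With the Lagrange multiplier λ, the variational problem reads
$$g\sigma\,\delta \int_{x_1}^{x_2} y\sqrt{1 + y'^2}\,dx - \lambda\,\delta\left(\int_{x_1}^{x_2} \sqrt{1 + y'^2}\,dx - l\right) = 0.$$
Since δl = 0, we can introduce the function
$$F(y, y') = (y - \mu)\sqrt{1 + y'^2}$$
in the Euler equation (18.32), where we chose μ = λ/gσ. From
$$\frac{\partial F}{\partial y} - \frac{d}{dx}\frac{\partial F}{\partial y'} = 0,$$
it follows that
$$(y - \mu)\,y'' - y'^2 - 1 = 0.$$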
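We rewrite the last equation. With
$$y'' = \frac{dy'}{dx} = \frac{dy'}{dy}\frac{dy}{dx} = y'\,\frac{dy'}{dy},$$
we obtain
$$(y - \mu)\,y'\,\frac{dy'}{dy} = y'^2 + 1, \qquad \frac{dy}{y - \mu} = \frac{y'\,dy'}{1 + y'^2}.$$
Integration yields
$$\ln(y - \mu) + \ln C_1 = \frac12 \ln(1 + y'^2)$$
or
$$C_1 (y - \mu) = \sqrt{1 + y'^2}.$$
From this, we get
$$\frac{dy}{\sqrt{C_1^2 (y - \mu)^2 - 1}} = dx.$$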
To integrate the left side, we substitute $\cosh\nu = C_1(y - \mu)$, since $\cosh^2\nu - 1 = \sinh^2\nu$. Then
$$dy = \frac{1}{C_1}\sinh\nu\,d\nu,$$
and therefore,
$$\frac{1}{C_1}\,d\nu = dx.$$
Integration yields
$$\nu = C_1 (x + C_2)$$
or
$$y = \frac{1}{C_1}\cosh\big(C_1 (x + C_2)\big) + \mu.$$
Thus, the solution is the catenary. The constants give the coordinates of the lowest point $(x_0, y_0) = (-C_2, (1/C_1) + \mu)$. They are determined by the given length l of the chain and by the suspension points P1 and P2.
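One can check numerically that the catenary satisfies the differential equation (y − μ)y'' − y'² − 1 = 0 derived above. The following sketch is illustrative only: the constants C1, C2, μ are arbitrary sample values, and the derivatives are approximated by central differences with a step `h` chosen here.

```python
import math


def catenary(x, c1, c2, mu):
    """y(x) = cosh(C1 (x + C2)) / C1 + mu."""
    return math.cosh(c1 * (x + c2)) / c1 + mu


def residual(x, c1, c2, mu, h=1e-4):
    """Left-hand side (y - mu) y'' - y'^2 - 1, using finite differences."""
    y = catenary(x, c1, c2, mu)
    y_plus = catenary(x + h, c1, c2, mu)
    y_minus = catenary(x - h, c1, c2, mu)
    yp = (y_plus - y_minus) / (2.0 * h)
    ypp = (y_plus - 2.0 * y + y_minus) / h**2
    return (y - mu) * ypp - yp**2 - 1.0


# Arbitrary sample constants; in the exercise they follow from l, P1 and P2.
C1, C2, MU = 1.3, -0.4, 0.2
residuals = [residual(x, C1, C2, MU) for x in (-1.0, -0.5, 0.0, 0.4, 1.0)]
lowest = catenary(-C2, C1, C2, MU)   # should equal 1/C1 + mu
```

The residuals vanish up to the finite-difference error, and `lowest` reproduces the ordinate of the lowest point quoted in the solution.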
EXERCISE
18.6 Brachistochrone: Construction of an Emergency Chute
Problem. On board an aircraft, a fire breaks out after landing. The passengers must
leave by an emergency chute on which they glide down without friction. Determine
by variational calculus the form of the chute with the aim to evacuate the plane as fast
as possible (height of the hatch y0 ; distance to the bottom x0 ). Find the time of gliding
as compared to the harsh free fall, assuming x0 = (π/2)y0 .
Hint: Use the substitution
$$y' = \frac{dy}{dx} = -\cot\frac{\Theta}{2}.$$
Remark: This problem is known as the "brachistochrone."
Fig. 18.4. A passenger glides down a chute
Fig. 18.5. Illustration of various chutes
Solution. The problem goes back to the Bernoulli brothers (brachistochrone, 1696).
Energy conservation yields
$$mgy_0 = \frac12 m v^2 + mgy,$$
$$g(y_0 - y) = \frac12\left[\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2\right],$$
$$(dt)^2 = \frac{(dx)^2 + (dy)^2}{2g(y_0 - y)}.$$
The total time T is then
$$T = \int_0^T dt = \int_0^{x_0} \sqrt{\frac{1 + (dy/dx)^2}{2g(y_0 - y)}}\,dx. \tag{18.33}$$
To get the minimum time, one has to solve a variational problem of the form
$$\delta \int_{x_1}^{x_2} F(x, y, y')\,dx = 0, \qquad y(x_1) = y_0, \quad y(x_2) = 0.$$
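Because
$$0 = \int_{x_1}^{x_2} \left(\frac{\partial F}{\partial y}\,\delta y + \frac{\partial F}{\partial y'}\,\delta y'\right) dx = \int_{x_1}^{x_2} \left(\frac{\partial F}{\partial y} - \frac{d}{dx}\frac{\partial F}{\partial y'}\right)\delta y\,dx,$$
the Euler–Lagrange equation reads
$$\frac{d}{dx}\frac{\partial F}{\partial y'} - \frac{\partial F}{\partial y} = 0 \tag{18.34}$$
or
$$y''\,\frac{\partial^2 F}{\partial y'^2} + y'\,\frac{\partial^2 F}{\partial y\,\partial y'} + \frac{\partial^2 F}{\partial x\,\partial y'} - \frac{\partial F}{\partial y} = 0. \tag{18.35}$$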
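If the function F does not depend explicitly on x, (18.35) can be directly integrated. One finds
$$\begin{aligned}
\frac{d}{dx}\left(y'\frac{\partial F}{\partial y'} - F\right) &= y'\frac{d}{dx}\frac{\partial F}{\partial y'} + y''\frac{\partial F}{\partial y'} - \frac{dF}{dx} \\
&= y'\frac{d}{dx}\frac{\partial F}{\partial y'} + y''\frac{\partial F}{\partial y'} - y'\frac{\partial F}{\partial y} - y''\frac{\partial F}{\partial y'} - \underbrace{\frac{\partial F}{\partial x}}_{=0} \\
&= y'\left(\frac{d}{dx}\frac{\partial F}{\partial y'} - \frac{\partial F}{\partial y}\right) = 0;
\end{aligned}$$
hence,
$$y'\frac{\partial F}{\partial y'} - F = \text{constant} \equiv \frac{1}{c}. \tag{18.36}$$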
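In our case, the integrand of (18.33) is
$$F = \sqrt{\frac{1 + y'^2}{2g(y_0 - y)}}.$$
Then (18.36) reads
$$y'\,\frac{1}{\sqrt{2g(y_0 - y)}}\cdot\frac{y'}{\sqrt{1 + y'^2}} - \frac{\sqrt{1 + y'^2}}{\sqrt{2g(y_0 - y)}} = \frac{1}{c},$$
$$\frac{1}{2g(y_0 - y)(1 + y'^2)} = \frac{1}{c^2}. \tag{18.37}$$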
The transformation $y' = -\cot(\Theta/2)$ yields
$$\frac{c^2}{2g(y_0 - y)} = 1 + y'^2 = 1 + \cot^2\frac{\Theta}{2} = \frac{1}{\sin^2(\Theta/2)},$$
and thus,
$$y = y_0 - \frac{c^2}{4g}(1 - \cos\Theta).$$
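By integration, one finds an equation for x(Θ), namely,
$$-\cot\frac{\Theta}{2} = \frac{dy}{dx} = -\frac{c^2}{2g}\sin\frac{\Theta}{2}\cos\frac{\Theta}{2}\,\frac{d\Theta}{dx}$$
$$\Rightarrow\quad x = \int_0^x dx = \frac{c^2}{2g}\int_0^{\Theta} \sin^2\frac{\Theta}{2}\,d\Theta = \frac{c^2}{2g}\left[\frac{\Theta}{2} - \frac12\sin\Theta\right]_0^{\Theta},$$
$$x = \frac{c^2}{4g}(\Theta - \sin\Theta), \qquad y = y_0 - \frac{c^2}{4g}(1 - \cos\Theta). \tag{18.38}$$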
This is just the parametric representation of a cycloid.

The maximum value $\Theta_0$ of Θ is determined by $x_0$ and $y_0$, namely,
$$\frac{x_0}{y_0} = \frac{\Theta_0 - \sin\Theta_0}{1 - \cos\Theta_0}. \tag{18.39}$$
The transcendental equation (18.39) can be solved in general only numerically. Special cases:
Θ0    = 0    π      2π
x0/y0 = 0    π/2    ∞
Fig. 18.6. Possible types of solution
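Calculation of the gliding time according to (18.33) and (18.37):
$$\begin{aligned}
T &= \int_0^{x_0} \sqrt{\frac{1 + y'^2}{2g(y_0 - y)}}\,dx = \int_{y_0}^{0} \sqrt{\frac{(dx/dy)^2 + 1}{2g(y_0 - y)}}\,dy \\
&= \int_0^{y_0} \sqrt{\frac{c^2}{2g(y_0 - y)\big(c^2 - 2g(y_0 - y)\big)}}\,dy \\
&= \frac{c}{2g}\left[2\arctan\sqrt{\frac{c^2 - 2g(y_0 - y)}{2g(y_0 - y)}}\right]_0^{y_0} \\
&= \frac{c}{g}\left(\frac{\pi}{2} - \arctan\sqrt{\frac{c^2 - 2gy_0}{2gy_0}}\right),
\end{aligned}$$
$$T = \frac{c}{g}\,\operatorname{arccot}\sqrt{\frac{c^2 - 2gy_0}{2gy_0}}.$$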
The integral can be found in tables.

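$$x_0 = \frac{\pi}{2}\,y_0 \quad\Rightarrow\quad \Theta_0 = \pi \quad\Rightarrow\quad c = \sqrt{2gy_0} \quad\Rightarrow\quad T = \sqrt{\frac{2y_0}{g}}\,\frac{\pi}{2}.$$
For comparison, the time of free fall is
$$T = \sqrt{\frac{2y_0}{g}}.$$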
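Since (18.39) must in general be solved numerically, a short sketch may be helpful. It is an illustration under stated assumptions: the root is found by plain bisection on an interval slightly inside (0, 2π), the constant c follows from y(Θ0) = 0 in (18.38), and the closed arccot formula for T is used only for Θ0 ≤ π, i.e., for x0/y0 ≤ π/2. The function names, the value of g, and the sample height are choices made here.

```python
import math


def ratio(theta):
    """Right-hand side of (18.39)."""
    return (theta - math.sin(theta)) / (1.0 - math.cos(theta))


def solve_theta0(x0_over_y0, lo=1e-3, hi=2.0 * math.pi - 1e-3, iterations=100):
    """Bisection for Theta_0; assumes ratio(lo) < x0/y0 < ratio(hi)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if ratio(mid) < x0_over_y0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def gliding_time(x0, y0, g=9.81):
    """T = (c/g) arccot(sqrt((c^2 - 2 g y0)/(2 g y0))), assuming Theta_0 <= pi."""
    theta0 = solve_theta0(x0 / y0)
    c = math.sqrt(4.0 * g * y0 / (1.0 - math.cos(theta0)))  # from y(Theta_0) = 0
    z_squared = max(0.0, c * c - 2.0 * g * y0) / (2.0 * g * y0)
    return (c / g) * math.atan2(1.0, math.sqrt(z_squared))  # arccot(z), z >= 0


G = 9.81
Y0 = 5.0                                   # illustrative hatch height in meters
theta0 = solve_theta0(math.pi / 2.0)       # special case x0 = (pi/2) y0
t_chute = gliding_time(math.pi / 2.0 * Y0, Y0, G)
t_fall = math.sqrt(2.0 * Y0 / G)
```

For the special case of the exercise, `theta0` comes out as π and the ratio `t_chute / t_fall` as π/2, in agreement with the closed-form result.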
As is seen already from (18.25), according to the Hamilton principle the time is not
being varied. The system passes a trace point and the appropriate varied trace point at
the same time. Hence,
δt = 0.
Starting from the integral
$$\delta I = \delta \int_{t_1}^{t_2} L\big(q_\alpha(t), \dot q_\alpha(t), t\big)\,dt = 0, \qquad \alpha = 1, 2, \ldots, f, \tag{18.40}$$
where f is the number of degrees of freedom, we perform the variation according to

the procedure described above and show that the Lagrange equations can be derived
from the Hamilton principle. We describe the variation of a path curve qα (t) by
qα (t) → qα (t) + δqα (t),

where the δqα vanish at the endpoints,
δqα (t1 ) = δqα (t2 ) = 0.
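Since time is not being varied, we have
$$\delta \int_{t_1}^{t_2} L\,dt = \int_{t_1}^{t_2} \delta L\,dt = \int_{t_1}^{t_2} \left(\sum_\alpha \frac{\partial L}{\partial q_\alpha}\,\delta q_\alpha + \sum_\alpha \frac{\partial L}{\partial \dot q_\alpha}\,\delta \dot q_\alpha\right) dt. \tag{18.41}$$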
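Because
$$\begin{aligned}
\frac{d}{dt}\,\delta q_\alpha &= \frac{d}{dt}\big(q_\alpha(t, \varepsilon) - q_\alpha(t, 0)\big) \\
&= \frac{d}{dt}\,q_\alpha(t, \varepsilon) - \frac{d}{dt}\,q_\alpha(t, 0) \\
&= \delta\,\frac{d}{dt}\,q_\alpha(t) = \delta \dot q_\alpha(t),
\end{aligned} \tag{18.42}$$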
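integration by parts of the second summand yields
$$\begin{aligned}
\int_{t_1}^{t_2} \frac{\partial L}{\partial \dot q_\alpha}\,\delta \dot q_\alpha\,dt &= \int_{t_1}^{t_2} \frac{\partial L}{\partial \dot q_\alpha}\,\frac{d}{dt}\,\delta q_\alpha\,dt \\
&= \left[\frac{\partial L}{\partial \dot q_\alpha}\,\delta q_\alpha\right]_{t_1}^{t_2} - \int_{t_1}^{t_2} \frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_\alpha}\right)\delta q_\alpha\,dt.
\end{aligned} \tag{18.43}$$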
t1
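Since $\delta q_\alpha$ vanishes at the endpoints (integration limits), we get for the variation of the integral
$$\delta I = \int_{t_1}^{t_2} \sum_\alpha \left(\frac{\partial L}{\partial q_\alpha} - \frac{d}{dt}\frac{\partial L}{\partial \dot q_\alpha}\right)\delta q_\alpha\,dt = 0. \tag{18.44}$$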
For holonomic constraints, we imagine that the dependent degrees of freedom were
eliminated. We take the qα as the independent coordinates. Hence, the δqα are in-
dependent of each other, and the integral vanishes only if the coefficient of any δqα
vanishes. This means that the Lagrange equations hold:
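$$\frac{d}{dt}\frac{\partial L}{\partial \dot q_\alpha} - \frac{\partial L}{\partial q_\alpha} = 0. \tag{18.45}$$
Likewise, one can obtain the Hamilton equations by replacing L by $\sum_\alpha p_\alpha \dot q_\alpha - H$ and considering the variations $\delta p_\alpha$ and $\delta q_\alpha$ as independent. This will be worked out in Exercise 18.7.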
In order to show the equivalence of the Hamilton principle with the formulations of
mechanics studied so far, we shall demonstrate its derivation from Newton’s equations.
We consider a particle in Cartesian coordinates. It moves along a certain path r = r(t)
between the positions r(t1 ) and r(t2 ). Now the path is varied by a virtual displacement
δr that is compatible with the constraint:
r(t) → r(t) + δr(t), δr(t1 ) = δr(t2 ) = 0.

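The time is not varied. The work needed for the virtual displacement is
$$\delta A = \mathbf F \cdot \delta\mathbf r = \mathbf F^{e} \cdot \delta\mathbf r,$$
if $\mathbf F^{e}$ is the external force and the constraint reaction does not perform work. If $\mathbf F^{e}$ is conservative, then
$$\mathbf F^{e} \cdot \delta\mathbf r = -\delta V,$$
and according to Newton
$$-\delta V = m\ddot{\mathbf r} \cdot \delta\mathbf r.$$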
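The right-hand side can be transformed (the operator $(d/dt)\,\delta\mathbf r = \delta\dot{\mathbf r}$ is treated according to (18.42)):
$$\frac{d}{dt}(\dot{\mathbf r} \cdot \delta\mathbf r) = \dot{\mathbf r} \cdot \frac{d}{dt}\,\delta\mathbf r + \ddot{\mathbf r} \cdot \delta\mathbf r = \dot{\mathbf r} \cdot \delta\dot{\mathbf r} + \ddot{\mathbf r} \cdot \delta\mathbf r = \delta\left(\frac12 \dot{\mathbf r}^2\right) + \ddot{\mathbf r} \cdot \delta\mathbf r.$$
Multiplication by the mass m yields
$$m\ddot{\mathbf r} \cdot \delta\mathbf r = m\,\frac{d}{dt}(\dot{\mathbf r} \cdot \delta\mathbf r) - \delta\left(\frac12 m \dot{\mathbf r}^2\right),$$
and therefore,
$$\delta(T - V) = \delta L = m\,\frac{d}{dt}(\dot{\mathbf r} \cdot \delta\mathbf r).$$
Integration with respect to time leads to
$$\delta \int_{t_1}^{t_2} L\,dt = m\,\big[\dot{\mathbf r} \cdot \delta\mathbf r\big]_{t_1}^{t_2} = 0.$$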
Thus, the Hamilton principle for a single particle has been derived from Newton’s
equations. The result can be directly extended to particle systems. This can be un-
derstood quite generally in the following way: If a particle system obeys the La-
grange equations (18.45) (which are equivalent to Newtonian mechanics), then we
have (18.44) and from that—because of (18.43)—again (18.41) or (18.40), provided
that δqα (t1 ) = δqα (t2 ) = 0. Thus, the Lagrange equations are equivalent to the Hamil-
ton principle.
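The statement of the Hamilton principle can be illustrated with a discretized action. The sketch below rests on assumptions made here, not in the text: a single particle in a homogeneous gravitational field with L = ½mẏ² − mgy, fixed endpoints y(0) = y(t_end) = 0, a variation εη(t) with η = sin(πt/t_end), and a simple trapezoid-type discretization of the action integral.

```python
import math


def action(path, dt, m=1.0, g=9.81):
    """Discretized action: sum over intervals of (T - V) * dt."""
    total = 0.0
    for ya, yb in zip(path, path[1:]):
        kinetic = 0.5 * m * ((yb - ya) / dt) ** 2
        potential = m * g * 0.5 * (ya + yb)
        total += (kinetic - potential) * dt
    return total


def varied_path(eps, t_end=1.0, n=400, g=9.81):
    """True path y = (g/2) t (t_end - t) plus eps * sin(pi t / t_end)."""
    dt = t_end / n
    path = []
    for j in range(n + 1):
        t = j * dt
        true_y = 0.5 * g * t * (t_end - t)       # solves y'' = -g with y(0) = y(t_end) = 0
        eta = math.sin(math.pi * t / t_end)      # vanishes at both endpoints
        path.append(true_y + eps * eta)
    return path, dt


def action_of(eps):
    path, dt = varied_path(eps)
    return action(path, dt)


s0 = action_of(0.0)
slope = (action_of(1e-3) - action_of(-1e-3)) / 2e-3   # numerical dS/d(eps) at eps = 0
```

The first-order change `slope` vanishes within rounding, while `action_of(eps)` grows quadratically in ε; for this particular Lagrangian the extremum is a minimum.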
EXERCISE
18.7 Derivation of the Hamiltonian Equations
Problem. Derive the Hamilton equations from the Hamilton principle.

Solution. The Hamilton principle reads
$$\delta \int_{t_1}^{t_2} L\,dt = 0, \tag{18.46}$$
where the Lagrangian L is now expressed by the Hamiltonian H; hence,
$$L = \sum_\alpha p_\alpha \dot q_\alpha - H(p_\alpha, q_\alpha, t). \tag{18.47}$$
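Then (18.46) becomes
$$\int_{t_1}^{t_2} \delta L\,dt = \int_{t_1}^{t_2} \sum_\alpha \left(\delta p_\alpha\,\dot q_\alpha + p_\alpha\,\delta \dot q_\alpha - \frac{\partial H}{\partial p_\alpha}\,\delta p_\alpha - \frac{\partial H}{\partial q_\alpha}\,\delta q_\alpha\right) dt. \tag{18.48}$$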
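The second term on the right-hand side can be transformed by integration by parts,
$$\int_{t_1}^{t_2} p_\alpha\,\delta \dot q_\alpha\,dt = \int_{t_1}^{t_2} p_\alpha\,\frac{d}{dt}\,\delta q_\alpha\,dt = \big[p_\alpha\,\delta q_\alpha\big]_{t_1}^{t_2} - \int_{t_1}^{t_2} \dot p_\alpha\,\delta q_\alpha\,dt. \tag{18.49}$$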
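The first term vanishes since the variations at the endpoints vanish: $\delta q_\alpha(t_1) = \delta q_\alpha(t_2) = 0$. Hence, (18.48) becomes
$$0 = \int_{t_1}^{t_2} \delta L\,dt = \int_{t_1}^{t_2} \sum_\alpha \left[\left(\dot q_\alpha - \frac{\partial H}{\partial p_\alpha}\right)\delta p_\alpha + \left(-\dot p_\alpha - \frac{\partial H}{\partial q_\alpha}\right)\delta q_\alpha\right] dt. \tag{18.50}$$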
The variations δpα and δqα are independent of each other because along a path in
phase space the neighboring paths can have different coordinates or (and) different
momenta. Thus, (18.50) leads to
$$\dot q_\alpha = \frac{\partial H}{\partial p_\alpha}, \qquad \dot p_\alpha = -\frac{\partial H}{\partial q_\alpha}, \tag{18.51}$$
which was to be demonstrated.
18.3 Phase Space and Liouville’s Theorem
In the Hamiltonian formalism, the state of motion of a mechanical system with f de-
grees of freedom at a definite time t is completely characterized by the specification
of the f generalized coordinates and f momenta q1 , . . . , qf ; p1 , . . . , pf . These qi
and pi can be understood as coordinates of a 2f -dimensional Cartesian space, the
phase space. The f -dimensional subspace of the coordinates qi is the configuration
space; the f -dimensional subspace of the momenta pi is called momentum space.
In the course of motion of the system the representative point describes a curve, the
phase trajectory. If the Hamiltonian is known, then the entire phase trajectory can
be uniquely calculated in advance from the coordinates of one point. Therefore to
each point belongs only one trajectory, and two different trajectories cannot intersect
each other. A path in phase space is given in parametric representation by qk (t), pk (t)
(k = 1, . . . , f ). Because of the uniqueness of the solutions of the Hamilton equations,
the system develops from various boundary conditions along various trajectories. For
conservative systems the point is bound to a (2f − 1)-dimensional hypersurface of the
phase space by the condition H (q, p) = E = constant.
EXAMPLE
18.8 Phase Diagram of a Plane Pendulum
If the angle ϕ is taken as a generalized coordinate, then we have for the plane pendu-
lum (mass m, length l)
$$p_\varphi = m l^2 \dot\varphi.$$
The Hamiltonian, which represents the total energy, reads
$$H = \frac12 m (l\dot\varphi)^2 - mgl\cos\varphi = \frac{p_\varphi^2}{2 m l^2} - mgl\cos\varphi = E.$$
The origin of the potential was put at the suspension point of the pendulum. One then gets the equation for the phase trajectory $p_\varphi = p_\varphi(\varphi)$:
$$p_\varphi = \pm\sqrt{2 m l^2 (E + mgl\cos\varphi)}.$$
Thus, we obtain a set of curves with the energy E as a parameter.

For energies E < mgl, the phase trajectories are closed (ellipse-like) curves; the pendulum oscillates back and forth (vibration). If the total energy E exceeds the value mgl, the pendulum still has kinetic energy at the highest point ϕ = ±π and continues its motion without reversal of direction (rotation).
Fig. 18.7. Phase space and
phase diagram of the one-
dimensional pendulum
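The phase diagram can be tabulated directly from the formula for p_ϕ(ϕ). The following sketch is an illustration with assumed parameter values (m = l = 1, g = 9.81); the function names and the label "separatrix" for the limiting case E = mgl are choices made here.

```python
import math


def p_phi(phi, energy, m=1.0, l=1.0, g=9.81):
    """Upper branch of the phase trajectory; None where E + m g l cos(phi) < 0."""
    arg = 2.0 * m * l**2 * (energy + m * g * l * math.cos(phi))
    return math.sqrt(arg) if arg >= 0.0 else None


def motion_type(energy, m=1.0, l=1.0, g=9.81):
    """Classify the motion by comparing E with -m g l and +m g l."""
    limit = m * g * l
    if energy < -limit:
        return "forbidden"      # below the minimum of the potential
    if energy < limit:
        return "vibration"      # closed curve, turning points exist
    if energy == limit:
        return "separatrix"     # limiting curve between the two regimes
    return "rotation"           # momentum never vanishes


# Sample the upper branch for one oscillating and one rotating energy.
angles = [-math.pi + j * math.pi / 4.0 for j in range(9)]
vibration_curve = [p_phi(a, 0.0) for a in angles]
rotation_curve = [p_phi(a, 20.0) for a in angles]
```

For E = 0 the entries near ϕ = ±π are `None`, since these angles are not reached, whereas for E = 20 the momentum is defined and positive for all angles.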
We now consider a large number N of independent points that are mechanically iden-
tical, apart from the initial conditions, and are therefore described by the same Hamil-
tonian. As a specific example, we can imagine particles in the beam of an accelerator.
If all points at time t1 are distributed over a 2f -dimensional phase space region G1
with the volume
$$\Delta V = \Delta q_1 \cdots \Delta q_f \cdot \Delta p_1 \cdots \Delta p_f,$$
one can define the density
$$\varrho = \frac{\Delta N}{\Delta V}.$$
With the course of motion, G1 transforms according to the Hamilton equations into
the region G2 .
Fig. 18.8. Evolution of a re-

gion in phase space
The statement of the Liouville theorem3 is
The volume of an arbitrary region of phase space is conserved if the points of its
boundary move according to the canonical equations.
Or, in other words, performing a limit transition:
The density of points in phase space in the vicinity of a point moving with the fluid
is constant.
To prove that, we investigate the motion of system points through a volume element
of the phase space. Let us first consider the components of the particle flux along the
qk - and pk -direction.
The area ABCD represents the projection of the 2f-dimensional volume element dV onto the $q_k, p_k$-plane.
Fig. 18.9. Projection of the volume element onto the qk, pk-plane
The number of points entering the volume element per unit time through the "side face" (with the projection AD onto the $q_k, p_k$-plane) is
$$\varrho\,\dot q_k\,dp_k \cdot dV_k,$$
where
$$dV_k = \prod_{\substack{\alpha = 1 \\ \alpha \neq k}}^{f} dq_\alpha\,dp_\alpha$$
is the (2f − 2)-dimensional remainder volume element; $dp_k \cdot dV_k$ is the magnitude of the lateral surface with the projection AD in the $p_k, q_k$-plane.
3 Joseph Liouville, b. March 24, 1809, St. Omer–d. Sept. 8, 1882, Paris. Liouville was professor of
mathematics and mechanics in Paris, at the École Polytechnique, at the Collège de France, and at
the Sorbonne. He was a member of the Bureau of Measures and of many scholarly societies. From
1840 to 1870, he was considered the leading mathematician of France. He worked on statistical
mechanics, boundary value problems, differential geometry, and special functions. His constructive
proof of the existence of transcendental numbers and, in 1844, the proof that e and e² cannot be roots
of a quadratic equation with rational coefficients, were of great significance.
The Taylor expansion for the points leaving at BC in the first direction yields
$$\left(\varrho\,\dot q_k + \frac{\partial}{\partial q_k}(\varrho\,\dot q_k)\,dq_k\right) dp_k \cdot dV_k. \tag{18.52}$$
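Analogously, for the flux in $p_k$-direction we have
$$\begin{aligned}
\text{entrance through AB:}\quad & \varrho\,\dot p_k\,dq_k \cdot dV_k, \\
\text{exit through CD:}\quad & \left(\varrho\,\dot p_k + \frac{\partial}{\partial p_k}(\varrho\,\dot p_k)\,dp_k\right) dq_k \cdot dV_k.
\end{aligned} \tag{18.53}$$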
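From the flux components in $p_k$- and $q_k$-direction, it follows that the number of system points
$$-\left(\frac{\partial}{\partial q_k}(\varrho\,\dot q_k) + \frac{\partial}{\partial p_k}(\varrho\,\dot p_k)\right) dV \tag{18.54}$$
accumulates ("gets stuck") in the volume element per unit time.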

By summing over all k = 1, ..., f, one obtains the number of points that get stuck in total. This quantity just corresponds to the change with time (time derivative) of the density multiplied by dV. Hence, we can conclude
$$\frac{\partial \varrho}{\partial t} = -\sum_{k=1}^{f} \left(\frac{\partial}{\partial q_k}(\varrho\,\dot q_k) + \frac{\partial}{\partial p_k}(\varrho\,\dot p_k)\right). \tag{18.55}$$
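We are dealing here with a continuity equation of the form
$$\operatorname{div}(\varrho\,\dot{\mathbf r}) + \frac{\partial \varrho}{\partial t} = 0.$$
The divergence refers to the 2f-dimensional phase space:
$$\nabla = \sum_{k=1}^{f} \frac{\partial}{\partial q_k} + \sum_{k=1}^{f} \frac{\partial}{\partial p_k}.$$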
Continuity equations of this type often appear in flow physics (hydrodynamics,

electrodynamics, quantum mechanics). They always express a conservation law.
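Application of the product rule in (18.55) yields
$$\sum_{k=1}^{f} \left(\frac{\partial \varrho}{\partial q_k}\,\dot q_k + \varrho\,\frac{\partial \dot q_k}{\partial q_k} + \frac{\partial \varrho}{\partial p_k}\,\dot p_k + \varrho\,\frac{\partial \dot p_k}{\partial p_k}\right) + \frac{\partial \varrho}{\partial t} = 0. \tag{18.56}$$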
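From the Hamilton equations, we have
$$\frac{\partial \dot q_k}{\partial q_k} = \frac{\partial^2 H}{\partial q_k\,\partial p_k} \qquad\text{and}\qquad \frac{\partial \dot p_k}{\partial p_k} = -\frac{\partial^2 H}{\partial p_k\,\partial q_k}.$$
If the second partial derivatives of H are continuous, then
$$\frac{\partial \dot q_k}{\partial q_k} + \frac{\partial \dot p_k}{\partial p_k} = 0,$$
and from this, it follows that
$$\sum_{k=1}^{f} \left(\frac{\partial \varrho}{\partial q_k}\,\dot q_k + \frac{\partial \varrho}{\partial p_k}\,\dot p_k\right) + \frac{\partial \varrho}{\partial t} = 0. \tag{18.57}$$
This just equals the total derivative of the density with respect to time,
$$\frac{d\varrho}{dt} = 0, \tag{18.58}$$
and hence, $\varrho = \text{constant}$.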
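The conservation of phase-space volume can be tested numerically for a system with one degree of freedom: the Jacobian determinant of the map from the initial point (q0, p0) to the evolved point (q, p) must equal 1. The sketch below is an illustration under assumptions made here: it uses the plane pendulum with m = l = 1 and g = 9.81, a Runge-Kutta integration of the Hamilton equations, and central differences for the Jacobian.

```python
import math


def rhs(q, p, m=1.0, l=1.0, g=9.81):
    """Hamilton equations of the plane pendulum, H = p^2/(2 m l^2) - m g l cos q."""
    return p / (m * l**2), -m * g * l * math.sin(q)


def flow(q, p, t_end=2.0, steps=4000):
    """Evolve the point (q, p) over the time t_end with Runge-Kutta steps."""
    dt = t_end / steps
    for _ in range(steps):
        k1q, k1p = rhs(q, p)
        k2q, k2p = rhs(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
        k3q, k3p = rhs(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
        k4q, k4p = rhs(q + dt * k3q, p + dt * k3p)
        q += dt * (k1q + 2.0 * k2q + 2.0 * k3q + k4q) / 6.0
        p += dt * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
    return q, p


def jacobian_det(q0, p0, h=1e-5):
    """det d(q, p)/d(q0, p0) of the flow map, from central differences."""
    qa, pa = flow(q0 + h, p0)
    qb, pb = flow(q0 - h, p0)
    qc, pc = flow(q0, p0 + h)
    qd, pd = flow(q0, p0 - h)
    dq_dq0 = (qa - qb) / (2.0 * h)
    dp_dq0 = (pa - pb) / (2.0 * h)
    dq_dp0 = (qc - qd) / (2.0 * h)
    dp_dp0 = (pc - pd) / (2.0 * h)
    return dq_dq0 * dp_dp0 - dq_dp0 * dp_dq0


det = jacobian_det(1.0, 0.5)   # arbitrary initial point in phase space
```

A determinant of 1 means that an infinitesimal area element around the chosen point keeps its size while it is carried along and deformed by the flow.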
EXAMPLE
18.9 Phase-Space Density for Particles in the Gravitational Field
The system consists of particles of mass m in a constant gravitational field. For the
energy, we have
$$H = E = \frac{p^2}{2m} - mgq.$$
The total energy of a particle remains constant.
The phase trajectories p(q) are parabolas
$$p = \sqrt{2m(E + mgq)},$$
with the energy as a parameter. We consider a number of particles with momenta at

time t = 0 between the limits p1 ≤ p ≤ p2 , and with energies between E1 ≤ E ≤ E2 .
They cover the area F in phase space. At a later time t the points cover the area F .
They then have the momentum
p = p + mgt,
so that F is the area between the parabolas limited by p1 + mgt ≤ p ≤ p2 + mgt.

Fig. 18.10. On Liouville’s theorem: Phase space for particles in the gravitational field
With
$$ q = \frac{(p^2/2m) - E}{mg}, $$
the size of the areas is calculated as
$$ F = \int_{p_1}^{p_2} dp \int_{(1/mg)((p^2/2m)-E_2)}^{(1/mg)((p^2/2m)-E_1)} dq = \int_{p_1}^{p_2}\frac{E_2 - E_1}{mg}\,dp = \frac{E_2 - E_1}{mg}(p_2 - p_1), $$
and likewise,
$$ F' = \frac{E_2 - E_1}{mg}(p_2' - p_1') = \frac{E_2 - E_1}{mg}(p_2 - p_1). $$
This is just the statement of Liouville’s theorem: $F = F'$ means that the density of the
system points in phase space remains constant. The significance of Liouville’s theorem
lies in the field of statistical mechanics, where one considers ensembles because of
lack of exact knowledge of the system.
A special application is the focusing of particle currents in accelerators where a
large number of particles are subject to identical conditions. Here a reduction of the
beam cross section must lead to an undesirable broadening of the momentum distrib-
ution.
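The area argument of Example 18.9 can be checked numerically. The sketch below builds the region bounded by two momenta and two energy parabolas as a polygon, transports its vertices with the exact solution of the canonical equations for $H = p^2/2m - mgq$, and compares the areas by the shoelace formula. The numerical values of m, g, the limits, and the function names are assumptions chosen only for illustration.

```python
def flow(q, p, t, m=1.0, g=9.81):
    """Exact solution of q_dot = p/m, p_dot = m*g for H = p^2/(2m) - m*g*q."""
    return q + p * t / m + 0.5 * g * t * t, p + m * g * t


def region_boundary(p1, p2, e1, e2, m=1.0, g=9.81, n=50):
    """Vertices (q, p) of the region p1 <= p <= p2, e1 <= E <= e2 as a closed polygon."""
    ps = [p1 + (p2 - p1) * i / n for i in range(n + 1)]
    # Parabola q(p) for the energy e2, traversed with increasing p.
    lower = [((p * p / (2.0 * m) - e2) / (m * g), p) for p in ps]
    # Parabola q(p) for the energy e1, traversed back with decreasing p.
    upper = [((p * p / (2.0 * m) - e1) / (m * g), p) for p in reversed(ps)]
    return lower + upper


def polygon_area(points):
    """Shoelace formula for the area of a polygon given by (q, p) vertices."""
    twice_area = 0.0
    n = len(points)
    for i in range(n):
        q1, p1 = points[i]
        q2, p2 = points[(i + 1) % n]
        twice_area += q1 * p2 - q2 * p1
    return abs(twice_area) / 2.0


if __name__ == "__main__":
    poly = region_boundary(1.0, 3.0, 0.0, 2.0)
    moved = [flow(q, p, 0.7) for q, p in poly]
    print(polygon_area(poly), polygon_area(moved))
```

Because the flow of this particular system is an affine shear in the (q, p) plane, the polygon is mapped onto a polygon, so the comparison does not suffer from discretization error.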
18.4 The Principle of Stochastic Cooling⁴

An essential implication of Liouville’s theorem is that the phase space occupied by an
ensemble of particles in the absence of friction behaves like an incompressible fluid.
We shall show in the following that the principle of stochastic cooling leads to a
(seeming) contradiction to the theorem of Liouville. For this purpose it is necessary
to expand on the method of stochastic cooling of antiprotons developed by van der
Meer.⁵ The successful application of this method led to the proof of the existence of
the intermediate vector bosons (IVB) W⁺, W⁻, and Z⁰ predicted by the theory of
weak interactions.
4 This chapter was stimulated by a lecture given by Professor Herminghaus (Mainz), at the occa-
sion of the sixtieth birthday of Professor P. Junior 1988 in Frankfurt. My thanks go to colleague
Mr. Herminghaus for leaving his manuscript, which I found very useful when writing this section.
5 Simon van der Meer, b. Nov. 24, 1925, Den Haag. He received the Nobel prize for physics in
1984. He studied mechanical and electrical engineering at the Technical University of Delft, took his
diploma exams as engineer and worked at first in the Philips central laboratory in Eindhoven. In 1956
he got a position as a development engineer at CERN in Geneva. Here he soon earned a reputation
for professional competence, imagination, and also for his talent for theory. He was appointed a
“senior engineer.” Meanwhile the Italian physicist Carlo Rubbia, a scientific coworker at CERN, had
developed the idea to shoot 450 GeV protons from the just-finished super-high energy accelerator
“SPS” onto their artificially produced “antiparticles”—antiprotons. The project was realized as a
collider system. For the first time one could generate and demonstrate the so far only hypothetical
intermediate W- and Z-bosons. Van der Meer, a “genuine puzzler,” provided an ingenious invention:
stochastic cooling, which allowed researchers to collect antiprotons in sufficient quantity and to store
them for the experiments. Only one year after their great success, which proved the predictions of
theory in a brilliant way, van der Meer and Rubbia were awarded the Nobel prize for physics
“for decisive merits in the discovery of the field quanta of weak interaction.”
Fig. 18.11. Schematic

representation of a proton–
antiproton collision. In the
collision quark–antiquark
pairs are created whose reac-
tions can lead to the creation
of intermediate vector bosons
(• = quark, ◦ = antiquark)
According to the predictions of the theory, these particles should be able to decay
as follows:
IVB −→ lepton + antilepton, (18.59)

IVB −→ quark + antiquark. (18.60)
For the experimental proof of the IVB, one utilized the inverse of reaction (18.60), by
shooting high-energy beams of antiprotons onto protons in the proton synchrotron
(PS) at CERN. Since the protons consist only of three quarks (q) and the antiprotons
of three antiquarks (q̄), many quark-antiquark pairs are created by the violent
collisions. The reactions between these quarks and antiquarks can generate the
intermediate vector bosons (see Fig. 18.11). In order to reach a high event rate, which is
calculated according to
event rate = cross section · luminosity, (18.61)
one needs both a large cross section and a high beam luminosity. Now one has
$$ \text{luminosity} \sim \frac{N_p \cdot N_{\bar p}}{q}. \tag{18.62} $$
Here, $N_p$ and $N_{\bar p}$ denote the number of protons (p) and antiprotons ($\bar p$) in the beam, and q represents the beam cross section. The higher the number of particles and the lower the beam cross section, the higher is the event rate for creating an intermediate vector boson. See also Fig. 18.12.
An efficient cooling mechanism for the antiproton beams is therefore needed. Each particle of the beam moves by the action of magnetic fields in horizontal and vertical vibrations about a closed pre-set trajectory. In this context the term cooling means a reduction of the vibration amplitudes of the particles and thus of the beam cross section, or a reduction of the width of the momentum distribution of the particles about the mean value. This is illustrated by Fig. 18.13. Already well-tried cooling methods are electron cooling, cooling by synchrotron radiation, and the stochastic cooling, which will now be outlined in more detail.
Fig. 18.12. The existence of intermediate vector bosons could be proved for the first time at CERN by the collision of intense high-energy proton and antiproton beams ($N_{\bar p}$ = number of antiprotons in the beam, $N_p$ = number of protons)
The motion of each particle in the beam is described by a point in a 6-dimensional
phase space spanned by the 3 spatial and the 3 momentum coordinates. This phase
space point is surrounded by empty space. By an appropriate deformation of the phase
space element the particle can be shifted toward the center of gravity of the distribu-
tion. This is the principle of stochastic cooling.
Fig. 18.13. A beam before

and after cooling, (a) in po-
sition space, (b) in momen-
tum space. Part (b) essentially
shows the particle density ver-
sus the transverse momentum
Fig. 18.14. The cooling sys-

tem consisting of “pick-up,”
amplifier, and “kicker”
The experimental setup for cooling of antiproton beams is sketched in Fig. 18.14.
In the ideal case, a probe (pick-up) measures the position or the momentum of a
particle. This tiny signal is amplified and fed to the “kicker,” which then corrects the
transverse or the longitudinal momentum and thereby cools. Thus, the cooling can
be interpreted as a one-particle effect, since each particle cools itself by emission of a
self-generated signal (coherent effect). An essential prerequisite is that the particle and
the signal reach the kicker simultaneously. Because of the finite resolving power of the
probe in the real case, besides the desired signal the perturbing signals from other par-
ticles reach the kicker too. This noise causes a heating of the particles (incoherent
effect) and thus counteracts the cooling effect. This interplay of cooling and heat-
ing mechanisms is illustrated by Fig. 18.15 and will be discussed in Exercise 18.10.
The cooling effect is directly proportional to the signal amplification, while the heat-
ing is proportional to the square of the amplification. The particle is cooled only in
the hatched area (see Fig. 18.15). Evidently there exists an optimum of amplification
where the cooling effect reaches an extremum value. Thus, the greater the intensity
of the beams, the greater is the noise and the heating effect, and the less is the factor
of optimum amplification. Generation of an intense beam of antiprotons at CERN is
therefore performed by stages and may last several hours. The principle is illustrated
by Fig. 18.15.
First, an antiproton pulse of low intensity is injected at the left border of the vacuum
chamber (1). The corresponding momentum density distribution can be seen on the
right. The beam and its momentum width are then compressed by cooling (2). A high-
frequency voltage is used to shift the pulse to the right side of the chamber (3), thus
giving space for a further antiproton pulse which is injected into the chamber (4).
After cooling, the second pulse is shifted onto the already “deposited” pulse (5). This procedure is repeated every 2 to 3 seconds for several hours. In this way, the longitudinal phase-space density is increased by accumulation of more and more particles into the same momentum interval (6). The final 6-dimensional phase-space density of the stack is higher than the density of a single pulse by a factor of $3 \cdot 10^8$. The intense antiproton beam generated this way can now be further accelerated and brought to collision with a proton beam. Only one year after the demonstration of the intermediate vector bosons, S. van der Meer and C. Rubbia⁶ were awarded the Nobel Prize in physics for their achievements.
Fig. 18.15. Sectional view of the vacuum chamber with beams in various stages of the accumulation process. The right-hand part shows the corresponding density distribution versus the momentum
We now come back to the apparent contradiction between Liouville’s theorem and
the method of stochastic cooling. While according to the Liouville theorem only a sin-
gle pulse can be accommodated in a ring, stochastic cooling allows one to accumulate
about 36,000 pulses in the course of a day. The final phase-space density is higher than
that of a single pulse by a factor of $3 \cdot 10^8$.
However, stochastic cooling and the Liouville theorem are dealing with different
situations. The former presupposes an ensemble of a finite number of discrete parti-
cles, while the Liouville theorem presupposes a phase-space continuum (see div v !).
A discrete ensemble thus represents only a model approximation of this condition that
works the better the more dense the occupation of the phase-space volume becomes.
This becomes clear by the example of the cooling rate (which will be calculated in
the subsequent problem):
$$ \frac{1}{\tau} = \frac{W}{N}(2g - g^2). \tag{18.63} $$
6 Carlo Rubbia, b. March 31, 1934, Gorizia. He got his education as a physicist in Pisa at the Scuola
Normale, a time-honored university. Here he got his doctorate in 1958, after which he worked for a
year as a research scholar at the Columbia University in New York, and then as an assistant professor
in Rome. In 1960, he came to CERN at Geneva as high-energy physicist. Since 1972, he has held
a chair at Harvard University. In Geneva, Rubbia was inspired by the unified theory of weak and
electromagnetic interactions developed by A. Salam, S. Glashow, and S. Weinberg (Nobel Prize for
physics, 1979). In 1976, Rubbia proposed to CERN the construction of a new 450 GeV SPS accel-
erator for the purpose of proton-antiproton collision experiments. The accelerator achieved collision
energies of 540 GeV, which were sufficient to create the (so far only predicted) W - and Z-bosons.
Important for the success of the project was not only Rubbia, but also S. van der Meer, whose con-
tributions made possible the generation of sharply bunched, pulsed antiproton currents. Both of them
got the Nobel Prize for physics in 1984.
N denotes the number of particles in the beam, W the bandwidth of the system and g a gain factor that will be defined in Exercise 18.10. The essential point however is the dependence of the cooling rate on the inverse of the particle number of the beam, 1/N . In the limit
$$ \lim_{N\to\infty}\frac{1}{\tau} = 0, $$
cooling is no longer possible, as we would expect.
We note that the same restriction for applying Liouville’s theorem basically also holds in thermodynamics, but there the approximation is better by 12 orders of magnitude ($10^{12} \to 10^{24}$)!
Much more important, however, is the fact that Liouville’s theorem holds on the
condition that the particles obey the Hamilton equations, with a given Hamiltonian H .
In this sense the particle system must be closed. But just this condition is violated by
reading off the particle position (coordinate, momentum pick-up) and by the corresponding
correction (kicker; see Fig. 18.14). This is a calculated interference from
outside which cannot be described by a Hamiltonian. Hence, the Liouville theorem
does not have to be fulfilled; moreover, it must not hold at all!
EXERCISE
18.10 Cooling of a Particle Beam
Problem.
(a) Calculate the cooling rate per second for a beam of N particles.
(b) When does maximum cooling occur?
(c) Calculate the cooling time for a beam of $N = 10^{12}$ particles. Let the bandwidth of
the system be W = 500 MHz, and g = 1.
Solution. (a) We first consider the case that the pick-up and the kicker are so fast that they seize each particle independently (see Fig. 18.16). Let the displacement of this particle from the beam axis be $x_k$. After passing the distance λ/4 (λ is the wavelength of the x-vibration), the deviation is corrected electromagnetically in the kicker. Let the correction be
$$ \Delta x_k = g x_k. \tag{18.64} $$
The corrected distance $x_k^c$ of the particle from the beam axis is thus given by
$$ x_k^c = x_k - \Delta x_k = (1 - g)x_k \tag{18.65} $$
(Fig. 18.17). For g = 1, the cooling would be ideal. However, in the real case there appears a noise in the pick-up which is due to further $N_s - 1$ particles passing the pick-up in the time interval $T_s$ (see Fig. 18.18). Thus, the pick-up measures not only the spatial displacement $x_k$ of the kth particle from the beam axis (x = 0), but also that of the additional $N_s - 1$ particles located around the kth one. The recorded spatial displacement is therefore the mean value $\langle x\rangle_k$ of all $N_s$ seized particles (the kth and the $N_s - 1$ located around the kth particle):
$$ \langle x\rangle_k = \frac{1}{N_s}\sum_{j=1}^{N_s} x_j. \tag{18.66} $$
Fig. 18.16. In the ideal case, the pick-up seizes one particle
Fig. 18.17. A single particle is seized by the pick-up and its momentum is corrected after a λ/4 wavelength at the kicker. After one revolution, the new trajectory leads to the corrected spatial displacement $x_k^c$
Fig. 18.18. In the real case, the pick-up seizes several ($N_s$) particles that cause a noise
For clarity, we will label the kth particle in the sum on the right-hand side, i.e., the numbering will be chosen so that j = 1 just denotes the kth particle. Moreover, it should be clear that the remaining $N_s - 1$ particles are closely located around the kth one when passing the pick-up. We therefore add the index k to the particular spatial displacement $x_j$, so that the sample mean $\langle x\rangle_k$ reads
$$ \langle x\rangle_k = \frac{1}{N_s}\left(x_{1,k} + \sum_{j=2}^{N_s} x_{j,k}\right), \tag{18.67} $$
where $x_{1,k} \equiv x_k$ and $x_{j,k} \neq x_k$ for $2 \le j \le N_s$.

The correction for the kth particle is now
$$ x_k^c = x_k - g\langle x\rangle_k. \tag{18.68} $$
This means that there will be no kick if the sample of $N_s$ particles on the average moves on the beam axis:
$$ x_k^c = x_k \quad\text{if}\quad \langle x\rangle_k = 0. \tag{18.69} $$

In other words, the kicker will not be activated if the center of gravity
$$ S_k = \frac{1}{mN_s}\sum_{j=1}^{N_s} m\,x_{j,k} = \frac{1}{N_s}\sum_{j=1}^{N_s} x_{j,k} \equiv \langle x\rangle_k \tag{18.70} $$
of the sample in the pick-up is already on the beam axis (all particles have the same mass m).
For real measurements in the pick-up, this will, of course, not be fulfilled in gen-
eral. The probability that the center of gravity of the Ns particles that are statistically
distributed over the beam just coincides with the beam axis is extremely low. In the
realistic case the sample will always be “kicked”.
We now want to know how the mean value of the spatial displacement of all N particles in the beam will change by the mechanism of stochastic cooling. This mean value is
$$ E(x_k) = \frac{1}{N}\sum_{k=1}^{N} x_k \tag{18.71} $$
and will be denoted by $E(x_k)$ (expectation value) to distinguish it from the mean value for the sample of $N_s$ particles, $\langle x\rangle_k$, which was defined in (18.66). It is however clear that the mean value of the positions of all the particles just defines the beam axis.
Since we put the beam axis at the origin of the coordinate system, x = 0, the mean
value of the spatial displacement of all the particles from the beam axis just vanishes:
E(xk ) ≡ 0. (18.72)
This always holds, independent of the mechanism of stochastic cooling. Thus, the mean value $E(x_k)$ is not an appropriate quantity for investigating the mechanism of stochastic cooling. It is evident that the mean square of the spatial displacement $E(x_k^2)$ is much better suited for this purpose. We therefore will investigate the change of $E(x_k^2)$ by the stochastic cooling mechanism. First we consider the square of the spatial displacement $x_k^2$ for the kth particle, and to this end, we square (18.68):
$$ (x_k^c)^2 = x_k^2 - 2g\,x_k\langle x\rangle_k + g^2\langle x\rangle_k^2. \tag{18.73} $$
The change of $x_k^2$ for a single passage through the kicker is thus given by
$$ \Delta(x_k^2) := (x_k^c)^2 - x_k^2 = -2g\,x_k\langle x\rangle_k + g^2\langle x\rangle_k^2. \tag{18.74} $$
Since there is one kick per revolution, this is also the change of $x_k^2$ per revolution. By averaging over all particles, one obtains
$$
\begin{aligned}
E(\Delta(x_k^2)) &\equiv E((x_k^c)^2 - x_k^2) = E((x_k^c)^2) - E(x_k^2)\\
&\equiv \Delta(E(x_k^2)) = -2g\,E(x_k\langle x\rangle_k) + g^2 E(\langle x\rangle_k^2).
\end{aligned}
\tag{18.75}
$$
The second equals sign in the first line follows from the additivity of the expectation value E(. . .); compare (18.71).
To calculate the change of the expectation value of the mean square of the spatial displacement per revolution, $\Delta(E(x_k^2))$, the expectation values $E(x_k\langle x\rangle_k)$ and $E(\langle x\rangle_k^2)$ must be expressed by $E(x_k^2)$. We then obtain $\Delta(E(x_k^2))$ as a function of $E(x_k^2)$, or a differential equation for $E(x_k^2)$, the solution of which allows us to calculate the desired quantities.
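To evaluate $E(x_k\langle x\rangle_k)$, we write with (18.67)
$$
\begin{aligned}
E(x_k\langle x\rangle_k) &= \frac{1}{N}\sum_{k=1}^{N} x_k\,\frac{1}{N_s}\left(x_{1,k} + \sum_{j=2}^{N_s} x_{j,k}\right)\\
&= \frac{1}{N_s}\left(\frac{1}{N}\sum_{k=1}^{N} x_k^2 + \frac{1}{N}\sum_{k=1}^{N}\sum_{j=2}^{N_s} x_{1,k}\,x_{j,k}\right).
\end{aligned}
\tag{18.76}
$$
In the first term, we used $x_{1,k} \equiv x_k$, and in the second one $x_k \equiv x_{1,k}$.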
We now realize that two different particles in the beam cannot be correlated (the particles are statistically distributed over the beam!). Even though they belong to the same sample of $N_s$ particles around the kth particle, their spatial displacements $x_{i,k}$ and $x_{j,k}$, $i \neq j$, on the average must satisfy
$$ E(x_{i,k}\,x_{j,k}) = \frac{1}{N}\sum_{k=1}^{N} x_{i,k}\,x_{j,k} = 0, \quad\text{for } i \neq j. \tag{18.77} $$
Thus, if i = 1 and $2 \le j \le N_s$, then
$$ E(x_{1,k}\,x_{j,k}) = \frac{1}{N}\sum_{k=1}^{N} x_{1,k}\,x_{j,k} \equiv 0. \tag{18.78} $$
This is now utilized in the second term of (18.76), which then vanishes. The first term can immediately be rewritten using the definition of E, and we obtain
$$ E(x_k\langle x\rangle_k) = \frac{1}{N_s}E(x_k^2). \tag{18.79} $$
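Furthermore,
$$
\begin{aligned}
E(\langle x\rangle_k^2) &= \frac{1}{N}\sum_{k=1}^{N}\frac{1}{N_s^2}\sum_{i,j=1}^{N_s} x_{i,k}\,x_{j,k}\\
&= \frac{1}{N}\sum_{k=1}^{N}\frac{1}{N_s^2}\left(\sum_{i=1}^{N_s} x_{i,k}^2 + \sum_{\substack{i,j=1\\ i\neq j}}^{N_s} x_{i,k}\,x_{j,k}\right).
\end{aligned}
\tag{18.80}
$$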
The second term again vanishes by using (18.77). In the first term we first average by summing over all particles,
$$ \frac{1}{N}\sum_{k=1}^{N}\frac{1}{N_s^2}\sum_{i=1}^{N_s} x_{i,k}^2 = \frac{1}{N_s^2}\sum_{i=1}^{N_s} E(x_{i,k}^2). \tag{18.81} $$
The mean square spatial deviation $E(x_{i,k}^2)$ cannot depend on the label i of the particle from the sample of $N_s$ particles, $E(x_{i,k}^2) \equiv E(x_k^2)$. The sum over i therefore yields only the additional factor $N_s$, and we obtain
$$ E(\langle x\rangle_k^2) \equiv \frac{1}{N_s}E(x_k^2). \tag{18.82} $$
Equation (18.75) with (18.79) and (18.82) thus turns into
$$ \Delta(E(x_k^2)) = -\frac{2g - g^2}{N_s}E(x_k^2) \tag{18.83} $$
for the change of the mean square spatial displacement per revolution. The “differential” change $dE(x_k^2)$ per “differential” revolution dn is
$$ \frac{dE(x_k^2)}{dn} = -\frac{2g - g^2}{N_s}E(x_k^2). \tag{18.84} $$
This differential equation is solved by the function
$$ E(x_k^2) = C\exp\left(-n\,\frac{2g - g^2}{N_s}\right). \tag{18.85} $$
For the root of the mean square (rms) spatial displacement
$$ x_{\mathrm{rms}} := \sqrt{E(x_k^2)}, $$
we obtain
$$ x_{\mathrm{rms}} = \sqrt{C}\exp\left(-n\,\frac{2g - g^2}{2N_s}\right). \tag{18.86} $$
$x_{\mathrm{rms}}$ decreases to the fraction 1/e of its original value after
$$ n_0 = \frac{2N_s}{2g - g^2} \tag{18.87} $$
revolutions. Since each revolution takes the time T , it thus lasts for
$$ \tau = n_0 T = \frac{2N_s T}{2g - g^2} \tag{18.88} $$
to reduce $x_{\mathrm{rms}}$ to the fraction 1/e of its original value. Since among N particles orbiting in the time T , $N_s$ particles are seized in the time $T_s$, for a homogeneous particle flux density we have
$$ N_s = N\,\frac{T_s}{T}, \tag{18.89} $$
and therefore,
$$ \tau = \frac{2N T_s}{2g - g^2}. \tag{18.90} $$
With the bandwidth $W = 1/(2T_s)$ (Nyquist’s theorem) or
$$ T_s = \frac{1}{2W}, \tag{18.91} $$
we finally obtain the cooling rate
$$ \frac{1}{\tau} = \frac{W}{N}(2g - g^2). \tag{18.92} $$
Fig. 18.19.
(b) From the discussion of (18.92), it follows immediately that the cooling rate is
maximized for g = 1. For g > 2, the particles are heated up.
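The dependence of the result on the gain g can be illustrated with a small Monte Carlo experiment for a single revolution. The sketch below assumes perfect mixing (the particles are reshuffled into random samples of $N_s$ particles, which mimics the statistical independence used in (18.77)), Gaussian initial displacements, and an idealized kick by g times the sample mean. These modeling choices and all names are assumptions made for illustration and are not part of the text.

```python
import random


def cooling_turn(x, g, n_s, rng):
    """One revolution: reshuffle the beam, then shift every sample by -g * (sample mean)."""
    rng.shuffle(x)
    for start in range(0, len(x), n_s):
        sample = x[start:start + n_s]
        mean = sum(sample) / len(sample)
        for i in range(start, start + len(sample)):
            x[i] -= g * mean


def mean_square(x):
    """Mean square displacement E(x^2) of the whole beam."""
    return sum(v * v for v in x) / len(x)


def one_turn_ratio(g, n=20000, n_s=10, seed=1):
    """Ratio E(x^2)_after / E(x^2)_before for one revolution with gain g."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    before = mean_square(x)
    cooling_turn(x, g, n_s, rng)
    return mean_square(x) / before


if __name__ == "__main__":
    for g in (0.5, 1.0, 2.0, 3.0):
        # Compare the simulated ratio with the prediction 1 - (2g - g^2)/N_s.
        print(g, one_turn_ratio(g), 1.0 - (2.0 * g - g * g) / 10)
```

Within statistical fluctuations the simulated ratio follows $1 - (2g - g^2)/N_s$: the reduction is largest at g = 1, vanishes at g = 2, and turns into heating for g > 2.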
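(c) With the numerical values given in the problem, for the cooling rate we get
$$ \frac{1}{\tau} = \frac{500\ \text{MHz}}{10^{12}} \cdot 1 = 5 \cdot 10^{-4}\ \frac{1}{\text{s}}, \tag{18.93} $$
and thus,
$$ \tau = 2 \cdot 10^{3}\ \text{s} \approx \frac{1}{2}\ \text{h}. \tag{18.94} $$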
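The arithmetic of part (c) can be reproduced with a few lines. This is only a sketch of the formula for the cooling rate, $1/\tau = (W/N)(2g - g^2)$; the function names are hypothetical, and the error raised for a non-positive rate is an added convention, not something stated in the text.

```python
def cooling_rate(bandwidth_hz, n_particles, gain):
    """Cooling rate 1/tau = (W / N) * (2g - g^2) in 1/s."""
    return bandwidth_hz / n_particles * (2.0 * gain - gain * gain)


def cooling_time(bandwidth_hz, n_particles, gain):
    """Cooling time tau in seconds; defined only if the rate is positive (0 < g < 2)."""
    rate = cooling_rate(bandwidth_hz, n_particles, gain)
    if rate <= 0.0:
        raise ValueError("no cooling for this gain: the rate is not positive")
    return 1.0 / rate


if __name__ == "__main__":
    # Values of part (c): W = 500 MHz, N = 10^12, g = 1.
    print(cooling_rate(500e6, 1e12, 1.0), cooling_time(500e6, 1e12, 1.0))
```

Doubling the number of particles at fixed bandwidth and gain doubles the cooling time, which is the 1/N dependence emphasized in Sect. 18.4.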
19 Canonical Transformations
Given a Hamiltonian $H = H(q_j, p_j, t)$, the motion of the system is found by integration of the Hamilton equations:
$$ \dot p_i = -\frac{\partial H}{\partial q_i} \quad\text{and}\quad \dot q_i = \frac{\partial H}{\partial p_i}. \tag{19.1} $$
For the case of a cyclic coordinate, we have, as we know,
$$ \frac{\partial H}{\partial q_i} = 0, \quad\text{i.e.,}\quad \dot p_i = 0. $$
Hence, the corresponding momentum is constant: $p_i = \beta_i$ = constant.

Whether or not H contains cyclic coordinates depends in general on the coordi-
nates adopted for describing a problem. This is immediately seen from the following
example: If a circular motion in a central field is described in Cartesian coordinates,
there is no cyclic coordinate. If we use polar coordinates (ρ, ϕ), the angular coordinate
is cyclic (angular momentum conservation).
A mechanical problem would therefore be greatly simplified if one could find a coordinate transformation from the set $p_i, q_i$ to a new set of coordinates $P_i, Q_i$ with
$$ Q_i = Q_i(p_j, q_j, t), \qquad P_i = P_i(p_j, q_j, t), \tag{19.2} $$
where all coordinates $Q_i$ for the problem were cyclic. Then all momenta are constant, $P_i = \beta_i$, and the new Hamiltonian $H'$ is then only a function of the constant momenta $P_i$; hence, $H' = H'(P_j)$. Then
$$ \dot Q_i = \frac{\partial H'(P_j)}{\partial P_i} = \omega_i = \text{constant}, \qquad \dot P_i = -\frac{\partial H'(P_j)}{\partial Q_i} = 0. $$
Then integration with respect to time leads to
$$ Q_i = \omega_i t + \omega_0, \qquad P_i = \beta_i = \text{constant}. $$
Here, we presupposed that the new coordinates $(P_i, Q_i)$ again satisfy the (canonical) Hamilton equations, with a new Hamiltonian $H'(P_j, Q_j, t)$. This is an essential requirement for a coordinate transformation of the form (19.2) to make it canonical. Just as $p_i$ is the canonical momentum corresponding to $q_i$ ($p_i = \partial L/\partial \dot q_i$), $P_i$ shall be the canonical momentum to $Q_i$. A pair $(q_i, p_i)$ is called canonically conjugate if the Hamilton equations hold for $q_i$ and $p_i$. The transformation from one pair of canonically conjugate coordinates to another pair is called a canonical transformation. Then
$$ \dot Q_i = \frac{\partial H'}{\partial P_i}, \qquad \dot P_i = -\frac{\partial H'}{\partial Q_i}. \tag{19.3} $$
At the moment, we do not yet require that all Qi be cyclic. This case will be con-
sidered later (Chap. 20).
In the new coordinates, we require Hamilton’s principle to be maintained. Thus, for fixed instants of time, $t_1$ and $t_2$, we have both
$$ \delta\int_{t_1}^{t_2} L(q_j, \dot q_j, t)\,dt = 0 $$
and
$$ \delta\int_{t_1}^{t_2} L'(Q_j, \dot Q_j, t)\,dt = 0. $$
Thus, the difference
$$ \delta\int_{t_1}^{t_2} (L - L')\,dt = 0 \tag{19.4} $$
also vanishes.
We observe that (19.4) will then be fulfilled even if the old and new Lagrangians differ by a total time derivative of a function F :
$$ L - L' = \frac{dF}{dt}, \quad\text{because}\quad \delta\int_{t_1}^{t_2}\frac{dF}{dt}\,dt = \delta\left(F|_{t_2} - F|_{t_1}\right) = 0, $$
since the variation of a constant equals zero. As we shall see, the function F medi-
ates the transformation (pi , qi ) to (Pi , Qi ). F is therefore also called a generating
function. In the general case, F will be a function of the old and the new coordinates; together with the time t it involves 4n + 1 coordinates:
$$ F = F(p_j, q_j, P_j, Q_j, t). $$
But since simultaneously there are 2n transformation equations
$$ Q_i = Q_i(p_j, q_j, t), \qquad P_i = P_i(p_j, q_j, t), \tag{19.5} $$
F involves only 2n + 1 independent variables. F must contain both a coordinate from the old coordinate set $p_i$ (or $q_i$) and one of the new $P_i$ (or $Q_i$) to enable us to establish a relation between the systems. Hence, there are four possibilities for a generating function:
$$
\begin{aligned}
F_1 &= F(q_j, Q_j, t), & F_2 &= F(q_j, P_j, t),\\
F_3 &= F(p_j, Q_j, t), & F_4 &= F(p_j, P_j, t).
\end{aligned}
\tag{19.6}
$$
Each of these functions has 2n + 1 independent variables. The dependency must be selected in a suitable way, according to the actual problem. We now derive the transformation rules of the form (19.2) from a generating function of type $F_1$.
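Because
$$ L = L' + \frac{dF}{dt} \quad\text{and}\quad L = \sum_i p_i\dot q_i - H, \tag{19.7} $$
we have
$$ \sum_i p_i\dot q_i - H = \sum_i P_i\dot Q_i - H' + \frac{dF}{dt}. \tag{19.8} $$
For the total time derivative of $F_1$ we then have
$$ \frac{dF_1}{dt} = \sum_i\left(\frac{\partial F_1}{\partial q_i}\dot q_i + \frac{\partial F_1}{\partial Q_i}\dot Q_i\right) + \frac{\partial F_1}{\partial t}. \tag{19.9} $$
We insert this expression into (19.8), which yields
$$ \sum_i p_i\dot q_i - \sum_i P_i\dot Q_i - H + H' = \sum_i\frac{\partial F_1}{\partial q_i}\dot q_i + \sum_i\frac{\partial F_1}{\partial Q_i}\dot Q_i + \frac{\partial F_1}{\partial t}. $$
By comparing the coefficients, we obtain
$$
\begin{aligned}
p_i &= \frac{\partial F_1(q_j, Q_j, t)}{\partial q_i},\\
P_i &= -\frac{\partial F_1(q_j, Q_j, t)}{\partial Q_i},\\
H' &= H + \frac{\partial F_1(q_j, Q_j, t)}{\partial t}.
\end{aligned}
\tag{19.10}
$$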
We are now prepared to derive the transformation equations for a generating function of the type $F_2$, which is also denoted by S:
$$ F_2 \equiv S = S(q_j, P_j, t). $$
For the derivation, we will use a comparison of coefficients as for $F_1$; therefore, we require that $F_2$ be composed as follows:
$$ F_2(q_j, P_j, t) = \sum_i P_i Q_i + F_1(q_j, Q_j, t), \tag{19.11} $$
since then we can consider the problem analogously to $F_1$. We imagine the $Q_i$ as being expressed through the second equation of (19.10), i.e., through
$$ P_i = -\frac{\partial F_1(q_j, Q_j, t)}{\partial Q_i}. $$
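According to (19.8), we have
$$
\begin{aligned}
\sum_i p_i\dot q_i - H &= \sum_i P_i\dot Q_i - H' + \frac{d}{dt}F_1\\
&= \sum_i P_i\dot Q_i - H' + \frac{d}{dt}\left(F_2(q_j, P_j, t) - \sum_i P_i Q_i\right).
\end{aligned}
$$
This leads to
$$
\begin{aligned}
\sum_i p_i\dot q_i - \sum_i P_i\dot Q_i - H + H' &= \frac{d}{dt}\left(F_2(q_j, P_j, t) - \sum_i P_i Q_i\right)\\
&= \sum_i\frac{\partial F_2}{\partial q_i}\dot q_i + \sum_i\frac{\partial F_2}{\partial P_i}\dot P_i + \frac{\partial F_2}{\partial t} - \sum_i\dot P_i Q_i - \sum_i P_i\dot Q_i,
\end{aligned}
$$
and hence
$$ \sum_i p_i\dot q_i + \sum_i\dot P_i Q_i - H + H' = \sum_i\frac{\partial F_2}{\partial q_i}\dot q_i + \sum_i\frac{\partial F_2}{\partial P_i}\dot P_i + \frac{\partial F_2}{\partial t}. $$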
Comparing again the coefficients now yields the equations
$$
\begin{aligned}
p_i &= \frac{\partial F_2(q_j, P_j, t)}{\partial q_i}, \qquad Q_i = \frac{\partial F_2(q_j, P_j, t)}{\partial P_i},\\
H'(P_j, Q_j, t) &= H(p_j, q_j, t) + \frac{\partial F_2(q_j, P_j, t)}{\partial t}.
\end{aligned}
\tag{19.12}
$$
The first two relations allow us to determine the transformation equations $q_i = q_i(Q_j, P_j, t)$ and $p_i = p_i(Q_j, P_j, t)$, which by insertion into the third equation of (19.12) yield the new Hamiltonian $H'(P_j, Q_j, t)$.
The transformation equations for the other types of generating functions are ob-
tained analogously, by choosing an appropriate sum which enables us to use the meth-
ods of the first two problems.
From (19.10) and (19.12), we obtain the dependence of the new coordinates $(P_i, Q_i)$ on the old $(p_i, q_i)$ and vice versa. For the case $F_1$, from
$$ p_i = \frac{\partial F_1(q_j, Q_j, t)}{\partial q_i} $$
follow the equations $p_i = p_i(q_j, Q_j, t)$, which can be solved for the $Q_i$:
$$ Q_i = Q_i(p_j, q_j, t). $$
Insertion into the equations
$$ P_i = -\frac{\partial F_1(q_j, Q_j, t)}{\partial Q_i} $$
then enables us to calculate
$$ P_i = P_i(p_j, q_j, t). $$
We now understand the name generating function for F : The function F determines the canonical transformation
$$ Q_i = Q_i(p_j, q_j, t), \qquad P_i = P_i(p_j, q_j, t) $$
through equations of the type (19.10) or (19.12).

By means of Legendre transformations, we may furthermore define generating functions $F_3(p_j, Q_j, t)$ and $F_4(p_j, P_j, t)$. Based on the Legendre transformation
$$ F_3(p_j, Q_j, t) = F_1(q_j, Q_j, t) - \sum_i q_i p_i \tag{19.13} $$
we obtain with a similar derivation the canonical transformation rules
$$ q_i = -\frac{\partial F_3(p_j, Q_j, t)}{\partial p_i}, \qquad P_i = -\frac{\partial F_3(p_j, Q_j, t)}{\partial Q_i}, \qquad H' = H + \frac{\partial F_3(p_j, Q_j, t)}{\partial t}. $$
Starting finally from
$$ F_4(p_j, P_j, t) = F_3(p_j, Q_j, t) + \sum_i Q_i P_i \tag{19.14} $$
the following transformation rules emerge
$$ q_i = -\frac{\partial F_4(p_j, P_j, t)}{\partial p_i}, \qquad Q_i = \frac{\partial F_4(p_j, P_j, t)}{\partial P_i}, \qquad H' = H + \frac{\partial F_4(p_j, P_j, t)}{\partial t}. $$
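Calculating the second derivatives of the generating functions $F_{1,2,3,4}$, we find the following relations to apply between old and new coordinates under a canonical transformation
$$
\begin{aligned}
\frac{\partial Q_i}{\partial q_k} &= \frac{\partial^2 F_2}{\partial q_k\,\partial P_i} = \frac{\partial p_k}{\partial P_i}, &
\frac{\partial Q_i}{\partial p_k} &= \frac{\partial^2 F_4}{\partial p_k\,\partial P_i} = -\frac{\partial q_k}{\partial P_i},\\
\frac{\partial P_i}{\partial q_k} &= -\frac{\partial^2 F_1}{\partial q_k\,\partial Q_i} = -\frac{\partial p_k}{\partial Q_i}, &
\frac{\partial P_i}{\partial p_k} &= -\frac{\partial^2 F_3}{\partial p_k\,\partial Q_i} = \frac{\partial q_k}{\partial Q_i}.
\end{aligned}
\tag{19.15}
$$
Exactly the existence of these mutual relations between old and new coordinates distinguishes a canonical transformation from a general transformation (19.2) of the system’s coordinates. For the latter, (19.15) do not hold.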
In the preceding derivation, the Hamiltonians $H(q_j, p_j, t)$ and $H'(Q_j, P_j, t)$ were
conceived as alternative descriptions of the same dynamical system. On the other
hand, we may as well conceive $H$ and $H'$ as describing different dynamical systems.
A canonical transformation of $H$ into $H'$ then establishes a correlation of both
dynamical systems. This way, it is sometimes possible to find the solution of a given
dynamical system by canonically transforming it into a second system that is easier to
solve. The solution of the original system is then obtained by canonically back trans-
forming the solution of the second system. With examples 19.4 and 21.16, we shall
work out the solutions of the damped and the time-dependent harmonic oscillators,
respectively, by canonically transforming these systems into the ordinary harmonic
oscillator.
EXAMPLE
19.1 Example of a Canonical Transformation
Let the generating function be given by
$$ F_1(q_j, Q_j) = \sum_k q_k Q_k. $$
According to (19.10) the particular transformation rules follow as
$$ p_i = \frac{\partial F_1}{\partial q_i} = Q_i, \qquad P_i = -\frac{\partial F_1}{\partial Q_i} = -q_i, \qquad H'(P_j, Q_j) = H(q_j, p_j). $$
The example shows that in the Hamiltonian formalism, the momentum and position coordinates play equivalent parts.
EXAMPLE
19.2 Point Transformations
We consider the canonical transformation that is defined by the particular generating function
$$ F_2(q_j, P_j, t) = \sum_k P_k f_k(q_j, t), $$
with arbitrary differentiable functions $f_k(q_j, t)$. The transformation rules (19.12) for this $F_2$ follow as
$$ Q_i = f_i(q_j, t), \qquad p_i = \sum_k P_k\frac{\partial f_k}{\partial q_i}, \qquad H'(Q_j, P_j, t) = H(q_j, p_j, t) + \sum_k P_k\frac{\partial f_k}{\partial t}. $$
The new position coordinates $Q_i$ thus emerge as functions of the original position coordinates $q_i$, without any dependence on the momentum coordinates. Transformations of this type are referred to as point transformations. This class of transformations is generally canonical as we can always construct the corresponding generating function. The particular case $f_k(q_j, t) = q_k$ then defines the identical transformation
$$ Q_i = q_i, \qquad p_i = \sum_k P_k\delta_{ki} = P_i, \qquad H'(Q_j, P_j, t) = H(q_j, p_j, t). $$
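As an illustration of the rule $p_i = \sum_k P_k\,\partial f_k/\partial q_i$ for a point transformation, the following sketch takes plane polar coordinates as the new positions, $Q_1 = r = \sqrt{x^2 + y^2}$ and $Q_2 = \varphi = \operatorname{atan2}(y, x)$. This choice of $f_k$, the test values, and the function names are assumptions made for the example; they do not come from the text.

```python
import math


def to_polar(x, y):
    """New positions Q = f(q): radius and polar angle of the point (x, y)."""
    return math.hypot(x, y), math.atan2(y, x)


def old_momenta(x, y, P_r, P_phi):
    """Old momenta p_i = sum_k P_k * d f_k / d q_i for f = (r, phi)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    # Partial derivatives: dr/dx = x/r, dr/dy = y/r, dphi/dx = -y/r^2, dphi/dy = x/r^2.
    p_x = P_r * x / r - P_phi * y / r2
    p_y = P_r * y / r + P_phi * x / r2
    return p_x, p_y


if __name__ == "__main__":
    x, y, P_r, P_phi = 1.2, -0.5, 0.8, 2.5
    p_x, p_y = old_momenta(x, y, P_r, P_phi)
    r, phi = to_polar(x, y)
    # P_phi should reappear as the angular momentum, P_r as the radial momentum.
    print(x * p_y - y * p_x, (x * p_x + y * p_y) / r)
```

The momentum conjugate to the angle comes out as the angular momentum $x p_y - y p_x$, and the momentum conjugate to the radius as the radial component of the momentum.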
EXAMPLE
19.3 Harmonic Oscillator
Let the kinetic energy T (p) and the potential energy V (q) of a particle be given by
$$ T(p) = \frac{p^2}{2m}, \qquad V(q) = \frac{1}{2}kq^2 = \frac{1}{2}m\omega^2 q^2, \qquad \omega^2 = \frac{k}{m}, \qquad m, k, \omega = \text{const.}, $$
with m denoting the particle’s mass, k a characteristic constant of the oscillator, and ω its characteristic frequency. The Hamiltonian of this system is then
$$ H(q, p) = \frac{p^2}{2m} + \frac{1}{2}m\omega^2 q^2. \tag{19.16} $$
The canonical equations and the equation of motion follow as
$$ \dot q(t) = \frac{\partial H}{\partial p} = \frac{p(t)}{m}, \qquad -\dot p(t) = \frac{\partial H}{\partial q} = q(t)\,m\omega^2, \qquad \ddot q + \omega^2 q = 0. \tag{19.17} $$
The direct way to evaluate the dynamics of this system is to integrate the equation of motion. Here, we choose the “detour” over the canonical transformation formalism, namely to map our system into another system with Hamiltonian $H'$ whose canonical equations are even easier to solve. As the “target Hamiltonian” $H'$, we choose
$$ H'(P) = \omega P. \tag{19.18} $$
A simple transformation $(H; q, p) \to (H'; P, Q)$ that provides this mapping is obviously
$$ q = \sqrt{\frac{2P}{m\omega}}\,\sin Q, \qquad p = \sqrt{2m\omega P}\,\cos Q, \qquad H' = H. \tag{19.19} $$
We observe that the new momentum P has acquired the dimension of an action,
whereas the new position coordinate Q is now dimensionless, i.e. an angle. In order to
ensure that the form of the canonical equations is maintained in the new coordinates,
we must test whether this transformation is actually canonical. To this end, we must
find a generating function that yields the transformation rules (19.19).
We try a generating function of the form $F_1(q, Q, t)$. The transformation rules (19.19) are first cast into the particular functional form that corresponds to the generating function $F_1(q, Q, t)$, hence into the form $p = p(q, Q, t)$ and $P = P(q, Q, t)$
$$ p = m\omega q\cot Q, \qquad P = \frac{1}{2}m\omega q^2\frac{1}{\sin^2 Q}, \qquad H' = H. \tag{19.20} $$
We must now find a function $F_1(q, Q, t)$ that yields these particular transformation rules according to the general prescriptions (19.10). Thus, $F_1(q, Q, t)$ must satisfy
$$ \frac{\partial F_1}{\partial q} = m\omega q\cot Q, \qquad \frac{\partial F_1}{\partial Q} = -\frac{1}{2}m\omega q^2\frac{1}{\sin^2 Q}, \qquad \frac{\partial F_1}{\partial t} = 0. $$
Obviously, such a function exists and is given by
$$ F_1(q, Q) = \frac{1}{2}m\omega\,q^2\cot Q. $$
The transformation (19.19) thus establishes indeed a canonical transformation. This may be observed also from the fact that the rules (19.20) satisfy the symmetry conditions (19.15)
$$ \frac{\partial P}{\partial q} = \frac{m\omega q}{\sin^2 Q} = -\frac{\partial p}{\partial Q}. $$
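That the generating function $F_1(q, Q) = \frac{1}{2}m\omega q^2\cot Q$ reproduces the rules (19.20) can also be checked numerically. The sketch below differentiates $F_1$ by central differences, following $p = \partial F_1/\partial q$ and $P = -\partial F_1/\partial Q$ from (19.10), and compares with the closed expressions. The step size, the sample values, and the function names are illustrative assumptions.

```python
import math


def f1(q, Q, m, omega):
    """Generating function F1(q, Q) = (1/2) m omega q^2 cot(Q)."""
    return 0.5 * m * omega * q * q / math.tan(Q)


def momenta_from_f1(q, Q, m, omega, h=1e-6):
    """p = dF1/dq and P = -dF1/dQ, both estimated by central differences."""
    p = (f1(q + h, Q, m, omega) - f1(q - h, Q, m, omega)) / (2.0 * h)
    P = -(f1(q, Q + h, m, omega) - f1(q, Q - h, m, omega)) / (2.0 * h)
    return p, P


def momenta_exact(q, Q, m, omega):
    """Closed forms of (19.20): p = m omega q cot(Q), P = m omega q^2 / (2 sin^2(Q))."""
    p = m * omega * q / math.tan(Q)
    P = 0.5 * m * omega * q * q / math.sin(Q) ** 2
    return p, P


if __name__ == "__main__":
    print(momenta_from_f1(0.7, 1.1, 1.5, 2.0))
    print(momenta_exact(0.7, 1.1, 1.5, 2.0))
```

The check is meaningful only away from $Q = 0, \pi, \dots$, where $\cot Q$ is singular.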
With the evidence of the transformation (19.19) being canonical, it is ensured that the
transformed system (19.18) constitutes on its part a Hamiltonian system and hence
maintains the canonical form of the canonical equations. Explicitly, the canonical
equations of the transformed system are
\[ \dot Q(t) = \frac{\partial H'}{\partial P} = \omega, \quad \dot P(t) = -\frac{\partial H'}{\partial Q} = 0 . \]
These equations are thus equivalent to the original canonical equations (19.17) that
emerged from the original Hamiltonian (19.16). As H′ does not depend on Q, we
observe that the new canonical position coordinate Q is cyclic, hence that its conjugate
canonical momentum P represents a conserved quantity. The canonical equations for
Q̇(t) and Ṗ (t) can be immediately integrated, yielding
Q(t) = ωt + Q(0), P (t) = P (0).
The system’s dynamics are thus completely solved in the simplest possible manner.
Inserting the solution functions Q(t) and P (t) into the transformation rules (19.19),
we obtain the solutions in the original coordinates q(t) and p(t)

\[ q(t) = \sqrt{\frac{2P(0)}{m\omega}}\,\sin\bigl(\omega t + Q(0)\bigr), \quad p(t) = \sqrt{2m\omega P(0)}\,\cos\bigl(\omega t + Q(0)\bigr) . \]
The trigonometric functions can finally be split by means of the addition theorems.
According to (19.19), the values sin Q(0) and cos Q(0) can then be expressed in terms
of the initial conditions q(0) and p(0) of the original system
\[ q(t) = q(0)\cos\omega t + \frac{p(0)}{m\omega}\sin\omega t, \quad p(t) = -q(0)\, m\omega \sin\omega t + p(0)\cos\omega t . \]
As expected, we find the solution of the harmonic oscillator exactly in the form as we
would have obtained by a direct integration of the canonical equations (19.17).
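The three-step procedure can be illustrated numerically. The following is a minimal sketch, not part of the original text: the function names and the parameter values are illustrative assumptions, and the inverse of (19.19) is obtained here via `atan2`, which is one possible choice of branch for the angle Q.

```python
import math

def to_action_angle(q, p, m, omega):
    """Invert the rules (19.19): old coordinates (q, p) -> new coordinates (Q, P)."""
    P = (p**2 + (m * omega * q)**2) / (2.0 * m * omega)
    Q = math.atan2(m * omega * q, p)  # sin Q is proportional to q, cos Q to p
    return Q, P

def from_action_angle(Q, P, m, omega):
    """Apply the rules (19.19): new coordinates (Q, P) -> old coordinates (q, p)."""
    q = math.sqrt(2.0 * P / (m * omega)) * math.sin(Q)
    p = math.sqrt(2.0 * m * omega * P) * math.cos(Q)
    return q, p

def solve_via_transformation(q0, p0, t, m, omega):
    Q0, P0 = to_action_angle(q0, p0, m, omega)   # (i) forth transformation
    Qt, Pt = Q0 + omega * t, P0                  # (ii) trivial dynamics of H' = omega * P
    return from_action_angle(Qt, Pt, m, omega)   # (iii) back transformation

def solve_directly(q0, p0, t, m, omega):
    """Closed-form solution of (19.17) as quoted in the text."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    return q0 * c + p0 / (m * omega) * s, -q0 * m * omega * s + p0 * c

if __name__ == "__main__":
    print(solve_via_transformation(0.5, -1.2, 0.7, m=2.0, omega=3.0))
    print(solve_directly(0.5, -1.2, 0.7, m=2.0, omega=3.0))
```

Both routes should agree to rounding error for any initial condition, which is a quick way to check an implementation of the transformation rules.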
At this point, one could well argue that the overall effort needed to solve the canonical
equations (19.17) along the “detour” over the canonical transformation method is even
larger than that for the direct solution. But this is only due to the simplicity of the orig-
inal system. The example here was just chosen to demonstrate the method consisting
of three steps: (i) forth transformation of the initial conditions into a second system,
(ii) solving on that basis the dynamics of the second system, and (iii) transforming
back the obtained solution into the original system coordinates. We may depict both
alternatives by means of the following diagram:
                   solution of the canonical equations of Hamiltonian H
 (H; q(0), p(0))  ------------------------------------------------------>  (q(t), p(t))
        |                                                                        ^
        |  M(0): canonical forth transformation          M^{-1}(t): canonical    |
        |  of Hamiltonian and initial conditions         back transformation     |
        v                                                of the solution         |
 (H′; Q(0), P(0)) ------------------------------------------------------>  (Q(t), P(t))
                   solution of the canonical equations of Hamiltonian H′
In the next Example 19.4, we will show that the method to determine the dynamics
of a given system by transforming it into a second system that is easier to solve can
indeed reduce the overall effort, as compared to a direct solution of the original system.
This will become obvious with Example 21.16, where we treat the time-dependent
damped harmonic oscillator. This case has long been thought of as possessing no
analytic solution. Yet, the solution of this problem by means of a generalized canonical
transformation is fairly straightforward. The price to pay is that we must find the
appropriate generating function.
EXAMPLE
19.4 Damped Harmonic Oscillator
The Hamiltonian of the damped harmonic oscillator is explicitly time dependent
\[ H(q,p,t) = \frac{p^2}{2m}\, e^{-2\gamma t} + \frac{1}{2} m\omega^2 e^{2\gamma t} q^2 , \tag{19.21} \]
with the abbreviations 2γ = β/m and ω2 = k/m. As before, m stands for the mass
of the moving point particle, β for the friction coefficient, and k for the oscillator’s
constant. The canonical equations follow as
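\[ \dot q(t) = \frac{\partial H}{\partial p} = \frac{p(t)}{m}\, e^{-2\gamma t}, \quad -\dot p(t) = \frac{\partial H}{\partial q} = q(t)\, m\omega^2 e^{2\gamma t} . \]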
In the left-hand side equation, we see that the canonical momentum p(t) no longer
coincides with the kinetic momentum pkin(t) = m q̇(t), provided that γ ≠ 0:
\[ p(t) = m\dot q(t)\, e^{2\gamma t} = p_{\mathrm{kin}}(t)\, e^{2\gamma t} . \]
We may combine the two first-order equations into one second-order equation for q(t)
to obtain the equation of motion of the damped harmonic oscillator in its common
form
q̈ + 2γ q̇ + ω2 q = 0. (19.22)
Instead of solving this equation directly by means of an appropriate Ansatz function,
we will first map the Hamiltonian (19.21) by means of a canonical transformation into
the Hamiltonian of an undamped harmonic oscillator. In the present case, the canonical
transformation will be based on a generating function of type F2, namely
\[ F_2(q,P,t) = e^{\gamma t} qP - \frac{1}{2} m\gamma\, e^{2\gamma t} q^2 . \]
According to (19.12), the subsequent transformation rules follow as
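\begin{align*}
p &= \frac{\partial F_2}{\partial q} = e^{\gamma t} P - m\gamma\, e^{2\gamma t} q, \\
Q &= \frac{\partial F_2}{\partial P} = e^{\gamma t} q, \\
H' - H &= \frac{\partial F_2}{\partial t} = \gamma e^{\gamma t} qP - m\gamma^2 e^{2\gamma t} q^2 = \gamma QP - m\gamma^2 Q^2 .
\end{align*}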
As the new position coordinate Q solely depends on the old position coordinate q, we
are dealing here with a particular case of the general class of point transformations.
Furthermore, the relation between old and new coordinates is obviously linear. We
may thus express the transformation rules in matrix form. Solving for the old coordinates,
this yields
\[ \begin{pmatrix} q \\ p \end{pmatrix} = \begin{pmatrix} e^{-\gamma t} & 0 \\ -m\gamma e^{\gamma t} & e^{\gamma t} \end{pmatrix} \begin{pmatrix} Q \\ P \end{pmatrix} . \tag{19.23} \]
According to the rule (19.12) for the mapping of the Hamiltonians, we get the
new Hamiltonian H′(Q, P, t) by expressing the original Hamiltonian H(q, p, t)
via (19.23) in terms of the new coordinates Q, P and, moreover, by adding ∂F2/∂t:
\begin{align*}
H' &= \frac{1}{2} m^{-1} e^{-2\gamma t} \bigl( -m\gamma e^{\gamma t} Q + e^{\gamma t} P \bigr)^2 + \frac{1}{2} m\omega^2 e^{2\gamma t} e^{-2\gamma t} Q^2 + \gamma QP - m\gamma^2 Q^2 \\
&= \frac{1}{2} m^{-1} (P - m\gamma Q)^2 + \frac{1}{2} m\omega^2 Q^2 + \gamma QP - m\gamma^2 Q^2 .
\end{align*}
In the present example, we thus find a transformed Hamiltonian H′ that no longer
depends on time explicitly:
\[ H'(Q,P) = \frac{P^2}{2m} + \frac{1}{2} m\tilde\omega^2 Q^2, \quad \tilde\omega^2 = \omega^2 - \gamma^2 . \]
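The step from the expanded form of the transformed Hamiltonian to its final oscillator form can be spelled out as follows. This is only the intermediate algebra, written with the same symbols as in the text; the mixed terms in QP cancel and the γ² terms combine.

```latex
\begin{align*}
\frac{1}{2m}(P - m\gamma Q)^2 &= \frac{P^2}{2m} - \gamma QP + \frac{1}{2} m\gamma^2 Q^2 , \\
H' &= \frac{P^2}{2m} - \gamma QP + \frac{1}{2} m\gamma^2 Q^2
      + \frac{1}{2} m\omega^2 Q^2 + \gamma QP - m\gamma^2 Q^2 \\
   &= \frac{P^2}{2m} + \frac{1}{2} m\left(\omega^2 - \gamma^2\right) Q^2 .
\end{align*}
```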
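We now observe that H′ emerges as exactly the Hamiltonian of an undamped harmonic
oscillator with angular frequency ω̃ = √(ω² − γ²). Its solution is already known
from Example 19.3:
\[ \begin{pmatrix} Q(t) \\ P(t) \end{pmatrix} = \begin{pmatrix} \cos\tilde\omega t & m^{-1}\tilde\omega^{-1}\sin\tilde\omega t \\ -m\tilde\omega\sin\tilde\omega t & \cos\tilde\omega t \end{pmatrix} \begin{pmatrix} Q(0) \\ P(0) \end{pmatrix} . \tag{19.24} \]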
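The solution functions q(t) and p(t) of the damped harmonic oscillator now follow
as the product of the solution (19.24) and the canonical forth and back transformations,
given by (19.23) and its inverse:
\[ \begin{pmatrix} q(t) \\ p(t) \end{pmatrix} = \begin{pmatrix} e^{-\gamma t} & 0 \\ -m\gamma e^{\gamma t} & e^{\gamma t} \end{pmatrix} \begin{pmatrix} \cos\tilde\omega t & m^{-1}\tilde\omega^{-1}\sin\tilde\omega t \\ -m\tilde\omega\sin\tilde\omega t & \cos\tilde\omega t \end{pmatrix} \begin{pmatrix} 1 & 0 \\ m\gamma & 1 \end{pmatrix} \begin{pmatrix} q(0) \\ p(0) \end{pmatrix} . \]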
On the right-hand side, the initial conditions Q(0), P (0) of the transformed system
were expressed through those of the original system, q(0), p(0). Explicitly, according
to the inverse transformation of (19.23) at t = 0, we have Q(0) = q(0) and P (0) =
mγ q(0) + p(0). The determinants of all three matrices are unity, and hence so is the
determinant of the combined linear mapping (q(0), p(0)) → (q(t), p(t)). This is in agreement
with the requirement of Liouville’s theorem.
In the form of the product of three matrices, it becomes obvious that the solution
method via canonical transformation consists of the three steps, as sketched at the
end of Example 19.3. We may finally express the solution of the damped harmonic
oscillator (19.22) concisely by multiplying the matrices

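\[ \begin{pmatrix} q(t) \\ p(t) \end{pmatrix} = R(t) \begin{pmatrix} q(0) \\ p(0) \end{pmatrix} , \]
with
\[ R(t) = \begin{pmatrix} e^{-\gamma t}\left[ \cos\tilde\omega t + \gamma\tilde\omega^{-1}\sin\tilde\omega t \right] & e^{-\gamma t}\, m^{-1}\tilde\omega^{-1}\sin\tilde\omega t \\ -e^{\gamma t}\, m\,\omega^2\tilde\omega^{-1}\sin\tilde\omega t & e^{\gamma t}\left[ \cos\tilde\omega t - \gamma\tilde\omega^{-1}\sin\tilde\omega t \right] \end{pmatrix} , \quad \tilde\omega^2 = \omega^2 - \gamma^2 . \]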
The present example shows that the task of solving the equation of motion of a given
dynamical system can be facilitated if we succeed in representing it as the transformed
solution of another system that is easier to solve. But this works only if we can find
an appropriate generating function.
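The product of the three matrices and the closed form of R(t) can be compared numerically. The sketch below is an illustration under stated assumptions: the function names and parameter values are hypothetical, 2×2 matrices are stored as nested lists, and the weakly damped case ω > γ is assumed so that ω̃ is real.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def propagator_from_steps(t, m, omega, gamma):
    """R(t) as back transformation * undamped solution * forth transformation."""
    w = math.sqrt(omega**2 - gamma**2)  # assumes omega > gamma
    back = [[math.exp(-gamma * t), 0.0],
            [-m * gamma * math.exp(gamma * t), math.exp(gamma * t)]]  # (19.23) at time t
    flow = [[math.cos(w * t), math.sin(w * t) / (m * w)],
            [-m * w * math.sin(w * t), math.cos(w * t)]]              # (19.24)
    forth = [[1.0, 0.0], [m * gamma, 1.0]]                            # inverse of (19.23) at t = 0
    return matmul(back, matmul(flow, forth))

def propagator_closed_form(t, m, omega, gamma):
    """R(t) in the multiplied-out form given in the text."""
    w = math.sqrt(omega**2 - gamma**2)
    c, s = math.cos(w * t), math.sin(w * t)
    return [[math.exp(-gamma * t) * (c + gamma / w * s),
             math.exp(-gamma * t) * s / (m * w)],
            [-math.exp(gamma * t) * m * omega**2 / w * s,
             math.exp(gamma * t) * (c - gamma / w * s)]]

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

if __name__ == "__main__":
    R = propagator_from_steps(1.1, m=1.5, omega=2.0, gamma=0.3)
    print(R, det(R))
```

The determinant of the result should equal one up to rounding, in line with the remark on Liouville's theorem.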
EXAMPLE
19.5 Infinitesimal Time Step
We consider the particular canonical transformation that is generated by the function

\[ F_2(q_j, P_j, t) = \sum_i q_i P_i + H(q_j, p_j, t)\,\delta t . \tag{19.25} \]
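Herein, H stands for the Hamiltonian of the given dynamical system, and δt for an
infinitesimal interval on the time axis. From the general form of transformation rules
for generating functions of type F2 we obtain the particular rules for (19.25) as
\begin{align*}
p_i &= \frac{\partial F_2}{\partial q_i} = P_i + \frac{\partial H}{\partial q_i}\,\delta t = P_i - \frac{dp_i}{dt}\,\delta t, \\
Q_i &= \frac{\partial F_2}{\partial P_i} = q_i + \frac{\partial H}{\partial P_i}\,\delta t \overset{\text{1st order in } \delta t}{=} q_i + \frac{\partial H}{\partial p_i}\,\delta t = q_i + \frac{dq_i}{dt}\,\delta t, \\
H' &= H + \frac{\partial F_2}{\partial t} = H + \frac{\partial H}{\partial t}\,\delta t = H + \frac{dH}{dt}\,\delta t .
\end{align*}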
In the rightmost terms of these equations, the canonical equations were inserted.
Solving for the transformed quantities, this means
\begin{align*}
P_i &= p_i + \dot p_i\,\delta t, \\
Q_i &= q_i + \dot q_i\,\delta t, \\
H' &= H + \dot H\,\delta t .
\end{align*}
We now observe that the particular generating function (19.25) defines precisely the
canonical transformation that pushes the system ahead by an infinitesimal time step δt.
As any canonical transformation can be applied an arbitrary number of times in sequence,
we can conclude that the transformation along finite time steps is also canonical.
This is an important result: the time evolution of a Hamiltonian system constitutes
a particular canonical transformation. As already stated, the class of canonical
transformations is characterized by its property to map Hamiltonian systems into
Hamiltonian systems. It is thus ensured that a Hamiltonian system remains a Hamiltonian
system in the course of its time evolution.
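As a hedged illustration of this result, the sketch below applies the step generated by a function of the form (19.25) to the harmonic oscillator of Example 19.3. It assumes that H is evaluated at (q, P) inside the generating function, which agrees with the text to first order in δt; the function names and numerical values are illustrative choices, not taken from the source. Because the resulting map is linear, its Jacobi matrix can be read off from the images of the unit vectors.

```python
import math

def infinitesimal_step(q, p, dt, m, omega):
    """One step generated by F2 = q*P + H(q, P)*dt with H = P^2/(2m) + m*omega^2*q^2/2."""
    P = p - m * omega**2 * q * dt  # from p = P + (dH/dq) dt
    Q = q + P / m * dt             # from Q = q + (dH/dP) dt
    return Q, P

def evolve(q, p, t, n, m, omega):
    """Apply the small step n times in sequence to cover the finite time t."""
    dt = t / n
    for _ in range(n):
        q, p = infinitesimal_step(q, p, dt, m, omega)
    return q, p

def jacobian_determinant(t, n, m, omega):
    """Determinant of the (linear) map (q, p) -> (Q, P) after n steps."""
    a11, a21 = evolve(1.0, 0.0, t, n, m, omega)
    a12, a22 = evolve(0.0, 1.0, t, n, m, omega)
    return a11 * a22 - a12 * a21

if __name__ == "__main__":
    print(jacobian_determinant(1.0, 20000, m=1.0, omega=2.0))
    print(evolve(1.0, 0.0, 1.0, 20000, m=1.0, omega=2.0), (math.cos(2.0), -2.0 * math.sin(2.0)))
```

The determinant stays at one for a single step as well as for the composed map, while the composed map approaches the exact solution as the step size shrinks.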
EXAMPLE
19.6 General Form of Liouville’s Theorem
With the theory of canonical transformations at hand, we may cast Liouville’s theorem
into the following general form: the volume element dV = dq1 . . . dqn dp1 . . . dpn of
a Hamiltonian system with n degrees of freedom is invariant with respect to canonical
transformations,
\[ dQ_1 \ldots dQ_n\, dP_1 \ldots dP_n \overset{\text{can. transf.}}{=} dq_1 \ldots dq_n\, dp_1 \ldots dp_n . \]
For general transformations of the system’s coordinates, the transformation of the

volume element dV is determined by the determinant D of its Jacobi matrix
\[ dQ_1 \ldots dQ_n\, dP_1 \ldots dP_n = D\, dq_1 \ldots dq_n\, dp_1 \ldots dp_n , \quad D = \frac{\partial(Q_1,\ldots,Q_n,P_1,\ldots,P_n)}{\partial(q_1,\ldots,q_n,p_1,\ldots,p_n)} . \]
Liouville’s theorem thus states that the determinant D of the transformation’s Jacobi
matrix is unity in case that the transformation is canonical.
For the sake of transparency, we first prove Liouville’s theorem for the case of a
system with one degree of freedom, i.e., for n = 1. For such a system, the determinant
D of the Jacobi matrix emerging from a transition from old coordinates q, p to new
coordinates Q = Q(q, p), P = P (q, p) is given by
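\[ D = \frac{\partial(Q,P)}{\partial(q,p)} = \frac{\partial Q}{\partial q}\frac{\partial P}{\partial p} - \frac{\partial Q}{\partial p}\frac{\partial P}{\partial q} . \]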
According to the general rule for the partial derivatives of the inverse functions q =
q(Q, P ), p = p(Q, P ) we have
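\[ \frac{\partial p}{\partial P} = \frac{1}{D}\frac{\partial Q}{\partial q} , \]
which means that the determinant D of the Jacobi matrix can be expressed as
\[ D = \frac{\partial Q}{\partial q} \left( \frac{\partial p}{\partial P} \right)^{-1} . \tag{19.26} \]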
In the particular case of a canonical transformation, the transformation rules can be
derived from a generating function F2 (q, P , t), as stated in (19.12),
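\[ Q = \frac{\partial F_2}{\partial P}, \quad p = \frac{\partial F_2}{\partial q} . \]
Inserting the expressions for Q and p, the determinant D is equivalently expressed as
\[ D = \frac{\partial^2 F_2}{\partial q\,\partial P} \left( \frac{\partial^2 F_2}{\partial P\,\partial q} \right)^{-1} = 1 . \]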
But this equals unity as the partial derivatives may be interchanged.
The proof for the general case of systems with n degrees of freedom is worked out
analogously. In the case of a Hamiltonian system, the determinant D of the Jacobi
matrix that is associated with a general transformation of the system’s coordinates has
an even number of rows (columns). We assume the transformation to be invertible.
Then, we may express the new position coordinates Qi = Qi(qj, pj) as functions
of the old coordinates, and the old momenta as functions of the new coordinates,
pi = pi(Qj, Pj). The determinant of the associated Jacobi matrix is then represented
by
\[ D = \frac{\partial(Q_1,\ldots,Q_n)}{\partial(q_1,\ldots,q_n)} \left[ \frac{\partial(p_1,\ldots,p_n)}{\partial(P_1,\ldots,P_n)} \right]^{-1} , \tag{19.27} \]
which generalizes the relation (19.26). Provided that a generating function
F2(qj, Pj, t) exists, the transformation is referred to as canonical, and the transformation
rules are given by (19.12). Inserting Qi and pi yields
\[ D = \left| \frac{\partial^2 F_2}{\partial q_j\,\partial P_i} \right| \, \left| \frac{\partial^2 F_2}{\partial P_j\,\partial q_i} \right|^{-1} = 1 , \]
which is again unity as we may interchange the sequence of partial derivatives and
due to the fact that determinants of transposed matrices coincide.
We finally remark that the generating function F2 used here in this proof is com-
pletely equivalent to the other types of generating function. For, the determinant
(19.27) of the transformation’s Jacobi matrix has the equivalent representations

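\begin{align*}
D &= (-1)^n\, \frac{\partial(P_1,\ldots,P_n)}{\partial(q_1,\ldots,q_n)} \left[ \frac{\partial(p_1,\ldots,p_n)}{\partial(Q_1,\ldots,Q_n)} \right]^{-1} \\
  &= \frac{\partial(P_1,\ldots,P_n)}{\partial(p_1,\ldots,p_n)} \left[ \frac{\partial(q_1,\ldots,q_n)}{\partial(Q_1,\ldots,Q_n)} \right]^{-1} \\
  &= (-1)^n\, \frac{\partial(Q_1,\ldots,Q_n)}{\partial(p_1,\ldots,p_n)} \left[ \frac{\partial(q_1,\ldots,q_n)}{\partial(P_1,\ldots,P_n)} \right]^{-1} .
\end{align*}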
The result D = 1 for a canonical transformation then follows in the same way as above
by inserting the rules into the appropriate generating function F1 , F3 , or F4 .
With the result of Example 19.5, we know that the time evolution of a Hamiltonian
system can be conceived as a particular canonical transformation whose generating
function is based on the Hamiltonian H . This yields the more special version of Liou-
ville’s theorem from Chap. 18, where it was stated that the volume element dV of a
Hamiltonian system is invariant in the course of the system’s time evolution.
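The statement D = 1 can be checked numerically for a concrete case. The sketch below is an assumption-laden illustration: it uses the inverse of the oscillator transformation (19.19) as the canonical example, a simple coordinate stretch as a non-canonical counterexample, and central finite differences with a hypothetical step size h to estimate the Jacobi matrix.

```python
import math

def oscillator_map(q, p, m=2.0, omega=3.0):
    """(q, p) -> (Q, P): inverse of the canonical transformation (19.19)."""
    P = (p**2 + (m * omega * q)**2) / (2.0 * m * omega)
    Q = math.atan2(m * omega * q, p)
    return Q, P

def scaling_map(q, p):
    """(q, p) -> (2q, p): an invertible but non-canonical transformation."""
    return 2.0 * q, p

def jacobian_determinant(transform, q, p, h=1e-6):
    """D = dQ/dq * dP/dp - dQ/dp * dP/dq, estimated by central differences."""
    Q_qp, P_qp = transform(q + h, p)
    Q_qm, P_qm = transform(q - h, p)
    Q_pp, P_pp = transform(q, p + h)
    Q_pm, P_pm = transform(q, p - h)
    dQdq, dPdq = (Q_qp - Q_qm) / (2 * h), (P_qp - P_qm) / (2 * h)
    dQdp, dPdp = (Q_pp - Q_pm) / (2 * h), (P_pp - P_pm) / (2 * h)
    return dQdq * dPdp - dQdp * dPdq

if __name__ == "__main__":
    print(jacobian_determinant(oscillator_map, 0.7, -0.4))  # close to 1
    print(jacobian_determinant(scaling_map, 0.7, -0.4))     # close to 2
```

Only the canonical map preserves the phase-space area element; the stretch changes it by the factor two.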
EXAMPLE
19.7 Canonical Invariance of the Poisson Brackets
For a Hamiltonian system H(qj, pj, t) of n degrees of freedom, and for two differentiable
functions F(qj, pj, t), G(qj, pj, t) of the canonical variables and time t, the
Poisson bracket¹ of F and G is defined by
\[ [F,G] = \sum_{i=1}^{n} \left( \frac{\partial F}{\partial q_i}\frac{\partial G}{\partial p_i} - \frac{\partial F}{\partial p_i}\frac{\partial G}{\partial q_i} \right) . \tag{19.28} \]
A special case is established if we set up the Poisson brackets of the canonical vari-
ables qi and pi . As these variables are required to not depend on each other, we im-
mediately get
[qi , qj ] = 0, [pi , pj ] = 0, [qi , pj ] = δij . (19.29)
We first convince ourselves that the same relations hold for canonically transformed
coordinates Qi and Pi , hence that the fundamental Poisson brackets (19.29) are in-
variant under canonical transformations. Making use of the relations (19.15), we find
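\begin{align*}
[Q_i, Q_j] &= \sum_{k=1}^{n} \left( \frac{\partial Q_i}{\partial q_k}\frac{\partial Q_j}{\partial p_k} - \frac{\partial Q_i}{\partial p_k}\frac{\partial Q_j}{\partial q_k} \right) \\
&= \sum_{k=1}^{n} \left( \frac{\partial p_k}{\partial P_i}\frac{\partial Q_j}{\partial p_k} + \frac{\partial q_k}{\partial P_i}\frac{\partial Q_j}{\partial q_k} \right) = \frac{\partial Q_j}{\partial P_i} = 0, \\
[P_i, P_j] &= \sum_{k=1}^{n} \left( \frac{\partial P_i}{\partial q_k}\frac{\partial P_j}{\partial p_k} - \frac{\partial P_i}{\partial p_k}\frac{\partial P_j}{\partial q_k} \right) \\
&= \sum_{k=1}^{n} \left( -\frac{\partial p_k}{\partial Q_i}\frac{\partial P_j}{\partial p_k} - \frac{\partial q_k}{\partial Q_i}\frac{\partial P_j}{\partial q_k} \right) = -\frac{\partial P_j}{\partial Q_i} = 0, \tag{19.30} \\
[Q_i, P_j] &= \sum_{k=1}^{n} \left( \frac{\partial Q_i}{\partial q_k}\frac{\partial P_j}{\partial p_k} - \frac{\partial Q_i}{\partial p_k}\frac{\partial P_j}{\partial q_k} \right) \\
&= \sum_{k=1}^{n} \left( \frac{\partial p_k}{\partial P_i}\frac{\partial P_j}{\partial p_k} + \frac{\partial q_k}{\partial P_i}\frac{\partial P_j}{\partial q_k} \right) = \frac{\partial P_j}{\partial P_i} = \delta_{ij} .
\end{align*}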
1 Siméon Denis Poisson, French mathematician and physicist, b. June 21, 1781, Pithiviers, France–
d. April 25, 1840, Paris, France. Descending from a simple social background (his father was a
soldier), Poisson had good teachers who recognized his extraordinary gifts and made it possible for
him to begin studies at the École Polytechnique in Paris in 1798. There, his mathematical talents
were recognized by Laplace and Lagrange. Poisson became an assistant professor, and, in 1806, a
full professor at the École Polytechnique, where he energetically worked to improve teaching and the
formation of students.
His research initially was focused on the theory of ordinary and partial differential equations,
which he applied to many different physical problems. Thus, Poisson developed further the mechanics
of Laplace and Lagrange, and studied problems related to the propagation of sound, elasticity, and
static electricity. He later turned his interests towards the theory of probabilities, and recognized the
seminal nature of the Law of Large Numbers.
Many ideas and concepts are named after Poisson, such as the Poisson equation in potential theory,
the Poisson bracket of mechanics, the Poisson ratio in elasticity, and the Poisson distribution in
statistics.

We are now prepared to show that the Poisson bracket of two arbitrary functions
F(qj, pj, t) and G(qj, pj, t) establishes likewise a canonical invariant. The time t
as the common independent variable of both the original and the transformed system
is not transformed. We may thus restrict ourselves to the nested mapping

\[ F(q_j, p_j) = F\bigl( Q_k(q_j, p_j),\, P_k(q_j, p_j) \bigr), \quad G(q_j, p_j) = G\bigl( Q_k(q_j, p_j),\, P_k(q_j, p_j) \bigr) . \]
Applying the chain rule, one finds

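\begin{align*}
[F,G]_{q,p} = \sum_k \sum_i \sum_j \Biggl[ & \left( \frac{\partial F}{\partial Q_i}\frac{\partial Q_i}{\partial q_k} + \frac{\partial F}{\partial P_i}\frac{\partial P_i}{\partial q_k} \right) \left( \frac{\partial G}{\partial Q_j}\frac{\partial Q_j}{\partial p_k} + \frac{\partial G}{\partial P_j}\frac{\partial P_j}{\partial p_k} \right) \\
- & \left( \frac{\partial F}{\partial Q_i}\frac{\partial Q_i}{\partial p_k} + \frac{\partial F}{\partial P_i}\frac{\partial P_i}{\partial p_k} \right) \left( \frac{\partial G}{\partial Q_j}\frac{\partial Q_j}{\partial q_k} + \frac{\partial G}{\partial P_j}\frac{\partial P_j}{\partial q_k} \right) \Biggr] .
\end{align*}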
Multiplying and recollecting the terms for Poisson brackets with respect to the coor-
dinates Qi , Pj yields the equivalent expression
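\begin{align*}
[F,G]_{q,p} = \sum_i \sum_j \Biggl( & \frac{\partial F}{\partial Q_i}\frac{\partial G}{\partial Q_j}\,[Q_i, Q_j] + \frac{\partial F}{\partial P_i}\frac{\partial G}{\partial P_j}\,[P_i, P_j] \\
+ & \frac{\partial F}{\partial Q_i}\frac{\partial G}{\partial P_j}\,[Q_i, P_j] - \frac{\partial F}{\partial P_i}\frac{\partial G}{\partial Q_j}\,[Q_j, P_i] \Biggr) . \tag{19.31}
\end{align*}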
Equation (19.31) holds for any invertible coordinate transformation. In the particular
case that the transformation is canonical, then in addition the relations (19.30) for the
fundamental Poisson brackets apply. In that case, (19.31) simplifies to
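\[ [F,G]_{q,p} = \sum_i \sum_j \left( \frac{\partial F}{\partial Q_i}\frac{\partial G}{\partial P_j} - \frac{\partial F}{\partial P_i}\frac{\partial G}{\partial Q_j} \right) \delta_{ij} = [F,G]_{Q,P} . \]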
The Poisson bracket [F, G] is thus uniquely determined by functions F and G and
independent from the underlying coordinate system, provided that a transformation of
the coordinate system is canonical.
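The invariance can be tested numerically for one degree of freedom. In the sketch below, the two sample functions F and G, the parameter values, and the finite-difference step are illustrative assumptions; the canonical transformation is the oscillator map (19.19) from Example 19.3. The bracket is evaluated once in the old pair (q, p) and once in the new pair (Q, P).

```python
import math

M, OMEGA = 2.0, 3.0  # illustrative parameter values

def old_from_new(Q, P):
    """Transformation rules (19.19)."""
    q = math.sqrt(2.0 * P / (M * OMEGA)) * math.sin(Q)
    p = math.sqrt(2.0 * M * OMEGA * P) * math.cos(Q)
    return q, p

def new_from_old(q, p):
    """Inverse of (19.19)."""
    P = (p**2 + (M * OMEGA * q)**2) / (2.0 * M * OMEGA)
    return math.atan2(M * OMEGA * q, p), P

def F(q, p):
    return q * q * p   # arbitrary sample function

def G(q, p):
    return q + p**3    # arbitrary sample function

def bracket(f, g, x, y, h=1e-5):
    """Poisson bracket of f and g with respect to the pair (x, y), by central differences."""
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    gx = (g(x + h, y) - g(x - h, y)) / (2 * h)
    gy = (g(x, y + h) - g(x, y - h)) / (2 * h)
    return fx * gy - fy * gx

def bracket_old(q, p):
    return bracket(F, G, q, p)

def bracket_new(q, p):
    Q, P = new_from_old(q, p)
    F_new = lambda Q_, P_: F(*old_from_new(Q_, P_))
    G_new = lambda Q_, P_: G(*old_from_new(Q_, P_))
    return bracket(F_new, G_new, Q, P)

if __name__ == "__main__":
    print(bracket_old(0.7, -0.4), bracket_new(0.7, -0.4))
```

For these sample functions the bracket in the old coordinates is 6qp³ − q², so both numbers can also be compared with that closed form.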
EXAMPLE
19.8 Poisson’s Theorem
Poisson’s theorem embodies an important benefit of the Poisson bracket formalism: if

two invariants I1 and I2 of a given dynamical system are known, then it is possible to
directly construct a third invariant I3 . In order to demonstrate this, we first derive the
general rule for the total time derivative of a Poisson bracket

\[ \frac{d}{dt}[F,G] = \left[ \frac{dF}{dt}, G \right] + \left[ F, \frac{dG}{dt} \right] . \]
The proof is easily worked out by directly calculating the total time derivative of the
Poisson bracket’s definition from (19.28)
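\begin{align*}
\frac{d}{dt}[F,G] &= \sum_{i=1}^{n} \left[ \frac{\partial F}{\partial q_i}\frac{d}{dt}\frac{\partial G}{\partial p_i} + \frac{\partial G}{\partial p_i}\frac{d}{dt}\frac{\partial F}{\partial q_i} - \frac{\partial F}{\partial p_i}\frac{d}{dt}\frac{\partial G}{\partial q_i} - \frac{\partial G}{\partial q_i}\frac{d}{dt}\frac{\partial F}{\partial p_i} \right] \\
&= \sum_{i=1}^{n} \Biggl[ \frac{\partial F}{\partial q_i}\frac{\partial}{\partial p_i} \Bigl( \sum_{j=1}^{n} \Bigl[ \frac{\partial G}{\partial q_j}\frac{dq_j}{dt} + \frac{\partial G}{\partial p_j}\frac{dp_j}{dt} \Bigr] + \frac{\partial G}{\partial t} \Bigr) \\
&\qquad - \frac{\partial G}{\partial q_i}\frac{\partial}{\partial p_i} \Bigl( \sum_{j=1}^{n} \Bigl[ \frac{\partial F}{\partial q_j}\frac{dq_j}{dt} + \frac{\partial F}{\partial p_j}\frac{dp_j}{dt} \Bigr] + \frac{\partial F}{\partial t} \Bigr) \\
&\qquad + \frac{\partial G}{\partial p_i}\frac{\partial}{\partial q_i} \Bigl( \sum_{j=1}^{n} \Bigl[ \frac{\partial F}{\partial q_j}\frac{dq_j}{dt} + \frac{\partial F}{\partial p_j}\frac{dp_j}{dt} \Bigr] + \frac{\partial F}{\partial t} \Bigr) \\
&\qquad - \frac{\partial F}{\partial p_i}\frac{\partial}{\partial q_i} \Bigl( \sum_{j=1}^{n} \Bigl[ \frac{\partial G}{\partial q_j}\frac{dq_j}{dt} + \frac{\partial G}{\partial p_j}\frac{dp_j}{dt} \Bigr] + \frac{\partial G}{\partial t} \Bigr) \Biggr] \\
&= \sum_{i=1}^{n} \left[ \frac{\partial F}{\partial q_i}\frac{\partial}{\partial p_i}\frac{dG}{dt} - \frac{\partial G}{\partial q_i}\frac{\partial}{\partial p_i}\frac{dF}{dt} + \frac{\partial G}{\partial p_i}\frac{\partial}{\partial q_i}\frac{dF}{dt} - \frac{\partial F}{\partial p_i}\frac{\partial}{\partial q_i}\frac{dG}{dt} \right] \\
&= \left[ F, \frac{dG}{dt} \right] + \left[ \frac{dF}{dt}, G \right] .
\end{align*}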
If both I1 ≡ F as well as I2 ≡ G are invariants of motion, i.e., if dF /dt ≡ 0 and
dG/dt ≡ 0, we conclude
\[ \frac{d}{dt}[F,G] \equiv 0 . \tag{19.32} \]
With I3 ≡ [F, G] we have then found another, possibly trivial, invariant of the system.
We remark that Poisson’s theorem in the form of (19.32) only applies for invariants
F and G, whose total time derivatives vanish identically.
In case that dF /dt = 0 and dG/dt = 0 represent only implicit functions, we cannot
infer that the Poisson brackets [dF /dt, G] and [F, dG/dt] vanish. The reason is that
the construction of a Poisson bracket does not constitute an algebraic but an analytic
operation. In the latter case, we must impose the stronger condition that the partial
derivatives of dF /dt and dG/dt with respect to the qi and the pi all vanish

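\[ \frac{\partial}{\partial q_i}\frac{dF}{dt} = 0, \quad \frac{\partial}{\partial p_i}\frac{dF}{dt} = 0 \quad \Rightarrow \quad \left[ \frac{dF}{dt}, G \right] = 0 . \]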
EXAMPLE
19.9 Invariants of the Plane Kepler System
The Hamiltonian for a plane and time-independent Kepler system is given by
\[ H(q_j, p_j) = \frac{1}{2}\left( p_1^2 + p_2^2 \right) - \frac{\mu}{r}, \quad r = \sqrt{q_1^2 + q_2^2}, \quad \mu = G(m_1 + m_2) . \]
Herein, G denotes the gravitational constant, and m1, m2 the masses of the respective
bodies. The canonical equations are obtained as
\[ \dot q_i = \frac{\partial H}{\partial p_i} = p_i, \quad \dot p_i = -\frac{\partial H}{\partial q_i} = -\mu\,\frac{q_i}{r^3} . \]
The angular momentum D = q1 p2 − q2 p1 constitutes an invariant of all systems with

central force fields. We verify this by directly calculating the time derivative of D and
subsequently inserting the canonical equations
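\begin{align*}
\frac{dD}{dt} &= q_1\dot p_2 + \dot q_1 p_2 - q_2\dot p_1 - \dot q_2 p_1 \\
&= -\frac{\mu}{r^3}\, q_1 q_2 + p_1 p_2 + \frac{\mu}{r^3}\, q_2 q_1 - p_2 p_1 \\
&\equiv 0 .
\end{align*}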
Another invariant of this system is given by
\[ R_1 = q_1 p_2^2 - q_2 p_1 p_2 - \mu\,\frac{q_1}{r} . \]
We convince ourselves of this fact again by direct calculation of the time derivative
of R1
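\begin{align*}
\frac{dR_1}{dt} &= \dot q_1 p_2^2 + 2 q_1 p_2 \dot p_2 - \dot q_2 p_1 p_2 - q_2 \dot p_1 p_2 - q_2 p_1 \dot p_2 - \mu\,\frac{\dot q_1}{r} + \mu\,\frac{q_1}{r^3}\,(q_1\dot q_1 + q_2\dot q_2) \\
&= p_1 p_2^2 - 2\frac{\mu}{r^3}\, q_1 q_2 p_2 - p_1 p_2^2 + \frac{\mu}{r^3}\, q_1 q_2 p_2 + \frac{\mu}{r^3}\, q_2^2 p_1 - \frac{\mu}{r^3}\, p_1 \left( q_1^2 + q_2^2 \right) + \frac{\mu}{r^3}\, q_1 (q_1 p_1 + q_2 p_2) \\
&\equiv 0 .
\end{align*}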
According to Poisson’s theorem, the function R2 = [D, R1 ] then represents another

invariant of the system
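\begin{align*}
R_2 \overset{\text{Def}}{=} [D, R_1] &= \frac{\partial D}{\partial q_1}\frac{\partial R_1}{\partial p_1} - \frac{\partial D}{\partial p_1}\frac{\partial R_1}{\partial q_1} + \frac{\partial D}{\partial q_2}\frac{\partial R_1}{\partial p_2} - \frac{\partial D}{\partial p_2}\frac{\partial R_1}{\partial q_2} \\
&= -q_2 p_2^2 + q_2 \left( p_2^2 - \frac{\mu}{r} + \mu\,\frac{q_1^2}{r^3} \right) - p_1 (2 q_1 p_2 - q_2 p_1) + q_1 \left( p_1 p_2 - \frac{\mu}{r^3}\, q_1 q_2 \right) \\
&= q_2 p_1^2 - q_1 p_1 p_2 - \mu\,\frac{q_2}{r} .
\end{align*}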
We can prove this easily by directly calculating dR2/dt. The invariants R1 and R2
constitute the components of the Runge–Lenz² vector. We will get back to the Runge–Lenz
vector in Example 21.21.
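These statements can be cross-checked numerically. The sketch below rests on two assumptions that are not part of the example itself: the partial derivatives in the Poisson bracket (19.28) are estimated by central differences, and conservation is tested through the standard relation dF/dt = [F, H] for a function F without explicit time dependence. The value μ = 1 and the sample phase-space point are arbitrary choices.

```python
import math

MU = 1.0  # illustrative value of mu = G (m1 + m2)

def hamiltonian(s):
    q1, q2, p1, p2 = s
    return 0.5 * (p1 * p1 + p2 * p2) - MU / math.hypot(q1, q2)

def angular_momentum(s):
    q1, q2, p1, p2 = s
    return q1 * p2 - q2 * p1

def runge_lenz_1(s):
    q1, q2, p1, p2 = s
    return q1 * p2**2 - q2 * p1 * p2 - MU * q1 / math.hypot(q1, q2)

def runge_lenz_2(s):
    q1, q2, p1, p2 = s
    return q2 * p1**2 - q1 * p1 * p2 - MU * q2 / math.hypot(q1, q2)

def partial(f, s, k, h=1e-5):
    """Central-difference estimate of the derivative of f along component k of s."""
    up, dn = list(s), list(s)
    up[k] += h
    dn[k] -= h
    return (f(up) - f(dn)) / (2 * h)

def poisson_bracket(f, g, s, n=2):
    """[f, g] as in (19.28); s = (q_1, ..., q_n, p_1, ..., p_n)."""
    return sum(partial(f, s, i) * partial(g, s, n + i)
               - partial(f, s, n + i) * partial(g, s, i) for i in range(n))

if __name__ == "__main__":
    s = [1.0, 0.5, -0.3, 0.8]
    print(poisson_bracket(angular_momentum, hamiltonian, s))  # close to 0
    print(poisson_bracket(runge_lenz_1, hamiltonian, s))      # close to 0
    print(poisson_bracket(angular_momentum, runge_lenz_1, s), runge_lenz_2(s))
```

The last line prints [D, R1] next to R2 evaluated at the same point; the two values should coincide up to the finite-difference error.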
2 Carl David Tolmé Runge, German mathematician and physicist, b. August 30, 1856, Bremen,
Germany–d. January 3, 1927, Göttingen, Germany. Runge came from a family of merchants and
grew up in Havana and Bremen. He took up studies of literature at Munich, but soon switched to
mathematics and physics. As a student in Munich, he met Max Planck, which was the beginning of
a lifelong friendship. Runge finished his studies with a thesis on differential geometry, supervised
by Weierstrass, and became a professor of mathematics in Hanover in 1886. In 1906, he took up a
professorship in Göttingen. Runge worked on the numerical solution of equations—the Runge–Kutta
method for the solution of differential equations is named after him—and on spectroscopy. He did
spectroscopical measurements himself and contributed eminently to the understanding of the spectral
series of various atoms. Runge applied his results to the new field of the analysis of stellar spectra.
In a textbook on vector analysis, Runge described the derivation, originally found by Gibbs, of a
conserved quantity of the Kepler problem. This discussion was then referred to by Wilhelm Lenz in his
early quantum mechanical treatment of the hydrogen atom. The corresponding conserved quantity
has become known as the Runge–Lenz vector.
Wilhelm Lenz, German physicist, b. February 8, 1888, Frankfurt am Main, Germany–d. April
30, 1957, Hamburg, Germany. Lenz attended the same school in Frankfurt as Otto Hahn, and took
up studies of mathematics and physics in Göttingen in 1906. He obtained his Ph.D. in 1911 with
Arnold Sommerfeld in Munich and became Sommerfeld’s assistant. In 1921, Lenz became professor
of theoretical physics in Hamburg. Among his students and assistants in Hamburg were Pascual
Jordan, Wolfgang Pauli, and Hans Jensen, who was awarded the Nobel Prize in physics in 1963 for
the development of the shell model of the atomic nucleus. Lenz’ contributions to the early quantum
mechanics of hydrogen-like atoms renewed interest in the Runge–Lenz vector, which, actually, had
been known long before. A simple model for the description of ferromagnets developed by Lenz and
proposed as a thesis topic to one of his students is well known today by the name of the student: the
Ising model.
20 Hamilton–Jacobi Theory
In the preceding chapter, we tried to perform a transformation to coordinate pairs

(qi , pi = βi ) for which the canonical momenta were constant. We now proceed one
step further and look for a canonical transformation to coordinates Pi = pi0 and
Qi = qi0 which all are constant and are given by the initial conditions. When we have
found such coordinates, the transformation equations are the solutions of the system
in the normal position coordinates:
\[ q_i = q_i(q_{i0}, p_{i0}, t), \quad p_i = p_i(q_{i0}, p_{i0}, t) . \]
The coordinates (Pi, Qi) obey the Hamilton equations with the Hamiltonian
H′(Qi, Pi, t). Since the time derivatives vanish by definition, we have
\[ \dot P_i = 0 = -\frac{\partial H'}{\partial Q_i}, \quad \dot Q_i = 0 = \frac{\partial H'}{\partial P_i} . \tag{20.1} \]
These conditions would certainly be fulfilled by the function H′ ≡ 0. In order to
perform the coordinate transformation, we need a generating function. For historical
reasons (Jacobi made this choice) we adopt among the four possible types the
type F2 = S(qi, Pi, t), which already has been treated in the preceding chapter. It is
generally known as the Hamilton action function. For this choice the equations (19.12)
hold. We now require that the new Hamiltonian shall identically vanish. Then
\[ \frac{\partial S}{\partial t} + H\left( q_1, \ldots, q_n;\; p_1 = \frac{\partial S}{\partial q_1}, \ldots, p_n = \frac{\partial S}{\partial q_n};\; t \right) = 0 . \tag{20.2} \]
Writing down this equation with the arguments, we obtain
\[ \frac{\partial S(q_i, P_i = \beta_i, t)}{\partial t} + H\left( q_1, \ldots, q_n;\; \frac{\partial S}{\partial q_1}, \ldots, \frac{\partial S}{\partial q_n};\; t \right) = 0 . \tag{20.3} \]
This is the Hamilton–Jacobi differential equation.¹ The Pi denote constants that, as
noted above, are fixed by the initial conditions pi0. By means of this differential equation
we can determine S. We note that this differential equation is a nonlinear partial
differential equation of first order with n + 1 variables qi, t. It is nonlinear, since H
depends quadratically on the momenta that enter as derivatives of the action function
with respect to the position coordinates. There appear only first derivatives with
respect to the qi and the time.

1 Carl Gustav Jacob Jacobi, b. Dec. 18, 1804, Potsdam, son of a banker–d. Feb. 18, 1851, Berlin.
After his studies (1824), Jacobi became a lecturer in Berlin and from 1827 to 1842 held a chair as a
professor in Königsberg (now: Kaliningrad). After an extended travel through Italy to restore his weak
health, Jacobi lived in Berlin. Jacobi became known for his work Fundamenta Nova Theoriae Functionum
Ellipticarum (1829). In 1832, Jacobi discovered that hyperelliptic functions can be inverted by
functions of several variables. Jacobi also made fundamental contributions to algebra, to elimination
theory, and to the theory of partial differential equations, e.g., in his Lectures on Dynamics (1842 to
1843), published in 1866.

To get the action function S, we have to integrate the differential equation n + 1
times (each derivative ∂S/∂qi, ∂S/∂t requires one integration), and we thus obtain
n + 1 integration constants. But since S appears in the differential equation only as a
derivative, S is determined only up to a constant a; i.e., S′ = S + a. This means that
one of the n + 1 integration constants must be a constant additive to S. It is, however,
not essential for the transformation. We thus obtain as a solution function
\[ S = S(q_1, \ldots, q_n;\, \beta_1, \ldots, \beta_n;\, t) , \]
where the βi are integration constants. A comparison with (19.12) leads to the requirements
\[ P_i = \beta_i; \quad Q_i = \frac{\partial S}{\partial P_i} = \frac{\partial S(q_1, \ldots, q_n;\, \beta_1, \ldots, \beta_n;\, t)}{\partial \beta_i} = \alpha_i . \tag{20.4} \]
The βi , αi can be determined from the initial conditions.
The original coordinates result from the transformation equations (19.12) as fol-
lows: From
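\[ \alpha_i = \frac{\partial S(q_j, \beta_j, t)}{\partial \beta_i} \]
follow the position coordinates
\[ q_i = q_i(\alpha_j, \beta_j, t) . \]
Insertion into
\[ p_i = \frac{\partial S(q_j, P_j, t)}{\partial q_i} = p_i(q_i, \beta_i, t) \]
finally yields
\[ p_i = p_i(\alpha_i, \beta_i, t) . \]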
Now the qi (αj , βj , t) and pi (αj , βj , t) are known as functions of the time and
of the integration constants αj , βj . This simply means the complete solution of the
many-body problem characterized by the Hamiltonian H (qi , pi , t).
We can separate off the time dependence in S. If H is not an explicit function of
the time, H represents the total energy of the system:
\[ -\frac{\partial S}{\partial t} = H = E . \tag{20.5} \]
From this, it follows that S can be represented as
\[ S(q_i, P_i, t) = S_0(q_i, P_i) - Et . \]
To explain the meaning of S, we form the total derivative of S with respect to time:
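\[ \frac{dS}{dt} = \sum_i \frac{\partial S}{\partial q_i}\,\dot q_i + \sum_i \frac{\partial S}{\partial P_i}\,\dot P_i + \frac{\partial S}{\partial t} . \]
But, since Ṗi = 0, we have
\[ \frac{dS(q_i, P_i = \beta_i, t)}{dt} = \sum_i \frac{\partial S}{\partial q_i}\,\dot q_i + \frac{\partial S}{\partial t} . \]
Because
\[ \frac{\partial S(q_j, P_j = \beta_j, t)}{\partial q_i} = p_i \quad \text{and} \quad \frac{\partial S}{\partial t} = -H , \]
it further follows that
\[ \frac{dS(q_i, P_i(p_\alpha, q_\alpha), t)}{dt} = \sum_i p_i \dot q_i - H(q_i, p_i, t) = L(q_i, p_i, t) . \tag{20.6} \]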
H and L are not bound by restrictions; in particular they can be time-dependent. This
means that S is given by the time integral over the Lagrangian:

\[ S = \int L\, dt + \text{constant} . \tag{20.7} \]
Since this integral physically represents an action (energy · time), the term action
function for S is obvious. The action function differs from the time integral over the
Lagrangian by at most an additive constant. However, this last relation cannot be used
for a practical calculation, since as long as the problem is not yet solved, one does not
know L as a function of time. Moreover, L(qi , pi , t) in (20.6) depends on the original
coordinates qi , pi , while the S-function is needed in the coordinates qi , Pi (qα , pα ).
Equation (20.7) is not unknown to us: The action function S turned up before when
formulating the Hamilton principle (18.25). Before further continuing this discussion,
we will illustrate the Hamilton–Jacobi method by an example.
EXAMPLE
20.1 The Hamilton–Jacobi Differential Equation
We start again with the harmonic oscillator. The Hamiltonian is
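\[ H = \frac{p^2}{2m} + \frac{k}{2}\, q^2 . \]
The Hamilton action function then has the form (compare (19.12) and (20.3))
\[ S = S(q, P, t) \quad \text{and} \quad p = \frac{\partial S}{\partial q} . \]
From this, we obtain the Hamilton–Jacobi differential equation:
\[ \frac{\partial S}{\partial t} + \frac{1}{2m}\left( \frac{\partial S}{\partial q} \right)^2 + \frac{k}{2}\, q^2 = 0 . \]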
For solving the problem, we make a separation ansatz into a space and a time
variable. A product ansatz would not work here, since the differential equation is not
linear. We therefore set a sum:
S = S1 (t) + S2 (q).
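For the partial derivatives, we then get
\[ \frac{\partial S}{\partial q} = \frac{dS_2(q)}{dq}, \quad \frac{\partial S}{\partial t} = \frac{dS_1(t)}{dt} . \]
This leads to
\[ -\dot S_1(t) = \frac{1}{2m}\left( \frac{dS_2(q)}{dq} \right)^2 + \frac{k}{2}\, q^2 = \beta , \]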
where β is the separation constant. (The left-hand side depends only on the time t,
the right-hand side only on the coordinate q: Therefore, both sides can only be equal
if they are equal to a common constant β.) For the time-dependent function, we then
have
Ṡ1 (t) = −β,
which leads to
S1 (t) = −βt.
For the space-dependent part, there remains the following equation:

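\[ \frac{1}{2m}\left( \frac{dS_2(q)}{dq} \right)^2 + \frac{k}{2}\, q^2 = \beta, \quad \frac{dS_2}{dq} = \sqrt{2m\beta - mkq^2} . \]
As sum of the two parts, we then obtain for S
\[ S(q, \beta, t) = \sqrt{mk} \int \sqrt{\frac{2\beta}{k} - q^2}\; dq - \beta t . \]
For the constant Q ≡ α, we then have
\[ \alpha = Q = \frac{\partial S}{\partial \beta} = \frac{\sqrt{mk}}{k} \int \left( \frac{2\beta}{k} - q^2 \right)^{-1/2} dq - t . \]
The integral can easily be evaluated, and we obtain
\[ Q + t = \sqrt{\frac{m}{k}}\, \arcsin\!\left( \sqrt{k/(2\beta)}\; q \right) . \]
With the usual abbreviation ω² = k/m, we obtain the equation
\[ q = \sqrt{\frac{2\beta}{k}}\, \sin\omega(t + Q) . \]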
k
A comparison with the known equation of motion of the harmonic oscillator shows
that β corresponds to the total energy E, and Q to an initial time t0 . Energy and time
are therefore canonically conjugate variables. Both the energy and the time t0 (which
corresponds to an initial phase) are given by the initial conditions.
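The result of this example can be verified numerically. The sketch below assumes the closed form of the integral for S2, namely √(mk)·[½ q √(a² − q²) + ½ a² arcsin(q/a)] with a² = 2β/k, which is a standard antiderivative but is not written out in the text; the function names, the sample values, and the finite-difference step are likewise illustrative. It checks that S satisfies the Hamilton–Jacobi equation and that Q = ∂S/∂β reproduces the oscillation.

```python
import math

def action(q, beta, t, m, k):
    """S(q, beta, t) = S2(q) - beta*t, with S2 from the closed-form integral (|q| < a)."""
    a2 = 2.0 * beta / k
    a = math.sqrt(a2)
    s2 = math.sqrt(m * k) * (0.5 * q * math.sqrt(a2 - q * q) + 0.5 * a2 * math.asin(q / a))
    return s2 - beta * t

def hj_residual(q, beta, t, m, k, h=1e-6):
    """dS/dt + (dS/dq)^2/(2m) + k q^2/2, with derivatives by central differences."""
    dS_dt = (action(q, beta, t + h, m, k) - action(q, beta, t - h, m, k)) / (2 * h)
    dS_dq = (action(q + h, beta, t, m, k) - action(q - h, beta, t, m, k)) / (2 * h)
    return dS_dt + dS_dq**2 / (2.0 * m) + 0.5 * k * q * q

def new_coordinate(q, beta, t, m, k, h=1e-6):
    """Q = dS/dbeta, estimated by a central difference."""
    return (action(q, beta + h, t, m, k) - action(q, beta - h, t, m, k)) / (2 * h)

if __name__ == "__main__":
    m, k, beta, q, t = 2.0, 8.0, 1.0, 0.3, 0.4
    Q = new_coordinate(q, beta, t, m, k)
    print(hj_residual(q, beta, t, m, k))                              # close to 0
    print(math.sqrt(2 * beta / k) * math.sin(math.sqrt(k / m) * (t + Q)), q)
```

The second printed pair compares q reconstructed from the final formula of the example with the value of q that was put in.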
The separation of the Hamilton–Jacobi equation represents a general (often the only
feasible) way of solving it. If the Hamiltonian does not explicitly depend on the time,
then

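\[ \frac{\partial S}{\partial t} + H\left( q_1, \ldots, q_n;\; \frac{\partial S}{\partial q_1}, \ldots, \frac{\partial S}{\partial q_n} \right) = 0 , \tag{20.8} \]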
and the time can be separated off immediately. We set for S a solution of the form
S = S0 (qi , Pi ) − βt.
The constant β then equals H and normally represents the energy. After this separa-
tion, there remains the equation

\[ H\left( q_1, \ldots, q_n;\; \frac{\partial S_0}{\partial q_1}, \ldots, \frac{\partial S_0}{\partial q_n} \right) = E . \tag{20.9} \]
To achieve a separation of the position variables, we make the ansatz
\[ S_0(q_1, \ldots, q_n;\, P_1, \ldots, P_n) = \sum_i S_i(q_i, P_i) = S_1(q_1, P_1) + \cdots + S_n(q_n, P_n) . \tag{20.10} \]
This means that the Hamilton action function splits into a sum of partial functions Si,
each depending only on one pair of variables. The Hamiltonian then becomes
\[ H\left( q_1, \ldots, q_n;\; \frac{dS_1}{dq_1}, \ldots, \frac{dS_n}{dq_n} \right) = E . \tag{20.11} \]
To ensure that this differential equation also separates into n differential equations for
the Si (qi , Pi ), H must obey certain conditions. For example, if H has the form
H (q1 , . . . , qn , p1 , . . . , pn ) = H1 (q1 , p1 ) + · · · + Hn (qn , pn ), (20.12)
the separation is certainly possible. A Hamiltonian of this form describes a system of

independent degrees of freedom; i.e., in (20.12) there are no interaction terms, e.g., of
the form H (qi , pi , qj , pj ), which describe an interaction between the ith and the j th
degree of freedom.
With (20.12), (20.11) reads

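\[ H_1\left( q_1, \frac{\partial S_1}{\partial q_1} \right) + \cdots + H_n\left( q_n, \frac{\partial S_n}{\partial q_n} \right) = E . \tag{20.13} \]
This equation can be satisfied by setting each term Hi separately equal to a constant
βi; hence,
\[ H_1\left( q_1, \frac{\partial S_1}{\partial q_1} \right) = \beta_1, \; \ldots, \; H_n\left( q_n, \frac{\partial S_n}{\partial q_n} \right) = \beta_n , \tag{20.14} \]
where
\[ \beta_1 + \beta_2 + \cdots + \beta_n = E . \tag{20.15} \]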
Thus, there are n integration constants βi in total.

Since the kinetic energy term of the Hamiltonian involves the momentum pi =
dSi /dqi quadratically, these differential equations are of first order and second degree.
As solutions, we then obtain the n action functions
Si = Si (qi , βi ), (20.16)
which, apart from the separation constants $\beta_i$, depend only on the coordinate $q_i$. According to (19.12), $S_i$ immediately yields the momentum $p_i = dS_i/dq_i$ conjugate to the coordinate $q_i$. The essential point is (see (20.12)) that the coordinate pair $(q_i, p_i)$ is not coupled to the other coordinates $(q_k, p_k)$ with $k \neq i$, so that the motion in these coordinates can be considered fully independent of the other ones.
We now restrict ourselves to periodic motions and define the phase integral
\[
J_i = \oint p_i\,dq_i, \tag{20.17}
\]
which is to be taken over a full cycle of a rotation or vibration. The phase integral has the dimension of an action (or of an angular momentum). It is therefore also referred to as an action variable. If we express the momentum by the action function,
\[
J_i = \oint \frac{dS_i}{dq_i}\,dq_i, \tag{20.18}
\]
we see from (20.16) that Ji depends only on the constants βi , since qi is only an inte-
gration variable. We therefore can move from the constants βi to the likewise constant
Ji and use them as new canonical momenta. Hence, one performs the transformation
Ji = Ji (βi ) −→ βi = βi (Ji ).
The total energy E, which corresponds to the Hamiltonian, can also be expressed in terms of the $J_i$ by means of (20.15):
\[
H = E = \sum_{i=1}^{n}\beta_i(J_i). \tag{20.19}
\]
The Hamiltonian is therefore only a function of the action variables, which take the
role of the momenta. All corresponding conjugate coordinates are cyclic. The con-
jugate coordinates belonging to the Ji are called angle variables and are denoted by
ϕi . The generating function Si (qi , βk ) turns with βk (Jk ) into S(qi , Ji ). The Ji are the
new momenta. We therefore can apply (19.12), and for the related new coordinates,
we have
\[
\varphi_j = \frac{\partial S(q_i, J_i)}{\partial J_j}.
\]
By transforming to the action variables and angle variables, we thus have performed
a canonical transformation, mediated by the generating function
Si (qi , βi ) −→ Si (ϕi , Ji ). (20.20)
This transformation from one set of constant momenta to another set actually does not
give new insights. The meaning for periodic processes lies in the angle variable ϕi .
Since we performed only canonical transformations, we have
\[
\dot\varphi_i = \frac{\partial H}{\partial J_i} = \nu_i(J_i) = \text{constant}. \tag{20.21}
\]
One can show that νi is the frequency of the periodic motion in the coordinate i. This
relation thus offers the advantage that the frequencies, which are often of primary
interest, can be determined without solving the full problem. We briefly demonstrate
this point by the following example:
EXAMPLE
20.2 Angle Variable
We again consider the harmonic oscillator. The expression for the total energy
\[
E = \frac{p^2}{2m} + \frac{kq^2}{2}
\]
is transformed so that we get the representation of an ellipse in phase space:
\[
\frac{p^2}{2mE} + \frac{q^2}{2E/k} = 1.
\]
Fig. 20.1. Ellipses in phase space
The phase integral is the area enclosed by the ellipse in phase space:
\[
J = \oint p\,dq = \pi ab.
\]
The two half-axes of the ellipse are
\[
a = \sqrt{2mE} \quad\text{and}\quad b = \sqrt{\frac{2E}{k}}.
\]
We therefore obtain
\[
J = 2\pi E\sqrt{\frac{m}{k}}, \quad\text{or}\quad E = H = \frac{J}{2\pi}\sqrt{\frac{k}{m}}.
\]
This leads to the frequency
\[
\nu = \frac{dH}{dJ} = \frac{1}{2\pi}\sqrt{\frac{k}{m}}.
\]
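The result of this example can be checked numerically. The following is a minimal sketch, not part of the original text: it evaluates the phase integral of the harmonic oscillator with a midpoint rule and approximates ν = dE/dJ by a central difference. The function names and the sample values of m, k, and E are our own assumptions.

```python
import math

def phase_integral(E, m, k, steps=100_000):
    """Midpoint-rule value of J = closed integral of p dq for the harmonic oscillator."""
    q_max = math.sqrt(2.0 * E / k)           # turning point (half-axis b)
    dq = 2.0 * q_max / steps
    upper_branch = 0.0
    for i in range(steps):
        q = -q_max + (i + 0.5) * dq
        p_squared = 2.0 * m * (E - 0.5 * k * q * q)
        upper_branch += math.sqrt(max(p_squared, 0.0)) * dq
    return 2.0 * upper_branch                # p > 0 and p < 0 branches

def frequency_from_action(E, m, k, dE=1e-3):
    """nu = dE/dJ, approximated by a central difference in E."""
    dJ = phase_integral(E + dE, m, k) - phase_integral(E - dE, m, k)
    return 2.0 * dE / dJ

m, k, E = 2.0, 8.0, 3.0                       # arbitrary sample values
J_numeric = phase_integral(E, m, k)
J_exact = 2.0 * math.pi * E * math.sqrt(m / k)
nu_numeric = frequency_from_action(E, m, k)
nu_exact = math.sqrt(k / m) / (2.0 * math.pi)
print(J_numeric, J_exact)
print(nu_numeric, nu_exact)
```

The frequency is obtained here without integrating the equations of motion, which is exactly the point made in the text.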
EXERCISE
20.3 Solution of the Kepler Problem by the Hamilton–Jacobi Method
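Problem. Use the Hamilton–Jacobi method to solve the Kepler problem in a central force field with the potential
\[
V(r) = -\frac{K}{r}.
\]
Solution. We adopt plane polar coordinates $(r, \vartheta)$ as generalized coordinates. The Hamiltonian reads
\[
H = \frac{1}{2m}\left(p_r^2 + \frac{p_\vartheta^2}{r^2}\right) - \frac{K}{r}. \tag{20.22}
\]
The coordinate $\vartheta$ is cyclic, and hence $p_\vartheta = \text{constant} = l$. The $p_i$ can be expressed by the Hamilton action function S:
\[
p_i = \frac{\partial S}{\partial q_i} \quad\Rightarrow\quad p_r = \frac{\partial S}{\partial r}, \quad p_\vartheta = \frac{\partial S}{\partial\vartheta} = \text{constant} = \beta_2.
\]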
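Thus, we obtain the Hamilton–Jacobi differential equation
\[
\frac{\partial S}{\partial t} + \frac{1}{2m}\left[\left(\frac{\partial S}{\partial r}\right)^2 + \frac{1}{r^2}\left(\frac{\partial S}{\partial\vartheta}\right)^2\right] - \frac{K}{r} = 0. \tag{20.23}
\]
For the action function, we adopt a separation ansatz
\[
S = S_1(r) + S_2(\vartheta) + S_3(t), \tag{20.24}
\]
which is inserted into (20.23):
\[
\frac{1}{2m}\left[\left(\frac{\partial S_1(r)}{\partial r}\right)^2 + \frac{1}{r^2}\left(\frac{\partial S_2(\vartheta)}{\partial\vartheta}\right)^2\right] - \frac{K}{r} = -\frac{\partial S_3(t)}{\partial t}. \tag{20.25}
\]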
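Equation (20.25) can be satisfied only if both sides are constant. The constant is the total energy of the system, because
\[
-\frac{\partial S}{\partial t} = H = E \quad\Rightarrow\quad -\frac{\partial S_3}{\partial t} = \text{constant} = \beta_3 = E. \tag{20.26}
\]
We remember that
\[
P_i = \beta_i, \qquad Q_i = \frac{\partial S}{\partial P_i} = \frac{\partial S}{\partial\beta_i} = \alpha_i,
\]
where $\alpha_i$, $\beta_i$ are constants that follow from the initial conditions.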

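We insert (20.26) into (20.25), and solve for $\partial S_2/\partial\vartheta$:
\[
\left(\frac{\partial S_2}{\partial\vartheta}\right)^2 = r^2\left[2m\beta_3 + \frac{2mK}{r} - \left(\frac{\partial S_1}{\partial r}\right)^2\right]. \tag{20.27}
\]
The same argument that led us to (20.26) now yields
\[
\frac{\partial S_2}{\partial\vartheta} = \frac{dS_2(\vartheta)}{d\vartheta} = \text{constant} = \beta_2, \tag{20.28}
\]
and therefore,
\[
\frac{\partial S_1}{\partial r} = \frac{dS_1(r)}{dr} = \sqrt{2m\beta_3 + \frac{2mK}{r} - \frac{\beta_2^2}{r^2}}. \tag{20.29}
\]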
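The Hamilton action function can now be written down as follows:
\[
S = \int\sqrt{2m\beta_3 + \frac{2mK}{r} - \frac{\beta_2^2}{r^2}}\,dr + \beta_2\vartheta - \beta_3 t. \tag{20.30}
\]
We now define $\beta_2$ and $\beta_3$ as new momenta $P_\vartheta$ and $P_r$. The quantities $Q_i$ conjugate to the $P_i$ are also constant:
\[
Q_r = \frac{\partial S}{\partial\beta_3} = \frac{\partial}{\partial\beta_3}\int\sqrt{2m\beta_3 + \frac{2mK}{r} - \frac{\beta_2^2}{r^2}}\,dr - t = \alpha_3, \tag{20.31}
\]
\[
Q_\vartheta = \frac{\partial S}{\partial\beta_2} = \frac{\partial}{\partial\beta_2}\int\sqrt{2m\beta_3 + \frac{2mK}{r} - \frac{\beta_2^2}{r^2}}\,dr + \vartheta = \alpha_2. \tag{20.32}
\]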
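If we identify $\alpha_2$ with $\vartheta'$, which follows from the initial conditions, we obtain
\[
\vartheta' = \vartheta - \int\frac{\beta_2\,dr}{r^2\sqrt{2m\beta_3 + 2mK/r - \beta_2^2/r^2}}. \tag{20.33}
\]
Insertion of the constants ($\beta_2 = l$, $\beta_3 = E$) and the substitution $u \equiv 1/r$ lead to
\[
\vartheta - \vartheta' = -\int\frac{du}{\sqrt{(2mE/l^2) + (2mKu/l^2) - u^2}}. \tag{20.34}
\]
This integral of the form
\[
\int\frac{dx}{\sqrt{ax^2 + bx + c}}
\]
can, according to integral tables, be written as a closed expression. With
\[
\Delta = 4ac - b^2 = -4\,\frac{2mE}{l^2} - \frac{4m^2K^2}{l^4} < 0,
\]
we obtain
\[
\vartheta - \vartheta' = +\arcsin\frac{-2u + (2mK/l^2)}{\sqrt{(4m/l^2)\,(2E + (mK^2/l^2))}}
= -\arcsin\frac{(l^2u/mK) - 1}{\sqrt{1 + (2El^2/mK^2)}} \tag{20.35}
\]
\[
\Leftrightarrow\quad r = \frac{l^2}{mK}\,\frac{1}{1 + \sqrt{1 + (2El^2/mK^2)}\,\cos(\vartheta - \vartheta' + \pi/2)}. \tag{20.36}
\]
This is the solution of the Kepler problem, known from the lectures on classical mechanics.2 The types of trajectories follow from the discussion of conic sections in the representation $r = p/(1 + \varepsilon\cos\varphi)$:
$\varepsilon = 1$, corresponding to $E = 0$: parabolas;
$\varepsilon < 1$, corresponding to $E < 0$: ellipses;
$\varepsilon > 1$, corresponding to $E > 0$: hyperbolas.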
(2004), Chapter 26.
Equation (20.31) could be rewritten further, by pulling the differentiation into the
integral and transforming the resulting equation in such a way that the position r
becomes a function of the time. We skip that here.
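As a plausibility check of the orbit equation, one can verify numerically that the conic section with eccentricity ε = sqrt(1 + 2El²/(mK²)) indeed carries the energy E. The sketch below is our own illustration: the function names, the free phase offset `theta0` (standing in for the constant phase in (20.36)), and the sample values are assumptions.

```python
import math

def eccentricity(E, l, m, K):
    """epsilon = sqrt(1 + 2 E l^2 / (m K^2)), read off from the orbit equation."""
    return math.sqrt(1.0 + 2.0 * E * l * l / (m * K * K))

def radius(theta, E, l, m, K, theta0=0.0):
    """Conic section r = p / (1 + eps cos(theta - theta0)) with p = l^2 / (m K)."""
    p = l * l / (m * K)
    return p / (1.0 + eccentricity(E, l, m, K) * math.cos(theta - theta0))

def energy_on_orbit(theta, E, l, m, K, h=1e-6):
    """Recompute H on the orbit, using p_theta = l and p_r = m (dr/dtheta) theta_dot."""
    r = radius(theta, E, l, m, K)
    dr_dtheta = (radius(theta + h, E, l, m, K) - radius(theta - h, E, l, m, K)) / (2.0 * h)
    theta_dot = l / (m * r * r)
    p_r = m * dr_dtheta * theta_dot
    return (p_r ** 2 + l ** 2 / r ** 2) / (2.0 * m) - K / r

def orbit_type(E):
    """Classification of the trajectory by the sign of the total energy."""
    if E < 0:
        return "ellipse"
    if E == 0:
        return "parabola"
    return "hyperbola"

E, l, m, K = -0.3, 1.0, 1.0, 1.0              # arbitrary bound-state sample values
print(eccentricity(E, l, m, K), orbit_type(E))
print(energy_on_orbit(1.0, E, l, m, K))       # should reproduce E
```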
EXERCISE
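20.4 Formulation of the Hamilton–Jacobi Differential Equation for Particle Motion in a Potential with Azimuthal Symmetry
Problem. Let a particle of mass m move in a force field whose potential in spherical coordinates has the form $V = -K\cos\vartheta/r^2$. Write down the Hamilton–Jacobi differential equation for the particle motion.
Solution. We first need the Hamiltonian as a function of the conjugate momenta in spherical coordinates. For this purpose we first write the kinetic energy T in spherical coordinates:
\[
\dot{\mathbf r} = \dot r\,\mathbf e_r + r\dot\vartheta\,\mathbf e_\vartheta + r\sin\vartheta\,\dot\varphi\,\mathbf e_\varphi \tag{20.37}
\]
\[
\Rightarrow\quad T = \frac12 m\,\dot{\mathbf r}\cdot\dot{\mathbf r} = \frac12 m\left(\dot r^2 + r^2\dot\vartheta^2 + r^2\sin^2\vartheta\,\dot\varphi^2\right). \tag{20.38}
\]
The Lagrangian then reads
\[
L = T - V = \frac12 m\left(\dot r^2 + r^2\dot\vartheta^2 + r^2\sin^2\vartheta\,\dot\varphi^2\right) - V(r,\vartheta,\varphi). \tag{20.39}
\]
We now assume that $V(r,\vartheta,\varphi)$ is velocity-independent (which is indeed fulfilled) and form the canonically conjugate momenta:
\[
p_r = \frac{\partial L}{\partial\dot r} = m\dot r, \quad p_\vartheta = \frac{\partial L}{\partial\dot\vartheta} = mr^2\dot\vartheta, \quad p_\varphi = \frac{\partial L}{\partial\dot\varphi} = mr^2\sin^2\vartheta\,\dot\varphi. \tag{20.40}
\]
From this, we obtain
\[
\dot r = \frac{p_r}{m}, \quad \dot\vartheta = \frac{p_\vartheta}{mr^2}, \quad \dot\varphi = \frac{p_\varphi}{mr^2\sin^2\vartheta}.
\]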
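Hence, H can be given in the desired form
\[
H = \sum_\alpha p_\alpha\dot q_\alpha - L
= p_r\dot r + p_\vartheta\dot\vartheta + p_\varphi\dot\varphi - \frac12 m\left(\frac{p_r^2}{m^2} + \frac{p_\vartheta^2}{m^2r^2} + \frac{p_\varphi^2}{m^2r^2\sin^2\vartheta}\right) + V(r,\vartheta,\varphi)
\]
\[
= \frac{p_r^2}{2m} + \frac{p_\vartheta^2}{2mr^2} + \frac{p_\varphi^2}{2mr^2\sin^2\vartheta} + V(r,\vartheta,\varphi), \tag{20.41}
\]
and for the actual potential (see the formulation of the problem),
\[
H = \frac{p_r^2}{2m} + \frac{p_\vartheta^2}{2mr^2} + \frac{p_\varphi^2}{2mr^2\sin^2\vartheta} - \frac{K\cos\vartheta}{r^2}. \tag{20.42}
\]
The $p_i$ as functions of the Hamilton action function read
\[
p_r = \frac{\partial S}{\partial r}, \quad p_\vartheta = \frac{\partial S}{\partial\vartheta}, \quad p_\varphi = \frac{\partial S}{\partial\varphi}. \tag{20.43}
\]
Therefore, the Hamilton–Jacobi equation has the form
\[
\frac{\partial S}{\partial t} + \frac{1}{2m}\left[\left(\frac{\partial S}{\partial r}\right)^2 + \frac{1}{r^2}\left(\frac{\partial S}{\partial\vartheta}\right)^2 + \frac{1}{r^2\sin^2\vartheta}\left(\frac{\partial S}{\partial\varphi}\right)^2\right] - \frac{K\cos\vartheta}{r^2} = 0. \tag{20.44}
\]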
EXERCISE
20.5 Solution of the Hamilton–Jacobi Differential Equation of Exercise 20.4
Problem.
(a) Find the complete solution of the Hamilton–Jacobi differential equation from the
preceding Exercise 20.4, and
(b) sketch how to determine the motion of the particle.
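Solution. (a) The approach is analogous to Exercise 20.3. We adopt the separation ansatz for S,
\[
S = S_1(r) + S_2(\vartheta) + S_3(\varphi) - Et, \tag{20.45}
\]
and insert this into (20.44):
\[
\frac{1}{2m}\left(\frac{\partial S_1}{\partial r}\right)^2 + \frac{1}{2mr^2}\left(\frac{\partial S_2}{\partial\vartheta}\right)^2 + \frac{1}{2mr^2\sin^2\vartheta}\left(\frac{\partial S_3}{\partial\varphi}\right)^2 - \frac{K\cos\vartheta}{r^2} = E \tag{20.46}
\]
\[
\Leftrightarrow\quad r^2\left(\frac{\partial S_1(r)}{\partial r}\right)^2 - 2mEr^2 = -\left(\frac{\partial S_2(\vartheta)}{\partial\vartheta}\right)^2 - \frac{1}{\sin^2\vartheta}\left(\frac{\partial S_3(\varphi)}{\partial\varphi}\right)^2 + 2mK\cos\vartheta. \tag{20.47}
\]
Equations (20.46) and (20.47) can only be satisfied if both sides of (20.47) are constant:
\[
r^2\left(\frac{\partial S_1(r)}{\partial r}\right)^2 - 2mEr^2 = \text{constant} = \beta_1, \tag{20.48}
\]
\[
-\left(\frac{\partial S_2(\vartheta)}{\partial\vartheta}\right)^2 - \frac{1}{\sin^2\vartheta}\left(\frac{\partial S_3(\varphi)}{\partial\varphi}\right)^2 + 2mK\cos\vartheta = \beta_1. \tag{20.49}
\]
To separate $\vartheta$ from $\varphi$, we multiply (20.49) by $\sin^2\vartheta$:
\[
\left(\frac{\partial S_3(\varphi)}{\partial\varphi}\right)^2 = 2mK\cos\vartheta\,\sin^2\vartheta - \beta_1\sin^2\vartheta - \sin^2\vartheta\left(\frac{\partial S_2(\vartheta)}{\partial\vartheta}\right)^2. \tag{20.50}
\]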
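The separation constant is denoted by $\beta_3^2$, since
\[
\frac{\partial S}{\partial\varphi} = p_\varphi \quad\text{and thus}\quad \left(\frac{\partial S_3(\varphi)}{\partial\varphi}\right)^2 = \beta_3^2 = p_\varphi^2. \tag{20.51}
\]
Therefore,
\[
2mK\cos\vartheta\,\sin^2\vartheta - \beta_1\sin^2\vartheta - \sin^2\vartheta\left(\frac{\partial S_2}{\partial\vartheta}\right)^2 = p_\varphi^2. \tag{20.52}
\]
Integration of (20.48), (20.51), and (20.52) yields
\[
S_1 = \int\sqrt{2mE + \frac{\beta_1}{r^2}}\,dr + c_1, \qquad
S_2 = \int\sqrt{2mK\cos\vartheta - \beta_1 - \frac{p_\varphi^2}{\sin^2\vartheta}}\,d\vartheta + c_2, \qquad
S_3 = \varphi\,p_\varphi + c_3. \tag{20.53}
\]
The complete solution of the Hamilton–Jacobi differential equation is obtained from (20.53) and the ansatz (20.45) for S:
\[
S = \int\sqrt{2mE + \frac{\beta_1}{r^2}}\,dr + \int\sqrt{2mK\cos\vartheta - \beta_1 - \frac{p_\varphi^2}{\sin^2\vartheta}}\,d\vartheta + \varphi\,p_\varphi - Et + C. \tag{20.54}
\]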
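(b) The explicit equations for the motion of the particle follow from the requirement
\[
Q_i = \frac{\partial S}{\partial P_i} \quad\Leftrightarrow\quad \alpha_i = \frac{\partial S}{\partial\beta_i},
\]
since $Q_i$, $P_i$ are constants that are denoted by $\alpha_i$, $\beta_i$, and thus,
\[
\frac{\partial S}{\partial\beta_1} = \alpha_1, \quad \frac{\partial S}{\partial E} = \alpha_2, \quad \frac{\partial S}{\partial p_\varphi} = \alpha_3. \tag{20.55}
\]
The $\alpha_i$ follow from the initial conditions; for example,
\[
\frac{\partial S}{\partial E} = \int\frac{m\,dr}{\sqrt{2mE + \beta_1/r^2}} - t = \alpha_2, \tag{20.56}
\]
\[
\int\frac{m\,dr}{\sqrt{2mE + \beta_1/r^2}} = \sqrt{\frac{m}{2E}}\int\frac{r\,dr}{\sqrt{r^2 + \beta_1/2mE}} = \sqrt{\frac{mr^2}{2E} + \frac{\beta_1}{4E^2}} + c
\]
\[
\Rightarrow\quad \alpha_2 + t = \sqrt{\frac{mr^2}{2E} + \frac{\beta_1}{4E^2}} + c, \tag{20.57}
\]
\[
r(t = 0) = r_0 \quad\Rightarrow\quad \alpha_2 - c = \sqrt{\frac{mr_0^2}{2E} + \frac{\beta_1}{4E^2}},
\]
and as the solution for t,
\[
t = \sqrt{\frac{mr^2}{2E} + \frac{\beta_1}{4E^2}} - \sqrt{\frac{mr_0^2}{2E} + \frac{\beta_1}{4E^2}}. \tag{20.58}
\]
The relations $\partial S/\partial\beta_1 = \alpha_1$ and $\partial S/\partial p_\varphi = \alpha_3$ are treated likewise. The evaluation of the elliptic integrals requires numerical methods.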
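The closed form (20.58) for the radial motion can be compared with a direct numerical quadrature of the integral in (20.56). This is only a sketch under our own assumptions: the function names and sample values are hypothetical, E > 0 is assumed, and only the branch with increasing r (positive root) is considered.

```python
import math

def radial_time_numeric(r0, r, E, beta1, m=1.0, steps=20_000):
    """Midpoint-rule value of the integral of m dr / sqrt(2 m E + beta1 / r^2) from r0 to r."""
    dr = (r - r0) / steps
    total = 0.0
    for i in range(steps):
        ri = r0 + (i + 0.5) * dr
        total += m / math.sqrt(2.0 * m * E + beta1 / (ri * ri)) * dr
    return total

def radial_time_closed(r0, r, E, beta1, m=1.0):
    """Elapsed time according to the closed expression (20.58)."""
    def root(radius):
        return math.sqrt(m * radius * radius / (2.0 * E) + beta1 / (4.0 * E * E))
    return root(r) - root(r0)

t_numeric = radial_time_numeric(1.0, 4.0, E=2.0, beta1=3.0)
t_closed = radial_time_closed(1.0, 4.0, E=2.0, beta1=3.0)
print(t_numeric, t_closed)
```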
We come back once again to our discussion in the context of (20.8) and (20.9). If the Hamiltonian does not explicitly depend on the time, as is the case for conservative scleronomic systems, the Hamilton–Jacobi differential equation can be brought to a simpler form, since S can depend only linearly on t. We therefore transform to
\[
S = S_0 - Et,
\]
where $S_0 = S_0(q_1,\dots,q_n,\beta_1,\dots,\beta_n)$. One then obtains the so-called reduced Hamilton–Jacobi equation:
\[
H\left(q_1,\dots,q_n,\frac{\partial S_0}{\partial q_1},\dots,\frac{\partial S_0}{\partial q_n}\right) = E.
\]
The solution of this differential equation yields arbitrary constants; one of them, e.g., $\beta_1$, is additive ($S_0 + c$ solves the above Hamilton–Jacobi equation too) and can be omitted. But the reduced Hamilton–Jacobi equation now involves the total energy, so that $S_0$ will also depend on E, and therefore, in
\[
S_0 = S_0(q_1,\dots,q_n,E,\beta_2,\dots,\beta_n)
\]
$\beta_1$ is replaced by E. We can express this in the following way: Just like the original Hamilton–Jacobi equation, the reduced form also has n integration constants, one of them being the total energy E.
EXERCISE
20.6 Formulation of the Hamilton–Jacobi Differential Equation for the Slant Throw
Problem. Use the reduced Hamilton–Jacobi differential equation to formulate the equation of motion for the slant throw.
Solution. Let the coordinates of the throw plane be x (abscissa) and y (ordinate), which will also be used as generalized coordinates.
\[
H = T + V = \frac{m}{2}(\dot x^2 + \dot y^2) + mgy, \tag{20.59}
\]
\[
p_x = \frac{\partial H}{\partial\dot x} = m\dot x, \qquad p_y = \frac{\partial H}{\partial\dot y} = m\dot y. \tag{20.60}
\]
The momentum $p_x = m\dot x$ conjugate to the cyclic coordinate x ($\partial H/\partial x = 0$) is a conserved quantity. We rewrite (20.59) as
\[
H(x, y, p_x, p_y) = \frac{1}{2m}(p_x^2 + p_y^2) + mgy. \tag{20.61}
\]
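Since H does not depend explicitly on the time and the system is conservative, the reduced Hamilton–Jacobi differential equation can be applied:
\[
\frac{1}{2m}\left[\left(\frac{\partial S_0}{\partial x}\right)^2 + \left(\frac{\partial S_0}{\partial y}\right)^2\right] + mgy = E. \tag{20.62}
\]
By inserting the separation ansatz $S_0 = S_1(x) + S_2(y)$ into (20.62), we obtain
\[
\frac{1}{2m}\left[\left(\frac{\partial S_1(x)}{\partial x}\right)^2 + \left(\frac{\partial S_2(y)}{\partial y}\right)^2\right] + mgy = E \tag{20.63}
\]
or
\[
\left(\frac{\partial S_1(x)}{\partial x}\right)^2 = 2mE - 2m^2gy - \left(\frac{\partial S_2(y)}{\partial y}\right)^2. \tag{20.64}
\]
This is satisfied only if both sides of the equation are constant, since x and y are independent coordinates:
\[
\left(\frac{\partial S_1(x)}{\partial x}\right)^2 = \beta_2, \qquad \left(\frac{\partial S_2(y)}{\partial y}\right)^2 = (2mE - \beta_2) - 2m^2gy. \tag{20.65}
\]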
Integration yields the solutions
\[
S_1(x) = \sqrt{\beta_2}\,x + c_1, \tag{20.66}
\]
\[
S_2(y) = -\frac{1}{3m^2g}\left[(2mE - \beta_2) - 2m^2gy\right]^{3/2} + c_2. \tag{20.67}
\]
The complete solution of the reduced Hamilton–Jacobi differential equation is of the form
\[
S_0(x, y, E, \beta_2) = \sqrt{\beta_2}\,x - \frac{1}{3m^2g}\left[(2mE - \beta_2) - 2m^2gy\right]^{3/2} + c, \tag{20.68}
\]
\[
\frac{\partial S_0}{\partial E} = t + \alpha_1, \qquad \frac{\partial S_0}{\partial\beta_2} = \alpha_2,
\]
where the first relation holds because, with $S = S_0 - Et$,
\[
\alpha_1 = \frac{\partial S}{\partial E} = \frac{\partial S_0}{\partial E} - t.
\]
From this, we obtain y = y(t) as
\[
-\frac{1}{mg}\left[(2mE - \beta_2) - 2m^2gy\right]^{1/2} = t + \alpha_1 \tag{20.69}
\]
\[
\Leftrightarrow\quad 2mE - \beta_2 - 2m^2gy = m^2g^2(t + \alpha_1)^2
\]
\[
\Leftrightarrow\quad y = -\frac12 g(t + \alpha_1)^2 + \frac{2mE - \beta_2}{2m^2g}
\]
\[
\Leftrightarrow\quad y = -\frac12 gt^2 + c_1t + c_2. \tag{20.70}
\]
In the last step, we renamed the constants.
The analogous procedure with $\partial S/\partial\beta_2 = \alpha_2$ yields
\[
y = -c_1x^2 + c_2x + c_3, \tag{20.71}
\]
i.e., the familiar throw parabola. For the case of the slant throw the Hamilton–Jacobi equation may appear clumsy for establishing the equation of motion. A certain advantage of the method shows up in complicated problems, e.g., in the Kepler problem in Exercise 20.3.
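The step from (20.68) to (20.69) can be checked numerically. The sketch below is an illustration under our own assumptions: it takes $S_0$ with the term $\sqrt{\beta_2}\,x$ (so that $(\partial S_1/\partial x)^2 = \beta_2$), drops the additive constant, differentiates with respect to E by a central difference, and compares with $t + \alpha_1$ along the parabola $y(t)$. Function names and sample values are hypothetical.

```python
import math

def S0(x, y, E, beta2, m=1.0, g=9.81):
    """Reduced action (20.68) without the additive constant; beta2 = p_x**2."""
    bracket = (2.0 * m * E - beta2) - 2.0 * m * m * g * y
    return math.sqrt(beta2) * x - bracket ** 1.5 / (3.0 * m * m * g)

def dS0_dE(x, y, E, beta2, m=1.0, g=9.81, h=1e-6):
    """Central-difference approximation of the partial derivative of S0 with respect to E."""
    return (S0(x, y, E + h, beta2, m, g) - S0(x, y, E - h, beta2, m, g)) / (2.0 * h)

def height(t, E, beta2, alpha1, m=1.0, g=9.81):
    """y(t) from solving (20.69) for y."""
    return -0.5 * g * (t + alpha1) ** 2 + (2.0 * m * E - beta2) / (2.0 * m * m * g)

E, beta2, alpha1, t = 50.0, 4.0, -1.0, 0.4    # ascending branch: t + alpha1 < 0
y = height(t, E, beta2, alpha1)
print(dS0_dE(0.0, y, E, beta2), t + alpha1)    # both values should agree
```

Note that (20.69) with the negative sign describes the branch with $t + \alpha_1 \le 0$; the sample values are chosen accordingly.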
20.1 Visual Interpretation of the Action Function S
In the preceding problems, the Hamilton–Jacobi differential equation proved success-

ful for establishing the equations of motion, in particular for complex mechanical
problems. There remains the question about the visual meaning of the action func-
tion S. We consider the motion of a single mass point in a time-independent potential
and write
S = S0 (qi , Pi ) − Et,
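where, as already indicated, $S_0(q_i, P_i)$ describes a spatial field which is time-independent. With the labeling $q_1 = x$, $q_2 = y$, $q_3 = z$ and $p_1 = p_x$, $p_2 = p_y$, $p_3 = p_z$, for the momentum components we have according to (19.12)
\[
p_x = \frac{\partial S}{\partial x} = \frac{\partial S_0}{\partial x}, \quad p_y = \frac{\partial S}{\partial y} = \frac{\partial S_0}{\partial y}, \quad p_z = \frac{\partial S}{\partial z} = \frac{\partial S_0}{\partial z}.
\]
Written as a vector equation, this is
\[
\mathbf p = \operatorname{grad} S = \nabla S_0.
\]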
Since grad S is always perpendicular to the equipotential surfaces of S, we realize that

in a representation of the S-field by S = constant, the orbits are represented by trajec-
tories orthogonal to this set of surfaces. Accordingly, to a given field S belong all mo-
tions with trajectories perpendicular to the equipotential surfaces of S (S = constant),
and moreover along each trajectory all motions starting at an arbitrary moment (see
Fig. 20.2).
Fig. 20.2. Surfaces S = constant. Trajectories are dotted
The time behavior of the S-field can be seen from the representation S = S0 − Et.
For t = 0, the surfaces S(qi , Pi ) = 0 and S0 (qi , Pi ) = 0 are identical. For t = 1, the
surface S = 0 coincides with the surface S0 = E, S = E with S0 = 2E, etc. This
means graphically that surfaces of constant S-values move across surfaces of constant
S0 -values, i.e., that surfaces of constant S move through space. The formal meaning
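of S follows from the action integral. One has
\[
\int L\,dt = \int (p_x\,dx + p_y\,dy + p_z\,dz - H\,dt),
\]
\[
\int_{t_1}^{t_2} L\,dt = \int_{t_1}^{t_2}\left(\frac{\partial S}{\partial x}\,dx + \frac{\partial S}{\partial y}\,dy + \frac{\partial S}{\partial z}\,dz + \frac{\partial S}{\partial t}\,dt\right) = S_2 - S_1.
\]
S therefore represents an action (energy · time). It is the time integral over the Lagrangian. The Hamilton principle $\delta\int L\,dt = 0$ therefore states that a motion proceeds in such a way that the action is stationary (in most cases a minimum).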
EXAMPLE
20.7 Illustration of the Action Waves
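To illustrate the action waves, we consider the throw or fall motion in the gravitational field of the earth, where the equation of motion is well known. In analogy to Exercise 20.6, we obtain the following Hamilton–Jacobi differential equation:
\[
\frac{1}{2m}\left[\left(\frac{\partial S}{\partial x}\right)^2 + \left(\frac{\partial S}{\partial y}\right)^2 + \left(\frac{\partial S}{\partial z}\right)^2\right] + mgz + \frac{\partial S}{\partial t} = 0. \tag{20.72}
\]
With the separation ansatz $S = S_x(x) + S_y(y) + S_z(z) - Et$, we obtain
\[
S_x = x\beta_x, \qquad S_y = y\beta_y
\]
up to additive constants, and
\[
\frac{1}{2m}\left(\frac{\partial S_z}{\partial z}\right)^2 + mgz = E - \frac{\beta_x^2 + \beta_y^2}{2m} = \beta_z. \tag{20.73}
\]
The quantities $\beta_x$ and $\beta_y$ are separation constants, just like $\beta_z$. Integration over z yields, up to a constant,
\[
S_z = -\frac{2}{3g}\sqrt{\frac{2}{m}}\,(\beta_z - mgz)^{3/2}. \tag{20.74}
\]
We write the constant $\beta_z$ as $\beta_z = mgz_0$ and thereby can express the total energy as
\[
E = \frac{p_x^2 + p_y^2}{2m} + mgz_0. \tag{20.75}
\]
By insertion, one gets the action function
\[
S = x\beta_x + y\beta_y - \frac{2m\sqrt{2g}}{3}(z_0 - z)^{3/2} - \left(\frac{\beta_x^2 + \beta_y^2}{2m} + mgz_0\right)t, \tag{20.76}
\]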
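and by the familiar scheme the equations of motion,
\[
Q_x = \alpha_x = \frac{\partial S}{\partial\beta_x} = x - \frac{\beta_x}{m}t, \qquad
Q_y = \alpha_y = \frac{\partial S}{\partial\beta_y} = y - \frac{\beta_y}{m}t, \qquad
Q_z = \alpha_z = \frac{\partial S}{\partial z_0} = -m\sqrt{2g}\,(z_0 - z)^{1/2} - mgt. \tag{20.77}
\]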
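A quick numerical check confirms that the action function (20.76) satisfies the Hamilton–Jacobi equation (20.72). The sketch below is our own illustration; the function names, the finite-difference step, and the sample point are assumptions, and the point must satisfy z < z0 for the action to be real.

```python
import math

def action(x, y, z, t, beta_x, beta_y, z0, m=1.0, g=9.81):
    """Action function (20.76); real only for z <= z0."""
    spatial = (x * beta_x + y * beta_y
               - (2.0 * m * math.sqrt(2.0 * g) / 3.0) * (z0 - z) ** 1.5)
    energy = (beta_x ** 2 + beta_y ** 2) / (2.0 * m) + m * g * z0
    return spatial - energy * t

def hj_residual(x, y, z, t, beta_x, beta_y, z0, m=1.0, g=9.81, h=1e-5):
    """Left-hand side of (20.72), with all partial derivatives from central differences."""
    point = [x, y, z, t]

    def partial(index):
        plus, minus = list(point), list(point)
        plus[index] += h
        minus[index] -= h
        return (action(*plus, beta_x, beta_y, z0, m, g)
                - action(*minus, beta_x, beta_y, z0, m, g)) / (2.0 * h)

    s_x, s_y, s_z, s_t = (partial(i) for i in range(4))
    return (s_x ** 2 + s_y ** 2 + s_z ** 2) / (2.0 * m) + m * g * z + s_t

residual = hj_residual(0.3, -1.2, -2.0, 0.7, beta_x=1.5, beta_y=-0.5, z0=1.0)
print(residual)  # should vanish up to discretization and rounding error
```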
Among the possible motions of a body in the gravitational field we pick the ensemble with $\beta_x = 0$, $\beta_y = 0$, $z_0 = 0$. From (20.76), we get for the surfaces with $S_0 = \text{constant}$:
\[
\text{constant} = -\frac{2m\sqrt{2g}}{3}(-z)^{3/2}.
\]
These are planes parallel to the x, y-plane.
The possible trajectories are shown in Fig. 20.3 by dashed vertical straight lines.
Since the action function is real only for z ≤ 0, there are only such throw conditions
that ascend up to the plane z = 0 and then return again. In the present example, the
action waves are planes parallel to the x, y-plane that propagate in z-direction. This
is easily seen from (20.76) for z0 = constant = 0. Any vertical throw up to the height
z0 thus belongs to the same S-field or to the same action wave, respectively. Here it is
not essential at which space point the throw motion begins.
Fig. 20.3.
As a further ensemble of motions, we consider
\[
\beta_x = 0, \qquad z_0 = 0, \qquad \beta_y = \frac{2m\sqrt{2g}}{3}, \tag{20.78}
\]
so that from (20.76) we obtain
\[
S_0 = \frac{2m\sqrt{2g}}{3}\left\{y - (-z)^{3/2}\right\} \tag{20.79}
\]
\[
\Leftrightarrow\quad y = \frac{3}{2m\sqrt{2g}}\,S_0 + (-z)^{3/2}. \tag{20.80}
\]
Equation (20.80) represents a Neil or semicubic parabola ($y = ax^{3/2}$) in the y, z-plane. The surfaces with $S_0 = \text{constant}$ are therefore surfaces parallel to the x-axis: F(y, z) = 0, i.e., cylindrical surfaces intersecting the y, z-plane in a set of Neil parabolas with the top on the y-axis (Fig. 20.4).
Fig. 20.4.
With increasing S0 the tops of the Neil parabolas move in the y-direction. The
related trajectories are throw parabolas in the y, z-plane which have no velocity com-
ponent along the x-direction and reach their highest point at z = 0 (dashed curves in
Fig. 20.5).
Fig. 20.5. Projection onto the
y, z-plane of the preceding
Fig. 20.4
The velocity component in y-direction is the same for all throws:
\[
v_y = \frac{\beta_y}{m} = \frac23\sqrt{2g}.
\]
In this case, the action waves consist of cylindrical surfaces parallel to the x-axis that
propagate with increasing time in z-direction.
The starting point of the motion in the x, y-plane is again arbitrary; the turning
points of the trajectories are at z = 0. All throw parabolas parallel to the x-axis belong
to the same action wave; i.e., any throw described by (20.79) can be represented by a
set of action waves propagating in the z-direction and parallel to the x-axis.
We see from this example that the simple throw in the gravitational field can be cor-
rectly represented by the Hamilton–Jacobi formalism but becomes hopelessly compli-
cated. This confirms our thesis: Although the Hamilton–Jacobi method contains beau-
tiful formal ideas, it is hardly practicable, too clumsy, and too abstract for physicists.
EXAMPLE
20.8 Periodic and Multiply Periodic Motions
In this example, the peculiarities of periodic motions shall be compiled and extended
to multiply periodic motions.3
3 Here, we follow A. Budo, Theoretische Mechanik, Deutscher Verlag der Wissenschaften, Berlin
(1956).
1. Periodic motions: Here, one distinguishes two kinds. The first is the properly periodic motion, for which
\[
q_i(t + \tau) = q_i(t), \qquad p_i(t + \tau) = p_i(t), \tag{20.81}
\]
i.e., both the coordinates and the momenta have the same period $\tau$. This motion is also called libration. Two-dimensional examples are the (nondamped) harmonic oscillator or the (nondamped) vibrating pendulum. The phase-space diagram (the phase trajectory) is a closed curve (see Fig. 20.6).
Fig. 20.6. Two-dimensional phase diagram of a properly periodic motion. A closed phase trajectory occurs, e.g., for a nondamped vibrating pendulum
The other type of periodic motion is the rotation. Here one has (e.g., in the two-dimensional case)
\[
p(q + q_0) = p(q), \tag{20.82}
\]
i.e., the momentum takes for $q + q_0$ the same value as for q. The coordinate q is mostly an angle variable and $q_0 = 2\pi$. One might imagine for example a circulating pendulum; in this case q is the pendulum angle. The phase-space trajectory is then not closed but periodic with the period $q_0$ (see Fig. 20.7).
Fig. 20.7. Phase-space diagram of the rotation as a periodic motion. The trajectory is open but has the period $q_0$. In other words, the momentum p is a periodic function of the coordinate q with the period $q_0$
The limiting case between rotation and libration is called limitation motion. The
pendulum, which is almost circulating, is an example for this type of motion. The
coordinate period q0 is then q0 = 2π as before, but the time period is τ = ∞.
(The pendulum then comes to rest in the upper vertical position (unstable point), i.e.,
the function graph terminates at the point $q_0$.) If the system is conservative and is described by the Hamiltonian H(p, q), we have the equations
\[
H(p, q) = E, \qquad H\left(q, \frac{\partial S}{\partial q}\right) = E. \tag{20.83}
\]
The first equation yields p = p(q, E), i.e., for a given energy E the phase trajectory. The second equation is the (reduced) Hamilton–Jacobi equation from which the action function (generating function) $F_2(q, P) = S(q, E)$ can be calculated. If that is done, one can calculate the phase-space integral
\[
J = \oint p\,dq = \oint\frac{\partial S}{\partial q}\,dq. \tag{20.84}
\]
Here, $\oint$ means the integration over a closed trajectory in the case of libration, or over a full period $q_1 \le q \le q_1 + q_0$ in the case of rotation. Hence, the phase integral J exactly corresponds to the shaded areas in Figs. 20.6 and 20.7.
The phase integral J = J(E) depends only on E and is constant in time, since the total energy is constant in time. Hence, (20.84) leads to the relations
\[
J = J(E) \quad\text{or}\quad E = E(J). \tag{20.85}
\]
As a consequence, the function S(q, E) changes to
\[
S(q, E) \Rightarrow S(q, E(J)) \equiv \bar S(q, J). \tag{20.86}
\]
The function $\bar S(q, J)$ can serve as the generator of a canonical transformation. The new momentum is now identified with J, i.e.,
\[
P = J. \tag{20.87}
\]
The canonically conjugate variable belonging to P = J is denoted by $Q = \varphi$. It is also called an angle variable and is calculated according to (20.4) as
\[
\varphi = \frac{\partial\bar S(q, J)}{\partial J}. \tag{20.88}
\]
As $\bar S$ does not explicitly depend on time, the transformed Hamiltonian $\bar H$ is obtained by expressing the original Hamiltonian (20.83) in terms of the transformed variables:
\[
\bar H(\varphi, J) = E(J). \tag{20.89}
\]
The Hamilton equations in the new coordinates then read
\[
\dot Q = \frac{\partial\bar H}{\partial P} \quad\text{or}\quad \dot\varphi = \frac{\partial E(J)}{\partial J} = \text{constant}, \qquad
\dot P = -\frac{\partial\bar H}{\partial Q} \quad\text{or}\quad \dot J = 0. \tag{20.90}
\]
$\partial E(J)/\partial J$ depends only on J, which is constant in time. Hence, $\dot\varphi$ is also constant in time, and then
\[
\varphi = \frac{\partial E}{\partial J}\,t + \delta. \tag{20.91}
\]
Here, the phase constant $\delta$ appears. If we had not selected $\bar S(q, J)$ as the generating function but rather the complete time-dependent action function
\[
W(q, J, t) = \bar S(q, J) - E(J)\,t, \tag{20.92}
\]
the coordinate conjugate to J would be according to (20.88)
\[
\frac{\partial W(q, J, t)}{\partial J} = \frac{\partial\bar S(q, J)}{\partial J} - \frac{\partial E(J)}{\partial J}\,t = \varphi - \frac{\partial E(J)}{\partial J}\,t = \delta, \tag{20.93}
\]
i.e., just the phase constant from (20.91). Equation (20.91) states that the angle variable $\varphi$ linearly increases with the time. It is a cyclic coordinate as is evident from (20.89), since the Hamiltonian $\bar H(\varphi, J) = E(J)$ does not depend on $\varphi$. The change of $\varphi$ during a period $\tau$ is found from (20.91) to be
\[
\Delta\varphi = \frac{\partial E}{\partial J}\,\tau, \tag{20.94}
\]
which can be specified more precisely by means of (20.88). We have
\[
\Delta\varphi = \oint\frac{\partial\varphi}{\partial q}\,dq = \oint\frac{\partial^2\bar S(q, J)}{\partial q\,\partial J}\,dq = \frac{\partial}{\partial J}\oint\frac{\partial\bar S(q, J)}{\partial q}\,dq = \frac{\partial J}{\partial J} = 1. \tag{20.95}
\]
Hence, during a period, after which the system returns to its initial configuration, the angular coordinate increases exactly by 1. We therefore can state that the motion of the system is periodic in $\varphi$ with the period 1. Combining (20.94) and (20.95) yields
\[
\frac{\partial E}{\partial J}\,\tau = 1 \quad\Leftrightarrow\quad \frac{\partial E}{\partial J} = \frac{1}{\tau} = \nu. \tag{20.96}
\]
$\nu$ is the frequency of the periodic motion. Obviously the complete solution of the equations of motion is not needed for calculating $\nu$. It is sufficient to express E as a function of J and to differentiate with respect to J. This is the advantage of introducing the action (J) and angle variables ($\varphi$). The approach is illustrated in Example 20.2 for the case of the harmonic oscillator.
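The relation ν = ∂E/∂J is also useful when E(J) is not known in closed form. As a hedged illustration that goes beyond the harmonic oscillator, the sketch below treats the librating plane pendulum with the assumed Hamiltonian H = p²/(2mL²) − mgL cos q (our choice of model and notation): J(E) is evaluated numerically and the frequency follows from a central difference. All names and parameter values are hypothetical.

```python
import math

def pendulum_action(E, m, L, g, steps=50_000):
    """J(E) = closed integral of p dq for a librating pendulum (-m g L < E < m g L)."""
    q_max = math.acos(-E / (m * g * L))      # turning point where p = 0
    dq = 2.0 * q_max / steps
    upper_branch = 0.0
    for i in range(steps):
        q = -q_max + (i + 0.5) * dq          # midpoint rule
        p_squared = 2.0 * m * L * L * (E + m * g * L * math.cos(q))
        upper_branch += math.sqrt(max(p_squared, 0.0)) * dq
    return 2.0 * upper_branch

def pendulum_frequency(amplitude, m=1.0, L=1.0, g=9.81):
    """nu = dE/dJ by a central difference, for a given angular amplitude in radians."""
    E = -m * g * L * math.cos(amplitude)
    dE = 1e-2 * (E + m * g * L)              # small compared with the energy above the minimum
    dJ = pendulum_action(E + dE, m, L, g) - pendulum_action(E - dE, m, L, g)
    return 2.0 * dE / dJ

nu_small_angle = math.sqrt(9.81 / 1.0) / (2.0 * math.pi)
print(pendulum_frequency(0.05) / nu_small_angle)  # close to 1 for small amplitudes
print(pendulum_frequency(1.5) / nu_small_angle)   # noticeably below 1
```

For small amplitudes the result approaches the harmonic value (1/2π)·sqrt(g/L); for larger amplitudes the frequency decreases, without the equation of motion ever being integrated.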
2. Separable multiply periodic systems: We imagine a conservative system with f degrees of freedom, which is described by the f coordinates $q_1,\dots,q_f$ and is separable. This means that the solution of the reduced Hamilton–Jacobi equation
\[
H\left(q_1,\dots,q_f;\frac{\partial S}{\partial q_1},\dots,\frac{\partial S}{\partial q_f}\right) = E \tag{20.97}
\]
can be written in the form
\[
S(q_1,\dots,q_f;E,\beta_2,\dots,\beta_f) = S_1(q_1;E,\beta_2,\dots,\beta_f) + \dots + S_f(q_f;E,\beta_2,\dots,\beta_f). \tag{20.98}
\]
The f integration constants
\[
E, \beta_2, \beta_3, \dots, \beta_f \tag{20.99}
\]
characterize the constant momenta $P_1,\dots,P_f$. If the Hamiltonian decomposes into a sum of terms $H_i(q_i, p_i)$, only one constant appears in each of the functions $S_k(q_k;E,\beta_2,\dots,\beta_f)$ in (20.98); the functions then have the form $S_k(q_k,\beta_k)$; see (20.10).
When can we classify a motion as periodic? The answer is simple: If any pair of conjugate variables $(q_i, p_i)$ always behaves as discussed in the first part of this example, the motion is periodic. More precisely: The projection of the phase trajectory onto each $q_i, p_i$-plane of the phase space must be either a libration or a rotation, to guarantee the periodicity of the entire motion of the system.
The procedure is analogous to that outlined in the first section. First one defines the action variables
\[
J_i = \oint p_i\,dq_i = \oint\frac{\partial S_i(q_i;E,\beta_2,\dots,\beta_f)}{\partial q_i}\,dq_i = J_i(E,\beta_2,\dots,\beta_f), \quad i = 1,\dots,f. \tag{20.100}
\]
They are constant in time, since $E,\beta_2,\dots,\beta_f$ are constant. The f equations (20.100) can be solved for $E,\beta_2,\dots,\beta_f$ and yield
\[
E = E(J_1,\dots,J_f), \quad \beta_2 = \beta_2(J_1,\dots,J_f), \quad \dots, \quad \beta_f = \beta_f(J_1,\dots,J_f). \tag{20.101}
\]
By inserting (20.101) into (20.98), we obtain
Si (qi , E(Jk ), β2 (Jk ), . . . , βf (Jk )) = S (qi , J1 , . . . , Jf ). (20.102)
This is a generating function with the constant momenta
Pi = Ji . (20.103)
The relation (20.102) is fully analogous to the relation (20.86), and (20.103) corresponds to (20.87). The canonically conjugate angle variables result, like (20.88), from
\[
\varphi_i = \frac{\partial\bar S}{\partial J_i} = \sum_{k=1}^{f}\frac{\partial\bar S_k(q_k, J_1,\dots,J_f)}{\partial J_i}, \quad i = 1,\dots,f. \tag{20.104}
\]
In terms of the canonical variables $(Q_i, P_i) = (\varphi_i, J_i)$, the Hamiltonian is
\[
\bar H(\varphi_i, J_i) = E(J_i), \tag{20.105}
\]
since the Hamiltonian is independent of time (see (20.83)). From this follow the Hamilton equations
\[
\dot\varphi_i = \frac{\partial\bar H}{\partial P_i} = \frac{\partial E(J_k)}{\partial J_i} = \text{constant} \equiv \nu_i, \qquad
\dot J_i = -\frac{\partial\bar H}{\partial\varphi_i} = 0; \tag{20.106}
\]
hence,
\[
\varphi_i = \nu_i t + \delta_i, \qquad J_i = \text{constant}. \tag{20.107}
\]
We are now interested in the change of the angle variables $\varphi_i$ over a period (full revolution or back-and-forth motion of a coordinate $q_k$ with the remaining coordinates kept fixed). It is given by
\[
\Delta_k\varphi_i = \oint\frac{\partial\varphi_i}{\partial q_k}\,dq_k = \oint\frac{\partial^2\bar S}{\partial J_i\,\partial q_k}\,dq_k = \frac{\partial}{\partial J_i}\oint\frac{\partial\bar S}{\partial q_k}\,dq_k = \frac{\partial J_k}{\partial J_i} = \delta_{ki}. \tag{20.108}
\]
According to (20.107),
\[
\Delta_k\varphi_k = \nu_k\tau_k, \tag{20.109}
\]
if $\tau_k$ is the “vibration time” (time interval of the period) of $q_k$. A comparison of (20.109) and (20.108) yields
\[
\nu_k\tau_k = 1. \tag{20.110}
\]
Thus,
\[
\nu_k = \frac{1}{\tau_k} \tag{20.111}
\]
obviously are the frequencies of the $q_k$-motion. In other words, according to (20.106) the (fundamental) frequency $\nu_k$ for the coordinate $q_k$ is $\nu_k = \partial E(J_1,\dots,J_f)/\partial J_k$.
Equations (20.104) can also be inverted, which yields the original coordinates $q_k$ with
\[
q_k = q_k(\varphi_1,\dots,\varphi_f), \quad k = 1,\dots,f \tag{20.112}
\]
as functions of the new angle variables $\varphi_i$. When increasing $\varphi_i$ by $\Delta\varphi_i = 1$ (keeping the values of all other $\varphi_k$ with $k \neq i$ fixed), $q_i$ (and only this!) must run through a period. This follows from (20.108): If $q_k$ (with $k \neq i$) ran through a period when $\varphi_i$ changes to $\varphi_i + \Delta\varphi_i = \varphi_i + 1$, then according to (20.108) the variable $\varphi_k$ also should increase by $\Delta\varphi_k = 1$. But this shall not occur by assumption. Therefore if $\varphi_i$ increases to $\varphi_i + 1$, $q_i$ changes as follows:
\[
\varphi_i \to \varphi_i + 1: \qquad q_i \to q_i \ \text{for libration}, \qquad q_i \to q_i + q_{i0} \ \text{for rotation}. \tag{20.113}
\]
For a libration, $q_i$ is periodic; for a rotation,
\[
q_i - \varphi_i q_{i0} \tag{20.114}
\]
is a periodic function of $\varphi_i$. Actually,
\[
\varphi_i \to \varphi_i + 1: \qquad q_i - \varphi_i q_{i0} \to q_i + q_{i0} - (\varphi_i + 1)q_{i0} = q_i - \varphi_i q_{i0}. \tag{20.115}
\]
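We therefore can expand the separation coordinates $q_i$ (for libration) or $q_i - \varphi_i q_{i0}$ (for rotations) in a Fourier series and write
\[
\left.\begin{array}{l} q_i(\varphi_1(t),\dots,\varphi_f(t)) \\ q_i - \varphi_i q_{i0}\,(\varphi_1(t),\dots,\varphi_f(t)) \end{array}\right\}
= \sum_{n=-\infty}^{+\infty} a_n^{(i)}\,e^{i2\pi\varphi_i n}
= \sum_{n=-\infty}^{+\infty} a_n^{(i)}\,e^{i2\pi n(\nu_i t + \delta_i)}, \tag{20.116}
\]
where
\[
a_n^{(i)}(\varphi_1,\dots,\varphi_{i-1},\varphi_{i+1},\dots,\varphi_f) = \int_0^1 q_i(\varphi_1,\dots,\varphi_f)\,e^{-i2\pi n\varphi_i}\,d\varphi_i. \tag{20.117}
\]
The Fourier coefficients $a_n^{(i)}(\varphi_1,\dots,\varphi_{i-1},\varphi_{i+1},\dots,\varphi_f)$ in general still depend on all angle variables, except for $\varphi_i$.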
We now imagine other variables $x_l$ which describe the system and are useful for certain problems. They shall unambiguously depend on the $q_i(t)$ and therefore are also functions of the time. Then we can write
\[
x_l(q_1(t),\dots,q_f(t)) = \sum_{n_1=-\infty}^{+\infty}\cdots\sum_{n_f=-\infty}^{+\infty} A^{(l)}_{n_1,\dots,n_f}\,e^{i2\pi(n_1\varphi_1+\dots+n_f\varphi_f)}
\]
\[
= \sum_{n_1,\dots,n_f=-\infty}^{\infty} A^{(l)}_{n_1,\dots,n_f}\,e^{i2\pi[(n_1\nu_1+\dots+n_f\nu_f)t + (n_1\delta_1+\dots+n_f\delta_f)]}, \quad l = 1,\dots,f. \tag{20.118}
\]
In the second step, we used $\varphi_i = \nu_i t + \delta_i$. The coordinates $x_l$ can be represented only by a multiple Fourier series. Equation (20.118) now suggests that the motion $x_l(t)$ is in general not periodic in time. For example, if t increases by $\Delta t = 1/\nu_1$, the first exponential factor in (20.118) does not change because $e^{i2\pi n_1\nu_1\Delta t} = e^{i2\pi n_1\nu_1(1/\nu_1)} = e^{i2\pi n_1} = 1$, but the other exponential factors in (20.118) vary. The system is therefore called multiply periodic in the coordinates $x_l$. If the system is simply periodic, the frequencies $\nu_1,\dots,\nu_f$ must be correlated by f − 1 equations of the type
\[
C_{i1}\nu_1 + C_{i2}\nu_2 + \dots + C_{if}\nu_f = 0, \quad i = 1,\dots,f-1. \tag{20.119}
\]
These are f − 1 equations for the f unknown quantities $\nu_1,\dots,\nu_f$. It is evident that besides $\nu_i$, $\nu\nu_i$ (with $\nu$ an arbitrary factor) is also a solution of (20.119). Let $\nu_i = n_i/m_i$; thus, the $\nu_i$ can be represented by the fraction $n_i/m_i$. Then
\[
\nu_i' = (m_1m_2\cdots m_f)\,\nu_i = (m_1\cdots m_f)\frac{n_i}{m_i} \tag{20.120}
\]
are also solutions of (20.119). Hence, the $\nu_i'$ are integers. Since the solutions of (20.119) can be determined only up to a common factor $\nu$, the general solution reads
\[
\nu_i = a_i\nu, \tag{20.121}
\]
where the $a_i = (m_1m_2\cdots m_f)\,n_i/m_i$ are integers, and $\nu$ is a common factor. Thus, the system is periodic if and only if all frequencies are commensurable. The fundamental frequency $\nu_0$ is then the greatest common divisor of all frequencies $\nu_1,\dots,\nu_f$. If there exist only s (with s ≤ f − 1) relations of the form (20.119), s frequencies can be rationally expressed by the remaining ones. The system (the motion) is then called s-fold degenerate or (f − s)-fold periodic. Special cases are
• s = 0: the motion is f -fold periodic or nondegenerate;
• s = f − 1: the motion is single-periodic or fully degenerate.
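For commensurable frequencies, the fundamental frequency can be computed explicitly. The following sketch is our own illustration and assumes that the frequencies are given as exact rational numbers (in units of some common factor); the function names are hypothetical. It returns the largest ν0 such that every ν_i is an integer multiple of ν0, together with the integers a_i = ν_i/ν0.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def fraction_gcd(a, b):
    """Largest rational g such that a/g and b/g are both integers."""
    return Fraction(gcd(a.numerator * b.denominator, b.numerator * a.denominator),
                    a.denominator * b.denominator)

def fundamental_frequency(frequencies):
    """Greatest common divisor of a list of rational frequencies."""
    return reduce(fraction_gcd, frequencies)

def integer_multiples(frequencies):
    """The integers a_i with nu_i = a_i * nu_0."""
    nu0 = fundamental_frequency(frequencies)
    return [(nu / nu0).numerator for nu in frequencies]

frequencies = [Fraction(2, 3), Fraction(1, 2)]
print(fundamental_frequency(frequencies))   # common fundamental frequency
print(integer_multiples(frequencies))       # how often each coordinate cycles per common period
```

If the frequency ratios are irrational, no such ν0 exists and the motion never repeats exactly; this case cannot be represented with `Fraction`.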
20.2 Transition to Quantum Mechanics

In the last chapters, we have emphasized the formal aspects of mechanics. Although
for solving practical problems sometimes no advantages could be achieved, the in-
sights in the structure of mechanics provided by the Hamiltonian formalism con-
tributed essentially to the development of quantum mechanics. For example, the con-
cept of the phase integral was of fundamental importance for the transition to quantum
mechanics. The first clear formulation of the quantum hypothesis consisted of the requirement that the phase integral take only discrete values; hence,
\[
J = \oint p\,dq = nh, \quad n = 1, 2, 3, \dots, \tag{20.122}
\]
where h is Planck’s action quantum, which has the value $h = 6.6\cdot10^{-34}$ J s. We again consider the case of the harmonic oscillator. In Example 20.2, we evaluated the phase integral
\[
J = 2\pi E\sqrt{\frac{m}{k}}, \tag{20.123}
\]
where $\nu = (1/2\pi)\sqrt{k/m}$ was the frequency. With the quantum hypothesis, we then obtain
\[
E_n = nh\nu. \tag{20.124}
\]
Thus, the quantum hypothesis leads to the conclusion that the vibrating mass point
can take only discrete energy values En . For the motion, this means that only certain
trajectories in the phase space are allowed. We therefore get ellipses for the phase-
space trajectories (compare Example 20.2), whose areas (the phase integral) always
differ by the amount h. In this way, the phase space acquires a grid structure that is
defined by the allowed trajectories.
Each trajectory corresponds to an energy En . In a transition between two trajec-
tories the mass point receives (or releases) the energy En − Em = (n − m)hν. The
smallest transferred amount of energy is given by hν.
Fig. 20.8. In quantum mechanics, the phase-space trajectories of the harmonic oscillator are ellipses that differ by an area of h
Since the action quantum h is so small, the discrete structure of the phase space
is significant only for atomic processes. For macroscopic processes the trajectories
in the phase space are so dense that one can consider the phase space as a contin-
uum. The energy quanta hν are so small that they have no meaning for macroscopic
processes. For example, the energy emitted in a transition in the hydrogen atom is
hν = 13.6 eV (electron volts). Expressed in the (macroscopic) unit of watt seconds, this is
$h\nu \approx 2 \cdot 10^{-18}$ W s. The quantum hypothesis was confirmed by the explanation of the
spectra of radiating atoms.
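The grid structure of the phase space can be checked numerically. The following is a minimal sketch, assuming illustrative values for the mass and the spring constant; the function names are not from the text. It evaluates the area of the phase-space ellipse at the energies $E_n = nh\nu$ and shows that the area is n times h.

```python
import math

H_PLANCK = 6.626e-34  # Planck's action quantum in J s


def oscillator_frequency(mass, k):
    """Frequency nu = (1/2 pi) sqrt(k/m) of the harmonic oscillator."""
    return math.sqrt(k / mass) / (2.0 * math.pi)


def allowed_energy(n, mass, k):
    """Energy E_n = n h nu following from the quantum hypothesis (20.124)."""
    return n * H_PLANCK * oscillator_frequency(mass, k)


def ellipse_area(energy, mass, k):
    """Area pi * p_max * q_max of the phase-space ellipse at a given energy."""
    p_max = math.sqrt(2.0 * mass * energy)  # momentum at q = 0
    q_max = math.sqrt(2.0 * energy / k)     # turning point, where p = 0
    return math.pi * p_max * q_max


if __name__ == "__main__":
    mass, k = 1.0e-26, 1.0  # assumed illustrative values in kg and N/m
    for n in (1, 2, 3):
        area = ellipse_area(allowed_energy(n, mass, k), mass, k)
        print(n, area / H_PLANCK)  # ratio of the enclosed area to h
```

Neighboring allowed ellipses therefore enclose areas that differ by exactly one action quantum, independently of the chosen mass and spring constant.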
EXERCISE
20.9 The Bohr–Sommerfeld Hydrogen Atom
Problem. At the beginning of the development of modern quantum mechanics,
N. Bohr and A. Sommerfeld formulated a “quantization prescription” for periodic
motions. Accordingly, only such trajectories in phase space are admitted for which
the phase integral
\[ \oint p_\alpha\,dq_\alpha = n_\alpha h, \qquad n_\alpha = 0, 1, 2, \ldots \tag{20.125} \]
is a multiple of Planck’s action quantum $h = 6.626 \cdot 10^{-34}$ J s. The integral extends
over a period of motion. $q_\alpha$ and $p_\alpha$ are the generalized coordinates and the canonically
conjugate momenta, respectively.
(a) Write the Lagrangian, the Hamiltonian, the Hamilton equations, and the constants
of motion for a particle in the potential $V(r) = -e^2/r$.
(b) Calculate the bound energy states of the hydrogen atom from the condition (20.125).
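Solution. (a) The Lagrangian is $L = T - V = (1/2)mv^2 + e^2/r$. The Hamiltonian
then follows as
\[ H = \sum_\alpha \frac{\partial L}{\partial \dot{x}_\alpha}\,\dot{x}_\alpha - L = \frac{1}{2}mv^2 - \frac{e^2}{r} = \frac{1}{2}m\dot{r}^2 + \frac{1}{2}mr^2\dot{\varphi}^2 - \frac{e^2}{r} \tag{20.126} \]
in polar coordinates. The canonical momenta are
\[ p_\varphi = \frac{\partial L}{\partial \dot{\varphi}} = mr^2\dot{\varphi} = L \quad\text{and}\quad p_r = \frac{\partial L}{\partial \dot{r}} = m\dot{r}. \tag{20.127} \]
Here, L on the right-hand side is the angular momentum of the particle (not to be
confused with the Lagrangian).
Then
\[ H(p, q) = \frac{p_r^2}{2m} + \frac{p_\varphi^2}{2mr^2} - \frac{e^2}{r}. \tag{20.128} \]
Constants of motion are
(i) H = E, since H(q, p) does not explicitly depend on the time; and
(ii) $p_\varphi = L$, since φ is a cyclic variable.
L represents the constant angular momentum. The Hamilton equations read
\[ \dot{p}_\varphi = -\frac{\partial H}{\partial \varphi} = 0, \qquad \dot{\varphi} = \frac{\partial H}{\partial p_\varphi} = \frac{p_\varphi}{mr^2}, \]
\[ \dot{p}_r = -\frac{\partial H}{\partial r} = -\frac{e^2}{r^2} + \frac{p_\varphi^2}{mr^3}, \qquad \dot{r} = \frac{\partial H}{\partial p_r} = \frac{p_r}{m}. \tag{20.129} \]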
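(b) The quantization conditions for the angular motion are
\[ lh = \oint p_\varphi\,d\varphi = \int_0^{2\pi} L\,d\varphi = 2\pi L \quad\Rightarrow\quad L = l\hbar, \qquad \hbar = \frac{h}{2\pi}, \qquad l = 0, 1, 2, \ldots, \tag{20.130} \]
i.e., the orbital angular momentum can take only integer multiples of ħ. For the radial
motion, the phase integral equals
\[ kh = \oint p_r\,dr = 2\int_{r_{\min}}^{r_{\max}} \sqrt{2m\left(E + \frac{e^2}{r}\right) - \frac{L^2}{r^2}}\;dr, \qquad k = 0, 1, 2, \ldots. \tag{20.131} \]
The limits of integration are determined from the condition
\[ p_r = 0; \tag{20.132} \]
thus,
\[ r_m^2 + \frac{e^2}{E}\,r_m - \frac{L^2}{2mE} = 0, \qquad r_m = -\frac{e^2}{2E} \mp \frac{\sqrt{-\Delta}}{4mE} \quad\text{with}\quad \Delta = -4m(2EL^2 + me^4). \tag{20.133} \]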
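The integral in (20.131) is of the type
\[ \int \frac{\sqrt{ar^2 + br + c}}{r}\,dr \equiv \int \frac{\sqrt{X(r)}}{r}\,dr, \tag{20.134} \]
and one easily verifies by differentiation that
\[ \int \frac{\sqrt{X(r)}}{r}\,dr = \sqrt{X(r)} + \frac{b}{2}\int \frac{dr}{\sqrt{X(r)}} + c\int \frac{dr}{r\sqrt{X(r)}}. \tag{20.135} \]
Here, $X(r) = ar^2 + br + c$. Furthermore,
\[ \int \frac{dr}{\sqrt{X(r)}} = -\frac{1}{\sqrt{-a}}\arcsin\frac{2ar + b}{\sqrt{-\Delta}} \quad\text{for } a < 0 \text{ (since } E < 0\text{)} \]
and
\[ \int \frac{dr}{r\sqrt{X(r)}} = \frac{1}{\sqrt{-c}}\arcsin\frac{br + 2c}{r\sqrt{-\Delta}} \quad\text{for } c < 0 \text{ and } \Delta < 0. \tag{20.136} \]
Here,
\[ \Delta = 4ac - b^2. \tag{20.137} \]
This leads to
\[ \int \frac{\sqrt{X(r)}}{r}\,dr = \sqrt{ar^2 + br + c} - \frac{b}{2\sqrt{-a}}\arcsin\frac{2ar + b}{\sqrt{-\Delta}} + \frac{c}{\sqrt{-c}}\arcsin\frac{br + 2c}{r\sqrt{-\Delta}} \tag{20.138} \]
for a < 0, c < 0, Δ < 0. In our case (see (20.131)),
\[ a = 2mE, \qquad b = 2me^2, \qquad c = -L^2, \qquad \Delta = -4m(2EL^2 + me^4). \tag{20.139} \]
For the integral, one gets (with E < 0)
\[ \int \frac{\sqrt{2mEr^2 + 2me^2r - L^2}}{r}\,dr = \sqrt{2mEr^2 + 2me^2r - L^2} - \frac{me^2}{\sqrt{-2mE}}\arcsin\frac{4mEr + 2me^2}{\sqrt{-\Delta}} - L\arcsin\frac{2me^2r - 2L^2}{r\sqrt{-\Delta}}. \tag{20.140} \]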
Insertion of the integration limits yields
\[ \sqrt{\frac{2\pi^2 me^4}{-E}} - 2\pi L = kh. \tag{20.141} \]
If one defines the “principal quantum number” $n = l + k = 0, 1, 2, \ldots$, the formula
for the binding energy reads
\[ E_n = -\frac{me^4}{2\hbar^2 n^2}. \tag{20.142} \]
This formula for the discrete energy levels in the hydrogen atom agrees exactly with
the quantum mechanical result. Only the value n = 0, which was allowed in this
consideration, is excluded in the quantum mechanical approach. The underlying classical
picture (the electron moves in an elliptic orbit with the eccentricity $\varepsilon = \sqrt{1 - (l/n)^2}$)
leads, however, to contradictions and must be modified in quantum mechanics. Because
n = l + k, the energy levels with n = 1, 2, . . . , are twofold, threefold, . . . , degenerate.
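The result (20.141) can be cross-checked by evaluating the radial phase integral (20.131) numerically. The sketch below assumes units with $m = e^2 = \hbar = 1$ (so that $h = 2\pi$); the function name and the substitution $r = \bar r - \Delta r\cos\theta$, which removes the square-root singularities at the turning points, are choices made here and are not taken from the text.

```python
import math


def radial_phase_integral(E, L, m=1.0, e2=1.0, steps=2000):
    """Closed radial phase integral of (20.131) for E < 0 and L > 0.

    With X(r) = a r^2 + b r + c = -a (r - r_min)(r_max - r), the substitution
    r = mid - half * cos(theta) turns p_r dr into a smooth integrand in theta.
    """
    a, b, c = 2.0 * m * E, 2.0 * m * e2, -L * L
    disc = b * b - 4.0 * a * c            # equals -Delta from (20.137)
    if a >= 0.0 or disc < 0.0:
        raise ValueError("no bound orbit for these values of E and L")
    root = math.sqrt(disc)
    r_min = (-b + root) / (2.0 * a)       # a < 0, hence r_min < r_max
    r_max = (-b - root) / (2.0 * a)
    mid, half = 0.5 * (r_min + r_max), 0.5 * (r_max - r_min)
    dtheta = math.pi / steps
    total = 0.0
    for i in range(steps):                # midpoint rule on [0, pi]
        theta = (i + 0.5) * dtheta
        r = mid - half * math.cos(theta)
        total += math.sqrt(-a) * (half * math.sin(theta)) ** 2 / r
    return 2.0 * total * dtheta           # factor 2: from r_min to r_max and back


if __name__ == "__main__":
    n, l = 3, 1
    E_n = -1.0 / (2.0 * n * n)            # (20.142) in the assumed units
    print(radial_phase_integral(E_n, float(l)) / (2.0 * math.pi))  # k = n - l
```

For an energy taken from (20.142) and an angular momentum $L = l\hbar$, the integral divided by h reproduces the radial quantum number k = n − l.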
EXERCISE
20.10 On Poisson Brackets

Problem. If the functions F and G depend on the coordinates $q_\alpha$, the momenta $p_\alpha$,
and the time t, the Poisson bracket of F and G is defined as follows:
\[ [F, G] = \sum_\alpha \left( \frac{\partial F}{\partial q_\alpha}\frac{\partial G}{\partial p_\alpha} - \frac{\partial F}{\partial p_\alpha}\frac{\partial G}{\partial q_\alpha} \right). \]
Show the following properties of this Poisson bracket:

(a) $[F, G] = -[G, F]$,
(b) $[F_1 + F_2, G] = [F_1, G] + [F_2, G]$,
(c) $[F, q_r] = -\partial F/\partial p_r$, and
(d) $[F, p_r] = \partial F/\partial q_r$.
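Solution. (a)
\[ [F, G] = \sum_\alpha \left( \frac{\partial F}{\partial q_\alpha}\frac{\partial G}{\partial p_\alpha} - \frac{\partial F}{\partial p_\alpha}\frac{\partial G}{\partial q_\alpha} \right) = -\sum_\alpha \left( \frac{\partial G}{\partial q_\alpha}\frac{\partial F}{\partial p_\alpha} - \frac{\partial G}{\partial p_\alpha}\frac{\partial F}{\partial q_\alpha} \right) = -[G, F]. \]
We see that the Poisson bracket is not commutative; it is antisymmetric.

(b)
\[ [F_1 + F_2, G] = \sum_\alpha \left( \frac{\partial (F_1 + F_2)}{\partial q_\alpha}\frac{\partial G}{\partial p_\alpha} - \frac{\partial (F_1 + F_2)}{\partial p_\alpha}\frac{\partial G}{\partial q_\alpha} \right) \]
\[ = \sum_\alpha \left( \frac{\partial F_1}{\partial q_\alpha}\frac{\partial G}{\partial p_\alpha} - \frac{\partial F_1}{\partial p_\alpha}\frac{\partial G}{\partial q_\alpha} \right) + \sum_\alpha \left( \frac{\partial F_2}{\partial q_\alpha}\frac{\partial G}{\partial p_\alpha} - \frac{\partial F_2}{\partial p_\alpha}\frac{\partial G}{\partial q_\alpha} \right) = [F_1, G] + [F_2, G]. \]
Therefore, the Poisson bracket is distributive.

(c)
\[ [F, q_r] = \sum_\alpha \left( \frac{\partial F}{\partial q_\alpha}\frac{\partial q_r}{\partial p_\alpha} - \frac{\partial F}{\partial p_\alpha}\frac{\partial q_r}{\partial q_\alpha} \right), \]
\[ \frac{\partial q_r}{\partial p_\alpha} = 0 \quad\Rightarrow\quad [F, q_r] = -\sum_\alpha \frac{\partial F}{\partial p_\alpha}\,\delta_{r\alpha} = -\frac{\partial F}{\partial p_r}. \]
Here, $\delta_{r\alpha}$ is the Kronecker symbol:
$\delta_{r\alpha} = 1$ for $r = \alpha$,
$\delta_{r\alpha} = 0$ for $r \neq \alpha$.
(d) Analogously, we find
\[ [F, p_r] = \sum_\alpha \left( \frac{\partial F}{\partial q_\alpha}\frac{\partial p_r}{\partial p_\alpha} - \frac{\partial F}{\partial p_\alpha}\frac{\partial p_r}{\partial q_\alpha} \right) = \sum_\alpha \frac{\partial F}{\partial q_\alpha}\,\delta_{r\alpha} = \frac{\partial F}{\partial q_r}, \]
since $\partial p_r/\partial q_\alpha = 0$.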
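The four properties can also be checked numerically at a sample phase-space point. The following is a minimal sketch that approximates the partial derivatives by central differences; the helper name `poisson_bracket` and the test functions F and G are hypothetical choices made for illustration.

```python
import math


def poisson_bracket(F, G, q, p, h=1e-5):
    """Approximate [F, G] at the point (q, p); F and G take two lists (q, p)."""

    def partial(func, kind, idx):
        # Central difference with respect to q[idx] (kind "q") or p[idx] (kind "p").
        q_hi, q_lo, p_hi, p_lo = list(q), list(q), list(p), list(p)
        if kind == "q":
            q_hi[idx] += h
            q_lo[idx] -= h
        else:
            p_hi[idx] += h
            p_lo[idx] -= h
        return (func(q_hi, p_hi) - func(q_lo, p_lo)) / (2.0 * h)

    return sum(
        partial(F, "q", a) * partial(G, "p", a)
        - partial(F, "p", a) * partial(G, "q", a)
        for a in range(len(q))
    )


def F(q, p):
    return q[0] ** 2 * p[1] + math.sin(q[1]) * p[0]


def G(q, p):
    return p[0] * p[1] + q[0] * q[1]


if __name__ == "__main__":
    q, p = [0.3, -1.2], [0.8, 0.5]
    print(poisson_bracket(F, G, q, p) + poisson_bracket(G, F, q, p))  # property (a)
    print(poisson_bracket(lambda q, p: q[0], lambda q, p: p[0], q, p))  # [q_0, p_0]
```

The finite-difference step h is an assumption of this sketch; for functions that are linear in the differentiated variable, the central difference is exact up to rounding.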

We shall meet the rules on Poisson brackets in quantum mechanics again, since the
transition to quantum mechanics (the so-called canonical quantization) is performed
by the transition to operators and by replacing the Poisson bracket [ , ] by the
commutator $(1/i\hbar)\{\ ,\ \}_-$, where
\[ \{A, B\}_- = AB - BA. \]
If we form, e.g., $[q_i, p_j]$, we obtain
\[ [q_i, p_j] = \sum_\alpha \left( \frac{\partial q_i}{\partial q_\alpha}\frac{\partial p_j}{\partial p_\alpha} - \frac{\partial q_i}{\partial p_\alpha}\frac{\partial p_j}{\partial q_\alpha} \right) = \delta_{ij}. \tag{20.143} \]
In the canonical quantization, one passes from the classical momenta $p_j$ to operator
momenta $\hat{p}_j$, and from the classical Poisson bracket [ , ] to the quantum mechanical
Poisson bracket $(1/i\hbar)\{\ ,\ \}_-$. Thus, in the canonical quantization one replaces the
relation (20.143) by
\[ \{q_i, \hat{p}_j\}_- = i\hbar\,\delta_{ij}. \tag{20.144} \]
Equation (20.144) is satisfied if $\hat{p}_j = -i\hbar\,\partial/\partial q_j$:
\[ \{q_i, \hat{p}_j\}_- = -i\hbar\left\{ q_i, \frac{\partial}{\partial q_j} \right\}_-, \]
where the commutator operates on a function $f(q_1, \ldots, q_\alpha)$. For example,
\[ -i\hbar\left\{ \frac{\partial}{\partial q_j}, q_i \right\}_- f(q_1, \ldots, q_\alpha) = -i\hbar\left[ \frac{\partial}{\partial q_j}\bigl(q_i f(q_1, \ldots, q_\alpha)\bigr) - q_i\frac{\partial}{\partial q_j} f(q_1, \ldots, q_\alpha) \right] = -i\hbar\,\delta_{ij}\, f(q_1, \ldots, q_\alpha), \]
where the product rule was used. Since interchanging the two entries of the commutator
reverses its sign, (20.144) is verified. The rules for the quantum mechanical commutators
are identical with those for the Poisson brackets. One
might say that quantum mechanics is another algebraic realization of the Poisson
brackets. As will be seen in quantum mechanics, this conclusion is premature and
in this form not correct.
EXERCISE
20.11 Total Time Derivative of an Arbitrary Function Depending on q, p, and t
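Problem. Let H denote the Hamiltonian. Show that for an arbitrary function f depending
on $q_i$, $p_i$, and t we have
\[ \frac{df}{dt} = \frac{\partial f}{\partial t} + [f, H]. \]
Solution. The total differential of the function $f(p_i, q_i, t)$ reads
\[ df = \frac{\partial f}{\partial t}\,dt + \sum_\alpha \left( \frac{\partial f}{\partial q_\alpha}\,dq_\alpha + \frac{\partial f}{\partial p_\alpha}\,dp_\alpha \right) \tag{20.145} \]
\[ \Rightarrow\quad \frac{df}{dt} = \frac{\partial f}{\partial t} + \sum_\alpha \left( \frac{\partial f}{\partial q_\alpha}\,\dot{q}_\alpha + \frac{\partial f}{\partial p_\alpha}\,\dot{p}_\alpha \right). \tag{20.146} \]
By means of the Hamilton equations
\[ \frac{\partial H}{\partial p_\alpha} = \dot{q}_\alpha, \qquad \frac{\partial H}{\partial q_\alpha} = -\dot{p}_\alpha, \]
we can rewrite (20.146) as
\[ \frac{df}{dt} = \frac{\partial f}{\partial t} + \sum_\alpha \left( \frac{\partial f}{\partial q_\alpha}\frac{\partial H}{\partial p_\alpha} - \frac{\partial f}{\partial p_\alpha}\frac{\partial H}{\partial q_\alpha} \right) = \frac{\partial f}{\partial t} + [f, H]. \tag{20.147} \]
Thus, the Poisson brackets enter automatically. Equation (20.147) reminds us even
more of the results of quantum mechanics than the analogies of the last problem. In
quantum mechanics we shall find the following expression for the time derivative of
an operator $\hat{F}$:
\[ \frac{d\hat{F}}{dt} = \frac{\partial \hat{F}}{\partial t} + \frac{1}{i\hbar}\{\hat{F}, \hat{H}\}_-, \tag{20.148} \]
where $\hat{H}$ represents the Hamiltonian operator of the quantum mechanical problem. It
is, e.g., of the form
\[ \hat{H} = \hat{H}(x, \hat{p}) \quad\text{with}\quad \hat{p} = -i\hbar\frac{\partial}{\partial x} \]
and depends in general on the coordinates, momentum operators, and possibly even
further quantities, e.g., spin.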
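The classical relation (20.147) can be illustrated along a known trajectory. The sketch below assumes a one-dimensional harmonic oscillator with arbitrarily chosen mass, spring constant, and test function f; all names are hypothetical. It compares the time derivative of f along the exact solution with the expression ∂f/∂t + [f, H], both approximated by central differences.

```python
import math

M, K = 2.0, 8.0  # assumed mass and spring constant; omega = sqrt(K/M) = 2


def trajectory(t, amp=1.5):
    """Exact solution q(t), p(t) of the Hamilton equations for H below."""
    w = math.sqrt(K / M)
    return amp * math.cos(w * t), -M * w * amp * math.sin(w * t)


def hamiltonian(q, p):
    return p * p / (2.0 * M) + 0.5 * K * q * q


def f(q, p, t):
    """Arbitrary test function with explicit time dependence."""
    return q * p * t + q * q


def bracket_side(q, p, t, h=1e-5):
    """Right-hand side of (20.147): partial f / partial t + [f, H]."""
    df_dt = (f(q, p, t + h) - f(q, p, t - h)) / (2.0 * h)
    df_dq = (f(q + h, p, t) - f(q - h, p, t)) / (2.0 * h)
    df_dp = (f(q, p + h, t) - f(q, p - h, t)) / (2.0 * h)
    dH_dq = (hamiltonian(q + h, p) - hamiltonian(q - h, p)) / (2.0 * h)
    dH_dp = (hamiltonian(q, p + h) - hamiltonian(q, p - h)) / (2.0 * h)
    return df_dt + df_dq * dH_dp - df_dp * dH_dq


def total_derivative(t, h=1e-5):
    """Left-hand side of (20.147): d/dt of f evaluated along the trajectory."""
    q_hi, p_hi = trajectory(t + h)
    q_lo, p_lo = trajectory(t - h)
    return (f(q_hi, p_hi, t + h) - f(q_lo, p_lo, t - h)) / (2.0 * h)


if __name__ == "__main__":
    t = 0.7
    q, p = trajectory(t)
    print(total_derivative(t), bracket_side(q, p, t))
```

Both numbers agree up to the discretization error of the difference quotients.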
21 Extended Hamilton–Lagrange Formalism
21.1 Extended Set of Euler–Lagrange Equations

The conventional formulation of the principle of least action (Hamilton principle,
see (18.25)) is based on the action functional S[qj (t)], defined by
\[ S[q_j(t)] = \int_{t_a}^{t_b} L\left( q_j, \frac{dq_j}{dt}, t \right) dt, \tag{21.1} \]
with L(qj , q̇j , t) denoting the system’s conventional Lagrangian, and (q1 (t), . . . ,
qn (t)) the set of configuration space variables as functions of time. In this formu-
lation, the independent variable time t plays the role of the Newtonian absolute time.
The clearest reformulation of the least action principle for relativistic physics is
accomplished by treating the time $t(s) = q_0(s)/c$, just like the configuration space
variables $q_j(s)$, as a dependent variable of a newly introduced independent variable, s.
The underlying idea is to place all space-time variables on equal footing. The action
functional (21.1) can then be rewritten in terms of an extended Lagrangian $L_1$,
\[ S_1[q_j(s), t(s)] = \int_{s_a}^{s_b} L_1\left( q_j, \frac{dq_j}{ds}, t, \frac{dt}{ds} \right) ds. \tag{21.2} \]
As the action functional (21.2) has the form of (21.1), the subsequent Euler–Lagrange
equations that determine the particular path $(\bar{q}_j(s), \bar{t}(s))$ on which the value of the
functional $S_1[\bar{q}_j(s), \bar{t}(s)]$ takes on an extremum adopt the customary form,
\[ \frac{d}{ds}\left( \frac{\partial L_1}{\partial\left(\frac{dq_\mu}{ds}\right)} \right) - \frac{\partial L_1}{\partial q_\mu} = 0. \tag{21.3} \]
Here, the index μ = 0, . . . , n spans the entire range of extended configuration space
variables. In particular, the Euler–Lagrange equation for t(s) reads
\[ \frac{d}{ds}\left( \frac{\partial L_1}{\partial\left(\frac{dt}{ds}\right)} \right) - \frac{\partial L_1}{\partial t} = 0. \]
The equations of motion for both qj (s) and t (s) are thus determined by the extended
Lagrangian L1 . The solution qj (t) of the Euler–Lagrange equations that equivalently
emerges from the corresponding conventional Lagrangian L may then be constructed
by eliminating the evolution parameter s.
As the actions, S and S1 , are supposed to be alternative characterizations of the
same underlying physical system, the action principles δS = 0 and δS1 = 0 must hold
simultaneously. This means that
\[ \delta \int_{s_a}^{s_b} L\,\frac{dt}{ds}\,ds = \delta \int_{s_a}^{s_b} L_1\,ds, \]
which, in turn, is assured if both integrands differ at most by the s-derivative of an
arbitrary differentiable function $F(q_j, t)$,
\[ L\,\frac{dt}{ds} = L_1 + \frac{dF}{ds}. \]
Functions $F(q_j, t)$ define a particular class of point transformations of the dynamical
variables, namely those that preserve the form of the Euler–Lagrange equations.
Such a transformation can be applied at any time in the discussion of a given La-
grangian system and should be distinguished from correlating L1 and L. We may thus
restrict ourselves without loss of generality to those correlations of L and L1 , where
F ≡ 0. In other words, we correlate L and L1 without performing simultaneously
a transformation of the dynamical variables. We will discuss this issue in the more
general context of extended canonical transformations in Sect. 21.3. The extended
Lagrangian L1 is then related to the conventional Lagrangian, L, by

\[ L_1\left( q_j, \frac{dq_j}{ds}, t, \frac{dt}{ds} \right) = L\left( q_j, \frac{dq_j}{dt}, t \right)\frac{dt}{ds}, \qquad \frac{dq_i}{dt} = \frac{dq_i/ds}{dt/ds}. \tag{21.4} \]
The derivatives of L1 from (21.4) with respect to its arguments can now be expressed
in terms of the conventional Lagrangian L as
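\[ \frac{\partial L_1}{\partial q_i} = \frac{\partial L}{\partial q_i}\frac{dt}{ds}, \qquad i = 1, \ldots, n, \tag{21.5} \]
\[ \frac{\partial L_1}{\partial t} = \frac{\partial L}{\partial t}\frac{dt}{ds}, \tag{21.6} \]
\[ \frac{\partial L_1}{\partial\left(\frac{dq_i}{ds}\right)} = \frac{\partial L}{\partial\left(\frac{dq_i}{dt}\right)}, \qquad i = 1, \ldots, n, \tag{21.7} \]
\[ \frac{\partial L_1}{\partial\left(\frac{dt}{ds}\right)} = L - \sum_{i=1}^{n} \frac{\partial L}{\partial\left(\frac{dq_i}{dt}\right)}\frac{dq_i}{dt}. \tag{21.8} \]
With $q_0 \equiv ct$, (21.7) and (21.8) yield for the following sum over the extended range
μ = 0, . . . , n of dynamical variables
\[ \sum_{\mu=0}^{n} \frac{\partial L_1}{\partial\left(\frac{dq_\mu}{ds}\right)}\frac{dq_\mu}{ds} = \left( L - \sum_{i=1}^{n} \frac{\partial L}{\partial\left(\frac{dq_i}{dt}\right)}\frac{dq_i}{dt} \right)\frac{dt}{ds} + \sum_{i=1}^{n} \frac{\partial L}{\partial\left(\frac{dq_i}{dt}\right)}\frac{dq_i}{ds} = L\,\frac{dt}{ds} = L_1. \]
The extended Lagrangian $L_1$ thus satisfies the constraint
\[ L_1 - \sum_{\mu=0}^{n} \frac{\partial L_1}{\partial\left(\frac{dq_\mu}{ds}\right)}\frac{dq_\mu}{ds} = 0. \tag{21.9} \]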
The correlation (21.4) and the pertaining condition (21.9) allow two interpretations,
depending on which Lagrangian is primarily given, and which one is derived. If the
conventional Lagrangian L is the given function to describe the dynamical system in
question and L1 is derived from L according to (21.4), then L1 is a homogeneous form
of first order in the n + 1 variables dq0 /ds, . . . , dqn /ds. This may be seen by replac-
ing all derivatives dqμ /ds with a × dqμ /ds, a ∈ R in (21.4). Consequently, Euler’s
theorem on homogeneous functions states that (21.9) constitutes an identity for L1 .

The Euler–Lagrange equation involving dt/ds then also yields an identity, hence, we
do not obtain a substantial equation of motion for t (s). In this case, the parameteri-
zation of time t (s) is left undetermined—which reflects the fact that a conventional
Lagrangian does not provide any information on a parameterization of time.
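That (21.9) is an identity for an extended Lagrangian derived from a conventional one can be checked numerically. The following is a minimal sketch, assuming a one-dimensional harmonic oscillator as the conventional Lagrangian; the function names are hypothetical, and the velocity derivatives are approximated by central differences.

```python
def conventional_lagrangian(q, v, t, m=1.0, k=1.0):
    """Assumed example: L = T - V of a one-dimensional harmonic oscillator."""
    return 0.5 * m * v * v - 0.5 * k * q * q


def extended_lagrangian(q, dq_ds, t, dt_ds):
    """L1 according to (21.4): L at dq/dt = (dq/ds)/(dt/ds), times dt/ds."""
    return conventional_lagrangian(q, dq_ds / dt_ds, t) * dt_ds


def constraint_residual(q, dq_ds, t, dt_ds, h=1e-6):
    """Left-hand side of (21.9); the factor c cancels in the mu = 0 term."""
    dL1_ddq = (extended_lagrangian(q, dq_ds + h, t, dt_ds)
               - extended_lagrangian(q, dq_ds - h, t, dt_ds)) / (2.0 * h)
    dL1_ddt = (extended_lagrangian(q, dq_ds, t, dt_ds + h)
               - extended_lagrangian(q, dq_ds, t, dt_ds - h)) / (2.0 * h)
    return (extended_lagrangian(q, dq_ds, t, dt_ds)
            - dL1_ddq * dq_ds - dL1_ddt * dt_ds)


if __name__ == "__main__":
    # Scaling both derivatives by a scales L1 by a (homogeneity of first order).
    a = 2.0
    print(extended_lagrangian(0.4, a * 0.7, 0.0, a * 1.3),
          a * extended_lagrangian(0.4, 0.7, 0.0, 1.3))
    print(constraint_residual(0.4, 0.7, 0.0, 1.3))  # vanishes up to rounding
```

The residual vanishes for any choice of dq/ds and dt/ds, which is why no equation of motion for t(s) follows in this case.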
In the opposite case, if an extended Lagrangian L1 is the primary function to de-
scribe our system, then L1 is no longer a homogeneous function, in general. In that
case, (21.9) no longer establishes an identity but (21.9) furnishes a constraint function
for the system. Furthermore, the Euler–Lagrange equation involving dt/ds then yields
a non-trivial equation of motion for t (s). The conventional Lagrangian L may then be
deduced from (21.4) by means of the constraint function (21.9).
To summarize, by switching from the conventional variational principle (21.1) to
the extended representation (21.2), we have introduced an extended Lagrangian L1
that additionally depends on dt (s)/ds. Due to the emerging constraint function (21.9),
the actual number of degrees of freedom is unchanged. Geometrically, the system’s
motion now takes place on a hyper-surface, defined by (21.9), within the tangent bun-
dle T (M × R) over the space-time configuration manifold M × R. This contrasts with
the conventional, unconstrained Lagrangian description on the time-dependent tangent
bundle (T M) × R.
EXAMPLE
21.1 Extended Lagrangian for a Relativistic Free Particle

As only expressions of the form $\sum_i q_i^2 - c^2t^2$ are preserved under the Lorentz group,
the conventional Lagrangian for a free point particle of rest mass $m_0$, given by
\[ L_{nr}\left( q_j, \frac{dq_j}{dt}, t \right) = T - V = \frac{1}{2}m_0 \sum_{i=1}^{3} \left( \frac{dq_i}{dt} \right)^2 - m_0c^2, \tag{21.10} \]
is obviously not Lorentz-invariant. Yet, in the extended description, a corresponding
Lorentz-invariant Lagrangian $L_1$ can be constructed by introducing s as the
new independent variable, and by treating the space and time variables, $q_j(s)$ and
$q_0(s) = ct(s)$, equally. This is achieved by adding the corresponding derivative of the
time variable t(s),
\[ L_1\left( q_j, \frac{dq_j}{ds}, t, \frac{dt}{ds} \right) = \frac{1}{2}m_0c^2 \left[ \frac{1}{c^2}\sum_{i=1}^{3} \left( \frac{dq_i}{ds} \right)^2 - \left( \frac{dt}{ds} \right)^2 - 1 \right]. \tag{21.11} \]
The constant third term has been chosen to ensure that $L_1$ converges
to $L_{nr}$ in the limit dt/ds → 1. Of course, the dynamics following from (21.10) and
(21.11) are different, which reflects the modification our dynamics encounters if we
switch from a non-relativistic to a relativistic description. With the Lagrangian (21.11),
we obtain from (21.9) the constraint
\[ \left( \frac{dt}{ds} \right)^2 - \frac{1}{c^2}\sum_{i=1}^{3} \left( \frac{dq_i}{ds} \right)^2 - 1 = 0. \tag{21.12} \]
As usual for constrained Lagrangian systems, we must not insert the constraint
function back into the Lagrangian prior to setting up the Euler–Lagrange equations.
Physically, the constraint (21.12) reflects the fact that the square of the four-velocity
vector is constant. It equals $-c^2$ if the sign convention of the Minkowski metric is
defined as $\eta_{\mu\nu} = \eta^{\mu\nu} = \mathrm{diag}(-1, +1, +1, +1)$. We thus find that in the case of the
Lagrangian (21.11) the system evolution parameter s is physically nothing else than the
particle’s proper time. In contrast to the non-relativistic description, the constant rest
energy term $-\frac{1}{2}m_0c^2$ in the extended Lagrangian (21.11) is essential. The constraint
can alternatively be expressed as
\[ \frac{ds}{dt} = \sqrt{1 - \frac{1}{c^2}\sum_{i=1}^{3} \left( \frac{dq_i}{dt} \right)^2} = \gamma^{-1}, \]
which yields the usual relativistic scale factor, γ. The conventional Lagrangian L that
describes the same dynamics as the extended Lagrangian $L_1$ from (21.11) is derived
according to (21.4),
\[ \begin{aligned} L\left( q_j, \frac{dq_j}{dt}, t \right) &= L_1\left( q_j, \frac{dq_j}{ds}, t, \frac{dt}{ds} \right)\frac{ds}{dt} \\ &= \frac{1}{2}m_0c^2 \left[ \frac{1}{c^2}\sum_i \left( \frac{dq_i}{dt} \right)^2 \frac{dt}{ds} - \frac{dt}{ds} - \frac{ds}{dt} \right] \\ &= -\frac{1}{2}m_0c^2 \left[ \frac{ds}{dt} + \frac{dt}{ds}\left( 1 - \frac{1}{c^2}\sum_i \left( \frac{dq_i}{dt} \right)^2 \right) \right] \\ &= -m_0c^2 \sqrt{1 - \frac{1}{c^2}\sum_i \left( \frac{dq_i}{dt} \right)^2}. \end{aligned} \tag{21.13} \]
We thus encounter the well-known conventional Lagrangian of a relativistic free

particle. In contrast to the equivalent extended Lagrangian from (21.11), the La-
grangian (21.13) is not quadratic in the derivatives of the dependent variables, qj (t).
The loss of the quadratic form originates from the projection of the constrained
description on the tangent bundle T (M × R) to the unconstrained description on
(T M) × R. The quadratic form is recovered in the non-relativistic limit by expanding
the square root, which yields the Lagrangian Lnr from (21.10).
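The non-relativistic limit can be made quantitative with a few lines of code. The sketch below assumes units with $m_0 = c = 1$ and hypothetical function names; it compares (21.13) with (21.10) for a small speed and shows that the difference is given by the next term of the square-root expansion, $\frac{1}{8}m_0v^4/c^2$.

```python
import math


def relativistic_lagrangian(v, m0=1.0, c=1.0):
    """Conventional Lagrangian (21.13) of the free relativistic particle."""
    return -m0 * c * c * math.sqrt(1.0 - (v / c) ** 2)


def nonrelativistic_lagrangian(v, m0=1.0, c=1.0):
    """Lagrangian (21.10), including the rest energy term."""
    return 0.5 * m0 * v * v - m0 * c * c


if __name__ == "__main__":
    v = 0.01  # one percent of the speed of light in the assumed units
    diff = relativistic_lagrangian(v) - nonrelativistic_lagrangian(v)
    print(diff, v ** 4 / 8.0)  # leading correction (1/8) m0 v^4 / c^2
```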
EXAMPLE
21.2 Extended Lagrangian for a Relativistic Particle in an External Electromagnetic Field
The extended Lagrangian $L_1$ of a point particle of rest mass $m_0$ and charge ζ in an
external electromagnetic field that is described by the potentials $\phi(q_j, t)$ and $A(q_j, t)$
is given by
\[ L_1\left( q_j, \frac{dq_j}{ds}, t, \frac{dt}{ds} \right) = \frac{1}{2}m_0c^2 \left[ \frac{1}{c^2}\sum_i \left( \frac{dq_i}{ds} \right)^2 - \left( \frac{dt}{ds} \right)^2 - 1 \right] + \frac{\zeta}{c}\sum_i A_i\frac{dq_i}{ds} - \zeta\phi\,\frac{dt}{ds}. \tag{21.14} \]
The associated constraint function coincides with that for the free-particle Lagrangian
from (21.12), as all terms linear in the velocities drop out when calculating the
difference in (21.9). Similar to the free particle case from (21.13), the extended
Lagrangian (21.14) may be projected into (T M) × R to yield the well-known
conventional relativistic Lagrangian L,
\[ L\left( q_j, \frac{dq_j}{dt}, t \right) = -m_0c^2 \sqrt{1 - \frac{1}{c^2}\sum_i \left( \frac{dq_i}{dt} \right)^2} + \frac{\zeta}{c}\sum_i A_i\frac{dq_i}{dt} - \zeta\phi. \tag{21.15} \]
Again, the quadratic form of the velocity terms is lost owing to the projection.
For small velocities $dq_j/dt$, the quadratic form is regained as the square root in
(21.15) may be expanded to yield the conventional non-relativistic Lagrangian for a
point particle in an external electromagnetic field,
\[ L_{nr}\left( q_j, \frac{dq_j}{dt}, t \right) = \frac{1}{2}m_0 \sum_i \left( \frac{dq_i}{dt} \right)^2 + \frac{\zeta}{c}\sum_i A_i\frac{dq_i}{dt} - \zeta\phi - m_0c^2. \tag{21.16} \]
Significantly, this Lagrangian can be derived directly, hence without the detour over
the projected Lagrangian (21.15), from the extended Lagrangian (21.14) by letting
dt/ds → 1.
Comparing the Lagrangian (21.16) with the extended Lagrangian from (21.14),
we notice that the transition to the non-relativistic description is made by identify-
ing the proper time s with the laboratory time t = q0 /c. The remarkable formal sim-
ilarity of the Lorentz-invariant extended Lagrangian (21.14) with the non-invariant
conventional Lagrangian (21.16) suggests that approaches based on non-relativistic
Lagrangians Lnr may be transposed to a relativistic description by (i) introducing the
proper time s as the new system evolution parameter, (ii) treating the time t (s) as
an additional dependent variable on equal footing with the configuration space vari-
ables q(s)—commonly referred to as the “principle of homogeneity in space-time”—
and (iii) by replacing the conventional non-relativistic Lagrangian Lnr with the cor-
responding Lorentz-invariant extended Lagrangian L1 , similar to the transition from
(21.16) to (21.14).
21.2 Extended Set of Canonical Equations
The Lagrangian formulation of particle dynamics can equivalently be expressed as a

Hamiltonian description. The complete information on the given dynamical system is
then contained in a Hamiltonian H , which carries the same information content as the
corresponding Lagrangian L. It is defined by the Legendre transformation

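\[ H(q_j, p_j, t) = \sum_{i=1}^{n} p_i\frac{dq_i}{dt} - L\left( q_j, \frac{dq_j}{dt}, t \right), \tag{21.17} \]
with the canonical momenta $p_i$ being defined by
\[ p_i = \frac{\partial L}{\partial\left(\frac{dq_i}{dt}\right)}. \]
Correspondingly, the extended Hamiltonian $H_1$ is defined as the extended Legendre
transform of the extended Lagrangian $L_1$ as
\[ H_1(q_j, p_j, t, e) = \sum_{i=1}^{n} p_i\frac{dq_i}{ds} - e\,\frac{dt}{ds} - L_1\left( q_j, \frac{dq_j}{ds}, t, \frac{dt}{ds} \right). \tag{21.18} \]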
Herein, −e denotes the conjugate quantity of time t. Corresponding to the conventional
formalism, we assume the extended Lagrangian $L_1$ to be regular, which
means that the Hessian matrix $\partial^2 L_1/[\partial(dq_\mu/ds)\,\partial(dq_\nu/ds)]$ is invertible. We know
from (21.7) that for i = 1, . . . , n the momentum variable $p_i$ is equally obtained from
the extended Lagrangian $L_1$,
\[ p_i = \frac{\partial L_1}{\partial\left(\frac{dq_i}{ds}\right)}. \tag{21.19} \]
This fact ensures that the Legendre transformations (21.17) and (21.18) are compatible.
For the corresponding definition of $p_0$, we must take some care as the derivative of $L_1$
with respect to dt/ds evaluates to
\[ \frac{\partial L_1}{\partial\left(\frac{dt}{ds}\right)} = L - \sum_{i=1}^{n} \frac{dq_i}{dt}\frac{\partial L}{\partial\left(\frac{dq_i}{dt}\right)} = -H(q_j, p_j, t). \]
The momentum coordinate $p_0$ that is conjugate to $q_0 = ct$ must therefore be defined
as
\[ p_0(s) = -\frac{e(s)}{c}, \qquad e(s) \overset{\not\equiv}{=} H\bigl( q_j(s), p_j(s), t(s) \bigr), \tag{21.20} \]
with e(s) representing the instantaneous value of the Hamiltonian H at s, but not the
function H proper. This distinction is essential as the canonical coordinate p0 must
be defined—like all other canonical coordinates—as a function of the independent
variable only. The reason is that the qμ , pμ with μ = 0, . . . , n depict the coordinates
pertaining to the base vectors that span the (symplectic) extended phase space. We
may express this fact by means of the comprehensible notation
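\[ p_0(s) = \frac{\partial L_1}{\partial\left(\frac{dq_0}{ds}\right)}(s) \quad\Leftrightarrow\quad e(s) = -\frac{\partial L_1}{\partial\left(\frac{dt}{ds}\right)}(s). \tag{21.21} \]
The constraint function from (21.9) translates in the extended Hamiltonian description
simply into
\[ H_1\bigl( q_j(s), p_j(s), t(s), e(s) \bigr) = 0. \tag{21.22} \]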
This means that the extended Hamiltonian H1 directly defines the hyper-surface on
which the classical motion of the system takes place. The hyper-surface lies within the
cotangent bundle T ∗ (M × R) over the same extended configuration manifold M × R
as in the case of the Lagrangian description. Inserting (21.19) and (21.21) into the
extended set of Euler–Lagrange equations (21.3) yields the extended set of canonical
equations,
\[ \frac{dp_\mu}{ds} = -\frac{\partial H_1}{\partial q_\mu}, \qquad \frac{dq_\mu}{ds} = \frac{\partial H_1}{\partial p_\mu}. \tag{21.23} \]
The right-hand sides of these equations follow directly from the Legendre transforma-
tion (21.18) since the Lagrangian L1 does not depend on the momenta pμ and has,
up to the sign, the same space-time dependence as the Hamiltonian H1 . The extended
set is characterized by the additional pair of canonical equations for the index μ = 0,
which reads in terms of t (s) and e(s)
\[ \frac{de}{ds} = \frac{\partial H_1}{\partial t}, \qquad \frac{dt}{ds} = -\frac{\partial H_1}{\partial e}. \tag{21.24} \]
In contrast to the total time derivative of the Hamiltonian H (qj , pj , t), the total s
derivative of the extended Hamiltonian H1 (qν , pν ) always vanishes. Calculating the
total s derivative of H1 , and inserting subsequently the extended set of canonical equa-
tions (21.23), we find
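\[ \frac{dH_1}{ds} = \sum_{\mu=0}^{n} \left( \frac{\partial H_1}{\partial q_\mu}\frac{dq_\mu}{ds} + \frac{\partial H_1}{\partial p_\mu}\frac{dp_\mu}{ds} \right) = \sum_{\mu=0}^{n} \left( \frac{\partial H_1}{\partial q_\mu}\frac{\partial H_1}{\partial p_\mu} - \frac{\partial H_1}{\partial p_\mu}\frac{\partial H_1}{\partial q_\mu} \right) \equiv 0. \]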
Formally, an extended Hamiltonian H1 (pν , qν ) = const. thus describes an autonomous

Hamilton system, hence a system that does not explicitly depend on its independent
variable.
By virtue of the Legendre transformations (21.17) and (21.18), the correlation
from (21.4) of extended and conventional Lagrangians is finally converted into
\[ H_1(q_j, p, t, e) = \bigl( H(q_j, p, t) - e \bigr)\frac{dt}{ds}, \tag{21.25} \]
as only the term for the index μ = 0 does not cancel after inserting (21.17) and (21.18)
into (21.4).
The conventional Hamiltonian H is defined as the particular function whose
value coincides with the extended phase-space variable e. In accordance with (21.20)
and (21.22), we thus determine H for any given extended Hamiltonian H1 by solving
H1 = 0 for e. Then, H emerges as the right-hand side of the equation e = H .
In the converse case, if merely a conventional Hamiltonian H is given, and H1 is set
up according to (21.25), then the canonical equation for dt/ds yields an identity, hence
allows arbitrary parameterizations of time. This is not astonishing as a conventional
Hamiltonian H generally does not provide the information for an equation of motion
for t (s).
EXAMPLE
21.3 Trivial Extended Hamiltonian
The trivial extended Hamiltonian H1 is defined by
H1 (qj , pj , t, e) = H (qj , pj , t) − e. (21.26)
According to (21.24), the canonical equation for dt/ds is obtained as
\[ \frac{dt}{ds} = -\frac{\partial H_1}{\partial e} = 1. \]
Up to arbitrary shifts of the origin of our time scale, we thus identify t (s) with s. As all
other partial derivatives of H1 coincide with those of H , so do the respective canonical
equations. The system description in terms of H1 from (21.26) is thus identical to the
conventional description by a Hamiltonian H and does not provide any additional
information.
EXAMPLE
21.4 Hamiltonian of a Free Relativistic Particle
We present a systematic approach to setting up the Lorentz-invariant extended
Hamiltonian $H_{1,r}$ for a given non-invariant conventional Hamiltonian $H_{nr}$. For the
case of a free particle of rest mass $m_0$, the non-relativistic Hamiltonian that includes
the particle’s rest energy is given by
\[ H_{nr}(p) = \frac{p^2}{2m_0} + m_0c^2. \tag{21.27} \]
Herein, p denotes the 3-component vector of particle momenta, $p = (p_1, p_2, p_3)$. The
equivalent extended Hamiltonian (21.26) that yields the same dynamics in terms of the
subsequent canonical equations (21.23) is then
\[ H_{1,nr}(p, e) = \frac{p^2}{2m_0} - e + m_0c^2, \tag{21.28} \]
in conjunction with the general side condition for extended Hamiltonians,
$H_{1,nr}(q, p, t, e) = 0$. As solely expressions of the form $q^2 - c^2t^2$ and $p^2 - e^2/c^2$
are maintained under Lorentz transformations (see Example 21.18), the Hamiltonians
(21.27) and (21.28) are obviously not Lorentz invariant. In the description of extended
Hamiltonians, the corresponding Lorentz-invariant form of (21.28) can easily
be constructed,
\[ H_{1,r}(p, e) = \frac{1}{2m_0}\left( p^2 - \frac{e^2}{c^2} \right) + \frac{1}{2}m_0c^2. \tag{21.29} \]
The constant term was adjusted to preserve the relation $e = m_0c^2$ for p = 0. The side
condition $H_{1,r} = 0$, which represents an implicit function, now yields the relativistic
energy-momentum correlation
\[ e^2 = p^2c^2 + m_0^2c^4. \tag{21.30} \]
Although p and e denote formally independent canonical variables, only those com-
binations of p and e have physical significance that satisfy (21.30). Of course,
the canonical equations that follow from (21.29) are different from those following
from (21.28). This reflects the modification that a system’s description encounters
if we switch from a non-relativistic to a relativistic viewpoint. The extended set of
canonical equations (21.23) emerging from the extended Hamiltonian (21.29) is
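\[ -\frac{\partial H_{1,r}}{\partial q_i} = \frac{dp_i}{ds} = 0, \qquad \frac{\partial H_{1,r}}{\partial p_i} = \frac{dq_i}{ds} = \frac{p_i}{m_0}, \]
\[ \frac{\partial H_{1,r}}{\partial t} = \frac{de}{ds} = 0, \qquad -\frac{\partial H_{1,r}}{\partial e} = \frac{dt}{ds} = \frac{e}{m_0c^2}. \]
In conjunction with the energy-momentum correlation from (21.30), the non-trivial
canonical equations are expressed as
\[ \frac{dq_i}{ds} = \frac{p_i}{m_0}, \qquad \frac{dt}{ds} = \frac{e}{m_0c^2} = \frac{\sqrt{p^2c^2 + m_0^2c^4}}{m_0c^2}. \tag{21.31} \]
We may finally rewrite the canonical equation for the $q_i$ in terms of the time t as the
independent variable
\[ \frac{dq_i}{dt} = \frac{dq_i}{ds}\frac{ds}{dt} = \frac{p_i}{m_0}\frac{m_0c^2}{e} = \frac{p_ic^2}{e} = \frac{p_ic^2}{\sqrt{p^2c^2 + m_0^2c^4}} = \frac{\partial H_r(p)}{\partial p_i}. \tag{21.32} \]
But this is nothing else than the non-trivial canonical equation of the conventional
Hamiltonian
\[ H_r(p) = e = \sqrt{p^2c^2 + m_0^2c^4}. \tag{21.33} \]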
We thus encounter the well-known conventional Hamiltonian Hr (p) of the free rela-
tivistic particle. In contrast to the extended Hamiltonian (21.29), the physically equiv-
alent conventional Hamiltonian (21.33) does not manifest anymore its Lorentz invari-
ance.
To complete this example, we show that the extended Hamiltonian (21.29) also
emerges as the Legendre-transformed Lagrangian (21.11) from Example 21.1. The
extended Legendre transformation that relates extended Lagrangians with extended
Hamiltonians was defined in equations (21.18), (21.19), and (21.21) of Sect. 21.2. For
the addressed case, the canonical momenta evaluate to
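\[ p_i = \frac{\partial L_1}{\partial\left(\frac{dq_i}{ds}\right)} = m_0\frac{dq_i}{ds}, \qquad e = -\frac{\partial L_1}{\partial\left(\frac{dt}{ds}\right)} = m_0c^2\frac{dt}{ds}. \]
The extended Hamiltonian is then obtained by expressing the derivatives $dq_i/ds$ and
dt/ds that are contained in the Lagrangian and in the Legendre transformation rule in
terms of the momenta $p_i$ and e,
\[ H_1(p, e) = \sum_i \frac{p_i^2}{m_0} - \frac{e^2}{m_0c^2} - L_1(p, e), \qquad L_1(p, e) = \sum_i \frac{p_i^2}{2m_0} - \frac{e^2}{2m_0c^2} - \frac{1}{2}m_0c^2. \]
The extended Hamiltonian is then
\[ H_1(p, e) = \frac{1}{2m_0}\left( \sum_i p_i^2 - \frac{e^2}{c^2} \right) + \frac{1}{2}m_0c^2, \]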
which coincides with (21.29).
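The relations of this example can be verified numerically. The following is a minimal sketch, assuming units with $m_0 = c = 1$ and hypothetical function names. It checks that the extended Hamiltonian (21.29) vanishes when e is taken from (21.33), and that the velocity (21.32) equals the momentum derivative of $H_r$ and stays below c.

```python
import math


def extended_hamiltonian(p, e, m0=1.0, c=1.0):
    """H_{1,r}(p, e) from (21.29); p is a sequence of momentum components."""
    p2 = sum(pi * pi for pi in p)
    return (p2 - e * e / (c * c)) / (2.0 * m0) + 0.5 * m0 * c * c


def conventional_hamiltonian(p, m0=1.0, c=1.0):
    """H_r(p) from (21.33)."""
    p2 = sum(pi * pi for pi in p)
    return math.sqrt(p2 * c * c + m0 * m0 * c ** 4)


def velocity(p, m0=1.0, c=1.0):
    """dq_i/dt = p_i c^2 / e from (21.32), with e on the constraint surface."""
    e = conventional_hamiltonian(p, m0, c)
    return [pi * c * c / e for pi in p]


if __name__ == "__main__":
    p = [0.3, -1.1, 2.0]
    e = conventional_hamiltonian(p)
    print(extended_hamiltonian(p, e))                 # constraint H_{1,r} = 0
    print(math.sqrt(sum(v * v for v in velocity(p))))  # speed, below c = 1
```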
EXAMPLE
21.5 Hamiltonian of a Relativistic Particle in a Potential V (q, t)
The non-relativistic dynamics of a particle in a potential V (q, t) is described by the

Hamiltonian
\[ H_{nr}(q, p, t) = \frac{p^2}{2m_0} + V(q, t). \tag{21.34} \]
Analogously to the dynamics of a free relativistic particle, treated in Example 21.4,
the relativistic dynamics of a particle in an external potential V(q, t) is described by
the extended Hamiltonian
\[ H_{1,r}(q, p, t, e) = \frac{1}{2m_0}\left[ p^2 - \left( \frac{e - V(q, t)}{c} \right)^2 \right] + \frac{1}{2}m_0c^2. \tag{21.35} \]
The constant term was chosen to ensure that for p = 0 and V(q, t) = 0 the constraint
$H_{1,r} = 0$ leads to $e = m_0c^2$. Consequently, for the general case $H_{1,r} = 0$ induces the
scleronomous constraint
\[ \bigl( e - V(q, t) \bigr)^2 = p^2c^2 + m_0^2c^4. \tag{21.36} \]
Again, q, p, t and e represent independent canonical variables, but only those combi-
nations of q, p, t and e have a physical meaning which satisfy (21.36).
The extended set of canonical equations (21.23) emerging from the extended
Hamiltonian (21.35) is
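\[ -\frac{\partial H_{1,r}}{\partial q_i} = \frac{dp_i}{ds} = -\frac{e - V(q, t)}{m_0c^2}\frac{\partial V}{\partial q_i}, \qquad \frac{\partial H_{1,r}}{\partial p_i} = \frac{dq_i}{ds} = \frac{p_i}{m_0}, \]
\[ \frac{\partial H_{1,r}}{\partial t} = \frac{de}{ds} = \frac{e - V(q, t)}{m_0c^2}\frac{\partial V}{\partial t}, \qquad -\frac{\partial H_{1,r}}{\partial e} = \frac{dt}{ds} = \frac{e - V(q, t)}{m_0c^2}. \]
We may express the canonical equations equivalently using the time t as the independent
variable, and eliminate the canonical variable e by means of the constraint (21.36),
\[ \begin{aligned} \frac{dp_i}{dt} &= \frac{dp_i}{ds}\frac{ds}{dt} = -\frac{e - V(q, t)}{m_0c^2}\frac{\partial V}{\partial q_i}\,\frac{m_0c^2}{e - V(q, t)} = -\frac{\partial V}{\partial q_i} = -\frac{\partial H_r}{\partial q_i}, \\ \frac{dq_i}{dt} &= \frac{dq_i}{ds}\frac{ds}{dt} = \frac{p_i}{m_0}\,\frac{m_0c^2}{e - V(q, t)} = \frac{p_ic^2}{e - V(q, t)} = \frac{p_ic^2}{\sqrt{p^2c^2 + m_0^2c^4}} = \frac{\partial H_r}{\partial p_i}, \\ \frac{de}{dt} &= \frac{de}{ds}\frac{ds}{dt} = \frac{e - V(q, t)}{m_0c^2}\frac{\partial V}{\partial t}\,\frac{m_0c^2}{e - V(q, t)} = \frac{\partial V}{\partial t} = \frac{\partial H_r}{\partial t}. \end{aligned} \]
These equations can be conceived to represent the canonical equations emerging from
the conventional Hamiltonian
\[ H_r(q, p, t) = e = \sqrt{p^2c^2 + m_0^2c^4} + V(q, t). \tag{21.37} \]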
We thus encounter the Lorentz invariant form of the conventional Hamiltonian for a
particle in an external potential V (q, t). The Hamiltonians H1,r from (21.35) and Hr
from (21.37) are physically equivalent, hence describe the same dynamics. On the
other hand, the extended Hamiltonian H1,r additionally determines the parameteriza-
tion of time t = t (s).
We finally note that the extended Hamiltonian (21.35) can be derived according
to (21.18), (21.19), and (21.21) as the Legendre transformed function of the extended
Lagrangian

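\[ L_1\left( q_j, \frac{dq_j}{ds}, t, \frac{dt}{ds} \right) = \frac{1}{2}m_0c^2 \left[ \frac{1}{c^2}\sum_i \left( \frac{dq_i}{ds} \right)^2 - \left( \frac{dt}{ds} \right)^2 - 1 \right] - V(q, t)\,\frac{dt}{ds}. \tag{21.38} \]
From (21.38), the correlations of the “velocities” $dq_i/ds$ and dt/ds with the canonical
momenta, $p_i$ and −e, evaluate to
\[ p_i = \frac{\partial L_1}{\partial\left(\frac{dq_i}{ds}\right)} = m_0\frac{dq_i}{ds}, \qquad e = -\frac{\partial L_1}{\partial\left(\frac{dt}{ds}\right)} = m_0c^2\frac{dt}{ds} + V(q, t). \tag{21.39} \]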
EXAMPLE
21.6 Relativistic “Harmonic Oscillator”
In this example, we discuss the relativistic motion of a particle in a quadratic external

potential. According to (21.35), the extended Hamiltonian of this system is given by

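\[ H_{1,r}(q, p, e) = \frac{1}{2m_0}\left[ p^2 - \left( \frac{e - \frac{1}{2}kq^2}{c} \right)^2 \right] + \frac{1}{2}m_0c^2. \tag{21.40} \]
The associated constraint $H_{1,r}(q, p, e) = 0$ yields a relation of the formally independent
canonical variables p, q, and e,
\[ p^2c^2 - \left( e - \frac{1}{2}kq^2 \right)^2 + m_0^2c^4 = 0. \tag{21.41} \]
Solving this relation for e, we obtain the corresponding conventional Hamiltonian $H_r$
as the right-hand side of the equation $e = H_r$,
\[ H_r(q, p) = \sqrt{p^2c^2 + m_0^2c^4} + \frac{1}{2}kq^2. \tag{21.42} \]
The extended set of canonical equations following from the extended Hamiltonian
(21.40) is
\[ \frac{\partial H_{1,r}}{\partial p_i} = \frac{dq_i}{ds} = \frac{p_i}{m_0}, \qquad -\frac{\partial H_{1,r}}{\partial q_i} = \frac{dp_i}{ds} = -\frac{e - \frac{1}{2}kq^2}{m_0c^2}\,kq_i, \]
\[ \frac{\partial H_{1,r}}{\partial t} = \frac{de}{ds} = 0, \qquad -\frac{\partial H_{1,r}}{\partial e} = \frac{dt}{ds} = \frac{e - \frac{1}{2}kq^2}{m_0c^2}. \]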
We may again express these equations equivalently by eliminating e according
to (21.41) and by replacing s with the laboratory time t as the system’s independent
variable
\[ \frac{dq_i}{dt} = \frac{dq_i}{ds}\frac{ds}{dt} = \frac{p_ic^2}{\sqrt{p^2c^2 + m_0^2c^4}} = \frac{\partial H_r}{\partial p_i}, \qquad -\frac{dp_i}{dt} = -\frac{dp_i}{ds}\frac{ds}{dt} = kq_i = \frac{\partial H_r}{\partial q_i}. \tag{21.43} \]
As expected, we encounter the canonical equations of the conventional Hamiltonian
(21.42). The pair of first-order equations can be merged into a single second-order
equation for q(t),
\[ \ddot{q}(t) + \frac{k}{m_0}\left( 1 - \frac{\dot{q}^2(t)}{c^2} \right)^{3/2} q(t) = 0. \tag{21.44} \]
For q̇(t) → c we thus have q̈(t) → 0. In agreement with the postulates of special
relativity, the speed of light, c, constitutes the absolute limit for the particle’s velocity,
q̇(t). The term in brackets forms a power of the relativistic correction factor, γ
\[
\gamma^{-2} = 1 - \left(\frac{\dot{q}}{c}\right)^2. \tag{21.45}
\]
The equation of motion (21.44) may thus be rewritten concisely as

\[
\ddot{q}(t) + \frac{k}{m_0 \gamma^3}\, q(t) = 0. \tag{21.46}
\]
In this form, the equation of motion appears to agree with its non-relativistic
counterpart, except for the occurrence of the relativistic correction factor $\gamma^3$ in front of
the rest mass term $m_0$. For $\dot{q} \ll c$, hence for $\gamma \to 1$, the equation of motion
of the ordinary harmonic oscillator is indeed recovered. Yet, the explicit form (21.44) of the
equation of motion shows that we are dealing with a non-linear system, with solutions
that no longer consist of harmonic oscillations. Thus, strictly speaking, a relativistic
oscillator can never be a harmonic oscillator.
In accelerator physics, the quantity $m_0 \gamma^3$ is referred to as the “longitudinal mass”
of an ion beam particle. In the laboratory system, the longitudinal oscillation
frequency of the relativistic motion of a particle in an accelerator or high energy storage
ring appears as if the particle’s mass were increased by a factor of $\gamma^3$.
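The behaviour discussed in this example can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes one degree of freedom, units with $m_0 = c = k = 1$, and a classical fourth-order Runge-Kutta scheme. The names `hamiltonian`, `rhs`, `rk4_step`, and `integrate`, as well as the step size and initial amplitude, are illustrative choices. It integrates the canonical equations (21.43) and monitors both the energy (21.42) and the largest velocity reached.

```python
import math

# Illustrative values (assumptions): units with m0 = c = k = 1.
M0, C, K = 1.0, 1.0, 1.0

def hamiltonian(q, p):
    """Conventional Hamiltonian H_r(q, p) of Eq. (21.42), one degree of freedom."""
    return math.sqrt(p * p * C * C + M0 * M0 * C ** 4) + 0.5 * K * q * q

def rhs(q, p):
    """Canonical equations (21.43): dq/dt = dH_r/dp, dp/dt = -dH_r/dq."""
    root = math.sqrt(p * p * C * C + M0 * M0 * C ** 4)
    return p * C * C / root, -K * q

def rk4_step(q, p, dt):
    """One classical Runge-Kutta step for the pair (q, p)."""
    k1q, k1p = rhs(q, p)
    k2q, k2p = rhs(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
    k3q, k3p = rhs(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
    k4q, k4p = rhs(q + dt * k3q, p + dt * k3p)
    return (q + dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6.0,
            p + dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0)

def integrate(q0, p0, dt=1e-3, steps=20000):
    """Return the final state and the largest |dq/dt| seen along the orbit."""
    q, p = q0, p0
    max_speed = 0.0
    for _ in range(steps):
        q, p = rk4_step(q, p, dt)
        max_speed = max(max_speed, abs(rhs(q, p)[0]))
    return q, p, max_speed

q0, p0 = 3.0, 0.0  # large amplitude: strongly relativistic motion
e_start = hamiltonian(q0, p0)
q_end, p_end, max_speed = integrate(q0, p0)
energy_drift = abs(hamiltonian(q_end, p_end) - e_start)
```

For this large amplitude the velocity comes close to $c$ without exceeding it, while the value of $H_r$ stays constant up to the integration error, in line with $e = H_r = \text{const}$.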
EXAMPLE
21.7 Extended Hamiltonian for a Relativistic Particle in an External Electromagnetic Field
The Hamiltonian counterpart H1 of the extended Lagrangian (21.14) from Exam-

ple 21.2 for a relativistic point particle in an external electromagnetic field is obtained
via the Legendre transformation prescription from (21.18). According to (21.19)
and (21.21), the canonical momenta pi and p0 are introduced by
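\[
\begin{aligned}
p_i &= \frac{\partial L_1}{\partial\left(\frac{dq_i}{ds}\right)} = m_0 \frac{dq_i}{ds} + \frac{\zeta}{c} A_i(q,t), \\
p_0 &= \frac{\partial L_1}{\partial\left(\frac{dq_0}{ds}\right)} = m_0 \frac{dq_0}{ds} - \frac{\zeta}{c} \phi(q,t).
\end{aligned}
\tag{21.47}
\]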
We notice that the kinetic momentum $p_{i,k} = m_0\, dq_i/ds$ differs from the canonical
momentum $p_i$ in the case of a non-vanishing external potential, $A_i \neq 0$. The condition for
the Legendre transform of $L_1$ to exist is that its $4 \times 4$ Hessian matrix with elements
$\partial^2 L_1 / [\partial(dq_\mu/ds)\,\partial(dq_\nu/ds)]$ must be non-singular, hence that the determinant of this
matrix does not vanish. For the extended Lagrangian (21.14) from Example 21.2, this
is actually the case as

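\[
\det\left( \frac{\partial^2 L_1}{\partial\left(\frac{dq_\mu}{ds}\right) \partial\left(\frac{dq_\nu}{ds}\right)} \right) = m_0^4 \neq 0.
\]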
This refutes claims occasionally found in the literature that the Hessian matrix associated
with an extended Lagrangian $L_1$ is generally singular, and that for this reason an
extended Hamiltonian $H_1$ generally cannot be obtained by a Legendre transformation
of an extended Lagrangian $L_1$.
With the Hessian condition being actually satisfied, the extended Hamiltonian H1
that follows as the Legendre transform (21.18) of L1 evaluates to
\[
H_1(q,p,t,e) = \frac{1}{2m_0}\left[ \left(p - \frac{\zeta}{c} A(q,t)\right)^2 - \left(\frac{e - \zeta\phi(q,t)}{c}\right)^2 \right] + \frac{1}{2} m_0 c^2. \tag{21.48}
\]
The constraint H1 = 0 then furnishes the usual relativistic energy-momentum relation

\[
\bigl(e - \zeta\phi(q,t)\bigr)^2 = c^2 \left(p - \frac{\zeta}{c} A(q,t)\right)^2 + m_0^2 c^4. \tag{21.49}
\]
The conventional Hamiltonian H (q, p, t) that describes the same dynamics is deter-
mined according to (21.20) as the particular function, whose value coincides with e.
Solving H1 = 0 from (21.48) for e, we directly find H as the left-hand side of the
equation H = e,

\[
H = \sqrt{c^2 \left(p - \frac{\zeta}{c} A(q,t)\right)^2 + m_0^2 c^4} + \zeta\phi(q,t) = e. \tag{21.50}
\]
The Hamiltonian Hnr (q, p, t) that describes the particle dynamics in the non-
relativistic limit is obtained from the Lorentz-invariant Hamiltonian (21.50) by ex-
panding the square root
\[
H_{nr} = \frac{1}{2m_0}\left(p - \frac{\zeta}{c} A(q,t)\right)^2 + \zeta\phi(q,t) + m_0 c^2. \tag{21.51}
\]
In contrast to the extended Lagrangian description, a direct way to transpose the rel-
ativistic extended Hamiltonian from (21.48) into the non-relativistic Hamiltonian Hnr
does not exist. We conclude that the Lagrangian approach is more appropriate if we
want to “translate” a given non-relativistic Hamilton–Lagrange system into the corre-
sponding Lorentz-invariant description.
In order to show that the extended Hamiltonian (21.48) and the well-known con-
ventional Hamiltonian (21.50) indeed yield the same dynamics, we now set up the
extended set of canonical equations (21.23) for the extended Hamiltonian (21.48)
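\[
\begin{aligned}
\frac{dp_i}{ds} &= \frac{\zeta}{m_0 c} \sum_{k=1}^{3} \left(p_k - \frac{\zeta}{c} A_k\right) \frac{\partial A_k}{\partial q_i} - \frac{\zeta}{m_0 c^2}\,(e - \zeta\phi)\, \frac{\partial\phi}{\partial q_i}, \\
\frac{de}{ds} &= -\frac{\zeta}{m_0 c} \sum_{k=1}^{3} \left(p_k - \frac{\zeta}{c} A_k\right) \frac{\partial A_k}{\partial t} + \frac{\zeta}{m_0 c^2}\,(e - \zeta\phi)\, \frac{\partial\phi}{\partial t}, \\
\frac{dq_i}{ds} &= \frac{1}{m_0} \left(p_i - \frac{\zeta}{c} A_i\right), \\
\frac{dt}{ds} &= \frac{1}{m_0 c^2}\,(e - \zeta\phi).
\end{aligned}
\tag{21.52}
\]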
From the last equation, we deduce the derivative of the inverse function s = s(t) and
insert the constraint from (21.49)
\[
\frac{ds}{dt} = \frac{m_0 c^2}{e - \zeta\phi} = \frac{m_0 c^2}{\sqrt{c^2 \left(p - \frac{\zeta}{c} A(q,t)\right)^2 + m_0^2 c^4}}. \tag{21.53}
\]
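The canonical equations (21.52) can now be expressed equivalently with the time $t$ as
the independent variable
\[
\begin{aligned}
-\frac{dp_i}{dt} &= -\frac{dp_i}{ds}\frac{ds}{dt}
 = -\frac{\zeta c}{\sqrt{c^2 \left(p - \frac{\zeta}{c} A(q,t)\right)^2 + m_0^2 c^4}} \sum_k \left(p_k - \frac{\zeta}{c} A_k\right) \frac{\partial A_k}{\partial q_i} + \zeta \frac{\partial\phi}{\partial q_i}, \\
\frac{de}{dt} &= \frac{de}{ds}\frac{ds}{dt}
 = -\frac{\zeta c}{\sqrt{c^2 \left(p - \frac{\zeta}{c} A(q,t)\right)^2 + m_0^2 c^4}} \sum_k \left(p_k - \frac{\zeta}{c} A_k\right) \frac{\partial A_k}{\partial t} + \zeta \frac{\partial\phi}{\partial t}, \\
\frac{dq_i}{dt} &= \frac{dq_i}{ds}\frac{ds}{dt}
 = \frac{c^2}{\sqrt{c^2 \left(p - \frac{\zeta}{c} A(q,t)\right)^2 + m_0^2 c^4}} \left(p_i - \frac{\zeta}{c} A_i\right).
\end{aligned}
\tag{21.54}
\]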
The right-hand sides of (21.54) are exactly the partial derivatives ∂H /∂qi , ∂H /∂t , and
∂H /∂pi of the Hamiltonian (21.50)—and hence its canonical equations, which was to
be shown.
The physical meaning of $dt/ds$ is worked out by casting it into the equivalent
form
\[
\frac{dt}{ds} = \sqrt{1 + \frac{\left(p - \frac{\zeta}{c} A(q,t)\right)^2}{m_0^2 c^2}} = \sqrt{1 + \left(\frac{p_k(s)}{m_0 c}\right)^2} = \gamma(s),
\]
with $p_k(s)$ the instantaneous kinetic momentum of the particle. The dimensionless
quantity $dt/ds$ thus represents the instantaneous value of the relativistic scale
factor $\gamma$.
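The two central statements of this example, namely that the constraint $H_1 = 0$ holds whenever $e$ takes the value of the conventional Hamiltonian (21.50), and that $dt/ds$ equals $\gamma$, can be verified for arbitrary numbers. The sketch below is only an illustration: the sample values for $m_0$, $c$, $\zeta$, $p$, $A$, $\phi$ and the function names are assumptions chosen for the check.

```python
import math

def extended_hamiltonian(p, A, phi, e, m0, c, zeta):
    """Extended Hamiltonian H_1 of Eq. (21.48) for 3-vectors p and A."""
    kin_sq = sum((pi - zeta * Ai / c) ** 2 for pi, Ai in zip(p, A))
    return (kin_sq - ((e - zeta * phi) / c) ** 2) / (2.0 * m0) + 0.5 * m0 * c * c

def conventional_hamiltonian(p, A, phi, m0, c, zeta):
    """Conventional Hamiltonian H of Eq. (21.50)."""
    kin_sq = sum((pi - zeta * Ai / c) ** 2 for pi, Ai in zip(p, A))
    return math.sqrt(c * c * kin_sq + m0 * m0 * c ** 4) + zeta * phi

# Arbitrary sample values (assumptions, any consistent units)
m0, c, zeta = 2.0, 3.0, 0.5
p, A, phi = (1.0, -2.0, 0.5), (0.3, 0.1, -0.4), 0.7

e = conventional_hamiltonian(p, A, phi, m0, c, zeta)   # e = H, Eq. (21.50)
h1 = extended_hamiltonian(p, A, phi, e, m0, c, zeta)   # should vanish

dt_ds = (e - zeta * phi) / (m0 * c * c)                # last line of Eq. (21.52)
kin = [pi - zeta * Ai / c for pi, Ai in zip(p, A)]     # kinetic momentum
v = [c * c * k / (e - zeta * phi) for k in kin]        # dq/dt from Eq. (21.54)
gamma = 1.0 / math.sqrt(1.0 - sum(vi * vi for vi in v) / (c * c))
```

Here `h1` vanishes up to rounding, and `dt_ds` agrees with the value of `gamma` computed independently from the laboratory velocity.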
21.3 Extended Canonical Transformations
The conventional theory of canonical transformations is built upon the conventional

action integral from (21.1). In this theory, the Newtonian absolute time t plays the role
of the common independent variable of both original and destination system. Similarly
to the conventional theory, we may build the extended theory of canonical equations
on the basis of the extended action integral from (21.2). With the time $t = q_0/c$ and the
configuration space variables $q_i$ treated on equal footing, we are able to correlate
two Hamiltonian systems, $H$ and $H'$, with different time scales, $t(s)$ and $T(s)$, hence
to canonically map the system’s time $t$ and its conjugate quantity $e$ in addition to the
mapping of generalized coordinates $q$ and momenta $p$. The system evolution parameter
$s$ is then the common independent variable of both systems, $H$ and $H'$. A general
mapping of all dependent variables may be formally expressed as
Qμ = Qμ (qν , pν ), Pμ = Pμ (qν , pν ), μ = 0, . . . , n. (21.55)
Completely parallel to the conventional theory, the subgroup of transformations (21.55)

that preserve the action principle δS1 = 0 of the system is referred to as “canonical.”
The action integral (21.2) may be expressed equivalently in terms of an extended
Hamiltonian by means of the Legendre transformation (21.18). We thus get the fol-
lowing condition for a transformation (21.55) to be canonical
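\[
\delta \int_{s_a}^{s_b} \left[ \sum_{\mu=0}^{n} p_\mu \frac{dq_\mu}{ds} - H_1(q_\nu, p_\nu) \right] ds
= \delta \int_{s_a}^{s_b} \left[ \sum_{\mu=0}^{n} P_\mu \frac{dQ_\mu}{ds} - H_1'(Q_\nu, P_\nu) \right] ds. \tag{21.56}
\]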
As we are operating with functionals, the condition (21.56) holds if the integrands dif-
fer at most by the derivative dF1 /ds of an arbitrary differentiable function F1 (qν , Qν )

\[
\sum_{\mu=0}^{n} p_\mu \frac{dq_\mu}{ds} - H_1 = \sum_{\mu=0}^{n} P_\mu \frac{dQ_\mu}{ds} - H_1' + \frac{dF_1}{ds}. \tag{21.57}
\]
We restrict ourselves to functions F1 (qν , Qν ) of the old and the new extended config-
uration space variables, hence to a function of those variables, whose derivatives are
contained in (21.57). Calculating the s-derivative of F1 ,
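\[
\frac{dF_1}{ds} = \sum_{\mu=0}^{n} \left[ \frac{\partial F_1}{\partial q_\mu} \frac{dq_\mu}{ds} + \frac{\partial F_1}{\partial Q_\mu} \frac{dQ_\mu}{ds} \right], \tag{21.58}
\]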
we then get unique transformation rules by comparing the coefficients of (21.58) with
those of (21.57)
\[
p_\mu = \frac{\partial F_1}{\partial q_\mu}, \qquad P_\mu = -\frac{\partial F_1}{\partial Q_\mu}, \qquad H_1' = H_1. \tag{21.59}
\]
F1 is referred to as the extended generating function of the—now generalized—

canonical transformation. The extended Hamiltonian has the important property to be
conserved under these transformations. Corresponding to the extended set of canon-
ical equations, the additional transformation rule is given for the index μ = 0. This
transformation rule may be expressed equivalently in terms of t (s), e(s), and T (s),
E(s) as
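\[
e = -\frac{\partial F_1}{\partial t}, \qquad E = \frac{\partial F_1}{\partial T}, \tag{21.60}
\]
with $E$, correspondingly to (21.20), the value of the transformed Hamiltonian $H'$
\[
P_0(s) = -\frac{E(s)}{c}, \qquad E(s) \stackrel{\not\equiv}{=} H'\bigl(Q(s), P(s), T(s)\bigr). \tag{21.61}
\]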
The transformed Hamiltonian $H'$ is finally obtained from the general correlation of
conventional and extended Hamiltonians from (21.25), and the transformation rule
$H_1' = H_1$ for the extended Hamiltonian from (21.59)
\[
\bigl[ H'(Q,P,T) - E \bigr] \frac{dT}{ds} = \bigl[ H(q,p,t) - e \bigr] \frac{dt}{ds}.
\]
Eliminating the evolution parameter s, we arrive at the following two equivalent trans-
formation rules for the conventional Hamiltonians under extended canonical transfor-
mations
\[
\begin{aligned}
\bigl[ H'(Q,P,T) - E \bigr] \frac{\partial T}{\partial t} &= H(q,p,t) - e, \\
\bigl[ H(q,p,t) - e \bigr] \frac{\partial t}{\partial T} &= H'(Q,P,T) - E.
\end{aligned}
\tag{21.62}
\]
The transformation rules (21.62) are generalizations of the rule for conventional
canonical transformations as now cases with $T \neq t$ are included. We will see at the
end of this section that the rules (21.62) merge for the particular case $T = t$ into the
corresponding rules (19.10), (19.12) of the conventional canonical transformation theory.
By means of the Legendre transformation

\[
F_2(q_\nu, P_\nu) = F_1(q_\nu, Q_\nu) + \sum_{\mu=0}^{n} Q_\mu P_\mu, \qquad P_\mu = -\frac{\partial F_1}{\partial Q_\mu}, \tag{21.63}
\]
we may express the extended generating function of a generalized canonical trans-

formation equivalently as a function of the original extended configuration space vari-
ables qν and the extended set of transformed canonical momenta Pν . As, by definition,
the functions F1 and F2 agree in their dependence on the qμ , so do the corresponding
transformation rules
\[
\frac{\partial F_1}{\partial q_\mu} = \frac{\partial F_2}{\partial q_\mu} = p_\mu.
\]
This means that none of the $q_\mu$ takes part in the transformation defined by (21.63). Hence,
for the Legendre transformation, we may regard the functional dependence of the
generating functions to be reduced to $F_1 = F_1(Q_\nu)$ and $F_2 = F_2(P_\nu)$. The new
transformation rule pertaining to $F_2$ thus follows from the $P_\nu$-dependence of $F_2$
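\[
\begin{aligned}
\frac{\partial F_2}{\partial P_\nu}
&= \sum_{\mu=0}^{n} \left[ \frac{\partial F_1}{\partial Q_\mu} \frac{\partial Q_\mu}{\partial P_\nu} + P_\mu \frac{\partial Q_\mu}{\partial P_\nu} + Q_\mu \frac{\partial P_\mu}{\partial P_\nu} \right] \\
&= \sum_{\mu=0}^{n} \left[ -P_\mu \frac{\partial Q_\mu}{\partial P_\nu} + P_\mu \frac{\partial Q_\mu}{\partial P_\nu} + Q_\mu \delta_{\mu\nu} \right] \\
&= Q_\nu.
\end{aligned}
\]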
The new set of transformation rules, which is, of course, equivalent to the previous set
from (21.59), is thus
\[
p_\mu = \frac{\partial F_2}{\partial q_\mu}, \qquad Q_\mu = \frac{\partial F_2}{\partial P_\mu}, \qquad H_1' = H_1. \tag{21.64}
\]
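Expressed in terms of the variables $q, p, t, e$ and $Q, P, T, E$, the new set of coordinate
transformation rules takes on the more elaborate form
\[
p_i = \frac{\partial F_2}{\partial q_i}, \qquad Q_i = \frac{\partial F_2}{\partial P_i}, \qquad e = -\frac{\partial F_2}{\partial t}, \qquad T = -\frac{\partial F_2}{\partial E}. \tag{21.65}
\]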
Similarly to the conventional theory of canonical transformations, there are two more
possibilities to define a generating function of an extended canonical transformation.
By means of the Legendre transformation

\[
F_3(p_\nu, Q_\nu) = F_1(q_\nu, Q_\nu) - \sum_{\mu=0}^{n} q_\mu p_\mu, \qquad p_\mu = \frac{\partial F_1}{\partial q_\mu},
\]
we find in the same manner as above the transformation rules
\[
q_\mu = -\frac{\partial F_3}{\partial p_\mu}, \qquad P_\mu = -\frac{\partial F_3}{\partial Q_\mu}, \qquad H_1' = H_1. \tag{21.66}
\]
Finally, applying the Legendre transformation, defined by

\[
F_4(p_\nu, P_\nu) = F_3(p_\nu, Q_\nu) + \sum_{\mu=0}^{n} Q_\mu P_\mu, \qquad P_\mu = -\frac{\partial F_3}{\partial Q_\mu},
\]
the following equivalent version of transformation rules emerges
\[
q_\mu = -\frac{\partial F_4}{\partial p_\mu}, \qquad Q_\mu = \frac{\partial F_4}{\partial P_\mu}, \qquad H_1' = H_1. \tag{21.67}
\]
Calculating the second derivatives of the generating functions, we conclude that the
following correlations for the derivatives of the general mapping from (21.55) must
hold for the entire set of extended phase-space variables,
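\[
\begin{aligned}
\frac{\partial Q_\mu}{\partial q_\nu} &= \frac{\partial p_\nu}{\partial P_\mu}, &
\frac{\partial Q_\mu}{\partial p_\nu} &= -\frac{\partial q_\nu}{\partial P_\mu}, \\
\frac{\partial P_\mu}{\partial q_\nu} &= -\frac{\partial p_\nu}{\partial Q_\mu}, &
\frac{\partial P_\mu}{\partial p_\nu} &= \frac{\partial q_\nu}{\partial Q_\mu}.
\end{aligned}
\tag{21.68}
\]
If and only if these conditions are fulfilled for all $\mu, \nu = 0, \dots, n$, the extended
coordinate transformation (21.55) is canonical and preserves the form of the extended
set of canonical equations (21.23). Otherwise, we are dealing with a general,
non-canonical coordinate transformation that does not preserve the form of the canonical
equations.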
EXAMPLE
21.8 Identical Canonical Transformation
As the first example of an extended generating function F2 (qj , Pj , t, E), we consider

the generating function of the identical transformation

\[
F_2(q_j, P_j, t, E) = \sum_{\mu=0}^{n} q_\mu P_\mu = \sum_{i=1}^{n} q_i P_i - tE.
\]
The particular transformation rules for this case follow from their general form, given
by (21.65),
\[
p_i = P_i, \qquad Q_i = q_i, \qquad e = E, \qquad T = t, \qquad H_1' = H_1.
\]
According to (21.62), the transformation rule for the extended Hamiltonians,
$H_1' = H_1$, now yields for the conventional Hamiltonians, $H$ and $H'$,
\[
\bigl( H' - E \bigr) \frac{\partial T}{\partial t} = H - e,
\]
which means, after inserting the coordinate transformation rules,
\[
H'(Q_j, P_j, T) = H(q_j, p_j, t).
\]
The existence of a neutral element is a precondition for the set of extended canonical
transformations of a Hamiltonian system H (pj , qj , t) to form a group.
EXAMPLE
21.9 Identical Time Transformation, Conventional Canonical Transformations
The connection of the extended canonical transformation theory with the conventional
one is furnished by the particular extended generating function
F2 (qj , Pj , t, E) = F2 (qj , Pj , t) − tE, (21.69)
with F2 (qj , Pj , t) denoting a conventional generating function. According to (21.65),

the coordinate transformation rules following from (21.69) are
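\[
p_i = \frac{\partial F_2}{\partial q_i}, \qquad Q_i = \frac{\partial F_2}{\partial P_i}, \qquad e = -\frac{\partial F_2}{\partial t} + E, \qquad T = t.
\]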
Together with the general transformation rule (21.62) for conventional Hamiltoni-
ans, we find the rule for Hamiltonians under conventional canonical transformations
from (19.12),
\[
H'(Q_j, P_j, t) = H(q_j, p_j, t) + E - e = H(q_j, p_j, t) + \frac{\partial F_2}{\partial t}.
\]
Canonical transformations that are defined by extended generating functions of the
form of (21.69) leave the time variable unchanged and thus define the subgroup of
conventional canonical transformations within the general group of extended canonical
transformations. In the present example, the time $t$ also forms a common independent
variable of both the original and the transformed system, just as presupposed in the
case of a conventional canonical transformation. Corresponding to the trivial extended
Hamiltonian from (21.26), we may refer to (21.69) as the trivial extended generating
function.
EXAMPLE
21.10 Extended Point Transformations
We consider the extended canonical transformation defined by an extended generating

function that is linear in the Pν ,

\[
F_2(q_\nu, P_\nu) = \sum_{\alpha=0}^{n} P_\alpha f_\alpha(q_\nu) = \sum_{i=1}^{n} P_i f_i(q_j, t) - E f_0(q_j, t)/c,
\]
with the fα (qν ) denoting arbitrary differentiable functions. The transformation

rules (21.64) for this F2 follow as

\[
Q_\mu = f_\mu(q_\nu), \qquad p_\mu = \sum_{\alpha=0}^{n} P_\alpha \frac{\partial f_\alpha(q_\nu)}{\partial q_\mu}, \qquad H_1'(Q_\nu, P_\nu) = H_1(q_\nu, p_\nu).
\]
The new configuration space coordinates Qi , i = 1, . . . , n and the new time T ≡ Q0 /c

thus emerge as functions of the original configuration space coordinates qi and time
t ≡ q0 /c, without any dependence on the canonical momenta and energy. Similar to
the case of a conventional canonical transformation (see Example 19.2), mappings of
this type are referred to as point transformations.
For the particular case f0 = f0 (t), we get the rule T = f0 (t)/c. The transformed
time T then only depends on the original time, t , and not on the configuration space
variables, qi . For multi-particle systems, T then retains in the transformed system the
property of the original system’s time t to be common to all particles.
EXAMPLE
21.11 Time-Energy Transformations
The general form of an extended generating function F3 that defines an extended

canonical transformation that leaves position and momentum coordinates invariant,
but transforms solely time t and energy e, is given by

\[
F_3(p_j, Q_j, e, T) = -\sum_{i=1}^{n} p_i Q_i + f(e, T).
\]
Herein, f (e, T ) denotes an arbitrary differentiable function of the energy e of the

original system, and of the time T of the transformed system. According to (21.66),
the particular transformation rules are
\[
P_i = p_i, \qquad q_i = Q_i, \qquad E = \frac{\partial f}{\partial T}, \qquad t = \frac{\partial f}{\partial e}, \qquad H_1' = H_1. \tag{21.70}
\]
The transformed conventional Hamiltonian $H'$ follows from (21.62) as
\[
H'(q_j, p_j, T) = \bigl[ H(q_j, p_j, t) - e \bigr] \frac{\partial^2 f}{\partial e\, \partial T} + \frac{\partial f}{\partial T}.
\]
The special case $f(e, T) = eT$ again yields the identical transformation. We observe
that the transformed Hamiltonian $H'$ emerges from the original Hamiltonian $H$ by
multiplication with the factor $\partial^2 f / \partial e\, \partial T$. This contrasts with the case of conventional
canonical transformations from Chap. 19, for which $H'$ emerges from $H$ by addition
of the partial time derivative of a conventional generating function $F_{1,2,3,4}$.
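As a small illustration of the rules (21.70), the following sketch evaluates them by central finite differences for one hypothetical choice, $f(e, T) = e\,(T + T^3/3)$. This function, the step width `h`, and all names in the code are assumptions made for the example only. For this choice the old time is $t = T + T^3/3$, and the transformed Hamiltonian is $H' = (1 + T^2)\, H$.

```python
def d_dx(fun, x, h=1e-4):
    """Central finite-difference derivative of a one-argument function."""
    return (fun(x + h) - fun(x - h)) / (2.0 * h)

def f(e, T):
    """Hypothetical time-energy function f(e, T) = e * (T + T**3 / 3)."""
    return e * (T + T ** 3 / 3.0)

def old_time(e, T):
    """t = df/de, from Eq. (21.70)."""
    return d_dx(lambda ee: f(ee, T), e)

def new_energy(e, T):
    """E = df/dT, from Eq. (21.70)."""
    return d_dx(lambda x: f(e, x), T)

def transformed_hamiltonian(H_value, e, T):
    """H' = (H - e) * d2f/(de dT) + df/dT, the rule following from (21.62)."""
    d2f_dedT = d_dx(lambda ee: d_dx(lambda x: f(ee, x), T), e)
    return (H_value - e) * d2f_dedT + new_energy(e, T)

e, T = 1.7, 0.8
t = old_time(e, T)
H_new_on_path = transformed_hamiltonian(e, e, T)      # H takes the value e
H_new_general = transformed_hamiltonian(2.0, e, T)    # H treated as a function value
```

Along the system path, where $H$ takes the value $e$, the result coincides with $E = e\,(1 + T^2)$; for a general value of $H$ it equals $(1 + T^2) H$, which shows the multiplicative character of the rule.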
EXAMPLE
21.12 Liouville’s Theorem in the Extended Hamilton Description
For extended canonical transformations, the pertaining generalized form of Liou-

ville’s theorem applies analogously to the conventional Liouville theorem from Ex-
ample 19.6,
\[
dQ_0 \dots dQ_n\, dP_0 \dots dP_n \overset{\text{gen. can. transf.}}{=} dq_0 \dots dq_n\, dp_0 \dots dp_n.
\]
This may be written equivalently as
\[
dQ_1 \dots dQ_n\, dP_1 \dots dP_n\, dT\, dE \overset{\text{gen. can. transf.}}{=} dq_1 \dots dq_n\, dp_1 \dots dp_n\, dt\, de.
\]
The generalized form of Liouville’s theorem thus states that the extended volume form
$dV_1 = dq_1 \dots dq_n\, dp_1 \dots dp_n\, dt\, de$ is conserved under extended canonical
transformations, hence that the determinant $D$ that is associated with the Jacobi matrix of the
transformation is always unity. As the number of canonical variables remains even
in the extended description, we may again represent $D$ by
\[
D = \frac{\partial(Q_0, \dots, Q_n)}{\partial(q_0, \dots, q_n)} \left[ \frac{\partial(p_0, \dots, p_n)}{\partial(P_0, \dots, P_n)} \right]^{-1}.
\]
If the transformation rules can be derived from a generating function of type
F2 (qμ , Pμ ), then we are dealing with the particular case of a canonical coordinate
transformation. Inserting the equations for Qν and pν yields
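\[
D = \left| \frac{\partial^2 F_2}{\partial q_\mu\, \partial P_\nu} \right| \left| \frac{\partial^2 F_2}{\partial P_\mu\, \partial q_\nu} \right|^{-1} = 1.
\]
This equation holds as (i) the partial derivatives may be interchanged, and (ii) due
to the fact that a matrix and its transpose have the same determinant. We will see in Exam-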
ple 21.18 that under generalized canonical transformations—hence transformations
that also map the time scales of original and destination systems—only the gener-
alized version of Liouville’s theorem applies, and not the conventional form from
Example 19.6.
EXAMPLE
21.13 Extended Poisson Brackets
In Example 19.7, we proved the invariance of conventional Poisson brackets [F, G]

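under conventional canonical transformations. We will show that under generalized
canonical transformations, the invariance property holds for extended Poisson
brackets, $[F, G]_e$. In analogy to conventional brackets, the extended Poisson brackets are
defined by
\[
[F, G]_e = \sum_{\mu=0}^{n} \left( \frac{\partial F}{\partial q_\mu} \frac{\partial G}{\partial p_\mu} - \frac{\partial F}{\partial p_\mu} \frac{\partial G}{\partial q_\mu} \right)
= [F, G] - \frac{\partial F}{\partial t} \frac{\partial G}{\partial e} + \frac{\partial F}{\partial e} \frac{\partial G}{\partial t}. \tag{21.71}
\]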
Herein, F = F (qμ , pμ ) = F (qi , pi , t, e) and G = G(qμ , pμ ) = G(qi , pi , t, e) denote

arbitrary differentiable functions. Due to the complete analogy of conventional and
extended canonical transformation formalisms, the proof of the invariance of extended
Poisson bracket under extended canonical transformations formally coincides with
that of Example 19.7. We thus find that under extended canonical transformations
\[
[F, G]_e^{\,q,p,t,e} = [F, G]_e^{\,Q,P,T,E}. \tag{21.72}
\]
We will show in Example 21.18 that the Lorentz transformation can be conceived
as a particular extended canonical transformation. Consequently, extended Poisson
brackets are always Lorentz invariant.
The total s derivative of a function f = f (qi , pi , t) is
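\[
\begin{aligned}
\frac{df}{ds} &= \sum_{i=1}^{n} \left( \frac{\partial f}{\partial q_i} \frac{dq_i}{ds} + \frac{\partial f}{\partial p_i} \frac{dp_i}{ds} \right) + \frac{\partial f}{\partial t} \frac{dt}{ds} \\
&= \sum_{\mu=0}^{n} \left( \frac{\partial f}{\partial q_\mu} \frac{dq_\mu}{ds} + \frac{\partial f}{\partial p_\mu} \frac{dp_\mu}{ds} \right) \\
&= \sum_{\mu=0}^{n} \left( \frac{\partial f}{\partial q_\mu} \frac{\partial H_1}{\partial p_\mu} - \frac{\partial f}{\partial p_\mu} \frac{\partial H_1}{\partial q_\mu} \right) = [f, H_1]_e.
\end{aligned}
\tag{21.73}
\]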
In the context of the extended Hamilton formalism, the extended Poisson bracket of an
explicitly time-dependent function f (qi , pi , t) with the extended Hamiltonian H1 thus
yields directly the total derivative of f with respect to the independent variable, s. This
agrees formally with the conventional Poisson bracket of a conventional Hamiltonian
H with a function f (qi , pi ) that does not explicitly depend on time t . In that case, we
obtain the total time derivative of f .
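The extended bracket (21.71) is easy to evaluate numerically. The sketch below is an illustration under stated assumptions: one degree of freedom, central finite differences for all partial derivatives, and the extended Hamiltonian taken as $H_1 = H - e$, i.e. (21.25) with $dt/ds = 1$. The sample functions `H` and `f` and the evaluation point are arbitrary choices. In this setting $[f, H_1]_e$ must reproduce $[f, H] + \partial f/\partial t$.

```python
import math

def partial(fun, point, index, h=1e-5):
    """Central finite-difference approximation of one partial derivative."""
    up, down = list(point), list(point)
    up[index] += h
    down[index] -= h
    return (fun(*up) - fun(*down)) / (2.0 * h)

def extended_bracket(F, G, point):
    """Extended Poisson bracket (21.71) at point = (q, p, t, e), n = 1."""
    dF = [partial(F, point, i) for i in range(4)]
    dG = [partial(G, point, i) for i in range(4)]
    conventional = dF[0] * dG[1] - dF[1] * dG[0]
    return conventional - dF[2] * dG[3] + dF[3] * dG[2]

def H(q, p, t, e):
    """Sample time-dependent oscillator Hamiltonian (assumption)."""
    return 0.5 * p * p + 0.5 * (1.0 + 0.5 * math.sin(t)) * q * q

def H1(q, p, t, e):
    """Extended Hamiltonian for s = t: H_1 = H - e."""
    return H(q, p, t, e) - e

def f(q, p, t, e):
    """Sample explicitly time-dependent function f(q, p, t) (assumption)."""
    return q * p * math.cos(t)

point = (0.4, -1.1, 0.7, 2.0)
q, p, t, e = point
df_ds = extended_bracket(f, H1, point)

# Analytic value of [f, H] + df/dt for comparison
expected = (p * math.cos(t)) * p \
    - (q * math.cos(t)) * (1.0 + 0.5 * math.sin(t)) * q \
    - q * p * math.sin(t)

bracket_q_p = extended_bracket(lambda q, p, t, e: q, lambda q, p, t, e: p, point)
bracket_t_e = extended_bracket(lambda q, p, t, e: t, lambda q, p, t, e: e, point)
```

The fundamental brackets come out as $[q, p]_e = 1$ and $[t, e]_e = -1$; the sign of the latter reflects $p_0 = -e/c$.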
EXAMPLE
21.14 Canonical Quantization in the Extended Hamilton Formalism
By canonical quantization, we denote a formalism to derive the quantum mechanical
equations of motion for a complex wave function $\psi(q_\mu)$ on the basis of a corresponding
classical system that is described by an extended Hamiltonian $H_1$. Explicitly, this
means replacing the classical momenta $p_\nu$ with momentum operators $\hat{p}_\nu$, along with
the replacement of the classical extended Poisson brackets $[\,\cdot\,,\cdot\,]_e$ by quantum mechanical
commutators $\{\,\cdot\,,\cdot\,\}_-$. The quantum mechanical commutator $\{\,\cdot\,,\cdot\,\}_-$ of two operators
$\hat{A}$ and $\hat{B}$ is defined as
\[
\{\hat{A}, \hat{B}\}_- = \hat{A}\hat{B} - \hat{B}\hat{A}. \tag{21.74}
\]
Analogously to the fundamental Poisson bracket (19.29) from Example 19.7, we then
find for the extended set of fundamental commutators
\[
\{\hat{q}_\mu, \hat{q}_\nu\}_- = 0, \qquad \{\hat{p}_\mu, \hat{p}_\nu\}_- = 0, \qquad \{\hat{q}_\mu, \hat{p}_\nu\}_- = i\hbar\, \delta_{\mu\nu}, \qquad \mu, \nu = 0, \dots, n. \tag{21.75}
\]
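Equations (21.75) are obviously satisfied if
\[
\hat{q}_\mu = q_\mu, \qquad \hat{p}_\nu = -i\hbar \frac{\partial}{\partial q_\nu}. \tag{21.76}
\]
For, if we let the commutator $\{\hat{q}_\mu, \hat{p}_\nu\}_-$ of the operators $\hat{q}_\mu$ and $\hat{p}_\nu$ act on an explicitly
time-dependent function $\psi(q_\lambda) \equiv \psi(q_1, \dots, q_n, t)$, we get
\[
\{\hat{q}_\mu, \hat{p}_\nu\}_-\, \psi(q_\lambda) = i\hbar \left\{ \frac{\partial}{\partial q_\nu}, q_\mu \right\}_- \psi
= i\hbar \left[ \frac{\partial}{\partial q_\nu} \bigl( q_\mu \psi \bigr) - q_\mu \frac{\partial \psi}{\partial q_\nu} \right]
= i\hbar\, \delta_{\mu\nu}\, \psi(q_\lambda). \tag{21.77}
\]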
Because of $q_0 \equiv ct$, the momentum operator for the index $\nu = 0$, i.e. $\hat{p}_0 \equiv -\hat{e}/c$, has
the alternative representation
\[
\hat{e} = i\hbar \frac{\partial}{\partial t}. \tag{21.78}
\]
Parallel to the momentum operators $\hat{p}_i = -i\hbar\, \partial/\partial q_i$ that are conjugate to the
configuration space variables $q_i$, one thus finds in the extended description the operator $\hat{e}$
for the system’s instantaneous energy content as the conjugate quantity of the time
variable $t$.
Furthermore, in the extended Hamiltonian formalism of canonical quantization,
the extended Hamiltonian $H_1 = 0$ from (21.25) is replaced by the extended Hamilton
operator $\hat{H}_1 = \hat{0}$
\[
\hat{H}_1 = \bigl( \hat{H} - \hat{e} \bigr) \frac{dt}{ds} = \left( \hat{H} - i\hbar \frac{\partial}{\partial t} \right) \frac{dt}{ds} = \hat{0}, \tag{21.79}
\]
with $\hat{H}$ denoting the related conventional Hamilton operator of the given quantum
mechanical problem. As long as the operator equation $\hat{H}_1 = \hat{0}$ is not submitted to
an extended canonical transformation, we are allowed to identify the time $t$ with the
system’s evolution parameter $s$, ($t \equiv s$). We thereby find an operator equation that is
no longer Lorentz invariant
\[
\hat{H} - i\hbar \frac{\partial}{\partial t} = \hat{0}. \tag{21.80}
\]
If we let these operators act on an explicitly time-dependent function $\psi(q_i, t)$, we get
the following partial differential equation
\[
\hat{H}\, \psi(q_i, t) = i\hbar \frac{\partial \psi(q_i, t)}{\partial t}. \tag{21.81}
\]
In the realm of quantum mechanics, this equation is referred to as the Schrödinger
equation.
EXAMPLE
21.15 Regularization of the Kepler System
As a first example of an extended canonical transformation that includes a mapping of
the time scale of the given dynamical system, we discuss the generalized formulation
of L. Euler’s regularization of a Kepler system. This technique is commonly referred to
as the “Kustaanheimo-Stiefel (KS)” transformation, and sometimes also as the “Hopf”
transformation. It has the properties (i) to ensure the regularization of the equations
of motion, (ii) to permit a uniform treatment of all three types of Keplerian motion, and
(iii) to transform the equations of the two-body problem into a harmonic oscillator
form. We formulate this transformation here as a generalized canonical transformation,
where in addition to a transformation of spatial and momentum coordinates the
physical time $t$ of the original Kepler system is mapped into a new time $T$ that
parameterizes the motion within the transformed system. We write the Hamiltonian of the
Kepler system in normalized form
\[
H(q_j, p_j) = \frac{1}{2} \left( p_1^2 + p_2^2 + p_3^2 \right) - \frac{K}{r}, \qquad K = G\,(m_1 + m_2), \qquad r^2 = q_1^2 + q_2^2 + q_3^2. \tag{21.82}
\]
Herein, G denotes the gravitational constant, m1 , m2 the masses of the interacting

bodies, and r their distance in the 3-dimensional configuration space. As H does not
depend on time explicitly, we have ∂H /∂t = de/dt = 0, and hence
\[
e = \frac{1}{2} \left( p_1^2 + p_2^2 + p_3^2 \right) - \frac{K}{r} = \text{const.} \tag{21.83}
\]
Obviously, in this description the system has a singularity for r → 0. We will now
show that the Kepler system (21.82) can be canonically transformed into another
Hamiltonian system that does not exhibit any singularities. This canonical transfor-
mation can be defined in terms of a generating function of type F3 ,
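\[
\begin{aligned}
F_3(Q_j, p_j, T, e) = {} & -\frac{1}{2}\, p_1 \left( Q_1^2 - Q_2^2 - Q_3^2 + Q_4^2 \right) - p_2 \left( Q_1 Q_2 - Q_3 Q_4 \right) \\
& - p_3 \left( Q_1 Q_3 + Q_2 Q_4 \right) + e \int_0^T \xi(\tau)\, d\tau.
\end{aligned}
\tag{21.84}
\]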
As the generating function is linear in the pi and in e, the KS-transformation consti-

tutes an extended canonical point transformation. It depends on an as yet undetermined
time function ξ(T ) and has the particular feature that the transformed system has four
degrees of freedom in place of three of the original system. We will see that this
particular correlation of both systems gives rise to some freedom in fixing the initial
conditions of the transformed system. According to the transformation rules (21.66),
the old spatial coordinates qi are expressed in terms of the new ones, Qi , as
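\[
\begin{aligned}
q_1 &= -\frac{\partial F_3}{\partial p_1} = \frac{1}{2} \left( Q_1^2 - Q_2^2 - Q_3^2 + Q_4^2 \right), \\
q_2 &= -\frac{\partial F_3}{\partial p_2} = Q_1 Q_2 - Q_3 Q_4, \\
q_3 &= -\frac{\partial F_3}{\partial p_3} = Q_1 Q_3 + Q_2 Q_4.
\end{aligned}
\tag{21.85}
\]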
We directly verify that

\[
r = \sqrt{q_1^2 + q_2^2 + q_3^2} = \frac{1}{2} \left( Q_1^2 + Q_2^2 + Q_3^2 + Q_4^2 \right). \tag{21.86}
\]
The momenta Pi of the transformed system follow from the generating func-
tion (21.84) as
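\[
\begin{aligned}
P_1 &= -\frac{\partial F_3}{\partial Q_1} = p_1 Q_1 + p_2 Q_2 + p_3 Q_3, \\
P_2 &= -\frac{\partial F_3}{\partial Q_2} = -p_1 Q_2 + p_2 Q_1 + p_3 Q_4, \\
P_3 &= -\frac{\partial F_3}{\partial Q_3} = -p_1 Q_3 - p_2 Q_4 + p_3 Q_1, \\
P_4 &= -\frac{\partial F_3}{\partial Q_4} = p_1 Q_4 - p_2 Q_3 + p_3 Q_2,
\end{aligned}
\tag{21.87}
\]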
which yields
\[
p_1^2 + p_2^2 + p_3^2 = \frac{P_1^2 + P_2^2 + P_3^2 + P_4^2}{Q_1^2 + Q_2^2 + Q_3^2 + Q_4^2}. \tag{21.88}
\]
The transformations of energy e and time t are

\[
E = \frac{\partial F_3}{\partial T} = e\, \xi(T), \qquad t = \frac{\partial F_3}{\partial e} = \int_0^T \xi(\tau)\, d\tau. \tag{21.89}
\]
From the transformation rule for the conventional Hamiltonians from (21.62), $H'$ is
finally obtained as
\[
H'(Q_j, P_j, T) = H(q_j, p_j, t)\, \xi(T), \tag{21.90}
\]
which is, as expected, in agreement with the transformation rule of their values, $E$
and $e$. In explicit form, the transformed Hamiltonian $H'(Q_j, P_j, T)$ is then found by
expressing the original Hamiltonian in terms of the new variables
\[
H'(Q_j, P_j, T) = \frac{\xi(T)}{Q_1^2 + Q_2^2 + Q_3^2 + Q_4^2} \left[ \frac{1}{2} \left( P_1^2 + P_2^2 + P_3^2 + P_4^2 \right) - 2K \right]. \tag{21.91}
\]
In terms of the new coordinates, the constant energy $e$ of the original system reads
\[
e = \frac{\frac{1}{2} \left( P_1^2 + P_2^2 + P_3^2 + P_4^2 \right) - 2K}{Q_1^2 + Q_2^2 + Q_3^2 + Q_4^2} = \text{const.} \tag{21.92}
\]
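From $H'$, the canonical equations of the transformed system evaluate to
\[
\begin{aligned}
\frac{\partial H'}{\partial P_i} &= \frac{dQ_i}{dT} = \frac{\xi(T)}{Q_1^2 + Q_2^2 + Q_3^2 + Q_4^2}\, P_i, \\
-\frac{\partial H'}{\partial Q_i} &= \frac{dP_i}{dT} = \frac{2\, \xi(T)}{\left( Q_1^2 + Q_2^2 + Q_3^2 + Q_4^2 \right)^2} \left[ \frac{1}{2} \left( P_1^2 + P_2^2 + P_3^2 + P_4^2 \right) - 2K \right] Q_i.
\end{aligned}
\]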
We may merge the pairs of first-order equations into second-order equations for the
Qi , i = 1, . . . , 4

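\[
\begin{aligned}
\frac{d^2 Q_i}{dT^2} & - \left[ \frac{1}{\xi(T)} \frac{d\xi(T)}{dT} - \frac{1}{Q_1^2 + Q_2^2 + Q_3^2 + Q_4^2} \frac{d}{dT} \left( Q_1^2 + Q_2^2 + Q_3^2 + Q_4^2 \right) \right] \frac{dQ_i}{dT} \\
& - 2e \left( \frac{\xi(T)}{Q_1^2 + Q_2^2 + Q_3^2 + Q_4^2} \right)^2 Q_i = 0.
\end{aligned}
\tag{21.93}
\]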
Having worked out the equations of motion of the transformed system, we are
now in a position to fix the as yet undetermined time function $\xi(T)$. With the
transformed canonical position coordinates conceived as functions of the transformed time,
$Q_i = Q_i(T)$, we may define
\[
\xi(T) \equiv Q_1^2(T) + Q_2^2(T) + Q_3^2(T) + Q_4^2(T). \tag{21.94}
\]
By fixing $\xi(T)$ in this way, the relation of the physical time $t$ of the Kepler
system to the time $T$ of the transformed system is uniquely determined
through (21.89)
\[
t(T) = \int_0^T \left[ Q_1^2(\tau) + Q_2^2(\tau) + Q_3^2(\tau) + Q_4^2(\tau) \right] d\tau. \tag{21.95}
\]
Note that the identification of $\xi(T)$ with the time evolution of a function of the canonical
variables does not mean that $\xi(T)$ acquires an explicit dependence on the canonical
variables. With this particular scaling of the transformed time $T$, the equations of
motion (21.93) simplify to
\[
\frac{d^2 Q_i}{dT^2} - 2e\, Q_i = 0. \tag{21.96}
\]
For e < 0, the orbit is closed in the original Kepler system. In the transformed system,
we then get four uncoupled equations of motion of the time-independent harmonic
oscillator, which we already know to be analytically solvable.
Equations (21.96) can be regarded as the equations of motion that emerge from the
canonical equations of the Hamiltonian
\[
H'(Q_j, P_j) = \frac{1}{2} \left( P_1^2 + P_2^2 + P_3^2 + P_4^2 \right) - e \left( Q_1^2 + Q_2^2 + Q_3^2 + Q_4^2 \right). \tag{21.97}
\]
By means of the relation (21.92), we immediately find the constant value $E$ of the
Hamiltonian (21.97)
\[
E = 2K = \text{const.} \tag{21.98}
\]
The original Kepler system may now be solved according to the scheme sketched
at the end of Example 19.3. We must first transform the given initial conditions
$q_1(0), q_2(0), q_3(0)$ and $p_1(0), p_2(0), p_3(0)$ of the Kepler system into the initial
conditions $Q_1(0), Q_2(0), Q_3(0), Q_4(0)$ and $P_1(0), P_2(0), P_3(0), P_4(0)$ for (21.96). This
can be worked out by inverting the transformation rules (21.85). For a unique inverse
to exist, we must choose one constraint, which may be defined for convenience as
\[
Q_4(0) \overset{\text{Def}}{=} 0. \tag{21.99}
\]
With this setting, the initial values of the configuration space variables of the
transformed system are
\[
Q_1(0) = \sqrt{q_1(0) + r(0)}, \qquad Q_2(0) = \frac{q_2(0)}{Q_1(0)}, \qquad Q_3(0) = \frac{q_3(0)}{Q_1(0)}. \tag{21.100}
\]
The initial momenta Pi (0) are then directly obtained from the general transformation
rules (21.87). Now, the harmonic oscillator equations (21.96) may be solved analyt-
ically to find the solutions Qi (T ), Pi (T ) at time T . The configuration space coordi-
nates qi (T ) at time T of the original Kepler system are then found from the trans-
formation rules (21.85). The corresponding momentum coordinates pi (T ) at time T
of the original Kepler system follow by solving the transformation rules (21.87) for
the pi
\[
\begin{aligned}
p_1 &= \frac{1}{2r} \left( Q_1 P_1 - Q_2 P_2 - Q_3 P_3 + Q_4 P_4 \right), \\
p_2 &= \frac{1}{2r} \left( Q_2 P_1 + Q_1 P_2 - Q_4 P_3 - Q_3 P_4 \right), \\
p_3 &= \frac{1}{2r} \left( Q_3 P_1 + Q_4 P_2 + Q_1 P_3 + Q_2 P_4 \right), \\
0 &= \frac{1}{2r} \left( -Q_4 P_1 + Q_3 P_2 - Q_2 P_3 + Q_1 P_4 \right).
\end{aligned}
\]
The remaining task is to invert the analytic solution of (21.95) to find the
representation $T(t)$. We can then finally express the solutions $q_i(T)$, $p_i(T)$ in terms of the
Kepler system’s time $t$ to obtain the $q_i(t)$ and the $p_i(t)$.
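The algebraic relations of the KS transformation can be cross-checked numerically. The sketch below is illustrative only: the function names, the sample initial conditions, and the value of $K$ are assumptions. It implements the rules (21.85), (21.87), the initial-value prescription (21.99), (21.100), and the inverse momentum relations, so that the identities (21.86), (21.88), and (21.92) can be tested. Note that (21.100) requires $Q_1(0) \neq 0$, which excludes initial positions on the negative $q_1$ axis.

```python
import math

def ks_positions(Q):
    """Old positions q_i from the new ones, Eq. (21.85)."""
    Q1, Q2, Q3, Q4 = Q
    return (0.5 * (Q1 * Q1 - Q2 * Q2 - Q3 * Q3 + Q4 * Q4),
            Q1 * Q2 - Q3 * Q4,
            Q1 * Q3 + Q2 * Q4)

def ks_new_momenta(Q, p):
    """New momenta P_i from old momenta and new positions, Eq. (21.87)."""
    Q1, Q2, Q3, Q4 = Q
    p1, p2, p3 = p
    return (p1 * Q1 + p2 * Q2 + p3 * Q3,
            -p1 * Q2 + p2 * Q1 + p3 * Q4,
            -p1 * Q3 - p2 * Q4 + p3 * Q1,
            p1 * Q4 - p2 * Q3 + p3 * Q2)

def ks_old_momenta(Q, P):
    """Inverse of Eq. (21.87); the fourth entry is the relation that must vanish."""
    Q1, Q2, Q3, Q4 = Q
    P1, P2, P3, P4 = P
    two_r = Q1 * Q1 + Q2 * Q2 + Q3 * Q3 + Q4 * Q4   # equals 2r by Eq. (21.86)
    return ((Q1 * P1 - Q2 * P2 - Q3 * P3 + Q4 * P4) / two_r,
            (Q2 * P1 + Q1 * P2 - Q4 * P3 - Q3 * P4) / two_r,
            (Q3 * P1 + Q4 * P2 + Q1 * P3 + Q2 * P4) / two_r,
            (-Q4 * P1 + Q3 * P2 - Q2 * P3 + Q1 * P4) / two_r)

def ks_initial_positions(q):
    """Initial Q_i(0) from q_i(0) with the choice Q_4(0) = 0, Eqs. (21.99), (21.100)."""
    r = math.sqrt(sum(x * x for x in q))
    Q1 = math.sqrt(q[0] + r)   # fails for q on the negative q_1 axis
    return (Q1, q[1] / Q1, q[2] / Q1, 0.0)

# Sample initial conditions and coupling (assumptions)
K = 1.0
q0 = (0.3, -1.2, 0.7)
p0 = (0.5, 0.1, -0.9)

Q0 = ks_initial_positions(q0)
P0 = ks_new_momenta(Q0, p0)
r0 = math.sqrt(sum(x * x for x in q0))
sum_Q2 = sum(x * x for x in Q0)
sum_P2 = sum(x * x for x in P0)

e_old = 0.5 * sum(x * x for x in p0) - K / r0      # Eq. (21.83)
e_new = (0.5 * sum_P2 - 2.0 * K) / sum_Q2          # Eq. (21.92)
```

With these helpers, the initial conditions of the four oscillator equations (21.96) follow directly from those of the Kepler problem, and the energy $e$ entering (21.96) can be computed in either set of variables.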
EXAMPLE
21.16 Time-Dependent Damped Harmonic Oscillator
As another example of an extended canonical transformation, we will show that the
time-dependent harmonic oscillator with a time-dependent damping coefficient can
directly be mapped into a conventional (time-independent) undamped harmonic
oscillator. Written for $n$ degrees of freedom, the Hamiltonian of the original system is given
by
\[
H(q_j, p_j, t) = \frac{1}{2}\, e^{-F(t)} \sum_{i=1}^{n} p_i^2 + \frac{1}{2}\, e^{F(t)}\, \omega^2(t) \sum_{i=1}^{n} q_i^2, \tag{21.101}
\]
with $\omega^2(t)$ and $F(t)$ denoting arbitrary, not necessarily periodic, differentiable
functions of time. The subsequent equations of motion follow as
\[
p_i = e^{F(t)}\, \dot{q}_i, \qquad \ddot{q}_i + f(t)\, \dot{q}_i + \omega^2(t)\, q_i = 0, \qquad f(t) = \dot{F}(t), \qquad i = 1, \dots, n. \tag{21.102}
\]
As the “target system” $H'$, with $T$ the independent variable, we demand the ordinary
time-independent and undamped harmonic oscillator,
\[
H'(Q_j, P_j) = \frac{1}{2} \sum_{i=1}^{n} P_i^2 + \frac{1}{2}\, \Omega^2 \sum_{i=1}^{n} Q_i^2. \tag{21.103}
\]
In the context of the generalized canonical transformation theory, this means that the
transformed time $T$ represents a cyclic coordinate, so that its conjugate quantity, the
energy $E$, represents a constant of motion.
The generating function $F_2(q_j, P_j, t, E)$ that defines the desired mapping of the
Hamiltonian (21.101) into (21.103) turns out to be
\[
F_2(q_j, P_j, t, E) = \sqrt{\frac{e^{F(t)}}{\xi(t)}} \sum_i q_i P_i + \frac{1}{4}\, e^{F(t)} \left( \frac{\dot{\xi}(t)}{\xi(t)} - f(t) \right) \sum_i q_i^2 - E \int_0^t \frac{d\tau}{\xi(\tau)}. \tag{21.104}
\]
Herein, ξ(t) denotes an as yet undetermined, hence a priori arbitrary differentiable

function of time. According to the general transformation rules (21.65) for generating
functions of type F2 , the particular rules for the transformation of canonical coordi-
nates and time follow as

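\[
\begin{pmatrix} q_i \\ p_i \end{pmatrix}
= \begin{pmatrix} \sqrt{\xi / e^{F}} & 0 \\ \frac{1}{2} \bigl( \dot{\xi} - \xi f \bigr) \sqrt{e^{F} / \xi} & \sqrt{e^{F} / \xi} \end{pmatrix}
\begin{pmatrix} Q_i \\ P_i \end{pmatrix},
\qquad T = \int_0^t \frac{d\tau}{\xi(\tau)}. \tag{21.105}
\]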
As the new spatial coordinates Qi and the new time T depend on the old spatial
coordinates qi and the old time t , respectively, we are actually dealing with a point
transformation. The transformation of the time scales of both systems is governed by
the as yet undetermined time function ξ(t).
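In terms of the new coordinates, the transformation rule for the energy $e = -\partial F_2 / \partial t$
is found from our $F_2$ as
\[
E = \xi\, e + \frac{1}{2} \bigl( f \xi - \dot{\xi} \bigr) \sum_i Q_i P_i + \frac{1}{4} \left( \xi \ddot{\xi} - \dot{\xi}^2 + f \xi \dot{\xi} - \dot{f} \xi^2 - f^2 \xi^2 \right) \sum_i Q_i^2. \tag{21.106}
\]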
Because of $\partial T / \partial t = 1/\xi(t)$, the relation of old and new Hamiltonians, $H$ and $H'$,
follows from the general rule from (21.62), yielding
\[
H' - E = \xi(t)\, (H - e).
\]
With $H$ the original Hamiltonian from (21.101), we get the new Hamiltonian
$H'(Q_j, P_j, T)$ by eliminating the old variables according to the transformation
rules (21.105) and (21.106)
\[
H' = \frac{1}{2} \sum_{i=1}^{n} P_i^2 + \frac{1}{2} \sum_{i=1}^{n} Q_i^2 \left[ \frac{1}{2} \xi \ddot{\xi} - \frac{1}{4} \dot{\xi}^2 + \xi^2 \omega^2 - \frac{1}{4} \xi^2 f^2 - \frac{1}{2} \xi^2 \dot{f} \right]. \tag{21.107}
\]
This is obviously the desired Hamiltonian $H'$ of the ordinary harmonic oscillator (21.103),
provided that its constant coefficient $\Omega^2$ is identified with the terms in
brackets of the transformed Hamiltonian (21.107)
$$\Omega^2 = \frac12\,\xi\ddot\xi - \frac14\,\dot\xi^2 + \xi^2\left(\omega^2 - \frac14 f^2 - \frac12\dot f\right) = \text{const.} \tag{21.108}$$
Correspondingly, the transformed Hamiltonian $H'$ does not explicitly depend on time
if and only if $d\Omega^2/dt = 0$, which means that the third-order equation
$$\dddot\xi + \dot\xi\left(4\omega^2 - 2\dot f - f^2\right) + \xi\left(4\omega\dot\omega - \ddot f - f\dot f\right) = 0 \tag{21.109}$$
must be satisfied. As a consequence of this requirement, depending on the given external
functions $\omega^2(t)$ and $f(t)$, the function $\xi(t)$ is now determined. This means, furthermore,
that the particular correlations of canonical coordinates, energy, and time scales
of both systems are now pinpointed according to the transformation rules (21.105)
and (21.106). With $\xi(t)$ a solution of the linear and homogeneous third-order
equation (21.109), we have $\Omega^2 = \text{const.}$, and hence the value $E = E(0)$ of $H'$ constitutes a
constant of motion. Thus, it is the freedom associated with extended canonical
transformations to arbitrarily adjust the time scales of the involved systems that enables us
to free the target system $H'$ from its explicit dependence on the independent variable.
If we express the constant value of the new Hamiltonian $H'(Q_j,P_j) = E = \text{const.}$
in terms of the old coordinates $p_i$ and $q_i$, then we isolate an invariant of the original
system (21.101),
$$E = \frac12\,e^{-F}\xi\sum_{i=1}^n p_i^2 - \frac12\left(\dot\xi - \xi f\right)\sum_{i=1}^n q_i p_i + \frac14\,e^{F(t)}\left(\ddot\xi - \dot\xi f - \xi\dot f + 2\xi\,\omega^2(t)\right)\sum_{i=1}^n q_i^2. \tag{21.110}$$
The time function $\xi(t)$ can be attributed a physical meaning. We easily convince
ourselves that
$$\xi(t) = e^{F(t)}\sum_{i=1}^n q_i^2(t) \tag{21.111}$$
represents a solution of (21.109), provided, of course, that all $q_i(t)$ are solutions of
the equation of motion (21.102) of the time-dependent damped harmonic oscillator.
Inserting (21.111) into the representation (21.110) of the invariant $E$, the latter
takes on the equivalent form
$$E = \sum_i q_i^2\,\sum_i p_i^2 - \Bigl(\sum_i q_i p_i\Bigr)^2 = \frac12\sum_{i,j}\left(p_i q_j - q_i p_j\right)^2. \tag{21.112}$$
The actual invariance of $E$ can be proved directly by calculating its time derivative.
Obviously, the invariant (21.112) of the time-dependent damped harmonic oscillator (21.102)
has exactly the form of the conservation law for the angular momentum
in central force fields. In the realm of accelerator physics, the quantity $\varepsilon_{\mathrm{rms}} = \sqrt{E}/n$
is referred to as the “root-mean-square (rms) emittance.” The “rms-emittance” of a
charged particle beam is thus invariant along the beam axis, as long as (i) the particle
motion may approximately be described by linear equations of motion and (ii) the
number of beam particles is maintained.
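The invariance of (21.112) can also be checked numerically. The following is a minimal sketch, not part of the original text. It assumes that the Hamiltonian (21.101), which is not reproduced in this part of the text, has the form $H = \sum_i \bigl[\tfrac12 e^{-F}p_i^2 + \tfrac12 e^{F}\omega^2 q_i^2\bigr]$ implied by (21.102). The choices $F(t) = 0.3\,t$ and $\omega^2(t) = 1 + 0.5\sin t$, the initial values, and all function names are illustrative assumptions.

```python
import math

def derivs(t, y, n):
    """Canonical equations for the assumed Hamiltonian
    H = sum_i [exp(-F) p_i^2 / 2 + exp(F) w2(t) q_i^2 / 2]."""
    F = 0.3 * t                       # assumed F(t), hence f(t) = dF/dt = 0.3
    w2 = 1.0 + 0.5 * math.sin(t)      # assumed omega^2(t)
    q, p = y[:n], y[n:]
    dq = [math.exp(-F) * pi for pi in p]       # equivalent to p_i = exp(F) dq_i/dt
    dp = [-math.exp(F) * w2 * qi for qi in q]
    return dq + dp

def rk4_step(t, y, h, n):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = derivs(t, y, n)
    k2 = derivs(t + h / 2, [a + h / 2 * b for a, b in zip(y, k1)], n)
    k3 = derivs(t + h / 2, [a + h / 2 * b for a, b in zip(y, k2)], n)
    k4 = derivs(t + h, [a + h * b for a, b in zip(y, k3)], n)
    return [a + h / 6 * (b1 + 2 * b2 + 2 * b3 + b4)
            for a, b1, b2, b3, b4 in zip(y, k1, k2, k3, k4)]

def invariant(y, n):
    """E = (sum q^2)(sum p^2) - (sum q p)^2, cf. (21.112)."""
    q, p = y[:n], y[n:]
    sqq = sum(x * x for x in q)
    spp = sum(x * x for x in p)
    sqp = sum(a * b for a, b in zip(q, p))
    return sqq * spp - sqp ** 2

def max_relative_drift(steps=5000, h=1e-3):
    """Integrate three particles and return (E(0), largest relative change of E)."""
    n = 3
    y = [1.0, -0.5, 0.2, 0.3, 0.8, -0.4]   # q_1..q_3 followed by p_1..p_3
    e0 = invariant(y, n)
    t, worst = 0.0, 0.0
    for _ in range(steps):
        y = rk4_step(t, y, h, n)
        t += h
        worst = max(worst, abs(invariant(y, n) - e0) / e0)
    return e0, worst

if __name__ == "__main__":
    print(max_relative_drift())
```

Although the individual $q_i(t)$ and $p_i(t)$ change substantially over the integration interval, the combination (21.112) should stay constant up to the integration error of the Runge–Kutta scheme.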
Inserting (21.111) into (21.108) we finally find that the invariant $E$ coincides with
the coefficient $\Omega^2$ from the transformed Hamiltonian (21.103) provided that $\xi(t)$ is
given by (21.111),
$$\Omega^2 = E. \tag{21.113}$$
Having related the time-dependent damped harmonic oscillator (21.101) by means of
an extended canonical transformation with the ordinary (time-independent) harmonic
oscillator (21.103), we may now explicitly work out the solution functions $q_i(t)$ and
$p_i(t)$ of the equations of motion (21.102). The explicit solutions $Q_i(T)$ and $P_i(T)$ of
the ordinary harmonic oscillator are known from Example 19.3
$$\begin{pmatrix}Q_i(T)\\ P_i(T)\end{pmatrix} = \begin{pmatrix}\cos\Omega T & \Omega^{-1}\sin\Omega T\\ -\Omega\sin\Omega T & \cos\Omega T\end{pmatrix}\begin{pmatrix}Q_i(0)\\ P_i(0)\end{pmatrix}. \tag{21.114}$$
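The solution functions $q_i(T)$ and $p_i(T)$ of the time-dependent damped harmonic
oscillator follow subsequently as the product of the solution (21.114) with the
transformation (21.105),
$$\begin{pmatrix}q_i(T)\\ p_i(T)\end{pmatrix} = \begin{pmatrix}\sqrt{\xi/e^{F}} & 0\\[2pt] \tfrac12(\dot\xi - \xi f)\sqrt{e^{F}/\xi} & \sqrt{e^{F}/\xi}\end{pmatrix}\begin{pmatrix}\cos\Omega T & \Omega^{-1}\sin\Omega T\\ -\Omega\sin\Omega T & \cos\Omega T\end{pmatrix}\begin{pmatrix}Q_i(0)\\ P_i(0)\end{pmatrix}. \tag{21.115}$$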
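The initial conditions $Q_i(0)$ and $P_i(0)$ can, furthermore, be expressed through the
$q_i(0)$ and the $p_i(0)$ by means of the inverse transformation of (21.105) at time
$t = T = 0$,
$$\begin{pmatrix}Q_i(0)\\ P_i(0)\end{pmatrix} = \begin{pmatrix}\sqrt{e^{F(0)}/\xi(0)} & 0\\[2pt] -\tfrac12\bigl(\dot\xi(0) - \xi(0) f(0)\bigr)\sqrt{e^{F(0)}/\xi(0)} & \sqrt{\xi(0)/e^{F(0)}}\end{pmatrix}\begin{pmatrix}q_i(0)\\ p_i(0)\end{pmatrix}. \tag{21.116}$$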
We switch to the time scale of the original system by expressing $T$ in terms of $t$
according to the corresponding transformation rule from (21.105). The solution of the
time-dependent damped harmonic oscillator from (21.102) is thus finally given by
$$\begin{pmatrix}q_i(t)\\ p_i(t)\end{pmatrix} = R(t)\begin{pmatrix}q_i(0)\\ p_i(0)\end{pmatrix}, \qquad T(t) = \int_0^t\frac{d\tau}{\xi(\tau)}. \tag{21.117}$$
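Herein, $\xi(t)$ denotes the uniquely determined solution of (21.108) for given initial
conditions $\xi(0)$, $\dot\xi(0)$ and a fixed $\Omega^2 = \text{const.}$ The elements of the solution matrix
$R(t)$ are obtained by multiplying the three matrices involved,
$$\begin{aligned}
r_{11}(t) &= \sqrt{\frac{\xi(t)\,e^{F(0)}}{\xi(0)\,e^{F(t)}}}\left[\cos\Omega T(t) - \frac12\bigl(\dot\xi(0) - f(0)\,\xi(0)\bigr)\frac{\sin\Omega T(t)}{\Omega}\right]\\
r_{12}(t) &= \sqrt{\frac{\xi(0)\,\xi(t)}{e^{F(0)}\,e^{F(t)}}}\;\frac{\sin\Omega T(t)}{\Omega}\\
r_{21}(t) &= \sqrt{\frac{e^{F(0)}\,e^{F(t)}}{\xi(0)\,\xi(t)}}\left[\frac12\bigl(\dot\xi(t) - f(t)\,\xi(t) - \dot\xi(0) + f(0)\,\xi(0)\bigr)\cos\Omega T(t)\right.\\
&\qquad\left. - \left(\frac14\bigl(\dot\xi(0) - f(0)\,\xi(0)\bigr)\bigl(\dot\xi(t) - f(t)\,\xi(t)\bigr) + \Omega^2\right)\frac{\sin\Omega T(t)}{\Omega}\right]\\
r_{22}(t) &= \sqrt{\frac{\xi(0)\,e^{F(t)}}{\xi(t)\,e^{F(0)}}}\left[\cos\Omega T(t) + \frac12\bigl(\dot\xi(t) - f(t)\,\xi(t)\bigr)\frac{\sin\Omega T(t)}{\Omega}\right].
\end{aligned} \tag{21.118}$$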
For all times $t$, the determinant $D = r_{11}r_{22} - r_{12}r_{21}$ of the matrix $R(t)$ has the value
$D = 1$. The linear mapping $(q(0),p(0)) \mapsto (q(t),p(t))$ is thus in agreement with
the requirement of Liouville’s theorem. For the particular case $\Omega \to 0$, we find that
$(\sin\Omega T)/\Omega \to T$. In that case, the particle motion in a time-dependent damped
harmonic oscillator is mapped into the motion of free particles.
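The statement $D = 1$ and the closed-form elements (21.118) can be cross-checked by multiplying the three matrices numerically. The sketch below is an illustration, not part of the original text. The numerical values of $\xi$, $\dot\xi$, $f$, $F$, $\Omega$, and $T$ are arbitrary assumptions rather than an actual solution of (21.108), since the matrix identity holds algebraically for any such values. All function names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def point_map(xi, xidot, f, F):
    """Matrix of (21.105): (q, p) = A (Q, P)."""
    s = math.sqrt(xi / math.exp(F))
    g = 0.5 * (xidot - xi * f)
    return [[s, 0.0], [g / s, 1.0 / s]]

def inverse_point_map(xi, xidot, f, F):
    """Matrix of (21.116): (Q, P) = A^-1 (q, p)."""
    s = math.sqrt(xi / math.exp(F))
    g = 0.5 * (xidot - xi * f)
    return [[1.0 / s, 0.0], [-g / s, s]]

def oscillator(Omega, T):
    """Matrix of (21.114) for the ordinary harmonic oscillator."""
    c, s = math.cos(Omega * T), math.sin(Omega * T)
    return [[c, s / Omega], [-Omega * s, c]]

def solution_matrix(start, end, Omega, T):
    """R = A(t) M(T) A(0)^-1; start and end are (xi, xidot, f, F) tuples."""
    return matmul(point_map(*end),
                  matmul(oscillator(Omega, T), inverse_point_map(*start)))

def closed_form(start, end, Omega, T):
    """Elements r_11 ... r_22 as written out in (21.118)."""
    xi0, xd0, f0, F0 = start
    xit, xdt, ft, Ft = end
    c, s = math.cos(Omega * T), math.sin(Omega * T) / Omega
    g0, gt = xd0 - f0 * xi0, xdt - ft * xit
    e0, et = math.exp(F0), math.exp(Ft)
    r11 = math.sqrt(xit * e0 / (xi0 * et)) * (c - 0.5 * g0 * s)
    r12 = math.sqrt(xi0 * xit / (e0 * et)) * s
    r21 = math.sqrt(e0 * et / (xi0 * xit)) * (
        0.5 * (gt - g0) * c - (0.25 * g0 * gt + Omega ** 2) * s)
    r22 = math.sqrt(xi0 * et / (xit * e0)) * (c + 0.5 * gt * s)
    return [[r11, r12], [r21, r22]]

# Arbitrary illustrative values for (xi, xidot, f, F) at time 0 and time t.
START = (1.2, 0.4, 0.1, 0.0)
END = (2.0, -0.3, 0.25, 0.7)
R = solution_matrix(START, END, Omega=1.5, T=0.9)
DETERMINANT = R[0][0] * R[1][1] - R[0][1] * R[1][0]
```

Each of the three factors has unit determinant, which is why the product does as well, independently of the particular time functions.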
The rule for the transformation of time t → T that emerges from the generating
function (21.104) has the particular property that the transformed time T does not de-
pend on the coordinates of the particles. Exactly for that reason, the transformed time
T maintains the property of the original time t to be a common coordinate for all parti-
cles in the transformed system. This means that T may serve as the common evolution
parameter of all particle coordinates Qi (T ) and Pi (T ) in the transformed system. In
other words, T has the global property of the transformed system’s time. Mathemat-
ically, T has this property due to the fact that the (Pi , qi ) and the (E, t) terms in the
particular generating function (21.104) are additive. Therefore, this extended canon-
ical transformation can be split into a conventional canonical transformation plus a
pure canonical time-energy transformation from Example 21.11. This is not always
possible as extended canonical transformations can be defined that do not admit a sep-
aration of the transformation of space and time coordinates. We will encounter such
a canonical transformation in Example 21.18, where the Lorentz transformation is
formulated as an extended canonical transformation.
EXAMPLE
21.17 Galilei Transformation
Galileo’s principle of relativity constituted, until its absorption as a limiting case into
Einstein’s principle in the year 1905, one of the most undoubted principles of classical
dynamics. It stated that there exists an “absolute time” $t$ that is instantaneously common
to all coordinate systems, however far apart these systems may be located. If we
consider the special case of two coordinate systems that are moving with respect to
each other along one coordinate axis at a constant velocity $v$, the transformation rule
for the positions $q$, $Q$ and times $t$, $T$ of a moving body of mass $m_0$ between these two
systems is simply
$$q = Q + vt, \qquad t = T.$$
Formulated as an extended canonical transformation, the generating function of type
$F_2$ of the Galilei transformation is then
$$F_2(q,P,t,E) = Pq - Et - v\,(Pt - m_0 q). \tag{21.119}$$
The complete set of transformation rules of the canonical coordinates is then
$$\begin{aligned}
p &= \frac{\partial F_2}{\partial q} = P + m_0 v, & e &= -\frac{\partial F_2}{\partial t} = E + vP,\\
Q &= \frac{\partial F_2}{\partial P} = q - vt, & T &= -\frac{\partial F_2}{\partial E} = t, \qquad H_1' = H_1.
\end{aligned} \tag{21.120}$$
From the general transformation rule for extended Hamiltonians, $H_1' = H_1$, the rule
for the conventional Hamiltonians $H$ and $H'$ is then obtained according to (21.62)
with $\partial T/\partial t = 1$ as
$$H = H' + vP. \tag{21.121}$$
As required, the rule for the Hamiltonians is in agreement with the rule for their values,
$e$ and $E$.
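The rules (21.120) can be reproduced by differentiating the generating function (21.119) numerically. The following sketch is an illustration only. The helper `central_diff` and the sample values are assumptions introduced here, and the sign conventions $p = \partial F_2/\partial q$, $Q = \partial F_2/\partial P$, $e = -\partial F_2/\partial t$, $T = -\partial F_2/\partial E$ are those quoted in (21.120).

```python
def galilei_f2(q, P, t, E, v, m0):
    """Generating function (21.119) of the Galilei transformation."""
    return P * q - E * t - v * (P * t - m0 * q)

def central_diff(func, args, index, h=1e-4):
    """Central-difference derivative of func with respect to args[index]."""
    up, down = list(args), list(args)
    up[index] += h
    down[index] -= h
    return (func(*up) - func(*down)) / (2 * h)

def galilei_rules(q, P, t, E, v, m0):
    """Return (p, Q, e, T) obtained from the derivatives of F2."""
    args = (q, P, t, E, v, m0)
    p = central_diff(galilei_f2, args, 0)     # p =  dF2/dq
    Q = central_diff(galilei_f2, args, 1)     # Q =  dF2/dP
    e = -central_diff(galilei_f2, args, 2)    # e = -dF2/dt
    T = -central_diff(galilei_f2, args, 3)    # T = -dF2/dE
    return p, Q, e, T

if __name__ == "__main__":
    print(galilei_rules(q=2.0, P=3.0, t=1.5, E=4.0, v=0.5, m0=2.0))
```

Because $F_2$ is bilinear in its arguments, the central differences are exact up to rounding, so the result can be compared directly with $P + m_0 v$, $q - vt$, $E + vP$, and $t$.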
EXAMPLE
21.18 Lorentz Transformation
The correct transformation rule between coordinate systems that move with respect
to each other at constant velocity (“inertial systems”) is based on the finding that the
velocity of light, c, is actually finite. This rule is referred to as the “Lorentz transfor-
mation.” A finite c constituting the upper speed limit for any signal obviously means
that a finite time span is needed for the signal to pass from one reference system to
another. As an immediate consequence, Galileo’s concept of an “absolute time” that
is instantaneously common to all inertial systems had to be abandoned. Instead, it is
obviously necessary to also transform the time t if one performs the transition from
one inertial system to another.
The special principle of relativity requires that the formulation of the description
of a physical system must be the same in all inertial systems. This means in particu-
lar for Hamiltonian systems that the Lorentz transformation must maintain the form
of the Hamiltonian. On the other hand, as we know from the preceding Chap. 19
and Sect. 21.3, only canonical transformations maintain the form of the canonical
equations. We conclude that the Lorentz transformation must constitute a particular
canonical transformation. As the Lorentz transformation is necessarily associated with a
transformation of the time coordinate, $t \to T$, it may be described only in terms of an
extended canonical transformation. Its extended generating function $F_2$ is given by
$$F_2(q,P,t,E) = \gamma\left[Pq - Et - v\left(Pt - \frac{E}{c^2}\,q\right)\right], \tag{21.122}$$
with v denoting the constant relative velocity of the respective inertial systems. In the
formulation given here, the coordinate systems are adjusted to ensure that the relative
motion of both systems occurs along one coordinate axis, $q$. As usual, we denote by
$\gamma$ the dimensionless length and time scaling factor $\gamma = 1/\sqrt{1-\beta^2}$, with $\beta = v/c$
the scaled relative velocity. We observe that the generating function (21.122) of the
Lorentz transformation merges into that of the Galilei transformation (21.119) from
Example 21.17 for $v \ll c$, hence for $\beta \to 0$, $\gamma \to 1$, $E/c^2 = m \to m_0$. Namely, the
total mass $m = E/c^2$ in (21.122) is replaced by the constant rest mass $m_0$ in the case of
the Galilei transformation. With regard to the transformation rules that emerge from
the generating functions, it is exactly the replacement of the second $E$-dependent term
in (21.122) by the constant mass term $m_0$ in (21.119) from Example 21.17 that induces
the time transformation rule $t = T$ of the Galilei transformation.
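For the generating function (21.122), the general transformation rules (21.65) for
extended canonical transformations yield the particular rules
$$\begin{aligned}
p &= \frac{\partial F_2}{\partial q} = \gamma\left(P + \frac{E}{c^2}\,v\right), & e &= -\frac{\partial F_2}{\partial t} = \gamma\,(E + vP),\\
Q &= \frac{\partial F_2}{\partial P} = \gamma\,(q - vt), & T &= -\frac{\partial F_2}{\partial E} = \gamma\left(t - \frac{v}{c^2}\,q\right), \qquad H_1' = H_1.
\end{aligned} \tag{21.123}$$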
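The transformation rules for the variables $Q$ and $T$ follow as
$$\begin{pmatrix}Q\\ cT\end{pmatrix} = \begin{pmatrix}\gamma & -\beta\gamma\\ -\beta\gamma & \gamma\end{pmatrix}\begin{pmatrix}q\\ ct\end{pmatrix}. \tag{21.124}$$
The canonical transformation approach to represent the Lorentz transformation
ensures that we simultaneously obtain the rules for the conjugate coordinates, $P$ and $E$,
$$\begin{pmatrix}p\\ e/c\end{pmatrix} = \begin{pmatrix}\gamma & \beta\gamma\\ \beta\gamma & \gamma\end{pmatrix}\begin{pmatrix}P\\ E/c\end{pmatrix} \quad\Leftrightarrow\quad \begin{pmatrix}P\\ E/c\end{pmatrix} = \begin{pmatrix}\gamma & -\beta\gamma\\ -\beta\gamma & \gamma\end{pmatrix}\begin{pmatrix}p\\ e/c\end{pmatrix}. \tag{21.125}$$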
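With the real angle $\alpha = \operatorname{arcosh}\gamma = \operatorname{arsinh}\beta\gamma$, the linear transformations (21.124) and
(21.125) can be rewritten as orthogonal mappings, hence as the imaginary rotations
$$\begin{aligned}
\begin{pmatrix}Q\\ icT\end{pmatrix} &= \begin{pmatrix}\cos i\alpha & \sin i\alpha\\ -\sin i\alpha & \cos i\alpha\end{pmatrix}\begin{pmatrix}q\\ ict\end{pmatrix},\\
\begin{pmatrix}P\\ iE/c\end{pmatrix} &= \begin{pmatrix}\cos i\alpha & \sin i\alpha\\ -\sin i\alpha & \cos i\alpha\end{pmatrix}\begin{pmatrix}p\\ ie/c\end{pmatrix}.
\end{aligned} \tag{21.126}$$
Obviously, these transformations maintain the “distances”
$$Q^2 - c^2T^2 = q^2 - c^2t^2, \qquad P^2 - E^2/c^2 = p^2 - e^2/c^2. \tag{21.127}$$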
With $\partial T/\partial t = \gamma$, the transformation rule for conventional Hamiltonians, $H$ and $H'$,
under Lorentz transformations follows according to (21.62) as
$$(H' - E)\,\gamma = H - e. \tag{21.128}$$
Together with the rule (21.125) for the transformation of the energies $e$, $E$
$$e = \gamma E + \beta\gamma\,P c \tag{21.129}$$
the transformation rule for Hamiltonians under Lorentz transformations is finally
found from (21.128) as
$$H = \gamma H' + \beta\gamma\,P c. \tag{21.130}$$
As expected, the Hamiltonians $H$, $H'$ transform in the same way as their respective
values, $e$, $E$.
As the Lorentz transformation constitutes an extended canonical transformation,
the extended version of Liouville’s theorem from Example 21.12 applies,
$$\frac{\partial(Q,P,T,E)}{\partial(q,p,t,e)} \overset{!}{=} 1. \tag{21.131}$$
On the basis of the transformation rules (21.124) and (21.125), we easily verify that
the requirement (21.131) is indeed fulfilled
$$\frac{\partial(Q,P,T,E)}{\partial(q,p,t,e)} = \frac{\partial(Q,T)}{\partial(q,t)}\,\frac{\partial(P,E)}{\partial(p,e)} = \begin{vmatrix}\gamma & -c\beta\gamma\\[2pt] -\dfrac{\beta\gamma}{c} & \gamma\end{vmatrix}\,\begin{vmatrix}\gamma & -\dfrac{\beta\gamma}{c}\\[2pt] -c\beta\gamma & \gamma\end{vmatrix} = \left[\gamma^2(1-\beta^2)\right]^2 = 1.$$
Yet, we must be aware of the fact that the forms $dq\,dp$ and $dt\,de$ of the projection
planes $(q,p)$ and $(t,e)$ are not invariant under Lorentz transformations
$$\frac{\partial(Q,P)}{\partial(q,p)} = \gamma^2, \qquad \frac{\partial(T,E)}{\partial(t,e)} = \gamma^2. \tag{21.132}$$
Thus, under Lorentz transformations, the conventional form of Liouville’s theorem
from Example 19.6 does not hold!
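The Jacobians (21.131) and (21.132) and the invariants (21.127) can be evaluated numerically from the linear rules (21.124) and (21.125). The following sketch is an illustration with assumed sample values ($c = 2$, $v = 1.2$, hence $\gamma = 1.25$); the function names are hypothetical.

```python
import math

def boost(q, p, t, e, v, c):
    """Lorentz rules (21.124)/(21.125): map (q, p, t, e) to (Q, P, T, E)."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    Q = g * (q - v * t)
    P = g * (p - v * e / c ** 2)
    T = g * (t - v * q / c ** 2)
    E = g * (e - v * p)
    return Q, P, T, E

def det(m):
    """Determinant of a small square matrix by Laplace expansion."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total

def jacobian(v, c):
    """4x4 Jacobian d(Q,P,T,E)/d(q,p,t,e); the map is linear, so its columns
    are the images of the unit vectors."""
    units = ((1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1))
    cols = [boost(*u, v, c) for u in units]
    return [[cols[j][i] for j in range(4)] for i in range(4)]

def block_det(J, i, j):
    """2x2 sub-determinant built from rows and columns i and j."""
    return J[i][i] * J[j][j] - J[i][j] * J[j][i]

J = jacobian(v=1.2, c=2.0)
FULL = det(J)                 # extended phase-space volume factor, cf. (21.131)
QP_PLANE = block_det(J, 0, 1) # d(Q,P)/d(q,p), cf. (21.132)
TE_PLANE = block_det(J, 2, 3) # d(T,E)/d(t,e), cf. (21.132)
```

The full four-dimensional determinant equals one, whereas each projection plane is stretched by $\gamma^2$, which is the content of the remark on the conventional Liouville theorem.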
EXAMPLE
21.19 Infinitesimal Canonical Transformations, Generalized Noether Theorem
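We define the following generating function of an infinitesimal extended canonical
transformation that generalizes the infinitesimal time-step transformation from
Example 19.5,
$$F_2(q_\nu,P_\nu) = \sum_{\mu=0}^n q_\mu P_\mu + \epsilon\,I(q_\nu,p_\nu). \tag{21.133}$$
Herein, $\epsilon$ denotes the infinitesimal parameter and $I(q_\nu,p_\nu)$ a differentiable function
of all variables of the extended phase space ($\mu,\nu = 0,\ldots,n$). The coordinate
transformation rules follow as
$$p_\mu = \frac{\partial F_2}{\partial q_\mu} = P_\mu + \epsilon\,\frac{\partial I}{\partial q_\mu}, \qquad Q_\mu = \frac{\partial F_2}{\partial P_\mu} = q_\mu + \epsilon\,\frac{\partial I}{\partial P_\mu}. \tag{21.134}$$
To first order in $\epsilon$, we derive the variations $\delta p_\mu$ and $\delta q_\mu$ from the transformation
rules (21.134),
$$\delta p_\mu \equiv P_\mu - p_\mu = -\epsilon\,\frac{\partial I}{\partial q_\mu}, \qquad \delta q_\mu \equiv Q_\mu - q_\mu = \epsilon\,\frac{\partial I}{\partial p_\mu}. \tag{21.135}$$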
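The total $s$-derivative of $I(q_\nu,p_\nu)$ is then
$$\begin{aligned}
\epsilon\,\frac{dI}{ds} &= \epsilon\sum_\mu\left(\frac{\partial I}{\partial q_\mu}\frac{dq_\mu}{ds} + \frac{\partial I}{\partial p_\mu}\frac{dp_\mu}{ds}\right)\\
&= -\sum_\mu\left(\frac{\partial H_1}{\partial p_\mu}\,\delta p_\mu + \frac{\partial H_1}{\partial q_\mu}\,\delta q_\mu\right)\\
&= -\delta H_1.
\end{aligned} \tag{21.136}$$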
Thus, by means of the canonical equations (21.23) and the first-order transformation
rules (21.135), we have found that the characteristic function I (qν , pν ) that is con-
tained in the generating function (21.133) constitutes a constant of motion exactly if
the extended Hamiltonian H1 is invariant under the transformation (21.135) generated
by (21.133). In other words, if the rules (21.135) define a symmetry transformation
of the given Hamiltonian system then the characteristic function I (qν , pν ) of the gen-
erating function (21.133) constitutes a constant of motion. The correlation (21.136)
of a Hamiltonian system’s invariants $I$ to the symmetry transformations that maintain
the value of its extended Hamiltonian $H_1$ establishes Noether’s theorem¹ of point
mechanics in the extended Hamiltonian formalism
$$\frac{dI}{ds} = 0 \quad\Leftrightarrow\quad \delta H_1 = 0. \tag{21.137}$$
The derivation of Noether’s theorem in the context of the Lagrangian formalism² is
restricted to extended point transformations (see Example 21.10). Yet, the extended
canonical transformation approach allows us to describe more general symmetry
mappings, as the rules (21.134) are not restricted to point transformations.
Consequently, (21.137) in conjunction with the infinitesimal canonical mapping (21.135)
represents a generalized formulation of Noether’s theorem.
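The correspondence between $dI/ds$ and $\delta H_1$ can be illustrated with a small numerical experiment. The sketch below is a simplified assumption-laden example, not part of the original text: it uses a conventional, time-independent Hamiltonian in two dimensions, $H = \tfrac12(p_1^2 + p_2^2) + \tfrac14 (q_1^2 + q_2^2)^2$, so that the variation of the extended Hamiltonian reduces to the variation of $H$. The two test functions $I$, the sample state, and all names are illustrative choices.

```python
def hamiltonian(q1, q2, p1, p2):
    """Assumed rotationally symmetric Hamiltonian with a quartic potential."""
    r2 = q1 * q1 + q2 * q2
    return 0.5 * (p1 * p1 + p2 * p2) + 0.25 * r2 * r2

def transform(state, grad_I, eps):
    """Infinitesimal map (21.135): dq = eps dI/dp, dp = -eps dI/dq."""
    q1, q2, p1, p2 = state
    dI_dq1, dI_dq2, dI_dp1, dI_dp2 = grad_I(*state)
    return (q1 + eps * dI_dp1, q2 + eps * dI_dp2,
            p1 - eps * dI_dq1, p2 - eps * dI_dq2)

def grad_angular_momentum(q1, q2, p1, p2):
    """Gradient of I = q1 p2 - q2 p1 (generates rotations)."""
    return (p2, -p1, -q2, q1)

def grad_position(q1, q2, p1, p2):
    """Gradient of I = q1 (generates a shift of p1; not a symmetry here)."""
    return (1.0, 0.0, 0.0, 0.0)

def delta_H(state, grad_I, eps):
    """Change of H under the map generated by I."""
    return hamiltonian(*transform(state, grad_I, eps)) - hamiltonian(*state)

STATE = (1.0, 0.5, -0.3, 0.8)
EPS = 1e-6
ROTATION_RATE = delta_H(STATE, grad_angular_momentum, EPS) / EPS
SHIFT_RATE = delta_H(STATE, grad_position, EPS) / EPS
```

For the angular momentum, $\delta H$ vanishes to first order in $\epsilon$, in line with its conservation. For $I = q_1$ one finds $\delta H/\epsilon \approx -p_1$, which matches $dI/dt = \dot q_1 = p_1 = -\delta H/\epsilon$, so this $I$ is not a constant of motion.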
The variation of an arbitrary function u(qν , pν ) under the infinitesimal canonical
transformation that is induced by I (qν , pν ) according to the rules from (21.135) is
then
1 Amalie “Emmy” Noether, German mathematician, b. March 2, 1882, Erlangen, Germany–d. April
14, 1935, Bryn Mawr, Pennsylvania, USA. Emmy Noether grew up in Erlangen in the family of a
mathematician and passed the state examination for teachers of foreign languages. When in 1903,
women were for the first time allowed to study at Bavarian universities, she took up studies of mathe-
matics in Erlangen and graduated in 1907. Following an invitation by Felix Klein and David Hilbert,
she then moved to Göttingen. She was not allowed a habilitation, the German qualification to become
a professor, until 1919; only in 1922 did she become an assistant professor, and she obtained her first
paid position as a professor in 1923. In 1928/29 she was a guest professor in Moscow, and in 1930 in
Frankfurt am Main. Because of her Jewish descent and her political views, she was forbidden to teach in 1933.
Noether emigrated to the US, where she found a position as a guest professor at the Bryn Mawr
Women’s College.
Emmy Noether’s work on the theory of invariants, the theory of ideals and of rings and modules
was instrumental to the development of modern algebra. Her theorem linking continuous symmetries
of a physical system to conserved quantities is one of the cornerstone concepts of modern physics.
2 In the original publication of Emmy Noether (“Invariante Variationsprobleme,” Nachr. Kgl. Ges.
Wiss. Göttingen, Math.-Phys. Kl. 1918, 235), the theorem was presented for continuous systems in
the Lagrangian formalism.
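$$\begin{aligned}
\delta u(q_\nu,p_\nu) &= \sum_\mu\left(\frac{\partial u}{\partial q_\mu}\,\delta q_\mu + \frac{\partial u}{\partial p_\mu}\,\delta p_\mu\right)\\
&= \epsilon\sum_\mu\left(\frac{\partial u}{\partial q_\mu}\frac{\partial I}{\partial p_\mu} - \frac{\partial u}{\partial p_\mu}\frac{\partial I}{\partial q_\mu}\right).
\end{aligned} \tag{21.138}$$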
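In the notation of extended Poisson brackets that was introduced in Example 21.13,
we may express the variation $\delta u$ concisely as
$$\delta u = \epsilon\,[u, I]^e. \tag{21.139}$$
As (21.139) applies for arbitrary differentiable functions $u = u(q_\nu,p_\nu)$, we may
equivalently write it as an operator equation,
$$\delta u = \epsilon\,\hat U u.$$
The operator $\hat U$ is thus given by
$$\hat U = [\,\cdot\,, I]^e = \sum_\mu\left(\frac{\partial I}{\partial p_\mu}\frac{\partial}{\partial q_\mu} - \frac{\partial I}{\partial q_\mu}\frac{\partial}{\partial p_\mu}\right). \tag{21.140}$$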
The dot in the Poisson bracket expression stands here as a placeholder for a function
the operator Û acts on. We refer to Û as the generator of the infinitesimal symmetry
transformation (21.135) that is associated with the invariant I . Obviously, I itself is
invariant under the symmetry transformation (21.135) which it generates
$$\delta I = \epsilon\,\hat U I = \epsilon\,[I, I]^e = 0. \tag{21.141}$$
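Two invariants $I_1$ and $I_2$ of a given Hamiltonian system $H$ then define the two
symmetry operators
$$\hat U_1 = [\,\cdot\,, I_1]^e, \qquad \hat U_2 = [\,\cdot\,, I_2]^e. \tag{21.142}$$
The concatenations of the operators $\hat U_1$ and $\hat U_2$ can then be expressed as nested
Poisson brackets. Skipping the superscript $e$, this reads
$$\hat U_1\hat U_2 = \hat U_1[\,\cdot\,, I_2] = \bigl[[\,\cdot\,, I_2], I_1\bigr], \qquad \hat U_2\hat U_1 = \hat U_2[\,\cdot\,, I_1] = \bigl[[\,\cdot\,, I_1], I_2\bigr]. \tag{21.143}$$
The commutator $\{\hat U_1, \hat U_2\}_-$ of the operators $\hat U_1$ and $\hat U_2$ then defines a generally
non-vanishing operator $\hat U_3$, which is represented in terms of extended Poisson brackets as
$$\begin{aligned}
\hat U_3 = \{\hat U_1, \hat U_2\}_- &= \hat U_1\hat U_2 - \hat U_2\hat U_1\\
&= \bigl[[\,\cdot\,, I_2], I_1\bigr] - \bigl[[\,\cdot\,, I_1], I_2\bigr] = -\bigl[I_1, [\,\cdot\,, I_2]\bigr] - \bigl[I_2, [I_1, \,\cdot\,]\bigr]\\
&= \bigl[\,\cdot\,, [I_2, I_1]\bigr]\\
&= [\,\cdot\,, I_3], \qquad\text{with } I_3 = [I_2, I_1].
\end{aligned} \tag{21.144}$$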
According to Poisson’s theorem from (19.32), the function $I_3 = [I_2, I_1]$ represents
another invariant of the respective Hamiltonian system. Consequently, the operator
$\hat U_3$ equally defines a symmetry transformation of the system. The commutator of two
generators of symmetry transformations thus provides another generator of a symmetry
transformation. The set of generators of symmetry transformations of a Hamiltonian
system thus forms, in conjunction with the commutator operation, a Lie–Poisson
algebra. For a given Hamiltonian system, we do not know a priori whether all its
invariants have been found, and hence whether the Lie–Poisson algebra of symmetry
operators is complete. Yet, by applying Poisson’s theorem to all pairs of known invariants,
it is always possible to find a subset of invariants $I_j$ that is closed with respect
to evaluating their mutual Poisson brackets. With respect to the set of generators of
symmetry transformations we thus find a sub-algebra of the pertaining symmetry
operators $\hat U_j$.
EXAMPLE
21.20 Infinitesimal Point Transformations, Conventional Noether Theorem
As we are dealing with an infinitesimal transformation, we may eliminate the original
coordinate $q_\mu$ from the generating function (21.133) of Example 21.19. Solving for $I$,
we rewrite this equation to first order in $\epsilon$ as
$$\begin{aligned}
\epsilon\,I(q_\nu,p_\nu) &= F_2(q_\nu,P_\nu) - \sum_{\mu=0}^n\left(Q_\mu - \epsilon\,\frac{\partial I}{\partial P_\mu}\right)P_\mu\\
&= F_2(q_\nu,P_\nu) - \sum_{\mu=0}^n Q_\mu P_\mu + \epsilon\sum_{\mu=0}^n p_\mu\frac{\partial I}{\partial p_\mu}\\
&= F_1(q_\nu,Q_\nu) + \epsilon\sum_{\mu=0}^n p_\mu\frac{\partial I}{\partial p_\mu}.
\end{aligned}$$
In the last step, we have replaced the generating function of type $F_2(q_\nu,P_\nu)$ by an
equivalent function of type $F_1(q_\nu,Q_\nu)$ according to (21.63). In our case of an
infinitesimal transformation, the generating function $F_1$ may alternatively be expressed to
first order in $\epsilon$ as
$$F_1(q_\nu,Q_\nu) = F_1(q_\nu,\delta q_\nu) = \epsilon\,f(q_\nu).$$
The function $I(q_\nu,p_\nu)$ may thus be expressed as
$$I(q_\nu,p_\nu) = \sum_{\mu=0}^n p_\mu\frac{\partial I}{\partial p_\mu} + f(q_\nu). \tag{21.145}$$
This equation is obviously fulfilled for functions $I(q_\nu,p_\nu)$ that are linear in the $p_\nu$
$$I(q_\nu,p_\nu) = -\sum_{\mu=0}^n p_\mu\,\eta_\mu(q_\nu) + f(q_\nu). \tag{21.146}$$
The functions $\eta_\mu(q_\nu)$ in (21.146) are defined to only depend on the extended set of
canonical variables $q_\nu$, hence on the configuration space variables $q_j$ and time $t$ in the
conventional description. With this $I(q_\nu,p_\nu)$, the generating function (21.133) from
Example 21.19 defines the extended point transformation
$$Q_\mu = q_\mu - \epsilon\,\eta_\mu(q_\nu), \qquad \mu = 0,\ldots,n. \tag{21.147}$$
We may write (21.146) equivalently in the conventional description by replacing the
canonical variable $p_0 \equiv -e/c$ according to (21.20) with the Hamiltonian $H(q_j,p_j,t)$.
Furthermore, the system evolution parameter $s$ must be replaced with the physical time
$t$ as the independent variable. With $\xi \equiv \eta_0/c$, the special Noether invariant (21.146)
then reads
$$I = \xi(q_j,t)\,H - \sum_{i=1}^n p_i\,\eta_i(q_j,t) + f(q_j,t). \tag{21.148}$$
This defines the conventional Noether invariant of point mechanics. If we can find
functions $\xi(q_j,t)$, $\eta_i(q_j,t)$, and $f(q_j,t)$ for a given Hamiltonian $H(q_j,p_j,t)$ that
satisfy $dI/dt = 0$, then $I$ from (21.148) constitutes a constant of motion. The
Hamiltonian form of Noether’s theorem presented here then states that the corresponding
symmetry of the given Hamiltonian system is given by an extended canonical point
transformation that is determined by those functions $\xi(q_j,t)$ and $\eta_i(q_j,t)$
$$T = t - \epsilon\,\xi(q_j,t), \qquad Q_i = q_i - \epsilon\,\eta_i(q_j,t). \tag{21.149}$$
As a result of the fact that the Noether theorem from (21.148) represents only a special
case, not all invariants of a given system can be expressed by (21.148) in general. As
we see from the symmetry transformations (21.149) that are associated with invariants
of the form (21.148), only those symmetries that are represented by extended point
transformations are covered by the conventional Noether theorem.
Invariants that are not of the form of (21.148) are commonly referred to in liter-
ature as “non-Noether invariants,” an example of which we will encounter in Exam-
ple 21.21. The symmetry transformations associated with “non-Noether invariants”
do not constitute extended point transformations, and hence do not emerge straight-
forwardly in the context of the Lagrangian formalism.
EXAMPLE
21.21 Runge–Lenz Vector of the Plane Kepler System as a Generalized Noether

Invariant
The classical Kepler system is an example of a two-body problem whose masses
interact according to an inverse square force law. In its plane version, the Hamiltonian
of this system reads in scaled coordinates (see also Example 19.9)
$$H = \frac12\,p_1^2 + \frac12\,p_2^2 - \frac{K}{\sqrt{q_1^2 + q_2^2}}. \tag{21.150}$$
The canonical equations follow as
$$\dot q_i = p_i, \qquad \dot p_i = -K\,\frac{q_i}{\sqrt{(q_1^2 + q_2^2)^3}}, \qquad i = 1, 2. \tag{21.151}$$
As the characteristic function $I = I(q_1,q_2,p_1,p_2,t)$ that is contained in the
generating function for an infinitesimal canonical transformation (21.133) from
Example 21.19, we define
$$I = -p_1p_2\,\eta_1(q_1,q_2,t) - p_2^2\,\eta_2(q_1,q_2,t) + f(q_1,q_2,t). \tag{21.152}$$
The as yet unknown coefficients $\eta_1(q_1,q_2,t)$, $\eta_2(q_1,q_2,t)$, and $f(q_1,q_2,t)$ contained
herein must now be determined accordingly to render $I$ a constant of motion. With the
physical time $t$ the system’s independent variable, the condition therefore reads
$$\frac{d}{dt}\left[-p_1p_2\,\eta_1(q_1,q_2,t) - p_2^2\,\eta_2(q_1,q_2,t) + f(q_1,q_2,t)\right] = 0. \tag{21.153}$$
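Inserting the canonical equations, we then find
$$\begin{aligned}
&\frac{K}{\sqrt{(q_1^2+q_2^2)^3}}\left(q_1p_2\,\eta_1 + q_2p_1\,\eta_1 + 2q_2p_2\,\eta_2\right) - p_1p_2\left(\frac{\partial\eta_1}{\partial t} + p_1\frac{\partial\eta_1}{\partial q_1} + p_2\frac{\partial\eta_1}{\partial q_2}\right)\\
&\quad - p_2^2\left(\frac{\partial\eta_2}{\partial t} + p_1\frac{\partial\eta_2}{\partial q_1} + p_2\frac{\partial\eta_2}{\partial q_2}\right) + \frac{\partial f}{\partial t} + p_1\frac{\partial f}{\partial q_1} + p_2\frac{\partial f}{\partial q_2} = 0.
\end{aligned} \tag{21.154}$$
As this polynomial equation in the $p_j$ must be satisfied not only at one instant of time $t_0$ but
for all times $t$, we conclude that the coefficients of each power $p_1^n p_2^m$, $n, m = 0,\ldots,3$
must vanish separately. We thus obtain here eight separate conditions,
$$\begin{aligned}
&\frac{\partial\eta_1}{\partial q_1} = 0, \quad \frac{\partial\eta_2}{\partial q_2} = 0, \quad \frac{\partial\eta_1}{\partial q_2} + \frac{\partial\eta_2}{\partial q_1} = 0, \quad \frac{\partial\eta_1}{\partial t} = 0, \quad \frac{\partial\eta_2}{\partial t} = 0, \quad \frac{\partial f}{\partial t} = 0,\\
&\frac{K}{\sqrt{(q_1^2+q_2^2)^3}}\,q_2\,\eta_1 + \frac{\partial f}{\partial q_1} = 0, \qquad \frac{K}{\sqrt{(q_1^2+q_2^2)^3}}\left(q_1\eta_1 + 2q_2\eta_2\right) + \frac{\partial f}{\partial q_2} = 0.
\end{aligned} \tag{21.155}$$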
From the conditions in the upper line, the following particular solutions emerge
$$\eta_1 = q_2, \qquad \eta_2 = -q_1, \qquad f = f(q_1,q_2). \tag{21.156}$$
We may now easily convince ourselves that the function
$$f(q_1,q_2) = -K\,\frac{q_1}{\sqrt{q_1^2 + q_2^2}} \tag{21.157}$$
is a solution of both conditions from the lower line. This shows that an invariant of
the form of (21.152) exists. Inserting the solutions $\eta_1$, $\eta_2$, and $f$ into (21.152) finally
yields
$$I_{\mathrm{RL1}} = -p_1p_2q_2 + p_2^2q_1 - K\,\frac{q_1}{\sqrt{q_1^2 + q_2^2}}. \tag{21.158}$$
This constant of motion represents one component of the Runge–Lenz vector, which is
referred to in literature as a “non-Noether invariant.” Due to its quadratic momentum
terms, the invariant (21.158) cannot be expressed as a conventional Noether invariant
of the form of (21.148) from Example 20.20. By systematically defining appropriate
polynomial functions $I(q_\nu,p_\nu)$ of the $p_\nu$, we may construct the invariants of
Hamiltonian systems from the solutions of systems of coupled partial differential equations.
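That (21.158) is a constant of motion can be confirmed pointwise by evaluating $dI_{\mathrm{RL1}}/dt = \sum_i \bigl(\partial I/\partial q_i\,\dot q_i + \partial I/\partial p_i\,\dot p_i\bigr)$ with the canonical equations (21.151). The following sketch is an illustration only; the function names and the sample phase-space point are assumptions introduced here.

```python
import math

def runge_lenz(q1, q2, p1, p2, K=1.0):
    """Invariant (21.158) of the plane Kepler system."""
    return -p1 * p2 * q2 + p2 * p2 * q1 - K * q1 / math.hypot(q1, q2)

def gradient(q1, q2, p1, p2, K=1.0):
    """Analytic partial derivatives (dI/dq1, dI/dq2, dI/dp1, dI/dp2)."""
    r3 = math.hypot(q1, q2) ** 3
    return (p2 * p2 - K * q2 * q2 / r3,
            -p1 * p2 + K * q1 * q2 / r3,
            -p2 * q2,
            2.0 * p2 * q1 - p1 * q2)

def time_derivative(q1, q2, p1, p2, K=1.0):
    """dI/dt along the flow of the canonical equations (21.151)."""
    r3 = math.hypot(q1, q2) ** 3
    dI_dq1, dI_dq2, dI_dp1, dI_dp2 = gradient(q1, q2, p1, p2, K)
    q1dot, q2dot = p1, p2
    p1dot, p2dot = -K * q1 / r3, -K * q2 / r3
    return dI_dq1 * q1dot + dI_dq2 * q2dot + dI_dp1 * p1dot + dI_dp2 * p2dot

def numeric_gradient(q1, q2, p1, p2, K=1.0, h=1e-5):
    """Central-difference gradient, used only to cross-check `gradient`."""
    x = [q1, q2, p1, p2]
    out = []
    for i in range(4):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        out.append((runge_lenz(*up, K) - runge_lenz(*down, K)) / (2 * h))
    return tuple(out)

POINT = (0.7, -1.1, 0.4, 0.9)   # arbitrary sample point (q1, q2, p1, p2)
RATE = time_derivative(*POINT)
```

The four contributions cancel identically, so `RATE` vanishes up to rounding at any point away from the origin, without integrating an orbit.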
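The symmetry transformation that is associated with the invariant $I_{\mathrm{RL1}}$ follows
from the rules (21.135) of Example 21.19
$$\begin{aligned}
\delta q_1 &= -\epsilon\,p_2q_2, & \delta p_1 &= -\epsilon\left(p_2^2 - K\,\frac{q_2^2}{\sqrt{(q_1^2+q_2^2)^3}}\right),\\
\delta q_2 &= \epsilon\,(2p_2q_1 - p_1q_2), & \delta p_2 &= \epsilon\left(p_1p_2 - K\,\frac{q_1q_2}{\sqrt{(q_1^2+q_2^2)^3}}\right).
\end{aligned}$$
As the variations of the spatial coordinates depend on the canonical momenta,
this transformation is not a point transformation, and thus represents a generalized
Noether symmetry. We finally get as the generator $\hat U_{\mathrm{RL1}} = [\,\cdot\,, I_{\mathrm{RL1}}]$ of the pertaining
symmetry transformation of the system
$$\hat U_{\mathrm{RL1}} = -p_2q_2\,\frac{\partial}{\partial q_1} + \left(2p_2q_1 - p_1q_2\right)\frac{\partial}{\partial q_2} - \left(p_2^2 - K\,\frac{q_2^2}{\sqrt{(q_1^2+q_2^2)^3}}\right)\frac{\partial}{\partial p_1} + \left(p_1p_2 - K\,\frac{q_1q_2}{\sqrt{(q_1^2+q_2^2)^3}}\right)\frac{\partial}{\partial p_2}.$$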
22 Extended Hamilton–Jacobi Equation
In the context of the extended canonical transformation theory, we may derive an
extended version of the Hamilton–Jacobi equation. We are looking for a generating
function $F_2(q_\nu,P_\nu)$ of an extended canonical transformation that maps a given
extended Hamiltonian $H_1 = 0$ into a transformed extended Hamiltonian that vanishes
identically, $H_1' \equiv 0$, in the sense that all partial derivatives of $H_1'(Q_\nu,P_\nu)$ vanish.
Then, according to the extended set of canonical equations (21.23), the derivatives of
all canonical variables $Q_\mu(s)$, $P_\mu(s)$ with respect to the system’s evolution parameter
$s$ must vanish as well
$$\frac{\partial H_1'}{\partial P_\mu} = 0 = \frac{dQ_\mu}{ds}, \qquad -\frac{\partial H_1'}{\partial Q_\mu} = 0 = \frac{dP_\mu}{ds}, \qquad \mu = 0,\ldots,n. \tag{22.1}$$
This means that all transformed canonical variables $Q_\mu$, $P_\mu$ must be constants of
motion. Writing the variables for the index $\mu = 0$ separately, we thus have
$$T = \alpha_0 = \text{const.}, \qquad Q_i = \alpha_i = \text{const.}, \qquad E = -\beta_0 = \text{const.}, \qquad P_i = \beta_i = \text{const.}$$
Thus, corresponding to the conventional Hamilton–Jacobi formalism, the transformed
canonical variables, $Q_j$ and $P_j$, are constants. Yet, in the extended formalism, the
transformed time $T$ is also a constant. The particular generating function $F_2(q_\nu,P_\nu)$
that defines transformation rules for the extended set of canonical variables such that
(22.1) holds for the transformed variables thus defines a mapping of the entire system
into its state at a fixed instant of time, hence (up to trivial shifts in the origin of the
time scale) into its initial state at $T = t(0)$
$$T = t(0), \qquad Q_i = q_i(0), \qquad P_i = p_i(0), \qquad E = H\bigl(q_j(0), p_j(0), t(0)\bigr).$$
We may refer to this particular generating function as the extended action function
$F_2 \equiv S_1(q_\nu,P_\nu)$. According to the transformation rule $H_1' = H_1$ for extended
Hamiltonians from (21.59), we obtain the transformed extended Hamiltonian $H_1' \equiv 0$
simply by expressing the original extended Hamiltonian $H_1 = 0$ in terms of the
transformed variables. This means for the conventional Hamiltonian $H(q_j,p_j,t)$
according to (21.25) in conjunction with the transformation rules from (21.65),
$$\left[H\!\left(q_j, \frac{\partial S_1}{\partial q_j}, t\right) + \frac{\partial S_1}{\partial t}\right]\frac{dt}{ds} = 0.$$
As we have $dt/ds \neq 0$ in general, we finally get the generalized form of the
Hamilton–Jacobi equation,
$$H\!\left(q_1,\ldots,q_n, \frac{\partial S_1}{\partial q_1},\ldots,\frac{\partial S_1}{\partial q_n}, t\right) + \frac{\partial S_1}{\partial t} = 0. \tag{22.2}$$
Equation (22.2) has exactly the form of the conventional Hamilton–Jacobi equation.
Yet, it is actually a generalization as the extended action function S1 represents an
extended generating function of type F2 , as defined by (21.63). This means that S1 is
also a function of the (constant) transformed energy E = −cP0 (0) = −β0 .
Summarizing, the extended Hamilton–Jacobi equation may be interpreted as defin-
ing the mapping of all canonical coordinates qj , pj , t , and e of the actual system into
constants Qj , Pj , T , and E. In other words, it defines the mapping of the entire dy-
namical system from its actual state at time t into its state at a fixed instant of time, T ,
which could be the initial conditions.
EXAMPLE
22.1 Time Dependent Harmonic Oscillator
As a simple example for the method to analyze an explicitly time-dependent
Hamiltonian system by means of the generalized Hamilton–Jacobi equation we choose the
time-dependent harmonic oscillator
$$H(q_j,p_j,t) = \frac12\sum_{i=1}^n p_i^2 + \frac12\,\omega^2(t)\sum_{i=1}^n q_i^2. \tag{22.3}$$
With $S_1 = S_1(q_1,\ldots,q_n,t,P_1,\ldots,P_n,E)$ an extended action function, we encounter
the following Hamilton–Jacobi equation (22.2) for this system
$$\frac{\partial S_1}{\partial t} + \frac12\sum_i\left(\frac{\partial S_1}{\partial q_i}\right)^2 + \frac12\,\omega^2(t)\sum_i q_i^2 = 0. \tag{22.4}$$
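The problem is now to find a solution $S_1$ for this nonlinear partial differential
equation. We start with the generating function of the extended canonical transformation from
Example 21.16, and restrict ourselves for simplicity to the case of zero damping,
($F(t) \equiv 0$),
$$S_1(q_j,t,P_j,E) = \frac{1}{\sqrt{\xi(t)}}\sum_i q_i P_i + \frac14\,\frac{\dot\xi(t)}{\xi(t)}\sum_i q_i^2 - E\int_0^t\frac{d\tau}{\xi(\tau)}. \tag{22.5}$$
In the first instance, we require only the transformed energy $E \equiv -cP_0(0) \equiv -\beta_0$
in (22.5) to represent a constant. We insert the partial derivatives of $S_1$ with respect to
$P_i$, $q_i$, and time $t$
$$\begin{aligned}
Q_i &= \frac{\partial S_1}{\partial P_i} = \frac{q_i}{\sqrt{\xi}}\\
p_i &= \frac{\partial S_1}{\partial q_i} = \frac{P_i}{\sqrt{\xi}} + \frac12\,\frac{\dot\xi}{\xi}\,q_i\\
-e &= \frac{\partial S_1}{\partial t} = -\frac12\,\frac{\dot\xi}{\sqrt{\xi^3}}\sum_i q_i P_i + \frac14\left(\frac{\ddot\xi}{\xi} - \frac{\dot\xi^2}{\xi^2}\right)\sum_i q_i^2 - \frac{E}{\xi}
\end{aligned} \tag{22.6}$$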
into the Hamilton–Jacobi equation (22.4). Expressed in terms of the transformed
coordinates, we find
$$E - \frac12\sum_i P_i^2 - \frac12\,\Omega^2\sum_i Q_i^2 = 0. \tag{22.7}$$
In this equation, the sum of all terms proportional to $\sum_i Q_i^2$ was denoted by $\Omega^2$,
$$\Omega^2(t) = \frac12\,\xi\ddot\xi - \frac14\,\dot\xi^2 + \xi^2\omega^2(t).$$
For the required constant transformed energy $E$, (22.7) can only be satisfied for all $t$
if $\Omega^2$ itself constitutes a constant of motion
$$\frac12\,\xi\ddot\xi - \frac14\,\dot\xi^2 + \xi^2\omega^2(t) = \Omega^2 = \text{const.} \tag{22.8}$$
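As a concrete illustration of (22.8), consider the special case of a constant frequency $\omega$. The trial function $\xi(t) = A + B\cos(2\omega t)$ with $A > |B|$ is an assumption introduced here for the example, not taken from the text; inserting it into the left-hand side of (22.8) gives the constant $\omega^2(A^2 - B^2)$. The sketch below evaluates the left-hand side at several times to confirm this.

```python
import math

def omega_squared(t, A, B, w):
    """Left-hand side of (22.8) for xi(t) = A + B cos(2 w t) and constant w."""
    xi = A + B * math.cos(2 * w * t)
    xi_dot = -2 * w * B * math.sin(2 * w * t)
    xi_ddot = -4 * w * w * B * math.cos(2 * w * t)
    return 0.5 * xi * xi_ddot - 0.25 * xi_dot ** 2 + xi ** 2 * w ** 2

A, B, W = 2.0, 0.5, 1.3                    # illustrative parameters with A > |B|
EXPECTED = W ** 2 * (A ** 2 - B ** 2)      # w^2 (A^2 - B^2)
SAMPLES = [omega_squared(t, A, B, W) for t in (0.0, 0.3, 1.0, 2.7, 10.0)]
```

The simplest member of this family, $B = 0$, is the constant $\xi = A$ with $\Omega = A\,\omega$, for which the time transformation reduces to a mere rescaling $T = t/A$.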
With $\xi(t)$ a solution of (22.8), the transformation rules for the coordinates are now
uniquely determined
$$Q_i = \frac{q_i}{\sqrt{\xi}}, \qquad P_i = \sqrt{\xi}\,p_i - \frac12\,\frac{\dot\xi}{\sqrt{\xi}}\,q_i. \tag{22.9}$$
As the action function $S_1 = S_1(q_j,t,P_j,E)$ constitutes an extended generating
function, the corresponding transformation maps the original time $t$ into a new time $T$, in
conjunction with the usual mapping of the canonical coordinates, $Q_j(T)$ and $P_j(T)$.
The correlation of new time $T$ and original time $t$ is determined by the solution $\xi(t)$
of (22.8)
$$T(t) = -\frac{\partial S_1}{\partial E} = \int_0^t\frac{d\tau}{\xi(\tau)}. \tag{22.10}$$
The transformed Hamiltonian $H'(Q_j,P_j,T)$ follows now from the general
transformation rule (21.62) for extended generating functions $F$ with $\partial T/\partial t = 1/\xi$,
$$H' - E = \xi(t)\,(H - e) \tag{22.11}$$
which reads explicitly in the new coordinates $Q_i$, $P_i$
$$H'(Q_j,P_j) = \frac12\sum_{i=1}^n P_i^2 + \frac12\,\Omega^2\sum_{i=1}^n Q_i^2. \tag{22.12}$$
This Hamiltonian H does not explicitly depend on T if and only if ξ(t) represents a
solution of (22.8) with constant 2 . In this case, H corresponds to the Hamiltonian
of the ordinary (time-independent) harmonic oscillator. As required, the value of H ,
given by E ≡ −β0 , is now a constant of motion. In contrast, the canonical coordinates
(Qi , Pi ) of this system are not constant. The given task to map the time-dependent
harmonic oscillator (22.3) into a system where all transformed canonical coordinates
αμ , βμ depict constants of motion is thus as yet performed for β0 only. However,
we may now in a second step transform the Hamiltonian (22.12) into a new Hamil-
tonian that vanishes identically, H ≡ 0. To this end, we first set up the corresponding
Hamilton–Jacobi equation,
$$
\frac{\partial S}{\partial T} + \frac{1}{2}\sum_i \left(\frac{\partial S}{\partial Q_i}\right)^2 + \frac{1}{2}\,\Omega^2 \sum_i Q_i^2 = 0. \tag{22.13}
$$
We now try to find a solution S = S(Qj , T , βj ) that depends on the constant trans-
formed momenta βj = const. We may here restrict ourselves to a conventional action
function S as the Hamiltonian (22.12) does not explicitly depend on the independent
variable, i.e. the transformed time T . Due to the quadratic dependence of the Hamil-
tonian (22.12) on the Qi and the Pi , we try to determine the solution S(Qj , T , βj )
of (22.13) by defining
$$
S = \frac{1}{2}\,a(T)\sum_i Q_i^2 + b(T)\sum_i Q_i\beta_i + \frac{1}{2}\,c(T)\sum_i \beta_i^2. \tag{22.14}
$$
Inserting the partial derivatives
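$$
\frac{\partial S}{\partial Q_i} = aQ_i + b\beta_i, \qquad
\frac{\partial S}{\partial T} = \frac{1}{2}\,\dot{a}\sum_i Q_i^2 + \dot{b}\sum_i Q_i\beta_i + \frac{1}{2}\,\dot{c}\sum_i \beta_i^2 \tag{22.15}
$$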
into the actual Hamilton–Jacobi equation (22.13) yields
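$$
\frac{1}{2}\left(\dot{a} + a^2 + \Omega^2\right)\sum_i Q_i^2 + \left(\dot{b} + ab\right)\sum_i Q_i\beta_i + \frac{1}{2}\left(\dot{c} + b^2\right)\sum_i \beta_i^2 = 0. \tag{22.16}
$$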
This equation can only be satisfied for arbitrary $Q_i$ at all times $T$ if the coefficients
in (22.16) vanish separately,
$$
\dot{a} + a^2 + \Omega^2 = 0, \qquad \dot{b} + ab = 0, \qquad \dot{c} + b^2 = 0. \tag{22.17}
$$
Starting with the leftmost non-linear first-order differential equation for a(T ), we may
solve this coupled set step-by-step
$$
a(T) = -\Omega\tan\Omega T, \qquad b(T) = \frac{1}{\cos\Omega T}, \qquad c(T) = -\frac{\tan\Omega T}{\Omega}. \tag{22.18}
$$
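The step-by-step solution can be checked numerically. The following minimal sketch inserts the three functions of (22.18) into the left-hand sides of (22.17), using central differences for the derivatives with respect to $T$. The value chosen for the constant Ω and the helper names `ddT` and `residuals` are assumptions made for illustration only.

```python
import math

OMEGA = 1.3  # assumed sample value of the constant Omega (any nonzero value works)

def a(T):
    return -OMEGA * math.tan(OMEGA * T)

def b(T):
    return 1.0 / math.cos(OMEGA * T)

def c(T):
    return -math.tan(OMEGA * T) / OMEGA

def ddT(f, T, h=1e-5):
    """Central-difference approximation of df/dT."""
    return (f(T + h) - f(T - h)) / (2.0 * h)

def residuals(T):
    """Left-hand sides of the three equations (22.17); each should be close to zero."""
    return (
        ddT(a, T) + a(T) ** 2 + OMEGA ** 2,
        ddT(b, T) + a(T) * b(T),
        ddT(c, T) + b(T) ** 2,
    )

# Sample times chosen so that OMEGA * T stays below pi/2 (no pole of the tangent).
for T in (0.0, 0.3, 0.9):
    print(T, residuals(T))
```

The residuals vanish only up to the discretization error of the finite differences, so they should be compared with a small tolerance rather than with zero.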
We thus find the following action function S(Qj , T , βj ) as the solution of the
Hamilton–Jacobi equation (22.13)
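$$
S = -\frac{1}{2}\,\Omega\tan\Omega T \sum_i Q_i^2 + \frac{1}{\cos\Omega T}\sum_i Q_i\beta_i - \frac{1}{2}\,\frac{\tan\Omega T}{\Omega}\sum_i \beta_i^2. \tag{22.19}
$$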
By means of this action function, we may now work out the relation of the coordinates
$Q_i(T)$ and $P_i(T)$ with the integration constants $\alpha_i$ and $\beta_i$. We thereby show that
the action function $S \equiv F_2$ represents exactly the generating function of a canonical
transformation that maps the Hamiltonian $H'$ from (22.12) into a new Hamiltonian
$H'' \equiv 0$ that vanishes identically. According to the general rules from (19.12), we
obtain for (22.19) the particular transformation rules
$$
\begin{aligned}
\alpha_j &= \frac{\partial S}{\partial \beta_j} = \frac{Q_j}{\cos\Omega T} - \beta_j\,\frac{\tan\Omega T}{\Omega}, \\
P_j &= \frac{\partial S}{\partial Q_j} = -\Omega\, Q_j \tan\Omega T + \frac{\beta_j}{\cos\Omega T}, \\
H'' &= H' + \frac{\partial S}{\partial T} = H' - \frac{1}{2}\,\frac{\Omega^2}{\cos^2\Omega T}\sum_i Q_i^2 + \frac{\Omega\sin\Omega T}{\cos^2\Omega T}\sum_i Q_i\beta_i - \frac{1}{2}\,\frac{1}{\cos^2\Omega T}\sum_i \beta_i^2.
\end{aligned}
\tag{22.20}
$$
Solving for the coordinates Qj and Pj this finally yields
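$$
\begin{aligned}
Q_j &= \alpha_j \cos\Omega T + \beta_j\,\frac{\sin\Omega T}{\Omega}, \\
P_j &= -\alpha_j\,\Omega \sin\Omega T + \beta_j \cos\Omega T, \\
H'' &= H' - \frac{1}{2}\sum_i P_i^2 - \frac{1}{2}\,\Omega^2 \sum_i Q_i^2 \equiv 0.
\end{aligned}
\tag{22.21}
$$
With $H'' \equiv 0$ our task has been accomplished. The representation of the $Q_j$ and the $P_j$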
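as functions of the integration constants $\alpha_j$ and $\beta_j$ means that our system is
completely solved. We may, furthermore, merge the coordinate transformations of the first
and second step,
$$
\begin{pmatrix} q_j \\ p_j \end{pmatrix}
=
\begin{pmatrix} \sqrt{\xi(t)} & 0 \\[4pt] \dfrac{1}{2}\dfrac{\dot{\xi}(t)}{\sqrt{\xi(t)}} & \dfrac{1}{\sqrt{\xi(t)}} \end{pmatrix}
\begin{pmatrix} \cos\Omega T & \dfrac{\sin\Omega T}{\Omega} \\[4pt] -\Omega\sin\Omega T & \cos\Omega T \end{pmatrix}
\begin{pmatrix} \alpha_j \\ \beta_j \end{pmatrix},
\tag{22.22}
$$
wherein $0 \neq \Omega = \text{const.}$ and $\xi(t)$ denotes a solution of
$$
\frac{1}{2}\,\xi\ddot{\xi} - \frac{1}{4}\,\dot{\xi}^2 + \xi^2\omega^2(t) = \Omega^2. \tag{22.23}
$$
The transformed time $T$ then follows from
$$
T = \int_0^t \frac{d\tau}{\xi(\tau)}. \tag{22.24}
$$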
With the particular initial conditions ξ(0) = 1 and ξ̇ (0) = 0, the constants αj , βj ob-
viously represent the values of the coordinates qj , pj at the instant of time t = 0.
Both, the extended action function S1 (qj , t, Pj , β0 ) from (22.5), and the con-
ventional action function S(Qj , T , βj ) from (22.19) define canonical transforma-
tions. The concatenation of both transformations then establishes again a canonical
transformation. In principle, it is thus possible to find an extended action function
S1 (qj , t, βj , β0 ) ≡ S1 (qμ , βμ ) that yields the solution of (22.4) with all βμ constants,
which means to solve the problem in one step.
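The merged transformation (22.22) lends itself to a numerical cross-check. The sketch below integrates a single oscillator $\ddot{q} + \omega^2(t)\,q = 0$ directly and, in parallel, the auxiliary equation (22.23) and the time integral (22.24), and then compares the direct result with the product formula. It assumes that (22.3) has the unit-mass form $H = \frac{1}{2}p^2 + \frac{1}{2}\omega^2(t)\,q^2$; the frequency law `omega2`, the value of Ω, the Runge-Kutta helper, and all function names are illustrative assumptions.

```python
import math

OMEGA = 1.0  # assumed value of the constant Omega in (22.23)

def omega2(t):
    """Assumed sample frequency law omega^2(t); any smooth positive choice would do."""
    return 1.0 + 0.5 * math.sin(t)

def rhs(t, y):
    """Right-hand side for the state y = (q, p, xi, xi_dot, T)."""
    q, p, xi, xi_dot, _T = y
    w2 = omega2(t)
    # (22.23) solved for the second derivative of xi
    xi_ddot = (2.0 * OMEGA ** 2 + 0.5 * xi_dot ** 2 - 2.0 * xi ** 2 * w2) / xi
    # last entry: dT/dt = 1/xi, cf. (22.24)
    return [p, -w2 * q, xi_dot, xi_ddot, 1.0 / xi]

def rk4_step(t, y, h):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, factor):
        return [yi + factor * ki for yi, ki in zip(y, k)]
    k1 = rhs(t, y)
    k2 = rhs(t + 0.5 * h, shifted(k1, 0.5 * h))
    k3 = rhs(t + 0.5 * h, shifted(k2, 0.5 * h))
    k4 = rhs(t + h, shifted(k3, h))
    return [yi + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def compare(alpha, beta, t_end=5.0, steps=5000):
    """Return (q, p) from direct integration and from the product formula (22.22)."""
    h = t_end / steps
    y = [alpha, beta, 1.0, 0.0, 0.0]  # xi(0) = 1, xi_dot(0) = 0, T(0) = 0
    for n in range(steps):
        y = rk4_step(n * h, y, h)
    q, p, xi, xi_dot, T = y
    # second step (22.21): ordinary oscillator in the transformed time T
    Q = alpha * math.cos(OMEGA * T) + beta * math.sin(OMEGA * T) / OMEGA
    P = -alpha * OMEGA * math.sin(OMEGA * T) + beta * math.cos(OMEGA * T)
    # first step inverted, cf. (22.9)
    q_formula = math.sqrt(xi) * Q
    p_formula = 0.5 * xi_dot / math.sqrt(xi) * Q + P / math.sqrt(xi)
    return (q, p), (q_formula, p_formula)

direct, formula = compare(alpha=0.7, beta=-0.4)
print("direct :", direct)
print("formula:", formula)
```

With the initial conditions ξ(0) = 1 and ξ̇(0) = 0 used here, `alpha` and `beta` are simply the initial values of q and p, so both results should agree up to the integration error.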
Part VII
Nonlinear Dynamics
The treatment of mechanics in these lectures would not be complete if we did not deal,
at least in brief, with a topic that has recently attracted much attention: nonlinear
dynamics and, as a special topic within it, the “theory of chaos.”
The starting point is the observation that ordered and regular motions like those
occurring in the harmonic oscillator, the pendulum, or the Kepler problem of planetary
motion are more an exception in nature than the standard case. One frequently
encounters erratic phenomena and phenomena that are unpredictable in their details.
A particularly striking example is the occurrence of turbulence in the flow of liquids.
Toward the end of the nineteenth century, the “father of nonlinear dynamics,” Henri
Poincaré,¹ pointed out for the first time that irregular behavior in mechanics is not
at all an unusual feature if the system being studied involves a nonlinear interaction.
Closely related is the (at first sight astonishing) insight that even very simple systems
may exhibit highly complex dynamics. A simple deterministic differential equation
involving nonlinearities may have solutions whose behavior over longer time
periods evolves quite irregularly and practically cannot be predicted. This is one of the
characteristic features of chaotic systems. The meaning of this concept, which may be
precisely defined in the framework of nonlinear dynamics, extends far beyond mechanics,
since the phenomenon of chaos arises in many fields not only of physics but also of
chemistry, biology, etc.
In the following sections, we shall learn quite a lot about general properties of
nonlinear dynamic systems. The time dependence and stability of their solutions will
be discussed, and concepts like attractors, bifurcations, and chaos will be introduced.
However, a detailed treatment of nonlinear dynamics, its manifold problems, and
interdisciplinary applications exceeds the scope of this book.² In particular, we cannot
deal in more detail with the important topic of chaos in Hamiltonian systems.

¹ Jules-Henri Poincaré, French mathematician and physicist, b. April 29, 1854, Nancy–d. July 17,
1912, Paris. Poincaré studied at the École Polytechnique and the École des Mines and was a student of
Ch. Hermite. Soon after he received his doctorate, he obtained a chair at the Sorbonne in 1881, which
he held until his death. In pure mathematics, he became famous as the founder of algebraic topology
and of the theory of analytic functions of several complex variables. Further essential fields of work
were algebraic geometry and number theory. But Poincaré also dealt with applications of mathematics
to numerous physical problems, e.g., in optics, electrodynamics, telegraphy, and thermodynamics.
Together with Einstein and Lorentz, he founded the special theory of relativity. Poincaré’s work on
celestial mechanics, in particular, on the three-body problem, culminated in a monograph in three
volumes (1892–1899). In this context, he was the first to discover the appearance of chaotic
orbits in planetary motion. Poincaré has been called “the last universalist in mathematics” because of
the unusually broad scope of his interests.

² Some textbooks from the very extensive literature on nonlinear dynamics:
H. Schuster, Deterministic Chaos, VCH Verlag (1989).
G. Faust, M. Haase and J. Argyris, Die Erforschung des Chaos, Vieweg (1995). (This was also
translated into English and published as G. Faust, M. Haase and J. Argyris, An Exploration of Chaos,
North-Holland (1994).)
H.-O. Peitgen, H. Jürgens and D. Saupe, Chaos and Fractals: New Frontiers of Science, Springer
(1992).
R.C. Hilborn, Chaos and Nonlinear Dynamics, Oxford University Press (1994).
G. Jetschke, Mathematik der Selbstorganisation, Deutscher Verlag der Wiss., Berlin (1989).
23 Dynamical Systems
A unified theoretical description may be given for many of the systems of interest.
A system is described by a finite set of dynamic variables that will be combined into
a column vector $\mathbf{x} = (x_1, \ldots, x_N)^T \in \mathbb{R}^N$. The state of the system at a given time $t$
is uniquely described by such a point $\mathbf{x}$ in phase space. The $x_i$ are generalized
coordinates that may represent a variety of quantities. Note that the vector $\mathbf{x}$ shall also
comprise the velocities (or momenta, respectively). We now assume that the system
behaves deterministically. Thus, the entire time evolution $\mathbf{x}(t)$ is determined if an
initial value $\mathbf{x}(t_0)$ is given. The time evolution shall be described by a differential
equation of first order with respect to time:
$$
\frac{d}{dt}\,\mathbf{x}(t) = \mathbf{F}\big(\mathbf{x}(t), t; \lambda\big). \tag{23.1}
$$
Here, F is in general a nonlinear function of the coordinates x (also called the ve-
locity field or vector field). Moreover, F may also still explicitly depend on the time
t, for example if varying external forces are acting on the system. If there is no such
dependence, the system is called autonomous. Finally, the third argument in (23.1)
shall indicate that possibly there exist one or several control parameters λ. These are
fixed given constants whose values affect the dynamics of the system and may pos-
sibly change the character of the dynamics. Typical control parameters are, e.g., the
coupling strength of an interaction, or the amplitude or frequency of an external per-
turbation imposed onto the system.
Note: A possible explicit time dependence in (23.1) may be eliminated by a simple
trick. For this purpose we consider a system with one additional degree of freedom,
$$
\tilde{\mathbf{x}} = (x_1, \ldots, x_N, x_{N+1})^T \in \mathbb{R}^{N+1},
$$
and postulate for the additional vector component the differential equation
$$
\frac{d}{dt}\,x_{N+1} = 1.
$$
With the initial condition $x_{N+1}(0) = 0$, this simply implies $x_{N+1}(t) = t$. Hence, the
time $t$ on the right-hand side of (23.1) may be replaced by $x_{N+1}$, and we are dealing with an
autonomous system with one additional dimension.
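The trick described in this note can be written down in a few lines. The following sketch wraps a non-autonomous field `F(x, t)` into an autonomous field `F_tilde` on the extended state and checks, with a simple explicit Euler scheme, that both formulations produce the same trajectory. The sample field $\dot{x} = -x + \sin t$, the Euler integrator, and all names are hypothetical choices made for illustration.

```python
import math

def F(x, t):
    """Hypothetical non-autonomous field for N = 1: dx/dt = -x + sin(t)."""
    return [-x[0] + math.sin(t)]

def F_tilde(x_ext):
    """Autonomous field on R^(N+1): the last component plays the role of time."""
    *x, x_time = x_ext
    return F(x, x_time) + [1.0]  # d x_{N+1} / dt = 1

def euler_autonomous(field, state, h, steps):
    """Explicit Euler integration of an autonomous system."""
    for _ in range(steps):
        state = [s + h * f for s, f in zip(state, field(state))]
    return state

def euler_explicit_time(x, h, steps):
    """Explicit Euler integration of the original system with a separate clock t."""
    t = 0.0
    for _ in range(steps):
        x = [xi + h * fi for xi, fi in zip(x, F(x, t))]
        t += h
    return x, t

x_auto = euler_autonomous(F_tilde, [1.0, 0.0], h=0.01, steps=300)  # x_{N+1}(0) = 0
x_expl, t_final = euler_explicit_time([1.0], h=0.01, steps=300)
print(x_auto, x_expl, t_final)
```

The extra component of `x_auto` reproduces the elapsed time, and the remaining component coincides with the result of the time-dependent formulation.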
The equation of motion (23.1) is very far-reaching, in spite of its simple shape. In
particular, it incorporates Hamiltonian mechanics as a special case: For a system
with $N$ degrees of freedom described by the generalized coordinates $q_1, \ldots, q_N$ and
the associated canonical momenta $p_1, \ldots, p_N$, the Hamiltonian equations of motion
(see Chap. 18) read as follows:
$$
\dot{q}_i = \frac{\partial H}{\partial p_i}, \qquad \dot{p}_i = -\frac{\partial H}{\partial q_i}. \tag{23.2}
$$
If the coordinates and momenta are combined according to
$\mathbf{x} = (q_1, \ldots, q_N; p_1, \ldots, p_N)^T$ to a $2N$-dimensional vector, then (23.2) may be written
as a combined matrix equation of the form (23.1):
$$
\frac{d}{dt}\,\mathbf{x} = \mathbf{J}\,\nabla_{\mathbf{x}} H. \tag{23.3}
$$
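$\nabla_{\mathbf{x}} H$ stands for the gradient vector of the Hamiltonian function,
$$
\nabla_{\mathbf{x}} H = \left(\frac{\partial H}{\partial q_1}, \ldots, \frac{\partial H}{\partial q_N}; \frac{\partial H}{\partial p_1}, \ldots, \frac{\partial H}{\partial p_N}\right)^T, \tag{23.4}
$$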
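and the $2N \times 2N$ matrix $\mathbf{J}$ provides both the permutation of the components and
the correct signs:
$$
\mathbf{J} = \begin{pmatrix} 0 & +I \\ -I & 0 \end{pmatrix}, \tag{23.5}
$$
where $I$ denotes the $N \times N$ unit matrix. By the way, $\mathbf{J}$ has the following useful
properties:
$$
\mathbf{J}^{-1} = \mathbf{J}^T = -\mathbf{J}, \qquad \mathbf{J}^2 = -I_{2N}, \qquad \det \mathbf{J} = 1, \tag{23.6}
$$
where $I_{2N}$ denotes the $2N \times 2N$ unit matrix.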
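The block structure of J and two of its properties can be verified directly for a small N. The sketch below builds J as nested lists, checks $J^T = -J$ and $J^2 = -I$, and applies J to the gradient of a sample Hamiltonian. The test Hamiltonian $H = \frac{1}{2}\sum_i (p_i^2 + q_i^2)$ and all helper names are assumptions chosen for illustration.

```python
def symplectic_J(N):
    """2N x 2N matrix J = [[0, +I], [-I, 0]] as a list of rows."""
    J = [[0] * (2 * N) for _ in range(2 * N)]
    for i in range(N):
        J[i][N + i] = 1
        J[N + i][i] = -1
    return J

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def grad_H(x):
    """Gradient of the assumed test Hamiltonian H = sum_i (p_i^2 + q_i^2) / 2, x = (q; p)."""
    return list(x)

N = 2
J = symplectic_J(N)
minus_J = [[-e for e in row] for row in J]
minus_identity = [[-1 if i == j else 0 for j in range(2 * N)] for i in range(2 * N)]

print(transpose(J) == minus_J)          # J^T = -J
print(matmul(J, J) == minus_identity)   # J^2 = -I (2N x 2N unit matrix)

x = [1.0, 2.0, 3.0, 4.0]                # (q1, q2, p1, p2)
print(matvec(J, grad_H(x)))             # velocity field (dq/dt; dp/dt) = (p; -q)
```

For this Hamiltonian, the product J ∇H returns the momenta in the upper half and the negative coordinates in the lower half, as required by (23.2).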
Moreover, dissipative systems may also be described by (23.1) by introducing
velocity-dependent friction terms; see, e.g., Exercise 23.2.
Obviously, the solutions of (23.1) may be highly varied. For a given starting
vector $\mathbf{x}(t = 0) = \mathbf{x}_0$, a trajectory $\mathbf{x}(t)$, also called an orbit, may be calculated
(which in nonlinear systems is, as a rule, not feasible analytically). Its mathematical
existence and uniqueness are guaranteed under very general conditions by the theory
of differential equations. Of particular interest is the asymptotic behavior of the
trajectory at large times: Does it reach a stationary state (a fixed point) or a periodic
vibration (a limit cycle), or does it behave irregularly?
The connection between $\mathbf{x}(t)$ and $\mathbf{x}_0$ is mathematically a mapping $\Phi_t : \mathbb{R}^N \to \mathbb{R}^N$,
namely,
$$
\Phi_t(\mathbf{x}_0) = \mathbf{x}(t). \tag{23.7}
$$
This mapping, which depends on the time $t$ as a parameter, is called the phase flow or
simply the flow of the vector field $\mathbf{F}(\mathbf{x})$. For $t = 0$, the flow obviously reduces to the
identity mapping
$$
\Phi_{t=0} = I. \tag{23.8}
$$
Furthermore, for autonomous (not explicitly time-dependent) systems, we have for
two time shifts performed in succession
$$
\Phi_{t_1} \circ \Phi_{t_2} = \Phi_{t_1 + t_2}. \tag{23.9}
$$
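The flow properties (23.8) and (23.9) can be illustrated with an autonomous system whose flow is known in closed form. The sketch below uses the one-dimensional equation $\dot{x} = x(1 - x)$, which is an assumed example not taken from the text; its solution with initial value $x_0$ is $x(t) = x_0 e^t / (1 - x_0 + x_0 e^t)$. The function name `flow` is hypothetical.

```python
import math

def flow(t, x0):
    """Closed-form flow of the assumed sample system dx/dt = x (1 - x)."""
    e = math.exp(t)
    return x0 * e / (1.0 - x0 + x0 * e)

x0, t1, t2 = 0.2, 0.7, 1.5

# (23.8): the flow at t = 0 is the identity mapping
print(flow(0.0, x0), x0)

# (23.9): two shifts in succession equal one shift by t1 + t2
print(flow(t1, flow(t2, x0)), flow(t1 + t2, x0))

# consequence: shifting back by -t undoes the shift by t
print(flow(-t1, flow(t1, x0)), x0)
```

The composition rule relies on the system being autonomous; for an explicitly time-dependent field the result of a shift would also depend on the starting time.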

For a thorough understanding of the dynamical system, it is not sufficient to inspect
individual trajectories. Of much more interest is the behavior of the ensemble of all
trajectories in phase space. This may be interpreted as a query for the global properties
of the mapping $\Phi_t$. Important questions are as follows: Can the flow be characterized
universally “on the large scale”? Are there regions with qualitatively distinct behavior
(ordered vs. disordered motion)? How does the flow vary with the value of a possibly
existing control parameter $\lambda$ (are there critical threshold values at which a new type of
behavior arises)?
The answer to these questions depends of course on the system being considered.
Nevertheless, one may find general criteria in the frame of nonlinear dynamics, and
it turns out that seemingly very distinct systems display amazing similarities in their
dynamics.
23.1 Dissipative Systems: Contraction of the Phase-Space Volume
Conservative systems are characterized by a volume-conserving dynamical flow.
Liouville’s theorem, proved in Chap. 18, states that the volume of a cell in the
$2N$-dimensional phase space $(q_1, \ldots, q_N; p_1, \ldots, p_N)$ does not vary with time if the
points contained therein are moving according to the Hamiltonian equations. In dissi-
pative systems, on the contrary, the cells in phase space are shrinking with time. We
shall now derive a quantitative measure of this phenomenon for a general autonomous
dynamical system, the trajectories of which obey the equation of motion
$$
\frac{d}{dt}\,\mathbf{x}(t) = \mathbf{F}\big(\mathbf{x}(t)\big) \tag{23.10}
$$
in an $N$-dimensional phase space. To this end, we consider a small volume element
$\Delta V(\mathbf{x})$ that at time $t = t_0$ shall be at the position $\mathbf{x} = \mathbf{x}_0$ and shall move with the flow.
In Cartesian coordinates, the volume is given by the product of the edge lengths,
$$
\Delta V(\mathbf{x}) = \prod_{i=1}^{N} \Delta x_i(\mathbf{x}). \tag{23.11}
$$
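The time derivative of this quantity is, according to the product rule, given by
$$
\begin{aligned}
\frac{d}{dt}\,\Delta V(\mathbf{x}) &= \sum_{i=1}^{N} \frac{d\Delta x_i(\mathbf{x})}{dt} \prod_{j \neq i}^{N} \Delta x_j(\mathbf{x}) \\
&= \underbrace{\prod_{j=1}^{N} \Delta x_j(\mathbf{x})}_{=\,\Delta V(\mathbf{x})} \; \sum_{i=1}^{N} \frac{1}{\Delta x_i(\mathbf{x})}\,\frac{d\Delta x_i(\mathbf{x})}{dt},
\end{aligned}
\tag{23.12}
$$
where each term has been extended by the factor $\Delta x_i/\Delta x_i$. Hence, the relative change
(= logarithmic time derivative) of the volume is
$$
\frac{1}{\Delta V(\mathbf{x})}\,\frac{d}{dt}\,\Delta V(\mathbf{x}) = \sum_{i=1}^{N} \frac{1}{\Delta x_i(\mathbf{x})}\,\frac{d\Delta x_i(\mathbf{x})}{dt}. \tag{23.13}
$$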
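The change of the edge lengths of the volume¹ may be calculated from the equation of
motion (23.10). Let us consider the distance between two edges of the cube along the
$i$-direction which are determined by the trajectories $\mathbf{x}_0(t)$ with $\mathbf{x}_0(t_0) = \mathbf{x}_0$ and $\mathbf{x}(t)$
with $\mathbf{x}(t_0) = \mathbf{x}_0 + \mathbf{e}_i \Delta x_i$:
$$
\begin{aligned}
\left.\frac{d\Delta x_i}{dt}\right|_{t_0} &= \left.\frac{d}{dt}\big(x_i(t) - x_{0i}(t)\big)\right|_{t_0} \\
&= F_i\big(\mathbf{x}(t_0)\big) - F_i\big(\mathbf{x}_0(t_0)\big) \\
&= F_i(\mathbf{x}_0 + \mathbf{e}_i \Delta x_i) - F_i(\mathbf{x}_0).
\end{aligned}
\tag{23.14}
$$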
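For small deviations $\Delta x_i$, the Taylor expansion of $\mathbf{F}(\mathbf{x})$ yields to first order
$$
\left.\frac{d\Delta x_i}{dt}\right|_{t_0} = \left.\frac{\partial F_i}{\partial x_i}\right|_{\mathbf{x}_0} \Delta x_i. \tag{23.15}
$$
At the point $t_0 = t$, $\mathbf{x}_0 = \mathbf{x}$, (23.13) thus yields
$$
\Lambda(\mathbf{x}) := \frac{1}{\Delta V(\mathbf{x})}\,\frac{d}{dt}\,\Delta V(\mathbf{x}) = \sum_{i=1}^{N} \frac{\partial F_i}{\partial x_i} = \nabla \cdot \mathbf{F}. \tag{23.16}
$$
The rate of change of the phase-space volume is therefore determined by the
divergence of the velocity field $\mathbf{F}$.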
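Liouville’s theorem is included in (23.16) as a special case. According to
(23.3)–(23.5), the velocity field of a Hamiltonian system with the coordinates
$\mathbf{x} = (q_1, \ldots, q_N; p_1, \ldots, p_N)^T$ reads
$$
\mathbf{F}(\mathbf{x}) = \left(\frac{\partial H}{\partial p_1}, \ldots, \frac{\partial H}{\partial p_N}; -\frac{\partial H}{\partial q_1}, \ldots, -\frac{\partial H}{\partial q_N}\right)^T. \tag{23.17}
$$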
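This leads to the volume change
$$
\begin{aligned}
\Lambda = \nabla \cdot \mathbf{F} &= \sum_{i=1}^{N} \frac{\partial}{\partial q_i} F_i + \sum_{i=1}^{N} \frac{\partial}{\partial p_i} F_{N+i} \\
&= \sum_{i=1}^{N} \frac{\partial}{\partial q_i}\frac{\partial H}{\partial p_i} - \sum_{i=1}^{N} \frac{\partial}{\partial p_i}\frac{\partial H}{\partial q_i} = 0,
\end{aligned}
\tag{23.18}
$$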
which confirms that conservative systems are volume conserving.
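The divergence criterion of (23.16) is easy to evaluate numerically for a given velocity field. The following sketch estimates ∇ · F by central differences and applies it to two assumed sample fields: a frictionless pendulum, which is Hamiltonian, and the same pendulum with a linear friction term. Both fields, the friction constant `ALPHA`, and the function names are illustrative assumptions.

```python
import math

def divergence(F, x, h=1e-5):
    """Central-difference estimate of sum_i dF_i/dx_i at the point x."""
    total = 0.0
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        total += (F(x_plus)[i] - F(x_minus)[i]) / (2.0 * h)
    return total

def pendulum(x):
    """Hamiltonian field for the assumed H = p^2/2 - cos(q), with x = (q, p)."""
    q, p = x
    return [p, -math.sin(q)]

ALPHA = 0.3  # assumed friction constant

def damped_pendulum(x):
    """Same pendulum with an additional friction force -ALPHA * p."""
    q, p = x
    return [p, -math.sin(q) - ALPHA * p]

point = [0.4, -1.2]
print(divergence(pendulum, point))         # close to 0: volume conserving
print(divergence(damped_pendulum, point))  # close to -ALPHA: contracting
```

In both cases the result does not depend on the chosen point, because the only contribution to the divergence comes from the linear friction term.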

If the flow in phase space is contracting, i.e., if $\Lambda = \nabla \cdot \mathbf{F} < 0$, the system is called
dissipative. This is so far a local statement holding at a point $\mathbf{x}$ in phase space. In order
to get a global estimate of the dynamics, $\Lambda(\mathbf{x})$ has to be averaged over a trajectory $\mathbf{x}(t)$.
If $\Lambda$ thereby changes sign, then there is no simple method of finding out whether the
system is dissipative; one actually has to evaluate the mean value.
In dissipative systems, the volume filled by neighboring trajectories shrinks with
increasing time; asymptotically it even tends to zero. This may happen in a trivial
manner if the trajectories are converging. In the simplest case they are moving towards
an equilibrium point and the motion comes to rest (see the section on limit cycles).
There is, however, also the possibility that the volume is shrinking and the distance
between the trajectories is being reduced only along certain directions, while they are
diverging in other directions. In this case the resulting distance even increases with
time. An originally localized region in phase space is so to speak “rolled out” and
widely distributed by the dynamic flow. The shrinking of the volume towards zero
then means that an originally $N$-dimensional hypercube in phase space changes over
to a geometric object with lower dimension $D < N$. $D$ may even take a nonintegral
value, as will be explained in Chap. 26.

¹ Strictly speaking, the shape of $\Delta V$ is distorted, and the edges do not remain orthogonal to each
other. But this is of no consequence when calculating the volume to lowest order.
23.2 Attractors
The dynamics of a nonlinear system may be highly complicated. It is convenient to
distinguish between transient and asymptotic behavior. A transient process denotes
the initial behavior of a system after starting from a given point x0 in phase space.
Naturally, it is particularly difficult to make general statements, since the transients
depend on the particular initial condition. Theorists therefore tend to ignore this part
of the trajectory, even if it may play an important role in practice, depending on the
dominant time scales. Only recently has the study of transients gotten more attention.
The systematic treatment of the asymptotic or stationary behavior of a system is
somewhat simpler. “Stationary” shall not mean here that the system is at rest but only
that possible transient phenomena have faded away. In dissipative systems, which will
be treated here, the trajectories will asymptotically approach a subset of the phase
space of lower dimension, a so-called attractor.
The definition and correct mathematical classification of attractors is not quite simple.
Actually there are several concepts in the literature that differ from each other in
detail. Here we first give a mathematical definition² but shall also illustrate the concept
of the attractor by various examples in the subsequent chapters.
Let us consider a vector field $\mathbf{F}(\mathbf{x})$ on a space $M$ (e.g., $M = \mathbb{R}^N$) with an associated
phase flow $\Phi_t$. A subset $A \subset M$ is denoted as an attractor if it fulfills the following
criteria:
(1) $A$ is compact.
(2) $A$ is invariant under the phase flow $\Phi_t$.
(3) $A$ has an open environment $U$ that contracts to $A$ under the flow.
This statement needs several explanations:
(1) A set is called compact if it is closed and bounded. This means that the limit of any
convergent sequence of points of the set belongs itself to the set, and the set cannot
extend up to infinity. “Exploding” solutions where for example particles escape to
infinity therefore cannot be attractors.
(2) Invariance under the phase flow means that
$$
\Phi_t(A) = A \quad \text{for all } t. \tag{23.19}
$$
Hence, a point on the attractor can never leave this attractor.
(3) This may be formulated in two steps. First, the environment $U \supset A$ is larger than
the attractor itself, since we are dealing with an open set that includes the compact
set $A$. $U$ shall be positively invariant, i.e.,
$$
\Phi_t(U) \subseteq U \quad \text{for all } t \geq 0. \tag{23.20}
$$
If a point once lies within $U$, then it cannot leave it. It will, on the contrary, even be
pulled toward $A$, which may be formulated as follows: For any open environment
$V$ of $A$ that lies completely within $U$, i.e., $A \subset V \subset U$, one can find a time $t_V$
after which the image of $U$ lies entirely within $V$:
$$
\Phi_t(U) \subset V \quad \text{for all } t > t_V. \tag{23.21}
$$
Since $V$ may be chosen arbitrarily “close” about $A$, this means that for large time
values $U$ is shrinking toward the attractor $A$.
Frequently, the definition of an attractor is still extended by the requirement
that it shall consist of one piece only.
(4) $A$ cannot be separated into several closed nonoverlapping invariant subsets.

Fig. 23.1. Visualization of the definition of an attractor $A$ (hatched). In the course of
time evolution, the environment $U$ is shrinking such that for $t > t_V$ it is enclosed in
any arbitrary smaller environment $V$

² F. Scheck, Mechanik, Springer (1992). This book is also available in English: F. Scheck,
Mechanics: From Newton’s Laws to Deterministic Chaos, 3rd edition, Springer (1999).
An important property of an attractor is its domain of attraction. The maximum
environment U that contracts to A is called the basin of attraction B. In correct math-
ematical formulation, B is the union of all open environments of A that fulfill the
conditions (23.20) and (23.21).
The introduction of the concept of attractor given here is rather complex. This is
justified, however, by the fact that attractors may have very complex properties. Of
central importance for nonlinear dynamics are the concepts of strange and chaotic
attractors, which sometimes—not quite correctly—are used as synonyms. These con-
cepts will become fully transparent only in the subsequent chapters and by examples.
But we shall present the definitions now:
Chaotic attractor: The motion is extremely sensitive with respect to the initial con-
ditions. The distance between two initially closely neighboring trajectories increases
exponentially with time. For more details see Chap. 26.
Strange attractor: The attractor has a strongly rugged geometrical shape that is
described by a fractal. For more details see Chap. 26.
Both of these properties occur together, as a rule. There are, however, also
examples³ where an attractor is chaotic but not strange, or is strange but not chaotic.

³ C. Grebogi, E. Ott, S. Pelikan and J.A. Yorke, Physica 13D, 261 (1984).
23.3 Equilibrium Solutions
A particularly simple case arises if the system is in stationary equilibrium, i.e.,
$$
\mathbf{F}(\mathbf{x}_0) = 0 \quad \text{such that} \quad \mathbf{x}(t) = \mathbf{x}_0 = \text{constant}. \tag{23.22}
$$
Such an x0 is also called a critical point or fixed point. Of particular interest is the
question of whether or not the system is moving toward such a fixed point and—if
several ones exist—to which of them. A fixed point that attracts the trajectories is the
simplest example of an attractor. In this case the set A defined in the previous section
is trivial and consists of a single point.
We are therefore interested in the stability of equilibrium solutions. To this end we
consider the trajectories $\mathbf{x}(t)$ in the vicinity of a critical point $\mathbf{x}_0$. We thus require that
the distance
$$
\boldsymbol{\xi}(t) = \mathbf{x}(t) - \mathbf{x}_0 \tag{23.23}
$$
be a small quantity. Under this condition, the problem may be greatly simplified, since
it usually suffices to take only the lowest term of the Taylor expansion of $\mathbf{F}(\mathbf{x})$ into
account. The linearized equation of motion then reads
$$
\frac{d}{dt}\,\boldsymbol{\xi}(t) = \mathbf{M}\,\boldsymbol{\xi}(t), \tag{23.24}
$$
where terms of quadratic or higher order in $\boldsymbol{\xi}$ were neglected. $\mathbf{M}$ denotes the Jacobi
matrix (functional matrix) of the function $\mathbf{F}(\mathbf{x})$ evaluated at the position $\mathbf{x}_0$. This
matrix has the elements
$$
M_{ik} = \left.\frac{\partial F_i}{\partial x_k}\right|_{\mathbf{x}_0}. \tag{23.25}
$$
Contrary to the original nonlinear equation of motion (23.1), the solution of the
linearized problem (23.24) is in principle simple; it may be given analytically. Let us
first consider the trivial special case of a one-dimensional system ($N = 1$). The Jacobi
matrix then has only a single element, say, $\mu$, and (23.24) is solved by
$$
\xi(t) = e^{\mu t}\,\xi(0). \tag{23.26}
$$
The character of the solution is determined by the sign of $\mu$: For $\mu < 0$, $x_0$
is a stable equilibrium point, since small perturbations decay exponentially. For
$\mu > 0$, the equilibrium is unstable, since even the smallest displacements from
the equilibrium position “explode” exponentially. For $\mu = 0$, the limit case of
indifferent or neutral equilibrium arises. The behavior of the system under
perturbations is then determined by the higher derivatives of the function $F(x)$ at the
point $x_0$.
The general case ($N > 1$) may be treated as in Chap. 8 by the method of normal
vibrations. One then constructs solution vectors $\mathbf{u}(t)$ normalized to 1, all components
of which show the same (exponential) time dependence:
$$
\mathbf{u}(t) = e^{\mu t}\,\mathbf{u}. \tag{23.27}
$$
Using (23.24), one arrives at the eigenvalue problem
$$
\mathbf{M}\mathbf{u} = \mu\,\mathbf{u}. \tag{23.28}
$$
This $N$-dimensional linear system of equations only has nontrivial solutions if the
determinant
$$
\det(M_{ij} - \mu\,\delta_{ij}) = 0 \tag{23.29}
$$
vanishes. This characteristic equation (secular equation) has, as a polynomial of $N$th
order, in general $N$ eigenvalues $\mu_n$ with the associated eigenvectors $\mathbf{u}_n$. The general
solution of (23.24) may then be written as a superposition
$$
\boldsymbol{\xi}(t) = \sum_{n=1}^{N} c_n\, e^{\mu_n t}\, \mathbf{u}_n, \tag{23.30}
$$
where the expansion coefficients $c_n$ may be determined from the initial condition at
$t = 0$. The eigenvalues $\mu_n$ may be real or complex. Complex eigenvalues thereby always
arise pairwise: If $\mu_n$ solves (23.29), then the complex conjugate $\mu_n^*$ obviously also
solves the equation, since the Jacobi matrix $M_{ij}$ is real.
The real parts of the eigenvalues of the characteristic equation are decisive for
characterizing an equilibrium point $\mathbf{x}_0$. We now define a tightened form of the condition
of stability: An equilibrium point $\mathbf{x}_0$ with $\mathbf{F}(\mathbf{x}_0) = 0$ is called asymptotically stable if
there exists an environment $U \ni \mathbf{x}_0$ within which all trajectories are running toward
$\mathbf{x}_0$ for large times:
$$
\lim_{t \to \infty} \mathbf{x}(t) = \mathbf{x}_0 \quad \text{for } \mathbf{x}(0) \in U. \tag{23.31}
$$
If the function (the vector field) $\mathbf{F}$ is sufficiently smooth so that it can be described
by the linear approximation, one may immediately give a sufficient condition for
asymptotic stability: The point $\mathbf{x}_0$ is asymptotically stable if all eigenvalues of the Jacobi
matrix have a negative real part, i.e., if
$$
\operatorname{Re}\mu_n \leq -c < 0 \quad \text{for all } n = 1, \ldots, N \tag{23.32}
$$
with a positive constant $c$.
A glance at (23.30) shows that under this condition all contributions to the
displacement $\boldsymbol{\xi}(t)$ exponentially tend to zero, such that asymptotically
$$
\|\mathbf{x}(t) - \mathbf{x}_0\| < \text{constant} \cdot e^{-(\min_n |\operatorname{Re}\mu_n|)\,t}. \tag{23.33}
$$
Conversely, if at least one of the eigenvalues has a positive real part, $\operatorname{Re}\mu_n > 0$, then
$\mathbf{x}_0$ is an unstable fixed point, since displacements along $\mathbf{u}_n$ are increasing
exponentially.
With the knowledge of the eigenvectors $\mathbf{u}_n$, the total phase space may be decomposed
into subspaces. The stable (or unstable) subspace is spanned by all vectors $\mathbf{u}_n$
satisfying $\operatorname{Re}\mu_n < 0$ (or $> 0$). In addition, a subspace may occur with the special value
$\operatorname{Re}\mu_n = 0$. If this happens, one speaks of a degenerate fixed point. (The associated
subspace is also called the center; but we shall not deal here in more detail with the
related problems.) If one considers a general perturbation of a trajectory, it will have
components in all subspaces. After a sufficiently long time the contribution with
the maximum $\operatorname{Re}\mu_n$ will dominate.
Finally, we note that the linear stability analysis holds only in the vicinity of a
critical point $\mathbf{x}_0$. It may be shown mathematically that the topological behavior of the
flow does not change there under the influence of the nonlinearity. But this vicinity
may be very small, such that one cannot make a statement about the global behavior
of the flow in this way.
EXAMPLE
23.1 Linear Stability in Two Dimensions
The stability analysis becomes particularly transparent for the case N = 2 that corre-
sponds to a dynamic system with one degree of freedom x1 = q and the associated
momentum x2 = p. In the vicinity of a fixed point ẋ = F(x0 ) = 0, the motion is deter-
mined in a linear approximation by the four elements of the Jacobi matrix Mij . The
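characteristic equation (23.29)
$$
\begin{vmatrix} M_{11} - \mu & M_{12} \\ M_{21} & M_{22} - \mu \end{vmatrix} = 0 \tag{23.34}
$$
is a quadratic polynomial
$$
\mu^2 - (M_{11} + M_{22})\,\mu + M_{11}M_{22} - M_{12}M_{21} = 0 \tag{23.35}
$$
or
$$
\mu^2 - 2s\mu + d = 0 \tag{23.36}
$$
with
$$
s = \frac{1}{2}(M_{11} + M_{22}) = \frac{1}{2}\operatorname{Tr}\mathbf{M}, \qquad d = M_{11}M_{22} - M_{12}M_{21} = \det\mathbf{M}. \tag{23.37}
$$
The two solutions of (23.36) may be given explicitly:
$$
\mu_{1/2} = s \pm \sqrt{s^2 - d}. \tag{23.38}
$$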
Depending on the magnitude and sign of the two constants s and d, there are many
distinct possibilities for the eigenvalues μ1 , μ2 :
(a) $\mu_1, \mu_2$ real and both negative (if $s < 0$ and $0 < d < s^2$): stable node
(b) $\mu_1, \mu_2$ real and both positive (if $s > 0$ and $0 < d < s^2$): unstable node
(c) $\mu_1, \mu_2$ real with distinct signs (if $d < 0$): saddle
(d) $\mu_1 = \mu_2^*$, negative real part (if $s < 0$ and $d > s^2$): stable spiral
(e) $\mu_1 = \mu_2^*$, positive real part (if $s > 0$ and $d > s^2$): unstable spiral
(f) $\mu_1 = \mu_2^*$, purely imaginary (if $s = 0$ and $d > 0$): rotor
The ranges are represented in Fig. 23.2 in the s, d-plane. To these alternatives, there
correspond distinct types of trajectories ξ (t) = x(t) − x0 according to (23.30).
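The case distinction (a) to (f) translates directly into a small decision function. The sketch below computes s and d from a 2 × 2 Jacobi matrix, evaluates the two roots (23.38) in complex arithmetic, and returns the type of the fixed point. The function names are hypothetical, and the borderline cases d = 0 and d = s², which the list above does not cover, are reported separately as an assumption of this sketch.

```python
import cmath

def trace_half_and_det(M):
    """Return s = Tr(M)/2 and d = det(M) for a 2x2 matrix given as nested lists."""
    (m11, m12), (m21, m22) = M
    return 0.5 * (m11 + m22), m11 * m22 - m12 * m21

def eigenvalues(s, d):
    """The two roots (23.38) of mu^2 - 2 s mu + d = 0."""
    root = cmath.sqrt(s * s - d)
    return s + root, s - root

def classify(s, d):
    """Type of a two-dimensional fixed point according to cases (a)-(f)."""
    if d < 0:
        return "saddle"                      # (c): real roots with distinct signs
    if d == 0 or d == s * s:
        return "borderline case"             # not covered by (a)-(f)
    if d < s * s:                            # real roots with equal signs
        return "stable node" if s < 0 else "unstable node"
    if s < 0:                                # from here on: complex-conjugate pair
        return "stable spiral"
    if s > 0:
        return "unstable spiral"
    return "rotor"                           # s = 0 and d > 0

M = [[0.0, 1.0], [-1.0, -0.5]]               # assumed sample Jacobi matrix
s, d = trace_half_and_det(M)
print(s, d, eigenvalues(s, d), classify(s, d))
```

For the sample matrix one finds s = −0.25 and d = 1, i.e., a complex-conjugate pair with negative real part.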
Fig. 23.2. Ranges of distinct stability depending on the parameters $s$ and $d$

Fig. 23.3. Various types of stability of a fixed point in two dimensions. Upper row:
Stable and unstable node, saddle. Lower row: Stable and unstable spiral, rotor
Figure 23.3(a) illustrates how the trajectories in the vicinity of a stable node are
running into the fixed point:
$$
\boldsymbol{\xi}(t) = c_1\, e^{-|\mu_1| t}\, \mathbf{u}_1 + c_2\, e^{-|\mu_2| t}\, \mathbf{u}_2, \tag{23.39}
$$
where $\mathbf{u}_1$ and $\mathbf{u}_2$ are the (not necessarily orthogonal) eigenvectors. The curvature of
the trajectories arises if $\mu_1 \neq \mu_2$. These curves are parabola-like, with a common
tangent at the origin (in $\mathbf{u}_1$- or $\mathbf{u}_2$-direction depending on whether $|\mu_2|$ or $|\mu_1|$ is larger). The
trajectories for the unstable node, Fig. 23.3(b), have the same shape but are passed in
the opposite direction (exponential “explosion”). For the case of a saddle the
trajectories are running in the $\mathbf{u}_1$-direction (let $\mu_1 < \mu_2$ without restriction of generality)
toward the fixed point but are pushed off in the $\mathbf{u}_2$-direction, which results in the
hyperbola-like trajectories of Fig. 23.3(c).
If the eigenvalues are complex,
$$
\mu_1 = \mu_r + i\mu_i, \qquad \mu_2 = \mu_r - i\mu_i, \tag{23.40}
$$
this will hold, because of (23.28), for the eigenvectors too:
$$
\mathbf{u}_1 = \mathbf{u}_r + i\mathbf{u}_i, \qquad \mathbf{u}_2 = \mathbf{u}_r - i\mathbf{u}_i. \tag{23.41}
$$
The general solution (23.30) then has the form
\begin{align}
\boldsymbol{\xi}(t) &= c_1\, e^{\mu_1 t}\, \mathbf{u}_1 + c_2\, e^{\mu_2 t}\, \mathbf{u}_2 \tag{23.42} \\
&= c_1\, e^{\mu_1 t}\, \mathbf{u}_1 + c_1^*\, e^{\mu_1^* t}\, \mathbf{u}_1^* \notag \\
&= 2\, e^{\mu_r t}\, \operatorname{Re}\!\left(c_1\, e^{i\mu_i t}\, \mathbf{u}_1\right), \tag{23.43}
\end{align}
where $c_2 = c_1^*$ to get $\boldsymbol{\xi}$ real. If the constant $c_1$, the value of which is fixed by the
initial condition $\boldsymbol{\xi}(0)$, is split into magnitude and phase, and the same is done for the
Cartesian components of the complex eigenvector $\mathbf{u}_1$,
$$
c_1 = \rho\, e^{i\varphi}, \qquad \mathbf{u}_1 = a\, e^{i\alpha}\, \mathbf{e}_x + b\, e^{i\beta}\, \mathbf{e}_y, \qquad \text{with } a^2 + b^2 = 1, \tag{23.44}
$$
then (23.42) may be rewritten as follows:
$$
\boldsymbol{\xi}(t) = 2\rho\, e^{\mu_r t} \left[ a \cos(\mu_i t + \varphi + \alpha)\, \mathbf{e}_x + b \cos(\mu_i t + \varphi + \beta)\, \mathbf{e}_y \right]. \tag{23.45}
$$
The factor in brackets describes harmonic vibrations shifted in phase relative to each
other (if $\alpha \neq \beta$). One thus has the parametric representation of an ellipse. Due to the
prefactor, the size of the ellipse varies exponentially with time. Thus, the trajectories
are logarithmic spirals moving toward the fixed point or away from it, depending on
the sign of the real part of $\mu$; see Fig. 23.3(d), (e). Hence the name spiral. The case
of the rotor with $\operatorname{Re}\mu = 0$ plays a particular role, since the trajectories in the vicinity of
$\mathbf{x}_0$ are periodic functions (concentric ellipses). This means that the equilibrium point
is stable (small displacements are not amplified) but not asymptotically stable (the
trajectory does not run into the fixed point), and hence this point is not an attractor.
EXERCISE
23.2 The Nonlinear Oscillator with Friction
Problem. Let a one-dimensional system be described by the following equation of
motion:
$$
\ddot{x} + \alpha\dot{x} + \beta x + \gamma x^3 = 0. \tag{23.46}
$$
Show that the system is dissipative. Interpret the individual terms and discuss the
possible fixed points and their stability.
Solution. We are dealing with a harmonic oscillator involving friction and
nonlinearity. Besides the linear restoring force of the harmonic oscillator (third term), there
acts a friction force proportional to the velocity (second term). Moreover, a cubic
nonlinearity (fourth term) becomes important. The restoring part of this force law
(third and fourth terms) corresponds to a potential
$$
V(x) = \frac{m}{2}\,\beta x^2 + \frac{m}{4}\,\gamma x^4, \tag{23.47}
$$
where $m$ denotes the mass. We obtain various types of motion, depending on the
magnitude and sign of the constants in (23.46). We first rewrite the equation of
motion (23.46) in the standard form. For this purpose, we introduce the velocity as
a second coordinate, $\mathbf{x} = (x, y) = (x, \dot{x})$, which leads to the coupled differential
equations of first order:
$$
\dot{\mathbf{x}} = \frac{d}{dt}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} y \\ -\alpha y - \beta x - \gamma x^3 \end{pmatrix} \equiv \mathbf{F}(\mathbf{x}). \tag{23.48}
$$
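For α > 0, the system is dissipative, since the divergence of the velocity field is negative:
$$\nabla\cdot\mathbf{F} = \frac{\partial}{\partial x}\,y + \frac{\partial}{\partial y}\left(-\alpha y - \beta x - \gamma x^3\right) = -\alpha < 0. \tag{23.49}$$
The equilibrium condition F(x0) = 0 reads
$$y = 0, \qquad x\,(\beta + \gamma x^2) = 0. \tag{23.50}$$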
Hence, besides the equilibrium position x0 = (0, 0) without displacement, there
occur two further symmetrically positioned fixed points x0 = (±√(−β/γ), 0), provided
that the constants β and γ have opposite signs. Figure 23.4 shows the associated potential
functions V(x) for all combinations of signs.
Fig. 23.4. Potentials of the quadratic oscillator for various signs of the parameters β and γ
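To discuss the linear stability, we need the Jacobi matrix
$$M = \begin{pmatrix} \partial F_1/\partial x & \partial F_1/\partial y \\ \partial F_2/\partial x & \partial F_2/\partial y \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -\beta - 3\gamma x^2 & -\alpha \end{pmatrix}. \tag{23.51}$$
The characteristic equation (23.36) in Example 23.1 for the eigenvalues involves the
following coefficients:
$$\text{For } \mathbf{x}_0 = (0, 0):\quad s = -\tfrac{1}{2}\,\alpha, \quad d = \beta,$$
$$\text{For } \mathbf{x}_0 = (\pm\sqrt{-\beta/\gamma},\, 0):\quad s = -\tfrac{1}{2}\,\alpha, \quad d = -2\beta.$$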
Obviously, asymptotic stability may occur only for a positive sign of the constant
α > 0. Only then is one dealing physically with a damping friction term. For the fixed
point in the rest position x0 = (0, 0), we get the alternatives
(1) β > α²/4: stable spiral
(2) 0 < β < α²/4: stable node
(3) β < 0: saddle
In the first case, there arise weakly damped vibrations, in the second case the oscillator
is overdamped, and the displacement monotonically tends to zero. For β < 0, the equilibrium
position is unstable, as may be seen from the potential plots in Fig. 23.4(c), (d).
The analogous considerations for the fixed points x0 = (±√(−β/γ), 0) lead to (assuming
α > 0):
(1) −2β > α²/4: stable spiral
(2) 0 < −2β < α²/4: stable node
(3) β > 0: saddle
The factor 2 arises because the curvature of the potential (23.47) at the equilibrium
positions with finite displacement is twice as large in magnitude as at the rest position
(and of opposite sign). Only the double-oscillator potential (Fig. 23.4(c)) allows stable
displaced fixed points (β < 0 and γ > 0).
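The case distinctions above can be collected in a small routine. The following is only a sketch: it assumes that s is half the trace and d the determinant of the Jacobi matrix (23.51), as the listed values s = −α/2 and d = β suggest, so that the eigenvalues are s ± √(s² − d). The function names are hypothetical.

```python
import math

def fixed_points(beta, gamma):
    """Fixed points (x, y) of x'' + alpha x' + beta x + gamma x^3 = 0, cf. (23.50)."""
    points = [(0.0, 0.0)]
    if gamma != 0 and -beta / gamma > 0:
        root = math.sqrt(-beta / gamma)
        points += [(root, 0.0), (-root, 0.0)]
    return points

def classify(x0, alpha, beta, gamma):
    """Classify the fixed point (x0, 0), assuming eigenvalues s +/- sqrt(s^2 - d)."""
    s = -0.5 * alpha                    # half the trace of M in (23.51)
    d = beta + 3.0 * gamma * x0 ** 2    # determinant of M in (23.51)
    if d < 0:
        return "saddle"
    if d == 0 or s == 0:
        return "marginal (linear analysis inconclusive)"
    kind = "spiral" if d > s * s else "node"
    return ("stable " if s < 0 else "unstable ") + kind

if __name__ == "__main__":
    alpha, beta, gamma = 1.0, -1.0, 1.0   # double-well case of Fig. 23.4(c)
    for x0, y0 in fixed_points(beta, gamma):
        print((x0, y0), classify(x0, alpha, beta, gamma))
```

For the double-well parameters used in the usage lines, the origin comes out as a saddle and the two displaced fixed points as stable spirals, in line with the lists above.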
It is instructive to plot the position of the fixed points as a function of the pa-
rameter β. As is seen from Fig. 23.5, for β = 0 there occurs a square-root branch-
ing. For γ > 0, a stable equilibrium position bifurcates into two new stable solutions.
Such bifurcations (Lat. furca = fork) frequently occur in nonlinear systems; see also
Chap. 25.
Fig. 23.5. Position of the stable (continuous) and unstable (dotted) fixed points depending on the parameters β and γ
23.4 Limit Cycles
Besides the simple stationary equilibrium points studied in detail in the section on
attractors, a dynamic system may exhibit still other types of stable solutions. These
are the so-called limit cycles that are characterized by periodically oscillating closed
trajectories. Similar to the fixed points discussed already, limit cycles may also act as
attractors of motion; compare the section on attractors. Then there exists a more or less
extended range in phase space (the “basin of attraction” of the attractor): trajectories
starting from there move toward the limit cycle, which is approached for t → ∞. For
limit cycles, one may also perform a mathematical stability analysis as for fixed points
which by its very nature is somewhat more difficult.
We shall concentrate here on a special but typical example, namely a harmonic
oscillator with a nonlinear friction term. The associated differential equation
has the general form
$$\frac{d^2x}{dt^2} + f(x)\,\frac{dx}{dt} + \omega^2 x = 0. \tag{23.52}$$
If the middle term is absent, we obtain a harmonic oscillator with angular frequency ω.
The case of a constant coefficient, f (x) = α = constant, leads to a linear differential
equation that may be solved easily. The character of the solution is determined by an
exponential factor exp (−αt/2). The solution thus decreases exponentially toward the
fixed point at x = ẋ = 0 if α is positive. A negative value of α means that a force is
acting along the same direction as that of the instantaneous velocity which leads to an
unlimited amplification of the solution (negative damping). Physically one of course
no longer deals with a friction force; rather an external source must exist that pumps
energy into the system.
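The exponential factor can be made explicit with the usual exponential ansatz. The following short worked step is added for illustration; the symbols Ω, A, and δ are introduced here only for this purpose.

$$x(t) = e^{\mu t} \;\Rightarrow\; \mu^2 + \alpha\mu + \omega^2 = 0 \;\Rightarrow\; \mu_{1,2} = -\frac{\alpha}{2} \pm i\,\Omega, \qquad \Omega = \sqrt{\omega^2 - \frac{\alpha^2}{4}},$$
$$x(t) = A\, e^{-\alpha t/2}\cos(\Omega t + \delta) \qquad (|\alpha| < 2\omega).$$

For |α| > 2ω both roots are real; for positive α they are both negative, so the solution still decays toward the fixed point.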
If allowance is made for more general functions f (x), it may happen that the damp-
ing coefficient takes partly positive, partly negative values, depending on the displace-
ment. Of particular interest is the case when f (x) is negative for small magnitudes of
x, and positive for large displacements. The simplest ansatz providing such a behavior
is a quadratic polynomial
$$f(x) = \alpha\,(x^2 - x_0^2), \tag{23.53}$$
where α determines the strength of the damping/excitation and the two zeros are at x =
±x0. The zeros may be set to the value 1 without loss of generality by rescaling the
variable to x′ = x/x0 with α′ = αx0². For convenience one may also choose the value
1 for the frequency by rescaling the time: t′ = ωt with α″ = α′/ω. The standard form
of the equation of motion then reads (dropping the primes again):
$$\frac{d^2x}{dt^2} + \alpha\,(x^2 - 1)\,\frac{dx}{dt} + x = 0. \tag{23.54}$$
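The two rescaling steps can be verified directly. A short sketch, with primes denoting the rescaled quantities: inserting x = x0 x′ into (23.52) with the ansatz (23.53) and dividing by x0 gives

$$\frac{d^2x'}{dt^2} + \alpha'\,(x'^2 - 1)\,\frac{dx'}{dt} + \omega^2 x' = 0, \qquad \alpha' = \alpha\,x_0^2 .$$

With t′ = ωt, i.e., d/dt = ω d/dt′, and after division by ω², this becomes

$$\frac{d^2x'}{dt'^2} + \alpha''\,(x'^2 - 1)\,\frac{dx'}{dt'} + x' = 0, \qquad \alpha'' = \frac{\alpha'}{\omega} .$$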
This differential equation was set up and discussed in 1926 by the Dutch engineer
B. van der Pol. It served first for describing an electronic oscillator circuit with
feedback (at that time still with valves), but it was already clear to the author that his
equation could be applied to a variety of vibrational processes. Actually the origin
of this equation may be traced back even further, since around 1880 Lord Rayleigh
investigated the following differential equation in the context of nonlinear vibrations:

$$\frac{d^2v}{dt^2} + \alpha\left[\frac{1}{3}\left(\frac{dv}{dt}\right)^3 - \frac{dv}{dt}\right] + v = 0. \tag{23.55}$$
One easily sees the relation between (23.54) and (23.55). We have only to differentiate
the Rayleigh equation (23.55) with respect to time and then substitute
$$\frac{dv}{dt} = x \tag{23.56}$$
to get the van der Pol equation (23.54). Thus, both equations are essentially equivalent
to each other.
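The differentiation step may be written out explicitly. Differentiating the Rayleigh equation once with respect to time gives

$$\frac{d}{dt}\left\{\ddot{v} + \alpha\left[\tfrac{1}{3}\,\dot{v}^3 - \dot{v}\right] + v\right\} = \dddot{v} + \alpha\,(\dot{v}^2 - 1)\,\ddot{v} + \dot{v} = 0,$$

and with x = dv/dt this is

$$\ddot{x} + \alpha\,(x^2 - 1)\,\dot{x} + x = 0 .$$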
We now discuss the solutions of the van der Pol equation (23.54). It may be trans-
formed as usual to the standard form (23.1) of two coupled differential equations of
first order for the vector x(t) = (x, y)T :
$$\frac{dx}{dt} = y, \tag{23.57}$$
$$\frac{dy}{dt} = -x - \alpha\,(x^2 - 1)\,y. \tag{23.58}$$
It is now advantageous to transform to polar coordinates in the x, y-phase space:
$$x = r\cos\theta, \qquad y = r\sin\theta. \tag{23.59}$$
The time derivatives of r and θ may be expressed by those of x and y. For the radius
coordinate the relation follows immediately from the differentiation of r² = x² + y²:
$$r\,\frac{dr}{dt} = x\,\frac{dx}{dt} + y\,\frac{dy}{dt}. \tag{23.60}$$
An analogous relation for the angle coordinate may be obtained from the time derivatives
of (23.59):
$$\frac{dx}{dt} = \frac{dr}{dt}\cos\theta - r\,\frac{d\theta}{dt}\sin\theta, \tag{23.61}$$
$$\frac{dy}{dt} = \frac{dr}{dt}\sin\theta + r\,\frac{d\theta}{dt}\cos\theta. \tag{23.62}$$
By multiplying the first of these equations by y and the second one by x and subtracting
the first from the second, one obtains
$$r^2\,\frac{d\theta}{dt} = x\,\frac{dy}{dt} - y\,\frac{dx}{dt}. \tag{23.63}$$
Using (23.60) and (23.63), the van der Pol system of equations in polar coordinates
reads as follows:
$$\frac{dr}{dt} = -\alpha\,(r^2\cos^2\theta - 1)\,r\sin^2\theta, \tag{23.64}$$
$$\frac{d\theta}{dt} = -1 - \alpha\,(r^2\cos^2\theta - 1)\,\sin\theta\cos\theta. \tag{23.65}$$
dt
The nonlinear terms on the right-hand side have a rather complex shape, but one may
give some qualitative statements on the solutions to be expected. In the limit α = 0,
one has of course a normal harmonic oscillator. The trajectories in phase space are
circles which are traveled through uniformly with the frequency 1, such that
$$x(t) = \rho\,\sin(t - t_0) \tag{23.66}$$
with arbitrary ρ and t0. Due to the nonlinearity in (23.64) and (23.65), the behavior of
the solution is modified. As long as α ≪ 1, the influence on the revolution frequency
remains small: Since the function sin θ cos θ changes its sign four times in each period,
the influence of the nonlinear term in (23.65) cancels out on the average. It is quite
different, however, for the radial motion: Here sin²θ is never negative, and small
changes of the radius may accumulate from period to period. The direction
of the effect is determined by the sign of −α(r² cos²θ − 1). In the following we
shall discuss the (more interesting) case α > 0 (the set of solutions for α < 0 may be
obtained by inversion of the time coordinate t → −t).
For small displacements r < 1, the factor −α(r² cos²θ − 1) is then always positive,
and the radius increases slowly but monotonically. For large displacements r ≫ 1,
on the contrary, the factor is predominantly negative (except for the vicinity of the
zeros of cos θ), and the mean radius decreases from cycle to cycle. A more detailed
investigation as is performed in Exercise 23.3 shows that the trajectory in the course of
time approaches a periodic one, independent of the initial conditions, which for given
α is uniquely determined. This is the limit cycle of the system.
As long as α is very small, the limit cycle resembles a harmonic vibration as in
(23.66). The crucial difference is, however, that the amplitude ρ now has a sharply
determined value, namely, ρ = 2. If one starts from a smaller or larger value, the tra-
jectory is a spiral approaching the limit cycle. The result of a numeric calculation
for the value α = 0.1 is represented in Fig. 23.6. One can follow the spiraling mo-
tion towards the limit cycle. Moreover, deviations from the purely harmonic vibration
become visible.
Fig. 23.6. (a) The solutions of the system of differential equations (23.57) in the x,y-plane
approach the limit cycle (bold curve), a slightly deformed circle, independent of the initial
condition. The nonlinearity parameter is α = 0.1. (b) The solutions x(t) of the van der Pol
oscillator after a transient motion coincide with an approximately harmonic vibration
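A numerical experiment of this kind is easy to reproduce. The following sketch integrates (23.57) and (23.58) with a classical fourth-order Runge-Kutta step and reports the largest displacement reached late in the run. The step size, the total time, and the starting points are assumptions chosen for illustration; this is not the procedure used for the figure.

```python
def vdp_rhs(x, y, alpha):
    """Right-hand sides of (23.57) and (23.58)."""
    return y, -x - alpha * (x * x - 1.0) * y

def rk4_step(x, y, alpha, dt):
    """One classical Runge-Kutta step of size dt."""
    k1x, k1y = vdp_rhs(x, y, alpha)
    k2x, k2y = vdp_rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y, alpha)
    k3x, k3y = vdp_rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y, alpha)
    k4x, k4y = vdp_rhs(x + dt * k3x, y + dt * k3y, alpha)
    return (x + dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0,
            y + dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0)

def late_amplitude(x0, y0, alpha=0.1, dt=0.01, t_total=200.0, t_window=20.0):
    """Integrate from (x0, y0) and return max |x| over the last t_window time units."""
    steps = int(round(t_total / dt))
    window_start = steps - int(round(t_window / dt))
    x, y, amplitude = x0, y0, 0.0
    for n in range(steps):
        x, y = rk4_step(x, y, alpha, dt)
        if n >= window_start:
            amplitude = max(amplitude, abs(x))
    return amplitude

if __name__ == "__main__":
    print(late_amplitude(0.5, 0.0))   # start inside the limit cycle
    print(late_amplitude(4.0, 0.0))   # start outside the limit cycle
```

Both runs should end with an amplitude close to 2, whether the trajectory starts inside or outside the limit cycle.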
Even more interesting is the solution in the opposite limit α ≫ 1, in which the nonlinearity
plays a dominant role. Here also a limit cycle evolves for the same reasons,
the shape of which, however, strongly differs from a harmonic vibration. Figure 23.7
shows the phase-space plot and the time dependence of the displacement on the limit cycle
for the case α = 10. One notices that the displacement stays near the maximum
amplitude x = 2 for a long time and slowly decreases toward x = 1. Subsequently, a sudden
“flip-over” sets in, and the displacement drops to the value x = −2. Then the game repeats
with opposite sign. The period length of this kind of vibration is no longer determined
by the oscillator frequency (here ω = 1) but takes a much larger value. An analytic
investigation (see Exercise 23.4) shows that it increases in proportion to the “friction”
parameter α:
$$T \approx (3 - 2\ln 2)\,\alpha. \tag{23.67}$$
Fig. 23.7. Solutions of the van der Pol oscillator with strong nonlinearity, α = 10. The meaning of the curves is the same as in Fig. 23.6
A motion performed by a van der Pol or Rayleigh oscillator for large α is also
called a relaxation vibration. The name indicates that a tension builds up slowly which
then equilibrates via a sudden relaxation process. Such relaxation vibrations fre-
quently occur in nature. For example, the vibration of a string excited by a bow, the
squeak of a brake, and even the rhythm of a heartbeat or the time variation of animal
populations may be classified in this way.
An important and also practically useful property of nonlinear oscillators with a
limit cycle lies in the fact that self-exciting vibrations occur that are well defined and
independent of the initial conditions. As a somewhat nostalgic example, we quote the
balance of a mechanical clock, the vibrations of which are largely independent of the
strength of the driving force. Finally, we quote without proof a mathematical theorem⁴
stating that the possible types of motion of a two-dimensional system (corresponding
to a mechanical system with one degree of freedom: one coordinate plus one velocity)
are completely governed by fixed points and limit cycles.
⁴ See, e.g., J. Guckenheimer and P. Holmes, Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields, Springer (1983).
The theorem of Poincaré and Bendixson: Let a two-dimensional dynamic system
x(t) = (x(t), y(t))T be described by the differential equation
$$\frac{d\mathbf{x}}{dt} = \mathbf{F}(\mathbf{x})$$
with a continuous function F. Let B be a closed and bounded region of the x,y-plane.
If a trajectory lies in B for all times t > 0, x(t) ∈ B, there are three possibilities:
(i) x(t) is a periodic function,
(ii) x(t) asymptotically approaches a stationary equilibrium point, or
(iii) x(t) asymptotically approaches a periodic function (limit cycle).
The general theorem says of course nothing about the number and shape of the
fixed points and limit cycles. However, it excludes the existence of more complicated
nonperiodic types of solutions! It is important that the statement holds only for two-
dimensional systems. Two trajectories are not allowed to intersect each other in phase
space, which in the two-dimensional plane leads to considerable restrictions. But in
more than two dimensions the trajectories may “evade” each other, and more com-
plex patterns of motion are possible. In this case, the already-mentioned strange at-
tractors with a complicated shape may also occur. This will be treated in the next
chapters.
EXERCISE
23.3 The van der Pol Oscillator with Weak Nonlinearity
Problem. Show that the solutions of the van der Pol oscillator for small values of α
are spirals that approach a circle (the limit cycle) with the radius 2.
Hint: It is a good idea to introduce new variables that are averaged over one oscil-
lation period.
Solution. We start from the plausible assumption that for α ≪ 1 the solution of the
system of differential equations (23.64) and (23.65) differs only slightly from that of
the harmonic oscillator if it is considered for short time intervals. In order to calculate
a long-term drift of the variables, it is efficient to average over one vibrational period
in each case. We define the averaged amplitude r̄(t) as

$$\bar{r}(t) := \frac{\oint d\tau\; r(t+\tau)}{\oint d\tau}. \tag{23.68}$$
The integration thereby extends over a full revolution of the angle, i.e., from θ to
θ − 2π (the minus sign arises because dθ/dt ≈ −1). The corresponding time interval
runs from t to approximately (for α = 0 exactly) the value t + 2π.
We are interested in the time variation of the averaged amplitude, for which according
to (23.64) we have
$$\frac{d\bar{r}}{dt} = -\alpha\,\frac{1}{2\pi}\oint d\theta\; r\sin^2\theta\,(r^2\cos^2\theta - 1) = -\alpha\int_0^{2\pi}\frac{d\theta}{2\pi}\; r\left(\frac{1}{4}\,r^2\sin^2 2\theta - \sin^2\theta\right). \tag{23.69}$$
For small α, the quantity r(t) considered over a period varies only slowly, and hence
may be pulled out of the integral and replaced by r̄(t). The remaining angle integration
is trivial, since the mean value of both sin²θ and sin²2θ just equals 1/2. Hence, the
averaged amplitude satisfies the differential equation
$$\frac{d\bar{r}}{dt} = \frac{1}{2}\,\alpha\,\bar{r}\left(1 - \frac{1}{4}\,\bar{r}^2\right), \tag{23.70}$$
which is correct up to terms of order α². The circulation frequency, on the contrary,
does not change to first order:
$$\frac{d\bar{\theta}}{dt} = -1 - \alpha\int_0^{2\pi}\frac{d\theta}{2\pi}\,(r^2\cos^2\theta - 1)\,\sin\theta\cos\theta = -1. \tag{23.71}$$
The angular integral vanishes here, since the integrand is an odd function with respect
to θ = π . The differential equation (23.70) for the averaged amplitude may be solved
in closed form. We write
$$\frac{d\bar{r}}{dt} = a\,\bar{r} - b\,\bar{r}^3 \tag{23.72}$$
and transform to the new variable
$$u = \frac{1}{\bar{r}^2} \quad\text{such that}\quad du = -2\,\frac{d\bar{r}}{\bar{r}^3}. \tag{23.73}$$
Obviously, (23.72) reduces to the simple linear differential equation
$$-\frac{1}{2}\,\frac{du}{dt} = a\,u - b, \tag{23.74}$$
the solution of which is a shifted exponential function:
$$u(t) = \frac{b}{a} + c\,e^{-2at}, \tag{23.75}$$
where the free constant c is to be determined from the initial condition: c = u(0) −
b/a. Insertion of a = α/2 and b = α/8 finally yields
$$\bar{r}(t) = \frac{2\,r(0)}{\sqrt{r^2(0) + \left(4 - r^2(0)\right)e^{-\alpha t}}}. \tag{23.76}$$
Thus, it is proved that the trajectories are spirals which approach a circle with radius 2
from inside (r(0) < 2) or outside (r(0) > 2). This is the limit cycle of the van der Pol
oscillator for small values of α.
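The closed-form result can be checked against the averaged differential equation (23.70). The sketch below evaluates the averaged amplitude with the exponent 2at = αt that follows from (23.75) with a = α/2, and compares its numerical time derivative with the right-hand side of (23.70). The function names and the finite-difference step are assumptions made for this illustration.

```python
import math

def r_bar(t, r0, alpha):
    """Averaged amplitude with r_bar(0) = r0, cf. (23.76) with exponent alpha*t."""
    return 2.0 * r0 / math.sqrt(r0 * r0 + (4.0 - r0 * r0) * math.exp(-alpha * t))

def residual(t, r0, alpha, h=1e-5):
    """Central-difference derivative of r_bar minus the right-hand side of (23.70)."""
    lhs = (r_bar(t + h, r0, alpha) - r_bar(t - h, r0, alpha)) / (2.0 * h)
    r = r_bar(t, r0, alpha)
    rhs = 0.5 * alpha * r * (1.0 - 0.25 * r * r)
    return lhs - rhs

if __name__ == "__main__":
    for r0 in (0.5, 4.0):
        print(r0, r_bar(300.0, r0, 0.1), residual(5.0, r0, 0.1))
```

For starting values both below and above 2 the amplitude tends to 2 at late times, and the residual stays at the level of the finite-difference error.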
EXERCISE
23.4 Relaxation Vibrations
Problem. Discuss the solutions of the Rayleigh oscillator (23.55) qualitatively for
large values of the parameter α ≫ 1. Find an approximate solution for the period
length of the resulting relaxation vibration.
Solution. The differential equation (23.55) of the Rayleigh oscillator written in standard
form reads
$$\frac{dv}{dt} = x, \qquad \frac{dx}{dt} = -v - \alpha\left(\frac{1}{3}\,x^3 - x\right). \tag{23.77}$$
In order to discuss the behavior of the solution for large values of α, it is convenient
to rescale the amplitude to a new variable z = v/α:
$$\frac{dz}{dt} = \frac{1}{\alpha}\,x, \qquad \frac{dx}{dt} = -\alpha\left[z + f(x)\right], \tag{23.78}$$
with the abbreviation f(x) = x³/3 − x. From these equations one may read off the
direction of the trajectory for any point of the z,x-plane:
$$\frac{dx}{dz} = \frac{dx/dt}{dz/dt} = -\alpha^2\,\frac{z + f(x)}{x}. \tag{23.79}$$
This means that for α ≫ 1, the trajectories are almost vertical. Other directions
may occur only near the curve z(x) = −f (x). This cubic limit curve subdivides the
z, x-plane into two halves (see Fig. 23.8). In the right half, the derivative dx/dt is neg-
ative, according to (23.78), and the trajectories are running (almost) vertically down-
ward. In the left half, they are running upward.
Fig. 23.8. The cubic limit curve z(x) determines the asymptotic behavior of the relaxation vibrations of the van der Pol oscillator
From this knowledge, the motion for large α may be constructed graphically. Beginning
from an arbitrary initial point, e.g., the point O in the figure, the trajectory
at first falls almost vertically down to the curve z(x) = −f (x). The further motion
proceeds with significantly lower velocity near this curve (directly on the curve the
velocity would vanish, dx/dt = 0). Finally, the point of inversion B is reached at
(z, x) = (2/3, 1).
Because dz/dt > 0, the trajectory cannot follow the backward-running branch of
the curve but “falls down” to the point C at (2/3, −2). Now the game is repeating
with inverse sign. The curve ABCD forms the limit cycle of the Rayleigh oscillator.
It consists of two slowly passed parts (x = 2 . . . 1 and −2 . . . −1) and two fast jumps
(x = 1 . . . −2 and −1 . . . 2). This discussion immediately applies, of course, to the van
der Pol oscillator, since according to (23.56) its displacement just corresponds to the
velocity x of the Rayleigh oscillator introduced in (23.77).
The period length T of the relaxation vibration may be evaluated easily:
$$T = \oint dt = \alpha\oint\frac{dz}{x}, \tag{23.80}$$
where the integral extends over a full period. Since the motion along the partial
branches BC and DA proceeds very quickly, it is sufficient to calculate the contributions
of AB and CD, which are equal by symmetry:
$$T \approx \alpha\int_{AB}\frac{dz}{x} + \alpha\int_{CD}\frac{dz}{x} = 2\alpha\int_{AB}\frac{dz}{x} = 2\alpha\int_2^1 dx\,\frac{dz/dx}{x}. \tag{23.81}$$
The derivative dz/dx is to be formed on the curve AB, i.e., dz/dx = −df/dx:
$$T \approx 2\alpha\int_2^1 dx\,\frac{-df/dx}{x} = 2\alpha\int_1^2 dx\,\frac{x^2 - 1}{x} = 2\alpha\left[\frac{1}{2}\,x^2 - \ln x\right]_1^2 = (3 - 2\ln 2)\,\alpha \approx 1.614\,\alpha. \tag{23.82}$$
The period length calculated numerically in Fig. 23.7b for α = 10 amounts to about
19, hence the asymptotic range is not yet fully reached.
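The comparison between the asymptotic estimate and the numerical period can be repeated with a few lines of code. The sketch below integrates the van der Pol system (23.57), (23.58) for α = 10 and measures the period from successive upward zero crossings of x(t). The integrator, the step size, the starting point, and the total time are assumptions chosen for this illustration.

```python
import math

def vdp_rhs(x, y, alpha):
    """Right-hand sides of (23.57) and (23.58)."""
    return y, -x - alpha * (x * x - 1.0) * y

def rk4_step(x, y, alpha, dt):
    """One classical Runge-Kutta step of size dt."""
    k1x, k1y = vdp_rhs(x, y, alpha)
    k2x, k2y = vdp_rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y, alpha)
    k3x, k3y = vdp_rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y, alpha)
    k4x, k4y = vdp_rhs(x + dt * k3x, y + dt * k3y, alpha)
    return (x + dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0,
            y + dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0)

def measure_period(alpha=10.0, dt=0.001, t_total=100.0):
    """Mean spacing of upward zero crossings of x(t); the first crossing is skipped."""
    x, y, t = 2.0, 0.0, 0.0
    crossings = []
    for _ in range(int(round(t_total / dt))):
        x_new, y_new = rk4_step(x, y, alpha, dt)
        if x < 0.0 <= x_new:
            # linear interpolation of the crossing time within the step
            crossings.append(t + dt * (-x) / (x_new - x))
        x, y, t = x_new, y_new, t + dt
    later = crossings[1:]
    return (later[-1] - later[0]) / (len(later) - 1)

if __name__ == "__main__":
    alpha = 10.0
    print("numerical period :", measure_period(alpha))
    print("asymptotic (23.82):", (3.0 - 2.0 * math.log(2.0)) * alpha)
```

The measured period should come out noticeably larger than the leading-order estimate 1.614α, in line with the remark that the asymptotic range is not yet reached at α = 10.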
24 Stability of Time-Dependent Paths
In Chap. 23, we have considered the stability of an equilibrium point x0 by investigating
the behavior of the trajectories in the vicinity of this point. Now we are interested
in the environment of a time-dependent reference trajectory xr (t). The former case of
a stationary fixed point xr (t) = x0 = constant is of course included too. One may again
distinguish between various kinds of stability. Stability in the sense of Lyapunov exists
if a point on a neighboring trajectory x(t) remains close to xr (t) for all times. The
formal expression of this concept reads as follows: A path xr (t) is called Lyapunov-
stable if for any ε > 0 a value δ(ε) > 0 can be found such that any solution with
|x(t0) − xr(t0)| < δ satisfies the condition |x(t) − xr(t)| < ε for all times t > t0.
Figure 24.1(a) shows that the “perturbed” paths are confined within an (N + 1)-
dimensional tube of radius ε about the reference trajectory. This does not yet mean
that the trajectories are approaching each other with time. If the latter happens, one
speaks as before (see (23.31)) of asymptotic stability; see Fig. 24.1(b).
Fig. 24.1. (a) The neighboring paths x(t) of a Lyapunov-stable path xr (t) remain in its vicinity.
(b) In the case of asymptotic stability, neighboring paths are attracted such that the distance
decreases to zero with increasing time
A path xr(t) is called asymptotically stable if it is Lyapunov-stable and if for the
neighboring trajectories lim t→∞ |x(t) − xr(t)| = 0.
It is of interest that for time-dependent trajectories xr (t) there exists still another
concept of stability that was not needed for stationary fixed points x0 . It may happen
that although the shape of two paths in phase space is very similar, these paths are nev-
ertheless passed with distinct speeds; see Fig. 24.2. Thereby a time shift may evolve,
|x(t) − xr (t)| increases, and the definitions of stability given so far do not apply. One
therefore introduces a weakened version: A path xr(t) is called orbitally stable if for
any ε > 0 a value δ(ε) > 0 can be found such that any solution with |x(t0) − xr(t0)| < δ
is confined for all times t > t0 within a tube of radius ε about the path xr(t).
Fig. 24.2. Example of a path that is orbitally stable but not asymptotically stable. Neighboring paths are passed with distinct speeds

This definition refers only to the geometric shape of the curve x(t). The time, as a parameter
of this curve, does not play a role.
24.1 Periodic Solutions

The investigation of the stability of time-dependent solutions is by its very nature
more difficult than in the case of stationary fixed points. Here we shall treat the im-
portant special case of periodic solutions. Then we may apply a formalism that goes
back to the French mathematician Floquet (1883). Thus, we assume that the reference
trajectory xr (t) repeats itself after a period length T ,
xr (t + T ) = xr (t). (24.1)
This may originate in two distinct manners. First, in an autonomous system vibrations
may arise by themselves, e.g., in the harmonic oscillator. Here the right-hand side
of the equation of motion does not depend explicitly on the time, ẋ = F(x). On the
other hand, there are also periodically externally excited systems which are under the
action of a time-dependent external drive with the periodicity F(x, t + T ) = F(x, t)
that reflects itself in the trajectory. An advantage is here that the period length T is
imposed from outside, while the vibrational frequency of an autonomous system is
not known from the beginning and must be determined—except for simple special
cases—by numerical solution of the equation of motion.
To discuss the stability of xr (t), one investigates, as in (23.23), the neighboring
trajectories
x(t) = xr (t) + ξ (t), (24.2)
where the deviation ξ (t) is assumed to be small. From the equations of motion
ẋ(t) = F(x, t) and ẋr (t) = F(xr , t), (24.3)
we find
ẋr + ξ̇ = F(xr + ξ , t) = F(xr , t) + ξ̇ , (24.4)
or
ξ̇ = F(xr + ξ , t) − F(xr , t) ≡ G(ξ , t). (24.5)
This equation can be linearized by expanding the right side in a Taylor series and
neglecting higher terms:
ξ̇ (t) = M(t) ξ (t) (24.6)
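with the Jacobi matrix (here written in abstract form without giving the indices)
$$M(t) = \left.\frac{\partial\mathbf{G}}{\partial\boldsymbol{\xi}}\right|_{\boldsymbol{\xi}=0} = \left.\frac{\partial\mathbf{F}}{\partial\mathbf{x}}\right|_{\mathbf{x}_r(t)}. \tag{24.7}$$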
Equation (24.6) is, like (23.24), a linear system of differential equations, but now the
matrix of coefficients is periodically time-dependent, M(t + T ) = M(t), while for-
merly it was constant. This periodicity also holds for autonomous systems: Although
the function F(x) does not involve the time explicitly, the reference trajectory xr (t) by
itself nevertheless induces a periodic time dependence.
24.2 Discretization and Poincaré Cuts
There exists a mathematical tool that is useful for the stability analysis of time-
dependent paths but also in general for the qualitative understanding of dynamic sys-
tems. The basic idea is to perform a discretization of the time dependence of a trajec-
tory. This may be done in two somewhat different ways.
An obvious possibility is the stroboscopic mapping. Instead of the continuous func-
tion x(t), one considers a discrete sequence of “snapshots” xn = x(tn ), n = 0, 1, 2, . . . .
The sampling times of the stroboscopic method are chosen as equidistant,
hence tn = t0 + nT with a scanning interval T . Of course one should choose a value
of T that is appropriate for the problem. If an oscillating driving force is acting, one
will use its period length for T . The stroboscopic method becomes particularly simple
if the trajectory x(t) itself is periodic and T coincides with the period length; then all
xn are of course identical. The stroboscopic mapping of the path consists of a single
point xn = x0 in phase space. One should note that the position of the point x0 depends
of course on the selected reference time t0 and thereby may be shifted arbitrarily along
the orbit.
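The stroboscopic mapping is easily illustrated numerically. The sketch below uses a hypothetical driven and damped linear oscillator, ẍ + 0.5ẋ + x = cos t, which is not taken from the text; its drive period T = 2π serves as the scanning interval. The steady-state solution of this example is x(t) = 2 sin t, so the snapshots taken at tn = nT should approach the single point (x, ẋ) = (0, 2).

```python
import math

def rhs(t, x, y, damping=0.5):
    """Hypothetical driven, damped oscillator: x'' + damping*x' + x = cos(t)."""
    return y, -damping * y - x + math.cos(t)

def rk4_step(t, x, y, dt):
    """One classical Runge-Kutta step for the non-autonomous system."""
    k1x, k1y = rhs(t, x, y)
    k2x, k2y = rhs(t + 0.5 * dt, x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
    k3x, k3y = rhs(t + 0.5 * dt, x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
    k4x, k4y = rhs(t + dt, x + dt * k3x, y + dt * k3y)
    return (x + dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0,
            y + dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0)

def stroboscopic_points(x0, y0, n_periods=12, steps_per_period=2000):
    """Snapshots (x, y) at t_n = n*T, n = 0..n_periods, with T the drive period 2*pi."""
    dt = 2.0 * math.pi / steps_per_period
    t, x, y = 0.0, x0, y0
    points = [(x, y)]
    for _ in range(n_periods):
        for _ in range(steps_per_period):
            x, y = rk4_step(t, x, y, dt)
            t += dt
        points.append((x, y))
    return points

if __name__ == "__main__":
    for n, (x, y) in enumerate(stroboscopic_points(1.0, 0.0)):
        print(n, x, y)
```

Once the transient has died out, the trajectory is periodic with period T and its stroboscopic image collapses to one point, as described above.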
As was described in the preceding section, for a stability analysis one investi-
gates neighboring trajectories x(t) that are in general no longer strictly periodic. The
thin line in Fig. 24.3 shows such an example. The first three stroboscopic snapshots
x0 , x1 , x2 are marked by dots and the distance vectors ξ n = xn − xr0 are plotted.
Fig. 24.3. The stroboscopic scanning of the distance ξ(t) = x(t) − xr(t) yields information on the stability of a path
An alternative method of discretization of trajectories, which is not oriented to the
periodicity and is of advantage particularly for autonomous systems having no fixed
eigenfrequency, is the Poincaré cut. When changing over to the discretized sequence
xn, one again chooses momentary snapshots of the continuous orbit x(t). As a criterion
one now adopts not fixed equidistant time intervals, but rather a geometric property
of the orbit itself, namely the piercing of a given hypersurface Σ. One thereby
selects an (N − 1)-dimensional hypersurface Σ in phase space and marks all points xn
at which the trajectory intersects the hypersurface. One further requires that Σ is not
only touched but properly pierced. Mathematically this means that the surface shall be
transverse to the dynamic flow, n(x) · F(x) ≠ 0 everywhere on Σ, where n is the surface
normal. One therefore speaks of a transverse cut. In a transverse cut one usually
marks only points with a definite sign of F · n, i.e., only piercings of Σ that proceed
in the same direction.
This method of discretization of trajectories was invented by Henri Poincaré and is
called the Poincaré cut. Figure 24.4 shows as an example a trajectory in an (N = 3)-
dimensional space, with the x,y-plane as the cut surface Σ. Three piercings in the
negative z-direction are marked as x0, x1, x2. The use of Poincaré cuts makes sense
only if the trajectory is moving largely or completely in a restricted range of
the phase space, such that the cut surface is pierced again and again. Many systems
with approximately periodic or also chaotic motion satisfy this condition. Examples
of nontrivial Poincaré cuts will be given in Chap. 27.
Fig. 24.4. The piercing points of a trajectory through a given surface constitute the Poincaré cut
An advantage of the Poincaré cut is first of all the reduction of dimension of the
phase space from N to N − 1, which may be very helpful for the qualitative discus-
sion. For detailed studies, not only does one want to know the ensemble of points
xn , but also their detailed sequence. Mathematically, this connection is mediated by a
mapping
$$P:\ \mathbf{x}_n \to \mathbf{x}_{n+1} \quad\text{with}\quad \mathbf{x}_n,\ \mathbf{x}_{n+1} \in \Sigma. \tag{24.8}$$
This Poincaré mapping thus connects every point of the sequence x0, x1, x2, . . . with
its successor. Note that P has no index. It is a single mapping of the surface Σ onto itself,
which according to (24.8) is “scanned” at individual points. The individual points of
the Poincaré cut arise by successive iteration of the Poincaré mapping
$$\mathbf{x}_1 = P(\mathbf{x}_0), \quad \mathbf{x}_2 = P(\mathbf{x}_1) = P^2(\mathbf{x}_0), \quad \ldots, \quad \mathbf{x}_n = P^n(\mathbf{x}_0). \tag{24.9}$$
Hence, the long-term behavior of a trajectory may be derived from the properties of the
iterated Poincaré mapping Pⁿ, n → ∞. If the time evolution of the dynamic system is
determined by a differential equation ẋ = F(x, t), the Poincaré mapping is unique and
also reversible (possibly except for singular points), since trajectories are not allowed
to intersect each other.
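As a simple illustration of an iterated Poincaré mapping, one may take the van der Pol system (23.57), (23.58) in its two-dimensional phase space and use the half-line y = 0, x > 0 as the cut. The sketch below records the successive piercings in the direction of decreasing y. The choice of the cut, the integrator, the step size, and the linear interpolation at the crossing are assumptions made for this example.

```python
def vdp_rhs(x, y, alpha):
    """Right-hand sides of (23.57) and (23.58)."""
    return y, -x - alpha * (x * x - 1.0) * y

def rk4_step(x, y, alpha, dt):
    """One classical Runge-Kutta step of size dt."""
    k1x, k1y = vdp_rhs(x, y, alpha)
    k2x, k2y = vdp_rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y, alpha)
    k3x, k3y = vdp_rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y, alpha)
    k4x, k4y = vdp_rhs(x + dt * k3x, y + dt * k3y, alpha)
    return (x + dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0,
            y + dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0)

def poincare_points(x0, y0, alpha=0.1, dt=0.01, n_points=20, max_steps=1_000_000):
    """x-values of successive piercings of the half-line y = 0, x > 0 (y decreasing)."""
    x, y = x0, y0
    points = []
    for _ in range(max_steps):
        x_new, y_new = rk4_step(x, y, alpha, dt)
        if y > 0.0 >= y_new and x_new > 0.0:
            w = y / (y - y_new)               # linear interpolation weight
            points.append(x + w * (x_new - x))
            if len(points) == n_points:
                break
        x, y = x_new, y_new
    return points

if __name__ == "__main__":
    for n, xn in enumerate(poincare_points(0.5, 0.0)):
        print(n, xn)
```

Here the cut is one-dimensional, and the sequence xn produced by the iteration approaches the point where the limit cycle pierces the cut, i.e., a fixed point of the mapping P near x = 2.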
The problem of describing a dynamic system is of course not yet solved by defining
the Poincaré mapping but is only postponed, since P must also be constructed explic-
itly. In most cases this cannot be achieved analytically, and one is finally left with a
numerical integration of the differential equation of the system. It turns out, however,
that the exact Poincaré mapping P frequently has amazing common features with
very simply constructed analytic discrete mappings. As an example we shall discuss
the “logistic mapping” in Chap. 27.
Let us return to the problem of stability of periodic paths. As was outlined in the
preceding section, it is sufficient to investigate the small deviations ξ (t) = x(t) − xr (t)
from the reference trajectory in the linear approximation. In this approximation the
Poincaré mapping simplifies to a linear mapping, i.e., the multiplication by a ma-
trix C:
$$\boldsymbol{\xi}_{n+1} = C\,\boldsymbol{\xi}_n; \quad\text{hence,}\quad \boldsymbol{\xi}_n = C^n\,\boldsymbol{\xi}_0. \tag{24.10}$$
Decisive for the long-term behavior of the deviation ξ (t) are the eigenvalues
λ1 , . . . , λN of the matrix C. If all eigenvalues satisfy the condition |λi | < 1, the map-
ping is contracting, and the sequence converges toward zero. In this case the periodic
solution xr (t) is thus asymptotically stable. If at least one of the eigenvalues satisfies |λi| > 1,
the perturbations are increasing along the direction of the associated eigenvector, and
the path is unstable.
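For a two-dimensional phase space the criterion is easily evaluated. The following sketch computes the eigenvalues of a 2 × 2 matrix C from its trace and determinant and iterates the linear mapping (24.10). The sample matrices in the usage lines are hypothetical and serve only to exercise the three cases.

```python
import cmath

def multipliers(C):
    """Eigenvalues of a 2x2 matrix C = ((c11, c12), (c21, c22))."""
    (a, b), (c, d) = C
    half_trace = 0.5 * (a + d)
    root = cmath.sqrt(half_trace * half_trace - (a * d - b * c))
    return half_trace + root, half_trace - root

def stability(C):
    """Apply the criterion |lambda_i| < 1 for all i (stable) or > 1 for some i (unstable)."""
    largest = max(abs(lam) for lam in multipliers(C))
    if largest < 1.0:
        return "asymptotically stable"
    if largest > 1.0:
        return "unstable"
    return "marginal (linear analysis inconclusive)"

def iterate(C, xi0, n):
    """Return xi_n = C^n xi_0 by applying the linear mapping n times."""
    x, y = xi0
    for _ in range(n):
        x, y = C[0][0] * x + C[0][1] * y, C[1][0] * x + C[1][1] * y
    return x, y

if __name__ == "__main__":
    contracting = ((0.5, 1.0), (0.0, 0.8))
    expanding = ((0.5, 0.0), (0.0, 1.2))
    print(stability(contracting), iterate(contracting, (1.0, 1.0), 100))
    print(stability(expanding), iterate(expanding, (1.0, 1.0), 100))
```

Complex eigenvalues are handled as well, since only their magnitudes enter the criterion.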
In the subsequent example, the mathematical theory of the stability of periodic
solutions developed by Floquet will be presented in more detail.
EXAMPLE
24.1 Floquet’s Theory of Stability
As described at the beginning of this chapter, we are interested in the long-term be-
havior of the path deviations ξ (t) = x(t) − xr (t) which approximately obey a linear
differential equation
$$\frac{d}{dt}\,\boldsymbol{\xi}(t) = M(t)\,\boldsymbol{\xi}(t) \tag{24.11}$$
with a periodic matrix of coefficients M(t + T ) = M(t). Since we are dealing with
a linear problem, any solution may be expanded in terms of a fundamental system
of linearly independent basic solutions φ 1 (t), . . . , φ N (t). The basic solutions are not
uniquely determined, and for sake of clarity we choose them in such a way that at the
time t = 0 (we might choose also t = t0 ) they just coincide with the unit vectors in the
N -dimensional space:
$$\boldsymbol{\phi}_1(0) = (1, 0, \ldots, 0)^T \quad\text{to}\quad \boldsymbol{\phi}_N(0) = (0, 0, \ldots, 1)^T, \tag{24.12}$$
where the transposition symbol T indicates that these vectors shall be column vectors.
Geometrically all of these vectors are lying on a (hyper-) spherical surface of radius
unity. The superposition of a general solution ξ (t) reads

$$\boldsymbol{\xi}(t) = \sum_{i=1}^{N} c_i\,\boldsymbol{\phi}_i(t), \tag{24.13}$$
which may also be written in matrix form:
$$\boldsymbol{\xi}(t) = \Phi(t)\,\mathbf{c}. \tag{24.14}$$
Here c is a column vector formed out of the expansion coefficients, and Φ is an N × N
matrix containing one of the basic vectors in each column,
$$\Phi(t) = \left(\boldsymbol{\phi}_1(t), \ldots, \boldsymbol{\phi}_N(t)\right) \quad\text{and}\quad \mathbf{c} = (c_1, \ldots, c_N)^T. \tag{24.15}$$
Due to (24.12), the matrix Φ satisfies the initial condition
$$\Phi(0) = I. \tag{24.16}$$
How does the periodicity of the differential equation (24.11) manifest itself in the
matrix Φ? To see this, one should realize that any solution ξ of (24.11) at the time
t + T satisfies the same differential equation as at the time t. This does of course not
mean that the solution will be periodic; in general ξ(t + T) ≠ ξ(t). But it may be
expanded both in terms of the basic solutions φi(t + T) as well as in terms of the
φi(t),
$$\boldsymbol{\xi}(t+T) = \Phi(t+T)\,\mathbf{c} \quad\text{and also}\quad \boldsymbol{\xi}(t+T) = \Phi(t)\,\mathbf{c}'. \tag{24.17}$$
This implies a linear relation
$$\Phi(t+T) = \Phi(t)\,C. \tag{24.18}$$

The constant N × N matrix $\mathbf{C}$ is called the monodromy matrix. This quantity governs
how the solutions develop from one period to the next. Using the initial condition
(24.16), one may immediately read off the value of the monodromy matrix
from (24.18):
$$\mathbf{C} = \Phi(T). \tag{24.19}$$
Hence, the monodromy matrix may be calculated by integrating the differential equa-
tion (24.11) N times with distinct initial conditions over a period from 0 to T and
writing the resulting solution vectors $\boldsymbol{\phi}_i(T)$ into the columns. The evolution of the
matrix $\Phi(t)$ for arbitrarily large times is obtained by iteration of (24.18). For full
periods in particular, we have
$$\Phi(2T) = \Phi(T)\,\mathbf{C} = \Phi(0)\,\mathbf{C}\mathbf{C} = \mathbf{C}^2, \tag{24.20}$$
and generally,
$$\Phi(nT) = \Phi^n(T) = \mathbf{C}^n. \tag{24.21}$$
According to (24.14) and (24.21), the evolution of the solutions ξ (t) for large times
is thus determined by the powers of the monodromy matrix C. What happens thereby
may be read off from the N eigenvalues λi of this matrix, which are called character-
istic multipliers or Floquet multipliers,
$$\Phi(T)\,\mathbf{u}_i = \lambda_i(T)\,\mathbf{u}_i. \tag{24.22}$$
If $\mathbf{u}_i$ is an eigenvector of the matrix $\Phi(T)$, then it keeps this property for the iterated
mapping as well, e.g.,
$$\Phi(2T)\,\mathbf{u}_i = \Phi(T)\Phi(T)\,\mathbf{u}_i = \lambda_i(T)\,\Phi(T)\,\mathbf{u}_i = \lambda_i^2(T)\,\mathbf{u}_i = \lambda_i(2T)\,\mathbf{u}_i, \quad\text{etc.} \tag{24.23}$$
This leads to the following functional equation for the eigenvalues of the iterated
mapping:
$$\lambda_i(nT) = \lambda_i^n(T). \tag{24.24}$$
This behavior is characteristic for the exponential function; i.e., (24.24) is solved by
$$\lambda_i(T) = e^{\sigma_i T} \tag{24.25}$$
with an (in general complex) constant σi that is called the Floquet exponent. We still
note that (24.21) may be considered a functional equation like (24.24), with the same
kind of solution
$$\Phi(T) = e^{\mathbf{S}T}. \tag{24.26}$$
Here, a matrix S stands in the argument of the exponential function, and the resulting
function value is again a matrix. Such a matrix function is mathematically defined
simply through its power series expansion. One can show that the eigenvalues of the
matrix S introduced by (24.26) are just the Floquet exponents σi of (24.25). If one is
interested in the evolution matrix at any times (not only multiples of the period T),
then (24.26) still has to be generalized:
$$\Phi(t) = \mathbf{U}(t)\,e^{\mathbf{S}t} \quad\text{with}\quad \mathbf{U}(0) = \mathbf{I}. \tag{24.27}$$
The matrix $\mathbf{U}(t)$ may exhibit a complicated time dependence but must be periodic.
Because of (24.18), we have
$$\mathbf{U}(t + T)\,e^{\mathbf{S}(t+T)} = \Phi(t)\,\Phi(T) = \mathbf{U}(t)\,e^{\mathbf{S}t}\,e^{\mathbf{S}T}. \tag{24.28}$$
The product of the exponential functions on the right side may be combined to
$e^{\mathbf{S}t}\,e^{\mathbf{S}T} = e^{\mathbf{S}(t+T)}$ (for noncommuting matrices in the exponent this
would in general not be correct), and we find
$$\mathbf{U}(t + T) = \mathbf{U}(t). \tag{24.29}$$
Hence, for the long-term behavior of the solutions ξ (t), U does not play a role. This
behavior is only determined by the magnitude of the Floquet multipliers λi . Begin-
ning with an eigenvector ξ (0) = ui , this solution according to (24.24) will increase as
ξ (nT ) = ui exp(σi nT ). From that, we conclude the following: The trajectory xr (t) is
asymptotically stable if for all Floquet multipliers we have |λi | < 1; i.e., Re σi < 0. It
is unstable if for at least one eigenvalue we have |λi | > 1; i.e., Re σi > 0.
These statements, which were obtained by linearizing the equation of motion, trans-
fer also to the stability behavior of the nonlinear system. The limit of marginal stability
|λi | = 1 may be cleared up only by additional investigations.
For an autonomous periodically vibrating system, a peculiarity arises: In this case,
one of the eigenvalues always has the value λ = 1 and need not be considered in the
stability analysis. To prove this assertion, we consider the function $\dot{\mathbf{x}}_r(t)$; the mode
in question is the motion tangential to the reference orbit. For an
autonomous system, one obtains by differentiating the nonlinear equation of motion
$$\dot{\mathbf{x}}_r(t) = \mathbf{F}(\mathbf{x}_r) \;\longrightarrow\; \ddot{\mathbf{x}}_r(t) = \left.\frac{\partial \mathbf{F}}{\partial \mathbf{x}}\right|_{\mathbf{x}_r} \dot{\mathbf{x}}_r = \mathbf{M}(t)\,\dot{\mathbf{x}}_r, \tag{24.30}$$
which agrees with the linearized equation of motion (24.11). The time evolution of the
solutions of this differential equation is determined by the matrix $\Phi(t)$; hence,
$$\dot{\mathbf{x}}_r(t) = \Phi(t)\,\dot{\mathbf{x}}_r(0). \tag{24.31}$$
The reference orbit $\mathbf{x}_r(t)$ and therefore also its derivative $\dot{\mathbf{x}}_r(t)$ are however (contrary
to the case of general perturbations $\boldsymbol{\xi}(t)$) periodic; hence,
$$\dot{\mathbf{x}}_r(T) = \Phi(T)\,\dot{\mathbf{x}}_r(0) = \dot{\mathbf{x}}_r(0), \tag{24.32}$$
which proves that the monodromy matrix has an eigenvector, namely, ẋr (0), with the
eigenvalue λ = 1. This is vividly clear: A reference orbit that is shifted in the tangential
direction simply corresponds to a shift of the time coordinate t → t + δt. Since the
absolute value of the time does not play a role in autonomous systems, xr (t) and
x(t) = xr (t + δt) are always running with unchanged distance one behind the other.
Hence, the associated Floquet multiplier must have the value unity.
EXERCISE
24.2 Stability of a Limit Cycle
Problem. Let a nonlinear system be described by the following equation of motion:
$$\begin{aligned}
\dot{x} &= -y + x\,\bigl[\rho - (x^2 + y^2)\bigr],\\
\dot{y} &= \phantom{-}x + y\,\bigl[\rho - (x^2 + y^2)\bigr].
\end{aligned} \tag{24.33}$$
Investigate the stable solutions and find the Floquet multipliers of the limit cycle.
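Solution. A stationary fixed point exists at $\mathbf{x}_0 = (0, 0)^T$. Its stability is governed by
the Jacobi matrix (24.7)
$$\mathbf{M}(\mathbf{x}) = \frac{\partial \mathbf{F}}{\partial \mathbf{x}} = \begin{pmatrix} \rho - 3x^2 - y^2 & -1 - 2xy \\ 1 - 2xy & \rho - 3y^2 - x^2 \end{pmatrix}, \tag{24.34}$$
which at the fixed point $\mathbf{x}_0$ takes the form
$$\mathbf{M}(\mathbf{x}_0) = \begin{pmatrix} \rho & -1 \\ 1 & \rho \end{pmatrix}. \tag{24.35}$$
The two eigenvalues are according to Example 23.1:
$$\mu_{1/2} = s \pm \sqrt{s^2 - d} = \rho \pm \sqrt{\rho^2 - (\rho^2 + 1)} = \rho \pm i. \tag{24.36}$$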
Hence, for ρ < 0 one has a stable spiral and for ρ > 0 an unstable one; ρ = 0 repre-
sents the special case of a rotor.
By inspecting (24.33), one immediately finds a periodic solution for ρ > 0, since
for constant $x^2 + y^2 = \rho$ the system reduces to a harmonic oscillator:
$$\mathbf{x}_r(t) = \bigl(\sqrt{\rho}\,\cos t,\; \sqrt{\rho}\,\sin t\bigr)^T. \tag{24.37}$$
The Jacobi matrix (24.34) evaluated at the limit cycle (24.37) reads
$$\mathbf{M}(t) = \mathbf{M}(\mathbf{x}_r) = \begin{pmatrix} -2\rho\cos^2 t & -1 - 2\rho\sin t\cos t \\ 1 - 2\rho\sin t\cos t & -2\rho\sin^2 t \end{pmatrix}. \tag{24.38}$$
For the linearization (24.11) of the system (24.33) with this matrix $\mathbf{M}(t)$, the normalized
fundamental solutions (24.12) from Example 24.1 may be given explicitly. One finds
$$\boldsymbol{\phi}_1(t) = e^{-2\rho t} \begin{pmatrix} \cos t \\ \sin t \end{pmatrix} \quad\text{and}\quad \boldsymbol{\phi}_2(t) = \begin{pmatrix} -\sin t \\ \cos t \end{pmatrix}. \tag{24.39}$$
Combining these vectors to the matrix $\Phi(t)$ and evaluating at T = 2π leads to the
monodromy matrix
$$\mathbf{C} = \Phi(T) = \begin{pmatrix} e^{-4\pi\rho} & 0 \\ 0 & 1 \end{pmatrix}. \tag{24.40}$$
Hence, the initial values of the basic solutions (24.39) are already eigenvectors of the
monodromy matrix, with the eigenvalues
$$\lambda_1 = e^{-4\pi\rho} \quad\text{and}\quad \lambda_2 = 1. \tag{24.41}$$

As expected, one of the Floquet multipliers has the value unity (the corresponding
eigensolution $\boldsymbol{\phi}_2(t)$ is tangential to $\mathbf{x}_r(t)$). The value $\lambda_1$ determines the stability of the
limit cycle: It is asymptotically stable, since $\lambda_1 < 1$ for ρ > 0. For ρ < 0, no limit
cycle exists.
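The recipe from Example 24.1 (integrate the linearized equation over one period, once for each unit vector, and write the results into the columns) can be tried out numerically on this exercise. The following is only a sketch: the Runge-Kutta integrator, the step count, the choice ρ = 0.5, and all function names are assumptions made for illustration, not part of the text.

```python
import math

def jacobian_on_cycle(t, rho):
    """Matrix M(t) of (24.38), i.e. the Jacobi matrix on the limit cycle (24.37)."""
    c, s = math.cos(t), math.sin(t)
    return [[-2.0 * rho * c * c, -1.0 - 2.0 * rho * s * c],
            [1.0 - 2.0 * rho * s * c, -2.0 * rho * s * s]]

def rhs(t, xi, rho):
    """Right-hand side M(t) xi of the linearized equation (24.11)."""
    m = jacobian_on_cycle(t, rho)
    return [m[0][0] * xi[0] + m[0][1] * xi[1],
            m[1][0] * xi[0] + m[1][1] * xi[1]]

def rk4_step(t, xi, h, rho):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = rhs(t, xi, rho)
    k2 = rhs(t + h / 2, [xi[i] + h / 2 * k1[i] for i in range(2)], rho)
    k3 = rhs(t + h / 2, [xi[i] + h / 2 * k2[i] for i in range(2)], rho)
    k4 = rhs(t + h, [xi[i] + h * k3[i] for i in range(2)], rho)
    return [xi[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) for i in range(2)]

def monodromy_matrix(rho, n_steps=4000):
    """Integrate both unit vectors over T = 2*pi; the results form the columns of C."""
    h = 2.0 * math.pi / n_steps
    columns = []
    for start in ([1.0, 0.0], [0.0, 1.0]):
        xi = list(start)
        for k in range(n_steps):
            xi = rk4_step(k * h, xi, h, rho)
        columns.append(xi)
    # columns[j] holds phi_j(T); arrange it as column j of the matrix.
    return [[columns[0][0], columns[1][0]],
            [columns[0][1], columns[1][1]]]

def floquet_multipliers(c):
    """Eigenvalues of a 2x2 matrix from trace and determinant (real eigenvalues assumed)."""
    half_trace = 0.5 * (c[0][0] + c[1][1])
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    root = math.sqrt(max(half_trace * half_trace - det, 0.0))
    return sorted([half_trace - root, half_trace + root])

rho = 0.5  # illustrative value, any rho > 0 has a limit cycle
C = monodromy_matrix(rho)
multipliers = floquet_multipliers(C)
print("numerical multipliers:", multipliers)
print("analytic (24.41):     ", [math.exp(-4.0 * math.pi * rho), 1.0])
```

The two printed lists should agree up to the integration error; the multiplier close to unity belongs to the tangential mode discussed at the end of Example 24.1.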
The nonlinear system (24.33) is so simple that it allows also a closed analytic
solution. We change to polar coordinates x = r cos ϕ, y = r sin ϕ. The differential
equations (24.33) lead to the decoupled system
$$\dot{r} = r(\rho - r^2), \qquad \dot{\varphi} = 1. \tag{24.42}$$
Hence, the angle simply increases linearly with time, $\varphi(t) = t + \varphi_0$. The radial
equation may be integrated as follows:
$$\int_{r_0}^{r} \frac{dr}{r(\rho - r^2)} = \int_0^t dt \;\longrightarrow\; \frac{1}{2\rho}\,\ln\frac{r^2}{\rho - r^2}\,\bigg|_{r_0}^{r} = t, \tag{24.43}$$
or solved for r
$$r(t) = \frac{\sqrt{\rho}}{\sqrt{\left(\dfrac{\rho}{r_0^2} - 1\right) e^{-2\rho t} + 1}}. \tag{24.44}$$
For ρ > 0, the solution asymptotically approaches the limit cycle, $r(t) \to \sqrt{\rho}$. For
ρ < 0,
$$r(t) = \frac{\sqrt{|\rho|}}{\sqrt{\left(\dfrac{|\rho|}{r_0^2} + 1\right) e^{2|\rho| t} - 1}}; \tag{24.45}$$
any trajectory asymptotically spirals to the stable fixed point r = 0.
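The closed solution (24.44) can be compared with a direct numerical integration of (24.33). The sketch below assumes a classical Runge-Kutta scheme with a fixed step; the function names, step count, and parameter values are illustrative choices and not taken from the text.

```python
import math

def flow(state, rho):
    """Right-hand side of the nonlinear system (24.33)."""
    x, y = state
    radial = rho - (x * x + y * y)
    return [-y + x * radial, x + y * radial]

def rk4_step(state, h, rho):
    """One classical fourth-order Runge-Kutta step (the system is autonomous)."""
    k1 = flow(state, rho)
    k2 = flow([state[i] + h / 2 * k1[i] for i in range(2)], rho)
    k3 = flow([state[i] + h / 2 * k2[i] for i in range(2)], rho)
    k4 = flow([state[i] + h * k3[i] for i in range(2)], rho)
    return [state[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) for i in range(2)]

def radius_numeric(r0, rho, t_end, n_steps=20000):
    """Integrate from (r0, 0) up to t_end and return the final distance from the origin."""
    state = [r0, 0.0]
    h = t_end / n_steps
    for _ in range(n_steps):
        state = rk4_step(state, h, rho)
    return math.hypot(state[0], state[1])

def radius_closed_form(r0, rho, t):
    """Formula (24.44); for rho < 0 it is algebraically the same as (24.45)."""
    return math.sqrt(rho / ((rho / r0**2 - 1.0) * math.exp(-2.0 * rho * t) + 1.0))

# rho > 0: approach to the limit cycle of radius sqrt(rho)
print(radius_numeric(0.1, 1.0, 10.0), radius_closed_form(0.1, 1.0, 10.0))
# rho < 0: decay toward the fixed point at the origin
print(radius_numeric(1.0, -0.5, 5.0), radius_closed_form(1.0, -0.5, 5.0))
```

Starting inside or outside the circle $r = \sqrt{\rho}$ only changes the sign of the bracket in (24.44); in both cases the radius relaxes to the same limit cycle.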

25 Bifurcations
As a rule, the behavior of dynamical systems is influenced by the value of one or
several control parameters μ. These may be the strength of an interaction, the magnitude
of friction, the amplitude and frequency of a periodic perturbation, or many other
quantities. One then frequently observes that the long-term behavior of the trajectories
changes qualitatively when the parameter passes a critical value μc. For
example, instead of a stable equilibrium position, suddenly two such positions may
occur, or a system initially at rest may begin to oscillate. The phenomenon of additionally
arising solutions or of solutions that suddenly change their character is called
branching or bifurcation; the value μc is called the branching value.
The general theory of bifurcations is difficult and not yet completely worked out
mathematically. Here we shall confine ourselves to easily tractable but important
cases. This means for the present that we consider only local bifurcations, where the
behavior of the system in the neighborhood of an equilibrium solution is changing.
There exist also global bifurcations; here the topological structure of the solutions
is modified “on a large scale,” e.g., the shape of the ranges of attraction of attractors.
Furthermore, we consider only the one-dimensional case, which means the bifurcation
shall arise when varying a single control parameter. (The phase space of the dynamical
system, however, may be multidimensional.) Even under these restricting conditions a
series of various types of bifurcations are possible. In the following, these types shall
be classified and illustrated by simple examples.
25.1 Static Bifurcations
For the present, we concentrate on the stability of stationary fixed points x0 character-
ized by
ẋ = F(x0 , μ) = 0. (25.1)
This may be interpreted as an implicit equation for the position of the fixed point de-
pending on the parameter μ, x0 = x0 (μ). A premise for the existence and continuity of
this function is, according to the theorem from analysis on implicit functions, a non-
singular Jacobi matrix M = ∂F/∂x|x0 . A discontinuous behavior, thus bifurcations,
may therefore be expected if the determinant of M vanishes; this means if one of the
eigenvalues of this matrix depending on μ takes the value zero. The meaning of the
eigenvalues of the Jacobi matrix for stability has been discussed in Chap. 24.
We now consider the typical cases in the simplest possible form, namely for a
one-dimensional system. Without restriction of generality the fixed point is set to x0 = 0,
and let the branching value be μc = 0. This may always be achieved by appropriate
coordinate transformations. Moreover, the one-dimensional bifurcation may be
embedded into a higher-dimensional space.
(a) The saddle-node branching: Saddle-node branching occurs in the dynamical
system
$$\dot{x} = F(x, \mu) = \mu - x^2. \tag{25.2}$$
This system has fixed points at
$$x_{01} = +\sqrt{\mu}, \qquad x_{02} = -\sqrt{\mu}. \tag{25.3}$$
But, since x is real, there is no fixed point for negative μ. If μ passes the critical
value μc = 0, the number of fixed points jumps from 0 to 2. The lower fixed point is
however unstable, as is shown by the linear stability analysis according to Chap. 23.
The eigenvalue γ of the Jacobi “matrix”
$$M = \left.\frac{\partial F}{\partial x}\right|_{x_0} = -2x_0, \quad\text{thus,}\quad \gamma = -2x_0, \tag{25.4}$$
is positive for the solution $x_{02} = -\sqrt{\mu}$. Figure 25.1 shows the stable (solid
curve) and unstable (dashed curve) fixed points as a function of the control parameter
μ. The arrows indicate the direction of motion, which may be immediately read off
from (25.2). One may state that a stable and an unstable solution meet each other at
the critical point and annihilate each other.
Fig. 25.1. Stable (solid) and unstable (dashed) fixed points for the saddle-node branching

We still have to clarify the origin of the name saddle-node branching. For this purpose,
the branching is embedded in a two-dimensional space, which may be achieved,
e.g., by
$$\begin{aligned}
\dot{x} &= \mu - x^2,\\
\dot{y} &= -y.
\end{aligned} \tag{25.5}$$
The variables x and y are decoupled, and the solutions y(t) tend asymptotically to
zero. The fixed points are $\mathbf{x}_{01} = (+\sqrt{\mu}, 0)$ and $\mathbf{x}_{02} = (-\sqrt{\mu}, 0)$. The Jacobi matrix
has the form
$$\mathbf{M} = \begin{pmatrix} \partial F_1/\partial x & \partial F_1/\partial y \\ \partial F_2/\partial x & \partial F_2/\partial y \end{pmatrix}_{\mathbf{x}_0} = \begin{pmatrix} -2x & 0 \\ 0 & -1 \end{pmatrix}_{\mathbf{x}_0} \tag{25.6}$$
and is, of course, diagonal, with the eigenvalues $\gamma_1 = -2x_0$, $\gamma_2 = -1$. The two eigenvalues
for the solution $\mathbf{x}_{02}$ are $2\sqrt{\mu}$ and −1; i.e., they have distinct signs, which,
according to the nomenclature from Example 23.1, corresponds to a saddle point.
For $x_0 = +\sqrt{\mu}$, both eigenvalues are negative, and there arises a stable node. Saddle
and node coalesce with each other at the critical point. The distinct dynamical flows
at μ < 0, μ = 0, μ > 0, are represented in Fig. 25.2. One clearly notes that after coa-
lescing of saddle and node, there is no longer a fixed point and thus the flow continues
to infinity (left diagram).
Fig. 25.2. The dynamical flow at a saddle-node branching which is embedded in a
two-dimensional space
(b) The pitchfork branching: The simplest example of this kind of branching arises
if one uses a cubic polynomial for F(x, μ):
$$\dot{x} = F(x, \mu) = \mu x - x^3. \tag{25.7}$$
Since this polynomial originates from (25.2) by multiplying by the factor x, a zero as
solution simply adds to the former fixed points,
$$x_{01} = +\sqrt{\mu}, \qquad x_{02} = -\sqrt{\mu}, \qquad x_{03} = 0. \tag{25.8}$$
At the critical point μc = 0, the number of fixed points therefore jumps from 1 to 3,
whereby one of the latter solutions turns out to be unstable. The Jacobi matrix
$$M = \left.\frac{\partial F}{\partial x}\right|_{x_0} = \mu - 3x_0^2 \tag{25.9}$$
has for the three fixed points the eigenvalues
$$\gamma_1 = \gamma_2 = -2\mu, \qquad \gamma_3 = \mu. \tag{25.10}$$
Thus, the solution $x_{03}$ is stable below μc, but loses this property at the critical point
to the two branches $x_{01}$ and $x_{02}$ of the “fork,” as is represented in Fig. 25.3. As
compared with Fig. 25.1, the arrows representing the direction of flow have changed their
orientation in the lower half-plane x < 0. This is obvious because F(x, μ) contains
the additional factor x.
Fig. 25.3. Stable (solid) and
unstable (dashed) fixed points
of the pitchfork branching:
(a) A supercritical branching.
(b) A subcritical branching
The pitchfork bifurcation may still arise in a second version, with the stability
properties exactly inverted. An example for that is
$$\dot{x} = F(x, \mu) = \mu x + x^3, \tag{25.11}$$
which differs from (25.7) by the sign of the cubic term. Since the flow direction and the
signs of the eigenvalues are inverted, the branching diagram changes as represented in
Fig. 25.3(b). In this case, there remains only a single stable branch. The pitchfork
bifurcation of Fig. 25.3(a) is supercritical, and that of Fig. 25.3(b) is subcritical. The two
remaining combinations of signs of the linear and cubic term in F(x, μ) do not yield any
qualitatively new features; they just correspond to the reflection μ → −μ.
(c) The transcritical branching: In this case, we consider a polynomial with linear
and quadratic term
$$\dot{x} = F(x, \mu) = \mu x - x^2 \tag{25.12}$$
with the fixed points
$$x_{01} = \mu, \qquad x_{02} = 0. \tag{25.13}$$
Thus, there always exist two fixed points in the entire parameter space. The eigenvalues
of the stability matrix are
$$\gamma_1 = -\mu, \qquad \gamma_2 = \mu. \tag{25.14}$$
In each case, one of the solutions is stable, the other one is unstable. At the branching
point the two solutions change their roles; see Fig. 25.4.

Fig. 25.4. Stable (solid) and unstable (dashed) fixed points of the transcritical branching
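The fixed points and eigenvalues of the three one-dimensional normal forms (25.2), (25.7), and (25.12) can be tabulated with a few lines of code. This is only an illustrative sketch; the function names and the sample values μ = ±1 are assumptions, and the stability label simply reads off the sign of γ = ∂F/∂x at the fixed point.

```python
import math

def saddle_node(mu):
    """Fixed points of (25.2), x' = mu - x^2, paired with gamma = -2 x0."""
    if mu < 0:
        return []  # no real fixed point below the branching value
    roots = [math.sqrt(mu), -math.sqrt(mu)] if mu > 0 else [0.0]
    return [(x0, -2.0 * x0) for x0 in roots]

def pitchfork(mu):
    """Fixed points of (25.7), x' = mu x - x^3, paired with gamma = mu - 3 x0^2."""
    roots = [0.0]
    if mu > 0:
        roots += [math.sqrt(mu), -math.sqrt(mu)]
    return [(x0, mu - 3.0 * x0 * x0) for x0 in roots]

def transcritical(mu):
    """Fixed points of (25.12), x' = mu x - x^2, paired with gamma = mu - 2 x0."""
    roots = [0.0] if mu == 0 else [0.0, mu]
    return [(x0, mu - 2.0 * x0) for x0 in roots]

def label(gamma):
    """Linear stability: negative eigenvalue means stable, positive means unstable."""
    if gamma < 0:
        return "stable"
    if gamma > 0:
        return "unstable"
    return "marginal"

systems = [("saddle-node", saddle_node),
           ("pitchfork", pitchfork),
           ("transcritical", transcritical)]
for name, system in systems:
    for mu in (-1.0, 1.0):
        summary = [(round(x0, 3), label(g)) for x0, g in system(mu)]
        print(f"{name:14s} mu={mu:+.1f}: {summary}")
```

Scanning μ over a fine grid instead of the two sample values reproduces the solid and dashed branches of the bifurcation diagrams.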
(d) The Hopf branching: The branchings considered so far were characterized by
the fact that a real eigenvalue of the Jacobi matrix takes the value zero when varying
the control parameter. However, branchings may also occur for complex eigenvalues.
Since complex eigenvalues of a real matrix always occur in complex-conjugate pairs, the
system must have at least the dimension 2. As an example, we consider the system of
equations
$$\begin{aligned}
\dot{x} &= -y + x\,\bigl[\mu - (x^2 + y^2)\bigr],\\
\dot{y} &= \phantom{-}x + y\,\bigl[\mu - (x^2 + y^2)\bigr].
\end{aligned} \tag{25.15}$$
This system has been investigated already in Exercise 24.2 in the context of the
stability of limit cycles. For all values of μ the origin is a fixed point, $\mathbf{x}_0 = (0, 0)$. At this
position the Jacobi matrix has the value
$$\mathbf{M}(\mathbf{x}_0) = \begin{pmatrix} \mu & -1 \\ 1 & \mu \end{pmatrix} \tag{25.16}$$
with the eigenvalues $\gamma_1 = \mu + i$, $\gamma_2 = \mu - i$. This means that the fixed point for
μ < 0 is a stable spiral, and for μ > 0 an unstable spiral. The fate of the stable
solution at the critical point differs from that in the cases considered so far. From
the stationary attractor $\mathbf{x}_0$ there evolves for μ > 0 a periodically oscillating solution
$\mathbf{x}_r(t) = (\sqrt{\mu}\,\cos t,\; \sqrt{\mu}\,\sin t)$, which turns out to be a stable limit cycle.
As is seen from Fig. 25.5, the bifurcation diagram resembles that of the pitchfork
branching, Fig. 25.3. Actually, the system of (25.15) may be decoupled by changing
to polar coordinates x = r cos φ, y = r sin φ. The equation of motion for the radius
r(t) then exactly coincides with (25.7); see (24.42) in Exercise 24.2. The new state is,
however, independent of time only in a “co-rotating” coordinate system, since
the phase angle increases linearly, φ(t) = t. In the original phase space the behavior
of the attractor therefore changes qualitatively: A static solution becomes a dynamic
solution. Branchings of this kind were first investigated in a systematic way¹ in 1942
by the mathematician Eberhard Hopf.²

Fig. 25.5. For a Hopf branching, a fixed point converts into a limit cycle
As for the pitchfork branching, there are also two versions of the Hopf branching.
In addition to the supercritical case represented here, there exists also a subcritical
Hopf bifurcation with the opposite stability properties. In practice this version is of
less interest, due to the fact that no stable limit cycle arises as an attractor.
One might consider the branchings presented in (a) to (d) as particularly simple
special cases. But it has been shown under rather general premises that the change of
stability (characterized by the zero passage of eigenvalues of the Jacobi matrix) for
variation of a single control parameter always proceeds according to one of these four
scenarios. We cannot deal here with the more detailed mathematical foundation, but
refer for an introduction, e.g., to G. Faust, M. Haase and J. Argyris, Die Erforschung
des Chaos, Vieweg, 1995, Chap. 6.³
25.2 Bifurcations of Time-Dependent Solutions
It may also happen for periodic trajectories that their character changes stepwise un-
der variation of a parameter μ. We shall only briefly touch on the bifurcation theory
of periodic solutions and vividly illustrate some interesting aspects. The mathematical
tool is the Poincaré mapping introduced in Chap. 24. A periodic orbit xr (t) is charac-
terized by a fixed point xr0 in the Poincaré cut. The discretization of the neighboring
path x(t) consists of a sequence of points x0 , x1 , x2 , . . . . For the distance between the
two orbits, we have according to (24.10)
$$\mathbf{x}_n - \mathbf{x}_{r0} = \mathbf{C}^n\,(\mathbf{x}_0 - \mathbf{x}_{r0}), \tag{25.17}$$
1 E. Hopf, Abh. der Sächs. Akad. der Wiss., Math. Naturwiss. Klasse 94, 1 (1942).
2 Eberhard Friedrich Ferdinand Hopf, b. April 17, 1902, Salzburg–d. July 24, 1983, Bloomington,
Indiana. Hopf studied mathematics in Berlin and taught at MIT (1932 to 1936), and at the universities
of Leipzig (1936 to 1944) and Munich (1944 to 1948), as well as at Indiana University, Bloomington
(from 1948). His main research fields were differential and integral equations, variational calculus,
ergodic theory, and celestial mechanics.
3 This was also translated into English and published as G. Faust, M. Haase and J. Argyris, An
Exploration of Chaos, North-Holland (1994).
where the matrix C represents the linearized approximation of the Poincaré map-
ping P . As long as all eigenvalues λi of C fall into the complex unit circle, |λi | < 1,
the neighboring orbits are attracted and the solution xr (t) is a stable limit cycle.
The approach xn → xr0 may proceed in different ways, as will be illustrated for a
selected eigenvalue, say, λ1 . If the eigenvalue is real and positive, 0 < λ1 < 1, the xn
approach the fixed point xr0 monotonically, as is shown in Fig. 25.6(a). The axis in
this diagram corresponds to the direction of the eigenvector of C belonging to λ1 . If
the eigenvalue is real and negative, −1 < λ1 < 0, the xn form an alternating sequence;
see Fig. 25.6(b). Finally, for a pair of complex eigenvalues, λ1 = λ∗2 , |λ1 | < 1, the xn
lie on a spiral converging toward xr0 ; see Fig. 25.6(c).
Fig. 25.6. Various kinds of approaches to a fixed point
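The three kinds of approach can be reproduced by iterating (25.17) along a single eigendirection. In the sketch below the deviation from the fixed point is a plain number (real eigenvalue) or a complex number whose real and imaginary parts stand for the two coordinates in the plane belonging to a complex-conjugate pair; the sample eigenvalues and function names are assumptions chosen for illustration.

```python
def iterate_real(lam, d0=1.0, n=6):
    """Deviations d_n = lam**n * d0 along one real eigenvector of C, cf. (25.17)."""
    return [d0 * lam**k for k in range(n)]

def iterate_complex(lam, d0=1.0 + 0.0j, n=6):
    """Each step scales the deviation by |lam| and rotates it by arg(lam)."""
    return [d0 * lam**k for k in range(n)]

monotone = iterate_real(0.5)                  # 0 < lambda_1 < 1
alternating = iterate_real(-0.5)              # -1 < lambda_1 < 0
spiral = iterate_complex(0.5 * (0.6 + 0.8j))  # |lambda_1| = 0.5, rotation by arg(0.6 + 0.8i)

print("monotone:   ", monotone)
print("alternating:", alternating)
print("spiral radii:", [abs(z) for z in spiral])
```

In all three cases the distance from the fixed point shrinks by the factor |λ1| per iteration; only the way the points are arranged around the fixed point differs.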
Branchings occur if an eigenvalue λ leaves the unit circle at a critical value of the
control parameter μc . One distinguishes three possible cases:
(a) λ1 = +1.
According to (25.17), in this case the distance of a neighboring trajectory from the
reference trajectory does not change. (A deviation along the direction of the eigen-
vector ξ1 of the matrix C is multiplied by λ1 = 1 for each cycle.) This indicates that
for μ > μc , new limit cycles may arise that have the same period length T as the ref-
erence orbit xr . Similar to the bifurcation of a stable fixed point, a periodic solution
may also undergo a pitchfork bifurcation and split into two separated solutions. This is
sketched in Fig. 25.7(a). The other bifurcations from the last section are also possible.
Which of these cases is actually realized cannot be read off from the criterion λ1 = +1
alone.
(b) |λ1 | = |λ∗2 | = 1.
In this case as well, the sequence of points xn of the Poincaré mapping remains
at a constant distance from xr0 but now rotates on a circle. In the Poincaré cut itself,
a limit cycle evolves. This means geometrically that the topology of the periodic
orbit changes: It now lies on a torus which envelops the originally closed orbit, as is
shown in Fig. 25.7(b). If the two circulation frequencies on the torus mantle are
incommensurable, i.e., do not form a rational ratio p/q, p, q ∈ N, then one speaks of a
quasiperiodic motion, since the orbit never closes, even for infinitely large times. It thereby
approaches any point on the torus surface to arbitrarily close distance.

Fig. 25.7. Typical branchings of a periodic trajectory (dashed)
(c) λ1 = −1.
This case is of particular interest, since the distance of the points xn remains con-
stant, but the direction is alternating. This means that the neighboring orbit x after each
second passage returns to its old position. There arises a periodic orbit with twice the
period length 2T , as is indicated in Fig. 25.7(c). The phenomenon is therefore called a
period-doubling or subharmonic bifurcation. Bifurcations of this kind play an impor-
tant role in the transition from periodic to chaotic motion. An explicit example will be
discussed in Example 27.1 in the context of logistic mapping.
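A period doubling can already be seen in a few lines of code. The sketch below uses the logistic mapping in the common form $x_{n+1} = r\,x_n(1 - x_n)$; this explicit form, the parameter name r, and the sample values are assumptions here, since the text only refers to the mapping by name. The fixed point $x^* = 1 - 1/r$ has the multiplier $f'(x^*) = 2 - r$, which passes through −1 at r = 3.

```python
def logistic(x, r):
    """Assumed form of the logistic mapping, x -> r x (1 - x)."""
    return r * x * (1.0 - x)

def settled_orbit(r, x0=0.3, burn_in=5000, keep=4):
    """Discard a transient, then return a few successive points of the orbit."""
    x = x0
    for _ in range(burn_in):
        x = logistic(x, r)
    orbit = []
    for _ in range(keep):
        orbit.append(x)
        x = logistic(x, r)
    return orbit

def fixed_point_multiplier(r):
    """Derivative f'(x*) = r (1 - 2 x*) at the fixed point x* = 1 - 1/r."""
    x_star = 1.0 - 1.0 / r
    return r * (1.0 - 2.0 * x_star)

for r in (2.9, 3.2):
    print(f"r={r}: multiplier={fixed_point_multiplier(r):+.2f}, orbit={settled_orbit(r)}")
```

Below r = 3 the printed orbit repeats a single value; above it the orbit alternates between two values, i.e., it returns to its old position only after every second iteration.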
26 Lyapunov Exponents and Chaos
In Chap. 24, the concept of stability of time-dependent orbits was discussed, and in
Example 24.1, Floquet’s theory of stability, which may be applied to periodic paths,
was explained in more detail. Building on the works of Floquet and Poincaré, the
Russian mathematician Lyapunov¹ published in 1892 an even more general study of
the stability problem in which arbitrary and also nonperiodic motions were admitted.
The characteristic exponents introduced by Lyapunov have played a central role in the
theory of nonlinear systems.
26.1 One-Dimensional Systems
We first consider the case of a one-dimensional discrete mapping
xn+1 = f (xn ). (26.1)
Physically, this may be for example the Poincaré mapping of a dynamical system. We
now ask, how does the point sequence x0 , x1 , x2 , . . . differ from the point sequence
x̃0 , x̃1 , x̃2 , . . . that evolves from a slightly modified initial condition x̃0 = x0 + δx0 ?
We have, in general,
$$\tilde{x}_n = x_n + \delta x_n = f(x_{n-1} + \delta x_{n-1}) = f(x_{n-1}) + f'(x_{n-1})\,\delta x_{n-1} + \cdots; \tag{26.2}$$
thus, the deviation in the nth step in linear approximation is
$$\delta x_n = f'(x_{n-1})\,\delta x_{n-1}. \tag{26.3}$$
This equation can be applied n times recursively, with the result
$$\delta x_n = f'(x_{n-1})\,f'(x_{n-2}) \cdots f'(x_0)\,\delta x_0 \tag{26.4}$$
$$\phantom{\delta x_n} = \Bigl(\prod_{l=0}^{n-1} f'(x_l)\Bigr)\,\delta x_0. \tag{26.5}$$
1 Alexander Mikhailovich Lyapunov, Russian mathematician, b. June 6, 1857, Yaroslavl–d. November
3, 1918, Odessa. Lyapunov was a student of Chebyshev. He investigated the stability of
equilibrium and the motion of mechanical systems and the stability of rotating liquids. Lyapunov also
worked in the fields of potential theory and probability theory.

Obviously, the values $f'(x_l)$ are a measure of how fast the neighboring solutions $x_n$
and $\tilde{x}_n$ go away from each other (or move towards each other). In the special case of
a periodic motion which was studied by Floquet the point sequence $x_l$ (interpreted as
a Poincaré mapping) is constant and all factors in (26.4) have the same value. This
yields the exponential relation
$$|\delta x_n| = |f'(x_0)|^n\,|\delta x_0| = e^{n\sigma}\,|\delta x_0| \quad\text{with}\quad \sigma = \ln|f'(x_0)|, \tag{26.6}$$
compare (24.25) in Example 24.1. (A difference in the definition of σ is that in (26.6)
one considers magnitudes. This implies that σ here is always real.) The characteristic
exponent σ determines the “growth rate” of a perturbation. Even if the points $x_l$ differ
from each other (nonperiodic orbit), the exponent σ may be further used by forming
a mean value. The mathematical definition of the Lyapunov exponent of a point
sequence $x_n$ is
$$\sigma = \lim_{n\to\infty}\,\lim_{\delta x_0 \to 0}\,\frac{1}{n}\,\ln\left|\frac{\delta x_n}{\delta x_0}\right|. \tag{26.7}$$
Using (26.4) and the multiplication rule for the logarithm, we can also write this as
$$\sigma = \lim_{n\to\infty}\,\frac{1}{n}\sum_{l=0}^{n-1} \ln|f'(x_l)|. \tag{26.8}$$
If all $x_l$ are equal, the quantity σ of (26.7) or (26.8) obviously reduces to the special
case (26.6).
The Lyapunov exponent is a logarithmic measure for the mean expansion rate per
iteration (i.e., per unit time) of the distance between two infinitesimally close trajec-
tories.
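The mean value (26.8) translates directly into a short program. The sketch below applies it to the logistic mapping $x_{n+1} = r\,x_n(1 - x_n)$, whose explicit form is an assumption here (the text only mentions the mapping by name); the function names, the transient length, and the floor that guards against ln 0 are likewise illustrative choices. For r = 2.5 the orbit settles on a fixed point with $f'(x^*) = -0.5$, so (26.6) gives σ = ln 0.5; for r = 4 the exponent is known to be ln 2.

```python
import math

def lyapunov_exponent(f, df, x0, n=100000, burn_in=1000):
    """Finite-n estimate of (26.8): the mean of ln|f'(x_l)| along the orbit of x0."""
    x = x0
    for _ in range(burn_in):      # let transients die out first
        x = f(x)
    total = 0.0
    for _ in range(n):
        # The tiny floor only protects against ln(0) if the orbit hits f'(x) = 0 exactly.
        total += math.log(max(abs(df(x)), 1e-300))
        x = f(x)
    return total / n

def logistic(r):
    """Return the assumed logistic mapping and its derivative for parameter r."""
    return (lambda x: r * x * (1.0 - x)), (lambda x: r * (1.0 - 2.0 * x))

sigma_stable = lyapunov_exponent(*logistic(2.5), x0=0.2)
sigma_chaotic = lyapunov_exponent(*logistic(4.0), x0=0.2)
print("r = 2.5:", sigma_stable, " (ln 0.5 =", math.log(0.5), ")")
print("r = 4.0:", sigma_chaotic, " (ln 2 =", math.log(2.0), ")")
```

A finite sum can only approximate the limit n → ∞, so the chaotic value fluctuates slightly with the chosen n and starting point.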
The case σ > 0 is of particular interest. A dynamical system with a positive Lya-
punov exponent is called chaotic. The paths of such a system are extremely sensitive
to changes of the initial conditions. Because of the exponential dependence there is
no need for long waiting, and an initially small deviation δx0 explodes to arbitrary
magnitude. More strictly speaking, it is sufficient that the product nσ be a number not
very much larger than unity; compare (26.6).
This property of chaotic systems is very significant, both practically as well as con-
ceptually. The behavior of a chaotic system is not predictable, at least not over a long
time period. Since physical quantities can always be determined only with a limited
precision, δx0 inevitably has a value that differs from zero. Therefore, it is hopeless
to aim at predicting the state of a chaotic system for times that are significantly larger
than 1/σ . The attempt to reach that by more and more precise fixing of the initial
conditions is doomed to failure. Ultimately, the exponential increase of the deviation
will always win.
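A rough estimate makes this concrete. The numbers below (σ = ln 2 per iteration, an initial uncertainty of $10^{-10}$, and a tolerated deviation Δ = 1) are assumed purely for illustration; the relation itself follows from (26.6).

```latex
% From (26.6): |\delta x_n| \approx e^{n\sigma} |\delta x_0| reaches the tolerance \Delta after
n^{*} \approx \frac{1}{\sigma}\,\ln\frac{\Delta}{|\delta x_0|} .
% Assumed values: \sigma = \ln 2, \; |\delta x_0| = 10^{-10}, \; \Delta = 1
n^{*} \approx \frac{10\,\ln 10}{\ln 2} \approx 33 .
% Improving the initial precision a thousandfold, |\delta x_0| = 10^{-13}:
n^{*} \approx \frac{13\,\ln 10}{\ln 2} \approx 43 .
```

A thousandfold gain in the precision of the initial condition thus extends the prediction by only about ten iterations, because the horizon grows merely logarithmically with the precision.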
The fascinating point is that this effect arises in a completely deterministic system.
The dynamics of such a system is “in principle” mathematically uniquely fixed by
the basic equation of motion—be it a differential equation or a discrete mapping as
in (26.1). Nevertheless, an exact knowledge of the equation of motion cannot help in
the attempt to find the solutions. In this way a new kind of uncertainty is brought into
physics, in addition to the more familiar sources of the statistical fluctuations (noise)
and the quantum fluctuations (uncertainty relation). The first to clearly understand and
state this phenomenon was Henri Poincaré toward the end of the nineteenth century.
Considering the rapid development of other branches of physics, nonlinear dynamics
has lived in the shadows for a long time. The current interest in the chaos theory
has been inspired significantly by the proliferation of electronic computing facilities,
which allow for exploring the dynamics of systems resisting the analytic treatment.
An important impetus for dealing with chaotic systems was given by the meteorologist
Edward N. Lorenz, whose work² received little attention for a long time. In 1963,
he derived from the basic equations of hydrodynamics, under strongly simplifying
assumptions, a system of three coupled nonlinear differential equations for describing
a meteorological system. When working on their numerical solution, he discovered
the signature of deterministic chaos: nonpredictability due to sensitive dependence on
the initial conditions. Nobody will be surprised that meteorology offers an appropriate
example for that phenomenon. After all, even with massive computer effort the weather
can be predicted reliably at most a few days in advance. The attempt to extend this
time span requires an exponentially increasing effort.
Lorenz has also coined the concept of the “butterfly effect” in this context, which
entered public awareness: The flapping of wings of a butterfly in Brazil decides
whether Texas is hit by a tornado some days later. Of course one should not just flog
this picture to death. Even all butterflies together cannot change the structure of the
“climatic attractor” by their flapping. Tropical whirlwinds will occur over and over
again, at least as long as no global change in climate happens. But at what time and at
which place a thunderstorm occurs depends, however, so sensitively on subtle details
that even a butterfly may affect it.
26.2 Multidimensional Systems

The definition of the Lyapunov exponent given here for one-dimensional systems with
discrete dynamics may be generalized to several dimensions and to a continuous time
evolution. The discussion of the stability of periodic orbits in Chap. 24 may serve as
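a starting point. There we have discussed trajectories $\mathbf{x}(t) = \mathbf{x}_r(t) + \boldsymbol{\xi}(t)$ that differ
from a reference orbit $\mathbf{x}_r(t)$ by a small perturbation $\boldsymbol{\xi}(t)$. To first approximation the
perturbation obeys a linear differential equation; compare (24.6),
$$\dot{\boldsymbol{\xi}}(t) = \mathbf{M}(t)\,\boldsymbol{\xi}(t), \tag{26.9}$$
with the Jacobi matrix $\mathbf{M} = \partial\mathbf{F}/\partial\mathbf{x}|_{\mathbf{x}_r(t)}$. In agreement with (26.7), as a measure
for the time evolution of the perturbation one may define the quantity
$$\sigma_{\mathbf{x}_r,\boldsymbol{\xi}_0} = \lim_{t\to\infty}\,\frac{1}{t}\,\ln\frac{|\boldsymbol{\xi}(t)|}{|\boldsymbol{\xi}(t_0)|}. \tag{26.10}$$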
This definition of the Lyapunov exponent raises several mathematical problems that
can only be touched on here. First of all, σxr ,ξ 0 depends on the reference trajectory
xr (t) and therefore on the position of the starting point. But if the system has an at-
tractor, then the value of σ in the range of attraction of the attractor is independent
of the reference orbit. Moreover, there exists the important class of ergodic systems
for which the mean values with respect to the time (taken along an orbit) can be
replaced by mean values in phase space. One can show also for ergodic systems that
the Lyapunov exponents defined according to (26.10) exist and are independent of the
special reference trajectory. (More strictly speaking, there may occur “pathological”
orbits with differing σ, but these form a set of measure zero.)³
2 E.N. Lorenz, J. Atmos. Sci. 20, 130 (1963).
Furthermore, the value of σxr ,ξ 0 depends on the direction of the perturbation
ξ 0 = ξ (t0 ). In an N -dimensional space, one may construct N linearly independent
vectors ei that lead to a set of N Lyapunov exponents
$$\sigma_i = \sigma_{\mathbf{x}_r,\mathbf{e}_i}. \tag{26.11}$$
The indices are chosen such that the $\sigma_i$ are in descending order,
$$\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_N, \tag{26.12}$$
where degeneracies (equality of several values) are possible.

The value of the largest Lyapunov exponent σ1 is the most easily determined one.
For this purpose the linearized equation of motion (26.9) is integrated with an initial
perturbation ξ (t0 ) chosen at random. Such a vector may be decomposed according to

N
ξ (t0 ) = ci ei (26.13)
i=1
and will always have also a component along the vector e1 . When tracing the solution
over a sufficiently long time interval, the most rapidly increasing component of the
perturbation (or, if all σi are negative, the most slowly decreasing component) will
dominate. Performing the limit in (26.10) then guarantees that the calculation yields
the maximum Lyapunov exponent. In practice one has to keep in mind in such a calculation
that for chaotic systems the trajectory x(t) moves away from the reference trajectory
$\mathbf{x}_r(t)$ very rapidly. It is therefore recommended that one performs a rescaling of
the perturbation, $\boldsymbol{\xi}'(t_n) = c\,\boldsymbol{\xi}(t_n)$, in regular time intervals, with a constant $c \ll 1$; see
Fig. 26.1. The value of σ is then obtained by averaging over many time intervals.
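The procedure can be sketched for the linearized equation (26.9) as follows. Instead of a fixed constant c, this variant rescales the perturbation back to unit length after every interval and accumulates the logarithms of the growth factors, which is a common way to realize the same idea; the integrator, step sizes, and function names are assumptions. As test cases it uses the two matrices from Exercise 24.2: the constant Jacobi matrix (24.35) at the origin, where the expected exponent is ρ, and the periodic matrix (24.38) on the limit cycle, where the largest exponent is zero.

```python
import math

def rk4_linear_step(matrix_of_t, t, xi, h):
    """One Runge-Kutta step for xi' = M(t) xi, the linearized equation (26.9)."""
    n = len(xi)

    def rhs(time, vec):
        m = matrix_of_t(time)
        return [sum(m[i][j] * vec[j] for j in range(n)) for i in range(n)]

    k1 = rhs(t, xi)
    k2 = rhs(t + h / 2, [xi[i] + h / 2 * k1[i] for i in range(n)])
    k3 = rhs(t + h / 2, [xi[i] + h / 2 * k2[i] for i in range(n)])
    k4 = rhs(t + h, [xi[i] + h * k3[i] for i in range(n)])
    return [xi[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) for i in range(n)]

def largest_lyapunov(matrix_of_t, xi0, h=0.01, steps_per_interval=100, n_intervals=200):
    """Average logarithmic growth rate of |xi|, with rescaling after each interval."""
    norm0 = math.sqrt(sum(v * v for v in xi0))
    xi = [v / norm0 for v in xi0]
    t = 0.0
    log_growth = 0.0
    for _ in range(n_intervals):
        for _ in range(steps_per_interval):
            xi = rk4_linear_step(matrix_of_t, t, xi, h)
            t += h
        norm = math.sqrt(sum(v * v for v in xi))
        log_growth += math.log(norm)   # growth accumulated during this interval
        xi = [v / norm for v in xi]    # rescale back to unit length
    return log_growth / t

def origin_matrix(t, rho=-0.3):
    """Constant Jacobi matrix (24.35) at the fixed point, illustrative rho."""
    return [[rho, -1.0], [1.0, rho]]

def cycle_matrix(t, rho=0.5):
    """Periodic matrix (24.38) on the limit cycle, illustrative rho."""
    c, s = math.cos(t), math.sin(t)
    return [[-2 * rho * c * c, -1 - 2 * rho * s * c],
            [1 - 2 * rho * s * c, -2 * rho * s * s]]

sigma_origin = largest_lyapunov(origin_matrix, [1.0, 0.0])
sigma_cycle = largest_lyapunov(cycle_matrix, [1.0, 0.3])
print("fixed point:", sigma_origin, " limit cycle:", sigma_cycle)
```

For the limit cycle the estimate approaches zero only like 1/t, which reflects the slow convergence of the limit in (26.10) for a finite integration time.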
Fig. 26.1. In the numerical calculation of Lyapunov exponents one performs a rescaling in
regular time steps, because of the divergence of the trajectories
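The rescaling procedure can be sketched in a few lines. The following is a minimal illustration, not the book's own program: it assumes that the linearized equation (26.9) has the form dξ/dt = M(t) ξ, it rescales the perturbation back to unit length after each interval (one possible choice of the constant c), and it uses a hypothetical constant matrix M whose largest exponent is known to be 1. The function names are invented for this sketch.

```python
import math

def matvec(matrix, vector):
    """Product of a square matrix (list of rows) with a vector (list)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, vector)) for row in matrix]

def rk4_step(jacobian, t, xi, dt):
    """One fourth-order Runge-Kutta step for d(xi)/dt = M(t) xi."""
    def rhs(time, vec):
        return matvec(jacobian(time), vec)
    k1 = rhs(t, xi)
    k2 = rhs(t + dt / 2, [x + dt / 2 * k for x, k in zip(xi, k1)])
    k3 = rhs(t + dt / 2, [x + dt / 2 * k for x, k in zip(xi, k2)])
    k4 = rhs(t + dt, [x + dt * k for x, k in zip(xi, k3)])
    return [x + dt / 6 * (a + 2 * b + 2 * c + d)
            for x, a, b, c, d in zip(xi, k1, k2, k3, k4)]

def largest_lyapunov(jacobian, xi0, dt=0.01, steps_per_interval=100, n_intervals=200):
    """Average the logarithmic growth of the perturbation over many intervals."""
    norm0 = math.sqrt(sum(x * x for x in xi0))
    xi = [x / norm0 for x in xi0]          # start with unit length
    t = 0.0
    log_growth = 0.0
    for _ in range(n_intervals):
        for _ in range(steps_per_interval):
            xi = rk4_step(jacobian, t, xi, dt)
            t += dt
        norm = math.sqrt(sum(x * x for x in xi))
        log_growth += math.log(norm)       # growth during this interval
        xi = [x / norm for x in xi]        # rescaling step (cf. Fig. 26.1)
    return log_growth / t

# Hypothetical test case: constant M with eigenvalues +1 and -1.
sigma1 = largest_lyapunov(lambda t: [[0.0, 1.0], [1.0, 0.0]], [1.0, 0.0])
```

For a genuinely chaotic system the matrix M(t) would have to be evaluated along the numerically integrated reference trajectory; that part is omitted here.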
The full set of the N Lyapunov exponents may be calculated by following the time
evolution of all N linearly independent perturbations ξ i with ξ i (t0 ) = ei . From the
N volumes V (p) of the parallelepipeds spanned by the ξ 1 , ξ 2 , . . . , ξ p (that must be
calculated for all values p = 1, 2, . . . , N ), one then may obtain successively all σi .4
We note further that for periodic orbits, xr (t + T ) = xr (t), the Lyapunov exponents
coincide with the real parts of the Floquet exponents introduced in Example 24.1. Thus,
one has a generalization of this concept.
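The determination of all N exponents from the volumes V(p) may be illustrated by a small sketch. It assumes a common implementation choice that is not spelled out in the text: the N perturbations are reorthonormalized by the Gram-Schmidt procedure after each interval, so that the product of the first p norms is the growth factor of V(p). The constant matrix and all function names are hypothetical and serve only as a test case with known exponents +1 and -1.

```python
import math

def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def rk4_step(matrix, xi, dt):
    """One Runge-Kutta step for d(xi)/dt = M xi with a constant matrix M."""
    k1 = matvec(matrix, xi)
    k2 = matvec(matrix, [x + dt / 2 * k for x, k in zip(xi, k1)])
    k3 = matvec(matrix, [x + dt / 2 * k for x, k in zip(xi, k2)])
    k4 = matvec(matrix, [x + dt * k for x, k in zip(xi, k3)])
    return [x + dt / 6 * (a + 2 * b + 2 * c + d)
            for x, a, b, c, d in zip(xi, k1, k2, k3, k4)]

def lyapunov_spectrum(matrix, dt=0.01, steps_per_interval=50, n_intervals=400):
    n = len(matrix)
    # Start from the unit vectors e_i.
    vectors = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    log_sums = [0.0] * n
    for _ in range(n_intervals):
        for _ in range(steps_per_interval):
            vectors = [rk4_step(matrix, v, dt) for v in vectors]
        basis = []
        for i, v in enumerate(vectors):
            w = list(v)
            for u in basis:                      # remove components along earlier vectors
                overlap = sum(a * b for a, b in zip(w, u))
                w = [a - overlap * b for a, b in zip(w, u)]
            norm = math.sqrt(sum(a * a for a in w))
            log_sums[i] += math.log(norm)        # ln V(i+1) - ln V(i) for this interval
            basis.append([a / norm for a in w])
        vectors = basis
    total_time = n_intervals * steps_per_interval * dt
    return [s / total_time for s in log_sums]

spectrum = lyapunov_spectrum([[0.0, 1.0], [1.0, 0.0]])
```

The i-th accumulated logarithm measures how fast V(i) grows relative to V(i-1), which is the successive extraction of the exponents mentioned above.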
3 More information on these questions may be found, e.g., in D. Ruelle, Chaotic Evolution and
Strange Attractors, Cambridge University Press, Cambridge (1989).

4 G. Benettin, C. Froeschle and J.P. Scheidecker, Phys. Rev. A19, 2454 (1979); T.S. Parker and
L.O. Chua, Practical Numerical Algorithms for Chaotic Systems, Springer, New York (1989).
The Lyapunov exponents are decisive for the long-term evolution of a dynamical
system. As discussed already, positive σ imply a rapid divergence of neighboring
trajectories and nonpredictability. Of particular interest are trajectories which attract
neighboring solutions and have (at least) one positive Lyapunov exponent. They are
called chaotic attractors.
Let us consider for illustration an autonomous system with three degrees of free-
dom. Depending on the combination of signs of (σ1 , σ2 , σ3 ), various kinds of attractors
may occur:
$$(\sigma_1, \sigma_2, \sigma_3) = \begin{cases} (-,-,-), & \text{fixed point},\\ (\,0,-,-), & \text{limit cycle},\\ (\,0,\,0,-), & \text{torus},\\ (+,\,0,-), & \text{chaotic attractor}. \end{cases}$$
These cases are illustrated in Fig. 26.2.
Fig. 26.2. Distinct types of attractors in a dynamical system with three degrees of freedom.
The pictures differ by the signs of the Lyapunov exponents
(a) If all Lyapunov exponents are negative, there arises a stable fixed point to which
the neighboring trajectories from all directions are converging.
(b) The vanishing of a Lyapunov exponent, σ1 = 0, indicates the existence of a pe-
riodic motion. This has been demonstrated explicitly in Chap. 24, based on the
equation of motion. The vector e1 associated with σ1 points along the direction
of the tangent of the orbit. The attractor is a limit cycle, i.e., a one-dimensional
object with the topology (but not necessarily the geometric shape) of a circle.
(c) If two of the Lyapunov exponents vanish, σ1 = σ2 = 0, there exists a periodic
motion in two directions. Therefore the attractor is two-dimensional and has the
topology of a torus about which the trajectory is winding up. Whether or not the
trajectory is periodic in total depends on the circulation frequencies ω1 and ω2 for
the two degrees of freedom of the torus. If the values ω1 and ω2 are incommen-
surable, i.e., the ratio ω1 /ω2 is not a fraction of integer numbers but an irrational
number, the orbit will never close. Such an orbit that with increasing time covers
the torus more and more densely is called quasiperiodic.
(d) If the largest Lyapunov exponent is positive, σ1 > 0, there arises a chaotic attrac-
tor with the already discussed properties of irregular motion that depends strongly
on the initial conditions. The typical combination is σ1 > 0, σ2 = 0, σ3 < 0, but
chaotic attractors with several positive Lyapunov exponents may also occur. Geo-
metrically the chaotic attractors usually are also strange attractors, as mentioned
already in Chap. 23. They have the strange property that they are objects with
a fractional (noninteger) dimension. These are neither lines nor surfaces (or higher-dimensional
hypersurfaces) but “something in between.” That such objects are not pure math-
ematical inventions but also occur in nature has been noted only recently, in par-
ticular by B. Mandelbrot,5 who named them fractals.6 A beautiful example of
a fractal attractor will be given in Example 27.4 in the context of a periodically
driven pendulum; compare Fig. 27.19.
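The sign table for three exponents translates directly into a small lookup. This is only a sketch; the function name, the tolerance used to decide when an exponent counts as zero, and the sample values are assumptions made for illustration and are not results from the text.

```python
def classify_attractor(sigmas, tol=1e-6):
    """Classify an attractor from three Lyapunov exponents by their signs."""
    ordered = sorted(sigmas, reverse=True)           # sigma_1 >= sigma_2 >= sigma_3
    signs = tuple(0 if abs(s) < tol else (1 if s > 0 else -1) for s in ordered)
    table = {
        (-1, -1, -1): "fixed point",
        (0, -1, -1): "limit cycle",
        (0, 0, -1): "torus",
        (1, 0, -1): "chaotic attractor",
    }
    # Sign patterns outside the table are reported as such.
    return table.get(signs, "not in the table")

# Illustrative values only (not measured data):
kind = classify_attractor([0.9, 0.0, -14.6])
```

In numerical work the exponents are never exactly zero, so the tolerance has to be chosen with the accuracy of the calculation in mind.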
26.3 Stretching and Folding in Phase Space

How can a strange attractor with infinitely branched internal structure evolve in a
dynamic system? The answer lies in the mechanism of the stretching and folding in
phase space, which may be visualized as follows.
We consider a dynamical system with a positive Lyapunov exponent σ1 . Moreover,
let the system be organized in such a way that the accessible part of phase space is
restricted (in Chap. 23 this property was included in the definition of attractors). We
now consider a small connected domain V in phase space, e.g., a cube, and follow
its deformation under the phase flow $\Phi_t$. Neighboring points rapidly drift apart
in the direction belonging to the Lyapunov exponent σ1 . Since the phase-space vol-
ume is conserved in conservative systems, or even shrinks in dissipative systems (see
Chap. 23), a contraction must take place in the other directions. Since on the other
hand the available domain of phase space is restricted, the trajectories must bend back
again. The test volume V which is stretched at first will thus be folded back, as is
schematically shown in Fig. 26.3.
Fig. 26.3. A strange attractor
may develop if the dynamical
flow continuously stretches a
domain in phase space and
then folds it back again
If one follows the history of V still further, the game of stretching and folding
will continue again and again. One may imagine that in this way an infinitely fine
5 Benoit B. Mandelbrot, b. November 20, 1924, Warsaw. After the emigration of his family to France
(1936), Mandelbrot studied in Lyon, at the California Institute of Technology, and in Paris, where he
did his doctorate in 1952. He worked at CNRS, in Geneva, and at the École Polytechnique before
he went in 1958 to the IBM Watson Research Center, where he was appointed as Research Fellow.
He served as visiting professor among others at Harvard and Yale. Mandelbrot’s interests are extra-
ordinarily broad and oriented interdisciplinarily. Building on the work of G.M. Julia (1893–1978) on
iterated rational functions, he demonstrated the properties of fractals using computer graphics and
pointed out their manifold occurrence in nature. Besides many other awards, Mandelbrot received the
Wolf Prize in physics in 1993.
6 For more on this point, see, e.g., B. Mandelbrot, The Fractal Geometry of Nature, Freeman (1982);
H.-O. Peitgen, H. Jürgens and D. Saupe, Chaos and Fractals: New Frontiers of Science, Springer
(1992).
subdivided puff-pastry-like structure develops, with the geometrical properties of a

fractal. Two initially closely neighboring points end up after a while on two dis-
tinct layers of the “pastry,” after which any recognizable correlation between their
positions is lost. A simple mathematical model for the mechanism described here is
the baker transformation (so called in analogy to the kneading of pastry); see
Exercise 26.1.
26.4 Fractal Geometry
The long-term behavior of the trajectories of dissipative systems is characterized by

the approach toward attractors that are embedded in the phase space as geometric
forms of lower dimension. The simplest cases include fixed points (dimension D = 0),
limit cycles (D = 1), and tori (D = 2) for quasiperiodic motion with two incommensu-
rable frequencies. As new types of attractors in nonlinear systems, strange attractors
with more complicated geometric structure may arise which are characterized by a
fractional dimension. The first fractals of this type, still without this name, were introduced
by mathematicians as abstract constructions. For a long time, they seemed to
belong in the mathematical cabinet of oddities, until it was discovered that fractal
structures actually occur frequently in nature.
The geometric construction of a fractal is usually based on a (mostly simple) iteration
rule that is applied repeatedly. In the limit of infinitely many iteration steps,
the fractal arises, which thus has an infinitely finely resolved internal structure. We shall
briefly consider several familiar examples.
(1) The Cantor set: The construction begins with a line, namely, the set of all real
numbers in the unit interval I0 = [0, 1]. The iteration rule reads: Remove the middle
third of each interval. In the first iteration step there arise two disjoint partial intervals,
I1 = [0, 1/3] ∪ [2/3, 1], which then split further into I2 = [0, 1/9] ∪ [2/9, 1/3] ∪
[2/3, 7/9] ∪ [8/9, 1], etc. The first iteration steps are represented in Fig. 26.4(a). In the
limit n → ∞, there arises from the In the Cantor set7 as a kind of finely distributed
dust of points in the unit interval.
To get a measure of extension of the Cantor set, we consider the magnitude of its
complementary set, i.e., the total length of all parts cut out. This leads to a geometric
series:
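$$L = \frac{1}{3} + 2\cdot\frac{1}{9} + 4\cdot\frac{1}{27} + \cdots = \frac{1}{3}\sum_{n=0}^{\infty}\left(\frac{2}{3}\right)^{n} = \frac{1}{3}\,\frac{1}{1 - 2/3} = 1. \tag{26.14}$$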
7 Georg Cantor, German mathematician, b. March 3, 1845, St. Petersburg, Russia–d. January 6,
1918, Halle. Cantor studied at the universities of Zurich and Berlin under Weierstraß, Kummer, and
Kronecker. From 1869 to 1913, he was a professor at Halle. Cantor is the founder of set theory.
He invented the notion of cardinal numbers and the concept of infinite (transfinite) numbers and dealt
with the definition of the continuum. He proved the nondenumerability of real numbers. Furthermore,
he contributed to the theory of trigonometric series.
Fig. 26.4. (a) Iterative construction of the Cantor set: In each step, the middle third of
an interval is cut out. (b) Iterative construction of the Koch curve. It originates by continued
adding of triangles
The parts cut out thus add up to the total length of the unit interval, and the length
of the Cantor set is therefore zero! But it consists of infinitely many points, and
one can show that its cardinal number (power) is the same as that of the real num-
bers.8
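The construction of the Cantor set and the sum of the removed lengths can be checked with exact rational arithmetic. The following is a minimal sketch; the function names are chosen here for illustration.

```python
from fractions import Fraction

def cantor_step(intervals):
    """Remove the middle third of every interval (a, b)."""
    result = []
    for a, b in intervals:
        third = (b - a) / 3
        result.append((a, a + third))      # left third is kept
        result.append((b - third, b))      # right third is kept
    return result

def cantor_intervals(n):
    """The 2**n closed intervals that make up I_n."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = cantor_step(intervals)
    return intervals

def total_length(intervals):
    return sum(b - a for a, b in intervals)

i2 = cantor_intervals(2)
removed_after_20_steps = 1 - total_length(cantor_intervals(10)) * Fraction(2, 3) ** 10
```

After n steps the remaining length is (2/3)^n, so the removed length 1 - (2/3)^n approaches the value 1 of the geometric series (26.14).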
(2) The Koch curve: This construction again begins with a straight line of length
unity. Instead of cutting out parts, something is added, and the straight line is built up
to a toothed curve in the two-dimensional plane. The iteration rule reads: Remove the
middle third of every straight partial piece and replace it by the other two sides of an
equilateral triangle erected on it. The first steps of the iteration are shown in Fig. 26.4(b)
(they remind one of the bulwarks of old fortresses).
The peculiar feature of the Koch9 curve and its related ones is that it is everywhere
continuous but nowhere differentiable. One cannot give a tangent to the curve, because
of the infinitely many sharp corners. The calculation of the length of the Koch curve
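also leads to a remarkable result: In every iteration step, each straight piece is divided into
3 parts, which are replaced by 4 parts of the same length, so that the length grows by the
factor 4/3. The total length L is therefore
$$L = \lim_{n\to\infty} L_n = \lim_{n\to\infty}\left(\frac{4}{3}\right)^{n} = \infty, \tag{26.15}$$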
and thus diverges. This cannot be seen directly from the graph of the curve, due to the
conversion to smaller and smaller length scales. Here the self-similarity becomes ap-
parent which is a typical feature of fractals: On any length scale, a linear magnification
of a detail of the object again resembles the entire object.10
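The growth of the length by the factor 4/3 per step can be verified by actually constructing the polygon. This is a sketch under the assumption that the points of the curve are stored as complex numbers and that the triangles are erected to the left of the direction of travel; the function names are illustrative.

```python
import cmath
import math

def koch_step(points):
    """Replace the middle third of every segment by two sides of an equilateral triangle."""
    rotation = cmath.exp(1j * math.pi / 3)   # rotation by 60 degrees
    refined = [points[0]]
    for p, q in zip(points, points[1:]):
        d = (q - p) / 3                      # one third of the segment
        a = p + d                            # end of the first third
        b = p + 2 * d                        # start of the last third
        apex = a + d * rotation              # tip of the added triangle
        refined.extend([a, apex, b, q])
    return refined

def koch_curve(n):
    points = [0 + 0j, 1 + 0j]                # straight line of length unity
    for _ in range(n):
        points = koch_step(points)
    return points

def polyline_length(points):
    return sum(abs(q - p) for p, q in zip(points, points[1:]))

length_3 = polyline_length(koch_curve(3))
```

The n-th iterate consists of 4^n segments of length 3^(-n), in agreement with (26.15).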
8 This may be understood in a rather simple manner: Every point of the Cantor set may be charac-
terized by an infinite sequence of “left-right decisions”; i.e., for every iteration step in Fig. 26.4 one
must say in which of the two partial intervals the point lies. But this sequence may also be interpreted
as the binary representation of a real number in the unit interval [0, 1], whereby the assertion becomes
clear.
9 Niels Fabian Helge von Koch, Swedish mathematician, b. January 25, 1870, Stockholm–d. March
11, 1924, Stockholm. Koch was a student and successor of Mittag-Leffler at the University of
Stockholm. His main research fields were systems of linear equations of infinite dimension and
with infinitely many unknowns. His name became familiar mainly from the curve named after him.
10 Self-similarity alone is, however, not a sufficient criterion for fractals. For example, a straight line
is self-similar in a trivial manner.
(3) The Sierpinski gasket: The basic element of the Sierpinski11 gasket is a two-
dimensional area, namely, an equilateral triangle. Iteration rule: Subdivide each trian-
gle into 4 congruent parts and remove the central triangle. Figure 26.5 shows the first
steps of this iteration. The resulting object is something between an area and a curve.
It has again the property of self-similarity.
Fig. 26.5. Iterative construction of the Sierpinski gasket. In each step, the middle of
4 partial triangles is removed
Nature offers a variety of fractal objects. Examples from the organic world are the
branchings of plants (trees, cauliflower, particularly beautiful ferns) or vessels.
Inorganic fractal shapes are observed in clouds, mountains, snowflakes, lightning
discharges, etc.
A classical example studied by Mandelbrot that resembles the Koch curve is the
coastline. For the length of one and the same coast the geographers may give quite
different values, depending on the length scale adopted in the measurement. The
smaller the scale, the better is the scanning of the bays and windings of the coast,
corresponding to the higher iterations Ln of the Koch curve. Ultimately the coastline
should wind about each individual grain of sand on the beach, which would blow up
the length enormously. But here it also becomes apparent that the application of fractal
geometry to natural objects is meaningful only in a certain range. Ultimately on the atomic
scale when the granularity of matter becomes apparent, the mathematical limit n → ∞
loses its meaning. Nevertheless, the fractal scale behavior (see below) frequently can
be traced over many orders of magnitude.
The fractal dimension: An important tool for characterizing geometric objects is

their dimension. For ordinary objects (smooth curves, areas, spheres, . . .), the dimen-
sion is clear and coincides with the visual conception. Fractals, however, behave dif-
ferently. To characterize them, the concept of fractal dimension was invented. We will
restrict ourselves mainly to the so-called capacity dimension (or box counting dimen-
sion).12
The dimension shall be a measure of how densely a point set fills the space into which
it is embedded. We start from the assumption that the space is a metric (normally
Euclidean) space in which one can measure distances. This space is then, as illustrated
in Fig. 26.6, covered by a grid of lateral length $\epsilon$ (i.e., line segments for n = 1, squares for
n = 2, cubes for n = 3, hypercubes for n > 3), and one counts how many of the boxes
contain one or several points of the point set under consideration. This number $N(\epsilon)$
11 Waclaw Sierpinski, Polish mathematician, b. August 20, 1882, Warsaw–d. May 14, 1969, Warsaw.
Sierpinski was a professor of mathematics in Lwow (now Ukraine, 1908–1914), Moscow (1915–
1918), and later in Warsaw. His main fields of work were the theory of sets (here, in particular, the
selection axiom and the continuum hypothesis), the topology of point sets, and number theory.
12 The situation is complicated by the fact that a multitude of distinct mathematical concepts of di-
mension are available. The calculated dimension values partly agree with each other but partly do
not, depending on the considered object. For details, see K. Falconer, Fractal Geometry, Mathemati-
cal Foundations and Applications, Wiley, New York (1990).
Fig. 26.6. The capacity dimension of an object is determined by counting how many
boxes of a grid are touched by the object. The dependence of the number $N(\epsilon)$
on the box size $\epsilon$ determines the dimension $D_f$. This is illustrated here for a “normal”
two-dimensional object
will in general depend on the box size $\epsilon$. For an improved resolution the object will certainly
spread over more boxes, but the question is how fast that happens. If the scaling
behavior in the limit $\epsilon \to 0$ follows a power law
$$N(\epsilon) = V(\epsilon)\,\epsilon^{-D_f} \quad\text{with}\quad V(\epsilon) \to \text{constant for } \epsilon \to 0, \tag{26.16}$$
then the value of the exponent defines the fractal dimension or capacity dimension $D_f$.
Solving for $D_f$ yields
$$D_f = \lim_{\epsilon\to 0}\frac{\ln N(\epsilon) - \ln V(\epsilon)}{\ln(1/\epsilon)} = \lim_{\epsilon\to 0}\frac{\ln N(\epsilon)}{\ln(1/\epsilon)}, \tag{26.17}$$
since $V(\epsilon)$ remains finite in the limit and therefore does not contribute.
For nonfractal geometric objects, the dimension determined in this way coincides
with the normal Euclidean dimension, and V (0) corresponds to the Euclidean volume.
For example, for sufficiently high resolution a circular disk of radius R overlaps
$N(\epsilon) \approx \pi R^2/\epsilon^2$ boxes, and each further doubling of the resolution increases the number
of boxes by the factor 4, so that $D_f = 2$. Correspondingly, one finds for the circle line
$N(\epsilon) \approx 2\pi R/\epsilon$, i.e., $D_f = 1$, and so on.
For fractals, however, the power of the scaling law (26.16) differs from the naively
expected value and is in general not an integer. For the Cantor set the determination
of $D_f$ is particularly simple. Here it suffices to take the unit interval (n = 1) as the
embedding space. It is most convenient to consider a sequence of subdivisions of the
length $\epsilon_i$ that always differ by the factor 3; thus, $\epsilon_0 = 1$, $\epsilon_1 = 1/3$, $\epsilon_2 = 1/9$, etc.
The Cantor set is constructed in such a way that then the number of “occupied” boxes
(partial intervals) always doubles; thus, $N(\epsilon_0) = 1$, $N(\epsilon_1) = 2$, $N(\epsilon_2) = 4$, etc. Hence,
the fractal dimension of the Cantor set is, according to (26.17), given by
$$D_f = \lim_{\epsilon\to 0}\frac{\ln N(\epsilon)}{\ln(1/\epsilon)} = \lim_{n\to\infty}\frac{\ln N(\epsilon_n)}{\ln(1/\epsilon_n)} = \lim_{n\to\infty}\frac{\ln 2^n}{\ln 3^n} = \frac{\ln 2}{\ln 3} \approx 0.6309. \tag{26.18}$$
The result is independent of the manner of performing the passage to the limit
$\epsilon \to 0$.
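The box counting for the Cantor set can be carried out exactly with integers. In the following sketch (function names chosen for illustration), a point k/3^n of the n-th construction stage is stored as the integer k, so that no rounding errors enter the counting.

```python
import math
from itertools import product

def cantor_points(n):
    """Left endpoints of the 2**n intervals of I_n, stored as integers k meaning k / 3**n."""
    points = []
    for digits in product((0, 2), repeat=n):   # ternary digits 0 or 2 only
        k = 0
        for d in digits:
            k = 3 * k + d
        points.append(k)
    return points

def box_count(points, n, level):
    """Number of boxes of size 3**(-level) that contain at least one point."""
    shift = 3 ** (n - level)
    return len({k // shift for k in points})

n = 10
points = cantor_points(n)
counts = [box_count(points, n, level) for level in range(n + 1)]
estimates = [math.log(counts[level]) / math.log(3 ** level) for level in range(1, n + 1)]
```

Every refinement of the box size by the factor 3 doubles the count, so each entry of `estimates` equals ln 2 / ln 3 up to rounding.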
Similarly, one finds the dimension of the Koch curve. To cover it, in the first step
one needs $N(\epsilon_1) = 4$ intervals of length $\epsilon_1 = 1/3$, in the next one $N(\epsilon_2) = 16$ intervals
of length $\epsilon_2 = 1/9$, etc. As in (26.18), this implies
$$D_f = \lim_{n\to\infty}\frac{\ln 4^n}{\ln 3^n} = \frac{\ln 4}{\ln 3} \approx 1.2619. \tag{26.19}$$
Finally, for the Sierpinski gasket we have
$$D_f = \frac{\ln 3}{\ln 2} \approx 1.5850, \tag{26.20}$$
since here for each bisection of the box size the number of partial objects increases by
the factor 3.
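The three results (26.18) to (26.20) follow one common pattern, which may be worth writing out once. As a sketch, assume that every refinement of the box size by a factor s multiplies the number of occupied boxes by a factor m (the symbols s and m are introduced here only for illustration):

```latex
\begin{align*}
\epsilon_n = s^{-n},\quad N(\epsilon_n) = m^{n}
\quad&\Longrightarrow\quad
D_f = \lim_{n\to\infty}\frac{\ln m^{n}}{\ln s^{n}} = \frac{\ln m}{\ln s},\\[4pt]
\text{Cantor set: } m = 2,\ s = 3 \quad&\Longrightarrow\quad D_f = \frac{\ln 2}{\ln 3},\\
\text{Koch curve: } m = 4,\ s = 3 \quad&\Longrightarrow\quad D_f = \frac{\ln 4}{\ln 3},\\
\text{Sierpinski gasket: } m = 3,\ s = 2 \quad&\Longrightarrow\quad D_f = \frac{\ln 3}{\ln 2}.
\end{align*}
```

For an ordinary line segment (m = s) or a filled square (m = s²) the same formula returns the Euclidean values 1 and 2.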
The results (26.18) to (26.20) are not implausible. They quantify how the consid-
ered fractals by their properties stand between the familiar objects point, line, area,
. . . . The comparison of (26.19) and (26.20) shows that the Sierpinski gasket is “more
space-filling” than the Koch curve, but does not come up to a normal area. An ex-
treme example in this respect is the area-covering curve that was discovered in 1890
by G. Peano13 and investigated in modified form by D. Hilbert. The Peano curve may
also be obtained iteratively: In a subdivision of the scale length into three sections
there arise nine segments of equal length; see Fig. 26.7. Accordingly, the dimension
is
$$D_f = \frac{\ln 9}{\ln 3} = 2. \tag{26.21}$$
Fig. 26.7. The first two iteration steps in the construction of the area-filling curve
of Peano. In the limit there results a continuous mapping of the interval [0, 1] onto the
unit square. To make the construction transparent, the right-angled corners were cut off in
the plot
This is the dimension n = 2 of the embedding space and hence the largest value that
may be taken by the capacity dimension $D_f$. $N(\epsilon)$ takes the maximum value, since
all boxes include parts of the object. The definition of the capacity dimension is very
clear and has the advantage that it immediately provides an operational calculation rule.
One only has to span grids and to count the boxes, which may easily be done on a
computer. If the function $N(\epsilon)$ in a doubly logarithmic representation yields a straight
line (at least over a larger range of scale), then one can immediately read off $D_f$ from
its slope. A related but more subtle definition of the dimension which refrains from
equidistant grids and works with coverings of variable size was developed in
13 Giuseppe Peano, Italian mathematician, b. August 27, 1858, Cuneo (province Piemont)–d. April
20, 1932, Torino. Peano studied mathematics at the University of Torino, where he taught beginning
in 1880 as a lecturer and beginning in 1890 as a professor. His early works concerned analysis,
the initial-value problem of differential equations, and recursive functions. Peano emerged mainly
as a founder of mathematical logic (with G. Frege). The Peano axioms (1889) define the natural
numbers via the properties of sets. His aim was the axiomatization of all of mathematics. Later, Peano
moved beyond mathematics and developed a universal world language (Interlingua) that, however,
did not gain acceptance.
1918 at Bonn University by the mathematician Felix Hausdorff.14 We shall not enter
into the details here and mention only that the Hausdorff dimension $D_H$ in
most cases coincides with $D_f$,15 but there also exist exceptional cases. Generally,
$D_H \le D_f$.
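The recipe "span grids, count boxes, read off the slope" may be sketched as follows. As a test object the sketch uses a right-angled lattice version of the Sierpinski gasket, namely all integer points (x, y) with 0 ≤ x, y < 2^n whose binary representations share no common 1 bit. This choice, the function names, and the simple least-squares fit are assumptions made for illustration; the right-angled gasket is an affine image of the equilateral one and has the same capacity dimension.

```python
import math

def sierpinski_points(n):
    """Lattice points of a right-angled Sierpinski gasket at resolution 2**(-n)."""
    size = 2 ** n
    return [(x, y) for x in range(size) for y in range(size) if (x & y) == 0]

def box_count_2d(points, n, level):
    """Number of grid boxes of side 2**(-level) that contain at least one point."""
    shift = n - level
    return len({(x >> shift, y >> shift) for x, y in points})

def fit_slope(xs, ys):
    """Slope of the least-squares straight line through the points (xs, ys)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    return s_xy / s_xx

n = 7
points = sierpinski_points(n)
levels = list(range(1, n + 1))
log_inv_eps = [level * math.log(2) for level in levels]            # ln(1/epsilon)
log_counts = [math.log(box_count_2d(points, n, level)) for level in levels]
d_f = fit_slope(log_inv_eps, log_counts)
```

For data from a real attractor the points in the doubly logarithmic plot scatter, and the fit should be restricted to the range of scales where they follow a straight line.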
Especially for the classification of strange attractors, other dimension measures may also
be meaningful. The Poincaré mapping yields a possibly very inhomogeneously distributed
cloud of points in phase space. In the calculation of the capacity dimension
the information on the frequency distribution of the points is ignored. It makes
no difference there whether a box is occupied by a single point or by thousands. To take
this quantity into account, one defines an information dimension, which includes the
density distribution of the points.
The construction is similar to that for the capacity dimension $D_f$. The embedding
space is again subdivided into boxes, but now the contribution of every cell is weighted
by the probability $p_i$ of meeting points there. Practically, this value is determined by
generating a very large number N of points and counting how many of them fall in
cell number i; thus, $p_i = N_i/N$ with $\sum_i N_i = N$. The weighting is performed with a
logarithmic measure. The information dimension $D_I$ is defined as
$$D_I = \lim_{\epsilon\to 0}\frac{\sum_i p_i \ln(1/p_i)}{\ln(1/\epsilon)}. \tag{26.22}$$
The factor f (p) = p ln(1/p) has the following meaning: Take an arbitrary point from
the distribution and ask “Does the point lie in cell i?” The answer to this question
contributes the information content $f(p_i)$. This function vanishes for $p_i \to 0$ and $p_i \to 1$, since
in these cases the answer is trivial (always “no” or always “yes”). The information gain
of such a yes/no question is largest for $p_i = 1/2$, where the answer is in any case a surprise
(the single term f (p) itself takes its maximum at p = 1/e).
The exact foundation for the function f (p) is provided by statistical mechanics,
or by information theory (Shannon’s information measure).16 But at least we may
easily see that (26.22) turns into the formula (26.17) if the point distribution is homogeneous,
i.e., if all probabilities in the in total $N(\epsilon)$ covered cells have the same
value $p_i = p = 1/N(\epsilon)$, while $p_i = 0$ for the remaining cells. Thereby, (26.22) reduces
to
$$D_I = \lim_{\epsilon\to 0}\frac{N(\epsilon)\,p\,\ln(1/p)}{\ln(1/\epsilon)} = \lim_{\epsilon\to 0}\frac{\ln N(\epsilon)}{\ln(1/\epsilon)} = D_f. \tag{26.23}$$
For a less homogeneous point distribution, the informational content is lower and one
can show that then $D_I < D_f$.17
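The difference between $D_I$ and $D_f$ can be made concrete with a weighted Cantor set. The following sketch assumes, purely for illustration, that at every construction step a point goes to the left partial interval with probability `p_left` and to the right one with probability 1 - `p_left`; the cell probabilities are then known exactly and need not be estimated from a cloud of points. The function names are hypothetical.

```python
import math
from itertools import product

def cell_probabilities(p_left, n):
    """Probabilities of the 2**n occupied cells of size 3**(-n)."""
    probabilities = []
    for decisions in product((0, 1), repeat=n):      # 0 = left, 1 = right
        n_right = sum(decisions)
        probabilities.append(p_left ** (n - n_right) * (1 - p_left) ** n_right)
    return probabilities

def information_dimension(p_left, n):
    """Evaluate the quotient in (26.22) for the box size epsilon = 3**(-n)."""
    probabilities = cell_probabilities(p_left, n)
    information = sum(p * math.log(1 / p) for p in probabilities if p > 0)
    return information / math.log(3 ** n)

d_capacity = math.log(2) / math.log(3)
d_info_uniform = information_dimension(0.5, 10)
d_info_skewed = information_dimension(0.3, 10)
```

For equal weights the result coincides with the capacity dimension, as in (26.23); for unequal weights it is smaller, although the same cells are occupied.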
14 Felix Hausdorff, German mathematician, b. November 8, 1869, Breslau–d. January 26, 1942,
Bonn. Hausdorff studied and taught mathematics in Leipzig and beginning in 1910 in Bonn, until
his forced retirement in 1935. He was persecuted because of his Jewish origin. In 1942, he and his
wife committed suicide, shortly before deportation to the concentration camp. Hausdorff’s main re-
search fields were topology and group theory. He introduced the concept of partly ordered sets and
dealt with Cantor’s continuum hypothesis. He founded a theory of topological and metric spaces and
about 1919 defined the concepts of dimension and measure named after him.
15 J.D. Farmer, E. Ott and J.A. Yorke, Physica 7D, 153 (1983).
16 C.E. Shannon and W. Weaver, The Mathematical Theory of Communication, Univ. of Illinois
Press, Urbana (1949).
17 See, e.g., H.-O. Peitgen, H. Jürgens and D. Saupe, op. cit. p. 735.
EXERCISE
26.1 The Baker Transformation
Problem. The process of stretching and folding, which is characteristic for the phase
flow in chaotic systems, can be illustrated by a simple two-dimensional discrete map-
ping. Let the phase space be the unit square [0, 1] × [0, 1]. The motion of a point
proceeds according to the transformation
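$$x_{n+1} = 2x_n \bmod 1, \qquad y_{n+1} = \begin{cases} a\,y_n & \text{for } 0 \le x_n < \tfrac{1}{2},\\[2pt] \tfrac{1}{2} + a\,y_n & \text{for } \tfrac{1}{2} \le x_n \le 1, \end{cases} \tag{26.24}$$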
with a parameter 0 < a ≤ 1/2. Interpret this transformation, and calculate the fractal
dimension of the set that is generated by application of (26.24) on the unit square.
Fig. 26.8. The baker transformation
Solution. The effect of this transformation may be simply visualized geometrically.

All lengths in x-direction are stretched by the factor 2. If a point thereby passes over
the right boundary of the unit interval, then it is set back to the left by one unit
length. This mapping is also called the Bernoulli shift; compare Example 27.2. Si-
multaneously, all lengths in y-direction are compressed by the factor a. Points from
the right half of the unit interval are additionally shifted by 1/2 in y-direction up-
ward. This reminds us of the work of a baker: The pastry is rolled out to twice its
length. Then the part sticking out is cut off and put on as a second layer onto the
pastry. (If a < 1/2, there still is an “air cushion” of thickness 1/2 − a between the
layers.)
The volume change of a phase-space domain V under (26.24) is given by the
product of the stretching and compression factors in the x- and y-directions; thus,
Vn+1 = 2aVn . (26.25)
For a = 1/2, the volume is conserved. However, a connected domain in phase space
is rapidly torn up by the repeated “back-folding” (here it is more a back-shifting) and
distorted beyond recognition. The figure represents the first steps of the transformation
of a circle of radius 1/2.
For a < 1/2, the volume element shrinks in each iteration step and goes asymp-
totically to zero. In the sense of Chap. 23, one also speaks of the dissipative baker
transformation. The resulting geometrical object is a fractal which in y-direction dis-
plays the structure of a Cantor set. Just as for the latter one may calculate the fractal
dimension according to (26.17). In the y-direction in the first transformation step the
unit interval is converted into 2 parts of length a, in the next step 4 parts of length $a^2$,
and generally $2^n$ parts of length $a^n$. If one selects a sequence of coverings by boxes of the lateral
length $\epsilon_n = a^n$, then $N(\epsilon_n) = 2^n (1/\epsilon_n)$, where the second factor originates from the
covering of the x-axis. The fractal dimension is therefore
$$D_f = \lim_{n\to\infty}\frac{\ln N(\epsilon_n)}{\ln(1/\epsilon_n)} = \lim_{n\to\infty}\frac{\ln(2^n/\epsilon_n)}{\ln(1/\epsilon_n)} = \lim_{n\to\infty}\frac{n\ln(2/a)}{n\ln(1/a)} = \frac{\ln 2 - \ln a}{-\ln a} = 1 + \frac{\ln 2}{|\ln a|}. \tag{26.26}$$
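The transformation (26.24) and the result (26.26) are easily put into code. The following is a minimal sketch with illustrative function names; it uses floating-point numbers, which is adequate for a few iteration steps.

```python
import math

def baker(x, y, a):
    """One step of the baker transformation (26.24) with 0 < a <= 1/2."""
    x_new = (2.0 * x) % 1.0                            # stretch and cut (Bernoulli shift)
    y_new = a * y if x < 0.5 else 0.5 + a * y          # compress; right half is stacked on top
    return x_new, y_new

def baker_dimension(a):
    """Fractal dimension of the attractor according to (26.26)."""
    return 1.0 + math.log(2.0) / abs(math.log(a))

left_image = baker(0.25, 0.5, 0.5)
right_image = baker(0.75, 0.5, 0.5)
d_conservative = baker_dimension(0.5)
d_dissipative = baker_dimension(0.25)
```

For a = 1/2 the dimension is 2, i.e., the whole unit square is filled; for a = 1/4 one obtains 1.5, a Cantor-like set of layers in the y-direction times a continuous x-direction.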
27 Systems with Chaotic Dynamics
In this chapter, we shall get to know various dynamical systems that may display
highly complex forms of motion despite their very simple, even trivial structure. We
shall meet previously discussed concepts such as bifurcations and periodic and strange
attractors (limit cycles and chaotic trajectories) in a series of specific examples. For
the sake of simplicity, we shall begin with the investigation of systems with discrete
dynamics. The examples considered will gradually become physically more and more
realistic.
27.1 Dynamics of Discrete Systems
So far, we have been interested in dynamical systems that are described by continuous
differential equations with respect to time. In the Poincaré mapping we have seen
the possibility of reducing a continuous system to a time-discrete system. But since
the Poincaré mapping of a realistic physical system as a rule cannot be given in
closed form, it is instructive to consider simple mathematical model mappings
instead. As it turns out, such models share many properties with systems that
are more interesting from a physical point of view.
Let us consider a sequence of vectors x0 , x1 , x2 , . . . which is generated by a simple
iterative mapping, i.e., by repeated application of a continuous function f(x),
xn+1 = f(xn ). (27.1)
The asymptotic behavior of the sequence of the $\mathbf{x}_n$ may be characterized as in Chap. 24
by the appearance of fixed points or periodic limit cycles. The stability of a fixed
point $\mathbf{x}_p$ is again governed by the eigenvalues of the Jacobi matrix $D = \partial\mathbf{f}/\partial\mathbf{x}|_{\mathbf{x}_p}$.
We shall not go again into these details but still note an interesting relation between
stationary and periodic solutions. Consider, e.g., a periodic solution alternating between
two values: $\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_1, \mathbf{x}_2, \ldots$. Then obviously $\mathbf{x}_2 = \mathbf{f}(\mathbf{x}_1)$ and $\mathbf{x}_1 = \mathbf{f}(\mathbf{x}_2)$, from which
it follows that $\mathbf{x}_1 = \mathbf{f}(\mathbf{f}(\mathbf{x}_1)) \equiv \mathbf{f}^2(\mathbf{x}_1)$ and also $\mathbf{x}_2 = \mathbf{f}^2(\mathbf{x}_2)$. Here $\mathbf{f}^2(\mathbf{x})$ must not be
confused with the power $(\mathbf{f}(\mathbf{x}))^2$. This procedure may of course also be transferred to
periodic solutions of length m, $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_m, \mathbf{x}_1, \mathbf{x}_2, \ldots$. Correspondingly, a periodic
solution of the iteration function $\mathbf{f}$ reduces to a set of m constant fixed points of the
m-times iterated function $\mathbf{f}^m$. For continuous (not time-discretized) solutions $\mathbf{x}(t)$, such
a relation does not exist.
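The relation between a cycle of f and fixed points of the iterated function can be checked numerically. As a sketch, the following uses a one-dimensional quadratic map f(x) = αx(1 − x) with the assumed value α = 3.2 as test case; the helper `iterate` and the closed formula for the two cycle points of this particular map are supplied here for illustration and do not appear in the text above.

```python
import math

def iterate(f, m):
    """Return the m-times iterated function f^m (composition, not a power)."""
    def f_m(x):
        for _ in range(m):
            x = f(x)
        return x
    return f_m

alpha = 3.2                                  # assumed parameter value

def f(x):
    return alpha * x * (1.0 - x)

# Points of the period-2 solution of this map (roots of f(f(x)) = x other than fixed points of f).
root = math.sqrt((alpha - 3.0) * (alpha + 1.0))
x1 = (alpha + 1.0 + root) / (2.0 * alpha)
x2 = (alpha + 1.0 - root) / (2.0 * alpha)

f2 = iterate(f, 2)
```

Here f maps x1 to x2 and back, while f2 leaves each of the two points unchanged.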

27.2 One-Dimensional Mappings
For further simplification, we now concentrate on one-dimensional mappings
$x_{n+1} = f(x_n)$. The sequence of the $x_n$ can then be simply constructed graphically. The
function $f(x)$ is plotted in a two-dimensional coordinate system with $x_n$ as the abscissa
and $x_{n+1}$ as the ordinate, together with the bisector $x_{n+1} = x_n$ (the diagonal). Starting
with the first point $x_n = x_1$, the associated function value $x_{n+1} = f(x_1) = x_2$ is
marked. Since this value shall serve as the initial value for the next iteration step, it
must be transferred to the $x_n$-axis. This is done by drawing a horizontal line to the
bisector. The intersection point $x_n = x_2$ is the base for marking again the function value
$x_{n+1} = f(x_2) = x_3$, etc.
This geometrical construction is represented in Fig. 27.1, where the various possibilities
of stability of a fixed point $x_s = f(x_s)$ of the mapping are illustrated simultaneously.
As is known, the derivative of the function at the fixed point, $\lambda = f'(x_s)$,
governs the stability (eigenvalue of the Jacobi matrix). For $|\lambda| < 1$, the fixed point
is stable (point attractor). The sequence of points approaches the fixed point either
monotonically (for $0 < \lambda < 1$, (a)) or alternating (for $-1 < \lambda < 0$, (b)). A value
$|\lambda| > 1$, on the contrary, implies an unstable fixed point from which the point sequence
escapes, again either monotonically (for $\lambda > 1$, (c)) or alternating (for $\lambda < -1$,
(d)). In the limit $\lambda = -1$, there results (in linear approximation) a periodic sequence of
length 2, while for $\lambda = +1$ each point in the vicinity of $x_s$ is also a fixed point. In order to
determine the stability for $|\lambda| = 1$, the first derivative (linear approximation) is not
sufficient.
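Both the graphical construction and the stability criterion can be written down compactly. The sketch below is illustrative only: the function names are invented, the path starts on the bisector rather than on the axis, and the linear map used as an example is an assumption.

```python
def cobweb(f, x0, n_steps):
    """Corner points of the graphical construction: curve, bisector, curve, ..."""
    path = [(x0, x0)]                 # start on the bisector
    x = x0
    for _ in range(n_steps):
        y = f(x)
        path.append((x, y))           # vertical line to the curve f(x)
        path.append((y, y))           # horizontal line to the bisector
        x = y
    return path

def classify_fixed_point(lam):
    """Stability of a fixed point from the slope lam = f'(x_s)."""
    if abs(lam) == 1:
        return "marginal: linear approximation not sufficient"
    kind = "stable" if abs(lam) < 1 else "unstable"
    # lam = 0 is not covered by the cases (a) to (d); it is lumped with "alternating" here.
    approach = "monotonic" if lam > 0 else "alternating"
    return f"{kind}, {approach}"

# Assumed example: linear map with fixed point x_s = 2 and slope 0.5.
path = cobweb(lambda x: 0.5 * x + 1.0, 0.0, 2)
```

Plotting the points of `path` as a connected line, together with f(x) and the bisector, reproduces a picture of the type shown in Fig. 27.1(a).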
In Examples 27.1–27.3, various discrete one-dimensional mappings are presented
and their dynamics are studied in detail.
Fig. 27.1. Various kinds of fixed points of a one-dimensional mapping. The fixed point is
stable ((a) and (b)) or unstable ((c) and (d)), depending on the slope of the function f (x)
EXAMPLE
27.1 The Logistic Mapping
One of the simplest but not least impressive examples of an iterative mapping is gen-
erated by the logistic function.1 This function is defined as an inverted parabola
f (x) = α x(1 − x), x ∈ [0, 1], (27.2)
with zeros at the border of the unit interval and a maximum at f (1/2) = α/4. The
logistic function depends on a real parameter α which shall lie in the range 1 < α ≤ 4,
since otherwise the mapping either would lead beyond the unit interval [0, 1] (α < 0,
α > 4) or would become trivial (0 < α < 1, all solutions converge toward x = 0).
The mapping (27.2) has the two fixed points
1
xs1 = 0 and xs2 = 1 − , (27.3)
α
with the derivatives
f (xs1 ) = α and f (xs2 ) = 2 − α. (27.4)
Since α shall be > 1, the first fixed point is always unstable. The second one is stable
(point attractor) if 1 < α < 3. In this parameter range all solutions look like that repre-
sented in Fig. 27.2 for α = 2.8; i.e., xn converges toward the fixed point xs2 , for small
starting value initially monotonically increasing, later on alternating. If α exceeds the
value α1 = 3, then according to (27.4), the fixed point xs2 becomes unstable, too.
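The stability statements above can be checked numerically. The following Python sketch iterates the mapping (27.2) and lists the fixed points (27.3) together with the slopes (27.4). The function names and the starting value x0 = 0.05 are illustrative choices, not taken from the text.

```python
def logistic(x, alpha):
    """Logistic function f(x) = alpha * x * (1 - x) from (27.2)."""
    return alpha * x * (1.0 - x)


def trajectory(x0, alpha, n):
    """Return [x_0, x_1, ..., x_n] generated by x_{k+1} = f(x_k)."""
    xs = [x0]
    for _ in range(n):
        xs.append(logistic(xs[-1], alpha))
    return xs


def fixed_point_slopes(alpha):
    """Fixed points (27.3), each paired with its slope f'(x_s) from (27.4)."""
    return [(0.0, alpha), (1.0 - 1.0 / alpha, 2.0 - alpha)]


for alpha in (2.8, 3.3):
    tail = trajectory(0.05, alpha, 200)[-4:]
    print(alpha, fixed_point_slopes(alpha), [round(x, 6) for x in tail])
```

For α = 2.8 the tail of the trajectory should settle on xs2 = 1 − 1/α, whereas for α = 3.3 it should alternate between two values, in line with |2 − α| > 1.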
Fig. 27.2. Typical trajectories of the logistic mapping. For α = 2.8, the attractor is a fixed point, and for α = 3.3, a limit cycle of period 2
The geometric or numerical construction of the solution shows that a stable limit
cycle of period 2 evolves, as is illustrated in Fig. 27.2(b) for the case α = 3.3. For
α = α1, a period doubling of the solution arises. One faces a pitchfork bifurcation as
is illustrated in Fig. 25.3. The old solution becomes unstable, and there appears a pair
of new stable solution branches. (Note that here both branches simultaneously belong
to the solution, since the latter one periodically jumps back and forth between the
branches. This behavior differs from the case studied in Chap. 25, where both solution
branches were independent of each other.)
1 The logistic mapping may be interpreted with some effort as the model equation of a particular
physical system; see the remark at the end of Example 27.3. It is also related to other fields of science.
The logistic mapping was introduced first in 1845 in biological population dynamics by the Belgian
biomathematician P.F. Verhulst. There it describes the evolution of a population of animals or plants
in a restricted environment whereby each iteration corresponds to a new generation (the old one
thereby dies off). For a low population density, the reproduction rate is positive (if α > 1). But, if one
approaches the saturation density x = 1, the reproduction rate decreases by overpopulation such that
xn+1 < xn. The simple parabola ansatz is sufficient to generate complex dynamics.
The period doubling can also be understood by inspecting the iterated mapping
f²(x) = f(f(x)). As already mentioned, for this mapping a periodic solution reduces
to two separated stationary fixed points. The iterated function
$$f^2(x) = \alpha^2 x(1 - x)(1 - \alpha x + \alpha x^2) \tag{27.5}$$
is a polynomial of fourth order with zeros at x = 0 and x = 1. Figure 27.3 shows this
function for various values of the parameter α. For α < α1 , the function f 2 intersects
the bisector x just as f does only at the two points xs1 and xs2 given in (27.3). As
a polynomial of fourth order, (27.5) allows however four intersection points with a
straight line, and just this happens for α > α1. As shown in Fig. 27.3, f² has a minimum
at x = 1/2 for α > 2 and two maxima at x = 1/2 ± √(1/4 − 1/(2α)). This can
be seen without much calculation from
$$(f^2)'(x) = f'\bigl(f(x)\bigr)\, f'(x). \tag{27.6}$$
Fig. 27.3. The iterated logistic function f(f(x)) displays two maxima for α > 2
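A zero of the derivative (extreme value of f²) follows from f′(x) = 0; thus x = 1/2.
For the two other zeros one finds
$$x = f^{-1}\!\left(\tfrac{1}{2}\right), \quad\text{since}\quad f'\bigl(f(x)\bigr) = f'\!\left(f\!\left(f^{-1}\!\left(\tfrac{1}{2}\right)\right)\right) = f'\!\left(\tfrac{1}{2}\right) = 0, \tag{27.7}$$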
which as a quadratic equation yields the two roots mentioned above. For the slope of
f 2 (x) at the position of the fixed point xs2 , one obtains
$$(f^2)'(x_{s2}) = (\alpha - 2)^2. \tag{27.8}$$
For α < α1 = 3, this slope is smaller than 1, and therefore xs2 is a stable fixed point of
f 2 too. Of course this must be so because f 2 simply picks every second value of the
series of the xn, and thus it takes over the stability from the solution of the mapping f.
For α = α1 , f 2 (x) touches the bisector, and for α > α1 two new intersection points x1 ,
x2 arise. These are two stable fixed points of the iterated mapping f 2 , i.e., f 2 (x1 ) =
x1 , f 2 (x2 ) = x2 , which are related by x2 = f (x1 ) and x1 = f (x2 ).
If the parameter α increases further, these fixed points also become unstable at a
critical value α = α2 , and the game repeats for the mapping f 2 (x). A further bifurca-
tion arises there with period doubling, and the new stable solution has a period length
of 4. Figure 27.4 shows this solution for the example α = 3.48.
Fig. 27.4. (a) For α = 3.48, the iterated logistic function f(f(x)) has one unstable and two stable fixed points at x ≈ 0.43298 and x ≈ 0.85437. (b) The trajectory shows a period of length 4
The critical value of α2 still can be given analytically, but with some effort. To
this end, one first has to find the position of the fixed points x1 and x2 . The condition
f²(x) = x together with (27.5) leads to a quartic equation. But we already know two
of its roots, namely, the fixed points xs1 and xs2 from (27.3), which may be factored
out by polynomial division:
$$f^2(x) - x = -\alpha\, x \left(x - 1 + \frac{1}{\alpha}\right)\left[\alpha^2 x^2 - \alpha(1 + \alpha)x + \alpha + 1\right] = 0. \tag{27.9}$$
Setting the last bracket to zero yields a quadratic equation that determines the two new
fixed points, namely,
$$x_{1,2} = \frac{1}{2}\left(1 + \frac{1}{\alpha}\right) \pm \frac{1}{2}\sqrt{\left(1 + \frac{1}{\alpha}\right)\left(1 - \frac{3}{\alpha}\right)}. \tag{27.10}$$
The second bifurcation occurs when these fixed points become unstable. This is de-
cided by the magnitude of the slope of the function f 2 (x) at the points x1 , x2 . The
slope begins with the value +1 at α = α1 , and then decreases continuously to −1. At
this point, the next bifurcation arises. One therefore has to solve the equation
$$(f^2)'(x_{1,2}, \alpha_2) = -1. \tag{27.11}$$
Taking into account (27.6) and (27.10), equation (27.11) seems to depend on α in
a very complicated manner. After some elementary transformations, (27.11) reduces
however to a simple quadratic equation for both fixed points in common,
$$\alpha_2^2 - 2\alpha_2 - 5 = 0. \tag{27.12}$$
Hence, the critical bifurcation parameter for which the 2-cycle turns over to the 4-cycle
is
$$\alpha_2 = 1 + \sqrt{6} = 3.4495\ldots. \tag{27.13}$$
That both fixed points simultaneously become unstable is plausible and immediately
follows from (27.6), since the slope of f²(x) has the same value at both points:
$$(f^2)'(x_1) = f'\bigl(f(x_1)\bigr) f'(x_1) = f'(x_2)\, f'\bigl(f(x_2)\bigr) = (f^2)'(x_2). \tag{27.14}$$
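The 2-cycle and its loss of stability can be verified with a few lines of Python. The sketch below evaluates the cycle points from (27.10) and the common slope of f(f(x)) at these points via the chain rule (27.6). The helper names are illustrative and not part of the text.

```python
import math


def f(x, alpha):
    """Logistic function (27.2)."""
    return alpha * x * (1.0 - x)


def df(x, alpha):
    """Derivative f'(x) = alpha * (1 - 2x)."""
    return alpha * (1.0 - 2.0 * x)


def two_cycle(alpha):
    """Points x1, x2 of the 2-cycle from (27.10); requires alpha > 3."""
    mean = 0.5 * (1.0 + 1.0 / alpha)
    half_gap = 0.5 * math.sqrt((1.0 + 1.0 / alpha) * (1.0 - 3.0 / alpha))
    return mean + half_gap, mean - half_gap


def cycle_slope(alpha):
    """Slope of f(f(x)) at x1, which equals the slope at x2 by (27.14)."""
    x1, x2 = two_cycle(alpha)
    return df(x2, alpha) * df(x1, alpha)


alpha2 = 1.0 + math.sqrt(6.0)  # critical value (27.13)
print(two_cycle(3.3), cycle_slope(3.3), cycle_slope(alpha2))
```

At α = 3.3 the two points should map onto each other with a slope of magnitude below 1, and at α = 1 + √6 the slope should reach −1, as required by (27.11).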
It is not surprising that with increasing α the cycle of the period 4 becomes unstable
too. There arises a full cascade of period doublings at α1, α2, α3, . . . . In the interval
αk < α < αk+1, there exists a stable limit cycle of period 2^k. Mathematically one may
consider instead of the limit cycle the iterated function f^{2^k}(x), which displays a set of
2^k distinct stationary fixed points.
One should note that the critical points αk are more and more closely spaced. As
was found at first empirically and then analytically, the αk obey the law of a geometric
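sequence that converges toward a cluster point α∞:
$$\alpha_k \sim \alpha_\infty - \frac{1}{\delta^k}. \tag{27.15}$$
The number δ is a constant that may be determined from the ratio
$$\delta = \lim_{k\to\infty} \frac{\alpha_k - \alpha_{k-1}}{\alpha_{k+1} - \alpha_k}. \tag{27.16}$$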
For its value, one gets numerically
δ = 4.669201 . . . , (27.17)
and the accumulation point is at
α∞ = 3.569944 . . . . (27.18)
The cascade of period doublings was considered first by Großmann and Thomae.2
Feigenbaum3 showed that the behavior (27.15) is not restricted to the logistic mapping
but is universally valid for a large class of iterative mappings.4
Surprisingly, the numerical value (27.17) is also universally valid, and δ is there-
fore called the Feigenbaum constant. Essentially it suffices that the iterated function
be smooth and display a quadratic maximum. The mathematical properties of the bi-
furcation cascade have been thoroughly studied; for a survey see, e.g., H. Schuster,
Deterministic Chaos, VCH Publishing Company, 1989.
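A numerical estimate of δ can be sketched as follows. As an assumption that goes beyond the text, the code does not locate the bifurcation points αk themselves but the superstable parameter values at which x = 1/2 belongs to the cycle of period 2^k; these are easier to compute by Newton's method and are expected to obey the same geometric scaling (up to a constant factor). The seed value 4.7 for δ and all function names are illustrative.

```python
import math


def superstable_parameter(period, guess, newton_steps=10):
    """Solve f^period(1/2) = 1/2 for alpha by Newton's method."""
    alpha = guess
    for _ in range(newton_steps):
        x, dx = 0.5, 0.0  # x_0 = 1/2 and d x_0 / d alpha = 0
        for _ in range(period):
            # Chain rule for x_{i+1} = alpha * x_i * (1 - x_i)
            dx = x * (1.0 - x) + alpha * (1.0 - 2.0 * x) * dx
            x = alpha * x * (1.0 - x)
        alpha -= (x - 0.5) / dx
    return alpha


def estimate_delta(levels=8, delta_seed=4.7):
    """Return (delta estimate, list of superstable parameters)."""
    # Superstable 1-cycle (alpha = 2) and 2-cycle (alpha = 1 + sqrt(5))
    params = [2.0, 1.0 + math.sqrt(5.0)]
    delta = delta_seed
    for k in range(2, levels + 1):
        # Extrapolate the next parameter from the geometric law
        guess = params[-1] + (params[-1] - params[-2]) / delta
        params.append(superstable_parameter(2 ** k, guess))
        delta = (params[-2] - params[-3]) / (params[-1] - params[-2])
    return delta, params


delta, params = estimate_delta()
print(delta, params[-1])
```

The ratio of successive parameter differences should approach the value (27.17), and the parameter values themselves should accumulate near α∞ from (27.18).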
2 S. Großmann and S. Thomae, Z. Naturforsch. 32a, 1353 (1977).

3 Mitchell Feigenbaum, American physicist and mathematician, b. December 19, 1945, Philadelphia.
Feigenbaum studied electrical engineering at the City College of New York and physics at MIT,
where he did his doctorate in 1970 in high-energy physics. In 1974, he went to Los Alamos National
Laboratory, where he was involved with problems of turbulence and nonlinear dynamics. In 1982,
Feigenbaum was appointed as a professor at Cornell University, and presently, he heads the laboratory
of mathematical physics at Rockefeller University in New York. Around 1976, he discovered, at first
in numerical experiments on the logistic mapping, the universality of the period-doubling cascade
and the constant that is named after him. In 1986, Feigenbaum (in common with Albert Libchaber,
who experimentally demonstrated the period doubling in the flow of liquids) was awarded the Wolf
Prize in physics.
4 M.J. Feigenbaum, J. Stat. Phys. 19, 25 (1978).
Of particular interest is what happens beyond α∞. For α = α∞ obviously a cycle
of “infinite period” arises, i.e., an aperiodic, never-repeating solution. α > α∞ is the
domain of chaos that is characterized by irregular and seemingly random trajectories.
As an example, Fig. 27.5 shows a fraction from such a chaotic solution calculated for
α = 3.9.
Fig. 27.5. A chaotic trajectory of the logistic mapping for α = 3.9
The attractor diagram of the logistic mapping: Figure 27.6 presents an attractor
diagram of the logistic mapping that is highly instructive and of amazing complexity.
For each value of α on the abscissa the values passed by a trajectory are plotted as
a point cloud along the ordinate direction. The first iteration steps (here 500) were
omitted in order to filter out “transient processes” (transients) and to represent the
asymptotic attractor itself. The subsequent (200 each) iterations are then plotted in the
diagram.
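The construction of the attractor diagram described above may be sketched in Python as follows. The numbers of discarded and retained iterations (500 and 200) follow the text; the starting value x0 = 0.5, the rounding to six digits, and the function names are assumptions made for illustration.

```python
def attractor_points(alpha, n_transient=500, n_keep=200, x0=0.5):
    """Iterate the logistic mapping, drop the transients, return the rest."""
    x = x0
    for _ in range(n_transient):
        x = alpha * x * (1.0 - x)
    points = []
    for _ in range(n_keep):
        x = alpha * x * (1.0 - x)
        points.append(x)
    return points


def count_branches(alpha, digits=6):
    """Number of distinct attractor values after rounding."""
    return len({round(x, digits) for x in attractor_points(alpha)})


for alpha in (2.8, 3.2, 3.5, 3.9):
    print(alpha, count_branches(alpha))
```

Plotting attractor_points(alpha) against a fine grid of α values would give a picture of the kind described in the text; the branch count merely summarizes it (1, 2, and 4 branches in the regular range, many distinct values in the chaotic range).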
Fig. 27.6. The attractor diagram of the logistic mapping (Feigenbaum diagram) shows the iterated values xn versus the parameter α. One notes a cascade of period doublings, and above α∞ ≈ 3.569944 a chaotic domain that however is infiltrated by windows of regular solutions. The marked rectangular domain is shown in linear magnification in Fig. 27.7
The left part of Fig. 27.6 shows the first segments of the bifurcation cascade: At α1,
α2, α3, the attractor splits into 2, 4, and 8 branches, respectively. According to (27.15),
the further critical points αk follow each other so closely that they are no longer resolved
in the figure. Beyond α∞ one sees continuous bands that are more or less uniformly
grey. The grey domains are crossed by some kind of “scars.” Obviously these
places are visited more frequently by the trajectory, and thus the probability density
P(x) of the attractor is increased here.
One notes that in the chaotic domain α > α∞, the orbits are confined to a partial
interval of [0, 1]. The borders of this interval are obtained as xmax(α) = f(1/2) = α/4
and xmin(α) = f²(1/2) = (1/16)α²(4 − α). In the limit α = 4, the attractor covers the
full unit interval. More precisely, for any
point x ∈ [0, 1] and any ε > 0, one can find an n(ε) such that |xn − x| < ε. Thus, the
trajectory approaches each point arbitrarily closely if one waits for a sufficiently long
time.
If the parameter is reduced from α = 4, there arise various interesting phenomena.
For example, when going below α′1 ≈ 3.6785 a band splitting is observed. The chaotic
attractor which formerly covered a connected interval splits at α = α′1 into two disjunct
domains, as is clearly seen in Fig. 27.6. The trajectory thereby alternates back and
forth between the two “partial bands.” For a further reduction of α, the partial bands
in turn split again, into 4 parts at α′2 ≈ 3.5926, and so on. Similar to the bifurcation
cascade α1, α2, . . . , α∞ of the regular attractor, there also exists a kind of reversed
bifurcation cascade α′1, α′2, . . . , α′∞ of the chaotic attractor. Both cascades meet at a
common limit α′∞ = α∞.
A closer inspection of the attractor diagram Fig. 27.6 reveals that there are also
windows of periodic solutions embedded in the chaotic domain. Particularly striking
is the domain with solutions of period 3 which occur above α ≈ 3.8283. This is connected
with a fixed point of the triply iterated mapping that arises at the position of
the maximum of f(x), i.e., at x = 1/2:
$$f^3\!\left(\tfrac{1}{2}\right) = \tfrac{1}{2}. \tag{27.19}$$
Solving this condition numerically yields αs3 ≈ 3.8319, slightly above the value
α = 1 + √8 ≈ 3.8284 at which the period-3 window opens. The point x = 1/2
has a particular meaning, since here the derivative vanishes: f′(1/2) = 0.
Because
$$\frac{d}{dx} f^3(x) = f'\bigl(f^2(x)\bigr)\, f'\bigl(f(x)\bigr)\, f'(x), \tag{27.20}$$
this property transfers also to all iterated mappings. This guarantees that the fixed point
(27.19) is stable. Here, one even speaks of a superstable cycle, since the magnitude
of the derivative of f n (x) which determines the stability takes the smallest possible
value, namely zero. Due to the continuity of the mapping as a function of the parame-
ter, the period-3 cycle is still stable in a finite environment of αs3 . In the downward
direction the window of stability borders on the chaotic region.
If α increases beyond αs3 , one again finds a cascade of bifurcations with period
doubling, i.e., the 3-cycle turns into a 6-cycle, etc. The bifurcations follow each other
more and more closely until chaotic solutions emerge again.
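The superstable parameter of the 3-cycle can be located numerically from condition (27.19). The sketch below uses a plain bisection; the bracketing interval [3.83, 3.84] and the function names are assumptions chosen for illustration. The root found in this way lies slightly above 1 + √8, the value at which the period-3 window opens.

```python
import math


def f3_at_half(alpha):
    """Triply iterated logistic function evaluated at x = 1/2."""
    x = 0.5
    for _ in range(3):
        x = alpha * x * (1.0 - x)
    return x


def bisect_root(g, lo, hi, tol=1e-12):
    """Bisection for a root of g in [lo, hi]; g must change sign there."""
    g_lo = g(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_mid > 0.0) == (g_lo > 0.0):
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


alpha_s3 = bisect_root(lambda a: f3_at_half(a) - 0.5, 3.83, 3.84)
print(alpha_s3, 1.0 + math.sqrt(8.0))
```

Since f′(1/2) = 0, the product (27.20) vanishes along this cycle, which is what makes it superstable.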
This consideration is not restricted to the cycle of period 3 but holds also for arbitrarily
larger period lengths. There exist superstable cycles for any natural number m
(their number even increases exponentially with m) with period length m which are
defined by
$$f^m\!\left(\tfrac{1}{2}\right) = \tfrac{1}{2}. \tag{27.21}$$
They are always enclosed by a small window of regular cyclic orbits in the otherwise
chaotic domain.
It is highly instructive to study the attractor diagram in detail. Figure 27.7 shows a
small section α = 3.52 to α = 3.65 at the border of the 3-cycle window. The agree-
ment with the full attractor diagram Fig. 27.6 is amazing. Although there are tiny dif-
ferences in the shape of both geometrical objects, their structure is nevertheless very
similar. This property—a partial section looks just as the entire diagram—is called
self-similarity (in the parameter space). For the attractor diagram of the logistic map-
ping, the process of successive magnification of a section may be continued ad infini-
tum: In Fig. 27.7, one again finds partial domains that resemble the entire figure and
so on, without an end.
Fig. 27.7. A small section of the attractor diagram of the logistic mapping shows quite a similar structure to the full diagram in Fig. 27.6
The Lyapunov exponent of the logistic mapping: In the nonchaotic range, the
Lyapunov exponent may be calculated rather simply analytically. We first consider
the parameter range α < α1 = 3. Here, a stable fixed point exists at xs2 = 1 − 1/α; see
(27.3). The quantity to be calculated is
$$\sigma = \lim_{n\to\infty} \frac{1}{n} \sum_{l=0}^{n-1} \ln |f'(x_l)|. \tag{27.22}$$
The influence of transients is excluded by the limit process. Because xl → xs2 , only
the fixed point contributes, and thus the sum reduces to a single term:
$$\sigma = \ln |f'(x_{s2})| = \ln |\alpha(1 - 2x_{s2})| = \ln |\alpha(1 - 2 + 2/\alpha)| = \ln |2 - \alpha|. \tag{27.23}$$
The result shows that the Lyapunov exponent diverges logarithmically at α → 2,
σ → −∞. This of course happens just there where the derivative of the function f
at the fixed point vanishes, which corresponds to a particularly fast approach to the
attractor. This situation is called superstable. When approaching the parameter value
α → α1 = 3, the Lyapunov exponent vanishes. This is a general feature of bifurca-
tion points, since here the old attractor loses its stability, and the new attractor is not
yet born. Beyond α1 , the period-2 cycle becomes an attractor, and therefore, σ again
decreases to negative values.
It is not difficult to calculate the Lyapunov exponent in the interval α1 < α < α2 .
Since now a periodic attractor exists that alternates between the points x1 and x2
from (27.10), (27.22) reduces to a sum of two terms:
$$\sigma = \frac{1}{2} \ln |f'(x_1)| + \frac{1}{2} \ln |f'(x_2)| = \frac{1}{2} \ln |f'(x_1) f'(x_2)|. \tag{27.24}$$
For the derivatives, we get
$$f'(x_{1,2}) = \alpha(1 - 2x_{1,2}) = -1 \mp \sqrt{(\alpha + 1)(\alpha - 3)}, \tag{27.25}$$
and the Lyapunov exponent is
$$\sigma = \frac{1}{2} \ln |1 - (\alpha + 1)(\alpha - 3)|. \tag{27.26}$$
This function begins with the value σ = 0 at α = α1, then decreases monotonically
and diverges at α = 1 + √5. This is again the superstable point of the 2-cycle, since
one easily confirms that
$$f^2\!\left(\tfrac{1}{2}\right) = \tfrac{1}{2} \quad\text{at}\quad \alpha = 1 + \sqrt{5}. \tag{27.27}$$
σ then increases again and finally reaches the value σ = 0. This happens when the
argument of the logarithm in (27.26) takes the value −1, and thus, 1 − (α + 1)(α − 3) =
−1, which leads to the quadratic equation (27.12). The solution α = α2 = 1 + √6 is
the bifurcation point at which the 2-cycle becomes unstable. In the further course
of the bifurcation cascade α2 < α < α∞, the game is repeating, and the Lyapunov
exponent oscillates in the interval 0 ≥ σ > −∞.
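The analytic results for the regular range can be compared with a direct evaluation of the defining sum (27.22). The following Python sketch is a minimal version under stated assumptions: the sum is truncated at a finite n, a fixed number of transient steps is discarded beforehand, and the starting value x0 = 0.3 is arbitrary.

```python
import math


def lyapunov(alpha, n=20000, n_transient=1000, x0=0.3):
    """Finite-n estimate of the Lyapunov exponent (27.22) of the logistic mapping."""
    x = x0
    for _ in range(n_transient):
        x = alpha * x * (1.0 - x)
    total = 0.0
    for _ in range(n):
        slope = abs(alpha * (1.0 - 2.0 * x))
        total += math.log(max(slope, 1e-300))  # guard against log(0)
        x = alpha * x * (1.0 - x)
    return total / n


def sigma_fixed_point(alpha):
    """Analytic result ln|2 - alpha| for 1 < alpha < 3."""
    return math.log(abs(2.0 - alpha))


def sigma_two_cycle(alpha):
    """Analytic result for the 2-cycle range 3 < alpha < 1 + sqrt(6)."""
    return 0.5 * math.log(abs(1.0 - (alpha + 1.0) * (alpha - 3.0)))


print(lyapunov(2.5), sigma_fixed_point(2.5))
print(lyapunov(3.2), sigma_two_cycle(3.2))
print(lyapunov(3.9))
```

In the regular range the numerical and analytic values should agree closely, while for a chaotic parameter such as α = 3.9 the estimate should come out positive.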
A qualitatively new feature arises in the chaotic range α > α∞, since here σ takes
positive values. This is represented in Fig. 27.8, for which the Lyapunov exponent was
determined numerically. To the left of α = α∞, one faces the range with negative σ
that has been discussed already. For α > α∞, σ becomes positive and on the average
increases with increasing α. In the limit α = 4, there results the maximum value
σ = ln 2 ≈ 0.6931, as will be shown in Exercise 27.2. The chaotic domain is repeatedly
interspersed with windows corresponding to regular solutions where σ becomes
negative. The figure reflects the complexity of the function σ(α) only imperfectly. The
windows may be perceived only approximately, due to the limited resolution. The indicated
peaks pointing down in the figure are actually poles extending to σ → −∞.
A realistic representation of σ(α), when plotted with finite line width, would display
only a largely homogeneous black block that extends to −∞. There are infinitely
many windows with stable cycles, and one may even show that they lie densely over
the entire real interval 0 < α < 4: Any arbitrarily small vicinity of each point α still
includes stable cycles!
Fig. 27.8. Numerically calculated values of the Lyapunov exponent σ of the logistic mapping. For σ > 0, chaotic trajectories arise
The function σ (α) also has the property of self-similarity, just like the attractor
in Fig. 27.6. For each step of magnification, the partial sections look like the full
figure.
EXERCISE
27.2 Logistic Mapping and the Bernoulli Shift
Problem. The mapping
yn+1 = 2 yn (mod 1) with yn ∈ [0, 1] (27.28)
is called the saw-tooth mapping, or the Bernoulli shift.

(a) Discuss the trajectories yn generated by the Bernoulli shift as a function of the
start value y0 .
(b) Show that for the parameter α = 4, i.e., in the region of the fully developed chaos,
the logistic mapping and the Bernoulli shift are equivalent.
Hint: Use the transformation of variables
$$x_n = \frac{1}{2}\bigl(1 - \cos(2\pi y_n)\bigr) = \sin^2(\pi y_n). \tag{27.29}$$
(c) Find the frequency distribution P (x) of a “typical” trajectory xn of the logistic
mapping for passing the various points in the interval 0 < x < 1, and calculate the
Lyapunov exponent σ from this distribution.
Solution. (a) The graph of the function f(y) = 2y (mod 1) consists of two pieces of
a straight line with the slope 2 that are shifted relative to each other, as is represented
in Fig. 27.9. Since everywhere f′(y) = 2 > 1, stable fixed points cannot exist. The
iterated solution of (27.28) is simple,
$$y_n = 2^n y_0 \pmod 1. \tag{27.30}$$
Hence, all iterated solutions of an initial value y0 are explicitly known; nevertheless
the mapping is chaotic! This is due to the factor 2^n, which implies an exponential
enhancement of smallest deviations in the initial value y0. As is represented in the
figure, an interval Δy is stretched by the Bernoulli shift by a factor 2 to the length
2Δy. Values falling into the range y > 1 are folded back to the unit interval by the
modulo operation. This repeated sequence of stretching and folding is characteristic
of chaotic mappings. In this way there results a thorough mixing of the trajectories
and a sensitive dependence on the initial conditions.
Fig. 27.9. The mapping function of the Bernoulli shift
The Bernoulli shift mapping may be visualized by writing y as a number in binary
representation:
$$y = \sum_{k=1}^{\infty} b_k 2^{-k}, \quad\text{i.e.,}\quad y = 0.b_1 b_2 b_3 b_4 \ldots \quad\text{with } b_k = 0 \text{ or } 1. \tag{27.31}$$
A doubling of y corresponds to a left shift (therefore the name) of the binary digits bk,
and the modulo operation ensures that the digit before the binary point is cut off:
$$f(y) = b_1.b_2 b_3 b_4 \ldots \pmod 1 = 0.b_2 b_3 b_4 b_5 \ldots. \tag{27.32}$$
Now the effect of the Bernoulli shift becomes clear: Each iteration moves the “back
digits” of the binary expansion one place toward the front. If the initial value y0 is
known to an accuracy of 2^−m, this information is exhausted after m iteration steps.
Later on there remains only “numerical noise”; the trajectory wanders around through
the unit interval in an unpredictable way.
Mathematically, there still remains a subtle difference between rational and irra-
tional values of the initial condition. A rational number (a fraction p/q) y0 has a
binary representation which (after a finite number of steps) becomes periodic. Con-
sequently the trajectory yn formed according to (27.28) with (27.32) also becomes
periodic. A simple example is provided by the cycle 2/3, 1/3, 2/3, . . . , since
$$y_0 = 0.101010\ldots = \frac{1}{2}\left(1 + \frac{1}{2^2} + \frac{1}{2^4} + \cdots\right) = \frac{1}{2}\,\frac{1}{1 - 1/4} = \frac{2}{3},$$
$$y_1 = 0.010101\ldots = \frac{1}{4}\left(1 + \frac{1}{2^2} + \frac{1}{2^4} + \cdots\right) = \frac{1}{4}\,\frac{1}{1 - 1/4} = \frac{1}{3},$$
$$y_2 = 0.101010\ldots, \quad \ldots$$
The rational numbers lie densely on the real axis. Thus, there are infinitely many start
solutions leading to periodic trajectories, and in any arbitrarily small vicinity of a
point y one finds such solutions. On the other hand, the set of rational numbers has
the measure zero; these are therefore “atypical” numbers. If one adopts a “typical”
initial value y0 , e.g., a random number in the interval [0, 1], then the probability for
y0 being rational and thus leading to a periodic trajectory is arbitrarily small. For a
physicist, rational initial conditions do not play a role because of the finite precision of
measurement. In the numerical simulation, the situation is different: If the computer
used stores the numbers with a precision to m bits, then after at most m steps the
simulation of the Bernoulli shift becomes meaningless. Fortunately for many purposes
it makes no difference whether one is dealing with a cyclic solution with a very long
period or with a genuine nonperiodic solution.
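Both observations, the periodic orbit of a rational start value and the breakdown of a finite-precision simulation, can be reproduced with a short Python sketch. Exact rational arithmetic uses the standard-library type `fractions.Fraction`; the start values 2/3 and 0.1 and the function name are illustrative choices.

```python
from fractions import Fraction


def bernoulli_orbit(y0, n):
    """Return [y_0, ..., y_n] for the Bernoulli shift y -> 2y (mod 1)."""
    orbit = [y0]
    for _ in range(n):
        orbit.append((2 * orbit[-1]) % 1)
    return orbit


# Exact arithmetic: the rational start value 2/3 gives the cycle 2/3, 1/3, ...
exact_orbit = bernoulli_orbit(Fraction(2, 3), 4)
print(exact_orbit)

# Binary floating point: every step shifts one bit out of the mantissa,
# so the stored information about y_0 is used up after a few dozen steps.
float_orbit = bernoulli_orbit(0.1, 60)
print(float_orbit[-1])
```

With double-precision numbers the simulated orbit of y0 = 0.1 collapses onto y = 0 once all mantissa bits have been shifted out, which illustrates why the numerical simulation becomes meaningless after about m steps.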
(b) The logistic equation
xn+1 = 4 xn (1 − xn ) (27.33)
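is transformed to the new variable yn. The right-hand side then becomes
$$4 \cdot \frac{1}{2}\bigl(1 - \cos(2\pi y_n)\bigr)\left[1 - \frac{1}{2} + \frac{1}{2}\cos(2\pi y_n)\right] = 1 - \cos^2(2\pi y_n) = 1 - \frac{1}{2}\bigl(1 + \cos(4\pi y_n)\bigr) = \frac{1}{2}\bigl(1 - \cos(4\pi y_n)\bigr), \tag{27.34}$$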
and hence, (27.33) reads
$$\frac{1}{2}\bigl(1 - \cos(2\pi y_{n+1})\bigr) = \frac{1}{2}\bigl(1 - \cos(4\pi y_n)\bigr) \tag{27.35}$$
or
$$\cos(2\pi y_{n+1}) = \cos(4\pi y_n). \tag{27.36}$$
This equation is solved by
$$y_{n+1} = 2y_n \pmod 1, \tag{27.37}$$
i.e., the Bernoulli shift mapping. When transformed back to the variable x, the solution
(27.37) reads
$$x_n = \frac{1}{2}\bigl(1 - \cos(2\pi\, 2^n y_0)\bigr) = \sin^2(\pi\, 2^n y_0) = \sin^2\!\bigl(2^n \arcsin\sqrt{x_0}\bigr). \tag{27.38}$$
As was discussed in (a), this leads for almost all initial values x0 to chaotic trajectories
which cannot be calculated even numerically if n becomes large.
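For small n, the closed-form solution obtained from the substitution (27.29), namely xn = sin²(2^n arcsin √x0), can be compared directly with the iterated mapping (27.33). The sketch below does this in Python; the start value x0 = 0.3 and the function names are illustrative.

```python
import math


def logistic4_iterated(x0, n):
    """Apply x -> 4x(1 - x) n times."""
    x = x0
    for _ in range(n):
        x = 4.0 * x * (1.0 - x)
    return x


def logistic4_closed_form(x0, n):
    """Closed-form solution sin^2(2^n * arcsin(sqrt(x0))) for alpha = 4."""
    return math.sin(2 ** n * math.asin(math.sqrt(x0))) ** 2


for n in (1, 5, 10):
    print(n, logistic4_iterated(0.3, n), logistic4_closed_form(0.3, n))
```

For larger n the two results drift apart in floating-point arithmetic, because rounding errors are amplified by roughly a factor 2 per step, in line with the discussion in part (a).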
(c) In the range of definition of the Bernoulli shift mapping (27.28), no point is
particularly distinguished. For typical, i.e., irrationally chosen initial conditions, the
solution (27.30) will meet all numbers in the interval (0, 1) with the same probability,
hence P(y) = 1. This implies a corresponding probability density of the logistic mapping of
$$P(x) = 2P(y)\left|\frac{dy}{dx}\right| = 2\,\frac{1}{2\pi \sin(\pi y)\cos(\pi y)} = \frac{1}{\pi\sqrt{x(1 - x)}}. \tag{27.39}$$
The factor 2 arises because the transformation equation has two solutions y symmetrical
about y = 1/2 for any value of x. The probability density P(x) has its minimum
at x = 1/2 and increases toward the borders of the unit interval. At x → 0 and x → 1,
the function diverges but remains integrable. It is normalized to unity.
For the calculation of the Lyapunov exponent (26.8), we replace the mean value
over the time sequence by a mean value over the probability distribution P(x):
$$\sigma = \lim_{n\to\infty} \frac{1}{n} \sum_{l=0}^{n-1} \ln |f'(x_l)| = \int_0^1 dx\, P(x) \ln |f'(x)|. \tag{27.40}$$
Systems for which this substitution time average ↔ phase-space average is permis-
sible are called ergodic. The proof of ergodicity of a system is in general not simple.
Evaluated for the equivalent Bernoulli shift, where |f′(y)| = 2 for all y and the probability
distribution P(y) = 1 is normalized to unity, the Lyapunov exponent of the logistic mapping
at α = 4 is obtained as
σ = ln 2 = 0.6931 . . . , (27.41)
which agrees with the numerical result from Fig. 27.8.
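The phase-space average (27.40) can also be evaluated numerically in the variable x. The sketch below substitutes x = sin²(πy), which turns dx P(x) into dy with P(y) = 1, and applies a midpoint rule. The number of cells and the function name are assumptions; the midpoints never coincide with y = 1/4 or y = 3/4, where the logarithm diverges.

```python
import math


def lyapunov_from_density(n_cells=200000):
    """Midpoint-rule estimate of the integral of P(x) ln|f'(x)| for alpha = 4."""
    total = 0.0
    for i in range(n_cells):
        y = (i + 0.5) / n_cells           # midpoint of the i-th cell in y
        x = math.sin(math.pi * y) ** 2    # transformation (27.29)
        total += math.log(abs(4.0 * (1.0 - 2.0 * x)))  # ln|f'(x)|
    return total / n_cells


sigma = lyapunov_from_density()
print(sigma, math.log(2.0))
```

The estimate should agree with ln 2 to within the accuracy of the quadrature.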
EXAMPLE
27.3 The Periodically Kicked Rotator
In this section, we shall meet another example of a discrete mapping that, despite
its simple form, leads to complex solutions and to chaotic behavior. Thereby, several
new concepts arise, and a way is described that leads from a quasiperiodic to a chaotic
motion.
In contrast to the logistic mapping, which was introduced as a purely mathematical
example, we now consider the motion of a specific mechanical system, namely, a
damped rotator that is under the influence of an external force. The corresponding
equation of motion for the rotational angle θ reads
θ̈ + β θ̇ = M(θ, t). (27.42)
Here, β is a friction parameter, and M(θ, t) describes the imposed time-dependent
torque divided by the moment of inertia of the rotator. The external force shall depend
periodically on the time. The problem simplifies if the force is acting in short pulse-like
impacts spaced in time by the period T:
$$M(\theta, t) = M(\theta) \sum_{n=0}^{\infty} \delta(t - nT). \tag{27.43}$$
The nonautonomous differential equation of second order
$$\ddot{\theta} + \beta\dot{\theta} - M(\theta) \sum_{n=0}^{\infty} \delta(t - nT) = 0 \tag{27.44}$$
can be rewritten as usual in an autonomous system of three coupled differential equations
of first order. With x = θ, y = θ̇, and z = t, we have
$$\dot{x} = y, \qquad \dot{y} = -\beta\dot{\theta} + M(\theta) \sum_{n=0}^{\infty} \delta(z - nT), \qquad \dot{z} = 1. \tag{27.45}$$
Because of the specific form of the force (27.43), the rotator is again and again accel-
erated by the pulse, but otherwise moves freely influenced only by friction. Therefore,
the equations of motion can be integrated exactly. For this purpose, we introduce the
discretized variables
$$x_n = \lim_{\epsilon\to 0} x(nT - \epsilon), \qquad y_n = \lim_{\epsilon\to 0} y(nT - \epsilon). \tag{27.46}$$
Thus, position and velocity are scanned shortly before the individual kicks. We now
consider the kick number n and integrate the equation of motion over the nth
time interval nT − ε < t < (n + 1)T − ε. Since only one kick contributes in this
interval, the equation of motion for y reads
ẏ = −βy + M(x)δ(t − nT ). (27.47)
The solution of the homogeneous differential equation holding between the kicks may
be given immediately:
$$y(t) = a\,e^{-\beta t} \quad\text{for } t \neq nT. \tag{27.48}$$
The impact of force causes a sudden increase of velocity according to
$$y(nT + \epsilon) - y(nT - \epsilon) = \int_{nT-\epsilon}^{nT+\epsilon} dt\,\bigl[-\beta y + M(x)\,\delta(t - nT)\bigr] \tag{27.49}$$
$$= -2\epsilon\beta\, y(nT) + M\bigl(x(nT)\bigr) \xrightarrow{\;\epsilon\to 0\;} M\bigl(x(nT)\bigr), \tag{27.50}$$
as is schematically represented in Fig. 27.10. With (27.48) and (27.49), one obtains
$$y(t) = \bigl[y_n + M(x_n)\bigr] e^{-\beta(t - nT)} \quad\text{for } nT < t < (n + 1)T. \tag{27.51}$$
The angle coordinate x(t) results from this by integration:
$$x(t) = x_n + \int_{nT}^{t} dt'\, y(t') = x_n - \frac{1}{\beta}\left(e^{-\beta(t - nT)} - 1\right)\bigl[y_n + M(x_n)\bigr]. \tag{27.52}$$
Equations (27.51) and (27.52) taken at t = (n + 1)T − ε thus yield a two-dimensional
discrete mapping
$$x_{n+1} = x_n + \frac{1}{\beta}\left(1 - e^{-\beta T}\right)\bigl[y_n + M(x_n)\bigr] \pmod{2\pi}, \tag{27.53}$$
$$y_{n+1} = e^{-\beta T}\bigl[y_n + M(x_n)\bigr]. \tag{27.54}$$
Fig. 27.10. Periodic impacts of force (lower part) cause sudden changes of velocity (upper part). The influence of friction between the impacts is perceptible
By performing the modulo operation, one takes into account that x is a periodic angular
coordinate. The system (27.53) describes a Poincaré mapping of the periodically
kicked rotator.
The angular dependence of the torque M(x) is determined by the specific physical
system. If a body is moving on a circular orbit and is under a force pointing always
in the same direction (see Fig. 27.11), then the torque is proportional to the sine of
the angle, M = r × F = −r F sin θ ez. In addition, we take into account an angle-independent
torque M0, i.e., with K0 = r F:
$$M(x) = M_0 - K_0 \sin x. \tag{27.55}$$
The system (27.53) and (27.54) with the nonlinear force law (27.55) is called the dissipative
circular mapping (“dissipative” since we are dealing with the limit of strong
damping) and displays very interesting dynamic properties.
Fig. 27.11. The force F(t) always points in the same direction, independent of the displacement angle θ of the rotator
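A single step of the two-dimensional mapping (27.53), (27.54) may be sketched in Python as follows. As an assumption, the torque is taken as M(x) = M0 − K0 sin x, i.e., with the sign that follows from M = −r F sin θ and that reproduces the term −K sin xn of the circular mapping. All function names and the parameter values are illustrative.

```python
import math


def torque(x, m0, k0):
    """Torque law; the minus sign of the sine term is an assumption (see lead-in)."""
    return m0 - k0 * math.sin(x)


def kicked_rotator_step(x, y, beta, period, m0, k0):
    """One step (x_n, y_n) -> (x_{n+1}, y_{n+1}) of the mapping (27.53), (27.54)."""
    b = math.exp(-beta * period)
    kicked_velocity = y + torque(x, m0, k0)  # velocity right after the kick
    x_next = (x + (1.0 - b) / beta * kicked_velocity) % (2.0 * math.pi)
    y_next = b * kicked_velocity
    return x_next, y_next


def orbit(x0, y0, n, **params):
    """Return the first n + 1 points of the discrete trajectory."""
    points = [(x0, y0)]
    for _ in range(n):
        points.append(kicked_rotator_step(*points[-1], **params))
    return points


pts = orbit(1.0, 0.0, 5, beta=1.0, period=0.5, m0=0.5, k0=0.8)
print(pts)
```

For βT ≫ 1 the velocity yn+1 becomes negligible and the angle equation alone determines the dynamics, which is the strong-damping limit of the mapping.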
The equations may be written still more clearly by introducing the following abbreviations:
$$b = e^{-\beta T}, \qquad \Omega = \frac{1}{\beta} M_0, \qquad K = \frac{1}{\beta}\left(1 - e^{-\beta T}\right) K_0, \qquad r_n = \frac{1}{b\beta}\left(1 - e^{-\beta T}\right) y_n - \Omega. \tag{27.56}$$
rn represents a rescaled velocity coordinate. Insertion into (27.53) leads to the following
form of the dissipative circular mapping:
$$x_{n+1} = x_n + b r_n + \Omega - K \sin x_n \pmod{2\pi}, \tag{27.57}$$
$$y_{n+1} = e^{-\beta T}\bigl[y_n + M(x_n)\bigr]. \tag{27.58}$$
One can iterate this equation numerically and study the solutions for various values
of the parameters b, K, and Ω. A further simplification results in the limit of strong
damping, i.e., βT ≫ 1 or b ≪ 1. (To overcome the friction, K0 must, according to
(27.56), increase linearly with β.) In this case, the velocity yn is decelerated to zero
immediately after each “kick.” Consequently, the equation for the angle decouples to
$$x_{n+1} = f(x_n) \pmod{2\pi} = x_n + \Omega - K \sin x_n \pmod{2\pi}. \tag{27.59}$$
This equation is called the one-dimensional circular mapping or the standard map-
ping. Its mathematical properties were intensely studied, in particular by the Russian
mathematician V.I. Arnold.5 Its interesting properties are due to the nonlinearity of
the sine function.
Let us consider for a moment the trivial limit K = 0, i.e., the linear circular mapping
$$x_{n+1} = x_n + \Omega \pmod{2\pi}. \tag{27.60}$$
The rotator is moving forward in equidistant steps Ω. If it reaches the old position
again after a finite number of steps, the motion is periodic. This obviously happens if
Ω/2π is a rational number,
$$\Omega = 2\pi \frac{p}{q}, \qquad p \text{ and } q \text{ coprime numbers}. \tag{27.61}$$
This means that after q time steps the rotator has performed p full turns, i.e., takes
again (modulo 2π) its original position. One deals with a solution with the period q.
If however Ω/2π is an irrational number, the initial point x0 is not reached again even
after an arbitrarily long time. But there occur values xn in any arbitrarily small vicinity
of x0. In such a case the motion is called quasiperiodic.
5 V.I. Arnold, Trans. of the Am. Math. Soc. 42, 213 (1965).
In order to characterize the motion, one may define a winding number:
$$W = \frac{1}{2\pi} \lim_{n\to\infty} \frac{f^n(x_0) - x_0}{n}. \tag{27.62}$$
Here, f^n(x0) is the n-fold iterated mapping function (without performing the modulo
operation) from (27.59). The winding number thus represents the mean shift per stroke
interval. W = W(Ω, K) depends on both parameters of the circular mapping. In the
linear case, K = 0, W just coincides with the fraction p/q defined in (27.61). Since
the rational numbers, although densely located on the real axis, form only a set of
measure zero, the “typical” trajectories are quasiperiodic.
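The winding number can be estimated by iterating the mapping without the modulo operation and truncating the limit at a finite n. The following Python sketch does so; the truncation at n = 20000, the start value x0 = 0, and the function names are assumptions made for illustration.

```python
import math


def winding_number(omega, k, n=20000, x0=0.0):
    """Finite-n estimate of W for x -> x + omega - k*sin(x), iterated without modulo."""
    x = x0
    for _ in range(n):
        x = x + omega - k * math.sin(x)
    return (x - x0) / (2.0 * math.pi * n)


def returns_after(omega, q, tol=1e-9):
    """Check whether q steps of the linear mapping (K = 0) restore the angle mod 2*pi."""
    shift = (q * omega) % (2.0 * math.pi)
    return min(shift, 2.0 * math.pi - shift) < tol


omega_rational = 2.0 * math.pi * 2.0 / 5.0  # p = 2, q = 5
print(winding_number(omega_rational, 0.0), returns_after(omega_rational, 5))
print(winding_number(1.0, 0.0), returns_after(1.0, 5))
```

For Ω = 2π · 2/5 the estimate should reproduce W = 2/5 with period 5, while Ω = 1 (an irrational multiple of 2π) gives W = 1/(2π) and no return after 5 steps.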
What happens now if the nonlinearity becomes effective (i.e., for K ≠ 0) in the
circular mapping (27.60)? Let us consider a periodic solution with a rational winding
number W = p/q. The angular coordinate passes a cycle x1, x2, . . . , xq of length q
that causes a final shift of xq = x1 + 2πp mod 2π = x1; hence,
$$f^q(x_1) = x_1 + 2\pi p. \tag{27.63}$$
At the beginning of the present chapter, we met the criterion for the stability of a
discrete mapping, in the case at hand, of the q-fold iterated function f^q(x): The magnitude
of the derivative of the function must be smaller than unity, and thus,
$$\left|\frac{d}{dx} f^q(x_1)\right| = \left|\frac{d}{dx} f\bigl(f(\cdots f(x)\cdots)\bigr)\right|_{x = x_1} = \left|\prod_{i=1}^{q} f'(x_i)\right| = \left|\prod_{i=1}^{q} \bigl(1 - K\cos x_i\bigr)\right| < 1. \tag{27.64}$$
If this condition is fulfilled, then x1 (and, since all points in the cycle are on an equal
footing, all other xi too) belongs to a stable periodic attractor. In the linear case, K = 0,
the derivative always has the value (f^q)′(x) = 1. There exists marginal stability where
neighboring orbits are neither attracted nor repelled. If however 0 < K < 1, then each
of the solutions Ωp,q = 2πp/q becomes a periodic attractor with a domain of attraction
of finite width ΔΩp,q. This is an example of a very interesting phenomenon
that arises in many branches of physics. Vibrating systems which are characterized
by two distinct frequencies adjust themselves (provided that there is a corresponding
interaction) in such a manner that the frequencies are synchronized, i.e., are in
an integral ratio to each other. The phenomenon is also called mode locking. The two
frequencies of the system considered here are determined on one hand by the stroke
length T, and on the other hand by the magnitude of the torque M0.
Possibly the earliest experimental evidence for the phenomenon of mode locking
is ascribed to the Dutch physicist Christiaan Huygens. He observed that a series of
pendulum clocks (Huygens played a decisive role in their invention) which were
suspended in a row began to swing in the same rhythm, although their limited accuracy
of movement would rather have suggested a drifting apart. Huygens recognized
that the weak coupling of the clocks via their common back wall was responsible for
the synchronization.
The extension of the mode-locking ranges can be calculated from (27.63). For
larger period lengths q, this may be performed only numerically. We therefore re-
strict ourselves here to the simplest case q = 1. If there is a complete synchronization
with winding number W = 1, then (27.63) becomes
$$f(x_1) = x_1 + 2\pi \quad\text{or}\quad \Omega = K\sin x_1 + 2\pi. \tag{27.65}$$
The stability condition (27.64) reads
$$|1 - K\cos x_1| < 1, \tag{27.66}$$
which is fulfilled for angles 0 < x1 < π/2 or 3π/2 < x1 < 2π. The associated values
of Ω may be read off from (27.65):
$$2\pi - K < \Omega_{1,1} < 2\pi + K. \tag{27.67}$$
The range of mode locking with just one turn of the rotator per stroke interval thus
has the shape of a triangle that opens with increasing K. Just the same consideration
leads to the attraction domain of the attractor with winding number W = 0, and thus,
p = 0, q = 1:
$$-K < \Omega_{0,1} < K. \tag{27.68}$$
Because of the periodicity of the angle coordinate, it suffices to investigate the interval
0 ≤ Ω ≤ 2π. Values beyond this range mean only that the rotator for each kick
performs additional full turns, which does not change the dynamics significantly. In
Fig. 27.12, the borders of the ranges (27.67) and (27.68) are drawn as straight lines.
The parameter values for which a periodic synchronized motion arises are shaded in
the diagram.
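The q = 1 ranges (27.67) and (27.68) can be checked directly from the fixed-point and stability conditions. The sketch below assumes 0 < K < 1 and the mapping f(x) = x + Ω − K sin x implied by (27.65); the helper name `locked_fixed_point` is hypothetical and serves only as an illustration.

```python
import math

def locked_fixed_point(omega, k, p=0):
    """Stable fixed point of the p/1 locked state, assuming 0 < k < 1.

    Returns (x1, multiplier) with x1 in [0, 2*pi) and multiplier = f'(x1),
    or None if (omega, k) lies outside the range -k < omega - 2*pi*p < k.
    """
    s = (omega - 2.0 * math.pi * p) / k   # sin(x1), from (27.65) / (27.70)
    if abs(s) >= 1.0:
        return None                       # no fixed point, or only marginal stability
    x1 = math.asin(s)                     # branch with cos(x1) > 0, cf. (27.66)
    multiplier = 1.0 - k * math.cos(x1)   # f'(x1); its magnitude is below 1 here
    return x1 % (2.0 * math.pi), multiplier

if __name__ == "__main__":
    print(locked_fixed_point(0.3, 0.5))                       # inside the W = 0 range
    print(locked_fixed_point(0.6, 0.5))                       # outside: None
    print(locked_fixed_point(2.0 * math.pi + 0.3, 0.5, p=1))  # inside the W = 1 range
```

Choosing the arcsine branch with cos x1 > 0 is what selects the stable solution; the other root of the sine equation has |f'(x1)| > 1.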
Fig. 27.12. Several domains of stability for the one-dimensional circular mapping depending on the parameters Ω and K. The hatched regions are called “Arnold tongues.” On the lines in the upper half the condition of superstability is fulfilled
Analogous considerations may be made for any rational winding number W =
p/q. Some of the stability ranges (there are, of course, infinitely many) are
represented in Fig. 27.12. The width of these ranges, which are also called Arnold
tongues, increases monotonically with K. In the linear limit (K = 0), the summed-up
total width of all Arnold tongues is equal to zero (measure of the rational numbers),
as was already mentioned. It can be shown⁶ that for K = 1 the Arnold tongues cover
the entire Ω-range:
$$\sum_{p,q}\Delta\Omega_{p,q} = 2\pi \quad\text{for } K = 1. \tag{27.69}$$
Hence, the situation is exactly complementary to the case K = 0; the “typical” solutions
are now periodic (mode locking), while the quasiperiodic solutions have
measure zero.
The function W(Ω), i.e., the winding number as a function of the frequency parameter
Ω at K = 1, is called the devil’s staircase (see Fig. 27.13). It is a function that
is everywhere continuous and monotonically nondecreasing; it is constant on every step,
so its derivative vanishes wherever it exists, and yet the function is not constant. To any
rational number p/q belongs a step; the width of the steps decreases with increasing period
length q; their total width according to (27.69) covers the entire interval. The devil’s
staircase has the property of self-similarity, i.e., any sectional magnification again
resembles the entire object.
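A coarse numerical picture of the staircase can be obtained by scanning Ω at fixed K = 1. This is only a sketch under the assumption that the mapping is f(x) = x + Ω − K sin x; the grid size and iteration count are arbitrary illustrative values, so narrow steps are not resolved.

```python
import math

def winding_number(omega, k, n=2000, x0=0.0):
    """Finite-n estimate of the winding number (27.62)."""
    x = x0
    for _ in range(n):
        x = x + omega - k * math.sin(x)   # lift of the circle map, no modulo
    return (x - x0) / (2.0 * math.pi * n)

def staircase(k=1.0, steps=64, n=2000):
    """Sample W(Omega) on a uniform grid 0 <= Omega <= 2*pi."""
    omegas = [2.0 * math.pi * i / steps for i in range(steps + 1)]
    return omegas, [winding_number(om, k, n) for om in omegas]

if __name__ == "__main__":
    omegas, ws = staircase()
    for om, w in zip(omegas[::8], ws[::8]):
        print(f"Omega/2pi = {om / (2.0 * math.pi):.3f}   W ~ {w:.4f}")
```

Plotting `ws` against `omegas` on a finer grid shows the plateaus at rational W; for K ≤ 1 the mapping is monotonic in x, so the sampled values never decrease with Ω.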
Fig. 27.13. The steps of the “devil’s staircase” are ranges where the winding number “clicks into place,” i.e., is independent of the frequency parameter Ω. The sectional magnification on the right indicates the self-similar structure of the devil’s staircase
If the parameter of the nonlinear coupling exceeds the value K = 1, then the Arnold
tongues coalesce. This is connected with a conversion of the quasiperiodic solutions
into chaotic ones, as may be seen from the occurrence of positive Lyapunov exponents.
At the same time, for K > 1 there still exist domains of periodic solutions with a
negative Lyapunov exponent. Both types of solutions are interwoven in a complex
manner. For the logistic mapping, we observed a mixing of regular and chaotic motion.
But now the relations are even more complicated, since now two parameters, Ω and
K, may be varied independently. An interesting feature of the chaotic solutions is that
they don’t have a well-defined winding number. The solution moves so irregularly that
the limit in (27.62) does not exist.
In the upper half of Fig. 27.12, for K > 1, the centers of several periodic ranges
are plotted as lines. Here, the condition of superstability introduced on page 524 is
fulfilled, i.e., the derivative of the iterated mapping function of a q-cycle vanishes,
(f^q)'(x) = 0. For the period q = 1, this condition may be evaluated easily. The solution
with winding number W = 0 has as its fixed-point condition
$$f(x_1) = x_1; \quad\text{thus,}\quad \Omega - K\sin x_1 = 0. \tag{27.70}$$
6 M.H. Jensen, P. Bak and T. Bohr, Phys. Rev. A30, 1960 (1984).
This 1-cycle is superstable if (see also (27.59))
$$f'(x_1) = 0; \quad\text{thus,}\quad 1 - K\cos x_1 = 0. \tag{27.71}$$
With the value x1 following from (27.70), equation (27.71) yields the condition for
superstability
$$K = \sqrt{1 + \Omega^{2}} \tag{27.72}$$
as is plotted in Fig. 27.12. The condition for a superstable fixed point with the winding
number W = 1 follows analogously as
$$K = \sqrt{1 + (2\pi - \Omega)^{2}}. \tag{27.73}$$
The crossing of the curves indicates that for equal values of Ω and K distinct stable
solutions may coexist. In the range K > 1, there is no longer a straightforward relation
between the parameters K, Ω, and the winding number W. Which of the solutions is
realized then depends on the initial condition for x.
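The step from the fixed-point and superstability conditions to the curves K(Ω) is short enough to spell out. The following worked lines use only (27.70) and (27.71), together with the analogous fixed-point condition Ω − K sin x1 = 2π for W = 1 that follows from (27.65):

```latex
% W = 0: fixed point (27.70) and superstability (27.71)
\sin x_1 = \frac{\Omega}{K}, \qquad \cos x_1 = \frac{1}{K}
\quad\Longrightarrow\quad
1 = \sin^2 x_1 + \cos^2 x_1 = \frac{\Omega^2 + 1}{K^2}
\quad\Longrightarrow\quad
K = \sqrt{1 + \Omega^2}.

% W = 1: the fixed-point condition reads \Omega - K \sin x_1 = 2\pi
\sin x_1 = \frac{\Omega - 2\pi}{K}, \qquad \cos x_1 = \frac{1}{K}
\quad\Longrightarrow\quad
K = \sqrt{1 + (2\pi - \Omega)^2}.
```

Since cos x1 = 1/K requires K ≥ 1, both curves start at K = 1 (at Ω = 0 and Ω = 2π, respectively) and lie entirely in the upper half of the diagram.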
The occurrence of chaotic solutions is associated with a qualitative change of the
mapping function f (x). As Fig. 27.14 shows, for K < 1, f (x) is a monotonically
increasing function. For K > 1, the nonlinear coupling is so strong that f (x) reflects
the shape of the sine function; i.e., there arise (quadratic) maxima and minima. Similar
to the previously treated logistic mapping, f (x) is not invertible for K > 1. This is a
necessary (but not sufficient) condition for the occurrence of chaos in one-dimensional
mappings.
Fig. 27.14. (a) The mapping function of the circular mapping for K < 1 increases monotonically. (b) For K > 1, a maximum develops, and f(x) is no longer invertible. The function f(x) = x + Ω (mod 2π) is drawn as a dashed line
The complex behavior in the mode locking expressed by the devil’s staircase is
well confirmed by experiment. Even simpler than mechanical oscillators are nonlinear
electric circuits. For example, mode locking was investigated in an externally period-
ically driven circuit involving a superconducting Josephson junction and an induction
which may be described mathematically by the circular mapping.7
7 M. Bauer, U. Krüger and W. Martienssen, Europhys. Lett. 9, 191 (1989).
Supplement: It should be noted that the logistic mapping from Example 27.1 may
be interpreted as the motion of a periodically kicked rotator. For this purpose, the angular
dependence of the torque is not represented by (27.55) but rather by the (somewhat
artificially constructed) function
$$M(x) = K_0\left[(\alpha - 1)\,x - \frac{\alpha}{2\pi}\,x^{2}\right]. \tag{27.74}$$
If we consider the dissipative Poincaré mapping (27.53) in the limit of strong damping,
βT ≫ 1, the equation for xn again decouples,
$$x_{n+1} = x_n + \frac{K_0}{\beta}\left[(\alpha - 1)\,x_n - \frac{\alpha}{2\pi}\,x_n^{2}\right] \pmod{2\pi}. \tag{27.75}$$
The choice K0 = β leads to
$$x_{n+1} = \alpha x_n\left(1 - \frac{1}{2\pi}\,x_n\right) \pmod{2\pi}, \tag{27.76}$$
which after rescaling of the angular variable to the unit interval by x′n = xn/2π yields
the logistic mapping
$$x'_{n+1} = \alpha x'_n\,(1 - x'_n) \pmod{1}. \tag{27.77}$$
For parameter values α ≤ 4, the values of x′n are automatically bound to the unit interval,
and hence the modulo operation may be omitted.
EXAMPLE
27.4 The Periodically Driven Pendulum
In the preceding examples, we have studied systems the dynamics of which could be
described by the iteration of simple, analytically known discrete mappings. The logis-
tic mapping from Example 27.1 with its extremely simple structure served as a “test
laboratory” for investigating many aspects of nonlinear dynamics but has no plausi-
ble physical analog. For the “periodically kicked” damped rotator of Example 27.3,
the dynamics could also be reduced to the iteration of discrete equations of motion (in
the limit of strong damping to the one-dimensional circular mapping (27.57)), because
of the pulse-like nature of the acting force. Another model example, possibly even
more realistic and appropriate for a clear illustration of the characteristic phenomena
of nonlinear dynamics, is the periodically driven pendulum.8
Let the pendulum, a nonlinear oscillator with a restoring force proportional to
the sine of the displacement angle θ , be under the influence of an additional external
force with a harmonic time dependence. Moreover, let the system be damped by fric-
tion which is proportional to the velocity. Mathematically, these system properties are
described by the following equation of motion:
$$\frac{d^{2}\theta}{dt^{2}} + \beta\,\frac{d\theta}{dt} + \sin\theta = f\cos(\Omega t). \tag{27.78}$$
8 See, e.g., G.L. Baker and J.P. Gollub, Chaotic Dynamics, Cambridge University Press, Cambridge
(1996). In this book, extensive use is made of the example of the driven pendulum. We also refer to
H. Heng, R. Doerner, B. Huebinger and W. Martienssen, Int. Journ. of Bif. and Chaos 4, 751, 761,
773 (1994).
Here, β is the friction parameter, and f and Ω denote the strength and frequency of
the driving force, respectively. The eigenfrequency of the pendulum has been set to the
value ω0 = 1, which may always be achieved by rescaling the time and the parameters
β and f. As was described in Chap. 23, this explicitly time-dependent differential
equation of second order may be rewritten as a system of three coupled autonomous
(i.e., not explicitly time-dependent) differential equations of first order:
$$\begin{aligned}
\frac{d\omega}{dt} &= -\beta\omega - \sin\theta + f\cos\phi,\\
\frac{d\theta}{dt} &= \omega,\\
\frac{d\phi}{dt} &= \Omega.
\end{aligned} \tag{27.79}$$
For β > 0, this is obviously a dissipative system, since the divergence of the velocity
field F (compare (23.16)) is then negative:
$$\nabla\cdot\mathbf{F} = -\beta. \tag{27.80}$$
The equations of motion of the driven pendulum are too complicated to allow analytic
solutions. Their numerical integration by the computer does not cause any trouble,
however.9
Depending on the parameters Ω, β, f, the driven pendulum displays many distinct
types of motion. Here we shall investigate only a small section of the parameter space,
namely the dependence on the driving strength f for fixed values of the frequency Ω
and of the friction constant β. As an example we choose a frequency Ω = 2/3 for all
of the subsequent investigations, i.e., a value somewhat below the natural vibration
frequency of the pendulum, and a friction parameter β = 0.5.
The system of differential equations (27.79) is integrated numerically for various
values of the parameter f , beginning with selected initial conditions θ (0) and ω(0).
To avoid needless effort, transient processes, i.e., the initial solutions of, e.g., the first
20 vibrational periods, will be ignored.
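The procedure just described (integrate (27.79), discard the transient, then record the state) may be sketched as follows. This is an illustrative implementation with a classical fourth-order Runge–Kutta step; the function names, the step size, and the choice of sampling once per driving period are assumptions made for this sketch and not the setup used for the figures.

```python
import math

def rhs(state, beta, f, Omega):
    """Right-hand side of the autonomous system (27.79); state = (omega, theta, phi)."""
    omega, theta, phi = state
    return (-beta * omega - math.sin(theta) + f * math.cos(phi), omega, Omega)

def rk4_step(state, dt, beta, f, Omega):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = rhs(state, beta, f, Omega)
    k2 = rhs(shifted(state, k1, 0.5 * dt), beta, f, Omega)
    k3 = rhs(shifted(state, k2, 0.5 * dt), beta, f, Omega)
    k4 = rhs(shifted(state, k3, dt), beta, f, Omega)
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def stroboscopic_samples(f, beta=0.5, Omega=2.0 / 3.0, theta0=0.0, omega0=0.0,
                         transient=20, periods=50, steps_per_period=400):
    """Return (theta, omega) once per driving period after `transient` periods."""
    dt = 2.0 * math.pi / Omega / steps_per_period
    state = (omega0, theta0, 0.0)
    samples = []
    for period in range(transient + periods):
        for _ in range(steps_per_period):
            state = rk4_step(state, dt, beta, f, Omega)
        if period >= transient:
            theta = (state[1] + math.pi) % (2.0 * math.pi) - math.pi  # wrap to [-pi, pi)
            samples.append((theta, state[0]))
    return samples

if __name__ == "__main__":
    for f in (0.9, 1.2):
        print(f, stroboscopic_samples(f, periods=5))
```

For a periodic attractor the sampled points repeat after a few entries, whereas for a chaotic one they keep wandering; collecting such samples over a range of f gives the raw material for an attractor diagram.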
The effect of dissipation in the system ensures that the solution after some finite
time approaches an attractor. The shape of this attractor may be analyzed in different
manners. One may directly consider the time dependence of the displacement angle
θ (t) or plot the trajectory in the three-dimensional phase space θ, ω, φ, where the
third coordinate because of φ = Ωt just corresponds to the time. More transparent
than this three-dimensional representation are reduced two-dimensional phase-space
diagrams where the time is considered as a parameter and the trajectory is plotted in
the θ, ω-plane. Contrary to the full three-dimensional phase space, the projected orbits
may intersect here. Since θ is a periodic angular variable, we restrict it to the interval
−π < θ ≤ π by the modulo operation. A trajectory that leaves the diagram at the right
or left edge, corresponding to a loop of the pendulum, therefore enters again at the
opposite edge.
Figure 27.15 shows a gallery of selected phase-space diagrams arranged by increas-
ing value of the driving force f . The value of f is always given at the top left in the
partial figures.
9 Readers are advised to explore the dynamics of the driven pendulum by their own computer ex-
periments. For the integration of the differential equation a Runge–Kutta approach is recommended;
see, e.g., W.H. Press et al., Numerical Recipes, Cambridge University Press (1989).
Fig. 27.15. Typical phase-space trajectories of the periodically driven damped pendulum for various values of the parameter f. The parameters Ω = 2/3 and β = 0.5 are kept fixed
For weak perturbations, e.g., f = 0.9, the pendulum performs approximately harmonic
librations about the zero position. The limit cycle θ(t) is a slightly distorted sinusoidal
vibration with the frequency Ω, and correspondingly the path in phase space
is approximately an ellipse.
With increasing perturbation strength a bifurcation arises at about f = 1.07 with
a period doubling, as is represented in Fig. 27.15 for f = 1.075: Two slightly differ-
ent vibrational tracks are alternating. After further period doublings, one then finds a
libration with twice the amplitude (represented for f = 1.12) in which the pendulum
performs a loop but then moves back. The frequency of this oscillation is Ω/3.
In the range f ≈ 1.15 . . . 1.3, there arise chaotic solutions. The trajectory for
f = 1.2 in Fig. 27.15 fluctuates in an erratic manner between librations and rotations
in both directions, and correspondingly the path densely covers a domain in phase
space. For comparison, Fig. 27.16 shows a regular (f = 1.12) and a chaotic (f = 1.2)
trajectory, which correspond to the third and fourth partial figure in Fig. 27.15.
For even stronger coupling f, the chaotic range is left again, and there occur rotating
periodic solutions, as is represented for f = 1.4. The angle increases linearly with
time, according to θ(t) ∝ ±Ωt, superimposed by local fluctuations. For f = 1.45, a
bifurcation with period doubling of the rotating solution occurs. On the average, the
angle is unchanged, but now the local deviations alternate from period to period. At
f = 1.47, one obtains a second period doubling.
Fig. 27.16. Two trajectories

θ(t) for distinct values of the
parameter f . For f = 1.12
(left) a periodic motion arises,
for f = 1.2 (right) a chaotic
one. Note the distinct scales!
After a bifurcation cascade, there again occurs a range with chaotic solutions, as
is represented for the example f = 1.5. But soon the chaos is followed by regular
motion, as is demonstrated by the beautiful phase-space trajectory for f = 1.51 in
the last shown example. Here, one faces a periodic libration motion with two loops
(angular range 3 · 2π) and the period 5 (i.e., the frequency Ω/5).
A global survey of the behavior of the system is obtained from the attractor di-
agram. One of the coordinates is scanned in regular time intervals, and the result is
plotted along the ordinate as a function of a system parameter. Figure 27.17 shows the
angular velocity ω(tn) scanned at the time points tn = t0 + n·2π/Ω versus the strength
f of the driving force. For any value on the abscissa, 150 values of ω are plotted.
The upper margin marks the nine values of f for which the associated phase-space
trajectories are shown in Fig. 27.15.
Fig. 27.17. The attractor diagram of the driven pendulum exhibits the sequence of regular and chaotic domains as a function of the parameter f. The phase-space trajectories corresponding to the values of f marked at the upper margin are shown in Fig. 27.15
The attractor diagram clearly exhibits the previously discussed alternating ranges
of regular and chaotic motion. The chaotic window at f = 1.15 . . . 1.28 is followed by
a broad domain of periodic (rotating) solutions at f = 1.28 . . . 1.48, showing several
pronounced period-doubling bifurcations (a “subharmonic cascade”). In the ranges
f = 1.11 . . . 1.15 and f > 1.54, one finds solutions of period 3. The structural sim-
ilarity of the attractor diagram for the driven pendulum with its counterpart for the
logistic mapping, Fig. 27.6, is obvious.
One should note that the attractor diagram represented in Fig. 27.17 is not complete.
This is due to the fact that our system of differential equations (27.79) is invariant
under reflection because the pendulum has no preferred direction of oscillation. Each
solution is accompanied by a reflected trajectory
θ → −θ (mod 2π), ω → −ω, φ → φ + π (mod 2π), (27.81)
which also satisfies the equation of motion. Angle and velocity are inverted, and the
phase is shifted by a half-period. In general, the solutions always occur pairwise,
which is obvious for rotational solutions because the pendulum may run “clockwise”
or “anti-clockwise.” Which attractor is actually reached depends in a complicated
manner on the initial conditions θ (0), ω(0). When plotting Fig. 27.17, only one path
has been calculated, hence the “reflected” branches are missing. (The points are not re-
ally reflected about θ = 0, since in the stroboscopic scanning the phase of the scanning
moment is kept fixed and not shifted according to (27.81).)
However, for periodic trajectories of period length n·2π/Ω it may happen that
θ(t + nπ/Ω) = −θ(t) (mod 2π) holds. Such a symmetric trajectory is identical with its
reflected partner, and there exists only one solution. This occurs, e.g., for the (n = 3)-
vibration shown in Fig. 27.16 (left), and in particular also for the (n = 1)-librations for
small f. At about f = 1.01, the interesting case of a symmetry-breaking bifurcation
occurs, which may be recognized by the kink in the attractor diagram Fig. 27.17. The
attractor continues to be periodic with n = 1, but loses its symmetry and therefore
splits into a pair of distinct solutions which are reflected relative to each other. In the
figure only the upper branch of this fork is included.
The attractor diagram provides a good survey of the various kinds of motion
of a dynamic system. A more far-reaching quantitative measure of the stability
of trajectories are the Lyapunov exponents σi discussed in Chap. 26. The driven
pendulum—a three-dimensional system—has three Lyapunov exponents. One of
these, let us call it σ3, always has the value σ3 = 0. It belongs to the degree of freedom
φ, which according to (27.79) has the trivial linear time dependence φ(t) = Ωt.
Thus, any perturbations along this direction neither increase nor shrink exponen-
tially.
The maximum Lyapunov exponent σ1 determines the stability of the system. At-
tractors with σ1 < 0 are periodic, those with σ1 > 0 are chaotic. Figure 27.18 shows
the result of a numerical calculation of the maximum Lyapunov exponent, plotted over
the same parameter range as in the attractor diagram Fig. 27.17. One may clearly trace
the sequence of regular and chaotic domains. At the bifurcation points the exponent
σ1 touches the zero line from below.
A conspicuous feature is that σ1 never falls below the value −0.25. This is re-
lated to the fact that in the dissipative system under consideration the sum of all three
Lyapunov exponents is determined by the negative of the friction coefficient β (here,
β = 0.5):
$$\sum_{i=1}^{3}\sigma_i = -\beta. \tag{27.82}$$
Because σ3 = 0, in the case of maximum stability one has σ1 = σ2 = −β/2.

Fig. 27.18. The largest Lya-

punov exponent σ1 of the
driven pendulum as a function
of the parameter f . Values
σ1 > 0 occur in the domains
of chaotic motion which were
perceptible already in the at-
tractor diagram Fig. 27.17.
Compare also the analogous
representation of the Lya-
punov exponent of the logistic
mapping in Fig. 27.8
The Poincaré cut introduced in Chap. 24 may further characterize the attractors.
When choosing φ = φ0 (mod 2π) as the cut condition, one just has a stroboscopic
mapping at equidistant time points tn = t0 + n·2π/Ω. The three-dimensional phase
space reduces to two dimensions (θ, ω), and the continuous trajectory turns into a
cloud of points. The Poincaré cuts of periodic attractors simply consist of one or sev-
eral fixed points, the number of which corresponds to the period length of the vibra-
tion. The situation is different, however, for the nonperiodic strange attractors which
are characteristic for the occurrence of chaotic motion. Here, the cloud of points of
the Poincaré cut covers extended partial domains of phase space more or less uni-
formly.
Figure 27.19(a) illustrates the situation for the chaotic attractor of the pendulum
with a driving strength f = 1.2. The points were obtained by stroboscopic scanning
of the trajectory shown in the fourth partial figure of Fig. 27.15 over 2000 vibrational
periods. The detailed shape of the Poincaré cut depends on the selected phase an-
gle φ0 .
The long curved object in Fig. 27.19 at first glance appears as a strange bent one-
dimensional curve. A closer look, however, shows the chaotic attractor to be a much
more complex geometric object. In the sector magnification of a small partial range
of the attractor (see Fig. 27.19(b)), the seemingly single line dissolves into several
closely spaced curves. But this is only the beginning, since a repetition of this opera-
tion would show that in each sector magnification the new lines again decay into sev-
eral fractions, and the procedure may be repeated infinitely many times. (This process
is limited in practice only by problems in the numerical integration of the equation of
motion. By the way, in order to plot Fig. 27.19(b), 100,000 periods had to be calcu-
lated.)
Thus, the attractor of the chaotic driven pendulum displays an infinite filigree
(“puff-pastry”) structure. Mathematically it is a fractal with broken dimension (see
Chap. 26) since the points of the Poincaré cut occupy, roughly speaking, a larger vol-
ume in phase space than a one-dimensional curve, but on the other hand they are too
rare to cover a two-dimensional area. An analogous statement holds also for the full
attractor in the three-dimensional phase space.
Fig. 27.19. The Poincaré cut

of a chaotic trajectory (f =
1.2) of the driven pendulum.
The range in the upper fig-
ure marked by a box is repre-
sented below as a sector mag-
nification, revealing the frac-
tal structure of the strange at-
tractor
In Chap. 26, we dealt with the determination of the fractal dimension. A remarkable
point is that the Lyapunov exponents σi may also be used to determine the dimension
of a strange attractor. The existence of such a relation is not implausible, because the σi
decide how “fast” a region of phase space spreads under the dynamic flow. Building
on this consideration, Kaplan and Yorke10 derived a formula for calculating a Lya-
punov dimension DL . For our special case (one positive and one negative Lyapunov
exponent), the Kaplan–Yorke relation reads
$$D_L = 1 + \frac{\sigma_1}{|\sigma_2|} \quad\text{for } \sigma_1 > 0,\ \sigma_2 < 0. \tag{27.83}$$
The relation between DL and the other dimension measures has not yet been cleared
up in full. The originally assumed identification of Lyapunov dimension and capacity
dimension Df cannot be maintained, since counter-examples have been found. More
recent speculations rather concern a possible relation with the information dimension
DL = DI .
10 J.L. Kaplan and J.A. Yorke, Functional differential equations and approximation of fixed points,
H.-O. Peitgen and H.O. Walter (eds.), Lecture Notes in Mathematics 730, Springer, Berlin (1979).
For the Poincaré cut of Fig. 27.19, one gets from (27.83) with σ1 ≈ 0.14, σ2 ≈ −0.64,
the Lyapunov dimension DL ≈ 1.2. This value depends sensitively on the friction
constant β. For a weaker damping, the strange attractor “blows up,” and its dimension
increases.
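The arithmetic behind the quoted value can be reproduced in a few lines. The sketch below takes σ1 ≈ 0.14 from the text, obtains σ2 from the sum rule (27.82) with σ3 = 0 and β = 0.5, and evaluates the Kaplan–Yorke relation (27.83); the function name is an illustrative choice.

```python
def lyapunov_dimension(sigma1, sigma2):
    """Kaplan-Yorke relation (27.83) for one positive and one negative exponent."""
    if not (sigma1 > 0.0 and sigma2 < 0.0):
        raise ValueError("(27.83) requires sigma1 > 0 and sigma2 < 0")
    return 1.0 + sigma1 / abs(sigma2)

if __name__ == "__main__":
    beta = 0.5
    sigma1 = 0.14                    # value quoted in the text for f = 1.2
    sigma2 = -beta - sigma1          # sum rule (27.82) with sigma3 = 0
    print(sigma2, lyapunov_dimension(sigma1, sigma2))
```

The result, about 1.22, is consistent with the rounded DL ≈ 1.2; a smaller β makes |σ2| smaller for the same σ1 and therefore raises DL.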
EXAMPLE
27.5 Chaos in Celestial Mechanics: The Staggering of Hyperion
Hyperion is one of the more remote moons of Saturn. It revolves about Saturn with
a revolution period of 21 days, on an ellipse with eccentricity ε = 0.1 and a
semimajor axis a = 1.5 · 10^6 km.
The motion of Hyperion is a particularly impressive example of chaotic stagger-
ing within our solar system. In this section we shall describe this behavior using a
simplified model. The satellite Voyager 2 among others has supplied pictures of the
moon Hyperion. Hyperion is an asymmetric top that may be roughly described by a
three-axial ellipsoid with the dimensions
190 km × 145 km × 114 km (±15 km). (27.84)
Hence, one obtains for the principal moments of inertia Θ1 < Θ2 < Θ3:
$$\frac{\Theta_2 - \Theta_1}{\Theta_3} \approx 0.3. \tag{27.85}$$
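As a plausibility check of (27.85), one may estimate the ratio from the dimensions (27.84). The sketch below assumes a homogeneous ellipsoid whose semi-axes are half the quoted dimensions; this mass distribution is an assumption made here for illustration and is not stated in the text.

```python
def principal_moments(a, b, c, mass=1.0):
    """Principal moments of a homogeneous ellipsoid with semi-axes a >= b >= c."""
    theta1 = mass * (b * b + c * c) / 5.0   # about the longest axis
    theta2 = mass * (a * a + c * c) / 5.0   # about the intermediate axis
    theta3 = mass * (a * a + b * b) / 5.0   # about the shortest axis
    return theta1, theta2, theta3

def asymmetry(a, b, c):
    """The ratio (Theta2 - Theta1) / Theta3 of (27.85)."""
    theta1, theta2, theta3 = principal_moments(a, b, c)
    return (theta2 - theta1) / theta3

if __name__ == "__main__":
    # Semi-axes in km: half of 190 km x 145 km x 114 km from (27.84).
    print(round(asymmetry(95.0, 72.5, 57.0), 2))
```

Under this assumption the ratio reduces to (a² − b²)/(a² + b²), roughly 0.26 for the central values, which is compatible with the quoted ≈ 0.3 given the ±15 km uncertainty of the dimensions.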
The striking prediction is that Hyperion performs a chaotic staggering motion in the
sense that its rotational velocity and the orientation of its rotational axis vary signif-
icantly within a few revolution periods. This chaotic dancing, which must have hap-
pened also for other planetary satellites during their history (e.g., Phobos and Deimos
with the planet Mars have been calculated), is implied by the asymmetry of Hyperion
and by the eccentricity of the orbit.
To describe the change of the rotational velocity, we adopt the following model
(see Fig. 27.20): Hyperion H orbits Saturn S on a fixed ellipse with semimajor axis a and
eccentricity ε. r represents the distance between Saturn and Hyperion, ϕ the polar
angle of motion. Thus, the trajectory of Hyperion is given by
$$r(\varphi) = \frac{k}{1 + \varepsilon\cos\varphi}. \tag{27.86}$$
Fig. 27.20. A simple two-dumbbell model for the asymmetric Saturn moon Hyperion
Its asymmetric shape is simulated by four mass points 1 to 4 with equal mass m which
are arranged in the orbital plane. Let the line 2–1 (distance d) be the (body-fixed)
e1 -axis, the line 4–3 (distance e < d) the (body-fixed) e2 -axis. The e3 -axis points
perpendicular out of the image plane: e3 = e1 × e2 . The angle ϑ specifies the rotation
of Hyperion about the e3-axis. It is defined as the angle between the semimajor axis a and
the e1-axis. The moments of inertia obey
$$\Theta_1 = \tfrac{1}{2}\,m e^{2} \;<\; \tfrac{1}{2}\,m d^{2} = \Theta_2 \;<\; \tfrac{1}{2}\,m\,(d^{2} + e^{2}) = \Theta_3. \tag{27.87}$$
In this model, the satellite shall rotate only about the e3 -axis, i.e., the axis perpendicu-
lar to the orbital plane with the largest moment of inertia. This restriction is motivated
because the tidal friction over very long times causes (1) the rotational axis of a moon
to align along the direction of the largest moment of inertia, and (2) causes this direc-
tion to adjust perpendicular to the orbital plane. Moreover, the orbital angular momen-
tum of Hyperion is assumed to be constant. This is a very good approximation, since
the intrinsic angular momentum LE of Hyperion is always very small relative to the
orbital angular momentum LB, |LE|/|LB| ≈ (d² + e²)/a² ≈ 10^−8. The gravitational
field at the position of Hyperion is not homogeneous, and since Θ1 and Θ2 are distinct,
the satellite experiences a torque that depends on its orbital point and its orientation
ϑ , which will be calculated now. The tidal friction shall be neglected, however. The
torque acting on the pair of masses (1, 2) is
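$$\mathbf{D}^{(1,2)} = \frac{d\,\mathbf{e}_1}{2}\times(\mathbf{F}_1 - \mathbf{F}_2), \tag{27.88}$$
where
$$\mathbf{F}_i = -\frac{\gamma m M\,\mathbf{r}_i}{r_i^{3}} \tag{27.89}$$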
is the force acting on the mass point i, M being the Saturn mass. Figure 27.21 once
more illustrates the torque that is caused by the gravitational forces F1 and F2 at the
positions r1 and r2 , and by the centrifugal force F = −(F1 + F2 ) at the position r.
Fig. 27.21. The forces caus-
ing the torque D(1,2)
Since the length d ≈ 200 km is small compared to the distance r ≈ 10^6 km, the
cosine law yields
$$r_i = r\sqrt{1 \pm \frac{d}{r}\cos\alpha + \left(\frac{d}{2r}\right)^{2}} \approx r\sqrt{1 \pm \frac{d}{r}\cos\alpha}. \tag{27.90}$$
The positive sign holds for r1, the minus sign for r2. α is the angle between r1 − r2
and r. From (27.90), we obtain
$$\frac{1}{r_i^{3}} \approx \frac{1}{r^{3}}\left(1 \mp \frac{3d}{2r}\cos\alpha\right). \tag{27.91}$$
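Hence, for D(1,2) we find
$$\begin{aligned}
\mathbf{D}^{(1,2)} &= \frac{d\,\mathbf{e}_1}{2}\times(\mathbf{F}_1 - \mathbf{F}_2)\\
&= \frac{d\,\mathbf{e}_1}{2}\times\left(\frac{-\gamma m M}{r^{3}}\right)\left[\left(1 - \frac{3d}{2r}\cos\alpha\right)\left(\mathbf{r} + \frac{d}{2}\,\mathbf{e}_1\right) - \left(1 + \frac{3d}{2r}\cos\alpha\right)\left(\mathbf{r} - \frac{d}{2}\,\mathbf{e}_1\right)\right]\\
&= \frac{3\gamma m M d^{2}\cos\alpha}{2r^{4}}\;\mathbf{e}_1\times\mathbf{r}\\
&= \frac{3\gamma m M d^{2}}{2r^{3}}\,\sin\alpha\cos\alpha\;\mathbf{e}_3\\
&= \frac{3\gamma m M d^{2}}{4r^{3}}\,\sin 2\alpha\;\mathbf{e}_3\\
&= \frac{3\gamma M\,\Theta_2}{2r^{3}}\,\sin 2\alpha\;\mathbf{e}_3\\
&= \frac{6\pi^{2}a^{3}\,\Theta_2}{r^{3}T^{2}}\,\sin 2\alpha\;\mathbf{e}_3.
\end{aligned} \tag{27.92}$$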
In the last step of rewriting, γM has been expressed by the orbital period T and the
semimajor axis a, using the third Kepler law:
$$\gamma M = \left(\frac{2\pi}{T}\right)^{2} a^{3}. \tag{27.93}$$
Thereby, the reduced mass was replaced by the mass of Saturn.
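The passage from the second to the third line of (27.92) compresses several small steps. A sketch of the intermediate algebra, using r1 = r + (d/2) e1, r2 = r − (d/2) e1 and the expansion (27.91), and taking e1 × r = r sin α e3 as in (27.92), reads:

```latex
\begin{aligned}
\frac{\mathbf{r}_1}{r_1^{3}} - \frac{\mathbf{r}_2}{r_2^{3}}
 &\approx \frac{1}{r^{3}}\left[\left(1 - \frac{3d}{2r}\cos\alpha\right)\left(\mathbf{r} + \frac{d}{2}\mathbf{e}_1\right)
   - \left(1 + \frac{3d}{2r}\cos\alpha\right)\left(\mathbf{r} - \frac{d}{2}\mathbf{e}_1\right)\right] \\
 &= \frac{1}{r^{3}}\left[d\,\mathbf{e}_1 - \frac{3d}{r}\cos\alpha\;\mathbf{r}\right], \\[4pt]
\mathbf{D}^{(1,2)}
 &= \frac{d\,\mathbf{e}_1}{2}\times\left(-\gamma m M\right)\frac{1}{r^{3}}\left[d\,\mathbf{e}_1 - \frac{3d}{r}\cos\alpha\;\mathbf{r}\right]
  = \frac{3\gamma m M d^{2}\cos\alpha}{2r^{4}}\;\mathbf{e}_1\times\mathbf{r}, \\[4pt]
\gamma M &= \frac{4\pi^{2}a^{3}}{T^{2}}
 \quad\Longrightarrow\quad
 \frac{3\gamma M\,\Theta_2}{2r^{3}} = \frac{6\pi^{2}a^{3}\,\Theta_2}{r^{3}T^{2}} .
\end{aligned}
```

The term proportional to e1 drops out because e1 × e1 = 0, and the two contributions of order d²/r multiplying e1 cancel each other, so that only the part along r survives in the cross product.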
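The torque for the pair of masses (3, 4) is obtained in the analogous way as
$$\begin{aligned}
\mathbf{D}^{(3,4)} &= \frac{e\,\mathbf{e}_2}{2}\times(\mathbf{F}_3 - \mathbf{F}_4)\\
&= \frac{e\,\mathbf{e}_2}{2}\times\mathbf{r}\;\frac{\gamma m M}{r^{3}}\left[\left(1 + \frac{3e}{2r}\sin\alpha\right) - \left(1 - \frac{3e}{2r}\sin\alpha\right)\right]\\
&= \frac{-3\gamma m M e^{2}}{2r^{3}}\,\sin\alpha\cos\alpha\;\mathbf{e}_3\\
&= \frac{-3\gamma M\,\Theta_1}{2r^{3}}\,\sin 2\alpha\;\mathbf{e}_3\\
&= \frac{-6\pi^{2}a^{3}\,\Theta_1\sin 2\alpha}{r^{3}T^{2}}\;\mathbf{e}_3.
\end{aligned} \tag{27.94}$$
The total torque D = D(1,2) + D(3,4) is therefore
$$\mathbf{D} = \frac{3}{2}\left(\frac{2\pi}{T}\right)^{2}\left(\frac{a}{r}\right)^{3}(\Theta_2 - \Theta_1)\,\sin 2\alpha\;\mathbf{e}_3. \tag{27.95}$$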
Thus, the torque vanishes if Θ1 = Θ2. Besides, a configuration with α experiences
the same torque as a configuration with 180° + α. The torque tries to rotate Hyperion
in such a way that at any moment the e1-axis points toward Saturn. The expression
(27.95) for the torque remains correct even if a more realistic mass distribution is
assumed. With
$$D = \frac{dL}{dt} = \Theta_3\,\frac{d^{2}\vartheta}{dt^{2}}, \tag{27.96}$$
the equation of motion for the eigenrotation of the satellite reads
$$\Theta_3\,\ddot{\vartheta} = -\frac{3}{2}\left(\frac{2\pi}{T}\right)^{2}\left(\frac{a}{r(t)}\right)^{3}(\Theta_2 - \Theta_1)\,\sin 2\bigl(\vartheta - \varphi(t)\bigr). \tag{27.97}$$
Here, we set α = ϕ − ϑ . Equation (27.97) involves only one degree of freedom, ϑ , but
the right side depends via the orbital radius r(t) and the polar angle ϕ(t) on the time
and is therefore not integrable. An exception is the case of a circular orbit. Then the
mean angular frequency
$$n = \frac{2\pi}{T} \tag{27.98}$$
equals the angular velocity ω, and r = a, ϕ(t) = nt. With ϑ′ = 2(ϑ − nt), the differential
equation (27.97) therefore simplifies for ε = 0 to
$$\Theta_3\,\ddot{\vartheta}' = -3n^{2}(\Theta_2 - \Theta_1)\,\sin\vartheta'. \tag{27.99}$$
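The substitution leading to the pendulum form can be written out in two lines. Here ϑ′ = 2(ϑ − nt) denotes the new angle variable, Θ1, Θ2, Θ3 the principal moments of inertia, and r = a, ϕ = nt is inserted into (27.97):

```latex
\begin{aligned}
\Theta_3\,\ddot{\vartheta}
  &= -\tfrac{3}{2}\,n^{2}\,(\Theta_2 - \Theta_1)\,\sin 2(\vartheta - nt)
  && \text{(27.97) with } r = a,\ \varphi = nt,\ n = 2\pi/T, \\
\vartheta' = 2(\vartheta - nt)
  &\;\Longrightarrow\; \ddot{\vartheta}' = 2\,\ddot{\vartheta}
  && \text{(the term } -2nt \text{ is linear in } t\text{)}, \\
\Theta_3\,\ddot{\vartheta}'
  &= -3\,n^{2}\,(\Theta_2 - \Theta_1)\,\sin\vartheta'.
\end{aligned}
```

The factor 2 from the doubled angle is what turns the prefactor 3/2 of (27.97) into the 3 of the pendulum equation.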
This is the differential equation for the pendulum. The integral of motion is the energy
E. To determine this quantity, we multiply (27.99) by ϑ̇′:
$$\Theta_3\,\dot{\vartheta}'\ddot{\vartheta}' + 3n^{2}(\Theta_2 - \Theta_1)\,\dot{\vartheta}'\sin\vartheta' = 0, \tag{27.100}$$
which implies
$$\frac{d}{dt}\left[\frac{1}{2}\,\Theta_3\,\dot{\vartheta}'^{\,2} - 3n^{2}(\Theta_2 - \Theta_1)\cos\vartheta'\right] = 0. \tag{27.101}$$
Hence,
$$E = \frac{1}{2}\,\Theta_3\left(\frac{d\vartheta'}{dt}\right)^{2} - 3n^{2}(\Theta_2 - \Theta_1)\cos\vartheta' \tag{27.102}$$
is an integral of the motion. Just as for the pendulum (see, e.g., Example 18.8) we
have that for energies E larger than E0 = 3n²(Θ2 − Θ1) the satellite rotates, while
for E < E0 it vibrates. Due to the (so far neglected) tidal friction the energy E will
decrease more and more until it reaches the minimum value Emin = −E0. From this
follows ϑ̇′min = 0 and ϑ′min = 0. Therefore, in the final state of satellites on a circular
orbit the e1-axis always points toward the planet (bound rotation), as is known from
the earth’s moon.
As was stated already, the differential equation (27.97) for ε ≠ 0 cannot be solved
analytically. One may try, however, to get approximate solutions for ε ≪ 1. This we
shall do now. First, we introduce dimensionless quantities, the time t′ = nt = 2πt/T
and ω0² = 3(Θ2 − Θ1)/Θ3. Then (27.97) turns into
$$\frac{d^{2}\vartheta}{dt'^{2}} = -\frac{\omega_0^{2}\,a^{3}}{2\,r^{3}(t')}\,\sin 2\bigl(\vartheta - \varphi(t')\bigr). \tag{27.103}$$
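Equation (27.103) is easy to explore numerically once r(t′) and ϕ(t′) are available. The following sketch obtains them from Kepler's equation with t′ as mean anomaly measured from periapsis, and integrates with a classical Runge–Kutta step; the function names, the step size, and the value ω0² = 0.9 (roughly 3 × 0.3 from (27.85)) are illustrative assumptions.

```python
import math

def kepler_orbit(t, ecc):
    """Return (a/r, phi) at dimensionless time t (mean anomaly from periapsis)."""
    E = t                                   # eccentric anomaly, Newton iteration
    for _ in range(50):
        E -= (E - ecc * math.sin(E) - t) / (1.0 - ecc * math.cos(E))
    a_over_r = 1.0 / (1.0 - ecc * math.cos(E))
    phi = 2.0 * math.atan2(math.sqrt(1.0 + ecc) * math.sin(0.5 * E),
                           math.sqrt(1.0 - ecc) * math.cos(0.5 * E))
    return a_over_r, phi                    # phi is defined up to multiples of 4*pi

def accel(t, theta, omega0_sq, ecc):
    """Right-hand side of (27.103)."""
    a_over_r, phi = kepler_orbit(t, ecc)
    return -0.5 * omega0_sq * a_over_r ** 3 * math.sin(2.0 * (theta - phi))

def integrate(theta, rate, omega0_sq, ecc, t_end, dt=0.01):
    """Classical RK4 for theta'' = accel(t, theta); returns (theta, theta')."""
    t = 0.0
    for _ in range(int(round(t_end / dt))):
        k1x, k1v = rate, accel(t, theta, omega0_sq, ecc)
        k2x = rate + 0.5 * dt * k1v
        k2v = accel(t + 0.5 * dt, theta + 0.5 * dt * k1x, omega0_sq, ecc)
        k3x = rate + 0.5 * dt * k2v
        k3v = accel(t + 0.5 * dt, theta + 0.5 * dt * k2x, omega0_sq, ecc)
        k4x = rate + dt * k3v
        k4v = accel(t + dt, theta + dt * k3x, omega0_sq, ecc)
        theta += dt / 6.0 * (k1x + 2.0 * k2x + 2.0 * k3x + k4x)
        rate += dt / 6.0 * (k1v + 2.0 * k2v + 2.0 * k3v + k4v)
        t += dt
    return theta, rate

if __name__ == "__main__":
    # Start in synchronous rotation at periapsis and follow ten orbital periods.
    print(integrate(0.0, 1.0, 0.9, 0.1, 20.0 * math.pi))
```

For ε = 0 the synchronous state ϑ = t′ is an exact solution, which gives a simple consistency check; for ε = 0.1 and starting values away from a resonance, the rotation rate sampled once per orbit can be collected to look for irregular behavior.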
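Since r(t′) and ϕ(t′) are periodic in 2π, the right side can be expanded into a Fourier-
like Poisson series. One obtains
$$\frac{d^{2}\vartheta}{dt'^{2}} = -\frac{\omega_0^{2}}{2}\sum_{m=-\infty}^{\infty} H\!\left(\frac{m}{2}, \varepsilon\right)\sin(2\vartheta - m t'). \tag{27.104}$$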
In order to determine the coefficients H(m/2, ε), r(t′) and ϕ(t′) must be known.
ϕ(t) is obtained, e.g., via the second Kepler law as the solution of the following
differential equation:
$$\frac{d\varphi}{dt} = \frac{(1 + \varepsilon\cos\varphi)^{2}}{h k^{2}}. \tag{27.105}$$
Solving this differential equation and hence the determination of H(m/2, ε) are beyond
the scope of this example. We only quote that the coefficients H are proportional
to ε^(2|m/2−1|) and were tabulated by Cayley¹¹ in 1859. For small ε, we have
H(m/2, ε) ≈ −ε/2, 1, 7ε/2 for p = m/2 = 1/2, 1, 3/2. Here the half-integer variable
p has been introduced. If the argument of one of the sine functions varies only weakly
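with time, i.e., if
$$\left|\frac{d}{dt'}\bigl(2\vartheta - m t'\bigr)\right| = \left|2\,\frac{d\vartheta}{dt'} - m\right| \ll 1, \tag{27.106}$$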
resonances occur. It is then advantageous to rewrite the equation of motion (27.104)
in such a way that it depends on the slowly varying variable
γp = ϑ − pt′:

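$$\frac{d^2\gamma_p}{dt'^2} = -\frac{\omega_0^2}{2}\,H(p,\varepsilon)\sin 2\gamma_p - \frac{\omega_0^2}{2}\sum_{n\neq 0} H\!\left(p+\frac{n}{2},\varepsilon\right)\sin(2\gamma_p - n t'). \tag{27.107}$$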
It turns out that the terms in the sum oscillate so rapidly compared to the variation
of γp that their total contribution to the equation of motion largely averages out, if
ω0² and ε are sufficiently small. In first approximation for small ω0² and ε, the
high-frequency terms can be eliminated from the equation of motion by keeping γp fixed
and averaging (27.107) over a period. One then obtains
$$\frac{d^2\gamma_p}{dt'^2} = -\frac{\omega_0^2}{2}\,H(p,\varepsilon)\sin 2\gamma_p. \tag{27.108}$$
This is again the pendulum equation. Integration of (27.108) analogous to (27.100)–
(27.102) again yields the energy

$$E_p = \frac{1}{2}\left(\frac{d\gamma_p}{dt'}\right)^2 - \frac{\omega_0^2}{4}\,H(p,\varepsilon)\cos 2\gamma_p. \tag{27.109}$$
11 The original paper is by A. Cayley, Tables of the developments of functions in the theory of elliptic
motion, Mem. Roy. Astron. Soc. 29, 191 (1859).
Again, γp vibrates if Ep is smaller than Eps = |H(p, ε)|ω0²/4, and rotates if Ep is
larger. For H(p, ε) > 0, γp vibrates about 0; for H(p, ε) < 0, γp vibrates about π/2.
The essential difference compared with the pendulum equation with ε = 0 is that here
there exists not only the synchronous (p = 1) solution but also further resonances,
depending on the initial condition. So it may happen, e.g., that a satellite is captured
by the tidal friction into a p = 3/2 state. This happens in the solar system for Mercury:
during two revolutions about the Sun it rotates exactly three times about its axis.
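The quantities of the averaged pendulum equation are easy to tabulate. The sketch below is a hedged illustration that uses only the small-ε coefficients quoted above for p = 1/2, 1, 3/2. It returns the limiting energy Eps, the angle about which γp vibrates, and the largest |dγp/dt′| on the curve Ep = Eps, which follows from (27.109) as ω0·√|H(p, ε)|. The function names and the lookup table are assumptions made for this illustration.

```python
import math

def H_small_eps(p, eps):
    """Leading-order coefficients quoted in the text for p = 1/2, 1, 3/2."""
    table = {0.5: -eps / 2.0, 1.0: 1.0, 1.5: 7.0 * eps / 2.0}
    return table[p]

def separatrix_energy(p, eps, omega0_sq):
    """Limiting energy E_p^s = |H(p, eps)| * omega0^2 / 4."""
    return abs(H_small_eps(p, eps)) * omega0_sq / 4.0

def libration_center(p, eps):
    """gamma_p vibrates about 0 for H > 0 and about pi/2 for H < 0."""
    return 0.0 if H_small_eps(p, eps) > 0 else math.pi / 2.0

def half_width(p, eps, omega0_sq):
    """Largest |d gamma_p / dt'| on E_p = E_p^s, from (27.109):
    (1/2) gdot^2 = E_p^s + (omega0^2 / 4) |H|  =>  gdot = omega0 * sqrt(|H|)."""
    return math.sqrt(omega0_sq * abs(H_small_eps(p, eps)))
```

For example, `half_width(1.0, 0.1, 0.04)` gives the extent in dϑ/dt′ of the synchronous resonance for a hypothetical ω0² = 0.04, and `libration_center(0.5, 0.1)` returns π/2 because H(1/2, ε) is negative.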
The question of to what extent the averaging over the high-frequency contributions
is justified, and whether these are indeed only weak perturbations, is complicated and
shall not be pursued further here. But it is intuitively clear that the high-frequency
terms must not be neglected if the energy is close to the limit Ep = Eps. Then these
terms will actually decide whether the satellite performs a full turn (in γp) or whether
it vibrates back. There exists a band of energies of width wp · Eps around Eps, with
wp defined by wp = (Ep − Eps)/Eps. For energies within this band the high-frequency
perturbations cannot be neglected. We shall only state without derivation that Chirikov
found an analytical criterion for the width wp of this band.12 Chirikov’s criterion
predicts that for the parameters of Hyperion, ω0² = 0.89 and ε = 0.1, the averaging over
the high-frequency components is not possible, since the bands belonging
to p = 1 and p = 3/2 are so wide that they overlap. In contrast, for ω0² = 0.2
and ε = 0.1 the averaging over the high-frequency components should be a good
approximation. The following figures represent Poincaré cuts for these two cases. They
show points in phase space taken at ϕ = 0.
For this purpose, the differential equations were solved numerically.13 If the motion is
quasiperiodic, then successive points form a smooth curve; chaotic trajectories seem
to cover areas in a random manner.
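A crude feeling for when neighboring resonances start to touch can be obtained by comparing their extent in dϑ/dt′ with their spacing. The sketch below is only a rough sum-of-half-widths estimate for the p = 1 and p = 3/2 states, built from the small-ε values H ≈ 1 and H ≈ 7ε/2 and the half-width ω0·√|H| that follows from (27.109). It is not claimed to be the form of Chirikov's criterion cited in the text, and the function names are assumptions.

```python
import math

def overlap_threshold(eps):
    """Value of omega0 at which the half-widths of the p = 1 and p = 3/2
    resonances, omega0 * 1 and omega0 * sqrt(7 * eps / 2), add up to their
    spacing of 1/2 in d(theta)/dt'."""
    return 0.5 / (1.0 + math.sqrt(3.5 * eps))

def resonances_overlap(omega0, eps):
    """True if this rough estimate predicts overlap for the given omega0."""
    return omega0 > overlap_threshold(eps)
```

The functions expect ω0 itself, so a parameter quoted as ω0² has to be square-rooted before it is passed in. Being a rough estimate, the threshold need not coincide with the boundary given by the criterion referred to in the text.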
Because of the symmetry of the inertial ellipsoid, the orientation ϑ is identical to
that with ϑ + π. Therefore, the graphs were plotted only for the range of ϑ between 0
and π. By averaging over the high-frequency components, solutions were obtained in
which γp = ϑ − pt′ vibrates (for Ep < Eps). For each of these solutions, dϑ/dt′ has
a mean value of exactly p, and ϑ takes all values between 0 and 2π. If one considers,
however, only the points with ϕ = 0, i.e., times t′ = 2πn, then γp exactly corresponds
to ϑ modulo π. Therefore, a vibration in γp appears as a vibration in ϑ. The successive
points of quasiperiodic vibrations therefore yield a simple curve in the vicinity of
dϑ/dt′ = p that contains only a part of the angles between 0 and π. For nonresonant
quasiperiodic trajectories (Ep > Eps), all γp’s rotate, and successive points form a
single curve that contains all angles ϑ.
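The construction of such a cut can be sketched in a few lines. The following is a minimal illustration, assuming units with a = 1, so that the dimensionless time t′ equals the mean anomaly and the points with ϕ = 0 occur at t′ = 2πn. It integrates (27.103) with a fixed-step Runge-Kutta scheme and obtains r(t′) and ϕ(t′) from Kepler's equation by Newton iteration. These numerical choices, the function names, and the default step counts are assumptions of this sketch and are not the procedure of the cited authors.

```python
import math

def kepler_r_phi(t, eps):
    """Return (r, phi) on a Kepler ellipse with a = 1 at mean anomaly t."""
    E = t  # starting guess for the eccentric anomaly
    for _ in range(50):  # Newton iteration on E - eps*sin(E) = t
        dE = (E - eps * math.sin(E) - t) / (1.0 - eps * math.cos(E))
        E -= dE
        if abs(dE) < 1e-14:
            break
    r = 1.0 - eps * math.cos(E)
    # True anomaly, determined up to multiples of 2*pi (irrelevant for sin 2(...)).
    phi = 2.0 * math.atan2(math.sqrt(1.0 + eps) * math.sin(0.5 * E),
                           math.sqrt(1.0 - eps) * math.cos(0.5 * E))
    return r, phi

def rhs(t, theta, omega, w0sq, eps):
    """Right-hand side of (27.103) as a first-order system, with a = 1."""
    r, phi = kepler_r_phi(t, eps)
    return omega, -0.5 * w0sq / r**3 * math.sin(2.0 * (theta - phi))

def poincare_section(theta0, omega0, w0sq, eps, n_orbits=50, steps=400):
    """Collect (theta mod pi, d theta/dt') at t' = 2*pi*n for n = 1..n_orbits."""
    dt = 2.0 * math.pi / steps
    theta, omega = theta0, omega0
    points = []
    for orbit in range(n_orbits):
        for i in range(steps):
            t = (orbit * steps + i) * dt
            k1 = rhs(t, theta, omega, w0sq, eps)
            k2 = rhs(t + 0.5 * dt, theta + 0.5 * dt * k1[0],
                     omega + 0.5 * dt * k1[1], w0sq, eps)
            k3 = rhs(t + 0.5 * dt, theta + 0.5 * dt * k2[0],
                     omega + 0.5 * dt * k2[1], w0sq, eps)
            k4 = rhs(t + dt, theta + dt * k3[0],
                     omega + dt * k3[1], w0sq, eps)
            theta += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
            omega += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        points.append((theta % math.pi, omega))
    return points
```

Plotting the returned pairs for a set of initial conditions gives a picture of the kind described above. A fixed-step integrator is adequate for a qualitative look, but for long chaotic trajectories a smaller step or an adaptive scheme would be advisable.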
As is seen from Fig. 27.22, for small values of ω0 and ε the resonant states and
the nonresonant ones are separated by a narrow chaotic zone. The figure shows ten
distinct trajectories. Three of them correspond to the quasiperiodic vibrations in the
states p = 1/2, 1, and 3/2. As predicted by the approximation, γ1/2 vibrates about
ϑ = π/2; γ1 and γ3/2 vibrate about ϑ = 0. Three further trajectories, always enclosing
the resonant states, are chaotic. They fill narrow bands with points in a seemingly
random manner. The last four trajectories show that each chaotic band is separated
from the other bands by nonresonant quasiperiodic trajectories.
12 B.V. Chirikov, A universal instability of many-dimensional oscillator systems, Phys. Rep. 52, 262
(1979).
13 J. Wisdom, S.J. Peale and F. Mignard, The chaotic rotation of Hyperion, Internat. Journal of Solar
System Studies 58, 137 (1984).
Fig. 27.22. Transverse cut of the phase space for ω0² = 0.2 and ε = 0.1
Fig. 27.23. Transverse cut of the phase space for the parameters of Hyperion: ω0² = 0.89 and ε = 0.1
Figure 27.23 displays the situation for Hyperion. At least the chaotic zones of the
states p = 1 and p = 3/2 are no longer separated from each other. There is a small
remainder of the quasiperiodic p = 1/2 state; the quasiperiodic p = 3/2 state disap-
peared completely. Instead, there is a quasiperiodic state at p = 9/4, ϑ = π/2, which
is not given by the approximation (27.108). In total one sees 17 trajectories in the fig-
ure: eight quasiperiodic vibrations of the states p = 1/2, 1, 2, 9/4, 5/2, 3, and 7/2,
five nonresonant quasiperiodic rotations, and four chaotic trajectories.
A deeper study shows that the alignment of the rotational axis perpendicular to the
orbital plane is not stable, both in the chaotic and in the synchronous state. This
means that a small deviation of the rotational axis from the vertical increases
exponentially. The time scale for the resulting staggering motion is of the order of
magnitude of several orbital periods. The final state of a “normal” moon is thus
completely unstable for Hyperion. But if the axis tilts out of the perpendicular to the
orbital plane, (27.97) is no longer sufficient, and one has to solve the full nonlinear
Euler equations. One then finds that the full three-dimensional motion is completely
chaotic. All three characteristic Lyapunov exponents are positive (of the order of
magnitude 0.1), which implies a strongly chaotic staggering. Even if one could have
measured the spatial orientation of the rotation axis at the moment of Voyager 1 passing
Hyperion (in November 1980) with a precision of ten significant figures, it nevertheless
would not have been possible to predict the orientation of Hyperion’s axis at the moment
when Voyager 2 passed it (in August 1981).
Up to this point, the tidal friction, which causes a comparatively slow change of
the initially Hamiltonian system, has been neglected. But one can roughly describe
the history of Hyperion. Presumably, the period of the eigenrotation was initially much
shorter than the orbital period, and Hyperion began its evolution in the range far above
that which is shown in the figure. Over a time of the order of magnitude of the age of
the solar system (circa 10^10 years) the eigenrotation was decelerated, and the rotation
axis straightened up perpendicular to the orbital plane. Thereby, the premises of our
simplified model (27.97) are approximately justified. But when the evolution once had
reached the chaotic domain, “the work of the tides lasting aeons was destroyed in a
few days,”14 since once it arrived in the chaotic domain, Hyperion began to stagger in
a fully erratic manner (which continues to the present day).15 Sometimes, its path will
end up in one of the few stable islands of the figure. But this cannot be the synchronous
state, since the latter state is unstable.
14 J. Wisdom, Chaotic behavior in the solar system, Nucl. Phys. B (Proc. Suppl.) 2, 391 (1987).
15 The observations of Voyager 2 are consistent with this statement. The staggering of Hyperion has
also been observed directly from the earth (see J. Klavetter, Science 246, 998 (1988), Astron. J. 98,
1855 (1989)).
Part VIII
On the History of Mechanics
Here, we follow Friedrich Hund, Einführung in die Theoretische Physik Bd. 1, Mechanik,
Bibliographisches Institut Leipzig (1951), and also I. Szabo, Einführung in die Technische
Mechanik, Springer-Verlag, Berlin, Göttingen, Heidelberg (1956). For more on what is presented
here, in particular on the history of statics, we refer to the works of P. Duhem:
Les origines de la statique, Paris (1905–1906),
Études sur Léonard de Vinci, Paris (1906),
Le système du monde, Paris (1913–1917),
and to P. Sternagel:
Die Artes mechanicae im Mittelalter (1966),
and to F. Krafft:
Dynamische und statische Betrachtungsweise in der antiken Mechanik (1970).
28 Emergence of Occidental Physics in the Seventeenth Century
Physics describes nature by theoretical-physical, mathematical theories. It uses gen-

eral but precise concepts, builds upon certain laws (“axioms”) and makes precise
mathematical-logical statements on natural phenomena. An important role is played
by the idealization of reality, i.e., the representation of reality by describing ideal sit-
uations. Precise statements can be made only about these ideal cases. In doing so,
physics utilizes systematically worked-out procedures for proof (reproducibility) of
its statements. The application of the laws of physics has led to a far-reaching control
of nature which seems to grow continually.
Physics originated in the occidental culture. As a modern natural science it has
characterized this society for about 300 years, i.e., since about the seventeenth century.
One should note that the consideration of nature in Greek antiquity differs essentially
from the present natural science in our society. In ancient Greece, say at the time of
Plato1 and Aristotle,2 there existed little systematic knowledge of nature. What
existed appears from a present-day view to be primitive and of low precision. Attempts
to sharpen physical concepts rigorously were rare, although the clear line
of thought in the then already highly developed mathematics might have served as a
model. For example, Eudoxus3 (400–347 BC) was one of the most outstanding math-
ematicians of that age who was close to Plato. The philosopher Plato in turn admired
the inner logical consistency of mathematics. Partial disciplines of physics, such as
astronomy, statics, and music theory, only later were quantified. But no genuine me-
chanics emerged. The analysis of motion and the precise formulation of the concept
of force were not achieved in antiquity, although mathematics (e.g., by Archimedes,4
287–212 BC) was actually just as far developed as in the seventeenth century when
the theory of motion originated. It seems to be a peculiar feature of the ancient cul-
ture that mechanics in its full meaning, and hence the beginning of comprehensive
physics, could not develop in the ancient world. It is also remarkable that the natural
scientists of antiquity hardly influenced the general consciousness of that age. The
specialized sciences were not of public concern. Scientists in Alexandria, Pergamon,
or Rhodos belonged to a closed circle.
The cultures in East Asia and India, still existing nowadays, produced out-
standing achievements in other disciplines and certain techniques (ceramics, dye-
works), but no physics. The conceptional-mathematical thinking about nature and the
questioning of nature by determined experiments were not developed or promoted
there.
Physical science emerged on one hand from the philosophic thinking about nature
and on the other hand from technical problems (e.g., military resources, traffic, civil
engineering, mining). In particular the first developed mechanics, which at the very
beginning was merely statics, goes back to these two sources. A proper unification of

these two approaches to nature was, however, not reached in antiquity. Archimedes
(287–212 BC) represented the climax in ancient statics: the lever principle, the con-
cept of the center of gravity of a body, and the well-known hydrostatic law named
after him, were known to him in full clarity. Yet, Archimedes’s discoveries fell into
oblivion. The reasons are not known. Possibly his ideas were simply too hard to be un-
derstood. Whatever the reasons were, in the Middle Ages only the works of Aristotle
were known, and these determined the further development of mechanics.
In the fourteenth century, statics enjoyed a time of prosperity, mainly due to
distinguished men of the arts faculty of the University of Paris. The methods for
decomposing and combining forces were developed there and utilized for solving statical
problems. In addition, the concept “work of a force” was introduced, and the “vir-
tual work” in virtual displacements was correctly used in simple cases. Leonardo da
Vinci5 (1452–1519) was a leading researcher in mechanics of his time. He performed
the decomposition of forces for investigation of moments (lever law). For individ-
ual examples he traced the law of the parallelogram of forces back to the lever law.
These relations were formulated clearly and precisely only in the seventeenth century,
by Varignon (1654–1722) and Newton6 (1643–1727).
The dynamics of mass points was created in the seventeenth century. Since this
is a very important period in the history of mechanics, we will outline it in more
detail. It should be mentioned that the formal completion and mathematical treatment
of mechanics happened in the eighteenth century and culminated in the mechanics of
Lagrange7 (1736–1813).
The seventeenth century was presumably the most decisive period in the history of
physics, probably the birth of physics as an exact science per se. In that time mechanics
was created and completed in outline, monumental in its scientific clarity and beauty,
and convincing in its predictive power and mathematical formulation. The processes
of motion in the heavens and on earth were described in a consistent way. Method-
ically in mechanics one succeeded for the first time in sharply conceptualizing ex-
periences (experiments). This was achieved mainly by using mathematical language,
combined with the invention of very fruitful abstractions and idealized cases about
which one was now able to make precise and final statements. The implications of the
new physical knowledge were quickly realized in public consciousness, as is demon-
strated by the science methodology of Bacon8 (1561–1626), Jungius9 (1587–1657)
and Descartes10 (1596–1650).
The new scientific spirit did not concern mechanics only. Actually the first book on
physics in the meaning of the new science stems from the English physician Gilbert11
(1540–1603) on the magnet. He started from well-planned experiments, generalized
them, and in this way came to general statements on magnetism and geomagnetism.
This book strongly influenced the intellectual-scientific evolution of that age. The
great scientists of that era knew it (Kepler12 praised it, Galileo13 used it). However,
it was not in magnetism that physics made a great breakthrough, but dynamics. The
following stages were important:
(1) Kepler interpreted the processes in the sky as physical phenomena;
(2) Galileo succeeded in the correct conceptual understanding of simple motions. He
invented the abstraction of the “ideal case”;
(3) Huygens14 and Newton cleared up and completed the new concepts.
In the following table, we list the most important researchers and thinkers of that
era:
1473–1543 Copernicus
1530–1590 Benedetti
1540–1603 Gilbert (1600 De Magnete)
1548–1620 Stevin
1561–1626 Bacon of Verulam
1564–1642 Galileo (1638 Discourses on Fall Laws)
1571–1630 Kepler (1609 Astronomia Nova)
1587–1657 Jungius
1592–1655 Gassendi
1596–1650 Descartes (1644 Principia Philosophiae)
1608–1680 Borelli
1629–1695 Huygens (1673 Pendulum Clock)
1635–1703 Hooke
1643–1727 Newton (1686 Principia).
According to the opinion prevailing in antiquity—presumably going back to
Aristotle—heaven and earth were greatly separated from each other. The heavens rep-
resented the perfect, unchanging, divine. The earthly, on the contrary, was changing
and chaotic. This opinion was considered to be confirmed by the circular orbits of the
celestial bodies and the ideally straight earthly motions. The stars were interpreted as
being essentially different from the earth, which presumably prevented the progress
of the heliocentric world system which had been invented already in antiquity. Only
Copernicus15 (1473–1543) made this breakthrough. As a proof for his doctrine, which
states that the sun is in the center of the world, he had to offer only the simplicity and
beauty. This was not yet a “physics of the heaven,” but rather a kind of geometrical
ordering of the world.
Kepler was presumably the first to realize the connection of the motions in the sky with
physics. He asked about the forces when he put the orbiting velocity of planets that
decreases outward in a causal connection with a force originating from the sun and
decreasing with the distance from it (1596). He justified the validity of the heliocentric
system with the sun as the origin of the force which causes the planetary motions. In
his Astronomia Nova Seu Physica Coelestis he expressed (1609) the first planetary
laws. But his conclusion, a 1/r-dependence of the gravitational force, is false. He
concluded from the area law
$$r^2\dot\varphi = \text{constant}$$
that the velocity rϕ̇ perpendicular to the sun-planet vector is inversely proportional to
the radius,
$$r\dot\varphi = \frac{\text{constant}}{r},$$
and that the force should therefore show the same dependence, which, as stated already,
is wrong. He related this reasoning to the propagation of the force in the planetary
plane. Thus, Kepler understood the problem set by the planetary motion rather clearly,
but he still did not yet have the conceptual and mathematical tools for describing the
curved motion.
Kepler on the one hand was a rational astronomer and physicist who searched for
the laws of motion of planets and for the reasons behind them; on the other hand he
was an aesthete who considered the world as an ordered entity. In the Harmonices
Mundi (1619), he ascribed a particular role in the structure of the world to the five
regular bodies. He used them as mathematical “archetypes.” The use of mathematics
for describing connections, as soon became customary, was denied him, however.
Galileo does not match Kepler in the physical interpretation of planetary motion.
He considered the circular orbit to be natural and did not yet understand the general
meaning of the laws of inertia. Borelli16 (1608–1680), on the contrary, qualitatively
described the motions of celestial bodies as an interplay of the attraction by the sun
and of the centrifugal force. He understood the planetary motion as a problem of the-
oretical mechanics. But to solve this problem essential tools still had to be developed;
first of all free fall and the throw had to be understood.
The idea of the inertia of bodies, their persistence in the state of uniform motion in
the absence of forces, was hard to understand. The abstraction was accepted only la-
boriously. On the contrary, it was easier, more natural, to ask for the reason of changes
of position and to look for relations between force and velocity, as was already done
by Aristotle in nebulous form. He took the view that the thrown body had a certain
immaterial ability which, however, gradually decreased (vis impressa naturaliter
deficiens). In the fourteenth century this ability was called “impetus” (Buridan, 1295–
1336?). The impetus normally remains constant, but gravity and air resistance can
modify it. A falling body is accelerated because more and more new impetus is added
by gravity (Benedetti,17 1530–1590). Galileo also shared this view, but he at first tried
to combine this understanding with the doctrine of the Aristotelian school that force
and velocity were proportional to each other, in a complicated nontransparent way.
Only on the third day of the Discourses on Fall Laws, clarity about the course of
the fall motions is achieved. The ideal cases of uniform motion and of uniformly ac-
celerated motion are outlined and mathematically grasped. The uniformly accelerated
motion is compared with the experience of free fall and the fall on the tilted plane.
Finally, on the fourth day of the discourses the tilted throw is analyzed: it is correctly
understood as a composition of an (ideal) propagating motion and an (ideal) fall mo-
tion.
Although Galileo described the law of inertia (persistence of bodies in uniform
motion if no forces are acting on them) and the proportionality between force and
acceleration, he did not express the general law of motion. He also did not apply his
knowledge of the laws of falling bodies on planetary motion. From his entire work
one realizes how reluctantly he gradually gave up the old ideas.
But the new ideas gained acceptance. In 1644, Descartes made the first attempt to
formulate general laws of motion. He talked about the persistence of bodies in the state
of rest or of motion, about linear motion as most natural motion (meaning force-free
motion) and about the “conservation of motion” in the impact of bodies. The latter

is evidently the conservation of momentum (called motion) $\sum_i m_i \mathbf{v}_i$. Descartes,
however, did not realize the vector character of the momentum (the “motion”), which is
presumably why his applications of this law are wrong.
The conceptual completion of mechanics was achieved by Huygens when treating
curved motions. He studied the motion of a body on a given path in the gravitational
field of the earth (tautochrone problem) and demonstrated clear understanding of the
centripetal and centrifugal force in the treatment of the pendulum clock (1673). He
explained these topics by infinitesimal considerations as uniformly accelerated devia-
tion from linear motion. He realized the proportionality between centrifugal force and
(centrifugal) acceleration, where the acceleration is already defined infinitesimally.
Huygens realized the momentum was a vector quantity, and thus he correctly inter-
preted the momentum conservation law, as can be seen from his applications of this
law to impacts.
Then Newton came up with his brilliant work Philosophiae Naturalis Principia
Mathematica (1686/87), and systematically showed the connection between mass, ve-
locity, momentum, and force. He demonstrated by the example of the gravitation law
how a force (measured by the change of the momentum) is determined by the loca-
tions of the involved bodies. Finally he applied the laws of mechanics to the treatment
of planetary motion, and he showed how the gravitation law follows by induction from
Kepler’s laws, and how the Kepler laws follow deductively from the gravitation law.
This represented the final breakthrough and the proof of the new mechanics; so to
speak the completion of Kepler’s quest.
After the preceding considerations, the question might arise why Kepler failed to
discover the acceleration law—and, hence, gravity—although it follows seemingly
simply from his own law. But it is not for us to accuse Kepler for this reason of
a lack of brilliance and imagination. It is beyond any doubt that he had both—the
genius in empirical research and the imagination in far-reaching speculations.∗ The
explanation is as follows: Kepler was a contemporary of Galileo, who survived him by
twelve years. Although Kepler knew the Galilean mechanics, in particular the central
concept of acceleration, the laws of inertia and of throwing by correspondence and
by hearsay, he nevertheless could not work it out to a consistent structure. (Note that
Kepler died in 1630, while Galileo’s Discorsi outlining his mechanics was published
only in 1638.) But even more decisive is the fact that the theory of curved motion—
invented by Huygens for the circle and completed by Newton for general orbits—was
not available to Kepler. But without the concept of acceleration for curved motions
it is impossible to derive the form of the radial acceleration from Kepler’s laws by
simple mathematical manipulations.
Newton’s mechanics of gravitation, which emerges from the dynamical fundamen-
tal law and from the reaction principle, is by its nature a further development of the
throw motion discovered by Galileo. Newton writes about this point: “That the plan-
ets are being kept in their orbits by central forces is seen from the motion of thrown
objects. A (horizontally) thrown stone under the action of gravity will be deflected
from the straight path and falls, following a bent line, finally down to the earth. If it is
thrown with greater velocity, it flies farther away, and so it might happen that it finally
would fly beyond the borders of the earth and would not fall back. Hence, the mis-
siles thrown away from the top of a mountain with increasing velocity would describe
more and more extended parabolic curves and finally—at a certain velocity† —return
∗ For example, in his thoughts concerning the possible number of planets, since he was convinced—
like the Pythagoreans—that God had taken the choice on number and proportions according to a
certain number rule.
† The value for horizontal throw is correctly given by Newton from mv²/R = mg as v = √(gR) =
7900 m s⁻¹; for the vertical shot into space the necessary velocity is obtained from the energy law
$$\frac{1}{2}mv^2 = \gamma\int_R^\infty \frac{mM}{r^2}\,dr = \gamma\frac{mM}{R}$$
with g = γM/R² as
$$v = \sqrt{2gR} = 11200\ \text{m s}^{-1}.$$
Both results neglect the friction losses in the air.
to the top of the mountain, and in this way move around the earth.” An overwhelming
argument—by conception and compelling logic!
The English physicist Hooke18 (1635–1703), who is known as the founder of the-
ory of elasticity, also came close to the gravitation law. This is shown by the following
statements made by him:‡ “I shall develop a world system which in every way agrees
with the known rules of mechanics. This system rests on three assumptions: (1) All
celestial bodies have an attraction directed toward their center (gravity); (2) all bod-
ies that are put in straight and uniform motion move in a straight line until they are
deflected by any force and forced into a curved path; and (3) attractive forces are the
stronger the closer they are to the body they are acting upon. Which are the various
degrees of attractions I could not yet determine by experiments. But it is an idea that
must enable the astronomers to determine all motions of celestial bodies according to
one law.”
These remarks show that Newton did not create the monument of his Principia
out of nothing. But it took an immense mental strength and bold ideas to concentrate
all that had been created by Galileo, Kepler, Huygens, and Hooke in physics, astronomy, and
mathematics into one focus, and in particular to announce that the force that lets the
planets circulate on their orbits around the sun is identical with the force that drives
the bodies on the earth to the floor.
For this knowledge, mankind needed one-and-a-half millennia if one considers that
in the Moralia (De facie quae in orbe lunae apparet) Plutarch19 (46–120) states that
the moon by the momentum of its orbiting is just in the same way kept from falling to
earth as a body which is “rotated around” in a sling. It took the genius of Newton to
realize what the “sling” for the planets is!
The new mechanics had a tremendous impact on the spirit of that age. Now there
existed a second incontestable science besides mathematics. Furthermore, the exact
natural sciences were born: Mechanics advanced to their model.
Let us summarize once again the most important stages in the evolution of mechan-
ics from a present-day view: The essential part of mechanics and of its fundamental
concepts is expressed in the basic dynamic law
$$\frac{d\mathbf{p}}{dt} = \frac{d}{dt}(m\mathbf{v}) = \mathbf{F}.$$
Here, basically the acceleration appears as the signature of an acting force F; the law
of inertia, i.e., the conservation of momentum
p = mv
and, hence, of the velocity v = ṙ if no external forces are acting, is also contained
therein. This law of inertia had already been realized in the ancient and scholastic
mechanics (Philoponos, Buridan) by experience. Uniform motion as the ideal case
‡ There are still two other hints on the many-sided active genius of Hooke: In 1665, he writes the
prophetic words: “I have often thought that it should be possible to find an artificial, glue-like mass
which is equal or superior to that excretion from which the silkworms produce their cocoon and which
can be spun to threads by jets.” This is the basic idea of the man-made fiber that—although two-and-
a-half centuries later—revolutionized the textile industry! In the same year he writes, anticipating the
mechanical theory of heat (hence, also kinetic gas theory): “That the particles of all bodies, as hard
as they may be, nevertheless vibrate, one needs in my opinion no other proof than that, that all bodies
involve a certain degree of heat and that never before has an absolutely cold body been found.”
of motion was described by Galileo. Descartes clearly formulated the law of inertia,
and Huygens utilized it correctly. As already stated, in the basic dynamic equation
the acceleration appears as a differential quotient of the velocity. Huygens clearly
realized that. He also correctly understood the acceleration as a measure of the force,
as well as the role of the mass in the momentum. Newton summarized everything in
a sovereign manner and applied the fundamental law to celestial mechanics. In this
sense Newton is the endpoint of the way to mechanics to which besides him also
Galileo and Huygens contributed essentially. The general concept, however, is due to
Kepler.
For the history of mechanics, we also refer to the outlines of the history of physics.
For the sections treated here, see particularly
E.J. Dijksterhuis, Val en worp, Groningen 1924.
E. Wohlwill, Galilei, Hamburg and Leipzig 1909 and 1926.
The most important original papers were translated to German in the collection:
Ostwald’s classics of exact sciences. The main works of Kepler and Newton are avail-
able in German as
J. Kepler, Neue Astronomie oder Physik des Himmels (1609). German translation.
Munich 1929.
I. Newton, Mathematische Principien der Naturlehre (1686/87). German Transla-
tion. Leipzig 1872.
E. Mach, Die Mechanik in ihrer Entwicklung historisch-kritisch dargestellt (1933).
R. Dugas, A History of Mechanics, Neuenburg/Switzerland, (1955).
P. Sternagel, Die Artes mechanicae im Mittelalter (1966).
F. Krafft, Dynamische und statische Betrachtungsweise in der antiken Mechanik
(1970).
For modern English editions and translations of historical texts, we refer to
J. Kepler, W.H. Donahue (Translator), New Astronomy (1992), Cambridge Univ.
Press.
I. Newton, I.B. Cohen, A. Whitman (Translators), The Principia: Mathematical
Principles of Natural Philosophy (1999), Univ. California Press.
J.L. Lagrange, A.C. Boissonnade, V.N. Vagliente (Translators), Analytical Mechan-
ics (2001), Kluwer Academic Publishers.
Finally, we mention some texts about the history of mechanics
E.J. Dijksterhuis, The Mechanization of the World Picture: Pythagoras to Newton
(1986), Princeton Univ. Press.
E. Mach, T.J. McCormack (Translator), The Science of Mechanics: A Critical and
Historical Account of its Development, 6th edition (1988), Open Court Publishing
Company.
R. Dugas, A History of Mechanics (1988), Dover Pub.
Notes
1 Plato, Greek philosopher, b. 427 BC, Athens–d. 347, Athens, was the son of Ariston
and Periktione, from one of the most noble families of Athens. According to legend,
he wrote tragedies in his youth. The meeting with Socrates, whose pupil he was for
8 years, became decisive for his turn to philosophy. After Socrates’ death (399), he
first went with other pupils of Socrates to the town of Megara to study with Euclid.
He then broadened his horizons by extended travel (first to Cyrene and Egypt). He
soon returned home and opened the war against the educational ideal of the Sophists
by his first works. He quickly won enthusiastic followers while dealing with science,
secluded from public life. Presumably, scientific intentions led him about 390 to Italy,
where he became familiar with the Pythagorean doctrine and school organization. He
was introduced to the court of the tyrant Dionysius of Syracuse. Dionysius was at first
much interested in him, but according to legend handed Plato over as a prisoner to
the envoy of Sparta, who sold him as a slave. After payment for release and return
to Athens, Plato founded the Academy in 387. Despite his bad experience, he set his
hopes for a full effectiveness to Syracuse. In 368, he followed an invitation by Dion,
uncle of the younger Dionysius, who hoped to win the young ruler over to Plato’s
political principles. Dionysius, however, only tolerated Plato’s ideas for a short time.
A third trip (361–360) also failed, since Dionysius distrusted him and turned against
him. Plato spent the last years in Athens in continuing scientific activity within a circle
of well-known scholars; according to legend, he died during a wedding meal.
Plato’s works are all preserved, except for the lecture On the good, which can be
reconstructed only in broad terms. But not all work recorded under his name is au-
thentic. The authenticity of the 7th Letter and of Laws is controversial.
The most important and surely authentic papers from the early period are: Apol-
ogy, Protagoras, State I, Gorgias, Menon, Kratylos; from the middle artistic period:
Phaidon, Banquet, State II–X; from the last years: Phaidros, Parmenides, Theaithetos,
Sophistes, Timaios, Philebos.
Almost all of his writings are dialogues that by language and structure are of great
artistic beauty. In most of them, Socrates appears as the main host of conversation.
Plato’s philosophy turns dialectics, which for his teacher Socrates had only the
negative function of destroying the false knowledge on the good and the virtue, into
an approach of realizing the good and the virtue–into a path to the “ideas.” The ideas
are not acts of imagination but are the content of that being represented by them,
which in itself is independent of us. By this distinction of the sensual (which is with
us) from the hypersensual (the later “transcendent”), Plato became a promoter of the
later so-called metaphysics.
Since the innermost nature of love is the will for perpetuation, it comes to fulfill-
ment only as love of the eternal ideas. All other love is a preliminary stage to that. To
kill off the transitory sensual and to turn toward the everlasting ideas is the aspiration
of the really philosophical man. The way toward this goal is the dialectics. Also, the
nature of this method of perception of ideas is logic.
Plato’s understanding of the role of the idea varies between the general idea and the
idea a priori. Provided it is the latter one, it is not brought into man from the outside,
but he remembers it as something he already knows but has forgotten. The under-
standing of the idea is remembrance (anámnesis). The method of remembering is that
of the hypothesis. By this, Plato means the proof in the form of the statement-logical
conclusion: If the first, then the second. But now the first, therefore the second. Or:
But now not the second, therefore not the first: e.g., in Menon: If virtue is knowledge,
then it is teachable. But now it is knowledge, therefore it is teachable. But now it is not
teachable, since there are practically no teachers of virtue. Therefore, it is not knowl-
edge, but only correct opinion inspired by the Gods; to transform it into knowledge
that is able to self-satisfy is according to Plato the essential duty of philosophy.
The later form of dialectics, the method of division (diaeresis) of the species into
sorts, is the draft of a logical method of proof. Aristotle rightly interprets it as a pre-
lude of the class-logical conclusion discovered by him. All proving is proving on
conditions. These can be proved themselves. In the State, Plato sketches the idea of a
completion of this proving up to the omission of all assumptions (anypótheton), i.e.,
the idea of a proof by and from the purely logical. The absolute is defined here as
the good. In his later works, Plato interprets the ideas as numbers, i.e., as units that
include a manifold in themselves, and sees their absolute principle in the One and the
“Great-and-small” (interpretation controversial).
Plato remains aware of the limits of all human proofs. Where the dialectics ends,
there remains the speculative speech that uses the language of myth. All knowledge of
the sensual, the nature, does not go beyond well-founded presumption. Therefore, all
natural-scientific speech is necessarily myth. Plato develops this idea in his dialogue
Timaios, which was of particular influence in the Middle Ages.
The question of what is for man the good and the virtue as the way toward this
goal is answered by Plato by means of his dialectics of ideas, at first in the State by
his doctrine of the four cardinal virtues: wisdom, bravery, prudence, and justice. The
sketch of an ideal state serves only for proving this doctrine and does not represent
a plan that should be realized. This “ideal state” with its subdivision into the three
orders of scholastic profession, military profession, and peasantry, and his doctrine
on the community of possessions and women, and on the necessity that the kings
should become philosophers and the philosophers should become kings, later on was
interpreted as a political program and became efficient. In the late work Philebos,
Plato sees the good of human life in the composition of knowledge (epistéme) and joy
(hedoné), where all knowledge is admitted, but among the joys only those that are not
mixed with grief, those that cannot impair the knowledge. Men must be educated in
the spirit of such a life ideal if a real and stable state shall be possible. This restoration
program demands a radical restriction of the influence of the traditional literature on
the individual and on the community. Plato’s philosophy and critique of art were also
of extraordinary historical influence.
Plato’s doctrine, Platonism, was first developed further in Plato’s school, the Acad-
emy. One distinguishes the older, intermediate, and younger Academy. In the older
one, whose first and most important leaders were Speusippos and Xenocrates, the
Pythagorean attitudes of the late philosophy of Plato were emphasized. The ratio
of ideas and numbers became the focus of interest; soon mythological elements
joined. On the contrary, the leading men of the intermediate Academy, Arcesilaos
(315–241 BC) and Carneades (214–129), intended to revive the critical-scientific at-
titude of Plato. In this way, there emerged an—although moderate—skepticism that
believes that only probable insight is possible. The younger Academy considers the
power of reason again as more positive and combines thoughts of various systems in
an eclectic manner, in particular Platonic and Stoic thoughts. To the younger Acad-
emy belong Philo of Larissa (160–79) and Antiochos of Ascalon († 68 BC), heard
by Cicero in Athens. The Platonism of the three academies is called older Platonism.
The transition from this one to the new Platonism is mediated by the “intermediate”
Platonism, with the main representative Plutarch (AD 50–125), who taught a religious
Platonism with a strong emphasis on the absolute transcendence of God and assumed
a series of steps of intermediate beings between God and the world.
In the Middle Ages until the twelfth century, only Timaios was known and had
a strong impact. In the twelfth century, Henricus Aristippus translated Menon and
Phaidon, and in the thirteenth century, W. von Moerbeke translated Parmenides. The
new Platonism had more influence than Plato’s original ideas. The historical evolution
of the philosophy of the Middle Ages was largely determined by the confrontation
between Platonism and Aristotelian philosophy. In the early scholastic, Platonism was
dominated mainly by Augustinus; in particular, the school of Chartres had a Platonic
orientation. In the high scholastic, Platonism formed a strong undercurrent in the doc-
trines of the Aristotelians (Albert, Thomas). It emerged as an independent movement
among the mathematical-scientific thinkers (Robert Grosseteste, Roger Bacon, Witelo,
Dietrich of Freiberg) and among the German mystics. The latter established the link
to the Platonism of the early Renaissance (Nicholas of Cues).
Modern Platonism began during the Italian Renaissance. In 1428, Aurispa brought
the complete Greek text of Plato’s works from Constantinople to Venice. Latin translations
soon emerged, the most important one by Marsiglio Ficino, who completed it
between 1453 and 1483. Followers of Platonism included Lionardo Bruni and the older Pico
Della Mirandola, as well as Byzantines who had fled to Italy, among them the two
Chrysoloras, Gemistos Plethon and Bessarion. Central to this movement was the Pla-
tonic academy in Florence, founded in 1459 by Cosimo de Medici and guided by
Ficino. From there, Platonism spread all over Europe. However, only in England did
a truly Platonic school emerge (Cambridge). But Plato’s thoughts had lasting effects
in the rationalistic systems of Cartesius, Spinoza, and Leibniz. Malebranche was even
called the “Christian Plato.” In the nineteenth century, German idealism brought a
revival of Plato’s system of thought. Hegel resorted not only to Platonism but even
more to Plotinus and the New Platonism. Plato’s influence is seen in the recent past
in the phenomenology of Husserl and in world philosophy. A.N. Whitehead explic-
itly confessed to Platonism. Although his statement that all of European philosophy is
only a footnote to Plato is exaggerated, it nevertheless rightly points out the immense
influence of Plato’s philosophy. Platonism is even more dominant in the philosophy
and theology of the Christian East, where the Platonic tradition of Origenes and of
the Greek church fathers survives; for example, W. Solowjew and N. Berdjajew are
Christian Platonists. [BR]
2 Aristotle, Greek philosopher, b. 384 BC, Stagira in Macedonia–d. 322, Chalcis on
Euboea. He came at the age of 18 to Athens and became a student of Plato, where
he remained at the Academy for almost two decades, first as a student and then as
a teacher, finally opposing Plato with his own philosophy. After Plato’s death (347),
he lived for three years in Asia Minor with Hermias, the ruler of Atarneus. In
343, he was called to the court of Philip of Macedonia to be the tutor of Philip’s son
Alexander. When Alexander ascended the throne, Aristotle returned to his hometown;
however, in 334 he returned to Athens. In Athens, he founded the Peripatetic school,
so called because of the covered walks (peripatoi) surrounding the Lyceum. He taught
there for twelve years among an ever-increasing circle of students, until the revolt of
Athens after Alexander’s death became dangerous to him, a friend of the royal dynasty.
He went to his estate at Chalcis on Euboea, where he soon died. [BR]
3 Eudoxus of Cnidus (400–347 BC), Greek scientist. He was equally active as a mathematician,
astronomer, and geographer. His biography is not recorded in detail, but
it is considered certain that he was a member of the Platonic school. Later, he conducted
his own school in Cyzicus. Eudoxus gave a new definition for proportion. He
developed the method of exhaustion and applied it to many geometric and stereometric
problems and theorems, which he could prove exactly for the first time. Possibly, the
major part of Euclid’s twelfth book is the work of Eudoxus. Eudoxus made a map of
the stars that remained top-ranking for centuries. He subdivided the sky into degrees
of longitude and latitude, gave an improved value for the solar year, and improved the
calendar. He estimated the earth’s circumference to be 400,000 stadia, edited a new
map of the known continents, and wrote a geography of seven volumes.
4 Archimedes, outstanding mathematician and mechanician of the Alexandrian era,
b. about 285 BC, Syracuse, killed by a Roman soldier during the capture of Syracuse.
Archimedes was close to the Syracusan dynasty. He wrote important papers on mathematics
and mathematical physics, fourteen of which are preserved. He calculated the
area and circumference of a circle, the area and volume of segments of the parabola,
the ellipse, the spiral, the paraboloid of revolution, the hyperboloid of one sheet, and others, and
determined the center of gravity of these figures. For π, he gave a value between
3 1/7 and 3 10/71; he developed in his “sand calculation” a method of exponential
notation of arbitrarily large numbers, and in the Ephodos, a kind of integral calculus.
Even more important than his treatment of the equilibrium conditions of the
lever is the treatise on floating bodies, where the principle of Archimedes is given.
Archimedes determined the ratio of the volumes of the right circular cone, the hemisphere,
and the right circular cylinder as 1 : 2 : 3. Uncertain are the inventions of the
water screw named after him and of the compound pulley; legendary is the burning of
the Roman fleet by concave mirrors. [BR]
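The two numerical statements in this note can be checked in modern notation. As a sketch, assume that the cone, the hemisphere, and the cylinder share the base radius r and that the cone and the cylinder have the height r (the note does not state the dimensions, so this choice and the symbols r and V are assumptions made here for illustration):

\[
V_{\mathrm{cone}} = \tfrac{1}{3}\pi r^{3}, \qquad
V_{\mathrm{hemisphere}} = \tfrac{2}{3}\pi r^{3}, \qquad
V_{\mathrm{cylinder}} = \pi r^{3},
\]
\[
V_{\mathrm{cone}} : V_{\mathrm{hemisphere}} : V_{\mathrm{cylinder}} = 1 : 2 : 3 .
\]

Written as decimal fractions, the two bounds quoted for π are

\[
3\tfrac{10}{71} = \tfrac{223}{71} \approx 3.1408 \;<\; \pi \approx 3.1416 \;<\; 3\tfrac{1}{7} = \tfrac{22}{7} \approx 3.1429 .
\]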
5 Leonardo da Vinci, Italian painter, sculptor, architect, scientist, technician, b. April
15, 1452, Vinci near Empoli–d. May 2, 1519, in the castle Cloux near Amboise; ille-
gitimate son of Ser Piero, notary in Florence, and of a peasant girl. He was educated
in his father’s house, and at the age of 15, he went to Florence as an apprentice of
A. Verrocchio, who taught him not only painting and sculpting but also gave him an
extensive education in the technical arts. In 1472, he was admitted to the Florentine
guild of painters, but remained in Verrocchio’s studio. In this time of common work,
the earliest of his preserved works emerged: an angel and the landscape in Verrocchio’s
painting of the baptism of Christ (Florence, Uffizi), two Annunciations (Uffizi and
Louvre), and the Madonna with the vase (Munich, Pinakothek). About 1478, he became
a freelancer and worked at Florence for about 5 more years. From this era stem
the portrait of Ginevra Benci (Vaduz, gallery Liechtenstein), the unfinished painting
of St. Jerome (Vatican), and the also unfinished great panel painting of the Adoration of
the Magi (Uffizi), which he received as a commission for the high altar of a monastic church, but
which he gave up half-finished when he left Florence at the end of 1481, to start work
with Duke Lodovico of Milan.
The end of the Sforza dynasty forced Leonardo to leave Milan (1499). Through
Mantua, where he drew a portrait of margravine Isabella d’Este (Louvre), and Venice,
where he drew up a defense plan against the threatening invasion of the Turks, he
returned in April 1500 to Florence, where he began the painting of The Virgin and Child
with St. Anne (Louvre). In May 1502, Leonardo started work as the first inspector of fortress
buildings with Cesare Borgia, the papal military leader, throughout whose territory
Leonardo traveled for about 10 months: the Romagna, Umbria, and parts of
Tuscany. From this activity emerged a large fraction of his maps and city maps that,
as masterpieces of surveying and representation, belong to the earliest records of modern
cartography. Florence also asked for his advice as a war engineer; he worked on
a plan to divert the Arno river, in order to cut off the main access road to Pisa, with
which Florence was at war, and he designed the project for a channel to make the
Arno navigable from the sea to Florence. Neither of these plans was realized, just
like the draft, proposed at the same time to Sultan Bajasid II, to build a bridge 300 m
long across the Bosporus. In 1503, Leonardo received the commission to paint a monumental wall
painting for the large senate hall of the Palazzo della Signoria in Florence; there, the
drafts for the battle of Anghiari were born, which, through many copies (e.g., by Rubens),
became the classical model for the cavalry-battle painting of the Renaissance and Baroque,
up to Delacroix. At the same time, Leonardo painted the Mona Lisa, presumably
the world’s most famous painting, and the standing Leda (preserved only as copies).
At this time, Leonardo reached the peak of his artistic fame. The arising geniuses of
the young generation either admired him without envy (Raphael) or accepted him re-
luctantly with jealousy (Michelangelo). Later, he painted only hesitantly, and more and
more turned to scientific problems. Besides mathematical studies, he studied anatomy
comprehensively. He dissected corpses and began an extensive treatise on the struc-
ture of the human body, where he promoted the anatomical drawing accompanying
the text as a tool for teaching. He also extended his biological and physical studies;
the experiments on the flight of man, already begun in Milan, led him to investigate
the flight of birds, which he also summarized in the form of a treatise. Besides the laws of air
flow, he also tried to investigate those of water flow. These studies contain approaches
for theoretical and practical hydrology; he recorded them as materials for a treatise on
water. In these years, he tried to arrange his notes by the main topics of his planned
“books,” which as a whole comprise a theory of the mechanical primordial forces of
nature, i.e., an entire cosmology.
In 1506, Leonardo, at the request of the king of France released by the Florentine
Signoria under the pressure of the political situation, stopped the work on the Anghiari
battle and returned to Milan. There he served until 1513 mainly as an adviser to the
French governor Charles d’Amboise, for whom he designed a large domicile and the
plans for a chapel (S. Maria alla Fontana). From this era also date the drawings for
the tomb of General Giangiacomo Trivulzio that—like the Sforza monument—was
planned as an equestrian statue but was not realized. There is no trace of two almost
finished Madonna paintings for His Most Christian Majesty. In Milan, too, Leonardo
mainly dealt with scientific studies. He continued his great “anatomy,” in connection
with the anatomist Marc Antonio Della Torre of Pavia, and he extended his hydrolog-
ical and geophysical investigations both theoretically and practically, as is testified by
his project for an Adda channel between Milan and the lake of Como, and by his amaz-
ing geological observations on the origin of fossils. He also returned to his botanical
studies; also in this field, he created exact demonstration drawings according to ex-
actly defined principles of graphical representation, as in all of his research activities.
He thereby founded the scientific illustration.
When at the end of 1513 Leo X had risen to the papal throne, the now sixty-year-
old Leonardo went to Rome, presumably with the hope of acquiring orders through
his patron, Cardinal Giuliano de Medici. But he did not get large commissions such as Raphael,
Bramante, and Michelangelo got. His years in Rome were occupied by research, in
particular on mechanics and anatomy. Only one painting, his last one, the mysterious
John the Baptist (Paris, Louvre), may have originated in this period.
In January 1517, Leonardo left Rome, following the invitation of Francis I. The country
castle Cloux near Amboise was allocated to him as a residence, and he got the title
Premier peintre, architecte et mécanicien du Roi. He did not paint anymore, however,
because of paralysis of his hand, but mainly arranged his scientific materials; in partic-
ular, he worked on completing his “anatomy.” Among the few artistic creations of this
last era, the project of a large castle and park for the residence of the Queen Mother
in Romorantin is known. The building could not be built. His ideas nevertheless had
a lasting effect on the tremendous castle that was begun by Francis I when Leonardo
was still alive, the building of Chambord. The most stirring documents of his late
work are the drawings of the end of the world (Windsor), where Leonardo exhibited
his experiences of life devoted to studying nature in a unique synthesis of scientific
and artistic imagination; they are the symbol of the primordial forces penetrating the
world, which once had created and finally shall destroy it, but even in self-destruction
shall still obey the laws of harmony.
Leonardo was buried in Amboise in the church of St. Florentin, which was de-
stroyed during the French Revolution. His pupil and friend Francesco Melzi became
heir to his enormous written work, which is almost completely done in mirror writing,
familiar to him as a left-hander.
The greatness of Leonardo and his significance in the history of occidental culture
rests on the fact that he, like nobody else, understood art and science as a unity of hu-
man will of perception and power of mental comprehension. As a painter, he was the
first master of the classic style; his few artistic creations remained models of perfec-
tion for all following eras and styles. As a researcher and philosopher, he is at the bor-
derline between the Middle Ages and modern thinking. Altogether an empiricist, he
tried to acquire an encyclopedic knowledge by means of experience and experiment.
Guided by his imagination and less capable of abstract logical thinking, Leonardo
must not—as was often tried—be considered as the founder of modern science as
such. His achievements in the field of physics and pure mechanics are mediocre, often
even questionable. But, since he performed his all-embracing observations on natural
phenomena with ultimate objectivity and by virtue of his artistic talent was able to rep-
resent them in drawings, he became the pioneer of a systematic descriptive approach
in the natural sciences. Also, in the field of applied mechanics he can be considered
as the founder of elementary engineering, for which he developed the graphical prin-
ciples of demonstration. [BR]
6 Isaac Newton, b. Jan. 4, 1643, Woolsthorpe (Lincolnshire)–d. March 31, 1727,
London. Beginning in 1660, Newton studied at Trinity College in Cambridge, particularly
with the eminent mathematician and theologian I. Barrow. After getting various
academic degrees and making a series of essential discoveries, in 1669 Newton be-
came the successor of his teacher in Cambridge. From 1672, he was a member and,
from 1703, president of the Royal Society. From 1688 to 1705, he was also a member
of Parliament; from 1696, warden; and from 1701, master of the Royal Mint.
Newton’s life’s work, besides theological, alchemistic, and chronological-historical
writings, mainly comprises works on optics and on pure and applied mathematics.
In his investigations on optics, he described light as a flow of corpuscles and in this
way interpreted the spectrum and the composition of light, as well as Newton’s
rings, diffraction phenomena, and double refraction. His main opus Philosophiae Naturalis
Principia Mathematica (printed in 1687) is fundamental for the evolution of the exact
sciences. It includes the definition of the most important basic concepts of physics,
the three axioms of mechanics of macroscopic bodies, e.g., the principle of actio et
reactio, the gravitation law, the derivation of Kepler’s laws, and the first publication
on fluxion calculus. Newton also dealt with potential theory and with the equilibrium
figures of rotating liquids. The ideas for his big work mainly emerged in 1665 to 1666,
when Newton left Cambridge because of the plague.
In mathematics, Newton worked on the theory of series, e.g., in 1669, the bino-
mial series, on interpolation theory, approximation methods, and the classification
of cubic curves and conic sections. But Newton could not remove logical problems
even with his fluxion calculus, which was presented in detail in 1704. His influence on
the further development of mathematical sciences can hardly be judged, since New-
ton disliked publishing. For example, when Newton made his fluxion calculus public,
his method was already obsolete compared with the calculus of Leibniz. The quarrel
about whether he or Leibniz deserved priority for developing the infinitesimal calculus
continued until the twentieth century. Detailed studies have shown that both of them
obtained their results independently of each other. [BR]
7 Joseph Louis Lagrange, mathematician, b. Jan. 25, 1736, Torino–d. April 10, 1812,
Paris; at the age of 19, professor of mathematics in Torino. In 1766, he followed
the call of Frederick the Great to the Berlin academy of sciences. After Frederick’s
death, Lagrange moved to Paris as a professor at the École Normale. He invented the
principle named after him. Important for function theory is his Théorie des Fonctions
Analytiques, Contenant les Principes du Calcul Différentiel (1789), and for algebra
and number theory his Traité de la Résolution des Équations Numériques de Tous
les Degrés (1798). In the Mécanique analytique (1788), he generalized and condensed
the principles of mechanics to the systems of equations named after him. [BR]
8 Francis Bacon, English philosopher and politician, b. Jan. 22, 1561, London–
d. April 9, 1626, London, son of Nicholas Bacon, nephew of Lord Burleigh; advocate
and deputy. In the notorious trial of his patron Essex, Bacon convicted Essex of high
treason. In 1607, he became Solicitor General; in 1613, Attorney General; in 1617,
Keeper of the Great Seal; and, in 1618, Lord Chancellor. Ennobled as Baron Verulam
and Viscount of St. Albans, in 1621 Bacon was thrown out by parliament because
of passive corruption and was sentenced to high penalties and imprisonment, which,
however, was remitted by the king’s influence. He was a curiously split character:
outstandingly talented, vastly well read, vain, excessively ambitious, and of frighten-
ing emotional frigidity. The reasons for his downfall were not only the proven and
confessed failures, but equally the anger of the parliament about the egocentric and
unauthorized policy of the king who utilized Bacon as a submissive tool.
Bacon left a large number of philosophical, literary, and legal writings. His philo-
sophical life’s work, the Instauratio Magna (i.e., great revival of philosophy), re-
mained a fragment, an attempt (based on insufficient means) of a complete recon-
struction of sciences on the basis of “unfalsified experience.” His main piece, Novum
Organum (the title indicates the contraposition to Aristotle, whose logical writings
traditionally were summarized under the title Organon), is a method of scientific re-
search, worked out down to the last detail, which shall serve to snatch the secrets of
nature and to govern it (Bacon considered knowledge as the means for a purpose,
“knowledge is power”). The starting point for any knowledge is experience. Expe-
rience and mind should be tightly linked in a “legitimate marriage,” instead of the
separation so far. Bacon constructed a complicated system of scientific induction, but
he failed to appreciate the role of mathematics. His main piece is preceded by an inventory
of all sciences (De dignitate et augmentis), where, according to the three
mental abilities (memory, imaginative power, and mind), three main sciences are distinguished:
history, poetry, and philosophy. Bacon listed what had been achieved by
each science and what still remained to be done.
Among Bacon’s literary works, the essays inspired by Montaigne are timeless:
10 in the first edition (1597), and 58 in the last edition (1625). In these “dispersed
meditations,” Bacon presented practical life’s wisdom in the various fields, general
guiding principles of the conduct of life, beyond good and evil, in an antithetical style
of epigrammatic brevity, realistic and plain. “Nova Atlantis” is the perfect description
of a philosophical ideal state.
Bacon’s legal writings testify to his absolute mastery of the subject. His plan of
codifying the English law of his age was not completed.
In the second half of the nineteenth century, Bacon was also considered to be the
author of Shakespeare’s dramas (Bacon theory). [BR]
9 Joachim Jungius, philosopher and scientist, b. Oct. 22, 1587, Lübeck–d. Sept. 17,
1657, Hamburg; in 1609, professor in Giessen. In 1622, he founded in Rostock the first
scientific society of Germany for the cultivation of mathematics and natural sciences.
In 1624, he became a professor in Rostock; in 1625, in Helmstedt; and in 1628, head-
master of the Johanneum and of the academic high school in Hamburg. He defended
the principle “improvement of philosophy has to originate from physics (= natural
sciences).” Jungius decisively contributed to the breakthrough of scientific chemistry
and the renewal of atomism. He was also important as a botanist. [BR]
10 René Descartes, b. March 31, 1596, La Haye–d. Feb. 11, 1650, Stockholm.
Descartes was the son of a councilor of the parliament of Bretagne and was educated
in a Jesuit college. He then began the study of law, and beginning in 1618, he partici-
pated in various campaigns. Beginning in 1622, Descartes traveled in many countries
of Europe, then settled down in 1628 in the Netherlands, and lived from 1649 in Sweden
as a teacher of philosophy. The main mathematical achievement of Descartes is
the foundation of analytical geometry in his Géométrie (1637), which also essentially
influenced the further development of infinitesimal calculus. [BR]
11 William Gilbert, English scientist and physician, b. May 24, 1544, Colchester–
d. Nov. 30, 1603, London. Gilbert was from 1573 a practicing physician in London;
from 1601, the private physician of Elizabeth I; and after her death, of King James
I of England. In his fundamental work De magnete, magneticisque corporibus, et de
magno magnete tellure physiologia nova (London, 1600; facsimile edition, Berlin,
1892; English translation and comment by S.P. Thompson, in The collectors series in
science, 1958), Gilbert summarized the knowledge of older authors into an impressive
doctrine of magnetism and geomagnetism, and added a number of new observations
and findings. The work, which in the second book also involves a special chapter
on corpora electrica, on substances that—like amber (electrum)—after rubbing are
capable of attracting light bodies, impressed several of his contemporaries, among
others Kepler and Galileo. His treatise De mundo nostro sublunari philosophia nova
appeared posthumously (Amsterdam, 1651). [BR]
12 Johannes Kepler, b. Dec. 27, 1571, Weil der Stadt–d. Nov. 15, 1630, Regensburg.
Kepler was the son of a trader who also often served in the military. He first went
to school in Leonberg, and later to the monastic school in Adelberg and Maulbronn.
From 1589, Kepler studied in Tübingen to become a theologian, but in 1594, he took
the position of professor of mathematics in Graz that was offered to him. In 1600,
because of the Counter-Reformation Kepler had to leave Graz and went to Prague.
After the death of Tycho Brahe (Oct. 24, 1601), as his successor Kepler became the
imperial mathematician. After the death of his patron, Emperor Rudolf II, Kepler left
Prague and in 1613 went to Linz as a land surveyor. From 1628, Kepler lived as an
employee of the powerful Wallenstein, mostly in Sagan. Kepler died unexpectedly
during a visit to the meeting of electors in Regensburg.
Kepler’s main fields were astronomy and optics. After extraordinarily lengthy cal-
culations, he found the fundamental laws of planetary motion: the first and second of
Kepler’s laws were published in 1609 in Astronomia Nova, and the third one in 1619
in Harmonices Mundi. In 1611, he invented the astronomical telescope. His Rudol-
phian tables (1627) continued to be one of the most important tools of astronomy until
the modern age. In the field of mathematics, he developed heuristic infinitesimal
methods. His best-known mathematical work is the Stereometria Doliorum
(1615), where, e.g., Kepler’s barrel rule is given.
13 Galileo Galilei, Italian mathematician, b. Feb. 15, 1564, Pisa–d. Jan. 8, 1642,
Arcetri near Florence, studied in Pisa. At the Florentine Accademia del Disegno,
he got access to the writings of Archimedes. On the recommendation of his patron
Guidobaldo del Monte, in 1589 he received a professorship for mathematics in Pisa.
Whether he performed fall experiments at the leaning tower has not been proven incon-
testably; in any case, the experiments were meant to confirm his then-erroneous theory. In 1592, Galileo
took a professorship of mathematics in Padua, not because of disagreements with col-
leagues but for a better salary. He invented a proportional pair of compasses, furnished
a precision mechanic workshop in his home, found the laws for the string pendulum,
and derived the laws of falling bodies first in 1604 from false assumptions and then
in 1609 from correct assumptions. Galileo copied the telescope invented one year ear-
lier in the Netherlands, used it for astronomical observations, and published the first
results in 1610 in his Sidereus Nuncius, the “star message.” Galileo discovered the
mountainous nature of the Moon, the abundance of stars of the Milky Way, the phases
of Venus, the moons of Jupiter (Jan. 7, 1610), and in 1611 the sunspots, although for
these Johannes Fabricius preceded him.
Only beginning in 1610 did Galileo, who returned to Florence as court mathe-
matician and philosopher to the grand duke, publicly support the Copernican system.
By his overeagerness in the following years, he provoked in 1616 the ban of this doc-
trine by the pope. He was urged not to advocate it further by speech or in writing.
During a dispute on the nature of the comets of 1618, in which Galileo was entirely
in the wrong, he wrote one of his most profound treatises, the Saggiatore (the assayer with the
gold balance, 1623), a paper dedicated to Pope Urban VIII. Since the former cardinal
Maffeo Barberini had been well disposed toward him, Galileo hoped to win him over, as
pope, to accepting the Copernican doctrine. He wrote his Dialogo, the “Dialogue on the
Two Chief World Systems,” the Ptolemaic and the Copernican, submitted the manuscript
in Rome for examination, and published it in 1632 in Florence. Since he obviously had
not included the agreed-upon changes of the text thoroughly enough and had shown
his sympathy with Copernicus too clearly, a trial set up against Galileo ended with his
renunciation and condemnation on June 22, 1633. Galileo was imprisoned in the build-
ing of the Inquisition for a few days. The statement “And yet it (the earth) moves” (Eppur si
muove) is legendary. Galileo was sentenced to indefinite arrest, which he spent with
short breaks in his country house at Arcetri near Florence. There, he also wrote a work
important for the further development of physics: the Discorsi e dimostrazioni mate-
matiche, the “discourses and demonstrations” on two new branches of science: mechanics
(i.e., the strength of materials) and the science branches concerning local motions
(falling and throwing) (Leiden 1638).
In older representations of Galileo’s life, there are many exaggerations and mis-
takes. Galileo is not the creator of the experimental method, which he used no
more than many of his contemporaries, although sometimes more critically than
the competent Athanasius Kircher. Galileo was not an astronomer in the true sense,
but a good observer; and as an excellent speaker and writer, he won friends and pa-
trons for a growing new science and its methods among the educated of his age, and
he stimulated further research. Riccioli and Grimaldi in Bologna confirmed Galileo’s
laws of free fall by experiment. His pupils Torricelli and Viviani developed one of
Galileo’s experiments (for disproving the “horror vacui”) into the barometric experi-
ment. Christian Huygens developed his pendulum clock based on Galileo’s ideas, and
he transformed Galileo’s kinematics into a real dynamics.
Galileo was one of the first Italians who used their native language for presentation
of scientific problems. He defended this point of view in his correspondence. His prose
takes a special position within the Italian literature, since it is distinguished by its
masterly clarity and simplicity from the prevailing bombast that Galileo had reproved
in his literary-critical essays on Tasso and others. In his works Il Dialogo sopra i due massimi
sistemi (Florence 1632) and I Dialoghi delle nuove scienze (Leiden 1638), he utilized
the form of dialogue that came down from the Italian humanists, to be understood by
a broad audience. [BR]
14 Christian Huygens, Dutch physicist and mathematician, b. April 14, 1629, Den
Haag–d. July 8, 1695, Den Haag. After initially studying law, he turned to mathemati-
cal research and published among other things in 1657 a treatise on probability calcu-
lus. At the same time, he invented the pendulum clock. In March 1655, he discovered
the first moon of Saturn, and in 1656, the Orion nebula and the shape of Saturn’s ring.
By then, he was already familiar with the laws of collision and of central motion, but
published them, without proof, only in 1669. In 1663, Huygens was elected a mem-
ber of the Royal Society. In 1665, he settled in Paris as a member of the newly founded
French academy of sciences, from where he returned in 1681 to the Netherlands. After
publishing in 1657 the small treatise Horologium and in 1659 his Systema Saturnium,
sive de causis mirandorum Saturni phaenomenon, in 1673 he published his main work:
Horologium oscillatorium (the pendulum clock), which besides the description of an
improved clock construction contains a theory of the physical pendulum. Further one
finds treatises on the cycloid as an isochrone, and important theorems on central mo-
tion and centrifugal force. Huygens’s invention of the spring watch with a balance
spring dates from 1675, and the Tractatus de lumine (treatise on light) from 1690. The latter
contains a first version of the wave theory (collision theory) of light and, based on
that, develops the theory of double refraction in Iceland spar. The spherical prop-
agation of action around the light source is explained there by means of Huygens’s
principle. [BR]
15 Copernicus, Coppernicus, German Koppernigk, Polish Kopernik, Nikolaus, as-
tronomer, and founder of the heliocentric world system, b. Feb. 19, 1473, Thorn–d.
May 24, 1543, Frauenburg (East Prussia). Beginning in 1491, he engaged in human-
istic, mathematical, and astronomical studies at the university in Cracow. From 1496
to 1500, he studied civil and canon law in Bologna. At the instigation of his un-
cle, Bishop Lukas Watzenrode, in 1497 he was admitted to the chapter of Ermland at
Frauenburg, but he took only the lower holy orders. In Bologna, he continued his as-
tronomical work together with the professor of astronomy Domenico Maria Novara,
made a short stay in Rome, and in 1501 temporarily returned to Ermland. Beginning
in the autumn of 1501, he studied in Padua and Ferrara, graduating on May 31, 1503,
as a doctor of canon law, and then studied medicine. After returning home in 1506,
he lived in Heilsberg as secretary to his uncle until the latter’s death in 1512. He
was involved in administrating the diocese of Ermland, and he accompanied his uncle
to the Prussian state parliaments and the Polish imperial parliament. As chancellor of
the chapter, Copernicus after 1512 lived mostly in Frauenburg, resided as governor of
the chapter (1512–1521) in Mehlsack and Allenstein, and in 1523 was administrator
of the diocese of Ermland. As a deputy, he represented the order chapter (1522–1529)
at the Prussian state parliaments and there particularly supported monetary reform.
Contrary to Polish claims, Copernicus’s German origin is established (the paternal
family stems from the diocesan country of Neisse in Silesia). Like his elder brother An-
dreas, he defended the concerns of Ermland against the Crown of Poland; both in writ-
ing and in oral speech, he utilized only German and Latin. Besides his administrative
work, he also practiced as a physician. Toward the end of his life, in 1537, Copernicus
had differences with Johannes Dantiscus, the newly elected bishop of Ermland.
As an astronomer, Copernicus completed what Regiomontanus had envisioned: a revi-
sion of the doctrine of planetary motion, taking into account a series of critically evalu-
ated observations. Only on such a basis could one then think of reforming the calendar.
The urgency of that reform was generally recognized at the beginning of the sixteenth
century. Copernicus presumably was influenced by these considerations. In the course
of his work, he then decided to accept a heliocentric world system, inspired by ancient
writings. A brief, preliminary report on this topic is the Commentariolus, presumably
written before 1514. Here, the decisive assumptions are already expressed: The sun is
at the center of the planetary orbits (still considered circular), and the earth re-
volves about the sun; the earth rotates daily about its axis and in turn is orbited by the
moon. The wider public got the information on the Copernican doctrine only by the
Narratio prima of Georg Joachim Rheticus (first report on the six books of Copernicus
on the circular motions of celestial paths, 1540, German, by K. Zeller, 1943).
The main work of Copernicus, the “Six Books on the Orbits of Celestial Bodies”
(De revolutionibus orbium coelestium libri VI, 1543; German, 1879; new edition,
1939), was published in the year of the author’s death. It was dedicated to Pope
Paul III, but instead of the original foreword of Copernicus, it was introduced with a
foreword by the Protestant theologian Andreas Osiander that inverted the meaning of
the whole subject. The doctrines of Copernicus remained uncontested by the church
until the edict of the index congregation of 1616. The imperfections of the Copernican
theory of planets were removed by Johannes Kepler. [BR]
16 Giovanni Alfonso Borelli, b. 1608, Naples–d. Dec. 31, 1679, Rome, physicist and
physiologist. In 1649, Borelli became a professor of mathematics in Messina and in
1656 moved to Pisa. In 1667, he returned to Messina. In 1674, he was forced to leave
for Rome. There he lived until his death under the patronage of Christina, Queen of
Sweden. His best-known work, De motu animalium, deals with the motions of the
body of animals, which he traced back to mechanical principles. In a letter published
in 1665 under the pseudonym Pier Maria Mutoli, Borelli first expresses the idea of
a parabolic trajectory of comets. Among his numerous astronomical works, there is
also Theoricae Mediceorum planetarum ex causis physicis deductae (Florence, 1666),
treating the influence of the attracting force of Jupiter’s moons on the orbital motion
of Jupiter.
17 Giovanni Battista Benedetti, b. Aug. 14, 1530–d. Jan. 20, 1590, Torino, first to
recognize the buoyancy action of the surrounding medium in free fall. Taking up the
ideas of Archimedes, he writes in his work De resolutione omnium Euclidis problema-
tum (Venice 1553) that the fall velocity is determined by the difference of the
specific weights of the falling body and the medium.
18 Robert Hooke, English researcher, b. July 18, 1635, Freshwater (Isle of Wight)–
d. March 3, 1703, London. Hooke was at first an assistant to R. Boyle; then from 1665
a professor of geometry at Gresham College in London; and from 1677 to 1682, secre-
tary of the Royal Society. Hooke improved already-known methods and devices, e.g.,
the air pump and the compound microscope (described in his Micrographia,
1665). Hooke was often involved in priority disputes, e.g., with Huygens, Hevelius,
and Newton. He proposed, among other things, the melting point of ice as the zero point of
the thermometric scale (1664), recognized the constancy of the melting and boiling
point of substances (1668), and for the first time observed the black spots on soap
bubbles. He gave a conceptually good definition of elasticity and in 1679 established
Hooke’s law. [BR]
19 Plutarch (Greek Plutarchos), Greek philosopher and historian, b. about AD 50,
Chaeronea, from an old bourgeois family–d. about 125. He was educated in 66 in
Athens by the academician Ammonios to be a follower of Plato’s philosophy. He
visited, among other cities, Alexandria and Rome, where he had contact with promi-
nent Romans, but lived permanently in his small hometown. There, he participated in
communal politics; in Delphi he became a priest about 95. Plutarch was honored by
the emperors Trajan and Hadrian. However, his life center remained in the vicinity of
his homeland, where he closely associated with family and a circle of friends. This
milieu feeds the ethos of education and the national pathos of his numerous writings.
Despite the wealth of topics and the abundance of material, they nevertheless display
an inner unity from the pleasant integrity of a personality formed by philosophy and
religion.
Plutarch’s works consist of two groups. The first group contains the biographies
(Vitae parallelae), 46 comparative life descriptions of famous Greeks and Romans
(e.g., Pyrrhos-Marius, Agesilaos-Pompeius, Alexander-Caesar). Great Romans were
presented to the Greeks, in parallel and in contrast to their own history. Thereby an
important step toward inner balance of the double-culture of the Greek–Roman era of
emperors was made. The literary appeal of the biographies rests on the vivid represen-
tation and the description of characters, supported by memorable anecdotal features.
The second group, Moralia, contains popular, ethical-educational writings, but also
strictly philosophical, metaphysical, religious-philosophical investigations, learned
antiquarian studies, political treatises, etc. This writing was borne by the tradition
of the Platonic school, without schoolmasterly narrowness, with vivid religious em-
phasis.
While Plutarch was read in the Byzantine empire, he was unknown to the West in
the Middle Ages. In the early fifteenth century (Guarino and his pupils, then Pier
Decembrio, L. Bruni), Plutarch’s works were translated into Latin; from 1559 to 1572,
by J. Amyot in classical form into French (fertilizing influence on the dramatic art in
France in the seventeenth century); and in 1579, by North into English (influence on
Shakespeare). In Germany, for a long time Plutarch was appreciated only by learned
circles. Only toward the end of the eighteenth century did interest turn to him
again (Schiller, Goethe, Jean Paul, Beethoven, and Nietzsche). [BR]
Recommendations for Further Reading on Theoretical Mechanics
The textbooks on theoretical mechanics listed below represent only part of the wealth
of excellent literature on this topic.
Classical textbooks on mechanics:
H. Goldstein: Classical Mechanics, 3rd edition (2001), Addison-Wesley Pub. Co.
A. Sommerfeld: Mechanics (Lectures on Theoretical Physics, Vol. 1), 4th edition
(1964), Academic Press.
L.D. Landau and E.M. Lifshitz: Mechanics, 3rd edition (1982), Butterworth-
Heinemann.
Problems and exercises for classical mechanics:
M.R. Spiegel: Theory and Problems of Theoretical Mechanics (Schaum’s Outline
Series), SI-edition (1980), McGraw-Hill.
More mathematical presentations of mechanics:
F. Scheck: Mechanics: From Newton’s Laws to Deterministic Chaos, 3rd edition
(1999), Springer.
J.B. Marion and S.T. Thornton: Classical Dynamics of Particles and Systems, 4th
edition (1995), Saunders College Publishing.
We consider the work of H. Goldstein to be particularly suited as a supplement.
Starting from the elementary principles, he outlines the formal Hamilton–Jacobi the-
ory in a didactically brilliant manner. All typical applications (central force problem,
rigid body, vibrations etc.) are discussed and expanded on in exercises and by special
recommendations for further reading. The lectures by A. Sommerfeld, planned in a
similar way, represent a gold mine because of the treatment of many special problems
and the imaginative power demonstrated in the mathematical solution techniques.
Readers may gain an appreciation of the formal aesthetics of mechanics from the
volume Mechanics by Landau and Lifshitz.
Index
acceleration – Hopf, 498

– centripetal, 7, 9 – pitchfork, 497, 520
– Coriolis, 7, 36 – saddle-node, 496
– gravitational, 9 – transcritical, 498
– linear, 7 butterfly effect, 505
action function, 397
– Hamilton, 383 canonical transformation, 365
action quantum, Planck, 407 Cantor set, 509, 512
action variable, 388 capacity dimension, 511
action waves, 398 Cardano formula, 113
air resistance, 317 casus irreducibilis, 118
amplitude modulation, 87 catenary, 342
angle variable, 388, 389 Causality, 5
angular frequency, 106 center of gravity, 43
angular momentum, 65 – circular cone, 47
angular velocity, 4, 6 – pyramid, 45
anticyclone, 34 – semicircular disk, 46
approximation, successive, 12 center-of-mass coordinates, 70
central field, scattering in, 50
Arnold tongues, 534
central forces, 66
asymptotic stability, 470
central motion, 331
attractor, 467
centrifugal force governor, 322
– chaotic, 468, 507
centripetal acceleration, 7, 9
– strange, 468, 509, 542
chain, vibrating, 88
attractor diagram, 523, 525, 540
Chandler period, 218
Atwood’s fall machine, 75
chaos, 461
autonomous system, 463, 465, 491
chaotic attractor, 468, 507
characteristic equation, 189
baker transformation, 509, 515 charged particle, 314
bascule bridge, 271 Chasles’ theorem, 41
basin of attraction, 468 Chirikov criterion, 549
beat vibrations, 84 circular mapping
Bernoulli shift, 515, 527 – dissipative, 532
Bessel functions, 147 – one-dimensional, 532
Bessel’s differential equation, 144 cluster property, 44
bifurcation, 495 collision parameter, 52
– static, 495 conditionally periodic, 154
– subharmonic, 501 cone pendulum, 30
– time-dependent, 499 configuration space, 350
bifurcation cascade, 522, 524 constraint reactions, 261
billiard ball, 178 – generalized, 303
body, rigid, 39, 161 constraints, 259
boundary condition – holonomic, 259
– Dirichlet, 121 – nonholonomic, 259, 301
– periodic, 144 – rheonomic, 259
brachistochrone, 344 – scleronomic, 259
branching, 495 continuity equation, 353

continuum, 39 eigenvector, 189

contracting flow, 466 eigenvibrations, 82
control parameter, 463 electromagnetic field, 314
cooling ellipsoid, rotating, 218
– of particle beam, 359 ellipsoid of inertia, 196
– stochastic, 355 – quadratic disk, 202
coordinates – regular polyhedron, 218
– generalized, 259, 262, 463 energy law, 68
– ignorable (cyclic), 277 equation, characteristic, 189
coordinate system equilibrium solution, 469
– body-fixed, 186 ergodic system, 505, 530
– center-of-mass, 53, 70 Euler angles, 238
– for rotating earth, 9 Euler equations, 214
– rotating, 3, 7 Euler–Lagrange equation, 341
Coriolis acceleration, 7, 36 extended canonical equations, 419
Coulomb scattering, 56 extended canonical transformations, 416, 428
coupled mass points, vibrations, 81 extended Euler–Lagrange equation, 415
coupled pendulums, 85 extended Hamilton–Jacobi equation, 455
couple of forces, 161 extended point transformations, 433
critical point, 469 extended Poisson brackets, 434
cross section externally excited system, 486
– differential, 51
– Rutherford, 55 fall machine, Atwood’s, 75
cycloid, 131, 292 Feigenbaum constant, 522
cyclone, 34 figure axis, 209
fixed point, 469, 507, 517
D’Alembert principle, 267 – unstable, 470
deformable medium, 39 Floquet multiplier, 490, 493
degeneracy, 137 Floquet’s theory of stability, 489
degrees of freedom, 41 flow of a vector field, 464
determinant of coefficients, 90 force
deterministic, 463, 504 – generalized, 265
deviation moments, 163, 186 – Lorentz, 311
devil’s staircase, 535 – nonconservative, 315
differential cross section, 51 Foucault’s pendulum, 23
differential principles, 337 Fourier coefficients, 122
dimension Fourier series, 121
– broken, 508, 509 fractal, 508
– capacity, 511 fractal dimension, 511
– fractal, 511 free fall, 9
– Hausdorff, 514 frequency spectrum, (an)harmonic, 137
– information-, 514, 543 friction force, 206
– Lyapunov, 543 friction parameter, 538, 541, 544
Dirichlet conditions, 121 friction tensor, 316
discrete systems, 517 function
discretization, 487, 531 – even (odd), 123
dispersion law, 106 – homogeneous, 330
displacement, virtual, 267 fundamental harmonic, 137
dissipation, 464–466
dissipation function, 315 galaxy, flattening of, 67
dissipative circular mapping, 532 generalized Noether theorem, 447
double pendulum, 289 generating function, 366
dynamical system, 463 gravitational acceleration, 9
Green’s function
earth, nutation of, 217 – advanced, 3
eastward deflection, 12, 16 – free, 5
eclipse, 230 gyrocompass, 229
ecliptic, 226 gyroscope, 228
eigenfrequency, 81, 82, 137
eigenmodes, orthonormal, 152 Hamilton action function, 383
eigenvalue, 189 Hamilton equations, 329, 349
Hamilton principle, 337 mass, reduced, 72

Hamiltonian, 328 mass density, 45
Hamiltonian mechanics, 463 membrane
Hamilton–Jacobi differential equation, 383 – circular, 141
Hamilton–Jacobi theory, 383 – rectangular, 135
harmonic oscillator – vibrating, 133
– canonical formalism, 370 Mercury, 549
Hausdorff dimension, 514 mode locking, 533, 536
herpolhode, 212
hockey puck, 176 – asymmetric linear, 289
holonomic, 259, 279, 317 – three-atom, 284
Hopf branching, 498 – triangular, 286
hydrogen atom, 408 momentary center, 48
Hyperion, 544 moment of inertia, 162, 545
– circular cylinder, 163
inertial system, 3 – cube, 167
information dimension, 514, 543 – rectangular disk, 165
integrability condition, 260 – rigid bodies, 174
integral principles, 337 – sphere, 167
iteration method, 13 momentum
iterative mapping, 517 – generalized, 276
– linear, 65
Jacobi matrix, 469, 492 momentum space, 350
monodromy matrix, 490
Kaplan–Yorke relation, 543 motion, central, 331
Kepler problem, 389
Kepler system neutron star, vibrating, 220
– canonical formalism, 380 Newton’s equations
Kepler’s law, 546 – arbitrary relative motion, 7
kinetic energy, 72 – rotating coordinate system, 3, 7
– of a rotating rigid body, 187 nodal line, 138
Koch curve, 510, 512 node, 97, 471, 496
nonconservative forces, 315
nonholonomic, 301, 317
laboratory system, 3
nonlinear dynamics, 461
Lagrange equations, 276
normal frequencies, 84, 289
– nonholonomic constraints, 301
normal vibrations, 82, 105
Lagrange multipliers, 301
North Pole
Lagrangian, 275
– geometrical, 218
Legendre transformation, 327, 369
– kinematical, 218
libration, 539
nutation, 212
limit cycle, 475, 477, 483, 492, 500, 507
nutation cone, 212
linearization, 469
nutation of earth, 217
Liouville theorem, 352, 376, 465
Lippmann–Schwinger equation, 5 one-dimensional mapping, 503
Lissajous figure, 155 orbit, 464
logistic mapping, 519, 523, 527 orthogonality relation
Lorentz force, 311 – for trigonometric functions, 122
Lorenz, 505 oscillator
Lyapunov dimension, 543 – nonlinear, 473
Lyapunov exponent, 504, 508, 551 – Rayleigh, 476
– logistic mapping, 525, 530 – van der Pol, 476, 480
– maximum, 506, 541 overtone, 107, 137, 149
Mandelbrot, 508 parabola, Neil, 400

many-body system, 65 paraboloid, 305
mapping path stability, 485
– iterative, 517 Peano curve, 513
– logistic, 519, 523, 527 pendulum
– one-dimensional, 503, 518 – cone, 30
– Poincaré, 488, 503, 517, 537 – coupled, 85
– stroboscopic, 487 – double, 289
– Foucault, 23 reduced mass, 72

– periodically driven, 537 relaxation vibration, 479, 482
– physical, 166 resonances, 548, 549
– plane, 351 rheonomic, 259, 279
– rolling, 169 river, superelevation of bank, 18
– string, 294 rolling pendulum, 169
– theories, 332 rosette path, 28
– upright, 281, 282 rotating coordinate system, 3, 7
pendulum length, reduced, 169 rotating ellipsoid, 218
pendulum watch, 534 rotating tube, 35
period, Chandler, 218 rotation, 41
period-doubling, 501, 520, 539 rotation about a fixed axis, 161
periodic attractor, 533 rotation about a point, 185
periodic solution, 486 rotational velocity, 7
perturbation calculation, 11 rotation energy, 188
phase diagram, 351 rotation matrix, 194, 240
phase flow, 464, 467 rotator, periodically kicked, 530
phase integral, 388 rotor, 471
phase space, 350 Runge–Lenz vector, 382
phase-space density, 354 Rutherford scattering, 55
phase trajectory, 350
phase velocity, 106 saddle-node branching, 496
physical pendulum, 166 saddle point, 471, 496
pirouette, 67 Saros cycle, 230
pitchfork branching, 497, 520 Saturn, 544
Pivot forces, 222 scattering, in a central field, 50
Planck’s constant, 408 scattering cross section
plane, invariable, 210 – Rutherford, 55
Poincaré, 461, 504 – square well potential, 59
Poincaré–Bendixson theorem, 480 scattering experiment, 51
Poincaré cut, 487, 542, 549 scattering of two atoms, 63
Poincaré mapping, 488, 503, 517, 537 scleronomic, 259
Poincaré recurrence time, 150 sea level, 18
Poinsot ellipsoid, 210 secular equation, 470
point attractor, 518 self-exciting vibration, 479
point transformation, 370 self-similarity, 510, 525, 527, 535
Poisson bracket, 378, 410 Sierpinski gasket, 511
pole cone, 212 similarity transformation, 195
pole curve, 48 slant throw, 395
pole path, 48 sleeping top, 247
pole trajectory, 212 solar system, 76
polhode, 212
potential spin, 243
– generalized, 312 spiral, 471, 473
– scalar, 314 stability, 469
– vector, 314 – asymptotic, 470, 485
– velocity-dependent, 311 – orbital, 485
precession, 225 – time-dependent paths, 485, 488
– stationary, 246 stability of paths, 488
precession velocity, 247 staggering motion, 544, 551
principal axes of inertia, 188 standard mapping, 532
principal axis, 203 Steiner’s theorem, 164
Principle of causality, 5 stochastic cooling, 355
projectile, 51, 317 straight line, invariable, 210
strange attractor, 468, 509, 542
quantum hypothesis, 407 stretching and folding, 508, 528
quantum number, 105 string, vibrating, 101, 108
quasiperiodic, 533, 535, 549 string pendulum, 294
string tension, 24, 101, 261, 268
Rayleigh oscillator, 476 stroboscopic mapping, 487
recurrence time, Poincaré, 150 subcritical branching, 498
subharmonic bifurcation, 501 total time derivative, 412

subharmonic cascade, 540 trace cone, 212
supercritical branching, 498 trace trajectory, 212
superposition principle, 105 trajectory, 464, 538
superstability, 524, 526, 535 transcritical branching, 498
surface density, 46 transformation
symmetry axis, 203 – canonical, 365
symmetry-breaking bifurcation, 541 – of kinetic energy, 72
synchronization, 533 – to center-of-mass coordinates, 70
system of mass points, 65 transients, 467, 525
system of principal axes, 188 translation, 41
transverse cut, 487
target, 51 tube, rotating, 279
tautochrone problem, 129 turbulence, 461
tensor, 194
tensor of inertia, 186 van der Pol oscillator, 476, 480
– square, 191 variational problem, 338, 340
– three mass points, 204 vector field, 463
theorem of Chasles, 41 vector product, 4
theorem of Liouville, 350 velocity
theorem of Steiner, 164 – angular, 6
theory of chaos, 505 – generalized, 264
theory of stability, Floquet, 489 – true, 7
theory of top – virtual, 7
– analytical, 213 velocity-dependent potential, 311
– geometrical, 210 velocity field, 463, 466
tidal forces, 230 vibrating chain, 88
tidal friction, 547, 549 vibrating membrane, 133
top vibrating string, 101, 108
– asymmetric, 255, 297 vibration antinode, 97
– free, 209 vibration, self-exciting, 479
– heavy, 224, 249 vibrations of coupled mass points, 81
– oblate, 209 virtual displacement, 267, 556
– prolate, 209 virtual forces, 7
– rolling circular, 199 virtual work, 267, 269, 319, 556
– sleeping, 247 volume density, 43
– spherical, 191, 209 Voyager, 544, 551
– symmetric, 191
top moment, 226 wave equation, 102
torque, 161 wavelength, 105
– elliptic disk, 223 wave number, 105
– of rotating plate, 219 winding number, 533
torus, 507 work, virtual, 267, 269, 319, 556

