Classical Mechanics Systems of Particles and Hamiltonian Dynamics, Second Edition, Walter Greiner 9783642034688
Walter Greiner
Classical Mechanics
Systems of Particles and
Hamiltonian Dynamics
With a Foreword by
D.A. Bromley
Second Edition
With 280 Figures,
and 167 Worked Examples and Exercises
Prof. Dr. Walter Greiner
Frankfurt Institute
for Advanced Studies (FIAS)
Johann Wolfgang Goethe-Universität
Ruth-Moufang-Str. 1
60438 Frankfurt am Main
Germany
Translated from the German Mechanik: Teil 2, by Walter Greiner, published by Verlag Harri Deutsch, Thun,
Frankfurt am Main, Germany, © 1989
Foreword
More than a generation of German-speaking students around the world have worked
their way to an understanding and appreciation of the power and beauty of modern the-
oretical physics—with mathematics, the most fundamental of sciences—using Walter
Greiner’s textbooks as their guide.
The idea of developing a coherent, complete presentation of an entire field of
science in a series of closely related textbooks is not a new one. Many older physicists
remember with real pleasure their sense of adventure and discovery as they worked
their ways through the classic series by Sommerfeld, by Planck, and by Landau and
Lifshitz. From the students’ viewpoint, there are a great many obvious advantages to
be gained through the use of consistent notation, logical ordering of topics, and co-
herence of presentation; beyond this, the complete coverage of the science provides a
unique opportunity for the author to convey his personal enthusiasm and love for his
subject.
These volumes on classical physics, finally available in English, complement
Greiner’s texts on quantum physics, most of which have been available to English-
speaking audiences for some time. The complete set of books will thus provide a
coherent view of physics that includes, in classical physics, thermodynamics and sta-
tistical mechanics, classical dynamics, electromagnetism, and general relativity; and
in quantum physics, quantum mechanics, symmetries, relativistic quantum mechanics,
quantum electro- and chromodynamics, and the gauge theory of weak interactions.
What makes Greiner’s volumes of particular value to the student and professor alike
is their completeness. Greiner avoids the all too common “it follows that . . . ,” which
conceals several pages of mathematical manipulation and confounds the student. He
does not hesitate to include experimental data to illuminate or illustrate a theoretical
point, and these data, like the theoretical content, have been kept up to date and top-
ical through frequent revision and expansion of the lecture notes upon which these
volumes are based.
Moreover, Greiner greatly increases the value of his presentation by including
something like one hundred completely worked examples in each volume. Nothing is
of greater importance to the student than seeing, in detail, how the theoretical concepts
and tools under study are applied to actual problems of interest to working physicists.
And, finally, Greiner adds brief biographical sketches to each chapter covering the
people responsible for the development of the theoretical ideas and/or the experimen-
tal data presented. It was Auguste Comte (1789–1857) in his Positive Philosophy who
noted, “To understand a science it is necessary to know its history.” This is all too
often forgotten in modern physics teaching, and the bridges that Greiner builds to the
pioneering figures of our science upon whose work we build are welcome ones.
Greiner’s lectures, which underlie these volumes, are internationally noted for their
clarity, for their completeness, and for the effort that he has devoted to making physics
an integral whole. His enthusiasm for his science is contagious and shines through
almost every page.
These volumes represent only a part of a unique and Herculean effort to make all
of theoretical physics accessible to the interested student. Beyond that, they are of
enormous value to the professional physicist and to all others working with quantum
phenomena. Again and again, the reader will find that, after dipping into a particular
volume to review a specific topic, he or she will end up browsing, caught up by often
fascinating new insights and developments with which he or she had not previously
been familiar.
Having used a number of Greiner’s volumes in their original German in my teach-
ing and research at Yale, I welcome these new and revised English translations and
would recommend them enthusiastically to anyone searching for a coherent overview
of physics.
Preface to the Second Edition
I am pleased to note that our text Classical Mechanics: Systems of Particles and
Hamiltonian Dynamics has found many friends among physics students and re-
searchers, and that a second edition has become necessary. We have taken this op-
portunity to make several amendments and improvements to the text. A number of
misprints and minor errors have been corrected and explanatory remarks have been
supplied at various places.
New examples have been added in Chap. 19 on canonical transformations, dis-
cussing the harmonic oscillator (19.3), the damped harmonic oscillator (19.4), infini-
tesimal time steps as canonical transformations (19.5), the general form of Liouville’s
theorem (19.6), the canonical invariance of the Poisson brackets (19.7), Poisson’s the-
orem (19.8), and the invariants of the plane Kepler system (19.9).
It may come as a surprise that even for a time-honored subject such as Clas-
sical Mechanics in the formulation of Lagrange and Hamilton, new aspects may
emerge. But this has indeed been the case, resulting in new chapters on the “Extended
Hamilton–Lagrange formalism” (Chap. 21) and the “Extended Hamilton–Jacobi equa-
tion” (Chap. 22). These topics are discussed here for the first time in a textbook, and
we hope that they will help to convince students that even Classical Mechanics can
still be an active area of ongoing research.
I would especially like to thank Dr. Jürgen Struckmeier for his help in constructing
the new chapters on the Extended Hamilton–Lagrange–Jacobi formalism, and Dr. Ste-
fan Scherer for his help in the preparation of this new edition. Finally, I appreciate the
agreeable collaboration with the team at Springer-Verlag, Heidelberg.
Preface to the First Edition
Theoretical physics has become a many-faceted science. For the young student, it
is difficult enough to cope with the overwhelming amount of new material that has
to be learned, let alone obtain an overview of the entire field, which ranges from
mechanics through electrodynamics, quantum mechanics, field theory, nuclear and
heavy-ion science, statistical mechanics, thermodynamics, and solid-state theory to
elementary-particle physics; and this knowledge should be acquired in just eight to ten
semesters, during which, in addition, a diploma or master’s thesis has to be worked on
or examinations prepared for. All this can be achieved only if the university teachers
help to introduce the student to the new disciplines as early on as possible, in order to
create interest and excitement that in turn set free essential new energy.
At the Johann Wolfgang Goethe University in Frankfurt am Main, we therefore
confront the student with theoretical physics immediately, in the first semester. The-
oretical Mechanics I and II, Electrodynamics, and Quantum Mechanics I—An Intro-
duction are the courses during the first two years. These lectures are supplemented
with many mathematical explanations and much support material. After the fourth
semester of studies, graduate work begins, and Quantum Mechanics II—Symmetries,
Statistical Mechanics and Thermodynamics, Relativistic Quantum Mechanics, Quan-
tum Electrodynamics, Gauge Theory of Weak Interactions, and Quantum Chromo-
dynamics are obligatory. Apart from these, a number of supplementary courses on
special topics are offered, such as Hydrodynamics, Classical Field Theory, Special
and General Relativity, Many-Body Theories, Nuclear Models, Models of Elementary
Particles, and Solid-State Theory.
This volume of lectures, Classical Mechanics: Systems of Particles and Hamil-
tonian Dynamics, deals with the second and more advanced part of the important field
of classical mechanics. We have tried to present the subject in a manner that is both
interesting to the student and easily accessible. The main text is therefore accompa-
nied by many exercises and examples that have been worked out in great detail. This
should make the book useful also for students wishing to study the subject on their
own.
Beginning the education in theoretical physics at the first university semester, and
not as dictated by tradition after the first one and a half years in the third or fourth
semester, has brought along quite a few changes as compared to the traditional courses
in that discipline. Especially necessary is a greater amalgamation between the ac-
tual physical problems and the necessary mathematics. Therefore, we treat in the first
semester vector algebra and analysis, the solution of ordinary, linear differential equa-
tions, Newton’s mechanics of a mass point, and the mathematically simple mechanics
of special relativity.
Many explicitly worked-out examples and exercises illustrate the new concepts
and methods and deepen the interrelationship between physics and mathematics. As a
Contents
4 Degrees of Freedom
4.1 Degrees of Freedom of a Rigid Body
5 Center of Gravity
25 Bifurcations
25.1 Static Bifurcations
25.2 Bifurcations of Time-Dependent Solutions
Index
Contents of Examples and Exercises
1 Newton’s Equations in a Rotating Coordinate System
In classical mechanics, Newton’s laws hold in all systems moving uniformly relative
to each other (i.e., inertial systems) if they hold in one system. However, this is no
longer valid if a system undergoes accelerations. The new relations are obtained by
establishing the equations of motion in a fixed system and transforming them into the
accelerated system.
We first consider the rotation of a coordinate system (x′, y′, z′) about the origin
of the inertial system (x, y, z) where the two coordinate origins coincide. The inertial
system is denoted by L (“laboratory system”) and the rotating system by M (“moving
system”).
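In the primed system the vector $\mathbf{A}(t) = A_1'\mathbf{e}_1' + A_2'\mathbf{e}_2' + A_3'\mathbf{e}_3'$ changes with time.
For an observer resting in this system this can be represented as follows:
$$\left.\frac{d\mathbf{A}}{dt}\right|_M = \frac{dA_1'}{dt}\,\mathbf{e}_1' + \frac{dA_2'}{dt}\,\mathbf{e}_2' + \frac{dA_3'}{dt}\,\mathbf{e}_3'.$$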
The index M means that the derivative is being calculated from the moving system.
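In the inertial system (x, y, z) A is also time dependent. Because of the rotation of the
primed system the unit vectors $\mathbf{e}_1'$, $\mathbf{e}_2'$, $\mathbf{e}_3'$ also vary with time; i.e., when differentiating
the vector A from the inertial system, the unit vectors must be differentiated too:
$$\begin{aligned}
\left.\frac{d\mathbf{A}}{dt}\right|_L &= \frac{dA_1'}{dt}\,\mathbf{e}_1' + \frac{dA_2'}{dt}\,\mathbf{e}_2' + \frac{dA_3'}{dt}\,\mathbf{e}_3' + A_1'\dot{\mathbf{e}}_1' + A_2'\dot{\mathbf{e}}_2' + A_3'\dot{\mathbf{e}}_3' \\
&= \left.\frac{d\mathbf{A}}{dt}\right|_M + A_1'\dot{\mathbf{e}}_1' + A_2'\dot{\mathbf{e}}_2' + A_3'\dot{\mathbf{e}}_3'.
\end{aligned}$$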
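Generally the following holds: $(d/dt)(\mathbf{e}_\gamma'\cdot\mathbf{e}_\gamma') = \mathbf{e}_\gamma'\cdot\dot{\mathbf{e}}_\gamma' + \dot{\mathbf{e}}_\gamma'\cdot\mathbf{e}_\gamma' = (d/dt)(1) = 0$.
Hence, $\mathbf{e}_\gamma'\cdot\dot{\mathbf{e}}_\gamma' = 0$. The derivative of a unit vector $\dot{\mathbf{e}}_\gamma'$ is always orthogonal to the vector
itself. Therefore the derivative of a unit vector can be written as a linear combination
of the two other unit vectors:
Multiplying $\dot{\mathbf{e}}_1' = a_1\mathbf{e}_2' + a_2\mathbf{e}_3'$ by $\mathbf{e}_2'$ and correspondingly $\dot{\mathbf{e}}_2' = a_3\mathbf{e}_1' + a_4\mathbf{e}_3'$ by $\mathbf{e}_1'$,
one obtains
$$= \mathbf{e}_1'(C_2A_3' - C_3A_2') - \mathbf{e}_2'(C_1A_3' - C_3A_1') + \mathbf{e}_3'(C_1A_2' - C_2A_1'),$$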
The orientation of (ω × A) dt also coincides with dA (see Fig. 1.2). Since the
(fixed) vector A can be chosen arbitrarily, the vector C must be identical with the
angular velocity ω of the rotating system M. By insertion we obtain
$$\left.\frac{d\mathbf{A}}{dt}\right|_L = \left.\frac{d\mathbf{A}}{dt}\right|_M + \boldsymbol{\omega}\times\mathbf{A}. \tag{1.1}$$
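Relation (1.1) can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes a moving system M that rotates about the common e3-axis with a constant, arbitrarily chosen angular speed `OMEGA`, and a vector A whose components in M (`components_in_M`) are a hypothetical illustrative choice. The derivative in L is approximated by a central difference and compared with the right-hand side of (1.1).

```python
import math

OMEGA = 0.3  # assumed constant angular speed of M about e3 (illustrative value)

def cross(a, b):
    """Vector product a x b for 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def rotate_z(v, angle):
    """Express in L a vector given by components v in M (M rotated by angle about e3)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1], v[2])

def components_in_M(t):
    """Hypothetical time-dependent components A'_i(t) seen by the moving observer."""
    return (1.0 + 0.5 * t, math.sin(t), 2.0)

def rate_in_M(t):
    """dA'_i/dt, i.e. the derivative taken in the moving system."""
    return (0.5, math.cos(t), 0.0)

def A_in_L(t):
    return rotate_z(components_in_M(t), OMEGA * t)

def both_sides(t, h=1e-6):
    """Return (dA/dt|_L by central difference, dA/dt|_M + omega x A), both in L components."""
    lhs = tuple((p - m) / (2 * h) for p, m in zip(A_in_L(t + h), A_in_L(t - h)))
    dA_M = rotate_z(rate_in_M(t), OMEGA * t)
    w_cross_A = cross((0.0, 0.0, OMEGA), A_in_L(t))
    rhs = tuple(a + b for a, b in zip(dA_M, w_cross_A))
    return lhs, rhs

lhs, rhs = both_sides(0.7)
print(lhs)
print(rhs)
```

The two printed tuples should agree up to the discretization error of the finite difference.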
This can also be seen as follows (see Fig. 1.3): If the rotational axis of the primed
system coincides during a time interval dt with one of the coordinate axes of the
nonprimed system, e.g., ω = ϕ̇e3 , then
i.e.,
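In the general case $\boldsymbol{\omega} = \omega_1\mathbf{e}_1 + \omega_2\mathbf{e}_2 + \omega_3\mathbf{e}_3$, one decomposes $\boldsymbol{\omega} = \sum_i \boldsymbol{\omega}_i$ with
$\boldsymbol{\omega}_i = \omega_i\mathbf{e}_i$, and by the preceding consideration one finds
$$\mathbf{C}_i = \boldsymbol{\omega}_i; \quad\text{i.e.,}\quad \mathbf{C} = \sum_i \mathbf{C}_i = \sum_i \boldsymbol{\omega}_i = \boldsymbol{\omega}.$$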
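1.1 Introduction of the Operator $\hat{D}$
To shorten the expression ∂F(x, . . . , t)/∂t = ∂F/∂t, we introduce the operator $\hat{D} = \partial/\partial t$.
The inertial system and the accelerated system will be distinguished by the indices L and M, so that
$$\hat{D}_L = \left.\frac{\partial}{\partial t}\right|_L \quad\text{and}\quad \hat{D}_M = \left.\frac{\partial}{\partial t}\right|_M.$$
The equation
$$\left.\frac{d\mathbf{A}}{dt}\right|_L = \left.\frac{d\mathbf{A}}{dt}\right|_M + \boldsymbol{\omega}\times\mathbf{A}$$
then simplifies to
$$\hat{D}_L\mathbf{A} = \hat{D}_M\mathbf{A} + \boldsymbol{\omega}\times\mathbf{A}.$$
Since A is arbitrary, this can be written as an operator relation,
$$\hat{D}_L = \hat{D}_M + \boldsymbol{\omega}\times,$$
which may be applied to any vector.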
EXAMPLE
These two derivatives are evidently identical for all vectors that are parallel to the
rotational axis (i.e., parallel to ω), since then the vector product vanishes.
EXAMPLE
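$$\hat{D}_L\mathbf{r} = \hat{D}_M\mathbf{r} + \boldsymbol{\omega}\times\mathbf{r},$$
where $(d\mathbf{r}/dt)|_M$ is called the virtual velocity and $(d\mathbf{r}/dt)|_M + \boldsymbol{\omega}\times\mathbf{r}$ the true velocity.
The term $\boldsymbol{\omega}\times\mathbf{r}$ is called the rotational velocity.
1.2 Formulation of Newton’s Equation in the Rotating Coordinate System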
Newton’s law mr̈ = F holds only in the inertial system. In accelerated systems, there
appear additional terms. First we consider again a pure rotation.
For the acceleration we have
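$$\begin{aligned}
\ddot{\mathbf{r}}_L = \frac{d}{dt}(\dot{\mathbf{r}})_L &= \hat{D}_L(\hat{D}_L\mathbf{r}) = (\hat{D}_M + \boldsymbol{\omega}\times)(\hat{D}_M\mathbf{r} + \boldsymbol{\omega}\times\mathbf{r}) \\
&= \hat{D}_M^2\mathbf{r} + \hat{D}_M(\boldsymbol{\omega}\times\mathbf{r}) + \boldsymbol{\omega}\times\hat{D}_M\mathbf{r} + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}) \\
&= \hat{D}_M^2\mathbf{r} + (\hat{D}_M\boldsymbol{\omega})\times\mathbf{r} + 2\boldsymbol{\omega}\times\hat{D}_M\mathbf{r} + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}).
\end{aligned}$$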
The basic equation of mechanics in the rotating coordinate system therefore reads
(with the index M being omitted):
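$$m\frac{d^2\mathbf{r}}{dt^2} = \mathbf{F} - m\frac{d\boldsymbol{\omega}}{dt}\times\mathbf{r} - 2m\boldsymbol{\omega}\times\mathbf{v} - m\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}). \tag{1.3}$$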
The additional terms on the right-hand side of (1.3) are virtual forces of a dynamical
nature, but actually they are due to the acceleration term. For experiments on the earth
the additional terms can often be neglected, since the angular velocity of the earth
ω = 2π/T (T = 24 h) is only $7.27\cdot 10^{-5}\,\mathrm{s}^{-1}$.
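The quoted value of the angular velocity follows directly from ω = 2π/T. The short sketch below reproduces it and, as an illustration of why the extra terms are often negligible, compares the centrifugal acceleration ω²R at the equator with g. The earth radius `R_EARTH` and the value of `G_ACC` are assumed standard values, not taken from the text.

```python
import math

T_DAY = 24 * 3600.0          # rotation period in seconds (T = 24 h, as in the text)
R_EARTH = 6.371e6            # assumed mean earth radius in m (not from the text)
G_ACC = 9.81                 # assumed gravitational acceleration in m/s^2

omega = 2 * math.pi / T_DAY  # angular velocity of the earth in 1/s
centrifugal = omega**2 * R_EARTH  # magnitude of omega x (omega x R) at the equator
ratio = centrifugal / G_ACC

print(f"omega = {omega:.3e} 1/s")
print(f"omega^2 R / g = {ratio:.4f}")
```

With these inputs the centrifugal acceleration comes out at a fraction of a percent of g.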
We now drop the condition that the origins of the two coordinate systems coincide.
The general motion of a coordinate system is composed of a rotation of the system
and a translation of the origin. If R points to the origin of the primed system, then the
position vector in the nonprimed system is r = R + r′.
For the velocity we have $\dot{\mathbf{r}} = \dot{\mathbf{R}} + \dot{\mathbf{r}}'$, and in the inertial system we have as before
$$m\left.\frac{d^2\mathbf{r}}{dt^2}\right|_L = \mathbf{F}|_L = \mathbf{F}.$$
The transition to the accelerated system is performed as above (see (1.3)), but here
we still have the additional term mR̈:
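$$m\left.\frac{d^2\mathbf{r}'}{dt^2}\right|_M = \mathbf{F} - m\left.\frac{d^2\mathbf{R}}{dt^2}\right|_L - m\left.\frac{d\boldsymbol{\omega}}{dt}\right|_M\times\mathbf{r}' - 2m\boldsymbol{\omega}\times\mathbf{v}_M - m\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}'). \tag{1.4}$$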
2 Free Fall on the Rotating Earth
On the earth, the previously derived form of the basic equation of mechanics holds if
we neglect the rotation about the sun and therefore consider a coordinate system at the
earth center as an inertial system.
The rotational velocity ω of the earth about its axis can be considered constant in time;
therefore, mω̇ × r′ = 0.
The motion of the point R, i.e., the motion of the coordinate origin of the system
(x′, y′, z′), still has to be recalculated in the moving system. According to (2.1), we
have
Since R as seen from the moving system is a time-independent quantity and since
ω is constant, this equation finally reads
R̈|L = ω × (ω × R).
This is the centripetal acceleration due to the earth’s rotation that acts on a body
moving on the earth’s surface. For the force equation (2.1) one gets
Hence, in free fall on the earth—contrary to the inertial system—there appear virtual
forces that deflect the body in the x′- and y′-directions.
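If only gravity acts, the force F in the inertial system is $\mathbf{F} = -\gamma Mm\,\mathbf{r}/r^3$. By insertion
we obtain
$$m\ddot{\mathbf{r}}' = -\gamma\frac{Mm}{r^3}\,\mathbf{r} - m\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{R}) - 2m\boldsymbol{\omega}\times\dot{\mathbf{r}}' - m\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}').$$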
We now introduce the experimentally determined value for the gravitational acceleration g:
$$\mathbf{g} = -\gamma\frac{M}{R^3}\,\mathbf{R} - \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{R}).$$
which leads to a decrease of the gravitational acceleration (as a function of the geo-
graphical latitude). The reduction is included in the experimental value for g. We thus
obtain
In the vicinity of the earth’s surface (r′ ≪ R) the last term can be neglected, since
ω² enters and |ω| is small compared to 1/s. Thus the equation simplifies to
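The vector equation is solved by decomposing it into its components. First one suitably
evaluates the vector product. From Fig. 2.1 one obtains, with $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3$ the unit
vectors of the inertial system and $\mathbf{e}_1', \mathbf{e}_2', \mathbf{e}_3'$ the unit vectors of the moving system, the
following relation:
Because $\boldsymbol{\omega} = \omega\mathbf{e}_3$, one gets the component representation of ω in the moving system:
$$\boldsymbol{\omega}\times\dot{\mathbf{r}}' = (-\omega\dot{y}'\cos\lambda)\,\mathbf{e}_1' + (\dot{z}'\omega\sin\lambda + \dot{x}'\omega\cos\lambda)\,\mathbf{e}_2' - (\omega\dot{y}'\sin\lambda)\,\mathbf{e}_3'.$$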
The vector equation (2.2) can now be decomposed into the following three compo-
nent equations:
ẍ = 2ẏ ω cos λ,
ÿ = −2ω(ż sin λ + ẋ cos λ), (2.3)
z̈ = −g + 2ωẏ sin λ.
This is a system of three coupled differential equations with ω as the coupling parameter.
For ω = 0, we get the free fall in an inertial system. The solution of such a system
can also be obtained in an analytical way. It is, however, useful to learn various
approximation methods from this example. We will first outline these methods and then
work out the exact analytical solution and compare it with the approximations.
In the present case, the perturbation calculation and the method of successive
approximation offer themselves as approximations. Both of these methods will be
presented here. The primes on the coordinates will be omitted below.
2.1 Perturbation Calculation
Here one starts from a system that is mathematically more tractable, and then one
accounts for the forces due to the perturbation which are small compared to the
remaining forces of the system.
We first integrate (2.3):
ẋ = 2ωy cos λ + c1 ,
ẏ = −2ω(x cos λ + z sin λ) + c2 , (2.4)
ż = −gt + 2ωy sin λ + c3 .
In free fall on the earth the body is released from the height h at time t = 0; i.e., for
our problem, the initial conditions are
z(0) = h, ż(0) = 0,
y(0) = 0, ẏ(0) = 0,
x(0) = 0, ẋ(0) = 0.
c1 = 0, c2 = 2ωh sin λ, c3 = 0,
and obtain
ẋ = 2ωy cos λ,
ẏ = −2ω(x cos λ + (z − h) sin λ), (2.5)
ż = −gt + 2ωy sin λ.
The terms proportional to ω are small compared to the term gt. They represent the
perturbation. The deviation y from the origin of the moving system is a function of
ω and t; i.e., in the first approximation the term y₁(ω, t) ∼ ω appears. Inserting this
into the first differential equation, we find an expression involving ω². Because of the
consistency in ω we can neglect all terms with ω², i.e., we obtain to first order in ω
Because x(t) = 0, in this approximation the term 2ωx cos λ drops out from the
second differential equation (2.5); there remains
ẏ = −2ω(z − h) sin λ.
Inserting z leads to
$$\dot{y} = -2\omega\left(h - \frac{1}{2}gt^2 - h\right)\sin\lambda = \omega g t^2\sin\lambda.$$
x(0) = 0, ẋ(0) = 0,
y(0) = 0, ẏ(0) = 0,
z(0) = h, ż(0) = 0,
c1 = 0, c2 = 0, c3 = h.
The iteration method is based on replacing the functions x(u), y(u), z(u) under
the integral sign by appropriate initial functions. In the first approximation, one de-
termines the functions x(t), y(t), z(t) and then inserts them as x(u), y(u), z(u) on the
right-hand side to get the second approximation. In general there results a successive
approximation to the exact solution if ω · t = 2πt/T (T = 24 hours) is sufficiently
small.
By setting x(u), y(u), z(u) to zero in the above example in the zero-order approxi-
mation, one obtains in the first approximation
$$x^{(1)}(t) = 0, \qquad y^{(1)}(t) = 2\omega h t\sin\lambda, \qquad z^{(1)}(t) = h - \frac{g}{2}t^2.$$
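$$x^{(2)}(t) = 2\omega\cos\lambda\int_0^t y^{(1)}(u)\,du = 2\omega\cos\lambda\int_0^t 2\omega h(\sin\lambda)\,u\,du = 4\omega^2 h\cos\lambda\sin\lambda\,\frac{t^2}{2} = f(\omega^2) \approx 0.$$
$$\begin{aligned}
z^{(2)}(t) &= h - \frac{1}{2}gt^2 + 2\omega\sin\lambda\int_0^t y^{(1)}(u)\,du \\
&= h - \frac{g}{2}t^2 + 2\omega\sin\lambda\int_0^t 2\omega h(\sin\lambda)\,u\,du \\
&= h - \frac{g}{2}t^2 + i(\omega^2).
\end{aligned}$$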
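$$\begin{aligned}
y^{(2)}(t) &= 2\omega h\sin\lambda\cdot t - 2\omega h\sin\lambda\cdot t + g\omega(\sin\lambda)\,\frac{t^3}{3} \\
&= g\omega(\sin\lambda)\,\frac{t^3}{3} \neq 2\omega h(\sin\lambda)\,t + k(\omega^2).
\end{aligned}$$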
We see that in this second step the terms linear in ω once again changed greatly. The
term 2ωht sin λ obtained in the first iteration step cancels completely and is finally
replaced by $g\omega(\sin\lambda)\,t^3/3$. A check of $y^{(3)}(t)$ shows that $y^{(2)}(t)$ is consistent up to
first order in ω.
Just as in the perturbation method discussed above, we get up to first order in ω the
solution
$$x(t) = 0, \qquad y(t) = \frac{g\omega\sin\lambda}{3}\,t^3, \qquad z(t) = h - \frac{g}{2}t^2.$$
We have of course noted long ago that the method of successive approximation (itera-
tion) is equivalent to the perturbation calculation and basically represents its concep-
tually clean formulation.
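As a cross-check of the first-order result, the coupled equations (2.3) can be integrated numerically and compared with the approximate solution. The following is a minimal sketch using a hand-written fourth-order Runge-Kutta step. The numerical values (g, ω, h) are those used in the chapter's worked example. Taking λ so that sin λ = 1 at the equator is inferred from that example and should be regarded as an assumption, as are the step count and the function names.

```python
import math

G = 9.81                  # gravitational acceleration in m/s^2
OMEGA = 7.27e-5           # angular velocity of the earth in 1/s
LAM = math.radians(90.0)  # assumed: lambda chosen so that sin(lambda) = 1 (equator)
H = 400.0                 # release height in m

def rhs(state):
    """Right-hand side of (2.3) written as a first-order system."""
    x, y, z, vx, vy, vz = state
    ax = 2 * OMEGA * vy * math.cos(LAM)
    ay = -2 * OMEGA * (vz * math.sin(LAM) + vx * math.cos(LAM))
    az = -G + 2 * OMEGA * vy * math.sin(LAM)
    return (vx, vy, vz, ax, ay, az)

def shifted(state, k, factor):
    return tuple(s + factor * ki for s, ki in zip(state, k))

def rk4_step(state, dt):
    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, dt / 2))
    k3 = rhs(shifted(state, k2, dt / 2))
    k4 = rhs(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate(t_end, n_steps=1000):
    """Integrate from rest at height H up to time t_end."""
    state = (0.0, 0.0, H, 0.0, 0.0, 0.0)
    dt = t_end / n_steps
    for _ in range(n_steps):
        state = rk4_step(state, dt)
    return state

def first_order(t):
    """First-order solution in omega: x = 0, y = g*omega*sin(lambda)*t^3/3, z = h - g*t^2/2."""
    return (0.0, G * OMEGA * math.sin(LAM) * t**3 / 3, H - G * t**2 / 2)

numeric = integrate(5.0)
approx = first_order(5.0)
print("y numeric:", numeric[1], " y first order:", approx[1])
print("z numeric:", numeric[2], " z first order:", approx[2])
```

The remaining differences are of order (ωt)², which is the size of the terms dropped in the perturbation calculation.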
2.3 Exact Solution
By integrating (2.3a) to (2.3c) with the above initial conditions, one gets
ẋ = 2ω cos λy, (2.5a)
ẏ = −2ω(sin λz + cos λx) + 2ω sin λh, (2.5b)
ż = −gt + 2ω sin λy. (2.5c)
The general solution of (2.6) is the sum of the general solution of the homogeneous equation
and one particular solution of the inhomogeneous equation, i.e.,
$$y = \frac{c}{4\omega^2}\,t + A\sin 2\omega t + B\cos 2\omega t.$$
i.e.,
$$y = \frac{g\sin\lambda}{2\omega}\left(t - \frac{\sin 2\omega t}{2\omega}\right). \tag{2.7}$$
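Since ωt = 2π · (fall time)/(1 day) is very small (ωt ≪ 1), one can expand (2.10):
$$\begin{aligned}
x &= \frac{gt^2}{6}\sin\lambda\cos\lambda\,(\omega t)^2, \\
y &= \frac{gt^2}{3}\sin\lambda\,(\omega t), \\
z &= h - \frac{gt^2}{2}\left(1 - \frac{\sin^2\lambda}{3}(\omega t)^2\right).
\end{aligned} \tag{2.11}$$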
If one considers only terms of first order in ωt, then (ωt)² ≈ 0, and (2.11) becomes
$$x(t) = 0, \qquad y(t) = \frac{g\omega t^3\sin\lambda}{3}, \qquad z(t) = h - \frac{g}{2}t^2. \tag{2.12}$$
This is identical with the results obtained by means of perturbation theory. However,
(2.10) is exact!
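The agreement between the exact expression and its first-order expansion can be made concrete for the eastward component, using (2.7) and (2.12). The sketch below is illustrative only: the fall time of 9 s and sin λ = 1 are assumed sample inputs, and the function names are hypothetical.

```python
import math

G = 9.81         # gravitational acceleration in m/s^2
OMEGA = 7.27e-5  # angular velocity of the earth in 1/s

def y_exact(t, lam):
    """Eastward deflection according to (2.7)."""
    return G * math.sin(lam) / (2 * OMEGA) * (t - math.sin(2 * OMEGA * t) / (2 * OMEGA))

def y_first_order(t, lam):
    """Eastward deflection to first order in omega*t, as in (2.12)."""
    return G * OMEGA * t**3 * math.sin(lam) / 3

t_fall = 9.0               # assumed sample fall time in s
lam = math.radians(90.0)   # assumed: sin(lambda) = 1
exact = y_exact(t_fall, lam)
approx = y_first_order(t_fall, lam)
rel_diff = abs(exact - approx) / approx
print(exact, approx, rel_diff)
```

The relative difference is of order (ωt)², far below anything measurable in a fall experiment.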
The eastward deflection of a falling mass seems at first paradoxical, since the earth
rotates toward the east too. However, it becomes transparent if one considers that the
mass at the height h at the time t = 0 has, in the inertial system, a larger velocity
component toward the east (due to the earth’s rotation) than an observer on the earth’s
surface. It is just this “excessive” velocity toward the east which, for an observer on the
earth, lets the stone fall toward the east rather than vertically downward. For the throw
upward the situation is the opposite (see Exercise 2.2).
EXAMPLE
As an example, we calculate the eastward deflection of a body that falls at the equator
from a height of 400 m.
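The eastward deflection of a body falling from the height h is given by
$$y(h) = \frac{2\omega\sin\lambda\,h}{3}\sqrt{\frac{2h}{g}}.$$
The height is h = 400 m, the angular velocity of the earth is $\omega = 7.27\cdot 10^{-5}\,\mathrm{rad\,s^{-1}}$,
and the gravitational acceleration $g = 9.81\,\mathrm{m\,s^{-2}}$ is known.
Inserting the values in y(h), with sin λ = 1 at the equator, yields
$$y(h) = \frac{2\cdot 7.27\cdot 400}{3\cdot 10^5}\,\frac{\mathrm{rad\,m}}{\mathrm{s}}\,\sqrt{\frac{2\cdot 400\,\mathrm{s}^2}{9.81}},$$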
Thus, the body will be deflected toward the east by 17.6 cm.
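The arithmetic of this example can be reproduced in a few lines. This is a sketch with the values given above. The function name and the default sin λ = 1 are illustrative assumptions.

```python
import math

def eastward_deflection(h, omega=7.27e-5, g=9.81, sin_lam=1.0):
    """y(h) = (2*omega*sin(lambda)*h/3) * sqrt(2h/g), in metres."""
    return 2 * omega * sin_lam * h / 3 * math.sqrt(2 * h / g)

y_400 = eastward_deflection(400.0)
print(f"{100 * y_400:.1f} cm")
```

With exactly these rounded inputs the result is about 0.175 m, in line with the quoted deflection to within the rounding of ω and g.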
EXERCISE
Problem. An object is thrown upward with the initial velocity v0. Find the
eastward deflection.
Solution. If we put the coordinate system at the starting point of the motion, the
initial conditions read
z(t = 0) = 0, ż(t = 0) = v0 ,
y(t = 0) = 0, ẏ(t = 0) = 0,
x(t = 0) = 0, ẋ(t = 0) = 0.
The deflection to the east is given by y, the deflection to the south by x; z = 0 denotes
the height h above the earth’s surface.
For the motion in y-direction we have, as has been shown (see (2.4)),
$$\frac{dy}{dt} = -2\omega(x\cos\lambda + z\sin\lambda) + C_2.$$
The motion of the body in x-direction can be neglected; x ≈ 0. If one further ne-
glects the influence of the eastward deflection on z, one immediately arrives at the
equation
$$z = -\frac{g}{2}t^2 + v_0 t,$$
which is already known from the treatment of the free fall without accounting for the
earth’s rotation. Insertion into the above differential equation yields
$$\frac{dy}{dt} = 2\omega\left(\frac{g}{2}t^2 - v_0 t\right)\sin\lambda,$$
$$y(t) = 2\omega\left(\frac{g}{6}t^3 - \frac{v_0}{2}t^2\right)\sin\lambda.$$
At the turning point (after the time of ascent T = v0/g), the deflection is
$$y(T) = -\frac{2}{3}\,\omega\sin\lambda\,\frac{v_0^3}{g^2}.$$
It points toward the west, as expected.
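The step from y(t) to the value at the turning point is easy to verify numerically. The sketch below evaluates both expressions for an assumed sample launch speed of 20 m/s. The function names, the launch speed, and sin λ = 1 are illustrative choices, not values from the exercise.

```python
def deflection_during_ascent(t, v0, omega=7.27e-5, g=9.81, sin_lam=1.0):
    """y(t) = 2*omega*(g*t^3/6 - v0*t^2/2)*sin(lambda)."""
    return 2 * omega * (g * t**3 / 6 - v0 * t**2 / 2) * sin_lam

def deflection_at_turning_point(v0, omega=7.27e-5, g=9.81, sin_lam=1.0):
    """y(T) = -(2/3)*omega*sin(lambda)*v0^3/g^2 with T = v0/g."""
    return -(2.0 / 3.0) * omega * sin_lam * v0**3 / g**2

v0 = 20.0                 # assumed sample launch speed in m/s
t_ascent = v0 / 9.81      # time of ascent T = v0/g
y_from_path = deflection_during_ascent(t_ascent, v0)
y_closed_form = deflection_at_turning_point(v0)
print(y_from_path, y_closed_form)
```

Both values coincide and are negative, i.e. the deflection at the turning point is toward the west.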
EXERCISE
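$$m\frac{d^2\mathbf{r}}{dt^2} = -mg\,\mathbf{e}_3 - 2m\boldsymbol{\omega}\times\mathbf{v} \quad\text{with}\quad \boldsymbol{\omega} = -\omega\sin\lambda\,\mathbf{e}_1 + \omega\cos\lambda\,\mathbf{e}_3.$$
The flow velocity is $\mathbf{v} = -v_0\mathbf{e}_1$, and hence,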
F must be perpendicular to the water surface (see Fig. 2.3). With the magnitude of
the force
$$F = \sqrt{4m^2\omega^2 v_0^2\sin^2\varphi + m^2g^2}$$
one can, from Fig. 2.3, determine $H = D\sin\alpha$ and $\sin\alpha = F_2/F$. For the desired
height H one obtains
$$H = D\,\frac{2\omega v_0\sin\varphi}{\sqrt{4\omega^2 v_0^2\sin^2\varphi + g^2}} \approx \frac{2D\omega v_0\sin\varphi}{g}.$$
Fig. 2.3.
For the numerical example one gets a bank superelevation of H ≈ 2.9 cm.
EXERCISE
Problem. Let a uniform spherical earth be covered by water. The sea surface takes
the shape of an oblate spheroid if the earth rotates with the angular velocity ω.
Find an expression that approximately describes the difference of the sea depth at
the pole and equator, respectively. Assume that the sea surface is a surface of con-
stant potential energy. Neglect the corrections to the gravitational potential due to the
deformation.
Solution.
Fig. 2.4.
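$$\mathbf{F}_{\text{eff}}(r) = -\frac{\gamma mM}{r^2}\,\mathbf{e}_r + m\omega^2 r\sin\vartheta\,\mathbf{e}_x, \qquad x = r\cdot\sin\vartheta,$$
$$\begin{aligned}
V\big|_{r_1}^{r_2} &= -\int_{r_1}^{r_2}\mathbf{F}_{\text{eff}}(r)\cdot d\mathbf{r} \\
&= -\int_{r_1}^{r_2}\left(-\frac{\gamma mM}{r^2}\,\mathbf{e}_r + m\omega^2 r\sin\vartheta\,\mathbf{e}_x\right)\cdot\mathbf{e}_r\,dr \\
&= -\frac{\gamma mM}{r}\bigg|_{r_1}^{r_2} - \frac{m\omega^2 r^2\sin^2\vartheta}{2}\bigg|_{r_1}^{r_2}.
\end{aligned}$$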
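We therefore define
$$V_{\text{eff}}(r) = -\frac{\gamma mM}{r} - \frac{m\omega^2 r^2}{2}\sin^2\vartheta. \tag{2.13}$$
Let
$$r = R + \Delta r(\vartheta); \qquad \Delta r(\vartheta) \ll R.$$
$$V(r) = -\frac{\gamma mM}{R} + V_0.$$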
According to the formulation of the problem, the earth’s surface is an equipoten-
tial surface. From this it follows that the attractive force acts normal to this surface.
Because of the constancy of the potential along the surface, no tangential force can
arise.
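$$V(r) = -\frac{\gamma mM}{R}\left(1 - \frac{\Delta r}{R}\right) - \frac{m}{2}\,\omega^2R^2\left(1 + 2\frac{\Delta r}{R}\right)\sin^2\vartheta \overset{!}{=} -\frac{\gamma mM}{R} + V_0.$$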
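From this it follows that
$$V_0 = \frac{\gamma mM}{R^2}\,\Delta r - \frac{m}{2}\,\omega^2R^2\sin^2\vartheta - m\omega^2R\,\Delta r\sin^2\vartheta.$$
As can be seen by inserting the given values, the last term can be neglected:
$$\frac{\gamma mM}{R^2} \gg m\omega^2R\sin^2\vartheta.$$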
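The second requirement for the evaluation of the deformation is the volume conservation.
Since one can assume Δr ≪ R, we can write this requirement as a simple surface
integral
$$\int_{\vartheta=0}^{\pi/2}\int_{\varphi=0}^{2\pi} da\cdot\Delta r(\vartheta) = 0, \tag{2.15}$$
$$\int_0^{\pi/2}\left(V_0 + \frac{m\omega^2R^2\sin^2\vartheta}{2}\right)2\pi R\cdot R\sin\vartheta\,d\vartheta = 0,$$
$$\int_0^{\pi/2}\left(V_0\sin\vartheta + \frac{m\omega^2R^2\sin^3\vartheta}{2}\right)d\vartheta = 0.$$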
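With
$$\int_0^{\pi/2}\sin\vartheta\,d\vartheta = 1 \quad\text{and}\quad \int_0^{\pi/2}\sin^3\vartheta\,d\vartheta = \frac{2}{3},$$
one gets
$$V_0 + \frac{m}{3}\,\omega^2R^2 = 0, \qquad V_0 = -\frac{m}{3}\,\omega^2R^2.$$
By inserting this result into (2.14), one obtains
$$\Delta r(\vartheta) = \frac{R^4\omega^2}{2\gamma M}\left(\sin^2\vartheta - \frac{2}{3}\right).$$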
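To get a feeling for the size of this deformation, the result can be evaluated at the pole (ϑ = 0) and at the equator (ϑ = π/2). The sketch below uses assumed standard values for the earth radius and for the product γM. Neither number is given in the exercise, and the function name is hypothetical.

```python
import math

OMEGA = 7.27e-5   # angular velocity of the earth in 1/s
R_EARTH = 6.371e6 # assumed mean earth radius in m (not from the text)
GAMMA_M = 3.986e14  # assumed product gamma*M in m^3/s^2 (not from the text)

def delta_r(theta):
    """Delta r(theta) = R^4 omega^2 / (2 gamma M) * (sin^2(theta) - 2/3), in metres."""
    prefactor = R_EARTH**4 * OMEGA**2 / (2 * GAMMA_M)
    return prefactor * (math.sin(theta)**2 - 2.0 / 3.0)

at_pole = delta_r(0.0)
at_equator = delta_r(math.pi / 2)
difference = at_equator - at_pole
print(f"pole: {at_pole:.0f} m, equator: {at_equator:.0f} m, difference: {difference:.0f} m")
```

With these inputs the difference between equator and pole is on the order of 11 km. The sign pattern (negative at the pole, positive at the equator) reflects the oblate shape.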
If one wants to include the influence of the deformation on the gravitational po-
tential, one needs the so-called spherical surface harmonics. They will be outlined in
detail in the lectures on classical electrodynamics.1
3 Foucault’s Pendulum
In 1851, Foucault¹ found a simple and convincing proof of the earth’s rotation: A pendulum
tends to maintain its plane of motion, independent of any rotation of the suspension
point. If such a rotation is nevertheless observed in a laboratory, one can only
conclude that the laboratory (i.e., the earth) rotates.
Figure 3.1 shows the arrangement of the pendulum and fixes the axes of the coor-
dinate system.
We first derive the equation of motion of the Foucault pendulum. For the mass point
we have
F = T + mg, (3.1)
where T is a still unknown tension force along the pendulum string. In the basic equa-
tion that holds for moving reference frames,
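$$m\ddot{\mathbf{r}} = \mathbf{F} - m\frac{d\boldsymbol{\omega}}{dt}\times\mathbf{r} - 2m\boldsymbol{\omega}\times\mathbf{v} - m\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}), \tag{3.2}$$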
1 Jean Bernard Léon Foucault [fuk’o], French physicist, b. Sept. 18, 1819 Paris–d. Feb. 11, 1868.
In 1851, Foucault performed his famous pendulum experiment in the Panthéon in Paris as a proof of
the earth’s rotation. In the same year he proved by means of a rotating mirror that light propagates
in water more slowly than in air, which was important for confirming the wave theory of light. He
investigated the eddy currents in metals detected by D.F. Arago (Foucault-currents), and also studied
light and heat radiation together with A.H.L. Fizeau.
the linear forces and the centripetal forces can be neglected, because for the earth’s
rotation dω/dt = 0 and t · |ω| ≪ 1, t²ω² ≈ 0 (t ≈ pendulum period). By inserting (3.1)
into the simplified equation (3.2), we get
As is obvious from this equation, the earth’s rotation is expressed for the moving
observer by the appearance of a virtual force, the Coriolis force. The Coriolis force
causes a rotation of the vibrational plane of the pendulum. The string tension T can
be determined from (3.3) by noting that
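$$\begin{aligned}
\mathbf{T} = (\mathbf{T}\cdot\mathbf{e}_1)\mathbf{e}_1 + (\mathbf{T}\cdot\mathbf{e}_2)\mathbf{e}_2 + (\mathbf{T}\cdot\mathbf{e}_3)\mathbf{e}_3 &= T\,\frac{\mathbf{T}}{T} \\
&= T\,\frac{(-x, -y, l - z)}{\sqrt{x^2 + y^2 + (l - z)^2}} \\
&\approx T\,\frac{(-x, -y, l - z)}{l}.
\end{aligned} \tag{3.4}$$
$$\mathbf{T} = T\left(-\frac{x}{l}\,\mathbf{e}_1 - \frac{y}{l}\,\mathbf{e}_2 + \frac{l - z}{l}\,\mathbf{e}_3\right). \tag{3.5}$$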
Before inserting (3.5) into (3.3), it is practical to decompose (3.3) into individual com-
ponents. For this purpose one has to evaluate the vector product ω × v:
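$$\boldsymbol{\omega}\times\mathbf{v} = \begin{vmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ -\omega\sin\lambda & 0 & \omega\cos\lambda \\ \dot{x} & \dot{y} & \dot{z} \end{vmatrix} = -\omega\cos\lambda\,\dot{y}\,\mathbf{e}_1 + \omega(\cos\lambda\,\dot{x} + \sin\lambda\,\dot{z})\,\mathbf{e}_2 - \omega\sin\lambda\,\dot{y}\,\mathbf{e}_3. \tag{3.6}$$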
By inserting (3.5) and (3.6) into (3.3) with mg = −mge3 , we obtain a coupled
system of differential equations:
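$$\begin{aligned}
m\ddot{x} &= -\frac{x}{l}\,T + 2m\omega\cos\lambda\,\dot{y}, \\
m\ddot{y} &= -\frac{y}{l}\,T - 2m\omega(\cos\lambda\,\dot{x} + \sin\lambda\,\dot{z}), \\
m\ddot{z} &= \frac{l - z}{l}\,T - mg + 2m\omega\sin\lambda\,\dot{y}.
\end{aligned} \tag{3.7}$$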
To eliminate the unknown string tension T from the system (3.7), we adopt the already
mentioned approximations:
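The pendulum string shall be very long, but the pendulum shall oscillate with small
amplitudes only. From this it follows that x/l ≪ 1, y/l ≪ 1, and z/l ≪ 1, since the
mass point moves almost in the x, y-plane. Hence, for calculating the string tension
we use the approximation
$$\frac{l - z}{l} = 1, \qquad m\ddot{z} = 0, \tag{3.8}$$
$$\begin{aligned}
\ddot{x} &= -\frac{g}{l}\,x + \frac{2\omega\sin\lambda}{l}\,x\dot{y} + 2\omega\cos\lambda\,\dot{y}, \\
\ddot{y} &= -\frac{g}{l}\,y + \frac{2\omega\sin\lambda}{l}\,y\dot{y} - 2\omega\cos\lambda\,\dot{x}.
\end{aligned} \tag{3.10}$$
$$\ddot{x} = -\frac{g}{l}\,x + 2\omega\cos\lambda\,\dot{y}, \qquad \ddot{y} = -\frac{g}{l}\,y - 2\omega\cos\lambda\,\dot{x}. \tag{3.11}$$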
These two linear (but coupled) differential equations describe the vibrations of a pen-
dulum under the influence of the Coriolis force to a good approximation. In the fol-
lowing we will describe a method of solving (3.11).
$$\ddot{x} = -k^2x - 2\alpha i^2\dot{y}, \qquad i\ddot{y} = -k^2iy - 2\alpha i\dot{x} \tag{3.12}$$
Equation (3.13) is solved by the ansatz useful for all vibration processes,
$$u = C\cdot e^{\gamma t}, \tag{3.14}$$
The most general solution of the differential equation (3.13) is the linear combination
of the linearly independent solutions
where A and B must be fixed by the initial conditions and are of course complex, i.e.,
can be decomposed into a real and an imaginary part:
The Euler relation $e^{-i\varphi} = \cos\varphi - i\sin\varphi$ allows one to split (3.19) into u = x + iy:
from which it follows after separating the real and the imaginary parts
x0 = 0, ẋ0 = 0,
y0 = L, ẏ0 = 0,
i.e., the pendulum is displaced by the distance L toward the east and released at the
time t = 0 without initial velocity. Inserting x0 = 0 in (3.21), one gets
$$B_1 = -A_1.$$
$$B_2 = \frac{k - \alpha}{k + \alpha}\,A_2.$$
As already noted in (3.16), α ≪ k and thus B2 ≈ A2. From (3.21) one now obtains
We still have to include the initial conditions for y0 and ẏ0 . From ẏ0 = 0 and (3.22)
we get
−A1 (α − k) + A1 (α + k) = 0 ⇒ A1 = 0.
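$$2A_2 = L \;\Rightarrow\; A_2 = \frac{L}{2}.$$
$$\begin{aligned}
x &= \frac{L}{2}\sin(\alpha - k)t + \frac{L}{2}\sin(\alpha + k)t, \\
y &= \frac{L}{2}\cos(\alpha - k)t + \frac{L}{2}\cos(\alpha + k)t.
\end{aligned}$$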
sin(α ± k) = sin α cos k ± cos α sin k, cos(α ± k) = cos α cos k ∓ sin α sin k,
it follows that
The first factor in (3.23) describes the motion of a pendulum that vibrates with the
amplitude L and the frequency $k = \sqrt{g/l}$. The second factor is a unit vector n that rotates
with the frequency α = ω cos λ and describes the rotation of the vibration plane:
$$\mathbf{r} = L\cos kt\;\mathbf{n}(t), \qquad \mathbf{n}(t) = \sin\alpha t\,\mathbf{e}_1 + \cos\alpha t\,\mathbf{e}_2.$$
Equation (3.23) also tells us in what direction the vibrational plane rotates. For the
northern hemisphere cos λ > 0, and after a short time sin αt > 0 and cos αt > 0, i.e.,
the vibrational plane rotates clockwise. An observer in the southern hemisphere will
see his pendulum rotate counter-clockwise, since cos λ < 0.
At the equator the experiment fails, since cos λ = 0. Although the component ωx =
−ω sin λ takes its maximum value there, it cannot be demonstrated by means of the
Foucault pendulum.
Following the path of the mass point of a Foucault pendulum, one finds rosette
trajectories. Note that the shape of the trajectories essentially depends on the initial
conditions (see Fig. 3.4). The left side shows a rosette path for a pendulum released at
the maximum displacement; the pendulum shown on the right side was pushed out of
the rest position.
Because of the assumption α ≪ k in (3.16), (3.23) does not describe either of the
two rosettes exactly. According to (3.23), the pendulum always passes through the rest
position, although the initial conditions were adopted as in the left figure.
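The quality of the approximation α ≪ k can be checked numerically. The following is a minimal sketch, not part of the original text: it integrates the linearized equations (3.11), written with k² = g/l and α = ω cos λ, by a classical Runge-Kutta step and compares the result with the approximate solution r = L cos kt n(t). The function names, the step size, and the sample values k = 1, α = 0.01 are illustrative assumptions.

```python
import math

def foucault_rhs(state, k, alpha):
    """Right-hand side of (3.11): x'' = -k^2 x + 2 alpha y', y'' = -k^2 y - 2 alpha x'."""
    x, y, vx, vy = state
    return (vx, vy, -k**2 * x + 2 * alpha * vy, -k**2 * y - 2 * alpha * vx)

def rk4_step(state, dt, k, alpha):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, d, h):
        return tuple(si + h * di for si, di in zip(s, d))
    k1 = foucault_rhs(state, k, alpha)
    k2 = foucault_rhs(shifted(state, k1, dt / 2), k, alpha)
    k3 = foucault_rhs(shifted(state, k2, dt / 2), k, alpha)
    k4 = foucault_rhs(shifted(state, k3, dt), k, alpha)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def approx_solution(t, L, k, alpha):
    """Approximate solution x = L cos(kt) sin(alpha t), y = L cos(kt) cos(alpha t)."""
    return (L * math.cos(k * t) * math.sin(alpha * t),
            L * math.cos(k * t) * math.cos(alpha * t))

def max_deviation(L=1.0, k=1.0, alpha=0.01, t_end=50.0, dt=0.01):
    """Largest distance between the numerical and the approximate trajectory."""
    state = (0.0, L, 0.0, 0.0)  # x0 = 0, y0 = L, released at rest
    worst = 0.0
    for i in range(1, int(round(t_end / dt)) + 1):
        state = rk4_step(state, dt, k, alpha)
        xa, ya = approx_solution(i * dt, L, k, alpha)
        worst = max(worst, math.hypot(state[0] - xa, state[1] - ya))
    return worst

if __name__ == "__main__":
    print(max_deviation())
```

For these sample values the deviation stays of the order (α/k)·L, which reflects the small initial velocity that the approximate solution carries but the released pendulum does not.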
3.2 Discussion of the Solution 29
EXERCISE
Problem. A vertical bar AB rotates with constant angular velocity ω. A light non-
stretchable chain of length l is fixed at one end to the point O of the bar, while the
mass m is fixed at its other end. Find the chain tension and the angle between chain
and bar in the state of equilibrium.
Solution. Three forces act on the body, viz.
(1) the gravitation (weight): Fg = −mge3 ;
(2) the centrifugal force: Fz = −mω × (ω × r);
(3) the chain tension force: T = −T sin ϕe1 + T cos ϕe3 .
Since the angular velocity has only one component in the e3 -direction, ω = ωe3 , and
Fz = −m(ω × (ω × r))
the expression
$$\mathbf{F}_z = +m\omega^2 l\sin\varphi\;\mathbf{e}_1.$$
If the body is in equilibrium, the sum of the three forces equals zero:
Since a vector vanishes only if every component equals zero, we can set up the following
component equations:
$$T = m\omega^2 l \tag{3.26}$$
and after elimination of T from (3.25) we get the angle ϕ between the chain and the
bar:
$$\cos\varphi = \frac{g}{\omega^2 l}.$$
Since the chain OP with the mass m in P moves on the surface of a cone, this arrange-
ment is called the cone pendulum.
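As a small numerical illustration of the two results T = mω²l and cos ϕ = g/(ω²l), the following sketch evaluates the equilibrium angle and the chain tension. The function name and the default g = 9.81 m/s² are assumptions made for this example; the formulas only apply when ω²l ≥ g, otherwise the chain hangs vertically.

```python
import math

def cone_pendulum(m, l, omega, g=9.81):
    """Return (phi, tension) for the cone pendulum in equilibrium.

    Uses cos(phi) = g / (omega^2 l) and T = m omega^2 l from the exercise.
    """
    cos_phi = g / (omega**2 * l)
    if cos_phi > 1.0:
        # For omega^2 l < g there is no deflected equilibrium.
        raise ValueError("omega too small: only phi = 0 is possible")
    return math.acos(cos_phi), m * omega**2 * l

if __name__ == "__main__":
    phi, tension = cone_pendulum(m=1.0, l=1.0, omega=math.sqrt(2 * 9.81))
    print(math.degrees(phi), tension)
```

The vertical force balance T cos ϕ = mg offers a quick consistency check of any pair of values returned by the function.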
EXERCISE
Problem. The period of a pendulum of length l is given by T . How will the period
change if the pendulum is suspended at the ceiling of a train that moves with the
velocity v along a curve with radius R?
(a) Neglect the Coriolis force. Why can you do that?
(b) Solve the equations of motion (with Coriolis force!) nearly exactly (analogous to
Foucault’s pendulum).
$$F_R = -mg\sin\varphi + \frac{mv^2}{R'}\cos\varphi.$$
One has v(x) = ω(R + x) = ω(R + l sin ϕ) and R′ = R + x. Hence, it follows that
Fig. 3.6.
Here, the Coriolis force was neglected, since the angular velocity ϕ̇ and hence ẋ is
small compared to the rotational velocity v = ω(R + x), i.e., ω × ẋ ≅ 0. The solution
of the homogeneous differential equation is
$$\varphi_h = \sin\!\left(\sqrt{\frac{g}{l}-\omega^2}\;t\right).$$
$$\varphi_i = \frac{\omega^2 (R/l)}{(g/l)-\omega^2}.$$
$$T = \frac{2\pi}{\sqrt{(g/l)-\omega^2}}.$$
For ω = √(g/l) the period becomes infinite, since the centrifugal force exceeds the
gravitational force. This interpretation gets to the core of the matter, although the
formula (3.28) holds only for small angular velocities: For large angular velocities the
approximation of small vibration amplitudes x in (3.27) is no longer allowed, because
the pendulum mass is being pressed outward due to the centrifugal force, i.e., to large
values of x.
(b) The equations of motion read
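$$m\ddot{\mathbf{r}} = \mathbf{F} - m\,\frac{d\boldsymbol{\omega}}{dt}\times\mathbf{r} - 2m\,\boldsymbol{\omega}\times\mathbf{v} - m\,\boldsymbol{\omega}\times(\boldsymbol{\omega}\times(\mathbf{r}+\mathbf{R})).$$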
With
and
one finds
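$$m\ddot{x} = -T_x - 2m\omega\dot{y} + m\omega^2(R+x) = -\frac{x}{l}\,T - 2m\omega\dot{y} + m\omega^2(R+x),$$
$$m\ddot{y} = -T_y + 2m\omega\dot{x} + m\omega^2 y = -\frac{y}{l}\,T + 2m\omega\dot{x} + m\omega^2 y, \tag{3.29}$$
$$m\ddot{z} = -T_z + mg = -\frac{z}{l}\,T + mg.$$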
In the following, we will assume that the pendulum length is large, i.e., for small
amplitudes z ≈ l (ż = z̈ = 0). The string tension is then given by T = mg:
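$$\ddot{x} = \left(\omega^2 - \frac{g}{l}\right)x - 2\omega\dot{y} + \omega^2 R,$$
$$\ddot{y} = \left(\omega^2 - \frac{g}{l}\right)y + 2\omega\dot{x}. \tag{3.30}$$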
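For the homogeneous solution of the differential equation (3.30) one gets with the
ansatz $u_{hom} = c\,e^{\gamma t}$ the characteristic polynomial
$$\gamma^2 - \left(\omega^2 - \frac{g}{l}\right) - 2i\omega\gamma = 0.$$
$$u_{hom} = c_1\exp\!\left[i\left(\omega+\sqrt{g/l}\right)t\right] + c_2\exp\!\left[i\left(\omega-\sqrt{g/l}\right)t\right].$$
$$u_{part} = \frac{\omega^2 R}{(g/l)-\omega^2}.$$
$$u = u_{hom} + u_{part} = c_1\exp\!\left[i\left(\omega+\sqrt{g/l}\right)t\right] + c_2\exp\!\left[i\left(\omega-\sqrt{g/l}\right)t\right] + \frac{\omega^2 R}{(g/l)-\omega^2}. \tag{3.31}$$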
With the initial condition x(0) = x0 and y(0) = ẋ(0) = ẏ(0) = 0, it follows for c1
and c2
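$$c_1 = \frac{\sqrt{g/l}-\omega}{2\sqrt{g/l}}\left(x_0 + \frac{\omega^2 R}{\omega^2 - g/l}\right), \qquad c_2 = \frac{\sqrt{g/l}+\omega}{2\sqrt{g/l}}\left(x_0 + \frac{\omega^2 R}{\omega^2 - g/l}\right). \tag{3.32}$$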
By decomposing the solution (3.31) into real and imaginary parts, the solutions for
x(t) and y(t) can be found.
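$$\begin{aligned}
x &= c_1\cos\!\left(\omega+\sqrt{\tfrac{g}{l}}\right)t + c_2\cos\!\left(\omega-\sqrt{\tfrac{g}{l}}\right)t + \frac{\omega^2 R}{g/l-\omega^2}\\
&= \left(x_0 + \frac{\omega^2 R}{\omega^2 - g/l}\right)\sqrt{\tfrac{l}{g}}\left\{\sqrt{\tfrac{g}{l}}\cos\sqrt{\tfrac{g}{l}}\,t\,\cos\omega t + \omega\sin\sqrt{\tfrac{g}{l}}\,t\,\sin\omega t\right\} + \frac{\omega^2 R}{g/l-\omega^2}\\
&= \left(x_0 + \frac{\omega^2 R}{\omega^2 - g/l}\right)\left\{\cos\sqrt{\tfrac{g}{l}}\,t\,\cos\omega t + \omega\sqrt{\tfrac{l}{g}}\,\sin\sqrt{\tfrac{g}{l}}\,t\,\sin\omega t\right\} + \frac{\omega^2 R}{g/l-\omega^2},
\end{aligned} \tag{3.33}$$
$$\begin{aligned}
y &= c_1\sin\!\left(\omega+\sqrt{\tfrac{g}{l}}\right)t + c_2\sin\!\left(\omega-\sqrt{\tfrac{g}{l}}\right)t\\
&= \left(x_0 + \frac{\omega^2 R}{\omega^2 - g/l}\right)\left\{\sin\omega t\,\cos\sqrt{\tfrac{g}{l}}\,t - \omega\sqrt{\tfrac{l}{g}}\,\sin\sqrt{\tfrac{g}{l}}\,t\,\cos\omega t\right\}.
\end{aligned} \tag{3.34}$$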
Because ω ≪ √(g/l), ω√(l/g) ≪ 1. From this it follows that
$$x = x_0\cos\sqrt{\tfrac{g}{l}}\,t\;\cos\omega t,$$
$$y = x_0\cos\sqrt{\tfrac{g}{l}}\,t\;\sin\omega t.$$
This describes a rotation of the pendulum plane with the frequency ω (as for Foucault’s
pendulum).
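The general solution (3.31) with the constants (3.32) can be verified directly. The sketch below is an illustration rather than part of the original solution: it builds u(t) = x + iy in complex arithmetic, then checks the initial conditions and the residual of the complex form of (3.30), ü = (ω² − g/l)u + 2iω u̇ + ω²R. The function name and the sample values of x0, R, and ω are assumptions.

```python
import cmath
import math

def train_pendulum(x0, R, omega, g=9.81, l=1.0):
    """Return u(t), du/dt, and the residual of the complex equation of motion."""
    s = math.sqrt(g / l)
    u_part = omega**2 * R / (g / l - omega**2)      # particular solution
    amp = x0 + omega**2 * R / (omega**2 - g / l)    # common factor in (3.32)
    c1 = (s - omega) / (2 * s) * amp
    c2 = (s + omega) / (2 * s) * amp
    g1, g2 = 1j * (omega + s), 1j * (omega - s)     # roots of the characteristic polynomial

    def u(t):
        return c1 * cmath.exp(g1 * t) + c2 * cmath.exp(g2 * t) + u_part

    def du(t):
        return c1 * g1 * cmath.exp(g1 * t) + c2 * g2 * cmath.exp(g2 * t)

    def ddu(t):
        return c1 * g1**2 * cmath.exp(g1 * t) + c2 * g2**2 * cmath.exp(g2 * t)

    def residual(t):
        # Vanishes (up to rounding) if u solves the equation of motion.
        return ddu(t) - (omega**2 - g / l) * u(t) - 2j * omega * du(t) - omega**2 * R

    return u, du, residual

if __name__ == "__main__":
    u, du, residual = train_pendulum(x0=0.1, R=100.0, omega=0.2)
    print(abs(u(0) - 0.1), abs(du(0)), abs(residual(1.0)))
```

The real and imaginary parts of u(t) are x(t) and y(t), so the same function can be used to plot the trajectory for any admissible parameter set.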
The pendulum period T can now be obtained from the following consideration: For
t = 0, the brace in (3.33) equals 1. For t = (π/2)√(l/g) + t′, where t′ ≪ (π/2)√(l/g),
the brace vanishes for the first time, which corresponds to a quarter of T. By expanding
the brace, for t′ we find
$$t' = \frac{\pi}{2}\left(\frac{l}{g}\right)^{3/2}\omega^2,$$
$$T = 4\left[\frac{\pi}{2}\sqrt{\frac{l}{g}} + \frac{\pi}{2}\left(\frac{l}{g}\right)^{3/2}\omega^2\right] = 2\pi\sqrt{\frac{l}{g}}\left(1 + \frac{\omega^2}{g/l}\right).$$
which suggests the conclusion that the Coriolis force should not be neglected from the
outset in this consideration.
EXERCISE
Problem. Explain to which directions the winds from north, east, south, and west
will be deflected in the northern hemisphere. Explain the formation of cyclones.
Solution. We derive the equation of motion for a parcel of air P that moves near the
earth’s surface. The X, Y, Z system is considered as an inertial system; i.e., we shall
not take the rotation of the earth about the sun into account. Moreover, we assume the
air mass is moving at constant height; i.e., there is no velocity component along the
z-direction (ż = 0). The centrifugal acceleration shall also be neglected.
With the assumptions mentioned above, the equation of motion of the particle is
defined by the differential equation
The force −2mω × ṙ for the north and south winds exactly equals zero. For the
west or east winds the force points along e3 or the opposite direction. Accordingly
the air masses are pushed away from or toward the ground. This force component is
however very small compared to the gravitational force mg, which also points toward
the negative e3 -direction.
If we consider an air parcel moving in the southern hemisphere, then λ > π/2 and
cos λ is negative. Thus, a west wind is here deflected to the north, a north wind toward
the east, and a south wind toward the west.
EXERCISE
Problem. A tube rotates with constant angular velocity ω (relative motion) and is
inclined from the rotational axis by the angle α. A mass m inside of the tube is pulled
inward with constant velocity c by a string.
(a) What forces act on the mass?
(b) What work is performed by these forces while the mass moves from x1 to x2 ?
(Calculate the energy balance!) Numerical values: m = 5 kg, α = 45°, x1 = 1 m,
x2 = 5 m, ω = 2 s⁻¹, c = 5 m/s, g = 9.81 m/s².
Solution. (a) The mass m within the tube performs a relative motion with constant
velocity c = c(−1, 0, 0), and thus the resulting acceleration is composed of the guiding
acceleration af in the tube, the relative acceleration ar , and the Coriolis acceleration ac :
$$\mathbf{a} = \mathbf{a}_f + \mathbf{a}_r + \mathbf{a}_c.$$
In the present problem, we are dealing with a rotation about a fixed axis eω =
(cos α, 0, sin α) with a constant angular velocity ω, so that a0 = at = 0, and the guid-
ing acceleration is therefore
i.e., the guiding acceleration obviously consists of the centripetal acceleration bn only.
The relative acceleration ar = 0, since the relative velocity is constant. The Coriolis
acceleration ac , defined by
ac = 2ωeω × c,
is therefore
a is the result of the following forces acting on the mass m (see Fig. 3.9):
F = m · a, (3.37)
one can determine the unknown quantities S, N1 and N2 : From (3.35), (3.36)
and (3.37) it follows that
S becomes negative if x < (g cos α)/(ω² sin α); i.e., the mass m would have to be
decelerated additionally within the tube if a constant velocity is to be maintained.
(b) During the motion, work is performed by the string force S, by the gravitational
force G, and by the Coriolis force N2 ; N1 is the normal force. The work performed by
the string force is
$$W_s = \int dW_s = -\int_{x_1}^{x_2} S(x)\,dx = \frac{m}{2}\,\omega^2\sin^2\alpha\,(x_1^2 - x_2^2) - mg\cos\alpha\,(x_1 - x_2). \tag{3.38}$$
$$W_G = \int_{x_1}^{x_2} dW_G = mg\cos\alpha\,(x_1 - x_2). \tag{3.39}$$
The work performed by the Coriolis force, taking into account dx/dt = −c, is
$$W_{N_2} = \int dW_{N_2} = -\int N_2\,ds = -\int N_2\,x\sin\alpha\,d\varphi = -\int_{x_1}^{x_2} N_2\,x\sin\alpha\,\frac{d\varphi}{dt}\,\frac{dt}{dx}\,dx$$
Insertion of the numerical values given in the formulation of the problem yields
To check the results, one uses the fact that the sum of the work performed by the
external forces must be equal to the difference of the kinetic energies (energy balance)
E = WS + WN2 + WG ,
So far we have considered only the mechanics of a mass point. We now proceed to
describe systems of mass points. A particle system is called a continuum if it consists
of so great a number of mass points that a description of the individual mass points is
not feasible. On the other hand, a particle system is called discrete if it consists of a
manageable number of mass points.
An idealization of a body (continuum) is the rigid body. The notion of a rigid body
implies that the distances between the individual points of the body are fixed, so that
these points cannot move relative to each other. If one considers the relative motion of
the points of a body, one speaks of a deformable medium.
4 Degrees of Freedom
(xi , yi , zi ), i = 1, . . . , n.
We look for the number of degrees of freedom of a rigid body that can freely move.
To describe a rigid body in space, one must know 3 noncollinear points of it. Hence,
one has 9 coordinates:
which, however, are mutually dependent. Since by definition we are dealing with a
rigid body, the distances between any two points are constant. One obtains
1 Michel Chasles, French mathematician, b. in Épernon Nov. 15, 1793–d. Dec. 18, 1880, Paris.
Banker in Chartres; from 1841 to 1851 professor at the École Polytechnique; after 1846 professor
at the Sorbonne in Paris. Chasles is independently of J. Steiner one of the founders of the synthetic
geometry. His Aperçu historique by far surpassed the older representations of the development of
geometry and stimulated new geometrical research in his age.
We now consider the rigid body with one point fixed in space. The motion is com-
pletely described if we know the coordinates of two points
and adopt the fixed point as the origin of the coordinate system. Since the body is
rigid, we have
From these 3 equations one can eliminate 3 coordinates, so that the remaining 3 coor-
dinates describe the 3 degrees of freedom of rotation.
If a particle moves along a given curve in space, the number of degrees of freedom
is f = 1. The curve can be written in the parametric form
$$x = x(s), \qquad y = y(s), \qquad z = z(s),$$
Fig. 4.2. Example of the parametric form: A caterpillar creeps on a blade of grass
i.e., for a given curve the position of the particle is fully determined by specifying one
parameter value s.
A deformable medium or a fluid has an infinite number of degrees of freedom (e.g.,
a vibrating string, a flexible bar, a drop of fluid).
5 Center of Gravity
Definition Let a system consist of n particles with the position vectors rν and the
masses mν for ν = (1, . . . , n). The center of gravity of this system is defined as point S
with the position vector rs :
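$$\mathbf{r}_s = \frac{m_1\mathbf{r}_1 + m_2\mathbf{r}_2 + \cdots + m_n\mathbf{r}_n}{m_1 + m_2 + \cdots + m_n} = \frac{\sum_{\nu=1}^{n} m_\nu\mathbf{r}_\nu}{\sum_{\nu=1}^{n} m_\nu},$$
$$\mathbf{r}_s = \frac{1}{M}\sum_{\nu=1}^{n} m_\nu\mathbf{r}_\nu,$$
where $M = \sum_{\nu=1}^{n} m_\nu$ is the total mass of the system, and
$$M\mathbf{r}_s = \sum_{\nu=1}^{n} m_\nu\mathbf{r}_\nu$$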
We consider three systems of masses with the centers of gravity r1 , r2 , r3 and the total
masses M1 , M2 , M3 . The system 1 consists of the mass M1 = (m11 + m12 + m13 +· · · )
with the position vectors r11 , r12 , r13 , . . .; the systems 2 and 3 are analogous. Then by
definition the centers of gravity are
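$$\text{system 1:}\quad \mathbf{r}_{s1} = \frac{\sum_i m_{1i}\mathbf{r}_{1i}}{\sum_i m_{1i}},$$
$$\text{system 2:}\quad \mathbf{r}_{s2} = \frac{\sum_i m_{2i}\mathbf{r}_{2i}}{\sum_i m_{2i}},$$
$$\text{system 3:}\quad \mathbf{r}_{s3} = \frac{\sum_i m_{3i}\mathbf{r}_{3i}}{\sum_i m_{3i}}.$$
For the center of gravity of the total system we have the same relation:
$$\mathbf{r}_s = \frac{\sum_i m_{1i}\mathbf{r}_{1i} + \sum_i m_{2i}\mathbf{r}_{2i} + \sum_i m_{3i}\mathbf{r}_{3i}}{\sum_i m_{1i} + \sum_i m_{2i} + \sum_i m_{3i}}$$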
EXERCISE
Problem. Find the coordinates of the center of gravity for a system of 3 mass points.
m1 = 1 g, m2 = 3 g, m3 = 10 g,
r1 = (1, 5, 7) cm, r2 = (−1, 2, 3) cm, r3 = (0, 4, 5) cm.
$$\mathbf{r}_s = \frac{1}{14}\,(1 - 3,\; 5 + 3\cdot 2 + 10\cdot 4,\; 7 + 3\cdot 3 + 10\cdot 5)\ \text{cm}$$
or, recalculated,
$$\mathbf{r}_s = \frac{1}{14}\,(-2,\; 51,\; 66)\ \text{cm}.$$
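The same arithmetic can be reproduced with a few lines of code. This is a minimal sketch of the definition r_s = (1/M) Σ m_ν r_ν; the function name `center_of_mass` is an assumption for this illustration. It also shows the cluster property: replacing a subsystem by its total mass placed at its own center of gravity leaves the overall result unchanged.

```python
def center_of_mass(masses, positions):
    """Return the center of gravity of point masses (positions are coordinate tuples)."""
    total = sum(masses)
    dim = len(positions[0])
    return tuple(
        sum(m * r[k] for m, r in zip(masses, positions)) / total
        for k in range(dim)
    )

if __name__ == "__main__":
    masses = [1.0, 3.0, 10.0]                      # grams
    positions = [(1, 5, 7), (-1, 2, 3), (0, 4, 5)]  # centimeters
    print(center_of_mass(masses, positions))       # components of (-2, 51, 66)/14 cm

    # Cluster property: first combine masses 1 and 2, then add mass 3.
    sub = center_of_mass(masses[:2], positions[:2])
    print(center_of_mass([masses[0] + masses[1], masses[2]], [sub, positions[2]]))
```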
EXERCISE
Problem. Find the center of gravity of a pyramid with edge length a and a homoge-
neous mass distribution.
Solution. Because of the homogeneous mass distribution, the mass density ρ(r) =
ρ0 = constant. The base of the pyramid is represented by the equation
x + y + z = a.
The coordinate axes are the edges, and the origin is the top. Then
$$\mathbf{r}_s = \frac{\int_V \rho_0\,\mathbf{r}\,dV}{\int_V \rho_0\,dV} = \frac{\int_V \mathbf{r}\,dV}{\int_V dV}, \qquad dV = dx\,dy\,dz.$$
The integration limits are evident from Fig. 5.2. The integration runs over z along
the column from z = 0 to z = a − x − y; over y along the prism from y = 0 to
y = a − x; and over x along the pyramid from x = 0 to x = a:
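$$\mathbf{r}_s = \frac{\int_V \mathbf{r}\,dV}{\int_V dV} = \frac{\int_{x=0}^{a}\int_{y=0}^{a-x}\int_{z=0}^{a-x-y} \mathbf{r}\,dz\,dy\,dx}{\int_{x=0}^{a}\int_{y=0}^{a-x}\int_{z=0}^{a-x-y} dz\,dy\,dx},$$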
Fig. 5.2.
$$\int_V \mathbf{r}\,dV = \int_0^{a}\int_0^{a-x}\left(x(a-x-y),\; y(a-x-y),\; \tfrac{1}{2}(a-x-y)^2\right)dy\,dx.$$
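The remaining double integral can be evaluated numerically as a cross-check. The sketch below is an illustration under stated assumptions: it applies a nested Simpson rule (a numerical method chosen here, not used in the text) to the integrand above and divides by the volume. Because all integrands are polynomials of degree three or less, Simpson's rule is exact up to rounding, and the result agrees with the standard value (a/4, a/4, a/4) for this pyramid. The function names are hypothetical.

```python
def simpson(f, lo, hi, n=4):
    """Composite Simpson rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

def pyramid_centroid(a):
    """Centroid of the homogeneous pyramid bounded by x + y + z = a and the coordinate planes."""
    def moment(component):
        def outer(x):
            def inner(y):
                z_top = a - x - y  # upper limit of the z-integration
                return (x * z_top, y * z_top, 0.5 * z_top**2)[component]
            return simpson(inner, 0.0, a - x)
        return simpson(outer, 0.0, a)

    volume = simpson(lambda x: simpson(lambda y: a - x - y, 0.0, a - x), 0.0, a)
    return tuple(moment(c) / volume for c in range(3))

if __name__ == "__main__":
    print(pyramid_centroid(2.0))
```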
EXERCISE
Problem. Find the center of gravity of a semicircular disk of radius a. The surface
density is constant.
Fig. 5.3.
r = a, 0 ≤ ϕ ≤ π.
$$y_s = \frac{2a^3/3}{\pi a^2/2} = \frac{4a}{3\pi},$$
i.e., the center of gravity lies at rs = (0, 4a/(3π)).
EXERCISE
Fig. 5.4.
Solution. (a) Because of symmetry, the center of gravity is on the z-axis, i.e., xs =
ys = 0. For the z-component, we have
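$$z_s = \frac{\int_k z\,dV}{\int_k dV} = \frac{\int_k z\,dV}{(1/3)\pi a^2 h}.$$
$$\begin{aligned}
\int_k z\,dV &= \int_{\varphi=0}^{2\pi}\int_{\rho=0}^{a}\int_{z=0}^{h(1-\rho/a)} z\,\rho\,d\rho\,d\varphi\,dz\\
&= 2\pi\,\frac{1}{2}\,h^2\int_{\rho=0}^{a}\left(1-\frac{\rho}{a}\right)^2\rho\,d\rho\\
&= \pi h^2\left[\frac{\rho^2}{2} - \frac{2\rho^3}{3a} + \frac{\rho^4}{4a^2}\right]_0^{a} = \pi\,\frac{a^2h^2}{12},
\end{aligned}$$
$$z_s = \frac{\pi h^2 a^2\cdot 3}{12\,\pi a^2 h} = \frac{1}{4}\,h.$$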
Thus, the center of gravity of a circular cone is independent of the radius of the base.
(b) See Fig. 5.5. One then has
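$$z_s = \frac{\int_{cone} z\,dV + \int_{hemisphere} z\,dV}{V_{cone} + V_{hemisphere}} = \frac{\tfrac{1}{12}\pi h^2a^2 + \int_{hemisphere} z\,dV}{(\pi/3)(h+2a)a^2},$$
$$\begin{aligned}
\int_{hemisphere} z\,dV &= \int_{\varphi=0}^{2\pi}\int_{\rho=0}^{a}\int_{z=-\sqrt{a^2-\rho^2}}^{0} z\,\rho\,d\varphi\,d\rho\,dz\\
&= \pi\int_{\rho=0}^{a}(\rho^2-a^2)\,\rho\,d\rho = \pi\left[\frac{\rho^4}{4} - \frac{a^2\rho^2}{2}\right]_0^{a} = -\frac{\pi a^4}{4}.
\end{aligned}$$
Fig. 5.5. Because of symmetry the center of gravity is again on the z-axis
$$z_s = \frac{\tfrac{1}{12}\pi a^2h^2 - \tfrac{1}{4}\pi a^4}{\tfrac{1}{3}a^2\pi\,(h+2a)} = \frac{1}{4}\,\frac{h^2-3a^2}{h+2a},$$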
ys = 0, xs = 0.
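The result for the combined body can also be obtained from the centers of gravity of its two parts, in the spirit of the cluster property. The sketch below assumes the partial results stated in the solution: the cone has volume πa²h/3 with z-centroid h/4, and the hemisphere has volume 2πa³/3 with z-centroid −3a/8 (the latter follows from the integral −πa⁴/4 divided by the volume). The function names are illustrative.

```python
import math

def composite_centroid_z(parts):
    """z-centroid of a body composed of (volume, z_centroid) parts of equal density."""
    total_volume = sum(v for v, _ in parts)
    return sum(v * z for v, z in parts) / total_volume

def cone_with_hemisphere(a, h):
    """z-centroid of a cone (height h, base radius a) with a hemisphere attached below z = 0."""
    cone = (math.pi * a**2 * h / 3, h / 4)
    hemisphere = (2 * math.pi * a**3 / 3, -3 * a / 8)
    return composite_centroid_z([cone, hemisphere])

def closed_form(a, h):
    """Closed-form result z_s = (h^2 - 3 a^2) / (4 (h + 2a)) from the exercise."""
    return (h**2 - 3 * a**2) / (4 * (h + 2 * a))

if __name__ == "__main__":
    print(cone_with_hemisphere(1.0, 2.0), closed_form(1.0, 2.0))
```

For h = √3·a the two contributions cancel and the center of gravity lies exactly in the common base plane z = 0.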
EXERCISE
Problem.
(a) Show that any positional variation of a rigid disk in the plane can be represented
by a pure rotation about a point at a finite distance or at infinity. (Hint: The position
of the disk is already fixed by specifying two points A and B.)
(b) Show by “differential” variation of position: The planar motion of a rigid disk
can be described at any moment by a pure rotation about a point varying with the
motion, the so-called momentary center. The geometric locus of these momentary
centers is called the pole path or the fixed pole curve.
(c) Calculate the fixed pole curve r(ϕ) for a ladder sliding on two perpendicular walls.
(d) Calculate the fixed pole curve r(ϕ) for a bar of length l that can move in the guide
shown in Fig 5.6.
Fig. 5.6.
Fig. 5.7.
Solution. (a) For describing the motion of the disk we take the (arbitrary) straight line
AB; it turns into the straight line A1 B1 . The intersection M of the mid-perpendiculars
onto AA1 and BB1 is the desired center of rotation.
Argument: The triangles ABM and A1 B1 M are congruent. Hence the motion can
be considered as a rotation of the triangle ABM (involving the straight line AB) about
M by the angle ϕ.
(b) For an infinitely small rotation by dϕ the same considerations hold. But now
the individual turning points vary. These are the so-called momentary centers. In a
differential rotation about a momentary center M, for any point the path element dr
and the velocity vector v point along the same direction and are perpendicular to the
connecting lines to M (see Fig. 5.8). The geometric locus of the momentary centers is
called the pole curve.
Fig. 5.8.
(c) According to (b), one gets Fig. 5.9. The straight line l = AB forms a diagonal
of the rectangle OBMA. Since the diagonals of a rectangle are equal, M must move along
a circle of radius l.
Fig. 5.9.
Thus, in polar coordinates the equation of the fixed pole curve r(ϕ) reads
1 − (a 2 / l 2 )(1 − sin ϕ)2
r = r(ϕ) = OM = −a + l .
cos2 ϕ
Fig. 5.10.
EXAMPLE
vides information both on the bound state as well as on the unbound state (scattering
state) of a system.
The study of the unbound states of a system became of great importance in modern
physics. One learns about the mutual interaction of two objects by scattering them
off each other and observing the path of the scattered particles as a function of the
incident energy and of other path parameters. The objects studied in this way are usu-
ally molecules, atoms, atomic nuclei, and elementary particles. Scattering processes
in these microscopic regions must be described by quantum mechanics. However, one
can obtain information on scattering processes by means of classical mechanics which
is confirmed by a quantum mechanical calculation. Moreover, one may learn the meth-
ods for describing scattering phenomena by studying the classical case.
The schematic arrangement of a scattering experiment is shown in Fig. 5.11. We
consider a homogeneous beam of incoming particles (projectiles) of the same mass
and energy. The force acting on a particle is assumed to drop to zero at large distances
from the scattering center. This guarantees that the interaction is somehow localized.
Let the initial velocity v0 of each projectile relative to the force center be so large that
the system is in the unbound state, i.e., for t → ∞ the distance between the two scat-
tering particles shall become arbitrarily large. For a repulsive potential this happens
for any value of v0 ; this does not hold for an attractive potential.
The interaction of a projectile with the target particle manifests itself by the fact
that the flight direction after the collision differs from that before the collision (the
usage of the words “before” and “after” in this context presupposes a more or less
finite range of the interaction potential).
$$d\Omega = \sin\vartheta\,d\vartheta\,d\varphi, \tag{5.2}$$
Often the differential cross section will be independent of the azimuth angle ϕ (we
shall restrict ourselves to this case), and one can define
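$$\frac{d\sigma}{d\vartheta} := 2\pi\sin\vartheta\,\frac{d\sigma}{d\Omega}(\vartheta,\varphi); \tag{5.4}$$
$$\sigma_{tot} = \int d\Omega\,\frac{d\sigma}{d\Omega}(\vartheta,\varphi) = \int_0^{2\pi} d\varphi\int_0^{\pi} d\vartheta\,\sin\vartheta\,\frac{d\sigma}{d\Omega} = \int_0^{\pi} d\vartheta\,\frac{d\sigma}{d\vartheta}(\vartheta). \tag{5.6}$$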
It depends only on the kinds of particles, and possibly on the incidence energy:
(3) Introduction of the collision parameter, its relation to the scattering angle,
and the formula for the differential cross section
It is clear that the scattering angle ϑ at fixed energy can depend only on the collision
parameter b, since the initial position and the initial velocity of the particle are then
specified. The collision parameter is defined as the perpendicular distance of the asymptotic
incidence direction of the projectile from the initial position of the scatterer. Hence,
for E = constant the scattering angle is
ϑ = ϑ(b). (5.8)
Since the movements in classical mechanics are determined, this connection is unam-
biguous. (This statement is no longer valid in quantum mechanics.) Thus,
b = b(ϑ), (5.9)
b ≤ b ≤ b + db
dR = 2π sin ϑ dϑ.
This has a practical meaning inasmuch as cross sections are always measured in
the laboratory system S where the target is at rest, but the calculation of b(ϑ′) often
simplifies in the center-of-mass system S′. We therefore derive a relation between
these two cross sections. In the following the primed and nonprimed quantities shall
always refer to these two systems.
First, we investigate the relation between the scattering angles ϑ and ϑ′. Let $\mathbf{v}_1^{f}$
and $\mathbf{v}_1'^{f}$ be the asymptotic final velocity (f = final) of the projectile of mass m1 in the
system S and S′, respectively. V is the relative velocity of the two systems.
From Fig. 5.14, one immediately sees that
$$\tan\vartheta = \frac{v_1'^{f}\sin\vartheta'}{v_1'^{f}\cos\vartheta' + V} = \frac{\sin\vartheta'}{\cos\vartheta' + V/v_1'^{f}},$$
Fig. 5.14.
where V stands for the magnitude of V, and analogously for $v_1'^{f}$. Furthermore,
$$m_1 v_1^{i} = (m_1+m_2)\,V,$$
where $v_1^{i}$ is the initial velocity of the projectile in the laboratory system (i = initial),
and
$$\mathbf{v}_1^{i} = \mathbf{V} + \mathbf{v}_1'^{i}.$$
Because $m_1 v_1'^{i} = m_2 v_2'^{i}$ and $m_1 v_1'^{f} = m_2 v_2'^{f}$ for elastic scattering
($E_{kin}'^{i} = E_{kin}'^{f}$), $v_1'^{i} = v_1'^{f}$, and therefore,
$$\frac{V}{v_1'^{f}} = \frac{m_1}{m_2}.$$
Hence,
$$\tan\vartheta = \frac{\sin\vartheta'}{\cos\vartheta' + m_1/m_2}. \tag{5.12}$$
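This relation defines the function ϑ′(ϑ); we will not give it explicitly. If a projectile
in S is scattered into the ring dR with the “radius” ϑ and the width dϑ (see
Fig. 5.12), it will in S′ be scattered into a ring dR′ with the “radius” ϑ′(ϑ) and the
width dϑ′ = (dϑ′/dϑ)dϑ. The number of particles scattered to dR in S and to dR′
in S′ is therefore identical, and with (5.5), we get
$$\frac{d\sigma}{d\vartheta}(\vartheta)\cdot d\vartheta = \frac{d\sigma}{d\vartheta'}(\vartheta')\cdot d\vartheta' = \frac{d\sigma}{d\vartheta'}(\vartheta')\,\frac{d\vartheta'}{d\vartheta}\,d\vartheta,$$
thus,
$$\frac{d\sigma}{d\vartheta}(\vartheta) = \frac{d\sigma}{d\vartheta'}(\vartheta')\,\frac{d\vartheta'}{d\vartheta}, \tag{5.13}$$
or
$$\frac{d\sigma}{d\Omega}(\vartheta) = \frac{d\sigma}{d\Omega'}(\vartheta')\,\frac{\sin\vartheta'}{\sin\vartheta}\,\frac{d\vartheta'}{d\vartheta}. \tag{5.14}$$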
This is the desired connection.
The difference between the scattering angles and the cross sections, respectively, is
obviously determined by the mass ratio of projectile and target particle (see (5.12)).
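Relation (5.12) is easy to evaluate numerically. The sketch below is an illustration with a hypothetical function name; it uses `math.atan2` so that the laboratory angle is returned on the correct branch in [0, π] even when the denominator cos ϑ′ + m1/m2 becomes negative.

```python
import math

def lab_angle(theta_cm, m1, m2):
    """Laboratory scattering angle from the center-of-mass angle via (5.12).

    m1 is the projectile mass and m2 the target mass; angles are in radians.
    """
    return math.atan2(math.sin(theta_cm), math.cos(theta_cm) + m1 / m2)

if __name__ == "__main__":
    # Equal masses: the laboratory angle is half the center-of-mass angle.
    print(lab_angle(math.pi / 2, 1.0, 1.0))
    # Very heavy target: both angles nearly coincide.
    print(lab_angle(2.0, 1e-9, 1.0))
```

The two limiting cases illustrate the closing remark: the difference between the two systems is controlled entirely by the mass ratio m1/m2.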
EXERCISE
$$F = k\,r^{-2}.$$
(a) Calculate the scattering angle as a function of b and of the initial velocity of the
particle.
(b) What are the differential and the total cross sections?
Solution. (a) From the discussion of the Kepler problem, we know that the underly-
ing force law has the form
$$F = -\frac{k}{r^2}. \tag{5.15}$$
The minus sign means that the force is attractive. The path equation reads1
$$\frac{1}{r} = \frac{mk}{l^2}\left(1 + \sqrt{1 + \frac{2El^2}{mk^2}}\,\cos(\theta-\theta')\right) \tag{5.16}$$
1 See W. Greiner: Classical Mechanics: Point Particles and Relativity, 1st ed., Springer, Berlin
(2004).
$$F = \frac{k}{r^2}. \tag{5.20}$$
The force is repulsive. For illustration, we consider the scattering of charged particles
by a Coulomb field (e.g., atomic nuclei by atomic nuclei, protons by nuclei, or electrons
by electrons, etc.). The scattering force center is created by a fixed charge −Ze
and acts on the particle with the charge −Z′e. The force is then
$$F = \frac{ZZ'e^2}{r^2}. \tag{5.21}$$
If we set k = −ZZ′e², we can directly take over the equations for an attractive potential.
The path equation (5.18) now reads
$$\frac{1}{r} = -\frac{mZZ'e^2}{l^2}\,(1 + \varepsilon\cos\theta). \tag{5.22}$$
The coordinates were rotated so that θ′ = 0. For ε (see (5.17)) it follows that
$$\varepsilon = \sqrt{1 + \frac{2El^2}{m(ZZ'e^2)^2}} = \sqrt{1 + \left(\frac{2Eb}{ZZ'e^2}\right)^2}. \tag{5.23}$$
$$l = m\,b\,v_\infty = b\sqrt{2mE}, \qquad E = \frac{1}{2}\,m v_\infty^2 \tag{5.24}$$
$$\cos\theta < -\frac{1}{\varepsilon} \tag{5.25}$$
(see Fig. 5.15). Note that the force center for repulsive forces is in the outer focal point
(see Fig. 5.16).
Fig. 5.15. Region of θ for repulsive Coulomb scattering
The change of θ that occurs if the particle comes from infinity, and is then scattered
and moves to infinity again, equals the angle φ between the asymptotes, which is the
supplement to the scattering angle θ (see Fig. 5.16).
The relation cos(φ/2) = 1/ε can be proved as follows: The two limiting angles θ1 and
θ2 satisfy the condition
$$\cos\theta_1 = -\frac{1}{\varepsilon}, \qquad \cos\theta_2 = -\frac{1}{\varepsilon}. \tag{5.27}$$
$$\sin\theta_1 = -\sin\theta_2, \qquad \cos\frac{\theta_1}{2} = -\cos\frac{\theta_2}{2}. \tag{5.28}$$
and therefore,
$$\sin\frac{\theta_1}{2} = \sin\frac{\theta_2}{2}. \tag{5.30}$$
By inserting dσ/dΩ(θ) from (5.40), one quickly realizes that the expression diverges
because of the strong singularity at θ = 0. This is due to the long-range nature of the
Coulomb force. If one uses potentials which decrease faster than 1/r, this singularity
disappears.
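The chain from the collision parameter to the scattering angle can be sketched in a few lines. The code below is an illustration under stated assumptions: it takes ε from (5.23), uses cos(φ/2) = 1/ε for the angle φ between the asymptotes, and obtains the scattering angle as the supplement π − φ. The symbol K stands for ZZ′e² in Gaussian units, and the function names are hypothetical.

```python
import math

def eccentricity(E, b, K):
    """Eccentricity from (5.23); K stands for Z Z' e^2 (an assumption of this sketch)."""
    return math.sqrt(1.0 + (2.0 * E * b / K) ** 2)

def scattering_angle(E, b, K):
    """Scattering angle as the supplement of the angle phi between the asymptotes."""
    eps = eccentricity(E, b, K)
    phi = 2.0 * math.acos(1.0 / eps)   # from cos(phi/2) = 1/eps
    return math.pi - phi

if __name__ == "__main__":
    theta = scattering_angle(E=1.0, b=1.0, K=2.0)
    print(theta, math.tan(theta / 2))
```

Combining the two relations gives tan(Θ/2) = K/(2Eb), so small collision parameters lead to large deflections and the deflection vanishes only for b → ∞. This slow decay with b is the origin of the divergence of the total cross section mentioned above.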
EXERCISE
U =0 (r > a),
U = −U0 (r ≤ a).
Solution. The straight path of the particle is broken when entering and leaving the
field. We have the relation
$$\frac{\sin\alpha}{\sin\beta} = n, \tag{5.47}$$
$$n = \sqrt{1 + \frac{2U_0}{m v_\infty^2}}.$$
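$$\chi = 2(\alpha-\beta)$$
$$\Rightarrow\quad \frac{\sin\beta}{\sin\alpha} = \frac{\sin(\alpha-\chi/2)}{\sin\alpha} = \frac{\sin\alpha\cos(\chi/2) - \cos\alpha\sin(\chi/2)}{\sin\alpha} = \cos\frac{\chi}{2} - \cot\alpha\,\sin\frac{\chi}{2} = \frac{1}{n}. \tag{5.48}$$
$$a\sin\alpha = \rho. \tag{5.49}$$
$$\rho = a\,\frac{n\sin(\chi/2)}{(n^2 - 2n\cos(\chi/2) + 1)^{1/2}} \tag{5.53}$$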
with respect to χ .
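$$\begin{aligned}
\Rightarrow\quad \frac{d\rho}{d\chi} &= \frac{an}{2}\,\frac{\cos(\chi/2)}{(n^2 - 2n\cos(\chi/2) + 1)^{1/2}} - \frac{1}{2}\,\frac{an\sin(\chi/2)\cdot n\sin(\chi/2)}{(n^2 - 2n\cos(\chi/2) + 1)^{3/2}}\\
&= \frac{\frac{an}{2}\cos\frac{\chi}{2}\left(n^2 + 1 - 2n\cos\frac{\chi}{2}\right) - \frac{1}{2}an^2\sin^2\frac{\chi}{2}}{\left(n^2 + 1 - 2n\cos\frac{\chi}{2}\right)^{3/2}}\\
&= \frac{\frac{a}{2}n^3\cos\frac{\chi}{2} + \frac{an}{2}\cos\frac{\chi}{2} - an^2\cos^2\frac{\chi}{2} - \frac{1}{2}an^2\sin^2\frac{\chi}{2}}{\left(n^2 + 1 - 2n\cos\frac{\chi}{2}\right)^{3/2}}\\
&= \frac{an}{2}\,\frac{n^2\cos\frac{\chi}{2} + \cos\frac{\chi}{2} - n - n\cos^2\frac{\chi}{2}}{\left(n^2 + 1 - 2n\cos\frac{\chi}{2}\right)^{3/2}}\\
&= \frac{an}{2}\,\frac{\left(n\cos\frac{\chi}{2} - 1\right)\left(n - \cos\frac{\chi}{2}\right)}{\left(n^2 + 1 - 2n\cos\frac{\chi}{2}\right)^{3/2}}
\end{aligned} \tag{5.54}$$
$$\begin{aligned}
\Rightarrow\quad \sigma(\chi) = \frac{d\sigma}{d\Omega} &= \frac{\rho}{\sin\chi}\left|\frac{d\rho}{d\chi}\right|\\
&= \frac{a^2n^2}{2}\,\frac{\sin(\chi/2)}{\sin\chi}\,\frac{|(n\cos(\chi/2) - 1)(n - \cos(\chi/2))|}{(n^2 + 1 - 2n\cos(\chi/2))^2}\\
&= \frac{a^2n^2}{4}\,\frac{1}{\cos(\chi/2)}\,\frac{|(n\cos(\chi/2) - 1)(n - \cos(\chi/2))|}{(n^2 + 1 - 2n\cos(\chi/2))^2}.
\end{aligned} \tag{5.55}$$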
Here, we utilized
$$\sin\chi = 2\cos\frac{\chi}{2}\,\sin\frac{\chi}{2}. \tag{5.56}$$
The angle χ takes the values from zero (for ρ = 0) up to the value χmax (for ρ = a)
which is determined by the equation
$$\cos\frac{\chi_{max}}{2} = \frac{1}{n}. \tag{5.57}$$
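$$\frac{d\sigma}{d\Omega}(\chi) = \frac{\rho}{\sin\chi}\,\frac{d\rho}{d\chi} = \frac{a^2n^2}{4}\,\frac{1}{\cos(\chi/2)}\,\frac{[n\cos(\chi/2) - 1][n - \cos(\chi/2)]}{[n^2 + 1 - 2n\cos(\chi/2)]^2} \tag{5.58}$$
$$\begin{aligned}
\sigma_{tot} &= \int \frac{d\sigma}{d\Omega}(\chi)\,d\Omega = \pi a^2n^2\int_0^{\chi_{max}} \sin\frac{\chi}{2}\,\frac{[n\cos(\chi/2) - 1][n - \cos(\chi/2)]}{[n^2 + 1 - 2n\cos(\chi/2)]^2}\,d\chi\\
&= \pi a^2\int_0^{\chi_{max}} \frac{n^2}{(1 + n^2 - 2n\cos(\chi/2))^2}\left\{\underbrace{(n^2+1)\cos\frac{\chi}{2}\sin\frac{\chi}{2}}_{\text{I}} - \underbrace{n\cos^2\frac{\chi}{2}\sin\frac{\chi}{2}}_{\text{II}} - \underbrace{n\sin\frac{\chi}{2}}_{\text{III}}\right\}d\chi.
\end{aligned} \tag{5.59}$$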
Part III can be integrated at once; I and II are transformed by integrating by parts:
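$$\begin{aligned}
\sigma_{tot} ={}& \pi a^2n^2\left[\left(n^2 + 1 - 2n\cos\frac{\chi}{2}\right)^{-1}\right]_0^{\chi_{max}}\\
&- \pi a^2 n(n^2+1)\left[\frac{\cos(\chi/2)}{1 + n^2 - 2n\cos(\chi/2)}\right]_0^{\chi_{max}} - \pi a^2 n(n^2+1)\int_0^{\chi_{max}} \frac{(1/2)\sin(\chi/2)}{1 + n^2 - 2n\cos(\chi/2)}\,d\chi\\
&+ n^2\pi a^2\left[\frac{\cos^2(\chi/2)}{1 + n^2 - 2n\cos(\chi/2)}\right]_0^{\chi_{max}} - \pi a^2n^2\int_0^{\chi_{max}} \frac{-\cos(\chi/2)\sin(\chi/2)}{1 + n^2 - 2n\cos(\chi/2)}\,d\chi.
\end{aligned} \tag{5.60}$$
$$y := \cos\frac{\chi}{2}, \qquad dy = -\frac{1}{2}\sin\frac{\chi}{2}\,d\chi \tag{5.61}$$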
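$$\begin{aligned}
\sigma_{tot} &= \pi a^2\left\{\left[\frac{n^2(1+\cos^2(\chi/2)) - n(n^2+1)\cos(\chi/2)}{1 + n^2 - 2n\cos(\chi/2)}\right]_0^{\chi_{max}} - \frac{n(n^2+1)}{2}\int_0^{\chi_{max}} \frac{\sin(\chi/2)}{1 + n^2 - 2n\cos(\chi/2)}\,d\chi - 2n^2\int_1^{\cos(\chi_{max}/2)} \frac{y}{1 + n^2 - 2ny}\,dy\right\}\\
&= \pi a^2\left\{\left[\frac{n^2(1+\cos^2(\chi/2)) - n(n^2+1)\cos(\chi/2)}{1 + n^2 - 2n\cos(\chi/2)}\right]_0^{\chi_{max}} - \frac{n^2+1}{2}\left[\ln\left(1 + n^2 - 2n\cos\frac{\chi}{2}\right)\right]_0^{\chi_{max}} + \left[ny + \frac{n^2+1}{2}\ln(1 + n^2 - 2ny)\right]_1^{\cos(\chi_{max}/2)}\right\};
\end{aligned}$$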
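The lengthy integration can be cross-checked numerically. The sketch below is an illustration rather than the method of the text: it integrates the differential cross section (5.58) over the solid angle, dΩ = 2π sin χ dχ, from 0 to χmax with a composite Simpson rule. Since every particle with ρ < a is deflected, the total cross section is expected to equal the geometric value πa². The function names and the sample values of a and n are assumptions.

```python
import math

def refractive_index(U0, m, v_inf):
    """n = sqrt(1 + 2 U0 / (m v_inf^2)) for the potential well of depth U0."""
    return math.sqrt(1.0 + 2.0 * U0 / (m * v_inf**2))

def dsigma_domega(chi, a, n):
    """Differential cross section (5.58) for 0 <= chi <= chi_max."""
    c = math.cos(chi / 2)
    denom = n * n + 1 - 2 * n * c
    return a * a * n * n / (4 * c) * (n * c - 1) * (n - c) / denom**2

def total_cross_section(a, n, steps=2000):
    """Integrate 2 pi sin(chi) dsigma/dOmega from 0 to chi_max by Simpson's rule."""
    chi_max = 2 * math.acos(1 / n)          # from cos(chi_max / 2) = 1/n
    h = chi_max / steps
    def f(chi):
        return 2 * math.pi * math.sin(chi) * dsigma_domega(chi, a, n)
    total = f(0.0) + f(chi_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3

if __name__ == "__main__":
    print(total_cross_section(a=1.0, n=1.5), math.pi)
```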
EXERCISE
Problem. A hydrogen atom moves along the x-axis with a velocity vH = 1.78 · 10² m · s⁻¹.
It reacts with a chlorine atom that moves perpendicular to the x-axis with
vCl = 3.2 · 10¹ m · s⁻¹. Calculate the angle and the velocity of the HCl-molecule. The
atomic weights are H = 1.00797 and Cl = 35.453.
Fig. 5.19.
Solution. We utilize momentum conservation. The initial momenta are
$$\mathbf{P}_1 = m_1 v_1\,\mathbf{e}_x, \quad m_1 = A_1\cdot 1\ \text{amu}, \qquad \mathbf{P}_2 = m_2 v_2\,\mathbf{e}_y, \quad m_2 = A_2\cdot 1\ \text{amu}. \tag{5.63}$$
Here, A1 , A2 mean the atomic weights, and 1 amu (“atomic mass unit”)
= (1/12) m(¹²C). We require
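Momentum conservation for this reaction can be evaluated with the numbers given in the problem. The sketch below assumes that the molecule carries the full momentum P1 + P2 and the mass m1 + m2, so that the atomic mass unit cancels; the function name is hypothetical.

```python
import math

def merged_velocity(m1, v1, m2, v2):
    """Speed and direction of the molecule formed from two perpendicular momenta.

    Particle 1 moves along x, particle 2 along y. Returns (speed, angle in
    degrees measured from the x-axis).
    """
    total_mass = m1 + m2
    vx = m1 * v1 / total_mass
    vy = m2 * v2 / total_mass
    return math.hypot(vx, vy), math.degrees(math.atan2(vy, vx))

if __name__ == "__main__":
    speed, angle = merged_velocity(1.00797, 1.78e2, 35.453, 3.2e1)
    print(speed, angle)  # roughly 31.5 m/s at about 81 degrees from the x-axis
```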
If we consider a system of mass points, for the total force acting on the νth particle
we have
$$\mathbf{F}_\nu + \sum_\lambda \mathbf{f}_{\nu\lambda} = \dot{\mathbf{p}}_\nu. \tag{6.1}$$
The force fνλ is the force of the particle λ on the particle ν; Fν is the force acting on
the particle ν from the outside of the system; Σλ fνλ is the resulting internal force of
all other particles on the particle ν.
The resulting force acting on the system is obtained by summing over the individual
forces:
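$$\sum_\nu \dot{\mathbf{p}}_\nu = \sum_\nu \mathbf{F}_\nu + \sum_\nu\sum_\lambda \mathbf{f}_{\nu\lambda} = \dot{\mathbf{P}}.$$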
Since force equals (–) counter force (here Newton’s third law becomes operative), it
follows that fνλ + fλν = 0, so that the terms of the above double sum cancel pairwise.
One thus obtains for the total force acting on the system
$$\dot{\mathbf{P}} = \mathbf{F} = \sum_\nu \mathbf{F}_\nu.$$
F = Ṗ = 0, i.e., P = constant.
The total momentum P = Σν pν of the particle system is thus conserved if the sum of
the external forces acting on the system vanishes.
The situation is similar for the angular momentum if the internal forces are assumed
to be central forces.
The angular momentum of the νth particle with respect to the coordinate origin is
lν = rν × pν .
The angular momentum of a single particle is defined with respect to the origin. The
same holds for the total angular momentum. The angular momentum of the system
then equals the sum over all individual angular momenta,
$$\mathbf{L} = \sum_\nu \mathbf{l}_\nu.$$
Fig. 6.1.
Analogously, the torque acting on the νth particle is
$$\mathbf{d}_\nu = \mathbf{r}_\nu\times\mathbf{F}_\nu,$$
The internal forces fνλ do not perform a torque, since we assumed them to be central
forces. This can be seen as follows: For the force acting on the νth particle, according
to (6.1) we have
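$$\mathbf{F}_\nu + \sum_\lambda \mathbf{f}_{\nu\lambda} = \frac{d}{dt}\,\mathbf{p}_\nu.$$
Vector multiplication of this equation from the left by rν gives
$$\mathbf{r}_\nu\times\mathbf{F}_\nu + \sum_\lambda \mathbf{r}_\nu\times\mathbf{f}_{\nu\lambda} = \mathbf{r}_\nu\times\frac{d}{dt}\,\mathbf{p}_\nu = \frac{d}{dt}\,(\mathbf{r}_\nu\times\mathbf{p}_\nu) = \dot{\mathbf{l}}_\nu.$$
The differentiation can be moved to the left, because ṙν × pν = 0. Summation over ν
yields
$$\underbrace{\sum_\nu \mathbf{r}_\nu\times\mathbf{F}_\nu}_{\mathbf{D}} + \underbrace{\sum_\nu\sum_\lambda \mathbf{r}_\nu\times\mathbf{f}_{\nu\lambda}}_{0} = \dot{\mathbf{L}},$$
$$\mathbf{D} = \dot{\mathbf{L}} = \sum_\nu \dot{\mathbf{l}}_\nu.$$
Here, Σν Σλ rν × fνλ = 0, since the terms of the double sum cancel pairwise, e.g.,
$$\mathbf{r}_\nu\times\mathbf{f}_{\nu\lambda} + \mathbf{r}_\lambda\times\mathbf{f}_{\lambda\nu} = (\mathbf{r}_\nu - \mathbf{r}_\lambda)\times\mathbf{f}_{\nu\lambda} = 0.$$
Since for central forces (rν − rλ ) is parallel to fνλ , the vector product vanishes.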
The total torque on a system is given by the sum of the external torques
D = L̇.
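Both conservation laws can be observed in a small simulation. The following sketch is an illustration under stated assumptions: two particles in the plane interact through a spring-like central force (the force law, masses, and initial values are chosen freely here), there are no external forces, and the motion is advanced with a kick-drift-kick scheme. For pairwise central forces this scheme changes neither the total momentum nor the total angular momentum, apart from rounding errors.

```python
import math

def simulate_two_body(steps=5000, dt=1e-3, kappa=5.0, rest=1.0):
    """Return (P0, L0, P1, L1): total momentum and angular momentum before and after."""
    m = [1.0, 3.0]
    r = [[0.0, 0.0], [1.5, 0.0]]
    v = [[0.0, 0.4], [0.2, -0.1]]

    def forces():
        dx, dy = r[0][0] - r[1][0], r[0][1] - r[1][1]
        dist = math.hypot(dx, dy)
        mag = -kappa * (dist - rest)          # central force along r0 - r1
        f0 = (mag * dx / dist, mag * dy / dist)
        return [f0, (-f0[0], -f0[1])]         # Newton's third law

    def momentum():
        return tuple(sum(m[i] * v[i][k] for i in range(2)) for k in range(2))

    def angular_momentum():
        return sum(m[i] * (r[i][0] * v[i][1] - r[i][1] * v[i][0]) for i in range(2))

    def kick(h):
        f = forces()
        for i in range(2):
            for k in range(2):
                v[i][k] += h * f[i][k] / m[i]

    p0, l0 = momentum(), angular_momentum()
    for _ in range(steps):
        kick(dt / 2)
        for i in range(2):                     # drift
            for k in range(2):
                r[i][k] += dt * v[i][k]
        kick(dt / 2)
    return p0, l0, momentum(), angular_momentum()

if __name__ == "__main__":
    print(simulate_two_body())
```

Adding an external force to one of the particles would change P at the rate F and L at the rate D, in line with the two balance equations above.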
EXAMPLE
Fig. 6.2. Formation of a galaxy from a cloud of gas with angular momentum L: (a) The gas con-
tracts due to the mutual gravitational attraction between its constituents. (b) The gas contracts
faster along the direction of the angular momentum L than in the plane perpendicular to L, since
the angular momentum must be conserved. In this way a flattening appears. (c) The galaxy in
equilibrium: In the plane perpendicular to L, the gravitational force balances the centrifugal
force due to the rotational motion
EXAMPLE
(a) The person holds two weights and is set into uniform circular motion with angular
velocity ω. The arms are stretched out, so that the angular momentum is large.
(b) If the person pulls the arms towards the body, the moment of inertia (see Chap. 11)
decreases. Since angular momentum is conserved, the angular velocity ω signifi-
cantly increases. Skaters exploit this effect when performing a pirouette.
Let fνλ be the force of the λth particle on the νth particle. According to (6.1), we have
$$\mathbf{F}_\nu + \sum_\lambda \mathbf{f}_{\nu\lambda} = \frac{d}{dt}\,(m_\nu\dot{\mathbf{r}}_\nu).$$
leads to
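$$\mathbf{F}_\nu\cdot\dot{\mathbf{r}}_\nu + \sum_\lambda \mathbf{f}_{\nu\lambda}\cdot\dot{\mathbf{r}}_\nu = \frac{d}{dt}\left(\frac{1}{2}\,m_\nu\dot{\mathbf{r}}_\nu^2\right).$$
(1/2)mν ṙν² is however the kinetic energy Tν of the νth particle. By summation over ν,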
we obtain
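$$\sum_\nu \mathbf{F}_\nu\cdot\dot{\mathbf{r}}_\nu + \sum_\nu\sum_\lambda \mathbf{f}_{\nu\lambda}\cdot\dot{\mathbf{r}}_\nu = \sum_\nu \frac{d}{dt}\left(\frac{1}{2}\,m_\nu\dot{\mathbf{r}}_\nu^2\right) = \sum_\nu \dot{T}_\nu = \frac{d}{dt}\sum_\nu T_\nu.$$
Σν Ṫν is the time derivative of the total kinetic energy of the system. By integration
from t1 to t2 , with
$$\dot{\mathbf{r}}_\nu\,dt = d\mathbf{r}_\nu,$$
we get
$$T(t_2) - T(t_1) = \underbrace{\sum_\nu \int_{t_1}^{t_2} \mathbf{F}_\nu\cdot d\mathbf{r}_\nu}_{A_a} + \underbrace{\sum_\nu\sum_\lambda \int_{t_1}^{t_2} \mathbf{f}_{\nu\lambda}\cdot d\mathbf{r}_\nu}_{A_i}.$$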
T is the total kinetic energy, Aa is the work performed by external forces, and Ai is
the work performed by internal forces over the time interval t2 − t1 .
If we assume that the forces can be derived from a potential, we can express the
performed internal and external work by potential differences.
$$A_a = \sum_\nu \int_{t_1}^{t_2} \mathbf{F}_\nu\cdot d\mathbf{r}_\nu = -\sum_\nu \int_{t_1}^{t_2} \nabla_\nu V_\nu^a\cdot d\mathbf{r}_\nu = -\sum_\nu \int_{t_1}^{t_2} dV_\nu^a = -\sum_\nu \left(V_\nu^a(t_2) - V_\nu^a(t_1)\right),$$
$$A_a = V^a(t_1) - V^a(t_2).$$
Vνa is the potential of the particle ν in an external field. By summing over all particles,
one obtains the total external potential V a = Σν Vνa .
The force acting between two particles ν and λ is assumed to be a central force.
For the “internal” potential, we set
i
Vλν (rλν ) = Vλν
i
(rλν ) = Vνλ
i
(rνλ ).
The mutual potential depends only on the absolute value of the distance:
$$r_{\nu\lambda} = |\mathbf{r}_\nu - \mathbf{r}_\lambda| = \sqrt{(x_\nu - x_\lambda)^2 + (y_\nu - y_\lambda)^2 + (z_\nu - z_\lambda)^2}.$$
Thus, the principle of action and reaction is satisfied, since from this it follows auto-
matically that the force fνλ is equal and opposite to the counterforce fλν :
The index ν on the gradient indicates that the gradient is to be calculated with respect
to the components of the position vector rν of the particle ν. Hence,
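$$\nabla_\nu = \left(\frac{\partial}{\partial x_\nu}, \frac{\partial}{\partial y_\nu}, \frac{\partial}{\partial z_\nu}\right), \qquad \nabla_\lambda = \left(\frac{\partial}{\partial x_\lambda}, \frac{\partial}{\partial y_\lambda}, \frac{\partial}{\partial z_\lambda}\right).$$
$$= \frac{1}{2}\sum_{\nu,\lambda} \mathbf{f}_{\nu\lambda}\cdot(d\mathbf{r}_\nu - d\mathbf{r}_\lambda).$$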
We now replace the difference of the position vectors by the vector rνλ = rν − rλ and
introduce the operator ∇νλ which forms the gradient with respect to this difference.
We get
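$$A_i = -\frac{1}{2}\sum_{\nu,\lambda}\int \nabla_{\nu\lambda} V_{\nu\lambda}^i\cdot d\mathbf{r}_{\nu\lambda} = -\frac{1}{2}\sum_{\nu,\lambda}\int dV_{\nu\lambda}^i = -\frac{1}{2}\sum_{\nu,\lambda}\left(V_{\nu\lambda}^i(t_2) - V_{\nu\lambda}^i(t_1)\right),$$
where
$$\nabla_{\nu\lambda} = \left(\frac{\partial}{\partial(x_\nu - x_\lambda)}, \frac{\partial}{\partial(y_\nu - y_\lambda)}, \frac{\partial}{\partial(z_\nu - z_\lambda)}\right).$$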
Hence, the internal work is the difference of the internal potential energy. This quantity
is significant for deformable media (deformation energy).
For rigid bodies where the differences (distances) |rν − rλ | are invariant, the inter-
nal work vanishes. Changes drνλ can occur only perpendicular to rν − rλ and hence
perpendicular to the direction of force, i.e., the scalar products fνλ · drνλ vanish.
If we set for the total potential energy
\[
V = \sum_\nu V^a_\nu + \frac{1}{2}\sum_{\nu,\lambda} V^i_{\nu\lambda},
\]
then the energy balance reads
\[
T(t_2) - T(t_1) = V(t_1) - V(t_2),
\]
or
\[
T(t_1) + V(t_1) = T(t_2) + V(t_2),
\]
i.e., the sum of potential and kinetic energy for the total system remains conserved. Since energy can be transferred by the interaction of the particles (e.g., collisions between gas molecules), energy conservation need not hold for the individual particle but must hold for all particles together, i.e., for the entire system.
Fig. 6.4.
Thus, the sum of the mass moments relative to the center of gravity vanishes. If there
acts a constant external force, as for example the gravity Fν = mν g, then it also follows
that
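\[
\mathbf{D} = \sum_\nu \mathbf{r}'_\nu \times \mathbf{F}_\nu = \Big(\sum_\nu m_\nu \mathbf{r}'_\nu\Big) \times \mathbf{g} = 0,
\]
where the primed vectors $\mathbf{r}'_\nu$ are the position vectors relative to the center of gravity. Differentiating $\sum_\nu m_\nu \mathbf{r}'_\nu = 0$ with respect to time gives $\sum_\nu m_\nu \mathbf{v}'_\nu = 0$,
i.e., in the center-of-mass system the sum of the momenta vanishes. In relativistic physics this statement is often used as the definition of the “center-of-momentum” system; there it is not possible to introduce the notion of the center of mass, as defined above, in a consistent way. Only the “center-of-momentum” system can be formulated in a relativistically consistent way.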
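The equivalent transformation of the angular momentum leads to
\[
\mathbf{L} = \sum_\nu m_\nu (\mathbf{r}_\nu \times \mathbf{v}_\nu) = \sum_\nu m_\nu (\mathbf{R} + \mathbf{r}'_\nu) \times (\mathbf{V} + \mathbf{v}'_\nu),
\]
where primed quantities are taken relative to the center of gravity. Multiplying out, one obtains
\[
\mathbf{L} = \sum_\nu m_\nu (\mathbf{R} \times \mathbf{V}) + \sum_\nu m_\nu (\mathbf{R} \times \mathbf{v}'_\nu) + \sum_\nu m_\nu (\mathbf{r}'_\nu \times \mathbf{V}) + \sum_\nu m_\nu (\mathbf{r}'_\nu \times \mathbf{v}'_\nu),
\]
and sees that the two middle terms disappear, because of the definition (6.4) of the center-of-mass coordinates. Hence,
\[
\mathbf{L} = M(\mathbf{R} \times \mathbf{V}) + \sum_\nu m_\nu (\mathbf{r}'_\nu \times \mathbf{v}'_\nu) = \mathbf{L}_s + \sum_\nu \mathbf{l}'_\nu. \tag{6.6}
\]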
Thus, the angular momentum L can be decomposed into the angular momentum of
the center of gravity Ls = MR × V with the total mass M, and the sum of angular
momenta of the individual particles about the center of gravity.
For the torque as the derivative of the angular momentum, the same decomposition
holds:
\[
\mathbf{D} = \mathbf{D}_s + \sum_\nu \mathbf{d}'_\nu. \tag{6.7}
\]
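With $\mathbf{v}_\nu = \mathbf{V} + \mathbf{v}'_\nu$, where $\mathbf{v}'_\nu$ is the velocity relative to the center of gravity, we have
\[
T = \frac{1}{2}\sum_\nu m_\nu \mathbf{v}_\nu^2 = \frac{1}{2}\sum_\nu m_\nu \mathbf{V}^2 + \mathbf{V}\cdot\sum_\nu m_\nu \mathbf{v}'_\nu + \frac{1}{2}\sum_\nu m_\nu \mathbf{v}'^{\,2}_\nu .
\]
Because $\sum_\nu m_\nu \mathbf{v}'_\nu = 0$, the middle term again vanishes, and we find
\[
T = \frac{1}{2}M\mathbf{V}^2 + \frac{1}{2}\sum_\nu m_\nu \mathbf{v}'^{\,2}_\nu = T_s + T'. \tag{6.8}
\]
The total kinetic energy T is thus composed of the kinetic energy of a virtual particle of mass M with the position vector R(t) (the center of gravity), and the kinetic energy of the individual particles relative to the center of gravity. Mixed terms, e.g. of the form $\mathbf{V}\cdot\mathbf{v}'_\nu$, do not appear! This is the remarkable property of the center-of-mass coordinates, the foundation of their meaning.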
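The decomposition (6.8) can be checked numerically. The following is a minimal sketch with randomly chosen, purely illustrative masses and velocities; the function name `kinetic_energy_split` is introduced here only for this check and is not part of the text.

```python
import random

def kinetic_energy_split(masses, velocities):
    """Return (T, T_s, T_rel) for particles with 3D velocities given as tuples."""
    M = sum(masses)
    # center-of-mass velocity V = (1/M) * sum_nu m_nu v_nu
    V = tuple(sum(m * v[i] for m, v in zip(masses, velocities)) / M for i in range(3))
    # total kinetic energy in the laboratory system
    T = sum(0.5 * m * sum(c * c for c in v) for m, v in zip(masses, velocities))
    # kinetic energy of the center of gravity
    T_s = 0.5 * M * sum(c * c for c in V)
    # kinetic energy of the motion relative to the center of gravity, v' = v - V
    T_rel = sum(0.5 * m * sum((v[i] - V[i]) ** 2 for i in range(3))
                for m, v in zip(masses, velocities))
    return T, T_s, T_rel

random.seed(0)  # illustrative, reproducible sample
masses = [random.uniform(0.5, 3.0) for _ in range(5)]
velocities = [tuple(random.uniform(-1.0, 1.0) for _ in range(3)) for _ in range(5)]
T, T_s, T_rel = kinetic_energy_split(masses, velocities)
print(T, T_s + T_rel)  # the two numbers should agree up to rounding
```

No mixed term has to be added to make the two numbers agree, which is the content of (6.8).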
EXERCISE
Problem. Show that the kinetic energy of two particles with the masses m1 , m2
splits into the energy of the center of gravity and the kinetic energy of relative motion.
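Solution. The total kinetic energy is
\[
T = \frac{1}{2}m_1 \mathbf{v}_1^2 + \frac{1}{2}m_2 \mathbf{v}_2^2. \tag{6.9}
\]
The center of gravity is defined by
\[
\mathbf{R} = \frac{m_1 \mathbf{r}_1 + m_2 \mathbf{r}_2}{m_1 + m_2},
\]
and its velocity is
\[
\dot{\mathbf{R}} = \frac{1}{m_1 + m_2}\,(m_1 \mathbf{v}_1 + m_2 \mathbf{v}_2). \tag{6.10}
\]
The velocity of relative motion is denoted by v. We have
\[
\mathbf{v} = \mathbf{v}_1 - \mathbf{v}_2. \tag{6.11}
\]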
6.5 Transformation of the Kinetic Energy 73
We now express the particle velocities by the center-of-gravity velocity and the relative velocity, respectively.
By inserting $\mathbf{v}_2$ from (6.11) into (6.10), we have
\[
(m_1 + m_2)\dot{\mathbf{R}} = m_1 \mathbf{v}_1 + m_2 \mathbf{v}_1 - m_2 \mathbf{v}.
\]
EXERCISE
Problem. Two bodies of masses m1 and m2 move under the action of their mutual gravitation. Let r1 and r2 be the position vectors in a space-fixed coordinate system, and r = r1 − r2. Find the equations of motion for r1, r2, and r in the center-of-gravity system. What do the trajectories in the space-fixed system and in the center-of-mass system look like?
Fig. 6.6. Laboratory system
Fig. 6.7.
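\[
\ddot{\mathbf{r}}_1 = -\frac{G m_2 \mathbf{r}}{r^3}, \qquad \ddot{\mathbf{r}}_2 = \frac{G m_1 \mathbf{r}}{r^3}.
\]
With the relative coordinate $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$, it follows that
\[
\ddot{\mathbf{r}} = \ddot{\mathbf{r}}_1 - \ddot{\mathbf{r}}_2 = -\frac{G(m_1 + m_2)\,\mathbf{r}}{r^3}.
\]
Since, in the center-of-gravity system,
\[
\mathbf{r}_1 = \frac{m_2}{m_1 + m_2}\,\mathbf{r} \quad\text{and}\quad \mathbf{r}_2 = -\frac{m_1}{m_1 + m_2}\,\mathbf{r},
\]
it follows that
\[
\ddot{\mathbf{r}}_1 = \frac{-G m_2^3\,\mathbf{r}_1}{(m_1 + m_2)^2\, r_1^3} \quad\text{and}\quad \ddot{\mathbf{r}}_2 = \frac{-G m_1^3\,\mathbf{r}_2}{(m_1 + m_2)^2\, r_2^3}.
\]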
Hence, Newton’s gravitational law holds with respect to the center of gravity, but with modified mass factors. This means that the trajectories are conic sections as before (relative path with respect to S). Because of the superimposed translation of the center of gravity, the trajectories become spirals in space.
EXERCISE
Problem. Two masses (m1 = 2 kg and m2 = 4 kg) are connected by a massless rope (without sliding) via a frictionless disk of mass M = 2 kg and radius R = 0.4 m (Atwood's machine). Find the acceleration of the mass m2 = 4 kg if the system moves under the influence of gravitation.
Fig. 6.8.
Solution. For the given masses m1 = 2 kg, m2 = 4 kg and the tension forces at the rope ends N1 and N2, it follows that
\[
m_1 a_1 = N_1 - m_1 g, \qquad m_2 a_2 = m_2 g - N_2, \tag{6.12}
\]
and for the rotation of the disk
\[
\theta_s \dot{\omega} = (N_2 - N_1)R, \tag{6.13}
\]
since the disk is accelerated. $\theta_s$ is the moment of inertia of the disk. From this, it follows that $N_2 \neq N_1$; otherwise, there is no motion at all. For the accelerations, we have
\[
a = a_1 = a_2 = \dot{\omega}R, \tag{6.14}
\]
since the rope is tight and does not slide, i.e., it adheres to the disk.
Inserting the moment of inertia of the disk $\theta_s = MR^2/2$ (see Example 11.7) into (6.13) and using (6.14) yields for the acceleration
\[
a = \frac{N_1}{m_1} - g = g - \frac{N_2}{m_2} = \dot{\omega}R = \frac{R^2}{MR^2/2}\,(N_2 - N_1). \tag{6.15}
\]
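Eliminating N1 and N2 from (6.15) gives $a = (m_2 - m_1)g/(m_1 + m_2 + M/2)$; the radius R drops out. The sketch below evaluates this for the numbers of the problem. The value g = 9.81 m/s² and the function name `atwood_acceleration` are assumptions made here for illustration.

```python
def atwood_acceleration(m1, m2, M, g=9.81):
    """Acceleration from (6.12)-(6.15) with theta_s = M R^2 / 2 (R cancels)."""
    return (m2 - m1) * g / (m1 + m2 + 0.5 * M)

m1, m2, M, g = 2.0, 4.0, 2.0, 9.81   # kg, kg, kg, m/s^2 (g is an assumed value)
a = atwood_acceleration(m1, m2, M, g)

# rope tensions from (6.12)
N1 = m1 * (a + g)
N2 = m2 * (g - a)

# consistency with the last expression of (6.15): a = (2 / M) * (N2 - N1)
a_from_torque = 2.0 / M * (N2 - N1)
print(a, a_from_torque, N1, N2)
```

The tension on the side of the heavier mass comes out larger, as required for the disk to be accelerated.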
EXERCISE
Problem. Our solar system is about $r_0 \approx 5 \cdot 10^{20}$ m away from the center of the Milky Way, and its orbital velocity relative to the galactic center is $v_0 \approx 3 \cdot 10^{5}$ m/s. This is schematically shown in Fig. 6.9.
(a) Determine the mass M of our galaxy.
(b) Discuss the hypothesis that the motion of our solar system is a consequence of the contraction of our Milky Way (see Fig. 6.9), and then verify $r_0 = GM/v_0^2$. Here $G = 6.7 \cdot 10^{-11}\ \mathrm{m^3\,s^{-2}\,kg^{-1}}$ is the gravitational constant.
Fig. 6.9.
Solution. (a) If a mass point moves on a circular path, then according to Newton the force per unit mass equals the acceleration. Since our sun (mass m) is at the periphery of our Milky Way, the attractive force toward the center can approximately be represented by
\[
F = G\,\frac{mM}{r_0^2}, \tag{6.16}
\]
where m is the solar mass and M is the mass of the Milky Way. The acceleration points toward the center,
\[
a = \frac{v_0^2}{r_0} = \frac{F}{m}, \tag{6.17}
\]
from which it follows that
\[
\frac{v_0^2}{r_0} = \frac{GM}{r_0^2} \quad\text{or}\quad r_0 = \frac{GM}{v_0^2}. \tag{6.18}
\]
Using the numbers given in the formulation of the problem, one gets from equation (6.18) the mass of our Milky Way:
\[
M \approx 3 \cdot 10^{11}\, m,
\]
where m is the solar mass.
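The arithmetic behind this estimate can be reproduced in a few lines. This is only a sketch: r0, v0, and G are the values from the problem statement, while the solar mass of 2·10^30 kg is an assumed reference value that does not appear in the text.

```python
G = 6.7e-11      # m^3 s^-2 kg^-1, from the problem statement
r0 = 5e20        # m, distance of the solar system from the galactic center
v0 = 3e5         # m/s, orbital velocity

# (6.18) solved for the mass of the Milky Way
M_galaxy = r0 * v0**2 / G        # in kg

m_sun = 2.0e30   # kg, assumed value of the solar mass (not given in the text)
ratio = M_galaxy / m_sun
print(f"M = {M_galaxy:.2e} kg = {ratio:.1e} solar masses")
```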
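\[
T = \frac{1}{2}\,m\,\frac{l^2}{m^2 r^2} = \frac{1}{2}\,\frac{l^2}{m}\,\frac{1}{r^2}, \tag{6.20}
\]
where we used $l = (mr^2)\omega = mvr = \text{constant}$.
The assumption is now that at the present distance r the increase in the kinetic energy $\Delta T_{\mathrm{kin}}$ is balanced by the decrease in the potential energy if r is reduced by $\Delta r$. Differentiation of (6.19) and (6.20) with respect to r yields:
\[
\Delta T_{\mathrm{kin}} = \frac{dT_{\mathrm{kin}}}{dr}\,\Delta r = -\frac{l^2}{m}\,\frac{1}{r^3}\,\Delta r, \qquad \Delta T_{\mathrm{kin}} > 0 \ \text{if}\ \Delta r < 0,
\]
\[
\Delta V_{\mathrm{pot}} = \frac{dV_{\mathrm{pot}}}{dr}\,\Delta r = \frac{GMm}{r^2}\,\Delta r, \qquad \Delta V_{\mathrm{pot}} < 0 \ \text{if}\ \Delta r < 0.
\]
Setting $\Delta T_{\mathrm{kin}} = -\Delta V_{\mathrm{pot}}$ at $r = r_0$, with $l = m v_0 r_0$, gives
\[
\frac{m^2 v_0^2 r_0^2}{m r_0^3} = G\,\frac{Mm}{r_0^2} \quad\text{or}\quad r_0 v_0^2 = MG. \tag{6.21}
\]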
As the first and most simple system of vibrating mass points, we consider the free
vibration of two mass points, fixed to two walls by springs of equal spring constant,
as is shown in the Fig. 7.1.
The two mass points shall have equal masses. The displacements from the rest
positions are denoted by x1 and x2 , respectively. We consider only vibrations along
the line connecting the mass points.
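When mass 1 is displaced from its rest position, the spring fixed to the wall exerts the force $-kx_1$, and the spring connecting the two mass points exerts the force $+k(x_2 - x_1)$. Thus, the mass point 1 obeys the equation of motion
\[
m\ddot{x}_1 = -kx_1 + k(x_2 - x_1). \tag{7.1a}
\]
Correspondingly, for the mass point 2 we have
\[
m\ddot{x}_2 = -kx_2 - k(x_2 - x_1). \tag{7.1b}
\]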
We first determine the possible frequencies of common vibration of the two particles. The frequencies that are equal for all particles are called eigenfrequencies. The related vibrational states are called eigen- or normal vibrations. These definitions are correspondingly generalized for an N-particle system. We use the ansatz
\[
x_1 = A_1 \cos\omega t, \qquad x_2 = A_2 \cos\omega t, \tag{7.2}
\]
i.e., both particles shall vibrate with the same frequency ω. The specific type of the ansatz, be it a sine or cosine function or a superposition of both, is not essential. We would always get the same condition for the frequency, as can be seen from the following calculation.
Insertion of the ansatz into the equations of motion yields two linear homogeneous
equations for the amplitudes:
\[
\begin{aligned}
A_1(-m\omega^2 + 2k) - A_2 k &= 0,\\
-A_1 k + A_2(-m\omega^2 + 2k) &= 0.
\end{aligned} \tag{7.3}
\]
The system of equations has nontrivial solutions for the amplitudes only if the deter-
minant of coefficients D vanishes:
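\[
D = \begin{vmatrix} -m\omega^2 + 2k & -k \\ -k & -m\omega^2 + 2k \end{vmatrix} = (-m\omega^2 + 2k)^2 - k^2 = 0.
\]
Expanding gives
\[
\omega^4 - 4\,\frac{k}{m}\,\omega^2 + 3\,\frac{k^2}{m^2} = 0.
\]
The positive solutions of the equation are the frequencies
\[
\omega_1 = \sqrt{\frac{3k}{m}} \quad\text{and}\quad \omega_2 = \sqrt{\frac{k}{m}}.
\]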
These frequencies are called eigenfrequencies of the system; the corresponding vibra-
tions are called eigenvibrations or normal vibrations. To get an idea about the type
of the normal vibrations, we insert the eigenfrequency into the system (7.3). For the
amplitudes, we find
\[
A_1 = -A_2 \quad\text{for}\quad \omega_1 = \sqrt{\frac{3k}{m}}
\]
and
\[
A_1 = A_2 \quad\text{for}\quad \omega_2 = \sqrt{\frac{k}{m}}.
\]
The two mass points vibrate in-phase with the lower frequency ω2 , and with
the higher frequency ω1 against each other. The two vibration modes are illustrated
by Fig. 7.2.
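The eigenfrequencies and amplitude ratios found above can be reproduced by solving the quadratic equation in ω² and inserting the roots into the first line of (7.3). This is a minimal sketch; the numerical values of k and m and the function names are illustrative assumptions.

```python
import math

def eigenfrequencies(k, m):
    """Positive roots of w^4 - 4 (k/m) w^2 + 3 (k/m)^2 = 0, higher one first."""
    p = k / m
    disc = math.sqrt((4 * p) ** 2 - 4 * 3 * p * p)
    return math.sqrt((4 * p + disc) / 2), math.sqrt((4 * p - disc) / 2)

def amplitude_ratio(omega, k, m):
    """A2/A1 from the first line of (7.3): A1 (2k - m w^2) - A2 k = 0."""
    return (2 * k - m * omega ** 2) / k

k, m = 2.0, 0.5                      # illustrative values
w1, w2 = eigenfrequencies(k, m)
r1 = amplitude_ratio(w1, k, m)       # opposite-phase mode
r2 = amplitude_ratio(w2, k, m)       # in-phase mode
print(w1, w2, r1, r2)
```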
Fig. 7.2.
coordinates x1 and x2 are sufficient to describe the system, and we obtain the two
eigenvibrations with the frequencies ω1 , ω2 .
In our example, the normal vibrations mean in-phase or opposite-phase (= in-phase
with different sign of the amplitudes) oscillations of the mass points. The amplitudes
of equal size are related to the equality of masses (m1 = m2 ). The general motion
of the mass points corresponds to a superposition of the normal modes with different
phase and amplitude.
The differential equations (7.1a), (7.1b) are linear. The general form of the vibration
is therefore the superposition of the normal modes. It reads
Here, we already utilized the result that x1 and x2 have opposite-equal amplitudes for a pure ω1-vibration, and equal amplitudes for pure ω2-vibrations. This ensures that the special cases of the pure normal vibrations with $C_2 = 0$, $C_1 \neq 0$ and $C_1 = 0$, $C_2 \neq 0$ are included in the ansatz (7.4). Equation (7.4) is the most general ansatz since it involves 4 free constants. Thus one can incorporate any initial values for $x_1(0)$, $x_2(0)$, $\dot{x}_1(0)$, $\dot{x}_2(0)$.
For example, the initial conditions are
C2 sin ϕ2 = 0.
C1 sin ϕ1 = 0.
EXERCISE
Problem. Two equal masses move without friction on a plate. They are connected to each other and to the wall by two springs, as is indicated by Fig. 7.3. The two spring constants are equal, and the motion shall be restricted to a straight line (one-dimensional motion).
Find
(a) the equations of motion,
(b) the normal frequencies, and
(c) the amplitude ratios of the normal vibrations and the general solution.
Fig. 7.3. Two equal masses coupled by two equal springs.
Solution. (a) Let x1 and x2 be the displacements from the rest positions. The equa-
tions of motion then read
From the requirement for nontrivial solutions of the system of equations, it follows
that the determinant of coefficients vanishes:
\[
D = \begin{vmatrix} 2k - m\omega^2 & -k \\ -k & k - m\omega^2 \end{vmatrix} = 0.
\]
7 Vibrations of Coupled Mass Points 85
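From this follows the determining equation for the eigenfrequencies,
\[
\omega^4 - 3\,\frac{k}{m}\,\omega^2 + \frac{k^2}{m^2} = 0,
\]
with the positive solutions
\[
\omega_1 = \frac{\sqrt{5} + 1}{2}\sqrt{\frac{k}{m}} \quad\text{and}\quad \omega_2 = \frac{\sqrt{5} - 1}{2}\sqrt{\frac{k}{m}}, \qquad \omega_1 > \omega_2.
\]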
(c) By inserting the eigenfrequencies in (7.11) one sees that the higher frequency
ω1 corresponds to the opposite-phase mode, and the lower frequency ω2 to the equal-
phase normal vibration:
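with $\omega_1^2 = \frac{1}{2}\left(3 + \sqrt{5}\right)\frac{k}{m}$, it follows from (7.11) that $A_2 = -\frac{\sqrt{5} - 1}{2}\,A_1$,
with $\omega_2^2 = \frac{1}{2}\left(3 - \sqrt{5}\right)\frac{k}{m}$, it follows from (7.11) that $A_2 = \frac{\sqrt{5} + 1}{2}\,A_1$.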
Since the two mass points are fixed in different ways, we find amplitudes of different
magnitudes.
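As a cross-check of parts (b) and (c), the sketch below solves the quadratic equation for ω² and tests the amplitude ratios against both rows of the coefficient determinant. The values of k and m and the function name `exercise_7_1_modes` are illustrative assumptions.

```python
import math

def exercise_7_1_modes(k, m):
    """Return [(omega, A2/A1), ...] for w^4 - 3 (k/m) w^2 + (k/m)^2 = 0."""
    p = k / m
    disc = math.sqrt(9 * p * p - 4 * p * p)
    modes = []
    for w2 in ((3 * p + disc) / 2, (3 * p - disc) / 2):
        ratio = (2 * k - m * w2) / k      # first row: (2k - m w^2) A1 - k A2 = 0
        modes.append((math.sqrt(w2), ratio))
    return modes

k, m = 3.0, 1.5                            # illustrative values
(w_hi, ratio_hi), (w_lo, ratio_lo) = exercise_7_1_modes(k, m)

# second row of the determinant: -k A1 + (k - m w^2) A2 = 0, with A1 = 1
res_hi = -k + (k - m * w_hi ** 2) * ratio_hi
res_lo = -k + (k - m * w_lo ** 2) * ratio_lo
print(w_hi, ratio_hi, w_lo, ratio_lo)
```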
The general solution is obtained as a superposition of the normal vibrations, using
the calculated amplitude ratios:
EXERCISE
Problem. Two pendulums of equal mass and length are connected by a spiral spring.
They vibrate in a plane. The coupling is weak (i.e., the two eigenmodes are not very
different). Find the motion with small amplitudes.
Fig. 7.4.
ml α̈ = −mg sin α.
\[
\begin{aligned}
\ddot{x}_1 &= -\frac{g}{l}\,x_1 - \frac{k}{m}\,(x_1 - x_2),\\
\ddot{x}_2 &= -\frac{g}{l}\,x_2 + \frac{k}{m}\,(x_1 - x_2).
\end{aligned} \tag{7.12}
\]
This coupled set of differential equations can be decoupled by introducing the coordi-
nates
u1 = x1 − x2 and u2 = x1 + x2 .
In these coordinates the equations decouple, $\ddot{u}_1 = -(g/l + 2k/m)\,u_1$ and $\ddot{u}_2 = -(g/l)\,u_2$, with the solutions
\[
\begin{aligned}
u_1 &= A_1 \cos\omega_1 t + B_1 \sin\omega_1 t,\\
u_2 &= A_2 \cos\omega_2 t + B_2 \sin\omega_2 t,
\end{aligned} \tag{7.13}
\]
where $\omega_1 = \sqrt{g/l + 2(k/m)}$, $\omega_2 = \sqrt{g/l}$ are the eigenfrequencies of the two vibrations. The coordinates u1, u2 are called normal coordinates. Normal coordinates are often introduced to decouple a coupled system of differential equations. The coordinate u1 = x1 − x2 describes the opposite-phase and u2 = x1 + x2 the equal-phase normal vibration. The equal-phase normal mode proceeds as if the coupling were absent.
For sake of simplicity, we incorporate the initial conditions in (7.13). For the nor-
mal coordinates we then have
A1 = −A, A2 = A, B1 = B2 = 0,
and thus,
u1 = −A cos ω1 t, u2 = A cos ω2 t.
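\[
x_1 = \frac{1}{2}(u_1 + u_2) = \frac{A}{2}\,(-\cos\omega_1 t + \cos\omega_2 t),
\]
\[
x_2 = \frac{1}{2}(u_2 - u_1) = \frac{A}{2}\,(\cos\omega_1 t + \cos\omega_2 t).
\]
After transforming the angular functions, one has
\[
x_1 = A \sin\left(\frac{\omega_1 - \omega_2}{2}\,t\right) \sin\left(\frac{\omega_1 + \omega_2}{2}\,t\right),
\]
\[
x_2 = A \cos\left(\frac{\omega_1 - \omega_2}{2}\,t\right) \cos\left(\frac{\omega_1 + \omega_2}{2}\,t\right).
\]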
hence, the frequency ω1 − ω2 is small. The vibrations x1 (t) and x2 (t) can then be
interpreted as follows: The amplitude factor of the pendulum vibrating with the fre-
quency ω1 + ω2 is slowly modulated by the frequency ω1 − ω2 . This process is called
beat vibration. Figure 7.6 illustrates the process. The two pendulums exchange their
energy with the amplitude modulation frequency ω1 − ω2 . If one pendulum reaches
its maximum amplitude (energy), the other pendulum comes to rest. This complete
energy transfer occurs only for identical pendulums. If the pendulums differ in mass
or length, the energy transfer becomes incomplete; the pendulums vary in amplitudes
but without coming to rest.
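The beat behavior can be made concrete with a short numerical check that the sum form and the product form of x1(t), x2(t) agree, and that the slowly varying factor of x2 vanishes when that of x1 is maximal. This is only a sketch; the values of g, l, k, m, and A are illustrative assumptions chosen for weak coupling.

```python
import math

g, l, k, m, A = 9.81, 1.0, 0.5, 1.0, 0.1       # illustrative, weak coupling
w1 = math.sqrt(g / l + 2 * k / m)               # opposite-phase eigenfrequency
w2 = math.sqrt(g / l)                           # equal-phase eigenfrequency

def sum_form(t):
    """x1, x2 as superposition of the two normal vibrations."""
    x1 = 0.5 * A * (-math.cos(w1 * t) + math.cos(w2 * t))
    x2 = 0.5 * A * (math.cos(w1 * t) + math.cos(w2 * t))
    return x1, x2

def beat_form(t):
    """x1, x2 as a fast vibration with a slowly modulated amplitude."""
    d = 0.5 * (w1 - w2) * t
    s = 0.5 * (w1 + w2) * t
    return A * math.sin(d) * math.sin(s), A * math.cos(d) * math.cos(s)

max_diff = max(abs(a - b)
               for t in (0.1 * j for j in range(500))
               for a, b in zip(sum_form(t), beat_form(t)))

# time at which the envelope of pendulum 1 is maximal and that of pendulum 2 is zero
t_transfer = math.pi / (w1 - w2)
envelope_1 = abs(A * math.sin(0.5 * (w1 - w2) * t_transfer))
envelope_2 = abs(A * math.cos(0.5 * (w1 - w2) * t_transfer))
print(max_diff, t_transfer, envelope_1, envelope_2)
```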
Fig. 7.6.
We consider another vibrating mass system: the vibrating chain. The “chain” is a mass-
less thread set with N mass points. All mass points have the mass m and are fixed to
the thread at equal distances a. The points 0 and N + 1 at the ends of the thread are
tightly fixed and do not participate in the vibration. The displacement from the rest po-
sition in y-direction is assumed to be relatively small, so that the minor displacement
in x-direction is negligible. The total string tension T is only due to the clamping of
the end points and is constant over the entire thread.
If one picks out the νth particle, the forces acting on this particle are due to the dis-
placements of the particles (ν − 1) and (ν + 1). According to Fig. 7.7 the backdriving
forces are given by
Fig. 7.7.
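\[
\tan\alpha = \frac{y_\nu - y_{\nu-1}}{a} \quad\text{and}\quad \tan\beta = \frac{y_\nu - y_{\nu+1}}{a}.
\]
Hence, the forces are given by
\[
\mathbf{F}_{\nu-1} = -T\,\frac{y_\nu - y_{\nu-1}}{a}\,\mathbf{e}_2, \qquad
\mathbf{F}_{\nu+1} = -T\,\frac{y_\nu - y_{\nu+1}}{a}\,\mathbf{e}_2.
\]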
1 It is recommended that the reader go through Chap. 8 (“The Vibrating String”) before studying
this section. The concepts presented here will be more easily understood, and the mathematical ap-
proaches will be more transparent in their physical motivation.
7.1 The Vibrating Chain 89
The total backdriving force is the sum Fν−1 + Fν+1 , i.e., the equation of motion for
the particle reads
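\[
m\,\frac{d^2 y_\nu}{dt^2}\,\mathbf{e}_2 = -T\,\frac{y_\nu - y_{\nu-1}}{a}\,\mathbf{e}_2 - T\,\frac{y_\nu - y_{\nu+1}}{a}\,\mathbf{e}_2
\]
or
\[
\frac{d^2 y_\nu}{dt^2} = \frac{T}{ma}\,(y_{\nu-1} - 2y_\nu + y_{\nu+1}). \tag{7.14}
\]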
Since the index ν runs from ν = 1 to ν = N, one obtains a system of N coupled differential equations. Considering that the endpoints are fixed, by setting for the indices ν = 0 and ν = N + 1
\[
y_0 = 0, \qquad y_{N+1} = 0,
\]
one obtains from the differential equation (7.14) with the indices ν = 1 and ν = N the differential equations for the first and last particle that can participate in the vibration:
\[
\begin{aligned}
m\,\frac{d^2 y_1}{dt^2} &= \frac{T}{a}\,(-2y_1 + y_2),\\
m\,\frac{d^2 y_N}{dt^2} &= \frac{T}{a}\,(y_{N-1} - 2y_N).
\end{aligned} \tag{7.15}
\]
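We now look for the eigenfrequencies of the particle system, i.e., the frequencies of vibration common to all particles. To get a determining equation for the eigenfrequency $\omega_n$, we introduce in (7.14) the ansatz
\[
y_\nu = A_\nu \cos\omega t. \tag{7.16}
\]
We obtain
\[
-m\omega^2 \cdot A_\nu \cdot \cos\omega t = \frac{T}{a}\,(A_{\nu-1} - 2A_\nu + A_{\nu+1})\cos\omega t,
\]
and after rewriting,
\[
-A_{\nu-1} + \left(2 - \frac{ma\omega^2}{T}\right)A_\nu - A_{\nu+1} = 0, \qquad \nu = 2, \ldots, N - 1. \tag{7.17a}
\]
By insertion of (7.16) into (7.15), we get the equations for the first and the last vibrating particle:
\[
\begin{aligned}
\left(2 - \frac{ma\omega^2}{T}\right)A_1 - A_2 &= 0,\\
-A_{N-1} + \left(2 - \frac{ma\omega^2}{T}\right)A_N &= 0.
\end{aligned} \tag{7.17b}
\]
With the abbreviation
\[
\frac{2T - ma\omega^2}{T} = c, \tag{7.18}
\]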
the system of equations becomes
\[
\begin{aligned}
cA_1 - A_2 &= 0\\
-A_1 + cA_2 - A_3 &= 0\\
-A_2 + cA_3 - A_4 &= 0\\
&\;\;\vdots\\
-A_{N-1} + cA_N &= 0.
\end{aligned}
\]
This is a system of homogeneous linear equations for the coefficients Aν . For any
nontrivial solution of the equation system (not all Aν = 0) the determinant of coeffi-
cients must vanish. This determinant has the form
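\[
D_N = \begin{vmatrix}
c & -1 & 0 & 0 & \cdots & 0 & 0 & 0\\
-1 & c & -1 & 0 & \cdots & 0 & 0 & 0\\
0 & -1 & c & -1 & \cdots & 0 & 0 & 0\\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots\\
0 & 0 & 0 & 0 & \cdots & -1 & c & -1\\
0 & 0 & 0 & 0 & \cdots & 0 & -1 & c
\end{vmatrix}.
\]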
It has N rows and N columns. The eigenfrequencies are obtained as solution of the
equation
DN = 0.
The left-hand determinant has exactly the same form as DN , but is lower by one
order (N − 1 rows, N − 1 columns). It would be the determinant of coefficients for
a similar system with one mass point less, i.e., DN−1 . The right-hand determinant is
The last determinant is just $D_{N-2}$. Hence we get the determinant recursion equation
\[
D_N = cD_{N-1} - D_{N-2}. \tag{7.19}
\]
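Moreover,
\[
D_1 = |c| = c \quad\text{and}\quad D_2 = \begin{vmatrix} c & -1 \\ -1 & c \end{vmatrix} = c^2 - 1. \tag{7.20}
\]
For the recursion (7.19) to hold also for N = 2, we must set
\[
D_0 = 1. \tag{7.21}
\]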
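Our problem is now to solve the determinant equation (7.19). We use the ansatz
\[
D_N = p^N,
\]
which leads to
\[
p^N = cp^{N-1} - p^{N-2},
\]
i.e., after division by $p^{N-2}$, to $p^2 - cp + 1 = 0$ with the solutions $p = c/2 \pm \sqrt{c^2/4 - 1}$. The mathematical possibility $p^{N-2} = 0$ that leads to $p \equiv 0$ does not obey the boundary condition $D_0 = 1$ and is therefore inapplicable. Substituting $c = 2\cos\Theta$, we obtain for p
\[
p = \cos\Theta \pm \sqrt{\cos^2\Theta - 1} = \cos\Theta \pm i\sin\Theta = e^{\pm i\Theta},
\]
and
\[
p^N = e^{\pm iN\Theta} = \cos N\Theta \pm i\sin N\Theta.
\]
Since the equation system (7.19) is homogeneous and linear, the general solution is a linear combination of $\cos N\Theta$ and $\sin N\Theta$:
\[
D_N = G\cos N\Theta + H\sin N\Theta. \tag{7.22}
\]
The boundary conditions $D_0 = 1$ and $D_1 = c = 2\cos\Theta$ yield
\[
G = 1, \qquad H = \cot\Theta,
\]
so that
\[
D_N = \cos N\Theta + \frac{\sin N\Theta\,\cos\Theta}{\sin\Theta} = \frac{\sin(N + 1)\Theta}{\sin\Theta},
\]
because $\sin\Theta\cos N\Theta + \sin N\Theta\cos\Theta = \sin(N + 1)\Theta$.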
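The closed form for the determinant can be compared directly with the recursion D_N = c D_{N−1} − D_{N−2} and its starting values D_0 = 1, D_1 = c. The following is a small sketch; the function names and the sample angle 0.7 are illustrative assumptions.

```python
import math

def det_recursive(N, c):
    """D_N from the recursion D_N = c D_{N-1} - D_{N-2}, with D_0 = 1, D_1 = c."""
    d_prev, d = 1.0, c
    if N == 0:
        return d_prev
    for _ in range(N - 1):
        d_prev, d = d, c * d - d_prev
    return d

def det_closed(N, theta):
    """Closed form sin((N + 1) theta) / sin(theta), valid for c = 2 cos(theta)."""
    return math.sin((N + 1) * theta) / math.sin(theta)

theta = 0.7                                   # arbitrary sample angle
c = 2 * math.cos(theta)
differences = [abs(det_recursive(N, c) - det_closed(N, theta)) for N in range(9)]
print(max(differences))
```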
For any nontrivial solution of the equation system we must have $D_N = 0$; it follows that
\[
\sin((N + 1)\Theta) = 0,
\]
or
\[
\Theta = \Theta_n = \frac{n\pi}{N + 1}, \qquad n = 1, \ldots, N. \tag{7.23}
\]
$n = 0$ drops out since it leads to the solution $\Theta_0 = 0$, and hence to $D_N = N + 1 \neq 0$, and thus does not lead to a solution of the equation $D_N = 0$. For c we then get according to (7.18):
\[
c = 2 - \frac{\omega^2 ma}{T} = 2\cos\frac{n\pi}{N + 1},
\]
and ω is calculated from
\[
\omega^2 = \omega_{(n)}^2 = \frac{2T}{ma}\left(1 - \cos\frac{n\pi}{N + 1}\right) \tag{7.24a}
\]
as
\[
\omega_{(n)} = \sqrt{\frac{2T}{ma}\left(1 - \cos\frac{n\pi}{N + 1}\right)}. \tag{7.24b}
\]
These are the eigenfrequencies of the system; the fundamental frequency is obtained for n = 1 as the lowest eigenfrequency. There are exactly N eigenfrequencies, as is seen from (7.23): For $n \geq N + 1$, we set $n = (N + 1) + \tau$ and find
\[
\Theta_n = \pi + \frac{\tau\pi}{N + 1}.
\]
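As a plausibility check of (7.23) and (7.24b), one can list the N eigenfrequencies for a small chain and confirm that each of them makes the determinant D_N vanish. This is a sketch under assumed unit values T = m = a = 1 and N = 6; the helper names are hypothetical.

```python
import math

def chain_eigenfrequencies(N, T, m, a):
    """omega_(n) = sqrt(2T/(ma) * (1 - cos(n pi / (N + 1)))), n = 1..N."""
    return [math.sqrt(2 * T / (m * a) * (1 - math.cos(n * math.pi / (N + 1))))
            for n in range(1, N + 1)]

def det_recursive(N, c):
    """D_N from D_N = c D_{N-1} - D_{N-2}, with D_0 = 1, D_1 = c."""
    d_prev, d = 1.0, c
    if N == 0:
        return d_prev
    for _ in range(N - 1):
        d_prev, d = d, c * d - d_prev
    return d

N, T, m, a = 6, 1.0, 1.0, 1.0                    # illustrative values
freqs = chain_eigenfrequencies(N, T, m, a)
# c = (2T - m a omega^2) / T for every eigenfrequency
dets = [det_recursive(N, (2 * T - m * a * w * w) / T) for w in freqs]
upper_bound = 2 * math.sqrt(T / (m * a))         # value of omega belonging to c = -2
print(freqs)
print(dets)
```

All frequencies lie strictly between 0 and the value of ω that belongs to c = −2.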
If one inserts the above expression into (7.17a) and (7.17b) for ω and c, respectively,
one obtains for the amplitudes of the normal vibration
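\[
\begin{aligned}
-A^{(n)}_{\nu-1} + 2A^{(n)}_\nu \cos\frac{n\pi}{N + 1} - A^{(n)}_{\nu+1} &= 0,\\
2A^{(n)}_1 \cos\frac{n\pi}{N + 1} &= A^{(n)}_2,\\
2A^{(n)}_N \cos\frac{n\pi}{N + 1} &= A^{(n)}_{N-1},
\end{aligned} \tag{7.25}
\]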
where the $A_\nu$ depend on n ($A_\nu = A^{(n)}_\nu$). The system of equations (7.25) for the $A_\nu$ is the same as that for the determinants $D_N$ (equation (7.19)), with the same coefficient $c = 2\cos(n\pi/(N + 1)) = 2\cos\Theta_n$. Only the boundary conditions (7.25) do not correspond to those for the $D_N$ (see (7.20) and (7.21)). The general solution for the coefficients $A_\nu$ is therefore obtained from (7.22) with at first arbitrary coefficients $E^{(n)}$:
\[
A^{(n)}_\nu = E^{(n)}_1 \cos\nu\Theta_n + E^{(n)}_2 \sin\nu\Theta_n,
\]
or, in detail,
\[
A^{(n)}_\nu = E^{(n)}_1 \cos\frac{n\pi\nu}{N + 1} + E^{(n)}_2 \sin\frac{n\pi\nu}{N + 1}. \tag{7.26}
\]
Since the points ν = 0 and ν = N + 1 are tightly clamped, for all eigenmodes n we
have y0 = yN+1 = 0, or
\[
A^{(n)}_0 = A^{(n)}_{N+1} = 0 \quad\text{(boundary condition)}.
\]
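\[
\begin{aligned}
y_\nu &= \sum_{n=1}^{N} \sin\frac{n\pi\nu}{N + 1}\left(E^{(n)}_4 \sin\omega_{(n)}t + E^{(n)}_2 \cos\omega_{(n)}t\right)\\
&= \sum_{n=1}^{N} \sin\frac{n\pi\nu}{N + 1}\left(a_n \sin\omega_{(n)}t + b_n \cos\omega_{(n)}t\right),
\end{aligned} \tag{7.29}
\]
where the constants $E^{(n)}_2$ and $E^{(n)}_4$ were renamed $b_n$ and $a_n$, respectively. They are determined from the initial conditions.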
The equation of the vibrating chord must follow from the limit for N → ∞ and
a → 0 (continuous mass distribution):
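\[
\begin{aligned}
\sin\frac{n\pi\nu}{N + 1} &= \sin\frac{n\pi a\nu}{(N + 1)a} && (x_\nu = a\nu \text{ takes only discrete values})\\
&= \sin\frac{\pi n(a\nu)}{l + a} && (l = Na \text{ is the length of the chord}),
\end{aligned}
\]
\[
\lim_{\substack{N\to\infty\\ a\to 0}} \sin\frac{\pi nx}{l + a} = \sin\frac{\pi nx}{l} \qquad (x \text{ continuous}).
\]
For the eigenfrequencies (7.24b), the expansion $1 - \cos\varepsilon \approx \varepsilon^2/2$ together with the linear mass density $\sigma = m/a$ gives $\omega_{(n)}^2 \approx \frac{T}{ma}\left(\frac{n\pi a}{l + a}\right)^2 \to \frac{T}{\sigma}\left(\frac{n\pi}{l}\right)^2$,
i.e.,
\[
\omega_{(n)} = \sqrt{\frac{T}{\sigma}}\,\frac{n\pi}{l}.
\]
Hence, one has as a limit
\[
y_n(x) = \sin\frac{n\pi x}{l}\left(a_n \sin\left(\sqrt{\frac{T}{\sigma}}\,\frac{n\pi}{l}\,t\right) + b_n \cos\left(\sqrt{\frac{T}{\sigma}}\,\frac{n\pi}{l}\,t\right)\right). \tag{7.30}
\]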
This is the equation for the nth eigenmode of the vibrating chord (l is the chord length).
It will be derived once again in the next chapter in a different way and will then be
discussed in more detail.
EXERCISE
Problem. When solving the determinant equation (7.19), we have made a mathematical restriction for c by setting $c = 2\cos\Theta$.
Show that for the cases
(a) |c| = 2,
(b) c < −2
the eigenvalue equation $D_N = 0$ cannot be satisfied. Clarify that thereby the special choice of the constant c is justified.
Solution. (a)
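Since $|D_n|$ monotonically increases in n, and $|D_1| = 2 > 0$, we have $|D_N| > 0$. Therefore $D_N = 0$ cannot be satisfied: $\omega = 0$ and $\omega = 2\sqrt{T/(ma)}$ are not eigenfrequencies of the vibrating chain.
(b) By inserting the ansatz $D_n = Ap^n$, $p \neq 0$, we also find the solution of the recursion formula $D_n = cD_{n-1} - D_{n-2}$, $D_1 = c$, $D_0 = 1$:
\[
\left.\begin{aligned}
p_1 &= \tfrac{1}{2}\left(c + (c^2 - 4)^{1/2}\right) < 0\\
p_2 &= \tfrac{1}{2}\left(c - (c^2 - 4)^{1/2}\right) < 0
\end{aligned}\right\}\quad 0 > p_1 > p_2. \tag{7.34}
\]
The general solution $D_n = A_1 p_1^n + A_2 p_2^n$ must satisfy $D_0 = 1$ and $D_1 = c$:
\[
A_1 + A_2 = 1,
\]
\[
\frac{A_1}{2}\left(c + (c^2 - 4)^{1/2}\right) + \frac{A_2}{2}\left(c - (c^2 - 4)^{1/2}\right) = c,
\]
\[
A_1 = \frac{c + (c^2 - 4)^{1/2}}{2(c^2 - 4)^{1/2}} \quad\Leftrightarrow\quad A_2 = \frac{-c + (c^2 - 4)^{1/2}}{2(c^2 - 4)^{1/2}}. \tag{7.36}
\]
One then has
\[
D_n = \frac{1}{2}\,\frac{c + (c^2 - 4)^{1/2}}{(c^2 - 4)^{1/2}}\,p_1^n + \frac{1}{2}\,\frac{(c^2 - 4)^{1/2} - c}{(c^2 - 4)^{1/2}}\,p_2^n = \frac{1}{(c^2 - 4)^{1/2}}\left(p_1^{n+1} - p_2^{n+1}\right). \tag{7.37}
\]
The condition $D_N = 0$ would therefore require $(p_2/p_1)^{N+1} = 1$. But now $0 > p_1 > p_2$, hence $(p_2/p_1)^{N+1} > 1$. Thus, for the case c < −2 eigenfrequencies do not exist either.
These supplementary investigations can be summarized as follows: The possible eigenfrequencies of the vibrating chain lie between 0 and $2\sqrt{T/(ma)}$:
\[
0 < |\omega| < 2\sqrt{\frac{T}{ma}}. \tag{7.39}
\]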
EXERCISE
Problem. Two mass points (equal mass m) lie on a frictionless horizontal plane and
are fixed to each other and to two fixed points A and B by means of springs (spring
tension T , length l).
(a) Establish the equation of motion.
(b) Find the normal vibrations and frequencies and describe the motions.
Fig. 7.8.
Fig. 7.9.
Solution. (a) For the vibrating chain with n mass points, which are equally spaced
by the distance l, the equations of motion
\[
\frac{d^2 y_N}{dt^2} = \frac{T}{ml}\,(y_{N-1} - 2y_N + y_{N+1}) \qquad (N = 1, \ldots, n)
\]
were established. For the first and second mass point, we have
To get the nontrivial solution, the determinant of coefficients must vanish, i.e.,
\[
D = \begin{vmatrix} 2k - \omega^2 & -k \\ -k & 2k - \omega^2 \end{vmatrix} = 0;
\]
EXERCISE
Problem. Three mass points are fixed equidistantly on a string that is fixed at its
endpoints.
(a) Determine the eigenfrequencies of this system if the string tension T can be con-
sidered constant (this holds for small amplitudes).
(b) Discuss the eigenvibrations of the system. Hint: Note Exercises 8.1 and 8.2 in
Chap. 8.
Fig. 7.10.
Solution. (a) For the equations of motion of the system, one finds straightaway
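\[
\begin{aligned}
m\ddot{x}_1 + \frac{2T}{L}\,x_1 - \frac{T}{L}\,x_2 &= 0,\\
m\ddot{x}_2 + \frac{2T}{L}\,x_2 - \frac{T}{L}\,x_3 - \frac{T}{L}\,x_1 &= 0,\\
m\ddot{x}_3 + \frac{2T}{L}\,x_3 - \frac{T}{L}\,x_2 &= 0.
\end{aligned} \tag{7.42}
\]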
Assuming periodic oscillations, i.e., solutions of the form
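As in Exercise 8.2, one gets the equation for the frequencies of the system from the expansion of the determinant of coefficients:
\[
\left(\frac{Lm}{T}\right)^3\omega^6 - 6\left(\frac{Lm}{T}\right)^2\omega^4 + \frac{10Lm}{T}\,\omega^2 - 4 = 0
\]
or
\[
\left(\frac{Lm}{T}\right)^3\Omega^3 - 6\left(\frac{Lm}{T}\right)^2\Omega^2 + \frac{10Lm}{T}\,\Omega - 4 = 0 \tag{7.44}
\]
with $\Omega = \omega^2$. This cubic equation with the coefficients
\[
a = \left(\frac{Lm}{T}\right)^3, \qquad b = -6\left(\frac{Lm}{T}\right)^2, \qquad c = \frac{10Lm}{T}, \qquad d = -4
\]
can be reduced to the form $y^3 + 3py + 2q = 0$ (see Exercise 8.2). With
\[
y = \Omega + \frac{b}{3a}, \qquad 3p = -\frac{1}{3}\,\frac{b^2}{a^2} + \frac{c}{a} = -2\,\frac{T^2}{L^2m^2}, \qquad 2q = \frac{2}{27}\,\frac{b^3}{a^3} - \frac{1}{3}\,\frac{bc}{a^2} + \frac{d}{a} = 0
\]
we get $q^2 + p^3 < 0$, i.e., there are three real solutions which by using the auxiliary quantities
\[
\cos\varphi = -\frac{q}{\sqrt{-p^3}} = 0, \qquad y_1 = -2\sqrt{-p}\,\cos\left(\frac{\varphi}{3} - \frac{\pi}{3}\right) = -\sqrt{2}\,\frac{T}{Lm},
\]
\[
y_2 = -2\sqrt{-p}\,\cos\left(\frac{\varphi}{3} + \frac{\pi}{3}\right) = 0,
\]
\[
y_3 = 2\sqrt{-p}\,\cos\frac{\varphi}{3} = \sqrt{2}\,\frac{T}{Lm}
\]
can be calculated as
\[
\omega_1 = \sqrt{0.6\,\frac{T}{Lm}}, \qquad \omega_2 = \sqrt{\frac{2T}{Lm}}, \qquad \omega_3 = \sqrt{3.4\,\frac{T}{Lm}}.
\]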
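The rounded factors 0.6, 2, and 3.4 can be cross-checked in two ways: by inserting the roots into the cubic (7.44), written in the dimensionless variable x = Lmω²/T, and by evaluating the chain formula (7.24a) for N = 3 with spacing a = L. The following sketch does both; the variable names are illustrative.

```python
import math

def char_poly(x):
    """Left-hand side of (7.44) in the dimensionless variable x = L m omega^2 / T."""
    return x ** 3 - 6 * x ** 2 + 10 * x - 4

# roots from the reduced cubic: y = 0, +-sqrt(2) in units of T/(Lm), and x = y + 2
roots_from_y = [2 + y for y in (-math.sqrt(2), 0.0, math.sqrt(2))]

# the same numbers from the chain formula with N = 3 mass points
roots_from_chain = [2 * (1 - math.cos(n * math.pi / 4)) for n in (1, 2, 3)]

residuals = [char_poly(x) for x in roots_from_y]
print([round(x, 3) for x in roots_from_y])
print([round(x, 3) for x in roots_from_chain])
```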
(b) From the first and third equation of (7.43), one finds for the amplitude ratios
\[
\frac{B}{A} = \frac{B}{C} = 2 - \frac{mL\omega^2}{T}. \tag{7.45}
\]
(3) $\omega = \omega_3 = (3.4T/Lm)^{1/2}$ inserted into (7.45) ⇒ $B_3/A_3 = B_3/C_3 = -1.4$, i.e., $B_3 = -1.4A_3 = -1.4C_3$. The first and the last mass are deflected in the same direction, while the central mass vibrates with different amplitude in the opposite direction.
The system discussed here has three vibration modes with 0, 1, and 2 nodes, respectively. For a system with n mass points, both the number of modes and the number of possible nodes (n − 1) increase. A system with n → ∞ is called a “vibrating string.”
Fig. 7.13.
A comparison of the figures clearly shows the approximation of the vibrating string
by the system of three mass points.
Fig. 7.14.
EXERCISE
Solution. (a) Let x1 , x2 , x3 be the displacements of the atoms from the equilibrium
positions at time t. From Newton’s equations and Hooke’s law then it follows that
Fig. 7.15.
mẍ1 = −k(x1 − x2 ),
M ẍ2 = −k(x2 − x3 ) − k(x2 − x1 ) = k(x3 + x1 − 2x2 ), (7.46)
mẍ3 = −k(x3 − x2 ).
(b) By inserting the ansatz x1 = a1 cos ωt, x2 = a2 cos ωt, and x3 = a3 cos ωt
into (7.46), one obtains
or
By factorization of (7.49) with respect to ω, one obtains for the eigenvibrations of the
system:
\[
\omega_1 = 0, \qquad \omega_2 = \sqrt{\frac{k}{m}}, \qquad \omega_3 = \sqrt{\frac{k}{m}\left(1 + \frac{2m}{M}\right)}.
\]
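These three frequencies can be verified by inserting them, together with matching amplitude vectors, into (7.46) with the ansatz $x_i = a_i\cos\omega t$. In the sketch below the amplitude vectors (1, 1, 1), (1, 0, −1), and (1, −2m/M, 1) are worked out here and are not taken from the text; the values of k, m, and M are likewise illustrative assumptions.

```python
def residuals(omega2, amps, k, m, M):
    """Residuals of (7.46) for x_i = a_i cos(omega t); all zero for an eigenmode."""
    a1, a2, a3 = amps
    return (
        -m * omega2 * a1 + k * (a1 - a2),
        -M * omega2 * a2 - k * (a3 + a1 - 2 * a2),
        -m * omega2 * a3 + k * (a3 - a2),
    )

k, m, M = 1.0, 16.0, 12.0        # illustrative values (outer masses m, central mass M)

modes = [
    (0.0, (1.0, 1.0, 1.0)),                               # uniform translation
    (k / m, (1.0, 0.0, -1.0)),                            # central mass at rest
    (k / m * (1 + 2 * m / M), (1.0, -2 * m / M, 1.0)),    # central mass against the outer ones
]
all_residuals = [residuals(w2, amps, k, m, M) for w2, amps in modes]
print(all_residuals)
```

The mode with ω = 0 is a rigid translation of the whole system, not a vibration.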
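A string of length l is fixed at both ends. Forces T thereby appear that are constant in time and independent of the position. The string tension acts as a backdriving force when the string is displaced out of the rest position. A string element $\Delta s$ at the position x experiences the force
\[
F_y = T\sin\Theta(x + \Delta x) - T\sin\Theta(x)
\]
in the y-direction. Accordingly, along the x-direction the string element $\Delta s$ is pulled by the force
\[
F_x = T\cos\Theta(x + \Delta x) - T\cos\Theta(x).
\]
For small displacements, $F_x = 0$. Applying Newton's law to the motion in the y-direction, with the mass $\sigma\,\Delta s = \sigma\sqrt{\Delta x^2 + \Delta y^2}$ of the string element,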
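one has
\[
\sigma\,\frac{\sqrt{\Delta x^2 + \Delta y^2}}{\Delta x}\,\frac{\partial^2 y}{\partial t^2} = \sigma\sqrt{1 + \left(\frac{\Delta y}{\Delta x}\right)^2}\,\frac{\partial^2 y}{\partial t^2} = \frac{T\sin\Theta(x + \Delta x) - T\sin\Theta(x)}{\Delta x}. \tag{8.4}
\]
By forming the limit for $\Delta x, \Delta y \to 0$ on both sides of (8.4), we obtain
\[
\sigma\sqrt{1 + \left(\frac{\partial y}{\partial x}\right)^2}\,\frac{\partial^2 y}{\partial t^2} = T\,\frac{\partial}{\partial x}(\sin\Theta). \tag{8.5}
\]
For $\sin\Theta$ we have $\sin\Theta = \tan\Theta/\sqrt{1 + \tan^2\Theta}$. Since $\tan\Theta = \partial y/\partial x$ (inclination of the curve), we write
\[
\sin\Theta = \frac{\partial y/\partial x}{\sqrt{1 + (\partial y/\partial x)^2}}. \tag{8.6}
\]
In order to simplify the equation, we again consider only small displacements of the string in y-direction. Then $\partial y/\partial x \ll 1$, and $(\partial y/\partial x)^2$ can be neglected too.
Thus, we obtain
\[
\sigma\,\frac{\partial^2 y}{\partial t^2} = T\,\frac{\partial}{\partial x}\left(\frac{\partial y}{\partial x}\right) \tag{8.8}
\]
or
\[
\sigma\,\frac{\partial^2 y}{\partial t^2} = T\,\frac{\partial^2 y}{\partial x^2}. \tag{8.9}
\]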
The initial conditions specify the state of the string at the time t = 0 (initial excitation).
The excitation is performed by a displacement of the form f (x),
For solving the partial differential equation (PDE), we use the product ansatz y(x, t) =
X(x) · T (t). Such an approach is obvious, since we are looking for eigenvibrations.
These are defined so that all mass points (i.e., any string element at any position x)
vibrate with the same frequency. By the ansatz y(x, t) = X(x)·T (t), the time behavior
is decoupled from the spatial one. Thus we try to split the partial differential equation
into a function of the position X(x) and a function of the time T (t). Inserting y(x, t) =
X(x) · T(t) into the differential equation (8.10) yields
\[
\frac{\ddot{T}(t)}{T(t)} = c^2\,\frac{X''(x)}{X(x)}.
\]
Since one side depends only on x and the other side depends only on t, while x and t are independent of each other, there is only one possible solution: Both sides are constant. The constant will be denoted by $-\omega^2$.
\[
\frac{\ddot{T}}{T} = -\omega^2 \quad\text{or}\quad \ddot{T} + \omega^2 T = 0, \tag{8.11}
\]
and
\[
\frac{X''}{X} = -\frac{\omega^2}{c^2} \quad\text{or}\quad X'' + \frac{\omega^2}{c^2}\,X = 0. \tag{8.12}
\]
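The solutions of the differential equations (continuous harmonic vibrations) have the form
\[
T(t) = A\sin\omega t + B\cos\omega t, \qquad X(x) = C\sin\frac{\omega}{c}x + D\cos\frac{\omega}{c}x,
\]
so that
\[
y(x, t) = (A\sin\omega t + B\cos\omega t)\left(C\sin\frac{\omega}{c}x + D\cos\frac{\omega}{c}x\right). \tag{8.13}
\]
The constants A, B, C, and D are determined from the boundary and initial conditions.
From the boundary condition $y(0, t) = 0$, it follows for (8.13) that
\[
y(0, t) = 0 = D\,(A\sin\omega t + B\cos\omega t).
\]
Since the expression in brackets differs from zero, we must have D = 0. Then (8.13) simplifies to
\[
y(x, t) = C\sin\frac{\omega}{c}x\,(A\sin\omega t + B\cos\omega t).
\]
With the second boundary condition, we get
\[
y(l, t) = 0 = C\sin\frac{\omega}{c}l\,(A\sin\omega t + B\cos\omega t)
\quad\Rightarrow\quad 0 = C\sin\frac{\omega}{c}l.
\]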
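This equation will be satisfied if either of the following holds:
(1) C = 0, which gives only the trivial solution $y \equiv 0$ (string at rest), or
(2) $\sin(\omega l/c) = 0$, i.e., $\omega = \omega_n = n\pi c/l$ with $n = 1, 2, \ldots$
If the string starts from rest, i.e., $\partial y/\partial t = 0$ at $t = 0$, then
\[
a_n \cdot \frac{n\pi c}{l} \cdot \sin\frac{n\pi}{l}x = 0
\]
is satisfied for all x only if $a_n = 0$. Thus, the solution of the differential equation is
\[
y_n(x, t) = b_n \cdot \sin\frac{n\pi}{l}x\,\cos\frac{n\pi c}{l}t. \tag{8.14}
\]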
The parameter n describes the excitation states of a system, in this case those of the
8.2 Normal Vibrations 105
The boundary conditions y(0, t) = y(l, t) = 0 would have led to the conditions
\[
Ce^{\frac{\omega}{c}l} + De^{-\frac{\omega}{c}l} = 0, \qquad C + D = 0;
\]
with the solutions C = D = 0. The string would have remained at rest. But this is not
the desired solution.
Since the one-dimensional wave equation is a linear differential equation, one can obtain the most general solution, according to the superposition principle, by the superposition (addition) of the particular solutions:
\[
y(x, t) = \sum_{n=1}^{\infty} b_n \sin\frac{n\pi x}{l}\cos\frac{n\pi c}{l}t = \sum_{n=1}^{\infty} b_n \sin k_n x\,\cos\omega_n t.
\]
The coefficients $b_n$ can be calculated from the given initial curve by using the considerations on the Fourier series (see the next chapter):
\[
y(x, 0) \equiv f(x) = \sum_{n=1}^{\infty} b_n \sin\frac{n\pi x}{l}.
\]
The calculation of the Fourier coefficients $b_n$ will be shown in the next chapter. One then gets the following general solution of the differential equation:
\[
y(x, t) = \sum_{n=1}^{\infty}\left[\frac{2}{l}\int_0^l f(x')\sin\frac{n\pi x'}{l}\,dx'\right]\sin\frac{n\pi x}{l}\cos\frac{n\pi ct}{l}. \tag{8.15}
\]
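The series (8.15) can be evaluated numerically for a concrete initial curve. The sketch below uses a string plucked at its midpoint; the triangular shape, the values of l, c, and the pluck height h0, and the function names are illustrative assumptions. The integral is approximated by the midpoint rule and the series is truncated after 40 terms.

```python
import math

def sine_coefficients(f, l, n_max, samples=2000):
    """b_n = (2/l) * integral_0^l f(x) sin(n pi x / l) dx, by the midpoint rule."""
    h = l / samples
    xs = [(j + 0.5) * h for j in range(samples)]
    return [2.0 / l * h * sum(f(x) * math.sin(n * math.pi * x / l) for x in xs)
            for n in range(1, n_max + 1)]

def string_shape(x, t, b, l, c):
    """Truncated series (8.15): sum_n b_n sin(n pi x / l) cos(n pi c t / l)."""
    return sum(bn * math.sin(n * math.pi * x / l) * math.cos(n * math.pi * c * t / l)
               for n, bn in enumerate(b, start=1))

l, c, h0 = 1.0, 2.0, 0.01                       # illustrative values

def pluck(x):
    """Triangular initial displacement with height h0 at the midpoint."""
    return h0 * (2 * x / l if x < l / 2 else 2 * (1 - x / l))

b = sine_coefficients(pluck, l, n_max=40)
y_mid_start = string_shape(l / 2, 0.0, b, l, c)      # close to h0
y_mid_half = string_shape(l / 2, l / c, b, l, c)     # half a period of the fundamental later
print(b[0], b[1], y_mid_start, y_mid_half)
```

For this symmetric initial curve only odd n contribute, so after half a period of the fundamental the midpoint displacement is reversed.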
For a fixed time t , the spatial variation (positional dependence) of the normal vibra-
tion depends on the expression sin(nπx/ l) (for n > 1, sin(nπx/ l) has exactly n − 1
nodes). All mass points (position x) vibrate with the same frequency ωn .
At a definite position x, the time dependence of the normal vibration is represented
by the expression cos(nπc/ l)t . The wave number kn is defined as
\[
k_n \equiv \frac{\omega_n}{c} = \frac{n\pi}{l} = \frac{2\pi}{\lambda_n}, \tag{8.17}
\]
where $\lambda_n = 2l/n$ is the wavelength.
106 8 The Vibrating String
Fig. 8.2. Propagation of a perturbation f (x) along a long string: After the time t, the perturba-
tion has moved away by ct; it is then described by f (x − ct)
Let the maximum of the perturbation f(x) be at $x_0$. After the time t, it lies at
\[
x - ct = x_0,
\]
i.e., at $x = x_0 + ct$. The maximum thus moves with the velocity
\[
\frac{dx}{dt} = c
\]
along the string, namely to the right (positive x-direction). One can say that the perturbation f(x) moves along the string with the velocity
\[
\frac{dx}{dt} = c. \tag{8.23}
\]
The propagation velocity of small perturbations is called the sound velocity. One
easily realizes as above that f (x + ct) is also a solution of the wave equation and
represents a perturbation that moves to the left (negative x-direction). We are deal-
ing here with running waves, while for the tightly clamped string we have standing
waves.
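That a profile of the form f(x − ct) satisfies the wave equation $\partial^2 y/\partial t^2 = c^2\,\partial^2 y/\partial x^2$ can be checked with central finite differences. This is a rough numerical sketch; the Gaussian pulse, the step size h, and the sample point are illustrative assumptions.

```python
import math

def pulse(u):
    """Illustrative perturbation profile f(u): a Gaussian bump centered at u = 0."""
    return math.exp(-u * u)

def y(x, t, c):
    """Perturbation running to the right: y(x, t) = f(x - c t)."""
    return pulse(x - c * t)

def wave_residual(x, t, c, h=1e-3):
    """Central-difference estimate of y_tt - c^2 y_xx at the point (x, t)."""
    y_tt = (y(x, t + h, c) - 2 * y(x, t, c) + y(x, t - h, c)) / h ** 2
    y_xx = (y(x + h, t, c) - 2 * y(x, t, c) + y(x - h, t, c)) / h ** 2
    return y_tt - c ** 2 * y_xx

c = 2.0
residual = wave_residual(0.3, 0.1, c)

# the maximum, initially at x0 = 0, is found at x0 + c t after the time t
t = 0.25
grid = [j * 0.001 for j in range(-1000, 2001)]
x_max = max(grid, key=lambda x: y(x, t, c))
print(residual, x_max)
```

Replacing x − ct by x + ct in `y` gives the perturbation running to the left, which passes the same check.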
If a string is excited with an arbitrary normal frequency, there are points on the
string that remain at rest at any time (nodes).
The wavelength, the number of nodes, and the shape of normal vibrations can be
represented as a function of the index n (see Fig. 8.3).
EXERCISE
Problem. Consider a string of density σ that is stretched between two points and is
excited with small amplitudes.
(a) Calculate in general the kinetic and potential energy of the string.
(b) Calculate the kinetic and potential energy for waves of the form
\[
y = C\cos\frac{\omega(x - ct)}{c}
\]
with $T_0 = 500$ N, $C = 0.01$ m, and $\lambda = 0.1$ m.
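Solution. (a) The part PQ of the string has the mass $\sigma\,\Delta x$ and the velocity $\partial y/\partial t$. Its kinetic energy is then
\[
\Delta T = \frac{1}{2}\,\sigma\,\Delta x\left(\frac{\partial y}{\partial t}\right)^2. \tag{8.24}
\]
The total kinetic energy of the string between x = a and b is
\[
T = \frac{1}{2}\int_a^b \sigma\left(\frac{\partial y}{\partial t}\right)^2 dx. \tag{8.25}
\]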
Hence, the kinetic and potential energy are equal. If a, b are fixed points, then T and P
vary with time. But if we admit that a and b can propagate with the sound velocity c,
so that
a = A + ct and b = B + ct, (8.30)
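(b)
$$\frac{\partial y}{\partial t} = C\sin\left(\frac{\omega}{c}x - \omega t\right)\omega
\quad\Rightarrow\quad
\left(\frac{\partial y}{\partial t}\right)^2 = C^2\sin^2\left(\frac{\omega}{c}x - \omega t\right)\omega^2. \tag{8.32}$$
Insertion into (8.25) yields (a = 0, b = λ, and $\sigma = T_0/c^2$)
$$T = \frac{1}{2}\frac{T_0}{c^2}C^2\omega^2\int_0^{\lambda}\sin^2\left(\frac{\omega}{c}x - \omega t\right)dx = \frac{1}{2}\frac{T_0}{c^2}C^2\omega^2\cdot I. \tag{8.33}$$
With the substitution $z = (\omega/c)x - \omega t$, and since $(\omega/c)\lambda = 2\pi$ is a full period of the integrand,
$$I = \frac{c}{\omega}\int_{-\omega t}^{(\omega/c)\lambda - \omega t}\sin^2 z\,dz
= \frac{c}{\omega}\int_0^{(\omega/c)\lambda}\sin^2 z\,dz
= \frac{c}{\omega}\int_0^{2\pi}\sin^2 z\,dz \tag{8.34}$$
$$= \frac{c}{\omega}\left[\frac{1}{2}z - \frac{1}{4}\sin(2z)\right]_0^{2\pi} = \frac{c}{\omega}\,\pi$$
$$\Rightarrow\quad T = \frac{1}{2}\frac{T_0}{c^2}C^2\omega^2\,\frac{c}{\omega}\,\pi = \frac{\pi^2C^2T_0}{\lambda}, \qquad \lambda = 2\pi\frac{c}{\omega}. \tag{8.35}$$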
One gets the same expression for the potential energy. Insertion of the numerical values yields
$$T = P = \frac{500\ \text{N}}{0.1\ \text{m}}\,(0.01)^2\cdot\pi^2\ \text{m}^2 \approx 5\ \text{N m}.$$
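The closed form (8.35) can be cross-checked numerically. The following is a minimal sketch, not part of the original solution: it integrates the kinetic energy density of (8.25) over one wavelength with a midpoint rule. The function name, the step count, and the choice c = 1 (the result does not depend on c) are assumptions made for illustration.

```python
import math

def kinetic_energy_one_wavelength(T0, C, lam, c=1.0, t=0.0, steps=20000):
    """Midpoint-rule integral of (1/2) * sigma * (dy/dt)^2 over 0 <= x <= lam."""
    sigma = T0 / c**2                    # line density, as used in (8.33)
    omega = 2.0 * math.pi * c / lam      # from lam = 2*pi*c/omega
    dx = lam / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        dydt = C * omega * math.sin(omega * x / c - omega * t)
        total += 0.5 * sigma * dydt**2 * dx
    return total

numeric = kinetic_energy_one_wavelength(T0=500.0, C=0.01, lam=0.1)
closed_form = math.pi**2 * 0.01**2 * 500.0 / 0.1   # pi^2 C^2 T0 / lambda
print(numeric, closed_form)
```

Both numbers agree and are close to the rounded value of 5 N m quoted above.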
EXERCISE
Problem. Calculate the eigenfrequencies of the system of three different masses that
are fixed equidistantly on a stretched string, as is shown in Fig. 8.5.
Hint: For small amplitudes, the string tension T does not change!
110 8 The Vibrating String
Fig. 8.5.
Fig. 8.6.
We look for the eigenvibrations. All mass points must then vibrate with the same
frequency. We therefore start with
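or
$$0 = 6m^3(\omega^2)^3 + \frac{-22Tm^2}{L}(\omega^2)^2 + \frac{19T^2m}{L^2}\,\omega^2 + \frac{-4T^3}{L^3}. \tag{8.38}$$
This is a cubic equation in $\omega^2$ of the form
$$a(\omega^2)^3 + b(\omega^2)^2 + c\,\omega^2 + d = 0,$$
where
$$a = 6m^3, \quad b = \frac{-22Tm^2}{L}, \quad c = \frac{19T^2m}{L^2}, \quad d = \frac{-4T^3}{L^3}.$$
It can be transformed to the representation (reduction of the cubic equation)
$$y^3 + 3py + 2q = 0, \tag{8.39}$$
where
$$y = \omega^2 + \frac{b}{3a} = \omega^2 - \frac{11}{9}\frac{T}{Lm}$$
and
$$3p = -\frac{1}{3}\frac{b^2}{a^2} + \frac{c}{a} \quad\text{and}\quad 2q = \frac{2}{27}\frac{b^3}{a^3} - \frac{1}{3}\frac{bc}{a^2} + \frac{d}{a}.$$
Insertion leads to
$$3p = -\frac{71}{54}\frac{T^2}{L^2m^2}, \qquad 2q = -\frac{653}{1458}\frac{T^3}{L^3m^3}.$$
From this, it follows that
$$q^2 + p^3 < 0,$$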
EXERCISE
Problem. Determine the eigenfrequencies of the system of three equal masses sus-
pended between springs with the spring constant k, as is shown in Fig 8.7.
Hint: Consider the solution method of the preceding Exercise 8.2 and Mathematical
Supplement 8.4.
or
We look for the eigenvibrations. All mass points must vibrate with the same fre-
quency. Thus, we adopt the ansatz
$$\begin{aligned}
(3k - m\omega^2)A - kB - kC &= 0,\\
-kA + (3k - m\omega^2)B - kC &= 0,\\
-kA - kB + (3k - m\omega^2)C &= 0.
\end{aligned} \tag{8.42}$$
MATHEMATICAL SUPPLEMENT
In theoretical physics, one often meets the problem of solving a cubic equation, just
as in the Exercises 8.2 and 8.3. We now will clarify this problem.
1 We follow the exposition of E. v. Hanxleben and R. Hentze, Lehrbuch der Mathematik, Friedrich
Vieweg & Sohn 1952, Braunschweig–Berlin–Stuttgart.
Mathematical Supplement 8.4
Reduction of the general cubic equation: If the general cubic equation
$$x^3 + ax^2 + bx + c = 0 \tag{8.43}$$
with nonvanishing coefficients a, b, and c is to be solved, one must first eliminate the quadratic term of the equation, i.e., reduce the equation. If the unknown x is replaced by y + λ, where y and λ are new, unknown quantities, (8.43) turns into
$$y^3 + (3\lambda + a)y^2 + (3\lambda^2 + 2a\lambda + b)y + (\lambda^3 + a\lambda^2 + b\lambda + c) = 0. \tag{8.44}$$
Since we have replaced one unknown quantity x by two unknown ones, y and λ, we can freely dispose of one of the two unknown quantities. This freedom is exploited so as to let the quadratic term of the equation disappear. This is achieved by setting the coefficient of $y^2$, that is, 3λ + a, equal to zero, i.e., λ = −a/3. By inserting this value, (8.44) changes to
$$y^3 + \left(-\frac{a^2}{3} + b\right)y + \left(\frac{2a^3}{27} - \frac{ab}{3} + c\right) = 0. \tag{8.45}$$
If we set the expressions determined by the known coefficients a, b, and c of the cubic equation,
$$-\frac{a^2}{3} + b = p \quad\text{and}\quad \frac{2a^3}{27} - \frac{ab}{3} + c = q, \tag{8.46}$$
the cubic equation takes the form
$$y^3 + py + q = 0. \tag{8.47}$$
Result: To reduce the cubic equation given in the normal form, one sets x = y − a/3. Then (8.47) follows from (8.43).
Example: x 3 − 9x 2 + 33x − 65 = 0.
(1) Solution: Set x = y − (−3) = y + 3.
(2) Solution: Insert the values calculated from (8.46) into (8.47).
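As an illustration of the second route, the reduction (8.46) can be written as a short exact-arithmetic sketch. The function name `reduce_cubic` is an assumption of this sketch; the formulas for p and q are those of (8.46), and the shift is λ = −a/3.

```python
from fractions import Fraction

def reduce_cubic(a, b, c):
    """Reduce x^3 + a x^2 + b x + c = 0 to y^3 + p y + q = 0 via x = y - a/3.

    Returns (shift, p, q) with x = y + shift, following (8.46).
    """
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    shift = -a / 3
    p = -a**2 / 3 + b
    q = 2 * a**3 / 27 - a * b / 3 + c
    return shift, p, q

# Example from the text: x^3 - 9x^2 + 33x - 65 = 0
shift, p, q = reduce_cubic(-9, 33, -65)
print(shift, p, q)          # x = y + 3 and y^3 + 6y - 20 = 0

# y = 2 solves the reduced equation, so x = 5 solves the original one.
y = 2
x = y + shift
print(y**3 + p * y + q, x**3 - 9 * x**2 + 33 * x - 65)
```

Exact fractions are used here only so that the check by insertion gives exactly zero.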
Special case: If in the general cubic equation the linear term is missing (b = 0), i.e., the cubic equation is given in the form
$$x^3 + ax^2 + c = 0, \tag{8.48}$$
the reduction is achieved by setting
$$x = \frac{c}{y}. \tag{8.49}$$
From (8.48) and (8.49), we obtain the reduced equation
$$\frac{c^3}{y^3} + a\frac{c^2}{y^2} + c = 0 \quad\text{or}\quad y^3 + acy + c^2 = 0. \tag{8.50}$$
Solution of the reduced cubic equation: If one sets in the reduced cubic equation
$$y^3 + py + q = 0, \qquad y = u + v, \tag{8.51}$$
one obtains
$$u^3 + v^3 + (3uv + p)(u + v) + q = 0. \tag{8.52}$$
Since one can freely dispose of one of the unknown quantities u or v (justification?), these are suitably chosen so that the coefficient of (u + v) vanishes. We therefore set
$$3uv + p = 0, \quad\text{i.e.,}\quad uv = -\frac{p}{3}. \tag{8.53}$$
Equation (8.52) simplifies to
$$u^3 + v^3 + q = 0 \quad\text{or}\quad u^3 + v^3 = -q. \tag{8.54}$$
u and v are determined by (8.53) and (8.54). The quantities u and v can no longer be arbitrarily chosen. By raising (8.54) to the second power and (8.53) to the third power, one obtains
$$u^6 + 2u^3v^3 + v^6 = q^2, \qquad 4u^3v^3 = -4\left(\frac{p}{3}\right)^3.$$
Subtraction of the two equations yields
$$(u^3 - v^3)^2 = q^2 + 4\left(\frac{p}{3}\right)^3, \qquad u^3 - v^3 = \pm\sqrt{q^2 + 4\left(\frac{p}{3}\right)^3}. \tag{8.55}$$
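one gets
$$u_1 = m, \quad u_2 = m\epsilon_2, \quad u_3 = m\epsilon_3, \qquad v_1 = n, \quad v_2 = n\epsilon_2, \quad v_3 = n\epsilon_3.$$
Here, the $\epsilon_i$ are the unit roots of the cubic equation $x^3 = 1$ which, as is evident, read
$$\epsilon_1 = 1, \qquad \epsilon_2 = -\frac{1}{2} + \frac{i}{2}\sqrt{3}, \qquad \epsilon_3 = -\frac{1}{2} - \frac{i}{2}\sqrt{3}.$$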
Since now y = u + v, one can actually form 9 values for y (why?). But since the
quantities u and v must satisfy the determining equation (8.53), the number of possible
connections between u and v is restricted to 3, namely,
y 1 = u1 + v 1 , y 2 = u2 + v 3 , y 3 = u3 + v 2 ;
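hence,
$$\begin{aligned}
y_1 &= m + n = \sqrt[3]{-\frac{q}{2} + \sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3}} + \sqrt[3]{-\frac{q}{2} - \sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3}},\\
y_2 &= m\epsilon_2 + n\epsilon_3 = -\frac{m+n}{2} + \frac{m-n}{2}\,i\sqrt{3},\\
y_3 &= m\epsilon_3 + n\epsilon_2 = -\frac{m+n}{2} - \frac{m-n}{2}\,i\sqrt{3}.
\end{aligned} \tag{8.57}$$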
The real root of the cubic equation, i.e., the root
$$y_1 = \sqrt[3]{-\frac{q}{2} + \sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3}} + \sqrt[3]{-\frac{q}{2} - \sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3}}$$
is known as the “Cardano formula.” It was named in honor of the Italian Hieronimo Cardano2 to whom the discovery of the formula was falsely ascribed. Actually, the
2 Hieronimo Cardano, Italian physicist, mathematician, and astrologer, b. Sept. 24, 1501, Pavia–
d. Sept. 20, 1576, Rome. Cardano was the illegitimate son of Fazio (Bonifacius) Cardano, a friend
of Leonardo da Vinci. He studied at the universities of Pavia and Padua, and in 1526 he graduated in
medicine. In 1532, he went to Milan, where he lived in deep poverty, until he got a position teaching
in mathematics. In 1539, he worked at a high school of physics, where he soon became the director.
In 1543, he accepted a professorship for medicine in Pavia.
As a mathematician, Cardano was the most prominent personality of his age. In 1539, he published
two books on arithmetic methods. At this time, the discovery of a solution method for the cubic
equation became known. Nicolo Tartaglia, a Venetian mathematician, was the owner. Cardano tried in
vain to get permission to publish it. Tartaglia left the method to him under the condition that he keeps
it secret. In 1545, Cardano’s book Artis magnae sive de regulis algebraicis, one of the cornerstones of
the history of algebra, was published. The book contained, besides many other new facts, the method
of solving cubic equations. The publication caused a serious controversy with Tartaglia.
formula is due to the Bolognese professor of mathematics Scipione del Ferro,3 who found this ingenious algorithm.
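Example: $y^3 - 15y - 126 = 0$. Here,
$$p = -15, \quad q = -126, \qquad \frac{p}{3} = -5, \quad \frac{q}{2} = -63.$$
By inserting into the Cardano formula, one obtains
$$\begin{aligned}
y_1 &= \sqrt[3]{63 + \sqrt{63^2 - 5^3}} + \sqrt[3]{63 - \sqrt{63^2 - 5^3}}\\
&= \sqrt[3]{63 + \sqrt{3844}} + \sqrt[3]{63 - \sqrt{3844}}\\
&= \sqrt[3]{63 + 62} + \sqrt[3]{63 - 62}\\
&= \sqrt[3]{125} + \sqrt[3]{1} \quad (= m + n)\\
&= 6,\\
y_2 &= -\frac{5+1}{2} + \frac{5-1}{2}\,i\sqrt{3} = -3 + 2i\sqrt{3},\\
y_3 &= -\frac{5+1}{2} - \frac{5-1}{2}\,i\sqrt{3} = -3 - 2i\sqrt{3}.
\end{aligned}$$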
Check the validity of the roots by insertion!
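The check by insertion can also be done numerically. The sketch below evaluates the Cardano formula and (8.57) for a non-negative radicand; the function names `real_cbrt` and `cardano_roots` are assumptions of this sketch, and the irreducible case is deliberately rejected.

```python
import math

def real_cbrt(x):
    """Real cube root that also works for negative arguments."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

def cardano_roots(p, q):
    """Roots of y^3 + p*y + q = 0 when (q/2)^2 + (p/3)^3 >= 0."""
    radicand = (q / 2) ** 2 + (p / 3) ** 3
    if radicand < 0:
        raise ValueError("casus irreducibilis: radicand is negative")
    root = math.sqrt(radicand)
    m = real_cbrt(-q / 2 + root)
    n = real_cbrt(-q / 2 - root)
    eps2 = complex(-0.5, math.sqrt(3) / 2)   # cubic unit roots
    eps3 = eps2.conjugate()
    return m + n, m * eps2 + n * eps3, m * eps3 + n * eps2

roots = cardano_roots(-15.0, -126.0)
for y in roots:
    # residual of y^3 - 15y - 126, which should vanish up to rounding
    print(y, abs(y**3 - 15 * y - 126))
```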
Discussion of Cardano’s formula: The square root appearing in the Cardano formula only yields a real value if the radicand $(q/2)^2 + (p/3)^3 \ge 0$. If the radicand is negative, the formula expresses all three values for y through complex numbers. We consider the possible cases:

Case                              Square root of (q/2)^2 + (p/3)^3   Form of the roots
(1) p > 0                         real                                one real value, two complex conjugate values
(2) p < 0, namely,
    (a) |(p/3)^3| < (q/2)^2       real                                as in (1)
    (b) |(p/3)^3| = (q/2)^2       = 0                                 three real values, among them a double root
    (c) |(p/3)^3| > (q/2)^2       imaginary                           all three roots appear in imaginary form
The case (2c) was of particular interest to the mathematicians of the Middle Ages.
Since any cubic equation has at least one real root, but they could not find it by means
of Cardano’s formula, the case was called the casus irreducibilis.4 The first to solve
3 Scipione del Ferro, b. 1465(?)–d. 1526 (?). About his life we know only that he lectured from 1496
to 1526 at the university of Bologna. By 1500, he discovered the method of solving the cubic equation
but did not publish it. Tartaglia rediscovered the method in 1535.
4 Casus irreducibilis (Lat.) = “the nonreducible case”.
this case was the French politician and mathematician Vieta.5 He proved by using trigonometry that this case was solvable too, and that in this case the equation has three real roots.
Trigonometric solution of the irreducible case: Since p is negative in this case, one starts from the reduced cubic equation
$$y^3 - py + q = 0, \tag{8.58}$$
where p now denotes the absolute value of the coefficient. According to the trigonometric formulae we have
$$\cos 3\alpha = 4\cos^3\alpha - 3\cos\alpha;$$
thus,
$$\cos^3\alpha - \frac{3}{4}\cos\alpha - \frac{1}{4}\cos 3\alpha = 0. \tag{8.59}$$
If one considers cos α to be unknown, (8.59) coincides with the form of (8.58). But since the value of the cosine varies only between the limits −1 and +1, while y, according to the values of p and q, can take any values, one cannot simply set cos α = y. By multiplying (8.59) by a still undetermined positive factor $\rho^3$, one obtains
$$\rho^3\cos^3\alpha - \frac{3}{4}\rho^2\cdot\rho\cos\alpha - \frac{1}{4}\rho^3\cos 3\alpha = 0. \tag{8.60}$$
By setting $\rho\cos\alpha = y$, $p = (3/4)\rho^2$, and $q = -(1/4)\rho^3\cos 3\alpha$, (8.60) turns into (8.58). From this, we find
$$\rho = 2\sqrt{\frac{p}{3}} \tag{8.61}$$
and
$$\cos 3\alpha = -\frac{4q}{\rho^3} = \frac{-4q}{8\cdot(p/3)\sqrt{p/3}} = -\frac{q/2}{\sqrt{(p/3)^3}}. \tag{8.62}$$
Equation (8.62) is ambiguous, since the cosine is a periodic function. One has
5 François Vieta, French mathematician, b. 1540, Fontenay-le-Comte–d. Dec. 13, 1603, Paris. Ad-
vocate and adviser of Parliament in the Bretagne. His greatest achievements were in the theory of
equations and algebra, where he introduced and systematically used letter notations. He established
the rules for the rectangular spherical triangle which are often ascribed to Neper. In his Canon math-
ematicus, a table of angular functions (1571), he emphasized the advantages of decimal notation.
[BR]
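$$\alpha_1 = \frac{\varphi}{3}, \qquad \alpha_2 = \frac{\varphi}{3} + 120^\circ, \qquad \alpha_3 = \frac{\varphi}{3} + 240^\circ.$$
Compare this consideration with the problem of cyclotomy! Which values are obtained for α if k = 3, 4, . . .?
For y, one obtains
$$y_1 = 2\sqrt{\frac{p}{3}}\cos\frac{\varphi}{3}, \qquad y_2 = 2\sqrt{\frac{p}{3}}\cos\left(\frac{\varphi}{3} + 120^\circ\right), \qquad y_3 = 2\sqrt{\frac{p}{3}}\cos\left(\frac{\varphi}{3} + 240^\circ\right).$$
Now
$$\cos\left(\frac{\varphi}{3} + 120^\circ\right) = -\cos\left(60^\circ - \frac{\varphi}{3}\right)$$
and
$$\cos\left(\frac{\varphi}{3} + 240^\circ\right) = -\cos\left(60^\circ + \frac{\varphi}{3}\right),$$
Comment: The formulas of the casus irreducibilis can also be derived by means of de Moivre’s theorem.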
Example: Calculate the roots of the equation
y 3 − 981y − 11340 = 0.
by comparing the logarithms it follows that $|(p/3)^3| > (q/2)^2$. Thus, the condition of the casus irreducibilis is fulfilled. According to (8.62),
$$\cos 3\alpha = +\frac{5670}{\sqrt{327^3}},$$
$$\log\cos 3\alpha = 3.7536 - 3.7718 = 9.9818 - 10,$$
$$\varphi = 3\alpha \approx 16^\circ 30', \quad\text{hence}\quad \frac{\varphi}{3} = \alpha = 5^\circ 30'.$$
From (8.64), we obtain $y_1 = 36$, $y_2 = -21$, $y_3 = -15$. Check the root values by insertion!
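The trigonometric solution lends itself to a direct numerical check that replaces the logarithm tables. The following sketch assumes the form (8.58), $y^3 - py + q = 0$ with p > 0, and evaluates (8.62) together with the three angles φ/3, φ/3 + 120°, φ/3 + 240°; the function name `trig_roots` is an assumption made for this illustration.

```python
import math

def trig_roots(p_abs, q):
    """Three real roots of y^3 - p_abs*y + q = 0 in the casus irreducibilis."""
    r = p_abs / 3.0
    cos_phi = -(q / 2.0) / math.sqrt(r**3)        # equation (8.62)
    if abs(cos_phi) > 1.0:
        raise ValueError("not the irreducible case: |cos(3*alpha)| > 1")
    phi = math.acos(cos_phi)
    amplitude = 2.0 * math.sqrt(r)
    return [amplitude * math.cos(phi / 3.0 + k * 2.0 * math.pi / 3.0)
            for k in range(3)]

# Example from the text: y^3 - 981y - 11340 = 0, i.e. p_abs = 981, q = -11340
y1, y2, y3 = trig_roots(981.0, -11340.0)
print(y1, y2, y3)
```

The three values reproduce the roots 36, −21, and −15 quoted above, up to rounding.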
9 Fourier Series
When setting the initial conditions for the problem of the vibrating string, a trigono-
metric series was set equal to a given function f (x). The expansion coefficients of
the series had to be determined. To solve the problem, the function f (x) should also
be represented by a trigonometric series. These trigonometric series are called Fourier
series.1 The conditions that allow an expansion of a function into a Fourier series are
summarized as follows:
1 Jean Baptiste Joseph Fourier, b. March 21, 1768, Auxerre, son of a tailor–d. May 16, 1830, Paris.
Fourier attended the home École Militaire. Because of his origin he was excluded from an officer’s
career. Fourier decided to join the clergy, but did not take a vow because of the outbreak of the rev-
olution of 1789. Fourier first took a teaching position in Auxerre. Soon he turned to politics and
was arrested several times. In 1795, he was sent to Paris to study at the École Normale. He soon
became member of the teaching staff of the newly founded École Polytechnique. In 1798, he be-
came director of the Institut d’Egypte in Cairo. Only in 1801 did he return to Paris, where he was
appointed by Napoleon as a prefect of the departement Isère. During his term of office from 1802
to 1815, he arranged the drainage of the malaria-infested marshes of Bourgoin. After the downfall
of Napoleon, Fourier was dismissed from all posts by the Bourbons. However, in 1817 the king had
to agree to Fourier’s election to the Academy of Sciences, where he became permanent secretary
in 1822. Fourier’s most important mathematical achievement was his treatment of the notion of the
function. The problem of the vibrating string had already been treated by D’Alembert, Euler, and
Lagrange, and had been solved in 1755 by D. Bernoulli by means of a trigonometric series. The subsequent
question of whether an “arbitrary” function can be represented by such a series was answered 1807/12
by Fourier in the affirmative. The question about the conditions for such a representation could be
answered only by his friend Dirichlet. Fourier became known mainly by his Théorie analytique de la
chaleur (1822) which deals mainly with the discussion of the equation of heat propagation in terms of
Fourier-series. This work represents the starting point for treating partial differential equations with
boundary conditions by means of trigonometric series. Fourier also made important contributions to the
theory of solving equations and to the probability calculus.
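$$\begin{aligned}
a_n &= \frac{1}{l}\int_a^{a+2l} f(x)\cos\frac{n\pi x}{l}\,dx,\\
b_n &= \frac{1}{l}\int_a^{a+2l} f(x)\sin\frac{n\pi x}{l}\,dx,\\
a_0 &= \frac{1}{l}\int_a^{a+2l} f(x)\,dx.
\end{aligned} \tag{9.2}$$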
To prove these formulas, one needs the so-called orthogonality relations of the trigono-
metric functions:
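$$\begin{aligned}
\int_0^{2l}\cos\frac{n\pi x}{l}\cos\frac{m\pi x}{l}\,dx &= l\,\delta_{nm},\\
\int_0^{2l}\sin\frac{n\pi x}{l}\sin\frac{m\pi x}{l}\,dx &= l\,\delta_{nm},\\
\int_0^{2l}\sin\frac{n\pi x}{l}\cos\frac{m\pi x}{l}\,dx &= 0.
\end{aligned} \tag{9.3}$$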
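$$\cos A\cos B = \frac{1}{2}\left[\cos(A + B) + \cos(A - B)\right],$$
$$\int_0^{2l}\cos\frac{n\pi x}{l}\cos\frac{m\pi x}{l}\,dx = \frac{1}{2}\int_0^{2l}\left[\cos\frac{(n+m)\pi x}{l} + \cos\frac{(n-m)\pi x}{l}\right]dx = 0,$$
if n ≠ m. The integral of the cosine function over a full period vanishes. For n = m we have
$$\int_0^{2l}\cos\frac{n\pi x}{l}\cos\frac{m\pi x}{l}\,dx = \frac{1}{2}\int_0^{2l}\left[1 + \cos\frac{2n\pi x}{l}\right]dx = l.$$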
and therefore,
$$a_m = \frac{1}{l}\int_0^{2l} f(x)\cos\frac{m\pi x}{l}\,dx, \tag{9.4}$$
as is given by (9.2).
The analogous relation for the bm can be confirmed by multiplication of (9.1) by
sin(mπx/ l) and integration from 0 to 2l; the same holds for the calculation of a0 .
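The orthogonality relations (9.3) are easy to confirm numerically. The following is a minimal sketch using a midpoint rule over the full period [0, 2l]; the helper names, the step count, and the value l = 1.5 are arbitrary choices made for this illustration.

```python
import math

def integrate_full_period(g, l, steps=4000):
    """Midpoint-rule integral of g over the full period 0 <= x <= 2l."""
    dx = 2.0 * l / steps
    return sum(g((i + 0.5) * dx) for i in range(steps)) * dx

def cos_cos(n, m, l):
    return integrate_full_period(
        lambda x: math.cos(n * math.pi * x / l) * math.cos(m * math.pi * x / l), l)

def sin_sin(n, m, l):
    return integrate_full_period(
        lambda x: math.sin(n * math.pi * x / l) * math.sin(m * math.pi * x / l), l)

def sin_cos(n, m, l):
    return integrate_full_period(
        lambda x: math.sin(n * math.pi * x / l) * math.cos(m * math.pi * x / l), l)

l = 1.5
print(cos_cos(2, 2, l), cos_cos(2, 3, l))   # l and 0
print(sin_sin(3, 3, l), sin_sin(1, 3, l))   # l and 0
print(sin_cos(2, 2, l))                     # 0
```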
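Functions that satisfy
$$f(x) = f(-x)$$
are called even functions; functions that satisfy $f(x) = -f(-x)$ are called odd functions. For instance, f(x) = cos x evidently is an even function and f(x) = sin x an odd function. The part of (9.1)
$$\frac{a_0}{2} + \sum_{n=1}^{\infty} a_n\cos\frac{n\pi x}{l}$$
represents the even part of the series expansion (9.1), while the sine terms with the coefficients $b_n$ represent the odd part. Therefore, for even functions all $b_n = 0$; for odd functions $a_0$ and all $a_n$ are equal to zero.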
Any function f (x) can be decomposed into an even and an odd part. Thus, (f (x) +
f (−x))/2 is the even part and (f (x) − f (−x))/2 the odd part of f (x) = [(f (x) +
f (−x))/2 + (f (x) − f (−x))/2].
EXAMPLE
9.1 Inclusion of the Initial Conditions for the Vibrating String by Means of the
Fourier Expansion
A string is fixed at both ends. The center is displaced from the equilibrium position by
the distance H and then released. From Fig. 9.1 we see that the initial displacement is
Fig. 9.1.
given by
$$y(x, 0) = f(x) = \begin{cases}
\dfrac{2Hx}{l}, & 0 \le x \le \dfrac{l}{2},\\[2mm]
\dfrac{2H(l - x)}{l}, & \dfrac{l}{2} \le x \le l.
\end{cases}$$
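$$\begin{aligned}
b_n &= \frac{2}{l}\int_0^l f(x)\sin\frac{n\pi x}{l}\,dx\\
&= \frac{2}{l}\left[\int_0^{l/2}\frac{2Hx}{l}\sin\frac{n\pi x}{l}\,dx + \int_{l/2}^{l}\frac{2H}{l}(l - x)\sin\frac{n\pi x}{l}\,dx\right],
\end{aligned}$$
$$\begin{aligned}
\int_0^{l/2}\frac{2Hx}{l}\sin\frac{n\pi x}{l}\,dx
&= \frac{2H}{l}\left[-x\frac{l}{n\pi}\cos\frac{n\pi x}{l} + \frac{l^2}{n^2\pi^2}\sin\frac{n\pi x}{l}\right]_0^{l/2}\\
&= \frac{2lH}{n^2\pi^2}\sin\frac{n\pi}{2} - \frac{Hl}{n\pi}\cos\frac{n\pi}{2},
\end{aligned}$$
$$\begin{aligned}
\int_{l/2}^{l}\frac{2H}{l}(l - x)\sin\frac{n\pi x}{l}\,dx
&= \frac{2H}{l}\left[l\int_{l/2}^{l}\sin\frac{n\pi x}{l}\,dx - \int_{l/2}^{l}x\sin\frac{n\pi x}{l}\,dx\right]\\
&= \frac{2H}{l}\left[-\frac{l^2}{n\pi}\cos\frac{n\pi x}{l} + \frac{xl}{n\pi}\cos\frac{n\pi x}{l} - \frac{l^2}{n^2\pi^2}\sin\frac{n\pi x}{l}\right]_{l/2}^{l}\\
&= \frac{2lH}{n^2\pi^2}\sin\frac{n\pi}{2} + \frac{lH}{n\pi}\cos\frac{n\pi}{2},
\end{aligned}$$
$$b_n = \frac{2}{l}\left[\frac{2lH}{n^2\pi^2}\sin\frac{n\pi}{2} + \frac{2lH}{n^2\pi^2}\sin\frac{n\pi}{2}\right] = \frac{8H}{n^2\pi^2}\sin\frac{n\pi}{2}.$$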
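By inserting the solution for the Fourier coefficient $b_n$ into the general solution of the differential equation (8.15), we get the equation that describes the vibrations of the string:
$$\begin{aligned}
y(x, t) &= \sum_{n=1}^{\infty}\frac{8H}{n^2\pi^2}\sin\frac{n\pi}{2}\,\sin\frac{n\pi x}{l}\cos\frac{n\pi ct}{l}\\
&= \frac{8H}{\pi^2}\left[\frac{1}{1^2}\sin\frac{\pi x}{l}\cos\frac{\pi ct}{l} - \frac{1}{3^2}\sin\frac{3\pi x}{l}\cos\frac{3\pi ct}{l} + \frac{1}{5^2}\sin\frac{5\pi x}{l}\cos\frac{5\pi ct}{l} - \cdots\right].
\end{aligned}$$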
Thus, by plucking the string in the center one essentially excites the fundamental
mode (lowest eigenvibration) sin(πx/ l) cos(πct/ l). Several overtones are admixed
with small amplitude. The initial displacement obviously corresponds to the funda-
mental vibration. If one wants to excite pure overtones, the initial displacement must
be selected according to the desired higher harmonic vibration (compare Fig. 8.3).
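The dominance of the fundamental mode can be made quantitative with a short numerical sketch. It compares the closed form $b_n = 8H/(n^2\pi^2)\sin(n\pi/2)$ with a direct midpoint-rule integration of the triangular initial displacement; the function names and the sample values H = 0.02 and l = 1 are assumptions chosen for illustration.

```python
import math

def triangle(x, H, l):
    """Initial displacement of the string plucked at its center."""
    return 2.0 * H * x / l if x <= l / 2.0 else 2.0 * H * (l - x) / l

def b_numeric(n, H, l, steps=20000):
    """b_n = (2/l) * integral_0^l f(x) sin(n pi x / l) dx, midpoint rule."""
    dx = l / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += triangle(x, H, l) * math.sin(n * math.pi * x / l) * dx
    return 2.0 / l * total

def b_closed(n, H):
    return 8.0 * H / (n * math.pi) ** 2 * math.sin(n * math.pi / 2.0)

H, l = 0.02, 1.0
for n in range(1, 7):
    print(n, b_numeric(n, H, l), b_closed(n, H))
```

The even coefficients vanish and the odd ones fall off like $1/n^2$, so the third harmonic already carries only one ninth of the amplitude of the fundamental.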
EXERCISE
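$$\begin{aligned}
a_n &= \frac{1}{5}\int_0^{10} 4x\cos\frac{n\pi x}{5}\,dx = \left[\frac{4x}{n\pi}\sin\frac{n\pi x}{5}\right]_0^{10} - \frac{4}{n\pi}\int_0^{10}\sin\frac{n\pi x}{5}\,dx\\
&= 0 + \frac{20}{n^2\pi^2}\left[\cos\frac{n\pi x}{5}\right]_0^{10} = 0,
\end{aligned}$$
$$\begin{aligned}
b_n &= \frac{4}{5}\int_0^{10} x\sin\frac{n\pi x}{5}\,dx = \left[-\frac{4x}{n\pi}\cos\frac{n\pi x}{5}\right]_0^{10} + \frac{4}{n\pi}\int_0^{10}\cos\frac{n\pi x}{5}\,dx\\
&= -\frac{40}{n\pi} + \frac{20}{n^2\pi^2}\left[\sin\frac{n\pi x}{5}\right]_0^{10} = -\frac{40}{n\pi}.
\end{aligned}$$
Hence, the Fourier series reads
$$f(x) = 20 - \frac{40}{\pi}\sum_{n=1}^{\infty}\frac{1}{n}\sin\frac{n\pi x}{5}.$$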
Fig. 9.2.
The first partial sums Sn of this series are drawn in Fig. 9.2. A comparison of this
series with the starting curve f (x) illustrates the convergence of this Fourier series.
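The convergence can also be inspected numerically. The sketch below evaluates the partial sums $S_N$ of the series above; it assumes, as the integrands of $a_n$ and $b_n$ indicate, that the starting curve is f(x) = 4x on 0 < x < 10 with period 10. The function name is hypothetical.

```python
import math

def sawtooth_partial_sum(x, n_terms):
    """S_N(x) = 20 - (40/pi) * sum_{n=1}^{N} sin(n pi x / 5) / n."""
    s = sum(math.sin(n * math.pi * x / 5.0) / n for n in range(1, n_terms + 1))
    return 20.0 - 40.0 / math.pi * s

for n_terms in (1, 5, 50, 2000):
    # compare with f(2.5) = 10 and f(7.5) = 30
    print(n_terms, sawtooth_partial_sum(2.5, n_terms), sawtooth_partial_sum(7.5, n_terms))

# At the jump x = 0 every sine term vanishes: the series gives the mean value 20.
print(sawtooth_partial_sum(0.0, 50))
```

Away from the jumps the partial sums approach 4x; at the discontinuity the series settles on the mean of the left and right limits.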
EXERCISE
$$\frac{\partial^2 y}{\partial t^2} = c^2\frac{\partial^2 y}{\partial x^2}, \tag{9.5}$$
where y = y(x, t), with
$$y(0, t) = 0, \quad y(l, t) = 0, \quad y(x, 0) = 0, \quad \left.\frac{\partial}{\partial t}y(x, t)\right|_{t=0} = g(x). \tag{9.6}$$
We use the separation ansatz y = X(x) · T(t). By inserting it into (9.5), one obtains
$$X\cdot\ddot{T} = c^2X''\,T \quad\text{or}\quad \frac{X''}{X}(x) = \frac{\ddot{T}}{c^2T}(t). \tag{9.7}$$
Since the left-hand side of (9.7) depends only on x, the right-hand side only on t, and x and t are independent of each other, the equation is satisfied only if both sides are constant. The constant is denoted by $-\lambda^2$:
$$\frac{X''}{X} = -\lambda^2 \quad\text{and}\quad \frac{\ddot{T}}{c^2T} = -\lambda^2,$$
or, transformed,
$$X'' + \lambda^2X = 0 \quad\text{and}\quad \ddot{T} + \lambda^2c^2T = 0. \tag{9.8}$$
y(x, t) = (A1 cos λx + B1 sin λx)(A2 cos λct + B2 sin λct). (9.9)
From the condition y(0, t) = 0, it follows that A1 (A2 cos λct + B2 sin λct) = 0. This
condition is satisfied by A1 = 0. Then
We now set
B1 A2 = a, B1 B2 = b,
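Because
$$\left.\frac{\partial}{\partial t}y(x, t)\right|_{t=0} = g(x),$$
it follows that
$$g(x) = \sum_{n=1}^{\infty}\frac{n\pi c\,b_n}{l}\sin\frac{n\pi x}{l}. \tag{9.16}$$
The coefficients of this sine series are
$$\frac{n\pi c\,b_n}{l} = \frac{2}{l}\int_0^l g(x)\sin\frac{n\pi x}{l}\,dx \tag{9.17}$$
or
$$b_n = \frac{2}{n\pi c}\int_0^l g(x)\sin\frac{n\pi x}{l}\,dx. \tag{9.18}$$
By inserting (9.18) into (9.13), we obtain the final solution for y(x, t):
$$y(x, t) = \sum_{n=1}^{\infty}\left[\frac{2}{n\pi c}\int_0^l g(x')\sin\frac{n\pi x'}{l}\,dx'\right]\sin\frac{n\pi x}{l}\sin\frac{n\pi ct}{l}. \tag{9.19}$$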
EXERCISE
Solution. (a)
$$f(x) = \begin{cases} 0, & \text{for } -5 \le x \le 0,\\ 3, & \text{for } 0 \le x \le 5, \end{cases} \qquad \text{period } 2l = 10.$$
Fig. 9.3.
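$$\begin{aligned}
a_n &= \frac{1}{l}\int_a^{a+2l} f(x)\cos\frac{n\pi x}{l}\,dx = \frac{1}{5}\int_{-5}^{5} f(x)\cos\frac{n\pi x}{5}\,dx\\
&= \frac{1}{5}\left[\int_{-5}^{0}(0)\cos\frac{n\pi x}{5}\,dx + \int_0^5 3\cos\frac{n\pi x}{5}\,dx\right] = \frac{3}{5}\int_0^5\cos\frac{n\pi x}{5}\,dx\\
&= \frac{3}{5}\left[\frac{5}{n\pi}\sin\frac{n\pi x}{5}\right]_0^5 = 0 \quad\text{for } n \ne 0.
\end{aligned}$$
For n = 0, one has $a_n = a_0 = (3/5)\int_0^5\cos(0\pi x/5)\,dx = (3/5)\int_0^5 dx = 3$.
Furthermore,
$$\begin{aligned}
b_n &= \frac{1}{l}\int_a^{a+2l} f(x)\sin\frac{n\pi x}{l}\,dx = \frac{1}{5}\int_{-5}^{5} f(x)\sin\frac{n\pi x}{5}\,dx\\
&= \frac{1}{5}\left[\int_{-5}^{0}(0)\sin\frac{n\pi x}{5}\,dx + \int_0^5 3\sin\frac{n\pi x}{5}\,dx\right] = \frac{3}{5}\int_0^5\sin\frac{n\pi x}{5}\,dx\\
&= \frac{3}{5}\left[-\frac{5}{n\pi}\cos\frac{n\pi x}{5}\right]_0^5 = \frac{3}{n\pi}(1 - \cos n\pi).
\end{aligned}$$
Thus,
$$f(x) = \frac{3}{2} + \sum_{n=1}^{\infty}\frac{3}{n\pi}(1 - \cos n\pi)\sin\frac{n\pi x}{5},$$
i.e.,
$$f(x) = \frac{3}{2} + \frac{6}{\pi}\left(\sin\frac{\pi x}{5} + \frac{1}{3}\sin\frac{3\pi x}{5} + \frac{1}{5}\sin\frac{5\pi x}{5} + \cdots\right).$$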
EXERCISE
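or
$$\dot{s}(y) = \sqrt{2g(h - y)}. \tag{9.21}$$
From this, one can calculate the period by separation of the variables:
$$\frac{1}{4}T = \int_0^{T/4} dt = \int_0^{s(h)}\frac{ds}{\sqrt{2g(h - y)}} = \int_0^h\frac{(ds/dy)\,dy}{\sqrt{2g(h - y)}}. \tag{9.22}$$
With the dimensionless variable u = y/h, this becomes
$$\frac{T}{4} = \int_0^1\frac{(ds/dy)\sqrt{h}\,du}{\sqrt{2g(1 - u)}}. \tag{9.23}$$
$$\frac{dT}{dh} = 0 \quad\text{for all } h. \tag{9.24}$$
Thus, we get from (9.23) ($s' \equiv ds/dy$)
$$\frac{d}{dh}\int_0^1\frac{s'\sqrt{h}\,du}{\sqrt{2g(1 - u)}} = \int_0^1\frac{du}{\sqrt{2g(1 - u)}}\left[\frac{1}{2}h^{-1/2}s' + \sqrt{h}\,\frac{ds'}{dh}\right] = 0 \quad\text{for all } h. \tag{9.25}$$
With the condition that we keep the dimensionless variable u = y/h constant, we can rewrite the derivative with respect to h as a derivative with respect to y,
$$\frac{ds'}{dh} = \frac{u\,ds'}{d(uh)} = u\frac{ds'}{dy} = us'', \tag{9.26}$$
and thus, we can transform (9.25) into
$$\int_0^1\frac{du}{\sqrt{8g(1 - u)}}\,(s' + 2ys'')\frac{1}{\sqrt{h}} = 0 \quad\text{for all } h. \tag{9.27}$$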
Any periodic function f(u) satisfying $\int_0^1 f(u)\,du = 0$ can generally be expanded into a Fourier series:
$$f(u) = \sum_{m=1}^{\infty}\left[a_m\sin(2\pi mu) + b_m\cos(2\pi mu)\right]. \tag{9.28}$$
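This holds for all values of h. The left-hand side of (9.29) does not contain h; therefore, the right-hand side must be independent of h too. This holds only for $a_m = b_m = 0$ (for all m), as we shall prove now.
To have the right-hand side of (9.29) independent of h, we must have
$$\sum_{m=1}^{\infty}\left[a_m\sin\left(2\pi m\frac{y}{h}\right) + b_m\cos\left(2\pi m\frac{y}{h}\right)\right] = \frac{\text{constant}\cdot(y/h)\,h^{1/2}}{\sqrt{8g(1 - y/h)}} \tag{9.30}$$
or
$$\sum_{m=1}^{\infty}\left[a_m\sin(2\pi mu) + b_m\cos(2\pi mu)\right] = \frac{u}{\sqrt{1 - u}}\,\frac{h^{1/2}}{\sqrt{8g}}\,C. \tag{9.31}$$
Integrating both sides over u from 0 to 1, the left-hand side vanishes term by term, so that
$$0 = \frac{h^{1/2}}{\sqrt{8g}}\,C\int_0^1\frac{u}{\sqrt{1 - u}}\,du = \frac{4}{3}\frac{h^{1/2}}{\sqrt{8g}}\,C, \tag{9.32}$$
thus, C = 0. (This reflects the fact that $u/\sqrt{1 - u}$ cannot be expanded into a Fourier series à la (9.31).)
Inserting this result C = 0 again into (9.30), we have $a_m = b_m = 0\ \forall m$, and thus, from (9.29)
$$s'' + \frac{s'}{2y} = 0. \tag{9.33}$$
From this, one finds by integrating once
$$\frac{s''}{s'} = -\frac{1}{2y} \quad\Rightarrow\quad s' \equiv \frac{ds}{dy} = \tilde{C}e^{-(1/2)\ln y} = \frac{\tilde{C}}{\sqrt{y}}. \tag{9.34}$$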
2 See W. Greiner: Classical Mechanics: Point Particles and Relativity, 1st ed., Springer, Berlin
(2004), Problem 24.4.
10 The Vibrating Membrane
We consider a two-dimensional system: the vibrating membrane. We shall see that the
methods applied for the treatment of a vibrating string can be simply transferred in
many respects.
The membrane is a skin without an elasticity of its own. The stretching of the
membrane along the edge leads to a tension force that acts as a backdriving force on a
deformed membrane.
Let the tangential tension in the membrane be spatially constant and time indepen-
dent. We consider only vibrations with amplitudes so small that displacements within
the membrane plane can be neglected.
We introduce the following notations: σ is the surface density of the membrane, and
the membrane tension is T (force per unit length). Let the coordinate system be ori-
ented so that the membrane lies in the x,y-plane. The displacements perpendicular to
this plane are denoted by u = u(x, y, t).
To set up the equation of motion, we imagine a cut of length $\Delta x$ through the membrane parallel to the x-axis, and a cut $\Delta y$ parallel to the y-axis. The force acting on the membrane element $\Delta x\Delta y$ in the x-direction is the product of the tension and the length of the cut: $F_x = T\Delta y$. Analogously for the y-component we have $F_y = T\Delta x$. The surface element $\Delta x\Delta y$ is pulled by the sum of the two forces. If the membrane is displaced, the u-component of this sum acts on it.
From Fig. 10.1, we see
$$F_u = T\Delta x\left(\sin\varphi(y + \Delta y) - \sin\varphi(y)\right) + T\Delta y\left(\sin\vartheta(x + \Delta x) - \sin\vartheta(x)\right). \tag{10.1}$$
Since we restrict ourselves to small amplitudes and angles, the sine can be replaced by the tangent. For the tangent we then insert the differential quotient, e.g.,
$$\tan\varphi(x, y + \Delta y) = \frac{\partial u}{\partial y}(x, y + \Delta y),$$
$$\Delta u - \frac{1}{c^2}\frac{\partial^2 u}{\partial t^2} = 0. \tag{10.2}$$
10.2 Solution of the Differential Equation 135
This form of the wave equation is independent of the dimension of the vibrating
medium. If we insert the three-dimensional Laplace operator and set u = u(x, y, z, t),
(10.2) also holds for sound vibrations (u then represents the density variation of the
air). c is the propagation velocity of small perturbations (velocity of sound)—similar
to the case of the vibrating string.
If we had selected a positive separation constant, i.e., $+\omega^2$ in (10.3), the solution would have been $Z(t) = e^{\pm\omega t}$. This means that the solution would either explode with the time ($e^{+\omega t}$) or fade away ($e^{-\omega t}$). The negative separation constant in (10.3) obviously guarantees harmonic solutions.
In order to separate the two space variables, we use a further separation ansatz:
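$$Y\frac{\partial^2 X}{\partial x^2} + X\frac{\partial^2 Y}{\partial y^2} + k^2XY = 0.$$
$$\frac{1}{X(x)}\frac{\partial^2 X(x)}{\partial x^2} + \frac{1}{Y(y)}\frac{\partial^2 Y(y)}{\partial y^2} + k^2 = 0, \qquad k^2 = \frac{\omega^2}{c^2}.$$
Here again, a function of x equals a function of y only if both are constants.
We split the constant $k^2$ into
$$k^2 = k_x^2 + k_y^2$$
$$\frac{\partial^2 X}{\partial x^2} + k_x^2X = 0, \quad\text{solution: } X(x) = A_1\sin(k_xx + \delta_1),$$
$$\frac{\partial^2 Y}{\partial y^2} + k_y^2Y = 0, \quad\text{solution: } Y(y) = A_2\sin(k_yy + \delta_2).^1$$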
By multiplying the partial solutions and combining the constants, one obtains the com-
plete solution of the two-dimensional wave equation:
Both equations are only satisfied for all values of the variables x, y, t if
sin δ1 = sin δ2 = 0,
sin(kx a) = sin(ky b) = 0,
1 One of the two separation constants $k_x^2$ or $k_y^2$ could in principle be chosen to be negative, so that e.g., $k_x^2 - k_y^2 = k^2$. In this case we would get $Y = Ae^{k_y\cdot y} + Be^{-k_y\cdot y}$, and the boundary conditions u(x, 0, t) = u(x, b, t) = 0 could be satisfied only by A = B = 0.
kx a = nx π, ky b = ny π, with nx , ny = 1, 2, . . . .
The values nx = ny = 0 must be excluded, since they lead to u(x, y, t) = 0—as for
the vibrating string.
Now we have
$$k^2 = k_x^2 + k_y^2 = n_x^2\left(\frac{\pi}{a}\right)^2 + n_y^2\left(\frac{\pi}{b}\right)^2,$$
10.4 Eigenfrequencies
10.5 Degeneracy
If in the special case of a square membrane, the edges have equal length, a = b, then it follows that
$$\omega_{n_xn_y} = \frac{\sqrt{n_x^2 + n_y^2}}{\sqrt{2}}\,\omega_{11}, \qquad \omega_{11} = \frac{c\pi\sqrt{2}}{a}.$$
The table of the ratios ωnx ny /ω11 for several values of the “quantum numbers”
nx , ny of a square membrane shows (see Table 10.1) that for different pairs of “quan-
tum numbers” there exist the same eigenvalues, i.e., there are different possible eigen-
vibrations with the same frequency. Such states are called degenerate. For a square
membrane which is symmetric with respect to the meaning of the x- and y-coordinate,
all states nx ny arranged symmetrically with respect to the main diagonal of the table
are degenerate.
ny \ nx 1 2 3 4
1 1.00 1.58 2.24 2.92
2 1.58 2.00 2.55 3.16
3 2.24 2.55 3.00 3.54
4 2.92 3.16 3.54 4.00
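The entries of the table and the degenerate pairs can be regenerated with a few lines. This sketch assumes the square-membrane relation $\omega_{n_xn_y}/\omega_{11} = \sqrt{(n_x^2 + n_y^2)/2}$ given above; the names `ratio` and `groups` are illustrative.

```python
import math
from collections import defaultdict

def ratio(nx, ny):
    """omega_{nx,ny} / omega_{11} for the square membrane (a = b)."""
    return math.sqrt((nx * nx + ny * ny) / 2.0)

# Print the table of frequency ratios for nx, ny = 1..4.
for ny in range(1, 5):
    print(ny, " ".join(f"{ratio(nx, ny):.2f}" for nx in range(1, 5)))

# Group the states by nx^2 + ny^2: groups with more than one entry are degenerate.
groups = defaultdict(list)
for nx in range(1, 5):
    for ny in range(1, 5):
        groups[nx * nx + ny * ny].append((nx, ny))

degenerate = {key: states for key, states in groups.items() if len(states) > 1}
print(degenerate)
```

Within this range every degenerate group consists of a pair (nx, ny), (ny, nx) mirrored at the main diagonal.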
We now can evaluate the cnx ny and the ϕnx ny from the initial conditions
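For t = 0, the general solution and its time derivative read as follows:
$$u_0(x, y) = \sum_{n_x,n_y=1}^{\infty} c_{n_xn_y}\sin\varphi_{n_xn_y}\cdot\sin\frac{n_x\pi x}{a}\cdot\sin\frac{n_y\pi y}{b},$$
$$v_0(x, y) = \sum_{n_x,n_y=1}^{\infty}\omega_{n_xn_y}c_{n_xn_y}\cos\varphi_{n_xn_y}\cdot\sin\frac{n_x\pi x}{a}\cdot\sin\frac{n_y\pi y}{b}.$$
With the abbreviations $A_{n_xn_y} = c_{n_xn_y}\sin\varphi_{n_xn_y}$ and $B_{n_xn_y} = \omega_{n_xn_y}c_{n_xn_y}\cos\varphi_{n_xn_y}$, these expansions take the form
$$u_0(x, y) = \sum_{n_x,n_y=1}^{\infty} A_{n_xn_y}\sin\frac{n_x\pi x}{a}\sin\frac{n_y\pi y}{b}, \tag{10.7}$$
$$v_0(x, y) = \sum_{n_x,n_y=1}^{\infty} B_{n_xn_y}\sin\frac{n_x\pi x}{a}\sin\frac{n_y\pi y}{b}. \tag{10.8}$$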
The coefficients Anx ny and Bnx ny can be determined by means of the orthogonality
relations. These read
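$$\begin{aligned}
\int_{-a}^{a}\sin\frac{\bar{n}_x\pi x}{a}\sin\frac{n_x\pi x}{a}\,dx &= a\,\delta_{\bar{n}_xn_x},\\
\int_{-b}^{b}\sin\frac{\bar{n}_y\pi y}{b}\sin\frac{n_y\pi y}{b}\,dy &= b\,\delta_{\bar{n}_yn_y}.
\end{aligned} \tag{10.9}$$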
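$$\begin{aligned}
\int_{-a}^{a}\int_{-b}^{b} u_0(x, y)&\sin\frac{\bar{n}_x\pi x}{a}\sin\frac{\bar{n}_y\pi y}{b}\,dx\,dy\\
&= 4\int_0^a\int_0^b u_0(x, y)\sin\frac{\bar{n}_x\pi x}{a}\sin\frac{\bar{n}_y\pi y}{b}\,dx\,dy\\
&= \sum_{n_x,n_y}^{\infty} A_{n_xn_y}\int_{-a}^{a}\sin\frac{n_x\pi x}{a}\sin\frac{\bar{n}_x\pi x}{a}\,dx\int_{-b}^{b}\sin\frac{\bar{n}_y\pi y}{b}\sin\frac{n_y\pi y}{b}\,dy\\
&= \sum_{n_x,n_y}^{\infty} A_{n_xn_y}\,\delta_{\bar{n}_xn_x}a\,\delta_{\bar{n}_yn_y}b = ab\,A_{\bar{n}_x\bar{n}_y}.
\end{aligned}$$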
Likewise, we treat (10.8) to evaluate the coefficients Bnx ny . One then obtains
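$$\begin{aligned}
A_{n_xn_y} &= \frac{4}{ab}\int_0^a\int_0^b u_0(x, y)\sin\frac{n_x\pi x}{a}\sin\frac{n_y\pi y}{b}\,dx\,dy,\\
B_{n_xn_y} &= \frac{4}{ab}\int_0^a\int_0^b v_0(x, y)\sin\frac{n_x\pi x}{a}\sin\frac{n_y\pi y}{b}\,dx\,dy.
\end{aligned} \tag{10.10}$$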
With the knowledge of the Anx ny and Bnx ny , one now can calculate the cnx ny and ϕnx ny
from (10.5) and (10.6).
In the case of degenerate vibrations of the membrane, there can also appear node lines
that arise by superposition of the node line figures of the degenerate normal vibrations.
As an example we consider the position dependence of the degenerate normal vibrations of the square membrane
$$u_{12} = \sin\frac{\pi x}{a}\sin\frac{2\pi y}{a}\sin\omega_{12}t \quad\text{and}\quad u_{21} = \sin\frac{2\pi x}{a}\sin\frac{\pi y}{a}\sin\omega_{21}t. \tag{10.11}$$
10.9 The Circular Membrane 141
u = u12 + Cu21 .
The constant C specifies the particular kind of superposition. The equation of the
nodal line is obtained from u = 0. The common numerical factor sin ω12 t = sin ω21 t
obviously factors out. For the special case C = ±1, we find
πx 2πy 2πx πy
sin sin ± sin sin =0
a a a a
or, rewritten,
πx πy πy πx
sin sin cos ± cos = 0. (10.12)
a a a a
By setting the bracket equal to zero, we get the equations for the two nodal lines:
We recognize that new vibrations with new kinds of nodal lines can be constructed
by superposing appropriate normal vibrations. One can excite such specific superpo-
sitions of normal vibrations by stretching wires along the nodal lines (right figure) so
that the membrane remains at rest along these lines.
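A short numerical sketch confirms where the new nodal lines lie. Setting the bracket in (10.12) to zero gives cos(πy/a) = ∓cos(πx/a), i.e., the diagonal y = a − x for C = +1 and the diagonal y = x for C = −1; this reading is derived here from (10.12) rather than quoted. The function name and sample points are assumptions.

```python
import math

def superposition(x, y, a, C):
    """Spatial part of u12 + C*u21 for the square membrane of edge length a."""
    u12 = math.sin(math.pi * x / a) * math.sin(2.0 * math.pi * y / a)
    u21 = math.sin(2.0 * math.pi * x / a) * math.sin(math.pi * y / a)
    return u12 + C * u21

a = 1.0
samples = [0.1 * k for k in range(1, 10)]

# C = +1: the displacement vanishes along the diagonal y = a - x.
max_plus = max(abs(superposition(x, a - x, a, +1.0)) for x in samples)
# C = -1: the displacement vanishes along the diagonal y = x.
max_minus = max(abs(superposition(x, x, a, -1.0)) for x in samples)
# Off the nodal line the membrane does move, e.g. on y = x for C = +1.
off_line = superposition(0.3, 0.3, a, +1.0)

print(max_plus, max_minus, off_line)
```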
In the case of the circular membrane, it is convenient to change from the Cartesian
coordinates to polar coordinates, i.e., from u = f (x, y, t) to u = ψ(r, ϕ, t).
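For this recalculation, we have
$$x = r\cos\varphi, \quad y = r\sin\varphi, \quad \tan\varphi = \frac{y}{x}, \quad r = \sqrt{x^2 + y^2}. \tag{10.13}$$
Fig. 10.5. Circular membrane (drum)
For the transformation of the Laplace operator, we need the derivatives
$$\frac{\partial r}{\partial x} = \frac{x}{r} = \cos\varphi, \qquad \frac{\partial r}{\partial y} = \frac{y}{r} = \sin\varphi. \tag{10.14}$$
By differentiating the tangent, we get
$$\frac{\partial\tan\varphi}{\partial x} = \frac{\partial\tan\varphi}{\partial\varphi}\frac{\partial\varphi}{\partial x} = \frac{1}{\cos^2\varphi}\frac{\partial\varphi}{\partial x} = -\frac{y}{x^2}. \tag{10.15}$$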
By inserting the polar representations for x and y, one gets ∂ϕ/∂x = −(sin ϕ)/r. The
corresponding differentiation of tan ϕ with respect to y yields ∂ϕ/∂y = (cos ϕ)/r. To
get the two-dimensional vibration equation in polar coordinates, we first transform the
Laplace operator (x, y) to polar coordinates (r, ϕ). The differential quotients are
interpreted as operators.
We demonstrate the calculation for the x-component; the recalculation of the
y-component then runs likewise. According to the chain rule, we have
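$$\frac{\partial}{\partial x} = \frac{\partial}{\partial r}\frac{\partial r}{\partial x} + \frac{\partial}{\partial\varphi}\frac{\partial\varphi}{\partial x}. \tag{10.16}$$
$$\frac{\partial}{\partial x} = \cos\varphi\frac{\partial}{\partial r} - \frac{\sin\varphi}{r}\frac{\partial}{\partial\varphi}. \tag{10.17}$$
We square this result, taking into account that the terms act on each other as operators. (The square of an operator means double application.)
$$\frac{\partial^2}{\partial x^2} = \left(\cos\varphi\frac{\partial}{\partial r} - \frac{1}{r}\sin\varphi\frac{\partial}{\partial\varphi}\right)\left(\cos\varphi\frac{\partial}{\partial r} - \frac{1}{r}\sin\varphi\frac{\partial}{\partial\varphi}\right). \tag{10.18}$$
$$\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} = \Delta = \frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\varphi^2}. \tag{10.20}$$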
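The identity (10.20) can be checked numerically for any smooth test function. The sketch below compares central-difference approximations of the Cartesian and the polar form at one point; the test function, the step size h, and all function names are arbitrary assumptions made for this illustration.

```python
import math

def u_cart(x, y):
    """Arbitrary smooth test function; its exact Laplacian is 6xy - 3 e^x sin(2y)."""
    return x**3 * y + math.exp(x) * math.sin(2.0 * y)

def u_polar(r, phi):
    return u_cart(r * math.cos(phi), r * math.sin(phi))

def laplacian_cartesian(x, y, h=1e-3):
    """Central differences for u_xx + u_yy."""
    return (u_cart(x + h, y) + u_cart(x - h, y) + u_cart(x, y + h)
            + u_cart(x, y - h) - 4.0 * u_cart(x, y)) / h**2

def laplacian_polar(r, phi, h=1e-3):
    """Central differences for u_rr + u_r / r + u_phiphi / r^2, as in (10.20)."""
    u_rr = (u_polar(r + h, phi) - 2.0 * u_polar(r, phi) + u_polar(r - h, phi)) / h**2
    u_r = (u_polar(r + h, phi) - u_polar(r - h, phi)) / (2.0 * h)
    u_pp = (u_polar(r, phi + h) - 2.0 * u_polar(r, phi) + u_polar(r, phi - h)) / h**2
    return u_rr + u_r / r + u_pp / r**2

x, y = 1.2, 0.7
r, phi = math.hypot(x, y), math.atan2(y, x)
exact = 6.0 * x * y - 3.0 * math.exp(x) * math.sin(2.0 * y)
print(laplacian_cartesian(x, y), laplacian_polar(r, phi), exact)
```

The three numbers agree up to the discretization error of the finite differences.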
The vibration equation then takes the following form:
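$$\frac{1}{c^2}\frac{\ddot{T}}{T} = -k^2 \tag{10.25}$$
and introduce the angular frequency ω by
$$\omega = ck. \tag{10.26}$$
$$\ddot{Z} + \omega^2Z = 0 \tag{10.27}$$
$$\frac{\partial^2 V}{\partial r^2} + \frac{1}{r}\frac{\partial V}{\partial r} + \frac{1}{r^2}\frac{\partial^2 V}{\partial\varphi^2} + k^2V = 0. \tag{10.29}$$
We separate the radial and angular functions by a second product ansatz:
Hence, we obtain
$$\frac{\dfrac{d^2R}{dr^2} + \dfrac{1}{r}\dfrac{dR}{dr}}{R(r)} + \frac{\dfrac{1}{r^2}\dfrac{d^2\phi}{d\varphi^2}}{\phi(\varphi)} + k^2 = 0. \tag{10.31}$$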
m must take only integer values to get the periodicity of the solution. At the angle
2π + ϕ, the solution must be identical with that for the angle ϕ. This fact is often
described by the phrase periodic boundary conditions.
Now we can admit—without restricting the problem—only positive m, since with
negative m only the sense of rotation angle is inverted.
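Thus, the equation of motion for the radial function R looks as follows:
$$r^2\frac{d^2R}{dr^2} + r\frac{dR}{dr} + k^2r^2R - \sigma R = 0$$
or
$$\frac{d^2R}{dr^2} + \frac{1}{r}\frac{dR}{dr} + \left(k^2 - \frac{m^2}{r^2}\right)R = 0. \tag{10.35}$$
We substitute z = kr, dr = dz/k. Then we get
$$k^2\frac{d^2R}{dz^2} + \frac{k^2}{z}\frac{dR}{dz} + \left(k^2 - \frac{m^2k^2}{z^2}\right)R = 0,$$
$$\frac{d^2R}{dz^2} + \frac{1}{z}\frac{dR}{dz} + \left(1 - \frac{m^2}{z^2}\right)R = 0. \tag{10.36}$$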
In this form, the equation is called Bessel’s differential equation. This differential
equation and its solutions appear in many problems of mathematical physics.
2 Friedrich Wilhelm Bessel, b. July 22, 1784, Minden–d. March 17, 1846, Königsberg (Kaliningrad).
Bessel was first a trade apprentice in Bremen, then until 1809 an assistant at the observatory in
Lilienthal, and then professor of astronomy in Königsberg and director of the observatory there. In
1838 he succeeded in measuring the annual parallax of the star 61 Cygni, thus becoming the first to
determine the distance to a fixed star. As a mathematician Bessel was best known for his investigations
on differential equations and on Bessel functions.
10.10 Solution of Bessel’s Differential Equation 145
The separation of a power factor is not necessary, but will prove to be very convenient.
Since in the center of our membrane the vibration remains always finite, g(z) must not have a singularity at z = 0. But since for z → 0 we have
$$g(z) \approx a_0z^{\mu}, \tag{10.39}$$
for these physical reasons we must have μ ≥ 0. To get a more general statement, we consider the asymptotic behavior of Bessel’s differential equation for z → 0 for at first arbitrary μ.
We then can set as above
$$g(z) \approx a_0z^{\mu} \tag{10.40}$$
$$\mu(\mu - 1)z^{\mu-2} + \mu z^{\mu-2} + z^{\mu} - m^2z^{\mu-2} = \left[\mu(\mu - 1) + \mu + z^2 - m^2\right]z^{\mu-2} \approx (\mu^2 - m^2)z^{\mu-2} = 0, \tag{10.41}$$
$$\mu^2 - m^2 = 0. \tag{10.42}$$
For the above-mentioned reasons, which are of a purely physical nature, it follows that
$$\mu = m, \qquad m \in \mathbb{N}_0. \tag{10.43}$$
The constant m is itself an integer. To see this, we remind ourselves of the angular
dependence of the total solution, namely,
Since after a full revolution we return again to the same point of the membrane, the
solution function must have the period 2π . But this holds only then if m is an integer!
We now try to determine the coefficients of our ansatz
For this purpose, we insert the ansatz in the Bessel equation. The individual terms of
this equation then have the following form:
$$\begin{aligned}
\frac{d^2g}{dz^2} &= z^{m-2}\left[a_0m(m - 1) + a_1(m + 1)mz + a_2(m + 2)(m + 1)z^2 + a_3(m + 3)(m + 2)z^3 + \cdots\right],\\
\frac{1}{z}\frac{dg}{dz} &= z^{m-2}\left[a_0m + a_1(m + 1)z + a_2(m + 2)z^2 + a_3(m + 3)z^3 + \cdots\right],\\
g(z) &= z^{m-2}\left(a_0z^2 + a_1z^3 + \cdots\right),\\
-\frac{m^2}{z^2}g(z) &= z^{m-2}\left(-a_0m^2 - a_1m^2z - a_2m^2z^2 - a_3m^2z^3 - \cdots\right).
\end{aligned}$$
The sum of the coefficients for each power of z must vanish, i.e., $a_0(m(m - 1) + m - m^2) = 0$. Since the bracket vanishes, $a_0$ can be arbitrary.
For $a_1$, we get
$$a_1\left[m(m + 1) + (m + 1) - m^2\right] = 0, \qquad a_1(2m + 1) = 0, \quad\text{i.e.,}\quad a_1 = 0. \tag{10.46}$$
or
Furthermore, we get
a3 (m + 3)(m + 2) + (m + 3) − m2 + a1 = 0,
a3 (6m + 9) = −a1 , i.e., a3 = 0. (10.48)
This recursion formula allows one to determine the coefficient $a_{p+2}$ from the preceding
$a_p$. Because $a_1 = 0$, it follows that all $a_{2n-1}$ vanish, i.e., in the series expansion
of the solution function there appear only even exponents. For these one obtains with
$a_0 \neq 0$:
\[ a_{2n} = \frac{-a_{2n-2}}{2n(2m+2n)} = \frac{-a_{2n-2}}{2^2\, n(m+n)}. \tag{10.50} \]
\[ a_{2n} = \frac{(-1)^n a_0}{2^n\, n(n-1)\cdots 1 \cdot 2^n\,(m+n)(m+n-1)\cdots(m+1)} = \frac{(-1)^n a_0}{2^{2n}\, n!\,(m+n)!/m!}. \tag{10.52} \]
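As a small numerical cross-check, the recursion (10.50) can be iterated and compared with the closed form (10.52). The following is only a sketch; the function names are illustrative and do not appear in the text, and $a_0 = 1$ is an arbitrary choice.

```python
from math import factorial

def coefficients_by_recursion(m, n_max, a0=1.0):
    """Return [a_0, a_2, ..., a_{2 n_max}] using a_{2n} = -a_{2n-2} / (2n (2m + 2n))."""
    coeffs = [a0]
    for n in range(1, n_max + 1):
        coeffs.append(-coeffs[-1] / (2 * n * (2 * m + 2 * n)))
    return coeffs

def coefficient_closed_form(m, n, a0=1.0):
    """Closed form a_{2n} = (-1)^n a_0 m! / (2^{2n} n! (m+n)!)."""
    return (-1) ** n * a0 * factorial(m) / (2 ** (2 * n) * factorial(n) * factorial(m + n))

if __name__ == "__main__":
    m = 2
    for n, a in enumerate(coefficients_by_recursion(m, 5)):
        print(n, a, coefficient_closed_form(m, n))
```

Both columns printed by the script should agree up to rounding, for any non-negative integer m.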
The graph of the first Bessel functions is given in Fig. 10.6. We see that for large
arguments the Bessel functions vary like the trigonometric functions sine or cosine.
Now we can immediately write down the solutions of our differential equation:
The membrane cannot vibrate at the border r = a, i.e., the boundary condition reads
Jm (k · a) = 0,
from which the eigenfrequencies can be determined. For this purpose we must find the
zeros of the Bessel function:
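\[
\begin{aligned}
J_0(z) &= 1 - \frac{z^2}{4} + \frac{z^4}{64} - + \cdots = 0,\\
J_1(z) &= \frac{z}{2} - \frac{z^3}{16} + \frac{z^5}{384} - + \cdots = 0, \quad \text{etc.}
\end{aligned}
\tag{10.56}
\]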
These zeros (except for the trivial ones at z = 0) can in general not be determined
exactly; they must be calculated by numerical methods. If we denote the nth node of
the function $J_m(z)$ by $z_n^{(m)}$, we obtain the following table for the values of the first
$z_n^{(m)}$:

  n     m = 0   m = 1   m = 2   m = 3   m = 4   m = 5
  1      2.41    3.83    5.14    6.38    7.59    8.77
  2      5.52    7.02    8.42    9.76   11.06   12.34
  3      8.65   10.17   11.62   13.02   14.37   15.70
  4     11.79   13.32   14.80   16.22   17.62   18.98
  5     14.93   16.47   17.96   19.41   20.83   22.22
  6     18.07   19.62   21.12   22.58   24.02   25.43
  7     21.21   22.76   24.27   25.75   27.20   28.63
  8     24.35   25.90   27.42   28.91   30.37   31.81
  9     27.49   29.05   30.57   32.07   33.54   34.99
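The numerical determination of the zeros can be sketched as follows. The code sums the power series of $J_m$ (with the standard normalization $a_0 = 1/(2^m m!)$, which is an assumption here since the normalization is not shown in this passage), scans for sign changes, and refines each zero by bisection. The asymptotic estimate $(n + m/2 - 1/4)\pi$ is the usual leading-order formula and is assumed to be the approximation meant in Table 10.3; all function names are illustrative.

```python
from math import factorial, pi

def bessel_j(m, z, terms=60):
    """J_m(z) from its power series; adequate for moderate arguments (roughly z < 20)."""
    return sum((-1) ** n * (z / 2) ** (2 * n + m) / (factorial(n) * factorial(n + m))
               for n in range(terms))

def bessel_zeros(m, count, step=0.1):
    """First `count` positive zeros of J_m: scan for sign changes, then bisect."""
    zeros = []
    left = step
    f_left = bessel_j(m, left)
    while len(zeros) < count:
        right = left + step
        f_right = bessel_j(m, right)
        if f_left * f_right < 0:
            lo, hi, f_lo = left, right, f_left
            for _ in range(60):                  # bisection on the bracketing interval
                mid = 0.5 * (lo + hi)
                f_mid = bessel_j(m, mid)
                if f_lo * f_mid <= 0:
                    hi = mid
                else:
                    lo, f_lo = mid, f_mid
            zeros.append(0.5 * (lo + hi))
        left, f_left = right, f_right
    return zeros

def asymptotic_zero(m, n):
    """Leading-order large-argument estimate of the nth zero of J_m (assumed form)."""
    return (n + m / 2 - 0.25) * pi

if __name__ == "__main__":
    for m in (0, 1, 5):
        print(m, [round(z, 2) for z in bessel_zeros(m, 3)],
              [round(asymptotic_zero(m, n), 2) for n in (1, 2, 3)])
```

The power series loses accuracy for large arguments because of cancellation, so this sketch is meant only for the first few zeros.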
Table 10.3. Comparison of the exact zeros $z_n^{(m)}$ of the Bessel functions with the values
$\bar{z}_n^{(m)}$ obtained from the asymptotic approximation

           m = 0                    m = 5
           z_n^(0)   z̄_n^(0)       z_n^(5)   z̄_n^(5)
  n = 1     2.41      2.36          8.77     10.21
  n = 2     5.52      5.49         12.34     13.35
    ⋮        ⋮         ⋮             ⋮         ⋮
  n = 9    27.49     27.49         34.99     35.34
With the exact solutions $z_n^{(m)}$, the boundary condition is
\[ k_n^{(m)} \cdot a = z_n^{(m)}, \qquad k_n^{(m)} = \frac{1}{a}\, z_n^{(m)}. \]
For the eigenfrequencies, we get
\[ \omega_n^{(m)} = k_n^{(m)} \cdot c = \frac{c}{a}\, z_n^{(m)} = \omega_0 \cdot z_n^{(m)}. \tag{10.59} \]
Thus, Table 10.2 also shows the values for the ratio $\omega_n^{(m)}/\omega_0$. By drawing all these
eigenfrequencies along an axis, one arrives at Fig. 10.7. The spacings between the
individual eigenfrequencies are completely irregular. Thus, we are dealing with extremely
anharmonic overtones. This is the reason why drums are badly suited as melodic instruments!
Fig. 10.7. Linear representation of the eigenfrequencies of the circular membrane
The general solution of the vibration equation is the superposition of the normal
vibrations. It now reads
\[ u(r,\varphi,t) = \sum_{m,n} c_n^{(m)}\, J_m(k_n^{(m)} r) \cdot \sin(m\varphi + \delta_m) \cdot \sin(\omega_n^{(m)} t + \delta_n^{(m)}). \tag{10.60} \]
In analogy to the Fourier analysis, the $c_n^{(m)}$ can be found so that u(r, ϕ, t) can be
adjusted to any given initial condition u(r, ϕ, 0) or u̇(r, ϕ, 0).
Finally, we want to get a survey of the nodal lines of the vibrating membrane. On
these lines we must have
\[ J_m(k_n^{(m)} r) = 0; \tag{10.62} \]
or if
\[ \sin(m\varphi + \delta_m) = 0, \tag{10.64} \]
EXAMPLE
The equations of motion for a system with n vibrating mass points which are con-
nected by n + 1 springs of equal spring constant k read
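\[
\begin{aligned}
m\ddot{x}_1 &= -k x_1 + k(x_2 - x_1)\\
m\ddot{x}_2 &= -k(x_2 - x_1) + k(x_3 - x_2)\\
m\ddot{x}_3 &= -k(x_3 - x_2) + k(x_4 - x_3)\\
&\;\;\vdots\\
m\ddot{x}_{n-1} &= -k(x_{n-1} - x_{n-2}) + k(x_n - x_{n-1})\\
m\ddot{x}_n &= -k(x_n - x_{n-1}) - k x_n.
\end{aligned}
\tag{10.66}
\]
With
\[ \mathbf{r} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \]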
Fig. 10.9.
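where
\[
\hat{C} = \begin{pmatrix}
-2 & 1 & & & & \\
1 & -2 & 1 & & & \\
 & 1 & -2 & 1 & & \\
 & & \ddots & \ddots & \ddots & \\
 & & & 1 & -2 & 1 \\
 & & & & 1 & -2
\end{pmatrix}
\quad \text{and} \quad
\mathbf{r} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}. \tag{10.68}
\]
Here,
\[
\hat{E}_n = \begin{pmatrix}
1 & 0 & \dots & 0 \\
0 & 1 & \dots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \dots & 1
\end{pmatrix}
\]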
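\[ \sin((n+1)\gamma_i) = 0 \;\Rightarrow\; \gamma_i = \frac{i\pi}{n+1}, \quad \text{for all } i \in \{1, \dots, n\}. \tag{10.74} \]
We summarize the result for the ith eigenmodes:
\[ \mathbf{r}_i(t) = \left( \sin\left(j\,\frac{i\pi}{n+1}\right) \cdot \cos\omega_i t, \quad j = 1, 2, \dots, n \right) \tag{10.75} \]
with
\[ \omega_i = 2\sqrt{\frac{k}{m}}\, \sin\frac{i\pi}{2n+2}. \tag{10.76} \]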
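The eigenmodes (10.75) and eigenfrequencies (10.76) can be checked directly against the equations of motion (10.66). For a mode $x_j \propto \sin(j i\pi/(n+1)) \cos\omega_i t$, the system (10.66) requires $\hat{C}\mathbf{v} = -(m\omega_i^2/k)\,\mathbf{v}$. The sketch below tests this numerically; the helper names and the sample values of n, k, m are illustrative assumptions.

```python
from math import sin, pi, sqrt

def apply_chain_matrix(v):
    """Apply the tridiagonal matrix with -2 on the diagonal and 1 on both off-diagonals."""
    n = len(v)
    out = []
    for j in range(n):
        left = v[j - 1] if j > 0 else 0.0       # fixed wall on the left end
        right = v[j + 1] if j < n - 1 else 0.0  # fixed wall on the right end
        out.append(left - 2.0 * v[j] + right)
    return out

def eigenmode(i, n):
    """Components sin(j i pi / (n + 1)), j = 1..n, of the ith mode."""
    return [sin(j * i * pi / (n + 1)) for j in range(1, n + 1)]

def omega(i, n, k=1.0, m=1.0):
    """Eigenfrequency 2 sqrt(k/m) sin(i pi / (2n + 2))."""
    return 2.0 * sqrt(k / m) * sin(i * pi / (2 * n + 2))

if __name__ == "__main__":
    n, k, m = 5, 3.0, 2.0
    for i in range(1, n + 1):
        v = eigenmode(i, n)
        lam = m * omega(i, n, k, m) ** 2 / k
        residual = max(abs(c + lam * x) for c, x in zip(apply_chain_matrix(v), v))
        print(i, omega(i, n, k, m), residual)
```

The residual printed in the last column should be at the level of rounding errors for every mode.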
The general solution of (10.66) is a superposition of the various eigenmodes, i.e.,
a vector r(t) with the components xj (t):
\[ x_j(t) = \sum_{i=1}^{n} (c_i \cos\omega_i t + b_i \sin\omega_i t) \cdot \sin\left(j\,\frac{i\pi}{n+1}\right), \quad \text{for } j = 1, 2, \dots, n. \tag{10.77} \]
The coefficients sin(j iπ/(n + 1)) are, according to (10.70), (10.73) and (10.76), the
components of the eigenvector to the ith mode, and since $\hat{D}_n(\omega)$ is symmetric, the
latter ones represent an orthogonal basis in $\mathbb{R}^n$.
\[ \sum_{j=1}^{n} \sin\left(j\,\frac{i\pi}{n+1}\right) \sin\left(j\,\frac{l\pi}{n+1}\right) = \frac{n+1}{2}\,\delta_{il}. \tag{10.78} \]
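We explicitly check this relation in the following Exercise 10.2. We define the orthonormal
eigenmodes $\mathbf{a}_i$:
\[ \mathbf{a}_i = \sqrt{\frac{2}{n+1}} \left( \sin\left(j\,\frac{i\pi}{n+1}\right), \quad j = 1, 2, \dots, n \right), \tag{10.79} \]
or in detail
\[ \mathbf{a}_i = \sqrt{\frac{2}{n+1}} \left( \sin\frac{\pi i}{n+1},\; \sin\frac{2\pi i}{n+1},\; \dots,\; \sin\frac{n\pi i}{n+1} \right). \]
\[ \mathbf{r}(t) = \sum_{i=1}^{n} (c_i \cos\omega_i t + b_i \sin\omega_i t)\, \mathbf{a}_i. \tag{10.80} \]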
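The orthonormality of the vectors defined in (10.79) is easy to confirm numerically. The following sketch builds the vectors for a sample n (an arbitrary choice) and evaluates all scalar products; the function names are illustrative.

```python
from math import sin, pi, sqrt

def normalized_mode(i, n):
    """Eigenmode a_i with components sqrt(2/(n+1)) sin(j i pi/(n+1)), j = 1..n."""
    norm = sqrt(2.0 / (n + 1))
    return [norm * sin(j * i * pi / (n + 1)) for j in range(1, n + 1)]

def dot(u, v):
    """Euclidean scalar product of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))

if __name__ == "__main__":
    n = 6
    modes = [normalized_mode(i, n) for i in range(1, n + 1)]
    for u in modes:
        # Each printed row should be a row of the unit matrix (up to rounding).
        print([round(dot(u, v), 12) for v in modes])
```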
The following interesting question arises: Let the system of n mass points (degrees
of freedom) at the time $t_0$ be at $\mathbf{r}(t_0) = \mathbf{r}_0$ with the velocity $\dot{\mathbf{r}}(t_0) = \dot{\mathbf{r}}_0$. The system
moves away from this configuration, but after a certain time τ it can closely approach
the initial configuration and possibly return exactly into the initial configuration. We
call this time τ the Poincaré recurrence time.3 One looks for the difference between
the actual time-dependent state vector in the phase space $(\mathbf{r}(t), \dot{\mathbf{r}}(t))$ and the start
vector $(\mathbf{r}_0, \dot{\mathbf{r}}|_{t=0})$:
\[ \varepsilon(t) := \left\| \mathbf{r}(t) - \mathbf{r}_0 \right\|^2 + \left\| \dot{\mathbf{r}}(t) - \dot{\mathbf{r}}|_{t=0} \right\|^2_{\Omega}. \tag{10.81} \]
The index at the second scalar product for the velocities indicates a diagonal weight
matrix which is suitably included into this normalization:
\[
\Omega = \begin{pmatrix}
\dfrac{1}{\omega_1^2} & & & \\
 & \dfrac{1}{\omega_2^2} & & \\
 & & \ddots & \\
 & & & \dfrac{1}{\omega_n^2}
\end{pmatrix}; \tag{10.82}
\]
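\[ \mathbf{r}(0) = \mathbf{r}_0 = \sum_{i=1}^{n} c_i \mathbf{a}_i, \]
\[ \dot{\mathbf{r}}(0) = \dot{\mathbf{r}}_0 = 0 = \sum_{i=1}^{n} b_i \mathbf{a}_i\, \omega_i \;\Rightarrow\; b_i = 0 \quad \text{for all } i. \]
For this choice, the distance ε(t) in phase space, given by (10.81), is
\[ \varepsilon(t) = \mathbf{r}_0 \cdot \mathbf{r}_0 - 2\,\mathbf{r}_0 \cdot \sum_i c_i \cos\omega_i t\; \mathbf{a}_i + \sum_i c_i^2 \cos^2\omega_i t + \sum_i c_i^2 \sin^2\omega_i t. \tag{10.83} \]
Because $\mathbf{r}_0 = \sum_i c_i \mathbf{a}_i$, this turns into
\[
\begin{aligned}
\varepsilon(t) &= \sum_{i=1}^{n} c_i^2 \bigl(1 - 2\cos\omega_i t + \underbrace{\cos^2\omega_i t + \sin^2\omega_i t}_{=1}\bigr)
 = 2 \sum_{i=1}^{n} c_i^2 \,(1 - \cos\omega_i t)\\
&= 4 \sum_{i=1}^{n} c_i^2 \sin^2\frac{\omega_i t}{2}.
\end{aligned}
\tag{10.84}
\]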
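The phase-space distance can also be evaluated numerically. The sketch below computes ε(t) once directly from the definition (10.81), written in eigenmode coordinates with the velocities weighted by $1/\omega_i^2$, and once from the half-angle form that follows from $1 - 2\cos x + 1 = 4\sin^2(x/2)$. It assumes, as in the text, a start from rest ($b_i = 0$); the amplitudes and the function names are illustrative.

```python
from math import sin, cos, pi, sqrt

def mode_frequencies(n, k=1.0, m=1.0):
    """Eigenfrequencies 2 sqrt(k/m) sin(i pi/(2n+2)), i = 1..n."""
    return [2.0 * sqrt(k / m) * sin(i * pi / (2 * n + 2)) for i in range(1, n + 1)]

def epsilon_direct(c, omegas, t):
    """Distance from the definition, component by component in the eigenmode basis."""
    total = 0.0
    for ci, w in zip(c, omegas):
        dq = ci * cos(w * t) - ci            # displacement difference along a_i
        dv = -ci * w * sin(w * t)            # velocity along a_i (initial velocity is zero)
        total += dq * dq + dv * dv / (w * w) # velocity term weighted by 1/omega_i^2
    return total

def epsilon_closed(c, omegas, t):
    """Closed form 4 * sum_i c_i^2 sin^2(omega_i t / 2)."""
    return 4.0 * sum(ci * ci * sin(0.5 * w * t) ** 2 for ci, w in zip(c, omegas))

if __name__ == "__main__":
    omegas = mode_frequencies(3)
    c = [0.3, -1.2, 0.5]
    for t in (0.0, 1.0, 5.0, 25.0):
        print(t, epsilon_direct(c, omegas, t), epsilon_closed(c, omegas, t))
```

If only a single mode is excited, ε(t) returns to zero after each period $2\pi/\omega_i$; with several incommensurate frequencies it only comes close to zero.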
It is easily seen that this expression vanishes again after t = 0 only if the eigenfrequencies
$\omega_i$ are related by rational fractions. In the general case ε(t) is only conditionally
periodic. This notion will be explained more precisely as follows: We first consider the
purely periodic case. The period is then determined by the lowest of the frequencies
that are related by rational fractions:
\[ \omega_i = 2\sqrt{\frac{k}{m}}\, \sin\frac{i\pi}{2n+2}. \tag{10.85} \]
We denote it by $\tilde{\omega} = q \cdot \omega_1$ ($q \in \mathbb{Z}^+$),
\[ \tilde{\omega} = 2q \sqrt{\frac{k}{m}}\, \sin\frac{\pi}{2n+2} \approx 2q \sqrt{\frac{k}{m}}\, \frac{\pi}{2n+2}. \tag{10.86} \]
The last approximation holds for n ≫ 1 (many mass points). To this frequency corresponds
the time
\[ \tau = \frac{2\pi}{q\,\omega_1} = 2\sqrt{\frac{m}{k}} \cdot \frac{n+1}{q}. \tag{10.87} \]
This is the Poincaré recurrence time, since after this time ε(t) vanishes again, i.e.,
the initial configuration in the phase space is reached again after the time τ. For very
many mass points (n → ∞), this time tends to infinity
\[ \tau \to \infty, \quad \text{for } n \to \infty. \tag{10.88} \]
\[ \mathbf{r}_i = \mathbf{a}_i \cos\omega_i t. \tag{10.89} \]
The quantities $\tau_i = 2\pi/\omega_i$ are the periods belonging to the coordinates $x_i$. In analogy
to (10.77) and (10.80), we expand the general configuration vector $\mathbf{r}(t)$ in a Fourier
series
\[ \mathbf{r}(t) = \sum_i (c_i \cos\omega_i t + b_i \sin\omega_i t)\, \mathbf{a}_i. \tag{10.90} \]
The Fourier series (10.90) is in general no longer a periodic function with respect
to the time t, although every individual term is periodic. The periodicity is assured
only for such degrees of freedom whose frequencies $\omega_1, \omega_2, \dots$ are related by rational
fractions. Therefore systems with several degrees of freedom are called conditionally
periodic systems.
The number of frequencies which are related by rational fractions determines the
degree of degeneracy of the system. If there are no relations of this kind, the system
is non-degenerate. If all frequencies are rationally related to each other, the system is
called fully degenerate. In this case we observe a periodic time function.
The Kepler problem treated earlier is an example of a system with two degrees
of freedom (r, ϕ) which is degenerate and thus has only one frequency. By introducing
a perturbation, say a quadrupole-like potential with the typical variation $1/r^3$, the
degeneracy can be removed, which causes a rosette-like motion.
As an example of a conditionally periodic motion, we note the anisotropic linear
harmonic oscillator, which is a mass point with different spring constants in the various
Cartesian directions. The trajectory of the mass point is a Lissajous figure which
never closes on itself and in the course of time densely covers the area given by the
amplitudes. Only in the case of degeneracy are there periodicities in the motion.
In the discussion of the Poincaré recurrence time, we assumed a periodic motion.
In the case of a conditionally periodic motion, the situation is—as was expected—
completely analogous. In this case, after the Poincaré recurrence time τ the configu-
ration vectors r(t), ṙ(t) come very close to the initial configuration r0 , ṙ0 . The initial
configuration will not be reached again, but will be reached “nearly” again after the
time τ . For further discussion, we refer to the literature.
EXERCISE
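\[ \sum_{j=1}^{n} \sin\left(j\,\frac{i\pi}{n+1}\right) \sin\left(j\,\frac{l\pi}{n+1}\right) = \frac{n+1}{2}\,\delta_{il}, \tag{10.91} \]
\[
\begin{aligned}
d_{kl} &= \sum_{j=1}^{n} \sin\left(k\,\frac{\pi}{n+1}\, j\right) \sin\left(l\,\frac{\pi}{n+1}\, j\right)\\
&= \frac{1}{2} \sum_{j=1}^{n} \left[ \cos\left(\frac{(k-l)\pi}{n+1}\, j\right) - \cos\left(\frac{(k+l)\pi}{n+1}\, j\right) \right].
\end{aligned}
\]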
Before continuing the exposition, we evaluate the sum of the following series:
\[ \sum_{k=1}^{n} \cos kx = \frac{\sin(xn/2)\,\cos\bigl(x(n+1)/2\bigr)}{\sin(x/2)} = \frac{\cos(xn/2)\,\sin\bigl(x(n+1)/2\bigr)}{\sin(x/2)} - 1. \tag{10.92} \]
This result is easily obtained by writing the cosine in terms of exponential functions
and then evaluating the sum as a geometric series.
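Both closed forms in (10.92) can be compared with the directly evaluated sum. The following sketch does this for a few sample values; the function names are illustrative, and x must not be a multiple of 2π because of the denominator sin(x/2).

```python
from math import sin, cos

def cosine_sum(n, x):
    """Direct evaluation of sum_{k=1}^{n} cos(k x)."""
    return sum(cos(k * x) for k in range(1, n + 1))

def closed_form_a(n, x):
    """First closed form: sin(n x/2) cos((n+1) x/2) / sin(x/2)."""
    return sin(n * x / 2) * cos((n + 1) * x / 2) / sin(x / 2)

def closed_form_b(n, x):
    """Second closed form: cos(n x/2) sin((n+1) x/2) / sin(x/2) - 1."""
    return cos(n * x / 2) * sin((n + 1) * x / 2) / sin(x / 2) - 1.0

if __name__ == "__main__":
    for n in (1, 4, 9):
        x = 0.8
        print(n, cosine_sum(n, x), closed_form_a(n, x), closed_form_b(n, x))
```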
156 10 The Vibrating Membrane
The vanishing of $d_{kl}$ (k ≠ l) is immediately seen from (10.94) for even k − l and k + l,
and from (10.95) for odd k − l and k + l.
Near the end of the nineteenth century, many physicists discussed the hypothesis
that the course of the world repeats in eternal cycles. This interest was stimulated
mainly by the works of Henri Poincaré (1854–1912). Even a philosopher such as Friedrich
Nietzsche (1844–1900) was tempted by this theorem into a brief excursion into
physics. The first speculations along these lines originated from almost nonscientific
attempts to explain the phenomenon of heat. The Lord of Verulam (1561–1626), Fran-
cis Bacon, had already identified heat as a form of motion but had failed to construct
a quantitative theory. For lack of systematic investigations he had included the de-
velopment of heat in dung hills in his considerations. The topic attracted more and
more actors from physics, metaphysics, philosophy, politics and theology. We repro-
duce here two quotations from Poincaré’s works from this era. Henri Poincaré in 1893
wrote in “Review of metaphysics and morality”:
Everybody knows the mechanistic world view that tempted so many good people, and the vari-
ous forms in which it comes up. Some people imagine the material world as being composed of
atoms which move along straight lines because of their inertia, and change their velocity only
if two atoms collide. Other people assume that the atoms perform an attraction or repulsion
on each other, which depends on their distance. The following considerations will meet both
points of view.
It would possibly be appropriate to dispute here the metaphysical difficulties that are related
to these opinions, but I don’t have the necessary expert knowledge. Therefore I will deal here
only with the difficulties the mechanists met when they tried to reconcile their system with the
experimental facts, and with the efforts they made to overcome or to elude these difficulties.
According to the mechanistic hypothesis all phenomena must be reversible; the stars for
example could move along their orbits also in the opposite sense, without conflicting with
Newton’s laws. Reversibility is a consequence of all mechanistic hypotheses.
A theorem that can easily be proved tells us that a restricted world, which is governed only
by the laws of mechanics, will pass again and again a state that is very close to its initial state.
On the other hand, according to the assumed experimental laws the universe tends towards a
certain final state which it never will leave. In this final state which represents some kind of
death, all bodies will be at rest at the same temperature.
The doubts provoked this way, accompanying the developing theory of heat based
upon an irreversible motion of atomic particles, have not yet been clearly removed.
A classical illustrative example of Poincaré’s “recurrence objection” is the parti-
tioned box, one half filled with gas which uniformly distributes over the entire box
after removal of the membrane. Experience tells us what happens, and an inversion of
this “irreversible process” is never observed in practice. But Poincaré did not think at
all of an inversion, but rather of the chance that brought the particles into the ini-
tially empty half. This chance—after some “appropriate time”—should also bring
them back again to the initial half.
In 1955, Enrico Fermi,4 John Pasta,5 and Stanislaw Ulam6 considered a problem
which corresponds to our Example 10.1, except for the additional inclusion of a non-
linear coupling term. Their interest focused on finding, by means of the first com-
puters, recurring processes such as we looked for in our purely linear problem. Sur-
prisingly, they found an almost perfect recurrence of the initial conditions after large
numbers of oscillations. The investigations and reflections of such properties of non-
linear wave equations continue to this day and have been introduced into the theory of
elementary particles (solitons).
4 Enrico Fermi, Italian physicist, b. September 29, 1901, Rome–d. November 28, 1954, Chicago.
Following studies in Pisa and research stays in Göttingen, Leiden and Rome, Fermi became a profes-
sor of physics in Rome in 1925. He built up a research group which achieved leading experimental
and theoretical results in nuclear physics. In 1938, Fermi was awarded the Nobel Prize in physics for
his work on radioactive elements created by neutron bombardment and nuclear reactions triggered
by slow neutrons. In the same year he left Italy, worked at Columbia University in New York and
finally became professor in Chicago. During the war, Fermi was involved in the Manhattan Project to
produce the nuclear bomb, and was the driving force in the development of the first nuclear reactor in
1942. Fermi did seminal work in both experimental and theoretical physics, contributing to statistical
mechanics, the general theory of relativity, and the theory of weak interactions.
5 John Pasta, American physicist and computer scientist, b. 1918, New York–d. 1984, Chicago.
Following different employments with police and the military, Pasta obtained a Ph.D. in theoretical
physics and joined the Los Alamos National Laboratory, where he worked in the group of Nicholas
Metropolis, constructing the MANIAC I computer. Later he became expert for computing with the
Atomic Energy Commission, and in 1964 professor for computer science at the University of Illinois
in Urbana-Champaign.
6 Stanislaw Marcin Ulam, Polish mathematician, b. April 13, 1909, Lemberg (today Lviv, Ukraine)–
d. May 13, 1984, in Santa Fé, New Mexico. Ulam was a student of mathematics with Stefan Banach
and contributed to measure theory, topology, and ergodic theory. In 1938 he came to the US as a Har-
vard Junior Fellow, and later joined the Manhattan project, on the intervention of John von Neumann.
Together with von Neumann, he developed the Monte Carlo method to solve numerical problems
using random numbers. Ulam suggested the functional principle of the first hydrogen bomb.
Part IV
Mechanics of Rigid Bodies
11 Rotation About a Fixed Axis
\[ \mathbf{D} = \mathbf{l} \times \mathbf{F}. \]
Fig. 11.1. A couple causes a torque
While the torque on a mass point is always related to a fixed point, the torque of a
couple is completely free and can be shifted in space.
The forces acting on a rigid body can always be replaced by a total force acting on
an arbitrary point, and a couple. This can easily be shown by the following example:
At the point P1 , the force F1 acts. Nothing is changed if we let the forces −F1 and
F1 act at O . The force F1 acting on P1 and the force −F1 acting on O represent a
couple, and there remains the force F1 acting on O .
If there are several forces acting, we combine them into the resultant force
$\mathbf{F} = \sum_i \mathbf{F}_i$. The torque is then given by $\mathbf{D} = \sum_i \mathbf{r}_i \times \mathbf{F}_i$.
An extended body is in equilibrium if both the total force and the total torque
vanish:
\[ \sum_i \mathbf{F}_i = 0 \]
and
\[ \sum_i \mathbf{r}_i \times \mathbf{F}_i = 0, \]
i.e., the condition that the sum of all forces and the sum of all torques must vanish.
A rigid body rotates about a rotation axis z fixed in space. By substituting
$v_i = \omega \cdot r_i$ for the velocity in the kinetic energy, one obtains
\[ T = \sum_i \frac{1}{2} m_i v_i^2 = \frac{1}{2}\,\omega^2 \sum_i m_i r_i^2 = \frac{1}{2}\,\Theta\,\omega^2. \]
Here, $r_i$ is the distance of the ith mass element from the z-axis.
The sum appearing in both relations is called the moment of inertia with respect to
the rotation axis. One has
\[ \Theta = \sum_i m_i r_i^2. \]
For a spatially extended, not axially symmetric rigid body which rotates about the
z-axis, there can also appear components of the angular momentum perpendicular to
the z-axis:
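\[
\begin{aligned}
\mathbf{L} &= \sum_\nu m_\nu\, \mathbf{r}_\nu \times \mathbf{v}_\nu = \sum_\nu m_\nu\, \mathbf{r}_\nu \times (\boldsymbol{\omega} \times \mathbf{r}_\nu)\\
&= \sum_\nu m_\nu\, \omega\, (x_\nu, y_\nu, z_\nu) \times (-y_\nu, x_\nu, 0)\\
&= \omega \sum_\nu (-x_\nu z_\nu,\; -y_\nu z_\nu,\; x_\nu^2 + y_\nu^2)\, m_\nu.
\end{aligned}
\]
Since the body is supported in such a way that the rotation axis is constantly fixed,
torques (bearing moments) $\mathbf{D} = \dot{\mathbf{L}}$ appear in the bearings. They can be compensated by
“balancing,” i.e., by attaching additional masses so that the deviation moments
\[ -\sum_\nu x_\nu z_\nu m_\nu \quad \text{and} \quad -\sum_\nu y_\nu z_\nu m_\nu \]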
vanish.
EXAMPLE
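\[ \Theta = \int_{\text{cylinder}} r^2\, dm = \varrho \int_0^{2\pi} d\varphi \int_0^{h} dz \int_0^{R} r^3\, dr; \]
integration over the angle and the z-coordinate yields
\[ \Theta = 2\pi h \varrho \int_0^{R} r^3\, dr. \]
\[ \Theta = \frac{\pi}{2}\, \varrho\, h R^4 = \varrho\, \pi R^2 h\, \frac{R^2}{2} = \frac{1}{2} M R^2. \]
Fig. 11.4. A homogeneous cylinder rotates about its axis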
164 11 Rotation About a Fixed Axis
Steiner’s Theorem1
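If the moment of inertia $\Theta_s$ with respect to an axis through the center of gravity S of
a rigid body is known, the moment of inertia Θ for an arbitrary parallel axis with the
distance b from the center of gravity is given by the relation
\[ \Theta = \Theta_s + M b^2. \]
If AB is the axis through the center of gravity and A′B′ the parallel one with the unit
vector $\mathbf{e}$ along the axis, this can be shown as follows:
\[ \Theta_{AB} = \sum_\nu m_\nu (\mathbf{r}_\nu \times \mathbf{e})^2, \qquad \Theta_{A'B'} = \sum_\nu m_\nu (\mathbf{r}'_\nu \times \mathbf{e})^2. \]
The relation between $\mathbf{r}_\nu$ and $\mathbf{r}'_\nu$ is given by Fig. 11.5. Obviously $\mathbf{r}'_\nu = -\mathbf{b} + \mathbf{r}_\nu$, and
therefore,
\[
\begin{aligned}
\Theta_{A'B'} &= \sum_\nu m_\nu \bigl((-\mathbf{b} + \mathbf{r}_\nu) \times \mathbf{e}\bigr)^2\\
&= \sum_\nu m_\nu \bigl[(-\mathbf{b} \times \mathbf{e}) + (\mathbf{r}_\nu \times \mathbf{e})\bigr]^2\\
&= \sum_\nu m_\nu (-\mathbf{b} \times \mathbf{e})^2 + 2 \sum_\nu m_\nu (-\mathbf{b} \times \mathbf{e}) \cdot (\mathbf{r}_\nu \times \mathbf{e}) + \sum_\nu m_\nu (\mathbf{r}_\nu \times \mathbf{e})^2\\
&= M b^2 + \Theta_{AB}.
\end{aligned}
\]
The mixed term vanishes because $\sum_\nu m_\nu \mathbf{r}_\nu = 0$ when the $\mathbf{r}_\nu$ are measured from the center
of gravity, and $(\mathbf{b} \times \mathbf{e})^2 = b^2$ because $\mathbf{b}$ is perpendicular to $\mathbf{e}$.
Fig. 11.5. On Steiner's theorem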
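Steiner's theorem can be illustrated with a handful of point masses. The sketch below evaluates $\sum_\nu m_\nu (\mathbf{r}_\nu \times \mathbf{e})^2$ once for an axis through the center of gravity and once for a parallel axis shifted by a vector $\mathbf{b}$ perpendicular to $\mathbf{e}$. The masses, positions and function names are arbitrary illustrative choices.

```python
def cross(u, v):
    """Vector product of two 3-component tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def center_of_mass(masses, positions):
    total = sum(masses)
    return tuple(sum(m * p[i] for m, p in zip(masses, positions)) / total for i in range(3))

def moment_of_inertia(masses, positions, origin, axis):
    """Sum of m (r x e)^2 with r measured from `origin`; `axis` must be a unit vector."""
    theta = 0.0
    for m, p in zip(masses, positions):
        r = tuple(p[i] - origin[i] for i in range(3))
        c = cross(r, axis)
        theta += m * sum(x * x for x in c)
    return theta

if __name__ == "__main__":
    masses = [1.0, 2.5, 0.7, 3.2]
    positions = [(0.1, 0.4, -0.3), (1.2, -0.5, 0.8), (-0.9, 0.3, 0.2), (0.4, 1.1, -0.6)]
    e = (0.0, 0.0, 1.0)                       # direction of both axes
    s = center_of_mass(masses, positions)
    b = (0.7, -0.4, 0.0)                      # shift, perpendicular to e
    shifted = tuple(s[i] + b[i] for i in range(3))
    theta_s = moment_of_inertia(masses, positions, s, e)
    theta_shifted = moment_of_inertia(masses, positions, shifted, e)
    print(theta_shifted, theta_s + sum(masses) * sum(x * x for x in b))
```

The two printed numbers should coincide up to rounding.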
1 Jacob Steiner, b. March 18, 1796, Utzenstorf–d. April 1, 1863, Bern. Steiner was son of a peasant
and grew up without education. He received his first education from Pestalozzi in Yverdon. Subse-
quently Steiner studied in Heidelberg, and then he served as a teacher of mathematics in Berlin; in
1834, he became an associate professor at the university there. Steiner is considered the founder of
synthetic geometry, which was systematically developed by him. He worked on geometric construc-
tions and isoperimetric problems. A peculiar feature of his work is that he almost completely avoided
analytic and algebraic methods in geometric investigations.
11.1 Moment of Inertia 165
i.e.,
EXAMPLE
We consider the moment of inertia of a thin rectangular disk of density σ. For the
calculation of the moment of inertia about the x-axis, we take as the mass element
dm = σ a dy. We then obtain
\[ \Theta_{xx} = \int_0^{b} y^2\, \sigma a\, dy = \sigma a\, \frac{b^3}{3} = \frac{1}{3} M b^2. \]
\[ \Theta_{zz} = \frac{1}{3} M (a^2 + b^2). \]
The moment of inertia about a perpendicular axis through the center of gravity
is found, according to Steiner's theorem, from the moment of inertia about the
z-axis:
\[ \Theta_{zz} = \Theta_s + M \left[ \left(\frac{a}{2}\right)^2 + \left(\frac{b}{2}\right)^2 \right] = \Theta_s + M\, \frac{a^2 + b^2}{4}, \]
\[ \Theta_s = \Theta_{zz} - \frac{M}{4}(a^2 + b^2) = M (a^2 + b^2) \left( \frac{1}{3} - \frac{1}{4} \right), \]
\[ \Theta_s = \frac{M}{12}(a^2 + b^2). \]
where $\mathbf{k}$ is a unit vector pointing out of the page in Fig. 11.7, and $|\mathbf{r}| = a$. The angular
velocity is then
\[ \boldsymbol{\omega} = +\mathbf{k}\, \frac{d\varphi}{dt}. \]
\[ -aMg \sin\varphi = \Theta_0\, \frac{d^2\varphi}{dt^2} \quad \text{or} \quad \frac{d^2\varphi}{dt^2} + \frac{aMg}{\Theta_0} \sin\varphi = 0. \]
\[ \frac{d^2\varphi}{dt^2} + \Omega^2 \varphi = 0, \]
with the solution
\[ \varphi = A \sin(\Omega t + \delta). \]
From this, it follows that the period becomes a minimum if the vibration axis is a
distance $a = \sqrt{\Theta_s/M}$ from the center of gravity. From this relation one can
experimentally determine the moment of inertia $\Theta_s$.
11.2 The Physical Pendulum 167
EXERCISE
Problem. Find the moment of inertia of a sphere about an axis through its center.
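The radius of the sphere is a, and the homogeneous density is ϱ.
Solution. We use cylindrical coordinates (r, ϕ, z). The z-axis is the rotation axis. For
the corresponding moment of inertia, we have
\[ \Theta = \varrho \int_{\text{sphere}} r^2\, dV. \]
The center of the sphere is at z = 0. The equation for the spherical surface then reads
\[ x^2 + y^2 + z^2 = a^2 \quad \text{or} \quad r^2 + z^2 = a^2. \]
\[ \Theta = \varrho \int_0^{2\pi} d\varphi \int_{-a}^{a} dz \int_0^{\sqrt{a^2 - z^2}} r^3\, dr \]
or
\[ \Theta = 2\pi\varrho \int_{-a}^{a} \left[ \frac{1}{4} r^4 \right]_0^{\sqrt{a^2 - z^2}} dz = \frac{\pi}{2}\, \varrho \int_{-a}^{a} (a^2 - z^2)^2\, dz. \]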
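The remaining z-integration can be carried out numerically as a check. The sketch below applies Simpson's rule to $(\pi/2)\varrho \int_{-a}^{a} (a^2 - z^2)^2\, dz$ and compares the result with $\frac{2}{5} M a^2$, where $M = \frac{4}{3}\pi a^3 \varrho$. The value 2/5 is the familiar result for a homogeneous sphere and is quoted here as a known fact rather than taken from this passage; the function names and sample values are illustrative.

```python
from math import pi

def simpson(f, lo, hi, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

def sphere_moment_of_inertia(a, rho):
    """(pi/2) rho * integral of (a^2 - z^2)^2 over -a <= z <= a."""
    return 0.5 * pi * rho * simpson(lambda z: (a * a - z * z) ** 2, -a, a)

if __name__ == "__main__":
    a, rho = 2.0, 3.0
    mass = rho * 4.0 / 3.0 * pi * a ** 3
    print(sphere_moment_of_inertia(a, rho), 0.4 * mass * a * a)
```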
EXERCISE
Problem. Calculate the moment of inertia of a homogeneous massive cube about one
of its edges.
Solution. Let ϱ be the density and s the edge length of the cube. A mass element is
then given by
\[ dm = \varrho\, dV = \varrho\, dx\, dy\, dz. \]
\[ \Theta_{AB} = \varrho \int_0^{s}\!\!\int_0^{s}\!\!\int_0^{s} (x^2 + y^2)\, dx\, dy\, dz = \frac{2}{3}\, \varrho\, s^5 = \frac{2}{3} M s^2. \]
EXERCISE
Problem. A cube of edge length s and mass M hangs vertically down from one of its
edges. Find the period for small vibrations about the equilibrium position. How long
is the equivalent thread pendulum?
Solution. The moment of inertia of the cube about AB is (see Exercise 11.4)
\[ \Theta_{AB} = \frac{2}{3} M s^2. \]
The center of gravity is in the center of the cube, i.e., for the distance a of the center
of gravity S from the axis AB we have
\[ a = \frac{1}{2}\, s \sqrt{2}. \]
The equation of motion of the physical pendulum for small angle amplitudes was
\[ \ddot{\varphi} + \frac{Mga}{\Theta_{AB}}\, \varphi = 0 \]
which just defines the equivalence of the pendulums. By insertion of T one obtains
\[ 2\pi \sqrt{\frac{2\sqrt{2}\, s}{3g}} = 2\pi \sqrt{\frac{l}{g}}, \]
or resolved,
\[ l = \frac{2}{3} \sqrt{2}\, s. \]
This equivalent length of the thread pendulum is also called the reduced pendulum
length.
Fig. 11.10. Physical pendulum and reduced pendulum length
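The reduced pendulum length follows from equating the period $2\pi\sqrt{\Theta/(Mga)}$ of the physical pendulum with the period $2\pi\sqrt{l/g}$ of a thread pendulum, which gives $l = \Theta/(Ma)$. A minimal sketch for the cube, with illustrative function names and sample values:

```python
from math import pi, sqrt

def physical_pendulum_period(theta, mass, a, g=9.81):
    """Small-angle period 2 pi sqrt(theta / (M g a)); a = distance axis to center of gravity."""
    return 2.0 * pi * sqrt(theta / (mass * g * a))

def thread_pendulum_period(length, g=9.81):
    """Small-angle period 2 pi sqrt(l / g) of a thread (mathematical) pendulum."""
    return 2.0 * pi * sqrt(length / g)

def reduced_length(theta, mass, a):
    """Length of the thread pendulum with the same period: l = theta / (M a)."""
    return theta / (mass * a)

if __name__ == "__main__":
    s, mass = 0.3, 2.0                       # edge length in m and mass in kg (sample values)
    theta_ab = 2.0 / 3.0 * mass * s * s      # cube about one of its edges
    a = 0.5 * s * sqrt(2.0)                  # distance of the center of gravity from the edge
    l = reduced_length(theta_ab, mass, a)
    print(l, 2.0 / 3.0 * sqrt(2.0) * s)
    print(physical_pendulum_period(theta_ab, mass, a), thread_pendulum_period(l))
```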
EXAMPLE
We consider a cylinder with a horizontal axis that can roll down an inclined plane.
The system has one degree of freedom; hence an energy consideration is sufficient.
The velocity of each point of the cylinder may be thought as being composed of the
velocity v1 due to the translational motion and of the velocity v2ν due to the rotation.
The energy of motion is then given by
\[ \sum_\nu \frac{m_\nu}{2}\, v_\nu^2 = \frac{1}{2}\, v_1^2 \sum_\nu m_\nu + \sum_\nu \frac{m_\nu}{2}\, v_{2\nu}^2 + \mathbf{v}_1 \cdot \sum_\nu m_\nu \mathbf{v}_{2\nu}. \tag{11.1} \]
For a symmetric mass distribution, the last term drops out, and we have
\[ T = \frac{M}{2}\, v_1^2 + \frac{\Theta}{2}\, \dot{\varphi}^2, \tag{11.2} \]
i.e., the energy of motion is additive in translational and rotation energy. For the
cylinder (with symmetric mass distribution) on the inclined plane we have
\[ \frac{M}{2}\, \dot{s}^2 + \frac{\Theta}{2}\, \dot{\varphi}^2 - M g s \sin\alpha = E \tag{11.3} \]
(s measures the distance along the inclined plane). “Rolling off” without gliding
means that the axis always moves just as much as corresponds to the rotation of the
cylinder surface:
\[ \dot{s} = R \dot{\varphi}, \]
where R is the cylinder radius. We thus obtain the equation
\[ \frac{1}{2} \left( M + \frac{\Theta}{R^2} \right) \dot{s}^2 - M g s \sin\alpha = E, \]
and, after differentiation with respect to time,
\[ \ddot{s} = \frac{1}{1 + \Theta/MR^2}\; g \sin\alpha. \tag{11.4} \]
The acceleration of the cylinder rolling off is smaller than that of a gliding mass point.
If the total mass of the cylinder is (approximately) concentrated on the axis, then
\[ \frac{\Theta}{MR^2} = 0, \qquad \ddot{s} = g \sin\alpha, \]
and the acceleration is the same as for a gliding mass point. For a homogeneous cylinder,
we have
\[ \frac{\Theta}{MR^2} = \frac{1}{2}, \qquad \ddot{s} = \frac{2}{3}\, g \sin\alpha. \]
For a hollow cylinder with all mass on the surface, we have
\[ \frac{\Theta}{MR^2} = 1, \qquad \ddot{s} = \frac{1}{2}\, g \sin\alpha; \]
the acceleration is only half of that for a gliding mass point. If we fix a circular disk
concentric onto the cylinder, which extends beyond the base (like a wheel rim over the
rail), then $\Theta/MR^2 > 1$, i.e., the acceleration can be even lower.
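Relation (11.4) depends on the body only through the dimensionless ratio $\Theta/MR^2$. The following sketch tabulates the acceleration for the three cases discussed; the function name, the inclination angle and the value of g are illustrative assumptions.

```python
from math import sin, radians

def rolling_acceleration(theta_over_mr2, alpha, g=9.81):
    """Acceleration along the incline for rolling without gliding, g sin(alpha) / (1 + Theta/(M R^2))."""
    return g * sin(alpha) / (1.0 + theta_over_mr2)

if __name__ == "__main__":
    alpha = radians(30.0)                    # sample inclination
    cases = {
        "mass on the axis": 0.0,
        "homogeneous cylinder": 0.5,
        "hollow cylinder": 1.0,
    }
    for name, ratio in cases.items():
        # The last column is the acceleration relative to a gliding mass point.
        print(name, rolling_acceleration(ratio, alpha),
              rolling_acceleration(ratio, alpha) / rolling_acceleration(0.0, alpha))
```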
An investigation of the force balance lets us elucidate this problem once again from
another point of view. At the point S, gravity acts and performs a torque with respect
to the point A (see Fig. 11.11)
while the constraints do not create a torque. The angular acceleration at the point A is
therefore
\[ \ddot{\varphi} = \dot{\omega} = \frac{D_A}{\Theta_A} = \frac{R M g \sin\alpha}{(3/2) M R^2} = \frac{2}{3}\, \frac{g}{R} \sin\alpha. \tag{11.6} \]
The moment of inertia $\Theta_A$ of a homogeneous cylinder is easily found by means of
Steiner's theorem. Since the moment of inertia with respect to the center of gravity is
$\Theta_s = MR^2/2$, it follows immediately that
\[ \Theta_A = \Theta_s + M R^2 = \frac{3}{2} M R^2. \]
If the cylinder rolls without gliding, for the linear acceleration of the center of
gravity, we find
\[ |\mathbf{a}_s| = |\dot{\boldsymbol{\omega}} \times \mathbf{r}_A| = \dot{\omega} R = \frac{2}{3}\, g \sin\alpha. \tag{11.7} \]
The cylinder gets only 2/3 of the acceleration which it would get when gliding. Equation
(11.8) is found from simple considerations: Since the instantaneous velocity of
the contact point A equals zero, one can consider A as instantaneously at rest. But this
means that the rigid body instantaneously performs a rotation about the contact point
A, with an angular velocity ω. The velocity of an arbitrary point of the body is then
given by (see Fig. 11.11)
\[ \mathbf{v} = \boldsymbol{\omega} \times \mathbf{r}_A. \]
Besides the gravitation force there acts the reaction force N (to balance the normal
component of Mg)
and the friction force Ff . The latter one is calculated from the balance
A cylinder with asymmetrical mass distribution, which under the influence of gravitation
can vibrate by rolling on a horizontal base, is called a rolling pendulum. It
represents a system with one degree of freedom; the position of the rolling pendulum
can be specified by the rotation angle ϕ or by the coordinate x of the cylinder
axis (measured perpendicularly to the axis, see Fig. 11.12). “Rolling off” means
that
\[ \dot{x} = R \dot{\varphi}. \tag{11.13} \]
Fig. 11.12. Rolling pendulum
Since there is only one degree of freedom, the energy law is sufficient for the description.
The motion is composed of a translational and a rotational motion. When
applying (11.1), we have to account for the asymmetrical mass distribution. The expression
$\sum_\nu m_\nu \mathbf{v}_{2\nu}$, as a momentum due to the rotational motion, can be calculated
by assuming that the total mass M is concentrated at the center of gravity, which is
located off the axis by the distance s: $|s\dot{\varphi}|$ is then the velocity $|\mathbf{v}_{2\nu}|$ of this mass on
rotation, and π − ϕ is the angle between $\mathbf{v}_1$ and $\mathbf{v}_{2\nu}$. According to (11.1), we then
have
\[ T = \frac{M}{2}\, \dot{x}^2 + \frac{\Theta}{2}\, \dot{\varphi}^2 - \dot{x} \cdot M s \dot{\varphi} \cos\varphi, \]
where Θ is the moment of inertia about the cylinder axis. With the condition (11.13)
for rolling follows
\[ T = \frac{1}{2} (M R^2 + \Theta - 2 M R s \cos\varphi)\, \dot{\varphi}^2. \tag{11.14} \]
\[ \Theta = \Theta_s + M s^2. \]
\[ T = \frac{1}{2} \left[ M (R^2 + s^2 - 2 R s \cos\varphi) + \Theta_s \right] \dot{\varphi}^2 \]
or
\[ T = \frac{1}{2} (M r^2 + \Theta_s)\, \dot{\varphi}^2, \]
where r is the distance of the center of gravity from the contact line of the cylinder
with the base. According to Steiner's theorem,
\[ \Theta_u = M r^2 + \Theta_s \]
is the moment of inertia about the contact line which changes with time, and (11.14)
takes the form
\[ T = \frac{\Theta_u}{2}\, \dot{\varphi}^2. \]
\[ T = \frac{1}{2} (B - 2 M R s \cos\varphi)\, \dot{\varphi}^2 \]
\[ \frac{1}{2} (B - 2 M R s \cos\varphi)\, \dot{\varphi}^2 + M g s (1 - \cos\varphi) = E. \tag{11.15} \]
The equation differs from that for the physical pendulum (compare the section on
the physical pendulum). For small angles ϕ, we obtain
with
\[ \omega^2 = \frac{M g s}{\Theta_u} = \frac{M g s}{M R^2 + \Theta - 2 M R s}. \tag{11.18} \]
In the limit of a symmetrical mass distribution (s = 0), one has ω = 0. If the center of
gravity moves to the cylinder surface (s → R), then
\[ \omega^2 = \frac{M g R}{\Theta_s}. \]
If the mass is limited to a more restricted region, ω becomes very large. If we imagine
that a part of the mass is shifted by an appropriate device to the outside of the rolling
cylinder (see Fig. 11.13) and that s is large compared to R, then the vibration turns
into the vibration of a physical pendulum.
Fig. 11.13. Transition from the rolling pendulum to the physical pendulum
EXAMPLE
Figure 11.14 shows the moments of inertia of (a) a disk, (b) a cylinder, (c) a rec-
tangular plate, (d) a spherical shell, (e) a solid sphere, and (f) a cube about different
selected axes.
Fig. 11.14.
EXERCISE
Problem. A cube with the edge length 2a and mass M glides with constant velocity
v0 on a frictionless plate. At the end of the plate, it bumps against an obstacle and tilts
over the edge (see Fig. 11.15). Find the minimum velocity v0 for which the cube still
falls from the plate!
Fig. 11.15. Cube tilting over
an edge
Solution. We look for the velocity $v_0$ for which the cube can tilt over its edge, as is
represented in (c). If it bumps into the obstacle at the edge of the plate, it is set into
rotation about the axis A. At the time of collision all external forces act along this
axis, and the angular momentum of the cube is conserved. Before hitting the obstacle,
the cube has, due to the translational motion, the angular momentum
\[ L = |\mathbf{r} \times \mathbf{p}| = p \cdot a = M v_0 a. \tag{11.19} \]
Immediately after the collision, the angular momentum appears as rotational motion
of the cube
\[ L = \Theta_A \omega_0 = M v_0 a, \]
or
\[ \omega_0 = \frac{M v_0 a}{\Theta_A}. \tag{11.20} \]
If the cube begins to lift off, the gravitational force causes a torque about the axis A
that counteracts the lifting process.
For the kinetic energy of the cube immediately after the collision, one has, for given
$\omega_0$,
\[ T_0 = \frac{1}{2}\, \Theta_A \omega_0^2 = \frac{1}{2}\, \frac{M^2 v_0^2 a^2}{\Theta_A}. \tag{11.21} \]
The potential energy difference between position a and position c is
\[ \Delta V = M (h_2 - h_1)\, g = M (\sqrt{2}\, a - a)\, g = M a g (\sqrt{2} - 1), \tag{11.22} \]
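The two energies (11.21) and (11.22) suggest how the minimum velocity may be estimated: the cube can tip over if the kinetic energy right after the collision is at least the potential energy difference. The sketch below works under that assumption and additionally assumes $\Theta_A = \frac{2}{3} M (2a)^2$, i.e., the moment of inertia of a cube of edge length 2a about one of its edges. The function name and sample values are illustrative, and the result is not quoted from the text.

```python
from math import sqrt

def minimum_tilt_speed(a, g=9.81):
    """Smallest v0 with T0 >= Delta V for a cube of edge length 2a (sketch under stated assumptions)."""
    theta_over_m = 2.0 / 3.0 * (2.0 * a) ** 2        # Theta_A / M, assumed cube-about-edge value
    delta_v_over_m = a * g * (sqrt(2.0) - 1.0)       # (11.22) divided by M
    # T0 = M^2 v0^2 a^2 / (2 Theta_A) >= Delta V  =>  v0^2 >= 2 Theta_A Delta V / (M^2 a^2)
    return sqrt(2.0 * theta_over_m * delta_v_over_m) / a

if __name__ == "__main__":
    for a in (0.05, 0.1, 0.5):
        print(a, minimum_tilt_speed(a))
```

With these assumptions the mass M cancels, and the threshold scales as $v_0 \propto \sqrt{a g}$.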
EXERCISE
Problem. A thin bar of length l and mass M lies on a frictionless plate (the x, y-plane
in Fig. 11.16). A hockey puck of mass m and velocity v knocks the bar elastically
under 90◦ at the distance d from the center of gravity. After the collision the puck is
at rest.
(a) Determine the motion of the bar.
(b) Calculate the ratio m/M, accounting for the fact that the puck is at rest.
Solution. (a) Since the collision is elastic, momentum and energy conservation hold,
where momentum conservation refers both to linear and angular momentum. The bar
acquires both a translational and a rotational motion from the collision with the puck.
Conservation of the linear momentum immediately leads to
\[ M v_s = m v, \qquad v_s = \frac{m}{M}\, v, \tag{11.28} \]
and conservation of the angular momentum about the center of gravity to
\[ L_s = \Theta_s\, \omega = m v d = D, \tag{11.29} \]
and for the angular velocity of the bar relative to the center of gravity,
\[ \omega = \frac{m v d}{M l^2/12}, \tag{11.30} \]
where
\[ \Theta_s = \varrho \int_{-l/2}^{l/2} r^2\, dV = \frac{1}{12} M l^2. \]
Thus, the center of gravity of the bar moves uniformly with $v_s$ along the y-axis, while
the bar rotates with the angular velocity ω about the center of gravity. Figure 11.17
illustrates several stages of the motion.
(b) The kinetic energy of the bar can be determined by means of the energy conservation
law. Before the collision, the kinetic energy of the puck is
\[ T = \frac{1}{2}\, m v^2, \tag{11.31} \]
while the kinetic energy after the collision consists of two components:
\[ T_t = \frac{1}{2}\, M v_s^2 \quad \text{“translation energy of the center of gravity”} \]
and
\[ T_r = \frac{1}{2}\, \Theta_s \omega^2 \quad \text{“rotation energy about the center of gravity.”} \]
Fig. 11.17. The motion of the bar
Since the potential energy remains unchanged, it immediately follows that
\[ T = \frac{1}{2}\, m v^2 = T_t + T_r = \frac{1}{2} (M v_s^2 + \Theta_s \omega^2) \]
or
\[ m v^2 = M v_s^2 + \frac{M l^2}{12}\, \omega^2. \tag{11.32} \]
Insertion of (11.28) and (11.30) into (11.32) finally yields
\[ m v^2 = \frac{m^2 v^2}{M} + \frac{m^2 v^2 d^2 (12)^2}{M^2 l^4}\, \frac{M l^2}{12}, \]
\[ 1 = \frac{m}{M} + 12\, \frac{m}{M}\, \frac{d^2}{l^2}, \]
or
\[ \frac{m}{M} = \frac{1}{1 + 12 (d/l)^2} \]
for the mass ratio. If the puck hits the bar at the center of gravity, d = 0, no rotation
appears. In order to make the collision elastic, m = M must be satisfied.
If the puck hits the bar at the point d = l/2, the collision is elastic only if M = 4m.
In this case the rotation velocity is $\omega = 6mv/Ml = 6v_s/l$.
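The mass ratio can be verified by checking the energy balance numerically. The sketch below uses the momentum relation $v_s = mv/M$ (the content of the linear momentum conservation referred to as (11.28)) together with (11.30), and confirms that the kinetic energy is conserved exactly when $m/M = 1/(1 + 12(d/l)^2)$. The function names and the sample numbers are illustrative.

```python
def bar_after_collision(m, M, l, d, v):
    """Return (v_s, omega) of the bar if the puck of mass m and speed v comes to rest."""
    v_s = m * v / M                          # linear momentum conservation
    omega = m * v * d / (M * l * l / 12.0)   # angular momentum about the center of gravity
    return v_s, omega

def mass_ratio_for_puck_at_rest(d, l):
    """m/M for which the elastic collision leaves the puck at rest."""
    return 1.0 / (1.0 + 12.0 * (d / l) ** 2)

if __name__ == "__main__":
    M, l, v = 2.0, 1.0, 1.5
    for d in (0.0, 0.25, 0.5):
        m = M * mass_ratio_for_puck_at_rest(d, l)
        v_s, omega = bar_after_collision(m, M, l, d, v)
        before = 0.5 * m * v * v
        after = 0.5 * M * v_s ** 2 + 0.5 * (M * l * l / 12.0) * omega ** 2
        print(d, m / M, before, after)
```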
EXERCISE
Problem. A billiard ball of mass M and radius R is pushed by a cue so that the center
of gravity of the ball gets the velocity v0 . The momentum direction passes through the
center of gravity. The friction coefficient between table and ball is μ. How far does
the ball move before the initial gliding motion changes to a pure rolling motion?
Solution. Since the momentum direction passes through the center of gravity, the
angular momentum with respect to the center of gravity at the time t = 0 equals zero.
The friction force f points opposite to the direction of motion (see Fig. 11.18) and
causes a torque about the center of gravity
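$$ D_s = f \cdot R = \mu M g R. \tag{11.33} $$
From this the angular acceleration follows as
$$ \dot\omega = \frac{\mu M g R}{\Theta_s} = \frac{\mu M g R}{(2/5)MR^2} = \frac{5}{2}\,\frac{\mu g}{R}. \tag{11.34} $$
Moreover, the friction force causes a deceleration of the center of gravity, i.e.,
$$ M a_s = -f \quad\text{or}\quad a_s = -\frac{f}{M} = -\frac{\mu g M}{M} = -\mu g, \tag{11.35} $$
where $a_s$ is the acceleration of the center of gravity.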
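For the rotation velocity of the ball, one gets from (11.34), after performing the
integration,
$$ \omega = \int_0^t \dot\omega\,dt = \frac{5}{2}\,\frac{\mu g}{R}\,t. \tag{11.36} $$
The linear velocity of the center of gravity follows from (11.35), again after
integration, as
$$ v_s = \int a_s\,dt = v_0 - \mu g t. \tag{11.37} $$
The ball begins to roll when $v_s = \omega R$, i.e., when
$$ \frac{5}{2}\,\mu\,\frac{g}{R}\,tR = v_0 - g\mu t \tag{11.38} $$
or when
$$ v_0 = \frac{7}{2}\,\mu g t \quad\text{and}\quad t = \frac{2}{7}\,\frac{v_0}{\mu g}. \tag{11.39} $$
The distance covered up to this time is
$$ s = \int_0^t v_s\,dt = v_0 t - \frac{\mu g t^2}{2}, \tag{11.40} $$
and with (11.39),
$$ s = \frac{2}{7}\,\frac{v_0^2}{\mu g} - \frac{v_0^2}{2\mu g}\left(\frac{2}{7}\right)^2 = \frac{12}{49}\,\frac{v_0^2}{\mu g}. \tag{11.41} $$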
If the ball is struck at a distance h above the center of gravity, besides the linear
motion there appears a rotational motion with the angular velocity
$$ \omega = \frac{M v_0 h}{\Theta_s} = \frac{5}{2}\,\frac{v_0 h}{R^2}. \tag{11.42} $$
If h = (2/5)R, the rolling motion of the ball starts immediately. For h < (2/5)R, one
has ω < v0/R, and for h > (2/5)R correspondingly ω > v0/R; in the second case the
friction force points forward.
Figure 11.19 shows the change of vs and ωR as a function of time for h = 0. If
vs = ωR, the rolling motion begins, the friction vanishes, and then vs and ω remain
constant.
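The transition from gliding to rolling for h = 0 can be evaluated directly. The sketch below assumes g = 9.81 m/s² and illustrative values of v0 and μ; the function name `slip_to_roll` is hypothetical. It returns the time (11.39), the distance (11.40), and the two velocities $v_s$ and $\omega R$, which must coincide at that moment.

```python
def slip_to_roll(v0, mu, g=9.81):
    """Time and distance until a ball struck through its center starts pure rolling."""
    t = 2.0 * v0 / (7.0 * mu * g)        # Eq. (11.39)
    s = v0 * t - 0.5 * mu * g * t ** 2   # Eq. (11.40)
    v_s = v0 - mu * g * t                # Eq. (11.37), center-of-gravity velocity
    omega_R = 2.5 * mu * g * t           # R * omega from Eq. (11.36)
    return t, s, v_s, omega_R


# Illustrative values (assumptions): v0 = 2 m/s, friction coefficient 0.2.
t, s, v_s, omega_R = slip_to_roll(2.0, 0.2)
print(t, s, v_s, omega_R)
```

Both velocities come out as (5/7) v0, independent of μ and g; only the time and the distance depend on the friction coefficient.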
Fig. 11.19.
EXERCISE
Problem. A bar of length 2l and mass M is fixed at point A, so that it can rotate only
in the vertical plane (see Fig. 11.20). The external force F acts on the center of gravity.
Calculate the reaction force Fr at the point A!
Solution. In order to determine $F_r$, one calculates the torque $D_s$ with respect to the
center of gravity of the bar, caused by $F_r$.
The torque with respect to the fixed point A is
$$ D_A = -Fl = \Theta_A \dot\omega, \tag{11.43} $$
since the constraints do not contribute to $D_A$. The angular acceleration $\dot\omega$ of the bar
then follows from (11.43):
$$ \dot\omega = \frac{D_A}{\Theta_A} = -\frac{Fl}{\Theta_A}, \tag{11.44} $$
where $\Theta_A$ is the moment of inertia of the bar with respect to A. Since the moment of
inertia $\Theta_s$ with respect to the center of gravity S is easily calculated as
$$ \Theta_s = \varrho\int_{-l}^{l} r^2\,dV = \frac{1}{3}Ml^2, \tag{11.45} $$
the parallel-axis theorem (Steiner's theorem) yields
$$ \Theta_A = \Theta_s + Ml^2 = \frac{1}{3}Ml^2 + Ml^2 = \frac{4}{3}Ml^2. \tag{11.46} $$
Equation (11.46) inserted into (11.44) leads to
$$ \dot\omega = -\frac{Fl}{\Theta_A} = -\frac{3}{4}\,\frac{F}{Ml}. \tag{11.47} $$
Since (11.47) must be correct, independent of the point from which the torque is being
calculated, from the knowledge of the torque with respect to the center of gravity S,
$$ D_s = -F_r l, \tag{11.48} $$
$$ \dot\omega = \frac{D_s}{\Theta_s} = -\frac{3F_r l}{Ml^2} = -\frac{3F_r}{Ml}, \tag{11.49} $$
one can calculate the reaction force $F_r$ by comparing (11.47) and (11.49):
$$ -\frac{3}{4}\,\frac{F}{Ml} = -\frac{3F_r}{Ml} \quad\Rightarrow\quad F_r = \frac{1}{4}F. $$
EXERCISE
Problem.
(a) Find the moment of inertia of a thin homogeneous bar of length L with respect to
an axis perpendicular to the bar.
(b) A homogeneous bar of length L and mass m is supported at the ends by identical
springs (spring constant k). The bar is moved at one end by a small displacement
a and then released.
Solve the equation of motion and determine the normal frequencies and normal
vibrations. Sketch the normal vibrations.
Fig. 11.21. A bar is supported
by two identical springs
Solution. (a) If the bar is divided into small segments of length dx with the cross
section f, we have elementary volumes dV = f dx. Let ϱ be the constant density of
the bar; then we have
$$ \Theta_A = \int_0^L \varrho\,x^2\,(f\,dx) = \varrho f\int_0^L x^2\,dx = \frac{1}{3}\,\varrho f L^3. $$
Fig. 11.22.
or
Fig. 11.23.
x = A cos(ω1 t + B) + b
and
ϑ = C cos(ω2 t + D)
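with
$$ \omega_1 = \sqrt{\frac{2k}{m}} \quad\text{and}\quad \omega_2 = \sqrt{\frac{6k}{m}}. $$
The initial conditions at the time t = 0 are
$$ x = b - \frac{a}{2}, \qquad \vartheta = \frac{a}{L}, \qquad \dot x = 0, \qquad \dot\vartheta = 0. $$
Thus follows
$$ \left.\begin{aligned}
b - \frac{a}{2} &= A\cos(B) + b,\\
0 &= -A\omega_1\sin(B),\\
\frac{a}{L} &= C\cos(D),\\
0 &= -C\omega_2\sin(D),
\end{aligned}\right\}
\quad\Rightarrow\quad B = D = 0, \qquad A = -\frac{a}{2}, \qquad C = \frac{a}{L}, $$
and, hence,
$$ x = b - \frac{a}{2}\cos\left(\sqrt{\frac{2k}{m}}\,t\right), \qquad \vartheta = \frac{a}{L}\cos\left(\sqrt{\frac{6k}{m}}\,t\right). $$
The normal modes are
$$ X_1 = x_1 + x_2 = 2b - a\cos\left(\sqrt{\frac{2k}{m}}\,t\right), $$
$$ X_2 = x_1 - x_2 = -a\cos\left(\sqrt{\frac{6k}{m}}\,t\right). $$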
The general motion of a rigid body can be described as a translation and a rotation
about a point of the body. This is just the content of Chasles’ theorem, discussed at the
beginning of Chap. 4. If the origin of the body-fixed coordinate system is set at the center
of gravity of the body, one can separate the center-of-mass motion and the rotation
in all practical cases (compare Chap. 6, equations (6.4)–(6.8)). For this reason, the
rotation of a rigid body about a fixed point is of particular significance.
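We first consider the angular momentum of a rigid body that rotates with angular
velocity ω about the fixed point 0 (see Fig. 12.1):
$$ \mathbf{L} = \sum_\nu m_\nu(\mathbf{r}_\nu \times \mathbf{v}_\nu)
 = \sum_\nu m_\nu\,\mathbf{r}_\nu \times (\boldsymbol\omega \times \mathbf{r}_\nu)
 = \sum_\nu m_\nu\left[\boldsymbol\omega\,r_\nu^2 - \mathbf{r}_\nu(\mathbf{r}_\nu\cdot\boldsymbol\omega)\right]; $$
Fig. 12.1. A rigid body rotates with ω about the fixed point 0
the latter relation holds according to the expansion rule. We decompose $\mathbf{r}_\nu$ and ω into
components and insert
$$ \mathbf{L} = \sum_\nu m_\nu\left[(x_\nu^2 + y_\nu^2 + z_\nu^2)(\omega_x, \omega_y, \omega_z) - (x_\nu\omega_x + y_\nu\omega_y + z_\nu\omega_z)(x_\nu, y_\nu, z_\nu)\right]. $$
Ordering by components, we get
$$ \begin{aligned}
\mathbf{L} = \sum_\nu m_\nu\Big[&\left((x_\nu^2 + y_\nu^2 + z_\nu^2)\omega_x - x_\nu^2\omega_x - x_\nu y_\nu\omega_y - x_\nu z_\nu\omega_z\right)\mathbf{e}_x\\
 +&\left((x_\nu^2 + y_\nu^2 + z_\nu^2)\omega_y - y_\nu^2\omega_y - x_\nu y_\nu\omega_x - z_\nu y_\nu\omega_z\right)\mathbf{e}_y\\
 +&\left((x_\nu^2 + y_\nu^2 + z_\nu^2)\omega_z - z_\nu^2\omega_z - x_\nu z_\nu\omega_x - y_\nu z_\nu\omega_y\right)\mathbf{e}_z\Big].
\end{aligned} $$
With the abbreviations $\Theta_{xx} = \sum_\nu m_\nu(y_\nu^2 + z_\nu^2)$, $\Theta_{xy} = -\sum_\nu m_\nu x_\nu y_\nu$, and correspondingly
for the other components, this can be written as
$$ L_\mu = \sum_\nu \Theta_{\mu\nu}\,\omega_\nu, $$
or
$$ \mathbf{L} = \hat\Theta\cdot\boldsymbol\omega. $$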
The elements in the main diagonal are called moments of inertia, the remaining
ones are called deviation moments. The matrix is symmetric, i.e., $\Theta_{\nu\mu} = \Theta_{\mu\nu}$. Thus
the tensor of inertia has 6 independent components. If the mass is continuously
distributed, one changes from summation to integration for calculating the matrix elements.
For example,
$$ \Theta_{xy} = -\int_V \varrho(\mathbf{r})\,xy\,dV, $$
$$ \Theta_{xx} = \int_V \varrho(\mathbf{r})\,(y^2 + z^2)\,dV, $$
orientation of the axes is changed. The tensor of inertia is usually understood as the
tensor in a coordinate system with the origin in the center of gravity (center-of-mass
system). The corresponding principal moments of inertia (see Fig. 12.3) are corre-
spondingly the moments of inertia.
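As an illustration of the definitions above, the following sketch builds the tensor of inertia for a few point masses and checks that $\hat\Theta\cdot\boldsymbol\omega$ agrees with the direct sum $\sum_\nu m_\nu\,\mathbf{r}_\nu\times(\boldsymbol\omega\times\mathbf{r}_\nu)$. The masses, positions, and angular velocity are arbitrary assumptions chosen for the example, and the function names are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def inertia_tensor(masses, positions):
    """Theta_ij = sum_nu m_nu (r_nu^2 delta_ij - r_nu,i r_nu,j) about the origin."""
    theta = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, positions):
        r2 = sum(c * c for c in r)
        for i in range(3):
            for j in range(3):
                delta = 1.0 if i == j else 0.0
                theta[i][j] += m * (r2 * delta - r[i] * r[j])
    return theta


def matvec(mat, vec):
    """Product of a 3x3 matrix with a 3-vector."""
    return tuple(sum(mat[i][j] * vec[j] for j in range(3)) for i in range(3))


def angular_momentum_direct(masses, positions, omega):
    """L = sum_nu m_nu r_nu x (omega x r_nu), evaluated particle by particle."""
    L = [0.0, 0.0, 0.0]
    for m, r in zip(masses, positions):
        term = cross(r, cross(omega, r))
        for i in range(3):
            L[i] += m * term[i]
    return tuple(L)


# Arbitrary example data (assumptions, not from the text).
masses = [1.0, 2.0, 3.0]
positions = [(1.0, 0.0, 0.5), (0.0, 2.0, -1.0), (-1.0, 1.0, 1.0)]
omega = (0.3, -0.2, 0.5)

theta = inertia_tensor(masses, positions)
L_tensor = matvec(theta, omega)
L_direct = angular_momentum_direct(masses, positions, omega)
print(L_tensor, L_direct)
```

The diagonal entries of `theta` are the moments of inertia, the off-diagonal entries the deviation moments; the symmetry of the matrix leaves six independent components.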
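We decompose the motion of the rigid body into the translation of a point and the
rotation about this point, so that $\mathbf{v}_\nu = \mathbf{V} + \boldsymbol\omega\times\mathbf{r}_\nu$, and we obtain
$$ T = \frac{1}{2}\sum_\nu m_\nu(\mathbf{V} + \boldsymbol\omega\times\mathbf{r}_\nu)^2
 = \frac{1}{2}MV^2 + \mathbf{V}\cdot\left(\boldsymbol\omega\times\sum_\nu m_\nu\mathbf{r}_\nu\right) + \frac{1}{2}\sum_\nu m_\nu(\boldsymbol\omega\times\mathbf{r}_\nu)^2. $$
The first and the last term correspond to pure translational and rotational energy,
respectively. The mixed term can be made to vanish in two different ways.
If one point is fixed, and if we put it at the origin of the body-fixed coordinate
system, then V = 0. Otherwise the origin is put at the center of gravity, so that
$$ \sum_\nu m_\nu\mathbf{r}_\nu = 0. $$
The rotation point is in this case the center of gravity. We now consider the pure
rotation energy
$$ \begin{aligned}
T &= \frac{1}{2}\sum_\nu m_\nu(\boldsymbol\omega\times\mathbf{r}_\nu)\cdot(\boldsymbol\omega\times\mathbf{r}_\nu)
 = \frac{1}{2}\sum_\nu m_\nu\,\boldsymbol\omega\cdot\left(\mathbf{r}_\nu\times(\boldsymbol\omega\times\mathbf{r}_\nu)\right)\\
 &= \frac{1}{2}\,\boldsymbol\omega\cdot\sum_\nu m_\nu(\mathbf{r}_\nu\times\mathbf{v}_\nu)
 = \frac{1}{2}\,\boldsymbol\omega\cdot\sum_\nu \mathbf{r}_\nu\times\mathbf{p}_\nu
 = \frac{1}{2}\,\boldsymbol\omega\cdot\sum_\nu \mathbf{l}_\nu.
\end{aligned} $$
Hence,
$$ T = \frac{1}{2}\,\boldsymbol\omega\cdot\mathbf{L}. $$
We can substitute the angular momentum $L_\mu = \sum_\nu\Theta_{\mu\nu}\omega_\nu$ (μ, ν = 1, 2, 3):
$$ T = \frac{1}{2}\,\boldsymbol\omega\cdot\mathbf{L} = \frac{1}{2}\sum_\mu\omega_\mu\sum_\nu\Theta_{\mu\nu}\omega_\nu = \frac{1}{2}\sum_{\mu,\nu}\Theta_{\mu\nu}\,\omega_\mu\omega_\nu. \tag{12.1a} $$
$$ T = \frac{1}{2}\,\boldsymbol\omega^T\cdot\hat\Theta\cdot\boldsymbol\omega. \tag{12.1c} $$
The vector ω on the right-hand side of the tensor $\hat\Theta$ must be given as a column vector,
and on the left-hand side as a row vector:
$$ T = \frac{1}{2}\,(\omega_x, \omega_y, \omega_z)\;\hat\Theta\begin{pmatrix}\omega_x\\ \omega_y\\ \omega_z\end{pmatrix}. \tag{12.1d} $$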
The elements of the tensor of inertia depend on the position of the origin and on the
orientation of the (body-fixed) coordinate system. It is now possible for a fixed origin
to orient the coordinate system in such a way that the deviation moments vanish. Such
a special coordinate system is called a system of principal axes. The tensor of inertia
then has diagonal form with respect to this system of axes:
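$$ \hat\Theta = \begin{pmatrix}\Theta_1 & 0 & 0\\ 0 & \Theta_2 & 0\\ 0 & 0 & \Theta_3\end{pmatrix} \quad\text{or}\quad \Theta_{\mu\nu} = \Theta_\mu\,\delta_{\mu\nu}. \tag{12.2} $$
For angular momenta and rotation energy in the system of principal axes, we have
the especially simple relations ($\omega_\nu$ are the components of the angular velocity ω with
respect to the principal axes)
$$ L_\mu = \sum_\nu\Theta_{\mu\nu}\omega_\nu = \sum_\nu\Theta_\mu\delta_{\mu\nu}\omega_\nu = \Theta_\mu\omega_\mu, \tag{12.3} $$
$$ T = \frac{1}{2}\,\boldsymbol\omega\cdot\mathbf{L} = \frac{1}{2}\sum_\mu\omega_\mu L_\mu = \frac{1}{2}\sum_\mu\Theta_\mu\omega_\mu^2, \tag{12.4a} $$
or written out,
$$ T = \frac{1}{2}\left(\Theta_1\omega_1^2 + \Theta_2\omega_2^2 + \Theta_3\omega_3^2\right). \tag{12.4b} $$
Because of the tensorial relation $\mathbf{L} = \hat\Theta\cdot\boldsymbol\omega$, the angular momentum and the angular
velocity have in general different orientations.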
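If the body rotates about one of the principal axes of inertia, e.g. about the μ-axis,
$\boldsymbol\omega = \omega\mathbf{e}_\mu$, then according to (12.3) the angular momentum L and the angular
velocity ω have the same orientation. The vector ω then has only one component,
e.g. ω = (0, ω2, 0) if the rotation is about the second principal axis. The same holds
for the angular momentum: L = (0, L2, 0). This property of parallelism between the
angular momentum and the angular velocity allows one to determine the principal axes.
The question is how to choose ω = {ω1, ω2, ω3} (about which axis must the body
rotate) in order to get the angular momentum $\mathbf{L} = \hat\Theta\cdot\boldsymbol\omega$ and the angular velocity
parallel to each other, i.e., $\mathbf{L} = \Theta\boldsymbol\omega$, with Θ a scalar.
From the combination of the relations $\mathbf{L} = \hat\Theta\cdot\boldsymbol\omega$ ($\hat\Theta$ is a tensor) and $\mathbf{L} = \Theta\boldsymbol\omega$ (Θ is a
scalar), we obtain the equation
$$ \mathbf{L} = \hat\Theta\cdot\boldsymbol\omega = \Theta\,\boldsymbol\omega, \tag{12.5} $$
which is an eigenvalue equation. In this equation, the scalar Θ and the related
components ωx, ωy, ωz, i.e., the rotation axis, are unknown. The equation physically states
that the angular momentum L and the rotation velocity ω are parallel to each other.
This is fulfilled for certain directions ω that, as stated above, must be determined.
All values Θ that satisfy (12.5) are called eigenvalues of the tensor $\hat\Theta$; the corresponding
vectors ω ≠ 0 are eigenvectors.
Fig. 12.4. Special case: If ω is parallel to a principal axis, then L is parallel to ω
Equation (12.5) is a shortened notation for the system of equations
$$ \begin{aligned}
\Theta_{xx}\omega_x + \Theta_{xy}\omega_y + \Theta_{xz}\omega_z &= \Theta\,\omega_x,\\
\Theta_{yx}\omega_x + \Theta_{yy}\omega_y + \Theta_{yz}\omega_z &= \Theta\,\omega_y,\\
\Theta_{zx}\omega_x + \Theta_{zy}\omega_y + \Theta_{zz}\omega_z &= \Theta\,\omega_z,
\end{aligned} \tag{12.6} $$
or
$$ \begin{aligned}
(\Theta_{xx} - \Theta)\omega_x + \Theta_{xy}\omega_y + \Theta_{xz}\omega_z &= 0,\\
\Theta_{yx}\omega_x + (\Theta_{yy} - \Theta)\omega_y + \Theta_{yz}\omega_z &= 0,\\
\Theta_{zx}\omega_x + \Theta_{zy}\omega_y + (\Theta_{zz} - \Theta)\omega_z &= 0.
\end{aligned} \tag{12.7} $$
This system of homogeneous linear equations has nontrivial solutions if its determinant
of coefficients vanishes:
$$ \begin{vmatrix}
\Theta_{xx} - \Theta & \Theta_{xy} & \Theta_{xz}\\
\Theta_{yx} & \Theta_{yy} - \Theta & \Theta_{yz}\\
\Theta_{zx} & \Theta_{zy} & \Theta_{zz} - \Theta
\end{vmatrix} = 0. \tag{12.8} $$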
The expansion of the determinant leads to an equation of third order in Θ, the
characteristic equation. Its three roots are the desired principal moments of inertia
(eigenvalues) Θ1, Θ2, and Θ3. By inserting $\Theta_i$ into the system (12.5), one can calculate
the ratio $\omega_x^{(i)} : \omega_y^{(i)} : \omega_z^{(i)}$ of the components of the vector $\boldsymbol\omega^{(i)}$. Thereby the orientation
of the ith principal axis is determined.
Fig. 12.5. General case: The angular momentum L is not parallel to the rotation velocity ω
Since one can find a tensor of inertia for any possible position of the body-fixed
coordinate system, there exists also a system of principal axes at each point of the
body. The orientations of these axes will however not coincide in general.
In principle, it would be possible for the cubic equation (12.8) to have two complex
solutions. We therefore have to prove that a system of real orthogonal principal axes
generally exists.
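In order to apply a shortened summation notation, we number the coordinates
(x = 1, y = 2, z = 3) and denote them by Latin letters. Greek letters are indices for
the three different eigenvalues. In this notation the eigenvalue equation (12.5) for $\Theta_\lambda$ reads
$$ \sum_k\Theta_{ik}\,\omega_k^{(\lambda)} = \Theta_\lambda\,\omega_i^{(\lambda)}. \tag{12.9} $$
We multiply it by the complex conjugate of $\omega_i^{(\mu)}$ and sum over i. This leads to
$$ \sum_{i,k}\Theta_{ik}\,\omega_k^{(\lambda)}\omega_i^{(\mu)*} = \Theta_\lambda\sum_i\omega_i^{(\lambda)}\omega_i^{(\mu)*} = \Theta_\lambda\,\boldsymbol\omega^{(\lambda)}\cdot\boldsymbol\omega^{(\mu)*}. \tag{12.10} $$
In the same way, we form the complex conjugate of the equation corresponding to
(12.9) for $\Theta_\mu$, multiply by $\omega_k^{(\lambda)}$, and sum over k:
$$ \sum_i\Theta_{ki}\,\omega_i^{(\mu)} = \Theta_\mu\,\omega_k^{(\mu)}, \qquad \sum_i\Theta_{ki}^*\,\omega_i^{(\mu)*} = \Theta_\mu^*\,\omega_k^{(\mu)*}, \tag{12.11} $$
$$ \sum_{i,k}\Theta_{ki}^*\,\omega_i^{(\mu)*}\omega_k^{(\lambda)} = \Theta_\mu^*\sum_k\omega_k^{(\mu)*}\omega_k^{(\lambda)} = \Theta_\mu^*\,\boldsymbol\omega^{(\mu)*}\cdot\boldsymbol\omega^{(\lambda)}. \tag{12.12} $$
Now we utilize the property of the tensor of inertia to be real and symmetric. We have
$\Theta_{ik} = \Theta_{ki} = \Theta_{ki}^*$, and the left-hand sides of (12.10) and (12.12) are equal to each
other. We subtract (12.12) from (12.10):
$$ (\Theta_\lambda - \Theta_\mu^*)\,\boldsymbol\omega^{(\lambda)}\cdot\boldsymbol\omega^{(\mu)*} = 0. \tag{12.13} $$
(1) We first consider the case λ = μ: From
$$ (\Theta_\lambda - \Theta_\lambda^*)\,\boldsymbol\omega^{(\lambda)}\cdot\boldsymbol\omega^{(\lambda)*} = 0 $$
follows the relation $\Theta_\lambda = \Theta_\lambda^*$, since the scalar product of two complex conjugate
quantities is positive definite.
We thus proved that $\Theta_\lambda$ is real. Hence, any body always has three real principal
moments of inertia and therefore also three real principal axes $\boldsymbol\omega^{(\lambda)}$. This is of
course physically clear from the outset, since the principal moments of inertia are
nothing else but the moments of inertia about the principal axes, and therefore
they are always real.
(2) We now consider the case λ ≠ μ: Since all $\Theta_\nu$ and therefore also all $\boldsymbol\omega^{(\nu)}$ are real,
(12.13) reads
$$ (\Theta_\lambda - \Theta_\mu)\,\boldsymbol\omega^{(\lambda)}\cdot\boldsymbol\omega^{(\mu)} = 0. \tag{12.15} $$
(a) If $\Theta_\lambda \neq \Theta_\mu$, then $\boldsymbol\omega^{(\lambda)}\cdot\boldsymbol\omega^{(\mu)} = 0$, and therefore, $\boldsymbol\omega^{(\lambda)}$ and $\boldsymbol\omega^{(\mu)}$ are orthogonal.
(b) If, e.g., $\Theta_1 = \Theta_2 = \Theta$, i.e., if two of the three eigenvalues are equal, then
besides $\boldsymbol\omega^{(1)}$ and $\boldsymbol\omega^{(2)}$ all linear combinations of these two vectors are
eigenvectors, too:
$$ \hat\Theta\cdot\boldsymbol\omega^{(1)} = \Theta\,\boldsymbol\omega^{(1)}, \qquad \hat\Theta\cdot\boldsymbol\omega^{(2)} = \Theta\,\boldsymbol\omega^{(2)}
 \quad\Rightarrow\quad \hat\Theta\cdot(\alpha\boldsymbol\omega^{(1)} + \beta\boldsymbol\omega^{(2)}) = \Theta\,(\alpha\boldsymbol\omega^{(1)} + \beta\boldsymbol\omega^{(2)}). $$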
Thus, we can arbitrarily select two orthogonal vectors from the plane spanned
this way and consider them as directions of principal axes. The third principal
axis is by (12.15) fixed orthogonally to the two other axes. If two principal
moments of inertia with respect to the center of gravity as rotation point are
equal, the body is called a symmetric top.
(c) If all three moments of inertia are equal (Θ1 = Θ2 = Θ3), then any arbitrary
orthogonal set of axes is a system of principal axes. If this holds with respect
to the center of gravity, the body is called a spherical top.
If a body has rotational symmetry about one axis, then we are dealing with case (b),
and the rotation axis is a principal axis. For other kinds of symmetries the symmetry
axis also coincides with the principal axis.
EXAMPLE
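We calculate the tensor of inertia and the principal axes of inertia of a homogeneous
square plate of mass M with respect to one of its corners. We put the square in the
x, y-plane of the coordinate system, as is shown in Fig. 12.6. The components of the
tensor of inertia are obtained with z = 0 by integration over the area:
$$ \Theta_{zz} = \sigma\int_{y=0}^{a}\int_{x=0}^{a}(x^2 + y^2)\,dx\,dy = \frac{2}{3}Ma^2. $$
Likewise,
$$ \Theta_{xy} = \Theta_{yx} = -\sigma\int_{y=0}^{a}\int_{x=0}^{a} xy\,dx\,dy = -\frac{1}{4}Ma^2. $$
The remaining deviation moments contain the factor z in the integrand and therefore
vanish:
$$ \Theta_{xz} = \Theta_{zx} = \Theta_{yz} = \Theta_{zy} = 0. $$
Thus, in the selected coordinate system the plate has the following tensor of inertia:
$$ \hat\Theta = \begin{pmatrix}
\frac{1}{3}Ma^2 & -\frac{1}{4}Ma^2 & 0\\
-\frac{1}{4}Ma^2 & \frac{1}{3}Ma^2 & 0\\
0 & 0 & \frac{2}{3}Ma^2
\end{pmatrix}. $$
We now calculate the orientations of the principal axes.
In accordance with the described approach, we first determine the eigenvalues of
the tensor of inertia. We introduce the abbreviation $\Theta_0 = Ma^2$. Then we have the
determinant
$$ \begin{vmatrix}
\frac{1}{3}\Theta_0 - \Theta & -\frac{1}{4}\Theta_0 & 0\\
-\frac{1}{4}\Theta_0 & \frac{1}{3}\Theta_0 - \Theta & 0\\
0 & 0 & \frac{2}{3}\Theta_0 - \Theta
\end{vmatrix} = 0 $$
or
$$ \left(\Theta^2 - \frac{2}{3}\Theta_0\Theta + \frac{7}{144}\Theta_0^2\right)\left(\frac{2}{3}\Theta_0 - \Theta\right) = 0. $$
The roots of this characteristic equation
$$ \Theta_1 = \frac{1}{12}\Theta_0, \qquad \Theta_2 = \frac{7}{12}\Theta_0, \qquad \Theta_3 = \frac{2}{3}\Theta_0 $$
are the principal moments of inertia with respect to the origin.
For the principal moment of inertia $\Theta_\nu$, the orientation of the axis $\boldsymbol\omega^{(\nu)}$ results from
the eigenvalue equation $\hat\Theta\cdot\boldsymbol\omega^{(\nu)} = \Theta_\nu\,\boldsymbol\omega^{(\nu)}$.
Written out for ν = 1,
$$ \begin{pmatrix}
\frac{1}{3}\Theta_0 & -\frac{1}{4}\Theta_0 & 0\\
-\frac{1}{4}\Theta_0 & \frac{1}{3}\Theta_0 & 0\\
0 & 0 & \frac{2}{3}\Theta_0
\end{pmatrix}
\begin{pmatrix}\omega_x^{(1)}\\ \omega_y^{(1)}\\ \omega_z^{(1)}\end{pmatrix}
= \frac{1}{12}\Theta_0\begin{pmatrix}\omega_x^{(1)}\\ \omega_y^{(1)}\\ \omega_z^{(1)}\end{pmatrix}. $$
By multiplying out, we get a vector equation; after splitting into the three
components, we obtain the three equations
$$ \begin{aligned}
\frac{1}{3}\Theta_0\,\omega_x^{(1)} - \frac{1}{4}\Theta_0\,\omega_y^{(1)} &= \frac{1}{12}\Theta_0\,\omega_x^{(1)},\\
-\frac{1}{4}\Theta_0\,\omega_x^{(1)} + \frac{1}{3}\Theta_0\,\omega_y^{(1)} &= \frac{1}{12}\Theta_0\,\omega_y^{(1)},\\
\frac{2}{3}\Theta_0\,\omega_z^{(1)} &= \frac{1}{12}\Theta_0\,\omega_z^{(1)}.
\end{aligned} $$
From this, it follows that
$$ \omega_x^{(1)} = \omega_y^{(1)}, \qquad \omega_z^{(1)} = 0, $$
and thus, the orientation of the first principal axis is
$$ \mathbf{e}_1 = \frac{\boldsymbol\omega^{(1)}}{|\boldsymbol\omega^{(1)}|} = \frac{1}{\sqrt{2}}\begin{pmatrix}1\\ 1\\ 0\end{pmatrix}. $$
Analogously, one finds $\mathbf{e}_2 = (-1, 1, 0)/\sqrt{2}$ and $\mathbf{e}_3 = (0, 0, 1)$.
Evidently, the principal axes are orthogonal to each other, as is demanded by the
general theory. For a rotation about the point 0 around one of the principal axes, the
angular momentum L is parallel to ω, but in general the center of gravity then also
moves. Such a motion can be forced only by the action of a force. Thus it is no free
motion. Force-free rotations (in short: free rotations) take place only about the center of
gravity. The principal moments of inertia and principal axes with respect to the center
of gravity are called the principal moments of inertia and principal axes of the body. In our
example the orientations of the principal axes coincide with those at the point 0.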
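The eigenvalues and principal axes of Example 12.1 can be verified by inserting them into the eigenvalue equation. The sketch below works in units of $\Theta_0 = Ma^2$ (set to 1 as an assumption) and computes the residual of $\hat\Theta\cdot\mathbf{e} = \Theta\,\mathbf{e}$ for each of the three pairs; the helper names are hypothetical.

```python
import math

# Tensor of inertia of the square plate about a corner, in units of Theta_0 = M a^2.
theta = [[1 / 3, -1 / 4, 0.0],
         [-1 / 4, 1 / 3, 0.0],
         [0.0, 0.0, 2 / 3]]

s = 1 / math.sqrt(2)
# (principal moment, principal axis) pairs from the example.
principal = [
    (1 / 12, (s, s, 0.0)),
    (7 / 12, (-s, s, 0.0)),
    (2 / 3, (0.0, 0.0, 1.0)),
]


def matvec(mat, vec):
    """Product of a 3x3 matrix with a 3-vector."""
    return tuple(sum(mat[i][j] * vec[j] for j in range(3)) for i in range(3))


def residual(mat, lam, vec):
    """Largest component of (mat . vec - lam * vec); zero for an exact eigenpair."""
    mv = matvec(mat, vec)
    return max(abs(mv[i] - lam * vec[i]) for i in range(3))


residuals = [residual(theta, lam, e) for lam, e in principal]
# Pairwise scalar products of the three axes (should vanish).
dots = [sum(a * b for a, b in zip(principal[i][1], principal[j][1]))
        for i in range(3) for j in range(i + 1, 3)]
print(residuals, dots)
```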
or
$$ x'_i = \sum_j a_{ij}\,x_j, \tag{12.16} $$
1 See W. Greiner: Classical Mechanics: Point Particles and Relativity, 1st ed., Springer, Berlin
(2004), Chapter 6.
where the components $a_{ij}$ of the rotation matrix $\hat A$ are the direction cosines between
the rotated and the old axes. The inverse of this transformation reads
$$ \mathbf{x} = \hat A^{-1}\,\mathbf{x}' \quad\text{or}\quad x_i = \sum_j a_{ji}\,x'_j. \tag{12.18} $$
The inverse rotation matrix $(a^{-1})_{ij} = (a_{ji})$ is found by exchanging rows and columns
(transposition), since the rotation is an orthogonal transformation which satisfies
$$ \sum_j a_{ij}a_{kj} = \delta_{ik} \quad\text{or}\quad \sum_i a_{ij}a_{ik} = \delta_{jk}. \tag{12.19} $$
We require for the tensor of inertia that a vector equation of the form
$$ L_k = \sum_l\Theta_{kl}\,\omega_l \tag{12.20} $$
Thus, we can determine the transformation behavior of the tensor from the behavior
of the vectors. The vectors L and ω obey the transformation equation (12.18). If we
replace $L_k$ and $\omega_l$ in (12.20) by the primed quantities, we obtain
$$ \sum_l\Theta_{kl}\sum_j a_{jl}\,\omega'_j = \sum_j a_{jk}\,L'_j. $$
This transformation relation is the reason for denoting $\hat\Theta$ as a “tensor.” A tensor of
rank m is generally defined as any quantity which under orthogonal transformations
behaves according to the logical extension of (12.23) (summation over m indices),
e.g., a tensor of third rank
$$ A'_{ijk} = \sum_{i',j',k'} a_{ii'}\,a_{jj'}\,a_{kk'}\,A_{i'j'k'}. \tag{12.24} $$
For the tensor of inertia, (12.23) can be more clearly represented in matrix notation:
$$ \hat\Theta' = \hat A\,\hat\Theta\,\hat A^{-1}. \tag{12.25} $$
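Now, according to (12.17), $\mathbf{e}'_i = \{a_{i1}, a_{i2}, a_{i3}\}$ is the vector $\mathbf{e}'_i$ in the basis $\mathbf{e}_j$. Hence,
the moment of inertia about the rotation axis $\mathbf{e}'_i = \mathbf{n} = (n_1, n_2, n_3)$ can obviously be
written as follows:
$$ \begin{aligned}
\Theta_n &= \sum_{j,l} a_{ij}\,\Theta_{jl}\,a_{il} = \sum_{j,l} a_{ij}\,\Theta_{jl}\,(a^T)_{li} = \sum_{j,l} n_j\,\Theta_{jl}\,n_l\\
 &= (n_1, n_2, n_3)\;\hat\Theta\begin{pmatrix}n_1\\ n_2\\ n_3\end{pmatrix} = \mathbf{n}^T\cdot\hat\Theta\cdot\mathbf{n}\\
 &= \sum_{i,j}\Theta_{ij}\,n_i n_j.
\end{aligned} \tag{12.26} $$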
This relation will be derived more clearly in the context of the subsequent equation
(12.33). It allows one to calculate the moment of inertia about an arbitrary rotation
axis n rather quickly.
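A short sketch of (12.26) in use: the quadratic form $\mathbf{n}^T\cdot\hat\Theta\cdot\mathbf{n}$ is evaluated for the plate tensor of Example 12.1, in units of $\Theta_0 = Ma^2$. The function name `moment_about_axis` is hypothetical, and normalizing the axis vector inside the function is a convenience added here, not part of the text.

```python
import math


def moment_about_axis(theta, n):
    """Moment of inertia n^T . Theta . n about the axis n (normalized internally)."""
    norm = math.sqrt(sum(c * c for c in n))
    u = [c / norm for c in n]
    return sum(theta[i][j] * u[i] * u[j] for i in range(3) for j in range(3))


# Plate tensor of Example 12.1 in units of Theta_0 = M a^2.
theta = [[1 / 3, -1 / 4, 0.0],
         [-1 / 4, 1 / 3, 0.0],
         [0.0, 0.0, 2 / 3]]

theta_x = moment_about_axis(theta, (1, 0, 0))          # x-axis
theta_diag = moment_about_axis(theta, (1, 1, 0))       # diagonal of the square
theta_space = moment_about_axis(theta, (1, 1, 1))      # space diagonal
print(theta_x, theta_diag, theta_space)
```

The x-axis gives 1/3 and the diagonal of the square gives 1/12, the smallest principal moment; the space diagonal gives 5/18.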
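If the three orientations of the principal axes $\mathbf{e}'_i = \boldsymbol\omega^{(i)}$ are selected as coordinate axes,
then
$$ \mathbf{e}'_i = \omega_1^{(i)}\mathbf{e}_1 + \omega_2^{(i)}\mathbf{e}_2 + \omega_3^{(i)}\mathbf{e}_3 = \sum_j\omega_j^{(i)}\,\mathbf{e}_j. $$
Hence, according to (12.23) the tensor of inertia in the system of principal axes reads
$$ \Theta'_{ij} = \sum_{k,l} a_{ik}\,a_{jl}\,\Theta_{kl} = \sum_{k,l}\omega_k^{(i)}\omega_l^{(j)}\,\Theta_{kl} = \sum_k\omega_k^{(i)}\sum_l\Theta_{kl}\,\omega_l^{(j)}. \tag{12.27} $$
The $\boldsymbol\omega^{(j)}$ satisfy the eigenvalue equation
$$ \hat\Theta\cdot\boldsymbol\omega^{(j)} = \Theta_j\,\boldsymbol\omega^{(j)} \tag{12.28} $$
or, explicitly,
$$ \sum_l\Theta_{kl}\,\omega_l^{(j)} = \Theta_j\,\omega_k^{(j)}. $$
Inserting this into (12.27) yields
$$ \Theta'_{ij} = \Theta_j\sum_k\omega_k^{(i)}\omega_k^{(j)} = \Theta_j\,\delta_{ij}. \tag{12.29} $$
We thus used the orthonormality (12.16) of the principal axes vectors $\boldsymbol\omega^{(i)}$. The $\boldsymbol\omega^{(i)}$
were assumed to be normalized, which is possible because of the linearity of the
eigenvalue equation (12.28) with respect to ω. Equation (12.29) expresses the interesting
and important fact that the tensor of inertia in its eigenrepresentation (i.e., in the
coordinate system with the principal axes $\boldsymbol\omega^{(i)}$ as coordinate axes) is diagonal and exactly
of the form (12.2). This was to be expected, but it is satisfactory to see how everything
fits together consistently.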
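We define a rotation axis by the unit vector n with the direction cosines n =
(cos α, cos β, cos γ). According to (12.26), the moment of inertia about this axis
is
$$ \Theta = \Theta_n = (\cos\alpha, \cos\beta, \cos\gamma)
\begin{pmatrix}\Theta_{xx} & \Theta_{xy} & \Theta_{xz}\\ \Theta_{xy} & \Theta_{yy} & \Theta_{yz}\\ \Theta_{xz} & \Theta_{yz} & \Theta_{zz}\end{pmatrix}
\begin{pmatrix}\cos\alpha\\ \cos\beta\\ \cos\gamma\end{pmatrix}. $$
Fig. 12.8. n characterizes the rotation axis
Multiplying out, we obtain
$$ \Theta = \Theta_{xx}\cos^2\alpha + \Theta_{yy}\cos^2\beta + \Theta_{zz}\cos^2\gamma + 2\Theta_{xy}\cos\alpha\cos\beta + 2\Theta_{xz}\cos\alpha\cos\gamma + 2\Theta_{yz}\cos\beta\cos\gamma. \tag{12.30} $$
With the vector $\boldsymbol\varrho = \mathbf{n}/\sqrt{\Theta} = (\varrho_x, \varrho_y, \varrho_z)$, this becomes
$$ \Theta_{xx}\varrho_x^2 + \Theta_{yy}\varrho_y^2 + \Theta_{zz}\varrho_z^2 + 2\Theta_{xy}\varrho_x\varrho_y + 2\Theta_{xz}\varrho_x\varrho_z + 2\Theta_{yz}\varrho_y\varrho_z = 1. \tag{12.31} $$
This equation represents an ellipsoid in the coordinates $(\varrho_x, \varrho_y, \varrho_z)$, the so-called
ellipsoid of inertia.
The distance ϱ from the center of rotation 0 along the direction n to the ellipsoid of
inertia is $\varrho = 1/\sqrt{\Theta}$. This allows us to write down at once the moment of inertia if the
ellipsoid of inertia is known. Each ellipsoid can now be brought to its normal form by
a rotation of the coordinate system, i.e., the mixed terms can be made to vanish. We
then obtain the form of the ellipsoid of inertia
$$ \Theta_1\varrho_1^2 + \Theta_2\varrho_2^2 + \Theta_3\varrho_3^2 = 1. $$
This agrees with the already known result (12.30). With the coordinates
$\boldsymbol\varrho = \mathbf{n}/\sqrt{\Theta_n} = (\varrho_1, \varrho_2, \varrho_3)$, we thus obtain the ellipsoid of inertia
$$ \Theta_{11}\varrho_1^2 + \Theta_{22}\varrho_2^2 + \Theta_{33}\varrho_3^2 + 2\Theta_{12}\varrho_1\varrho_2 + 2\Theta_{13}\varrho_1\varrho_3 + 2\Theta_{23}\varrho_2\varrho_3 = 1. \tag{12.33} $$
The radius of the ellipsoid in the direction n is $\varrho_n = 1/\sqrt{\Theta_n}$.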
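$$ \Theta_n = \sum_\nu m_\nu d_\nu^2 = \sum_\nu m_\nu\,|\mathbf{r}_\nu\times\mathbf{n}|^2. \tag{12.34} $$
We check
$$ \begin{aligned}
\mathbf{r}_\nu\times\mathbf{n} &= \begin{vmatrix}\mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3\\ x_\nu & y_\nu & z_\nu\\ \cos\alpha & \cos\beta & \cos\gamma\end{vmatrix}\\
 &= (y_\nu\cos\gamma - z_\nu\cos\beta)\,\mathbf{e}_1 + (z_\nu\cos\alpha - x_\nu\cos\gamma)\,\mathbf{e}_2 + (x_\nu\cos\beta - y_\nu\cos\alpha)\,\mathbf{e}_3
\end{aligned} $$
and
$$ \Theta_n = \sum_{i,j}\Theta_{ij}\,n_i n_j, \tag{12.36} $$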
EXAMPLE
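The tensor of inertia of the homogeneous square plate in the x, y-plane was given by
(compare Example 12.1)
$$ \hat\Theta = \begin{pmatrix}
\frac{1}{3}\Theta_0 & -\frac{1}{4}\Theta_0 & 0\\
-\frac{1}{4}\Theta_0 & \frac{1}{3}\Theta_0 & 0\\
0 & 0 & \frac{2}{3}\Theta_0
\end{pmatrix}. $$
The rotation of the coordinate system by ϕ = π/4 about the z-axis must bring $\hat\Theta$ to
diagonal form, because the angle bisectors of the x, y-plane, as was shown (compare
Example 12.1), are principal axes. The corresponding rotation matrix reads
$$ \hat A = \begin{pmatrix}\cos\varphi & \sin\varphi & 0\\ -\sin\varphi & \cos\varphi & 0\\ 0 & 0 & 1\end{pmatrix}
 = \begin{pmatrix}\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} & 0\\ -\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} & 0\\ 0 & 0 & 1\end{pmatrix}. $$
Obviously,
$$ \hat A^{-1} = \hat A^T. $$
Performing the matrix multiplication yields in accordance with the former result
$$ \hat\Theta' = \hat A\,\hat\Theta\,\hat A^{-1} = \begin{pmatrix}
\frac{1}{12}\Theta_0 & 0 & 0\\
0 & \frac{7}{12}\Theta_0 & 0\\
0 & 0 & \frac{2}{3}\Theta_0
\end{pmatrix}. $$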
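The matrix multiplication of this example can be reproduced in a few lines. This is a minimal sketch in units of $\Theta_0$, using plain nested lists rather than a matrix library; the helper names `matmul` and `transpose` are hypothetical.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    """Transpose of a 3x3 matrix; equals the inverse for a rotation matrix."""
    return [[a[j][i] for j in range(3)] for i in range(3)]


# Plate tensor in units of Theta_0 and rotation by phi = pi/4 about the z-axis.
theta = [[1 / 3, -1 / 4, 0.0],
         [-1 / 4, 1 / 3, 0.0],
         [0.0, 0.0, 2 / 3]]
phi = math.pi / 4
A = [[math.cos(phi), math.sin(phi), 0.0],
     [-math.sin(phi), math.cos(phi), 0.0],
     [0.0, 0.0, 1.0]]

theta_rot = matmul(matmul(A, theta), transpose(A))   # A . Theta . A^T
for row in theta_rot:
    print(row)
```

Up to rounding, the off-diagonal entries vanish and the diagonal carries 1/12, 7/12, and 2/3.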
EXERCISE
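Solution. For the calculation of the tensor of inertia, we choose the coordinate system
so that the longitudinal axis coincides with the z-axis.
Obviously,
$$ \begin{aligned}
\Theta_{xx} &= \varrho\int_V (y^2 + z^2)\,dV = \varrho\int (r^2\sin^2\varphi + z^2)\,r\,dz\,dr\,d\varphi\\
 &= \varrho\int_0^{2\pi}d\varphi\int_0^R r\,dr\int_{h(r/R)}^{h}(r^2\sin^2\varphi + z^2)\,dz
 = \frac{\pi}{20}\,\varrho\,hR^2(R^2 + 4h^2),
\end{aligned} $$
$$ \Theta_{xx} = \frac{3}{20}\,mh^2(\tan^2\alpha + 4). \tag{12.37} $$
Fig. 12.11. From the figure it is seen that $m = (1/3)\pi\varrho hR^2$, $R = h\tan\alpha$, $s = R/\sin\alpha$
For reasons of symmetry, we have
$$ \Theta_{yy} = \Theta_{xx}. $$
Likewise,
$$ \Theta_{zz} = \varrho\int_V (x^2 + y^2)\,dV = \varrho\int r^3\,dz\,dr\,d\varphi
 = \varrho\int_0^{2\pi}d\varphi\int_0^R r^3\,dr\int_{h(r/R)}^{h}dz = \frac{\pi}{10}\,\varrho\,hR^4, $$
$$ \Theta_{zz} = \frac{3}{10}\,mh^2\tan^2\alpha. \tag{12.38} $$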
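The two integrals can be cross-checked numerically. The sketch below carries out the ϕ and z integrations in closed form and the remaining r integration by a midpoint rule; the cone dimensions, the density, and the number of steps are assumptions for the example, and `cone_moments` is a hypothetical helper name.

```python
import math


def cone_moments(h, R, rho=1.0, n=2000):
    """Theta_xx and Theta_zz of a cone (vertex at the origin, axis along z).

    The phi and z integrals are done analytically, the r integral by the
    midpoint rule with n steps.
    """
    dr = R / n
    theta_xx = 0.0
    theta_zz = 0.0
    for k in range(n):
        r = (k + 0.5) * dr
        z_lo = h * r / R                       # lower z limit at radius r
        int_dz = h - z_lo                      # integral of dz
        int_z2 = (h ** 3 - z_lo ** 3) / 3.0    # integral of z^2 dz
        # integral of sin^2(phi) over 0..2pi is pi; integral of 1 is 2pi
        theta_xx += rho * r * (math.pi * r ** 2 * int_dz + 2.0 * math.pi * int_z2) * dr
        theta_zz += rho * 2.0 * math.pi * r ** 3 * int_dz * dr
    return theta_xx, theta_zz


# Illustrative cone (assumption): h = 2, R = 1, unit density.
h, R, rho = 2.0, 1.0, 1.0
m = rho * math.pi * h * R ** 2 / 3.0
theta_xx, theta_zz = cone_moments(h, R, rho)
tan2 = (R / h) ** 2                            # tan^2(alpha) = (R/h)^2
print(theta_xx, 3 / 20 * m * h ** 2 * (tan2 + 4))   # compare with (12.37)
print(theta_zz, 3 / 10 * m * h ** 2 * tan2)         # compare with (12.38)
```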
$$ T = \frac{1}{2}\Theta_1\omega_1^2 + \frac{1}{2}\Theta_2\omega_2^2 + \frac{1}{2}\Theta_3\omega_3^2. \tag{12.39} $$
Since we already know the principal axes of inertia and moments of inertia, it remains
only to express the motion of the cone by the corresponding angular velocities. The
momentary rotation of the cone happens with the angular velocity ω about a line of
support. We can express ω by $\dot\varphi$ by considering the velocity of the point B. On the one
hand $v_B = \dot\varphi\,h\cos\alpha$, and on the other hand $v_B = \omega R\cos\alpha$. From this, we find
$$ \omega = \dot\varphi\,\frac{h}{R}. \tag{12.40} $$
ϕ is the polar angle of the figure axis (or, equivalently, the tangential line) in the $x'$,$y'$-plane;
$\dot\varphi$ is the corresponding angular velocity.
A decomposition of ω in the system of principal axes, where ω lies in the x,z-plane,
leads to ω2 = 0 and
$$ \omega_1 = \omega\sin\alpha, \qquad \omega_3 = \omega\cos\alpha, \tag{12.41} $$
so that
$$ \begin{aligned}
T &= \frac{1}{2}\Theta_1\omega_1^2 + \frac{1}{2}\Theta_2\omega_2^2 + \frac{1}{2}\Theta_3\omega_3^2\\
 &= \frac{1}{2}\,\frac{3}{20}\,mh^2(\tan^2\alpha + 4)\,\omega_1^2 + \frac{1}{2}\,\frac{3}{10}\,mh^2\tan^2\alpha\;\omega_3^2\\
 &= \frac{3}{40}\,mh^2\omega^2\sin^2\alpha\,(\tan^2\alpha + 4) + \frac{3}{20}\,mh^2\omega^2\sin^2\alpha\\
 &= \frac{3}{40}\,mh^2\omega^2\left(\frac{\sin^4\alpha}{\cos^2\alpha} + 6\sin^2\alpha\right).
\end{aligned} \tag{12.42} $$
(b) The momentary rotation axis ω is again the connecting line between the fixed
vertex and the point of support. The relation between ω and $\dot\varphi$ is likewise obtained by
considering the velocity of point A.
We have $v_A = h\cdot\dot\varphi = \omega R\cos\alpha$, from which it follows that $\omega = \dot\varphi/\sin\alpha$. The
projection of ω onto the principal axes yields
$$ \begin{aligned}
\omega_1 &= \omega\sin\alpha = \dot\varphi,\\
\omega_2 &= 0,\\
\omega_3 &= \omega\cos\alpha = \dot\varphi\,\frac{h}{R}.
\end{aligned} \tag{12.44} $$
EXERCISE
Problem. Determine the ellipsoid of inertia for the rotation of a square plate about
the origin, as described in Example 12.1. Find the moments of inertia of the plate for
rotation about (a) the x-axis, (b) the y-axis, (c) the z-axis, (d) the three principal axes,
and (e) the axis {cos 45°, cos 45°, cos 45°}.
Solution. The ellipsoid of inertia reads
$$ \frac{\Theta_0}{3}\,\varrho_x^2 - \frac{\Theta_0}{2}\,\varrho_x\varrho_y + \frac{\Theta_0}{3}\,\varrho_y^2 + \frac{2\Theta_0}{3}\,\varrho_z^2 = 1. \tag{12.46} $$
(a) For rotation about the x-axis n = {1, 0, 0}, and thus $\boldsymbol\varrho = \{1/\sqrt{\Theta_x}, 0, 0\}$. Insertion
into (12.46) yields
$$ \frac{\Theta_0}{3}\cdot\frac{1}{\Theta_x} = 1 \quad\Rightarrow\quad \Theta_x = \frac{\Theta_0}{3}, $$
as expected.
(b) Here, n = {0, 1, 0}, and following the procedure in (a), we find
$$ \Theta_y = \frac{\Theta_0}{3}. $$
(c) Here, n = {0, 0, 1}, and following the procedure in (a), we find
$$ \Theta_z = \frac{2}{3}\,\Theta_0. $$
(d) The third principal axis is identical with the z-axis, which corresponds to (c).
The first two principal axes are given by
$$ \mathbf{n}_1 = \left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}, 0\right) \quad\text{and}\quad \mathbf{n}_2 = \left(-\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}, 0\right), $$
EXERCISE
If we select the z-axis as a rotation axis, the rotation matrix reads
$$ \hat A = \begin{pmatrix}\cos\varphi & \sin\varphi & 0\\ -\sin\varphi & \cos\varphi & 0\\ 0 & 0 & 1\end{pmatrix}. $$
Multiplying the matrices out, one obtains the components $\Theta'_{ij}$ of the new tensor of
inertia which shall coincide with $\Theta_{ij}$.
vanishes only for ϕ = 0, 2π, . . . . If there is symmetry (n ≥ 2), then we must have
$\Theta_{13} = \Theta_{23} = 0$, i.e., the z-axis must be a principal axis.
Two of the remaining three equations are identical, and there remains the system
of equations
There holds D = 0 for ϕ = 0, π, 2π, . . . . Hence, $\Theta_{11} = \Theta_{22}$ and $\Theta_{12} = 0$, if n > 2. If
the axis of rotational symmetry z is at least 3-fold, the tensor of inertia is diagonal for
each orthogonal pair of axes in the x, y-plane.
EXERCISE
Problem. A rigid body consists of three mass points that are connected to the z-axis
by rigid massless bars (see Fig. 12.15).
(a) Find the elements of the tensor of inertia relative to the x, y, z-system.
(b) Calculate the ellipsoid of inertia with respect to the origin 0, and the moment of
inertia of the entire body with respect to the axis 0a.
Solution. (a) The elements of the tensor of inertia relative to the x,y,z-system are
$$ \Theta_{xx} = \sum_i m_i\,(y_i^2 + z_i^2) $$
and after inserting the numerical values from Fig. 12.15, one has
and likewise,
(b) From (a) one now immediately obtains for the ellipsoid of inertia with respect
to the origin 0 (see (12.30))
To calculate the moment of inertia $\Theta_{0a}$, we evaluate the direction cosines with the
coordinates given in Fig. 12.15,
$$ \cos\alpha = \frac{-6}{\sqrt{6^2 + 8^2 + 20^2}} = -0.268, $$
$$ \cos\beta = \frac{8}{\sqrt{6^2 + 8^2 + 20^2}} = 0.358, $$
and
$$ \cos\gamma = \frac{20}{\sqrt{6^2 + 8^2 + 20^2}} = 0.894. $$
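The direction cosines are simply the components of the normalized vector along the axis 0a. A minimal sketch, taking the point (−6, 8, 20) from the expressions above as the endpoint a:

```python
import math

# Endpoint of the axis 0a, as it appears in the direction cosines above.
point = (-6.0, 8.0, 20.0)

norm = math.sqrt(sum(c * c for c in point))     # sqrt(6^2 + 8^2 + 20^2)
cosines = tuple(c / norm for c in point)        # (cos alpha, cos beta, cos gamma)
check = sum(c * c for c in cosines)             # must equal 1 for a unit vector
print(cosines, check)
```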
EXERCISE
Problem. A car of mass M is driven by a motor that exerts the torque 2D on the
wheel axis. The radius of the wheels is R, and their moment of inertia is $\Theta = mR^2$
(m is the reduced mass of the wheels).
(a) Determine the friction force f which acts on each wheel and causes the
acceleration of the car. The street is assumed to be planar.
(b) Calculate the acceleration of the car if the torque $2D = 10^3$ N m, $M = 2\cdot 10^3$ kg,
R = 0.5 m and m = 12.5 kg.
Solution. (a) Fig. 12.16 shows one of the wheels and the force f acting on it. Since
the linear acceleration of the wheel center is the same as that of the center of gravity
of the car $a_s$, one has
$$ M a_s = 2f - F. \tag{12.48} $$
Fig. 12.16. Wheel, acceleration $a_s$, and friction force f
The factor 2 accounts for the fact that a car in general is driven by two wheels. F
is a possible external force which impedes the motion (air resistance), and $a_s$ is the
acceleration of the car. For the torque relative to the axis, one obtains
where Θ is the moment of inertia of each of the four wheels, D is the accelerating
torque, and −fR is the torque exerted by the friction force on each wheel. The
moment of inertia is $\Theta = mR^2$. If the car does not glide, one has
$$ \dot\omega R = a_s, \tag{12.50} $$
$$ f = \frac{1}{2}\,\frac{(2D/R)M + 4mF}{M + 4m} \tag{12.52} $$
and by neglecting the backdriving force F (F = 0), we have
$$ f = \frac{D/R}{1 + (4m/M)}. \tag{12.53} $$
(b) By replacing f in (12.49) by (12.53) and solving for $a_s$ (F = 0), one finds the
acceleration of the car
$$ a_s = \frac{2D/R}{M + 4m} = \frac{10^3/0.5}{2\cdot 10^3 + 4\cdot 12.5} = \frac{10^3}{1025} \approx 1\ \frac{\text{m}}{\text{s}^2}. $$
With the numerical values from (b), the friction force f is given by
$$ f \approx \frac{D}{R} = 1000\ \text{N}. $$
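The numbers of part (b) follow directly from (12.52) and (12.48). The sketch below evaluates both; the function name `car_dynamics` is hypothetical, and the acceleration is obtained by solving (12.48) for $a_s$ once f is known.

```python
def car_dynamics(two_D, M, R, m, F=0.0):
    """Friction force per driven wheel and acceleration of the car.

    two_D: total driving torque 2D, M: car mass, R: wheel radius,
    m: reduced wheel mass (Theta = m R^2), F: external resisting force.
    """
    f = 0.5 * ((two_D / R) * M + 4.0 * m * F) / (M + 4.0 * m)   # Eq. (12.52)
    a_s = (2.0 * f - F) / M                                     # from Eq. (12.48)
    return a_s, f


# Values from part (b) of the exercise, with F = 0.
a_s, f = car_dynamics(two_D=1.0e3, M=2.0e3, R=0.5, m=12.5)
print(a_s, f)
```

The acceleration is slightly below 1 m/s² and the friction force slightly below D/R = 1000 N, because part of the torque goes into spinning up the wheels.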
13 Theory of the Top
A rigid, rotating body is called a top. A top is called symmetric if two of its principal
moments of inertia are equal. If Θ1 = Θ2, we further distinguish
The third principal axis of inertia which is related to Θ3 is called the figure axis. It
specifies the spatial orientation of the top. For rotationally symmetric bodies it co-
incides with their symmetry axis. Hence, the center of gravity of a rotational body
always lies on the figure axis. Moreover, we must distinguish between the free top and
the heavy top. For the free top one assumes that no external forces act on the body,
so that the torque with respect to the fixed point vanishes. On the heavy top forces
act, for example gravity. One can however imagine other forces (centrifugal forces,
friction forces, etc.). For an experimental realization of a free top we only have to
support an arbitrary body at the center of gravity. The body is then in an indifferent
equilibrium, and there is no torque acting on it.
For a theoretical description of the top, we start from the basic equations
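$$ \mathbf{L} = \hat\Theta\cdot\boldsymbol\omega = \text{constant} \quad\text{(conservation of angular momentum)}, \tag{13.1} $$
$$ T = \frac{1}{2}\,\boldsymbol\omega\cdot\mathbf{L} = \text{constant} \quad\text{(conservation of kinetic energy)}. \tag{13.2} $$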
The angular momentum L and the kinetic energy T of the free top are constant in
time. This is the content of the last two equations.
We first will derive the laws governing the free top from geometrical considerations.
The geometrical theory of the top is based on Poinsot’s1 ellipsoid (also called the
energy ellipsoid):
This ellipsoid in the ω-space is immediately obtained from (13.2). It is similar to the
ordinary ellipsoid of inertia and has the same body-fixed axes.
In the subsequent considerations, we shall utilize the property of (13.3) that the
endpoint of the vector ω lies just on the surface of the ellipsoid.
Now follows Poinsot's construction of the motion of the free top. The angular
momentum vector is constant and defines an orientation in space. The straight line
determined by L is therefore called the invariable straight line. Moreover, the kinetic
energy is constant, hence 2T = ω · L = constant; from the definition of the scalar
product it immediately follows that
$$ \omega\cos(\boldsymbol\omega, \mathbf{L}) = \frac{2T}{L} = \text{constant}. \tag{13.4} $$
In other words, the projection of ω onto L is constant. If one now considers ω as the
position vector for points in space, the parameter representation ω(t) fixes a plane
which is called an invariable plane. The invariable straight line is then perpendicular
to the invariable plane.
Now one can describe the motion of the top by the rolling of the Poinsot ellipsoid
on the invariable plane. This is allowed since the endpoint of ω, as is evident from (13.4),
lies on the surface of the ellipsoid and moves in the invariable plane. The invariable
plane is also a tangent plane of Poinsot's ellipsoid, since there is only one common
vector ω, and hence the ellipsoid and the plane have a common point. To prove this,
we show that at the point ω the gradient of the ellipsoid is parallel to L. From vector
analysis we know that the gradient of a surface is perpendicular to this surface. The
surface of the ellipsoid F is described by (13.3).
Because2
$$ \nabla_\omega F = \left(\frac{\partial F}{\partial\omega_x}, \frac{\partial F}{\partial\omega_y}, \frac{\partial F}{\partial\omega_z}\right), $$
we obtain
$$ \frac{1}{2}\,\nabla_\omega F = \begin{pmatrix}
\Theta_{xx}\omega_x + \Theta_{xy}\omega_y + \Theta_{xz}\omega_z\\
\Theta_{xy}\omega_x + \Theta_{yy}\omega_y + \Theta_{yz}\omega_z\\
\Theta_{xz}\omega_x + \Theta_{yz}\omega_y + \Theta_{zz}\omega_z
\end{pmatrix} = \hat\Theta\cdot\boldsymbol\omega = \mathbf{L}, $$
1 Louis Poinsot, French mathematician and physicist, b. Jan. 3, 1777, Paris, d. Dec. 5, 1859, Paris.
Professor in Paris, introduced in his Eléments de statique (Paris, 1804) the concept of the couple to
mechanics and used it to represent the motion of the top. Poinsot-motion means the motion of a free
top.
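The relation between the gradient of the ellipsoid and the angular momentum can be checked numerically. The sketch below assumes that the surface function is the quadratic form $F(\boldsymbol\omega) = \boldsymbol\omega\cdot\hat\Theta\cdot\boldsymbol\omega$ (the form whose half gradient is $\hat\Theta\cdot\boldsymbol\omega$), takes an arbitrary symmetric example tensor, and compares a central-difference gradient with $2\hat\Theta\cdot\boldsymbol\omega = 2\mathbf{L}$. All numerical values and function names are assumptions for illustration.

```python
def quad_form(theta, w):
    """F(omega) = omega . Theta . omega (assumed form of the Poinsot ellipsoid function)."""
    return sum(theta[i][j] * w[i] * w[j] for i in range(3) for j in range(3))


def grad_central(theta, w, step=1e-6):
    """Central-difference gradient of F with respect to the components of omega."""
    grad = []
    for k in range(3):
        w_plus = list(w)
        w_minus = list(w)
        w_plus[k] += step
        w_minus[k] -= step
        grad.append((quad_form(theta, w_plus) - quad_form(theta, w_minus)) / (2.0 * step))
    return grad


# Arbitrary symmetric example tensor and angular velocity (assumptions).
theta = [[2.0, -0.3, 0.1],
         [-0.3, 1.5, 0.2],
         [0.1, 0.2, 3.0]]
omega = (0.4, -0.7, 1.1)

L = [sum(theta[i][j] * omega[j] for j in range(3)) for i in range(3)]   # L = Theta . omega
grad_F = grad_central(theta, omega)
print(grad_F, [2.0 * c for c in L])
```

Since the gradient is parallel to L at every point ω of the surface, the tangent plane there is perpendicular to L, which is the property used for the invariable plane.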
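$$ \left.\frac{d\boldsymbol\omega}{dt}\right|_L = \left.\frac{d\boldsymbol\omega}{dt}\right|_K + \boldsymbol\omega\times\boldsymbol\omega, \quad\text{i.e.,}\quad \left.\frac{d\boldsymbol\omega}{dt}\right|_L = \left.\frac{d\boldsymbol\omega}{dt}\right|_K. $$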
Concerning the difference between gliding and rolling: if a wheel rolls on a plane,
the velocities of change of the contact point P in the body-fixed and in the laboratory
system are equal. If the wheel glides, the contact point in the body-fixed system is
fixed; in the laboratory system its position changes permanently.
Fig. 13.4. On the condition of rolling
2 Since the surface (13.3) is defined in the ω-space, we mean by gradient the ω-gradient, i.e., ∇ω =
{∂/∂ωx , ∂/∂ωy , ∂/∂ωz }.
The trajectory of ω on the invariable plane is called the herpolhodie or trace trajec-
tory, the corresponding curve on the ellipsoid is called the polhodie or pole trajectory.
See Fig. 13.5.
The polhodie and the herpolhodie are in general complicated, not closed curves.
For the special case of a symmetric top, the Poinsot ellipsoid turns into a rotation
ellipsoid, and by rolling of the rotation ellipsoid there arise circles. ω has constant
magnitude but permanently changes the direction, i.e., ω rotates on a cone about the
angular momentum axis. This cone is called the herpolhodie- or trace cone. For a
symmetric top it is convenient to use the symmetry axis (figure axis) as third axis for
describing the motion. The figure axis that is tightly fixed to the ellipsoid rotates just
like ω rotates about L. The cone resulting this way is called the nutation cone. The
motion of the figure axis of the top in space is called nutation. (The term precession
used in the American literature makes little sense, since the term means a motion of
the heavy top that is of a completely different origin.)
An observer who is in the system of the top and considers the figure axis as fixed
will find that ω and L rotate about this axis. For the cone arising by the rotation of
ω the term polhodie- or pole cone is introduced. The precise orientation of the axes
and cones depends essentially on the shape of the rotation ellipsoid. This is shown
by the following two diagrams on the orientation of the axes. Note that a large principal moment of inertia $\Theta_3$ corresponds to a small radius of the Poinsot ellipsoid, namely, $\sqrt{2T/\Theta_3}$. The other axes of the Poinsot ellipsoid accordingly have the lengths $\sqrt{2T/\Theta_1}$ and $\sqrt{2T/\Theta_2}$, respectively.
This is immediately seen from the form of (13.3) in terms of the principal axes:
Figure 13.6(a) shows the ellipsoid of a flattened (oblate) top; Fig. 13.6(b) represents
a prolate top. In the first case, the axes have the sequence ω—L—figure axis; in the
second case, the sequence is L—ω—figure axis.
Likewise is the sequence of the cones introduced above. Figure 13.6(c) shows the
case of an oblate top, and Fig. 13.6(d) that of a prolate top. We note that the three axes
lie in a plane.
13.3 Analytical Theory of the Free Top
ω = ω 1 e1 + ω 2 e2 + ω 3 e3 ,
where e1 , e2 , and e3 are body-fixed principal axes of the top. We now investigate the
angular momentum of the top no longer in the moving coordinate system, i.e., in the
system of the top that rotates with ω in the laboratory system, but transformed into
the laboratory system, using our knowledge of moving coordinate systems. We then
obtain
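\[ \dot{\mathbf{L}}|_{\text{lab}} = \dot{\mathbf{L}}|_{\text{top}} + \boldsymbol{\omega}\times\mathbf{L}. \]
Because $\dot{\mathbf{L}}|_{\text{top}} = \hat{\Theta}\cdot\dot{\boldsymbol{\omega}}$, for the components in the laboratory system we have
\[ \dot{\mathbf{L}}|_{\text{lab}} = \Theta_1\dot\omega_1\mathbf{e}_1 + \Theta_2\dot\omega_2\mathbf{e}_2 + \Theta_3\dot\omega_3\mathbf{e}_3 + \begin{vmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ \omega_1 & \omega_2 & \omega_3 \\ \Theta_1\omega_1 & \Theta_2\omega_2 & \Theta_3\omega_3 \end{vmatrix}. \]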
Since the laboratory system is an inertial frame of reference, we have the relation
L̇ = D.
L̇|lab = D1 e1 + D2 e2 + D3 e3 .
These three coupled differential equations for ω1 (t), ω2 (t), and ω3 (t) are not linear.
This suggests that in general the solutions ωi (t) are rather complicated functions of
time. Only in the case of free motion (D = 0) can one obtain a transparent solution,
which will be discussed now. Later we shall deal with the heavy top, for which D ≠ 0.
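The coupled equations referred to here are Euler's equations in their standard component form, $\Theta_1\dot\omega_1 + (\Theta_3 - \Theta_2)\omega_2\omega_3 = D_1$ and cyclic permutations. The following Python sketch integrates them for D = 0 with a classical fourth-order Runge-Kutta step and checks that the kinetic energy and |L| stay constant. The moments of inertia, the initial angular velocity, and the step size are illustrative assumptions, not values from the text.

```python
import math

def euler_rhs(w, theta, torque=(0.0, 0.0, 0.0)):
    """Time derivatives of (w1, w2, w3) from Euler's equations (principal axes)."""
    w1, w2, w3 = w
    t1, t2, t3 = theta
    d1, d2, d3 = torque
    return (
        (d1 - (t3 - t2) * w2 * w3) / t1,
        (d2 - (t1 - t3) * w3 * w1) / t2,
        (d3 - (t2 - t1) * w1 * w2) / t3,
    )

def rk4_step(w, theta, dt):
    """One classical Runge-Kutta step for the torque-free top."""
    def shifted(k, s):
        return tuple(x + s * y for x, y in zip(w, k))
    k1 = euler_rhs(w, theta)
    k2 = euler_rhs(shifted(k1, dt / 2), theta)
    k3 = euler_rhs(shifted(k2, dt / 2), theta)
    k4 = euler_rhs(shifted(k3, dt), theta)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(w, k1, k2, k3, k4)
    )

def kinetic_energy(w, theta):
    """T = (1/2) * sum_i Theta_i * w_i^2."""
    return 0.5 * sum(t * x * x for t, x in zip(theta, w))

def angular_momentum_norm(w, theta):
    """|L| with L_i = Theta_i * w_i in the body-fixed principal axes."""
    return math.sqrt(sum((t * x) ** 2 for t, x in zip(theta, w)))

def integrate(w0, theta, dt, steps):
    w = w0
    for _ in range(steps):
        w = rk4_step(w, theta, dt)
    return w

theta = (1.0, 2.0, 3.0)   # assumed principal moments of inertia
w0 = (1.0, 0.5, 0.2)      # assumed initial angular velocity (body frame)
w_end = integrate(w0, theta, dt=1e-3, steps=2000)
print(kinetic_energy(w0, theta), kinetic_energy(w_end, theta))
print(angular_momentum_norm(w0, theta), angular_momentum_norm(w_end, theta))
```

The components of ω change noticeably over the run, while both printed pairs agree to within the integration error, which illustrates why T and |L| are the natural constants of the free motion.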
We choose the body-fixed coordinate system so that the e3 -axis corresponds to the
figure axis. Since we will restrict the analytical consideration of the theory of the top
to a free symmetric top that shall be symmetric about the figure axis, the following
conditions hold:
We show that for a symmetric top e3 , ω, and L lie in a plane. For this we have to
calculate the scalar triple product of the three vectors which must vanish:
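\[ \mathbf{e}_3\cdot(\boldsymbol{\omega}\times\mathbf{L}) = \mathbf{e}_3\cdot \begin{vmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ \omega_1 & \omega_2 & \omega_3 \\ \Theta_1\omega_1 & \Theta_2\omega_2 & \Theta_3\omega_3 \end{vmatrix} = (\Theta_2 - \Theta_1)\,\omega_1\omega_2 = 0, \]
because $\Theta_1 = \Theta_2$.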
With the conditions for the free symmetric top, the Euler equations read
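\[ \Theta_3\dot\omega_3 = 0 \;\Rightarrow\; \omega_3 = \text{constant}, \]
\[ \Theta_1\dot\omega_1 + (\Theta_3 - \Theta_1)\,\omega_2\omega_3 = 0, \]
\[ \Theta_1\dot\omega_2 + (\Theta_1 - \Theta_3)\,\omega_1\omega_3 = 0. \]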
Thus, the component of ω along the figure axis is constant. To show this in the subse-
quent calculation, we set
ω3 = u.
To solve the two differential equations, we differentiate the second equation with re-
spect to time:
By solving the last equation for ω̇2 and inserting into the first one, we obtain
\[ \ddot\omega_1 + \frac{(\Theta_3 - \Theta_1)^2}{\Theta_1^2}\,u^2\,\omega_1 = 0. \]
we see that ω̈1 + k 2 ω1 = 0 is exactly the differential equation of the harmonic oscilla-
tor, which is solved by
Fig. 13.7. Motion of ω(t) in the e1, e2-plane
The rotational frequency is thereby given by k; for k > 0 the rotation proceeds in the mathematically positive sense. The cone arising in the rotation is again called the pole cone. The angular momentum, which is given by $\mathbf{L} = \hat{\Theta}\cdot\boldsymbol{\omega}$, also changes with
time:
i.e., the L-axis rotates with the same frequency k but with a different amplitude about
the figure axis (nutation). This is no contradiction to the statement |L|lab = constant,
since we measure the angular momentum from the system of the top.
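The harmonic solution can be checked directly against the two remaining Euler equations. The sketch below assumes the particular initial condition ω = (B, 0, u), for which ω1 = B cos kt and ω2 = B sin kt with k = (Θ3 − Θ1)u/Θ1, and evaluates the left-hand sides of both equations using central finite differences. The function names and the numerical values are hypothetical and chosen only for illustration.

```python
import math

def nutation_rate(theta1, theta3, u):
    """k = (Theta3 - Theta1) / Theta1 * u for the free symmetric top."""
    return (theta3 - theta1) / theta1 * u

def omega_body(t, B, k, u):
    """Assumed solution for the initial condition omega(0) = (B, 0, u)."""
    return (B * math.cos(k * t), B * math.sin(k * t), u)

def angular_momentum_body(t, theta1, theta3, B, u):
    """L in the body frame: same frequency k, transverse amplitude Theta1 * B."""
    k = nutation_rate(theta1, theta3, u)
    w1, w2, w3 = omega_body(t, B, k, u)
    return (theta1 * w1, theta1 * w2, theta3 * w3)

def euler_residuals(t, theta1, theta3, B, u, h=1e-5):
    """Left-hand sides of the two Euler equations; both should vanish."""
    k = nutation_rate(theta1, theta3, u)
    w1, w2, _ = omega_body(t, B, k, u)
    plus = omega_body(t + h, B, k, u)
    minus = omega_body(t - h, B, k, u)
    dw1 = (plus[0] - minus[0]) / (2 * h)   # central difference for d(omega1)/dt
    dw2 = (plus[1] - minus[1]) / (2 * h)   # central difference for d(omega2)/dt
    r1 = theta1 * dw1 + (theta3 - theta1) * w2 * u
    r2 = theta1 * dw2 + (theta1 - theta3) * w1 * u
    return r1, r2

for t in (0.0, 0.7, 3.1):
    print(t, euler_residuals(t, theta1=1.0, theta3=1.5, B=0.3, u=2.0))
```

Both residuals stay at the level of the finite-difference error, and the transverse part of L has the constant magnitude Θ1·B, in line with the statement that L rotates about the figure axis with the same frequency but a different amplitude.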
Finally, we determine the angles between the three axes. We set
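\[ \angle(\mathbf{e}_3, \mathbf{L}) = \alpha, \qquad \angle(\mathbf{e}_3, \boldsymbol{\omega}) = \beta, \]
or
\[ \mathbf{e}_3\cdot\mathbf{L} = \mathbf{e}_3\cdot(\Theta_1\omega_1\mathbf{e}_1 + \Theta_2\omega_2\mathbf{e}_2 + \Theta_3\omega_3\mathbf{e}_3) = \Theta_3\omega_3 = \Theta_3 u. \]
\[ \cos\alpha = \frac{\Theta_3 u}{\sqrt{(\Theta_1 B)^2 + (\Theta_3 u)^2}} = \frac{1}{\sqrt{(\Theta_1 B/\Theta_3 u)^2 + 1}} \]
or
\[ \cos\alpha\,\sqrt{\left(\frac{\Theta_1 B}{\Theta_3 u}\right)^2 + 1} = 1. \]
\[ \cos x\,\sqrt{\tan^2 x + 1} = 1 \]
\[ \tan\beta = \frac{B}{u} = \text{constant}. \]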
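A short numerical sketch makes the ordering of the three axes concrete. It assumes the relation tan α = Θ1B/(Θ3u), which follows from the expression for cos α together with the identity cos x √(tan²x + 1) = 1, and it uses tan β = B/u. The function names and sample values are illustrative assumptions.

```python
import math

def axis_angles(theta1, theta3, B, u):
    """Return (alpha, beta): angles of L and of omega with the figure axis e3."""
    alpha = math.atan2(theta1 * B, theta3 * u)  # tan(alpha) = Theta1*B / (Theta3*u)
    beta = math.atan2(B, u)                     # tan(beta) = B / u
    return alpha, beta

def axis_order(theta1, theta3, B, u):
    """Sequence of the axes, starting from the one farthest from the figure axis."""
    alpha, beta = axis_angles(theta1, theta3, B, u)
    if alpha < beta:
        return "omega, L, figure axis"   # oblate top, Theta3 > Theta1
    if alpha > beta:
        return "L, omega, figure axis"   # prolate top, Theta3 < Theta1
    return "omega and L coincide"        # spherical top, Theta3 == Theta1

print(axis_order(theta1=1.0, theta3=2.0, B=1.0, u=1.0))
print(axis_order(theta1=2.0, theta3=1.0, B=1.0, u=1.0))
```

The two calls reproduce the sequences quoted for Fig. 13.6: ω, L, figure axis for the oblate top and L, ω, figure axis for the prolate top.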
The comparison of the last two results shows the dependence of the orientation of
the axes on 1 and 3 : One has
Fig. 13.8.
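also from
\[ \boldsymbol{\omega}\times\mathbf{L} = \boldsymbol{\omega}\times\Theta_1\boldsymbol{\omega} = 0 \quad\text{because}\quad \hat{\Theta} = \begin{pmatrix} \Theta_1 & 0 & 0 \\ 0 & \Theta_1 & 0 \\ 0 & 0 & \Theta_1 \end{pmatrix}. \]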
EXAMPLE
The earth is not a spherical top but a flattened rotation ellipsoid. The half-axes are
If the angular momentum axis and the figure axis do not coincide, the figure axis performs nutations about the angular momentum axis. The angular velocity of the nutations is
\[ k = \frac{\Theta_3 - \Theta_1}{\Theta_1}\,\omega_3 . \]
The third axis is the principal axis of inertia (pole axis). If we consider the earth as a homogeneous ellipsoid of mass M, we obtain the two moments of inertia:
\[ \Theta_1 = \Theta_2 = \frac{M}{5}(b^2 + c^2), \qquad \Theta_3 = \frac{M}{5}(a^2 + b^2). \]
From this, we obtain
\[ k = \frac{a^2 - c^2}{b^2 + c^2}\,\omega_3 . \]
Since the half-axes differ only a little, we set a = b ≈ c, and thus,
\[ k = \frac{(a - c)(a + c)}{b^2 + c^2}\,\omega_3 \approx \frac{a - c}{a}\,\omega_3 . \]
The rotation velocity of the earth is ω3 = 2π/day. Thus, we obtain for the period of nutation
\[ T = \frac{2\pi}{k} = 304\ \text{days}. \]
The figure axis of the earth (geometrical north pole) and the rotation axis ω of
the earth (kinematical north pole) rotate about each other. The measured period (the
so-called Chandler period)3 is 433 days. The deviation is essentially caused by the
fact that the earth is not rigid. The amplitude of this nutation is about ±0.2″. The
3 Seth Carlo Chandler, b. Sept. 17, 1846, Boston, Mass.–d. Dec. 31, 1913, Wellesley Hills, Mass.
American astronomer, detected the Chandler period of 14 months in the pole height fluctuations. He
observed variable stars, and for a long time he edited the Astronomical Journal.
kinematical north pole moves along a spiral trajectory within a circle of 10 m radius
in the sense of the earth’s rotation.
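The period quoted in Example 13.1 can be reproduced with a few lines of Python. The half-axes are not reproduced in the text above, so the rounded values a = b = 6378 km and c = 6357 km used below are an assumption; both the exact ratio (a² − c²)/(b² + c²) and the approximation (a − c)/a are evaluated.

```python
import math

def nutation_period_days(a_km, c_km, rotation_period_days=1.0, approximate=False):
    """Period 2*pi/k of the free nutation of a homogeneous oblate ellipsoid (a = b)."""
    omega3 = 2.0 * math.pi / rotation_period_days
    if approximate:
        ratio = (a_km - c_km) / a_km                               # (a - c)/a
    else:
        ratio = (a_km**2 - c_km**2) / (a_km**2 + c_km**2)          # (a^2 - c^2)/(b^2 + c^2), b = a
    k = ratio * omega3
    return 2.0 * math.pi / k

A_KM = 6378.0   # assumed equatorial half-axis
C_KM = 6357.0   # assumed polar half-axis

print(nutation_period_days(A_KM, C_KM))                    # exact ratio
print(nutation_period_days(A_KM, C_KM, approximate=True))  # (a - c)/a approximation
```

Both variants give a period slightly above 300 days, well below the measured Chandler period of 433 days, which is the discrepancy the example attributes to the non-rigidity of the earth.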
EXAMPLE
The ellipsoid of inertia of any regular polyhedron is a sphere, which will be shown
by the example of the tetrahedron; the reasoning for the octahedron, dodecahedron,
and icosahedron is analogous. Suppose there were a principal axis of inertia with a
moment which differs from those of the two other principal axes of inertia. In a rota-
tion by 120◦ about the axis g (perpendicular from point C on the opposite plane; see
Fig. 13.9), this axis of inertia must turn into itself, since the tetrahedron is transferred
into itself. It is easily seen that only the axis g has this property and therefore must be the distinguished axis of inertia. But since h is a symmetry axis too, a rotation by 120° about h must also transfer the axis of inertia g into itself, which however is not true. Assuming the existence of a distinct axis of inertia leads to a contradiction, and hence the ellipsoid of inertia of a tetrahedron must be a sphere.
Fig. 13.9. Regular tetrahedron: g and h are straight lines (axes) that are perpendicular to the planes opposite C and D
EXERCISE
Solution. We decompose the angular velocity ω into its components along the prin-
cipal axes of inertia:
\[ T = \frac{1}{2}\sum_i \Theta_i\omega_i^2 = \frac{1}{2}\left(\Theta_1\cos^2\varphi + \Theta_2\sin^2\varphi\right)\dot\vartheta^2 + \frac{1}{2}\Theta_3\dot\varphi^2 . \]
The ellipsoid shall now be symmetric, $\Theta_1 = \Theta_2$; the axis AB is tilted from the
third axis by the angle α. For the total angular velocity, we have
ω = ϕ̇e3 + ϑ̇eAB .
We decompose the unit vector eAB along the axis AB with respect to the principal
axes
Thus, the components of ω along the directions of the principal axes are
EXERCISE
Problem. Find the torque that is needed to rotate a rectangular plate (edges a and b)
with constant angular velocity ω about a diagonal.
Fig. 13.12. The rectangular plate rotates about the diagonal axis
Solution. The principal moments of inertia of the rectangle are already known from Example 11.7:
\[ I_1 = \frac{1}{12}Ma^2, \qquad I_2 = \frac{1}{12}Mb^2, \qquad I_3 = \frac{1}{12}M(a^2 + b^2). \tag{13.6} \]
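\[ \boldsymbol{\omega} = (\boldsymbol{\omega}\cdot\mathbf{e}_x)\mathbf{e}_x + (\boldsymbol{\omega}\cdot\mathbf{e}_y)\mathbf{e}_y, \]
i.e.,
\[ \boldsymbol{\omega} = -\frac{\omega b}{\sqrt{a^2 + b^2}}\,\mathbf{e}_x + \frac{\omega a}{\sqrt{a^2 + b^2}}\,\mathbf{e}_y \;\Rightarrow\; \omega_1 = \frac{-\omega b}{\sqrt{a^2 + b^2}}, \quad \omega_2 = \frac{+\omega a}{\sqrt{a^2 + b^2}}, \quad \omega_3 = 0. \tag{13.7} \]
\[ D_3 = \frac{-M(b^2 - a^2)\,ab\,\omega^2}{12(a^2 + b^2)} . \]
\[ \mathbf{D} = \frac{-M(b^2 - a^2)\,ab\,\omega^2}{12(a^2 + b^2)}\,\mathbf{e}_z . \]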
For a = b (square), D = 0!
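The result can be cross-checked numerically. For constant ω the body-frame derivative of L vanishes, so Euler's equations reduce to D = ω × L with $L_i = I_i\omega_i$. The sketch below evaluates this with the moments (13.6) and the components (13.7); the function names and the sample numbers are hypothetical.

```python
import math

def cross(a, b):
    """Vector product of two 3-tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def plate_torque(M, a, b, omega):
    """Torque D = omega x L for a plate rotating uniformly about its diagonal."""
    inertia = (M * a * a / 12.0, M * b * b / 12.0, M * (a * a + b * b) / 12.0)  # (13.6)
    diag = math.hypot(a, b)
    w = (-omega * b / diag, omega * a / diag, 0.0)                              # (13.7)
    L = tuple(i * x for i, x in zip(inertia, w))
    return cross(w, L)

def plate_torque_closed_form(M, a, b, omega):
    """D3 as given in the solution."""
    return -M * (b * b - a * a) * a * b * omega**2 / (12.0 * (a * a + b * b))

print(plate_torque(2.0, 0.3, 0.5, 4.0)[2], plate_torque_closed_form(2.0, 0.3, 0.5, 4.0))
print(plate_torque(2.0, 0.4, 0.4, 4.0))  # square plate: the torque vanishes
```

Only the third component is nonzero, and it vanishes for a = b because the diagonal of a square is itself a principal axis.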
EXERCISE
Problem. The surface of a neutron star (sphere) vibrates slowly, so that the principal
moments of inertia are harmonic functions of time:
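\[ I_{zz} = \frac{2}{5}mr^2\,(1 + \varepsilon\cos\omega t), \]
\[ I_{xx} = I_{yy} = \frac{2}{5}mr^2\left(1 - \varepsilon\,\frac{\cos\omega t}{2}\right), \qquad \varepsilon \ll 1. \]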
Solution. (a) If the total angular momentum is given in an inertial system, then
\[ \left.\frac{d\mathbf{L}}{dt}\right|_{\text{inertial}} = 0. \]
The principal moments of inertia are, however, given in a body-fixed system that itself rotates with the angular velocity Ω with respect to the inertial system. Then
\[ \left.\frac{d\mathbf{L}}{dt}\right|_{\text{inertial}} = \left.\frac{d\mathbf{L}}{dt}\right|_{k} + \boldsymbol{\Omega}\times\mathbf{L} = 0. \]
where $I_0 = (2/5)mr^2$ is the moment of inertia of the sphere. (13.8) has the solution
\[ \Omega_z = \frac{\Omega_{0z}}{1 + \varepsilon\cos\omega t}, \]
where $\Omega_{0z}$ follows from the initial conditions; this means that $\Omega_z$ is only very weakly
time dependent.
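(b) We suppose that $\omega \ll \Omega_z$, i.e.,
\[ \frac{dI_{xx}}{dt} \approx 0 \quad\text{and}\quad \frac{dI_{yy}}{dt} \approx 0. \]
From this, we find
\[ I_{xx}\dot\Omega_x + \frac{3}{2}I_0\Omega_z\,\varepsilon\cos\omega t\;\Omega_y = 0, \qquad I_{yy}\dot\Omega_y - \frac{3}{2}I_0\Omega_z\,\varepsilon\cos\omega t\;\Omega_x = 0. \tag{13.11} \]
Differentiating again and inserting (13.8), (13.9), and (13.10) yield
\[ \ddot\Omega_x + \frac{1}{I_{xx}I_{yy}}\left(\frac{3}{2}I_0\Omega_z\,\varepsilon\cos\omega t\right)^2\Omega_x = 0, \qquad \ddot\Omega_y + \frac{1}{I_{xx}I_{yy}}\left(\frac{3}{2}I_0\Omega_z\,\varepsilon\cos\omega t\right)^2\Omega_y = 0. \tag{13.12} \]
Since $\omega \ll \Omega_z$ (we further assume that $\omega \ll \varepsilon\Omega_z$), we find
\[ \omega_n = \frac{3}{2}\,\varepsilon\,\Omega_z\cos\omega t \quad\text{(nutation frequency)}, \]
i.e., $\Omega_x$ and $\Omega_y$ perform a nutation motion with $\omega_n$.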
EXERCISE
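\[ I_1 = \sigma\int_0^{2\pi}\!\!\int_0^R y^2\, r\,dr\,d\varphi = \sigma\int_0^{2\pi}\!\!\int_0^R r^3\sin^2\varphi\,dr\,d\varphi = \frac{1}{4}\sigma R^4\int_0^{2\pi}\sin^2\varphi\,d\varphi = \frac{1}{4}\sigma R^4\pi = \frac{1}{4}\,\frac{M}{\pi R^2}\,R^4\pi = \frac{1}{4}MR^2, \tag{13.16} \]
since the surface density σ is given by $\sigma = M/F = M/\pi R^2$. And likewise for $I_2$ and $I_3$
\[ I_1 = I_2 = \frac{1}{2}I_3 = \frac{1}{4}MR^2 . \tag{13.17} \]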
The components of the angular velocity vector are given by
\[ D_1 = D_3 = 0 \quad\text{and}\quad D_2 = -\omega^2\sin\alpha\cos\alpha\;\frac{1}{4}MR^2 . \tag{13.19} \]
Because D = r × F, equal but oppositely directed forces act in the pivots, of magnitude
\[ |\mathbf{F}| = \frac{|D_2|}{2d} = \frac{1}{4d}\cdot\frac{1}{4}MR^2\omega^2\sin 2\alpha = MR^2\omega^2\,\frac{\sin 2\alpha}{16d} \tag{13.20} \]
(see Fig. 13.13).
Fig. 13.13. Geometry and pivoting of the rotating circular disk
EXERCISE
Problem. What torque is needed to rotate an elliptic disk with the half-axes a and
b about the rotation axis 0A with constant angular velocity ω0 ? The rotation axis is
tilted from the large half-axis a by the angle α.
Fig. 13.14.
Solution. We choose the e1 -axis orthogonal to the plane of the drawing, e2 along the
small half-axis b, and e3 along the large half-axis. The principal moments of inertia
are then (M = σ πab, dM = σ dF )
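\[ I_2 = \sigma\int_{-a}^{+a}\!\int_{-\varphi(z)}^{\varphi(z)} z^2\,dy\,dz \quad\text{with}\quad \varphi(z) = b\sqrt{1 - \frac{z^2}{a^2}} \]
\[ \Rightarrow\; I_3 = \frac{1}{4}Mb^2 . \tag{13.22} \]
We can immediately write down $I_1$ (because $I_1 = I_2 + I_3$ for thin plates):
\[ I_1 = \frac{1}{4}M(a^2 + b^2). \tag{13.23} \]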
For ω, we obtain
ω = 0 · e1 − ω0 sin α · e2 + ω0 cos α · e3 .
13.4 The Heavy Symmetric Top: Elementary Considerations
We now consider the motion of the top under the action of gravity. If the bearing point O of the top does not coincide with the center of gravity S, gravity exerts a torque. To distinguish this top from the freely moving top, it is called the heavy top. First we restrict ourselves to the symmetric top which rotates with the angular velocity ω about its figure axis. The origin of the space-fixed coordinate system is set into the bearing point O; the negative z-axis points along the gravity force. Let the distance be $\overrightarrow{OS} = \mathbf{l}$; gravity acts on the top of mass m with the torque D = l × mg. Hence, the angular momentum vector is not constant in time:
\[ \dot{\mathbf{L}} = \mathbf{D} \quad\text{or}\quad d\mathbf{L} = \mathbf{D}\cdot dt . \]
This differential form of the equation of motion expresses that the torque causes a change dL of the angular momentum which is parallel to the torque D.
Fig. 13.15. A heavy top: The center of gravity S and the bearing point O are sketched in
Sommerfeld4 and Klein5 in their book Theory of the Top called this phenomenon—
philosophizing—“Die Tendenz zum gleichsinnigen Parallelismus.” The z-component
of the angular momentum is however conserved. This results from the following con-
sideration: Because g = −gez , we have D = mgez × l, i.e., the torque has no com-
ponent along the z-direction, and hence Lz is constant. Hence the torque D causes a
motion of the angular momentum vector L on a cone about the z-axis; this motion
of the heavy top is called precession. The precession frequency of the top is thereby
constant by reason of symmetry; the relative orientation of the torque and angular
momentum vectors is also constant.
We now calculate the precession frequency. For this purpose we start from the
radial component Lr of the angular momentum:
Lr = L sin ϑ.
ωp × L = D.
4 Arnold Sommerfeld, b. Dec. 5, 1868, Königsberg–d. April 26, 1951, Munich, physicist. From 1897,
professor in Clausthal-Zellerfeld and from 1900 in Aachen, successfully tried for a mathematical
backup of technique. In 1906, Sommerfeld became professor for theoretical physics in Munich,
where he was an excellent academic teacher for generations of physicists (among others P. Debye,
P.P. Ewald, W. Heisenberg, W. Pauli, and H.A. Bethe). He extended Bohr’s ideas in 1915 to the
“Bohr–Sommerfeld theory of atom” and discovered many of the laws for the number, wavelength,
and intensity of spectral lines. His work Atomic Structure and Spectral Lines (Vol. 1, 1919; Vol. 2,
1929) was accepted for decades as a standard work of atomic physics. Further works: Lectures on
Theoretical Physics, six volumes (1942–1962).
5 Felix Klein, b. April 25, 1849, Düsseldorf–d. June 22, 1925, Göttingen. Klein studied from 1865 to
1870 in Bonn. During a stay in 1870 in Paris, he became familiar with the rapidly developing group
theory. From 1871, Klein was a private lecturer in Göttingen, from 1872, professor in Erlangen, from
1875, in Munich, from 1880, in Leipzig and from 1886, in Göttingen. He contributed fundamental
papers on function theory, geometry, and algebra. In particular, group theory and its applications
attracted his interest. In 1872, he published the Program of Erlangen. In later life Klein became
interested in pedagogical and historical problems.
\[ \vartheta_0 - \Delta\vartheta \le \vartheta \le \vartheta_0 + \Delta\vartheta \]
(see Fig. 13.18). If D = 0, there is only the nutation of the figure axis F about the then-invariable straight line L. For D ≠ 0, the precession of L about the z-axis dominates. The nutation is superimposed on this precession.
Fig. 13.18. In general, a nutation is superimposed on the precession
Since in the special case considered here the vectors of angular momentum and
angular velocity coincide with the figure axis of the top, we can write for the angular
momentum
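\[ \mathbf{L} = \Theta_3\boldsymbol{\omega}, \]
where $\Theta_3$ is the moment of inertia about the figure axis. For the torque we then have
the relation
\[ \mathbf{D} = \Theta_3\,\boldsymbol{\omega}_p\times\boldsymbol{\omega}. \]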
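As a consistency check, the relation D = Θ3 ωp × ω can be compared with the gravitational torque D = l × mg. Taking magnitudes gives mgl sin ϑ = Θ3 ωp ω sin ϑ, so the precession frequency is ωp = mgl/(Θ3 ω), independent of the tilt angle ϑ. The Python sketch below verifies the vector identity for one tilt; the coordinate choice (figure axis in the x, z-plane), the function names, and the numbers are assumptions made for illustration.

```python
import math

def cross(a, b):
    """Vector product of two 3-tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def precession_rate(m, l, theta3, omega, g=9.81):
    """omega_p = m*g*l / (Theta3*omega), valid for a fast top (L along the figure axis)."""
    return m * g * l / (theta3 * omega)

def torques(m, l, theta3, omega, tilt, g=9.81):
    """Return (l x m*g, Theta3 * omega_p x omega) for a figure axis tilted by `tilt`."""
    axis = (math.sin(tilt), 0.0, math.cos(tilt))       # assumed: figure axis in the x,z-plane
    lever = tuple(l * c for c in axis)
    gravity_torque = cross(lever, (0.0, 0.0, -m * g))
    omega_p = (0.0, 0.0, precession_rate(m, l, theta3, omega, g))
    spin = tuple(omega * c for c in axis)
    gyro_torque = tuple(theta3 * c for c in cross(omega_p, spin))
    return gravity_torque, gyro_torque

print(precession_rate(m=0.2, l=0.05, theta3=1e-4, omega=300.0))
print(torques(m=0.2, l=0.05, theta3=1e-4, omega=300.0, tilt=math.radians(25.0)))
```

Both torque vectors point along the y-axis and agree, which confirms that the tilt angle drops out of the precession frequency.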
i.e., 47◦ ; it therefore changes its orientation over the millennia. This precession motion
must be distinguished from the nutations of earth (Chandler’s nutations) discussed in
Example 13.1. The latter are superimposed on the precession motion.
Possibly the most important practical application of the top is the gyrocompass. The
idea goes back to Foucault (1852). The gyrocompass consists in principle of a quickly
rotating, semi-cardanic suspended top, with the rotation axis kept in the horizontal
plane by the suspension.
The earth is not an inertial system; it rotates with the angular velocity ωE . Since the
top wants to preserve the orientation of its angular momentum, it is forced to precess
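with $\omega_E$. Hence, there is a top moment D′:
\[ \mathbf{D}' = \Theta_3\,\boldsymbol{\omega}_K\times\boldsymbol{\omega}_E, \]
where we set
\[ \boldsymbol{\omega}_E = \omega_E\sin\varphi\,\mathbf{e}_z + \omega_E\cos\varphi\,\mathbf{e}_N \equiv \boldsymbol{\omega}_{EZ} + \boldsymbol{\omega}_{EN} \]
with ϕ as the geographic latitude. $\mathbf{e}_N$ is a unit vector pointing along the meridian. By
splitting $\boldsymbol{\omega}_E$ one obtains
\[ \mathbf{D}' = \Theta_3\,(\boldsymbol{\omega}_K\times\boldsymbol{\omega}_{EZ} + \boldsymbol{\omega}_K\times\boldsymbol{\omega}_{EN}). \]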
The first term is compensated by the bearing of the top. This part of the top moment tends to turn the AB-axis (see Fig. 13.20(b)). The second term causes a rotation of the top about the z-axis. The splitting of ωK leads to the acting torque
Fig. 13.20. (a) Decomposition of the angular frequency ωE into a vertical and a horizontal component. (b) Semi-cardanic suspension: This top can freely rotate about the AB-axis
6 See W. Greiner: Classical Mechanics: Point Particles and Relativity, 1st ed., Springer, Berlin
(2004), Chapter 28.
Hence, a torque arises which always tends to turn the top along the meridian (α = 0).
If the suspension of the top is damped, the top adjusts along the north–south direction
provided that it is not just at one of the two poles (ϕ = ±90°). Otherwise, it performs
damped pendulum vibrations about the north–south direction. One can therefore use
the top as a direction indicator if one is not close to a pole.
Foucault’s experiments with a “gyroscope” led only to indications of the described
effect. Anschütz-Kaempfe succeeded in constructing the first useful gyrocompass
(1908). To reduce the friction, the gyroscope body—a three-phase current motor—
hangs at a float that floats in a basin of mercury. The top axis is kept horizontal by
placing the center of gravity of the top lower than the buoyancy center (corresponding
to the suspension point). In this setup the gyroscope axis vibrates under the influ-
ence of the moment not only in the horizontal, but also in the vertical plane about the
north–south direction.
Fig. 13.22. Principle of the gyroscope
By an appropriate damping of the latter of these coupled vibrations, one can also
reach a damping of the vibrations in the horizontal plane, which is needed for the
adjustment. The deviations arising from ship vibrations and from other effects could
be removed in more recent construction (multiple gyrocompass), or accounted for by
calculation.
In order to stabilize free motions of bodies, e.g., of a disk or a projectile, these are
set into rapid eigenrotation (spin). A disk thereby maintains its tilted position almost
unchanged, and therefore gets a buoyancy similarly to the wing of a plane and thus
reaches a much larger range of flight than without rotation. A prolate projectile rotat-
ing about its longitudinal axis experiences a torque from the air resistance which tends
to turn the projectile about a center-of-gravity axis perpendicular to the flight direc-
tion. The projectile responds with a kind of precession motion which is very intricate
because of the variable air resistance. The vibrations of the projectile tip remain close
to the tangent of the trajectory, but for a projectile with “right-hand spin” drift off to
the right of the shot plane. The projectile therefore hits the target with the tip ahead,
but on firing one has to account for a right-hand deviation.
The gyroscope torques acting on guided tops tend to turn up the axes of the wheels
of a car passing a bend, which causes an additional pressure on the outer wheels and a
relief of the inner ones. The same gyroscope effect provides an increase of the milling
pressure in grinding mills and finds further application in the turn and bank indicator
of airplanes. If the plane performs a turn, the gyroscope actions on the propeller must
be taken into account.
The top can also serve to stabilize systems (cars) which by their nature are unstable,
as in the one-rail track, or to reduce the vibrations of an internally stable system,
as in Schlick’s ship gyroscope. In the latter device a heavy top with a vertical axis,
driven by a motor, is set into a frame that can rotate about the transverse horizontal
axis. During ship vibrations about the longitudinal axis—these “rolling vibrations”
shall be damped—the top performs vibrations because of the precession about the axis
lying across the ship; ship vibrations and top vibrations represent coupled vibrations.
The top vibrations are appropriately damped by a brake. By the coupling the released
energy is pulled out of the ship vibrations; as a result these are considerably reduced.
As is seen from the above example, for stabilization by a top it is generally essential
that its rotation axis is not fixed relative to the body, but that all degrees of freedom
are available. For this reason a bicycle with a tightly mounted front wheel would not
be stable. Moreover, riding a bicycle without support is also partly based on the laws
of top motion.
An indirect stabilization is used by the devices which control the straight motion of
torpedoes. On deviation from the shot direction, the top activates a relay which causes
the adjustment of the corresponding rudder.
An important problem is to stabilize a horizontal plane so that it remains horizontal
on a moving ship or airplane. This so-called artificial horizon (gyroscope horizon,
flight horizon) could, according to Schuler, be realized by a gravitation pendulum
with a period of 84 minutes (pendulum length = earth radius), since such a pendulum,
even when the suspension point is moved, points to the earth’s center. Useful artificial
horizons could be realized by tops with cardanic suspension (“top pendulum,” center
of gravity below the rotation point).
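The 84-minute period quoted for Schuler's pendulum follows from the formula for a mathematical pendulum, T = 2π√(l/g), with the pendulum length set equal to the earth's radius. A minimal Python check, assuming the rounded values R = 6371 km and g = 9.81 m/s² (neither is given in the text):

```python
import math

EARTH_RADIUS_M = 6.371e6   # assumed mean earth radius
G_SURFACE = 9.81           # assumed gravitational acceleration at the surface

def pendulum_period_s(length_m, g=G_SURFACE):
    """Small-amplitude period of a mathematical pendulum, T = 2*pi*sqrt(l/g)."""
    return 2.0 * math.pi * math.sqrt(length_m / g)

schuler_minutes = pendulum_period_s(EARTH_RADIUS_M) / 60.0
print(schuler_minutes)  # about 84 minutes
```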
We finally note that an ordinary play top that moves with a rounded tip on a hori-
zontal plane, and thus has five degrees of freedom, does not fit the definition of a top,
since in general no point remains fixed during its motion; i.e., translation and rotation
motion cannot be dynamically separated. The fact that a play top with an initially tilted
axis straightens up under sufficiently fast rotation—which also happens, for example,
with a cooked egg—can be explained by a torque created by the friction.
EXERCISE
13.8 Gyrocompass
Problem. A simple gyrocompass consists of a gyroscope that rotates about its axis with the angular velocity ω. Let the moment of inertia about this axis be C and the moment of inertia about a perpendicular axis be A. The suspension of the gyroscope floats on mercury, hence the only acting torque forces the gyroscope axis to stay in the horizontal plane. The gyroscope is brought to the equator. Let the angular velocity of the earth be Ω. What is the response of the gyroscope?
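Solution. Since the earth rotates with angular velocity Ω, the angular momentum in the earth system satisfies
\[ \frac{d\mathbf{L}}{dt} = \mathbf{D} - \boldsymbol{\Omega}\times\mathbf{L}, \]
where D is the total torque. At the equator Ω points along the y-axis, and the z-axis is perpendicular to it.
Fig. 13.23. Ω is the angular velocity of the earth, ω that of the gyroscope
The components of the angular momentum are
\[ L_x = C\omega\sin\varphi, \qquad L_y = C\omega\cos\varphi, \qquad L_z = A\dot\varphi, \qquad \frac{\omega}{\Omega} \gg 1 . \]
Since there are no forces acting in the x,y-plane, $D_z = 0$. Hence, the equation for $L_z$ is
\[ A\ddot\varphi = -C\omega\Omega\,\varphi \]
or
\[ \ddot\varphi + \frac{C}{A}\,\omega\Omega\,\varphi = 0; \quad\text{i.e.,}\quad \ddot\varphi + \omega_r^2\varphi = 0, \qquad \omega_r^2 = \frac{C}{A}\,\omega\Omega . \]
ϕ oscillates with the frequency
\[ \omega_r = \left(\frac{C}{A}\,\omega\Omega\right)^{1/2} \]
in the north–south direction!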
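To get a feeling for the size of this frequency, one can evaluate ωr = (CωΩ/A)^{1/2} for plausible numbers. In the Python sketch below, the ratio C/A = 2 (the thin-disk limit), the spin rate of 100 revolutions per second, and the use of the sidereal day for Ω are illustrative assumptions, not data from the exercise.

```python
import math

SIDEREAL_DAY_S = 86164.1                    # assumed length of the sidereal day in seconds
EARTH_RATE = 2.0 * math.pi / SIDEREAL_DAY_S # Omega, angular velocity of the earth

def gyrocompass_frequency(C, A, omega, earth_rate=EARTH_RATE):
    """omega_r = sqrt((C/A) * omega * Omega) for small oscillations at the equator."""
    return math.sqrt(C / A * omega * earth_rate)

spin = 2.0 * math.pi * 100.0                # assumed spin: 100 revolutions per second
omega_r = gyrocompass_frequency(C=2.0, A=1.0, omega=spin)
period_s = 2.0 * math.pi / omega_r
print(omega_r, period_s)
```

Because ωr grows only with the square root of ωΩ, even a fast gyroscope swings about the north–south direction with a period of many seconds, far slower than its spin.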
EXERCISE
13.9 Tidal Forces, and Lunar and Solar Eclipses: The Saros Cycle7
7 The name goes back to the Chaldeans, a Babylonian tribe. Thales presumably used Babylonian tables for predicting the solar eclipse in 585 B.C. The knowledge of natural science of the Babylonians was highly developed. They had tables for square roots and powers, approximated the number π by 3 1/8, and could solve quadratic equations. The subdivision of the celestial circle into 12 zodiacal symbols and the 360° division of the circle are modern examples of Babylonian nomenclature.
Fig. 13.24.
Problem. The ancient Chinese court astronomers were able to predict lunar and solar eclipses with great reliability. The fact that such eclipses arise only occasionally, while otherwise we have a full moon or a new moon, is caused by the inclination of
the orbital plane of the earth–moon system from the ecliptic, i.e., the orbital plane of
the motion of the common center of gravity about the sun. This inclination is about
5.15◦ . It is not fixed in space but precesses because of the tidal forces exerted by
the sun. This leads to the so-called Saros cycle, which is of great importance for the
prediction of eclipses.
Consider the earth–moon system as a dumbbell-shaped top which rotates about
its center of gravity Sp ; the center of gravity orbits about the sun on a circular path.
The gravitation force between the earth and moon just balances the centrifugal force
resulting from the eigenrotation of the system and thus fixes the almost rigid dumbbell
length r0 . The gravitation of the sun and the centrifugal force due to orbiting about the
sun don’t compensate for each body independently but lead to resulting tidal forces.
These forces create a torque M0 on the top. Calculate M0 for the sketched position
where it just takes its maximum value. Realize that M0 on the (monthly and annual)
average has a quarter of this value. Calculate from this the precession period Tp . Can
you find arguments for why the actual Saros cycle of 18.3 years is notably longer?
Hint: The only data you need for the calculation are the distances r0 , the length of
year, and the length of the sidereal month.
Solution. R0 is defined as the vector pointing from the center of gravity of the sun to
the center of gravity of the earth-moon system. The coordinate origin of the system is
in the center of gravity of the sun.
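Let R be given in cylindrical coordinates:
\[ \mathbf{R} = \mathbf{R}_0 + \Delta\mathbf{R} = R_0\mathbf{e}_r + \Delta R_r\mathbf{e}_r + \Delta R_\varphi\mathbf{e}_\varphi + \Delta R_z\mathbf{e}_z \quad\text{with}\quad |\Delta\mathbf{R}| \ll |\mathbf{R}_0| . \]
We then have
\[ |\mathbf{R}| \approx R_0\left(1 + \frac{2\,\Delta R_r}{R_0}\right)^{1/2} . \]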
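\begin{align}
\mathbf{F}_{Gr}(\mathbf{R}) &= -\frac{\gamma mM}{|\mathbf{R}|^3}\,\mathbf{R} \\
&\approx -\frac{\gamma mM}{R_0^3}\left(1 - \frac{3\,\Delta R_r}{R_0}\right)\left(R_0\mathbf{e}_r + \Delta R_r\mathbf{e}_r + \Delta R_\varphi\mathbf{e}_\varphi + \Delta R_z\mathbf{e}_z\right) \\
&\approx \mathbf{e}_r\left[-\frac{\gamma mM}{R_0^3}\,(R_0 - 2\,\Delta R_r)\right] + \mathbf{e}_\varphi\left[-\frac{\gamma mM}{R_0^3}\,\Delta R_\varphi\right] + \mathbf{e}_z\left[-\frac{\gamma mM}{R_0^3}\,\Delta R_z\right]. \tag{13.27}
\end{align}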
The two masses m and M don’t yet have a special meaning. We now consider the
motion of the earth–moon system as a two-body problem with an external force:
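\[ \mathbf{R}_{CM} := \mathbf{R}_0 = \frac{m_E\mathbf{R}_E + m_M\mathbf{R}_M}{m_E + m_M} \quad\text{(CM means center of mass)}, \]
\[ \mathbf{V}_{CM} = \frac{m_E\mathbf{V}_E + m_M\mathbf{V}_M}{m_E + m_M}, \]
\[ \mathbf{B}_{CM} = \frac{m_E\mathbf{B}_E + m_M\mathbf{B}_M}{m_E + m_M} = \frac{\mathbf{F}_{ES} + \mathbf{F}_{EM} + \mathbf{F}_{ME} + \mathbf{F}_{MS}}{m_E + m_M} \quad\text{(S means sun)}. \]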
According to (13.27),
\[ \Delta\mathbf{R}_E = \mathbf{r}_E, \qquad \Delta\mathbf{R}_M = \mathbf{r}_M, \qquad m_E\mathbf{r}_E = -m_M\mathbf{r}_M ; \]
The last equality follows from the equilibrium condition for the center of gravity. The
magnitude of the gravitational acceleration must be equal to the magnitude of the
circular acceleration. From this, it follows that the center of mass CM rotates with the
frequency ωCM about the sun at the distance R0 .
We further know the following values:
\[ T_{CM} = \frac{2\pi}{\omega_{CM}} = 365\ \text{days}; \qquad R_0 = 149.6\cdot 10^6\ \text{km}. \tag{13.29} \]
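We are mainly interested in the motion of the earth–moon system. To this end we
consider the relative distance $\mathbf{r}_{rel}$ between the earth and moon.
\[ \mathbf{r}_{rel} = \mathbf{R}_E - \mathbf{R}_M, \qquad |\mathbf{r}_{rel}| = r_0, \]
\[ \mathbf{P}_{rel} = \mu(\mathbf{V}_E - \mathbf{V}_M) \quad\text{with}\quad \mu = \frac{m_E\cdot m_M}{m_E + m_M}, \]
\[ \frac{d\mathbf{P}_{rel}}{dt} = \mu(\mathbf{B}_E - \mathbf{B}_M) = \mu\left(\frac{\mathbf{F}_{ES}}{m_E} + \frac{\mathbf{F}_{EM}}{m_E} - \frac{\mathbf{F}_{ME}}{m_M} - \frac{\mathbf{F}_{MS}}{m_M}\right), \]
\[ \mathbf{F}_{EM} = -\gamma\,\frac{m_E m_M}{r_0^3}\,\mathbf{r}_{rel} = -m\,\omega_M^2\,\mathbf{r}_{rel} . \]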
The last equality holds because of the equilibrium condition, as for the center-of-mass
acceleration.
The relative distance also performs a circle with the sidereal period of the moon at
the distance r0 . One has
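\[ T_M = \frac{2\pi}{\omega_M} = 27\ \text{days} + 8\ \text{hours}, \qquad r_0 = r_E + r_M = 0.384\cdot 10^6\ \text{km}. \tag{13.30} \]
\[ \left.\begin{aligned} \mathbf{R}_E &= \mathbf{R}_0 + \frac{m_M}{m_M + m_E}\,\mathbf{r}_{rel} \\ \mathbf{R}_M &= \mathbf{R}_0 - \frac{m_E}{m_M + m_E}\,\mathbf{r}_{rel} \end{aligned}\right\}\ \text{epicycle motion}. \tag{13.31} \]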
where lEM represents a normal vector to the orbital plane of the earth–moon system.
The relation (13.32) does not hold exactly, since the motion of the earth–moon system
is not perfectly circular, but can be well approximated by a circle.
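The coordinate system at the center of gravity is oriented just as at the origin. The
total angular momentum with respect to the sun is evaluated as
\begin{align}
\mathbf{L}_{tot} &= m_E\,\mathbf{R}_E\times\mathbf{V}_E + m_M\,\mathbf{R}_M\times\mathbf{V}_M \\
&= m_E\left(\mathbf{R}_0 + \frac{m_M}{m_M + m_E}\,\mathbf{r}_{rel}\right)\times\left(\mathbf{V}_{CM} + \frac{m_M}{m_M + m_E}\,\mathbf{v}_{rel}\right) \\
&\quad + m_M\left(\mathbf{R}_0 - \frac{m_E}{m_M + m_E}\,\mathbf{r}_{rel}\right)\times\left(\mathbf{V}_{CM} - \frac{m_E}{m_M + m_E}\,\mathbf{v}_{rel}\right) \\
&= (m_E + m_M)\,\mathbf{R}_0\times\mathbf{V}_{CM} + \mu\,\mathbf{r}_{rel}\times\mathbf{v}_{rel} \\
&= (m_E + m_M)\,\omega_{CM}\,R_0^2\;\mathbf{l}_{S-CM} + \mathbf{L}_0 \\
&= \mathbf{L}_{CM} + \mathbf{L}_0 . \tag{13.33}
\end{align}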
As in (13.32), $\mathbf{l}_{S-CM}$ is here a vector normal to the orbital
plane defined by the sun and the center of gravity.
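\[ \frac{d\mathbf{L}_{tot}}{dt} = \mathbf{R}_E\times\mathbf{F}_{ES} + \mathbf{R}_M\times\mathbf{F}_{MS} + (\mathbf{R}_E - \mathbf{R}_M)\times\mathbf{F}_{EM} = 0; \]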
this means
We now consider the resulting torque M0 with respect to the center of mass CM:
The second terms in each bracket drop because the force and position vector have the
same orientation. The third two terms cancel because
rE mE = −rM mM .
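\[ \mathbf{M}_0 = \frac{3\gamma m_E M}{R_0^3}\,\bigl(\Delta R_{rE}\,(\mathbf{r}_{rel}\times\mathbf{e}_r)\bigr). \tag{13.35} \]
Here, $\Delta R_{\varphi E}$ and $\Delta R_{zE}$ were set to zero, according to the definition of the problem. In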
order not to complicate the problem unnecessarily, we put the coordinate system at the
center of gravity and thereby also that at the origin just so that the angular momentum
on the average lies in the x, z-plane. This approach is justified since the precession
frequency to be calculated is notably less than ωCM . During one revolution about the
sun the angular momentum has changed only insignificantly (by about 20◦ ), so that
the ecliptic of the earth–moon system has turned only slightly.
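Ansatz:
\[ \mathbf{r}_{rel} = r_0\begin{pmatrix} \cos\beta \\ \sin\beta \\ 0 \end{pmatrix}, \quad\text{where } \beta \sim \omega_M t. \]
Ansatz:
\[ \mathbf{e}_r = \begin{pmatrix} \cos(\gamma + \varphi_0) \\ \sin(\gamma + \varphi_0) \\ 0 \end{pmatrix}, \quad\text{where } \gamma \sim \omega_{CM} t. \]
one obtains, by using the two approaches, the following new formulation of (13.35):
\[ \mathbf{M}_0 = \frac{3\gamma m_E M}{R_0^3}\,\frac{m_M}{m_E + m_M}\,r_0^2\,\mathbf{v} \]
with
\[ \mathbf{v} = \begin{pmatrix}
-\sin\alpha\cos\alpha\cos^2\beta\,\sin(\gamma+\varphi_0)\cos(\gamma+\varphi_0) - \sin\alpha\cos\beta\sin\beta\,\sin^2(\gamma+\varphi_0) \\[4pt]
\sin\alpha\cos\alpha\cos^2\beta\,\cos^2(\gamma+\varphi_0) + \sin\alpha\cos\beta\sin\beta\,\sin(\gamma+\varphi_0)\cos(\gamma+\varphi_0) \\[4pt]
\cos^2\alpha\cos^2\beta\,\sin(\gamma+\varphi_0)\cos(\gamma+\varphi_0) - \sin\beta\cos\beta\cos\alpha\,\cos^2(\gamma+\varphi_0) \\
\quad + \cos\alpha\cos\beta\sin\beta\,\sin^2(\gamma+\varphi_0) - \sin^2\beta\,\sin(\gamma+\varphi_0)\cos(\gamma+\varphi_0)
\end{pmatrix}. \]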
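by averaging over a full period of β. The moment impact (analogous to the force impact) of $\mathbf{M}_0$ and of $\langle\mathbf{M}_0\rangle_\beta$ has the same value, because of the linearity $\beta = \omega_M t$.
\[ \langle\mathbf{M}_0\rangle_\beta = \frac{3\gamma M\mu}{R_0^3}\,r_0^2 \begin{pmatrix}
-\frac{1}{2}\sin\alpha\cos\alpha\,\sin(\gamma+\varphi_0)\cos(\gamma+\varphi_0) \\[4pt]
\frac{1}{2}\sin\alpha\cos\alpha\,\cos^2(\gamma+\varphi_0) \\[4pt]
\frac{1}{2}\cos^2\alpha\,\sin(\gamma+\varphi_0)\cos(\gamma+\varphi_0) - \frac{1}{2}\sin(\gamma+\varphi_0)\cos(\gamma+\varphi_0)
\end{pmatrix}. \]
The same consideration can be made for the “rotating” angle $\gamma \sim \omega_{CM} t$, since we assume that $\omega_p \ll \omega_{CM}$. We therefore average over $\langle\mathbf{M}_0\rangle_\beta$:
\[ \langle\mathbf{M}_0\rangle_{\beta\gamma} = \frac{3\gamma M\mu}{R_0^3}\,r_0^2 \begin{pmatrix} 0 \\ \frac{1}{4}\sin\alpha\cos\alpha \\ 0 \end{pmatrix} = \frac{3\gamma M\mu}{R_0^3}\,r_0^2\cdot\frac{1}{4}\sin\alpha\cos\alpha\cdot\mathbf{e}_y := \overline{\mathbf{M}}_0 . \tag{13.36} \]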
Not very much is left over from the extended expression; the resulting acting moment
M0 is exactly perpendicular to L0 (see Fig. 13.27) and points along the y-direction.
If the angular momentum moved (slowly), we imagine the shifted angular momentum
as being embedded in a fixed coordinate system and thus get, according to (13.36),
the same result. M0 is constant and always perpendicular to L0 . It therefore causes
a precession.
But first we will illustrate (13.36) that has been obtained in a rather mathematical
way: In this situation, we get a maximum moment. The moment M(β) now points in
the y -direction for all possible β. For β = 90◦ or β = 270◦ , the vector M vanishes.
On the average, we thus obtain
M(β) = 2M0 .
The diagram shows that there are also two “maximum” orientations with respect to
(γ + ϕ0 ). Between these positions M(β) must vanish. One thus obtains altogether
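and with
\[ \omega_M = \frac{2\pi}{T_M}, \]
one gets
\[ \omega_p = \frac{3}{4}\cos\alpha\,\frac{4\pi^2}{T_{CM}^2}\cdot\frac{T_M}{2\pi} = \frac{3}{2}\,\pi\cos\alpha\,\frac{T_M}{T_{CM}^2}, \tag{13.38} \]
and for $T_p$
\[ T_p = \frac{2\pi}{\omega_p} = \frac{4}{3}\cdot\frac{1}{\cos\alpha}\cdot\frac{T_{CM}^2}{T_M} . \tag{13.39} \]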
With TCM ≈ 365.25 days, TM ≈ 27.3 days, and α ≈ 5.5◦ , one obtains
Tp ≈ 17.9 years. (13.40)
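The number in (13.40) is quickly reproduced from (13.39). The Python sketch below uses exactly the three inputs quoted above; only the function name is an invention for this illustration.

```python
import math

def precession_period_days(t_cm_days, t_m_days, alpha_deg):
    """T_p = (4/3) * (1/cos(alpha)) * T_CM^2 / T_M, see (13.39)."""
    alpha = math.radians(alpha_deg)
    return 4.0 / (3.0 * math.cos(alpha)) * t_cm_days**2 / t_m_days

T_CM_DAYS = 365.25   # length of the year, as quoted in the text
T_M_DAYS = 27.3      # sidereal month, as quoted in the text
ALPHA_DEG = 5.5      # inclination used in the text for (13.40)

t_p_years = precession_period_days(T_CM_DAYS, T_M_DAYS, ALPHA_DEG) / T_CM_DAYS
print(t_p_years)  # about 17.9 years
```

Because cos α is close to 1 for small inclinations, the result is insensitive to the exact value of α; it is governed almost entirely by the ratio T_CM²/T_M.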
The fact that the actual Saros cycle is larger by about 2% is partly due to the approximation made when averaging over γ (the angular momentum actually moves slightly), and possibly also due to the elliptic path of the moon about the earth. In any case, the result is relatively accurate, considering the approximations made.
From (13.34), it further follows that
L̇CM = −L̇0 ,
i.e., the “large” angular momentum vector LCM runs through an opposite precession
cone.
A rotation by 90° about the x-axis, followed by a 90° rotation about the y-axis (upper figure), leads to a different result than rotating first about the y-axis and then about the x-axis (lower figure) (noncommutativity of finite rotations).
8 Leonhard Euler, Swiss mathematician, b. April 15, 1707, Basel, Switzerland; d. September 18, 1783, Saint Petersburg, Russia. Son of a clergyman, Euler studied mathematics in Basel with Johann Bernoulli, and in 1727 was appointed professor of physics and mathematics at the university of Saint Petersburg. With an extended break from 1741 to 1766 as a member of the Berlin Academy of Sciences, he spent the rest of his life in Saint Petersburg. Euler authored more than 800 scientific papers on nearly all subjects in mathematics, especially on calculus, calculus of variations, and the theory of complex functions. He contributed eminently also to the theory of numbers, and his solution to the problem of the Seven Bridges of Königsberg laid the foundation of graph theory and topology. In physics, Euler worked mainly in hydrodynamics, the theory of elasticity, and the theory of the top.
The Euler angles are defined as follows: The first rotation is performed about the z-axis by the angle α. The x- and y-axes turn into the X- and Y-axes. The Z-axis coincides with the z-axis. The X, Y, Z system defined in this way is a first intermediate system which is only used to keep the calculation transparent. For the unit vectors, we have
\[
\begin{aligned}
\mathbf{i} &= \cos\alpha\,\mathbf{I} - \sin\alpha\,\mathbf{J},\\
\mathbf{j} &= \sin\alpha\,\mathbf{I} + \cos\alpha\,\mathbf{J},\\
\mathbf{k} &= \mathbf{K}.
\end{aligned} \tag{13.41a}
\]
The second rotation is performed about the X-axis by the angle β; the Y- and Z-axes then turn into the Y′- and Z′-axes, while the X′-axis coincides with the X-axis. For the unit vectors, we have
\[
\begin{aligned}
\mathbf{I} &= \mathbf{I}',\\
\mathbf{J} &= \cos\beta\,\mathbf{J}' - \sin\beta\,\mathbf{K}',\\
\mathbf{K} &= \sin\beta\,\mathbf{J}' + \cos\beta\,\mathbf{K}'.
\end{aligned} \tag{13.41b}
\]
The third rotation is performed about the Z′-axis by the angle γ; the X′- and Y′-axes then turn into the x′- and y′-axes, respectively. The z′-axis is identical with the Z′-axis. The x′, y′, z′ system constructed this way is the desired body-fixed coordinate system. For the unit vectors, one obtains
\[
\begin{aligned}
\mathbf{I}' &= \cos\gamma\,\mathbf{i}' - \sin\gamma\,\mathbf{j}',\\
\mathbf{J}' &= \sin\gamma\,\mathbf{i}' + \cos\gamma\,\mathbf{j}',\\
\mathbf{K}' &= \mathbf{k}'.
\end{aligned} \tag{13.41c}
\]
Fig. 13.31. The first two Euler angles
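Using the relations between the unit vectors, we now determine the unit vectors i, j, k as functions of i′, j′, k′. For this purpose, we insert
\[
\begin{aligned}
\mathbf{i} &= \cos\alpha\,\mathbf{I} - \sin\alpha\,\mathbf{J}\\
&= \cos\alpha\,\mathbf{I}' - \sin\alpha\cos\beta\,\mathbf{J}' + \sin\alpha\sin\beta\,\mathbf{K}'\\
&= \cos\alpha\cos\gamma\,\mathbf{i}' - \cos\alpha\sin\gamma\,\mathbf{j}' - \sin\alpha\cos\beta\sin\gamma\,\mathbf{i}' - \sin\alpha\cos\beta\cos\gamma\,\mathbf{j}' + \sin\alpha\sin\beta\,\mathbf{k}'\\
&= (\cos\alpha\cos\gamma - \sin\alpha\cos\beta\sin\gamma)\,\mathbf{i}' + (-\cos\alpha\sin\gamma - \sin\alpha\cos\beta\cos\gamma)\,\mathbf{j}' + \sin\alpha\sin\beta\,\mathbf{k}'.
\end{aligned} \tag{13.41d}
\]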
The rotations can also be expressed by the corresponding rotation matrices. For the
first rotation, we have
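\[
\mathbf{r} = \hat{A}\,\mathbf{R},
\]
where
\[
\hat{A} = \begin{pmatrix} \cos\alpha & -\sin\alpha & 0\\ \sin\alpha & \cos\alpha & 0\\ 0 & 0 & 1 \end{pmatrix}
\quad\text{and}\quad
\mathbf{R} = \begin{pmatrix} X\\ Y\\ Z \end{pmatrix}.
\]
The matrices for the rotations by the angles β and γ accordingly read
\[
\hat{B} = \begin{pmatrix} 1 & 0 & 0\\ 0 & \cos\beta & -\sin\beta\\ 0 & \sin\beta & \cos\beta \end{pmatrix},
\qquad
\hat{C} = \begin{pmatrix} \cos\gamma & -\sin\gamma & 0\\ \sin\gamma & \cos\gamma & 0\\ 0 & 0 & 1 \end{pmatrix}.
\]
The matrix $\hat{D}$ of the entire rotation is the product of the three matrices, $\hat{D} = \hat{A}\hat{B}\hat{C}$.
Hence,
\[
\mathbf{r} = \hat{D}\,\mathbf{r}' \quad\text{or}\quad \mathbf{r}' = \hat{D}^{-1}\mathbf{r} = \hat{D}^{\mathsf{T}}\mathbf{r}.
\]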
Since the rotation matrices are orthogonal, the inverse matrix equals the transposed
one. By calculating the matrix product, one can easily show that the matrix D agrees
with the relations derived for the unit vectors. (This agrees with the general consid-
erations from Chap. 30 of Classical Mechanics: Point Particles and Relativity of the
Lectures on Theoretical Physics.)
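The agreement between the matrix product and the unit-vector relations can be checked numerically. The sketch below multiplies the three rotation matrices for one arbitrarily chosen set of angles and compares the first row of the product with the coefficients of (13.41d); it also checks orthogonality. The helper names (`rot_z`, `rot_x`, `euler_matrix`) and the test angles are assumptions made for this illustration.

```python
import math

def matmul(p, q):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(m):
    return [[m[j][i] for j in range(3)] for i in range(3)]

def rot_z(angle):
    """Rotation matrix of the form used for A (angle alpha) and C (angle gamma)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(angle):
    """Rotation matrix of the form used for B (angle beta)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def euler_matrix(alpha, beta, gamma):
    """D = A B C."""
    return matmul(matmul(rot_z(alpha), rot_x(beta)), rot_z(gamma))

alpha, beta, gamma = 0.3, 0.7, 1.1  # arbitrary test angles in radians
D = euler_matrix(alpha, beta, gamma)

ca, sa = math.cos(alpha), math.sin(alpha)
cb, sb = math.cos(beta), math.sin(beta)
cg, sg = math.cos(gamma), math.sin(gamma)
# Coefficients of i', j', k' in the expansion of i, cf. (13.41d).
expected_row = [ca * cg - sa * cb * sg, -ca * sg - sa * cb * cg, sa * sb]

row_matches = all(abs(D[0][j] - expected_row[j]) < 1e-12 for j in range(3))
product = matmul(D, transpose(D))  # should be the unit matrix
is_orthogonal = all(abs(product[i][j] - (1.0 if i == j else 0.0)) < 1e-12
                    for i in range(3) for j in range(3))
print(row_matches, is_orthogonal)
```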
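We first calculate the angular velocity ω of the top as a function of the Euler angles. If (i, j, k) define the laboratory system and (i′, j′, k′) a body-fixed system of principal axes, for the angular velocity we have
\[
\boldsymbol{\omega} = \omega_\alpha\,\mathbf{k} + \omega_\beta\,\mathbf{I} + \omega_\gamma\,\mathbf{K}' = \dot\alpha\,\mathbf{k} + \dot\beta\,\mathbf{I} + \dot\gamma\,\mathbf{K}',
\]
where we presuppose that k, I, and K′ are not coplanar. We utilize the derived relations between the unit vectors and obtain
\[
\begin{aligned}
T &= \frac{1}{2}\left(\Theta_1\omega_1^2 + \Theta_2\omega_2^2 + \Theta_3\omega_3^2\right)\\
&= \frac{1}{2}\Theta_1(\dot\alpha\sin\beta\sin\gamma + \dot\beta\cos\gamma)^2
+ \frac{1}{2}\Theta_2(\dot\alpha\sin\beta\cos\gamma - \dot\beta\sin\gamma)^2
+ \frac{1}{2}\Theta_3(\dot\alpha\cos\beta + \dot\gamma)^2.
\end{aligned} \tag{13.43}
\]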
If $\Theta_1 = \Theta_2$, i.e., the top is symmetric, the above expression simplifies to
\[
T = \frac{1}{2}\Theta_1(\dot\alpha^2\sin^2\beta + \dot\beta^2) + \frac{1}{2}\Theta_3(\dot\alpha\cos\beta + \dot\gamma)^2. \tag{13.44}
\]
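The reduction of (13.43) to (13.44) for the symmetric top can be checked numerically. The following sketch evaluates both expressions for arbitrary test values; it assumes the body-fixed components ω1 = α̇ sin β sin γ + β̇ cos γ, ω2 = α̇ sin β cos γ − β̇ sin γ, ω3 = α̇ cos β + γ̇. The function names and the numbers are illustrative only.

```python
import math

def kinetic_energy_general(th1, th2, th3, a_dot, b_dot, g_dot, beta, gamma):
    """Kinetic energy of the top in terms of the Euler angles, cf. (13.43)."""
    w1 = a_dot * math.sin(beta) * math.sin(gamma) + b_dot * math.cos(gamma)
    w2 = a_dot * math.sin(beta) * math.cos(gamma) - b_dot * math.sin(gamma)
    w3 = a_dot * math.cos(beta) + g_dot
    return 0.5 * (th1 * w1**2 + th2 * w2**2 + th3 * w3**2)

def kinetic_energy_symmetric(th1, th3, a_dot, b_dot, g_dot, beta):
    """Simplified form for Theta_1 = Theta_2, cf. (13.44)."""
    return (0.5 * th1 * (a_dot**2 * math.sin(beta)**2 + b_dot**2)
            + 0.5 * th3 * (a_dot * math.cos(beta) + g_dot)**2)

# Arbitrary test values; gamma drops out when Theta_1 = Theta_2.
th1, th3 = 0.8, 0.3
a_dot, b_dot, g_dot, beta = 1.2, -0.4, 5.0, 0.6
t_general = kinetic_energy_general(th1, th1, th3, a_dot, b_dot, g_dot, beta, gamma=0.9)
t_symmetric = kinetic_energy_symmetric(th1, th3, a_dot, b_dot, g_dot, beta)
print(abs(t_general - t_symmetric) < 1e-12)
```

The cross terms in β̇ and α̇ sin β cancel only if the first component contains cos γ, which is why the angle γ disappears from (13.44).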
For the special case of the heavy symmetric top, we will determine the explicit equations of motion and the constants of motion, starting from the Euler equations. For simplification, we note that for the symmetric top the two orientations of the principal axes $\mathbf{e}_{x'}$, $\mathbf{e}_{y'}$ can be arbitrarily chosen in a plane perpendicular to $\mathbf{e}_{z'}$. We therefore choose a coordinate system where the angle γ always vanishes. This system is then no longer body-fixed (it does not rotate with the top about the $\mathbf{e}_{z'}$-axis). The axes $\mathbf{e}_z$, $\mathbf{e}_{z'}$, $\mathbf{e}_{y'}$ are then coplanar, as are $\mathbf{e}_x$, $\mathbf{e}_{x'}$, $\mathbf{e}_y$. This is illustrated in Fig. 13.33.
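We thereby have inverted the relations (13.41a) and (13.41b). By means of the expressions for $\mathbf{e}_{x'}$, $\mathbf{e}_{y'}$, $\mathbf{e}_{z'}$, one can now easily check the triple scalar products $\mathbf{e}_z \cdot (\mathbf{e}_{z'} \times \mathbf{e}_{y'})$ and $\mathbf{e}_x \cdot (\mathbf{e}_{x'} \times \mathbf{e}_y)$, e.g.,
\[
\mathbf{e}_x \cdot (\mathbf{e}_{x'} \times \mathbf{e}_y) = \det\begin{pmatrix} 1 & 0 & 0\\ \cos\alpha & \sin\alpha & 0\\ 0 & 1 & 0 \end{pmatrix} = 0.
\]
Similarly, one shows the vanishing of the other triple scalar product and thus confirms that the corresponding vectors are coplanar.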
The coordinate system thus follows the precession (with α̇) and the nutation (with β̇) of the top, but not its eigenrotation. To realize that β̇ describes the nutation, we note that a nutation motion of the figure axis is superimposed onto the precession (compare the discussion in the section “Elementary considerations on the heavy top”). This manifests itself for β in an up-and-down motion (vibration) about a fixed value β0 (see Fig. 13.34).
Fig. 13.34. Precession and nutation of the angular momentum: The figure axis points along $L_{z'}$
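For the angular velocities (13.42) of the $\mathbf{e}_{x'}$, $\mathbf{e}_{y'}$, $\mathbf{e}_{z'}$ system (which is only partly body-fixed) relative to the laboratory system $\mathbf{e}_x$, $\mathbf{e}_y$, $\mathbf{e}_z$, in this system (γ = 0) we have
\[
\begin{aligned}
\omega_1 &= \omega_{x'} = \dot\beta,\\
\omega_2 &= \omega_{y'} = \dot\alpha\sin\beta,\\
\omega_3 &= \omega_{z'} = \dot\alpha\cos\beta,
\end{aligned} \tag{13.46a}
\]
or
\[
\begin{aligned}
\boldsymbol{\omega}_K &= \omega_{x'}\mathbf{e}_{x'} + \omega_{y'}\mathbf{e}_{y'} + (\omega_{z'} + \omega_0)\mathbf{e}_{z'}\\
&= \dot\beta\,\mathbf{e}_{x'} + \dot\alpha\sin\beta\,\mathbf{e}_{y'} + (\dot\alpha\cos\beta + \omega_0)\,\mathbf{e}_{z'}.
\end{aligned} \tag{13.47}
\]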
Here, ω0 is the additional angular velocity of the top relative to the $\mathbf{e}_{x'}$, $\mathbf{e}_{y'}$, $\mathbf{e}_{z'}$ system. The angular velocity ω0(t) in general depends on the time. We must take care also when calculating the angular momentum, because in this particular $\mathbf{e}_{x'}$, $\mathbf{e}_{y'}$, $\mathbf{e}_{z'}$ system the rigid body still rotates with the angular velocity $\omega_0\mathbf{e}_{z'}$. We can call this additional rotation spin. It is due to the particular choice of our (not exactly body-fixed) system
because $\mathbf{e}_{z'} \times \mathbf{e}_z = -\sin\beta\,\mathbf{e}_{x'}$, as is easily seen from (13.45). By inserting this into the Euler equations (13.5) and (13.49) and noting that $\Theta_1 = \Theta_2$, we obtain
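From the above system of equations, α(t), β(t), and ω0(t) can be determined. From the third equation, we have, because $\Theta_3 \neq 0$,
\[
\ddot\alpha\cos\beta - \dot\alpha\dot\beta\sin\beta + \dot\omega_0 = \frac{d}{dt}(\dot\alpha\cos\beta + \omega_0) = 0 \tag{13.51a}
\]
or
\[
\dot\alpha\cos\beta + \omega_0 = A = \text{constant},
\]
i.e., the angular momentum component $L_{z'} = \Theta_3 A$ (see (13.48)!) about the figure axis is constant.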
We therefore set α̇ cos β + ω0 = A, calculate ω0 from this and insert it into the first
two equations. We then obtain two coupled differential equations for precession (α)
and nutation (β), respectively:
We first investigate this system for the case that the top performs no nutation. Then
β̈ = β̇ = 0, and β > 0. By insertion, we obtain
For a top rotating quickly about the $\mathbf{e}_{z'}$-axis, A becomes very large, and the fraction in the radicand becomes very small. We terminate the expansion of the root after the second term and get as solutions to first order for $\dot\alpha_{\text{small}}$:
\[
\dot\alpha_{\text{small}} = \frac{mgl}{\Theta_3 A}; \tag{13.54a}
\]
to zeroth order for $\dot\alpha_{\text{large}}$, we have
\[
\dot\alpha_{\text{large}} = \frac{\Theta_3}{\Theta_1\cos\beta}\,A. \tag{13.54b}
\]
A stationary precession without nutation (regular precession) occurs only if the heavy symmetric top gets a certain precession velocity ($\dot\alpha_{\text{small}}$ or $\dot\alpha_{\text{large}}$) by an impact. In the general case the precession is always coupled with a nutation. The heavy top will always begin its motion with a deviation toward the direction of the gravitational force, i.e., with a nutation. We also note that $\dot\alpha_{\text{small}}$ agrees with the precession frequency
\[
\omega_p = \frac{mgl}{L} = \frac{mgl}{\Theta_3\omega_0} \approx \frac{mgl}{\Theta_3 A}
\]
obtained in the section “The heavy symmetric top: Elementary considerations.”
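The quality of the approximations (13.54a) and (13.54b) can be illustrated numerically. The sketch below assumes that the no-nutation condition takes the form of the quadratic equation Θ1 cos β α̇² − Θ3 A α̇ + mgl = 0, which is consistent with both limiting solutions but is stated here as an assumption; the function name and all parameter values are hypothetical.

```python
import math

def regular_precession_rates(theta1, theta3, spin_a, mgl, beta):
    """Both roots of the assumed condition theta1*cos(beta)*x**2 - theta3*A*x + mgl = 0.

    Requires cos(beta) != 0 and a non-negative radicand.
    """
    c = math.cos(beta)
    radicand = (theta3 * spin_a)**2 - 4.0 * theta1 * c * mgl
    if radicand < 0.0:
        raise ValueError("spin too small: no regular precession at this angle")
    root = math.sqrt(radicand)
    slow = (theta3 * spin_a - root) / (2.0 * theta1 * c)
    fast = (theta3 * spin_a + root) / (2.0 * theta1 * c)
    return slow, fast

# Hypothetical fast top (SI units): Theta_1, Theta_3 in kg m^2, A in 1/s, mgl in J.
theta1, theta3, spin_a, mgl, beta = 0.02, 0.01, 200.0, 1.0, 0.5
slow, fast = regular_precession_rates(theta1, theta3, spin_a, mgl, beta)

slow_approx = mgl / (theta3 * spin_a)                       # (13.54a)
fast_approx = theta3 * spin_a / (theta1 * math.cos(beta))   # (13.54b)
print(slow, slow_approx)
print(fast, fast_approx)
```

For these values both approximations deviate from the exact roots by less than one percent; the deviation shrinks further as A grows.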
Before we continue the discussion on the general motion of the top, we determine
additional constants of motion. We have already seen that from the last equation of the
system (13.50) we have
or
\[
\frac{d}{dt}(-mgl\cos\beta) = \frac{d}{dt}\left(\frac{1}{2}\Theta_1\dot\beta^2 + \frac{1}{2}\Theta_1\dot\alpha^2\sin^2\beta\right). \tag{13.56}
\]
This means that the energy (more precisely, the sum of the kinetic parts $T_1 + T_2$ plus the potential energy)
\[
E' = \frac{1}{2}\Theta_1(\dot\beta^2 + \dot\alpha^2\sin^2\beta) + mgl\cos\beta \tag{13.57}
\]
is also a constant of motion. This must be so, of course, since the total energy of the top must be constant.
The total energy of the top is then
\[
E = E' + T_3 = \frac{1}{2}\Theta_1(\dot\beta^2 + \dot\alpha^2\sin^2\beta) + \frac{1}{2}\Theta_3(\dot\alpha\cos\beta + \omega_0)^2 + mgl\cos\beta. \tag{13.58}
\]
The last term obviously describes the potential energy of the top in the gravitational field. The energy law (13.58) must of course hold in general. We could have written it down immediately and skipped the derivation (13.56) from the Euler equations. Nevertheless, it is of interest to see that it also follows from the equations of motion.
In the second Euler equation (13.51d), we insert
Since Lz is constant, this is a total differential, and then it follows that
This constant is the z-component of the angular momentum in the space-fixed system.
This is seen immediately if we multiply the angular momentum
The two angular momentum components $L_z$ and $L_{z'}$ are constant, because the moment of the gravitational force acts only in the $\mathbf{e}_{x'}$-direction, i.e., perpendicular both to the z-axis and to the z′-axis. The conditions $L_z$ = constant and $L_{z'}$ = constant can be realized by a precession of L about the z-axis, and an additional rotation of the z′-axis about the L-axis. The latter motion is the nutation. This obviously means that the angular momentum L precesses about the laboratory axis $\mathbf{e}_z$, and the figure axis $\mathbf{e}_{z'}$ simultaneously performs a nutation about the angular momentum L.
With the constants of motion we will now further discuss the motion of the top.
From the equation of the angular momentum component Lz in the laboratory system,
we determine α̇:
\[
\dot\alpha = \frac{L_z - L_{z'}\cos\beta}{\Theta_1\sin^2\beta}, \tag{13.66}
\]
and insert this into (13.58):
u = cos β, (13.67)
In general, the function f(u) has three zeros. Because of its asymptotic behavior for large, positive u and because f(1) < 0, for one zero we have $u_3 > 1$.
For the motion of the top, we must have $\dot u^2 \ge 0$. Since 0 ≤ β ≤ π/2, we have 0 ≤ u ≤ 1. To ensure that the top moves at all in the physically relevant region 0 ≤ u ≤ 1, in a certain interval of this region we must have $\dot u^2 = f(u) > 0$. Hence, for physical reasons two physically interesting zeros $u_1$, $u_2$ must exist between zero and unity. Therefore in the general case there are two corresponding angles β1 and β2 with
In special cases, we can have (1) $u_1 = u_2 \neq 1$ and (2) $u_1 = u_2 = 1$. We first consider these special cases:
(1) $u_1 = u_2 \neq 1$: The tip of the figure axis orbits on a circle (this is called stationary precession); no nutation occurs (the angle β has a fixed value). According to (13.66) the precession velocity reads
\[
\dot\alpha = \frac{\gamma - \delta u}{1 - u^2} \tag{13.74}
\]
and is constant.
(2) u1 = u2 = 1: In this case, the figure axis points vertically upward. The top
performs neither nutation nor precession motion (sleeping top). This is obviously a
special case of the stationary precession (compare Exercise 13.8).
In the general case ($u_1 \neq u_2$), a nutation of the top is superimposed on the precession between the angles β1 and β2. According to the angular momentum law (13.65), (13.66) for the precession velocity, we have
\[
\dot\alpha = \frac{L_z - L_{z'}\cos\beta}{\Theta_1\sin^2\beta} = \frac{\gamma - \delta u}{1 - u^2}. \tag{13.75}
\]
Fig. 13.36. γ/δ = u2
The zeros of this equation, i.e., the solution of α̇(u) = 0, specify those angles β at which the precession velocity α̇ momentarily vanishes. In order to illustrate the gyroscope motion, we give the curve described by the intersection point of the figure axis on a sphere centered about the bearing point.
Fig. 13.37. u1 < γ/δ < u2
There are three different types of motion, as illustrated in Figs. 13.36, 13.37,
and 13.38.
(1) γ/δ = u2: The precession velocity just vanishes at β2; hence, the peaks appear.
(2) u1 < γ/δ < u2: The upper peaks at β2 extend to loops. The precession velocity vanishes between β2 and β1.
(3) γ/δ > u2: The precession velocity would vanish beyond β2 (as indicated in Fig. 13.38). A peak cannot arise.
In the case of the so-called “sleeping top,” the figure axis points up vertically, so that
neither nutation nor precession occurs. For this special case, we must have β = 0 and
β̇ = 0.
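From energy conservation, we obtain
\[
\frac{1}{2}\Theta_1(\dot\beta^2 + \dot\alpha^2\sin^2\beta) + \frac{1}{2}\Theta_3 A^2 + mgl\cos\beta = E, \tag{13.76}
\]
and because
\[
\beta = 0, \qquad \dot\beta = 0,
\]
it follows that
\[
\Theta_3 A = K. \tag{13.79}
\]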
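For the quantities ε, ξ, γ, and δ in the differential equation for the nutation motion in u = cos β, we have
\[
\begin{aligned}
\varepsilon &= \frac{2\left(E - \tfrac{1}{2}\Theta_3 A^2\right)}{\Theta_1} = \frac{2mgl}{\Theta_1},\\
\xi &= \frac{2mgl}{\Theta_1},\\
\gamma &= \frac{K}{\Theta_1} = \frac{\Theta_3 A}{\Theta_1},\\
\delta &= \frac{\Theta_3 A}{\Theta_1},
\end{aligned} \tag{13.80}
\]
\[
\Rightarrow\quad \varepsilon = \xi \quad\text{and}\quad \gamma = \delta.
\]
Inserting this into the differential equation (13.71) for u, one obtains
\[
\bar u = \frac{\gamma^2}{\varepsilon} - 1 = \frac{\Theta_3^2 A^2}{\Theta_1\, 2mgl} - 1. \tag{13.83}
\]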
Accordingly, f(u) has one of the two courses (see Fig. 13.39).
Fig. 13.39.
For Fig. 13.39(a), β̇ actually vanishes since f(u) has a zero. Thus, we can have β = constant ≠ 0, i.e., the case of stationary precession. But since we also require β = 0 for the “sleeping top,” only Fig. 13.39(b) is left over, where β ≠ 0 does not exist as a solution (u1 ≥ 1).
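Hence, from (13.83) we obtain as condition equations for the “sleeping top”:
\[
\frac{\Theta_3^2 A^2}{\Theta_1\, 2mgl} - 1 \ge 1 \quad\Leftrightarrow\quad A^2 \ge \frac{4mgl\,\Theta_1}{\Theta_3^2}. \tag{13.84}
\]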
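The condition (13.84) can be turned into a small numerical check. The following sketch evaluates the zero ū from (13.83) and the critical spin below which the top can no longer “sleep”; the function names and the parameter values are hypothetical.

```python
import math

def u_bar(theta1, theta3, spin_a, mgl):
    """Zero of f(u) according to (13.83)."""
    return (theta3 * spin_a)**2 / (theta1 * 2.0 * mgl) - 1.0

def critical_spin(theta1, theta3, mgl):
    """Smallest A satisfying (13.84): A**2 = 4*m*g*l*Theta_1 / Theta_3**2."""
    return math.sqrt(4.0 * mgl * theta1) / theta3

def can_sleep(theta1, theta3, spin_a, mgl):
    """True if the sleeping-top condition (13.84) is fulfilled."""
    return spin_a**2 >= 4.0 * mgl * theta1 / theta3**2

# Hypothetical top (SI units).
theta1, theta3, mgl = 0.02, 0.01, 1.0
a_crit = critical_spin(theta1, theta3, mgl)
print(a_crit, u_bar(theta1, theta3, a_crit, mgl))
print(can_sleep(theta1, theta3, 1.5 * a_crit, mgl),
      can_sleep(theta1, theta3, 0.5 * a_crit, mgl))
```

At the critical spin the zero ū lies exactly at u = 1; friction then lowers A below this value and the precession with nutation sets in.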
1 2mgl 23
Equation (13.84) will be satisfied only in the initial phase of the gyroscope motion. Because of friction, $A^2 = (\omega_3 + \omega_0)^2$ decreases, so that
\[
A^2 < \frac{4mgl\,\Theta_1}{\Theta_3^2}
\]
and therefore one observes precession with overlaid nutation. Further energy loss inevitably causes the top to tilt down.
EXAMPLE
(a) Write the total energy of the top as a function of the Euler angles.
(b) Determine the constants of motion, and use them to eliminate the Euler angles
α and γ from the energy law. Propose approaches for solving the resulting one-
dimensional differential equation:
\[
E = \frac{1}{2}\Theta_1\dot\beta(t)^2 + V_{\text{eff}}(\beta). \tag{13.85}
\]
2
(c) Discuss the effective potential Veff (β), and solve the differential equation of the
heavy top for infinitesimal displacements from the stable position in the minimum
of the potential:
β(t) = β0 + η(t).
Fig. 13.40.
We consider the symmetric top bound to a fixed point in the gravitation field. The
energy law reads
E = T + V, (13.86)
where
\[
T = \frac{1}{2}\sum_i \Theta_i\omega_i^2 \quad\text{and}\quad V = M\cdot g\cdot h. \tag{13.87}
\]
M is the mass of the top, and h = l cos β is the distance between the center of gravity and the bearing plane.
In order to express the angular velocity ω = (ωx , ωy , ωz ) by the Euler angles and
their time derivatives, we note that α̇, β̇, γ̇ are rotation velocities by themselves. The
rotation velocity of the body is obtained as the vector sum
The vectors $\mathbf{e}_\gamma$, $\mathbf{e}_\alpha$, $\mathbf{e}_\beta$ follow from the definition of the Euler angles:
γ: Rotation about the new (body-fixed) z′-axis:
One thus obtains the components of the rotation velocity in the body-fixed coordinate
system
By inserting this into (13.87) and using $\Theta_1 = \Theta_2$, we obtain for the kinetic energy
\[
T = \frac{1}{2}\Theta_1(\dot\alpha^2\sin^2\beta + \dot\beta^2) + \frac{1}{2}\Theta_3(\dot\alpha\cos\beta + \dot\gamma)^2. \tag{13.93}
\]
Since the gravitational force acts only along the z-direction, the torque acts only along
the nodal line eβ .
D = r × F = −Mglez × ez
= −Mgl sin βeβ .
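Thus, the angular momentum components in the $\mathbf{e}_z$, $\mathbf{e}_{z'}$-plane remain unchanged. $L_z$ and $L_{z'}$ are constants of motion:
\[
\begin{aligned}
L_z &= \mathbf{L}\cdot\mathbf{e}_z\\
&= \Theta_1(\dot\alpha\sin\beta\sin\gamma + \dot\beta\cos\gamma)(\sin\gamma\sin\beta)\\
&\quad + \Theta_1(\dot\alpha\sin\beta\cos\gamma - \dot\beta\sin\gamma)(\cos\gamma\sin\beta)\\
&\quad + \Theta_3(\dot\alpha\cos\beta + \dot\gamma)(\cos\beta)\\
&= \Theta_1(\dot\alpha\sin^2\beta) + \Theta_3\cos\beta(\dot\alpha\cos\beta + \dot\gamma).
\end{aligned} \tag{13.95}
\]
Here, we utilized (13.92). Equations (13.94) and (13.95) can be inverted, i.e., solved for γ̇ and α̇:
\[
\dot\alpha = \frac{L_z - L_{z'}\cos\beta}{\Theta_1\sin^2\beta}, \tag{13.96}
\]
\[
\dot\gamma = L_{z'}\left(\frac{1}{\Theta_3} + \frac{\cot^2\beta}{\Theta_1}\right) - \frac{L_z\cos\beta}{\Theta_1\sin^2\beta}. \tag{13.97}
\]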
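The inversion leading to (13.96) and (13.97) can be verified by a numerical round trip. The sketch below assumes that the component along the figure axis has the form $L_{z'} = \Theta_3(\dot\alpha\cos\beta + \dot\gamma)$, as suggested by the last term of (13.95); the function names and test values are illustrative.

```python
import math

def angular_momenta(theta1, theta3, a_dot, g_dot, beta):
    """Return (L_z, L_z') for given Euler-angle velocities.

    L_z follows (13.95); L_z' = Theta_3*(a_dot*cos(beta) + g_dot) is assumed.
    """
    l_zp = theta3 * (a_dot * math.cos(beta) + g_dot)
    l_z = theta1 * a_dot * math.sin(beta)**2 + l_zp * math.cos(beta)
    return l_z, l_zp

def euler_rates(theta1, theta3, l_z, l_zp, beta):
    """Return (a_dot, g_dot) from the constants of motion, cf. (13.96), (13.97)."""
    s, c = math.sin(beta), math.cos(beta)
    a_dot = (l_z - l_zp * c) / (theta1 * s * s)
    g_dot = l_zp * (1.0 / theta3 + (c / s)**2 / theta1) - l_z * c / (theta1 * s * s)
    return a_dot, g_dot

# Arbitrary test values (beta must not be 0 or pi).
theta1, theta3 = 0.02, 0.01
a_dot_in, g_dot_in, beta = 0.7, 150.0, 0.9
l_z, l_zp = angular_momenta(theta1, theta3, a_dot_in, g_dot_in, beta)
a_dot_out, g_dot_out = euler_rates(theta1, theta3, l_z, l_zp, beta)
print(a_dot_out, g_dot_out)
```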
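We insert the relations (13.96) and (13.97) obtained this way into the expression for the kinetic energy (13.93) and obtain
\[
\begin{aligned}
E &= \frac{1}{2}\Theta_1\dot\beta^2 + \frac{1}{2}\Theta_1\left(\frac{L_z - L_{z'}\cos\beta}{\Theta_1\sin^2\beta}\right)^2\sin^2\beta
+ \frac{1}{2}\Theta_3(\dot\alpha^2\cos^2\beta + \dot\gamma^2 + 2\dot\alpha\dot\gamma\cos\beta) + Mgl\cos\beta\\
&= \frac{1}{2}\Theta_1\dot\beta^2 + \frac{1}{2\Theta_1}\frac{(L_z - L_{z'}\cos\beta)^2}{\sin^2\beta} + \frac{1}{2}\frac{L_{z'}^2}{\Theta_3} + Mgl\cos\beta\\
&= \frac{1}{2}\Theta_1\dot\beta^2 + V_{\text{eff}}(\beta).
\end{aligned} \tag{13.98}
\]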
We used the constancy of Lz and Lz to eliminate the two Euler angles α and γ .
This simplified the problem greatly. From the energy law (13.98) we can in principle
determine β(t) and then obtain α(t) and γ (t) via (13.96) and (13.97).
To proceed further, various possibilities offer themselves:
(a) We can establish the equation of motion for β(t):
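\[
\frac{dE}{dt} = 0 = \Theta_1\dot\beta\ddot\beta + \frac{\partial V_{\text{eff}}(\beta)}{\partial\beta}\dot\beta. \tag{13.99}
\]
\[
\begin{aligned}
\Theta_1\ddot\beta &= -\frac{\partial}{\partial\beta}V_{\text{eff}}(\beta)\\
&= (L_z - L_{z'}\cos\beta)^2\frac{\cos\beta}{\Theta_1\sin^3\beta} - (L_z - L_{z'}\cos\beta)\frac{L_{z'}}{\Theta_1\sin\beta} + Mgl\sin\beta\\
&= \frac{\cos\beta}{\Theta_1\sin^3\beta}\left(L_z^2 - 2L_zL_{z'}\cos\beta + L_{z'}^2\right) - \frac{L_zL_{z'}}{\Theta_1\sin\beta} + Mgl\sin\beta.
\end{aligned} \tag{13.100}
\]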
The effective potential is composed of three terms which we will discuss separately.
Spin term
\[
\frac{1}{2}\frac{L_{z'}^2}{\Theta_3}. \tag{13.103}
\]
The second term is constant and is due to the energy of the eigenrotation of the top about its figure axis. It shifts the zero point of the energy scale and is independent of β.
Angular momentum barrier
The first term is understood by analogy to the $l^2/2mr^2$ angular momentum term, which appeared in the effective potential when treating the central force problem. It is positive and vanishes for $L_z/L_{z'} = \cos\beta$. Then β is only a physically meaningful angle if $L_z < L_{z'}$. This is in general fulfilled for a top with fast eigenrotation. For β = 0, β = π, the first term diverges because of the factor $\sin^2\beta$ in the denominator. As a consequence, the term then has a minimum which lies at $\beta = \arccos(L_z/L_{z'}) < \pi/2$. For β → 0 and β → π the potential rises steeply.
Gravitation term
The last contribution is caused by the gravitational potential. It is antisymmetric about the center π/2 and shifts the minimum of the effective potential to the right side of $\arccos(L_z/L_{z'})$ without changing its qualitative form.
For a given energy E ($> L_{z'}^2/2\Theta_3$), the motion is restricted to the region $E > V_{\text{eff}}(\beta)$ with reversal points β±; these points are defined by $E = V_{\text{eff}}(\beta_\pm)$.
For a more precise analysis of the motion, we determine the stationary solution for
which β = β0 = constant. It is located exactly in the minimum of the effective poten-
tial, so that the reversal points β± coincide, and E = Veff (β0 ). β0 is then determined
from the minimum property:
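\[
\left.\frac{\partial V_{\text{eff}}}{\partial\beta}\right|_{\beta=\beta_0} = 0. \tag{13.106}
\]
with an infinitesimal displacement η(t). We expand the potential into a Taylor series
\[
V_{\text{eff}}(\beta) = V_{\text{eff}}(\beta_0) + \eta\left.\frac{\partial V_{\text{eff}}}{\partial\beta}\right|_{\beta=\beta_0} + \frac{1}{2}\eta^2\left.\frac{\partial^2 V_{\text{eff}}}{\partial\beta^2}\right|_{\beta=\beta_0} + \cdots. \tag{13.109}
\]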
The linear term vanishes by construction, and the quadratic term follows by differen-
tiation of the negative right side of (13.100):
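For the total energy, one then obtains likewise
\[
E = \frac{1}{2}\Theta_1\dot\eta^2 + \frac{1}{2}\eta^2\left.\frac{\partial^2 V_{\text{eff}}}{\partial\beta^2}\right|_{\beta=\beta_0} + V_{\text{eff}}(\beta_0). \tag{13.112}
\]
Differentiation with respect to the time finally leads to the differential equation of the harmonic oscillator:
\[
\ddot\eta + \Omega^2\eta = 0 \tag{13.113}
\]
with
\[
\Omega^2 = \frac{1}{\Theta_1}\left.\frac{\partial^2 V_{\text{eff}}}{\partial\beta^2}\right|_{\beta=\beta_0} = \frac{L_zL_{z'} - \Theta_1 Mgl(4 - 3\sin^2\beta_0)}{\Theta_1^2\cos\beta_0}. \tag{13.114}
\]
The vibration is stable if $\Omega^2 > 0$. Obviously the product $L_zL_{z'}$ must be sufficiently large to ensure stable vibrations.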
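\[
\begin{aligned}
\dot\alpha(t) &\approx \frac{L_z - L_{z'}\cos\beta_0}{\Theta_1\sin^2\beta_0} + \eta(t)\left.\frac{\partial}{\partial\beta}\left(\frac{L_z - L_{z'}\cos\beta}{\Theta_1\sin^2\beta}\right)\right|_{\beta=\beta_0} + \cdots\\
&\equiv \dot\alpha_0 + \eta(t)\dot\alpha_1,
\end{aligned} \tag{13.116}
\]
\[
\dot\gamma(t) \approx \dot\gamma_0 + \eta(t)\dot\gamma_1, \tag{13.117}
\]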
where α̇0 , α̇1 , γ̇0 , γ̇1 are constants which depend on Lz , Lz , and E (through β0 ). For a
qualitative investigation of the superposition of nutation (β(t)) and precession (α(t)),
we start from (13.116):
For α̇0 /α̇1 η0 > 1, α̇ always remains larger than zero (Fig. 13.42(a)). For α̇0 /α̇1 η0 = 1,
the precession frequency may become equal to zero (Fig. 13.42(b)). For α̇0 /α̇1 η0 < 1,
we have a backward motion in parts (Fig. 13.42(c)).
Fig. 13.42.
EXERCISE
Problem. Use the Euler equations to show that for an asymmetric top the rotations
about the axes of the largest and smallest moment of inertia are stable; however, the
rotation about the axis of the intermediate moment of inertia is unstable.
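Solution. We start from the Euler equations for the free top:
\[
\dot\omega_1 = \frac{\Theta_2 - \Theta_3}{\Theta_1}\,\omega_2\omega_3, \tag{13.119}
\]
\[
\dot\omega_2 = \frac{\Theta_3 - \Theta_1}{\Theta_2}\,\omega_1\omega_3, \tag{13.120}
\]
\[
\dot\omega_3 = \frac{\Theta_1 - \Theta_2}{\Theta_3}\,\omega_1\omega_2. \tag{13.121}
\]
Let the top rotate about the body-fixed z-axis, i.e., ω3 = ω0 = constant and ω1 = ω2 = 0. To investigate the stability of the rotation about this principal axis, we tilt the rotation axis by a small amount, so that new components δω1, δω2 and an additional δω3 arise. For δω̇3, we have from the Euler equation
\[
\delta\dot\omega_3 = \frac{\Theta_1 - \Theta_2}{\Theta_3}\,\delta\omega_1\,\delta\omega_2 \approx 0. \tag{13.122}
\]
Neglecting quadratically small terms, we can set ω3 = ω0. From the other two Euler equations, we then obtain
\[
\delta\dot\omega_1 + \frac{\Theta_3 - \Theta_2}{\Theta_1}\,\delta\omega_2\,\omega_0 = 0, \tag{13.123}
\]
\[
\delta\dot\omega_2 + \frac{\Theta_1 - \Theta_3}{\Theta_2}\,\delta\omega_1\,\omega_0 = 0. \tag{13.124}
\]
We make the ansatz
\[
\delta\omega_1 = Ae^{\lambda t}, \qquad \delta\omega_2 = Be^{\lambda t}. \tag{13.125}
\]
This leads to a linear set of equations in A and B, where the determinant must vanish for nontrivial solutions:
\[
\begin{vmatrix}
\lambda & \dfrac{\Theta_3 - \Theta_2}{\Theta_1}\,\omega_0\\[8pt]
\dfrac{\Theta_1 - \Theta_3}{\Theta_2}\,\omega_0 & \lambda
\end{vmatrix} = 0. \tag{13.126}
\]
Hence,
\[
\lambda^2 = \frac{(\Theta_3 - \Theta_2)(\Theta_1 - \Theta_3)}{\Theta_1\Theta_2}\,\omega_0^2. \tag{13.127}
\]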
For the rotation about the axis of the smallest moment of inertia, $\Theta_3 < \Theta_1, \Theta_2$, and for the rotation about the axis of the largest moment of inertia, $\Theta_3 > \Theta_1, \Theta_2$, equation (13.127) leads to a purely imaginary λ:
\[
\lambda^2 < 0, \tag{13.128}
\]
and therefore to vibration solutions for δω1 and δω2. The rotation about the axis of the largest and smallest moment of inertia, respectively, is therefore stable.
The rotation about the axis of the intermediate moment of inertia
leads to a real λ and thus to a time evolution of δω1 and δω2 according to
The rotation axis turns away exponentially from the initial position. The rotation about
the axis of the intermediate moment of inertia is not stable!
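The sign of λ² in (13.127) decides the stability, and it can be evaluated for concrete moments of inertia. The following is a small sketch; the function names and the three sets of moments of inertia are chosen here only for illustration.

```python
def lambda_squared(theta1, theta2, theta3, omega0):
    """lambda**2 for a rotation about the principal axis 3, cf. (13.127)."""
    return (theta3 - theta2) * (theta1 - theta3) / (theta1 * theta2) * omega0**2

def classify_rotation(theta1, theta2, theta3, omega0=1.0):
    """'stable' for lambda**2 < 0 (vibrations), 'unstable' for lambda**2 > 0."""
    lam2 = lambda_squared(theta1, theta2, theta3, omega0)
    if lam2 < 0.0:
        return "stable"
    if lam2 > 0.0:
        return "unstable"
    return "marginal"  # degenerate case, e.g. theta3 equal to theta1 or theta2

# Rotation axis 3 carries the smallest, the largest, and the intermediate moment.
smallest = classify_rotation(2.0, 3.0, 1.0)
largest = classify_rotation(1.0, 2.0, 3.0)
intermediate = classify_rotation(1.0, 3.0, 2.0)
print(smallest, largest, intermediate)
```

In the unstable case the real exponent is λ = ±√(λ²), so the growth rate of the deviation scales linearly with ω0.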
Part V Lagrange Equations
14 Generalized Coordinates
In many cases, the motion of bodies considered in mechanics is not free but is re-
stricted by certain constraint conditions. The constraints can take different forms. For
instance, a mass point can be bound to a space curve or to a surface. The constraints
for a rigid body state that the distances between the individual points are constant.
If one considers gas molecules in a vessel, the constraints specify that the molecules
cannot penetrate the wall of the vessel. Since the constraints are important for solv-
ing a mechanical problem, mechanical systems are classified according to the type
of constraints. A system is called holonomic if the constraints can be represented by
equations of the form
\[
f_k(\mathbf{r}_1, \mathbf{r}_2, \ldots, t) = 0, \qquad k = 1, 2, \ldots, s. \tag{14.1}
\]
This form of the constraints is important since it can be used for eliminating dependent coordinates. For a pendulum of length l, (14.1) reads $x^2 + y^2 - l^2 = 0$ if we put the coordinate origin at the suspension point. The coordinates x and y can be expressed in terms of each other by this equation.
We already met another simple example of holonomic constraints in the context of the rigid body, i.e., the constancy of the distances between two points: $(\mathbf{r}_i - \mathbf{r}_j)^2 - C_{ij}^2 = 0$. In this case the constraints served to reduce the 3N degrees of freedom of a system of N mass points to the 6 degrees of freedom of the rigid body.
All constraints that cannot be represented in the form (14.1) are called nonholonomic. These are conditions that cannot be expressed in closed form, or that are given by inequalities. An example of this type of constraint is the system of gas molecules enclosed in a sphere of radius R. Their coordinates must satisfy the conditions $r_i \le R$.
A further classification of the constraint conditions is made based on their time
dependence. If the constraint is an explicit function of the time, then it is called
rheonomic. If the time does not enter explicitly, the constraint is called scleronomic.
A rheonomic constraint appears if a mass point moves along a moving space curve, or
if gas molecules are enclosed in a sphere with a time-dependent radius.
In certain cases the constraints may also be given in differential form, for example if there is a condition on velocities, e.g., for the rolling of a wheel. The constraints then have the form
\[
\sum_{k}^{N} a_k(x_1, x_2, \ldots, x_N)\,dx_k = 0, \tag{14.2}
\]
where the $x_k$ represent the various coordinates, and the $a_k$ are functions of these coordinates. We now have to distinguish between two cases.
This leads to
\[
\frac{\partial a_k}{\partial x_i} = \frac{\partial^2 U}{\partial x_i\,\partial x_k} = \frac{\partial a_i}{\partial x_k}.
\]
Thus, (14.2) represents a holonomic constraint if the coefficients obey the integrability conditions
\[
\frac{\partial a_k}{\partial x_i} = \frac{\partial a_i}{\partial x_k}.
\]
These only mean that the “vector” $\mathbf{a} = \{a_1, a_2, \ldots, a_N\}$ must be rotation-free (irrotational). In N-dimensional space, the situation is analogous.
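The integrability conditions can be tested numerically for a given set of coefficients. The sketch below compares the mixed partial derivatives by central differences at a single test point. The two examples, the pendulum constraint in the differential form x dx + y dy = 0 and a rolling-wheel constraint of the form dx_M + a sin ψ dϕ = 0 in the coordinates (x_M, ϕ, ψ), as well as all names and numbers, are assumptions made for this illustration. A check at one point can only refute the conditions, not prove them everywhere.

```python
import math

def partial(f, point, index, h=1e-6):
    """Central-difference approximation of the partial derivative of f along coordinate index."""
    up, down = list(point), list(point)
    up[index] += h
    down[index] -= h
    return (f(up) - f(down)) / (2.0 * h)

def satisfies_integrability(coeffs, point, tol=1e-6):
    """Check d a_k / d x_i == d a_i / d x_k for all pairs at the given point."""
    n = len(coeffs)
    for i in range(n):
        for k in range(i + 1, n):
            lhs = partial(coeffs[k], point, i)
            rhs = partial(coeffs[i], point, k)
            if abs(lhs - rhs) > tol:
                return False
    return True

# Pendulum: x dx + y dy = 0, i.e. a = (x, y), the differential of (x**2 + y**2 - l**2)/2.
pendulum = [lambda q: q[0], lambda q: q[1]]

# Rolling wheel, coordinates q = (x_M, phi, psi): dx_M + a*sin(psi)*dphi = 0.
WHEEL_RADIUS = 0.5  # hypothetical value for the radius a
wheel = [lambda q: 1.0,
         lambda q: WHEEL_RADIUS * math.sin(q[2]),
         lambda q: 0.0]

pendulum_ok = satisfies_integrability(pendulum, [0.3, -0.4])
wheel_ok = satisfies_integrability(wheel, [0.0, 0.2, 0.7])
print(pendulum_ok, wheel_ok)
```

The wheel coefficients violate the condition because ∂a_ϕ/∂ψ = a cos ψ, whereas ∂a_ψ/∂ϕ = 0.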
To classify a mechanical system, we additionally specify whether the system is
conservative or not.
EXAMPLE
A sphere in the gravitational field rolls without friction from the upper pole of a larger
sphere. The system is conservative. The constraints change completely after getting
away from the sphere and cannot be represented in the closed form of (14.1), and
therefore the system is nonholonomic. Since the time does not enter explicitly, the
system is scleronomic.
EXAMPLE
A body glides with friction down an inclined plane (see Fig. 14.2). The inclination angle of the plane varies with time. The coordinates and the inclination angle are related by
\[
\frac{y}{x} - \tan\omega t = 0.
\]
Thus, the time occurs explicitly in the constraint. The system is holonomic and rheonomic. Since friction occurs, the system is furthermore not conservative.
Fig. 14.2.
EXAMPLE
For the calculation, we use the coordinates xM , yM of the center, the angle ϕ that
describes the rotation, and the angle ψ that characterizes the orientation of the wheel
plane relative to the y-axis.
The velocity v of the wheel center and the rotation velocity are related by the rolling
condition
v = a ϕ̇.
ẋM = −v sin ψ,
ẏM = v cos ψ.
By inserting v, we obtain
dxM + a sin ψ · dϕ = 0,
dyM − a cos ψ · dϕ = 0,
If a body moves along a trajectory specified (or restricted) by constraints, there ap-
pear constraint reactions that keep it on this trajectory. Such constraint reactions are
support forces, bearing forces (-moments), string tensions, etc. If one is not especially
interested in the load of a string or a bearing, one tries to formulate the problem in
such a way that the constraint (and thus the constraint reaction) no longer appears in
the equations to be solved. We have tacitly used this approach in the problems treated
so far. A simple example is the plane pendulum. Instead of the formulation in Carte-
These coordinates qi , which now can be considered free, are called generalized coor-
dinates. In the practical cases considered here, the choice of the generalized coordi-
nates is already suggested by the formulation of the problem, and the transformation
equations (14.3) need not be established explicitly. Using generalized coordinates is
also helpful for problems without constraint conditions. For instance, a central force
problem can be described more simply and completely by the coordinates (r, ϑ, ϕ)
instead of the (x, y, z).
Fig. 14.4. Ellipse: y = b sin ϕ,
x = a cos ϕ
EXAMPLE
An ellipse is given in the x,y-plane. A particle moving on the ellipse has the coordi-
nates (x, y).
y = b sin ϕ, x = a cos ϕ.
Thus, the motion of the particle can be completely described by the angle ϕ (the
generalized coordinate ϕ).
EXAMPLE
EXERCISE
Problem. Classify the following systems according to whether they are scleronomic or rheonomic, holonomic or nonholonomic, and conservative or nonconservative:
(a) a sphere rolling downward without friction on a fixed sphere;
(b) a cylinder rolling down on a rough inclined plane (inclination angle α);
(c) a particle gliding on the rough inner surface of a rotation paraboloid; and
(d) a particle moving without friction along a very long bar. The bar rotates with the
angular velocity ω in the vertical plane about a horizontal axis.
Solution. (a) Scleronomic, since the constraint is not an explicit function of time.
Nonholonomic, since the rolling sphere leaves the fixed sphere. Conservative, since
the gravitational force can be derived from a potential.
(b) Scleronomic, holonomic, nonconservative: The equation of the constraint rep-
resents either a line or a surface. Since the surface is rough, friction occurs. Therefore
this system is not conservative.
(c) Scleronomic, holonomic, but not conservative, since the friction force does not
result from a potential!
(d) Rheonomic: The constraint is an explicit function of time. Holonomic: The
equation of the constraint is a straight line that contains the time explicitly; conserva-
tive.
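\[
\mathbf{r}_i = \mathbf{r}_i(q_1, \ldots, q_\nu, t)
\]
be represented as
\[
\dot{\mathbf{r}}_i = \frac{\partial\mathbf{r}_i}{\partial q_1}\frac{dq_1}{dt} + \cdots + \frac{\partial\mathbf{r}_i}{\partial q_\nu}\frac{dq_\nu}{dt} + \frac{\partial\mathbf{r}_i}{\partial t}.
\]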
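In the scleronomic case, the last term drops. The velocity can also be written in the form
\[
\dot{\mathbf{r}}_i = \sum_{\alpha}^{f}\frac{\partial\mathbf{r}_i}{\partial q_\alpha}\dot q_\alpha + \frac{\partial\mathbf{r}_i}{\partial t}, \quad\text{where}\quad \dot q_\alpha = \frac{dq_\alpha}{dt} \tag{14.4}
\]
and $\dot q_\alpha$ denotes the generalized velocity. In the following, we restrict ourselves to the x-component. Moreover, we consider only the scleronomic case and write for the x-component of (14.4)
\[
\dot x_i = \sum_{\alpha}\frac{\partial x_i}{\partial q_\alpha}\dot q_\alpha. \tag{14.5}
\]
By differentiating (14.5) once again with respect to time, we obtain for the Cartesian components of the acceleration
\[
\ddot x_i = \sum_{\alpha}\frac{d}{dt}\left(\frac{\partial x_i}{\partial q_\alpha}\right)\dot q_\alpha + \sum_{\alpha}\frac{\partial x_i}{\partial q_\alpha}\ddot q_\alpha.
\]
The index to be summed over is denoted here by the letter β, to avoid confusion with the summation index α. Then we have
\[
\ddot x_i = \sum_{\alpha,\beta}\frac{\partial^2 x_i}{\partial q_\beta\,\partial q_\alpha}\dot q_\beta\dot q_\alpha + \sum_{\alpha}\frac{\partial x_i}{\partial q_\alpha}\ddot q_\alpha.
\]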
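\[
d\mathbf{r}_i = \sum_{\alpha=1}^{f}\frac{\partial\mathbf{r}_i}{\partial q_\alpha}\,dq_\alpha. \tag{14.6}
\]
where
\[
Q_\alpha = \sum_{i}\mathbf{F}_i\cdot\frac{\partial\mathbf{r}_i}{\partial q_\alpha}. \tag{14.7}
\]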
$Q_\alpha$ is called the generalized force. Since the generalized coordinate need not have the dimension of a length, $Q_\alpha$ need not have the dimension of a force. The product $Q_\alpha q_\alpha$, however, always has the dimension of work.
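As an illustration of (14.7), the generalized force belonging to the angle of a plane pendulum can be computed numerically. The sketch below assumes a single mass point with the position r = (l sin ϕ, −l cos ϕ) and the weight F = (0, −mg); the function names and the numerical values are hypothetical. The result has the dimension of a torque, not of a force.

```python
import math

def generalized_force(force, position, q, h=1e-6):
    """Q = F . (dr/dq) for one particle and one generalized coordinate q, cf. (14.7).

    The derivative dr/dq is approximated by central differences.
    """
    r_up, r_down = position(q + h), position(q - h)
    dr_dq = [(u - d) / (2.0 * h) for u, d in zip(r_up, r_down)]
    return sum(f * d for f, d in zip(force, dr_dq))

MASS, G, LENGTH = 2.0, 9.81, 0.5  # hypothetical pendulum (kg, m/s^2, m)

def pendulum_position(phi):
    """Cartesian position of the pendulum mass; origin at the suspension point."""
    return (LENGTH * math.sin(phi), -LENGTH * math.cos(phi))

weight = (0.0, -MASS * G)
phi = 0.4
q_phi = generalized_force(weight, pendulum_position, phi)
expected = -MASS * G * LENGTH * math.sin(phi)  # analytic result -m g l sin(phi)
print(q_phi, expected)
```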
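In conservative systems, i.e., if W does not depend on time, one has
\[
dW = \sum_{\alpha}\frac{\partial W}{\partial q_\alpha}\,dq_\alpha \quad\text{and}\quad dW = \sum_{\alpha}Q_\alpha\,dq_\alpha.
\]
Since the $q_\alpha$ are generalized coordinates, they are independent of each other, and therefore it follows that $(Q_\alpha - \partial W/\partial q_\alpha) = 0$ in order to satisfy the equation
\[
\sum_{\alpha}\left(Q_\alpha - \frac{\partial W}{\partial q_\alpha}\right)dq_\alpha = 0.
\]
\[
\delta\sin x = \frac{\delta\sin x}{\delta x}\,\delta x = (\cos x)\,\delta x, \quad\text{etc.}
\]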
We consider a system of mass points in equilibrium. Then the total force Fi acting
on each individual mass point vanishes; hence, Fi = 0. The product of force and virtual
displacement Fi · δri is called the virtual work. Since the force for each individual
mass point vanishes, the sum over the virtual work performed on the individual mass
points also equals zero:
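\[
\sum_{i}\mathbf{F}_i\cdot\delta\mathbf{r}_i = 0. \tag{15.1}
\]
The force $\mathbf{F}_i$ will now be subdivided into the constraint reaction $\mathbf{F}_i^z$ and the acting (imposed) force $\mathbf{F}_i^a$:
\[
\sum_{i}(\mathbf{F}_i^a + \mathbf{F}_i^z)\cdot\delta\mathbf{r}_i = 0. \tag{15.2}
\]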
We now restrict ourselves to such systems where the work performed by the constraint
reactions vanishes. In many cases (except, e.g., for those with friction) the constraint
reaction is perpendicular to the direction of motion, and the product Fz · δr vanishes.
For instance, if a mass point is forced to move along a given spatial curve, its direction
of motion is always tangential to the curve; the constraint reaction points perpendicular
to the curve. There are, however, examples where the individual constraint reactions
perform work, while the sum of the works of all constraint forces vanishes; thus,
\[
\sum_{i}\mathbf{F}_i^z\cdot\delta\mathbf{r}_i = 0.
\]
The string tensions of two masses hanging on a roller represent such a case. We
refer to Example 15.1. This is the proper, true meaning of the d’Alembert1 principle:
The constraint reactions in total do not perform work. We always have
\[
\sum_{i}\mathbf{F}_i^z\cdot\delta\mathbf{r}_i = 0.
\]
This is the fundamental characteristic of the constraint reactions. One can, of course,
trace this presupposition back to Newton’s axiom “action equals reaction,” as we just
have seen in the example of the string tensions between two masses. But in general
it does not follow from Newton’s axioms alone. The assumption that the total virtual
work of the constraint reactions vanishes can be considered to be a new postulate. It
accounts for systems of not freely movable mass points and can be expressed by the
forces imposed on the system, as we shall see below (see (15.5)). Then the constraint
reaction drops out from (15.2), and one has
$$\sum_i \mathbf{F}^a_i\cdot\delta\mathbf{r}_i = 0. \tag{15.3}$$
While in (15.1) each term vanishes individually, now only the sum in total vanishes.
The statement of (15.3) is called the principle of virtual work. It says that a system is
only in equilibrium if the entire virtual work of the imposed (external) forces vanishes.
In the next chapter (equations (16.8) and (16.9)) the principle of virtual work (the total
virtual work vanishes) will be established by the Lagrangian formalism.
For holonomic constraints, the effect of the constraint reactions can be elucidated
by the following: If we consider the ith constraint in the form
gi (r1 , r2 , . . . , rN , t) = 0,
then the change of gi with respect to a change of the position vector rj must be
a measure of the constraint reaction Fzj i on the j th particle due to the constraint
gi (r1 , r2 , . . . , rN , t) = 0. We thus can write
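$$\mathbf{F}^z_{ji} = \lambda_i\,\frac{\partial g_i(\mathbf{r}_1,\mathbf{r}_2,\dots,\mathbf{r}_N,t)}{\partial \mathbf{r}_j} = \lambda_i\,\nabla_j g_i(\mathbf{r}_1,\dots,t).$$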
$$\mathbf{F}^z_j = \sum_{i=1}^{k}\mathbf{F}^z_{ji} = \sum_{i=1}^{k}\lambda_i\,\frac{\partial g_i(\mathbf{r}_1,\dots,\mathbf{r}_N,t)}{\partial \mathbf{r}_j}.$$
1 Jean le Rond d’Alembert, b. Nov. 16 or 17, 1717, Paris, as the son of a general–d. Oct. 29, 1783,
Paris. D’Alembert, who was abandoned by his mother, was found near the church Jean le Rond and
was brought up by the family of a glazier. Later he was educated according to his social status,
supported by grants. He studied at the Collège des Quatre Nations, and in 1741, he became a member
of the Académie des sciences. In mechanics, the d’Alembert principle is named after him; moreover,
he worked on the theory of analytic functions (1746), on partial differential equations (1747), and on
the foundations of algebra. D’Alembert is the author of the mathematical articles of the Encyclopédie.
15.1 Virtual Displacements 269
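$$\delta W = \sum_{j=1}^{N}\mathbf{F}^z_j\cdot\delta\mathbf{r}_j = \sum_{i=1}^{k}\sum_{j=1}^{N}\lambda_i\,\frac{\partial g_i}{\partial \mathbf{r}_j}(\mathbf{r}_1,\dots,\mathbf{r}_N,t)\cdot\delta\mathbf{r}_j = \sum_{i=1}^{k}\lambda_i\,\delta g_i(\mathbf{r}_1,\dots,\mathbf{r}_N,t),$$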
where
$$\delta g_i(\mathbf{r}_1,\dots,\mathbf{r}_N,t) = \sum_{j=1}^{N}\frac{\partial g_i}{\partial \mathbf{r}_j}\cdot\delta\mathbf{r}_j .$$
This is just the change of gi caused by the virtual displacements δrj . Since the virtual
displacements are by assumption compatible with the constraints, i.e., the δrj satisfy
the constraints, we must have
δgi (r1 , . . . , rN , t) = 0.
$$\mathbf{F}^z_i\cdot\delta\mathbf{r}_i = 0 \tag{15.4a}$$
$$\delta W = \sum_{j=1}^{N}\mathbf{F}^z_j\cdot\delta\mathbf{r}_j = 0. \tag{15.4b}$$
Hence, for holonomic constraints the constraint reactions are perpendicular to the dis-
placements that are compatible with the constraints, and the virtual work of the indi-
vidual constraint reactions vanishes. In Chap. 16, equations (16.8) and (16.9), we shall
understand from a very general point of view that in the general case (hence including
the case of nonholonomic constraints), the sum of the virtual work of all constraint
reactions must vanish. Therefore, $\sum_i \mathbf{F}^z_i\cdot\delta\mathbf{r}_i = 0$ always holds, while
$\mathbf{F}^z_i\cdot\delta\mathbf{r}_i = 0$ holds only in special (holonomic) cases.
The principle of virtual work at first only allows us to treat problems of statics. By
introducing the inertial force according to Newton’s axiom
Fi = ṗi , (15.5)
every individual term vanishes. If we again subdivide the total force Fi into the im-
posed force Fai and the constraint reaction Fzi , with the same restriction as above we
270 15 D’Alembert Principle and Derivation of the Lagrange Equations
where the individual terms can differ from zero; only the sum in (15.6b) vanishes. This
equation expresses the d’Alembert principle.
EXAMPLE
Fig. 15.1. Two masses on concentric rollers: The string tensions $F_1^z$ and $F_2^z$ are parallel
but have different magnitudes
In the present example, the constraint forces are the string tensions $F_1^z$ and $F_2^z$.
The vanishing of $\sum_i \mathbf{F}^z_i\cdot\delta\mathbf{r}_i$ in the equilibrium state is equivalent to the equality
of the torques imposed by the string tensions $F_1^z$, $F_2^z$ through the radii $R_1$, $R_2$:
D1 = R1 F1z = D2 = R2 F2z .
By means of the constraint, it follows with δz1 = R1 δϕ, δz2 = −R2 δϕ, that
F1z δz1 + F2z δz2 = (F1z R1 − F2z R2 )δϕ = (D1 − D2 )δϕ = 0.
In the case of equal radii (R1 = R2 ), the string tensions are equal.
From
$$\sum_i \mathbf{F}^a_i\cdot\delta\mathbf{r}_i = 0,$$
it follows that
m1 gδz1 + m2 gδz2 = 0.
Hence, we obtain
(m1 R1 − m2 R2 )δϕ = 0
or
m1 R1 = m2 R2
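The equilibrium condition above can be checked numerically. The following is a minimal sketch, not part of the original text: the numerical values and the helper names `virtual_work` and `balancing_mass` are assumptions chosen for illustration. It evaluates the virtual work of the two weights for a virtual rotation δϕ, using δz1 = R1 δϕ and δz2 = −R2 δϕ.

```python
# Virtual-work check for two masses hanging on concentric rollers.
# All numerical values are assumed sample data, not taken from the text.
g = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def virtual_work(m1, m2, R1, R2, dphi):
    """Virtual work of the imposed (weight) forces for a virtual rotation dphi."""
    dz1 = R1 * dphi    # constraint: delta z1 = R1 * delta phi
    dz2 = -R2 * dphi   # constraint: delta z2 = -R2 * delta phi
    return m1 * g * dz1 + m2 * g * dz2


def balancing_mass(m1, R1, R2):
    """Mass m2 that satisfies m1 * R1 = m2 * R2."""
    return m1 * R1 / R2


m1, R1, R2 = 2.0, 0.3, 0.1
m2 = balancing_mass(m1, R1, R2)
work_balanced = virtual_work(m1, m2, R1, R2, dphi=1e-3)
work_unbalanced = virtual_work(m1, 1.0, R1, R2, dphi=1e-3)
```

For the balanced pair the virtual work vanishes up to rounding, while any other choice of m2 gives a nonzero value, so the arrangement is then not in equilibrium.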
EXAMPLE
In the setup shown in Fig. 15.2, two masses connected by a rope move without friction.
The equation of motion shall be established by means of the d’Alembert principle. For
the two masses, this principle reads
l1 + l2 = l.
This leads to
By inserting this into (15.7) and taking into account that the accelerations are par-
allel to the displacements, we have
or
$$\ddot{l}_1 = \frac{m_1\sin\alpha - m_2\sin\beta}{m_1 + m_2}\,g.$$
EXERCISE
We have
F1 = F1 ey , F2 = −mgey
and
$$\sum_{\nu=1}^{2}\mathbf{F}_\nu\cdot\delta\mathbf{r}_\nu = (F_1 l_1\cos\varphi - mg\,l_2\cos\varphi)\,\delta\varphi = 0,$$
F2 = −Gey , F1 = −Qey .
Furthermore,
and
i.e.,
$$0 = \sum_{\nu=1}^{2}\mathbf{F}_\nu\cdot\delta\mathbf{r}_\nu = (Qa\cos\varphi - G(b+c)\cos\varphi)\,\delta\varphi = [Qa - G(b+c)]\cos\varphi\,\delta\varphi.$$
$$Q = G\,\frac{b+c}{a}$$
is independent of the angle ϕ!
As is seen in Example 15.1 and Exercise 15.2, the drawback of the principle of vir-
tual displacements is that one still must eliminate displacements that are dependent
through constraints before one can find an equation of motion. We therefore introduce
generalized coordinates qi . If we transform the δri in (15.6a) to δqi , the coefficients
of the δqi can immediately be set to zero.
Starting from (15.6a), we introduce in the first sum according to (14.6) and (14.7),
the generalized forces
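$$\sum_{i=1}^{n}\mathbf{F}_i\cdot\delta\mathbf{r}_i = \sum_{i=1}^{n}\sum_{\alpha=1}^{f}\mathbf{F}_i\cdot\frac{\partial\mathbf{r}_i}{\partial q_\alpha}\,\delta q_\alpha = \sum_{\alpha=1}^{f}Q_\alpha\,\delta q_\alpha. \tag{15.8}$$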
By adding and simultaneously subtracting equal terms, we rewrite the right-hand side
of the equation:
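$$\sum_i m_i\ddot{\mathbf{r}}_i\cdot\frac{\partial\mathbf{r}_i}{\partial q_\nu} = \sum_i\frac{d}{dt}(m_i\dot{\mathbf{r}}_i)\cdot\frac{\partial\mathbf{r}_i}{\partial q_\nu} + \sum_i m_i\dot{\mathbf{r}}_i\cdot\frac{d}{dt}\frac{\partial\mathbf{r}_i}{\partial q_\nu} - \sum_i m_i\dot{\mathbf{r}}_i\cdot\frac{d}{dt}\frac{\partial\mathbf{r}_i}{\partial q_\nu}$$
$$= \sum_i\frac{d}{dt}\left(m_i\dot{\mathbf{r}}_i\cdot\frac{\partial\mathbf{r}_i}{\partial q_\nu}\right) - \sum_i m_i\dot{\mathbf{r}}_i\cdot\frac{d}{dt}\frac{\partial\mathbf{r}_i}{\partial q_\nu}. \tag{15.10}$$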
To derive the expression for the kinetic energy, we change the order of differentiation
with respect to t and qν in the last term of (15.10):
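$$\frac{d}{dt}\frac{\partial\mathbf{r}_i}{\partial q_\nu} = \frac{\partial}{\partial q_\nu}\frac{d}{dt}\mathbf{r}_i = \frac{\partial}{\partial q_\nu}\mathbf{v}_i. \tag{15.11}$$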
Insertion in (15.10) yields
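$$\sum_i m_i\ddot{\mathbf{r}}_i\cdot\frac{\partial\mathbf{r}_i}{\partial q_\nu} = \sum_i\frac{d}{dt}\left(m_i\dot{\mathbf{r}}_i\cdot\frac{\partial\mathbf{r}_i}{\partial q_\nu}\right) - \sum_i m_i\mathbf{v}_i\cdot\frac{\partial}{\partial q_\nu}\mathbf{v}_i. \tag{15.12}$$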
We can rewrite the expression ∂ri /∂qν in the first term of the right side of (15.12) by
partially differentiating (14.4) with respect to q̇ν :
$$\frac{\partial\mathbf{v}_i}{\partial\dot q_\nu} = \frac{\partial\mathbf{r}_i}{\partial q_\nu},$$
since (∂/∂ q̇ν )(∂ri /∂t) = 0 and from the sum remains only the factor at q̇ν . By insert-
ing this relation into (15.12), we obtain
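$$\sum_i m_i\ddot{\mathbf{r}}_i\cdot\frac{\partial\mathbf{r}_i}{\partial q_\nu} = \sum_i\frac{d}{dt}\left(m_i\mathbf{v}_i\cdot\frac{\partial\mathbf{v}_i}{\partial\dot q_\nu}\right) - \sum_i m_i\mathbf{v}_i\cdot\frac{\partial\mathbf{v}_i}{\partial q_\nu}$$
$$= \frac{d}{dt}\frac{\partial}{\partial\dot q_\nu}\left(\sum_i\frac{1}{2}m_i v_i^2\right) - \frac{\partial}{\partial q_\nu}\left(\sum_i\frac{1}{2}m_i v_i^2\right).$$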
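Here, $\sum_i (1/2)m_i v_i^2$ is the kinetic energy T:
$$\sum_i m_i\ddot{\mathbf{r}}_i\cdot\frac{\partial\mathbf{r}_i}{\partial q_\nu} = \frac{d}{dt}\frac{\partial T}{\partial\dot q_\nu} - \frac{\partial T}{\partial q_\nu}. \tag{15.13}$$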
Using (15.8) and (15.13), we can express the d’Alembert principle by generalized
coordinates. Insertion of
$$\sum_i\mathbf{F}_i\cdot\delta\mathbf{r}_i = \sum_\nu Q_\nu\,\delta q_\nu \quad\text{(compare (15.8))}$$
The qν are generalized coordinates; thus, the qν and the related δqν are independent
of each other. Therefore, (15.14) is satisfied only if the individual coefficients vanish,
i.e., for any coordinate qν we must have
$$\frac{d}{dt}\frac{\partial T}{\partial\dot q_\nu} - \frac{\partial T}{\partial q_\nu} - Q_\nu = 0, \qquad \nu = 1,\dots,f. \tag{15.15}$$
As a further simplification, we assume that all forces Fi can be derived from a potential
V (conservative force field):
Fi = −gradi (V ) = −∇i (V ).
because
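$$\sum_i\left(\frac{\partial V}{\partial x_i}\mathbf{e}_x + \frac{\partial V}{\partial y_i}\mathbf{e}_y + \frac{\partial V}{\partial z_i}\mathbf{e}_z\right)\cdot\left(\frac{\partial x_i}{\partial q_\nu}\mathbf{e}_x + \frac{\partial y_i}{\partial q_\nu}\mathbf{e}_y + \frac{\partial z_i}{\partial q_\nu}\mathbf{e}_z\right)$$
$$= \sum_i\left(\frac{\partial V}{\partial x_i}\frac{\partial x_i}{\partial q_\nu} + \frac{\partial V}{\partial y_i}\frac{\partial y_i}{\partial q_\nu} + \frac{\partial V}{\partial z_i}\frac{\partial z_i}{\partial q_\nu}\right) = \frac{\partial V}{\partial q_\nu}.$$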
By inserting Qν = −∂V /∂qν into (15.15), we obtain
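$$\frac{d}{dt}\frac{\partial T}{\partial\dot q_\nu} - \frac{\partial T}{\partial q_\nu} + \frac{\partial V}{\partial q_\nu} = 0$$
and
$$\frac{d}{dt}\frac{\partial T}{\partial\dot q_\nu} - \frac{\partial (T - V)}{\partial q_\nu} = 0.$$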
V is independent of the generalized velocity; i.e., V is only a function of the position:
$$\frac{\partial V}{\partial\dot q_\nu} = 0.$$
Therefore, we can write
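$$\frac{d}{dt}\frac{\partial}{\partial\dot q_\nu}(T - V) - \frac{\partial}{\partial q_\nu}(T - V) = 0, \tag{15.16}$$
$$L = T - V, \tag{15.17}$$
$$\frac{d}{dt}\frac{\partial L}{\partial\dot q_\nu} - \frac{\partial L}{\partial q_\nu} = 0, \qquad \nu = 1,\dots,f. \tag{15.18}$$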
2 Joseph Louis Lagrange, b. Jan. 25, 1736, Torino–d. April 10, 1813, Paris. Lagrange came from a
French–Italian family and in 1755 became professor in Torino. In 1766, he went to Berlin as director
of the mathematical-physical class of the academy. In 1786, after the death of Friedrich II, he went to
Paris. There he essentially supported the reformation of the system of measures and was a professor
at various universities. His very extensive work includes a new foundation of variational calculus
(1760) and its application to dynamics, contributions to the three-body problem (1772), application
of the theory of continued fractions to the solution of equations (1767), number-theoretical problems,
and an unsuccessful reduction of infinitesimal calculus to algebra. With his Mécanique Analytique
(1788), Lagrange became the founder of analytical mechanics.
These equations are called Lagrange equations, and the quantities ∂L/∂ q̇ν are called
generalized momenta. In Newton’s formulation of mechanics, the equations of motion
are established directly. The forces are thus put in the foreground; they must be
specified for a given problem and inserted into the basic dynamic equations
ṗi = Fi , i = 1, . . . , N.
In the Lagrangian formulation the Lagrangian is the central quantity, and L includes
both the kinetic energy T and the potential energy V . The latter one implicitly in-
volves the forces. After L is established, the Lagrange equations can be established
and solved. Both methods are equivalent to each other, as can be seen by stepwise
inversion of the steps leading from (15.6a) to (15.18).
EXAMPLE
Two blocks of equal mass that are connected by a rigid bar of length l move without
friction along a given path (compare Fig. 15.5). The attraction of the earth acts along
the negative y-axis. The generalized coordinate is the angle α (corresponding to the
single degree of freedom of the system).
Fig. 15.5. Two blocks are
connected by a bar
x = l cos α, y = l sin α.
L = T − V.
The constants c and t0 are determined from the given initial conditions.
EXAMPLE
We will use the following example for the Lagrangian formalism to explain the con-
cept of the ignorable coordinate. The arrangement is shown in Fig. 15.6.
Fig. 15.6. Two masses m and
M are connected by a string
Example 15.5 Two masses m and M are connected by a string of constant total length l = r + s.
The string mass is negligibly small compared to m + M. The mass m can rotate with
the string (with varying partial length r) on the plane. The string leads from m through
a hole in the plane to M, where the mass M hangs from the tightly stretched string
(with the also variable partial length s = l − r). Depending on the values ω of the
rotation of m on the plane, the arrangement can glide upward or downward. Thus, the
mass M moves only along the z-axis. The constraints characterizing the system are
holonomic and scleronomic. This arrangement has two degrees of freedom. The two
corresponding generalized coordinates ϕ and s uniquely describe the state of motion
of this conservative system.
We have
x = r cos ϕ = (l − s) cos ϕ,
y = r sin ϕ = (l − s) sin ϕ.
V = −Mgs.
$$\frac{d}{dt}\frac{\partial L}{\partial\dot q_j} = 0 \quad\text{or}\quad \frac{dp_j}{dt} = 0.$$
Here, pj = ∂L/∂ q̇j is the generalized momentum. The generalized momentum re-
lated to the cyclic coordinate is thus constant in time. Therefore, the general conserva-
tion law holds: The generalized momentum related to a cyclic coordinate is conserved.
The Lagrange equation for s reads
(m + M)s̈ + (l − s)mϕ̇ 2 − Mg = 0
$$(m+M)\ddot s\,\dot s + \frac{\tilde L^2\,\dot s}{(l-s)^3 m} - Mg\,\dot s = 0, \quad\text{with}\quad \tilde L = (l-s)^2 m\dot\varphi.$$
The last equation can be integrated immediately, and we obtain as a second integral
of motion
$$\frac{1}{2}(m+M)\dot s^2 + \frac{\tilde L^2}{2(l-s)^2 m} - Mgs = \text{constant} = T + V = E;$$
i.e., the total energy of the system is conserved. The given system is in a state of
equilibrium (gravitation force = centrifugal force) for vanishing acceleration, $d^2s/dt^2 = 0$:
$$0 = \ddot s = \frac{1}{m+M}\left[Mg - (l-s)m\left(\frac{\tilde L}{(l-s)^2 m}\right)^2\right] = \frac{1}{m+M}\left[Mg - \frac{\tilde L^2}{(l-s)^3 m}\right].$$
The result states that s must be constant. For a fixed distance $s_0$, equilibrium therefore
appears for a definite angular momentum $\tilde L = \tilde L_0$, which corresponds to a definite
angular velocity ω = ϕ̇:
$$\tilde L_0^2 = Mmg(l - s_0)^3.$$
For $\tilde L > \tilde L_0$, the entire arrangement glides upward; for $\tilde L < \tilde L_0$, the string with the
two masses m and M glides downward. For $\tilde L = \tilde L_0$, the system is in an equilibrium
state. For the special case $\tilde L = 0$ (i.e., ϕ̇ = 0, no rotation on the plane), one simply has
the retarded free fall of the mass M.
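The three regimes can be illustrated numerically. The following is a minimal sketch with assumed sample values; the function names are hypothetical and only restate the relations given above, namely the equilibrium angular momentum and the expression for s̈ at fixed angular momentum.

```python
import math

# Mass m rotating on a plane, tied by a string through a hole to a hanging
# mass M. All numerical values are assumed sample data.
g = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def equilibrium_angular_momentum(m, M, l, s0):
    """Angular momentum L0 with L0**2 = M*m*g*(l - s0)**3."""
    return math.sqrt(M * m * g * (l - s0) ** 3)


def s_ddot(m, M, l, s, ang_mom):
    """Acceleration of the hanging length s for conserved angular momentum."""
    return (M * g - ang_mom ** 2 / ((l - s) ** 3 * m)) / (m + M)


m, M, l, s0 = 0.5, 1.0, 2.0, 0.8
L0 = equilibrium_angular_momentum(m, M, l, s0)
acc_at_L0 = s_ddot(m, M, l, s0, L0)          # equilibrium
acc_faster = s_ddot(m, M, l, s0, 1.2 * L0)   # more rotation than L0
acc_slower = s_ddot(m, M, l, s0, 0.8 * L0)   # less rotation than L0
acc_no_rotation = s_ddot(m, M, l, s0, 0.0)   # reduces to M*g/(m + M)
```

A negative s̈ means that the hanging part of the string shortens, i.e., M is pulled upward; a positive value means that M descends. Without rotation the acceleration is Mg/(m + M), which is smaller than g because M must also drag m along.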
EXAMPLE
This arrangement has one degree of freedom. Accordingly, we need only one gen-
eralized coordinate for a complete description of the state of motion of the system: the
radial distance r of the sphere from the rotation center.
One has
x = r cos ωt,
y = r sin ωt.
$$L = \frac{1}{2}m(\dot x^2 + \dot y^2) = \frac{1}{2}m(\dot r^2 + \omega^2 r^2),$$
if we take into account that for this arrangement the potential V = 0.
We now form
$$\frac{d}{dt}\frac{\partial L}{\partial\dot r} = m\ddot r, \qquad \frac{\partial L}{\partial r} = m\omega^2 r.$$
Then we obtain the Lagrange equation
mr̈ − mω2 r = 0,
or
r̈ − ω2 r = 0.
This differential equation corresponds—up to the minus sign—to the equation for the
nondamped harmonic oscillator. It has a general solution of the type
With increasing time t , this expression for r(t) also increases; i.e.,
From the physical point of view, this means that the sphere is hurled outward by the
centrifugal force that results from the rotation of the arrangement.
The energy of the sphere increases. The reason is that the constraint reaction performs
work on the sphere. Although the constraint force is perpendicular to the tube
wall, it is not perpendicular to the trajectory of the sphere. Hence, the product Fz · δs
does not vanish.
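The growth of the energy can be made explicit for one particular solution. The sketch below assumes the initial conditions r(0) = r0 and ṙ(0) = 0, for which r(t) = r0 cosh ωt solves r̈ − ω²r = 0; the sample values and function names are assumptions for illustration only.

```python
import math


def bead_state(r0, omega, t):
    """r(t) and r_dot(t) for r'' = omega**2 * r with r(0) = r0, r_dot(0) = 0."""
    r = r0 * math.cosh(omega * t)
    r_dot = r0 * omega * math.sinh(omega * t)
    return r, r_dot


def kinetic_energy(m, omega, r, r_dot):
    """T = m/2 * (r_dot**2 + omega**2 * r**2); the potential is zero here."""
    return 0.5 * m * (r_dot ** 2 + omega ** 2 * r ** 2)


def residual(r0, omega, t, h=1e-4):
    """Finite-difference estimate of r'' - omega**2 * r (should be close to 0)."""
    r_minus, _ = bead_state(r0, omega, t - h)
    r_mid, _ = bead_state(r0, omega, t)
    r_plus, _ = bead_state(r0, omega, t + h)
    r_ddot = (r_plus - 2.0 * r_mid + r_minus) / h ** 2
    return r_ddot - omega ** 2 * r_mid


m, r0, omega = 0.1, 0.05, 3.0   # assumed sample values
times = [0.0, 0.25, 0.5, 0.75, 1.0]
energies = [kinetic_energy(m, omega, *bead_state(r0, omega, t)) for t in times]
```

The list `energies` increases monotonically with time, which reflects the work done on the sphere by the constraint reaction of the rotating tube.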
EXERCISE
Problem. Determine the Lagrangian and the equation of motion of the following
system: Let m be a point mass on a massless bar of length l which in turn is fixed to
a hinge. The hinge oscillates in the vertical direction according to h(t) = h0 cos ωt.
The only degree of freedom is the angle ϑ between the bar and the vertical (upright
pendulum).
$$T = \frac{1}{2}m(\dot x^2 + \dot y^2) = \frac{1}{2}m\left(\dot\vartheta^2 l^2 + \omega^2 h_0^2\sin^2\omega t + 2\omega h_0\dot\vartheta\,l\sin\vartheta\sin\omega t\right),$$
and the potential energy reads
$$L = T - V = \frac{m}{2}\left[\dot\vartheta^2 l^2 + \omega^2 h_0^2\sin^2\omega t + 2\omega h_0\dot\vartheta\,l\sin\vartheta\sin\omega t - 2g(h_0\cos\omega t + l\cos\vartheta)\right].$$
l ϑ̈ + (g − ω2 h0 cos ωt)ϑ = 0.
This is the desired equation of motion. If the piston is at rest, i.e., h(t) = h0 = 0, we
get
$$\ddot\vartheta + \frac{g}{l}\,\vartheta = 0.$$
This is the equation of motion of the ordinary pendulum!
EXERCISE
Problem. Find the position of stable equilibrium of the pendulum of Exercise 15.7
if the hinge oscillates with the frequency $\omega \gg \sqrt{g/l}$.
Solution. We first rewrite the Lagrangian of the pendulum of Exercise 15.7 as fol-
lows: The terms
$$\frac{m\omega^2}{2}h_0^2\sin^2\omega t \quad\text{and}\quad -mgh_0\cos\omega t$$
can be written as total differentials with respect to time:
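$$\frac{m\omega^2}{2}h_0^2\sin^2\omega t = -\frac{d}{dt}\left(\frac{1}{4}m\omega h_0^2\sin\omega t\cos\omega t\right) + C,$$
$$-mgh_0\cos\omega t = -\frac{d}{dt}\left(\frac{mgh_0}{\omega}\sin\omega t\right).$$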
We can omit these terms, since Lagrangians that differ only by a total derivative with
respect to time, according to the Hamilton principle $\delta\int_{t_1}^{t_2} L\,dt = 0$, are equivalent.
Hence,
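$$L = \frac{m}{2}\left[\dot\vartheta^2 l^2 + \omega^2 h_0^2\sin^2\omega t + 2\omega h_0\dot\vartheta\,l\sin\vartheta\sin\omega t - 2g(h_0\cos\omega t + l\cos\vartheta)\right]$$
$$= \frac{m}{2}\left[\dot\vartheta^2 l^2 + 2\omega h_0\dot\vartheta\,l\sin\vartheta\sin\omega t - 2gl\cos\vartheta\right]. \tag{15.19}$$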
Another transformation yields
$$m\omega h_0\dot\vartheta\,l\sin\vartheta\sin\omega t = -\frac{d}{dt}\left(m\omega h_0 l\cos\vartheta\sin\omega t\right) + m\omega^2 h_0 l\cos\vartheta\cos\omega t,$$
so that the Lagrangian finally reads
$$L = \frac{m}{2}\left[\dot\vartheta^2 l^2 + 2\omega^2 h_0 l\cos\vartheta\cos\omega t - 2gl\cos\vartheta\right]. \tag{15.20}$$
From this, one obtains of course the equation of motion as in Exercise 15.7.
We consider ϑ as a generalized coordinate with the appropriate mass coefficient
ml 2 . The equation of motion then reads
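$$\vartheta(t) = \bar\vartheta(t) + \xi(t).$$
The average value of the oscillations over a period 2π/ω equals zero, while $\bar\vartheta$ changes
only slowly; therefore,
$$\overline{\vartheta(t)} = \frac{\omega}{2\pi}\int_0^{2\pi/\omega}\vartheta(t)\,dt = \bar\vartheta(t). \tag{15.22}$$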
ϑ + ξ ) = f (
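$$ml^2\ddot{\bar\vartheta} + ml^2\ddot\xi = -\frac{dU}{d\bar\vartheta} - \xi\,\frac{d^2U}{d\bar\vartheta^2} + f(\bar\vartheta) + \xi\,\frac{df}{d\bar\vartheta}. \tag{15.23}$$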
The dominant terms for the oscillations are $ml^2\ddot\xi$ and $f(\bar\vartheta)$:
$$ml^2\ddot\xi = f(\bar\vartheta) \;\Rightarrow\; \ddot\xi = -\frac{\omega^2 h_0}{l}\sin\bar\vartheta\cos\omega t,$$
$$\xi = \frac{h_0}{l}\sin\bar\vartheta\cos\omega t = -\frac{f}{m\omega^2 l^2}. \tag{15.24}$$
We now calculate an effective potential created by the oscillations, and for this purpose
we average (15.23) over a period 2π/ω (the mean values over ξ and f vanish):
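$$ml^2\ddot{\bar\vartheta} = -\frac{dU}{d\bar\vartheta} + \overline{\xi\,\frac{df}{d\bar\vartheta}} = -\frac{dU}{d\bar\vartheta} - \frac{1}{m\omega^2 l^2}\,\overline{f\,\frac{df}{d\bar\vartheta}}.$$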
This can be written as
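$$ml^2\ddot{\bar\vartheta} = -\frac{dU_{\mathrm{eff}}}{d\bar\vartheta} \quad\text{with}\quad U_{\mathrm{eff}} = U + \frac{1}{2m\omega^2 l^2}\,\overline{f^2}. \tag{15.25}$$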
$$U_{\mathrm{eff}} = U + \frac{m\omega^2 h_0^2}{4}\sin^2\bar\vartheta = mgl\cos\bar\vartheta + \frac{m\omega^2 h_0^2}{4}\sin^2\bar\vartheta. \tag{15.26}$$
The minima of Ueff give the stable equilibrium positions:
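The search for minima can be sketched numerically. The example below assumes the effective potential in the form (15.26), U_eff = mgl cos ϑ + (mω²h0²/4) sin²ϑ, and tests the stationary points ϑ = 0 and ϑ = π with the second derivative. The sample values and function names are assumptions for illustration.

```python
import math

g = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def d_u_eff(theta, m, l, h0, omega):
    """First derivative of U_eff = m*g*l*cos(theta) + (m*omega**2*h0**2/4)*sin(theta)**2."""
    return (-m * g * l * math.sin(theta)
            + 0.5 * m * omega ** 2 * h0 ** 2 * math.sin(theta) * math.cos(theta))


def d2_u_eff(theta, m, l, h0, omega):
    """Second derivative of U_eff with respect to theta."""
    return (-m * g * l * math.cos(theta)
            + 0.5 * m * omega ** 2 * h0 ** 2 * math.cos(2.0 * theta))


def is_stable(theta, m, l, h0, omega, tol=1e-9):
    """True if theta is a stationary point of U_eff with positive curvature."""
    return (abs(d_u_eff(theta, m, l, h0, omega)) < tol
            and d2_u_eff(theta, m, l, h0, omega) > 0.0)


m, l, h0 = 0.1, 0.2, 0.02            # assumed sample values
omega_fast, omega_slow = 150.0, 50.0  # above and below sqrt(2*g*l)/h0
stable_0_fast = is_stable(0.0, m, l, h0, omega_fast)
stable_0_slow = is_stable(0.0, m, l, h0, omega_slow)
stable_pi_slow = is_stable(math.pi, m, l, h0, omega_slow)
stable_pi_fast = is_stable(math.pi, m, l, h0, omega_fast)
```

With this form of U_eff, the point ϑ = π is a minimum for every ω, while ϑ = 0 becomes a minimum only when ω²h0² exceeds 2gl, which is the case for the faster of the two sample frequencies.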
EXERCISE
The ωα are the desired vibration frequencies of the normal modes. If one cannot
find the normal coordinates of the system, one can proceed as follows: If a system has
s degrees of freedom and does not vibrate, then the Lagrangian generally reads
$$L = \frac{1}{2}\sum_{i,k}\left(m_{ik}\dot x_i\dot x_k - k_{ik}x_i x_k\right).$$
The eigenfrequencies of the system are then determined by the so-called characteristic
equation
$$\det\left|k_{ik} - \omega^2 m_{ik}\right| = 0.$$
Solution. We describe the geometry of the molecule in the x,y-plane. Let the dis-
placement of the atom α from the rest position rα0 be denoted by xα = (xα , yα ), i.e.,
rα = rα0 + xα . The forces that keep the atoms together are assumed to be to first order
linear in the displacement from the rest position, i.e.,
$$L = \frac{m_A}{2}(\dot x_1^2 + \dot x_3^2) + \frac{m_B}{2}\dot x_2^2 - \frac{K_L}{2}\left[(x_1 - x_2)^2 + (x_3 - x_2)^2\right],$$
if we consider longitudinal vibrations. For these modes the conservation of the center
of gravity can be written as follows:
$$m_A(x_1 + x_3) + m_B x_2 = 0, \qquad \sum_\alpha m_\alpha\mathbf{r}_\alpha = \sum_\alpha m_\alpha\mathbf{r}_{\alpha 0},$$
$$L = \frac{m_A}{2}(\dot x_1^2 + \dot x_3^2) + \frac{m_A^2}{2m_B}(\dot x_1 + \dot x_3)^2 - \frac{K_L}{2}\left[x_1^2 + x_3^2 + \frac{2m_A}{m_B}(x_1 + x_3)^2 + \frac{2m_A^2}{m_B^2}(x_1 + x_3)^2\right].$$
Hence, only two normal coordinates for the longitudinal motion can exist, because of
the conservation of the center of gravity.
Let
$$\Theta_1 = x_1 + x_3, \qquad \Theta_2 = x_1 - x_3.$$
L can then be written as
$$L = \frac{m_A}{4}\dot\Theta_2^2 + \frac{m_A\mu}{4m_B}\dot\Theta_1^2 - \frac{K_L}{4}\Theta_2^2 - \frac{K_L\mu^2}{4m_B^2}\Theta_1^2, \qquad \mu \equiv 2m_A + m_B,$$
i.e., $\Theta_1$ and $\Theta_2$ are the two normal coordinates of the longitudinal vibration (μ represents
the total mass of the molecule).
(a) For x1 = x3, $\Theta_2$ vanishes; i.e., $\Theta_1$ describes antisymmetric longitudinal vibrations
(Fig. 15.10).
(b) For x1 = −x3, $\Theta_1$ vanishes; i.e., $\Theta_2$ describes symmetric longitudinal vibrations
(Fig. 15.11).
For transverse vibrations (see Fig. 15.12) of the form in Fig. 15.13, we set
$$L = \frac{m_A}{2}(\dot y_1^2 + \dot y_3^2) + \frac{m_B}{2}\dot y_2^2 - \frac{K_T}{2}(l\delta)^2,$$
Fig. 15.12.
Fig. 15.13.
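where δ is the deviation of the angle ∠(ABA) from π. For small values of δ, we can
set
$$\delta = \left(\frac{\pi}{2} - \alpha_1\right) + \left(\frac{\pi}{2} - \alpha_2\right) = \sin\left(\frac{\pi}{2} - \alpha_1\right) + \sin\left(\frac{\pi}{2} - \alpha_2\right) = \cos\alpha_1 + \cos\alpha_2 = \frac{y_2 - y_1}{l} + \frac{y_2 - y_3}{l}.$$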
We utilize the conservation of the center of gravity and angular momentum conserva-
tion to eliminate y2 and y3 from L.
mA (y1 + y3 ) + mB y2 = 0 (conservation of the center of gravity).
To exclude rotation of the molecule, the total angular momentum must vanish:
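$$\mathbf{D} = \sum_\alpha m_\alpha[\mathbf{r}_\alpha\times\mathbf{v}_\alpha] \approx \sum_\alpha m_\alpha[\mathbf{r}_{\alpha 0}\times\dot{\mathbf{x}}_\alpha] = \frac{d}{dt}\sum_\alpha m_\alpha[\mathbf{r}_{\alpha 0}\times\mathbf{x}_\alpha],$$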
EXERCISE
mA (x1 + x3 ) + mB x2 = 0, mA (y1 + y3 ) + mB y2 = 0.
For angular momentum conservation, we go to the rest position of atom B, and be-
cause m1 = m3 = mA , it follows that
r10 × x1 + r30 × x3 = 0.
We have
The changes δl1 and δl2 of the distances AB and BA result by projection of the
vectors x1 − x2 and x3 − x2 onto the directions of the lines AB and BA:
The change of the angle 2α = ∠(ABA) is found by projection of the vectors x1 − x2
and x3 − x2 onto the directions orthogonal to the line segments AB and BA:
$$\delta = \frac{1}{l}\left[(x_1 - x_2)\cos\alpha - (y_1 - y_2)\sin\alpha\right] + \frac{1}{l}\left[-(x_3 - x_2)\cos\alpha - (y_3 - y_2)\sin\alpha\right].$$
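$$L = \frac{m_A}{2}(\dot{\mathbf{x}}_1^2 + \dot{\mathbf{x}}_3^2) + \frac{m_B}{2}\dot{\mathbf{x}}_2^2 - \frac{K_1}{2}\left[(\delta l_1)^2 + (\delta l_2)^2\right] - \frac{K_2}{2}(l\delta)^2.$$
Here, $(K_1/2)[(\delta l_1)^2 + (\delta l_2)^2]$ is the potential energy of the stretching, and $K_2(l\delta)^2/2$ is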
the potential energy of the bending of the molecule. We adopt as new coordinates
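$$Q_\alpha = x_1 + x_3, \qquad q_{s1} = x_1 - x_3, \qquad q_{s2} = y_1 + y_3,$$
$$x_1 = \frac{1}{2}(Q_\alpha + q_{s1}), \qquad x_2 = -\frac{m_A}{m_B}Q_\alpha, \qquad x_3 = \frac{1}{2}(Q_\alpha - q_{s1}),$$
$$y_1 = \frac{1}{2}(q_{s2} + Q_\alpha\cot\alpha), \qquad y_2 = -\frac{m_A}{m_B}q_{s2}, \qquad y_3 = \frac{1}{2}(q_{s2} - Q_\alpha\cot\alpha).$$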
Fig. 15.16.
The eigenfrequencies ωs1 and ωs2 of the normal vibrations for qs1 and qs2 must be
determined by the characteristic equation
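$$\omega^4 - \omega^2\left[\frac{K_1}{m_A}\left(1 + \frac{2m_A}{m_B}\cos^2\alpha\right) + \frac{2K_2}{m_A}\left(1 + \frac{2m_A}{m_B}\sin^2\alpha\right)\right] + \frac{2\mu K_1 K_2}{m_B m_A^2} = 0.$$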
The coordinates qs1 and qs2 correspond to vibrations that are symmetric about the
y-axis (Fig. 15.17):
(x1 = −x3 , Qα = 0 ⇒ y1 = y3 ).
Fig. 15.17.
EXERCISE
mA x1 + mB x2 + mC x3 = 0, x-center of gravity,
mA y1 + mB y2 + mC y3 = 0, y-center of gravity,
mA l1 y1 = mC l2 y3 , angular momentum conservation.
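for the frequency of the transverse vibration, and also the equation quadratic in ω²
$$\omega^4 - \omega^2\left[K_1\left(\frac{1}{m_A} + \frac{1}{m_B}\right) + K_1'\left(\frac{1}{m_B} + \frac{1}{m_C}\right)\right] + \frac{\mu K_1 K_1'}{m_A m_B m_C} = 0$$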
EXERCISE
Problem. Determine
(a) the generalized coordinates of the double pendulum;
(b) the Lagrangian of the system;
Solution. (a) The appropriate generalized coordinates are the two angles ϑ1 and ϑ2
that are related to the Cartesian coordinates by
$$x_1 = l_1\cos\vartheta_1, \qquad y_1 = l_1\sin\vartheta_1,$$
$$x_2 = l_1\cos\vartheta_1 + l_2\cos\vartheta_2, \qquad y_2 = l_1\sin\vartheta_1 + l_2\sin\vartheta_2. \tag{15.28}$$
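$$T = \frac{1}{2}m_1(\dot x_1^2 + \dot y_1^2) + \frac{1}{2}m_2(\dot x_2^2 + \dot y_2^2)$$
$$= \frac{1}{2}m_1 l_1^2\dot\vartheta_1^2 + \frac{1}{2}m_2\left[l_1^2\dot\vartheta_1^2 + l_2^2\dot\vartheta_2^2 + 2l_1 l_2\dot\vartheta_1\dot\vartheta_2\cos(\vartheta_1 - \vartheta_2)\right].$$
(Addition theorem!)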
To get the potential energy, we adopt a plane as a reference height, at the distance
l1 + l2 below the suspension point:
$$V = m_1 g\left[l_1 + l_2 - l_1\cos\vartheta_1\right] + m_2 g\left[l_1 + l_2 - (l_1\cos\vartheta_1 + l_2\cos\vartheta_2)\right].$$
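$$L = T - V = \frac{1}{2}m_1 l_1^2\dot\vartheta_1^2 + \frac{1}{2}m_2\left[l_1^2\dot\vartheta_1^2 + l_2^2\dot\vartheta_2^2 + 2l_1 l_2\dot\vartheta_1\dot\vartheta_2\cos(\vartheta_1 - \vartheta_2)\right]$$
$$\quad - m_1 g\left[l_1 + l_2 - l_1\cos\vartheta_1\right] - m_2 g\left[l_1 + l_2 - (l_1\cos\vartheta_1 + l_2\cos\vartheta_2)\right]. \tag{15.29}$$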
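$$m_1 l_1^2\ddot\vartheta_1 + m_2 l_1^2\ddot\vartheta_1 + m_2 l_1 l_2\ddot\vartheta_2\cos(\vartheta_1 - \vartheta_2) - m_2 l_1 l_2\dot\vartheta_2(\dot\vartheta_1 - \dot\vartheta_2)\sin(\vartheta_1 - \vartheta_2)$$
$$= -m_2 l_1 l_2\dot\vartheta_1\dot\vartheta_2\sin(\vartheta_1 - \vartheta_2) - m_1 g l_1\sin\vartheta_1 - m_2 g l_1\sin\vartheta_1$$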
and
or
and
m1 = m2 = m and l1 = l2 = l,
(15.30) reduce to
(e) If moreover the oscillations are small, then sin ϑ = ϑ, cos ϑ = 1, and terms
proportional to ϑ̇ 2 are negligible, which leads to
ϑ1 = A1 eiωt , ϑ2 = A2 eiωt ,
To ensure that A1 and A2 do not vanish simultaneously, the determinant of the coeffi-
cients must vanish:
$$\begin{vmatrix} 2(g - l\omega^2) & -l\omega^2 \\ -l\omega^2 & g - l\omega^2 \end{vmatrix} = 0,$$
and therefore,
$$l^2\omega^4 - 4lg\,\omega^2 + 2g^2 = 0$$
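The characteristic equation is quadratic in ω² and can be solved directly. The following sketch uses assumed sample values for g and l; the function names are hypothetical. It returns the two eigenfrequencies and checks them against the determinant condition.

```python
import math


def double_pendulum_frequencies(g, l):
    """Roots of l**2 * w**4 - 4*l*g*w**2 + 2*g**2 = 0, returned as (w_low, w_high)."""
    a, b, c = l ** 2, -4.0 * l * g, 2.0 * g ** 2
    disc = math.sqrt(b * b - 4.0 * a * c)
    w2_low = (-b - disc) / (2.0 * a)
    w2_high = (-b + disc) / (2.0 * a)
    return math.sqrt(w2_low), math.sqrt(w2_high)


def determinant(g, l, w):
    """Determinant of the coefficient matrix for a trial frequency w."""
    return 2.0 * (g - l * w ** 2) ** 2 - (l * w ** 2) ** 2


g, l = 9.81, 1.0   # assumed sample values
w_low, w_high = double_pendulum_frequencies(g, l)
```

The two roots are ω² = (2 ∓ √2) g/l, so both eigenfrequencies scale with √(g/l), as for the simple pendulum.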
EXERCISE
$$\frac{d^2u}{dt^2} + \frac{g}{4a}\,u = 0. \tag{15.36}$$
The solution of this differential equation is
$$u = \cos\frac{\vartheta}{2} = C_1\cos\sqrt{\frac{g}{4a}}\,t + C_2\sin\sqrt{\frac{g}{4a}}\,t.$$
The motion is just like the vibration of an ordinary pendulum of length l = 4a. The
arrangement is therefore called a “cycloid pendulum.”
EXERCISE
The last term on the right-hand side is the Coriolis force caused by the time variation
of the pendulum length r.
For the coordinate r, one obtains
The first term on the right side represents the radial acceleration, the second term
follows from the radial component of the weight force, and the last term represents
Hooke’s law. For small amplitudes ϕ the motion appears as a superposition of harmonic
vibrations in the r, ϕ-plane.
EXERCISE
Problem. Four mass points of mass m move on a circle of radius R. Each mass point
is coupled to its two neighboring points by a spring with spring constant k (Fig. 15.22).
Find the Lagrangian of the system, and derive the equations of motion of the system.
Calculate the eigenfrequencies of the system, and discuss the related eigenvibrations.
Solution. The kinetic energy of the system is given by
$$T = \frac{1}{2}m\sum_{\nu=1}^{4}\dot s_\nu^2. \tag{15.44}$$
Fig. 15.22.
For small displacements from the equilibrium position, the potential reads
$$V = \frac{1}{2}k\sum_{\nu=1}^{4}(s_{\nu+1} - s_\nu)^2, \qquad s_{4+1} = s_1. \tag{15.45}$$
We set sν = Rϕν , and take the angles ϕν as generalized coordinates. Then the La-
grangian is
$$L = T - V = \frac{1}{2}mR^2\sum_{\nu=1}^{4}\dot\varphi_\nu^2 - \frac{1}{2}kR^2\sum_{\nu=1}^{4}(\varphi_{\nu+1} - \varphi_\nu)^2. \tag{15.46}$$
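$$\frac{d}{dt}\frac{\partial L}{\partial\dot\varphi_\nu} = \frac{\partial L}{\partial\varphi_\nu}, \tag{15.47}$$
$$\frac{d}{dt}\frac{\partial L}{\partial\dot\varphi_\nu} = mR^2\ddot\varphi_\nu = -\frac{1}{2}kR^2\left[2(\varphi_\nu - \varphi_{\nu+1}) + 2(\varphi_\nu - \varphi_{\nu-1})\right] = \frac{\partial L}{\partial\varphi_\nu}. \tag{15.48}$$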
For the case of four mass points, we then obtain
$$\ddot\varphi_1 = \frac{k}{m}(\varphi_2 - 2\varphi_1 + \varphi_4),$$
$$\ddot\varphi_2 = \frac{k}{m}(\varphi_3 - 2\varphi_2 + \varphi_1),$$
$$\ddot\varphi_3 = \frac{k}{m}(\varphi_4 - 2\varphi_3 + \varphi_2), \tag{15.49}$$
$$\ddot\varphi_4 = \frac{k}{m}(\varphi_1 - 2\varphi_4 + \varphi_3).$$
With the ansatz ϕν = Aν cos ωt, ϕ̈ν = −Aν ω2 cos ωt, we are led to the following linear
system of equations:
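$$\begin{pmatrix} 2\frac{k}{m} - \omega^2 & -\frac{k}{m} & 0 & -\frac{k}{m} \\ -\frac{k}{m} & 2\frac{k}{m} - \omega^2 & -\frac{k}{m} & 0 \\ 0 & -\frac{k}{m} & 2\frac{k}{m} - \omega^2 & -\frac{k}{m} \\ -\frac{k}{m} & 0 & -\frac{k}{m} & 2\frac{k}{m} - \omega^2 \end{pmatrix} \begin{pmatrix} A_1 \\ A_2 \\ A_3 \\ A_4 \end{pmatrix} = 0. \tag{15.50}$$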
For the nontrivial solutions, the determinant of the coefficient matrix must vanish. This
condition leads to the determining equation for the eigenfrequencies:
$$\left(2\frac{k}{m} - \omega^2\right)^2\left(4\frac{k}{m} - \omega^2\right)(-\omega^2) = 0. \tag{15.51}$$
The frequencies are
$$\omega_1^2 = 0, \qquad \omega_2^2 = 4\frac{k}{m}, \qquad \omega_3^2 = \omega_4^2 = 2\frac{k}{m}. \tag{15.52}$$
To calculate the related eigenvibrations, we insert these frequencies into the system of
equations (15.50).
(1) $\omega_1^2 = 0$: A1 = A2 = A3 = A4. The system does not vibrate but performs a uniform
rotation (Fig. 15.23(a)).
(2) $\omega_2^2 = 4k/m$: A1 = A3 = −A2 = −A4. Two neighboring mass points perform an
out-of-phase vibration (Fig. 15.23(b)).
(3) $\omega_3^2 = \omega_4^2 = 2k/m$: A1 = A2 = −A3 = −A4 or A1 = A4 = −A2 = −A3. Two
neighboring mass points vibrate in phase (Fig. 15.24(a,b)).
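The eigenvibrations listed above can be verified by direct substitution into (15.50). The sketch below builds the coupling matrix for assumed sample values of k and m and checks that each amplitude vector is an eigenvector with the stated ω². The helper names are hypothetical.

```python
def ring_matrix(k, m, n=4):
    """Matrix D with phi_ddot = -D * phi for n equal masses on a ring of springs."""
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        D[i][i] = 2.0 * k / m
        D[i][(i + 1) % n] -= k / m   # coupling to the next neighbor
        D[i][(i - 1) % n] -= k / m   # coupling to the previous neighbor
    return D


def mat_vec(D, v):
    """Plain matrix-vector product."""
    return [sum(D[i][j] * v[j] for j in range(len(v))) for i in range(len(v))]


def is_eigenpair(D, w2, A, tol=1e-12):
    """True if D * A = w2 * A holds component by component."""
    return all(abs(lhs - w2 * a) < tol for lhs, a in zip(mat_vec(D, A), A))


k, m = 3.0, 0.5   # assumed sample values
D = ring_matrix(k, m)
modes = [
    (0.0, [1, 1, 1, 1]),            # uniform rotation
    (4.0 * k / m, [1, -1, 1, -1]),  # neighbors out of phase
    (2.0 * k / m, [1, 1, -1, -1]),  # degenerate pair, first choice
    (2.0 * k / m, [1, -1, -1, 1]),  # degenerate pair, second choice
]
checks = [is_eigenpair(D, w2, A) for w2, A in modes]
```

All four entries of `checks` evaluate to true. Because the frequency 2k/m is doubly degenerate, any linear combination of the last two amplitude vectors is also an eigenvibration.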
EXERCISE
Problem. Write down the Lagrangian of the heavy asymmetric top. Use the Euler
angles as generalized coordinates and determine the related generalized momenta.
Which coordinate is cyclic? Which further cyclic coordinate appears for a symmetric
top?
Solution. In the system of principal axes, the kinetic energy of motion is given by
$$T = \frac{1}{2}\Theta_1\omega_1^2 + \frac{1}{2}\Theta_2\omega_2^2 + \frac{1}{2}\Theta_3\omega_3^2.$$
The potential energy is
We take the Euler angles (α, β, γ ) as generalized coordinates. The angular velocities
expressed by these coordinates read (see (13.43)),
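$$L = \frac{1}{2}\Theta_1\left(\dot\alpha^2\sin^2\beta\sin^2\gamma + \dot\beta^2\cos^2\gamma + 2\dot\alpha\dot\beta\sin\beta\sin\gamma\cos\gamma\right)$$
$$\quad + \frac{1}{2}\Theta_2\left(\dot\alpha^2\sin^2\beta\cos^2\gamma + \dot\beta^2\sin^2\gamma - 2\dot\alpha\dot\beta\sin\beta\sin\gamma\cos\gamma\right)$$
$$\quad + \frac{1}{2}\Theta_3\left(\dot\alpha\cos\beta + \dot\gamma\right)^2 - Mgl\cos\beta.$$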
The Euler angles as generalized coordinates obey the Euler–Lagrange equations
$$\frac{d}{dt}\frac{\partial L}{\partial\dot\alpha} = \frac{\partial L}{\partial\alpha}$$
and the analogous equations for β and γ . The Lagrangian does not depend on the
angle α; hence, this coordinate is cyclic, and the related generalized momentum is
conserved.
We determine the generalized momenta:
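$$p_\alpha = \frac{\partial L}{\partial\dot\alpha} = \Theta_1\omega_1\frac{\partial\omega_1}{\partial\dot\alpha} + \Theta_2\omega_2\frac{\partial\omega_2}{\partial\dot\alpha} + \Theta_3\omega_3\frac{\partial\omega_3}{\partial\dot\alpha}$$
$$= \Theta_1(\dot\alpha\sin\beta\sin\gamma + \dot\beta\cos\gamma)\sin\beta\sin\gamma + \Theta_2(\dot\alpha\sin\beta\cos\gamma - \dot\beta\sin\gamma)\sin\beta\cos\gamma + \Theta_3(\dot\alpha\cos\beta + \dot\gamma)\cos\beta$$
$$= \dot\alpha\sin^2\beta\,(\Theta_1\sin^2\gamma + \Theta_2\cos^2\gamma) + (\Theta_1 - \Theta_2)\dot\beta\cos\gamma\sin\beta\sin\gamma + \Theta_3(\dot\alpha\cos\beta + \dot\gamma)\cos\beta.$$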
The generalized momenta, being the projection of the total angular momentum onto
the rotational axis related to the particular Euler angle, have a direct physical meaning.
pα is the projection of the total angular momentum onto the space-fixed z-axis (see
Exercise 13.12):
pα = L · eα = L · ez .
This projection is a conserved quantity for the asymmetric and the symmetric top.
Since the gravitational force acts only along the z-direction, the angular momentum
about this axis remains unchanged.
pβ is the projection of the total angular momentum onto the nodal line, i.e., the
axis about which the second Euler rotation is being performed:
pβ = L · eβ = L · ex .
pγ = L · eγ = L · ez′ .
For a symmetric top, the body-fixed z′-axis is a symmetry axis, and the angular
momentum projection L · ez′ is conserved.
16 Lagrange Equation for Nonholonomic Constraints
For systems with holonomic constraints, the dependent coordinates can be eliminated
by introducing generalized coordinates. If the constraints are nonholonomic, this ap-
proach does not work. There is no general method for treating nonholonomic prob-
lems. Only for those special nonholonomic constraints that can be given in differential
form can one eliminate the dependent equations by the method of Lagrange multipli-
ers. We therefore consider a system with constraints given in the form
$$\sum_{\nu=1}^{n}a_{l\nu}\,dq_\nu + a_{lt}\,dt = 0 \tag{16.1}$$
$$\sum_{\nu=1}^{n}a_{l\nu}\,\delta q_\nu = 0.$$
These are also called instantaneous (belonging to a fixed time) constraints. This in
turn leads to
$$\sum_{l=1}^{s}\lambda_l\sum_{\nu=1}^{n}a_{l\nu}\,\delta q_\nu = 0$$
or
$$\sum_{\nu=1}^{n}\sum_{l=1}^{s}\lambda_l a_{l\nu}\,\delta q_\nu = 0. \tag{16.3}$$
$$\sum_{l=1}^{s}\lambda_l a_{l\nu} = \frac{d}{dt}\frac{\partial T}{\partial\dot q_\nu} - \frac{\partial T}{\partial q_\nu} - Q_\nu \qquad (\nu = 1,\dots,s),$$
i.e., the first s coefficients in (16.4) that correspond to the dependent qν are set to zero:
$$\frac{d}{dt}\frac{\partial T}{\partial\dot q_\nu} - \frac{\partial T}{\partial q_\nu} - Q_\nu - \sum_l\lambda_l a_{l\nu} = 0 \quad\text{for } \nu = 1,\dots,s.$$
These δqν (for ν = s + 1, . . . , n) are no longer subject to constraints. This means that
these δqν are independent of each other. One then must set the coefficients of the δqν
(ν = s + 1, . . . , n) equal to zero, just as in the derivation of the Lagrange equation for
holonomic systems.
This leads, together with the s equations for the dependent qν , to n equations in
total:
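$$\frac{d}{dt}\frac{\partial T}{\partial\dot q_\nu} - \frac{\partial T}{\partial q_\nu} - Q_\nu - \sum_{l=1}^{s}\lambda_l a_{l\nu} = 0 \quad\text{for } \nu = 1,\dots,s,s+1,\dots,n. \tag{16.5}$$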
As in the derivation of the Lagrange equation for holonomic systems, we can re-
formulate (16.5) with the Lagrangian L = T − V as follows:
$$\frac{d}{dt}\frac{\partial L}{\partial\dot q_\nu} - \frac{\partial L}{\partial q_\nu} - \sum_{l=1}^{s}\lambda_l a_{l\nu} = 0, \qquad \nu = 1,\dots,n. \tag{16.6}$$
where the Q∗ν enter in addition to the Qν . The equations (16.6) and (16.7) must be
identical. This leads to
$$Q^*_\nu = \sum_l\lambda_l a_{l\nu}; \tag{16.8}$$
i.e., the Lagrange multipliers λl determine the generalized constraint reactions Q∗ν ;
they will not be eliminated but are part of the solution of the problem (see also the
statements in Chap. 17 on this topic). The relation (16.3) thus changes to
$$\sum_\nu Q^*_\nu\,\delta q_\nu = 0, \tag{16.9}$$
implying that the total virtual work performed by all constraint reactions vanishes.
This can be considered as the general proof of the thesis introduced in (15.3), that
constraint reactions do not perform work.
EXAMPLE
Example 16.1 The two generalized coordinates are s, ϕ. The constraint reads
R ϕ̇ = ṡ or Rdϕ − ds = 0.
as = −1, aϕ = R,
$$\Theta_{\mathrm{solcyl}} = \frac{1}{2}mR^2.$$
The potential energy V is
One should note that this Lagrangian cannot be used directly to derive the equation of
motion according to (15.17). The reason is that the two coordinates s and ϕ are not
independent of each other. Thus, ϕ is not an ignorable coordinate, although it does not
explicitly appear in the Lagrangian.
Since there is only one constraint, only one Lagrange multiplier λ is needed. With
the coefficients
as = −1, aϕ = R,
EXERCISE
Problem. A particle of mass m moves without friction under the action of gravitation
on the inner surface of a paraboloid, which is given by
$$x^2 + y^2 = az.$$
Solution. (a) The appropriate coordinates are the cylindrical coordinates r, ϕ, z. The
kinetic energy expressed in cylindrical coordinates reads
$$T = \frac{1}{2}m(\dot r^2 + r^2\dot\varphi^2 + \dot z^2).$$
Hence, the Lagrangian is
$$L = \frac{1}{2}m(\dot r^2 + r^2\dot\varphi^2 + \dot z^2) - mgz. \tag{16.13}$$
The constraint is x² + y² = az. Since x² + y² = r², we have r² − az = 0, or in
differential form, 2rδr − aδz = 0.
Adopting the notation r = q1 , ϕ = q2 , z = q3 , from
$$\sum_\alpha A_\alpha\,\delta q_\alpha = 0$$
i.e.,
$$\omega = \sqrt{\frac{2g}{a}}$$
is the desired initial angular velocity.
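This result can be cross-checked numerically. The sketch below assumes that the equations of motion follow from (16.6) with the coefficients 2r and −a of the differential constraint 2rδr − aδz = 0, i.e., m(r̈ − rϕ̇²) = 2rλ and m z̈ = −mg − aλ. For a horizontal circle both r̈ and z̈ vanish. The sample values and the function name are assumptions for illustration.

```python
import math

g = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def residuals(m, a, h, omega, lam):
    """Residuals of the r and z equations for a circle at height z = h.

    Assumed equations (with r_ddot = z_ddot = 0 on the circle):
        m*(r_ddot - r*omega**2) - 2*r*lam = 0
        m*z_ddot + m*g + a*lam = 0
    """
    r = math.sqrt(a * h)   # constraint r**2 = a*z
    radial = -m * r * omega ** 2 - 2.0 * r * lam
    vertical = m * g + a * lam
    return radial, vertical


m, a = 0.2, 0.5                      # assumed sample values
omega = math.sqrt(2.0 * g / a)       # angular velocity stated in the text
lam = -m * g / a                     # multiplier from the z equation
res_low = residuals(m, a, 0.3, omega, lam)
res_high = residuals(m, a, 1.7, omega, lam)
```

Both residual pairs vanish up to rounding for either height, which illustrates that the required angular velocity does not depend on h.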
(c) From md(r 2 ϕ̇)/dt = 0, it follows that r 2 ϕ̇ = constant = A. We suppose that the
particle has the initial angular velocity ω; i.e.,
$$A = ah\omega, \quad\text{and therefore}\quad \dot\varphi = \frac{ah\omega}{r^2}.$$
Since the particle oscillates about z = h with only small amplitude, we use λ1 =
−mg/a, which holds for z = h, and we obtain
$$m(\ddot r - r\dot\varphi^2) = -\frac{2mg}{a}\,r \;\Rightarrow\; \ddot r - \frac{a^2h^2\omega^2}{r^3} = -\frac{2gr}{a}.$$
Since the oscillation is small, we have r = r0 + u; i.e.,
$$\ddot u - \frac{a^2h^2\omega^2}{(r_0 + u)^3} = -\frac{2g}{a}(r_0 + u). \tag{16.16}$$
We have
$$\frac{1}{(r_0 + u)^3} = \frac{1}{r_0^3(1 + u/r_0)^3} = \frac{1}{r_0^3}\left(1 + \frac{u}{r_0}\right)^{-3} \approx \frac{1}{r_0^3}\left(1 - \frac{3u}{r_0}\right),$$
EXERCISE
Problem. Three mass points m1 , m2 , m3 are fixed to the ends of two massless rods
and glide without friction in a circular tire of radius R, which stands vertically in the
gravitational field of the earth. Find the equations of motion by means of Lagrange
multipliers, and determine the equilibrium position. Find the frequency of small oscil-
lations about the equilibrium position.
Solution. We use the angles ϕ1 , ϕ2 , and ϕ3 as generalized coordinates. The angles
are not independent of each other, but are coupled by the rigid rods connecting the
mass points, via the constraints
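$$\varphi_3 - \varphi_2 = \alpha = \text{constant}, \qquad \varphi_2 - \varphi_1 = \beta = \text{constant}. \tag{16.18}$$
In differential form, $\sum_\nu a_{l\nu}\,\delta q_\nu = 0$, they read
$$\delta\varphi_3 - \delta\varphi_2 = 0, \qquad \delta\varphi_2 - \delta\varphi_1 = 0. \tag{16.19}$$
Fig. 16.2.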
$$\frac{d}{dt}\frac{\partial L}{\partial\dot\varphi_\nu} - \frac{\partial L}{\partial\varphi_\nu} = \sum_{l=1}^{s}\lambda_l\,a_{l\nu}. \tag{16.21}$$
The number of constraints in the case considered here is s = 2. We thus obtain the
three equations of motion:
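$$\frac{g}{R}\sin\varphi_1 + \frac{\lambda_2}{m_1R^2} = \frac{g}{R}\sin\varphi_2 + \frac{\lambda_1}{m_2R^2} - \frac{\lambda_2}{m_2R^2} = \frac{g}{R}\sin\varphi_3 - \frac{\lambda_1}{m_3R^2}. \tag{16.25}$$
$$\frac{\lambda_1}{R^2} = m_3\frac{\omega_0^2}{M}\left[m_1(\sin\varphi_3 - \sin\varphi_1) + m_2(\sin\varphi_3 - \sin\varphi_2)\right],$$
$$\frac{\lambda_2}{R^2} = m_1\frac{\omega_0^2}{M}\left[m_2(\sin\varphi_2 - \sin\varphi_1) + m_3(\sin\varphi_3 - \sin\varphi_1)\right]. \tag{16.26}$$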
Next, we set M = m1 + m2 + m3 and ω02 = g/R. The angles ϕ3 and ϕ2 can be ex-
pressed by ϕ1 via the constraint (16.18), so that one differential equation in the variable
ϕ1 describes the entire system. Hence, from (16.22) and (16.26) we obtain
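$$\ddot\varphi_1 = -\frac{\omega_0^2}{M}\left[m_1\sin\varphi_1 + m_2\sin\varphi_2 + m_3\sin\varphi_3\right]$$
$$\phantom{\ddot\varphi_1} = -\frac{\omega_0^2}{M}\left[m_1\sin\varphi_1 + m_2\sin(\varphi_1+\beta) + m_3\sin(\varphi_1+\alpha+\beta)\right]. \tag{16.27}$$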
The equilibrium position is at the point of vanishing acceleration ϕ̈1 = 0:
By means of the addition theorem $\sin(\varphi + \vartheta) \approx \sin\varphi + \vartheta\cos\varphi$ (valid for small $\vartheta$) and $\ddot\varphi_1 = \ddot\vartheta$, we
obtain from (16.27) the desired frequency:
ω02
For small amplitudes, this differential equation describes the vibrations of a physi-
cal pendulum. For α = β = 0, and, hence, ϕ10 = 0, it turns into the equation of the
mathematical pendulum.
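As an illustration of (16.27), the equilibrium angle and the small-oscillation frequency can be evaluated in closed form, because the bracket in (16.27) is of the form $A\sin\varphi_1 + B\cos\varphi_1$. The following Python sketch does this; the function names and the sample masses and angles are illustrative assumptions, and the angles are taken to be measured from the lowest point of the tire, as the restoring form of (16.27) suggests.

```python
import math

def acceleration(phi1, m1, m2, m3, alpha, beta, g=9.81, R=1.0):
    """Right-hand side of (16.27)."""
    M = m1 + m2 + m3
    s = (m1 * math.sin(phi1) + m2 * math.sin(phi1 + beta)
         + m3 * math.sin(phi1 + alpha + beta))
    return -(g / R) / M * s

def hoop_small_oscillation(m1, m2, m3, alpha, beta, g=9.81, R=1.0):
    """Return (phi10, omega): stable equilibrium angle and small-oscillation frequency."""
    M = m1 + m2 + m3
    # sum_i m_i sin(phi1 + delta_i) = A sin(phi1) + B cos(phi1) = C sin(phi1 + gamma)
    A = m1 + m2 * math.cos(beta) + m3 * math.cos(alpha + beta)
    B = m2 * math.sin(beta) + m3 * math.sin(alpha + beta)
    phi10 = -math.atan2(B, A)                        # zero with a restoring slope
    omega = math.sqrt((g / R) * math.hypot(A, B) / M)
    return phi10, omega

phi10, omega = hoop_small_oscillation(1.0, 2.0, 3.0, alpha=0.4, beta=0.3)
print(phi10, omega, acceleration(phi10, 1.0, 2.0, 3.0, 0.4, 0.3))
```

For α = β = 0 the routine returns ϕ10 = 0 and ω = ω0, the mathematical pendulum limit mentioned above.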
17 Special Problems
So far, we defined conservative forces F by the condition that they can be derived from
a potential V (r) by forming gradients, i.e.,
$$\mathbf{F} = -\nabla V. \tag{17.1}$$
The potential V (r, t) is a function of the position and in general also of the time. This
is possible as long as the forces do not depend on velocities or accelerations. There are,
however, such cases: for instance, the Lorentz force which acts on a charged particle
in the electromagnetic field is velocity-dependent:
$$\mathbf{F}^{(e)} = e\left(\mathbf{E} + \frac{\mathbf{v}}{c}\times\mathbf{B}\right). \tag{17.2}$$
Here, e is the charge of the particle, and E and B are the electric and magnetic field
strength, respectively. F (e) indicates that this shall be an external force.
If external forces depend on the velocity or the acceleration, we shall call them
conservative as well if they can be expressed by a potential V that depends on the
generalized coordinates qj , the generalized velocities q̇j and the time t , according to
$$Q_j = -\frac{\partial V}{\partial q_j} + \frac{d}{dt}\frac{\partial V}{\partial\dot q_j} \tag{17.3}$$
means the gradient vector with respect to the components of the velocity of the ith
particle.
The velocity-dependent potential
is sometimes called the generalized potential. We know from (15.14) that the kinetic
energy T and the generalized forces Qj are related by
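$$\frac{d}{dt}\frac{\partial T}{\partial\dot q_j} - \frac{\partial T}{\partial q_j} = Q_j. \tag{17.5}$$
Now using (17.3), we obtain
$$\frac{d}{dt}\frac{\partial T}{\partial\dot q_j} - \frac{\partial T}{\partial q_j} = -\frac{\partial V}{\partial q_j} + \frac{d}{dt}\frac{\partial V}{\partial\dot q_j}$$
or
$$\frac{d}{dt}\frac{\partial L}{\partial\dot q_j} - \frac{\partial L}{\partial q_j} = 0,$$
if we define the generalized Lagrangian L by
$$L = T - V.$$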
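$$\bar Q_j = \sum_{\nu=1}^{3N}\frac{\partial q_\nu}{\partial\bar q_j}\,Q_\nu. \tag{17.6}$$

$$\dot q_k = \frac{d}{dt}(q_k) = \sum_{j=1}^{3N}\frac{\partial q_k}{\partial\bar q_j}\,\dot{\bar q}_j + \frac{\partial q_k}{\partial t}.$$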
Then
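$$\frac{\partial V}{\partial\bar q_j} = \sum_{\nu=1}^{3N}\frac{\partial V}{\partial q_\nu}\frac{\partial q_\nu}{\partial\bar q_j} + \sum_{\nu=1}^{3N}\frac{\partial V}{\partial\dot q_\nu}\frac{\partial\dot q_\nu}{\partial\bar q_j} + \underbrace{\frac{\partial V}{\partial t}\frac{\partial t}{\partial\bar q_j}}_{=0}$$
$$\phantom{\frac{\partial V}{\partial\bar q_j}} = \sum_{\nu=1}^{3N}\frac{\partial V}{\partial q_\nu}\frac{\partial q_\nu}{\partial\bar q_j} + \sum_{\nu=1}^{3N}\frac{\partial V}{\partial\dot q_\nu}\frac{\partial}{\partial\bar q_j}\left(\sum_{\alpha=1}^{3N}\frac{\partial q_\nu}{\partial\bar q_\alpha}\,\dot{\bar q}_\alpha + \frac{\partial q_\nu}{\partial t}\right). \tag{17.9}$$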
17.1 Velocity-Dependent Potentials 313
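Because
$$Q_\nu = -\frac{\partial V}{\partial q_\nu} + \frac{d}{dt}\frac{\partial V}{\partial\dot q_\nu},$$
we write the generalized force $\bar Q_j$ (17.6) as follows:
$$\bar Q_j = -\sum_{\nu=1}^{3N}\frac{\partial V}{\partial q_\nu}\frac{\partial q_\nu}{\partial\bar q_j} + \sum_{\nu=1}^{3N}\left(\frac{d}{dt}\frac{\partial V}{\partial\dot q_\nu}\right)\frac{\partial q_\nu}{\partial\bar q_j}$$
$$\phantom{\bar Q_j} = -\sum_{\nu=1}^{3N}\frac{\partial V}{\partial q_\nu}\frac{\partial q_\nu}{\partial\bar q_j} + \frac{d}{dt}\sum_{\nu=1}^{3N}\left(\frac{\partial V}{\partial\dot q_\nu}\cdot\frac{\partial q_\nu}{\partial\bar q_j}\right) - \sum_{\nu=1}^{3N}\frac{\partial V}{\partial\dot q_\nu}\,\frac{d}{dt}\frac{\partial q_\nu}{\partial\bar q_j}.$$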
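$$\bar Q_j = -\frac{\partial V}{\partial\bar q_j} + \sum_{\nu=1}^{3N}\frac{\partial V}{\partial\dot q_\nu}\frac{\partial}{\partial\bar q_j}\left(\sum_{\alpha=1}^{3N}\frac{\partial q_\nu}{\partial\bar q_\alpha}\,\dot{\bar q}_\alpha + \frac{\partial q_\nu}{\partial t}\right) + \sum_{\nu=1}^{3N}\frac{d}{dt}\left(\frac{\partial V}{\partial\dot q_\nu}\frac{\partial q_\nu}{\partial\bar q_j}\right) - \sum_{\nu=1}^{3N}\frac{\partial V}{\partial\dot q_\nu}\,\frac{d}{dt}\frac{\partial q_\nu}{\partial\bar q_j}.$$
Because $\partial\dot q_\nu/\partial\dot{\bar q}_j = \partial q_\nu/\partial\bar q_j$, the third term can be written as
$$\frac{d}{dt}\sum_{\nu=1}^{3N}\frac{\partial V}{\partial\dot q_\nu}\cdot\frac{\partial\dot q_\nu}{\partial\dot{\bar q}_j} = \frac{d}{dt}\frac{\partial V}{\partial\dot{\bar q}_j}.$$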
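Thus, $\bar Q_j$ becomes
$$\bar Q_j = -\frac{\partial V}{\partial\bar q_j} + \frac{d}{dt}\frac{\partial V}{\partial\dot{\bar q}_j} + \sum_{\nu=1}^{3N}\frac{\partial V}{\partial\dot q_\nu}\frac{\partial}{\partial\bar q_j}\left(\sum_{\alpha=1}^{3N}\frac{\partial q_\nu}{\partial\bar q_\alpha}\,\dot{\bar q}_\alpha + \frac{\partial q_\nu}{\partial t}\right) - \sum_{\nu=1}^{3N}\frac{\partial V}{\partial\dot q_\nu}\left(\sum_{\alpha=1}^{3N}\frac{\partial}{\partial\bar q_\alpha}\frac{\partial q_\nu}{\partial\bar q_j}\,\dot{\bar q}_\alpha + \frac{\partial}{\partial t}\frac{\partial q_\nu}{\partial\bar q_j}\right).$$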
EXAMPLE
In the lectures on electrodynamics1 we shall show that the electric field strength E and
the magnetic field strength B can be derived from the scalar potential Φ(r, t) and the
vector potential A(r, t), namely,
$$\mathbf{E} = -\nabla\Phi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}, \qquad \mathbf{B} = \nabla\times\mathbf{A}. \tag{17.10}$$
In other words, the electromagnetic phenomena can be described by Φ, A instead of
E, B. Now we show that in the frame of the Lagrangian formalism the Lorentz force
(17.2) can be described by the velocity-dependent potential
$$V = e\Phi - \frac{e}{c}\,\mathbf{A}\cdot\mathbf{v}. \tag{17.11}$$
The Lagrangian then reads
$$L = T - V = \frac{1}{2}m\mathbf{v}^2 - e\Phi + \frac{e}{c}\,\mathbf{A}\cdot\mathbf{v}. \tag{17.12}$$
We restrict ourselves to the Lagrange equation for the x-component:
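$$\frac{d}{dt}\frac{\partial L}{\partial v_x} - \frac{\partial L}{\partial x} = 0. \tag{17.13}$$
The other components follow likewise. We calculate
$$\frac{\partial L}{\partial x} = -e\frac{\partial\Phi}{\partial x} + \frac{e}{c}\frac{\partial\mathbf{A}}{\partial x}\cdot\mathbf{v}, \qquad \frac{\partial L}{\partial v_x} = mv_x + \frac{e}{c}A_x,$$
and furthermore according to (17.13),
$$\frac{d}{dt}(mv_x) = -e\frac{\partial\Phi}{\partial x} + \frac{e}{c}\frac{\partial\mathbf{A}}{\partial x}\cdot\mathbf{v} - \frac{e}{c}\frac{dA_x}{dt}. \tag{17.14}$$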
For the last term, we obtain
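$$\frac{dA_x}{dt} = \frac{\partial A_x}{\partial t} + \frac{\partial A_x}{\partial x}\frac{dx}{dt} + \frac{\partial A_x}{\partial y}\frac{dy}{dt} + \frac{\partial A_x}{\partial z}\frac{dz}{dt} = \frac{\partial A_x}{\partial t} + \frac{\partial A_x}{\partial x}v_x + \frac{\partial A_x}{\partial y}v_y + \frac{\partial A_x}{\partial z}v_z, \tag{17.15}$$
and for the intermediate term,
$$\frac{\partial\mathbf{A}}{\partial x}\cdot\mathbf{v} = \frac{\partial A_x}{\partial x}v_x + \frac{\partial A_y}{\partial x}v_y + \frac{\partial A_z}{\partial x}v_z. \tag{17.16}$$
Equations (17.15) and (17.16) are now inserted into (17.14) and yield
$$\frac{d(mv_x)}{dt} = e\left(-\frac{\partial\Phi}{\partial x} - \frac{1}{c}\frac{\partial A_x}{\partial t}\right) + \frac{e}{c}\left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right)v_y - \frac{e}{c}\left(\frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}\right)v_z$$
$$\phantom{\frac{d(mv_x)}{dt}} = eE_x + \frac{e}{c}\left(B_z v_y - B_y v_z\right) = e\left(\mathbf{E} + \frac{1}{c}\,\mathbf{v}\times\mathbf{B}\right)_x .$$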
1 See W. Greiner: Classical Electrodynamics, 1st ed., Springer, Berlin (1998), Chapter 23.
17.2 Nonconservative Forces and Dissipation Function (Friction Function) 315
Corresponding expressions are obtained for the y- and z-components, so that we get
in total
$$\frac{d}{dt}(m\mathbf{v}) = e\left(\mathbf{E} + \frac{1}{c}\,\mathbf{v}\times\mathbf{B}\right), \tag{17.17}$$
i.e., Newton’s equation of motion with the Lorentz force.
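The equivalence of (17.14) and the Lorentz force can also be checked numerically. The Python sketch below evaluates the right-hand side of (17.14) for all three components and compares it with $e(\mathbf{E} + \mathbf{v}\times\mathbf{B}/c)$, with E and B obtained from (17.10). The polynomial potentials, the units e = c = 1, the finite-difference step, and all function names are illustrative assumptions.

```python
E_CHARGE, C_LIGHT, H = 1.0, 1.0, 1e-5   # illustrative units and difference step

def phi(r, t):
    """Hypothetical scalar potential."""
    x, y, z = r
    return 0.3 * x - 0.2 * y * z + 0.1 * x * x

def vec_A(r, t):
    """Hypothetical vector potential."""
    x, y, z = r
    return (0.5 * y * z + 0.2 * t * x, -0.4 * x * z, 0.7 * x * y + 0.1 * t)

def shifted(r, k, d):
    s = list(r)
    s[k] += d
    return s

def fields(r, t):
    """Central differences: grad(phi), dA[k][i] = dA_i/dx_k, and dA/dt."""
    grad_phi = [(phi(shifted(r, k, H), t) - phi(shifted(r, k, -H), t)) / (2 * H)
                for k in range(3)]
    dA = [[(vec_A(shifted(r, k, H), t)[i] - vec_A(shifted(r, k, -H), t)[i]) / (2 * H)
           for i in range(3)] for k in range(3)]
    dA_dt = [(vec_A(r, t + H)[i] - vec_A(r, t - H)[i]) / (2 * H) for i in range(3)]
    return grad_phi, dA, dA_dt

def force_from_lagrangian(r, v, t):
    """Right-hand side of (17.14), written for each component i."""
    g, dA, dAt = fields(r, t)
    out = []
    for i in range(3):
        dA_dot_v = sum(dA[i][k] * v[k] for k in range(3))            # (dA/dx_i) . v
        total_dAi = dAt[i] + sum(v[k] * dA[k][i] for k in range(3))  # dA_i/dt along the path
        out.append(-E_CHARGE * g[i] + E_CHARGE / C_LIGHT * (dA_dot_v - total_dAi))
    return out

def lorentz_force(r, v, t):
    """e (E + v x B / c) with E and B from (17.10)."""
    g, dA, dAt = fields(r, t)
    E = [-g[i] - dAt[i] / C_LIGHT for i in range(3)]
    B = [dA[1][2] - dA[2][1], dA[2][0] - dA[0][2], dA[0][1] - dA[1][0]]
    vxB = [v[1] * B[2] - v[2] * B[1], v[2] * B[0] - v[0] * B[2], v[0] * B[1] - v[1] * B[0]]
    return [E_CHARGE * (E[i] + vxB[i] / C_LIGHT) for i in range(3)]

r0, v0, t0 = (0.4, -1.2, 0.7), (1.5, -0.3, 2.0), 0.9
print(force_from_lagrangian(r0, v0, t0))
print(lorentz_force(r0, v0, t0))
```

Both functions use the same numerical derivatives, so the comparison tests the algebraic identity behind (17.14) to (17.17) rather than the accuracy of the finite differences.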
So far, the discussion was restricted to conservative forces only. We now consider
systems with conservative and nonconservative forces. Such systems are first of all
systems with friction. They play an important role in classical physics and recently
also in heavy-ion physics. If two atomic nuclei collide, many internal degrees of free-
dom are excited; one can say that the nuclei are being heated up. Energy of relative
motion is lost. This is a signature for friction forces, which are generally considered
as being responsible for the energy loss.
We begin our discussion of nonconservative (e.g., friction) forces with the Lagrange
equations in the form
$$\frac{d}{dt}\frac{\partial T}{\partial\dot q_j} - \frac{\partial T}{\partial q_j} = Q_j, \qquad j = 1, 2, \ldots, n, \tag{17.18}$$
and split the generalized forces $Q_j$ into a conservative part $Q_j^{(c)}$ and a nonconservative
part $Q_j^{(f)}$ (f for friction):
$$Q_j = Q_j^{(c)} + Q_j^{(f)}. \tag{17.19}$$
Since $Q_j^{(c)}$ can be derived by definition from a potential according to (17.3), we can
introduce L = T − V and bring (17.18) into the form
$$\frac{d}{dt}\frac{\partial L}{\partial\dot q_j} - \frac{\partial L}{\partial q_j} = Q_j^{(f)}, \qquad j = 1, 2, \ldots, n. \tag{17.20}$$
If the nonconservative forces are friction forces, only these friction forces $Q_j^{(f)}$ appear
on the right-hand side. For these, we make the ansatz
$$Q_j^{(f)} = -\sum_{k=1}^{n} f_{jk}\,\dot q_k, \tag{17.21}$$
where the $f_{jk}$ are the friction coefficients. If the friction tensor $f_{jk}$ is symmetric, i.e.,
$f_{jk} = f_{kj}$, the friction forces $Q_j^{(f)}$ can be obtained by partial differentiation with respect
to the generalized velocities $\dot q_j$ from the function
$$D = \frac{1}{2}\sum_{k,l=1}^{n} f_{kl}\,\dot q_k\dot q_l \tag{17.22}$$
according to
$$Q_j^{(f)} = -\frac{\partial D}{\partial\dot q_j}.$$
dW (r) (f )
= Qj q̇j = − fj k q̇j q̇k = −2D, (17.24)
dt
j j,k
i.e., the energy consumed by the friction force per unit time is twice the dissipation
function:
dE d
= (T + V ) = −2D. (17.25)
dt dt
This can also be directly derived from the Lagrange equations:
d ∂T ∂T d
(T + V ) = q̇i + q̈i + V . (17.26)
dt ∂qi ∂ q̇i dt
i i
EXAMPLE
The particle shall be under the action of the conservative gravitational force with the
potential
V = mgz
and the nonconservative friction resistance of air. The air resistance depends on the
projectile velocity. We suppose the friction force to be proportional to the velocity. It
can then be derived from the dissipation function
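$$D = \frac{1}{2}\alpha\left(\dot x^2 + \dot y^2 + \dot z^2\right),$$
and the Lagrange equations follow with $L = \frac{1}{2}m(\dot x^2 + \dot y^2 + \dot z^2) - mgz$ according
to (17.23) as
$$m\ddot x + \alpha\dot x = 0, \qquad m\ddot y + \alpha\dot y = 0, \qquad m\ddot z + \alpha\dot z + mg = 0.$$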
These equations of motion are known from the lectures on classical mechanics.2
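The energy balance (17.25) can be checked on these equations of motion. The following Python sketch integrates the motion in the x, z-plane with a fourth-order Runge-Kutta scheme and accumulates the time integral of 2D alongside; the lost mechanical energy should equal that integral. Mass, friction coefficient, initial velocity, and the function name simulate_drag are illustrative assumptions.

```python
def simulate_drag(m=1.0, alpha=0.5, g=9.81, v0=(3.0, 4.0), t_end=2.0, steps=2000):
    """Return (E_start, E_end, integral of 2D dt) for a projectile with linear drag."""

    def deriv(s):
        x, z, vx, vz, w = s
        # last entry: dW/dt = 2D = alpha * v^2, the dissipated power
        return (vx, vz, -alpha * vx / m, -g - alpha * vz / m,
                alpha * (vx * vx + vz * vz))

    def energy(s):
        return 0.5 * m * (s[2] ** 2 + s[3] ** 2) + m * g * s[1]

    s = (0.0, 0.0, v0[0], v0[1], 0.0)
    e_start = energy(s)
    dt = t_end / steps
    for _ in range(steps):
        k1 = deriv(s)
        k2 = deriv(tuple(a + 0.5 * dt * b for a, b in zip(s, k1)))
        k3 = deriv(tuple(a + 0.5 * dt * b for a, b in zip(s, k2)))
        k4 = deriv(tuple(a + dt * b for a, b in zip(s, k3)))
        s = tuple(a + dt / 6.0 * (p + 2 * q + 2 * r + u)
                  for a, p, q, r, u in zip(s, k1, k2, k3, k4))
    return e_start, energy(s), s[4]

e_start, e_end, dissipated = simulate_drag()
print(e_start - e_end, dissipated)
```

The two printed numbers should coincide up to the integration error, which illustrates that the energy T + V decreases at the rate 2D.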
In the preceding text, we have already discussed holonomic and nonholonomic sys-
tems. A brief recapitulation seems appropriate: For holonomic systems, the supple-
mentary conditions can be expressed in the closed form
gi (rν , t) = 0, i = 1, 2, . . . , s, ν = 1, 2, . . . , N. (17.28)
2 See W. Greiner: Classical Mechanics: Point Particles and Relativity, 1st ed., Springer, Berlin
(2004), Chapter 20.
Since these equations shall be nonintegrable, one cannot eliminate s dependent co-
ordinates from them in the form (17.29). One therefore simply expresses the ri as
functions of 3N generalized coordinates qi . The qi are of course not all independent,
but are subject to supplementary conditions which are obtained by rewriting (17.29)
in terms of the qi :
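$$\sum_{l=1}^{3N} a_{il}(q, t)\,dq_l + a_{it}(q, t)\,dt = 0, \qquad i = 1, 2, \ldots, s. \tag{17.30}$$
For virtual displacements ($\delta t = 0$), this reads
$$\sum_{l=1}^{3N} a_{il}(q, t)\,\delta q_l = 0, \qquad i = 1, 2, \ldots, s. \tag{17.31}$$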
In this form, the supplementary conditions can be combined with the Lagrange equa-
tions in the same form, namely,
$$\sum_{j=1}^{3N}\left(Q_j^{(r)} + \frac{\partial L}{\partial q_j} - \frac{d}{dt}\frac{\partial L}{\partial\dot q_j}\right)\delta q_j = 0. \tag{17.32}$$
The conservative forces were taken into account in the Lagrangian L. Because of
the conditions (17.31), not all $\delta q_i$ in (17.32) are independent. To take this fact into
account, one multiplies (17.31) by the factors $\lambda_i$, which are still unknown at this point,
and sums over i,
$$\sum_{i=1}^{s}\sum_{l=1}^{3N}\lambda_i\,a_{il}(q, t)\,\delta q_l = 0. \tag{17.33}$$
The factors λi are called Lagrange multipliers. They can be chosen arbitrarily in
(17.34). Among the 3N quantities δqj , however, only 3N − s can be chosen arbitrar-
ily, since the s supplementary conditions (17.31) still must be satisfied. We number
the δqj so that the first s of them are just the dependent ones; the last (3N − s) of the
δqj can be freely chosen.
Now we utilize the free choice of the s Lagrange parameters λi , which are deter-
mined in such a way that the coefficients of the first s variations δqj in (17.34) vanish.
This obviously leads to the s equations
$$Q_j^{(r)} + \frac{\partial L}{\partial q_j} - \frac{d}{dt}\frac{\partial L}{\partial\dot q_j} + \sum_{i=1}^{s}\lambda_i\,a_{ij}(q, t) = 0, \qquad j = 1, 2, \ldots, s, \tag{17.35}$$
In (17.36), all of the δqj can now be freely chosen. Therefore, the expression in the
round bracket must vanish for every individual j ; i.e., it follows that
$$Q_j^{(r)} + \frac{\partial L}{\partial q_j} - \frac{d}{dt}\frac{\partial L}{\partial\dot q_j} + \sum_{i=1}^{s}\lambda_i\,a_{ij}(q, t) = 0, \qquad j = s + 1, s + 2, \ldots, 3N. \tag{17.37}$$
Now we see that the two sets of (17.35) and (17.37) have the same form and can be
simply combined to
$$\frac{\partial L}{\partial q_j} - \frac{d}{dt}\frac{\partial L}{\partial\dot q_j} + Q_j^{(r)} + \sum_{i=1}^{s}\lambda_i\,a_{ij}(q, t) = 0, \qquad j = 1, 2, \ldots, 3N. \tag{17.38}$$
These are 3N equations which, together with the s supplementary conditions in the
form
$$\sum_{l=1}^{3N} a_{il}(q, t)\,\dot q_l + a_{it}(q, t) = 0, \qquad i = 1, 2, \ldots, s, \tag{17.39}$$

$$Q_j^{(z)} = \sum_{i=1}^{s}\lambda_i\,a_{ij}(q, t). \tag{17.40}$$
These forces $Q_j^{(z)}$ are constraint reactions which appear since the motion of the system
is restricted by supplementary conditions. Indeed, if the supplementary conditions
disappear ($a_{ij} = 0$), the constraint reactions also vanish: $Q_j^{(z)} = 0$. The former
equation (17.33) can now be written as
$$\sum_{i=1}^{3N} Q_i^{(z)}\,\delta q_i = 0 \tag{17.41}$$
and can be interpreted as the vanishing of the virtual work of the constraint reactions.
It is clear that the method of Lagrange multipliers developed here for nonholonomic
systems can be applied to holonomic systems, too. The holonomic constraints (17.28)
gi (rν , t) = 0, i = 1, 2, . . . , s, ν = 1, 2, . . . , N,
$$\sum_{l=1}^{N}\frac{\partial g_i}{\partial\mathbf{r}_l}\cdot d\mathbf{r}_l + \frac{\partial g_i}{\partial t}\,dt = 0, \qquad i = 1, 2, \ldots, s. \tag{17.42}$$
This is exactly the form (17.29) for nonholonomic systems. From now on, the approach
with Lagrange multipliers can proceed as explained above. We then obtain
(3N + s) coupled equations, while the former solution method for holonomic systems
(based on the elimination of s coordinates from (17.28)) leads only to (3N − s)
coupled equations. The additional 2s equations make the procedure considerably
more complicated. However, this complication also has a great advantage: We can now
determine the constraint reactions $Q_j^{(z)}$ according to (17.40) without difficulty (by
solving the 3N + s equations).
EXERCISE
Problem. Determine the equations of motion and the constraint reactions of a cir-
cular disk of mass M and radius R that rolls without gliding on the x, y-plane (see
Fig. 17.2). The disk shall always stand perpendicular to the x, y-plane.
$$\dot x = R\dot\Phi\sin\Theta, \qquad \dot y = -R\dot\Phi\cos\Theta. \tag{17.43}$$

$$dx - R\sin\Theta\,d\Phi = 0, \qquad dy + R\cos\Theta\,d\Phi = 0. \tag{17.44}$$
where
$$a_{11} = 1, \quad a_{12} = 0, \quad a_{13} = -R\sin\Theta, \quad a_{14} = 0,$$
$$a_{21} = 0, \quad a_{22} = 1, \quad a_{23} = R\cos\Theta, \quad a_{24} = 0.$$
$$Q_x^{(z)} = \lambda_1, \qquad Q_y^{(z)} = \lambda_2, \qquad Q_\Phi^{(z)} = -\lambda_1 R\sin\Theta + \lambda_2 R\cos\Theta, \qquad Q_\Theta^{(z)} = 0. \tag{17.45}$$
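$$M\ddot x = Q_x + \lambda_1, \qquad M\ddot y = Q_y + \lambda_2,$$
$$I_1\ddot\Phi = Q_\Phi - \lambda_1 R\sin\Theta + \lambda_2 R\cos\Theta, \qquad I_2\ddot\Theta = Q_\Theta. \tag{17.47}$$
$Q_x$, $Q_y$, $Q_\Phi$, $Q_\Theta$ are possible external forces. We study the case without such forces
and therefore set them equal to zero. This transforms (17.47) into
$$M\ddot x = \lambda_1, \qquad M\ddot y = \lambda_2,$$
$$I_1\ddot\Phi = -\lambda_1 R\sin\Theta + \lambda_2 R\cos\Theta, \qquad I_2\ddot\Theta = 0, \tag{17.48}$$

$$\Theta = \omega t + \Theta_0 .$$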
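By inserting this into (17.49), one can calculate $\ddot x$ and $\ddot y$, which determine $\lambda_1$ and $\lambda_2$
through the first two equations of (17.48):
$$\lambda_1 = M\ddot x = M\left(R\ddot\Phi\sin(\omega t + \Theta_0) + \omega R\dot\Phi\cos(\omega t + \Theta_0)\right),$$
$$\lambda_2 = M\ddot y = -M\left(R\ddot\Phi\cos(\omega t + \Theta_0) - \omega R\dot\Phi\sin(\omega t + \Theta_0)\right). \tag{17.50}$$
This in turn is now inserted into the third equation of (17.48), which then reads
$$I_1\ddot\Phi = -MR\left(R\ddot\Phi\sin(\omega t + \Theta_0) + \omega R\dot\Phi\cos(\omega t + \Theta_0)\right)\sin(\omega t + \Theta_0)$$
$$\phantom{I_1\ddot\Phi =} - MR\left(R\ddot\Phi\cos(\omega t + \Theta_0) - \omega R\dot\Phi\sin(\omega t + \Theta_0)\right)\cos(\omega t + \Theta_0) = -MR^2\ddot\Phi;$$
i.e.,
$$(I_1 + MR^2)\,\ddot\Phi = 0.$$
This leads to $\ddot\Phi = 0$ and hence $\dot\Phi = \text{constant}$. Therefore, we can explicitly write down
the constraint reactions (17.45):
$$Q_x^{(z)} = M\omega R\dot\Phi\cos(\omega t + \Theta_0), \qquad Q_y^{(z)} = M\omega R\dot\Phi\sin(\omega t + \Theta_0),$$
$$Q_\Phi^{(z)} = 0, \qquad Q_\Theta^{(z)} = 0. \tag{17.51}$$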
These constraint reactions must act to keep the disk vertical on the x,y-plane. If the
disk rolls along a straight line (ω = 0), the constraint reactions disappear.
EXERCISE
Problem. Consider the degrees of freedom, and determine the equation of motion
of the centrifugal force governor (Fig. 17.4) through the Lagrangian.
Fig. 17.4.
17.3 Nonholonomic Systems and Lagrange Multipliers 323
Solution. The principle of the centrifugal force governor is applied, e.g., in automobiles.
The distributor drive shaft is tightly fixed to the carrier plate of a centrifugal force
governor which is attached below the interrupter plate. At higher speeds the centrifugal
masses press on their carrier plate against a “cog.” Thus, the distributor shaft set into
the driving shaft is moved additionally in the rotation direction by a cam. This
mechanism causes the preignition needed at higher speeds. For more advanced motors with
“transistor ignition,” this mechanism is dropped.
The system has two degrees of freedom, which can be described by the angles θ
and ϕ. The motion of m, M is restricted by the constraints represented by the four rigid
rods and the rotation axis. Hence, θ and ϕ offer themselves as generalized coordinates.
We first determine the kinetic energy. The moment of inertia of the cylinder is
$$\Theta_{ZZ} = \frac{1}{2}MR^2,$$
and therefore,
$$T_{\text{rot}} = \frac{1}{2}\left(\frac{1}{2}MR^2 + 2ml^2\sin^2\theta\right)\dot\varphi^2. \tag{17.52}$$
With the potential energy $V = -2gl(m + M)\cos\theta$, we can write down the Lagrangian:
$$L = T_{\text{rot}} + T_{\text{plane}} - V = \frac{1}{2}\left(\Theta_{ZZ} + 2ml^2\sin^2\theta\right)\dot\varphi^2 + \left(m + 2M\sin^2\theta\right)l^2\dot\theta^2 + 2gl(m + M)\cos\theta. \tag{17.56}$$
Then
The variables of the Lagrangian are the generalized coordinates qα and the accom-
panying generalized velocities q̇α . In Hamilton’s theory,1 the generalized coordinates
and the corresponding momenta are used as independent variables. In this theory the
position coordinates and the “momentum coordinates” are treated on an equal ba-
sis. Hamiltonian theory leads to an essential understanding of the formal structure of
mechanics and is of basic importance for the transition from classical mechanics to
quantum mechanics.
We now look for a transition from the Lagrangian L(qi , q̇i , t) to the Hamiltonian
H (qi , pi , t) and remember that the generalized momenta are given by
$$p_i = \frac{\partial L}{\partial\dot q_i}.$$
We look for a transformation
$$L(q_i, \dot q_i, t) \;\Rightarrow\; H\left(q_i, \frac{\partial L}{\partial\dot q_i}, t\right) = H(q_i, p_i, t). \tag{18.1}$$
The question is, how to construct H ? The recipe is simple and will be formulated in
the following equation (18.2). The mathematical background of such a transformation
(Legendre2 transformation) can be easily demonstrated by a two-dimensional exam-
ple. We change from the function f (x, y) to the function g(x, u) = g(x, ∂f/∂y):
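$$f(x, y) \;\Rightarrow\; g(x, u) \quad\text{with}\quad u = \frac{\partial f}{\partial y},$$
where the new function is defined as $g = uy - f$.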
1 Sir William Rowan Hamilton, b. Aug. 4, 1805, Dublin–d. Sept. 2, 1865, Dunsink. Hamilton began
his studies in 1824 in Dublin. In 1827, before finishing his studies, he became professor of astronomy
and King’s astronomer of Ireland. Hamilton contributed important papers on algebra and invented
the quaternion calculus. His contributions to geometrical optics and classical mechanics, e.g., the
canonical equations and the Hamilton principle, are of extraordinary importance.
2 Adrien Marie Legendre, b. Sept. 18, 1752–d. Jan. 10, 1833, Paris. Legendre made essential contri-
butions to the foundation and development of number theory and geodesy. He also found important
results on elliptic integrals, on foundations and methods of Euclidean geometry, on variational cal-
culus, and on theoretical astronomy. For instance, he first applied the method of least squares and
calculated voluminous tables. Legendre dealt with many problems that Gauss was also interested
in, but he never reached his perfection. Beginning in 1775, Legendre served as professor at various
universities at Paris and published excellent textbooks which had a long-lasting influence.
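By forming the total differential, we realize that the function g formed this way no
longer contains y as an independent variable:
$$dg = y\,du + u\,dy - df = y\,du + u\,dy - \frac{\partial f}{\partial x}\,dx - \frac{\partial f}{\partial y}\,dy = y\,du - \frac{\partial f}{\partial x}\,dx,$$
where now $y = \partial g/\partial u$ and $\partial g/\partial x = -\partial f/\partial x$.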
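A small numerical illustration of this Legendre transformation: the Python sketch below uses a hypothetical quadratic function f(x, y), builds $g = uy - f$ with $u = \partial f/\partial y$, and checks the two relations $y = \partial g/\partial u$ and $\partial g/\partial x = -\partial f/\partial x$ by central differences. The sample function, the coefficient A_COEF, and the helper names are assumptions made for the example.

```python
A_COEF = 2.0  # illustrative coefficient of the y^2 term

def f(x, y):
    return 0.5 * A_COEF * y * y + x * y + x * x

def u_of(x, y):
    """u = df/dy for the sample function above."""
    return A_COEF * y + x

def y_of(x, u):
    """Inverse relation y(x, u), obtained by solving u = df/dy for y."""
    return (u - x) / A_COEF

def g(x, u):
    """Legendre transform g(x, u) = u*y - f(x, y), with y eliminated."""
    y = y_of(x, u)
    return u * y - f(x, y)

def diff(func, a, h=1e-6):
    """Central difference approximation of func'(a)."""
    return (func(a + h) - func(a - h)) / (2 * h)

x, y = 0.7, -1.3
u = u_of(x, y)
dg_du = diff(lambda uu: g(x, uu), u)   # should reproduce y
dg_dx = diff(lambda xx: g(xx, u), x)   # should equal -df/dx at fixed y
df_dx = diff(lambda xx: f(xx, y), x)
print(dg_du, y, dg_dx, -df_dx)
```

The same construction, with y replaced by the generalized velocities and u by the momenta, leads from the Lagrangian to the Hamiltonian.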
According to this short insertion, we now construct the Hamiltonian from the La-
grangian. We write for the Hamiltonian
$$H(q_i, p_i, t) = \sum_i p_i\dot q_i - L(q_i, \dot q_i, t). \tag{18.2}$$
We look for those equations of motion which are equivalent to the Lagrange equations
based on the Lagrangian L. To this end, we form the total differential:
$$dH = \sum_i p_i\,d\dot q_i + \sum_i\dot q_i\,dp_i - dL. \tag{18.3}$$
We now utilize the definition of the generalized momentum, pi = ∂L/∂ q̇i , and the
Lagrange equation in the form
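$$\frac{d}{dt}p_i - \frac{\partial L}{\partial q_i} = 0.$$
Inserting both into (18.4) yields
$$dL = \sum_i\dot p_i\,dq_i + \sum_i p_i\,d\dot q_i + \frac{\partial L}{\partial t}\,dt.$$
Since the first and fourth terms mutually cancel, there remains
$$dH = \sum_i\dot q_i\,dp_i - \sum_i\dot p_i\,dq_i - \frac{\partial L}{\partial t}\,dt.$$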
i i
$$p_i = \frac{\partial L(q_i, \dot q_i, t)}{\partial\dot q_i}$$
are solved for the generalized velocities q̇i , so that
The q̇i obtained this way are inserted into the definition of H (see (18.2)), so that the
Hamiltonian H finally depends only on qi , pi , and the time t ; hence, H = H (qi , pi , t).
From this, the Hamilton equations (18.5) are established and solved.
The Lagrange equations provide a set of n differential equations of second order in
the time for the position coordinates. The Hamiltonian formalism yields 2n coupled
differential equations of first order for the momentum and position coordinates. In any
case, there are 2n integration constants when solving the system of equations.
From (18.5), it is seen that for a coordinate that does not enter the Hamiltonian, the
corresponding change of the momentum with time vanishes:
$$\frac{\partial H}{\partial q_i} = 0 \quad\Rightarrow\quad p_i = \text{constant}.$$
If the Hamiltonian (the Lagrangian) is not explicitly time dependent, then H is a
constant of motion since
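$$\frac{dH}{dt} = \sum_i\frac{\partial H}{\partial q_i}\dot q_i + \sum_i\frac{\partial H}{\partial p_i}\dot p_i + \frac{\partial H}{\partial t},$$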
$$T = \frac{1}{2}\sum_\nu m_\nu\,\dot{\mathbf{r}}_\nu^2, \qquad \nu = 1, 2, \ldots, N \quad (N = \text{number of particles}).$$
330 18 Hamilton’s Equations
If the constraints are holonomic and not time-dependent, there exist transformation
equations rν = rν (qi ), and therefore,
$$\dot{\mathbf{r}}_\nu = \sum_i\frac{\partial\mathbf{r}_\nu}{\partial q_i}\,\dot q_i .$$
Thus, the kinetic energy is a homogeneous quadratic function of the generalized ve-
locities. The arising mass coefficients
$$a_{ik} = \frac{1}{2}\sum_\nu m_\nu\,\frac{\partial\mathbf{r}_\nu}{\partial q_i}\cdot\frac{\partial\mathbf{r}_\nu}{\partial q_k}$$
then also
$$\sum_{i=1}^{k}\frac{\partial f}{\partial x_i}\,x_i = nf.$$
This can be shown by forming the derivative of the upper equation with respect to λ;
thus,
$$\frac{\partial f}{\partial(\lambda x_1)}\,x_1 + \cdots + \frac{\partial f}{\partial(\lambda x_k)}\,x_k = n\lambda^{n-1}f.$$
By setting λ = 1, the assertion follows. Euler’s theorem, applied to the kinetic energy
(n = 2), means
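$$\sum_i\frac{\partial T}{\partial\dot q_i}\cdot\dot q_i = 2T. \tag{18.6}$$
If the potential does not depend on the velocities, then
$$\frac{\partial L}{\partial\dot q_i} = \frac{\partial T}{\partial\dot q_i} = p_i,$$
and therefore,
$$H = \sum_i p_i\dot q_i - L = \sum_i\frac{\partial T}{\partial\dot q_i}\,\dot q_i - L.$$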
By using the relation (18.6) and the definition of the Lagrangian, we see that
H = 2T − (T − V ) = T + V = E.
Thus, under the given conditions the Hamiltonian represents the total energy. The
energy T − V represented by the Lagrangian is sometimes called the free energy.
One should note that H does not include a possible work performed by the con-
straint reactions.
The Hamiltonian formulation of mechanics emerges via the Lagrange equations
from Newton’s equations. This became evident in deriving (18.5), where we explicitly
used the Lagrange equations. The latter ones are however equivalent to Newton’s for-
mulation of mechanics (see d’Alembert’s principle and following text). Conversely,
one can easily derive Newton’s equations from Hamilton’s equations and thus show
the equivalence of both formulations. It is sufficient to consider a single particle in
a conservative force field and to use the Cartesian coordinates as generalized coordi-
nates. Then
$$p_i = m\dot x_i, \qquad H = \frac{1}{2}m\sum_i\dot x_i^2 + V(x_i) \quad (i = 1, 2, 3),$$
or
$$H = \frac{1}{2}\sum_i\frac{p_i^2}{m} + V(q_i).$$
ṗ = −grad V .
EXAMPLE
Let a particle perform a planar motion under the action of a potential that depends
only on the distance from the origin. It is obvious that we should use plane polar
coordinates (r, ϕ) as generalized coordinates.
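$$L = T - V = \frac{1}{2}mv^2 - V = \frac{1}{2}m\left(\dot r^2 + r^2\dot\varphi^2\right) - V(r).$$
The generalized momenta are
$$p_r = \frac{\partial L}{\partial\dot r} = m\dot r \quad\text{or}\quad \dot r = \frac{p_r}{m},$$
$$p_\varphi = \frac{\partial L}{\partial\dot\varphi} = mr^2\dot\varphi \quad\text{or}\quad \dot\varphi = \frac{p_\varphi}{mr^2},$$
and the Hamiltonian becomes
$$H = p_r\dot r + p_\varphi\dot\varphi - L = \frac{p_r^2}{2m} + \frac{p_\varphi^2}{2mr^2} + V(r).$$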
The Hamilton equations then yield
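$$\dot r = \frac{\partial H}{\partial p_r} = \frac{p_r}{m}, \qquad \dot\varphi = \frac{\partial H}{\partial p_\varphi} = \frac{p_\varphi}{mr^2},$$
and
$$\dot p_r = -\frac{\partial H}{\partial r} = \frac{p_\varphi^2}{mr^3} - \frac{\partial V}{\partial r}, \qquad \dot p_\varphi = -\frac{\partial H}{\partial\varphi} = 0.$$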
ϕ is a cyclic coordinate. From this follows the conservation of the angular momentum
in the central potential.
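The four Hamilton equations of this example can be integrated directly as a first-order system. The Python sketch below does so for the attractive potential V(r) = −k/r, which is chosen here purely for illustration, and monitors the two conserved quantities, the Hamiltonian H and the angular momentum pϕ. The Runge-Kutta integrator, the parameter values, and the function names are assumptions.

```python
def hamiltonian(s, m=1.0, k=1.0):
    """H = p_r^2/2m + p_phi^2/(2 m r^2) + V(r) with the sample choice V = -k/r."""
    r, phi, pr, pphi = s
    return pr * pr / (2 * m) + pphi * pphi / (2 * m * r * r) - k / r

def rhs(s, m=1.0, k=1.0):
    """Hamilton equations: (dr/dt, dphi/dt, dp_r/dt, dp_phi/dt)."""
    r, phi, pr, pphi = s
    return (pr / m,
            pphi / (m * r * r),
            pphi * pphi / (m * r ** 3) - k / (r * r),   # -dH/dr with dV/dr = k/r^2
            0.0)                                        # phi is cyclic

def rk4_step(s, dt):
    k1 = rhs(s)
    k2 = rhs(tuple(a + 0.5 * dt * b for a, b in zip(s, k1)))
    k3 = rhs(tuple(a + 0.5 * dt * b for a, b in zip(s, k2)))
    k4 = rhs(tuple(a + dt * b for a, b in zip(s, k3)))
    return tuple(a + dt / 6.0 * (p + 2 * q + 2 * r + u)
                 for a, p, q, r, u in zip(s, k1, k2, k3, k4))

state = (1.0, 0.0, 0.0, 1.1)      # r, phi, p_r, p_phi
h_start = hamiltonian(state)
for _ in range(10000):            # integrate up to t = 10 with dt = 1e-3
    state = rk4_step(state, 1e-3)
h_end = hamiltonian(state)
print(h_start, h_end, state[3])
```

The momentum pϕ stays at its initial value because its time derivative vanishes identically, while H is conserved up to the integration error.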
EXAMPLE
The equation of motion of the pendulum shall be derived within the frames of New-
ton’s, Lagrange’s, and Hamilton’s theory.
ṗ = K.
The arclength of the displacement is denoted by s, and the tangent unit vector by T.
Then (see Fig. 18.1)
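and thus,
$$\ddot\Theta + \frac{g}{l}\sin\Theta = 0.$$
For small displacements, $\sin\Theta \approx \Theta$, and the equation becomes
$$\ddot\Theta + \frac{g}{l}\,\Theta = 0.$$
This differential equation has the general solution
$$\Theta = A\cos\sqrt{\frac{g}{l}}\,t + B\sin\sqrt{\frac{g}{l}}\,t,$$
where the constants A and B are to be determined from the initial conditions.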
Lagrangian theory:
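$$T = \frac{1}{2}mv^2 = \frac{1}{2}m(l\dot\Theta)^2 = \frac{1}{2}ml^2\dot\Theta^2,$$
$$V = mgh = mg(l - l\cos\Theta) = mgl(1 - \cos\Theta).$$

$$p_\Theta = ml^2\dot\Theta.$$
Differentiation yields
$$\dot p_\Theta = ml^2\ddot\Theta.$$
By comparing this with the above expression for $\dot p_\Theta$, we finally get again
$$\ddot\Theta + \frac{g}{l}\sin\Theta = 0.$$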
EXERCISE
Problem. A mass point m shall move in a cylindrically symmetric potential V(ρ, z).
Determine the Hamiltonian and the canonical equations of motion with respect to a
coordinate system that rotates with constant angular velocity ω about the symmetry
axis,
(a) in Cartesian coordinates, and
(b) in cylindrical coordinates.
Solution. (a) The coordinates of the inertial system (x, y, z) and those of the rotating
reference system (x′, y′, z′) are related by
$$x = \cos(\omega t)\,x' - \sin(\omega t)\,y', \qquad y = \sin(\omega t)\,x' + \cos(\omega t)\,y', \qquad z = z'. \tag{18.7}$$
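$$L = \frac{1}{2}m\left[\dot x'^2 + \dot y'^2 + \dot z'^2 + \omega^2(x'^2 + y'^2) + 2\omega(\dot y'x' - \dot x'y')\right] - V(x', y', z'). \tag{18.9}$$
The canonical momenta are
$$p_{x'} = \frac{\partial L}{\partial\dot x'} = m(\dot x' - \omega y'), \qquad p_{y'} = \frac{\partial L}{\partial\dot y'} = m(\dot y' + \omega x'), \qquad p_{z'} = \frac{\partial L}{\partial\dot z'} = m\dot z'. \tag{18.10}$$
Solving for the velocities gives
$$\dot x' = \frac{p_{x'}}{m} + \omega y', \qquad \dot y' = \frac{p_{y'}}{m} - \omega x', \qquad \dot z' = \frac{p_{z'}}{m}. \tag{18.11}$$
The Hamiltonian is
$$H = \sum_i\dot q_i p_i - L. \tag{18.12}$$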
This yields
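$$\dot x' = \frac{\partial H}{\partial p_{x'}} = \frac{1}{m}p_{x'} + \omega y', \qquad \dot y' = \frac{1}{m}p_{y'} - \omega x', \qquad \dot z' = \frac{1}{m}p_{z'}, \tag{18.14}$$
$$\dot p_{x'} = -\frac{\partial H}{\partial x'} = \omega p_{y'} - \frac{\partial V}{\partial x'}, \qquad \dot p_{y'} = -\omega p_{x'} - \frac{\partial V}{\partial y'}, \qquad \dot p_{z'} = -\frac{\partial V}{\partial z'}. \tag{18.15}$$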
$$\dot x' = \dot\rho\cos\varphi' - \rho\dot\varphi'\sin\varphi', \qquad \dot y' = \dot\rho\sin\varphi' + \rho\dot\varphi'\cos\varphi'. \tag{18.17}$$
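$$p_\rho = \frac{\partial L}{\partial\dot\rho} = \frac{\partial L}{\partial\dot x'}\frac{\partial\dot x'}{\partial\dot\rho} + \frac{\partial L}{\partial\dot y'}\frac{\partial\dot y'}{\partial\dot\rho} = p_{x'}\cos\varphi' + p_{y'}\sin\varphi', \tag{18.18}$$
$$p_{\varphi'} = \frac{\partial L}{\partial\dot\varphi'} = \frac{\partial L}{\partial\dot x'}\frac{\partial\dot x'}{\partial\dot\varphi'} + \frac{\partial L}{\partial\dot y'}\frac{\partial\dot y'}{\partial\dot\varphi'} = -\rho\,p_{x'}\sin\varphi' + \rho\,p_{y'}\cos\varphi'. \tag{18.19}$$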
Now we solve for px and py . From (18.18), it follows that
Analogously, we obtain
$$p_{x'} = p_\rho\cos\varphi' - \frac{1}{\rho}\,p_{\varphi'}\sin\varphi'. \tag{18.22}$$
18.1 The Hamilton Principle 337
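Now we insert (18.21) and (18.22) into (18.13) and obtain
$$H = \frac{1}{2m}\Big[p_\rho^2\cos^2\varphi' - \frac{2}{\rho}\,p_\rho\cos\varphi'\,p_{\varphi'}\sin\varphi' + \frac{1}{\rho^2}\sin^2\varphi'\,p_{\varphi'}^2 + p_\rho^2\sin^2\varphi' + \frac{2}{\rho}\,p_\rho\sin\varphi'\,p_{\varphi'}\cos\varphi' + \frac{1}{\rho^2}\,p_{\varphi'}^2\cos^2\varphi' + p_{z'}^2\Big] - \omega(x'p_{y'} - y'p_{x'}) + V(\rho, z')$$
$$\phantom{H} = \frac{1}{2m}\Big[p_\rho^2 + \frac{1}{\rho^2}\,p_{\varphi'}^2 + p_{z'}^2\Big] - \omega\Big[\rho\cos\varphi'\Big(p_\rho\sin\varphi' + \frac{1}{\rho}\cos\varphi'\,p_{\varphi'}\Big) - \rho\sin\varphi'\Big(p_\rho\cos\varphi' - \frac{1}{\rho}\sin\varphi'\,p_{\varphi'}\Big)\Big] + V(\rho, z')$$
$$\phantom{H} = \frac{1}{2m}\Big[p_\rho^2 + \frac{1}{\rho^2}\,p_{\varphi'}^2 + p_{z'}^2\Big] - \omega p_{\varphi'} + V(\rho, z'). \tag{18.23}$$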
A comparison of (18.13) and (18.23) shows that the Hamiltonian becomes especially
simple if it is represented in coordinates adapted to the symmetry of the problem.
From (18.23) we see that H does not depend on the angle ϕ (ϕ is a cyclic coordinate),
hence the angular momentum component pϕ is a constant of the motion.
The canonical equations of motion read
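$$\dot\rho = \frac{1}{m}\,p_\rho, \qquad \dot\varphi' = \frac{1}{m\rho^2}\,p_{\varphi'} - \omega, \qquad \dot z' = \frac{1}{m}\,p_{z'},$$
$$\dot p_\rho = \frac{1}{m\rho^3}\,p_{\varphi'}^2 - \frac{\partial V}{\partial\rho}, \qquad \dot p_{\varphi'} = 0, \qquad \dot p_{z'} = -\frac{\partial V}{\partial z'}. \tag{18.24}$$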
The laws of mechanics can be expressed in two ways by variational principles that
are independent of the coordinate system. The first of these are the differential prin-
ciples. In this approach, one compares an arbitrarily selected momentary state of the
system with (virtual) infinitesimal neighbor states. One example of this method is the
d’Alembert principle. Another possibility is to vary a finite path element of the sys-
tem. Such principles are called integral principles. The “path” is not understood as the
trajectory of a point of the system in the three-dimensional position space, but rather
as the path in a multidimensional space where the motion of the entire system is com-
pletely fixed. For a system with f degrees of freedom, this space is f -dimensional.
In all integral principles the quantity to be varied has the dimension of an action
(= energy · time); therefore, they are also called principles of minimum action. As
an example we will consider the Hamilton principle. The Hamilton principle requires
that a system moves in such a way that the time integral over the Lagrangian takes an
extreme value:
$$I = \int_{t_1}^{t_2} L\,dt .$$
In other words, the variation of this integral vanishes:
$$\delta\int_{t_1}^{t_2} L\,dt = 0. \tag{18.25}$$
The path equation of the system can be determined by applying this principle.
Before considering (18.25) in more detail, we will briefly deal in general with the
variational problem.
EXAMPLE
$$\frac{d^2y}{dx^2} = 0 \tag{18.26}$$
with the further prescription that the values of the desired function y(x) for x = x1
and x = x2 are given numbers. These are descriptions using rectangular coordinates.
The straight line can however also be described as the shortest connection between
two points, i.e., by
$$\int ds = \text{minimum}. \tag{18.27}$$
One may imagine the two given points as being connected by all possible curves, and
among these curves that curve be selected which yields the minimum value for the
given integral. This description of the straight line is independent of the choice of
particular coordinates.
As a preparation for the following, we show how the search for the shortest connec-
tion between two points of the plane can be reduced mathematically to (18.26). After
introducing rectangular coordinates x and y, the problem is to look for a function y(x)
for which y(x1 ) and y(x2 ) have given values and the integral
$$I = \int_{x_1}^{x_2}\sqrt{1 + y'(x)^2}\,dx \tag{18.28}$$
takes a minimum value. Similar problems do not need to have a solution. So one
could put the problem (18.27) or (18.28) and prescribe not only the start point and the
endpoint, but also the direction of the curve at the start and endpoint, respectively. One
easily recognizes that under these conditions there is no shortest connection, unless
both of the given directions incidentally coincide with the straight connection.
The problem (18.28) has some similarity with the search for the minimum of a
given function f(x). There one considers a small change of x and forms
If $f'(x) \neq 0$, f(x) can increase or decrease for small changes of x, and thus, there
is no minimum at the point x. A necessary condition for a minimum is therefore
$f'(x) = 0$. This condition is not sufficient; it is also fulfilled for a maximum.
In the problem (18.28), we do not have to change a variable but a function y(x).
We replace y(x) by a “neighboring” function $y_0(x) + \varepsilon\eta(x)$ of the desired function
$y_0$, where we will afterward assume the number ε is arbitrarily small. We must
have $\eta(x_1) = \eta(x_2) = 0$. $y'$ is then replaced by $y_0' + \varepsilon\eta'$, and instead of the integrand
$\sqrt{1 + y'^2}$ we obtain the Taylor series expansion into powers of ε:
$$\sqrt{1 + (y_0' + \varepsilon\eta')^2} = \sqrt{1 + y_0'^2} + \varepsilon\,\frac{y_0'}{\sqrt{1 + y_0'^2}}\,\eta' + \varepsilon^2(\ldots),$$
where the term indicated by $\varepsilon^2(\ldots)$ can be neglected for sufficiently small |ε|. Therefore,
we have
$$I(\varepsilon) = \int_{x_1}^{x_2}\sqrt{1 + (y_0' + \varepsilon\eta')^2}\,dx \approx \int_{x_1}^{x_2}\sqrt{1 + y_0'^2}\,dx + \varepsilon\int_{x_1}^{x_2}\frac{y_0'}{\sqrt{1 + y_0'^2}}\,\eta'\,dx,$$
which shall take a minimum for ε = 0. If the integral in the second term does not
vanish, the integral
$$\int_{x_1}^{x_2}\sqrt{1 + y_0'^2}\,dx$$
can increase or decrease by changing the function $y_0(x)$, depending on the sign of ε.
Hence, $y_0(x)$ does not provide a minimum of this integral. A necessary condition for a
minimum is therefore
$$\int_{x_1}^{x_2}\frac{y_0'}{\sqrt{1 + y_0'^2}}\,\eta'\,dx = 0 \tag{18.29}$$
for any function η(x) that vanishes at x1 and x2 . To be able to exploit the far-reaching
arbitrariness of the function η(x), we transform (18.29) by integration by parts:
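$$\left[\eta\,\frac{y_0'}{\sqrt{1 + y_0'^2}}\right]_{x_1}^{x_2} - \int_{x_1}^{x_2}\eta\,\frac{d}{dx}\frac{y_0'}{\sqrt{1 + y_0'^2}}\,dx = 0.$$
Because $\eta(x_1) = \eta(x_2) = 0$, the first term drops. The second term
$$\int_{x_1}^{x_2}\eta\cdot\frac{d}{dx}\frac{y_0'}{\sqrt{1 + y_0'^2}}\,dx \tag{18.30}$$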
vanishes for all allowed functions η(x) if and only if everywhere between
$x_1$ and $x_2$ we have
$$\frac{d}{dx}\frac{y_0'}{\sqrt{1 + y_0'^2}} = 0. \tag{18.31}$$
If this equation were not satisfied everywhere, we could choose η(x) so that it is
always positive where
$$\frac{d}{dx}\frac{y_0'}{\sqrt{1 + y_0'^2}}$$
is positive, and choose it as negative where this expression is negative, and in this
way establish a contradiction. We can also argue as follows: If (18.31) were violated
at some point, one could set η(x) equal to zero everywhere except in a certain
interval about this point. But then the integral (18.30) does not vanish. We could not
choose the quantity $\eta'$ in (18.29) in this way; thus we could not draw the corresponding
conclusion directly from (18.29). From (18.31) it now follows that $y_0' = \text{constant}$ or $y_0'' = 0$; that is,
the former description (18.26). Thus, our calculation has replaced the requirement
that a definite integral be minimized by a function with a differential equation for this
function.
Equation (18.31) allows yet another interpretation. We have
$$\frac{d}{dx}\frac{y'}{\sqrt{1 + y'^2}} = \frac{y''}{\left(\sqrt{1 + y'^2}\right)^3}.$$
As is shown in the theory of curves, this is an expression for the curvature of a curve.
Equation (18.31) thus states that the desired curve everywhere has the curvature 0.
We just have treated a simple problem of the “variational calculus.” Problems of
the type (18.27) or (18.28) are called variational problems. In Exercises 18.5 and 18.6
we shall meet further, less trivial variational problems.
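The argument of this example can be reproduced numerically. The following Python sketch evaluates the length integral (18.28) for the straight line $y_0$ perturbed by $\varepsilon\eta(x)$ with $\eta(x) = \sin(\pi(x - x_1)/(x_2 - x_1))$, which vanishes at both endpoints. It shows that I(ε) has vanishing slope and a minimum at ε = 0. The endpoints, the choice of η, and the midpoint rule are illustrative assumptions.

```python
import math

def path_length(eps, x1=0.0, y1=0.0, x2=2.0, y2=1.0, n=2000):
    """Midpoint-rule value of I(eps) for y = y0 + eps*eta, with y0 the straight line."""
    width = x2 - x1
    slope = (y2 - y1) / width            # y0' of the straight line
    dx = width / n
    total = 0.0
    for i in range(n):
        x = x1 + (i + 0.5) * dx
        # eta(x) = sin(pi (x - x1)/width), so eta' is:
        eta_prime = (math.pi / width) * math.cos(math.pi * (x - x1) / width)
        total += math.sqrt(1.0 + (slope + eps * eta_prime) ** 2) * dx
    return total

I0 = path_length(0.0)
h = 1e-4
slope_at_zero = (path_length(h) - path_length(-h)) / (2 * h)   # first variation
print(I0, math.hypot(2.0, 1.0), slope_at_zero)
print(path_length(0.1), path_length(-0.1))
```

The first variation vanishes for the straight line, and both perturbed curves are longer, in line with the necessary condition (18.29).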
Since the endpoints shall be fixed, the term integrated out vanishes, and the extremum
condition reads
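$$\int_{x_1}^{x_2}\left(\frac{\partial F}{\partial y} - \frac{d}{dx}\frac{\partial F}{\partial y'}\right)\eta\,dx = 0.$$
Since η(x) can be an arbitrary function, this equation is generally satisfied only
if
$$\frac{d}{dx}\frac{\partial F(y(x), y'(x))}{\partial y'} - \frac{\partial F(y(x), y'(x))}{\partial y} = 0. \tag{18.32}$$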
This relation (18.32) is called the Euler–Lagrange equation. It is a necessary condition
for an extremum value of the integral I . The solution of the Euler–Lagrange equation,
a differential equation of second order, together with the boundary conditions yields
the wanted path. To simplify notation, we define the variation of a function y(x, ε) as
the difference between y(x, ε) and y(x, 0)
$$\delta y = y(x, \varepsilon) - y(x, 0) = \left.\frac{\partial y}{\partial\varepsilon}\right|_{\varepsilon=0}\cdot\varepsilon .$$
F can also include constraints by means of Lagrange multipliers (compare Chap. 16).
EXERCISE
18.5 Catenary
dV = gσy ds.
it follows that
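$$(y-\mu)\,y'' - y'^2 - 1 = 0.$$
With
$$y'' = \frac{dy'}{dx} = \frac{dy'}{dy}\,\frac{dy}{dx} = y'\,\frac{dy'}{dy},$$
we obtain
$$(y-\mu)\,y'\,\frac{dy'}{dy} = y'^2 + 1, \qquad \frac{dy}{y-\mu} = \frac{y'\,dy'}{1+y'^2}.$$
Integration yields
$$\ln(y-\mu) + \ln C_1 = \frac{1}{2}\ln(1+y'^2)$$
or
$$C_1(y-\mu) = \sqrt{1+y'^2}.$$
Solving for $y'$ and substituting $C_1(y-\mu) = \cosh\nu$, a second integration yields
$$\nu = C_1(x + C_2)$$
or
$$y = \frac{1}{C_1}\cosh\bigl(C_1(x+C_2)\bigr) + \mu.$$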
Thus, the solution is the catenary. The constants give the coordinates of the lowest
point (x0 , y0 ) = (−C2 , (1/C1 ) + μ). They are determined by the given length l of the
chain and by the suspension points P1 and P2 .
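As a quick plausibility check, the catenary can be inserted back into the differential equation $(y-\mu)y'' - y'^2 - 1 = 0$. The sketch below is illustrative only; the constants C1, C2 and μ are arbitrary sample values, not values determined from a chain length or suspension points, and the function names are chosen here.

```python
import math

def catenary(x, c1, c2, mu):
    """y(x) = cosh(C1 (x + C2)) / C1 + mu."""
    return math.cosh(c1 * (x + c2)) / c1 + mu

def ode_residual(x, c1, c2, mu):
    """Left-hand side of (y - mu) y'' - y'^2 - 1 = 0 evaluated on the catenary."""
    y = catenary(x, c1, c2, mu)
    y1 = math.sinh(c1 * (x + c2))        # y'
    y2 = c1 * math.cosh(c1 * (x + c2))   # y''
    return (y - mu) * y2 - y1 * y1 - 1.0

if __name__ == "__main__":
    c1, c2, mu = 0.8, -0.5, 1.2          # arbitrary sample constants
    for x in (-1.0, 0.0, 0.7, 2.0):
        print(f"x = {x:+.1f}   residual = {ode_residual(x, c1, c2, mu):.2e}")
    # Lowest point at x0 = -C2 with height 1/C1 + mu
    print(catenary(-c2, c1, c2, mu), 1.0 / c1 + mu)
```

The residual vanishes up to rounding because of $\cosh^2 - \sinh^2 = 1$.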
EXERCISE
Problem. On board an aircraft, a fire breaks out after landing. The passengers must
leave by an emergency chute on which they glide down without friction. Determine
by variational calculus the form of the chute with the aim to evacuate the plane as fast
as possible (height of the hatch y0 ; distance to the bottom x0 ). Find the time of gliding
as compared to the harsh free fall, assuming x0 = (π/2)y0 .
Hint: Use the substitution
$$y' = \frac{dy}{dx} = -\cot\frac{\Theta}{2}.$$
Solution. The problem goes back to the Bernoulli brothers (brachistochrone, 1696).
Energy conservation yields
$$mgy_0 = \frac{1}{2}mv^2 + mgy,$$
$$g(y_0 - y) = \frac{1}{2}\left[\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2\right],$$
$$(dt)^2 = \frac{(dx)^2 + (dy)^2}{2g(y_0 - y)}.$$
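To get the minimum time, one has to solve a variational problem of the form
$$\delta\int_{x_1}^{x_2} F(x, y, y')\,dx = 0, \qquad y(x_1) = y_0, \quad y(x_2) = 0.$$
Because
$$0 = \int_{x_1}^{x_2}\left(\frac{\partial F}{\partial y}\,\delta y + \frac{\partial F}{\partial y'}\,\delta y'\right)dx = \int_{x_1}^{x_2}\left(\frac{\partial F}{\partial y} - \frac{d}{dx}\frac{\partial F}{\partial y'}\right)\delta y\,dx,$$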
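and thus,
$$y = y_0 - \frac{c^2}{4g}(1 - \cos\Theta).$$
$$x = \frac{c^2}{4g}(\Theta - \sin\Theta), \qquad y = y_0 - \frac{c^2}{4g}(1 - \cos\Theta). \tag{18.38}$$
$$\frac{x_0}{y_0} = \frac{\Theta_0 - \sin\Theta_0}{1 - \cos\Theta_0}. \tag{18.39}$$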
The transcendental equation (18.39) can be solved in general only numerically. Special
cases:
Θ0      =  0    π      2π
x0/y0   =  0    π/2    ∞
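$$\begin{aligned}
T &= \int_0^{x_0}\sqrt{\frac{1+y'^2}{2g(y_0-y)}}\,dx = \int_0^{y_0}\sqrt{\frac{(dx/dy)^2+1}{2g(y_0-y)}}\,dy \\
  &= \int_0^{y_0}\sqrt{\frac{c^2}{2g(y_0-y)\,\bigl(c^2-2g(y_0-y)\bigr)}}\,dy \\
  &= \left[\frac{c}{2g}\,2\arctan\sqrt{\frac{c^2-2g(y_0-y)}{2g(y_0-y)}}\,\right]_0^{y_0} \\
  &= \frac{c}{g}\left(\frac{\pi}{2} - \arctan\sqrt{\frac{c^2-2gy_0}{2gy_0}}\right),
\end{aligned}$$
$$T = \frac{c}{g}\,\operatorname{arccot}\sqrt{\frac{c^2-2gy_0}{2gy_0}}.$$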
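The numerical side of this exercise can be sketched as follows. The code solves the transcendental equation (18.39) by bisection and evaluates the gliding time in two ways: from the arccot formula above, and from the parametric form (18.38), for which $dt = \sqrt{a/g}\,d\Theta$ with $a = c^2/4g$, so that $T = \sqrt{a/g}\,\Theta_0$. This second expression is derived here and is not quoted from the text; the function names and the value g = 9.81 m/s² are assumptions for illustration. The arccot form is only used for $\Theta_0 \le \pi$, where $y$ decreases monotonically along the chute.

```python
import math

def solve_theta0(ratio, tol=1e-12):
    """Solve (theta - sin theta)/(1 - cos theta) = ratio on (0, 2*pi) by bisection."""
    def f(th):
        # 1 - cos(theta) = 2 sin^2(theta/2) avoids cancellation for small theta
        return (th - math.sin(th)) / (2.0 * math.sin(0.5 * th) ** 2) - ratio
    lo, hi = 1e-4, 2.0 * math.pi - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:      # the left-hand side increases monotonically with theta
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def glide_time(x0, y0, g=9.81):
    """T = sqrt(a/g) * theta0 with a = c^2/(4g) fixed by y(theta0) = 0 in (18.38)."""
    theta0 = solve_theta0(x0 / y0)
    a = y0 / (1.0 - math.cos(theta0))
    return math.sqrt(a / g) * theta0

def glide_time_arccot(x0, y0, g=9.81):
    """T = (c/g) arccot sqrt((c^2 - 2 g y0)/(2 g y0)); valid for theta0 <= pi."""
    theta0 = solve_theta0(x0 / y0)
    c2 = 4.0 * g * y0 / (1.0 - math.cos(theta0))
    z = math.sqrt(max(c2 - 2.0 * g * y0, 0.0) / (2.0 * g * y0))
    return math.sqrt(c2) / g * math.atan2(1.0, z)   # arccot(z) for z >= 0

if __name__ == "__main__":
    y0 = 5.0
    x0 = 0.5 * math.pi * y0                 # the case x0 = (pi/2) y0 of the problem
    t_free = math.sqrt(2.0 * y0 / 9.81)     # free fall from height y0
    print("theta0     =", solve_theta0(x0 / y0))
    print("T / T_free =", glide_time(x0, y0) / t_free)
```

For $x_0 = (\pi/2)\,y_0$ the root is $\Theta_0 = \pi$, and the gliding time exceeds the free-fall time $\sqrt{2y_0/g}$ by the factor $\pi/2$.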
As is seen already from (18.25), according to the Hamilton principle the time is not
being varied. The system passes a trace point and the appropriate varied trace point at
the same time. Hence,
δt = 0.
$$\delta I = \delta\int_{t_1}^{t_2} L\bigl(q_\alpha(t), \dot q_\alpha(t), t\bigr)\,dt = 0, \qquad \alpha = 1, 2, \ldots, f, \tag{18.40}$$
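Because
$$\begin{aligned}
\frac{d}{dt}\,\delta q_\alpha &= \frac{d}{dt}\bigl(q_\alpha(t,\varepsilon) - q_\alpha(t,0)\bigr) \\
 &= \frac{d}{dt}\,q_\alpha(t,\varepsilon) - \frac{d}{dt}\,q_\alpha(t,0) \\
 &= \delta\,\frac{d}{dt}\,q_\alpha(t) = \delta\dot q_\alpha(t),
\end{aligned} \tag{18.42}$$
integration by parts of the second summand yields
$$\begin{aligned}
\int_{t_1}^{t_2}\frac{\partial L}{\partial\dot q_\alpha}\,\delta\dot q_\alpha\,dt &= \int_{t_1}^{t_2}\frac{\partial L}{\partial\dot q_\alpha}\,\frac{d}{dt}\,\delta q_\alpha\,dt \\
 &= \left[\frac{\partial L}{\partial\dot q_\alpha}\,\delta q_\alpha\right]_{t_1}^{t_2} - \int_{t_1}^{t_2}\frac{d}{dt}\left(\frac{\partial L}{\partial\dot q_\alpha}\right)\delta q_\alpha\,dt.
\end{aligned} \tag{18.43}$$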
Since δqα vanishes at the endpoints (integration limits), we get for the variation of the
integral
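$$\delta I = \int_{t_1}^{t_2}\sum_\alpha\left(\frac{\partial L}{\partial q_\alpha} - \frac{d}{dt}\frac{\partial L}{\partial\dot q_\alpha}\right)\delta q_\alpha\,dt = 0. \tag{18.44}$$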
For holonomic constraints, we imagine that the dependent degrees of freedom were
eliminated. We take the qα as the independent coordinates. Hence, the δqα are in-
dependent of each other, and the integral vanishes only if the coefficient of any δqα
vanishes. This means that the Lagrange equations hold:
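$$\frac{d}{dt}\frac{\partial L}{\partial\dot q_\alpha} - \frac{\partial L}{\partial q_\alpha} = 0. \tag{18.45}$$
Likewise, one can obtain the Hamilton equations by replacing $L$ by $\sum_\alpha p_\alpha\dot q_\alpha - H$
and considering the variations $\delta p_\alpha$ and $\delta q_\alpha$ as independent. This will be worked out
in Exercise 18.7.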
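Hamilton's principle can be illustrated with a discretized action. The sketch below is an illustration under stated assumptions, not the book's procedure: it takes a single particle with $L = \frac{1}{2}m\dot q^2 - mgq$, fixed endpoints $q(0) = q(t_{\rm end}) = 0$, the true path $q(t) = \frac{1}{2}g\,t\,(t_{\rm end} - t)$, and the variation $\delta q = \varepsilon\sin(\pi t/t_{\rm end})$, which vanishes at both endpoints. The function name `action` and all numerical values are chosen here.

```python
import math

def action(eps, m=1.0, g=9.81, t_end=1.0, n=1000):
    """Discretized action of L = m q_dot^2/2 - m g q along the varied path."""
    h = t_end / n

    def q(t):
        true_path = 0.5 * g * t * (t_end - t)            # solves q_ddot = -g
        return true_path + eps * math.sin(math.pi * t / t_end)

    s = 0.0
    for i in range(n):
        qa, qb = q(i * h), q((i + 1) * h)
        kinetic = 0.5 * m * ((qb - qa) / h) ** 2
        potential = m * g * 0.5 * (qa + qb)
        s += (kinetic - potential) * h
    return s

if __name__ == "__main__":
    for eps in (-0.1, -0.01, 0.0, 0.01, 0.1):
        print(f"eps = {eps:+.2f}   I = {action(eps):.8f}")
```

The printed values change only in second order of ε around ε = 0, which is the discrete counterpart of δI = 0.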
In order to show the equivalence of the Hamilton principle with the formulations of
mechanics studied so far, we shall demonstrate its derivation from Newton’s equations.
We consider a particle in Cartesian coordinates. It moves along a certain path r = r(t)
between the positions r(t1 ) and r(t2 ). Now the path is varied by a virtual displacement
δr that is compatible with the constraint:
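The time is not varied. The work needed for the virtual displacement is
$$\delta A = \mathbf{F}\cdot\delta\mathbf{r} = \mathbf{F}^e\cdot\delta\mathbf{r},$$
if $\mathbf{F}^e$ is the external force and the constraint reaction does not perform work. If $\mathbf{F}^e$ is
conservative, then
$$\mathbf{F}^e\cdot\delta\mathbf{r} = -\delta V,$$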
The right-hand side can be transformed (the operator (d/dt)δr = δṙ is treated accord-
ing to (18.42)):
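$$\frac{d}{dt}(\dot{\mathbf{r}}\cdot\delta\mathbf{r}) = \dot{\mathbf{r}}\cdot\frac{d}{dt}\,\delta\mathbf{r} + \ddot{\mathbf{r}}\cdot\delta\mathbf{r} = \dot{\mathbf{r}}\cdot\delta\dot{\mathbf{r}} + \ddot{\mathbf{r}}\cdot\delta\mathbf{r} = \delta\left(\frac{1}{2}\dot{\mathbf{r}}^2\right) + \ddot{\mathbf{r}}\cdot\delta\mathbf{r}.$$
Multiplication by the mass $m$ yields
$$m\ddot{\mathbf{r}}\cdot\delta\mathbf{r} = m\,\frac{d}{dt}(\dot{\mathbf{r}}\cdot\delta\mathbf{r}) - \delta\left(\frac{1}{2}m\dot{\mathbf{r}}^2\right),$$
and therefore,
$$\delta(T - V) = \delta L = m\,\frac{d}{dt}(\dot{\mathbf{r}}\cdot\delta\mathbf{r}).$$
Integration with respect to time leads to
$$\delta\int_{t_1}^{t_2} L\,dt = m\,\bigl[\dot{\mathbf{r}}\cdot\delta\mathbf{r}\bigr]_{t_1}^{t_2} = 0.$$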
Thus, the Hamilton principle for a single particle has been derived from Newton’s
equations. The result can be directly extended to particle systems. This can be un-
derstood quite generally in the following way: If a particle system obeys the La-
grange equations (18.45) (which are equivalent to Newtonian mechanics), then we
have (18.44) and from that—because of (18.43)—again (18.41) or (18.40), provided
that δqα (t1 ) = δqα (t2 ) = 0. Thus, the Lagrange equations are equivalent to the Hamil-
ton principle.
EXERCISE
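where the Lagrangian $L$ is now expressed by the Hamiltonian $H$; hence,
$$L = \sum_\alpha p_\alpha\dot q_\alpha - H(p_\alpha, q_\alpha, t). \tag{18.47}$$
$$\int_{t_1}^{t_2}\delta L\,dt = \int_{t_1}^{t_2}\sum_\alpha\left(\delta p_\alpha\,\dot q_\alpha + p_\alpha\,\delta\dot q_\alpha - \frac{\partial H}{\partial p_\alpha}\,\delta p_\alpha - \frac{\partial H}{\partial q_\alpha}\,\delta q_\alpha\right)dt. \tag{18.48}$$
The second term on the right-hand side can be transformed by integration by parts,
$$\int_{t_1}^{t_2} p_\alpha\,\delta\dot q_\alpha\,dt = \int_{t_1}^{t_2} p_\alpha\,\frac{d}{dt}\,\delta q_\alpha\,dt = p_\alpha\,\delta q_\alpha\Big|_{t_1}^{t_2} - \int_{t_1}^{t_2}\dot p_\alpha\,\delta q_\alpha\,dt. \tag{18.49}$$
The first term vanishes since the variations at the endpoints vanish: $\delta q_\alpha(t_1) = \delta q_\alpha(t_2) = 0$.
Hence, (18.48) becomes
$$0 = \int_{t_1}^{t_2}\delta L\,dt = \int_{t_1}^{t_2}\sum_\alpha\left[\left(\dot q_\alpha - \frac{\partial H}{\partial p_\alpha}\right)\delta p_\alpha + \left(-\dot p_\alpha - \frac{\partial H}{\partial q_\alpha}\right)\delta q_\alpha\right]dt. \tag{18.50}$$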
The variations δpα and δqα are independent of each other because along a path in
phase space the neighboring paths can have different coordinates or (and) different
momenta. Thus, (18.50) leads to
$$\dot q_\alpha = \frac{\partial H}{\partial p_\alpha}, \qquad \dot p_\alpha = -\frac{\partial H}{\partial q_\alpha}, \tag{18.51}$$
which was to be demonstrated.
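Once a Hamiltonian is given, the equations (18.51) can be integrated numerically. The following sketch is one possible way to do this and is not taken from the text: it approximates the partial derivatives of $H$ by central differences, advances $(q, p)$ with a classical fourth-order Runge-Kutta step, and uses the plane pendulum $H = p_\varphi^2/(2ml^2) - mgl\cos\varphi$ as a sample system. All function names and parameter values are assumptions for illustration.

```python
import math

def pendulum_H(phi, p, m=1.0, l=1.0, g=9.81):
    """Sample Hamiltonian: plane pendulum with p = m l^2 phi_dot."""
    return p * p / (2.0 * m * l * l) - m * g * l * math.cos(phi)

def hamilton_rhs(H, q, p, h=1e-6):
    """Right-hand sides of the Hamilton equations via central differences."""
    q_dot = (H(q, p + h) - H(q, p - h)) / (2.0 * h)     # q_dot = +dH/dp
    p_dot = -(H(q + h, p) - H(q - h, p)) / (2.0 * h)    # p_dot = -dH/dq
    return q_dot, p_dot

def integrate(H, q, p, t_end, n_steps):
    """Classical Runge-Kutta (RK4) integration of the canonical equations."""
    dt = t_end / n_steps
    for _ in range(n_steps):
        k1q, k1p = hamilton_rhs(H, q, p)
        k2q, k2p = hamilton_rhs(H, q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
        k3q, k3p = hamilton_rhs(H, q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
        k4q, k4p = hamilton_rhs(H, q + dt * k3q, p + dt * k3p)
        q += dt * (k1q + 2.0 * k2q + 2.0 * k3q + k4q) / 6.0
        p += dt * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
    return q, p

if __name__ == "__main__":
    q0, p0 = 1.0, 0.0
    q1, p1 = integrate(pendulum_H, q0, p0, t_end=5.0, n_steps=5000)
    print("energy drift:", pendulum_H(q1, p1) - pendulum_H(q0, p0))
```

Since this $H$ does not depend explicitly on time, the energy should stay constant along the computed trajectory up to the integration error.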
18.3 Phase Space and Liouville’s Theorem

In the Hamiltonian formalism, the state of motion of a mechanical system with $f$ degrees
of freedom at a definite time $t$ is completely characterized by the specification
of the $f$ generalized coordinates and $f$ momenta $q_1, \ldots, q_f;\ p_1, \ldots, p_f$. These $q_i$
and $p_i$ can be understood as coordinates of a $2f$-dimensional Cartesian space, the
phase space. The $f$-dimensional subspace of the coordinates $q_i$ is the configuration
space; the $f$-dimensional subspace of the momenta $p_i$ is called momentum space.
In the course of motion of the system the representative point describes a curve, the
phase trajectory. If the Hamiltonian is known, then the entire phase trajectory can
be uniquely calculated in advance from the coordinates of one point. Therefore to
each point belongs only one trajectory, and two different trajectories cannot intersect
each other. A path in phase space is given in parametric representation by $q_k(t), p_k(t)$
($k = 1, \ldots, f$). Because of the uniqueness of the solutions of the Hamilton equations,
the system develops from different boundary conditions along different trajectories. For
conservative systems the point is bound to a $(2f-1)$-dimensional hypersurface of the
phase space by the condition $H(q, p) = E = \text{constant}$.
EXAMPLE
If the angle ϕ is taken as a generalized coordinate, then we have for the plane pendu-
lum (mass m, length l)
$$p_\varphi = m l^2\dot\varphi.$$
We now consider a large number N of independent points that are mechanically iden-
tical, apart from the initial conditions, and are therefore described by the same Hamil-
tonian. As a specific example, we can imagine particles in the beam of an accelerator.
If all points at time $t_1$ are distributed over a $2f$-dimensional phase space region $G_1$
with the volume
$$\Delta V = \Delta q_1\cdots\Delta q_f\cdot\Delta p_1\cdots\Delta p_f,$$
The volume of an arbitrary region of phase space is conserved if the points of its
boundary move according to the canonical equations.
The density of points in phase space in the vicinity of a point moving with the fluid
is constant.
To prove that, we investigate the motion of system points through a volume element
of the phase space. Let us first consider the components of the particle flux along the
qk - and pk -direction.
The area ABCD represents the projection of the 2f -dimensional volume element
dV onto the qk , pk -plane.
The number of points entering the volume element per unit time through the “side
face” (with the projection AD onto the $q_k, p_k$-plane) is
$$\rho\,\dot q_k\,dp_k\cdot dV_k,$$
where
$$dV_k = \prod_{\substack{\alpha=1\\ \alpha\neq k}}^{f} dq_\alpha\,dp_\alpha$$
is the $(2f-2)$-dimensional remainder volume element; $dp_k\cdot dV_k$ is the magnitude of
the lateral surface with the projection AD in the $p_k, q_k$-plane.
3 Joseph Liouville, b. March 24, 1809, St. Omer–d. Sept. 8, 1882, Paris. Liouville was professor of
mathematics and mechanics in Paris, at the École Polytechnique, at the Collège de France, and at
the Sorbonne. He was a member of the Bureau of Measures and of many scholarly societies. From
1840 to 1870, he was considered the leading mathematician of France. He worked on statistical
mechanics, boundary value problems, differential geometry, and special functions. His constructive
proof of the existence of transcendental numbers and, in 1844, the proof that $e$ and $e^2$ cannot be roots
of a quadratic equation with rational coefficients, were of great significance.
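The Taylor expansion for the points leaving at BC in the first direction yields
$$\left(\rho\,\dot q_k + \frac{\partial}{\partial q_k}(\rho\,\dot q_k)\,dq_k\right)dp_k\cdot dV_k. \tag{18.52}$$
From the flux components in $p_k$- and $q_k$-direction, the number of system points per
unit time
$$-\left(\frac{\partial}{\partial q_k}(\rho\,\dot q_k) + \frac{\partial}{\partial p_k}(\rho\,\dot p_k)\right)dV \tag{18.54}$$
$$\operatorname{div}(\rho\,\dot{\mathbf{r}}) + \frac{\partial\rho}{\partial t} = 0.$$
The divergence refers to the $2f$-dimensional phase space:
$$\nabla = \sum_{k=1}^{f}\frac{\partial}{\partial q_k} + \sum_{k=1}^{f}\frac{\partial}{\partial p_k}.$$
$$\frac{\partial\dot q_k}{\partial q_k} = \frac{\partial^2 H}{\partial q_k\,\partial p_k} \quad\text{and}\quad \frac{\partial\dot p_k}{\partial p_k} = -\frac{\partial^2 H}{\partial p_k\,\partial q_k}.$$
$$\frac{\partial\dot q_k}{\partial q_k} + \frac{\partial\dot p_k}{\partial p_k} = 0,$$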
This just equals the total derivative of the density with respect to time,
$$\frac{d\rho}{dt} = 0, \tag{18.58}$$
and hence, $\rho = \text{constant}$.
EXAMPLE
The system consists of particles of mass m in a constant gravitational field. For the
energy, we have
$$H = E = \frac{p^2}{2m} - mgq.$$
The total energy of a particle remains constant.
The phase trajectories $p(q)$ are parabolas
$$p = \sqrt{2m(E + mgq)},$$
$$p' = p + mgt,$$
$$q = \frac{(p^2/2m) - E}{mg},$$
and likewise,
$$F' = \frac{E_2 - E_1}{mg}\,(p_2' - p_1') = \frac{E_2 - E_1}{mg}\,(p_2 - p_1).$$
This is just the statement of Liouville’s theorem: $F = F'$ means that the density of the
system points in phase space remains constant. The significance of Liouville’s theorem
lies in the field of statistical mechanics, where one considers ensembles because of
lack of exact knowledge of the system.
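The conservation of the phase-space area in this example can be reproduced numerically. The sketch below is illustrative and rests on assumptions made here: the region is bounded by the two energy parabolas $E_1$, $E_2$ and by the lines $p = p_1$, $p = p_2$; its boundary is sampled as a polygon; each boundary point is moved with the exact solution $p(t) = p + mgt$, $q(t) = q + pt/m + gt^2/2$ of the canonical equations for $H = p^2/2m - mgq$; and the area is evaluated with the shoelace formula. Function names and numerical values are not from the text.

```python
def boundary(e1, e2, p1, p2, m=1.0, g=9.81, n=50):
    """Polygon around the region between the parabolas E1, E2 and p1 <= p <= p2."""
    def q_of(p, e):
        return (p * p / (2.0 * m) - e) / (m * g)
    ps = [p1 + (p2 - p1) * i / n for i in range(n + 1)]
    along_e2 = [(q_of(p, e2), p) for p in ps]             # p increasing
    along_e1 = [(q_of(p, e1), p) for p in reversed(ps)]   # back with p decreasing
    return along_e2 + along_e1

def flow(point, t, m=1.0, g=9.81):
    """Exact time evolution for H = p^2/(2m) - m g q."""
    q, p = point
    return (q + p * t / m + 0.5 * g * t * t, p + m * g * t)

def area(points):
    """Shoelace formula for the area enclosed by a closed polygon."""
    s = 0.0
    for (qa, pa), (qb, pb) in zip(points, points[1:] + points[:1]):
        s += qa * pb - qb * pa
    return abs(s) / 2.0

if __name__ == "__main__":
    region = boundary(e1=1.0, e2=2.0, p1=0.5, p2=1.5)
    moved = [flow(pt, t=2.0) for pt in region]
    print("F  =", area(region))     # equals (E2 - E1)(p2 - p1)/(m g)
    print("F' =", area(moved))
```

The region is sheared and displaced by the motion, but the two printed areas agree, as Liouville's theorem requires.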
A special application is the focusing of particle currents in accelerators where a
large number of particles are subject to identical conditions. Here a reduction of the
beam cross section must lead to an undesirable broadening of the momentum distribution.

18.4 The Principle of Stochastic Cooling
4 This chapter was stimulated by a lecture given by Professor Herminghaus (Mainz), at the occa-
sion of the sixtieth birthday of Professor P. Junior 1988 in Frankfurt. My thanks go to colleague
Mr. Herminghaus for leaving his manuscript, which I found very useful when writing this section.
5 Simon van der Meer, b. Nov. 24, 1925, Den Haag. He received the Nobel prize for physics in
1984. He studied mechanical and electrical engineering at the Technical University of Delft, took his
diploma exams as engineer and worked at first in the Philips central laboratory in Eindhoven. In 1956
he got a position as a development engineer at CERN in Geneva. Here he soon earned a reputation
for professional competence, imagination, and also for his talent for theory. He was appointed a
“senior engineer.” Meanwhile the Italian physicist Carlo Rubbia, a scientific coworker at CERN, had
developed the idea to shoot 450 GeV protons from the just-finished super-high energy accelerator
“SPS” onto their artificially produced “antiparticles”—antiprotons. The project was realized as a
collider system. For the first time one could generate and demonstrate the so far only hypothetical
intermediate W- and Z-bosons. Van der Meer, a “genuine puzzler,” provided an ingenious invention:
stochastic cooling, which allowed researchers to collect antiprotons in sufficient quantity and to store
them for the experiments. Only one year after their great success, which proved the predictions of
theory in a brilliant way, van der Meer and Rubbia were awarded the Nobel Prize for physics
“for decisive merits in the discovery of the field quanta of weak interaction.”
According to the predictions of the theory, these particles should be able to decay
as follows:
For the experimental proof of the IVB, one utilized the inverse reaction (18.60), by
shooting high-energy beams of antiprotons onto protons in the proton synchrotron
(PS) of CERN. Since the protons consist only of three quarks ($q$) and the antiprotons
of three antiquarks ($\bar q$), many quark-antiquark pairs are created by the violent
collisions. The reactions between these quarks and antiquarks can generate the
intermediate vector bosons (see Fig. 18.11). In order to reach a high event rate, which is
calculated according to
one needs both a large cross section and a high beam luminosity. Now one has
$$\text{luminosity} \sim \frac{N_p\cdot N_{\bar p}}{q}. \tag{18.62}$$
Here, $N_p$ and $N_{\bar p}$ denote the number of protons ($p$) and antiprotons ($\bar p$) in the beam,
and $q$ represents the beam cross section. The higher the number of particles and the
lower the beam cross section, the higher is the event rate for creating an intermediate
vector boson. See also Fig. 18.12.
Fig. 18.12. The existence of intermediate vector bosons could be proved for the first
time at CERN by the collision of intense high-energy proton and antiproton beams
($N_{\bar p}$ = number of antiprotons in the beam, $N_p$ = number of protons)

An efficient cooling mechanism for the antiproton beams is therefore needed. Each
particle of the beam moves by the action of magnetic fields in horizontal and vertical
vibrations about a closed pre-set trajectory. In this context the term cooling means
a reduction of the vibration amplitudes of the particles and thus of the beam cross
section, or a reduction of the width of the momentum distribution of the particles
about the mean value. This is illustrated by Fig. 18.13. Already well-tried cooling
methods are electron cooling, cooling by synchrotron radiation, and the stochastic
cooling, which will now be outlined in more detail.
The motion of each particle in the beam is described by a point in a 6-dimensional
phase space spanned by the 3 spatial and the 3 momentum coordinates. This phase
space point is surrounded by empty space. By an appropriate deformation of the phase
space element the particle can be shifted toward the center of gravity of the distribution.
This is the principle of stochastic cooling.
The experimental setup for cooling of antiproton beams is sketched in Fig. 18.14.
In the ideal case, a probe (pick-up) measures the position or the momentum of a
particle. This tiny signal is amplified and fed to the “kicker,” which then corrects the
transverse or the longitudinal momentum and thereby cools. Thus, the cooling can
be interpreted as a one-particle effect, since each particle cools itself by emission of a
self-generated signal (coherent effect). An essential prerequisite is that the particle and
the signal reach the kicker simultaneously. Because of the finite resolving power of the
probe in the real case, besides the desired signal the perturbing signals from other par-
ticles reach the kicker too. This noise causes a heating of the particles (incoherent
effect) and thus counteracts the cooling effect. This interplay of cooling and heat-
ing mechanisms is illustrated by Fig. 18.15 and will be discussed in Exercise 18.10.
The cooling effect is directly proportional to the signal amplification, while the heat-
ing is proportional to the square of the amplification. The particle is cooled only in
the hatched area (see Fig. 18.15). Evidently there exists an optimum of amplification
where the cooling effect reaches an extremum value. Thus, the greater the intensity
of the beams, the greater is the noise and the heating effect, and the less is the factor
of optimum amplification. Generation of an intense beam of antiprotons at CERN is
therefore performed by stages and may last several hours. The principle is illustrated
by Fig. 18.15.
First, an antiproton pulse of low intensity is injected at the left border of the vacuum
chamber (1). The corresponding momentum density distribution can be seen on the
right. The beam and its momentum width are then compressed by cooling (2). A high-
frequency voltage is used to shift the pulse to the right side of the chamber (3), thus
giving space for a further antiproton pulse which is injected into the chamber (4).
After cooling, the second pulse is shifted onto the already “deposited” pulse (5). This
procedure is repeated every 2 to 3 seconds for several hours. In this way, the longitudinal
phase-space density is increased by accumulation of more and more particles
into the same momentum interval (6). The final 6-dimensional phase-space density of
the stack is higher than the density of a single pulse by a factor of $3\cdot 10^8$. The intense
antiproton beam generated this way can now be further accelerated and brought to
collision with a proton beam. Only one year after the demonstration of the intermediate
vector bosons, S. van der Meer and C. Rubbia⁶ were awarded the Nobel Prize in
physics for their achievements.
We now come back to the apparent contradiction between Liouville’s theorem and
the method of stochastic cooling. While according to the Liouville theorem only a sin-
gle pulse can be accommodated in a ring, stochastic cooling allows one to accumulate
about 36,000 pulses in the course of a day. The final phase-space density is higher than
that of a single pulse by a factor of $3\cdot 10^8$.
However, stochastic cooling and the Liouville theorem are dealing with different
situations. The former presupposes an ensemble of a finite number of discrete parti-
cles, while the Liouville theorem presupposes a phase-space continuum (see div v !).
A discrete ensemble thus represents only a model approximation of this condition that
works the better the more dense the occupation of the phase-space volume becomes.
This becomes clear by the example of the cooling rate (which will be calculated in
the subsequent problem):
$$\frac{1}{\tau} = \frac{W}{N}\,(2g - g^2). \tag{18.63}$$
6 Carlo Rubbia, b. March 31, 1934, Gorizia. He got his education as a physicist in Pisa at the Scuola
Normale, a time-honored university. Here he got his doctorate in 1958, after which he worked for a
year as a research scholar at the Columbia University in New York, and then as an assistant professor
in Rome. In 1960, he came to CERN at Geneva as high-energy physicist. Since 1972, he has held
a chair at Harvard University. In Geneva, Rubbia was inspired by the unified theory of weak and
electromagnetic interactions developed by A. Salam, S. Glashow, and S. Weinberg (Nobel Prize for
physics, 1979). In 1976, Rubbia proposed to CERN the construction of a new 450 GeV SPS accel-
erator for the purpose of proton-antiproton collision experiments. The accelerator achieved collision
energies of 540 GeV, which were sufficient to create the (so far only predicted) W - and Z-bosons.
Important for the success of the project was not only Rubbia, but also S. van der Meer, whose con-
tributions made possible the generation of sharply bunched, pulsed antiproton currents. Both of them
got the Nobel Prize for physics in 1984.
N denotes the number of particles in the beam, W the bandwidth of the system and
g a gain factor that will be defined in problem 18.10. The essential point however is
the dependence of the cooling rate on the inverse of the particle number of the beam,
$1/N$. In the limit
$$\lim_{N\to\infty}\frac{1}{\tau} = 0,$$
cooling is no longer possible, as we would expect.
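The dependence of the cooling rate (18.63) on the gain and on the particle number is easy to tabulate. The sketch below only evaluates that formula; the sample values ($N = 10^{12}$, $W = 500$ MHz, $g = 1$) and the function names are assumptions for illustration.

```python
def cooling_rate(gain, bandwidth_hz, n_particles):
    """Cooling rate 1/tau = (W/N) (2g - g^2) in 1/s, cf. (18.63)."""
    return bandwidth_hz / n_particles * (2.0 * gain - gain * gain)

def best_gain(steps=200):
    """Scan 0 <= g <= 2 for the gain that maximizes the factor 2g - g^2."""
    gains = [2.0 * i / steps for i in range(steps + 1)]
    return max(gains, key=lambda g: 2.0 * g - g * g)

if __name__ == "__main__":
    rate = cooling_rate(gain=1.0, bandwidth_hz=500e6, n_particles=1e12)
    print("optimum gain     :", best_gain())
    print("cooling time in s:", 1.0 / rate)
    # The rate drops as 1/N when the number of particles grows.
    for n in (1e9, 1e12, 1e15):
        print(f"N = {n:.0e}   1/tau = {cooling_rate(1.0, 500e6, n):.3e} 1/s")
```

With these sample numbers the factor $2g - g^2$ is largest at $g = 1$, and the cooling time comes out as $N/W = 2000$ s.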
We note that the same restriction for applying Liouville’s theorem basically also
holds in thermodynamics, but there the approximation is better by 12 orders of magnitude
($10^{12} \to 10^{24}$)!
Much more important, however, is the fact that Liouville’s theorem holds on the
condition that the particles obey the Hamilton equations, with a given Hamiltonian $H$.
In this sense the particle system must be closed. But just this condition is violated by
reading off the particle position (coordinate, momentum pick-up) and by the
corresponding correction (kicker; see Fig. 18.14). This is a calculated interference from
outside which cannot be described by a Hamiltonian. Hence, the Liouville theorem
does not have to be fulfilled; indeed, it cannot be expected to hold at all!
EXERCISE
Problem.
(a) Calculate the cooling rate per second for a beam of N particles.
(b) When does maximum cooling occur?
(c) Calculate the cooling time for a beam of $N = 10^{12}$ particles. Let the bandwidth of
the system be W = 500 MHz, and g = 1.
Fig. 18.16. In the ideal case, the pick-up seizes one particle

Solution. (a) We first consider the case that the pick-up and the kicker are so fast that
they seize each particle independently (see Fig. 18.16). Let the displacement of this
particle from the beam axis be $x_k$. After passing the distance $\lambda/4$ ($\lambda$ is the wavelength
of the $x$-vibration), the deviation is corrected electromagnetically in the kicker. Let
the correction be
The corrected distance xk of the particle from the beam axis is thus given by
(Fig. 18.17). For $g = 1$, the cooling would be ideal. However, in the real case there
appears a noise in the pick-up which is due to further $N_s - 1$ particles passing the
pick-up in the time interval $T_s$ (see Fig. 18.18). Thus, the pick-up measures not only
the spatial displacement $x_k$ of the $k$th particle from the beam axis ($x = 0$), but also
that of the additional $N_s - 1$ particles located around the $k$th one. The recorded spatial
displacement is therefore the mean value of all $N_s$ seized particles (the $k$th and the
$N_s - 1$ located around the $k$th particle):
$$\bar x_k = \frac{1}{N_s}\sum_{j=1}^{N_s} x_j. \tag{18.66}$$
For clarity, we will label the $k$th particle in the sum on the right-hand side, i.e., the
numbering will be chosen so that $j = 1$ just denotes the $k$th particle. Moreover, it
should be clear that the remaining $N_s - 1$ particles are closely located around the $k$th
one when passing the pick-up. We therefore add the index $k$ to the particular spatial
displacement $x_j$:
$$\bar x_k = \frac{1}{N_s}\left(x_{1,k} + \sum_{j=2}^{N_s} x_{j,k}\right), \tag{18.67}$$
This means that there will be no kick if the sample of Ns particles on the average
moves on the beam axis:
In other words, the kicker will not be activated if the center of gravity
$$S_k = \frac{1}{mN_s}\sum_{j=1}^{N_s} m\,x_{j,k} = \frac{1}{N_s}\sum_{j=1}^{N_s} x_{j,k} \equiv \bar x_k \tag{18.70}$$
of the sample in the pick-up is already on the beam axis (all particles have the same
mass $m$).
For real measurements in the pick-up, this will, of course, not be fulfilled in gen-
eral. The probability that the center of gravity of the Ns particles that are statistically
distributed over the beam just coincides with the beam axis is extremely low. In the
realistic case the sample will always be “kicked”.
We now want to know how the mean value of the spatial displacement of all $N$
particles in the beam will change by the mechanism of stochastic cooling. This mean
value is
$$E(x_k) = \frac{1}{N}\sum_{k=1}^{N} x_k \tag{18.71}$$
and will be denoted by $E(x_k)$ (expectation value) to distinguish it from the mean value
for the sample of $N_s$ particles, $\bar x_k$, which was defined in (18.66). It is however clear
that the mean value of the positions of all the particles just defines the beam axis.
Since we put the beam axis at the origin of the coordinate system, $x = 0$, the mean
value of the spatial displacement of all the particles from the beam axis just vanishes:
$$E(x_k) \equiv 0. \tag{18.72}$$
This always holds, independent of the mechanism of stochastic cooling. Thus, the
mean value $E(x_k)$ is not an appropriate quantity for investigating the mechanism of
stochastic cooling. It is evident that the mean square of the spatial displacement $E(x_k^2)$
is much better suited for this purpose. We therefore will investigate the change of
$E(x_k^2)$ by the stochastic cooling mechanism. First we consider the mean square spatial
displacement $x_k^2$ for the $k$th particle, and to this end, we square (18.68):
The change of $x_k^2$ for a single passage through the kicker is thus given by
Since there is one kick per revolution, this is also the change of $x_k^2$ per revolution. By
averaging over all particles, one obtains
The second equals sign in the first line follows from the additivity of the expectation
value $E(\ldots)$; compare (18.71).
To calculate the change of the expectation value of the mean square of the spatial
displacement per revolution, $\Delta(E(x_k^2))$, the expectation values $E(x_k\bar x_k)$, $E(\bar x_k^2)$
must be expressed by $E(x_k^2)$. We then obtain $\Delta(E(x_k^2))$ as a function of $E(x_k^2)$, or a
differential equation for $E(x_k^2)$, the solution of which allows us to calculate the desired
quantities.
To evaluate $E(x_k\bar x_k)$, we write with (18.67)
$$\begin{aligned}
E(x_k\bar x_k) &= \frac{1}{N}\sum_{k=1}^{N} x_k\,\frac{1}{N_s}\left(x_{1,k} + \sum_{j=2}^{N_s} x_{j,k}\right) \\
 &= \frac{1}{N_s}\left[\frac{1}{N}\sum_{k=1}^{N} x_k^2 + \frac{1}{N}\sum_{k=1}^{N}\sum_{j=2}^{N_s} x_{1,k}\,x_{j,k}\right].
\end{aligned} \tag{18.76}$$
In the first term, we used $x_{1,k} \equiv x_k$, and in the second one $x_k \equiv x_{1,k}$.
We now realize that two different particles in the beam cannot be correlated (the
particles are statistically distributed over the beam!). Even though they belong to the
same sample of $N_s$ particles around the $k$th particle, their spatial displacements $x_{i,k}$
and $x_{j,k}$, $i \neq j$, on the average must satisfy
$$E(x_{i,k}\,x_{j,k}) = \frac{1}{N}\sum_{k=1}^{N} x_{i,k}\,x_{j,k} = 0, \quad\text{for } i \neq j. \tag{18.77}$$
$$E(x_{1,k}\,x_{j,k}) = \frac{1}{N}\sum_{k=1}^{N} x_{1,k}\,x_{j,k} \equiv 0. \tag{18.78}$$
This is now utilized in the second term of (18.76), which then vanishes. The first term
can immediately be rewritten using the definition of $E$, and we obtain
$$E(x_k\bar x_k) = \frac{1}{N_s}\,E(x_k^2). \tag{18.79}$$
Furthermore,
$$\begin{aligned}
E(\bar x_k^2) &= \frac{1}{N}\sum_{k=1}^{N}\frac{1}{N_s^2}\sum_{i,j=1}^{N_s} x_{i,k}\,x_{j,k} \\
 &= \frac{1}{N}\sum_{k=1}^{N}\frac{1}{N_s^2}\left(\sum_{i=1}^{N_s} x_{i,k}^2 + \sum_{\substack{i,j=1\\ i\neq j}}^{N_s} x_{i,k}\,x_{j,k}\right).
\end{aligned} \tag{18.80}$$
The second term again vanishes by using (18.77). In the first term we first average by
summing over all particles,
$$\frac{1}{N}\sum_{k=1}^{N}\frac{1}{N_s^2}\sum_{i=1}^{N_s} x_{i,k}^2 = \frac{1}{N_s^2}\sum_{i=1}^{N_s} E(x_{i,k}^2). \tag{18.81}$$
The mean square spatial deviation $E(x_{i,k}^2)$ cannot depend on the label $i$ of the particle
from the sample of $N_s$ particles, $E(x_{i,k}^2) \equiv E(x_k^2)$. The sum over $i$ therefore yields
only the additional factor $N_s$, and we obtain
$$E(\bar x_k^2) \equiv \frac{1}{N_s}\,E(x_k^2). \tag{18.82}$$
Equation (18.75) with (18.79) and (18.82) thus turns into
$$\Delta(E(x_k^2)) = -\frac{2g - g^2}{N_s}\,E(x_k^2) \tag{18.83}$$
for the change of the mean square spatial displacement per revolution. The “differential”
change $dE(x_k^2)$ per “differential” revolution $dn$ is
$$\frac{dE(x_k^2)}{dn} = -\frac{2g - g^2}{N_s}\,E(x_k^2). \tag{18.84}$$
we obtain
$$x_{\rm rms} = \sqrt{C}\,\exp\left(-n\,\frac{2g - g^2}{2N_s}\right). \tag{18.86}$$
$$n_0 = \frac{2N_s}{2g - g^2} \tag{18.87}$$
revolutions. Since each revolution takes the time $T$, it thus lasts for
$$\tau = n_0 T = \frac{2N_s T}{2g - g^2}, \tag{18.88}$$
to reduce $x_{\rm rms}$ to the fraction $1/e$ of its original value. Since among $N$ particles orbiting
in the time $T$, $N_s$ particles are seized in the time $T_s$, for a homogeneous particle
flux density we have
$$N_s = N\,\frac{T_s}{T}, \tag{18.89}$$
and therefore,
$$\tau = \frac{2N T_s}{2g - g^2}. \tag{18.90}$$
$$T_s = \frac{1}{2\omega}, \tag{18.91}$$
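The decay law (18.83) to (18.86) can be tested with a small Monte Carlo experiment. The sketch below is a toy model under assumptions made here, not the book's calculation: every particle carries only a displacement $x$, each revolution the particles are regrouped at random into samples of $N_s$ (which mimics the assumed absence of correlations, cf. (18.77)), and each sample receives the kick $x \to x - g\,\bar x$ with its own sample mean $\bar x$. The kick rule is inferred from the factor $2g - g^2$ in (18.83); function name, particle numbers and seed are illustrative.

```python
import math
import random

def simulate_cooling(n_particles=4000, sample_size=20, gain=1.0,
                     revolutions=40, seed=1):
    """Return the mean square displacement E(x^2) after each revolution."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    history = [sum(v * v for v in x) / n_particles]
    for _ in range(revolutions):
        rng.shuffle(x)                       # mixing: new random samples each turn
        for start in range(0, n_particles, sample_size):
            sample = x[start:start + sample_size]
            kick = gain * sum(sample) / len(sample)   # g times the sample mean
            for i in range(start, start + len(sample)):
                x[i] -= kick
        history.append(sum(v * v for v in x) / n_particles)
    return history

if __name__ == "__main__":
    ns, g, n = 20, 1.0, 40
    hist = simulate_cooling(sample_size=ns, gain=g, revolutions=n)
    print("simulated E(x^2) ratio:", hist[-1] / hist[0])
    print("exp(-n (2g - g^2)/Ns) :", math.exp(-n * (2.0 * g - g * g) / ns))
```

The simulated ratio follows the exponential law only approximately, since (18.84) replaces the discrete change per revolution by a differential one.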
Fig. 18.19.
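19 Canonical Transformations

Given a Hamiltonian $H = H(q_j, p_j, t)$, the motion of the system is found by integration
of the Hamilton equations:
$$\dot p_i = -\frac{\partial H}{\partial q_i} \quad\text{and}\quad \dot q_i = \frac{\partial H}{\partial p_i}. \tag{19.1}$$
$$\frac{\partial H}{\partial q_i} = 0, \quad\text{i.e.,}\quad \dot p_i = 0.$$
where all coordinates $Q_i$ for the problem were cyclic. Then all momenta are constant,
$P_i = \beta_i$, and the new Hamiltonian $\mathcal{H}$ is then only a function of the constant momenta
$P_i$; hence, $\mathcal{H} = \mathcal{H}(P_j)$. Then
$$\dot Q_i = \frac{\partial\mathcal{H}(P_j)}{\partial P_i} = \omega_i = \text{constant}, \qquad \dot P_i = -\frac{\partial\mathcal{H}(P_j)}{\partial Q_i} = 0.$$
$$Q_i = \omega_i t + \omega_0, \qquad P_i = \beta_i = \text{constant}.$$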
Here, we presupposed that the new coordinates $(P_i, Q_i)$ again satisfy the (canonical)
Hamilton equations, with a new Hamiltonian $\mathcal{H}(P_j, Q_j, t)$. This is an essential
requirement for a coordinate transformation of the form (19.2) to make it canonical.
Just as $p_i$ is the canonical momentum corresponding to $q_i$ ($p_i = \partial L/\partial\dot q_i$), $P_i$ shall
be the canonical momentum to $Q_i$. A pair $(q_i, p_i)$ is called canonically conjugate
if the Hamilton equations hold for $q_i$ and $p_i$. The transformation from one pair of
$$\dot Q_i = \frac{\partial\mathcal{H}}{\partial P_i}, \qquad \dot P_i = -\frac{\partial\mathcal{H}}{\partial Q_i}. \tag{19.3}$$
At the moment, we do not yet require that all Qi be cyclic. This case will be con-
sidered later (Chap. 20).
In the new coordinates, we require Hamilton’s principle to be maintained. Thus,
for fixed instants of time, $t_1$ and $t_2$, we have both
$$\delta\int_{t_1}^{t_2} L(q_j, \dot q_j, t)\,dt = 0$$
and
$$\delta\int_{t_1}^{t_2}\mathcal{L}(Q_j, \dot Q_j, t)\,dt = 0.$$
also vanishes.
We observe that (19.4) will then be fulfilled even if the old and new Lagrangians
differ by a total time derivative of a function $F$:
$$L - \mathcal{L} = \frac{dF}{dt}, \quad\text{because}\quad \delta\int_{t_1}^{t_2}\frac{dF}{dt}\,dt = \delta\bigl(F|_{t_2} - F|_{t_1}\bigr) = 0,$$
since the variation of a constant equals zero. As we shall see, the function $F$ mediates
the transformation $(p_i, q_i)$ to $(P_i, Q_i)$. $F$ is therefore also called a generating
function. In the general case, $F$ will be a function of the old and the new coordinates;
together with the time $t$ it involves $4n + 1$ coordinates:
$$F = F(p_j, q_j, P_j, Q_j, t).$$
we have
$$\sum_i p_i\dot q_i - H = \sum_i P_i\dot Q_i - \mathcal{H} + \frac{dF}{dt}. \tag{19.8}$$
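$$p_i = \frac{\partial F_1(q_j, Q_j, t)}{\partial q_i}, \qquad
P_i = -\frac{\partial F_1(q_j, Q_j, t)}{\partial Q_i}, \qquad
\mathcal{H} = H + \frac{\partial F_1(q_j, Q_j, t)}{\partial t}. \tag{19.10}$$
We are now prepared to derive the transformation equations for a generating function
of the type $F_2$, which is also denoted by $S$:
$$F_2 \equiv S = S(q_j, P_j, t).$$
$$P_i = -\frac{\partial F_1(q_j, Q_j, t)}{\partial Q_i}.$$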
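This leads to
$$\begin{aligned}
\sum_i p_i\dot q_i - \sum_i P_i\dot Q_i - H + \mathcal{H} &= \frac{d}{dt}\Bigl(F_2(q_j, P_j, t) - \sum_i P_i Q_i\Bigr) \\
 &= \sum_i\frac{\partial F_2}{\partial q_i}\,\dot q_i + \sum_i\frac{\partial F_2}{\partial P_i}\,\dot P_i + \frac{\partial F_2}{\partial t} - \sum_i\dot P_i Q_i - \sum_i P_i\dot Q_i
\end{aligned}$$
$$\sum_i p_i\dot q_i + \sum_i\dot P_i Q_i - H + \mathcal{H} = \sum_i\frac{\partial F_2}{\partial q_i}\,\dot q_i + \sum_i\frac{\partial F_2}{\partial P_i}\,\dot P_i + \frac{\partial F_2}{\partial t}.$$
$$p_i = \frac{\partial F_1(q_j, Q_j, t)}{\partial q_i}$$
follow the equations $p_i = p_i(q_j, Q_j, t)$, which can be solved for the $Q_i$:
$$Q_i = Q_i(p_j, q_j, t).$$
$$P_i = -\frac{\partial F_1(q_j, Q_j, t)}{\partial Q_i}$$
then enables us to calculate
$$P_i = P_i(p_j, q_j, t).$$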
We now understand the name generating function for F : The function F determines
the canonical transformation
Calculating the second derivatives of the generating functions F1,2,3,4 , we find the fol-
lowing relations to apply between old and new coordinates under a canonical trans-
formation
Exactly the existence of these mutual relations between old and new coordinates dis-
tinguishes a canonical transformation from a general transformation (19.2) of the sys-
tem’s coordinates. For the latter, (19.15) do not hold.
In the preceding derivation, the Hamiltonians $H(q_j, p_j, t)$ and $\mathcal{H}(Q_j, P_j, t)$ were
conceived as alternative descriptions of the same dynamical system. On the other
hand, we may as well conceive $H$ and $\mathcal{H}$ as describing different dynamical systems.
A canonical transformation of $H$ into $\mathcal{H}$ then establishes a correlation of both
dynamical systems. This way, it is sometimes possible to find the solution of a given
dynamical system by canonically transforming it into a second system that is easier to
solve. The solution of the original system is then obtained by canonically back transforming
the solution of the second system. With Examples 19.4 and 21.16, we shall
work out the solutions of the damped and the time-dependent harmonic oscillators,
respectively, by canonically transforming these systems into the ordinary harmonic
oscillator.
EXAMPLE
EXAMPLE
with arbitrary differentiable functions $f_k(q_j, t)$. The transformation rules (19.12) for
this $F_2$ follow as
$$Q_i = f_i(q_j, t), \qquad p_i = \sum_k P_k\,\frac{\partial f_k}{\partial q_i},$$
$$\mathcal{H}(Q_j, P_j, t) = H(q_j, p_j, t) + \sum_k P_k\,\frac{\partial f_k}{\partial t}.$$
The new position coordinates $Q_i$ thus emerge as functions of the original position
coordinates $q_i$, without any dependence on the momentum coordinates. Transformations
of this type are referred to as point transformations. This class of transformations is
generally canonical as we can always construct the corresponding generating function.
The particular case $f_k(q_j, t) = q_k$ then defines the identical transformation
$$Q_i = q_i, \qquad p_i = \sum_k P_k\,\delta_{ki} = P_i, \qquad \mathcal{H}(Q_j, P_j, t) = H(q_j, p_j, t).$$
EXAMPLE
Let the kinetic energy $T(p)$ and the potential energy $V(q)$ of a particle be given by
\[
T(p) = \frac{p^2}{2m}, \qquad V(q) = \frac{1}{2} k q^2 = \frac{1}{2} m\omega^2 q^2, \qquad
\omega^2 = \frac{k}{m}, \qquad m, k, \omega = \text{const.},
\]
with $m$ denoting the particle’s mass, $k$ a characteristic constant of the oscillator, and
$\omega$ its characteristic frequency. The Hamiltonian of this system is then
\[
H(q, p) = \frac{p^2}{2m} + \frac{1}{2} m\omega^2 q^2. \tag{19.16}
\]
The canonical equations and the equation of motion follow as
\[
\dot{q}(t) = \frac{\partial H}{\partial p} = \frac{p(t)}{m}, \qquad
-\dot{p}(t) = \frac{\partial H}{\partial q} = q(t)\, m\omega^2, \qquad
\ddot{q} + \omega^2 q = 0. \tag{19.17}
\]
The direct way to evaluate the dynamics of this system is to integrate the equation of
motion. Here, we choose the “detour” over the canonical transformation formalism,
namely to map our system into another system with Hamiltonian $H'$ whose canonical
equations are even easier to solve. As the “target Hamiltonian” $H'$, we choose
\[
H'(P) = \omega P. \tag{19.18}
\]
With the transformation (19.19) shown to be canonical, it is ensured that the
transformed system (19.18) constitutes, for its part, a Hamiltonian system and hence
maintains the canonical form of the canonical equations. Explicitly, the canonical
equations of the transformed system are
\[
\dot{Q}(t) = \frac{\partial H'}{\partial P} = \omega, \qquad
\dot{P}(t) = -\frac{\partial H'}{\partial Q} = 0.
\]
These equations are thus equivalent to the original canonical equations (19.17) that
emerged from the original Hamiltonian (19.16). As $H'$ does not depend on $Q$, we
observe that the new canonical position coordinate $Q$ is cyclic, and hence that its conjugate
canonical momentum $P$ represents a conserved quantity. The canonical equations for
$\dot{Q}(t)$ and $\dot{P}(t)$ can be integrated immediately, yielding
\[
Q(t) = \omega t + Q(0), \qquad P(t) = P(0).
\]
The system’s dynamics are thus completely solved in the simplest possible manner.
Inserting the solution functions $Q(t)$ and $P(t)$ into the transformation rules (19.19),
we obtain the solutions in the original coordinates $q(t)$ and $p(t)$,
\[
q(t) = \sqrt{\frac{2P(0)}{m\omega}}\, \sin\bigl(\omega t + Q(0)\bigr), \qquad
p(t) = \sqrt{2m\omega P(0)}\, \cos\bigl(\omega t + Q(0)\bigr).
\]
The trigonometric functions can finally be split by means of the addition theorems.
According to (19.19), the values $\sin Q(0)$ and $\cos Q(0)$ can then be expressed in terms
of the initial conditions $q(0)$ and $p(0)$ of the original system,
\[
q(t) = q(0) \cos\omega t + \frac{p(0)}{m\omega} \sin\omega t, \qquad
p(t) = -q(0)\, m\omega \sin\omega t + p(0) \cos\omega t.
\]
As expected, we find the solution of the harmonic oscillator in exactly the form we
would have obtained by a direct integration of the canonical equations (19.17).
At this point, one could well argue that the overall effort needed to solve the canonical
equations (19.17) along the “detour” over the canonical transformation method is even
larger than that for the direct solution. But this is only due to the simplicity of the
original system. The example here was chosen just to demonstrate the method, which
consists of three steps: (i) forth transformation of the Hamiltonian and the initial
conditions into a second system, (ii) solving on that basis the dynamics of the second
system, and (iii) transforming back the obtained solution into the original system
coordinates. We may depict both alternatives by means of the following diagram:
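[Diagram: the canonical forth transformation $M(0)$ of the Hamiltonian and the initial
conditions leads to $(H'; Q(0), P(0))$; the solution of the canonical equations of the
Hamiltonian $H'$ leads from there to $(Q(t), P(t))$; the canonical back transformation
$M^{-1}(t)$ of the solution returns to the original coordinates.]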
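The three steps can be sketched numerically. The following is a minimal sketch, not taken from the text: it assumes that the transformation rule (19.19) has the form $q = \sqrt{2P/(m\omega)}\,\sin Q$, $p = \sqrt{2m\omega P}\,\cos Q$, which is inferred here from the solution formulas of this example, and all function names are illustrative.

```python
import math

def to_new_coordinates(q, p, m, omega):
    """Forth transformation (q, p) -> (Q, P); inverse of the assumed rule (19.19)."""
    P = (p * p + (m * omega * q) ** 2) / (2.0 * m * omega)
    Q = math.atan2(m * omega * q, p)  # sin Q ~ m*omega*q, cos Q ~ p
    return Q, P

def to_old_coordinates(Q, P, m, omega):
    """Back transformation (Q, P) -> (q, p), the assumed form of (19.19)."""
    q = math.sqrt(2.0 * P / (m * omega)) * math.sin(Q)
    p = math.sqrt(2.0 * m * omega * P) * math.cos(Q)
    return q, p

def solve_via_canonical_transformation(q0, p0, t, m, omega):
    # (i) transform the initial conditions into the second system
    Q0, P0 = to_new_coordinates(q0, p0, m, omega)
    # (ii) solve the second system, H' = omega * P: Q grows linearly, P is conserved
    Qt, Pt = omega * t + Q0, P0
    # (iii) transform the solution back into the original coordinates
    return to_old_coordinates(Qt, Pt, m, omega)

def solve_directly(q0, p0, t, m, omega):
    """Closed-form solution of the canonical equations (19.17)."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    return q0 * c + p0 / (m * omega) * s, -q0 * m * omega * s + p0 * c
```

Both routes should agree to rounding accuracy for any initial condition, which is a quick consistency check on the assumed form of the transformation.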
In the next Example 19.4, we will show that the method to determine the dynamics
of a given system by transforming it into a second system that is easier to solve can
indeed reduce the overall effort, as compared to a direct solution of the original system.
This will become obvious with Example 21.16, where we treat the time-dependent
damped harmonic oscillator. This case has long been thought of as possessing no
analytic solution. Yet, the solution of this problem by means of a generalized canonical
transformation is fairly straightforward. The price to pay is that we must find the
appropriate generating function.
EXAMPLE
\[
H(q, p, t) = \frac{p^2}{2m}\, e^{-2\gamma t} + \frac{1}{2} m\omega^2 e^{2\gamma t} q^2, \tag{19.21}
\]
with the abbreviations $2\gamma = \beta/m$ and $\omega^2 = k/m$. As before, $m$ stands for the mass
of the moving point particle, $\beta$ for the friction coefficient, and $k$ for the oscillator’s
constant. The canonical equations follow as
\[
\dot{q}(t) = \frac{\partial H}{\partial p} = \frac{p(t)}{m}\, e^{-2\gamma t}, \qquad
-\dot{p}(t) = \frac{\partial H}{\partial q} = q(t)\, m\omega^2 e^{2\gamma t}.
\]
From the left-hand equation, we see that the canonical momentum $p(t)$ no longer
coincides with the kinetic momentum $p_{\mathrm{kin}}(t) = m\dot{q}(t)$, provided that $\gamma \neq 0$:
\[
p(t) = p_{\mathrm{kin}}(t)\, e^{2\gamma t}.
\]
We may combine the two first-order equations into one second-order equation for $q(t)$
to obtain the equation of motion of the damped harmonic oscillator in its common
form
\[
\ddot{q} + 2\gamma \dot{q} + \omega^2 q = 0. \tag{19.22}
\]
As the new position coordinate $Q$ solely depends on the old position coordinate $q$, we
are dealing here with a particular case of the general class of point transformations.
Furthermore, the relation between old and new coordinates is obviously linear. We
may thus express the transformation rules in matrix form. Solving for the old
coordinates, this yields
\[
\begin{pmatrix} q \\ p \end{pmatrix} =
\begin{pmatrix} e^{-\gamma t} & 0 \\ -m\gamma\, e^{\gamma t} & e^{\gamma t} \end{pmatrix}
\begin{pmatrix} Q \\ P \end{pmatrix}. \tag{19.23}
\]
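According to the rule (19.12) for the mapping of the Hamiltonians, we get the
new Hamiltonian $H'(Q, P, t)$ by expressing the original Hamiltonian $H(q, p, t)$
via (19.23) in terms of the new coordinates $Q$, $P$ and, moreover, by adding $\partial F_2/\partial t$:
\[
\begin{aligned}
H' &= \tfrac{1}{2} m^{-1} e^{-2\gamma t} \bigl(-m\gamma\, e^{\gamma t} Q + e^{\gamma t} P\bigr)^2
      + \tfrac{1}{2} m\omega^2 e^{2\gamma t} e^{-2\gamma t} Q^2 + \gamma Q P - m\gamma^2 Q^2 \\
   &= \tfrac{1}{2} m^{-1} (P - m\gamma Q)^2 + \tfrac{1}{2} m\omega^2 Q^2 + \gamma Q P - m\gamma^2 Q^2.
\end{aligned}
\]
In the present example, we thus find a transformed Hamiltonian $H'$ that no longer
depends on time explicitly,
\[
H'(Q, P) = \frac{P^2}{2m} + \frac{1}{2} m\tilde{\omega}^2 Q^2, \qquad \tilde{\omega}^2 = \omega^2 - \gamma^2.
\]
We now observe that $H'$ emerges as exactly the Hamiltonian of an undamped harmonic
oscillator with angular frequency $\tilde{\omega} = \sqrt{\omega^2 - \gamma^2}$. Its solution is already known
from Example 19.3,
\[
\begin{pmatrix} Q(t) \\ P(t) \end{pmatrix} =
\begin{pmatrix} \cos\tilde{\omega}t & m^{-1}\tilde{\omega}^{-1} \sin\tilde{\omega}t \\
                -m\tilde{\omega} \sin\tilde{\omega}t & \cos\tilde{\omega}t \end{pmatrix}
\begin{pmatrix} Q(0) \\ P(0) \end{pmatrix}. \tag{19.24}
\]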
The solution functions $q(t)$ and $p(t)$ of the damped harmonic oscillator now follow
as the product of the solution (19.24) with the canonical forth and back transformations,
given by (19.23) and its inverse,
\[
\begin{pmatrix} q(t) \\ p(t) \end{pmatrix} =
\begin{pmatrix} e^{-\gamma t} & 0 \\ -m\gamma\, e^{\gamma t} & e^{\gamma t} \end{pmatrix}
\begin{pmatrix} \cos\tilde{\omega}t & m^{-1}\tilde{\omega}^{-1} \sin\tilde{\omega}t \\
                -m\tilde{\omega} \sin\tilde{\omega}t & \cos\tilde{\omega}t \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ m\gamma & 1 \end{pmatrix}
\begin{pmatrix} q(0) \\ p(0) \end{pmatrix}.
\]
On the right-hand side, the initial conditions $Q(0)$, $P(0)$ of the transformed system
were expressed through those of the original system, $q(0)$, $p(0)$. Explicitly, according
to the inverse transformation of (19.23) at $t = 0$, we have $Q(0) = q(0)$ and
$P(0) = m\gamma\, q(0) + p(0)$. The determinants of all matrices are unity, and hence so is
the determinant of the combined linear mapping $(q(0), p(0)) \to (q(t), p(t))$. This is in
agreement with the requirement of Liouville’s theorem.
In the form of the product of three matrices, it becomes obvious that the solution
method via canonical transformation consists of the three steps sketched at the
end of Example 19.3. We may finally express the solution of the damped harmonic
oscillator (19.22) concisely by multiplying the matrices,
\[
\begin{pmatrix} q(t) \\ p(t) \end{pmatrix} = R(t) \begin{pmatrix} q(0) \\ p(0) \end{pmatrix},
\qquad \tilde{\omega}^2 = \omega^2 - \gamma^2.
\]
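The matrix product can be carried out numerically as a check. The sketch below is an illustration under stated assumptions, not the book’s own formula for $R(t)$: it builds $R(t)$ from the three matrices quoted in this example, assumes the underdamped case $\omega > \gamma$, and uses helper names chosen here for illustration.

```python
import math

def matmul2(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def det2(a):
    """Determinant of a 2x2 matrix."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def damped_oscillator_map(t, m, gamma, omega):
    """R(t) = back transformation (19.23) x solution (19.24) x forth transformation at t = 0."""
    w = math.sqrt(omega ** 2 - gamma ** 2)  # reduced frequency, requires omega > gamma
    back = ((math.exp(-gamma * t), 0.0),
            (-m * gamma * math.exp(gamma * t), math.exp(gamma * t)))
    evolve = ((math.cos(w * t), math.sin(w * t) / (m * w)),
              (-m * w * math.sin(w * t), math.cos(w * t)))
    forth = ((1.0, 0.0), (m * gamma, 1.0))
    return matmul2(back, matmul2(evolve, forth))

def evolve_damped(q0, p0, t, m, gamma, omega):
    """Apply R(t) to the initial conditions (q(0), p(0))."""
    r = damped_oscillator_map(t, m, gamma, omega)
    return r[0][0] * q0 + r[0][1] * p0, r[1][0] * q0 + r[1][1] * p0
```

The determinant of the returned matrix should equal one for every $t$, in line with Liouville’s theorem, and $q(t)$ should reproduce the familiar damped oscillation $e^{-\gamma t}\bigl(q(0)\cos\tilde{\omega}t + \tilde{\omega}^{-1}(\dot{q}(0) + \gamma q(0))\sin\tilde{\omega}t\bigr)$ with $\dot{q}(0) = p(0)/m$.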
The present example shows that the task of solving the equation of motion of a given
dynamical system can be facilitated if we succeed in representing it as the transformed
solution of another system that is easier to solve. But this works only if we can find
an appropriate generating function.
EXAMPLE
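Herein, $H$ stands for the Hamiltonian of the given dynamical system, and $\delta t$ for an
infinitesimal interval on the time axis. From the general form of the transformation rules
for generating functions of type $F_2$, we obtain the particular rules for (19.25) as
\[
\begin{aligned}
p_i &= \frac{\partial F_2}{\partial q_i} = P_i + \frac{\partial H}{\partial q_i}\,\delta t
     = P_i - \frac{dp_i}{dt}\,\delta t, \\
Q_i &= \frac{\partial F_2}{\partial P_i} = q_i + \frac{\partial H}{\partial P_i}\,\delta t
     \;\overset{\text{1st order in } \delta t}{=}\; q_i + \frac{\partial H}{\partial p_i}\,\delta t
     = q_i + \frac{dq_i}{dt}\,\delta t, \\
H' &= H + \frac{\partial F_2}{\partial t} = H + \frac{\partial H}{\partial t}\,\delta t
     = H + \frac{dH}{dt}\,\delta t.
\end{aligned}
\]
In the rightmost terms of these equations, the canonical equations were inserted.
Solving for the transformed quantities, this means
\[
P_i = p_i + \dot{p}_i\,\delta t, \qquad Q_i = q_i + \dot{q}_i\,\delta t, \qquad H' = H + \dot{H}\,\delta t.
\]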
We now observe that the particular generating function (19.25) defines precisely the
canonical transformation that pushes the system ahead by an infinitesimal time step $\delta t$.
As any canonical transformation can be applied an arbitrary number of times in
sequence, we can conclude that the transformation along finite time steps is also
canonical. This is an important result: the time evolution of a Hamiltonian system
constitutes a particular canonical transformation. As already stated, the class of canonical
transformations is characterized by its property to map Hamiltonian systems into
Hamiltonian systems. It is thus ensured that a Hamiltonian system remains a
Hamiltonian system in the course of its time evolution.
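The statement can be illustrated for a single harmonic oscillator. The sketch below assumes, since (19.25) is not reproduced here, that the generating function has the standard form $F_2 = qP + H(q, P)\,\delta t$ with $H = P^2/2m + \tfrac{1}{2}m\omega^2 q^2$; the function names and the comparison with a plain explicit Euler step are illustrative additions.

```python
def generated_step(q, p, dt, m, omega):
    """Map (q, p) -> (Q, P) defined by the assumed F2 = q*P + H(q, P)*dt, not truncated in dt."""
    P = p - m * omega ** 2 * q * dt   # from p = dF2/dq = P + (dH/dq) * dt
    Q = q + P / m * dt                # from Q = dF2/dP = q + (dH/dP) * dt
    return Q, P

def explicit_euler_step(q, p, dt, m, omega):
    """Plain Euler step for comparison; it is canonical only to first order in dt."""
    return q + p / m * dt, p - m * omega ** 2 * q * dt

def jacobian_determinant(step, q, p, h=1e-3, **params):
    """Determinant of d(Q, P)/d(q, p), estimated by central differences."""
    Qa, Pa = step(q + h, p, **params)
    Qb, Pb = step(q - h, p, **params)
    Qc, Pc = step(q, p + h, **params)
    Qd, Pd = step(q, p - h, **params)
    dQdq, dPdq = (Qa - Qb) / (2 * h), (Pa - Pb) / (2 * h)
    dQdp, dPdp = (Qc - Qd) / (2 * h), (Pc - Pd) / (2 * h)
    return dQdq * dPdp - dQdp * dPdq
```

For the generated step, the determinant should equal one even for a finite step size, whereas the explicit Euler step deviates from one by a term of second order in the step size, $\omega^2\,\delta t^2$.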
EXAMPLE
With the theory of canonical transformations at hand, we may cast Liouville’s theorem
into the following general form: the volume element $dV = dq_1 \ldots dq_n\, dp_1 \ldots dp_n$ of
a Hamiltonian system with $n$ degrees of freedom is invariant with respect to canonical
transformations,
\[
dQ_1 \ldots dQ_n\, dP_1 \ldots dP_n \;\overset{\text{can. transf.}}{=}\; dq_1 \ldots dq_n\, dp_1 \ldots dp_n.
\]
The proof for the general case of systems with $n$ degrees of freedom is worked out
analogously. In the case of a Hamiltonian system, the Jacobi matrix that is associated
with a general transformation of the system’s coordinates, with determinant $D$, has
an even number of rows (columns). We assume the transformation to be invertible.
Then, we may express the new position coordinates $Q_i = Q_i(q_j, p_j)$ as functions
of the old coordinates, and the old momenta as functions of the new coordinates,
$p_i = p_i(Q_j, P_j)$. The determinant of the associated Jacobi matrix is then represented
by
\[
D = \frac{\partial(Q_1, \ldots, Q_n)}{\partial(q_1, \ldots, q_n)}
    \left[\frac{\partial(p_1, \ldots, p_n)}{\partial(P_1, \ldots, P_n)}\right]^{-1}, \tag{19.27}
\]
which is again unity as we may interchange the sequence of partial derivatives and
due to the fact that determinants of transposed matrices coincide.
We finally remark that the generating function $F_2$ used here in this proof is
completely equivalent to the other types of generating function. Indeed, the determinant
(19.27) of the transformation’s Jacobi matrix has the equivalent representations
\[
\begin{aligned}
D &= (-1)^n\, \frac{\partial(P_1, \ldots, P_n)}{\partial(q_1, \ldots, q_n)}
     \left[\frac{\partial(p_1, \ldots, p_n)}{\partial(Q_1, \ldots, Q_n)}\right]^{-1} \\
  &= \frac{\partial(P_1, \ldots, P_n)}{\partial(p_1, \ldots, p_n)}
     \left[\frac{\partial(q_1, \ldots, q_n)}{\partial(Q_1, \ldots, Q_n)}\right]^{-1} \\
  &= (-1)^n\, \frac{\partial(Q_1, \ldots, Q_n)}{\partial(p_1, \ldots, p_n)}
     \left[\frac{\partial(q_1, \ldots, q_n)}{\partial(P_1, \ldots, P_n)}\right]^{-1}.
\end{aligned}
\]
The result $D = 1$ for a canonical transformation then follows in the same way as above
by inserting the rules into the appropriate generating function $F_1$, $F_3$, or $F_4$.
With the result of Example 19.5, we know that the time evolution of a Hamiltonian
system can be conceived as a particular canonical transformation whose generating
function is based on the Hamiltonian H . This yields the more special version of Liou-
ville’s theorem from Chap. 18, where it was stated that the volume element dV of a
Hamiltonian system is invariant in the course of the system’s time evolution.
EXAMPLE
For a Hamiltonian system H (qj , pj , t) of n degrees of freedom, and for two differen-
tiable functions F (qj , pj , t), G(qj , pj , t) of the canonical variables and time t , the
A special case is established if we set up the Poisson brackets of the canonical
variables $q_i$ and $p_i$. As these variables are required to not depend on each other, we
immediately get
\[
[q_i, q_j] = 0, \qquad [p_i, p_j] = 0, \qquad [q_i, p_j] = \delta_{ij}. \tag{19.29}
\]
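We first convince ourselves that the same relations hold for canonically transformed
coordinates $Q_i$ and $P_i$, hence that the fundamental Poisson brackets (19.29) are
invariant under canonical transformations. Making use of the relations (19.15), we find
\[
\begin{aligned}
[Q_i, Q_j] &= \sum_{k=1}^{n} \left( \frac{\partial Q_i}{\partial q_k} \frac{\partial Q_j}{\partial p_k}
              - \frac{\partial Q_i}{\partial p_k} \frac{\partial Q_j}{\partial q_k} \right)
            = \sum_{k=1}^{n} \left( \frac{\partial p_k}{\partial P_i} \frac{\partial Q_j}{\partial p_k}
              + \frac{\partial q_k}{\partial P_i} \frac{\partial Q_j}{\partial q_k} \right)
            = \frac{\partial Q_j}{\partial P_i} = 0, \\
[P_i, P_j] &= \sum_{k=1}^{n} \left( \frac{\partial P_i}{\partial q_k} \frac{\partial P_j}{\partial p_k}
              - \frac{\partial P_i}{\partial p_k} \frac{\partial P_j}{\partial q_k} \right)
            = \sum_{k=1}^{n} \left( -\frac{\partial p_k}{\partial Q_i} \frac{\partial P_j}{\partial p_k}
              - \frac{\partial q_k}{\partial Q_i} \frac{\partial P_j}{\partial q_k} \right)
            = -\frac{\partial P_j}{\partial Q_i} = 0, \\
[Q_i, P_j] &= \sum_{k=1}^{n} \left( \frac{\partial Q_i}{\partial q_k} \frac{\partial P_j}{\partial p_k}
              - \frac{\partial Q_i}{\partial p_k} \frac{\partial P_j}{\partial q_k} \right)
            = \sum_{k=1}^{n} \left( \frac{\partial p_k}{\partial P_i} \frac{\partial P_j}{\partial p_k}
              + \frac{\partial q_k}{\partial P_i} \frac{\partial P_j}{\partial q_k} \right)
            = \frac{\partial P_j}{\partial P_i} = \delta_{ij}.
\end{aligned} \tag{19.30}
\]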
1 Siméon Denis Poisson, French mathematician and physicist, b. June 21, 1781, Pithiviers, France–
d. April 25, 1840, Paris, France. Descending from a simple social background (his father was a
soldier), Poisson had good teachers who recognized his extraordinary gifts and made it possible for
him to begin studies at the École Polytechnique in Paris in 1798. There, his mathematical talents
were recognized by Laplace and Lagrange. Poisson became an assistant professor, and, in 1806, a
full professor at the École Polytechnique, where he energetically worked to improve teaching and the
formation of students.
His research initially was focused on the theory of ordinary and partial differential equations,
which he applied to many different physical problems. Thus, Poisson developed further the mechanics
of Laplace and Lagrange, and studied problems related to the propagation of sound, elasticity, and
static electricity. He later turned his interests towards the theory of probabilities, and recognized the
seminal nature of the Law of Large Numbers.
Many ideas and concepts are named after Poisson, such as the Poisson equation in potential
theory, the Poisson bracket of mechanics, the Poisson ratio in elasticity, and the Poisson distribution in
statistics.
We are now prepared to show that the Poisson bracket of two arbitrary functions
$F(q_j, p_j, t)$ and $G(q_j, p_j, t)$ likewise establishes a canonical invariant. The time $t$,
as the common independent variable of both the original and the transformed system,
is not transformed. We may thus restrict ourselves to the nested mapping
\[
F(q_j, p_j) = F\bigl(Q_k(q_j, p_j), P_k(q_j, p_j)\bigr), \qquad
G(q_j, p_j) = G\bigl(Q_k(q_j, p_j), P_k(q_j, p_j)\bigr).
\]
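Multiplying out and regrouping the terms into Poisson brackets with respect to the
coordinates $Q_i$, $P_j$ yields the equivalent expression
\[
\begin{aligned}
[F, G]_{q,p} = \sum_i \sum_j \biggl( & \frac{\partial F}{\partial Q_i} \frac{\partial G}{\partial Q_j}\, [Q_i, Q_j]
   + \frac{\partial F}{\partial P_i} \frac{\partial G}{\partial P_j}\, [P_i, P_j] \\
 & + \frac{\partial F}{\partial Q_i} \frac{\partial G}{\partial P_j}\, [Q_i, P_j]
   - \frac{\partial F}{\partial P_i} \frac{\partial G}{\partial Q_j}\, [Q_j, P_i] \biggr).
\end{aligned} \tag{19.31}
\]
Equation (19.31) holds for any invertible coordinate transformation. In the particular
case that the transformation is canonical, the relations (19.30) for the
fundamental Poisson brackets apply in addition. In that case, (19.31) simplifies to
\[
[F, G]_{q,p} = \sum_i \sum_j \left( \frac{\partial F}{\partial Q_i} \frac{\partial G}{\partial P_j}
   - \frac{\partial F}{\partial P_i} \frac{\partial G}{\partial Q_j} \right) \delta_{ij} = [F, G]_{Q,P}.
\]
The Poisson bracket $[F, G]$ is thus uniquely determined by the functions $F$ and $G$ and
independent of the underlying coordinate system, provided that a transformation of
the coordinate system is canonical.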
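The invariance of the fundamental brackets can be checked numerically for one degree of freedom. The following is a minimal sketch with illustrative names: it evaluates the bracket by central differences and uses, as an assumed example of a canonical transformation, the oscillator variables $Q = \arctan(m\omega q/p)$, $P = (p^2 + m^2\omega^2 q^2)/(2m\omega)$ with fixed sample values of $m$ and $\omega$.

```python
import math

M, OMEGA = 1.0, 2.0  # sample parameter values, chosen only for illustration

def poisson_bracket(F, G, q, p, h=1e-5):
    """[F, G] = dF/dq * dG/dp - dF/dp * dG/dq for one degree of freedom (central differences)."""
    dFdq = (F(q + h, p) - F(q - h, p)) / (2 * h)
    dFdp = (F(q, p + h) - F(q, p - h)) / (2 * h)
    dGdq = (G(q + h, p) - G(q - h, p)) / (2 * h)
    dGdp = (G(q, p + h) - G(q, p - h)) / (2 * h)
    return dFdq * dGdp - dFdp * dGdq

def Q_new(q, p):
    """New position coordinate of the assumed oscillator transformation."""
    return math.atan2(M * OMEGA * q, p)

def P_new(q, p):
    """New momentum coordinate of the assumed oscillator transformation."""
    return (p * p + (M * OMEGA * q) ** 2) / (2 * M * OMEGA)
```

For this pair, `poisson_bracket(Q_new, P_new, q, p)` should be close to one at any point away from the origin, whereas a non-canonical rescaling such as $Q = q$, $P = 2p$ gives the value two.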
EXAMPLE
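The proof is easily worked out by directly calculating the total time derivative of the
Poisson bracket’s definition from (19.28):
\[
\begin{aligned}
\frac{d}{dt}[F, G]
 &= \sum_{i=1}^{n} \left( \frac{\partial F}{\partial q_i} \frac{d}{dt} \frac{\partial G}{\partial p_i}
    + \frac{\partial G}{\partial p_i} \frac{d}{dt} \frac{\partial F}{\partial q_i}
    - \frac{\partial F}{\partial p_i} \frac{d}{dt} \frac{\partial G}{\partial q_i}
    - \frac{\partial G}{\partial q_i} \frac{d}{dt} \frac{\partial F}{\partial p_i} \right) \\
 &= \sum_{i=1}^{n} \Biggl[ \frac{\partial F}{\partial q_i} \frac{\partial}{\partial p_i}
      \Biggl( \sum_{j=1}^{n} \Bigl( \frac{\partial G}{\partial q_j} \frac{dq_j}{dt}
      + \frac{\partial G}{\partial p_j} \frac{dp_j}{dt} \Bigr) + \frac{\partial G}{\partial t} \Biggr)
    - \frac{\partial G}{\partial q_i} \frac{\partial}{\partial p_i}
      \Biggl( \sum_{j=1}^{n} \Bigl( \frac{\partial F}{\partial q_j} \frac{dq_j}{dt}
      + \frac{\partial F}{\partial p_j} \frac{dp_j}{dt} \Bigr) + \frac{\partial F}{\partial t} \Biggr) \\
 &\qquad\quad + \frac{\partial G}{\partial p_i} \frac{\partial}{\partial q_i}
      \Biggl( \sum_{j=1}^{n} \Bigl( \frac{\partial F}{\partial q_j} \frac{dq_j}{dt}
      + \frac{\partial F}{\partial p_j} \frac{dp_j}{dt} \Bigr) + \frac{\partial F}{\partial t} \Biggr)
    - \frac{\partial F}{\partial p_i} \frac{\partial}{\partial q_i}
      \Biggl( \sum_{j=1}^{n} \Bigl( \frac{\partial G}{\partial q_j} \frac{dq_j}{dt}
      + \frac{\partial G}{\partial p_j} \frac{dp_j}{dt} \Bigr) + \frac{\partial G}{\partial t} \Biggr) \Biggr] \\
 &= \sum_{i=1}^{n} \left( \frac{\partial F}{\partial q_i} \frac{\partial}{\partial p_i} \frac{dG}{dt}
    - \frac{\partial G}{\partial q_i} \frac{\partial}{\partial p_i} \frac{dF}{dt}
    + \frac{\partial G}{\partial p_i} \frac{\partial}{\partial q_i} \frac{dF}{dt}
    - \frac{\partial F}{\partial p_i} \frac{\partial}{\partial q_i} \frac{dG}{dt} \right) \\
 &= \left[ F, \frac{dG}{dt} \right] + \left[ \frac{dF}{dt}, G \right].
\end{aligned}
\]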
If both $I_1 \equiv F$ and $I_2 \equiv G$ are invariants of motion, i.e., if $dF/dt \equiv 0$ and
$dG/dt \equiv 0$, we conclude
\[
\frac{d}{dt}[F, G] \equiv 0. \tag{19.32}
\]
With $I_3 \equiv [F, G]$ we have then found another, possibly trivial, invariant of the system.
We remark that Poisson’s theorem in the form of (19.32) only applies for invariants
$F$ and $G$ whose total time derivatives vanish identically.
In case that $dF/dt = 0$ and $dG/dt = 0$ represent only implicit functions, we cannot
infer that the Poisson brackets $[dF/dt, G]$ and $[F, dG/dt]$ vanish. The reason is that
the construction of a Poisson bracket does not constitute an algebraic but an analytic
operation. In the latter case, we must impose the stronger condition that the partial
derivatives of $dF/dt$ and $dG/dt$ with respect to the $q_i$ and the $p_i$ all vanish,
\[
\frac{\partial}{\partial q_i} \frac{dF}{dt} = 0, \quad \frac{\partial}{\partial p_i} \frac{dF}{dt} = 0
\quad \Rightarrow \quad \left[ \frac{dF}{dt}, G \right] = 0.
\]
EXAMPLE
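Herein, $G$ denotes the gravitational constant, and $m_1$, $m_2$ the masses of the respective
bodies. The canonical equations are obtained as
\[
\dot{q}_i = \frac{\partial H}{\partial p_i} = p_i, \qquad
\dot{p}_i = -\frac{\partial H}{\partial q_i} = -\mu\, \frac{q_i}{r^3}.
\]
\[
\begin{aligned}
\frac{dD}{dt} &= q_1 \dot{p}_2 + \dot{q}_1 p_2 - q_2 \dot{p}_1 - \dot{q}_2 p_1 \\
 &= -\frac{\mu}{r^3}\, q_1 q_2 + p_1 p_2 + \frac{\mu}{r^3}\, q_2 q_1 - p_2 p_1 \\
 &\equiv 0.
\end{aligned}
\]
\[
R_1 = q_1 p_2^2 - q_2 p_1 p_2 - \mu\, \frac{q_1}{r}.
\]
We convince ourselves of this fact again by direct calculation of the time derivative
of $R_1$:
\[
\begin{aligned}
\frac{dR_1}{dt} &= \dot{q}_1 p_2^2 + 2 q_1 p_2 \dot{p}_2 - \dot{q}_2 p_1 p_2 - q_2 \dot{p}_1 p_2
   - q_2 p_1 \dot{p}_2 - \mu\, \frac{\dot{q}_1}{r} + \mu\, \frac{q_1}{r^3} (q_1 \dot{q}_1 + q_2 \dot{q}_2) \\
 &= p_1 p_2^2 - 2 \frac{\mu}{r^3}\, q_1 q_2 p_2 - p_1 p_2^2 + \frac{\mu}{r^3}\, q_1 q_2 p_2
   + \frac{\mu}{r^3}\, q_2^2 p_1 - \frac{\mu}{r^3}\, p_1 \bigl(q_1^2 + q_2^2\bigr)
   + \frac{\mu}{r^3}\, q_1 (q_1 p_1 + q_2 p_2) \\
 &\equiv 0.
\end{aligned}
\]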
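Both conservation laws can also be checked numerically at an arbitrary phase-space point. The sketch below assumes $D = q_1 p_2 - q_2 p_1$, as implied by the expression for $dD/dt$, and uses the canonical equations quoted in this example; the value of $\mu$ and all function names are illustrative.

```python
import math

MU = 1.0  # sample value of the constant mu, chosen only for illustration

def velocity(state):
    """Right-hand sides of the canonical equations: q_i' = p_i, p_i' = -mu * q_i / r**3."""
    q1, q2, p1, p2 = state
    r3 = (q1 * q1 + q2 * q2) ** 1.5
    return (p1, p2, -MU * q1 / r3, -MU * q2 / r3)

def D(state):
    """Assumed form of the invariant D (angular momentum)."""
    q1, q2, p1, p2 = state
    return q1 * p2 - q2 * p1

def R1(state):
    """First component of the Runge-Lenz vector as given in the text."""
    q1, q2, p1, p2 = state
    return q1 * p2 ** 2 - q2 * p1 * p2 - MU * q1 / math.hypot(q1, q2)

def time_derivative(f, state, h=1e-6):
    """df/dt along the flow: sum of (df/dx_i) * (dx_i/dt), gradient by central differences."""
    v = velocity(state)
    total = 0.0
    for i in range(4):
        up, down = list(state), list(state)
        up[i] += h
        down[i] -= h
        total += (f(up) - f(down)) / (2 * h) * v[i]
    return total
```

At any point with $r \neq 0$, `time_derivative(D, state)` and `time_derivative(R1, state)` should vanish up to discretization error, while a quantity that is not conserved, such as $q_1$ itself, returns its nonzero rate of change $p_1$.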
We can prove this easily by directly calculating $dR_2/dt$. The invariants $R_1$ and $R_2$
constitute the components of the Runge–Lenz² vector. We will get back to the Runge–
Lenz vector in Example 21.21.
2 Carl David Tolmé Runge, German mathematician and physicist, b. August 30, 1856, Bremen,
Germany–d. January 3, 1927, Göttingen, Germany. Runge came from a family of merchants and
grew up in Havana and Bremen. He took up studies of literature at Munich, but soon switched to
mathematics and physics. As a student in Munich, he met Max Planck, which was the beginning of
a lifelong friendship. Runge finished his studies with a thesis on differential geometry, supervised
by Weierstrass, and became a professor of mathematics in Hanover in 1886. In 1906, he took up a
professorship in Göttingen. Runge worked on the numerical solution of equations—the Runge–Kutta
method for the solution of differential equations is named after him—and on spectroscopy. He did
spectroscopical measurements himself and contributed eminently to the understanding of the spectral
series of various atoms. Runge applied his results to the new field of the analysis of stellar spectra.
In a textbook on vector analysis, Runge described the derivation, originally found by Gibbs, of a
conserved quantity of the Kepler problem. This discussion was then referred to by Wilhelm Lenz in his
early quantum mechanical treatment of the hydrogen atom. The corresponding conserved quantity
has become known as the Runge–Lenz vector.
Wilhelm Lenz, German physicist, b. February 8, 1888, Frankfurt am Main, Germany–d. April
30, 1957, Hamburg, Germany. Lenz attended the same school in Frankfurt as Otto Hahn, and took
up studies of mathematics and physics in Göttingen in 1906. He obtained his Ph.D. in 1911 with
Arnold Sommerfeld in Munich and became Sommerfeld’s assistant. In 1921, Lenz became professor
of theoretical physics in Hamburg. Among his students and assistants in Hamburg were Pascual
Jordan, Wolfgang Pauli, and Hans Jensen, who was awarded the Nobel Prize in physics in 1963 for
the development of the shell model of the atomic nucleus. Lenz’ contributions to the early quantum
mechanics of hydrogen-like atoms renewed interest in the Runge–Lenz vector, which, actually, had
been known long before. A simple model for the description of ferromagnets developed by Lenz and
proposed as a thesis topic to one of his students is well known today by the name of the student: the
Ising model.
20 Hamilton–Jacobi Theory
The coordinates $(P_i, Q_i)$ obey the Hamilton equations with the Hamiltonian
$H'(Q_i, P_i, t)$. Since the time derivatives vanish by definition, we have
\[
\dot{P}_i = 0 = -\frac{\partial H'}{\partial Q_i}, \qquad
\dot{Q}_i = 0 = \frac{\partial H'}{\partial P_i}. \tag{20.1}
\]
These conditions would certainly be fulfilled by the function $H' \equiv 0$. In order to
perform the coordinate transformation, we need a generating function. For historical
reasons (Jacobi made this choice), we adopt among the four possible types the
type $F_2 = S(q_i, P_i, t)$, which has already been treated in the preceding chapter. It is
generally known as the Hamilton action function. For this choice, the equations (19.12)
hold. We now require that the new Hamiltonian shall identically vanish. Then
\[
\frac{\partial S}{\partial t} + H\!\left(q_1, \ldots, q_n;\; p_1 = \frac{\partial S}{\partial q_1}, \ldots,
p_n = \frac{\partial S}{\partial q_n};\; t\right) = 0. \tag{20.2}
\]
Writing down this equation with the arguments, we obtain
\[
\frac{\partial S(q_i, P_i = \beta_i, t)}{\partial t}
+ H\!\left(q_1, \ldots, q_n;\; \frac{\partial S}{\partial q_1}, \ldots, \frac{\partial S}{\partial q_n};\; t\right) = 0. \tag{20.3}
\]
1 Carl Gustav Jacob Jacobi, b. Dec. 10, 1804, Potsdam, son of a banker–d. Feb. 18, 1851, Berlin.
After his studies (1824), Jacobi became a lecturer in Berlin and from 1827 to 1842 held a chair as a
professor in Königsberg (now: Kaliningrad). After an extended journey through Italy to restore his weak
health, Jacobi lived in Berlin. Jacobi became known for his work Fundamenta Nova Theoriae
Functionum Ellipticarum (1829). In 1832, Jacobi discovered that hyperelliptic functions can be inverted by
functions of several variables. Jacobi also made fundamental contributions to algebra, to elimination
theory, and to the theory of partial differential equations, e.g., in his Lectures on Dynamics (1842 to
1843), published in 1866.
\[
S = S(q_1, \ldots, q_n; \beta_1, \ldots, \beta_n; t),
\]
where the $\beta_i$ are integration constants. A comparison with (19.12) leads to the
requirements
\[
P_i = \beta_i; \qquad Q_i = \frac{\partial S}{\partial P_i}
    = \frac{\partial S(q_1, \ldots, q_n; \beta_1, \ldots, \beta_n; t)}{\partial \beta_i} = \alpha_i. \tag{20.4}
\]
The $\beta_i$, $\alpha_i$ can be determined from the initial conditions.
The original coordinates result from the transformation equations (19.12) as
follows: From
\[
\alpha_i = \frac{\partial S(q_j, \beta_j, t)}{\partial \beta_i}
\]
follow the position coordinates
\[
q_i = q_i(\alpha_j, \beta_j, t).
\]
Insertion into
\[
p_i = \frac{\partial S(q_j, P_j, t)}{\partial q_i} = p_i(q_j, \beta_j, t)
\]
finally yields
\[
p_i = p_i(\alpha_j, \beta_j, t).
\]
Now the $q_i(\alpha_j, \beta_j, t)$ and $p_i(\alpha_j, \beta_j, t)$ are known as functions of the time and
of the integration constants $\alpha_j$, $\beta_j$. This simply means the complete solution of the
many-body problem characterized by the Hamiltonian $H(q_i, p_i, t)$.
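We can separate off the time dependence in $S$. If $H$ is not an explicit function of
the time, $H$ represents the total energy of the system:
\[
-\frac{\partial S}{\partial t} = H = E. \tag{20.5}
\]
From this, it follows that $S$ can be represented as
\[
S(q_i, P_i, t) = S_0(q_i, P_i) - E t.
\]
To explain the meaning of $S$, we form the total derivative of $S$ with respect to time:
\[
\frac{dS}{dt} = \sum_i \left( \frac{\partial S}{\partial q_i}\, \dot{q}_i + \frac{\partial S}{\partial P_i}\, \dot{P}_i \right)
+ \frac{\partial S}{\partial t}.
\]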
Since this integral physically represents an action (energy · time), the term action
function for S is obvious. The action function differs from the time integral over the
Lagrangian by at most an additive constant. However, this last relation cannot be used
for a practical calculation, since as long as the problem is not yet solved, one does not
know L as a function of time. Moreover, L(qi , pi , t) in (20.6) depends on the original
coordinates qi , pi , while the S-function is needed in the coordinates qi , Pi (qα , pα ).
Equation (20.7) is not unknown to us: The action function S turned up before when
formulating the Hamilton principle (18.25). Before further continuing this discussion,
we will illustrate the Hamilton–Jacobi method by an example.
EXAMPLE
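\[
H = \frac{p^2}{2m} + \frac{k}{2}\, q^2.
\]
The Hamilton action function then has the form (compare (19.12) and (20.3))
\[
S = S(q, P, t) \qquad \text{and} \qquad p = \frac{\partial S}{\partial q}.
\]
From this, we obtain the Hamilton–Jacobi differential equation:
\[
\frac{\partial S}{\partial t} + \frac{1}{2m} \left( \frac{\partial S}{\partial q} \right)^2 + \frac{k}{2}\, q^2 = 0.
\]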
For solving the problem, we make a separation ansatz into a space and a time
variable. A product ansatz would not work here, since the differential equation is not
linear. We therefore try a sum:
\[
S = S_1(t) + S_2(q).
\]
This leads to
\[
-\dot{S}_1(t) = \frac{1}{2m} \left( \frac{dS_2(q)}{dq} \right)^2 + \frac{k}{2}\, q^2 = \beta,
\]
where $\beta$ is the separation constant. (The left-hand side depends only on the time $t$,
the right-hand side only on the coordinate $q$: Therefore, both sides can only be equal
if they are equal to a common constant $\beta$.) For the time-dependent function, we then
have
\[
\dot{S}_1(t) = -\beta,
\]
which leads to
\[
S_1(t) = -\beta t.
\]
\[
Q + t = \sqrt{\frac{m}{k}}\, \arcsin\!\left( \sqrt{\frac{k}{2\beta}}\; q \right).
\]
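The last relation can be inverted to give the familiar oscillation, which allows a quick numerical check. The sketch below is an illustration with names chosen here: it assumes the inverted form $q(t) = \sqrt{2\beta/k}\,\sin\bigl(\sqrt{k/m}\,(t + Q)\bigr)$ and the corresponding momentum $p = m\dot{q}$.

```python
import math

def q_of_t(t, beta, Q, m, k):
    """Position obtained by solving Q + t = sqrt(m/k) * arcsin(sqrt(k/(2*beta)) * q) for q."""
    return math.sqrt(2.0 * beta / k) * math.sin(math.sqrt(k / m) * (t + Q))

def p_of_t(t, beta, Q, m, k):
    """Momentum p = m * dq/dt belonging to q_of_t."""
    return math.sqrt(2.0 * m * beta) * math.cos(math.sqrt(k / m) * (t + Q))

def energy(q, p, m, k):
    """Hamiltonian H = p**2/(2m) + k*q**2/2 of the oscillator."""
    return p * p / (2.0 * m) + k * q * q / 2.0
```

For every time $t$, `energy(q_of_t(...), p_of_t(...), m, k)` should return the separation constant $\beta$, which identifies $\beta$ with the total energy.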
The separation of the Hamilton–Jacobi equation represents a general (often the only
feasible) way of solving it. If the Hamiltonian does not explicitly depend on the time,
then
\[
\frac{\partial S}{\partial t} + H\!\left(q_1, \ldots, q_n;\; \frac{\partial S}{\partial q_1}, \ldots,
\frac{\partial S}{\partial q_n}\right) = 0, \tag{20.8}
\]
and the time can be separated off immediately. We set for $S$ a solution of the form
\[
S = S_0(q_i, P_i) - \beta t.
\]
The constant $\beta$ then equals $H$ and normally represents the energy. After this
separation, there remains the equation
\[
H\!\left(q_1, \ldots, q_n;\; \frac{\partial S_0}{\partial q_1}, \ldots, \frac{\partial S_0}{\partial q_n}\right) = E. \tag{20.9}
\]
This means that the Hamilton action function splits into a sum of partial functions $S_i$,
each depending only on one pair of variables. The Hamiltonian then becomes
\[
H\!\left(q_1, \ldots, q_n;\; \frac{dS_1}{dq_1}, \ldots, \frac{dS_n}{dq_n}\right) = E. \tag{20.11}
\]
To ensure that this differential equation also separates into $n$ differential equations for
the $S_i(q_i, P_i)$, $H$ must obey certain conditions. For example, if $H$ has the form
This equation can be satisfied by setting each term $H_i$ separately equal to a constant
$\beta_i$; hence,
\[
H_1\!\left(q_1, \frac{\partial S_1}{\partial q_1}\right) = \beta_1, \quad \ldots, \quad
H_n\!\left(q_n, \frac{\partial S_n}{\partial q_n}\right) = \beta_n, \tag{20.14}
\]
where
\[
\beta_1 + \beta_2 + \cdots + \beta_n = E. \tag{20.15}
\]
Since the kinetic energy term of the Hamiltonian involves the momentum
$p_i = dS_i/dq_i$ quadratically, these differential equations are of first order and second degree.
As solutions, we then obtain the $n$ action functions
\[
S_i = S_i(q_i, \beta_i), \tag{20.16}
\]
which, apart from the separation constants $\beta_i$, depend only on the coordinate $q_i$.
According to (19.12), $S_i$ immediately leads to the momentum $p_i = dS_i/dq_i$ conjugate to
the coordinate $q_i$. The essential point is (see (20.12)) that the coordinate pair $(q_i, p_i)$
is not coupled to other coordinates $(q_k, p_k)$, $i \neq k$, so that the motion in these
coordinates can be considered fully independent of the other ones.
We now restrict ourselves to periodic motions and define the phase integral
\[
J_i = \oint p_i\, dq_i, \tag{20.17}
\]
which is to be taken over a full cycle of a rotation or vibration. The phase integral has
the dimension of an action (or of an angular momentum). It is therefore also referred
to as an action variable. If we replace the momentum by the derivative of the action function,
\[
J_i = \oint \frac{dS_i}{dq_i}\, dq_i, \tag{20.18}
\]
we see from (20.16) that $J_i$ depends only on the constants $\beta_i$, since $q_i$ is only an
integration variable. We therefore can move from the constants $\beta_i$ to the likewise constant
$J_i$ and use them as new canonical momenta. Hence, one performs the transformation
\[
J_i = J_i(\beta_i) \;\longrightarrow\; \beta_i = \beta_i(J_i).
\]
The total energy $E$, which corresponds to the Hamiltonian, can also be expressed
through the $J_i$ by means of (20.15):
\[
H = E = \sum_{i=1}^{n} \beta_i(J_i). \tag{20.19}
\]
The Hamiltonian is therefore only a function of the action variables, which take the
role of the momenta. All corresponding conjugate coordinates are cyclic. The
conjugate coordinates belonging to the $J_i$ are called angle variables and are denoted by
$\varphi_i$. With $\beta_k(J_k)$, the generating function $S(q_i, \beta_k)$ turns into $S(q_i, J_i)$. The $J_i$ are the
new momenta. We therefore can apply (19.12), and for the related new coordinates,
we have
\[
\varphi_j = \frac{\partial S(q_i, J_i)}{\partial J_j}.
\]
By transforming to the action variables and angle variables, we thus have performed
a canonical transformation, mediated by the generating function
This transformation from one set of constant momenta to another set actually does not
give new insights. The meaning for periodic processes lies in the angle variable $\varphi_i$.
Since we performed only canonical transformations, we have
\[
\dot{\varphi}_i = \frac{\partial H}{\partial J_i} = \nu_i(J_i) = \text{constant}. \tag{20.21}
\]
One can show that νi is the frequency of the periodic motion in the coordinate i. This
relation thus offers the advantage that the frequencies, which are often of primary
interest, can be determined without solving the full problem. We briefly demonstrate
this point by the following example:
EXAMPLE
We again consider the harmonic oscillator. The expression for the total energy
\[
E = \frac{p^2}{2m} + \frac{k q^2}{2}
\]
is transformed so that we get the representation of an ellipse in phase space:
\[
\frac{p^2}{2mE} + \frac{q^2}{2E/k} = 1.
\]
Fig. 20.1. Ellipses in phase space
The phase integral is the area enclosed by the ellipse in phase space:
\[
J = \oint p\, dq = \pi a b.
\]
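The area formula and the resulting frequency can be checked numerically. The following is a minimal sketch with illustrative names: it reads the semi-axes $a = \sqrt{2mE}$ and $b = \sqrt{2E/k}$ off the ellipse equation, evaluates the phase integral along the parametrized ellipse, and estimates $\nu = \partial E/\partial J$ by a finite difference.

```python
import math

def action_variable(E, m, k, n=1000):
    """Phase integral J = closed integral of p dq over the energy ellipse (midpoint rule)."""
    a = math.sqrt(2.0 * m * E)   # semi-axis along p
    b = math.sqrt(2.0 * E / k)   # semi-axis along q
    dtheta = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * dtheta
        p = a * math.cos(theta)             # q = b*sin(theta), p = a*cos(theta)
        dq = b * math.cos(theta) * dtheta
        total += p * dq
    return total

def frequency(E, m, k, dE=1e-3):
    """nu = dE/dJ, estimated by a symmetric finite difference."""
    return 2.0 * dE / (action_variable(E + dE, m, k) - action_variable(E - dE, m, k))
```

The numerical value of $J$ should agree with $\pi a b = 2\pi E \sqrt{m/k}$, and the frequency with $\nu = \frac{1}{2\pi}\sqrt{k/m}$, independently of the energy.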
EXERCISE
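Problem. Use the Hamilton–Jacobi method for solving the Kepler problem in a
central force field of the form
\[
V(r) = -\frac{K}{r}.
\]
Solution. We adopt plane polar coordinates $(r, \vartheta)$ as generalized coordinates. The
Hamiltonian reads
\[
H = \frac{1}{2m} \left( p_r^2 + \frac{p_\vartheta^2}{r^2} \right) - \frac{K}{r}. \tag{20.22}
\]
Equation (20.25) can be satisfied only if both sides are constant. The constant is the
total energy of the system, because
\[
-\frac{\partial S}{\partial t} = H = E \quad \Rightarrow \quad
-\frac{\partial S_3}{\partial t} = \text{constant} = \beta_3 = E. \tag{20.26}
\]
We remember that
\[
P_i = \beta_i, \qquad Q_i = \frac{\partial S}{\partial P_i} = \frac{\partial S}{\partial \beta_i} = \alpha_i,
\]
\[
\frac{\partial S_2}{\partial \vartheta} = \frac{dS_2(\vartheta)}{d\vartheta} = \text{constant} = \beta_2, \tag{20.28}
\]
and therefore,
\[
\frac{\partial S_1}{\partial r} = \frac{dS_1(r)}{dr}
= \sqrt{2m\beta_3 + \frac{2mK}{r} - \frac{\beta_2^2}{r^2}}. \tag{20.29}
\]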
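The Hamilton action function can now be written down as follows:
\[
S = \int \sqrt{2m\beta_3 + \frac{2mK}{r} - \frac{\beta_2^2}{r^2}}\; dr + \beta_2 \vartheta - \beta_3 t. \tag{20.30}
\]
We now define $\beta_2$ and $\beta_3$ as new momenta $P_\vartheta$ and $P_r$. The quantities $Q_i$ conjugate
to the $P_i$ are also constant:
\[
Q_r = \frac{\partial S}{\partial \beta_3} = \frac{\partial}{\partial \beta_3}
\int \sqrt{2m\beta_3 + \frac{2mK}{r} - \frac{\beta_2^2}{r^2}}\; dr - t = \alpha_3, \tag{20.31}
\]
\[
Q_\vartheta = \frac{\partial S}{\partial \beta_2} = \frac{\partial}{\partial \beta_2}
\int \sqrt{2m\beta_3 + \frac{2mK}{r} - \frac{\beta_2^2}{r^2}}\; dr + \vartheta = \alpha_2. \tag{20.32}
\]
This is the solution of the Kepler problem, known from the lectures on classical
mechanics.² The types of trajectories follow from the discussion of conic sections in the
representation $r = p/(1 + \varepsilon \cos\varphi)$:
$\varepsilon = 1 \;\hat{=}\; E = 0$: parabolas;
$\varepsilon < 1 \;\hat{=}\; E < 0$: ellipses;
$\varepsilon > 1 \;\hat{=}\; E > 0$: hyperbolas.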
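The classification can be turned into a small helper. This sketch rests on an assumption not shown in the text above: the standard relation $\varepsilon = \sqrt{1 + 2E\beta_2^2/(mK^2)}$ between the eccentricity, the energy $E$, and the angular momentum $\beta_2$ for the potential $V = -K/r$; the function names are illustrative.

```python
import math

def eccentricity(E, angular_momentum, m, K):
    """Eccentricity from the assumed standard relation eps**2 = 1 + 2*E*L**2 / (m*K**2)."""
    radicand = 1.0 + 2.0 * E * angular_momentum ** 2 / (m * K ** 2)
    if radicand < 0.0:
        raise ValueError("energy lies below the minimum allowed for this angular momentum")
    return math.sqrt(radicand)

def orbit_type(E):
    """Type of conic section, decided by the sign of the total energy."""
    if E < 0.0:
        return "ellipse"
    if E > 0.0:
        return "hyperbola"
    return "parabola"
```

For example, with $m = K = \beta_2 = 1$ the energy $E = -0.1$ gives $\varepsilon = \sqrt{0.8} < 1$, an ellipse, while any positive energy gives $\varepsilon > 1$.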
2 See W. Greiner: Classical Mechanics: Point Particles and Relativity, 1st ed., Springer, Berlin
(2004), Chapter 26.
Equation (20.31) could be rewritten further, by pulling the differentiation into the
integral and transforming the resulting equation in such a way that the position $r$
becomes a function of the time. We skip that here.
EXERCISE
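Problem. Let a particle of mass $m$ move in a force field that in spherical coordinates
has the form $V = -K \cos\vartheta / r^2$. Write down the Hamilton–Jacobi differential
equation for the particle motion.
Solution. We first need the Hamiltonian as a function of the conjugate momenta
in spherical coordinates. For this purpose, we first write the kinetic energy $T$ in
spherical coordinates:
\[
\dot{\mathbf{r}} = \dot{r}\, \mathbf{e}_r + r \dot{\vartheta}\, \mathbf{e}_\vartheta
+ r \sin\vartheta\, \dot{\varphi}\, \mathbf{e}_\varphi \tag{20.37}
\]
\[
\Rightarrow \quad T = \frac{1}{2} m\, \dot{\mathbf{r}} \cdot \dot{\mathbf{r}}
= \frac{1}{2} m \bigl( \dot{r}^2 + r^2 \dot{\vartheta}^2 + r^2 \sin^2\!\vartheta\, \dot{\varphi}^2 \bigr). \tag{20.38}
\]
The Lagrangian then reads
\[
L = T - V = \frac{1}{2} m \bigl( \dot{r}^2 + r^2 \dot{\vartheta}^2 + r^2 \sin^2\!\vartheta\, \dot{\varphi}^2 \bigr)
- V(r, \vartheta, \varphi). \tag{20.39}
\]
We now assume that $V(r, \vartheta, \varphi)$ is velocity-independent (which is indeed fulfilled)
and form the canonical conjugate momenta:
\[
p_r = \frac{\partial L}{\partial \dot{r}} = m\dot{r}, \qquad
p_\vartheta = \frac{\partial L}{\partial \dot{\vartheta}} = m r^2 \dot{\vartheta}, \qquad
p_\varphi = \frac{\partial L}{\partial \dot{\varphi}} = m r^2 \sin^2\!\vartheta\, \dot{\varphi}. \tag{20.40}
\]
From this, we obtain
\[
\dot{r} = \frac{p_r}{m}, \qquad \dot{\vartheta} = \frac{p_\vartheta}{m r^2}, \qquad
\dot{\varphi} = \frac{p_\varphi}{m r^2 \sin^2\!\vartheta}.
\]
\[
p_r = \frac{\partial S}{\partial r}, \qquad p_\vartheta = \frac{\partial S}{\partial \vartheta}, \qquad
p_\varphi = \frac{\partial S}{\partial \varphi}. \tag{20.43}
\]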
EXERCISE
Problem.
(a) Find the complete solution of the Hamilton–Jacobi differential equation from the
preceding Exercise 20.4, and
(b) sketch how to determine the motion of the particle.
Solution. (a) The approach is analogous to Exercise 20.3. We adopt the separation
ansatz for S,
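Equations (20.46) and (20.47) can only be satisfied if both sides are constant:
\[
r^2 \left( \frac{\partial S_1(r)}{\partial r} \right)^2 - 2mE r^2 = \text{constant} = \beta_1, \tag{20.48}
\]
\[
-\left( \frac{\partial S_2(\vartheta)}{\partial \vartheta} \right)^2
- \frac{1}{\sin^2\!\vartheta} \left( \frac{\partial S_3(\varphi)}{\partial \varphi} \right)^2
+ 2mK \cos\vartheta = \beta_1. \tag{20.49}
\]
(b) The explicit equations for the motion of the particle follow from the requirement
\[
Q_i = \frac{\partial S}{\partial P_i} \quad \Leftrightarrow \quad \alpha_i = \frac{\partial S}{\partial \beta_i},
\]
since $Q_i$, $P_i$ are constants that are denoted by $\alpha_i$, $\beta_i$, and thus,
\[
\frac{\partial S}{\partial \beta_1} = \alpha_1, \qquad \frac{\partial S}{\partial E} = \alpha_2, \qquad
\frac{\partial S}{\partial p_\varphi} = \alpha_3. \tag{20.55}
\]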
\[
S = S_0 - E t, \qquad S_0 = S_0(q_1, \ldots, q_n, E, \beta_2, \ldots, \beta_n),
\]
where $\beta_1$ is replaced by $E$. We can express this in the following way: Just as in the original
Hamilton–Jacobi equation, the reduced form also has $n$ integration constants, one of
them the total energy $E$.
EXERCISE
Since $H$ does not depend explicitly on the time and the system is conservative, the
reduced Hamilton–Jacobi differential equation can be applied.
\[
\frac{1}{2m} \left[ \left( \frac{\partial S_0}{\partial x} \right)^2 + \left( \frac{\partial S_0}{\partial y} \right)^2 \right]
+ mgy = E. \tag{20.62}
\]
or
\[
\left( \frac{\partial S_1(x)}{\partial x} \right)^2 = 2mE - 2m^2 g y - \left( \frac{\partial S_2(y)}{\partial y} \right)^2. \tag{20.64}
\]
This is satisfied only if both sides of the equation are constant, since $x$ and $y$ are
independent coordinates.
\[
\left( \frac{\partial S_1(x)}{\partial x} \right)^2 = \beta_2, \qquad
\left( \frac{\partial S_2(y)}{\partial y} \right)^2 = (2mE - \beta_2) - 2m^2 g y. \tag{20.65}
\]
\[
y = -c_1 x^2 + c_2 x + c_3, \tag{20.71}
\]
i.e., the familiar throw parabola. For the case of the oblique throw, the Hamilton–Jacobi
equation may appear clumsy for establishing the equation of motion. A certain
advantage of the method shows up in complicated problems, e.g., in the Kepler problem in
Exercise 20.3.
S = S0 (qi , Pi ) − Et,
p = grad S = ∇S0 .
The time behavior of the S-field can be seen from the representation S = S0 − Et.
For t = 0, the surfaces S(qi , Pi ) = 0 and S0 (qi , Pi ) = 0 are identical. For t = 1, the
surface S = 0 coincides with the surface S0 = E, S = E with S0 = 2E, etc. This
means graphically that surfaces of constant S-values move across surfaces of constant
S0 -values, i.e., that surfaces of constant S move through space. The formal meaning
of S follows from the action integral. One has
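\[ L\,dt = (p_x\,dx + p_y\,dy + p_z\,dz - H\,dt), \]
\[ \int_{t_1}^{t_2} L\,dt = \int_{t_1}^{t_2}\left(\frac{\partial S}{\partial x}\,dx + \frac{\partial S}{\partial y}\,dy + \frac{\partial S}{\partial z}\,dz + \frac{\partial S}{\partial t}\,dt\right) = S_2 - S_1. \]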
EXAMPLE
To illustrate the action waves, we consider the throw or fall motion in the gravita-
tional field of the earth, where the equation of motion is well known. In analogy to
Exercise 20.6, we obtain the following Hamilton–Jacobi differential equation:
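\[ \frac{1}{2m}\left[\left(\frac{\partial S}{\partial x}\right)^2 + \left(\frac{\partial S}{\partial y}\right)^2 + \left(\frac{\partial S}{\partial z}\right)^2\right] + mgz + \frac{\partial S}{\partial t} = 0. \tag{20.72} \]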
Sx = xβx , Sy = yβy
The quantities βx and βy are separation constants, just like βz . Integration over z
yields, up to a constant,
\[ S_z = -\frac{2}{3g}\sqrt{\frac{2}{m}}\,(\beta_z - mgz)^{3/2}. \tag{20.74} \]
We write the constant βz as βz = mgz0 and thereby can express the total energy as
\[ E = \frac{p_x^2 + p_y^2}{2m} + mgz_0. \tag{20.75} \]
By insertion, one gets the action function
\[ S = x\beta_x + y\beta_y - \frac{2m\sqrt{2g}}{3}\,(z_0 - z)^{3/2} - \left(\frac{\beta_x^2 + \beta_y^2}{2m} + mgz_0\right)t, \tag{20.76} \]
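As a quick plausibility check, the action function (20.76) can be inserted numerically into the Hamilton–Jacobi equation (20.72). The following is only a minimal sketch: the function names, the default values of m and g, and the finite-difference step are illustrative assumptions, not part of the text.

```python
import math


def action(x, y, z, t, beta_x, beta_y, z0, m=1.0, g=9.81):
    """Action function S of (20.76); only meaningful for z <= z0."""
    energy = (beta_x ** 2 + beta_y ** 2) / (2.0 * m) + m * g * z0
    s_z = -(2.0 * m * math.sqrt(2.0 * g) / 3.0) * (z0 - z) ** 1.5
    return x * beta_x + y * beta_y + s_z - energy * t


def hj_residual(x, y, z, t, beta_x, beta_y, z0, m=1.0, g=9.81, h=1e-5):
    """Left-hand side of (20.72), with partial derivatives of S taken by
    central finite differences. It should vanish up to numerical error."""

    def s(dx=0.0, dy=0.0, dz=0.0, dt=0.0):
        return action(x + dx, y + dy, z + dz, t + dt, beta_x, beta_y, z0, m, g)

    s_x = (s(dx=h) - s(dx=-h)) / (2.0 * h)
    s_y = (s(dy=h) - s(dy=-h)) / (2.0 * h)
    s_zd = (s(dz=h) - s(dz=-h)) / (2.0 * h)
    s_t = (s(dt=h) - s(dt=-h)) / (2.0 * h)
    return (s_x ** 2 + s_y ** 2 + s_zd ** 2) / (2.0 * m) + m * g * z + s_t


if __name__ == "__main__":
    print(hj_residual(0.3, -1.2, 1.0, 0.7, beta_x=2.0, beta_y=-1.5, z0=5.0))
```

The residual is zero only up to discretization and rounding error, so it should be compared against a small tolerance rather than against exact zero.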
20.1 Visual Interpretation of the Action Function S 399
Fig. 20.3.
Fig. 20.4.
With increasing S0 the tops of the Neil parabolas move in the y-direction. The
related trajectories are throw parabolas in the y, z-plane which have no velocity com-
ponent along the x-direction and reach their highest point at z = 0 (dashed curves in
Fig. 20.5).
Fig. 20.5. Projection onto the
y, z-plane of the preceding
Fig. 20.4
EXAMPLE
In this example, the peculiarities of periodic motions shall be compiled and extended
to multiply periodic motions.3
3 Here, we follow A. Budo, Theoretische Mechanik, Deutscher Verlag der Wissenschaften, Berlin
(1956).
Example 20.8
1. Periodic motions: Here, one distinguishes two kinds, namely, the properly periodic motion, for which
\[ q_i(t + \tau) = q_i(t), \qquad p_i(t + \tau) = p_i(t), \tag{20.81} \]
i.e., both the coordinates and the momenta have the same period τ . This motion is
also called libration. Two-dimensional examples are the (nondamped) harmonic os-
cillator or the (nondamped) vibrating pendulum. The phase-space diagram (the phase
trajectory) is a closed curve (see Fig. 20.6).
Fig. 20.6. Two-dimensional phase diagram of a properly periodic motion. A closed phase trajectory occurs, e.g., for a nondamped vibrating pendulum
The other type of periodic motion is the rotation. Here one has (e.g., in the two-dimensional case)
\[ p(q + q_0) = p(q), \tag{20.82} \]
i.e., the momentum takes for q + q0 the same value as for q. The coordinate q is
mostly an angle variable and q0 = 2π . One might imagine for example a circulating
pendulum; in this case q is the pendulum angle. The phase-space trajectory is then not
closed but periodic with the period q0 (see Fig. 20.7).
Fig. 20.7. Phase-space diagram of the rotation as a periodic motion. The trajectory is open but
has the period q0 . In other words, the momentum p is a periodic function of the coordinate q
with the period q0
The limiting case between rotation and libration is called limitation motion. The
pendulum, which is almost circulating, is an example for this type of motion. The
coordinate period q0 is then q0 = 2π as before, but the time period is τ = ∞.
(The pendulum then comes to rest in the upper vertical position (unstable point), i.e.,
the function graph terminates at the point q0 .) If the system is conservative and is
described by the Hamiltonian H (p, q), we have the equations
\[ H(p, q) = E, \qquad H\left(q, \frac{\partial S}{\partial q}\right) = E. \tag{20.83} \]
The first equation yields p = p(q, E), i.e., for a given energy E the phase trajectory.
The second equation is the (reduced) Hamilton–Jacobi equation from which the action
function (generating function) F2 (q, P ) = S(q, E) can be calculated. If that is done,
one can calculate the phase-space integral
\[ J = \oint p\,dq = \oint\frac{\partial S}{\partial q}\,dq. \tag{20.84} \]
Here, ∮ means the integration over a closed trajectory in the case of libration, or over
a full period q1 ≤ q ≤ q1 + q0 in the case of rotation. Hence, the phase integral J
exactly corresponds to the shaded areas in Figs. 20.6 and 20.7.
The phase integral J = J (E) depends only on E and is constant in time, since the
total energy is constant in time. Hence, (20.84) leads to the relations
The function S (q, J ) can serve as the generator of a canonical transformation. The
new momentum is now identified with J , i.e.,
P = J. (20.87)
∂E(J )/∂J depends only on J , which is constant in time. Hence, ϕ̇ is also constant in
time, and then
\[ \varphi = \frac{\partial E}{\partial J}\,t + \delta. \tag{20.91} \]
Here, the phase constant δ appears. If we had not selected S (q, J ) as the generating
function but rather the complete time-dependent action function
from (20.89), since the Hamiltonian H (ϕ, J ) = E(J ) does not depend on ϕ. The
change of ϕ during a period τ is found from (20.91) to be
\[ \Delta\varphi = \frac{\partial E}{\partial J}\,\tau, \tag{20.94} \]
Hence, the angular coordinate increases during a period after which the system returns
to its initial configuration, exactly by 1. We therefore can state that the motion of the
system is periodic in ϕ with the period 1. Combining (20.94) and (20.95) yields
\[ \frac{\partial E}{\partial J}\,\tau = 1 \;\Leftrightarrow\; \frac{\partial E}{\partial J} = \frac{1}{\tau} = \nu. \tag{20.96} \]
ν is the frequency of the periodic motion. Obviously the complete solution of the
equations of motion is not needed for calculating ν. It is sufficient to express E as a
function of J and to differentiate with respect to J . This is the advantage of introduc-
ing the action (J ) and angle variables (ϕ). The approach is illustrated in Example 20.2
for the case of the harmonic oscillator.
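The statement that ν follows from E(J) alone can be tried out numerically for the harmonic oscillator. The sketch below assumes the Hamiltonian H = p²/(2m) + kq²/2; the function names, the substitution q = q_max sin u, and the step sizes are illustrative choices and are not taken from the text.

```python
import math


def phase_integral(energy, m, k, steps=2000):
    """J = closed-loop integral of p dq for H = p^2/(2m) + k q^2/2.

    With q = q_max * sin(u) one has p = p_max * cos(u), so the integral over
    the full loop becomes 2 * p_max * q_max * integral of cos^2(u) du over
    [-pi/2, pi/2], evaluated here with the midpoint rule.
    """
    q_max = math.sqrt(2.0 * energy / k)
    p_max = math.sqrt(2.0 * m * energy)
    du = math.pi / steps
    total = 0.0
    for i in range(steps):
        u = -0.5 * math.pi + (i + 0.5) * du
        total += math.cos(u) ** 2 * du
    return 2.0 * p_max * q_max * total


def frequency_from_action(energy, m, k, rel_step=1e-4):
    """nu = dE/dJ, approximated by a symmetric difference quotient."""
    e_lo = energy * (1.0 - rel_step)
    e_hi = energy * (1.0 + rel_step)
    return (e_hi - e_lo) / (phase_integral(e_hi, m, k) - phase_integral(e_lo, m, k))


if __name__ == "__main__":
    m, k, energy = 0.5, 2.0, 3.0
    print(frequency_from_action(energy, m, k))   # from the action variable
    print(math.sqrt(k / m) / (2.0 * math.pi))    # (1/2 pi) sqrt(k/m)
```

No equation of motion is solved here; only the phase integral and one derivative with respect to J are needed, which is the point made in the text.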
S(q1 , . . . , qf ; E, β2 , . . . , βf )
= S1 (q1 ; E, β2 , . . . , βf ) + · · · + Sf (qf ; E, β2 , . . . , βf ). (20.98)
E, β2 , β3 , . . . , βf (20.99)
onto each qi ,pi -plane of the phase space must be either a libration or a rotation, to
guarantee the periodicity of the entire motion of the system.
The procedure is analogous to that outlined in the first section. First one defines the
action variables
\[ J_i = \oint p_i\,dq_i = \oint\frac{\partial S_i(q_i; E, \beta_2, \dots, \beta_f)}{\partial q_i}\,dq_i = J_i(E, \beta_2, \dots, \beta_f), \qquad i = 1, \dots, f. \tag{20.100} \]
They are constant in time, since E, β2 , . . . , βf are constant. The f equations (20.100)
can be solved for E, β2 , . . . , βf and yield
\[ E = E(J_1, \dots, J_f), \quad \beta_2 = \beta_2(J_1, \dots, J_f), \quad \dots, \quad \beta_f = \beta_f(J_1, \dots, J_f). \tag{20.101} \]
Pi = Ji . (20.103)
The relation (20.102) is fully analogous to the relation (20.86), and (20.103) corre-
sponds to (20.87). The canonically conjugate angle variables result—like (20.88)—
from
\[ \varphi_i = \frac{\partial S}{\partial J_i} = \sum_{k=1}^{f}\frac{\partial S_k(q_k, J_1, \dots, J_f)}{\partial J_i}, \qquad i = 1, \dots, f. \tag{20.104} \]
since the Hamiltonian is independent of time (see (20.83)). From this follow the
Hamilton equations
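\[ \dot\varphi_i = \frac{\partial H}{\partial P_i} = \frac{\partial E(J_k)}{\partial J_i} = \text{constant} \equiv \nu_i, \qquad \dot J_i = -\frac{\partial H}{\partial\varphi_i} = 0; \tag{20.106} \]
hence,
\[ \varphi_i = \nu_i t + \delta_i, \qquad J_i = \text{constant}. \tag{20.107} \]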
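We are now interested in the change of the angle variables ϕi over a period (full revolution or back-and-forth motion of a coordinate qi with the remaining coordinates kept fixed). It is given by
\[ \Delta_k\varphi_i = \oint\frac{\partial\varphi_i}{\partial q_k}\,dq_k = \oint\frac{\partial^2 S}{\partial J_i\,\partial q_k}\,dq_k = \frac{\partial}{\partial J_i}\oint\frac{\partial S}{\partial q_k}\,dq_k = \frac{\partial J_k}{\partial J_i} = \delta_{ki}. \tag{20.108} \]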
According to (20.107),
\[ \Delta_k\varphi_k = \nu_k\tau_k, \tag{20.109} \]
\[ \nu_k\tau_k = 1. \tag{20.110} \]
Thus,
\[ \nu_k = \frac{1}{\tau_k} \tag{20.111} \]
obviously are the frequencies of the qk -motion. In other words, according to (20.106)
the (fundamental) frequency νk for the coordinate qk is νk = ∂E(J1 , . . . , Jf )/∂Jk .
Equations (20.104) can also be inverted, which yields the original coordinates qn
with
qk = qk (ϕ1 , . . . , ϕf ), k = 1, . . . , f (20.112)
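\[ \varphi_i \to \varphi_i + 1, \qquad q_i \to q_i \ \text{for libration}, \qquad q_i \to q_i + q_{i0} \ \text{for rotation}. \tag{20.113} \]
\[ q_i - \varphi_i q_{i0} \tag{20.114} \]
\[ \varphi_i \to \varphi_i + 1, \qquad q_i - \varphi_i q_{i0} \to q_i + q_{i0} - (\varphi_i + 1)q_{i0} = q_i - \varphi_i q_{i0}. \tag{20.115} \]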
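We therefore can expand the separation coordinates qi (for libration) or qi − ϕi qi0 (for rotations) in a Fourier series and write
\[ \left.\begin{array}{l} q_i(\varphi_1(t), \dots, \varphi_f(t)) \\ (q_i - \varphi_i q_{i0})(\varphi_1(t), \dots, \varphi_f(t)) \end{array}\right\} = \sum_{n=-\infty}^{+\infty} a_n^{(i)} e^{i2\pi\varphi_i n} = \sum_{n=-\infty}^{+\infty} a_n^{(i)} e^{i2\pi n(\nu_i t + \delta_i)}, \tag{20.116} \]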
where
\[ a_n^{(i)}(\varphi_1, \dots, \varphi_{i-1}, \varphi_{i+1}, \dots, \varphi_f) = \int_0^1 q_i(\varphi_1, \dots, \varphi_f)\,e^{-i2\pi n\varphi_i}\,d\varphi_i. \tag{20.117} \]
The Fourier coefficients an(i) (ϕ1 , . . . , ϕi−1 , ϕi+1 , . . . , ϕf ) in general still depend on all
angle variables, except for ϕi .
We now imagine other variables xl which describe the system and are useful for
certain problems. They shall unambiguously depend on the qi (t) and therefore are
also functions of the time. Then we can write
\[ x_l(t) = \sum_{n_1, \dots, n_f = -\infty}^{\infty} A^{(l)}_{n_1, \dots, n_f}\,e^{i2\pi[(n_1\nu_1 + \dots + n_f\nu_f)t + (\delta_{n_1} + \dots + \delta_{n_f})]}, \qquad l = 1, \dots, f. \tag{20.118} \]
νi = ai ν, (20.121)
20.2 Transition to Quantum Mechanics 407
where the ai = (m1 m2 · · · mf )ni /mi are integers, and ν is a common factor. Thus, the
system is periodic if and only if all frequencies are commensurable. The fundamental
frequency ν0 is then the largest common divisor of all frequencies ν1 , . . . , νf . If there
exist only s (with s ≤ f − 1) relations of the form (20.119), s frequencies can be
rationally expressed by the remaining ones. The system (the motion) is then called
s-fold degenerate or (f − s)-fold periodic. Special cases are
• s = 0: the motion is f -fold periodic or nondegenerate;
• s = f − 1: the motion is single-periodic or fully degenerate.
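For commensurable frequencies, the fundamental frequency described above can be computed exactly when the frequencies are given as rational numbers. This is a small sketch under that assumption; the function names are hypothetical and the input must consist of positive rational values (integers or `Fraction` objects, not rounded floats).

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def fundamental_frequency(frequencies):
    """Largest common divisor nu_0 of positive rational frequencies.

    For fractions in lowest terms a_i/b_i it equals gcd(a_i) / lcm(b_i).
    """
    fracs = [Fraction(f) for f in frequencies]
    numerator = reduce(gcd, (f.numerator for f in fracs))
    denominator = reduce(lambda a, b: a * b // gcd(a, b),
                         (f.denominator for f in fracs))
    return Fraction(numerator, denominator)


def integer_multiples(frequencies):
    """Integers a_i with nu_i = a_i * nu_0, as in (20.121)."""
    nu0 = fundamental_frequency(frequencies)
    return [int(Fraction(f) / nu0) for f in frequencies]


if __name__ == "__main__":
    nus = [Fraction(3, 2), Fraction(5, 4), 2]
    print(fundamental_frequency(nus), integer_multiples(nus))
```

Frequencies with an irrational ratio have no common divisor, and such a system is not periodic in the sense used here; that case cannot be represented by this exact-arithmetic sketch.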
where h is Planck's action quantum, which has the value h = 6.6 · 10⁻³⁴ J s. We again
consider the case of the harmonic oscillator. In Example 20.2, we evaluated the phase
integral
\[ J = 2\pi E\sqrt{\frac{m}{k}}. \tag{20.123} \]
ν = (1/2π)√(k/m) was the frequency. With the quantum hypothesis, we then obtain
\[ E_n = nh\nu. \tag{20.124} \]
Thus, the quantum hypothesis leads to the conclusion that the vibrating mass point
can take only discrete energy values En . For the motion, this means that only certain
trajectories in the phase space are allowed. We therefore get ellipses for the phase-
space trajectories (compare Example 20.2), whose areas (the phase integral) always
differ by the amount h. In this way, the phase space acquires a grid structure that is
defined by the allowed trajectories.
Each trajectory corresponds to an energy En . In a transition between two trajec-
tories the mass point receives (or releases) the energy En − Em = (n − m)hν. The
smallest transferred amount of energy is given by hν.
Fig. 20.8. In quantum me-
chanics, the phase-space tra-
jectories of the harmonic os-
cillator are ellipses that differ
by an area of h
Since the action quantum h is so small, the discrete structure of the phase space
is significant only for atomic processes. For macroscopic processes the trajectories
in the phase space are so dense that one can consider the phase space as a contin-
uum. The energy quanta hν are so small that they have no meaning for macroscopic
processes. For example, the energy emitted in a transition in the hydrogen atom is
hν = 13.6 eV (electron volts). Expressed in the (macroscopic) unit of watt seconds, this is
hν = 2 · 10⁻¹⁸ W s. The quantum hypothesis was confirmed by the explanation of the
spectra of radiating atoms.
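The unit conversion quoted above is easy to reproduce. The sketch below uses the SI value of the electron volt in joules (1 J = 1 W s); the constant and function names are illustrative.

```python
EV_IN_JOULE = 1.602176634e-19  # 1 eV in joules (SI definition); 1 J = 1 W s


def ev_to_watt_seconds(energy_ev):
    """Convert an energy from electron volts to watt seconds (joules)."""
    return energy_ev * EV_IN_JOULE


if __name__ == "__main__":
    print(ev_to_watt_seconds(13.6))  # roughly 2e-18 W s
```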
EXERCISE
\[ H(p, q) = \frac{p_r^2}{2m} + \frac{p_\varphi^2}{2mr^2} - \frac{e^2}{r}. \tag{20.128} \]
Constants of motion are
(i) H = E, since H (q, p) does not explicitly depend on the time; and
(ii) pϕ = L, since ϕ is a cyclic variable.
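Exercise 20.9
L represents the constant angular momentum. The Hamilton equations read
\[ \dot p_\varphi = -\frac{\partial H}{\partial\varphi} = 0, \qquad \dot\varphi = \frac{\partial H}{\partial p_\varphi} = \frac{p_\varphi}{mr^2}, \]
\[ \dot p_r = -\frac{\partial H}{\partial r} = -\frac{e^2}{r^2} + \frac{p_\varphi^2}{mr^3}, \qquad \dot r = \frac{\partial H}{\partial p_r} = \frac{p_r}{m}. \tag{20.129} \]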
\[ lh = \oint p_\varphi\,d\varphi = \int_0^{2\pi} L\,d\varphi = 2\pi L \;\Rightarrow\; L = l\hbar, \quad \hbar = \frac{h}{2\pi}, \quad l = 0, 1, 2, \dots, \tag{20.130} \]
i.e., the orbital angular momentum can take only integer multiples of ħ. For the radial
motion, the phase integral equals
\[ kh = \oint p_r\,dr = 2\int_{r_{\min}}^{r_{\max}}\sqrt{2m\left(E + \frac{e^2}{r}\right) - \frac{L^2}{r^2}}\;dr, \qquad k = 0, 1, 2, \dots. \tag{20.131} \]
pr = 0; (20.132)
thus,
\[ r_m^2 + \frac{e^2}{E}\,r_m - \frac{L^2}{2mE} = 0, \qquad r_m = -\frac{e^2}{2E} \mp \frac{\sqrt{-\Delta}}{4mE} \quad\text{with}\quad \Delta = -4m(2EL^2 + me^4). \tag{20.133} \]
The integral in (20.131) is of the type
\[ \int\frac{\sqrt{ar^2 + br + c}}{r}\,dr \equiv \int\frac{\sqrt{X(r)}}{r}\,dr, \tag{20.134} \]
and
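\[ \int\frac{dr}{r\sqrt{X(r)}} = \frac{1}{\sqrt{-c}}\arcsin\frac{br + 2c}{r\sqrt{-\Delta}} \quad\text{for } c < 0 \text{ and } \Delta < 0. \tag{20.136} \]
\[ \Delta = 4ac - b^2. \tag{20.137} \]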
This leads to
\[ \int\frac{\sqrt{X(r)}}{r}\,dr = \sqrt{ar^2 + br + c} - \frac{b}{2\sqrt{-a}}\arcsin\frac{2ar + b}{\sqrt{-\Delta}} + \frac{c}{\sqrt{-c}}\arcsin\frac{br + 2c}{r\sqrt{-\Delta}} \tag{20.138} \]
\[ E_n = -\frac{me^4}{2\hbar^2n^2}. \tag{20.142} \]
This formula for the discrete energy levels in the hydrogen atom agrees exactly with
the quantum mechanical result. Only the value n = 0, which was allowed in this consideration, is excluded in the quantum mechanical approach. The underlying classical picture (electron moves in an elliptic orbit with the eccentricity ε = √(1 − (l/n)²))
leads however to contradictions and must be modified in quantum mechanics. Because
n = l + k, the energy levels with n = 1, 2, . . . , are twofold, threefold, . . . , degenerate.
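The level formula (20.142) and the counting of states with n = l + k can be illustrated with a few lines of code. This is a sketch in Gaussian units; the rounded constant values and all names below are assumptions introduced for the illustration.

```python
M_E = 9.1093837e-28        # electron mass in g
E_CHARGE = 4.80320471e-10  # elementary charge in statC (Gaussian units)
HBAR = 1.054571817e-27     # reduced Planck constant in erg s
ERG_IN_EV = 6.241509074e11 # 1 erg expressed in eV


def energy_level_ev(n):
    """E_n = -m e^4 / (2 hbar^2 n^2) from (20.142), converted to eV."""
    return -M_E * E_CHARGE ** 4 / (2.0 * HBAR ** 2 * n ** 2) * ERG_IN_EV


def quantum_number_pairs(n):
    """All pairs (l, k) of non-negative integers with l + k = n."""
    return [(l, n - l) for l in range(n + 1)]


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, energy_level_ev(n), quantum_number_pairs(n))
```

For n = 1 the list contains two pairs and for n = 2 three pairs, which is the twofold and threefold degeneracy stated in the text.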
EXERCISE
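Exercise 20.10
\[ [F, G] = \sum_\alpha\left(\frac{\partial F}{\partial q_\alpha}\frac{\partial G}{\partial p_\alpha} - \frac{\partial F}{\partial p_\alpha}\frac{\partial G}{\partial q_\alpha}\right). \]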
Solution. (a)
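\[ [F, G] = \sum_\alpha\left(\frac{\partial F}{\partial q_\alpha}\frac{\partial G}{\partial p_\alpha} - \frac{\partial F}{\partial p_\alpha}\frac{\partial G}{\partial q_\alpha}\right) = -\sum_\alpha\left(\frac{\partial G}{\partial q_\alpha}\frac{\partial F}{\partial p_\alpha} - \frac{\partial G}{\partial p_\alpha}\frac{\partial F}{\partial q_\alpha}\right) = -[G, F]. \]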
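\[ \frac{\partial q_r}{\partial p_\alpha} = 0 \;\Rightarrow\; [F, q_r] = -\sum_\alpha\frac{\partial F}{\partial p_\alpha}\,\delta_{r\alpha} = -\frac{\partial F}{\partial p_r}. \]
\[ \delta_{r\alpha} = 1 \ \text{for } r = \alpha, \qquad \delta_{r\alpha} = 0 \ \text{for } r \neq \alpha. \]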
by the transition to operators and by replacing the Poisson bracket [ , ] by the commutator (1/iħ){ , }, where
In the canonical quantization, one passes from the classical momenta pj to operator
momenta p̂j , and from the classical Poisson bracket [ , ] to the quantum mechanical
Poisson bracket (1/iħ){ , }− . Thus, in the canonical quantization one substitutes the
relation (20.143) by
\[ \{q_i, \hat p_j\}_- = i\hbar\,\delta_{ij}. \tag{20.144} \]
Equation (20.144) is satisfied if p̂j = −iħ ∂/∂qj :
\[ \{q_i, \hat p_j\}_- = -i\hbar\left\{q_i, \frac{\partial}{\partial q_j}\right\}_-, \]
where the product rule was used and thus (20.144) is verified. The rules for the quan-
tum mechanical commutators are identical with those for the Poisson brackets. One
might say that quantum mechanics is another algebraic realization of the Poisson
brackets. As will be seen in quantum mechanics, this conclusion is premature and
in this form not correct.
EXERCISE
Problem. Let H denote the Hamiltonian. Show that for an arbitrary function depend-
ing on qi , pi , and t we have
\[ \frac{df}{dt} = \frac{\partial f}{\partial t} + [f, H]. \]
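Exercise 20.11
Solution. The total differential of the function f (pi , qi , t) reads
\[ df = \frac{\partial f}{\partial t}\,dt + \sum_\alpha\left(\frac{\partial f}{\partial q_\alpha}\,dq_\alpha + \frac{\partial f}{\partial p_\alpha}\,dp_\alpha\right) \tag{20.145} \]
\[ \Rightarrow\; \frac{df}{dt} = \frac{\partial f}{\partial t} + \sum_\alpha\left(\frac{\partial f}{\partial q_\alpha}\,\dot q_\alpha + \frac{\partial f}{\partial p_\alpha}\,\dot p_\alpha\right). \tag{20.146} \]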
Thus, the Poisson brackets enter automatically. Equation (20.147) reminds us even
more of the results of quantum mechanics than the analogies of the last problem. In
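quantum mechanics we shall find the following expression for the time derivative of
an operator F̂:
\[ \frac{d\hat F}{dt} = \frac{\partial\hat F}{\partial t} + \frac{1}{i\hbar}\{\hat F, \hat H\}_-, \tag{20.148} \]
where Ĥ represents the Hamiltonian operator of the quantum mechanical problem. It
is, e.g., of the form
\[ \hat H = \hat H(x, \hat p) \quad\text{with}\quad \hat p = -i\hbar\frac{\partial}{\partial x} \]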
and depends in general on the coordinates, momentum operators, and possibly even
further quantities, e.g., spin.
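The classical relation df/dt = ∂f/∂t + [f, H] from Exercise 20.11 can be checked numerically for a simple case. The sketch below assumes a harmonic oscillator with one degree of freedom and the observable f = qp without explicit time dependence; the parameter values, the function names, and the finite-difference step are illustrative assumptions.

```python
import math

M, K = 2.0, 8.0               # mass and spring constant (arbitrary units)
OMEGA = math.sqrt(K / M)      # angular frequency of the oscillator


def poisson_bracket(f, g, q, p, h=1e-5):
    """[f, g] = df/dq * dg/dp - df/dp * dg/dq via central differences."""
    df_dq = (f(q + h, p) - f(q - h, p)) / (2.0 * h)
    df_dp = (f(q, p + h) - f(q, p - h)) / (2.0 * h)
    dg_dq = (g(q + h, p) - g(q - h, p)) / (2.0 * h)
    dg_dp = (g(q, p + h) - g(q, p - h)) / (2.0 * h)
    return df_dq * dg_dp - df_dp * dg_dq


def hamiltonian(q, p):
    return p * p / (2.0 * M) + 0.5 * K * q * q


def observable(q, p):
    return q * p


def trajectory(t, amplitude=1.5):
    """Exact solution q(t), p(t) of the oscillator started at maximum elongation."""
    q = amplitude * math.cos(OMEGA * t)
    p = -M * OMEGA * amplitude * math.sin(OMEGA * t)
    return q, p


def time_derivative(t, h=1e-5):
    """d/dt of the observable along the exact trajectory."""
    return (observable(*trajectory(t + h)) - observable(*trajectory(t - h))) / (2.0 * h)


if __name__ == "__main__":
    t = 0.3
    q, p = trajectory(t)
    print(time_derivative(t), poisson_bracket(observable, hamiltonian, q, p))
```

Both printed numbers should agree to within the finite-difference error. The same helper also reproduces [q, p] = 1 and the antisymmetry [F, G] = −[G, F].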
21 Extended Hamilton–Lagrange Formalism
Here, the index μ = 0, . . . , n spans the entire range of extended configuration space
variables. In particular, the Euler–Lagrange equation for t (s) reads
\[ \frac{d}{ds}\left(\frac{\partial L_1}{\partial\left(\frac{dt}{ds}\right)}\right) - \frac{\partial L_1}{\partial t} = 0. \]
The equations of motion for both qj (s) and t (s) are thus determined by the extended
Lagrangian L1 . The solution qj (t) of the Euler–Lagrange equations that equivalently
emerges from the corresponding conventional Lagrangian L may then be constructed
by eliminating the evolution parameter s.
As the actions, S and S1 , are supposed to be alternative characterizations of the
same underlying physical system, the action principles δS = 0 and δS1 = 0 must hold
simultaneously. This means that
\[ \delta\int_{s_a}^{s_b} L\,\frac{dt}{ds}\,ds = \delta\int_{s_a}^{s_b} L_1\,ds, \]
\[ L\,\frac{dt}{ds} = L_1 + \frac{dF}{ds}. \]
Functions F (qj , t) define a particular class of point transformations of the dynamical
variables, namely those ones that preserve the form of the Euler–Lagrange equations.
Such a transformation can be applied at any time in the discussion of a given La-
grangian system and should be distinguished from correlating L1 and L. We may thus
restrict ourselves without loss of generality to those correlations of L and L1 , where
F ≡ 0. In other words, we correlate L and L1 without performing simultaneously
a transformation of the dynamical variables. We will discuss this issue in the more
general context of extended canonical transformations in Sect. 21.3. The extended
Lagrangian L1 is then related to the conventional Lagrangian, L, by
\[ L_1\left(q_j, \frac{dq_j}{ds}, t, \frac{dt}{ds}\right) = L\left(q_j, \frac{dq_j}{dt}, t\right)\frac{dt}{ds}, \qquad \frac{dq_i}{dt} = \frac{dq_i/ds}{dt/ds}. \tag{21.4} \]
The derivatives of L1 from (21.4) with respect to its arguments can now be expressed
in terms of the conventional Lagrangian L as
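\[ \frac{\partial L_1}{\partial q_i} = \frac{\partial L}{\partial q_i}\frac{dt}{ds}, \qquad i = 1, \dots, n, \tag{21.5} \]
\[ \frac{\partial L_1}{\partial t} = \frac{\partial L}{\partial t}\frac{dt}{ds}, \tag{21.6} \]
\[ \frac{\partial L_1}{\partial\left(\frac{dq_i}{ds}\right)} = \frac{\partial L}{\partial\left(\frac{dq_i}{dt}\right)}, \qquad i = 1, \dots, n, \tag{21.7} \]
\[ \frac{\partial L_1}{\partial\left(\frac{dt}{ds}\right)} = L - \sum_{i=1}^{n}\frac{\partial L}{\partial\left(\frac{dq_i}{dt}\right)}\frac{dq_i}{dt}. \tag{21.8} \]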
With q0 ≡ ct, (21.7) and (21.8) yield for the following sum over the extended range
μ = 0, . . . , n of dynamical variables
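\[ \sum_{\mu=0}^{n}\frac{\partial L_1}{\partial\left(\frac{dq_\mu}{ds}\right)}\frac{dq_\mu}{ds} = \left[L - \sum_{i=1}^{n}\frac{\partial L}{\partial\left(\frac{dq_i}{dt}\right)}\frac{dq_i}{dt}\right]\frac{dt}{ds} + \sum_{i=1}^{n}\frac{\partial L}{\partial\left(\frac{dq_i}{dt}\right)}\frac{dq_i}{ds} = L_1. \]
\[ L_1 - \sum_{\mu=0}^{n}\frac{\partial L_1}{\partial\left(\frac{dq_\mu}{ds}\right)}\frac{dq_\mu}{ds} = 0. \tag{21.9} \]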
The correlation (21.4) and the pertaining condition (21.9) allow two interpretations,
depending on which Lagrangian is primarily given, and which one is derived. If the
conventional Lagrangian L is the given function to describe the dynamical system in
question and L1 is derived from L according to (21.4), then L1 is a homogeneous form
of first order in the n + 1 variables dq0 /ds, . . . , dqn /ds. This may be seen by replac-
ing all derivatives dqμ /ds with a × dqμ /ds, a ∈ R in (21.4). Consequently, Euler’s
21.1 Extended Set of Euler–Lagrange Equations 417
EXAMPLE
As only expressions of the form Σi qi² − c²t² are preserved under the Lorentz group,
the conventional Lagrangian for a free point particle of rest mass m0 , given by
\[ L_{\mathrm{nr}}\left(q_j, \frac{dq_j}{dt}, t\right) = T - V = \frac{1}{2}m_0\sum_{i=1}^{3}\left(\frac{dq_i}{dt}\right)^2 - m_0c^2, \tag{21.10} \]
The constant third term has been defined accordingly to ensure that L1 converges
to Lnr in the limit dt/ds → 1. Of course, the dynamics following from (21.10) and
(21.11) are different—which reflects the modification our dynamics encounters if we
switch from a non-relativistic to a relativistic description. With the Lagrangian (21.11),
we obtain from (21.9) the constraint
\[ \left(\frac{dt}{ds}\right)^2 - \frac{1}{c^2}\sum_{i=1}^{3}\left(\frac{dq_i}{ds}\right)^2 - 1 = 0. \tag{21.12} \]
Example 21.1
As usual for constrained Lagrangian systems, we must not insert back the constraint
function into the Lagrangian prior to setting up the Euler–Lagrange equations. Physically, the constraint (21.12) reflects the fact that the square of the four-velocity vector is constant. It equals −c² if the sign convention of the Minkowski metric is defined as η_{μν} = η^{μν} = diag(−1, +1, +1, +1). We thus find that in the case of the Lagrangian (21.11) the system evolution parameter s is physically nothing else than the
particle's proper time. In contrast to the non-relativistic description, the constant rest
energy term −½mc² in the extended Lagrangian (21.11) is essential. The constraint
can alternatively be expressed as
\[ \frac{ds}{dt} = \sqrt{1 - \frac{1}{c^2}\sum_{i=1}^{3}\left(\frac{dq_i}{dt}\right)^2} = \gamma^{-1}, \]
which yields the usual relativistic scale factor, γ . The conventional Lagrangian L that
describes the same dynamics as the extended Lagrangian L1 from (21.11) is derived
according to (21.4)
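\[ L\left(q_j, \frac{dq_j}{dt}, t\right) = L_1\left(q_j, \frac{dq_j}{ds}, t, \frac{dt}{ds}\right)\frac{ds}{dt} \]
\[ = \frac{1}{2}m_0c^2\left[\frac{1}{c^2}\sum_i\left(\frac{dq_i}{dt}\right)^2\frac{dt}{ds} - \frac{dt}{ds} - \frac{ds}{dt}\right] \]
\[ = -\frac{1}{2}m_0c^2\left[\frac{ds}{dt} + \frac{dt}{ds}\left(1 - \frac{1}{c^2}\sum_i\left(\frac{dq_i}{dt}\right)^2\right)\right] \]
\[ = -m_0c^2\sqrt{1 - \frac{1}{c^2}\sum_i\left(\frac{dq_i}{dt}\right)^2}. \tag{21.13} \]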
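The projection (21.13) can be verified numerically. The sketch below takes the free-particle extended Lagrangian in the form that appears in the first step of (21.13), namely L1 = ½m0c²[(1/c²)Σ(dqi/ds)² − (dt/ds)² − 1], and uses the constraint (21.12) in the form dt/ds = γ. The function names and the choice of units with c = 1 in the usage example are illustrative assumptions.

```python
import math


def extended_lagrangian(dq_ds, dt_ds, m0, c):
    """Free-particle L1 as read off from the first step of (21.13)."""
    v2 = sum(u * u for u in dq_ds)
    return 0.5 * m0 * c ** 2 * (v2 / c ** 2 - dt_ds ** 2 - 1.0)


def projected_lagrangian(velocity, m0, c):
    """L = L1 * ds/dt, with dt/ds = gamma taken from the constraint (21.12)."""
    v2 = sum(v * v for v in velocity)
    gamma = 1.0 / math.sqrt(1.0 - v2 / c ** 2)
    dq_ds = [v * gamma for v in velocity]   # dq/ds = (dq/dt) * (dt/ds)
    return extended_lagrangian(dq_ds, gamma, m0, c) / gamma


def conventional_lagrangian(velocity, m0, c):
    """Closed form of (21.13): -m0 c^2 sqrt(1 - v^2/c^2)."""
    v2 = sum(v * v for v in velocity)
    return -m0 * c ** 2 * math.sqrt(1.0 - v2 / c ** 2)


if __name__ == "__main__":
    v = (0.3, -0.2, 0.5)
    print(projected_lagrangian(v, m0=1.0, c=1.0))
    print(conventional_lagrangian(v, m0=1.0, c=1.0))
```

The constraint is applied only after the Lagrangian has been written down, in line with the remark that it must not be inserted before setting up the Euler–Lagrange equations.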
EXAMPLE
Example 21.2
The associated constraint function coincides with that for the free-particle Lagrangian
from (21.12) as all terms linear in the velocities drop out when calculating the difference in (21.9). Similar to the free particle case from (21.13), the extended Lagrangian (21.14) may be projected into (T M) × R to yield the well-known conventional relativistic Lagrangian L
\[ L\left(q_j, \frac{dq_j}{dt}, t\right) = -m_0c^2\sqrt{1 - \frac{1}{c^2}\sum_i\left(\frac{dq_i}{dt}\right)^2} + \frac{\zeta}{c}\sum_i A_i\frac{dq_i}{dt} - \zeta\phi. \tag{21.15} \]
Again, the quadratic form of the velocity terms is lost owing to the projection.
For small velocities dqj /dt, the quadratic form is regained as the square root in
(21.15) may be expanded to yield the conventional non-relativistic Lagrangian for a
point particle in an external electromagnetic field,
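\[ L_{\mathrm{nr}}\left(q_j, \frac{dq_j}{dt}, t\right) = \frac{1}{2}m_0\sum_i\left(\frac{dq_i}{dt}\right)^2 + \frac{\zeta}{c}\sum_i A_i\frac{dq_i}{dt} - \zeta\phi - m_0c^2. \tag{21.16} \]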
Significantly, this Lagrangian can be derived directly, hence without the detour over
the projected Lagrangian (21.15), from the extended Lagrangian (21.14) by letting
dt/ds → 1.
Comparing the Lagrangian (21.16) with the extended Lagrangian from (21.14),
we notice that the transition to the non-relativistic description is made by identify-
ing the proper time s with the laboratory time t = q0 /c. The remarkable formal sim-
ilarity of the Lorentz-invariant extended Lagrangian (21.14) with the non-invariant
conventional Lagrangian (21.16) suggests that approaches based on non-relativistic
Lagrangians Lnr may be transposed to a relativistic description by (i) introducing the
proper time s as the new system evolution parameter, (ii) treating the time t (s) as
an additional dependent variable on equal footing with the configuration space vari-
ables q(s)—commonly referred to as the “principle of homogeneity in space-time”—
and (iii) by replacing the conventional non-relativistic Lagrangian Lnr with the cor-
responding Lorentz-invariant extended Lagrangian L1 , similar to the transition from
(21.16) to (21.14).
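\[ H(q_j, p_j, t) = \sum_{i=1}^{n} p_i\frac{dq_i}{dt} - L\left(q_j, \frac{dq_j}{dt}, t\right), \tag{21.17} \]
\[ H_1(q_j, p_j, t, e) = \sum_{i=1}^{n} p_i\frac{dq_i}{ds} - e\,\frac{dt}{ds} - L_1\left(q_j, \frac{dq_j}{ds}, t, \frac{dt}{ds}\right). \tag{21.18} \]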
This fact ensures the Legendre transformations (21.17) and (21.18) to be compatible.
For the corresponding definition of p0 , we must take some care as the derivative of L1
with respect to dt/ds evaluates to
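\[ \frac{\partial L_1}{\partial\left(\frac{dt}{ds}\right)} = L - \sum_{i=1}^{n}\frac{\partial L}{\partial\left(\frac{dq_i}{dt}\right)}\frac{dq_i}{dt} = -H(q_j, p_j, t). \]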
The constraint function from (21.9) translates in the extended Hamiltonian description
simply into
\[ H_1\big(q_j(s), p_j(s), t(s), e(s)\big) = 0. \tag{21.22} \]
This means that the extended Hamiltonian H1 directly defines the hyper-surface on
which the classical motion of the system takes place. The hyper-surface lies within the
cotangent bundle T ∗ (M × R) over the same extended configuration manifold M × R
as in the case of the Lagrangian description. Inserting (21.19) and (21.21) into the
extended set of Euler–Lagrange equations (21.3) yields the extended set of canonical
equations,
\[ \frac{dp_\mu}{ds} = -\frac{\partial H_1}{\partial q_\mu}, \qquad \frac{dq_\mu}{ds} = \frac{\partial H_1}{\partial p_\mu}. \tag{21.23} \]
21.2 Extended Set of Canonical Equations 421
The right-hand sides of these equations follow directly from the Legendre transforma-
tion (21.18) since the Lagrangian L1 does not depend on the momenta pμ and has,
up to the sign, the same space-time dependence as the Hamiltonian H1 . The extended
set is characterized by the additional pair of canonical equations for the index μ = 0,
which reads in terms of t (s) and e(s)
\[ \frac{de}{ds} = \frac{\partial H_1}{\partial t}, \qquad \frac{dt}{ds} = -\frac{\partial H_1}{\partial e}. \tag{21.24} \]
In contrast to the total time derivative of the Hamiltonian H (qj , pj , t), the total s
derivative of the extended Hamiltonian H1 (qν , pν ) always vanishes. Calculating the
total s derivative of H1 , and inserting subsequently the extended set of canonical equa-
tions (21.23), we find
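\[ \frac{dH_1}{ds} = \sum_{\mu=0}^{n}\left(\frac{\partial H_1}{\partial q_\mu}\frac{dq_\mu}{ds} + \frac{\partial H_1}{\partial p_\mu}\frac{dp_\mu}{ds}\right) = \sum_{\mu=0}^{n}\left(\frac{\partial H_1}{\partial q_\mu}\frac{\partial H_1}{\partial p_\mu} - \frac{\partial H_1}{\partial p_\mu}\frac{\partial H_1}{\partial q_\mu}\right) \equiv 0. \]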
EXAMPLE
\[ \frac{dt}{ds} = -\frac{\partial H_1}{\partial e} = 1. \]
Example 21.3
Up to arbitrary shifts of the origin of our time scale, we thus identify t (s) with s. As all
other partial derivatives of H1 coincide with those of H , so do the respective canonical
equations. The system description in terms of H1 from (21.26) is thus identical to the
conventional description by a Hamiltonian H and does not provide any additional
information.
EXAMPLE
\[ H_{\mathrm{nr}}(p) = \frac{p^2}{2m_0} + m_0c^2. \tag{21.27} \]
Herein, p denotes the 3-component vector of particle momenta, p = (p1 , p2 , p3 ). The
equivalent extended Hamiltonian (21.26) that yields the same dynamics in terms of the
subsequent canonical equations (21.23) is then
\[ H_{1,\mathrm{nr}}(p, e) = \frac{p^2}{2m_0} - e + m_0c^2, \tag{21.28} \]
in conjunction with the general side condition for extended Hamiltonians,
H1,nr (q, p, t, e) = 0. As solely expressions of the form q 2 − c2 t 2 and p 2 − e2 /c2
are maintained under Lorentz transformations, (see Example 21.18), the Hamiltoni-
ans (21.27) and (21.28) are obviously not Lorentz invariant. In the description of ex-
tended Hamiltonians, the corresponding Lorentz-invariant form of (21.28) can easily
be constructed
\[ H_{1,\mathrm{r}}(p, e) = \frac{1}{2m_0}\left(p^2 - \frac{e^2}{c^2}\right) + \frac{1}{2}m_0c^2. \tag{21.29} \]
The constant term was adjusted to preserve the relation e = m0c² for p = 0. The side
condition H1 = 0, which represents an implicit function, now yields the relativistic
energy-momentum correlation
\[ e^2 = p^2c^2 + m_0^2c^4. \tag{21.30} \]
Although p and e denote formally independent canonical variables, only those com-
binations of p and e have physical significance that satisfy (21.30). Of course,
the canonical equations that follow from (21.29) are different from those following
from (21.28). This reflects the modification that a system’s description encounters
if we switch from a non-relativistic to a relativistic viewpoint. The extended set of
canonical equations (21.23) emerging from the extended Hamiltonian (21.29) is
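\[ -\frac{\partial H_{1,\mathrm{r}}}{\partial q_i} = \frac{dp_i}{ds} = 0, \qquad \frac{\partial H_{1,\mathrm{r}}}{\partial p_i} = \frac{dq_i}{ds} = \frac{p_i}{m_0}, \]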
But this is nothing else than the non-trivial canonical equation of the conventional
Hamiltonian
\[ H_{\mathrm{r}}(p) = e = \sqrt{p^2c^2 + m_0^2c^4}. \tag{21.33} \]
We thus encounter the well-known conventional Hamiltonian Hr (p) of the free rela-
tivistic particle. In contrast to the extended Hamiltonian (21.29), the physically equiv-
alent conventional Hamiltonian (21.33) does not manifest anymore its Lorentz invari-
ance.
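The relation between the extended Hamiltonian (21.29), its side condition, and the conventional Hamiltonian (21.33) can be made concrete with a short numerical check. This is a sketch only; the function names are hypothetical, and dt/ds = −∂H1/∂e is evaluated in closed form from (21.29) as e/(m0c²).

```python
import math


def h1_free(p, e, m0, c):
    """Extended Hamiltonian (21.29) of the free relativistic particle."""
    p2 = sum(pi * pi for pi in p)
    return (p2 - e * e / c ** 2) / (2.0 * m0) + 0.5 * m0 * c ** 2


def energy_on_shell(p, m0, c):
    """Positive root of the energy-momentum relation (21.30), i.e. H_r of (21.33)."""
    p2 = sum(pi * pi for pi in p)
    return math.sqrt(p2 * c ** 2 + m0 ** 2 * c ** 4)


def dt_ds(e, m0, c):
    """dt/ds = -dH1/de for (21.29), written out analytically."""
    return e / (m0 * c ** 2)


if __name__ == "__main__":
    p, m0, c = (0.3, -0.4, 1.2), 1.0, 1.0
    e = energy_on_shell(p, m0, c)
    print(h1_free(p, e, m0, c))   # side condition H1 = 0 (up to rounding)
    print(dt_ds(e, m0, c))        # equals sqrt(1 + p^2 / (m0 c)^2)
```

Values of e that do not satisfy (21.30) give a nonzero H1 and therefore lie off the hyper-surface on which the motion takes place.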
To complete this example, we show that the extended Hamiltonian (21.29) also
emerges as the Legendre-transformed Lagrangian (21.11) from Example 21.1. The
extended Legendre transformation that relates extended Lagrangians with extended
Hamiltonians was defined in equations (21.18), (21.19), and (21.21) of Sect. 21.2. For
the addressed case, the canonical momenta evaluate to
\[ p_i = \frac{\partial L_1}{\partial\left(\frac{dq_i}{ds}\right)} = m_0\frac{dq_i}{ds}, \qquad e = -\frac{\partial L_1}{\partial\left(\frac{dt}{ds}\right)} = m_0c^2\frac{dt}{ds}. \]
The extended Hamiltonian is then obtained by expressing the derivatives dqi /ds and
dt/ds that are contained in the Lagrangian and in the Legendre transformation rule in
terms of the momenta pi and e,
\[ H_1(p, e) = \sum_i\frac{p_i^2}{m_0} - \frac{e^2}{m_0c^2} - L_1(p, e), \qquad L_1(p, e) = \sum_i\frac{p_i^2}{2m_0} - \frac{e^2}{2m_0c^2} - \frac{1}{2}m_0c^2. \]
EXAMPLE
Example 21.5
Analogously to the dynamics of a free relativistic particle, treated in Example 21.4,
the relativistic dynamics of a particle in an external potential V (q, t) is described by
the extended Hamiltonian
\[ H_{1,\mathrm{r}}(q, p, t, e) = \frac{1}{2m_0}\left[p^2 - \left(\frac{e - V(q, t)}{c}\right)^2\right] + \frac{1}{2}m_0c^2. \tag{21.35} \]
The constant term was chosen to ensure that for p = 0 and V (q, t) = 0 the constraint
H1,r = 0 leads to e = m0 c2 . Consequently, for the general case H1,r = 0 induces the
scleronomous constraint
\[ \big(e - V(q, t)\big)^2 = p^2c^2 + m_0^2c^4. \tag{21.36} \]
Again, q, p, t and e represent independent canonical variables, but only those combi-
nations of q, p, t and e have a physical meaning which satisfy (21.36).
The extended set of canonical equations (21.23) emerging from the extended
Hamiltonian (21.35) is
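\[ -\frac{\partial H_{1,\mathrm{r}}}{\partial q_i} = \frac{dp_i}{ds} = -\frac{e - V(q, t)}{m_0c^2}\frac{\partial V}{\partial q_i}, \qquad \frac{\partial H_{1,\mathrm{r}}}{\partial p_i} = \frac{dq_i}{ds} = \frac{p_i}{m_0}, \]
\[ \frac{\partial H_{1,\mathrm{r}}}{\partial t} = \frac{de}{ds} = \frac{e - V(q, t)}{m_0c^2}\frac{\partial V}{\partial t}, \qquad -\frac{\partial H_{1,\mathrm{r}}}{\partial e} = \frac{dt}{ds} = \frac{e - V(q, t)}{m_0c^2}. \]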
We may express the canonical equations equivalently using the time t as the indepen-
dent variable, and eliminate the canonical variable e by means of the constraint (21.36)
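\[ \frac{de}{dt} = \frac{de}{ds}\frac{ds}{dt} = \frac{e - V(q, t)}{m_0c^2}\frac{\partial V}{\partial t}\,\frac{m_0c^2}{e - V(q, t)} = \frac{\partial V}{\partial t} = \frac{\partial H_{\mathrm{r}}}{\partial t}. \]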
These equations can be conceived to represent the canonical equations emerging from
the conventional Hamiltonian
\[ H_{\mathrm{r}}(q, p, t) = e = \sqrt{p^2c^2 + m_0^2c^4} + V(q, t). \tag{21.37} \]
We thus encounter the Lorentz invariant form of the conventional Hamiltonian for a
particle in an external potential V (q, t). The Hamiltonians H1,r from (21.35) and Hr
from (21.37) are physically equivalent, hence describe the same dynamics. On the
other hand, the extended Hamiltonian H1,r additionally determines the parameteriza-
tion of time t = t (s).
We finally note that the extended Hamiltonian (21.35) can be derived according
to (21.18), (21.19), and (21.21) as the Legendre transformed function of the extended
Lagrangian
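\[ L_1\left(q_j, \frac{dq_j}{ds}, t, \frac{dt}{ds}\right) = \frac{1}{2}m_0c^2\left[\frac{1}{c^2}\sum_i\left(\frac{dq_i}{ds}\right)^2 - \left(\frac{dt}{ds}\right)^2 - 1\right] - V(q, t)\frac{dt}{ds}. \tag{21.38} \]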
From (21.38), the correlations of the “velocities” dqi /ds and dt/ds with the canonical
momenta, pi and −e, evaluate to
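\[ p_i = \frac{\partial L_1}{\partial\left(\frac{dq_i}{ds}\right)} = m_0\frac{dq_i}{ds}, \qquad e = -\frac{\partial L_1}{\partial\left(\frac{dt}{ds}\right)} = m_0c^2\frac{dt}{ds} + V(q, t). \tag{21.39} \]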
EXAMPLE
The associated constraint H1,r (q, p, e) = 0 yields a relation of the formally indepen-
dent canonical variables p, q, and e
\[ p^2c^2 - \left(e - \frac{1}{2}kq^2\right)^2 + m_0^2c^4 = 0. \tag{21.41} \]
Solving this relation for e, we obtain the corresponding conventional Hamiltonian Hr
as the right-hand side of the equation e = Hr ,
\[ H_{\mathrm{r}}(q, p) = \sqrt{p^2c^2 + m_0^2c^4} + \frac{1}{2}kq^2. \tag{21.42} \]
The extended set of canonical equations following from the extended Hamil-
tonian (21.40) are
Example 21.6
As expected, we encounter the canonical equations of the conventional Hamiltonian (21.42). The pair of first-order equations can be merged into a single second-order equation for qi (t),
\[ \ddot q(t) + \frac{k}{m_0}\left(1 - \frac{\dot q^2(t)}{c^2}\right)^{3/2} q(t) = 0. \tag{21.44} \]
For q̇(t) → c we thus have q̈(t) → 0. In agreement with the postulates of special
relativity, the speed of light, c, constitutes the absolute limit for the particle’s velocity,
q̇(t). The term in brackets forms a power of the relativistic correction factor, γ
\[ \gamma^{-2} = 1 - \left(\frac{\dot q}{c}\right)^2. \tag{21.45} \]
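The behavior described by (21.44) can be explored by direct numerical integration. The following sketch uses a classical fourth-order Runge–Kutta scheme in units with c = m0 = k = 1; the integrator, the step size, and the function names are illustrative assumptions. The conserved quantity monitored is the conventional Hamiltonian (21.42) rewritten in terms of the velocity, γm0c² + ½kq², which assumes p = γm0q̇.

```python
import math


def acceleration(q, v, k, m0, c):
    """Right-hand side of (21.44) solved for the second derivative of q."""
    return -(k / m0) * (1.0 - (v / c) ** 2) ** 1.5 * q


def energy(q, v, k, m0, c):
    """H_r of (21.42) in terms of velocity: gamma*m0*c^2 + k*q^2/2."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * m0 * c ** 2 + 0.5 * k * q ** 2


def integrate(q, v, k=1.0, m0=1.0, c=1.0, dt=1e-3, steps=20000):
    """Fourth-order Runge-Kutta integration; returns final q, v and max |v|."""
    max_speed = abs(v)
    for _ in range(steps):
        k1q, k1v = v, acceleration(q, v, k, m0, c)
        k2q = v + 0.5 * dt * k1v
        k2v = acceleration(q + 0.5 * dt * k1q, v + 0.5 * dt * k1v, k, m0, c)
        k3q = v + 0.5 * dt * k2v
        k3v = acceleration(q + 0.5 * dt * k2q, v + 0.5 * dt * k2v, k, m0, c)
        k4q = v + dt * k3v
        k4v = acceleration(q + dt * k3q, v + dt * k3v, k, m0, c)
        q += dt * (k1q + 2.0 * k2q + 2.0 * k3q + k4q) / 6.0
        v += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        max_speed = max(max_speed, abs(v))
    return q, v, max_speed


if __name__ == "__main__":
    q_end, v_end, v_max = integrate(2.0, 0.0)
    print(v_max)                                   # stays below c = 1
    print(energy(q_end, v_end, 1.0, 1.0, 1.0))     # close to the initial value
```

For the chosen initial elongation the motion is strongly relativistic, yet the speed stays below c because the restoring acceleration is suppressed by the factor γ⁻³ as the speed approaches c.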
EXAMPLE
We notice that the kinetic momentum pi,k = m dqi /ds differs from the canonical momentum pi in the case of a non-vanishing external potential Ai ≠ 0. The condition for
the Legendre transform of L1 to exist is that its 4 × 4 Hessian matrix with elements
∂²L1 /[∂(dqμ /ds)∂(dqν /ds)] must be non-singular, hence that the determinant of this
matrix does not vanish. For the extended Lagrangian (21.14) from Example 21.2, this
is actually the case as
\[ \det\left(\frac{\partial^2 L_1}{\partial\left(\frac{dq_\mu}{ds}\right)\partial\left(\frac{dq_\nu}{ds}\right)}\right) = m_0^4 \neq 0. \]
This falsifies claims occasionally found in literature that the Hesse matrix associated
with an extended Lagrangian L1 be generally singular, and that for this reason an ex-
tended Hamiltonian H1 generally could not be obtained by a Legendre transformation
of an extended Lagrangian L1 .
With the Hessian condition being actually satisfied, the extended Hamiltonian H1
that follows as the Legendre transform (21.18) of L1 evaluates to
\[ H_1(q, p, t, e) = \frac{1}{2m_0}\left[\left(p - \frac{\zeta}{c}A(q, t)\right)^2 - \left(\frac{e - \zeta\phi(q, t)}{c}\right)^2\right] + \frac{1}{2}m_0c^2. \tag{21.48} \]
\[ \frac{ds}{dt} = \frac{m_0c^2}{e - \zeta\phi} = \frac{m_0c^2}{\sqrt{c^2\left(p - \frac{\zeta}{c}A(q, t)\right)^2 + m_0^2c^4}}. \tag{21.53} \]
The canonical equations (21.53) can now be expressed equivalently with the time t as
the independent variable
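\[ -\frac{dp_i}{dt} = -\frac{dp_i}{ds}\frac{ds}{dt} = -\frac{\zeta c}{\sqrt{c^2\left(p - \frac{\zeta}{c}A(q, t)\right)^2 + m_0^2c^4}}\sum_k\left(p_k - \frac{\zeta}{c}A_k\right)\frac{\partial A_k}{\partial q_i} + \zeta\frac{\partial\phi}{\partial q_i}, \]
\[ \frac{de}{dt} = \frac{de}{ds}\frac{ds}{dt} = -\frac{\zeta c}{\sqrt{c^2\left(p - \frac{\zeta}{c}A(q, t)\right)^2 + m_0^2c^4}}\sum_k\left(p_k - \frac{\zeta}{c}A_k\right)\frac{\partial A_k}{\partial t} + \zeta\frac{\partial\phi}{\partial t}, \tag{21.54} \]
\[ \frac{dq_i}{dt} = \frac{dq_i}{ds}\frac{ds}{dt} = \frac{c^2}{\sqrt{c^2\left(p - \frac{\zeta}{c}A(q, t)\right)^2 + m_0^2c^4}}\left(p_i - \frac{\zeta}{c}A_i\right). \]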
The right-hand sides of (21.54) are exactly the partial derivatives ∂H /∂qi , ∂H /∂t , and
∂H /∂pi of the Hamiltonian (21.50)—and hence its canonical equations, which was to
be shown.
The physical meaning of the dt/ds is worked out by casting it to the equivalent
form
\[ \frac{dt}{ds} = \sqrt{1 + \frac{\left(p - \frac{\zeta}{c}A(q, t)\right)^2}{m_0^2c^2}} = \sqrt{1 + \left(\frac{p_k(s)}{m_0c}\right)^2} = \gamma(s), \]
with p k (s) the instantaneous kinetic momentum of the particle. The dimensionless
quantity dt/ds thus represents the instantaneous value of the relativistic scale fac-
tor γ .
on the basis of the extended action integral from (21.2). With the time t = q0 /c and the
configuration space variables qi treated on equal footing, we are enabled to correlate
two Hamiltonian systems, H and H′, with different time scales, t (s) and T (s), hence
to canonically map the system's time t and its conjugate quantity e in addition to the
mapping of generalized coordinates q and momenta p. The system evolution parameter s is then the common independent variable of both systems, H and H′. A general
mapping of all dependent variables may be formally expressed as
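$$\delta\int_{s_a}^{s_b}\left[\sum_{\mu=0}^{n}p_\mu\frac{dq_\mu}{ds}-H_1(q_\nu,p_\nu)\right]ds = \delta\int_{s_a}^{s_b}\left[\sum_{\mu=0}^{n}P_\mu\frac{dQ_\mu}{ds}-H_1'(Q_\nu,P_\nu)\right]ds. \tag{21.56}$$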
As we are operating with functionals, the condition (21.56) holds if the integrands dif-
fer at most by the derivative dF1 /ds of an arbitrary differentiable function F1 (qν , Qν )
$$\sum_{\mu=0}^{n}p_\mu\frac{dq_\mu}{ds}-H_1 = \sum_{\mu=0}^{n}P_\mu\frac{dQ_\mu}{ds}-H_1'+\frac{dF_1}{ds}. \tag{21.57}$$
We restrict ourselves to functions F1 (qν , Qν ) of the old and the new extended config-
uration space variables, hence to a function of those variables, whose derivatives are
contained in (21.57). Calculating the s-derivative of F1 ,
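$$\frac{dF_1}{ds} = \sum_{\mu=0}^{n}\left[\frac{\partial F_1}{\partial q_\mu}\frac{dq_\mu}{ds}+\frac{\partial F_1}{\partial Q_\mu}\frac{dQ_\mu}{ds}\right], \tag{21.58}$$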
we then get unique transformation rules by comparing the coefficients of (21.58) with
those of (21.57)
$$p_\mu = \frac{\partial F_1}{\partial q_\mu},\qquad P_\mu = -\frac{\partial F_1}{\partial Q_\mu},\qquad H_1' = H_1. \tag{21.59}$$
$$e = -\frac{\partial F_1}{\partial t},\qquad E = \frac{\partial F_1}{\partial T}, \tag{21.60}$$
$$P_0(s) = -\frac{E(s)}{c},\qquad E(s) \stackrel{\not\equiv}{=} H'\bigl(Q(s),P(s),T(s)\bigr). \tag{21.61}$$
The transformed Hamiltonian H′ is finally obtained from the general correlation of
conventional and extended Hamiltonians from (21.25), and the transformation rule
H1′ = H1 for the extended Hamiltonian from (21.59)
$$\bigl[H'(Q,P,T)-E\bigr]\frac{dT}{ds} = \bigl[H(q,p,t)-e\bigr]\frac{dt}{ds}.$$
Eliminating the evolution parameter s, we arrive at the following two equivalent trans-
formation rules for the conventional Hamiltonians under extended canonical transfor-
mations
$$\begin{aligned}
\bigl[H'(Q,P,T)-E\bigr]\frac{\partial T}{\partial t} &= H(q,p,t)-e,\\
\bigl[H(q,p,t)-e\bigr]\frac{\partial t}{\partial T} &= H'(Q,P,T)-E.
\end{aligned} \tag{21.62}$$
The transformation rules (21.62) are generalizations of the rule for conventional
canonical transformations as now cases with T ≠ t are included. We will see at the
end of this section that the rules (21.62) merge for the particular case T = t into the
corresponding rules (19.10), (19.12) of the conventional canonical transformation
theory.
By means of the Legendre transformation
$$F_2(q_\nu,P_\nu) = F_1(q_\nu,Q_\nu)+\sum_{\mu=0}^{n}Q_\mu P_\mu,\qquad P_\mu = -\frac{\partial F_1}{\partial Q_\mu}, \tag{21.63}$$
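This means that none of the qμ takes part in the transformation defined by (21.63). Hence,
for the Legendre transformation, we may regard the functional dependence of the
generating functions to be reduced to F1 = F1(Qν) and F2 = F2(Pν). The new
transformation rule pertaining to F2 thus follows from the Pν-dependence of F2
$$\begin{aligned}
\frac{\partial F_2}{\partial P_\nu} &= \sum_{\mu=0}^{n}\left[\frac{\partial F_1}{\partial Q_\mu}\frac{\partial Q_\mu}{\partial P_\nu}+P_\mu\frac{\partial Q_\mu}{\partial P_\nu}+Q_\mu\frac{\partial P_\mu}{\partial P_\nu}\right]\\
&= \sum_{\mu=0}^{n}\left[-P_\mu\frac{\partial Q_\mu}{\partial P_\nu}+P_\mu\frac{\partial Q_\mu}{\partial P_\nu}+Q_\mu\delta_{\mu\nu}\right]\\
&= Q_\nu.
\end{aligned}$$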
The new set of transformation rules, which is, of course, equivalent to the previous set
from (21.59), is thus
$$p_\mu = \frac{\partial F_2}{\partial q_\mu},\qquad Q_\mu = \frac{\partial F_2}{\partial P_\mu},\qquad H_1' = H_1. \tag{21.64}$$
Similarly to the conventional theory of canonical transformations, there are two more
possibilities to define a generating function of an extended canonical transformation.
By means of the Legendre transformation
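$$F_3(p_\nu,Q_\nu) = F_1(q_\nu,Q_\nu)-\sum_{\mu=0}^{n}q_\mu p_\mu,\qquad p_\mu = \frac{\partial F_1}{\partial q_\mu},$$
$$q_\mu = -\frac{\partial F_3}{\partial p_\mu},\qquad P_\mu = -\frac{\partial F_3}{\partial Q_\mu},\qquad H_1' = H_1. \tag{21.66}$$
$$F_4(p_\nu,P_\nu) = F_3(p_\nu,Q_\nu)+\sum_{\mu=0}^{n}Q_\mu P_\mu,\qquad P_\mu = -\frac{\partial F_3}{\partial Q_\mu},$$
$$q_\mu = -\frac{\partial F_4}{\partial p_\mu},\qquad Q_\mu = \frac{\partial F_4}{\partial P_\mu},\qquad H_1' = H_1. \tag{21.67}$$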
Calculating the second derivatives of the generating functions, we conclude that the
following correlations for the derivatives of the general mapping from (21.55) must
hold for the entire set of extended phase-space variables,
Exactly if these conditions are fulfilled for all μ, ν = 0, . . . , n, then the extended co-
ordinate transformation (21.55) is canonical and preserves the form of the extended
set of canonical equations (21.23). Otherwise, we are dealing with a general, non-
canonical coordinate transformation that does not preserve the form of the canonical
equations.
EXAMPLE
The particular transformation rules for this case follow from their general form, given
by (21.65),
$$p_i = P_i,\qquad Q_i = q_i,\qquad e = E,\qquad T = t,\qquad H_1' = H_1.$$
The existence of a neutral element is a precondition for the set of extended canonical
transformations of a Hamiltonian system H (pj , qj , t) to form a group.
EXAMPLE
The connection of the extended canonical transformation theory with the conventional
one is furnished by the particular extended generating function
Together with the general transformation rule (21.62) for conventional Hamiltoni-
ans, we find the rule for Hamiltonians under conventional canonical transformations
from (19.12),
$$H'(Q_j,P_j,t) = H(q_j,p_j,t)+E-e = H(q_j,p_j,t)+\frac{\partial F_2}{\partial t}.$$
Canonical transformations that are defined by extended generating functions of the
form of (21.69) leave the time variable unchanged and thus define the subgroup of
conventional canonical transformations within the general group of extended canonical
transformations. In the present example, the time t also forms a common independent
variable of both the original and the transformed system, just as presupposed in
case of a conventional canonical transformation. Corresponding to the trivial extended
Hamiltonian from (21.26), we may refer to (21.69) as the trivial extended generating
function.
EXAMPLE
EXAMPLE
EXAMPLE
The generalized form of Liouville’s theorem thus states that the extended volume form
dV1 = dq1 . . . dqn dp1 . . . dpn dt de is conserved under extended canonical
transformations, hence that the determinant D that is associated with the Jacobi matrix of the
transformation is always unity. As the number of canonical variables remains even
in the extended description, we may again represent D by
$$D = \frac{\partial(Q_0,\dots,Q_n)}{\partial(q_0,\dots,q_n)}\left[\frac{\partial(p_0,\dots,p_n)}{\partial(P_0,\dots,P_n)}\right]^{-1}.$$
If the transformation rules can be derived from a generating function of type
F2(qμ, Pμ), then we are dealing with the particular case of a canonical coordinate
transformation. Inserting the equations for Qν and pν yields
$$D = \left|\frac{\partial^2F_2}{\partial q_\mu\,\partial P_\nu}\right|\,\left|\frac{\partial^2F_2}{\partial P_\mu\,\partial q_\nu}\right|^{-1} = 1.$$
This equation holds because (i) the partial derivatives may be interchanged and (ii) a
matrix and its transpose have the same determinant. We will see in Example 21.18
that under generalized canonical transformations, that is, transformations
that also map the time scales of the original and destination systems, only the
generalized version of Liouville’s theorem applies, and not the conventional form from
Example 19.6.
EXAMPLE
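canonical transformations, the invariance property holds for extended Poisson brackets,
[F, G]e. In analogy to conventional brackets, the extended Poisson brackets are
defined by
$$[F,G]_e = \sum_{\mu=0}^{n}\left[\frac{\partial F}{\partial q_\mu}\frac{\partial G}{\partial p_\mu}-\frac{\partial F}{\partial p_\mu}\frac{\partial G}{\partial q_\mu}\right] = [F,G]-\frac{\partial F}{\partial t}\frac{\partial G}{\partial e}+\frac{\partial F}{\partial e}\frac{\partial G}{\partial t}. \tag{21.71}$$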
We will show in Example 21.18 that the Lorentz transformation can be conceived
as a particular extended canonical transformation. Consequently, extended Poisson
brackets are always Lorentz invariant.
The total s derivative of a function f = f (qi , pi , t) is
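$$\begin{aligned}
\frac{df}{ds} &= \sum_{i=1}^{n}\left[\frac{\partial f}{\partial q_i}\frac{dq_i}{ds}+\frac{\partial f}{\partial p_i}\frac{dp_i}{ds}\right]+\frac{\partial f}{\partial t}\frac{dt}{ds}\\
&= \sum_{\mu=0}^{n}\left[\frac{\partial f}{\partial q_\mu}\frac{dq_\mu}{ds}+\frac{\partial f}{\partial p_\mu}\frac{dp_\mu}{ds}\right]\\
&= \sum_{\mu=0}^{n}\left[\frac{\partial f}{\partial q_\mu}\frac{\partial H_1}{\partial p_\mu}-\frac{\partial f}{\partial p_\mu}\frac{\partial H_1}{\partial q_\mu}\right] = [f,H_1]_e.
\end{aligned} \tag{21.73}$$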
In the context of the extended Hamilton formalism, the extended Poisson bracket of an
explicitly time-dependent function f (qi , pi , t) with the extended Hamiltonian H1 thus
yields directly the total derivative of f with respect to the independent variable, s. This
agrees formally with the conventional Poisson bracket of a conventional Hamiltonian
H with a function f (qi , pi ) that does not explicitly depend on time t . In that case, we
obtain the total time derivative of f .
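The statement df/ds = [f, H1]e can be illustrated numerically. The sketch below evaluates the extended Poisson bracket in the form of the right-hand side of (21.71), using central differences, for one degree of freedom. It assumes the parametrization s ≡ t, so that the extended Hamiltonian reduces to H1 = H − e; the Hamiltonian `H`, the test function `f`, and all names are illustrative choices.

```python
import math

def partial(func, x, k, h=1e-5):
    """Central-difference derivative of func with respect to x[k]."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    return (func(xp) - func(xm)) / (2 * h)

def extended_bracket(F, G, x, n):
    """[F, G]_e as in (21.71); x = [q_1..q_n, p_1..p_n, t, e]."""
    total = 0.0
    for i in range(n):   # conventional part [F, G]
        total += partial(F, x, i) * partial(G, x, n + i)
        total -= partial(F, x, n + i) * partial(G, x, i)
    t_idx, e_idx = 2 * n, 2 * n + 1
    total -= partial(F, x, t_idx) * partial(G, x, e_idx)
    total += partial(F, x, e_idx) * partial(G, x, t_idx)
    return total

def omega2(t):           # hypothetical time-dependent frequency squared
    return 1.0 + 0.5 * math.sin(t)

def H(x):                # illustrative Hamiltonian, n = 1
    q, p, t, e = x
    return 0.5 * p * p + 0.5 * omega2(t) * q * q

def H1(x):               # extended Hamiltonian for s = t, i.e. dt/ds = 1
    return H(x) - x[3]

def f(x):                # explicitly time-dependent test function
    q, p, t, e = x
    return q * p * math.cos(t)

x0 = [0.7, -0.4, 1.2, 0.0]
q, p, t, _ = x0
# Total time derivative of f along the flow: dq/dt = p, dp/dt = -omega^2 q.
expected = p * p * math.cos(t) - omega2(t) * q * q * math.cos(t) - q * p * math.sin(t)
bracket_value = extended_bracket(f, H1, x0, 1)

# Fundamental extended brackets: [q, p]_e = 1 and [t, e]_e = -1.
qp = extended_bracket(lambda x: x[0], lambda x: x[1], x0, 1)
te = extended_bracket(lambda x: x[2], lambda x: x[3], x0, 1)
print(bracket_value, expected, qp, te)
```

With these conventions, the pair (t, e) enters the bracket with the opposite sign compared with a pair (qi, pi), which reflects p0 = −e/c.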
EXAMPLE
Analogously to the fundamental Poisson bracket (19.29) from Example 19.7, we then
find for the extended set of fundamental commutators
$$\hat q_\mu = q_\mu,\qquad \hat p_\nu = -i\hbar\frac{\partial}{\partial q_\nu}. \tag{21.76}$$
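For, if we let the commutator {q̂μ, p̂ν}₋ of the operators q̂μ and p̂ν act on an explicitly
time-dependent function ψ(qλ) ≡ ψ(q1, . . . , qn, t), we get
$$\{\hat q_\mu,\hat p_\nu\}_-\,\psi(q_\lambda) = i\hbar\left\{\frac{\partial}{\partial q_\nu},q_\mu\right\}_-\psi = i\hbar\left[\frac{\partial}{\partial q_\nu}\bigl(q_\mu\psi\bigr)-q_\mu\frac{\partial\psi}{\partial q_\nu}\right] = i\hbar\,\delta_{\mu\nu}\,\psi(q_\lambda). \tag{21.77}$$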
Because of q0 ≡ ct, the momentum operator for the index ν = 0, i.e. p̂0 ≡ −ê/c, has
the alternative representation
$$\hat e = i\hbar\frac{\partial}{\partial t}. \tag{21.78}$$
Parallel to the momentum operators p̂i = −iħ ∂/∂qi that are conjugate to the
configuration space variables qi, one thus finds in the extended description the operator ê
for the system’s instantaneous energy content as the conjugate quantity of the time
variable t.
Furthermore, in the extended Hamiltonian formalism of canonical quantization,
the extended Hamiltonian H1 = 0 from (21.25) is replaced by the extended Hamilton
operator Ĥ1 = 0̂
$$\hat H_1 = \frac{dt}{ds}\left(\hat H-\hat e\right) = \frac{dt}{ds}\left(\hat H-i\hbar\frac{\partial}{\partial t}\right) = \hat 0, \tag{21.79}$$
with Ĥ denoting the related conventional Hamilton operator of the given quantum
mechanical problem. As long as the operator equation Ĥ1 = 0̂ is not submitted to
an extended canonical transformation, we are allowed to identify the time t with the
system’s evolution parameter s, (t ≡ s). We thereby find an operator equation that is
no longer Lorentz invariant
$$\hat H-i\hbar\frac{\partial}{\partial t} = \hat 0. \tag{21.80}$$
If we let these operators act on an explicitly time-dependent function ψ(qi , t), we get
the following partial differential equation
$$\hat H\,\psi(q_i,t) = i\hbar\frac{\partial\psi(q_i,t)}{\partial t}. \tag{21.81}$$
In the realm of quantum mechanics, this equation is referred to as the Schrödinger
equation.
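To make the operator statement concrete, the following sketch checks by finite differences that a plane wave satisfies the equation Ĥψ = ê ψ. It assumes, purely for illustration, a free particle in one dimension with Ĥ = −(ħ²/2m) ∂²/∂x², units with ħ = m = 1, and the dispersion relation ω = ħk²/2m; none of these choices is taken from the text.

```python
import cmath

HBAR = 1.0     # illustrative units
MASS = 1.0
K_WAVE = 2.0   # wave number of the plane wave
OMEGA = HBAR * K_WAVE ** 2 / (2 * MASS)   # free-particle dispersion relation

def psi(x, t):
    """Plane wave exp(i (k x - omega t))."""
    return cmath.exp(1j * (K_WAVE * x - OMEGA * t))

def hamilton_on_psi(x, t, h=1e-4):
    """Assumed free Hamiltonian -(hbar^2 / 2m) d^2/dx^2 applied to psi."""
    d2 = (psi(x + h, t) - 2 * psi(x, t) + psi(x - h, t)) / h ** 2
    return -HBAR ** 2 / (2 * MASS) * d2

def energy_operator_on_psi(x, t, h=1e-4):
    """Energy operator i hbar d/dt applied to psi."""
    return 1j * HBAR * (psi(x, t + h) - psi(x, t - h)) / (2 * h)

residual = abs(hamilton_on_psi(0.3, 0.7) - energy_operator_on_psi(0.3, 0.7))
print(residual)
```

The residual is limited by the discretization and rounding errors of the difference quotients.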
EXAMPLE
$$H(q_j,p_j) = \frac{1}{2}\left(p_1^2+p_2^2+p_3^2\right)-\frac{K}{r},\qquad K = G\,(m_1+m_2),\qquad r^2 = q_1^2+q_2^2+q_3^2. \tag{21.82}$$
$$e = \frac{1}{2}\left(p_1^2+p_2^2+p_3^2\right)-\frac{K}{r} = \text{const.} \tag{21.83}$$
Obviously, in this description the system has a singularity for r → 0. We will now
show that the Kepler system (21.82) can be canonically transformed into another
Hamiltonian system that does not exhibit any singularities. This canonical transfor-
mation can be defined in terms of a generating function of type F3 ,
$$F_3(Q_j,p_j,T,e) = -\frac{1}{2}\,p_1\left(Q_1^2-Q_2^2-Q_3^2+Q_4^2\right)-p_2\left(Q_1Q_2-Q_3Q_4\right)-p_3\left(Q_1Q_3+Q_2Q_4\right)+e\int_0^T\xi(\tau)\,d\tau. \tag{21.84}$$
From the transformation rule for the conventional Hamiltonians from (21.62), H′ is
finally obtained as
which is, as expected, in agreement with the transformation rule of their values, E
and e. In explicit form, the transformed Hamiltonian H′(Qj, Pj, T) is then found by
expressing the original Hamiltonian in terms of the new variables
$$H'(Q_j,P_j,T) = \frac{\xi(T)}{Q_1^2+Q_2^2+Q_3^2+Q_4^2}\left[\frac{1}{2}\left(P_1^2+P_2^2+P_3^2+P_4^2\right)-2K\right]. \tag{21.91}$$
The constant energy e of the original system reads in terms of the new coordinates
$$e = \frac{\frac{1}{2}\left(P_1^2+P_2^2+P_3^2+P_4^2\right)-2K}{Q_1^2+Q_2^2+Q_3^2+Q_4^2} = \text{const.} \tag{21.92}$$
From H′, the canonical equations of the transformed system evaluate to
$$\begin{aligned}
\frac{\partial H'}{\partial P_i} &= \frac{dQ_i}{dT} = \frac{\xi(T)}{Q_1^2+Q_2^2+Q_3^2+Q_4^2}\,P_i,\\
-\frac{\partial H'}{\partial Q_i} &= \frac{dP_i}{dT} = \frac{2\,\xi(T)}{\left(Q_1^2+Q_2^2+Q_3^2+Q_4^2\right)^2}\left[\frac{1}{2}\left(P_1^2+P_2^2+P_3^2+P_4^2\right)-2K\right]Q_i.
\end{aligned}$$
We may merge the pairs of first-order equations into second-order equations for the
Qi , i = 1, . . . , 4
$$\frac{d^2Q_i}{dT^2}-\left[\frac{1}{\xi(T)}\frac{d\xi(T)}{dT}-\frac{1}{Q_1^2+Q_2^2+Q_3^2+Q_4^2}\frac{d}{dT}\left(Q_1^2+Q_2^2+Q_3^2+Q_4^2\right)\right]\frac{dQ_i}{dT}-2e\left(\frac{\xi(T)}{Q_1^2+Q_2^2+Q_3^2+Q_4^2}\right)^2Q_i = 0. \tag{21.93}$$
After having worked out the equations of motion of the transformed system, we are
now in a position to fix the as yet undetermined time function ξ(T). With the
transformed canonical position coordinates conceived as functions of the transformed time,
Qi = Qi(T), we may define
By virtue of the fixation of ξ(T ), the relation of the physical time t of the Ke-
pler system to the time T of the transformed system is uniquely determined
through (21.89)
$$t(T) = \int_0^T\left[Q_1^2(\tau)+Q_2^2(\tau)+Q_3^2(\tau)+Q_4^2(\tau)\right]d\tau. \tag{21.95}$$
Note that the identification of ξ(T) with the time evolution of a function of the canonical
variables does not mean that ξ(T) acquires an explicit dependence on the canonical
variables. With this particular scaling of the transformed time T, the equations of
motion (21.93) simplify to
$$\frac{d^2Q_i}{dT^2}-2e\,Q_i = 0. \tag{21.96}$$
For e < 0, the orbit is closed in the original Kepler system. In the transformed system,
we then get four uncoupled equations of motion of the time-independent harmonic
oscillator, which we already know to be analytically solvable.
Equations (21.96) can be regarded as the equations of motion that emerge from the
canonical equations of the Hamiltonian
$$H'(Q_j,P_j) = \frac{1}{2}\left(P_1^2+P_2^2+P_3^2+P_4^2\right)-e\left(Q_1^2+Q_2^2+Q_3^2+Q_4^2\right). \tag{21.97}$$
By means of the relation (21.92), we immediately find the constant value E of the
Hamiltonian (21.97)
E = 2K = const. (21.98)
The original Kepler system may now be solved according to the scheme sketched
at the end of Example 19.3. We must first transform the given initial conditions
q1 (0), q2 (0), q3 (0) and p1 (0), p2 (0), p3 (0) of the Kepler system into the initial con-
ditions Q1 (0), Q2 (0), Q3 (0), Q4 (0) and P1 (0), P2 (0), P3 (0), P4 (0) for (21.96). This
can be worked out by inverting the transformation rules (21.85). For a unique inverse
to exist, we must choose one constraint, which may be defined for convenience as
$$Q_4(0) \stackrel{\text{Def}}{=} 0. \tag{21.99}$$
With this setting, the initial values of the configuration space variables of the trans-
formed system are
$$Q_1(0) = \sqrt{q_1(0)+r(0)},\qquad Q_2(0) = \frac{q_2(0)}{Q_1(0)},\qquad Q_3(0) = \frac{q_3(0)}{Q_1(0)}. \tag{21.100}$$
The initial momenta Pi (0) are then directly obtained from the general transformation
rules (21.87). Now, the harmonic oscillator equations (21.96) may be solved analyt-
ically to find the solutions Qi (T ), Pi (T ) at time T . The configuration space coordi-
nates qi (T ) at time T of the original Kepler system are then found from the trans-
formation rules (21.85). The corresponding momentum coordinates pi (T ) at time T
of the original Kepler system follow by solving the transformation rules (21.87) for
the pi
$$\begin{aligned}
p_1 &= \frac{1}{2r}\left(Q_1P_1-Q_2P_2-Q_3P_3+Q_4P_4\right),\\
p_2 &= \frac{1}{2r}\left(Q_2P_1+Q_1P_2-Q_4P_3-Q_3P_4\right),\\
p_3 &= \frac{1}{2r}\left(Q_3P_1+Q_4P_2+Q_1P_3+Q_2P_4\right),\\
0 &= \frac{1}{2r}\left(-Q_4P_1+Q_3P_2-Q_2P_3+Q_1P_4\right).
\end{aligned}$$
The remaining task is to invert the analytic solution of (21.95) to find the
representation T(t). We can then finally express the solutions qi(T), pi(T) in terms of the
Kepler system’s time t to obtain the qi(t) and the pi(t).
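The algebra of this regularizing transformation can be checked with a few lines of code. The sketch below maps Kepler initial data to the transformed variables using (21.99) and (21.100), maps them back, and verifies the energy relations (21.92) and (21.98). The rules for the Pi are not quoted in this excerpt; here they are assumed to follow from Pi = −∂F3/∂Qi applied to the generating function (21.84). The value K = 1, the initial data, and all function names are illustrative.

```python
import math

K = 1.0  # K = G (m1 + m2); illustrative value

def to_regularized(q, p):
    """Kepler data (q, p) -> (Q, P) with the choice Q4 = 0, cf. (21.99), (21.100)."""
    q1, q2, q3 = q
    p1, p2, p3 = p
    r = math.sqrt(q1 * q1 + q2 * q2 + q3 * q3)
    Q1 = math.sqrt(q1 + r)                  # requires q1 + r > 0
    Q2, Q3, Q4 = q2 / Q1, q3 / Q1, 0.0
    # Assumed rules P_i = -dF3/dQ_i, worked out by hand from (21.84).
    P1 = p1 * Q1 + p2 * Q2 + p3 * Q3
    P2 = -p1 * Q2 + p2 * Q1 + p3 * Q4
    P3 = -p1 * Q3 - p2 * Q4 + p3 * Q1
    P4 = p1 * Q4 - p2 * Q3 + p3 * Q2
    return (Q1, Q2, Q3, Q4), (P1, P2, P3, P4)

def to_kepler(Q, P):
    """Inverse direction: q_i = -dF3/dp_i and the momentum formulas of the text."""
    Q1, Q2, Q3, Q4 = Q
    P1, P2, P3, P4 = P
    two_r = Q1 * Q1 + Q2 * Q2 + Q3 * Q3 + Q4 * Q4   # equals 2 r
    q = (0.5 * (Q1 * Q1 - Q2 * Q2 - Q3 * Q3 + Q4 * Q4),
         Q1 * Q2 - Q3 * Q4,
         Q1 * Q3 + Q2 * Q4)
    p = ((Q1 * P1 - Q2 * P2 - Q3 * P3 + Q4 * P4) / two_r,
         (Q2 * P1 + Q1 * P2 - Q4 * P3 - Q3 * P4) / two_r,
         (Q3 * P1 + Q4 * P2 + Q1 * P3 + Q2 * P4) / two_r)
    return q, p

q0 = (1.0, 0.5, -0.3)
p0 = (0.1, 0.9, 0.2)
Q0, P0 = to_regularized(q0, p0)
q_back, p_back = to_kepler(Q0, P0)

r0 = math.sqrt(sum(x * x for x in q0))
e_kepler = 0.5 * sum(x * x for x in p0) - K / r0      # (21.83)
sum_Q2 = sum(x * x for x in Q0)
sum_P2 = sum(x * x for x in P0)
e_new = (0.5 * sum_P2 - 2 * K) / sum_Q2               # (21.92)
E_new = 0.5 * sum_P2 - e_new * sum_Q2                 # value of (21.97)
constraint = -Q0[3] * P0[0] + Q0[2] * P0[1] - Q0[1] * P0[2] + Q0[0] * P0[3]
print(e_kepler, e_new, E_new, constraint)
```

For these bound-state data (e < 0) the round trip reproduces the original coordinates and momenta, the two energy expressions agree, and E takes the value 2K.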
EXAMPLE
As another example for an extended canonical transformation we will show that the
time-dependent harmonic oscillator with also time-dependent damping coefficient can
directly be mapped into a conventional (time-independent) undamped harmonic oscil-
lator. Written for n degrees of freedom, the Hamiltonian of the original system is given
by
$$H(q_j,p_j,t) = \frac{1}{2}\,e^{-F(t)}\sum_{i=1}^{n}p_i^2+\frac{1}{2}\,e^{F(t)}\,\omega^2(t)\sum_{i=1}^{n}q_i^2, \tag{21.101}$$
with ω²(t) and F(t) denoting arbitrary, not necessarily periodic, differentiable
functions of time. The subsequent equations of motion follow as
As the “target system” H′, with T the independent variable, we demand the ordinary
time-independent and undamped harmonic oscillator,
$$H'(Q_j,P_j) = \frac{1}{2}\sum_{i=1}^{n}P_i^2+\frac{1}{2}\,\Omega^2\sum_{i=1}^{n}Q_i^2. \tag{21.103}$$
In the context of the generalized canonical transformation theory, this means that the
transformed time T represents a cyclic coordinate — which means that the conjugate
coordinate energy E represents a constant of motion.
The generating function F2(qj, Pj, t, E) that defines the desired mapping of the
Hamiltonian (21.101) into (21.103) turned out to be
$$F_2(q_j,P_j,t,E) = \sqrt{\frac{e^{F(t)}}{\xi(t)}}\sum_iq_iP_i+\frac{1}{4}\,e^{F(t)}\left[\frac{\dot\xi(t)}{\xi(t)}-f(t)\right]\sum_iq_i^2-E\int_0^t\frac{d\tau}{\xi(\tau)}. \tag{21.104}$$
As the new spatial coordinates Qi and the new time T depend on the old spatial
coordinates qi and the old time t , respectively, we are actually dealing with a point
transformation. The transformation of the time scales of both systems is governed by
the as yet undetermined time function ξ(t).
In terms of the new coordinates, the transformation rule for the energy e =
−∂F2 /∂t is found from our F2 as
$$E = \xi\,e+\frac{1}{2}\left(f\xi-\dot\xi\right)\sum_iQ_iP_i+\frac{1}{4}\left(\xi\ddot\xi-\dot\xi^2+f\xi\dot\xi-\dot f\xi^2-f^2\xi^2\right)\sum_iQ_i^2. \tag{21.106}$$
Because of ∂T/∂t = 1/ξ(t), the relation of old and new Hamiltonians, H and H′,
follows from the general rule from (21.62), yielding
$$H'-E = \xi(t)\,(H-e).$$
With H the original Hamiltonian from (21.101), we get the new Hamiltonian
H′(Qj, Pj, T) by eliminating the old variables according to the transformation
$$+\,\frac{1}{4}\,e^{F(t)}\left(\ddot\xi-\dot\xi f-\xi\dot f+2\xi\,\omega^2(t)\right)\sum_{i=1}^{n}q_i^2. \tag{21.110}$$
The time function ξ(t) can be attributed a physical meaning. We easily convince
ourselves that
$$\xi(t) = e^{F(t)}\sum_{i=1}^{n}q_i^2(t) \tag{21.111}$$
represents a solution of (21.109), provided, of course, that all qi (t) are solutions of
the equation of motion (21.102) of the time-dependent damped harmonic oscillator.
Inserting (21.111) into the representation (21.110) of the invariant E, the latter
takes on the equivalent form
$$E = \sum_ip_i^2\,\sum_iq_i^2-\Bigl(\sum_iq_ip_i\Bigr)^2 = \frac{1}{2}\sum_{i,j}\left(p_iq_j-q_ip_j\right)^2. \tag{21.112}$$
The actual invariance of E can be proved directly by calculating its time derivative.
Obviously, the invariant (21.112) of the time-dependent damped harmonic oscillator
(21.102) has exactly the form of the conservation law for the angular momentum
in central force fields. In the realm of accelerator physics, the quantity εrms = √E/n
is referred to as the “root-mean-square (rms) emittance.” The “rms-emittance” of a
charged particle beam is thus invariant along the beam axis, as long as (i) the particle
motion may approximately be described by linear equations of motion and (ii) the
number of beam particles is maintained.
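The invariance of E can also be observed numerically. The sketch below integrates the canonical equations that follow from the Hamiltonian (21.101) with a classical fourth-order Runge-Kutta scheme and compares the quantity Σpi² Σqi² − (Σqi pi)² at the start and at the end of the integration. The functions `F` and `omega2`, the initial conditions, and the step size are arbitrary illustrative choices, not taken from the text.

```python
import math

def F(t):                 # hypothetical damping function, f(t) = dF/dt = 0.1
    return 0.1 * t

def omega2(t):            # hypothetical time-dependent omega^2(t)
    return 1.0 + 0.5 * math.sin(t)

def rhs(t, y, n):
    """Canonical equations of (21.101): dq/dt = e^{-F} p, dp/dt = -e^{F} omega^2 q."""
    q, p = y[:n], y[n:]
    dq = [math.exp(-F(t)) * pi for pi in p]
    dp = [-math.exp(F(t)) * omega2(t) * qi for qi in q]
    return dq + dp

def rk4_step(t, y, h, n):
    """One classical Runge-Kutta step of size h."""
    k1 = rhs(t, y, n)
    k2 = rhs(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)], n)
    k3 = rhs(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)], n)
    k4 = rhs(t + h, [yi + h * ki for yi, ki in zip(y, k3)], n)
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def invariant(y, n):
    """sum(p^2) * sum(q^2) - (sum(q p))^2 for y = [q_1..q_n, p_1..p_n]."""
    q, p = y[:n], y[n:]
    sum_q2 = sum(qi * qi for qi in q)
    sum_p2 = sum(pi * pi for pi in p)
    sum_qp = sum(qi * pi for qi, pi in zip(q, p))
    return sum_p2 * sum_q2 - sum_qp ** 2

n = 2
y = [1.0, 0.0, 0.0, 0.8]          # q1, q2, p1, p2 at t = 0
E_start = invariant(y, n)
t, h = 0.0, 1e-3
for _ in range(5000):             # integrate up to t = 5
    y = rk4_step(t, y, h, n)
    t += h
E_end = invariant(y, n)
print(E_start, E_end)
```

Although both the damping and the frequency vary in time, the two printed values agree to within the integration error.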
Inserting (21.111) into (21.108) we finally find that the invariant E coincides with
the coefficient Ω² from the transformed Hamiltonian (21.103) provided that ξ(t) is
given by (21.111),
$$\Omega^2 = E. \tag{21.113}$$
The initial conditions Qi (0) and Pi (0) can, furthermore, be expressed through the
qi (0) and the pi (0) by means of the inverse transformation of (21.105) at time t =
T = 0,
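$$\begin{pmatrix}Q_i(0)\\ P_i(0)\end{pmatrix} = \begin{pmatrix}\sqrt{e^{F(0)}/\xi(0)} & 0\\ -\frac{1}{2}\bigl(\dot\xi(0)-\xi(0)\,f(0)\bigr)\sqrt{e^{F(0)}/\xi(0)} & \sqrt{\xi(0)/e^{F(0)}}\end{pmatrix}\begin{pmatrix}q_i(0)\\ p_i(0)\end{pmatrix}. \tag{21.116}$$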
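Herein, ξ(t) denotes the uniquely determined solution of (21.108) for given initial
conditions ξ(0), ξ̇(0) and a fixed Ω² = const. The elements of the solution matrix
R(t) are obtained by multiplying the three matrices involved,
$$\begin{aligned}
r_{11}(t) &= \sqrt{\frac{\xi(t)\,e^{F(0)}}{\xi(0)\,e^{F(t)}}}\left[\cos\Omega T(t)-\frac{1}{2}\bigl(\dot\xi(0)-f(0)\,\xi(0)\bigr)\frac{\sin\Omega T(t)}{\Omega}\right],\\
r_{12}(t) &= \sqrt{\frac{\xi(0)\,\xi(t)}{e^{F(0)}\,e^{F(t)}}}\;\frac{\sin\Omega T(t)}{\Omega},\\
r_{21}(t) &= \sqrt{\frac{e^{F(0)}\,e^{F(t)}}{\xi(0)\,\xi(t)}}\left\{\frac{1}{2}\Bigl[\dot\xi(t)-f(t)\,\xi(t)-\dot\xi(0)+f(0)\,\xi(0)\Bigr]\cos\Omega T(t)\right.\\
&\qquad\left.-\left[\frac{1}{4}\bigl(\dot\xi(0)-f(0)\,\xi(0)\bigr)\bigl(\dot\xi(t)-f(t)\,\xi(t)\bigr)+\Omega^2\right]\frac{\sin\Omega T(t)}{\Omega}\right\},\\
r_{22}(t) &= \sqrt{\frac{\xi(0)\,e^{F(t)}}{\xi(t)\,e^{F(0)}}}\left[\cos\Omega T(t)+\frac{1}{2}\bigl(\dot\xi(t)-f(t)\,\xi(t)\bigr)\frac{\sin\Omega T(t)}{\Omega}\right].
\end{aligned} \tag{21.118}$$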
For all times t , the determinant D = r11 r22 − r12 r21 of matrix R(t) has the value
D = 1. The linear mapping (q(0), p(0)) → (q(t), p(t)) is thus in agreement with
the requirement of Liouville’s theorem. For the particular case Ω → 0, we find that
(sin ΩT)/Ω → T. In that case, the particle motion in a time-dependent damped
harmonic oscillator is mapped into that of free particles.
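The unit determinant can be confirmed by direct evaluation. The sketch below codes the four matrix elements of (21.118) as functions of the values of ξ, ξ̇, f, and F at the times 0 and t, and evaluates D = r11 r22 − r12 r21 for arbitrarily chosen numbers. It assumes that the constant frequency of the target oscillator (21.103) is denoted by `Omega`; the function name and all numerical inputs are hypothetical.

```python
import math

def solution_matrix(xi0, dxi0, f0, F0, xi_t, dxi_t, f_t, F_t, Omega, T):
    """Matrix elements r11, r12, r21, r22 in the form of (21.118)."""
    b0 = 0.5 * (dxi0 - f0 * xi0)        # (1/2)(xi'(0) - f(0) xi(0))
    bt = 0.5 * (dxi_t - f_t * xi_t)     # (1/2)(xi'(t) - f(t) xi(t))
    c = math.cos(Omega * T)
    s = math.sin(Omega * T) / Omega
    e0, et = math.exp(F0), math.exp(F_t)
    r11 = math.sqrt(xi_t * e0 / (xi0 * et)) * (c - b0 * s)
    r12 = math.sqrt(xi0 * xi_t / (e0 * et)) * s
    r21 = math.sqrt(e0 * et / (xi0 * xi_t)) * ((bt - b0) * c - (b0 * bt + Omega ** 2) * s)
    r22 = math.sqrt(xi0 * et / (xi_t * e0)) * (c + bt * s)
    return r11, r12, r21, r22

# Arbitrary illustrative inputs; xi must be positive.
r11, r12, r21, r22 = solution_matrix(
    xi0=1.3, dxi0=0.4, f0=0.2, F0=0.0,
    xi_t=2.1, dxi_t=-0.7, f_t=0.5, F_t=0.35,
    Omega=0.8, T=1.9)
D = r11 * r22 - r12 * r21
print(D)
```

The result equals unity up to rounding for any admissible input, since the mixed terms cancel and cos² ΩT + sin² ΩT remains.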
The rule for the transformation of time t → T that emerges from the generating
function (21.104) has the particular property that the transformed time T does not de-
pend on the coordinates of the particles. Exactly for that reason, the transformed time
T maintains the property of the original time t to be a common coordinate for all parti-
cles in the transformed system. This means that T may serve as the common evolution
parameter of all particle coordinates Qi (T ) and Pi (T ) in the transformed system. In
other words, T has the global property of the transformed system’s time. Mathemat-
ically, T has this property due to the fact that the (Pi , qi ) and the (E, t) terms in the
particular generating function (21.104) are additive. Therefore, this extended canon-
ical transformation can be split into a conventional canonical transformation plus a
pure canonical time-energy transformation from Example 21.11. This is not always
possible as extended canonical transformations can be defined that do not admit a sep-
aration of the transformation of space and time coordinates. We will encounter such
a canonical transformation in Example 21.18, where the Lorentz transformation is
formulated as an extended canonical transformation.
EXAMPLE
Until its absorption as a limiting case into Einstein’s principle in the year 1905,
Galileo’s principle of relativity constituted one of the most undisputed principles of
classical dynamics. It stated that there exists an “absolute time” t that is instantaneously
common to all coordinate systems, however far apart these systems may be located. If we
consider the special case of two coordinate systems that are moving with respect to
each other along one coordinate axis at a constant velocity v, the transformation rule
for the positions q, Q and times t , T of a moving body of mass m0 between these two
systems is simply
q = Q + vt, t = T.
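The complete set of transformation rules of the canonical coordinates is then
$$\begin{aligned}
p &= \frac{\partial F_2}{\partial q} = P+m_0v, & e &= -\frac{\partial F_2}{\partial t} = E+vP,\\
Q &= \frac{\partial F_2}{\partial P} = q-vt, & T &= -\frac{\partial F_2}{\partial E} = t, \qquad H_1' = H_1.
\end{aligned} \tag{21.120}$$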
From the general transformation rule for extended Hamiltonians, H1′ = H1, the rule
for the conventional Hamiltonians H and H′ is then obtained according to (21.62)
with ∂T/∂t = 1 as
$$H = H'+vP. \tag{21.121}$$
As required, the rule for the Hamiltonians is in agreement with the rule for their values,
e and E.
EXAMPLE
The correct transformation rule between coordinate systems that move with respect
to each other at constant velocity (“inertial systems”) is based on the finding that the
velocity of light, c, is actually finite. This rule is referred to as the “Lorentz transfor-
mation.” A finite c constituting the upper speed limit for any signal obviously means
that a finite time span is needed for the signal to pass from one reference system to
another. As an immediate consequence, Galileo’s concept of an “absolute time” that
is instantaneously common to all inertial systems had to be abandoned. Instead, it is
obviously necessary to also transform the time t if one performs the transition from
one inertial system to another.
The special principle of relativity requires that the formulation of the description
of a physical system must be the same in all inertial systems. This means in particu-
lar for Hamiltonian systems that the Lorentz transformation must maintain the form
of the Hamiltonian. On the other hand, as we know from the preceding Chap. 19
and Sect. 21.3, only canonical transformations maintain the form of the canonical
equations. We conclude that the Lorentz transformation must constitute a particular
canonical transformation. As the Lorentz transformation is necessarily associated with a
transformation of the time coordinate, t → T, it may be described only in terms of an
extended canonical transformation. Its extended generating function F2 is given by
$$F_2(q,P,t,E) = \gamma\left[Pq-Et-v\left(Pt-\frac{E}{c^2}\,q\right)\right], \tag{21.122}$$
with v denoting the constant relative velocity of the respective inertial systems. In the
formulation given here, the coordinate systems are adjusted to ensure that the relative
motion of both systems occurs along one coordinate axis, q. As usual, we denote by
γ the dimensionless length and time scaling factor γ = 1/√(1 − β²), with β = v/c
the scaled relative velocity. We observe that the generating function (21.122) of the
Lorentz transformation merges into that of the Galilei transformation (21.119) from
Example 21.17 for v ≪ c, hence for β → 0, γ → 1, E/c² = m → m0. Namely, the
total mass m = E/c² in (21.122) is replaced by the constant rest mass m0 in the case of
the Galilei transformation. With regard to the transformation rules that emerge from
the generating functions, it is exactly the replacement of the second E-dependent term
in (21.122) by the constant mass term m0 in (21.119) from Example 21.17 that induces
the time transformation rule t = T of the Galilei transformation.
For the generating function (21.122), the general transformation rules (21.65) for
extended canonical transformations yield the particular rules
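$$\begin{aligned}
p &= \frac{\partial F_2}{\partial q} = \gamma\left(P+\frac{E}{c^2}\,v\right), & e &= -\frac{\partial F_2}{\partial t} = \gamma\,(E+vP),\\
Q &= \frac{\partial F_2}{\partial P} = \gamma\,(q-vt), & T &= -\frac{\partial F_2}{\partial E} = \gamma\left(t-\frac{v}{c^2}\,q\right), \qquad H_1' = H_1.
\end{aligned} \tag{21.123}$$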
The transformation rules for the variables Q and T follow as
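$$\begin{pmatrix}Q\\ cT\end{pmatrix} = \begin{pmatrix}\gamma & -\beta\gamma\\ -\beta\gamma & \gamma\end{pmatrix}\begin{pmatrix}q\\ ct\end{pmatrix}. \tag{21.124}$$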
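The rules (21.123) can be cross-checked against the generating function (21.122) by numerical differentiation. The sketch below does this in units with c = 1 and additionally verifies that the transformation preserves the quadratic forms (cT)² − Q² and E² − (Pc)². All numerical values and function names are illustrative assumptions.

```python
import math

C = 1.0   # speed of light in illustrative units

def gamma_of(v):
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

def F2(q, P, t, E, v):
    """Extended generating function in the form of (21.122)."""
    return gamma_of(v) * (P * q - E * t - v * (P * t - E * q / C ** 2))

def lorentz_rules(q, t, P, E, v):
    """Closed-form rules in the form of (21.123); returns (p, e, Q, T)."""
    g = gamma_of(v)
    p = g * (P + E * v / C ** 2)
    e = g * (E + v * P)
    Q = g * (q - v * t)
    T = g * (t - v * q / C ** 2)
    return p, e, Q, T

def rules_from_F2(q, t, P, E, v, h=1e-3):
    """Same rules obtained as central differences of F2."""
    p = (F2(q + h, P, t, E, v) - F2(q - h, P, t, E, v)) / (2 * h)
    e = -(F2(q, P, t + h, E, v) - F2(q, P, t - h, E, v)) / (2 * h)
    Q = (F2(q, P + h, t, E, v) - F2(q, P - h, t, E, v)) / (2 * h)
    T = -(F2(q, P, t, E + h, v) - F2(q, P, t, E - h, v)) / (2 * h)
    return p, e, Q, T

q, t, P, E, v = 2.0, 3.0, 0.5, 1.7, 0.6
closed = lorentz_rules(q, t, P, E, v)
numeric = rules_from_F2(q, t, P, E, v)
p, e, Q, T = closed

interval_old = (C * t) ** 2 - q ** 2
interval_new = (C * T) ** 2 - Q ** 2
mass_shell_old = e ** 2 - (p * C) ** 2
mass_shell_new = E ** 2 - (P * C) ** 2
print(closed, numeric)
```

Because F2 is bilinear in its arguments, the central differences reproduce the closed-form rules up to rounding errors.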
With the real angle α = arcosh γ = arsinh βγ , the linear transformations (21.124) and
(21.125) can be rewritten as orthogonal mappings, hence as the imaginary rotations
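$$\begin{aligned}
\begin{pmatrix}Q\\ icT\end{pmatrix} &= \begin{pmatrix}\cos i\alpha & \sin i\alpha\\ -\sin i\alpha & \cos i\alpha\end{pmatrix}\begin{pmatrix}q\\ ict\end{pmatrix},\\
\begin{pmatrix}P\\ iE/c\end{pmatrix} &= \begin{pmatrix}\cos i\alpha & \sin i\alpha\\ -\sin i\alpha & \cos i\alpha\end{pmatrix}\begin{pmatrix}p\\ ie/c\end{pmatrix}.
\end{aligned} \tag{21.126}$$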
Together with the rule (21.125) for the transformation of the energies e, E
$$e = \gamma E+\beta\gamma\,Pc \tag{21.129}$$
$$H = \gamma H'+\beta\gamma\,Pc. \tag{21.130}$$
EXAMPLE
= −δH1 . (21.136)
Thus, by means of the canonical equations (21.23) and the first-order transformation
rules (21.135), we have found that the characteristic function I (qν , pν ) that is con-
tained in the generating function (21.133) constitutes a constant of motion exactly if
the extended Hamiltonian H1 is invariant under the transformation (21.135) generated
by (21.133). In other words, if the rules (21.135) define a symmetry transformation
of the given Hamiltonian system then the characteristic function I (qν , pν ) of the gen-
erating function (21.133) constitutes a constant of motion. The correlation (21.136)
of a Hamiltonian system’s invariants I to the symmetry transformations that main-
tain the value of its extended Hamiltonian H1 establishes Noether’s theorem1 of point
mechanics in the extended Hamiltonian formalism
$$\frac{dI}{ds} = 0 \quad\Longleftrightarrow\quad \delta H_1 = 0. \tag{21.137}$$
1 Amalie “Emmy” Noether, German mathematician, b. March 2, 1882, Erlangen, Germany–d. April
14, 1935, Bryn Mawr, Pennsylvania, USA. Emmy Noether grew up in Erlangen in the family of a
mathematician and passed the state examination for teachers of foreign languages. When in 1903,
women were for the first time allowed to study at Bavarian universities, she took up studies of mathe-
matics in Erlangen and graduated in 1907. Following an invitation by Felix Klein and David Hilbert,
she then moved to Göttingen. She was not allowed a habilitation, the German qualification to become
a professor, until 1919; only in 1922 did she become an assistant professor, and she obtained her first
paid position as a professor in 1923. In 1928/29 she was a guest professor in Moscow, and in 1930 in
Frankfurt am Main. Because of her Jewish descent and her political views, she was forbidden to teach in 1933.
Noether emigrated to the US, where she found a position as a guest professor at the Bryn Mawr
Women’s College.
Emmy Noether’s work on the theory of invariants, the theory of ideals and of rings and modules
was instrumental to the development of modern algebra. Her theorem linking continuous symmetries
of a physical system to conserved quantities is one of the cornerstone concepts of modern physics.
2 In the original publication of Emmy Noether (“Invariante Variationsprobleme,” Nachr. Kgl. Ges.
Wiss. Göttingen, Math.-Phys. Kl. 1918, 235), the theorem was presented for continuous systems in
the Lagrangian formalism.
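$$\begin{aligned}
\delta u(q_\nu,p_\nu) &= \sum_\mu\left[\frac{\partial u}{\partial q_\mu}\,\delta q_\mu+\frac{\partial u}{\partial p_\mu}\,\delta p_\mu\right]\\
&= \sum_\mu\left[\frac{\partial u}{\partial q_\mu}\frac{\partial I}{\partial p_\mu}-\frac{\partial u}{\partial p_\mu}\frac{\partial I}{\partial q_\mu}\right].
\end{aligned} \tag{21.138}$$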
In the notation of extended Poisson brackets that was introduced in Example 21.13,
we may express the variation δu concisely as
δu = [u, I ]e . (21.139)
δu = Û u.
The dot in the Poisson bracket expression stands here as a placeholder for a function
the operator Û acts on. We refer to Û as the generator of the infinitesimal symmetry
transformation (21.135) that is associated with the invariant I . Obviously, I itself is
invariant under the symmetry transformation (21.135) which it generates
δI = Û I = [I, I ]e = 0. (21.141)
Two invariants I1 and I2 of a given Hamiltonian system H then define the two sym-
metry operators
The concatenations of the operators Û1 and Û2 can then be expressed as nested Poisson
brackets. Skipping the superscript e, this reads
$$\hat U_1\hat U_2 = \hat U_1[\,\cdot\,,I_2] = \bigl[[\,\cdot\,,I_2],I_1\bigr],\qquad \hat U_2\hat U_1 = \hat U_2[\,\cdot\,,I_1] = \bigl[[\,\cdot\,,I_1],I_2\bigr]. \tag{21.143}$$
The commutator {Û1, Û2}₋ of the operators Û1 and Û2 then defines a generally not
vanishing operator Û3, which is represented in terms of extended Poisson brackets as
$$\begin{aligned}
\hat U_3 &= \{\hat U_1,\hat U_2\}_- = \hat U_1\hat U_2-\hat U_2\hat U_1\\
&= \bigl[[\,\cdot\,,I_2],I_1\bigr]-\bigl[[\,\cdot\,,I_1],I_2\bigr] = -\bigl[I_1,[\,\cdot\,,I_2]\bigr]-\bigl[I_2,[I_1,\,\cdot\,]\bigr]\\
&= \bigl[\,\cdot\,,[I_2,I_1]\bigr]\\
&= [\,\cdot\,,I_3],\qquad\text{with } I_3 = [I_2,I_1].
\end{aligned} \tag{21.144}$$
generators of symmetry transformations thus provides another generator of a symmetry
transformation. The group of generators of symmetry transformations of a Hamiltonian
system thus forms in conjunction with the commutator operation a Lie–Poisson
algebra. For a given Hamiltonian system, we do not know a priori whether all its in-
variants have been found, and hence whether the Lie–Poisson algebra of symmetry
operators is complete. Yet, by applying Poisson’s theorem to all pairs of known invari-
ants, it is always possible to find a subset of invariants Ij that is closed with respect
to evaluating their mutual Poisson brackets. With respect to the set of generators of
symmetry transformations we thus find a sub-algebra for pertaining symmetry opera-
tors Ûj .
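A familiar illustration of such a closed set is furnished by the angular momentum components of a particle in three dimensions; this example is added here as an assumption for illustration and is not part of the text. The sketch evaluates Poisson brackets by central differences, checks that the bracket of two components yields the third, and verifies the commutator relation of the associated operators for a test function `u`. As none of the functions depends on t or e, the extended bracket reduces to the conventional one.

```python
def partial(func, x, k, h=1e-3):
    """Central-difference derivative of func with respect to x[k]."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    return (func(xp) - func(xm)) / (2 * h)

def bracket(F, G, x, n=3):
    """Poisson bracket [F, G] for x = [q_1..q_n, p_1..p_n]."""
    return sum(partial(F, x, i) * partial(G, x, n + i)
               - partial(F, x, n + i) * partial(G, x, i)
               for i in range(n))

# Angular momentum components as illustrative invariants of a central-force system.
def L1(x):
    return x[1] * x[5] - x[2] * x[4]    # q2 p3 - q3 p2

def L2(x):
    return x[2] * x[3] - x[0] * x[5]    # q3 p1 - q1 p3

def L3(x):
    return x[0] * x[4] - x[1] * x[3]    # q1 p2 - q2 p1

def u(x):                               # arbitrary quadratic test function
    return x[0] * x[4] + x[2] * x[2]

x0 = [0.3, -1.1, 0.7, 0.5, 0.2, -0.9]

# Poisson's theorem: the bracket of two invariants is again an invariant.
closure_error = abs(bracket(L1, L2, x0) - L3(x0))

# Commutator of the generators: U1 U2 u - U2 U1 u = [u, [I2, I1]].
def I3(x):
    return bracket(L2, L1, x)

lhs = (bracket(lambda x: bracket(u, L2, x), L1, x0)
       - bracket(lambda x: bracket(u, L1, x), L2, x0))
rhs = bracket(u, I3, x0)
commutator_error = abs(lhs - rhs)
print(closure_error, commutator_error)
```

All functions involved are quadratic polynomials, for which central differences are exact up to rounding, so both errors are tiny.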
EXAMPLE
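$$\begin{aligned}
&= F_2(q_\nu,P_\nu)-\sum_{\mu=0}^{n}Q_\mu P_\mu+\sum_{\mu=0}^{n}p_\mu\frac{\partial I}{\partial p_\mu}\\
&= F_1(q_\nu,Q_\nu)+\sum_{\mu=0}^{n}p_\mu\frac{\partial I}{\partial p_\mu}.
\end{aligned}$$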
In the last step, we have replaced the generating function of type F2(qν, Pν) by an
equivalent function of type F1(qν, Qν) according to (21.63). In our case of an
infinitesimal transformation, the generating function F1 may alternatively be expressed to
first order in the infinitesimal transformation parameter as
$$I(q_\nu,p_\nu) = \sum_{\mu=0}^{n}p_\mu\frac{\partial I}{\partial p_\mu}+f(q_\nu). \tag{21.145}$$
This equation is obviously fulfilled for functions I (qν , pν ) that are linear in the pν
$$I(q_\nu,p_\nu) = -\sum_{\mu=0}^{n}p_\mu\,\eta_\mu(q_\nu)+f(q_\nu). \tag{21.146}$$
The functions ημ (qν ) in (21.146) are defined to only depend on the extended set of
canonical variables qν , hence on the configuration space variables qj and time t in the
conventional description. With this I(qν, pν), the generating function (21.133) from
Example 21.19 defines the extended point transformation
Qμ = qμ − ημ (qν ), μ = 0, . . . , n. (21.147)
This defines the conventional Noether invariant of point mechanics. If we can find
functions ξ(qj , t), ηi (qj , t), and f (qj , t) for a given Hamiltonian H (qj , pj , t) that
satisfy dI /dt = 0, then I from (21.148) constitutes a constant of motion. The Hamil-
tonian form of Noether’s theorem presented here then states that the corresponding
symmetry of the given Hamiltonian system is given by an extended canonical point
transformation that is determined by those functions ξ(qj , t) and ηi (qj , t)
As a result of the fact that the Noether theorem from (21.148) represents only a special
case, not all invariants of a given system can be expressed by (21.148) in general. As
we see from the symmetry transformations (21.149) that are associated with invariants
of the form (21.148), only those symmetries that are represented by extended point
transformations are covered by the conventional Noether theorem.
Invariants that are not of the form of (21.148) are commonly referred to in liter-
ature as “non-Noether invariants,” an example of which we will encounter in Exam-
ple 21.21. The symmetry transformations associated with “non-Noether invariants”
do not constitute extended point transformations, and hence do not emerge straight-
forwardly in the context of the Lagrangian formalism.
EXAMPLE
The classical Kepler system is an example of a two-body problem whose masses in-
teract according to an inverse square force law. In its plane version, the Hamiltonian
of this system reads, in scaled coordinates (see also Example 19.9),
$$ H = \frac{1}{2} p_1^2 + \frac{1}{2} p_2^2 - \frac{K}{\sqrt{q_1^2 + q_2^2}}. \tag{21.150} $$
As the characteristic function I = I(q1, q2, p1, p2, t) that is contained in the
generating function for an infinitesimal canonical transformation (21.133) from
Example 21.19, we define
$$ I = -p_1 p_2\,\eta_1(q_1, q_2, t) - p_2^2\,\eta_2(q_1, q_2, t) + f(q_1, q_2, t). \tag{21.152} $$
The as yet unknown coefficients η1(q1, q2, t), η2(q1, q2, t), and f(q1, q2, t) contained
herein must now be determined so as to render I a constant of motion. With the
physical time t as the system’s independent variable, the condition therefore reads
$$ \frac{d}{dt}\Bigl[ -p_1 p_2\,\eta_1(q_1, q_2, t) - p_2^2\,\eta_2(q_1, q_2, t) + f(q_1, q_2, t) \Bigr] = 0. \tag{21.153} $$
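Inserting the canonical equations, we then find
$$
\begin{aligned}
&\frac{K}{\sqrt{(q_1^2 + q_2^2)^3}}\,\bigl(q_1 p_2 \eta_1 + q_2 p_1 \eta_1 + 2 q_2 p_2 \eta_2\bigr)
 - p_1 p_2 \left( \frac{\partial \eta_1}{\partial t} + p_1 \frac{\partial \eta_1}{\partial q_1} + p_2 \frac{\partial \eta_1}{\partial q_2} \right) \\
&\quad - p_2^2 \left( \frac{\partial \eta_2}{\partial t} + p_1 \frac{\partial \eta_2}{\partial q_1} + p_2 \frac{\partial \eta_2}{\partial q_2} \right)
 + \frac{\partial f}{\partial t} + p_1 \frac{\partial f}{\partial q_1} + p_2 \frac{\partial f}{\partial q_2} = 0.
\end{aligned}
\tag{21.154}
$$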
As this polynomial in the pj must be satisfied not only at one instant of time t0 but
for all times t, we conclude that the coefficients of each power p1^n p2^m, n, m = 0, . . . , 3,
must vanish separately. We thus obtain here eight separate conditions,
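$$
\begin{gathered}
\frac{\partial \eta_1}{\partial q_1} = 0, \quad
\frac{\partial \eta_2}{\partial q_2} = 0, \quad
\frac{\partial \eta_1}{\partial q_2} + \frac{\partial \eta_2}{\partial q_1} = 0, \quad
\frac{\partial \eta_1}{\partial t} = 0, \quad
\frac{\partial \eta_2}{\partial t} = 0, \quad
\frac{\partial f}{\partial t} = 0, \\
\frac{K}{\sqrt{(q_1^2 + q_2^2)^3}}\, q_2 \eta_1 + \frac{\partial f}{\partial q_1} = 0, \qquad
\frac{K}{\sqrt{(q_1^2 + q_2^2)^3}}\,(q_1 \eta_1 + 2 q_2 \eta_2) + \frac{\partial f}{\partial q_2} = 0.
\end{gathered}
\tag{21.155}
$$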
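From the conditions in the upper line, the following particular solutions emerge
$$ \eta_1 = q_2, \qquad \eta_2 = -q_1. $$
With these η1 and η2, the function
$$ f = -K\,\frac{q_1}{\sqrt{q_1^2 + q_2^2}} $$
is a solution of both conditions from the lower line. This shows that an invariant of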
the form of (21.152) exists. Inserting the solutions η1 , η2 , and f into (21.152) finally
yields
$$ I_{\mathrm{RL1}} = -p_1 p_2 q_2 + p_2^2 q_1 - K\,\frac{q_1}{\sqrt{q_1^2 + q_2^2}}. \tag{21.158} $$
This constant of motion represents one component of the Runge–Lenz vector, which is
referred to in the literature as a “non-Noether invariant.” Due to its quadratic momentum
terms, the invariant (21.158) cannot be expressed as a conventional Noether invariant
of the form of (21.148) from Example 21.20. By systematically defining appropriate
polynomial functions I (qν , pν ) of the pν , we may construct the invariants of Hamil-
tonian systems from the solutions of systems of coupled partial differential equations.
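The conservation of the invariant (21.158) can also be checked numerically. The following is a minimal sketch, not part of the original text: it assumes K = 1, an arbitrarily chosen bound initial state, and a hand-written fourth-order Runge–Kutta integrator; all function names are illustrative.

```python
import math

K = 1.0  # assumed value of the coupling constant in (21.150)

def rhs(state):
    """Canonical equations for H = p1^2/2 + p2^2/2 - K/sqrt(q1^2 + q2^2)."""
    q1, q2, p1, p2 = state
    r3 = (q1 * q1 + q2 * q2) ** 1.5
    return (p1, p2, -K * q1 / r3, -K * q2 / r3)

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    k1 = rhs(state)
    k2 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = rhs(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(s + h * (a + 2 * b + 2 * c + d) / 6
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def runge_lenz_1(state):
    """The invariant I_RL1 of (21.158)."""
    q1, q2, p1, p2 = state
    return -p1 * p2 * q2 + p2 * p2 * q1 - K * q1 / math.hypot(q1, q2)

def max_drift(state, h=1e-3, steps=20000):
    """Largest deviation of I_RL1 from its initial value along the orbit."""
    i0 = runge_lenz_1(state)
    drift = 0.0
    for _ in range(steps):
        state = rk4_step(state, h)
        drift = max(drift, abs(runge_lenz_1(state) - i0))
    return drift

if __name__ == "__main__":
    print(max_drift((1.0, 0.0, 0.3, 1.1)))
```

Along the integrated orbit the drift should stay at the level of the integration error, whereas the individual momenta change by amounts of order one.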
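The symmetry transformation that is associated with the invariant IRL1 follows
from the rules (21.135) of Example 21.19,
$$
\begin{aligned}
\delta q_1 &= -p_2 q_2, &
\delta p_1 &= -\left( p_2^2 - K\,\frac{q_2^2}{\sqrt{(q_1^2 + q_2^2)^3}} \right), \\
\delta q_2 &= 2 p_2 q_1 - p_1 q_2, &
\delta p_2 &= p_1 p_2 - K\,\frac{q_1 q_2}{\sqrt{(q_1^2 + q_2^2)^3}}.
\end{aligned}
$$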
22 Extended Hamilton–Jacobi Equation
This means that all transformed canonical variables Qμ , Pμ must be constants of motion.
Writing the variables for the index μ = 0 separately, we thus have
We may refer to this particular generating function as the extended action function
F2 ≡ S1(qν, Pν). According to the transformation rule H1′ = H1 for extended
Hamiltonians from (21.59), we obtain the transformed extended Hamiltonian H1′ ≡ 0
simply by expressing the original extended Hamiltonian H1 = 0 in terms of the
transformed variables. This means for the conventional Hamiltonian H(qj, pj, t),
according to (21.25) in conjunction with the transformation rules from (21.65),
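$$ \left[ H\!\left( q_j, \frac{\partial S_1}{\partial q_j}, t \right) + \frac{\partial S_1}{\partial t} \right] \frac{dt}{ds} = 0. $$
As we have dt/ds ≠ 0 in general, we finally get the generalized form of the
Hamilton–Jacobi equation,
$$ H\!\left( q_1, \ldots, q_n, \frac{\partial S_1}{\partial q_1}, \ldots, \frac{\partial S_1}{\partial q_n}, t \right) + \frac{\partial S_1}{\partial t} = 0. \tag{22.2} $$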
Equation (22.2) has exactly the form of the conventional Hamilton–Jacobi equation.
Yet, it is actually a generalization as the extended action function S1 represents an
extended generating function of type F2 , as defined by (21.63). This means that S1 is
also a function of the (constant) transformed energy E = −cP0 (0) = −β0 .
Summarizing, the extended Hamilton–Jacobi equation may be interpreted as defin-
ing the mapping of all canonical coordinates qj , pj , t , and e of the actual system into
constants Qj , Pj , T , and E. In other words, it defines the mapping of the entire dy-
namical system from its actual state at time t into its state at a fixed instant of time, T ,
which could be the initial conditions.
EXAMPLE
The problem is now to find a solution S1 for this nonlinear partial differential
equation. We start with the generating function of the extended canonical transformation
from Example 21.16, and restrict ourselves for simplicity to the case of zero damping
(F(t) ≡ 0),
$$ S_1(q_j, t, P_j, E) = \frac{1}{\sqrt{\xi(t)}} \sum_i q_i P_i + \frac{1}{4} \frac{\dot\xi(t)}{\xi(t)} \sum_i q_i^2 - E \int_0^t \frac{d\tau}{\xi(\tau)}. \tag{22.5} $$
In the first instance, we require only the transformed energy E ≡ −cP0 (0) ≡ −β0
in (22.5) to represent a constant. We insert the partial derivatives of S1 with respect to
Pi , qi , and time t
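$$
\begin{aligned}
Q_i &= \frac{\partial S_1}{\partial P_i} = \frac{q_i}{\sqrt{\xi}}, \\
p_i &= \frac{\partial S_1}{\partial q_i} = \frac{P_i}{\sqrt{\xi}} + \frac{1}{2} \frac{\dot\xi}{\xi}\, q_i, \\
-e &= \frac{\partial S_1}{\partial t} = -\frac{1}{2} \frac{\dot\xi}{\sqrt{\xi^3}} \sum_i q_i P_i + \frac{1}{4} \left( \frac{\ddot\xi}{\xi} - \frac{\dot\xi^2}{\xi^2} \right) \sum_i q_i^2 - \frac{E}{\xi}
\end{aligned}
\tag{22.6}
$$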
into the Hamilton–Jacobi equation (22.4). Expressed in terms of the transformed
coordinates, we find
$$ E - \frac{1}{2} \sum_i P_i^2 - \frac{1}{2}\,\Omega^2 \sum_i Q_i^2 = 0. \tag{22.7} $$
In this equation, the sum of all terms proportional to Σi Qi² was denoted by Ω²,
$$ \Omega^2(t) = \frac{1}{2}\,\xi \ddot\xi - \frac{1}{4}\,\dot\xi^2 + \xi^2 \omega^2(t). $$
For the required constant transformed energy E, (22.7) can only be satisfied for all t
if Ω² itself constitutes a constant of motion
$$ \frac{1}{2}\,\xi \ddot\xi - \frac{1}{4}\,\dot\xi^2 + \xi^2 \omega^2(t) = \Omega^2 = \text{const.} \tag{22.8} $$
With ξ(t) a solution of (22.8), the transformation rules for the coordinates are now
uniquely determined
$$ Q_i = \frac{q_i}{\sqrt{\xi}}, \qquad P_i = \sqrt{\xi}\, p_i - \frac{1}{2} \frac{\dot\xi}{\sqrt{\xi}}\, q_i. \tag{22.9} $$
The transformed Hamiltonian H′(Qj, Pj, T) follows now from the general transformation
rule (21.62) for extended generating functions F with ∂T/∂t = 1/ξ,
$$ H' - E = \xi(t)\,(H - e) \tag{22.11} $$
$$ H'(Q_j, P_j) = \frac{1}{2} \sum_{i=1}^{n} P_i^2 + \frac{1}{2}\,\Omega^2 \sum_{i=1}^{n} Q_i^2. \tag{22.12} $$
This Hamiltonian H′ does not explicitly depend on T if and only if ξ(t) represents a
solution of (22.8) with constant Ω². In this case, H′ corresponds to the Hamiltonian
of the ordinary (time-independent) harmonic oscillator. As required, the value of H′,
given by E ≡ −β0, is now a constant of motion. In contrast, the canonical coordinates
(Qi, Pi) of this system are not constant. The given task to map the time-dependent
harmonic oscillator (22.3) into a system where all transformed canonical coordinates
αμ, βμ are constants of motion is thus as yet performed for β0 only. However,
we may now in a second step transform the Hamiltonian (22.12) into a new Hamiltonian
that vanishes identically, H″ ≡ 0. To this end, we first set up the corresponding
We now try to find a solution S = S(Qj , T , βj ) that depends on the constant trans-
formed momenta βj = const. We may here restrict ourselves to a conventional action
function S as the Hamiltonian (22.12) does not explicitly depend on the independent
variable, i.e. the transformed time T . Due to the quadratic dependence of the Hamil-
tonian (22.12) on the Qi and the Pi , we try to determine the solution S(Qj , T , βj )
of (22.13) by defining
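$$ S = \frac{1}{2}\, a(T) \sum_i Q_i^2 + b(T) \sum_i Q_i \beta_i + \frac{1}{2}\, c(T) \sum_i \beta_i^2. \tag{22.14} $$
$$ \frac{\partial S}{\partial Q_i} = a Q_i + b \beta_i, \qquad \frac{\partial S}{\partial T} = \frac{1}{2}\,\dot a \sum_i Q_i^2 + \dot b \sum_i Q_i \beta_i + \frac{1}{2}\,\dot c \sum_i \beta_i^2 \tag{22.15} $$
$$ \frac{1}{2} \left( \dot a + a^2 + \Omega^2 \right) \sum_i Q_i^2 + \left( \dot b + a b \right) \sum_i Q_i \beta_i + \frac{1}{2} \left( \dot c + b^2 \right) \sum_i \beta_i^2 = 0. \tag{22.16} $$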
This equation can only be satisfied for arbitrary Qi at all times T if the coefficients
in (22.16) vanish separately
$$ \dot a + a^2 + \Omega^2 = 0, \qquad \dot b + a b = 0, \qquad \dot c + b^2 = 0. \tag{22.17} $$
Starting with the leftmost non-linear first-order differential equation for a(T), we may
solve this coupled set step-by-step
$$ a(T) = -\Omega \tan \Omega T, \qquad b(T) = \frac{1}{\cos \Omega T}, \qquad c(T) = -\frac{\tan \Omega T}{\Omega}. \tag{22.18} $$
We thus find the following action function S(Qj , T , βj ) as the solution of the
Hamilton–Jacobi equation (22.13)
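$$ S = -\frac{1}{2}\,\Omega \tan \Omega T \sum_i Q_i^2 + \frac{1}{\cos \Omega T} \sum_i Q_i \beta_i - \frac{1}{2} \frac{\tan \Omega T}{\Omega} \sum_i \beta_i^2. \tag{22.19} $$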
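That the action function (22.19) solves the Hamilton–Jacobi equation can be checked with finite differences. The sketch below is an illustration only: it assumes that (22.13) has the conventional form ∂S/∂T + ½(∂S/∂Q)² + ½Ω²Q² = 0 belonging to the Hamiltonian (22.12), restricts itself to one degree of freedom, and uses an arbitrarily chosen value of Ω.

```python
import math

OMEGA = 1.3  # assumed constant value of Omega

def action(Q, T, beta):
    """Action function (22.19) for a single degree of freedom."""
    tan_term = math.tan(OMEGA * T)
    return (-0.5 * OMEGA * tan_term * Q * Q
            + Q * beta / math.cos(OMEGA * T)
            - 0.5 * tan_term * beta * beta / OMEGA)

def hj_residual(Q, T, beta, h=1e-5):
    """Central-difference estimate of dS/dT + (dS/dQ)^2/2 + Omega^2 Q^2/2."""
    dS_dT = (action(Q, T + h, beta) - action(Q, T - h, beta)) / (2 * h)
    dS_dQ = (action(Q + h, T, beta) - action(Q - h, T, beta)) / (2 * h)
    return dS_dT + 0.5 * dS_dQ ** 2 + 0.5 * OMEGA ** 2 * Q * Q

if __name__ == "__main__":
    print(hj_residual(0.7, 0.4, -0.2))
```

The residual should vanish up to the discretization error for any point with cos ΩT ≠ 0.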
By means of this action function, we may now work out the relation of the coordinates
Qi(T) and Pi(T) with the integration constants αi and βi. We thereby show that
the action function S ≡ F2 represents exactly the generating function of a canonical
transformation that maps the Hamiltonian H′ from (22.12) into a new Hamiltonian
H″ ≡ 0 that vanishes identically. According to the general rules from (19.12), we
obtain for (22.19) the particular transformation rules
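$$
\begin{aligned}
Q_j &= \alpha_j \cos \Omega T + \beta_j\,\frac{\sin \Omega T}{\Omega}, \\
P_j &= -\alpha_j\,\Omega \sin \Omega T + \beta_j \cos \Omega T,
\end{aligned}
\tag{22.21}
$$
$$ H'' = H' - \frac{1}{2} \sum_i P_i^2 - \frac{1}{2}\,\Omega^2 \sum_i Q_i^2 \equiv 0. $$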
With H″ ≡ 0 our task has been accomplished. Having the Qj and the Pj represented
as functions of the integration constants αj and βj means that our system is
completely solved. We may, furthermore, merge the coordinate transformations of the
first and second step
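$$
\begin{pmatrix} q_j \\ p_j \end{pmatrix}
=
\begin{pmatrix}
\sqrt{\xi(t)} & 0 \\
\dfrac{1}{2} \dfrac{\dot\xi(t)}{\sqrt{\xi(t)}} & \dfrac{1}{\sqrt{\xi(t)}}
\end{pmatrix}
\begin{pmatrix}
\cos \Omega T & \dfrac{\sin \Omega T}{\Omega} \\
-\Omega \sin \Omega T & \cos \Omega T
\end{pmatrix}
\begin{pmatrix} \alpha_j \\ \beta_j \end{pmatrix},
\tag{22.22}
$$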
wherein 0 ≠ Ω = const. and ξ(t) denotes a solution of
$$ \frac{1}{2}\,\xi \ddot\xi - \frac{1}{4}\,\dot\xi^2 + \xi^2 \omega^2(t) = \Omega^2. \tag{22.23} $$
The transformed time T then follows from
$$ T = \int_0^t \frac{d\tau}{\xi(\tau)}. \tag{22.24} $$
With the particular initial conditions ξ(0) = 1 and ξ̇ (0) = 0, the constants αj , βj ob-
viously represent the values of the coordinates qj , pj at the instant of time t = 0.
Both the extended action function S1(qj, t, Pj, β0) from (22.5) and the conventional
action function S(Qj, T, βj) from (22.19) define canonical transformations. The
concatenation of both transformations is again a canonical transformation. In
principle, it is thus possible to find an extended action function
S1(qj, t, βj, β0) ≡ S1(qμ, βμ) that yields the solution of (22.4) with all βμ constants,
which amounts to solving the problem in one step.
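The combined transformation (22.22) to (22.24) can be compared with a direct numerical integration. The following sketch is an illustration under stated assumptions: one degree of freedom with the equations of motion q̇ = p, ṗ = −ω²(t) q, an arbitrarily chosen ω²(t), Ω = 1, the initial conditions ξ(0) = 1 and ξ̇(0) = 0, and a hand-written Runge–Kutta integrator. None of the function names appear in the text.

```python
import math

OMEGA = 1.0  # assumed value of the constant Omega in (22.23)

def omega_sq(t):
    """Illustrative time-dependent omega^2(t); this choice is an assumption."""
    return 1.0 + 0.3 * math.sin(0.7 * t)

def rhs(t, y):
    """Time derivatives of y = (q, p, xi, xi_dot, T)."""
    q, p, xi, xi_dot, T = y
    # (22.23) solved for the second derivative of xi
    xi_ddot = (2.0 * OMEGA ** 2 + 0.5 * xi_dot ** 2
               - 2.0 * xi ** 2 * omega_sq(t)) / xi
    # last component: dT/dt = 1/xi, i.e. (22.24)
    return (p, -omega_sq(t) * q, xi_dot, xi_ddot, 1.0 / xi)

def rk4_step(t, y, h):
    """One classical Runge-Kutta step for a time-dependent system."""
    k1 = rhs(t, y)
    k2 = rhs(t + 0.5 * h, tuple(a + 0.5 * h * k for a, k in zip(y, k1)))
    k3 = rhs(t + 0.5 * h, tuple(a + 0.5 * h * k for a, k in zip(y, k2)))
    k4 = rhs(t + h, tuple(a + h * k for a, k in zip(y, k3)))
    return tuple(a + h * (b + 2 * c + 2 * d + e) / 6
                 for a, b, c, d, e in zip(y, k1, k2, k3, k4))

def compare(alpha, beta, t_end=10.0, h=1e-3):
    """Return |q_direct - q_formula| and |p_direct - p_formula| at t_end."""
    y = (alpha, beta, 1.0, 0.0, 0.0)  # xi(0) = 1, xi_dot(0) = 0, T(0) = 0
    for n in range(round(t_end / h)):
        y = rk4_step(n * h, y, h)
    q, p, xi, xi_dot, T = y
    # second step (22.21): harmonic motion in the transformed time T
    Q = alpha * math.cos(OMEGA * T) + beta * math.sin(OMEGA * T) / OMEGA
    P = -alpha * OMEGA * math.sin(OMEGA * T) + beta * math.cos(OMEGA * T)
    # first step, written as in the left matrix of (22.22)
    q_formula = math.sqrt(xi) * Q
    p_formula = 0.5 * xi_dot / math.sqrt(xi) * Q + P / math.sqrt(xi)
    return abs(q - q_formula), abs(p - p_formula)

if __name__ == "__main__":
    print(compare(0.8, -0.5))
```

Both differences should be of the order of the integration error, which illustrates that α and β are indeed the initial values of q and p for this choice of ξ(0) and ξ̇(0).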
Part VII
Nonlinear Dynamics
The treatment of mechanics in these lectures would not be complete if we did not deal
at least briefly with a topic that has recently attracted much attention: nonlinear
dynamics and, as a special topic within it, the “theory of chaos.”
The starting point is the observation that ordered and regular motions like those
occurring in the harmonic oscillator, the pendulum, or the Kepler problem of plane-
tary motion are more an exception in nature than the standard case. One frequently
encounters erratic phenomena and phenomena that are unpredictable in the details.
A particularly striking example is the occurrence of turbulence in the flow of liquids.
Toward the end of the nineteenth century, the “father of nonlinear dynamics,” Henri
Poincaré,1 for the first time pointed out that an irregular behavior in mechanics is not
at all an unusual feature if the system being studied involves a nonlinear interaction.
Closely related is the—at first sight astonishing—insight that also very simple systems
may exhibit a highly complex dynamics. A simple deterministic differential equation
involving nonlinearities may have solutions the behavior of which over longer time
periods evolves quite irregularly and practically cannot be predicted. This is one of the
characteristic features of chaotic systems. The meaning of this concept, which may be
precisely defined in the frame of nonlinear dynamics, extends far beyond mechanics,
since the phenomenon of chaos arises in many fields not only of physics but also of
chemistry, biology, etc.
In the following sections, we shall learn quite a lot about general properties of
nonlinear dynamic systems. The time dependence and stability of their solutions will
be discussed, and concepts like attractors, bifurcations, and chaos will be introduced.
1 Jules-Henri Poincaré, French mathematician and physicist, b. April 29, 1854, Nancy–d. July 17,
1912, Paris. Poincaré studied at the École Polytechnique and the École des Mines and was a student of
Ch. Hermite. Soon after he received his doctorate, he obtained a chair at the Sorbonne in 1881, which
he held until his death. In pure mathematics, he became famous as the founder of algebraic topology
and of the theory of analytic functions of several complex variables. Further essential fields of work
were algebraic geometry and number theory. But Poincaré also dealt with applications of mathematics
to numerous physical problems, e.g., in optics, electrodynamics, telegraphy, and thermodynamics.
Together with Einstein and Lorentz, he founded the special theory of relativity. Poincaré’s work on
celestial mechanics, in particular, on the three-body problem, culminated in a monograph in three
volumes (1892–1899). In this context, he was the first to discover the appearance of chaotic
orbits in planetary motion. Poincaré has been called “the last universalist in mathematics” because of
the unusually broad scope of his interests.
However, a detailed treatment of nonlinear dynamics, its manifold problems, and in-
terdisciplinary applications exceeds the scope of this book.2 In particular, we cannot
deal in more detail with the important topic of chaos in Hamiltonian systems.
23 Dynamical Systems
A unified theoretical description may be given for many of the systems of interest.
A system is described by a finite set of dynamic variables that will be combined to
a column vector x = (x1, . . . , xN)^T ∈ ℝ^N. The state of the system at a given time t
is uniquely described by such a point x in phase space. The xi are generalized co-
ordinates that may represent a variety of quantities. Note that the vector x shall also
comprise the velocities (or momenta, respectively). We now assume that the system
behaves deterministically. Thus, the entire time evolution x(t) is determined if an
initial value x(t0 ) is given. The time evolution shall be described by a differential
equation of first order with respect to time:
$$ \frac{d}{dt}\,\mathbf{x}(t) = \mathbf{F}\bigl(\mathbf{x}(t), t; \lambda\bigr). \tag{23.1} $$
Here, F is in general a nonlinear function of the coordinates x (also called the ve-
locity field or vector field). Moreover, F may also still explicitly depend on the time
t, for example if varying external forces are acting on the system. If there is no such
dependence, the system is called autonomous. Finally, the third argument in (23.1)
shall indicate that possibly there exist one or several control parameters λ. These are
fixed given constants whose values affect the dynamics of the system and may pos-
sibly change the character of the dynamics. Typical control parameters are, e.g., the
coupling strength of an interaction, or the amplitude or frequency of an external per-
turbation imposed onto the system.
Note: A possible explicit time dependence in (23.1) may be eliminated by a simple
trick. For this purpose we consider a system with one additional degree of freedom,
and postulate for the additional vector component the differential equation
$$ \frac{d}{dt}\,x_{N+1} = 1. $$
With the initial condition xN+1(0) = 0, this simply implies xN+1(t) = t. Hence, the
time t on the right side of (23.1) may be replaced by xN+1, and we are dealing with an
autonomous system with one additional dimension.
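The trick described in this note can be sketched in a few lines. The following example is an illustration under stated assumptions: the helper `make_autonomous`, the driven oscillator used as test field, and the Runge–Kutta integrator are hypothetical names and choices that do not appear in the text.

```python
import math

def make_autonomous(f):
    """Wrap a time-dependent field f(x, t) as an autonomous field g(z).

    z = (x_1, ..., x_N, x_{N+1}); the extra component plays the role of the
    time and obeys d x_{N+1} / dt = 1.
    """
    def g(z):
        *x, clock = z
        return tuple(f(tuple(x), clock)) + (1.0,)
    return g

def driven_oscillator(x, t):
    """Illustrative non-autonomous field: oscillator with a periodic force."""
    pos, vel = x
    return (vel, -pos + 0.5 * math.cos(2.0 * t))

def rk4_step(g, z, h):
    """One classical Runge-Kutta step for an autonomous field g."""
    k1 = g(z)
    k2 = g(tuple(a + 0.5 * h * k for a, k in zip(z, k1)))
    k3 = g(tuple(a + 0.5 * h * k for a, k in zip(z, k2)))
    k4 = g(tuple(a + h * k for a, k in zip(z, k3)))
    return tuple(a + h * (b + 2 * c + 2 * d + e) / 6
                 for a, b, c, d, e in zip(z, k1, k2, k3, k4))

def integrate(g, z, h, steps):
    """Advance z by the given number of steps of size h."""
    for _ in range(steps):
        z = rk4_step(g, z, h)
    return z

if __name__ == "__main__":
    g = make_autonomous(driven_oscillator)
    # the last component starts at 0 and simply reproduces the time
    print(integrate(g, (1.0, 0.0, 0.0), 0.01, 500))
```

The integrator never receives the time explicitly; it only sees the enlarged state vector, whose last component reproduces t.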
The equation of motion (23.1) is very far-reaching, in spite of its simple shape. In
particular, it incorporates the Hamiltonian mechanics as a special case: For a system
with N degrees of freedom described by the generalized coordinates q1 , . . . , qN and
the associated canonical momenta p1 , . . . , pN , the Hamiltonian equations of motion
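$$ \dot q_i = \frac{\partial H}{\partial p_i}, \qquad \dot p_i = -\frac{\partial H}{\partial q_i} \tag{23.2} $$
hold. With the coordinates and momenta combined into the vector
x = (q1, . . . , qN; p1, . . . , pN)^T, they may be written compactly as
$$ \frac{d}{dt}\,\mathbf{x} = J\,\nabla_{\mathbf{x}} H. \tag{23.3} $$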
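∇x H stands for the gradient vector of the Hamiltonian function,
$$ \nabla_{\mathbf{x}} H = \left( \frac{\partial H}{\partial q_1}, \ldots, \frac{\partial H}{\partial q_N}; \frac{\partial H}{\partial p_1}, \ldots, \frac{\partial H}{\partial p_N} \right)^{T}, \tag{23.4} $$
and the 2N × 2N matrix J provides both the permutation of the components and
the correct signs:
$$ J = \begin{pmatrix} 0 & +I \\ -I & 0 \end{pmatrix}, \tag{23.5} $$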
where I denotes the N × N unit matrix. By the way, J has the following useful prop-
erties:
This mapping that depends on the time t as a parameter is called the phase flow or
simply the flow of the vector field F(x). For t = 0, the flow obviously reduces to the
identity mapping
$$ \Phi_{t=0} = I. \tag{23.8} $$
$$ \frac{d}{dt}\,\mathbf{x}(t) = \mathbf{F}\bigl(\mathbf{x}(t)\bigr) \tag{23.10} $$
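in an N-dimensional phase space. To this end, we consider a small volume element
ΔV(x) that at time t = t0 shall be at the position x = x0 and shall move with the flow.
In Cartesian coordinates, the volume is given by the product of the edge lengths,
$$ \Delta V(\mathbf{x}) = \prod_{i=1}^{N} \Delta x_i(\mathbf{x}). \tag{23.11} $$
The time derivative of this quantity is, according to the product rule, given by
$$
\frac{d}{dt}\,\Delta V(\mathbf{x})
= \sum_{i=1}^{N} \frac{d \Delta x_i(\mathbf{x})}{dt} \prod_{j \neq i}^{N} \Delta x_j(\mathbf{x})
= \underbrace{\prod_{j=1}^{N} \Delta x_j(\mathbf{x})}_{= \Delta V(\mathbf{x})} \; \sum_{i=1}^{N} \frac{1}{\Delta x_i(\mathbf{x})} \frac{d \Delta x_i(\mathbf{x})}{dt},
\tag{23.12}
$$
where the factor Δxi/Δxi = 1 has been inserted. Hence, the relative change
(= logarithmic time derivative) of the volume is
$$ \frac{1}{\Delta V(\mathbf{x})} \frac{d}{dt}\,\Delta V(\mathbf{x}) = \sum_{i=1}^{N} \frac{1}{\Delta x_i(\mathbf{x})} \frac{d \Delta x_i(\mathbf{x})}{dt}. \tag{23.13} $$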
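The change of the edge lengths of the volume1 may be calculated from the equation of
motion (23.10). Let us consider the distance between two edges of the cube along the
i-direction which are determined by the trajectories x0(t) with x0(t0) = x0 and x(t)
with x(t0) = x0 + ei Δxi:
$$ \left. \frac{d \Delta x_i}{dt} \right|_{t_0} = \left. \frac{d}{dt} \bigl( x_i(t) - x_{0i}(t) \bigr) \right|_{t_0} = F_i(\mathbf{x}_0 + \mathbf{e}_i \Delta x_i) - F_i(\mathbf{x}_0). $$
For small deviations Δxi, the Taylor expansion of F(x) yields to first order
$$ \left. \frac{d \Delta x_i}{dt} \right|_{t_0} = \left. \frac{\partial F_i}{\partial x_i} \right|_{\mathbf{x}_0} \Delta x_i. \tag{23.15} $$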
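$$ \Lambda(\mathbf{x}) := \frac{1}{\Delta V(\mathbf{x})} \frac{d}{dt}\,\Delta V(\mathbf{x}) = \sum_{i=1}^{N} \frac{\partial F_i}{\partial x_i} = \nabla \cdot \mathbf{F}. \tag{23.16} $$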
The rate of change of the phase-space volume is therefore determined by the diver-
gence of the velocity field F.
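The statement can be illustrated by estimating the divergence of a velocity field numerically. The sketch below uses central differences; the two example fields (a pendulum-like Hamiltonian system with H = p²/2 − cos q, and the same system with an added friction term of strength 0.2) are assumptions chosen for illustration and do not appear in the text.

```python
import math

def divergence(F, x, h=1e-6):
    """Central-difference estimate of sum_i dF_i/dx_i at the point x."""
    total = 0.0
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        total += (F(x_plus)[i] - F(x_minus)[i]) / (2 * h)
    return total

def hamiltonian_field(x):
    """Velocity field (dH/dp, -dH/dq) for the assumed H = p^2/2 - cos(q)."""
    q, p = x
    return (p, -math.sin(q))

def damped_field(x):
    """The same field with an additional friction term -0.2 p."""
    q, p = x
    return (p, -math.sin(q) - 0.2 * p)

if __name__ == "__main__":
    point = (0.4, -1.3)
    print(divergence(hamiltonian_field, point))  # volume-preserving flow
    print(divergence(damped_field, point))       # contracting flow
```

For the Hamiltonian field the estimate vanishes within rounding error, while the friction term produces a constant negative divergence, i.e., a shrinking phase-space volume.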
Liouville’s theorem is included in (23.16) as a special case. According to
(23.3)–(23.5), the velocity field of a Hamiltonian system with the coordinates x =
(q1 , . . . , qN ; p1 , . . . , pN )T reads
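$$ \mathbf{F}(\mathbf{x}) = \left( \frac{\partial H}{\partial p_1}, \ldots, \frac{\partial H}{\partial p_N}; -\frac{\partial H}{\partial q_1}, \ldots, -\frac{\partial H}{\partial q_N} \right)^{T}. \tag{23.17} $$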
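$$
\Lambda = \nabla \cdot \mathbf{F}
= \sum_{i=1}^{N} \frac{\partial}{\partial q_i} F_i + \sum_{i=1}^{N} \frac{\partial}{\partial p_i} F_{N+i}
= \sum_{i=1}^{N} \frac{\partial}{\partial q_i} \frac{\partial H}{\partial p_i} - \sum_{i=1}^{N} \frac{\partial}{\partial p_i} \frac{\partial H}{\partial q_i} = 0,
\tag{23.18}
$$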
1 Strictly speaking, the shape of ΔV is distorted, and the edges do not remain orthogonal to each
other. But this is of no consequence when calculating the volume to lowest order.
an equilibrium point and the motion comes to rest (see the section on limit cycles).
There is, however, also the possibility that the volume is shrinking and the distance
between the trajectories is being reduced only along certain directions, while they are
diverging in other directions. In this case the resulting distance even increases with
time. An originally localized region in phase space is so to speak “rolled out” and
widely distributed by the dynamic flow. The shrinking of the volume towards zero
then means that an originally N -dimensional hypercube in phase space changes over
to a geometric object with lower dimension D < N . D may even take a nonintegral
value, as will be explained in Chap. 26.
23.2 Attractors
The dynamics of a nonlinear system may be highly complicated. It is convenient to
distinguish between transient and asymptotic behavior. A transient process denotes
the initial behavior of a system after starting from a given point x0 in phase space.
Naturally, it is particularly difficult to make general statements, since the transients
depend on the particular initial condition. Theorists therefore tend to ignore this part
of the trajectory, even if it may play an important role in practice, depending on the
dominant time scales. Only recently has the study of transients received more attention.
The systematic treatment of the asymptotic or stationary behavior of a system is
somewhat simpler. “Stationary” shall not mean here that the system is at rest but only
that possible transient phenomena have faded away. In dissipative systems, which will
be treated here, the trajectories will asymptotically approach a subset of the phase
space of lower dimension, a so-called attractor.
The definition and correct mathematical classification of attractors is not entirely
simple. Actually there are several concepts in the literature that differ from each other in
detail. Here we first give a mathematical definition2 but shall also illustrate the concept
of the attractor by various examples in the subsequent chapters.
Let us consider a vector field F(x) on a space M (e.g., M = ℝ^N) with an associated
phase flow Φt. A subset A ⊂ M is called an attractor if it fulfills the following
criteria:
(1) A is compact.
(2) A is invariant under the phase flow Φt.
(3) A has an open environment U that contracts to A under the flow.
This statement needs several explanations:
(1) A set is called compact if it is closed and bounded. This means that any limit
value of an infinite sequence belongs itself to the set, and the set cannot extend up
to infinity. “Exploding” solutions where for example particles escape to infinity
therefore cannot be attractors.
(2) Invariance under the phase flow means that Φt(A) = A for all t.
2 F. Scheck, Mechanik, Springer (1992). This book is also available in English: F. Scheck, Mechan-
ics: From Newton’s Laws to Deterministic Chaos, 3rd edition, Springer (1999).
(3) This may be formulated in two steps. First, the environment U ⊃ A is larger than
the attractor itself, since we are dealing with an open set that includes the
compact A. U shall be positively invariant, i.e.,
$$ \Phi_t(U) \subseteq U \quad \text{for all } t \ge 0. \tag{23.20} $$
If a point once lies within U, then it cannot leave it. It will, on the contrary, even be
pulled toward A, which may be formulated as follows: For any open environment
V of A that lies completely within U, i.e., A ⊂ V ⊂ U, one can find a time tV
after which the image of U lies entirely within V:
$$ \Phi_t(U) \subset V \quad \text{for all } t > t_V. \tag{23.21} $$
Since V may be chosen arbitrarily “close” about A, this means that for large time
values U is shrinking toward the attractor A.
Frequently, the definition of an attractor is still extended by the requirement
that it shall consist of one piece only.
(4) A cannot be separated into several closed nonoverlapping invariant subsets.
An important property of an attractor is its domain of attraction. The maximum
environment U that contracts to A is called the basin of attraction B. In correct math-
ematical formulation, B is the union of all open environments of A that fulfill the
conditions (23.20) and (23.21).
The introduction of the concept of attractor given here is rather complex. This is
justified, however, by the fact that attractors may have very complex properties. Of
central importance for nonlinear dynamics are the concepts of strange and chaotic
attractors, which sometimes—not quite correctly—are used as synonyms. These con-
cepts will become fully transparent only in the subsequent chapters and by examples.
But we shall present the definitions now:
Chaotic attractor: The motion is extremely sensitive with respect to the initial con-
ditions. The distance between two initially closely neighboring trajectories increases
exponentially with time. For more details see Chap. 26.
Strange attractor: The attractor has a strongly rugged geometrical shape that is
described by a fractal. For more details see Chap. 26.
Both of these properties usually occur together. There are, however, also
examples3 where an attractor is chaotic but not strange, or is strange but not chaotic.
3 C. Grebogi, E. Ott, S. Pelikan and J.A. Yorke, Physica 13D, 261 (1984).
23.3 Equilibrium Solutions
Such an x0 is also called a critical point or fixed point. Of particular interest is the
question of whether or not the system is moving toward such a fixed point and—if
several ones exist—to which of them. A fixed point that attracts the trajectories is the
simplest example of an attractor. In this case the set A defined in the previous section
is trivial and consists of a single point.
We are therefore interested in the stability of equilibrium solutions. To this end we
consider the trajectories x(t) in the vicinity of a critical point x0. We thus require that
the distance
$$ \boldsymbol{\xi}(t) = \mathbf{x}(t) - \mathbf{x}_0 $$
be a small quantity. Under this condition, the problem may be greatly simplified, since
it usually suffices to take only the lowest term of the Taylor expansion of F(x) into
account. The linearized equation of motion then reads
$$ \frac{d}{dt}\,\boldsymbol{\xi}(t) = M \boldsymbol{\xi}(t), \tag{23.24} $$
where terms of quadratic or higher order in ξ were neglected. M denotes the Jacobi
matrix (functional matrix) of the function F(x) evaluated at the position x0 . This ma-
trix has the elements
$$ M_{ik} = \left. \frac{\partial F_i}{\partial x_k} \right|_{\mathbf{x}_0}. \tag{23.25} $$
Contrary to the original nonlinear equation of motion (23.1), the solution of the lin-
earized problem (23.24) is in principle simple; it may be given analytically. Let us
first consider the trivial special case of a one-dimensional system (N = 1). The Jacobi
matrix then has only a single element, say, μ, and (23.24) is solved by ξ(t) = ξ(0) e^{μt}.
Mu = μu. (23.28)
This N -dimensional linear system of equations only has nontrivial solutions if the
determinant
$$ \boldsymbol{\xi}(t) = \sum_{n=1}^{N} c_n\, e^{\mu_n t}\, \mathbf{u}_n, \tag{23.30} $$
where the expansion coefficients cn may be determined from the initial condition at
t = 0. The eigenvalues μn may be real or complex. Complex eigenvalues always occur
in conjugate pairs: If μn solves (23.29), then the complex conjugate μn* also
solves the equation, since the Jacobi matrix Mij is real.
The real parts of the eigenvalues of the characteristic equation are decisive for
characterizing an equilibrium point x0. We now define a stricter form of the condition
of stability: An equilibrium point x0 with F(x0) = 0 is called asymptotically stable if
there exists an environment U ∋ x0 within which all trajectories are running toward
x0 for large times:
$$ \lim_{t \to \infty} \mathbf{x}(t) = \mathbf{x}_0 \quad \text{for } \mathbf{x}(t_0) \in U. $$
If the function (the vector field) F is sufficiently smooth so that it can be described
by the linear approximation, one may immediately give a sufficient condition for
asymptotic stability: The point x0 is asymptotically stable if all eigenvalues of the Jacobi
matrix have a negative real part, i.e., if
$$ \operatorname{Re} \mu_n < 0 \quad \text{for all } n = 1, \ldots, N. $$
Conversely, if at least one of the eigenvalues has a positive real part, Re μn > 0, then
x0 is an unstable fixed point, since displacements along un are increasing exponentially.
With the knowledge of the eigenvectors un, the total phase space may be decomposed into
subspaces. The stable (or unstable) subspace is spanned by all vectors un satisfying
Re μn < 0 (or > 0). In addition, a subspace may occur with the special value
Re μn = 0. If this happens, one speaks of a degenerate fixed point. (The associated
subspace is also called the center; but we shall not deal here in more detail with the
related problems.) If one considers a general perturbation of a trajectory, it will have
components in all subspaces. After a sufficiently long time the contribution with
the maximum Re μn will dominate.
Finally, we note that the linear stability analysis holds only in the vicinity of a
critical point x0 . It may be shown mathematically that the topological behavior of the
flow does not change there under the influence of the nonlinearity. But this vicinity
may be very small, so that one cannot make a statement about the global behavior
of the flow in this way.
EXAMPLE
The stability analysis becomes particularly transparent for the case N = 2 that corre-
sponds to a dynamic system with one degree of freedom x1 = q and the associated
momentum x2 = p. In the vicinity of a fixed point ẋ = F(x0 ) = 0, the motion is deter-
mined in a linear approximation by the four elements of the Jacobi matrix Mij . The
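characteristic equation (23.29)
$$ \begin{vmatrix} M_{11} - \mu & M_{12} \\ M_{21} & M_{22} - \mu \end{vmatrix} = 0 \tag{23.34} $$
is a quadratic equation in μ,
$$ \mu^2 - (M_{11} + M_{22})\,\mu + M_{11} M_{22} - M_{12} M_{21} = 0 \tag{23.35} $$
or
$$ \mu^2 - 2 s \mu + d = 0 \tag{23.36} $$
with
$$ s = \frac{1}{2} (M_{11} + M_{22}) = \frac{1}{2} \operatorname{Tr} M, \qquad d = M_{11} M_{22} - M_{12} M_{21} = \det M. \tag{23.37} $$
The two solutions of (23.36) may be given explicitly:
$$ \mu_{1/2} = s \pm \sqrt{s^2 - d}. \tag{23.38} $$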
Depending on the magnitude and sign of the two constants s and d, there are several
distinct possibilities for the eigenvalues μ1, μ2:
(a) μ1, μ2 real and both negative (if s < 0 and 0 < d < s²): stable node
(b) μ1, μ2 real and both positive (if s > 0 and 0 < d < s²): unstable node
(c) μ1, μ2 real with opposite signs (if d < 0): saddle
(d) μ1 = μ2*, negative real part (if s < 0 and d > s²): stable spiral
(e) μ1 = μ2*, positive real part (if s > 0 and d > s²): unstable spiral
(f) μ1 = μ2*, purely imaginary (if s = 0 and d > 0): rotor
The ranges are represented in Fig. 23.2 in the s, d-plane. To these alternatives, there
correspond distinct types of trajectories ξ (t) = x(t) − x0 according to (23.30).
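The case distinction (a) to (f) translates directly into a small classification routine. The following is a minimal sketch; the function name and the label returned for the borderline cases d = 0 and d = s², which the list above does not cover, are choices made here for illustration.

```python
def classify_fixed_point(m11, m12, m21, m22):
    """Classify a fixed point from the 2x2 Jacobi matrix via s and d of (23.37)."""
    s = 0.5 * (m11 + m22)        # half the trace
    d = m11 * m22 - m12 * m21    # determinant
    if d < 0:
        return "saddle"                      # case (c)
    if d == 0 or d == s * s:
        return "borderline"                  # not covered by (a)-(f)
    if d < s * s:                            # real eigenvalues of equal sign
        return "stable node" if s < 0 else "unstable node"   # cases (a), (b)
    # d > s^2: complex-conjugate pair of eigenvalues
    if s < 0:
        return "stable spiral"               # case (d)
    if s > 0:
        return "unstable spiral"             # case (e)
    return "rotor"                           # case (f)

if __name__ == "__main__":
    print(classify_fixed_point(-1.0, 0.0, 0.0, -2.0))
    print(classify_fixed_point(0.0, 1.0, -1.0, -0.2))
    print(classify_fixed_point(0.0, 1.0, -1.0, 0.0))
```

Only s and d enter the decision, so the routine does not need to compute the eigenvalues (23.38) explicitly.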
Figure 23.3 illustrates how the trajectories in the vicinity of a stable node are
running into the fixed point:
$$ \boldsymbol{\xi}(t) = c_1\, e^{\mu_1 t}\, \mathbf{u}_1 + c_2\, e^{\mu_2 t}\, \mathbf{u}_2, $$
where u1 and u2 are the (not necessarily orthogonal) eigenvectors. The curvature of
the trajectories arises if μ1 ≠ μ2. These curves are parabola-like, with a common
tangent at the origin (in u1- or u2-direction depending on whether μ2 or μ1 is larger). The
trajectories for the unstable node, Fig. 23.3(b), have the same shape but are traversed in
the opposite direction (exponential “explosion”). For the case of a saddle the
trajectories are running in the u1-direction (let μ1 < μ2 without loss of generality)
toward the fixed point but are pushed off in the u2-direction, which results in the
hyperbola-like trajectories of Fig. 23.3(c).
If the eigenvalues are complex,
The general solution (23.30) then has the form
where c2 = c1∗ to get ξ real. If the constant c1 , the value of which is fixed by the
initial condition ξ (0), is split into magnitude and phase, and the same is done for the
Cartesian components of the complex eigenvector u1 ,
The factor in brackets describes harmonic vibrations shifted in phase relative to each
other (if α ≠ β). One thus has the parametric representation of an ellipse. Due to the
prefactor, the size of the ellipse varies exponentially with time. Thus, the trajectories
are logarithmic spirals moving toward the fixed point or away from it, depending on
the sign of the real part of μ; see Fig. 23.3(d), (e); hence the name spiral. The case
of the rotor with Re μ = 0 plays a particular role, since the trajectories in the vicinity of
x0 are periodic functions (concentric ellipses). This means that the equilibrium point
is stable (small displacements are not amplified) but not asymptotically stable (the
trajectory does not run into the fixed point), and hence this point is not an attractor.
EXERCISE
ẍ + α ẋ + βx + γ x 3 = 0. (23.46)
Show that the system is dissipative. Interpret the individual terms and discuss the
possible fixed points and their stability.
Solution. We are dealing with a harmonic oscillator involving friction and
nonlinearity. Besides the linear restoring force of the harmonic oscillator (third term), there
acts a friction force proportional to the velocity (second term). Moreover, a cubic
nonlinearity (fourth term) becomes important. This force law corresponds to a potential
$$ V(x) = \frac{m}{2}\,\beta x^2 + \frac{m}{4}\,\gamma x^4, \tag{23.47} $$
where m denotes the mass. We obtain various types of motion, depending on the
magnitude and sign of the constants in (23.46). We first rewrite the equation of
motion (23.46) in the standard form. For this purpose, we introduce the velocity as
a second coordinate, x = (x, y) = (x, ẋ), which leads to the coupled differential
equations of first order:
$$ \dot{\mathbf{x}} = \frac{d}{dt} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} y \\ -\alpha y - \beta x - \gamma x^3 \end{pmatrix} \equiv \mathbf{F}(\mathbf{x}). \tag{23.48} $$
For α > 0, the system is dissipative, since the divergence of the velocity field is
$$ \Lambda = \nabla \cdot \mathbf{F} = \frac{\partial}{\partial x}\, y + \frac{\partial}{\partial y} \left( -\alpha y - \beta x - \gamma x^3 \right) = -\alpha < 0. \tag{23.49} $$
The fixed points follow from F(x0) = 0, i.e.,
$$ y = 0, \qquad x \left( \beta + \gamma x^2 \right) = 0. \tag{23.50} $$
Hence, besides the equilibrium position x0 = (0, 0) without displacement, there still
occur two further symmetrically positioned fixed points x0 = (±√(−β/γ), 0), provided
that the constants β and γ have opposite signs. Figure 23.4 shows the associated
potential functions V(x) for all combinations of signs.
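The characteristic equation (23.36) in Example 23.1 for the eigenvalues involves the
following coefficients:
$$
\begin{aligned}
&\text{For } \mathbf{x}_0 = (0, 0): & s &= -\tfrac{1}{2}\alpha, & d &= \beta, \\
&\text{For } \mathbf{x}_0 = \bigl( \pm\sqrt{-\beta/\gamma},\, 0 \bigr): & s &= -\tfrac{1}{2}\alpha, & d &= -2\beta.
\end{aligned}
$$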
Obviously, asymptotic stability may occur only for a positive sign of the constant
α > 0. Only then is one dealing physically with a damping friction term. For the fixed
point in the rest position x0 = (0, 0), we get the alternatives
(1) β > ¼α²: stable spiral
(2) 0 < β < ¼α²: stable node
(3) β < 0: saddle
In the first case, there arise weakly damped vibrations, in the second case the oscillator
is overdamped, and the displacement monotonically tends to zero. For β < 0, the equi-
librium position is unstable, as may be seen from the potential plots in Fig. 23.4(c), (d).
23.4 Limit Cycles 475
√
The analogous considerations for the fixed points x0 = ± −β/γ , 0 lead to (assum- Exercise 23.2
ing α > 0):
(1) −2β > α²/4 stable spiral
(2) 0 < −2β < α²/4 stable node
(3) β > 0 saddle
The factor 2 arises because the curvature of the potential (23.47) in the equilibrium
positions with finite displacement is twice as large as in the rest position. Only the
double-oscillator potential (Fig. 23.4(c)) allows stable displaced fixed points (β < 0
and γ > 0).
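The case distinction above can be reproduced with a few lines of code. The following is a minimal sketch in Python, assuming that s denotes half the trace and d the determinant of the Jacobian of (23.48) at the fixed point, as the listed coefficients suggest. The function names are illustrative and do not appear in the text.

```python
import math

def fixed_points(beta, gamma):
    """Solve y = 0, x (beta + gamma x^2) = 0, cf. (23.50)."""
    points = [(0.0, 0.0)]
    if gamma != 0.0 and -beta / gamma > 0.0:
        x_star = math.sqrt(-beta / gamma)
        points += [(x_star, 0.0), (-x_star, 0.0)]
    return points

def classify(x0, alpha, beta, gamma):
    """Classify the fixed point (x0, 0) of the damped cubic oscillator.

    The Jacobian of (23.48) is [[0, 1], [-beta - 3 gamma x0^2, -alpha]];
    s is half its trace and d its determinant (assumed convention).
    """
    s = -alpha / 2.0
    d = beta + 3.0 * gamma * x0 ** 2
    if d < 0.0:
        return "saddle"
    if d == 0.0 or s == 0.0:
        return "marginal"  # not decided by the linear analysis
    kind = "spiral" if d > s * s else "node"
    return ("stable " if s < 0.0 else "unstable ") + kind

if __name__ == "__main__":
    # Double-well case: beta < 0, gamma > 0, with damping alpha > 0.
    for point in fixed_points(-1.0, 1.0):
        print(point, classify(point[0], 1.0, -1.0, 1.0))
```

For β < 0 and γ > 0 the sketch reports a saddle at the origin and stable displaced fixed points, in line with the discussion of the double-oscillator potential.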
It is instructive to plot the position of the fixed points as a function of the pa-
rameter β. As is seen from Fig. 23.5, for β = 0 there occurs a square-root branch-
ing. For γ > 0, a stable equilibrium position bifurcates into two new stable solutions.
Such bifurcations (Lat. furca = fork) frequently occur in nonlinear systems; see also
Chap. 25.
23.4 Limit Cycles

Besides the simple stationary equilibrium points studied in detail in the section on
attractors, a dynamic system may exhibit still other types of stable solutions. These
are the so-called limit cycles that are characterized by periodically oscillating closed
trajectories. Similar to the fixed points discussed already, limit cycles may also act as
attractors of motion; compare the section on attractors. Then there exists a more or less
extended range in phase space (the “basin of attraction” of the attractor): trajectories
starting from there move toward the limit cycle, which is approached for t → ∞. For
limit cycles, one may also perform a mathematical stability analysis as for fixed points
which by its very nature is somewhat more difficult.
We shall concentrate here on a special but typical example, namely a harmonic
oscillator with a nonlinear friction term. The associated differential equation
has the general form
$$ \frac{d^2x}{dt^2} + f(x)\,\frac{dx}{dt} + \omega^2 x = 0. \tag{23.52} $$
If the middle term is absent, we obtain a harmonic oscillator with angular frequency ω.
The case of a constant coefficient, f (x) = α = constant, leads to a linear differential
equation that may be solved easily. The character of the solution is determined by an
exponential factor exp (−αt/2). The solution thus decreases exponentially toward the
fixed point at x = ẋ = 0 if α is positive. A negative value of α means that a force is
acting along the same direction as that of the instantaneous velocity which leads to an
unlimited amplification of the solution (negative damping). Physically one of course
no longer deals with a friction force; rather an external source must exist that pumps
energy into the system.
If allowance is made for more general functions f (x), it may happen that the damp-
ing coefficient takes partly positive, partly negative values, depending on the displace-
ment. Of particular interest is the case when f (x) is negative for small magnitudes of
x, and positive for large displacements. The simplest ansatz providing such a behavior
is a quadratic polynomial
$$ f(x) = \alpha\,(x^2 - x_0^2), \tag{23.53} $$
where α determines the strength of the damping/excitation and the two zeros are at x =
±x0. The zeros may be set to the value 1 without loss of generality by rescaling the
variable to x′ = x/x0 with α′ = αx0². For convenience one may also choose the value
1 for the frequency by rescaling the time: t′ = ωt with α″ = α′/ω. The standard form
of the equation of motion then reads (dropping the primes again):
$$ \frac{d^2x}{dt^2} + \alpha\,(x^2 - 1)\,\frac{dx}{dt} + x = 0. \tag{23.54} $$
This differential equation was set up and discussed in 1926 by the Dutch engineer
B. van der Pol. It served first for describing an electronic oscillator circuit with
feedback (at that time still with valves), but it was already clear to the author that his
equation could be applied to a variety of vibrational processes. Actually the origin
of this equation may be traced back even further, since around 1880 Lord Rayleigh
investigated the following differential equation in the context of nonlinear vibrations:
$$ \frac{d^2v}{dt^2} + \alpha\left[\frac{1}{3}\left(\frac{dv}{dt}\right)^3 - \frac{dv}{dt}\right] + v = 0. \tag{23.55} $$
One easily sees the relation between (23.54) and (23.55). We have only to differentiate
the Rayleigh equation (23.55) with respect to time and then substitute
$$ \frac{dv}{dt} = x \tag{23.56} $$
to get the van der Pol equation (23.54). Thus, both equations are essentially equivalent
to each other.
We now discuss the solutions of the van der Pol equation (23.54). It may be trans-
formed as usual to the standard form (23.1) of two coupled differential equations of
first order for the vector x(t) = (x, y)T :
$$ \frac{dx}{dt} = y, \tag{23.57} $$
$$ \frac{dy}{dt} = -x - \alpha\,(x^2 - 1)\,y. \tag{23.58} $$
It is now advantageous to transform to polar coordinates in the x, y-phase space:
$$ x = r\cos\theta, \qquad y = r\sin\theta. \tag{23.59} $$
The time derivatives of r and θ may be expressed by those of x and y. For the radius
coordinate the relation follows immediately from the differentiation of r 2 = x 2 + y 2 :
$$ r\,\frac{dr}{dt} = x\,\frac{dx}{dt} + y\,\frac{dy}{dt}. \tag{23.60} $$
An analogous relation for the angle coordinate may be obtained from the time derivatives
of (23.59):
$$ \frac{dx}{dt} = \frac{dr}{dt}\cos\theta - r\,\frac{d\theta}{dt}\sin\theta, \tag{23.61} $$
$$ \frac{dy}{dt} = \frac{dr}{dt}\sin\theta + r\,\frac{d\theta}{dt}\cos\theta. \tag{23.62} $$
By multiplying the first of these equations by y and the second one by x and subtract-
ing both equations, one obtains
$$ r^2\,\frac{d\theta}{dt} = x\,\frac{dy}{dt} - y\,\frac{dx}{dt}. \tag{23.63} $$
Using (23.60) and (23.63), the van der Pol system of equations in polar coordinates
reads as follows:
$$ \frac{dr}{dt} = -\alpha\,(r^2\cos^2\theta - 1)\,r\sin^2\theta, \tag{23.64} $$
$$ \frac{d\theta}{dt} = -1 - \alpha\,(r^2\cos^2\theta - 1)\,\sin\theta\cos\theta. \tag{23.65} $$
The nonlinear terms on the right-hand side have a rather complex shape, but one may
give some qualitative statements on the solutions to be expected. In the limit α = 0,
one has of course a normal harmonic oscillator. The trajectories in phase space are
circles which are traversed uniformly with the frequency 1, such that
$$ r(t) = \rho, \qquad \theta(t) = -(t - t_0), \tag{23.66} $$
with arbitrary ρ and t0. Due to the nonlinearity in (23.64) and (23.65), the behavior of
the solution is modified. As long as α ≪ 1, the influence on the revolution frequency
remains small: Since the function sin θ cos θ changes its sign twice in each period,
the influence of the nonlinear term in (23.65) cancels out on the average. It is quite
different, however, for the radial motion: Here sin2 θ is positive definite, and small
changes of the radius may accumulate from period to period. The evolution direction
of the effect is determined by the sign of −α(r 2 cos2 θ − 1). In the following we
shall discuss the (more interesting) case α > 0 (the set of solutions for α < 0 may be
obtained by inversion of the time coordinate t → −t ).
For small displacements r < 1, the factor −α(r 2 cos2 θ − 1) is then always positive,
and the radius increases slowly but monotonically. For large displacements r ≫ 1,
on the contrary, the factor is predominantly negative (except for the vicinity of the
zeros of cos θ ), and the mean radius decreases from cycle to cycle. A more detailed
investigation as is performed in Exercise 23.3 shows that the trajectory in the course of
time approaches a periodic one, independent of the initial conditions, which for given
α is uniquely determined. This is the limit cycle of the system.
As long as α is very small, the limit cycle resembles a harmonic vibration as in
(23.66). The crucial difference is, however, that the amplitude ρ now has a sharply
determined value, namely, ρ = 2. If one starts from a smaller or larger value, the tra-
jectory is a spiral approaching the limit cycle. The result of a numeric calculation
for the value α = 0.1 is represented in Fig. 23.6. One can follow the spiraling mo-
tion towards the limit cycle. Moreover, deviations from the purely harmonic vibration
become visible.
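The spiraling approach to the limit cycle can be checked with a short numerical experiment. The following is a minimal sketch in Python, assuming a hand-written fourth-order Runge-Kutta integrator with fixed step size applied to (23.57) and (23.58). The function names, the step size dt = 0.01, and the integration time are illustrative choices and are not taken from the text.

```python
import math

def van_der_pol(state, alpha):
    """Right-hand side of (23.57), (23.58)."""
    x, y = state
    return (y, -x - alpha * (x * x - 1.0) * y)

def rk4_step(state, dt, alpha):
    """One classical Runge-Kutta step for the two-component state."""
    k1 = van_der_pol(state, alpha)
    k2 = van_der_pol(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), alpha)
    k3 = van_der_pol(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), alpha)
    k4 = van_der_pol(tuple(s + dt * k for s, k in zip(state, k3)), alpha)
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def final_amplitude(x0, alpha=0.1, dt=0.01, t_end=200.0):
    """Largest |x| found during roughly the last revolution of the orbit."""
    state = (x0, 0.0)
    steps = int(t_end / dt)
    tail = int(2.0 * math.pi / dt) + 1
    amplitude = 0.0
    for n in range(steps):
        state = rk4_step(state, dt, alpha)
        if n >= steps - tail:
            amplitude = max(amplitude, abs(state[0]))
    return amplitude

if __name__ == "__main__":
    # Start once inside and once outside the limit cycle.
    for start in (0.5, 3.0):
        print(start, final_amplitude(start))
```

Both starting values should end up with an amplitude close to 2, independent of the initial displacement.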
Even more interesting is the solution in the opposite limit α ≫ 1, in which the nonlinearity
plays a dominant role. Here also a limit cycle evolves for the same reasons,
the shape of which, however, strongly differs from a harmonic vibration. Figure 23.7
shows the phase-space plot and the trend of the amplitude of the limit cycle for the
case α = 10. One notices that the displacement remains in the range of the maximum
amplitude x = 2 and slowly decreases toward x = 1. Subsequently, a sudden “flip-
over” sets in, and the displacement drops to the value x = −2. Then the game repeats
with opposite sign. The period length of this kind of vibration is no longer determined
by the oscillator frequency (here ω = 1) but takes a much larger value. An analytic
investigation (see Exercise 23.4) shows that it increases in proportion to the “friction”
parameter α:
$$ T \approx (3 - 2\ln 2)\,\alpha. \tag{23.67} $$
A motion performed by a van der Pol- or Rayleigh oscillator for large α is also
called relaxation vibration. The name indicates that a tension builds up slowly which
then equilibrates via a sudden relaxation process. Such relaxation vibrations fre-
quently occur in nature. For example, the vibration of a string excited by a bow, the
squeak of a brake, and even the rhythm of a heartbeat or the time variation of animal
populations may be classified in this way.
An important and also practically useful property of nonlinear oscillators with a
limit cycle lies in the fact that self-exciting vibrations occur that are well defined and
independent of the initial conditions. As a somewhat nostalgic example, we quote the
balance of a mechanical clock, the vibrations of which are largely independent of the
strength of the driving force. Finally, we quote without proof a mathematical theorem⁴
stating that the possible types of motion of a two-dimensional system (corresponding
to a mechanical system with one degree of freedom: one coordinate plus one velocity)
are completely governed by fixed points and limit cycles.

⁴ See, e.g., J. Guckenheimer and P. Holmes, Nonlinear Oscillations, Dynamical Systems and
Bifurcations of Vector Fields, Springer (1983).
$$ \frac{d\mathbf{x}}{dt} = \mathbf{F}(\mathbf{x}) $$
with a continuous function F. Let B be a closed and bounded region of the x,y-plane.
If a trajectory lies in B for all times t > 0, x(t) ∈ B, there are three possibilities:
The general theorem says of course nothing about the number and shape of the
fixed points and limit cycles. However, it excludes the existence of more complicated
nonperiodic types of solutions! It is important that the statement holds only for two-
dimensional systems. Two trajectories are not allowed to intersect each other in phase
space, which in the two-dimensional plane leads to considerable restrictions. But in
more than two dimensions the trajectories may “evade” each other, and more com-
plex patterns of motion are possible. In this case, the already-mentioned strange at-
tractors with a complicated shape may also occur. This will be treated in the next
chapters.
EXERCISE 23.3
Problem. Show that the solutions of the van der Pol oscillator for small values of α
are spirals that approach a circle (the limit cycle) with the radius 2.
Hint: It is a good idea to introduce new variables that are averaged over one oscil-
lation period.
Solution. We start from the plausible assumption that for α ≪ 1 the solution of the
system of differential equations (23.64) and (23.65) differs only slightly from that of
the harmonic oscillator if it is considered for short time intervals. In order to calculate
a long-term drift of the variables, it is efficient to average over one vibrational period
in each case. We define the averaged amplitude r̄(t) as
$$ \bar{r}(t) := \frac{\oint d\tau\; r(t + \tau)}{\oint d\tau}. \tag{23.68} $$
The integration thereby extends over a full revolution of the angle, i.e., from θ to
θ − 2π (the minus sign arises because dθ/dt ≈ −1). The corresponding time interval
runs from t to approximately (for α = 0 exactly) the value t + 2π.
We are interested in the time variation of the averaged amplitude, for which according
to (23.64) we have
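$$ \frac{d\bar{r}}{dt} = -\alpha\,\frac{1}{2\pi}\int d\theta\; r\sin^2\theta\,(r^2\cos^2\theta - 1) = -\alpha\int_0^{2\pi}\frac{d\theta}{2\pi}\; r\left(\frac{1}{4}\,r^2\sin^2 2\theta - \sin^2\theta\right). \tag{23.69} $$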
For small α, the quantity r(t) considered over a period varies only slowly, and hence
may be pulled out of the integral and replaced by r̄(t). The remaining angle integration
is trivial, since the mean value of both sin²θ and sin²2θ just equals 1/2. Hence, the
averaged amplitude satisfies the differential equation
$$ \frac{d\bar{r}}{dt} = \frac{1}{2}\,\alpha\,\bar{r}\left(1 - \frac{1}{4}\,\bar{r}^2\right), \tag{23.70} $$
which is correct up to the order O(α 2 ). The circulation frequency, on the contrary,
does not change to first order:
$$ \frac{d\bar{\theta}}{dt} = -1 - \alpha\int_0^{2\pi}\frac{d\theta}{2\pi}\,(r^2\cos^2\theta - 1)\,\sin\theta\cos\theta = -1. \tag{23.71} $$
The angular integral vanishes here, since the integrand is an odd function with respect
to θ = π . The differential equation (23.70) for the averaged amplitude may be solved
in closed form. We write
$$ \frac{d\bar{r}}{dt} = a\,\bar{r} - b\,\bar{r}^3 \tag{23.72} $$
and transform to the new variable
$$ u = \frac{1}{\bar{r}^2} \quad \text{such that} \quad du = -2\,\frac{d\bar{r}}{\bar{r}^3}. \tag{23.73} $$
Obviously, (23.72) reduces to the simple linear differential equation
$$ -\frac{1}{2}\,\frac{du}{dt} = a\,u - b, \tag{23.74} $$
the solution of which is a shifted exponential function:
$$ u(t) = \frac{b}{a} + c\,e^{-2at}, \tag{23.75} $$
where the free constant c is to be determined from the initial condition: c = u(0) −
b/a. Insertion of a = α/2 and b = α/8 finally yields
$$ \bar{r}(t) = \frac{2\,r(0)}{\sqrt{r^2(0) + \left(4 - r^2(0)\right)e^{-\alpha t}}}. \tag{23.76} $$
Thus, it is proved that the trajectories are spirals which approach a circle with radius 2
from inside (r(0) < 2) or outside (r(0) > 2). This is the limit cycle of the van der Pol
oscillator for small values of α.
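The closed-form result can be cross-checked against the averaged equation of motion. The following is a minimal sketch in Python, assuming that the exponent in (23.76) is −αt (that is, −2at with a = α/2). The function names and the finite-difference step h are illustrative choices.

```python
import math

def r_bar(t, r0, alpha):
    """Averaged amplitude according to (23.76)."""
    return 2.0 * r0 / math.sqrt(r0 ** 2 + (4.0 - r0 ** 2) * math.exp(-alpha * t))

def drbar_dt(r, alpha):
    """Right-hand side of the averaged equation (23.70)."""
    return 0.5 * alpha * r * (1.0 - 0.25 * r * r)

def residual(t, r0, alpha, h=1e-5):
    """Central-difference slope of r_bar minus the right-hand side of (23.70)."""
    slope = (r_bar(t + h, r0, alpha) - r_bar(t - h, r0, alpha)) / (2.0 * h)
    return slope - drbar_dt(r_bar(t, r0, alpha), alpha)

if __name__ == "__main__":
    for r0 in (0.5, 3.0):
        print(r0, r_bar(0.0, r0, 0.1), r_bar(500.0, r0, 0.1), residual(5.0, r0, 0.1))
```

The residual should vanish up to rounding errors, while r_bar starts at r(0) and tends to 2 for large times.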
EXERCISE 23.4
Problem. Discuss the solutions of the Rayleigh oscillator (23.55) qualitatively for
large values of the parameter α ≫ 1. Find an approximate solution for the period
length of the resulting relaxation vibration.
Solution. The differential equation (23.55) of the Rayleigh oscillator written in stan-
dard form reads
$$ \frac{dv}{dt} = x, \qquad \frac{dx}{dt} = -v - \alpha\left(\frac{1}{3}\,x^3 - x\right). \tag{23.77} $$
In order to discuss the behavior of the solution for large values of α, it is convenient
to rescale the amplitude to a new variable z = v/α:
$$ \frac{dz}{dt} = \frac{1}{\alpha}\,x, \qquad \frac{dx}{dt} = -\alpha\left(z + f(x)\right), \tag{23.78} $$
with the abbreviation f (x) = (x 3 /3) − x. From this quantity one may read off the
direction of the trajectory for any point of the z,x-plane:
$$ \frac{dx}{dz} = \frac{dx/dt}{dz/dt} = -\alpha^2\,\frac{z + f(x)}{x}. \tag{23.79} $$
This means that for α ≫ 1, the trajectories are almost vertical. Other directions
may occur only near the curve z(x) = −f (x). This cubic limit curve subdivides the
z, x-plane into two halves (see Fig. 23.8). In the right half, the derivative dx/dt is neg-
ative, according to (23.78), and the trajectories are running (almost) vertically down-
ward. In the left half, they are running upward.
From this knowledge, the motion for large α may be constructed graphically. Beginning
from an arbitrary initial point, e.g., the point O in the figure, the trajectory
at first falls almost vertically down to the curve z(x) = −f (x). The further motion
proceeds with significantly lower velocity near this curve (directly on the curve the
velocity would vanish, dx/dt = 0). Finally, the point of inversion B is reached at
(z, x) = (2/3, 1).
Because dz/dt > 0, the trajectory cannot follow the backward-running branch of
the curve but “falls down” to the point C at (2/3, −2). Now the game is repeating
with inverse sign. The curve ABCD forms the limit cycle of the Rayleigh oscillator.
It consists of two slowly passed parts (x = 2 . . . 1 and −2 . . . −1) and two fast jumps
(x = 1 . . . −2 and −1 . . . 2). This discussion immediately applies, of course, to the van
der Pol oscillator, since according to (23.56) its displacement just corresponds to the
velocity x of the Rayleigh oscillator introduced in (23.77).
The period length T of the relaxation vibration may be evaluated easily:
$$ T = \oint dt = \alpha\oint\frac{dz}{x}, \tag{23.80} $$
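where the integral extends over a full period. Since the motion along the partial
branches BC and DA proceeds very quickly, it is sufficient to calculate the contributions
of the slow branches AB and CD, which are equal by symmetry:
$$ T \approx \alpha\int_{AB}\frac{dz}{x} + \alpha\int_{CD}\frac{dz}{x} = 2\alpha\int_{AB}\frac{dz}{x} = 2\alpha\int_2^1 dx\,\frac{dz/dx}{x}. \tag{23.81} $$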
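The derivative dz/dx is to be formed on the curve AB, i.e., dz/dx = −df/dx:
$$ T \approx 2\alpha\int_2^1 dx\,\frac{-df/dx}{x} = 2\alpha\int_1^2 dx\,\frac{x^2 - 1}{x} = 2\alpha\left[\frac{1}{2}\,x^2 - \ln x\right]_1^2 = (3 - 2\ln 2)\,\alpha \approx 1.614\,\alpha. \tag{23.82} $$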
The period length calculated numerically in Fig. 23.7b for α = 10 amounts to about
19, hence the asymptotic range is not yet fully reached.
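The value of the integral in (23.82) is easily confirmed by numerical quadrature. The following is a minimal sketch in Python, assuming a simple midpoint rule; the function name and the number of subintervals are illustrative choices.

```python
import math

def relaxation_period_coefficient(n=10_000):
    """Midpoint-rule estimate of T/alpha = 2 * integral from 1 to 2 of (x^2 - 1)/x dx."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = 1.0 + (i + 0.5) * h  # midpoint of the i-th subinterval
        total += (x * x - 1.0) / x
    return 2.0 * h * total

if __name__ == "__main__":
    print(relaxation_period_coefficient(), 3.0 - 2.0 * math.log(2.0))
```

Both numbers agree with the coefficient 1.614 quoted in (23.82).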
24 Stability of Time-Dependent Paths
Fig. 24.1. (a) The neighboring paths x(t) of a Lyapunov-stable path xr (t) remain in its vicinity.
(b) In the case of asymptotic stability, neighboring paths are attracted such that the distance
decreases to zero with increasing time
Using this definition, we consider the geometric position x(t). The time as a para-
meter of this curve, on the contrary, does not play a role.
xr (t + T ) = xr (t). (24.1)
This may originate in two distinct manners. First, in an autonomous system vibrations
may arise by themselves, e.g., in the harmonic oscillator. Here the right-hand side
of the equation of motion does not depend explicitly on the time, ẋ = F(x). On the
other hand, there are also periodically externally excited systems which are under the
action of a time-dependent external drive with the periodicity F(x, t + T ) = F(x, t)
that reflects itself in the trajectory. An advantage is here that the period length T is
imposed from outside, while the vibrational frequency of an autonomous system is
not known from the beginning and must be determined—except for simple special
cases—by numerical solution of the equation of motion.
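To discuss the stability of xr(t), one investigates, as in (23.23), the neighboring
trajectories
$$ \mathbf{x}(t) = \mathbf{x}_r(t) + \boldsymbol{\xi}(t), \tag{24.2} $$
where the deviation ξ(t) is assumed to be small. From the equations of motion
$$ \dot{\mathbf{x}}_r = \mathbf{F}(\mathbf{x}_r, t), \qquad \dot{\mathbf{x}} = \mathbf{F}(\mathbf{x}, t) \tag{24.3} $$
we find
$$ \dot{\mathbf{x}}_r + \dot{\boldsymbol{\xi}} = \mathbf{F}(\mathbf{x}_r + \boldsymbol{\xi}, t) \tag{24.4} $$
or
$$ \dot{\boldsymbol{\xi}} = \mathbf{F}(\mathbf{x}_r + \boldsymbol{\xi}, t) - \mathbf{F}(\mathbf{x}_r, t) \equiv \mathbf{G}(\boldsymbol{\xi}, t). \tag{24.5} $$
This equation can be linearized by expanding the right side in a Taylor series and
neglecting higher terms:
$$ \dot{\boldsymbol{\xi}} = \mathsf{M}(t)\,\boldsymbol{\xi} \tag{24.6} $$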
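with the Jacobi matrix (here written in abstract form without giving the indices)
$$ \mathsf{M}(t) = \left.\frac{\partial\mathbf{G}}{\partial\boldsymbol{\xi}}\right|_{\boldsymbol{\xi}=0} = \left.\frac{\partial\mathbf{F}}{\partial\mathbf{x}}\right|_{\mathbf{x}_r(t)}. \tag{24.7} $$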
Equation (24.6) is, like (23.24), a linear system of differential equations, but now the
matrix of coefficients is periodically time-dependent, M(t + T ) = M(t), while for-
merly it was constant. This periodicity also holds for autonomous systems: Although
the function F(x) does not involve the time explicitly, the reference trajectory xr (t) by
itself nevertheless induces a periodic time dependence.
24.2 Discretization and Poincaré Cuts
There exists a mathematical tool that is useful for the stability analysis of time-
dependent paths but also in general for the qualitative understanding of dynamic sys-
tems. The basic idea is to perform a discretization of the time dependence of a trajec-
tory. This may be done in two somewhat different ways.
An obvious possibility is the stroboscopic mapping. Instead of the continuous func-
tion x(t), one considers a discrete sequence of “snapshots” xn = x(tn ), n = 0, 1, 2, . . . .
The sampling times of the stroboscopic method are chosen as equidistant,
hence tn = t0 + nT with a scanning interval T . Of course one should choose a value
of T that is appropriate for the problem. If an oscillating driving force is acting, one
will use its period length for T . The stroboscopic method becomes particularly simple
if the trajectory x(t) itself is periodic and T coincides with the period length; then all
xn are of course identical. The stroboscopic mapping of the path consists of a single
point xn = x0 in phase space. One should note that the position of the point x0 depends
of course on the selected reference time t0 and thereby may be shifted arbitrarily along
the orbit.
As was described in the preceding section, for a stability analysis one investi-
gates neighboring trajectories x(t) that are in general no longer strictly periodic. The
thin line in Fig. 24.3 shows such an example. The first three stroboscopic snapshots
x0 , x1 , x2 are marked by dots and the distance vectors ξ n = xn − xr0 are plotted.
An alternative method of discretization of trajectories, which is not oriented to the
periodicity and is of advantage particularly for autonomous systems having no fixed
eigenfrequency, is the Poincaré cut. When changing over to the discretized sequence
xn, one again chooses momentary snapshots of the continuous orbit x(t). As a criterion
one now adopts not any fixed equidistant time distances, but rather a geometric property
of the orbit itself, namely the piercing of a given hypersurface. One thereby
selects an (N − 1)-dimensional hypersurface in phase space and marks all points xn
at which the trajectory intersects the hypersurface. One further requires that the
hypersurface is not only touched but properly pierced. Mathematically this means that
the surface shall be transverse to the dynamic flow, n(x) · F(x) ≠ 0 everywhere on the
hypersurface, where n is the surface normal. One therefore speaks of a transverse cut.
In a transverse cut one usually marks only points with a definite sign of F · n, i.e.,
only piercings of the hypersurface that proceed in the same direction.

Fig. 24.3. The stroboscopic scanning of the distance ξ(t) = x(t) − xr(t) yields information
on the stability of a path
This method of discretization of trajectories was invented by Henri Poincaré and is
called the Poincaré cut. Figure 24.4 shows as an example a trajectory in an (N = 3)-
dimensional space, with the x,y-plane as the cut surface. Three piercings in the
negative z-direction are marked as x0 , x1 , x2 . The use of Poincaré cuts makes sense
This Poincaré mapping thus connects every point of the sequence x0 , x1 , x2 , . . . with
its successor. Note that P has no index. It is a single mapping of the plane onto itself,
which according to (24.8) is “scanned” at individual points. The individual points of
the Poincaré cut arise by successive iteration of the Poincaré mapping
Hence, the long-term behavior of a trajectory may be derived from the properties of the
iterated Poincaré mapping P n , n → ∞. If the time evolution of the dynamic system is
determined by a differential equation ẋ = F(x, t), the Poincaré mapping is unique and
also reversible (possibly except for singular points), since trajectories are not allowed
to intersect each other.
The problem of describing a dynamic system is of course not yet solved by defining
the Poincaré mapping but is only postponed, since P must also be constructed explic-
itly. In most cases this cannot be achieved analytically, and one is finally left with a
numerical integration of the differential equation of the system. It turns out, however,
that the exact Poincaré mapping P frequently has amazing common features with
very simply constructed analytic discrete mappings. As an example we shall discuss
the “logistic mapping” in Chap. 27.
Let us return to the problem of stability of periodic paths. As was outlined in the
preceding section, it is sufficient to investigate the small deviations ξ (t) = x(t) − xr (t)
from the reference trajectory in the linear approximation. In this approximation the
Poincaré mapping simplifies to a linear mapping, i.e., the multiplication by a matrix C:
$$ \boldsymbol{\xi}_{n+1} = \mathsf{C}\,\boldsymbol{\xi}_n. \tag{24.10} $$
Decisive for the long-term behavior of the deviation ξ (t) are the eigenvalues
λ1 , . . . , λN of the matrix C. If all eigenvalues satisfy the condition |λi | < 1, the map-
ping is contracting, and the sequence converges toward zero. In this case the periodic
solution xr(t) is thus asymptotically stable. If at least one of the eigenvalues satisfies
|λi| > 1, the perturbations grow along the direction of the associated eigenvector, and
the path is unstable.
In the subsequent example, the mathematical theory of the stability of periodic
solutions developed by Floquet will be presented in more detail.
EXAMPLE 24.1
As described at the beginning of this chapter, we are interested in the long-term be-
havior of the path deviations ξ (t) = x(t) − xr (t) which approximately obey a linear
differential equation
$$ \frac{d}{dt}\,\boldsymbol{\xi}(t) = \mathsf{M}(t)\,\boldsymbol{\xi}(t) \tag{24.11} $$
with a periodic matrix of coefficients M(t + T ) = M(t). Since we are dealing with
a linear problem, any solution may be expanded in terms of a fundamental system
of linearly independent basic solutions φ 1 (t), . . . , φ N (t). The basic solutions are not
uniquely determined, and for sake of clarity we choose them in such a way that at the
time t = 0 (we might choose also t = t0 ) they just coincide with the unit vectors in the
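N-dimensional space:
$$ \boldsymbol{\varphi}_1(0) = (1, 0, \ldots, 0)^T, \quad \boldsymbol{\varphi}_2(0) = (0, 1, \ldots, 0)^T, \quad \ldots, \quad \boldsymbol{\varphi}_N(0) = (0, 0, \ldots, 1)^T, \tag{24.12} $$
where the transposition symbol T indicates that these vectors shall be column vectors.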
Geometrically all of these vectors are lying on a (hyper-) spherical surface of radius
unity. The superposition of a general solution ξ (t) reads
$$ \boldsymbol{\xi}(t) = \sum_{i=1}^{N} c_i\,\boldsymbol{\varphi}_i(t), \tag{24.13} $$
$$ \Phi(0) = \mathsf{I}. \tag{24.16} $$
How does the periodicity of the differential equation (24.11) manifest itself in the
matrix Φ? To see this, one should realize that any solution ξ of (24.11) at the time
t + T satisfies the same differential equation as at the time t. This does of course not
mean that the solution will be periodic; in general ξ(t + T) ≠ ξ(t). But it may be
expanded both in terms of the basic solutions φi(t + T) as well as in terms of the
φi(t), so that the two fundamental systems are related by a constant matrix:
$$ \Phi(t + T) = \Phi(t)\,\mathsf{C}. \tag{24.18} $$
The constant N × N matrix C is called the monodromy matrix. This quantity governs
how the solutions develop from one period to the next. Using the initial condition
(24.16), one may immediately read off the value of the monodromy matrix
from (24.18):
$$ \mathsf{C} = \Phi(T). \tag{24.19} $$
Hence, the monodromy matrix may be calculated by integrating the differential equa-
tion (24.11) N times with distinct initial conditions over a period from 0 to T and
writing the resulting solution vectors φi(T) into the columns. The evolution of the
matrix Φ(t) for arbitrarily large times is obtained by iteration of (24.18). For full
periods in particular, we have
$$ \Phi(2T) = \Phi(T)\,\mathsf{C} = \mathsf{C}^2, \tag{24.20} $$
and generally,
$$ \Phi(nT) = \Phi^n(T) = \mathsf{C}^n. \tag{24.21} $$
According to (24.14) and (24.21), the evolution of the solutions ξ (t) for large times
is thus determined by the powers of the monodromy matrix C. What happens thereby
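may be read off from the N eigenvalues λi of this matrix, which are called characteristic
multipliers or Floquet multipliers,
$$ \mathsf{C}\,\mathbf{u}_i = \lambda_i\,\mathbf{u}_i. \tag{24.22} $$
If ui is an eigenvector of the matrix Φ(T), then it keeps this property for the iterated
mapping as well, e.g.,
$$ \Phi(2T)\,\mathbf{u}_i = \Phi(T)\,\Phi(T)\,\mathbf{u}_i = \lambda_i^2\,\mathbf{u}_i. \tag{24.23} $$
This leads to the following functional equation for the eigenvalues of the iterated
mapping:
$$ \lambda_i(nT) = \left[\lambda_i(T)\right]^n. \tag{24.24} $$
This behavior is characteristic for the exponential function; i.e., (24.24) is solved by
$$ \lambda_i(T) = e^{\sigma_i T} \tag{24.25} $$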
with an (in general complex) constant σi that is called the Floquet exponent. We still
note that (24.21) may be considered a functional equation like (24.24), with the same
kind of solution
$$ \Phi(T) = \mathsf{C} = \exp(\mathsf{S}\,T). \tag{24.26} $$
Here, a matrix S stands in the argument of the exponential function, and the resulting
function value is again a matrix. Such a matrix function is mathematically defined
simply through its power series expansion. One can show that the eigenvalues of the
matrix S introduced by (24.26) are just the Floquet exponents σi of (24.25). If one is
interested in the evolution matrix at any times (not only multiples of the period T),
then (24.26) still has to be generalized:
$$ \Phi(t) = \mathsf{U}(t)\,\exp(\mathsf{S}\,t). \tag{24.27} $$
The matrix U(t) may exhibit a complicated time dependence but must be periodic.
Because of (24.18), we have
$$ \mathsf{U}(t + T)\,\exp\!\left(\mathsf{S}(t + T)\right) = \mathsf{U}(t)\,\exp(\mathsf{S}\,t)\,\exp(\mathsf{S}\,T). \tag{24.28} $$
The product of the exponential functions on the right side may be combined to
exp(St) exp(ST) = exp S(t + T) (for noncommuting matrices in the exponent this
would in general not be correct), and we find
$$ \mathsf{U}(t + T) = \mathsf{U}(t). \tag{24.29} $$
Hence, for the long-term behavior of the solutions ξ (t), U does not play a role. This
behavior is only determined by the magnitude of the Floquet multipliers λi . Begin-
ning with an eigenvector ξ (0) = ui , this solution according to (24.24) will increase as
ξ (nT ) = ui exp(σi nT ). From that, we conclude the following: The trajectory xr (t) is
asymptotically stable if for all Floquet multipliers we have |λi | < 1; i.e., Re σi < 0. It
is unstable if for at least one eigenvalue we have |λi | > 1; i.e., Re σi > 0.
These statements, which were obtained by linearizing the equation of motion, trans-
fer also to the stability behavior of the nonlinear system. The limit of marginal stability
|λi | = 1 may be cleared up only by additional investigations.
For an autonomous periodically vibrating system, a peculiarity arises: In this case,
one of the eigenvalues always has the value λ = 1 and need not be considered in the
stability analysis. To prove this assertion, we consider the function ẋr (t). The mode
under consideration is namely the motion tangential to the reference orbit. For an
autonomous system, one obtains by differentiating the nonlinear equation of motion
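$$ \dot{\mathbf{x}}_r(t) = \mathbf{F}(\mathbf{x}_r) \;\longrightarrow\; \ddot{\mathbf{x}}_r(t) = \left.\frac{\partial\mathbf{F}}{\partial\mathbf{x}}\right|_{\mathbf{x}_r}\dot{\mathbf{x}}_r = \mathsf{M}(t)\,\dot{\mathbf{x}}_r, \tag{24.30} $$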
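which agrees with the linearized equation of motion (24.11). The time evolution of the
solutions of this differential equation is determined by the matrix Φ(t); hence,
$$ \dot{\mathbf{x}}_r(t) = \Phi(t)\,\dot{\mathbf{x}}_r(0). \tag{24.31} $$
The reference orbit xr(t) and therefore also its derivative ẋr(t) are however (contrary
to the case of general perturbations ξ(t)) periodic; hence,
$$ \dot{\mathbf{x}}_r(0) = \dot{\mathbf{x}}_r(T) = \Phi(T)\,\dot{\mathbf{x}}_r(0) = \mathsf{C}\,\dot{\mathbf{x}}_r(0), \tag{24.32} $$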
which proves that the monodromy matrix has an eigenvector, namely, ẋr (0), with the
eigenvalue λ = 1. This is vividly clear: A reference orbit that is shifted in the tangential
direction simply corresponds to a shift of the time coordinate t → t + δt. Since the
absolute value of the time does not play a role in autonomous systems, xr (t) and
x(t) = xr (t + δt) are always running with unchanged distance one behind the other.
Hence, the associated Floquet multiplier must have the value unity.
EXERCISE 24.2
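Problem. Consider the nonlinear system
$$ \dot{x} = -y + x\,(\rho - x^2 - y^2), \qquad \dot{y} = x + y\,(\rho - x^2 - y^2). \tag{24.33} $$
Investigate the stable solutions and find the Floquet multipliers of the limit cycle.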
Solution. A stationary fixed point exists at x0 = (0, 0)T . Its stability is governed by
the Jacobi matrix (24.7)
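$$ \mathsf{M}(\mathbf{x}) = \frac{\partial\mathbf{F}}{\partial\mathbf{x}} = \begin{pmatrix} \rho - 3x^2 - y^2 & -1 - 2xy \\ 1 - 2xy & \rho - 3y^2 - x^2 \end{pmatrix}, \tag{24.34} $$
which at the fixed point x0 = (0, 0)^T has the eigenvalues γ = ρ ± i.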
Hence, for ρ < 0 one has a stable spiral and for ρ > 0 an unstable one; ρ = 0 repre-
sents the special case of a rotor.
By inspecting (24.33), one immediately finds a periodic solution for ρ > 0, since
for constant x 2 + y 2 = ρ the system reduces to a harmonic oscillator:
$$ \mathbf{x}_r(t) = \left(\sqrt{\rho}\,\cos t,\; \sqrt{\rho}\,\sin t\right)^T. \tag{24.37} $$
The Jacobi matrix (24.34) evaluated at the limit cycle (24.37) reads
$$ \mathsf{M}(t) = \mathsf{M}(\mathbf{x}_r) = \begin{pmatrix} -2\rho\cos^2 t & -1 - 2\rho\sin t\cos t \\ 1 - 2\rho\sin t\cos t & -2\rho\sin^2 t \end{pmatrix}. \tag{24.38} $$
For the linearized system of equations (24.33) with this matrix M(t), the normalized
fundamental solutions (24.12) from Example 24.1 may be given explicitly. One finds
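$$ \boldsymbol{\varphi}_1(t) = e^{-2\rho t}\begin{pmatrix} \cos t \\ \sin t \end{pmatrix} \quad \text{and} \quad \boldsymbol{\varphi}_2(t) = \begin{pmatrix} -\sin t \\ \cos t \end{pmatrix}. \tag{24.39} $$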
Combining these vectors to the matrix Φ(t) and evaluating at T = 2π leads to the
monodromy matrix
$$ \mathsf{C} = \Phi(T) = \begin{pmatrix} e^{-4\pi\rho} & 0 \\ 0 & 1 \end{pmatrix}. \tag{24.40} $$
Hence, the basic solutions (24.39) are also already eigenvectors of the monodromy
matrix, with the eigenvalues
$$ \lambda_1 = e^{-4\pi\rho}, \qquad \lambda_2 = 1. \tag{24.41} $$
As expected, one of the Floquet multipliers has the value unity (the corresponding
eigensolution φ 2 (t) is tangential to xr (t)). The value λ1 determines the stability of the
limit cycle: It is asymptotically stable, since λ1 < 1 for ρ > 0. For ρ < 0, no limit
cycle exists.
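The Floquet multipliers can also be obtained numerically, following the recipe of Example 24.1: integrate the linearized equation over one period for each unit vector and take the eigenvalues of the resulting matrix. The following is a minimal sketch in Python, assuming the Jacobi matrix (24.38), a fixed-step fourth-order Runge-Kutta integrator, and the illustrative value ρ = 0.25. The function names and the number of steps are not taken from the text.

```python
import cmath
import math

def jacobian(t, rho):
    """Jacobi matrix (24.38) evaluated on the limit cycle."""
    s, c = math.sin(t), math.cos(t)
    return ((-2.0 * rho * c * c, -1.0 - 2.0 * rho * s * c),
            (1.0 - 2.0 * rho * s * c, -2.0 * rho * s * s))

def rhs(t, xi, rho):
    """Right-hand side M(t) xi of the linearized equation of motion."""
    m = jacobian(t, rho)
    return (m[0][0] * xi[0] + m[0][1] * xi[1],
            m[1][0] * xi[0] + m[1][1] * xi[1])

def propagate(xi, rho, steps=4000):
    """Integrate over one period T = 2 pi with classical Runge-Kutta steps."""
    dt = 2.0 * math.pi / steps
    t = 0.0
    for _ in range(steps):
        k1 = rhs(t, xi, rho)
        k2 = rhs(t + dt / 2, (xi[0] + dt / 2 * k1[0], xi[1] + dt / 2 * k1[1]), rho)
        k3 = rhs(t + dt / 2, (xi[0] + dt / 2 * k2[0], xi[1] + dt / 2 * k2[1]), rho)
        k4 = rhs(t + dt, (xi[0] + dt * k3[0], xi[1] + dt * k3[1]), rho)
        xi = (xi[0] + dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
              xi[1] + dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
        t += dt
    return xi

def floquet_multipliers(rho):
    """Eigenvalues of the monodromy matrix whose columns are the propagated unit vectors."""
    col1 = propagate((1.0, 0.0), rho)
    col2 = propagate((0.0, 1.0), rho)
    trace = col1[0] + col2[1]
    det = col1[0] * col2[1] - col2[0] * col1[1]
    root = cmath.sqrt(trace * trace / 4.0 - det)
    return trace / 2.0 - root, trace / 2.0 + root

if __name__ == "__main__":
    rho = 0.25
    print(floquet_multipliers(rho), math.exp(-4.0 * math.pi * rho))
```

For ρ = 0.25 the two eigenvalues should reproduce e^{−π} and 1 to good accuracy, in agreement with (24.40).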
The nonlinear system (24.33) is so simple that it allows also a closed analytic so-
lution. We change to polar coordinates x = r cos ϕ, y = r sin ϕ. The differential equa-
tions (24.33) lead to the decoupled system
$$ \dot{r} = r\,(\rho - r^2), \qquad \dot{\varphi} = 1. \tag{24.42} $$
Hence, the angle simply increases linearly with time, ϕ(t) = t + ϕ0 . The radial equa-
tion may be integrated as follows:
$$ \int_{r_0}^{r}\frac{dr}{r\,(\rho - r^2)} = \int_0^t dt \;\longrightarrow\; \frac{1}{2\rho}\,\ln\frac{r^2}{\rho - r^2}\,\Bigg|_{r_0}^{r} = t, \tag{24.43} $$
or solved for r
$$ r(t) = \frac{\sqrt{\rho}}{\sqrt{\left(\dfrac{\rho}{r_0^2} - 1\right)e^{-2\rho t} + 1}}. \tag{24.44} $$
For ρ > 0, the solution asymptotically approaches the limit cycle r(t) → √ρ. For
ρ < 0,
$$ r(t) = \frac{\sqrt{|\rho|}}{\sqrt{\left(\dfrac{|\rho|}{r_0^2} + 1\right)e^{2|\rho| t} - 1}}; \tag{24.45} $$
For the present, we concentrate on the stability of stationary fixed points x0 character-
ized by
ẋ = F(x0 , μ) = 0. (25.1)
This may be interpreted as an implicit equation for the position of the fixed point de-
pending on the parameter μ, x0 = x0 (μ). A premise for the existence and continuity of
this function is, according to the theorem from analysis on implicit functions, a non-
singular Jacobi matrix M = ∂F/∂x|x0 . A discontinuous behavior, thus bifurcations,
may therefore be expected if the determinant of M vanishes; this means if one of the
eigenvalues of this matrix depending on μ takes the value zero. The meaning of the
eigenvalues of the Jacobi matrix for stability has been discussed in Chap. 24.
We now consider the typical cases in the simplest possible form, namely for a one-
dimensional system. Without restriction of generality the fixed point is set to x0 = 0,
and let the branching value be μc = 0. This may always be achieved by appropriate
coordinate transformations. Moreover, the one-dimensional bifurcation may be em-
bedded into a higher-dimensional space.
ẋ = F (x, μ) = μ − x 2 . (25.2)
But, since x is real, there is no fixed point for negative μ. If μ passes the critical
value μc = 0, the number of fixed points jumps from 0 to 2. The lower fixed point is
however unstable, as is shown by the linear stability analysis according to Chap. 23.
The eigenvalue γ of the Jacobi “matrix”
\[
M = \left.\frac{\partial F}{\partial x}\right|_{x_0} = -2x_0, \quad\text{thus}\quad \gamma = -2x_0, \tag{25.4}
\]
is, namely, positive for the solution x02 = −√μ. Figure 25.1 shows the stable (solid
curve) and unstable (dashed curve) fixed points as a function of the control parameter
μ. The arrows indicate the direction of motion, which may be immediately read off
from (25.2). One may state that a stable and an unstable solution meet each other at
the critical point and annihilate each other.
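The counting of fixed points and their stability can be written down in a few lines. The sketch below assumes the normal form ẋ = μ − x² of (25.2) and uses the eigenvalue γ = −2x0 of (25.4); the function name and the returned labels are illustrative.

```python
import math

def saddle_node_fixed_points(mu):
    """Fixed points of dx/dt = mu - x**2 with eigenvalue gamma = -2*x0 and a stability label."""
    if mu < 0:
        return []                          # no real fixed point below the critical value
    result = []
    # The set removes the duplicate at mu = 0, where both roots coincide
    for x0 in sorted({math.sqrt(mu), -math.sqrt(mu)}, reverse=True):
        gamma = -2.0 * x0
        label = "stable" if gamma < 0 else "unstable" if gamma > 0 else "marginal"
        result.append((x0, gamma, label))
    return result

for mu in (-1.0, 0.0, 4.0):
    print(mu, saddle_node_fixed_points(mu))
```

For μ = 4 the upper fixed point x0 = 2 comes out stable and the lower one, x0 = −2, unstable, in line with the discussion of Fig. 25.1.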
Fig. 25.1. Stable (solid) and unstable (dashed) fixed points for the saddle-node branching
We still have to clarify the origin of the name saddle-node branching. For this purpose,
the branching is embedded in a two-dimensional space, which may be achieved, e.g., by
\[
\dot x = \mu - x^2, \qquad \dot y = -y. \tag{25.5}
\]
The variables x and y are decoupled, and the solutions y(t) tend asymptotically to
zero. The fixed points are x01 = (+√μ, 0) and x02 = (−√μ, 0). The Jacobi matrix
has the form
\[
M = \begin{pmatrix} \partial F_1/\partial x & \partial F_1/\partial y\\ \partial F_2/\partial x & \partial F_2/\partial y \end{pmatrix}_{x_0}
= \begin{pmatrix} -2x & 0\\ 0 & -1 \end{pmatrix}_{x_0} \tag{25.6}
\]
and is, of course, diagonal, with the eigenvalues γ1 = −2x0, γ2 = −1. The two eigenvalues
for the solution x02 are 2√μ and −1; i.e., they have opposite signs, which, according
to the nomenclature from Example 23.1, corresponds to a saddle point.
For x0 = +√μ, both eigenvalues are negative, and there arises a stable node. Saddle
and node coalesce with each other at the critical point. The distinct dynamical flows
at μ < 0, μ = 0, μ > 0, are represented in Fig. 25.2. One clearly notes that after coa-
lescing of saddle and node, there is no longer a fixed point and thus the flow continues
to infinity (left diagram).
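The classification into saddle and node may also be read off mechanically from the eigenvalues of the Jacobi matrix (25.6). The following sketch uses NumPy's `numpy.linalg.eigvals`; the function name and the three labels are illustrative and cover only the cases needed here.

```python
import numpy as np

def classify_fixed_point(x0):
    """Classify the fixed point (x0, 0) of dx/dt = mu - x^2, dy/dt = -y via the matrix (25.6)."""
    M = np.array([[-2.0 * x0, 0.0],
                  [0.0, -1.0]])
    gammas = np.linalg.eigvals(M).real
    if np.all(gammas < 0):
        return "stable node"
    if np.any(gammas > 0) and np.any(gammas < 0):
        return "saddle"
    return "other"            # e.g. the marginal case x0 = 0 at the critical point

mu = 1.0
print(classify_fixed_point(+np.sqrt(mu)), classify_fixed_point(-np.sqrt(mu)))
```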
25.1 Static Bifurcations
(b) The pitchfork branching: The simplest example of this kind of branching arises
if one uses a cubic polynomial for F (x, μ):
ẋ = F (x, μ) = μx − x 3 . (25.7)
Since this polynomial originates from (25.2) by multiplying by the factor x, a zero as
solution simply adds to the former fixed points,
\[
x_{01} = +\sqrt{\mu}, \qquad x_{02} = -\sqrt{\mu}, \qquad x_{03} = 0. \tag{25.8}
\]
At the critical point μc = 0, the number of fixed points therefore jumps from 1 to 3,
whereby one of the latter solutions turns out to be unstable. The Jacobi matrix
\[
M = \left.\frac{\partial F}{\partial x}\right|_{x_0} = \mu - 3x_0^2 \tag{25.9}
\]
has the eigenvalues
\[
\gamma_1 = \gamma_2 = -2\mu, \qquad \gamma_3 = \mu. \tag{25.10}
\]
Thus, the solution x03 is stable below μc, but loses this property at the critical point
to the two branches x01 and x02 of the “fork,” as is represented in Fig. 25.3. As com-
pared with Fig. 25.1, the arrows representing the direction of flow have changed their
orientation in the lower half-plane x < 0. This is obvious because F (x, μ) contains
the additional factor x.
Fig. 25.3. Stable (solid) and
unstable (dashed) fixed points
of the pitchfork branching:
(a) A supercritical branching.
(b) A subcritical branching
The pitchfork bifurcation may still arise in a second version, with the stability prop-
erties exactly inverted. An example for that is
ẋ = F (x, μ) = μx + x 3 , (25.11)
which differs from (25.7) by the sign of the cubic term. Since the flow direction and the signs of the
eigenvalues are inverted, the branching diagram changes as represented in Fig. 25.3(b).
498 25 Bifurcations
In this case, there remains only a single stable branch. The pitchfork bifurcation of
Fig. 25.3(a) is supercritical, and that of Fig. 25.3(b) is subcritical. The two remain-
ing combinations of signs of the linear and cubic term in F (x, μ) do not yield any
qualitatively new features, they just correspond to the reflection μ → −μ.
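Both versions of the pitchfork branching can be examined with one small routine. The sketch below finds the real roots of μx ± x³ with NumPy's `numpy.roots` and evaluates γ = ∂F/∂x at each of them; the function name and the parameter `cubic_sign` (−1 for (25.7), +1 for (25.11)) are illustrative assumptions.

```python
import numpy as np

def fixed_points(mu, cubic_sign):
    """Real fixed points of dx/dt = mu*x + cubic_sign*x**3, each paired with gamma = dF/dx."""
    roots = np.roots([cubic_sign, 0.0, mu, 0.0])        # cubic_sign*x^3 + mu*x = 0
    # Keep real roots only; rounding merges numerically coinciding ones
    real = sorted({round(float(r.real), 12) for r in roots if abs(r.imag) < 1e-9})
    return [(x0, mu + 3.0 * cubic_sign * x0**2) for x0 in real]

print(fixed_points(+1.0, -1.0))   # supercritical form above the critical point
print(fixed_points(-1.0, -1.0))   # supercritical form below the critical point
print(fixed_points(-1.0, +1.0))   # subcritical form below the critical point
```

A negative γ marks a stable branch, a positive γ an unstable one, so the output can be compared directly with the two panels of Fig. 25.3.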
(c) The transcritical branching: In this case, we consider a polynomial with linear
and quadratic term
ẋ = F (x, μ) = μx − x 2 (25.12)
Thus, there always exist two fixed points in the entire parameter space. The eigenval-
ues of the stability matrix are
γ1 = −μ, γ2 = μ. (25.14)
In each case, one of the solutions is stable, the other one is unstable. At the branching
point the two solutions change their roles; see Fig. 25.4.
Fig. 25.4. Stable (solid) and unstable (dashed) fixed points of the transcritical branching
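The statements on the transcritical branching follow from a short calculation, written out here as a worked check. The labels x01 and x02 follow the pattern of (25.8) and are an assumption about the original numbering.

```latex
\begin{aligned}
F(x_0,\mu) &= x_0\,(\mu - x_0) = 0
  \;\Longrightarrow\; x_{01} = \mu, \quad x_{02} = 0, \\
M &= \left.\frac{\partial F}{\partial x}\right|_{x_0} = \mu - 2x_0
  \;\Longrightarrow\; \gamma_1 = -\mu, \quad \gamma_2 = \mu .
\end{aligned}
```

For μ < 0 the branch x02 = 0 is stable and x01 = μ is unstable; for μ > 0 the signs of both eigenvalues are reversed, which is the exchange of roles at the branching point.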
(d) The Hopf branching: The branchings considered so far were characterized by
the fact that a real eigenvalue of the Jacobi matrix takes the value zero when varying
the control parameter. However, branchings may also occur for complex eigenvalues.
Since the eigenvalues always occur in the form of complex-conjugate pairs, the sys-
tem must have at least the dimension 2. As an example, we consider the system of
equations
\[
\begin{aligned}
\dot x &= -y + x\left[\mu - (x^2 + y^2)\right],\\
\dot y &= x + y\left[\mu - (x^2 + y^2)\right].
\end{aligned} \tag{25.15}
\]
This system has been investigated already in Exercise 24.2 in the context of the sta-
bility of limit cycles. For all values of μ the origin is a fixed point, x0 = (0, 0). At this
position the Jacobi matrix has the value
\[
M(x_0) = \begin{pmatrix} \mu & -1\\ 1 & \mu \end{pmatrix} \tag{25.16}
\]
with the eigenvalues γ1 = μ + i, γ2 = μ − i. This means that the fixed point for
μ < 0 is a stable spiral, and for μ > 0 an unstable spiral. The fate of the stable
solution at the critical point differs from that in the cases considered so far. From
the stationary attractor x0 there evolves for μ > 0 a periodically oscillating solution
xr(t) = (√μ cos t, √μ sin t), which turns out to be a stable limit cycle.
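The emergence of the limit cycle can be observed directly by integrating (25.15). In the following sketch the function name, the starting point (0.1, 0), the integration time, and the RK4 scheme are illustrative choices.

```python
import math

def final_radius(mu, x=0.1, y=0.0, t_end=100.0, h=0.01):
    """Integrate system (25.15) with classical RK4 and return the radius at time t_end."""
    def rhs(x, y):
        s = mu - (x * x + y * y)
        return -y + x * s, x + y * s
    for _ in range(int(round(t_end / h))):
        k1 = rhs(x, y)
        k2 = rhs(x + h / 2 * k1[0], y + h / 2 * k1[1])
        k3 = rhs(x + h / 2 * k2[0], y + h / 2 * k2[1])
        k4 = rhs(x + h * k3[0], y + h * k3[1])
        x += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        y += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return math.hypot(x, y)

print(final_radius(-0.25))   # below the critical value: spirals into the origin
print(final_radius(+0.25))   # above the critical value: settles on radius sqrt(mu)
```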
As is seen from Fig. 25.5, the bifurcation diagram resembles that of the pitchfork
branching, Fig. 25.3. Actually, the system of (25.15) may be decoupled by changing
to polar coordinates x = r cos φ, y = r sin φ. The equation of motion for the radius
r(t) then exactly coincides with (25.7); see (24.42) in Exercise 24.2. The new state is,
25.2 Bifurcations of Time-Dependent Solutions
It may also happen for periodic trajectories that their character changes stepwise un-
der variation of a parameter μ. We shall only briefly touch on the bifurcation theory
of periodic solutions and vividly illustrate some interesting aspects. The mathematical
tool is the Poincaré mapping introduced in Chap. 24. A periodic orbit xr (t) is charac-
terized by a fixed point xr0 in the Poincaré cut. The discretization of the neighboring
path x(t) consists of a sequence of points x0 , x1 , x2 , . . . . For the distance between the
two orbits, we have according to (24.10)
1 E. Hopf, Abh. der Sächs. Akad. der Wiss., Math. Naturwiss. Klasse 94, 1 (1942).
2 Eberhard Friedrich Ferdinand Hopf, b. April 17, 1902, Salzburg–d. July 24, 1983, Bloomington,
Indiana. Hopf studied mathematics in Berlin and taught at MIT (1932 to 1936), and at the universities
of Leipzig (1936 to 1944) and Munich (1944 to 1948), as well as at Indiana University, Bloomington
(from 1948). His main research fields were differential and integral equations, variational calculus,
ergodic theory, and celestial mechanics.
3 This was also translated into English and published as G. Faust, M. Haase and J. Argyris, An
Exploration of Chaos, North-Holland (1994).
where the matrix C represents the linearized approximation of the Poincaré map-
ping P . As long as all eigenvalues λi of C fall into the complex unit circle, |λi | < 1,
the neighboring orbits are attracted and the solution xr (t) is a stable limit cycle.
The approach xn → xr0 may proceed in different ways, as will be illustrated for a
selected eigenvalue, say, λ1 . If the eigenvalue is real and positive, 0 < λ1 < 1, the xn
approach the fixed point xr0 monotonically, as is shown in Fig. 25.6(a). The axis in
this diagram corresponds to the direction of the eigenvector of C belonging to λ1 . If
the eigenvalue is real and negative, −1 < λ1 < 0, the xn form an alternating sequence;
see Fig. 25.6(b). Finally, for a pair of complex eigenvalues, λ1 = λ∗2 , |λ1 | < 1, the xn
lie on a spiral converging toward xr0 ; see Fig. 25.6(c).
Branchings occur if an eigenvalue λ leaves the unit circle at a critical value of the
control parameter μc . One distinguishes three possible cases:
(a) λ1 = +1.
According to (25.17), in this case the distance of a neighboring trajectory from the
reference trajectory does not change. (A deviation along the direction of the eigen-
vector ξ1 of the matrix C is multiplied by λ1 = 1 for each cycle.) This indicates that
for μ > μc , new limit cycles may arise that have the same period length T as the ref-
erence orbit xr . Similar to the bifurcation of a stable fixed point, a periodic solution
may also undergo a pitchfork bifurcation and split into two separated solutions. This is
sketched in Fig. 25.7(a). The other bifurcations from the last section are also possible.
Which of these cases is actually realized cannot be read off from the criterion λ1 = +1
alone.
In this case as well, the sequence of points xn of the Poincaré mapping remains
at a constant distance from xr0 but now rotates on a circle. In the Poincaré cut itself,
a limit cycle evolves. This means geometrically that the topology of the periodic or-
bit changes: It now lies on a torus which envelops the originally closed orbit, as is
shown in Fig. 25.7(b). If the two circulation frequencies on the torus mantle are in-
commensurable, i.e., do not form a rational ratio p/q, p, q ∈ N , then one speaks of a
quasiperiodic motion, since the orbit for infinitely large times never closes. It thereby
approaches any point on the torus surface to arbitrarily close distance.
(c) λ1 = −1.
This case is of particular interest, since the distance of the points xn remains con-
stant, but the direction is alternating. This means that the neighboring orbit x after each
second passage returns to its old position. There arises a periodic orbit with twice the
period length 2T , as is indicated in Fig. 25.7(c). The phenomenon is therefore called a
period-doubling or subharmonic bifurcation. Bifurcations of this kind play an impor-
tant role in the transition from periodic to chaotic motion. An explicit example will be
discussed in Example 27.1 in the context of logistic mapping.
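A first impression of period doubling can be obtained without the later example. The sketch below assumes the standard form of the logistic mapping, x_{n+1} = r x_n (1 − x_n), which is not written out in this section; its fixed point x* = 1 − 1/r has the multiplier λ = f′(x*) = 2 − r, which reaches −1 at r = 3. Function names, the starting value, and the tolerances are illustrative.

```python
def logistic(x, r):
    """Assumed standard form of the logistic mapping."""
    return r * x * (1.0 - x)

def multiplier(r):
    """Derivative of the map at the fixed point x* = 1 - 1/r, i.e. lambda = 2 - r."""
    x_star = 1.0 - 1.0 / r
    return r * (1.0 - 2.0 * x_star)

def attractor_period(r, n_transient=5000, max_period=8, tol=1e-6):
    """Smallest p <= max_period with x_{n+p} = x_n after the transient (None if not found)."""
    x = 0.3
    for _ in range(n_transient):
        x = logistic(x, r)
    y = x
    for p in range(1, max_period + 1):
        y = logistic(y, r)
        if abs(y - x) < tol:
            return p
    return None

for r in (2.8, 3.0, 3.2):
    print(r, multiplier(r), attractor_period(r))
```

Below r = 3 the iteration settles on the fixed point (period 1); slightly above, the multiplier has left the unit circle through −1 and the attractor has period 2.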
26 Lyapunov Exponents and Chaos
In Chap. 24, the concept of stability of time-dependent orbits was discussed, and in
Example 24.1, Floquet’s theory of stability, which may be applied to periodic paths,
was explained in more detail. Building on the works of Floquet and Poincaré, the
Russian mathematician Lyapunov1 published in 1892 an even more general study of
the stability problem in which arbitrary and also nonperiodic motions were admitted.
The characteristic exponents introduced by Lyapunov have played a central role in the
theory of nonlinear systems.
Physically, this may be for example the Poincaré mapping of a dynamical system. We
now ask, how does the point sequence x0 , x1 , x2 , . . . differ from the point sequence
x̃0 , x̃1 , x̃2 , . . . that evolves from a slightly modified initial condition x̃0 = x0 + δx0 ?
We have, in general,
Obviously, the values f′(xl) are a measure of how fast the neighboring solutions xn
and x̃n move away from each other (or toward each other). In the special case of
a periodic motion, which was studied by Floquet, the point sequence xl (interpreted as
a Poincaré mapping) is constant and all factors in (26.4) have the same value. This
yields the exponential relation
\[
|\delta x_n| = |f'(x_0)|^n\,|\delta x_0| = e^{n\sigma}\,|\delta x_0| \quad\text{with}\quad \sigma = \ln|f'(x_0)|, \tag{26.6}
\]
Using (26.4) and the product rule for the logarithm, we can also write this as
\[
\sigma = \lim_{n\to\infty}\frac{1}{n}\sum_{l=0}^{n-1}\ln|f'(x_l)|. \tag{26.8}
\]
If all xl are equal, the quantity σ of (26.7) or (26.8) obviously reduces to the special
case (26.6).
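Formula (26.8) translates directly into a short program: one iterates the map and averages the logarithm of the modulus of the derivative along the orbit. The sketch below again assumes the logistic mapping x_{n+1} = r x_n (1 − x_n) as a test case; the function names, the orbit length, and the starting value are illustrative, and the finite sum only approximates the limit n → ∞.

```python
import math

def lyapunov_exponent(f, f_prime, x0, n=100000, n_transient=1000):
    """Finite-n estimate of (26.8): mean of ln|f'(x_l)| along the orbit after a transient."""
    x = x0
    for _ in range(n_transient):
        x = f(x)
    total = 0.0
    for _ in range(n):
        total += math.log(abs(f_prime(x)))
        x = f(x)
    return total / n

def logistic_pair(r):
    """The assumed logistic map and its derivative for parameter r."""
    return (lambda x: r * x * (1.0 - x)), (lambda x: r * (1.0 - 2.0 * x))

f, df = logistic_pair(2.5)
print(lyapunov_exponent(f, df, 0.3))   # stable fixed point: negative exponent
f, df = logistic_pair(4.0)
print(lyapunov_exponent(f, df, 0.3))   # irregular orbit: positive exponent
```

For r = 2.5 the orbit settles on a fixed point with f′ = −1/2, so the estimate reproduces the special case (26.6), σ = ln(1/2). For r = 4 the known value of the exponent is ln 2.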
The Lyapunov exponent is a logarithmic measure for the mean expansion rate per
iteration (i.e., per unit time) of the distance between two infinitesimally close trajec-
tories.
The case σ > 0 is of particular interest. A dynamical system with a positive Lyapunov
exponent is called chaotic. The paths of such a system are extremely sensitive
to changes of the initial conditions. Because of the exponential dependence, one does
not have to wait long until an initially small deviation δx0 has grown to arbitrary
magnitude. More precisely, it is sufficient that the product nσ be a number not
very much larger than unity; compare (26.6).
This property of chaotic systems is very significant, both practically and conceptually.
The behavior of a chaotic system is not predictable, at least not over a long
time period. Since physical quantities can always be determined only with a limited
precision, δx0 inevitably has a value that differs from zero. Therefore, it is hopeless
to aim at predicting the state of a chaotic system for times that are significantly larger
than 1/σ . The attempt to reach that by more and more precise fixing of the initial
conditions is doomed to failure. Ultimately, the exponential increase of the deviation
will always win.
The fascinating point is that this effect arises in a completely deterministic system.
The dynamics of such a system is “in principle” mathematically uniquely fixed by
the basic equation of motion—be it a differential equation or a discrete mapping as
in (26.1). Nevertheless, an exact knowledge of the equation of motion cannot help in
the attempt to find the solutions. In this way a new kind of uncertainty is brought into
physics, in addition to the more familiar sources of the statistical fluctuations (noise)
and the quantum fluctuations (uncertainty relation). The first to clearly understand and
state this phenomenon was Henri Poincaré toward the end of the nineteenth century.
26.2 Multidimensional Systems
with the Jacobi matrix M = ∂F/∂x|xr(t). In agreement with (26.7), as a measure
for the time evolution of the perturbation one may define the quantity
\[
\sigma_{x_r,\xi_0} = \lim_{t\to\infty}\frac{1}{t}\,\ln\frac{|\boldsymbol{\xi}(t)|}{|\boldsymbol{\xi}(t_0)|}. \tag{26.10}
\]
This definition of the Lyapunov exponent raises several mathematical problems that
can only be touched on here. First of all, σxr ,ξ 0 depends on the reference trajectory
xr (t) and therefore on the position of the starting point. But if the system has an at-
tractor, then the value of σ in the range of attraction of the attractor is independent
of the reference orbit. Moreover, there exists the important class of ergodic systems
for which the mean values with respect to the time (taken along an orbit) can be re-
placed by mean values in phase space. One can show also for ergodic systems that
the Lyapunov exponents defined according to (26.10) exist and are independent of the
special reference trajectory. (More strictly speaking, there may occur “pathological”
orbits with differing σ , but these form a set of measure zero).3
Furthermore, the value of σxr ,ξ 0 depends on the direction of the perturbation
ξ 0 = ξ (t0 ). In an N -dimensional space, one may construct N linearly independent
vectors ei that lead to a set of N Lyapunov exponents
The indices are chosen such that the σi are in descending order,
\[
\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_N, \tag{26.12}
\]
\[
\boldsymbol{\xi}(t_0) = \sum_{i=1}^{N} c_i\,\mathbf{e}_i \tag{26.13}
\]
and will always have also a component along the vector e1 . When tracing the solution
over a sufficiently long time interval, the most rapidly increasing component of the
perturbation (or, if all σi are negative, the most slowly decreasing component) will
dominate. Performing the limit in (26.10) then guarantees that the calculation yields
the maximum Lyapunov exponent. In practice one has to bear in mind in such a calculation
that for chaotic systems the trajectory x(t) moves away from the reference trajectory
xr(t) very rapidly. It is therefore recommended that one rescales the perturbation,
ξ(tn) → c ξ(tn), at regular time intervals, with a constant c ≪ 1; see
Fig. 26.1. The value of σ is then obtained by averaging over many time intervals.
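The rescaling procedure may be sketched as follows. The code uses a common two-trajectory variant of the idea: instead of multiplying by a fixed constant c, the separation between reference and neighboring orbit is reset to a fixed small length d0 after every interval. All names, the value of d0, the interval length, and the choice of the system (25.15) as a test case are illustrative assumptions.

```python
import math

def rk4_step(f, state, h):
    """One classical Runge-Kutta step for dstate/dt = f(state)."""
    k1 = f(state)
    k2 = f([s + h / 2 * k for s, k in zip(state, k1)])
    k3 = f([s + h / 2 * k for s, k in zip(state, k2)])
    k4 = f([s + h * k for s, k in zip(state, k3)])
    return [s + h / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def largest_lyapunov(f, x_ref, d0=1e-7, h=0.01, steps_per_interval=100, intervals=300):
    """Follow a reference and a neighboring orbit; reset their distance to d0 after each interval."""
    n = len(x_ref)
    x = list(x_ref)
    y = [xi + d0 / math.sqrt(n) for xi in x]          # neighbor at distance d0
    log_sum = 0.0
    for _ in range(intervals):
        for _ in range(steps_per_interval):
            x = rk4_step(f, x, h)
            y = rk4_step(f, y, h)
        d = math.dist(x, y)
        log_sum += math.log(d / d0)                   # growth during this interval
        y = [xi + (yi - xi) * d0 / d for xi, yi in zip(x, y)]   # rescale, keep direction
    return log_sum / (intervals * steps_per_interval * h)

def hopf(state, mu=0.25):
    """Right-hand side of (25.15)."""
    x, y = state
    s = mu - (x * x + y * y)
    return [-y + x * s, x + y * s]

sigma = largest_lyapunov(hopf, [0.5, 0.0])   # reference orbit on the limit cycle r = sqrt(mu)
print(sigma)
```

For this periodic reference orbit the estimate should come out close to zero, since the perturbation tangential to the orbit neither grows nor decays. A chaotic system would yield a positive value.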
The full set of the N Lyapunov exponents may be calculated by following the time
evolution of all N linearly independent perturbations ξ i with ξ i (t0 ) = ei . From the
N volumes V (p) of the parallelepipeds spanned by the ξ 1 , ξ 2 , . . . , ξ p (that must be
calculated for all values p = 1, 2, . . . , N ), one then may obtain successively all σi .4
We still note that for periodic orbits xr (t + T ) = xr (t), the Lyapunov coefficients
coincide with the real part of the Floquet exponents introduced in Example 24.1. Thus,
one has a generalization of this concept.
3 More information on these questions may be found, e.g., in D. Ruelle, Chaotic Evolution and
The Lyapunov exponents are decisive for the long-term evolution of a dynamical
system. As discussed already, positive σ imply a rapid divergence of neighboring
trajectories and nonpredictability. Of particular interest are trajectories which attract
neighboring solutions and have (at least) one positive Lyapunov exponent. They are
called chaotic attractors.
Let us consider for illustration an autonomous system with three degrees of free-
dom. Depending on the combination of signs of (σ1 , σ2 , σ3 ), various kinds of attractors
may occur:
\[
(\sigma_1, \sigma_2, \sigma_3) =
\begin{cases}
(-,-,-), & \text{fixed point},\\
(\,0,-,-), & \text{limit cycle},\\
(\,0,\,0,-), & \text{torus},\\
(+,\,0,-), & \text{chaotic attractor}.
\end{cases}
\]
(a) If all Lyapunov exponents are negative, there arises a stable fixed point to which
the neighboring trajectories from all directions are converging.
(b) The vanishing of a Lyapunov exponent, σ1 = 0, indicates the existence of a pe-
riodic motion. This has been demonstrated explicitly in Chap. 24, based on the
equation of motion. The vector e1 associated with σ1 points along the direction
of the tangent of the orbit. The attractor is a limit cycle, i.e., a one-dimensional
object with the topology (but not necessarily the geometric shape) of a circle.
(c) If two of the Lyapunov exponents vanish, σ1 = σ2 = 0, there exists a periodic
motion in two directions. Therefore the attractor is two-dimensional and has the
topology of a torus about which the trajectory is winding up. Whether or not the
trajectory is periodic in total depends on the circulation frequencies ω1 and ω2 for
the two degrees of freedom of the torus. If the values ω1 and ω2 are incommen-
surable, i.e., the ratio ω1 /ω2 is not a fraction of integer numbers but an irrational
number, the orbit will never close. Such an orbit that with increasing time covers
the torus more and more densely is called quasiperiodic.
(d) If the largest Lyapunov exponent is positive, σ1 > 0, there arises a chaotic attrac-
tor with the already discussed properties of irregular motion that depends strongly
on the initial conditions. The typical combination is σ1 > 0, σ2 = 0, σ3 < 0, but
chaotic attractors with several positive Lyapunov exponents may also occur. Geo-
metrically the chaotic attractors usually are also strange attractors, as mentioned
already in Chap. 23. They have the strange property that they are objects with
a noninteger (“broken”) dimension. These are neither lines nor surfaces (or higher-dimensional
hypersurfaces) but “something in between.” That such objects are not pure math-
ematical inventions but also occur in nature has been noted only recently, in par-
ticular by B. Mandelbrot,5 who named them fractals.6 A beautiful example of
a fractal attractor will be given in Example 27.4 in the context of a periodically
driven pendulum; compare Fig. 27.19.
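The classification (a) to (d) amounts to a lookup on the sign pattern of the exponents. A minimal sketch follows; the function name, the tolerance used to decide when a numerically determined exponent counts as zero, and the sample values are illustrative.

```python
def attractor_type(sigmas, tol=1e-6):
    """Map the sign pattern of three Lyapunov exponents to the attractor types (a)-(d)."""
    # Sign of each exponent (+1, 0, -1), sorted in descending order
    signs = tuple(sorted(((s > tol) - (s < -tol) for s in sigmas), reverse=True))
    table = {
        (-1, -1, -1): "fixed point",
        (0, -1, -1): "limit cycle",
        (0, 0, -1): "torus",
        (1, 0, -1): "chaotic attractor",
    }
    return table.get(signs, "not in the list")

print(attractor_type([-0.2, -1.0, -3.0]))
print(attractor_type([0.0, -0.5, -0.5]))
print(attractor_type([0.9, 0.0, -14.6]))
```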
If one follows the history of V still further, the game of stretching and folding
will continue again and again. One may imagine that in this way an infinitely fine
5 Benoit B. Mandelbrot, b. November 20, 1924, Warsaw. After the emigration of his family to France
(1936), Mandelbrot studied in Lyon, at the California Institute of Technology, and in Paris, where he
did his doctorate in 1952. He worked at CNRS, in Geneva, and at the École Polytechnique before
he went in 1958 to the IBM Watson Research Center, where he was appointed as Research Fellow.
He served as visiting professor among others at Harvard and Yale. Mandelbrot’s interests are extra-
ordinarily broad and oriented interdisciplinarily. Building on the work of G.M. Julia (1893–1978) on
iterated rational functions, he demonstrated the properties of fractals using computer graphics and
pointed out their manifold occurrence in nature. Besides many other awards, Mandelbrot received the
Wolf Prize in physics in 1993.
6 For more on this point, see, e.g., B. Mandelbrot, The Fractal Geometry of Nature, Freeman (1982);
H.-O. Peitgen, H. Jürgens and D. Saupe, Chaos and Fractals: New Frontiers of Science, Springer
(1992).
26.4 Fractal Geometry
(1) The Cantor set: The construction begins with a line, namely, the set of all real
numbers in the unit interval I0 = [0, 1]. The iteration rule reads: Remove the middle
third of each interval. In the first iteration step there arise two disjoint partial
intervals, I1 = [0, 1/3] ∪ [2/3, 1], which then split further into I2 = [0, 1/9] ∪ [2/9, 1/3] ∪
[2/3, 7/9] ∪ [8/9, 1], etc. The first iteration steps are represented in Fig. 26.4(a). In the
limit n → ∞ there arises from the In the Cantor set7 as a kind of finely distributed
dust of points in the unit interval.
To get a measure of extension of the Cantor set, we consider the magnitude of its
complementary set, i.e., the total length of all parts cut out. This leads to a geometric
series:
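\[
L = 1\cdot\frac{1}{3} + 2\cdot\frac{1}{9} + 4\cdot\frac{1}{27} + \cdots
= \frac{1}{3}\sum_{n=0}^{\infty}\left(\frac{2}{3}\right)^{n}
= \frac{1}{3}\,\frac{1}{1 - 2/3} = 1. \tag{26.14}
\]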
7 Georg Cantor, German mathematician, b. March 3, 1845, St. Petersburg, Russia–d. January 6,
1918, Halle. Cantor studied at the universities of Zurich and Berlin under Weierstraß, Kummer, and
Kronecker. From 1869 to 1913, he was a professor at Halle. Cantor is the founder of set theory.
He invented the notion of cardinal numbers and the concept of infinite (transfinite) numbers and dealt
with the definition of the continuum. He proved the nondenumerability of real numbers. Furthermore,
he contributed to the theory of trigonometric series.
The parts cut out thus add up to the total length of the unit interval, and the length
of the Cantor set is therefore zero! But it consists of infinitely many points, and
one can show that its cardinal number (power) is the same as that of the real num-
bers.8
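The construction and the length argument can be reproduced exactly with rational arithmetic. The sketch below uses Python's `fractions.Fraction`; the function names are illustrative.

```python
from fractions import Fraction

def cantor_step(intervals):
    """Remove the open middle third of every closed interval [a, b]."""
    result = []
    for a, b in intervals:
        third = (b - a) / 3
        result.append((a, a + third))
        result.append((b - third, b))
    return result

def cantor_intervals(n):
    """Intervals of the n-th construction step, starting from the unit interval."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = cantor_step(intervals)
    return intervals

def remaining_length(n):
    """Total length left after n steps; the removed length is 1 minus this value."""
    return sum(b - a for a, b in cantor_intervals(n))

print(cantor_intervals(2))
print([remaining_length(n) for n in range(5)])
```

After n steps the remaining length is (2/3)^n, so the removed parts add up to 1 − (2/3)^n, which tends to the value 1 of the geometric series (26.14).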
(2) The Koch curve: This construction again begins with a straight line of length
unity. Instead of cutting out parts, something is added, and the straight line is built up
to a toothed curve in the two-dimensional plane. The iteration rule reads: Remove the
middle third of every straight partial piece and replace it by the other two sides of
an equilateral triangle erected over it. The first steps of the iteration are shown in Fig. 26.4(b) (they remind one of
the bulwarks of old fortresses).
The peculiar feature of the Koch9 curve and its related ones is that it is everywhere
continuous but nowhere differentiable. One cannot give a tangent to the curve, because
of the infinitely many sharp corners. The calculation of the length of the Koch curve
also leads to a remarkable result: In every partial step, always 3 partial pieces are
replaced by 4 of the same length. The total length L is therefore
\[
L = \lim_{n\to\infty} L_n = \lim_{n\to\infty}\left(\frac{4}{3}\right)^{n} = \infty, \tag{26.15}
\]
and thus diverges. This cannot be seen directly from the graph of the curve, due to the
conversion to smaller and smaller length scales. Here the self-similarity becomes ap-
parent which is a typical feature of fractals: On any length scale, a linear magnification
of a detail of the object again resembles the entire object.10
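The growth of the length can be followed by constructing the polygon explicitly. In the sketch below the points of the curve are stored as complex numbers; the function names and this representation are illustrative choices.

```python
import cmath
import math

def koch_step(points):
    """Replace the middle third of every segment by the other two sides of an equilateral triangle."""
    rot = cmath.exp(1j * math.pi / 3)          # rotation by 60 degrees
    new = [points[0]]
    for a, b in zip(points, points[1:]):
        d = (b - a) / 3
        p1, p3 = a + d, a + 2 * d              # ends of the removed middle third
        p2 = p1 + d * rot                      # tip of the triangle
        new += [p1, p2, p3, b]
    return new

def koch_length(n):
    """Length L_n of the polygon after n iteration steps, starting from the unit segment."""
    pts = [0 + 0j, 1 + 0j]
    for _ in range(n):
        pts = koch_step(pts)
    return sum(abs(b - a) for a, b in zip(pts, pts[1:]))

for n in range(6):
    print(n, koch_length(n), (4 / 3) ** n)
```

Each step replaces 3 partial pieces by 4 of the same length, so the two printed columns should agree up to rounding.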
8 This may be understood in a rather simple manner: Every point of the Cantor set may be charac-
terized by an infinite sequence of “left-right decisions”; i.e., for every iteration step in Fig. 26.4 one
must say in which of the two partial intervals the point lies. But this sequence may also be interpreted
as the binary representation of a real number in the unit interval [0, 1], whereby the assertion becomes
clear.
9 Niels Fabian Helge von Koch, Swedish mathematician, b. January 25, 1870, Stockholm–d. March
11, 1924, Stockholm. Koch was a student and successor of Mittag-Leffler at the University of Stockholm.
His main research fields were systems of linear equations of infinite dimension and with
infinitely many unknowns. His name became familiar mainly from the curve named after him.
10 Self-similarity alone is, however, not a sufficient criterion for fractals. For example, a straight line
is self-similar in a trivial manner.
(3) The Sierpinski gasket: The basic element of the Sierpinski11 gasket is a two-
dimensional area, namely, an equilateral triangle. Iteration rule: Subdivide each trian-
gle into 4 congruent parts and remove the central triangle. Figure 26.5 shows the first
steps of this iteration. The resulting object is something between an area and a curve.
It has again the property of self-similarity.
Nature offers a variety of fractal objects. Examples from the organic world are the
branchings of plants (trees, cauliflower, and, particularly beautiful, ferns) or vessels.
Inorganic fractal shapes are observed in clouds, mountains, snowflakes, lightning
discharges, etc.
A classical example studied by Mandelbrot that resembles the Koch curve is provided by
coastlines. For the length of one and the same coast the geographers may give quite
different values, depending on the length scale adopted in the measurement. The
smaller the scale, the better is the scanning of the bays and windings of the coast,
corresponding to the higher iterations Ln of the Koch curve. Ultimately the coastline
should wind about each individual grain of sand on the beach, which would blow up
the length enormously. But here it also becomes apparent that the application of fractal
geometry to natural objects is meaningful only in a certain range. Ultimately on the atomic
scale when the granularity of matter becomes apparent, the mathematical limit n → ∞
loses its meaning. Nevertheless, the fractal scale behavior (see below) frequently can
be traced over many orders of magnitude.
11 Waclaw Sierpinski, Polish mathematician, b. August 20, 1882, Warsaw–d. May 14, 1969, Warsaw.
Sierpinski was a professor of mathematics in Lwow (now Ukraine, 1908–1914), Moscow (1915–
1918), and later in Warsaw. His main fields of work were the theory of sets (here, in particular, the
axiom of choice and the continuum hypothesis), the topology of point sets, and number theory.
12 The situation is complicated by the fact that a multitude of distinct mathematical concepts of di-
mension are available. The calculated dimension values partly agree with each other but partly do
not, depending on the considered object. For details, see K. Falconer, Fractal Geometry, Mathemati-
cal Foundations and Applications, Wiley, New York (1990).
will in general depend on the box size ε. For an improved resolution the object will
certainly spread over more boxes, but the question is how fast that happens. If the scaling
behavior in the limit ε → 0 follows a power law
\[
N(\varepsilon) = V(\varepsilon)\left(\frac{1}{\varepsilon}\right)^{D_f}, \tag{26.16}
\]
then the value of the exponent defines the fractal dimension or capacity dimension Df.
Solving yields
\[
D_f = \lim_{\varepsilon\to 0}\frac{\ln N(\varepsilon) - \ln V(\varepsilon)}{\ln(1/\varepsilon)}
= \lim_{\varepsilon\to 0}\frac{\ln N(\varepsilon)}{\ln(1/\varepsilon)}, \tag{26.17}
\]
since V(ε) remains finite in the limit and therefore does not contribute.
For nonfractal geometric objects, the dimension determined in this way coincides
with the normal Euclidean dimension, and V(0) corresponds to the Euclidean volume.
For example, for sufficiently high resolution a circular disk of radius R overlaps
N(ε) ≃ πR²/ε² boxes, and each further doubling of the resolution increases the number
of boxes by the factor 4. Correspondingly, one finds for the circle line N(ε) ≃ 2πR/ε, and so
on.
For fractals, however, the power of the scaling law (26.16) differs from the naively
expected value and is in general not an integer. For the Cantor set the determination
of Df is particularly simple. Here it suffices to take the unit interval (n = 1) as the
embedding space. It is most convenient to consider a sequence of subdivisions of the
length εi that always differ by the factor 3; thus, ε0 = 1, ε1 = 1/3, ε2 = 1/9, etc.
The Cantor set is constructed in such a way that then the number of “occupied” boxes
(partial intervals) always doubles; thus, N(ε0) = 1, N(ε1) = 2, N(ε2) = 4, etc. Hence,
the fractal dimension of the Cantor set is, according to (26.17), given by
\[
D_f = \lim_{\varepsilon\to 0}\frac{\ln N(\varepsilon)}{\ln(1/\varepsilon)}
= \lim_{n\to\infty}\frac{\ln N(\varepsilon_n)}{\ln(1/\varepsilon_n)}
= \lim_{n\to\infty}\frac{\ln 2^n}{\ln 3^n} = \frac{\ln 2}{\ln 3} \approx 0.6309. \tag{26.18}
\]
The result is independent of the manner of performing the passage to the limit
ε → 0.
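The counting of boxes is easily carried out on a computer. In the following sketch the Cantor set is represented at the n-th construction step by the left endpoints of its partial intervals, stored as integers in units of 3^(−n); boxes of size 3^(−k) are then counted for k = 1, …, n, and the dimension is obtained as the least-squares slope of ln N against the logarithm of the inverse box size. The representation and all function names are illustrative assumptions.

```python
import math
from itertools import product

def cantor_points(n):
    """Left endpoints of the 2**n intervals of step n, as integers in units of 3**-n."""
    return [sum(d * 3 ** (n - 1 - i) for i, d in enumerate(digits))
            for digits in product((0, 2), repeat=n)]

def box_count(points, n, k):
    """Number of boxes of size 3**-k that contain at least one point."""
    return len({p // 3 ** (n - k) for p in points})

def capacity_dimension(n=10):
    """Least-squares slope of ln N versus ln(1/box size) for k = 1..n."""
    pts = cantor_points(n)
    xs = [k * math.log(3) for k in range(1, n + 1)]                   # ln(1/box size)
    ys = [math.log(box_count(pts, n, k)) for k in range(1, n + 1)]    # ln N
    mx, my = sum(xs) / n, sum(ys) / n
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

print(capacity_dimension(), math.log(2) / math.log(3))
```

For this exactly self-similar set the points in the doubly logarithmic plot lie on a straight line. For data from a Poincaré mapping one would expect a straight line only over a limited range of scales.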
Similarly, one finds the dimension of the Koch curve. To cover it, in the first step
one needs N(ε1) = 4 intervals of length ε1 = 1/3, in the next one N(ε2) = 16 intervals
of length ε2 = 1/9, etc. As in (26.18), this implies
\[
D_f = \lim_{n\to\infty}\frac{\ln 4^n}{\ln 3^n} = \frac{\ln 4}{\ln 3} \approx 1.2619. \tag{26.19}
\]
\[
D_f = \frac{\ln 3}{\ln 2} \approx 1.5850, \tag{26.20}
\]
since here for each bisection of the box size the number of partial objects increases by
the factor 3.
The results (26.18) to (26.20) are not implausible. They quantify how the consid-
ered fractals by their properties stand between the familiar objects point, line, area,
. . . . The comparison of (26.19) and (26.20) shows that the Sierpinski gasket is “more
space-filling” than the Koch curve, but does not come up to a normal area. An ex-
treme example in this respect is the area-covering curve that was discovered in 1890
by G. Peano13 and investigated in modified form by D. Hilbert. The Peano curve may
also be obtained iteratively: In a subdivision of the scale length into three sections
there arise nine partial distances of equal length; see Fig. 26.7. Accordingly, the di-
mension is
\[
D_f = \frac{\ln 9}{\ln 3} = 2. \tag{26.21}
\]
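The four results have the same structure: if refining the box size by a factor s multiplies the number of occupied boxes by a factor m, the dimension is ln m / ln s. A minimal sketch that tabulates the values follows; the function name and the dictionary layout are illustrative.

```python
import math

def scaling_dimension(pieces, scale):
    """D_f = ln(pieces) / ln(scale) if a refinement by `scale` multiplies the box count by `pieces`."""
    return math.log(pieces) / math.log(scale)

# (factor in the number of boxes, factor in the resolution) as quoted in the text
examples = {
    "Cantor set": (2, 3),
    "Koch curve": (4, 3),
    "Sierpinski gasket": (3, 2),
    "Peano curve": (9, 3),
}
for name, (pieces, scale) in examples.items():
    print(f"{name:18s} D_f = {scaling_dimension(pieces, scale):.4f}")
```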
This is the dimension n = 2 of the embedding space and hence the largest value that
may be taken by the capacity dimension Df. N(ε) takes the maximum value, since
all boxes include parts of the object. The definition of the capacity dimension is very
clear and has the advantage that it provides immediately an operative calculation rule.
One only has to span grids and to count the boxes, which may easily be done on a
computer. If the function N(ε) in a doubly logarithmic representation yields a straight
line (at least over a larger range of scale), then one can immediately read off Df from
its slope. A related but more subtle definition of the dimension which refrains from
equidistant grids and works with overlaps of variable magnitude was developed in
13 Giuseppe Peano, Italian mathematician, b. August 27, 1858, Cuneo (province Piemont)–d. April
20, 1932, Torino. Peano studied mathematics at the University of Torino, where he taught beginning
in 1880 as a lecturer and beginning in 1890 as a professor. His early works concerned analysis,
the initial-value problem of differential equations, and recursive functions. Peano emerged mainly
as a founder of the mathematical logic (with G. Frege). The Peano axioms (1889) define the natural
numbers via the properties of sets. His aim was the axiomatization of all of mathematics. Later, Peano
moved beyond mathematics and developed a universal world language (Interlingua) that, however,
did not gain acceptance.
1918 at Bonn University by the mathematician Felix Hausdorff.14 We shall not enter
into the details here and mention only that the Hausdorff dimension DH in
most cases coincides with Df,15 but there also exist exceptional cases. Generally,
DH ≤ Df.
Especially for the classification of strange attractors, other dimension measures may also
be meaningful. The Poincaré mapping yields a cloud of points in phase space that may be
very inhomogeneously distributed. The calculation of the capacity dimension
ignores the information on the frequency distribution of the points: it makes
no difference whether a box is occupied by a single point or by thousands. To take
this distribution into account, one defines an information dimension, which includes the
density distribution of the points.
The construction is similar to that for the capacity dimension Df. The embedding
space is again subdivided into boxes of side length ε, but now the contribution of every cell
is weighted by the probability pi of finding points there. In practice, this value is determined by
generating a very large number N of points and counting how many of them fall into
cell number i; thus, pi = Ni/N with Σi Ni = N. The weighting is performed with a
logarithmic measure. The information dimension DI is defined as
$$ D_I = \lim_{\epsilon \to 0} \frac{\sum_i p_i \ln(1/p_i)}{\ln(1/\epsilon)}. \qquad (26.22) $$
The term f(p) = p ln(1/p) has the following meaning: Take an arbitrary point from
the distribution and ask “Does the point lie in cell i?” The answer to this question
carries the information content f(pi). This function vanishes for pi → 0 and pi → 1, since
in these cases the answer is trivial (always “no” or always “yes”). The information gain
of the yes/no question as a whole, f(pi) + f(1 − pi), is maximum for pi = 1/2; then the
answer is in any case a surprise.
The exact foundation for the function f(p) is provided by statistical mechanics
or by information theory (Shannon’s information measure).16 But at least we may
easily see that (26.22) turns into the formula (26.17) if the point distribution is homogeneous,
i.e., if the probabilities in all N(ε) occupied cells have the same
value pi = p = 1/N(ε). For the remaining cells pi = 0. Thereby, (26.22) reduces
to
$$ D_I = \lim_{\epsilon \to 0} \frac{N(\epsilon)\, p \ln(1/p)}{\ln(1/\epsilon)} = \lim_{\epsilon \to 0} \frac{\ln N(\epsilon)}{\ln(1/\epsilon)} = D_f. \qquad (26.23) $$
For a less homogeneous point distribution, the information content is lower, and one
can show that then DI < Df.17
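The comparison between the two numerators can be tried out on a small set of box counts. The following is only a minimal sketch: the function names and the two sample count lists are illustrative assumptions, not part of the text. It evaluates the sum in (26.22) and the logarithm of the number of occupied boxes that enters (26.17) for one fixed grid.

```python
import math

def information_sum(counts):
    """Numerator of (26.22): sum_i p_i ln(1/p_i), with p_i = N_i / N."""
    total = sum(counts)
    return sum((c / total) * math.log(total / c) for c in counts if c > 0)

def capacity_sum(counts):
    """ln N(eps), where N(eps) is the number of occupied boxes."""
    return math.log(sum(1 for c in counts if c > 0))

# Hypothetical box counts for one grid: four occupied boxes, two empty ones.
uniform = [25, 25, 25, 25, 0, 0]   # homogeneous distribution
skewed = [97, 1, 1, 1, 0, 0]       # same occupied boxes, very uneven weights

print(information_sum(uniform), capacity_sum(uniform))
print(information_sum(skewed), capacity_sum(skewed))
```

For the homogeneous counts both numbers agree, while the uneven counts give a smaller information sum for the same number of occupied boxes.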
14 Felix Hausdorff, German mathematician, b. November 8, 1869, Breslau–d. January 26, 1942,
Bonn. Hausdorff studied and taught mathematics in Leipzig and beginning in 1910 in Bonn, until
his forced retirement in 1935. He was persecuted because of his Jewish origin. In 1942, he and his
wife committed suicide shortly before their deportation to a concentration camp. Hausdorff’s main research
fields were topology and set theory. He introduced the concept of partially ordered sets and
dealt with Cantor’s continuum hypothesis. He founded a theory of topological and metric spaces and
about 1919 defined the concepts of dimension and measure named after him.
15 J.D. Farmer, E. Ott and J.A. Yorke, Physica 7D, 153 (1983).
16 C.E. Shannon and W. Weaver, The Mathematical Theory of Communication, Univ. of Illinois
Press, Urbana (1949).
17 See, e.g., H.-O. Peitgen, H. Jürgens and P.H. Richter, op. cit. p. 735.
EXERCISE 26.1
Problem. The process of stretching and folding, which is characteristic of the phase
flow in chaotic systems, can be illustrated by a simple two-dimensional discrete map-
ping. Let the phase space be the unit square [0, 1] × [0, 1]. The motion of a point
proceeds according to the transformation
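$$ x_{n+1} = 2x_n \bmod 1, \qquad y_{n+1} = \begin{cases} a\,y_n & \text{for } 0 \le x_n < \tfrac{1}{2}, \\ \tfrac{1}{2} + a\,y_n & \text{for } \tfrac{1}{2} \le x_n \le 1, \end{cases} \qquad (26.24) $$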
with a parameter 0 < a ≤ 1/2. Interpret this transformation, and calculate the fractal
dimension of the set that is generated by application of (26.24) on the unit square.
For a = 1/2, the volume is conserved. However, a connected domain in phase space
is rapidly torn up by the repeated “back-folding” (here it is more a back-shifting) and
distorted beyond recognition. The figure represents the first steps of the transformation
of a circle of radius 1/2.
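The transformation (26.24) is easy to try out numerically. The sketch below is an illustration under stated assumptions: the function names, the sample parameters a = 0.3 and n = 6, and the random test points are hypothetical choices. It iterates the map, lists the horizontal strips that contain the image of the unit square after n steps, and evaluates the ratio ln N / ln(1/ε) for a covering with boxes of side ε = aⁿ.

```python
import math
import random

def baker_step(x, y, a):
    """One application of the map (26.24) to a point of the unit square."""
    y_new = a * y if x < 0.5 else 0.5 + a * y
    return (2.0 * x) % 1.0, y_new

def strip_edges(n, a):
    """Lower edges of the 2**n horizontal strips (height a**n) that contain
    the image of the unit square after n steps."""
    edges = [0.0]
    for _ in range(n):
        edges = [a * e for e in edges] + [0.5 + a * e for e in edges]
    return sorted(edges)

def box_dimension_from_strips(a):
    """ln N / ln(1/eps) for eps = a**n and N = 2**n / eps boxes;
    the ratio does not depend on n."""
    return 1.0 + math.log(2.0) / math.log(1.0 / a)

random.seed(1)
a, n = 0.3, 6
points = [(random.random(), random.random()) for _ in range(1000)]
for _ in range(n):
    points = [baker_step(x, y, a) for x, y in points]
edges = strip_edges(n, a)
inside = all(any(e - 1e-12 <= y <= e + a**n + 1e-12 for e in edges)
             for _, y in points)
print(len(edges), inside, box_dimension_from_strips(a))
```

Only a few iteration steps are used on purpose: in double precision the doubling of x discards one binary digit per step, so long runs of this map lose all information about the initial x.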
For a < 1/2, the volume element shrinks in each iteration step and goes asymptotically
to zero. In the sense of Chap. 23, one also speaks of the dissipative baker
transformation. The resulting geometrical object is a fractal which in the y-direction displays
the structure of a Cantor set. Just as for the latter, one may calculate the fractal
dimension according to (26.17). In the y-direction, the first transformation step converts the
unit interval into 2 parts of length a, the next step into 4 parts of length a^2,
and generally the nth step into 2^n parts of length a^n. If one selects a sequence of coverings
by boxes of side length εn = a^n, then N(εn) = 2^n (1/εn), where the second factor originates from the
covering of the x-axis. The fractal dimension is therefore
$$ D_f = \lim_{n\to\infty} \frac{\ln N(\epsilon_n)}{\ln(1/\epsilon_n)} = \lim_{n\to\infty} \frac{n\ln 2 + n\ln(1/a)}{n\ln(1/a)} = 1 + \frac{\ln 2}{\ln(1/a)}. $$
In this chapter, we shall get to know various dynamical systems that may display
highly complex forms of motion despite their very simple, even trivial structure. We
shall meet previously discussed concepts such as bifurcations and periodic and strange
attractors (limit cycles and chaotic trajectories) in a series of specific examples. For
the sake of simplicity, we shall begin with the investigation of systems with discrete dynamics.
The examples considered will gradually become physically more and more
realistic.
So far, we have been interested in dynamical systems that are described by differential
equations continuous in time. In the Poincaré mapping we have seen
the possibility of reducing a continuous system to a time-discrete system. But since
the Poincaré mapping of a realistic physical system as a rule cannot be given in
closed form, it is instructive to consider simple mathematical model
mappings instead. As it turns out, such models share many properties with systems that
are more interesting from a physical point of view.
Let us consider a sequence of vectors x0, x1, x2, . . . which is generated by a simple
iterative mapping, i.e., by repeated application of a continuous function f(x),
$$ \mathbf{x}_{n+1} = \mathbf{f}(\mathbf{x}_n). $$
EXAMPLE 27.1
One of the simplest but not least impressive examples of an iterative mapping is generated
by the logistic function.1 This function is defined as an inverted parabola,
$$ f(x) = \alpha x (1 - x), \qquad \text{i.e.,} \quad x_{n+1} = \alpha x_n (1 - x_n), \qquad (27.2) $$
with zeros at the borders of the unit interval and a maximum at f(1/2) = α/4. The
logistic function depends on a real parameter α which shall lie in the range 1 < α ≤ 4,
since otherwise the mapping either would lead beyond the unit interval [0, 1] (α < 0,
α > 4) or would become trivial (0 < α < 1, all solutions converge toward x = 0).
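The mapping (27.2) has the two fixed points
$$ x_{s1} = 0 \quad \text{and} \quad x_{s2} = 1 - \frac{1}{\alpha}, \qquad (27.3) $$
with the derivatives
$$ f'(x_{s1}) = \alpha \quad \text{and} \quad f'(x_{s2}) = 2 - \alpha. \qquad (27.4) $$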
Since α shall be > 1, the first fixed point is always unstable. The second one is stable
(point attractor) if 1 < α < 3. In this parameter range all solutions look like that repre-
sented in Fig. 27.2 for α = 2.8; i.e., xn converges toward the fixed point xs2 , for small
starting value initially monotonically increasing, later on alternating. If α exceeds the
value α1 = 3, then according to (27.4), the fixed point xs2 becomes unstable, too.
The geometric or numerical construction of the solution shows that a stable limit
cycle of period 2 evolves, as is illustrated in Fig. 27.2(b) for the case α = 3.3. For
1 The logistic mapping may be interpreted with some effort as the model equation of a particular
physical system; see the remark at the end of Example 27.3. It is also related to other fields of science.
The logistic mapping was introduced first in 1845 in biological population dynamics by the Belgian
biomathematician P.F. Verhulst. There it describes the evolution of a population of animals or plants
in a restricted environment whereby each iteration corresponds to a new generation (the old one
thereby dies off). For a low population density, the reproduction rate is positive (if α > 1). But, if one
approaches the saturation density x = 1, the reproduction rate decreases by overpopulation such that
xn+1 < xn . The simple parabola ansatz is sufficient to generate complex dynamics.
520 27 Systems with Chaotic Dynamics
α = α1, a period doubling of the solution arises. One faces a pitchfork bifurcation as
is illustrated in Fig. 25.3. The old solution becomes unstable, and there appears a pair
of new stable solution branches. (Note that here both branches simultaneously belong
to the solution, since the latter one periodically jumps back and forth between the
branches. This behavior differs from the case studied in Chap. 25, where both solution
branches were independent of each other.)
The period doubling can also be understood by inspecting the iterated mapping
f²(x) = f(f(x)). As already mentioned, for this mapping a periodic solution of period 2 reduces
to a fixed point. The function
$$ f^2(x) = \alpha^2 x (1 - x)\bigl[1 - \alpha x (1 - x)\bigr] \qquad (27.5) $$
is a polynomial of fourth order with zeros at x = 0 and x = 1. Figure 27.3 shows this
function for various values of the parameter α. For α < α1, the function f² intersects
the bisector y = x, just as f does, only at the two points xs1 and xs2 given in (27.3). As
a polynomial of fourth order, (27.5) allows however four intersection points with a
straight line, and just this happens for α > α1. As shown in Fig. 27.3, f² has a minimum
at x = 1/2 for α > 2 and two maxima at x = 1/2 ± √(1/4 − 1/(2α)). This can
be seen without much calculation from
$$ \frac{d}{dx} f^2(x) = f'\bigl(f(x)\bigr)\, f'(x). \qquad (27.6) $$
A zero of the derivative (extreme value of f²) follows from f′(x) = 0; thus x = 1/2.
For the two other zeros one finds
$$ x = f^{-1}\!\left(\tfrac12\right), \quad \text{since} \quad f'\bigl(f(x)\bigr) = f'\!\left(f\!\left(f^{-1}\!\left(\tfrac12\right)\right)\right) = f'\!\left(\tfrac12\right) = 0, \qquad (27.7) $$
which as a quadratic equation yields the two roots mentioned above. For the slope of
f²(x) at the position of the fixed point xs2, one obtains
$$ \frac{d}{dx} f^2(x)\Big|_{x_{s2}} = f'\bigl(f(x_{s2})\bigr)\, f'(x_{s2}) = \bigl[f'(x_{s2})\bigr]^2 = (2 - \alpha)^2. \qquad (27.8) $$
For 1 < α < α1 = 3, this slope is smaller than 1, and therefore xs2 is a stable fixed point of
f² too. Of course this must be so because f² simply picks every second value of the
27.2 One-Dimensional Mappings 521
series of the xn, and thus it takes over the stability from the solution of the mapping f.
For α = α1, f²(x) touches the bisector, and for α > α1 two new intersection points x1,
x2 arise. These are two stable fixed points of the iterated mapping f², i.e., f²(x1) =
x1, f²(x2) = x2, which are related by x2 = f(x1) and x1 = f(x2).
If the parameter α increases further, these fixed points also become unstable at a
critical value α = α2, and the game repeats for the mapping f²(x). A further bifurcation
arises there with period doubling, and the new stable solution has a period length
of 4. Figure 27.4 shows this solution for the example α = 3.48.
Fig. 27.4. (a) For α = 3.48, the iterated logistic function f(f(x)) has one unstable and
two stable fixed points at x ≈ 0.43298 and x ≈ 0.85437. (b) The trajectory shows a period
of length 4
The critical value α2 can still be given analytically, but with some effort. To
this end, one first has to find the position of the fixed points x1 and x2. The condition
f²(x) = x together with (27.5) leads to a quartic equation. But we already know two
of the roots, namely, the fixed points xs1 and xs2 from (27.3), which may be factored
out by polynomial division:
$$ f^2(x) - x = -\alpha\, x \left(x - 1 + \frac{1}{\alpha}\right)\bigl[\alpha^2 x^2 - \alpha(1 + \alpha)x + \alpha + 1\bigr] = 0. \qquad (27.9) $$
Setting the last bracket to zero yields a quadratic equation that determines the two new
fixed points, namely,
$$ x_{1,2} = \frac12\left(1 + \frac1\alpha\right) \pm \frac12 \sqrt{\left(1 + \frac1\alpha\right)\left(1 - \frac3\alpha\right)}. \qquad (27.10) $$
The second bifurcation occurs when these fixed points become unstable. This is decided
by the magnitude of the slope of the function f²(x) at the points x1, x2. The
slope begins with the value +1 at α = α1 and then decreases continuously to −1. At
this point, the next bifurcation arises. One therefore has to solve the equation
$$ \frac{d}{dx} f^2(x)\Big|_{x_{1,2}} = -1. \qquad (27.11) $$
Taking into account (27.6) and (27.10), equation (27.11) seems to depend on α in
a very complicated manner. After some elementary transformations, (27.11) reduces
however to a simple quadratic equation that holds for both fixed points in common,
$$ \alpha^2 - 2\alpha - 5 = 0. \qquad (27.12) $$
Hence, the critical bifurcation parameter at which the 2-cycle turns into the 4-cycle
is
$$ \alpha_2 = 1 + \sqrt{6} = 3.4495\ldots. \qquad (27.13) $$
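The statements about the 2-cycle can be checked numerically. The following is a minimal sketch, assuming the logistic function f(x) = αx(1 − x) and the cycle points x₁,₂ = ½(1 + 1/α) ± ½√((1 + 1/α)(1 − 3/α)) of (27.10); the function names and the sample value α = 3.2 are illustrative choices. It verifies that f maps the two points onto each other and evaluates the slope of f(f(x)) at the cycle by the chain rule (27.6).

```python
import math

def f(x, alpha):
    return alpha * x * (1.0 - x)

def fprime(x, alpha):
    return alpha * (1.0 - 2.0 * x)

def two_cycle(alpha):
    """Points x1, x2 of the 2-cycle; requires alpha > 3."""
    mean = 0.5 * (1.0 + 1.0 / alpha)
    half_gap = 0.5 * math.sqrt((1.0 + 1.0 / alpha) * (1.0 - 3.0 / alpha))
    return mean + half_gap, mean - half_gap

def two_cycle_slope(alpha):
    """Slope of f(f(x)) at either cycle point: f'(x1) * f'(x2)."""
    x1, x2 = two_cycle(alpha)
    return fprime(x1, alpha) * fprime(x2, alpha)

alpha2 = 1.0 + math.sqrt(6.0)
x1, x2 = two_cycle(3.2)
print(x1, x2, f(x1, 3.2), f(x2, 3.2))
print(two_cycle_slope(3.2), two_cycle_slope(alpha2))
```

The product of the two derivatives works out to 1 − (α + 1)(α − 3), which passes through −1 exactly at α = 1 + √6.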
That both fixed points simultaneously become unstable is plausible and immediately
follows from (27.6), since the slope of f²(x) has the same value at both points:
$$ \frac{d}{dx} f^2(x)\Big|_{x_1} = f'\bigl(f(x_1)\bigr)\, f'(x_1) = f'(x_2)\, f'\bigl(f(x_2)\bigr) = \frac{d}{dx} f^2(x)\Big|_{x_2}. \qquad (27.14) $$
It is not surprising that with increasing α the cycle of period 4 becomes unstable
too. There arises a full cascade of period doublings at α1, α2, α3, . . . . In the interval
αk < α < αk+1, there exists a stable limit cycle of period 2^k. Mathematically, one may
consider instead of the limit cycle the iterated function f^(2^k)(x), which displays a set of
2^k distinct stationary fixed points.
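The first stages of the cascade can be observed directly by iterating the mapping and looking for the shortest period of the asymptotic orbit. This is only a sketch under assumptions: the logistic function is taken as f(x) = αx(1 − x), and the function names, the number of discarded transients, the tolerance, and the sample parameter values are illustrative choices.

```python
def logistic(x, alpha):
    return alpha * x * (1.0 - x)

def attractor_period(alpha, x0=0.3, transients=100_000, max_period=64, tol=1e-9):
    """Smallest p <= max_period with |x_{n+p} - x_n| < tol after the transients.

    Returns None if no such p is found (e.g., for a chaotic trajectory).
    """
    x = x0
    for _ in range(transients):
        x = logistic(x, alpha)
    orbit = [x]
    for _ in range(max_period):
        orbit.append(logistic(orbit[-1], alpha))
    for p in range(1, max_period + 1):
        if abs(orbit[p] - orbit[0]) < tol:
            return p
    return None

for alpha in (2.8, 3.3, 3.5, 3.55):
    print(alpha, attractor_period(alpha))
```

Close to a bifurcation point the convergence toward the cycle becomes very slow, so the number of transients would have to be increased there.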
One should note that the critical points αk are more and more closely spaced. As
was found at first empirically and then analytically, the αk obey the law of a geometric
sequence that converges toward a cluster point α∞:
$$ \alpha_k \sim \alpha_\infty - \frac{1}{\delta^k}. \qquad (27.15) $$
The number δ is a constant that may be determined from the ratio
$$ \delta = \lim_{k\to\infty} \frac{\alpha_k - \alpha_{k-1}}{\alpha_{k+1} - \alpha_k}. \qquad (27.16) $$
The numerical values are
$$ \delta = 4.669201\ldots, \qquad (27.17) $$
$$ \alpha_\infty = 3.569945\ldots. \qquad (27.18) $$
The cascade of period doublings was considered first by Großmann and Thomae.2
Feigenbaum3 showed that the behavior (27.15) is not restricted to the logistic mapping
but is universally valid for a large class of iterative mappings.4
Surprisingly, the numerical value (27.17) is also universally valid, and δ is there-
fore called the Feigenbaum constant. Essentially it suffices that the iterated function
be smooth and display a quadratic maximum. The mathematical properties of the bi-
furcation cascade have been thoroughly studied; for a survey see, e.g., H. Schuster,
Deterministic Chaos, VCH Publishing Company, 1989.
Of particular interest is what happens beyond α∞. For α = α∞ obviously a cycle
of “infinite period” arises, i.e., an aperiodic, never-repeating solution. α > α∞ is the
domain of chaos that is characterized by irregular and seemingly random trajectories.
As an example, Fig. 27.5 shows a section of such a chaotic solution calculated for
α = 3.9.
The attractor diagram of the logistic mapping: Figure 27.6 presents an attractor
diagram of the logistic mapping that is highly instructive and of amazing complexity.
For each value of α on the abscissa, the values visited by a trajectory are plotted as
a point cloud along the ordinate direction. The first iteration steps (here 500) were
omitted in order to filter out “transient processes” (transients) and to represent the
asymptotic attractor itself. The subsequent iterations (200 for each value of α) are then
plotted in the diagram.
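The recipe just described translates almost literally into a short program. The sketch below is an assumption-laden illustration: it takes the logistic function as f(x) = αx(1 − x), and the function names, the starting value, the parameter range, and the number of columns are hypothetical choices. It only produces the point cloud; any plotting tool can then draw the pairs as dots.

```python
def logistic(x, alpha):
    return alpha * x * (1.0 - x)

def attractor_points(alpha, x0=0.3, transients=500, samples=200):
    """Values visited after discarding the transients, for one parameter value."""
    x = x0
    for _ in range(transients):
        x = logistic(x, alpha)
    points = []
    for _ in range(samples):
        x = logistic(x, alpha)
        points.append(x)
    return points

def attractor_diagram(alpha_min=2.8, alpha_max=4.0, columns=600):
    """List of (alpha, x) pairs; plotting them as dots gives the diagram."""
    step = (alpha_max - alpha_min) / (columns - 1)
    cloud = []
    for i in range(columns):
        # min() guards against rounding slightly above alpha_max
        alpha = min(alpha_min + i * step, alpha_max)
        cloud.extend((alpha, x) for x in attractor_points(alpha))
    return cloud
```

A call such as `attractor_diagram(columns=5)` returns 5 × 200 pairs; in the periodic range the 200 values of one column collapse onto a few distinct points.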
The left part of Fig. 27.6 shows the first segments of the bifurcation cascade: At α1,
α2, α3, the attractor splits into 2, 4, and 8 branches, respectively. According to (27.15),
the further critical points αk follow each other so closely that they are no longer resolved
in the figure. Beyond α∞ one sees continuous bands that are more or less uniformly
grey. The grey domains are traversed by some kind of “scars.” Obviously these
places are visited more frequently by the trajectory, and thus the probability density
P(x) of the attractor is increased here.
One notes that in the chaotic domain α > α∞, the orbits are confined to a partial
interval of [0, 1]. The borders of this interval are obtained as xmax(α) = f(1/2) = α/4
and xmin(α) = f²(1/2) = (1/16)α²(4 − α). In the limit α = 4, the full unit interval is
visited; the attractor covers the interval completely. More precisely, for any
point x ∈ [0, 1] and any ε > 0, one can find an n(ε) such that |xn − x| < ε. Thus, the
trajectory approaches each point arbitrarily closely if one waits for a sufficiently long
time.
If the parameter is reduced from α = 4, various interesting phenomena arise.
For example, when going below α′1 ≈ 3.6785 a band splitting is observed. The chaotic
attractor, which formerly covered a connected interval, splits at α = α′1 into two disjoint
domains, as is clearly seen in Fig. 27.6. The trajectory thereby alternates back and
forth between the two “partial bands.” For a further reduction of α, the partial bands
in turn split again, into 4 parts at α′2 ≈ 3.5926, and so on. Similar to the bifurcation
cascade α1, α2, . . . , α∞ of the regular attractor, there also exists a kind of reversed
bifurcation cascade α′1, α′2, . . . , α′∞ of the chaotic attractor. Both cascades meet at a
common limit α′∞ = α∞.
A closer inspection of the attractor diagram Fig. 27.6 reveals that there are also
windows of periodic solutions embedded in the chaotic domain. Particularly striking
is the domain with solutions of period 3, which occur above α = 1 + √8 ≈ 3.8284. Within
this window there is a parameter value αs3 at which a fixed point of the triply iterated
mapping lies at the position of the maximum of f(x), i.e., at x = 1/2:
$$ f^3\!\left(\tfrac12\right) = \tfrac12. \qquad (27.19) $$
A numerical solution of (27.19) gives αs3 ≈ 3.8319, slightly above the lower edge of the
window. The point x = 1/2 has a particular meaning, since here the derivative vanishes:
f′(1/2) = 0. Because
$$ \frac{d}{dx} f^3(x) = f'\bigl(f^2(x)\bigr)\, f'\bigl(f(x)\bigr)\, f'(x), \qquad (27.20) $$
this property transfers also to all iterated mappings. This guarantees that the fixed point
(27.19) is stable. Here, one even speaks of a superstable cycle, since the magnitude
of the derivative of f^n(x), which determines the stability, takes the smallest possible
value, namely zero. Due to the continuity of the mapping as a function of the parameter,
the period-3 cycle is still stable in a finite neighborhood of αs3. In the downward
direction the window of stability borders on the chaotic region.
If α increases beyond αs3 , one again finds a cascade of bifurcations with period
doubling, i.e., the 3-cycle turns into a 6-cycle, etc. The bifurcations follow each other
more and more closely until chaotic solutions emerge again.
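The superstable parameter values can be located numerically by solving the condition that the m-fold iterate maps x = 1/2 onto itself. The following is a hedged sketch: it assumes the logistic function f(x) = αx(1 − x), and the function names and the search brackets are assumptions chosen by hand so that each bracket contains exactly one sign change. For the period-3 window the root found this way lies near α ≈ 3.8319, slightly above the onset of the stable 3-cycle at 1 + √8 ≈ 3.8284.

```python
import math

def iterate(x, alpha, n):
    """n-fold iterate of the logistic function."""
    for _ in range(n):
        x = alpha * x * (1.0 - x)
    return x

def superstable_alpha(period, lo, hi, steps=60):
    """Bisection for f^period(1/2) = 1/2 on the bracket [lo, hi]."""
    def g(alpha):
        return iterate(0.5, alpha, period) - 0.5
    g_lo = g(lo)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

alpha_s2 = superstable_alpha(2, 3.1, 3.4)    # compare with 1 + sqrt(5)
alpha_s3 = superstable_alpha(3, 3.83, 3.84)  # inside the period-3 window
print(alpha_s2, alpha_s3)
```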
This consideration is not restricted to the cycle of period 3 but holds also for arbitrarily
larger period lengths. There exist superstable cycles for any natural number m
(their number even increases exponentially with m) with period length m, which are
defined by
$$ f^m\!\left(\tfrac12\right) = \tfrac12. \qquad (27.21) $$
They are always enclosed by a small window of regular cyclic orbits in the otherwise
chaotic domain.
It is highly instructive to study the attractor diagram in detail. Figure 27.7 shows a
small section α = 3.52 to α = 3.65 at the border of the 3-cycle window. The agree-
ment with the full attractor diagram Fig. 27.6 is amazing. Although there are tiny dif-
ferences in the shape of both geometrical objects, their structure is nevertheless very
similar. This property—a partial section looks just as the entire diagram—is called
self-similarity (in the parameter space). For the attractor diagram of the logistic map-
ping, the process of successive magnification of a section may be continued ad infini-
tum: In Fig. 27.7, one again finds partial domains that resemble the entire figure and
so on, without an end.
The Lyapunov exponent of the logistic mapping: In the nonchaotic range, the
Lyapunov exponent may be calculated analytically rather simply. We first consider
the parameter range α < α1 = 3. Here, a stable fixed point exists at xs2 = 1 − 1/α; see
(27.3). The quantity to be calculated is
$$ \sigma = \lim_{n\to\infty} \frac1n \sum_{l=0}^{n-1} \ln|f'(x_l)|. \qquad (27.22) $$
The influence of transients is excluded by the limit process. Because xl → xs2, only
the fixed point contributes, and thus the sum reduces to a single term:
$$ \sigma = \ln|f'(x_{s2})| = \ln|2 - \alpha|. \qquad (27.23) $$
In the range α1 < α < α2, the trajectory alternates between the two points x1 and x2 of
the 2-cycle (27.10), so that
$$ \sigma = \frac12 \ln|f'(x_1)| + \frac12 \ln|f'(x_2)| = \frac12 \ln|f'(x_1) f'(x_2)|. \qquad (27.24) $$
For the derivatives, we get
$$ f'(x_{1,2}) = \alpha(1 - 2x_{1,2}) = -1 \mp \sqrt{(\alpha + 1)(\alpha - 3)}, \qquad (27.25) $$
and thus
$$ \sigma = \frac12 \ln\bigl|1 - (\alpha + 1)(\alpha - 3)\bigr|. \qquad (27.26) $$
This function begins with the value σ = 0 at α = α1, then decreases monotonically
and diverges to −∞ at α = 1 + √5. This is again the superstable point of the 2-cycle, since
one easily confirms that
$$ f^2\!\left(\tfrac12\right) = \tfrac12 \quad \text{at} \quad \alpha = 1 + \sqrt5. \qquad (27.27) $$
σ then increases again and finally reaches the value σ = 0. This happens when the
argument of the logarithm in (27.26) takes the value −1, and thus 1 − (α + 1)(α − 3) =
−1, which leads to the quadratic equation (27.12). The solution α = α2 = 1 + √6 is
the bifurcation point at which the 2-cycle becomes unstable. In the further course
of the bifurcation cascade α2 < α < α∞, the game repeats, and the Lyapunov
exponent oscillates in the interval 0 ≥ σ > −∞.
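The analytic expressions can be compared with a direct evaluation of the time average (27.22). This is a minimal sketch under assumptions: the logistic function is taken as f(x) = αx(1 − x) with f′(x) = α(1 − 2x), the fixed-point value uses f′(xs2) = 2 − α, and the function names, trajectory lengths, and sample parameters are illustrative choices.

```python
import math

def lyapunov_numeric(alpha, x0=0.3, transients=1000, n=20000):
    """Time average of ln|f'(x)| along a trajectory, cf. (27.22).

    Note: math.log fails if the orbit hits x = 1/2 exactly (superstable case).
    """
    x = x0
    for _ in range(transients):
        x = alpha * x * (1.0 - x)
    total = 0.0
    for _ in range(n):
        total += math.log(abs(alpha * (1.0 - 2.0 * x)))
        x = alpha * x * (1.0 - x)
    return total / n

def lyapunov_analytic(alpha):
    """Closed forms for 1 < alpha < 1 + sqrt(6)."""
    if alpha < 3.0:
        return math.log(abs(2.0 - alpha))  # stable fixed point
    # stable 2-cycle, cf. (27.26)
    return 0.5 * math.log(abs(1.0 - (alpha + 1.0) * (alpha - 3.0)))

for alpha in (2.5, 3.2, 3.9):
    print(alpha, lyapunov_numeric(alpha))
```

For α = 2.5 and α = 3.2 the numerical average reproduces the closed forms, while for α = 3.9 it comes out positive, as expected in the chaotic range.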
A qualitatively new feature arises in the chaotic range α > α∞ , since here σ takes
positive values. This is represented in Fig. 27.8, for which the Lyapunov exponent was
determined numerically. To the left of α = α∞ , one faces the range with negative σ
that has been discussed already. For α > α∞, σ becomes positive and on the average
increases with increasing α. In the limit α = 4, there results the maximum value
σ = ln 2 ≈ 0.6931, as will be shown in Exercise 27.2. The chaotic domain is repeatedly
interspersed with windows corresponding to regular solutions where σ becomes
negative. The figure reflects the complexity of the function σ (α) only imperfectly. The
windows may be perceived only approximately, due to the limited resolution. The in-
dicated peaks pointing down in the figure are actually poles extending to σ → −∞.
A realistic representation of σ (α), when plotted with finite line width, would display
only a largely homogeneous black block that extends to −∞. There are infinitely
many windows with stable cycles, and one may even show that they lie densely over
the entire real interval 0 < α < 4: Any arbitrarily small vicinity of each point α still
includes stable cycles!
The function σ (α) also has the property of self-similarity, just like the attractor
in Fig. 27.6. For each step of magnification, the partial sections look like the full
figure.
EXERCISE 27.2
Solution. (a) The graph of the function f(y) = 2y (mod 1) consists of two pieces of
a straight line with the slope 2 that are shifted relative to each other, as is represented
in Fig. 27.9. Since f′(y) = 2 > 1 everywhere, stable fixed points cannot exist. The
iterated solution of (27.28) is simple,
$$ y_n = 2^n y_0 \pmod 1. \qquad (27.30) $$
Hence, all iterated solutions of an initial value y0 are explicitly known; nevertheless
the mapping is chaotic! This is due to the factor 2^n, which implies an exponential
enhancement of the smallest deviations in the initial value y0. As is represented in the
figure, an interval Δy is stretched by the Bernoulli shift by a factor 2 to the length
2Δy. Values falling into the range y > 1 are folded back to the unit interval by the
modulo operation. This repeated sequence of stretching and folding is characteristic
of chaotic mappings. In this way there results a thorough mixing of the trajectories
and a sensitive dependence on the initial conditions.
Fig. 27.9. The mapping function of the Bernoulli shift
The Bernoulli shift mapping may be visualized by writing y as a number in binary
representation:
$$ y = \sum_{k=1}^{\infty} b_k 2^{-k}, \quad \text{i.e.,} \quad y = 0.b_1 b_2 b_3 b_4 \ldots \quad \text{with } b_k = 0 \text{ or } 1. \qquad (27.31) $$
A doubling of y corresponds to a left shift (hence the name) of the binary digits bk,
and the modulo operation ensures that the digit before the binary point is cut off:
$$ f(y) = b_1.b_2 b_3 b_4 \ldots \pmod 1 = 0.b_2 b_3 b_4 b_5 \ldots. \qquad (27.32) $$
Now the effect of the Bernoulli shift becomes clear: Each iteration moves the “back
digits” of the binary expansion forward. If the initial value y0 is known to an accuracy of
2^−m, this information is exhausted after m iteration steps. Later on there remains only
“numerical noise,” and the trajectory wanders through the unit interval in an
unpredictable way.
Mathematically, there still remains a subtle difference between rational and irrational
values of the initial condition. A rational number (a fraction p/q) y0 has a
binary representation which (after a finite number of steps) becomes periodic. Consequently
the trajectory yn formed according to (27.28) with (27.32) also becomes
periodic. A simple example is provided by the cycle 2/3, 1/3, 2/3, . . . , since
$$ y_0 = 0.101010\ldots = \frac12\left(1 + \frac1{2^2} + \frac1{2^4} + \cdots\right) = \frac12\,\frac1{1 - 1/4} = \frac23, $$
$$ y_1 = 0.010101\ldots = \frac14\left(1 + \frac1{2^2} + \frac1{2^4} + \cdots\right) = \frac14\,\frac1{1 - 1/4} = \frac13, $$
$$ y_2 = 0.101010\ldots, \quad \ldots $$
The rational numbers lie densely on the real axis. Thus, there are infinitely many initial
values leading to periodic trajectories, and in any arbitrarily small vicinity of a
point y one finds such solutions. On the other hand, the set of rational numbers has
measure zero; these are therefore “atypical” numbers. If one adopts a “typical”
initial value y0, e.g., a random number in the interval [0, 1], then the probability for
y0 being rational and thus leading to a periodic trajectory is zero. For a
physicist, rational initial conditions do not play a role because of the finite precision of
measurement. In numerical simulation, the situation is different: If the computer
stores the numbers with a precision of m bits, then after at most m steps the
simulation of the Bernoulli shift becomes meaningless. Fortunately, for many purposes
it makes no difference whether one is dealing with a cyclic solution with a very long
period or with a genuine nonperiodic solution.
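Both remarks, the periodic orbits of rational initial values and the breakdown of a finite-precision simulation, can be reproduced in a few lines. The sketch below is illustrative: the function names and the initial values are assumptions. It iterates the doubling map y → 2y mod 1 once with exact rational arithmetic and once with ordinary double-precision numbers.

```python
from fractions import Fraction

def bernoulli_exact(y0, steps):
    """Trajectory of y -> 2y mod 1 in exact rational arithmetic."""
    y = Fraction(y0)
    orbit = [y]
    for _ in range(steps):
        y = (2 * y) % 1
        orbit.append(y)
    return orbit

def bernoulli_float(y0, steps):
    """The same map in double precision; each step discards one binary digit."""
    y = float(y0)
    for _ in range(steps):
        y = (2.0 * y) % 1.0
    return y

print(bernoulli_exact(Fraction(2, 3), 4))
print(bernoulli_exact(Fraction(1, 10), 60)[-1])
print(bernoulli_float(0.1, 60))
```

In exact arithmetic the orbit of 1/10 stays on a cycle forever, whereas the floating-point orbit of 0.1 has shifted out all stored binary digits after 60 steps and sits at zero.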
(b) The logistic equation
$$ x_{n+1} = 4x_n(1 - x_n) \qquad (27.33) $$
is solved by the substitution xn = (1/2)[1 − cos(2πyn)] = sin²(πyn) with yn+1 = 2yn (mod 1),
i.e., the Bernoulli shift mapping. When transformed back to the variable x, the solution
(27.37) reads
$$ x_n = \frac12\bigl[1 - \cos(2\pi 2^n y_0)\bigr] = \sin^2(\pi 2^n y_0) = \sin^2\bigl(2^n \arcsin\sqrt{x_0}\bigr). \qquad (27.38) $$
As was discussed in (a), this leads for almost all initial values x0 to chaotic trajectories
which cannot be calculated even numerically if n becomes large.
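For small n the closed-form solution can be compared with a direct iteration of the mapping. This is a sketch under assumptions: the closed form is taken as xₙ = sin²(2ⁿ arcsin √x₀), and the function names and the starting value are illustrative choices. The comparison is restricted to a few steps because both evaluations lose roughly one binary digit of accuracy per iteration.

```python
import math

def logistic4_closed_form(x0, n):
    """x_n from the closed-form solution sin^2(2^n arcsin sqrt(x0))."""
    return math.sin(2 ** n * math.asin(math.sqrt(x0))) ** 2

def logistic4_iterated(x0, n):
    """x_n from n direct applications of x -> 4x(1 - x)."""
    x = x0
    for _ in range(n):
        x = 4.0 * x * (1.0 - x)
    return x

for n in range(0, 15, 2):
    print(n, logistic4_closed_form(0.3, n), logistic4_iterated(0.3, n))
```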
(c) In the range of definition of the Bernoulli shift mapping (27.28), no point is
particularly distinguished. For typical, i.e., irrational initial conditions, the
solution (27.30) will visit all numbers in the interval (0, 1) with the same probability;
hence P(y) = 1. This implies a corresponding probability density of the logistic mapping of
$$ P(x) = 2P(y)\left|\frac{dy}{dx}\right| = 2\,\frac{1}{2\pi \sin(\pi y)\cos(\pi y)} = \frac{1}{\pi\sqrt{x(1 - x)}}. \qquad (27.39) $$
The factor 2 arises because the transformation equation has two solutions y symmetrical
about y = 1/2 for any value of x. The probability density P(x) is minimum
at x = 1/2 and increases toward the borders of the unit interval. At x → 0 and x → 1,
the function diverges but remains integrable. It is normalized to unity.
For the calculation of the Lyapunov exponent (26.8), we replace the mean value
over the time sequence by a mean value over the probability distribution P(x):
$$ \sigma = \lim_{n\to\infty}\frac1n\sum_{l=0}^{n-1}\ln|f'(x_l)| = \int_0^1 dx\, P(x)\ln|f'(x)|. \qquad (27.40) $$
Systems for which this substitution time average ↔ phase-space average is permissible
are called ergodic. The proof of ergodicity of a system is in general not simple.
Evaluated in the variable y, where the Bernoulli shift has the slope f′(y) = 2 for all y and
the probability distribution P(y) is normalized to unity, the Lyapunov exponent of the
logistic mapping at α = 4 is obtained as
$$ \sigma = \ln 2 = 0.6931\ldots. \qquad (27.41) $$
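The phase-space average can also be evaluated numerically in the original variable x. The following sketch rests on assumptions that are not spelled out in this form in the text: it uses the density P(x) = 1/(π√(x(1 − x))) and the derivative f′(x) = 4(1 − 2x) of the logistic function at α = 4, and it removes the endpoint singularities of P(x) with the substitution x = sin²(πy), for which P(x) dx = 2 dy on 0 < y < 1/2 and f′(x) = 4 cos(2πy). The function name and the number of sample points are illustrative.

```python
import math

def lyapunov_phase_average(samples=200_000):
    """Midpoint-rule estimate of the integral of P(x) ln|f'(x)| for alpha = 4."""
    h = 0.5 / samples
    total = 0.0
    for k in range(samples):
        y = (k + 0.5) * h  # midpoints never hit y = 1/4, where f' vanishes
        total += 2.0 * math.log(abs(4.0 * math.cos(2.0 * math.pi * y))) * h
    return total

print(lyapunov_phase_average(), math.log(2.0))
```

The logarithmic singularity at x = 1/2 is integrable, so the simple midpoint rule converges, although only slowly near that point.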
EXAMPLE 27.3
In this section, we shall meet another example of a discrete mapping that, despite
its simple form, leads to complex solutions and to chaotic behavior. In the process, several
new concepts arise, and a route is described that leads from quasiperiodic to chaotic
motion.
In contrast to the logistic mapping, which was introduced as a purely mathematical
example, we now consider the motion of a specific mechanical system, namely, a
damped rotator that is under the influence of an external force. The corresponding
equation of motion for the rotational angle θ reads
can be rewritten as usual as an autonomous system of three coupled differential equations
of first order. With x = θ, y = θ̇, and z = t, we have
$$ \dot{x} = y, \qquad \dot{y} = -\beta y + M(x)\sum_{n=0}^{\infty}\delta(z - nT), \qquad \dot{z} = 1. \qquad (27.45) $$
Because of the specific form of the force (27.43), the rotator is again and again accelerated
by the pulse, but otherwise moves freely, influenced only by friction. Therefore,
the equations of motion can be integrated exactly. For this purpose, we introduce the
discretized variables
Thus, position and velocity are sampled shortly before the individual kicks. We now
consider kick number n and integrate the equation of motion over the nth
time interval nT − ε < t < (n + 1)T − ε. Since only one kick contributes in this
interval, the equation of motion for y reads
The solution of the homogeneous differential equation holding between the kicks may
be given immediately:
as is schematically represented in Fig. 27.10. With (27.48) and (27.49), one obtains
$$ y(t) = \bigl[y_n + M(x_n)\bigr] e^{-\beta(t - nT)} \quad \text{for} \quad nT < t < (n + 1)T. \qquad (27.51) $$
By performing the modulo operation, one takes into account that x is a periodic angular
coordinate. The system (27.53) describes a Poincaré mapping of the periodically
kicked rotator.
The angular dependence of the torque M(x) is determined by the specific physical
system. If a body is moving on a circular orbit and is under a force pointing always
in the same direction (see Fig. 27.11), then the torque is proportional to the sine of
the angle, M = r × F = −rF sin θ ez. In addition, we take into account an angle-independent
torque M0, i.e., with K0 = rF:
$$ M(x) = M_0 + K_0 \sin x. \qquad (27.55) $$
The system (27.53) and (27.54) with the nonlinear force law (27.55) is called the dissipative
circular mapping (“dissipative” since we are dealing with the limit of strong
damping) and displays very interesting dynamic properties.
Fig. 27.11. The force F(t) always points in the same direction, independent of the displacement
angle θ of the rotator
The equations may be written still more clearly by introducing the following abbreviations:
$$ b = e^{-\beta T}, \qquad \Omega = \frac{1}{\beta} M_0, \qquad K = \frac{1}{\beta}\bigl(1 - e^{-\beta T}\bigr) K_0, \qquad r_n = \frac{1}{b\beta}\bigl(1 - e^{-\beta T}\bigr) y_n - \Omega. \qquad (27.56) $$
rn represents a rescaled velocity coordinate. Insertion into (27.53) leads to the following
form of the dissipative circular mapping:
One can iterate this equation numerically and study the solutions for various values
of the parameters b, K, and Ω. A further simplification results in the limit of strong
damping, i.e., βT ≫ 1 or b ≪ 1. (To overcome the friction, K0 must, according to
(27.56), increase linearly with β.) In this case, the velocity yn is decelerated to zero
immediately after each “kick.” Consequently, the equation for the angle decouples to
This equation is called the one-dimensional circular mapping or the standard mapping.
Its mathematical properties were intensely studied, in particular by the Russian
mathematician V.I. Arnold.5 Its interesting properties are due to the nonlinearity of
the sine function.
Let us consider for a moment the trivial limit K = 0, i.e., the linear circular mapping
$$ x_{n+1} = x_n + \Omega \pmod{2\pi}. $$
The rotator is moving forward in equidistant steps Ω. If it reaches the old position
again after a finite number of steps, the motion is periodic. This obviously happens if
Ω/2π is a rational number,
$$ \Omega = 2\pi\,\frac{p}{q}, \qquad p \text{ and } q \text{ coprime numbers.} \qquad (27.61) $$
This means that after q time steps the rotator has performed p full turns, i.e., again takes
(modulo 2π) its original position. One deals with a solution with the period q.
If however Ω/2π is an irrational number, the initial point x0 is not reached again even
after an arbitrarily long time. But there occur values xn in any arbitrarily small vicinity
of x0. In such a case the motion is called quasiperiodic.
5 V.I. Arnold, Trans. of the Am. Math. Soc. 42, 213 (1965).
In order to characterize the motion, one may define a winding number:
$$ W = \frac{1}{2\pi}\lim_{n\to\infty}\frac{f^n(x_0) - x_0}{n}. \qquad (27.62) $$
Here, f^n(x0) is the n-fold iterated mapping function (without performing the modulo
operation) from (27.59). The winding number thus represents the mean shift per kick
interval in units of 2π. W = W(Ω, K) depends on both parameters of the circular mapping.
In the linear case, K = 0, W just coincides with the fraction p/q defined in (27.61). Since
the rational numbers, although densely located on the real axis, form only a set of
measure zero, the “typical” trajectories are quasiperiodic.
What happens now if the nonlinearity becomes effective (i.e., for K ≠ 0) in the
circular mapping (27.60)? Let us consider a periodic solution with a rational winding
number W = p/q. The angular coordinate passes through a cycle x1, x2, . . . , xq of length q
and after q steps returns to x1 + 2πp, i.e., to x1 modulo 2π; hence,
$$ f^q(x_1) = x_1 + 2\pi p. \tag{27.63} $$
At the beginning of the present chapter, we met the criterion for the stability of a
discrete mapping, in the case at hand, of the q-fold iterated function f^q(x): The magnitude
of the derivative of the function must be smaller than unity, and thus,
$$ \left|\frac{d}{dx_1} f^q(x_1)\right| = \left|\frac{d}{dx_1} f\bigl(f(\cdots f(x_1))\bigr)\right| = \left|\prod_{i=1}^{q} f'(x_i)\right| = \left|\prod_{i=1}^{q} (1 - K\cos x_i)\right| < 1. \tag{27.64} $$
If this condition is fulfilled, then x1 (and, since all points in the cycle are on an equal
footing, all other xi too) belongs to a stable periodic attractor. In the linear case, K = 0,
the derivative always has the value (f^q)′(x) = 1. This is marginal stability:
neighboring orbits are neither attracted nor repelled. If however 0 < K < 1, then each
of the solutions Ω_{p,q} = 2πp/q becomes a periodic attractor with a domain of attraction
of finite width ΔΩ_{p,q}. This is an example of a very interesting phenomenon
that arises in many branches of physics. Vibrating systems which are characterized
by two distinct frequencies adjust themselves—provided that there is a correspond-
ing interaction—in such a manner that the frequencies are synchronized, i.e., are in
an integral ratio to each other. The phenomenon is also called mode locking. The two
frequencies of the system considered here are determined on one hand by the stroke
length T , and on the other hand by the magnitude of the torque M0 .
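The stability criterion (27.64) can be evaluated numerically along an orbit. The sketch below is an added illustration, not the author's procedure. It assumes the mapping f(x) = x + Ω − K sin x (mod 2π), lets the orbit settle for a number of iterations, and then multiplies the factors 1 − K cos x_i over q successive points. The function name cycle_multiplier and the parameter n_transient are hypothetical.

```python
import math

TWO_PI = 2.0 * math.pi


def cycle_multiplier(omega, k, q, x0=0.1, n_transient=2000):
    """Product of f'(x_i) = 1 - K cos(x_i) over q successive orbit points."""
    x = x0
    for _ in range(n_transient):  # let the orbit settle onto the attractor
        x = (x + omega - k * math.sin(x)) % TWO_PI
    product = 1.0
    for _ in range(q):
        product *= 1.0 - k * math.cos(x)
        x = (x + omega - k * math.sin(x)) % TWO_PI
    return product


if __name__ == "__main__":
    # q = 1 inside the W = 0 locking range: magnitude below unity.
    print(cycle_multiplier(0.3, 0.5, 1))
    # q = 2 at Omega = pi: the cycle {0, pi} gives (1 - K)(1 + K).
    print(cycle_multiplier(math.pi, 0.8, 2))
```

A magnitude below unity signals a stable q-cycle in the sense of (27.64); the value only makes sense if the chosen q really is the period of the orbit reached.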
Possibly the earliest experimental evidence for the phenomenon of mode locking
is ascribed to the Dutch physicist Christiaan Huygens. He observed that a series of
pendulum clocks (Huygens played a decisive role in their invention) which were
suspended in a row began to swing in the same rhythm, although their limited accuracy
would rather have suggested a drifting apart. Huygens recognized
that the weak coupling of the clocks via their common back wall was responsible for
the synchronization.
The extent of the mode-locking ranges can be calculated from (27.63). For
larger period lengths q, this may be performed only numerically. We therefore restrict
ourselves here to the simplest case q = 1. If there is a complete synchronization
with winding number W = 1, then (27.63) becomes
$$ x_1 + \Omega - K\sin x_1 = x_1 + 2\pi, \quad\text{i.e.,}\quad K\sin x_1 = \Omega - 2\pi. \tag{27.65} $$
For 0 < K < 1, the stability condition (27.64) reads
$$ |1 - K\cos x_1| < 1, \tag{27.66} $$
which is fulfilled for angles 0 < x1 < π/2 or 3π/2 < x1 < 2π. The associated values
of Ω may be read off from (27.65):
$$ 2\pi - K < \Omega < 2\pi + K. \tag{27.67} $$
The range of mode locking with just one turn of the rotator per stroke interval thus
has the shape of a triangle that opens with increasing K. Just the same consideration
leads to the attraction domain of the attractor with winding number W = 0, and thus,
p = 0, q = 1:
$$ -K < \Omega < K. \tag{27.68} $$
Because of the periodicity of the angle coordinate, it suffices to investigate the interval
0 ≤ Ω ≤ 2π. Values beyond this range mean only that the rotator performs
additional full turns for each kick, which does not change the dynamics significantly. In
Fig. 27.12, the borders of the ranges (27.67) and (27.68) are drawn as straight lines.
The parameter values for which a periodic synchronized motion arises are shaded in
the diagram.
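The triangular locking ranges for q = 1 can be compared with a direct iteration. The following sketch is an added illustration under two assumptions: the mapping has the form f(x) = x + Ω − K sin x, and the q = 1 ranges are |Ω| < K for W = 0 and |Ω − 2π| < K for W = 1, valid for 0 < K < 1. The function names predicted_lock and winding_number are hypothetical.

```python
import math


def winding_number(omega, k, x0=0.1, n_steps=20000):
    """Finite-n estimate of the winding number (no modulo: full turns are kept)."""
    x = x0
    for _ in range(n_steps):
        x = x + omega - k * math.sin(x)
    return (x - x0) / (2.0 * math.pi * n_steps)


def predicted_lock(omega, k):
    """Return 0 or 1 if (omega, k) lies in the q = 1 range with W = 0 or W = 1, else None."""
    if abs(omega) < k:
        return 0
    if abs(omega - 2.0 * math.pi) < k:
        return 1
    return None


if __name__ == "__main__":
    k = 0.5
    for omega in (0.3, 1.0, 2.0 * math.pi - 0.3):
        print(omega, predicted_lock(omega, k), winding_number(omega, k))
```

Inside a locking range the computed winding number sticks to the integer value although Ω/2π does not; outside both ranges it takes an intermediate value.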
as was already mentioned. It can be shown6 that for K = 1 the Arnold tongues cover
the entire Ω-range:
$$ \sum_{p,q} \Delta\Omega_{p,q} = 2\pi \quad \text{for } K = 1. \tag{27.69} $$
Hence, the situation is exactly complementary to the case K = 0; the “typical” solutions
are now periodic (mode locking), while the quasiperiodic solutions have
measure zero.
The function W(Ω), i.e., the winding number as a function of the frequency parameter
Ω at K = 1, is called the devil’s staircase (see Fig. 27.13). It is a continuous,
monotonically increasing function that is constant on every step, so that its derivative
vanishes almost everywhere. To any rational number p/q
belongs a step; the width of the steps decreases with increasing period length q; their
total width according to (27.69) covers the entire interval. The devil’s staircase has the
property of self-similarity, i.e., any sectional magnification again resembles the entire
object.
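The staircase can be sampled with a few lines of code. The sketch below is an added illustration, again assuming the mapping f(x) = x + Ω − K sin x and a finite number of iterations in place of the limit in (27.62). The grid size and the names staircase and winding_number are arbitrary choices made here.

```python
import math


def winding_number(omega, k=1.0, x0=0.0, n_steps=5000):
    """Finite-n estimate of the winding number at coupling k."""
    x = x0
    for _ in range(n_steps):
        x = x + omega - k * math.sin(x)
    return (x - x0) / (2.0 * math.pi * n_steps)


def staircase(n_points=41, k=1.0):
    """Sample W(Omega) on an equidistant grid over 0 <= Omega <= 2 pi."""
    omegas = [2.0 * math.pi * i / (n_points - 1) for i in range(n_points)]
    return [(omega, winding_number(omega, k)) for omega in omegas]


if __name__ == "__main__":
    for omega, w in staircase():
        print(f"{omega:8.4f}  {w:8.5f}")
    # Two different frequencies on the same step W = 1/2:
    print(winding_number(math.pi - 0.1), winding_number(math.pi + 0.1))
```

A finer grid around any step edge reveals further, narrower steps, which is the self-similarity mentioned in the text; the finite iteration count limits the resolution to roughly 1/n_steps.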
Fig. 27.13. The steps of the “devil’s staircase” are ranges where the winding number “clicks into place,” i.e., is independent of the frequency parameter Ω. The sectional magnification on the right indicates the self-similar structure of the devil’s staircase
If the parameter of the nonlinear coupling exceeds the value K = 1, then the Arnold
tongues coalesce. This is connected with a conversion of the quasiperiodic solutions
into chaotic ones, as may be seen from the occurrence of positive Lyapunov exponents.
At the same time, for K > 1 there still exist domains of periodic solutions with a
negative Lyapunov exponent. Both types of solutions are interwoven in a complex
manner. For the logistic mapping, we observed a mixing of regular and chaotic motion.
But now the relations are even more complicated, since two parameters, Ω and
K, may be varied independently. An interesting feature of the chaotic solutions is that
they don’t have a well-defined winding number. The solution moves so irregularly that
the limit in (27.62) does not exist.
In the upper half of Fig. 27.12, for K > 1, the centers of several periodic ranges
are plotted as lines. Here, the condition of superstability introduced on page 524 is
fulfilled, i.e., the derivative of the iterated mapping function of a q-cycle vanishes,
(f^q)′(x) = 0. For the period q = 1, this condition may be evaluated easily. The solution
with winding number W = 0 has as its fixed-point condition
$$ \Omega - K\sin x_1 = 0, \tag{27.70} $$
and the superstability condition f′(x1) = 0 reads
$$ 1 - K\cos x_1 = 0. \tag{27.71} $$
6 M.H. Jensen, P. Bak and T. Bohr, Phys. Rev. A30, 1960 (1984).
With the value x1 following from (27.70), equation (27.71) yields the condition for
superstability
$$ K = \sqrt{1 + \Omega^2}, \tag{27.72} $$
as is plotted in Fig. 27.12. The condition for a superstable fixed point with the winding
number W = 1 follows analogously as
$$ K = \sqrt{1 + (2\pi - \Omega)^2}. \tag{27.73} $$
The crossing of the curves indicates that for equal values of Ω and K distinct stable
solutions may coexist. In the range K > 1, there is no longer a straightforward relation
between the parameters K, Ω, and the winding number W. Which of the solutions is
realized then depends on the initial condition for x.
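The superstability curve for W = 0 can be verified in a few lines. The sketch below is an added illustration that assumes the mapping f(x) = x + Ω − K sin x. For a given Ω it constructs K and x1 from sin x1 = Ω/K and cos x1 = 1/K and then checks that x1 is a fixed point with vanishing derivative. The function name superstable_fixed_point is hypothetical.

```python
import math


def superstable_fixed_point(omega):
    """Return (K, x1) of the superstable W = 0 fixed point for a given Omega."""
    k = math.sqrt(1.0 + omega * omega)
    # sin(x1) = Omega / K and cos(x1) = 1 / K fix the angle uniquely.
    x1 = math.atan2(omega, 1.0)
    return k, x1


if __name__ == "__main__":
    omega = 0.8
    k, x1 = superstable_fixed_point(omega)
    shift = omega - k * math.sin(x1)     # f(x1) - x1, should vanish
    slope = 1.0 - k * math.cos(x1)       # f'(x1), should vanish
    print(k, x1, shift, slope)
```

Squaring and adding the two conditions eliminates x1 and gives K² = 1 + Ω², which is the route from the two conditions to the curve plotted in the figure.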
The occurrence of chaotic solutions is associated with a qualitative change of the
mapping function f (x). As Fig. 27.14 shows, for K < 1, f (x) is a monotonically
increasing function. For K > 1, the nonlinear coupling is so strong that f (x) reflects
the shape of the sine function; i.e., there arise (quadratic) maxima and minima. Similar
to the previously treated logistic mapping, f (x) is not invertible for K > 1. This is a
necessary (but not sufficient) condition for the occurrence of chaos in one-dimensional
mappings.
The complex behavior in the mode locking expressed by the devil’s staircase is
well confirmed by experiment. Even simpler than mechanical oscillators are nonlinear
electric circuits. For example, mode locking was investigated in an externally periodically
driven circuit involving a superconducting Josephson junction and an inductance,
which may be described mathematically by the circular mapping.7
Supplement: It should be noted that the logistic mapping from Example 27.1 may
be interpreted as the motion of a periodically kicked rotator. For this purpose, the angu-
lar dependence of the torque is not represented by (27.55) but rather by the (somewhat
For parameter values α ≤ 4, the values of xn are automatically bound to the unit inter-
val, and hence the modulo operation may be omitted.
EXAMPLE
In the preceding examples, we have studied systems the dynamics of which could be
described by the iteration of simple, analytically known discrete mappings. The logis-
tic mapping from Example 27.1 with its extremely simple structure served as a “test
laboratory” for investigating many aspects of nonlinear dynamics but has no plausi-
ble physical analog. For the “periodically kicked” damped rotator of Example 27.3,
the dynamics could also be reduced to the iteration of discrete equations of motion (in
the limit of strong damping to the one-dimensional circular mapping (27.57)), because
of the pulse-like nature of the acting force. Another model example, possibly even
more realistic and appropriate for a clear illustration of the characteristic phenomena
of nonlinear dynamics, is the periodically driven pendulum.8
Let the pendulum, a nonlinear oscillator with a restoring force proportional to
the sine of the displacement angle θ, be under the influence of an additional external
force with a harmonic time dependence. Moreover, let the system be damped by friction
which is proportional to the velocity. Mathematically, these system properties are
described by the following equation of motion:
$$ \frac{d^2\theta}{dt^2} + \beta\frac{d\theta}{dt} + \sin\theta = f\cos(\Omega t). \tag{27.78} $$
8 See, e.g., G.L. Baker and J.P. Gollub, Chaotic Dynamics, Cambridge University Press, Cambridge
(1996). In this book, extensive use is made of the example of the driven pendulum. We also refer to
H. Heng, R. Doerner, B. Huebinger and W. Martienssen, Int. Journ. of Bif. and Chaos 4, 751, 761,
773 (1994).
Here, β is the friction parameter, and f and Ω denote the strength and frequency of
the driving force, respectively. The eigenfrequency of the pendulum has been set to the
value ω0 = 1, which may always be achieved by rescaling the time and the parameters
β and f. As was described in Chap. 23, this explicitly time-dependent differential
equation of second order may be rewritten as a system of three coupled autonomous
(i.e., not explicitly time-dependent) differential equations of first order:
$$ \begin{aligned} \frac{d\omega}{dt} &= -\beta\omega - \sin\theta + f\cos\phi, \\ \frac{d\theta}{dt} &= \omega, \\ \frac{d\phi}{dt} &= \Omega. \end{aligned} \tag{27.79} $$
For β > 0, this is obviously a dissipative system, since the divergence of the velocity
field F (compare (23.16)) is then negative:
$$ \Lambda = \nabla\cdot\mathbf{F} = -\beta. \tag{27.80} $$
The equations of motion of the driven pendulum are too complicated to allow analytic
solutions. Their numerical integration by the computer does not cause any trouble,
however.9
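As a starting point for such computer experiments, the system (27.79) can be integrated with a classical fourth-order Runge–Kutta step. The following Python sketch is an added illustration, not code from the book or from the cited references. The state is ordered as (ω, θ, φ) following (27.79), the symbol Ω is written as drive_freq, and the step size dt is an arbitrary choice.

```python
import math


def pendulum_rhs(state, beta, f, drive_freq):
    """Right-hand side of (27.79) for state = (omega, theta, phi)."""
    omega, theta, phi = state
    return (-beta * omega - math.sin(theta) + f * math.cos(phi),
            omega,
            drive_freq)


def rk4_step(state, dt, beta, f, drive_freq):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = pendulum_rhs(state, beta, f, drive_freq)
    k2 = pendulum_rhs(shifted(state, k1, dt / 2.0), beta, f, drive_freq)
    k3 = pendulum_rhs(shifted(state, k2, dt / 2.0), beta, f, drive_freq)
    k4 = pendulum_rhs(shifted(state, k3, dt), beta, f, drive_freq)
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def integrate(state, t_end, dt=0.01, beta=0.5, f=0.9, drive_freq=2.0 / 3.0):
    """Advance the state from t = 0 to t_end with fixed step dt."""
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt, beta, f, drive_freq)
    return state


if __name__ == "__main__":
    # Start at rest in the zero position and integrate over 20 driving periods.
    period = 2.0 * math.pi / (2.0 / 3.0)
    print(integrate((0.0, 0.0, 0.0), 20.0 * period))
```

A quick plausibility check is the undriven case f = 0, where the damped pendulum must come to rest while φ grows linearly with time.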
Depending on the parameters Ω, β, f, the driven pendulum displays many distinct
types of motion. Here we shall investigate only a small section of the parameter space,
namely the dependence on the driving strength f for fixed values of the frequency Ω
and of the friction constant β. As an example we choose a frequency Ω = 2/3 for all
of the subsequent investigations, i.e., a value somewhat below the natural vibration
frequency of the pendulum, and a friction parameter β = 0.5.
The system of differential equations (27.79) is integrated numerically for various
values of the parameter f , beginning with selected initial conditions θ (0) and ω(0).
To avoid needless effort, transient processes, i.e., the initial solutions of, e.g., the first
20 vibrational periods, will be ignored.
The effect of dissipation in the system ensures that the solution after some finite
time approaches an attractor. The shape of this attractor may be analyzed in different
manners. One may directly consider the time dependence of the displacement angle
θ(t) or plot the trajectory in the three-dimensional phase space θ, ω, φ, where the
third coordinate because of φ = Ωt just corresponds to the time. More transparent
than this three-dimensional representation are reduced two-dimensional phase-space
diagrams where the time is considered as a parameter and the trajectory is plotted in
the θ, ω-plane. Contrary to the full three-dimensional phase space, the projected orbits
may intersect here. Since θ is a periodic angular variable, we restrict it to the interval
−π < θ ≤ π by the modulo operation. A trajectory that leaves the diagram at the right
or left edge, corresponding to a loop of the pendulum, therefore enters again at the
opposite edge.
Figure 27.15 shows a gallery of selected phase-space diagrams arranged by increas-
ing value of the driving force f . The value of f is always given at the top left in the
partial figures.
9 Readers are advised to explore the dynamics of the driven pendulum by their own computer ex-
periments. For the integration of the differential equation a Runge–Kutta approach is recommended;
see, e.g., W.H. Press et al., Numerical Recipes, Cambridge University Press (1989).
For weak perturbations, e.g., f = 0.9, the pendulum performs approximately harmonic
librations about the zero position. The limit cycle θ(t) is a slightly distorted sinusoidal
vibration with the frequency Ω, and correspondingly the path in phase space
is approximately an ellipse.
With increasing perturbation strength a bifurcation arises at about f = 1.07 with
a period doubling, as is represented in Fig. 27.15 for f = 1.075: Two slightly differ-
ent vibrational tracks are alternating. After further period doublings, one then finds a
libration with twice the amplitude (represented for f = 1.12) in which the pendulum
performs a loop but then moves back. The frequency of this oscillation is Ω/3.
In the range f ≈ 1.15 . . . 1.3, there arise chaotic solutions. The trajectory for
f = 1.2 in Fig. 27.15 fluctuates in an erratic manner between librations and rotations
in both directions, and correspondingly the path densely covers a domain in phase
space. For comparison, Fig. 27.16 shows a regular (f = 1.12) and a chaotic (f = 1.2)
trajectory, which correspond to the third and fourth partial figure in Fig. 27.15.
For even stronger coupling f, the chaotic range is left again, and there occur rotating
periodic solutions, as is represented for f = 1.4. The angle increases linearly with
time, according to θ(t) ∝ ±Ωt, superimposed by local fluctuations. For f = 1.45, a
bifurcation with period doubling of the rotating solution occurs. On the average, the
angle is unchanged, but now the local deviations alternate from period to period. At
f = 1.47, one obtains a second period doubling.
After a bifurcation cascade, there again occurs a range with chaotic solutions, as
is represented for the example f = 1.5. But soon the chaos is followed by regular
motion, as is demonstrated by the beautiful phase-space trajectory for f = 1.51 in
the last shown example. Here, one faces a periodic libration motion with two loops
(angular range 3 · 2π) and the period 5 (i.e., the frequency Ω/5).
A global survey of the behavior of the system is obtained from the attractor di-
agram. One of the coordinates is scanned in regular time intervals, and the result is
plotted along the ordinate as a function of a system parameter. Figure 27.17 shows the
angular velocity ω(tn) scanned at the time points tn = t0 + n · 2π/Ω versus the strength
f of the driving force. For any value on the abscissa, 150 values of ω are plotted.
The upper margin marks the nine values of f for which the associated phase-space
trajectories are shown in Fig. 27.15.
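One column of such an attractor diagram can be produced by sampling ω once per driving period after discarding a transient. The sketch below is an added illustration under stated assumptions: a fixed-step Runge–Kutta integrator, t0 = 0, and arbitrary values for the number of transient periods and steps per period. None of these choices are taken from the book. The symbol Ω is written as drive_freq.

```python
import math


def rhs(t, theta, omega, beta, f, drive_freq):
    """Driven damped pendulum written as two first-order equations."""
    return omega, -beta * omega - math.sin(theta) + f * math.cos(drive_freq * t)


def rk4_step(t, theta, omega, dt, beta, f, drive_freq):
    k1t, k1w = rhs(t, theta, omega, beta, f, drive_freq)
    k2t, k2w = rhs(t + dt / 2, theta + dt / 2 * k1t, omega + dt / 2 * k1w, beta, f, drive_freq)
    k3t, k3w = rhs(t + dt / 2, theta + dt / 2 * k2t, omega + dt / 2 * k2w, beta, f, drive_freq)
    k4t, k4w = rhs(t + dt, theta + dt * k3t, omega + dt * k3w, beta, f, drive_freq)
    theta += dt / 6 * (k1t + 2 * k2t + 2 * k3t + k4t)
    omega += dt / 6 * (k1w + 2 * k2w + 2 * k3w + k4w)
    return theta, omega


def stroboscopic_omega(f, beta=0.5, drive_freq=2.0 / 3.0, n_transient=50,
                       n_samples=150, steps_per_period=200,
                       theta0=0.0, omega0=0.0):
    """Return omega(t_n) at t_n = n * 2 pi / drive_freq after the transient."""
    dt = 2.0 * math.pi / drive_freq / steps_per_period
    theta, omega = theta0, omega0
    samples = []
    for period in range(n_transient + n_samples):
        for step in range(steps_per_period):
            t = (period * steps_per_period + step) * dt
            theta, omega = rk4_step(t, theta, omega, dt, beta, f, drive_freq)
        if period >= n_transient:
            samples.append(omega)
    return samples


if __name__ == "__main__":
    for f in (0.9, 1.2):
        values = stroboscopic_omega(f, n_samples=20)
        print(f, min(values), max(values))
```

Plotting the returned samples against f for a fine grid of driving strengths yields a diagram of the type described here; a periodic attractor of period one collapses to a single value, while a chaotic one fills a band.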
The attractor diagram clearly exhibits the previously discussed alternating ranges
of regular and chaotic motion. The chaotic window at f = 1.15 . . . 1.28 is followed by
a broad domain of periodic (rotating) solutions at f = 1.28 . . . 1.48, showing several
pronounced period-doubling bifurcations (a “subharmonic cascade”). In the ranges
f = 1.11 . . . 1.15 and f > 1.54, one finds solutions of period 3. The structural sim-
ilarity of the attractor diagram for the driven pendulum with its counterpart for the
logistic mapping, Fig. 27.6, is obvious.
One should note that the attractor diagram represented in Fig. 27.17 is not complete.
This is due to the fact that our system of differential equations (27.79) is invariant
under reflection because the pendulum has no preferred direction of oscillation. Each
solution is accompanied by a reflected trajectory
$$ \theta \to -\theta, \quad \omega \to -\omega, \quad \phi \to \phi + \pi, \tag{27.81} $$
which also satisfies the equation of motion. Angle and velocity are inverted, and the
phase is shifted by a half-period. In general, the solutions always occur pairwise,
which is obvious for rotational solutions because the pendulum may run “clockwise”
or “anti-clockwise.” Which attractor is actually reached depends in a complicated
manner on the initial conditions θ (0), ω(0). When plotting Fig. 27.17, only one path
has been calculated, hence the “reflected” branches are missing. (The points are not re-
ally reflected about θ = 0, since in the stroboscopic scanning the phase of the scanning
moment is kept fixed and not shifted according to (27.81).)
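That the reflected trajectory solves the equations of motion can be checked directly. The following lines are a sketch added for illustration. They assume that the reflection acts as θ → −θ, ω → −ω, φ → φ + π, in line with the verbal description above, and they denote the reflected quantities by a tilde.

```latex
% Reflected quantities: \tilde\theta = -\theta, \tilde\omega = -\omega, \tilde\phi = \phi + \pi.
\begin{align*}
\frac{d\tilde\omega}{dt} &= -\frac{d\omega}{dt}
  = \beta\omega + \sin\theta - f\cos\phi
  = -\beta\tilde\omega - \sin\tilde\theta + f\cos\tilde\phi, \\
\frac{d\tilde\theta}{dt} &= -\omega = \tilde\omega, \qquad
\frac{d\tilde\phi}{dt} = \frac{d\phi}{dt} = \Omega .
\end{align*}
```

The last equality in the first line uses cos(φ + π) = −cos φ, which is why the phase shift by half a period is needed.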
However, for periodic trajectories of period length n · 2π/Ω it may happen that θ(t +
nπ/Ω) = −θ(t) (mod 2π) holds. Such a symmetric trajectory is identical with its
reflected partner, and there exists only one solution. This occurs, e.g., for the (n = 3)-vibration
shown in Fig. 27.16 (left), and in particular also for the (n = 1)-librations for
small f. At about f = 1.01, the interesting case of a symmetry-breaking bifurcation
occurs, which may be recognized by the kink in the attractor diagram Fig. 27.17. The
attractor continues to be periodic with n = 1, but loses its symmetry and therefore
splits into a pair of distinct solutions which are reflected relative to each other. In the
figure only the upper branch of this fork is included.
The attractor diagram provides a good survey of the various kinds of motion
of a dynamic system. A more far-reaching quantitative measure of the stability
of trajectories are the Lyapunov exponents σi discussed in Chap. 26. The driven
pendulum—a three-dimensional system—has three Lyapunov exponents. One of
these, let us call it σ3, always has the value σ3 = 0. It belongs to the degree of freedom
φ, which according to (27.79) has the trivial linear time dependence φ(t) = Ωt.
Thus, any perturbations along this direction neither increase nor shrink exponen-
tially.
The maximum Lyapunov exponent σ1 determines the stability of the system. At-
tractors with σ1 < 0 are periodic, those with σ1 > 0 are chaotic. Figure 27.18 shows
the result of a numerical calculation of the maximum Lyapunov exponent, plotted over
the same parameter range as in the attractor diagram Fig. 27.17. One may clearly trace
the sequence of regular and chaotic domains. At the bifurcation points the exponent
σ1 touches the zero line from below.
A conspicuous feature is that σ1 never falls below the value −0.25. This is re-
lated to the fact that in the dissipative system under consideration the sum of all three
Lyapunov exponents is determined by the negative of the friction coefficient β (here,
β = 0.5):
$$ \sum_{i=1}^{3} \sigma_i = -\beta. \tag{27.82} $$
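The lower bound of −0.25 follows from (27.82) in one line. The short derivation below is added as a sketch; it assumes that σ1 ≥ σ2 denote the two exponents other than σ3 = 0.

```latex
\sigma_1 + \sigma_2 + \sigma_3 = -\beta, \quad \sigma_3 = 0
\;\Longrightarrow\; \sigma_1 + \sigma_2 = -\beta .
% With \sigma_1 \ge \sigma_2 the larger exponent cannot lie below the mean:
\sigma_1 \;\ge\; \tfrac{1}{2}(\sigma_1 + \sigma_2) = -\frac{\beta}{2} = -0.25
\quad \text{for } \beta = 0.5 .
```

The bound is reached when both exponents coincide, σ1 = σ2 = −β/2.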
The Poincaré cut introduced in Chap. 24 may further characterize the attractors.
When choosing φ = φ0 (mod 2π) as the cut condition, one just has a stroboscopic
mapping at equidistant time points tn = t0 + n · 2π/Ω. The three-dimensional phase
space reduces to two dimensions (θ, ω), and the continuous trajectory turns into a
cloud of points. The Poincaré cuts of periodic attractors simply consist of one or sev-
eral fixed points, the number of which corresponds to the period length of the vibra-
tion. The situation is different, however, for the nonperiodic strange attractors which
are characteristic for the occurrence of chaotic motion. Here, the cloud of points of
the Poincaré cut covers extended partial domains of phase space more or less uni-
formly.
Figure 27.19(a) illustrates the situation for the chaotic attractor of the pendulum
with a driving strength f = 1.2. The points were obtained by stroboscopic scanning
of the trajectory shown in the fourth partial figure of Fig. 27.15 over 2000 vibrational
periods. The detailed shape of the Poincaré cut depends on the selected phase an-
gle φ0 .
The long curved object in Fig. 27.19 at first glance appears as a strange bent one-
dimensional curve. A closer look, however, shows the chaotic attractor to be a much
more complex geometric object. In the sector magnification of a small partial range
of the attractor (see Fig. 27.19(b)), the seemingly single line dissolves into several
closely spaced curves. But this is only the beginning, since a repetition of this operation
would show that in each sector magnification the new lines again split into several
strands, and the procedure may be repeated infinitely many times. (This process
is limited in practice only by problems in the numerical integration of the equation of
motion. By the way, in order to plot Fig. 27.19(b), 100,000 periods had to be calculated.)
Thus, the attractor of the chaotic driven pendulum displays an infinitely fine layered
(“puff-pastry”) structure. Mathematically it is a fractal with non-integer dimension (see
Chap. 26) since the points of the Poincaré cut occupy, roughly speaking, a larger volume
in phase space than a one-dimensional curve, but on the other hand they are too
sparse to cover a two-dimensional area. An analogous statement holds also for the full
attractor in the three-dimensional phase space.
In Chap. 26, we dealt with the determination of the fractal dimension. A remarkable
point is that the Lyapunov exponents σi may also be used to determine the dimension
of a strange attractor. The existence of such a relation is not implausible, because the σi
decide how “fast” a region of phase space spreads under the dynamic flow. Building
on this consideration, Kaplan and Yorke10 derived a formula for calculating a Lya-
punov dimension DL . For our special case (one positive and one negative Lyapunov
exponent), the Kaplan–Yorke relation reads
$$ D_L = 1 + \frac{\sigma_1}{|\sigma_2|} \quad \text{for } \sigma_1 > 0,\ \sigma_2 < 0. \tag{27.83} $$
The relation between DL and the other dimension measures has not yet been cleared
up in full. The originally assumed identification of Lyapunov dimension and capacity
dimension Df cannot be maintained, since counter-examples have been found. More
recent speculations rather concern a possible relation with the information dimension
DL = DI .
For the Poincaré cut of Fig. 27.19, one gets from (27.83) with σ1 ≈ 0.14, σ2 ≈ −0.64,
the Lyapunov dimension DL ≈ 1.2. This value depends sensitively on the friction
10 J.L. Kaplan and J.A. Yorke, Functional differential equations and approximation of fixed points,
H.-O. Peitgen and H.O. Walter (eds.), Lecture Notes in Mathematics 730, Springer, Berlin (1979).
constant β. For a weaker damping, the strange attractor “blows up,” and its dimension
increases.
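The quoted value of the Lyapunov dimension is easily reproduced. The sketch below is an added illustration of the special case (27.83) only; the function name lyapunov_dimension is hypothetical, and the general Kaplan–Yorke formula for more than two nontrivial exponents is not covered.

```python
def lyapunov_dimension(sigma1, sigma2):
    """Lyapunov dimension of the Poincare cut for sigma1 > 0 > sigma2, as in (27.83)."""
    if not (sigma1 > 0.0 and sigma2 < 0.0):
        raise ValueError("the special case requires sigma1 > 0 and sigma2 < 0")
    return 1.0 + sigma1 / abs(sigma2)


if __name__ == "__main__":
    sigma1, sigma2 = 0.14, -0.64
    # The sum of the exponents reproduces -beta = -0.5, as required by (27.82).
    print(sigma1 + sigma2)
    print(lyapunov_dimension(sigma1, sigma2))
```

With the rounded exponents quoted in the text the function returns about 1.22, which agrees with the stated DL ≈ 1.2.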
EXAMPLE
Hyperion is one of the more remote moons of Saturn. It revolves about Saturn with
a revolution period of 21 days, on an ellipse with eccentricity ε = 0.1 and a
semimajor axis a = 1.5 · 10^6 km.
The motion of Hyperion is a particularly impressive example of chaotic tumbling
within our solar system. In this section we shall describe this behavior using a
simplified model. The space probe Voyager 2, among others, has supplied pictures of the
moon Hyperion. Hyperion is an asymmetric top that may be roughly described by a
triaxial ellipsoid with the dimensions
Hence, one obtains for the principal moments of inertia Θ1 < Θ2 < Θ3:
$$ \frac{\Theta_2 - \Theta_1}{\Theta_3} \approx 0.3. \tag{27.85} $$
The striking prediction is that Hyperion performs a chaotic tumbling motion in the
sense that its rotational velocity and the orientation of its rotational axis vary significantly
within a few revolution periods. This chaotic tumbling, which must have happened
also for other planetary satellites during their history (it has been calculated, e.g.,
for Phobos and Deimos, the moons of Mars), is implied by the asymmetry of Hyperion
and by the eccentricity of the orbit.
To describe the change of the rotational velocity, we adopt the following model
(see Fig. 27.20): Hyperion H orbits Saturn S on a fixed ellipse with semimajor axis a and
eccentricity ε. r represents the distance between Saturn and Hyperion, ϕ the polar
angle of motion. Thus, the trajectory of Hyperion is given by
$$ r(\varphi) = \frac{k}{1 + \varepsilon\cos\varphi}. \tag{27.86} $$
Fig. 27.20. A simple two-dumbbell model for the asymmetric Saturn moon Hyperion
Its asymmetric shape is simulated by four mass points 1 to 4 with equal mass m which
are arranged in the orbital plane. Let the line 2–1 (distance d) be the (body-fixed)
e1-axis, the line 4–3 (distance e < d) the (body-fixed) e2-axis. The e3-axis points
perpendicularly out of the image plane: e3 = e1 × e2. The angle ϑ specifies the rotation
of Hyperion about the e3-axis. It is defined as the angle between the semimajor axis a and
the e1-axis. The moments of inertia obey
$$ \Theta_1 = \frac{1}{2} m e^2 < \frac{1}{2} m d^2 = \Theta_2 < \frac{1}{2} m (d^2 + e^2) = \Theta_3. \tag{27.87} $$
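The moments of inertia of the four-mass model can be summed directly from the positions of the mass points. The sketch below is an added illustration. It assumes that masses 1 and 2 sit at ±d/2 on the e1-axis and masses 3 and 4 at ±e/2 on the e2-axis, and the function names point_mass_moments and asymmetry are hypothetical.

```python
import math


def point_mass_moments(m, d, e):
    """Principal moments (Theta1, Theta2, Theta3) of the two-dumbbell model."""
    # Body-fixed coordinates (x along e1, y along e2) of the four equal masses.
    points = [(d / 2.0, 0.0), (-d / 2.0, 0.0), (0.0, e / 2.0), (0.0, -e / 2.0)]
    theta1 = sum(m * y * y for x, y in points)            # about the e1-axis
    theta2 = sum(m * x * x for x, y in points)            # about the e2-axis
    theta3 = sum(m * (x * x + y * y) for x, y in points)  # about the e3-axis
    return theta1, theta2, theta3


def asymmetry(d, e):
    """Ratio (Theta2 - Theta1) / Theta3, which is independent of the mass m."""
    theta1, theta2, theta3 = point_mass_moments(1.0, d, e)
    return (theta2 - theta1) / theta3


if __name__ == "__main__":
    print(point_mass_moments(2.0, 3.0, 1.0))
    # Ratio e/d for which the model gives an asymmetry of 0.3:
    print(asymmetry(1.0, math.sqrt(0.7 / 1.3)))
```

In this model the asymmetry equals (d² − e²)/(d² + e²), so a value of about 0.3 corresponds to e/d ≈ 0.73.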
In this model, the satellite shall rotate only about the e3-axis, i.e., the axis perpendicular
to the orbital plane with the largest moment of inertia. This restriction is motivated
by the fact that the tidal friction over very long times causes (1) the rotational axis of a moon
to align along the direction of the largest moment of inertia, and (2) this direction
to adjust perpendicular to the orbital plane. Moreover, the orbital angular momentum
of Hyperion is assumed to be constant. This is a very good approximation, since
the intrinsic angular momentum LE of Hyperion is always very small relative to the
orbital angular momentum LB, |LE|/|LB| ≈ (d² + e²)/a² ≈ 10^−8. The gravitational
field at the position of Hyperion is not homogeneous, and since Θ1 and Θ2 are distinct,
the satellite experiences a torque that depends on its orbital point and its orientation
ϑ, which will be calculated now. The tidal friction shall be neglected, however. The
torque acting on the pair of masses (1, 2) is
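$$ \mathbf{D}^{(1,2)} = \frac{d\,\mathbf{e}_1}{2} \times (\mathbf{F}_1 - \mathbf{F}_2), \tag{27.88} $$
where
$$ \mathbf{F}_i = -\frac{\gamma m M \mathbf{r}_i}{r_i^3} \tag{27.89} $$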
is the force acting on the mass point i, M being the Saturn mass. Figure 27.21 once
more illustrates the torque that is caused by the gravitational forces F1 and F2 at the
positions r1 and r2 , and by the centrifugal force F = −(F1 + F2 ) at the position r.
Fig. 27.21. The forces causing the torque D(1,2)
Since the length d ≈ 200 km is small compared to the distance r ≈ 10^6 km, the
cosine law yields
$$ r_i = r\sqrt{1 \pm \frac{d}{r}\cos\alpha + \left(\frac{d}{2r}\right)^2} \approx r\sqrt{1 \pm \frac{d}{r}\cos\alpha}. \tag{27.90} $$
The positive sign holds for r1, the minus sign for r2. α is the angle between r1 − r2
and r. From (27.90), we obtain
$$ \frac{1}{r_i^3} \approx \frac{1}{r^3}\left(1 \mp \frac{3d}{2r}\cos\alpha\right). \tag{27.91} $$
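The quality of this first-order expansion can be checked numerically for the quoted orders of magnitude. The sketch below is an added illustration; the function names are hypothetical, and sign = +1 stands for mass 1, sign = −1 for mass 2.

```python
import math


def inverse_cube_exact(r, d, alpha, sign):
    """1 / r_i^3 from the full cosine law (sign = +1 for mass 1, -1 for mass 2)."""
    r_i = r * math.sqrt(1.0 + sign * (d / r) * math.cos(alpha) + (d / (2.0 * r)) ** 2)
    return 1.0 / r_i ** 3


def inverse_cube_approx(r, d, alpha, sign):
    """First-order expansion in d / r."""
    return (1.0 - sign * 1.5 * (d / r) * math.cos(alpha)) / r ** 3


if __name__ == "__main__":
    r, d, alpha = 1.0e6, 200.0, 0.7   # km, km, rad
    for sign in (+1, -1):
        exact = inverse_cube_exact(r, d, alpha, sign)
        approx = inverse_cube_approx(r, d, alpha, sign)
        print(sign, abs(approx - exact) / exact)
```

The relative error is of the order (d/r)², i.e., far below the first-order term (d/r) cos α that generates the torque.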
Thus, the torque vanishes if Θ1 = Θ2. Besides, a configuration with α experiences
the same torque as a configuration with 180° + α. The torque tries to rotate Hyperion
in such a way that at any moment the e1-axis points toward Saturn. The expression
(27.95) for the torque remains correct even if a more realistic mass distribution is
assumed. With
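$$ D = \frac{dL}{dt} = \Theta_3 \frac{d^2\vartheta}{dt^2}, \tag{27.96} $$
the equation of motion for the eigenrotation of the satellite reads
$$ \Theta_3 \ddot\vartheta = -\frac{3}{2}\left(\frac{2\pi}{T}\right)^2 (\Theta_2 - \Theta_1) \left(\frac{a}{r(t)}\right)^3 \sin 2(\vartheta - \varphi(t)). \tag{27.97} $$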
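Here, we set α = ϕ − ϑ. Equation (27.97) involves only one degree of freedom, ϑ, but
the right side depends on the time via the orbital radius r(t) and the polar angle ϕ(t)
and is therefore not integrable. An exception is the case of a circular orbit. Then the
mean angular frequency
$$ n = \frac{2\pi}{T} \tag{27.98} $$
equals the angular velocity ω, and r = a, ϕ(t) = nt. With ϑ′ = 2(ϑ − nt), the differential
equation (27.97) therefore simplifies for ε = 0 to
$$ \Theta_3 \ddot\vartheta' + 3n^2 (\Theta_2 - \Theta_1) \sin\vartheta' = 0. \tag{27.99} $$
This is the differential equation for the pendulum. The integral of motion is the energy
E. To determine this quantity, we multiply (27.99) by ϑ̇′:
$$ \Theta_3 \ddot\vartheta' \dot\vartheta' + 3n^2 (\Theta_2 - \Theta_1)\, \dot\vartheta' \sin\vartheta' = 0, \tag{27.100} $$
which implies
$$ \frac{d}{dt}\left[\frac{1}{2}\Theta_3 \dot\vartheta'^2 - 3n^2 (\Theta_2 - \Theta_1) \cos\vartheta'\right] = 0. \tag{27.101} $$
Hence,
$$ E = \frac{1}{2}\Theta_3 \left(\frac{d\vartheta'}{dt}\right)^2 - 3n^2 (\Theta_2 - \Theta_1) \cos\vartheta' \tag{27.102} $$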
is an integral of the motion. Just as for the pendulum (see, e.g., Example 18.8) we
have that for energies E larger than E0 = 3n²(Θ2 − Θ1) the satellite rotates, while
for E < E0 it vibrates. Due to the (so far neglected) tidal friction the energy E will
decrease more and more until it reaches the minimum value Emin = −E0. From this
follows ϑ̇′min = 0 and ϑ′min = 0. Therefore, in the final state of satellites on a circular
orbit the e1-axis always points toward the planet (bound rotation), as is known from
the earth’s moon.
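The distinction between rotation and vibration can be expressed as a small helper. The sketch below is an added illustration of the energy criterion for the circular orbit. It works with the variable ϑ′ = 2(ϑ − nt), uses dimensionless sample values (Θ3 = 1, Θ2 − Θ1 = 0.3, n = 1) that are assumptions made here, and the function names are hypothetical.

```python
import math


def energy(theta_p, theta_p_dot, theta3, delta_theta, n):
    """Energy integral for the circular orbit; delta_theta = Theta2 - Theta1."""
    return 0.5 * theta3 * theta_p_dot ** 2 - 3.0 * n ** 2 * delta_theta * math.cos(theta_p)


def motion_type(theta_p, theta_p_dot, theta3=1.0, delta_theta=0.3, n=1.0):
    """Classify the motion of theta' by comparing E with E0 = 3 n^2 (Theta2 - Theta1)."""
    e0 = 3.0 * n ** 2 * delta_theta
    e = energy(theta_p, theta_p_dot, theta3, delta_theta, n)
    return "rotation" if e > e0 else "libration"


if __name__ == "__main__":
    print(motion_type(0.0, 1.0))   # E = -0.4 < E0 = 0.9
    print(motion_type(0.0, 2.0))   # E =  1.1 > E0 = 0.9
    print(energy(0.0, 0.0, 1.0, 0.3, 1.0))  # minimum value -E0
```

The minimum of the energy, −E0, is attained for ϑ′ = 0 at rest, which corresponds to the bound rotation described in the text.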
As was stated already, the differential equation (27.97) for ε ≠ 0 cannot be solved
analytically. One may try, however, to get approximate solutions for ε ≪ 1. This we
shall do now. First, we introduce dimensionless quantities, the time t′ = nt = 2πt/T
and ω0² = 3(Θ2 − Θ1)/Θ3. Then (27.97) turns into
$$ \frac{d^2\vartheta}{dt'^2} = -\frac{\omega_0^2}{2}\,\frac{a^3}{r^3(t')}\,\sin 2(\vartheta - \varphi(t')). \tag{27.103} $$
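Since r(t′) and ϕ(t′) are periodic in 2π, the right side can be expanded into a Fourier-like
Poisson series. One obtains
$$ \frac{d^2\vartheta}{dt'^2} = -\frac{\omega_0^2}{2} \sum_{m=-\infty}^{\infty} H\!\left(\frac{m}{2}, \varepsilon\right) \sin(2\vartheta - m t'). \tag{27.104} $$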
In order to determine the coefficients H(m/2, ε), r(t′) and ϕ(t′) must be known.
ϕ(t) is obtained, e.g., via the second Kepler law as the solution of the following differential
equation:
$$ \frac{d\varphi}{dt} = \frac{(1 + \varepsilon\cos\varphi)^2}{h k^2}. \tag{27.105} $$
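Solving this differential equation and hence the determination of H(m/2, ε) are beyond
the scope of this example. We only quote that the coefficients H are proportional
to ε^(2|m/2−1|) and were tabulated by Cayley11 in 1859. For small ε, we have
H(m/2, ε) ≈ −ε/2, 1, 7ε/2 for p = m/2 = 1/2, 1, 3/2. Here the half-integer variable
p has been introduced. If the argument of one of the sine functions varies only weakly
with time, i.e., if
$$ \left| 2\frac{d\vartheta}{dt'} - m \right| \ll 1, \tag{27.106} $$
then the corresponding term is resonant. With γp = ϑ − pt′, equation (27.104) may be
rewritten as
$$ \frac{d^2\gamma_p}{dt'^2} = -\frac{\omega_0^2}{2} H(p, \varepsilon) \sin 2\gamma_p - \frac{\omega_0^2}{2} \sum_{n \neq 0} H\!\left(p + \frac{n}{2}, \varepsilon\right) \sin(2\gamma_p - n t'). \tag{27.107} $$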
It turns out that the terms in the sum are oscillating so rapidly compared to the variation
of γp that their total contribution to the equation of motion largely averages out, if
ω02 and ε are sufficiently small. In first approximation for small ω02 and ε the high-
frequency terms can be eliminated from the equation of motion by keeping γp fixed
and averaging (27.107) over a period. One then obtains
$$ \frac{d^2\gamma_p}{dt'^2} = -\frac{\omega_0^2}{2} H(p, \varepsilon) \sin 2\gamma_p. \tag{27.108} $$
This is again the pendulum equation. Integration of (27.108) analogous to (27.100)–
(27.102) again yields the energy
$$ E_p = \frac{1}{2}\left(\frac{d\gamma_p}{dt'}\right)^2 - \frac{\omega_0^2}{4} H(p, \varepsilon) \cos 2\gamma_p. \tag{27.109} $$
11 The original paper is by A. Cayley, Tables of the developments of functions in the theory of elliptic
motion, Mem. Roy. Astron. Soc. 29, 191 (1859).
Again, γp vibrates if Ep is smaller than Eps = |H(p, ε)|ω0²/4, and rotates if Ep is
larger. For H(p, ε) > 0, γp vibrates about 0; for H(p, ε) < 0, γp vibrates about π/2.
The essential difference compared with the pendulum equation with ε = 0 is that here
there exists not only the synchronous (p = 1) solution but also further resonances, depending on
the initial condition. So, it may happen, e.g., that a satellite is captured by the tidal
friction into a p = 3/2 state. This happens in the solar system for Mercury: During
two revolutions about the Sun it rotates exactly three times about its axis.
The question of to what extent the averaging over the high-frequency contributions
is justified and whether these are indeed only weak perturbations is complicated and
shall not be pursued further here. But it is intuitively clear that the high-frequency terms
must not be neglected if the energy is close to the limit Ep = Eps . Then these terms
will actually decide whether the satellite performs a full turn (in γp ) or whether it vi-
brates back. There exists a band of energies wp · Eps that are very close to Eps , with
wp defined by wp = (Ep − Eps )/Eps . For energies within this band the high-frequency
perturbations cannot be neglected. We shall only state without derivation that Chirikov
found an analytical criterion for the width wp of this band.12 Chirikov’s criterion pre-
dicts that for the parameters of Hyperion ω02 = 0.89 and ε = 0.1 the averaging over
the high-frequency components is not possible, since the width of the band belonging
to p = 1 and p = 3/2 is so large that the two bands overlap. In contrast, for ω02 = 0.2
and ε = 0.1 the averaging over the high-frequency components should be a good ap-
proximation. The following figures represent Poincaré cuts for these two cases. They
show points in phase space taken at ϕ = 0.
For this purpose, the differential equations have been solved numerically.13 If the motion is
quasiperiodic, then successive points form a smooth curve; chaotic trajectories seem
to cover areas in a random manner.
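A Poincaré cut of this kind can be generated with modest effort. The following Python sketch is an added illustration and not the procedure of the cited reference. It integrates (27.103) with a fixed-step Runge–Kutta scheme and records (ϑ mod π, dϑ/dt′) once per orbit at t′ = 2πn, i.e., at ϕ = 0. As a substitute for solving (27.105), the orbit is obtained from Kepler's equation by Newton iteration, with the pericenter assumed at t′ = 0. The names kepler_orbit, w0sq (for ω0²), ecc (for ε), and the step numbers are assumptions made here.

```python
import math

TWO_PI = 2.0 * math.pi


def kepler_orbit(t, ecc):
    """Return (a / r, phi) at dimensionless time t; pericenter (phi = 0) at t = 0."""
    mean_anomaly = math.fmod(t, TWO_PI)
    ecc_anomaly = mean_anomaly
    for _ in range(10):  # Newton iteration for E - ecc * sin(E) = M
        ecc_anomaly -= ((ecc_anomaly - ecc * math.sin(ecc_anomaly) - mean_anomaly)
                        / (1.0 - ecc * math.cos(ecc_anomaly)))
    phi = 2.0 * math.atan2(math.sqrt(1.0 + ecc) * math.sin(ecc_anomaly / 2.0),
                           math.sqrt(1.0 - ecc) * math.cos(ecc_anomaly / 2.0))
    return 1.0 / (1.0 - ecc * math.cos(ecc_anomaly)), phi


def accel(t, theta, w0sq, ecc):
    """Right-hand side of (27.103)."""
    a_over_r, phi = kepler_orbit(t, ecc)
    return -0.5 * w0sq * a_over_r ** 3 * math.sin(2.0 * (theta - phi))


def rk4_step(t, theta, theta_dot, dt, w0sq, ecc):
    k1x, k1v = theta_dot, accel(t, theta, w0sq, ecc)
    k2x, k2v = theta_dot + 0.5 * dt * k1v, accel(t + 0.5 * dt, theta + 0.5 * dt * k1x, w0sq, ecc)
    k3x, k3v = theta_dot + 0.5 * dt * k2v, accel(t + 0.5 * dt, theta + 0.5 * dt * k2x, w0sq, ecc)
    k4x, k4v = theta_dot + dt * k3v, accel(t + dt, theta + dt * k3x, w0sq, ecc)
    return (theta + dt / 6.0 * (k1x + 2.0 * k2x + 2.0 * k3x + k4x),
            theta_dot + dt / 6.0 * (k1v + 2.0 * k2v + 2.0 * k3v + k4v))


def poincare_section(theta, theta_dot, w0sq, ecc, n_orbits, steps_per_orbit=400):
    """Points (theta mod pi, d theta / dt') taken once per orbit at phi = 0."""
    dt = TWO_PI / steps_per_orbit
    points = []
    for i in range(n_orbits * steps_per_orbit):
        theta, theta_dot = rk4_step(i * dt, theta, theta_dot, dt, w0sq, ecc)
        if (i + 1) % steps_per_orbit == 0:
            points.append((theta % math.pi, theta_dot))
    return points


if __name__ == "__main__":
    for point in poincare_section(0.3, 1.2, w0sq=0.2, ecc=0.1, n_orbits=20):
        print(point)
```

For ε = 0 the points must lie on a curve of constant energy, 2(dϑ/dt′ − 1)² − ω0² cos 2ϑ = const, which follows from (27.102) and provides a simple check of the integrator before the eccentric case is explored.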
Because of the symmetry of the inertial ellipsoid, the orientation ϑ is identical to
that with ϑ + π . Therefore, the graphs were plotted only for the range of ϑ between 0
and π . By averaging over the high-frequency components, solutions were obtained in
which γp = ϑ − pt vibrates (for Ep < Eps ). For each of these solutions, dϑ/dt has
a mean value of exactly p, and ϑ takes all values between 0 and 2π . If one considers,
however, only the points with ϕ = 0, i.e., times t = 2πn, then γp exactly corresponds
to ϑ modulo π . Therefore, a vibration in γp appears as a vibration in ϑ . The successive
points of quasiperiodic vibrations therefore yield a simple curve in the vicinity of
dϑ/dt = p that contains only a part of the angles between 0 and π . For nonresonant
quasiperiodic trajectories (Ep > Eps ), all γp ’s rotate, and successive points form a
single curve that contains all angles ϑ.
As is seen from Fig. 27.22, for small values of ω0 and ε the resonant states and
the nonresonant ones are separated by a narrow chaotic zone. The figure shows ten
distinct trajectories. Three of them correspond to the quasiperiodic vibrations in the
states p = 1/2, 1, and 3/2. As predicted by the approximation, γ1/2 vibrates about
ϑ = π/2; γ1 and γ3/2 vibrate about ϑ = 0. Three further trajectories, always enclosing
the resonant states, are chaotic. They fill narrow bands with points in a seemingly
random manner. The last four trajectories show that each chaotic band is separated
from the other bands by nonresonant quasiperiodic trajectories.
12 B.V. Chirikov, A universal instability of many-dimensional oscillator systems, Phys. Rep. 52, 262
(1979).
13 J. Wisdom, S.J. Peale and F. Mignard, The chaotic rotation of Hyperion, Internat. Journal of Solar
System Studies 58, 137 (1984).
Figure 27.23 displays the situation for Hyperion. At least the chaotic zones of the
states p = 1 and p = 3/2 are no longer separated from each other. There is a small
remainder of the quasiperiodic p = 1/2 state; the quasiperiodic p = 3/2 state disap-
peared completely. Instead, there is a quasiperiodic state at p = 9/4, ϑ = π/2, which
is not given by the approximation (27.108). In total one sees 17 trajectories in the fig-
ure: eight quasiperiodic vibrations of the states p = 1/2, 1, 2, 9/4, 5/2, 3, and 7/2,
five nonresonant quasiperiodic rotations, and four chaotic trajectories.
A deeper study shows that the alignment of the rotational axis perpendicular to the
orbital plane is not stable, both in the chaotic as well as in the synchronous state. This
means that a small deviation of the rotational axis from the vertical increases exponen-
tially. The time scale for the resulting staggering motion is of the order of magnitude
of several orbital periods. The final stage of a “normal” moon is for Hyperion com-
pletely unstable. But if it tilts out of the perpendicular to the orbital plane, (27.97) is
no longer sufficient, and one has to solve the full nonlinear Euler equations. One then
finds that the full three-dimensional course of motion is completely chaotic. All three
characteristic Lyapunov exponents are positive (of the order of magnitude 0.1), which
implies a strongly chaotic staggering. Even if one could have measured the spatial
orientation of the rotation axis at the moment of Voyager 1 passing Hyperion (in No-
vember 1980) with a precision of up to ten figures, it nevertheless would not have been
possible to predict the orientation of Hyperion’s axis at the moment when Voyager 2
passed it (in August 1981).
Up to this point, the tidal friction, which causes a very slow change of
the initially Hamiltonian system, has been neglected. But one can roughly describe
the history of Hyperion. Presumably, the period of the eigenrotation was initially much
shorter than the orbital period, and Hyperion began its evolution in the range far above
that which is shown in the figure. Over a time of the order of magnitude of the age of
the solar system (circa 10¹⁰ years) the eigenrotation was decelerated, and the rotation
axis straightened up perpendicular to the orbital plane. Thereby, the premises of our
simplified model (27.97) are approximately justified. But when the evolution once had
reached the chaotic domain, “the work of the tides lasting aeons was destroyed in a
few days,”14 since once it arrived in the chaotic domain, Hyperion began to stagger in
a fully erratic manner (which continues to the present day).15 At some point, its path will
end up in one of the few stable islands of the figure. But this cannot be the synchronous
state, since the latter state is unstable.
14 J. Wisdom, Chaotic behavior in the solar system, Nucl. Phys. B (Proc. Suppl.) 2, 391 (1987).
15 The observations of Voyager 2 are consistent with this statement. The staggering of Hyperion has
also been observed directly from the earth (see J. Klavetter, Science 246, 998 (1988), Astron. J. 98,
1855 (1989)).
Part VIII On the History of Mechanics
Here, we follow Friedrich Hund, Einführung in die Theoretische Physik Bd. 1, Mechanik,
Bibliographisches Institut Leipzig (1951), and also I. Szabo, Einführung in die Technische
Mechanik, Springer-Verlag, Berlin, Göttingen, Heidelberg (1956). For more on what is presented
here, in particular on the history of statics, we refer to the works of P. Duhem:
Les origines de la statique, Paris (1905–1906),
Études sur Léonard de Vinci, Paris (1906),
Le système du monde, Paris (1913–1917),
and to P. Sternagel:
Die artes mechanicae im Mittelalter (1966),
and to F. Krafft:
Dynamische und statische Betrachtungsweise in der antiken Mechanik (1970).
28 Emergence of Occidental Physics in the Seventeenth Century
these two approaches to nature was, however, not reached in antiquity. Archimedes
(287–212 BC) represented the climax in ancient statics: the lever principle, the con-
cept of the center of gravity of a body, and the well-known hydrostatic law named
after him, were known to him in full clarity. Yet, Archimedes’s discoveries fell into
oblivion. The reasons are not known. Possibly his ideas were simply too hard to be un-
derstood. Whatever the reasons were, in the Middle Ages only the works of Aristotle
were known, and these determined the further development of mechanics.
In the fourteenth century, statics enjoyed a time of prosperity, mainly due to distinguished
men of the faculty of arts at the University of Paris. The methods for decomposing
and combining forces were developed there and utilized for solving statical
problems. In addition, the concept “work of a force” was introduced, and the “vir-
tual work” in virtual displacements was correctly used in simple cases. Leonardo da
Vinci5 (1452–1519) was a leading researcher in mechanics of his time. He performed
the decomposition of forces for investigation of moments (lever law). For individ-
ual examples he traced the law of the parallelogram of forces back to the lever law.
These relations were not formulated clearly and precisely until the seventeenth century,
by Varignon (1654–1722) and Newton6 (1643–1727).
The dynamics of mass points was created in the seventeenth century. Since this
is a very important period in the history of mechanics, we will outline it in more
detail. It should be mentioned that the formal completion and mathematical treatment
of mechanics happened in the eighteenth century and culminated in the mechanics of
Lagrange7 (1736–1813).
The seventeenth century was presumably the most decisive period in the history of
physics, probably the birth of physics as an exact science per se. In that time mechanics
was created and completed in outline, monumental in its scientific clarity and beauty,
and convincing in its predictive power and mathematical formulation. The processes
of motion in the heavens and on earth were described in a consistent way. Method-
ically in mechanics one succeeded for the first time in sharply conceptualizing ex-
periences (experiments). This was achieved mainly by using mathematical language,
combined with the invention of very fruitful abstractions and idealized cases about
which one was now able to make precise and final statements. The implications of the
new physical knowledge were quickly realized in public consciousness, as is demon-
strated by the science methodology of Bacon8 (1561–1626), Jungius9 (1587–1657)
and Descartes10 (1596–1650).
The new scientific spirit did not concern mechanics only. Actually the first book on
physics in the meaning of the new science stems from the English physician Gilbert11
(1540–1603) on the magnet. He started from well-planned experiments, generalized
them, and in this way came to general statements on magnetism and geomagnetism.
This book strongly influenced the intellectual-scientific evolution of that age. The
great scientists of that era knew it (Kepler12 praised it, Galileo13 used it). However,
it was not in magnetism that physics made a great breakthrough, but in dynamics. The
following stages were important:
(1) Kepler interpreted the processes in the sky as physical phenomena;
(2) Galileo succeeded in the correct conceptual understanding of simple motions. He
invented the abstraction of the “ideal case”;
(3) Huygens14 and Newton cleared up and completed the new concepts.
In the following table, we list the most important researchers and thinkers of that
era:
1473–1543 Copernicus
1530–1590 Benedetti
1540–1603 Gilbert (1600 De Magnete)
1548–1620 Stevin
1561–1626 Bacon of Verulam
1564–1642 Galileo (1638 Discourses on Fall Laws)
1571–1630 Kepler (1609 Astronomia Nova)
1587–1657 Jungius
1592–1655 Gassendi
1596–1650 Descartes (1644 Principia Philosophiae)
1608–1680 Borelli
1629–1695 Huygens (1673 Pendulum Clock)
1635–1703 Hooke
1643–1727 Newton (1686 Principia).
According to the opinion prevailing in antiquity—presumably going back to
Aristotle—heaven and earth were greatly separated from each other. The heavens rep-
resented the perfect, unchanging, divine. The earthly, on the contrary, was changing
and chaotic. This opinion was considered to be confirmed by the circular orbits of the
celestial bodies and the ideally straight earthly motions. The stars were interpreted as
being essentially different from the earth, which presumably prevented the progress
of the heliocentric world system which had been invented already in antiquity. Only
Copernicus15 (1473–1543) made this breakthrough. As proof of his doctrine, which
states that the sun is in the center of the world, he could offer only its simplicity and
beauty. This was not yet a “physics of the heaven,” but rather a kind of geometrical
ordering of the world.
Only Kepler presumably realized the connection of the motions in the sky with
physics. He asked about the forces when he put the orbiting velocity of planets that
decreases outward in a causal connection with a force originating from the sun and
decreasing with the distance from it (1596). He justified the validity of the heliocentric
system with the sun as the origin of the force which causes the planetary motions. In
his Astronomia Nova Seu Physica Coelestis he expressed (1609) the first planetary
laws. But his conclusion, a 1/r-dependence of the gravitational force, is false. He
concluded from the area law
$$r^2 \dot{\varphi} = \text{constant}$$
Mundi (1619), he ascribed a particular role in the structure of the world to the five
regular bodies. He used them as mathematical “archetypes.” The use of mathematics
for describing connections, as soon became customary, was denied him, however.
Galileo does not match Kepler in the physical interpretation of planetary motion.
He considered the circular orbit to be natural and did not yet understand the general
meaning of the law of inertia. Borelli16 (1608–1680), on the contrary, qualitatively
described the motions of celestial bodies as an interplay of the attraction by the sun
and of the centrifugal force. He understood the planetary motion as a problem of the-
oretical mechanics. But to solve this problem essential tools still had to be developed;
first of all free fall and the throw had to be understood.
The idea of the inertia of bodies, their persistence in the state of uniform motion in
the absence of forces, was hard to understand. The abstraction was accepted only la-
boriously. On the contrary, it was easier, more natural, to ask for the reason of changes
of position and to look for relations between force and velocity, as was already done
by Aristotle in nebulous form. He took the view that the thrown body had a certain
immaterial ability which, however, gradually decreased (vis impressa naturaliter
deficiens). In the fourteenth century this ability was called “impetus” (Buridan, 1295–
1336?). The impetus normally remains constant, but gravity and air resistance can
modify it. A falling body is accelerated because more and more new impetus is added
by gravity (Benedetti,17 1530–1590). Galileo also shared this view, but he at first tried
to combine this understanding with the doctrine of the Aristotelian school that force
and velocity were proportional to each other, in a complicated nontransparent way.
Only on the third day of the Discourses on Fall Laws is clarity about the course of
the fall motions achieved. The ideal cases of uniform motion and of uniformly accelerated
motion are outlined and mathematically grasped. The uniformly accelerated
motion is compared with the experience of free fall and of the fall on the inclined plane.
Finally, on the fourth day of the Discourses the oblique throw is analyzed: it is correctly
understood as a composition of an (ideal) propagating motion and an (ideal) fall motion.
Although Galileo described the law of inertia (persistence of bodies in uniform
motion if no forces are acting on them) and the proportionality between force and
acceleration, he did not express the general law of motion. He also did not apply his
knowledge of the laws of falling bodies to planetary motion. From his entire work
one realizes how reluctantly he gradually gave up the old ideas.
But the new ideas gained acceptance. In 1644, Descartes made the first attempt to
formulate general laws of motion. He talked about the persistence of bodies in the state
of rest or of motion, about linear motion as the most natural motion (meaning force-free
motion), and about the “conservation of motion” in the impact of bodies. The latter
is obviously the conservation of momentum (called motion) $\sum_i m_i \mathbf{v}_i$. Descartes, however,
did not realize the vector character of the momentum (the “motion”). Presumably,
therefore, his applications of this law are wrong.
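Why the vector character matters can be seen in a one-dimensional impact. The following Python sketch is an illustration only (the function names and the numerical values are hypothetical): it uses the standard formulas for a head-on elastic collision and compares the signed sum of m·v with the unsigned sum of m·|v|, the latter standing for a "quantity of motion" without direction.

```python
def elastic_collision_1d(m1, v1, m2, v2):
    """Velocities after a head-on elastic collision of two bodies."""
    total = m1 + m2
    u1 = ((m1 - m2) * v1 + 2.0 * m2 * v2) / total
    u2 = ((m2 - m1) * v2 + 2.0 * m1 * v1) / total
    return u1, u2

def signed_momentum(bodies):
    """Sum of m*v with sign, i.e., the one-dimensional momentum vector."""
    return sum(m * v for m, v in bodies)

def unsigned_motion(bodies):
    """Sum of m*|v|, a directionless 'quantity of motion'."""
    return sum(m * abs(v) for m, v in bodies)

# Illustrative values: masses 1 and 3, velocities +2 and -1.
before = [(1.0, 2.0), (3.0, -1.0)]
u1, u2 = elastic_collision_1d(1.0, 2.0, 3.0, -1.0)
after = [(1.0, u1), (3.0, u2)]
```

In this example the signed sum is the same before and after the impact, while the unsigned sum changes, so a conservation law formulated without regard to direction gives wrong predictions for such impacts.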
The conceptual completion of mechanics was achieved by Huygens when treating
curved motions. He studied the motion of a body on a given path in the gravitational
field of the earth (tautochrone problem) and demonstrated clear understanding of the
centripetal and centrifugal force in the treatment of the pendulum clock (1673). He
explained these topics by infinitesimal considerations as uniformly accelerated devia-
tion from linear motion. He realized the proportionality between centrifugal force and
(centrifugal) acceleration, where the acceleration is already defined infinitesimally.
Huygens realized the momentum was a vector quantity, and thus he correctly inter-
preted the momentum conservation law, as can be seen from his applications of this
law to impacts.
Then Newton came up with his brilliant work Philosophiae Naturalis Principia
Mathematica (1686/87), and systematically showed the connection between mass, ve-
locity, momentum, and force. He demonstrated by the example of the gravitation law
how a force (measured by the change of the momentum) is determined by the loca-
tions of the involved bodies. Finally he applied the laws of mechanics to the treatment
of planetary motion, and he showed how the gravitation law follows by induction from
Kepler’s laws, and how the Kepler laws follow deductively from the gravitation law.
This represented the final breakthrough and the proof of the new mechanics; so to
speak the completion of Kepler’s quest.
After the preceding considerations, the question might arise why Kepler failed to
discover the acceleration law—and, hence, gravity—although it follows seemingly
simply from his own law. But it is not for us to accuse Kepler for this reason of
a lack of brilliance and imagination. It is beyond any doubt that he had both—the
genius in empirical research and the imagination in far-reaching speculations.∗ The
explanation is as follows: Kepler was a contemporary of Galileo, who survived him by
twelve years. Although Kepler knew the Galilean mechanics, in particular the central
concept of acceleration, the laws of inertia and of throwing by correspondence and
by hearsay, he nevertheless could not work it out to a consistent structure. (Note that
Kepler died in 1630, while Galileo’s Discorsi outlining his mechanics was published
only in 1638.) But even more decisive is the fact that the theory of curved motion—
invented by Huygens for the circle and completed by Newton for general orbits—was
not available to Kepler. But without the concept of acceleration for curved motions
it is impossible to derive the form of the radial acceleration from Kepler’s laws by
simple mathematical manipulations.
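How the missing ingredient closes the argument can be illustrated for the special case of a circular orbit. The following lines are a sketch under this simplifying assumption, not a reconstruction of Newton's general derivation; the constant k in Kepler's third law is a symbol introduced here. With the centripetal acceleration found by Huygens, the third law immediately yields the 1/r² dependence:

```latex
\begin{align}
a &= \frac{v^2}{r}, \qquad v = \frac{2\pi r}{T}
  \quad\Longrightarrow\quad a = \frac{4\pi^2 r}{T^2}, \\
T^2 &= k\, r^3
  \quad\Longrightarrow\quad a = \frac{4\pi^2}{k}\,\frac{1}{r^2}.
\end{align}
```

The first line requires the concept of acceleration for curved motion, which was exactly what Kepler did not have at his disposal.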
Newton’s mechanics of gravitation, which emerges from the dynamical fundamen-
tal law and from the reaction principle, is by its nature a further development of the
throw motion discovered by Galileo. Newton writes about this point: “That the plan-
ets are being kept in their orbits by central forces is seen from the motion of thrown
objects. A (horizontally) thrown stone under the action of gravity will be deflected
from the straight path and falls, following a bent line, finally down to the earth. If it is
thrown with greater velocity, it flies farther away, and so it might happen that it finally
would fly beyond the borders of the earth and would not fall back. Hence, the mis-
siles thrown away from the top of a mountain with increasing velocity would describe
more and more extended parabolic curves and finally—at a certain velocity† —return
∗ For example, in his thoughts concerning the possible number of planets, since he was convinced—
like the Pythagoreans—that God had taken the choice on number and proportions according to a
certain number rule.
† The value for horizontal throw is correctly given by Newton from $mv^2/R = mg$ as $v = \sqrt{gR} = 7900\ \mathrm{m\,s^{-1}}$;
for the vertical shot into space the necessary velocity is obtained from the energy law
$$\frac{1}{2} m v^2 = \gamma \int_R^\infty \frac{mM}{r^2}\,dr = \gamma \frac{mM}{R}$$
with $g = \gamma M/R^2$ as
$$v = \sqrt{2gR} = 11200\ \mathrm{m\,s^{-1}}.$$
Both results do not involve the friction losses in the air.
to the top of the mountain, and in this way move around the earth.” An overwhelming
argument—by conception and compelling logic!
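The two velocities quoted in the footnote to Newton's argument, v = √(gR) for the horizontal throw and v = √(2gR) for the vertical shot, can be checked with a few lines of Python. The numerical values of g and R used below are assumed standard values and are not taken from the text; the function names are hypothetical.

```python
import math

G_SURFACE = 9.81     # m s^-2, assumed surface gravity of the earth
R_EARTH = 6.371e6    # m, assumed mean radius of the earth

def circular_velocity(g=G_SURFACE, R=R_EARTH):
    """Velocity of a horizontal throw that circles the earth: m v^2 / R = m g."""
    return math.sqrt(g * R)

def escape_velocity(g=G_SURFACE, R=R_EARTH):
    """Velocity of a vertical shot that never returns: (1/2) m v^2 = m g R."""
    return math.sqrt(2.0 * g * R)
```

With these inputs both functions reproduce the rounded values of the footnote to within a fraction of a percent; as there, air friction is neglected. The two velocities differ exactly by the factor √2.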
The English physicist Hooke18 (1635–1703), who is known as the founder of the theory
of elasticity, also came close to the gravitation law. This is shown by the following
statements made by him:‡ “I shall develop a world system which in every way agrees
with the known rules of mechanics. This system rests on three assumptions: (1) All
celestial bodies have an attraction directed toward their center (gravity); (2) all bod-
ies that are put in straight and uniform motion move in a straight line until they are
deflected by any force and forced into a curved path; and (3) attractive forces are the
stronger the closer they are to the body they are acting upon. Which are the various
degrees of attractions I could not yet determine by experiments. But it is an idea that
must enable the astronomers to determine all motions of celestial bodies according to
one law.”
These remarks show that Newton did not create the monument of his Principia
out of nothing. But it took immense mental strength and bold ideas to concentrate
all that had been created by Galileo, Kepler, Huygens, and Hooke in physics, astronomy, and
mathematics into one focus, and in particular to announce that the force that lets the
planets circulate on their orbits around the sun is identical with the force that drives
the bodies on the earth to the floor.
For this knowledge, mankind needed one-and-a-half millennia if one considers that
in the Moralia (De facie quae in orbe lunae apparet) Plutarch19 (46–120) states that
the moon by the momentum of its orbiting is just in the same way kept from falling to
earth as a body which is “rotated around” in a sling. It took the genius of Newton to
realize what the “sling” for the planets is!
The new mechanics had a tremendous impact on the spirit of that age. Now there
existed a second incontestable science besides mathematics. Furthermore, the exact
natural sciences were born: mechanics became their model.
Let us summarize once again the most important stages in the evolution of mechan-
ics from a present-day view: The essential part of mechanics and of its fundamental
concepts is expressed in the basic dynamic law
$$\frac{d\mathbf{p}}{dt} = \frac{d}{dt}(m\mathbf{v}) = \mathbf{F}.$$
Here, basically the acceleration appears as the signature of an acting force F; the law
of inertia, i.e., the conservation of momentum
p = mv
and, hence, of the velocity v = ṙ if no external forces are acting, is also contained
therein. This law of inertia had already been realized in the ancient and scholastic
mechanics (Philoponos, Buridan) by experience. Uniform motion as the ideal case
‡ There are still two other hints on the many-sided active genius of Hooke: In 1665, he writes the
prophetic words: “I have often thought that it should be possible to find an artificial, glue-like mass
which is equal or superior to that excretion from which the silkworms produce their cocoon and which
can be spun to threads by jets.” This is the basic idea of the man-made fiber that—although two-and-
a-half centuries later—revolutionized the textile industry! In the same year he writes, anticipating the
mechanical theory of heat (hence, also kinetic gas theory): “That the particles of all bodies, as hard
as they may be, nevertheless vibrate, one needs in my opinion no other proof than that, that all bodies
involve a certain degree of heat and that never before has an absolutely cold body been found.”
of motion was described by Galileo. Descartes clearly formulated the law of inertia,
and Huygens utilized it correctly. As already stated, in the basic dynamic equation
the acceleration appears as a differential quotient of the velocity. Huygens clearly
realized that. He also correctly understood the acceleration as a measure of the force,
as well as the role of the mass in the momentum. Newton summarized everything in
a sovereign manner and applied the fundamental law to celestial mechanics. In this
sense Newton is the endpoint of the way to mechanics to which besides him also
Galileo and Huygens contributed essentially. The general concept, however, is due to
Kepler.
For the history of mechanics, we also refer to the outlines of the history of physics.
For the sections treated here, see particularly
E.J. Dijksterhuis, Val en worp, Groningen 1924.
E. Wohlwill, Galilei, Hamburg and Leipzig 1909 and 1926.
The most important original papers were translated to German in the collection:
Ostwald’s classics of exact sciences. The main works of Kepler and Newton are avail-
able in German as
J. Kepler, Neue Astronomie oder Physik des Himmels (1609). German translation.
Munich 1929.
I. Newton, Mathematische Principien der Naturlehre (1686/87). German Transla-
tion. Leipzig 1872.
E. Mach, Die Mechanik in ihrer Entwicklung historisch-kritisch dargestellt (1933).
R. Dugas, A History of Mechanics, Neuenburg/Switzerland (1955).
P. Sternagel, Die Artes mechanicae im Mittelalter (1966).
F. Krafft, Dynamische und statische Betrachtungsweise in der antiken Mechanik
(1970).
For modern English editions and translations of historical texts, we refer to
J. Kepler, W.H. Donahue (Translator), New Astronomy (1992), Cambridge Univ.
Press.
I. Newton, I.B. Cohen, A. Whitman (Translators), The Principia: Mathematical
Principles of Natural Philosophy (1999), Univ. California Press.
J.L. Lagrange, A.C. Boissonnade, V.N. Vagliente (Translators), Analytical Mechan-
ics (2001), Kluwer Academic Publishers.
Finally, we mention some texts about the history of mechanics
E.J. Dijksterhuis, The Mechanization of the World Picture: Pythagoras to Newton
(1986), Princeton Univ. Press.
E. Mach, T.J. McCormack (Translator), The Science of Mechanics: A Critical and
Historical Account of its Development, 6th edition (1988), Open Court Publishing
Company.
R. Dugas, A History of Mechanics (1988), Dover Pub.
Notes
1 Plato, Greek philosopher, b. 427 BC, Athens, d. 347 BC, Athens, was the son of Ariston
and Periktione, from one of the most noble families of Athens. According to legend,
he wrote tragedies in his youth. The meeting with Socrates, whose pupil he was for
8 years, became decisive for his turn to philosophy. After Socrates’ death (399), he
first went with other pupils of Socrates to the town of Megara to study with Euclid.
He then broadened his horizons by extended travel (first to Cyrene and Egypt). He
soon returned home and opened the war against the educational ideal of the Sophists
by his first works. He quickly won enthusiastic followers while dealing with science,
secluded from public life. Presumably, scientific intentions led him about 390 to Italy,
where he became familiar with the Pythagorean doctrine and school organization. He
was introduced to the court of the tyrant Dionysius of Syracuse. Dionysius was at first
much interested in him, but according to legend handed Plato over as a prisoner to
the envoy of Sparta, who sold him as a slave. After payment for release and return
to Athens, Plato founded the Academy in 387. Despite his bad experience, he still set his
hopes of putting his ideas fully into effect on Syracuse. In 368, he followed an invitation by Dion,
uncle of the younger Dionysius, who hoped to win the young ruler over to Plato’s
political principles. Dionysius, however, only tolerated Plato’s ideas for a short time.
A third trip (361–360) also failed, since Dionysius distrusted him and turned against
him. Plato spent the last years in Athens in continuing scientific activity within a circle
of well-known scholars; according to legend, he died during a wedding meal.
Plato’s works are all preserved, except for the lecture On the good, which can be
reconstructed only in broad terms. But not all work recorded under his name is au-
thentic. The authenticity of the 7th Letter and of Laws is controversial.
The most important and surely authentic papers from the early period are: Apol-
ogy, Protagoras, State I, Gorgias, Menon, Kratylos; from the middle artistic period:
Phaidon, Banquet, State II–X; from the last years: Phaidros, Parmenides, Theaithetos,
Sophistes, Timaios, Philebos.
Almost all of his writings are dialogues that by language and structure are of great
artistic beauty. In most of them, Socrates appears as the leader of the conversation.
Plato’s philosophy turns dialectics, which for his teacher Socrates had only the
negative function of destroying the false knowledge on the good and the virtue, into
an approach of realizing the good and the virtue, that is, into a path to the “ideas.” The ideas
are not acts of imagination but are the content of that being represented by them,
which in itself is independent of us. By this distinction of the sensual (which is with
us) from the hypersensual (the later “transcendent”), Plato became a promoter of the
later so-called metaphysics.
Since the innermost nature of love is the will for perpetuation, it comes to fulfill-
ment only as love of the eternal ideas. All other love is a preliminary stage to that. To
kill off the transitory sensual and to turn toward the everlasting ideas is the aspiration
of the really philosophical man. The way toward this goal is the dialectics. Also, the
nature of this method of perception of ideas is logic.
Plato’s understanding of the role of the idea varies between the general idea and the
idea a priori. Provided it is the latter one, it is not brought into man from the outside,
but he remembers it as something he already knows but has forgotten. The under-
standing of the idea is remembrance (anámnesis). The method of remembering is that
of the hypothesis. By this, Plato means the proof in the form of the statement-logical
conclusion: If the first, then the second. But now the first, therefore the second. Or:
But now not the second, therefore not the first: e.g., in Menon: If virtue is knowledge,
then it is teachable. But now it is knowledge, therefore it is teachable. But now it is not
teachable, since there are practically no teachers of virtue. Therefore, it is not knowl-
edge, but only correct opinion inspired by the Gods; to transform it into knowledge
that is able to self-satisfy is according to Plato the essential duty of philosophy.
The later form of dialectics, the method of division (diaeresis) of the species into
sorts, is the draft of a logical method of proof. Aristotle rightly interprets it as a pre-
lude of the class-logical conclusion discovered by him. All proving is proving on
conditions. These can be proved themselves. In the State, Plato sketches the idea of a
completion of this proving up to the omission of all assumptions (anyipódeton), i.e.,
the idea of a proof by and from the purely logical. The absolute is defined here as
the good. In his later works, Plato interprets the ideas as numbers, i.e., as units that
include a manifold in themselves, and sees their absolute principle in the One and the
“Great-and-small” (interpretation controversial).
Plato remains aware of the limits of all human proofs. Where the dialectics ends,
there remains the speculative speech that uses the language of myth. All knowledge of
the sensual, the nature, does not go beyond well-founded presumption. Therefore, all
natural-scientific speech is necessarily myth. Plato develops this idea in his dialogue
Timaios, which was of particular influence in the Middle Ages.
The question of what is for man the good and the virtue as the way toward this
goal is answered by Plato by means of his dialectics of ideas, at first in the State by
his doctrine of the four cardinal virtues: wisdom, bravery, prudence, and justice. The
sketch of an ideal state serves only for proving this doctrine and does not represent
a plan that should be realized. This “ideal state” with its subdivision into the three
orders of scholastic profession, military profession, and peasantry, and his doctrine
on the community of possessions and women, and on the necessity that the kings
should become philosophers and the philosophers should become kings, later on was
interpreted as a political program and became influential. In the late work Philebos,
Plato sees the good of human life in the composition of knowledge (epistéme) and joy
(hedoné), where all knowledge is admitted, but among the joys only those that are not
mixed with grief, those that cannot impair the knowledge. Men must be educated in
the spirit of such a life ideal if a real and stable state shall be possible. This restoration
program demands a radical restriction of the influence of the traditional literature on
the individual and on the community. Plato’s philosophy and criticism of art were also
of extraordinary historical influence.
Plato’s doctrine, Platonism, was first developed further in Plato’s school, the Acad-
emy. One distinguishes the older, intermediate, and younger Academy. In the older
one, whose first and most important leaders were Speusippos and Xenocrates, the
Pythagorean attitudes of the late philosophy of Plato were emphasized. The ratio
of ideas and numbers became the focus of interest; soon mythological elements
joined. On the contrary, the leading men of the intermediate Academy, Arcesilaos
(315–241 BC) and Carneades (214–129), intended to revive the critical-scientific at-
titude of Plato. In this way, there emerged an—although moderate—skepticism that
believes that only probable insight is possible. The younger Academy considers the
power of reason again as more positive and combines thoughts of various systems in
an eclectic manner, in particular Platonic and Stoic thoughts. To the younger Acad-
emy belong Philo of Larissa (160–79) and Antiochos of Ascalon († 68 BC), heard
by Cicero in Athens. The Platonism of the three academies is called older Platonism.
The transition from this one to the new Platonism is mediated by the “intermediate”
Platonism, with the main representative Plutarch (AD 50–125), who taught a religious
Platonism with a strong emphasis on the absolute transcendence of God and assumed
a series of steps of intermediate beings between God and the world.
In the Middle Ages until the twelfth century, only Timaios was known and had
a strong impact. In the twelfth century, Henricus Aristippus translated Menon and
Phaidon, and in the thirteenth century, W. von Moerbeke translated Parmenides. The
new Platonism had more influence than Plato’s original ideas. The historical evolution
of the philosophy of the Middle Ages was largely determined by the confrontation
between Platonism and Aristotelian philosophy. In the early scholastic, Platonism was
dominated mainly by Augustinus; in particular, the school of Chartres had a Platonic
orientation. In the high scholastic, Platonism formed a strong undercurrent in the doc-
trines of the Aristotelians (Albert, Thomas). It emerged as an independent movement
among the mathematical-scientific thinkers (Robert Grosseteste, Roger Bacon, Witelo,
Dietrich of Freiberg) and among the German mystics. The latter established the link
to the Platonism of the early Renaissance (Nicholas of Cues).
Modern Platonism began during the Italian Renaissance. In 1428, Aurispa brought
the complete Greek text of Plato’s works from Constantinople to Venice. Latin translations
soon emerged, the most important one by Marsiglio Ficino, who completed it
between 1453 and 1483. Followers of Platonism included Lionardo Bruni and the older Pico
Della Mirandola, as well as Byzantines who had fled to Italy, among them the two
Chrysoloras, Gemistos Plethon and Bessarion. Central to this movement was the Pla-
tonic academy in Florence, founded in 1459 by Cosimo de Medici and guided by
Ficino. From there, Platonism spread all over Europe. However, only in England did
a truly Platonic school emerge (Cambridge). But Plato’s thoughts had lasting effects
in the rationalistic systems of Cartesius, Spinoza, and Leibniz. Malebranche was even
called the “Christian Plato.” In the nineteenth century, German idealism brought a
revival of Plato’s system of thought. Hegel resorted not only to Platonism but even
more to Plotinus and the New Platonism. Plato’s influence is seen in the recent past
in the phenomenology of Husserl and in world philosophy. A.N. Whitehead explic-
itly confessed to Platonism. Although his statement that all of European philosophy is
only a footnote to Plato is exaggerated, it nevertheless rightly points out the immense
influence of Plato’s philosophy. Platonism is even more dominant in the philosophy
and theology of the Christian East, where the Platonic tradition of Origenes and of
the Greek church fathers survives; for example, W. Solowjew and N. Berdjajew are
Christian Platonists. [BR]
2 Aristotle, Greek philosopher, b. 384 BC, Stagira in Macedonia–d. 322, Chalcis on
Euboea. He came at the age of 18 to Athens and became a student of Plato, where
he remained at the Academy for almost two decades, first as a scholar and then as
a teacher, finally opposing Plato with his own philosophy. After Plato’s death (347),
he lived for three years in Asia Minor with Hermias, the ruler of Atarneus. In
343, he was called to the court of Philip of Macedonia to be the tutor of Philip’s son
Alexander. When Alexander ascended the throne, Aristotle returned to his hometown;
however, in 334 he returned to Athens. In Athens, he founded the Peripatetic school,
so called because of the covered walks (peripatoi) surrounding the lyceum. He taught
there for twelve years among an ever-increasing circle of scholars, until the revolt of
Athens after Alexander’s death became dangerous to him, a friend of the royal dynasty.
He went to his estate at Chalcis on Euboea, where he soon died. [BR].
3 Eudoxus of Cnidus (400–347 BC), Greek scientist. He was equally active as mathematician,
astronomer, and geographer. His biography is not recorded in detail, but
it is considered certain that he was a member of the Platonic school. Later, he con-
ducted his own school in Cyzicus. Eudoxus gave a new definition for proportion. He
developed the exhaustive method and applied it to many geometric and stereometric
problems and theorems, which he could prove exactly for the first time. Possibly, the
major part of Euclid’s twelfth book is the work of Eudoxus. Eudoxus made a map of
the stars that remained top-ranking for centuries. He subdivided the sky into degrees
of longitude and latitude, gave an improved value for the solar year, and improved the
calendar. He estimated the earth’s circumference to be 400,000 stadia, edited a new
map of the known continents, and wrote a geography of seven volumes.
4 Archimedes, outstanding mathematician and mechanic of the Alexandrian era,
b. about 285, Syracuse, killed by a Roman soldier during the capture of Syracuse.
Archimedes was close to the Syracusean dynasty. He wrote important papers on math-
ematics and mathematical physics, fourteen of which are preserved. He calculated the
area and circumference of a circle, the area and volume of segments of the parabola,
the ellipse, spiral, the rotation paraboloid, the one-shell hyperboloid, and others, and
determined the center of gravity of these figures. For π, he gave a value between
3 1/7 and 3 10/71; he developed in his “sand calculation” a method of exponential
notation of arbitrarily large numbers, and in the Ephodos, a kind of integration cal-
culus. Even more important than his treatment of the equilibrium conditions of the
lever is the treatise on swimming bodies, where the principle of Archimedes is given.
Archimedes determined the ratio of the volumes of the straight circular cone, the half-
sphere, and the straight circular cylinder as 1 : 2 : 3. Uncertain are the inventions of the
water screw named after him and of the composed tackle; legendary is the burning of
the Roman fleet by concave mirrors. [BR]
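The two numerical statements in this note can be checked directly. The following is a minimal sketch in Python, not part of the original text. It assumes (the note does not say so) that the cone, the half-sphere, and the cylinder share the same base radius r and that the cone and the cylinder have height r. The names `pi_within_bounds` and `volume_ratios` are ours, chosen for illustration.

```python
from fractions import Fraction
import math

# Archimedes' bounds for pi, as quoted in the note: 3 10/71 < pi < 3 1/7
LOWER = 3 + Fraction(10, 71)   # = 223/71
UPPER = 3 + Fraction(1, 7)     # = 22/7


def pi_within_bounds():
    """Return True if math.pi lies strictly between the two bounds."""
    return LOWER < math.pi < UPPER


def volume_ratios():
    """Volumes of cone, half-sphere, and cylinder relative to the cone.

    Assumption (ours): common base radius r, and height r for the cone
    and the cylinder. Volumes are expressed in units of pi * r**3.
    """
    cone = Fraction(1, 3)         # (1/3) * pi * r^2 * r
    half_sphere = Fraction(2, 3)  # (1/2) * (4/3) * pi * r^3
    cylinder = Fraction(1)        # pi * r^2 * r
    return tuple(v / cone for v in (cone, half_sphere, cylinder))


if __name__ == "__main__":
    print(pi_within_bounds())
    print(volume_ratios())
```

Exact rational arithmetic is used so that the ratio 1 : 2 : 3 comes out without rounding error.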
5 Leonardo da Vinci, Italian painter, sculptor, architect, scientist, technician, b. April
15, 1452, Vinci near Empoli–d. May 2, 1519, in the castle Cloux near Amboise; ille-
gitimate son of Ser Piero, notary in Florence, and of a peasant girl. He was educated
in his father’s house, and at the age of 15, he went to Florence as an apprentice of
A. Verrocchio, who taught him not only painting and sculpting but also gave him an
extensive education in the technical arts. In 1472, he was admitted to the Florentine
guild of painters, but remained in Verrocchio’s studio. During this period of joint work,
the earliest of his preserved works emerged: an angel and the landscape in Verrocchio’s
painting of the baptism of Christ (Florence, Uffizi), two Annunciations (Uffizi and
Louvre), and the Madonna with the vase (Munich, Pinakothek). About 1478, he became
independent and worked at Florence for about 5 more years. From this era date
the portrait of Ginevra Benci (Vaduz, gallery Liechtenstein), the unfinished painting
of St. Jerome (Vatican), and the likewise unfinished great panel painting of the Adoration of
the Magi (Uffizi), which was commissioned for the high altar of a monastic church, but
which he gave up half-finished when he left Florence at the end of 1481, to start work
with Duke Lodovico of Milan.
The end of the Sforza dynasty forced Leonardo to leave Milan (1499). Through
Mantua, where he drew a portrait of margravine Isabella d’Este (Louvre), and Venice,
where he drew up a defense plan against the threatening invasion of the Turks, he
returned in April 1500 to Florence, where he began the painting of St. Anne with the
Virgin and Child (Louvre). In May 1502, Leonardo started work as the first inspector of fortress
buildings with Cesare Borgia, the papal military leader, throughout whose territory
Leonardo traveled for about 10 months: the Romagna, Umbria, and parts of the
Toscana. From this activity emerged a large fraction of his maps and city maps that—
masterpieces of surveying and representation—belong to the earliest records of mod-
ern cartography. Florence also asked for his advice as a war engineer; he worked on
a plan to divert the Arno river, in order to cut off the main access road to Pisa with
whom Florence was at war, and he designed the project for a channel to make the
Arno navigable from the sea to Florence. Neither of these plans was realized, nor
was the draft that Leonardo proposed at the same time to Sultan Bajasid II for a 300-m-long
bridge across the Bosporus. In 1503, Leonardo was commissioned to paint a monumental wall
painting for the large senate hall of the Palazzo della Signoria in Florence; there, the
drafts for the Battle of Anghiari were born, which, through many copies (e.g., by
Rubens), became the classical model for the cavalry-battle painting of the Renaissance and
Baroque, up to Delacroix. At the same time, Leonardo painted the Mona Lisa, presumably
the world’s most famous painting, and the standing Leda (preserved only as copies).
At this time, Leonardo reached the peak of his artistic fame. The arising geniuses of
the young generation either admired him without envy (Raphael) or accepted him re-
luctantly with jealousy (Michelangelo). Later, he painted only hesitantly, and more and
more turned to scientific problems. Besides mathematical studies, he studied anatomy
comprehensively. He dissected corpses and began an extensive treatise on the struc-
ture of the human body, where he promoted the anatomical drawing accompanying
the text as a tool for teaching. He also extended his biological and physical studies;
the experiments on the flight of man—already begun in Milan—led him to investigate
the flight of birds, which he also summarized in the form of a treatise. Besides the laws of air
flow, he also tried to investigate those of water flow. These studies contain approaches
for theoretical and practical hydrology; he recorded them as materials for a treatise on
water. In these years, he tried to arrange his notes by the main topics of his planned
“books,” which as a whole comprise a theory of the mechanical primordial forces of
nature, i.e., an entire cosmology.
In 1506, Leonardo, released by the Florentine Signoria at the request of the king of
France and under the pressure of the political situation, stopped work on the Battle of
Anghiari and returned to Milan. There he served until 1513 mainly as an adviser to the
French governor Charles d’Amboise, for whom he designed a large domicile and the
plans for a chapel (S. Maria alla Fontana). From this era also date the drawings for
the tomb of General Giangiacomo Trivulzio that—like the Sforza monument—was
planned as an equestrian statue but was not realized. No trace remains of two almost
finished Madonna paintings for His Most Christian Majesty. In Milan, too, Leonardo
mainly dealt with scientific studies. He continued his great “anatomy,” in collaboration
with the anatomist Marc Antonio Della Torre of Pavia, and he extended his hydrolog-
ical and geophysical investigations both theoretically and practically, as is testified by
his project for an Adda channel between Milan and the lake of Como, and by his amaz-
ing geological observations on the origin of fossils. He also returned to his botanical
studies; also in this field, he created exact demonstration drawings according to ex-
actly defined principles of graphical representation, as in all of his research activities.
He thereby founded scientific illustration.
When at the end of 1513 Leo X had risen to the papal throne, the now sixty-year-
old Leonardo went to Rome, presumably with the hope of acquiring orders through
his patron, Cardinal Giuliano de Medici. But he did not receive major commissions such as
those given to Raphael, Bramante, and Michelangelo. His years in Rome were occupied by research, in
particular on mechanics and anatomy. Only one painting, his last one, the mysterious
John the Baptist (Paris, Louvre), may have originated in this period.
In January 1517, Leonardo left Rome, following the invitation of Francis I. The country
castle Cloux near Amboise was allocated to him as a residence, and he got the title
Premier peintre, architecte et mécanicien du Roi. He did not paint anymore, however,
because of paralysis of his hand, but mainly arranged his scientific materials; in partic-
ular, he worked on completing his “anatomy.” Among the few artistic creations of this
last era, the project of a large castle and park for the residence of the Queen Mother
in Romorantin is known. The building was never constructed. His ideas nevertheless had
a lasting effect on the tremendous castle that was begun by Francis I when Leonardo
was still alive, the building of Chambord. The most stirring documents of his late
work are the drawings of the end of the world (Windsor), where Leonardo exhibited
his experiences of life devoted to studying nature in a unique synthesis of scientific
and artistic imagination; they are the symbol of the primordial forces penetrating the
world, which once had created and finally shall destroy it, but even in self-destruction
shall still obey the laws of harmony.
Leonardo was buried in Amboise in the church of St. Florentin, which was de-
stroyed during the French Revolution. His pupil and friend Francesco Melzi became
heir to his enormous written work, which is almost completely done in mirror writing,
familiar to him as a left-hander.
The greatness of Leonardo and his significance in the history of occidental culture
rests on the fact that he, like nobody else, understood art and science as a unity of hu-
man will of perception and power of mental comprehension. As a painter, he was the
first master of the classic style; his few artistic creations remained models of perfec-
tion for all following eras and styles. As a researcher and philosopher, he is at the bor-
derline between the Middle Ages and modern thinking. Altogether an empiricist, he
tried to acquire an encyclopedic knowledge by means of experience and experiment.
Guided by his imagination and less capable of abstract logical thinking, Leonardo
must not—as was often tried—be considered as the founder of modern science as
such. His achievements in the field of physics and pure mechanics are mediocre, often
even questionable. But, since he performed his all-embracing observations on natural
phenomena with ultimate objectivity and by virtue of his artistic talent was able to rep-
resent them in drawings, he became the pioneer of a systematic descriptive approach
in the natural sciences. Also, in the field of applied mechanics he can be considered
as the founder of elementary engineering, for which he developed the graphical prin-
ciples of demonstration. [BR]
6 Isaac Newton, b. Jan. 4, 1643, Woolsthorpe (Lincolnshire)–d. March 31, 1727,
even with his fluxion calculus that was represented in 1704 in detail. His influence on
the further development of mathematical sciences can hardly be judged, since New-
ton disliked publishing. For example, when Newton made his fluxion calculus public,
his method was already obsolete compared with the calculus of Leibniz. The quarrel
about whether he or Leibniz deserved priority for developing the infinitesimal calculus
continued until the twentieth century. Detailed studies have shown that both of them
obtained their results independently of each other. [BR]
7 Joseph Louis Lagrange, mathematician, b. Jan. 25, 1736, Torino–d. April 10, 1812,
d. April 9, 1626, London, son of Nicholas Bacon, nephew of Lord Burleigh; advocate
and deputy. In the notorious trial of his patron Essex, Bacon convicted Essex of high
treason. In 1607, he became Solicitor General; in 1613, Attorney General; in 1617,
Keeper of the Great Seal; and, in 1618, Lord Chancellor. Ennobled as Baron Verulam
and Viscount of St. Albans, in 1621 Bacon was removed from office by Parliament for
accepting bribes and was sentenced to a heavy fine and imprisonment, which,
however, were remitted through the king’s influence. He was a curiously split character:
outstandingly talented, vastly well read, vain, excessively ambitious, and of frighten-
ing emotional frigidity. The reasons for his downfall were not only the proven and
confessed failures, but equally the anger of the parliament about the egocentric and
unauthorized policy of the king who utilized Bacon as a submissive tool.
Bacon left a large number of philosophical, literary, and legal writings. His philo-
sophical life’s work, the Instauratio Magna (i.e., great revival of philosophy), re-
mained a fragment, an attempt (based on insufficient means) of a complete recon-
struction of sciences on the basis of “unfalsified experience.” His main piece, Novum
Organum (the title indicates the contraposition to Aristotle, whose logical writings
traditionally were summarized under the title Organon), is a method of scientific re-
search, worked out down to the last detail, which shall serve to snatch the secrets of
nature and to govern it (Bacon considered knowledge as the means for a purpose,
“knowledge is power”). The starting point for any knowledge is experience. Expe-
rience and mind should be tightly linked in a “legitimate marriage,” instead of the
separation so far. Bacon constructed a complicated system of scientific induction, but
he failed to appreciate the role of mathematics. His main piece is preceded by an inventory
of all sciences (De dignitate et augmentis), where, according to the three
mental abilities (memory, imaginative power, and mind), three main sciences are distinguished:
history, poetry, and philosophy. Bacon listed what had been achieved by
each science and what still remained to be done.
Among Bacon’s literary works, the essays inspired by Montaigne are timeless:
10 in the first edition (1597), and 58 in the last edition (1625). In these “dispersed
meditations,” Bacon presented practical life’s wisdom in the various fields, general
guiding principles of the conduct of life, beyond good and evil, in an antithetical style
of epigrammatic brevity, realistic and plain. “Nova Atlantis” is the perfect description
of a philosophical ideal state.
Bacon’s legal writings testify to his absolute mastery of the subject. His plan of
codifying the English law of his age was not completed.
In the second half of the nineteenth century, Bacon was also considered to be the
author of Shakespeare’s dramas (Bacon theory). [BR]
9 Joachim Jungius, philosopher and scientist, b. Oct. 22, 1587, Lübeck–d. Sept. 17,
1657, Hamburg; from 1609, professor in Giessen. In 1622, he founded in Rostock the first
scientific society of Germany for the cultivation of mathematics and natural sciences.
In 1624, he became a professor in Rostock; in 1625, in Helmstedt; and in 1628, head-
master of the Johanneum and of the academic high school in Hamburg. He defended
the principle “improvement of philosophy has to originate from physics (= natural
sciences).” Jungius decisively contributed to the breakthrough of scientific chemistry
and the renewal of atomism. He was also important as a botanist. [BR]
10 René Descartes, b. March 31, 1596, La Haye–d. Feb. 11, 1650, Stockholm.
Descartes was the son of a councilor of the parliament of Bretagne and was educated
in a Jesuit college. He then began the study of law, and beginning in 1618, he partici-
pated in various campaigns. Beginning in 1622, Descartes traveled in many countries
of Europe, then settled in 1628 in the Netherlands, and lived from 1649 in Sweden
as a teacher of philosophy. The main mathematical achievement of Descartes is
the foundation of analytical geometry in his Géométrie (1637), which also essentially
influenced the further development of infinitesimal calculus. [BR]
11 William Gilbert, English scientist and physician, b. May 24, 1544, Colchester–
d. Nov. 30, 1603, London. Gilbert was from 1573 a practicing physician in London;
from 1601, the private physician of Elizabeth I; and after her death, of King James
I of England. In his fundamental work De magnete, magneticisque corporibus, et de
magno magnete tellure physiologia nova (London, 1600; facsimile edition, Berlin,
1892; English translation and comment by S.P. Thompson, in The collectors series in
science, 1958), Gilbert summarized the knowledge of older authors into an impressive
doctrine of magnetism and geomagnetism, and added a number of new observations
and findings. The work, which in the second book also involves a special chapter
on corpora electrica, on substances that—like amber (electrum)—after rubbing are
capable of attracting light bodies, impressed several of his contemporaries, among
others Kepler and Galileo. His treatise De mundo nostro sublunari philosophia nova
appeared posthumously (Amsterdam, 1651). [BR]
12 Johannes Kepler, b. Dec. 27, 1571, Weil der Stadt–d. Nov. 15, 1630, Regensburg.
Kepler was the son of a trader who also often served in the military. He first went
to school in Leonberg, and later to the monastic school in Adelberg and Maulbronn.
From 1589, Kepler studied in Tübingen to become a theologian, but in 1594, he took
the position of professor of mathematics in Graz that was offered to him. In 1600,
because of the Counter-Reformation Kepler had to leave Graz and went to Prague.
After the death of Tycho Brahe (Oct. 24, 1601), as his successor Kepler became the
imperial mathematician. After the death of his patron, Emperor Rudolf II, Kepler left
Prague and in 1613 went to Linz as a land surveyor. From 1628, Kepler lived as an
employee of the powerful Wallenstein, mostly in Sagan. Kepler died unexpectedly
during a visit to the meeting of electors in Regensburg.
Kepler’s main fields were astronomy and optics. After extraordinarily lengthy cal-
culations, he found the fundamental laws of planetary motion: the first and second of
Kepler’s laws were published in 1609 in Astronomia Nova, and the third one in 1619
in Harmonices Mundi. In 1611, he invented the astronomical telescope. His Rudol-
phian tables (1627) continued to be one of the most important tools of astronomy until
the modern age. In the field of mathematics, he developed heuristic infinitesimal
considerations. His best-known mathematical writing is the Stereometria Doliorum
(1615), where, e.g., Kepler’s barrel rule is given.
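The note names Kepler's barrel rule without stating it. In its commonly quoted form (an addition here, not taken from the text above), the volume of a body of height h is estimated from three cross-sectional areas as V ≈ (h/6)(A_bottom + 4 A_middle + A_top). The following minimal Python sketch illustrates this form; the function names are hypothetical, and the sphere is chosen as a test body only because the rule reproduces its volume exactly.

```python
import math


def barrel_rule(area_bottom, area_middle, area_top, height):
    """Approximate a volume from three cross-sectional areas.

    V ~ (h / 6) * (A_bottom + 4 * A_middle + A_top)
    """
    return height / 6.0 * (area_bottom + 4.0 * area_middle + area_top)


def sphere_volume_by_barrel_rule(radius):
    """Apply the rule to a sphere: the end sections shrink to points."""
    return barrel_rule(0.0, math.pi * radius**2, 0.0, 2.0 * radius)


if __name__ == "__main__":
    r = 1.0
    # Print the barrel-rule estimate next to the closed-form sphere volume.
    print(sphere_volume_by_barrel_rule(r), 4.0 / 3.0 * math.pi * r**3)
```

The rule has the same form as what is now usually called Simpson's rule for numerical integration, so it is exact whenever the cross-sectional area is a polynomial of degree at most three in the height coordinate.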
13 Galileo Galilei, Italian mathematician, b. Feb. 15, 1564, Pisa–d. Jan. 8, 1642,
Arcetri near Florence, studied in Pisa. At the Florentine Accademia del Disegno,
he got access to the writings of Archimedes. On the recommendation of his patron
Guidobaldo del Monte, in 1589 he received a professorship for mathematics in Pisa.
Whether he performed fall experiments at the leaning tower has not been proven beyond
doubt; in any case, such experiments were meant to prove his false theory. In 1592, Galileo
took a professorship of mathematics in Padua, not because of disagreements with col-
leagues but for a better salary. He invented a proportional pair of compasses, furnished
a precision mechanic workshop in his home, found the laws for the string pendulum,
and derived the laws of falling bodies first in 1604 from false assumptions and then
in 1609 from correct assumptions. Galileo copied the telescope invented one year ear-
lier in the Netherlands, used it for astronomical observations, and published the first
results in 1610 in his Sidereus Nuncius, the “star message.” Galileo discovered the
mountainous nature of the Moon, the abundance of stars of the Milky Way, the phases
of Venus, the moons of Jupiter (Jan. 7, 1610), and in 1611 the sunspots, although for
these Johannes Fabricius preceded him.
Only beginning in 1610 did Galileo, who had returned to Florence as court mathematician
and philosopher to the grand duke, publicly support the Copernican system.
By his over-eagerness in the following years, he provoked in 1616 the ban of this doctrine
by the pope. He was urged not to advocate it further by speech or in writing.
During a dispute on the nature of the comets of 1618, where Galileo was completely
right, he wrote as one of his most profound treatises the Saggiatore (inspector with the
gold balance, 1623), a paper dedicated to Pope Urban VIII. Since the former cardinal
Maffeo Barberini had been well disposed toward him, Galileo hoped to win him as
pope for accepting the Copernican doctrine. He wrote his Dialogo, the “Talk on the
Two Main World Systems,” the Ptolemaic and the Copernican, submitted the manuscript
in Rome for examination, and published it in 1632 in Florence. Since he obviously had
not included the agreed-upon changes of the text thoroughly enough and had shown
his sympathy with Copernicus too clearly, a trial set up against Galileo ended with his
abjuration and condemnation on June 22, 1633. Galileo was imprisoned in the building
of the Inquisition for a few days. The statement “And yet it (the earth) moves” (Eppur si
muove) is legendary. Galileo was sentenced to indefinite arrest, which he spent with
short breaks in his country house at Arcetri near Florence. There, he also wrote a work
important for the further development of physics: the Discorsi e dimostrazioni matematiche,
the “conversations and proofs” on two new branches of science: mechanics
(i.e., the strength of materials) and the science branches concerning local motions
(falling and throwing) (Leiden 1638).
In older representations of Galileo’s life, there are many exaggerations and mis-
takes. Galileo is not the creator of the experimental method, which he utilizes no
more than many other of his contemporaries, although sometimes more critically than
the competent Athanasius Kircher. Galileo was not an astronomer in the true sense,
but a good observer; and as an excellent speaker and writer, he won friends and pa-
trons for a growing new science and its methods among the educated of his age, and
he stimulated further research. Riccioli and Grimaldi in Bologna confirmed Galileo’s
laws of free fall by experiment. His pupils Torricelli and Viviani developed one of
Galileo’s experiments, intended to disprove the “horror vacui,” into the barometric experiment.
Christian Huygens developed his pendulum clock based on Galileo’s ideas, and
he transformed Galileo’s kinematics to a real dynamics.
Galileo was one of the first Italians who used their native language for presentation
of scientific problems. He defended this point of view in his correspondence. His prose
takes a special position within the Italian literature, since it is distinguished by its
masterly clarity and simplicity from the prevailing bombast that Galileo had reproved
in his literary-critical essays on Tasso et al. In his works Il Dialogo sopra i due massimi
sistemi (Florence 1632) and I Dialoghi delle nuove scienze (Leiden 1638), he utilized
the form of dialogue that came down from the Italian humanists, to be understood by
a broad audience. [BR]
14 Christian Huygens, Dutch physicist and mathematician, b. April 14, 1629, Den
Haag–d. July 8, 1695, Den Haag. After initially studying law, he turned to mathemati-
cal research and published among other things in 1657 a treatise on probability calcu-
lus. At the same time, he invented the pendulum clock. In March 1655, he discovered
the first moon of Saturn, and in 1656, the Orion nebula and the shape of Saturn’s ring.
By then, he was already familiar with the laws of collision and of central motion, but
published them, without proof, only in 1669. In 1663, Huygens was elected a member
of the Royal Society. In 1665, he settled in Paris as a member of the newly founded
French academy of sciences, from where he returned in 1681 to the Netherlands. After
publishing in 1657 the small treatise Horologium and in 1659 his Systema Saturnium,
sive de causis mirandorum Saturni phaenomenon, his main work appeared in 1673:
Horologium oscillatorium (the pendulum clock), which besides the description of an
improved clock construction contains a theory of the physical pendulum. It also
includes treatises on the cycloid as an isochrone and important theorems on central motion
and centrifugal force. From 1675 dates Huygens’s invention of the spring watch
with a balance spring, and from 1690 the Tractatus de lumine (treatise on light), which
contained a first version of the wave theory (collision theory) of light and, based on
that, developed the theory of double refraction in Iceland spar. The spherical propagation
of action around the light source is explained there by means of Huygens’
principle. [BR]
15 Copernicus, Coppernicus, German Koppernigk, Polish Kopernik, Nikolaus, as-
tronomer, and founder of the heliocentric world system, b. Feb. 19, 1473, Thorn–d.
May 24, 1543, Frauenburg (East Prussia). Beginning in 1491, he engaged in human-
istic, mathematical, and astronomical studies at the university in Cracow. From 1496
to 1500, he studied civil and clerical law in Bologna. At the instigation of his uncle,
bishop Lukas Watzenrode, in 1497 he was admitted to the chapter of Ermland at
Frauenburg, but he took only the lower holy orders. In Bologna, he continued his astronomical
work together with the professor of astronomy Domenico Maria Novara,
made a short stay in Rome, and in 1501 temporarily returned to Ermland. Beginning
in the autumn of 1501, he studied in Padua and Ferrara, graduating on May 31, 1503,
as a doctor of canonical law, and then studied medicine. After returning home in 1506,
he lived in Heilsberg as secretary to his uncle from 1506 until his death in 1512. He
was involved in administrating the diocese of Ermland, and he accompanied his uncle
to the Prussian state parliaments and the Polish imperial parliament. As chancellor of
the chapter, Copernicus after 1512 lived mostly in Frauenburg, resided as governor of
the chapter (1512–1521) in Mehlsack and Allenstein, and in 1523 was administrator
of the diocese of Ermland. As a deputy, he represented the order chapter (1522–1529)
at the Prussian state parliaments and there particularly supported monetary reform.
recognize the buoyancy action of the surrounding medium in free fall. Taking up the
ideas of Archimedes, he writes in his work De resolutione omnium Euclidis problema-
tum (Venice 1553) that the fall velocity shall be determined by the difference of the
specific weights of the falling body and the medium.
18 Robert Hooke, English researcher, b. July 18, 1635, Freshwater (Isle of Wight)–
d. March 3, 1703, London. Hooke was at first an assistant to R. Boyle; then from 1665
a professor of geometry at Gresham College in London; and from 1677 to 1682, secre-
tary of the Royal Society. Hooke improved already-known methods and devices, e.g.,
the pneumatic pump and the composite microscope (described in his Micrographia,
1664). Hooke was often involved in priority disputes, e.g., with Huygens, Hevelius,
and Newton. He proposed, among others, the melting point of ice as the zero point of
the thermometric scale (1664), recognized the constancy of the melting and boiling
point of substances (1668), and for the first time observed the black spots on soap
bubbles. He gave a conceptually good definition of elasticity and in 1679 established
Hooke’s law. [BR]
19 Plutarch (Greek Plutarchos), Greek philosopher and historian, b. about AD 50,
Recommendations for Further Reading on Theoretical Mechanics
The textbooks on theoretical mechanics listed below represent only part of the wealth
of excellent literature on this topic.
Classical textbooks on mechanics:
H. Goldstein: Classical Mechanics, 3rd edition (2001), Addison-Wesley Pub. Co.
A. Sommerfeld: Mechanics (Lectures on Theoretical Physics, Vol. 1), 4th edition
(1964), Academic Press.
L.D. Landau and E.M. Lifschitz: Mechanics, 3rd edition (1982), Butterworth-
Heinemann.
Problems and exercises for classical mechanics:
M.R. Spiegel: Theory and Problems of Theoretical Mechanics (Schaum’s Outline
Series), SI-edition (1980), McGraw-Hill.
More mathematical presentations of mechanics:
F. Scheck: Mechanics: From Newton’s Laws to Deterministic Chaos, 3rd edition
(1999), Springer.
J.B. Marion and S.T. Thornton: Classical Dynamics of Particles and Systems, 4th
edition (1995), Saunders College Publishing.
We consider the work of H. Goldstein to be particularly suited as a supplement.
Starting from the elementary principles, he outlines the formal Hamilton–Jacobi the-
ory in a didactically brilliant manner. All typical applications (central force problem,
rigid body, vibrations etc.) are discussed and expanded on in exercises and by special
recommendations for further reading. The lectures by A. Sommerfeld, planned in a
similar way, represent a gold mine because of the treatment of many special problems
and the imaginative power demonstrated in the mathematical solution techniques.
Readers may gain an appreciation of the formal aesthetics of mechanics
from the volume Mechanics by Landau and Lifschitz.
Index