Lectures 169
Igor Mezić
University of California, Santa Barbara
Updated Spring Quarter, 2024
Contents
1 Introduction
 1.1 Points of View in Dynamical Systems Theory
2 One-Dimensional Flows
3 Existence and Uniqueness, Potentials, Numerical Approximations
4 Bifurcation Theory
 4.1 Bifurcations of One-Dimensional Flows
  4.1.1 Saddle-node (blue-sky) bifurcation
A Topological Dynamics
 A.1 Introduction
B Ordinary Differential Equations on Manifolds, Vector Fields and Flows
C Operator-Induced Bases in Complex Spaces
References
Chapter 1
Introduction
To start with the title: what is the study of nonlinear phenomena in dynamical systems theory, and where does it get applied? The answers are simple: dynamical systems theory is the set of propositions that describe properties of systems changing in time. Therefore it applies to any system that changes in time, which is pretty much everything. The changes can be linear or nonlinear. We will discuss both, but our emphasis is on understanding and applying the nonlinear aspects, so we need to narrow the discussion a bit. The main directions in dynamical systems theory aim to understand properties of systems evolving in continuous time, described by ordinary differential equations,
ẋ = f(x, t), (1.1)
or in discrete time, described by maps,
x′ = T(x). (1.2)
In both cases x ∈ M is an element of some space M. The simple case is when this set is the Euclidean space Rⁿ. The more complex case is when the set is a manifold, a mathematical structure that is locally "flat", i.e. looks locally like Rⁿ. An example of such an object is the circle. The reader can learn more about manifolds from the crash course in Appendix B.
In continuous time, we distinguish the non-autonomous case, in which the vector field depends explicitly on time,
ẋ = F(x, t), (1.3)
from the autonomous case,
ẋ = F(x). (1.4)
We would like to know the state x(t) at any given time t of the system we are
considering, provided we have an initial condition x(0). This is possible, and the
resulting x(t) is unique, provided that F satisfies some conditions reflecting its
continuity and global behavior [2]. Solving for x as a function of time leads to
evolution
x(t) = S(x0 ,t),
where x₀ = x(0). We call the time-dependent family of maps¹ Sₜ(x₀) = S(x₀, t) the flow of the system.

Fig. 1.1: A trajectory of the flow: Sₜ(x₀) carries the initial condition x₀ through the states x₁, x₂, x₃.

The simplest example is the linear autonomous system
ẋ = λ x. (1.5)
1Note how we call F a vector field, and St a map. Both are space- and time-dependent families of
vectors, but they have different physical interpretations, so the nomenclature is different.
Integrating (1.5) from initial time t₀ gives
x(t) = x(t₀) e^{λ(t−t₀)}. (1.6)
The solution x(t) depends only on the time difference t − t₀, and not independently on t and t₀, and thus we can write τ = t − t₀ to obtain
S_τ(x₀) = x₀ e^{λτ}. (1.7)
Fig. 1.2: Value vs. time plot for the linear autonomous system (1.5). Left: for λ > 0, x → ±∞ as t → ∞ (and x = 0 for all t if x₀ = 0). Right: for λ < 0, x → 0 as t → ∞.
This is a general property of autonomous flows: the final position depends on the initial position and the time difference between the final and initial time.
In the case of the non-autonomous (time-dependent) version, (1.3), the flow depends on two time parameters, the initial time t₀ and the final time t:
S_{t₀}ᵗ(x₀) = x(t, t₀, x₀). (1.8)
Consider, for example,
ẋ = λ(t)x. (1.9)
We integrate
∫_{t₀}^{t} dx/x = ∫_{t₀}^{t} λ(t̄) dt̄, (1.10)
to obtain
x(t, t₀, x₀) = x(t₀) e^{∫_{t₀}^{t} λ(t̄)dt̄} = x₀ e^{∫_{t₀}^{t} λ(t̄)dt̄}. (1.11)
Assuming a simple linear dependence λ(t) = at for some constant a, we get
x(t, t₀, x₀) = x(t₀) e^{(a/2)(t² − t₀²)}. (1.12)
Thus,
S_{t₀}ᵗ(x₀) = x₀ e^{(a/2)(t² − t₀²)}, (1.13)
which is the flow of the system (1.9). Note that
t² − t₀² = (t − t₀)(t + t₀) (1.14)
depends on t and t₀ separately, and thus the final position does not depend only on the difference of the final
and initial time. The autonomous and non-autonomous systems can be partially reconciled by defining a new "independent" variable that is equal to time. In (1.9) that would amount to setting y = t, and writing
ẋ = λ (y)x,
ẏ = 1. (1.15)
Now the right-hand side of this 2D system depends only on (x, y) and not on t. We gained autonomy, at the expense of introducing an extra dimension. Interestingly, this is a helpful trick and we will use it often when studying systems with external input depending on time.
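To make the trick concrete, here is a minimal numerical sketch (an illustration, not part of the notes themselves) that integrates the autonomized 2-D system (1.15) with λ(t) = at and compares the result with the closed-form flow (1.13); the constant a, the step size h, and the helper rk4_step are illustrative choices.

import numpy as np

def rk4_step(f, z, h):
    # One classical Runge-Kutta step for the autonomous system z' = f(z).
    k1 = f(z)
    k2 = f(z + 0.5 * h * k1)
    k3 = f(z + 0.5 * h * k2)
    k4 = f(z + h * k3)
    return z + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

a = 0.3                                   # lambda(t) = a t
f = lambda z: np.array([a * z[1] * z[0],  # x' = lambda(y) x, with lambda(y) = a y
                        1.0])             # y' = 1, i.e. y plays the role of time

t0, t1, h = 0.0, 2.0, 1e-3
z = np.array([1.0, t0])                   # (x0, y0) with y0 = t0
for _ in range(int(round((t1 - t0) / h))):
    z = rk4_step(f, z, h)

exact = 1.0 * np.exp(0.5 * a * (t1**2 - t0**2))   # the flow (1.13)
print(z[0], exact)                        # the two values agree closely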
The first ordinary differential equations were solved around the time of Newton; Leibniz, the Bernoulli brothers, and others discovered in the 1680s that this was a useful thing to do². In their time the components of the vector x of interest were positions and velocities of particles in the universe. Now, on the practical front, plotting position and velocity values vs. time makes sense, and good information can be extracted from such plots, as the following example shows.
Example 1.1.1 In figure 1.4 we plot the case of free, undamped, one-degree-of-freedom vibrations satisfying the ordinary differential equation
mẍ = −kx, (1.16)
whose solution is
x(t) = (ẋ(0)/ωₙ) sin ωₙt + x(0) cos ωₙt,
ẋ(t) = ẋ(0) cos ωₙt − ωₙ x(0) sin ωₙt, (1.17)
where ωₙ = √(k/m) = 7.0711 rad/s is the natural frequency of the oscillation.
The top of figure 1.4 shows the position of the mass starting from the equilibrium x(0) = 0 with velocity ẋ(0) = 1. The amplitude of the periodic oscillation is ẋ(0)/ωₙ = 0.1414. The graphic in the middle represents velocity vs. time. Starting from the velocity ẋ(0) = 1, the curve of velocity is also periodic, with the same period as the position-vs-time curve.
Fig. 1.4: Free undamped vibrations with mass m = 1 kg, stiffness k = 50 N/m, and natural frequency ωₙ = 7.0711 rad/s. Top: Plot of position (vertical axis) vs. time (horizontal axis). Middle: Velocity vs. time. Bottom: State space plot of velocity (vertical axis) vs. position (horizontal axis).
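As a quick check of the numbers quoted in this example, the following sketch (an illustration, not from the notes) evaluates (1.17) for the stated parameters:

import numpy as np

m, k = 1.0, 50.0
wn = np.sqrt(k / m)                 # natural frequency, approx 7.0711 rad/s
x0, v0 = 0.0, 1.0                   # start at equilibrium with unit velocity

t = np.linspace(0.0, 2.0, 2001)
x = (v0 / wn) * np.sin(wn * t) + x0 * np.cos(wn * t)   # position, eq. (1.17)
v = v0 * np.cos(wn * t) - wn * x0 * np.sin(wn * t)     # velocity, eq. (1.17)

print(wn)               # 7.0710678...
print(x.max())          # amplitude ~ v0/wn = 0.1414
print(v.max())          # velocity amplitude ~ 1.0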
Value-vs-time plots give us all the information necessary to understand the dy-
namics of the system in time. Let’s call this Newton’s point of view. Note how-
ever that there are a number of things we would like to know that are difficult to
extract from the value-vs-time plots, such as the correlations between behaviors
of different states, if there is more than one. In the example of the mass-spring
system above, suppose we do not know the equations of motion, but we want to
know whether the behaviors of position and velocity we are measuring are cor-
related. The next point of view we describe makes much use of the relationship
between different states, and is less focused on their time-behavior.
2. The second point of view in dynamical systems theory came from a famed monograph of Poincaré [16]. In the course of his studies, Poincaré realized that we cannot solve an arbitrary smooth system of ordinary differential equations in closed form. In other words, the value-vs-time expression cannot be derived in a simple form for an arbitrary system. This is due to nonlinearity. Even simple mechanical systems, such as the mathematical pendulum, can be nonlinear.
Example 1.3 (Mathematical pendulum). The pendulum of length l under the acceleration of gravity g is depicted in figure 1.5. We neglect the weight of the rod. Using Newton's law, the equation of motion reads
ml²θ̈ = −mgl sin θ, (1.18)
which we can rewrite in first-order form as
θ̇ = ω,
ω̇ = −(g/l) sin θ. (1.19)
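Even when no closed-form solution is at hand, (1.19) is easy to integrate numerically. Here is a minimal sketch (illustrative parameter values, not from the notes) that also checks conservation of the pendulum's energy, a property that holds along exact solutions:

import numpy as np

g, l = 9.81, 1.0          # illustrative parameter values

def f(z):
    # Pendulum vector field (1.19): z = (theta, omega).
    theta, omega = z
    return np.array([omega, -(g / l) * np.sin(theta)])

def rk4(f, z, h, n):
    for _ in range(n):
        k1 = f(z); k2 = f(z + 0.5*h*k1); k3 = f(z + 0.5*h*k2); k4 = f(z + h*k3)
        z = z + (h/6.0)*(k1 + 2*k2 + 2*k3 + k4)
    return z

z0 = np.array([0.5, 0.0])            # moderate initial angle, zero velocity
zT = rk4(f, z0, 1e-3, 2000)          # state after t = 2 s
# The energy E = omega^2/2 - (g/l) cos(theta) is conserved along solutions:
E = lambda z: 0.5*z[1]**2 - (g/l)*np.cos(z[0])
print(E(z0), E(zT))                  # nearly equal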
While the mathematical pendulum can be solved in closed form, albeit with a lot of effort, using elliptic functions (trust me, I did a lot of those computations myself during my thesis work :-)), the observation that this cannot always be done was the root of further investigations into nonintegrability. Laplace famously re-
was the root of further investigations into nonintegrability. Laplace famously re-
marked that if one knew positions and velocities of all the particles in the universe
at any given point in time, then one could predict the future precisely. But when
the system has enough complexity, Poincaré showed that even close-by initial positions of particles can lead to very different outcomes and "chaotic" solutions, which oscillate forever without returning to their initial positions.³ He also under-
stood that the right hand side of (1.4) defines a vector field on the underlying
manifold M, and that properties of that vector field can tell us a lot about the
dynamics of the system without ever having to solve for the states as functions
of time!
Example 1.1.2 Rewriting (1.16) in first-order form, we get
ẋ = y,
ẏ = −(k/m) x.

³ The orbit of x is the set of points in state space that can be reached from x by evolving the dynamics for some positive or negative time.
x = 0. After that, the velocity increases towards zero and becomes positive again
after the system reaches the maximum negative position x = −0.1414. The sys-
tem returns to its starting position at x = 0, ẋ = 1 to recommence its motion. No
damping means the motion repeats again and again in the same way. All of this
can be proven using only the properties of the vector field
F(x, y) = (y, −(k/m) x), (1.21)
without solving for the trajectories explicitly. Poincaré's work introduced concepts such as stable and unstable manifolds (for the case of e.g. a fixed point x̄, those sets of points in state space such that x(t, x₀) → x̄ as t → ∞ (stable manifold) and x(t, x₀) → x̄ as t → −∞ (unstable manifold)) and Poincaré maps, maps that take the current system state into the state observed one period of the motion later, and unleashed a torrent of research, especially after Ueda's discovery in 1961 [20],
and Lorenz’s discovery in 1963 [8] of physically motivated systems with ape-
riodic asymptotic dynamics, leading to the notion of strange attractor (roughly
speaking a set toward which the flow converges as t → ∞) with chaotic dynamics.
In the Poincaré point of view, value-vs-time plots are replaced by explicit rela-
tionships between state variables. The focus is on the qualitative aspects of the
dynamics such as the structure of the state space, instead of the explicit knowl-
edge of the value of a state at any given moment in time.
3. An alternative, when faced with a problem in which only data are available, and there is no ODE or other model to start with, is the spectral point of view on dynamical systems. It does not seem to be anybody's in particular, but some of the biggest advances were made by Wiener [21], so we'll call it Wiener's picture. It avoids talking about models and equations, and instead talks about data, and about how to decompose the data coming from an observable into its harmonic components. We know this approach under the name of harmonic analysis.
5 Note however that in the Strogatz book the phase space representation terminology is used.
Example 1.1.3 One of the important physical concepts arising from a study of the oscillator (1.16) is its natural frequency. The values of the parameters that we introduced give the frequency fₙ = 1.1254 Hz (the angular natural frequency ωₙ = 2π fₙ = 7.0711 rad/s). The harmonic analysis of the signal (1.17) would reveal that the signal can be expressed using a single frequency term, at fₙ.
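A minimal sketch of the harmonic-analysis viewpoint (with assumed sampling choices; not part of the notes): sample the position signal from (1.17), compute its discrete Fourier transform, and read off the dominant frequency, which should come out near fₙ = 1.1254 Hz.

import numpy as np

wn = np.sqrt(50.0 / 1.0)                 # 7.0711 rad/s
fs, T = 100.0, 40.0                      # sampling rate and record length (assumed)
t = np.arange(0.0, T, 1.0 / fs)
x = (1.0 / wn) * np.sin(wn * t)          # position signal from (1.17)

X = np.fft.rfft(x)
freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
print(freqs[np.argmax(np.abs(X))])       # ~1.125 Hz: a single dominant peak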
Chapter 2
One-Dimensional Flows

In this chapter we consider one-dimensional flows,
ẋ = F(x), x ∈ R. (2.1)
The simplest one is the free particle equation, obtained for a particle moving on the line, not influenced by any physical forces,
ẋ = v, (2.2)
where v is the constant velocity of the particle. Another simple one is the linear equation discussed before, which can physically be obtained when particle inertia is negligible (in electrical engineering, when there is no induction, i.e. we have an RC circuit with no constant voltage source),
ẋ = kx. (2.3)
The new feature in the solution of this equation, compared to (2.2), is that it has a fixed point: if x(0) = 0, then x(t) = 0 for all t. If k > 0 then all other solutions go off to infinity and we call the fixed point unstable, while if k < 0 all solutions tend to the fixed point and we call it stable.
Let’s be a little more precise when we talk about stability. We will repeat these
definitions later, but it is useful to introduce them briefly now.
Definition 2.1. (Lyapunov stability) A solution x̄(t, t₀, x̄₀) of (2.1) is Lyapunov stable iff, given ε > 0, there is a δ(ε) > 0 such that for any solution y(t, t₀, y₀) with |x̄₀ − y₀| ≤ δ we have |x̄(t) − y(t)| < ε for all t > t₀.
For 1-dimensional ODE’s in R, figure 2.2 shows the graphical way of determining
dynamics and stability of 1D ODE’s in the form of Poincaré’s state space portrait.
The equilibria are labeled in red, the vector field direction is indicated by arrows,
while the stability intervals ε and δ are in green and yellow respectively.
1-D systems can only have trajectories that are 1) equilibria, and 2) orbits that connect equilibria, the trajectories on which satisfy
lim_{t→∞} x(t, t₀, x₀) = x̄, (2.4)
or
lim_{t→−∞} x(t, t₀, x₀) = x̄, (2.5)
where x̄ is an equilibrium.
Consider, for example, ẋ = sin x. Separating variables and integrating leads to
x(t, t₀, x₀) = 2 cot⁻¹(e^{−c(x₀,t₀)−t}), (2.8)
but exploring the properties of the solution this way, for arbitrary x₀, t₀, seems hard. Instead, if we go to the Poincaré picture, we get figure 2.3. From the figure it is evident
that there is an infinite number of fixed points, at nπ. The fixed points at 2kπ are unstable, while those at (2k + 1)π are stable.

Fig. 2.3: Poincaré state-space portrait and stability for the 1-D ODE ẋ = sin x.

The space-time trajectories in figure 2.4 show the behavior of solutions graphically. For any initial condition, the trajectories approach stable fixed points.
Isolated equilibria x̄, corresponding to F whose first derivative is continuous and non-zero at x̄, are either stable or unstable. Namely, for a 1-D ODE
ẋ = F(x), x ∈ R, (2.9)
we can perform linearization close to the fixed point x(t) = x̄. Let F be C¹ (once differentiable, with a continuous derivative) and dF/dx(x̄) = λ. Then by Taylor's theorem
ẋ = F(x) = λ(x − x̄) + h(x)(x − x̄), where lim_{x→x̄} h(x) = 0. (2.10)

Fig. 2.4: Trajectories in space-time for ẋ = sin x.

In all honesty, we should admit that a picture can't tell us certain quantitative things: for instance, we don't know the time at which the speed ẋ is greatest. But in many cases qualitative information is what we care about, and then pictures are fine.
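The linearization criterion can be checked mechanically. Here is an illustrative sketch (not from the notes) that evaluates λ = dF/dx at the fixed points x̄ = nπ of F(x) = sin x:

import numpy as np

F = np.sin
dF = np.cos          # derivative of F, known in closed form here

for n in range(-2, 3):
    xbar = n * np.pi
    lam = dF(xbar)                       # lambda = dF/dx at the fixed point
    kind = "stable" if lam < 0 else "unstable"
    print(f"x = {n}*pi: lambda = {lam:+.0f} -> {kind}")
# Even n (including 0) gives lambda = +1 (unstable), odd n gives -1 (stable),
# in agreement with figure 2.3.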
Remark 2.1. Recitation sessions for the first week will cover example 2.2.2 from the book, and the population growth example, section 2.3 in the book, in detail.
Fig. 2.5: Phase (state space) portrait for ẋ = x².
Chapter 3
Existence and Uniqueness, Potentials, Numerical Approximations

Consider again
ẋ = F(x). (3.1)
A sufficient condition for existence and uniqueness of solutions is that F be Lipschitz,
|F(x) − F(y)| ≤ c|x − y|, (3.2)
for some constant c. If this is not globally satisfied, the solution can blow up:
Example 3.1. Consider
ẋ = x². (3.3)
We have, for t₀ = 0,
∫_{x₀}^{x(t)} dx/x² = −1/x(t) + 1/x₀ = t ⇒ 1/x(t) = 1/x₀ − t. (3.4)
Thus,
x(t) = 1/(1/x₀ − t), (3.5)
which blows up, for positive x₀, at t = 1/x₀.
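The finite-time blow-up is easy to see numerically; a minimal sketch (illustrative step size, not part of the notes):

import numpy as np

# Forward-Euler integration of x' = x^2 with x0 > 0; the exact solution (3.5)
# blows up at t = 1/x0, and the numerical solution escapes near that time.
x0, h = 2.0, 1e-5
x, t = x0, 0.0
while x < 1e8:
    x += h * x * x
    t += h
print(t, 1.0 / x0)    # the escape time is close to 1/x0 = 0.5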
The above example shows that the solution might not exist for all time. In the next example, solutions are not unique:

Example 3.2. Consider
ẋ = x^{1/3}, x(0) = 0. (3.6)
Separation of variables gives the solution x(t) = (2t/3)^{3/2}, but x(t) ≡ 0 is a solution as well. When uniqueness fails, our geometric approach collapses because a phase point doesn't know how to move: if a phase point were started at the origin, would it stay there, or would it move according to x(t) = (2t/3)^{3/2}?
For proofs of the existence and uniqueness theorem, see Borrelli and Coleman (1987), Lin and Segel (1988), or virtually any text on ordinary differential equations. The theorem says that if F(x) is smooth enough, then solutions exist and are unique. Even so, as example 3.1 showed, there is no guarantee that solutions exist forever.

3.2 Potentials

For one-dimensional systems we can always write (3.1) in the gradient form
ẋ = −dV/dx, (3.7)
where
V′ = dV/dx = −F(x); (3.8)
just set
V = −∫_{x₀}^{x} F(x̄) dx̄. (3.9)
To show this, note that by the fundamental theorem of calculus,
−dV/dx = d/dx (∫_{x₀}^{x} F(x̄) dx̄) = F(x)! (3.10)
The function V is called the potential. The negative sign is a matter of convention; we will see why in a moment.
Example 3.3. Consider the mass-spring-viscous damper system shown in figure 3.2. The full equation of motion is
mẍ + cẋ + kx = 0; (3.11)
when the inertia is negligible, we get
ẋ = −(k/c) x = −(d/dx)(kx²/2c) = −(d/dx) V, (3.12)
where the potential is
V = kx²/2c. (3.13)
Note that the potential is always defined up to an additive constant, because the
derivative of a constant is 0.
Checking the evolution of V in time,
V̇ = (dV/dx)(dx/dt) = V′(−V′) = −(dV/dx)² < 0, (3.14)
so the potential decreases in time everywhere, except at points where dV/dx = 0. But
dV/dx = −F(x) = 0 (3.15)
indicates that the zero-gradient points of the potential are exactly the equilibrium, or fixed, points.
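The monotone decrease of V along trajectories is easy to verify numerically. A minimal sketch for example 3.3 (with k/c = 1 as an assumed illustrative value; not part of the notes):

import numpy as np

kc = 1.0                                  # k/c, an illustrative value
F = lambda x: -kc * x                     # vector field of example 3.3
V = lambda x: kc * x**2 / 2.0             # potential (3.13)

# Integrate x' = F(x) with forward Euler and watch V(x(t)) decrease.
x, h = 1.5, 1e-3
vals = []
for _ in range(5000):
    vals.append(V(x))
    x += h * F(x)
print(all(b < a for a, b in zip(vals, vals[1:])))   # True: V strictly decreases
print(x)                                            # approaching the fixed point 0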
Example 3.4. Consider the highly damped Duffing system shown in figure 3.3. The full equation of motion is
mẍ = kx − bx³ − cẋ. (3.16)
For m/c very small, and k/c = b/c = 1, we get
ẋ = x − x³, (3.17)
with potential V(x) = −x²/2 + x⁴/4. Equilibria occur at the zero-gradient points of V: minima of V(x) correspond to stable fixed points, while local maxima correspond to unstable ones.
Example 3.5. Consider the overdamped pendulum θ̇ = −sin θ, which has a stable fixed point at 0 and an unstable fixed point at π. We compute V(θ) = −cos(θ) + C, and again set C = 0. When the potential is at its minimum, at θ = 0, we have the stable equilibrium of the system (i.e. the straight-down position of the pendulum). Where the potential has its maximum, at θ = π (where it equals 1), we have the unstable fixed point, the straight-up position of the pendulum.
Euler’s method is visualized in figure 3.5 Because Euler method is based on the
Euler
exact
x1
x(t1)
x0
t0 t1 t2
Refinements
straightforward discretization of the time-derivative, the convergence of the approx-
imate solution to the true one will be of the order 4t. Many researchers worked (and
Onestill
problem
do) on with the Eulerofmethod
improvements is that ittheestimates
these methods, theone
natural next derivative
being theonly at the
improved
left end of the time interval between t and t . A more sensible approach would
Euler method.
n n+1
be to use the average derivative across this interval. This is the idea behind the
improved Euler method. We first take a trial step across the interval, using the Euler
method. This produces a trial value xn+1 = xn + f ( xn )∆t ; the tilde above the x
indicates that this is a tentative step, used only as a probe. Now that we’ve esti-
mated the derivative on both ends of the interval, we average f ( xn ) and f ( xn 1 ),
and use that to take the real step across the interval. Thus the improved Euler
method is
k1 = F(xn )4t,
1
k2 = F(xn + k1 )4t,
2
1
k3 = F(xn + k2 )4t,
2
k3 = F(xn + k3 )4t.
(3.26)
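As an illustration of the orders of accuracy just discussed (a sketch with an assumed test problem ẋ = −x, x(0) = 1; not part of the notes), compare the errors of the three schemes at t = 1 for two step sizes; halving Δt should cut the errors by factors of about 2, 4, and 16 respectively:

import numpy as np

F = lambda x: -x                      # test problem x' = -x, exact x(t) = e^{-t}

def euler(x, h):
    return x + h * F(x)

def improved(x, h):
    xt = x + h * F(x)                 # trial (tilde) step
    return x + 0.5 * h * (F(x) + F(xt))

def rk4(x, h):
    k1 = h*F(x); k2 = h*F(x + k1/2); k3 = h*F(x + k2/2); k4 = h*F(x + k3)
    return x + (k1 + 2*k2 + 2*k3 + k4) / 6.0

for step in (euler, improved, rk4):
    errs = []
    for h in (0.1, 0.05):
        x = 1.0
        for _ in range(int(round(1.0 / h))):
            x = step(x, h)
        errs.append(abs(x - np.exp(-1.0)))
    print(step.__name__, errs[0] / errs[1])   # ratios ~2, ~4, ~16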
Chapter 4
Bifurcation Theory

Consider, as a motivating experiment, the buckling of a vertical beam under an axial load. Here the weight placed on top of the beam plays the role of the control parameter, and the deflection of the beam from vertical plays the role of the dynamical variable x. In general, we study parameter-dependent systems
ẋ = f(x, p). (4.1)
The parameter set p is assumed to be constant in time. In the beam buckling exper-
iment the parameter is the amount of axial compression. For each single execution
of the experiment, it is fixed.
A bifurcation occurs when a small continuous change made to the parameter
values p (the bifurcation parameters) of a system causes a sudden qualitative or
topological change in its behavior. Generally, at a bifurcation, the local stability
properties of equilibria, periodic orbits, or other invariant sets change.
Consider
ẋ = f(x, r), f ∈ Cᵏ, k ≥ 2. (4.2)
The parameter p = r is the bifurcation parameter, here just a single scalar. Let
f(0, 0) = 0,
∂f/∂x (0, 0) = 0. (4.3)
The second condition is a necessary, but not sufficient, condition for the appearance of local bifurcations at r = 0. If instead
∂f/∂x (0, 0) ≠ 0, (4.4)
the implicit function theorem¹ shows that the equation f(x, r) = 0 possesses a unique solution x = x(r) in a neighborhood of 0, for small enough r. In particular, x = 0 is the only equilibrium in a neighborhood of 0 when r = 0, and the same property holds for r small enough. Furthermore, the dynamics in a neighborhood of 0 is qualitatively the same for all sufficiently small values of the parameter r: no bifurcation occurs for small values of r.
Consider now the example
ẋ = r + x².
For r < 0 there are two fixed points, x± = ±√(−r), the one at −√(−r) stable and the one at +√(−r) unstable.

Fig. 4.2: Blue sky bifurcation: plots of ẋ vs. x for r < 0, r = 0, and r > 0.
As r approaches 0 from below, the parabola moves up and the two fixed points move toward each other. When r = 0, the fixed points coalesce into a half-stable fixed point at x∗ = 0 (figure 4.2, middle). This type of fixed point is extremely delicate: it vanishes as soon as r > 0, and then there are no fixed points at all (figure 4.2, right). In this example, we say that a bifurcation occurred at r = 0, since the vector fields for r < 0 and r > 0 are qualitatively different.
There are several other ways to depict a saddle-node bifurcation; for example, one can show a stack of vector fields for discrete values of r.
The bifurcation diagram is the plot of x vs. r shown in figure 4.3. The parameter r is plotted horizontally and x vertically, which looks strange at first; arrows indicating the direction of the flow are sometimes included in the picture, but not always. We see that the two equilibrium points, one stable and one unstable, appear for r < 0. This bifurcation is labeled subcritical, as the new solutions appear below the critical point r = 0.

Fig. 4.3: Subcritical saddle-node (blue sky) bifurcation diagram.

Bifurcation theory is rife with conflicting terminology. The subject really hasn't settled down yet, and different people use different words for the same thing. For example, the saddle-node bifurcation is sometimes called a fold bifurcation, because of the fold in the curve of equilibria in the bifurcation diagram.
Consider now the example
ẋ = r − x².
For r > 0 there are two fixed points, x± = ±√r. At x = √r,
df/dx |_{x=√r} = (−2x)|_{x=√r} = −2√r < 0 ⇒ stable. (4.9)
A similar calculation leads to the conclusion that the fixed point at −√r is unstable. This bifurcation is labeled supercritical, as the new solutions appear above the critical point r = 0. The bifurcation diagram is again a plot of x vs. r, as in figure 4.3 but mirrored about r = 0, with the equilibria appearing for r > 0.
As we could see from the similarity of the examples above, there might be a way
to connect them to a larger theory. Here is a theorem that does that:
Theorem 4.1. (Saddle-node (blue sky) bifurcation). Assume that the vector field f is Cᵏ, k ≥ 2, in a neighborhood of (0, 0), satisfies (4.3) together with
∂f/∂r (0, 0) = a ≠ 0,
∂²f/∂x² (0, 0) = 2b ≠ 0. (4.10)
The following properties hold in a neighborhood of 0 in R for small enough r:
1. if ab < 0 (resp. ab > 0) the differential equation has no equilibria for r < 0 (resp. for r > 0),
2. if ab < 0 (resp. ab > 0) the differential equation possesses two equilibria x±(ε), ε = √|r|, for r > 0 (resp. r < 0), with opposite stabilities. Furthermore, the map ε → x±(ε) is Cᵏ⁻² in a neighborhood of 0 in R, and x±(ε) = O(ε).
A direct consequence of the conditions
f(0, 0) = 0,
∂f/∂x (0, 0) = 0 ⇒ a₁ = 0, (4.12)
where a₁ is the coefficient of x in the Taylor expansion of f, and
∂f/∂r (0, 0) = a ≠ 0,
∂²f/∂x² (0, 0) = 2b ≠ 0, (4.13)
is that the curve of equilibria is parabolic near (0, 0), thus
x² ≈ r ⇒ rx ≈ x³, r² ≈ x⁴, (4.14)
and that f has the Taylor expansion
f(x, r) = ar + bx² + (O(x³) + O(rx) + O(r²)). (4.15)
Note that the terms in parentheses are of order x³ and higher, due to x² ≈ r.
Proof. Since the Taylor expansion is given by (4.15), we neglect terms of order x³ and get
f(x, r) = ar + bx² = 0 ⇒ r = −(b/a) x². (4.16)
Consequently, depending on the sign of ar/b, or equivalently of a·b·r, there are no equilibria, one equilibrium at x = 0, or two equilibria at
x± = ±√(−ar/b). (4.17)
As an example, consider ẋ = r − x − e⁻ˣ. Expanding e⁻ˣ to second order gives
ẋ = r − x − (1 − x + x²/2) = r − 1 − x²/2. (4.20)
Set
p = r − 1,
y = x/√2, (4.21)
and we obtain
ẏ = p − y². (4.22)
As shown before, (4.22) exhibits the supercritical blue sky bifurcation at (y, p) = (0, 0), where the stable fixed point is at y = √p and the unstable one at y = −√p. In the original "coordinates" (x, r) the bifurcation occurs at x = 0, r = 1, and the fixed points are at x = ±√(2p) = ±√(2(r − 1)).
This is an example of a commonly used method in dynamical systems: use a change of variables to represent the system in a simpler form for which the solution is easy to find. In fact, what we have shown in general above is that all one-dimensional systems on the line exhibiting the blue sky bifurcation can be transformed to the same normal form close to the bifurcation point in the (x, r) plane, given by
ẋ = r − x². (4.23)
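The bifurcation diagram of the normal form can be traced mechanically; here is an illustrative sketch (not part of the notes) that lists equilibria and their stability across a few parameter values:

import numpy as np

# Equilibria of the normal form (4.23), x' = r - x^2, traced over r.
for r in (-0.2, -0.1, 0.0, 0.1, 0.2):
    if r < 0:
        print(f"r = {r:+.2f}: no equilibria")
    else:
        for x in (np.sqrt(r), -np.sqrt(r)):
            lam = -2.0 * x                       # d/dx (r - x^2)
            kind = ("stable" if lam < 0 else
                    "unstable" if lam > 0 else "half-stable")
            print(f"r = {r:+.2f}: x = {x:+.3f} ({kind})")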
An analogous result holds for vector fields possessing an odd symmetry.

Theorem 4.2. (Pitchfork bifurcation). Assume that the vector field f is Cᵏ, k ≥ 3, in a neighborhood of (0, 0), satisfies the symmetry condition
f(−x, p) = −f(x, p)
(so that, in particular, f(0, p) = 0 for all p), and
∂²f/∂p∂x (0, 0) = a ≠ 0,
∂³f/∂x³ (0, 0) = 6b ≠ 0. (4.24)
The following properties hold in a neighborhood of 0 in R for small enough p:
1. if ab < 0 (resp. ab > 0) the differential equation has one equilibrium at 0 for p < 0 (resp. for p > 0). The equilibrium is stable when b < 0 and unstable when b > 0.
2. if ab < 0 (resp. ab > 0) the differential equation possesses two symmetric equilibria x±(ε), ε = √|p|, for p > 0 (resp. p < 0), that are stable when b < 0 and unstable when b > 0. Furthermore, the map ε → x±(ε) is Cᵏ⁻³ in a neighborhood of 0 in R, and x±(ε) = O(ε).
The situation described in the above theorem is called a pitchfork bifurcation that
occurs at p = 0.
Theorem 4.3. (Transcritical bifurcation). Assume that f is Cᵏ, k ≥ 2, in a neighborhood of (0, 0), that f(0, p) = 0 for all small p, that ∂f/∂x(0, 0) = 0, and that
∂²f/∂p∂x (0, 0) = a ≠ 0,
∂²f/∂x² (0, 0) = 2b ≠ 0. (4.26)
The following properties hold in a neighborhood of 0 in R for small enough p:
1. The system possesses the 0 equilibrium and another equilibrium x(p).
2. if ap < 0 (resp. ap > 0) the 0 equilibrium is stable (respectively unstable), and x(p) has the opposite stability.
A direct consequence of the conditions is that f has the expansion
f(x, p) = apx + bx² + higher-order terms. (4.27)
The situation described in the above theorem is called a transcritical bifurcation that occurs at p = 0. The above are examples of bifurcations that occur when a real
eigenvalue crosses the imaginary axis. When a complex conjugate pair crosses the
imaginary axis, the Hopf bifurcation occurs. A simple example is
ṙ = pr − r3 ,
θ̇ = ω. (4.28)
We can write this example in Cartesian coordinates (x, y), rescale time by the period T corresponding to the frequency ω, and bring the linear part to the form
A = [[p, 1], [−1, p]], (4.29)
where p is the parameter we are varying. It turns out that if we additionally use the complex variable z = x + iy, the system reduces to the simple form
ż = λz − z|z|², (4.30)
that looks very much like the pitchfork bifurcation, just in complex variables. The
sign of p controls the stability of the fixed point. In fact, setting z = reiθ , we get
ṙ = r(p − r2 ),
θ̇ = 1. (4.31)
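For p > 0 every trajectory with r(0) > 0 spirals onto the circle r = √p, the limit cycle born in the Hopf bifurcation. A minimal numerical sketch (illustrative values of p, step size, and initial condition; not part of the notes):

import numpy as np

# Integrate the polar-form Hopf system (4.31): r' = r(p - r^2), theta' = 1.
p, h = 0.25, 1e-3
r, theta = 1.2, 0.0                     # illustrative initial condition
for _ in range(40000):                  # integrate to t = 40
    r += h * r * (p - r * r)
    theta += h
print(r, np.sqrt(p))                    # r -> 0.5, the limit cycle radius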
Appendix A
Topological Dynamics
In this Appendix we present the basic objects of topological dynamical systems theory, such as the notion of a compact metric space, ω-limit sets, α-limit sets, attractors, and homoclinic and heteroclinic orbits. We start with the concept of topology itself.
A.1 Introduction
Definition A.1.1 A topology on a set M is a collection T of subsets of M, called the open sets, such that (i) the empty set and M belong to T, (ii) any union of members of T belongs to T, and (iii) any finite intersection of members of T belongs to T.

Definition A.1.2 Denote by B(x, ε) = {y ∈ Rⁿ : |x − y| < ε} the open ball of radius ε around x. We call a subset A ⊂ Rⁿ open if for every x ∈ A there exists ε > 0 such that B(x, ε) ⊂ A. A subset F ⊂ Rⁿ is called closed if its complement Fᶜ is open. It is easy to see that the collection of open sets defined in such a way satisfies the above axioms for a topology on Rⁿ.
A subset of M is said to be closed if its complement is in T (i.e., its complement is open). A subset of M may be open, closed, both (a clopen set), or neither. The empty set ∅ and M itself are always both closed and open. A set N ⊂ M containing an open set V such that x ∈ V is called a neighborhood of x.
Let M be a topological space with a topology T. A set V ⊂ M is a topological subspace of M under the induced topology: the topology T_V on V formed by all intersections V ∩ A, where A ∈ T. It is easy to see that T_V satisfies the properties of a topology on V.
A connected space is a topological space that cannot be represented as the union
of two or more disjoint nonempty open subsets. Connectedness is one of the prin-
cipal topological properties that is used to distinguish topological spaces1 . A subset
of a topological space M is a connected set if it is a connected space when viewed
as a subspace of M.
A simply connected domain is a path-connected domain where one can contin-
uously shrink any simple closed curve into a point while remaining in the domain.
The topologies that we consider in this book are largely introduced by metrics. A metric is a non-negative function d(·, ·) : M × M → R that is symmetric in its arguments, has the property that d(x, y) = 0 if and only if x = y, and satisfies the triangle inequality
d(x, z) ≤ d(x, y) + d(y, z):
the sum of the distances between x and y and between y and z should not be smaller than the distance between x and z. The standard example is the Euclidean space metric
d(x, y) = (Σᵢ (xᵢ − yᵢ)²)^{1/2}.
Once a metric on M is defined, open sets can be defined via the so-called balls, the sets of points
B(y, ε) = {x ∈ M | d(x, y) < ε}.
Next, we define a distance function between any two non-empty subsets X and Y of M by
d(X, Y) = sup{d(x, Y) | x ∈ X},
where d(x, Y) = inf{d(x, y) | y ∈ Y}. If X and Y are compact then d(X, Y) is finite; d(X, X) = 0; and d inherits the triangle inequality property from the distance function on M. As it stands, d(X, Y) is not a metric, because d(X, Y) is not always symmetric, and d(X, Y) = 0 does not imply that X = Y (it does imply that X ⊆ Y). However, we can create a metric by defining the Hausdorff distance to be
d_H(X, Y) = max{d(X, Y), d(Y, X)}.
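For finite point sets the Hausdorff distance is a few lines of code; an illustrative sketch (the data sets are assumptions, not from the text):

import numpy as np

def hausdorff(X, Y):
    # d(X, Y) = sup_x inf_y |x - y|, symmetrized as in the definition above.
    D = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)
    return max(D.min(axis=1).max(), D.min(axis=0).max())

# Two finite point sets in the plane (illustrative data).
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
Y = np.array([[0.1, 0.0], [1.0, 0.2], [0.0, 1.0], [2.0, 0.0]])
print(hausdorff(X, Y))   # the extra point (2, 0) dominates: distance 1.0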
Appendix B
Ordinary Differential Equations on Manifolds, Vector Fields and Flows

In this Appendix we present some basic concepts from the theory of manifolds that we require for smooth reading of the main text. We start with one-dimensional manifolds in two dimensions, as these are easy to visualize and grasp intuitively. In
fact they are just a smidge away from the concept of a graph of a function that we
learn in elementary school. The concept of tangent space is also quite easy for one-
dimensional manifolds. Once these are introduced, the extension to n-dimensional
manifolds embedded inside d-dimensional real vector spaces should flow easily.
As a consequence, some basic concepts such as vector fields and their flows on
manifolds are defined with little effort. In case the introduction here is insufficient
for either clarity or scope, Spivak’s little book [19] is a wonderful introduction to
the subject.
B.1 Introduction
The set of real numbers R is a vector space, if one associates every real number a with an arrow extending from 0 to a along a straight line. Choose a basis for R, a unit vector e, such that any other vector f ∈ R is given as f = se, s ∈ R. Also, choose an orthonormal basis e₁, e₂ for R² such that any other vector f ∈ R² is given as f = xe₁ + ye₂. Recall that a "Cʳ map" means an r-times continuously differentiable map.
The following definition can be understood provided the reader is familiar with
the concepts of set topology introduced in appendix A:
Definition B.1. A subset M ⊂ R² is a Cʳ one-dimensional manifold if for every point p ∈ M there is an open neighborhood U of p in M, an open set V ⊂ R, and a Cʳ map ϕ : U → V with a Cʳ inverse ϕ⁻¹ : V → U. The geometry of this definition is shown in figure B.1. A very important concept related to that of a manifold is the tangent space.
Definition B.2. The tangent space at a point (x, y) of a 1−dimensional manifold M
in R2 is the line tangent to M at (x, y).
Calculating the tangent space at a point is simple if we know the map ϕ. Let s be a coordinate on V (think of just the linear coordinate on R restricted to the open set V). We will call s a local coordinate around p ∈ U. As ϕ⁻¹ is a map from a subset of R to R², it can be represented by two components,
ϕ⁻¹ = (ϕₓ⁻¹, ϕᵧ⁻¹).
Its derivative at t ∈ V is
dϕ⁻¹(t) = (∂ϕₓ⁻¹/∂s (t), ∂ϕᵧ⁻¹/∂s (t)).
The tangent space TM|ₚ of M at p = (ϕₓ⁻¹(t), ϕᵧ⁻¹(t)) is
dϕ⁻¹(R) = {c (∂ϕₓ⁻¹/∂s (t), ∂ϕᵧ⁻¹/∂s (t)), c ∈ R}.
The collection of points (p, v), where p ∈ M, v ∈ TM|ₚ, is called the tangent bundle.

An important example is the graph of a Cʳ function f : R → R,
G(f) = {(s, f(s)) ∈ R² | s ∈ R}.
Here ϕ⁻¹(s) = (s, f(s)), and
dϕ⁻¹|ₜ = (∂ϕₓ⁻¹/∂s (t), ∂ϕᵧ⁻¹/∂s (t)) = (1, df/ds|ₜ).
Thus, the tangent space at a point (s, f(s)) is given by all the vectors of the form
(c, c df/ds|ₜ), c ∈ R.
The slope of this line is df/ds|ₜ, and so the tangent space is just a one-dimensional vector space with slope equal to the tangent of f at t ∈ R.
The concept of the manifold was invented exactly as a generalization of the above example to the case of more complicated objects that can be represented as graphs of functions "locally", i.e. in a neighborhood of each of their points. General one-dimensional manifolds are grouped in two classes: those that can be smoothly transformed (meaning Cʳ mapped) to a circle, and those that can be transformed to the real line.
Now we can define a vector field on a 1-D manifold as a section of the tangent
bundle, i.e. a collection of points (p, vp ) where p ∈ M, vp ∈ T M|p . Note that here
we select 1 point in the tangent space T M|p for every point p ∈ M as opposed to
the definition of the tangent bundle, where to every point p ∈ M we associated the
whole set of vectors in T M|p . It is now easy to see how to define Cr vector fields,
by requiring that our selection of vectors is r times differentiable with respect to the
local coordinate s: recall that p(s) = (ϕₓ⁻¹(s), ϕᵧ⁻¹(s)) and vₚ(s) = (∂ϕₓ⁻¹/∂s (s), ∂ϕᵧ⁻¹/∂s (s)). If for any p ∈ M we can differentiate vₚ(s) r times with respect to the (local¹) coordinate s, we say the vector field is Cʳ.
1 We say local because there might be different such coordinates for different points p ∈ M.
We now define the tangent space of an n-dimensional manifold. We will use the following extension of the notion of the derivative of a function (that we utilized for 1-D manifolds in the previous section): for any vector of functions f = (f₁(x), ..., fₘ(x)), considered as a map between Rⁿ and Rᵐ, we can define its derivative Df|ₓ as the m × n matrix of partial derivatives of f. Thus, Df|ₓ is a linear map that takes vectors in Rⁿ into vectors in Rᵐ. The derivative map satisfies the following chain rule: if f : Rˡ → Rᵐ and g : Rⁿ → Rˡ, then
D(f ∘ g)|ₓ = Df|_{g(x)} Dg|ₓ.
If n = 1, then
dg
Dg = .
dt
Definition B.4. The tangent space TM|ₚ at a point p of an n-dimensional manifold M embedded in Rᵐ is the set of all the vectors tangent to M at p. The tangent space at p is the image of the derivative map Dϕ⁻¹|_{ϕ(p)} of ϕ⁻¹ at ϕ(p) ∈ Rⁿ.
Fig. B.2: The geometry of a 2D manifold, its tangent space and a curve St (p) on the
manifold.
Let c(t) be a curve in V ⊂ Rⁿ and let p(t) = ϕ⁻¹(c(t)) be the corresponding curve on M. Then
dp(t)/dt = Dϕ⁻¹ dc(t)/dt.
Since dc(t)/dt is a vector in Rⁿ, dp(t)/dt is in TM|_{p(t)}.
Now consider a differentiable vector field v ∈ TM. One can associate with it a flow Sₜ : M → M that is obtained by integrating
ϕ̇(p) = Dϕ v|ₚ
locally in V to obtain ϕ(t, p) as its solution, and transporting it back to the manifold as
Sₜ(p) = ϕ⁻¹(ϕ(t, p)),
to obtain the local mapping in U. This can be, on a compact manifold, "stitched together" to obtain a global family of mappings Sₜ that we call the flow of v.² Note that Sₜ(p), being a mapping from R to M, defines a curve in M. That curve passes through p at t = 0, with velocity vₚ. In fact,
dSₜ/dt |ₜ₌₀ = vₚ,
since
dϕ(t, p)/dt |ₜ₌₀ = ϕ̇(p) = Dϕ|ₚ v|ₚ
and
dSₜ/dt |ₜ₌₀ = Dϕ⁻¹|_{ϕ(0,p)} dϕ(t, p)/dt |ₜ₌₀ = Dϕ⁻¹|_{ϕ(p)} Dϕ|ₚ v|ₚ = v|ₚ.
Also, by the definition of Sₜ and the chain rule, it is clear that DSₜ maps TM|ₚ into TM|_{Sₜ(p)}.

As an example of a manifold defined by a level set, consider the sphere of radius R in R³, the zero level set of
f(x, y, z) = x² + y² + z² − R²,
for which
Df = (∂f/∂x, ∂f/∂y, ∂f/∂z) = (2x, 2y, 2z) ≠ 0, ∀(x, y, z) ≠ 0.
2 For more information, look up the excellent exposition in Chapter 5 of Arnold’s book [2].
Since (x, y, z) ≠ 0 on the level set f = 0 for any R > 0, we just proved that the set f = 0 is, for any finite radius R, a C¹ manifold.
We can ask the question: if we have m functions f₁, ..., fₘ, is the joint level set (the set of points on which the value of each fᵢ is constant) a manifold, and if so, of which dimension? Intuitively, each additional function reduces the dimension of the resulting set by 1. For example, a sphere defined by the zero level set of f₁(x, y, z) = x² + y² + z² − R² in 3D and a cylinder defined by the zero level set of f₂(x, y, z) = x² + y² − r² intersect in two circles, provided r < R. That union of two circles is a one-dimensional manifold (indeed, apply the definition for each of the circles separately, and you will see that their union satisfies the conditions to be a manifold; in this case the manifold is not a connected set). It should be intuitive that the condition for the joint level set L of m functions to form an n − m dimensional manifold in an n-dimensional space is that the matrix D(f₁, ..., fₘ) has rank m. Indeed, we have
Theorem B.1.2 Assume the vector function f : Rⁿ → Rᵐ has differential Df of rank m on the level set f = c = (c₁, ..., cₘ) ∈ Rᵐ. Then that level set is an n − m dimensional manifold in Rⁿ.
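The rank condition is easy to check numerically. Here is an illustrative sketch (not from the text) for the sphere-cylinder example above, with assumed radii R = 2 and r = 1:

import numpy as np

R, r = 2.0, 1.0     # sphere radius and cylinder radius, r < R

def Df(x, y, z):
    # Differential of f = (f1, f2), f1 = x^2+y^2+z^2-R^2, f2 = x^2+y^2-r^2.
    return np.array([[2*x, 2*y, 2*z],
                     [2*x, 2*y, 0.0]])

# A point on the joint level set f1 = f2 = 0: x^2+y^2 = r^2, z^2 = R^2-r^2.
x, y = r * np.cos(0.7), r * np.sin(0.7)
z = np.sqrt(R**2 - r**2)
print(np.linalg.matrix_rank(Df(x, y, z)))   # 2, so the level set is a
                                            # 3 - 2 = 1 dimensional manifold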
Appendix C
Operator-Induced Bases in Complex Spaces

In a number of places in the book we use the concept of projections of vectors onto eigenspaces, in both the finite- and infinite-dimensional context. If a linear operator on a Hilbert space (a vector space equipped with an inner product) has a set of eigenvectors whose linear combinations span that space (i.e. they form a basis), then we can define dual bases corresponding to the operator, as shown below. This is useful, as projection to an eigenspace of the operator is achieved by an inner product with members of the dual basis.

Given a basis {vᵢ}, any vector x can be expanded as
x = Σᵢ cᵢ vᵢ.
On Cⁿ we use the standard inner product
⟨x, y⟩ = Σᵢ xᵢ · yᵢᶜ,
where the superscript c denotes complex conjugation. Note that
⟨x, y⟩ = ⟨y, x⟩ᶜ.
C.2 Dual Bases Induced by a Linear Operator

To deal with the simple case first, let V be a vector space and assume A : V → V has a simple, finite or countable, discrete spectrum, in which algebraic and geometric multiplicities of eigenvalues¹ are finite and the same. Let v₁, ..., vₙ be the eigenvector basis, Avᵢ = λᵢvᵢ. Let A* be the adjoint of A, i.e. ⟨Ax, y⟩ = ⟨x, A*y⟩ for any x, y ∈ V, and let w₁, ..., wₙ be the adjoint vector basis, i.e.
A* wⱼ = λⱼᶜ wⱼ.
Then
⟨Avᵢ, wⱼ⟩ = λᵢ ⟨vᵢ, wⱼ⟩.
Also,
⟨vᵢ, A* wⱼ⟩ = λⱼ ⟨vᵢ, wⱼ⟩.
Thus, if λᵢ ≠ λⱼ (which holds for i ≠ j, since the spectrum is simple), ⟨vᵢ, wⱼ⟩ = 0. Now fix vᵢ and assume that ⟨vᵢ, wᵢ⟩ = 0. This could not be, since then the projection of vᵢ on n independent vectors in an n-dimensional space would vanish (for a space with n = ∞ the same conclusion follows if the space is complete, i.e. we can expand any vector into a linear combination of eigenvectors). Thus, ⟨vᵢ, wᵢ⟩ ≠ 0, and ⟨vᵢ, wᵢ⟩ can be made 1 by normalizing the eigenvector sets, e.g. by setting the matrix V = [v₁, ..., vₙ] and taking the wⱼ to be the conjugated rows of V⁻¹, [w₁ᶜ, ..., wₙᶜ]ᵀ = V⁻¹.
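A minimal numerical sketch of the dual-basis construction (the matrix A is an assumed example, not from the text):

import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, -1.0]])            # simple spectrum {2, -1}
lam, V = np.linalg.eig(A)              # columns of V: eigenvectors v_i
W = np.linalg.inv(V).conj().T          # columns of W: dual basis w_j

# Biorthogonality <v_i, w_j> = delta_ij (the inner product conjugates w):
print(np.round(W.conj().T @ V, 12))    # identity matrix

# The dual vectors are eigenvectors of the adjoint: A* w_j = conj(lambda_j) w_j.
print(np.allclose(A.conj().T @ W, W @ np.diag(lam.conj())))   # True

# Projection of x onto the i-th eigendirection: <x, w_i> v_i.
x = np.array([1.0, 1.0])
coords = W.conj().T @ x                # c_i = <x, w_i>
print(np.allclose(V @ coords, x))      # expanding back recovers x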
Now we consider the case when an eigenvalue λₖ has algebraic multiplicity mₖ and geometric multiplicity 1. Putting v₀ᵏ ≡ 0, the generalized eigenvectors satisfy
(A − λₖI) vⱼᵏ = vⱼ₋₁ᵏ, (j = 1, ..., mₖ).
To this basis there is an adjoint basis {w₁ᵏ, ..., w_{mₖ}ᵏ} which satisfies ⟨vⱼᵏ, wᵢᵏ⟩ = δⱼ,ᵢ. Consider A* wⱼᵏ, and expand it in the basis {wᵢᵏ}, i = 1, ..., mₖ.
1 Recall that the algebraic multiplicity of an eigenvalue is, in finite dimensions, the number of
times an eigenvalue appears in the characteristic polynomial of the operator. The geometric mul-
tiplicity in the finite-dimensional case is the number of eigenvectors corresponding to an eigen-
value. The geometric multiplicity is always smaller than or equal to the algebraic multiplicity. The
infinite-dimensional case we consider here has finite-dimensional blocks that have characteristic
polynomials associated with them.
A* wⱼᵏ = Σᵢ₌₁^{mₖ} ⟨A* wⱼᵏ, vᵢᵏ⟩ wᵢᵏ
       = Σᵢ₌₁^{mₖ} ⟨wⱼᵏ, Avᵢᵏ⟩ wᵢᵏ
       = Σᵢ₌₁^{mₖ} ⟨wⱼᵏ, λₖvᵢᵏ + vᵢ₋₁ᵏ⟩ wᵢᵏ
       = Σᵢ₌₁^{mₖ} λₖᶜ ⟨wⱼᵏ, vᵢᵏ⟩ wᵢᵏ + Σᵢ₌₁^{mₖ} ⟨wⱼᵏ, vᵢ₋₁ᵏ⟩ wᵢᵏ. (C.2)
Also,
⟨wⱼᵏ, Avᵢˡ⟩ = λₗᶜ ⟨wⱼᵏ, vᵢˡ⟩ + ⟨wⱼᵏ, vᵢ₋₁ˡ⟩. (C.3)
Now take j = mₖ and i = 1 in (C.2) and (C.3) to show that ⟨w_{mₖ}ᵏ, v₁ˡ⟩ = 0 for k ≠ l. Taking j = mₖ − 1, it is again clear from (C.2) and (C.3), by using ⟨w_{mₖ}ᵏ, v₁ˡ⟩ = 0 for k ≠ l, that ⟨w_{mₖ−1}ᵏ, v₁ˡ⟩ = 0. Repeating this process, we prove orthogonality for any j, with i = 1, when k ≠ l. Now we proceed with j = mₖ, i = 2. Continuing this process, by first decreasing j while keeping i constant, and then increasing i, leads to verification of ⟨wⱼᵏ, vᵢˡ⟩ = 0 for k ≠ l.
We can simply remove the restriction of geometric multiplicity being 1 as follows: count the eigenvalues λₖ, k = 1, ..., s, in such a way that if a certain eigenvalue λⱼ has more than one eigenvector, it is listed as a separate eigenvalue as many times as the dimension of its geometric eigenspace gⱼ (i.e. the number of eigenvectors associated with λⱼ), and label those eigenvalues λⱼ₁, λⱼ₂, ..., λⱼ_{gⱼ}, although they have the same numerical value. Each such eigenvector might have some generalized eigenvectors associated with it, with multiplicities mⱼₖ, k = 1, ..., gⱼ. Clearly, Σₖ₌₁^{gⱼ} mⱼₖ = mⱼ.
Now, expanding any vector x as
x = Σₖ₌₁ˢ Σⱼ₌₁^{mₖ} cⱼᵏ vⱼᵏ,
a bit of geometry shows that ⟨x, wⱼᵏ⟩ vⱼᵏ is the skew-projection of x onto the direction of vⱼᵏ, "along" the eigenspace spanned by the rest of the (generalized) eigenvectors. See figure C.1 for the graphical representation of this geometry in the case of two eigenvectors and their duals.

Fig. C.1: The geometry of an eigenvector basis and its dual in the two-dimensional case.

We can see that the projection of x on the subspace spanned by w₁ is given by ⟨x, w₁⟩ w₁. Label the projection of x on the subspace spanned by v₁ by cv₁. Noticing that both w₁ and v₁ are of unit length, the projection of cv₁ on the subspace spanned by w₁ is given by ⟨cv₁, w₁⟩ w₁ = cw₁. From this, we conclude c = ⟨x, w₁⟩. A similar calculation can be done for the projection of x on the subspace spanned by v₂, to show that it is equal to ⟨x, w₂⟩ v₂.
C.3 Projections
A linear operator P that satisfies P² = P is called a projection. The image (or range) of such an operator is a linear subspace V_P of the vector space V: let g₁ = Pf₁, g₂ = Pf₂. Then
c₁g₁ + c₂g₂ = c₁Pf₁ + c₂Pf₂ = P(c₁f₁ + c₂f₂), (C.4)
so any linear combination of vectors in the image is again in the image.
The kernel of P, the set of all the vectors that P maps to 0, is given by (I − P)V: indeed,
P(I − P)x = (P − P²)x = 0. (C.5)
For example, for a basis vector vₖ with dual vector wₖ, the operator
P_{vₖ} x = ⟨x, wₖ⟩ vₖ (C.6)
is a projection on the subspace spanned by vₖ:
P_{vₖ}² x = ⟨⟨x, wₖ⟩vₖ, wₖ⟩ vₖ = ⟨x, wₖ⟩⟨vₖ, wₖ⟩ vₖ = ⟨x, wₖ⟩ vₖ = P_{vₖ} x, (C.7)
where in the next-to-last equality we used the fact that ⟨vₖ, wₖ⟩ = 1. Also, all the vectors in span{v₁, ..., vₖ₋₁, vₖ₊₁, ..., vₙ} map to 0 under P_{vₖ}, by the orthonormality of the dual bases (to see this, set x = vⱼ, j ≠ k, into (C.6)).
Let’s see how the projections get constructed using matrices. Let B be the matrix
whose columns are the basis vectors vk spanning a subspace VB , and C the matrix
whose columns are the dual basis vectors wk . Then, the projection PB on the sub-
space VB is given by
P = BC† (C.8)
where C† is the complex conjugate (Hermitian) transpose of C. To show this, let’s
first consider the example where VB = span{vk }:
n
(Pvk x) = hx, wk i vk = ( ∑ xl wckl )vk = BkCk† x. (C.9)
l=1
where
BC† = ∑ BkCk† . (C.11)
k∈I
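An illustrative numerical check of (C.8) (the 3×3 matrix is an assumed example, not from the text):

import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, -2.0]])       # simple spectrum {1, 3, -2}
lam, V = np.linalg.eig(A)
W = np.linalg.inv(V).conj().T          # dual basis, as before

idx = [0, 1]                           # project on the first two eigendirections
B, C = V[:, idx], W[:, idx]
P = B @ C.conj().T                     # the projection (C.8)

print(np.allclose(P @ P, P))           # True: P is a projection
print(np.allclose(P @ V[:, 0], V[:, 0]))       # leaves v_1 fixed
print(np.allclose(P @ V[:, 2], 0 * V[:, 2]))   # kills the remaining eigenvector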
The following situation commonly occurs when we consider the action of an operator A on V and a vector b which is outside of the range R of A: the equation
Ax = b, x, b ∈ V, (C.12)
is then not solvable. In some sense, the best we can do is to try to solve
Ax = P_R b = BC† b, (C.13)
where P_R b is the projection of b along the null space of A onto the range of A. Thus, there is a vector x̂ that satisfies (C.13).
Given A, it is simple to find the so-called Moore-Penrose pseudoinverse starting from (C.12):
(A†A)⁻¹ A† A x = (A†A)⁻¹ A† b, x, b ∈ V, (C.14)
implying
x̃ = (A†A)⁻¹ A† b. (C.15)
Note that
A†(Ax̃ − b) = A†A(A†A)⁻¹A†b − A†b = 0, (C.16)
indicating that the vector Ax̃ − b is orthogonal to the range of A (since the range of A is the span of the columns of A). Thus, the vector b splits into Ax̃, the orthogonal projection of b on R, and b − Ax̃. The matrix
A⁺ = (A†A)⁻¹ A† (C.17)
is called the Moore-Penrose pseudoinverse of A.
In the case of an operator A with a nontrivial null space N, we can similarly use the oblique projection (C.13) on the range of A. If we set
x̂ = (C†A)⁻¹ C† b, (C.20)
we get
(C†A) x̂ = C† b, (C.21)
i.e.
C†(Ax̂ − b) = 0, (C.22)
and thus Ax̂ − b is in the nullspace of A. In the Moore-Penrose pseudoinverse, Ax̃ − b was orthogonal to the range of A, and thus was not necessarily in the nullspace.
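A minimal numerical sketch of the Moore-Penrose construction (C.15)-(C.17), assuming A has full column rank so that A†A is invertible (the data are an illustrative example, not from the text):

import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])               # full column rank, b outside range(A)
b = np.array([0.0, 1.0, 3.0])

Aplus = np.linalg.inv(A.conj().T @ A) @ A.conj().T   # A^+ from (C.17)
xt = Aplus @ b

print(xt)                                        # least-squares solution
print(np.allclose(xt, np.linalg.pinv(A) @ b))    # matches numpy's pinv
print(np.round(A.conj().T @ (A @ xt - b), 12))   # residual orthogonal to range(A)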
References