Simultaneous Diagonalization of Two Quadratic Forms and A Generalized Eigenvalue Problem
Theorem 1. Let $A$, $M$ be two real symmetric matrices of the same size, and
let $M$ be positive definite. Then there exists a non-singular matrix $C$ such
that
$$C^T M C = I, \tag{1}$$
and
$$C^T A C = \Lambda, \tag{2}$$
where $\Lambda$ is a real diagonal matrix.
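Proof. We have
$$M = R^T R, \tag{3}$$
with some non-singular matrix $R$. Then the matrix $(R^{-1})^T A R^{-1}$ is
symmetric, so there exists an orthogonal matrix $B$ such that
$$B^{-1} (R^{-1})^T A R^{-1} B = \Lambda, \tag{4}$$
where $\Lambda$ is a real diagonal matrix. Set
$$C = R^{-1} B. \tag{5}$$
Then $C^T = B^T (R^{-1})^T = B^{-1} (R^{-1})^T$, where we used that $B$ is
orthogonal. So (4) is the same as (2). To check (1) we use $B^T = B^{-1}$ and
$(R^{-1})^T = (R^T)^{-1}$ and obtain
$$C^T M C = B^{-1} (R^T)^{-1} R^T R R^{-1} B = I.$$
This proves Theorem 1.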
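The construction in the proof can be sketched numerically. The following is a minimal sketch, assuming NumPy is available and taking $R$ to be the transposed Cholesky factor of $M$ (one possible choice; the proof only needs some non-singular $R$ with $M = R^T R$). The function name and the test matrices are hypothetical.

```python
import numpy as np

def simultaneous_diagonalization(A, M):
    """Return (C, lam) with C.T @ M @ C = I and C.T @ A @ C = diag(lam).

    Follows the steps of the proof: M = R^T R, then an orthogonal B that
    diagonalizes (R^{-1})^T A R^{-1}, then C = R^{-1} B.
    """
    # np.linalg.cholesky returns a lower-triangular L with M = L @ L.T,
    # so R = L.T satisfies M = R.T @ R as in (3).
    R = np.linalg.cholesky(M).T
    R_inv = np.linalg.inv(R)
    S = R_inv.T @ A @ R_inv       # the symmetric matrix (R^{-1})^T A R^{-1}
    S = (S + S.T) / 2             # remove round-off asymmetry
    lam, B = np.linalg.eigh(S)    # B is orthogonal, B.T @ S @ B = diag(lam)
    C = R_inv @ B                 # formula (5)
    return C, lam

# Hypothetical test matrices: A symmetric, M symmetric positive definite.
A = np.array([[2.0, 1.0], [1.0, -1.0]])
M = np.array([[4.0, 1.0], [1.0, 3.0]])
C, lam = simultaneous_diagonalization(A, M)
```

Checking `C.T @ M @ C` against the identity and `C.T @ A @ C` against `np.diag(lam)` confirms (1) and (2) up to round-off.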
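Next we will show that the entries $\lambda_j$ of the diagonal matrix $\Lambda$
in this theorem are generalized eigenvalues of $A$ with respect to $M$:
$$Ax = \lambda M x, \quad x \neq 0. \tag{6}$$
A non-zero solution $x$ of (6) exists if and only if
$$\det(A - \lambda M) = 0,$$
so the generalized eigenvalues are the roots of this equation,
and the generalized eigenvectors are the columns of $C$ from (1), (2).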
We obtain from Theorem 1 and from its proof:
Corollary. Let A, M be symmetric matrices of the same size, and let M be
positive definite. Then all generalized eigenvalues (6) are real, and there is a
basis of the whole space which consists of generalized eigenvectors.
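Proof. We refer to the proof of Theorem 1. The matrix $(R^{-1})^T A R^{-1}$ is
symmetric, therefore all its eigenvalues are real and its eigenvectors form a
basis. These eigenvectors are the columns of $B$. If $v_j$ is an eigenvector of
$(R^{-1})^T A R^{-1}$ with eigenvalue $\lambda_j$, then
$$(R^{-1})^T A R^{-1} v_j = \lambda_j v_j.$$
Setting $u_j = R^{-1} v_j$ and multiplying both sides by $R^T$, we obtain
$$A u_j = \lambda_j R^T R u_j = \lambda_j M u_j,$$
so $u_j$ is a generalized eigenvector with the real generalized eigenvalue
$\lambda_j$. The vectors $u_j$ are the columns of $C = R^{-1} B$; they form a
basis because $C$ is non-singular, and (1) says that $u_i^T M u_j$ equals 1 if
$i = j$ and 0 otherwise.
This means that $u_j$, the eigenvectors of (6), are orthonormal with respect to
the dot product defined by
$$(x, y)_M = x^T M y,$$
and our matrix $R$ transforms this dot product to the standard dot product:
$$(x, y)_M = x^T M y = x^T R^T R y = (Rx, Ry).$$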
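The statements of the Corollary can be checked numerically. This is a minimal sketch, assuming NumPy and using hypothetical example matrices; $R$ is again taken to be the transposed Cholesky factor of $M$, which is one possible choice and not prescribed by the text.

```python
import numpy as np

# Hypothetical example matrices: A symmetric, M symmetric positive definite.
A = np.array([[2.0, 1.0], [1.0, -1.0]])
M = np.array([[4.0, 1.0], [1.0, 3.0]])

# M = R^T R, and v_j are eigenvectors of (R^{-1})^T A R^{-1}.
R = np.linalg.cholesky(M).T
R_inv = np.linalg.inv(R)
lam, V = np.linalg.eigh(R_inv.T @ A @ R_inv)
U = R_inv @ V                        # columns u_j = R^{-1} v_j

# Each column satisfies A u_j = lambda_j M u_j, that is, equation (6).
residual = A @ U - M @ U @ np.diag(lam)

# The u_j are orthonormal in the M-dot product: U^T M U = I.
gram = U.T @ M @ U

# Each lambda_j is a root of det(A - lambda M) = 0.
dets = [np.linalg.det(A - l * M) for l in lam]

# R carries the M-dot product to the standard one: x^T M y = (Rx, Ry).
x, y = np.array([1.0, 2.0]), np.array([-3.0, 0.5])
lhs, rhs = x @ M @ y, (R @ x) @ (R @ y)
```

Up to round-off, `residual` and `dets` vanish, `gram` is the identity, and `lhs` equals `rhs`.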
Applications to mechanics.
Newton’s form of equations of motion ma = F is not always convenient,
especially when one deals with curvilinear coordinates. A generalization
was proposed by Lagrange. We consider a system of points whose position is
completely determined by some generalized coordinates $q = (q_1, \ldots, q_n)$.
For example, for one free point in space we have three coordinates
$(q_1, q_2, q_3) = (x_1, x_2, x_3)$. Or $q$ may be cylindrical or spherical
coordinates. For $m$ free points in space we need $n = 3m$ coordinates. For a
pendulum oscillating in a vertical plane, we need one coordinate; the angle of
deviation of this pendulum from the vertical is a convenient choice.
As the system moves, coordinates are functions of time $q_j(t)$. Their
derivatives are called generalized velocities, $\dot{q} = dq/dt$. Derivatives
with respect to time are usually denoted by dots over letters in mechanics, to
distinguish them from other derivatives. To obtain the true velocity vector
of a point $x_k \in \mathbb{R}^3$, one has to write $x_k = f_k(q)$, rectangular
coordinates as functions of generalized coordinates, and differentiate:
$$\dot{x}_k = \sum_{j=1}^{n} \frac{\partial f_k}{\partial q_j} \dot{q}_j,$$
and the kinetic energy is
$$T_k = m_k \|\dot{x}_k\|^2 / 2 = \sum_{i,j} b_{k,i,j}(q)\, \dot{q}_i \dot{q}_j, \tag{7}$$
where $b_{k,i,j}$ are some functions of $q$. The total kinetic energy $T$ of the
system is the sum of such expressions over all points $x_k$, $T = \sum_k T_k$.
The important fact is that
Kinetic energy is a positive definite quadratic form of generalized veloci-
ties, with coefficients depending on the generalized coordinates.
It is positive definite because the LHS of (7) is non-negative and the sum
of such expressions is positive, if at least one point actually moves.
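As an illustration that is not part of the original argument, consider one point of mass $m$ in a plane described by polar coordinates $q = (r, \theta)$. This example is an assumption chosen to show how the coefficients of the quadratic form depend on $q$:

```latex
\begin{align*}
x &= f(q) = (r\cos\theta,\; r\sin\theta), \\
\dot{x} &= (\dot{r}\cos\theta - r\dot{\theta}\sin\theta,\;
            \dot{r}\sin\theta + r\dot{\theta}\cos\theta), \\
T &= \frac{m}{2}\,\|\dot{x}\|^2
   = \frac{m}{2}\left(\dot{r}^2 + r^2\dot{\theta}^2\right).
\end{align*}
```

Here the quadratic form in $(\dot{r}, \dot{\theta})$ has the diagonal matrix $\frac{m}{2}\,\mathrm{diag}(1, r^2)$, which depends on $q$ and is positive definite for $r > 0$.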
Now we assume that the vector of forces is the gradient of some function $-U$
of generalized coordinates:
$$F = -\operatorname{grad} U = -\left(\frac{\partial U}{\partial q_1}, \ldots, \frac{\partial U}{\partial q_n}\right). \tag{8}$$
This function U is called the potential energy or simply the potential.
Following Lagrange’s recipe, we form the following function of generalized
coordinates and velocities:
L = T − U,
the difference between kinetic and potential energy. This function is called
the Lagrangian of the system. The equations of motion in the form of Lagrange
are
$$\frac{d}{dt} \frac{\partial L}{\partial \dot{q}_j} = \frac{\partial L}{\partial q_j}, \quad 1 \le j \le n. \tag{9}$$
The advantage of this formulation is that, unlike Newton's equations, it allows
an arbitrary curvilinear coordinate system to be used.
To see that these equations indeed generalize Newton's equations, consider a
free point with coordinate $x = (x_1, x_2, x_3)$ and mass $m$ moving in the
field of force with potential $U$. Then the kinetic energy is
$$T = \frac{m}{2}\left(\dot{x}_1^2 + \dot{x}_2^2 + \dot{x}_3^2\right),$$
and the Lagrangian is $L = T - U$. So equations (9) become
$$\frac{d}{dt}(m \dot{x}_j) = -\frac{\partial U}{\partial x_j} = F_j(x_1, x_2, x_3),$$
where $F_j$ is the $j$-th component of the force.
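To illustrate the same recipe with a curvilinear coordinate, here is a sketch for the plane pendulum mentioned earlier. The symbols $m$ (mass), $\ell$ (length of the pendulum), $g$ (gravitational acceleration) and the form of $U$ are assumptions of this example; $q$ is the angle of deviation from the downward vertical.

```latex
\begin{align*}
T &= \frac{m}{2}\,\ell^2 \dot{q}^2, \qquad U = -m g \ell \cos q, \qquad
L = \frac{m}{2}\,\ell^2 \dot{q}^2 + m g \ell \cos q, \\
\frac{d}{dt}\frac{\partial L}{\partial \dot{q}} &= m \ell^2 \ddot{q}, \qquad
\frac{\partial L}{\partial q} = -m g \ell \sin q
\quad\Longrightarrow\quad
\ddot{q} = -\frac{g}{\ell}\,\sin q .
\end{align*}
```

A single application of (9) gives the pendulum equation, with no need to resolve the tension of the rod into rectangular components.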
Equations of motion are usually non-linear and cannot, in general, be solved
in closed form.
One of the most common methods of dealing with them is linearization,
that is approximation of non-linear equations by linear ones. The simplest
case is the linearization near an equilibrium. An equilibrium is a point q0
such that the system in this state does not move. This means that equations
(9) are satisfied by q(t) ≡ q0 .
Theorem 2. A point q0 is an equilibrium if and only if it is a critical point
of the potential energy U .
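Proof. Since $U$ does not depend on $\dot{q}$, we can write (9) as
$$\frac{d}{dt} \frac{\partial T}{\partial \dot{q}_j} = \frac{\partial T}{\partial q_j} - \frac{\partial U}{\partial q_j}.$$
If $q(t) \equiv q_0$ is a solution, then $\dot{q} = 0$, and since $T$ is a
quadratic form in $\dot{q}$, we have $\partial T/\partial \dot{q}_j = 0$ and
$\partial T/\partial q_j = 0$ for all $j$. So $\partial U/\partial q_j = 0$ at
$q_0$. Conversely, if all $\partial U/\partial q_j$ vanish at $q_0$, then
$q(t) \equiv q_0$ satisfies these equations, because both sides are zero.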
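As a small worked instance of Theorem 2, consider a plane pendulum with the assumed potential $U(q) = -m g \ell \cos q$, where $m$, $g$, $\ell$ are hypothetical positive constants and $q$ is the angle from the downward vertical:

```latex
\begin{align*}
\frac{dU}{dq} = m g \ell \sin q = 0
\quad\Longleftrightarrow\quad
q = 0 \ \text{or}\ q = \pi \pmod{2\pi}.
\end{align*}
```

So this pendulum has two equilibria: hanging straight down ($q = 0$) and balanced straight up ($q = \pi$).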
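For the linearization we assume without loss of generality that $q_0 = 0$,
and that both $T$ and $U$ are analytic functions of $q$ and $\dot{q}$. This
means that they have convergent series expansions in powers of $q$ and
$\dot{q}$. We keep only the terms of degree at most two in $L = T - U$. The
constant term of $U$ does not affect the equations (9), and the linear terms of
$U$ vanish because $q_0 = 0$ is a critical point of $U$ (Theorem 2). Writing
$x$ for the coordinates, the linearized Lagrangian takes the form
$$L = \dot{x}^T M \dot{x} - x^T A x, \tag{10}$$
the difference of two quadratic forms with matrices $M$ and $A$ such that $M$ is
positive definite.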
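To write the Lagrange equations with Lagrangian (10) we need the
differentiation formula
$$\frac{d}{dx}(x^T A x) = 2Ax,$$
valid for a symmetric matrix $A$. Here $d/dx$ is the column
$(\partial/\partial x_1, \ldots, \partial/\partial x_n)^T$. So, after cancelling
the common factor 2, the equation of motion with Lagrangian (10) is
$$\frac{d}{dt} M \dot{x} = -Ax. \tag{11}$$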
Now we can use the theorem on simultaneous diagonalization. We can apply
it directly to (10) to conclude that there are new coordinates $y$, $x = Cy$,
such that
$$L = \dot{y}^T \dot{y} - y^T \Lambda y,$$
so the equations of motion decouple and become
$$\ddot{y}_j = -\lambda_j y_j, \quad 1 \le j \le n. \tag{12}$$
Or alternatively we can apply the Corollary to the linear equation (11) and
find a basis uj of generalized eigenvectors. If yj are coordinates with respect
to this basis, we obtain (12) again. Stated in words this means that the
linearized equation of small oscillations always decouples and becomes (12)
after a change of coordinates.
Notice that if the matrix $A$ is positive definite, then all $\lambda_j > 0$ and
solutions have the form $y_j(t) = c_j e^{\pm i \omega_j t}$, where
$\omega_j = \sqrt{\lambda_j}$; the system is stable
and solutions oscillate with frequencies $\omega_j$.
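The decoupling can be turned into a solver for $M \ddot{x} = -Ax$. The following is a minimal sketch, assuming NumPy, positive definite $A$ and $M$, and hypothetical data; the function name `modal_solution` is made up for this illustration. It writes the solution in real form, with cosines and sines in place of $e^{\pm i \omega_j t}$.

```python
import numpy as np

def modal_solution(A, M, x0, v0, t):
    """Value at time t of the solution of M x'' = -A x, x(0)=x0, x'(0)=v0.

    Assumes A and M are symmetric positive definite, so all lambda_j > 0.
    """
    R = np.linalg.cholesky(M).T               # M = R^T R
    R_inv = np.linalg.inv(R)
    lam, B = np.linalg.eigh(R_inv.T @ A @ R_inv)
    C = R_inv @ B                             # x = C y, with C^T M C = I
    omega = np.sqrt(lam)                      # frequencies omega_j
    # Since C^T M C = I, the inverse of C is C^T M.
    y0 = C.T @ M @ x0
    w0 = C.T @ M @ v0
    # Each y_j solves y_j'' = -lambda_j y_j independently of the others.
    y = y0 * np.cos(omega * t) + (w0 / omega) * np.sin(omega * t)
    return C @ y

# Hypothetical data for illustration.
A = np.array([[3.0, -1.0], [-1.0, 2.0]])
M = np.array([[2.0, 0.5], [0.5, 1.0]])
x0 = np.array([1.0, 0.0])
v0 = np.array([0.0, 0.3])

# Check M x'' = -A x at t = 0.7 with a central second difference.
t, h = 0.7, 1e-4
x_minus, x_mid, x_plus = (modal_solution(A, M, x0, v0, s)
                          for s in (t - h, t, t + h))
acc = (x_plus - 2 * x_mid + x_minus) / h**2
residual = M @ acc + A @ x_mid
```

The vector `residual` is zero up to the finite-difference error, which confirms that the decoupled solutions, mapped back by $x = Cy$, solve the original coupled system.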
Example. Double pendulum.
The configuration is shown in the figure. Let us choose the angles between
the two rods and the vertical direction as generalized coordinates q1 , q2 . An-
gles are measured from the downward vertical direction, counterclockwise, as
shown in the picture.
[Figure: double pendulum with rods of lengths $\ell_1$, $\ell_2$, masses $m_1$, $m_2$, and angles $q_1$, $q_2$.]
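For reference, the linearization described next starts from the full kinetic and potential energy. The expressions below are the standard ones for this configuration under the assumption, not stated in the text, that the masses are points on massless rigid rods and that $g$ is the gravitational acceleration:

```latex
\begin{align*}
T &= \frac{m_1 + m_2}{2}\,\ell_1^2 \dot{q}_1^2
   + \frac{m_2}{2}\,\ell_2^2 \dot{q}_2^2
   + m_2 \ell_1 \ell_2\, \dot{q}_1 \dot{q}_2 \cos(q_1 - q_2), \\
U &= -(m_1 + m_2)\, g \ell_1 \cos q_1 - m_2\, g \ell_2 \cos q_2 .
\end{align*}
```

The kinetic energy contains one cosine and the potential energy contains two, and $L = T - U$.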
The equations of motion are non-linear and difficult to solve, so we linearize
them near the equilibrium $(q_1, q_2) = (0, 0)$. (There are four equilibria in
our system.) Linearization in this case means that we replace the cosine in the
kinetic energy by 1 and the cosines in the potential energy according to the
formula $\cos x \approx 1 - x^2/2$, because we want to keep only second degree
terms in the Lagrangian. The constant term in the potential energy can be
omitted.
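Thus the Lagrangian of the linearized system is
$$L^* = \frac{m_1 + m_2}{2} \ell_1^2 \dot{q}_1^2 + \frac{m_2}{2} \ell_2^2 \dot{q}_2^2 + m_2 \ell_1 \ell_2 \dot{q}_1 \dot{q}_2 - \frac{m_1 + m_2}{2} g \ell_1 q_1^2 - \frac{m_2}{2} g \ell_2 q_2^2,$$
and the linearized equation of motion is
$$\begin{pmatrix} (m_1 + m_2)\ell_1^2 & m_2 \ell_1 \ell_2 \\ m_2 \ell_1 \ell_2 & m_2 \ell_2^2 \end{pmatrix} \begin{pmatrix} \ddot{q}_1 \\ \ddot{q}_2 \end{pmatrix} = -\begin{pmatrix} (m_1 + m_2) g \ell_1 & 0 \\ 0 & m_2 g \ell_2 \end{pmatrix} \begin{pmatrix} q_1 \\ q_2 \end{pmatrix},$$
which we write as
$$M \ddot{q} = -Aq.$$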
It is easy to check directly that both M and A are positive definite. Notice
that M is not diagonal in this example.
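The frequencies of small oscillations can be computed from $M$ and $A$ by the method of Theorem 1. This is a minimal sketch, assuming NumPy; the parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical parameter values (SI units), chosen only for illustration.
m1, m2 = 1.0, 1.0
l1, l2 = 1.0, 1.0
g = 9.81

M = np.array([[(m1 + m2) * l1**2, m2 * l1 * l2],
              [m2 * l1 * l2,      m2 * l2**2]])
A = np.array([[(m1 + m2) * g * l1, 0.0],
              [0.0,                m2 * g * l2]])

# A symmetric matrix is positive definite iff all its eigenvalues are positive.
M_is_pd = bool(np.all(np.linalg.eigvalsh(M) > 0))
A_is_pd = bool(np.all(np.linalg.eigvalsh(A) > 0))

# Generalized eigenproblem A u = lambda M u, solved via M = R^T R.
R = np.linalg.cholesky(M).T
R_inv = np.linalg.inv(R)
lam, B = np.linalg.eigh(R_inv.T @ A @ R_inv)
C = R_inv @ B            # columns are the generalized eigenvectors (normal modes)
omega = np.sqrt(lam)     # frequencies of small oscillations, in rad/s

print("omega:", omega)
print("modes (columns):")
print(C)
```

For equal masses and equal lengths $\ell$, solving $\det(A - \lambda M) = 0$ by hand gives $\lambda = (g/\ell)(2 \pm \sqrt{2})$, which the computed `lam` reproduces.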