
Matrix Operations

• transpose, sum & difference, scalar multiplication

• matrix multiplication, matrix-vector product

• matrix inverse

Matrix transpose

transpose of m × n matrix A, denoted A^T or A′, is the n × m matrix with

(A^T)_{ij} = A_{ji}

rows and columns of A are transposed in AT


example: \begin{bmatrix} 0 & 4 \\ 7 & 0 \\ 3 & 1 \end{bmatrix}^T = \begin{bmatrix} 0 & 7 & 3 \\ 4 & 0 & 1 \end{bmatrix}.

• transpose converts row vectors to column vectors, and vice versa



• (A^T)^T = A

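To make this concrete, here is a minimal sketch in Python/NumPy (NumPy is an assumed dependency, not part of the original slides; the array is the example above):

import numpy as np

# the 3 x 2 example matrix from this slide
A = np.array([[0, 4],
              [7, 0],
              [3, 1]])

print(A.T)                       # [[0 7 3] [4 0 1]] -- rows and columns swapped
print(np.array_equal(A.T.T, A))  # True: (A^T)^T = A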


Matrix addition & subtraction

if A and B are both m × n, we form A + B by adding corresponding entries


     
example: \begin{bmatrix} 0 & 4 \\ 7 & 0 \\ 3 & 1 \end{bmatrix} + \begin{bmatrix} 1 & 2 \\ 2 & 3 \\ 0 & 4 \end{bmatrix} = \begin{bmatrix} 1 & 6 \\ 9 & 3 \\ 3 & 5 \end{bmatrix}

can add row or column vectors same way (but never to each other!)
   
matrix subtraction is similar: \begin{bmatrix} 1 & 6 \\ 9 & 3 \end{bmatrix} − I = \begin{bmatrix} 0 & 6 \\ 9 & 2 \end{bmatrix}

(here we had to figure out that I must be 2 × 2)



Properties of matrix addition

• commutative: A + B = B + A

• associative: (A + B) + C = A + (B + C), so we can write as A + B + C

• A + 0 = 0 + A = A; A − A = 0

• (A + B)^T = A^T + B^T



Scalar multiplication
we can multiply a number (a.k.a. scalar) by a matrix by multiplying every
entry of the matrix by the scalar
this is denoted by juxtaposition or ·, with the scalar on the left:
   
(−2) \begin{bmatrix} 1 & 6 \\ 9 & 3 \\ 6 & 0 \end{bmatrix} = \begin{bmatrix} −2 & −12 \\ −18 & −6 \\ −12 & 0 \end{bmatrix}

(sometimes you see scalar multiplication with the scalar on the right)

• (α + β)A = αA + βA; (αβ)A = α(βA)


• α(A + B) = αA + αB
• 0 · A = 0; 1 · A = A



Matrix multiplication

if A is m × p and B is p × n we can form C = AB, which is m × n


C_{ij} = \sum_{k=1}^{p} a_{ik} b_{kj} = a_{i1}b_{1j} + \cdots + a_{ip}b_{pj}, \quad i = 1, \dots, m, \; j = 1, \dots, n

to form AB, #cols of A must equal #rows of B; called compatible

• to find i, j entry of the product C = AB, you need the ith row of A
and the jth column of B
• form product of corresponding entries, e.g., third component of ith row
of A and third component of jth column of B
• add up all the products



Examples
    
example 1: \begin{bmatrix} 1 & 6 \\ 9 & 3 \end{bmatrix} \begin{bmatrix} 0 & −1 \\ −1 & 2 \end{bmatrix} = \begin{bmatrix} −6 & 11 \\ −3 & −3 \end{bmatrix}

for example, to get the 1, 1 entry of the product:

C_{11} = A_{11}B_{11} + A_{12}B_{21} = (1)(0) + (6)(−1) = −6

    
example 2: \begin{bmatrix} 0 & −1 \\ −1 & 2 \end{bmatrix} \begin{bmatrix} 1 & 6 \\ 9 & 3 \end{bmatrix} = \begin{bmatrix} −9 & −3 \\ 17 & 0 \end{bmatrix}

these examples illustrate that matrix multiplication is not (in general) commutative: we don’t (always) have AB = BA

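The definition of C = AB can be spelled out as a triple loop; below is a sketch (NumPy's @ operator is the practical way to do this, and the arrays are examples 1 and 2 above):

import numpy as np

def matmul(A, B):
    m, p = A.shape
    p2, n = B.shape
    assert p == p2, "compatible: #cols of A must equal #rows of B"
    C = np.zeros((m, n))
    for i in range(m):
        for j in range(n):
            # C_ij = a_i1 b_1j + ... + a_ip b_pj
            C[i, j] = sum(A[i, k] * B[k, j] for k in range(p))
    return C

A = np.array([[1, 6], [9, 3]])
B = np.array([[0, -1], [-1, 2]])
print(matmul(A, B))   # [[-6. 11.] [-3. -3.]], matching example 1
print(B @ A)          # [[-9 -3] [17  0]], matching example 2: AB != BA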


Properties of matrix multiplication

• 0A = 0, A0 = 0 (here 0 can be scalar, or a compatible matrix)

• IA = A, AI = A

• (AB)C = A(BC), so we can write as ABC

• α(AB) = (αA)B, where α is a scalar

• A(B + C) = AB + AC, (A + B)C = AC + BC

• (AB)^T = B^T A^T



Matrix-vector product
very important special case of matrix multiplication: y = Ax

• A is an m × n matrix
• x is an n-vector
• y is an m-vector

y_i = A_{i1}x_1 + \cdots + A_{in}x_n, \quad i = 1, \dots, m

can think of y = Ax as

• a function that transforms n-vectors into m-vectors


• a set of m linear equations relating x to y

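A sketch of y = Ax viewed as a function transforming n-vectors into m-vectors (Python/NumPy assumed; the matrix values are illustrative):

import numpy as np

A = np.array([[1, 6],
              [9, 3],
              [3, 1]])          # m = 3, n = 2

def f(x):
    # y_i = A_i1 x_1 + ... + A_in x_n, i = 1, ..., m
    return A @ x

print(f(np.array([1, -1])))     # [-5  6  2]: a 2-vector in, a 3-vector out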


Inner product

if v is a row n-vector and w is a column n-vector, then vw makes sense, and has size 1 × 1, i.e., is a scalar:

vw = v_1w_1 + \cdots + v_nw_n

if x and y are n-vectors, x^T y is a scalar called the inner product or dot product of x and y, and denoted ⟨x, y⟩ or x · y:

⟨x, y⟩ = x^T y = x_1y_1 + \cdots + x_ny_n

(the symbol · can be ambiguous — it can mean dot product, or ordinary matrix product)



Matrix powers

if matrix A is square, then the product AA makes sense, and is denoted A^2

more generally, k copies of A multiplied together gives A^k:

A^k = \underbrace{A A \cdots A}_{k}

by convention we set A^0 = I

(non-integer powers like A^{1/2} are tricky — that’s an advanced topic)

we have A^k A^l = A^{k+l}

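A quick check of these conventions (a sketch; np.linalg.matrix_power implements A^k for integer k, including A^0 = I, and accepts negative k for invertible A, matching A^{−k} = (A^{−1})^k on the next slide):

import numpy as np

A = np.array([[1, 1], [0, 2]])
print(np.linalg.matrix_power(A, 3))   # A A A
print(np.linalg.matrix_power(A, 0))   # the identity, by convention
lhs = np.linalg.matrix_power(A, 2) @ np.linalg.matrix_power(A, 3)
print(np.array_equal(lhs, np.linalg.matrix_power(A, 5)))   # True: A^k A^l = A^(k+l)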


Matrix inverse

if A is square, and (square) matrix F satisfies FA = I, then

• F is called the inverse of A, and is denoted A^{−1}

• the matrix A is called invertible or nonsingular

if A doesn’t have an inverse, it’s called singular or noninvertible

by definition, A^{−1}A = I; a basic result of linear algebra is that AA^{−1} = I

we define negative powers of A via A^{−k} = (A^{−1})^k



Examples
example 1: \begin{bmatrix} 1 & −1 \\ 1 & 2 \end{bmatrix}^{−1} = \frac{1}{3} \begin{bmatrix} 2 & 1 \\ −1 & 1 \end{bmatrix} (you should check this!)

example 2: \begin{bmatrix} 1 & −1 \\ −2 & 2 \end{bmatrix} does not have an inverse; let’s see why:

\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} 1 & −1 \\ −2 & 2 \end{bmatrix} = \begin{bmatrix} a − 2b & −a + 2b \\ c − 2d & −c + 2d \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}

. . . but you can’t have a − 2b = 1 and −a + 2b = 0



Properties of inverse


• (A^{−1})^{−1} = A, i.e., the inverse of the inverse is the original matrix (assuming A is invertible)

• (AB)^{−1} = B^{−1}A^{−1} (assuming A, B are invertible)

• (A^T)^{−1} = (A^{−1})^T (assuming A is invertible)

• I^{−1} = I

• (αA)^{−1} = (1/α)A^{−1} (assuming A invertible, α ≠ 0)

• if y = Ax, where x ∈ R^n and A is invertible, then x = A^{−1}y:

A^{−1}y = A^{−1}Ax = Ix = x



Inverse of 2 × 2 matrix

it’s useful to know the general formula for the inverse of a 2 × 2 matrix:
\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{−1} = \frac{1}{ad − bc} \begin{bmatrix} d & −b \\ −c & a \end{bmatrix}

provided ad − bc ≠ 0 (if ad − bc = 0, the matrix is singular)

there are similar, but much more complicated, formulas for the inverse of
larger square matrices, but the formulas are rarely used
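A sketch of the 2 × 2 formula, checked against NumPy's general inverse (the matrix is example 1 from the slide above):

import numpy as np

def inv2x2(M):
    a, b = M[0]
    c, d = M[1]
    det = a * d - b * c
    assert det != 0, "ad - bc = 0: matrix is singular"
    return np.array([[d, -b], [-c, a]]) / det

A = np.array([[1, -1], [1, 2]])
print(inv2x2(A))                                  # (1/3) [[2 1] [-1 1]]
print(np.allclose(inv2x2(A), np.linalg.inv(A)))   # True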



Eigenvalues and Eigenvectors

Definition 11.1. Let A be a square matrix (or linear transformation). A number λ is called an eigenvalue of A if there exists a non-zero vector ~u such that

A~u = λ~u. (1)

In the above definition, the vector ~u is called an eigenvector associated with this eigenvalue λ. The set of all eigenvectors associated with λ forms a subspace, called the eigenspace associated with λ. Geometrically, if we view any n × n matrix A as a linear transformation T, then the fact that ~u is an eigenvector associated with an eigenvalue λ means that ~u is an invariant direction under T. In other words, the linear transformation T does not change the direction of ~u: ~u and T~u have either the same direction (λ > 0) or opposite directions (λ < 0). The eigenvalue is the factor of contraction (|λ| < 1) or extension (|λ| > 1).

Remarks. (1) ~u ≠ ~0 is crucial, since ~u = ~0 always satisfies Equ (1). (2) If ~u is an eigenvector for λ, then so is c~u for any nonzero constant c. (3) Geometrically, in 3D, eigenvectors of A are directions that are unchanged under the linear transformation A.

We observe from Equ (1) that λ is an eigenvalue iff Equ (1) has a non-trivial solution. Since Equ (1) can be written as

(A − λI)~u = A~u − λ~u = ~0, (2)

it follows that λ is an eigenvalue iff Equ (2) has a non-trivial solution. By the inverse matrix theorem, Equ (2) has a non-trivial solution iff

det(A − λI) = 0. (3)

We conclude that λ is an eigenvalue iff Equ (3) holds. We call Equ (3) the "Characteristic Equation" of A. The eigenspace, the subspace of all eigenvectors associated with λ, is

eigenspace = Null(A − λI).

• Finding eigenvalues and all independent eigenvectors:

Step 1. Solve the Characteristic Equ (3) for λ.

Step 2. For each λ, find a basis for the eigenspace Null(A − λI) (i.e., the solution set of Equ (2)).
Example 11.1. Find all eigenvalues and their eigenspaces for

A = \begin{bmatrix} 3 & −2 \\ 1 & 0 \end{bmatrix}.

Solution:

A − λI = \begin{bmatrix} 3 & −2 \\ 1 & 0 \end{bmatrix} − λ \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 3 & −2 \\ 1 & 0 \end{bmatrix} − \begin{bmatrix} λ & 0 \\ 0 & λ \end{bmatrix} = \begin{bmatrix} 3 − λ & −2 \\ 1 & −λ \end{bmatrix}.
The characteristic equation is

det(A − λI) = (3 − λ)(−λ) − (−2) = 0,

λ^2 − 3λ + 2 = 0,

(λ − 1)(λ − 2) = 0.

We find the eigenvalues

λ_1 = 1, λ_2 = 2.
We next find the eigenvectors associated with each eigenvalue. For λ_1 = 1,

~0 = (A − λ_1 I)~x = \begin{bmatrix} 3−1 & −2 \\ 1 & −1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 2 & −2 \\ 1 & −1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix},

or

x_1 = x_2.

The parametric vector form of the solution set for (A − λ_1 I)~x = ~0 is

~x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} x_2 \\ x_2 \end{bmatrix} = x_2 \begin{bmatrix} 1 \\ 1 \end{bmatrix},

so a basis of Null(A − I) is \begin{bmatrix} 1 \\ 1 \end{bmatrix}.

This is the only (linearly independent) eigenvector for λ_1 = 1.

The last step can be done slightly differently as follows. From the solutions of (A − λ_1 I)~x = ~0,

x_1 = x_2,

we know there is only one free variable, x_2. Therefore, there is only one vector in any basis. To find it, we take x_2 to be any nonzero number, for instance x_2 = 1, and compute x_1 = x_2 = 1. We obtain

λ_1 = 1, ~u_1 = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}.
For λ_2 = 2, we find

~0 = (A − λ_2 I)~x = \begin{bmatrix} 3−2 & −2 \\ 1 & −2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 & −2 \\ 1 & −2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix},

or

x_1 = 2x_2.

To find a basis, we take x_2 = 1. Then x_1 = 2, and we obtain the pair of eigenvalue and eigenvector

λ_2 = 2, ~u_2 = \begin{bmatrix} 2 \\ 1 \end{bmatrix}.
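A numerical cross-check of Example 11.1 (a sketch; np.linalg.eig normalizes eigenvectors to unit length, so its columns are scalar multiples of the hand-computed ~u_1 and ~u_2):

import numpy as np

A = np.array([[3, -2], [1, 0]])
vals, vecs = np.linalg.eig(A)
print(vals)                               # [2. 1.] (order may differ)
for lam, u in zip(vals, vecs.T):          # columns of `vecs` are eigenvectors
    print(np.allclose(A @ u, lam * u))    # True for each eigenpair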
Example 11.2. Given that 2 is an eigenvalue of

A = \begin{bmatrix} 4 & −1 & 6 \\ 2 & 1 & 6 \\ 2 & −1 & 8 \end{bmatrix},

find a basis of its eigenspace.

Solution:

A − 2I = \begin{bmatrix} 4−2 & −1 & 6 \\ 2 & 1−2 & 6 \\ 2 & −1 & 8−2 \end{bmatrix} = \begin{bmatrix} 2 & −1 & 6 \\ 2 & −1 & 6 \\ 2 & −1 & 6 \end{bmatrix} → \begin{bmatrix} 2 & −1 & 6 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.

Therefore, (A − 2I)~x = ~0 becomes

2x_1 − x_2 + 6x_3 = 0, or x_2 = 2x_1 + 6x_3, (4)

where we select x_1 and x_3 as free variables only to avoid fractions. The solution set in parametric form is

~x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} x_1 \\ 2x_1 + 6x_3 \\ x_3 \end{bmatrix} = x_1 \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix} + x_3 \begin{bmatrix} 0 \\ 6 \\ 1 \end{bmatrix}.

A basis for the eigenspace:

~u_1 = \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix} and ~u_2 = \begin{bmatrix} 0 \\ 6 \\ 1 \end{bmatrix}.

Another way of finding a basis for Null(A − λI) = Null(A − 2I) may be a little easier. From Equ (4), we know that x_1 and x_3 are free variables. Choosing (x_1, x_3) = (1, 0) and (0, 1), respectively, we find

x_1 = 1, x_3 = 0 ⟹ x_2 = 2 ⟹ ~u_1 = \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix}

x_1 = 0, x_3 = 1 ⟹ x_2 = 6 ⟹ ~u_2 = \begin{bmatrix} 0 \\ 6 \\ 1 \end{bmatrix}.
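The same eigenspace basis can also be obtained with SymPy's exact null space routine (a sketch; SymPy may scale the basis vectors differently from the hand computation):

import sympy as sp

A = sp.Matrix([[4, -1, 6],
               [2,  1, 6],
               [2, -1, 8]])
basis = (A - 2 * sp.eye(3)).nullspace()   # basis of Null(A - 2I)
print(basis)                              # two vectors: the eigenspace is 2-dimensional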
Example 11.3. Find the eigenvalues of: (a)

A = \begin{bmatrix} 3 & −1 & 6 \\ 0 & 0 & 6 \\ 0 & 0 & 2 \end{bmatrix}, A − λI = \begin{bmatrix} 3−λ & −1 & 6 \\ 0 & −λ & 6 \\ 0 & 0 & 2−λ \end{bmatrix},

det(A − λI) = (3 − λ)(−λ)(2 − λ) = 0.

The eigenvalues are 3, 0, 2, exactly the diagonal elements. (b)

B = \begin{bmatrix} 4 & 0 & 0 \\ 2 & 1 & 0 \\ 1 & 0 & 4 \end{bmatrix}, B − λI = \begin{bmatrix} 4−λ & 0 & 0 \\ 2 & 1−λ & 0 \\ 1 & 0 & 4−λ \end{bmatrix},

det(B − λI) = (4 − λ)^2 (1 − λ) = 0.

The eigenvalues are 4, 1, 4 (4 is a double root), exactly the diagonal elements.
Theorem 11.1. (1) The eigenvalues of a triangular matrix are its diagonal elements.

(2) Eigenvectors for different eigenvalues are linearly independent. More precisely, suppose that λ_1, λ_2, ..., λ_p are p different eigenvalues of a matrix A. Then the set consisting of

a basis of Null(A − λ_1 I), a basis of Null(A − λ_2 I), ..., a basis of Null(A − λ_p I)

is linearly independent.
Proof. (2) For simplicity, we assume p = 2: λ_1 ≠ λ_2 are two different eigenvalues. Suppose that ~u_1 is an eigenvector of λ_1 and ~u_2 is an eigenvector of λ_2. To show independence, we need to show that the only solution to

x_1 ~u_1 + x_2 ~u_2 = ~0

is x_1 = x_2 = 0. Indeed, if x_1 ≠ 0, then

~u_1 = −(x_2 / x_1) ~u_2. (5)

(Note that x_2 ≠ 0 as well, for otherwise Equ (5) would force ~u_1 = ~0, contradicting that eigenvectors are nonzero.) We now apply A to the above equation. It leads to

A~u_1 = −(x_2 / x_1) A~u_2 ⟹ λ_1 ~u_1 = −(x_2 / x_1) λ_2 ~u_2. (6)

Equ (5) and Equ (6) contradict each other: multiplying Equ (5) by λ_1,

Equ (5) ⟹ λ_1 ~u_1 = −(x_2 / x_1) λ_1 ~u_2

Equ (6) ⟹ λ_1 ~u_1 = −(x_2 / x_1) λ_2 ~u_2,

so

(x_2 / x_1) λ_1 ~u_2 = (x_2 / x_1) λ_2 ~u_2.

Therefore λ_1 = λ_2, a contradiction to the assumption that they are different eigenvalues.

• Characteristic Polynomials
We know that the key to finding eigenvalues and eigenvectors is to solve the Characteristic Equation (3)

det(A − λI) = 0.

For a 2 × 2 matrix,

A − λI = \begin{bmatrix} a−λ & b \\ c & d−λ \end{bmatrix}.

So

det(A − λI) = (a − λ)(d − λ) − bc = λ^2 + (−a − d)λ + (ad − bc)

is a quadratic function (i.e., a polynomial of degree 2). In general, for any n × n matrix A,

A − λI = \begin{bmatrix} a_{11}−λ & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22}−λ & \cdots & a_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ a_{n1} & a_{n2} & \cdots & a_{nn}−λ \end{bmatrix}.

We may expand the determinant along the first row to get

det(A − λI) = (a_{11} − λ) det \begin{bmatrix} a_{22}−λ & \cdots & a_{2n} \\ \cdots & \cdots & \cdots \\ a_{n2} & \cdots & a_{nn}−λ \end{bmatrix} + ...

By induction, we see that det(A − λI) is a polynomial of degree n. We call this polynomial the characteristic polynomial of A. Consequently, there are a total of n (the number of rows in the matrix A) eigenvalues (real or complex, after taking account of multiplicity). Finding roots of higher-order polynomials may be very challenging.
Example 11.4. Find all eigenvalues of

A = \begin{bmatrix} 5 & −2 & 6 & −1 \\ 0 & 3 & −8 & 0 \\ 0 & 0 & 5 & 4 \\ 0 & 0 & 1 & 1 \end{bmatrix}.

Solution:

A − λI = \begin{bmatrix} 5−λ & −2 & 6 & −1 \\ 0 & 3−λ & −8 & 0 \\ 0 & 0 & 5−λ & 4 \\ 0 & 0 & 1 & 1−λ \end{bmatrix},

det(A − λI) = (5 − λ) det \begin{bmatrix} 3−λ & −8 & 0 \\ 0 & 5−λ & 4 \\ 0 & 1 & 1−λ \end{bmatrix} = (5 − λ)(3 − λ) det \begin{bmatrix} 5−λ & 4 \\ 1 & 1−λ \end{bmatrix} = (5 − λ)(3 − λ)[(5 − λ)(1 − λ) − 4] = 0.

There are 4 roots:

(5 − λ) = 0 ⟹ λ = 5

(3 − λ) = 0 ⟹ λ = 3

(5 − λ)(1 − λ) − 4 = 0 ⟹ λ^2 − 6λ + 1 = 0 ⟹ λ = (6 ± √(36 − 4))/2 = 3 ± 2√2.
We know that we can compute determinants using elementary row operations. One may ask: Can we use elementary row operations to find eigenvalues? More specifically, we have

Question: Suppose that B is obtained from A by elementary row operations. Do A and B have the same eigenvalues? (Ans: No)

Example 11.5.

A = \begin{bmatrix} 1 & 1 \\ 0 & 2 \end{bmatrix} \xrightarrow{R_2 + R_1 → R_2} \begin{bmatrix} 1 & 1 \\ 1 & 3 \end{bmatrix} = B

A has eigenvalues 1 and 2. For B, the characteristic equation is

det(B − λI) = det \begin{bmatrix} 1−λ & 1 \\ 1 & 3−λ \end{bmatrix} = (1 − λ)(3 − λ) − 1 = λ^2 − 4λ + 2.

The eigenvalues are

λ = (4 ± √(16 − 8))/2 = (4 ± √8)/2 = 2 ± √2.

This example shows that row operations may completely change eigenvalues.
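A numerical confirmation of Example 11.5 (a sketch in Python/NumPy, not part of the original notes):

import numpy as np

A = np.array([[1, 1], [0, 2]])
B = np.array([[1, 1], [1, 3]])    # A after the row operation R2 + R1 -> R2
print(np.linalg.eigvals(A))       # [1. 2.]
print(np.linalg.eigvals(B))       # approx [0.5858 3.4142], i.e. 2 -/+ sqrt(2)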
Definition 11.2. Two n × n matrices A and B are called similar, denoted A ∼ B, if there exists an invertible matrix P such that A = PBP^{−1}.

Theorem 11.2. If A and B are similar, then they have exactly the same characteristic polynomial and consequently the same eigenvalues.

Indeed, if A = PBP^{−1}, then P(B − λI)P^{−1} = PBP^{−1} − λPIP^{−1} = A − λI. Therefore,

det(A − λI) = det(P(B − λI)P^{−1}) = det(P) det(B − λI) det(P^{−1}) = det(B − λI).

Example 11.6. Find the eigenvalues of A if

A ∼ B = \begin{bmatrix} 5 & −2 & 6 & −1 \\ 0 & 3 & −8 & 0 \\ 0 & 0 & 5 & 4 \\ 0 & 0 & 0 & 4 \end{bmatrix}.

Solution: Since B is triangular, its eigenvalues are λ = 5, 3, 5, 4. These are also the eigenvalues of A.
Caution: If A ∼ B, and if λ_0 is an eigenvalue for A and B, then a corresponding eigenvector for A may not be an eigenvector for B. In other words, two similar matrices A and B have the same eigenvalues but, in general, different eigenvectors.
Example 11.7. Though row operations alone will not preserve eigenvalues, a matching pair of row and column operations does maintain similarity. We first observe that if P is a type 1 elementary matrix (row replacement), obtained from the identity by the row operation aR_1 + R_2 → R_2,

P = \begin{bmatrix} 1 & 0 \\ a & 1 \end{bmatrix},

then its inverse P^{−1} is a type 1 (column) elementary matrix obtained from the identity matrix by an elementary column operation of the same kind with "opposite sign" to the previous row operation, namely C_1 − aC_2 → C_1:

P^{−1} = \begin{bmatrix} 1 & 0 \\ −a & 1 \end{bmatrix}.

We call the column operation C_1 − aC_2 → C_1 the "inverse" of the row operation aR_1 + R_2 → R_2.

Now we perform a row operation on A, followed immediately by the column operation inverse to that row operation (here a = 1):

A = \begin{bmatrix} 1 & 1 \\ 0 & 2 \end{bmatrix} \xrightarrow{R_1 + R_2 → R_2} \begin{bmatrix} 1 & 1 \\ 1 & 3 \end{bmatrix} (left multiply by P)

\xrightarrow{C_1 − C_2 → C_1} \begin{bmatrix} 0 & 1 \\ −2 & 3 \end{bmatrix} = B (right multiply by P^{−1}).

We can verify that A and B are similar through P (with a = 1):

P A P^{−1} = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ −1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ −1 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ −2 & 3 \end{bmatrix}.
Now, λ_1 = 1 is an eigenvalue. Then

(A − 1·I)~u = \begin{bmatrix} 1−1 & 1 \\ 0 & 2−1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}

⟹ ~u = \begin{bmatrix} 1 \\ 0 \end{bmatrix} is an eigenvector for A.
But

(B − 1·I)~u = \begin{bmatrix} 0−1 & 1 \\ −2 & 3−1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} −1 & 1 \\ −2 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} −1 \\ −2 \end{bmatrix}

⟹ ~u = \begin{bmatrix} 1 \\ 0 \end{bmatrix} is NOT an eigenvector for B.

In fact,

(B − 1·I)~v = \begin{bmatrix} −1 & 1 \\ −2 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.

So ~v = \begin{bmatrix} 1 \\ 1 \end{bmatrix} = P~u is an eigenvector for B.
This example shows that

1. Row operations alone will not preserve eigenvalues.

2. Two similar matrices share the same characteristic polynomial and the same eigenvalues, but they may have different eigenvectors.
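The computations in Example 11.7 can be checked directly (a sketch in Python/NumPy):

import numpy as np

A = np.array([[1, 1], [0, 2]])
P = np.array([[1, 0], [1, 1]])
B = P @ A @ np.linalg.inv(P)
print(B)                                            # [[ 0. 1.] [-2. 3.]]
print(np.linalg.eigvals(A), np.linalg.eigvals(B))   # same eigenvalues: 1 and 2
u = np.array([1, 0])                                # eigenvector of A for lambda = 1
print(A @ u)          # [1 0] = 1 * u: u IS an eigenvector of A
print(B @ u)          # [ 0. -2.]: u is NOT an eigenvector of B
print(B @ (P @ u))    # [1. 1.] = 1 * (P u): P u IS an eigenvector of B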

• Homework #11.

1. Find the eigenvalues if

(a) A ∼ \begin{bmatrix} −1 & 2 & 8 & −1 \\ 0 & 2 & 10 & 0 \\ 0 & 0 & −1 & 4 \\ 0 & 0 & 0 & 3 \end{bmatrix}

(b) B ∼ \begin{bmatrix} −1 & 2 & 8 & −1 \\ 1 & 2 & 10 & 0 \\ 0 & 0 & 1 & 4 \\ 0 & 0 & 0 & 2 \end{bmatrix}

2. Find the eigenvalues and a basis of each eigenspace.

(a) A = \begin{bmatrix} 4 & −2 \\ −3 & 9 \end{bmatrix}

(b) B = \begin{bmatrix} 7 & 4 & 6 \\ −3 & −1 & −8 \\ 0 & 0 & 1 \end{bmatrix}

3. Find a basis of the eigenspace associated with the eigenvalue λ = 1 for

A = \begin{bmatrix} 1 & 2 & 4 & −1 \\ 1 & 2 & −3 & 0 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 1 \end{bmatrix}

4. Determine true or false. Justify your answers.

(a) If A~x = λ~x, then λ is an eigenvalue of A.

(b) A is invertible iff 0 is not an eigenvalue.

(c) If A ∼ B, then A and B have the same eigenvalues and eigenspaces.

(d) If A and B have the same eigenvalues, then they have the same characteristic polynomial.

(e) If det A = det B, then A is similar to B.
CHAPTER 7

Eigenvalues
and
Eigenvectors

7.1 ELEMENTARY PROPERTIES OF EIGENSYSTEMS

Up to this point, almost everything was either motivated by or evolved from the
consideration of systems of linear algebraic equations. But we have come to a
turning point, and from now on the emphasis will be different. Rather than being
concerned with systems of algebraic equations, many topics will be motivated
or driven by applications involving systems of linear differential equations and
their discrete counterparts, difference equations.
For example, consider the problem of solving the system of two first-order
linear differential equations, du1 /dt = 7u1 − 4u2 and du2 /dt = 5u1 − 2u2 . In
matrix notation, this system is
\begin{bmatrix} u_1' \\ u_2' \end{bmatrix} = \begin{bmatrix} 7 & −4 \\ 5 & −2 \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} or, equivalently, u′ = Au, (7.1.1)

where u′ = (u_1', u_2')^T, A = \begin{bmatrix} 7 & −4 \\ 5 & −2 \end{bmatrix}, and u = (u_1, u_2)^T. Because solutions of a single
equation u′ = λu have the form u = αeλt , we are motivated to seek solutions
of (7.1.1) that also have the form

u_1 = α_1 e^{λt} and u_2 = α_2 e^{λt}. (7.1.2)

Differentiating these two expressions and substituting the results in (7.1.1) yields
    
α_1 λe^{λt} = 7α_1 e^{λt} − 4α_2 e^{λt}, α_2 λe^{λt} = 5α_1 e^{λt} − 2α_2 e^{λt} ⟹ α_1 λ = 7α_1 − 4α_2, α_2 λ = 5α_1 − 2α_2 ⟹ \begin{bmatrix} 7 & −4 \\ 5 & −2 \end{bmatrix} \begin{bmatrix} α_1 \\ α_2 \end{bmatrix} = λ \begin{bmatrix} α_1 \\ α_2 \end{bmatrix}.

In other words, solutions of (7.1.1) having the form (7.1.2) can be constructed provided solutions for λ and x = \begin{bmatrix} α_1 \\ α_2 \end{bmatrix} in the matrix equation Ax = λx can be found. Clearly, x = 0 trivially satisfies Ax = λx, but x = 0 provides no useful information concerning the solution of (7.1.1). What we really need are scalars λ and nonzero vectors x that satisfy Ax = λx. Writing Ax = λx as (A − λI)x = 0 shows that the vectors of interest are the nonzero vectors in N(A − λI). But N(A − λI) contains nonzero vectors if and only if A − λI is singular. Therefore, the scalars of interest are precisely the values of λ that make A − λI singular or, equivalently, the λ’s for which det(A − λI) = 0. These observations motivate the definition of eigenvalues and eigenvectors.⁶⁶

Eigenvalues and Eigenvectors


For an n × n matrix A, scalars λ and vectors x_{n×1} ≠ 0 satisfying Ax = λx are called eigenvalues and eigenvectors of A, respectively, and any such pair (λ, x) is called an eigenpair for A. The set of distinct eigenvalues, denoted by σ(A), is called the spectrum of A.

• λ ∈ σ(A) ⟺ A − λI is singular ⟺ det(A − λI) = 0. (7.1.3)

• {x ≠ 0 | x ∈ N(A − λI)} is the set of all eigenvectors associated with λ. From now on, N(A − λI) is called an eigenspace for A.

• Nonzero row vectors y* such that y*(A − λI) = 0 are called left-hand eigenvectors for A (see Exercise 7.1.18 on p. 503).

Geometrically, Ax = λx says that under transformation by A, eigenvectors experience only changes in magnitude or sign—the orientation of Ax in ℜ^n is the same as that of x. The eigenvalue λ is simply the amount of “stretch” or “shrink” to which the eigenvector x is subjected when transformed by A. Figure 7.1.1 depicts the situation in ℜ^2.

[Figure 7.1.1: Ax = λx]
⁶⁶ The words eigenvalue and eigenvector are derived from the German word eigen, which means owned by or peculiar to. Eigenvalues and eigenvectors are sometimes called characteristic values and characteristic vectors, proper values and proper vectors, or latent values and latent vectors.

Let’s now face the problem of finding the eigenvalues and eigenvectors of the matrix A = \begin{bmatrix} 7 & −4 \\ 5 & −2 \end{bmatrix} appearing in (7.1.1). As noted in (7.1.3), the eigenvalues are the scalars λ for which det(A − λI) = 0. Expansion of det(A − λI) produces the second-degree polynomial

p(λ) = det(A − λI) = det \begin{bmatrix} 7−λ & −4 \\ 5 & −2−λ \end{bmatrix} = λ^2 − 5λ + 6 = (λ − 2)(λ − 3),
which is called the characteristic polynomial for A. Consequently, the eigen-
values for A are the solutions of the characteristic equation p(λ) = 0 (i.e.,
the roots of the characteristic polynomial), and they are λ = 2 and λ = 3.
The eigenvectors associated with λ = 2 and λ = 3 are simply the nonzero
vectors in the eigenspaces N (A − 2I) and N (A − 3I), respectively. But deter-
mining these eigenspaces amounts to nothing more than solving the two homo-
geneous systems, (A − 2I) x = 0 and (A − 3I) x = 0.
For λ = 2,

A − 2I = \begin{bmatrix} 5 & −4 \\ 5 & −4 \end{bmatrix} → \begin{bmatrix} 1 & −4/5 \\ 0 & 0 \end{bmatrix} ⟹ x_1 = (4/5)x_2, x_2 is free

⟹ N(A − 2I) = { x | x = α \begin{bmatrix} 4/5 \\ 1 \end{bmatrix} }.

For λ = 3,

A − 3I = \begin{bmatrix} 4 & −4 \\ 5 & −5 \end{bmatrix} → \begin{bmatrix} 1 & −1 \\ 0 & 0 \end{bmatrix} ⟹ x_1 = x_2, x_2 is free

⟹ N(A − 3I) = { x | x = β \begin{bmatrix} 1 \\ 1 \end{bmatrix} }.
In other words, the eigenvectors of A associated with λ = 2 are all nonzero multiples of x = (4/5, 1)^T, and the eigenvectors associated with λ = 3 are all nonzero multiples of y = (1, 1)^T. Although there are an infinite number of eigenvectors associated with each eigenvalue, each eigenspace is one dimensional, so, for this example, there is only one independent eigenvector associated with each eigenvalue.
Let’s complete the discussion concerning the system of differential equations u′ = Au in (7.1.1). Coupling (7.1.2) with the eigenpairs (λ_1, x) and (λ_2, y) of A computed above produces two solutions of u′ = Au, namely,

u_1 = e^{λ_1 t} x = e^{2t} \begin{bmatrix} 4/5 \\ 1 \end{bmatrix} and u_2 = e^{λ_2 t} y = e^{3t} \begin{bmatrix} 1 \\ 1 \end{bmatrix}.

It turns out that all other solutions are linear combinations of these two particular solutions—more is said in §7.4 on p. 541.
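A numerical cross-check of these eigenpairs (a sketch in Python/NumPy, not part of the original text; eig returns unit-length eigenvectors, which are scalar multiples of (4/5, 1)^T and (1, 1)^T):

import numpy as np

A = np.array([[7.0, -4.0],
              [5.0, -2.0]])
vals, vecs = np.linalg.eig(A)
print(vals)                               # [3. 2.] (order may differ)
for lam, x in zip(vals, vecs.T):
    print(np.allclose(A @ x, lam * x))    # True for each eigenpair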
Below is a summary of some general statements concerning features of the
characteristic polynomial and the characteristic equation.

Characteristic Polynomial and Equation


• The characteristic polynomial of A_{n×n} is p(λ) = det(A − λI). The degree of p(λ) is n, and the leading term in p(λ) is (−1)^n λ^n.
• The characteristic equation for A is p(λ) = 0.
• The eigenvalues of A are the solutions of the characteristic equation
or, equivalently, the roots of the characteristic polynomial.
• Altogether, A has n eigenvalues, but some may be complex num-
bers (even if the entries of A are real numbers), and some eigenval-
ues may be repeated.
• If A contains only real numbers, then its complex eigenvalues must
occur in conjugate pairs—i.e., if λ ∈ σ(A), then \bar{λ} ∈ σ(A).

Proof. The fact that det(A − λI) is a polynomial of degree n whose leading term is (−1)^n λ^n follows from the definition of determinant given in (6.1.1). If

δ_{ij} = 1 if i = j, and δ_{ij} = 0 if i ≠ j,

then

det(A − λI) = \sum_{p} σ(p)(a_{1p_1} − δ_{1p_1}λ)(a_{2p_2} − δ_{2p_2}λ) \cdots (a_{np_n} − δ_{np_n}λ)

is a polynomial in λ. The highest power of λ is produced by the term

(a_{11} − λ)(a_{22} − λ) \cdots (a_{nn} − λ),

so the degree is n, and the leading term is (−1)^n λ^n. The discussion given earlier contained the proof that the eigenvalues are precisely the solutions of the characteristic equation, but, for the sake of completeness, it’s repeated below:

λ ∈ σ(A) ⟺ Ax = λx for some x ≠ 0 ⟺ (A − λI)x = 0 for some x ≠ 0 ⟺ A − λI is singular ⟺ det(A − λI) = 0.
The fundamental theorem of algebra is a deep result that insures every poly-
nomial of degree n with real or complex coefficients has n roots, but some
roots may be complex numbers (even if all the coefficients are real), and some
roots may be repeated. Consequently, A has n eigenvalues, but some may be
complex, and some may be repeated. The fact that complex eigenvalues of real
matrices must occur in conjugate pairs is a consequence of the fact that the roots
of a polynomial with real coefficients occur in conjugate pairs.

Example 7.1.1

Problem: Determine the eigenvalues and eigenvectors of A = \begin{bmatrix} 1 & −1 \\ 1 & 1 \end{bmatrix}.

Solution: The characteristic polynomial is

det(A − λI) = det \begin{bmatrix} 1−λ & −1 \\ 1 & 1−λ \end{bmatrix} = (1 − λ)^2 + 1 = λ^2 − 2λ + 2,

so the characteristic equation is λ^2 − 2λ + 2 = 0. Application of the quadratic formula yields

λ = (2 ± √−4)/2 = (2 ± 2√−1)/2 = 1 ± i,
so the spectrum of A is σ (A) = {1 + i, 1 − i}. Notice that the eigenvalues are
complex conjugates of each other—as they must be because complex eigenvalues
of real matrices must occur in conjugate pairs. Now find the eigenspaces.
For λ = 1 + i,

A − λI = \begin{bmatrix} −i & −1 \\ 1 & −i \end{bmatrix} → \begin{bmatrix} 1 & −i \\ 0 & 0 \end{bmatrix} ⟹ N(A − λI) = span{ \begin{bmatrix} i \\ 1 \end{bmatrix} }.

For λ = 1 − i,

A − λI = \begin{bmatrix} i & −1 \\ 1 & i \end{bmatrix} → \begin{bmatrix} 1 & i \\ 0 & 0 \end{bmatrix} ⟹ N(A − λI) = span{ \begin{bmatrix} −i \\ 1 \end{bmatrix} }.

In other words, the eigenvectors associated with λ_1 = 1 + i are all nonzero multiples of x_1 = (i, 1)^T, and the eigenvectors associated with λ_2 = 1 − i are all nonzero multiples of x_2 = (−i, 1)^T. In previous sections, you could
be successful by thinking only in terms of real numbers and by dancing around
those statements and issues involving complex numbers. But this example makes
it clear that avoiding complex numbers, even when dealing with real matrices,
is no longer possible—very innocent looking matrices, such as the one in this
example, can possess complex eigenvalues and eigenvectors.
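NumPy handles the complex case of this example transparently (a sketch, not part of the original text):

import numpy as np

A = np.array([[1.0, -1.0],
              [1.0,  1.0]])
vals, vecs = np.linalg.eig(A)
print(vals)    # [1.+1.j 1.-1.j] -- a conjugate pair, as the theory requires
print(np.allclose(A @ vecs[:, 0], vals[0] * vecs[:, 0]))   # True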

As we have seen, computing eigenvalues boils down to solving a polynomial equation. But determining solutions to polynomial equations can be a formidable task. It was proven in the nineteenth century that it’s impossible to express the roots of a general polynomial of degree five or higher using radicals of the coefficients. This means that there does not exist a generalized version of the quadratic formula for polynomials of degree greater than four, and general polynomial equations cannot be solved by a finite number of arithmetic operations involving +, −, ×, ÷, and nth roots. Unlike solving Ax = b, the eigenvalue problem generally requires an infinite algorithm, so all practical eigenvalue computations are accomplished by iterative methods—some are discussed later.

For theoretical work, and for textbook-type problems, it’s helpful to express the characteristic equation in terms of the principal minors. Recall that an r × r principal submatrix of A_{n×n} is a submatrix that lies on the same set of r rows and columns, and an r × r principal minor is the determinant of an r × r principal submatrix. In other words, r × r principal minors are obtained by deleting the same set of n − r rows and columns, and there are \binom{n}{r} = n!/(r!(n − r)!) such minors. For example, the 1 × 1 principal minors of

A = \begin{bmatrix} −3 & 1 & −3 \\ 20 & 3 & 10 \\ 2 & −2 & 4 \end{bmatrix} (7.1.4)

are the diagonal entries −3, 3, and 4. The 2 × 2 principal minors are

det \begin{bmatrix} −3 & 1 \\ 20 & 3 \end{bmatrix} = −29, det \begin{bmatrix} −3 & −3 \\ 2 & 4 \end{bmatrix} = −6, and det \begin{bmatrix} 3 & 10 \\ −2 & 4 \end{bmatrix} = 32,

and the only 3 × 3 principal minor is det(A) = −18.
Related to the principal minors are the symmetric functions of the eigenvalues. The kth symmetric function of λ_1, λ_2, ..., λ_n is defined to be the sum of the products of the eigenvalues taken k at a time. That is,

s_k = \sum_{1 ≤ i_1 < \cdots < i_k ≤ n} λ_{i_1} \cdots λ_{i_k}.

For example, when n = 4,

s_1 = λ_1 + λ_2 + λ_3 + λ_4,
s_2 = λ_1λ_2 + λ_1λ_3 + λ_1λ_4 + λ_2λ_3 + λ_2λ_4 + λ_3λ_4,
s_3 = λ_1λ_2λ_3 + λ_1λ_2λ_4 + λ_1λ_3λ_4 + λ_2λ_3λ_4,
s_4 = λ_1λ_2λ_3λ_4.
The connection between symmetric functions, principal minors, and the coeffi-
cients in the characteristic polynomial is given in the following theorem.

Coefficients in the Characteristic Equation


If λ^n + c_1λ^{n−1} + c_2λ^{n−2} + \cdots + c_{n−1}λ + c_n = 0 is the characteristic equation for A_{n×n}, and if s_k is the kth symmetric function of the eigenvalues λ_1, λ_2, ..., λ_n of A, then

• c_k = (−1)^k \sum (all k × k principal minors), (7.1.5)

• s_k = \sum (all k × k principal minors), (7.1.6)

• trace(A) = λ_1 + λ_2 + \cdots + λ_n = −c_1, (7.1.7)

• det(A) = λ_1λ_2 \cdots λ_n = (−1)^n c_n. (7.1.8)
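Relations (7.1.5)–(7.1.8) are easy to test numerically. The sketch below (Python/NumPy assumed) sums the k × k principal minors by brute force over row/column subsets and compares them with the coefficients returned by np.poly, which gives the monic characteristic polynomial det(λI − A) = λ^n + c_1 λ^{n−1} + ... + c_n:

import numpy as np
from itertools import combinations

A = np.array([[-3.0,  1.0, -3.0],
              [20.0,  3.0, 10.0],
              [ 2.0, -2.0,  4.0]])    # the matrix (7.1.4)
n = A.shape[0]

def principal_minor_sum(A, k):
    # sum of det over all k x k principal submatrices
    return sum(np.linalg.det(A[np.ix_(idx, idx)])
               for idx in combinations(range(n), k))

c = np.poly(A)    # [1, c1, c2, c3]
for k in range(1, n + 1):
    ck = (-1) ** k * principal_minor_sum(A, k)          # (7.1.5)
    print(k, np.isclose(ck, c[k]))                      # True for each k
print(np.isclose(np.trace(A), -c[1]))                   # (7.1.7)
print(np.isclose(np.linalg.det(A), (-1) ** n * c[n]))   # (7.1.8)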

Proof. At least two proofs of (7.1.5) are possible, and although they are conceptually straightforward, each is somewhat tedious. One approach is to successively use the result of Exercise 6.1.14 to expand det(A − λI). Another proof rests on the observation that if

p(λ) = det(A − λI) = (−1)^n λ^n + a_1λ^{n−1} + a_2λ^{n−2} + \cdots + a_{n−1}λ + a_n

is the characteristic polynomial for A, then the characteristic equation is

λ^n + c_1λ^{n−1} + c_2λ^{n−2} + \cdots + c_{n−1}λ + c_n = 0, where c_i = (−1)^n a_i.

Taking the rth derivative of p(λ) yields p^{(r)}(0) = r! a_{n−r}, and hence

c_{n−r} = ((−1)^n / r!) p^{(r)}(0). (7.1.9)

It’s now a matter of repeatedly applying the formula (6.1.19) for differentiating a determinant to p(λ) = det(A − λI). After r applications of (6.1.19),

p^{(r)}(λ) = \sum_{i_j ≠ i_k} D_{i_1 \cdots i_r}(λ),

where D_{i_1 \cdots i_r}(λ) is the determinant of the matrix identical to A − λI except that rows i_1, i_2, ..., i_r have been replaced by −e^T_{i_1}, −e^T_{i_2}, ..., −e^T_{i_r}, respectively. It follows that D_{i_1 \cdots i_r}(0) = (−1)^r det(A_{i_1 \cdots i_r}), where A_{i_1 i_2 \cdots i_r} is identical to A except that rows i_1, i_2, ..., i_r have been replaced by e^T_{i_1}, e^T_{i_2}, ..., e^T_{i_r}, respectively, and det(A_{i_1 \cdots i_r}) is the (n − r) × (n − r) principal minor obtained by deleting rows and columns i_1, i_2, ..., i_r from A. Consequently,

p^{(r)}(0) = \sum_{i_j ≠ i_k} D_{i_1 \cdots i_r}(0) = \sum_{i_j ≠ i_k} (−1)^r det(A_{i_1 \cdots i_r}) = r! × (−1)^r \sum (all (n − r) × (n − r) principal minors).

The factor r! appears because each of the r! permutations of the subscripts on A_{i_1 \cdots i_r} describes the same matrix. Therefore, (7.1.9) says

c_{n−r} = ((−1)^n / r!) p^{(r)}(0) = (−1)^{n−r} \sum (all (n − r) × (n − r) principal minors).

To prove (7.1.6), write the characteristic equation for A as

(λ − λ_1)(λ − λ_2) \cdots (λ − λ_n) = 0, (7.1.10)

and expand the left-hand side to produce

λ^n − s_1λ^{n−1} + \cdots + (−1)^k s_k λ^{n−k} + \cdots + (−1)^n s_n = 0. (7.1.11)

(Using n = 3 or n = 4 in (7.1.10) makes this clear.) Comparing (7.1.11) with (7.1.5) produces the desired conclusion. Statements (7.1.7) and (7.1.8) are obtained from (7.1.5) and (7.1.6) by setting k = 1 and k = n.

Example 7.1.2

Problem: Determine the eigenvalues and eigenvectors of

A = \begin{bmatrix} −3 & 1 & −3 \\ 20 & 3 & 10 \\ 2 & −2 & 4 \end{bmatrix}.

Solution: Use the principal minors computed in (7.1.4) along with (7.1.5) to obtain the characteristic equation

λ^3 − 4λ^2 − 3λ + 18 = 0.

A result from elementary algebra states that if the coefficients α_i in

λ^n + α_{n−1}λ^{n−1} + \cdots + α_1λ + α_0 = 0

are integers, then every integer solution is a factor of α_0. For our problem, this means that if there exist integer eigenvalues, then they must be contained in the set S = {±1, ±2, ±3, ±6, ±9, ±18}. Evaluating p(λ) for each λ ∈ S reveals that p(3) = 0 and p(−2) = 0, so λ = 3 and λ = −2 are eigenvalues for A. To determine the other eigenvalue, deflate the problem by dividing

(λ^3 − 4λ^2 − 3λ + 18)/(λ − 3) = λ^2 − λ − 6 = (λ − 3)(λ + 2).

Thus the characteristic equation can be written in factored form as

(λ − 3)^2 (λ + 2) = 0,

so the spectrum of A is σ(A) = {3, −2} in which λ = 3 is repeated—we say that the algebraic multiplicity of λ = 3 is two. The eigenspaces are obtained as follows.

For λ = 3,

A − 3I → \begin{bmatrix} 1 & 0 & 1/2 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} ⟹ N(A − 3I) = span{ \begin{bmatrix} −1 \\ 0 \\ 2 \end{bmatrix} }.

For λ = −2,

A + 2I → \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & −2 \\ 0 & 0 & 0 \end{bmatrix} ⟹ N(A + 2I) = span{ \begin{bmatrix} −1 \\ 2 \\ 1 \end{bmatrix} }.

Notice that although the algebraic multiplicity of λ = 3 is two, the dimension of the associated eigenspace is only one—we say that A is deficient in eigenvectors. As we will see later, deficient matrices pose significant difficulties.
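The deficiency can be seen numerically: λ = 3 appears twice among the eigenvalues, yet A − 3I has rank 2, so dim N(A − 3I) = 3 − 2 = 1. A sketch in Python/NumPy:

import numpy as np

A = np.array([[-3.0,  1.0, -3.0],
              [20.0,  3.0, 10.0],
              [ 2.0, -2.0,  4.0]])
print(np.round(np.linalg.eigvals(A), 6))          # 3 twice, -2 once
rank = np.linalg.matrix_rank(A - 3 * np.eye(3))
print(3 - rank)   # eigenspace dimension 1 < algebraic multiplicity 2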
