02.2 Matrix Multiplication


2.2 Matrix multiplication

We are going to build up the definition of matrix multiplication in several steps. The first is that if $r = (r_1, \dots, r_n)$ is a 1✕n row vector and $c = \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix}$ is an n✕1 column vector, we define

$$rc = r_1 c_1 + \cdots + r_n c_n.$$

This might remind you of the dot product if you have seen that before.

For example, if $r = \begin{pmatrix} 1 & 2 \end{pmatrix}$ and $c = \begin{pmatrix} 3 \\ 4 \end{pmatrix}$ then

$$rc = 1 \times 3 + 2 \times 4 = 11.$$
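The row-times-column product can be sketched in a few lines of Python. This is only an illustration: it assumes vectors are stored as plain lists, and the name `row_times_column` is ours rather than part of the notes.

```python
def row_times_column(r, c):
    """Product of a 1 x n row vector r and an n x 1 column vector c (plain lists)."""
    if len(r) != len(c):
        raise ValueError("row length must equal column height")
    # r_1 c_1 + ... + r_n c_n
    return sum(r_k * c_k for r_k, c_k in zip(r, c))


print(row_times_column([1, 2], [3, 4]))  # 1*3 + 2*4 = 11
```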

Next, suppose A is an m✕n matrix and c is again a height n column vector. Let $r_1, \dots, r_m$ be the rows of A; these are 1✕n row vectors. We then define $Ac$ to be the m✕1 column vector

$$\begin{pmatrix} r_1 c \\ \vdots \\ r_m c \end{pmatrix}.$$

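For example, if $A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}$ and $c = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}$ then

$$Ac = \begin{pmatrix} 1 \times 1 + 2 \times 0 + 3 \times (-1) \\ 4 \times 1 + 5 \times 0 + 6 \times (-1) \end{pmatrix} = \begin{pmatrix} -2 \\ -2 \end{pmatrix}.$$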
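The matrix-times-column definition can be checked with a short Python sketch. It assumes a matrix is stored as a list of rows and a column vector as a list of its entries; the name `mat_vec` is a hypothetical helper chosen for this illustration.

```python
def mat_vec(A, c):
    """Product Ac, where A is a list of rows and c lists the entries of a column vector."""
    if any(len(row) != len(c) for row in A):
        raise ValueError("each row of A must be as long as c is tall")
    # The i-th entry of Ac is (row i of A) times c.
    return [sum(a * x for a, x in zip(row, c)) for row in A]


A = [[1, 2, 3], [4, 5, 6]]
c = [1, 0, -1]
print(mat_vec(A, c))  # [-2, -2]
```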

Finally, the most general definition of matrix multiplication. Suppose A is m✕n and B is n✕p (so that the length of a row of A is the same as the height of a column of B). Write $c_j$ for the jth column of B. Then we define $AB$ to be the m✕p matrix whose jth column is $Ac_j$:

$$AB = \begin{pmatrix} Ac_1 & Ac_2 & \cdots & Ac_p \end{pmatrix}$$


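This time if $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$ and $B = \begin{pmatrix} 5 & 6 & 7 \\ 1 & 0 & -1 \end{pmatrix}$, the columns of B are $c_1 = \begin{pmatrix} 5 \\ 1 \end{pmatrix}$, $c_2 = \begin{pmatrix} 6 \\ 0 \end{pmatrix}$, and $c_3 = \begin{pmatrix} 7 \\ -1 \end{pmatrix}$ and

$$\begin{aligned}
AB &= \begin{pmatrix} Ac_1 & Ac_2 & Ac_3 \end{pmatrix} \\
&= \begin{pmatrix} 1 \times 5 + 2 \times 1 & 1 \times 6 + 2 \times 0 & 1 \times 7 + 2 \times (-1) \\ 3 \times 5 + 4 \times 1 & 3 \times 6 + 4 \times 0 & 3 \times 7 + 4 \times (-1) \end{pmatrix} \\
&= \begin{pmatrix} 7 & 6 & 5 \\ 19 & 18 & 17 \end{pmatrix}.
\end{aligned}$$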
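The columnwise definition translates fairly directly into code. The sketch below assumes matrices are lists of rows and that B has at least one column; `mat_vec` and `mat_mul_columnwise` are hypothetical names used only for this illustration.

```python
def mat_vec(A, c):
    """Product Ac of a matrix (list of rows) with a column vector (list of entries)."""
    return [sum(a * x for a, x in zip(row, c)) for row in A]


def mat_mul_columnwise(A, B):
    """Build AB one column at a time: the j-th column of AB is A times the j-th column of B."""
    if any(len(row) != len(B) for row in A):
        raise ValueError("columns of A must match rows of B")
    columns_of_B = [list(col) for col in zip(*B)]
    columns_of_AB = [mat_vec(A, c_j) for c_j in columns_of_B]
    # Turn the list of columns back into a list of rows.
    return [list(row) for row in zip(*columns_of_AB)]


A = [[1, 2], [3, 4]]
B = [[5, 6, 7], [1, 0, -1]]
print(mat_mul_columnwise(A, B))  # [[7, 6, 5], [19, 18, 17]]
```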

This definition of matrix multiplication is good for hand calculations, and for emphasising that
matrix multiplication happens columnwise (that is, A multiplies into B one column of B at a
time). Sometimes, especially when proving properties of matrix multiplication, it is more
convenient to have a definition with an explicit formula. The definition we have just seen is
equivalent to the following:

Definition 2.5 Let $A = (a_{ij})$ be m✕n and $B = (b_{ij})$ be n✕p. The matrix product $AB$ is the m✕p matrix whose i, j entry is

$$\sum_{k=1}^{n} a_{ik} b_{kj}.$$

Notice that we only define the product AB when the number of columns of A is the same as the
number of rows of B.
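The explicit formula can be written as a short Python function. This is a sketch under the assumption that matrices are non-empty lists of rows; the name `mat_mul` is ours, and the indices start at 0 rather than 1 as in the definition.

```python
def mat_mul(A, B):
    """Product of an m x n matrix A and an n x p matrix B, both stored as lists of rows."""
    m, n, p = len(A), len(B), len(B[0])
    if any(len(row) != n for row in A):
        raise ValueError("number of columns of A must equal number of rows of B")
    # Entry (i, j) is the sum over k of a_ik * b_kj.
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]


A = [[1, 2], [3, 4]]
B = [[5, 6, 7], [1, 0, -1]]
print(mat_mul(A, B))  # [[7, 6, 5], [19, 18, 17]]
```

The result agrees with the columnwise calculation of the same product earlier in the section, as the equivalence of the two definitions predicts.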

Here are some important properties of matrix multiplication:

Proposition 2.1 Let A and A' be m✕n, B and B' be n✕p, and C be p✕q. Let λ be a number.

1. $(AB)C = A(BC)$ (that is, matrix multiplication is associative).
2. $(A + A')B = AB + A'B$ and $A(B + B') = AB + AB'$ (matrix multiplication distributes over addition).
3. $(\lambda A)B = \lambda(AB) = A(\lambda B)$.
4. $A 0_{n \times p} = 0_{m \times p}$ and $0_{p \times m} A = 0_{p \times n}$.
5. $(AB)^T = B^T A^T$.

Proof. Let $A = (a_{ij})$, $A' = (a'_{ij})$, $B = (b_{ij})$, $B' = (b'_{ij})$, $C = (c_{ij})$. During this proof we also write $X_{ij}$ to mean the i, j entry of a matrix X.
1. $AB$ has i, j entry $\sum_{k=1}^{n} a_{ik} b_{kj}$, so the i, j entry of $(AB)C$ is

$$\sum_{l=1}^{p} (AB)_{il} c_{lj} = \sum_{l=1}^{p} \sum_{k=1}^{n} a_{ik} b_{kl} c_{lj}.$$

On the other hand, the i, j entry of $BC$ is $\sum_{l=1}^{p} b_{il} c_{lj}$ so the i, j entry of $A(BC)$ is

$$\sum_{k=1}^{n} a_{ik} (BC)_{kj} = \sum_{k=1}^{n} a_{ik} \sum_{l=1}^{p} b_{kl} c_{lj} = \sum_{k=1}^{n} \sum_{l=1}^{p} a_{ik} b_{kl} c_{lj}.$$

These are the same: it doesn’t matter if we do the k or l summation first, since we just get the same terms in a different order.
2. The i, j entry of $(A + A')B$ is $\sum_{k=1}^{n} (a_{ik} + a'_{ik}) b_{kj}$, which equals $\sum_{k=1}^{n} a_{ik} b_{kj} + \sum_{k=1}^{n} a'_{ik} b_{kj}$, but this is the sum of the i, j entry of $AB$ and the i, j entry of $A'B$, proving the first equality. The second is similar.
3. The i, j entry of $\lambda A$ is $\lambda a_{ij}$, so the i, j entry of $(\lambda A)B$ is

$$\sum_{k=1}^{n} (\lambda a_{ik}) b_{kj} = \lambda \sum_{k=1}^{n} a_{ik} b_{kj} = \lambda (AB)_{ij}$$

so $(\lambda A)B$ and $\lambda(AB)$ have the same i, j entry for any i, j, and are therefore equal. The second equality can be proved similarly.
4. This is clear from the definition of matrix multiplication: every term in the sum defining an entry of the product has a zero factor.
5. The i, j entry of $AB$ is $\sum_{k=1}^{n} a_{ik} b_{kj}$, so the i, j entry of $(AB)^T$ is $\sum_{k=1}^{n} a_{jk} b_{ki}$. If we write the i, j entry of $B^T$ as $\beta_{ij}$ and the i, j entry of $A^T$ as $\alpha_{ij}$ then the i, j entry of $B^T A^T$ is $\sum_{k=1}^{n} \beta_{ik} \alpha_{kj}$, but $\beta_{ik} = b_{ki}$ and $\alpha_{kj} = a_{jk}$ so this is $\sum_{k=1}^{n} a_{jk} b_{ki}$, which is the same as the i, j entry of $(AB)^T$.
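The properties in Proposition 2.1 can be spot-checked numerically. The sketch below tests associativity, distributivity, compatibility with scalars and the transpose rule on randomly chosen integer matrices. The helper names (`mat_mul`, `mat_add`, `scale`, `transpose`, `random_matrix`), the sizes and the random seed are all assumptions made for illustration. A passing check on one example is evidence, not a proof.

```python
import random


def mat_mul(A, B):
    """Matrix product of two lists of rows, using the sum-over-k formula."""
    n, p = len(B), len(B[0])
    return [[sum(row[k] * B[k][j] for k in range(n)) for j in range(p)] for row in A]


def mat_add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def scale(lam, A):
    return [[lam * x for x in row] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def random_matrix(rows, cols, rng):
    return [[rng.randint(-9, 9) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
m, n, p, q = 2, 3, 4, 2
A, A2 = random_matrix(m, n, rng), random_matrix(m, n, rng)
B, B2 = random_matrix(n, p, rng), random_matrix(n, p, rng)
C = random_matrix(p, q, rng)
lam = 3

# Integer arithmetic is exact, so == is a fair comparison here.
checks = {
    "associative": mat_mul(mat_mul(A, B), C) == mat_mul(A, mat_mul(B, C)),
    "left distributive": mat_mul(mat_add(A, A2), B) == mat_add(mat_mul(A, B), mat_mul(A2, B)),
    "right distributive": mat_mul(A, mat_add(B, B2)) == mat_add(mat_mul(A, B), mat_mul(A, B2)),
    "scalars": mat_mul(scale(lam, A), B) == scale(lam, mat_mul(A, B)) == mat_mul(A, scale(lam, B)),
    "transpose": transpose(mat_mul(A, B)) == mat_mul(transpose(B), transpose(A)),
}
print(checks)
```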

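These properties show some of the ways in which matrix multiplication is like ordinary multiplication of numbers. There are two important ways in which it is different: in general, $AB \neq BA$:

$$\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} 5 & 6 \\ 7 & 8 \end{pmatrix} = \begin{pmatrix} 19 & 22 \\ 43 & 50 \end{pmatrix}$$

$$\begin{pmatrix} 5 & 6 \\ 7 & 8 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} = \begin{pmatrix} 23 & 34 \\ 31 & 46 \end{pmatrix}$$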

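and unlike for multiplying numbers, we can have $AB = 0$ even when $A, B \neq 0$:

$$\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 0 & 2 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}.$$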
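Both phenomena are easy to reproduce by machine. The sketch below recomputes the three products above; `mat_mul` is a hypothetical helper implementing the sum-over-k formula for matrices stored as lists of rows.

```python
def mat_mul(A, B):
    """Matrix product of two lists of rows, using the sum-over-k formula."""
    n, p = len(B), len(B[0])
    return [[sum(row[k] * B[k][j] for k in range(n)) for j in range(p)] for row in A]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(mat_mul(A, B))  # [[19, 22], [43, 50]]
print(mat_mul(B, A))  # [[23, 34], [31, 46]]

# Two non-zero matrices whose product is the zero matrix.
N = [[0, 1], [0, 0]]
M = [[0, 2], [0, 0]]
print(mat_mul(N, M))  # [[0, 0], [0, 0]]
```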

There is a matrix which behaves like 1 does under multiplication:

Definition 2.6 The n✕n identity matrix $I_n$ is the matrix whose i, j entry is 1 if i = j and 0 otherwise.

$$I_2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad I_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

Proposition 2.2 Let A be m✕n. Then $A I_n = A = I_m A$.

Proof. Let $A = (a_{ij})$, and write $I_n = (\delta_{ij})$ so that $\delta_{ij} = 0$ if $i \neq j$ and $\delta_{ii} = 1$ for each i. Using the definition of matrix multiplication, the i, j entry of $A I_n$ is

$$\sum_{k=1}^{n} a_{ik} \delta_{kj}.$$

Because $\delta_{ab}$ is zero unless $a = b$, the only term in this sum which can be non-zero is the one with $k = j$. This term is $a_{ij} \times 1 = a_{ij}$, so the i, j entry of $A I_n$ is the same as the i, j entry of A, and $A I_n = A$.

The proof that Im A = A is similar. ▢
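Proposition 2.2 can be illustrated on a small example. In the sketch below, `identity` builds $I_n$ straight from Definition 2.6 and `mat_mul` implements the sum-over-k formula; both names, and the 2✕3 matrix chosen, are assumptions for illustration only.

```python
def mat_mul(A, B):
    """Matrix product of two lists of rows, using the sum-over-k formula."""
    n, p = len(B), len(B[0])
    return [[sum(row[k] * B[k][j] for k in range(n)) for j in range(p)] for row in A]


def identity(n):
    """The n x n identity matrix: entry (i, j) is 1 if i == j and 0 otherwise."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


A = [[1, 2, 3], [4, 5, 6]]     # a 2 x 3 matrix, so m = 2 and n = 3
print(mat_mul(A, identity(3)))  # [[1, 2, 3], [4, 5, 6]]
print(mat_mul(identity(2), A))  # [[1, 2, 3], [4, 5, 6]]
```

Note that the two identity matrices have different sizes: $I_n$ multiplies on the right and $I_m$ on the left.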

Definition 2.7 An n✕n matrix A is called invertible if there is an n✕n matrix B such that $AB = I_n = BA$.

If A is invertible then there is one and only one matrix B such that $AB = I_n = BA$.

Lemma 2.1 Suppose $AB = BA = I_n$ and $AC = CA = I_n$. Then $B = C$.

Proof.

$$\begin{aligned}
B &= I_n B && \text{as } I_n X = X \text{ for any } X \\
&= (CA)B \\
&= C(AB) && \text{matrix multiplication is associative} \\
&= C I_n \\
&= C.
\end{aligned}$$

When A is invertible we use the notation $A^{-1}$ for the matrix such that $A A^{-1} = A^{-1} A = I_n$.
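To see the definition in action, the sketch below checks that a candidate matrix B satisfies $AB = I_2 = BA$ for one particular A. The matrices are our own example, not taken from the notes, and exact fractions are used so that the comparison with the identity matrix is not affected by rounding. The helpers `mat_mul` and `identity` are hypothetical names.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Matrix product of two lists of rows, using the sum-over-k formula."""
    n, p = len(B), len(B[0])
    return [[sum(row[k] * B[k][j] for k in range(n)) for j in range(p)] for row in A]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


A = [[1, 2], [3, 4]]
# Candidate inverse, chosen for illustration.
B = [[Fraction(-2), Fraction(1)], [Fraction(3, 2), Fraction(-1, 2)]]

print(mat_mul(A, B) == identity(2))  # True
print(mat_mul(B, A) == identity(2))  # True
```

By Lemma 2.1, once both products equal $I_2$ this B is the only matrix with that property, so it is $A^{-1}$.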

Not every non-zero square matrix is invertible: you can check directly that $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ and $\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$ aren’t invertible, for example. More generally, suppose that A and B are any non-zero matrices such that $AB = 0_{n \times n}$ (we have already seen examples of this). If A were invertible we could multiply this equation on the left by the inverse of A to get

$$A^{-1} A B = A^{-1} 0_{n \times n}, \quad \text{that is,} \quad B = 0_{n \times n},$$

which is not the case. It follows that if there is a non-zero matrix B such that $AB = 0_{n \times n}$ then A isn’t invertible (and a similar argument, multiplying on the right instead of the left, shows neither is B).

2.2.1 Why define matrix multiplication this way?

The last thing in this section on matrix multiplication is a quick comment on where it comes
from.
Write $\mathbb{R}^p$ for the set of all p✕1 column vectors with real numbers as entries. If A is an m✕n matrix (with real number entries, say) then there is a function

$$T_A : \mathbb{R}^n \to \mathbb{R}^m, \qquad T_A(v) = Av.$$

Now suppose B is n✕p, so that there is a similar map $T_B : \mathbb{R}^p \to \mathbb{R}^n$. The composition $T_A \circ T_B$ makes sense as a map $\mathbb{R}^p \to \mathbb{R}^m$, and it turns out

$$T_A \circ T_B = T_{AB}.$$

The connection with composition of maps is why matrix multiplication is defined the way it is.
We use a similar notation for complex column vectors: $\mathbb{C}^p$ denotes the set of all height p column vectors with complex number entries.
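The link between matrix multiplication and composition of maps can be checked on an example. In the sketch below, `T(A)` returns the map sending v to Av; the helpers `mat_vec`, `mat_mul` and `T`, and the particular matrices and vector, are assumptions chosen for illustration.

```python
def mat_vec(A, v):
    """Product Av of a matrix (list of rows) with a column vector (list of entries)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def mat_mul(A, B):
    """Matrix product of two lists of rows, using the sum-over-k formula."""
    n, p = len(B), len(B[0])
    return [[sum(row[k] * B[k][j] for k in range(n)) for j in range(p)] for row in A]


def T(A):
    """The map T_A sending a column vector v to Av."""
    return lambda v: mat_vec(A, v)


A = [[1, 2], [3, 4]]          # 2 x 2, so T_A maps R^2 to R^2
B = [[5, 6, 7], [1, 0, -1]]   # 2 x 3, so T_B maps R^3 to R^2
v = [1, 0, -1]                # a vector in R^3

composed = T(A)(T(B)(v))      # apply T_B first, then T_A
direct = T(mat_mul(A, B))(v)  # apply T_AB in one step
print(composed, direct)  # [2, 2] [2, 2]
```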
