Baker-Algebra and Number Theory
[13/05/2003]
A. Baker
Department of Mathematics, University of Glasgow. E-mail address: [email protected] URL: http://www.maths.gla.ac.uk/ajb
Contents
Chapter 1. Basic Number Theory
  1. The natural numbers
  2. The integers
  3. The Euclidean Algorithm and the method of back-substitution
  4. The tabular method
  5. Congruences
  6. Primes and factorization
  7. Congruences modulo a prime
  8. Finite continued fractions
  9. Infinite continued fractions
  10. Diophantine equations
  11. Pell's equation
  Problem Set 1
Chapter 2. Groups and group actions
  1. Groups
  2. Permutation groups
  3. The sign of a permutation
  4. The cycle type of a permutation
  5. Symmetry groups
  6. Subgroups and Lagrange's Theorem
  7. Group actions
  Problem Set 2
Chapter 3. Arithmetic functions
  1. Definition and examples of arithmetic functions
  2. Convolution and Möbius Inversion
  Problem Set 3
Chapter 4. Finite and infinite sets, cardinality and countability
  1. Finite sets and cardinality
  2. Infinite sets
  3. Countable sets
  4. Power sets and their cardinality
  5. The real numbers are uncountable
  Problem Set 4
Index
CHAPTER 1
We can also compare natural numbers using inequalities. Given $x, y \in \mathbb{N}_0$, exactly one of the following must be true: $x = y$, $x < y$, $y < x$. As usual, if one of $x = y$ or $x < y$ holds then we write $x \leq y$ or $y \geq x$. Inequality is transitive in the sense that $x < y$ and $y < z$ $\Longrightarrow$ $x < z$.
The most subtle aspect of the natural numbers to deal with is the fact that they form an infinite set. We can and usually do list the elements of $\mathbb{N}_0$ in the sequence $0, 1, 2, 3, 4, \ldots$ which never ends. One of the most important properties of $\mathbb{N}_0$ is
The Well Ordering Principle (WOP): Every non-empty subset $S \subseteq \mathbb{N}_0$ contains a least element.
A least or minimal element of a subset $S \subseteq \mathbb{N}_0$ is an element $s_0 \in S$ for which $s_0 \leq s$ for all $s \in S$. Similarly, a greatest or maximal element of $S$ is one for which $s \leq s_0$ for all $s \in S$. Notice that $\mathbb{N}_0$ has a least element $0$, but has no greatest element since for each $n \in \mathbb{N}_0$, $n + 1 \in \mathbb{N}_0$ and $n < n + 1$. It is easy to see that least and greatest elements (if they exist) are always unique. In fact, WOP is logically equivalent to each of the two following statements.
The Principle of Mathematical Induction (PMI): Suppose that for each $n \in \mathbb{N}_0$ the statement $P(n)$ is defined and also the following conditions hold: $P(0)$ is true; whenever $P(k)$ is true then $P(k+1)$ is true.
Then $P(n)$ is true for all $n \in \mathbb{N}_0$.
The Maximal Principle (MP): Let $T \subseteq \mathbb{N}_0$ be a non-empty subset which is bounded above, i.e., there exists a $b \in \mathbb{N}_0$ such that for all $t \in T$, $t \leq b$. Then $T$ contains a greatest element.
It is easily seen that two greatest elements must agree and we therefore refer to the greatest element.
Theorem 1.1. The following chain of implications holds:
\[ \text{PMI} \Longrightarrow \text{WOP} \Longrightarrow \text{MP} \Longrightarrow \text{PMI}. \]
Hence these three statements are logically equivalent.
Proof. PMI $\Longrightarrow$ WOP: Let $S \subseteq \mathbb{N}_0$ and suppose that $S$ has no least element. We will show that $S = \emptyset$. Let $P(n)$ be the statement
$P(n)$: $k \notin S$ for all natural numbers $k$ such that $0 \leq k \leq n$.
Notice that $0 \notin S$ since it would be a least element of $S$. Hence $P(0)$ is true. Now suppose that $P(n)$ is true. If $n + 1 \in S$, then since $k \notin S$ for $0 \leq k \leq n$, $n + 1$ would be the least element of $S$, contradicting our assumption. Hence, $n + 1 \notin S$ and so $P(n + 1)$ is true. By the PMI, $P(n)$ is true for all $n \in \mathbb{N}_0$. In particular, this means that $n \notin S$ for all $n$ and so $S = \emptyset$.
WOP $\Longrightarrow$ MP: Let $T \subseteq \mathbb{N}_0$ have upper bound $b$ and set
\[ S = \{ s \in \mathbb{N}_0 : t < s \text{ for all } t \in T \}. \]
Then $S$ is non-empty since for $t \in T$, $t \leq b < b + 1$,
so $b + 1 \in S$. If $s_0$ is a least element of $S$, then there must be an element $t_0 \in T$ such that $s_0 - 1 \leq t_0$; but we also have $t_0 < s_0$. Combining these we see that $s_0 - 1 = t_0 \in T$. Notice also that for every $t \in T$, $t < s_0$, hence $t \leq s_0 - 1$. Thus $t_0$ is the desired greatest element.
MP $\Longrightarrow$ PMI: Let $P(n)$ be a statement for each $n \in \mathbb{N}_0$. Suppose that $P(0)$ is true and for $n \in \mathbb{N}_0$, $P(n) \Longrightarrow P(n + 1)$. Suppose that there is an $m \in \mathbb{N}_0$ for which $P(m)$ is false. Consider the set
\[ T = \{ t \in \mathbb{N}_0 : P(n) \text{ is true for all natural numbers } n \text{ satisfying } 0 \leq n \leq t \}. \]
Notice that $T$ is non-empty since $0 \in T$, and is bounded above by $m$, since if $m \leq k$, $k \notin T$. Let $t_0$ be the greatest element of $T$, which exists thanks to the MP. Then $P(t_0)$ is true by definition of $T$, hence by assumption $P(t_0 + 1)$ is also true. But then $P(n)$ is true whenever $0 \leq n \leq t_0 + 1$, hence $t_0 + 1 \in T$, contradicting the fact that $t_0$ was the greatest element of $T$. Hence, $P(n)$ must be true for all $n \in \mathbb{N}_0$.
An important application of these equivalent results is to proving the following property of the natural numbers.
Theorem 1.2 (Long Division Property). Let $n, d \in \mathbb{N}_0$ with $0 < d$. Then there are unique natural numbers $q, r \in \mathbb{N}_0$ satisfying the two conditions $n = qd + r$ and $0 \leq r < d$.
Proof. Consider the set
\[ T = \{ t \in \mathbb{N}_0 : td \leq n \} \subseteq \mathbb{N}_0. \]
Then $T$ is non-empty since $0 \in T$. Also, for $t \in T$, $t \leq td$, hence $t \leq n$. So $T$ is bounded above by $n$ and hence has a greatest element $q$. But then $qd \leq n < (q + 1)d$. Notice that if $r = n - qd$, then
\[ 0 \leq r = n - qd < (q + 1)d - qd = d. \]
To prove uniqueness, suppose that $q', r'$ is a second such pair. Suppose that $r \neq r'$. By interchanging the pairs if necessary, we can assume that $r < r'$. Since $n = qd + r = q'd + r'$,
\[ 0 < r' - r = (q - q')d. \]
Notice that this means $q' \leq q$ since $d > 0$. If $q > q'$, this implies $d \leq (q - q')d$, hence
\[ d \leq r' - r < d - r \leq d, \]
and so $d < d$ which is impossible. So $q = q'$ which implies that $r' - r = 0$, contradicting the fact that $0 < r' - r$. So we must indeed have $q = q'$ and $r = r'$.
2. The integers
The set of integers is
\[ \mathbb{Z} = \mathbb{Z}^+ \cup \{0\} \cup \mathbb{Z}^- = \mathbb{N}_0 \cup \mathbb{Z}^-, \]
where
\[ \mathbb{Z}^+ = \{ n \in \mathbb{N}_0 : 0 < n \}, \qquad \mathbb{Z}^- = \{ n : -n \in \mathbb{Z}^+ \}. \]
We can add and multiply integers; indeed, they form a basic example of a commutative ring. We can generalize the Long Division Property to the integers.
Theorem 1.3. Let $n, d \in \mathbb{Z}$ with $0 \neq d$. Then there are unique integers $q, r \in \mathbb{Z}$ for which $0 \leq r < |d|$ and $n = qd + r$.
Proof. If $0 < d$, then the case $n \geq 0$ is Theorem 1.2, so we need to show this for $n < 0$. By Theorem 1.2, we have unique natural numbers $q', r'$ with $0 \leq r' < d$ and $-n = q'd + r'$. If $r' = 0$ then we take $q = -q'$ and $r = 0$. If $r' \neq 0$ then take $q = -1 - q'$ and $r = d - r'$. Finally, if $d < 0$ we can use the above with $-d$ in place of $d$ and get $n = q'(-d) + r$ and then take $q = -q'$. Once again, it is straightforward to verify uniqueness.
Given two integers $m, n \in \mathbb{Z}$ we say that $m$ divides $n$ and write $m \mid n$ if there is an integer $k \in \mathbb{Z}$ such that $n = km$; we also say that $m$ is a divisor of $n$. If $m$ does not divide $n$, we write $m \nmid n$.
Given two integers $a, b$ not both $0$, an integer $c$ is a common divisor or common factor of $a$ and $b$ if $c \mid a$ and $c \mid b$. A common divisor $h$ is a greatest common divisor or highest common factor if for every common divisor $c$, $c \mid h$. If $h, h'$ are two greatest common divisors of $a, b$, then $h \mid h'$ and $h' \mid h$, hence we must have $h' = \pm h$. For this reason it is standard to refer to the greatest common divisor as the positive one. We can then unambiguously write $\gcd(a, b)$ for this number. Later we will use Long Division to determine $\gcd(a, b)$. Then $a$ and $b$ are coprime if $\gcd(a, b) = 1$, or equivalently if the only common divisors are $\pm 1$.
There are many useful algebraic properties of greatest common divisors. Here is one, while others can be found in Problem Set 1.
Proposition 1.4. Let $h$ be a common divisor of the integers $a, b$. Then for any integers $x, y$ we have $h \mid (xa + yb)$. In particular this holds for $h = \gcd(a, b)$.
Proof. If we write $a = uh$ and $b = vh$ for suitable integers $u, v$, then
\[ xa + yb = xuh + yvh = (xu + yv)h, \]
and so $h \mid (xa + yb)$ since $(xu + yv) \in \mathbb{Z}$.
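As a small illustration of Theorem 1.3, here is a Python sketch of long division with the remainder normalised to $0 \leq r < |d|$, together with the divisibility relation $m \mid n$. The function names `long_division` and `divides` are illustrative choices and do not come from the notes. Python's built-in `divmod` gives a remainder with the sign of the divisor, so one adjustment step is needed when $d < 0$.

```python
def long_division(n, d):
    """Return (q, r) with n == q*d + r and 0 <= r < |d|, as in Theorem 1.3."""
    if d == 0:
        raise ValueError("d must be non-zero")
    q, r = divmod(n, d)   # Python: r has the same sign as d
    if r < 0:             # only possible when d < 0
        q, r = q + 1, r - d
    return q, r


def divides(m, n):
    """True if m | n, i.e. n == k*m for some integer k."""
    if m == 0:
        return n == 0
    return n % m == 0
```

For instance, `long_division(190, -72)` reproduces the first step $190 = (-2) \cdot (-72) + 46$ used later in Example 1.7.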
Theorem 1.5. Let $a, b$ be integers, not both $0$. Then there are integers $u, v$ such that
\[ \gcd(a, b) = ua + vb. \]
Proof. We might as well assume that $a \neq 0$ and set $h = \gcd(a, b)$. Let
\[ S = \{ xa + yb : x, y \in \mathbb{Z},\ 0 < xa + yb \} \subseteq \mathbb{N}_0. \]
Then $S$ is non-empty since one of $(\pm 1)a$ is positive and hence is in $S$. By the Well Ordering Principle, there is a least element $d$ of $S$, which can be expressed as $d = u_0 a + v_0 b$ for some $u_0, v_0 \in \mathbb{Z}$. By Proposition 1.4, we have $h \mid d$; indeed, all common divisors of $a, b$ divide $d$. Using Long Division we can find $q, r \in \mathbb{Z}$ with $0 \leq r < d$ satisfying $a = qd + r$. But then
\[ r = a - qd = (1 - qu_0)a + (-qv_0)b, \]
hence $r \in S$ or $r = 0$. Since $r < d$ with $d$ minimal, this means that $r = 0$ and so $d \mid a$. A similar argument also gives $d \mid b$. So $d$ is a common divisor of $a, b$ which is divisible by all other common divisors, so it must be the greatest common divisor of $a, b$.
This result is theoretically useful but does not provide a practical method to determine $\gcd(a, b)$. Long Division can be used to set up the Euclidean Algorithm which actually determines the greatest common divisor of two non-zero integers.
3. The Euclidean Algorithm and the method of back-substitution
Let $a, b \in \mathbb{Z}$ be non-zero. Set $n_0 = a$, $d_0 = b$. Using Long Division, choose integers $q_0$ and $r_0$ such that $0 \leq r_0 < |d_0|$ and $n_0 = q_0 d_0 + r_0$. Now set $n_1 = d_0$, $d_1 = r_0 \geq 0$ and choose integers $q_1, r_1$ such that $0 \leq r_1 < d_1$ and $n_1 = q_1 d_1 + r_1$. We can repeat this process, at the $k$-th stage setting $n_k = d_{k-1}$, $d_k = r_{k-1}$ and choosing integers $q_k, r_k$ for which $0 \leq r_k < d_k$ and $n_k = q_k d_k + r_k$. This is always possible provided $r_{k-1} = d_k \neq 0$. Notice that
\[ 0 \leq r_k < r_{k-1} < \cdots < r_1 < r_0 < |b|, \]
hence we must eventually reach a value $k = k_0$ for which $d_{k_0} \neq 0$ but $r_{k_0} = 0$. The sequence of equations
\begin{align*}
n_0 &= q_0 d_0 + r_0, \\
n_1 &= q_1 d_1 + r_1, \\
&\;\;\vdots \\
n_{k_0-2} &= q_{k_0-2} d_{k_0-2} + r_{k_0-2}, \\
n_{k_0-1} &= q_{k_0-1} d_{k_0-1} + r_{k_0-1}, \\
n_{k_0} &= q_{k_0} d_{k_0},
\end{align*}
allows us to express each $r_k = d_{k+1}$ in terms of $n_k, r_{k-1}$. For example, we have
\[ r_{k_0-1} = n_{k_0-1} - q_{k_0-1} d_{k_0-1} = n_{k_0-1} - q_{k_0-1} r_{k_0-2}. \]
Using this repeatedly, we can write
\[ d_{k_0} = u n_0 + v d_0 = ua + vb. \]
Thus we can express $d_{k_0}$ as an integer linear combination of $a, b$. By Proposition 1.4 all common divisors of the pair $a, b$ divide $d_{k_0}$. It is also easy to see, working upwards through the equations, that
\[ d_{k_0} \mid n_{k_0}, \quad d_{k_0} \mid n_{k_0-1}, \quad \ldots, \quad d_{k_0} \mid n_0, \]
from which it follows that $d_{k_0}$ also divides $a$ and $b$. Hence the number $d_{k_0}$ is the greatest common divisor of $a$ and $b$. So the last non-zero remainder term $r_{k_0-1} = d_{k_0}$ produced by the Euclidean Algorithm is $\gcd(a, b)$. This allows us to express the greatest common divisor of two integers as a linear combination of them by the method of back-substitution.
Example 1.6. Find the greatest common divisor of 60 and 84 and express it as an integral linear combination of these numbers.
Solution. Since the greatest common divisor only depends on the numbers involved and not their order, we might as well take the larger one first, so set $a = 84$ and $b = 60$. Then
\begin{align*}
84 &= 1 \cdot 60 + 24, & 24 &= 84 + (-1) \cdot 60, \\
60 &= 2 \cdot 24 + 12, & 12 &= 60 + (-2) \cdot 24, \\
24 &= 2 \cdot 12, & 12 &= \gcd(60, 84).
\end{align*}
Working back we find
\[ 12 = 60 + (-2) \cdot 24 = 60 + (-2) \cdot (84 + (-1) \cdot 60) = (-2) \cdot 84 + 3 \cdot 60. \]
Thus $\gcd(60, 84) = 12 = 3 \cdot 60 + (-2) \cdot 84$.
Example 1.7. Find the greatest common divisor of 190 and $-72$, and express it as an integral linear combination of these numbers.
Solution. Taking $a = 190$, $b = -72$ we have
\begin{align*}
190 &= (-2) \cdot (-72) + 46, & 46 &= 190 + 2 \cdot (-72), \\
-72 &= (-2) \cdot 46 + 20, & 20 &= -72 + 2 \cdot 46, \\
46 &= 2 \cdot 20 + 6, & 6 &= (-2) \cdot 20 + 46, \\
20 &= 3 \cdot 6 + 2, & 2 &= 20 + (-3) \cdot 6, \\
6 &= 3 \cdot 2, & 2 &= \gcd(190, -72).
\end{align*}
Working back we find
\begin{align*}
2 &= 20 + (-3) \cdot 6 = 20 + (-3) \cdot ((-2) \cdot 20 + 46) \\
&= (-3) \cdot 46 + 7 \cdot 20 \\
&= (-3) \cdot 46 + 7 \cdot (-72 + 2 \cdot 46) \\
&= 7 \cdot (-72) + 11 \cdot 46 \\
&= 7 \cdot (-72) + 11 \cdot (190 + 2 \cdot (-72)) \\
&= 11 \cdot 190 + 29 \cdot (-72).
\end{align*}
Thus $\gcd(190, -72) = 2 = 11 \cdot 190 + 29 \cdot (-72)$.
This could also be done by using the fact that $\gcd(190, -72) = \gcd(190, 72)$ and proceeding as follows.
Example 1.8. Find the greatest common divisor of 190 and 72 and express it as an integral linear combination of these numbers.
Solution. Taking $a = 190$, $b = 72$ we have
\begin{align*}
190 &= 2 \cdot 72 + 46, & 46 &= 190 + (-2) \cdot 72, \\
72 &= 1 \cdot 46 + 26, & 26 &= 72 + (-1) \cdot 46, \\
46 &= 1 \cdot 26 + 20, & 20 &= 46 + (-1) \cdot 26, \\
26 &= 1 \cdot 20 + 6, & 6 &= 26 + (-1) \cdot 20, \\
20 &= 3 \cdot 6 + 2, & 2 &= 20 + (-3) \cdot 6, \\
6 &= 3 \cdot 2, & 2 &= \gcd(190, 72).
\end{align*}
Working back we find
\begin{align*}
2 &= 20 + (-3) \cdot 6 = 20 + (-3) \cdot (26 + (-1) \cdot 20) \\
&= (-3) \cdot 26 + 4 \cdot 20 \\
&= (-3) \cdot 26 + 4 \cdot (46 + (-1) \cdot 26) \\
&= 4 \cdot 46 + (-7) \cdot 26 \\
&= 4 \cdot 46 + (-7) \cdot (72 + (-1) \cdot 46) \\
&= (-7) \cdot 72 + 11 \cdot 46 \\
&= (-7) \cdot 72 + 11 \cdot (190 + (-2) \cdot 72) \\
&= 11 \cdot 190 + (-29) \cdot 72.
\end{align*}
Thus $\gcd(190, 72) = 2 = 11 \cdot 190 + (-29) \cdot 72$. From this we obtain $\gcd(190, -72) = 2 = 11 \cdot 190 + 29 \cdot (-72)$.
It is usually more straightforward to work with positive $a, b$ and to adjust signs at the end. Notice that if $\gcd(a, b) = ua + vb$, the values of $u, v$ are not unique. For example,
\[ 83 \cdot 190 + 219 \cdot (-72) = 2. \]
In general, we can modify the numbers $u, v$ to $u + tb$, $v - ta$ for any $t \in \mathbb{Z}$ since
\[ (u + tb)a + (v - ta)b = (ua + vb) + (tba - tab) = ua + vb. \]
Thus different approaches to determining the linear combination giving $\gcd(a, b)$ may well produce different answers.
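The Euclidean Algorithm and the linear combination it yields can be sketched in Python as follows. This is a minimal illustration, not the notes' own procedure: instead of back-substituting at the end, it carries the coefficients forward alongside the remainders (the usual "extended Euclidean algorithm"), and it uses Python's floor division rather than insisting on non-negative remainders. The name `extended_gcd` is an illustrative choice.

```python
def extended_gcd(a, b):
    """Return (h, u, v) with h = gcd(a, b) >= 0 and h == u*a + v*b."""
    # Invariant maintained throughout the loop:
    #   r0 == u0*a + v0*b   and   r1 == u1*a + v1*b.
    r0, u0, v0 = a, 1, 0
    r1, u1, v1 = b, 0, 1
    while r1 != 0:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        u0, u1 = u1, u0 - q * u1
        v0, v1 = v1, v0 - q * v1
    if r0 < 0:  # normalise so that the gcd is the positive one
        r0, u0, v0 = -r0, -u0, -v0
    return r0, u0, v0


if __name__ == "__main__":
    print(extended_gcd(84, 60))    # coefficients for Example 1.6
    print(extended_gcd(190, 72))   # coefficients for Example 1.8
```

Because the coefficients $u, v$ are not unique, a different procedure may return a different pair from the one found by hand; only the identity $h = ua + vb$ is guaranteed.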
4. The tabular method
This section describes an alternative approach to the problem of expressing $\gcd(a, b)$ as a linear combination of $a, b$. I learnt this method from Francis Clarke of the University of Wales Swansea. The tabular method uses the sequence of quotients appearing in the Euclidean Algorithm and is closely related to the continued fraction method of Theorem 1.42. The tabular method provides an efficient alternative to the method of back-substitution and can also be used to check calculations done by that method.
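We will illustrate the tabular method with an example. In the case $a = 267$, $b = 207$, the Euclidean Algorithm produces the following quotients and remainders.
\begin{align*}
267 &= 1 \cdot 207 + 60, \\
207 &= 3 \cdot 60 + 27, \\
60 &= 2 \cdot 27 + 6, \\
27 &= 4 \cdot 6 + 3, \\
6 &= 2 \cdot 3 + 0.
\end{align*}
The last non-zero remainder is 3, so $\gcd(267, 207) = 3$. Back-substitution gives
\begin{align*}
3 &= 27 - 4 \cdot 6 \\
&= 27 - 4 \cdot (60 - 2 \cdot 27) \\
&= -4 \cdot 60 + 9 \cdot 27 \\
&= -4 \cdot 60 + 9 \cdot (207 - 3 \cdot 60) \\
&= 9 \cdot 207 - 31 \cdot 60 \\
&= 9 \cdot 207 - 31 \cdot (267 - 1 \cdot 207) \\
&= (-31) \cdot 267 + 40 \cdot 207.
\end{align*}
In the tabular method we form the following table.
\[
\begin{array}{cc|ccccc}
 & & 1 & 3 & 2 & 4 & 2 \\ \hline
1 & 0 & 1 & 3 & 7 & 31 & 69 \\
0 & 1 & 1 & 4 & 9 & 40 & 89
\end{array}
\]
Here the first row is the sequence of quotients. The second and third rows are determined as follows. The entry $t_k$ under the quotient $q_k$ is calculated from the formula
\[ t_k = q_k t_{k-1} + t_{k-2}. \]
So for example, 31 arises as $4 \cdot 7 + 3$. The final entries in the second and third rows always have the form $b/\gcd(a, b)$ and $a/\gcd(a, b)$; here $207/3 = 69$ and $267/3 = 89$. The previous entries are $\pm A$ and $\pm B$, where $\gcd(a, b) = Aa + Bb$ and the signs are chosen according to whether the number of quotients is even or odd.
Why does this give the same result as back-substitution? The arithmetic involved seems very different. In our example, the value 40 arises as $31 + 9$ in the back-substitution method and as $4 \cdot 9 + 4$ in the tabular method. The key to understanding this is provided by matrix multiplication, in particular the fact that it is associative. Consider the matrix product
\[
\begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}
\begin{pmatrix} 0 & 1 \\ 1 & 3 \end{pmatrix}
\begin{pmatrix} 0 & 1 \\ 1 & 2 \end{pmatrix}
\begin{pmatrix} 0 & 1 \\ 1 & 4 \end{pmatrix}
\begin{pmatrix} 0 & 1 \\ 1 & 2 \end{pmatrix}
\]
in which the quotients occur as the entries in the bottom right-hand corner. By the associative law, the product can be evaluated either from the right, giving the partial products
\begin{align*}
\begin{pmatrix} 0 & 1 \\ 1 & 4 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 2 \end{pmatrix} &= \begin{pmatrix} 1 & 2 \\ 4 & 9 \end{pmatrix}, \\
\begin{pmatrix} 0 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 4 & 9 \end{pmatrix} &= \begin{pmatrix} 4 & 9 \\ 9 & 20 \end{pmatrix}, \\
\begin{pmatrix} 0 & 1 \\ 1 & 3 \end{pmatrix} \begin{pmatrix} 4 & 9 \\ 9 & 20 \end{pmatrix} &= \begin{pmatrix} 9 & 20 \\ 31 & 69 \end{pmatrix}, \\
\begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 9 & 20 \\ 31 & 69 \end{pmatrix} &= \begin{pmatrix} 31 & 69 \\ 40 & 89 \end{pmatrix},
\end{align*}
or from the left, giving the partial products
\begin{align*}
\begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 3 \end{pmatrix} &= \begin{pmatrix} 1 & 3 \\ 1 & 4 \end{pmatrix}, \\
\begin{pmatrix} 1 & 3 \\ 1 & 4 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 2 \end{pmatrix} &= \begin{pmatrix} 3 & 7 \\ 4 & 9 \end{pmatrix}, \\
\begin{pmatrix} 3 & 7 \\ 4 & 9 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 4 \end{pmatrix} &= \begin{pmatrix} 7 & 31 \\ 9 & 40 \end{pmatrix}, \\
\begin{pmatrix} 7 & 31 \\ 9 & 40 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 2 \end{pmatrix} &= \begin{pmatrix} 31 & 69 \\ 40 & 89 \end{pmatrix}.
\end{align*}
Notice that the numbers occurring as the left-hand columns of the first set of partial products are the same (apart from the signs) as the numbers which arose in the back-substitution method. The numbers in the second set of partial products are those in the tabular method. Thus back-substitution corresponds to evaluation from the right and the tabular method to evaluation from the left. This shows that they give the same result.
Giving a general proof of this identification of the two methods with matrix multiplication is not too hard. In fact it becomes obvious given the factorization of the matrix $\begin{pmatrix} 0 & 1 \\ 1 & q \end{pmatrix}$ as the product of two elementary matrices:
\[ \begin{pmatrix} 0 & 1 \\ 1 & q \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & q \\ 0 & 1 \end{pmatrix}. \]
Two elementary row operations are performed when multiplying by $\begin{pmatrix} 0 & 1 \\ 1 & q \end{pmatrix}$ on the left. Firstly $q \times (\text{row } 2)$ is added to row 1, then the two rows are swapped. Multiplication by $\begin{pmatrix} 0 & 1 \\ 1 & q \end{pmatrix}$ on the right performs similar column operations.
The determinant of $\begin{pmatrix} 0 & 1 \\ 1 & q \end{pmatrix}$ is $-1$ and so by the multiplicative property of determinants,
\[ \det\left( \begin{pmatrix} 0 & 1 \\ 1 & q_1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & q_2 \end{pmatrix} \cdots \begin{pmatrix} 0 & 1 \\ 1 & q_r \end{pmatrix} \right) = (-1)^r. \]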
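A minimal Python sketch of the tabular method, assuming positive integers $a, b$. The function name `tabular_method` is an illustrative choice. The sign rule is taken from the determinant identity: if $x, y$ are the second-to-last entries of the second and third rows and there are $r$ quotients, then $x \cdot a - y \cdot b = (-1)^r \gcd(a, b)$.

```python
def tabular_method(a, b):
    """For integers a, b > 0 return (h, A, B) with h = gcd(a, b) = A*a + B*b."""
    # First row of the table: the quotients of the Euclidean Algorithm.
    quotients = []
    n, d = a, b
    while d != 0:
        q, r = divmod(n, d)
        quotients.append(q)
        n, d = d, r
    h = n  # last non-zero remainder

    # Second and third rows start with (1, 0) and (0, 1);
    # each new entry is t_k = q_k * t_{k-1} + t_{k-2}.
    row2, row3 = [1, 0], [0, 1]
    for q in quotients:
        row2.append(q * row2[-1] + row2[-2])
        row3.append(q * row3[-1] + row3[-2])

    # The last entries are b/h and a/h; the previous ones are +-A and +-B,
    # with signs fixed by the parity of the number of quotients.
    sign = 1 if len(quotients) % 2 == 0 else -1
    A, B = sign * row2[-2], -sign * row3[-2]
    return h, A, B
```

For $a = 267$, $b = 207$ this returns $(3, -31, 40)$, agreeing with the back-substitution above.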
This determinant identity explains the rule for the choice of signs in the tabular method. The partial products have determinant alternately equal to $\pm 1$. This provides a useful check on the calculations.
5. Congruences
Let $n \in \mathbb{N}_0$ be non-zero, so $n > 0$. Then for integers $x, y$, we say that $x$ is congruent to $y$ modulo $n$ if $n \mid (x - y)$ and write $x \equiv y \pmod{n}$ or $x \underset{n}{\equiv} y$. Then $\underset{n}{\equiv}$ is an equivalence relation on $\mathbb{Z}$ in the sense that the following hold for $x, y, z \in \mathbb{Z}$:
(Reflexivity) $x \underset{n}{\equiv} x$,
(Symmetry) $x \underset{n}{\equiv} y \Longrightarrow y \underset{n}{\equiv} x$,
(Transitivity) $x \underset{n}{\equiv} y$ and $y \underset{n}{\equiv} z \Longrightarrow x \underset{n}{\equiv} z$.
The set of equivalence classes is denoted $\mathbb{Z}/n$. We will denote the congruence class or residue class of the integer $x$ by $x_n$; sometimes notation such as $\overline{x}$ or $[x]_n$ is used. Residue classes can be added and multiplied using the formulae
\[ x_n + y_n = (x + y)_n, \qquad x_n y_n = (xy)_n. \]
These make sense because if $x'_n = x_n$ and $y'_n = y_n$, then
\begin{align*}
x' + y' &= x + y + (x' - x) + (y' - y) \underset{n}{\equiv} x + y, \\
x' y' &= (x + (x' - x))(y + (y' - y)) = xy + y(x' - x) + x(y' - y) + (x' - x)(y' - y) \underset{n}{\equiv} xy.
\end{align*}
We can also define subtraction by $x_n - y_n = (x - y)_n$. These operations make $\mathbb{Z}/n$ into a commutative ring with zero $0_n$ and unity $1_n$. Since for each $x \in \mathbb{Z}$ we have $x = qn + r$ with $q, r \in \mathbb{Z}$ and $0 \leq r < n$, we have $x_n = r_n$, so we usually list the distinct elements of $\mathbb{Z}/n$ as
\[ 0_n, 1_n, 2_n, \ldots, (n - 1)_n. \]
Theorem 1.9. Let $t \in \mathbb{Z}$ have $\gcd(t, n) = 1$. Then there is a unique residue class $u_n \in \mathbb{Z}/n$ for which $u_n t_n = 1_n$. In particular, the integer $u$ satisfies $ut \underset{n}{\equiv} 1$.
Proof. By Theorem 1.5, there are integers $u, v$ for which $ut + vn = 1$. This implies that $ut \underset{n}{\equiv} 1$, hence $u_n t_n = 1_n$. Notice that if $w_n$ also has this property then $w_n t_n = 1_n$ which gives
\[ w_n (t_n u_n) = (w_n t_n) u_n = u_n, \]
hence $w_n = u_n$.
We will refer to $u$ as the inverse of $t$ modulo $n$ and $u_n$ as the inverse of $t_n$ in $\mathbb{Z}/n$. Since $ut + vn = 1$, neither $t$ nor $u$ can have a common factor with $n$.
Example 1.10. Solve each of the following congruences, in each case giving all (if any) integer solutions:
(i) $5x \underset{12}{\equiv} 7$; (ii) $3x \underset{101}{\equiv} 6$; (iii) $2x \underset{10}{\equiv} 8$; (iv) $2x \underset{10}{\equiv} 7$.
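Solution.
(i) Since $\gcd(5, 12) = 1$, Theorem 1.9 applies; as $5^2 = 25 \underset{12}{\equiv} 1$, the inverse of 5 modulo 12 is 5 and
\[ x \underset{12}{\equiv} 5^2 x \underset{12}{\equiv} 35 \underset{12}{\equiv} 11. \]
(ii) Since $34 \cdot 3 = 102 \underset{101}{\equiv} 1$, the inverse of 3 modulo 101 is 34 and
\[ x \underset{101}{\equiv} 34 \cdot 3x \underset{101}{\equiv} 34 \cdot 6 \underset{101}{\equiv} 2. \]
(iii) Here $\gcd(2, 10) = 2$, so the above method does not immediately apply. We require that $2(x - 4) \underset{10}{\equiv} 0$, giving $(x - 4) \underset{5}{\equiv} 0$ and hence $x \underset{5}{\equiv} 4$. So we obtain the solutions $x \underset{10}{\equiv} 4$ and $x \underset{10}{\equiv} 9$.
(iv) This time we have $2x \underset{10}{\equiv} 7$ so $2x + 10k = 7$ for some $k \in \mathbb{Z}$. This is impossible since $2 \mid (2x + 10k)$ but $2 \nmid 7$, so there are no solutions.
Another important application is to the simultaneous solution of two or more congruence equations to different moduli. The next Lemma is the key ingredient.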
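The case analysis in Example 1.10 can be sketched in Python. This is an illustrative helper (the name `solve_linear_congruence` is not from the notes) that solves $ax \equiv c \pmod{n}$ by dividing through by $g = \gcd(a, n)$ when $g \mid c$, as in part (iii), and reporting no solutions otherwise, as in part (iv). It relies on the built-in `pow(a, -1, n)` for modular inverses, which requires Python 3.8 or later.

```python
from math import gcd


def solve_linear_congruence(a, c, n):
    """Return all residues x in range(n) with a*x ≡ c (mod n), in increasing order."""
    g = gcd(a, n)
    if c % g != 0:
        return []  # as in Example 1.10(iv): g divides a*x + n*k but not c
    # Divide through by g; now gcd(a1, n1) == 1, so a1 is invertible modulo n1.
    a1, c1, n1 = a // g, c // g, n // g
    x0 = (pow(a1, -1, n1) * c1) % n1
    # The solution modulo n1 lifts to g distinct solutions modulo n.
    return [x0 + k * n1 for k in range(g)]


if __name__ == "__main__":
    for a, c, n in [(5, 7, 12), (3, 6, 101), (2, 8, 10), (2, 7, 10)]:
        print((a, c, n), solve_linear_congruence(a, c, n))
```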
Lemma 1.11. Suppose that $a, b \in \mathbb{N}_0$ are coprime and $n \in \mathbb{Z}$. If $a \mid n$ and $b \mid n$, then $ab \mid n$.
Proof. Let $a \mid n$ and $b \mid n$ and choose $r, s \in \mathbb{Z}$ so that $n = ra = sb$. Then if $ua + vb = 1$,
\[ n = n(ua + vb) = nua + nvb = su(ab) + rv(ab) = (su + rv)ab. \]
Since $su + rv \in \mathbb{Z}$, this implies $ab \mid n$.
Theorem 1.12 (The Chinese Remainder Theorem). Suppose $n_1, n_2 \in \mathbb{Z}^+$ are coprime and $b_1, b_2 \in \mathbb{Z}$. Then the pair of simultaneous congruences
\[ x \underset{n_1}{\equiv} b_1, \qquad x \underset{n_2}{\equiv} b_2, \]
has a unique solution modulo $n_1 n_2$.
Proof. Since $n_1, n_2$ are coprime, there are integers $u_1, u_2$ for which $u_1 n_1 + u_2 n_2 = 1$. Consider the integer $t = u_1 n_1 b_2 + u_2 n_2 b_1$. Then we have the congruences
\[ t \underset{n_1}{\equiv} u_2 n_2 b_1 \underset{n_1}{\equiv} b_1, \qquad t \underset{n_2}{\equiv} u_1 n_1 b_2 \underset{n_2}{\equiv} b_2, \]
so $t$ is a solution for the pair of simultaneous congruences in the Theorem. To prove uniqueness modulo $n_1 n_2$, note that if $t, t'$ are both solutions to the original pair of simultaneous congruences then they satisfy the pair of congruences
\[ t' \underset{n_1}{\equiv} t, \qquad t' \underset{n_2}{\equiv} t. \]
By Lemma 1.11, $n_1 n_2 \mid (t' - t)$, implying that $t' \underset{n_1 n_2}{\equiv} t$, so the solution $t_{n_1 n_2} \in \mathbb{Z}/n_1 n_2$ is unique as claimed.
Remark 1.13. The general integer solution of the pair of congruences of Theorem 1.12 is
\[ x = u_1 n_1 b_2 + u_2 n_2 b_1 + k n_1 n_2 \quad (k \in \mathbb{Z}). \]
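The construction in the proof of Theorem 1.12 translates directly into code. The following Python sketch (with the illustrative name `crt_pair`) finds $u_1, u_2$ with $u_1 n_1 + u_2 n_2 = 1$ and returns $t = u_1 n_1 b_2 + u_2 n_2 b_1$ reduced modulo $n_1 n_2$. As a shortcut it obtains $u_1$ from the built-in `pow(n1, -1, n2)` (Python 3.8 or later) rather than from the Euclidean Algorithm, and it assumes $n_1, n_2 > 1$.

```python
from math import gcd


def crt_pair(b1, n1, b2, n2):
    """Return the unique t in range(n1*n2) with t ≡ b1 (mod n1) and t ≡ b2 (mod n2)."""
    if gcd(n1, n2) != 1:
        raise ValueError("moduli must be coprime")
    u1 = pow(n1, -1, n2)       # u1*n1 ≡ 1 (mod n2)
    u2 = (1 - u1 * n1) // n2   # exact division, so u1*n1 + u2*n2 == 1
    t = u1 * n1 * b2 + u2 * n2 * b1
    return t % (n1 * n2)


if __name__ == "__main__":
    # x ≡ 3 (mod 4), x ≡ 6 (mod 7)
    print(crt_pair(3, 4, 6, 7))
    # Three congruences handled in two steps: first moduli 8 and 3, then 24 and 5.
    print(crt_pair(crt_pair(7, 8, 2, 3), 24, 1, 5))
```

Systems with more than two pairwise coprime moduli can be handled by applying the function repeatedly, as the second call shows.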
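Example 1.14. Solve the following pair of simultaneous congruences modulo 28:
\[ 3x \underset{4}{\equiv} 1, \qquad 5x \underset{7}{\equiv} 2. \]
Solution. Begin by observing that $3^2 \underset{4}{\equiv} 1$ and $3 \cdot 5 = 15 \underset{7}{\equiv} 1$, hence the original pair of congruences is equivalent to the pair
\[ x \underset{4}{\equiv} 3, \qquad x \underset{7}{\equiv} 6. \]
Using the Euclidean Algorithm or otherwise we find
\[ 2 \cdot 4 + (-1) \cdot 7 = 1, \]
so the solution modulo 28 is
\[ x \underset{28}{\equiv} 2 \cdot 4 \cdot 6 + (-1) \cdot 7 \cdot 3 \underset{28}{\equiv} 48 - 21 = 27. \]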
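Hence the general integer solution is $27 + 28n$ ($n \in \mathbb{Z}$).
Example 1.15. Find all integer solutions of the three simultaneous congruences
\[ 7x \underset{8}{\equiv} 1, \qquad x \underset{3}{\equiv} 2, \qquad x \underset{5}{\equiv} 1. \]
Solution. We can proceed in two steps. First solve the pair of simultaneous congruences
\[ 7x \underset{8}{\equiv} 1, \qquad x \underset{3}{\equiv} 2 \]
modulo $8 \cdot 3 = 24$. Notice that $7^2 = 49 \underset{8}{\equiv} 1$, so the congruences are equivalent to the pair
\[ x \underset{8}{\equiv} 7, \qquad x \underset{3}{\equiv} 2, \]
whose solution modulo 24 is $x \underset{24}{\equiv} 23$. It remains to solve this together with the third congruence, i.e., the pair
\[ x \underset{24}{\equiv} 23, \qquad x \underset{5}{\equiv} 1. \]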
This gives for the general integer solution $x = 71 + 120n$ ($n \in \mathbb{Z}$).
6. Primes and factorization
Definition 1.16. A positive natural number $p \in \mathbb{N}_0$ for which $p > 1$ whose only integer factors are $\pm 1$ and $\pm p$ is called a prime. Otherwise such a natural number is called composite.
Some examples of primes are
2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97.
Notice that apart from 2, all primes are odd since every even integer is divisible by 2.
We begin with an important divisibility property of primes.
Theorem 1.17 (Euclid's Lemma). Let $p$ be a prime and $a, b \in \mathbb{Z}$. If $p \mid ab$, then $p \mid a$ or $p \mid b$.
Proof. Suppose that $p \nmid a$. Since $\gcd(p, a) \mid p$, we have $\gcd(p, a) = 1$ or $\gcd(p, a) = p$; but the latter implies $p \mid a$, contradicting our assumption, thus $\gcd(p, a) = 1$. Let $r, s \in \mathbb{Z}$ be such that $rp + sa = 1$. Then $rpb + sab = b$ and so $p \mid b$.
More generally, if a prime $p$ divides a product of integers $a_1 \cdots a_n$ then $p \mid a_j$ for some $j$. This can be proved by induction on the number $n$.
Theorem 1.18 (Fundamental Theorem of Arithmetic). Let $n \in \mathbb{N}_0$ be a natural number such that $n > 1$. Then $n$ has a unique factorization of the form
\[ n = p_1 p_2 \cdots p_t, \]
where for each $j$, $p_j$ is a prime and $2 \leq p_1 \leq p_2 \leq \cdots \leq p_t$.
Proof. We will prove this using the Well Ordering Principle. Consider the set
\[ S = \{ n \in \mathbb{N}_0 : 1 < n \text{ and no such factorization exists for } n \}. \]
Now suppose that $S \neq \emptyset$. Then by the WOP, $S$ has a least element $n_0$ say. Notice that $n_0$ cannot be prime since then it would have such a factorization. So there must be a factorization $n_0 = uv$ with $u, v \in \mathbb{N}_0$ and $u, v \neq 1$. Then we have $1 < u < n_0$ and $1 < v < n_0$, hence $u, v \notin S$ and so there are factorizations
\[ u = p_1 \cdots p_r, \qquad v = q_1 \cdots q_s \]
for suitable primes $p_j, q_j$. From this we obtain
\[ n_0 = p_1 \cdots p_r q_1 \cdots q_s, \]
and after reordering and renaming we have a factorization of the desired type for $n_0$. This contradicts $n_0 \in S$, so $S = \emptyset$.
To show uniqueness, suppose that
\[ p_1 \cdots p_r = q_1 \cdots q_s \]
for primes $p_i, q_j$ satisfying $p_1 \leq p_2 \leq \cdots \leq p_r$ and $q_1 \leq q_2 \leq \cdots \leq q_s$. Then $p_r \mid q_1 \cdots q_s$ and hence $p_r \mid q_t$ for some $t = 1, \ldots, s$, which implies that $p_r = q_t$. Thus we have
\[ p_1 \cdots p_{r-1} = q'_1 \cdots q'_{s-1}, \]
where $q'_1, \ldots, q'_{s-1}$ is the list $q_1, \ldots, q_s$ with the first occurrence of $q_t$ omitted. Continuing this way, we eventually get down to the case where $1 = q''_1 \cdots q''_{s-r}$ for some primes $q''_j$. But this is only possible if $s = r$, i.e., there are no such primes. By considering the sizes of the primes we have
\[ p_1 = q_1, \quad p_2 = q_2, \quad \ldots, \quad p_r = q_s, \]
which shows uniqueness.
We refer to this factorization as the prime factorization of $n$.
Corollary 1.19. Every natural number $n > 1$ has a unique factorization
\[ n = p_1^{r_1} p_2^{r_2} \cdots p_t^{r_t}, \]
where for each $j$, $p_j$ is a prime, $1 \leq r_j$ and $2 \leq p_1 < p_2 < \cdots < p_t$.
We call this factorization the prime power factorization of $n$.
Proposition 1.20. Let $a, b \in \mathbb{N}_0$ be non-zero with prime power factorizations
\[ a = p_1^{r_1} \cdots p_k^{r_k}, \qquad b = p_1^{s_1} \cdots p_k^{s_k}, \]
where $0 \leq r_j$ and $0 \leq s_j$. Then
\[ \gcd(a, b) = p_1^{t_1} \cdots p_k^{t_k} \]
with $t_j = \min\{r_j, s_j\}$.
Proof. For each $j$, we have $p_j^{t_j} \mid a$ and $p_j^{t_j} \mid b$, hence $p_j^{t_j} \mid \gcd(a, b)$. Then by Lemma 1.11, $p_1^{t_1} \cdots p_k^{t_k} \mid \gcd(a, b)$. If
\[ 1 < m = \frac{\gcd(a, b)}{p_1^{t_1} \cdots p_k^{t_k}}, \]
then $m \mid \gcd(a, b)$ and there is a prime $q$ dividing $m$, hence $q \mid a$ and $q \mid b$. This means that $q = p_\ell$ for some $\ell$ and so $p_\ell^{t_\ell + 1} \mid \gcd(a, b)$. But then $p_\ell^{r_\ell + 1} \mid a$ or $p_\ell^{s_\ell + 1} \mid b$, which is impossible. Hence $\gcd(a, b) = p_1^{t_1} \cdots p_k^{t_k}$.
We have not yet considered the question of how many primes there are, in particular whether there are finitely many.
Theorem 1.21. There are infinitely many distinct primes.
Proof. Suppose not. Let the distinct primes be $p_0 = 2, p_1, \ldots, p_n$ where
\[ 2 = p_0 < 3 = p_1 < \cdots < p_n. \]
Consider the natural number $N = (2 p_1 \cdots p_n) + 1$. Notice that for each $j$, $p_j \nmid N$. By the Fundamental Theorem of Arithmetic, $N = q_1 \cdots q_k$ for some primes $q_j$. This gives a contradiction since none of the $q_j$ can occur amongst the $p_j$.
We can also show that certain real numbers are not rational.
Proposition 1.22. Let $p$ be a prime. Then $\sqrt{p}$ is not a rational number.
Proof. Suppose that $\sqrt{p} = \dfrac{a}{b}$ for integers $a, b$. We can assume that $\gcd(a, b) = 1$ since common factors can be cancelled. Then on squaring we have $p = \dfrac{a^2}{b^2}$ and hence $a^2 = pb^2$. Thus $p \mid a^2$, and so by Euclid's Lemma 1.17, $p \mid a$. Writing $a = a_1 p$ for some integer $a_1$ we have $a_1^2 p^2 = pb^2$, hence $a_1^2 p = b^2$. Again using Euclid's Lemma we see that $p \mid b$. Thus $p$ is a common factor of $a$ and $b$, contradicting our assumption. This means that no such $a, b$ can exist so $\sqrt{p}$ is not a rational number.
Non-rational real numbers are called irrational. The set of all irrational real numbers is much bigger than the set of rational numbers $\mathbb{Q}$, see Section 5 of Chapter 4 for details. However it is hard to show that particular real numbers such as $e$ and $\pi$ are actually irrational.
7. Congruences modulo a prime
In this section, $p$ will denote a prime number. We will study $\mathbb{Z}/p$. We begin by noticing that it makes sense to consider a polynomial with integer coefficients
\[ f(x) = a_0 + a_1 x + \cdots + a_d x^d \in \mathbb{Z}[x], \]
but reduced modulo $p$. If for each $j$, $a_j \underset{p}{\equiv} b_j$, we write
\[ a_0 + a_1 x + \cdots + a_d x^d \underset{p}{\equiv} b_0 + b_1 x + \cdots + b_d x^d \]
and talk about the residue class of a polynomial modulo $p$. We will denote the residue class of $f(x)$ by $f(x)_p$. We say that $f(x)$ has degree $d$ modulo $p$ if $a_d \underset{p}{\not\equiv} 0$.
For an integer $c \in \mathbb{Z}$, we can evaluate $f(c)$ and reduce the answer modulo $p$, to obtain $f(c)_p$. If $f(c)_p = 0_p$, then $c$ is said to be a root of $f(x)$ modulo $p$. We will also refer to the residue class $c_p$ as a root of $f(x)$ modulo $p$.
Proposition 1.23. If $f(x)$ has degree $d$ modulo $p$, then the number of distinct roots of $f(x)$ modulo $p$ is at most $d$.
Proof. Begin by noticing that if $c$ is a root of $f(x)$ modulo $p$, then
\[ f(x) \underset{p}{\equiv} f(x) - f(c) = \bigl(a_1 + a_2(x + c) + \cdots + a_d(x^{d-1} + \cdots + c^{d-1})\bigr)(x - c). \]
Hence $f(x) \underset{p}{\equiv} f_1(x)(x - c)$. If $c'$ is another root of $f(x)$ modulo $p$ for which $c'_p \neq c_p$, then since
\[ f_1(c')(c' - c) \underset{p}{\equiv} 0 \]
we have $p \mid f_1(c')(c' - c)$ and so by Euclid's Lemma 1.17, $p \mid f_1(c')$; thus $c'$ is a root of $f_1(x)$ modulo $p$. If now the integers $c = c_1, c_2, \ldots, c_k$ are roots of $f(x)$ modulo $p$ which are all distinct modulo $p$, then
\[ f(x) \underset{p}{\equiv} (x - c_1)(x - c_2) \cdots (x - c_k) g(x). \]
Comparing degrees modulo $p$ we see that $k \leq d$.
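Theorem 1.24 (Fermat's Little Theorem). Let $t \in \mathbb{Z}$. Then $t$ is a root of the polynomial $\Phi_p(x) = x^p - x$ modulo $p$. Moreover, if $t_p \neq 0_p$, then $t$ is a root of the polynomial $\Phi_p^0(x) = x^{p-1} - 1$ modulo $p$.
Proof. Consider the function
\[ \varphi \colon \mathbb{Z} \longrightarrow \mathbb{Z}/p; \qquad \varphi(t) = (t^p - t)_p. \]
Notice that if $s \underset{p}{\equiv} t$ then $\varphi(s) = \varphi(t)$ since $s^p - s \underset{p}{\equiv} t^p - t$. Then for $u, v \in \mathbb{Z}$, $\varphi$ has the following additivity property:
\[ \varphi(u + v) = \varphi(u) + \varphi(v). \]
To see this, notice that the Binomial Theorem gives
\[ (u + v)^p = u^p + v^p + \sum_{j=1}^{p-1} \binom{p}{j} u^j v^{p-j}. \]
For $1 \leq j \leq p - 1$,
\[ \binom{p}{j} = p \cdot \frac{(p-1)!}{j!\,(p-j)!}, \]
and as none of $j!$, $(p - j)!$, $(p - 1)!$ is divisible by $p$, the integer $\binom{p}{j}$ is divisible by $p$. Hence $(u + v)^p \underset{p}{\equiv} u^p + v^p$, which gives the additivity property. By induction on $n$ we obtain the following useful result:
\[ \varphi(u_1 + \cdots + u_n) = \varphi(u_1) + \cdots + \varphi(u_n). \]
To prove Fermat's Little Theorem, notice that $\varphi(1) = 0_p$ and so for $t \geq 1$,
\[ \varphi(t) = \varphi(\underbrace{1 + \cdots + 1}_{t \text{ summands}}) = \underbrace{\varphi(1) + \cdots + \varphi(1)}_{t \text{ summands}} = 0_p. \]
For general $t \in \mathbb{Z}$, we have $\varphi(t) = \varphi(t + kp)$ for $k \in \mathbb{N}_0$, so we can replace $t$ by a positive natural number congruent to it and then use the above argument.
If $t_p \neq 0_p$, then we have $p \mid t(t^{p-1} - 1)$ and so by Euclid's Lemma 1.17, $p \mid (t^{p-1} - 1)$.
The second part of Fermat's Little Theorem can be used to elucidate the multiplicative structure of $\mathbb{Z}/p$. Let $t$ be an integer not divisible by $p$. By Theorem 1.9, since $\gcd(t, p) = 1$, there is an inverse $u$ of $t$ modulo $p$. The set
\[ P_t = \{ t^k_p : k \geq 1 \} \subseteq \mathbb{Z}/p \]
is finite with at most $p - 1$ elements. Notice that in particular we must have $t^r_p = t^s_p$ for some $r < s$ and so $t^{s-r}_p = 1_p$. The order of $t$ modulo $p$ is the smallest $d > 0$ such that $t^d \underset{p}{\equiv} 1$. We denote the order of $t$ by $\operatorname{ord}_p t$. Notice that the order is always in the range $1 \leq \operatorname{ord}_p t \leq p - 1$.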
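Lemma 1.26. For $t \in \mathbb{Z}$ with $p \nmid t$, the order of $t$ modulo $p$ divides $p - 1$. Moreover, for $k \in \mathbb{N}_0$, $t^k \underset{p}{\equiv} 1$ if and only if $\operatorname{ord}_p t \mid k$.
Proof. Let $d = \operatorname{ord}_p t$ be the order of $t$ modulo $p$. Writing $p - 1 = qd + r$ with $0 \leq r < d$, we have
\[ 1 \underset{p}{\equiv} t^{p-1} = t^{qd + r} = t^{qd} t^r \underset{p}{\equiv} t^r, \]
which means that $r = 0$ since $d$ is the least positive integer with this property. Hence $d \mid (p - 1)$.
If $d \mid k$ then $t^k$ is a power of $t^d$, so $t^k \underset{p}{\equiv} 1$. Conversely, if $t^k \underset{p}{\equiv} 1$, then writing $k = q'd + r'$ with $0 \leq r' < d$, we have
\[ 1 \underset{p}{\equiv} t^{q'd} t^{r'} \underset{p}{\equiv} t^{r'}, \]
hence $r' = 0$ by the minimality of $d$. So $d \mid k$.
Theorem 1.27. For a prime $p$, there is an integer $g$ such that $\operatorname{ord}_p g = p - 1$.
Proof. Proofs of this result can be found in many books on elementary Number Theory. It is also a consequence of our Theorem 2.28.
Such an integer $g$ is called a primitive root modulo $p$. The distinct powers of $g$ modulo $p$ are then the $(p - 1)$ residue classes
\[ 1_p = g^0_p,\ g_p,\ g^2_p,\ \ldots,\ g^{p-2}_p. \]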
This implies the following result.
Proposition 1.28. Let $g$ be a primitive root modulo the prime $p$. Then for any integer $t$ with $p \nmid t$, there is a unique integer $r$ such that $0 \leq r < p - 1$ and $t \underset{p}{\equiv} g^r$.
Notice that when $p$ is odd the power $g^{(p-1)/2}$ satisfies $(g^{(p-1)/2})^2 \underset{p}{\equiv} 1$. Since this number is not congruent to 1 modulo $p$, Proposition 1.23 implies that
\[ g^{(p-1)/2} \underset{p}{\equiv} -1. \]
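The notions of order and primitive root can be explored numerically. The following Python sketch uses brute force and is only meant for small primes; the names `order_mod` and `primitive_root` are illustrative, and the code assumes that `p` really is prime (for composite `p` the loop in `order_mod` need not terminate).

```python
def order_mod(t, p):
    """Smallest d > 0 with t**d ≡ 1 (mod p); assumes p is prime and p does not divide t."""
    if t % p == 0:
        raise ValueError("t must not be divisible by p")
    d, power = 1, t % p
    while power != 1:
        power = (power * t) % p
        d += 1
    return d


def primitive_root(p):
    """Smallest g in 1..p-1 whose order modulo the prime p is p - 1 (brute force)."""
    for g in range(1, p):
        if order_mod(g, p) == p - 1:
            return g
    raise ValueError("no primitive root found; is p prime?")


if __name__ == "__main__":
    p = 13
    g = primitive_root(p)
    print(g, [order_mod(t, p) for t in range(1, p)])
    print(pow(g, (p - 1) // 2, p) == p - 1)  # g^((p-1)/2) ≡ -1 for odd p
```

Every order printed divides $p - 1$, in line with Lemma 1.26.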
Proposition 1.29. If $p$ is an odd prime then the polynomial $x^2 + 1$ has
no roots modulo $p$ if $p \underset{4}{\equiv} 3$,
two roots modulo $p$ if $p \underset{4}{\equiv} 1$.
Proof. Let $g$ be a primitive root modulo $p$.
If $p \underset{4}{\equiv} 3$, suppose that $u^2 + 1 \underset{p}{\equiv} 0$. Then if $u \underset{p}{\equiv} g^r$, we have $g^{2r} \underset{p}{\equiv} -1$, hence $g^{2r} \underset{p}{\equiv} g^{(p-1)/2}$. But then $(p - 1) \mid (2r - (p - 1)/2)$ which is impossible since $(p - 1)/2$ is odd.
If $p \underset{4}{\equiv} 1$, $(g^{(p-1)/4})^4 - 1 \underset{p}{\equiv} 0$, so the polynomial $x^4 - 1$ has four distinct roots modulo $p$, namely
\[ 1_p,\ -1_p,\ g^{(p-1)/4}_p,\ g^{3(p-1)/4}_p. \]
By Proposition 1.23, this means that $g^{(p-1)/4}$, $g^{3(p-1)/4}$ are roots of $x^2 + 1$ modulo $p$.
Theorem 1.30 (Wilson's Theorem). For a prime $p$, $(p - 1)! \underset{p}{\equiv} -1$.
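Proof. This is trivially true when $p = 2$, so assume that $p$ is odd. By Fermat's Little Theorem 1.24, the polynomial $x^{p-1} - 1$ has for its $p - 1$ distinct roots modulo $p$ the numbers $1, 2, \ldots, p - 1$. Thus
\[ (x - 1)(x - 2) \cdots (x - p + 1) \underset{p}{\equiv} x^{p-1} - 1. \]
Setting $x = 0$ gives $(-1)^{p-1} (p - 1)! \underset{p}{\equiv} -1$, and since $p - 1$ is even the result follows.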
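Proposition 1.29 and Wilson's Theorem are easy to check numerically for small primes. The Python sketch below does so by direct search; the helper names `roots_of_x2_plus_1` and `wilson_holds` are illustrative and not part of the notes.

```python
from math import factorial


def roots_of_x2_plus_1(p):
    """Residues x in range(p) with x*x + 1 ≡ 0 (mod p)."""
    return [x for x in range(p) if (x * x + 1) % p == 0]


def wilson_holds(p):
    """True if (p - 1)! ≡ -1 (mod p)."""
    return factorial(p - 1) % p == p - 1


if __name__ == "__main__":
    for p in [3, 5, 7, 11, 13, 17, 19]:
        print(p, p % 4, roots_of_x2_plus_1(p), wilson_holds(p))
```

The primes with $p \equiv 3 \pmod 4$ show an empty list of roots, while those with $p \equiv 1 \pmod 4$ show exactly two.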
8. Finite continued fractions
Let $a, b \in \mathbb{Z}$ with $b > 0$. If the Euclidean Algorithm for these integers produces the sequence of quotients $q_0, q_1, \ldots, q_{k_0}$, with first remainder $r_0$, then
\[ \frac{a}{b} = q_0 + \frac{r_0}{b} = q_0 + \frac{1}{b/r_0} = q_0 + \cfrac{1}{q_1 + \cfrac{1}{q_2 + \cdots + \cfrac{1}{q_{k_0-1} + \cfrac{1}{q_{k_0}}}}} \]
and this expression is called the continued fraction expansion of $a/b$, written $[q_0; q_1, \ldots, q_{k_0}]$; we also say that $[q_0; q_1, \ldots, q_{k_0}]$ represents $a/b$. In general, $[a_0; a_1, a_2, a_3, \ldots, a_n]$ gives a finite continued fraction if each $a_k$ is an integer with all except possibly $a_0$ being positive. Then
\[ [a_0; a_1, a_2, a_3, \ldots, a_n] = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3 + \cdots + \cfrac{1}{a_n}}}}. \]
Notice that this expansion for $a/b$ is not necessarily unique since if $q_{k_0} > 1$, then $q_{k_0} = (q_{k_0} - 1) + \dfrac{1}{1}$ and we obtain the different expansion
\[ \frac{a}{b} = q_0 + \cfrac{1}{q_1 + \cfrac{1}{q_2 + \cdots + \cfrac{1}{q_{k_0-1} + \cfrac{1}{(q_{k_0} - 1) + \cfrac{1}{1}}}}} \]
which shows that [q0 ; q1 , . . . , qk0 ] = [q0 ; q1 , . . . , qk0 1, 1]. For example, 8 1 1 1 21 =1+ =1+ =1+ =1+ 13 13 13 5 1 1+ 1+ 8 8 3 1+ 5 1 1 1 =1+ =1+ =1+ 1 1 1 1+ 1+ 1+ 1 1 1 1+ 1+ 1+ 2 1 1 1+ 1+ 1+ 3 1 1 1+ 1+ 2 1 1+ 1 so 21/13 = [1; 1, 1, 1, 1, 2] = [1; 1, 1, 1, 1, 1, 1]. Analogous considerations show that every rational number has exactly two such continued fraction expansions related in a similar fashion. The convergents of the above continued fraction expansion are the numbers A0 = 1, A1 = 1 + 1 = 2, 1 A2 = 1 + 1 1+ 3 = , 1 2 1 1 1+ 1+ 1+ 1 1 1 1+ A3 = 1 + 1+ 1 1 1+ = 21 , 13 1 1 5 = , 3
A4 = 1 + 1+
1 1 1+ 1 1+ 1 1
8 = , 5
A5 = 1 +
1 2 which form a sequence tending to 21/13. They also satisfy the inequalities A0 < A2 < A4 < A5 < A3 < A1 . In general, the even convergents of a nite continued fraction expansion always form a strictly increasing sequence, while the odd ones form a strictly decreasing sequence. 9. Innite continued fractions The continued fraction expansions considered so far are all nite, however innite continued fraction (icf) expansions turn out to be interesting too. Such an innite continued fraction expansion has the form [a0 ; a1 , a2 , a3 , . . .] = a0 + a1 + 1 1 a2 +
1 a3 + where a0 , a1 , a2 , a3 , . . . are integers with all except possibly a0 being positive. Of course, we might expect to have to consider questions of convergence for such an innite expansion and we will discuss this point later. Example 1.31. Assuming it makes sense, what real number must the following innite continued fraction [1; 1, 1, 1, . . .] represent?
Solution. If $\gamma = [1; 1, 1, 1, \ldots]$ then
$$\gamma = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cdots}}} = 1 + \frac{1}{\gamma},$$
i.e., $\gamma^2 - \gamma - 1 = 0$, which has solutions $\gamma = \dfrac{1 \pm \sqrt{5}}{2}$. It is obvious that $\gamma > 0$, hence $\gamma = (1 + \sqrt{5})/2$.

Let $A = [a_0; a_1, a_2, a_3, \ldots]$ be an infinite continued fraction expansion. Then for each $k \geq 0$, the finite continued fraction $A_k = [a_0; a_1, a_2, a_3, \ldots, a_k]$ is called the $k$-th convergent of $A$. In Example 1.31, the first few convergents are
$$A_0 = \frac{1}{1}, \quad A_1 = 1 + \frac{1}{1} = \frac{2}{1}, \quad A_2 = 1 + \cfrac{1}{1 + \frac{1}{1}} = \frac{3}{2}, \quad A_3 = [1; 1, 1, 1] = \frac{5}{3}, \quad A_4 = [1; 1, 1, 1, 1] = \frac{8}{5}.$$
Here the numerators and denominators form the famous Fibonacci sequence $\{u_n\}$,
$$1, 1, 2, 3, 5, 8, \ldots$$
which is given by the recurrence relation
$$u_1 = u_2 = 1, \qquad u_n = u_{n-1} + u_{n-2} \quad (n \geq 3).$$
Using the convergents of a continued fraction, we might define $A = [a_0; a_1, a_2, a_3, \ldots]$ to be $\lim_{n \to \infty} A_n$, provided this limit exists. We will show that such limits do always exist and we will then say that $A = [a_0; a_1, a_2, a_3, \ldots]$ represents the value of this limit. The first few convergents of $A = [a_0; a_1, a_2, a_3, \ldots]$ are
$$A_0 = \frac{a_0}{1}, \quad A_1 = \frac{a_1 a_0 + 1}{a_1}, \quad A_2 = \frac{a_2(a_1 a_0 + 1) + a_0}{a_2 a_1 + 1}, \quad A_3 = \frac{a_3(a_2 a_1 a_0 + a_2 + a_0) + a_1 a_0 + 1}{a_3(a_2 a_1 + 1) + a_1}.$$
The general pattern is given in the next result.

Theorem 1.32. Given the infinite continued fraction $A = [a_0; a_1, a_2, a_3, \ldots]$, set $p_0 = a_0$, $q_0 = 1$, $p_1 = a_1 a_0 + 1$, $q_1 = a_1$, while for $n \geq 2$,
$$p_n = a_n p_{n-1} + p_{n-2}, \qquad q_n = a_n q_{n-1} + q_{n-2}.$$
Then for each $n \geq 0$,
$$A_n = \frac{p_n}{q_n}.$$

In the proof and later in this section we will make use of generalized finite continued fractions $[a_0; a_1, a_2, a_3, \ldots, a_{n-1}, a_n]$ for which
$$a_0 \in \mathbb{Z}, \qquad 0 < a_k \in \mathbb{N}_0 \ (1 \leq k \leq n-1), \qquad 0 < a_n \in \mathbb{R}.$$
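Proof. The cases $n = 0, 1, 2$ clearly hold. We will prove the result by Induction on $n$. Suppose that for some $k \geq 2$, $A_k = \dfrac{p_k}{q_k}$. Then
$$A_{k+1} = [a_0; a_1, a_2, a_3, \ldots, a_k, a_{k+1}] = [a_0; a_1, a_2, a_3, \ldots, a_{k-1}, a_k + 1/a_{k+1}].$$
Applying the inductive hypothesis to this generalized continued fraction, whose last entry is $a_k + 1/a_{k+1}$, gives us the inductive step
$$A_{k+1} = \frac{(a_k + 1/a_{k+1})\,p_{k-1} + p_{k-2}}{(a_k + 1/a_{k+1})\,q_{k-1} + q_{k-2}} = \frac{a_{k+1}(a_k p_{k-1} + p_{k-2}) + p_{k-1}}{a_{k+1}(a_k q_{k-1} + q_{k-2}) + q_{k-1}} = \frac{a_{k+1} p_k + p_{k-1}}{a_{k+1} q_k + q_{k-1}} = \frac{p_{k+1}}{q_{k+1}}.$$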
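The recurrence of Theorem 1.32 translates directly into a short loop. The sketch below is illustrative only: the function name `convergents` is hypothetical, and the starting values $p_{-1} = 1$, $q_{-1} = 0$ are a convenience assumed here (they are not used in the notes) so that the same update rule also produces $p_1 = a_1 a_0 + 1$ and $q_1 = a_1$.

```python
def convergents(terms):
    """Return the pairs (p_n, q_n) for [a0; a1, ..., an] via Theorem 1.32."""
    # Assumed convention: p_{-1} = 1, q_{-1} = 0, which makes the recurrence
    # p_n = a_n p_{n-1} + p_{n-2} valid from n = 1 onwards.
    p_prev, q_prev = 1, 0
    p, q = terms[0], 1
    pairs = [(p, q)]
    for a in terms[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        pairs.append((p, q))
    return pairs


# Convergents of 21/13 = [1; 1, 1, 1, 1, 2]
for p, q in convergents([1, 1, 1, 1, 1, 2]):
    print(f"{p}/{q}")
```

Returning integer pairs rather than reduced fractions keeps the individual $p_n$ and $q_n$ available for checking identities between consecutive convergents.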
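Corollary 1.33. The convergents of $A = [a_0; a_1, a_2, a_3, \ldots]$ satisfy
i) for $n \geq 1$,
$$p_n q_{n-1} - p_{n-1} q_n = (-1)^{n-1}, \qquad A_n - A_{n-1} = \frac{(-1)^{n-1}}{q_{n-1} q_n};$$
ii) for $n \geq 2$,
$$p_n q_{n-2} - p_{n-2} q_n = (-1)^n a_n, \qquad A_n - A_{n-2} = \frac{(-1)^n a_n}{q_{n-2} q_n}.$$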
Proof. We will use Induction on $n$. We can easily verify the cases $n = 1, 2$. Assume that the equations hold when $n = k$ for some $k \geq 2$. Then
$$p_{k+1} q_k - p_k q_{k+1} = (a_{k+1} p_k + p_{k-1}) q_k - p_k (a_{k+1} q_k + q_{k-1}) = p_{k-1} q_k - p_k q_{k-1} = (-1)(-1)^{k-1} = (-1)^k,$$
giving the inductive step required to prove (i). Similarly, for (ii) we have
$$p_k q_{k-2} - p_{k-2} q_k = (a_k p_{k-1} + p_{k-2}) q_{k-2} - p_{k-2} (a_k q_{k-1} + q_{k-2}) = a_k (p_{k-1} q_{k-2} - p_{k-2} q_{k-1}) = a_k (-1)^{k-2} = (-1)^k a_k.$$

Corollary 1.34. The convergents of $A = [a_0; a_1, a_2, a_3, \ldots]$ satisfy the inequalities
$$A_{2r} < A_{2r+2s} < A_{2r+2s-1} < A_{2s-1}$$
for all integers $r, s$ with $r \geq 0$ and $s > 0$. Hence each $A_{2m}$ is less than each $A_{2n-1}$ and the sequence $\{A_{2n}\}$ is strictly increasing while the sequence $\{A_{2n-1}\}$ is strictly decreasing, i.e.,
$$A_0 < A_2 < \cdots < A_{2m} < \cdots < A_{2n-1} < \cdots < A_3 < A_1.$$

Theorem 1.35. The convergents of the infinite continued fraction $[a_0; a_1, a_2, a_3, \ldots]$ form a sequence $\{A_n\}$ which has a limit $A = \lim_{n \to \infty} A_n$.
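Proof. Notice that the increasing sequence $\{A_{2n}\}$ is bounded above by $A_1$, hence it has a limit $\ell$ say. Similarly, the decreasing sequence $\{A_{2n-1}\}$ is bounded below by $A_0$, hence it has a limit $u$ say. Notice that
$$\ell = \lim_{n \to \infty} A_{2n} \leq \lim_{n \to \infty} A_{2n-1} = u.$$
In fact, by Corollary 1.33(i),
$$u - \ell = \lim_{n \to \infty} (A_{2n-1} - A_{2n}) = \lim_{n \to \infty} \frac{1}{q_{2n-1} q_{2n}}.$$
Since the $q_n$ are positive integers which are strictly increasing for $n \geq 1$, we have $\lim_{n \to \infty} \dfrac{1}{q_n} = 0$, hence $\ell = u$, and this common value is the limit $A$ of the sequence $\{A_n\}$.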
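Example 1.36. Determine the real number which is represented by the infinite continued fraction $[1; 2, 2, 2, \ldots]$ and calculate its first few convergents.

Solution. Let $\gamma$ be this number. Then
$$\gamma - 1 = \frac{1}{1 + \gamma},$$
giving the equation $\gamma^2 - 1 = 1$. Thus $\gamma = \pm\sqrt{2}$ and since $\gamma$ is clearly positive, we get $\gamma = \sqrt{2}$. We have $a_0 = 1$, $2 = a_1 = a_2 = a_3 = \cdots$, giving $p_0 = 1$, $q_0 = 1$, $p_1 = 3$, $q_1 = 2$ and for $n \geq 2$,
$$p_n = 2p_{n-1} + p_{n-2}, \qquad q_n = 2q_{n-1} + q_{n-2}.$$
The first few convergents are
$$A_0 = 1, \quad A_1 = \frac{3}{2}, \quad A_2 = \frac{7}{5}, \quad A_3 = \frac{17}{12}, \quad A_4 = \frac{41}{29}, \quad A_5 = \frac{99}{70}, \quad A_6 = \frac{239}{169}, \quad A_7 = \frac{577}{408}.$$

Theorem 1.37. Each irrational number $\gamma$ has a unique representation as an infinite continued fraction expansion $[c_0; c_1, c_2, \ldots]$ for which $c_j \in \mathbb{Z}$ with $c_j > 0$ if $j > 0$.

Proof. We begin by setting $\gamma_0 = \gamma$ and $c_0 = [\gamma_0]$. Then if
$$\gamma_1 = \frac{1}{\gamma_0 - c_0},$$
we can define $c_1 = [\gamma_1]$. Continuing in this way, we can inductively define sequences of real numbers $\gamma_n$ and integers $c_n$ satisfying
$$\gamma_n = \frac{1}{\gamma_{n-1} - c_{n-1}}, \qquad c_n = [\gamma_n].$$
Notice that for $n > 0$, $c_n > 0$. Also, if $\gamma_n$ is rational then so is $\gamma_{n-1}$ since
$$\gamma_{n-1} = c_{n-1} + \frac{1}{\gamma_n},$$
and this would imply that $\gamma_0$ was rational, which is false. In particular this shows that $\gamma_n \neq 0$ at each stage and $\gamma_n > c_n$. Using the generalized continued fraction notation we have
$$\gamma = [c_0; c_1, \ldots, c_n, \gamma_{n+1}]$$
with convergents satisfying the conditions
$$C_n = \frac{p_n}{q_n}, \qquad \gamma = \frac{\gamma_{n+1} p_n + p_{n-1}}{\gamma_{n+1} q_n + q_{n-1}}.$$
Then
$$|\gamma - C_n| = \left| \frac{\gamma_{n+1} p_n + p_{n-1}}{\gamma_{n+1} q_n + q_{n-1}} - \frac{p_n}{q_n} \right| = \left| \frac{p_{n-1} q_n - p_n q_{n-1}}{(\gamma_{n+1} q_n + q_{n-1}) q_n} \right| = \left| \frac{(-1)^n}{(\gamma_{n+1} q_n + q_{n-1}) q_n} \right| < \frac{1}{(c_{n+1} q_n + q_{n-1}) q_n} = \frac{1}{q_{n+1} q_n} < \frac{1}{q_n^2}.$$
Since the $q_n$ form a strictly increasing sequence of integers, $1/q_n^2 \to 0$ as $n \to \infty$, hence $C_n \to \gamma$. Thus the infinite continued fraction $[c_0; c_1, c_2, \ldots]$ represents $\gamma$. It is easy to see that if $\gamma$ is represented by the infinite continued fraction $[a_0; a_1, a_2, \ldots]$ then $a_0 = [\gamma]$, and in general $c_n = a_n$ for all $n$, hence this representation is unique.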
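Example 1.38. Find the continued fraction expansion of $\sqrt{2}$.

Solution. Let $\gamma_0 = \sqrt{2}$ and so $c_0 = [\sqrt{2}] = 1$. Then
$$\gamma_1 = \frac{1}{\sqrt{2} - 1} = \frac{\sqrt{2} + 1}{2 - 1} = \sqrt{2} + 1$$
and so $c_1 = [\sqrt{2} + 1] = 2$. Repeating this gives
$$\gamma_2 = \frac{1}{\sqrt{2} + 1 - 2} = \frac{1}{\sqrt{2} - 1} = \sqrt{2} + 1$$
and $c_2 = [\sqrt{2} + 1] = 2$. Clearly we get for each $n > 0$,
$$\gamma_n = \frac{1}{\sqrt{2} - 1} = \sqrt{2} + 1, \qquad c_n = 2.$$
So the infinite continued fraction representing $\sqrt{2}$ is $[1; 2, 2, \ldots] = [1; \overline{2}]$, where $\overline{2}$ means 2 repeated infinitely often.

We will write $\overline{a_1, a_2, \ldots, a_p}$ to denote the sequence $a_1, a_2, \ldots, a_p$ repeated infinitely often as in the last example.

Example 1.39. Find the continued fraction expansion of $\sqrt{3}$.

Solution. Let $\gamma_0 = \sqrt{3}$ and so $c_0 = [\sqrt{3}] = 1$. Then
$$\gamma_1 = \frac{1}{\sqrt{3} - 1} = \frac{\sqrt{3} + 1}{3 - 1} = \frac{\sqrt{3} + 1}{2}$$
and so $c_1 = 1$. Repeating gives
$$\gamma_2 = \frac{1}{\gamma_1 - c_1} = \frac{2}{\sqrt{3} - 1} = \frac{2(\sqrt{3} + 1)}{3 - 1} = \sqrt{3} + 1,$$
so $c_2 = 2$, and
$$\gamma_3 = \frac{1}{\gamma_2 - c_2} = \frac{1}{\sqrt{3} - 1} = \frac{\sqrt{3} + 1}{2} = \gamma_1$$
and $c_3 = 1 = c_1$. From now on this pattern repeats giving
$$\gamma_n = \begin{cases} \dfrac{\sqrt{3} + 1}{2} & \text{if $n$ is odd}, \\ \sqrt{3} + 1 & \text{if $n$ is even}, \end{cases} \qquad c_n = \begin{cases} 1 & \text{if $n$ is odd}, \\ 2 & \text{if $n$ is even}. \end{cases}$$
So the infinite continued fraction representing $\sqrt{3}$ is $[1; 1, 2, 1, 2, \ldots] = [1; \overline{1, 2}]$. The first few convergents are
$$1, \quad 2, \quad \frac{5}{3}, \quad \frac{7}{4}, \quad \frac{19}{11}, \quad \frac{26}{15}, \quad \frac{71}{41}, \quad \frac{97}{56}.$$
This example illustrates a general phenomenon.

Theorem 1.40. For a natural number $n$ which is not a square, the irrational number $\sqrt{n}$ has an infinite continued fraction expansion of the form $[a_0; \overline{a_1, a_2, \ldots, a_p}]$. Furthermore, if $p$ is the smallest such number, then the continued fraction expansion of $\sqrt{n}$ also has the symmetry
$$\sqrt{n} = [a_0; \overline{a_1, a_2, \ldots, a_p}] = [a_0; \overline{a_1, a_2, \ldots, a_2, a_1, 2a_0}].$$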
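The procedure used in Examples 1.38 and 1.39 can be carried out with integers only. The sketch below rests on an assumption not spelled out in the notes: each $\gamma_k$ is stored as $(m + \sqrt{n})/d$ with integers $m, d$, which is the standard way of avoiding floating point. The function name `sqrt_continued_fraction` is illustrative. The loop stops at the first term equal to $2a_0$, in line with the symmetry stated in Theorem 1.40.

```python
from math import isqrt


def sqrt_continued_fraction(n):
    """Return (a0, block) where sqrt(n) = [a0; block repeated] for non-square n."""
    a0 = isqrt(n)
    if a0 * a0 == n:
        raise ValueError("n must not be a perfect square")
    block = []
    # Invariant (assumed representation): the current gamma_k = (m + sqrt(n)) / d.
    m, d, a = 0, 1, a0
    while a != 2 * a0:
        m = d * a - m          # numerator shift after subtracting the integer part
        d = (n - m * m) // d   # exact division: d always divides n - m*m
        a = (a0 + m) // d      # integer part of the next gamma
        block.append(a)
    return a0, block


for n in (2, 3, 7, 13):
    print(n, sqrt_continued_fraction(n))
```

Stopping when the state $(m, d)$ repeats would work equally well; testing for the term $2a_0$ simply makes the link with the theorem visible.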
The smallest $p$ for which the expansion has periodic part of length $p$ is called the period of the continued fraction expansion of $\sqrt{n}$. Here are some more examples whose periods are indicated.

$\sqrt{5} = [2; \overline{4}]$ (period 1)
$\sqrt{6} = [2; \overline{2, 4}]$ (period 2)
$\sqrt{7} = [2; \overline{1, 1, 1, 4}]$ (period 4)
$\sqrt{8} = [2; \overline{1, 4}]$ (period 2)
$\sqrt{10} = [3; \overline{6}]$ (period 1)
$\sqrt{11} = [3; \overline{3, 6}]$ (period 2)
$\sqrt{12} = [3; \overline{2, 6}]$ (period 2)
$\sqrt{13} = [3; \overline{1, 1, 1, 1, 6}]$ (period 5)
$\sqrt{97} = [9; \overline{1, 5, 1, 1, 1, 1, 1, 1, 5, 1, 18}]$ (period 11)

10. Diophantine equations

Consider the following problem: Find all integer solutions $x, y$ of the equation $35x + 61y = 1$. Such problems in which we are only interested in integer solutions are called Diophantine problems and are named after the Greek Diophantus in whose book many examples appeared. Diophantine problems were also studied in several ancient civilizations including those of China, India and the Middle East.

Since $\gcd(35, 61) = 1$, we can use the Euclidean Algorithm to find a specific solution of this problem.

$61 = 1 \times 35 + 26$, so $26 = 61 + (-1) \times 35$,
$35 = 1 \times 26 + 9$, so $9 = 35 + (-1) \times 26$,
$26 = 2 \times 9 + 8$, so $8 = 26 + (-2) \times 9$,
$9 = 1 \times 8 + 1$, so $1 = 9 + (-1) \times 8$,
$8 = 8 \times 1$.

Hence a solution is obtained from
$$1 = 9 + (-1) \times 8 = 9 + (-1) \times (26 + (-2) \times 9) = (-1) \times 26 + 3 \times 9$$
$$= (-1) \times 26 + 3 \times (35 + (-1) \times 26) = 3 \times 35 + (-4) \times 26$$
$$= 3 \times 35 + (-4) \times (61 + (-1) \times 35) = 7 \times 35 + (-4) \times 61.$$
So $x = 7$, $y = -4$ is an integer solution. To find all integer solutions, notice that another solution $x, y$ must satisfy $35(x - 7) + 61(y + 4) = 0$, hence we have $35 \mid 61(y + 4)$ so by Euclid's Lemma 1.17, $35 \mid (y + 4)$. Thus $y = -4 + 35k$ for some $k \in \mathbb{Z}$ and then $x = 7 - 61k$. The general integer solution is
$$x = 7 - 61k, \quad y = -4 + 35k \quad (k \in \mathbb{Z}).$$
Here is the general result about this kind of problem.

Theorem 1.41. If $a, b, c \in \mathbb{Z}$ and $h = \gcd(a, b)$, set $a = uh$ and $b = vh$.
a) If $h \nmid c$, then the equation $ax + by = c$ has no integer solutions.
b) If $h \mid c$, then the equation $ax + by = c$ has integer solutions. If $x_0, y_0$ is a particular integer solution, then the general integer solution is
$$x = x_0 - vk, \quad y = y_0 + uk \quad (k \in \mathbb{Z}).$$

Proof. a) This is obvious.
b) Dividing through by $h$ gives the equivalent equation $ux + vy = w$, where $c = wh$. This can be solved as in the preceding discussion to obtain the stated general solution.

We end this section by showing how continued fractions can be used to find one solution of the above Diophantine problem, $35x + 61y = 1$. Consider the continued fraction expansion
$$\frac{61}{35} = 1 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{8}}}} = [1; 1, 2, 1, 8].$$
The penultimate convergent is $[1; 1, 2, 1] = 7/4$. Apart from the signs involved, the numbers 7, 4 are those appearing in the above solution. This illustrates a general result which is closely related to the tabular method of §4.

Theorem 1.42. If $a, b$ are coprime positive integers, then a solution of $ax + by = 1$ is obtained from the continued fraction expansion
$$\frac{b}{a} = [c_0; c_1, \ldots, c_m]$$
by taking the penultimate convergent $\dfrac{p_{m-1}}{q_{m-1}}$ and setting
$$x = (-1)^m p_{m-1}, \qquad y = (-1)^{m-1} q_{m-1}.$$

Proof. This is a consequence of Corollary 1.33(i) where we take $n = m$. Since $p_m/q_m = b/a$ with both fractions in lowest terms, we have $p_m = b$ and $q_m = a$. This gives
$$b q_{m-1} - a p_{m-1} = (-1)^{m-1},$$
i.e.,
$$(-1)^m p_{m-1}\, a + (-1)^{m-1} q_{m-1}\, b = 1.$$
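Theorem 1.42 together with Theorem 1.41 gives a complete recipe for the coprime case. The following sketch is illustrative: the names `solve_coprime` and `general_solution` are hypothetical, and the starting values $p_{-1} = 1$, $q_{-1} = 0$ are an assumed convention that also handles the case $m = 0$ (that is, $a = 1$).

```python
def solve_coprime(a, b):
    """Return (x, y) with a*x + b*y == 1 for coprime positive integers a, b."""
    # Expand b/a as [c0; c1, ..., cm] with the Euclidean Algorithm.
    terms = []
    num, den = b, a
    while den != 0:
        c, r = divmod(num, den)
        terms.append(c)
        num, den = den, r
    m = len(terms) - 1
    # Run the recurrence of Theorem 1.32 (assumed start: p_{-1} = 1, q_{-1} = 0).
    p_prev, q_prev = 1, 0
    p, q = terms[0], 1
    for c in terms[1:]:
        p, p_prev = c * p + p_prev, p
        q, q_prev = c * q + q_prev, q
    if (p, q) != (b, a):
        raise ValueError("a and b must be coprime")
    sign = 1 if m % 2 == 0 else -1       # sign = (-1)^m
    return sign * p_prev, -sign * q_prev  # x = (-1)^m p_{m-1}, y = (-1)^{m-1} q_{m-1}


def general_solution(a, b, k):
    """The k-th solution of a*x + b*y == 1, following Theorem 1.41 with h = 1."""
    x0, y0 = solve_coprime(a, b)
    return x0 - b * k, y0 + a * k


print(solve_coprime(35, 61))
print(general_solution(35, 61, 1))
```

For an equation $ax + by = c$ with $h = \gcd(a, b)$ dividing $c$, one would first divide through by $h$ and then scale the solution of $ux + vy = 1$ by $w = c/h$.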
11. Pell's equation

Another important Diophantine problem is the solution of Pell's Equation
$$x^2 - dy^2 = 1,$$
where $d$ is an integer which is not a square. It turns out that the integer solutions $x, y$ of this equation can be found using continued fractions. We will describe the method without detailed proofs.

From now on, let $d$ be a non-square natural number. Let $[a_0; \overline{a_1, \ldots, a_p}]$ be the infinite continued fraction expansion of period $p$ for $\sqrt{d}$, with $n$-th convergent $A_n = p_n/q_n$ using the notation of Theorem 1.32.

Theorem 1.43. If $x = u$, $y = v$ is a positive integer solution of the equation $x^2 - dy^2 = 1$, then $u/v$ is a convergent of the continued fraction expansion of $\sqrt{d}$.

Theorem 1.44.
a) If the period $p$ is even then all positive integer solutions of the equation $x^2 - dy^2 = 1$ are given by
$$x = p_{kp-1}, \quad y = q_{kp-1} \quad (k = 1, 2, 3, \ldots).$$
b) If the period $p$ is odd then all positive integer solutions of the equation $x^2 - dy^2 = 1$ are given by
$$x = p_{2kp-1}, \quad y = q_{2kp-1} \quad (k = 1, 2, 3, \ldots).$$

Example 1.45. Find all positive integer solutions of $x^2 - 2y^2 = 1$.
Solution. From Example 1.38 we have $\sqrt{2} = [1; \overline{2}]$ with $p = 1$. So the positive integer solutions are
$$x = p_{2k-1}, \quad y = q_{2k-1} \quad (k = 1, 2, 3, \ldots).$$
The first few are $(x, y) = (3, 2), (17, 12), (99, 70)$.

Example 1.46. Find all positive integer solutions of $x^2 - 3y^2 = 1$.

Solution. From Example 1.39 we have $\sqrt{3} = [1; \overline{1, 2}]$ with $p = 2$. So the positive integer solutions are
$$x = p_{2k-1}, \quad y = q_{2k-1} \quad (k = 1, 2, 3, \ldots).$$
The first few are $(x, y) = (2, 1), (7, 4), (26, 15)$.

The fundamental solution of $x^2 - dy^2 = 1$ is the positive integral solution $x_1, y_1$ with $x_1, y_1$ minimal. Thus since the convergents of $\sqrt{d}$ have $p_n, q_n$ strictly increasing, Theorem 1.44 with $k = 1$ gives
$$(x_1, y_1) = \begin{cases} (p_{p-1}, q_{p-1}) & \text{if the period $p$ is even}, \\ (p_{2p-1}, q_{2p-1}) & \text{if the period $p$ is odd}. \end{cases}$$

Theorem 1.47. The positive integral solutions of $x^2 - dy^2 = 1$ are precisely the pairs of integers $(x_n, y_n)$ for which
$$x_n + y_n\sqrt{d} = (x_1 + y_1\sqrt{d})^n.$$

For $d = 2$,
$$x_1 + y_1\sqrt{2} = 3 + 2\sqrt{2}, \quad x_2 + y_2\sqrt{2} = 17 + 12\sqrt{2}, \quad x_3 + y_3\sqrt{2} = 99 + 70\sqrt{2}, \quad x_4 + y_4\sqrt{2} = 577 + 408\sqrt{2}.$$
For $d = 3$,
$$x_1 + y_1\sqrt{3} = 2 + 1\sqrt{3}, \quad x_2 + y_2\sqrt{3} = 7 + 4\sqrt{3}, \quad x_3 + y_3\sqrt{3} = 26 + 15\sqrt{3}, \quad x_4 + y_4\sqrt{3} = 97 + 56\sqrt{3}.$$

Example 1.48. Find the fundamental solution of $x^2 - 97y^2 = 1$ and hence find 3 other positive integral solutions.

Solution. We have $\sqrt{97} = [9; \overline{1, 5, 1, 1, 1, 1, 1, 1, 5, 1, 18}]$ which has period $p = 11$. The fundamental solution is $(x_1, y_1) = (p_{21}, q_{21}) = (62809633, 6377352)$, while the first few solutions are given by
$$x_1 + y_1\sqrt{97} = 62809633 + 6377352\sqrt{97},$$
$$x_2 + y_2\sqrt{97} = 7890099995189377 + 801118277263632\sqrt{97},$$
$$x_3 + y_3\sqrt{97} = 991148570062293006927649 + 100635889969041933956760\sqrt{97},$$
$$x_4 + y_4\sqrt{97} = 124507355868174813917084187296257 + 12641806631167809665750389674528\sqrt{97}.$$
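The method of this section can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the method of the notes verbatim: `fundamental_solution` scans the convergents of $\sqrt{d}$ in order and returns the first one satisfying the equation (justified by Theorem 1.43 and the fact that $p_n, q_n$ increase), using the integer representation $(m + \sqrt{d})/\mathrm{den}$ for the successive complete quotients; `pell_solutions` then expands the powers in Theorem 1.47 as an integer recurrence. Both function names are hypothetical.

```python
from math import isqrt


def fundamental_solution(d):
    """Smallest positive (x, y) with x*x - d*y*y == 1, for non-square d > 0."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    m, den, a = 0, 1, a0          # assumed state: (m + sqrt(d)) / den
    p_prev, q_prev = 1, 0         # assumed convention p_{-1} = 1, q_{-1} = 0
    p, q = a0, 1
    while p * p - d * q * q != 1:
        m = den * a - m
        den = (d - m * m) // den
        a = (a0 + m) // den
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
    return p, q


def pell_solutions(d, count):
    """First `count` positive solutions, via (x1 + y1*sqrt(d))**n."""
    x1, y1 = fundamental_solution(d)
    solutions = []
    x, y = x1, y1
    for _ in range(count):
        solutions.append((x, y))
        # (x + y*sqrt(d)) * (x1 + y1*sqrt(d)) expanded into integer parts
        x, y = x1 * x + d * y1 * y, x1 * y + y1 * x
    return solutions


print(pell_solutions(2, 4))
print(pell_solutions(3, 4))
print(fundamental_solution(97))
```

Python integers have arbitrary precision, so the rapidly growing solutions for $d = 97$ are computed exactly.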
Problem Set 1

1-1. If $a, b, c$ are non-zero integers, show that each of the following statements is true.
(a) If $a \mid b$ and $b \mid c$, then $a \mid c$.
(b) If $a \mid b$ and $b \mid a$, then $b = \pm a$.
(c) If $k \in \mathbb{Z}$ is non-zero, then $\gcd(ka, kb) = |k| \gcd(a, b)$.

1-2. Use the Euclidean Algorithm and the method of back-substitution to find the following greatest common divisors and in each case express $\gcd(a, b)$ as an integer linear combination of $a, b$:
$$\gcd(76, 98), \quad \gcd(108, 120), \quad \gcd(1008, 520), \quad \gcd(936, 876), \quad \gcd(591, 691).$$
Use the tabular method of §4 to check your results.

1-3. For non-zero integers $a, b$, show that the set $S = \{xa + yb : x, y \in \mathbb{Z},\ 0 < xa + yb\}$ defined in the proof of Theorem 1.5 agrees with the set $T = \{t \gcd(a, b) : t \in \mathbb{N}_0,\ 0 < t\}$.

1-4. Using the method in the proof of Theorem 1.5, show that if $n \in \mathbb{Z}$, $\gcd(12n + 5, 5n + 2) = 1$.

1-5. Find all integer solutions $x$ (if there are any) of each of the following congruences:
(a) $9x \equiv 23 \pmod{245}$; (b) $21x \equiv 7 \pmod{77}$; (c) $21x \equiv 8 \pmod{77}$; (d) $210x \equiv 97 \pmod{1007}$; (e) $13x \equiv 36 \pmod{37}$.

1-6. Find all integer solutions $x$ (if there are any) of each of the following pairs of simultaneous congruences:
(a) $9x \equiv 23 \pmod{245}$, $210x \equiv 97 \pmod{1007}$;
(b) $21x \equiv 7 \pmod{77}$, $13x \equiv 36 \pmod{37}$;
(c) $9x \equiv 23 \pmod{245}$, $21x \equiv 7 \pmod{77}$.

1-7. [Challenge question] Using Maple, for a collection of values $n = 2, 3, 4, 5, 6, 7, 8, \ldots$ determine all the solutions modulo $n$ of the congruence $x^3 - x \equiv 0 \pmod{n}$. Can you spot anything systematic about the number of solutions modulo $n$?

1-8. Show that if a prime $p$ divides a product of integers $a_1 \cdots a_n$, then $p \mid a_j$ for some $j$.

1-9. Let p, q be a pair of distinct prime numbers. Show that each of the following is irrational: r p p n ; (c) for any coprime pair of natural numbers r, s. (a) p for n > 1; (b) s q q

1-10. Let $p_1, p_2, \ldots, p_r$ and $q_1, q_2, \ldots, q_s$ be primes which satisfy the congruences
$$p_i \equiv 1 \pmod{4} \quad (1 \leq i \leq r), \qquad q_j \equiv 3 \pmod{4} \quad (1 \leq j \leq s).$$
Show that
$$p_1 p_2 \cdots p_r q_1 q_2 \cdots q_s \equiv (-1)^s \pmod{4}.$$
Use this result to show that for any natural number $n$, $4n + 3$ is divisible by at least one prime $p$ with $p \equiv 3 \pmod{4}$.

1-11. (a) Find two roots of the polynomial $f(x) = x^4 + 22$ modulo 23. Hence find three factors of $f(x)$ modulo 23 and explain why you would not expect there to be any other monic linear factors.
(b) Find two roots of the polynomial $g(x) = x^4 + 4x^2 + 43x^3 + 43x + 3$ modulo 47. Hence find three factors of $g(x)$ modulo 47 and explain why you would not expect there to be any other monic linear factors.

1-12. For each of the primes $p = 5, 7, 11, 13, 17, 19, 23, 37$ find a primitive root modulo $p$.
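1-13. Let $p$ be an odd prime. For $t \in \mathbb{Z}$ with $p \nmid t$, define
$$\left(\frac{t}{p}\right) = \begin{cases} 1 & \text{if there is a $u \in \mathbb{Z}$ such that $t \equiv u^2 \pmod{p}$}, \\ -1 & \text{otherwise}. \end{cases}$$
Use the proof of Proposition 1.29 to show that
$$\left(\frac{-1}{p}\right) = (-1)^{(p-1)/2}.$$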
1 + pn1 ,
pn
1.
What can you say about the case $p = 2$?

1-15. Determine the two continued fraction expansions of each of the numbers
$$1/3, \quad 2/3, \quad 3.14159, \quad 3.14160, \quad 51/11, \quad 1725/1193, \quad 1193/1725, \quad -1193/1725, \quad 30031/16579, \quad 1103/87.$$
In each case determine all the convergents.

1-16. If $n$ is a positive integer, what are the continued fraction expansions of $-n$ and $1/n$? What about when $n$ is negative? [Hint: Try a few examples first then attempt to formulate and prove general results.]
Try to find a relationship between the continued fraction expansions of $a/b$ and $-a/b$, $b/a$ when $a, b$ are non-zero natural numbers.

1-17. If $A = [a_0; a_1, \ldots, a_n]$ with $A > 1$, show that $1/A = [0; a_0, a_1, \ldots, a_n]$.
Let $x > 1$ be a real number. Show that the $n$-th convergent of the continued fraction representation of $x$ agrees with the $(n-1)$-th convergent of the continued fraction representation of $1/x$.

1-18. Find the continued fraction expansions of $\dfrac{1}{\sqrt{5}}$ and $\dfrac{1}{\sqrt{5} - 1}$. Determine as many convergents as you can.

1-19. Investigate the continued fraction expansions of $\sqrt{6}$ and $\dfrac{1}{\sqrt{6}}$. Determine as many convergents as you can.

1-20. [Challenge question] Try to determine the first 10 terms in the continued fraction expansion of $e$ using the series expansion
$$e = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots.$$

1-21. Find all the solutions of each of the following Diophantine equations:
(a) $64x + 108y = 4$, (b) $64x + 108y = 2$, (c) $64x + 108y = 12$.

1-22. Let $n$ be a positive integer.
a) Prove the identities
$$n + \sqrt{n^2 + 1} = 2n + (\sqrt{n^2 + 1} - n) = 2n + \frac{1}{n + \sqrt{n^2 + 1}}.$$
b) Show that $[\sqrt{n^2 + 1}] = n$ and that the infinite continued fraction expansion of $\sqrt{n^2 + 1}$ is $[n; \overline{2n}]$.
c) Show that $[\sqrt{n^2 + 2}] = n$ and that the infinite continued fraction expansion of $\sqrt{n^2 + 2}$ is $[n; \overline{n, 2n}]$.
d) Show that $[\sqrt{n^2 + 2n}] = n$ and that the infinite continued fraction expansion of $\sqrt{n^2 + 2n}$ is $[n; \overline{1, 2n}]$.

1-23. Find the fundamental solutions of Pell's equation $x^2 - dy^2 = 1$ for each of the values $d = 5, 6, 8, 11, 12, 13, 31, 83$. In each case find as many other solutions as you can.
CHAPTER 2
= xn = (x)n .
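Example 2.3. Let $R = \mathbb{Q}, \mathbb{R}, \mathbb{C}$. Then each of these choices gives a group $(\mathrm{GL}_2(R), \ast)$ with
$$\mathrm{GL}_2(R) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} : a, b, c, d \in R,\ ad - bc \neq 0 \right\},$$
$$\ast = \text{multiplication of matrices}, \qquad \iota = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I_2,$$
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \begin{pmatrix} \dfrac{d}{ad - bc} & -\dfrac{b}{ad - bc} \\[2mm] -\dfrac{c}{ad - bc} & \dfrac{a}{ad - bc} \end{pmatrix}.$$

Example 2.4. Let $X$ be a finite set and let $\mathrm{Perm}(X)$ be the set of all bijections $f : X \to X$. Then $(\mathrm{Perm}(X), \circ)$ is a group where
$$\circ = \text{composition of functions}, \qquad \iota = \mathrm{Id}_X = \text{the identity function on } X, \qquad f^{-1} = \text{the inverse function of } f.$$
$(\mathrm{Perm}(X), \circ)$ is called the permutation group of $X$. We will study these and other examples in more detail.

If a group $(G, \ast)$ has a finite underlying set $G$, then the number of elements in $G$ is called the order of $G$, written $|G|$.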
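2. Permutation groups

We will follow the ideas of Example 2.4 and consider the standard set with $n$ elements $\mathbf{n} = \{1, 2, \ldots, n\}$. The group $S_n = \mathrm{Perm}(\mathbf{n})$ is called the symmetric group on $n$ objects or the symmetric group of degree $n$ or the permutation group on $n$ objects.

Theorem 2.5. $S_n$ has order $|S_n| = n!$.

Proof. Defining an element $\sigma \in S_n$ is equivalent to specifying the list $\sigma(1), \sigma(2), \ldots, \sigma(n)$ consisting of the $n$ numbers $1, 2, \ldots, n$ taken in some order with no repetitions. To do this we have
$n$ choices for $\sigma(1)$,
$n - 1$ choices for $\sigma(2)$ (taken from the remaining $n - 1$ elements),
and so on. In all, this gives $n \times (n - 1) \times \cdots \times 2 \times 1 = n!$ choices for $\sigma$, so $|S_n| = n!$ as claimed.

We will often describe $\sigma$ using the notation
$$\sigma = \begin{pmatrix} 1 & 2 & \ldots & n \\ \sigma(1) & \sigma(2) & \ldots & \sigma(n) \end{pmatrix}.$$
For example, the six elements of $S_3$ are
$$\iota = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}, \quad \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}, \quad \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}, \quad \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}, \quad \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}, \quad \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}.$$

We can calculate the composition $\tau\sigma$ of two permutations $\tau, \sigma \in S_n$, where $\tau\sigma(k) = \tau(\sigma(k))$. Notice that we apply $\sigma$ to $k$ first then apply $\tau$ to the result $\sigma(k)$. For example,
$$\begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix} = \iota.$$
In particular,
$$\begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}^{-1}.$$

Let $X$ be a set with exactly $n$ elements which we list in some order, $x_1, x_2, \ldots, x_n$. Then there is an action of $S_n$ on $X$ given by
$$\sigma x_k = x_{\sigma(k)} \quad (\sigma \in S_n,\ k = 1, 2, \ldots, n).$$
For example, if $X = \{A, B, C\}$ we can take $x_1 = A$, $x_2 = B$, $x_3 = C$ and so
$$\begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix} A = B, \qquad \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix} B = C, \qquad \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix} C = A.$$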
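Composition, inversion and the action on a listed set are easy to experiment with. In the sketch below a permutation $\sigma \in S_n$ is stored as the tuple $(\sigma(1), \ldots, \sigma(n))$, i.e., the bottom row of the two-row notation; this representation and the function names `compose`, `inverse` and `act` are illustrative assumptions.

```python
from itertools import permutations


def compose(tau, sigma):
    """Return tau∘sigma, the permutation k -> tau(sigma(k))."""
    return tuple(tau[sigma[k] - 1] for k in range(len(sigma)))


def inverse(sigma):
    """Return the inverse permutation of sigma."""
    inv = [0] * len(sigma)
    for k, image in enumerate(sigma, start=1):
        inv[image - 1] = k
    return tuple(inv)


def act(sigma, xs, x):
    """Apply sigma to x, where xs lists X as x_1, ..., x_n and sigma x_k = x_sigma(k)."""
    k = xs.index(x) + 1
    return xs[sigma[k - 1] - 1]


print(compose((3, 2, 1), (3, 1, 2)))
print(inverse((3, 1, 2)))
print([act((2, 3, 1), ["A", "B", "C"], x) for x in "ABC"])
print(len(list(permutations(range(1, 5)))))  # |S_4|
```

Note the order of application in `compose`: the right-hand factor acts first, matching the convention $\tau\sigma(k) = \tau(\sigma(k))$.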
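Often it is useful to display the effect of a permutation $\sigma : X \to X$ by indicating where each element is sent by $\sigma$ with the aid of arrows. To do this we display the elements of $X$ in two similar rows with an arrow joining $x_i$ in the first row to $\sigma(x_i)$ in the second. For example, the permutation $\sigma = \begin{pmatrix} A & B & C \\ B & C & A \end{pmatrix}$ acting on $X = \{A, B, C\}$ can be displayed as

[Arrow diagram: two rows A B C, with arrows A → B, B → C, C → A.]

We can compose permutations by composing the arrows. Thus
$$\begin{pmatrix} A & B & C \\ C & A & B \end{pmatrix} \begin{pmatrix} A & B & C \\ B & C & A \end{pmatrix}$$
can be determined from the diagram

[Arrow diagram: three rows A B C; the upper arrows are A → B, B → C, C → A and the lower arrows are A → C, B → A, C → B.]

which gives
$$\begin{pmatrix} A & B & C \\ C & A & B \end{pmatrix} \begin{pmatrix} A & B & C \\ B & C & A \end{pmatrix} = \begin{pmatrix} A & B & C \\ A & B & C \end{pmatrix}.$$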
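3. The sign of a permutation

Let $\sigma \in S_n$ and consider the arrow diagram of $\sigma$ as above. Let $c_\sigma$ be the number of crossings of arrows. The sign of $\sigma$ is the number
$$\operatorname{sgn} \sigma = (-1)^{c_\sigma} = \begin{cases} +1 & \text{if $c_\sigma$ is even}, \\ -1 & \text{if $c_\sigma$ is odd}. \end{cases}$$
Then $\operatorname{sgn} : S_n \to \{+1, -1\}$. Notice that $\{+1, -1\}$ is actually a group under multiplication.

Proposition 2.7. The function $\operatorname{sgn} : S_n \to \{+1, -1\}$ satisfies
$$\operatorname{sgn}(\tau\sigma) = \operatorname{sgn}(\tau)\operatorname{sgn}(\sigma) \quad (\tau, \sigma \in S_n).$$

Proof. By considering the arrow diagram for $\tau\sigma$ obtained by joining the diagrams for $\sigma$ and $\tau$, we see that the total number of crossings is $c_\sigma + c_\tau$. We now straighten out the paths starting at each number in the top row, changing the total number of crossings by 2 each time. So $(-1)^{c_\sigma + c_\tau} = (-1)^{c_{\tau\sigma}}$.

A permutation $\sigma$ is called even if $\operatorname{sgn} \sigma = 1$, otherwise it is odd. The set of all even permutations in $S_n$ is denoted by $A_n$. Notice that $\iota \in A_n$ and in fact the following result is true.

Proposition 2.8. The set $A_n$ forms a group under composition.

Proof. By Proposition 2.7, if $\sigma, \tau \in A_n$, then
$$\operatorname{sgn}(\tau\sigma) = \operatorname{sgn}(\tau)\operatorname{sgn}(\sigma) = 1.$$
Note also that $\iota \in A_n$. The arrow diagram for $\sigma^{-1}$ is obtained from that for $\sigma$ by interchanging the rows and reversing all the arrows, so $\operatorname{sgn} \sigma^{-1} = \operatorname{sgn} \sigma$. Thus if $\sigma \in A_n$, then $\operatorname{sgn} \sigma^{-1} = 1$. Hence, $A_n$ is a group under composition.

$A_n$ is called the $n$-th alternating group.

Example 2.9. The elements of $A_3$ are
$$\iota = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}, \quad \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}, \quad \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}.$$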
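The sign can be computed by counting crossings directly. The sketch below assumes the arrows are drawn as straight segments, so that the arrows from $i$ and $j$ (with $i < j$) cross exactly when $\sigma(i) > \sigma(j)$; permutations are stored as tuples $(\sigma(1), \ldots, \sigma(n))$, and all function names are illustrative.

```python
from itertools import permutations


def crossings(sigma):
    """Number of pairs i < j whose straight arrows cross, i.e. sigma(i) > sigma(j)."""
    n = len(sigma)
    return sum(1 for i in range(n) for j in range(i + 1, n) if sigma[i] > sigma[j])


def sgn(sigma):
    """Sign of sigma: +1 for an even number of crossings, -1 for an odd number."""
    return -1 if crossings(sigma) % 2 else 1


def compose(tau, sigma):
    """Return tau∘sigma, the permutation k -> tau(sigma(k))."""
    return tuple(tau[sigma[k] - 1] for k in range(len(sigma)))


S3 = list(permutations((1, 2, 3)))
A3 = [s for s in S3 if sgn(s) == 1]
print(A3)
# Check Proposition 2.7 over all pairs in S_3.
print(all(sgn(compose(t, s)) == sgn(t) * sgn(s) for t in S3 for s in S3))
```

Exhaustive checks of this kind are only practical for small $n$, since $|S_n| = n!$.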
4. The cycle type of a permutation

Suppose $\sigma \in S_n$. Now carry out the following steps.

Form the sequence
$$1 \to \sigma(1) \to \sigma^2(1) \to \cdots \to \sigma^{r_1 - 1}(1) \to \sigma^{r_1}(1) = 1$$
where $\sigma^k(j) = \sigma(\sigma^{k-1}(j))$ and $r_1$ is the smallest positive power for which this is true.

Take the smallest number $k_2 = 1, 2, \ldots, n$ for which $k_2 \neq \sigma^t(1)$ for every $t$. Form the sequence
$$k_2 \to \sigma(k_2) \to \sigma^2(k_2) \to \cdots \to \sigma^{r_2 - 1}(k_2) \to \sigma^{r_2}(k_2) = k_2$$
where $r_2$ is the smallest positive power for which this is true.

Repeat this with $k_3 = 1, 2, \ldots, n$ being the smallest number which does not appear in either of the previous sequences, i.e., for which $k_3 \neq \sigma^t(1)$ and $k_3 \neq \sigma^t(k_2)$ for every $t$.

...

Writing $k_1 = 1$, we end up with a collection of disjoint cycles
$$k_1 \to \sigma(k_1) \to \sigma^2(k_1) \to \cdots \to \sigma^{r_1 - 1}(k_1) \to \sigma^{r_1}(k_1) = k_1$$
$$k_2 \to \sigma(k_2) \to \sigma^2(k_2) \to \cdots \to \sigma^{r_2 - 1}(k_2) \to \sigma^{r_2}(k_2) = k_2$$
$$\vdots$$
$$k_d \to \sigma(k_d) \to \sigma^2(k_d) \to \cdots \to \sigma^{r_d - 1}(k_d) \to \sigma^{r_d}(k_d) = k_d$$
in which every number $k = 1, 2, \ldots, n$ occurs in exactly one row. The $s$-th one of these cycles can be viewed as corresponding to the permutation of $\mathbf{n}$ which behaves according to the action of $\sigma$ on the elements that appear as $\sigma^t(k_s)$ and fixes every other element. We indicate this permutation using the cycle notation
$$(k_s \ \sigma(k_s) \ \cdots \ \sigma^{r_s - 1}(k_s)).$$
Then we have
$$\sigma = (k_1 \ \sigma(k_1) \ \cdots \ \sigma^{r_1 - 1}(k_1)) \cdots (k_d \ \sigma(k_d) \ \cdots \ \sigma^{r_d - 1}(k_d)),$$
which is the disjoint cycle decomposition of $\sigma$. It is unique apart from the order of the factors and the order in which the numbers within each cycle occur. For example, in $S_4$,
$$(1\ 2)(3\ 4) = (2\ 1)(4\ 3) = (3\ 4)(1\ 2) = (4\ 3)(2\ 1),$$
$$(1\ 2\ 3)(4) = (3\ 1\ 2)(4) = (2\ 3\ 1)(4) = (4)(1\ 2\ 3) = (4)(3\ 1\ 2) = (4)(2\ 3\ 1).$$
We usually leave out cycles of length 1, so for example $(1\ 2\ 3)(4) = (1\ 2\ 3)$.

Recall that when performing elementary row operations (EROs) on $n \times n$ matrices, one of the types involves interchanging a pair of rows, say rows $r$ and $s$; this operation is denoted by $R_r \leftrightarrow R_s$. The corresponding elementary matrix $E(R_r \leftrightarrow R_s)$ is obtained from the identity matrix $I_n$ by performing this operation.

In fact, we can do a sequence of such operations to obtain any permutation matrix $P_\sigma = [p_{ij}]$, whose rows are obtained by applying the permutation $\sigma \in S_n$ to those of $I_n$ so that
$$p_{ij} = \delta_{\sigma(i)\, j} = \begin{cases} 1 & \text{if } j = \sigma(i), \\ 0 & \text{if } j \neq \sigma(i). \end{cases}$$
For example, if $\sigma = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}$, then
$$P_\sigma = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}.$$
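The step-by-step procedure at the start of this section, and the rule defining a permutation matrix, can both be written out as short functions. This is a sketch only, with permutations stored as tuples $(\sigma(1), \ldots, \sigma(n))$; the names `disjoint_cycles` and `permutation_matrix` are hypothetical.

```python
def disjoint_cycles(sigma):
    """Return the disjoint cycles of sigma, each started at its smallest unused number."""
    seen = set()
    cycles = []
    for start in range(1, len(sigma) + 1):
        if start in seen:
            continue
        cycle = []
        k = start
        while k not in seen:      # follow k -> sigma(k) until the cycle closes
            seen.add(k)
            cycle.append(k)
            k = sigma[k - 1]
        cycles.append(tuple(cycle))
    return cycles


def permutation_matrix(sigma):
    """Return the matrix with entry 1 in position (i, j) exactly when j = sigma(i)."""
    n = len(sigma)
    return [[1 if j == sigma[i - 1] else 0 for j in range(1, n + 1)]
            for i in range(1, n + 1)]


print(disjoint_cycles((2, 1, 4, 3)))
print(permutation_matrix((2, 3, 1)))
```

Cycles of length 1 are kept in the output here; dropping them recovers the usual shortened notation.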
Proposition 2.10. For $\sigma \in S_n$, $\det P_\sigma = \operatorname{sgn} \sigma$.

A permutation $\tau \in S_n$ which interchanges two elements of $\mathbf{n}$ and leaves the rest fixed is called a transposition.

Proposition 2.11. Let $\sigma \in S_n$. Then there are transpositions $\tau_1, \ldots, \tau_k$ such that $\sigma = \tau_1 \cdots \tau_k$.

One way to decompose a permutation into transpositions is to first decompose it into disjoint cycles then use the easily checked formula
$$(i_1 \ i_2 \ \ldots \ i_r) = (i_1 \ i_r) \cdots (i_1 \ i_3)(i_1 \ i_2). \tag{2.1}$$

Example 2.12. Decompose
$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 5 & 3 & 1 & 4 \end{pmatrix} \in S_5$$
into a product of transpositions.

Solution. We have
$$\sigma = (3)(1\ 2\ 5\ 4) = (1\ 2\ 5\ 4) = (1\ 4)(1\ 5)(1\ 2).$$
Some alternative decompositions are
$$\sigma = (2\ 1)(2\ 4)(2\ 5) = (5\ 2)(5\ 1)(5\ 4).$$

5. Symmetry groups

Let $S$ be a set of points in $\mathbb{R}^n$, where $n = 1, 2, 3, \ldots$. A symmetry of $S$ is a surjection $\varphi : S \to S$ which preserves distances, i.e.,
$$|\varphi(u) - \varphi(v)| = |u - v| \quad (u, v \in S).$$
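Theorem 2.13. Let $\varphi$ be a symmetry of $S \subseteq \mathbb{R}^n$. Then
a) $\varphi$ is a bijection and $\varphi^{-1}$ is also a symmetry of $S$;
b) $\varphi$ preserves distances between points and angles between lines joining points.

Corollary 2.14. Let $S \subseteq \mathbb{R}^n$. Then the set $\mathrm{Sym}(S)$ of all symmetries of $S$ is a group under composition.

Example 2.15. Let $\triangle \subseteq \mathbb{R}^2$ be an equilateral triangle with vertices $A, B, C$.

[Figure: equilateral triangle with centre O and vertices A, B, C.]

Then a symmetry is defined once we know where the vertices go, hence there are as many symmetries as permutations of the set $\{A, B, C\}$. Each symmetry can be described using permutation notation and we obtain the 6 symmetries
$$\iota = \begin{pmatrix} A & B & C \\ A & B & C \end{pmatrix}, \quad \begin{pmatrix} A & B & C \\ B & C & A \end{pmatrix}, \quad \begin{pmatrix} A & B & C \\ C & A & B \end{pmatrix}, \quad \begin{pmatrix} A & B & C \\ A & C & B \end{pmatrix}, \quad \begin{pmatrix} A & B & C \\ C & B & A \end{pmatrix}, \quad \begin{pmatrix} A & B & C \\ B & A & C \end{pmatrix}.$$
Therefore we have $|\mathrm{Sym}(\triangle)| = 6$.

Example 2.16. Let $\square \subseteq \mathbb{R}^2$ be the square centred at the origin $O$ with vertices at $A(1, 1)$, $B(-1, 1)$, $C(-1, -1)$, $D(1, -1)$.

[Figure: square with vertices B (top left), A (top right), C (bottom left), D (bottom right).]

Then a symmetry is defined by sending $A$ to any one of the 4 vertices then choosing how to send $B$ to one of the 2 adjacent vertices. This gives a total of $4 \times 2 = 8$ such symmetries, thus $|\mathrm{Sym}(\square)| = 8$. Again we can describe symmetries in terms of their effect on the vertices. Here are the 8 elements of $\mathrm{Sym}(\square)$ described in permutation notation.
$$\iota = \begin{pmatrix} A & B & C & D \\ A & B & C & D \end{pmatrix}, \quad \begin{pmatrix} A & B & C & D \\ A & D & C & B \end{pmatrix}, \quad \begin{pmatrix} A & B & C & D \\ B & C & D & A \end{pmatrix}, \quad \begin{pmatrix} A & B & C & D \\ D & C & B & A \end{pmatrix},$$
$$\begin{pmatrix} A & B & C & D \\ C & D & A & B \end{pmatrix}, \quad \begin{pmatrix} A & B & C & D \\ C & B & A & D \end{pmatrix}, \quad \begin{pmatrix} A & B & C & D \\ D & A & B & C \end{pmatrix}, \quad \begin{pmatrix} A & B & C & D \\ B & A & D & C \end{pmatrix}.$$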
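Example 2.17. Let $R \subseteq \mathbb{R}^2$ be the rectangle centred at the origin $O$ with vertices at $A(2, 1)$, $B(-2, 1)$, $C(-2, -1)$, $D(2, -1)$.

[Figure: rectangle with centre O and vertices B (top left), A (top right), C (bottom left), D (bottom right).]

A symmetry can send $A$ to any of the vertices, and then the long edge $AB$ must go to the longer of the adjacent edges. This gives a total of 4 such symmetries, thus $|\mathrm{Sym}(R)| = 4$. Again we can describe symmetries in terms of their effect on the vertices. Here are the 4 elements of $\mathrm{Sym}(R)$ described in permutation notation.
$$\iota = \begin{pmatrix} A & B & C & D \\ A & B & C & D \end{pmatrix}, \quad \begin{pmatrix} A & B & C & D \\ B & A & D & C \end{pmatrix}, \quad \begin{pmatrix} A & B & C & D \\ C & D & A & B \end{pmatrix}, \quad \begin{pmatrix} A & B & C & D \\ D & C & B & A \end{pmatrix}.$$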
Given a regular $n$-gon (i.e., a regular polygon with $n$ sides all of the same length and $n$ vertices $V_1, V_2, \ldots, V_n$) the symmetry group is the dihedral group of order $2n$, $D_{2n}$, with elements
$$\iota, \alpha, \alpha^2, \ldots, \alpha^{n-1}, \beta, \alpha\beta, \alpha^2\beta, \ldots, \alpha^{n-1}\beta$$
where $\alpha^k$ is an anticlockwise rotation through $2\pi k/n$ about the centre and $\beta$ is a reflection in the line through $V_1$ and the centre. Moreover we have
$$|\alpha| = n, \quad |\beta| = 2, \quad \beta\alpha\beta = \alpha^{n-1} = \alpha^{-1}.$$
In permutation notation this becomes
$$\alpha = (V_1 \ V_2 \ \cdots \ V_n),$$
but $\beta$ is more complicated to describe. For example, if $n = 6$ we have
$$\alpha = (V_1 \ V_2 \ V_3 \ V_4 \ V_5 \ V_6), \quad \beta = (V_2 \ V_6)(V_3 \ V_5),$$
while if $n = 7$
$$\alpha = (V_1 \ V_2 \ V_3 \ V_4 \ V_5 \ V_6 \ V_7), \quad \beta = (V_2 \ V_7)(V_3 \ V_6)(V_4 \ V_5).$$
We have seen that when $n = 3$, $\mathrm{Sym}(\triangle)$ is the permutation group of the vertices and so $D_6$ is essentially the same group as $S_3$.

6. Subgroups and Lagrange's Theorem

Let $(G, \ast)$ be a group and $H \subseteq G$. Then $H$ is a subgroup of $G$ if $(H, \ast)$ is a group. In detail this means that
for $x, y \in H$, $x \ast y \in H$;
$\iota \in H$;
if $z \in H$ then $z^{-1} \in H$.
We write $H \leq G$ whenever $H$ is a subgroup of $G$ and $H < G$ if $H \neq G$, i.e., $H$ is a proper subgroup of $G$.

Example 2.18. For $n \in \mathbb{Z}^+$, $A_n$ is a subgroup of $S_n$, i.e., $A_n \leq S_n$.
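By Example 2.3, for each choice of $R = \mathbb{Q}, \mathbb{R}, \mathbb{C}$, there is a group $(\mathrm{GL}_2(R), \ast)$ with
$$\mathrm{GL}_2(R) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} : a, b, c, d \in R,\ ad - bc \neq 0 \right\}.$$

Example 2.19. Let
$$\mathrm{SL}_2(R) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} : a, b, c, d \in R,\ ad - bc = 1 \right\} \subseteq \mathrm{GL}_2(R).$$
Then $\mathrm{SL}_2(R) \leq \mathrm{GL}_2(R)$.

Solution. This follows easily with aid of the three identities
$$\det \begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc; \qquad \det(AB) = \det A \det B; \qquad \begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}.$$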
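Let $(G, \ast)$ be a group. From now on, if $x, y \in G$ we will write $xy$ for $x \ast y$. Also, for $n \in \mathbb{Z}$ we write
$$x^n = \begin{cases} x(x^{n-1}) & \text{if } n > 0, \\ \iota & \text{if } n = 0, \\ (x^{-1})^{-n} & \text{if } n < 0. \end{cases}$$
If $g \in G$,
$$\langle g \rangle = \{g^n : n \in \mathbb{Z}\} \subseteq G$$
is a subgroup of $G$ called the subgroup generated by $g$. This follows from the three equations
$$g^m g^n = g^{m+n}; \qquad \iota = g^0; \qquad (g^n)^{-1} = g^{-n}.$$
If $\langle g \rangle$ is finite and contains exactly $n$ elements then $g$ is said to have finite order $|g| = n$. If $\langle g \rangle$ is infinite then $g$ is said to have infinite order $|g| = \infty$.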
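Proposition 2.20. If $g \in G$ has finite order $|g|$ then
$$|g| = \min\{m \in \mathbb{Z}^+ : g^m = \iota\}.$$

Example 2.21. In the group $S_n$ the cyclic permutation $(i_1 \ i_2 \ \cdots \ i_r)$ of length $r$ has order $|(i_1 \ i_2 \ \cdots \ i_r)| = r$.

Solution. Setting $\sigma = (i_1 \ i_2 \ \cdots \ i_r)$, we have
$$\sigma^k(i_1) = \begin{cases} i_{k+1} & \text{if } k < r, \\ i_1 & \text{if } k = r, \end{cases}$$
and similarly $\sigma^r$ fixes every other $i_j$, hence $|\sigma| \leq r$. As $i_k \neq i_1$ for $1 < k \leq r$, we have $\sigma^k \neq \iota$ for $1 \leq k < r$, and therefore $|\sigma| = r$.

So for example, $|(1\ 2)| = 2$, $|(1\ 2\ 3)| = 3$ and $|(1\ 2\ 3\ 4)| = 4$. But notice that the product $(1\ 2)(3\ 4\ 5)$ satisfies
$$((1\ 2)(3\ 4\ 5))^2 = (1\ 2)(3\ 4\ 5)(1\ 2)(3\ 4\ 5) = (3\ 5\ 4),$$
hence $|(1\ 2)(3\ 4\ 5)| = 6$. On the other hand, the product $(1\ 2)(2\ 3\ 4)$ satisfies
$$((1\ 2)(2\ 3\ 4))^2 = (1\ 2)(2\ 3\ 4)(1\ 2)(2\ 3\ 4) = (1\ 3)(2\ 4)$$
so $|((1\ 2)(2\ 3\ 4))^2| = |(1\ 3)(2\ 4)| = 2$ and hence $|(1\ 2)(2\ 3\ 4)| = 4$.

A group $(G, \ast)$ is called cyclic if there is an element $c \in G$ such that $G = \langle c \rangle$; such a $c$ is called a generator of $G$. Notice that for such a group, $|G| = |c|$.

Example 2.22. The group $(\mathbb{Z}, +)$ is cyclic of infinite order with generators $\pm 1$.

Example 2.23. If $0 < n \in \mathbb{N}_0$, then the group $(\mathbb{Z}/n, +)$ is cyclic of finite order $n$. Two generators are $\pm 1_n \in \mathbb{Z}/n$. More generally, $t_n$ is a generator if and only if $\gcd(t, n) = 1$.

Solution. We have that for each $k \in \mathbb{Z}$,
$$k_n = \pm(1_n + 1_n + \cdots + 1_n) \quad \text{(with $|k|$ summands)}.$$
From this we see that $\pm 1_n$ are obvious generators and so $\mathbb{Z}/n = \langle \pm 1_n \rangle$. If $\gcd(t, n) = 1$, then by Theorem 1.9, there is an integer $u$ such that $ut \equiv 1 \pmod{n}$. Hence $1_n \in \langle t_n \rangle$ and so $\mathbb{Z}/n = \langle t_n \rangle$. Conversely, if $\mathbb{Z}/n = \langle t_n \rangle$ then for some $k \in \mathbb{N}_0$ we have
$$1_n = t_n + \cdots + t_n \quad \text{(with $k$ summands)}$$
and so $kt \equiv 1 \pmod{n}$, hence $kt + \ell n = 1$ for some $\ell \in \mathbb{Z}$. But this implies $\gcd(t, n) \mid 1$, hence $\gcd(t, n) = 1$.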
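The criterion of Example 2.23 can be checked by brute force. In this sketch residues modulo $n$ are represented by the integers $0, 1, \ldots, n - 1$; the function names `generated_subgroup` and `generators` are illustrative.

```python
from math import gcd


def generated_subgroup(t, n):
    """Return the subgroup of (Z/n, +) generated by t, as a set of residues."""
    elements = set()
    x = 0
    while True:
        elements.add(x)
        x = (x + t) % n   # add t repeatedly until we return to 0
        if x == 0:
            break
    return elements


def generators(n):
    """Return the residues t in 0..n-1 that generate the whole of Z/n."""
    return [t for t in range(n) if len(generated_subgroup(t, n)) == n]


print(generators(12))
print([t for t in range(12) if gcd(t, 12) == 1])
```

The two printed lists agree, as Example 2.23 predicts; the second is much cheaper to compute for large $n$.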
The Euler $\varphi$-function $\varphi : \mathbb{Z}^+ \to \mathbb{N}_0$ is defined by
$$\varphi(n) = \text{number of generators of } \mathbb{Z}/n = \text{number of elements } t_n \in \mathbb{Z}/n \text{ with } \gcd(t, n) = 1.$$
In order to state some properties of $\varphi$, we need to introduce some notation. For a positive natural number $n$ and a function $f$ defined on the positive natural numbers, the symbol
$$\sum_{d \mid n} f(d)$$
denotes the sum of all the numbers $f(d)$ where $d$ ranges over all the positive integer divisors of $n$, including 1 and $n$. For example,
$$\sum_{d \mid 6} f(d) = f(1) + f(2) + f(3) + f(6).$$

Theorem 2.24. The Euler function $\varphi$ enjoys the following properties:
a) $\varphi(1) = 1$;
b) if $\gcd(m, n) = 1$ then $\varphi(mn) = \varphi(m)\varphi(n)$;
c) if $p$ is a prime and $r \geq 1$ then $\varphi(p^r) = (p - 1)p^{r-1}$;
d) for a non-zero natural number $n$,
$$\sum_{d \mid n} \varphi(d) = n.$$
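The four properties in Theorem 2.24 are easy to test numerically. The sketch below computes $\varphi$ straight from its definition, which is slow but transparent; the names `phi` and `divisors` are illustrative, and residues are represented by $0, 1, \ldots, n - 1$.

```python
from math import gcd


def phi(n):
    """Euler's function: the number of t in 0..n-1 with gcd(t, n) == 1."""
    return sum(1 for t in range(n) if gcd(t, n) == 1)


def divisors(n):
    """All positive divisors of n, including 1 and n."""
    return [d for d in range(1, n + 1) if n % d == 0]


print(phi(1))                               # property (a)
print(phi(8) * phi(15), phi(120))           # property (b) with m = 8, n = 15
print(phi(27), (3 - 1) * 3 ** 2)            # property (c) with p = 3, r = 3
print(sum(phi(d) for d in divisors(120)))   # property (d) with n = 120
```

For large $n$ one would instead factorize $n$ and combine properties (b) and (c).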
For example,
$$\varphi(120) = \varphi(8 \cdot 3 \cdot 5) = \varphi(8)\varphi(3)\varphi(5) = \varphi(2^3)\varphi(3)\varphi(5) = 2^2 \cdot 2 \cdot 4 = 2^5 = 32.$$
The next result is actually a consequence of Lagrange's Theorem which follows immediately after it and is of great importance in the study of finite groups.

Proposition 2.25. Let $G$ be a finite group and let $g \in G$. Then $g$ has finite order and $|g|$ divides $|G|$.

Theorem 2.26 (Lagrange's Theorem). Let $(G, \ast)$ be a finite group and $H \leq G$. Then $|H|$ divides $|G|$.

Proof. The idea is to divide up $G$ into disjoint subsets of size $|H|$. We do this by defining for each $x \in G$ the left coset of $x$ with respect to $H$,
$$xH = \{g \in G : x^{-1}g \in H\} = \{g \in G : g = xh \text{ for some } h \in H\}.$$
We need the following facts.
i) For $x, y \in G$, $xH \cap yH \neq \emptyset \iff xH = yH$.
This is seen as follows. If $xH = yH$ then $xH \cap yH \neq \emptyset$. Conversely, suppose that $xH \cap yH \neq \emptyset$. If $yh \in xH$ for some $h \in H$, then $x^{-1}yh \in H$. For $k \in H$, $x^{-1}yk = (x^{-1}yh)(h^{-1}k)$, which is in $H$ since $x^{-1}yh, h^{-1}k \in H$ and $H$ is a subgroup of $G$. Hence $yH \subseteq xH$. Repeating this argument with $x$ and $y$ interchanged we also see that $xH \subseteq yH$. Combining these inclusions we obtain $xH = yH$.
ii) For each $g \in G$, $|gH| = |H|$.
If $gh = gk$ for $h, k \in H$ then $g^{-1}(gh) = g^{-1}(gk)$ and so $h = k$. Thus there is a bijection
$$\theta : H \to gH; \quad \theta(h) = gh,$$
which implies that the sets $H$ and $gH$ have the same number of elements.
Thus every element $g \in G$ lies in exactly one such coset $gH$. Thus $G$ is the union of these disjoint cosets which all have size $|H|$. Denoting the number of these cosets by $[G : H]$ we have $|G| = |H|[G : H]$.

The number $[G : H]$ of cosets of $H$ in $G$ is called the index of $H$ in $G$. The set of all cosets of $H$ in $G$ is denoted $G/H$, i.e.,
$$G/H = \{gH : g \in G\}.$$

Corollary 2.27. If $G$ is a finite group and $H \leq G$, then
$$|G| = |H|\,|G/H| = |H|[G : H].$$
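The coset argument can be watched in a small case. The sketch below takes $G = S_3$ and the subgroup $H$ generated by the transposition $(1\ 2)$, both chosen purely for illustration, and lists the distinct left cosets; permutations are stored as tuples $(\sigma(1), \sigma(2), \sigma(3))$ and the function names are hypothetical.

```python
from itertools import permutations


def compose(tau, sigma):
    """Return tau∘sigma, the permutation k -> tau(sigma(k))."""
    return tuple(tau[sigma[k] - 1] for k in range(len(sigma)))


def left_cosets(G, H):
    """Return the distinct left cosets xH = {xh : h in H} as frozensets."""
    cosets = []
    for x in G:
        coset = frozenset(compose(x, h) for h in H)
        if coset not in cosets:
            cosets.append(coset)
    return cosets


G = list(permutations((1, 2, 3)))   # the group S_3
H = [(1, 2, 3), (2, 1, 3)]          # the identity and the transposition (1 2)
cosets = left_cosets(G, H)
for c in cosets:
    print(sorted(c))
print(len(G), "=", len(H), "x", len(cosets))
```

Each element of $G$ appears in exactly one printed coset and every coset has $|H| = 2$ elements, so the final line is an instance of $|G| = |H|[G : H]$.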
Proposition 2.25 now follows easily by taking $H = \langle g \rangle$ and using the fact that $|g| = |H|$.

This allows us to give a promised proof of a number theoretic result, the Primitive Element Theorem 1.27. Indeed the following generalisation is true.

Theorem 2.28. Let $G$ be a group of finite order $n = |G|$ and suppose that for each divisor $d$ of $n$ there are at most $d$ elements of $G$ satisfying $x^d = \iota$. Then $G$ is cyclic and so abelian.

Proof. Let $\psi(d)$ denote the number of elements in $G$ of order $d$. By Proposition 2.25, $\psi(d) = 0$ unless $d$ divides $|G|$. Since
\[ G = \bigcup_{d \mid |G|} \{g \in G : |g| = d\}, \]
we have
\[ |G| = \sum_{d \mid |G|} \psi(d). \]
By Theorem 2.24(d), the Euler function $\varphi$ satisfies $|G| = \sum_{d \mid |G|} \varphi(d)$, hence
\[ \sum_{d \mid |G|} \psi(d) = \sum_{d \mid |G|} \varphi(d). \tag{2.2} \]
We will show that for each divisor $d$ of $|G|$, $\psi(d) \leq \varphi(d)$. For each such $d$, we have $\psi(d) \geq 0$. If $\psi(d) = 0$ then $\psi(d) < \varphi(d)$, since the latter is positive. So suppose that $\psi(d) > 0$, hence there is an element $a \in G$ of order $d$. In fact, the distinct powers $\iota = a^0, a, a^2, \ldots, a^{d-1}$ are all solutions of the equation $x^d = \iota$ and indeed, by assumption on $G$, they must be the only such solutions since there are $d$ of them. But now an element $a^k \in \langle a \rangle$ with $k = 0, 1, 2, \ldots, d-1$ has order $d$ precisely if $\gcd(d, k) = 1$, since this requires $\langle a^k \rangle = \langle a \rangle$ and so for some $u \in \mathbb{Z}$, $uk \equiv 1 \pmod{d}$, which happens precisely when $\gcd(d, k) = 1$ as we know from Theorem 1.9. By the definition of $\varphi$, there are $\varphi(d)$ such elements in $\langle a \rangle$, hence $\psi(d) = \varphi(d)$. Thus we have shown that in all cases $\psi(d) \leq \varphi(d)$. Notice that if $\psi(d) < \varphi(d)$ for some $d$ dividing $|G|$, this would give a strict inequality in place of Equation (2.2). Hence we must always have $\psi(d) = \varphi(d)$. In particular, there are $\varphi(n) \geq 1$ elements of order $n$, hence there must be an element of order $n$, so $G$ is cyclic.

Taking $G = U_p$, the group of invertible elements of $\mathbb{Z}/p$ under multiplication, we obtain Theorem 1.27.

7. Group actions

If $X$ is a set and $(G, *)$ is a group, then a (group) action of $(G, *)$ on $X$ is a rule which assigns to each $g \in G$ and $x \in X$ an element $gx \in X$ so that the following conditions are satisfied.
GpAc1 For all $g_1, g_2 \in G$ and $x \in X$, $(g_1 * g_2)x = g_1(g_2 x)$.
GpAc2 For $x \in X$, $\iota x = x$.
Thus each $g \in G$ can be viewed as acting as a permutation of $X$.

Example 2.29. Let $G \leq S_n$ and let $X = \mathbf{n} = \{1, \ldots, n\}$. For $\sigma \in G$ and $k \in \mathbf{n}$ let $\sigma k = \sigma(k)$. This defines an action of $(G, \circ)$ on $\mathbf{n}$.

Example 2.30. Let $X \subseteq \mathbb{R}^n$ and let $G \leq \mathrm{Sym}(X)$ be a subgroup of the symmetry group of $X$. For $\varphi \in G$ and $x \in X$, let $\varphi x = \varphi(x)$. This defines an action of $(G, \circ)$ on $X$.

Suppose we have an action of a group $(G, *)$ on a set $X$. For $x \in X$, the stabilizer of $x$ is
\[ \mathrm{Stab}_G(x) = \{g \in G : gx = x\} \subseteq G, \]
and the orbit of $x$ is
\[ \mathrm{Orb}_G(x) = \{gx : g \in G\} \subseteq X. \]
Notice that $\iota x = x$, so $x \in \mathrm{Orb}_G(x)$ and $\iota \in \mathrm{Stab}_G(x)$. Thus $\mathrm{Stab}_G(x) \neq \emptyset$ and $\mathrm{Orb}_G(x) \neq \emptyset$.

Theorem 2.31.
For each $x, y \in X$,
a) $\mathrm{Stab}_G(x) \leq G$;
b) $y \in \mathrm{Orb}_G(x)$ if and only if $x \in \mathrm{Orb}_G(y)$;
c) $y \in \mathrm{Orb}_G(x)$ if and only if $\mathrm{Orb}_G(y) = \mathrm{Orb}_G(x)$.

Proof. a) If $g_1, g_2 \in \mathrm{Stab}_G(x)$ then by GpAc1, $(g_1 * g_2)x = g_1(g_2 x) = g_1 x = x$, hence $g_1 * g_2 \in \mathrm{Stab}_G(x)$. By GpAc2, $\iota x = x$, hence $\iota \in \mathrm{Stab}_G(x)$. Finally, if $g \in \mathrm{Stab}_G(x)$ then by GpAc1 and GpAc2,
\[ g^{-1}x = g^{-1}(gx) = (g^{-1} * g)x = \iota x = x, \]
hence $g^{-1} \in \mathrm{Stab}_G(x)$. So $\mathrm{Stab}_G(x) \leq G$.
b) If $y \in \mathrm{Orb}_G(x)$, then $y = gx$ for some $g \in G$. Hence $x = (g^{-1} * g)x = g^{-1}(gx) = g^{-1}y$ and so $x \in \mathrm{Orb}_G(y)$. The converse is similar.
c) If $y \in \mathrm{Orb}_G(x)$ then by (b), $x \in \mathrm{Orb}_G(y)$ and so $x = ky$ for some $k \in G$. Hence if $g \in G$, $gx = g(ky) = (g * k)y \in \mathrm{Orb}_G(y)$ and so $\mathrm{Orb}_G(x) \subseteq \mathrm{Orb}_G(y)$. Interchanging the roles of $x$ and $y$, we also have $\mathrm{Orb}_G(y) \subseteq \mathrm{Orb}_G(x)$. This gives $\mathrm{Orb}_G(y) = \mathrm{Orb}_G(x)$. Conversely, if $\mathrm{Orb}_G(y) = \mathrm{Orb}_G(x)$ then $y \in \mathrm{Orb}_G(y) = \mathrm{Orb}_G(x)$.

Example 2.32. Let $X = \square$ be the square with vertices $A, B, C, D$ and let $G = \mathrm{Sym}(\square)$. Determine $\mathrm{Stab}_G(x)$ and $\mathrm{Orb}_G(x)$ where
a) $x$ is the vertex $A$;
b) $x$ is the midpoint $M$ of $AB$;
c) $x$ is the point $P$ on $AB$ where $AP : PB = 1 : 3$.

Solution. Recall Example 2.16. We will write permutations of the vertices in cycle notation.
a) We have $\mathrm{Stab}_G(A) = \{\iota, (B\ D)\}$. Also, every vertex can be obtained from $A$ by applying a suitable symmetry, hence $\mathrm{Orb}_G(A) = \{A, B, C, D\}$.
b) A symmetry fixes the midpoint of $AB$ if and only if it maps this edge to itself. The symmetries $\sigma$ doing this have one of the effects $\sigma(A) = A$, $\sigma(B) = B$ or $\sigma(A) = B$, $\sigma(B) = A$. Thus $\mathrm{Stab}_G(M) = \{\iota, (A\ B)(C\ D)\}$. Also, we can arrange to send $A$ to any other vertex and $B$ to either of the adjacent vertices of the image of $A$, hence the orbit of $M$ consists of the set of 4 midpoints of edges.
c) A symmetry $\sigma$ can only fix $P$ if it sends $A$ to a vertex $A'$ say, and $B$ to a vertex $B'$ with $A'P : PB' = 1 : 3$, and this is only possible if $A' = A$ and $B' = B$, hence $\sigma$ must also fix $A$, $B$. So $\mathrm{Stab}_G(P) = \{\iota\}$. On the other hand, since we can select a symmetry to send $A$ to any other vertex and $B$ to either of the adjacent vertices to the image, $P$ can be sent to any of the points $Q$ which cut an edge in the ratio $1 : 3$. So the orbit of $P$ is the set consisting of these 8 points.

Theorem 2.33 (Orbit-Stabilizer Theorem). Let $(G, *)$ act on $X$.
Then for $x \in X$ there is a bijection
\[ F : G/\mathrm{Stab}_G(x) \to \mathrm{Orb}_G(x) \]
between the set of cosets of $\mathrm{Stab}_G(x)$ in $G$ and the orbit of $x$, defined by $F(g\,\mathrm{Stab}_G(x)) = gx$. Moreover we have
\[ F((t * g)\,\mathrm{Stab}_G(x)) = tF(g\,\mathrm{Stab}_G(x)) \quad (t \in G). \]

Proof. We begin by checking that $F$ is well defined. If $g_1\,\mathrm{Stab}_G(x) = g_2\,\mathrm{Stab}_G(x)$, then $g_1^{-1} * g_2 \in \mathrm{Stab}_G(x)$ and
\[ g_1 x = g_1((g_1^{-1} * g_2)x) = (g_1 * g_1^{-1} * g_2)x = g_2 x. \]
Hence $F$ is well defined. Notice that $gx = kx$ if and only if $(g^{-1} * k)x = x$, i.e., $g^{-1} * k \in \mathrm{Stab}_G(x)$, which means that $g\,\mathrm{Stab}_G(x) = k\,\mathrm{Stab}_G(x)$. So $F$ is an injection. Also, every $y \in \mathrm{Orb}_G(x)$ has the form $tx = F(t\,\mathrm{Stab}_G(x))$ for some $t \in G$, which shows that $F$ is surjective. The final equation property is a consequence of the definition of $F$.

Corollary 2.34. If $G$ is finite then for each $x \in X$,
\[ |\mathrm{Orb}_G(x)| = \frac{|G|}{|\mathrm{Stab}_G(x)|}. \]
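The relation $|G| = |\mathrm{Orb}_G(x)|\,|\mathrm{Stab}_G(x)|$ can be verified computationally for the square of Example 2.32. This is a hedged sketch: labelling the vertices $A, B, C, D$ as 0, 1, 2, 3 in cyclic order, and the helper names `generate`, `orbit`, `stabilizer`, `act_vertex` and `act_edge`, are assumptions made for illustration only. Acting on the edge $AB$ stands in for acting on its midpoint $M$.

```python
def compose(p, q):
    """Return p∘q (apply q first, then p); permutations are tuples of images."""
    return tuple(p[q[i]] for i in range(len(q)))

def generate(gens):
    """Close a list of permutations under composition (adequate for small finite groups)."""
    identity = tuple(range(len(gens[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

# Assumed labelling: vertices A, B, C, D are 0, 1, 2, 3 in cyclic order.
rotation = (1, 2, 3, 0)      # quarter turn: vertex i goes to i + 1 (mod 4)
reflection = (0, 3, 2, 1)    # fixes A and C, swaps B and D, i.e. the permutation (B D)
G = generate([rotation, reflection])

def orbit(G, act, x):
    return {act(g, x) for g in G}

def stabilizer(G, act, x):
    return {g for g in G if act(g, x) == x}

def act_vertex(g, v):
    return g[v]

def act_edge(g, e):
    # An edge is a set of two vertices; g moves it by moving its endpoints.
    return frozenset(g[v] for v in e)

A = 0
AB = frozenset({0, 1})
for act, x in [(act_vertex, A), (act_edge, AB)]:
    print(len(G), len(orbit(G, act, x)), len(stabilizer(G, act, x)))
```

Both lines of output should satisfy group order = orbit size × stabilizer size, in agreement with Corollary 2.34.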
The sizes of the orbits in Example 2.32 can be found using this result.

Theorem 2.35. The orbits of an action of $(G, *)$ on $X$ decompose $X$ into a union of disjoint subsets,
\[ X = \bigcup_{U \text{ an orbit}} U. \]

Corollary 2.36. If $X$ is finite, then
\[ |X| = \sum_{U \text{ an orbit}} |U|. \]

In these results, each orbit $U$ has the form $\mathrm{Orb}_G(x_U)$ for some element $x_U \in X$. Moreover, if $G$ is finite, then
\[ |U| = [G : \mathrm{Stab}_G(x_U)] = \frac{|G|}{|\mathrm{Stab}_G(x_U)|}. \]
The formula in Corollary 2.36 becomes the orbit-stabilizer equation:
\[ |X| = \sum_{U \text{ an orbit}} \frac{|G|}{|\mathrm{Stab}_G(x_U)|}. \tag{2.3} \]
If there is only one orbit, then the action is said to be transitive, and in this case, for any $x \in X$ we have $X = \mathrm{Orb}_G(x)$ and $|X| = |G|/|\mathrm{Stab}_G(x)|$.

Given an action of $(G, *)$ on $X$, another useful idea is that of the fixed point set or fixed set of an element $g \in G$,
\[ \mathrm{Fix}_G(g) = \{x \in X : gx = x\}. \]
$\mathrm{Fix}_G(g)$ is also often denoted $X^g$.

Theorem 2.37 (Burnside Formula). If $(G, *)$ acts on $X$ with $G$ and $X$ finite, then
\[ \text{number of orbits} = \frac{1}{|G|} \sum_{g \in G} |\mathrm{Fix}_G(g)|. \]
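Proof. Counting the pairs $(g, x)$ with $gx = x$ in two ways, and using $|\mathrm{Stab}_G(x)| = |G|/|U|$ for every $x$ in an orbit $U$ (Corollary 2.34), we have
\begin{align*}
\frac{1}{|G|} \sum_{g \in G} |\mathrm{Fix}_G(g)|
&= \frac{1}{|G|} \sum_{g \in G} \sum_{x \in \mathrm{Fix}_G(g)} 1 \\
&= \frac{1}{|G|} \sum_{x \in X} \sum_{g \in \mathrm{Stab}_G(x)} 1 \\
&= \frac{1}{|G|} \sum_{x \in X} |\mathrm{Stab}_G(x)| \\
&= \frac{1}{|G|} \sum_{U = \mathrm{Orb}_G(x_U) \text{ an orbit}} |U|\,|\mathrm{Stab}_G(x_U)| \\
&= \frac{1}{|G|} \sum_{U \text{ an orbit}} |G| \\
&= \sum_{U \text{ an orbit}} 1 \\
&= \text{number of orbits}.
\end{align*}

Example 2.38. Let $X = \{1, 2, 3, 4\}$ and let $G \leq S_4$ be the subgroup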
$G = \{\iota, (1\ 2), (3\ 4), (1\ 2)(3\ 4)\}$ acting on $X$ in the obvious way. How many orbits does this action have?

Solution. Here $|G| = 4 = |X|$. Furthermore we have
\[ \mathrm{Fix}_G(\iota) = X, \quad \mathrm{Fix}_G((1\ 2)) = \{3, 4\}, \quad \mathrm{Fix}_G((3\ 4)) = \{1, 2\}, \quad \mathrm{Fix}_G((1\ 2)(3\ 4)) = \emptyset. \]
By the Burnside Formula,
\[ \text{number of orbits} = \frac{1}{4}(4 + 2 + 2 + 0) = 2. \]
So there are 2 orbits, namely $\{1, 2\}$ and $\{3, 4\}$.

Example 2.39. Let $X = \{1, 2, 3, 4, 5, 6\}$ and let $G = \langle (1\ 2\ 3)(4\ 5) \rangle \leq S_6$ be the cyclic subgroup acting on $X$ in the obvious way. How many orbits does this action have?

Solution. Here $|G| = 6$ and $|X| = 6$. The elements of $G$ are
\[ \iota,\ (1\ 2\ 3)(4\ 5),\ (1\ 3\ 2),\ (4\ 5),\ (1\ 2\ 3),\ (1\ 3\ 2)(4\ 5). \]
The fixed sets of these are
\[ \mathrm{Fix}_G(\iota) = X, \quad \mathrm{Fix}_G((1\ 2\ 3)(4\ 5)) = \mathrm{Fix}_G((1\ 3\ 2)(4\ 5)) = \{6\}, \]
\[ \mathrm{Fix}_G((4\ 5)) = \{1, 2, 3, 6\}, \quad \mathrm{Fix}_G((1\ 2\ 3)) = \mathrm{Fix}_G((1\ 3\ 2)) = \{4, 5, 6\}. \]
By the Burnside Formula,
\[ \text{number of orbits} = \frac{1}{6}(6 + 1 + 3 + 4 + 3 + 1) = \frac{18}{6} = 3. \]
So there are 3 orbits, namely $\{1, 2, 3\}$, $\{4, 5\}$ and $\{6\}$.

Example 2.40. A dinner party of seven people is to sit around a circular table with seven seats. How many distinguishable ways are there to do this if there is to be no 'head of table'?

Solution. View the seven places as numbered 1 to 7. There are $7!$ ways to arrange the diners in these places. Take $X$ to be the set of all possible such arrangements, so $|X| = 7!$. Regard two such arrangements as indistinguishable if one is obtained from the other by a rotation of the diners around the places. Clearly there are 7 such rotations, each involving everyone moving $k$ seats to the right for some $k = 0, 1, \ldots, 6$. Let $\alpha$ denote the rotation corresponding to everyone moving one seat to the right. Then to get everyone to move $k$ seats we repeatedly apply $\alpha$, $k$ times in all, i.e., $\alpha^k$. This suggests we should consider the group
\[ G = \{\iota, \alpha, \alpha^2, \alpha^3, \alpha^4, \alpha^5, \alpha^6\} \]
consisting of all of these operations, with composition as the binary operation. This provides an action of $G$ on $X$. The number of distinguishable seating plans is the number of orbits under this action, i.e.,
\[ \frac{1}{|G|} \sum_{g \in G} |\mathrm{Fix}_G(g)|. \]
Notice that apart from the identity element, no rotation can fix any arrangement, so when $g \neq \iota$, $\mathrm{Fix}_G(g) = \emptyset$, while $\mathrm{Fix}_G(\iota) = X$. Hence the number of distinguishable seating plans is $7!/7 = 6! = 720$.

Example 2.41. Find the number of distinguishable ways there are to colour the edges of an equilateral triangle using four different colours, where each colour can be used on more than one edge.

Solution. Let $X$ be the set of all possible such colourings of the equilateral triangle $ABC$ whose symmetry group is $G = S_3$, which we view as the permutation group of $\{A, B, C\}$; hence $|G| = 6$. Also $|X| = 4^3 = 64$ since each edge can be coloured in 4 ways. $G$ acts on $X$ in the obvious way. A pair of colourings is indistinguishable precisely if they are in the same orbit. By the Burnside formula, the number of distinguishable colourings is given by
\[ \text{number of orbits} = \frac{1}{6} \sum_{\sigma \in G} |\mathrm{Fix}_G(\sigma)|. \]
The fixed sets of elements of the various cycle types in $G$ are as follows.
Identity element: $\mathrm{Fix}_G(\iota) = X$, $|\mathrm{Fix}_G(\iota)| = 64$.
3-cycles (i.e., $\sigma = (A\ B\ C), (A\ C\ B)$): these give rotations and can only fix a colouring that has all sides the same colour, hence $|\mathrm{Fix}_G(\sigma)| = 4$.
2-cycles (i.e., $\sigma = (A\ B), (A\ C), (B\ C)$): each of these gives a reflection in a line through a vertex and the midpoint of the opposite edge. For example, $(A\ B)$ fixes $C$ and interchanges the edges $AC$, $BC$; it will therefore fix any colouring that has these edges the same colour. There are $4 \times 4 = 16$ of these, so $|\mathrm{Fix}_G((A\ B))| = 16$. Similarly for the other 2-cycles.
By the Burnside formula,
\[ \text{number of distinguishable colourings} = \frac{1}{6}(64 + 2 \times 4 + 3 \times 16) = \frac{120}{6} = 20. \]
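The Burnside counts in Examples 2.39 and 2.41 can be reproduced by brute force. The sketch below is illustrative only: the function names `count_orbits`, `act_colouring` and `powers`, the 0-based labelling of points, and the convention that edge $i$ of the triangle is the edge opposite vertex $i$ are all assumptions introduced for this example.

```python
from itertools import permutations, product

def count_orbits(group, points, act):
    """Burnside Formula: the average number of fixed points over the group."""
    total = sum(sum(1 for x in points if act(g, x) == x) for g in group)
    return total // len(group)

# Example 2.41 (assumed encoding): vertices 0, 1, 2; edge i is the edge opposite vertex i.
S3 = list(permutations(range(3)))
colourings = list(product(range(4), repeat=3))   # c[i] = colour of edge i, 4 colours

def act_colouring(g, c):
    # g sends vertex i to g[i], hence the edge opposite i to the edge opposite g[i].
    new = [None] * 3
    for i in range(3):
        new[g[i]] = c[i]
    return tuple(new)

# Example 2.39 with points relabelled 0..5: sigma corresponds to (1 2 3)(4 5).
sigma = (1, 2, 0, 4, 3, 5)

def powers(s):
    """List the elements of the cyclic group generated by the permutation s."""
    identity = tuple(range(len(s)))
    group, g = [identity], s
    while g != identity:
        group.append(g)
        g = tuple(s[i] for i in g)   # next power: s∘g
    return group

C = powers(sigma)

print(count_orbits(S3, colourings, act_colouring))       # colourings of the triangle
print(count_orbits(C, range(6), lambda g, x: g[x]))      # orbits in Example 2.39
```

The two printed values should agree with the hand computations above (20 colourings and 3 orbits).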
Problem Set 2 2-1. Which of the following pairs (G, ) forms a group? (a) (b) (c) (d) (e) (f) (g) G = {x Z : x = 0}, G = {x Q : x = 0}, G= G= G= G= a b : a, b R, a2 + b2 = 1 , b a z w : z, w C, |z |2 + |w|2 = 1 , w z a b : a, b, c, d Z, ad bc = 0 , c d a b : a, b, c, d Z, ad bc = 1 , c d = ; = ; = multiplication of matrices; = multiplication of matrices; = multiplication of matrices; = multiplication of matrices; = composition of functions.
G = { Sn : (n) = n},
2-2. For each of the following permutations in S6 , determine its sign and decompose it into disjoint cycles: = 1 2 3 4 5 6 , 3 4 2 5 6 1 = 1 2 3 4 5 6 , 3 6 4 1 5 2 = 1 2 3 4 5 6 . 3 4 1 6 2 5
2-3. Find the orders of the symmetry groups of the following geometric objects, and in each case try to describe the symmetry groups as groups of permutations: a) a regular pentagon; b) a regular hexagon; c) a regular hexagon with vertices alternately coloured red and green; d) a regular hexagon with edges alternately coloured red and green; e) a cube; f) a cube with the pairs of opposite faces coloured red, green and blue respectively. 2-4. [Challenge question.] Suppose Tet is a regular tetrahedron with vertices A, B, C, D. a) Show that the symmetry group Sym(Tet) of Tet can be identied with the symmetric group S4 which acts by permuting the vertices. b) For each pair of distinct vertices P, Q, how many symmetries map the edge P Q into itself? Show that these symmetries form a group. c) Find a geometric interpretation of the alternating group A4 acting as symmetries of Tet. 2-5. In each of the following groups (G, ) decide whether the subset H is a subgroup of G and when it is, decide whether it is cyclic. a) G = {x Q : x = 0}, H = {x G : x > 0}, = ; b) G = {x Q : x = 0}, H = {x G : x < 0}, = ; c) G = {x Q : x = 0}, H = {x G : x2 = 1}, = ; d) G = {x C : x = 0}, H = {x G : xd = 1}, = ; e) G = {z C : z = 0}, H = {z G : |z | < }, = ; z w : z, w C, |z |2 + |w|2 = 1 , H = {A G : |A| < }, f) G = w z = matrix multiplication; a b a b g) G = : a, b, c, d R, ad bc = 0 , H = A G : A = , c d 0 d
= matrix multiplication; h) G = Sym( ), H = the subset of rotations in G, = composition of functions. 2-6. Using Lagranges Theorem, nd all possible orders of elements of each of the following groups and decide whether there are indeed elements of those orders: Z/6, S3 , A3 , S4 , A4 , D8 , D10 . 2-7. [Challenge question ] Let G be a group. Show that each of the following subsets of G is a subgroup: a) CG (x) = {c G : cx = xc}, where x G is any element; b) Z(G) = {c G : cg = gc for all g G}; c) NG (H ) = {n G : for every h H , nhn1 H , and n1 hn H }, where H any subgroup. 2-8. Using Lagranges Theorem, nd all subgroups of each of the groups Z/6, S3 , A3 , S4 , A4 , D8 , D10 . 2-9. Let G = S4 and let X denote the set consisting of all subsets of 4 = {1, 2, 3, 4}. For S4 and U X , let U = { (u) X : u U }. a) Show that this denes an action of G on X . b) For each of the following elements U of X , nd OrbG (U ) and StabG (U ): , {1}, {1, 2}, {1, 2, 3}, {1, 2, 3, 4}. c) For each of the following elements of G nd FixG (g ): , (1 2), (1 2 3), (1 2 3 4), (1 2)(3 4). 2-10. Let G = GL2 (R) be the group of 2 2 invertible real matrices under matrix multiplication and let X = R2 be the set of all real column vectors of length 2. For A G and x X let Ax be the usual product. a) Show that this denes an action of G on X . b) Find the orbit and stabilizer of each the following vectors: 0 1 0 1 , , , . 0 0 1 1 c) For each of the following matrices A nd FixG (A): 1 1 1 1 2 0 sin cos cos sin 0 1 u 0 , , , , , , , 0 1 0 5 0 3 cos sin sin cos 1 0 0 u where , u R with u = 0. 2-11. [Challenge question ] Using the same group G = GL2 (R) and notation as in the previous question, let Y denote the set of all lines through the origin in R2 . For A G and L Y , let AL = {Ax R2 : x L}. a) Show that AL is always a line and that this denes an action of G on Y . b) For each of the following vectors v nd the line Lv through the origin containing it and nd the orbit and stabilizer of Lv : 1 0 1 , , . 
0 1 1 c) For each of the matrices A in (c) of the previous question, nd FixG (A) for this action.
G is
2-12. Let $G = \mathrm{Sym}(\mathrm{Tet})$ be the symmetry group of the regular tetrahedron $\mathrm{Tet}$ with vertices $A, B, C, D$. Let $X$ denote the set of edges of $\mathrm{Tet}$. For $\varphi \in G$ and $E \in X$ let $\varphi E = \{\varphi(P) \in \mathrm{Tet} : P \in E\}$.
a) Show that $\varphi E$ is an edge and that this defines an action of $G$ on $X$.
b) Find $\mathrm{Orb}_G(E)$ and $\mathrm{Stab}_G(E)$ for the edge $AB$.
c) For each of the following elements of $G$ find $\mathrm{Fix}_G(g)$: $\iota$, $(A\ B)$, $(A\ B\ C)$, $(A\ B\ C\ D)$, $(A\ B)(C\ D)$.

2-13. Let $X = \{1, 2, 3, 4, 5, 6, 7, 8\}$ and let $G = \langle (1\ 2\ 3\ 4\ 5\ 6)(7\ 8) \rangle$ be the cyclic subgroup of $S_8$ acting on $X$ in the obvious way. How many orbits does this action of $G$ have?

2-14. How many distinguishable 5-bead circular necklaces can be made where each bead has to be a different colour chosen from 5 colours? Here two such necklaces are deemed to be indistinguishable if one can be obtained from the other by a combination of rotations and flips. What if the number of colours used is 6? 7? 8? What if we only allow rotations between indistinguishable necklaces?

2-15. How many distinguishable regular tetrahedral dice can be made where each face has one of the numbers 1, 2, 3, 4 on it? Here two such dice are deemed to be indistinguishable if one can be obtained from the other by a rotation. What about if we allow arbitrary symmetries between indistinguishable such dice?
CHAPTER 3
Arithmetic functions
1. Definition and examples of arithmetic functions

Let $\mathbb{Z}^+ = \mathbb{N}_0 - \{0\}$ be the set of positive integers. A function $\psi : \mathbb{Z}^+ \to \mathbb{R}$ (or $\psi : \mathbb{Z}^+ \to \mathbb{C}$) is called a real (or complex) arithmetic function if $\psi(1) = 1$. There are many important and interesting examples.

Example 3.1. The following are all real arithmetic functions:
a) The identity function $\mathrm{id} : \mathbb{Z}^+ \to \mathbb{R}$; $\mathrm{id}(n) = n$.
b) The Euler function $\varphi : \mathbb{Z}^+ \to \mathbb{R}$ of Theorem 2.24.
c) For each positive natural number $r$, $\sigma_r : \mathbb{Z}^+ \to \mathbb{R}$; $\sigma_r(n) = \sum_{d \mid n} d^r$. $\sigma_1$ is often denoted $\sigma$; $\sigma(n)$ is equal to the sum of the (positive) divisors of $n$.
d) The function given by $\eta : \mathbb{Z}^+ \to \mathbb{R}$; $\eta(n) = 1$.
e) The function given by $\delta : \mathbb{Z}^+ \to \mathbb{R}$; $\delta(n) = 1$ if $n = 1$, and $\delta(n) = 0$ otherwise.

The set of all real (or complex) arithmetic functions will be denoted by $\mathrm{AF}_{\mathbb{R}}$ (or $\mathrm{AF}_{\mathbb{C}}$).

An arithmetic function $\psi$ is called (strictly) multiplicative if $\psi(mn) = \psi(m)\psi(n)$ whenever $\gcd(m, n) = 1$. By Theorem 2.24(b), the Euler function is strictly multiplicative. In fact, each of the functions in Example 3.1 is strictly multiplicative.

An important example is the Möbius function $\mu : \mathbb{Z}^+ \to \mathbb{R}$ defined as follows. If $n \in \mathbb{Z}^+$ then by the Fundamental Theorem of Arithmetic and Corollary 1.19, we have the prime power factorization $n = p_1^{r_1} p_2^{r_2} \cdots p_t^{r_t}$, where for each $j$, $p_j$ is a prime, $1 \leq r_j$ and $2 \leq p_1 < p_2 < \cdots < p_t$. We set
\[ \mu(n) = \mu(p_1^{r_1} p_2^{r_2} \cdots p_t^{r_t}) = \begin{cases} 0 & \text{if any } r_j > 1, \\ (-1)^t & \text{if all } r_j = 1. \end{cases} \]
So for example, if $n = p$ is a prime, $\mu(p) = -1$, while $\mu(p^2) = 0$. Also, $\mu(60) = \mu(2^2 \cdot 3 \cdot 5) = 0$.

Proposition 3.2. The Möbius function $\mu$ is multiplicative.

Proof. This follows from the definition and the fact that the prime power factorizations of two coprime natural numbers $m, n$ have no common prime factors.

So for example, $\mu(105) = \mu(3)\mu(5)\mu(7) = (-1)^3 = -1$.
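The definition of the Möbius function translates directly into a short computation. The following sketch uses trial division; the function name `mobius` and the range used for the spot check of multiplicativity are assumptions of this example, not part of the notes.

```python
from math import gcd

def mobius(n):
    """Möbius function: 0 if a square of a prime divides n, else (-1)^(number of prime factors)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:       # p^2 divided the original n
                return 0
            result = -result     # one more distinct prime factor
        p += 1
    if n > 1:                    # a remaining prime factor larger than the square root
        result = -result
    return result

print(mobius(1), mobius(60), mobius(105))

# Spot check of Proposition 3.2 on coprime pairs in a small (arbitrary) range.
multiplicative = all(
    mobius(m * n) == mobius(m) * mobius(n)
    for m in range(1, 60) for n in range(1, 60) if gcd(m, n) == 1
)
print(multiplicative)
```

The printed values for 60 and 105 can be compared with the worked values $\mu(60) = 0$ and $\mu(105) = -1$ above.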
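Proposition 3.3. The Möbius function $\mu$ satisfies $\mu(1) = 1$ and
\[ \sum_{d \mid n} \mu(d) = 0 \quad \text{if } n \geq 2. \]

Proof. By Induction on $r$, the number of prime factors in the prime power factorization of $n = p_1^{r_1} \cdots p_t^{r_t}$, so $r = r_1 + \cdots + r_t$. If $r = 1$, then $n = p$ is prime and $\mu(p) = -1$, hence
\[ \sum_{d \mid p} \mu(d) = 1 - 1 = 0. \]
Assume that the result holds whenever $r < k$. Then if $r = k$, let $n = m p_t^{r_t}$ where $p_t$ is a prime factor of $n$ not dividing $m$. Then $\mu(n) = \mu(m)\mu(p_t^{r_t})$ and, since $\mu(d p_t^s) = 0$ for $s \geq 2$,
\[ \sum_{d \mid n} \mu(d) = \sum_{d \mid m} (\mu(d) + \mu(d p_t)) = \sum_{d \mid m} (\mu(d) + \mu(d)\mu(p_t)) = \sum_{d \mid m} \mu(d)(1 - 1) = 0. \]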
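This gives the Inductive Step.

2. Convolution and Möbius Inversion

Let $\psi, \theta : \mathbb{Z}^+ \to \mathbb{R}$ (or $\mathbb{C}$) be arithmetic functions. The convolution of $\psi$ and $\theta$ is the function $\psi * \theta$ for which
\[ \psi * \theta(n) = \sum_{d \mid n} \psi(d)\theta(n/d). \]

Proposition 3.4. The convolution of two arithmetic functions is an arithmetic function. Moreover, $*$ satisfies
a) for arithmetic functions $\psi, \theta, \omega$, $(\psi * \theta) * \omega = \psi * (\theta * \omega)$;
b) for an arithmetic function $\psi$, $\delta * \psi = \psi = \psi * \delta$;
c) for an arithmetic function $\psi$, there is a unique arithmetic function $\tilde{\psi}$ for which $\psi * \tilde{\psi} = \delta = \tilde{\psi} * \psi$;
d) for two arithmetic functions $\psi, \theta$, $\psi * \theta = \theta * \psi$.
Hence $(\mathrm{AF}_{\mathbb{R}}, *)$ and $(\mathrm{AF}_{\mathbb{C}}, *)$ are commutative groups.

Proof. (a) For $n \in \mathbb{Z}^+$,
\[ (\psi * \theta) * \omega(n) = \sum_{d \mid n} \psi * \theta(d)\,\omega(n/d) = \sum_{d \mid n} \sum_{k \mid d} \psi(k)\theta(d/k)\omega(n/d) = \sum_{k \ell m = n} \psi(k)\theta(\ell)\omega(m), \]
and similarly
\[ \psi * (\theta * \omega)(n) = \sum_{k \ell m = n} \psi(k)\theta(\ell)\omega(m). \]
(b) We have
\[ \delta * \psi(n) = \sum_{d \mid n} \delta(d)\psi(n/d) = \psi(n), \]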
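and similarly $\psi * \delta(n) = \psi(n)$.
(c) Take $t_1 = 1$. We will show by Induction that there are numbers $t_n$ for which
\[ \sum_{d \mid n} t_d\,\psi(n/d) = \delta(n). \]
Suppose that for some $k > 1$ we have such numbers $t_n$ for $n < k$. Consider the equation
\[ \sum_{d \mid k} t_d\,\psi(k/d) = \delta(k) = 0. \]
Rewriting this, using $\psi(1) = 1$, as
\[ t_k = -\sum_{d \mid k,\ d \neq k} t_d\,\psi(k/d), \]
we see that $t_k$ is uniquely determined from this equation. Now define $\tilde{\psi}$ by $\tilde{\psi}(n) = t_n$. By construction,
\[ \tilde{\psi} * \psi(n) = \sum_{d \mid n} \tilde{\psi}(d)\psi(n/d) = \delta(n). \]
(d) For $n \in \mathbb{Z}^+$,
\[ \psi * \theta(n) = \sum_{d \mid n} \psi(d)\theta(n/d) = \sum_{d \mid n} \theta(n/d)\psi(d) = \sum_{k \mid n} \theta(k)\psi(n/k) = \theta * \psi(n). \]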
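In each of the groups $(\mathrm{AF}_{\mathbb{R}}, *)$ and $(\mathrm{AF}_{\mathbb{C}}, *)$, the inverse of an arithmetic function $\psi$ is $\tilde{\psi}$. Here is an important example.

Proposition 3.5. The inverse of $\eta$ is $\tilde{\eta} = \mu$, the Möbius function.

Proof. Recall that $\eta(n) = 1$ for all $n$. By Proposition 3.3 we have
\[ \sum_{d \mid n} \mu(d)\eta(n/d) = \sum_{d \mid n} \mu(d) = \begin{cases} 1 & n = 1, \\ 0 & n > 1. \end{cases} \]
Hence $\mu = \tilde{\eta}$ is the inverse of $\eta$ by the proof of Proposition 3.4(c).

Theorem 3.6 (Möbius Inversion). Let $f, g : \mathbb{Z}^+ \to \mathbb{R}$ (or $f, g : \mathbb{Z}^+ \to \mathbb{C}$) be arithmetic functions satisfying
\[ f(n) = \sum_{d \mid n} g(d) \quad (n \in \mathbb{Z}^+). \]
Then
\[ g(n) = \sum_{d \mid n} f(d)\mu(n/d) \quad (n \in \mathbb{Z}^+). \]

Proof. The hypothesis says that $f = g * \eta$. Hence by Propositions 3.4 and 3.5, $f * \mu = g * \eta * \mu = g * \delta = g$, i.e.,
\[ g(n) = \sum_{d \mid n} f(d)\mu(n/d). \]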
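Example 3.7. Use Möbius Inversion to find a formula for $\varphi(n)$, where $\varphi$ is the Euler function.

Solution. By Theorem 2.24(d),
\[ \sum_{d \mid n} \varphi(d) = n. \]
This can be rewritten as the equation $\varphi * \eta = \mathrm{id}$ where $\mathrm{id}(n) = n$. Applying Möbius Inversion gives $\varphi = \mathrm{id} * \mu$, i.e.,
\[ \varphi(n) = \sum_{d \mid n} \mu(d)\frac{n}{d} = \sum_{d \mid n} \mu(n/d)\,d. \]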
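The formula $\varphi = \mathrm{id} * \mu$ of Example 3.7 can be checked numerically against a direct count of the integers coprime to $n$. The sketch below is illustrative; the helper names `divisors`, `mobius`, `convolve` and `phi_direct` are assumptions introduced here.

```python
from math import gcd

def divisors(n):
    """All positive divisors of n (naive, adequate for small n)."""
    return [d for d in range(1, n + 1) if n % d == 0]

def mobius(n):
    """Möbius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def convolve(f, g):
    """Convolution of arithmetic functions: (f * g)(n) = sum over d | n of f(d) g(n/d)."""
    return lambda n: sum(f(d) * g(n // d) for d in divisors(n))

def phi_direct(n):
    """Euler function counted from the definition."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

phi_by_inversion = convolve(lambda n: n, mobius)    # id * mu

print(phi_by_inversion(120), phi_direct(120))
print(all(phi_by_inversion(n) == phi_direct(n) for n in range(1, 200)))
```

The same `convolve` helper can be used to confirm the starting identity, for instance that `convolve(phi_direct, lambda n: 1)(n)` returns `n`.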
pr
1,
0 s r
66
d,
(d) (n/d) =
d|n
(d)(n/d).
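Proposition 3.9. If $\psi, \theta$ are multiplicative arithmetic functions, then $\psi * \theta$ is multiplicative.

Proof. If $m, n$ are coprime positive integers, then every divisor of $mn$ factors uniquely as $rs$ with $r \mid m$ and $s \mid n$, so
\begin{align*}
\psi * \theta(mn) &= \sum_{d \mid mn} \psi(d)\theta(mn/d) \\
&= \sum_{r \mid m,\ s \mid n} \psi(rs)\,\theta((m/r)(n/s)) \\
&= \sum_{r \mid m,\ s \mid n} \psi(r)\psi(s)\theta(m/r)\theta(n/s) \\
&= \Bigl(\sum_{r \mid m} \psi(r)\theta(m/r)\Bigr) \Bigl(\sum_{s \mid n} \psi(s)\theta(n/s)\Bigr) \\
&= \psi * \theta(m)\ \psi * \theta(n).
\end{align*}
Hence $\psi * \theta$ is multiplicative.

Corollary 3.10. Suppose that $\psi$ is a multiplicative arithmetic function, and $\theta$ is the arithmetic function satisfying
\[ \theta(n) = \sum_{d \mid n} \psi(d) \quad (n \in \mathbb{Z}^+). \]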
Problem Set 3

3-1. Let $\tau : \mathbb{Z}^+ \to \mathbb{R}$ be the function for which $\tau(n)$ is the number of positive divisors of $n$.
a) Show that $\tau$ is an arithmetic function.
b) Suppose that $n = p_1^{r_1} p_2^{r_2} \cdots p_t^{r_t}$ is the prime power factorization of $n$, where $2 \leq p_1 < p_2 < \cdots < p_t$ and $r_j > 0$. Show that
\[ \tau(p_1^{r_1} p_2^{r_2} \cdots p_t^{r_t}) = (r_1 + 1)(r_2 + 1) \cdots (r_t + 1). \]
c) Is $\tau$ multiplicative?
d) Show that $\tau = \eta * \eta$.

3-2. Show that each of the functions $\sigma_r$ ($r \geq 1$) of Example 3.1 is multiplicative.

3-3. For each $r \in \mathbb{N}_0$ define the arithmetic function $[r] : \mathbb{Z}^+ \to \mathbb{R}$ by $[r](n) = n^r$. In particular, $[0] = \eta$ and $[1] = \mathrm{id}$.
a) Show that $[r]$ is multiplicative.
b) If $r > 0$, show that $\sigma_r = [r] * \eta$. Deduce that $\sigma_r$ is multiplicative.
c) Show that $[r] * [r]$ satisfies $[r] * [r](n) = n^r \tau(n)$.
d) Find a general formula for $[r] * [s](n)$ when $s < r$.

3-4. For $n \in \mathbb{Z}^+$, prove the following formulae, where the functions are defined in the text or in earlier questions.
(a) $\sum_{d \mid n} \sigma(d)\mu(n/d) = n$;
(b) $\sum_{d \mid n} \tau(d)\mu(n/d) = 1$;
(c) $\sum_{d \mid n} \sigma_r(d)\mu(n/d) = n^r$.
CHAPTER 4
When $n = 0$, there is exactly one function $\emptyset \to \emptyset$ (the identity function) and this is a bijection; if $m > 0$ then there are no functions $\mathbf{m} \to \emptyset$. So $P(0)$ is true. Suppose that $P(k)$ is true for some $k \in \mathbb{N}_0$ and let $f : \mathbf{m} \to \mathbf{k+1}$ be an injection. We have two cases to consider: (i) $k + 1 \in \operatorname{im} f$, (ii) $k + 1 \notin \operatorname{im} f$.
(i) For some $r \in \mathbf{m}$ we have $f(r) = k + 1$. Consider the function $g : \mathbf{m-1} \to \mathbf{k}$ given by
\[ g(j) = \begin{cases} f(j) & \text{if } 1 \leq j < r, \\ f(j+1) & \text{if } r \leq j \leq m - 1. \end{cases} \]
Then $g$ is an injection, so by the assumption that $P(k)$ is true, $m - 1 \leq k$, hence $m \leq k + 1$.
(ii) Consider the function $h : \mathbf{m} \to \mathbf{k}$ given by $h(j) = f(j)$. Then $h$ is an injection, and by the assumption that $P(k)$ is true, $m \leq k$ and so $m \leq k + 1$.
In either case we have established that $P(k) \Longrightarrow P(k+1)$. By PMI, $P(n)$ is true for all $n \in \mathbb{N}_0$.

b) This time we proceed by Induction on $m$. Consider the statement
$Q(m)$: For $n \in \mathbb{N}_0$, if there is a surjection $\mathbf{m} \to \mathbf{n}$ then $m \geq n$.
When $m = 0$, there is exactly one function $\emptyset \to \emptyset$ (the identity function) and this is a bijection; if $n > 0$ there are no surjections $\emptyset \to \mathbf{n}$. So $Q(0)$ is true. Suppose that $Q(k)$ is true for some $k \in \mathbb{N}_0$ and let $f : \mathbf{k+1} \to \mathbf{n}$ be a surjection. Let $f' : \mathbf{k} \to \mathbf{n}$ be the restriction of $f$ to $\mathbf{k}$, i.e., $f'(j) = f(j)$ for $j \in \mathbf{k}$. There are two cases to deal with: (i) $f'$ is a surjection, (ii) $f'$ is not a surjection.
(i) By the assumption that $Q(k)$ is true, $k \geq n$, which implies that $k + 1 \geq n$.
(ii) There must be exactly one $s \in \mathbf{n}$ not in $\operatorname{im} f'$, namely $s = f(k+1)$. Define $g : \mathbf{k} \to \mathbf{n-1}$ by
\[ g(j) = \begin{cases} f'(j) & \text{if } f'(j) < s, \\ f'(j) - 1 & \text{if } s < f'(j) \leq n. \end{cases} \]
Then $g$ is a surjection, so by the assumption that $Q(k)$ is true, $k \geq n - 1$, hence $k + 1 \geq n$.
In either case, we have established that $Q(k) \Longrightarrow Q(k+1)$. By PMI, $Q(m)$ is true for all $m \in \mathbb{N}_0$.

c) This follows from (a) and (b) since a bijection is both injective and surjective.
Corollary 4.4. Suppose that $X$ is a finite set and suppose that there are bijections $\mathbf{m} \to X$ and $\mathbf{n} \to X$. Then $m = n$.

Proof. Let $f : \mathbf{m} \to X$ and $g : \mathbf{n} \to X$ be bijections. Using the inverse $g^{-1} : X \to \mathbf{n}$, which is also a bijection, we can form a bijection $h = g^{-1} \circ f : \mathbf{m} \to \mathbf{n}$. By Theorem 4.3(c), $m = n$.

For a finite set $X$, the unique $n \in \mathbb{N}_0$ for which there is a bijection $\mathbf{n} \to X$ is called the cardinality of $X$, denoted $|X|$. If $X$ is infinite then we sometimes write $|X| = \infty$, while if $X$ is finite we write $|X| < \infty$.

We reformulate Theorem 4.3 without proof to give some important facts about cardinalities of finite sets.

Theorem 4.5 (Pigeonhole Principle). Let $X, Y$ be two finite sets.
a) If there is an injection $X \to Y$ then $|X| \leq |Y|$.
b) If there is a surjection $X \to Y$ then $|X| \geq |Y|$.
c) If there is a bijection $X \to Y$ then $|X| = |Y|$.

The name Pigeonhole Principle comes from the use of this when distributing $m$ letters into $n$ pigeonholes. If each pigeonhole is to receive at most one letter, $m \leq n$; if each pigeonhole is to receive at least one letter, $m \geq n$.

Let $X$ be a set and $P \subseteq X$. Then $P$ is a proper subset of $X$ if $P \neq X$, i.e., there is an element $x \in X$ with $x \notin P$.

Notice that if $X$ is a finite set and $S$ a subset, then the inclusion function $\mathrm{inc} : S \to X$ given by $\mathrm{inc}(j) = j$ is an injection. So we must have $|S| \leq |X|$. If $P$ is a proper subset then we have $|P| < |X|$ and this implies that there can be no injection $X \to P$ nor a surjection $P \to X$. These conditions actually characterise finite sets. In the next section we investigate how to recognise infinite sets.
2. Infinite sets

Theorem 4.6. Let $X$ be a set.
a) $X$ is infinite if and only if there is an injection $X \to P$ where $P \subseteq X$ is a proper subset.
b) $X$ is infinite if and only if there is a surjection $Q \to X$ where $Q \subseteq X$ is a proper subset.
c) $X$ is infinite if and only if there is an injection $\mathbb{N}_0 \to X$.
d) $X$ is infinite if and only if there is a subset $T \subseteq X$ and an injection $\mathbb{N}_0 \to T$.

Example 4.7. The set of all natural numbers $\mathbb{N}_0 = \{0, 1, 2, \ldots\}$ is infinite.

Solution. Let us take the subset $P = \{1, 2, 3, \ldots\}$ and define a function $f : \mathbb{N}_0 \to P$ by $f(n) = n + 1$, so that
\[ 0 \mapsto 1, \quad 1 \mapsto 2, \quad 2 \mapsto 3, \quad \ldots, \quad n \mapsto n + 1, \quad \ldots \]
If $f(m) = f(n)$ then $m + 1 = n + 1$ so $m = n$, hence $f$ is injective. If $k \in P$ then $k \geq 1$ and so $(k - 1) \geq 0$, implying $(k - 1) \in \mathbb{N}_0$ whence $f(k - 1) = k$. Thus $f$ is also surjective, hence bijective. Since $P$ is a proper subset of $\mathbb{N}_0$, Theorem 4.6(a) shows that $\mathbb{N}_0$ is infinite.

Example 4.8. Show that there are bijections between the set of all natural numbers $\mathbb{N}_0$ and each of the sets
\[ S_1 = \{2n : n \in \mathbb{N}_0\}, \quad S_2 = \{2n + 1 : n \in \mathbb{N}_0\}, \quad S_3 = \{3n : n \in \mathbb{N}_0\}. \]
In each case find a bijection and its inverse.

Solution. For $S_1$, let $f_1 : \mathbb{N}_0 \to S_1$ be given by $f_1(n) = 2n$. Then $f_1$ is a bijection: it is injective since $2n_1 = 2n_2$ implies $n_1 = n_2$, and surjective since given $2m \in S_1$, $f_1(m) = 2m$. The inverse function is given by $f_1^{-1}(k) = k/2$.
For $S_2$, let $f_2 : \mathbb{N}_0 \to S_2$ be given by $f_2(n) = 2n + 1$. Then $f_2$ is a bijection: it is injective ($2n_1 + 1 = 2n_2 + 1$ implies $n_1 = n_2$) and surjective since given $2m + 1 \in S_2$, $f_2(m) = 2m + 1$. The inverse function is given by $f_2^{-1}(k) = (k - 1)/2$.
For $S_3$, let $f_3 : \mathbb{N}_0 \to S_3$ be given by $f_3(n) = 3n$. Then $f_3$ is a bijection: it is injective since $3n_1 = 3n_2$ implies $n_1 = n_2$, and surjective since given $3m \in S_3$, $f_3(m) = 3m$. The inverse function is given by $f_3^{-1}(k) = k/3$.
Notice that each of the sets $S_1, S_2, S_3$ is a proper subset of $\mathbb{N}_0$, yet each is in 1-1 correspondence with $\mathbb{N}_0$ itself.

3. Countable sets

Definition 4.9. A set $X$ is countable if there is a bijection $S \to X$ where either $S = \mathbf{n}$ for some $n \in \mathbb{N}_0$ or $S = \mathbb{N}_0$. A countable infinite set is said to be countably infinite or of cardinality $\aleph_0$. An infinite set which is not countable is said to be uncountable.

Example 4.10. The following sets are countably infinite.
a) Any infinite subset $S \subseteq \mathbb{N}_0$.
b) $X \cup Y$ where $X, Y$ are countably infinite.
c) $X \cup Y$ where $X$ is countably infinite and $Y$ is finite.
d) The set of all ordered pairs of natural numbers $\mathbb{N}_0 \times \mathbb{N}_0 = \{(m, n) : m, n \in \mathbb{N}_0\}$.
e) The set of all positive rational numbers $\mathbb{Q}^+$.
Solution. a) Since $S$ is infinite it cannot be empty. Let $S_0 = S$. By WOP, $S_0$ has a least element $s_0$ say. Now consider the set $S_1 = S - \{s_0\}$; this is not empty since otherwise $S$ would be finite. Again WOP ensures that there is a least element $s_1 \in S_1$. Continuing, we can construct a sequence $s_0, s_1, \ldots, s_n, \ldots$ of elements in $S$ with $s_n$ the least element of $S_n = S - \{s_0, s_1, \ldots, s_{n-1}\}$, which is never empty. Notice in particular that $s_0 < s_1 < \cdots < s_n < \cdots$, from which it easily follows that $s_n \geq n$. If $s \in S$, then some $m \in \mathbb{N}_0$ must satisfy $s_m \geq s$ (for instance $m = s$), so by construction of the $s_n$ we must have $s = s_{m_0}$ for some $m_0$. Hence $S = \{s_n : n \in \mathbb{N}_0\}$. Now define a function $f : \mathbb{N}_0 \to S$ by $f(n) = s_n$; this is easily seen to be a bijection.

b) The simplest case is where $X \cap Y = \emptyset$. Then given bijections $f : \mathbb{N}_0 \to X$ and $g : \mathbb{N}_0 \to Y$ we construct a function $h : \mathbb{N}_0 \to X \cup Y$ by
\[ h(n) = \begin{cases} f\left(\dfrac{n}{2}\right) & \text{if } n \text{ is even}, \\[1ex] g\left(\dfrac{n-1}{2}\right) & \text{if } n \text{ is odd}. \end{cases} \]
Then $h$ is a bijection. If $Z = X \cap Y \neq \emptyset$ and $Y - Z$ is countably infinite, let $f : \mathbb{N}_0 \to X$ and $g : \mathbb{N}_0 \to Y - Z$ be bijections. Then we define $h : \mathbb{N}_0 \to X \cup Y$ by the same formula,
\[ h(n) = \begin{cases} f\left(\dfrac{n}{2}\right) & \text{if } n \text{ is even}, \\[1ex] g\left(\dfrac{n-1}{2}\right) & \text{if } n \text{ is odd}. \end{cases} \]
This is again a bijection. The case where one of $X - X \cap Y$ and $Y - X \cap Y$ is finite is easy to deal with by the method used for (c).

c) Since $Y$ is finite so is $Y - X \cap Y \subseteq Y$. Let $f : \mathbb{N}_0 \to X$ and $g : \mathbf{m} \to Y - X \cap Y$ be bijections. Define $h : \mathbb{N}_0 \to X \cup Y$ by
\[ h(n) = \begin{cases} g(n + 1) & \text{if } 0 \leq n \leq m - 1, \\ f(n - m) & \text{if } m \leq n. \end{cases} \]
Then $h$ is a bijection.

d) Plot each pair $(a, b)$ as the point in the $xy$-plane with coordinates $(a, b)$; such points are all those with natural number coordinates. Starting at $(0, 0)$ we can now trace out a path passing through all of these points and we can arrange to do this without ever recrossing such a point.
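[Figure: a path in the $xy$-plane starting at $(0, 0)$ and passing once through each point with natural number coordinates, such as $(0, 1)$, $(0, 2)$, $(1, 2)$, $(2, 2)$, $(0, 3)$, $(1, 3)$, ...]

This gives a sequence $\{(r_n, s_n)\}_{0 \leq n}$ of elements of $\mathbb{N}_0 \times \mathbb{N}_0$ which contains every pair of natural numbers exactly once. The function
\[ f : \mathbb{N}_0 \to \mathbb{N}_0 \times \mathbb{N}_0; \quad f(n) = (r_n, s_n), \]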
is a bijection.

e) This is demonstrated in a similar way to (d) but is slightly more involved. For each $a/b \in \mathbb{Q}^+$, we can assume that $a, b$ are coprime (i.e., have no common factors) and plot it as the point in the $xy$-plane with coordinates $(a, b)$. Starting at $(1, 1)$ we can now trace out a path passing through all of these points with coprime positive natural number coordinates and can even arrange to do this without ever recrossing such a point.

[Figure: a path in the $xy$-plane starting at $(1, 1)$ through the points with positive natural number coordinates, such as $(1, 2)$, $(2, 2)$, $(3, 2)$, $(4, 2)$, $(1, 3)$, $(2, 3)$, $(3, 3)$, $(1, 4)$, $(2, 4)$, $(5, 1)$, ...]

This gives us a sequence $\{r_n\}_{0 \leq n}$ of elements of $\mathbb{Q}^+$ which contains every element exactly once. The function
\[ f : \mathbb{N}_0 \to \mathbb{Q}^+; \quad f(n) = r_n, \]
is a bijection.

4. Power sets and their cardinality

For two sets $X$ and $Y$, let
\[ Y^X = \{f : f : X \to Y \text{ is a function}\}. \]
Example 4.11. Let $X$ and $Y$ be finite sets. Then $Y^X$ is finite and has cardinality $|Y^X| = |Y|^{|X|}$.

Solution. Suppose that the distinct elements of $X$ are $x_1, \ldots, x_m$ where $m = |X|$ and those of $Y$ are $y_1, \ldots, y_n$ where $n = |Y|$. A function $f : X \to Y$ is determined by specifying the values of the $m$ elements $f(x_1), \ldots, f(x_m)$ of $Y$. Each $f(x_k)$ can be chosen in $n$ ways so the total number of choices is $n^m$. Hence $|Y^X| = n^m$.

A particular case of this occurs when $Y$ has two elements, e.g., $Y = \{0, 1\}$. The set $\{0, 1\}^X$ is called the power set of $X$; when $X$ is finite it has $2^{|X|}$ elements, and indeed it is often denoted $2^X$. It has another important interpretation. For any set $X$, we can consider the set of all its subsets
\[ \mathcal{P}(X) = \{U : U \subseteq X \text{ is a subset}\}. \]
Before stating and proving our next result we introduce the characteristic or indicator function of a subset $U \subseteq X$,
\[ \chi_U : X \to \{0, 1\}; \quad \chi_U(x) = \begin{cases} 1 & \text{if } x \in U, \\ 0 & \text{if } x \notin U. \end{cases} \]
Theorem 4.12. For a set $X$, the function
\[ \Theta : \mathcal{P}(X) \to \{0, 1\}^X; \quad \Theta(U) = \chi_U, \]
is a bijection.

Proof. The indicator function $\chi_U$ of a subset $U \subseteq X$ is clearly determined by $U$, so $\Theta$ is well defined. Also, a function $f \in \{0, 1\}^X$ determines a corresponding subset of $X$,
\[ U_f = \{x \in X : f(x) = 1\}, \]
with $\chi_{U_f} = f$. This shows that $\Theta$ is a bijection whose inverse function satisfies $\Theta^{-1}(f) = U_f$.
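For a small finite set the bijection of Theorem 4.12 can be listed explicitly. The following sketch is only an illustration: the set `X = [1, 2, 3]`, the encoding of a function $X \to \{0, 1\}$ as a tuple of values, and the helper names `subsets`, `indicator` and `subset_of` are assumptions made here.

```python
from itertools import combinations, product

X = [1, 2, 3]   # assumed small example set

def subsets(X):
    """All subsets of X, as frozensets."""
    return [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]

def indicator(U, X):
    """The indicator function of U, encoded as the tuple of its values on X in order."""
    return tuple(1 if x in U else 0 for x in X)

def subset_of(f, X):
    """Inverse direction: the subset U_f = {x in X : f(x) = 1}."""
    return frozenset(x for x, value in zip(X, f) if value == 1)

P = subsets(X)
functions = list(product((0, 1), repeat=len(X)))   # all functions X -> {0, 1}
images = [indicator(U, X) for U in P]

for U in P:
    print(sorted(U), indicator(U, X))
```

Each 0/1 tuple appears exactly once in the output, and `subset_of` recovers the subset from its tuple, which is the content of the theorem in this small case.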
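Example 4.13. If $X$ is finite then $\mathcal{P}(X)$ is finite with cardinality $|\mathcal{P}(X)| = 2^{|X|}$.

Proof. This follows from Example 4.11. Using the standard finite sets $\mathbf{n} = \{1, \ldots, n\}$ ($n \in \mathbb{N}_0$) we have
\[ |\mathcal{P}(\mathbf{0})| = 2^0 = 1, \quad |\mathcal{P}(\mathbf{1})| = 2^1 = 2, \quad |\mathcal{P}(\mathbf{2})| = 2^2 = 4, \quad |\mathcal{P}(\mathbf{3})| = 2^3 = 8, \ldots \]
where
\[ \mathcal{P}(\mathbf{0}) = \{\emptyset\}, \quad \mathcal{P}(\mathbf{1}) = \{\emptyset, \{1\}\}, \quad \mathcal{P}(\mathbf{2}) = \{\emptyset, \{1\}, \{2\}, \{1, 2\}\}, \]
\[ \mathcal{P}(\mathbf{3}) = \{\emptyset, \{1\}, \{2\}, \{3\}, \{1, 2\}, \{1, 3\}, \{2, 3\}, \{1, 2, 3\}\}, \ldots \]

We will now see that for any set $X$ the power set $\mathcal{P}(X)$ is always 'bigger' than $X$.

Theorem 4.14 (Russell's Paradox). For a set $X$, there is no surjection $X \to \mathcal{P}(X)$.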
Proof. Suppose that $g : X \to \mathcal{P}(X)$ is a surjection. Consider the subset
\[ W = \{x \in X : x \notin g(x)\} \subseteq X. \]
Then by surjectivity of $g$ there is a $w \in X$ such that $g(w) = W$. If $w \in W$, then by definition of $W$ we must have $w \notin g(w) = W$, which is impossible. On the other hand, if $w \notin W$, then $w \in g(w) = W$ and again this is impossible. But then $w$ cannot be in $W$ or the complement $X - W$, contradicting the fact that every element of $X$ has to be in one or other of these subsets since $X = W \cup (X - W)$. Thus no such surjection can exist.

When $X$ is finite, this result is not surprising since $2^n > n$ for $n \in \mathbb{N}_0$. For $X$ an infinite set, it leads to the idea that there are different sizes of infinity. Before showing how this result allows us to determine some concrete examples, we give a generalization.

Corollary 4.15. Let $X$ and $Y$ be sets and suppose that $Y$ has a subset $Z \subseteq Y$ which admits a bijection $g : Z \to \mathcal{P}(X)$. Then there is no surjection $X \to Y$.

Proof. Suppose that $f : X \to Y$ is a surjection. Choosing any element $p \in \mathcal{P}(X)$ and defining the function
\[ h : X \to \mathcal{P}(X); \quad h(x) = \begin{cases} g(f(x)) & \text{if } f(x) \in Z, \\ p & \text{if } f(x) \notin Z, \end{cases} \]
we easily see that $h$ is a surjection, contradicting Russell's Paradox. Thus no such surjection can exist.

5. The real numbers are uncountable

Theorem 4.16 (Cantor). The set of real numbers $\mathbb{R}$ is uncountable, i.e., there is no bijection $\mathbb{N}_0 \to \mathbb{R}$.

Proof. Suppose that $\mathbb{R}$ is countable and therefore the obviously infinite subset $(0, 1] \subseteq \mathbb{R}$ is countable. Then we can list the elements of $(0, 1]$:

$$q_0, q_1, \ldots, q_n, \ldots$$

For each $n$ we can uniquely express $q_n$ as a non-terminating infinite decimal expansion

$$q_n = 0.q_{n,1} q_{n,2} \cdots q_{n,k} \cdots,$$

where for each $k$, $q_{n,k} = 0, 1, \ldots, 9$ and for every $k_0$ there is always a $k > k_0$ for which $q_{n,k} \neq 0$. Now define a real number $p \in (0, 1]$ by requiring its decimal expansion

$$p = 0.p_1 p_2 \cdots p_k \cdots$$

to have the property that for each $k \geq 1$,

$$p_k = \begin{cases} 1 & \text{if } q_{k-1,k} \neq 1, \\ 2 & \text{if } q_{k-1,k} = 1. \end{cases}$$

Notice that this is also non-terminating. Then $p \neq q_0$ since $p_1 \neq q_{0,1}$, $p \neq q_1$ since $p_2 \neq q_{1,2}$, etc. So $p$ cannot be in the list of the $q_n$, contradicting the assumption that $(0, 1]$ is countable.

The method of proof used here is often referred to as Cantor's diagonalization argument. In particular this shows that $\mathbb{R}$ is much bigger than the familiar subset $\mathbb{Q} \subseteq \mathbb{R}$; however, it can be hard to identify particular elements of the complement $\mathbb{R} \setminus \mathbb{Q}$. In fact the subset of all real algebraic numbers is countable, where such a real number is a root of a monic polynomial of positive degree,

$$X^n + a_{n-1} X^{n-1} + \cdots + a_0 \in \mathbb{Q}[X].$$
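The diagonal rule in the proof of Theorem 4.16 can be tried out on a finite truncation of a list of decimal expansions. The Python sketch below is a toy illustration under stated assumptions: the function name `diagonal_digits` and the sample digit strings are made up for this example, and a finite check of this kind does not replace the argument for infinite lists. Here `rows[n][k - 1]` plays the role of the digit $q_{n,k}$.

```python
def diagonal_digits(rows):
    """Apply the rule from the proof to a finite list of digit strings.

    rows[n][k - 1] stands for q_{n,k}. The result holds p_1 ... p_m, where
    m = len(rows), p_k = 1 if q_{k-1,k} != 1 and p_k = 2 if q_{k-1,k} = 1.
    Each string must contain at least len(rows) digits.
    """
    return "".join(
        "2" if rows[k - 1][k - 1] == "1" else "1"
        for k in range(1, len(rows) + 1)
    )


# Hypothetical sample data: the first four digits of four made-up expansions.
rows = ["1415", "3333", "9999", "1111"]
p = diagonal_digits(rows)
print("0." + p)

for n, row in enumerate(rows):
    # p differs from q_n in decimal place n + 1, so p is not q_n.
    assert p[n] != row[n]
```

Because every digit of the constructed expansion is 1 or 2, it never ends in a run of zeros, which matches the remark in the proof that the expansion of $p$ is non-terminating.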
Problem Set 4

4-1. Show that each of the following sets is countable:
a) $\mathbb{Z}$, the set of all integers;
b) $\{n^2 : n \in \mathbb{Z}\}$, the set of all integers which are squares of integers;
c) $\{n \in \mathbb{Z} : n \neq 0\}$, the set of all non-zero integers;
d) $\mathbb{Q}$, the set of all rational numbers;
e) $\{x \in \mathbb{R} : x^2 \in \mathbb{Q}\}$, the set of all real numbers which are square roots of rational numbers.
4-2. Show that a subset of a countable set is countable.

4-3. Let $X$ be a countable set. If $Y$ is a finite set, show that the cartesian product $X \times Y = \{(x, y) : x \in X,\ y \in Y\}$ is countable. Use Example 4.10(d) or a modification of its proof to show that this is still true if $Y$ is countably infinite.
Index
1-1, 53
  correspondence, 53
action, 30
  group, 38
alternating group, 31
arithmetic function, 47
back-substitution, 5
bijection, 53
binary operation, 29
Burnside Formula, 40
Cantor's diagonalization argument, 59
cardinality, 54
  $\aleph_0$, 55
characteristic function, 58
common
  divisor, 3
  factor, 3
commutative ring, 3, 9
composite, 11
congruence class, 9
congruent, 8
continued fraction
  expansion, 16
  finite, 16, 17
  generalized finite, 18
  infinite, 17
convergent, 17, 18
convolution, 48
coprime, 3
coset
  left, 37
countable set, 55
countably infinite set, 55
cycle
  decomposition, disjoint, 32
  notation, 32
  type, 32
degree, 13
dihedral group, 34
Diophantine problem, 22
disjoint cycle decomposition, 32
divides, 3
divisor, 3
  common, 3
  greatest common, 3
element
  greatest, 1
  least, 1
  maximal, 1
  minimal, 1
equivalence relation, 8
Euclid's Lemma, 11
Euclidean Algorithm, 4
Euler $\varphi$-function, 36
even permutation, 31
factor
  common, 3
  highest common, 3
factorization
  prime, 12
  prime power, 12
Fermat's Little Theorem, 13
Fibonacci sequence, 18
finite
  continued fraction, 17
  set, 53
fixed
  point set, 40
  set, 40
function
  arithmetic, 47
  characteristic, 58
  indicator, 58
fundamental solution, 24
Fundamental Theorem of Arithmetic, 11
generator, 36
greatest
  common divisor, 3
  element, 1
group, 29
  action, 38
  action, transitive, 40
  alternating, 31
  dihedral, 34
  permutation, 29, 30
  symmetric, 30
highest common factor, 3
Idiot's Binomial Theorem, 14
index, 37
indicator function, 58
infinite
  continued fraction, 17
  set, 1, 53
injection, 53
integer, 3
inverse, 9
irrational, 13
Lagrange's Theorem, 37
least element, 1
left coset, 37
Long Division Property, 3
Möbius function, 47
maximal element, 1
Maximal Principle (MP), 2
minimal element, 1
multiplicative, 47
  strictly, 47
natural numbers, 1
numbers
  natural, 1
odd permutation, 31
one-one, 53
onto, 53
orbit, 38
orbit-stabilizer equation, 40
order, 14, 29
Pell's Equation, 23
period, 22
permutation
  even, 31
  group, 29, 30
  matrix, 32
  odd, 31
  sign of a, 31
Pigeonhole Principle, 53
power set, 58
prime, 11
  factorization, 12
  power factorization, 12
primitive root, 15
Principle of Mathematical Induction (PMI), 1
proper subgroup, 35
proper subset, 54
real algebraic numbers, 59
recurrence relation, 18
represents, 16, 18
residue class, 9
root, 13
  primitive, 15
set
  countable, 55
  countably infinite, 55
  finite, 53
  fixed, 40
  fixed point, 40
  infinite, 1, 53
  of cardinality $\aleph_0$, 55
  power, 58
  standard, 30
  uncountable, 55
sign of a permutation, 31
solution
  fundamental, 24
stabilizer, 38
standard set, 30
strictly multiplicative, 47
subgroup, 35
  generated by $g$, 35
  proper, 35
subset
  proper, 54
surjection, 53
symmetric group, 30
symmetry, 33
tabular method, 6
transitive, 1
  group action, 40
transposition, 33
uncountable set, 55
Well Ordering Principle (WOP), 1
Wilson's Theorem, 15