Mullen G.L., Panario D. - Handbook of Finite Fields
HANDBOOK OF
FINITE FIELDS
Gary L. Mullen
Daniel Panario
DISCRETE
MATHEMATICS
AND ITS APPLICATIONS
Series Editor
Kenneth H. Rosen, Ph.D.
R. B. J. T. Allenby and Alan Slomson, How to Count: An Introduction to Combinatorics,
Third Edition
Craig P. Bauer, Secret History: The Story of Cryptology
Juergen Bierbrauer, Introduction to Coding Theory
Katalin Bimbó, Combinatory Logic: Pure, Applied and Typed
Donald Bindner and Martin Erickson, A Student’s Guide to the Study, Practice, and Tools of
Modern Mathematics
Francine Blanchet-Sadri, Algorithmic Combinatorics on Partial Words
Miklós Bóna, Combinatorics of Permutations, Second Edition
Richard A. Brualdi and Dragos̆ Cvetković, A Combinatorial Approach to Matrix Theory and Its
Applications
Kun-Mao Chao and Bang Ye Wu, Spanning Trees and Optimization Problems
Charalambos A. Charalambides, Enumerative Combinatorics
Gary Chartrand and Ping Zhang, Chromatic Graph Theory
Henri Cohen, Gerhard Frey, et al., Handbook of Elliptic and Hyperelliptic Curve Cryptography
Charles J. Colbourn and Jeffrey H. Dinitz, Handbook of Combinatorial Designs, Second Edition
Abhijit Das, Computational Number Theory
Martin Erickson, Pearls of Discrete Mathematics
Martin Erickson and Anthony Vazzana, Introduction to Number Theory
Steven Furino, Ying Miao, and Jianxing Yin, Frames and Resolvable Designs: Uses,
Constructions, and Existence
Mark S. Gockenbach, Finite-Dimensional Linear Algebra
Randy Goldberg and Lance Riek, A Practical Handbook of Speech Coders
Jacob E. Goodman and Joseph O’Rourke, Handbook of Discrete and Computational Geometry,
Second Edition
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
© 2013 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been
made to publish reliable data and information, but the author and publisher cannot assume responsibility for the valid-
ity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright
holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this
form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may
rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or uti-
lized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopy-
ing, microfilming, and recording, or in any information storage or retrieval system, without written permission from the
publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://
www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923,
978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For
organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for
identification and explanation without intent to infringe.
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
To Bevie Sue, with love,
Gary L. Mullen
Part I: Introduction
Contents
3 Irreducible polynomials
3.1 Counting irreducible polynomials Joseph L. Yucas
3.1.1 Prescribed trace or norm
3.1.2 Prescribed coefficients over the binary field
3.1.3 Self-reciprocal polynomials
3.1.4 Compositions of powers
3.1.5 Translation invariant polynomials
3.1.6 Normal replicators
3.2 Construction of irreducibles Melsik Kyuregyan
3.2.1 Construction by composition
3.2.2 Recursive constructions
3.3 Conditions for reducible polynomials Daniel Panario
3.3.1 Composite polynomials
3.3.2 Swan-type theorems
3.4 Weights of irreducible polynomials Omran Ahmadi
3.4.1 Basic definitions
3.4.2 Existence results
3.4.3 Conjectures
3.5 Prescribed coefficients Stephen D. Cohen
3.5.1 One prescribed coefficient
3.5.2 Prescribed trace and norm
3.5.3 More prescribed coefficients
3.5.4 Further exact expressions
3.6 Multivariate polynomials Xiang-dong Hou
3.6.1 Counting formulas
3.6.2 Asymptotic formulas
3.6.3 Results for the vector degree
3.6.4 Indecomposable polynomials and irreducible polynomials
3.6.5 Algorithms for the gcd of multivariate polynomials
4 Primitive polynomials
4.1 Introduction to primitive polynomials Gary L. Mullen and Daniel Panario
4.2 Prescribed coefficients Stephen D. Cohen
4.2.1 Approaches to results on prescribed coefficients
4.2.2 Existence theorems for primitive polynomials
4.2.3 Existence theorems for primitive normal polynomials
4.3 Weights of primitive polynomials Stephen D. Cohen
4.4 Elements of high order José Felipe Voloch
4.4.1 Elements of high order from elements of small orders
4.4.2 Gao’s construction and a generalization
4.4.3 Iterative constructions
5 Bases
5.1 Duality theory of bases Dieter Jungnickel
5.1.1 Dual bases
5.1.2 Self-dual bases
5.1.3 Weakly self-dual bases
5.1.4 Binary bases with small excess
5.1.5 Almost weakly self-dual bases
5.1.6 Connections to hardware design
14 Combinatorial
14.1 Latin squares Gary L. Mullen
14.1.1 Prime powers
14.1.2 Non-prime powers
14.1.3 Frequency squares
14.1.4 Hypercubes
14.1.5 Connections to affine and projective planes
14.1.6 Other finite field constructions for MOLS
14.2 Lacunary polynomials over finite fields Simeon Ball and Aart Blokhuis
14.2.1 Introduction
14.2.2 Lacunary polynomials
14.2.3 Directions and Rédei polynomials
14.2.4 Sets of points determining few directions
14.2.5 Lacunary polynomials and blocking sets
14.2.6 Lacunary polynomials and blocking sets in planes of prime order
14.2.7 Lacunary polynomials and multiple blocking sets
14.3 Affine and projective planes Gary Ebert and Leo Storme
14.3.1 Projective planes
14.3.2 Affine planes
14.3.3 Translation planes and spreads
14.3.4 Nest planes
14.3.5 Flag-transitive affine planes
14.3.6 Subplanes
14.3.7 Embedded unitals
14.3.8 Maximal arcs
14.3.9 Other results
14.4 Projective spaces James W.P. Hirschfeld and Joseph A. Thas
14.4.1 Projective and affine spaces
14.4.2 Collineations, correlations, and coordinate frames
14.4.3 Polarities
14.4.4 Partitions and cyclic projectivities
14.4.5 k-Arcs
14.4.6 k-Arcs and linear MDS codes
14.4.7 k-Caps
14.5 Block designs Charles J. Colbourn and Jeffrey H. Dinitz
14.5.1 Basics
14.5.2 Triple systems
14.5.3 Difference families and balanced incomplete block designs
Preface
The CRC Handbook of Finite Fields (hereafter referred to as the Handbook) is a reference
book for the theory and applications of finite fields. It is not intended to be an introductory
textbook. Our goal is to compile in one volume the state of the art in research in finite
fields and their applications. Hence, our aim is a comprehensive book, with easy-to-access
references for up-to-date facts and results regarding finite fields.
The Handbook is organized into three parts. Part I contains just one chapter which is
devoted to the history of finite fields through the 18-th and 19-th centuries.
Part II contains theoretical properties of finite fields. This part of the Handbook contains
12 chapters. Chapter 2 deals with basic properties of finite fields; properties that are used
in various places throughout the entire Handbook. Near the end of Section 2.1 is a rather ex-
tensive list of recent finite field-related books; these books include textbooks, books dealing
with theoretical topics as well as books dealing with various applications to such topics as
combinatorics, algebraic coding theory for the error-free transmission of information, and
cryptography for the secure transmission of information. Also included is a list of recent
finite field-related conference proceedings volumes.
Chapter 2 also provides rather extensive tables of polynomials useful when dealing with
finite field computational issues. The website
http://www.crcpress.com/product/isbn/9781439873786
provides larger and more extensive versions of the tables presented in Section 2.2.
The next two chapters deal with polynomials such as irreducible and primitive poly-
nomials over finite fields. Chapter 5 discusses various kinds of bases over finite fields, and
Chapter 6 discusses character and exponential sums over finite fields.
In Chapter 7, results on solutions of equations over finite fields are discussed. Chapter
8 covers permutation polynomials in one and several variables, as well as a discussion of
value sets of polynomials, and exceptional polynomials over finite fields. Chapter 9 discusses
special functions over finite fields. This discussion includes Boolean, APN, PN, bent, kappa
polynomials, planar functions and Dickson polynomials, and finishes with a discussion of
Schur’s conjecture.
Sequences over finite fields are considered in Chapter 10. This chapter includes material
on finite field transforms, LFSRs and maximal length sequences, correlation and autocorre-
lation and linear complexity of sequences as well as algebraic dynamical systems over finite
fields.
Chapter 11 deals with various kinds of finite field algorithms including basic finite field
computational techniques, formulas for polynomial counting, irreducible techniques, fac-
torizations of polynomials in one and several variables, discrete logarithms, and standard
models for finite fields.
In Chapter 12, curves over finite fields are discussed in great detail. This discussion
includes elliptic and hyperelliptic curves. Rational points on curves are considered as well
as towers and zeta functions over finite fields. In addition, there is a discussion of p-adic
estimates of zeta and L-functions over finite fields.
Chapter 13 discusses a variety of topics over finite fields. These topics include relations
between the integers and polynomials over finite fields, matrices over finite fields, linear
algebra and related computational topics, as well as classical groups over finite fields, and
Carlitz and Drinfeld modules.
Part III of the Handbook, containing four chapters, discusses various important appli-
cations, including mathematical as well as very practical applications of finite fields. Latin
squares and the polynomial method, useful in various areas of combinatorics, are considered.
In addition, affine and projective planes, projective spaces, block designs, and difference sets
are discussed in detail. In each of these areas, since these topics contain an immense num-
ber of papers, we discuss only those techniques and topics related to finite fields. Other
topics included in Chapter 14 are (t, m, s)-nets useful in numerical integration, applications
of primitive polynomials over finite fields, and Ramanujan and expander graphs.
Chapter 15 is another important chapter in the Handbook. It discusses algebraic coding
theory and includes a long introductory section dealing with basic properties of codes.
This is followed by sections on special kinds of codes including LDPC codes, turbo codes,
algebraic geometry codes, raptor codes, and polar codes.
Chapter 16 deals with cryptographic systems over finite fields. In the first section various
basic issues dealing with cryptography are discussed. Next to be discussed are stream and
block ciphers, multivariate cryptographic systems, elliptic and hyperelliptic curve crypto-
graphic systems as well as systems arising from Abelian varieties over finite fields.
Finally, in Chapter 17 we discuss several additional applications of finite fields including
finite fields in biology, quantum information theory, and various applications in engineer-
ing including topics like optimal orthogonal codes, binary sequences with small aperiodic
autocorrelation, and sequences with small Hamming correlation.
In the bibliography, we have included for each reference, the pages where that reference
is discussed in the Handbook. There is also a large index to help readers quickly locate
various topics in the Handbook.
The Handbook is not meant to be read in a sequential way. Instead, each section is meant
to be self-contained. Basic properties of finite fields are included in Chapter 2. Proofs are
not included in the Handbook; instead authors have given references where proofs of the
important results can be found. In an effort to help the reader locate proofs and important
results for each section, at the end of each section we have provided a list of references
used in that section. Those reference numbers refer to the main bibliography at the end of
the Handbook which contains over 3,000 references. A short “See also” section is included
for most sections; these are intended to provide the reader with references to other related
sections and references of the Handbook.
The following numbering system is in effect in the Handbook. Within a given section, all
results, theorems, corollaries, definitions, examples, etc., are numbered consecutively (with
the exception of tables and figures). For example, the result numbered 2.1.5 happens to be
a theorem which is the fifth listing in Section 2.1 of Chapter 2. We have also included many
remarks in each section. These are also numbered as part of the same system so that for
example, Remark 2.1.4 is the fourth listing in Section 2.1.
Readers are encouraged to make us aware of corrections to the material presented here.
Readers should contact the author(s) of the section involved, as well as both of the Editors-
in-Chief.
We would of course like to first thank the authors of the various sections for their time
and effort. Without their help, the Handbook would, quite simply, not exist. We also greatly
appreciate the authors’ willingness to use our style and format so that the entire Handbook
has a consistent and uniform style and format. While we appreciate the help of all of the
authors, we would especially like to thank Ian Blake, Steve Cohen, Cary Huffman, Alfred
Menezes, Harald Niederreiter, Henning Stichtenoth, and Arne Winterhof who not only wrote
several sections, but who also provided the Editors-in-Chief with valuable input in numerous
aspects of the Handbook. Every section was reviewed by at least two external reviewers, in
addition to the Editors-in-Chief. We also would like to thank the many reviewers who took
the time to read and send us comments on the various sections and drafts. Without their
help, we would of course have ended up with a volume of considerably diminished quality.
We would like to thank Brett Stevens and David Thomson for their help with various
LaTeX and file issues. Finally, we would like to thank Shashi Kumar for his invaluable help
in setting up, reworking, and running the style files that define the overall look of the entire
Handbook. His efforts were of tremendous help to us. We would also like to thank Bob Stern
for his continued support.
Needless to say, this project has involved many, many hours. We thank Bevie Sue Mullen
and Lucia Moura for their encouragement, support, love, and patience during the entire
process.
Gary L. Mullen
Daniel Panario
Editors-in-Chief
Contributors
Rudolf Lidl
7 Hill Street
West Launceston
Tasmania 7250
Australia
Email: [email protected]

Simon Litsyn
School of Electrical Engineering
Tel Aviv University
Ramat Aviv 69978
Israel
Email: [email protected]

Gary McGuire
School of Mathematical Sciences
University College Dublin
Dublin 4
Ireland
Email: [email protected]

Wilfried Meidl
Faculty of Engineering and Natural Sciences
Sabanci University
Orhanli 34956, Tuzla-Istanbul
Turkey
Email: [email protected]

M. Ram Murty
Department of Mathematics and Statistics
Queen’s University
Kingston, Ontario K7L 3N6
Canada
Email: [email protected]

Harald Niederreiter
Radon Institute for Computational and Applied Mathematics
Austrian Academy of Sciences
Altenbergerstr. 69
A-4040 Linz
Austria
Email: [email protected]

Andrew Odlyzko
School of Mathematics
University of Minnesota
Minneapolis, MN 55455
U.S.A.
Email: [email protected]

Clément Pernet
INRIA/LIG MOAIS
51, avenue Jean Kuntzmann
F-38330 Montbonnot Saint-Martin
France
Email: [email protected]

Alexander Pott
Otto-von-Guericke University Magdeburg
39106 Magdeburg
Germany
Email: [email protected]

Martin Roetteler
NEC Laboratories America, Inc.
4 Independence Way, Suite 200
Princeton, NJ 08540
U.S.A.
Email: [email protected]

Ivelisse Rubio
Department of Computer Science
University of Puerto Rico
Rio Piedras Campus
P.O. Box 70377
San Juan, PR 00936-8377
Email: [email protected]

Renate Scheidler
Department of Mathematics and Statistics
University of Calgary
2500 University Drive NW
Calgary, Alberta, T2N 1N4
Canada
Email: [email protected]

Kai-Uwe Schmidt
Faculty of Mathematics
Otto-von-Guericke University
Universitätsplatz 2
39106 Magdeburg
Germany
Email: [email protected]

Joseph H. Silverman
Mathematics Department, Box 1917
Brown University
Providence, RI 02912
U.S.A.
Email: [email protected]

Bart de Smit
Mathematisch Institut
Universiteit Leiden
Postbus 9512
2300 RA Leiden
The Netherlands
Email: [email protected]

Brett Stevens
School of Mathematics and Statistics
Carleton University
Ottawa ON K1S 5B6
Canada
Email: [email protected]

Henning Stichtenoth
Faculty of Engineering and Natural Sciences
Sabanci University
Orhanli 34956, Tuzla-Istanbul
Turkey
Email: [email protected]

Leo Storme
Department of Mathematics
Ghent University
Krijgslaan 281, Building S22
B-9000 Ghent, Belgium
Email: [email protected]

Oscar Takeshita
Email: [email protected]

David Thomson
School of Mathematics and Statistics
Carleton University
Ottawa ON K1S 5B6
Canada
Email: [email protected]

José Felipe Voloch
The University of Texas at Austin
Mathematics Dept, RLM 8.100
2515 Speedway Stop C1200
Austin, Texas 78712-1202
U.S.A.
Email: [email protected]

Arne Winterhof
Johann Radon Institute for Computational and Applied Mathematics
Austrian Academy of Sciences
Altenbergerstr. 69
4040 Linz, Austria
Email: [email protected]

Joseph L. Yucas
68 Rock Springs Rd.
Makanda, IL 62958
U.S.A.
Email: [email protected]
1
History of finite fields
1.1.1 Introduction
While the theory of finite fields emerged as an independent discipline at the end of the
19-th century, aspects of the subject can be traced back at least to the middle of the 17-th
century. It is our intention to present here a survey of highlights of finite field theory as they
emerged in the 18-th and 19-th centuries, culminating in a description of Eliakim Hastings
Moore’s paper [2139], which began the study of abstract finite fields.
Leonard Eugene Dickson (1874-1954), in the first volume of his History of the Theory
of Numbers [851] gives many references to works that can be interpreted as dealing with
finite fields, although not always described explicitly as such. Chapters VII and VIII are
especially relevant, and give remarkably complete listings of what had been achieved before
1918. Chapter VIII, entitled Higher Congruences, occasionally uses the language of finite
fields, although the emphasis is largely number theoretic. Dickson had already written a
textbook, entitled Linear Groups with an Exposition of the Galois Field Theory [850] which
is probably the first work devoted exclusively to finite fields. This book remained without
any serious rival until the emergence in the 1950s of more geometric, less computational,
methods, such as those pioneered by Artin in his Geometric Algebra [135]. The first 71 pages
of Dickson’s work constitute a very full account of finite fields, and its exercises, partly based
on the work of earlier researchers, are still a valuable source of problems and ideas.
Of course, finite fields are mentioned in general histories of algebra, such as that of van
der Waerden [2848]. Furthermore, Finite Fields by Lidl and Niederreiter [1939] contains
much historical information and a very extensive bibliography, especially of the older litera-
ture. Another brief but useful source of information is found in the historical notes scattered
throughout Cox’s Galois Theory [749].
The first use of the English expressions field of order s and Galois-field of order $s = q^n$
occurs in a paper of E. H. Moore (1862-1932), which he presented in 1893. Moore states that
the term field was an equivalent to the German term endlicher Körper, used by Heinrich
Weber (1842-1913). We observe that Richard Dedekind (1831-1916) had already introduced
such a term as Zahlenkörper, which can be traced back to lectures he gave in 1858. The 1933
edition of the Oxford English Dictionary does not include a definition of the mathematical
term field, although it does define group in its mathematical meaning, but more recent
editions of the dictionary include the mathematical use of field, with an attribution to
Moore.
The name Galois field is synonymous with finite field, and it signifies the importance
to the subject of an innovatory paper by Évariste Galois (1811-1832), published in 1830
[1168], when the author was only 18. We will comment in greater detail on Galois’s work
later in this article, but we will briefly mention here that Galois lays the foundations of
finite field theory by showing that for each prime p and positive integer n, there is a finite
field of order $p^n$, and its multiplicative group of non-zero elements is cyclic of order $p^n - 1$.
Galois’s arguments are rather sketchy, but there is no doubt that he understood the fun-
damental principles of the structure of a finite field, including the role of the automorphism
given by raising elements to the p-th power. As has proved to be the case on a number of
occasions, it seems that most of Galois’s discoveries were already known to Gauss, in this
case, in the late 1790s, but as Gauss never published an account of his work, Galois was
unaware of Gauss’s priority. (Gauss is credited with the discovery of non-Euclidean geome-
try before Bolyai and Lobachevsky, with the discovery of quaternions before Hamilton, and
the discovery of the method of least squares before Legendre.) We will also give a sketch of
Gauss’s approach to finite fields, which he called the theory of higher congruences, as it is
described in Volume 2 of his Werke [1259].
Gauss’s magnum opus, we refer to the book The Shaping of Arithmetic [1297].
Gauss introduces the concept of congruence in Article (Art.) 1, and designates congru-
ence by means of the now familiar symbol ≡. This is the first published use of this symbol,
which seems to have entered into conventional use quite rapidly. It occurs for instance in C.
Kramp’s Éléments d’arithmétique universelle [1804], an elementary work much influenced
by Gauss’s masterpiece (the use of the exclamation mark in n! makes its first appearance
here). In Section II, Gauss proves in Art. 14 that if p is a prime integer and a, b are integers
not divisible by p, then p does not divide the product ab. This basic result is fundamental
for the proof that the integers modulo p form a field. Gauss comments that the theorem
was already in Euclid’s Elements. Oddly enough, for a person as notoriously meticulous as
Gauss, he mistakenly says that it is Proposition 32 of Book VII, when it is in fact Propo-
sition 30. Concerning this result, Gauss wrote magisterially: However we did not wish to
omit it because many modern authors have employed vague computations in place of proof
or have neglected the theorem entirely, and because by this very simple case we can more
easily understand the nature of the method which will be used later for solving much more
difficult problems. He uses Art. 14 to prove Art. 16, a result often called the fundamental
theorem of arithmetic: a composite number can be resolved into prime factors in only one
way. This basic result is not in Euclid.
Gauss describes Euler’s totient (or phi) function, which he denotes by the symbol φ
(following Art. 38). (We recall that the totient function measures the number of totitives of
a positive integer n, that is, the number of integers lying between 1 and n that are relatively
prime to n.) This is again the first occurrence of a now familiar symbol in mathematics.
Euler himself, although introducing the idea of the function in 1760, did not use such
notation. Art. 43 is a proof that an integer polynomial of degree m cannot have more than
m incongruent roots modulo a prime. This basic theorem on polynomial arithmetic was
first published by Lagrange in 1768. Euler had shown in 1774 that the congruence $x^n - 1 \equiv 0$
modulo a prime has at most n roots, and Gauss notes that Euler’s method is easily
generalized.
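As a small illustration of the two notions in this paragraph, the following Python sketch computes the totient by direct counting and finds the incongruent roots of a polynomial modulo a prime by trying every residue. The function names totient and roots_mod_p and the sample values are our own choices, not taken from Gauss or from the sources cited here.

```python
from math import gcd

def totient(n):
    """Number of integers k with 1 <= k <= n that are relatively prime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def roots_mod_p(coeffs, p):
    """Incongruent roots modulo the prime p of the integer polynomial whose
    coefficients are given lowest degree first, found by exhaustive search."""
    def value(x):
        return sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p
    return [x for x in range(p) if value(x) == 0]

print([totient(n) for n in range(1, 11)])   # [1, 1, 2, 2, 4, 2, 6, 4, 6, 4]
# x^3 - 1 modulo 7: a cubic, so at most three incongruent roots
print(roots_mod_p([-1, 0, 0, 1], 7))        # [1, 2, 4]
```

The exhaustive search is only sensible for small moduli; it is meant to make the statement of Art. 43 concrete, not to suggest a practical root-finding method.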
Section III, on residues of powers, contains Art. 49: if p is a prime number that does
not divide a, and $a^t$ is the lowest power of a that is congruent to unity to the modulus p,
the exponent t will either = p − 1 or be a factor of this number. Gauss notes that this
implies Fermat’s Little Theorem, and he gives some of the history of this theorem that we
described above. Art. 55 is the fundamental statement: There always exist numbers with the
property that no power less than the p − 1st is congruent to unity. This of course amounts
to saying that the multiplicative group of the integers modulo a prime p is cyclic of order
p − 1. Again, it is interesting to observe the authority of Gauss’s language as he describes
earlier approaches to Art. 55: This theorem furnishes an outstanding example of the need
for circumspection in number theory so that we do not accept fallacies as certainties. . . .
No one has attempted the demonstration except Euler . . . See especially his article 37 where
he speaks at great length of the need for demonstration. But the demonstration which this
shrewdest of men presents has two defects. . . . In Art. 57, Gauss adopts the nomenclature
primitive roots, due originally to Euler, for the integers, or residues, described in Art. 55.
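The content of Art. 49 and Art. 55 can be checked numerically for a small prime. The sketch below, in Python, computes the exponent t of Art. 49 by repeated multiplication and lists the residues whose exponent is p − 1. The names order_mod_p and primitive_roots and the choice p = 13 are ours, for illustration only.

```python
def order_mod_p(a, p):
    """Smallest t >= 1 with a^t congruent to 1 modulo the prime p.
    Assumes p does not divide a."""
    t, power = 1, a % p
    while power != 1:
        power = (power * a) % p
        t += 1
    return t

def primitive_roots(p):
    """Residues a in 1..p-1 whose order is exactly p - 1."""
    return [a for a in range(1, p) if order_mod_p(a, p) == p - 1]

p = 13
# Art. 49: every exponent t equals p - 1 or is a factor of it
print(all((p - 1) % order_mod_p(a, p) == 0 for a in range(1, p)))  # True
# Art. 55: residues with exponent p - 1 exist
print(primitive_roots(p))                                          # [2, 6, 7, 11]
```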
Günther Frei [1103] has given a lengthy description of the genesis and contents of the
unpublished Section Eight, and we will make use of some of his analysis here, since it has
considerable bearing on the early theory of finite fields. Gauss’s work on finite fields can be
traced back at least to 1796, as there are references to it in his Mathematical Diary [1257].
It is well known that Gauss was particularly fascinated by the law of quadratic reciprocity,
and he gave several different proofs of this fundamental theorem, the first dating from 1796.
The third and fourth of these proofs drew Gauss into the study of polynomials modulo a
prime, and his surviving investigations enable us to discern much of the theory of finite
extensions of a field of prime order.
In Frei’s translation, Gauss wrote But at the same time one sees that the solution of
congruences constitutes only a part of a much higher investigation, namely the investigation
of the decomposition of functions into factors. Accordingly, Gauss developed a theory of
factorization of polynomials whose coefficients are integers modulo a prime p, including the
determination of greatest common divisors by Euclid’s algorithm. He introduced the concept
of a prime polynomial, corresponding to irreducible polynomial in modern terminology, and
showed that arbitrary polynomials can be factored into products of prime polynomials.
Among the highlights of his discoveries, we may mention his proof that every irreducible
polynomial modulo p, different from x, and of degree m, is a divisor of $x^{p^m-1} - 1$. Furthermore,
$x^{p^m-1} - 1$ is the product of all monic irreducible polynomials of degree d dividing
m, apart from x. From this fact, he obtained a formula for the number of irreducible monic
polynomials of degree n with coefficients integers modulo p. Frei also notes that Gauss appreciated
the importance of the Frobenius automorphism, and came close to discovering a
form of Hensel’s Lemma, significant in p-adic analysis.
The idea of using the imaginary roots of such irreducible polynomials to simplify some
of his work had occurred to Gauss, and, in Frei’s translation, Gauss wrote: “Indeed, we could
have shortened incomparably all our following investigations, had we wanted to introduce
such imaginary quantities by taking the same liberty some more recent mathematicians
have taken, but nevertheless, we have preferred to deduce everything from first principles.” It
should be recalled that Gauss sometimes displayed a conservative approach to new concepts
in mathematics, and his public aversion to using imaginary roots of congruences is akin to
his disinclination to use complex numbers. Thus, for example, his thesis, published in 1799,
states that every real polynomial is a product of real factors of degree one or two, rather
than stating that every complex polynomial is a product of factors of degree 1.
$a + a_1 i + a_2 i^2 + \cdots + a_{\nu-1} i^{\nu-1}$,
where $a, a_1, \ldots, a_{\nu-1}$ are integers modulo p. There are $p^\nu$ different values for these expressions.
Let α be an expression of the form above. If we raise α to the second, third, etc., powers,
we obtain a sequence of expressions of the same form. Thus we must have $\alpha^n = 1$ for a
certain positive integer n, which we choose to be as small as possible. We then have n
different expressions
$1, \alpha, \ldots, \alpha^{n-1}$.
Galois shows that n divides $p^\nu - 1$, and thus $\alpha^{p^\nu - 1} = 1$. Galois next aims to prove that
there is some α for which the corresponding n is $p^\nu - 1$. He makes an analogy at this stage
with the existence of primitive roots modulo p in the theory of numbers. We did not find that
Galois provided a convincing argument for this key issue.
Galois then draws the remarkable conclusion that all the algebraic quantities that arise
in this theory are roots of equations of the form $x^{p^\nu} = x$. Furthermore, if F(x) is an integer
polynomial of degree ν irreducible modulo p, there are integer polynomials f(x) and φ(x)
such that
$f(x)F(x) = x^{p^\nu} - x + p\varphi(x)$.
Galois also notes that if α is a root of the irreducible congruence F(x) ≡ 0, then the other
roots are
$\alpha^p, \alpha^{p^2}, \ldots, \alpha^{p^{\nu-1}}$.
This is a consequence of the fact that
$F(x)^{p^n} \equiv F(x^{p^n})$.
We remark that this is an early indication of the role of the so-called Frobenius mapping
as a generator of the associated Galois group. Galois later notes that all the roots of the
congruence $x^{p^\nu} \equiv x$ depend only on the roots of a single irreducible polynomial of degree
ν.
To illustrate all this theory, Galois attempts to find a primitive root of the congruence
$x^{7^3} \equiv x \pmod 7$.
He aims to do this by exhibiting elements having orders 9 and 19. In fact, he makes several
noteworthy errors, which may have confused any readers of this exposition of the new
theory. He begins by noting that $x^3 \equiv 2 \pmod 7$ is irreducible, and lets i be a root of the
congruence. He claims that −1 − i has order 19, but this is false, as it has order 9 × 19.
He then claims that $\alpha = i + i^2$ is primitive, but this is again false, as it has order 114, not
$342 = 7^3 - 1$. Finally, Galois claims that his α satisfies
$\alpha^3 + 3\alpha + 1 = 0$,
but this is again incorrect, as in truth it satisfies
$\alpha^3 + \alpha + 1 = 0$.
Indeed, the polynomial $x^3 + 3x + 1$ is not even irreducible modulo 7, as 4 is a root of it.
As we mentioned earlier, the paper contains several misprints, possibly because the com-
positor found the notation difficult to handle, but Galois’s errors are not just typographical
(although they are of course ultimately trivial and in no way invalidate his theory).
Joseph-Alfred Serret gives a treatment of this problem of finding a primitive root in the
second edition of Cours d’algèbre supérieure [2596], pp. 367-370, following Galois’s methods.
The required primitive element Galois might have had in mind was $\beta = i - i^2$, not $i + i^2$.
This element β is a root of $x^3 - x + 2$, which is certainly an irreducible primitive polynomial.
While $i + i^2$ may have replaced $i - i^2$ because of a typographical error, Galois nonetheless
made further mistakes which are difficult to explain. Serret himself made no comment on
this strange aspect of Galois’s paper.
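The orders quoted in this discussion can be checked by direct computation. The following Python sketch is an illustration only and is not taken from Galois or Serret; it assumes that an element $a_0 + a_1 i + a_2 i^2$ of the field of order $7^3$ is stored as the triple (a0, a1, a2), with $i^3 = 2$, and the helper names mul and order are hypothetical.

```python
P = 7  # the prime modulus in Galois's example

def mul(u, v):
    """Multiply a0 + a1*i + a2*i^2 by b0 + b1*i + b2*i^2, using i^3 = 2 (mod 7)."""
    a0, a1, a2 = u
    b0, b1, b2 = v
    # The raw product has terms up to i^4; reduce with i^3 = 2 and i^4 = 2*i.
    c0 = a0 * b0 + 2 * (a1 * b2 + a2 * b1)
    c1 = a0 * b1 + a1 * b0 + 2 * (a2 * b2)
    c2 = a0 * b2 + a1 * b1 + a2 * b0
    return (c0 % P, c1 % P, c2 % P)

def order(u):
    """Multiplicative order of a nonzero element, found by repeated multiplication."""
    one = (1, 0, 0)
    power, n = u, 1
    while power != one:
        power = mul(power, u)
        n += 1
    return n

print(order((6, 6, 0)))  # -1 - i   -> 171 = 9 * 19
print(order((0, 1, 1)))  # i + i^2  -> 114
print(order((0, 1, 6)))  # i - i^2  -> 342 = 7^3 - 1, so this element is primitive
```

Brute-force order finding is adequate here because the multiplicative group has only 342 elements.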
As justification for introducing this theory, Galois explains that it is required in the
theory of permutations which arise in the study of primitive (rational) polynomials which
are solvable by radicals. He alludes, in effect, to what is the affine group of the finite field
of order $p^\nu$, which must be the Galois group of such a polynomial when the action on the
roots is doubly transitive. He excludes degrees 9 and 25, where he must have known that
there exist exceptional doubly transitive solvable permutation groups. There is another one
in degree 49, which he did not mention.
1865. In fact, it bears many similarities to Gauss’s unpublished Section 8 (itself published
for the first time in Latin in 1863), and to Dedekind’s 1857 paper (for details, see later
in this section). Serret was presumably unaware of this material, as he made no mention
of it. We can recognize several classical theorems of finite field theory described clearly in
Serret’s Chapter 3 of Volume 2 of the third edition. Thus for example, if the integer g is
not divisible by the prime p, the polynomial (later said to be of Artin-Schreier type)
$x^p - x - g$
is irreducible modulo p. Art. 372 presents six theorems summarizing Galois’s findings of
1830, Art. 349 gives a formula for the number of monic irreducible polynomials modulo a
prime p, and Art. 350 gives upper and lower bounds for their number.
He also obtained a formula for the number of monic irreducible polynomials of degree n
modulo p. Schönemann’s paper is long (56 pages) and not very clear. It is also written in a
very formal style, each result being presented in the form of Erklärung and Lehrsatz, followed
by Beweis, in imitation of the approach characteristic of Euclid’s Elements. Nonetheless,
Schönemann did innovatory work, which, even if anticipated by Gauss, was quoted reason-
ably frequently in the second half of the nineteenth century, for instance, by Kronecker.
In his paper Abriss einer Theorie der Höheren Congruenzen in Bezug auf einen reellen
Primzahl-Modulus [793] written in late 1856, and published in 1857, Dedekind covered much
of the same ground pioneered by Gauss in Disquisitiones Generales de Congruentiis. While
we pointed out above that Dedekind was responsible for editing Gauss’s manuscript for
publication in 1863, Frei presents several strong reasons to suppose that, at the time he
wrote, Dedekind was unacquainted with this key work, and did not see it until 1860. Frei
suggests that Dedekind was more concerned to give a solid foundation to Kummer’s theory of
ideal numbers. In any case, Dedekind notes that there is a strong analogy between the theory
of polynomials modulo a prime and elements of number theory. By way of illustrating this
analogy, let p be an odd prime and let P and Q be different irreducible monic polynomials
of degrees m and n, respectively. Then working modulo Q, P determines an element of the
field of order $p^n$, and this element is either a square or a non-square. By analogy with the
Legendre symbol, we set $\left(\frac{P}{Q}\right)$ equal to 1 if P is a square modulo Q, and equal to −1
if it is a non-square. Working modulo P, we may likewise define $\left(\frac{Q}{P}\right)$. Then, in complete
In his proof, Dedekind uses a version of Gauss’s Lemma, employed in one of Gauss’s proofs
of the quadratic reciprocity theorem.
and that when the marks are so combined the results of these operations are in every case
uniquely determined and belong to the system of marks. Such a system of marks we shall
call a field of order s, using the notation F [s]. . . .
We are led at once to seek [t]o determine all such fields of order s, F [s].
Moore notes that Galois had defined a field of order $q^n$, for each prime q and each positive
integer n. Moore denotes this field by $GF[q^n]$, presumably in honor of Galois. This $GF[q^n]$
is defined via an irreducible polynomial of degree n modulo q, and is unique, in the sense
that such irreducible polynomials exist for all q and n, and the $GF[q^n]$ so constructed is
independent of the particular irreducible polynomial chosen. Moore’s main theorem is then
stated as: “Every existent field F[s] is the abstract form of a Galois field $GF[q^n]$; $s = q^n$.”
Moore remarks: “This interesting result I have not seen stated before.”
Moore’s proof occupies pages 212-220 of his paper, and he derives further properties
of $GF[q^n]$ in the next few pages. We feel that Moore’s paper marks the beginning of the
abstract theory of finite fields. In 1896, Dickson was awarded the first doctorate in mathe-
matics at the new University of Chicago, for a thesis written under Moore’s direction, the
subject matter being permutation polynomials over finite fields. Dickson’s 1901 book gave
a streamlined proof of Moore’s uniqueness theorem on pp. 13-14.
and their Lie algebras, and published in 1901 (with later additions) details of his discovery
of versions of the groups of type E6 and G2 over finite fields [842, 848, 843, 844]. It was
not until later work of Chevalley in 1955 that further finite analogues of the exceptional
continuous groups were constructed in a uniform way.
The theory of finite fields may be said to have acquired a more conceptual form in the
twentieth century after Emil Artin (1898-1962) introduced the notion of a zeta function
for a quadratic extension of the rational function field $\mathbb{F}_p(t)$, where p is a prime. Artin
formulated a version of the Riemann hypothesis for these zeta functions, and verified the
hypothesis for a number of curves in his dissertation, published in 1924. Helmut Hasse
(1899-1979) subsequently proved the Riemann hypothesis for function fields of genus 1
in 1934, but the complete proof for arbitrary non-singular curves by André Weil (1906-
1998) in 1948 employed sophisticated methods of algebraic geometry. The analogy between
counting rational points on algebraic varieties over finite fields and the cohomology theories
of complex varieties has been a powerful motivating force in the more recent theory of finite
fields.
References Cited: [135, 749, 793, 842, 843, 844, 848, 850, 851, 1168, 1257, 1258, 1259,
1297, 1621, 1804, 1939, 2021, 2139, 2556, 2596, 2848]
2
Introduction to finite fields
Proofs for most of the results in this chapter can be found in Chapters 2 and 3 of [1939];
see also [1631, 1938, 2017, 2049, 2077, 2179, 2921]. We refer the reader to Section 2.1.8 for
a comprehensive list of other finite field related books.
2.1.1 Definition A ring (R, +, ·) is a nonempty set R together with two operations, “+” and
“·” such that:
(1) (R, +) is an abelian group;
(2) · is associative, that is for all a, b, c ∈ R, a · (b · c) = (a · b) · c;
(3) left and right distributive laws hold: for all a, b, c ∈ R
a · (b + c) = a · b + a · c and (b + c) · a = b · a + c · a.
(4) R is a division ring (also called a skew field ) if the nonzero elements of R form a
group under “·”;
(5) R is a field if it is a commutative division ring.
2.1.3 Definition The order of a finite field F is the number of distinct elements in F.
2.1.6 Definition If R is a ring and there exists a positive integer n such that nr = 0 for all
r ∈ R, then the least such positive integer n is the characteristic of the ring, and R has
positive characteristic. Otherwise, R has characteristic zero.
2.1.7 Theorem A ring R ≠ {0} of positive characteristic having an identity and no zero divisors
must have prime characteristic.
2.1.8 Corollary A finite field has prime characteristic.
2.1.9 Proposition For a commutative ring R of prime characteristic p, we have
$(a_1 + \cdots + a_s)^{p^n} = a_1^{p^n} + \cdots + a_s^{p^n}$
for every n ≥ 1 and $a_i \in R$.
2.1.10 Lemma Suppose F is a finite field with a subfield K containing q elements. Then F is a
vector space over K and $|F| = q^m$, where m is the dimension of F viewed as a vector space
over K.
2.1.12 Theorem Let F be a finite field. The cardinality of F is $p^n$, where p is the characteristic of
F and n is the dimension of F over its prime subfield.
2.1.13 Remark We denote by $\mathbb{F}_q$ a finite field with q elements. We note that by Remark 2.1.33
there is only one finite field (up to isomorphism) with q elements.
2.1.14 Remark Another common notation for a field of order q is GF (q), where GF stands for
Galois field. This name is used in honor of Évariste Galois (1811–1832), who in 1830 was the
first person to seriously study properties of general finite fields (fields with a prime power
but not necessarily a prime number of elements).
2.1.15 Remark The recent publication of The Mathematical Writings of Evariste Galois by Neu-
mann [2223] will make Galois’s own words available to readers.
2.1.16 Lemma If $\mathbb{F}_q$ is a finite field with q elements and $a \neq 0$ in $\mathbb{F}_q$, then $a^{q-1} = 1$, and thus
$a^q = a$, for all a in $\mathbb{F}_q$.
2.1.17 Remark An immediate consequence of the previous lemma is that the multiplicative inverse
of any $a \neq 0$ in a field of order q is $a^{q-2}$, because $a^{q-2} \cdot a = a^{q-1} = 1$.
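As a small illustration of Remark 2.1.17, the sketch below computes inverses in a prime field $\mathbb{F}_p$ as $a^{p-2} \bmod p$. It covers only the case where q = p is prime, so that field elements are ordinary residues; the function name inverse_mod_p is an assumption made for this example.

```python
def inverse_mod_p(a, p):
    """Inverse of a nonzero residue a in F_p (p prime), computed as a^(p-2) mod p."""
    if a % p == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    # Built-in three-argument pow performs fast modular exponentiation.
    return pow(a, p - 2, p)

p = 101
# Every nonzero residue times its computed inverse should give 1.
print(all((a * inverse_mod_p(a, p)) % p == 1 for a in range(1, p)))
```

For a non-prime field order q the same identity holds, but the exponentiation must be carried out with the field's own multiplication rather than integer arithmetic modulo q.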
2.1.18 Theorem The sum of all elements of a finite field is 0, except for the field F2 .
2.1.19 Definition A polynomial f over $\mathbb{F}_q$ is an expression of the form $f(x) = \sum_{i=0}^{n} a_i x^i$, where
n is a nonnegative integer, and $a_i \in \mathbb{F}_q$ for i = 0, 1, . . . , n. A polynomial is monic if the
coefficient of the highest power of x is 1. The ring formed by the polynomials over Fq
with sum and product of polynomials is the ring of polynomials over Fq and is denoted
by Fq [x].
2.1.21 Remark Both Fq [x] and the ring of polynomials in n ≥ 1 variables, Fq [x1 , . . . , xn ], have
unique factorization into irreducibles.
2.1.22 Definition The Möbius µ function is defined on the set of positive integers by
$$\mu(m) = \begin{cases} 1 & \text{if } m = 1, \\ (-1)^k & \text{if } m = m_1 m_2 \cdots m_k, \text{ where the } m_i \text{ are distinct primes}, \\ 0 & \text{otherwise, i.e., if } p^2 \text{ divides } m \text{ for some prime } p. \end{cases}$$
2.1.23 Definition The number of monic irreducible polynomials of degree n over $\mathbb{F}_q$ is denoted
by $I_q(n)$.
$$I_q(n) = \frac{1}{n} \sum_{d \mid n} \mu(d)\, q^{n/d}.$$
2.1.25 Remark We have that $I_q(n) > 0$ for all prime powers q and all integers n > 1:
$$I_q(n) = \frac{1}{n} \sum_{d \mid n} \mu(d)\, q^{n/d} \geq \frac{1}{n}\left(q^n - q^{n-1} - q^{n-2} - \cdots - q\right) > 0.$$
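The counting formula for $I_q(n)$ is easy to evaluate mechanically. The following Python sketch is illustrative only; the function names mobius and count_irreducibles are assumptions, and the Möbius function is computed by simple trial division, which is adequate for small n.

```python
def mobius(m):
    """Möbius function of a positive integer m, computed by trial division."""
    result = 1
    d = 2
    while d * d <= m:
        if m % d == 0:
            m //= d
            if m % d == 0:
                return 0          # a square divides the original m
            result = -result      # one more distinct prime factor
        d += 1
    if m > 1:
        result = -result          # remaining prime factor
    return result

def count_irreducibles(q, n):
    """I_q(n) = (1/n) * sum over d | n of mu(d) * q^(n/d)."""
    total = sum(mobius(d) * q ** (n // d) for d in range(1, n + 1) if n % d == 0)
    return total // n

print([count_irreducibles(2, n) for n in range(1, 9)])
print(count_irreducibles(2, 100))
```

For q = 2 and n = 100 only the divisors 1, 2, 5, and 10 have a nonzero Möbius value, so the sum reduces to $2^{100} - 2^{50} - 2^{20} + 2^{10}$.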
2.1.26 Remark For a polynomial $f \in \mathbb{F}_q[x]$, we have $(f(x))^q = f(x^q)$. This property is of great
use in finite field calculations.
2.1.27 Lemma If $\mathbb{F}_q$ is a finite field with q elements then in $\mathbb{F}_q[x]$ we have
$$x^q - x = \prod_{a \in \mathbb{F}_q} (x - a).$$
2.1.28 Remark The next theorem is crucial for fast polynomial irreducibility testing and factor-
ization algorithms over finite fields; see Sections 11.3 and 11.4.
2.1.29 Theorem Let f be an irreducible polynomial of degree n over $\mathbb{F}_q$. Then $f(x) \mid (x^{q^r} - x)$ if
and only if n|r.
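Theorem 2.1.29 can be observed experimentally in the special case q = 2. The sketch below is a hedged illustration, not one of the algorithms of Sections 11.3 and 11.4: it assumes polynomials over $\mathbb{F}_2$ are stored as Python integers whose bit k is the coefficient of $x^k$, and the names polymod, mulmod, and divides_frobenius_poly are hypothetical.

```python
def polymod(a, f):
    """Remainder of a modulo f in F_2[x]; bit k of an int is the coefficient of x^k."""
    df = f.bit_length() - 1
    while a.bit_length() - 1 >= df:
        a ^= f << (a.bit_length() - 1 - df)
    return a

def mulmod(a, b, f):
    """Product a*b reduced modulo f in F_2[x] (a is assumed already reduced)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a = polymod(a << 1, f)
    return result

def divides_frobenius_poly(f, r):
    """True if f divides x^(2^r) - x in F_2[x]: square x modulo f exactly r times."""
    x = polymod(0b10, f)
    t = x
    for _ in range(r):
        t = mulmod(t, t, f)
    return t == x

f = 0b1011  # x^3 + x + 1, irreducible of degree 3 over F_2
print([r for r in range(1, 10) if divides_frobenius_poly(f, r)])  # [3, 6, 9]
```

Repeated squaring modulo f avoids ever forming the polynomial $x^{2^r} - x$ itself, whose degree grows exponentially in r.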
2.1.30 Definition Let f ∈ F [x] be of positive degree and E an extension of F . Then f splits
in E if f can be written as a product of linear factors in E[x], that is, there exist
$\alpha_1, \alpha_2, \ldots, \alpha_n \in E$ such that
$f(x) = a(x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_n)$,
where a ∈ F is the leading coefficient of f. The field E
is a splitting field of f over F if f splits in E and E is the smallest such field.
2.1.31 Theorem If F is a field, and f any polynomial of positive degree in F [x], then there exists
a splitting field of f over F . Any two splitting fields of f over F are isomorphic under an
isomorphism which keeps the elements of F fixed and maps the roots of f into each other.
2.1.32 Theorem For every prime p and positive integer n there is a finite field with $p^n$ elements.
Any finite field with $p^n$ elements is isomorphic to the splitting field of $x^{p^n} - x$ over $\mathbb{F}_p$.
2.1.33 Remark The previous theorem shows that a finite field of a given order is unique up to
field isomorphism because splitting fields are unique up to isomorphism. Thus we speak of
“the” finite field of a particular order q.
2.1.34 Remark We note that when p is a prime the field $\mathbb{F}_p$ is the same as (isomorphic to) the
ring $\mathbb{Z}_p$ of integers modulo p. The ring $\mathbb{Z}_p$ is also denoted by $\mathbb{Z}/p\mathbb{Z}$. When n > 1 the finite
field $\mathbb{F}_{p^n}$ is not the same as the ring $\mathbb{Z}_{p^n}$ of integers modulo $p^n$. Indeed, $\mathbb{Z}_{p^n}$ is not a field if
n > 1.
2.1.35 Theorem Let $\mathbb{F}_{p^n}$ be the finite field with $p^n$ elements. Every subfield of $\mathbb{F}_{p^n}$ has $p^m$ elements
for some positive integer m dividing n. Conversely, for any positive integer m dividing n
there is a unique subfield of $\mathbb{F}_{p^n}$ of order $p^m$.
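2.1.36 Remark The subfields of $\mathbb{F}_{q^{36}}$ are illustrated in the following diagram, in which $\mathbb{F}_{q^a}$ is contained in $\mathbb{F}_{q^b}$ exactly when a divides b:
                 F_{q^36}
         F_{q^18}        F_{q^12}
    F_{q^9}                    F_{q^4}
                 F_{q^6}
         F_{q^3}         F_{q^2}
                 F_{q^1}
Figure 2.1.1 The subfields of $\mathbb{F}_{q^{36}}$.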
2.1.37 Theorem The multiplicative group $\mathbb{F}_q^*$ of all nonzero elements of the finite field $\mathbb{F}_q$ is cyclic.
2.1.38 Definition An element $\alpha \in \mathbb{F}_q$ which multiplicatively generates the group $\mathbb{F}_q^*$ of all nonzero
elements of the field $\mathbb{F}_q$ is a primitive element, sometimes also a primitive root.
2.1.39 Remark Let θ be a primitive element of a finite field Fq . Then every nonzero element of Fq
can be written as a power of θ. This representation makes multiplication of field elements
very easy to compute. However, in general, it may not be easy to find the power s of θ such
that $\theta^t + \theta^r = \theta^s$; see Subsection 2.1.7.5. Conversely, as we will see later in our discussion
of bases for finite fields, representations which make exponentiation easy to compute often
have a more complex multiplicative structure.
2.1.40 Definition Let $\alpha \in \mathbb{F}_q^*$. The order of α is the smallest positive integer n such that $\alpha^n = 1$.
2.1.41 Remark We use the notation (a, b) or gcd(a, b) to represent the greatest common divisor
(gcd) of a and b, where a and b belong to a Euclidean domain (usually integers or polyno-
mials).
2.1.43 Definition The number of positive integers e ≤ n such that (n, e) = 1 is denoted by φ(n),
and is the Euler function.
2.1.44 Remark The Euler function is multiplicative: if (m, n) = 1, then φ(mn) = φ(m)φ(n).
2.1.45 Remark It follows from Lemma 2.1.42 that there are exactly φ(q − 1) primitive elements
in Fq .
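The count in Remark 2.1.45 can be confirmed by exhaustive search when q = p is a prime, so that $\mathbb{F}_p$ is the ring of integers modulo p. The following Python sketch is only an illustration under that assumption; the helper names euler_phi, element_order, and primitive_elements are hypothetical.

```python
from math import gcd

def euler_phi(n):
    """Euler function: number of 1 <= e <= n with gcd(n, e) = 1."""
    return sum(1 for e in range(1, n + 1) if gcd(n, e) == 1)

def element_order(a, p):
    """Order of the nonzero residue a in the multiplicative group of F_p."""
    n, power = 1, a % p
    while power != 1:
        power = (power * a) % p
        n += 1
    return n

def primitive_elements(p):
    """All primitive elements of F_p, i.e. residues of order p - 1."""
    return [a for a in range(1, p) if element_order(a, p) == p - 1]

print(primitive_elements(7))                      # [3, 5]
print(len(primitive_elements(31)), euler_phi(30))  # 8 8
```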
2.1.46 Definition A monic polynomial all of whose roots are primitive elements is a primitive
polynomial.
2.1.51 Definition Let $f \in \mathbb{F}_q[x]$ be a nonzero polynomial. If $f(0) \neq 0$, the order of f is the least
positive integer e such that $f \mid x^e - 1$. If f(0) = 0, let $f(x) = x^r g(x)$ for some integer
r ≥ 1 and $g \in \mathbb{F}_q[x]$ with $g(0) \neq 0$. In this case, the order of f is the order of g.
2.1.52 Remark We denote the order of f by ord(f ). The order of a polynomial is also called the
period or exponent of the polynomial.
2.1.53 Theorem Let $f \in \mathbb{F}_q[x]$ be an irreducible polynomial over $\mathbb{F}_q$ of degree n with $f(0) \neq 0$.
Then ord(f) is equal to the order of any root of f in the multiplicative group $\mathbb{F}_{q^n}^*$.
2.1.54 Corollary If $f \in \mathbb{F}_q[x]$ is an irreducible polynomial over $\mathbb{F}_q$ of degree n, then $\mathrm{ord}(f) \mid (q^n - 1)$.
2.1.55 Theorem Let $\mathbb{F}_q$ be a finite field of characteristic p, and let $f \in \mathbb{F}_q[x]$ be a polynomial
of positive degree with $f(0) \neq 0$. Let $f = a f_1^{b_1} \cdots f_k^{b_k}$ be the canonical factorization of f
into irreducibles in $\mathbb{F}_q[x]$, where $a \in \mathbb{F}_q$, $b_1, \ldots, b_k \in \mathbb{N}$, and $f_1, \ldots, f_k$ are distinct monic
irreducible polynomials in $\mathbb{F}_q[x]$. Then $\mathrm{ord}(f) = e p^t$, where e is the least common multiple
of $\mathrm{ord}(f_1), \ldots, \mathrm{ord}(f_k)$ and t is the smallest integer with $p^t \geq \max(b_1, \ldots, b_k)$.
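A brute-force computation of ord(f) makes Definition 2.1.51 and Theorem 2.1.55 concrete. The sketch below is an illustration for q = 2 only; it assumes polynomials over $\mathbb{F}_2$ are stored as integers whose bit k is the coefficient of $x^k$, and the names polymod and poly_order are hypothetical.

```python
def polymod(a, f):
    """Remainder of a modulo f in F_2[x]; bit k of an int is the coefficient of x^k."""
    df = f.bit_length() - 1
    while a.bit_length() - 1 >= df:
        a ^= f << (a.bit_length() - 1 - df)
    return a

def poly_order(f):
    """Least e >= 1 with f | x^e - 1 in F_2[x]; requires f(0) != 0."""
    if not (f & 1):
        raise ValueError("f(0) must be nonzero")
    if f == 1:
        return 1                      # a nonzero constant divides x - 1
    t, e = polymod(0b10, f), 1        # t holds x^e modulo f
    while t != 1:
        t = polymod(t << 1, f)        # multiply by x and reduce
        e += 1
    return e

print(poly_order(0b111))      # x^2 + x + 1 -> 3
print(poly_order(0b11))       # x + 1       -> 1
print(poly_order(0b111111))   # (x^2 + x + 1)^2 (x + 1) -> 6
```

The last value agrees with Theorem 2.1.55: here e = lcm(3, 1) = 3, the largest exponent is 2, so t = 1 and ord(f) = 3 · 2 = 6.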
2.1.56 Definition Let K be a subfield of F and let M be a subset of F . Then K(M ) denotes the
intersection of all subfields of F containing K and M as subsets. This field is K adjoin
M . When M is finite, say M = {α1 , . . . , αk }, we write K(α1 , . . . , αk ) for K(M ).
2.1.57 Definition Let K ⊆ F , α ∈ F , and f (α) = 0 where f is a monic polynomial in K[x]. Then
f is the minimal polynomial of α if α is not a root of any nonzero polynomial in K[x]
of lower degree.
2.1.58 Proposition The minimal polynomial of any extension field element is irreducible over the
base field. This result provides a method by which one can obtain irreducible polynomials.
2.1.60 Theorem Let F be a finite extension of K and let E be a finite extension of F . Then E is
a finite extension of K. Moreover, we have [E : K] = [E : F ][F : K].
2.1.61 Definition Let K ⊆ F and let α ∈ F . Then α is algebraic over K if there is a nonzero
polynomial f ∈ K[x] such that f (α) = 0 in F [x]. An extension field is algebraic if every
element of the extension field is algebraic.
2.1.65 Theorem Let Fq be a finite field and let Fr be a finite extension of Fq . Then Fr is a simple
algebraic extension of Fq , and for any primitive element α of Fr the relation Fr = Fq (α)
holds.
2.1.66 Corollary For any prime power q and any integer n ≥ 1 there is an irreducible polynomial
of degree n over Fq .
2.1.67 Example Consider $q = 2^{100}$. We can identify the elements of $\mathbb{F}_q$ with polynomials of the
form $a_0 + a_1\alpha + a_2\alpha^2 + \cdots + a_{99}\alpha^{99}$, where $0 \leq a_i < 2$ for each i and where α is a root of an
irreducible polynomial of degree 100 over the field $\mathbb{F}_2$. Corollary 2.1.66 shows that such an
irreducible polynomial always exists. Using Theorem 2.1.24 we have that there are exactly
$$\frac{1}{100}\left(2^{100} - 2^{50} - 2^{20} + 2^{10}\right)$$
+      0      1      α      α+1
0      0      1      α      α+1
1      1      0      α+1    α
α      α      α+1    0      1
α+1    α+1    α      1      0

×      0      1      α      α+1
0      0      0      0      0
1      0      1      α      α+1
α      0      α      α+1    1
α+1    0      α+1    1      α
2.1.70 Example Consider the field $\mathbb{F}_9$, which is a vector space of dimension 2 over $\mathbb{F}_3$. Consider
$f(x) = x^2 + x + 2$ in $\mathbb{F}_3[x]$. This polynomial has no roots in $\mathbb{F}_3$ so it is irreducible over $\mathbb{F}_3$. Let
α be a root of f, so $\alpha^2 + \alpha + 2 = 0$. Hence $\alpha^2 = -\alpha - 2 = 2\alpha + 1$. The field $\mathbb{F}_{3^2}$ is isomorphic
to the set $\{a\alpha + b \mid a, b \in \mathbb{F}_3\}$ with its natural operations. We can compute the addition and
multiplication tables by hand. For example, $2\alpha(\alpha + 2) = 2\alpha^2 + 4\alpha = 2(2\alpha + 1) + \alpha = 2\alpha + 2$.
The following addition and multiplication tables are obtained. We can use the multiplication
table to check that the multiplicative order of α in $\mathbb{F}_9$ is 8, and thus α is a primitive element
of $\mathbb{F}_9$.
+      0      1      2      α      α+1    α+2    2α     2α+1   2α+2
0      0      1      2      α      α+1    α+2    2α     2α+1   2α+2
1      1      2      0      α+1    α+2    α      2α+1   2α+2   2α
2      2      0      1      α+2    α      α+1    2α+2   2α     2α+1
α      α      α+1    α+2    2α     2α+1   2α+2   0      1      2
α+1    α+1    α+2    α      2α+1   2α+2   2α     1      2      0
α+2    α+2    α      α+1    2α+2   2α     2α+1   2      0      1
2α     2α     2α+1   2α+2   0      1      2      α      α+1    α+2
2α+1   2α+1   2α+2   2α     1      2      0      α+1    α+2    α
2α+2   2α+2   2α     2α+1   2      0      1      α+2    α      α+1

×      0      1      2      α      α+1    α+2    2α     2α+1   2α+2
0      0      0      0      0      0      0      0      0      0
1      0      1      2      α      α+1    α+2    2α     2α+1   2α+2
2      0      2      1      2α     2α+2   2α+1   α      α+2    α+1
α      0      α      2α     2α+1   1      α+1    α+2    2α+2   2
α+1    0      α+1    2α+2   1      α+2    2α     2      α      2α+1
α+2    0      α+2    2α+1   α+1    2α     2      2α+2   1      α
2α     0      2α     α      α+2    2      2α+2   2α+1   α+1    1
2α+1   0      2α+1   α+2    2α+2   α      1      α+1    2      2α
2α+2   0      2α+2   α+1    2      2α+1   α      1      2α     α+2
2.1.71 Example Let $f(x) = x^2 + 1 \in \mathbb{F}_3[x]$. It is straightforward to check that f is irreducible over
the field $\mathbb{F}_3$. Let α be a root of f. We compute $\alpha^2 = -1$ and $\alpha^4 = 1$. Hence no root of f can
have order 8, that is, no root of f can be a primitive element. Nevertheless, the splitting
field of f over $\mathbb{F}_3$ is $\mathbb{F}_9$. It can be seen that α + 1 has order 8 and is thus a primitive element
for $\mathbb{F}_9$ over $\mathbb{F}_3$.
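The hand computations in Examples 2.1.70 and 2.1.71 can be reproduced with a few lines of code. The following Python sketch is an illustration only; it assumes an element aα + b of $\mathbb{F}_9$ is stored as the pair (a, b), and the names make_mul and order are hypothetical helpers rather than functions of any package mentioned in this handbook.

```python
P = 3  # characteristic of F_9

def make_mul(c1, c0):
    """Return multiplication in F_3[x]/(x^2 + c1*x + c0).

    An element a*α + b is stored as the pair (a, b), where α is a root of
    the (assumed irreducible) modulus, so that α^2 = -c1*α - c0.
    """
    def mul(u, v):
        a, b = u
        c, d = v
        ac = a * c  # coefficient of α^2 before reduction
        return ((a * d + b * c - c1 * ac) % P, (b * d - c0 * ac) % P)
    return mul

def order(u, mul):
    """Multiplicative order of a nonzero element under the given multiplication."""
    one = (0, 1)
    power, n = u, 1
    while power != one:
        power = mul(power, u)
        n += 1
    return n

mul_70 = make_mul(1, 2)  # Example 2.1.70: modulus x^2 + x + 2
mul_71 = make_mul(0, 1)  # Example 2.1.71: modulus x^2 + 1

print(mul_70((2, 0), (1, 2)))  # 2α(α + 2) -> (2, 2), i.e. 2α + 2
print(order((1, 0), mul_70))   # α is primitive in Example 2.1.70 -> 8
print(order((1, 0), mul_71))   # α has order 4 in Example 2.1.71 -> 4
print(order((1, 1), mul_71))   # α + 1 is primitive in Example 2.1.71 -> 8
```

The two moduli give isomorphic copies of $\mathbb{F}_9$, but the chosen root α is a primitive element only for the first one.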
2.1.72 Remark Tables of irreducible and primitive polynomials can be found in Section 2.2. In
that section is a discussion of some computer algebra packages for implementing finite field
arithmetic.
2.1.73 Theorem If f is an irreducible polynomial of degree n over $\mathbb{F}_q$ then f has a root α in $\mathbb{F}_{q^n}$.
Moreover all of the roots of f are simple and are given by $\alpha, \alpha^q, \alpha^{q^2}, \ldots, \alpha^{q^{n-1}}$.
2.1.74 Definition Let $\alpha \in \mathbb{F}_{q^n}$. Then $\alpha, \alpha^q, \alpha^{q^2}, \ldots, \alpha^{q^{n-1}}$ are the conjugates of α over $\mathbb{F}_q$.
2.1.75 Lemma Let $\alpha \in \mathbb{F}_{q^n}$ and let the minimal polynomial of α over $\mathbb{F}_q$ have degree d. Consider
the set $\alpha, \alpha^q, \alpha^{q^2}, \ldots, \alpha^{q^{n-1}}$ of conjugates of α. The elements of this set are distinct if n = d;
otherwise each distinct conjugate is repeated n/d times.
2.1.76 Theorem The distinct automorphisms of $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$ are given by the functions
$\sigma_0, \sigma_1, \ldots, \sigma_{n-1}$ where $\sigma_j : \mathbb{F}_{q^n} \to \mathbb{F}_{q^n}$ is defined by $\sigma_j(\alpha) = \alpha^{q^j}$ for any $\alpha \in \mathbb{F}_{q^n}$.
2.1.77 Remark The set of automorphisms of $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$ forms a group with the operation of functional
composition. This group is called the Galois group of $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$. It is a cyclic group
with generator $\sigma_1 : \mathbb{F}_{q^n} \to \mathbb{F}_{q^n}$, which maps $\alpha \in \mathbb{F}_{q^n}$ to $\alpha^q$ and is called the Frobenius
automorphism. The conjugates of α are thus the elements to which α is sent by iterated
applications of the Frobenius automorphism.
2.1.78 Remark The subfields of $\mathbb{F}_{q^n}$ are exactly the fields of the form $\mathbb{F}_{q^m}$ where m|n. The subgroups
of the Galois group of $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$ are exactly the groups generated by $\sigma_1^m$ where
m|n. Moreover, $\sigma_1^m(\alpha) = \alpha$ if and only if $\alpha \in \mathbb{F}_{q^m}$. Thus there is a one-to-one correspondence
between the subfields of $\mathbb{F}_{q^n}$ and the subgroups of its Galois group.
2.1.79 Remark In general, if F is an extension of a field K then the set of automorphisms of F
that leave K fixed pointwise is the Galois group of F over K. The field of Galois theory is
the study of Galois groups. Thus, if K is finite and F is a finite extension of K then the
Galois group is cyclic. When K is infinite, the Galois group need not be cyclic, even if F is
a finite extension of K.
2.1.80 Definition Let $K = \mathbb{F}_q$ and $F = \mathbb{F}_{q^n}$. For α ∈ F, we define the trace of α over K as
$\mathrm{Tr}_{F/K}(\alpha) = \alpha + \alpha^q + \cdots + \alpha^{q^{n-1}}$. Equivalently, $\mathrm{Tr}_{F/K}(\alpha)$ is the sum of the conjugates
of α. If K is the prime subfield of F then the trace function is the absolute trace.
2.1.81 Example Let $K = \mathbb{F}_2$ and $F = \mathbb{F}_{2^4}$. Then $\mathrm{Tr}_{F/K}(\alpha) = \alpha + \alpha^2 + \alpha^4 + \alpha^8$. For $K = \mathbb{F}_4$ and
$F = \mathbb{F}_{16}$ we have $\mathrm{Tr}_{F/K}(\alpha) = \alpha + \alpha^4$.
2.1.82 Remark Since $\mathrm{Tr}_{F/K}(\alpha)^q = \mathrm{Tr}_{F/K}(\alpha)$, the trace of an element always lies in the base
field K.
2.1.83 Theorem Let $K = \mathbb{F}_q$ and $F = \mathbb{F}_{q^n}$. The trace function has the following properties:
1. for any α ∈ F, $\mathrm{Tr}_{F/K}(\alpha) \in K$;
2. $\mathrm{Tr}_{F/K}(\alpha + \beta) = \mathrm{Tr}_{F/K}(\alpha) + \mathrm{Tr}_{F/K}(\beta)$ for α, β ∈ F;
3. $\mathrm{Tr}_{F/K}(c\alpha) = c\,\mathrm{Tr}_{F/K}(\alpha)$ for α ∈ F and c ∈ K;
4. the trace function is a K-linear map from F onto K;
2.1.84 Theorem For β ∈ F let $L_\beta$ be the map $\alpha \mapsto \mathrm{Tr}_{F/K}(\beta\alpha)$. Then $L_\beta \neq L_\gamma$ if $\beta \neq \gamma$. Moreover
the K-linear transformations from F to K are exactly the maps of the form $L_\beta$ as β varies
over the elements of the field F.
2.1.85 Remark The result in Theorem 2.1.84 provides a method to generate all of the linear
transformations from the extension field F to the subfield K.
2.1.86 Definition Let $K = \mathbb{F}_q$ and $F = \mathbb{F}_{q^n}$. The norm over K of an element α ∈ F is defined by
$$\mathrm{Norm}_{F/K}(\alpha) = \alpha\, \alpha^q \cdots \alpha^{q^{n-1}} = \prod_{i=0}^{n-1} \alpha^{q^i} = \alpha^{(q^n - 1)/(q - 1)}.$$
2.1.87 Remark The norm of an element α is thus calculated by taking the product of all of the
conjugates of α, just as the trace of α is obtained by taking the sum of all of the conjugates
of α.
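2.1.88 Theorem Let $K = \mathbb{F}_q$ and $F = \mathbb{F}_{q^n}$. The norm function has the following properties:
1. $\mathrm{Norm}_{F/K}(\alpha) \in K$;
2. $\mathrm{Norm}_{F/K}(\alpha\beta) = \mathrm{Norm}_{F/K}(\alpha)\,\mathrm{Norm}_{F/K}(\beta)$ for α, β ∈ F;
3. the norm maps F onto K and $F^*$ onto $K^*$;
4. $\mathrm{Norm}_{F/K}(\alpha) = \alpha^n$ if α ∈ K;
5. $\mathrm{Norm}_{F/K}(\alpha^q) = \mathrm{Norm}_{F/K}(\alpha)$;
6. if K ⊆ F ⊆ E are finite fields then
$\mathrm{Norm}_{E/K}(\alpha) = \mathrm{Norm}_{F/K}(\mathrm{Norm}_{E/F}(\alpha))$.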
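The trace of Definition 2.1.80 and the norm of Definition 2.1.86 can be tabulated for a small field. The sketch below is illustrative only: it takes $K = \mathbb{F}_3$ and $F = \mathbb{F}_9 = \mathbb{F}_3(\alpha)$ with $\alpha^2 = 2\alpha + 1$, stores aα + b as the pair (a, b), and uses hypothetical helper names (add, mul, power, trace, norm).

```python
from itertools import product

P = 3  # K = F_3, F = F_9 with α^2 = 2α + 1; the pair (a, b) means a*α + b

def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def mul(u, v):
    a, b = u
    c, d = v
    # (aα + b)(cα + d) = ac*α^2 + (ad + bc)α + bd, with α^2 = 2α + 1
    return ((2 * a * c + a * d + b * c) % P, (a * c + b * d) % P)

def power(u, e):
    result = (0, 1)
    for _ in range(e):
        result = mul(result, u)
    return result

def trace(u):
    """Tr(u) = u + u^q with q = 3 and n = 2."""
    return add(u, power(u, 3))

def norm(u):
    """Norm(u) = u * u^q = u^((q^2 - 1)/(q - 1)) = u^4."""
    return mul(u, power(u, 3))

F9 = list(product(range(P), repeat=2))
print(all(trace(u)[0] == 0 and norm(u)[0] == 0 for u in F9))  # values lie in K
print(sorted({trace(u)[1] for u in F9}))                      # trace is onto K
print(sorted({norm(u)[1] for u in F9 if u != (0, 0)}))        # norm maps F* onto K*
```

A pair with first coordinate 0 is an element of the base field $\mathbb{F}_3$, which is how the first check expresses that traces and norms lie in K.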
2.1.5 Bases
2.1.89 Remark Every finite field F is a vector space over each of its subfields, and thus has a
vector space basis over each of its subfields. There are several different kinds of bases for
finite fields. Each kind of basis facilitates certain computations. When doing computations in
finite fields, there are some important operations like addition, multiplication, q-th powering
and finding inverses. With some bases computing inverses and q-th powers are easy, while
multiplication could be more involved. With other bases, one can calculate multiplications
quickly at the cost of more complicated inverse computations or exponentiations.
2.1.90 Remark The vector space of all n × r matrices over a field $\mathbb{F}_q$ is of dimension nr over $\mathbb{F}_q$.
Taking into account the order of the elements, the total number of distinct bases of $\mathbb{F}_{q^n}$
over $\mathbb{F}_q$ is given by
$(q^n - 1)(q^n - q) \cdots (q^n - q^{n-1})$,
which is also equal to the number of elements in the general linear group $GL_n(\mathbb{F}_q)$, the group
of nonsingular n × n matrices over $\mathbb{F}_q$.
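The product formula in Remark 2.1.90 can be compared against an exhaustive count for very small parameters. The following Python sketch is an illustration under the assumption that q = p is prime, so that $\mathbb{F}_{p^n}$ may be modelled as the vector space of n-tuples over the integers modulo p; both function names are hypothetical.

```python
from itertools import product

def count_ordered_bases(q, n):
    """(q^n - 1)(q^n - q) ... (q^n - q^(n-1))."""
    count = 1
    for i in range(n):
        count *= q ** n - q ** i
    return count

def brute_force_bases(p, n):
    """Count ordered bases of F_p^n (p prime) by testing every n-tuple of vectors."""
    vectors = list(product(range(p), repeat=n))
    count = 0
    for candidate in product(vectors, repeat=n):
        independent = True
        for coeffs in product(range(p), repeat=n):
            if any(coeffs):
                combo = [sum(c * v[k] for c, v in zip(coeffs, candidate)) % p
                         for k in range(n)]
                if not any(combo):   # a nontrivial combination vanishes
                    independent = False
                    break
        count += independent
    return count

print(count_ordered_bases(2, 3), brute_force_bases(2, 3))  # 168 168
print(count_ordered_bases(3, 2), brute_force_bases(3, 2))  # 48 48
```

The exhaustive count grows far too quickly to be useful beyond toy sizes; it is included only to make the formula tangible.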
2.1.91 Remark Consider Fqn as a vector space over Fq of dimension n. We know there are many
bases for this vector space. Given B = {α1 , . . . , αn } ⊆ Fqn , how can we tell if B is a basis
for Fqn over Fq ? We begin with a test which determines whether a set of elements of Fqn
is independent over Fq . If this result is applied to a set containing n elements, it can thus
be used to determine whether these elements form a basis of Fqn over Fq . We require the
following notation.
2.1.92 Definition Let K = Fq and F = Fqn . Let {α1 , . . . , αn } be a set of n elements of F viewed
as a vector space over the subfield K. We define the discriminant ∆F/K (α1 , . . . , αn )
with the following rule:
2.1.93 Theorem If α1 , . . . , αn ∈ Fqn , then the set {α1 , . . . , αn } is a basis for Fqn over Fq if and
only if ∆Fqn /Fq (α1 , . . . , αn ) is nonzero.
2.1.94 Remark The following result provides an alternative method to determine if a given set of
elements forms a basis. We note that the calculations for this method must be done in the
extension field, not in the base field. Working in the extension field may have a significant
computational cost. For example, if the base field is F2 and the extension field is F21000 then
computations in the base field are much faster than computations in the extension field.
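2.1.95 Corollary The set $\{\alpha_1, \ldots, \alpha_n\}$ is a basis for $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$ if and only if
$$\begin{vmatrix} \alpha_1 & \cdots & \alpha_n \\ \alpha_1^{q} & \cdots & \alpha_n^{q} \\ \vdots & \ddots & \vdots \\ \alpha_1^{q^{n-1}} & \cdots & \alpha_n^{q^{n-1}} \end{vmatrix} \neq 0.$$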
2.1.96 Definition Let α be a root of an irreducible polynomial of degree n over $\mathbb{F}_q$. The set
$\{1, \alpha, \alpha^2, \ldots, \alpha^{n-1}\}$ is a polynomial basis of the field $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$.
2.1.97 Remark When we use a polynomial basis for Fqn we can regard field elements, which in
reality are polynomials in α of degree at most n − 1, as vectors. We can then easily add
vectors in the usual way by adding the corresponding coefficients. Field multiplication is
more complicated since we must gather terms with like powers of the basis elements when
we simplify a product.
2.1.98 Definition If $\alpha \in \mathbb{F}_{q^n}$ and $\{\alpha, \alpha^q, \ldots, \alpha^{q^{n-1}}\}$ is a basis for $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$, then the basis is a
normal basis of $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$, and α is a normal element.
2.1.99 Remark If $\beta = a_0\alpha + a_1\alpha^q + \cdots + a_{n-1}\alpha^{q^{n-1}}$, so that β is represented by the vector
$(a_0, \ldots, a_{n-1})$, then $\beta^q$ is simply represented by the shifted vector $(a_{n-1}, a_0, \ldots, a_{n-2})$.
Thus if we have a normal basis, it is extremely easy to raise a field element to the power
q. Addition is of course also still easy to compute using a normal basis. We note that
multiplication of field elements is quite complicated using a normal basis. In Section 5.2 we
give important properties of normal bases including their existence for any finite extension
field of $\mathbb{F}_q$.
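The cyclic-shift behaviour of q-th powering in a normal basis can be watched in a small case. The sketch below is a hedged illustration, not a construction from Section 5.2: it assumes $\mathbb{F}_8 = \mathbb{F}_2[x]/(x^3 + x^2 + 1)$ with elements stored as 3-bit integers (bit k is the coefficient of $x^k$), takes α to be the class of x, and uses the hypothetical helper names mul and from_coords.

```python
from itertools import product

MOD = 0b1101  # x^3 + x^2 + 1, irreducible over F_2

def mul(a, b):
    """Multiply two elements of F_8, stored as 3-bit integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:   # degree reached 3: reduce by the modulus
            a ^= MOD
    return result

alpha = 0b010
alpha2 = mul(alpha, alpha)
alpha4 = mul(alpha2, alpha2)
basis = [alpha, alpha2, alpha4]  # candidate normal basis {α, α^2, α^4}

def from_coords(c):
    """Element a0*α + a1*α^2 + a2*α^4 for a coordinate vector c = (a0, a1, a2)."""
    value = 0
    for coeff, b in zip(c, basis):
        if coeff:
            value ^= b
    return value

# Table from elements back to coordinates; 8 distinct keys means the set is a basis.
to_coords = {from_coords(c): c for c in product((0, 1), repeat=3)}
print(len(to_coords) == 8)

for c in product((0, 1), repeat=3):
    beta = from_coords(c)
    print(c, "->", to_coords[mul(beta, beta)])  # squaring cyclically shifts c
```

Each printed pair shows (a0, a1, a2) being sent to (a2, a0, a1), which is the shift described in Remark 2.1.99 for q = 2 and n = 3.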
2.1.100 Definition Two ordered bases $\{\alpha_1, \ldots, \alpha_n\}$ and $\{\beta_1, \ldots, \beta_n\}$ of $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$ are complementary
(or dual) if $\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}(\alpha_i \beta_j) = \delta_{ij}$, where $\delta_{ij} = 0$ if $i \neq j$ and $\delta_{ij} = 1$ if i = j.
An ordered basis is self-dual if it is dual with itself.
2.1.101 Definition A primitive normal basis for an extension field $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$ is a basis of the
form $\{\alpha, \alpha^q, \alpha^{q^2}, \ldots, \alpha^{q^{n-1}}\}$, where α is a primitive element for $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$.
2.1.102 Remark Further kinds of bases for finite fields and their properties are discussed in detail
in Chapter 5. For example, we show that each basis of Fqn has a unique dual basis. We give
fundamental properties of normal bases and primitive normal bases in Section 5.2. We give
there, among other results, the fundamental theorem that for any prime power q and any
integer n ≥ 2 there exists a primitive normal basis for Fqn over Fq .
2.1.103 Definition Let $L(x) = \sum_{i=0}^{n-1} \alpha_i x^{q^i}$, where $\alpha_i \in \mathbb{F}_{q^n}$. A polynomial of this form is a
linearized polynomial over $\mathbb{F}_{q^n}$ (also a q-polynomial because the exponents are all powers
of q).
2.1.104 Remark These polynomials form an important class of polynomials over finite fields because
they are Fq -linear functions from Fqn to Fqn .
2.1.105 Theorem Let L(x) be a linearized polynomial. Then for all α, β ∈ Fqn and all c ∈ Fq , we
have
1. L(α + β) = L(α) + L(β),
2. L(cα) = cL(α).
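Both properties can be verified exhaustively in a small case. The sketch below assumes the model $\mathbb{F}_9 = \mathbb{F}_3[i]/(i^2 + 1)$ with $q = 3$ and $n = 2$, elements stored as pairs `(a, b)` meaning $a + bi$; the coefficients in `COEFFS` and all function names are arbitrary illustrative choices.

```python
from itertools import product

P = 3  # q = 3; F_9 is modelled as F_3[i]/(i^2 + 1) (an assumption of this sketch)


def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)


def mul(u, v):
    a, b = u
    c, d = v
    # (a + b i)(c + d i) with i^2 = -1
    return ((a * c - b * d) % P, (a * d + b * c) % P)


def power(u, e):
    result = (1, 0)
    for _ in range(e):
        result = mul(result, u)
    return result


def linearized(coeffs, x):
    """Evaluate L(x) = sum_i coeffs[i] * x^(q^i) with q = 3."""
    total = (0, 0)
    for i, c in enumerate(coeffs):
        total = add(total, mul(c, power(x, P ** i)))
    return total


FIELD = list(product(range(P), repeat=2))
COEFFS = [(1, 2), (2, 1)]  # arbitrary alpha_0, alpha_1 in F_9


def is_fq_linear(coeffs):
    """Check L(x + y) = L(x) + L(y) and L(c x) = c L(x) for all c in F_3."""
    additive = all(
        linearized(coeffs, add(x, y))
        == add(linearized(coeffs, x), linearized(coeffs, y))
        for x in FIELD
        for y in FIELD
    )
    homogeneous = all(
        linearized(coeffs, mul((c, 0), x)) == mul((c, 0), linearized(coeffs, x))
        for c in range(P)
        for x in FIELD
    )
    return additive and homogeneous
```

With coefficients `[(1, 0), (1, 0)]` the same routine evaluates $x + x^3$, which is the trace from $\mathbb{F}_9$ to $\mathbb{F}_3$ in this model, so its values have zero second coordinate.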
2.1.106 Theorem Let L be a nonzero linearized polynomial over Fqn and assume that the roots of L
lie in the field Fqs , an extension field of Fqn . Then each root of L has the same multiplicity,
which is either 1, or a positive power of q.
2.1.107 Remark The Frobenius automorphism $x \mapsto x^q$ is one such example, and the trace function
$\mathrm{Tr}(x) = \sum_{i=0}^{n-1} x^{q^i}$ provides another important example of a linearized polynomial over $\mathbb{F}_q$.
2.1.108 Definition Let L be a linearized polynomial over Fqn . A polynomial of the form A(x) =
L(x) − α, for α ∈ Fqn , is an affine polynomial over Fqn .
2.1.109 Theorem Let A be a nonzero affine polynomial over Fqn and assume that the roots of A
lie in the field Fqs , an extension field of Fqn . Then each root of A has the same multiplicity,
which is either 1, or a positive power of q.
2.1.110 Remark We collect here some concepts and results needed in later sections of the handbook.
2.1.111 Definition For f ∈ Fq [x], Φq (f ) denotes the number of polynomials over Fq which are of
smaller degree than the degree of f and which are relatively prime to f . This is also the
number of units in the ring Fq [x]/(f (x)).
2.1.112 Remark Similarly to the corresponding properties for the Euler function from elementary
number theory, we have the following result (see Lemma 3.69 of [1939] and Definition 2.1.43).
2.1.113 Lemma The function Φq has the following properties:
1. Φq (f ) = 1 if the degree of f is 0;
2. Φq (f g) = Φq (f )Φq (g) if f and g are relatively prime;
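3. if f has degree n ≥ 1 then
$\Phi_q(f) = q^n (1 - q^{-n_1}) \cdots (1 - q^{-n_r})$,
where $n_1, \ldots, n_r$ are the degrees of the distinct monic irreducible factors of f in $\mathbb{F}_q[x]$.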
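The product formula can be compared against a direct count for $q = 2$. This is a sketch under stated assumptions: polynomials over $\mathbb{F}_2$ are encoded as integer bitmasks (bit $i$ is the coefficient of $x^i$), $n_1, \ldots, n_r$ are taken to be the degrees of the distinct irreducible factors of $f$ as in Lemma 3.69 of [1939], and the factors are found by naive trial division. All function names are illustrative.

```python
def deg(a):
    """Degree of a bitmask polynomial over F_2 (deg(0) is -1)."""
    return a.bit_length() - 1


def poly_divmod(a, m):
    """Quotient and remainder of a by m over F_2 (m must be nonzero)."""
    q = 0
    dm = deg(m)
    while a and deg(a) >= dm:
        shift = deg(a) - dm
        q ^= 1 << shift
        a ^= m << shift
    return q, a


def poly_gcd(a, b):
    while b:
        a, b = b, poly_divmod(a, b)[1]
    return a


def phi_brute(f):
    """Count g with deg g < deg f and gcd(f, g) = 1."""
    return sum(1 for g in range(1 << deg(f)) if poly_gcd(f, g) == 1)


def distinct_factor_degrees(f):
    """Degrees of the distinct irreducible factors of f, by trial division."""
    degrees = []
    g = 2  # the polynomial x
    while deg(f) >= 1:
        q, r = poly_divmod(f, g)
        if r == 0:
            degrees.append(deg(g))
            while r == 0:  # strip every power of g
                f = q
                q, r = poly_divmod(f, g)
        g += 1
    return degrees


def phi_formula(f):
    """q^n * prod(1 - q^(-n_i)) for q = 2, in exact integer arithmetic."""
    result = 1 << deg(f)
    for d in distinct_factor_degrees(f):
        result = result * ((1 << d) - 1) // (1 << d)
    return result
```

For example, $f = x^3 + x = x(x+1)^2$ is encoded as `0b1010`, and both routines give 2 (the coprime polynomials are $1$ and $x^2 + x + 1$).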
2.1.115 Remark The following is a synopsis of properties of roots of unity and cyclotomic polyno-
mials, which can be found in [1939, Chapters 2 and 3].
2.1.116 Remark Let n be a positive integer. The polynomial $x^n - 1$ has many special properties over
any field. For example, $x^n - 1$ is the minimal polynomial of the Frobenius automorphism
which generates the Galois group of $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$, which is useful when studying normal bases
over finite fields; see Section 5.2. Many of the basic properties of cyclotomic polynomials
(and their roots) hold over arbitrary fields; however, in this section we restrict to the finite
field case.
2.1.117 Definition The roots α1 , α2 , . . . , αn ∈ Fqn of the polynomial xn − 1 ∈ Fq [x] are the n-th
roots of unity over Fq .
2.1.118 Remark The roots of any degree n polynomial over Fq must be in Fqn . Thus, the n-th roots
of unity of Fq are all contained in Fqn .
2.1.119 Theorem Let n be a positive integer and let $\mathbb{F}_q$ be a finite field of characteristic p. If p does
not divide n, the n-th roots of unity form a cyclic group of order n with respect to multiplication
in an extension field of $\mathbb{F}_q$. Otherwise, let $n = mp^e$, where $e > 0$ and $\gcd(m, p) = 1$. Then $x^n - 1 = (x^m - 1)^{p^e}$
and the n-th roots of unity are the m-th roots of unity with multiplicity $p^e$.
2.1.120 Definition Let $\mathbb{F}_q$ be a finite field of characteristic p which does not divide n. Denote the
cyclic group of n-th roots of unity as $U_n$. Suppose that $U_n$ is generated by $\alpha \in \mathbb{F}_{q^n}$, that
is $U_n = \langle \alpha \rangle$. Then $\alpha$ is a primitive n-th root of unity over $\mathbb{F}_q$.
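2.1.121 Definition Let $\mathbb{F}_q$ have characteristic p, not dividing n, and let $\zeta$ be a primitive n-th root
of unity over $\mathbb{F}_q$. Then the polynomial
$$Q_n(x) = \prod_{\substack{s=1 \\ \gcd(s,n)=1}}^{n} (x - \zeta^s)$$
is the n-th cyclotomic polynomial over $\mathbb{F}_q$.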
2.1.122 Remark The n-th cyclotomic polynomial does not depend on the choice of the primitive n-th root
of unity $\zeta$, since $\zeta^s$, $\gcd(s, n) = 1$, runs over all primitive n-th roots of unity.
2.1.123 Theorem Let Fq be a finite field with characteristic p which does not divide n. Then
1. deg(Qn ) = φ(n);
2. $x^n - 1 = \prod_{d \mid n} Q_d(x)$;
8. $Q_n(-1) = \begin{cases} -2 & \text{if } n = 1, \\ 0 & \text{if } n = 2, \\ p & \text{if } n = 2p^e, \\ 1 & \text{otherwise.} \end{cases}$
2.1.128 Theorem Let n be a positive integer not divisible by the characteristic of $\mathbb{F}_q$. An explicit
factorization of $Q_n$ over $\mathbb{F}_q$ is given by
$$Q_n(x) = \prod_{d \mid n} (x^d - 1)^{\mu(n/d)} = \prod_{d \mid n} (x^{n/d} - 1)^{\mu(d)},$$
where $\mu$ is the Möbius function.
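The Möbius product formula translates directly into a small computation. The sketch below works with integer coefficient lists (lowest degree first) and then reduces modulo a prime p, relying on the standard fact that, when p does not divide n, the cyclotomic polynomial over a field of characteristic p is the reduction of the integer one. The function names and the list representation are illustrative assumptions.

```python
def mobius(n):
    """Moebius function mu(n) by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # n has a square factor
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def poly_div_exact(a, b):
    """Exact quotient a / b for a monic divisor b (integer coefficients)."""
    a = a[:]
    q = [0] * (len(a) - len(b) + 1)
    for k in range(len(q) - 1, -1, -1):
        q[k] = a[k + len(b) - 1] // b[-1]
        for j, y in enumerate(b):
            a[k + j] -= q[k] * y
    assert not any(a), "division was not exact"
    return q


def cyclotomic(n):
    """Q_n(x) = prod over d | n of (x^d - 1)^mu(n/d), over the integers."""
    num, den = [1], [1]
    for d in range(1, n + 1):
        if n % d:
            continue
        factor = [-1] + [0] * (d - 1) + [1]  # x^d - 1
        mu = mobius(n // d)
        if mu == 1:
            num = poly_mul(num, factor)
        elif mu == -1:
            den = poly_mul(den, factor)
    return poly_div_exact(num, den)


def cyclotomic_mod_p(n, p):
    """Coefficients of Q_n reduced modulo a prime p not dividing n."""
    if n % p == 0:
        raise ValueError("p must not divide n")
    return [c % p for c in cyclotomic(n)]


def evaluate(coeffs, x):
    return sum(c * x ** i for i, c in enumerate(coeffs))
```

For instance, `cyclotomic(12)` yields the coefficients of $x^4 - x^2 + 1$, and `evaluate(cyclotomic(6), -1)` gives 3, matching the case $n = 2 \cdot 3$ of the formula for $Q_n(-1)$.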
2.1.132 Remark The property that every function over a finite commutative ring with identity can
be represented by a polynomial with coefficients in that ring characterizes finite fields. In
particular, if a finite commutative ring R with unity has the property that every function
from the ring to itself can be represented by a polynomial with coefficients in the ring, then
R is a finite field, and conversely.
2.1.133 Remark The Lagrange Interpolation Formula can also be stated in the following form: for
n ≥ 0, let a0 , . . . , an be n + 1 distinct elements of Fq , and let b0 , . . . , bn be n + 1 arbitrary
elements of Fq . Then, there exists exactly one polynomial f ∈ Fq [x] of degree less than or
equal to n such that f (ai ) = bi , i = 0, . . . , n. This polynomial is given by
$$f(x) = \sum_{i=0}^{n} b_i \prod_{\substack{k=0 \\ k \neq i}}^{n} \frac{x - a_k}{a_i - a_k}.$$
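The interpolation formula can be carried out mechanically over a prime field. The following sketch assumes $q = p$ prime so that field arithmetic is plain modular arithmetic; it uses Python's built-in `pow(x, -1, p)` (Python 3.8 or later) for inverses, and the function names are illustrative.

```python
def lagrange_interpolate(points, p):
    """Coefficients (lowest degree first) of the unique f over F_p with
    deg f <= n and f(a_i) = b_i, for n + 1 points with distinct a_i."""
    coeffs = [0] * len(points)
    for i, (ai, bi) in enumerate(points):
        numerator = [1]  # running product of (x - a_k), k != i
        denominator = 1  # running product of (a_i - a_k), k != i
        for k, (ak, _) in enumerate(points):
            if k == i:
                continue
            shifted = [0] * (len(numerator) + 1)
            for j, c in enumerate(numerator):
                shifted[j] = (shifted[j] - ak * c) % p
                shifted[j + 1] = (shifted[j + 1] + c) % p
            numerator = shifted
            denominator = denominator * (ai - ak) % p
        scale = bi * pow(denominator, -1, p) % p
        for j, c in enumerate(numerator):
            coeffs[j] = (coeffs[j] + scale * c) % p
    return coeffs


def evaluate(coeffs, x, p):
    """Horner evaluation of a coefficient list modulo p."""
    value = 0
    for c in reversed(coeffs):
        value = (value * x + c) % p
    return value
```

A usage example over $\mathbb{F}_7$: `lagrange_interpolate([(0, 1), (1, 3), (2, 2), (4, 6)], 7)` returns four coefficients of a polynomial of degree at most 3 that takes the prescribed values at 0, 1, 2, and 4.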
2.1.134 Theorem Let $f : \mathbb{F}_q^n \to \mathbb{F}_q$. The polynomial $P_f(x_1, \ldots, x_n)$ represents f, that is,
$P_f(b_1, \ldots, b_n) = f(b_1, \ldots, b_n)$ for all $(b_1, \ldots, b_n) \in \mathbb{F}_q^n$, where
$$P_f(x_1, \ldots, x_n) = \sum_{(a_1, \ldots, a_n) \in \mathbb{F}_q^n} f(a_1, \ldots, a_n)\,[1 - (x_1 - a_1)^{q-1}] \cdots [1 - (x_n - a_n)^{q-1}].$$
2.1.7.4 Discriminants
2.1.135 Definition Let f be a polynomial of degree n in $\mathbb{F}_q[x]$ with leading coefficient a, and with
roots $\alpha_1, \alpha_2, \ldots, \alpha_n$ in its splitting field, counted with multiplicity. The discriminant of
f is given by
$$D(f) = a^{2n-2} \prod_{1 \le i < j \le n} (\alpha_i - \alpha_j)^2.$$
2.1.139 Definition If the elements of $\mathbb{F}_q^*$ are represented as powers of a fixed primitive element
$b \in \mathbb{F}_q$, then addition in $\mathbb{F}_q$ can be facilitated by using Jacobi logarithms (sometimes
also called Zech logarithms) L(n) defined by the equation $1 + b^n = b^{L(n)}$, where the case
$b^n = -1$ is excluded.
2.1.140 Remark One can show that $b^m + b^n = b^{m + L(n-m)}$ whenever this is defined. Tables of
Jacobi logarithms for fields of characteristic 2 and order at most 64 can be found in Table
B of [1939]. Jacobi logarithms were first studied by Jacobi [1584].
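The addition rule can be illustrated in a prime field, where the arithmetic is easy to follow. This sketch assumes $q = 7$ with primitive element $b = 3$ (an illustrative choice; the tables cited above concern characteristic 2), stores exponents modulo $q - 1$, and returns `None` in the excluded case where the sum is zero. The function names are hypothetical.

```python
def zech_table(p, b):
    """Map n -> L(n) with 1 + b^n = b^L(n) in F_p; n with b^n = -1 is omitted."""
    discrete_log = {pow(b, k, p): k for k in range(p - 1)}
    table = {}
    for n in range(p - 1):
        s = (1 + pow(b, n, p)) % p
        if s != 0:
            table[n] = discrete_log[s]
    return table


def add_powers(m, n, p, table):
    """Exponent e with b^m + b^n = b^e, using b^m + b^n = b^(m + L(n - m)).

    Returns None when b^m + b^n = 0, where L(n - m) is undefined.
    """
    d = (n - m) % (p - 1)
    if d not in table:
        return None
    return (m + table[d]) % (p - 1)
```

With `table = zech_table(7, 3)`, the exponent 3 is absent from the table because $3^3 = 6 = -1$ in $\mathbb{F}_7$, and every other sum of two powers of 3 is obtained by a table lookup and one addition of exponents.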
2.1.141 Remark In this subsection we briefly describe several algebraic systems that have many
but perhaps not all of the properties of a field. We are indebted to John Sheekey (Università
di Padova) for this section.
2.1.142 Definition A left (resp. right) prequasifield is a set Q together with two operations, “+”
and “·” such that:
(1) (Q, +) is an abelian group;
(2) for all a, b, c ∈ Q there exist unique x, y, z ∈ Q such that
a · (b + c) = a · b + a · c (resp.(b + c) · a = b · a + c · a).
2.1.144 Remark All left prequasifields have prime power order. Left prequasifields coordinatize
translation planes. The smallest left prequasifield which is not a field has order 9. The
smallest semifield which is not a field has order 16. For more on the above structures
see [807] or [1560].
2.1.145 Remark The multiplicative structure of a left prequasifield is a quasigroup. The multi-
plicative structure of a semifield is a loop. The multiplicative structure of a nearfield is a
group.
2.1.146 Definition Let Q be a set together with two operations, “+” and “·”, containing additive
identity 0 and multiplicative identity 1, such that:
(1) (Q \ {0}, ·) is a group;
(2) left and right distributive laws hold: for all a, b, c ∈ Q
a · (b + c) = a · b + a · c and (b + c) · a = b · a + c · a;
(3) for all a ∈ Q, a + 0 = 0 + a = a.
(4A) Q is a neofield if in addition to (1), (2) and (3) it satisfies for all a, b ∈ Q there
exist unique x, y ∈ Q such that
a+x=b and y + a = b.
(4B) Q is a division semiring if in addition to (1), (2) and (3) above it also satisfies
+ is associative and commutative.
2.1.147 Remark The additive structure of a neofield is a loop. The additive structure of a division
semiring is a commutative monoid.
2.1.148 Remark For more properties of semirings see [1291]. Note that a division semiring in which
multiplication is commutative is sometimes also referred to as a semifield, but this definition
does not coincide with the previously defined structures.
2.1.149 Remark For more properties of neofields see [2343].
2.1.150 Remark We briefly describe Galois rings. We are indebted to Horacio Tapia-Recillas (Uni-
versidad Autónoma Metropolitana, Unidad Iztapalapa, México) for this subsection.
2.1.151 Remark Galois rings represent a natural (Galois) extension of the (local) modular ring
of integers $\mathbb{Z}/p^m\mathbb{Z}$ where p is a prime and m a positive integer. Krull [1808] recognized
their existence and later, Janusz [1593] and Raghavendran [2436] independently obtained
additional properties of these rings. More details on Galois rings can be found in [283, 1409,
2045, 2921].
2.1.152 Definition Let $\mathbb{Z}/p^m\mathbb{Z}$ be the ring of integers modulo $p^m$, p a prime and m > 1 an integer.
A monic irreducible (primitive) polynomial $f \in (\mathbb{Z}/p^m\mathbb{Z})[x]$ of degree n is a monic basic
irreducible (primitive) if its reduction modulo p is irreducible (primitive) in $(\mathbb{Z}/p\mathbb{Z})[x]$.
2.1.153 Remark Monic basic irreducible (primitive) polynomials in $(\mathbb{Z}/p^m\mathbb{Z})[x]$ can be determined
by means of Hensel’s Lifting Lemma from monic irreducible (primitive) polynomials in
$(\mathbb{Z}/p\mathbb{Z})[x]$.
2.1.154 Definition Let $f \in (\mathbb{Z}/p^m\mathbb{Z})[x]$ be a monic basic irreducible polynomial of degree n. Then
the Galois ring determined by f is
$GR(p^m, n) = (\mathbb{Z}/p^m\mathbb{Z})[x]/\langle f(x) \rangle$.
2.1.155 Remark With the above notation, an equivalent definition of a Galois ring is the following.
2.1.156 Definition Let $\mathbb{Z}$ be the ring of (rational) integers and let $f \in \mathbb{Z}[x]$ be a monic polynomial
of degree n such that its reduction modulo $p\mathbb{Z}$ is irreducible. Then
$GR(p^m, n) = \mathbb{Z}[x]/\langle p^m, f \rangle$.
2.1.157 Remark The Galois ring $GR(p^m, n)$ can also be defined by means of the p-adic numbers in
the following way.
2.1.158 Definition Let p be a prime, $\mathbb{Q}_p$ be the field of p-adic (rational) numbers and $\mathbb{Z}_p$ be the
ring of p-adic integers (for details see [2588]). Let n be a positive integer and let $\omega$
be a $(p^n - 1)$-th root of unity. Then $\mathbb{Q}_p(\omega)$ is an unramified Galois extension of degree n
of $\mathbb{Q}_p$. Let $\mathbb{Z}_p[\omega]$ be the ring of elements of $\mathbb{Q}_p(\omega)$ integral over $\mathbb{Z}_p$. Let $p\mathbb{Z}_p[\omega]$ be the
(unique) maximal ideal of $\mathbb{Z}_p[\omega]$ generated by p. Then the quotient $\mathbb{Z}_p[\omega]/p\mathbb{Z}_p[\omega]$ is a
field isomorphic to the Galois field $\mathbb{F}_{p^n}$.
2.1.159 Definition With the notation as above let m be a positive integer and let $p^m\mathbb{Z}_p[\omega]$ be the
principal ideal of $\mathbb{Z}_p[\omega]$ generated by $p^m$. Then the Galois ring $GR(p^m, n)$ is defined as:
$GR(p^m, n) = \mathbb{Z}_p[\omega]/p^m\mathbb{Z}_p[\omega]$.
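2.1.160 Remark This ring contains as a subring the ring of integers modulo $p^m$, $\mathbb{Z}/p^m\mathbb{Z}$, and can
be thought of as an extension of $\mathbb{Z}/p^m\mathbb{Z}$ by adjoining a $(p^n - 1)$-th root of unity $\omega$:
$GR(p^m, n) = (\mathbb{Z}/p^m\mathbb{Z})[\omega]$.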
2.1.161 Theorem With the notation as above, the basic properties of the Galois ring $GR(p^m, n)$
are the following [283, 1409, 2045, 2921]:
1. $GR(p^m, n)$ contains $\mathbb{Z}/p^m\mathbb{Z}$ as a subring, it has characteristic $p^m$ and cardinality
$p^{mn}$. The integer m is the nilpotency index of the Galois ring.
2. The ring $GR(p^m, n)$ is local with maximal ideal $M = \langle p \rangle = p\,GR(p^m, n)$ generated
by p, and a principal ideal ring where any ideal is of the form $\langle p^i \rangle$ for
i = 0, 1, 2, . . . , m. Furthermore, it is a finite chain ring:
3. Each non-zero element of the Galois ring $GR(p^m, n)$ can be written as $up^k$, where
u is a unit and 0 ≤ k ≤ m − 1. In this representation the integer k is unique and
the unit u is unique modulo the ideal $\langle p^{m-k} \rangle$.
4. The canonical homomorphism $\varphi : GR(p^m, n) \to GR(p^m, n)/M$, between the
Galois ring and its residue field $GR(p^m, n)/M$, is such that $\varphi(\xi) = \bar{\xi}$ is a root of
$\varphi(f(x))$. The residue field is isomorphic to the Galois field $GF(p^n) = \mathbb{F}_{p^n}$ with
$p^n$ elements. Furthermore, $GF(p^n)^* = \langle \bar{\xi} \rangle$.
5. The Galois ring is a $(\mathbb{Z}/p^m\mathbb{Z})$-module:
6. The group of units U of the Galois ring $GR(p^m, n)$ has the following structure:
U = C × G,
where $q = p^n$.
9. Given a prime p and an integer n > 1, for each divisor r of n there is a unique
Galois ring $GR(p^m, r)$, and any subring of the Galois ring $GR(p^m, n)$ is of this
form.
10. For each positive integer t, there is a natural injective ring homomorphism
$GR(p^m, n) \to GR(p^m, nt)$.
11. There is a natural surjective ring homomorphism $GR(p^m, n) \to GR(p^{m-1}, n)$
with kernel $\langle p^{m-1} \rangle$.
12. The group of automorphisms of the Galois ring $GR(p^m, n)$ is a cyclic group of
order n.
13. The Galois ring $GR(p^m, n)$ is quasi-Frobenius.
2.1.162 Example $GR(p, n) = GF(p, n) = \mathbb{F}_{p^n}$, $GR(p^m, 1) = \mathbb{Z}/p^m\mathbb{Z}$.
2.1.163 Example [2045] The polynomial $f(x) = x^3 + x + 1 \in (\mathbb{Z}/2^2\mathbb{Z})[x]$ is monic basic irreducible
over $\mathbb{Z}/2^2\mathbb{Z}$. Then $GR(2^2, 3) = (\mathbb{Z}/2^2\mathbb{Z})[x]/\langle f(x) \rangle$.
2.1.164 Example [1409] The polynomial $g(x) = x^3 + 2x^2 + x - 1 \in (\mathbb{Z}/2^2\mathbb{Z})[x]$ is also monic basic
irreducible over $\mathbb{Z}/2^2\mathbb{Z}$. Then $GR(2^2, 3) = (\mathbb{Z}/2^2\mathbb{Z})[x]/\langle g(x) \rangle$.
2.1.165 Example [283] The polynomial $g(x) = x^3 - 2x^2 - x - 1 \in (\mathbb{Z}/2^3\mathbb{Z})[x]$ is monic basic
irreducible over $\mathbb{Z}/2^3\mathbb{Z}$. Then $GR(2^3, 3) = (\mathbb{Z}/2^3\mathbb{Z})[x]/\langle g(x) \rangle$.
2.1.166 Remark We give a list of finite field related books, divided into categories and listed without
duplication even though a number of these books could be listed in two or more categories.
2.1.8.1 Textbooks
2.1.167 Remark We begin by listing a number of books that could be used as textbooks. Ref-
erence [1939] by Lidl and Niederreiter is, by far, the most comprehensive. Other text-
books include Jungnickel [1631], Lidl and Niederreiter [1938], Masuda and Panario [2017],
McEliece [2049], Menezes et al. [2077], Mullen and Mummert [2179], Small [2681], and
Wan [2921, 2923].
2.1.168 Remark We list a number of books dealing with various theoretical topics related to finite
fields: [240, 398, 557, 850, 961, 1121, 1122, 1333, 1389, 1511, 1631, 1701, 1756, 1773, 1843,
1845, 1922, 1936, 1938, 1939, 2017, 2049, 2054, 2077, 2107, 2548, 2637, 2641, 2667, 2670,
2672, 2681, 2711, 2714, 2793, 2920, 2921, 2923, 2949, 2950].
2.1.8.3 Applications
2.1.169 Remark The use of finite fields in algebraic coding theory has been the focus for numerous
books: [231, 270, 304, 311, 1558, 1943, 1945, 1991, 2252, 2281, 2404, 2405, 2819, 2820, 2849].
2.1.170 Remark Theoretical and applied aspects of cryptography have been treated in: [245, 312,
313, 661, 759, 762, 922, 1105, 1303, 1413, 1521, 1563, 1694, 1774, 2076, 2080, 2644, 2720].
2.1.171 Remark There have been several books on the applications of finite fields in combinatorics,
especially in combinatorial design theory and finite geometries: [131, 141, 211, 260, 261,
262, 453, 484, 706, 785, 807, 819, 1509, 1510, 1515, 1560, 1875, 2445, 2719, 2781, 2851].
2.1.8.4 Algorithms
2.1.172 Remark Several books contain results on algorithmic and computational finite field topics.
These include [761, 1227, 2632].
2.1.173 Remark The Finite Fields and Applications Conferences (Fq n series) have been held
every two years (except 2005) since 1991. The proceedings from these conferences are: [663,
1636, 1870, 2052, 2181, 2182, 2184, 2185, 2187, 2197]. Other conference proceedings volumes
include: [533, 535, 596, 869, 1057, 1228, 1306, 1316, 1436, 1478, 1480, 1928].
References Cited: [131, 136, 141, 211, 231, 240, 245, 260, 261, 262, 270, 304, 311, 312,
313, 398, 453, 484, 533, 535, 557, 596, 661, 663, 706, 759, 761, 762, 785, 797, 807, 819, 850,
869, 922, 961, 1057, 1105, 1121, 1122, 1227, 1228, 1291, 1303, 1306, 1316, 1333, 1389, 1413,
1436, 1478, 1480, 1509, 1510, 1511, 1515, 1521, 1558, 1560, 1563, 1570, 1584, 1631, 1636,
1694, 1701, 1756, 1773, 1774, 1843, 1845, 1848, 1875, 1922, 1928, 1936, 1938, 1939, 1943,
1945, 1991, 2017, 2049, 2052, 2054, 2076, 2077, 2080, 2107, 2144, 2179, 2181, 2182, 2184,
2185, 2187, 2197, 2223, 2252, 2280, 2281, 2343, 2404, 2405, 2445, 2548, 2632, 2637, 2641,
2644, 2667, 2670, 2672, 2681, 2711, 2714, 2719, 2720, 2781, 2793, 2819, 2820, 2849, 2851,
2920, 2921, 2923, 2949, 2950]
2.2 Tables
David Thomson, Carleton University
2.2.1 Remark Unless otherwise stated, all of the data given in this section was created by the
author and, when possible, was verified with known results. Basic algorithms (for example,
brute force) were preferred due to their reliability and ease of verification. Unless stated, all
simulations were done in C/C++ using the NTL version 5.5.2 library [2633] for modular
computations. NTL was compiled using the GMP version 4.3.2 library [2797] for multi-
precision arithmetic. Extended and machine-readable versions of the tables found in this
section can be found on the book’s website [2180].
2.2.2 Remark Since most computer algebra packages can readily handle basic finite field compu-
tations, our aim is not to repeat tables whose purpose is to improve hand-calculations. For
reference, we briefly recall the list of tables found in [1939].
Tables A and B are aids to perform fast arithmetic by hand over small finite fields.
Table A is a list of all elements over small finite fields and their discrete logarithms with
respect to a primitive element. Table B provides a list of Jacobi’s logarithms L(·) for $\mathbb{F}_{2^n}$,
2 ≤ n ≤ 6. These logarithms allow sums of field elements to be computed by the relationship
$\zeta^{\alpha} + \zeta^{\beta} = \zeta^{\alpha + L(\beta - \alpha)}$.
Table C provides a list of all monic irreducible polynomials of degree n over small prime
fields. Particularly, these tables cover p = 2 and n ≤ 11, p = 3 and n ≤ 7, p = 5 and n ≤ 5,
p = 7 and n ≤ 4.
Tables D, E, and F deal with primitive polynomials. Table D lists one primitive poly-
nomial over F2 for degrees n ≤ 100. Table E lists all quadratic primitive polynomials for
11 ≤ p ≤ 31 and Table F lists one primitive polynomial of degree n over $\mathbb{F}_p$ for all n ≥ 2
with p < 50 and $p^n < 10^9$.
2.2.3 Remark Low-weight irreducible polynomials are highly desired due to their efficiency in
hardware and software implementations of finite fields. Irreducible polynomials of degree at
least 2 over $\mathbb{F}_2$ must have an odd number of terms. Irreducible trinomials (having
3 terms) and, in their absence, irreducible pentanomials (having 5 terms) are useful; see, for
example, [1413, Chapter 2]. For cryptographic use, the irreducible trinomial or pentanomial
of lowest lexicographical order (for a fixed n, prefer the trinomial $x^n + x^k + 1$ over $x^n + x^{k_1} + 1$
when $k < k_1$; the analogue for pentanomials is obvious) is often preferred for transparency
reasons. However, the irreducible polynomial with the optimal performance for a given implementation
is not necessarily the one of lowest lex-order; see [2573] and Section 11.1. A list of the lowest-weight
lowest-lex-order irreducible polynomials over $\mathbb{F}_2$ is given in [2582] for degree n ≤ 10000. Table 2.2.1 gives
the lowest-weight, lowest-lex-order irreducible polynomial for n ≤ 1025. The output of the
table follows the format n, k (for trinomials $x^n + x^k + 1$) or $n, k_1, k_2, k_3$ (for pentanomials
$x^n + x^{k_1} + x^{k_2} + x^{k_3} + 1$). We have extended these tables to larger n and to larger q for
small values of n. Furthermore, the computer algebra package Magma [712] contains similar
tables, due to Steel (2004-2007), for the following values of q and n:
q                n ≤
2                120,000
3                50,000
4, 5, 7          2000
9 ≤ q ≤ 127      1000 (or more)
Sections 3.4 and 4.3 give more information on weights of irreducible and primitive polyno-
mials.
Table 2.2.1 Lowest weight lowest-lexicographical order irreducible polynomial of degree n over $\mathbb{F}_2$. Output:
n, k (for trinomials $x^n + x^k + 1$) or $n, k_1, k_2, k_3$ (for pentanomials $x^n + x^{k_1} + x^{k_2} + x^{k_3} + 1$).
2.2.4 Remark Constructions of irreducible low-weight polynomials are rare; see Sections 3.4
and 3.5. Instead, conditions for reducibility are often more tractable; see Section 3.3.
Swan [2753] gives conditions for when a trinomial $x^n + x^k + 1 \in \mathbb{F}_2[x]$ is reducible. In
particular, the trinomial is reducible when 8 divides n. Only partial results for Swan-like
conditions on pentanomials over F2 exist in the literature; see, for example, [1777] and
Section 3.3.
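The statement that no trinomial of degree divisible by 8 is irreducible over the binary field can be checked directly for small degrees. The sketch below is a brute-force test, in the spirit of the basic algorithms mentioned in Remark 2.2.1 but not the author's code: polynomials over $\mathbb{F}_2$ are integer bitmasks, and irreducibility is decided by trial division by every polynomial of degree at most $n/2$, which is only practical for small n. All names are illustrative.

```python
def deg(a):
    """Degree of a bitmask polynomial over F_2."""
    return a.bit_length() - 1


def poly_mod(a, m):
    """Remainder of a modulo m over F_2 (m must be nonzero)."""
    dm = deg(m)
    while a and deg(a) >= dm:
        a ^= m << (deg(a) - dm)
    return a


def is_irreducible(f):
    """Trial division by all polynomials of degree 1 .. deg(f) // 2."""
    n = deg(f)
    if n < 1:
        return False
    for g in range(2, 1 << (n // 2 + 1)):
        if poly_mod(f, g) == 0:
            return False
    return True


def irreducible_trinomials(n):
    """All k with 0 < k < n such that x^n + x^k + 1 is irreducible over F_2."""
    return [k for k in range(1, n) if is_irreducible((1 << n) | (1 << k) | 1)]
```

In this sketch `irreducible_trinomials(8)` and `irreducible_trinomials(16)` are both empty, whereas degree 7 admits the trinomial $x^7 + x + 1$, so the lowest-lex-order choice there has $k = 1$.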
2.2.5 Conjecture [2582] For every n, there exists either an irreducible trinomial of degree n over
F2 or, in the absence of an irreducible trinomial, there exists an irreducible pentanomial of
degree n over F2 .
2.2.6 Remark A polynomial of degree n over $\mathbb{F}_q$ is primitive if all of its roots are generators of the (cyclic)
multiplicative group $\mathbb{F}_{q^n}^*$. We give an analogous table to Table 2.2.1 but instead list the
lowest-weight lowest-lexicographical order primitive polynomial of degree n ≤ 577 over $\mathbb{F}_2$.
To test primitivity, we used the Cunningham project to find the factorization of $2^n - 1$;
see Section 2.2.3 for more details.
Table 2.2.2 Lowest weight lowest-lexicographical order primitive polynomial of degree n ≤ 577 over $\mathbb{F}_2$.
Output: n, k (for trinomials $x^n + x^k + 1$) or $n, k_1, k_2, k_3$ (for pentanomials $x^n + x^{k_1} + x^{k_2} + x^{k_3} + 1$).
2.2.7 Remark Table 2.2.3 is the analogous table to Table 2.2.1, giving the lowest-weight, lowest-
lexicographical order irreducible polynomial of degree n ≤ 516 over F3 .
Table 2.2.3 Lowest weight lowest lexicographical order irreducible polynomial of degree n over F3 .
Output: n, {degrees, (coefficients)}, (constant term).
2.2.8 Remark Necessary and sufficient conditions for the existence of an irreducible binomial
of degree n over finite fields of odd characteristic are given in [1939, Theorem 3.75]. A
constructive derivation of the degrees for which there exists an irreducible binomial over
Fq , q odd, is given in [2356]. The following conjecture summarizes empirical observations of
extending Tables 2.2.1 and 2.2.3 to higher characteristics.
2.2.9 Conjecture Let q > 2. For every n, there is an irreducible polynomial of degree n over Fq
of weight at most 4.
2.2.10 Remark Normal bases are often required in hardware implementations of finite fields due
to the efficiency of exponentiation when the finite field is represented using a normal basis.
The complexity of a normal basis N , CN , is defined in Definition 5.3.1. Normal bases with
low complexity are highly preferred. An optimal normal basis of Fqn over Fq is a normal
basis attaining the minimum complexity CN = 2n − 1. See Sections 5.2 and 5.3 for more
details on normal bases and their complexities.
2.2.11 Remark Table 2.2.4 is due to an exhaustive search for normal bases of F2n over F2 , n ≤ 39,
originally given in [2015]. The table gives the number of normal bases, the smallest and
largest complexities (mCN , MCN ), the average and variance (AvgCN , V arCN ) of complexities
and the smallest and largest complexities for self-dual normal elements. In Table 2.2.4, we
fix a typo on the minimum complexity of n = 37, originally noted in [130], and make some
minor corrections to the calculations of the averages and variances. In the “Notes” column,
“Optimal” indicates that the basis with minimal complexity is an optimal normal basis
(Theorem 5.3.6), and “sd” indicates that the minimal complexity basis is self-dual.
n  # Normal bases  mC_N  MC_N  AvgC_N  VarC_N  mC_N (self-dual)  MC_N (self-dual)  Notes
2 1 3 3 3.00 0 3 3 Optimal, sd
3 1 5 5 5.00 0 5 5 Optimal, sd
4 2 7 9 8.00 1.00 - -
5 3 9 15 11.67 6.22 9 9 Optimal, sd
6 4 11 17 15.00 6.00 11 15 Optimal, sd
7 7 19 27 23.00 9.14 21 21 mCN = 3n − 2
8 16 21 35 29.00 11 - - mCN = 3n − 3
9 21 17 45 35.57 41.57 17 29 Optimal, sd
10 48 19 61 44.83 61.31 27 51
11 93 21 71 55.82 57.65 21 57 Optimal, sd
12 128 23 83 64.13 107.23 - -
13 315 45 101 78.38 71.07 45 81 sd
14 448 27 135 91.07 108.42 27 135 Optimal, sd
15 675 45 137 105.89 127.36 45 105 sd
16 2048 85 157 115.82 114.59 - -
17 3825 81 177 136.83 136.67 81 171 sd
18 5376 35 243 153.51 185.12 35 243 Optimal, sd
19 13797 117 229 172.00 171.91 117 201 sd
20 24576 63 257 190.81 205.81 - -
21 27783 95 277 210.97 216.43 105 237
22 95232 63 363 231.93 238.56 63 363 mCN = 3n − 3
23 182183 45 325 254.02 254.60 45 309 Optimal, sd
24 262144 105 375 276.89 281.01 - -
25 629145 93 383 301.01 300.37 93 357 sd
26 1290240 51 555 325.96 328.59 51 555 Optimal, sd
27 1835001 141 443 351.99 351.38 141 413
28 3670016 55 517 378.98 379.12 - - Optimal
29 9256395 57 521 407.00 406.21 57 465 Optimal, sd
30 11059200 59 759 435.95 438.52 59 759 Optimal, sd
31 28629151 237 587 466.00 465.21 237 537 sd
32 67108864 361 621 497.00 496.07 - -
33 97327197 65 693 529.00 528.44 65 693 Optimal, sd
34 250675200 243 819 562.00 561.52 243 819 sd
35 352149515 69 779 596.00 595.08 69 693 Optimal, sd
36 704643060 71 1017 630.99 630.51 - - Optimal
37 1857283155 141 823 667 666.04 141 sd
38 3616800703 207 1131 704.00 703.18 207
39 5282242828 77 933 742.00 741.09 77 Optimal, sd
Table 2.2.4 Statistics for normal bases of F2n over F2 obtained by exhaustive search, n ≤ 39.
2.2.12 Remark Our first conjecture based on Table 2.2.4 appears in [3036] and elsewhere. We also
summarize the conjectures found in [2015].
2.2.13 Conjecture When no optimal normal basis of $\mathbb{F}_{2^n}$ over $\mathbb{F}_2$ exists, the minimum complexity
of a normal basis of $\mathbb{F}_{2^n}$ over $\mathbb{F}_2$ is at least 3n − 3.
2.2.14 Remark Normal bases of F2n over F2 achieving a complexity of 3n − 3 are given in Propo-
sition 5.3.46 and this complexity is the minimal found when n = 8 and n = 22.
2.2.15 Conjecture The number of normal bases of $\mathbb{F}_{2^n}$ over $\mathbb{F}_2$ is normally distributed with
respect to their complexities. Furthermore, the average complexity of a normal basis of $\mathbb{F}_{2^n}$
over $\mathbb{F}_2$ is $(n^2 - n + 3)/2$ and the variance is also $n^2/2 - cn$, for a small positive constant c.
2.2.16 Remark We remark that the conspicuous wording in Conjecture 2.2.15, that normal bases
are normally distributed, is mostly coincidental. Indeed, as n grows, the number of normal
bases grows like $2^n / \log(n)$, see Theorem 5.2.13, so the Central Limit Theorem supports
this conjecture. The precise distribution of the complexities is still an open and interesting
problem.
2.2.17 Remark Self-dual normal bases are often preferred in normal basis implementations due to
their highly symmetric properties; see Sections 5.1, 5.2, 5.3, 16.7 as well as [1264, 2925], for
more information on self-dual normal bases and their implementations. Exhaustive searches
of self-dual normal bases of $\mathbb{F}_{2^n}$ over $\mathbb{F}_2$ appear in [130, 1263, 1631, 2015] and [130] gives
an exhaustive search of self-dual normal bases of $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$ for larger q and odd n. Ta-
bles 2.2.5, 2.2.6, and 2.2.7 are directly from [130]; we note that we did not implement their
algorithm. Table 2.2.5 gives the minimum complexity Cn of a self-dual normal basis of F2n
over F2 for odd n ≤ 45, Table 2.2.6 for q a power of 2 and small n, and Table 2.2.7 for Fqn
over Fq for odd q ≤ 19 and small n.
n 3 5 7 9 11 13 15 17 19 21 23
Cn 5 9 21 17 21 45 45 81 117 105 45
n 25 27 29 31 33 35 37 39 41 43 45
Cn 93 141 57 237 65 69 141 77 81 165 153
Table 2.2.5 The lowest complexity for self-dual normal bases of F2n over F2 for odd n, n ≤ 45.
q/n 3 5 7 9 11 13 15 17 19 21 23 25
2 5 9 21 17 21 45 45 81 117 105 45 93
4 5 9 21 17 21 45 45 81 117 105 45 93
8 9 9 21 45 21 45 81 81
16 5 9 21 17 21 45
32 5 19 21 17 21
64 9 9 21 45
128 5 9 37
256 5 9
Table 2.2.6 Lowest complexity for self-dual normal bases of Fqn over Fq where q is a power of 2 for
small odd values of n.
q/n 3 5 7 9 11 13 15 17 19 21 23 25
3 7 13 25 37 55 67 – 91 172 – 127 135
5 6 13 25 46 64 85 – 157 153 150
7 6 16 19 41 61 96 87 –
11 6 13 25 52 31 100 78
13 6 13 25 51 64 37
17 8 13 25 51 64 100 –
19 8 13 31 51 67 –
Table 2.2.7 Lowest complexity for self-dual normal bases of Fqn over Fq for odd primes q ≤ 19 and
small odd values of n.
2.2.2.2 Minimum type of a Gauss period admitting a normal basis of F2n over F2
2.2.18 Remark We briefly recall the definition of a Gauss period (Definition 5.3.16). Let $r = nk + 1$
be a prime not dividing q and let $\gamma$ be a primitive r-th root of unity in $\mathbb{F}_{q^{nk}}$. Furthermore,
let K be the unique subgroup of order k in $\mathbb{Z}_r^*$ and $K_i = \{a \cdot q^i : a \in K\} \subseteq \mathbb{Z}_r^*$ be cosets of
K, 0 ≤ i ≤ n − 1. The elements
$$\alpha_i = \sum_{a \in K_i} \gamma^a \in \mathbb{F}_{q^n}, \quad 0 \le i \le n - 1,$$
are Gauss periods of type (n, k) over Fq . Gauss periods over finite fields are highly desirable
as normal bases since, when they exist, they have low complexity; see Theorem 5.3.23.
Normal bases due to Gauss periods of type (n, 1), for all q, and of type (n, 2), for q = 2,
characterize the optimal normal bases (Theorem 5.3.6) and have complexity 2n − 1. Gauss
periods also often have high order, see Remark 5.3.49. For conditions on when Gauss periods
of type (n, k) admit normal bases of Fqn over Fq , see Theorem 5.3.17. In particular, we note
that there is no Gauss period of F2n over F2 which admits a normal basis when 8 divides n.
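The minimum type can be computed with elementary modular arithmetic. The sketch below rests on an assumption that is not restated in this section: it takes the criterion of Theorem 5.3.17 to be that, for a prime $r = nk + 1$ not dividing q and e the multiplicative order of q modulo r, the Gauss period of type (n, k) yields a normal basis exactly when $\gcd(nk/e, n) = 1$. The function names and the search bound `k_max` are hypothetical.

```python
from math import gcd, isqrt


def is_prime(r):
    return r >= 2 and all(r % d for d in range(2, isqrt(r) + 1))


def multiplicative_order(q, r):
    """Order of q modulo r; requires gcd(q, r) = 1."""
    e, x = 1, q % r
    while x != 1:
        x = x * q % r
        e += 1
    return e


def min_gauss_type(q, n, k_max=50):
    """Smallest k <= k_max such that a type (n, k) Gauss period gives a
    normal basis, under the assumed criterion gcd(nk / e, n) = 1.
    Returns None if no such k is found within the search bound."""
    for k in range(1, k_max + 1):
        r = n * k + 1
        if not is_prime(r) or q % r == 0:
            continue
        e = multiplicative_order(q, r)  # e divides r - 1 = nk
        if gcd(n * k // e, n) == 1:
            return k
    return None
```

Under this assumption the routine returns `None` for q = 2 and n = 8, in line with the remark on degrees divisible by 8, and gives k = 2 for n = 3 and k = 4 for n = 7.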
Table 2.2.8 gives the lowest k for which a Gauss period of type (n, k) admits a normal
basis of $\mathbb{F}_{2^n}$ over $\mathbb{F}_2$ for n ≤ 577. We give a similar table over $\mathbb{F}_3$ in Table 2.2.9. This range
was chosen to cover degrees for common implementations of finite field arithmetic. The
output of the table is in the format “n, k”, where k is the minimum number admitting a
type (n, k) Gauss period over $\mathbb{F}_{q^n}$, where q = 2, 3.
2,1 3,2 4,1 5,2 6,2 7,4 9,2 10,1 11,2 12,1 13,4 14,2
15,4 17,6 18,1 19,10 20,3 21,10 22,3 23,2 25,4 26,2 27,6 28,1
29,2 30,2 31,10 33,2 34,9 35,2 36,1 37,4 38,6 39,2 41,2 42,5
43,4 44,9 45,4 46,3 47,6 49,4 50,2 51,2 52,1 53,2 54,3 55,12
57,10 58,1 59,12 60,1 61,6 62,6 63,6 65,2 66,1 67,4 68,9 69,2
70,3 71,8 73,4 74,2 75,10 76,3 77,6 78,7 79,4 81,2 82,1 83,2
84,5 85,12 86,2 87,4 89,2 90,2 91,6 92,3 93,4 94,3 95,2 97,4
98,2 99,2 100,1 101,6 102,6 103,6 105,2 106,1 107,6 108,5 109,10 110,6
111,20 113,2 114,5 115,4 116,3 117,8 118,6 119,2 121,6 122,6 123,10 124,3
125,6 126,3 127,4 129,8 130,1 131,2 132,5 133,12 134,2 135,2 137,6 138,1
139,4 140,3 141,8 142,6 143,6 145,10 146,2 147,6 148,1 149,8 150,19 151,6
153,4 154,25 155,2 156,13 157,10 158,2 159,22 161,6 162,1 163,4 164,5 165,4
166,3 167,14 169,4 170,6 171,12 172,1 173,2 174,2 175,4 177,4 178,1 179,2
180,1 181,6 182,3 183,2 185,8 186,2 187,6 188,5 189,2 190,10 191,2 193,4
194,2 195,6 196,1 197,18 198,22 199,4 201,8 202,6 203,12 204,3 205,4 206,3
207,4 209,2 210,1 211,10 212,5 213,4 214,3 215,6 217,6 218,5 219,4 220,3
221,2 222,10 223,12 225,22 226,1 227,24 228,9 229,12 230,2 231,2 233,2 234,5
235,4 236,3 237,10 238,7 239,2 241,6 242,6 243,2 244,3 245,2 246,11 247,6
249,8 250,9 251,2 252,3 253,10 254,2 255,6 257,6 258,5 259,10 260,5 261,2
262,3 263,6 265,4 266,6 267,8 268,1 269,8 270,2 271,6 273,2 274,9 275,14
276,3 277,4 278,2 279,4 281,2 282,6 283,6 284,3 285,10 286,3 287,6 289,12
290,5 291,6 292,1 293,2 294,3 295,16 297,6 298,6 299,2 300,19 301,10 302,3
303,2 305,6 306,2 307,4 308,15 309,2 310,6 311,6 313,6 314,5 315,8 316,1
317,26 318,11 319,4 321,12 322,6 323,2 324,5 325,4 326,2 327,8 329,2 330,2
331,6 332,3 333,24 334,7 335,12 337,10 338,2 339,8 340,3 341,8 342,6 343,4
345,4 346,1 347,6 348,1 349,10 350,2 351,10 353,14 354,2 355,6 356,3 357,10
358,10 359,2 361,30 362,5 363,4 364,3 365,24 366,22 367,6 369,10 370,6 371,2
372,1 373,4 374,3 375,2 377,14 378,1 379,12 380,5 381,8 382,6 383,12 385,6
386,2 387,4 388,1 389,24 390,3 391,6 393,2 394,9 395,6 396,11 397,6 398,2
399,12 401,8 402,5 403,16 404,3 405,4 406,6 407,8 409,4 410,2 411,2 412,3
413,2 414,2 415,28 417,4 418,1 419,2 420,1 421,10 422,11 423,4 425,6 426,2
427,16 428,5 429,2 430,3 431,2 433,4 434,9 435,4 436,13 437,18 438,2 439,10
441,2 442,1 443,2 444,5 445,6 446,6 447,6 449,8 450,13 451,6 452,11 453,2
454,19 455,26 457,30 458,6 459,8 460,1 461,6 462,10 463,12 465,4 466,1 467,6
468,21 469,4 470,2 471,8 473,2 474,5 475,4 476,5 477,46 478,7 479,8 481,6
482,5 483,2 484,3 485,18 486,10 487,4 489,12 490,1 491,2 492,13 493,4 494,3
495,2 497,20 498,9 499,4 500,11 501,10 502,10 503,6 505,10 506,5 507,4 508,1
509,2 510,3 511,6 513,4 514,33 515,2 516,3 517,4 518,14 519,2 521,32 522,1
523,10 524,5 525,8 526,3 527,6 529,24 530,2 531,2 532,3 533,12 534,7 535,4
537,8 538,6 539,12 540,1 541,18 542,3 543,2 545,2 546,1 547,10 548,5 549,14
550,7 551,6 553,4 554,2 555,4 556,1 557,6 558,2 559,4 561,2 562,1 563,14
564,3 565,10 566,3 567,4 569,12 570,5 571,10 572,5 573,4 574,3 575,2 577,4
Table 2.2.8 Lowest type of a Gauss period forming a normal basis for q = 2 and n ≤ 577.
2,2 3,2 4,1 5,2 6,1 7,4 8,2 9,2 10,3 11,2 13,4 14,2
15,2 16,1 17,6 18,1 19,10 20,5 21,2 22,3 23,2 25,4 26,2 27,4
28,1 29,2 30,1 31,10 32,8 33,6 34,3 35,2 37,4 38,15 39,2 40,7
41,2 42,1 43,4 44,2 45,4 46,3 47,6 49,4 50,2 51,8 52,1 53,2
54,3 55,6 56,2 57,4 58,4 59,12 61,6 62,21 63,2 64,4 65,2 66,3
67,4 68,2 69,2 70,3 71,8 73,4 74,2 75,8 76,10 77,6 78,1 79,4
80,5 81,2 82,9 83,2 85,16 86,2 87,4 88,1 89,2 90,7 91,10 92,5
93,4 94,3 95,2 97,4 98,2 99,2 100,1 101,6 102,11 103,6 104,5 105,2
106,10 107,6 109,10 110,3 111,2 112,1 113,2 114,5 115,4 116,2 117,8 118,9
119,2 121,6 122,3 123,6 124,13 125,2 126,1 127,4 128,2 129,8 130,4 131,2
133,16 134,2 135,4 136,1 137,6 138,1 139,4 140,2 141,2 142,4 143,6 145,10
146,2 147,10 148,1 149,8 150,5 151,6 152,5 153,14 154,3 155,2 157,10 158,2
159,34 160,4 161,6 162,1 163,4 164,5 165,2 166,3 167,14 169,4 170,8 171,12
172,1 173,2 174,9 175,4 176,2 177,4 178,15 179,2 181,6 182,14 183,4 184,7
185,8 186,15 187,6 188,5 189,2 190,3 191,2 193,4 194,2 195,10 196,1 197,18
198,1 199,4 200,2 201,10 202,3 203,12 205,4 206,3 207,4 208,10 209,2 210,1
211,10 212,5 213,6 214,3 215,6 217,6 218,15 219,4 220,4 221,2 222,1 223,12
224,2 225,8 226,15 227,24 229,12 230,2 231,2 232,1 233,2 234,5 235,4 236,8
237,6 238,4 239,2 241,6 242,3 243,2 244,4 245,24 246,3 247,6 248,11 249,8
250,3 251,2 253,4 254,2 255,12 256,1 257,6 258,5 259,10 260,2 261,6 262,3
263,6 265,4 266,8 267,4 268,1 269,8 270,3 271,6 272,5 273,10 274,3 275,12
277,4 278,2 279,10 280,1 281,2 282,1 283,6 284,2 285,2 286,3 287,6 289,12
290,20 291,6 292,1 293,2 294,5 295,12 296,2 297,8 298,4 299,2 301,10 302,3
303,2 304,4 305,6 306,7 307,4 308,2 309,4 310,15 311,6 313,6 314,14 315,2
316,1 317,26 318,17 319,4 320,2 321,18 322,3 323,2 325,4 326,2 327,10 328,7
329,2 330,1 331,6 332,8 333,6 334,15 335,6 337,10 338,2 339,10 340,4 341,8
342,13 343,4 344,5 345,2 346,3 347,6 349,10 350,2 351,22 352,1 353,14 354,3
355,12 356,11 357,4 358,4 359,2 361,30 362,3 363,4 364,7 365,18 366,5 367,6
368,11 369,2 370,4 371,2 373,4 374,3 375,2 376,7 377,14 378,1 379,12 380,5
381,20 382,10 383,12 385,6 386,2 387,14 388,1 389,24 390,5 391,6 392,8 393,10
394,9 395,6 397,6 398,2 399,12 400,1 401,8 402,5 403,4 404,2 405,2 406,21
407,8 409,4 410,2 411,2 412,19 413,2 414,9 415,30 416,5 417,6 418,15 419,2
421,10 422,21 423,4 424,4 425,12 426,3 427,4 428,2 429,2 430,3 431,2 433,4
434,3 435,4 436,13 437,18 438,9 439,10 440,2 441,6 442,3 443,2 445,6 446,15
447,4 448,1 449,8 450,33 451,6 452,8 453,2 454,28 455,2 457,30 458,15 459,8
460,1 461,6 462,1 463,12 464,2 465,10 466,3 467,6 469,10 470,2 471,8 472,4
473,2 474,3 475,4 476,2 477,14 478,4 479,8 481,28 482,3 483,10 484,7 485,2
486,1 487,4 488,2 489,12 490,15 491,2 493,4 494,3 495,30 496,13 497,14 498,11
499,4 500,8 501,16 502,9 503,6 505,10 506,2 507,18 508,1 509,2 510,7 511,6
512,23 513,4 514,3 515,2 517,4 518,9 519,2 520,1 521,32 522,3 523,10 524,2
525,20 526,3 527,6 529,24 530,2 531,2 532,4 533,12 534,27 535,4 536,8 537,8
538,4 539,12 541,18 542,3 543,2 544,10 545,6 546,5 547,10 548,2 549,18 550,21
551,2 553,4 554,2 555,6 556,1 557,6 558,5 559,4 560,5 561,2 562,9 563,14
565,6 566,3 567,28 568,1 569,12 570,1 571,10 572,5 573,4 574,3 575,2 577,4
Table 2.2.9 Lowest type of a Gauss period forming a normal basis for q = 3 and n ≤ 577.
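The entries of Tables 2.2.8 and 2.2.9 can be spot-checked by a short computation. The Python sketch below assumes the usual criterion from the literature on Gauss periods (treated in Section 5.3 and not restated in this section): for a prime r = nk + 1 not dividing q, with e the multiplicative order of q modulo r, a Gauss period of type (n, k) generates a normal basis of F_{q^n} over F_q if and only if gcd(nk/e, n) = 1. The function names and the search bound max_k are illustrative choices, not part of the handbook.

```python
from math import gcd

def is_prime(m):
    """Trial-division primality test, adequate for the small moduli used here."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def multiplicative_order(q, r):
    """Order of q modulo the prime r (assumes r does not divide q)."""
    e, x = 1, q % r
    while x != 1:
        x = (x * q) % r
        e += 1
    return e

def lowest_gauss_period_type(q, n, max_k=1000):
    """Smallest k <= max_k meeting the assumed criterion, or None if none is found."""
    for k in range(1, max_k + 1):
        r = n * k + 1
        if not is_prime(r) or q % r == 0:
            continue
        e = multiplicative_order(q, r)  # e divides r - 1 = n * k
        if gcd(n * k // e, n) == 1:
            return k
    return None

# Compare with the first entries of Table 2.2.9 (q = 3, n = 2, ..., 11).
print([lowest_gauss_period_type(3, n) for n in range(2, 12)])
# -> [2, 2, 1, 2, 1, 4, 2, 2, 3, 2]
```

A return value of None only means that no type was found below the bound max_k; it is not a proof that no Gauss period normal basis exists for that degree.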
2.2.19 Remark Table 2.2.10 gives the minimum complexity of a normal basis of $\mathbb{F}_{2^n}$ over $\mathbb{F}_2$ for
40 ≤ n ≤ 721, obtained by combining the exhaustive search data of Table 2.2.4 with theorems
from Section 5.3. In each row, we give the degree n, the minimum complexity $C_n$ of a
normal basis of $\mathbb{F}_{2^n}$ over $\mathbb{F}_2$, the method by which the normal basis was obtained, and the
property or parameters that were used. In the “Method” column, “Optimal” indicates the existence
of an optimal normal basis, and “GNB” indicates that the basis arises as a Gauss period, whose
type is given in the “Property” column. Proposition 5.3.38 constructs normal bases of $\mathbb{F}_{q^n}$
using normal bases of subfields of coprime degree. When this method wins, the values of
these coprime factors are indicated in the “Property” column. Corollary 5.3.15 requires an
optimal normal basis of $\mathbb{F}_{2^{kn}}$, and the type of the optimal normal basis and the value of k
are indicated in the “Property” column. Finally, “sd” indicates that the basis is self-dual.
When n is a power of 2, the best result, when available, is by random search, since known
methods do not apply: Gauss periods cannot form normal bases when 8 divides n (see
Proposition 5.3.20), and n contains no coprime factors with which to apply Proposition 5.3.38. By
Conjecture 2.2.15, the complexity of these bases is likely to approach $n^2/2$.
2.2.20 Problem Find constructions of low complexity normal bases of $\mathbb{F}_{2^n}$ over $\mathbb{F}_2$ when n is a
prime power, specifically a power of 2.
Table 2.2.10 Minimum found complexity of a normal basis of $\mathbb{F}_{2^n}$ over $\mathbb{F}_2$, 40 ≤ n ≤ 721.
2.2.21 Remark The Combinatorial Object Server (COS) [2507] allows the user to specify a type
of combinatorial object with specific parameter values and COS will return a list of the
objects having the desired parameters. In many cases, the format of the output can be
chosen to be more machine-readable or human-readable. COS does not rely on a list, rather
it generates the objects requested on-the-fly; for this reason, the output is restricted to 200
objects. Examples of the objects generated are permutations, subsets and combinations,
set and integer partitions, irreducible and primitive polynomials over small finite fields and
spanning trees of a graph.
2.2.22 Remark The Cunningham project produces a set of tables to factor the numbers $b^n \pm 1$
for b = 2, 3, 5, 6, 7, 10, 11, 12 for n as large as possible. The current factorization methods
employed are the elliptic curve method, the multiple polynomial quadratic sieve and the
number field sieve. For more information on factorization methods, see [2080, Chapter 3].
The Cunningham tables appear in published form [415] and as an electronic resource [2890].
2.2.23 Remark The Great Internet Mersenne Prime Search (GIMPS) [2084] is a distributed
computing effort dedicated to finding and verifying Mersenne primes (that is, primes of the form
$2^p - 1$, where p is also a prime). GIMPS uses a combination of trial factoring using the Sieve
of Eratosthenes, followed by the Pollard P − 1 method and ending with the Lucas-Lehmer
primality test. For more information on primality testing, see [724, Chapter 31], for example.
GIMPS provides the Prime95 software, which automates all factoring and distributed
computing processes. The (currently) largest known Mersenne prime is $2^{43112609} - 1$,
containing 12978189 decimal digits [2084].
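The final stage of this pipeline, the Lucas-Lehmer test, is short enough to sketch. The following Python fragment is a minimal illustration of the classical test for an odd prime exponent p (iterate s ↦ s² − 2 modulo 2^p − 1, starting from s = 4, for p − 2 steps); it is not the optimized arithmetic used by Prime95, and the function name is chosen here for illustration.

```python
def lucas_lehmer(p):
    """Return True if 2**p - 1 is prime, for an odd prime exponent p."""
    m = (1 << p) - 1          # the Mersenne number 2^p - 1
    s = 4
    for _ in range(p - 2):    # p - 2 squarings modulo m
        s = (s * s - 2) % m
    return s == 0

exponents = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
print([p for p in exponents if lucas_lehmer(p)])
# -> [3, 5, 7, 13, 17, 19, 31]
```

The exponents 11, 23 and 29 are rejected even though they are prime, which illustrates why each candidate exponent has to be tested individually.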
The search for Mersenne primes is of particular interest in searching for primitive
trinomials of large degrees. Primitive polynomials of low weight are useful in cryptographic
applications and pseudo-random number generation; see Sections 14.9 and 16.2. If $2^p - 1$ is a
Mersenne prime, then any irreducible polynomial of degree p over $\mathbb{F}_2$ is primitive. Since
binomials of degree at least 2 cannot be irreducible over $\mathbb{F}_2$, we consider trinomials $x^p + x^r + 1$,
for some 1 ≤ r ≤ p − 1. Sieving trinomials for reducibles is possible by Swan’s Theorem;
see Section 3.3. For more details on the algorithms and methods used in the search for
primitive trinomials, see [408]. An implementation of polynomial arithmetic over $\mathbb{F}_2$ which
was motivated by the GIMPS project, entitled gf2x, is necessarily highly optimized and is
preferred in some finite field software implementations; see Table 2.2.11 for more details.
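For a very small Mersenne exponent the statement above can be checked directly. The sketch below takes p = 7 (so 2^7 − 1 = 127 is prime), stores polynomials over F_2 as integer bitmasks (bit i is the coefficient of x^i), tests irreducibility by naive trial division, and computes the order of x modulo each irreducible trinomial. The helper names are hypothetical and the method is only practical for tiny degrees; real searches rely on Swan's Theorem and libraries such as gf2x.

```python
def poly_mod(a, b):
    """Remainder of a modulo b in GF(2)[x]; polynomials are int bitmasks."""
    db = b.bit_length()
    while a.bit_length() >= db:
        a ^= b << (a.bit_length() - db)
    return a

def is_irreducible_gf2(f):
    """Trial division by every polynomial of degree 1 .. deg(f) // 2."""
    n = f.bit_length() - 1
    if n < 1:
        return False
    for g in range(2, 1 << (n // 2 + 1)):
        if poly_mod(f, g) == 0:
            return False
    return True

def order_of_x(f):
    """Multiplicative order of x modulo f (assumes f has nonzero constant term)."""
    e, cur = 1, poly_mod(2, f)      # 2 encodes the polynomial x
    while cur != 1:
        cur = poly_mod(cur << 1, f)  # multiply by x and reduce
        e += 1
    return e

p = 7
good = [r for r in range(1, p) if is_irreducible_gf2((1 << p) | (1 << r) | 1)]
print(good)                                                      # -> [1, 3, 4, 6]
print({r: order_of_x((1 << p) | (1 << r) | 1) for r in good})    # every order is 127
```

Each irreducible trinomial found has a root of order 127 = 2^7 − 1, so it is primitive, in line with the remark.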
2.2.24 Remark The On-Line Encyclopedia of Integer Sequences™ (OEIS™) [2800] is a
constantly-updated, searchable database of integer sequences. Examples of famous sequences
in the OEIS™ are the Catalan numbers (A000108), prime numbers (A000040), and the
Fibonacci numbers (A000045). Users can search by sequence, “word” (for example,
“number of irreducible polynomials” yields sequence A001037) or sequence number. Sequences
are sorted lexicographically, so the sequence references may have changed since the date of
publication.
2.2.25 Remark In Table 2.2.11, we present a number of software packages which are useful for
finite field implementations. We distinguish between packages which are open-source and
commercial. We refer the reader to the citation, which provides a current (as of the date of
publication) Web URL to the most recent build of the software. We note that this is not
an exhaustive list of software packages, simply a useful list of packages used or researched
by the author.
See Also
[1413], [2080] For patents and standards of elliptic curve cryptography, most of
which contain guidelines for finite field implementations.
References Cited: [130, 399, 408, 415, 712, 724, 792, 1263, 1264, 1355, 1413, 1426, 1631,
1777, 1939, 2002, 2015, 2080, 2084, 2180, 2356, 2507, 2573, 2582, 2633, 2709, 2753, 2796,
2797, 2799, 2800, 2801, 2890, 2925, 3004, 3036]
II
Theoretical Properties
3 Irreducible polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
Counting irreducible polynomials • Construction of irreducibles • Con-
ditions for reducible polynomials • Weights of irreducible polynomials •
Prescribed coefficients • Multivariate polynomials
4 Primitive polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
Introduction to primitive polynomials • Prescribed coefficients • Weights
of primitive polynomials • Elements of high order
5 Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
Duality theory of bases • Normal bases • Complexity of normal bases •
Completely normal bases
6 Exponential and character sums . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
Gauss, Jacobi, and Kloosterman sums • More general exponential and
character sums • Some applications of character sums • Sum-product
theorems and applications
7 Equations over finite fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
General forms • Quadratic forms • Diagonal equations
8 Permutation polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
One variable • Several variables • Value sets of polynomials • Exceptional
polynomials
9 Special functions over finite fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
Boolean functions • PN and APN functions • Bent and related functions
• κ-polynomials and related algebraic objects • Planar functions and
3.1.1 Remark In this section we∗ are concerned with exact formulae for the number of (univariate)
irreducible polynomials over finite fields possessing various properties. There is some overlap
with Section 3.5 where specifically polynomials with prescribed coefficients are discussed.
Formulae and asymptotic expressions for multivariate polynomials are given in Section 3.6.
3.1.2 Theorem (Theorem 2.1.24). Denote the number of monic irreducible polynomials of degree
n over $\mathbb{F}_q$ by $I_q(n)$. Then
\[ I_q(n) = \frac{1}{n} \sum_{d \mid n} \mu(d)\, q^{n/d}. \]
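The formula of Theorem 3.1.2 is easy to evaluate. The following Python sketch is one straightforward way to do so; the helper names mobius and count_irreducible are chosen here for illustration.

```python
def mobius(n):
    """Moebius function mu(n), computed by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divides the original n
                return 0
            result = -result
        p += 1
    if n > 1:                   # one remaining prime factor
        result = -result
    return result

def count_irreducible(q, n):
    """I_q(n) = (1/n) * sum over d | n of mu(d) * q^(n/d)."""
    total = sum(mobius(d) * q ** (n // d) for d in range(1, n + 1) if n % d == 0)
    return total // n

print([count_irreducible(2, n) for n in range(1, 9)])
# -> [2, 1, 2, 3, 6, 9, 18, 30]
```

For q = 2 these values are the number of binary irreducible polynomials of degrees 1 through 8.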
∗ The author wishes to thank Stephen Cohen for a number of helpful improvements in this section.
3.1.5 Definition The trace of a monic polynomial f of degree n over $\mathbb{F}_q$ is $-a_1$, where $a_1$ is the
first coefficient of f, i.e., the coefficient of $x^{n-1}$ in f. The norm of a monic polynomial
f of degree n over $\mathbb{F}_q$ is $(-1)^n a_n$, where $a_n$ is the last coefficient of f, i.e., the constant
term in f. The trace and norm of a monic irreducible polynomial are, respectively, the
trace and norm of any of its roots in $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$.
3.1.6 Remark In [2508, 3048] (cited below) and sometimes in the literature, the trace of a poly-
nomial f is taken to be the first coefficient a1 itself.
3.1.7 Theorem [541, 2508] For a non-zero $a \in \mathbb{F}_q$ the number $I_q(n, a)$ of monic irreducible
polynomials of degree n over $\mathbb{F}_q$ with trace a is
\[ I_q(n, a) = \frac{1}{qn} \sum_{\substack{d \mid n \\ (d,q)=1}} \mu(d)\, q^{n/d}. \]
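3.1.8 Theorem [3048] Let q be a power of the prime p. Write $n = p^k m$ with m being p-free (i.e., p
does not divide m). The number $I_q(n, 0)$ of monic irreducible polynomials of degree n over
$\mathbb{F}_q$ with trace 0 is
\[ I_q(n, 0) = \frac{1}{qn} \sum_{d \mid m} \mu(d)\, q^{n/d} - \frac{\varepsilon}{n} \sum_{d \mid m} \mu(d)\, q^{n/(dp)}, \]
where $\varepsilon = 1$ if k > 0 and $\varepsilon = 0$ if k = 0.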
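Theorems 3.1.2, 3.1.7 and 3.1.8 must be mutually consistent: summing the trace-a counts over all q values of a has to give I_q(n). The Python sketch below checks this identity numerically. It assumes that the second sum in Theorem 3.1.8 carries the factor ε/n, with ε = 1 when p divides n and ε = 0 otherwise (this is the reading forced by the identity); all function names are illustrative.

```python
from math import gcd

def mobius(n):
    """Moebius function mu(n), computed by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def count_irreducible(q, n):
    """I_q(n) from Theorem 3.1.2."""
    return sum(mobius(d) * q ** (n // d) for d in divisors(n)) // n

def count_trace_nonzero(q, n):
    """I_q(n, a) for any fixed a != 0 (Theorem 3.1.7)."""
    total = sum(mobius(d) * q ** (n // d) for d in divisors(n) if gcd(d, q) == 1)
    return total // (q * n)

def count_trace_zero(q, p, n):
    """I_q(n, 0) (Theorem 3.1.8), with n = p^k * m and p not dividing m."""
    m, k = n, 0
    while m % p == 0:
        m //= p
        k += 1
    first = sum(mobius(d) * q ** (n // d) for d in divisors(m))
    second = sum(mobius(d) * q ** (n // (d * p)) for d in divisors(m)) if k > 0 else 0
    return (first - q * second) // (q * n)   # first/(qn) - second/n

# Consistency check: (q - 1) * I_q(n, a) + I_q(n, 0) == I_q(n).
for q, p in [(2, 2), (3, 3), (4, 2), (5, 5), (9, 3)]:
    for n in range(1, 13):
        lhs = (q - 1) * count_trace_nonzero(q, n) + count_trace_zero(q, p, n)
        assert lhs == count_irreducible(q, n)
print("trace counts are consistent")
```

For example, over F_4 there are six monic irreducible quadratics, two for each non-zero trace and none of trace 0.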
3.1.11 Remark In [541], Carlitz obtained formulae for the number of monic irreducible polynomials
over Fq , q odd, with prescribed trace and whose norm is a (non-zero) square or a non-square,
respectively. These involve the quadratic character λ on Fq ; thus for 0 ≠ b ∈ Fq , λ(b) = 1
or −1 according as b is a square or non-square in Fq , respectively. (See also Remark 3.5.49.)
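3.1.12 Theorem [541, 1783] Let $q = p^m$ be an odd prime power. For $a \in \mathbb{F}_q$ denote by
$I_q(n, a, h)$, h = 1, −1, respectively, the number of monic irreducible polynomials of degree
n over $\mathbb{F}_q$ with trace a whose norm is a (non-zero) square or a non-square. Then,
1. if a = 0
\[ I_q(n, 0, h) = \begin{cases} \dfrac{1}{2p}\left(q^{p-1} - q\right) & \text{for } n = p, \\[1ex] \dfrac{1}{2n}\left(q^{n-1} - 1\right) & \text{for } n \neq p; \end{cases} \]
2. if a ≠ 0
\[ I_q(n, a, h) = \begin{cases} \dfrac{1}{2p}\left(q^{p-1} + S\right) & \text{for } n = p, \\[1ex] \dfrac{1}{2n}\left(q^{n-1} + S - (-1)^h \lambda(na) - 1\right) & \text{for } n \neq p, \end{cases} \]
where $S = (-1)^h q^{\frac{n-1}{2}} \lambda\!\left((-1)^{\frac{n-1}{2}} a\right)$, and λ is the quadratic character in $\mathbb{F}_q$.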
3.1.13 Remark Subsection 3.5.4 contains various formulae for the numbers of irreducible polyno-
mials with some prescribed coefficients in a general finite field Fq . We list here a specialized
result over the binary field F2 not included there.
$T_j$ maps $\mathbb{F}_{2^n}$ to $\mathbb{F}_2$ and $T_1$ is the usual trace function. For n = 2 and j = 3, we define
$T_3(\beta) = 0$ for all $\beta \in \mathbb{F}_4$. For an integer r with 1 ≤ r ≤ n, define $F(n, t_1, t_2, \ldots, t_r)$
to be the number of elements $\beta \in \mathbb{F}_{2^n}$ with $T_j(\beta) = t_j$ for j = 1, . . . , r and let
$(I_2(n, t_1, t_2, \ldots, t_r) =)\ I(n, t_1, t_2, \ldots, t_r)$ be the number of monic irreducible
polynomials f(x) over $\mathbb{F}_2$ of degree n with coefficient of $x^{n-j}$ equal to $t_j$ for j = 1, . . . , r.
3.1.16 Remark Explicit formulae for I(n, t1 , t2 , t3 ) (the number of irreducible polynomials over F2
whose first three coefficients are prescribed) can be recovered from Theorem 3.1.15 via the
next theorem.
3.1.17 Theorem [1076, 3049] For n ≥ 3, $F(n, t_1, t_2, t_3) = 2^{n-3} + G(n, t_1, t_2, t_3)$, where the values
of $G(n, t_1, t_2, t_3)$ are displayed in the following tables.
1. Case n = 2m + 1 (in the first column m is calculated modulo 12):
2. Case n = 2m with 3 not dividing n (in the first column m is calculated modulo 4):
3. Case n = 2m with 3 dividing n (in the first column m is calculated modulo 4):
3.1.18 Remark Formulae for I(n, t1 , t2 ) and F (n, t1 , t2 ) can be obtained by adding appropriate
terms from above [565]. For the binary field these will agree with the earlier general expres-
sions of Kuz’min (Theorems 3.5.43 and 3.5.45) from [1815].
3.1.19 Remark Self-reciprocal polynomials were defined in Remark 2.1.48. For results on these
polynomials see [3050]. Any monic self-reciprocal polynomial R of (even) degree 2n has the
form $x^n f\!\left(\frac{x^2+1}{x}\right)$, where f is a monic polynomial of degree n in $\mathbb{F}_q[x]$. (We observe that,
\[ SRMI_q(2n) = \frac{1}{2n} \sum_{\substack{d \mid n \\ d \text{ odd}}} \mu(d)\left(q^{n/d} - 1\right), \]
3.1.21 Remark The notion of self-reciprocal polynomial has recently been generalized and Theorem
3.1.20 extended correspondingly [48].
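3.1.23 Theorem [48] Suppose n > 1 and g, h are polynomials over $\mathbb{F}_q$ as in Definition 3.1.22. Then
\[ I_q(2n, g, h) = \begin{cases} 0 & \text{if } q \text{ is even and } b_1 = b_2 = 0, \\[1ex] \dfrac{1}{2n}\left(q^n - 1\right) & \text{if } q \text{ is odd and } n = 2^m, \\[1ex] \dfrac{1}{2n} \displaystyle\sum_{\substack{d \mid n \\ d \text{ odd}}} \mu(d)\, q^{n/d} & \text{otherwise.} \end{cases} \]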
3.1.25 Definition The radical of an integer m (> 1) (denoted here by $m^*$) is the product of the
distinct primes dividing m.
3.1.26 Definition For t > 1, a t-polynomial T over $\mathbb{F}_q$ of degree tn is one that has the form
$T(x) = f(x^t)$ for some monic polynomial f of degree n.
3.1.27 Remark If T is irreducible, then f is also irreducible. Further (see Theorem 3.2.5), (i)
$t^* \mid (q^n - 1)$, and (ii) if 4 | t, then $4 \mid (q^n - 1)$. In fact, if (i) holds then n can be expressed as
n = klm, where k is the order of q (mod $t^*$), $l^* \mid t$, and m and t are relatively prime [666].
3.1.28 Theorem [666] Suppose t > 1 and that (i) and (ii) of Remark 3.1.27 hold with n = klm.
Let $TMI_q(tn)$ be the number of monic irreducible t-polynomials of degree tn. Then
\[ TMI_q(tn) = \begin{cases} \dfrac{\varphi(t)}{tn}\left(q^n - 1\right) & \text{for } m = 1, \\[1ex] \dfrac{m\,\varphi(t)}{tn}\, I_{q^{n/m}}(m) & \text{for } m > 1. \end{cases} \]
3.1.29 Remark For t > 1, a t-reciprocal polynomial T over $\mathbb{F}_q$ of degree 2tn is one that is
both a t-polynomial and a self-reciprocal polynomial. Thus it has the form
$T(x) = x^{tn} f\!\left(\frac{x^{2t}+1}{x^t}\right)$,
where f is a monic polynomial of degree n. If T is irreducible then f is irreducible. Moreover,
from [666] we have (i) t is odd and (ii) $t^* \mid (q^n + 1)$. Also 2n = klm, where $l^* \mid 2t$ and m and
2t are relatively prime (so that m | n).
3.1.30 Theorem [666, 2200] Suppose t > 1 and that (i) and (ii) of Remark 3.1.29 hold with
2n = klm. Let $TSRMI_q(2tn)$ be the number of monic irreducible t-reciprocal polynomials
of degree 2tn. Then
\[ TSRMI_q(2tn) = \begin{cases} \dfrac{\varphi(t)}{2tn}\left(q^n + 1\right) & \text{for } m = 1, \\[1ex] \dfrac{m\,\varphi(t)}{2tn}\, I_{q^{n/m}}(m) & \text{for } m > 1. \end{cases} \]
3.1.31 Definition A polynomial f over Fq is translation invariant if f (x+a) = f (x) for all a ∈ Fq .
3.1.32 Theorem [2200] Let $TIMI_q(qn)$ denote the number of translation invariant monic
irreducible polynomials of degree qn over $\mathbb{F}_q$. Then
\[ TIMI_q(qn) = \frac{q-1}{qn} \sum_{\substack{d \mid n \\ (q,d)=1}} \mu(d)\, q^{n/d}. \]
3.1.33 Remark Many of the above formulae and other similar ones can be obtained using the
general counting technique [2200] which follows.
for some polynomial $\hat{r}(n, x) \in \mathbb{F}_q[x]$. The polynomial $\hat{r}(n, x)$ is the n-th order transform
of r.
3.1.36 Remark For a polynomial $f(x) = x^m + a_{m-1}x^{m-1} + \cdots + a_1 x + a_0$ over $\mathbb{F}_{q^n}$ we define the
following sequence of polynomials:
\[ f^{(j)}(x) = x^m + a_{m-1}^{q^j} x^{m-1} + \cdots + a_1^{q^j} x + a_0^{q^j}. \]
We observe that $f^{(s)} = f$ where s is the least common multiple of the degrees of the minimal
polynomials of the coefficients of f.
\[ S_f(x) = \prod_{j=0}^{s-1} f^{(j)}(x). \]
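\begin{align*}
HMI_q(kn, r(x)) &= \frac{1}{kn} \sum_{\substack{d \mid n \\ d \,\nmid\, (n/k)}} \mu(n/d)\left[(m-1)\,q^{d} - g(d) + 1\right] \\
&= \frac{1}{kn} \sum_{\substack{d \mid n \\ k \,\nmid\, d}} \mu(d)\left[(m-1)\,q^{n/d} - g(n/d) + 1\right].
\end{align*}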
See Also
§3.5 For further formulae and estimates for irreducible polynomials with prescribed
coefficients.
§3.6 For formulae and asymptotic expressions for irreducible multivariate polynomials.
References Cited: [48, 541, 547, 565, 666, 1076, 2091, 2200, 2508, 3048, 3049, 3050]
is irreducible over $\mathbb{F}_q$ if and only if f − αg is irreducible over $\mathbb{F}_{q^n}$ for any zero $\alpha \in \mathbb{F}_{q^n}$ of P.
3.2.3 Remark Theorem 3.2.2 was employed by several authors [589, 678, 685, 1172, 1820, 1819,
1821, 1822, 1823, 1824, 1939, 2091] to give iterative constructions of irreducible polynomials
over finite fields. A further extension of the theorem is produced in [1825], which is also
instrumental in the construction of irreducible polynomials of relatively higher degree from
given ones.
3.2.4 Theorem [2077] Let $P \in \mathbb{F}_q[x]$ be irreducible of degree n. Then for any $a, b, c, d \in \mathbb{F}_q$ such
that $ad - bc \neq 0$,
\[ F(x) = (cx + d)^n\, P\!\left(\frac{ax + b}{cx + d}\right) \]
is also irreducible over $\mathbb{F}_q$.
3.2.5 Theorem [2077] Let t be a positive integer and $P \in \mathbb{F}_q[x]$ be irreducible of degree n and
exponent e (equal to the order of any root of P). Then $P(x^t)$ is irreducible over $\mathbb{F}_q$ if and
only if
1. $(t, (q^n - 1)/e) = 1$,
2. each prime factor of t divides e, and
3. if 4 | t then $4 \mid (q^n - 1)$.
3.2.6 Theorem [1939] Let $f_1, f_2, \ldots, f_N$ be all the distinct monic irreducible polynomials in
$\mathbb{F}_q[x]$ of degree m and order e, and let t ≥ 2 be an integer whose prime factors
divide e but not $(q^m - 1)/e$. Assume also that $q^m \equiv 1 \pmod 4$ if $t \equiv 0 \pmod 4$. Then
$f_1(x^t), f_2(x^t), \ldots, f_N(x^t)$ are all the distinct monic irreducible polynomials in $\mathbb{F}_q[x]$ of degree
mt and order et.
3.2.7 Remark Agou [36] has established a criterion for f(g(x)) to be irreducible over $\mathbb{F}_q$,
where $f, g \in \mathbb{F}_q[x]$ are monic and f is irreducible over $\mathbb{F}_q$. This criterion was used in
Agou [36, 38, 39, 40] to characterize irreducible polynomials of special types such as
$f(x^{p^r} - ax)$, $f(x^p - x - b)$, and others. Such irreducible compositions of polynomials are
also studied in Cohen [666, 671], Long [1954, 1955], and Ore [2324].
Irreducibility criteria for compositions of polynomials of the form f (xt ) have been estab-
lished by Agou [34, 35, 36], Butler [469], Cohen [671], Pellet [2376], Petterson [2392], and
Serret [2597, 2600]. Berlekamp [231, Chapter 6] and Varshamov and Ananiashvilii [2860]
discussed the relationship between the orders of f (xt ) and that of f (x).
3.2.8 Theorem [2077] Let $q = 2^m$ and let $P(x) = \sum_{i=0}^{n} c_i x^i \in \mathbb{F}_q[x]$ be irreducible over $\mathbb{F}_q$ of
degree n and $P^*(x) = x^n P\!\left(\frac{1}{x}\right)$. Denote $\mathbb{F}_{2^m} = F$ and $\mathbb{F}_2 = K$. Then
3.2.9 Remark Part 1 of Theorem 3.2.8 was obtained by Meyn [2091] and by Kyuregyan [1820]
in the present general form; for the case q = 2 it was earlier obtained by Varshamov and
Garakov [2861].
3.2.10 Theorem [2091] Let q be an odd prime power. If P is an irreducible polynomial of degree
n over $\mathbb{F}_q$, then $x^n P\!\left(x + x^{-1}\right)$ is irreducible over $\mathbb{F}_q$ if and only if the element P(2)P(−2)
is a non-square in $\mathbb{F}_q$.
3.2.11 Theorem [1822] Let q be odd, $P(x) \neq x$ be an irreducible polynomial of degree n ≥ 1 over
$\mathbb{F}_q$, and $ax^2 + bx + c$ and $dx^2 + rx + h$ be relatively prime polynomials in $\mathbb{F}_q[x]$ with a or d
being non-zero and $r^2 \neq 4dh$. Suppose
\[ F(x) = \left(dx^2 + rx + h\right)^n P\!\left(\frac{ax^2 + bx + c}{dx^2 + rx + h}\right) \]
\[ \left(r^2 - 4dh\right)^n P\!\left(\frac{br - 2(cd + ah - \delta)}{r^2 - 4hd}\right) P\!\left(\frac{br - 2(cd + ah + \delta)}{r^2 - 4hd}\right) \]
is a non-square in $\mathbb{F}_q$.
3.2.12 Remark The case a = c = r = 1 and b = d = h = 0 of Theorem 3.2.11 reduces to
Theorem 3.2.10.
3.2.13 Remark We briefly describe some constructive aspects of irreducibility of certain types of
polynomials, particularly binomials and trinomials.
3.2.14 Definition A binomial is a polynomial with two nonzero terms, one of them being the
constant term.
3.2.15 Remark Irreducible binomials can be characterized explicitly. For this purpose it suffices
to consider nonlinear, monic binomials.
3.2.16 Theorem [1939] Let t ≥ 2 be an integer and $a \in \mathbb{F}_q^*$. Then the binomial $x^t - a$ is irreducible
in $\mathbb{F}_q[x]$ if and only if the following two conditions are satisfied:
1. each prime factor of t divides the order e of a in $\mathbb{F}_q^*$, but not (q − 1)/e;
2. q ≡ 1 (mod 4) if t ≡ 0 (mod 4).
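The two conditions of Theorem 3.2.16 are directly computable. The following Python sketch evaluates them for a prime field F_p (so q = p and arithmetic is ordinary modular arithmetic) and, for t = 2 and t = 3, compares the result with a brute-force root search, which is a valid irreducibility test only for degrees at most 3. All function names are illustrative.

```python
def order_mod_p(a, p):
    """Multiplicative order of a in F_p^* (assumes p prime, a not divisible by p)."""
    e, x = 1, a % p
    while x != 1:
        x = (x * a) % p
        e += 1
    return e

def prime_factors(t):
    """Set of distinct prime factors of t >= 2."""
    factors, d = set(), 2
    while d * d <= t:
        while t % d == 0:
            factors.add(d)
            t //= d
        d += 1
    if t > 1:
        factors.add(t)
    return factors

def binomial_is_irreducible(p, t, a):
    """Criterion of Theorem 3.2.16 for x^t - a over F_p, with t >= 2 and a != 0."""
    e = order_mod_p(a, p)
    cofactor = (p - 1) // e
    # Condition 1: every prime factor of t divides e but not (p - 1)/e.
    if any(e % r != 0 or cofactor % r == 0 for r in prime_factors(t)):
        return False
    # Condition 2: p = 1 (mod 4) whenever 4 divides t.
    return t % 4 != 0 or p % 4 == 1

def has_root(p, t, a):
    """True if x^t = a has a solution in F_p."""
    return any(pow(x, t, p) == a % p for x in range(p))

# For degree 2 or 3, irreducible is the same as having no root in F_p.
for t in (2, 3):
    for a in range(1, 7):
        assert binomial_is_irreducible(7, t, a) == (not has_root(7, t, a))

print(binomial_is_irreducible(5, 4, 2))   # -> True
print(binomial_is_irreducible(7, 4, 3))   # -> False
```

The last call fails only on condition 2: the element 3 has order 6 in F_7^*, so condition 1 holds, but 7 ≡ 3 (mod 4).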
3.2.17 Remark Theorem 3.2.16 was essentially shown by Serret [2600] for finite prime fields.
Further characterizations of irreducible binomials can be found in Albert [70, Chapter 5],
Cappeli [505, 506, 507], Dickson [850, Part I, Chapter 3], Lowe and Zelinsky [1962], Rédei [2443,
Chapter 11], and Schwarz [2568].
3.2.18 Theorem Let a be a nonzero element in an extension field of $\mathbb{F}_q$, $q = 2^A u - 1$, with A ≥ 2
and u odd. Suppose e is the order of the subgroup of $\mathbb{F}_q^*$ generated by a and the condition
of Part 1 in Theorem 3.2.16 is satisfied for some natural number t divisible by $2^A = 2B$.
Then the binomial $x^t - a$ factors as a product of B monic irreducible polynomials in $\mathbb{F}_q[x]$
of degrees υ = t/B, that is in $\mathbb{F}_q[x]$ we have the canonical factorization
\[ x^t - a = \prod_{j=1}^{B} \left(x^{\upsilon} - b c_j x^{\upsilon/2} - b^2\right), \]
where $b = a^r$, $2Br \equiv e/2 + 1 \pmod{q-1}$, and the elements $c_1, \ldots, c_B$ are the roots of the
polynomial
\[ F(x) = \sum_{i=0}^{B/2} \frac{(B - i - 1)!\,B}{i!\,(B - 2i)!}\, x^{B-2i} \in \mathbb{F}_q[x], \]
and all $c_j$, 1 ≤ j ≤ B, are in $\mathbb{F}_q$.
3.2.19 Remark The factorization in Theorem 3.2.18 is due to Serret [2600], see also Albert [70,
Chapter 5] and Dickson [850, Part I, Chapter 3]. Shiva and Allard [2616] discuss a method
for factoring $x^{2^k - 1} + 1$ over $\mathbb{F}_2$. The factorization of $x^{q-1} - a$ over $\mathbb{F}_q$ is considered in
Dickson [849], see also Agou [37]. Schwarz [2569] has a formula for the number of monic
irreducible factors of fixed degree for a given binomial and Rédei [2442] gives a short proof
of it; see also Agou [34], Butler [469], and Schwarz [2568]. Gay and Vélez [1261] prove a
formula for the degree of the splitting field of an irreducible binomial over an arbitrary field
that was shown by Darbi [769] for fields of characteristic 0. Agou [33] studied the factor-
ization of an irreducible binomial over Fq in an extension field of Fq . Beard and West [213]
and McEliece [2046] tabulate factorizations of the binomials $x^n - 1$. The factorization of
more general polynomials $g(x)^t - a$ over finite prime fields is considered in Ore [2323] and
Petterson [2392]. Applications of factorizations of binomials are contained in Agou [34],
Berlekamp [229], and Vaughan [2863].
3.2.20 Definition A trinomial is a polynomial with three nonzero terms, one of them being the
constant term.
3.2.21 Remark The trinomials that we consider are also affine polynomials.
3.2.22 Theorem [1939] Let $a \in \mathbb{F}_q$ and let p be the characteristic of $\mathbb{F}_q$. Then the trinomial $x^p - x - a$
is irreducible in $\mathbb{F}_q[x]$ if and only if it has no root in $\mathbb{F}_q$.
3.2.23 Corollary [1939] With the notation of Theorem 3.2.22 the trinomial $x^p - x - a$ is irreducible
in $\mathbb{F}_q[x]$ if and only if the absolute trace $\mathrm{Tr}_{F/K}(a) \neq 0$, where $F = \mathbb{F}_q$ and $K = \mathbb{F}_p$.
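Theorem 3.2.22 and Corollary 3.2.23 can be checked by hand in the smallest non-prime field. The Python sketch below models F_4 = F_2[w]/(w² + w + 1), encoding elements as the integers 0, 1, 2 (= w), 3 (= w + 1); this encoding and the helper names are assumptions made for the illustration. With p = 2 the trinomial is x² − x − a = x² + x + a, and a quadratic is irreducible exactly when it has no root.

```python
def gf4_mul(a, b):
    """Multiply two elements of F_4 = F_2[w]/(w^2 + w + 1), encoded as ints 0..3."""
    prod = 0
    for i in range(2):
        if (b >> i) & 1:
            prod ^= a << i          # carry-less multiplication
    if prod & 0b100:                # reduce w^2 to w + 1
        prod ^= 0b111
    return prod

def absolute_trace(a):
    """Tr(a) = a + a^2 for the extension F_4 over F_2."""
    return a ^ gf4_mul(a, a)

def trinomial_has_root(a):
    """True if x^2 + x + a (that is, x^2 - x - a in characteristic 2) has a root in F_4."""
    return any(gf4_mul(x, x) ^ x ^ a == 0 for x in range(4))

no_root = [a for a in range(4) if not trinomial_has_root(a)]
nonzero_trace = [a for a in range(4) if absolute_trace(a) != 0]
print(no_root, nonzero_trace)   # -> [2, 3] [2, 3]
```

Both tests single out a = w and a = w + 1, so x² + x + w and x² + x + (w + 1) are the irreducible trinomials of this shape over F_4.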
3.2.24 Remark Theorem 3.2.22 and Corollary 3.2.23 were first shown by Pellet [2378]. The fact
that $x^p - x - a$ is irreducible over $\mathbb{F}_p$ if $a \in \mathbb{F}_p^*$ was already established by Serret [2597, 2600].
See also Dickson [841], [850, Part I, Chapter 3] and Albert [70, Chapter 5] for these results.
3.2.25 Remark Since for $b \in \mathbb{F}_p^*$ the polynomial f(x) is irreducible over $\mathbb{F}_q$ if and only if f(bx) is
irreducible over $\mathbb{F}_q$, the criteria above hold also for trinomials of the form $b^p x^p - bx - a$.
3.2.26 Remark If we consider more general trinomials of the above type for which the degree is a
higher power of the characteristic, then these criteria need not be valid any longer. In fact,
the following decomposition formula can be established.
3.2.27 Theorem [1939] For $x^q - x - a$ with a being an element of the subfield $K = \mathbb{F}_r$ of $F = \mathbb{F}_q$
we have the decomposition
\[ x^q - x - a = \prod_{j=1}^{q/r} \left(x^r - x - \beta_j\right) \]
in $\mathbb{F}_q[x]$, where the $\beta_j$ are the distinct elements of $\mathbb{F}_q$ with $\mathrm{Tr}_{F/K}(\beta_j) = a$.
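As a small worked instance of this decomposition (an illustration added here, not taken from the cited sources), take q = 4, r = 2 and a = 1, and write F_4 = F_2(ω) with ω² = ω + 1; the symbol ω is notation introduced for this example. The elements of trace 1 are ω and ω², and since q/r = 2 there are exactly two factors:

```latex
\begin{align*}
\operatorname{Tr}_{F/K}(\beta) &= \beta + \beta^2, \qquad
\operatorname{Tr}_{F/K}(\omega) = \operatorname{Tr}_{F/K}(\omega^2) = \omega + \omega^2 = 1, \\
(x^2 - x - \omega)(x^2 - x - \omega^2)
  &= x^4 + (\omega + \omega^2 + 1)\,x^2 + (\omega + \omega^2)\,x + \omega^3 \\
  &= x^4 + x + 1 \;=\; x^4 - x - 1 .
\end{align*}
```

The simplifications use characteristic 2 together with ω + ω² = 1 and ω³ = 1.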
3.2.28 Remark Theorem 3.2.27 is due to Dickson [841], [850, Part I, Chapter 3], but in the special
case a = 0 it was already noted by Mathieu [2022].
3.2.29 Theorem [2857, 2858] Let $P(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0$ be an irreducible polynomial
over the finite field $\mathbb{F}_q$ of characteristic p and let $b \in \mathbb{F}_q$. Then the polynomial $P(x^p - x - b)$
is irreducible over $\mathbb{F}_q$ if and only if the absolute trace $\mathrm{Tr}_{F/K}(nb - a_{n-1}) \neq 0$, where $F = \mathbb{F}_q$
and $K = \mathbb{F}_p$.
3.2.30 Remark Theorem 3.2.29 was shown in this general form by Varshamov [2857, 2858]; see also
Agou [36]. The case b = 0 received considerable attention much earlier. The corresponding
result for b = 0 and finite prime fields was stated by Pellet [2378] and proved in Pellet [2377].
Polynomials $f(x^p - x)$ over $\mathbb{F}_p$ with deg(f) a power of p were treated by Serret [2598, 2599].
The case b = 0 for arbitrary finite fields was considered in Dickson [850, Part I, Chapter
3] and Albert [70, Chapter 5]. More general types of polynomials such as $f(x^{p^r} - ax)$,
$f(x^{p^{2r}} - ax^{p^r} - bx)$ and others have also been studied, see Agou [36, 37, 38, 39, 40, 41, 42],
Cohen [671], Long [1953, 1954, 1955, 1956], Long and Vaughan [1957, 1958], and Ore [2324].
3.2.31 Theorem Let $f(x) = x^r - ax - b \in \mathbb{F}_q[x]$, where r > 2 is a power of the characteristic of
$\mathbb{F}_q$, and suppose that the binomial $x^{r-1} - a$ is irreducible over $\mathbb{F}_q$. Then f(x) is the product
of a linear polynomial and an irreducible polynomial over $\mathbb{F}_q$ of degree r − 1.
3.2.32 Remark Theorem 3.2.31 generalizes results of Dickson [849] and Albert [70, Chapter 5].
See Schwarz [2571] for further results in this direction.
where $f'$ is the formal derivative of f. Let $g_0(x) = x^p - x + \delta_0$ and $g_k(x) = x^p - x + \delta_k$, where
$\delta_k \in \mathbb{F}_p^*$, k ≥ 1. Define $f_0(x) = f(g_0(x))$, and $f_k(x) = f_{k-1}^*(g_k(x))$ for k ≥ 1, where $f_{k-1}^*(x)$
is the monic reciprocal polynomial of $f_{k-1}(x)$, i.e., $f_{k-1}^*(x) = \frac{1}{f_{k-1}(0)}\, x^{n_{k-1}} f_{k-1}\!\left(\frac{1}{x}\right)$. Then
for each k ≥ 0 the polynomial $f_k(x)$ is irreducible over $\mathbb{F}_q$ of degree $n_k = n \cdot p^{k+1}$.
3.2.34 Remark The case s = 1 of Theorem 3.2.33 in which the sequence $(\delta_k)_{k \ge 0}$ is constant,
i.e., $\delta_k = \delta \in \mathbb{F}_p^*$, has been studied by Varshamov in [2859], where no proof is given. For a proof
see [1172, 2077].
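3.2.35 Theorem [1820, 1821, 1823] Let $\delta \in \mathbb{F}_{2^s}^*$ and $f_1(x) = \sum_{u=0}^{n} c_u x^u$ be a monic irreducible
polynomial over $\mathbb{F}_{2^s}$ whose coefficients satisfy the conditions
\[ \mathrm{Tr}_{F/K}\!\left(\frac{c_1 \delta}{c_0}\right) = 1 \quad\text{and}\quad \mathrm{Tr}_{F/K}\!\left(\frac{c_{n-1}}{\delta}\right) = 1, \]
where $\mathbb{F}_{2^s} = F$ and $\mathbb{F}_2 = K$. Then all the terms in the sequence $(f_k(x))_{k \ge 1}$ defined as
\[ f_{k+1}(x) = x^{2^{k-1} n} f_k\!\left(x + \delta^2 x^{-1}\right), \quad k \ge 1, \]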
3.2.36 Theorem [1820, 1821, 2077] Let $f(x) = \sum_{i=0}^{n} c_i x^i$ be irreducible over $\mathbb{F}_{2^m}$ of degree n.
Denote $\mathbb{F}_{2^m} = F$ and $\mathbb{F}_2 = K$. Suppose that $\mathrm{Tr}_{F/K}(c_1/c_0) \neq 0$ and $\mathrm{Tr}_{F/K}(c_{n-1}/c_n) \neq 0$.
Define the polynomials $a_k(x)$ and $b_k(x)$ recursively by $a_0(x) = x$, $b_0(x) = 1$ and for k ≥ 1
Then
\[ f_k(x) = \left(b_k(x)\right)^n f\!\left(a_k(x)/b_k(x)\right) \]
1. For the case q = 2 in Theorem 3.2.36 the trace function is the identity map on
$\mathbb{F}_q$.
2. Let $f(x) = \sum_{i=0}^{n} c_i x^i$ be a monic irreducible polynomial over $\mathbb{F}_2$ of degree n with
$c_1 c_{n-1} \neq 0$. Then $f_k(x) = \sum_{i=0}^{n} c_i\, a_k^i(x)\, b_k^{n-i}(x)$ is irreducible over $\mathbb{F}_2$ of degree
$n2^k$ for all k ≥ 0.
3. The irreducibility of $f_k$ over $\mathbb{F}_2$ has been studied by several authors, including
Varshamov [2859], Wiedemann [2977], Meyn [2091], Gao [1172], Menezes et
al. [2077].
4. The irreducibility of $f_k$ over $\mathbb{F}_{2^s}$ has been studied by Kyuregyan [1820, 1819, 1821]
and Menezes et al. [1172, 2077].
3.2.38 Theorem [678, 2077] Let f be a monic irreducible polynomial of degree n ≥ 1 over $\mathbb{F}_q$, q
odd, where n is even if q ≡ 3 (mod 4). Suppose that f(1)f(−1) is a non-square in $\mathbb{F}_q$. Define
\[ f_0(x) = f(x), \qquad f_k(x) = (2x)^{t_{k-1}} f_{k-1}\!\left(\left(x + x^{-1}\right)/2\right), \quad k \ge 1, \]
where $t_k = n2^k$ denotes the degree of $f_k(x)$. Then $f_k(x)$ is an irreducible polynomial over
$\mathbb{F}_q$ of degree $n2^k$ for every k ≥ 1.
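A small worked instance may help fix the recursion of Theorem 3.2.38. The example below is an illustration chosen here (q = 5, f(x) = x − 2, so n = 1 and q ≡ 1 (mod 4)); it is not taken from the cited sources. The non-zero squares in F_5 are 1 and 4.

```latex
\begin{align*}
f_0(1)\,f_0(-1) &= (1-2)(-1-2) = 3, \quad\text{a non-square in } \mathbb{F}_5, \\
f_1(x) &= (2x)^{1} f_0\!\left(\frac{x + x^{-1}}{2}\right)
        = 2x\left(\frac{x + x^{-1}}{2} - 2\right)
        = x^2 - 4x + 1 = x^2 + x + 1, \\
\operatorname{disc}(f_1) &= 1 - 4 = -3 = 2, \quad\text{a non-square in } \mathbb{F}_5 .
\end{align*}
```

So f_1 is an irreducible quadratic over F_5, of degree n·2¹ = 2 as the theorem predicts. One can also check that f_1(1)f_1(−1) = 3·1 = 3 is again a non-square.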
3.2.39 Remark Further constructions similar to the one from Theorem 3.2.38 can be found in [1822,
1824].
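3.2.40 Theorem [1822] Let $P(x) \neq x$ be an irreducible polynomial of degree n ≥ 1 over $\mathbb{F}_q$,
where n is even if q ≡ 3 (mod 4), with $r, h, \delta \in \mathbb{F}_q$ and $r \neq 0$, $\delta \neq 0$. Suppose that
$P\!\left(\frac{2\delta - rh}{r^2}\right) P\!\left(-\frac{2\delta + rh}{r^2}\right)$ is a non-square in $\mathbb{F}_q$. Define
\[ F_0(x) = P(x), \qquad
F_k(x) = \left(2x + \frac{2h}{r}\right)^{t_{k-1}} F_{k-1}\!\left(\left(x^2 + \frac{4\delta^2 - (hr)^2}{r^4}\right) \Big/ \left(2x + \frac{2h}{r}\right)\right), \quad k \ge 1, \]
where $t_k = n2^k$ denotes the degree of $F_k(x)$. Then $F_k(x)$ is an irreducible polynomial over
$\mathbb{F}_q$ of degree $n2^k$ for every k ≥ 1.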
3.2.41 Remark For r = δ = 2 and h = 0 Theorem 3.2.40 coincides with Theorem 3.2.38 due to
Cohen [678, 685]; see also [2077, Theorem 3.24].
3.2.42 Theorem [1822] Let $P(x) \neq x$ be an irreducible polynomial of degree n ≥ 1 over $\mathbb{F}_q$.
Suppose that the elements P(0), $h^n$ and $(2r)^n$ are squares in $\mathbb{F}_q$ and the element $P\!\left(\frac{2h}{r}\right)$ is
a non-square in $\mathbb{F}_q$. Define
\[ F_0(x) = P(x), \qquad
F_k(x) = \left(F_{k-1}(0)\right)^{-1} \left(\frac{(rx + 2h)^2}{4h}\right)^{t_{k-1}} F_{k-1}\!\left(\frac{(4h)^2 x}{(rx + 2h)^2}\right), \quad k \ge 1, \]
where $t_k = n2^k$ denotes the degree of $F_k(x)$. Then $F_k(x)$ is an irreducible polynomial over
$\mathbb{F}_q$ of degree $n2^k$ for every k ≥ 1.
3.2.43 Theorem [1822] Let P be an irreducible polynomial of degree n ≥ 1 over $\mathbb{F}_q$, where n is
even if q ≡ 3 (mod 4) and $b \in \mathbb{F}_q$. Suppose that the element $P\!\left(-\frac{b}{2}\right)$ is a non-square in $\mathbb{F}_q$.
Define
\[ F_0(x) = P(x), \qquad
F_k(x) = F_{k-1}\!\left(x^2 + bx + \frac{b^2}{4} - \frac{b}{2}\right), \quad k \ge 1. \]
3.2.45 Theorem [1824] Let q be an odd prime power and $P(x) = \sum_{u=0}^{n} a_u x^u$ be an irreducible
polynomial of degree n > 1 over $\mathbb{F}_q$ with at least one coefficient $a_{2i+1} \neq 0$ $\left(0 \le i \le \lfloor \tfrac{n}{2} \rfloor\right)$.
Let $ax^2 + 2hx + ahd^{-1}$ and $dx^2 + 2ax + h$ be relatively prime, where $a, d, h \in \mathbb{F}_q^*$ and
$a^2 \neq hd$. Suppose that the element $\left(hd^{-1}\right)^n$ is a non-zero square in $\mathbb{F}_q$ and the element
$\left(hd - a^2\right)^n g_{F_0}(hd)$ is non-square in $\mathbb{F}_q$ (see Definition 3.2.44 for $g_P$). Define
\[ F_0(x) = P(x), \qquad
F_k(x) = H_{k-1}(a, d)^{-1} \left(dx^2 + 2ax + h\right)^{t_{k-1}} F_{k-1}\!\left(\frac{ax^2 + 2hx + ahd^{-1}}{dx^2 + 2ax + h}\right), \quad k \ge 1, \]
where $H_{k-1}(a, d) = d^{\,t_{k-1}} F_{k-1}\!\left(\frac{a}{d}\right)$, and $t_k$ is the degree of $F_k(x)$. Then $F_k(x)$ is an irre-
\[ F_0(x) = P(x), \qquad
F_k(x) = (2x)^{t_{k-1}} F_{k-1}\!\left(\frac{ax^2 + c}{2ax}\right), \quad k \ge 1, \]
where $t_k$ is the degree of $F_k(x)$. Then $F_k(x)$ is an irreducible polynomial over $\mathbb{F}_q$ of degree
$t_k = n2^k$ for every k ≥ 1.
3.2.47 Remark In particular, the case $q \equiv 3 \pmod 4$ and $F_0(x) = x^2 + 2x + c$ with $a = 1$ of
Theorem 3.2.46 was considered by McNay [589]. The case $a = c = 1$ was derived by Cohen;
see [678, 685, 2077].
See Also
References Cited: [33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 70, 213, 229, 231, 469, 505, 506,
507, 589, 666, 671, 678, 685, 769, 841, 849, 850, 1172, 1261, 1819, 1820, 1821, 1822, 1823,
1824, 1825, 1939, 1953, 1954, 1955, 1956, 1957, 1958, 1962, 2022, 2046, 2077, 2091, 2323,
2324, 2376, 2377, 2378, 2392, 2442, 2443, 2568, 2569, 2571, 2597, 2598, 2599, 2600, 2616,
2857, 2858, 2859, 2860, 2861, 2863, 2977]
3.3.1 Remark There has been substantial work showing that some classes of polynomials are
irreducible using several types of composition of polynomials. Here we are interested in “if
and only if” irreducibility statements that as a consequence provide reducibility results.
3.3.2 Remark Let $f$ be a polynomial of degree $m$ over $\mathbb{F}_q$, $q = p^k$, and let $L$ be the linearized
polynomial $L(x) = \sum_{i=0}^{n} a_i x^{p^i}$. Ore [2324] considers the irreducibility of $f(L)$. Agou in
several articles [37, 39, 41] considers special types of linearized polynomials including $x^{p^r} - ax$
and $x^{p^{2r}} - ax^{p^r} - bx$.
3.3.3 Theorem [37] Let $f(x) = x^m + b_{m-1}x^{m-1} + \cdots + b_0$ be an irreducible polynomial over $\mathbb{F}_q$,
$q = p^k$, with root $\beta$. Then, for any nonzero $a$ in $\mathbb{F}_q$, $f(x^p - ax)$ is irreducible over $\mathbb{F}_q$ if and
only if $a^{(q-1)\gcd(m,p-1)/(p-1)} = 1$ and $\mathrm{Tr}_{km}(\beta/A^p) \neq 0$, where $A \in \mathbb{F}_{q^m}$ satisfies $A^{p-1} = a$
and $\mathrm{Tr}_{km}(x) = x + x^p + \cdots + x^{p^{km-1}}$. In particular, if $A$ is in $\mathbb{F}_q$, then $f(x^p - A^{p-1}x)$ is
irreducible over $\mathbb{F}_q$ if and only if $\mathrm{Tr}_k(b_{m-1}/A^p) \neq 0$.
3.3.4 Remark Similar results can be found in the papers by Agou cited above.
3.3.5 Remark We are also interested in results that guarantee classes of polynomials that are
reducible. The concluding reducibility result for compositions of the type f (L), where f
and L are as above, is given next.
3.3.6 Theorem [41] Let $f$ be an irreducible polynomial of degree $m$ over $\mathbb{F}_q$, $q = p^k$, and let $L$ be
the linearized polynomial $L(x) = \sum_{i=0}^{n} a_i x^{p^i}$. If $n \geq 3$, $f(L)$ is reducible.
3.3.7 Remark The cases when n ≤ 2 in the previous theorem were studied by Agou [39, 40].
3.3.8 Remark Cohen [671, 672] gives alternative proofs for Agou’s results.
3.3.9 Remark Moreno [2145] considers the irreducibility of a related composition of functions.
3.3.10 Theorem [2145] Let $f$ and $g$ be polynomials over $\mathbb{F}_q$, $q = p^k$, and let $f$ be irreducible of
degree $m$. The polynomial $f(g(x))$ is irreducible over $\mathbb{F}_q$ if and only if $g(x) - \beta$ is irreducible
over $\mathbb{F}_{q^m}$ for any root $\beta$ of $f$.
3.3.11 Remark Brawley and Carlitz [394] define root-based polynomial compositions called com-
posed products.
3.3.12 Definition Let $f$ and $g$ be monic polynomials in $\mathbb{F}_q[x]$ with factorizations in the algebraic
closure of $\mathbb{F}_q$, given by
$$f(x) = \prod_{\alpha} (x - \alpha) \quad \text{and} \quad g(x) = \prod_{\beta} (x - \beta).$$
3.3.13 Theorem [394] The composed products of $f$ and $g$, $f \circ g$ and $f \star g$, are irreducible if and
only if $f$ and $g$ are irreducible with coprime degrees.
3.3.14 Remark Related results to composed products can be found in [394, 395, 2103].
3.3.15 Remark Pellet [2379] and Stickelberger [2716] relate the parity of the number of irreducible
factors of a squarefree polynomial with its discriminant. When the parity of the number of
irreducible factors of a polynomial is even, the polynomial is reducible.
3.3.16 Remark We recall, from Section 2.1, the definition of discriminant (Definition 2.1.135).
3.3.17 Definition Let $f$ be a polynomial of degree $n$ in $\mathbb{F}_q[x]$ with leading coefficient $a$, and with
roots $\alpha_1, \alpha_2, \ldots, \alpha_n$ in its splitting field, counted with multiplicity. The discriminant of
$f$ is given by
$$D(f) = a^{2n-2} \prod_{1 \leq i < j \leq n} (\alpha_i - \alpha_j)^2.$$
3.3.18 Remark The discriminant of f is zero if and only if f has multiple roots.
3.3.19 Remark Next is a result given by Stickelberger [2716] although the theorem was originally
shown by Pellet [2379]; see also [846].
3.3.20 Theorem [2379, 2716] Let $p$ be an odd prime and suppose that $f$ is a monic polynomial
of degree $n$ with integral coefficients in a $p$-adic field $\mathbf{F}$. Let $\bar{f}$ be the result of reducing
the coefficients of $f$ modulo $p$. Assume further that $\bar{f}$ has no repeated roots. If $\bar{f}$ has $r$
irreducible factors over the residue class field, then $r \equiv n \pmod 2$ if and only if $D(f)$ is a
square in $\mathbf{F}$.
3.3.21 Proposition [2753] Let $q$ be a power of an odd prime $p$ and let $\mathbb{F}_q$ be the finite field with $q$
elements. Let $g$ be a polynomial over $\mathbb{F}_q$ of degree $n$ with no repeated roots. Furthermore,
let $r$ be the number of irreducible factors of $g$ over $\mathbb{F}_q$. Then $r \equiv n \pmod 2$ if and only if
$D(g)$ is a square in $\mathbb{F}_q$.
3.3.22 Remark Swan extends the previous result to the case p = 2 by noting that a p-adic integer
a coprime to p, is a p-adic square if and only if a is a square modulo 4p.
3.3.23 Corollary [2753] Let g be a polynomial of degree n over F2 with D(g) ≠ 0 and let f be
a monic polynomial over the 2-adic integers such that g is the reduction of f modulo 2.
Furthermore, let r be the number of irreducible factors of g over F2 . Then r ≡ n (mod 2)
if and only if D(f ) ≡ 1 (mod 8).
3.3.24 Remark Swan also characterizes the parity of the number of irreducible factors of a trino-
mial over F2 . These polynomials are of practical importance when implementing finite field
extensions; for example, see [1567] and Section 3.4.
3.3.25 Theorem [2753] Let $n > k > 0$. Assume precisely one of $n$, $k$ is odd. If $r$ is the number
of irreducible factors of $f(x) = x^n + x^k + 1 \in \mathbb{F}_2[x]$, then $r$ is even (and hence $f$ is not
irreducible) in the following cases:
1. $n$ even, $k$ odd, $n \neq 2k$ and $nk/2 \equiv 0, 1 \pmod 4$;
2. $n$ odd, $k$ even, $k \nmid 2n$ and $n \equiv 3, 5 \pmod 8$;
3. $n$ odd, $k$ even, $k \mid 2n$ and $n \equiv 1, 7 \pmod 8$.
In other cases $f$ has an odd number of factors.
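The three cases of Theorem 3.3.25 are easy to compare against a direct factor count for small degrees. The following is a minimal sketch, assuming polynomials over $\mathbb{F}_2$ are encoded as Python integers (bit $i$ is the coefficient of $x^i$); the function names `count_factors` and `swan_even` are illustrative choices, and the factor count uses plain trial division rather than any efficient algorithm.

```python
def pdeg(a):
    """Degree of a binary polynomial stored as an int (pdeg(0) == -1)."""
    return a.bit_length() - 1

def pdivmod(a, b):
    """Quotient and remainder of a by b over F_2."""
    q, db = 0, pdeg(b)
    while a and pdeg(a) >= db:
        s = pdeg(a) - db
        q ^= 1 << s
        a ^= b << s
    return q, a

def count_factors(f):
    """Number of irreducible factors of f (with multiplicity) by trial division."""
    count, g = 0, 2                      # start with g = x
    while pdeg(f) >= 1 and 2 * pdeg(g) <= pdeg(f):
        q, r = pdivmod(f, g)
        if r == 0:                       # smallest remaining divisor is irreducible
            f, count = q, count + 1
        else:
            g += 1
    return count + (1 if pdeg(f) >= 1 else 0)

def swan_even(n, k):
    """True when Theorem 3.3.25 predicts an even number of factors of x^n + x^k + 1."""
    if n % 2 == 0 and k % 2 == 1:
        return n != 2 * k and (n * k // 2) % 4 in (0, 1)
    if n % 2 == 1 and k % 2 == 0:
        if (2 * n) % k != 0:
            return n % 8 in (3, 5)
        return n % 8 in (1, 7)
    raise ValueError("exactly one of n, k must be odd")

for n in range(2, 13):
    for k in range(1, n):
        if (n + k) % 2 == 1:
            f = (1 << n) | (1 << k) | 1
            print(n, k, count_factors(f), swan_even(n, k))
```

For every pair printed, the parity of the third column should agree with the prediction in the fourth column.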
3.3.26 Remark The case when n and k are both odd can be covered by making use of the fact
that the reciprocal polynomial of f has the same number of irreducible factors as f ; for
reciprocal polynomials see Definition 2.1.48. If both n and k are even the trinomial is a
square and has an even number of irreducible factors.
3.3.27 Corollary [2753] Let n be a positive integer divisible by 8. Then every trinomial over F2 of
degree n has an even number of irreducible factors in F2 [x], and hence it is not irreducible.
3.3.28 Remark Many results have been given following Swan’s technique for other types of poly-
nomials and finite field extensions and characteristics. We state several of them starting
from characteristic two results and then focusing on odd characteristic.
3.3.29 Remark Vishne [2878] considers trinomials in finite extensions of F2 ; see also Theorems
6.69 and 6.695 in [231]. Evaluating the discriminant of a trinomial (modulo 8R, where R is
the valuation ring of the corresponding extension of 2-adic numbers), Vishne’s studies are
a direct analogue of Swan’s proof over F2 .
3.3.30 Corollary [2878] Let $\mathbb{F}_{2^s}$ be an even degree extension of $\mathbb{F}_2$ and $n$ an even number.
Then, $g(x) = x^n + ax^k + b \in \mathbb{F}_{2^s}[x]$ has an odd number of irreducible factors only when
$g(x) = x^{2d} + ax^d + b$ and $t^2 + at + b$ has no root in $\mathbb{F}_{2^s}$.
3.3.31 Remark Similar results to the one in Corollary 3.3.30 are given in [2878]. Special cases of
trinomials of low degrees over extensions of F2 are given in [570].
3.3.32 Remark Hales and Newhart give a Swan-like result for binary tetranomials; see Theorem
2 in [1399].
3.3.33 Remark It is convenient to use irreducible trinomials over F2 when constructing extension
fields. The usage of pentanomials (polynomials with 5 nonzero coefficients) when trinomials
do not exist is in the IEEE standard specifications for public-key cryptography [1567].
However, Scott [2573] shows that some of the recommended irreducible polynomials are not
optimal.
3.3.34 Remark Next we show some partial results attempting to characterize the reducibility of
binary pentanomials.
3.3.35 Remark Koepf and Kim [1777] give a Swan-like result for a class of binary pentanomials
of the form $x^m + x^{n+1} + x^n + x + 1$, where $m$ is even and $1 \leq n \leq \lfloor m/2 \rfloor - 1$. Ahmadi [45]
shows that there are no self-reciprocal irreducible pentanomials of degree $n$ over $\mathbb{F}_2$ if $n$ is
a multiple of 12.
3.3.36 Problem Characterize completely the reducible pentanomials over F2 .
3.3.37 Remark The reducibility of some classes of binary polynomials with more than five nonzero
coefficients has also been studied. In all cases, the polynomials have a special form.
3.3.38 Remark Fredricksen, Hales, and Sweet [1096] give a Swan-like result for polynomials of the
form $x^n f(x) + g(x)$ where $f, g \in \mathbb{F}_2[x]$.
3.3.39 Remark Let $n$ and $v \geq 2$ be coprime positive integers, let $r$ be the least positive residue of
$n$ modulo $v$, and set $d = (n - r)/v$. A polynomial of the form $x^r g(x^v) + h(x^v)$ with $g$ monic
of degree $d$ and $h$ of degree $\leq d$ is an $(n, v)$-windmill polynomial. These polynomials are
related to a method to generate pseudorandom bit sequences by combining linear feedback
shift registers in a “windmill” configuration [2686].
3.3.40 Theorem [673] Suppose that $n$ and $v$ are as above and $f$ is a squarefree $(n, v)$-windmill
polynomial over $\mathbb{F}_{2^m}$. Let $n_m(f)$ denote the number of irreducible factors of $f$ over $\mathbb{F}_{2^m}$.
1. If $n \equiv \pm 1 \pmod 8$ or $m$ is even, then $n_m(f)$ is odd.
2. If $n \equiv \pm 3 \pmod 8$ and $m$ is odd, then $n_m(f)$ is even.
3.3.41 Theorem [333] Let $f(x) = x^n + \sum_{i \in S} x^i + 1 \in \mathbb{F}_2[x]$, where
Then $f$ has no repeated roots. If $n \equiv \pm 1 \pmod 8$, then $f$ has an odd number of irreducible
factors. If $n \equiv \pm 3 \pmod 8$, then $f$ has an even number of irreducible factors.
3.3.42 Remark A short proof of Theorem 3.3.41 is given in [52].
3.3.43 Remark Swan-type results for composite, linearized, and affine polynomials over F2 are
given in [1735, 3063].
3.3.44 Problem The complete characterization of reducible polynomials over F2 is a hard open
problem. Provide new reducibility characterizations for classes of polynomials over F2 .
3.3.45 Remark There have also been reducibility results over finite fields of odd characteristic.
Binomials over finite fields of odd characteristic can be treated easily with a Swan-type
approach [1417].
3.3.46 Remark For trinomials over finite fields of odd characteristic only partial results are
known [46, 1223, 1417].
3.3.47 Remark Loidreau [1952] gives the parity of the number of irreducible factors for
any trinomial over $\mathbb{F}_3$ by examining the discriminant using all possible congruences of $n$
and $k$ modulo 12; see also [1223]. This type of analysis holds for higher characteristic, but
the number of cases grows quickly with the characteristic, making a complete analysis for
large $q$ hard to achieve.
3.3.48 Problem Completely characterize the reducibility of trinomials over finite fields of charac-
teristic different from 2 and 3.
See Also
References Cited: [37, 39, 40, 41, 45, 46, 52, 231, 236, 333, 394, 395, 570, 671, 672, 673,
827, 828, 846, 1096, 1223, 1300, 1399, 1417, 1567, 1735, 1777, 1939, 1952, 2103, 2145, 2324,
2379, 2573, 2686, 2716, 2753, 2818, 2878, 3054, 3063]
3.4.1 Definition The weight of a polynomial $f(x) = \sum_{i=0}^{n} a_i x^i$ in $\mathbb{F}_q[x]$ is the number of its
nonzero coefficients.
3.4.4 Theorem [850] Let the order of $a \in \mathbb{F}_q^*$ be $e$. Then the binomial $x^n - a \in \mathbb{F}_q[x]$, $n \geq 2$, is
irreducible over $\mathbb{F}_q$ if and only if the following conditions are satisfied:
1. if $r$ is a prime number and $r \mid n$, then $r \mid e$ and $r \nmid (q - 1)/e$;
2. $q \equiv 1 \pmod 4$ if $n \equiv 0 \pmod 4$.
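The two conditions of Theorem 3.4.4 translate directly into a short test. The sketch below is restricted to prime $q$ (so that arithmetic in $\mathbb{F}_q$ is arithmetic modulo $q$), which is an assumption made for simplicity; the names `binomial_irreducible` and `is_irreducible_bruteforce` are illustrative, and the brute-force check by trial division is included only to compare against the criterion on small cases.

```python
from itertools import product

def mult_order(a, q):
    """Multiplicative order of a modulo the prime q (a not divisible by q)."""
    e, x = 1, a % q
    while x != 1:
        x = x * a % q
        e += 1
    return e

def prime_divisors(n):
    out, d = [], 2
    while d * d <= n:
        if n % d == 0:
            out.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        out.append(n)
    return out

def binomial_irreducible(n, a, q):
    """Criterion of Theorem 3.4.4 for x^n - a over F_q, q prime, a != 0, n >= 2."""
    e = mult_order(a, q)
    cofactor = (q - 1) // e
    for r in prime_divisors(n):
        if e % r != 0 or cofactor % r == 0:
            return False
    return n % 4 != 0 or q % 4 == 1

def poly_rem(f, g, p):
    """Remainder of f modulo monic g; coefficient lists, lowest degree first."""
    f = f[:]
    dg = len(g) - 1
    for i in range(len(f) - 1, dg - 1, -1):
        c = f[i]
        if c:
            for j in range(dg + 1):
                f[i - dg + j] = (f[i - dg + j] - c * g[j]) % p
    return f[:dg]

def is_irreducible_bruteforce(f, p):
    n = len(f) - 1
    for d in range(1, n // 2 + 1):
        for tail in product(range(p), repeat=d):
            if not any(poly_rem(f, list(tail) + [1], p)):
                return False
    return True

q = 7
for n in range(2, 7):
    good = [a for a in range(1, q) if binomial_irreducible(n, a, q)]
    print(f"x^{n} - a irreducible over F_{q} for a in {good}")
```

For instance, the criterion accepts $x^2 - 2$ and $x^4 - 2$ over $\mathbb{F}_5$ and rejects $x^4 + 1$ over $\mathbb{F}_3$, since $3 \not\equiv 1 \pmod 4$.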
3.4.5 Remark The study of irreducible binomials over finite fields can be traced back to the works
of researchers who were trying to generalize Fermat’s little theorem – see for example the
expositions of Poinsot [2407] and Serret [2600]. Theorem 3.4.4 in the format given above
appears in [850, 1938, 1939, 2077].
3.4.6 Corollary Let $n$ be an odd number, and let $a \in \mathbb{F}_q$. Then $x^n - a$ is irreducible over $\mathbb{F}_q$ if
and only if for every prime divisor $r$ of $n$, $a$ is not an $r$-th power of an element of $\mathbb{F}_q$.
3.4.7 Corollary [1938, 1939, 2077] Let the order of $a \in \mathbb{F}_q^*$ be $e$, and let $r$ be a prime factor of
$q - 1$ which does not divide $(q - 1)/e$. Assume that $q \equiv 1 \pmod 4$ if $r = 2$ and $k \geq 2$. Then
for any non-negative integer $k$, $x^{r^k} - a$ is irreducible over $\mathbb{F}_q$.
3.4.8 Corollary [2356] Let $\mathbb{F}_q$ be a finite field of odd characteristic $p$, $p \geq 5$. There exists an
irreducible binomial over $\mathbb{F}_q$ of degree $m$, $m \not\equiv 0 \pmod 4$, if and only if every prime factor
of $m$ is also a prime factor of $q - 1$. For $m \equiv 0 \pmod 4$, there exists an irreducible
binomial over $\mathbb{F}_q$ of degree $m$ if and only if $q \equiv 1 \pmod 4$ and every prime factor of $m$ is
also a prime factor of $q - 1$.
3.4.9 Example
1. $x^6 - a$ is irreducible over $\mathbb{F}_q$ if and only if $a$ is neither a quadratic nor a cubic
residue in $\mathbb{F}_q$.
2. $x^{2^k} \pm 2$ are irreducible over $\mathbb{F}_5$ [2077].
3. $x^{3^k} \pm 2$ and $x^{3^k} \pm 3$ are irreducible over $\mathbb{F}_7$ [2077].
4. $x^{3^k} + \omega$ is irreducible over $\mathbb{F}_4$ where $\mathbb{F}_4 = \mathbb{F}_2(\omega)$ [2077].
3.4.10 Remark It is clear that there is no irreducible binomial of degree > 1 over F2 . From results
above, it follows that $x^2 + 1$ is the only irreducible binomial of degree > 1 over F3 and there
are infinitely many irreducible binomials over Fq for q ≥ 4. Also, it is clear that for every
q there are infinitely many positive integers m, for example p the characteristic of Fq , such
that there is no irreducible binomial of degree m over Fq . This fact motivates the following
results.
3.4.11 Theorem [2378] Let $\mathbb{F}_q$ be of characteristic $p$. Then the trinomial $f(x) = x^p - x - b$ is
irreducible over $\mathbb{F}_q$ if and only if one of the following equivalent conditions is satisfied:
1. $f$ has no root in $\mathbb{F}_q$;
2. $\mathrm{Tr}_{\mathbb{F}_q}(b) \neq 0$;
3. $p$ does not divide $[\mathbb{F}_q : \mathbb{F}_p]$.
3.4.12 Corollary [2077] Let $\mathbb{F}_q$ be of characteristic $p$. For $a, b \in \mathbb{F}_q^*$, the trinomial $x^p - ax - b$ is
irreducible over $\mathbb{F}_q$ if and only if $a = c^{p-1}$ for some $c \in \mathbb{F}_q$ and $\mathrm{Tr}_{\mathbb{F}_q}(b/c^p) \neq 0$.
3.4.13 Corollary [2324] Let $x^p - x - a \in \mathbb{F}_q[x]$ be an irreducible trinomial over $\mathbb{F}_q$ of characteristic
$p$, and let $\alpha$ be a root of this polynomial in an extension field of $\mathbb{F}_{q^p}$. Then $x^p - x - a\alpha^{p-1}$
is irreducible over $\mathbb{F}_q(\alpha)$.
3.4.14 Theorem [2077] Let $p \equiv 3 \pmod 4$ be a prime and let $p + 1 = 2^r s$ with $s$ odd. Then, for
any integer $k \geq 1$, $x^{2^k} - 2ax^{2^{k-1}} - 1$ is irreducible over $\mathbb{F}_p$, and hence irreducible over any
odd degree extension $\mathbb{F}_q$, where $a = a_r$ is obtained recursively as follows:
1. $a_1 = 0$;
2. for $j$ from 2 to $r - 1$, set $a_j = \left(\frac{a_{j-1} + 1}{2}\right)^{(p+1)/4}$;
3. $a_r = \left(\frac{a_{r-1} - 1}{2}\right)^{(p+1)/4}$.
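The recursion for $a = a_r$ in Theorem 3.4.14 is short enough to compute directly. The sketch below reads steps 2 and 3 as $a_j = ((a_{j-1}+1)/2)^{(p+1)/4}$ and $a_r = ((a_{r-1}-1)/2)^{(p+1)/4}$; the function names (`recursion_a`, `is_irreducible`) are illustrative assumptions, and the irreducibility check is naive trial division, usable only for small $p$ and $k$.

```python
from itertools import product

def recursion_a(p):
    """Parameter a = a_r of Theorem 3.4.14 for a prime p with p % 4 == 3."""
    r, s = 0, p + 1
    while s % 2 == 0:
        s //= 2
        r += 1
    half = pow(2, -1, p)            # inverse of 2 modulo p (Python 3.8+)
    e = (p + 1) // 4
    a = 0                           # a_1 = 0
    for _ in range(2, r):           # j = 2, ..., r - 1
        a = pow((a + 1) * half % p, e, p)
    return pow((a - 1) * half % p, e, p)   # a_r

def poly_rem(f, g, p):
    """Remainder of f modulo monic g; coefficient lists, lowest degree first."""
    f = f[:]
    dg = len(g) - 1
    for i in range(len(f) - 1, dg - 1, -1):
        c = f[i]
        if c:
            for j in range(dg + 1):
                f[i - dg + j] = (f[i - dg + j] - c * g[j]) % p
    return f[:dg]

def is_irreducible(f, p):
    n = len(f) - 1
    for d in range(1, n // 2 + 1):
        for tail in product(range(p), repeat=d):
            if not any(poly_rem(f, list(tail) + [1], p)):
                return False
    return True

def trinomial(p, k):
    """Coefficients of x^(2^k) - 2a x^(2^(k-1)) - 1 over F_p."""
    a = recursion_a(p)
    f = [0] * (2 ** k + 1)
    f[0] = (-1) % p
    f[2 ** (k - 1)] = (-2 * a) % p
    f[2 ** k] = 1
    return f

for p in (3, 7, 11):
    for k in (1, 2):
        print(p, k, recursion_a(p), is_irreducible(trinomial(p, k), p))
```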
3.4.15 Remark The following constitutes a partial result towards proving Conjecture 3.4.31.
3.4.16 Theorem [667, 2446] Let $\mathbb{F}_q$ be a finite field of characteristic $p$, and let $n \geq 2$ be such that
$p$ does not divide $2n(n - 1)$. Let $T_n(q)$ denote the number of $a \in \mathbb{F}_q$ for which the trinomial
$x^n + x + a$ is irreducible over $\mathbb{F}_q$. Then
$$T_n(q) = \frac{q}{n} + O(q^{1/2}), \qquad (3.4.1)$$
where the implied constant depends only on $n$.
3.4.17 Remark There is not any substantial result proving the existence of irreducible fewnomials
of weight at least four over finite fields.
3.4.18 Remark There are many conjectures concerning the existence of irreducible fewnomials
over finite fields (see the next section) whose resolution currently seems to be out of reach.
The following is a partial result towards proving the existence of irreducible fewnomials over
binary fields.
3.4.19 Theorem [2634] There exists a primitive polynomial of degree n over F2 whose weight is
n/4 + o(n).
3.4.20 Remark The weight of a monic polynomial of degree n over a finite field is between 1
and n + 1 inclusive. On one end of the weight spectrum we have the monomial x and
binomials, whose irreducibility is well understood. We have some partial results concerning
the irreducibility of polynomials corresponding to the other end of the weight spectrum.
3.4.21 Theorem [850] The all one polynomial $x^n + x^{n-1} + \cdots + x^2 + x + 1 \in \mathbb{F}_q[x]$, which is of
weight $n + 1$, is irreducible over $\mathbb{F}_q$ if and only if $n + 1$ is a prime number and $q$ is a primitive
root modulo $n + 1$.
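The criterion of Theorem 3.4.21 only involves integer arithmetic, so the degrees $n$ for which the all one polynomial is irreducible can be listed directly. This is a minimal sketch; the function names are illustrative, and "primitive root modulo $n + 1$" is taken to mean that the multiplicative order of $q$ modulo $n + 1$ equals $n$.

```python
def is_prime(m):
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def multiplicative_order(q, m):
    """Order of q modulo m; assumes gcd(q, m) == 1."""
    e, x = 1, q % m
    while x != 1:
        x = x * q % m
        e += 1
    return e

def all_one_irreducible(n, q):
    """Criterion of Theorem 3.4.21 for x^n + ... + x + 1 over F_q."""
    m = n + 1
    if not is_prime(m) or q % m == 0:
        return False
    return multiplicative_order(q, m) == m - 1

print([n for n in range(2, 67) if all_one_irreducible(n, 2)])
```

For $q = 2$ and $n \leq 66$ this reproduces the list of degrees given in Example 3.4.22.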
3.4.22 Example
1. If $n = 2, 4, 10, 12, 18, 28, 36, 52, 58, 60, 66$, then $x^n + x^{n-1} + \cdots + x^2 + x + 1$ is
irreducible over $\mathbb{F}_2$.
2. Let $m > 1$ be an integer. If $m$ is even with $m \equiv 0$ or $6 \pmod 8$, then the number
of irreducible factors of $f(x) = x^m + x^{m-1} + \cdots + x^3 + x^2 + x + 1$ is an even
number and hence $f$ is reducible over $\mathbb{F}_2$ [55].
3.4.23 Conjecture [1847] (Artin) Let a be a non-square integer different from 1 and −1. Then
there are infinitely many primes p so that a is a primitive element in Fp .
3.4.24 Remark Using Theorem 3.4.21 and Conjecture 3.4.23, naturally we have the following
conjecture.
3.4.25 Conjecture There are infinitely many $n$ for which $x^n + x^{n-1} + \cdots + x^2 + x + 1$ is irreducible
over $\mathbb{F}_q$.
3.4.26 Theorem [1531] Artin’s conjecture holds provided that the generalized Riemann hypothesis
is true.
3.4.27 Remark The following is an immediate corollary of Theorem 3.4.21 and Theorem 3.4.26.
3.4.28 Theorem There are infinitely many $n$ for which $x^n + x^{n-1} + \cdots + x^2 + x + 1$ is irreducible
over $\mathbb{F}_q$ provided that the generalized Riemann hypothesis is true.
3.4.3 Conjectures
3.4.29 Remark The following conjectures are supported by extensive computations, but it seems
that resolving them will be very hard. The first three conjectures have become part of
folklore and it is very difficult to trace back their origin. Here we provide a reference for
the interested readers who wish to have more information about these conjectures. Note
that the following is by no means an exhaustive list of conjectures related to the weight
distribution of irreducible polynomials over finite fields.
3.4.30 Conjecture [1233] For every n, there exists a polynomial of degree n and of weight at most
five which is irreducible over F2 .
3.4.31 Conjecture [1233] Let q ≥ 3 be a prime power. For every n, there exists a polynomial of
degree n and of weight at most four which is irreducible over Fq .
3.4.32 Conjecture [2186] The set of positive integers n for which there exists an irreducible trino-
mial of degree n over F2 has a positive density in the set of positive integers.
3.4.33 Conjecture [307] The number of irreducible trinomials over F2 of degree at most n is
3n + o(n).
3.4.34 Conjecture [52] Let $i, j$ be two positive integers such that $j < i$, and let
$$F_{i,j}(x) = \frac{x^{i+1} + 1}{x + 1} + x^j.$$
Then the number of irreducible polynomials $F_{i,j}$ of degree at most $n$ over $\mathbb{F}_2$ is $2n + o(n)$.
3.4.35 Conjecture [1233] For every positive integer $n$, there exists a polynomial $g$ of degree at
most $\log_q n + 3$ such that $f(x) = x^n + g(x)$, called a sedimentary polynomial, is irreducible
over $\mathbb{F}_q$.
See Also
References Cited: [52, 55, 307, 667, 850, 1233, 1531, 1938, 1939, 2077, 2186, 2324, 2356,
2378, 2407, 2446, 2600, 2634]
3.5.1 Definition Given a monic polynomial $f(x) = x^n + \sum_{i=1}^{n} a_i x^{n-i}$ of degree $n$ in $\mathbb{F}_q[x]$ then
$a_i$ is the $i$-th coefficient. For $0 \leq k, m \leq n$, $\{a_1, \ldots, a_k\}$ are the first $k$ coefficients and
$\{a_{n-m+1}, \ldots, a_n\}$ are the last $m$ coefficients. In particular $-a_1$ is the trace of $f$ and
$(-1)^n a_n$ is the norm of $f$.
3.5.2 Remark Results on the distribution of irreducible polynomials of degree n with certain
coefficients prescribed generally fall into the categories (i) existence, (ii) asymptotic esti-
mates, (iii) explicit estimates, (iv) exact formulae or expressions. For those (few) in category
(iv) further classification under (i)–(iii) may be desirable. Historically, the distribution of
irreducible polynomials with prescribed first and/or last coefficients is approachable by
number-theoretic methods [134, 541, 1447, 2834] yielding asymptotic behaviour. Recently,
more specialized finite field techniques have refined these into explicit estimates, whilst re-
taining asymptotic features. As for existence, in some cases there are theorems which yield
the existence of polynomials of degree n with the stronger property of being primitive for
all but a few pairs (q, n); see Section 4.2.
3.5.3 Remark To keep notation simple, the number of irreducible polynomials of any particular
type will be denoted at that specific juncture by I (or, for example, I(a, b)). All polynomials
will be monic polynomials in Fq [x] (with q a power of the characteristic p) of degree n unless
mentioned otherwise.
3.5.4 Remark See Section 3.1 for further formulae for irreducible polynomials.
3.5.5 Remark For convenience, the results of Theorems 3.1.7, 3.1.8, and 3.1.10 are summarized
again here.
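3.5.6 Theorem Write $n = p^j n_0$, where $j \geq 0$ and $p \nmid n_0$. Then the number of irreducible polyno-
mials with trace $a \in \mathbb{F}_q$ is
$$I = \frac{1}{nq} \sum_{d \mid n_0} \mu(d) q^{n/d} - \frac{\varepsilon}{n} \sum_{d \mid n_0} \mu(d) q^{n/(dp)},$$
$$I = \frac{1}{n\phi(m)} \sum_{\substack{r \in D_n \\ m_r = m}} \phi(r).$$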
3.5.8 Remark An asymptotic description of the number in Theorem 3.5.7 occurs in Theorem
III.5 of [512].
3.5.9 Theorem [512] The number $I$ of irreducible polynomials with prescribed non-zero constant
term satisfies
$$\frac{1}{n}\left(\frac{q^n - 1}{q - 1} - 2q^{n/2}\right) \leq I \leq \frac{q^n - 1}{n(q - 1)}.$$
3.5.10 Remark The next theorem, like Theorem 3.5.9, effectively concerns polynomials with one
fixed coefficient, although it involves three coefficients a1 , an−1 and an .
3.5.18 Definition Given a pair $(q, n)$ let $P$ be the largest prime factor of $q^n - 1$. Then $(q, n)$ is
an lps (largest prime survives) pair if $P \in D_n$ as in Theorem 3.5.7.
3.5.19 Remark According to [2318], empirical evidence indicates that many pairs (q, n) are lps
pairs although there are some “sporadic” pairs that are not.
3.5.20 Theorem [2318] Suppose $n \geq 3$ and $(q, n)$ is an lps pair. Then the number $I(a, b)$ of
irreducible polynomials with non-zero trace $a$ and non-zero norm $b$ satisfies
$$I(a, b) \geq \frac{\left(1 - \frac{1}{P}\right)(q^n - 1) - q^{n-1} - 2nq^{n/2} + 1}{n(q - 1)^2}.$$
3.5.21 Example Take $(q, n) = (9, 7)$. Since $9^7 - 1 = 2^3 \times 547 \times 1093$, the pair $(9, 7)$ is an lps pair,
though $(3, 14)$, which also relates to the factorization of $9^7 - 1$, is not an lps pair. Theorem
3.5.15 yields a lower bound for $I(a, b)$ of 8552.73; Theorem 3.5.16 yields lower bounds of
9216.75 and 9198.47 when $a$ is zero and non-zero, respectively, and, when $a \neq 0$, Theorem
3.5.20 yields an improved lower bound of 9411.91.
3.5.22 Remark Even when $(q, n)$ is not an lps pair, a classical theorem of Zsigmondy (see, e.g.,
[2467] or Wikipedia) implies there is always a prime $l \in D_n$ except when $q$ is a Mersenne
prime and $n = 2$ and when $(q, n) = (2, 6)$. For $(q, n)$ outside these exceptions, such a prime
is a Zsigmondy prime and the largest Zsigmondy prime is the largest of such primes. The
authors of Theorem 3.5.20 were unaware of Zsigmondy’s theorem and it is evident that the
same argument yields the following modification which is generally applicable. It appears
here for the first time.
3.5.23 Theorem Suppose $n \geq 3$ and $(q, n) \neq (2, 6)$. Let $P_Z$ be the corresponding largest Zsigmondy
prime. Then the number $I(a, b)$ in Theorem 3.5.20 satisfies
$$I(a, b) \geq \frac{\left(1 - \frac{1}{P_Z}\right)(q^n - 1) - q^{n-1} - 2nq^{n/2} + 1}{n(q - 1)^2}.$$
3.5.24 Example Take $(q, n) = (47, 4)$, not an lps pair since $q^2 - 1 = 2^5 \times 3 \times 23$ and $q^2 + 1 =
2 \times 5 \times 13 \times 17$. Here 5, 13, and 17 are Zsigmondy primes with $P_Z = 17$. Then Theorem
3.5.15 provides no information, whereas Theorem 3.5.16 yields lower bounds for $I(a, b)$ of
504.4 and 515.23, respectively, when $a$ is zero and non-zero. Now, when $a \neq 0$, Theorem
3.5.23 delivers an improved lower bound for $I(a, b)$ of 528.25.
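The numbers in Examples 3.5.21 and 3.5.24 can be reproduced with a few lines of integer arithmetic. In the sketch below a Zsigmondy prime for $(q, n)$ is assumed to be a prime $l$ dividing $q^n - 1$ such that $q$ has multiplicative order exactly $n$ modulo $l$ (the classical primitive prime divisor); this reading, the function names, and the form of the bound $\big((1 - 1/P)(q^n - 1) - q^{n-1} - 2nq^{n/2} + 1\big)/\big(n(q-1)^2\big)$ are stated here as assumptions rather than taken from a definition in this section.

```python
def prime_factors(m):
    out, d = [], 2
    while d * d <= m:
        if m % d == 0:
            out.append(d)
            while m % d == 0:
                m //= d
        d += 1
    if m > 1:
        out.append(m)
    return out

def mult_order(q, l):
    """Multiplicative order of q modulo the prime l (l does not divide q)."""
    e, x = 1, q % l
    while x != 1:
        x = x * q % l
        e += 1
    return e

def zsigmondy_primes(q, n):
    """Primes l | q^n - 1 for which q has order exactly n modulo l."""
    return [l for l in prime_factors(q ** n - 1) if mult_order(q, l) == n]

def lower_bound(q, n, P):
    """Lower bound of Theorems 3.5.20 / 3.5.23 with P the chosen prime."""
    top = (1 - 1 / P) * (q ** n - 1) - q ** (n - 1) - 2 * n * q ** (n / 2) + 1
    return top / (n * (q - 1) ** 2)

for q, n in ((9, 7), (47, 4)):
    zs = zsigmondy_primes(q, n)
    print(q, n, zs, round(lower_bound(q, n, max(zs)), 2))
```

Under this reading the pair $(3, 14)$ has 547 as its only Zsigmondy prime, which is consistent with the statement that $(3, 14)$ is not an lps pair even though $3^{14} - 1 = 9^7 - 1$.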
3.5.25 Remark [2318] also provides an expression for upper bounds for I(a, b). These are always
better than those of Theorem 3.5.16 when n is a multiple of q − 1.
3.5.26 Remark Existence results for (merely) irreducible polynomials with prescribed trace and
norm follow from those on the existence of primitive polynomials (Section 4.2).
3.5.27 Remark The question of estimating the number of irreducible polynomials f with the first
k coefficients a1 , . . . , ak and the last m coefficients an−m+1 , . . . , an of f prescribed (where
2 ≤ k + m < n) can be generalized as follows and tackled by number-theoretical methods.
Given a monic polynomial M ∈ Fq [x] of degree m, where 1 ≤ m < n, let R ∈ Fq [x] be
a (not necessarily monic) polynomial prime to M . Then consider polynomials f of degree
n with the first k coefficients prescribed and such that f ≡ R (mod M ). The original
question of choosing $k$ first and $m$ last coefficients is recovered by selecting $M(x) = x^m$ and
$R(x) = a_{n-m+1}x^{m-1} + \cdots + a_n$, $a_n \neq 0$. In this connection, note from Lemma 2.1.113 that
$\Phi_q(x^m) = q^{m-1}(q - 1)$.
3.5.28 Theorem [512, 1551, 2454] Let $2 \leq k + m < n$. Suppose $a_1, \ldots, a_k \in \mathbb{F}_q$, $M$ is a monic
polynomial in $\mathbb{F}_q[x]$ of degree $m$, and $R \in \mathbb{F}_q[x]$ is a fixed (not necessarily monic) polynomial
of degree $< m$ prime to $M$. Then the number $I$ of irreducible polynomials $f$ of degree $n$
with prescribed first $k$ coefficients and such that $f \equiv R \pmod M$ satisfies
$$\frac{1}{n}\left(\frac{q^{n-k}}{\Phi_q(M)} - (k + m + 1)q^{n/2}\right) \leq I \leq \frac{1}{n}\left(\frac{q^{n-k}}{\Phi_q(M)} + (1 - \delta)(k + m - 1)q^{n/2}\right),$$
where $0 < \delta = 1 - \dfrac{1}{q^k \Phi_q(M)} < 1$.
3.5.29 Remark Theorem 3.5.28 leads to explicit existence results such as those which follow.
3.5.30 Corollary [685] Under the conditions of Theorem 3.5.28, suppose that $k + m < n/2$ and
that
$$q > \frac{n}{2} \quad (n \text{ even}); \qquad q > \left(\frac{n+1}{2}\right)^2 \quad (n \text{ odd}).$$
Then there exists an irreducible polynomial $f \equiv R \pmod M$ of degree $n$ with its first $k$
coefficients prescribed. In particular, there exists an irreducible polynomial with its first $k$
coefficients and its last $m$ coefficients prescribed (and non-zero constant term).
3.5.31 Corollary [685] There exists an irreducible polynomial of degree n with its first k and last
m coefficients prescribed whenever 2 ≤ k + m ≤ n/3.
3.5.32 Remark In contrast to Corollaries 3.5.30 and 3.5.31 giving existence conditions for ir-
reducible polynomials with prescribed first and last coefficients, [1209] investigates those
whose middle coefficients are fixed providing these are all zero.
3.5.33 Theorem [1209] Suppose $1 < k \leq m < n$ and $q^{2k-m-2} \geq q(n - k + 1)^4$. Then there exists
an irreducible polynomial of degree $n$ with $a_k = \cdots = a_m = 0$.
3.5.34 Corollary [1209] For any $c$ with $0 < c < 1$ and any positive integer $n$ such that
$(1 - 3c)n \geq 2 + 8 \log_q n$, there exists an irreducible polynomial of degree $n$ over $\mathbb{F}_q$ with
any $\lfloor cn \rfloor$ consecutive coefficients (other than the first or last) equal to 0.
3.5.35 Example [1209] With $c = 1/4$ in Corollary 3.5.34, there exists an irreducible polynomial of
degree $n$ with $\lfloor n/4 \rfloor$ consecutive coefficients equal to 0 whenever $q \geq 61$ and $n \geq 37$ and for
smaller prime powers for $n \geq n_q$ where $n_q \leq n_2 = 266$.
3.5.36 Remark When the prescribed coefficients do not comprise first and last, or middle coefficients,
the only general estimates are asymptotic. In the following two theorems $n - m \leq n - 2$
coefficients $a_j$ of $f$ are prescribed and given their assigned values. The remaining $m$ coefficients
$A = \{a_{j_1}, \ldots, a_{j_m}\}$, say, are allowed to take any values in $\mathbb{F}_q$.
3.5.37 Theorem [2710] Let the n−m coefficients of f not in A (as in Remark 3.5.36) be prescribed
and given their assigned values in Fq . Regard the members of A as algebraically independent
indeterminates (or transcendentals). Suppose that f is absolutely irreducible in Fq [x, A].
Then, for sufficiently large q, the number I of irreducible polynomials f ∈ Fq [x] of degree
n with the coefficients not in A as prescribed satisfies
$$I = cq^m + O(q^{m/2}),$$
3.5.40 Remark The final theorem in this section relates to irreducible polynomials of even degree
of a specific type, namely self-reciprocal polynomials of degree $2n$ (see Definition 2.1.48).
Note that, if $F$ is a self-reciprocal polynomial of degree $2n$, then its last coefficient $a_{2n} = 1$
and its first $n$ coefficients are the same as its last $n$ non-constant coefficients in the sense
that $a_i = a_{2n-i}$, $i = 1, \ldots, n$.
3.5.41 Theorem [1210] The number $I$ of irreducible self-reciprocal polynomials of degree $2n$ with
the first $m$ ($< n/2$) coefficients prescribed satisfies
$$\left| I - \frac{q^{n-m}}{2n} \right| \leq \frac{m + 5}{n}\, q^{\frac{n}{2} + 1}.$$
See also [1211].
3.5.42 Remark Expressions for the number of irreducible polynomials with the first two coefficients
prescribed are given in [1815]. These are described in terms of the function $H(a, n)$, $a \in \mathbb{F}_q$.
Here, if $p \nmid n$, then
$$H(a, n) = \begin{cases} q^{n-2} - \lambda((-1)^{l-1} l a)\, q^{l-1} & \text{for } n = 2l, \\ q^{n-2} + \delta(a) \lambda((-1)^{l} n)\, q^{l-1} & \text{for } n = 2l + 1, \end{cases}$$
where $\lambda$ denotes the quadratic character on $\mathbb{F}_q$, $\delta(a) = -1$ ($a \neq 0$) and $\delta(0) = q - 1$. If $p \mid n$,
then
$$H(a, n) = \begin{cases} q^{n-2} - \delta(a) \lambda((-1)^{l})\, q^{l-1} & \text{for } n = 2l, \\ q^{n-2} + \lambda((-1)^{l} 2a)\, q^{l} & \text{for } n = 2l + 1. \end{cases}$$
Given $a_1, a_2 \in \mathbb{F}_q$, in Theorems 3.5.43 and 3.5.45 (which relate, respectively, to odd and
even $q$), $I(a_1, a_2)$ denotes the number of irreducible polynomials of degree $n$ over $\mathbb{F}_q$ with
the indicated first two coefficients.
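3.5.43 Theorem [1815] Suppose $q$ is odd. Then, if $p \nmid n$ and $a \in \mathbb{F}_q$,
$$I(0, -a) = \frac{1}{n} \sum_{d \mid n} \mu(d) H(a/d, n/d).$$
$$I(0, -a) = \frac{1}{n} \sum_{d \mid n_0} \mu(d) H(a/d, n/d), \quad a \neq 0,$$
$$I(0, 0) = \frac{1}{n} \sum_{d \mid n_0} \mu(d) \left[H(0, n/d) - q^{n/(pd)}\right],$$
$$I(1, 0) = \frac{1}{nq^2} \sum_{d \mid n_0} \mu(d) q^{n/d}.$$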
3.5.44 Remark The general value of $I(a_1, a_2)$ can be recovered from Theorem 3.5.43 using, if $p \nmid n$,
$I(a_1, a_2) = I\left(0, a_2 - \frac{n-1}{2n} a_1^2\right)$. If $p \mid n$ and $a_1 \neq 0$, then $I(a_1, a_2) = I(1, 0)$.
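3.5.45 Theorem [1815] Suppose $q$ is even. Then for $a \in \mathbb{F}_q$,
$$I(0, a) = \begin{cases} \dfrac{1}{n} \displaystyle\sum_{d \mid n} \mu(d) H(a, n/d) & \text{for } n \text{ odd}, \\[2ex] \dfrac{1}{n} \displaystyle\sum_{d \mid n,\ d \text{ odd}} \mu(d) \left[H(a, n/d) - q^{\frac{n}{2d} - 1}\right] & \text{for } n \text{ even}; \end{cases}$$
$$I(1, a) = \frac{1}{n} \sum_{d \mid n,\ d \text{ odd}} \mu(d) H^*\left(a + \frac{d-1}{2}, \frac{n}{d}\right),$$
where, with $q = 2^r$ and $\chi$ the canonical additive character on $\mathbb{F}_q$,
$$H^*(a, n) = \begin{cases} q^{n-2} & \text{for } n = 4l, \\ q^{n-2} + (-1)^{lr} \delta(a)\, q^{\frac{n-3}{2}} & \text{for } n = 4l + 1, \\ q^{n-2} + (-1)^{lr} \delta(1 + a)\, q^{\frac{n-3}{2}} & \text{for } n = 4l - 1, \\ q^{n-2} - (-1)^{lr} \chi(a)\, q^{\frac{n-2}{2}} & \text{for } n = 4l + 2. \end{cases}$$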
3.5.46 Remark The general value of $I(a_1, a_2)$ can be recovered from Theorem 3.5.45 since, for
$a_1 \neq 0$, $I(a_1, a_2) = I(1, a_2/a_1^2)$ and $I(0, a) = I(0, 1)$, $a \neq 0$.
3.5.47 Remark For the binary field F2 , [565] contains alternative expressions for the number
I(a1 , a2 ) in Theorem 3.5.45. For expressions for the number of irreducible polynomials over
F2 with the first three coefficients prescribed, see [1076, 3049]. The flavor of these results is
caught by the following conjecture which holds for s ≤ 3.
3.5.48 Conjecture [3049] Let $n = 2l$ be even and $I(a_1, \ldots, a_s)$ denote the number of irreducible
polynomials of degree $n$ (over $\mathbb{F}_2$) whose first $s$ coefficients are as shown. Then
$$I(a_1, \ldots, a_s) = \sum_{d \mid n,\ d \text{ odd}} \mu(d) J(n/d, a_1, \ldots, a_s),$$
where
$$J(n, a_1, \ldots, a_s) = 2^{n-s} + c_{l-r+1} 2^{l-r+1} + \cdots + c_l 2^l,$$
for some $r$ with $1 \leq r \leq l$ and $c_i \in \{-1, 0, 1\}$.
3.5.49 Remark Let $\gamma$ be a primitive element of $\mathbb{F}_q$ and $s \mid q - 1$. A coset $C$ of the subgroup of $\mathbb{F}_q^*$
generated by $\gamma^s$ takes the form $\{\gamma^{is+h} : 0 \leq i < (q - 1)/s\}$, for some $h$ with $0 \leq h < s$.
In [1783] general expressions are obtained for the number of irreducible polynomials of
degree $n$ with prescribed trace and norm in a specified coset $C$. These involve quantities
like Gauss and Jacobi sums. Such expressions are made explicit (i.e., the trigonometric sums
are precisely determined) in the following cases:
1. $s = 2$ (compare with [541]), $s = 3$ and $s = 4$;
2. $q = p^{2er}$ and $s$ ($> 1$) a factor of $p^e + 1$.
3.5.50 Remark For q powers of 2 or 3, [2123] contains completely explicit expressions for the
number of irreducible polynomials of degrees n in the indicated ranges for polynomials with
two prescribed coefficients as follows:
1. $a_1$ any prescribed value, $a_{n-1} = 0$, for $n \leq 10$;
2. $a_1 = 0$, $a_3$ any prescribed value, for $n \leq 30$.
See Also
References Cited: [134, 310, 512, 541, 565, 669, 685, 1076, 1209, 1210, 1211, 1405, 1416,
1447, 1551, 1783, 1815, 2121, 2123, 2318, 2454, 2467, 2710, 2834, 2893, 3049]
3.6.1 Theorem [1561] The polynomial ring $\mathbb{F}_q[x_1, \ldots, x_k]$ is a unique factorization domain. Let
$f(x_1, \ldots, x_k) = a_0 + a_1 x_k + \cdots + a_n x_k^n \in \mathbb{F}_q[x_1, \ldots, x_k]$, where $n > 0$, $a_i \in \mathbb{F}_q[x_1, \ldots, x_{k-1}]$,
$0 \leq i \leq n$, $a_n \neq 0$. Then $f$ is irreducible in $\mathbb{F}_q[x_1, \ldots, x_k]$ if and only if it is irreducible
in $(\mathbb{F}_q(x_1, \ldots, x_{k-1}))[x_k]$ and $\gcd_{\mathbb{F}_q[x_1, \ldots, x_{k-1}]}(a_0, \ldots, a_n) = 1$, where $\mathbb{F}_q(x_1, \ldots, x_{k-1})$ is the
field of rational functions in $x_1, \ldots, x_{k-1}$ over $\mathbb{F}_q$ and $\gcd_{\mathbb{F}_q[x_1, \ldots, x_{k-1}]}(a_0, \ldots, a_n)$ denotes
the gcd of $a_0, \ldots, a_n$ in $\mathbb{F}_q[x_1, \ldots, x_{k-1}]$.
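3.6.3 Definition Let $\mathcal{N}_k = \mathbb{F}_q[x_1, \ldots, x_k]/\!\sim$, where for $f, g \in \mathbb{F}_q[x_1, \ldots, x_k]$, $f \sim g$ means that
$f = cg$ for some $c \in \mathbb{F}_q^*$. Elements of $\mathcal{N}_k$ are normalized polynomials in $\mathbb{F}_q[x_1, \ldots, x_k]$.
Let $\mathcal{N}_k(m) = \{f \in \mathcal{N}_k : \deg f = m\}$, where $\deg f$ denotes the total degree of $f$. Let
$$N_k(m) = |\mathcal{N}_k(m)| = \frac{1}{q-1}\left[ q^{\binom{m+k}{k}} - q^{\binom{m+k-1}{k}} \right], \qquad I_k(m) = |\{f \in \mathcal{N}_k(m) : f \text{ is irreducible}\}|,$$
$$P_k(m; n) = |\{(f, g) \in \mathcal{N}_k(m) \times \mathcal{N}_k(n) : \gcd(f, g) = 1\}|.$$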
3.6.4 Remark We have that Nk (m) is the number of normalized polynomials of (total) degree m
in Fq [x1 , . . . , xk ], Ik (m) is the number of normalized irreducible polynomials of degree m,
and Pk (m; n) is the number of relatively prime pairs of normalized polynomials of degrees
m and n, respectively.
3.6.5 Theorem [336] We have $I_k(0) = 0$, and for $m > 0$, $I_k(m)$ is given by the recursive formula
$$I_k(m) = N_k(m) - \sum_{1a_1 + 2a_2 + \cdots + (m-1)a_{m-1} = m} \binom{I_k(1) + a_1 - 1}{a_1} \cdots \binom{I_k(m-1) + a_{m-1} - 1}{a_{m-1}}.$$
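The recursion of Theorem 3.6.5 can be evaluated directly for small parameters. The following is a minimal sketch, not taken from the source: the names `make_counter` and `partitions` are illustrative, and $N_k(m)$ is computed from the closed formula of Definition 3.6.3. Each term of the sum corresponds to a multiset of irreducible factors, with $a_i$ factors of degree $i$.

```python
from functools import lru_cache
from math import comb


def partitions(total, largest):
    """Yield {part: multiplicity} for each partition of `total` into parts <= `largest`."""
    if total == 0:
        yield {}
        return
    if largest == 0:
        return
    for a in range(total // largest, -1, -1):
        for rest in partitions(total - a * largest, largest - 1):
            yield {largest: a, **rest} if a else rest


def make_counter(q, k):
    """Return (N, I): counts of normalized and of normalized irreducible
    polynomials of total degree m in k variables over F_q."""

    def N(m):
        # Closed formula from Definition 3.6.3.
        return (q ** comb(m + k, k) - q ** comb(m + k - 1, k)) // (q - 1)

    @lru_cache(maxsize=None)
    def I(m):
        if m == 0:
            return 0
        reducible = 0
        # Parts are bounded by m - 1, so only proper factorizations are counted.
        for mult in partitions(m, m - 1):
            term = 1
            for i, a in mult.items():
                # Multisets of size a drawn from the I(i) irreducibles of degree i.
                term *= comb(I(i) + a - 1, a)
            reducible += term
        return N(m) - reducible

    return N, I


if __name__ == "__main__":
    _, I = make_counter(2, 1)
    # Univariate case over F_2: the familiar counts of irreducible polynomials.
    print([I(m) for m in range(1, 7)])  # [2, 1, 2, 3, 6, 9]
```

For $k = 1$ the output agrees with the classical counts of monic irreducible polynomials over $\mathbb{F}_2$, which gives a quick sanity check before trying $k \ge 2$.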
where
$$A_k(d) = \sum_{1a_1 + 2a_2 + \cdots + d a_d = d} (-1)^{a_1 + \cdots + a_d} \binom{I_k(1)}{a_1} \cdots \binom{I_k(d)}{a_d}.$$
3.6.8 Remark In Theorem 3.6.7, when k ≥ 2, no closed formula for Ak (d) is known. When k = 1,
it is known that A1 (0) = 1, A1 (1) = −q, and A1 (d) = 0 for d ≥ 2 [537, 668].
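The univariate values quoted in Remark 3.6.8 can be checked numerically. The sketch below is illustrative only: it assumes the classical Möbius-inversion count of monic irreducible polynomials of degree $d$ over $\mathbb{F}_q$ for $I_1(d)$, and the helper names (`mobius`, `irreducible_count`, `partitions`, `A1`) are hypothetical.

```python
from math import comb


def mobius(n):
    """Moebius function mu(n) by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def irreducible_count(q, d):
    """I_1(d): number of monic irreducible polynomials of degree d over F_q."""
    return sum(mobius(d // e) * q ** e for e in range(1, d + 1) if d % e == 0) // d


def partitions(total, largest):
    """Yield {part: multiplicity} for each partition of `total` into parts <= `largest`."""
    if total == 0:
        yield {}
        return
    if largest == 0:
        return
    for a in range(total // largest, -1, -1):
        for rest in partitions(total - a * largest, largest - 1):
            yield {largest: a, **rest} if a else rest


def A1(q, d):
    """A_1(d): signed sum over all (a_1, ..., a_d) with 1*a_1 + ... + d*a_d = d."""
    total = 0
    for mult in partitions(d, d):
        term = 1
        for i, a in mult.items():
            term *= comb(irreducible_count(q, i), a)
        total += (-1) ** sum(mult.values()) * term
    return total


if __name__ == "__main__":
    for q in (2, 3, 4, 5):
        print(q, [A1(q, d) for d in range(0, 6)])
```

For each prime power $q$ tried, the list should begin with $1, -q$ and then consist of zeros, in line with the remark.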
$$I_k(m) = \sum_{i=0}^{t} \alpha_i N_k(m - i) + O\left(q^{\binom{m-t-1+k}{k}}\right),$$
3.6.11 Theorem [1546] Let $k \ge 2$. Then $\lim_{m+n \to \infty} \frac{P_k(m;n)}{N_k(m) N_k(n)} = 1$.
3.6.12 Remark [226] We have for the univariate case $\frac{P_1(m;n)}{N_1(m) N_1(n)} = 1 - \frac{1}{q}$; see Section 11.2 for more
results on counting univariate polynomials.
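The univariate ratio in Remark 3.6.12 is easy to confirm by brute force over $\mathbb{F}_2$. The sketch below is an illustration under stated assumptions: polynomials over $\mathbb{F}_2$ are encoded as Python integers (bit $i$ is the coefficient of $x^i$), both degrees are assumed to be at least 1, and the function names are hypothetical.

```python
def poly_mod(a, b):
    """Remainder of a modulo b in F_2[x], with polynomials encoded as integers."""
    db = b.bit_length()
    while a.bit_length() >= db:
        a ^= b << (a.bit_length() - db)
    return a


def poly_gcd(a, b):
    """Euclidean algorithm in F_2[x]."""
    while b:
        a, b = b, poly_mod(a, b)
    return a


def coprime_pairs(m, n):
    """Count pairs (f, g) of monic polynomials over F_2 with deg f = m,
    deg g = n and gcd(f, g) = 1."""
    count = 0
    for f in range(1 << m, 1 << (m + 1)):      # all polynomials of degree m
        for g in range(1 << n, 1 << (n + 1)):  # all polynomials of degree n
            if poly_gcd(f, g) == 1:
                count += 1
    return count


if __name__ == "__main__":
    for m, n in [(1, 1), (2, 3), (3, 3)]:
        total = 2 ** (m + n)  # N_1(m) * N_1(n) with q = 2
        print(m, n, coprime_pairs(m, n), total)
```

With $q = 2$ exactly half of all pairs should be coprime, independently of $m$ and $n$.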
3.6.13 Theorem [1546] Let $k \ge 2$ and $t \ge 0$ be fixed integers. Then
$$P_k(m; n) = \sum_{d=0}^{t} N_k(m - d) N_k(n - d) A_k(d) + O\bigl(N_k(m - t - 1) N_k(n - t - 1)\bigr),$$
where $A_k(d)$ is defined in Theorem 3.6.7. The constant in the $O$-term depends only on $q$, $k$, $t$.
3.6.14 Definition Let $f \in \mathbb{F}_q[x_1, \ldots, x_k]$. The vector degree of $f$, denoted by Def $f$, is the $k$-tuple
$(\deg_{x_1} f, \ldots, \deg_{x_k} f)$.
3.6.16 Remark We have that $N_k(m)$ is the number of normalized polynomials of vector degree
$m$ in $\mathbb{F}_q[x_1, \ldots, x_k]$, $I_k(m)$ is the number of normalized irreducible polynomials of vector
degree $m$, and $P_k(m; n)$ is the number of relatively prime pairs of normalized polynomials of
vector degrees $m$ and $n$, respectively.
3.6.17 Remark [664, 1546] For $m = (m_1, \ldots, m_k) \in \mathbb{N}^k$,
$$N_k(m) = \frac{1}{q-1} \sum_{(\delta_1, \ldots, \delta_k) \in \{0,1\}^k} (-1)^{k + \delta_1 + \cdots + \delta_k}\, q^{(m_1 + \delta_1) \cdots (m_k + \delta_k)}.$$
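The inclusion-exclusion formula of Remark 3.6.17 can be compared with an exhaustive count in a tiny case. The following sketch is illustrative: the function names are hypothetical, and the brute-force check is restricted to two variables over $\mathbb{F}_2$, where every nonzero polynomial is its own normalized class because $q - 1 = 1$.

```python
from itertools import product


def N_vector(q, m):
    """Number of normalized polynomials of vector degree m over F_q (Remark 3.6.17)."""
    k = len(m)
    total = 0
    for delta in product((0, 1), repeat=k):
        exponent = 1
        for mi, di in zip(m, delta):
            exponent *= mi + di
        total += (-1) ** (k + sum(delta)) * q ** exponent
    return total // (q - 1)


def brute_force_F2(m1, m2):
    """Count polynomials over F_2 in x, y with deg_x = m1 and deg_y = m2 exactly."""
    cells = [(i, j) for i in range(m1 + 1) for j in range(m2 + 1)]
    count = 0
    for coeffs in product((0, 1), repeat=len(cells)):
        support = [cell for cell, c in zip(cells, coeffs) if c]
        if not support:
            continue  # skip the zero polynomial
        if max(i for i, _ in support) == m1 and max(j for _, j in support) == m2:
            count += 1
    return count


if __name__ == "__main__":
    print(N_vector(2, (1, 1)), brute_force_F2(1, 1))  # 10 10
    print(N_vector(2, (2, 1)), brute_force_F2(2, 1))  # 44 44
```

For $k = 1$ the formula collapses to $q^{m_1}$, the number of monic polynomials of degree $m_1$.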
3.6.19 Theorem [1546] We have $I_k(o) = 0$, and for $m > o$, $I_k(m)$ is given by the recursive formula
$$I_k(m) = N_k(m) - \sum_{\substack{(a_i)_{o < i < m} \\ \sum_i a_i i = m}} \prod_{i} \binom{I_k(i) + a_i - 1}{a_i}.$$
3.6.21 Theorem [1546] Let $k \ge 2$ and $(m_1, \ldots, m_k) \in (\mathbb{Z}^+)^k$ with $m_1 = \max_{1 \le i \le k-1} m_i$. Further
assume that $m_1 \ge 3$ if $k = 2$ and $m_1 \ge 2$ if $k = 3$. Then
$$I_k(m_1, \ldots, m_k) = N_k(m_1, \ldots, m_k) - q N_k(m_1, \ldots, m_{k-1}, m_k - 1) + O\left(q^{m_1 (m_2 + 1) \cdots (m_k + 1)}\right).$$
3.6.22 Remark The asymptotic formula in Theorem 3.6.21 is interesting only when mk > m1 since
otherwise the O-term is bigger than or comparable to the term qNk (m1 , . . . , mk−1 , mk − 1).
When mk > m1 , Theorem 3.6.21 indicates that most of the reducible polynomials in
Nk (m1 , . . . , mk ) are of the form (xk + α)f for some α ∈ Fq , f ∈ Nk (m1 , . . . , mk−1 , mk − 1).
3.6.23 Corollary Under the assumptions of Theorem 3.6.21, we have
$$I_k(m_1, \ldots, m_k) = (1 - q^{-M_k}) N_k(m_1, \ldots, m_k) + O\left(q^{m_1 (m_2 + 1) \cdots (m_k + 1)}\right),$$
where $M_k = (m_1 + 1) \cdots (m_{k-1} + 1) - 1$. The main term here was given by Cohen [664,
Theorem 1]. For fixed $m_1, \ldots, m_{k-1}$, it is not the case that almost all polynomials in
$\mathcal{N}_k(m_1, \ldots, m_k)$ are irreducible: we have $\frac{I_k(m_1, \ldots, m_k)}{N_k(m_1, \ldots, m_k)} \to 1 - q^{-M_k}$ as $m_k \to \infty$.
3.6.24 Remark [1546] For $k \ge 2$, $\frac{I_k(m_1, \ldots, m_k)}{N_k(m_1, \ldots, m_k)} \to 1$ as both $m_1, m_k \to \infty$.
where
$$A_k(d) = \sum_{\substack{(a_i)_{0 < i \le d} \\ \sum_{0 < i \le d} a_i i = d}} (-1)^{\sum_{0 < i \le d} a_i} \prod_{0 < i \le d} \binom{I_k(i)}{a_i}.$$
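3.6.31 Theorem [2214] Let $k \ge 2$. Assume that $f \in \mathbb{F}_q[x_1, \ldots, x_k]$ is indecomposable over $\overline{\mathbb{F}}_q$, the
algebraic closure of $\mathbb{F}_q$. For each $\lambda \in \overline{\mathbb{F}}_q$, let $I_\lambda$ denote the set of all distinct irreducible
factors of $f - \lambda$ over $\overline{\mathbb{F}}_q$. Then
$$\sum_{\lambda \in \overline{\mathbb{F}}_q} (|I_\lambda| - 1) \le \min_{\lambda \in \overline{\mathbb{F}}_q} \Bigl( \sum_{g \in I_\lambda} \deg g \Bigr) - 1.$$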
3.6.33 Definition [1224] Let F be a field and fix a term order in F[x1 , . . . , xk ] that respects
the total degree. A monic polynomial f ∈ F[x1 , . . . , xk ] with f (0, . . . , 0) = 0 is monic
original.
3.6.34 Remark [338, 1224] Let F be a field and let k ≥ 2. Every monic original polynomial
f ∈ F[x1 , . . . , xk ] has a unique decomposition f = u ◦ h, where u ∈ F[t], h ∈ F[x1 , . . . , xk ]
are both monic original and h is indecomposable.
3.6.35 Theorem [1224] Let $F$ be an algebraically closed field and let $k \ge 2$ and $n \ge 2$ be integers.
Denote by $l$ the smallest prime divisor of $n$. Then the set of all decomposable monic original
polynomials in $F[x_1, \ldots, x_k]$ of degree $n$ is an affine algebraic set of dimension $\binom{k + n/m}{k} + m - 3$,
where
$$m = \begin{cases} n & \text{if } k = 2,\ n/l \text{ is a prime and } n/l \le 2l - 5, \\ l & \text{otherwise.} \end{cases}$$
3.6.36 Definition Let k ≥ 2 and n ≥ 1. Denote the number of monic original (respectively
indecomposable monic original, decomposable monic original) polynomials of (total)
degree n in Fq [x1 , . . . , xk ] by Pk,n (respectively Ik,n , Dk,n ).
3.6.37 Theorem [338] We have Ik,1 = q k−1 , and for n > 1, Ik,n is given by the recursive formula
$$I_{k,n} = P_{k,n} - \sum_{m \mid n,\ m < n} q^{n/m - 1} I_{k,m}.$$
3.6.38 Remark [338] Let $k \ge 2$. We have $I_{k,n} / P_{k,n} \to 1$ when $n \to \infty$ with $q$ fixed or when $q \to \infty$ with
$n$ fixed.
3.6.39 Theorem [1224] Assume $k \ge 2$ and $n \ge 2$. Let $l$ be the smallest prime divisor of $n$ and let
$m$ be as defined in Theorem 3.6.35. Then the following hold.
1. $|D_{k,n} - \alpha_{k,n}| \le \alpha_{k,n} \beta^*_{k,n}$, where
$$\alpha_{k,n} = q^{\binom{k + n/m}{k} + m - 3} \cdot \frac{1 - q^{-\binom{k - 1 + n/m}{k-1}}}{1 - q^{-1}}, \qquad \beta^*_{k,n} = \frac{2\, q^{-\frac{1}{2} \binom{k - 1 + n/l}{k-1} + 1}}{1 - q^{-1}}.$$
3.6.41 Remark Computation of the gcd of multivariate polynomials over finite fields is more
difficult than that of univariate polynomials; the Extended Euclidean Algorithm alone is
not sufficient to produce the gcd due to the fact that Fq [x1 , . . . , xk ] with k > 1 is no longer
a Euclidean domain. In this subsection we gather several algorithms for computing the
multivariate gcd based on different approaches.
3.6.42 Definition Let R be a unique factorization domain. For f ∈ R[x], lc(f ) denotes the leading
coefficient of f ; pp(f ) denotes the primitive part of f , i.e., f divided by the gcd of its
coefficients. The resultant of f, g ∈ R[x] is denoted by res(f, g).
$w(x, u) = b(u) v_u$, $f^*(x, u) w(x, u) = b(u) f(x, u)$, $g^*(x, u) w(x, u) = b(u) g(x, u)$
for all $u \in S$.
7. Check if $\deg_y(f^* w) = \deg_y(bf)$ and $\deg_y(g^* w) = \deg_y(bg)$.
8. $\gcd_{\mathbb{F}_q[x,y]}(f, g) = \mathrm{pp}_x(w)$.
3.6.45 Remark Let h = gcdFq [x,y] (f, g). The conditions in Step 7 of Algorithm 3.6.44 are satisfied
if and only if S does not contain any root of resx (f /h, g/h).
3.6.46 Algorithm [13] (Gröbner basis)
Input: $f, g \in \mathbb{F}_q[x_1, \ldots, x_k]$, $f, g \ne 0$.
Output: $\gcd(f, g)$.
1. Compute the reduced Gröbner basis $G$ for the ideal $\langle wf, (1 - w)g \rangle$ of
$\mathbb{F}_q[x_1, \ldots, x_k, w]$ with respect to an elimination order with $x_1, \ldots, x_k$ smaller
than $w$.
2. $\mathrm{lcm}(f, g)$ is the polynomial in $G$ which does not involve $w$.
3. $\gcd(f, g) = \frac{fg}{\mathrm{lcm}(f, g)}$.
3.6.47 Remark For the correctness of Algorithm 3.6.43, see [1227, Theorem 6.12]. The cost of the
Extended Euclidean Algorithm is given in [1227, Theorem 3.11]. For the correctness and
the cost of Algorithm 3.6.44, see [1227, Theorem 6.37]. The correctness of Algorithm 3.6.46
is given in [13, p.72]. For the complexity of computing reduced Gröbner bases, see [1227,
Section 21.7].
See Also
References Cited: [13, 226, 336, 337, 338, 537, 664, 668, 1224, 1227, 1244, 1546, 1561, 2214]
4 Primitive polynomials
4.1.3 Theorem The number of primitive polynomials of degree $n$ over $\mathbb{F}_q$ is $\varphi(q^n - 1)/n$, where
$\varphi$ denotes Euler's function.
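Theorem 4.1.3 can be confirmed by exhaustive search for small degrees over $\mathbb{F}_2$. The sketch below is an illustration only: polynomials are encoded as integers (bit $i$ is the coefficient of $x^i$), primitivity is decided through the order criterion of Theorem 4.1.5, and the function names are hypothetical.

```python
def euler_phi(r):
    """Euler's function by trial division."""
    result, m, p = r, r, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p
        p += 1
    if m > 1:
        result -= result // m
    return result


def order_of_x(f, n):
    """Least e > 0 with x^e = 1 modulo f, for f of degree n over F_2 with f(0) = 1."""
    r, e = 1, 0
    while True:
        r <<= 1              # multiply by x
        if (r >> n) & 1:
            r ^= f           # reduce modulo f
        e += 1
        if r == 1:
            return e


def count_primitive(n):
    """Number of primitive polynomials of degree n over F_2 by exhaustive search."""
    count = 0
    for f in range(1 << n, 1 << (n + 1)):
        # f(0) != 0 is required; the order must then equal 2^n - 1.
        if f & 1 and order_of_x(f, n) == (1 << n) - 1:
            count += 1
    return count


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, count_primitive(n), euler_phi(2 ** n - 1) // n)
```

The two printed columns should coincide for every degree tried.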
4.1.4 Remark [1939, Section 3.1] The reciprocal polynomial $f^*$ (with leading coefficient different
from 0) of a primitive polynomial $f$ of degree $n$ is defined by $f^*(x) = x^n f(1/x)$; see
Definition 2.1.48. The monic reciprocal polynomial of a primitive polynomial is again primitive.
In general, for any polynomial $f$, the order of $f^*$ is the same as the order of $f$. For the
definition of order of a polynomial see Definition 2.1.51.
4.1.5 Theorem [1939, Theorem 3.16] A polynomial $f \in \mathbb{F}_q[x]$ of degree $n$ is primitive if and only
if $f$ is monic, $f(0) \ne 0$, and the order of $f$ is $q^n - 1$.
4.1.6 Theorem [1939, Theorem 3.18] A monic polynomial $f \in \mathbb{F}_q[x]$ of degree $n \ge 1$ is a primitive
polynomial if and only if the smallest positive integer $r$ for which $x^r$ is congruent modulo
$f$ to some element of $\mathbb{F}_q$ is $r = (q^n - 1)/(q - 1)$ and $(-1)^n f(0)$ is a primitive element in $\mathbb{F}_q$.
In case $f$ is primitive over $\mathbb{F}_q$, $x^r \equiv (-1)^n f(0) \pmod{f(x)}$.
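The criterion of Theorem 4.1.6 translates into a short primitivity test. The sketch below is a minimal illustration restricted to prime fields $\mathbb{F}_p$ (an assumption, so that arithmetic is plain integer arithmetic modulo $p$); a monic polynomial is given as its list of coefficients from the constant term upwards, and all function names are hypothetical.

```python
def mul_by_x(r, f, p):
    """Multiply the residue r (list of n coefficients, low degree first) by x modulo monic f."""
    n = len(f) - 1
    top = r[-1]
    shifted = [0] + r[:-1]
    # x^n is replaced by -(f_0 + f_1 x + ... + f_{n-1} x^{n-1}).
    return [(c - top * fc) % p for c, fc in zip(shifted, f[:n])]


def least_constant_power(f, p):
    """Smallest r > 0 with x^r congruent to a constant modulo f; returns (r, constant) or None."""
    n = len(f) - 1
    r = [1] + [0] * (n - 1)
    for e in range(1, p ** n):
        r = mul_by_x(r, f, p)
        if not any(r[1:]):
            return e, r[0]
    return None


def is_primitive_element(a, p):
    """True if a generates the multiplicative group of F_p."""
    if a % p == 0:
        return False
    order, t = 1, a % p
    while t != 1:
        t = t * a % p
        order += 1
    return order == p - 1


def is_primitive(f, p):
    """Primitivity test for monic f over F_p following Theorem 4.1.6."""
    n = len(f) - 1
    found = least_constant_power(f, p)
    norm = ((-1) ** n * f[0]) % p
    return (found is not None
            and found[0] == (p ** n - 1) // (p - 1)
            and is_primitive_element(norm, p))


if __name__ == "__main__":
    # x^2 + x + 2 over F_3: x^4 = 2 modulo f, and 2 is a primitive element of F_3.
    print(least_constant_power([2, 1, 1], 3), is_primitive([2, 1, 1], 3))
    # x^2 + 1 over F_3 is irreducible but x^2 is already a constant.
    print(least_constant_power([1, 0, 1], 3), is_primitive([1, 0, 1], 3))
```

The search for $r$ is linear in $q^n$, so this is only meant for small fields and degrees.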
$$g(y) = \frac{y^t - 1}{(y - 1) f(y)} = h(y) + \frac{r_0 + r_1 y + \cdots + r_{k-1} y^{k-1}}{f(y)},$$
where $h(y) = \epsilon_{t-k} + \epsilon_{t-k-1} y + \cdots + \epsilon_1 y^{t-k-1}$. Then $f$ is primitive if and only if the number of
nonzero terms in $h(y)$, considered as a polynomial in $y$ over $\mathbb{F}_q$, is equal to $q^{k-1}(q - 1) - 1 - N$,
where $N$ is the number of nonzero terms in the finite sequence $\epsilon_{t-k+1}, \epsilon_{t-k+2}, \ldots, \epsilon_m$ defined
by $\epsilon_{t-n} = r_n - \sum_{i=1}^{k-n-1} f_i \epsilon_{t-n-i}$, $n = 0, 1, \ldots, k - 1$, where the empty sum is interpreted as
0, and $\epsilon_{t+n} = 1 - \sum_{i=1}^{k} f_i \epsilon_{t+n-i}$, $n = 1, 2, \ldots, m - t$.
4.1.9 Corollary [1857] An irreducible polynomial is primitive if and only if the finite sequence
$\epsilon_1, \ldots, \epsilon_m$ defined in Theorem 4.1.8 contains no two identical periodic subsequences.
4.1.10 Definition [2186] Define $w_n(q)$ and $W_n(q)$ as the minimal weight (the number of nonzero
coefficients) among all monic irreducible and primitive, respectively, polynomials of
degree $n$ over $\mathbb{F}_q$.
4.1.11 Remark It is stated in [2186] that wn (p) ≤ Wn (p) ≤ (n + 1)/2 for any sufficiently large
prime p. For p = 2, wn (2) ≤ Wn (2) ≤ n/4 + o(n).
4.1.12 Problem Extend these results to Fq when q is a power of a prime.
4.1.13 Conjecture [2186] Wn (2) = 3 infinitely often.
4.1.14 Problem [2186] Find examples of fields Fqn with Wn (q) = o(n) or at least with wn (q) = o(n)
for infinitely many n.
4.1.15 Remark The following conjectures require the notions of primitive normal polynomials and
completely normal primitive polynomials; see Sections 5.2 and 5.4.
4.1.17 Definition For a prime p ≥ 3, assume that the field Fp consists of the elements
0, ±1, . . . , ±(p − 1)/2. The height of a polynomial is the maximum absolute value of its
coefficients. We define hn (p) and Hn (p) as the minimal height of all monic irreducible
and primitive, respectively, polynomials of degree n over Fp .
4.1.18 Remark It is stated in [2186] that $h_n(p) = O(p^{2/3})$ and $H_n(p) = O(p^{n/(n+1)+\varepsilon})$. The bound
$h_n(p) = O(p^{2/3})$ has been improved to $h_n(p) \le p^{1/2+o(1)}$ in [2654].
4.1.19 Problem Improve the above bounds for hn (p) and Hn (p).
4.1.20 Problem Extend the bounds for Fq when q is a power of a prime.
4.1.21 Theorem [690] For n ≥ 2 and a ∈ F∗q , there is a primitive normal polynomial of degree n
over Fq with trace a.
4.1.22 Theorem There is a primitive normal polynomial of degree n ≥ 3 with given norm and
nonzero trace; see [682] for n ≥ 5, [692] for n = 4, and [1557] for n = 3.
4.1.23 Conjecture [2157] For each $n \ge 2$ there is a completely normal primitive polynomial of
degree $n$ over $\mathbb{F}_q$. (This is true if $n$ is a prime, or $n = 4$, or if $q^n \le 2^{31}$ with $q \le 97$.)
4.1.24 Remark [2186] For q ≥ n log n there is a completely normal primitive polynomial of Fqn
over Fq .
4.1.25 Conjecture [2816] For any $k$ there is a trinomial $f$ so that $\gcd\left(f(x),\, x^{2^k - 1} + 1\right)$ is a primitive
polynomial of degree $k$ over $\mathbb{F}_2$. (This conjecture has been proved for $k \le 500$ [307].)
4.1.26 Definition Let $f(x) = x^n + \sum_{k=0}^{t} a_k x^{n_k}$, $a_k \ne 0$, and $0 = n_0 < n_1 < \cdots < n_t < n$. We
define the excess of $f$ by
$$E(f) = \sum_{t \ge j \ge (t+1)/2} n_j - \sum_{t/2 \ge j \ge 1} n_j.$$
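The excess of Definition 4.1.26 depends only on the exponents $n_0 < n_1 < \cdots < n_t$ of the terms below the leading term. A minimal sketch of the computation follows; the function name `excess` and the list-of-exponents input format are illustrative choices, not notation from the source.

```python
def excess(exponents):
    """Excess E(f) of f = x^n + sum_k a_k x^{n_k}, given [n_0, n_1, ..., n_t]
    with 0 = n_0 < n_1 < ... < n_t < n and all a_k nonzero."""
    t = len(exponents) - 1
    # First sum: indices j with t >= j >= (t + 1) / 2, i.e. 2j >= t + 1.
    upper = sum(exponents[j] for j in range(1, t + 1) if 2 * j >= t + 1)
    # Second sum: indices j with t / 2 >= j >= 1, i.e. 2j <= t.
    lower = sum(exponents[j] for j in range(1, t + 1) if 2 * j <= t)
    return upper - lower


if __name__ == "__main__":
    print(excess([0, 2]))        # trinomial x^5 + x^2 + 1: E(f) = 2
    print(excess([0, 1, 3, 4]))  # x^5 + x^4 + x^3 + x + 1: E(f) = (3 + 4) - 1 = 6
```

For a trinomial $x^n + x^s + 1$ the definition gives $E(f) = s$, since $t = 1$ and the second sum is empty.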
4.1.27 Remark The excess of a polynomial is related to self-dual, weakly self-dual and almost
weakly self-dual bases; see Section 5.1.
4.1.28 Remark In [2186] it is proved that there are binary primitive polynomials of degree $n$ with
$E(f) \le \frac{3}{32} n^2 + o(n)$.
4.1.29 Problem [2158] Let E(f ) denote the excess of the polynomial f . Prove or disprove that for
p a prime, there is a primitive polynomial f for each degree n ≥ 2 with excess E(f ) at most
as follows:
1. E(f ) ≤ 1, if p > 5;
2. E(f ) ≤ 2, if p = 5;
3. E(f ) ≤ 3, if p = 3;
4. E(f ) ≤ 6, if p = 2.
4.1.30 Remark See [2158] for more results related to the excess of a polynomial.
4.1.31 Remark Several classes of irreducible and primitive polynomials seem to have asymptotic
density $\delta_{irr}$ (as $q \to \infty$) and $\delta_{prim}$ given by
$$\delta_{irr} = \frac{1}{n} \quad \text{and} \quad \delta_{prim} = \frac{\varphi(q^n - 1)}{n q^n}.$$
4.1.32 Problem [2158] Find natural examples of families of polynomials over Fq having densities of
irreducible and/or primitive polynomials different from δirr and δprim . Some such examples
are given in Table 7 of [1181] and in [673] in connection to windmill polynomials.
See Also
References Cited: [307, 588, 673, 682, 690, 692, 1071, 1181, 1416, 1557, 1857, 1939, 2156,
2157, 2158, 2186, 2204, 2654, 2807, 2816]
4.2.1 Definition Given a positive integer r define τ (r) = φ(r)/r, where φ is Euler’s function.
4.2.2 Remark For a pair (q, n) with q (as always) a prime power, τ (q n − 1) is the proportion
of non-zero elements of Fqn that are primitive (see Theorem 4.1.3). It also signifies the
proportion of powers of irreducible polynomials of degree n that are primitive.
4.2.3 Remark For a primitive polynomial $f \in \mathbb{F}_q[x]$ and non-zero $a \in \mathbb{F}_q$ the polynomial $f(x + a)$
need not be primitive. The arithmetical structure of the set of primitive polynomials is less
marked than that of the set of irreducible polynomials.
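A small worked example, not taken from the source, illustrates how a translation can destroy primitivity. Over $\mathbb{F}_3$ take $f(x) = x^2 + x + 2$ and $a = 1$:

```latex
\[
  f(x) = x^2 + x + 2 \in \mathbb{F}_3[x]: \qquad
  x^2 \equiv 2x + 1, \quad x^3 \equiv 2x + 2, \quad x^4 \equiv 2 \pmod{f(x)},
\]
so a root of $f$ has order $8 = 3^2 - 1$ and $f$ is primitive. On the other hand,
\[
  f(x + 1) = (x + 1)^2 + (x + 1) + 2 = x^2 + 3x + 4 = x^2 + 1 \in \mathbb{F}_3[x],
\]
whose roots satisfy $x^2 = -1$ and hence $x^4 = 1$; they have order $4 < 8$, so $f(x+1)$
is irreducible but not primitive.
\]
```

Irreducibility, by contrast, is preserved under every substitution $x \mapsto x + a$.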
4.2.4 Remark The monic reciprocal $f^*(x) = a_n^{-1} x^n f(1/x)$ (see Remark 3.5.14) of a primitive
polynomial is also primitive (Remark 4.1.4).
4.2.5 Remark Except as mentioned, all polynomials (in particular in statements of theorems,
etc.) will be assumed to be monic polynomials of degree $n$ over $\mathbb{F}_q$ taken to have the form
$f(x) = x^n + \sum_{i=1}^{n} a_i x^{n-i}$. Here $a_i$ is the $i$-th coefficient. The meaning of the terms first (or
last) $m$ coefficients will be as described in Definition 3.5.1. In particular $-a_1$ is the trace of
$f$ and $(-1)^n a_n$ is the norm of $f$.
4.2.6 Remark The norm of a primitive polynomial has to be a primitive element of Fq (Theorem
4.1.6).
4.2.7 Remark Asymptotic estimates (for large n) of the number of primitive polynomials of degree
n with prescribed coefficients (e.g., with prescribed first or last coefficients) in themselves
do not lead to strong existence results. All the theorems in this section (except Theorem
4.2.14) are existence results that are unaccompanied by useful lower bounds on the number
of primitive polynomials over a range of pairs (q, n).
4.2.8 Problem Supply non-trivial bounds for the number of primitive polynomials with prescribed
coefficients in the problems in the following subsections.
4.2.10 Remark This section will also feature results on the distribution of polynomials with
prescribed coefficients that are simultaneously primitive and normal.
4.2.11 Remark Character sum techniques and estimates constitute the principal underlying
mechanism. Specifically, one can characterize both primitivity and the prescribed coefficient
requirement in terms of such sums. The primitivity condition is in terms of multiplicative
character sums over $\mathbb{F}_{q^n}$, whereas the coefficient conditions involve lifted (multiplicative
and additive) characters over $\mathbb{F}_q$.
4.2.12 Remark Theorem 4.2.14 is an illustration of a typical asymptotic type of lower bound
estimate that can be attained. All existence results described derive from such estimates
(perhaps “weighted”). Further examples are not given here because the best estimates for
absolute existence purposes are not optimal asymptotically (see Remark 4.2.17).
4.2.13 Definition A square-free divisor of a positive integer $r$ is any factor of the radical of $r$, i.e.,
the product of the distinct primes dividing $r$. Thus the number of square-free divisors
of $r$ (denoted by $W(r)$) is given by $W(r) = 2^{\omega(r)}$, where $\omega(r)$ is the number of distinct
primes dividing $r$ (with $\omega(1) = 0$).
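The identity $W(r) = 2^{\omega(r)}$ of Definition 4.2.13 can be checked against a direct enumeration of square-free divisors. The sketch below is illustrative; the function names are hypothetical and trial division is used only because the inputs are small.

```python
def omega(r):
    """Number of distinct primes dividing r (omega(1) = 0)."""
    count, p = 0, 2
    while p * p <= r:
        if r % p == 0:
            count += 1
            while r % p == 0:
                r //= p
        p += 1
    return count + (1 if r > 1 else 0)


def W(r):
    """Number of square-free divisors of r."""
    return 2 ** omega(r)


def is_squarefree(d):
    """True if no square greater than 1 divides d."""
    return all(d % (p * p) for p in range(2, int(d ** 0.5) + 1))


def squarefree_divisors(r):
    """All square-free divisors of r, by direct enumeration."""
    return [d for d in range(1, r + 1) if r % d == 0 and is_squarefree(d)]


if __name__ == "__main__":
    r = 2 ** 6 - 1  # 63 = 3^2 * 7
    print(W(r), squarefree_divisors(r))  # 4 [1, 3, 7, 21]
```

In the estimates of this section the relevant argument is $r = q^n - 1$, whose factorization is usually the expensive part.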
4.2.14 Theorem [1031] The number N of primitive polynomials whose first m coefficients are
arbitrarily prescribed satisfies
4.2.17 Remark The effect of the factor W (q n − 1) in estimates such as (4.2.1) can be significantly
reduced through a sieving technique described in many of the papers cited below. In this
way W (q n − 1) is reduced to W (k) where k is an essential “core,” a factor of q n − 1, and the
sieving process generally proceeds over the remaining primes in q n − 1 not in the core. This
produces sharper existence results (while simultaneously blunting asymptotic quality).
4.2.18 Remark When the characteristic p of Fq is less than m, an equivalent process to prescribing
directly the first m coefficients is to consider (irreducible) polynomials f with the first m
values σj prescribed, where σj is the trace over Fq of γ j for a root γ of f . By these means
it suffices to prescribe m alternative parameters (or conditions).
4.2.19 Remark One can focus on polynomials with a specific coefficient prescribed (the m-th, say),
[686]. For m ≤ n/2 this can be achieved with around m/2 constraints (rather than the m
conditions needed to fix the first m coefficients, noted in Remark 4.2.18). For m exceeding
n/2 the reciprocal polynomial can be considered but then an extra condition (relating to
fixing the norm as a selected primitive element of Fq ) is needed.
4.2.20 Remark Dealing with the situation when one (or more) of the first m coefficients are
prescribed via an alternative set of prescribed values as in Remark 4.2.18 breaks down if
p ≥ m. To overcome this, a p-adic method, devised initially by Fan and Han, for example
in [1030, 1031, 1032], and improved by Cohen, for example in [686], is used in many
of the papers cited below. This eliminates from the situation difficulties caused by the
characteristic.
4.2.21 Remark To establish a specific existence result, it is shown that it is valid for pairs (q, n)
satisfying an arithmetical condition that holds for almost all pairs (q, n). Computation is
required; firstly numerical checks to show that some of the finite number of remaining pairs
satisfy the condition and then, for pairs which fail, direct working in the field to exhibit a
polynomial with the required property.
4.2.22 Remark A conjecture of Hansen and Mullen, [1416], has been the driver for the key theorem
on the existence of a primitive polynomial with an arbitrary prescribed coefficient. Prior
to its formulation the only known result was the following existence theorem for primitive
polynomials with arbitrary trace [675, 700, 1637]. Paper [700] establishes a self-contained
proof of the full theorem.
4.2.23 Theorem [686, 701, 702] Given m with 1 ≤ m < n and a ∈ Fq , there exists a primitive
polynomial with m-th coefficient a, with (genuine) exceptions only when
(q, n, m, a) = (q, 2, 1, 0), (4, 3, 1, 0), (4, 3, 2, 0), (2, 4, 2, 1).
4.2.24 Remark An existence result on primitive polynomials with prescribed norm and trace
can be derived by combining a theorem on primitive normal polynomials with prescribed
norm and trace (Theorem 4.2.46) with one on the last two coefficients prescribed (Theorem
4.2.50), since these two coefficient prescriptions are equivalent for primitive polynomials,
as for irreducible polynomials (Remark 3.5.14). (Note, however, that the monic reciprocal
of a normal polynomial need not be normal and therefore the equivalence breaks down for
primitive normal polynomials.) For the question of prescribing the norm and trace it is
sensible to assume n ≥ 3 and that the prescribed norm is a primitive element of Fq .
4.2.25 Theorem [682, 691, 692, 1037, 1557] Let $a, b \in \mathbb{F}_q$ with $b$ a primitive element of $\mathbb{F}_q$. Suppose
$n \ge 3$ if $a \ne 0$ and $n \ge 5$ if $a = 0$. Then there exists a primitive polynomial with trace $a$
and norm $b$.
4.2.26 Remark Theorem 4.2.25 is complete except for the cases in which a = 0 and n = 3, 4; these
remaining cases have recently been resolved in [687]: the only exceptions occur when n = 3,
a = 0, and q = 4 or 7.
4.2.27 Problem Similarly, other existence results on primitive polynomials may be derived as a
consequence from those on primitive normal polynomials (Section 4.2.3). Enunciate these
and fill gaps that arise because the trace of a normal polynomial must be non-zero.
4.2.28 Theorem [627] Suppose n ≥ 5. Then there exists a primitive polynomial with a1 = an−1 =
0, except when (q, n) = (4, 5), (2, 6), (3, 6).
4.2.29 Remark As an existence result, Theorem 4.2.28 is complete: the question must have a
negative answer when $n \le 4$. It can be rephrased as asserting the existence of a
primitive $f$ for which both $f$ and its monic reciprocal (Remark 3.5.14) have trace 0.
4.2.30 Theorem [683] Suppose n ≥ 5 and a, b ∈ Fq . Then there exists a primitive polynomial f
such that f has trace a and its monic reciprocal has trace b.
4.2.31 Problem Extend Theorem 4.2.30 to cover degrees $n = 3, 4$, perhaps with some listed
exceptions.
4.2.32 Theorem [198, 695, 698, 1411, 1412, 2658] Suppose n ≥ 5 if q is odd, and n ≥ 7 if q is even.
Then there exists a primitive polynomial with its first two coefficients arbitrarily prescribed.
If n = 4 and q is odd, then the same conclusion holds for sufficiently large q.
4.2.33 Remark In [1411] it is claimed that when q is even, then the conclusion of Theorem 4.2.32
holds (with some exceptions) when n ≥ 4, although some of the given detail assumes n ≥ 7.
4.2.34 Theorem [695, 1030, 1033, 2104] Suppose n ≥ 7. Then there exists a primitive polynomial
with its first three coefficients arbitrarily prescribed.
4.2.35 Remark An asymptotic result (Theorem 4.2.52) on the existence of a primitive normal
polynomial with its first $\lfloor \frac{n-1}{2} \rfloor$ coefficients prescribed yields a corresponding one for a
primitive polynomial [695, 1032] (see Problem 4.2.27). What follows now is an unconditional
result on the existence of a primitive polynomial with up to one-third of its first coefficients
prescribed.
4.2.36 Theorem [684] Suppose $m \le \frac{n}{3}$ (except that $m \le \frac{n}{4}$ when $q = 2$). Then there exists a
primitive polynomial with its first $m$ coefficients arbitrarily prescribed, with the exception
that there is no primitive cubic over $\mathbb{F}_4$ with zero first coefficient.
4.2.37 Remark Underlying these results is the fundamental existence theorem of Lenstra and
Schoof [1899] (see also [693]).
4.2.38 Theorem [693, 1899] For every pair (q, n) there exists a primitive normal polynomial of
degree n over Fq .
4.2.39 Remark The original proof of Theorem 4.2.38 in [1899] requires significant numerical
computation to verify plus direct examination of a few fields. For the modified approach of [693]
a computer is not required.
4.2.40 Remark If n ≤ 2, then a primitive polynomial is automatically normal (theorem statements
may or may not include these values).
4.2.41 Remark For normal polynomials the character sums referred to in Remark 4.2.11 involve
additive characters over Fqn . The resulting estimates (corresponding to Theorem 4.2.14, for
example) now feature such quantities as W (xn − 1), defined as the number of square-free
polynomial divisors of the polynomial xn − 1 ∈ Fq [x]. Furthermore, the sieve (referred to in
Remark 4.2.17) now can additionally relate to the factorization of the polynomial xn − 1,
wherein the sieving process proceeds over the irreducible factors in xn − 1 not in the “core.”
Indeed, because the factorization of xn − 1 can be examined more systematically than the
corresponding numerical factorization of q n −1, theoretical existence can be treated initially
more effectively by means of an additive rather than a multiplicative sieve. Naturally, the
resulting arithmetic conditions are more demanding than in the case of (merely) primitive
polynomials and the consequent computations more substantial.
4.2.42 Remark In the study of polynomials which are both primitive and normal, prescribed
first or last coefficients cannot be so easily interchanged via use of reciprocal polynomials
since the monic reciprocal of a normal polynomial is not necessarily normal. Similarly, the
requirements for specifying one coefficient (see Remark 4.2.19) are more difficult. There are,
however, few new difficulties associated with the characteristic (see Remark 4.2.20).
4.2.43 Theorem [1036] Suppose $n \ge 15$ and $1 \le m < n$ and that $a \in \mathbb{F}_q$ (with $a \ne 0$ if $m = 1$).
Then there exists a primitive normal polynomial with $m$-th coefficient $a_m = a$.
4.2.44 Remark Paper [1036] relies on some substantial computations which are only briefly
summarized. Its authors note that the theory extends unchanged to polynomials of smaller
degree and indeed there is a manuscript that claims an extension of Theorem 4.2.43 to
degrees $n \ge 9$, though there are some problems with the details. The problem is sensible
for $n \ge 3$ but the present method cannot currently be applied effectively to small degrees.
4.2.45 Conjecture [1036] Suppose $n \ge 2$ and $1 \le m < n$ and that $a \in \mathbb{F}_q$ (with $a \ne 0$ if $m = 1$).
Then there exists a primitive normal polynomial with $a_m = a$, except when $(q, n, m, a)$
takes any of the values
(2, 3, 2, 1), (2, 4, 2, 1), (2, 4, 3, 1), (2, 6, 3, 1), (3, 4, 2, 2), (5, 3, 4, 3), (4, 3, 2, 1 + γ),
4.2.54 Remark Further constraints on primitive normal polynomials with prescribed coefficients
could lead to new areas of research as in the illustrations which follow.
4.2.55 Definition An element α of Fqn is completely normal if it generates over any intermediate
field Fqd (where d|n) a normal basis of Fqn over Fqd . The minimal polynomial of α over
Fq is a completely normal polynomial.
4.2.56 Conjecture [2157] For every n there exists a primitive completely normal polynomial.
4.2.57 Remark Major contributions towards establishing this conjecture have been made by
Hachenberger [1391, 1394] (see Section 5.4). Part of the construction involves
trace-compatible sequences. Also the methods employed emphasize algebraic structure rather
than character sums and lead to useful lower bounds on the number of constructed
polynomials.
4.2.58 Problem For which pairs (q, n) does there exist a primitive completely normal polynomial
with prescribed (non-zero) trace and/or (primitive) norm?
4.2.59 Remark Although the (monic) reciprocal of a primitive polynomial is primitive, the
reciprocal of a normal polynomial need not be normal.
4.2.60 Definition A strong primitive normal polynomial f is such that both f and its monic
reciprocal are primitive normal polynomials.
4.2.61 Theorem [694] For every pair (q, n) there exists a strong primitive normal polynomial,
except when
(q, n) = (2, 3), (2, 4), (3, 4), (4, 3), (5, 4).
4.2.62 Problem For which pairs (q, n) does there exist a strong primitive normal polynomial with
prescribed (non-zero) trace and/or (primitive) norm?
See Also
References Cited: [198, 627, 675, 682, 683, 684, 686, 691, 692, 693, 694, 695, 698, 700, 701,
702, 1029, 1030, 1031, 1032, 1033, 1034, 1035, 1036, 1037, 1391, 1394, 1411, 1412, 1416,
1557, 1637, 1899, 2104, 2157, 2658]
4.3.1 Definition As in Definition 4.1.10, denote by Wn (q) the minimal weight (or number of
nonzero coefficients) of a primitive polynomial of degree n over Fq .
4.3.2 Remark Hansen and Mullen [1416] list for each prime $p < 100$ and each degree $n$, with
$p^n < 10^{50}$, a primitive polynomial over $\mathbb{F}_p$ of weight $W_n(p)$. Earlier, Stahnke [2699] had
listed a primitive polynomial over $\mathbb{F}_2$ for each degree $n \le 168$ of weight $W_n(2)$. In every
case Wn (p) = 3 or 5. Though Definition 4.3.1 makes sense for any prime power q, most
relevant literature relates to the binary field F2 . Because of numerous applications there
is particular interest in primitive polynomials of low or minimal weight. When n ≥ 2,
Wn (2) ≥ 3: thus primitive trinomials (failing which pentanomials) are especially sought.
In this research area there are few “theorems,” most work being empirical, heuristic or
conjectural. References are scattered in journals in diverse fields. The citations given here
are selective and incomplete.
4.3.3 Theorem [2634] (See Remark 4.1.11) For any sufficiently large prime $p$,
$$W_n(p) \le \frac{n+1}{2}.$$
4.3.4 Theorem [2634] (See Remark 4.1.11) $W_n(2) \le \frac{n}{4} + o(1)$.
4.3.5 Conjecture [1302, 2186] For all n, Wn (2) ≤ 5.
4.3.6 Conjecture [1302, 2186] For infinitely many values of n, Wn (2) ≤ 3.
4.3.7 Remark Progress on Conjectures 4.3.5 and 4.3.6 may be difficult. The next conjecture has
a less ambitious goal.
4.3.8 Conjecture [1302] There is a positive integer m such that for infinitely many values of n,
Wn (2) ≤ m.
4.3.9 Remark When $2^n - 1$ is a (Mersenne) prime any irreducible polynomial of degree $n$ over $\mathbb{F}_2$
is primitive. This means that polynomials of these degrees and small weight are valuable,
because primitivity can be tested (or at least ruled out) with the aid of theorems such as
Swan's theorem for trinomials from [2753]. See also Theorem 3.3.25.
4.3.10 Theorem [408, 2753] Let $n > s > 0$ and assume $n + s$ is odd. Then the trinomial $x^n + x^s + 1 \in \mathbb{F}_2[x]$
has an even number of irreducible factors if and only if one of the following holds:
1. $n$ even, $n \ne 2s$, $ns/2 \equiv 0$ or $1 \pmod{4}$;
2. $n$ odd, $s \nmid 2n$, $n \equiv \pm 3 \pmod{8}$;
3. $n$ odd, $s \mid 2n$, $n \equiv \pm 1 \pmod{8}$.
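Swan's criterion can be compared with an explicit factor count for small degrees. The sketch below is illustrative only: polynomials over $\mathbb{F}_2$ are encoded as integers (bit $i$ is the coefficient of $x^i$), factors are counted by naive trial division, and the function names are hypothetical. Case 3 is taken with $n \equiv \pm 1 \pmod 8$, as in Swan's original statement.

```python
def poly_divmod(a, b):
    """Quotient and remainder of a by b in F_2[x] (integer encoding)."""
    q, db = 0, b.bit_length()
    while a.bit_length() >= db:
        shift = a.bit_length() - db
        q |= 1 << shift
        a ^= b << shift
    return q, a


def count_irreducible_factors(f):
    """Number of irreducible factors of f over F_2, counted with multiplicity."""
    count, d = 0, 2
    while f > 1:
        # No divisor of degree <= deg(f)/2 is left: the remaining f is irreducible.
        if 2 * (d.bit_length() - 1) > f.bit_length() - 1:
            count += 1
            break
        q, r = poly_divmod(f, d)
        if r == 0:
            f, count = q, count + 1   # the smallest divisor found is irreducible
        else:
            d += 1
    return count


def swan_predicts_even(n, s):
    """Theorem 4.3.10 for x^n + x^s + 1, assuming n > s > 0 and n + s odd."""
    if n % 2 == 0:
        return n != 2 * s and (n * s // 2) % 4 in (0, 1)
    if (2 * n) % s != 0:
        return n % 8 in (3, 5)
    return n % 8 in (1, 7)


if __name__ == "__main__":
    for n in range(2, 12):
        for s in range(1, n):
            if (n + s) % 2 == 1:
                f = (1 << n) | (1 << s) | 1
                even = count_irreducible_factors(f) % 2 == 0
                print(n, s, even, swan_predicts_even(n, s))
```

A trinomial with an even number of factors is necessarily reducible, which is how the theorem is used to discard candidates before any costly irreducibility test.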
4.3.11 Remark A trinomial of degree $n$ over $\mathbb{F}_2$ where $2^n - 1$ is a prime is a Mersenne trinomial (of
degree $n$). To date 47 Mersenne primes are known (numbered in increasing order) as $M_1 = 2$,
$\ldots$, $M_{47} = 43112609$. A total of 30 of these yield primitive trinomials. Zierler [3071] lists
Mersenne trinomials of degrees up to 11213 (corresponding to $M_{23}$) and additional primitive
trinomials have been listed in [406, 1489, 1811]. The "Great Trinomial Hunt," driven by
Brent and Zimmermann [408], parallels GIMPS, the "Great Internet Prime Search" (see
www.mersenne.org). Note that $M_{47}$ is the largest known prime (as of November 2011),
having 12978189 digits. Table 4.3.1 gives a list of all known Mersenne trinomials $x^n + x^s + 1$
with $M_r = 2^n - 1$ and $s < n/2$. Their reciprocals (wherein $s > n/2$) are also primitive.
4.3.12 Remark Lists of primitive Mersenne pentanomials have also been produced, see for
example [1811, 3011]. Those in [3011] have the form $x^n + x^{n-1} + x^m + x^{m-1} + 1$. In particular,
it has been suggested [3011] that, for random number generation, the use of pentanomials
might be preferable to trinomials. Lists of primitive polynomials over $\mathbb{F}_2$ of weights 5, 7, and
9 and all degrees between 9 and 660 occur in [2437]. These possess the quality that the
difference between each pair of consecutive indices is almost the same. This property promotes
the implementation of the generation of linear recurring sequences based on highly
modular devices (ring generators) leading to enhanced performance. On the other hand, there
are cryptographic reasons why primitive polynomials of high weight might be important
especially if all multiples of moderate degree also have high weight [1994].
 r   n          s
 1   2          1
 2   3          1
 3   5          2
 4   7          1, 3
 6   17         3, 5, 6
 8   31         3, 6, 7, 13
10   89         38
12   127        1, 7, 15, 30, 63
13   521        32, 48, 158, 168
14   607        105, 147, 273
15   1279       216, 418
17   2281       715, 915, 1029
18   3217       67, 576
20   4423       271, 369, 370, 649, 1393, 1419, 2098
21   9689       84, 471, 1836, 2444, 4187
24   19937      881, 7083, 9842
26   23209      1530, 6619, 9739
27   44497      8575, 21034
29   110503     25230, 53719
30   132049     7000, 33912, 41469, 52549, 54454
32   756839     215747, 267428, 279695
33   859433     170340, 288477
37   3021377    361604, 1010202
38   6972593    3037958
41   24036583   8412642, 8785528
42   25964951   880890, 4627670, 4830131, 6383880
43   30402457   2162059
44   32582657   5110722, 5552421, 7545455
46   42643801   55981, 3706066, 3896488, 12899278, 20150445
47   43112609   3569337, 4463337, 17212521, 21078848
Table 4.3.1 Mersenne trinomials.
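The first rows of Table 4.3.1 can be reproduced with a few lines of code. The sketch below is illustrative and rests on two stated assumptions: $n$ is a prime for which $2^n - 1$ is prime (so that irreducible implies primitive, as in Remark 4.3.9), and polynomials over $\mathbb{F}_2$ are encoded as integers. For prime $n$, a trinomial $f$ of degree $n$ has no root in $\mathbb{F}_2$, so $f$ is irreducible exactly when $x^{2^n} \equiv x \pmod{f}$. The function names are hypothetical.

```python
def mulmod(a, b, f):
    """Product of a and b modulo f in F_2[x]; a and b must have degree < deg f."""
    n = f.bit_length() - 1
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1              # multiply a by x ...
        if (a >> n) & 1:
            a ^= f           # ... and reduce modulo f
    return result


def is_irreducible_trinomial(n, s):
    """Test x^n + x^s + 1 over F_2 for irreducibility, assuming n is prime."""
    f = (1 << n) | (1 << s) | 1
    r = 2                    # the polynomial x
    for _ in range(n):
        r = mulmod(r, r, f)  # after i steps r = x^(2^i) mod f
    return r == 2


def mersenne_trinomial_exponents(n):
    """All s <= n/2 for which x^n + x^s + 1 is irreducible (n a Mersenne exponent)."""
    return [s for s in range(1, n // 2 + 1) if is_irreducible_trinomial(n, s)]


if __name__ == "__main__":
    for n in (2, 3, 5, 7, 13, 17, 19, 31):
        print(n, mersenne_trinomial_exponents(n))
```

Degrees such as 13 and 19 return an empty list, which matches the absence of rows $r = 5$ and $r = 7$ in the table. The quadratic-time multiplication used here is far too slow for the large exponents at the bottom of the table.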
4.3.13 Example [3011] There is no Mersenne trinomial of degree 61 (M_9) but
x^61 + x^60 + x^46 + x^45 + 1 is a primitive pentanomial.
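The small rows of the table of Mersenne trinomials can be re-derived by machine. The following is a minimal sketch, not the method used to compile the table: since 2^n − 1 is prime, a trinomial x^n + x^s + 1 is primitive as soon as it is irreducible, and for prime n a polynomial without roots in F2 is irreducible exactly when x^(2^n) ≡ x modulo it. Polynomials are encoded as Python integers (bit i is the coefficient of x^i); the function names are illustrative only, and the range 2s ≤ n is used so that the row n = 2 is included.

```python
def mulmod(a, b, f, n):
    """Product of a and b in F_2[x] modulo f, where deg f = n and a < 2^n."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> n) & 1:      # degree reached n: subtract (xor) the modulus
            a ^= f
    return r


def is_irreducible_trinomial(n, s):
    """Test x^n + x^s + 1 over F_2, assuming n is PRIME and 0 < s < n."""
    f = (1 << n) | (1 << s) | 1
    y = 2                      # the polynomial x
    for _ in range(n):         # n successive squarings give x^(2^n) mod f
        y = mulmod(y, y, f, n)
    return y == 2


def mersenne_trinomial_exponents(n):
    """All s with 2s <= n such that x^n + x^s + 1 is irreducible over F_2."""
    return [s for s in range(1, n // 2 + 1) if is_irreducible_trinomial(n, s)]


if __name__ == "__main__":
    # 13, 19 and 61 are Mersenne exponents for which no trinomial is found.
    for n in (2, 3, 5, 7, 13, 17, 19, 31, 61, 89):
        print(n, mersenne_trinomial_exponents(n))
```

For the exponents listed, the output can be compared directly with the corresponding rows of Table 4.3.1 and with Example 4.3.13; larger exponents need far more efficient arithmetic than this sketch provides.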
See Also
References Cited: [406, 408, 1302, 1416, 1489, 1811, 1994, 2186, 2437, 2634, 2699, 2753,
3011, 3071]
4.4.1 Remark There is no known explicit construction of primitive elements (Definition 2.1.38).
For some applications, it suffices to construct elements of sufficiently large order. The current
state of the art on explicit constructions of elements of large order usually proves lower
bounds for their orders which are much smaller than the expected actual order.
4.4.3 Remark If α ≠ 0, 1 is an element of a finite field of characteristic p, then
deg α < ord α ≤ p^{deg α} − 1.
4.4.4 Theorem There exists an absolute constant c > 0 and, for every ε > 0, there exists a
δ > 0, such that the following hold whenever α ≠ 0, 1 is an element of a finite field of
characteristic p with deg α = n.
1. [1241, 1243] If ord α = n + 1 then ord(1 − α) ≥ exp(c √n).
2. [2884] If ord α < n^{2−ε} then ord(1 − α) ≥ exp(c n^δ).
4.4.5 Remark If α is as in Theorem 4.4.4 Part 1, then n + 1 is prime and p is a primitive root
modulo n + 1. Likewise, the possible values of n in Theorem 4.4.4 Part 2 are restricted.
4.4.6 Remark Similar, but weaker, results can be proved about the order of R(α), R ∈ Fp (x) or
even β, F (α, β) = 0, F (x, y) ∈ Fp [x, y], with α as in the previous theorem [2884].
4.4.7 Remark Poonen conjectured (as part of a more general conjecture, see [2884]) that, with
notation as in the previous theorem, max{ord α, ord(1 − α)} ≥ exp(cn). A special case was
also conjectured by Cheng [611].
4.4.8 Theorem [611] Let α satisfy α^m = g where m | (q − 1) and let g be a primitive element in
F_q. Then deg α = m deg g and ord(1 − α) ≥ exp(cm).
4.4.9 Theorem [1173] Given an integer n and a prime p, let m = ⌈log_p n⌉. If g ∈ F_p[x], deg(g) ≤
2m, is such that x^{p^m} − g(x) has an irreducible factor of degree n, then any root of this factor
in F_{p^n} has order at least exp((log n)^2 / log log n).
4.4.10 Remark A search for the polynomial g satisfying the hypotheses in the above theorem can
be done in time polynomial in n log p. It is conjectured that such a polynomial exists for all
p, n.
4.4.11 Remark An improvement on the bounds of [1173] was given in [713].
See Also
References Cited: [54, 465, 611, 612, 713, 1173, 1241, 1242, 1243, 2884, 2886]
5
Bases
As noted in Section 2.1, Fqm may be viewed as a vector space of dimension m over
Fq , and therefore has a basis (in fact, many bases) over Fq . Some fundamental definitions
and results on bases were already given there. In particular, Theorem 2.1.93 and Corollary
2.1.95 provide criteria when a set {α1 , . . . , αm } of elements in Fqm forms a basis over Fq .
The present section provides a more detailed treatment of the general theory of bases,
the unifying theme being the notion of duality introduced in Definition 2.1.100. Proofs for
most of the results in this section can be found in [1631]; however, a somewhat different
notation is used there.
5.1.1 Definition Two ordered bases {α_1, . . . , α_m} and {β_1, . . . , β_m} of F = F_{q^m} over K = F_q are
dual (or complementary) if Tr_{F/K}(α_i β_j) = δ_ij, where δ_ij = 0 if i ≠ j and δ_ij = 1 if i = j.
An ordered basis is trace-orthogonal if it satisfies Tr_{F/K}(α_i α_j) = 0 whenever j ≠ i; and
it is self-dual if it is dual with itself, that is, if it additionally satisfies Tr_{F/K}(α_i^2) = 1
for all i.
5.1.2 Remark The following simple but fundamental result guarantees the existence of dual bases.
Remark 5.1.4 provides a proof by giving an explicit construction using the well-known
concept of dual bases in linear algebra.
5.1.3 Theorem For every ordered basis B = {α1 , . . . , αm } of F = Fqm over K = Fq , there is a
uniquely determined dual ordered basis B ∗ = {β1 , . . . , βm }.
5.1.4 Remark As F is a finite-dimensional vector space over K, it is isomorphic to the dual vector
space F ∗ consisting of all linear transformations from F to K. In Theorem 2.1.84, these
transformations were given in terms of the trace function TrF/K in the form Lβ with β ∈ F .
Then the map L : F → F^∗ with β ↦ L_β is an isomorphism. This allows one to obtain the
following explicit description of the dual basis B ∗ :
5.1.5 Remark One reason for the importance of dual bases is the fact that they provide an easy
way for determining the coordinate representation of arbitrary elements of F , using the
concept of primal and dual coordinates.
5.1.6 Definition Given a dual pair of ordered bases B, B ∗ as above, we use the following notation.
Let
ξ = x1 α1 + · · · + xm αm = (x)1 β1 + · · · + (x)m βm .
Then
rB (ξ) = (x1 , . . . , xm ) and rB ∗ (ξ) = ((x)1 , . . . , (x)m )
are, respectively, the primal coordinates and the dual coordinates of ξ (with respect to
B).
5.1.7 Lemma Let B = {α_1, . . . , α_m} and B^∗ = {β_1, . . . , β_m} be a dual pair of ordered bases of
F = F_{q^m} over K = F_q, let ξ ∈ F, and let r_B(ξ) and r_{B^∗}(ξ) be as in Definition 5.1.6. Then
x_i = Tr_{F/K}(ξ β_i) and (x)_i = Tr_{F/K}(ξ α_i) for i = 1, . . . , m.
5.1.8 Remark The following two results deal with the dual basis of a basis B in the important
special cases where B is either a polynomial basis or a normal basis; see Definitions 2.1.96
and 2.1.98.
5.1.9 Theorem Let θ ∈ F = F_{q^m}, and assume that B = {α_1 = θ, α_2 = θ^q, . . . , α_m = θ^{q^{m−1}}} is a
normal basis for F over K = F_q. Then the dual basis B^∗ = {β_1, . . . , β_m} is likewise normal:
B^∗ = {ζ, ζ^q, . . . , ζ^{q^{m−1}}}, where ζ = β_1.
5.1.10 Remark [1172] The dual normal basis B^∗ can be described explicitly as follows. For
i = 0, . . . , m − 1, put t_i := Tr_{F/K}(θ θ^{q^i}). Let t be the polynomial
t(x) = t_{m−1} x^{m−1} + · · · + t_1 x + t_0,
and let d(x) = d_{m−1} x^{m−1} + · · · + d_1 x + d_0 be the unique polynomial of degree < m
in K[x] satisfying
d(x)t(x) ≡ 1 (mod x^m − 1).
Then B^∗ is the normal basis generated by the element
ζ = d_0 θ + d_1 θ^q + · · · + d_{m−1} θ^{q^{m−1}}.
5.1.13 Theorem [1265, 1298] Let θ be a root of a monic irreducible polynomial f of degree m over
K = F_q, and let B = {1, θ, θ^2, . . . , θ^{m−1}} be the corresponding polynomial basis of F = F_{q^m}
over K. Then the dual basis B^∗ of B is likewise a polynomial basis if and only if f is a
binomial and m ≡ 1 (mod p), where q is a power of the prime p.
5.1.14 Corollary There exists a dual pair of polynomial bases of Fqm over Fq if and only if the
following three conditions are satisfied:
1. m ≡ 1 (mod p);
2. every prime r dividing m also divides q − 1;
3. m ≡ 0 (mod 4) implies q ≡ 1 (mod 4).
5.1.15 Corollary Let B = {1, θ, θ^2, . . . , θ^{m−1}} be a polynomial basis of F_{2^m} over F_2, where
m ≥ 2. Then the dual basis of B cannot be a polynomial basis.
5.1.16 Example There is no dual pair of polynomial bases of F3m over F3 . There exists a dual pair
of polynomial bases of F4m over F4 if and only if m is a power of 3. There exists a dual pair
of polynomial bases of F5m over F5 if and only if m is a power of 16.
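The three statements of Example 5.1.16 follow mechanically from the conditions of Corollary 5.1.14. The sketch below simply evaluates those three conditions for given q and m; the function names are illustrative, and q is assumed to be a prime power (this is not checked).

```python
def prime_factors(n):
    """Distinct prime factors of n in increasing order (trial division)."""
    factors, d = [], 2
    while d * d <= n:
        if n % d == 0:
            factors.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def has_dual_pair_of_polynomial_bases(q, m):
    """Evaluate conditions 1-3 of Corollary 5.1.14 for F_{q^m} over F_q."""
    p = prime_factors(q)[0]                      # characteristic, q = p^k assumed
    if (m - 1) % p != 0:                         # 1. m = 1 (mod p)
        return False
    if any((q - 1) % r != 0 for r in prime_factors(m)):
        return False                             # 2. every prime r | m divides q - 1
    if m % 4 == 0 and q % 4 != 1:                # 3. 4 | m implies q = 1 (mod 4)
        return False
    return True


if __name__ == "__main__":
    for q, bound in ((3, 1000), (4, 100), (5, 5000)):
        good = [m for m in range(2, bound) if has_dual_pair_of_polynomial_bases(q, m)]
        print(q, good)
```

For q = 3 the list is empty, for q = 4 it contains only powers of 3, and for q = 5 only powers of 16, in agreement with the example (the trivial case m = 1 is excluded from the search).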
5.1.17 Remark Lemma 5.1.7 shows that the computation of coordinates is particularly simple when
the basis used is self-dual. This also has important applications in the design of hardware
implementations for the multiplication in finite (extension) fields; see, for instance, Sections
4.1, 4.4, and 5.5 of [1631]. Unfortunately, a self-dual basis does not always exist, which
motivates considering a slightly weaker notion.
5.1.18 Theorem [2583] There exists a self-dual basis of F_{q^m} over F_q if and only if either q is even
or both q and m are odd.
5.1.20 Theorem [1635] There always exists an almost self-dual basis of Fqm over Fq .
5.1.21 Remark If there exists a self-dual basis for F over K, there are many such bases. All these
bases are related by suitable transformations, namely via orthogonal matrices; see Definition
13.2.35. The number of such matrices is given in Theorem 13.2.37, which implies an explicit
formula for the number of self-dual bases.
5.1.22 Lemma [1635] Let B = {α_1, . . . , α_m} be a self-dual ordered basis of F = F_{q^m} over K = F_q,
and let A = (a_ij) be an invertible m × m matrix over K. Then the ordered basis B′ =
{β_1, . . . , β_m} with
β_i = \sum_{j=1}^{m} a_ij α_j for i = 1, . . . , m
where
ε_i = 1 if i is even, ε_i = 0 if i is odd,
and
γ = 2 if q is even, γ = 1 if q and m are odd, γ = 0 otherwise.
5.1.24 Theorem [1573] For m ≥ 2, there does not exist a self-dual polynomial basis of Fqm over Fq .
5.1.25 Remark In contrast to Theorem 5.1.24, self-dual normal bases often exist; see Section 5.2.
5.1.26 Remark For computational purposes, it would be helpful to have a self-dual polynomial
basis. However, Theorem 5.1.24 excludes this possibility. Fortunately, there are weaker no-
tions which are still useful for hardware implementations; one such notion is discussed in
the present subsection.
5.1.27 Definition A basis {α_1, . . . , α_m} of F = F_{q^m} over K = F_q with associated dual basis
B^∗ = {β_1, . . . , β_m} is weakly self-dual if there exist an element δ ∈ F^∗ and a permutation
σ of {1, . . . , m} such that the following condition holds:
β_i = δ α_{σ(i)} for i = 1, . . . , m. (5.1.5)
5.1.28 Remark For computational purposes, weakly self-dual polynomial bases are quite attractive,
as they lead to rather simple transformations between dual and primal coordinates. Consider
some element ξ ∈ F, with primal and dual coordinates as in Definition 5.1.6. Using Equation
(5.1.5), one obtains
ξδ = \sum_{j=1}^{m} x_{σ(j)} β_j. (5.1.6)
Thus the dual coordinates of the product ξδ arise by simply permuting the primal
coordinates of ξ according to σ.
5.1.29 Remark The observation in Remark 5.1.28 is of particular interest for hardware implemen-
tations of the multiplication in F if a polynomial basis B is used. Consider a further element
η ∈ F , given in primal coordinates (y1 , . . . , ym ), and write π = ξη. Then one may design a
simple hardware device – called a dual basis multiplier with respect to B – which computes
the product πδ = (ξη)δ = (ξδ)η in dual coordinates from ξδ in dual coordinates (which, ac-
cording to Equation (5.1.6), are obtained from the primal coordinates of ξ by just applying
the permutation σ) and η in primal coordinates. Using Equation (5.1.6) for π instead of ξ,
the primal coordinates of the product π can then be obtained by simply permuting the dual
coordinates of πδ according to σ −1 , so that all required coordinate transformations reduce
to permutations. Moreover, the permutations arising are very simple ones; see Theorem
5.1.30 below. In the particularly important binary case, this allows the hardware design of
efficient dual basis multipliers in many instances of practical interest; see [1631] for more
details and for examples of dual basis multipliers.
5.1.30 Theorem [1298, 2936] A polynomial basis B = {1, θ, θ^2, . . . , θ^{m−1}} of F_{q^m} over F_q is weakly
self-dual if and only if the minimal polynomial f of θ is either a trinomial with constant
term −1 or a binomial. Moreover, for i = 1, . . . , m,
δ = β_k = 1/(θ^k f′(θ)) and σ(i) := k − i + 1 (mod m) if f(x) = x^m + a x^k − 1,
and
δ = 1/f′(θ) = 1/(m θ^{m−1}) and σ(i) := 1 − i (mod m) if f(x) = x^m − a.
5.1.31 Corollary Let B = {1, θ, θ2 , . . . , θm−1 } be a polynomial basis of F = Fqm over Fq . Then
the transformation from the primal coordinates of an arbitrary element ξ ∈ F to the dual
coordinates of the element ξδ (for a suitable constant element δ of F ) is just a permutation
if and only if the minimal polynomial of θ is either a trinomial with constant term −1 or a
binomial.
5.1.32 Remark As the binary case is of particular practical importance, we state some results for
this special case explicitly.
5.1.33 Theorem Let f(x) = x^m + x^k + 1 be an irreducible trinomial over F_2, let θ be a root of
f, and B^∗ = {β_1, . . . , β_m} the dual basis of the polynomial basis B = {1, θ, θ^2, . . . , θ^{m−1}}
generated by θ. Then B is weakly self-dual, and Equation (5.1.5) is satisfied with
δ = β_k = 1/(θ^k f′(θ)) and σ(i) := k − i + 1 (mod m) for i = 1, . . . , m.
5.1.34 Example For m = 6, we may use a root θ of the cyclotomic polynomial Φ_9(x) = x^6 + x^3 + 1
over F_2 to generate F = F_{2^6}. Then the polynomial basis B = {1, θ, θ^2, θ^3, θ^4, θ^5} is weakly
self-dual with δ = 1/(θ^3 θ^2) = θ^4 and
σ = ( 1 2 3 4 5 6
      3 2 1 6 5 4 ).
B^∗ = {θ^6 = θ^3 + 1, θ^5, θ^4, θ^9 = 1, θ^8 = θ^5 + θ^2, θ^7 = θ^4 + θ}.
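The data of Example 5.1.34 and the dual basis displayed above can be checked numerically. The following sketch models F_{2^6} = F_2[x]/(x^6 + x^3 + 1) with elements stored as integers (bit i is the coefficient of θ^i); this encoding and the helper names are choices made for the illustration only. It builds β_i = δ α_{σ(i)} with δ = θ^4 and the permutation σ of the example, and tests the duality condition Tr(α_i β_j) = δ_ij of Definition 5.1.1.

```python
M = 6
MODULUS = 0b1001001          # x^6 + x^3 + 1, the cyclotomic polynomial Phi_9 over F_2


def gf_mul(a, b):
    """Multiply two elements of F_2[x]/(x^6 + x^3 + 1)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> M) & 1:
            a ^= MODULUS
    return r


def gf_pow(a, e):
    """Square-and-multiply exponentiation."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


def trace(a):
    """Absolute trace a + a^2 + ... + a^(2^5); the result is 0 or 1."""
    t = 0
    for _ in range(M):
        t ^= a
        a = gf_mul(a, a)
    return t


theta = 0b10
primal = [gf_pow(theta, i) for i in range(M)]          # alpha_1..alpha_6 = 1, theta, ..., theta^5
delta = gf_pow(theta, 4)
sigma = [3, 2, 1, 6, 5, 4]                              # sigma(i) = k - i + 1 (mod 6), k = 3
dual = [gf_mul(delta, primal[s - 1]) for s in sigma]    # beta_i = delta * alpha_sigma(i)
gram = [[trace(gf_mul(a, b)) for b in dual] for a in primal]

if __name__ == "__main__":
    for row in gram:
        print(row)
```

The matrix `gram` is the 6 × 6 identity precisely when the two bases are dual, and the entries of `dual` coincide with θ^6, θ^5, θ^4, θ^9, θ^8, θ^7 as listed.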
5.1.37 Remark In cases where no weakly self-dual polynomial basis can exist, further general-
izations of the notion of weak self-duality are useful. We begin with some results for the
binary case; proofs for these results can be found in [1631]. Consider a polynomial basis
B = {1, θ, θ2 , . . . , θm−1 } of F2m over F2 and a scalar multiple C = B ∗ /δ of its dual basis
B ∗ . According to Definition 5.1.27, B is a weakly self-dual basis (with respect to the given
value of δ) if and only if the coordinate transformation from C to B is just a permutation;
equivalently, the matrix S associated with this change of basis has to be a permutation
matrix. If no weakly self-dual basis exists, one wants to find a polynomial basis B and a
suitable element δ for which the associated transformation matrix S is as simple as possible,
meaning that S should have the smallest possible number of non-zero entries. This leads to
the following definition.
5.1.38 Definition The weight w(S) of an invertible m × m matrix S is the number of non-zero
entries of S. As the minimum weight is always at least m, one defines the excess of S
as e(S) = w(S) − m.
5.1.39 Remark In the hardware design of dual basis multipliers over F2 , one wants to use an
irreducible polynomial for which the matrix S associated with this change of basis has the
smallest possible excess, as this turns out to be the number of XOR-gates required for
computing the primal coordinates from the generalized dual coordinates, that is, from the
coordinates with respect to C = B ∗ /δ; see, for instance, [1631, Section 4.5]. Weakly self-dual
bases correspond to the smallest possible case, namely e(S) = 0.
By Theorem 5.1.12, the dual basis B^∗ = {β_1, . . . , β_m} of the polynomial basis B defined by θ
is obtained by dividing the coefficients of g by f′(θ) = θ^2. Choosing δ = (θ^3 f′(θ))^{−1} = 1/θ^5
gives B^∗/δ = {γ_1, . . . , γ_m}, where
γ_1 = θ^2, γ_2 = θ, γ_3 = 1 + θ^2, γ_4 = θ^3 + θ^7, γ_5 = θ^6, γ_6 = θ^5, γ_7 = θ^4, γ_8 = θ^3.
Thus the transformation matrix S from generalized dual to primal coordinates has excess 2
in this case, giving indeed a quite simple coordinate transformation. This heavily depends on
the choice of δ; for instance, the choice δ = f′(θ)^{−1} = 1/θ^2 would lead to a transformation
matrix with excess 16.
5.1.41 Theorem [2718] Let θ be a root of an irreducible polynomial f of degree m over K = F_2,
and let B = {1, θ, θ^2, . . . , θ^{m−1}} be the corresponding polynomial basis of F_{2^m} over K, and
B^∗ the dual basis of B. Write
f(x) = x^m + x^{m_t} + · · · + x^{m_1} + 1, where 0 < m_1 < · · · < m_t < m,
and put
δ = 1/(f′(θ) θ^{m_s}), where s = ⌈t/2⌉.
Then the transformation matrix S from C = B^∗/δ to B has excess
e(S) = \sum_{j=s+1}^{t} m_j − \sum_{j=1}^{s−1} m_j.
and let B be a corresponding polynomial basis of F_{2^m} over F_2, and B^∗ the dual basis of B.
Put δ = (f′(θ) θ^{m_2})^{−1}. Then the transformation matrix S from C = B^∗/δ to B has excess
e(S) = m_3 − m_1.
5.1.43 Corollary The binary irreducible polynomials leading to a transformation matrix of excess
2 are precisely the irreducible pentanomials of the special form x^m + x^{k+1} + x^k + x^{k−1} + 1.
5.1.44 Theorem A binary irreducible polynomial leads to a transformation matrix of excess 1 if
and only if it has the form xm + x + 1, where m is even.
5.1.45 Remark It is also of interest to investigate the possible spectra of the number of elements
of trace 1 in a polynomial basis for F2m over F2 . This problem was first considered in [51]
where it was noted that using a polynomial basis with a small number of elements of trace
1 is desirable, since it allows a particularly efficient implementation of the trace function.
For example, this is important for halving a point on an elliptic curve over F2m , and for
generating pseudo random sequences using elliptic curves. In particular, it is of interest to
find trinomials and pentanomials associated with bases with a small number of elements of
trace 1. We mention one striking result in this direction; more results on the trace spectra
of polynomial bases can be found in [47, 51, 2646].
5.1.46 Theorem Suppose that there exists an irreducible trinomial of degree m over F2 . Then
there also exists an irreducible trinomial such that the corresponding polynomial basis for
F2m over F2 contains exactly one element with trace 1.
5.1.47 Remark In this subsection, we present some results on polynomial bases corresponding
to matrices with small excess over a general field Fq . These results are taken from [2158].
For q ≠ 2, a matrix of excess 0 is not necessarily a permutation matrix. This leads to the
following generalization of weakly self-dual bases.
5.1.48 Definition A basis {α_1, . . . , α_m} of F = F_{q^m} over K = F_q with associated dual basis
B^∗ = {β_1, . . . , β_m} is almost weakly self-dual if there exist an element δ ∈ F^∗, elements
c_1, . . . , c_m ∈ K^∗, and a permutation σ of {1, . . . , m} such that the following condition
holds:
β_i = c_i δ α_{σ(i)} for i = 1, . . . , m. (5.1.7)
5.1.49 Remark The almost weakly self-dual bases are precisely those bases corresponding to a
transformation matrix S with e(S) = 0. An almost weakly self-dual basis is actually weakly
self-dual if and only if c_1 = · · · = c_m.
5.1.50 Theorem A polynomial basis B = {1, θ, θ2 , . . . , θm−1 } of Fqm over Fq is almost weakly
self-dual if and only if the minimal polynomial f of θ is either a trinomial or a binomial.
5.1.51 Remark There are no proper almost weakly self-dual bases which belong to irreducible
binomials: in this case, the basis is actually weakly self-dual by Theorem 5.1.30. Hence it
suffices to consider the case of irreducible trinomials in Theorem 5.1.52; in the special case
where d = 1, one recovers Theorem 5.1.30.
5.1.52 Theorem Let f(x) = x^m − a x^k − d be an irreducible trinomial over F_q, let θ be a root of
f, and B^∗ = {β_1, . . . , β_m} the dual basis of the polynomial basis B = {1, θ, θ^2, . . . , θ^{m−1}}
generated by θ. Then B is almost weakly self-dual, and Equation (5.1.7) is satisfied with
δ = β_k = d/(θ^k f′(θ)), c_i = 1 if i ≤ k, c_i = d^{−1} if i > k,
and
σ(i) := k − i + 1 (mod m) for i = 1, . . . , m.
5.1.53 Theorem Let θ be a root of a monic irreducible polynomial f of degree m over K = F_q,
and let B = {1, θ, θ^2, . . . , θ^{m−1}} be the corresponding polynomial basis of F_{q^m} over K, and
B^∗ the dual basis of B. Write
f(x) = x^m + f_{m_t} x^{m_t} + · · · + f_{m_1} x^{m_1} + f_{m_0} x^{m_0}, where 0 = m_0 < m_1 < · · · < m_t < m,
and put
δ = 1/(f′(θ) θ^{m_s}), where s = ⌈t/2⌉.
Then the transformation matrix S from C = B^∗/δ to B has excess
e(S) = \sum_{j=s+1}^{t} m_j − \sum_{j=1}^{s−1} m_j if t is odd,
e(S) = \sum_{j=s+1}^{t} m_j − \sum_{j=1}^{s} m_j if t is even.
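The excess in Theorem 5.1.53 depends only on the exponents m_1 < · · · < m_t of the middle terms of f. As a small sketch (the function name and the list-of-exponents input are choices made for this illustration), the formula can be evaluated as follows; the degree m and the constant term m_0 = 0 are not passed in, since they do not enter the sums.

```python
def excess(middle_exponents):
    """e(S) of Theorem 5.1.53 for f = x^m + (terms x^{m_t}, ..., x^{m_1}) + f_0.

    middle_exponents lists m_1, ..., m_t (any order, without 0 and m).
    """
    ms = sorted(middle_exponents)        # ms[j - 1] is m_j
    t = len(ms)
    s = (t + 1) // 2                     # s = ceil(t / 2)
    upper = sum(ms[s:])                  # sum of m_j for j = s+1, ..., t
    lower_terms = s - 1 if t % 2 == 1 else s
    lower = sum(ms[:lower_terms])        # sum of m_j for j = 1, ..., s-1 (or s)
    return upper - lower


if __name__ == "__main__":
    print(excess([]))          # binomial x^m - a
    print(excess([3]))         # trinomial, e.g. x^6 + x^3 + 1
    print(excess([2, 3, 4]))   # pentanomial x^8 + x^4 + x^3 + x^2 + 1
```

Binomials and trinomials give excess 0, consistent with Theorem 5.1.50, and a pentanomial with middle exponents k − 1, k, k + 1 gives excess 2, consistent with Corollary 5.1.43.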
5.1.55 Remark Many aspects of the study of various types of bases for Fqm over Fq are to a large
extent motivated by the hardware design of efficient multipliers for Fqm . The seminal paper
in this area is due to Berlekamp [234]. Another seminal idea – which motivated the study
of optimal and low complexity normal bases, see Section 5.3 – was contained in a 1981
US patent application by Massey and Omura; see “Computational Method and Apparatus
for Finite Field Arithmetic,” US Patent No. 4,587,627, 1986. As already mentioned, intro-
ductory examples and some references can be found in Sections 4.1, 4.4, and 5.5 of [1631].
There is an abundance of papers in this area, largely due to its importance for hardware
architectures for public key cryptography; the interested reader should consult the relevant
sections of the extensive survey [208]. A more recent survey concerning the special topic of
polynomial basis multipliers in the binary case is given in [982].
See Also
References Cited: [47, 51, 208, 234, 982, 1172, 1265, 1298, 1573, 1631, 1635, 2158, 2583,
2646, 2718, 2936]
We present basic results on normal and self-dual normal elements, and we give a unified
approach following the PhD thesis [1172]. Let p be a prime and let q be a power of p. Denote
by σ the Frobenius map of F_{q^n}:
σ : α ↦ α^q, α ∈ F_{q^n}.
Then
σ^i(α) = α^{q^i}, for all i ≥ 0,
and σ^n = 1 as a map on F_{q^n} (since σ^n(α) = α^{q^n} = α for all α ∈ F_{q^n}). The Galois group
of Fqn over Fq consists of the n maps σ i , 0 ≤ i ≤ n − 1. Recall from Definition 2.1.98 that
an element α ∈ Fqn is a normal element over Fq if the conjugates σ i (α), 0 ≤ i ≤ n − 1,
are linearly independent over Fq . Hence a normal basis for Fqn over Fq is of the form
{α, σ(α), . . . , σ n−1 (α)}, where α ∈ Fqn is normal over Fq . Also, an irreducible polynomial
f ∈ Fq [x] of degree n is an N-polynomial (or normal polynomial ) if its roots are linearly
independent over Fq , that is, its roots form a normal basis of Fqn over Fq .
5.2.1 Theorem (Normal basis theorem) For every prime power q and every integer n ≥ 1, Fqn
has a normal basis over Fq .
5.2.2 Remark The normal basis theorem was proved first by Hensel [1486]. A more general
normal basis theorem for any finite Galois extension of an arbitrary field was proved by
Noether [2297] and Deuring [824].
5.2.3 Proposition
1. For any two integers m, n ≥ 1 and for any normal element α ∈ Fqmn over Fq , the
element
β = TrFqmn /Fqn (α),
the trace of α from Fqmn to Fqn , is a normal element of Fqn over Fq .
2. If gcd(m, n) = 1, then any normal element α ∈ Fqn over Fq is still normal in
Fqmn over Fqm .
3. If gcd(m, n) = 1, then for any normal elements α ∈ Fqn and β ∈ Fqm over Fq ,
the product αβ is a normal element of Fqmn over Fq .
5.2.5 Remark
1. An element α ∈ Fqn is normal over Fq if and only if every β ∈ Fqn is equal to
f (σ) ◦ α for some f ∈ Fq [x].
2. For any f, g ∈ Fq [σ], we have
(f g) ◦ α = f ◦ (g ◦ α).
3. For any nonzero polynomial f(σ) = \sum_{i=0}^{m} c_i σ^i ∈ F_q[σ] with m < n, f is not equal
to the zero map on F_{q^n}. In fact, if f ◦ α = 0 for all α ∈ F_{q^n}, then the polynomial
\sum_{i=0}^{m} c_i x^{q^i} has q^n zeros in F_{q^n}, which is more than the degree q^m. Hence x^n − 1
is the minimal polynomial of σ and
F_q[σ] ≅ F_q[x]/(x^n − 1)
C(f ) = (aj−i )
n = em, e = p^v,
where the g_i(x)'s are distinct monic irreducible factors of x^m − 1 in F_q[x]. Let E_i(x) ∈ F_q[x]
for 1 ≤ i ≤ r so that, with 1 ≤ j ≤ r,
R_i ≅ F_q[x]/(g_i(x)^e), 1 ≤ i ≤ r,
and
F_q[σ] = R_1 + R_2 + · · · + R_r, (5.2.2)
is a direct sum of subrings. For 1 ≤ i ≤ r, let
Also, V_i has dimension e · deg(g_i(x)) over F_q. Let W_i be the subspace of V_i annihilated by
g_i(x)^{e−1}, 1 ≤ i ≤ r, that is
W_i = {α ∈ V_i : g_i(σ)^{e−1} ◦ α = 0}.
5.2.7 Theorem [2395, 2580] With the notation as in the above remark, we have that
Fqn = V1 + V2 + · · · + Vr
where, Φq is the Euler Phi function for polynomials, see Definition 2.1.111.
5.2.9 Corollary Let n be a power of p, the characteristic of F_q. Then
1. an element α ∈ F_{q^n} is normal over F_q if and only if Tr_{F_{q^n}/F_q}(α) ≠ 0;
2. an irreducible polynomial f(x) ∈ F_q[x] of degree n is an N-polynomial if and only
if the coefficient of x^{n−1} in f(x) is nonzero.
5.2.10 Corollary Let n be a prime such that q is primitive modulo n. Then
1. an element α ∈ F_{q^n} is normal over F_q if and only if α ∉ F_q and Tr_{F_{q^n}/F_q}(α) ≠ 0;
2. an irreducible polynomial f(x) ∈ F_q[x] of degree n is an N-polynomial if and only
if the coefficient of x^{n−1} in f(x) is nonzero.
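Both corollaries reduce normality to a trace condition, which is easy to compare against the definition (linear independence of the conjugates) in small fields. The sketch below does this over F_2, assuming the moduli x^4 + x + 1, x^3 + x + 1 and x^5 + x^2 + 1 as concrete models of F_16, F_8 and F_32; elements are integers whose bits are polynomial coefficients, and all helper names are illustrative.

```python
def gf_mul(a, b, n, mod):
    """Multiply in F_2[x]/(mod), where mod has degree n."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> n) & 1:
            a ^= mod
    return r


def conjugates(a, n, mod):
    """The list a, a^2, a^4, ..., a^(2^(n-1))."""
    out = []
    for _ in range(n):
        out.append(a)
        a = gf_mul(a, a, n, mod)
    return out


def trace(a, n, mod):
    """Trace to F_2 as the sum of all conjugates (0 or 1)."""
    t = 0
    for c in conjugates(a, n, mod):
        t ^= c
    return t


def rank_f2(vectors):
    """Rank over F_2 of bit vectors given as integers (xor basis)."""
    basis = []
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)
        if v:
            basis.append(v)
    return len(basis)


def is_normal(a, n, mod):
    """Definition 2.1.98: the n conjugates are linearly independent."""
    return rank_f2(conjugates(a, n, mod)) == n


def check_corollary_5_2_9(n, mod):      # n a power of 2
    return all(is_normal(a, n, mod) == (trace(a, n, mod) != 0) for a in range(1 << n))


def check_corollary_5_2_10(n, mod):     # n prime, 2 primitive modulo n
    return all(is_normal(a, n, mod) == (a > 1 and trace(a, n, mod) != 0)
               for a in range(1 << n))


if __name__ == "__main__":
    print(check_corollary_5_2_9(4, 0b10011))
    print(check_corollary_5_2_10(3, 0b1011), check_corollary_5_2_10(5, 0b100101))
```

Here `a > 1` encodes α ∉ F_2, since the elements 0 and 1 of the prime field are stored as the integers 0 and 1.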
5.2.11 Theorem For any α ∈ Fqn , define
T_α(x) = \sum_{i=0}^{n−1} σ^i(α) x^i ∈ F_{q^n}[x], t_α(x) = \sum_{i=0}^{n−1} t_i x^i ∈ F_q[x],
of normal elements.
1. [1486] α ∈ Fqn is normal over Fq if and only if gcd(Tα (x), xn − 1) = 1 in Fqn [x].
2. [1172] α ∈ Fqn is normal over Fq if and only if gcd(tα (x), xn − 1) = 1 in Fq [x],
that is, if tα (σ) is invertible in Fq [σ].
3. [2385] Suppose that α ∈ Fqn is normal over Fq . Then, for any g(σ) ∈ Fq [σ], the
element β = g(σ) ◦ α is normal over Fq if and only if g(σ) is invertible in Fq [σ],
that is, gcd(g(x), xn − 1) = 1 in Fq [x].
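Part 2 of the theorem gives a normality test that works entirely in F_q[x]. The sketch below tries it over F_2 for a few small degrees, taking t_i = Tr(α σ^i(α)) as in Definition 5.2.21 and comparing the gcd criterion with a direct rank computation on the conjugates. The moduli chosen to model the fields, the integer encoding of polynomials, and the helper names are assumptions of this example.

```python
FIELDS = {3: 0b1011, 4: 0b10011, 5: 0b100101, 6: 0b1000011}   # n -> irreducible modulus


def gf_mul(a, b, n, mod):
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> n) & 1:
            a ^= mod
    return r


def conjugates(a, n, mod):
    out = []
    for _ in range(n):
        out.append(a)
        a = gf_mul(a, a, n, mod)
    return out


def trace(a, n, mod):
    t = 0
    for c in conjugates(a, n, mod):
        t ^= c
    return t


def t_alpha(a, n, mod):
    """Coefficients t_i = Tr(a * sigma^i(a)) packed into an integer polynomial over F_2."""
    t = 0
    for i, c in enumerate(conjugates(a, n, mod)):
        if trace(gf_mul(a, c, n, mod), n, mod):
            t |= 1 << i
    return t


def poly_mod(a, b):
    """Remainder of a modulo b in F_2[x] (b nonzero)."""
    db = b.bit_length()
    while a.bit_length() >= db:
        a ^= b << (a.bit_length() - db)
    return a


def poly_gcd(a, b):
    while b:
        a, b = b, poly_mod(a, b)
    return a


def normal_by_gcd(a, n, mod):
    return poly_gcd((1 << n) | 1, t_alpha(a, n, mod)) == 1    # gcd(x^n - 1, t_alpha) = 1


def normal_by_rank(a, n, mod):
    basis = []
    for v in conjugates(a, n, mod):
        for b in basis:
            v = min(v, v ^ b)
        if v:
            basis.append(v)
    return len(basis) == n


if __name__ == "__main__":
    for n, mod in FIELDS.items():
        agree = all(normal_by_gcd(a, n, mod) == normal_by_rank(a, n, mod)
                    for a in range(1 << n))
        print(n, agree)
```

The gcd test needs only n traces and one Euclidean algorithm in F_2[x], whereas the rank test works with the full coordinate vectors of the conjugates.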
5.2.12 Remark Part 3 above shows again that Φ_q(x^n − 1) is equal to the number of normal elements
in F_{q^n} over F_q. There is a nice formula for Φ_q(x^n − 1) due to Ore [2324]. Suppose n = p^v m
where m is not divisible by p. For a positive integer d dividing m, let τ(d) denote the
multiplicative order of q modulo d, and φ(d) be the Euler φ-function (equal to the number of
integers between 1 and d that are relatively prime to d). Then
Φ_q(x^n − 1) = q^n \prod_{d|m} (1 − 1/q^{τ(d)})^{φ(d)/τ(d)}.
Also, for a general polynomial f ∈ F_q[x] of degree n with r distinct irreducible factors in
F_q[x] with degrees d_1, d_2, . . . , d_r (the degrees need not be distinct), we have
Φ_q(f(x)) = q^n \prod_{i=1}^{r} (1 − 1/q^{d_i}).
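Ore's formula is straightforward to evaluate exactly with rational arithmetic. The following sketch takes τ(d) to be the multiplicative order of q modulo d, for d dividing the p-free part m of n; the function names are illustrative, and q is assumed to be a prime power.

```python
from fractions import Fraction
from math import gcd


def euler_phi(d):
    """Number of integers in 1..d that are coprime to d."""
    return sum(1 for k in range(1, d + 1) if gcd(k, d) == 1)


def mult_order(q, d):
    """Multiplicative order of q modulo d (requires gcd(q, d) = 1)."""
    if d == 1:
        return 1
    k, x = 1, q % d
    while x != 1:
        x = x * q % d
        k += 1
    return k


def smallest_prime_factor(q):
    d = 2
    while q % d:
        d += 1
    return d


def count_normal_elements(q, n):
    """Phi_q(x^n - 1) via Ore's formula: the number of normal elements of F_{q^n} over F_q."""
    p = smallest_prime_factor(q)         # characteristic, q = p^k assumed
    m = n
    while m % p == 0:                    # strip the p-part: n = p^v * m
        m //= p
    total = Fraction(q ** n)
    for d in range(1, m + 1):
        if m % d == 0:
            tau = mult_order(q, d)
            total *= (1 - Fraction(1, q ** tau)) ** (euler_phi(d) // tau)
    assert total.denominator == 1
    return int(total)


if __name__ == "__main__":
    for q, n in ((2, 2), (2, 3), (2, 4), (2, 5), (2, 7), (3, 2)):
        print(q, n, count_normal_elements(q, n))
```

For example, F_16 has 8 normal elements over F_2 (those of trace 1, by Corollary 5.2.9), and F_8 has 3, which form a single normal basis.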
5.2.13 Theorem [1186] For any f(x) ∈ F_q[x] of degree n ≥ 1 with f(0) ≠ 0, we have
Φ_q(f) ≥ q^n / e if n < q,
Φ_q(f) ≥ q^n / (e^{γ + 1/(2(1 + log_q n))} (1 + log_q n)) > q^n / (e^{0.83} (1 + log_q n)) if n ≥ q,
f_1(x) = x^p + x^{p−1} + · · · + x − 1,
f_2(x) = f_1(x^p − x − 1),
f_{k+1}(x) = f_k^∗(x^p − x − 1), k ≥ 2,
where f^∗(x) denotes the reciprocal polynomial of f(x), that is, f^∗(x) = x^d f(1/x) where d
is the degree of f(x); see Definition 2.1.48. Then, for every k ≥ 1, f_k^∗(x) is an N-polynomial
over F_p of degree p^k.
5.2.20 Proposition [309]
1. Suppose q ≡ 1 (mod 4) and let a ∈ Fq be a non-square. Then the polynomial
x^{2^k} − a(x − 1)^{2^k}
5.2.21 Definition For any α, β ∈ F_{q^n}, define the trace polynomial of α and β over F_q to be
t_{α,β}(σ) = \sum_{i=0}^{n−1} Tr(α σ^i(β)) σ^i ∈ F_q[σ].
When α = β, t_{α,α}(σ) is denoted by t_α(σ), which agrees with the definition of t_α in
Theorem 5.2.11, and is the trace polynomial of α over F_q. An element α is dual to β in
F_{q^n} over F_q if t_{α,β}(σ) = 1, and α is self-dual if it is dual to itself.
5.2.22 Remark Note that α ∈ Fqn is normal over Fq if and only if tα (σ) is invertible, while α ∈ Fqn
is self-dual over Fq if and only if tα (σ) = 1. Hence an element α ∈ Fqn is self-dual over Fq
if and only if it generates a self-dual normal basis for Fqn over Fq .
5.2.23 Theorem There is a self-dual normal basis for Fqn over Fq if and only if q and n are odd,
or q is even and n is not divisible by 4.
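For q = 2 the existence criterion can be observed directly in small fields by searching for self-dual elements, that is, elements α with t_α(σ) = 1 (Remark 5.2.22). The sketch below is a brute-force illustration only; the moduli used to model F_{2^n} for n = 2, . . . , 6 and the helper names are assumptions of the example.

```python
FIELDS = {2: 0b111, 3: 0b1011, 4: 0b10011, 5: 0b100101, 6: 0b1000011}


def gf_mul(a, b, n, mod):
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> n) & 1:
            a ^= mod
    return r


def conjugates(a, n, mod):
    out = []
    for _ in range(n):
        out.append(a)
        a = gf_mul(a, a, n, mod)
    return out


def trace(a, n, mod):
    t = 0
    for c in conjugates(a, n, mod):
        t ^= c
    return t


def self_dual_elements(n, mod):
    """All a in F_{2^n} with Tr(a * a) = 1 and Tr(a * sigma^i(a)) = 0 for 0 < i < n."""
    target = [1] + [0] * (n - 1)
    found = []
    for a in range(1, 1 << n):
        t = [trace(gf_mul(a, c, n, mod), n, mod) for c in conjugates(a, n, mod)]
        if t == target:
            found.append(a)
    return found


if __name__ == "__main__":
    for n, mod in FIELDS.items():
        count = len(self_dual_elements(n, mod))
        print(n, count, "self-dual normal bases:", count // n)
```

Since q = 2 is even, Theorem 5.2.23 predicts that the search succeeds for n = 2, 3, 5, 6 and fails for n = 4.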
5.2.24 Remark Imamura and Morii [1574] proved the “only if” part of the theorem, and Lempel
and Weinberger [1890] proved the “if” part. For self-dual normal bases in Galois extensions
of arbitrary fields, see Bayer-Fluckiger and Lenstra [212].
5.2.25 Proposition Suppose gcd(m, n) = 1. Then
1. for any self-dual element α ∈ Fqn over Fq , α remains self-dual in Fqmn over Fqm ;
2. for any self-dual elements α ∈ Fqn over Fq and β ∈ Fqm over Fq , the product αβ
is a self-dual element of Fqmn over Fq .
5.2.26 Lemma Let α ∈ Fqn be any normal element over Fq . For any β = f (σ) ◦ α and γ = g(σ) ◦ α
where f (σ), g(σ) ∈ Fq [σ], we have the following equation in Fq [σ]:
tβ,γ (σ) = f (σ)g(σ −1 )tα (σ).
5.2.27 Proposition Let α ∈ Fqn be any normal element over Fq , and let hα (σ) be the inverse of
tα (σ), that is, hα (x)tα (x) ≡ 1 (mod xn − 1).
1. The element β = hα (σ) ◦ α is dual to α; hence, the dual basis of the normal basis
generated by α is a normal basis and is generated by β.
2. An element β = f (σ) ◦ α in Fqn has a dual if and only if f (σ) is invertible, and
its dual is equal to γ = g(σ) ◦ α where g(σ) = (f (σ −1 )tα (σ −1 ))−1 .
3. An element β = f (σ) ◦ α is self-dual if and only if
f (σ)f (σ −1 ) = hα (σ).
4. Suppose Fqn has a self-dual element α over Fq (so tα (σ) = 1). Then β = f (σ) ◦ α
is self-dual if and only if
f (σ)f (σ −1 ) = 1. (5.2.3)
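Part 1 of Proposition 5.2.27 is effectively an algorithm for the dual of a normal basis: compute t_α, invert it modulo x^n − 1, and apply the inverse to α. The sketch below carries this out in F_16 over F_2, modelled as F_2[x]/(x^4 + x + 1); the inverse h_α is found by exhaustive search, which is only sensible for tiny n, and all names are illustrative.

```python
N = 4
MOD = 0b10011                      # x^4 + x + 1 (assumed model of F_16)


def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> N) & 1:
            a ^= MOD
    return r


def orbit(a):
    """a, sigma(a), ..., sigma^(N-1)(a) with sigma the Frobenius map x -> x^2."""
    out = []
    for _ in range(N):
        out.append(a)
        a = gf_mul(a, a)
    return out


def trace(a):
    t = 0
    for c in orbit(a):
        t ^= c
    return t


def t_poly(a):
    """t_alpha as an integer: bit i is Tr(a * sigma^i(a))."""
    return sum(trace(gf_mul(a, c)) << i for i, c in enumerate(orbit(a)))


def cyc_mul(a, b):
    """Product in F_2[x]/(x^N - 1): xor of cyclic rotations of a."""
    mask = (1 << N) - 1
    r = 0
    for i in range(N):
        if (b >> i) & 1:
            r ^= ((a << i) | (a >> (N - i))) & mask
    return r


def apply_poly(h, a):
    """h(sigma) applied to a: sum of sigma^i(a) over the set bits i of h."""
    r = 0
    for i, c in enumerate(orbit(a)):
        if (h >> i) & 1:
            r ^= c
    return r


def dual_generator(a):
    """beta = h_alpha(sigma) o alpha, or None if t_alpha is not invertible."""
    t = t_poly(a)
    for h in range(1, 1 << N):
        if cyc_mul(h, t) == 1:
            return apply_poly(h, a)
    return None


alpha = next(a for a in range(1, 1 << N) if dual_generator(a) is not None)
beta = dual_generator(alpha)
gram = [[trace(gf_mul(x, y)) for y in orbit(beta)] for x in orbit(alpha)]

if __name__ == "__main__":
    print(bin(alpha), bin(beta))
    for row in gram:
        print(row)
```

The matrix `gram` collects the traces Tr(σ^i(α) σ^j(β)); it is the identity exactly when the normal basis generated by β is dual to the one generated by α.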
C(f)C(f)^t = I,
that is, C(f) is orthogonal. Let oc(n, q) denote the number of orthogonal n × n circulant
matrices over F_q. When F_{q^n} has a self-dual normal basis over F_q, oc(n, q) is the number of
self-dual elements in F_{q^n} over F_q, and oc(n, q)/n is the number of self-dual normal bases in
F_{q^n} over F_q. Suppose x^n − 1 factors as in Equation (5.2.1) where e = p^v. We classify the
irreducible factors g_i(x) into three types:
1. x − 1 and possibly x + 1 (if n is even and q is odd);
2. self-reciprocal factors g_i(x) with roots consisting of pairs {ξ, ξ^{−1}} with ξ ≠ ξ^{−1},
hence g_i(x) = x^d g_i(1/x)/g_i(0) where d is the degree of g_i(x). Suppose there are
s such irreducible factors in Equation (5.2.1) and their degrees are 2d_1, . . . , 2d_s;
3. the remaining irreducible factors, which come in pairs g_i(x) and g̃_i(x) of the
same degree so that, for each root ξ of g_i(x), ξ^{−1} is a root of g̃_i(x), hence
g̃_i(x) = x^d g_i(1/x)/g_i(0). Suppose there are t such pairs of irreducible factors
in Equation (5.2.1) and their degrees are e_1, . . . , e_t.
We order the components of R in Equation (5.2.2) accordingly:
where R0 represents the component for x − 1 plus that for x + 1 if it is a factor. Then the
space of solutions for Equation (5.2.3) is the direct sum of solutions from each component.
This leads to the following theorem.
5.2.29 Theorem [1990] Let n = p^v m, and let the d_i's and e_j's be as defined in the above remark. Then
oc(n, q) = ε_1 \prod_{i=1}^{s} (q^{d_i} + 1) \prod_{j=1}^{t} (q^{e_j} − 1) if v = 0,
oc(n, q) = ε_2 q^{(p^v − 1)m/2} · oc(m, q) if v ≥ 1,
where
ε_1 = 1 if q is even, ε_1 = 2 if q and m are odd, ε_1 = 4 if q is odd and m is even,
and
ε_2 = 1 if p ≠ 2, ε_2 = q^{1/2} if p = 2 and v = 1, ε_2 = 2q^{1/2} if p = 2 and v ≥ 2.
5.2.30 Corollary [130, 258, 1635] If the condition in Theorem 5.2.23 is satisfied, then there are
oc(n, q) self-dual normal elements in Fqn over Fq .
5.2.31 Definition An element α ∈ Fqn is primitive normal over Fq if it is normal over Fq and has
multiplicative order q n − 1. A normal basis generated by a primitive normal element is
a primitive normal basis. A polynomial of degree n in Fq [x] is primitive normal if its
roots are primitive normal in Fqn over Fq .
5.2.32 Theorem (Primitive normal basis theorem) For any prime power q and any integer n ≥ 1,
Fqn has a primitive normal element over Fq .
5.2.33 Remark The primitive normal basis theorem was proved by Carlitz [540] for q n sufficiently
large, by Davenport [775] when q is a prime and by Lenstra and Schoof [1899] in the general
case. For a theoretical proof which does not require any machine calculation, see Cohen and
Huczynska [692]. The next few theorems are further strengthenings of the primitive normal
basis theorem.
5.2.34 Theorem [690] For any prime power q, any integer n ≥ 2 and any nonzero a ∈ F_q, there
is a primitive normal polynomial f(x) = x^n + c_1 x^{n−1} + c_2 x^{n−2} + · · · + c_{n−1} x + c_n ∈ F_q[x]
with c_1 = a.
5.2.35 Theorem [1036] For any prime power q, any integer n ≥ 15, any integer m with 1 ≤
m < n, and any a ∈ F_q (with a ≠ 0 if m = 1), there is a primitive normal polynomial
f(x) = x^n + c_1 x^{n−1} + c_2 x^{n−2} + · · · + c_{n−1} x + c_n ∈ F_q[x] with c_m = a.
5.2.36 Theorem [694, 2806] For any prime power q and any integer n ≥ 2, there is an element
α ∈ F_{q^n} such that both α and α^{−1} are primitive normal over F_q except when (q, n) is one
of the pairs (2, 3), (2, 4), (3, 4), (4, 3) and (5, 4).
5.2.37 Theorem [1035] For any prime power q, any integer n ≥ 7 and any a, b ∈ F_q with a ≠ 0,
there is a primitive normal polynomial f(x) = x^n + c_1 x^{n−1} + c_2 x^{n−2} + · · · + c_{n−1} x + c_n ∈ F_q[x]
with c_1 = a and c_2 = b.
5.2.38 Theorem [1034] For any integer n ≥ 2 and sufficiently large prime power q, there is a
primitive normal polynomial f(x) = x^n + c_1 x^{n−1} + c_2 x^{n−2} + · · · + c_{n−1} x + c_n ∈ F_q[x] with
its first ⌊n/2⌋ coefficients arbitrarily prescribed except that c_1 ≠ 0.
5.2.39 Theorem [1029] For any integer n ≥ 2 and sufficiently large prime power q, there is a
primitive normal polynomial f(x) = x^n + c_1 x^{n−1} + c_2 x^{n−2} + · · · + c_{n−1} x + c_n ∈ F_q[x] with
its last ⌊n/2⌋ coefficients arbitrarily prescribed except that (−1)^n c_n must be a primitive
element of F_q.
See Also
§2.2 For tables of primitive polynomials of various kinds and standards requiring
normal basis arithmetic.
§5.3 For normal bases of low complexity.
§5.4 For completely normal bases.
§16.7 For hardware implementations of finite field arithmetic.
References Cited: [130, 212, 258, 309, 540, 690, 692, 694, 775, 824, 1029, 1034, 1035, 1036,
1172, 1185, 1186, 1486, 1574, 1631, 1635, 1890, 1899, 1990, 2077, 2297, 2324, 2385, 2395,
2580, 2806, 2859].
5.3.1 Definition Let α ∈ F_{q^n} be normal over F_q and let N = (α_0, α_1, . . . , α_{n−1}) be the normal
basis of F_{q^n} over F_q generated by α, where
α_i = α^{q^i}, 0 ≤ i ≤ n − 1.
Let T = (t_ij) be the n × n matrix defined by
α α_i = \sum_{j=0}^{n−1} t_ij α_j, 0 ≤ i ≤ n − 1,
where t_ij ∈ F_q. The matrix T is the multiplication table of the basis N. Furthermore,
the number of non-zero entries of T, denoted by C_N, is the complexity (also called the
density) of the basis N.
5.3.2 Remark An exhaustive search for normal bases of F2n over F2 for n < 40 is given in [2015],
extending previous tables such as those found in [1631]. Using data from the search, the
authors in [2015] indicate that normal bases of F2n over F2 follow a normal distribution
(with respect to their complexities) which is tightly compacted about a mean of roughly
n^2/2. We define low complexity normal bases loosely to mean normal bases known to have
sub-quadratic bounds, with respect to n, on their complexity.
5.3.3 Remark In addition, [2015] gives the minimum-known complexity of a normal basis of F2n
over F2 for many values of n using a variety of constructions that appear in this section.
Further tables on normal bases are provided in Section 2.2.
5.3.4 Proposition [2199] The complexity CN of a normal basis N of Fqn over Fq is bounded by
2n − 1 ≤ CN ≤ n^2 − n + 1.
5.3.5 Definition A normal basis is optimal normal if it achieves the lower bound in Proposi-
tion 5.3.4.
5.3.7 Theorem (Optimal normal basis theorem) [1184] Every optimal normal basis is equivalent
to either a Type I or a Type II optimal normal basis. More precisely, suppose Fqn has an
optimal normal basis over Fq generated by α and let b = Tr(α) ∈ Fq . Then one of the
following must hold:
1. n + 1 is a prime, q is primitive modulo n + 1 and −α/b is a primitive (n + 1)-st
root of unity;
2. q = 2^v with gcd(v, n) = 1, 2n + 1 is a prime such that 2 and −1 generate the
multiplicative group of Z_{2n+1}, and α/b = γ + γ^{−1} for some primitive (2n + 1)-st
root of unity γ.
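The two arithmetic conditions of Theorem 5.3.7 are easy to test by machine. The following Python sketch is illustrative only: the helper names (`is_prime`, `mult_order`, `type1_condition`, `type2_condition`) are our own, and trial division is assumed to be adequate for the small parameters of interest.

```python
from math import gcd

def is_prime(m: int) -> bool:
    """Trial division; adequate for small m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def mult_order(a: int, r: int) -> int:
    """Multiplicative order of a modulo r (assumes gcd(a, r) == 1)."""
    x, e = a % r, 1
    while x != 1:
        x = x * a % r
        e += 1
    return e

def type1_condition(q: int, n: int) -> bool:
    """Condition 1: n + 1 is prime and q is primitive modulo n + 1."""
    r = n + 1
    return is_prime(r) and q % r != 0 and mult_order(q, r) == n

def type2_condition(v: int, n: int) -> bool:
    """Condition 2 for q = 2**v: gcd(v, n) = 1, 2n + 1 is prime,
    and 2 and -1 generate the multiplicative group modulo 2n + 1."""
    r = 2 * n + 1
    if gcd(v, n) != 1 or not is_prime(r):
        return False
    powers = {pow(2, i, r) for i in range(2 * n)}
    generated = powers | {(-x) % r for x in powers}
    return len(generated) == 2 * n

if __name__ == "__main__":
    for n in range(2, 13):
        print(n, type1_condition(2, n), type2_condition(1, n))
```

For q = 2 the loop lists, for each degree n from 2 to 12, whether either condition is met.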
5.3.8 Remark Gao and Lenstra [1184] prove a more general version of the optimal normal basis
theorem. They show that if a finite Galois extension L/K, where K is an arbitrary field,
has an optimal normal basis, say generated by α, then there is a prime number r, an r-th
root of unity γ in some algebraic extension of L and a nonzero constant c ∈ K so that one
of the following holds:
1. α = cγ and L has degree r − 1 over K (so the polynomial x^{r−1} + x^{r−2} + · · · + x + 1 is
irreducible over K);
2. α = c(γ + γ^{−1}) and L has degree (r − 1)/2 over K (so the minimal polynomial
of γ + γ^{−1} over K has degree (r − 1)/2).
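5.3.9 Theorem [308] Let F (x) = x^{q+1} + dx^q − (ax + b) with a, b, d ∈ Fq and b ≠ ad. Let f be an
irreducible factor of F of degree n > 1 and let α be a root of f . Then all the roots of f are
\[ \alpha_i = \alpha^{q^i} = \varphi^i(\alpha), \quad i = 0, 1, \ldots, n-1, \]
where ϕ(x) = (ax + b)/(x + d). If τ = Tr_{Fqn/Fq}(α) ≠ 0, then (α0 , α1 , . . . , αn−1 ) is a normal
basis of Fqn over Fq such that
\[
\alpha
\begin{pmatrix} \alpha_0 \\ \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_{n-1} \end{pmatrix}
=
\begin{pmatrix}
\tau^* & -e_{n-1} & -e_{n-2} & \cdots & -e_1 \\
e_1 & e_{n-1} & & & \\
e_2 & & e_{n-2} & & \\
\vdots & & & \ddots & \\
e_{n-1} & & & & e_1
\end{pmatrix}
\begin{pmatrix} \alpha_0 \\ \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_{n-1} \end{pmatrix}
+
\begin{pmatrix} b^* \\ b \\ b \\ \vdots \\ b \end{pmatrix},
\tag{5.3.1}
\]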
5.3.10 Corollary [308] The following are two special cases of the above theorem.
1. For every a, β ∈ F∗q with Tr_{Fq/Fp}(β) = 1,
\[ x^p - \frac{1}{\beta}\, a x^{p-1} - \frac{1}{\beta}\, a^p \]
is irreducible over Fq and its roots form a normal basis of F_{q^p} over Fq of complexity
at most 3p − 2. This corresponds to the case of Theorem 5.3.9 with n = p, e1 = a,
ϕ(x) = ax/(x + a), b = b∗ = 0, and τ∗ = a/β if p ≠ 2 and τ∗ = a/β − a if p = 2.
x^n − β(x − a + 1)^n
is irreducible over Fq and its roots form a normal basis of Fqn over Fq with
complexity at most 3n − 2. This corresponds to the case of Theorem 5.3.9 with
e1 = a, ϕ(x) = ax/(x + 1), b = b∗ = 0, and τ ∗ = −n(a − 1)β/(1 − β) − , with
given as in Theorem 5.3.9 with d = 1.
5.3.11 Conjecture [3036] If there does not exist an optimal normal basis of Fqn over Fq , then the
complexity of a normal basis of Fqn over Fq is at least 3n − 3.
5.3.12 Remark Explicit constructions of low complexity normal bases beyond the optimal normal
bases and the constructions given in Corollary 5.3.10 are rare. In Section 5.3.2 we give a
generalization of optimal normal bases arising from Gauss periods. Below, we illustrate how
to construct new normal bases of low complexity arising from previously known normal
bases.
5.3.13 Proposition [1172, 2578, 2580] Suppose gcd(m, n) = 1 and α and β generate normal bases
A and B for Fqm and Fqn over Fq , respectively. By Proposition 5.2.3, αβ generates a normal
basis N for Fqmn over Fq . Furthermore, we have CN = CA CB and if α and β both generate
optimal normal bases, then CN = 4mn − 2m − 2n + 1.
5.3.14 Proposition [634] Let n = mk and suppose α ∈ Fqn generates a normal basis
(α0 , α1 , . . . , αn−1 ) over Fq with multiplication table T = (tij ) for 0 ≤ i, j ≤ n − 1. Then
\[ \beta \beta_i = \sum_{j=0}^{m-1} s_{ij} \beta_j, \quad 0 \le i \le m-1, \]
where
\[ s_{ij} = \sum_{0 \le u, v \le k-1} t_{um+i,\, vm+j}, \quad 0 \le i, j \le m-1. \]
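The block sums s_ij are straightforward to compute from a given table. The Python sketch below is a minimal illustration, assuming (as Corollary 5.3.15 suggests) that β is the trace of α into the subfield of degree m, and that table entries are stored as integers modulo a prime p. The function names and the sample table, which is the multiplication table of the Type I optimal normal basis of F16 over F2, are our own choices.

```python
def trace_table(T, m, k, p):
    """Return S = (s_ij) with s_ij = sum of t_{um+i, vm+j} over 0 <= u, v <= k-1,
    reduced modulo the prime p. T is an n x n table with n = m * k."""
    n = m * k
    if len(T) != n or any(len(row) != n for row in T):
        raise ValueError("T must be an (m*k) x (m*k) table")
    return [[sum(T[u * m + i][v * m + j] for u in range(k) for v in range(k)) % p
             for j in range(m)]
            for i in range(m)]

def complexity(T):
    """Number of nonzero entries of a multiplication table."""
    return sum(1 for row in T for entry in row if entry != 0)

if __name__ == "__main__":
    # Table of the normal basis of F_16 over F_2 generated by a primitive
    # 5th root of unity (n = 4, complexity 2n - 1 = 7).
    T = [[0, 1, 0, 0],
         [0, 0, 0, 1],
         [1, 1, 1, 1],
         [0, 0, 1, 0]]
    S = trace_table(T, m=2, k=2, p=2)
    print(S, complexity(S))
```

In this small case the resulting 2 × 2 table has three nonzero entries, which is the optimal value 2m − 1 for m = 2.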
5.3.15 Corollary [633, 634, 1931] Let n = mk. Upper-bounds on the complexity obtained from
traces of optimal normal bases of Fqn over Fq are given in Table 5.3.1.
Table 5.3.1 Upper-bounds on the complexity obtained from traces of optimal normal bases of Fqn
over Fq , where n = mk. ∗ Tight when k = 3; † tight when k = 2, 3.
5.3.16 Definition [139] Let r = nk + 1 be a prime not dividing q and let γ be a primitive r-th
root of unity in F_{q^{nk}}. Furthermore, let K be the unique subgroup of order k in Z∗r and
K_i = {a · q^i : a ∈ K} ⊆ Z∗r be cosets of K, 0 ≤ i ≤ n − 1. The elements
\[ \alpha_i = \sum_{a \in K_i} \gamma^a \in \mathbb{F}_{q^n}, \quad 0 \le i \le n-1, \]
are Gauss periods of type (n, k) over Fq .
5.3.17 Theorem [1180, 2951] Let αi ∈ Fqn be Gauss periods of type (n, k) as defined in Defini-
tion 5.3.16. The following are equivalent:
1. N = (α0 , α1 , . . . , αn−1 ) is a normal basis of Fqn over Fq ;
2. gcd(nk/e, n) = 1, where e is the order of q modulo r;
3. the union of K0 , K1 , . . . , Kn−1 is Z∗r ; equivalently, Z∗r = ⟨q, K⟩.
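Condition 2 of Theorem 5.3.17 gives a quick numerical test for admissibility. The following Python sketch is illustrative; the function names are ours, trial division is assumed sufficient for the sizes involved, and the requirement that r = nk + 1 be a prime not dividing q is taken from Definition 5.3.16.

```python
from math import gcd

def is_prime(m: int) -> bool:
    """Trial division; adequate for small m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def mult_order(a: int, r: int) -> int:
    """Multiplicative order of a modulo r (assumes gcd(a, r) == 1)."""
    x, e = a % r, 1
    while x != 1:
        x = x * a % r
        e += 1
    return e

def is_admissible(q: int, n: int, k: int) -> bool:
    """True if r = nk + 1 is a prime not dividing q and
    gcd(nk / e, n) = 1, where e is the order of q modulo r."""
    r = n * k + 1
    if not is_prime(r) or q % r == 0:
        return False
    e = mult_order(q, r)
    return gcd(n * k // e, n) == 1

def smallest_admissible_k(q: int, n: int, k_max: int = 1000):
    """Smallest k <= k_max giving an admissible type (n, k), or None."""
    for k in range(1, k_max + 1):
        if is_admissible(q, n, k):
            return k
    return None

if __name__ == "__main__":
    for n in range(2, 17):
        print(n, smallest_admissible_k(2, n))
```

The search bound `k_max` is an arbitrary cutoff for the sketch, not a bound taken from the text.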
5.3.18 Remark Gauss periods of type (n, 1) define Type I optimal normal bases and Gauss periods
of type (n, 2) define Type II optimal normal bases when q = 2.
5.3.19 Remark For the remainder of this section, we are concerned with Gauss periods which are
admissible as normal bases, that is, where the properties in Theorem 5.3.17 hold. When
the characteristic p does not divide n, the existence of admissible Gauss periods of type
(n, k) is shown assuming the ERH in [14, 159] for any n with k ≤ (cn)^3 (log(np))^2. For
any k and prime power q, assuming the GRH, there are infinitely many n such that there
is an admissible Gauss period of Fqn over Fq [1236]. In contrast, when p divides n, [2952]
contains necessary and sufficient conditions for admissible Gauss periods, thus showing the
non-existence of admissible Gauss periods in certain cases.
5.3.20 Proposition [1180] There is no admissible Gauss period of type (n, k) over F2 if 8 divides
n.
5.3.22 Proposition [259, 1180] Let N = (α0 , α1 , . . . , αn−1 ) be the normal basis arising from Gauss
periods of type (n, k) for Fqn over Fq . Let j0 < n be the unique index such that −1 ∈ K_{j0},
and let δj = 1 if j = j0 and δj = 0 if j ≠ j0 . Then
\[ \alpha \alpha_i = \delta_i k + \sum_{j=0}^{n-1} c_{ij} \alpha_j, \quad 0 \le i \le n-1, \]
hence CN ≤ (n − 1)k + n.
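The expansion above can be carried out combinatorially, without constructing the field. The Python sketch below is a minimal illustration under stated assumptions: q = p is prime; the coefficient of αj in ααi is taken to be the number of b ∈ Ki with 1 + b ∈ Kj; and the constant k contributed when 1 + b ≡ 0 is rewritten using α0 + · · · + αn−1 = −1. All function names are our own.

```python
from math import isqrt

def gauss_period_table(p: int, n: int, k: int):
    """Multiplication table (entries modulo the prime p) of the normal basis
    of F_{p^n} over F_p formed by Gauss periods of type (n, k).
    Raises ValueError if r = nk + 1 is unsuitable or the type is not admissible."""
    r = n * k + 1
    if r == p or any(r % d == 0 for d in range(2, isqrt(r) + 1)):
        raise ValueError("r = nk + 1 must be a prime different from p")
    # K: the unique subgroup of order k in the cyclic group Z_r^*.
    K = [x for x in range(1, r) if pow(x, k, r) == 1]
    coset = {}  # maps b in Z_r^* to the index i with b in K_i = p^i K
    for i in range(n):
        shift = pow(p, i, r)
        for a in K:
            coset.setdefault(a * shift % r, i)
    if len(coset) != r - 1:
        raise ValueError("cosets do not cover Z_r^*: type (n, k) is not admissible")
    T = [[0] * n for _ in range(n)]
    for b, i in coset.items():
        s = (1 + b) % r
        if s == 0:
            # Contribution k = -k * (alpha_0 + ... + alpha_{n-1}).
            for j in range(n):
                T[i][j] = (T[i][j] - k) % p
        else:
            j = coset[s]
            T[i][j] = (T[i][j] + 1) % p
    return T

def complexity(T) -> int:
    """Number of nonzero entries of a multiplication table."""
    return sum(1 for row in T for entry in row if entry != 0)

if __name__ == "__main__":
    for (n, k) in [(4, 1), (3, 2), (5, 2)]:
        T = gauss_period_table(2, n, k)
        print((n, k), complexity(T), "bound:", (n - 1) * k + n)
```

For the types (4, 1), (3, 2), and (5, 2) over F2 the computed complexity equals 2n − 1, in line with Remark 5.3.18.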
5.3.23 Proposition [139, 259] Let p be the characteristic of Fq and let N = (α0 , α1 , . . . , αn−1 ) be
the normal basis of Fqn over Fq arising from Gauss periods of type (n, k).
1. If p divides k, then CN ≤ nk − 1.
2. If p = 2, then
\[
\begin{cases}
kn - (k^2 - 3k + 3) \le C_N \le (n-1)k + 1 & \text{if } k \text{ even}, \\
(k+1)n - (k^2 - k + 1) \le C_N \le (n-2)k + n + 1 & \text{if } k \text{ odd}.
\end{cases}
\]
Table 5.3.2 Complexities of normal bases from Gauss periods of Type (n, k), 2 ≤ k ≤ 6, n > p.
5.3.26 Remark Let q = 2. Proposition 5.3.13 can be used to create normal bases of large extension
degree by combining normal bases of subfields with coprime degree. By Proposition 5.3.20
admissible Gauss periods of type (n, k) do not exist when 8 divides n. Hence, Proposition 5.3.13
cannot be used to construct low complexity normal bases when the degree is a prime power.
Thus, when n is a prime power (specifically a power of two), there are no constructions of
low-complexity normal bases arising from the above propositions.
5.3.27 Problem Find explicit constructions of low-complexity normal bases of F2n over F2 when
n is a power of two.
5.3.28 Remark Normal bases of low complexity are useful in fast encoding and decoding of network
codes, see [2665] for more details.
5.3.29 Remark Proposition 5.2.20, Theorem 5.3.9, and its corollaries show how the multiplicative
group of Fq or Fq2 can be used to construct irreducible polynomials and normal bases for
those degrees n whose prime factors divide q − 1 or q + 1. Also, Gauss periods use the
multiplicative group of Fqr−1 for some prime r. Couveignes and Lercier [746] show how
these methods can be generalized by using elliptic curve groups. The normal bases from
their construction may not have low complexity, but these bases still allow a fast algorithm
for multiplication. We outline their construction below; for more details on how to perform
fast multiplication using elliptic periods, we refer the reader to [746]. For properties of
elliptic curves, see Section 12.2.
5.3.30 Remark Let E be an elliptic curve over Fq defined by a Weierstrass equation
Y 2 + a1 XY + a3 Y = X 3 + a2 X 2 + a4 X + a6 ,
where ai ∈ Fq . The points of E over every extension of Fq form an additive group with the
point O at infinity as the identity. The order of the group E(Fq ) is q + 1 − t for some integer
t with |t| ≤ 2√q. Let n > 1 be an integer such that E(Fq ) has a cyclic subgroup F of order
n. The quotient E′ = E/F is also an elliptic curve over Fq and there is an isogeny
φ : E → E′,
that has F as its kernel, and φ is defined by rational functions in Fq [X, Y ]. For any point
P ∈ E, let x(P ) denote the x-coordinate of P and similarly denote y(P ), thus
5.3.31 Remark We describe here an explicit formula due to Kohel [1779] for E′ and φ when E is
of the form
E: Y 2 = X 3 + aX + b.
We denote by D the kernel polynomial given by
\[ D(X) = \prod_{Q \in F \setminus \{O\}} (X - x(Q)) \]
Furthermore, E′ is defined by
Y 2 = X 3 + (a − 5v)X + (b − 7w),
where
5.3.32 Definition Let T ∈ E(Fq ) be a point of order n and φ be the corresponding isogeny with
its kernel generated by T . For any point P ∈ E(Fqn ) with φ(P ) ∈ E′(Fq ), let θ(P, T )
denote the slope of the line passing through the two points T and P + T , that is
\[ \theta(P, T) = \frac{y(P + T) - y(T)}{x(P + T) - x(T)} \in \mathbb{F}_{q^n}. \]
5.3.33 Theorem [746] Let T ∈ E(Fq ) be a point of order n ≥ 3 and φ be the corresponding isogeny
with its kernel generated by T . Suppose there is a point P ∈ E(Fqn ) so that nP ≠ O in E
and φ(P ) ∈ E′(Fq ). Then either
1. the elliptic period θ(P, T ) is a normal element of Fqn over Fq if the trace of θ(P, T )
from Fqn to Fq is nonzero, or
2. the element 1 + θ(P, T ) is a normal element of Fqn over Fq if the trace of θ(P, T )
is zero.
5.3.34 Example [746] Consider the following curve over F7
E: y^2 + xy − 2y = x^3 + 3x^2 + 3x + 2.
The point T = (3, 1) ∈ E(F7 ) has order n = 5, so the subgroup F = ⟨T⟩ has order 5. By
Vélu’s formula, the equation for E′ = E/F is
E′: y^2 + xy − 2y = x^3 + 3x^2 − 3x − 1,
f (X) = (X^5 + 2X^2 − 2X − 1) + 3(X^4 + 3X^2 − 3) = X^5 + 3X^4 − 3X^2 − 2X − 3
β = α4756 = −α3 − α2 + 3α + 2,
Hence
\[ \theta(P, T) = \frac{Y(P + T) - Y(T)}{X(P + T) - X(T)} = -\alpha^4 + \alpha^3 + 3\alpha^2 - 3\alpha - 3 \]
is a normal element in F_{7^5} over F7 .
5.3.35 Remark For the definition of dual and self-dual bases, see Definition 2.1.100. Self-dual nor-
mal bases have been well studied due to their efficiency in implementation, see Section 16.7.
A complete treatment of dual bases over finite fields can be found in [1631, Chapter 4], see
also Sections 5.1 and 5.2.
5.3.36 Remark It is computationally easier to restrict an exhaustive search to self-dual normal
bases. Geiselmann in [1263, 1631] computes the minimum complexity for a self-dual normal
basis of F2n over F2 for all n ≤ 47. These computations are repeated for odd degrees n ≤ 45
in [130] and the authors also give tables of minimum complexity self-dual normal bases over
finite fields of odd characteristic and for extensions of F_{2^ℓ}, ℓ > 1. Some additional searches
for self-dual normal bases can be found in [2015].
5.3.37 Proposition [1632] Let N be a normal basis with multiplication table T . Then N is self-dual
if and only if T is symmetric.
5.3.38 Proposition [1632] Let gcd(m, n) = 1. Suppose α and β generate normal bases A and B
for Fqm and Fqn over Fq , respectively. Then γ = αβ generates a self-dual normal basis N
for Fqmn over Fq if and only if both A and B are self-dual, as in Proposition 5.2.3. The
complexity of the basis N is CN = CA CB , as in Proposition 5.3.13.
5.3.39 Proposition [1632, 2422] Let n be even, α ∈ F2n and γ = 1 + α. Then,
1. the element α generates a self-dual normal basis for F2n over F2 if and only if γ
does;
2. if α and γ = 1 + α generate self-dual normal bases B and B̄, respectively, for F2n
over F2 , then the complexities of B and B̄ are related by
C_B̄ = n^2 − 3n + 8 − C_B .
1. the average complexity of a self-dual normal basis of F2n over F2 is (1/2)(n^2 − 3n + 8);
2. if B is a self-dual normal basis for F2n over F2 , we have
2n − 1 ≤ C_B ≤ n^2 − 5n + 9,
and one of the equalities holds if and only if either B or its complement B̄ is
optimal.
5.3.41 Proposition [308] Let q be a power of a prime p. For any β ∈ F∗q with Tr_{Fq/Fp}(β) = 1,
x^p − x^{p−1} − β^{p−1}
is irreducible over Fq and its roots form a self-dual normal basis of F_{q^p} over Fq with
complexity at most 3p − 2. The multiplication table is as in Corollary 5.3.10 with e1 = β,
ei+1 = ϕ(ei ) for i ≥ 1, ϕ(x) = βx/(x + β), τ∗ = 1 if p ≠ 2 and τ∗ = 1 − β if p = 2.
5.3.42 Proposition [308] Let n be an odd factor of q − 1 and let ξ ∈ Fq have multiplicative
order n. Then there exists u ∈ Fq such that (u^2)^{(q−1)/n} = ξ. Let x0 = (1 + u)/n and
x1 = (1 + u)/(nu). Then the monic polynomial
\[ \frac{1}{1 - u^2} \left( (x - x_0)^n - u^2 (x - x_1)^n \right) \]
is irreducible over Fq and its roots form a self-dual normal basis of Fqn over Fq . The
multiplication table is as in Theorem 5.3.9 with a = (x0 − ξx1 )/(1 − ξ), b = −x0 x1 ,
d = a − (x0 − x1 ) and τ = 1.
5.3.43 Proposition [308] Let n be an odd factor of q + 1 and let ξ ∈ Fq2 be a root of x^{q+1} − 1
with multiplicative order n. Then there is a root u of x^{q+1} − 1 such that (u^2)^{(q+1)/n} = ξ. Let
x0 = (1 + u)/n and x1 = (1 + u)/(nu). Then
\[ \frac{1}{1 - u^2} \left( (x - x_0)^n - u^2 (x - x_1)^n \right) \]
is irreducible over Fq and its roots form a self-dual normal basis of Fqn over Fq . The
multiplication table is as in Theorem 5.3.9 with a = (x1 − ξx0 )/(1 − ξ), b = −x0 x1 ,
d = a − (x0 + x1 ) and τ = 1.
5.3.44 Proposition [1180, 1930] Let α be a type (n, k) Gauss period generating a normal basis N
and let j0 = 0 if k is even and j0 = n/2 if k is odd. Then the element
\[ \gamma = \frac{\alpha^{q^{j_0}} - k}{nk + 1} \]
is dual to α, and hence γ generates the dual basis Ñ of N . Furthermore, the complexity of
the dual basis Ñ is
\[
C_{\tilde{N}} \le
\begin{cases}
(k+1)n - k & \text{if } p \nmid k, \\
kn - 1 & \text{if } p \mid k.
\end{cases}
\]
5.3.45 Corollary [1180] For n > 2, a normal basis of Fqn over Fq arising from Gauss periods of
type (n, k) is self-dual if and only if k is even and divisible by the characteristic of Fq . In
particular, Type II optimal normal bases are self-dual.
5.3.46 Proposition [2924] The complexity of the dual of a Type I optimal normal basis is 3n − 2
if q is odd and 3n − 3 if q is even.
5.3.47 Remark [633] Upper bounds on the complexities of the dual basis of the Fqm -trace of
optimal normal bases of Fqn over Fq , where n = mk, are given in Table 5.3.3.
5.3.48 Remark In practical applications it is important to know how to do fast arithmetic in finite
fields, for example addition, multiplication, and division; and for cryptographic applications
it is also desirable to have elements of high orders and a fast algorithm for exponentiation.
Details for the basic operations discussed in this section can be found in Section 11.1, see
also [1227]. In hardware implementations, normal bases are often preferred, see Section 16.7
for details on hardware implementations. This subsection presents some theoretical results
related to fast multiplication and exponentiation under normal bases generated by Gauss
periods.
5.3.49 Remark Gao and Vanstone [1188] first observed that a Type II optimal normal basis gen-
erator has high order, which was proved later by von zur Gathen and Shparlinski [1240];
for more details see Section 4.4. Computer experiments by Gao, von zur Gathen, and Pa-
nario [1179] indicate that Gauss periods of type (n, k) with k > 2 also have high orders;
however, it is still open whether one can prove a subexponential lower bound on their orders.
5.3.50 Problem Give tight bounds on the orders of Gauss periods of type (n, k), k > 2.
5.3.51 Proposition [1179, 1188] Suppose α ∈ Fqn is a Gauss period of type (n, k) over Fq . Then
for any integer 1 ≤ t < q^n − 1, α^t can be computed using at most n^2 k operations in Fq .
5.3.52 Theorem [1180] Suppose γ is an element of order r (not necessarily a prime) and
\[ \alpha = \sum_{i \in K} \gamma^i \]
Then (α0 , α1 , . . . , αn−1 ) is the normal basis generated by α, with the following property:
α0 + α1 + · · · + αn−1 = −1.
To compute A^{−1} (assuming A ≠ 0), we apply a fast gcd algorithm to the two polynomials
A(x) and x^r − 1 to get a polynomial U (x) of degree at most r − 1 so that A(x)U (x) ≡ 1
(mod x^r − 1). The element in Fqn corresponding to the polynomial U (x) is the desired
inverse of A. The fast gcd step needs O(r log^2 r log log r) operations in Fq , see [1227].
5.3.54 Example (Generalized Gauss Periods [1047, 1174]) For any normal basis from Gauss periods
of type (n, k), we can apply Theorem 5.3.52 to perform fast arithmetic in Fqn . To obtain
an admissible Gauss period of type (n, k), r = nk + 1 must be a prime. Here we give an
example of generalized Gauss periods where r is not prime. Suppose we want to perform fast
arithmetic in F_{2^954}. Let n = 954 and note that the smallest k so that there is an admissible
Gauss period of type (n, k) over F2 is k = 49. The corresponding r = nk + 1 = 46747 is a
little big in this case. We observe that 954 = 106 · 9 and that there is an admissible Gauss
period α1 of type (106, 1) over F2 , and an admissible Gauss period α2 of type (9, 2). Then
α = α1 α2 is a normal element of F2n over F2 . We construct this α as follows. Let
α = γ + γ^{322}
is a generalized Gauss period that is normal for Fqn over Fq . Now we can apply Theorem
5.3.52 to perform fast arithmetic in Fqn with a much smaller r. In [1174], it is shown how
to find generalized Gauss periods with minimum r and the related subgroups K; see [1174,
Tables 2-4] for many more examples for which generalized Gauss periods are better than
Gauss periods.
5.3.55 Example (Fast arithmetic under type II optimal normal bases) For type II optimal normal
bases over F2 , we describe below a slightly faster algorithm from [248, 1238]. Suppose 2n + 1
is a prime and the multiplicative group of Z_{2n+1} is generated by −1 and 2. Let γ ∈ F_{2^{2n}} be
an element of order 2n + 1. For any i ≥ 0, define
γ_i = γ^i + γ^{−i}.
Then N = (γ1 , γ2 , . . . , γn ) is a permutation of the normal basis for F2n over F2 generated
by
α = γ_1 = γ + γ^{−1}.
We note that γ0 = 2 and
γ1 + γ2 + · · · + γn = 1.
To do fast multiplication and division in F2n , we first perform a basis transition from N to
the polynomial basis P = (α, α2 , . . . , αn ), then perform a fast multiplication of polynomials
and finally transform the result back to the basis N . To do the basis transitions, we need
the following properties:
γ_{i+j} = γ_i γ_j + γ_{j−i}, for all i, j.
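This identity follows directly from the definition of γ_i as γ^i + γ^{−i}. A short derivation, using only that the field has characteristic 2:

```latex
\begin{align*}
\gamma_i \gamma_j
  &= (\gamma^{i} + \gamma^{-i})(\gamma^{j} + \gamma^{-j}) \\
  &= (\gamma^{i+j} + \gamma^{-(i+j)}) + (\gamma^{j-i} + \gamma^{-(j-i)}) \\
  &= \gamma_{i+j} + \gamma_{j-i}.
\end{align*}
```

Since addition and subtraction coincide in characteristic 2, moving γ_{j−i} to the other side gives the stated form.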
To see how to go from the basis N to the basis P , suppose we have an expression
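A = a_1 γ_1 + a_2 γ_2 + · · · + a_ℓ γ_ℓ,
γ_m = α^m.
We observe that
\[
\begin{aligned}
a_1\gamma_1 + a_2\gamma_2 + \cdots + a_\ell\gamma_\ell
 &= a_1\gamma_1 + \cdots + a_m\gamma_m + a_{m+1}(\gamma_m\gamma_1 + \gamma_{m-1}) + \cdots + a_\ell(\gamma_m\gamma_{\ell-m} + \gamma_{m-(\ell-m)}) \\
 &= (a_1\gamma_1 + \cdots + a_m\gamma_m + a_{m+1}\gamma_{m-1} + \cdots + a_\ell\gamma_{m-(\ell-m)})
    + \alpha^m (a_{m+1}\gamma_1 + a_{m+2}\gamma_2 + \cdots + a_\ell\gamma_{\ell-m}).
\end{aligned}
\]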
See Also
References cited: [14, 130, 139, 159, 248, 259, 308, 633, 634, 746, 1047, 1172, 1174, 1179,
1180, 1184, 1188, 1227, 1236, 1238, 1240, 1263, 1631, 1632, 1779, 1930, 1931, 2015, 2199,
2422, 2578, 2580, 2665, 2865, 2924, 2951, 2952, 3036].
We present some theoretical results concerning algebraic extensions of finite fields. The
starting point is the Complete Normal Basis Theorem, which is a strengthening of the
classical Normal Basis Theorem. The search for completely normal elements leads to an
interesting structure theory for finite fields comprising a generalization of the class of finite
Galois field extensions to the class of cyclotomic modules.
5.4.1 Remark Let F̄q denote an algebraic closure of the finite field Fq . The Frobenius automorphism
of F̄q /Fq (throughout denoted by σ) is the field automorphism mapping each θ ∈ F̄q
to its q-th power θ^q. For every integer m ≥ 1 there is a unique subfield E_m of F̄q such that
Fq ⊆ E_m and |E_m| = q^m. As usual we write F_{q^m} for E_m. Given integers d, m ≥ 1, one has
F_{q^d} ⊆ F_{q^m} if and only if d | m. Moreover, if d is a divisor of m, then F_{q^m}/F_{q^d} is a Galois
extension of degree m/d; its Galois group is cyclic and generated by σ^d (when restricted to F_{q^m}).
The (F_{q^m}, F_{q^d})-trace of w ∈ F_{q^m} is
\[ \sum_{i=0}^{m/d-1} w^{q^{di}}, \quad\text{while}\quad \prod_{i=0}^{m/d-1} w^{q^{di}} \]
is the (F_{q^m}, F_{q^d})-norm of w.
5.4.2 Remark Recall from Definition 2.1.98 that θ ∈ F_{q^m} is a normal element of F_{q^m} over Fq
provided its conjugates θ, θ^q, . . . , θ^{q^{m−1}} under the Galois group of F_{q^m}/Fq form an Fq -basis
of F_{q^m}.
5.4.4 Example (Based on [1387, 1388]) Let q = 7 and m = 3^c, where c ≥ 1, and let η ∈ F̄7 be
a primitive 3^{c+1}-th root of unity. Then F_{7^{3^c}} is obtained by adjoining η to F7 . For every
i = 1, . . . , c, let τ (i) := 3^{⌊(i−1)/2⌋}. Then
\[ \theta := 1 + \sum_{i=1}^{c} \; \sum_{\substack{j=1 \\ \gcd(3,j)=1}}^{2\tau(i)} \eta^{\,j \cdot 3^{c-i}} \]
5.4.9 Definition Let M be a nonempty set of positive integers such that m ∈ M and d|m
imply d ∈ M , and k, n ∈ M imply lcm(k, n) ∈ M (where lcm denotes the least common
multiple). Then M is a Steinitz number.
5.4.10 Remark The intermediate fields of F̄q /Fq are in one-to-one correspondence with the Steinitz
numbers (see Brawley and Schnibben [398]). The field corresponding to the Steinitz number
M is the union F_{q^M} := ⋃_{m∈M} F_{q^m}. This algebraic extension of Fq is infinite if and only if
M is infinite. Any finite Steinitz number is of the form {d : d ∈ Z, d ≥ 1, d|m} for some
positive integer m; the corresponding field is F_{q^m}.
5.4.11 Definition Let M be a Steinitz number. Then a sequence (wm )m∈M of elements of F̄q is
trace-compatible, if for all d, m ∈ M with d|m the (Fqm , Fqd )-trace of wm is equal to wd .
Similarly, the sequence is norm-compatible, if the (Fqm , Fqd )-norm of wm is equal to wd
for every d, m ∈ M such that d|m.
5.4.12 Remark For an infinite Steinitz number M a trace-compatible sequence (wm )m∈M such
that every wm is a normal element of Fqm /Fq can be interpreted as an (infinite) normal ba-
sis for FqM over Fq . The existence of sequences of that kind in infinite Galois extensions over
an arbitrary field is proved by Lenstra [1896]. Representations of finite fields within com-
puter algebra systems which rely on subfield embeddings and trace-compatible sequences
of normal elements are studied by Scheerhorn [2537, 2538]. For finite fields, [2537] provides
an elementary proof of the existence of normal trace-compatible sequences for the entire
algebraic closure F̄q /Fq (i.e., for the Steinitz number N).
5.4.13 Theorem [1389, Section 26] Consider the Steinitz number N and let Fq be any finite field.
Then there exists a trace-compatible sequence (wm )m∈N in F̄q such that, for every m, the
element wm is completely normal in Fqm over Fq .
5.4.14 Definition A Galois field extension Fqm /Fq is completely basic, if every normal element
of Fqm /Fq is completely normal in Fqm /Fq .
5.4.15 Remark The notion of a completely basic extension in the context of a general finite di-
mensional Galois extension goes back to Faith [1023]. Blessenohl and Johnsen [318] have
characterized the completely basic extensions among the abelian extensions. Previously,
Blessenohl [315] considered cyclic extensions of prime power degree. Meyer [2090] has ex-
tended [318] to a certain class of not necessarily abelian Galois extensions. For finite fields
an elementary proof of Theorem 5.4.18 below is given in Section 15 of [1389].
5.4.16 Definition For relatively prime integers q, ` ≥ 1, the order of q modulo ` is the least
integer n ≥ 1 such that q n ≡ 1 (mod `); it is denoted by ord` (q).
5.4.17 Definition When considering a finite field with characteristic p, for an integer t ≥ 1 the
p-free part t′ of t is the largest divisor of t which is relatively prime to p.
2. For every prime divisor r of m, every normal element of Fqm /Fq is normal in
Fqm /Fqr .
3. For every prime divisor r of m, the number ord_{(m/r)′}(q) is not divisible by r.
5.4.19 Example Let Fq be any finite field. Then Fqm is completely basic over Fq in all the following
cases:
1. m = r or m = r^2, where r is a prime;
2. m divides q − 1;
3. m = p^b, where p is the characteristic of Fq and b ≥ 0 is any integer.
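The order condition in item 3 of Theorem 5.4.18 can be evaluated directly, which gives a mechanical check of the cases listed in Example 5.4.19. The Python sketch below is illustrative; the function names are ours, and the characteristic is recovered as the smallest prime factor of the prime power q.

```python
from math import gcd

def characteristic(q: int) -> int:
    """Smallest prime factor of the prime power q."""
    p = 2
    while q % p:
        p += 1
    return p

def prime_divisors(m: int):
    """Distinct prime divisors of m, in increasing order."""
    primes, d = [], 2
    while d * d <= m:
        if m % d == 0:
            primes.append(d)
            while m % d == 0:
                m //= d
        d += 1
    if m > 1:
        primes.append(m)
    return primes

def p_free_part(t: int, p: int) -> int:
    """Largest divisor of t that is relatively prime to p."""
    while t % p == 0:
        t //= p
    return t

def order_mod(q: int, ell: int) -> int:
    """Least n >= 1 with q^n = 1 (mod ell); requires gcd(q, ell) = 1."""
    if gcd(q, ell) != 1:
        raise ValueError("q and ell must be relatively prime")
    if ell == 1:
        return 1
    x, e = q % ell, 1
    while x != 1:
        x = x * q % ell
        e += 1
    return e

def is_completely_basic(q: int, m: int) -> bool:
    """For every prime divisor r of m, ord_{(m/r)'}(q) is not divisible by r."""
    p = characteristic(q)
    return all(order_mod(q, p_free_part(m // r, p)) % r != 0
               for r in prime_divisors(m))

if __name__ == "__main__":
    for m in range(1, 31):
        print(m, is_completely_basic(2, m))
```

The loop tabulates the criterion for degrees m up to 30 over F2 .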
5.4.20 Remark Actually, in the latter case of Example 5.4.19, θ is a normal element for F_{q^{p^b}}/Fq if
and only if the (F_{q^{p^b}}, Fq )-trace of θ is non-zero (see [1389, Section 5]). For this class of field
extensions, Blake, Gao, and Mullin [309] provide an iterative construction of completely
normal elements.
5.4.21 Remark In the present subsection the class of finite extensions of a finite field is generalized
to the class of cyclotomic modules; thereby, completely normal elements are generalized to
complete generators of cyclotomic modules. This generalization is necessary in order to
understand how completely normal elements are additively composed. The main references
are [1389, 1390].
5.4.22 Remark Consider again an algebraic closure F̄q of Fq . For every integer d ≥ 1 the additive
group of F̄q is equipped with a module structure over the algebra F_{q^d}[x] of polynomials in
the variable x. The corresponding scalar multiplication (of field elements by polynomials) is
carried out by first evaluating a polynomial f ∈ F_{q^d}[x] at σ^d (the Frobenius automorphism
over F_{q^d}) and afterwards by applying the F_{q^d}-endomorphism f (σ^d) to an element ω ∈ F̄q ,
resulting in f (σ^d)(ω). In this context, F̄q is an F_{q^d}[x]-module (with respect to σ^d). For
θ ∈ F̄q and an integer d ≥ 1 let
F_{q^d}[x]θ := {h(σ^d)(θ) : h ∈ F_{q^d}[x]}
denote the F_{q^d}[x]-submodule generated by θ.
5.4.23 Definition For θ ∈ F̄q and an integer d ≥ 1, the q^d-order of θ is the monic polynomial
g ∈ F_{q^d}[x] of least degree such that g(σ^d)(θ) = 0. The q^d-order of θ is denoted by
Ord_{q^d}(θ).
5.4.27 Definition For an integer m ≥ 1, its square-free part ν(m) is the product of the distinct
prime divisors of m. We let ν(1) := 1.
5.4.28 Proposition [1389, Section 18] Consider Fq [x] and let (k, t) be as in Definition 5.4.26. Write
t = p^b t′, where p is the characteristic of Fq and t′ the p-free part of t. Then
1. Φk (x^t) = Φk (x^{t′})^{p^b}.
2. If d is a common divisor of k and t, then Φk (x^t) = Φ_{kd}(x^{t/d}). In particular,
Φk (x^t) = Φ_{ν(k)}(x^{kt/ν(k)}).
3. If k and t are relatively prime, then Φk (x^t) = ∏_{e | t′} Φ_{ke}(x)^{p^b}. The latter is the
5.4.31 Remark In the setting of Definition 5.4.30, one has λω ∈ Ck,t for all ω ∈ Ck,t if and only
if λ ∈ F_{q^n} where n = kt/ν(k) [1389, Section 18]. Therefore, Ck,t carries the structure of an
F_{q^d}[x]-module if and only if d is a divisor of kt/ν(k). This motivates the notion of the module
character in Definition 5.4.30.
5.4.32 Definition Let Ck,t be a cyclotomic module over Fq . An element θ is a complete generator
for Ck,t over Fq , provided θ (simultaneously) generates Ck,t as an F_{q^d}[x]-module for
every divisor d of its module character kt/ν(k).
5.4.33 Theorem (Complete Cyclotomic Generator Theorem; [1389, Section 18]) Given any finite
field Fq and any cyclotomic module Ck,t over Fq , there exists a complete generator for Ck,t
over Fq .
5.4.34 Remark Theorem 5.4.33 generalizes the Complete Normal Basis Theorem 5.4.5 from the
class of finite field extensions of Fq to the class of cyclotomic modules over Fq : If (k, t) =
(1, m), then the cyclotomic module Ck,t is equal to the extension field Fqm ; over Fq , the
module character of Fqm is equal to m and the complete generators of Fqm /Fq are exactly
the completely normal elements of Fqm over Fq . The following theorem generalizes Remark
5.4.7.
5.4.35 Theorem (Cyclotomic Reduction Theorem; [1389, Section 25] and [1390, Section 3])
Consider two cyclotomic modules Ck,s and Cℓ,t over Fq . Assume that ks and ℓt are relatively
prime. Then, for θ ∈ Ck,s and ω ∈ Cℓ,t the following two assertions are equivalent:
1. θ and ω are complete generators for Ck,s and Cℓ,t over Fq , respectively.
2. θ · ω is a complete generator for the cyclotomic module Ckℓ,st over Fq .
5.4.36 Definition Consider a generalized cyclotomic polynomial Φk (xt ) ∈ Fq [x]. A set ∆ ⊆ Fq [x]
of generalized cyclotomic polynomials is a cyclotomic decomposition of Φk (xt ), if the
members of ∆ are pairwise relatively prime and
\[ \Phi_k(x^t) = \prod_{\Psi(x) \in \Delta} \Psi(x). \]
In that case, i(∆) denotes a corresponding set of pairs (ℓ, s) with Φℓ (x^s) ∈ ∆.
5.4.37 Proposition [1389, Section 19] Let ∆ be a cyclotomic decomposition of Φk (xt ) ∈ Fq [x].
Then
1. Ck,t = ⊕_{(ℓ,s)∈i(∆)} Cℓ,s is a decomposition into a direct sum of cyclotomic modules.
2. For θ ∈ Ck,t let θ = Σ_{(ℓ,s)∈i(∆)} θℓ,s be the corresponding decomposition into its
∆-components. If θ is a complete generator of Ck,t over Fq , then (necessarily) for
every (ℓ, s) ∈ i(∆) the component θℓ,s is a complete generator of Cℓ,s over Fq .
5.4.39 Theorem (Complete Decomposition Theorem; [1389, Section 19] and [1390, Section 5])
Consider a generalized cyclotomic polynomial Φk (x^t) over the finite field Fq with
characteristic p. Let r be a prime divisor of t. Assume that r ≠ p and that r does not divide k.
Then
∆r := {Φk (x^{t/r}), Φ_{kr}(x^{t/r})}
is a cyclotomic decomposition of Φk (x^t). Moreover, the following two statements are
equivalent:
1. ∆r is an agreeable decomposition over Fq .
2. ord_{ν(kt′)}(q) is not divisible by r^a, where a ≥ 1 is maximal such that r^a divides t
(recall that t′ is the p-free part of t and ν(kt′) is the square-free part of kt′).
5.4.40 Remark Consider Φk (x^t) and ∆r as in Theorem 5.4.39. Then the module character of each
cyclotomic module corresponding to a member of ∆r is equal to kt/(rν(k)) and therefore a proper
divisor of the module character kt/ν(k) of Ck,t over Fq .
5.4.41 Example [1389, Section 19] Consider an extension Fqm /Fq , where m > 1 is not a power
of the characteristic p. Let r|m be the largest prime divisor of m that is different from p.
Then {x^{m/r} − 1, Φr (x^{m/r})} is an agreeable decomposition of x^m − 1 over Fq . If in particular
m = r^n, this decomposition is equal to {x^{r^{n−1}} − 1, Φ_{r^n}(x)}. The Complete Decomposition
Theorem 5.4.39 may then be applied to x^{r^{n−1}} − 1 if n ≥ 2, and an induction argument
shows that the canonical decomposition {x − 1, Φr (x), Φ_{r^2}(x), . . . , Φ_{r^n}(x)} is an agreeable
decomposition of x^{r^n} − 1 over Fq .
5.4.42 Remark In Section 6 of [1390] a Uniqueness Theorem is proved: Starting from a generalized
cyclotomic polynomial Φk (xt ) over Fq , one obtains a unique (finest) agreeable decomposi-
tion of Φk (xt ) by a recursive application of the Complete Decomposition Theorem 5.4.39
independent of the order the various primes r are chosen.
5.4.43 Example [1389, Section 19] The set {x − 1, Φ2 (x), Φ4 (x), Φ3 (x^4), Φ9 (x^4), Φ7 (x^36)} is an
agreeable decomposition of x^252 − 1 over Fq , where q is any prime power relatively prime
to 252. When specializing q to be equal to 5, the Complete Decomposition Theorem may
be applied several further times to reach the following agreeable decomposition of x^252 − 1
over F5 : {x − 1, Φ2 (x), Φ4 (x), Φ3 (x^2), Φ12 (x), Φ9 (x^2), Φ36 (x), Φ7 (x^6), Φ28 (x^3), Φ63 (x^2),
Φ252 (x)}.
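Both sets in this example can be sanity-checked as factorizations of x^252 − 1. The Python sketch below is our own illustration and does not test agreeability: it only checks that the degrees add up to 252 and that, over the rationals, each cyclotomic polynomial Φd (x) with d | 252 occurs in exactly one member. It uses the fact that a root of unity of order d is a root of Φk (x^t) exactly when d/gcd(d, t) = k. The pair (k, t) stands for Φk (x^t).

```python
from math import gcd

def divisors(m: int):
    return [d for d in range(1, m + 1) if m % d == 0]

def euler_phi(k: int) -> int:
    return sum(1 for a in range(1, k + 1) if gcd(a, k) == 1)

def cyclotomic_indices(k: int, t: int):
    """All d such that Phi_d(x) divides Phi_k(x^t) over the rationals."""
    return {d for d in divisors(k * t) if d // gcd(d, t) == k}

def total_degree(pairs) -> int:
    """Sum of deg Phi_k(x^t) = phi(k) * t over the listed pairs."""
    return sum(euler_phi(k) * t for k, t in pairs)

def covers_exactly(pairs, m: int) -> bool:
    """True if every divisor of m is hit exactly once by the listed pairs."""
    seen = []
    for k, t in pairs:
        seen.extend(cyclotomic_indices(k, t))
    return sorted(seen) == divisors(m)

general = [(1, 1), (2, 1), (4, 1), (3, 4), (9, 4), (7, 36)]
over_f5 = [(1, 1), (2, 1), (4, 1), (3, 2), (12, 1), (9, 2), (36, 1),
           (7, 6), (28, 3), (63, 2), (252, 1)]

if __name__ == "__main__":
    for name, pairs in (("general", general), ("over F5", over_f5)):
        print(name, total_degree(pairs), covers_exactly(pairs, 252))
```

The second list refines the first: for instance, the indices of Φ3 (x^4) are split between Φ3 (x^2) and Φ12 (x).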
5.4.44 Remark Consider a cyclotomic module Ck,t over Fq , where k and t are relatively prime. Let
t = p^b t′, where p is the characteristic of Fq and t′ is the p-free part of t. The canonical
decomposition {Φ_{ke}(x)^{p^b} : e | t′} is (by definition) the finest possible cyclotomic decomposition
of Φk (x^t). By Theorem 19.10 of [1389], the canonical decomposition is agreeable provided
that t′ and ord_{ν(kt′)}(q) are relatively prime. (Recall that ν(kt′) is the square-free part of
kt′.)
5.4.45 Theorem [1389, Section 19] Let $\mathbb{F}_q$ be a finite field and $p$ its characteristic. Write $m = m' p^b$
(with $m'$ being the $p$-free part of $m$). Then the canonical decomposition $\{\Phi_d(x)^{p^b} : d \mid m'\}$
of $x^m - 1$ is agreeable over $\mathbb{F}_q$ if and only if $m'$ and $\mathrm{ord}_{\nu(m')}(q)$ are relatively prime.
5.4.46 Definition Consider a cyclotomic module $C_{k,t}$ over $\mathbb{F}_q$, with $k$ and $t$ being relatively
prime. Then $C_{k,t}$ is regular over $\mathbb{F}_q$, provided that $\mathrm{ord}_{\nu(kt')}(q)$ and $kt$ are relatively
prime ($t'$ is the $p$-free part of $t$ and $\nu(kt')$ is the square-free part of $kt'$). In particular,
in the case where $(k, t) = (1, m)$, the extension $\mathbb{F}_{q^m}$ is regular over $\mathbb{F}_q$, provided that $m$
and $\mathrm{ord}_{\nu(m')}(q)$ are relatively prime.
5.4.47 Example These examples (taken from [1391, Section 1]), indicate that the class of regular
extensions is quite large. Given any finite field Fq , we have that Fqm is regular over Fq in
all the following cases:
1. m is a power of a prime r;
2. ν(m) divides ν(q − 1);
3. m is a power of a Carmichael number (a Carmichael number is an odd composite
square-free integer n such that r − 1 divides n − 1 for every prime divisor r of n;
by Alford, Granville, and Pomerance [75] there are infinitely many Carmichael
numbers, examples being 561, 1105, 1729, and 2465);
4. m has all its prime divisors from {7, 11, 13, 17, 19, 31, 41, 47, 49, 61, 73, 97, 101,
107, 109, 139, 151, 163, 167, 173, 179, 181, 193}, without any restriction on their
multiplicity.
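The regularity condition of Definition 5.4.46 for an extension (the case (k, t) = (1, m)) is easy to test numerically. The following Python sketch is an illustration only: the function names are hypothetical, q is assumed to be a prime power, and the multiplicative order is computed by brute force.

```python
from math import gcd

def prime_factors(n):
    """Set of prime divisors of n (trial division)."""
    primes, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            primes.add(d)
            n //= d
        d += 1
    if n > 1:
        primes.add(n)
    return primes

def radical(n):
    """nu(n): the product of the distinct primes dividing n."""
    result = 1
    for r in prime_factors(n):
        result *= r
    return result

def mult_order(q, n):
    """Order of q modulo n, assuming gcd(q, n) = 1 (the order modulo 1 is 1)."""
    assert gcd(q, n) == 1
    k, x = 1, q % n
    while x != 1 % n:
        x = x * q % n
        k += 1
    return k

def is_regular_extension(q, m):
    """Check gcd(m, ord_{nu(m')}(q)) = 1, where m' is the p-free part of m."""
    (p,) = prime_factors(q)        # q is assumed to be a prime power
    m_free = m
    while m_free % p == 0:         # strip the p-part of m
        m_free //= p
    return gcd(m, mult_order(q, radical(m_free))) == 1

print(is_regular_extension(2, 561), is_regular_extension(5, 252))
```

For instance, the Carmichael degree 561 is regular over $\mathbb{F}_2$, while degree 252 over $\mathbb{F}_5$ is not, since the order of 5 modulo 42 is 6.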
5.4.48 Remark Any completely basic extension (Definition 5.4.14) is regular. In order to compare
the completely basic extensions with the regular ones, consider a finite set $\pi$ of prime numbers,
and let $\nu$ be the product of all $r \in \pi$. Let further $N(\pi) := \{m \in \mathbb{Z} : m \ge 1, \nu(m) \mid \nu\}$.
If $\mathbb{F}_{q^\nu}$ is completely basic over $\mathbb{F}_q$, then $\mathbb{F}_{q^m}$ is regular over $\mathbb{F}_q$ for every $m \in N(\pi)$. In
contrast, the subset of those $m \in N(\pi)$ for which $\mathbb{F}_{q^m}$ is completely basic over $\mathbb{F}_q$ is only
finite (see [1391, Proposition 3.3]).
5.4.49 Remark When considering the complete generation of a regular cyclotomic module $C_{k,t}$
over $\mathbb{F}_q$, due to Remark 5.4.44, one can independently work on the components (which are
also regular) arising from the canonical decomposition of $\Phi_k(x^t)$. It is therefore sufficient
to consider the subclass of (regular) cyclotomic modules of the form $C_{n,p^b}$ (corresponding
to the generalized cyclotomic polynomial $\Phi_n(x)^{p^b}$), where $p$ is the characteristic of $\mathbb{F}_q$ and
$n$ is relatively prime to $q$.
5.4.50 Definition Let $p$ be the characteristic of $\mathbb{F}_q$ and $n$ be relatively prime to $q$. Assume that
$C_{n,p^b}$ is a regular cyclotomic module over $\mathbb{F}_q$. Write $n = 2^c \cdot \bar{n}$ with $\bar{n}$ odd. Then $C_{n,p^b}$
is exceptional over $\mathbb{F}_q$, provided the following conditions are satisfied: $q \equiv 3 \pmod 4$
and $c \ge 3$ and the order of $q$ modulo $2^c$ is equal to 2. In all other cases, $C_{n,p^b}$ is
non-exceptional over $\mathbb{F}_q$.
5.4.51 Remark For an integer $n \ge 1$ let $\pi(n)$ denote the set of prime divisors of $n$. If $n$ and $q$ are
relatively prime, the order of $q$ modulo $n$ is of the form
\[ \mathrm{ord}_n(q) = \mathrm{ord}_{\nu(n)}(q) \cdot \prod_{r \in \pi(n)} r^{\alpha(r)}, \]
where $\alpha(r) \ge 0$ for all $r \in \pi(n)$. Assume next that $C_{n,p^b}$ is a regular cyclotomic module
over $\mathbb{F}_q$, hence $\mathrm{ord}_{\nu(n)}(q)$ is relatively prime to $np^b$. Let further
\[ \tau = \tau(q, n) := \prod_{r \in \pi(n)} r^{\lfloor \alpha(r)/2 \rfloor}. \]
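As an illustration of Remark 5.4.51, the parameter τ(q, n) can be computed directly from the two multiplicative orders. This Python sketch uses hypothetical helper names and brute-force arithmetic; it assumes gcd(q, n) = 1 and does not check regularity.

```python
from math import gcd

def prime_factors(n):
    """Set of prime divisors of n (trial division)."""
    primes, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            primes.add(d)
            n //= d
        d += 1
    if n > 1:
        primes.add(n)
    return primes

def mult_order(q, n):
    """Order of q modulo n, assuming gcd(q, n) = 1 (the order modulo 1 is 1)."""
    assert gcd(q, n) == 1
    k, x = 1, q % n
    while x != 1 % n:
        x = x * q % n
        k += 1
    return k

def tau(q, n):
    """Product of r^floor(alpha(r)/2) over the primes r dividing n."""
    primes = prime_factors(n)
    nu = 1
    for r in primes:
        nu *= r
    # ord_n(q) / ord_{nu(n)}(q) is the product of the prime powers r^alpha(r)
    quotient = mult_order(q, n) // mult_order(q, nu)
    result = 1
    for r in primes:
        alpha = 0
        while quotient % r == 0:
            quotient //= r
            alpha += 1
        result *= r ** (alpha // 2)
    return result

print(tau(2, 27), tau(3, 32), tau(7, 3))  # 3 2 1
```

For example, the order of 2 modulo 27 is 18 and modulo 3 it is 2, so α(3) = 2 and τ(2, 27) = 3.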
5.4.52 Theorem [1389, Section 20] Let $p$ be the characteristic of $\mathbb{F}_q$ and $n$ be relatively prime to $q$.
Assume that $C_{n,p^b}$ is a regular cyclotomic module over $\mathbb{F}_q$. Let $\tau = \tau(q, n)$ be as in Remark
5.4.51. Then $\tau$ divides $\frac{n}{\nu(n)}$ in general, and $2\tau$ divides $\frac{n}{\nu(n)}$ if $C_{n,p^b}$ is exceptional. Moreover
the following hold:
1. If $C_{n,p^b}$ is non-exceptional, then $\theta$ is a complete generator of $C_{n,p^b}$ over $\mathbb{F}_q$ if and
only if $\mathrm{Ord}_{q^\tau}(\theta) = \Phi_{n/\tau}(x)^{p^b}$.
2. If $C_{n,p^b}$ is exceptional, then $\theta$ is a complete generator of $C_{n,p^b}$ over $\mathbb{F}_q$ if and only
if $\mathrm{Ord}_{q^\tau}(\theta) = \Phi_{n/\tau}(x)^{p^b}$ and $\mathrm{Ord}_{q^{2\tau}}(\theta) = \Phi_{n/(2\tau)}(x)^{p^b}$.
5.4.53 Theorem [1389, Section 21] Let $p$ be the characteristic of $\mathbb{F}_q$ and $n$ be relatively prime
to $q$. Assume that $C_{n,p^b}$ is a regular cyclotomic module over $\mathbb{F}_q$. Let $\tau = \tau(q, n)$ be as in
Remark 5.4.51 and let $\varphi$ denote Euler's function (see Definition 2.1.43). Then the number
of complete generators of $C_{n,p^b}$ over $\mathbb{F}_q$ is equal to
1. $\left(q^{\mathrm{ord}_n(q)/\tau} - 1\right)^{\tau\varphi(n)/\mathrm{ord}_n(q)} \cdot q^{(p^b-1)\varphi(n)}$, if $C_{n,p^b}$ is non-exceptional;
2. $\left(q^{2\,\mathrm{ord}_n(q)/\tau} - 4q^{\mathrm{ord}_n(q)/\tau} + 3\right)^{\tau\varphi(n)/(2\,\mathrm{ord}_n(q))} \cdot q^{(p^b-1)\cdot\varphi(n)}$, if $C_{n,p^b}$ is exceptional.
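The counting formulas of Theorem 5.4.53 can be evaluated mechanically. The sketch below is a hedged illustration, not the handbook's procedure: it reads the two counts as $(q^{o/\tau} - 1)^{\tau\varphi(n)/o} \cdot q^{(p^b-1)\varphi(n)}$ and $(q^{2o/\tau} - 4q^{o/\tau} + 3)^{\tau\varphi(n)/(2o)} \cdot q^{(p^b-1)\varphi(n)}$ with $o = \mathrm{ord}_n(q)$, uses hypothetical function names, and assumes (without checking) that $C_{n,p^b}$ is regular over $\mathbb{F}_q$.

```python
from math import gcd

def prime_factors(n):
    primes, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            primes.add(d)
            n //= d
        d += 1
    if n > 1:
        primes.add(n)
    return primes

def mult_order(q, n):
    """Order of q modulo n, assuming gcd(q, n) = 1 (the order modulo 1 is 1)."""
    assert gcd(q, n) == 1
    k, x = 1, q % n
    while x != 1 % n:
        x = x * q % n
        k += 1
    return k

def tau(q, n):
    """tau(q, n) from Remark 5.4.51."""
    nu, result = 1, 1
    for r in prime_factors(n):
        nu *= r
    quotient = mult_order(q, n) // mult_order(q, nu)
    for r in prime_factors(n):
        alpha = 0
        while quotient % r == 0:
            quotient //= r
            alpha += 1
        result *= r ** (alpha // 2)
    return result

def is_exceptional(q, n):
    """Definition 5.4.50: q = 3 (mod 4), n = 2^c * odd with c >= 3, ord_{2^c}(q) = 2."""
    c = 0
    while n % 2 == 0:
        n //= 2
        c += 1
    return q % 4 == 3 and c >= 3 and mult_order(q, 2 ** c) == 2

def count_complete_generators(q, n, b):
    """Number of complete generators of C_{n,p^b} over F_q (regularity assumed)."""
    p = min(prime_factors(q))
    phi_n = sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)
    o, t = mult_order(q, n), tau(q, n)
    tail = q ** ((p ** b - 1) * phi_n)
    if is_exceptional(q, n):
        exponent, rem = divmod(t * phi_n, 2 * o)
        base = q ** (2 * o // t) - 4 * q ** (o // t) + 3
    else:
        exponent, rem = divmod(t * phi_n, o)
        base = q ** (o // t) - 1
    assert rem == 0  # the exponent in the formula is expected to be an integer
    return base ** exponent * tail

print(count_complete_generators(2, 3, 0), count_complete_generators(3, 8, 0))
```

Two small sanity checks: for n = 1 and b = 0 the count is q − 1 (the nonzero elements of $\mathbb{F}_q$), and for q = 2, n = 1, b = 1 it is 2, matching the two normal elements of $\mathbb{F}_4$ over $\mathbb{F}_2$.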
5.4.54 Theorem [1389, Section 21] Let $C_{k,t}$ be a regular cyclotomic module over $\mathbb{F}_q$ and write
$t = p^b t'$, where $p$ is the characteristic of $\mathbb{F}_q$ and $t'$ is the $p$-free part of $t$. Then the number
of complete generators of $C_{k,t}$ over $\mathbb{F}_q$ is at least
\[ (q - 1)^{\varphi(k)t'} \cdot q^{\varphi(k)t'(p^b - 1)}, \]
where $\varphi$ denotes Euler's function. Moreover, equality holds if and only if $kt'$ divides $q - 1$,
in which case $\theta$ is a complete generator of $C_{k,t}$ if and only if the $q$-order of $\theta$ is $\Phi_k(x^t)$.
5.4.55 Conjecture If $\mathbb{F}_{q^m}/\mathbb{F}_q$ is a regular extension, then the number of completely normal elements
of $\mathbb{F}_{q^m}/\mathbb{F}_q$ is at least $(q - 1)^{m'} \cdot q^{m'(p^b - 1)}$ by Theorem 5.4.54 (where $m = p^b m'$ with $m'$
being the $p$-free part of $m$ and $p$ the characteristic of $\mathbb{F}_q$); moreover, equality holds if and
only if $m'$ divides $q - 1$, in which case $\mathbb{F}_{q^m}$ is completely basic over $\mathbb{F}_q$. We conjecture that,
for any extension $\mathbb{F}_{q^m}/\mathbb{F}_q$, the number of completely normal elements of $\mathbb{F}_{q^m}/\mathbb{F}_q$ is at least
\[ (q - 1)^{m'} \cdot q^{m'(p^b - 1)}. \]
5.4.56 Remark The aim of Theorem 5.4.57 below and its subsequent remarks is a construction
(in the spirit of Example 5.4.4) of complete generators for regular cyclotomic modules over
Fq that are of the form Cn,1 . It is based on Theorem 5.4.52. A suitable application of the
Complete Decomposition Theorem 5.4.39 in combination with the Cyclotomic Reduction
Theorem 5.4.35 gives rise to a complete generator for a general cyclotomic module, in
particular a completely normal element for an arbitrary extension Fqm over Fq . Thereby,
Remark 5.4.20 on extensions of the form $\mathbb{F}_{q^{p^b}}$ may be used to cover the cases $C_{k,t}$ and $\mathbb{F}_{q^m}$,
where the characteristic p of Fq divides t and m, respectively. Throughout, gcd denotes the
greatest common divisor.
5.4.57 Theorem [1389, Chapter VI] Consider a finite field $\mathbb{F}_q$ and an integer $n \ge 1$ with $\gcd(n, q) = 1$.
Let $s := \mathrm{ord}_{\nu(n)}(q)$. Assume that $C_{n,1}$ is a regular cyclotomic module over $\mathbb{F}_q$ (hence $n$ and
$s$ are relatively prime). If $n$ is odd, or if $q \equiv 1 \pmod 4$, or if $n \equiv 2 \pmod 4$ and $q \equiv 3 \pmod 4$,
let $(Q, N) := (q^s, n)$. If $C_{n,1}$ is exceptional over $\mathbb{F}_q$, let $(Q, N) := (q^{2s}, \frac{n}{2})$. Write $Q - 1 = \rho \cdot \bar{\rho}$,
where $\nu(\rho) = \nu(N)$ and $\gcd(N, \bar{\rho}) = 1$ (hence $\rho$ is the largest divisor of $Q - 1$ composed from
prime divisors of $N$), and let $a := \gcd(\rho, N)$ and $I := \{i \in \mathbb{Z} : 1 \le i \le a, \gcd(i, a) = 1\}$.
On $I$ an equivalence relation is defined by $i \sim j$ if and only if $i \equiv q^{\ell} j \pmod a$ for
some $\ell \in \mathbb{Z}$. Let $R$ be a set of representatives of $\sim$ on $I$. Finally, let $y \in \overline{\mathbb{F}}_q$ be a primitive
$(N\rho)$-th root of unity, and
\[ w := \sum_{i \in R} y^i \quad \text{and} \quad u := \sum_{i \in R} (y^i + y^{iq}). \]
With Tr denoting the $(\mathbb{F}_{q^{ns}}, \mathbb{F}_{q^n})$-trace mapping, the following hold:
1. $\mathrm{Ord}_q(\mathrm{Tr}(w)) = \Phi_n(x)$ in the first case, that is, $n$ odd, or $q \equiv 1 \pmod 4$, or
$n \equiv 2 \pmod 4$ and $q \equiv 3 \pmod 4$.
2. $\mathrm{Ord}_q(\mathrm{Tr}(u)) = \Phi_n(x)$ and $\mathrm{Ord}_{q^2}(\mathrm{Tr}(u)) = \Phi_{n/2}(x)$ in the second case, that is,
where $C_{n,1}$ is exceptional over $\mathbb{F}_q$.
5.4.58 Remark The case where $4 \mid n$ and $q \equiv 3 \pmod 4$ and $C_{n,1}$ is regular but non-exceptional over
$\mathbb{F}_q$ is missed in the formulation of Theorem 5.4.57. It can be covered, however, by applying
Theorem 5.4.57 to the pair $(q^2, \frac{n}{2})$ in order to determine an element having $q^2$-order $\Phi_{n/2}(x)$.
Any such element has $q$-order $\Phi_n(x)$ (Section 24 of [1389]).
5.4.59 Remark When searching for a complete generator of the regular cyclotomic module $C_{n,1}$
over $\mathbb{F}_q$, where $\gcd(n, q) = 1$, define the parameter $\tau = \tau(q, n)$ as in Theorem 5.4.52.
Then $(q, n)$ is exceptional if and only if $(q^\tau, \frac{n}{\tau})$ is exceptional. Consider $C_{n,1} = \{\theta \in \overline{\mathbb{F}}_q :
\Phi_n(\sigma)(\theta) = 0\}$ as the cyclotomic module $C_{\frac{n}{\tau},1}$ over $\mathbb{F}_{q^\tau}$. Apply the construction of Theorem
5.4.57 (and Remark 5.4.58) to the pair $(q^\tau, \frac{n}{\tau})$ instead of $(q, n)$. Then the resulting elements
constitute complete generators of $C_{n,1}$ over $\mathbb{F}_q$ by Theorem 5.4.52.
5.4.60 Remark Completing previous work of Carlitz [539] in 1952 and Davenport [775] in 1968,
Lenstra and Schoof [1899] proved the Primitive Normal Basis Theorem (Theorem 5.2.32)
in 1987. It states that for any finite field Fq and any integer m ≥ 1 there exists a primitive
element of Fqm that is normal over Fq . Recall that a primitive element of Fqm is a generator
of the (cyclic) multiplicative group of Fqm ; see Theorem 2.1.37 and Definition 2.1.38. In this
subsection we consider the existence of primitive elements that are completely normal.
5.4.61 Remark By means of a computer search, Morgan and Mullen [2157] calculated for every
pair $(p, m)$, with $p \le 97$ a prime and with $p^m < 10^{50}$, a monic irreducible polynomial of
degree $m$ over $\mathbb{F}_p$ whose roots are primitive and completely normal elements for $\mathbb{F}_{p^m}$ over $\mathbb{F}_p$.
It is conjectured in [2157] that for every extension $\mathbb{F}_{q^m}/\mathbb{F}_q$ there exists a primitive element
of $\mathbb{F}_{q^m}$ that is completely normal over $\mathbb{F}_q$.
5.4.62 Theorem [1391] Let q be a prime power and assume that Fqm is a regular extension over
Fq (see Definition 5.4.46). Assume further that q − 1 is divisible by 4 if q is odd and m is
even. Then there exists a primitive element of Fqm that is completely normal over Fq .
5.4.63 Remark Blessenohl [316] settles the existence of primitive completely normal elements for
extensions $\mathbb{F}_{q^m}/\mathbb{F}_q$, where $m = 2^{\ell}$ is a divisor of $q^2 - 1$ and $\ell \ge 3$ and $q \equiv 3 \pmod 4$.
In [1394], the case where q ≡ 3 (mod 4) and where m is a sufficiently large power of 2 is
handled, giving rise to the first bound in Theorem 5.4.64 below. In a not yet published work
by the author [1395], the existence of primitive completely normal elements is proved for all
regular extensions, i.e., the assertion of Theorem 5.4.62 also holds without the additional
assumption that q − 1 is divisible by 4 if q is odd and m is even.
5.4.64 Theorem [1394] Let $\mathbb{F}_q$ be a finite field with characteristic $p$. For an integer $m \ge 1$ let
PCN$(q, m)$ denote the number of primitive elements of $\mathbb{F}_{q^m}$ that are completely normal
over $\mathbb{F}_q$. Then (with $\varphi$ denoting Euler's function):
1. $\mathrm{PCN}(q, 2^{\ell}) \ge 4(q - 1)^{2^{\ell-2}}$, if $q \equiv 3 \pmod 4$ and $\ell \ge e + 3$ (where $e$ is maximal
such that $2^e \mid q^2 - 1$), or if $q \equiv 1 \pmod 4$ and $\ell \ge 5$.
2. $\mathrm{PCN}(q, r^{\ell}) \ge r^2 (q - 1)^{r^{\ell-2}}$, if $r \ne p$ is an odd prime and $\ell \ge 2$.
3. $\mathrm{PCN}(q, r^{\ell}) \ge r(q - 1)^{r^{\ell-1}} \cdot \varphi(q^{r^{\ell-1}} - 1)$, if $r \ge 7$ and $r \ne p$ is a prime and $\ell \ge 2$.
4. $\mathrm{PCN}(q, p^{\ell}) \ge p\,q^{p^{\ell-1}-1}(q - 1)$, if $\ell \ge 2$.
5. $\mathrm{PCN}(q, p^{\ell}) \ge p\,q^{p^{\ell-1}-1}(q - 1) \cdot \varphi(q^{p^{\ell-1}} - 1)$, if $p \ge 7$ and $\ell \ge 2$.
5.4.65 Definition Consider a finite field $\mathbb{F}_q$ and let $M$ be a Steinitz number. Assume that
$(w_m)_{m \in M}$ is a sequence in $\mathbb{F}_{q^M}$ that is both norm-compatible and trace-compatible
over $\mathbb{F}_q$ (see Definition 5.4.11). Then $(w_m)_{m \in M}$ is a complete universal generator of
$\mathbb{F}_{q^M}$ over $\mathbb{F}_q$ if, for every $m \in M$, $w_m$ is a primitive element of $\mathbb{F}_{q^m}$ that is completely
normal over $\mathbb{F}_q$.
5.4.66 Theorem [1392] Let $q > 1$ be any prime power and $r \ge 7$ be any prime. Furthermore, let
$M := \{r^n : n \in \mathbb{N}\}$. Then there exists a complete universal generator for $\mathbb{F}_{q^M}$ over $\mathbb{F}_q$.
5.4.67 Remark The conclusion of Theorem 5.4.66 could also be proved for r = 5 when Fq has
characteristic 5, or when q mod 25 is not equal to 1, 7, 18 or 24, or when q is sufficiently
large [1392, 1393]. For the cases r = 2 and r = 3 similar results are available only when the
assumption that M is a Steinitz number is weakened, e.g., M has to consist of powers of 9
or powers of 8, respectively.
See Also
§5.1, §5.2 For self-dual, weakly self-dual, and primitive normal bases.
§5.3 For information on low-complexity normal bases.
References Cited: [75, 309, 315, 316, 317, 318, 398, 539, 589, 775, 1023, 1386, 1387, 1388,
1389, 1390, 1391, 1392, 1393, 1394, 1395, 1896, 1899, 2090, 2157, 2537, 2538, 2539, 2540]
6
Exponential and character sums
In this section, we focus mainly on Gauss, Jacobi, and Kloosterman sums over finite
fields, with brief mention of Eisenstein and Jacobsthal sums. Throughout, Fq is a finite field
of characteristic $p$ with $q = p^r$ elements. (In Subsection 6.1.3, the exponent $r$ will be taken
to be the order of p (mod k) for a fixed integer k.) We refer to [240] for proofs of many of
the results in this section. In some cases, the proofs need modification because of differing
definitions of the trivial character χ0 : in Definition 6.1.1 below, χ0 (0) = 0, while in [240],
χ0 (0) = 1.
6.1.1 Definition A multiplicative character $\chi$ on $\mathbb{F}_q^*$ is a map from the cyclic group $\mathbb{F}_q^*$ into the
group of complex roots of unity such that $\chi(\alpha\beta) = \chi(\alpha)\chi(\beta)$ for all $\alpha, \beta \in \mathbb{F}_q^*$. We
extend $\chi$ to a function on $\mathbb{F}_q$ by setting $\chi(0) = 0$. The trivial character $\chi_0$ satisfies
$\chi_0(\alpha) = 1$ for every $\alpha \in \mathbb{F}_q^*$. The order of $\chi$ is the smallest positive integer $n$ for which
$\chi^n = \chi_0$. The unique character $\rho$ of order 2 is the quadratic character.
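6.1.2 Definition Write $\zeta_p = e^{2\pi i/p}$. For a character $\chi$ of order $k$ on $\mathbb{F}_q$ and for $\beta \in \mathbb{F}_q$, the Gauss
sum $G(\beta, \chi)$ of order $k$ over $\mathbb{F}_q$ is defined by
\[ G(\beta, \chi) = \sum_{\alpha \in \mathbb{F}_q} \chi(\alpha)\,\zeta_p^{\mathrm{Tr}(\alpha\beta)}, \]
where $\mathrm{Tr}(\alpha)$ denotes the trace of $\alpha$ from $\mathbb{F}_q$ to the prime field $\mathbb{F}_p$. When $\beta = 1$, we
abbreviate $G(\chi) = G(\beta, \chi)$.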
6.1.3 Remark The next theorem shows that G(β, χ) can be evaluated in terms of G(χ).
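6.1.4 Theorem [240, p. 9]. For a character $\chi$ on $\mathbb{F}_q$ and $\beta \in \mathbb{F}_q$,
\[ G(\beta, \chi) = \begin{cases} q - 1 & \text{if } \beta = 0,\ \chi = \chi_0, \\ \overline{\chi}(\beta)\,G(\chi) & \text{otherwise.} \end{cases} \]
In particular,
\[ \overline{G(\chi)} = \chi(-1)\,G(\overline{\chi}). \]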
6.1.5 Remark For proofs of the following two theorems, see [240, p. 10].
6.1.6 Theorem For a character $\chi$ on $\mathbb{F}_q$,
\[ |G(\chi)| = \sqrt{q} \ \text{ if } \chi \ne \chi_0, \quad \text{and} \quad G(\chi) = -1 \ \text{ if } \chi = \chi_0. \]
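For a prime field the trace is the identity map, so Gauss sums can be evaluated numerically in a few lines. The Python sketch below (hypothetical helper names, floating-point arithmetic, q = p prime only) checks the absolute value from Theorem 6.1.6 together with the two relations of Theorem 6.1.4 in the form $G(\beta, \chi) = \overline{\chi}(\beta)G(\chi)$ and $\overline{G(\chi)} = \chi(-1)G(\overline{\chi})$.

```python
import cmath

def primitive_root(p):
    """Smallest generator of the cyclic group F_p^* (brute force, p prime)."""
    for g in range(1, p):
        x, order = g, 1
        while x != 1:
            x = x * g % p
            order += 1
        if order == p - 1:
            return g

def character(p, e):
    """Value table of chi^e on F_p, where chi(g^i) = exp(2*pi*i*i/(p-1)); chi^e(0) = 0."""
    g, table, x = primitive_root(p), [0j] * p, 1
    for i in range(p - 1):
        table[x] = cmath.exp(2j * cmath.pi * i * e / (p - 1))
        x = x * g % p
    return table

def gauss_sum(p, chi, beta=1):
    """G(beta, chi) over the prime field F_p."""
    zeta = cmath.exp(2j * cmath.pi / p)
    return sum(chi[a] * zeta ** (a * beta % p) for a in range(p))

p = 13
for e in range(1, p - 1):
    chi, chi_bar = character(p, e), character(p, p - 1 - e)
    G = gauss_sum(p, chi)
    assert abs(abs(G) - p ** 0.5) < 1e-9                         # |G(chi)| = sqrt(p)
    assert abs(gauss_sum(p, chi, 5) - chi[5].conjugate() * G) < 1e-9
    assert abs(G.conjugate() - chi[p - 1] * gauss_sum(p, chi_bar)) < 1e-9
assert abs(gauss_sum(p, character(p, 0)) + 1) < 1e-9             # trivial character
```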
6.1.8 Remark We next present two theorems on uniform distribution of Gauss sums. The first,
due to Katz and Zheng [1713], was subsequently extended by Shparlinski [2652]. For the
second and some generalizations thereof, see Iwaniec and Kowalski [1581, Theorem 21.6],
Katz [1701, Chapter 9], and Fu and Liu [1139].
6.1.9 Theorem Consider the collection of $(q - 1)(q - 2)$ normalized Gauss sums
\[ G(\beta, \chi)/\sqrt{q}, \qquad \beta \in \mathbb{F}_q^*, \ \chi \ne \chi_0. \]
6.1.11 Definition Let $\beta \in \mathbb{F}_q^*$ and suppose that $q = kf + 1$ for some positive integer $k$. The
(reduced) $f$-nomial Gaussian periods $g(\beta, k)$ of order $k$ are defined by
\[ g(\beta, k) = \sum_{\alpha \in \mathbb{F}_q} \zeta_p^{\mathrm{Tr}(\beta\alpha^k)}. \]
6.1.12 Remark Gauss sums and periods have been used for counting solutions to diagonal equa-
tions over Fq [240, Chapters 10, 12] and for counting points on more general varieties
[1343, 1344, 2167]. Thaine [2791] has given an application to class groups of cyclotomic
fields. For some applications to coding theory, see [1078, 2225].
6.1.13 Theorem [240, p. 11]. Let $\beta \in \mathbb{F}_q^*$. If $\chi$ is a character on $\mathbb{F}_q$ of order $k$, then
\[ g(\beta, k) = \sum_{j=1}^{k-1} G(\beta, \chi^j). \]
In particular, $|g(\beta, k)| \le (k - 1)\sqrt{q}$.
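The identity of Theorem 6.1.13 and the resulting bound can be checked numerically over a prime field. The following Python sketch is an illustration under the assumption q = p prime with k dividing p − 1; the function names are hypothetical, and the character of order k is taken to be the one sending a fixed primitive root to $e^{2\pi i/k}$.

```python
import cmath

def primitive_root(p):
    """Smallest generator of the cyclic group F_p^* (brute force, p prime)."""
    for g in range(1, p):
        x, order = g, 1
        while x != 1:
            x = x * g % p
            order += 1
        if order == p - 1:
            return g

def period(p, k, beta):
    """g(beta, k): sum over alpha in F_p of zeta_p^(beta * alpha^k)."""
    zeta = cmath.exp(2j * cmath.pi / p)
    return sum(zeta ** (beta * pow(a, k, p) % p) for a in range(p))

def period_via_gauss_sums(p, k, beta):
    """Sum of G(beta, chi^e) for e = 1..k-1, with chi(g^i) = exp(2*pi*i*i/k)."""
    g = primitive_root(p)
    zeta = cmath.exp(2j * cmath.pi / p)
    total = 0j
    for e in range(1, k):
        x = 1
        for i in range(p - 1):              # x runs over the powers g^i
            total += cmath.exp(2j * cmath.pi * i * e / k) * zeta ** (x * beta % p)
            x = x * g % p
    return total

p, k = 13, 3
for beta in range(1, p):
    direct = period(p, k, beta)
    assert abs(direct - period_via_gauss_sums(p, k, beta)) < 1e-9
    assert abs(direct) <= (k - 1) * p ** 0.5 + 1e-9
```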
6.1.14 Remark The inequality above has been strengthened in several different ways, depending
on the relationship between p and k; see Heath-Brown and Konyagin [1454]. For further
estimates for Gauss sums, see [374, 1790, 2645].
6.1.15 Definition Let $q = p^r = kf + 1$ and let $\gamma$ be a generator of the cyclic group $\mathbb{F}_q^*$. The
polynomial
\[ R_k(x) = \prod_{s=0}^{k-1} \left(x - g(\gamma^s, k)\right), \]
whose zeros are (reduced) $f$-nomial Gaussian periods, is the (reduced) period polynomial
of degree $k$.
6.1.22 Remark Eisenstein sums can be applied to obtain congruences for binomial coefficients
[240, Section 12.9] and to evaluate Brewer sums [240, Chapter 13]. They are also useful for
evaluating Gauss sums g(k), via the following theorem.
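6.1.23 Theorem [240, p. 421] Let $\chi$ be a character of order $k$ on $\mathbb{F}_q$, and let $\chi^*$ denote the restriction
of $\chi$ to the prime field $\mathbb{F}_p$, so that $\chi^*$ is a character on $\mathbb{F}_p$ of order
\[ k^* = \frac{k}{\gcd(k, (q - 1)/(p - 1))}. \]
Then
\[ g(k) = \sum_{\substack{j=1 \\ k^* \nmid j}}^{k-1} G(\chi^{*j})\,E(\chi^j) \; - \; p \sum_{\substack{j=1 \\ k^* \mid j}}^{k-1} E(\chi^j). \]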
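6.1.24 Theorem [240, p. 391]. Let $\chi$ be a nontrivial character on $\mathbb{F}_q$, and let $\chi^*$ denote the restriction
of $\chi$ to $\mathbb{F}_p$. Then the Eisenstein sum $E(\chi)$ can be expressed in terms of the Gauss sum
$G(\chi)$ over $\mathbb{F}_q$ and the Gauss sum $G(\chi^*)$ over $\mathbb{F}_p$ as follows:
\[ E(\chi) = \begin{cases} G(\chi)/G(\chi^*) & \text{if } \chi^* \text{ is nontrivial,} \\ -G(\chi)/p & \text{if } \chi^* \text{ is trivial.} \end{cases} \]
As a consequence,
\[ |E(\chi)| = \begin{cases} p^{(r-1)/2} & \text{if } \chi^* \text{ is nontrivial,} \\ p^{(r-2)/2} & \text{if } \chi^* \text{ is trivial.} \end{cases} \]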
6.1.25 Definition Let $\chi, \psi$ be multiplicative characters on $\mathbb{F}_q$. The Jacobi sum $J(\chi, \psi)$ over $\mathbb{F}_q$
is defined by
\[ J(\chi, \psi) = \sum_{\alpha \in \mathbb{F}_q} \chi(\alpha)\,\psi(1 - \alpha). \]
We say that the Jacobi sum has order $k$ if $k$ is the least common multiple of the orders
of its arguments.
6.1.26 Remark Clearly J(χ, ψ) = J(ψ, χ) ∈ Q(ζq−1 ). The next four theorems follow easily from
the results in [240, Section 2.1].
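6.1.27 Theorem (Trivial Jacobi sums.) For characters $\chi, \psi$ on $\mathbb{F}_q$,
\[ J(\chi, \psi) = \begin{cases} q - 2 & \text{if } \psi, \chi \text{ are both trivial,} \\ -1 & \text{if exactly one of } \psi, \chi \text{ is trivial,} \\ -\chi(-1) & \text{if } \chi\psi \text{ is trivial with } \chi \text{ nontrivial.} \end{cases} \]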
J(χ, ψ) = G(χ)G(ψ)/G(χψ).
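The trivial Jacobi sums of Theorem 6.1.27 and the relation J(χ, ψ) = G(χ)G(ψ)/G(χψ) can be confirmed numerically over a prime field. In this Python sketch the helper names are hypothetical, q = p is assumed prime, and the Gauss-sum relation is tested only when χψ is nontrivial.

```python
import cmath

def primitive_root(p):
    """Smallest generator of the cyclic group F_p^* (brute force, p prime)."""
    for g in range(1, p):
        x, order = g, 1
        while x != 1:
            x = x * g % p
            order += 1
        if order == p - 1:
            return g

def character(p, e):
    """Value table of chi^e on F_p, where chi(g^i) = exp(2*pi*i*i/(p-1)); chi^e(0) = 0."""
    g, table, x = primitive_root(p), [0j] * p, 1
    for i in range(p - 1):
        table[x] = cmath.exp(2j * cmath.pi * i * e / (p - 1))
        x = x * g % p
    return table

def gauss_sum(p, chi):
    zeta = cmath.exp(2j * cmath.pi / p)
    return sum(chi[a] * zeta ** a for a in range(p))

def jacobi_sum(p, chi, psi):
    """J(chi, psi) = sum over alpha in F_p of chi(alpha) * psi(1 - alpha)."""
    return sum(chi[a] * psi[(1 - a) % p] for a in range(p))

p = 11
trivial = character(p, 0)
assert abs(jacobi_sum(p, trivial, trivial) - (p - 2)) < 1e-9
for e in range(1, p - 1):
    chi = character(p, e)
    assert abs(jacobi_sum(p, chi, trivial) + 1) < 1e-9
    assert abs(jacobi_sum(p, chi, character(p, p - 1 - e)) + chi[p - 1]) < 1e-9
    for f in range(1, p - 1):
        if (e + f) % (p - 1) != 0:          # chi * psi nontrivial
            psi, prod = character(p, f), character(p, e + f)
            ratio = gauss_sum(p, chi) * gauss_sum(p, psi) / gauss_sum(p, prod)
            assert abs(jacobi_sum(p, chi, psi) - ratio) < 1e-9
```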
Thus if $f$ denotes the order of $p$ (mod $k$), then $G(\chi)^k$ lies in a subfield of $\mathbb{Q}(\zeta_k)$ of index $f$.
6.1.31 Remark Louboutin [1960] used Theorem 6.1.30 for power residue characters attached to the
simplest real cyclic cubic, quartic, quintic, and sextic number fields, to efficiently compute
class numbers of these fields.
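6.1.32 Definition Let $\chi_1, \ldots, \chi_t$ be characters on $\mathbb{F}_q$. Define the multiple Jacobi sum $J(\chi_1, \ldots, \chi_t)$
by
\[ J(\chi_1, \ldots, \chi_t) = \sum_{\substack{\alpha_1, \ldots, \alpha_t \in \mathbb{F}_q \\ \alpha_1 + \cdots + \alpha_t = 1}} \chi_1(\alpha_1) \cdots \chi_t(\alpha_t). \]
Similarly define
\[ J_0(\chi_1, \ldots, \chi_t) = \sum_{\substack{\alpha_1, \ldots, \alpha_t \in \mathbb{F}_q \\ \alpha_1 + \cdots + \alpha_t = 0}} \chi_1(\alpha_1) \cdots \chi_t(\alpha_t). \]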
6.1.33 Remark Jacobi sums have applications to solving diagonal equations over finite fields and
to discrete log cryptosystems [240, Chapter 10]. They have been used to determine the
cardinality of certain classes of irreducible polynomials over Fq with prescribed trace and
restricted norm [1365, 1783]. See [2948, Chapter 16] for an application to primality testing,
and [1146] for an application to coding theory.
6.1.34 Remark The next four theorems can be proved by a straightforward modification of the
proofs in [240, Sections 10.1–10.3].
6.1.35 Theorem If $\chi_1, \ldots, \chi_t$ are all trivial characters on $\mathbb{F}_q$, then
\[ J(\chi_1, \ldots, \chi_t) = \frac{(q - 1)^t - (-1)^t}{q}, \qquad J_0(\chi_1, \ldots, \chi_t) = J(\chi_1, \ldots, \chi_t) + (-1)^t. \]
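Because the trivial character vanishes at 0 under the convention of Definition 6.1.1, the sums in Theorem 6.1.35 simply count tuples of nonzero elements with a prescribed sum. A brute-force Python check over small prime fields (a sketch with a hypothetical function name, q = p prime) reads as follows.

```python
from itertools import product

def count_nonzero_tuples(p, t, target):
    """Number of (a_1, ..., a_t) with all a_i nonzero in F_p and a_1 + ... + a_t = target."""
    return sum(1 for a in product(range(1, p), repeat=t) if sum(a) % p == target)

for p, t in [(3, 2), (5, 3), (7, 4)]:
    J = count_nonzero_tuples(p, t, 1)    # J(chi_0, ..., chi_0)
    J0 = count_nonzero_tuples(p, t, 0)   # J_0(chi_0, ..., chi_0)
    assert J * p == (p - 1) ** t - (-1) ** t
    assert J0 == J + (-1) ** t
```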
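6.1.36 Theorem Suppose that $\chi_1, \ldots, \chi_t$ are characters on $\mathbb{F}_q$ which are not all trivial. Then
\[ J_0(\chi_1, \ldots, \chi_t) = \begin{cases} (1 - q)\,J(\chi_1, \ldots, \chi_t) & \text{if } \chi_1 \cdots \chi_t \text{ is trivial,} \\ 0 & \text{otherwise.} \end{cases} \]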
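6.1.37 Theorem (Reduction formula) Suppose that $\chi_1, \ldots, \chi_t$ are characters on $\mathbb{F}_q$ such that $\chi_t$
is nontrivial. Then
\[ J(\chi_1, \ldots, \chi_t) = \begin{cases} -(-1)^t & \text{if } \chi_1, \ldots, \chi_{t-1} \text{ are all trivial,} \\ J(\chi_t, \chi_1 \cdots \chi_{t-1})\,J(\chi_1, \ldots, \chi_{t-1}) & \text{if } \chi_1 \cdots \chi_{t-1} \text{ is nontrivial,} \\ -q\,J(\chi_1, \ldots, \chi_{t-1}) & \text{otherwise.} \end{cases} \]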
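6.1.38 Theorem Suppose that the characters $\chi_1, \ldots, \chi_t$ on $\mathbb{F}_q$ are not all trivial. Then
\[ J(\chi_1, \ldots, \chi_t) = \begin{cases} G(\chi_1) \cdots G(\chi_t)/G(\chi_1 \cdots \chi_t) & \text{if } \chi_1 \cdots \chi_t \text{ is nontrivial,} \\ -G(\chi_1) \cdots G(\chi_t)/q & \text{otherwise.} \end{cases} \]
Thus if $\chi_1, \ldots, \chi_t$ are all nontrivial,
\[ |J(\chi_1, \ldots, \chi_t)| = \begin{cases} q^{(t-1)/2} & \text{if } \chi_1 \cdots \chi_t \text{ is nontrivial,} \\ q^{(t-2)/2} & \text{otherwise.} \end{cases} \]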
6.1.39 Remark We next present two theorems on the uniform distribution of Jacobi sums. The
first is due to Katz and Zheng [1713]. For the second and generalizations thereof, see Katz
[1711, Corollary 20.3]. (Katz’s Corollary 20.3 for r = 1 is equivalent to his Theorem 17.5
with n = 1.)
6.1.40 Theorem Consider the collection of $(q - 2)(q - 3)$ normalized Jacobi sums
\[ J(\chi, \psi)/\sqrt{q}, \qquad \chi, \psi, \chi\psi \text{ all nontrivial on } \mathbb{F}_q. \]
As $q$ tends to infinity, this collection is asymptotically equidistributed on the complex unit
circle.
6.1.41 Theorem For a fixed nontrivial character $\psi$ on $\mathbb{F}_q$, consider the collection of $q - 3$ normalized
Jacobi sums
\[ J(\chi, \psi)/\sqrt{q}, \qquad \chi \ne \chi_0, \ \chi \ne \overline{\psi}. \]
As $q$ tends to infinity, this collection is asymptotically equidistributed on the complex unit
circle.
6.1.42 Remark Suppose that in place of the collection above, one considers the more general
collection of normalized multiple Jacobi sums
\[ J(\chi_1, \ldots, \chi_m, \psi_1, \ldots, \psi_n)/q^{(n+m-1)/2}, \]
where the ψj are fixed nontrivial characters and where the χi run through all nontrivial
characters for which χ1 · · · χm ψ1 · · · ψn is nontrivial. Katz [email communication, 2011] has
shown that as q tends to infinity, this collection is asymptotically equidistributed on the
complex unit circle, except in the “degenerate” case where both m = 1 and ψ1 · · · ψn is
trivial.
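6.1.43 Definition (Lifted Gauss sums) Let $\chi$ be a character on $\mathbb{F}_q$, and let $m$ be a positive integer.
Recall that the Gauss sum $G(\chi)$ on $\mathbb{F}_q$ is defined by
\[ G(\chi) = \sum_{\alpha \in \mathbb{F}_q} \chi(\alpha)\,\zeta_p^{\mathrm{Tr}(\alpha)}. \]
The lift of $G(\chi)$ to the extension field $\mathbb{F}_{q^m}$ is the Gauss sum
\[ G_m(\chi') = \sum_{\delta \in \mathbb{F}_{q^m}} \chi'(\delta)\,\zeta_p^{\mathrm{Tr}(\delta)}, \]
where Tr is the trace from $\mathbb{F}_{q^m}$ to $\mathbb{F}_p$ and $\chi'$ is the character on $\mathbb{F}_{q^m}$ defined by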
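6.1.44 Definition (Lifted Jacobi sums) Suppose that $\chi_1, \ldots, \chi_t$ are characters on $\mathbb{F}_q$, and let $m$
be a positive integer. The lift of the Jacobi sum
\[ J(\chi_1, \ldots, \chi_t) = \sum_{\substack{\alpha_1, \ldots, \alpha_t \in \mathbb{F}_q \\ \alpha_1 + \cdots + \alpha_t = 1}} \chi_1(\alpha_1) \cdots \chi_t(\alpha_t) \]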
6.1.45 Theorem (Hasse-Davenport theorem on lifted Gauss sums [240, p. 360]) Let χ be a character
on Fq , and let m be a positive integer. Then in the notation of Definition 6.1.43,
6.1.46 Remark The next corollary follows with the aid of Theorem 6.1.38.
6.1.47 Corollary (Hasse-Davenport theorem on lifted Jacobi sums.) Suppose that χ1 , . . . , χt are
characters on Fq which are not all trivial, and let m be a positive integer. Then in the
notation of Definition 6.1.44,
6.1.48 Definition A Gauss or Jacobi sum is pure if some positive integral power of it is real.
6.1.49 Example Quadratic Gauss sums are pure, by Theorem 6.1.86. Another example is given by
the following theorem of Stickelberger, proved in [240, Section 11.6].
6.1.50 Theorem Let $\chi$ be a character of order $k > 2$ on $\mathbb{F}_q$, where $q = p^r$. Suppose that there is
a positive integer $t$ such that $p^t \equiv -1 \pmod k$, with $t$ chosen minimal. Then $r = 2ts$ for
some positive integer $s$, and
\[ q^{-1/2}\,G(\chi) = \begin{cases} (-1)^{s-1} & \text{if } p = 2, \\ (-1)^{s-1+(p^t+1)s/k} & \text{if } p > 2. \end{cases} \]
6.1.51 Theorem [1011] Let $\chi$ be a character of order $k$ on $\mathbb{F}_q$. Then the Gauss sums $G(\chi^j)$ are
pure for all integers $j$ if and only if $-1$ is a power of $p$ (mod $k$). In the special case that $k$
is a prime power, $G(\chi)$ is pure if and only if $-1$ is a power of $p$ (mod $k$).
6.1.52 Theorem [1011] Let $\chi$ be a character of order $k > 1$ on $\mathbb{F}_q$, where $q = p^r$. If $G(\chi)$ is pure,
then $2(q - 1)/(k(p - 1))$ is an integer with the same parity as $r$. In particular, if $r = 1$ and
$k > 2$, then $G(\chi)$ is not pure, i.e., the normalized Gauss sum $G(\chi)/\sqrt{p}$ on $\mathbb{F}_p$ cannot equal
a root of unity when $\chi$ is a character on $\mathbb{F}_p$ of order $> 2$. Also, if $r = 2$, then $G(\chi)$ is pure
if and only if $k \mid (p + 1)$.
6.1.53 Remark The next theorem, due to Aoki [112], gives further examples of pure Gauss sums.
See also Aoki [113].
6.1.59 Theorem (Hasse-Davenport product formula for Gauss sums [240, p. 351]) Let $\psi$ be a
character on $\mathbb{F}_q$ of order $\ell > 1$. For every character $\chi$ on $\mathbb{F}_q$,
\[ \prod_{i=1}^{\ell-1} G(\chi\psi^i)/G(\psi^i) = \overline{\chi}^{\ell}(\ell)\,G(\chi^{\ell})/G(\chi). \]
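The Hasse-Davenport product formula lends itself to a numerical spot check over a prime field. The Python sketch below uses hypothetical helper names, assumes q = p prime with ℓ dividing p − 1, and tests the identity in the form $\prod_{i=1}^{\ell-1} G(\chi\psi^i)/G(\psi^i) = \overline{\chi}^{\ell}(\ell)\,G(\chi^{\ell})/G(\chi)$, that is, with the complex conjugate of χ evaluated at ℓ.

```python
import cmath

def primitive_root(p):
    """Smallest generator of the cyclic group F_p^* (brute force, p prime)."""
    for g in range(1, p):
        x, order = g, 1
        while x != 1:
            x = x * g % p
            order += 1
        if order == p - 1:
            return g

def character(p, e):
    """Value table of chi^e on F_p, where chi(g^i) = exp(2*pi*i*i/(p-1)); chi^e(0) = 0."""
    g, table, x = primitive_root(p), [0j] * p, 1
    for i in range(p - 1):
        table[x] = cmath.exp(2j * cmath.pi * i * e / (p - 1))
        x = x * g % p
    return table

def gauss_sum(p, chi):
    zeta = cmath.exp(2j * cmath.pi / p)
    return sum(chi[a] * zeta ** a for a in range(p))

def hasse_davenport_gap(p, l, e):
    """|LHS - RHS| of the product formula for chi = (generator)^e and psi of order l."""
    step = (p - 1) // l                      # psi = (generator)^step has order l
    chi = character(p, e)
    lhs = 1
    for i in range(1, l):
        lhs *= gauss_sum(p, character(p, e + i * step)) / gauss_sum(p, character(p, i * step))
    rhs = chi[l % p].conjugate() ** l * gauss_sum(p, character(p, l * e)) / gauss_sum(p, chi)
    return abs(lhs - rhs)

assert all(hasse_davenport_gap(13, l, e) < 1e-9 for l in (2, 3, 4) for e in range(12))
```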
6.1.60 Remark The Hasse-Davenport formula for products of Gauss sums (Theorem 6.1.59) is
the finite field analogue of the Gauss multiplication formula for gamma functions. Work of
Kubert and Lichtenbaum [1809] on Jacobi sum Hecke characters led to identities involving
products of Gauss sums which extended the Hasse-Davenport product formula. (For calcula-
tion of conductors of Jacobi sum Hecke characters, see [2610].) Other examples of identities
involving products of Gauss sums may be found in [813, 1010, 1013, 1014, 1015, 2854]. For
evaluations of special hypergeometric character sums over Fq in terms of products of Gauss
sums, see papers of Evans and Greene [1007, 1008].
6.1.65 Remark We next give two congruences for Jacobi sums; see [240, pp. 60, 97]. More general
congruences are given in [1001].
6.1.66 Theorem Let $\chi, \psi$ be nontrivial characters on $\mathbb{F}_q$ of orders $a, b$, respectively. Then
\[ J(\chi, \psi) \equiv -q \pmod{(1 - \zeta_a)(1 - \zeta_b)}. \]
If moreover $a = b > 2$, then the right member $-q$ may be replaced by $-1$.
6.1.67 Theorem Let $\chi$ be a character on $\mathbb{F}_q$ of order $2\ell$, where $\ell > 1$ is odd, and let $n$ denote an
even integer not divisible by $\ell$. Then
\[ J(\chi, \chi^n) \equiv -\chi^n(4) \pmod{(1 - \zeta_{\ell})^2}. \]
6.1.68 Definition Let γ be a generator of the cyclic group F∗q , and let χ be a character of order k
on Fq for which χ(γ) = ζk . For any pair of integers a, b (mod k), define the cyclotomic
number C(a, b) = C(γ, a, b) of order k over Fq to be the number of α ∈ F∗q for which
6.1.69 Remark Cyclotomic numbers are useful for obtaining residuacity criteria. For example, the
cyclotomic numbers of order 12 can be used to prove that for a prime p ≡ 1 (mod 12), 3
is a quartic residue (mod p) if and only if a3 ≡ −1 (mod 4), where a3 is as in Definition
6.1.74. See [240, p. 231].
6.1.70 Theorem [240, p. 365] In the notation of Definition 6.1.68, the cyclotomic numbers $C(a, b)$
are related to the Jacobi sums $J(\chi^u, \chi^v)$ by the following finite Fourier series expansions:
\[ k^2\,C(a, b) = \sum_{u=0}^{k-1} \sum_{v=0}^{k-1} \chi^u(-1)\,J(\chi^u, \chi^v)\,\zeta_k^{-au-bv} \]
and
\[ \chi^u(-1)\,J(\chi^u, \chi^v) = \sum_{a=0}^{k-1} \sum_{b=0}^{k-1} C(a, b)\,\zeta_k^{au+bv}. \]
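6.1.71 Definition Let $p$ be an odd prime. For a positive integer $k$ and an integer $a$ not divisible
by $p$, define the Jacobsthal sums $U_k(a)$, $V_k(a)$ over $\mathbb{F}_p$ by
\[ U_k(a) = \sum_{m=0}^{p-1} \left(\frac{m}{p}\right) \left(\frac{m^k + a}{p}\right), \qquad V_k(a) = \sum_{m=0}^{p-1} \left(\frac{m^k + a}{p}\right), \]
where $\left(\frac{\cdot}{p}\right)$ denotes the Legendre symbol.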
6.1.72 Remark Jacobsthal sums have applications to the distribution of quadratic residues [240,
Chapter 6], to evaluations of Brewer sums [240, Chapter 13], and to the evaluation of
certain hypergeometric character sums [240, Equation (13.3.2)]. The next theorem expresses
Jacobsthal sums in terms of Jacobi sums.
6.1.73 Theorem [240, pp. 188–189]. For a prime $p \equiv 1 \pmod{2k}$, let $\chi$ be a character on $\mathbb{F}_p$ of
order $2k$. Let $a$ be an integer not divisible by $p$. Then
\[ U_k(a) = \chi(-1) \left(\frac{a}{p}\right) \sum_{j=0}^{k-1} \chi^{2j+1}(4a)\,J(\chi^{2j+1}, \chi^{2j+1}) \]
and
\[ V_k(a) = \left(\frac{a}{p}\right) \sum_{j=1}^{k-1} \chi^{2j}(4a)\,J(\chi^{2j}, \chi^{2j}). \]
6.1.75 Remark Given $p$, the parameters $a_3$, $r_3$, and $u_3$ are uniquely determined, but $b_3$, $s_3$, and
$v_3$ are determined only up to sign. These parameters appear below in the evaluations of
cubic and sextic Gauss and Jacobi sums over $\mathbb{F}_p$. In the case that 2 is a cubic nonresidue
(mod $p$), we have $3 \nmid b_3$ and the parameters $r_3, s_3, u_3, v_3$ are odd. In the case that 2 is a
cubic residue (mod $p$), we have $3 \mid b_3$ and $r_3, s_3, u_3, v_3$ are even; see [240, Section 3.1].
6.1.76 Theorem (Cubic Jacobi sums [240, Section 3.1]) For $q = p = 6f + 1$, let $\chi$ be a character
of order 3 on $\mathbb{F}_p$. Then
\[ J(\chi, \chi) = (r_3 + i s_3 \sqrt{3})/2. \]
6.1.77 Theorem (Sextic Jacobi sums [240, Section 3.1]) For $q = p = 6f + 1$, let $\chi$ be a character
of order 6 on $\mathbb{F}_p$. Then
\[ J(\chi, \chi) = (-1)^f (u_3 + i v_3 \sqrt{3})/2 = (-1)^f J(\chi, \chi^4), \]
\[ J(\chi, \chi^2) = J(\chi^3, \chi^2) = a_3 + i b_3 \sqrt{3} = (-1)^f J(\chi, \chi^3). \]
6.1.79 Theorem (Quartic Jacobi sums [240, Section 3.2]) For q = p = 4f + 1, let χ be a character
of order 4 on Fp . Then
6.1.81 Theorem (Octic Jacobi sums [240, Section 3.3]) For $q = p = 8f + 1$, let $\chi$ be a character of
order 8 on $\mathbb{F}_p$. Then
\[ J(\chi, \chi) = \chi(4)(a_8 + i b_8 \sqrt{2}) = \chi(-4)\,J(\chi, \chi^3), \qquad J(\chi, \chi^2) = \chi(-4)(a_4 + i b_4). \]
6.1.82 Theorem (Duodecic Jacobi sums [240, Section 3.5]) For q = p = 12f +1, let χ be a character
of order 12 on Fp so that
J(χ3 , χ3 ) = (−1)f (a4 + ib4 )
as in Theorem 6.1.79. Then
where the plus sign is chosen if 3 | b4 and the minus sign is chosen if 3 | a4 . The three
additional duodecic Jacobi sums below are expressed in terms of previously evaluated Jacobi
sums of orders 6, 4, 3, respectively:
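where
\[ \begin{cases} c_{12} = \pm 1 \text{ with } c_{12} \equiv -a_4 \pmod 3 & \text{if } 3 \mid b_4, \\ c_{12} = \pm i \text{ with } c_{12} \equiv -i b_4 \pmod 3 & \text{if } 3 \mid a_4. \end{cases} \]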
6.1.83 Remark Values of all duodecic Jacobi sums may be deduced from Theorem 6.1.82. For
example,
J(χ3 , χ5 ) = σ5 J(χ, χ3 ) = J(χ, χ3 ),
where σ5 is as in Definition 6.1.99 with k = 12. Niitsuma [2288] applied duodecic Jacobi
sum evaluations to count rational points on certain hyperelliptic curves over Fp .
6.1.84 Remark Jacobi sums of various other small orders are explicitly evaluated in [240]; e.g.,
quintic Jacobi sums are computed in [240, Section 3.7]. For the quintic case, see also Hoshi
[1538] for an analysis of Gauss sums, Jacobi sums, and period polynomials. Values of Jacobi
sums of order 16 have been applied to construct regular Hadamard matrices [1908].
6.1.85 Theorem (Quadratic multiple Jacobi sums [240, p. 299]) Suppose that $\chi_1, \ldots, \chi_t$ are all
equal to the quadratic character $\rho$ on $\mathbb{F}_q$. Then
\[ J(\chi_1, \ldots, \chi_t) = \begin{cases} \rho(-1)^{(t-1)/2}\,q^{(t-1)/2} & \text{if } t \text{ is odd,} \\ -\rho(-1)^{t/2}\,q^{(t-2)/2} & \text{if } t \text{ is even.} \end{cases} \]
6.1.86 Theorem (Quadratic Gauss sums [240, p. 362]) Let $q = p^r$ for an odd prime $p$, and let $\rho$
be the quadratic character on $\mathbb{F}_q$. Then
\[ g(2) = G(\rho) = \begin{cases} (-1)^{r-1} \sqrt{q} & \text{if } p \equiv 1 \pmod 4, \\ (-1)^{r-1}\,i^r \sqrt{q} & \text{if } p \equiv 3 \pmod 4. \end{cases} \]
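For r = 1 the evaluation in Theorem 6.1.86 reduces to G(ρ) = √p when p ≡ 1 (mod 4) and G(ρ) = i√p when p ≡ 3 (mod 4). The following Python sketch (hypothetical function names, prime fields only, floating-point arithmetic) checks this for a few small primes, using Euler's criterion for the quadratic character.

```python
import cmath

def quadratic_gauss_sum(p):
    """G(rho) over F_p for an odd prime p."""
    zeta = cmath.exp(2j * cmath.pi / p)
    total = 0j
    for a in range(1, p):
        rho = 1 if pow(a, (p - 1) // 2, p) == 1 else -1   # Euler's criterion
        total += rho * zeta ** a
    return total

def predicted(p):
    """Value given by Theorem 6.1.86 in the case r = 1."""
    return p ** 0.5 if p % 4 == 1 else 1j * p ** 0.5

for p in (3, 5, 7, 11, 13, 17, 19):
    assert abs(quadratic_gauss_sum(p) - predicted(p)) < 1e-9
```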
6.1.87 Remark The Gauss sums in Theorem 6.1.86 lie in a quadratic extension of Q. For evalu-
ations of general Gauss sums over Fq lying in quadratic and multi-quadratic extensions of
Q, see [115, 2042, 3026, 3027].
6.1.88 Theorem (Cubic periods [240, Section 4.1]) Let $q = p \equiv 1 \pmod 6$, so that $4p = r_3^2 + 3s_3^2$
with $r_3 \equiv 1 \pmod 3$, $s_3 \equiv 0 \pmod 3$. Let $b$ be a primitive root of $p$. Then the cubic
irreducible polynomial $x^3 - 3px - pr_3$ has the three real zeros $g(3)$, $g(b, 3)$, and $g(b^2, 3)$, with
one zero in each of the three intervals $(-2\sqrt{p}, -\sqrt{p})$, $(-\sqrt{p}, \sqrt{p})$, $(\sqrt{p}, 2\sqrt{p})$.
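Theorem 6.1.88 can be illustrated numerically: compute $r_3$ by a small search, evaluate the three cubic periods as exponential sums, and confirm that they are zeros of $x^3 - 3px - pr_3$, one in each interval. The Python sketch below uses hypothetical function names and floating-point arithmetic, and assumes p is a prime with p ≡ 1 (mod 6).

```python
import cmath
from math import isqrt

def primitive_root(p):
    """Smallest generator of the cyclic group F_p^* (brute force, p prime)."""
    for g in range(1, p):
        x, order = g, 1
        while x != 1:
            x = x * g % p
            order += 1
        if order == p - 1:
            return g

def cubic_parameters(p):
    """(r3, |s3|) with 4p = r3^2 + 3*s3^2, r3 = 1 (mod 3), s3 = 0 (mod 3), by search."""
    for s in range(3, isqrt(4 * p // 3) + 1, 3):
        rest = 4 * p - 3 * s * s
        r = isqrt(rest)
        if r * r == rest:
            return (r if r % 3 == 1 else -r), s

def cubic_period(p, beta):
    """g(beta, 3): sum over alpha in F_p of zeta_p^(beta * alpha^3)."""
    zeta = cmath.exp(2j * cmath.pi / p)
    return sum(zeta ** (beta * pow(a, 3, p) % p) for a in range(p))

def check_cubic_periods(p):
    """(periods are zeros of the cubic, one zero lies in each interval)."""
    r3, _ = cubic_parameters(p)
    b = primitive_root(p)
    zeros = sorted(cubic_period(p, beta).real for beta in (1, b, b * b % p))
    residual = max(abs(x ** 3 - 3 * p * x - p * r3) for x in zeros)
    root_p = p ** 0.5
    in_intervals = (-2 * root_p < zeros[0] < -root_p
                    < zeros[1] < root_p < zeros[2] < 2 * root_p)
    return residual < 1e-6, in_intervals

print(cubic_parameters(7), check_cubic_periods(7))
```

For p = 7 one has 28 = 1 + 27, so $r_3 = 1$ and the polynomial is $x^3 - 21x - 7$.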
6.1.89 Remark No simple criterion is known for determining which of the three intervals above
contains the cubic Gauss sum g(3). However, g(3) and the Gauss sums G(χ) for cubic
characters χ have been evaluated in terms of products of values of Weierstrass ℘-functions;
see [240, p. 158]. The next theorem, due to Heath-Brown and Patterson [1455], gives an
equidistribution result for cubic Gauss character sums over Fp .
6.1.90 Problem Determining the distribution of n-th order Gauss sums over Fp for a general fixed
n is an open problem.
6.1.91 Theorem Consider the collection of all normalized cubic Gauss sums
\[ G(\chi)/\sqrt{p}, \qquad \chi \text{ cubic on } \mathbb{F}_p, \ q = p \equiv 1 \pmod 3, \ p < x. \]
6.1.92 Remark The next theorem determines the sextic Gaussian period g(6) unambiguously, once
g(3) is known.
6.1.93 Theorem (Sextic periods [1003]) Let $q = p \equiv 1 \pmod 6$, so that $4p = r_3^2 + 3s_3^2$ with
$r_3 \equiv 1 \pmod 3$, $s_3 \equiv 0 \pmod 3$. In the case that 2 is a cubic residue (mod $p$),
\[ g(6) = g(3) + i^{(p-1)^2/4} \left(g(3)^2 - p\right)/\sqrt{p}. \]
In the case that 2 is a cubic nonresidue (mod $p$), then with the sign of $s_3$ specified by
$s_3 \equiv -r_3 \pmod 4$,
\[ g(6) = g(3) + i^{(p-1)^2/4} \left\{4p - g(3)^2 + s_3^{-1}\left(2p\,g(3) + 2p\,r_3 - r_3\,g(3)^2\right)\right\}/(2\sqrt{p}). \]
6.1.94 Remark For the history behind the quartic Gauss sum evaluations in the two theorems
below, see [240, p. 162].
6.1.95 Theorem (Quartic Gauss sums [240, Section 4.2]) Let q = p ≡ 1 (mod 4), and let χ be a
quartic character on Fp . As in Theorem 6.1.79, write J(χ, χ) = a + bi, where p = a2 + b2
with a ≡ −1 (mod 4). (Note that the sign of b depends on the choice of the quartic character
χ.) Define C = ±1 by
|b| p − 1
C ≡ (−1)(p−1)/4 ! (mod p).
a 2
Then r √ r √ !
|b| p+a p |b| p − a p
(b2 +2b)/8
G(χ) = C(−1) +i ,
|a| 2 b 2
6.1.96 Theorem (Quartic periods [240, Section 4.2]) In the notation of the previous theorem, if
p ≡ 1 (mod 8),
|b|
√ √
2
q
g(4) = p+C (−1)(b +2|b|)/8
2p + 2a p,
|a|
while if p ≡ 5 (mod 8),
√ |b|
√
2
q
g(4) = p + iC (−1)(b +2|b|)/8
2p − 2a p.
|a|
6.1.97 Remark The next theorem determines the Gaussian period g(12) unambiguously, once g(3)
is known. For an extension to Fq , see Gurak [1363].
6.1.98 Theorem (Duodecic periods [1003]) Let q = p = 12f + 1, so that as in Theorem 6.1.78,
p = a2 + b2 with a ≡ −(−1)f (mod 4). In the case that −3 is a quartic residue (mod p),
−a
√
√
g(12) = g(6) + (g(4) − p) 1 + g(3)/ p .
3
In the case that −3 is a quartic nonresidue (mod p) (which is equivalent to 3 - b by [240,
Section 7.2]) then with the sign of b specified by b ≡ −1 (mod 3),
√ 2 √
g(12) = g(6) + g(4) − p + 2b g(3)/ (g(4) − p) .
p
6.1.99 Definition Fix an integer $k > 1$. In this subsection, $q = p^f$, where $f$ is the order of $p$
(mod $k$). For the group $R = (\mathbb{Z}/k\mathbb{Z})^*$, let $T$ denote a complete set of $\varphi(k)/f$ coset
representatives of the quotient group $R/\langle p \rangle$. For an integer $a$, let $\ell(a)$ denote the least
nonnegative integer congruent to $a$ (mod $k$). Write
\[ s(a) = \sum_{i=0}^{f-1} a_i, \qquad t(a) = \prod_{i=0}^{f-1} a_i!, \]
$K = \mathbb{Q}(\zeta_k)$, $M = \mathbb{Q}(\zeta_k, \zeta_p)$
6.1.100 Theorem [240, p. 343] Let $\pi = \zeta_p - 1$. We have the prime ideal factorizations
\[ \pi O_M = \prod_{j \in T} \mathfrak{P}_j, \qquad pO_M = \prod_{j \in T} \mathfrak{P}_j^{\,p-1}, \qquad pO_K = \prod_{j \in T} P_j. \]
6.1.101 Definition (Power residue symbol $\chi_P$) Define the character $\chi_P$ of order k on the finite field $O_K/P$ by setting $\chi_P(\alpha + P)$ equal to the unique power of $\zeta_k$ which is congruent to $\alpha^{(q-1)/k}$ (mod P), for every $\alpha \in O_K$ with $\alpha \notin P$. If $\alpha \in P$, set $\chi_P(\alpha + P) = 0$.
6.1.102 Theorem (Stickelberger’s congruence for Gauss sums [240, p. 344]) For any integer a, the Gauss sum $G(\chi_P^{-a})$ over the finite field $O_K/P$ is an element of $O_M$ satisfying the congruence
\[ G(\chi_P^{-a}) \equiv \frac{-\pi^{s(a)}}{t(a)} \pmod{\mathfrak{P}^{s(a)+1}}. \]
6.1.103 Remark For the following two corollaries, see Conrad [714].
6.1.104 Corollary If 0 ≤ a, b < k with a, b not both 0, then with u := (q − 1)/k,
\[ J(\chi_P^{-a}, \chi_P^{-b}) \equiv -(-1)^{au}\binom{q-1-bu}{au} \pmod P. \]
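In the special case q = p and k = p − 1 (so u = 1), the character $\chi_P$ satisfies $\chi_P(x) \equiv x \pmod P$, and the congruence for $J(\chi_P^{-a}, \chi_P^{-b})$ becomes an identity in $\mathbb{F}_p$ between $\sum_x x^{-a}(1-x)^{-b}$ and $-(-1)^a\binom{p-1-b}{a}$. The sketch below checks this reduced statement for 1 ≤ a, b ≤ p − 2; the restriction to this range and the function names are assumptions made for the illustration.

```python
from math import comb

def jacobi_sum_mod_p(p, a, b):
    """Sum over x in F_p of x^(-a) * (1 - x)^(-b), computed in F_p.
    The terms x = 0 and x = 1 vanish because 1 <= a, b <= p - 2."""
    return sum(pow(x, p - 1 - a, p) * pow((1 - x) % p, p - 1 - b, p)
               for x in range(2, p)) % p

def binomial_side(p, a, b):
    """Right-hand side -(-1)^a * binom(p - 1 - b, a) reduced mod p."""
    return (-(-1) ** a * comb(p - 1 - b, a)) % p
```

Both sides vanish mod p when a + b > p − 1, which matches the fact that the binomial coefficient is then zero.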
6.1.105 Corollary Let k = q − 1 (so that χP has order q − 1), and let
0 ≤ bi < q − 1, i = 1, 2, . . . , t,
6.1.107 Theorem (Prime ideal factorization of Gauss sums [240, p. 346]) For any integer a,
\[ G(\chi_P^{-a})\,O_M = \prod_{j \in T} \mathfrak{P}_{j^{-1}}^{\,s(aj)}. \]
\[ G(\chi_P^{-a})^k\,O_K = \prod_{j \in T} P_{j^{-1}}^{\,ks(aj)/(p-1)} = \prod_{j \in R} P_{j^{-1}}^{\,\ell(aj)}. \]
6.1.109 Theorem (Prime ideal factorization of Jacobi sums [240, p. 346]) Suppose that a, b are integers such that a + b is not divisible by k. Then
\[ J(\chi_P^{-a}, \chi_P^{-b})\,O_K = \prod_{j \in T} P_{j^{-1}}^{\,v(a,b,j)}, \quad \text{with } v(a,b,j) = \frac{s(aj) + s(bj) - s(aj + bj)}{p-1}. \]
In particular, for q = p,
\[ J(\chi_P, \chi_P)\,O_K = \prod_{\substack{1 \le j < k/2 \\ (j,k)=1}} P_{j^{-1}}. \]
6.1.110 Definition Let Qp denote the field of p-adic rationals, and let Zp be its ring of p-adic
integers. Consider the extension field Qp (ζ), where ζ is a primitive p-th root of the
element 1 ∈ Zp . For π = ζ − 1, let λ denote the prime element in Qp (ζ) satisfying
6.1.112 Remark The following theorem of Gross-Koblitz expresses the p-adic Gauss sum in terms
of Morita’s p-adic gamma functions. For a relatively elementary proof, see Robert [2463].
6.1.113 Theorem (Gross-Koblitz formula for p-adic Gauss sums) Let a be any integer. Viewing the Gauss sum $G(\chi_P^{-a}) \in O_M$ as embedded in the subfield $\mathbb{Q}_p(\zeta)$ of the $\mathfrak{P}$-adic completion of M, we have the following equality in $\mathbb{Z}_p[\zeta]$:
\[ G(\chi_P^{-a}) = -\lambda^{s(a)} \prod_{i=0}^{f-1} \Gamma_p\!\left(\frac{\ell(ap^i)}{k}\right). \]
6.1.114 Remark The next corollary follows with the aid of Theorem 6.1.38.
6.1.115 Corollary (Gross-Koblitz formula for p-adic Jacobi sums) Let b1 , . . . , bt be integers such
that c := b1 + · · · + bt is not divisible by k. Viewing the Jacobi sum J(χ−b −bt
P , . . . , χP ) ∈ OK
1
as embedded in the subfield Qp of the P -adic completion of K, we have the following equality
in Zp :
fY
−1
`(b1 pi ) `(bt pi ) `(cpi )
J(χ−b1 −bt
P , . . . , χP ) = (−1) t−1 u
(−p) Γp · · · Γp Γp ,
i=0
k k k
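6.1.118 Definition For $u \in \mathbb{F}_q$ and a multiplicative character $\chi$ on $\mathbb{F}_q$, define the (twisted) Kloosterman sum $K(u, \chi)$ over $\mathbb{F}_q$ by
\[ K(u, \chi) = \sum_{\alpha \in \mathbb{F}_q^*} \chi(\alpha)\, \zeta_p^{\mathrm{Tr}(\alpha + u/\alpha)}. \]
Note that $K(0, \chi) = G(\chi)$. When $\chi$ is trivial, we abbreviate $K(u) = K(u, \chi)$.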
6.1.119 Remark Kloosterman sums occur frequently in the theory of modular forms, and they have
many applications in analytic number theory [1453, 1581, 2524]. For some applications to
coding theory, see [620, 1562, 1732, 2124]. For further applications, see the references in [503,
p. 448]. Some congruences for Kloosterman sums may be found in [592, 1299, 1946, 2128].
In [1785], it is proved that if K(u) is an integer and p > 3, then K(u) is even. This proves
in particular the nonvanishing of 1 + K(u) for p > 3. For analysis of the cases p = 2, 3, see
[49]. The vanishing of 1 + K(u) has applications to bent functions, defined in Section 9.3.
(See [2086] for recent work on bent and hyper-bent functions.)
This is also known as a (twisted) hyper-Kloosterman sum. It reduces to the sum given
in Definition 6.1.118 when m = 1.
6.1.123 Remark Wan [2915] studied the algebraic degree of multiple Kloosterman sums. For further
results on the degrees of Kloosterman sums, see [1784]. The next theorem gives an easily
proved expression for multiple Kloosterman sums in terms of Gauss sums.
\[ K(u, \chi_1, \ldots, \chi_m) = \frac{1}{q-1} \sum_{\chi} \overline{\chi}(u) \prod_{i=0}^{m} G(\chi\chi_i), \]
where $\chi_0$ is the trivial character and where the sum is over all characters $\chi$ on $\mathbb{F}_q$.
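For m = 1 and q = p the expansion of a Kloosterman sum in Gauss sums reads $K(u, \chi_1) = \frac{1}{p-1}\sum_{\chi} \overline{\chi}(u)\, G(\chi)\, G(\chi\chi_1)$ for $u \ne 0$, which follows from Fourier inversion on $\mathbb{F}_p^*$. A minimal numerical sketch is given below; it assumes that every Gauss sum runs over nonzero x (so the trivial character has Gauss sum −1) and indexes characters by an exponent j relative to a primitive root. The helper names are illustrative.

```python
import cmath

def primitive_root(p):
    """Smallest primitive root modulo an odd prime p."""
    for g in range(2, p):
        if all(pow(g, (p - 1) // r, p) != 1
               for r in range(2, p) if (p - 1) % r == 0):
            return g
    raise ValueError("no primitive root found")

def kloosterman_two_ways(p, u, j1):
    """Return (direct sum, Gauss-sum expansion) for K(u, chi_{j1}), u != 0."""
    g = primitive_root(p)
    log = {pow(g, n, p): n for n in range(p - 1)}

    def e(x):
        return cmath.exp(2j * cmath.pi * (x % p) / p)

    def chi(j, x):                      # character chi_j(g^n) = exp(2 pi i j n/(p-1))
        return cmath.exp(2j * cmath.pi * j * log[x % p] / (p - 1))

    def gauss(j):                       # Gauss sum over nonzero x
        return sum(chi(j, x) * e(x) for x in range(1, p))

    direct = sum(chi(j1, x) * e(x + u * pow(x, -1, p)) for x in range(1, p))
    expansion = sum(chi(-j, u) * gauss(j) * gauss(j + j1)
                    for j in range(p - 1)) / (p - 1)
    return direct, expansion
```

Here `chi(-j, u)` is the complex conjugate of `chi(j, u)`, which is the conjugated character value appearing in the expansion.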
6.1.125 Theorem For a positive integer m, let ψ be a character on Fq of order m + 1. Then for any
u ∈ Fq ,
X
K(u, ψ, ψ 2 , . . . , ψ m ) = (q, m)q m/2 ζpTr(α(m+1)) ,
α∈Fq
αm+1 =u
where (
1 if 2 | m,
(q, m) = 2
(−1)r−1+(q−1)(m−1)/8 i(p−1) r/4 if 2 - m.
6.1.126 Remark Theorem 6.1.125 is due to Duke [928], who showed it to be equivalent to the Hasse-
Davenport product formula for Gauss sums (Theorem 6.1.59). For a related identity, see
Ye [3034]. The Kloosterman sum K(u, ψ, ψ 2 , . . . , ψ m ) is connected to Fourier expansions of
certain Poincaré series [928]. In the case m = 1, Theorem 6.1.125 reduces to the following
well-known evaluation of the Salié sum K(u, ρ).
6.1.127 Theorem (Salié sum) Let u ∈ Fq and let ρ be the quadratic character on Fq . Then
6.1.128 Remark Theorem 6.1.129 below reduces to Theorem 6.1.127 when χ = ρ and reduces to
Theorem 6.1.120 when χ is trivial.
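6.1.129 Theorem [715, Equation (4)] Let $u \in \mathbb{F}_q$ and let $\rho$ denote the quadratic character on $\mathbb{F}_q$. Then for any character $\chi$ on $\mathbb{F}_q$,
\[ K(u, \chi) = \frac{G(\rho)\,\overline{\chi}(4)}{G(\chi\rho)} \sum_{y \in \mathbb{F}_q} \chi\rho(y^2 - 4u)\, \zeta_p^{\mathrm{Tr}(y)}. \]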
6.1.130 Remark Sums closely related to the above sum on y form an orthogonal set of eigenfunctions
for a collection of adjacency matrices of “finite upper half plane” Cayley graphs [1000].
6.1.131 Theorem (Upper bound for multiple Kloosterman sums) For u ∈ Fq and characters
χ1 , . . . , χm on Fq ,
|K(u, χ1 , . . . , χm )| ≤ (m + 1)q m/2 .
6.1.132 Remark The upper bound above is due to Deligne for trivial characters, and in full generality to Katz [1701, p. 49]. See Conrad [715] for a nice proof of the special case $|K(u, \chi)| \le 2\sqrt{q}$, patterned on Weil’s original proof for trivial $\chi$. Equality in Theorem 6.1.131 can occur; for example, let u = 1, (m + 1) | (p − 1), p | r in Theorem 6.1.125. However, equality cannot occur in Theorem 6.1.131 in the case that all m characters are trivial, since then
6.1.133 Definition The Weil bound $|K(u)| < 2\sqrt{q}$ is a consequence of the formula expressing −K(u) as a sum of conjugate Frobenius eigenvalues:
\[ -K(u) = g(u) + \overline{g(u)}, \]
where
\[ g(u) = \sqrt{q}\, \exp(i\theta(q, u)), \qquad \theta(q, u) \in (0, \pi). \]
We call $\theta(q, u)$ the Kloosterman angle, noting that
\[ -K(u) = 2\sqrt{q}\, \cos(\theta(q, u)), \qquad u \in \mathbb{F}_q^*. \]
6.1.134 Definition Let u ∈ F∗q . For a positive integer n, let Kn (u) denote the Kloosterman sum
over Fqn (so that K1 (u) = K(u)). We call Kn (u) the lift of the Kloosterman sum K(u)
from Fq to Fqn .
6.1.135 Remark It is shown in [1785] that Kn (u) is an integer if and only if K(u) is an integer.
6.1.136 Remark The following theorem, analogous to the Hasse-Davenport lifting formula for Gauss
sums (Theorem 6.1.45), offers a formula of Carlitz [548] for the lift Kn (u) in terms of
a Dickson polynomial in K(u). For an extension to K(u, χ), see [1581, p. 281]. (Dickson
polynomials over Fq are discussed in Section 9.6.)
Since
\[ g(u) = \Bigl(-K(u) \pm \sqrt{K(u)^2 - 4q}\,\Bigr)/2, \]
we have
\[ -2^n K_n(u) = \Bigl(-K(u) + \sqrt{K(u)^2 - 4q}\,\Bigr)^n + \Bigl(-K(u) - \sqrt{K(u)^2 - 4q}\,\Bigr)^n. \]
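For n = 2 the lifting formula simplifies to $K_2(u) = 2q - K(u)^2$. The sketch below checks this for q = p odd by summing directly over $\mathbb{F}_{p^2} = \mathbb{F}_p(\sqrt{d})$, where d is a nonsquare mod p; for $\alpha = a + b\sqrt{d}$ it uses $\mathrm{Tr}(\alpha) = 2a$ and $\mathrm{Tr}(\alpha^{-1}) = 2a/N(\alpha)$ with $N(\alpha) = a^2 - db^2$. The model of $\mathbb{F}_{p^2}$ and the function names are choices made for the illustration.

```python
import cmath
from math import cos, pi

def kloosterman(p, u):
    """K(u) over F_p with trivial character; the value is real."""
    return sum(cos(2 * pi * ((x + u * pow(x, -1, p)) % p) / p)
               for x in range(1, p))

def kloosterman_lift2(p, u):
    """K_2(u) over F_{p^2}, u in F_p^*, p an odd prime."""
    d = next(t for t in range(2, p) if pow(t, (p - 1) // 2, p) == p - 1)
    total = 0.0
    for a in range(p):
        for b in range(p):
            if a == 0 and b == 0:
                continue
            norm = (a * a - d * b * b) % p              # N(alpha), nonzero
            trace = (2 * a + 2 * a * u * pow(norm, -1, p)) % p
            total += cos(2 * pi * trace / p)            # real part suffices
    return total

def lift_formula(K, q, n):
    """K_n predicted from K = K(u) by the displayed identity."""
    root = cmath.sqrt(K * K - 4 * q)
    return (-((-K + root) ** n + (-K - root) ** n) / 2 ** n).real
```

As a hand check, p = 3 and u = 1 give K(1) = −1 and $K_2(1) = 5 = 2 \cdot 3 - 1$.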
\[ \frac{2}{\pi} \int_x^y \sin^2 t \, dt. \]
6.1.140 Conjecture Fix a positive integer u and consider the collection of Kloosterman angles
{θ(p, u) : u < p < x}. As x tends to infinity, this collection is asymptotically equidistributed
with respect to the Sato-Tate measure on (0, π).
6.1.141 Remark Theorem 6.1.139 is due to Katz [1701, p. 240]. For a quantitative refinement and
related results, see [2092, 2245, 2650]. For an extension to Kloosterman sums over rings, see
[1725]. Statements of the conjecture above may be found in Katz’s books [1700, Conjecture
1.2.5] and [1701, p. 5]. See also the references in Shparlinski [2650, p. 420].
6.1.142 Definition Fix an integer u which is not a perfect square. Motivated by Theorem 6.1.127 with q = p > 2, we define the Salié angle $\vartheta(p, u) \in (0, \pi)$ for each prime p with $\left(\frac{u}{p}\right) = 1$ by
\[ \vartheta(p, u) = 2\pi b(u, p)/p, \]
where b = b(u, p) is the smallest positive integer for which $b^2 \equiv u \pmod p$.
6.1.143 Remark The next theorem gives an equidistribution result for angles of Salié sums K(u, ρ);
see [1581, p. 496] and the references in Shparlinski [2649], where one finds further results of
this type.
6.1.144 Theorem (Equidistribution of Salié angles) Fix an integer u which is not a perfect square. The set of Salié angles $\{\vartheta(p, u) : \left(\frac{u}{p}\right) = 1,\ p < x\}$ is asymptotically equidistributed in the interval (0, π), as x tends to infinity.
6.1.145 Remark The following equidistribution result for normalized Kloosterman sums $K(-1, \chi)/\sqrt{q}$ was conjectured by Evans and proved by Katz [1711]. Note that each such sum is real and lies in the interval [−2, 2] by Theorem 6.1.131.
6.1.146 Theorem (Equidistribution of K(−1, χ)) Consider the collection of q − 1 normalized Kloosterman sums $K(-1, \chi)/\sqrt{q}$, where $\chi$ runs through the characters on $\mathbb{F}_q$. As q tends to infinity, the “angles” of this collection are asymptotically equidistributed with respect to the Sato-Tate measure on [−2, 2]. In other words, as q tends to infinity, the proportion of the members of this collection that lie in a fixed subinterval [v, w] ⊂ [−2, 2] approaches
\[ \frac{1}{2\pi} \int_v^w \sqrt{4 - x^2}\, dx. \]
6.1.147 Definition Let n be a positive integer. Define the n-th power moment $S_n$ of the Kloosterman sums K(u) by
\[ S_n = \sum_{u \in \mathbb{F}_q} K(u)^n. \]
6.1.148 Remark Moisio [2119, 2122] gave evaluations of Sn for n ≤ 10 when q is a power of 2 or 3,
and he related them to cyclic codes; see also [1733]. In the remainder of this subsection, we
restrict our attention to the case q = p > n.
6.1.149 Remark Various congruences have been given for Sn , but explicit evaluations of Sn for all
q = p > n are known only for n = 1, 2, 3, 4, 5, 6 [622]. The values for n ≤ 4 below, due to
Salié, may be found in Iwaniec’s book [1580, Section 4.4] (where −p should be replaced by
−3p in Equation (4.25)).
6.1.150 Theorem (Power moments of Kloosterman sums [622]) Let q = p > n. The power moments $S_n$ are integer multiples of p. We have
\[ S_1 = 0, \quad S_2 = p^2 - p, \quad S_3 = \left(\frac{p}{3}\right) p^2 + 2p, \quad S_4 = 2p^3 - 3p^2 - 3p. \]
For p > 5,
\[ S_5 = 4\left(\frac{p}{3}\right) p^3 + (a_p + 5)p^2 + 4p, \]
{η(6z)η(3z)η(2z)η(z)}2 .
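The closed forms for $S_1, \ldots, S_4$ are easy to confirm by brute force for small primes p > 4. The following sketch assumes q = p, the trivial character, and the convention K(0) = −1; the symbol $\left(\frac{p}{3}\right)$ is evaluated as +1 if p ≡ 1 (mod 3) and −1 if p ≡ 2 (mod 3). Function names are illustrative.

```python
from math import cos, pi

def kloosterman(p, u):
    """K(u) = sum over x in F_p^* of exp(2 pi i (x + u/x)/p); real-valued."""
    return sum(cos(2 * pi * ((x + u * pow(x, -1, p)) % p) / p)
               for x in range(1, p))

def power_moment(p, n):
    """S_n = sum over all u in F_p of K(u)^n."""
    return sum(kloosterman(p, u) ** n for u in range(p))

def predicted_moments(p):
    """(S_1, S_2, S_3, S_4) from the closed forms, for a prime p > 4."""
    leg3 = 1 if p % 3 == 1 else -1
    return (0, p * p - p, leg3 * p * p + 2 * p, 2 * p ** 3 - 3 * p * p - 3 * p)
```

For p = 5 this gives the predicted values (0, 20, −15, 160).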
6.1.151 Remark Evans [1006] has conjectured an explicit formula for S7 in terms of the coefficient
of q p in the q-expansion of a weight 3, level 525 newform.
6.1.152 Definition Let q = p > n ≥ 1. In the notation of Definition 6.1.133, the Kloosterman power moments $S_n$ can be expressed as
\[ S_n = (-1)^n + (-1)^n \sum_{u=1}^{p-1} \bigl(g(u) + \overline{g(u)}\bigr)^n. \]
where Un is the n-th monic Chebychev polynomial of the second kind. We normalize
the sum Tn by defining
Yn := (−1 − Tn )/p2 .
6.1.153 Remark For some twists of Sn and Tn , see Liu [1947] and Evans [1005].
6.1.154 Theorem If q = p > n, then $Y_n$ is an integer. In particular,
\[ Y_1 = Y_2 = 0, \quad Y_3 = \left(\frac{p}{3}\right), \quad Y_4 = 1, \quad Y_5 = a_p, \quad Y_6 = b_p, \]
where $a_p$, $b_p$ are defined in Theorem 6.1.150.
6.1.155 Remark Theorem 6.1.154 can be found in Evans [1006]. There it is moreover conjectured (for p > 7) that
\[ Y_7 = \left(\frac{p}{105}\right) \bigl(|A(p)|^2 - p^2\bigr), \]
where A(p) is the p-th Fourier coefficient of a weight 3, level 525 eigenform with quartic nebentypus of conductor 105. Furthermore, Evans [1005, p. 523] conjectured (for p > 7) that
\[ Y_8 = B(p) + p^2, \]
where B(p) is the p-th Fourier coefficient of a weight 6, level 6 newform with trivial nebentypus. These conjectured values for $Y_n$ satisfy the following upper estimate due to Katz [1701, Theorem 0.2].
6.1.157 Remark Let k be a positive integer. We briefly discuss some basic Gauss and Kloosterman
sums over Z/kZ, as they are natural extensions of sums over the finite field Z/pZ.
6.1.158 Definition For integers m and k > 0, define the quadratic Gauss sum $q_k(m)$ over $\mathbb{Z}/k\mathbb{Z}$ by
\[ q_k(m) = \sum_{n=0}^{k-1} \zeta_k^{mn^2}. \]
6.1.159 Remark The special case qp (m) is the quadratic Gaussian period g(m, 2) over Fp given in
Definition 6.1.11.
6.1.160 Definition For integers a, b, c with $ac \ne 0$, define a generalized quadratic Gauss sum S(a, b, c) by
\[ S(a, b, c) = \sum_{n=0}^{|c|-1} \exp\bigl(\pi i (an^2 + bn)/c\bigr). \]
6.1.161 Remark The special case S(2m, 0, k) is the quadratic Gauss sum $q_k(m)$ given in Definition 6.1.158.
6.1.162 Theorem (Reciprocity theorem for Gauss sums [240, p. 13]) For integers a, b, c with $ac \ne 0$ and ac + b even,
\[ S(a, b, c) = |c/a|^{1/2} \exp\!\left(\frac{\pi i}{4}\left(\mathrm{sgn}(ac) - \frac{b^2}{ac}\right)\right) S(-c, -b, a). \]
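\[ q_k(m) = \begin{cases} \left(\frac{k}{m}\right)(1 + i^m)\sqrt{k} & \text{if } k \equiv 0 \pmod 4, \\[2pt] \left(\frac{m}{k}\right)\sqrt{k} & \text{if } k \equiv 1 \pmod 4, \\[2pt] 0 & \text{if } k \equiv 2 \pmod 4, \\[2pt] \left(\frac{m}{k}\right) i\sqrt{k} & \text{if } k \equiv 3 \pmod 4. \end{cases} \]
\[ q_k(1) = \frac{1}{2}(1 + i)(1 + i^{-k})\sqrt{k}. \]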
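The four-case evaluation of $q_k(m)$ can be compared with the defining sum. The sketch below assumes that m is a positive integer coprime to k and that $\left(\frac{k}{m}\right)$, $\left(\frac{m}{k}\right)$ are Jacobi symbols; these hypotheses and the function names are assumptions of the illustration, since the statement of the theorem is incomplete here.

```python
import cmath
from math import gcd, sqrt

def jacobi(n, k):
    """Jacobi symbol (n/k) for odd k > 0."""
    n %= k
    result = 1
    while n:
        while n % 2 == 0:
            n //= 2
            if k % 8 in (3, 5):
                result = -result
        n, k = k, n
        if n % 4 == 3 and k % 4 == 3:
            result = -result
        n %= k
    return result if k == 1 else 0

def q_sum(k, m):
    """Defining sum q_k(m) = sum_{n=0}^{k-1} exp(2 pi i m n^2 / k)."""
    return sum(cmath.exp(2j * cmath.pi * ((m * n * n) % k) / k) for n in range(k))

def q_formula(k, m):
    """Closed form, assuming m > 0 and gcd(m, k) = 1."""
    root = sqrt(k)
    if k % 4 == 0:
        return jacobi(k, m) * (1 + 1j ** (m % 4)) * root
    if k % 4 == 1:
        return jacobi(m, k) * root
    if k % 4 == 2:
        return 0
    return jacobi(m, k) * 1j * root

def q_one(k):
    """q_k(1) = (1/2)(1 + i)(1 + i^(-k)) sqrt(k)."""
    return 0.5 * (1 + 1j) * (1 + 1j ** (-k % 4)) * sqrt(k)
```

For example, k = 4 and m = 1 give $1 + i + 1 + i = 2 + 2i$ from the sum and $(1 + i)\sqrt{4}$ from the closed form.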
6.1.164 Theorem [240, p. 47] If (m, k) = 1 with k > 1 odd and squarefree, then (cf. Theorem 6.1.86)
\[ q_k(m) = \sum_{n=1}^{k-1} \left(\frac{n}{k}\right) \zeta_k^{mn}. \]
6.1.165 Definition (Gauss character sums over Z/kZ.) Let $\chi$ be a Dirichlet character (mod k), where k > 1. For any integer m, define the Gauss sum $\tau_k(m, \chi)$ over $\mathbb{Z}/k\mathbb{Z}$ by
\[ \tau_k(m, \chi) = \sum_{n=1}^{k-1} \chi(n)\, \zeta_k^{mn}. \]
6.1.166 Remark When χ is trivial, τk (m, χ) is a Ramanujan sum. When k = p, τk (m, χ) is the
Gauss sum G(m, χ) over Fp . The next two theorems deal with primitive Gauss sums; for
proofs and generalizations, see [1581, pp. 48–49].
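6.1.167 Theorem For k > 1, let $\chi$ (mod k) be a primitive Dirichlet character [240, pp. 28–29]. Then
\[ |\tau_k(1, \chi)| = \sqrt{k}. \]
If further $\chi$ is quadratic, then
\[ \tau_k(1, \chi) = \begin{cases} \sqrt{k} & \text{if } \chi(-1) = 1, \\ i\sqrt{k} & \text{if } \chi(-1) = -1. \end{cases} \]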
6.1.169 Remark There is a reduction formula which reduces the problem of evaluating τk (m, χ)
to the case where χ (mod k) is primitive, m = 1, and k is a prime power ps ; see [240, p.
29]. When s > 1, such Gauss sums have known closed form evaluations; see [240, Section
1.6] and [659]. There are similar reduction formulae for Kloosterman sums. Evaluations and
bounds for Kloosterman sums over Z/ps Z with s > 1 are given in [655, 1004]. For Gauss
and Kloosterman sums over rings of algebraic or p-adic integers, see [1002, 1364, 1366].
Extensions of Gaussian periods to finite rings are discussed in [1012]. For Hecke Gauss
sums in quadratic number fields, see [382]. Gauss sums connected with Hecke L-functions
are discussed in [1581, p. 60].
6.1.170 Remark Let k > 1 be an odd integer. We close with an evaluation of a Salié sum over the
ring Z/kZ, which for prime k = p reduces to the evaluation of the Salié sum K(u, ρ) over
Fp given in Theorem 6.1.127. For a short proof, see [2815].
6.1.171 Definition Fix an odd integer k > 1. For an integer a with (a, k) = 1, define the Salié sum S(a) over $\mathbb{Z}/k\mathbb{Z}$ by
\[ S(a) = \sum_{x \in (\mathbb{Z}/k\mathbb{Z})^*} \left(\frac{x}{k}\right) \zeta_k^{x + a/x}, \]
6.1.172 Theorem (Salié sums over Z/kZ) Fix an odd integer k > 1 and let (a, k) = 1. If a is not congruent to a square mod k, then S(a) = 0. If $a \equiv b^2 \pmod k$ for some integer b, then
\[ S(a) = i^{(k-1)^2/4} \sqrt{k} \sum_{\substack{x \in \mathbb{Z}/k\mathbb{Z} \\ x^2 = 1}} \zeta_k^{2xb}. \]
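The evaluation of Salié sums over $\mathbb{Z}/k\mathbb{Z}$ can be checked against the defining sum for small odd k, including composite and non-squarefree moduli. In this sketch $\left(\frac{x}{k}\right)$ is taken to be the Jacobi symbol and a square root b of a is found by exhaustive search; both choices and the function names are assumptions of the illustration.

```python
import cmath
from math import gcd, sqrt

def jacobi(n, k):
    """Jacobi symbol (n/k) for odd k > 0."""
    n %= k
    result = 1
    while n:
        while n % 2 == 0:
            n //= 2
            if k % 8 in (3, 5):
                result = -result
        n, k = k, n
        if n % 4 == 3 and k % 4 == 3:
            result = -result
        n %= k
    return result if k == 1 else 0

def zeta(k, e):
    """exp(2 pi i e / k) with the exponent reduced mod k."""
    return cmath.exp(2j * cmath.pi * (e % k) / k)

def salie_sum(k, a):
    """Defining sum S(a) over the units of Z/kZ."""
    return sum(jacobi(x, k) * zeta(k, x + a * pow(x, -1, k))
               for x in range(1, k) if gcd(x, k) == 1)

def salie_formula(k, a):
    """Closed form: 0 if a is a nonsquare mod k, else the sum over x^2 = 1."""
    roots = [b for b in range(k) if (b * b - a) % k == 0]
    if not roots:
        return 0
    b = roots[0]
    unit = 1j ** ((k - 1) ** 2 // 4 % 4)          # i^((k-1)^2/4)
    return unit * sqrt(k) * sum(zeta(k, 2 * x * b)
                                for x in range(k) if (x * x - 1) % k == 0)
```

The value does not depend on which square root b is chosen, because any two roots differ by a factor x with $x^2 = 1$, which only permutes the terms of the final sum.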
See Also
§3.1, §3.5 For counting irreducible polynomials with prescribed norm or trace.
§6.2 For more general exponential and character sums.
§6.3 For further applications of character sums.
§7.3 For solutions to diagonal equations over finite fields.
§10.1 For the discrete Fourier transform and Gauss sums.
§12.2, §12.4, For curves and varieties, counting rational points, zeta and L-functions.
§12.5, §12.7,
§12.8, §12.9
§14.6 For applications to difference sets.
§15.1 For cyclic codes.
References Cited: [49, 66, 111, 112, 113, 114, 115, 144, 240, 374, 382, 503, 548, 580, 592,
620, 622, 655, 659, 714, 715, 813, 928, 999, 1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007,
1008, 1009, 1010, 1011, 1012, 1013, 1014, 1015, 1078, 1139, 1146, 1299, 1343, 1344, 1363,
1364, 1365, 1366, 1453, 1454, 1455, 1538, 1562, 1580, 1581, 1700, 1701, 1711, 1713, 1725,
1732, 1733, 1783, 1784, 1785, 1790, 1809, 1908, 1946, 1947, 1960, 2042, 2086, 2092, 2119,
2122, 2124, 2128, 2167, 2225, 2245, 2288, 2463, 2524, 2610, 2615, 2645, 2649, 2650, 2652,
2791, 2815, 2835, 2854, 2915, 2948, 3026, 3027, 3034]
\[ \Bigl| \sum_{x \in \mathbb{F}_q} \psi(f(x)) \Bigr| \le (d - 1)\sqrt{q}. \]
\[ \Bigl| \sum_{x \in \mathbb{F}_q} \psi(f(x)) \Bigr| \le (d' - 1)\sqrt{q} \]
\[ \Bigl| \sum_{x \in \mathbb{F}_q} \chi(f(x)) \Bigr| \le (d - 1)\sqrt{q}. \]
6.2.3 Theorem Let $f, g \in \mathbb{F}_q[x]$ be polynomials of degrees d > 0 and e > 0 respectively, $\psi : \mathbb{F}_q \to \mathbb{C}^*$ a non-trivial additive character and $\chi : \mathbb{F}_q^* \to \mathbb{C}^*$ a non-trivial multiplicative character of order m (extended by zero to $\mathbb{F}_q$), such that either f is not of the form $\bar{f}^{\,p} - \bar{f}$ with $\bar{f} \in \mathbb{F}_q[x]$ or g is not an m-th power in $\mathbb{F}_q[x]$. Then
\[ \Bigl| \sum_{x \in \mathbb{F}_q} \psi(f(x))\chi(g(x)) \Bigr| \le (d + e - 1)\sqrt{q}. \]
6.2.4 Remark The previous three results are a consequence of Weil’s conjectures for curves,
as pointed out by Hasse [1443] and Weil [2961], so they follow from Weil’s proof of the
conjectures [2962]. They are also particular cases of the more general higher dimensional
results stated in this section.
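As a numerical illustration of the Weil-type bound $(d-1)\sqrt{q}$ for additive character sums, the sketch below evaluates $\sum_{x \in \mathbb{F}_p} \psi(f(x))$ with $\psi(x) = \exp(2\pi i x/p)$ and the sample polynomial $f(x) = x^3 + 2x + 1$, for primes p > 3 (so that the degree d = 3 is prime to p). The choice of polynomial, character, and primes is an assumption made for the example.

```python
import cmath
from math import sqrt

def additive_character_sum(p, coeffs):
    """Sum over x in F_p of exp(2 pi i f(x)/p), f given by coeffs (low to high)."""
    total = 0j
    for x in range(p):
        value = sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p
        total += cmath.exp(2j * cmath.pi * value / p)
    return total

def weil_bound(p, degree):
    """(d - 1) * sqrt(p) for a polynomial of degree d prime to p."""
    return (degree - 1) * sqrt(p)
```

For a polynomial of degree 3 the trivial bound is p, so the saving is already visible for p around 100, where $2\sqrt{p} \approx 20$.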
where
Sm (V, f, ψ) = S(V ×Fq Fqm , f, ψ ◦ TrFqm /Fq ).
6.2.7 Remark Character sums can be used to count the number of rational points on a variety.
In particular, zeta functions of affine varieties are special cases of L-functions of additive
character sums: it is easy to check that, if V is the affine variety defined by the vanishing
of f1 , . . . , fr ∈ Fq [x1 , . . . , xn ], then for any non-trivial character ψ, and variables y1 , . . . , yr ,
Sm (An+r , y1 f1 + · · · + yr fr , ψ) = q mr · #V (Fqm ).
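The point-counting identity can be tried on a small example. The sketch below takes m = 1, q = p, n = 2, r = 1 and the single equation $f_1 = x_1^2 + x_2^2 - 1$, and compares $\sum_{x_1, x_2, y} \psi(y f_1(x_1, x_2))$ with $p \cdot \#V(\mathbb{F}_p)$; the curve and the character $\psi(x) = \exp(2\pi i x/p)$ are choices made for the illustration.

```python
import cmath

def circle_points(p):
    """Number of (x1, x2) in F_p^2 with x1^2 + x2^2 = 1."""
    return sum(1 for x1 in range(p) for x2 in range(p)
               if (x1 * x1 + x2 * x2 - 1) % p == 0)

def character_sum(p):
    """Sum over (x1, x2, y) in F_p^3 of psi(y * (x1^2 + x2^2 - 1))."""
    total = 0j
    for x1 in range(p):
        for x2 in range(p):
            f = (x1 * x1 + x2 * x2 - 1) % p
            for y in range(p):
                total += cmath.exp(2j * cmath.pi * ((y * f) % p) / p)
    return total
```

The inner sum over y equals p when f vanishes and 0 otherwise, which is exactly why the total is p times the number of points.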
6.2.8 Definition A number z ∈ C is a Weil integer if it is an algebraic integer and all its
conjugates over Q have the same absolute value. If A ∈ R, A > 0 is fixed, z has weight
w ∈ R (relative to A) if all its conjugates have absolute value Aw/2 .
In this way, estimates about character sums can be derived from statements about the
number of roots and poles of the corresponding L-function and their absolute values.
6.2.12 Remark The type of statements that one tries to prove about these sums are estimates of
the form |Sm (V, f, ψ)| ≤ Cq m(n+i)/2 , where i is as small as possible and C is a constant
that depends only on certain quantities attached to the polynomials that define the variety
V and the regular map f . In some particularly nice cases, one can give geometric conditions
that imply the optimal bound (i = 0).
6.2.13 Theorem [795, Théorème 8.4] Let $f \in \mathbb{F}_q[x_1, \ldots, x_n]$ be a polynomial of degree d > 0. Suppose that
1. d is prime to p.
2. The projective hypersurface defined by the highest degree homogeneous form $f_d$ of f is smooth, that is, the polynomials $f_d, \frac{\partial f_d}{\partial x_1}, \ldots, \frac{\partial f_d}{\partial x_n}$ do not have any common zero in $\mathbb{P}^{n-1}(\overline{\mathbb{F}}_q)$.
Then for every non-trivial $\psi$, $L(\mathbb{A}^n_{\mathbb{F}_q}, f, \psi; T)^{(-1)^n}$ is a polynomial of degree $(d - 1)^n$, all whose reciprocal roots have weight n relative to q. In particular, for every m ≥ 1 the following estimate holds:
\[ |S_m(\mathbb{A}^n, f, \psi)| \le (d - 1)^n q^{mn/2}. \]
6.2.14 Theorem [28, Theorems 1.4, 1.11][2469, Corollary 3] Let $f \in \mathbb{F}_q[x_1, \ldots, x_n]$ be a polynomial of degree d. Suppose that
1. d is divisible by p.
2. The projective hypersurface defined by the highest degree homogeneous form $f_d$ of f is smooth.
3. The projective hypersurface defined by the homogeneous form $f_{d-1}$ of degree d − 1 of f does not contain any of the common roots of the partial derivatives of $f_d$.
Then for every non-trivial $\psi$, $L(\mathbb{A}^n_{\mathbb{F}_q}, f, \psi; T)^{(-1)^n}$ is a polynomial of degree $((d - 1)^{n+1} - (-1)^{n+1})/d$, all whose reciprocal roots have weight n relative to q. In particular, for every m ≥ 1 the following estimate holds:
\[ |S_m(\mathbb{A}^n, f, \psi)| \le \frac{(d - 1)^{n+1} - (-1)^{n+1}}{d}\, q^{mn/2}. \]
6.2.15 Theorem [341, Theorem 1] Let $f \in \mathbb{F}_q[x_1, \ldots, x_n]$ be a polynomial of degree d. Then for every non-trivial $\psi$, the total degree of the L-function $L(\mathbb{A}^n_{\mathbb{F}_q}, f, \psi; T)$ does not exceed $(4d + 5)^n$.
6.2.16 Theorem [341, Theorem 2] Let $V \subseteq \mathbb{A}^N_{\mathbb{F}_q}$ be an affine variety of dimension n and degree e and $f \in \mathbb{F}_q[x_1, \ldots, x_N]$ a polynomial of degree d. Then for every non-trivial $\psi$, the total degree of the L-function $L(V, f, \psi; T)$ does not exceed $(4\max(e + 1, d) + 5)^{2N+1}$.
6.2.17 Theorem [1532, Theorem 5] Let $V \subseteq \mathbb{A}^3_{\mathbb{F}_q}$ be the surface defined by g = 0, where $g \in \mathbb{F}_q[x_1, x_2, x_3]$, and let $f \in \mathbb{F}_q[x_1, x_2, x_3]$. Suppose that
1. All geometric fibres of $f : V \to \mathbb{A}^1_{\mathbb{F}_q}$ have dimension ≤ 1.
2. The generic fibre of $f : V \to \mathbb{A}^1_{\mathbb{F}_q}$ is a geometrically irreducible curve.
Then for every non-trivial $\psi : \mathbb{F}_q \to \mathbb{C}^*$ and every m ≥ 1 the following estimate holds:
where c(X) is the total Chern class of X [799, Exposé XVII, 5.2] and L the class of a
hyperplane section, all whose reciprocal roots have weight n relative to q. In particular, for
every m ≥ 1 the following estimate holds:
1. d is prime to p.
2. The scheme-theoretic intersection X ∩ H ∩ L has dimension n − 2, where L =
PN N
Fq \AFq is the hyperplane at infinity and H is the hypersurface defined by F = 0.
Let δ ≥ −1 be the dimension of its singular locus.
3. The dimension of the singular locus of the scheme-theoretic intersection X ∩ L
is smaller than or equal to δ.
Then for every non-trivial ψ, all reciprocal roots and poles of L(V, f, ψ; T ) have weight
smaller than or equal to n + δ + 1 relative to q, and its total degree is bounded by a constant
C(X, d) depending only on number and the degrees of the forms defining X and on d. In
particular, for every m ≥ 1 the following estimate holds:
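6.2.21 Definition Let $f = \sum_{i \in \mathbb{Z}^n} a_i x^i \in \mathbb{F}_q[x_1^{\pm 1}, \ldots, x_n^{\pm 1}]$ be a Laurent polynomial. The Newton polyhedron $\Delta_\infty(f)$ of f at infinity is the convex hull in $\mathbb{R}^n$ of the set $\{0\} \cup \{i \in \mathbb{Z}^n \mid a_i \ne 0\}$. A polynomial f is non-degenerate with respect to $\Delta_\infty(f)$ if, for every face $\delta$ of $\Delta_\infty(f)$ that does not contain the origin, the equations
\[ \frac{\partial f_\delta}{\partial x_1} = \cdots = \frac{\partial f_\delta}{\partial x_n} = 0 \]
do not have any common solution in $(\overline{\mathbb{F}}_q^{\,*})^n$, where $f_\delta = \sum_{i \in \mathbb{Z}^n \cap \delta} a_i x^i$ is the restriction of f to the face $\delta$.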
6.2.22 Theorem [23, Theorem 4.2][812, Theorem 1.3] Let $\mathbb{T}^n$ be the n-dimensional split torus over $\mathbb{F}_q$, $f \in \mathbb{F}_q[x_1^{\pm 1}, \ldots, x_n^{\pm 1}]$ a Laurent polynomial and $\Delta_\infty(f)$ its Newton polyhedron at infinity. Suppose that
1. $\dim \Delta_\infty(f) = n$.
2. f is non-degenerate with respect to $\Delta_\infty(f)$.
Then for every non-trivial $\psi$, $L(\mathbb{T}^n(\mathbb{F}_q), f, \psi; T)^{(-1)^n}$ is a polynomial of degree $n!\,\mathrm{Vol}(\Delta_\infty(f))$, all whose reciprocal roots have weight smaller than or equal to n relative to q. If, in addition, the origin is an interior point of $\Delta_\infty(f)$, then all its reciprocal roots have weight n relative to q. In particular, for every m ≥ 1 the following estimate holds:
where
Sm (V, f, χ) = S(V ⊗ Fqm , f, χ ◦ NormFqm /Fq ).
6.2.34 Theorem [560, Theorem 13][2384] Let X be a smooth projective curve of genus g over $\mathbb{F}_q$, $f, h \in \mathbb{F}_q(X)$ rational functions and $\psi : \mathbb{F}_q \to \mathbb{C}^*$ (respectively $\chi : \mathbb{F}_q^* \to \mathbb{C}^*$) a non-trivial additive (resp. multiplicative) character. Suppose that f is not of the form $\bar{f}^{\,p} - \bar{f}$ with $\bar{f} \in \mathbb{F}_q(X)$, and h is not an ord($\chi$)-th power in $\mathbb{F}_q(X)$. Then for every m ≥ 1,
\[ \Bigl| \sum_{x \in V(\mathbb{F}_{q^m})} \psi\bigl(\mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q} f(x)\bigr)\, \chi\bigl(\mathrm{Norm}_{\mathbb{F}_{q^m}/\mathbb{F}_q} h(x)\bigr) \Bigr| \le (2g - 2 + s + l + d)\, q^{m/2}, \]
where V ⊆ X is the open set where f and h are defined, l is the number of poles of f, s is the number of zeroes and poles of h and d is the degree of the polar part of the divisor (f).
6.2.35 Theorem [1709, Theorem 1.1] Let $f, g \in \mathbb{F}_q[x_1, \ldots, x_n]$ be polynomials of degrees d and e, respectively, prime to p. Suppose that the projective hypersurfaces defined by the highest degree forms of f and g are smooth and intersect transversally. Then for every m ≥ 1 the following estimate holds:
\[ \Bigl| \sum_{x \in \mathbb{F}_{q^m}^n} \psi\bigl(\mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q} f(x)\bigr)\, \chi\bigl(\mathrm{Norm}_{\mathbb{F}_{q^m}/\mathbb{F}_q} g(x)\bigr) \Bigr| \le C_{n,d,e}\, q^{mn/2}, \]
where
\[ C_{n,d,e} = \sum_{a+b=n} (d - 1)^a (e - 1)^b + \sum_{a+b=n-1} (d - 1)^a (e - 1)^b. \]
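6.2.36 Theorem [1138, Proposition 0.1] Let $\mathbb{T}^n$ be the n-dimensional split torus over $\mathbb{F}_q$, $f \in \mathbb{F}_q[x_1^{\pm 1}, \ldots, x_n^{\pm 1}]$ a Laurent polynomial, $\Delta_\infty(f)$ its Newton polyhedron at infinity and $\chi : \mathbb{T}^n(\mathbb{F}_q) \to \mathbb{C}^*$ a character. Suppose that
1. $\dim \Delta_\infty(f) = n$.
2. f is non-degenerate with respect to $\Delta_\infty(f)$.
Then for every non-trivial $\psi$ and every m ≥ 1 the following estimate holds:
\[ \Bigl| \sum_{x \in (\mathbb{F}_{q^m}^*)^n} \psi\bigl(\mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q} f(x)\bigr)\, \chi\bigl(\mathrm{Norm}_{\mathbb{F}_{q^m}/\mathbb{F}_q} x\bigr) \Bigr| \le n!\,\mathrm{Vol}(\Delta_\infty(f))\, q^{mn/2}. \]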
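6.2.37 Theorem [1702, Theorem 1][2893, Corollary 2.2] Let $a \in \mathbb{F}_{q^m}$ such that $\mathbb{F}_{q^m} = \mathbb{F}_q(a)$ and $\chi : \mathbb{F}_{q^m}^* \to \mathbb{C}^*$ a non-trivial multiplicative character. Then the following estimate holds:
\[ \Bigl| \sum_{x \in \mathbb{F}_q} \chi(x - a) \Bigr| \le (m - 1)\sqrt{q}. \]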
6.2.38 Theorem [2893, Corollary 2.8] Let $f \in \mathbb{F}_q[x]$ be a monic polynomial, and $\chi : (\mathbb{F}_q[x]/(f))^* \to \mathbb{C}^*$ a non-trivial character. Then
\[ \Bigl| \sum_{g} \chi(g \bmod f) \Bigr| \le \frac{1}{d}(\deg(f) + 1)\, q^{d/2}, \]
where the sum is taken over the set of monic irreducible polynomials g of degree d in $\mathbb{F}_q[x]$ which are coprime to f.
6.2.39 Theorem [1140, Theorem 3.7] Let f ∈ Fq [x1 , . . . , xn , y1 , . . . , yn0 ] be a polynomial of degree
d, r ≥ 1 an integer and let g be the polynomial
r
X
f (x1,j , . . . , xn,j , y1 , . . . , yn0 )
j=1
1. d is prime to p.
0
2. The degree d homogeneous part of g defines a smooth hypersurface in Prn+n −1 .
Then for every non-trivial additive character ψ and every m ≥ 1 the following estimate
holds:
X 0 0
ψ(TrFqmr /Fq f (x1 , . . . , xn , y1 , . . . , yn0 )) ≤ (d − 1)nr+n q m(nr+n )/2 .
xi ∈Fqmr ,yj ∈Fqm
0 n
The constant (d−1)nr+n can be replaced by C(p, f )r3(d+1) −1
, where C(p, f ) depends only
on p and f .
See Also
§6.1 For specific results about Gauss, Jacobi, and Kloosterman sums.
§12.7 For the general theory of ℓ-adic sheaves and their L-functions.
[22], [2698] For a study of exponential sums and their L-functions using Dwork’s
p-adic methods.
[253] For applications of p-adic cohomology to the study of exponential sums.
[796], [2589] For the general theory of exponential sums using ℓ-adic cohomology.
[1700], [1867] For a study of the total degree of the L-function associated to an
exponential sum and its change with the characteristic of the base field.
[1869] For a comprehensive survey of the main estimates for exponential
sums obtained using ℓ-adic cohomology.
References Cited: [22, 23, 28, 253, 339, 341, 342, 560, 795, 796, 799, 812, 940, 1138, 1140,
1532, 1700, 1702, 1703, 1704, 1707, 1709, 1712, 1844, 1867, 1869, 2384, 2468, 2469, 2589,
2698, 2893, 2961, 2962]
The main goal of this chapter is to show that character sums are very useful and friendly
tools for a variety of problems in many areas such as coding theory, cryptography, and
algorithms. There are so many applications of character sums that any survey will be
incomplete. Here we chose a combination of some classical applications and newer, less
known applications using different types of character sums.
6.3.1 Proposition [1631, Lemma 7.3.7] Let $\chi$ denote a nontrivial multiplicative character of $\mathbb{F}_q$. Then we have
\[ \sum_{x \in \mathbb{F}_q} \chi(x + a)\, \overline{\chi}(x + b) = -1, \qquad a, b \in \mathbb{F}_q,\ a \ne b. \]
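The correlation identity, with the second character value complex-conjugated, can be verified by brute force over $\mathbb{F}_p$. The sketch below indexes the multiplicative characters by an exponent j relative to a primitive root and sets $\chi(0) = 0$; the indexing and the function names are illustrative assumptions.

```python
import cmath

def primitive_root(p):
    """Smallest primitive root modulo an odd prime p."""
    for g in range(2, p):
        if all(pow(g, (p - 1) // r, p) != 1
               for r in range(2, p) if (p - 1) % r == 0):
            return g
    raise ValueError("no primitive root found")

def correlation_sum(p, j, a, b):
    """Sum over x in F_p of chi_j(x + a) * conj(chi_j(x + b))."""
    g = primitive_root(p)
    log = {pow(g, n, p): n for n in range(p - 1)}

    def chi(x):
        x %= p
        if x == 0:
            return 0j
        return cmath.exp(2j * cmath.pi * j * log[x] / (p - 1))

    return sum(chi(x + a) * chi(x + b).conjugate() for x in range(p))
```

The underlying reason is that $(x + a)/(x + b)$ takes every value in $\mathbb{F}_p$ except 1 (and hits 0 once), so the sum is that of a nontrivial character over all elements other than 1.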
6.3.2 Definition A Hadamard matrix of order n is an n×n matrix H with entries from {−1, +1}
satisfying HH T = nI.
6.3.3 Construction (Paley) [2344] Let q be the power of an odd prime, $\eta$ the quadratic character of $\mathbb{F}_q$ and $\mathbb{F}_q = \{\xi_1, \ldots, \xi_q\}$ any fixed ordering of $\mathbb{F}_q$. For $q \equiv 3 \pmod 4$ there exists a Hadamard matrix $H = (h_{ij})$ of order n = q + 1 defined by
\[ h_{i,j} = \eta(\xi_j - \xi_i), \qquad i, j = 1, \ldots, n - 1,\ i \ne j. \]
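A small sketch of a Paley-type Hadamard matrix for a prime $q \equiv 3 \pmod 4$ follows. The off-diagonal entries among the first q rows and columns are $\eta(\xi_j - \xi_i)$ with $\xi_i = i$; the diagonal entries (+1), the last column (−1 in the first q rows) and the last row (+1) are one standard completion and are an assumption here, since the remaining entries are not specified in the text above. Prime powers are not handled.

```python
def paley_hadamard(q):
    """Hadamard matrix of order q + 1 for a prime q = 3 (mod 4)."""
    if q % 4 != 3:
        raise ValueError("this sketch needs a prime q = 3 (mod 4)")

    def eta(x):                         # quadratic character of F_q
        x %= q
        if x == 0:
            return 0
        return 1 if pow(x, (q - 1) // 2, q) == 1 else -1

    n = q + 1
    H = [[1] * n for _ in range(n)]     # diagonal and last row stay +1
    for i in range(q):
        for j in range(q):
            if i != j:
                H[i][j] = eta(j - i)
        H[i][q] = -1                    # last column in the first q rows
    return H

def gram(H):
    """H times its transpose, with exact integer arithmetic."""
    n = len(H)
    return [[sum(H[i][k] * H[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]
```

With this completion H = I + S for a skew-symmetric S with $SS^T = qI$, which gives $HH^T = (q + 1)I$.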
6.3.4 Remark
6.3.5 Definition Let n be a positive divisor of q − 1 and $\gamma$ a primitive element of $\mathbb{F}_q$. Then the sets $C_i = \{\gamma^{jn+i} : j = 0, 1, \ldots, (q - 1)/n - 1\}$, i = 0, 1, . . . , n − 1, are cyclotomic cosets of order n. For $a_0, a_1, \ldots, a_{n-1} \in \mathbb{F}_q^*$ we define a cyclotomic mapping $f_{a_0, a_1, \ldots, a_{n-1}}$ (of index n) by $f_{a_0, a_1, \ldots, a_{n-1}}(0) = 0$ and
6.3.6 Proposition [998, Theorem 3.7] The mapping $f_{a_0, a_1, \ldots, a_{n-1}}$ is a permutation of $\mathbb{F}_q$ if and only if $a_i C_i \ne a_j C_j$ for all 0 ≤ i < j ≤ n − 1.
6.3.7 Corollary For n ≥ 2 let j be an integer with 0 ≤ j < n, $\chi$ a multiplicative character of $\mathbb{F}_q$ of order n, and $a, b \in \mathbb{F}_q$ with $a \ne b$. If $a_j = a$ and $a_i = b$ for $i \ne j$, then $g_{a,b} = f_{a_0, a_1, \ldots, a_{n-1}}$ is a permutation if and only if $\chi(a) = \chi(b)$.
6.3.9 Remark Complete mappings are pertinent to the problem of constructing orthogonal Latin
squares [1939, Section 9.4].
6.3.10 Remark Substituting b = ac we see that $g_{a,b}$ is a complete mapping if and only if $\chi(c) = 1$ and $\chi(a + 1) = \chi(ac + 1)$ and the number N of complete mappings $g_{a,ac}$ with $c \ne 1$ is
n−1
X 1X X
N= χi (a + 1)χi (a − c−1 ).
n i=0
c∈F∗
q , χ(c)=1, c6=1 a∈F∗
q \{1,c
−1 }
6.3.13 Remark All single errors are detected since all $p_i$ are permutations. Another frequent family of errors are adjacent transpositions . . . ab . . . → . . . ba . . . which are all detected if $p_{i+1}(p_i^{-1}(x)) - x$ are also permutations for i = 1, . . . , s − 1. A permutation f such that f(x) − x is also a permutation is an orthomorphism. Since f is an orthomorphism whenever −f is a complete mapping, the number of orthomorphisms and complete mappings of the form $g_{a,b}$ is the same and the probability that a random choice of the parameters (a, b) gives an orthomorphism is asymptotically $n^{-2}$ by Theorem 6.3.11.
6.3.14 Example [International Standard Book Number (ISBN-10)] An ISBN-10 consists of 10 digits $x_1 - x_2x_3x_4x_5x_6 - x_7x_8x_9 - x_{10}$. The first digit $x_1$ characterizes the language group, $x_2x_3x_4x_5x_6$ is the actual book number, $x_7x_8x_9$ is the number of the publisher, and $x_{10}$ is a check digit. A correct ISBN satisfies $x_1 + 2x_2 + 3x_3 + 4x_4 + 5x_5 + 6x_6 + 7x_7 + 8x_8 + 9x_9 + 10x_{10} = 0 \in \mathbb{F}_{11}$, i.e., $p_i(x) = ix$, i = 1, . . . , 10, and $p_{i+1}(p_i^{-1}(x)) = (i^{-1} + 1)x$, i = 1, . . . , 9, which are all orthomorphisms.
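The ISBN-10 check equation translates directly into a validity test. The sketch below uses the weights 1, . . . , 10 from the example and treats a final `X` as the value 10 in $\mathbb{F}_{11}$; the handling of hyphens and of `X` follows common ISBN usage and is an assumption beyond the text, and the function name is illustrative.

```python
def isbn10_is_valid(isbn):
    """Check x1 + 2*x2 + ... + 10*x10 = 0 in F_11 for an ISBN-10 string."""
    chars = [c for c in isbn if c not in "- "]
    if len(chars) != 10:
        return False
    digits = []
    for position, c in enumerate(chars):
        if c in "xX" and position == 9:
            digits.append(10)           # check digit X stands for 10
        elif c.isdigit():
            digits.append(int(c))
        else:
            return False
    return sum(i * x for i, x in enumerate(digits, start=1)) % 11 == 0
```

Changing one digit or swapping two distinct adjacent digits of a valid number changes the weighted sum by a nonzero element of $\mathbb{F}_{11}$, so both kinds of error are detected.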
6.3.15 Definition Put $\varepsilon_n = \exp(2\pi i/n)$. Let $(s_k)$ be a T-periodic sequence over $\mathbb{Z}_n$. The (periodic) autocorrelation of $(s_k)$ is the complex-valued function defined by
\[ A(t) = \frac{1}{T} \sum_{k=0}^{T-1} \varepsilon_n^{\,s_{k+t} - s_k}, \qquad 1 \le t < T. \]
6.3.16 Remark Sequences with low autocorrelation have several applications in wireless commu-
nication, cryptography, and radar, see the monograph [1303].
6.3.18 Remark Proposition 6.3.1 implies the exact values of the autocorrelation function of the
cyclotomic generator of order n; see [2068] for the proof of a generalization to arbitrary
finite fields.
6.3.19 Theorem The autocorrelation function A(t) of the cyclotomic generator of order n is given by $A(t) = (-1 + \varepsilon_n^{\,j} + \varepsilon_n^{-j-k})/p$ if $t \in C_j$ and $-1 \in C_k$.
are Gauss sums of first kind and second kind, respectively. Let $\chi_1, \ldots, \chi_k$ be k ≥ 2 multiplicative characters of $\mathbb{F}_q$. The sum
\[ J(\chi_1, \ldots, \chi_k) = \sum_{c_1 + \cdots + c_k = 1} \chi_1(c_1) \cdots \chi_k(c_k), \]
is a Jacobi sum, where the summation is extended over all $(c_1, \ldots, c_k) \in \mathbb{F}_q^k$ such that $c_1 + \cdots + c_k = 1$.
6.3.21 Remark Gauss sums of the first and second kinds are closely related by
\[ G_a(\psi) = \frac{T}{q - 1} \sum_{j=0}^{(q-1)/T - 1} G(\chi^j, \psi), \]
where $\chi$ is a multiplicative character of order (q − 1)/T, and Gauss sums of first kind and Jacobi sums by
\[ J(\chi_1, \ldots, \chi_k) = \frac{G(\chi_1, \psi) \cdots G(\chi_k, \psi)}{G(\chi_1 \cdots \chi_k, \psi)} \]
if all involved characters are nontrivial. For background on Gauss and Jacobi sums see Section 6.1, [240] or [1939, Chapter 5]. In particular, we have (if all characters are nontrivial)
\[ |G(\chi, \psi)| = q^{1/2}, \qquad |G_a(\psi)| \le q^{1/2}, \qquad |J(\chi_1, \ldots, \chi_k)| = q^{(k-1)/2}. \tag{6.3.1} \]
We note that the bound on $|G_a(\psi)|$ is only nontrivial if $T > q^{1/2}$. If q = p is a prime, bounds which are nontrivial for $T \ge p^{1/3+\varepsilon}$ and $T \ge p^{\varepsilon}$ are given in [380, 1454] (see also [371] for an improvement), respectively.
6.3.22 Remark Gauss and Jacobi sums are involved in the proofs and statements of reciprocity laws. For example, let p and q be two distinct odd primes, let r be the order of p modulo q, $\xi$ be a primitive q-th root of unity in $\mathbb{F}_{p^r}$, and $G = \sum_{x \in \mathbb{F}_q} \left(\frac{x}{q}\right) \xi^x$ be a Gauss sum of first
γj
= ξj , j = 0, . . . , n − 1,
p n
6.3.27 Remark In terms of the decompositions of p and q in the rings of Eisenstein and Gaussian
integers we have
ad − bc
Kp,3 = if p = a2 − ab + b2 ; q = c2 − cd + d2 ; a, c ≡ 2 (mod 3); b, d ≡ 0 (mod 3)
3
and
ad − bc
Kp,4 = if p = a2 + b2 ; q = c2 + d2 ; a, c ≡ 1 (mod 4); b, d ≡ 0 (mod 2).
2
6.3.2.2 Distribution of linear congruential pseudorandom numbers
\[ D_N(\Gamma) = \sup_{B \subseteq [0,1)} \left| \frac{T_\Gamma(B)}{N} - |B| \right|, \]
where the supremum is taken over all subintervals B = [α, β) ⊆ [0, 1), and TΓ (B) is the
number of elements of Γ inside B.
6.3.31 Remark The discrepancy is a measure for the uniform distribution of Γ and a small dis-
crepancy is a desirable feature for (quasi-)Monte Carlo integration [2248]. The problem of
estimating the discrepancy can be reduced to the problem of estimating certain exponential
sums.
6.3.32 Proposition (Erdős-Turán inequality) [922, Theorem 1.2.1] Let Γ be a sequence $(\gamma_n)_{n=0}^{N-1}$
in [0, 1). We have for any integer H ≥ 1,
\[ D_N(\Gamma) \ll \frac{1}{H} + \frac{1}{N} \sum_{h=1}^{H} \frac{1}{h} |S_N(h)|, \quad \text{where } S_N(h) = \sum_{n=0}^{N-1} \exp(2\pi i h \gamma_n). \]
6.3.33 Remark For the sequence $(x_n/p)$, n = 0, 1, . . . , T − 1, in [0, 1) derived from a linear pseu-
dorandom number generator $(x_n)$ (where we identify Fp with the integers {0, 1, . . . , p − 1}),
the absolute value of the sums $S_N(h)$ equals the absolute value of Gauss sums of second
kind provided that $b \ne (1 - a)x_0$ and $a \ne 1$. A discrepancy bound can be easily obtained
by combining Proposition 6.3.32 with the bound $|G_a(\psi)| \le p^{1/2}$.
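The bound on the full-period sums can be observed directly. The sketch below assumes that the generator referred to as (6.3.2) has the usual form $x_{n+1} = a x_n + b$ over $\mathbb{F}_p$ (its definition is not reproduced in this section), and the parameter values are arbitrary illustrative choices.

```python
import cmath
import math

def multiplicative_order(a, p):
    """Order of a in the multiplicative group modulo the prime p (a not divisible by p)."""
    t, x = 1, a % p
    while x != 1:
        x = x * a % p
        t += 1
    return t

def lcg_sequence(a, b, x0, p, length):
    """First `length` terms of x_{n+1} = a*x_n + b mod p (assumed form of the generator)."""
    xs, x = [], x0 % p
    for _ in range(length):
        xs.append(x)
        x = (a * x + b) % p
    return xs

def exp_sum(xs, h, p):
    """S(h) = sum_n exp(2*pi*i*h*x_n/p)."""
    return sum(cmath.exp(2j * math.pi * h * x / p) for x in xs)

p, a, b, x0 = 101, 4, 3, 1               # a != 1 and b != (1 - a)*x0 mod p
T = multiplicative_order(a, p)           # period of the sequence under these conditions
xs = lcg_sequence(a, b, x0, p, T)
largest = max(abs(exp_sum(xs, h, p)) for h in range(1, p))
print(T, largest, math.sqrt(p))          # the largest |S_T(h)| should not exceed p^(1/2)
```

Sums over a proper part of the period (N < T) are not covered by this check; they are the subject of the incomplete-sum techniques behind the discrepancy bound.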
6.3.34 Theorem [2230, Theorem 1] For the sequence $\Gamma = (x_n/p : n = 0, \ldots, N - 1)$, where $x_n$ is
defined by (6.3.2), N < T and T is the order of a, we have $D_N(\Gamma) \ll N^{-1} p^{1/2} (\log p)^2$.
6.3.2.3 Diagonal equations, Waring’s problem in finite fields, and covering radius
of certain cyclic codes
6.3.36 Theorem [1939, Theorem 6.34] The number Nb of solutions of (6.3.3) for b ∈ F∗q is
dX
1 −1 dX
s −1
where the sum over ψ runs over all additive characters of Fq and ψcj (x) = ψ(cj x).
6.3.38 Definition Let g(k, q) be the smallest s such that every element b ∈ Fq can be written
as a sum of at most s summands of k-th powers in Fq . The problem of determining or
estimating g(k, q) is Waring’s problem in Fq .
6.3.39 Remark We note that g(k, q) = g(d, q) if d = gcd(k, q − 1) and we may restrict ourselves
to the case that k | (q − 1). Combining Theorem 6.3.36 (with $k_1 = \cdots = k_s = k \mid q - 1$
and $c_1 = \cdots = c_s$) with the result on the absolute value of Jacobi sums (6.3.1) we get
immediately
\[ N_b \ge q^{s-1} - (k - 1)^s q^{(s-1)/2}, \]
which implies the following bound.
6.3.40 Theorem [2990] For any divisor k of q − 1 we have g(k, q) ≤ s if $q^{s-1} > (k - 1)^{2s}$.
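For small prime fields, g(k, p) can be computed by exhaustive search and compared with the sufficient condition of Theorem 6.3.40. This is a brute-force sketch restricted to q = p prime; the helper names `waring_number` and `sufficient_s` are hypothetical.

```python
def waring_number(k, p):
    """Smallest s such that every element of F_p is a sum of at most s k-th powers."""
    powers = {pow(x, k, p) for x in range(p)}    # contains 0, so "at most s" equals "exactly s"
    reachable, s = {0}, 0
    while len(reachable) < p:
        new = {(r + w) % p for r in reachable for w in powers}
        if new == reachable:                     # no progress; cannot happen for prime p
            return None
        reachable, s = new, s + 1
    return s

def sufficient_s(k, q):
    """Smallest s with q^(s-1) > (k-1)^(2s), or None if no such s exists."""
    if (k - 1) ** 2 >= q:
        return None
    s = 1
    while q ** (s - 1) <= (k - 1) ** (2 * s):
        s += 1
    return s

for p, k in [(7, 2), (7, 3), (13, 3), (31, 5)]:
    print(p, k, waring_number(k, p), sufficient_s(k, p))
```

For example, the cubes modulo 7 are 0, 1 and 6, so three summands are needed to reach 3 and 4, while the condition of the theorem is first met at s = 4.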
6.3.41 Remark Theorem 6.3.40 applies only to $k < q^{1/2-\varepsilon}$ and for $q^{3/7} + 1 \le k < q^{1/2}$ we have the
improvement g(k, q) ≤ 8 of [650, Corollary 7]. Moreover, from [1283, Theorem 6] it follows
that for any ε > 0, $k \le q^{1-\varepsilon}$ and if $\mathbb{F}_q = \mathbb{F}_p(x^k)$ for some $x \in \mathbb{F}_q$, there is a constant
c(ε) such that g(k, q) ≤ c(ε). However, if q = p is a prime, a very moderate but nontrivial
bound on Gauss sums of the second kind from [1789] leads to the nontrivial bound
$g(k, p) \ll (\ln k)^{2+\varepsilon}$ if $k < p (\log\log p)^{1-\varepsilon} / \log p$.
6.3.43 Proposition [1468, Lemma 1.1] Let H be the parity check matrix of a linear [n, k]-code C
over Fq , i.e., $C = \{c \in \mathbb{F}_q^n : Hc^T = 0\}$. The covering radius is the least integer ρ such that
every $x \in \mathbb{F}_q^{n-k}$ is a linear combination of at most ρ columns of H.
6.3.44 Remark Let g ∈ Fq [X] be the minimal polynomial of an element $\alpha \in \overline{\mathbb{F}}_q^*$ of order n and r be
the order of q modulo n, i.e., $\mathbb{F}_q(\alpha) = \mathbb{F}_{q^r}$. Then the cyclic code C = (g) is the [n, n−r]-code
with parity check matrix $H = (1, \alpha, \alpha^2, \ldots, \alpha^{n-1})$, where the elements of $\mathbb{F}_{q^r}$ are identified
with r-dimensional column vectors.
Put $N = (q^r - 1)/n$. Then $\alpha = \gamma^N$ for some primitive element γ of $\mathbb{F}_{q^r}$ and the columns
of H consist of the nonzero N-th powers in $\mathbb{F}_{q^r}$. By Proposition 6.3.43, ρ(C) is the least
integer ρ such that any $x \in \mathbb{F}_{q^r}$ can be written as a linear combination of at most ρ N-th
powers in $\mathbb{F}_{q^r}$. Hence, we have ρ(C) ≤ g(N, q), where we have equality for q = 2.
6.3.45 Definition ((Extended) Hidden number problem) [347, 348] Let T ⊆ Fp . Recover a number
a ∈ Fp if for many known t ∈ T the l most significant bits of at are given.
6.3.46 Remark If l is of order $\log^{1/2} p$ and T has some uniform distribution property, a lattice
reduction technique solves the hidden number problem in polynomial time. The uniform
distribution property is fulfilled if the maximum over all nontrivial additive character sums
of Fp over T is small,
\[ \max_{\psi} \left| \sum_{t \in T} \psi(t) \right| = O(\#T^{1-\varepsilon}). \]
If T is a subgroup of $\mathbb{F}_p^*$, the sums are Gauss sums of the second kind and the desired uniform
distribution property is fulfilled by the bounds of [1454] and [380, 371] if $\#T \ge p^{1/3+\varepsilon}$ and
$\#T \ge p^{\varepsilon}$, respectively. The bound of [1789] and ideas reminiscent of Waring’s problem solve
the problem for smaller $\#T \ge \log p/(\log\log p)^{1-\varepsilon}$ using more bits, that is, l of order $\log^4 p$.
6.3.47 Definition [2655] The sparse polynomial noisy interpolation problem consists of finding
an unknown polynomial f ∈ Fp [X] of small weight from approximate values of f (t) at
polynomially many points t ∈ Fp selected uniformly at random.
6.3.48 Remark
1. The case f (X) = aX corresponds to the hidden number problem.
2. For more details we refer to the survey [2647] and [2644, Chapter 30].
6.3.49 Theorem [2548, Theorem 2G] Let g ∈ Fq [X] be of degree n and f ∈ Fq [X] have d distinct
roots in its splitting field over Fq . Let χ be a multiplicative character of Fq of order s and
let ψ be an additive character of Fq . If either s > 1 and f is not, up to a multiplicative
constant, an s-th power, or ψ is nontrivial and gcd(n, q) = 1, we have
\[ \left| \sum_{c \in \mathbb{F}_q} \chi(f(c)) \psi(g(c)) \right| \le (d + n - 1) q^{1/2}. \]
6.3.50 Remark The condition gcd(n, q) = 1 can be replaced by the weaker condition that
$Y^q - Y - g(X)$ is absolutely irreducible. The condition on f is fulfilled if gcd(deg(f ), s) = 1.
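The Weil bound of Theorem 6.3.49 is easy to observe numerically. The sketch below assumes q = p is an odd prime, takes χ to be the quadratic character (s = 2) and ψ the canonical additive character, and uses deg f as an upper bound for the number d of distinct roots; the sample polynomials are arbitrary choices.

```python
import cmath
import math

def eta(x, p):
    """Quadratic character of F_p for an odd prime p: 0, 1, or -1."""
    x %= p
    if x == 0:
        return 0
    return 1 if pow(x, (p - 1) // 2, p) == 1 else -1

def poly_eval(coeffs, x, p):
    """Evaluate c0 + c1*x + c2*x^2 + ... modulo p by Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def mixed_sum(f, g, p):
    """Sum over c in F_p of eta(f(c)) * psi(g(c)) with psi(x) = exp(2*pi*i*x/p)."""
    return sum(eta(poly_eval(f, c, p), p) *
               cmath.exp(2j * math.pi * poly_eval(g, c, p) / p)
               for c in range(p))

p = 101
f = [1, 1, 0, 1]      # x^3 + x + 1: odd degree, hence not a constant times a square
g = [0, 1]            # g(x) = x, so n = 1 and gcd(n, p) = 1
print(abs(mixed_sum(f, g, p)), 3 * math.sqrt(p))     # bound with d <= 3, n = 1
print(abs(mixed_sum(f, [0], p)), 2 * math.sqrt(p))   # pure character sum, bound (d - 1) p^(1/2)
```

The second line uses a constant g, for which the sum reduces to the classical multiplicative character sum bounded by $(d - 1) p^{1/2}$.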
6.3.52 Proposition [2711, Lemma 1 and Lemma 2] The number $N_{f,s,q^r}$ of solutions $(x, y) \in \mathbb{F}_{q^r}^2$
of a superelliptic equation is
\[ N_{f,s,q^r} = \sum_{\mathrm{ord}(\chi) \mid s} \sum_{x \in \mathbb{F}_{q^r}} \chi(f(x)), \]
where the outer sum runs over all multiplicative characters χ of $\mathbb{F}_{q^r}$ such that $\chi^s$ is trivial.
The number $N_{g,q^r}$ of solutions of an Artin-Schreier equation is
\[ N_{g,q^r} = \sum_{\psi_r} \sum_{x \in \mathbb{F}_{q^r}} \psi_r(g(x)), \]
where the outer sum runs over all additive characters ψ of Fq and $\psi_r(x) = \psi(\mathrm{Tr}(x))$, and
Tr denotes the trace from $\mathbb{F}_{q^r}$ to Fq .
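The first formula can be compared with a direct count of solutions. The following sketch assumes r = 1 and q = p an odd prime, and uses the convention that the trivial character takes the value 1 at 0 while nontrivial characters vanish there; the function names are illustrative.

```python
import cmath
import math

def primitive_root(p):
    """Smallest primitive root modulo an odd prime p (brute force)."""
    for g in range(2, p):
        if len({pow(g, k, p) for k in range(p - 1)}) == p - 1:
            return g
    raise ValueError("p must be an odd prime")

def poly_eval(coeffs, x, p):
    """Evaluate c0 + c1*x + ... modulo p by Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def count_bruteforce(f, s, p):
    """Number of (x, y) in F_p^2 with y^s = f(x)."""
    return sum(1 for x in range(p) for y in range(p)
               if pow(y, s, p) == poly_eval(f, x, p))

def count_by_characters(f, s, p):
    """Same count as a double sum over characters chi with chi^s trivial and x in F_p."""
    g = primitive_root(p)
    log = {pow(g, k, p): k for k in range(p - 1)}
    m = math.gcd(s, p - 1)
    step = (p - 1) // m              # chi^s is trivial exactly for indices i*step
    total = 0
    for i in range(m):
        index = i * step
        for x in range(p):
            v = poly_eval(f, x, p)
            if v == 0:
                total += 1 if index == 0 else 0
            else:
                total += cmath.exp(2j * math.pi * index * log[v] / (p - 1))
    return total

p, s, f = 13, 3, [1, 0, 1]           # y^3 = x^2 + 1 over F_13
print(count_bruteforce(f, s, p), count_by_characters(f, s, p))
```

The character-sum value is a complex number whose imaginary part should vanish up to rounding and whose real part should agree with the direct count.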
6.3.53 Remark After isolating the trivial characters, the Weil bound implies immediately bounds
on $N_{f,s,q^r}$ and $N_{g,q^r}$.
6.3.54 Theorem [2711, Chapter 1.4] If f ∈ Fq [X] with gcd(deg(f ), s) = 1 and d > 0 different
zeros (in the algebraic closure of Fq ), then the number of solutions $N_{f,s,q^r}$ over $\mathbb{F}_{q^r}$ of the
superelliptic equation $Y^s = f(X)$ satisfies
\[ |N_{f,s,q^r} - q^r| \le (s - 1)(d - 1) q^{r/2}. \]
If g ∈ Fq [X] has degree n with gcd(n, q) = 1, then the number of solutions $N_{g,q^r}$ over $\mathbb{F}_{q^r}$
of the Artin-Schreier equation $Y^q - Y = g(X)$ satisfies
\[ |N_{g,q^r} - q^r| \le (q - 1)(n - 1) q^{r/2}. \]
where $t_f$ is the smallest value of t with $f^{(t)}(\alpha) = f^{(s)}(\alpha)$ for some positive integer s < t,
i.e., the orbit length.
6.3.57 Proposition [1620, Proposition 3] Let η denote the quadratic character of Fq . A quadratic
polynomial f ∈ Fq [X] is stable if and only if η(x) = −1 for all elements x of the adjusted
orbit $\overline{\mathrm{Orb}}(f) = \{-f(\alpha)\} \cup \mathrm{Orb}(f)$.
6.3.58 Remark If f is stable, then the elements $f^{(n)}(\alpha)$, n = 2, . . . , $t_f - 1$, are different and
Proposition 6.3.57 implies for any positive integer K,
\[ t_f - 2 = \frac{1}{2^K} \sum_{n=2}^{t_f - 1} \prod_{k=1}^{K} \left(1 - \eta\big(f^{(k)}(f^{(n)}(\alpha))\big)\right) \le \frac{1}{2^K} \sum_{x \in \mathbb{F}_q} \prod_{k=1}^{K} \left(1 - \eta\big(f^{(k)}(x)\big)\right). \]
Expanding the product on the right hand side, we obtain one trivial sum equal to q and
$2^K - 1$ sums which can be bounded by the Weil bound giving
\[ t_f \le \frac{q}{2^K} + O(2^K q^{1/2}). \]
The optimal choice of K gives the following result.
6.3.59 Theorem [2331, Theorem 1] For any odd q and any stable quadratic polynomial f ∈ Fq [X]
we have $t_f = O(q^{3/4})$.
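The choice of K behind this bound can be spelled out. The following worked step is a sketch that balances the two terms of the estimate for $t_f$ in Remark 6.3.58, read as an upper bound; the specific choice $K = \lceil (\log_2 q)/4 \rceil$ is an illustrative assumption.

```latex
\[
  t_f \le \frac{q}{2^K} + O\!\left(2^K q^{1/2}\right),
  \qquad
  K = \left\lceil \tfrac{1}{4} \log_2 q \right\rceil
  \;\Longrightarrow\;
  q^{1/4} \le 2^K < 2\,q^{1/4},
\]
\[
  \frac{q}{2^K} \le q^{3/4}
  \quad\text{and}\quad
  2^K q^{1/2} < 2\,q^{3/4},
  \qquad\text{hence}\qquad
  t_f = O\!\left(q^{3/4}\right).
\]
```

Any K with $2^K$ of order $q^{1/4}$ gives the same exponent, since the first term decreases and the second increases in K.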
6.3.60 Remark Similarly, Gomez and Nicolás estimated in [1310] the number of stable quadratic
polynomials over Fq for an odd prime power q. As for Theorem 6.3.59, this reduces to
estimating the character sum
\[ \sum_{(a,b,c) \in \mathbb{F}_q^* \times \mathbb{F}_q \times \mathbb{F}_q} \prod_{k=1}^{K} \left(1 - \eta(F_k(a, b, c))\right), \]
where $F_k(a, b, c)$ is the k-th element of the critical orbit of f and K any positive integer
parameter. Gomez and Nicolás [1310] have proved that there are $O(q^{5/2} (\log q)^{1/2})$ stable
quadratic polynomials over Fq for the power q of an odd prime.
where Tr denotes the absolute trace from $\mathbb{F}_{2^r}$ to $\mathbb{F}_2$, and define a code
\[ C_t = \{c_g : g \in G_t\}. \]
6.3.62 Remark The code Ct is linear and is the dual of the primitive, binary BCH code with
designed distance 2t + 1 [1991].
6.3.63 Remark Following [2372], one can use the Weil bound to estimate the Hamming weight
for $C_t$. To do this, we recall that any nonzero codeword $c_g \in C_t$ comes from a nonzero
polynomial $g(X) = \sum_{i=1}^{t} g_i X^{2i-1}$. Hence, we have
\[ 2^r - 2\,\mathrm{wt}_H(c_g) = \sum_{x \in \mathbb{F}_{2^r}} (-1)^{\mathrm{Tr}(g(x))} = \sum_{x \in \mathbb{F}_{2^r}} \psi(g(x)), \]
where $\psi(x) = (-1)^{\mathrm{Tr}(x)}$ is the additive canonical character [2372, Equation (4)]. Applying
now the Weil bound, we get the following result.
6.3.64 Theorem [2372, Theorem 4] For t ≥ 1 the minimum Hamming weight of $C_t$ is at least
$2^{r-1} - (t - 1) 2^{r/2}$.
6.3.65 Definition Let ψ be a nontrivial additive character of Fq and let a, b ∈ Fq . We define the
Kloosterman sum (see Section 6.1) by
\[ K(\psi; a, b) = \sum_{x \in \mathbb{F}_q^*} \psi(ax + bx^{-1}). \]
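Kloosterman sums over a prime field can be evaluated directly from this definition. The sketch below assumes q = p is an odd prime and ψ is the canonical additive character, and compares the values with the classical estimate $|K(\psi; a, b)| \le 2 q^{1/2}$ for $ab \ne 0$; the function name is illustrative.

```python
import cmath
import math

def kloosterman_sum(a, b, p):
    """K(psi; a, b) over F_p for psi(x) = exp(2*pi*i*x/p)."""
    total = 0
    for x in range(1, p):
        x_inv = pow(x, p - 2, p)      # inverse of x by Fermat's little theorem
        total += cmath.exp(2j * math.pi * ((a * x + b * x_inv) % p) / p)
    return total

p = 101
for a, b in [(1, 1), (2, 5), (7, 3)]:
    value = kloosterman_sum(a, b, p)
    # the substitution x -> -x shows that the sum is real
    print(a, b, value.real, 2 * math.sqrt(p))
```

Each printed value should lie in the interval $[-2p^{1/2}, 2p^{1/2}]$.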
\[ Y^q - Y = aX + bX^{-1} + c \]
is a Kloosterman equation.
6.3.68 Theorem The number $N_{a,b,c}$ of solutions $(x, y) \in \mathbb{F}_{q^r}^2$ of a Kloosterman equation satisfies
\[ |N_{a,b,c} - q^r| \le 2(q - 1) q^{r/2}. \]
6.3.69 Remark We note that this estimate is obtained in a similar way as the number of solutions to
an Artin-Schreier equation in Theorem 6.3.54, but using the estimate in Proposition 6.3.66
instead of the Weil bound.
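6.3.70 Definition Let γ be a primitive element of $\mathbb{F}_{2^r}$. The codewords of the Kloosterman code
$C = \{c_{a,b} : a, b \in \mathbb{F}_{2^r}\}$ are defined by
\[ c_{a,b} = \left(\mathrm{Tr}(a + b), \mathrm{Tr}(a\gamma + b\gamma^{-1}), \ldots, \mathrm{Tr}(a\gamma^{2^r - 2} + b\gamma^{2 - 2^r})\right), \quad a, b \in \mathbb{F}_{2^r}. \]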
6.3.71 Theorem [2372] The minimum weight of the Kloosterman code is at least $2^{r-1} - 2^{r/2}$.
6.3.72 Definition For $a \in \mathbb{F}_p^*$, $b \in \mathbb{F}_p$ we define, with the convention $0^{-1} = 0$, the sequence $(u_n)$
by the recurrence relation
\[ u_{n+1} = a u_n^{-1} + b, \quad n = 0, 1, \ldots, \tag{6.3.4} \]
with $u_0$ the initial value. Then the numbers $u_n/p$, n = 0, 1, . . ., in the interval [0, 1) form
a sequence of inversive congruential pseudorandom numbers.
6.3.73 Remark The sequence $(u_n)$ defined by (6.3.4) is purely periodic with some period T ≤ p.
Using the Erdős-Turán inequality given by Proposition 6.3.32, one reduces the problem of
estimating the discrepancy of the elements in the sequence $(u_n/p)$, n = 0, . . . , T − 1, to
estimating character sums given by
\[ S_T(h) = \sum_{n=0}^{T-1} \psi(h u_n) \]
with a fixed integer $h \not\equiv 0 \pmod{p}$, where ψ is the additive canonical character of Fp . Niederreiter
and Shparlinski [2270] developed a method to reduce the problem of estimating $|S_T(h)|$
to estimating Kloosterman sums.
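The recurrence (6.3.4), its period, and the sums $S_T(h)$ can be generated in a few lines. This is a minimal sketch for an odd prime p with arbitrarily chosen parameters; the function names are hypothetical, and no particular bound on $|S_T(h)|$ is asserted by the code.

```python
import cmath
import math

def inversive_step(u, a, b, p):
    """One step u -> a*u^{-1} + b in F_p (p an odd prime), with the convention 0^{-1} = 0."""
    return (a * pow(u, p - 2, p) + b) % p        # pow(0, p - 2, p) == 0 for p >= 3

def inversive_period(a, b, u0, p):
    """Least T >= 1 with u_T = u_0; it exists because the step map permutes F_p."""
    u, t = inversive_step(u0 % p, a, b, p), 1
    while u != u0 % p:
        u, t = inversive_step(u, a, b, p), t + 1
    return t

def inversive_sequence(a, b, u0, p, length):
    """First `length` terms u_0, u_1, ... of the inversive recurrence."""
    us, u = [], u0 % p
    for _ in range(length):
        us.append(u)
        u = inversive_step(u, a, b, p)
    return us

def exponential_sum(us, h, p):
    """S(h) = sum_n psi(h * u_n) with psi(x) = exp(2*pi*i*x/p)."""
    return sum(cmath.exp(2j * math.pi * h * u / p) for u in us)

p, a, b, u0 = 31, 3, 5, 1
T = inversive_period(a, b, u0, p)
us = inversive_sequence(a, b, u0, p, T)
print(T, max(abs(exponential_sum(us, h, p)) for h in range(1, p)))
```

Since inversion (with $0^{-1} = 0$) and the affine map $u \mapsto au + b$ with $a \ne 0$ are both permutations of $\mathbb{F}_p$, the sequence returns to $u_0$ after at most p steps, which is the pure periodicity used above.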
6.3.74 Theorem [2270, Theorem 2] For the sequence $\Gamma = (u_n/p : n = 0, \ldots, T - 1)$, where $(u_n)$ is
defined by (6.3.4) and T is the period of the sequence $(u_n)$, we have
6.3.75 Definition Let $B_r = \{0, 1\}^r$. The Fourier coefficients (or Walsh-Hadamard coefficients)
$\hat{B}(a)$ of $B(U_1, \ldots, U_r)$, where $a \in B_r$, are defined as
\[ \hat{B}(a) = \sum_{u \in B_r} (-1)^{B(u) + \langle a, u \rangle}, \]
6.3.76 Remark Boolean functions used in cryptography must have high nonlinearity, see for ex-
ample [523] and Section 9.1.
6.3.77 Remark Each Boolean function $B : \mathbb{F}_2^r \to \mathbb{F}_2$ can be represented with a polynomial $f \in \mathbb{F}_{2^r}[X]$
and the absolute trace function as $B(x_1, \ldots, x_r) = \mathrm{Tr}(f(x_1\beta_1 + \cdots + x_r\beta_r))$ with some
fixed ordered basis $\{\beta_1, \ldots, \beta_r\}$ of $\mathbb{F}_{2^r}$. Moreover, if we identify $x \in \mathbb{F}_{2^r}$ with its coordinate
\[ N(B) = 2^{r-1} - \frac{1}{2} \max_{b \in \mathbb{F}_{2^r}} \left| \sum_{u \in \mathbb{F}_{2^r}} \psi(B(u) + bu) \right|, \]
where ψ is the additive canonical character of $\mathbb{F}_{2^r}$. For example, if $f(x) = x^{-1}$ (with the
convention $0^{-1} = 0$), we get Kloosterman sums on the right hand side.
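In the vector-space description $B_r = \{0,1\}^r$, the Walsh-Hadamard coefficients and the nonlinearity can be computed by direct summation. The sketch below assumes the standard inner product $\langle a, u \rangle = a_1u_1 + \cdots + a_ru_r$ modulo 2 and takes the nonlinearity to be $2^{r-1}$ minus half the largest absolute coefficient, in line with the formula for N(B) above; the function names are illustrative.

```python
from itertools import product

def walsh_coefficients(B, r):
    """Return {a: sum over u of (-1)^(B(u) + <a, u>)} for all a, u in {0,1}^r."""
    points = list(product((0, 1), repeat=r))
    coeffs = {}
    for a in points:
        total = 0
        for u in points:
            inner = sum(ai * ui for ai, ui in zip(a, u)) % 2
            total += (-1) ** ((B(u) + inner) % 2)
        coeffs[a] = total
    return coeffs

def nonlinearity(B, r):
    """2^(r-1) minus half of the largest |Walsh coefficient| (coefficients are even for r >= 1)."""
    return 2 ** (r - 1) - max(abs(w) for w in walsh_coefficients(B, r).values()) // 2

def bent4(u):
    """u1*u2 + u3*u4 over F_2, a standard example of a bent function in 4 variables."""
    return (u[0] * u[1] + u[2] * u[3]) % 2

def linear4(u):
    """A linear function; its nonlinearity is 0."""
    return (u[0] + u[3]) % 2

print(nonlinearity(bent4, 4), nonlinearity(linear4, 4))
```

Direct summation costs $4^r$ operations; a fast Walsh-Hadamard transform would reduce this to $r\,2^r$, which matters only for larger r.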
6.3.79 Remark The bound of Theorem 6.3.78 can be obtained using the standard method for
reducing incomplete character sums to complete ones, see for example [1581, Section 12.2],
and the Weil bound. This reduction method can be traced back to Pólya and Vinogradov; for
references see [1581]. Generalizations to certain incomplete character sums over arbitrary
finite fields are given in [776, 2989, 2991, 2993].
6.3.80 Algorithm [1887] Let f ∈ Fp [X], p an odd prime, be a squarefree polynomial with $f(0) \ne 0$
which splits over Fp . For t = 0, 1, . . . , N we compute
via the Euclidean algorithm, where N is the main parameter of the algorithm, hoping that
at least one polynomial Lt is nontrivial, that is, is equal to neither 1 nor f .
For each t, the polynomial
6.3.84 Definition [Mauduit and Sárközy [2037]] For a finite binary sequence $E_N = (e_1, \ldots, e_N) \in \{-1, +1\}^N$
the well-distribution measure of $E_N$ is defined by
\[ W(E_N) = \max_{a,b,M} \left| \sum_{j=1}^{M} e_{a+bj} \right|, \]
where the maximum is taken over all a, b, M ∈ Z and b, M > 0 such that 1 ≤ a + b ≤
a + bM ≤ N , and the correlation measure of order ℓ of $E_N$ is defined as
\[ C_\ell(E_N) = \max_{M,D} \left| \sum_{n=1}^{M} e_{n+d_1} e_{n+d_2} \cdots e_{n+d_\ell} \right|, \]
where the maximum is taken over all $D = (d_1, \ldots, d_\ell)$ and M is such that $0 \le d_1 < \cdots < d_\ell \le N - M$.
6.3.85 Definition Let p be an odd prime. The binary sequence $E_p = (1, e_1, \ldots, e_{p-1})$ defined by
\[ e_n = \left(\frac{n}{p}\right), \quad 0 \le n < p, \]
6.3.86 Remark The sums in the definitions of well-distribution measure and correlation measure
of order ℓ for the Legendre sequence are essentially sums of products of Legendre symbols
which can be estimated by Theorem 6.3.78.
6.3.87 Remark We note that $W(E_N)$ and $C_\ell(E_N)$ of a “truly random” sequence are of the order
of magnitude $N^{1/2} \log N$ and $N^{1/2} (\log N)^{c(\ell)}$, respectively, see [82, 554].
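Both measures can be evaluated exhaustively for short sequences. The sketch below follows Definition 6.3.84 literally, treating a Python list as $(e_1, \ldots, e_N)$, and builds the Legendre sequence with first entry 1; it is a brute-force illustration whose cost grows quickly with N and ℓ, and the function names are not from the handbook.

```python
from itertools import combinations

def legendre_sequence(p):
    """Sequence of length p: first entry 1, then the Legendre symbols (n/p) for n = 1..p-1."""
    return [1] + [1 if pow(n, (p - 1) // 2, p) == 1 else -1 for n in range(1, p)]

def well_distribution(e):
    """W(E_N): max |e_{a+b} + e_{a+2b} + ... + e_{a+Mb}| with 1 <= a+b <= a+bM <= N."""
    N, best = len(e), 0
    for b in range(1, N + 1):
        for start in range(1, N + 1):        # start plays the role of a + b
            total, idx = 0, start
            while idx <= N:
                total += e[idx - 1]           # the list is 0-based, the definition 1-based
                best = max(best, abs(total))
                idx += b
    return best

def correlation(e, ell):
    """C_ell(E_N): max over M and 0 <= d_1 < ... < d_ell <= N - M of the correlation sum."""
    N, best = len(e), 0
    for D in combinations(range(N), ell):     # increasing tuples (d_1, ..., d_ell)
        total = 0
        for n in range(1, N - D[-1] + 1):     # M ranges up to N - d_ell
            term = 1
            for d in D:
                term *= e[n + d - 1]
            total += term
            best = max(best, abs(total))
    return best

E = legendre_sequence(11)
print(E, well_distribution(E), correlation(E, 2))
```

For comparison, the constant sequence of length N attains the extreme values W = N and $C_2 = N - 1$.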
6.3.88 Theorem [2037] For the Legendre sequence Ep we have
6.3.89 Remark The linear complexity is a measure for the unpredictability and thus suitability
of a sequence in cryptography. For sequences $(u_0, \ldots, u_{N-1}) \in \mathbb{F}_2^N$ it is closely related to
the correlation measure of order ℓ of the sequence $(e_0, \ldots, e_{N-1}) \in \{-1, +1\}^N$ defined by
$e_n = (-1)^{u_n}$, n = 0, . . . , N − 1. Hence, from a suitable upper bound on $C_\ell(E_N)$ up to a
sufficiently large ℓ we can derive a lower bound on the linear complexity of $(u_n)$, see [393]
and Subsection 10.4.5.
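6.3.90 Theorem (Vinogradov’s Formula) [1631, Lemma 7.5.3] For a subset $S \subseteq \mathbb{F}_q^*$, the number Q
of primitive elements in S is
\[ Q = \frac{\varphi(q-1)}{q-1} \sum_{d \mid (q-1)} \frac{\mu(d)}{\varphi(d)} \sum_{\substack{\chi \\ \mathrm{ord}(\chi) = d}} \sum_{x \in S} \chi(x), \]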
where µ and ϕ denote Möbius’ and Euler’s totient function, respectively, and χ is a nontrivial
multiplicative character of Fq .
6.3.91 Theorem [1631, Lemma 7.5.3] Let S be a subset of $\mathbb{F}_q^*$. Then the number R of s-th powers
in S is
\[ R = \frac{1}{s} \sum_{\mathrm{ord}(\chi) \mid s} \sum_{x \in S} \chi(x). \]
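Both counting formulas can be verified against brute force in a small prime field. The sketch below assumes q = p is an odd prime and, for the count of s-th powers, that s divides p − 1; characters are indexed via a primitive root, the inner sum in Vinogradov's formula is taken over all characters of exact order d (including the trivial character for d = 1), and all names are illustrative.

```python
import cmath
import math

def discrete_logs(p):
    """Table log[g^k] = k for the smallest primitive root g of the odd prime p."""
    for g in range(2, p):
        table = {pow(g, k, p): k for k in range(p - 1)}
        if len(table) == p - 1:
            return table
    raise ValueError("p must be an odd prime")

def euler_phi(n):
    return sum(1 for k in range(1, n + 1) if math.gcd(k, n) == 1)

def mobius(n):
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0          # square factor
            result = -result
        d += 1
    return -result if n > 1 else result

def primitive_count_by_characters(S, p):
    """Vinogradov's formula for the number of primitive elements in S, a subset of F_p^*."""
    n, log = p - 1, discrete_logs(p)
    total = 0
    for d in (d for d in range(1, n + 1) if n % d == 0):
        # the character with index i has order n / gcd(i, n)
        inner = sum(cmath.exp(2j * math.pi * i * log[x] / n)
                    for i in range(n) if math.gcd(i, n) == n // d
                    for x in S)
        total += mobius(d) / euler_phi(d) * inner
    return euler_phi(n) / n * total

def power_count_by_characters(S, s, p):
    """Number of s-th powers in S via characters with chi^s trivial (s must divide p - 1)."""
    n, log = p - 1, discrete_logs(p)
    if n % s:
        raise ValueError("this sketch assumes s divides p - 1")
    step = n // s
    return sum(cmath.exp(2j * math.pi * (i * step) * log[x] / n)
               for i in range(s) for x in S) / s

p, S = 13, list(range(1, 8))
print(primitive_count_by_characters(S, p), power_count_by_characters(S, 3, p))
```

Modulo 13 the primitive elements are 2, 6, 7, 11 and the nonzero cubes are 1, 5, 8, 12, so for S = {1, ..., 7} the two values should be 3 and 2 up to rounding.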
6.3.92 Remark [1939, Theorem 5.4] For a set S of 1 ≤ N ≤ p consecutive elements in Fp , the
Burgess bound [464]
\[ \left| \sum_{n=M+1}^{M+N} \chi(n) \right| \ll N^{1-1/r} p^{(r+1)/4r^2+\varepsilon}, \]
which is nontrivial for any $N \ge p^{1/4+\varepsilon}$, implies $Q = \frac{\varphi(p-1)}{p-1} \left(N + O(N^{1-1/r} p^{(r+1)/4r^2+\varepsilon})\right)$
and $R = \frac{N}{s} + O(N^{1-1/r} p^{(r+1)/4r^2+\varepsilon})$. For generalizations of the Burgess bound see [375,
583, 584, 1791].
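6.3.93 Remark For $\mathbb{F}_{p^r}$ with $2 \le r < p^{1/2}$ and a defining element α of $\mathbb{F}_{p^r}$, the bound
\[ \left| \sum_{x \in \mathbb{F}_p} \chi(\alpha + x) \right| \le (r - 1) p^{1/2} \]
of [1702] guarantees the existence of primitive elements in rather small subsets of $\mathbb{F}_{p^r}$.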
6.3.94 Remark Let g ∈ Fp be an element of order T | (p − 1). The Diffie-Hellman key ex-
change [859] is a way for two parties to establish a common secret key over an insecure
channel. The question of studying the distribution of the Diffie-Hellman triples $(g^x, g^y, g^{xy})$,
x, y = 0, . . . , T − 1, is motivated by the assumption that these triples cannot be distinguished
from totally random triples in feasible computational time, see [487, 488]. In the series of
papers [197, 368, 487, 488, 587, 1191] it has been shown that such triples are uniformly
distributed via estimating double exponential sums with linear combinations of the entries
in such triples.
6.3.95 Definition Let ψ be the additive canonical character of Fp . For integers a, b, c, we define
the exponential sum
\[ S_{a,b,c}(T) = \sum_{x,y=1}^{T} \psi(a g^x + b g^y + c g^{xy}). \]
6.3.96 Remark Bounds on the sums $S_{a,b,c}(T)$ were obtained by actually estimating different sums
\[ W_{a,c}(T) = \sum_{y=1}^{T} \left| \sum_{x=1}^{T} \psi(a g^x + c g^{xy}) \right| \]
since $|S_{a,b,c}(T)| \le W_{a,c}(T)$. In [487] the bound $W_{a,c}(T) \ll T^{5/3} p^{1/4}$ is obtained, which
is nontrivial for $T > p^{3/4+\varepsilon}$. Moreover, Bourgain [368] gave a nontrivial estimate for
$T > p^{\varepsilon}$ for any ε > 0 and Garaev [1191] improved not only the bound in [487] obtaining
$W_{a,c}(T) \ll T^{7/4} p^{1/8+\varepsilon}$, but also the range of nontriviality $T > p^{1/2+\varepsilon}$. Furthermore,
Chang and Yao [587] gave an estimate that works in a range that was not covered by any
explicit bound in any other work.
6.3.97 Definition Let ψ denote the additive canonical character of the prime field Fp . Given a
set $T = \{t_1, t_2, \ldots, t_n\}$ of n elements in Fp , the sequence
\[ f_T(k) = \sum_{j=1}^{n} \psi(t_j k), \quad k = 0, 1, \ldots, p - 1, \]
6.3.98 Remark It is well known that the discrete Fourier transform (see Section 10.1) captures
a lot of information about the “random-like” behavior of sets, that is, “good” sets have a
small discrete Fourier transform for all 1 ≤ k < p. In [59, 1702], several constructions of
thin sets T were given, that is, sets of size $|T| = O((\log p)^{2+\varepsilon})$ with small maximum discrete
Fourier transform $\max_{1 \le k \le p-1} |f_T(k)| = O(|T| (\log p)^{-\varepsilon})$.
6.3.99 Remark Constructions of thin sets have applications in graph theory, computer sci-
ence [1161], and in combinatorial number theory [2510, 2996]. For example, in additive
number theory, for an infinite set A of natural numbers, one defines the lower density of A
as
\[ d(A) = \liminf_{x \to \infty} \frac{N(x)}{x}, \]
where N (x) = #{a ∈ A | a ≤ x}. One problem of interest is to construct essential com-
ponents, that is, sets H with the property that d(A + H) > d(A) for all such A with
0 < d(A) < 1, and to investigate how thin an essential component can be. For such estimates,
see [59, 2510, 2996].
6.3.100 Theorem [1384, Corollaries 1 and 5] Let A, B ⊆ Fq and ψ and χ be a nontrivial additive
and multiplicative character, respectively. Then we have
\[ \left| \sum_{a \in A, b \in B} \psi(ab) \right| \le (|A||B|q)^{1/2} \quad \text{and} \quad \left| \sum_{a \in A, b \in B} \chi(ab + 1) \right| \le (|A||B|q)^{1/2}. \]
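Both bilinear estimates can be tried out on random subsets. The sketch below assumes q = p is an odd prime, takes ψ to be the canonical additive character and χ the quadratic character, and uses a fixed random seed purely for reproducibility; the set sizes are arbitrary.

```python
import cmath
import math
import random

def additive_bilinear_sum(A, B, p):
    """Sum of psi(a*b) over a in A, b in B, with psi(x) = exp(2*pi*i*x/p)."""
    return sum(cmath.exp(2j * math.pi * ((a * b) % p) / p) for a in A for b in B)

def quadratic_character(x, p):
    """Quadratic character of F_p for an odd prime p."""
    x %= p
    if x == 0:
        return 0
    return 1 if pow(x, (p - 1) // 2, p) == 1 else -1

def multiplicative_bilinear_sum(A, B, p):
    """Sum of eta(a*b + 1) over a in A, b in B, eta the quadratic character."""
    return sum(quadratic_character(a * b + 1, p) for a in A for b in B)

p = 101
rng = random.Random(0)                   # fixed seed, illustrative only
A = rng.sample(range(p), 30)
B = rng.sample(range(p), 40)
bound = math.sqrt(len(A) * len(B) * p)
print(abs(additive_bilinear_sum(A, B, p)), abs(multiplicative_bilinear_sum(A, B, p)), bound)
```

The bound beats the trivial estimate |A||B| exactly when |A||B| > q, which is why these inequalities are most useful for large sets.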
6.3.101 Remark Based on Theorem 6.3.100, Gyarmati and Sárközy showed in [1385, Corollaries 1
and 2] that for all subsets A, B, C, D ⊆ Fq with $|A||B||C||D| > q^3$, each of the equations
\[ a + b = cd, \quad a \in A,\ b \in B,\ c \in C,\ d \in D \]
and
\[ ab + 1 = cd, \quad a \in A,\ b \in B,\ c \in C,\ d \in D \]
has a solution.
6.3.102 Remark When q = p and ψ is the additive canonical character of Fp , Bourgain and Garaev
proved in [378, Theorem 1.2] the following estimate for any subsets A, B, C of $\mathbb{F}_p^*$,
\[ \left| \sum_{a \in A, b \in B, c \in C} \psi(abc) \right| < (|A||B||C|)^{13/16} p^{5/18+o(1)}. \]
See Also
§3.1, §3.5, §4.2 For estimating the number of polynomials with certain features.
§6.1, §6.2 For basics on character sums.
§6.4, §7.3 For diagonal equations and Waring’s problem.
§7.1 For Kummer and Artin-Schreier curves.
§9.1, §9.3 For Boolean functions and nonlinearity.
§10.3, §10.4, §17.3 For correlation and related measures.
§10.5 For nonlinear recurrence sequences.
§11.4 For univariate factorization.
§11.6, §17.2 For discrete logarithm based cryptosystems.
§14.1 For Latin squares.
§14.5, §14.6 For Hadamard matrices and related combinatorial structures.
§15.1 For basics on coding theory.
§17.2 For applications of character sums in quantum information theory.
References Cited: [59, 82, 197, 240, 260, 347, 348, 368, 371, 375, 378, 380, 393, 463, 464,
487, 488, 523, 554, 583, 584, 587, 650, 776, 859, 922, 998, 1161, 1191, 1283, 1303, 1310, 1384,
1385, 1454, 1468, 1536, 1581, 1620, 1631, 1702, 1730, 1789, 1791, 1887, 1911, 1939, 1991,
2037, 2068, 2230, 2248, 2270, 2278, 2331, 2344, 2372, 2510, 2548, 2626, 2644, 2655, 2647,
2711, 2989, 2990, 2991, 2993, 2996]
6.4.1 Notation
6.4.1 Remark The order $q = p^n$ of the field Fq is assumed to be sufficiently large. The elements
of the prime residue field Fp are occasionally associated with their concrete representatives.
The cardinality of a finite set X is denoted by |X|.
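6.4.2 Definition Given nonempty sets A and B the sum set A + B is defined by
\[ A + B := \{a + b : a \in A, b \in B\} \]
and the product set AB is defined by
\[ AB := \{ab : a \in A, b \in B\}. \]
The number of solutions of the equation
\[ a_1 b_1 = a_2 b_2, \quad (a_1, a_2) \in A \times A, \quad (b_1, b_2) \in B \times B \]
is denoted by $E_\times(A, B)$ and is the multiplicative energy between the sets A and B.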
6.4.3 Remark There is a simple and important connection between the cardinality of the product
set AB and the multiplicative energy $E_\times(A, B)$:
\[ |AB| \ge \frac{|A|^2 |B|^2}{E_\times(A, B)}. \]
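Sum sets, product sets, and the multiplicative energy are straightforward to compute for subsets of a prime field, which makes the inequality above (a consequence of the Cauchy-Schwarz inequality) easy to test. This is a sketch over $\mathbb{F}_p$ with illustrative function names; the arithmetic and geometric progressions are chosen to show small sum set and small product set, respectively.

```python
from collections import Counter

def sum_set(A, B, p):
    """A + B in F_p."""
    return {(a + b) % p for a in A for b in B}

def product_set(A, B, p):
    """AB in F_p."""
    return {(a * b) % p for a in A for b in B}

def multiplicative_energy(A, B, p):
    """Number of (a1, a2, b1, b2) in A x A x B x B with a1*b1 = a2*b2 in F_p."""
    representations = Counter((a * b) % p for a in A for b in B)
    return sum(count * count for count in representations.values())

p = 101
AP = set(range(1, 11))                        # arithmetic progression: small sum set
GP = {pow(2, i, p) for i in range(10)}        # geometric progression: small product set
for name, A in (("AP", AP), ("GP", GP)):
    print(name, len(sum_set(A, A, p)), len(product_set(A, A, p)),
          multiplicative_energy(A, A, p))
```

The progression 1, ..., 10 has only 19 distinct sums, and the powers of 2 have only 19 distinct products; the sum-product phenomenon asserts that both quantities cannot be small simultaneously.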
6.4.4 Remark All subsets in this section are assumed to be nonempty. For quantities U and V ,
the notations U = O(V ), $U \ll V$ and $V \gg U$ are all equivalent to the statement that the
inequality |U | ≤ cV holds with some absolute constant c > 0. We use the abbreviation $e_p(x)$
to denote $e^{2\pi i x/p}$.
6.4.5 Remark Bourgain, Katz, and Tao [381], with subsequent refinement by Bourgain, Glibichuk
and Konyagin [380], proved the following theorem, which is called the sum-product estimate
in prime fields.
6.4.6 Theorem [381] For any ε > 0 there exists δ = δ(ε) > 0 such that, if A ⊂ Fp and $|A| < p^{1-\varepsilon}$,
then
\[ \max\{|A + A|, |AA|\} \ge |A|^{1+\delta}. \]
6.4.7 Remark The proof of Theorem 6.4.6 uses results from additive combinatorics and ideas of
Edgar and Miller [955]. The condition $|A| < p^{1-\varepsilon}$ is essential, because if the cardinality of
A is close to p, then |A + A| and |AA| are both close to |A|.
6.4.8 Theorem [1194] There is a positive constant c such that for any nonempty set A ⊂ Fp the
following bound holds:
\[ \max\{|A + A|, |AA|\} > c \min\left\{ \frac{|A|^2}{p^{1/2}},\ p^{1/2} |A|^{1/2} \right\}. \]
6.4.9 Remark Theorem 6.4.8 is meaningful for sets A of cardinality larger than $p^{1/2}$. Explicit
sum-product estimates for large subsets for the first time were given by Hart, Iosevich, and
Solymosi [1424]. Theorem 6.4.8 is due to Garaev [1194]. It follows that if $|A| > p^{2/3}$, then
\[ \max\{|A + A|, |AA|\} > c\, p^{1/2} |A|^{1/2}. \]
This bound is optimal in the sense that for any positive integer N < p there exists A ⊂ Fp
with |A| = N such that
\[ \max\{|A + A|, |AA|\} < c_1 p^{1/2} |A|^{1/2}, \]
An explicit sum-product estimate for subsets of cardinality $|A| < p^{1/2}$ was given for the
first time by Garaev [1192] in the form
This estimate subsequently was improved by several authors; the most recent one is due to
Rudnev [2500]: if $|A| < p^{1/2}$, then $\max\{|A + A|, |AA|\} > |A|^{12/11+o(1)}$.
6.4.12 Theorem [1195] Let $A, B \subset \mathbb{F}_p^*$, $L = \min\{|B|, p|A|^{-1}\}$. Then
\[ |A - A|^2 \cdot \frac{|A|^2 |B|^2}{E_\times(A, B)} \gg |A|^3 L^{1/9} (\log L)^{-1}. \]
6.4.13 Remark Theorem 6.4.12 is an explicit version of Bourgain’s sum product estimate for
subsets of incomparable sizes. The presence of the multiplicative energy is important in
applications to multilinear exponential sum estimates. The proof can be found in [1195]
and is based on ideas from [371, 1192, 1193].
6.4.14 Theorem [371] Let $A, B \subset \mathbb{F}_p^*$. Then
\[ |8AB - 8AB| \ge \frac{1}{2} \min\{|A||B|, p - 1\}. \]
6.4.15 Remark Here $8AB - 8AB = \{\sum_{i=1}^{8} a_i b_i - \sum_{i=9}^{16} a_i b_i : a_i \in A, b_i \in B\}$. From the result of
Glibichuk and Konyagin [1284] it was known that
\[ |3AA - 3AA| \ge \frac{1}{2} \min\{|A|^2, p - 1\}. \]
6.4.16 Remark Theorem 6.4.14 is proved by Bourgain [371] with subsequent application to mul-
tilinear exponential sum estimates with nearly optimal entropy conditions.
6.4.17 Theorem [379] Let A, B ⊂ Fq such that |A| > 1 and B is not contained in any proper
subfield of Fq . Then
\[ \max\{|A + AB|, |A - AB|\} \ge \frac{1}{2} |A|^{6/7} \min\{|A||B|, q\}^{1/7}. \]
6.4.18 Remark Theorem 6.4.17 is proved by Bourgain and Glibichuk [379] with subsequent appli-
cations to exponential sum estimates over small subgroups of $\mathbb{F}_q^*$. Because of the presence of
subfields an additional condition on subsets is needed. An explicit sum-product estimate in
Fq for the first time was obtained by Katz and Shen [1696] in the following form: suppose
that A is a subset of Fq such that for any $A' \subset A$ with $|A'| \ge |A|^{18/19}$ and for any G ⊂ Fq
6.4.3 Applications
6.4.23 Remark The sum-product estimate and its variants have found many spectacular applica-
tions in various areas of mathematics. Using sum-product estimates, Bourgain, Glibichuk,
and Konyagin [380] obtained a new estimate of multilinear rational trigonometric sums,
which has important applications to classical Gauss trigonometric sums.
6.4.24 Theorem [380] For any ε > 0 there exist δ = δ(ε) > 0 and a positive integer k = k(ε) such
that if X ⊂ Fp with $|X| > p^{\varepsilon}$, then
\[ \max_{(a,p)=1} \left| \sum_{x_1 \in X} \cdots \sum_{x_k \in X} e_p(a x_1 \cdots x_k) \right| < |X|^k p^{-\delta}. \]
6.4.25 Corollary [380] For any ε > 0 there exists δ = δ(ε) > 0 such that if H is a subgroup of the
multiplicative group $\mathbb{F}_p^*$ with $|H| > p^{\varepsilon}$, then
\[ \max_{(a,p)=1} \left| \sum_{h \in H} e_p(ah) \right| < |H| p^{-\delta}. \]
6.4.26 Theorem [371] Let 0 < δ < 1/4 and r ≥ 2 be an integer. There is a $\delta' > (\delta/r)^{Cr}$ such that
if p is a sufficiently large prime and $X_1, X_2, \ldots, X_r \subset \mathbb{F}_p$ satisfy $|X_i| \ge p^{\delta}$ for all 1 ≤ i ≤ r
and if
\[ \prod_{i=1}^{r} |X_i| > p^{1+\delta}, \]
6.4.27 Remark Theorem 6.4.26 is due to Bourgain [371]. It gives a nontrivial estimate for multi-
linear exponential sums under nearly optimal conditions on the sizes of the sets Xi .
6.4.28 Theorem [1195] Let 3 ≤ r ≤ 1.44 log log p be an integer, ε be a fixed positive constant. Let
$X_1, X_2, \ldots, X_r$ be subsets of $\mathbb{F}_p^*$ such that
\[ |X_1| \cdot |X_2| \cdot \left(|X_3| \cdots |X_r|\right)^{1/81} > p^{1+\varepsilon}. \]
Then
\[ \left| \sum_{x_1 \in X_1} \cdots \sum_{x_r \in X_r} e_p(x_1 \cdots x_r) \right| < |X_1| \cdots |X_r|\, p^{-0.45\varepsilon/2^r} \]
Then, as p → ∞, we have
\[ \max_{(a,p)=1} \left| \sum_{x \in H} e_p(ax) \right| = o(|H|). \]
6.4.31 Corollary [1195] Let g be a generator of $\mathbb{F}_p^*$ and let $N > e^{57 \log p/\log\log p}$. Then, as p → ∞,
we have
\[ \max_{(a,p)=1} \left| \sum_{x \le N} e_p(a g^x) \right| = o(N). \]
6.4.32 Theorem [1195] Let $X, Y, Z \subset \mathbb{F}_p^*$ be such that |X||Y | > δ p for some constant δ > 0. Then,
for any ε > 0 one has the bound
\[ \left| \sum_{x \in X} \sum_{y \in Y} \sum_{z \in Z} e_p(xyz) \right| \ll |X| \cdot |Y| \cdot |Z|^{539/540+\varepsilon}, \]
6.4.34 Remark Theorem 6.4.33 is due to Bourgain [368]. Previously known results only applied for
large values of t, t1 , t2 . The exponential sum that appears in Theorem 6.4.33 for the first time
was investigated by Canetti, Friedlander, and Shparlinski [488] motivated by cryptographic
applications. Subsequently Banks, Conflitti, Friedlander, and Shparlinski [196] found ap-
plications to estimate exponential sums with Mersenne numbers. For further details, see
Bibak [268], Chang [587] and the references therein.
6.4.35 Theorem [369] Given a positive integer r and ε > 0, there exists δ = δ(r, ε) > 0 satisfying
the following property: if
\[ f(x) = \sum_{i=1}^{r} a_i x^{k_i} \in \mathbb{Z}[x], \quad (a_i, p) = 1, \]
6.4.36 Remark Theorem 6.4.35 is due to Bourgain; its proof is based on sum-product estimates for
subsets of Fp × Fp . It gives a nontrivial bound for the exponential sums under essentially optimal
conditions on the exponents $k_i$. The exponential sum that appears in Theorem 6.4.33
was estimated by Mordell [2143] in 1932, see Cochrane, Coffelt, and Pinner [654] for more
details.
6.4.37 Definition Denote by $\mathrm{Tr}(x) = x + x^p + \cdots + x^{p^{n-1}}$ the trace of x ∈ Fq . Let $\psi(x) =
e_p(\mathrm{Tr}(ax))$, $a \in \mathbb{F}_q^*$, be a nontrivial additive character of Fq .
6.4.38 Theorem [373] Let 0 < δ, $\delta_2$ < 1 and r ≥ 2 be an integer. Let $A_1, \ldots, A_r \subset \mathbb{F}_q$ satisfy
Then
\[ \left| \sum_{x_1 \in A_1} \cdots \sum_{x_r \in A_r} \psi(x_1 \cdots x_r) \right| < |A_1| \cdots |A_r|\, q^{-\delta'}, \]
where we may take $\delta' = C^{-r/\delta_2} (\delta/r)^{Cr}$ for some positive constant C.
6.4.39 Theorem [379] Let 0 < η ≤ 1 be fixed and H be a multiplicative subgroup of $\mathbb{F}_q^*$ with
\[ |H| \ge q^{\max(135/\eta,\, 180000)/\log_2 \log_2 q}. \]
\[ |H \cap G| \le |H|^{1-\eta}. \]
6.4.40 Remark Using the sum-product estimate in Fp , Bourgain, Katz, and Tao [381] obtained
the following Fp -analogue of the Szemerédi-Trotter theorem.
6.4.41 Theorem [381] Let P (correspondingly L) be a set of points (correspondingly a set of lines)
in the plane Fp × Fp . Assume that for some α < 2 we have $\max\{|P|, |L|\} \le M < p^{\alpha}$. Then
for some constant γ = γ(α) > 0 one has
\[ W(P, L) := \#\{((x, y), \ell) \in P \times L : (x, y) \in \ell\} \ll M^{3/2-\gamma}. \]
6.4.42 Remark An explicit bound for W (P, L) has been obtained by Helfgott and Rudnev [1465].
When the cardinalities of P and L are large, a sharp bound for W (P, L) has been obtained
by Vinh [2876]; see also Cilleruelo [643].
6.4.43 Remark The following two results are due to Bourgain [370]. They solve one of the questions
of Wigderson on expander maps in two variables. Their proofs are based on the Fp -analogue
of the Szemerédi-Trotter theorem.
6.4.44 Theorem [370] Let A, B be subsets of Fp with $|A| = |B| = N < p^{\alpha}$, where α < 1 is a
positive constant. Then for some constant β = β(α) > 0 one has
6.4.45 Theorem [370] Let A, B be subsets of Fp with $|A| < p^{1/2}$, $|B| < p^{1/2}$. Then
\[ \max_{(a,p)=1} \left| \sum_{x \in A} \sum_{y \in B} e_p(a(xy + x^2 y^2)) \right| < p^{1-\gamma}, \]
Thus, at the same time Bourgain constructed 2-source extractors with entropies less than
1/2, breaking the barrier 1/2. Regarding the problem of extractors and applications of
sum-product results to problems of computer science, see [200, 268, 370].
6.4.47 Definition Let G be a finite group, A ⊂ G be a set of generators of G (that is, every g ∈ G
can be expressed as a product of elements of $A \cup A^{-1}$). The Cayley graph Γ(G, A) is
the graph (V, E) with vertex set V = G and edge set E = {(ag, g) : g ∈ G, a ∈ A}. The
diameter of a graph X = (V, E) is $\max_{v_1, v_2 \in V} d(v_1, v_2)$, where $d(v_1, v_2)$ is the length of
the shortest path between $v_1$ and $v_2$ in X.
6.4.48 Theorem [1463] Let p be a prime. Let A be a set of generators of $G = \mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$. Then
the Cayley graph Γ(G, A) has diameter $O((\log p)^c)$, where c and the implied constant are
absolute.
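For very small primes the diameter in Theorem 6.4.48 can be found by breadth-first search. The sketch below assumes the two standard elementary matrices as the generating set A and treats the graph as undirected by also stepping along inverses; matrices are stored as tuples (a, b, c, d), and all names are illustrative.

```python
from collections import deque

def mat_mul(X, Y, p):
    """Product of 2x2 matrices X = (a, b, c, d) and Y = (e, f, g, h) modulo p."""
    (a, b, c, d), (e, f, g, h) = X, Y
    return ((a * e + b * g) % p, (a * f + b * h) % p,
            (c * e + d * g) % p, (c * f + d * h) % p)

def mat_inv(X, p):
    """Inverse of a determinant-1 matrix: the adjugate."""
    a, b, c, d = X
    return (d % p, (-b) % p, (-c) % p, a % p)

def cayley_distances(gens, p, source=(1, 0, 0, 1)):
    """BFS distances from `source` using left multiplication by the generators and their inverses."""
    steps = set(gens) | {mat_inv(g, p) for g in gens}
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for s in steps:
            w = mat_mul(s, v, p)
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def diameter(gens, p):
    """Cayley graphs are vertex-transitive, so the eccentricity of the identity is the diameter."""
    return max(cayley_distances(gens, p).values())

GENS = [(1, 1, 0, 1), (1, 0, 1, 1)]      # elementary matrices, which generate SL_2(Z/pZ)
for p in (2, 3, 5, 7):
    print(p, len(cayley_distances(GENS, p)), diameter(GENS, p))
```

The number of vertices reached should be $p(p^2 - 1)$, the order of $\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$; the search is only feasible for small p since the group grows like $p^3$.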
6.4.49 Remark Theorem 6.4.48 is due to Helfgott [1463]; the sum-product estimate played a crucial
role in the proof of this result. It initiated a series of very important works on the diameter
problem and expansion theory of Cayley graphs of finite groups. See also [376, 377, 1464,
1466] for further references.
6.4.50 Remark The sum product estimates have found numerous applications in many other prob-
lems. For instance, in the work of Chang [583] the sum-product estimate and its versions
have found a number of applications to multiplicative character sum estimates. Cochrane
and Pinner [656] applied the sum-product estimates to finite field versions of the Waring
problem. Ostafe and Shparlinski [2333] applied the result of Glibichuk and Rudnev [1282]
to a version of the Waring problem with Dickson’s function.
References Cited: [53, 193, 194, 196, 200, 268, 368, 370, 371, 373, 376, 377, 378, 379, 380,
381, 456, 488, 587, 643, 654, 656, 955, 1192, 1193, 1194, 1195, 1282, 1284, 1347, 1424, 1425,
1463, 1464, 1465, 1466, 1696, 1697, 2143, 2333, 2500, 2651, 2738, 2781, 2876]
7
Equations over finite fields
7.1 General forms
Affine hypersurfaces • Projective hypersurfaces •
Toric hypersurfaces • Artin-Schreier hypersurfaces •
7.1.1 Remark There are lots of results on equations over finite fields. In this section, we give a
collection of sample results and examples focusing on hypersurfaces. Additional results can
be found in [1708, 2548, 2902] and in the references of this section.
7.1.2 Definition For a polynomial $f(x_1, \ldots, x_n) \in \mathbb{F}_q[x_1, \ldots, x_n]$, let $A_f$ denote the affine hyper-
surface in the affine n-space $\mathbb{A}^n$ defined by the equation f = 0. Then, $A_f(\mathbb{F}_q)$ denotes
the set of Fq -rational points $(x_1, \ldots, x_n) \in \mathbb{F}_q^n$ such that
f (x1 , . . . , xn ) = 0.
The notation #Af (Fq ) denotes the cardinality of the set Af (Fq ), namely, the number
of Fq -rational points on the affine hypersurface Af .
where y1 , . . . , ym are variables. Thus, for simplicity and without too much loss of generality,
we shall focus on the hypersurface case.
7.1.4 Theorem [2548] For a non-zero polynomial f of (total) degree d in Fq [x1 , . . . , xn ], we have
1. $0 \le \#A_f(\mathbb{F}_q) \le d q^{n-1}$.
2. If $\#A_f(\mathbb{F}_q) \ge 1$, then $\#A_f(\mathbb{F}_q) \ge q^{n-d}$.
3. If f is homogeneous of degree d, then $q^{n-d} \le \#A_f(\mathbb{F}_q) \le d(q^{n-1} - 1) + 1$.
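The three bounds can be checked on small examples by enumerating all points. The sketch below assumes q = p is prime and represents a polynomial simply as a Python callable; the sample polynomials are arbitrary illustrative choices.

```python
from itertools import product

def count_affine_points(f, n, p):
    """Number of x in F_p^n with f(x) = 0 in F_p; f is any callable taking n integers."""
    return sum(1 for x in product(range(p), repeat=n) if f(*x) % p == 0)

def circle(x, y):
    return x * x + y * y - 1          # degree 2, not homogeneous

def form(x, y):
    return x * x + y * y              # homogeneous of degree 2

d, n = 2, 2
for p in (5, 7):
    n_circle = count_affine_points(circle, n, p)
    n_form = count_affine_points(form, n, p)
    print(p, n_circle, d * p ** (n - 1),            # item 1: upper bound d*q^(n-1)
          n_form, d * (p ** (n - 1) - 1) + 1)       # item 3: upper bound for forms
```

Over $\mathbb{F}_5$, where −1 is a square, the form $x^2 + y^2$ splits into two lines through the origin and attains the upper bound of item 3; over $\mathbb{F}_7$ it has only the trivial zero and attains the lower bound $q^{n-d} = 1$.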
7.1.5 Definition Let $\Omega_{d,n}(q)$ denote the Fq -vector space of polynomials in Fq [x1 , ..., xn ] with
degree at most d.
\[ \frac{1}{|\Omega_{d,n}(q)|} \sum_{f \in \Omega_{d,n}(q)} \left(\#A_f(\mathbb{F}_q) - q^{n-1}\right)^2 = q^{n-1} - q^{n-2}. \]
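The averaging identity displayed above can be confirmed exactly for tiny parameters by enumerating every polynomial in $\Omega_{d,n}(q)$. The sketch below assumes q = p is prime and d ≥ 1, uses exact rational arithmetic, and is feasible only when the number of coefficients is very small; the function names are illustrative.

```python
from fractions import Fraction
from itertools import product

def monomials(n, d):
    """Exponent vectors of all monomials in n variables of total degree at most d."""
    return [e for e in product(range(d + 1), repeat=n) if sum(e) <= d]

def monomial_value(x, e, p):
    """Value of the monomial with exponent vector e at the point x, modulo p."""
    v = 1
    for xi, ei in zip(x, e):
        v = v * pow(xi, ei, p) % p        # pow(0, 0, p) == 1, as required for x^0
    return v

def mean_square_deviation(n, d, p):
    """Average of (#A_f(F_p) - p^(n-1))^2 over all f of degree at most d in n variables."""
    mons = monomials(n, d)
    points = list(product(range(p), repeat=n))
    table = [[monomial_value(x, e, p) for e in mons] for x in points]
    total = 0
    for coeffs in product(range(p), repeat=len(mons)):
        zeros = sum(1 for row in table
                    if sum(c * v for c, v in zip(coeffs, row)) % p == 0)
        total += (zeros - p ** (n - 1)) ** 2
    return Fraction(total, p ** len(mons))

print(mean_square_deviation(2, 1, 3), mean_square_deviation(2, 2, 3))   # q^(n-1) - q^(n-2) = 2
```

For n = 2, d = 1, p = 3 the count is easy to follow by hand: the zero polynomial contributes $(9 - 3)^2$, the two nonzero constants contribute $(0 - 3)^2$ each, and the 24 lines contribute nothing, giving 54/27 = 2.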
7.1.7 Remark The above average or probabilistic result suggests that for most polynomials
$f \in \Omega_{d,n}(q)$, one should expect that $\#A_f(\mathbb{F}_q)$ is approximately $q^{n-1}$ for large q. This is
indeed the case, as the next few theorems show. The first one is an effective version of the
Lang-Weil theorem.
7.1.10 Definition A polynomial f ∈ Fq [x1 , . . . , xn ] (or the affine hypersurface it defines) is smooth
if the following system of equations
$$\frac{\partial f}{\partial x_1} = \cdots = \frac{\partial f}{\partial x_n} = f = 0$$
has no solutions in the algebraic closure of $\mathbb{F}_q$. A homogeneous polynomial $f$ (or the projective hypersurface it defines) is smooth if the above system of equations has no solutions other than possibly the trivial one $(0,\dots,0)$.
7.1.13 Definition The singular locus $\operatorname{Sing}(f)$ of a polynomial $f \in \mathbb{F}_q[x_1,\dots,x_n]$ (or the affine hypersurface it defines) is the affine algebraic set in $\mathbb{A}^n$ defined by the following system of equations
$$\frac{\partial f}{\partial x_1} = \cdots = \frac{\partial f}{\partial x_n} = f = 0.$$
Similarly, the singular locus $\operatorname{Sing}(f)$ of a homogeneous polynomial $f$ (or the projective hypersurface it defines) is the projective algebraic set in the projective space $\mathbb{P}^{n-1}$ defined by the same system of equations.
7.1.14 Theorem Let $f \in \mathbb{F}_q[x_1,\dots,x_n]$ be a polynomial of degree $d > 0$ such that the singular loci of both $f$ and its homogeneous leading form have dimension at most $s$ for some integer $s \ge -1$. Then,
$$|\#A_f(\mathbb{F}_q) - q^{n-1}| \le C_{d,n}\, q^{(n+s)/2},$$
where $C_{d,n}$ is an explicit constant depending only on $d$ and $n$.
7.1.15 Remark This result is the affine version of the Hooley-Katz theorem [1533] which unifies
the previous two theorems, at least in a qualitative way. If f is absolutely irreducible, then
we can take s ≤ n − 2 and this recovers the Lang-Weil theorem. If f and its leading form
are both smooth, then we can take s = −1 and this recovers Deligne’s theorem.
f (x0 , . . . , xn ) = 0,
where two solutions are identified if they differ by a non-zero scalar multiple.
$$\frac{q^{n+1-d} - 1}{q - 1} \le \#P_f(\mathbb{F}_q) \le d\,\frac{q^n - 1}{q - 1}.$$
7.1.18 Remark This is simply a consequence of the corresponding affine theorem in the previous
subsection.
7.1.19 Theorem [795] If f ∈ Fq [x0 , x1 , . . . , xn ] is a smooth homogeneous polynomial of degree
d > 0, then
$$\left|\#P_f(\mathbb{F}_q) - \frac{q^n - 1}{q - 1}\right| \le \frac{d-1}{d}\left((d-1)^n - (-1)^n\right) q^{(n-1)/2}.$$
7.1.20 Theorem [1270, 1533] If f ∈ Fq [x0 , x1 , . . . , xn ] is a homogeneous polynomial of degree d > 0
such that the projective singular locus Sing(f ) is of dimension at most s for some integer
s ≥ −1, then
$$\left|\#P_f(\mathbb{F}_q) - \frac{q^n - 1}{q - 1}\right| \le C_{d,n}\, q^{(n+s)/2},$$
where Cd,n is an explicit constant depending only on d and n.
7.1.21 Remark Similar results hold for more general complete intersections, see [1270].
$$f^{\delta} = x_1 \frac{\partial f^{\delta}}{\partial x_1} = \cdots = x_n \frac{\partial f^{\delta}}{\partial x_n} = 0$$
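$$\left|\#T_f(\mathbb{F}_q) - \frac{(q-1)^n - (-1)^n}{q}\right| \le (n!\operatorname{Vol}(\Delta) - 1)\, q^{(n-1)/2}.$$

$$\left|\#T_f(\mathbb{F}_q) - \frac{(q-1)^n - (-1)^n}{q}\right| \le (n - 1)\, q^{(n-1)/2}.$$

$$\left|\#T_f(\mathbb{F}_q) - \frac{(q-1)^n - (-1)^n}{q}\right| \le (n - 2)\, q^{(n-1)/2} + q^{(n-2)/2}.$$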
These results are used to obtain improved bounds for the number of elements with given
trace and norm, see [2125]. The toric hypersurface Tf in this example is a toric Calabi-Yau
hypersurface. It is the most important example in arithmetic mirror symmetry, see [2900].
7.1.27 Remark For results on more general toric complete intersections, see [27].
7.1.28 Definition Let $f \in \mathbb{F}_q[x_1,\dots,x_n]$ be a polynomial of degree $d > 0$. Let $AS_f$ denote the Artin-Schreier hypersurface in $\mathbb{A}^{n+1}$ defined by the equation
$$y^q - y = f(x_1,\dots,x_n).$$
For any positive integer $k$, let $\#AS_f(\mathbb{F}_{q^k})$ denote the number of $\mathbb{F}_{q^k}$-rational points on $AS_f$.
7.1.29 Remark We note that for k = 1, one has the relation #ASf (Fq ) = q#Af (Fq ), where Af is
the affine hypersurface in An defined by f = 0.
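The relation for $k = 1$ holds because $y^q - y = 0$ for every $y \in \mathbb{F}_q$, so each rational zero of $f$ lifts to exactly $q$ points. A minimal brute-force sketch, assuming $q = p$ is prime and with a sample polynomial chosen purely for illustration:

```python
from itertools import product


def count_affine(f, p, n):
    """#A_f(F_p): zeros of f in F_p^n."""
    return sum(1 for x in product(range(p), repeat=n) if f(*x) % p == 0)


def count_artin_schreier(f, p, n):
    """#AS_f(F_p): points (x, y) in F_p^(n+1) with y^p - y = f(x)."""
    return sum(
        1
        for x in product(range(p), repeat=n)
        for y in range(p)
        if (pow(y, p, p) - y - f(*x)) % p == 0
    )


def sample_f(x1, x2):
    # Hypothetical example polynomial, not taken from the text.
    return x1 ** 3 + x2 ** 2 + 1
```

With these helpers, `count_artin_schreier(sample_f, 5, 2)` can be compared with `5 * count_affine(sample_f, 5, 2)`.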
7.1.30 Theorem Assume that d is not divisible by p and the leading form of f is smooth. Then,
7.1.31 Remark This is simply a consequence of Deligne’s estimate [795] for exponential sums, see
Section 6.2 for more details. It can be improved and generalized in various ways as the
next few theorems show. The next theorem is a consequence of Katz’s estimate [1704] for
singular exponential sums.
7.1.32 Theorem Assume that d is not divisible by p and the singular locus of the leading form of
f has dimension at most s for some integer s ≥ −1. Then,
7.1.33 Remark In the case s = −1, that is, the leading form of f is smooth, the above result
reduces to Deligne’s estimate. In this case, the subscheme defined by the Jacobian ideal
< ∂f /∂x1 , . . . , ∂f /∂xn > has at most a finite number of points and the above Deligne
estimate can be improved further in some cases.
7.1.34 Theorem [2472] Assume that d is not divisible by p > 2 and the leading form of
f is smooth. Assume further that the Jacobian subscheme in $\mathbb{A}^n$ defined by the ideal $\langle \partial f/\partial x_1, \dots, \partial f/\partial x_n \rangle$ is a zero-dimensional smooth variety and the images of its $\overline{\mathbb{F}}_q$-points under $f$ are distinct (a Morse condition). If $nk$ is even, suppose additionally that the
hypersurface defined by
7.1.35 Corollary Assume $d$ is not divisible by $p > 2$, the leading form of $f$ is smooth, and the Jacobian subscheme in $\mathbb{A}^n$ defined by the ideal $\langle \partial f/\partial x_1, \dots, \partial f/\partial x_n \rangle$ is a zero-dimensional smooth variety and the images of its $\overline{\mathbb{F}}_q$-points under $f$ are distinct. If $n$ is odd, then
$$|\#A_f(\mathbb{F}_q) - q^{n-1}| \le C_{d,1,n}\, q^{(n-1)/2}.$$
7.1.36 Definition Let $f(x, y) \in \mathbb{F}_q[x_1,\dots,x_n, y_1,\dots,y_{n'}]$ be a polynomial with two sets of variables, where $n, n' \ge 1$. For a positive integer $k$, let $N_k(f)$ denote the number of solutions of the equation
$$x_0^p - x_0 = f(x_1,\dots,x_n, y_1,\dots,y_{n'})$$
such that $(x_0, x_1,\dots,x_n) \in \mathbb{F}_{q^k}^{n+1}$ and $(y_1,\dots,y_{n'}) \in \mathbb{F}_q^{n'}$.
7.1.37 Remark We note that the two sets of coordinates in the previous definition run over different
extension fields of Fq .
7.1.38 Definition Let f (x, y) be a polynomial as in the previous definition. For a positive integer
k, we define the k-th fibred sum of f along y to be the new polynomial
7.1.39 Theorem [1140] Let $f$ be of degree $d > 0$ as above, and let $f_d$ be the homogeneous leading form of $f$. Assume that the $k$-th fibred sum $\oplus_y^k f_d$ is smooth in $\mathbb{P}^{kn+n'-1}$ and assume that $d$ is not divisible by $p$. Then, we have the following two estimates
$$|N_k(f) - q^{kn+n'}| \le (p-1)(d-1)^{kn+n'} q^{(kn+n')/2},$$
$$|N_k(f) - q^{kn+n'}| \le c(p,n,n')\, k^{3(d+1)^n - 1} q^{(kn+n')/2},$$
where the constant $c(p,n,n')$, depending only on $p, n, n'$, is not known to be effective if $n' \ge 2$.
7.1.40 Remark In the case $k = 1$, the first estimate reduces to Deligne’s theorem as above. More generally, for a degree $d$ polynomial $f \in \mathbb{F}_q[x_1,\dots,x_n]$ and positive integers $k_1,\dots,k_n$, let $N_{k_1,\dots,k_n}(f)$ denote the number of solutions of the equation $f(x_1,\dots,x_n) = 0$ with $x_i \in \mathbb{F}_{q^{k_i}}$ for all $1 \le i \le n$. Under suitable nice conditions, one expects [2898] the estimate of the type
where k is the least common multiple of the integers ki . The above theorem is one example
of this kind. For two additional examples, see [1705, 2471].
7.1.41 Definition Let $f \in \mathbb{F}_q[x_1,\dots,x_n]$ be a polynomial of degree $d > 0$. For a positive integer $m$ not divisible by $p$, let $K_{f,m}$ denote the Kummer hypersurface in $\mathbb{A}^{n+1}$ defined by
$$y^m = f(x_1,\dots,x_n).$$
7.1.42 Theorem [1707, 2468] Assume that $f$ is smooth in $\mathbb{A}^n$ and its leading form $f_d$ is also smooth in $\mathbb{P}^{n-1}$. Then,
$$|\#K_{f,m}(\mathbb{F}_q) - q^n| \le (m-1)(d-1)^n q^{n/2}.$$
7.1.43 Remark Since $\#K_{f,m}(\mathbb{F}_q) = \#K_{f,(m,q-1)}(\mathbb{F}_q)$, we may assume that $m$ divides $q - 1$. Write $m = (q-1)/e$; then, for fixed $e$, the above bound reduces to
$$|\#K_{f,m}(\mathbb{F}_q) - q^n| \le \frac{1}{e}(d-1)^n q^{(n+2)/2}.$$
Similar to the Artin-Schreier hypersurface case, we expect that for fixed e, under suitable
hypotheses, an improved estimate of the form
7.1.44 Remark In this subsection, we review several results on p-divisibility of the solution number
for an affine algebraic set, where p is the characteristic of the finite field Fq . We begin with
the well known Chevalley-Warning theorem.
7.1.45 Theorem [2548] Let $f_1,\dots,f_m \in \mathbb{F}_q[x_1,\dots,x_n]$ be polynomials of degrees $d_1,\dots,d_m$ respectively. Assume that $n > d_1 + \cdots + d_m$. Then,
$$\mu \ge \frac{n - (d_1 + \cdots + d_m)}{\max_i d_i}.$$
Then,
$$\#A_{f_1,\dots,f_m}(\mathbb{F}_q) \equiv 0 \pmod{q^{\mu}}.$$
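The congruence modulo $q^{\mu}$ can be illustrated numerically. The sketch below assumes $q = p$ is prime and takes $\mu$ to be the least non-negative integer satisfying the displayed inequality; the polynomials are passed as ordinary Python callables, and all names are illustrative.

```python
from itertools import product


def ax_katz_exponent(n, degrees):
    """Least non-negative integer mu with mu >= (n - sum d_i) / max d_i."""
    numerator = n - sum(degrees)
    return max(0, -(-numerator // max(degrees)))  # integer ceiling


def count_common_zeros(polys, n, p):
    """Number of x in F_p^n on which every polynomial in polys vanishes."""
    return sum(
        1
        for x in product(range(p), repeat=n)
        if all(f(*x) % p == 0 for f in polys)
    )


def hyperbolic(a, b, c, d, e, f):
    # Sample quadratic form in six variables (degree 2).
    return a * b + c * d + e * f
```

For the sample form over $\mathbb{F}_2$ one has $n = 6$ and $d_1 = 2$, so `ax_katz_exponent(6, [2])` gives $\mu = 2$, and the number of zeros should be divisible by $2^2$.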
7.1.47 Remark This theorem is due to Ax [150] in the case m = 1 and to Katz [1698] for gen-
eral m. Ax’s proof is elementary and based on the Stickelberger relation for Gauss sums
(for Gauss sums, see Section 6.1). Katz’s proof uses more advanced methods from Dwork’s
p-adic theory. In [2907], it was noted that Katz’s result can be proved by Ax’s more ele-
mentary method. Later, a short elementary reduction of Katz’s theorem for general m to
Ax’s theorem for m = 1 was given in [1540]. This is consistent with Remark 7.1.3 that
a system of equations can often be reduced to the one equation case. Additional simpler
proofs to more general results are given in [2914]. If one takes into account the actual terms
(the polytope) of the polynomials f1 , . . . , fm , the Ax-Katz theorem can be generalized or
improved in certain cases, see Adolphson-Sperber [21] for p-divisibility of exponential sums.
We next describe its consequence for equations over finite fields.
7.1.48 Theorem Let $\Delta$ be an integral convex polytope in $\mathbb{R}^{m+n}$ which contains all the exponents of the monomials in the polynomial $y_1 f_1(x_1,\dots,x_n) + \cdots + y_m f_m(x_1,\dots,x_n)$ with coefficients in $\mathbb{F}_q$. Let $\omega$ be the smallest positive integer such that the dilation $\omega\Delta$ contains a lattice point in $\mathbb{Z}^{m+n}$ with all coordinates positive. Then,
7.1.49 Remark For non-prime fields (i.e., q is not a prime), the Ax-Katz theorem can be improved
in some cases by considering the p-weights of the exponents of the polynomials fi and the
Weil descent, see Moreno-Moreno [2151]. Let $q = p^r$, and let $\{\alpha_1,\dots,\alpha_r\}$ be an $\mathbb{F}_p$-basis of $\mathbb{F}_q$. Then, any element $x_i \in \mathbb{F}_q$ can be written uniquely as
$$x_i = \sum_{j=1}^{r} x_{ij}\alpha_j, \qquad x_{ij} \in \mathbb{F}_p.$$
Let $e = e_0 + e_1 p + \cdots + e_{r-1}p^{r-1}$ be the $p$-digit expansion of an integer $e \in [0, q-1]$. Then, one has the relation
$$x_i^e = \prod_{j=0}^{r-1} \left(x_{i1}\alpha_1^{p^j} + \cdots + x_{ir}\alpha_r^{p^j}\right)^{e_j}.$$
In this way, one finds a system of $mr$ polynomials $g_{ij} \in \mathbb{F}_p[x_{11},\dots,x_{mr}]$ such that
One can then apply the Ax-Katz theorem to the right side over the prime field Fp .
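The product relation used in the Weil descent can be checked directly in a small non-prime field. The sketch below assumes the model $\mathbb{F}_9 = \mathbb{F}_3[i]/(i^2 + 1)$ with basis $\alpha_1 = 1$, $\alpha_2 = i$; this choice of field, basis, and all function names are illustrative assumptions, not part of the text.

```python
from itertools import product

P = 3  # characteristic; elements of F_9 are pairs (a, b) meaning a + b*i


def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)


def mul(u, v):
    a, b = u
    c, d = v
    # (a + b i)(c + d i) with i^2 = -1
    return ((a * c - b * d) % P, (a * d + b * c) % P)


def power(u, e):
    result = (1, 0)  # u^0 = 1, also for u = 0
    for _ in range(e):
        result = mul(result, u)
    return result


def scale(c, u):
    return ((c * u[0]) % P, (c * u[1]) % P)


BASIS = [(1, 0), (0, 1)]  # alpha_1 = 1, alpha_2 = i


def descent_product(coords, e):
    """Right-hand side: product over p-digits e_j of (sum_k x_k alpha_k^(p^j))^e_j."""
    digits = [e % P, e // P]  # e = e_0 + e_1 * p with e in [0, 8]
    result = (1, 0)
    for j, e_j in enumerate(digits):
        linear = (0, 0)
        for x_k, alpha in zip(coords, BASIS):
            linear = add(linear, scale(x_k, power(alpha, P ** j)))
        result = mul(result, power(linear, e_j))
    return result


def relation_holds():
    for coords in product(range(P), repeat=2):
        x = add(scale(coords[0], BASIS[0]), scale(coords[1], BASIS[1]))
        for e in range(P * P):
            if power(x, e) != descent_product(coords, e):
                return False
    return True
```

The check works because the Frobenius map $x \mapsto x^p$ is additive and fixes the coordinates $x_{ij} \in \mathbb{F}_p$.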
7.1.50 Remark For a homogeneous polynomial f ∈ Fq [x0 , . . . , xn ] of degree d > 0, the Ax-Katz
theorem implies that the projective hypersurface Pf in Pn satisfies the congruence
where µ is the smallest positive integer greater than or equal to (n + 1 − d)/d. This gives
a non-trivial congruence only in the case n + 1 > d (Fano varieties). In the case n + 1 = d
(Calabi-Yau variety) or n + 1 < d (varieties of general type), one cannot expect such
general p-divisibility results. However, the following example suggests that one may still
expect some congruences for some pair of varieties. For λ ∈ Fq , let Xλ denote the Dwork
family of Calabi-Yau hypersurfaces in Pn defined by the equation
$$x_0^{n+1} + \cdots + x_n^{n+1} + \lambda x_0 x_1 \cdots x_n = 0.$$
Let Yλ be the projective closure in the projective toric variety P∆ of the toric affine hyper-
surface defined by
$$x_1 + \cdots + x_n + \frac{1}{x_1 \cdots x_n} + \lambda = 0,$$
where ∆ is the simplex in Rn with vertices
In this example, the mirror variety Yλ is a quotient of Xλ by a finite group G, see [254, 1141]
for extensions of such congruence to a pair of more general quotient varieties.
7.1.51 Remark From a zeta function point of view (see Section 12.7 for more on zeta functions), the p-adic estimate in this subsection corresponds to an estimate for the first non-trivial slope. The study of all slopes for the zeta function is significantly deeper, and the reader is referred to Section 12.8.
See Also
References Cited: [21, 23, 27, 150, 254, 473, 795, 812, 1140, 1141, 1270, 1533, 1540, 1698,
1704, 1705, 1707, 1708, 1849, 2125, 2151, 2468, 2470, 2471, 2472, 2548, 2898, 2900, 2902,
2907, 2914]
7.2.1 Definition A quadratic form $f$ over a field $F$ is a homogeneous polynomial over $F$ of degree two:
$$f(X) = \sum_{i \le j} a_{ij} x_i x_j, \qquad a_{ij} \in F,$$
7.2.2 Remark The theory is different in the characteristic 2 and odd characteristic cases.
7.2.3 Definition A quadratic space is a pair $(Q, V)$ where $V$ is a finite dimensional vector space over $F$ and $Q : V \to F$ satisfies
1. $Q(\lambda v) = \lambda^2 Q(v)$, for all $\lambda \in F$ and $v \in V$, and
2. for $\operatorname{char}(F) = 2$, we require $B_Q(v, w) = Q(v + w) + Q(v) + Q(w)$ to be a symmetric bilinear form; for $\operatorname{char}(F) \ne 2$, we require $B_Q(v, w) = \frac{1}{2}(Q(v + w) - Q(v) - Q(w))$ to be a symmetric bilinear form.
7.2.4 Remark Each choice of a basis $\{e_1,\dots,e_n\}$ of $V$ yields a quadratic form $f(X) = Q\left(\sum_i x_i e_i\right)$; a different choice of basis yields an equivalent form.
7.2.5 Remark Suppose $\operatorname{char}(F) = 2$. The associated matrix of $f$ is $CM_f$. Then $f(X) = X^T CM_f X$. There are many matrices $M$ with $f(X) = X^T M X$ but $CM_f$ is the only upper triangular one. The matrix for the associated symmetric bilinear form $b_f(X, Y) =$
If rad(Q) = 0, Q is non-degenerate.
Let $\mathcal{P}(F)$ be the additive subgroup $\{a^2 + a : a \in F\}$. The Arf invariant $\Delta(Q)$ is $\sum \alpha_i \beta_i \in F/\mathcal{P}(F)$. $\Delta(Q)$ does not depend on the choice of the symplectic basis. If $Q$ is degenerate then write $V = \operatorname{rad}(Q) \oplus W$. Then $Q|_W$ is non-degenerate and we define $\Delta(Q) = \Delta(Q|_W)$.
Now suppose $\operatorname{char}(F) \ne 2$. The associated matrix of $f$ is $M_f = \frac{1}{2}(CM_f + CM_f^T)$. Then $f(X) = X^T M_f X$. There are many matrices $M$ with $f(X) = X^T M X$ but $M_f$ is the only symmetric one. The matrix for the associated symmetric bilinear form $b_f$ is also $M_f$. The function $B_Q$ determines $Q$ (and $b_f$ determines $f$) via $Q(v) = B_Q(v, v)$. Suppose $Q$ is non-degenerate. Then there is an orthogonal basis with the associated matrix diagonal. Let $F^{*2}$ be the multiplicative subgroup $\{a^2 : a \in F^*\}$. The determinant of $f$ is $\det M_f \in F^*/F^{*2}$ (equivalently, the product of the entries in the diagonalization) and it does not depend on the choice of basis. Again, if $Q$ is degenerate then write $V = \operatorname{rad}(Q) \oplus W$ and set $\det(Q) = \det(Q|_W)$.
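As a small illustration of the odd characteristic case, the symmetric matrix $M_f$ can be computed from an upper triangular coefficient matrix and compared with it as a quadratic form. This is a sketch under the assumption that $F = \mathbb{F}_p$ with $p$ an odd prime and that $CM_f$ is the upper triangular matrix of coefficients $a_{ij}$; the helper names are illustrative.

```python
from itertools import product


def symmetric_matrix(cm, p):
    """M_f = (1/2)(CM_f + CM_f^T) over F_p, p odd (needs Python 3.8+ for pow)."""
    n = len(cm)
    half = pow(2, -1, p)  # inverse of 2 modulo p
    return [[(cm[i][j] + cm[j][i]) * half % p for j in range(n)] for i in range(n)]


def evaluate(matrix, x, p):
    """X^T M X modulo p."""
    n = len(x)
    return sum(x[i] * matrix[i][j] * x[j] for i in range(n) for j in range(n)) % p


def same_form(cm, p):
    """True if CM_f and M_f define the same function on F_p^n."""
    m = symmetric_matrix(cm, p)
    n = len(cm)
    return all(evaluate(cm, x, p) == evaluate(m, x, p) for x in product(range(p), repeat=n))
```

For example, the upper triangular matrix `[[1, 3, 0], [0, 2, 5], [0, 0, 4]]` over $\mathbb{F}_7$ and its symmetrization take the same value at every vector.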
7.2.8 Remark Details on the results of this section can be found in [1939, Section 6.2].
7.2.9 Theorem Let $q$ be even and let $(Q, V)$ be a quadratic space over $\mathbb{F}_q$. Let $n = \dim V$ and $r = \dim \operatorname{rad}(Q)$. Then:
1. $n - r = 2s$ is even.
2. $|\mathbb{F}_q/\mathcal{P}(\mathbb{F}_q)| = 2$.
3. Fix $0 \ne d \in \mathbb{F}_q/\mathcal{P}(\mathbb{F}_q)$. Then $V$ has a symplectic basis such that the resulting quadratic form is one of the following:
(a) E1: $x_1 x_2 + x_3 x_4 + \cdots + x_{2s-1} x_{2s}$,
(b) E2: $x_1^2 + x_1 x_2 + d x_2^2 + x_3 x_4 + \cdots + x_{2s-1} x_{2s}$,
(c) E3: $x_0^2 + x_1 x_2 + x_3 x_4 + \cdots + x_{2s-1} x_{2s}$.
4. E1 occurs if and only if $Q(\operatorname{rad}(Q)) = 0$ and $\Delta(Q) = 0$; E2 occurs if and only if $Q(\operatorname{rad}(Q)) = 0$ and $\Delta(Q) = d \ne 0$; E3 occurs if and only if $Q(\operatorname{rad}(Q)) \ne 0$.
7.2.10 Definition The rank of Q is the minimal number of variables in a quadratic form induced
from Q.
7.2.11 Remark The rank of Q is dim V − dim rad(Q) except for the case E3 when it is dim V −
dim rad(Q) + 1.
7.2.12 Theorem Let $q$ be odd and let $(Q, V)$ be a quadratic space over $\mathbb{F}_q$. Let $n = \dim V$ and $r = \dim \operatorname{rad}(Q)$. Write $n - r = 2s$ or $2s + 1$. Then
1. $|\mathbb{F}_q^*/\mathbb{F}_q^{*2}| = 2$.
2. Fix $1 \ne d \in \mathbb{F}_q^*/\mathbb{F}_q^{*2}$. Then $V$ has a basis such that the resulting quadratic form is one of the following:
(a) O1: $x_1 x_2 + x_3 x_4 + \cdots + x_{2s-1} x_{2s}$,
(b) O2: $x_1^2 - d x_2^2 + x_3 x_4 + \cdots + x_{2s-1} x_{2s}$,
(c) O3: $x_0^2 + x_1 x_2 + x_3 x_4 + \cdots + x_{2s-1} x_{2s}$,
(d) O4: $d x_0^2 + x_1 x_2 + x_3 x_4 + \cdots + x_{2s-1} x_{2s}$.
3. In cases O1, O3, $\det Q = (-1)^s$, and in cases O2, O4, $\det Q = (-1)^s d$.
7.2.13 Remark When $\operatorname{char}(F) \ne 2$:
1. we always have $Q(\operatorname{rad}(Q)) = 0$;
2. the rank of $Q$ is always $\dim V - \dim \operatorname{rad}(Q)$; and
3. $xy \simeq x^2 - y^2$ so that each form in Theorem 7.2.12 can be written as a diagonal form.
7.2.14 Definition For a quadratic space (Q, V ) over a finite field, let N (Q = 0) denote the number
of v ∈ V such that Q(v) = 0.
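7.2.15 Theorem Let $(Q, V)$ be a quadratic space over $\mathbb{F}_q$, where $q$ is even. Let $n = \dim V$ and $r = \dim \operatorname{rad}(Q)$. Then
$$N(Q = 0) = \frac{1}{q}\left[q^n + (q-1)\Lambda(Q)\sqrt{q^{n+r}}\right],$$
where
$$\Lambda(Q) = \begin{cases} +1, & \text{if } Q \text{ is type E1},\\ -1, & \text{if } Q \text{ is type E2},\\ 0, & \text{if } Q \text{ is type E3}. \end{cases}$$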
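7.2.16 Theorem Let $(Q, V)$ be a quadratic space over $\mathbb{F}_q$, where $q$ is odd. Let $n = \dim V$ and $r = \dim \operatorname{rad}(Q)$. Then
$$N(Q = 0) = \frac{1}{q}\left[q^n + (q-1)\Lambda(Q)\sqrt{q^{n+r}}\right],$$
where
$$\Lambda(Q) = \begin{cases} +1, & \text{if } Q \text{ is type O1},\\ -1, & \text{if } Q \text{ is type O2},\\ 0, & \text{if } Q \text{ is type O3 or O4}. \end{cases}$$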
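The counting formula can be compared with a brute-force count for small odd primes. The following is a minimal sketch, assuming $q = p$ is prime; the type of each sample form (and hence $n$, $r$, $\Lambda$) is supplied by hand rather than computed, and the function names are illustrative.

```python
from itertools import product
from math import isqrt


def count_zeros(form, p, n):
    """N(Q = 0): number of v in F_p^n with Q(v) = 0."""
    return sum(1 for v in product(range(p), repeat=n) if form(*v) % p == 0)


def predicted_zeros(q, n, r, lam):
    """(1/q) [q^n + (q - 1) * lam * sqrt(q^(n + r))]."""
    if lam == 0:
        return q ** (n - 1)
    root = isqrt(q ** (n + r))
    assert root * root == q ** (n + r)  # n + r is even for types O1, O2
    return (q ** n + (q - 1) * lam * root) // q


# Sample forms over F_3 (2 is a non-square modulo 3).
def form_o1(x1, x2):
    return x1 * x2


def form_o2(x1, x2):
    return x1 * x1 - 2 * x2 * x2


def form_o3(x0, x1, x2):
    return x0 * x0 + x1 * x2


def form_o1_degenerate(x1, x2, x3):
    return x1 * x2  # x3 spans the radical, so r = 1
```

For instance, the type O1 form $x_1 x_2$ over $\mathbb{F}_3$ has $n = 2$, $r = 0$, $\Lambda = +1$, and both the formula and the direct count give 5 zeros.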
7.2.17 Remark With a little more work [1745, 1746], one can give the number of solutions to
Q(x) = c, for any scalar c, and even to Q(x) + F (x) = c, where F (x) is an arbitrary linear
function.
7.2.18 Remark In almost all applications the quadratic spaces arise as follows: Let $F = \mathbb{F}_q$, $K = \mathbb{F}_{q^n}$ and let $L(x) = \sum_{i=0}^{m} \alpha_i x^{q^i}$ be a linearized polynomial over $K$, see Definition 2.1.103.
7.2.22 Remark The invariants $\dim \operatorname{rad}(Q) = d$ and $\Lambda(Q)$ are completely determined only in the case of $q$ even and $L$ having one term. Let $v_2(k)$ denote the highest power of 2 dividing $k$.
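7.2.23 Theorem Let $F = \mathbb{F}_q$ with $q$ even. Let $Q(x) = \operatorname{Tr}_{K/F}(\gamma x \cdot x^{q^a})$, where $K = \mathbb{F}_{q^n}$ and $\gamma \in K$. Set $d = (n, a)$.
1. [1747] If $v_2(n) \le v_2(a)$ then $\dim \operatorname{rad}(Q) = d$ and $\Lambda(Q) = 0$.
2. [1745] If $v_2(n) = v_2(a) + 1$ then
$$(\dim \operatorname{rad}(Q), \Lambda(Q)) = \begin{cases} (2d, +1) & \text{if } \gamma \text{ is a } (q^a + 1)\text{-th power in } K,\\ (0, -1) & \text{if } \gamma \text{ is not a } (q^a + 1)\text{-th power in } K. \end{cases}$$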
odd prime divisors $p$ of $n$ with $\min\{v_p(n), v_p(b - a)\} + \min\{v_p(n), v_p(b + a)\}$ odd.
3. If $n$ is even and $b \pm a$ is even then $\Lambda(n) = \Lambda(2^k)$.
4. For $n = 2^k$:
(a) If $k \le M$ then $\Lambda(n) = +1$.
(b) If $k = 1 + M$ and $v_2(b - a) \ne v_2(b + a)$ then $\Lambda(n) = -1$.
(c) If $k = 1 + M$ and $v_2(b - a) = v_2(b + a)$ then $\Lambda(n) = 0$.
(d) If $k \ge 2 + M$ and $v_2(b - a) \ne v_2(b + a)$ then $\Lambda(n) = -1$.
(e) If $k \ge 2 + M$ and $v_2(b - a) = v_2(b + a)$ then
i. If $k = 2$ and one of $a, b$ is odd and the other is congruent to 2 (mod 4) then $\Lambda(n) = +1$.
ii. If $k \ge 3$ and one of $a, b$ is odd and the other is divisible by 4 then $\Lambda(n) = +1$.
iii. Otherwise, $\Lambda(n) = -1$.
7.2.4 Applications
7.2.28 Remark The first appearance of trace forms seems to be Welch’s Theorem [231] which is
Theorem 7.2.23 for γ = 1 and 2a dividing n. It was used to compute weight enumerators of
double-error correcting BCH codes. Other particular trace forms have been used to compute
weight enumerators of second order Reed-Muller codes [1991] and minimal codes [3000].
Weights of irreducible codes have been found via counting the number of polynomials L
with fixed invariants by Feng and Luo [1056, 1981].
7.2.29 Remark Quadratic forms have been used to construct Artin-Schreier curves $y^q + y = xL(x)$
with many rational points [1072, 1075, 2844]. Quadratic forms over F2 , partitioned by
their bilinear forms, were used to construct systems of linked symmetric designs in [483].
Maximal rank quadratic forms give rise to the first examples of bent functions which have
cryptographic importance, see Section 9.3. Piecewise functions, with each piece a trace form,
yield other bent functions and a presentation of the Kerdock code in [536]. Families of trace
forms have been used to construct Gold-like sequences [1077, 1731]. Correlations of maximal
and other sequences have been computed with trace forms [1475, 1736, 1745, 1746].
See Also
§9.3, §10.3, §12.6, §15.2 For more information on applications of quadratic forms.
References Cited: [231, 483, 536, 1056, 1072, 1073, 1074, 1075, 1077, 1475, 1731, 1736,
1745, 1746, 1747, 1939, 1981, 1991, 2844, 3000]
7.3.1 Preliminaries
We present a summary of results on diagonal equations. The selection gives an overview of
the area as well as some of its recent developments. General references for diagonal equations
are: Chapter 10 in [240], Chapter 8 in [1575], Chapters 3-6 in [1617], Chapter 6, Section 3
in [1939], and Chapter 6 in [2681].
7.3.2 Remark The number of solutions Nb can be expressed in terms of Jacobi and Gauss sums.
The relation between Jacobi and Gauss sums and other results on these sums are included
in Section 6.3.
7.3.3 Theorem [1939, Theorem 6.33] The number $N_0$ of solutions of (7.3.1) for $b = 0$ is
$$N_0 = q^{s-1} + \sum_{(j_1,\dots,j_s) \in T} \chi_1^{j_1}(c_1) \cdots \chi_s^{j_s}(c_s)\, J_0\left(\chi_1^{j_1},\dots,\chi_s^{j_s}\right), \qquad (7.3.2)$$
where $T$ is the set of all $(j_1,\dots,j_s) \in \mathbb{Z}^s$ such that $1 \le j_i \le d_i - 1$ for $1 \le i \le s$, $\chi_1^{j_1} \cdots \chi_s^{j_s}$ is trivial, $\chi_i$ is a multiplicative character of order $d_i = \gcd(k_i, q - 1)$, and $J_0$ is a Jacobi sum.
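7.3.4 Theorem [1939, Theorem 6.34] The number $N_b$ of solutions of (7.3.1) for $b \in \mathbb{F}_q^*$ is
$$N_b = q^{s-1} + \sum_{j_1=1}^{d_1-1} \cdots \sum_{j_s=1}^{d_s-1} \chi_1^{j_1}\left(b c_1^{-1}\right) \cdots \chi_s^{j_s}\left(b c_s^{-1}\right) J\left(\chi_1^{j_1},\dots,\chi_s^{j_s}\right), \qquad (7.3.3)$$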
7.3.6 Definition Let I(k1 , . . . , ks ) be the number of s-tuples (j1 , . . . , js ) ∈ Zs such that 1 ≤ ji ≤
ki − 1 and expression (7.3.4) is an integer.
7.3.7 Remark Note that $I(k_1,\dots,k_s) \le (k_1 - 1) \cdots (k_s - 1)$. The number $I(k_1,\dots,k_s)$ can be interpreted as the degree of the numerator of the zeta-function of $\sum_{i=1}^{s} c_i x_i^{k_i}$; see [2681]. This number appears in the estimates (7.3.5) and (7.3.6) for the number of solutions $N_0$ and $N_b$, $b \ne 0$, of Equation (7.3.1).
7.3.8 Theorem [1939, Theorems 6.36, 6.37] Let $d_i = \gcd(k_i, q - 1)$ for $i = 1,\dots,s$. Then,
1. For $b = 0$,
$$|N_0 - q^{s-1}| \le I(d_1,\dots,d_s)(q - 1)\, q^{\frac{s-2}{2}}. \qquad (7.3.5)$$
2. For $b \ne 0$,
$$|N_b - q^{s-1}| \le \left[(d_1 - 1) \cdots (d_s - 1) - \left(1 - q^{-1/2}\right) I(d_1,\dots,d_s)\right] q^{\frac{s-1}{2}}. \qquad (7.3.6)$$
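The two estimates can be tested against exhaustive counts for a small diagonal equation. The sketch below assumes $q = p$ is prime, and it assumes that expression (7.3.4), which is not reproduced above, is the sum $j_1/k_1 + \cdots + j_s/k_s$; all function names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import gcd, prod, sqrt


def count_solutions(cs, ks, b, p):
    """Brute-force count of c_1 x_1^k_1 + ... + c_s x_s^k_s = b over F_p."""
    return sum(
        1
        for xs in product(range(p), repeat=len(ks))
        if (sum(c * pow(x, k, p) for c, x, k in zip(cs, xs, ks)) - b) % p == 0
    )


def count_integral_tuples(ds):
    """I(d_1, ..., d_s), assuming (7.3.4) is j_1/d_1 + ... + j_s/d_s."""
    return sum(
        1
        for js in product(*(range(1, d) for d in ds))
        if sum(Fraction(j, d) for j, d in zip(js, ds)).denominator == 1
    )


def estimate_bounds(ks, p):
    """Right-hand sides of (7.3.5) and (7.3.6) for q = p."""
    s = len(ks)
    ds = [gcd(k, p - 1) for k in ks]
    i_val = count_integral_tuples(ds)
    bound_zero = i_val * (p - 1) * p ** ((s - 2) / 2)
    bound_nonzero = (prod(d - 1 for d in ds) - (1 - 1 / sqrt(p)) * i_val) * p ** ((s - 1) / 2)
    return bound_zero, bound_nonzero
```

For $x_1^3 + x_2^3 + x_3^3 = b$ over $\mathbb{F}_7$, a hand count gives $N_0 = 55$ and $N_1 = 90$, both within the corresponding bounds around $q^{s-1} = 49$.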
7.3.9 Remark Note that if q is sufficiently large with respect to d1 , . . . , ds then (7.3.5) and (7.3.6)
imply that Equation (7.3.1) is solvable. One can obtain an improvement of (7.3.5) and
(7.3.6) if the p-weights of all the di ’s are small and the solutions are in Fq2 [1274].
7.3.10 Remark When dealing with any type of equation, the first question that one might ask is
whether or not the equation has solutions over a given field. A classical result is Chevalley’s
theorem [615] that guarantees a non-trivial solution whenever the number of variables is
larger than the degree of the polynomial and there is no constant term. In [2151] the authors
improve Chevalley’s result by considering the p-weight degree of the polynomial instead of
its degree. The following results determine solvability of some families of diagonal equations.
7.3.11 Theorem [2809] Let $q$ be any prime power and $k$ a positive integer such that $k \ne p - 1$ in the case $q = p$. The equation $\sum_{i=1}^{s} c_i x_i^k = 0$ has a nontrivial solution for $s \ge \frac{k+3}{2}$.
7.3.12 Theorem [2679, Theorems 1, 2] Let Fq0 ⊆ Fq be finite fields, and suppose that c1 , . . . , cs ∈
Fq0 .
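1. If $s \ge 2$ and $q > \prod_{i=1}^{s} (k_i - 1)^{\frac{2}{s-1}}$, then Equation (7.3.1) is solvable in $\mathbb{F}_q$ for any $b \in \mathbb{F}_q$.
2. If $s \ge 3$ and
$$\frac{q^{s-1} - 1}{(q - 1)\, q^{\frac{s}{2} - 1}} > \frac{1}{D} \sum_{l=0}^{D-1} \prod_{k_i \mid l} (1 - k_i),$$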
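$$N_0 = q^{s-1} + \varepsilon^{s}\, \frac{q^{\frac{s}{2}-1}(q - 1)}{k} \sum_{j=0}^{k-1} (1 - k)^{\tau(j)}, \quad \text{and,}$$
$$\text{for } b \ne 0, \quad N_b = q^{s-1} - \varepsilon^{s+1} q^{\frac{s}{2}-1} \left[ (1 - k)^{\theta(b)} q^{\frac{1}{2}} - \frac{\left(q^{\frac{1}{2}} - \varepsilon\right)}{k} \sum_{j=0}^{k-1} (1 - k)^{\tau(j)} \right],$$
where $\varepsilon = (-1)^{t/r}$, $\theta(b) = |\{i \mid c_i^n = (-b)^n\}|$, $\tau(j) = |\{i \mid c_i^n = (\alpha^j)^n\}|$, $1 \le i \le s$, and $\alpha$ is a primitive root in $\mathbb{F}_q$.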
7.3.16 Theorem [2432] Let kj = 2mj for j = 1, . . . , s − 2, ks−1 = kms−1 , ks = k r ms , where
r ≥ 1, (2k, m1 · · · ms ) = 1 and m1 , . . . , ms are pairwise coprime.
7.3.17 Remark Theorem 7.3.16 holds when the number of variables s is even. Explicit formulas
for the case when s is odd are given in [199].
7.3.18 Theorem [2744] Let $s > 2$. Then $I(k_1,\dots,k_s) = 0$ (and hence $N_0 = q^{s-1}$) if and only if one of the following holds:
1. For some $i$, $\left(k_i, \frac{k_1 \cdots k_s}{k_i}\right) = 1$.
2. If $\{k_{i_1},\dots,k_{i_r} \mid 1 \le i_1 < \cdots < i_r \le s\}$ is the set of all even integers among $\{k_1,\dots,k_s\}$, then $2 \nmid r$, $\frac{k_{i_1}}{2},\dots,\frac{k_{i_r}}{2}$ are pairwise coprime, and $k_{i_j}$ is coprime to any odd number in $\{k_1,\dots,k_s\}$ for $j = 1,\dots,r$ and $r < s$.
7.3.19 Remark Diagonal equations with few variables have received special attention [2144, 2714].
Hermitian curves, a particular case of “Fermat like” equations, have been studied extensively
because they provide good examples of curves with maximal number of rational points
[1199, 1278, 1514, 2499]. We present a recent result on Fermat equations and refer the
reader to [110, 1717] for results on other specific families of “Fermat like” equations.
7.3.20 Theorem [1798, Theorem 1.1] The number of solutions of X k + Y k + Z k = 0 over Fqm is
N0 = 3k + k 2 (q − 2) + (d − 1)(d − 2) provided that
Nk−d
(t−1)
2
p> q + 1 ,
kπ
t+1
sin( 2N )
where $N = \frac{q^m - 1}{q - 1}$, $k \mid N$, $d = \left(\frac{N}{k}, t + 1\right)$, and $q \equiv t \pmod{\frac{N}{k}}$ with $0 < t < \frac{N}{k}$.
7.3.21 Definition Let L(k1 , . . . , ks ) be the least positive integer represented by (7.3.4) if there is
such an integer, or, otherwise, let L(k1 , . . . , ks ) = s − 1.
7.3.22 Definition For k an integer, let vq (k) denote the highest power of q dividing k.
7.3.23 Remark Many results on diagonal equations give bounds for the number of solutions Nb of
(7.3.1), while others give bounds for vp (Nb ). Most of these bounds depend on the numbers
I and L of Definitions 7.3.6 and 7.3.21. Exact values for I and L are hard to compute in
general; [2743] includes their exact values for certain equations, and [1350] provides sharp
general lower bounds for I.
7.3.24 Theorem [1939, 2679, 2744]
$$I(k_1,\dots,k_s) = (-1)^s + \sum_{r=1}^{s} (-1)^{s-r} \sum_{1 \le i_1 < i_2 < \cdots < i_r \le s} \frac{k_{i_1} \cdots k_{i_r}}{\operatorname{lcm}[k_{i_1},\dots,k_{i_r}]}.$$
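The closed formula can be compared with a direct enumeration of the tuples counted by $I$. This is a sketch under the assumption that expression (7.3.4) in Definition 7.3.6 is the sum $j_1/k_1 + \cdots + j_s/k_s$ (the expression itself is not reproduced above); it uses `math.lcm` with several arguments, which requires Python 3.9 or later, and the function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations, product
from math import lcm, prod


def i_bruteforce(ks):
    """Count tuples 1 <= j_i <= k_i - 1 with j_1/k_1 + ... + j_s/k_s an integer."""
    return sum(
        1
        for js in product(*(range(1, k) for k in ks))
        if sum(Fraction(j, k) for j, k in zip(js, ks)).denominator == 1
    )


def i_formula(ks):
    """Inclusion-exclusion expression for I(k_1, ..., k_s)."""
    s = len(ks)
    total = (-1) ** s
    for r in range(1, s + 1):
        inner = sum(prod(sub) // lcm(*sub) for sub in combinations(ks, r))
        total += (-1) ** (s - r) * inner
    return total
```

For example, for $(k_1, k_2, k_3) = (3, 3, 3)$ the only admissible tuples are $(1,1,1)$ and $(2,2,2)$, and the formula gives $-1 + 3 - 9 + 9 = 2$ as well.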
$$\frac{1}{k_1} + \cdots + \frac{1}{k_s} > \mu \ge 1,$$
then $v_q(N_b) \ge \mu$.
7.3.26 Theorem [2449] Let $q = p^n$, $k_i \mid (q - 1)$, $r_i$ be the least integer such that $k_i \mid (p^{r_i} - 1)$, $K_i k_i = p^{r_i} - 1$ for $i = 1,\dots,s$, and $\mu_1$ be the least integer such that
$$\mu_1 \ge \sum_{i=1}^{s} \frac{n\,(K_i, p - 1)}{r_i\,(p - 1)}.$$
If $\mu = \left[\frac{\mu_1}{n}\right] - 1$, then $v_q(N_b) \ge \mu$, where $[a]$ denotes the integer part of $a$.
7.3.27 Theorem [2956] Suppose $k_i \mid (q - 1)$, $I(k_1,\dots,k_s) > 0$ and let $w_i = \gcd(k_i, \operatorname{lcm}[k_j \,|\, i \ne j])$. Then $L(k_1,\dots,k_s) = \left\lceil \sum_{i=1}^{s} \frac{1}{w_i} \right\rceil$ if one of the following conditions holds:
1. $\sum_{i=1}^{s} \frac{1}{w_i} \equiv 0 \pmod 1$,
2. $\operatorname{lcm}[w_1,\dots,w_s] \mid w_s$,
3. $I(k_1,\dots,k_s) \le 10$,
4. $s \le 3$.
7.3.28 Theorem [2745, Theorems 2,4] For each $i$, define $u_i = \gcd\left(k_i, \frac{k_1 k_2 \cdots k_s}{k_i}\right)$. Then,
7.3.29 Corollary Assume that each ki satisfies one of the conditions of Part 3 of Theorem 7.3.28.
Then L(k1 , . . . , ks ) = 2s .
7.3.30 Theorem [2906, Theorem 3] vq (N0 ) ≥ L(k1 , . . . , ks ) − 1.
7.3.32 Remark Some of the bounds for the powers of p dividing the number of solutions of diagonal
equations can be improved if one considers the p-weight σp (ki ) of the degrees of the terms
in the equation instead of their degrees ki .
7.3.33 Theorem [2149, Theorem 10] Let $q = p^n$. If $\mu$ is the least integer satisfying
$$\mu \ge n\left(\frac{1}{\sigma_p(k_1)} + \cdots + \frac{1}{\sigma_p(k_s)} - 1\right),$$
then $v_p(N_b) \ge \mu$.
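The exponent $\mu$ is easy to compute once the $p$-weight is available. The sketch below assumes that the $p$-weight $\sigma_p(k)$ is the sum of the base-$p$ digits of $k$, restricts the brute-force check to the prime field case $n = 1$, and clamps $\mu$ at 0 since a negative exponent carries no information; the function names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import ceil


def p_weight(k, p):
    """Sum of the digits of k written in base p."""
    total = 0
    while k:
        k, digit = divmod(k, p)
        total += digit
    return total


def divisibility_exponent(ks, p, n):
    """Least integer mu >= n * (sum 1/sigma_p(k_i) - 1), clamped at 0."""
    bound = n * (sum(Fraction(1, p_weight(k, p)) for k in ks) - 1)
    return max(0, ceil(bound))


def count_diagonal(ks, b, p):
    """Solutions of x_1^k_1 + ... + x_s^k_s = b over the prime field F_p."""
    return sum(
        1
        for xs in product(range(p), repeat=len(ks))
        if (sum(pow(x, k, p) for x, k in zip(xs, ks)) - b) % p == 0
    )
```

For $p = 3$ and $k_1 = \cdots = k_4 = 4$ one has $\sigma_3(4) = 2$, so $\mu = 1$, whereas the plain degrees give $\sum 1/k_i - 1 = 0$ and no divisibility information.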
7.3.34 Remark If $\sum_{i=1}^{s} \frac{1}{\sigma_p(k_i)} > 1$ then the equation $\sum_{i=1}^{s} c_i x_i^{k_i} = 0$ has a non-trivial solution.
7.3.35 Theorem [2149, Theorem 2] Let $\gamma = \min_{(j_1,\dots,j_s)} \left\{ \frac{\sum_{i=1}^{s} \sigma_p(j_i (q-1)/k_i)}{p - 1} \right\} - 1$, where $(j_1,\dots,j_s)$ satisfies (7.3.4). If $N_0$ is the number of solutions of Equation (7.3.1) over $\mathbb{F}_{q^m}$, then $v_p(N_0) \ge m\gamma$. Also, $m\gamma$ is best possible, i.e., there exists an equation $c_1 x_1^{k_1} + \cdots + c_s x_s^{k_s} = 0$ such that $v_p(N_0) = m\gamma$.
7.3.36 Remark We note that Theorem 7.3.35 uses a calculation on Fq to give information on
vp (N0 ) for any extension of Fq .
7.3.37 Remark Generalizations of diagonal equations have been considered by several authors.
Some examples are systems of diagonal equations [21, 1698, 2151, 2808], deformed diagonal
equations [546, 2698], and equations with terms of disjoint support [501, 502, 563].
7.3.38 Theorem [1051, Theorem 5] Let $F = c_1 x_1^k + \cdots + c_s x_s^k + g(x_1,\dots,x_s)$ be a polynomial over $\mathbb{F}_p$, where $c_1 \cdots c_s \ne 0$ and $\deg(g) < k$. Then $F = 0$ is solvable in $\mathbb{F}_p^s$ if $s \ge \frac{p-1}{\lfloor \frac{p-1}{k} \rfloor}$.
7.3.39 Remark Theorem 7.3.38 was proved using the combinatorial nullstellensatz and it generalizes Theorem 1 in [546] to include the case $k \nmid (p - 1)$. Prior to this result, Newton polyhedra were used in [23] to prove results that allow one to estimate the number of zeros of polynomials of the type in Theorem 7.3.38. Note that, even if $s > k$, Chevalley’s theorem does not guarantee the solvability of the equations in Theorem 7.3.38.
7.3.40 Theorem [2746, Theorem 1.1] Let $q = p^n$ and $F = x_1^k + \cdots + x_s^k + g(x_1,\dots,x_s)$ be a polynomial over $\mathbb{F}_q$ where $\deg(g) < k$. Then, for nonempty subsets $A_1,\dots,A_s \subset \mathbb{F}_q$,
$$|\{F(a_1,\dots,a_s) \mid a_i \in A_i\}| \ge \min\left\{p, \sum_{i=1}^{s} \left\lfloor \frac{|A_i| - 1}{k} \right\rfloor + 1\right\}.$$
7.3.41 Remark If q = p and A1 = · · · = As = F∗p , s > k, one obtains Theorem 1.3 in [1838].
7.3.42 Theorem Let $q = p^n$ and $k_i > 1$ be positive integers satisfying $k_i \mid (p - 1)$. Let $F = x_1^{k_1} + \cdots + x_s^{k_s} + g(x_1,\dots,x_s)$ be a polynomial over $\mathbb{F}_q$, where $w_p(g) < \min_i\{k_i\}$, and $N_0$ be the number of solutions of $F = 0$. Then, $v_p(N_0) = n\left(\sum_{i=1}^{s} \frac{1}{k_i} - 1\right)$ whenever $\sum_{i=1}^{s} \frac{1}{k_i}$ is an integer. In particular, $F$ is solvable over $\mathbb{F}_q$.
7.3.43 Remark The result in Theorem 7.3.42 is a special case of Theorem 15 of [562].
7.3.45 Definition The smallest $s = g(k, q)$ such that the equation $x_1^k + \cdots + x_s^k = b$ has a solution for every $b \in \mathbb{F}_q$ is Waring’s number for $\mathbb{F}_q$ with respect to $k$.
7.3.46 Theorem Waring’s number $g(k, p^n)$ exists if and only if $\frac{p^n - 1}{p^d - 1} \nmid k$ for all $d \mid n$, $d \ne n$. Also, if $d = \gcd(k, q - 1)$, then $g(k, q) = g(d, q)$.
7.3.47 Remark Because of the previous theorem it is enough to consider k| (q − 1). From now on
we assume that g(k, q) exists and k| (q − 1).
7.3.48 Theorem [566] We have $g(k, p) \le k$ and equality holds if $k = 1, 2, \frac{p-1}{2}, p - 1$.
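For prime fields, Waring's number can be found by growing the set of sums of $k$-th powers one summand at a time. The following is a minimal sketch for $q = p$ prime only; since $x_i = 0$ is allowed, sums of exactly $s$ powers coincide with sums of at most $s$ powers. The function name is illustrative.

```python
def waring_number(k, p):
    """Smallest s such that every b in F_p is a sum of s k-th powers (p prime)."""
    powers = {pow(x, k, p) for x in range(p)}  # includes 0
    reachable = {0}
    for s in range(1, p + 1):
        reachable = {(a + b) % p for a in reachable for b in powers}
        if len(reachable) == p:
            return s
    return None  # not reached for prime p, where g(k, p) always exists
```

The equality cases of the theorem can be observed on small examples such as $g(2, 7) = 2$, $g(3, 7) = 3$ (here $k = (p-1)/2$), and $g(4, 5) = 4$ (here $k = p - 1$).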
7.3.49 Theorem [2677] If $2 \le k < q^{\frac{1}{4}} + 1$, then $g(k, q) = 2$.
7.3.50 Remark To find the exact value for g(k, q) is a difficult problem and, given Theorem 7.3.49,
one might ask, for each k, which is the largest q such that g(k, q) 6= 2 [2677]. The next table
contains some of the exact values known for g(k, p). These and other exact values can be
found in [12, 2148, 2676, 2678].
k                      p                                                          g(k, p)
k ≠ p − 1, (p − 1)/2   p ≤ 29                                                     ⌊k/2⌋ + 1
5                      31, 41, 61                                                 3
6                      31                                                         4
6                      37 ≤ p ≤ 67, 109, 139, 223                                 3
7                      43                                                         4
7                      71, 113, 127                                               3
8                      41                                                         4
8                      73 ≤ p ≤ 137, 233, 257, 337, 761                           3
9                      73, 109, 127, 163, 181, 199, 271, 307                      3
10                     31                                                         5
10                     41, 61                                                     4
10                     71 ≤ p ≤ 491, 521 ≤ p ≤ 631, 661 ≤ p ≤ 881, 641, 911       3
11                     89                                                         4
11                     67                                                         5
11                     199, 331, 353, 419, 463, 617                               3
7.3.51 Remark If the value of $g(k, p)$ for any $3 \le k \le 11$, $k \ne p - 1, \frac{p-1}{2}$, is not included in the table above, then $g(k, p) = 2$. Hence, the table and Theorem 7.3.48 provide all the exact values of Waring’s number for $k = 1,\dots,11$.
7.3.52 Theorem [2149, Theorem 18] Let $l \ne 1$. If $k \mid (p^n + 1)$, $k \ne p^n + 1$, then $g(k, p^{2nl}) = 2$.
7.3.53 Theorem [651, Theorem 2] Let $t = \frac{p-1}{k}$. If $a, b$ are the unique positive integers with $a > b$ and $a^2 + b^2 + ab = p$, then, for $t = 3$, $g(k, p) = a + b - 1$, and for $t = 6$, $g(k, p) = \left\lfloor \frac{2}{3}a + \frac{1}{3}b \right\rfloor$. If $t = 4$ and $a, b$ are the unique positive integers with $a > b$ and $a^2 + b^2 = p$, then $g(k, p) = a - 1$.
7.3.54 Theorem [1782, Theorems 1,2] Let $\varphi$ denote Euler’s function, $m$ be a positive integer and $p, r$ be primes such that $p$ is a primitive root modulo $r^m$. Then,
$$g\left(\frac{p^{\varphi(r^m)} - 1}{r^m},\, p^{\varphi(r^m)}\right) = \frac{(p - 1)\varphi(r^m)}{2}.$$
7.3.55 Remark Theorem 7.3.54 generalizes the results in [2995]. A good survey on bounds for
g(k, q) can be found in [2990].
7.3.56 Theorem [1282, Theorem 5] If $k < \sqrt{q}$, then $g(k, q) \le 8$.
7.3.57 Theorem [2990, Theorem 2] If $d = \frac{p - 1}{\gcd\left(\frac{p^n - 1}{k},\, p - 1\right)}$, then $g(k, p^n) \le n\, g(d, p)$.
7.3.58 Theorem [1206, Theorem 4] If $k \ge 2$ is a proper divisor of $p - 1$ and $k \ge (p - 1)^{4/7}$, then $g(k, p) \le 170\, \frac{k^{7/3}}{(p - 1)^{4/3}} \log p$.
7.3.59 Remark By using a result in [2030] an improvement to Theorem 7.3.58 can be obtained.
7.3.60 Theorem [1789, Theorem 2] For any $\varepsilon > 0$ there exists $c_\varepsilon > 0$ such that for any $k \ge 2$, $p \ge \frac{k \ln k}{(\ln(\ln k + 1))^{1-\varepsilon}}$, we have $g(k, p) \le c_\varepsilon (\ln k)^{2+\varepsilon}$.
7.3.61 Theorem [651, Theorem 1] Let $t = \frac{p-1}{k}$ and $l$ be a positive integer. If $\varphi(t) \ge l$, then $g(k, p) \le C(l)\, k^{\frac{1}{l}}$ for some constant $C(l)$, where $\varphi$ is Euler’s function.
7.3.62 Remark Theorems 7.3.60 and 7.3.61 prove conjectures made by Heilbronn in [1460]. The
next theorem gives an explicit value for the constant C(l) in Theorem 7.3.61 when l = 2.
Other estimates for C(l) are also given in [656].
7.3.63 Theorem [656, Theorem 1.1] For $t = \frac{p-1}{k} > 2$ we have the uniform upper bound $g(k, p) \le 83 k^{1/2}$.
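2. For any $\varepsilon > 0$, if $|A'_k| \ge 4^{2/(\varepsilon n)}$, then $g(k, p^n) \le C(\varepsilon) k^{\varepsilon}$, for some constant $C(\varepsilon)$.
3. $g(k, p^2) \le 16\sqrt{k} + 1$. If $n \ge 3$, then $g(k, p^n) \le 10\sqrt{k} + 1$.
4. If $|A'_k| \ge p^{\varepsilon}$ for $\varepsilon > \frac{41}{83}$, then, for all sufficiently large $p$, we have $g(k, p) \le 6$.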
7.3.66 Remark Part 1 of Theorem 7.3.65 improves Theorem 1 in [2992]. Parts 2 and 3 prove the
extensions of Heilbronn’s conjectures [1460] to arbitrary fields. See also [374].
7.3.67 Remark Waring’s problem has been generalized to systems of diagonal equations [561, 2808],
to general polynomials [372, 549, 658], to Dickson polynomials [1314], to factorials [1197],
and to reciprocals [754, 2643].
7.3.68 Theorem [1196, Theorem 1] Any residue class $\lambda$ modulo $p$ can be represented in
the form $\sum_{i=1}^{5} m_i!\, n_i! \equiv \lambda \pmod{p}$ for some positive integers $m_1, n_1, \ldots, m_5, n_5$ with
$\max_{1 \le i \le 5}\{m_i, n_i\} \le c\, p^{27/28}$, where $c$ is an absolute constant.
7.3.69 Theorem [1196, Theorem 2] Let $l(p)$ be the smallest integer such that for every integer
$\lambda$ the congruence $n_1! + \cdots + n_l! \equiv \lambda \pmod{p}$ has a solution in positive integers. Then
$l(p) \le C \log^3 p$, for some constant $C$.
7.3.70 Definition Let $D_k(x, a)$ be the Dickson polynomial of degree $k$ and parameter $a \in \mathbb{F}_q$
(see Section 8.3). The Waring problem for Dickson polynomials over $\mathbb{F}_q$ is to find the
smallest positive integer $s = g_a(k, q)$ such that the equation
\[ D_k(x_1, a) + \cdots + D_k(x_s, a) = b, \quad x_1, \ldots, x_s \in \mathbb{F}_q, \]
is solvable for every $b \in \mathbb{F}_q$.
7.3.71 Theorem [2333, Theorem 1] Let $g_a(k, q)$ be defined as in Definition 7.3.70. The inequality
$g_a(k, q) \le 16$ holds
1. for any $a \in \mathbb{F}_q^*$ and $(k, q - 1) \le 2^{-3/2} (q - 1)^{1/2}$;
2. for any $a$ that is a square in $\mathbb{F}_q^*$ and $(k, q + 1) \le 2^{-3/2} (q - 1)^{1/2}$.
See Also
References Cited: [12, 21, 23, 24, 30, 110, 150, 168, 199, 240, 372, 374, 416, 494, 501, 502,
511, 515, 538, 546, 549, 561, 562, 563, 566, 615, 650, 651, 656, 658, 754, 1051, 1196, 1197,
1199, 1206, 1274, 1278, 1282, 1314, 1350, 1460, 1514, 1540, 1548, 1575, 1617, 1698, 1717,
1757, 1782, 1789, 1798, 1838, 1917, 1918, 1939, 2030, 2120, 2137, 2144, 2146, 2147, 2148,
2149, 2150, 2151, 2165, 2333, 2432, 2449, 2499, 2550, 2643, 2676, 2677, 2678, 2679, 2681,
2697, 2698, 2714, 2743, 2744, 2745, 2746, 2808, 2809, 2883, 2906, 2956, 2990, 2992, 2995,
3001, 3002, 3003, 3066]
8
Permutation polynomials
8.1 One variable
Introduction • Criteria • Enumeration and distribution of PPs • Constructions of PPs • PPs from permutations of multiplicative groups • PPs from permutations of additive groups • Other types of PPs from the AGW criterion • Dickson and reversed Dickson PPs • Miscellaneous PPs
8.2 Several variables
8.3 Value sets of polynomials
Large value sets • Small value sets • General polynomials • Lower bounds • Examples • Further value set papers
8.4 Exceptional polynomials
Fundamental properties • Indecomposable exceptional polynomials • Exceptional polynomials and permutation polynomials • Miscellany • Applications
8.1.1 Introduction
8.1.1 Definition For q a prime power, let Fq denote the finite field containing q elements. A
polynomial f ∈ Fq [x] is a permutation polynomial (PP) of Fq if the function f : c → f (c)
from Fq into itself induces a permutation. Alternatively, f is a PP of Fq if the equation
f (x) = a has a unique solution for each a ∈ Fq .
8.1.2 Remark The set of all PPs on $\mathbb{F}_q$ forms a group under composition modulo $x^q - x$, isomorphic
to the symmetric group $S_q$ of order $q!$. For $q > 2$, the group $S_q$ is generated by $x^{q-2}$ and all
linear polynomials $ax + b$, and if $c$ is a primitive element in $\mathbb{F}_q$, $S_q$ is generated by $cx$, $x + 1$,
and $x^{q-2}$.
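As a small illustration of the generation statement, the following sketch closes the three maps $cx$, $x + 1$, and $x^{q-2}$ under composition for a prime $q = p$ and counts the resulting permutations. It is restricted to prime fields so that arithmetic is plain modular arithmetic; the function name is hypothetical.

```python
def generated_group_size(p: int, c: int) -> int:
    """Size of the permutation group of F_p generated by cx, x + 1 and x^(p-2)."""
    elems = range(p)
    gens = [
        tuple((c * x) % p for x in elems),        # x -> c x
        tuple((x + 1) % p for x in elems),        # x -> x + 1
        tuple(pow(x, p - 2, p) for x in elems),   # x -> x^(p-2) (inversion, 0 -> 0)
    ]
    identity = tuple(elems)
    seen = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for h in gens:
            comp = tuple(h[g[x]] for x in elems)  # apply g first, then h
            if comp not in seen:
                seen.add(comp)
                frontier.append(comp)
    return len(seen)


# 2 is a primitive element of F_5, so all 5! = 120 permutations should appear.
print(generated_group_size(5, 2))
```

In a finite setting the closure under products of generators is already a group, so no explicit inverses are needed.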
8.1.3 Remark Given a permutation $g$ of $\mathbb{F}_q$, the unique permutation polynomial $P_g(x)$ of $\mathbb{F}_q$ of
degree at most $q - 1$ representing the function $g$ can be found by the Lagrange Interpolation
Formula (see Theorem 1.71 in [1939]). In particular $P_g(x) = \sum_{a \in \mathbb{F}_q} g(a)\,(1 - (x - a)^{q-1})$;
see also Theorem 2.1.131.
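The interpolation formula can be evaluated directly. The sketch below does so over a prime field $\mathbb{F}_p$ (an assumption made to keep the arithmetic elementary), with polynomials stored as coefficient lists, lowest degree first; all function names are illustrative.

```python
def poly_mul(u, v, p):
    """Multiply two coefficient lists (lowest degree first) modulo p."""
    out = [0] * (len(u) + len(v) - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            out[i + j] = (out[i + j] + a * b) % p
    return out


def lagrange_pp(values, p):
    """Coefficients of P_g(x) = sum_a g(a) * (1 - (x - a)^(p-1)), where values[a] = g(a)."""
    coeffs = [0] * p
    for a, ga in enumerate(values):
        term = [1]
        for _ in range(p - 1):                      # build (x - a)^(p-1)
            term = poly_mul(term, [(-a) % p, 1], p)
        indicator = [(-c) % p for c in term]        # 1 - (x - a)^(p-1)
        indicator[0] = (indicator[0] + 1) % p
        for i, c in enumerate(indicator):
            coeffs[i] = (coeffs[i] + ga * c) % p
    return coeffs


def evaluate(coeffs, x, p):
    return sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p


def degree(coeffs):
    return max(i for i, c in enumerate(coeffs) if c)


# The transposition of F_5 that swaps 0 and 1.
swap = [1, 0, 2, 3, 4]
coeffs = lagrange_pp(swap, 5)
print(coeffs, degree(coeffs))
```

The factor $1 - (x - a)^{p-1}$ equals 1 at $x = a$ and 0 elsewhere, which is why the sum reproduces $g$.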
8.1.2 Criteria
Normalized PPs of $\mathbb{F}_q$                                   q
$x$                                                      any q
$x^2$                                                    q ≡ 0 (mod 2)
$x^3$                                                    q ≢ 1 (mod 3)
$x^3 - ax$ (a not a square)                              q ≡ 0 (mod 3)
$x^4 \pm 3x$                                             q = 7
$x^4 + a_1 x^2 + a_2 x$ (if its only root in Fq is 0)    q ≡ 0 (mod 2)
$x^5$                                                    q ≢ 1 (mod 5)
$x^5 - ax$ (a not a fourth power)                        q ≡ 0 (mod 5)
$x^5 + ax$ ($a^2 = 2$)                                   q = 9
$x^5 \pm 2x^2$                                           q = 7
$x^5 + ax^3 \pm x^2 + 3a^2 x$ (a not a square)           q = 7
$x^5 + ax^3 + 5^{-1} a^2 x$ (a arbitrary)                q ≡ ±2 (mod 5)
$x^5 + ax^3 + 3a^2 x$ (a not a square)                   q = 13
$x^5 - 2ax^3 + a^2 x$ (a not a square)                   q ≡ 0 (mod 5)
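Several rows of the table with prime $q$ can be confirmed by exhaustive evaluation. The following minimal sketch assumes a prime field and represents a polynomial as a dictionary from exponents to coefficients; `is_pp` is an illustrative helper, not a routine from the cited literature.

```python
def is_pp(coeffs, p):
    """True if the polynomial {exponent: coefficient} permutes the prime field F_p."""
    values = {
        sum(c * pow(x, e, p) for e, c in coeffs.items()) % p
        for x in range(p)
    }
    return len(values) == p


print(is_pp({4: 1, 1: 3}, 7))            # x^4 + 3x over F_7
print(is_pp({5: 1, 2: -2}, 7))           # x^5 - 2x^2 over F_7
print(is_pp({5: 1, 3: 2, 1: 12}, 13))    # x^5 + ax^3 + 3a^2 x with a = 2, a non-square mod 13
print(is_pp({3: 1}, 7))                  # x^3 fails, since 7 ≡ 1 (mod 3)
```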
A list of PPs of degree 6 over finite fields with odd characteristic can be found in [840]. A
list of PPs of degree 6 and 7 over finite fields with characteristic two can be found in [1916].
A recent preprint [2605] tabulates all monic PPs of degree 6 in the normalized form.
8.1.9 Theorem [1939] The polynomial $f$ is a PP of $\mathbb{F}_q$ if and only if $\sum_{c \in \mathbb{F}_q} \chi(f(c)) = 0$ for all
nontrivial additive characters $\chi$ of $\mathbb{F}_q$.
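For a prime field the nontrivial additive characters are $\chi_a(x) = e^{2\pi i a x / p}$ with $a = 1, \ldots, p-1$, so the criterion can be tested numerically. The sketch below makes that prime-field assumption and uses a floating-point tolerance `tol` (an arbitrary choice) to decide whether a sum vanishes.

```python
import cmath


def character_sums(f, p):
    """Sums of chi_a(f(c)) over c in F_p, for each nontrivial additive character chi_a."""
    return [
        sum(cmath.exp(2j * cmath.pi * a * (f(c) % p) / p) for c in range(p))
        for a in range(1, p)
    ]


def is_pp_by_characters(f, p, tol=1e-9):
    """Character-sum test for permutation polynomials over the prime field F_p."""
    return all(abs(s) < tol for s in character_sums(f, p))


print(is_pp_by_characters(lambda x: x**3, 5))   # x^3 permutes F_5
print(is_pp_by_characters(lambda x: x**2, 5))   # x^2 does not
```

For $x^2$ over $\mathbb{F}_5$ the sums are quadratic Gauss sums of absolute value $\sqrt{5}$, far from the tolerance.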
8.1.10 Remark Another criterion for PPs conjectured by Mullen [2176] in terms of the size $|V_f|$
of the value set $V_f = \{f(a) \mid a \in \mathbb{F}_q\}$ of a polynomial $f$ of degree $n$ was proved by Wan
[2911]. Namely, if $|V_f| > q - \frac{q-1}{n}$ then $f$ is a PP of $\mathbb{F}_q$ [2911]. We refer to Section 8.3 for
more information on value sets. A variation of Hermite's criterion in terms of combinatorial
identities is given in [2018]. Hermite's criterion can be rewritten in terms of the invariant
$u_p(f)$, the smallest positive integer $k$ such that $\sum_{x \in \mathbb{F}_q} f(x)^k \ne 0$. That is, $f$ is a PP of $\mathbb{F}_q$ if
and only if $u_p(f) = q - 1$. In the case $q = p$, this criterion was improved in [1813] and only
requires $k > \frac{p-1}{2}$. Using Teichmüller liftings, Wan et al. [2919] obtained an upper bound
for $|V_f|$ and improved Hermite's criterion. Several other criteria were obtained by Turnwald
[2826] in terms of invariants associated with elementary symmetric polynomials, without
using Teichmüller liftings.
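The invariant $u_p(f)$ is easy to compute for a prime field. The sketch below assumes $q = p$ prime and returns `None` when every power sum vanishes (a convention chosen here for illustration).

```python
def hermite_invariant(f, p):
    """Smallest k in 1..p-1 with sum_x f(x)^k != 0 in F_p, or None if there is none."""
    values = [f(x) % p for x in range(p)]
    for k in range(1, p):
        if sum(pow(v, k, p) for v in values) % p != 0:
            return k
    return None


# x^3 permutes F_5, so the invariant equals q - 1 = 4.
print(hermite_invariant(lambda x: x**3, 5))
# x^2 does not permute F_5; the power sum is already nonzero at k = 2.
print(hermite_invariant(lambda x: x**2, 5))
```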
8.1.11 Theorem [2826] Let $f \in \mathbb{F}_q[x]$ be a polynomial of degree $n$ with $1 \le n < q$. Let $u$ be
the smallest positive integer $k$ with $s_k \ne 0$ if such $k$ exists and otherwise set $u = \infty$,
where $s_k$ denotes the $k$-th elementary symmetric polynomial of the values $f(a)$. Let $v$ be
the size of the value set $V_f = \{f(a) \mid a \in \mathbb{F}_q\}$. Let $w$ be the smallest positive integer $k$
with $p_k = \sum_{a \in \mathbb{F}_q} f(a)^k \ne 0$ if such $k$ exists and otherwise set $w = \infty$. The following are
equivalent:
8.1.12 Remark A criterion in terms of resultants was given by von zur Gathen [1221]. Using the
Euclidean algorithm to compute the resultant, von zur Gathen provided a probabilistic test
to determine whether a given polynomial is a PP or not. The test uses a softly linear number
of operations in $\mathbb{F}_q$, namely $O(n \log q\,(\log(n \log q))^k)$ for some $k$. Furthermore, Ma and
von zur Gathen showed in [1983] that this decision problem can be solved in zero-error
probabilistic polynomial time, and in [1984] provided a random polynomial time test for
rational functions over finite fields, along with several related problems. Earlier, Shparlinski had given a
deterministic superpolynomial time algorithm for testing PPs [2638]. In 2005 Kayal provided
a deterministic polynomial-time algorithm for testing PPs [1716].
8.1.13 Problem [1935] Let $N_n(q)$ denote the number of PPs of $\mathbb{F}_q$ which have degree $n$. We have
the trivial boundary conditions: $N_1(q) = q(q - 1)$, $N_n(q) = 0$ if $n$ is a divisor of $q - 1$ larger
than 1, and $\sum N_n(q) = q!$ where the sum is over all $1 \le n < q - 1$ such that $n$ is either 1
or is not a divisor of $q - 1$. Find $N_n(q)$.
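For very small $q$ the numbers $N_n(q)$ can be tabulated by interpolating every permutation. The sketch below does this for a prime $q = p$ (an assumption for simplicity), reusing the indicator polynomials $1 - (x - a)^{p-1}$ from the Lagrange formula; it is exponential in $p$ and only meant to illustrate the boundary conditions.

```python
from collections import Counter
from itertools import permutations


def indicator_polys(p):
    """For each a in F_p, the coefficients (lowest degree first) of 1 - (x - a)^(p-1)."""
    polys = []
    for a in range(p):
        term = [1]
        for _ in range(p - 1):
            nxt = [0] * (len(term) + 1)
            for i, c in enumerate(term):           # multiply term by (x - a)
                nxt[i] = (nxt[i] - a * c) % p
                nxt[i + 1] = (nxt[i + 1] + c) % p
            term = nxt
        ind = [(-c) % p for c in term]
        ind[0] = (ind[0] + 1) % p
        polys.append(ind)
    return polys


def degree_counts(p):
    """Counter mapping n to N_n(p), the number of PPs of F_p of degree exactly n."""
    basis = indicator_polys(p)
    counts = Counter()
    for perm in permutations(range(p)):
        coeffs = [sum(perm[a] * basis[a][i] for a in range(p)) % p for i in range(p)]
        counts[max(i for i, c in enumerate(coeffs) if c)] += 1
    return counts


print(dict(degree_counts(5)))
```

For $q = 5$ this yields $N_1(5) = 20 = q(q-1)$, $N_3(5) = 100$, and nothing in degrees 2 and 4, which divide $q - 1$.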
8.1.14 Remark In an invited address before the MAA in 1966, Carlitz conjectured that for each
even integer n, there is a constant Cn so that for each finite field of odd order q > Cn ,
there does not exist a PP of degree n over Fq . A polynomial f over Fq is exceptional if
the only absolutely irreducible factors of f (x) − f (y) in Fq [x, y] are scalar multiples of
x − y. One can also characterize an exceptional polynomial as a polynomial which induces
a permutation of infinitely many finite extension fields of Fq . As first proved by Cohen in
[667], any exceptional polynomial is a PP, and the converse holds if q is large compared to
the degree of f . Cohen’s equivalent statement of Carlitz’s conjecture [677] says that there is
no exceptional polynomial of even degree in odd characteristic. This was proved by Fried,
Guralnick, and Saxl in [1120]; in fact an even stronger result was obtained through the use
of powerful group theoretic methods, including the classification of finite simple groups.
8.1.15 Remark For the next theorem we require the concept of an exceptional cover; see Section
9.7.
8.1.16 Theorem [1120] There is no exceptional cover of nonsingular absolutely irreducible curves
over Fq of degree 2p where q is a power of p and p is prime.
8.1.17 Remark Several partial results on Carlitz's conjecture can be found in [677, 2908]. Moreover,
Wan generalized Carlitz's conjecture in [2909], conjecturing that if $q > n^4$ and $(n, q - 1) > 1$ then
there is no PP of degree $n$ over $\mathbb{F}_q$. Later Cohen and Fried [688] gave an elementary proof
of Wan's conjecture following an argument of Lenstra; this result was stated in terms of
exceptional polynomials; see Section 8.4 for more information on exceptional polynomials.
8.1.18 Theorem [688, 2909] There is no exceptional polynomial of degree n over Fq if (n, q −1) > 1.
8.1.19 Theorem [551]
1. Let $\ell > 1$. For $q$ sufficiently large, there exists $a \in \mathbb{F}_q$ such that the polynomial
$x(x^{(q-1)/\ell} + a)$ is a PP of $\mathbb{F}_q$.
2. Let $\ell > 1$, $(r, q - 1) = 1$, and $k$ be a positive integer. For $q$ sufficiently large, there
exists $a \in \mathbb{F}_q$ such that the polynomial $x^r (x^{(q-1)/\ell} + a)^k$ is a PP of $\mathbb{F}_q$.
8.1.20 Remark Any non-constant polynomial $h(x) \in \mathbb{F}_q[x]$ of degree $\le q - 1$ can be written
uniquely as $a x^r f(x^{(q-1)/\ell}) + b$ with index $\ell$ [61]. Namely, write
\[ h(x) = a\,(x^n + a_{n-i_1} x^{n-i_1} + \cdots + a_{n-i_k} x^{n-i_k}) + b, \]
where $a, a_{n-i_j} \ne 0$, $j = 1, \ldots, k$. Here we suppose that $j \ge 1$ and $n - i_k = r$. Then
$h(x) = a x^r f(x^{(q-1)/\ell}) + b$, where $f(x) = x^{e_0} + a_{n-i_1} x^{e_1} + \cdots + a_{n-i_{k-1}} x^{e_{k-1}} + a_r$,
\[ \ell = \frac{q - 1}{(n - r,\; n - r - i_1,\; \ldots,\; n - r - i_{k-1},\; q - 1)}, \]
and $(e_0, e_1, \ldots, e_{k-1}, \ell) = 1$. Clearly, $h$ is a PP of $\mathbb{F}_q$ if and only if $g(x) = x^r f(x^{(q-1)/\ell})$ is
a PP of $\mathbb{F}_q$. Then $\ell$ is the index of $h$.
8.1.21 Remark If $\ell = 1$ then $f(x) = 1$ so that $g(x) = x^r$. In this case $g(x)$ is a PP of $\mathbb{F}_q$ if and
only if $(r, q - 1) = 1$. We can assume $\ell > 1$.
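The index and the vanishing order $r$ depend only on the exponents that occur in $h(x) - b$. The following sketch computes both from a list of exponents; the function name and the input convention (exponents of the nonconstant terms with nonzero coefficient) are assumptions made for the illustration.

```python
from functools import reduce
from math import gcd


def vanishing_order_and_index(exponents, q):
    """Return (r, ell) for a polynomial over F_q whose nonconstant terms have these exponents."""
    exps = sorted({e for e in exponents if e > 0})
    r = exps[0]                                   # smallest exponent, n - i_k
    d = reduce(gcd, [e - r for e in exps] + [q - 1])
    return r, (q - 1) // d


# x^4 + 3x = x (x^3 + 3) over F_7: r = 1 and (q - 1)/ell = 3, so ell = 2.
print(vanishing_order_and_index([4, 1], 7))
# A monomial x^3 over F_7 has index 1.
print(vanishing_order_and_index([3], 7))
```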
8.1.22 Remark More existence and enumerative results for binomials can be found in [61, 64, 1833,
2018, 2019, 2020, 2825, 2905, 2913]. In [1833], Laigle-Chapuy proved the first assertion of
Theorem 8.1.19 assuming $q > \ell^{2\ell+2} \left(1 + \frac{\ell+1}{\ell^{\ell+2}}\right)^2$. In [2020], Masuda and Zieve obtained a
stronger result for more general binomials of the form $x^r (x^{e_1 (q-1)/\ell} + a)$. More precisely
they showed the truth of Part 1 of Theorem 8.1.19 for $q > \ell^{2\ell+2}$. Here we present a general
result of Akbary-Ghioca-Wang (Theorem 8.1.25) which shows that there exist permutation
polynomials of index $\ell$ for any prescribed exponents satisfying conditions (8.1.1). This result
generalizes all the existence results from [551, 1833, 2020].
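where $s := (q - 1)/\ell$. For a tuple $\bar{a} := (a_1, \ldots, a_m) \in (\mathbb{F}_q^*)^m$, we let
\[ g^{\bar{a}}_{r,\bar{e}}(x) := x^r \left(x^{e_m s} + a_1 x^{e_{m-1} s} + \cdots + a_{m-1} x^{e_1 s} + a_m\right). \]
We define $N^m_{r,\bar{e}}(\ell, q)$ as the number of all tuples $\bar{a} \in (\mathbb{F}_q^*)^m$ such that $g^{\bar{a}}_{r,\bar{e}}(x)$ is a PP of
$\mathbb{F}_q$. In other words $N^m_{r,\bar{e}}(\ell, q)$ is the number of all monic permutation $(m + 1)$-nomials
$g^{\bar{a}}_{r,\bar{e}}(x) = x^r f(x^{(q-1)/\ell})$ over $\mathbb{F}_q$ with vanishing order at zero equal to $r$, set of exponents
$\bar{e}$ for $f(x)$, and index $\ell$. Note that if $r$ and $\bar{e}$ satisfy (8.1.1) then $g^{\bar{a}}_{r,\bar{e}}(x)$ has index $\ell$.
\[ \left| N^m_{r,\bar{e}}(\ell, q) - \frac{\ell!}{\ell^{\ell}}\, q^m \right| < \ell!\, \ell\, q^{m - 1/2}. \]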
8.1.25 Theorem [61] For any $q, r, \bar{e}, m, \ell$ that satisfy (8.1.1), $(r, s) = 1$, and $q > \ell^{2\ell+2}$, there exists
an $\bar{a} \in (\mathbb{F}_q^*)^m$ such that the $(m + 1)$-nomial $g^{\bar{a}}_{r,\bar{e}}(x)$ is a permutation polynomial of $\mathbb{F}_q$.
8.1.26 Remark For $q \ge 7$ we have $\ell^{2\ell+2} < q$ if $\ell < \frac{\log q}{2 \log \log q}$.
8.1.29 Theorem [1788] Fix $j$ integers $k_1, \ldots, k_j$ with the property that $0 < k_1 < \cdots < k_j < q - 1$
and define $N(k_1, \ldots, k_j; q)$ as the number of PPs of $\mathbb{F}_q$ of degree less than $q - 1$ such that
the coefficient of $x^{k_i}$ equals 0, for $i = 1, \ldots, j$. Then
\[ \left| N(k_1, \ldots, k_j; q) - \frac{q!}{q^j} \right| < \left(1 + \sqrt{\frac{1}{e}}\right)^{q} \big((q - k_1 - 1)\, q\big)^{q/2}. \]
In particular, $N_{q-2}(q) = q! - N(q - 2; q)$.
8.1.30 Remark We note that for $1 \le t \le q - 2$ the number of PPs of degree at least $q - t - 1$
is $q! - N(q - t - 1, q - t, \ldots, q - 2; q)$. In [1788] Konyagin and Pappalardi proved that
$N(q - t - 1, q - t, \ldots, q - 2; q) \sim \frac{q!}{q^t}$ holds for $q \to \infty$ and $t \le 0.03983\, q$. This result
guarantees the existence of PPs of degree at least $q - t - 1$ for $t \le 0.03983\, q$ (as long as $q$
is sufficiently large). However, the following theorem establishes the existence of PPs with
exact degree $q - t - 1$.
8.1.31 Theorem [61] Let $m \ge 1$. Let $q$ be a prime power such that $q - 1$ has a divisor $\ell$ with $m < \ell$
and $\ell^{2\ell+2} < q$. Then for every $1 \le t < \frac{\ell - m}{\ell}(q - 1)$ coprime with $(q - 1)/\ell$ there exists an
$(m + 1)$-nomial $g^{\bar{a}}_{r,\bar{e}}(x)$ of degree $q - t - 1$ which is a PP of $\mathbb{F}_q$.
8.1.32 Corollary [61] Let m ≥ 1 be an integer, and let q be a prime power such that (m+1) | (q−1).
Then for all n ≥ 2m + 4, there exists a permutation (m + 1)-nomial of Fqn of degree q − 2.
8.1.33 Definition Let m[k] (q) be the number of permutations of Fq which are k-cycles and are
represented by polynomials of degree q − k.
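8.1.34 Theorem [2967] Every transposition of $\mathbb{F}_q$ is represented by a unique polynomial of degree
$q - 2$. Moreover,
\[ m_{[3]}(q) = \begin{cases} \tfrac{2}{3}\, q(q - 1) & \text{if } q \equiv 1 \pmod{3}, \\ 0 & \text{if } q \equiv 2 \pmod{3}, \\ \tfrac{1}{3}\, q(q - 1) & \text{if } q \equiv 0 \pmod{3}. \end{cases} \]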
8.1.38 Remark For the purpose of introducing the construction of PPs in the next few sections,
we present the following recent result by Akbary-Ghioca-Wang (AGW).
8.1.39 Theorem (AGW’s criterion, [62]) Let A, S, and S̄ be finite sets with #S = #S̄, and let
f : A → A, f¯ : S → S̄, λ : A → S, and λ̄ : A → S̄ be maps such that λ̄ ◦ f = f¯ ◦ λ. If both
λ and λ̄ are surjective, then the following statements are equivalent:
1. f is a bijection (a permutation of A);
2. f¯ is a bijection from S to S̄ and f is injective on λ−1 (s) for each s ∈ S.
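The equivalence can be observed on a toy example with no field structure at all. In the sketch below the sets and maps are hypothetical choices made for illustration: $A = \mathbb{Z}/12$, $S = \bar{S} = \mathbb{Z}/4$, both $\lambda$ and $\bar{\lambda}$ are reduction modulo 4, and $f$ is a linear map on $\mathbb{Z}/12$.

```python
def agw_check(A, f, lam, lam_bar, f_bar):
    """Return (statement 1, statement 2) of the AGW criterion for finite data.

    Raises AssertionError if the diagram lam_bar o f = f_bar o lam does not commute.
    """
    assert all(lam_bar(f(x)) == f_bar(lam(x)) for x in A)
    S = {lam(x) for x in A}
    S_bar = {lam_bar(x) for x in A}
    assert len(S) == len(S_bar)
    f_bijective = len({f(x) for x in A}) == len(A)
    f_bar_bijective = {f_bar(s) for s in S} == S_bar
    injective_on_fibres = all(
        len({f(x) for x in A if lam(x) == s}) == sum(1 for x in A if lam(x) == s)
        for s in S
    )
    return f_bijective, f_bar_bijective and injective_on_fibres


A = range(12)
mod4 = lambda x: x % 4

# f(x) = 5x + 3 on Z/12 descends to s -> 5s + 3 on Z/4; both statements hold.
print(agw_check(A, lambda x: (5 * x + 3) % 12, mod4, mod4, lambda s: (5 * s + 3) % 4))
# f(x) = 3x on Z/12 collapses each fibre, so both statements fail together.
print(agw_check(A, lambda x: (3 * x) % 12, mod4, mod4, lambda s: (3 * s) % 4))
```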
8.1.40 Remark We note that this criterion does not require any restriction on the structures of
the sets S and S̄ in finding new classes of PPs of a set A. In particular, if we take A as
a group and S and S̄ as homomorphic images of A, then we obtain the following general
result for finding permutations of a group.
8.1.41 Theorem [62] Let (G, +) be a finite group, and let ϕ, ψ, ψ̄ ∈ End(G) be group endomor-
phisms such that ψ̄ ◦ ϕ = ϕ ◦ ψ and #im(ψ) = #im(ψ̄). Let g : G −→ G be any mapping,
and let f : G −→ G be defined by f (x) = ϕ(x) + g(ψ(x)). Then,
8.1.42 Remark One can apply this result to a multiplicative group of a finite field, an additive
group of a finite field, or the group of rational points of an elliptic curve over finite fields
[62]. This reduces a problem of determining whether a given polynomial over a finite field
8.1.45 Definition [2278, 2940] Let $\gamma$ be a primitive element of $\mathbb{F}_q$, $q - 1 = \ell s$ for some positive
integers $\ell$ and $s$, and the set of all nonzero $\ell$-th powers of $\mathbb{F}_q$ be $C_0 = \{\gamma^{\ell j} : j = 0, 1, \ldots, s - 1\}$.
Then $C_0$ is a subgroup of $\mathbb{F}_q^*$ of index $\ell$. The elements of the factor group
$\mathbb{F}_q^*/C_0$ are the cyclotomic cosets
\[ C_i := \gamma^i C_0, \quad i = 0, 1, \ldots, \ell - 1. \]
For any integer $r > 0$ and any $A_0, A_1, \ldots, A_{\ell-1} \in \mathbb{F}_q$, we define an $r$-th order cyclotomic
mapping $f^r_{A_0, A_1, \ldots, A_{\ell-1}}$ of index $\ell$ from $\mathbb{F}_q$ to itself by $f^r_{A_0, A_1, \ldots, A_{\ell-1}}(0) = 0$ and
Moreover, $f^r_{A_0, A_1, \ldots, A_{\ell-1}}$ is an $r$-th order cyclotomic mapping of the least index $\ell$ if the
mapping cannot be written as a cyclotomic mapping of any smaller index.
8.1.46 Remark Cyclotomic mapping permutations were introduced in [2278] when $r = 1$ and
in [2940] for any positive $r$. Let $\zeta = \gamma^s$ be a primitive $\ell$-th root of unity in $\mathbb{F}_q$
and $P(x) = x^r f(x^s)$ be a polynomial of index $\ell$ over $\mathbb{F}_q$ with positive integer $r$. Then
$P(x) = x^r f(x^s) = f^r_{A_0, A_1, \ldots, A_{\ell-1}}(x)$ where $A_i = f(\zeta^i)$ for $0 \le i \le \ell - 1$. We note that
the least index of a cyclotomic mapping is equal to the index of the corresponding polynomial.
If $P$ is a PP of $\mathbb{F}_q$ then $(r, s) = 1$ and $A_i = f(\zeta^i) \ne 0$ for $0 \le i \le \ell - 1$. Under these
two necessary conditions, $P(x) = x^r f(x^s)$ is a PP of $\mathbb{F}_q$ if and only if $f^r_{A_0, A_1, \ldots, A_{\ell-1}}$ is a PP
of $\mathbb{F}_q$ [2940]. The concept of cyclotomic mapping permutations has recently been generalized
in [2943], allowing each branch to take a different $r_i$ value so that $P(x)$ has the form
$\sum_{i=0}^{n} x^{r_i} f_i(x^s)$. More results can be found in [2943] and related piecewise constructions in
[1061, 3058].
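Since $x^s = \zeta^i$ for every $x$ in the coset $C_i$, the identity $P(x) = x^r f(x^s)$ says that $P$ acts on $C_i$ as $x \mapsto A_i x^r$ with $A_i = f(\zeta^i)$. The sketch below checks this branch-by-branch description over a prime field; the example $P(x) = x(x^3 + 3)$ over $\mathbb{F}_7$ with $\gamma = 3$, and the helper name, are choices made here for illustration.

```python
def cyclotomic_data(p, ell, gamma):
    """Return (s, cosets, zeta) for the prime field F_p with primitive element gamma."""
    s = (p - 1) // ell
    cosets = [sorted(pow(gamma, i + ell * j, p) for j in range(s)) for i in range(ell)]
    zeta = pow(gamma, s, p)          # a primitive ell-th root of unity
    return s, cosets, zeta


p, ell, gamma, r = 7, 2, 3, 1
f = lambda y: (y + 3) % p                       # P(x) = x^r f(x^s) = x (x^3 + 3)
s, cosets, zeta = cyclotomic_data(p, ell, gamma)
P = lambda x: (pow(x, r, p) * f(pow(x, s, p))) % p

branch = [f(pow(zeta, i, p)) for i in range(ell)]   # A_i = f(zeta^i)
agrees = all(
    P(x) == (branch[i] * pow(x, r, p)) % p
    for i in range(ell)
    for x in cosets[i]
)
print(cosets, branch, agrees)
```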
8.1.47 Remark There are several other equivalent descriptions of PPs of the form $x^r f(x^s)$, see [65,
2359, 2916, 2940, 3076] for example. In particular, in [65], it is shown that $P(x) = x^r f(x^s)$
is a PP of $\mathbb{F}_q$ if and only if $(r, s) = 1$, $A_i = f(\zeta^i) \ne 0$ for $0 \le i \le \ell - 1$, and
\[ \sum_{i=0}^{\ell-1} \zeta^{cri} A_i^{cs} = 0 \]
for all $c = 1, \ldots, \ell - 1$. This criterion is equivalent to Hermite's criterion when the index
8.1.51 Remark For $\ell = 3$, the conditions in (8.1.2) are sufficient to determine whether $P$ is a PP of
$\mathbb{F}_q$ [2927]. However, for $\ell > 3$, it turns out not to be the case (for example, see [63, 64, 2927]).
For general $\ell$, a characterization of PPs of the form $x^r (x^{es} + 1)$ in terms of generalized Lucas
sequences of order $k := \frac{\ell - 1}{2}$ is given in [2940, 2942].
8.1.52 Definition [64] For any integer $k \ge 1$ and $\eta$ a fixed primitive $(4k + 2)$-th root of unity,
the generalized Lucas sequence (or unsigned generalized Lucas sequence) of order $k$ is
defined as $\{a_n\}_{n=0}^{\infty}$ such that
\[ a_n = \sum_{\substack{t = 1 \\ t \text{ odd}}}^{2k} (\eta^t + \eta^{-t})^n = \sum_{t=1}^{k} \big((-1)^{t+1} (\eta^t + \eta^{-t})\big)^n. \]
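A numerical evaluation makes the definition concrete. The sketch below takes $\eta = e^{2\pi i/(4k+2)}$ as a complex root of unity (an assumption; the text uses the sequence over $\mathbb{F}_p$) and rounds the real part, which is reasonable only for small $n$. For $k = 2$ it reproduces the classical Lucas numbers.

```python
import cmath


def generalized_lucas(k, n_terms):
    """First n_terms of the order-k sequence, using the sum over odd t in 1..2k."""
    eta = cmath.exp(2j * cmath.pi / (4 * k + 2))
    terms = []
    for n in range(n_terms):
        total = sum((eta**t + eta**(-t)) ** n for t in range(1, 2 * k + 1, 2))
        terms.append(round(total.real))
    return terms


def generalized_lucas_signed_form(k, n_terms):
    """Same sequence via the second expression, summing over t = 1..k."""
    eta = cmath.exp(2j * cmath.pi / (4 * k + 2))
    terms = []
    for n in range(n_terms):
        total = sum(((-1) ** (t + 1) * (eta**t + eta**(-t))) ** n for t in range(1, k + 1))
        terms.append(round(total.real))
    return terms


print(generalized_lucas(2, 8))   # 2, 1, 3, 4, 7, 11, 18, 29
```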
8.1.53 Theorem [2940, 2942] Let $q = p^m$ be an odd prime power and $q - 1 = \ell s$ with odd $\ell \ge 3$
and $(e, \ell) = 1$. Let $k = \frac{\ell - 1}{2}$. Then $P(x) = x^r (x^{es} + 1)$ is a PP of $\mathbb{F}_q$ if and only if $(r, s) = 1$,
$(2r + es, \ell) = 1$, $2^s = 1$, and
where $a_{cs}$ is the $(cs)$-th term of the generalized Lucas sequence $\{a_i\}_{i=0}^{\infty}$ of order $k$ over $\mathbb{F}_p$,
$j_c = c\,(2 e^{\phi(\ell) - 1} r + s) \pmod{2\ell}$, $R_{j_c, k}(x)$ is the remainder of the Dickson polynomial $D_{j_c}(x)$
of the first kind of degree $j_c$ divided by the characteristic polynomial $g_k(x)$, and $L$ is a left
shift operator on sequences. In particular, all $j_c$ are distinct even numbers between 2 and
$2\ell - 2$.
8.1.54 Remark We note that the degree of any remainder Rn,k is at most k −1 and Rn,k is either a
Dickson polynomial of degree ≤ k − 1 or the degree k − 1 characteristic polynomial gk−1 (x)
of the generalized Lucas sequence of order k, or a negation of the above polynomials [2942].
We can extend the definition of {an } to negative subscripts n using the same recurrence
relations. We also remark that, for ` ≤ 7, the sequences used in the descriptions of per-
mutation binomials have simple structures and are fully described [63, 2927]. We note that
signed generalized Lucas sequences are defined in [2942] and they are used to compute the
coefficients of the compositional inverse of permutation binomials xr (xes + 1). Finally we
note that Equation (8.1.3) always holds if the sequence {an } is s-periodic over Fp , which
means that an ≡ an+ks (mod p) for integers k and n. We remark that these sequences are
defined over prime fields and checking the s-periodicity of these sequences is a much simpler
task than checking whether the polynomial is a PP over the extension field directly.
8.1.55 Theorem [64] Assume the conditions (8.1.2) on `, r, e, and s hold. If {an } is s-periodic
over Fp , then the binomial P (x) = xr (xes + 1) is a permutation binomial of Fq .
8.1.56 Theorem [65] Let p be an odd prime and q = pm and let `, r, s be positive integers satisfying
that q − 1 = `s, (r, s) = 1, (e, `) = 1, and ` odd. Let p ≡ −1 (mod `) or p ≡ 1 (mod `) and
` | m. Then the binomial P (x) = xr (xes + 1) is a permutation binomial of Fq if and only
if (2r + es, `) = 1. In particular, if p ≡ 1 (mod `) and ` | m, then the conditions (r, s) = 1,
(e, `) = 1, and ` odd imply that (2r + es, `) = 1 [60].
8.1.57 Remark For a = 1 (equivalent to a = bs for some b), under the assumptions on q, s, `, r, e,
it is shown in [3076] that the s-periodicity of the generalized Lucas sequence implies that
(η + η −1 )s = 1 for every (2`)-th root of unity η. However, we note that these two conditions
are in fact equivalent for a = 1. The following result extends Theorem 8.1.55 as it also deals
with even characteristic.
8.1.58 Theorem [3076] For q, s, `, e, r, a satisfying q − 1 = `s, (r, s) = 1, (e, `) = 1, r, e > 0 and
a ∈ F∗q , suppose (−a)` 6= 1 and (z + a/z)s = 1 for every (2`)-th root of unity z. Then
P (x) = xr (xes + a) is a permutation binomial of Fq if and only if (2r + es, 2`) ≤ 2.
8.1.59 Theorem [2020] Suppose $x^r (x^{es} + a)$ permutes $\mathbb{F}_p$, where $a \in \mathbb{F}_p^*$ and $r, e, s > 0$ such that
$p - 1 = \ell s$ and $(\ell, e) = 1$. Then $s \ge \sqrt{p - 3/4} - 1/2 > \sqrt{p} - 1$.
8.1.60 Remark For earlier results on permutation binomials, we refer to [569, 2018, 2020, 2127,
2680, 2681, 2825, 2905, 2913].
8.1.61 Theorem [65] Let q − 1 = `s. Assume that f (ζ t )s = 1 for any t = 0, . . . , ` − 1. Then
P (x) = xr f (xs ) is a PP of Fq if and only if (r, q − 1) = 1.
8.1.62 Corollary Let q − 1 = `s and g be any polynomial over Fq . Then P (x) = xr g(xs )` is a PP
of Fq if and only if (r, q − 1) = 1 and g(ζ t ) 6= 0 for all 0 ≤ t ≤ ` − 1.
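The corollary lends itself to an exhaustive check over small prime fields. The sketch below compares the stated criterion with a direct permutation test for $g(x) = x + c$; restricting to prime $q$ and to linear $g$ is an assumption made to keep the example short, and the helper names are illustrative.

```python
from math import gcd


def element_of_order(ell, p):
    """An element of multiplicative order exactly ell in F_p (ell must divide p - 1)."""
    for z in range(1, p):
        if pow(z, ell, p) == 1 and all(pow(z, d, p) != 1 for d in range(1, ell)):
            return z
    raise ValueError("ell does not divide p - 1")


def is_pp(func, p):
    return len({func(x) % p for x in range(p)}) == p


def criterion(r, g, ell, p):
    """(r, q - 1) = 1 and g(zeta^t) != 0 for all t, as in the corollary."""
    zeta = element_of_order(ell, p)
    return gcd(r, p - 1) == 1 and all(g(pow(zeta, t, p)) % p != 0 for t in range(ell))


def agrees_everywhere(p):
    """Compare criterion and brute force for all ell | p - 1, 1 <= r < p, g(x) = x + c."""
    for ell in (d for d in range(1, p) if (p - 1) % d == 0):
        s = (p - 1) // ell
        for r in range(1, p):
            for c in range(p):
                g = lambda y, c=c: y + c
                P = lambda x, r=r, g=g: pow(x, r, p) * pow(g(pow(x, s, p)), ell, p)
                if is_pp(P, p) != criterion(r, g, ell, p):
                    return False
    return True


print(agrees_everywhere(7), agrees_everywhere(13))
```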
8.1.63 Corollary [65, 1833] Let $p$ be a prime, $\ell$ be a positive integer and $v$ be the order of $p$ in
$\mathbb{Z}/\ell\mathbb{Z}$. For any positive integer $n$, let $q = p^m = p^{\ell v n}$ and $\ell s = q - 1$. Assume $f$ is a polynomial
in $\mathbb{F}_{p^{vn}}[x]$. Then the polynomial $P(x) = x^r f(x^s)$ is a PP of $\mathbb{F}_q$ if and only if $(r, q - 1) = 1$
and $f(\zeta^t) \ne 0$ for all $0 \le t \le \ell - 1$.
8.1.64 Remark Corollary 8.1.63 is reformulated as Theorem 2.3 in [3076]. Namely, let $\ell, r > 0$
satisfy $\ell s = q - 1$. Suppose $q = q_0^m$ where $q_0 \equiv 1 \pmod{\ell}$ and $\ell \mid m$, and $f \in \mathbb{F}_{q_0}[x]$. Then
$P(x) = x^r f(x^s)$ permutes $\mathbb{F}_q$ if and only if $(r, s) = 1$ and $f$ has no roots in $\mu_\ell$, the set
of $\ell$-th roots of unity.
8.1.65 Theorem [65] Let $q - 1 = \ell s$, and suppose that $\overline{\mathbb{F}}_q$ (the algebraic closure of $\mathbb{F}_q$) contains a
primitive $(j\ell)$-th root of unity $\eta$. Assume that $\left(\eta^{-ut} f(\eta^{jt})\right)^s = 1$ for any $t = 0, \ldots, \ell - 1$
and a fixed $u$. Moreover assume that $j \mid us$. Then $P(x) = x^r f(x^s)$ is a PP of $\mathbb{F}_q$ if and only
if $(r, s) = 1$ and $(r + \frac{us}{j}, \ell) = 1$.
8.1.66 Remark Some concrete classes of PPs satisfying the above assumptions can be found in
[60, 65, 2033, 3075, 3076]. For example, let hk (x) := xk + · · · + x + 1. Then the permutation
behavior of the polynomials xr hk (xs ) = xr (xks + · · · + xs + 1) and xr hk (xes )t has been
studied in detail. Moreover, for certain choices of indices ` and finite fields Fq (for example,
p ≡ −1 (mod 2`) where ` > 1 is either odd or 2`1 with `1 odd), several concrete classes of
PPs can be obtained [65, 3075, 3076].
8.1.67 Remark When ` ≤ 5, much simpler descriptions involving congruences and gcd conditions
can be found in [60, 2033]. A reformulation of these results in terms of roots of unity can be
found in [3075] which also covers the case of ` = 7. For larger index `, one can also construct
PPs of this form when ` is an odd prime such that ` < 2p + 1.
8.1.68 Theorem [60] Let $\ell$ be an odd prime such that $\ell < 2p + 1$. Then $P(x) = x^r (x^{ks} + \cdots + x^s + 1)$
is a PP of $\mathbb{F}_q$ if and only if $(r, s) = 1$, $(\ell, k + 1) = 1$, $(2r + ks, \ell) = 1$, and $(k + 1)^s \equiv 1 \pmod{p}$.
8.1.69 Remark There are several results on new classes of PPs when using additive group endo-
morphisms ψ, ψ̄, and ϕ in Theorem 8.1.39 [62, 2003, 3046, 3077].
8.1.70 Theorem [62] Consider any polynomial g ∈ Fqn [x], any additive polynomials ϕ, ψ ∈ Fqn [x],
any Fq -linear polynomial ψ̄ ∈ Fqn [x] satisfying ϕ ◦ ψ = ψ̄ ◦ ϕ and #ψ(Fqn ) = #ψ̄(Fqn ), and
any polynomial h ∈ Fqn [x] such that h(ψ(Fqn )) ⊆ Fq \ {0}. Then
1. f (x) := h(ψ(x))ϕ(x) + g(ψ(x)) permutes Fqn if and only if
a. ker(ϕ) ∩ ker(ψ) = {0}; and
b. f¯(x) := h(x)ϕ(x) + ψ̄(g(x)) is a bijection between ψ(Fqn ) and ψ̄(Fqn ).
2. For any fixed $h$, $\phi$, $\psi$ and $\bar{\psi}$ satisfying the above hypothesis and Part 1.a, there
are $(\#\mathrm{im}(\psi))! \cdot (\#\ker(\bar{\psi}))^{\#\mathrm{im}(\psi)}$ such permutation functions $f$ (when $g$ varies)
(where $\psi$ and $\bar{\psi}$ are viewed as endomorphisms of $(\mathbb{F}_{q^n}, +)$).
3. Assume in addition that ψ̄ ◦ g |im(ψ) = 0. Then f (x) = h(ψ(x))ϕ(x) + g(ψ(x))
permutes Fqn if and only if ker(ϕ)∩ker(ψ) = {0} and h(x)ϕ(x) induces a bijection
from ψ(Fqn ) to ψ̄(Fqn ).
4. Assume in addition that ϕ ◦ ψ = 0, and that g(x) restricted to im(ψ) is a permu-
tation of im(ψ). Then f (x) = h(ψ(x))ϕ(x) + g(ψ(x)) permutes Fqn if and only
if ker(ϕ) ∩ ker(ψ) = {0} and ψ̄ restricted to im(ψ) is a bijection between im(ψ)
and im(ψ̄).
8.1.71 Theorem [62, 3046] Let $q = p^e$ for some positive integer $e$.
1. If $k$ is an even integer or $k$ is odd and $q$ is even, then $f_{a,b,k}(x) = a x^q + b x + (x^q - x)^k$,
$a, b \in \mathbb{F}_{q^2}$, permutes $\mathbb{F}_{q^2}$ if and only if $b - a^q \in \mathbb{F}_q^*$ and $a + b \ne 0$.
2. If $k$ and $q$ are odd positive integers, then $f_{a,k}(x) = a x^q + a^q x + (x^q - x)^k$, $a \in \mathbb{F}_{q^2}^*$
and $a + a^q \ne 0$, permutes $\mathbb{F}_{q^2}$ if and only if $(k, q - 1) = 1$.
8.1.72 Remark The classes fa,b,k with a, b ∈ Fq and k even and fa,k for a ∈ Fq and p and k
odd were first constructed in [62]. The remaining classes were obtained in [3046]. For other
concrete classes of PPs of additive groups, we refer to [62, 325, 730, 1832, 2003, 3046, 3077].
8.1.73 Remark In this subsection, we give several other constructions of PPs that can be obtained
by using arbitrary surjective maps λ and λ̄ in Theorem 8.1.39, instead of using multiplicative
or additive group homomorphism.
8.1.74 Theorem [62] Let q be a prime power, let n be a positive integer, and let L1 , L2 , L3 be
Fq -linear polynomials over Fq seen as endomorphisms of (Fqn , +). Let g ∈ Fqn [x] be such
that g(L3 (Fqn )) ⊆ Fq . Then f (x) = L1 (x) + L2 (x)g(L3 (x)) is a PP of Fqn if and only if
1. ker(Fy ) ∩ ker(L3 ) = {0}, for any y ∈ im(L3 ), where Fy (x) := L1 (x) + L2 (x)g(y);
and
2. f¯(x) := L1 (x) + L2 (x)g(x) permutes L3 (Fqn ).
8.1.75 Remark The above result extends some constructions in [325, 730].
8.1.76 Theorem [62] Let q be any power of the prime number p, let n be any positive integer,
and let S be any subset of Fqn containing 0. Let h, k ∈ Fqn [x] be polynomials such that
h(0) 6= 0 and k(0) = 0, and let B ∈ Fqn [x] be any polynomial satisfying h(B(Fqn )) ⊆ S and
B(aα) = k(a)B(α) for all a ∈ S and all α ∈ Fqn . Then the polynomial f (x) = xh(B(x)) is
a PP of Fqn if and only if f¯(x) = xk(h(x)) induces a permutation of the value set B(Fqn ).
8.1.77 Remark The case that S = Fq and k(x) = x2 was considered in [2003]. Some examples
of B are given in [62]. It is remarked in [62] that Theorem 8.1.76 can be generalized for
f (x) = A(x)h(B(x)) and f¯(x) = C(x)k(h(x)) where A, C ∈ Fqn [x] are polynomials such
that B(A(x)) = C(B(x)) with C(0) = 0 and A is injective on B −1 (s) for each s ∈ B(Fqn ),
under the similar assumptions h(B(Fqn )) ⊆ S \ {0} and B(aα) = k(a)B(α) for all a ∈ S
and all α ∈ Fqn .
8.1.78 Definition [62] Let S ⊆ Fq and let γ, b ∈ Fq . Then γ is a b-linear translator with respect
to S for the mapping F : Fq −→ Fq , if
F (x + uγ) = F (x) + ub
8.1.79 Remark The above definition is a generalization of the concept of b-linear translator studied
in [593, 595, 1817], which deals with the case q = pmn , and S = Fpm . The relaxation on
the condition for S to be any subset of Fq provides a much richer class of functions (see
examples in [62]). Using the original definition of linear translators, several classes of PPs
of the form G(x) + γT r(H(x)) are constructed in [593, 595, 1817]. In the case that G is also
a PP, it is equivalent to constructing polynomials of the form x + γT r(H 0 (x)).
8.1.80 Theorem [62] Let S ⊆ Fq and F : Fq −→ S be a surjective map. Let γ ∈ Fq be a b-linear
translator with respect to S for the map F . Then for any G ∈ Fq [x] which maps S into S,
we have that x + γG(F (x)) is a PP of Fq if and only if x + bG(x) permutes S.
8.1.82 Corollary [62, 593] Under the conditions of Theorem 8.1.80, we have
1. If G(x) = x then x + γF (x) is a PP of Fq if and only if b 6= −1.
2. If q is odd and 2S = S, then x + γF (x) is a complete mapping of Fq if and only
if b 6∈ {−1, −2}.
8.1.83 Theorem [1817] Let L : Fqn −→ Fqn be an Fq -linear mapping of Fqn with kernel αFq , α 6= 0.
Suppose α is a b-linear translator with respect to Fq for the mapping f : Fqn −→ Fq and
h : Fq −→ Fq is a permutation of Fq . Then the mapping G(x) = L(x) + γh(f (x)) permutes
Fqn if and only if b 6= 0 and γ does not belong to the image set of L.
8.1.84 Corollary [1817] Let t be a positive integer with (t, q − 1) = 1, H ∈ Fqn [x] and γ, β ∈ Fqn .
t
Then the mapping G(x) = xq − x + γ (T r(H(xq − x)) + βx) is a PP of Fqn if and only if
T r(γ) 6= 0 and T r(β) 6= 0.
8.1.85 Theorem [3046] Let $A$ be a finite field, $S$ and $\bar{S}$ be finite sets with $\#S = \#\bar{S}$ such that the
maps $\psi : A \to S$ and $\bar{\psi} : A \to \bar{S}$ are surjective and $\bar{\psi}$ is additive, i.e., $\bar{\psi}(x + y) = \bar{\psi}(x) + \bar{\psi}(y)$,
for all $x, y \in A$. Let $g : S \to A$ and $f : A \to A$ be maps such that $\bar{\psi}(f + g \circ \psi) = f \circ \psi$ and
$\bar{\psi}(g(\psi(x))) = 0$ for every $x \in A$. Then the map $f(x) + g(\psi(x))$ permutes $A$ if and only if $f$
permutes $A$.
8.1.86 Corollary [3046] Let $n$ and $k$ be positive integers such that $(n, k) = d > 1$, and let $s$ be any
positive integer with $s(q^k - 1) \equiv 0 \pmod{q^n - 1}$. Let $L_1(x) = a_0 x + a_1 x^{q^d} + \cdots + a_{n/d-1} x^{q^{n-d}}$
be a polynomial with $L_1(1) = 0$ and let $L_2 \in \mathbb{F}_q[x]$ be a linearized polynomial and $g \in \mathbb{F}_{q^n}[x]$.
Then $f(x) = (g(L_1(x)))^s + L_2(x)$ permutes $\mathbb{F}_{q^n}$ if and only if $L_2$ permutes $\mathbb{F}_{q^n}$.
8.1.87 Corollary [3046] Let $n$ and $k$ be positive integers such that $(n, k) = d > 1$, let $s$ be any
positive integer with $s(q^k - 1) \equiv 0 \pmod{q^n - 1}$. Then $h(x) = (x^{q^k} - x + \delta)^s + x$ permutes
$\mathbb{F}_{q^n}$ for any $\delta \in \mathbb{F}_{q^n}$.
8.1.88 Remark More classes of PPs of the form $(x^{q^k} - x + \delta)^s + L(x)$ and their generalization can
be found in [1061, 3058]. See [1481, 3043, 3044, 3057] for more classes of PPs of the form
$(x^{p^k} + x + \delta)^s + L(x)$. One can also find several classes of PPs of the form $x^d + L(x)$ over $\mathbb{F}_{2^n}$
in [1929, 2363, 2362]. It is proven in [1929] that, under the assumption $\gcd(d, 2^n - 1) > 1$,
if $x^d + L(x)$ is a PP of $\mathbb{F}_{2^n}$ then $L$ must be a PP of $\mathbb{F}_{2^n}$. Hence some of these classes of PPs
of the form $x^d + L(x)$ are compositional inverses of PPs of the form $L_1(x)^d + x$.
8.1.89 Remark See [872, 3045] for some explicit classes of PPs over F3m .
8.1.90 Remark The permutational behavior of Dickson polynomials of the first kind is simple and
classical; see Remark 8.1.5. For more information on Dickson polynomials, see Section 9.6.
8.1.91 Definition For any positive integer $n$, let $E_n(x, a)$ be the Dickson polynomial of the second
kind (DPSK) defined by
\[ E_n(x, a) = \sum_{i=0}^{\lfloor n/2 \rfloor} \binom{n - i}{i} (-a)^i x^{n - 2i}. \]
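The explicit sum can be cross-checked against the three-term recurrence $E_n = x E_{n-1} - a E_{n-2}$ with $E_0 = 1$, $E_1 = x$. The recurrence is not stated in the text above; it follows from Pascal's rule applied to the binomial coefficients, and is used here only as an independent check. Polynomials are stored as dictionaries from exponents to integer coefficients.

```python
from math import comb


def dpsk_explicit(n, a):
    """Coefficients {exponent: coefficient} of E_n(x, a) from the defining sum."""
    return {n - 2 * i: comb(n - i, i) * (-a) ** i for i in range(n // 2 + 1)}


def dpsk_recurrence(n, a):
    """E_n(x, a) via E_n = x * E_{n-1} - a * E_{n-2}, E_0 = 1, E_1 = x (assumed recurrence)."""
    prev, cur = {0: 1}, {1: 1}
    if n == 0:
        return prev
    for _ in range(n - 1):
        nxt = {}
        for e, c in cur.items():          # x * E_{n-1}
            nxt[e + 1] = nxt.get(e + 1, 0) + c
        for e, c in prev.items():         # - a * E_{n-2}
            nxt[e] = nxt.get(e, 0) - a * c
        prev, cur = cur, nxt
    return cur


print(dpsk_explicit(4, 1))     # x^4 - 3x^2 + 1
print(dpsk_recurrence(4, 1))
```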
8.1.92 Remark Matthews observed in his Ph.D. thesis [2034] that if $q$ is a power of an odd prime
$p$ and $n$ satisfies the system of congruences
\[ n + 1 \equiv \pm 2 \pmod{p}, \qquad n + 1 \equiv \pm 2 \pmod{\tfrac{1}{2}(q - 1)}, \qquad n + 1 \equiv \pm 2 \pmod{\tfrac{1}{2}(q + 1)}, \tag{8.1.4} \]
8.1.93 Definition [1547] For any positive integer $n$, the reversed Dickson polynomial, $D_n(a, x)$, is
defined by
\[ D_n(a, x) = \sum_{i=0}^{\lfloor n/2 \rfloor} \frac{n}{n - i} \binom{n - i}{i} (-x)^i a^{n - 2i}. \]
8.1.94 Remark It is easy to check that Dn (0, x) = 0 if n is odd and Dn (0, x) = 2(−x)k if n = 2k.
Hence Dn (0, x) is a PP of Fq if and only if q is odd, n = 2k, and (k, q − 1) = 1. For a 6= 0,
we can also check that Dn (a, x) = an Dn (1, x/a2 ). Hence Dn (a, x) is a PP of Fq where a 6= 0
if and only if Dn (1, x) is a PP of Fq . If Dn (1, x) is a PP of Fq , then (q, n) is a desirable pair.
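Whether a pair $(p, n)$ with $p$ prime is desirable can be tested by evaluating $D_n(1, x)$ on $\mathbb{F}_p$. The sketch below assumes a prime field and uses the fact that $\frac{n}{n-i}\binom{n-i}{i}$ is an integer, so the coefficients can be formed with exact integer division before reduction modulo $p$; the function names are illustrative.

```python
from math import comb


def reversed_dickson_values(n, p):
    """Values of D_n(1, x) for x = 0, ..., p - 1 over the prime field F_p."""
    coeffs = [
        (n * comb(n - i, i) // (n - i)) * (-1) ** i   # coefficient of x^i
        for i in range(n // 2 + 1)
    ]
    return [sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p for x in range(p)]


def is_desirable(p, n):
    """True if D_n(1, x) permutes F_p, that is, (p, n) is a desirable pair."""
    return len(set(reversed_dickson_values(n, p))) == p


print(is_desirable(7, 9))   # n = p + 2 with p ≡ 1 (mod 3)
print(is_desirable(5, 3))   # D_3(1, x) = 1 - 3x
print(is_desirable(7, 4))   # D_4(1, x) = 1 - 4x + 2x^2 is quadratic, hence not a PP
```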
8.1.95 Definition A mapping f from Fq to Fq is almost perfect nonlinear (APN) if the difference
equation f (x + a) − f (x) = b has at most two solutions for any fixed a 6= 0, b ∈ Fq .
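The definition translates directly into an exhaustive test. The sketch below works over a prime field $\mathbb{F}_p$ (an assumption; the APN property is mostly studied in characteristic two) and counts, for each nonzero $a$, how often each difference value $b$ occurs.

```python
def is_apn(f, p):
    """True if f(x + a) - f(x) = b has at most two solutions in F_p for all a != 0 and all b."""
    for a in range(1, p):
        counts = [0] * p
        for x in range(p):
            counts[(f((x + a) % p) - f(x)) % p] += 1
        if max(counts) > 2:
            return False
    return True


print(is_apn(lambda x: x**3, 7))   # the difference is a quadratic in x, so at most two roots
print(is_apn(lambda x: x, 7))      # the difference is constant, so every x is a solution
```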
8.1.96 Remark We refer to Section 9.2 for more information on APN functions and their applica-
tions.
8.1.97 Theorem [1547] Let q = pe with p a prime and e > 0. If p = 2 or p > 3 and n is odd, then
xn is an APN on Fq2 implies that Dn (1, x) is a PP of Fq , which also implies that xn is an
APN over Fq .
8.1.98 Theorem [1545] The pair $(p^e, n)$ is a desirable pair in each of the following cases:
p      e       n                                                                 reference
2              $2^k + 1$, $(k, 2e) = 1$                                          [1295, 1547]
2              $2^{2k} - 2^k + 1$, $(k, 2e) = 1$                                 [1547, 1690]
2      even    $2^e + 2^k + 1$, $k > 0$, $(k - 1, e) = 1$                        [1547]
2      5k      $2^{8k} + 2^{6k} + 2^{4k} + 2^{2k} - 1$                           [906, 1547]
3              $(3^k + 1)/2$, $(k, 2e) = 1$                                      [1547]
3      even    $3^e + 5$                                                         [1542]
5              $(5^k + 1)/2$, $(k, 2e) = 1$                                      [1479]
≥ 3            $p^k + 1$, $k \ge 0$, $p^k \equiv 1 \pmod{4}$, $v_2(e) \le v_2(k)$    [1547]
≥ 3            $p^e + 2$, $k \ge 0$, $p^e \equiv 1 \pmod{3}$                     [1479, 1547]
≥ 5            3
8.1.99 Remark Two pairs $(q, n_1)$ and $(q, n_2)$, where $n_1$ and $n_2$ are positive integers, are equivalent
if $n_1$ and $n_2$ are in the same p-cyclotomic coset modulo $q^2 - 1$, i.e., $D_{n_1}(1, x) \equiv D_{n_2}(1, x)
\pmod{x^q - x}$. No desirable pairs outside the ten families (up to equivalence) given in Theorem
8.1.98 are known. There are several papers on necessary conditions for a reversed Dickson
polynomial to be a PP [1544, 1545]. In particular, it is proved in [1545] that $(p^e, n)$ is a
desirable pair if and only if $f_n(x) = \sum_{j \ge 0} \binom{n}{2j} x^j$ is a PP of $\mathbb{F}_{p^e}$. However, it is not known
whether the above classes are the only non-equivalent desirable pairs. Several new classes
of reversed Dickson polynomials can be found in [1542, 1544].
8.1.100 Remark Dickson polynomials are used to prove the following class of PPs.
8.1.101 Theorem [1527] Let m ≥ 1 and 1 ≤ k, r ≤ m − 1 be positive integers satisfying $kr \equiv 1
\pmod{m}$. Let $q = 2^m$, $\sigma = 2^k$, and $Tr(x) := x + x^2 + \cdots + x^{2^{m-1}}$. For α, γ in {0, 1}, we
define
\[ H_{\alpha,\gamma}(x) := \gamma\, Tr(x) + \frac{\left(\alpha\, Tr(x) + \sum_{i=0}^{r-1} x^{\sigma^i}\right)^{\sigma+1}}{x^2}. \]
Then $H_{\alpha,\gamma}(x)$ is a PP of $\mathbb{F}_{2^m}$ if and only if $r + (\alpha + \gamma)m \equiv 1 \pmod 2$.
8.1.102 Remark Cyclic and Dickson PPs play a vital role in the Schur Conjecture from 1922
which postulated that if f is a polynomial with integer coefficients which is a PP of Fp
(when considered modulo p) for infinitely many primes p, then f must be a composition of
binomials axn + b and Dickson polynomials. This has been shown to be true; see [1109], the
notes to Chapter 7 of [1939] and [2192]. A proof without the use of complex analysis can be
found in [2827]. More generally, a matrix analogue of the Schur conjecture was studied in
[2178]. Let $\mathbb{F}_q^{m \times m}$ denote the ring of m × m matrices over the finite field $\mathbb{F}_q$. A polynomial
$f \in \mathbb{F}_q[x]$ is a permutation polynomial (PP) on $\mathbb{F}_q^{m \times m}$ if it gives rise to a permutation of
$\mathbb{F}_q^{m \times m}$. Using a characterization of PPs of the matrix ring over finite fields $\mathbb{F}_q$ in [396], it is
shown by Mullen in [2178] that any polynomial f with integral coefficients which permutes
the matrices of fixed size over a field of p elements for infinitely many p is a composition of
linear polynomials and Dickson polynomials $D_n(x, a)$ with $n \ne 3$ an odd prime and $a \ne 0$
an integer. Several related questions are also addressed in [2178]; see Section 9.7 for more
information on Schur’s conjecture.
8.1.103 Remark Let f be an integral polynomial of degree n ≥ 2. Cohen [676] proved one of
the Chowla and Zassenhaus conjectures [632] (concerning irreducible polynomials), which
postulated that if f is a PP over $\mathbb{F}_p$ of degree n modulo p for any $p > (n^2 - 3n + 4)^2$, then
f(x) + cx is not a PP of $\mathbb{F}_p$ unless c = 0. This shows that if both f and g are integral PPs
of degree n ≥ 2 over a large prime field, then their difference h = f − g cannot be a linear
polynomial cx where $c \ne 0$. Suppose that t ≥ 1 denotes the degree of h. More generally, it
is proved in [699] that t ≥ 3n/5. Moreover, if n ≥ 5 and t ≤ n − 3 then (t, n) > 1. Roughly
speaking, two PPs over a large prime field $\mathbb{F}_p$ cannot differ by a polynomial with degree less
than 3n/5.
8.1.104 Remark Evans [998] considers orthomorphisms, mappings θ with θ(0) = 0 so that θ and
θ(x) − x are both PPs of Fq . He studies connections between orthomorphisms, latin squares,
and affine planes. A map θ is an orthomorphism if and only if θ(x) − x is a complete map-
ping. Complete mappings of small degrees and existence of complete mappings (in partic-
ular, binomials) are studied in [2268]. Enumeration results for certain types of cyclotomic
orthomorphisms are provided in [2278]. It is proved in [2268] for odd q and in [2904] for
even q that the degree of a complete mapping is at most q − 3. It is known that families of
permutation polynomials of the form f (x) + cx can be used in the construction of maximal
sets of mutually orthogonal Latin squares [997, 2918]. Let C(f ) be the number of c in Fq
such that f (x) + cx is a permutation polynomial over Fq . Cohen’s theorem [676] on the
Chowla-Zassenhaus conjecture shows that C(f ) ≤ 1 if the degree n of f is not divisible by
p and q is sufficiently large compared to n. Chou showed that $C(f) \le q - 1 - n$ in his thesis
[624]. Then Evans, Greene, and Niederreiter proved $C(f) \le q - \frac{q-1}{n-1}$ in [1016], which also
proves a conjecture of Stothers [2729] when q is prime. In the case that q is an odd prime,
it gives the best possible result $C(f) \le (q - 3)/2$ for polynomials of the form $x^{(q+1)/2} + cx$.
A general bound for C(f) which implies Chou's bound was obtained by Wan, Mullen, and
Shiue in [2918], as well as a significant bound $C(f) \le r$ where r is the least nonnegative
residue of q − 1 modulo n under certain mild conditions. It is conjectured in [1016] that
$f(x) - f(0)$ is a linearized p-polynomial over $\mathbb{F}_q$ if $C(f) \ge \lfloor q/2 \rfloor$; this was proved to be true
for q = p or any monomial $f(x) = x^e$ in [1016]. Wan observed that this conjecture holds
also for $q = p^2$ from the results in [624, 1016]. Several other related results on the function
C(f ) can be found in [997, 2826, 2912].
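As a small illustration of the counting function C(f), the following sketch enumerates all c by brute force over a prime field $\mathbb{F}_p$. The restriction to prime fields and the coefficient-list representation are assumptions made for simplicity, and the names are illustrative.

```python
def is_permutation(values, p):
    """True if the list of p values hits every element of F_p exactly once."""
    return len(set(values)) == p

def count_c(coeffs, p):
    """Number of c in F_p such that f(x) + c*x permutes F_p.

    coeffs[i] is the coefficient of x^i in f; p is assumed prime.
    """
    def f(x):
        return sum(co * pow(x, i, p) for i, co in enumerate(coeffs)) % p

    f_values = [f(x) for x in range(p)]
    return sum(
        is_permutation([(f_values[x] + c * x) % p for x in range(p)], p)
        for c in range(p)
    )

def half_power(p):
    """Coefficient list of x^((p+1)/2) for an odd prime p."""
    return [0] * ((p + 1) // 2) + [1]
```

For p = 7 the call `count_c(half_power(7), 7)` returns 2, which equals (q − 3)/2, while the linear polynomial f(x) = x gives `count_c([0, 1], 7) == 6`.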
8.1.105 Remark Results on the cycle structure of monomials and of Dickson polynomials can be
found in [44] and [1934], respectively. Cycle decomposition, in particular, decomposition of
them into cycles of the same length (which are motivated by Turbo codes [2742, 2769]), are
studied in [44, 2152, 2296, 2495, 2496, 2519]. Moreover, we refer readers to [68, 575, 625] for
cycle structure of permutation polynomials with small Carlitz rank [575] or with full cycles.
8.1.106 Remark Finding the compositional inverse of a PP is a hard problem except for the
trivial, well-known classes such as the inverses of linear polynomials, monomials, and Dickson
polynomials. There are several papers on the explicit form of the inverses of some special
classes of permutation polynomials, for example, [727, 1817, 2205, 2206, 2942]. Because the
problem is equivalent to finding the inverses of PPs of the form $x^r f(x^s)$, the most general
result can be found in [2941].
8.1.107 Remark PPs are related to special functions. For example, Dobbertin constructed several
classes of PPs [903, 904] over finite fields of even characteristic and used them to prove
several conjectures on APN monomials. The existence of APN permutations on $\mathbb{F}_{2^{2n}}$ is a
long-term open problem in the study of vectorial Boolean functions. Hou [1541] proved that
there are no APN permutations over $\mathbb{F}_{2^4}$ and there are no APN permutations on $\mathbb{F}_{2^{2n}}$ with
coefficients in $\mathbb{F}_{2^n}$. Only recently the authors in [424] found the first APN permutation over
$\mathbb{F}_{2^6}$. However, the existence of APN permutations on $\mathbb{F}_{2^{2n}}$ for n ≥ 4 remains open. Over
finite fields of odd characteristic, a function f is a planar function if f(x + a) − f(x) is a
PP for each nonzero a; see Sections 9.2 and 9.5. In [874], Ding and Yuan constructed a new
family of planar functions over $\mathbb{F}_{3^m}$, where m is odd, and then obtained the first examples
of skew Hadamard difference sets, which are inequivalent to classical Paley difference sets.
Permutation polynomials of $\mathbb{F}_{3^{2h+1}}$ obtained from the Ree-Tits slice symplectic spreads in
$PG(3, 3^{2h+1})$ were studied in [192]. Later on, they were used in [871] to construct a family
of skew Hadamard difference sets in the additive group of this field. For more information
on these special functions and their applications, we refer the readers to Sections 9.2, 9.5,
and 14.6.
8.1.108 Remark Golomb and Moreno [1305] show that PPs are useful in the construction of circular
Costas arrays, which are useful in sonar and radar communications. They gave an equivalent
conjecture for circular Costas arrays in terms of permutation polynomials and provided some
partial results. The connection between Costas arrays and APN permutations of integer
rings Zn was studied in [917]. Composed with discrete logarithms, permutation polynomials
of finite fields are used to produce permutations of integer rings Zn with optimum ambiguity
and deficiency [2353, 2355], which generate APN permutations in many cases. Earlier results
on PPs of Zn can be found in Section 5.6 of [870] and [2171, 2460, 3062].
See Also
References Cited: [44, 60, 61, 62, 63, 64, 65, 68, 192, 325, 396, 424, 504, 551, 569, 575, 593,
595, 624, 625, 632, 652, 653, 667, 676, 677, 679, 680, 681, 688, 699, 727, 730, 733, 771, 840,
870, 871, 872, 874, 902, 903, 904, 906, 917, 997, 998, 1016, 1061, 1109, 1120, 1221, 1295,
1305, 1479, 1481, 1482, 1483, 1484, 1527, 1541, 1542, 1544, 1545, 1547, 1592, 1690, 1716,
1787, 1788, 1813, 1817, 1832, 1833, 1916, 1929, 1933, 1934, 1935, 1936, 1939, 1983, 1984,
1995, 1996, 1997, 2003, 2018, 2019, 2020, 2033, 2034, 2127, 2152, 2171, 2175, 2176, 2178,
2192, 2205, 2206, 2268, 2278, 2296, 2353, 2355, 2359, 2362, 2363, 2460, 2495, 2496, 2519,
2603, 2605, 2638, 2680, 2681, 2729, 2742, 2769, 2825, 2826, 2827, 2904, 2905, 2908, 2909,
2911, 2912, 2913, 2916, 2918, 2919, 2927, 2940, 2941, 2942, 2943, 2967, 3043, 3044, 3045,
3046, 3047, 3057, 3058, 3062, 3065, 3075, 3076, 3077]
8.2.2 Remark A permutation polynomial $f(x_1, \ldots, x_n)$ induces a mapping from $\mathbb{F}_q^n$ to $\mathbb{F}_q$ but
does not induce a one-to-one mapping unless n = 1.
8.2.3 Remark [2172] A combinatorial computation shows that there are $(q^n)!/((q^{n-1})!)^q$
permutation polynomials in n variables over $\mathbb{F}_q$.
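The count can be confirmed by exhaustive search for very small parameters. The sketch below assumes the usual convention that each map from $\mathbb{F}_q^n$ to $\mathbb{F}_q$ corresponds to exactly one polynomial of degree less than q in each variable, so that counting maps with equal-sized fibres counts permutation polynomials; field elements are merely labelled 0, ..., q − 1, and the function names are illustrative.

```python
from itertools import product
from math import factorial

def count_formula(q, n):
    """(q^n)! / ((q^(n-1))!)^q, computed exactly with integers."""
    return factorial(q ** n) // factorial(q ** (n - 1)) ** q

def count_brute_force(q, n):
    """Count maps F_q^n -> F_q that take every value exactly q^(n-1) times."""
    points = q ** n
    target = q ** (n - 1)
    return sum(
        all(table.count(v) == target for v in range(q))
        for table in product(range(q), repeat=points)
    )
```

The search space has $q^{q^n}$ tables, so this is only practical for cases such as (q, n) = (2, 2), (2, 3), or (3, 2).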
for all additive characters $\chi_{b_1}, \ldots, \chi_{b_m}$ of $\mathbb{F}_q$ with $(b_1, \ldots, b_m) \ne (0, \ldots, 0)$.
8.2.7 Corollary [542] A polynomial $f \in \mathbb{F}_q[x_1, \ldots, x_n]$ is a permutation polynomial over $\mathbb{F}_q$ if and
only if
\[ \sum_{(c_1, \ldots, c_n) \in \mathbb{F}_q^n} \chi(f(c_1, \ldots, c_n)) = 0 \]
8.2.16 Definition Two polynomials are equivalent if one can be transformed into the other by a
transformation of the form $x_i = \sum_{j=1}^{n} a_{ij} y_j + b_i$, 1 ≤ i ≤ n, where $a_{ij}, b_i \in \mathbb{F}_q$ and the
matrix $(a_{ij})$ is nonsingular.
8.2.17 Theorem [2227] Let $f \in \mathbb{F}_q[x_1, \ldots, x_n]$ with the degree of f at most two and n ≥ 2. For q
odd, f is a permutation polynomial over $\mathbb{F}_q$ if and only if f is equivalent to a polynomial
of the form $g(x_1, \ldots, x_{n-1}) + x_n$ for some g. For q even, f is a permutation polynomial
over $\mathbb{F}_q$ if and only if f is equivalent to a polynomial of the form $g(x_1, \ldots, x_{n-1}) + x_n$ or
$g(x_1, \ldots, x_{n-1}) + x_n^2$.
8.2.18 Theorem [2174] If q is odd and n ≥ 2, then f (x1 , . . . , xn ) of degree at most two is a non-
singular feedback function if and only if f (x1 , . . . , xn ) = cx1 + f0 (x2 , . . . , xn ), c ∈ Fq where
f0 (x2 , . . . xn ) is any polynomial in the variables x2 , . . . , xn of degree at most two over Fq .
8.2.19 Remark Niederreiter [2229] provides criteria for quadratic polynomials over Z to be per-
mutation polynomials modulo p, i.e., permutation polynomials over Fp , that involve the
rank of a matrix of coefficients. A result similar to Theorem 8.2.18 is obtained in [2174] for
q even; see the corresponding result for quadratic forms in Theorem 8.2.17. These results
provide an application of several variable permutation polynomials.
8.2.20 Remark [2736] Results are given on permutation polynomials in several variables over a
finite field and orthogonal systems of polynomials whose image spaces are allowed to be
arbitrary subfields of the finite field. The results developed in this paper are used to con-
struct additional complete sets of frequency squares, rectangles and hyperrectangles, and
orthogonal arrays. Some of the theorems present a relationship between permutation poly-
nomials of a finite field and field permutation functions, an implicit bound on the possible
number of functions in an orthogonal field system, and results related to the generation of
orthogonal field systems.
8.2.21 Remark For finite rings there are two concepts to distinguish: permutation polynomials
(as above) and strong permutation polynomials. The latter are defined via the cardinality
of the inverse image. Polynomials are strong permutation polynomials (or strong orthogonal
systems) in n variables if they can be completed to an orthogonal system of n polynomials
in n variables.
8.2.22 Theorem [1132] If R is a local ring whose maximal ideal has a minimal number m of
generators, then for every n > m there exists a permutation polynomial in n variables that
is not a strong permutation polynomial.
8.2.23 Corollary [1132] Every permutation polynomial in any number of variables over a local
ring R is strong if and only if R is a finite field.
8.2.24 Remark Wei and Zhang showed [2957] that if n ≤ m in the setting of Theorem 8.2.22, then
every orthogonal system of k polynomials in n variables can be completed to an orthogonal
system of n polynomials (and in particular, every permutation polynomial is strong).
See Also
[1939] Section 7.5 discusses permutations and orthogonal systems in several variables.
[2188] Considers bounds for value sets of polynomial vectors in several variables.
[2326] Considers an application of permutation polynomials and orthogonal systems
to pseudorandom number generation.
[3062] Considers permutation polynomials over finite commutative rings.
References Cited: [542, 1132, 1937, 1939, 1940, 2032, 2172, 2174, 2227, 2228, 2229, 2326,
2736, 2957, 3062]
8.3.1 Definition For f ∈ Fq [x], the value set of f is the set Vf = {f (a)|a ∈ Fq }; the cardinality
of Vf is denoted by #Vf .
8.3.2 Remark Every subset of Fq occurs as Vf for some f ∈ Fq [x] of degree at most q − 1 (by the
Lagrange Interpolation Formula); see Theorem 2.1.131.
8.3.3 Remark Any f ∈ Fq [x] satisfies #Vf ≤ q; equality occurs precisely when f is a permutation
polynomial; see Section 8.1.
8.3.4 Theorem Suppose f ∈ Fq [x] of degree n is not a permutation polynomial. Then:
8.3.5 Example [760] Let $q = r^k$ where r is a prime power and k is a positive integer. Then
$f(x) := x^r + x^{r-1}$ satisfies $\#V_f = q - q/r$, and hence achieves equality in (1).
8.3.6 Remark If $f \in \mathbb{F}_q[x]$ has degree n, then $\#V_f \ge \lceil q/n \rceil$ (since each $\alpha \in \mathbb{F}_q$ has at most n
preimages under f).
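Value sets are easy to compute directly for small fields. The sketch below works over a prime field $\mathbb{F}_p$ only (an assumption; Example 8.3.5 then corresponds to r = q = p and k = 1) and uses illustrative names.

```python
def value_set(coeffs, p):
    """Value set V_f of f = sum coeffs[i] * x^i over the prime field F_p."""
    return {
        sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p
        for x in range(p)
    }

def lower_bound(q, n):
    """The bound ceil(q / n) of Remark 8.3.6, in exact integer arithmetic."""
    return -(-q // n)
```

With p = 7, the polynomial $x^7 + x^6$ (coefficient list `[0] * 6 + [1, 1]`) has a value set of size 6 = q − q/r, and the monomial $x^3$ has value set {0, 1, 6}, whose size meets `lower_bound(7, 3)`.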
8.3.8 Theorem [549] Let $f \in \mathbb{F}_p[x]$ have degree n, where p is prime. If n < p and $\#V_f = \lceil p/n \rceil \ge 3$,
then n divides p − 1 and $f(x) = a(x + b)^n + c$ with $a, b, c \in \mathbb{F}_p$.
8.3.9 Theorem [2105] Let $f \in \mathbb{F}_q[x]$ be monic of degree n, where gcd(n, q) = 1 and $n \le \sqrt{q}$. If
$\#V_f = \lceil q/n \rceil$, then n divides q − 1 and $f(x) = (x + b)^n + c$ with $b, c \in \mathbb{F}_q$.
8.3.10 Problem Determine all minimal value set polynomials over $\mathbb{F}_{p^k}$. This is done for k ≤ 2 in
[2105].
8.3.11 Remark Minimal value set polynomials whose values form a subfield are characterized in
[351]. A connection between minimal value set polynomials and Frobenius non-classical
curves is given in [350].
8.3.12 Theorem [628, 1308] If $f(x) \in \mathbb{F}_q[x]$ is monic of degree n > 15, where $n^4 < q$ and
$\#V_f < 2q/n$, then f(x) has one of the forms:
8.3.13 Theorem [289] Let $f \in \mathbb{F}_p[x]$ have degree less than $\frac{3}{4}(p - 1)$, where p is prime. If $\#f(\mathbb{F}_p^*) = 2$
then f is a polynomial in $x^{(p-1)/d}$ for some d ∈ {2, 3}.
8.3.14 Remark This result indicates that some phenomena become apparent only when one
considers $\#f(\mathbb{F}_p^*)$ rather than $\#V_f$.
8.3.5 Examples
\[ D_n(x, a) = \sum_{i=0}^{\lfloor n/2 \rfloor} \frac{n}{n-i} \binom{n-i}{i} (-a)^i x^{n-2i}. \]
8.3.31 Remark See Section 9.6 for a discussion of Dickson polynomials over Fq .
8.3.32 Theorem [628] Suppose q is odd with $2^r \,\|\, (q^2 - 1)$. Then for each n ≥ 1, and each $a \in \mathbb{F}_q^*$,
\[ \#V_{D_n(x,a)} = \frac{q-1}{2(n, q-1)} + \frac{q+1}{2(n, q+1)} + \alpha, \]
where α = 1 if $2^{r-1} \,\|\, n$ and η(a) = −1; α = 1/2 if $2^t \,\|\, n$ and 1 ≤ t ≤ r − 2; α = 0 otherwise. Here η
denotes the quadratic character defined by η(0) = 0, η(a) = 1 if a is a square in $\mathbb{F}_q$ and
η(a) = −1 if a is a nonsquare in $\mathbb{F}_q$.
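The formula can be compared with a direct computation over small prime fields. The sketch below is only a numerical check under stated assumptions: q = p is an odd prime, Dickson polynomials are evaluated with the standard recurrence $D_k = x D_{k-1} - a D_{k-2}$ ($D_0 = 2$, $D_1 = x$), and α = 1 is used only when a is a nonsquare, which is how the reference to the quadratic character η in the theorem is read here. All function names are illustrative.

```python
from fractions import Fraction
from math import gcd

def dickson_values(n, a, p):
    """Value set of D_n(x, a) on F_p via D_k = x*D_{k-1} - a*D_{k-2}."""
    values = set()
    for x in range(p):
        prev, cur = 2 % p, x          # D_0 = 2, D_1 = x
        for _ in range(n - 1):
            prev, cur = cur, (x * cur - a * prev) % p
        values.add(cur)
    return values

def two_adic_valuation(m):
    """Largest t with 2^t dividing the positive integer m."""
    return (m & -m).bit_length() - 1

def predicted_size(n, a, p):
    """Right-hand side of the value-set formula for an odd prime p and a != 0."""
    r = two_adic_valuation(p * p - 1)
    t = two_adic_valuation(n)
    is_nonsquare = pow(a, (p - 1) // 2, p) == p - 1   # Euler's criterion
    if t == r - 1 and is_nonsquare:
        alpha = Fraction(1)
    elif 1 <= t <= r - 2:
        alpha = Fraction(1, 2)
    else:
        alpha = Fraction(0)
    return (Fraction(p - 1, 2 * gcd(n, p - 1))
            + Fraction(p + 1, 2 * gcd(n, p + 1)) + alpha)
```

For example, over $\mathbb{F}_5$ the polynomial $D_4(x, 1)$ takes 2 values and $D_4(x, 2)$ takes 3, matching α = 0 for the square a = 1 and α = 1 for the nonsquare a = 2.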
8.3.33 Remark If $(n_1, q^2 - 1) = (n_2, q^2 - 1)$, then $\#V_{D_{n_1}(x,a)} = \#V_{D_{n_2}(x,a)}$.
8.3.34 Theorem [628] Suppose q is even. Then for each n ≥ 1, and each $a \in \mathbb{F}_q^*$,
\[ \#V_{D_n(x,a)} = \frac{q-1}{2(n, q-1)} + \frac{q+1}{2(n, q+1)}. \]
8.3.35 Remark For further examples, see for instance [757, 758].
8.3.36 Remark There are many other papers describing results concerning value sets of polynomi-
als over finite fields; however, for lack of space, we are unable to precisely state them. Pages
379-381 of [1939] provide a wealth of descriptions of older papers dealing with value sets;
pages 388-389 of [1939] provide summaries of value set results for polynomials in several
variables. The paper [2826] presents the state of knowledge about value sets as of 1995.
8.3.37 Remark Since the publication of [1939], formulas for the number of polynomials of
degree q − 1 with a value set of cardinality k have been given in [772]. Paper [773] describes value
sets of diagonal equations over finite fields by giving a new proof of the Cauchy-Davenport
theorem. See [757] and [758] for a discussion of polynomials over $\mathbb{F}_{2^n}$ which take on each
nonzero value only a small number of times (at most six).
8.3.38 Remark Paper [1222] shows that if f is not a permutation polynomial over $\mathbb{F}_q$ and $q \ge n^4$,
then $\#V_f < q - q/(2n)$, while [760] shows that by using the polynomial $(x + 1)x^{q-1}$,
Wan's bound is sharp for every extension of the base field. The paper [57] discusses results
concerning the size of the intersection of the value sets of two nonconstant polynomials and
[1458] discusses lower bounds for the size of the value set for the polynomial $(x^m + b)^n$
improving the bound given in [1307]. Paper [2746] discusses cardinalities of value sets for
diagonal kinds of polynomials in several variables where the preimage points come from
subsets of the field rather than the entire field.
See Also
References Cited: [57, 285, 289, 350, 351, 549, 628, 629, 630, 667, 734, 757, 758, 760, 772,
773, 774, 1086, 1222, 1307, 1308, 1367, 1368, 1458, 1939, 2043, 2105, 2188, 2746, 2826, 2911,
2919, 2981, 2982]
8.4.2 Remark If $f \in \mathbb{F}_q[x]$ is exceptional over $\mathbb{F}_{q^k}$ for some k, then f is exceptional over $\mathbb{F}_q$.
8.4.4 Theorem [667] A polynomial $f \in \mathbb{F}_q[x]$ is exceptional over $\mathbb{F}_q$ if and only if every absolutely
irreducible factor of f(x) − f(y) in $\mathbb{F}_q[x, y]$ is a constant times x − y.
8.4.5 Corollary If $f \in \mathbb{F}_q[x]$ is exceptional, then there are integers $1 < e_1 < e_2 < \cdots < e_k$ such
that: f is exceptional over $\mathbb{F}_{q^n}$ if and only if n is not divisible by any $e_i$.
8.4.6 Corollary If $f \in \mathbb{F}_q[x]$ is exceptional, then there is an integer M > 1 such that f permutes
each field $\mathbb{F}_{q^m}$ for which m is coprime to M.
8.4.7 Corollary For g, h ∈ Fq [x], the composition g ◦ h is exceptional if and only if both g and h
are exceptional.
1. n is coprime to p, or
2. n is a power of p, or
3. $n = \frac{p^r(p^r - 1)}{2}$ where r > 1 is odd and p ∈ {2, 3}.
8.4.11 Theorem [1755, 2192] The indecomposable exceptional polynomials over $\mathbb{F}_q$ of degree
coprime to q are precisely the polynomials of the form $\ell_1 \circ f \circ \ell_2$ where $\ell_1, \ell_2 \in \mathbb{F}_q[x]$ are linear
and either
8.4.12 Theorem [1372, 1374] The indecomposable exceptional polynomials over $\mathbb{F}_q$ of degree
s(s − 1)/2, where $s = p^r > 3$ and $q = p^m$ with p prime, are precisely the polynomials
of the form $\ell_1 \circ f \circ \ell_2$ where $\ell_1, \ell_2 \in \mathbb{F}_q[x]$ are linear, r > 1 is coprime to 2m, and f is one
of the following polynomials:
8.4.13 Remark The proofs of Theorems 8.4.10 and 8.4.12 rely on the classification of finite simple
groups.
8.4.14 Theorem [1120, 1755] For prime p, the degree-p exceptional polynomials over $\mathbb{F}_{p^m}$ are
precisely the polynomials $\ell_1 \circ f \circ \ell_2$ where $\ell_1, \ell_2 \in \mathbb{F}_{p^m}[x]$ are linear and $f(x) = x(x^{(p-1)/r} - a)^r$
with r | (p − 1) and $a \in \mathbb{F}_{p^m}$ such that $a^{r(p^m - 1)/(p - 1)} \ne 1$.
8.4.15 Proposition [674, 840] Let L be a linearized polynomial (i.e., $L(x) = \sum_{i=0}^{d} a_i x^{p^i}$ with
$a_i \in \mathbb{F}_{p^m}$), and let $S(x) = x^j H(x)^k$ where $H \in \mathbb{F}_{p^m}[x]$ satisfies $L(x) = x^j H(x^k)$. Then S is
exceptional over $\mathbb{F}_{p^m}$ if and only if S has no nonzero roots in $\mathbb{F}_{p^m}$.
8.4.16 Proposition [1369] Let $s = p^r$ where p is an odd prime. If $a \in \mathbb{F}_{p^m}$ is not an (s − 1)-th
power, then
\[ \frac{(x^s - ax - a) \cdot (x^s - ax + a)^s + \left((x^s - ax + a)^2 + 4a^2 x\right)^{(s+1)/2}}{2x^s} \]
is an indecomposable exceptional polynomial over $\mathbb{F}_{p^m}$.
8.4.17 Proposition [1371] Let $s = 2^r$. If $a \in \mathbb{F}_{2^m}$ is not an (s − 1)-th power, then
\[ \frac{(x^s + ax + a)^{s+1}}{x^s} \cdot \left( \frac{x^s + ax}{x^s + ax + a} + T\!\left( \frac{a^2 x}{(x^s + ax + a)^2} \right) \right) \]
8.4.19 Theorem A permutation polynomial over $\mathbb{F}_q$ of degree at most $q^{1/4}$ is exceptional over $\mathbb{F}_q$.
8.4.20 Remark A weaker version of Theorem 8.4.19 was proved in [777]; the stated result is
obtained from the same proof by using the fact that an absolutely irreducible degree-d
bivariate polynomial over $\mathbb{F}_q$ has at least $q + 1 - (d - 1)(d - 2)\sqrt{q}$ roots in $\mathbb{F}_q \times \mathbb{F}_q$. For
proofs of this estimate, see [145, 1122, 1886]. A stronger (but false) version of this estimate
was stated in [1939], and [1222] deduced Theorem 8.4.19 from this false estimate. Finally,
[145] states a stronger version of Theorem 8.4.19, but the proof is flawed and when fixed it
yields Theorem 8.4.19.
8.4.21 Remark Up to composing with linears on both sides, the only known non-exceptional
permutation polynomials over $\mathbb{F}_q$ of degree less than $\sqrt{q}$ are $x^{10} + 3x$ over $\mathbb{F}_{343}$ and $\frac{(x+1)^N + 1}{x}$
over $\mathbb{F}_{2^{4r-1}}$, where r ≥ 3 and $N = (4^r + 2)/3$.
8.4.22 Remark Heuristics predict that “at random” there would be no permutation polynomials
over $\mathbb{F}_q$ of degree less than $\frac{q}{2 \log q}$.
8.4.23 Remark There are no known examples of non-exceptional permutation polynomials over
$\mathbb{F}_q$ of degree less than $\frac{q}{2 \log q}$ when q is prime.
8.4.24 Remark Nearly all known examples of permutation polynomials over $\mathbb{F}_q$ of degree less than
$\frac{q}{2 \log q}$ can be written as the restriction to $\mathbb{F}_q$ of a permutation π of an infinite algebraic
extension K of $\mathbb{F}_q$, where π is induced by a rational function in the symbols $\sigma^i(x)$, with σ
being a fixed automorphism of K. Such a permutation π may be viewed as an exceptional
rational function over the difference field (K, σ); see [703, 1912, 1913].
8.4.4 Miscellany
8.4.27 Remark Theorem 8.4.26 is called the Carlitz–Wan conjecture. It follows from Theo-
rems 8.4.10, 8.4.11, and 8.4.12. However, the known proofs of Theorems 8.4.10 and 8.4.12
rely on the classification of finite simple groups, whereas [688, 1369, 1898] present short
self-contained proofs of Theorem 8.4.26.
8.4.28 Theorem If f ∈ Z[x] is a permutation polynomial over Fp for infinitely many primes p,
then f is the composition of linear and Dickson polynomials.
8.4.29 Remark Theorem 8.4.28 was proved in [2563] when f has prime degree. It was shown in
[2192] (confirming an assertion in [2563]) that the full Theorem 8.4.28 follows quickly from
the main lemma in [2563] together with a group-theoretic result from [2564]. A different
proof of Theorem 8.4.28 appears in [1109, 1936, 2827], which combines this group-theoretic
result with Weil’s bound on the number of Fq -rational points on a genus-g curve over Fq .
8.4.30 Remark Theorem 8.4.28 is called the Schur conjecture, although Schur did not pose this
conjecture. The paper [1109] made the incorrect assertion that Schur had conjectured The-
orem 8.4.28 in [2563], and this assertion has become widely accepted despite its falsehood.
8.4.31 Remark The concept of exceptionality can be extended to rational functions or more general
maps between varieties [1373]. In particular, many exceptional rational functions arise as
coordinate projections of isogenies of elliptic curves [1115, 1370, 2193].
8.4.5 Applications
8.4.32 Remark Exceptional polynomials were used in [2785] to produce families of hyperelliptic
curves whose Jacobians have an unusually large endomorphism ring. These curves were
used in [770] to realize certain groups PSL2 (q) as Galois groups of extensions of certain
cyclotomic fields.
8.4.33 Remark Exceptional polynomials were used in [516, 2341] to produce curves whose Jacobian
is isogenous to a power of an elliptic curve, and in particular to produce maximal curves
(see Section 12.5).
8.4.35 Remark This result (together with exceptionality of f ) has been used to produce new
examples of binary sequences with ideal autocorrelation [862], cyclic difference sets with
Singer parameters [864], almost perfect nonlinear functions [863], and bent functions [864,
3012]. See Sections 10.3, 14.6, 9.2, and 9.3, respectively.
8.4.36 Remark For further results about the polynomials f from Lemma 8.4.34, including formulas
for a polynomial inducing the inverse of the permutation induced by f on F2m , see [905].
These polynomials are shown to be exceptional in [696, 697, 864, 905, 3073].
8.4.37 Remark The polynomials in cases 1 and 3 of Theorem 8.4.12 have been used to produce
branched coverings of the projective line in positive characteristic whose Galois group is
either symplectic [9] or orthogonal [8].
See Also
[58], [1368] For Davenport pairs, which are pairs (f, g) of polynomials in Fq [x]
such that f (Fqm ) = g(Fqm ) for infinitely many m. This notion
generalizes exceptionality, since f ∈ Fq [x] is exceptional if and
only if (f, x) is a Davenport pair.
[696], [697], [3073] For the factorization of f (x) − f (y) where f (x) is a polynomial
from case 1 or 3 of Theorem 8.4.12.
[696], [1900], [2190] For the discovery of some of the polynomials in Theorem 8.4.12.
[840] For a thorough study of exceptional polynomials using only the
Hermite–Dickson criterion, and the discovery of the polynomials
in Theorem 8.4.11 and Proposition 8.4.15.
References Cited: [8, 9, 58, 145, 516, 543, 667, 674, 688, 696, 697, 703, 770, 777, 840, 862,
863, 864, 905, 1109, 1115, 1120, 1122, 1222, 1368, 1369, 1370, 1371, 1372, 1373, 1374, 1755,
1886, 1898, 1900, 1912, 1913, 1936, 1939, 2126, 2190, 2192, 2193, 2341, 2563, 2564, 2785,
2827, 3012, 3073, 3074]
9
Special functions over finite fields
9.1.2 Remark In this section, we study single-output Boolean functions. Multi-output (or vecto-
rial) Boolean functions are studied in Section 9.2.
9.1.3 Remark Endowing $\mathbb{F}_2^n$ with the structure of $\mathbb{F}_{2^n}$ allows taking advantage of the field
structure for designing Boolean functions (this is even more true for multi-output Boolean
functions; see Section 9.2), despite the fact that the important parameters in applications
are more related to the vector space structure of $\mathbb{F}_2^n$ than to the field structure of $\mathbb{F}_{2^n}$.
9.1.4 Remark We emphasize that in all the definitions and propositions of this section, “$x \in \mathbb{F}_2^n$”
can be replaced by “$x \in \mathbb{F}_{2^n}$,” except when the coordinates of x are specifically involved or
when the Hamming weight of x plays a role.
9.1.5 Remark The simplest way of representing a Boolean function is its truth table. But this
representation gives little insight on the function. Another well-known way is with the
disjunctive and conjunctive normal forms. But these representations, which do not allow
uniqueness, are not well adapted to coding, cryptography, and sequences for communica-
tions, which are main applications of Boolean functions.
9.1.8 Definition Representation (9.1.1) is the algebraic normal form of f (in brief, ANF). The
terms $\prod_{i \in I} x_i$ are monomials.
9.1.12 Remark The transform giving the expression of the coefficients of the ANF by means of
the values of f , and vice versa, is the binary Möbius transform.
9.1.13 Remark Proposition 9.1.11 results in an algorithm for calculating the ANF from the truth
table of the function and vice versa, with complexity $O(n 2^n)$ (so, a little higher than linear,
since the size of the input f is $2^n$), see for example [523].
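An algorithm of this complexity is the standard in-place butterfly for the binary Möbius transform. The sketch below assumes that the truth table is a list of $2^n$ bits indexed by the integer whose binary digits are the coordinates of x, and that the ANF coefficients are indexed by monomial supports in the same way; these indexing conventions and the function name are assumptions, not taken from the handbook.

```python
def mobius_transform(table):
    """Binary Moebius transform of a list of 2^n bits (ANF <-> truth table).

    Entry u of the result is the XOR of table[x] over all x whose support
    is contained in the support of u. The transform is its own inverse.
    """
    out = list(table)
    size = len(out)
    step = 1
    while step < size:                       # one pass per variable: n passes
        for start in range(0, size, 2 * step):
            for i in range(start, start + step):
                out[i + step] ^= out[i]      # fold the bit = 0 half into the bit = 1 half
        step *= 2
    return out
```

Each of the n passes touches $2^{n-1}$ pairs, which gives the $O(n 2^n)$ cost. For example, the truth table `[0, 1, 1, 1]` of the OR of two variables maps to `[0, 1, 1, 1]`, the coefficients of $x_1 + x_2 + x_1 x_2$.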
9.1.14 Remark A multivariate representation similar to the ANF but over Z also exists, called
numerical normal form, with similar formulas involving the Möbius transform over Z
[530],[523, Section 8.2.1].
where $\Gamma_n$ is the set of integers obtained by choosing one element in each cyclotomic coset
of 2 (mod $2^n - 1$), o(j) is the size of the cyclotomic coset containing j, $A_j \in \mathbb{F}_{2^{o(j)}}$ and
$\mathrm{Tr}_{\mathbb{F}_{2^{o(j)}}/\mathbb{F}_2}$ is the trace function from $\mathbb{F}_{2^{o(j)}}$ to $\mathbb{F}_2$.
9.1.16 Remark The Aj ’s can be calculated by using the Mattson-Solomon polynomial [523, 1992].
9.1.17 Definition Representation (9.1.5) is the trace representation (or univariate representation)
of f .
9.1.18 Example Let n be even. Then we can define the function
$f(x) = \mathrm{Tr}_{\mathbb{F}_{2^n}/\mathbb{F}_2}(x^3) + \mathrm{Tr}_{\mathbb{F}_{2^{n/2}}/\mathbb{F}_2}(x^{2^{n/2}+1})$.
9.1.19 Remark Any Boolean function f can be simply represented in the form $\mathrm{Tr}_{\mathbb{F}_{2^n}/\mathbb{F}_2}(P(x))$
where P is a polynomial over $\mathbb{F}_{2^n}$, but there is no uniqueness of such a representation,
unless o(j) = n for every j such that $A_j \ne 0$.
9.1.20 Definition The Walsh transform $W_f$ of an n-variable Boolean function f is the discrete
Fourier transform (or Hadamard transform) of the sign function $(-1)^{f(x)}$. Given an
inner product $x \cdot y$ in $\mathbb{F}_2^n$ (for instance $x \cdot y = \sum_{i=1}^{n} x_i y_i$ over $\mathbb{F}_2^n$, or $x \cdot y = \mathrm{Tr}_{\mathbb{F}_{2^n}/\mathbb{F}_2}(xy)$
over $\mathbb{F}_{2^n}$), the value of the Walsh transform of f at $u \in \mathbb{F}_2^n$ is given by:
\[ W_f(u) = \sum_{x \in \mathbb{F}_2^n} (-1)^{f(x) + x \cdot u}, \]
where the sum is over the integers. The set $\{u \in \mathbb{F}_2^n \mid W_f(u) \ne 0\}$ is the Walsh support
of f.
9.1.21 Remark There exists also a normalized version, in which the value $W_f(u)$ above is divided
by $\sqrt{2^n}$, which simplifies the inverse Walsh transform formula (see below) but gives a
non-integer value.
9.1.22 Remark There exists an algorithm for calculating the Walsh transform whose complexity
is $O(n 2^n)$, see for example [523]. A more general definition and an example are given in
Section 9.3.
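One algorithm with this complexity is the fast Walsh-Hadamard butterfly. The sketch below assumes the inner product $x \cdot u = \sum_i x_i u_i$ and a truth table indexed by the integer encoding of x; `walsh_naive` is included only as a direct transcription of the defining sum for comparison. Both names are illustrative.

```python
def walsh_transform(truth_table):
    """Walsh transform of a Boolean function given as a list of 2^n bits.

    Entry u of the result is sum_x (-1)^(f(x) + x.u), with x.u the usual
    dot product of the bit vectors encoded by the integers x and u.
    """
    w = [1 - 2 * bit for bit in truth_table]   # sign function (-1)^f(x)
    size = len(w)
    step = 1
    while step < size:                         # n passes of 2^(n-1) butterflies
        for start in range(0, size, 2 * step):
            for i in range(start, start + step):
                a, b = w[i], w[i + step]
                w[i], w[i + step] = a + b, a - b
        step *= 2
    return w

def walsh_naive(truth_table):
    """Direct O(4^n) evaluation of the defining sum, for cross-checking."""
    size = len(truth_table)
    return [
        sum((-1) ** (truth_table[x] ^ (bin(x & u).count("1") & 1))
            for x in range(size))
        for u in range(size)
    ]
```

The two functions should agree on any truth table whose length is a power of two; the fast version replaces the $4^n$ terms of the naive sum by $n 2^{n-1}$ butterflies.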
9.1.23 Proposition (Inverse Walsh transform) [1992], [523, Section 8.2.2] For every $x \in \mathbb{F}_2^n$ and
every n-variable Boolean function f, we have
\[ \sum_{u \in \mathbb{F}_2^n} W_f(u) (-1)^{u \cdot x} = 2^n (-1)^{f(x)}. \]
9.1.24 Proposition (Parseval’s relation) [1992], [523, Section 8.2.2] For every n-variable Boolean
function f, we have
\[ \sum_{u \in \mathbb{F}_2^n} W_f^2(u) = 2^{2n}. \]
9.1.25 Proposition (Poisson summation formula) [523, Section 8.2.2] For every n-variable Boolean
function f, for every vector subspace E of $\mathbb{F}_2^n$, and for all elements a and b of $\mathbb{F}_2^n$, we have
\[ \sum_{u \in a + E^\perp} (-1)^{b \cdot u} W_f(u) = |E^\perp| \, (-1)^{a \cdot b} \sum_{x \in b + E} (-1)^{f(x) + a \cdot x}, \]
where $E^\perp$ is the orthogonal subspace of E (that is, $E^\perp = \{u \in \mathbb{F}_2^n \mid u \cdot x = 0 \text{ for all } x \in E\}$)
and $|E^\perp|$ denotes the cardinality of $E^\perp$.
9.1.26 Proposition [491] Let E and E′ be two supplementary vector subspaces of $\mathbb{F}_2^n$. Then, for
every element a of $\mathbb{F}_2^n$, we have
\[ \sum_{u \in a + E^\perp} W_f^2(u) = |E^\perp| \sum_{b \in E'} \left( \sum_{x \in b + E} (-1)^{f(x) + a \cdot x} \right)^{2}. \]
9.1.27 Definition The degree $d^\circ f$ of the ANF is the algebraic degree of the function: if f is given
by (9.1.1), respectively, by (9.1.2), then $d^\circ f = \max\{|I| \mid a_I \ne 0\} = \max\{\mathrm{wt}(u) \mid a_u \ne 0\}$.
9.1.28 Remark A function is affine (respectively, quadratic) if it has algebraic degree at most 1
(respectively, 2). Affine functions are those functions of the form $x \mapsto u \cdot x + \epsilon$, $u \in \mathbb{F}_2^n$,
$\epsilon \in \mathbb{F}_2$.
9.1.29 Example The function of Example 9.1.10 has algebraic degree 3.
9.1.30 Proposition [523, Section 8.2.1] Let f be represented by its trace representation (9.1.5).
Then f has algebraic degree $\max_{j \in \Gamma_n \mid A_j \ne 0} w_2(j)$, where $w_2(j)$ is the Hamming weight of the
binary expansion of j (called the 2-weight of j).
9.1.31 Example The function of Example 9.1.18 has algebraic degree 2 (i.e., is quadratic).
9.1.32 Proposition [523, Section 8.2.1] The algebraic degree of any n-variable Boolean function
f equals the maximum dimension of the subspaces $\{x \in \mathbb{F}_2^n \mid \mathrm{supp}(x) \subseteq I\}$, where I is any
subset of {1, . . . , n} (equivalently, of all affine subspaces of $\mathbb{F}_2^n$), on which f takes value 1
an odd number of times.
9.1.33 Definition The Hamming weight of an n-variable Boolean function f is the integer
$\mathrm{wt}(f) = |\{x \in \mathbb{F}_2^n \mid f(x) = 1\}|$. The function is balanced if it has Hamming weight $2^{n-1}$
(that is, if its output is uniformly distributed over $\mathbb{F}_2$).
9.1.34 Proposition An n-variable Boolean function has algebraic degree at most n − 1 if and only
if it has even Hamming weight.
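The link between algebraic degree and weight parity can be checked exhaustively for small n. The sketch below computes the ANF with the binary Möbius butterfly and reads off the degree; the bit-indexing of truth tables and the convention that the zero function gets degree 0 are assumptions made for this illustration.

```python
def algebraic_degree(truth_table):
    """Algebraic degree of a Boolean function given as a list of 2^n bits.

    The zero function is assigned degree 0 here (a convention of this sketch).
    """
    anf = list(truth_table)
    size = len(anf)
    step = 1
    while step < size:                       # binary Moebius transform, in place
        for start in range(0, size, 2 * step):
            for i in range(start, start + step):
                anf[i + step] ^= anf[i]
        step *= 2
    # The degree is the largest number of variables in a monomial with coefficient 1.
    return max((bin(u).count("1") for u in range(size) if anf[u]), default=0)
```

The coefficient of the full monomial $x_1 \cdots x_n$ is the XOR of all values of f, which is why the degree drops below n exactly when the weight is even. For example, the majority function on three variables, `[0, 0, 0, 1, 0, 1, 1, 1]`, has weight 4 and degree 2.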
9.1.35 Proposition (McEliece’s theorem) Let f be an n-variable Boolean function of algebraic
degree at most r with 0 < r < n. Then the Hamming weight of f is divisible by 2d r e−1 =
n
9.1.36 Proposition [1855], [523, Section 8.2.2] For $n \ge 2$ and $1 \le k \le n$, if the Walsh transform
of $f$ takes values divisible by $2^k$, then $f$ has algebraic degree at most $n - k + 1$.
9.1.37 Proposition A quadratic Boolean function $f$ is balanced if and only if its restriction to the
vector subspace $E_f = \{a \in \mathbb{F}_2^n : D_a f(x) := f(x) + f(x + a) \equiv \mathrm{cst}\} = \{a \in \mathbb{F}_2^n : D_a f(x) =
D_a f(0) \text{ for all } x \in \mathbb{F}_2^n\}$ (the linear kernel of $f$) is not constant. If it is not balanced, then
its Hamming weight equals $2^{n-1} \pm 2^{\frac{n+k}{2} - 1}$, where $k$ is the dimension of $E_f$.
9.1.38 Definition The Hamming distance between two n-variable Boolean functions $f$ and $g$
equals the size of the set $\{x \in \mathbb{F}_2^n : f(x) \neq g(x)\}$, that is, equals $\mathrm{wt}(f + g)$.
The nonlinearity $NL(f)$ of a Boolean function $f$ is its minimal distance to affine
functions.
9.1.39 Proposition [1992], [523, Section 8.4.1] For every n-variable Boolean function, we have
$$NL(f) = 2^{n-1} - \frac{1}{2} \max_{u \in \mathbb{F}_2^n} |W_f(u)|. \qquad (9.1.6)$$
9.1.40 Proposition [1992], [523, Section 8.4.1] Parseval’s relation implies the covering radius bound
$NL(f) \le 2^{n-1} - 2^{n/2 - 1}$.
9.1.41 Definition The Boolean functions achieving the covering radius bound with equality are
bent (see Section 9.3 and the references therein).
9.1.42 Proposition [1992] An n-variable Boolean function is bent if and only if its Walsh transform
takes values $\pm 2^{n/2}$ only ($n$ even).
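Equation (9.1.6) and the bent criterion of Proposition 9.1.42 translate directly into a computation on truth tables. The sketch below assumes the Walsh transform $W_f(u) = \sum_x (-1)^{f(x) + u \cdot x}$ and the same truth-table indexing by integers; the function names are illustrative only.

```python
def walsh_transform(truth_table):
    """Fast Walsh-Hadamard transform of the sign function (-1)^f.

    Returns the list w with w[u] = sum over x of (-1)^(f(x) + u.x), where
    u.x is the inner product of the bit vectors of the integers u and x.
    """
    w = [1 - 2 * bit for bit in truth_table]
    h = 1
    while h < len(w):
        for start in range(0, len(w), 2 * h):
            for j in range(start, start + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w


def nonlinearity(truth_table):
    """NL(f) = 2^(n-1) - (1/2) max_u |W_f(u)|, as in Equation (9.1.6)."""
    n = len(truth_table).bit_length() - 1
    return 2 ** (n - 1) - max(abs(v) for v in walsh_transform(truth_table)) // 2


def is_bent(truth_table):
    """True if n is even and every Walsh value equals +-2^(n/2)."""
    n = len(truth_table).bit_length() - 1
    if n % 2:
        return False
    return all(abs(v) == 2 ** (n // 2) for v in walsh_transform(truth_table))


# f(x1, ..., x4) = x1 x2 + x3 x4, a standard quadratic example in 4 variables.
quadratic = [((x & 1) & (x >> 1 & 1)) ^ ((x >> 2 & 1) & (x >> 3 & 1))
             for x in range(16)]
```

For this quadratic example the covering radius bound $2^{3} - 2^{1} = 6$ is met, while an affine function such as $x_1$ has nonlinearity 0.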
9.1.43 Definition For every $0 \le r \le n$, the r-th order nonlinearity $NL_r(f)$ of a Boolean function
$f$ equals its minimum Hamming distance to Boolean functions of algebraic degrees at
most $r$.
9.1.45 Proposition [522] Let $f$ be any n-variable function and $r$ a positive integer smaller than $n$.
Denoting (again) by $D_a f(x) = f(x) + f(x + a)$ the first-order derivatives of $f$, we have
$$NL_r(f) \ge \max\Bigl( \frac{1}{2} \max_{a \in \mathbb{F}_2^n} NL_{r-1}(D_a f)\,;\; 2^{n-1} - \frac{1}{2} \sqrt{2^{2n} - 2 \sum_{a \in \mathbb{F}_2^n} NL_{r-1}(D_a f)} \Bigr).$$
9.1.46 Remark Proposition 9.1.45, iteratively applied, allows deducing lower bounds on the higher
order nonlinearity of a function from lower bounds on the nonlinearities (i.e., the first-order
nonlinearities) of its higher-order derivatives.
9.1.47 Remark The parameters above (algebraic degree, Hamming weight, nonlinearity) are
invariant under composition of the Boolean function on the right by any affine automorphism
$x \mapsto L(x) + a$ (where $L$ is a linear bijection). The algebraic degree and the nonlinearity are
invariant under addition of any affine Boolean function (in the case of the algebraic degree,
though, the invariance needs the function to be non-affine) [1992].
9.1.48 Definition Two n-variable Boolean functions $f$ and $g$ are extended-affine equivalent (in
brief, EA-equivalent) if there exist a linear automorphism $L$, an affine Boolean function
$\ell$ and a vector $a$ such that $g(x) = f(L(x) + a) + \ell(x)$. A parameter is EA-invariant if it
is preserved by EA-equivalence.
9.1.49 Remark If $x$ is in $\mathbb{F}_2^n$ (and is viewed as a $1 \times n$ matrix over $\mathbb{F}_2$), then $L(x) = x \times A$ where $A$ is
a non-singular $n \times n$ matrix over $\mathbb{F}_2$. If $x$ lives in $\mathbb{F}_{2^n}$, then $L(x) = a_1 x + a_2 x^2 + \cdots + a_{n-1} x^{2^{n-1}}$
where $a_1, a_2, \ldots, a_{n-1}$ are chosen in $\mathbb{F}_{2^n}$ such that the kernel of $L$ is reduced to $\{0\}$.
9.1.50 Remark Little is known on the number of n-variable Boolean functions up to
EA-equivalence, except of course that it is larger than the number $2^{2^n}$ of Boolean functions
divided by the number $2^n (2^n - 1)(2^n - 2)(2^n - 2^2) \cdots (2^n - 2^{n-1})$ of affine automorphisms
of $\mathbb{F}_2^n$ and by the number $2^{n+1}$ of affine Boolean functions.
9.1.51 Remark There exists another notion of equivalence: the CCZ-equivalence; for vectorial
functions, it is more general than EA-equivalence, but for Boolean functions, it coincides
with EA-equivalence [444].
9.1.52 Remark Boolean functions are used in pseudo-random generators, in stream ciphers (in
conventional cryptography), to ensure a sufficient “nonlinearity” (since linear ciphers are
weak [2011]), see Section 16.2. They must then be balanced to avoid distinguishing attacks.
9.1.53 Remark A high algebraic degree of a Boolean function f ensures a good resistance of the
stream ciphers (using it as the nonlinear part) against the Berlekamp-Massey attack [2011]
and the Rønjom-Helleseth attack [2474] and its variants.
9.1.54 Remark A large nonlinearity (that is, a nonlinearity near the covering radius bound) is
an important cryptographic characteristic which ensures resistance to the fast correlation
attack [2075].
9.1.55 Remark Vectorial Boolean functions are also used in cryptography (in block ciphers); they
are addressed in Section 9.2.
9.1.56 Definition Let $n$ be a positive integer and $m < n$ a non-negative integer. A Boolean
function over $\mathbb{F}_2^n$ is m-resilient (respectively, m-th order correlation immune) if any of
its restrictions obtained by fixing at most $m$ of its input coordinates $x_i$ is balanced
(respectively, has the same output distribution as $f$ itself). The resiliency order of $f$ is the
largest value of $m$ such that $f$ is m-resilient.
9.1.57 Remark The notions of resilient and correlation immune functions make sense for $x \in \mathbb{F}_2^n$
only (that is, not for $x \in \mathbb{F}_{2^n}$).
9.1.58 Remark A function is m-resilient if and only if it is balanced and m-th order correlation
immune. Since Boolean functions used in stream ciphers must be balanced, we shall address
only resiliency in the sequel.
9.1.59 Remark The resiliency order quantifies the resistance to the Siegenthaler correlation attack
[2663] of a stream cipher using $f$ as a combiner (that is, combining the outputs of several linear
feedback shift registers by applying $f$, see Section 16.2).
9.1.60 Remark The resiliency order is not EA-invariant. It is invariant under permutations of the
coordinates of x ∈ Fn2 .
9.1.61 Proposition (Siegenthaler’s bound) [2663], [523, Section 8.7] Any m-resilient n-variable
Boolean function has algebraic degree smaller than or equal to n − m − 1 if 0 ≤ m < n − 1
and is affine if m = n − 1.
9.1.62 Proposition (Xiao-Massey’s characterization) [3015] Any n-variable Boolean function $f$ is
m-resilient if and only if $W_f(u) = 0$ for all $u \in \mathbb{F}_2^n$ such that $0 \le \mathrm{wt}(u) \le m$.
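The Xiao-Massey characterization gives a direct way to compute the resiliency order from the Walsh transform. The following is a small sketch under the same truth-table convention as before (bit i of the index is the (i+1)-th coordinate); `resiliency_order` is an illustrative name, and it returns -1 for unbalanced functions as a convention chosen here.

```python
def walsh_transform(truth_table):
    """w[u] = sum over x of (-1)^(f(x) + u.x), computed by a fast transform."""
    w = [1 - 2 * bit for bit in truth_table]
    h = 1
    while h < len(w):
        for start in range(0, len(w), 2 * h):
            for j in range(start, start + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w


def resiliency_order(truth_table):
    """Largest m with W_f(u) = 0 whenever wt(u) <= m; -1 if f is unbalanced."""
    w = walsh_transform(truth_table)
    n = len(truth_table).bit_length() - 1
    order = -1
    for weight in range(n + 1):
        if any(w[u] for u in range(len(w)) if bin(u).count("1") == weight):
            break                        # some Walsh value of this weight is nonzero
        order = weight
    return order


# Three small examples in 3 variables.
parity = [bin(x).count("1") & 1 for x in range(8)]                 # x1 + x2 + x3
mixed = [(x & 1) ^ ((x >> 1 & 1) & (x >> 2 & 1)) for x in range(8)]  # x1 + x2 x3
product = [(x & 1) & (x >> 1 & 1) for x in range(8)]               # x1 x2
```

The parity function reaches order n - 1 = 2 and is affine, which matches the extreme case of Siegenthaler’s bound.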
9.1.63 Proposition (Improved Sarkar-Maitra’s bound) [520, 534, 2522], [523, Section 8.7] Let $f$ be
any n-variable m-resilient function ($m \le n - 2$) and let $d$ be its algebraic degree. The values
of the Walsh transform of $f$ are divisible by $2^{m + 2 + \lfloor (n - m - 2)/d \rfloor}$. Hence, according to Equation
9.1.64 Remark The divisibility property in Proposition 9.1.63, combined with known bounds on
the nonlinearity, implies improvements of these bounds (as shown in Proposition 9.1.63 for
the covering radius bound).
9.1.67 Remark The algebraic immunity quantifies the resistance to the algebraic attack [741] of
the ciphers using f as a combiner, or as a filter (taking as input n bits at fixed positions in
an LFSR, see Section 16.2).
9.1.68 Remark The set of annihilators of f equals the ideal of all multiples of f + 1.
9.1.69 Remark Let $g$ be any function of algebraic degree at most $d$, then expressing that $g$ is an
annihilator of $f$ (that is, $f(x) = 1$ implies $g(x) = 0$, for every $x \in \mathbb{F}_2^n$) by means of the
(unknown) coefficients of the ANF of $g$ results in a system of homogeneous linear equations.
In this system, we have $\sum_{i=0}^{d} \binom{n}{i}$ variables (the coefficients of the monomials of
9.1.71 Remark Let f filter an LFSR. Let us assume for instance that the LFSR has length 256
(it can then be initialized for instance with a key of length 128 and an IV of length 128
as well). Then Proposition 9.1.70 requires n ≥ 16 to ensure a complexity of the algebraic
attack larger than exhaustive search, see e.g., [490]. A minimal security margin would be
n ≥ 18. Algebraic attacks have therefore changed the number of variables of the Boolean
functions used in practice (before them, for reasons of efficiency, the number of variables
was rarely more than 10).
9.1.72 Proposition (Bounds on the weight, Lobanov’s bound on the nonlinearity) [523, Section
8.9] For every $n$ and every n-variable Boolean function $f$, we have
$$\sum_{i=0}^{AI(f)-1} \binom{n}{i} \le \mathrm{wt}(f) \le \sum_{i=0}^{n-AI(f)} \binom{n}{i}$$
(in particular, if $n$ is odd and $f$ has optimal algebraic immunity $\frac{n+1}{2}$, then $f$
is balanced) and $NL(f) \ge 2 \sum_{i=0}^{AI(f)-2} \binom{n-1}{i}$.
9.1.73 Remark Bounds also exist for the higher order nonlinearities [523, Section 8.9], [2085].
9.1.74 Proposition [490] If an n-variable balanced Boolean function $f$, with $n$ odd, has no non-zero
annihilator of algebraic degree at most $\frac{n-1}{2}$, then it has optimal algebraic immunity.
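The linear-algebra test of Remark 9.1.69 can be sketched in a few lines for small $n$. The code below assumes the usual definition of algebraic immunity (the minimum algebraic degree of a nonzero annihilator of $f$ or of $f + 1$); the function names and the bitmask encoding of the linear system are choices made for this illustration only.

```python
def has_nonzero_annihilator(points, n, d):
    """True if some nonzero g of degree <= d vanishes on every given point.

    Each point x gives one homogeneous equation over F_2 in the unknown ANF
    coefficients of g: the monomial indexed by u takes value 1 at x exactly
    when every set bit of u is also set in x.
    """
    monomials = [u for u in range(1 << n) if bin(u).count("1") <= d]
    basis = []                            # reduced rows, stored as bitmasks
    for x in points:
        row = 0
        for j, u in enumerate(monomials):
            if u & x == u:
                row |= 1 << j
        for b in basis:
            row = min(row, row ^ b)       # clear the leading bit of b if present
        if row:
            basis.append(row)
    # A nonzero solution exists iff the rank is below the number of unknowns.
    return len(basis) < len(monomials)


def algebraic_immunity(truth_table):
    """Smallest d such that f or f + 1 has a nonzero annihilator of degree <= d."""
    n = len(truth_table).bit_length() - 1
    support = [x for x, bit in enumerate(truth_table) if bit]
    cosupport = [x for x, bit in enumerate(truth_table) if not bit]
    for d in range(n + 1):
        if (has_nonzero_annihilator(support, n, d)
                or has_nonzero_annihilator(cosupport, n, d)):
            return d


# Majority function in 3 variables: value 1 when at least two inputs are 1.
majority = [1 if bin(x).count("1") >= 2 else 0 for x in range(8)]
```

The majority function in three variables is balanced and reaches the value 2, which is the optimal algebraic immunity $\frac{n+1}{2}$ for $n = 3$.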
9.1.75 Remark
1. Stream ciphers must also resist fast algebraic attacks, which work if one can find
$g \neq 0$ of low degree and $h$ of algebraic degree not much larger than $\lceil n/2 \rceil$, such
that $fg = h$ [735, 1446]. These attacks need more data (of a particular shape)
than standard algebraic attacks but can be faster if g has lower degree due to
the relaxation of the condition on h.
2. Stream ciphers must also resist algebraic attacks on the augmented function
[1069].
3. Finally, they must resist the already mentioned attack by Rønjom and Helleseth
(designed for the filter generator and later generalized) which requires f to have
an algebraic degree close to n.
9.1.77 Remark This characteristic is related to the diffusion of the cipher in which f is involved.
It is less important than the characteristics seen above, but it has attracted some attention.
9.1.78 Remark Other cryptographic characteristics (less essential than the algebraic degree, the
nonlinearity and the algebraic immunity) exist [523]: the non-existence of nonzero linear
structure, the global avalanche criterion, the maximum correlation to subsets, the algebraic
thickness, the nonhomomorphicity.
9.1.79 Remark Boolean functions being tools for the design of cryptosystems, an important as-
pect of the research in this domain is to design constructions of Boolean functions having
the necessary cryptographic features (contrary to other domains of the study of Boolean
functions, which mainly study their properties).
9.1.80 Remark A Boolean function obtained by some construction and satisfying a given crypto-
graphic criterion, or several criteria, will be considered as new if it is EA-inequivalent to all
previously found functions satisfying the same criteria.
9.1.81 Remark We call secondary the constructions which use already defined functions satisfying
a given property, to build a new one satisfying the same property. A construction from first
principles will be primary.
9.1.82 Proposition (Maiorana-McFarland’s construction) [485, 519], [523, Section 8.7] Let $r$ and $s$
be positive integers; let $g$ be any s-variable Boolean function and let $\phi$ be a mapping from
$\mathbb{F}_2^s$ to $\mathbb{F}_2^r$. Then the function
$$f_{\phi,g}(x, y) = x \cdot \phi(y) + g(y), \quad x \in \mathbb{F}_2^r,\ y \in \mathbb{F}_2^s, \qquad (9.1.7)$$
where “·” is an inner product in $\mathbb{F}_2^r$, is (at least) k-resilient where $k = \min\{\mathrm{wt}(\phi(y)),\ y \in
\mathbb{F}_2^s\} - 1$, and has nonlinearity satisfying:
$$2^{n-1} - 2^{r-1} \max_{a \in \mathbb{F}_2^r} |\phi^{-1}(a)| \le NL(f_{\phi,g}) \le 2^{n-1} - 2^{r-1} \Bigl\lceil \sqrt{\max_{a \in \mathbb{F}_2^r} |\phi^{-1}(a)|} \Bigr\rceil .$$
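A small instance of the Maiorana-McFarland construction can be built and checked numerically. The sketch below assumes $n = r + s$, the standard inner product, and a truth table indexed by $x + 2^r y$; the particular $\phi$ and $g$ are arbitrary choices for illustration, and all names are hypothetical.

```python
def maiorana_mcfarland(r, s, phi, g):
    """Truth table of f(x, y) = x . phi(y) + g(y), indexed by x + 2^r * y.

    phi is a list of 2^s integers below 2^r (the vectors phi(y));
    g is a list of 2^s bits (the truth table of g).
    """
    table = []
    for y in range(1 << s):
        for x in range(1 << r):
            inner = bin(x & phi[y]).count("1") & 1   # x . phi(y) over F_2
            table.append(inner ^ g[y])
    return table


def walsh_transform(truth_table):
    """w[u] = sum over x of (-1)^(f(x) + u.x), computed by a fast transform."""
    w = [1 - 2 * bit for bit in truth_table]
    h = 1
    while h < len(w):
        for start in range(0, len(w), 2 * h):
            for j in range(start, start + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w


def nonlinearity(truth_table):
    """NL(f) = 2^(n-1) - (1/2) max_u |W_f(u)|."""
    n = len(truth_table).bit_length() - 1
    return 2 ** (n - 1) - max(abs(v) for v in walsh_transform(truth_table)) // 2


# r = 3, s = 1: phi is injective and both of its values have Hamming weight 2.
f = maiorana_mcfarland(3, 1, phi=[0b110, 0b011], g=[0, 1])
```

Here $k = 2 - 1 = 1$ and $\max_a |\phi^{-1}(a)| = 1$, so both sides of the nonlinearity bound equal $2^3 - 2^2 = 4$.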
9.1.83 Remark This construction was originally developed for bent functions (see Section 9.3) and
has been later adapted to resilient functions.
9.1.84 Remark The resiliency order of fφ,g can be larger, for some well-chosen functions g [519].
9.1.85 Remark The Maiorana-McFarland construction has been generalized in several ways [523].
9.1.86 Proposition (Indirect sum) [521], [523, Section 8.7] Let $r$ and $s$ be positive integers and let
$0 \le t < r$ and $0 \le m < s$. Let $f_1$ and $f_2$ be two r-variable t-resilient functions. Let $g_1$ and
$g_2$ be two s-variable m-resilient functions. Then the function
$$h(x, y) = f_1(x) + g_1(y) + (f_1 + f_2)(x)\,(g_1 + g_2)(y); \quad x \in \mathbb{F}_2^r,\ y \in \mathbb{F}_2^s$$
is an (r + s)-variable (t + m + 1)-resilient function. The Walsh transform of $h$ takes the value
$$W_h(a, b) = \frac{1}{2} W_{f_1}(a) \bigl[ W_{g_1}(b) + W_{g_2}(b) \bigr] + \frac{1}{2} W_{f_2}(a) \bigl[ W_{g_1}(b) - W_{g_2}(b) \bigr]. \qquad (9.1.8)$$
If the Walsh transforms of $f_1$ and $f_2$ have disjoint supports and if the Walsh transforms
of $g_1$ and $g_2$ have disjoint supports, then
$$NL(h) = \min_{i,j \in \{1,2\}} \bigl( 2^{r+s-2} + 2^{r-1} NL(g_j) + 2^{s-1} NL(f_i) - NL(f_i)\, NL(g_j) \bigr). \qquad (9.1.9)$$
In particular, if $f_1$ and $f_2$ have nonlinearity $2^{r-1} - 2^{t+1}$ and disjoint Walsh supports, if
$g_1$ and $g_2$ have nonlinearity $2^{s-1} - 2^{m+1}$ and disjoint Walsh supports, and if $f_1 + f_2$ has
algebraic degree $r - t - 1$ and $g_1 + g_2$ has algebraic degree $s - m - 1$, then $h$ achieves
Siegenthaler’s and (improved) Sarkar-Maitra’s bounds with equality.
9.1.87 Remark Some particular choices of functions f1 , f2 , g1 , g2 give secondary constructions pre-
viously introduced by several authors (Siegenthaler, Tarannikov) [523, Section 8.7]; in par-
ticular the well-known direct sum h(x, y) = f (x) + g(y) is obtained by putting g1 = g2
and/or f1 = f2 .
9.1.88 Remark The indirect sum gives also a secondary construction of bent functions [523].
9.1.89 Proposition (Secondary construction without extension of the number of variables) [523,
Section 8.7] Let $0 \le k \le n$. Let $f_1$, $f_2$ and $f_3$ be three k-resilient n-variable functions. Then
the function $s_1 = f_1 + f_2 + f_3$ is k-resilient if and only if the function $s_2 = f_1 f_2 + f_1 f_3 + f_2 f_3$
is k-resilient. Moreover
$$NL(s_2) \ge \frac{1}{2} \Bigl( NL(s_1) + \sum_{i=1}^{3} NL(f_i) \Bigr) - 2^{n-1}. \qquad (9.1.10)$$
9.1.90 Remark It has been impossible until now to obtain resilient functions of sufficient orders
with good algebraic immunity. For this reason, the filter model is preferred to the combiner
model (the correlation attack works on the latter).
9.1.91 Proposition [528] Let $n$ be any positive integer and $\alpha$ a primitive element of the field $\mathbb{F}_{2^n}$.
Let $f$ be the balanced Boolean function on $\mathbb{F}_{2^n}$ whose support equals $\{0, \alpha^s, \ldots, \alpha^{s + 2^{n-1} - 2}\}$
for some $s$. Then $f$ has optimum algebraic immunity $\lceil n/2 \rceil$. Moreover, $f$ has algebraic degree
$n - 1$ and nonlinearity $NL(f) \ge 2^{n-1} - n \cdot \ln 2 \cdot 2^{n/2} - 1$.
9.1.92 Remark This bound on the nonlinearity, which has been recently slightly improved, is not
enough for ensuring that the function allows resisting the fast correlation attacks, but it
has been checked, for n ≤ 26, that the exact value of N L(f ) is much better than this lower
bound. Improving significantly the bound of Proposition 9.1.91 is an open problem.
9.1.93 Remark The function in Proposition 9.1.91 shows also good immunity against fast algebraic
attacks as shown by Liu et al. [1951].
9.1.94 Remark Despite the fact that complexity for computing the output is roughly the same as
for computing the discrete logarithm, the function can be efficiently computed because n is
small; the Pohlig-Hellman method is efficient, at least for some values (n = 18 or 20).
9.1.95 Remark We note that a similar function had been previously studied by Brandstätter,
Lange, and Winterhof in [391] but the algebraic immunity was not addressed by these
authors.
9.1.96 Remark Other infinite classes of balanced functions with optimal algebraic immunity have
been found [2777, 3056] which have good nonlinearity and good resistance to fast algebraic
attacks (checked by computer), at least for small n. More classes exist but, either their
optimal algebraic immunity is not completely proved, or they are closely related to the
classes mentioned above, or they do not have good nonlinearity, or they have bad resistance
to fast algebraic attacks.
9.1.97 Remark Boolean functions are used to design error correcting codes whose lengths are
powers of 2, in particular Reed-Muller codes (see Section 15.1).
9.1.98 Definition Let $n$ be any positive integer and $0 \le r \le n$. The Reed-Muller code of length $2^n$
and order $r$ is the set of binary words of length $2^n$ corresponding to the output columns
of the truth-tables of all the n-variable Boolean functions of algebraic degrees at most
$r$.
9.1.99 Proposition [1992] The Reed-Muller code of length $2^n$ and order $r$ is a linear code over $\mathbb{F}_2$
(i.e., is a vector subspace of $\mathbb{F}_2^{2^n}$) of dimension $1 + n + \binom{n}{2} + \cdots + \binom{n}{r}$ and minimum distance
$2^{n-r}$.
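For small parameters, the dimension and minimum distance stated in Proposition 9.1.99 can be confirmed by brute force. The following sketch builds the code from the truth tables of monomials of degree at most $r$, stored as integers whose bit $x$ is the value at the point $x$; the function names are illustrative, and the exhaustive search is only practical for small dimensions.

```python
from math import comb


def reed_muller_basis(r, n):
    """Truth tables of all monomials of degree <= r, one integer per monomial.

    Bit x of each integer is the value of the monomial at the point x.
    """
    basis = []
    for u in range(1 << n):
        if bin(u).count("1") <= r:
            word = 0
            for x in range(1 << n):
                if x & u == u:            # every variable of the monomial is 1 at x
                    word |= 1 << x
            basis.append(word)
    return basis


def minimum_distance(basis):
    """Minimum weight of a nonzero codeword, by exhaustive enumeration."""
    best = None
    for coeffs in range(1, 1 << len(basis)):
        word = 0
        for j, b in enumerate(basis):
            if (coeffs >> j) & 1:
                word ^= b
        weight = bin(word).count("1")
        if best is None or weight < best:
            best = weight
    return best


def expected_dimension(r, n):
    """1 + n + C(n, 2) + ... + C(n, r)."""
    return sum(comb(n, i) for i in range(r + 1))
```

Because the monomials are linearly independent as functions, distinct coefficient vectors give distinct codewords, so the minimum over nonzero coefficient vectors is the minimum distance of the linear code.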
9.1.100 Remark The Reed-Muller code of order 1 is optimal according to the Griesmer bound
[1992].
9.1.101 Remark The Reed-Muller code of length $2^n$ and of order 2 contains a nonlinear optimal
code: the Kerdock code of the same length, introduced in [1728] (not as in the definition
below, though). The Kerdock code of length $2^n$ is the union of cosets of the first order
Reed-Muller code, chosen such that the sum of two elements from different cosets is a bent
function.
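9.1.102 Definition The Kerdock code of length $2^n$ is the set of functions of the form
$(x, \epsilon) \in \mathbb{F}_{2^{n-1}} \times \mathbb{F}_2 \mapsto f(ux, \epsilon) + \mathrm{Tr}_{\mathbb{F}_{2^{n-1}}/\mathbb{F}_2}(ax) + \eta\epsilon + \tau$, where
$$f(x, \epsilon) = \mathrm{Tr}_{\mathbb{F}_{2^{n-1}}/\mathbb{F}_2}\Bigl( \sum_{i=1}^{\frac{n}{2} - 1} x^{2^i + 1} \Bigr) + \epsilon\, \mathrm{Tr}_{\mathbb{F}_{2^{n-1}}/\mathbb{F}_2}(x),$$
with $u, a \in \mathbb{F}_{2^{n-1}}$, and $\eta, \tau \in \mathbb{F}_2$.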
9.1.103 Proposition [1992] The Kerdock code of length $2^n$ has size $2^{2n}$ and minimum distance
$2^{n-1} - 2^{n/2 - 1}$.
9.1.104 Remark The Kerdock code of length $2^n$ is optimal, as proved by Delsarte [800].
9.1.105 Remark Other codes with the same parameters exist, called generalized Kerdock codes
[1673].
9.1.106 Remark The Kerdock code of length $2^n$ is not linear but it is the image of a linear code
over $\mathbb{Z}/4\mathbb{Z}$ by a distance preserving mapping called the Gray map [1409].
9.1.107 Remark Boolean functions are related to sequences for communications: see Chapter 10.
9.1.108 Definition A sequence $(s_i)_{i \ge 0}$ over $\mathbb{F}_2$ satisfying an order $n$ linear homogeneous recurrence
relation (with constant coefficients) is an m-sequence if it has (optimal) period $2^n - 1$.
9.1.109 Proposition [1303] The m-sequences of period $2^n - 1$ are the sequences of the form $s_i =
\mathrm{Tr}_{\mathbb{F}_{2^n}/\mathbb{F}_2}(\lambda \alpha^i)$, where $\lambda \in \mathbb{F}_{2^n}^*$ and $\alpha$ is a primitive element of $\mathbb{F}_{2^n}$. Given such an
m-sequence, any other m-sequence of the same period differs from $(s_i)_{i \ge 0}$, up to a cyclic shift,
by a decimation $d$ such that $\gcd(d, 2^n - 1) = 1$ (that is, the second sequence equals $(s_{di+t})_{i \ge 0}$,
where $t$ is some integer).
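An m-sequence can be generated from its linear recurrence with a few lines of code. The sketch below assumes a recurrence of the form $s_{i+n} = \sum_{j} c_j s_{i+j}$ over $\mathbb{F}_2$ with $c_0 = 1$, given by the list `taps = [c_0, ..., c_{n-1}]`; the names and the two example recurrences (one attached to the primitive polynomial $x^4 + x + 1$, one to the non-primitive $x^4 + x^3 + x^2 + x + 1$) are illustrative choices.

```python
def lfsr_sequence(taps, state, length):
    """First `length` terms of the sequence with s_{i+n} = sum_j taps[j] * s_{i+j}."""
    seq = list(state)
    n = len(taps)
    while len(seq) < length:
        seq.append(sum(c & b for c, b in zip(taps, seq[-n:])) & 1)
    return seq[:length]


def lfsr_period(taps, state):
    """Number of steps until the register returns to its initial state.

    Assumes taps[0] == 1, so that the state update is invertible and the
    state sequence is purely periodic.
    """
    start = tuple(state)
    current = start
    steps = 0
    while True:
        new_bit = sum(c & b for c, b in zip(taps, current)) & 1
        current = current[1:] + (new_bit,)
        steps += 1
        if current == start:
            return steps


PRIMITIVE_TAPS = [1, 1, 0, 0]        # s_{i+4} = s_i + s_{i+1}
NON_PRIMITIVE_TAPS = [1, 1, 1, 1]    # s_{i+4} = s_i + s_{i+1} + s_{i+2} + s_{i+3}
```

With the first recurrence the period is $2^4 - 1 = 15$, so the sequence is an m-sequence; with the second it is only 5.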
9.1.110 Remark Since sequences are viewed up to cyclic shifts, we shall, in the sequel, take λ equal
to 1 and the integer t equal to 0.
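9.1.111 Remark According to Proposition 9.1.109, the crosscorrelation $C_d(\tau) = \sum_{t=0}^{2^n - 2} (-1)^{s_{t+\tau} + s_{dt}}$
between two m-sequences equals
$$\sum_{x \in \mathbb{F}_{2^n}^*} (-1)^{\mathrm{Tr}_{\mathbb{F}_{2^n}/\mathbb{F}_2}(cx + x^d)},$$
where $c = \alpha^\tau$.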
See Also
§15.1 For properties of Reed-Muller codes and their sub-codes; see also [1992].
[523] For a survey on Boolean functions for coding and cryptography (the chapter
which follows it in the same monograph deals with vectorial functions, which
are the subject of Section 9.2 in the present handbook). This survey includes
binary bent functions (see also Section 9.3 in the present handbook).
[1303] For a recent survey on sequences.
References Cited: [267, 391, 444, 485, 490, 491, 519, 520, 521, 522, 523, 528, 530, 534,
735, 741, 800, 1069, 1303, 1409, 1446, 1673, 1728, 1855, 1951, 1992, 2011, 2073, 2075, 2085,
2317, 2474, 2522, 2663, 2777, 3015, 3056]
9.2.1 Remark Around 1992, two cryptanalysis methods were introduced in the literature
devoted to symmetric cryptosystems: differential cryptanalysis [278] and linear
cryptanalysis [2026]. It was shown later that these methods are basically linked [577]. In
order to resist these attacks the round function used in an iterated block cipher must
satisfy some mathematical properties. These properties are mainly covered by the concepts
of nonlinearity and of differential uniformity for functions on extension fields.
9.2.2 Definition Any function from $\mathbb{F}_{2^n}$ into $\mathbb{F}_{2^m}$ is an (n, m)-function. It is a Boolean function
when $m = 1$ and a function on $\mathbb{F}_{2^n}$ when $m = n$. Here we assume that $n \ge m$. Basic
properties of Boolean functions can be found in Section 9.1.
9.2.3 Definition Let $F$ be an (n, m)-function. The component functions of $F$ are the Boolean
functions
$$f_\lambda : x \in \mathbb{F}_{2^n} \mapsto \mathrm{Tr}(\lambda F(x)), \quad \lambda \in \mathbb{F}_{2^m}^*,$$
where Tr is the absolute trace on $\mathbb{F}_{2^m}$ (see Definition 2.1.80).
9.2.4 Definition An (n, m)-function $F$ is balanced when it is uniformly distributed, i.e., $F$ takes
every value of $\mathbb{F}_{2^m}$ exactly $2^{n-m}$ times.
9.2.5 Proposition [524] An (n, m)-function is balanced if and only if all its component functions
are balanced.
9.2.6 Remark When m = n, a balanced function is a permutation of F2n , as it is shown in a
more general context in [1939, Theorem 7.7].
9.2.7 Remark The nonlinearity of a Boolean function f is usually computed by means of the
highest magnitude of its Walsh spectrum Wf . These quantities are denoted by N L(f ) and
L(f ) respectively (see the definitions in Section 9.1).
9.2.8 Definition A Boolean function $f$ is plateaued if its Walsh coefficients take at most three
values, namely $0, \pm L(f)$. Then, $L(f) = 2^s$ with $s \ge n/2$.
If $s = n/2$ (and $n$ even) then $f$ is bent and its Walsh coefficients take two values only,
namely $\pm 2^{n/2}$.
Also, $f$ is semi-bent if $s = (n + 1)/2$ for odd $n$ and $s = (n + 2)/2$ for even $n$.
An (n, m)-function is plateaued when its components are plateaued.
9.2.9 Definition Let $F$ be an (n, m)-function with components $f_\lambda$. Let $NL(f_\lambda)$ be the
nonlinearity of $f_\lambda$. The nonlinearity of $F$, say $NL(F)$, is the lowest nonlinearity achieved by
one of its components:
$$NL(F) = \min_{\lambda \in \mathbb{F}_{2^m}^*} NL(f_\lambda) = 2^{n-1} - \frac{L(F)}{2},$$
where $L(F)$ is the highest magnitude appearing in the Walsh spectrum of all $f_\lambda$.
Indeed, this holds for any fλ (see the covering radius bound in Definition 9.1.41).
9.2.11 Definition Let an (n, m)-function $F$ be expressed as a univariate polynomial of degree less
than $2^n$. The algebraic degree of $F$ is the maximal Hamming weight of its exponents,
considering the binary expansion of exponents.
9.2.12 Definition Let $F$ be an (n, m)-function. For any $a \in \mathbb{F}_{2^n}$, the derivative of $F$ with respect
to $a$ is the (n, m)-function $D_a F$ defined for all $x \in \mathbb{F}_{2^n}$ by
$$D_a F(x) = F(x + a) + F(x).$$
9.2.13 Definition An (n, m)-function $F$ is a bent function, also called a perfect nonlinear (PN)
function, if and only if $NL(F) = 2^{n-1} - 2^{n/2 - 1}$, i.e., all its components are Boolean
bent functions.
9.2.14 Theorem [2302] An (n, m)-function can be PN only if n is even and m ≤ n/2.
9.2.18 Remark In cryptology, PN functions were introduced in 1990-95 as functions which provide
the optimal resistance to linear attacks and to differential attacks (see [2302] and [2304]).
They have the best nonlinearity, by Definition 9.2.13, and the δa,b are as small as possible,
by Proposition 9.2.16. A major drawback is that these optimal functions are not balanced;
also, they presuppose the use of non-invertible round functions (see [489], [496, Chapter 3]).
9.2.19 Remark In cryptography, most works have focused on functions that are optimal with respect to
differential attacks. The aim is to exhibit such functions that, moreover, are bijective and offer
good resistance to linear attacks. According to Remark 9.2.18, functions on $\mathbb{F}_{2^n}$ (i.e.,
(n, n)-functions) are generally considered. Algebraic properties of almost perfect nonlinear (resp.
almost bent) functions and their links with error-correcting codes are introduced in [525].
9.2.20 Definition A function $F$ on $\mathbb{F}_{2^n}$ is almost perfect nonlinear (APN) if and only if all the
equations
$$F(x) + F(x + a) = b, \quad a \in \mathbb{F}_{2^n},\ a \neq 0,\ b \in \mathbb{F}_{2^n}, \qquad (9.2.2)$$
have at most two solutions. The function $F$ is almost bent (AB) if and only if the value
of
$$W_F(\beta, \lambda) = \sum_{x \in \mathbb{F}_{2^n}} (-1)^{\mathrm{Tr}(\lambda F(x) + \beta x)} \qquad (9.2.3)$$
is equal either to 0 or to $\pm 2^{\frac{n+1}{2}}$, for any $\beta$ and $\lambda$ in $\mathbb{F}_{2^n}$, $\beta \neq 0$.
9.2.21 Theorem [577] AB functions exist for n odd only. Any AB function is APN.
9.2.22 Remark A function F on F2n is APN if and only if all its derivatives are 2-to-1. This can
be derived from the definition.
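The APN and AB conditions can be tested exhaustively on small fields. The sketch below represents $\mathbb{F}_{2^n}$ as $\mathbb{F}_2[x]/(p)$ with elements encoded as integers, and checks the cube function $x^3$; it evaluates the Walsh values of the nonzero components $f_\lambda$ (so $\lambda \neq 0$, all $\beta$), which is one way of reading the semi-bent criterion. The moduli, helper names, and this reading are assumptions of the illustration.

```python
def gf_mul(a, b, n, modulus):
    """Product of a and b in F_{2^n} = F_2[x]/(modulus); modulus includes x^n."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> n) & 1:
            a ^= modulus
    return result


def power_table(d, n, modulus):
    """Lookup table x -> x^d on F_{2^n} (with 0^d = 0 for d > 0)."""
    table = []
    for x in range(1 << n):
        y = 1
        for _ in range(d):
            y = gf_mul(y, x, n, modulus)
        table.append(y)
    return table


def differential_uniformity(table):
    """Largest number of solutions x of F(x + a) + F(x) = b over a != 0 and b."""
    size = len(table)
    best = 0
    for a in range(1, size):
        counts = [0] * size
        for x in range(size):
            counts[table[x ^ a] ^ table[x]] += 1
        best = max(best, max(counts))
    return best


def trace_table(n, modulus):
    """Absolute trace z + z^2 + ... + z^(2^(n-1)) of every field element (0 or 1)."""
    table = []
    for z in range(1 << n):
        t, p = 0, z
        for _ in range(n):
            t ^= p
            p = gf_mul(p, p, n, modulus)
        table.append(t)
    return table


def walsh_magnitudes(table, n, modulus):
    """Set of |W_F(beta, lambda)| over all beta and all lambda != 0."""
    tr = trace_table(n, modulus)
    size = 1 << n
    values = set()
    for lam in range(1, size):
        comp = [tr[gf_mul(lam, y, n, modulus)] for y in table]
        for beta in range(size):
            total = sum(1 - 2 * (comp[x] ^ tr[gf_mul(beta, x, n, modulus)])
                        for x in range(size))
            values.add(abs(total))
    return values


F8_MODULUS = 0b1011       # x^3 + x + 1
F32_MODULUS = 0b100101    # x^5 + x^2 + 1
```

A function is APN exactly when `differential_uniformity` returns 2; for odd $n$ it is AB when the only magnitudes found are 0 and $2^{(n+1)/2}$.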
9.2.23 Proposition [525, Theorem 1] Let F be an AB function on F2n . Then the algebraic degree
of F is less than or equal to (n + 1)/2.
9.2.24 Remark According to (9.2.3), a function F on F2n is an AB function if and only if its
components fλ are semi-bent. Thus
9.2.27 Definition Let F be a function on F2n . For any a ∈ F∗2n and b ∈ F2n , we denote
9.2.28 Remark Those functions for which δ(F ) = 2 are APN. It is worth noticing that such
functions are rare and hard to find. From a recent result of Voloch [2885], it follows that
these functions asymptotically have density zero in the set of all functions. Few infinite
families are known. The monomial APN functions are related with exceptional objects (see
Subsection 9.2.8). The known infinite families of non-monomial APN functions are listed
in [388]. They are all quadratic. Edel and Pott propose in [954] an original construction
providing a sporadic non-quadratic non-monomial APN function.
9.2.29 Problem AB functions on $\mathbb{F}_{2^n}$, with $n$ odd, provide an optimal resistance to both differential
attacks and linear attacks. There are several classes of AB permutations. The situation
is different for even $n$: there are APN functions $F$ such that $L(F) = 2^{(n+2)/2}$ and it is
conjectured that this value is the minimum; also the existence of APN permutations, for
$n > 6$, is not resolved.
9.2.30 Example The inverse function∗, $F(x) = x^{-1}$, is an APN (not AB) permutation on $\mathbb{F}_{2^n}$
when $n$ is odd. For even $n$, it is a permutation too, but $\delta(a, b)$ takes three values, namely 0,
2 and 4 so that $\delta(F) = 4$. This function has the highest degree and satisfies $L(F) = 2^{(n+2)/2}$
for $n$ even. The inverse function on $\mathbb{F}_{2^8}$ is used in the AES S-boxes (see Section 16.2).
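The behaviour of the inverse function for odd and even $n$ can be observed on small fields. This is a minimal sketch, assuming $\mathbb{F}_{2^n}$ is represented as $\mathbb{F}_2[x]/(p)$ with integer-encoded elements and $F(x) = x^{2^n - 2}$ (so $F(0) = 0$); the moduli and helper names are choices made for the illustration.

```python
def gf_mul(a, b, n, modulus):
    """Product of a and b in F_{2^n} = F_2[x]/(modulus); modulus includes x^n."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> n) & 1:
            a ^= modulus
    return result


def inverse_table(n, modulus):
    """Lookup table of F(x) = x^(2^n - 2), the inverse function with F(0) = 0."""
    table = []
    for x in range(1 << n):
        y = 1
        for _ in range((1 << n) - 2):
            y = gf_mul(y, x, n, modulus)
        table.append(y)
    return table


def differential_uniformity(table):
    """Largest number of solutions x of F(x + a) + F(x) = b over a != 0 and b."""
    size = len(table)
    best = 0
    for a in range(1, size):
        counts = [0] * size
        for x in range(size):
            counts[table[x ^ a] ^ table[x]] += 1
        best = max(best, max(counts))
    return best


# Irreducible moduli: x^3 + x + 1, x^4 + x + 1, x^5 + x^2 + 1.
MODULI = {3: 0b1011, 4: 0b10011, 5: 0b100101}
```

The same routine applied to $n = 8$ with the AES modulus would take noticeably longer in pure Python, which is why only small fields are used here.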
9.2.31 Remark As with the PN property, the APN property can be extended to fields of odd
characteristic (see [908, 1479], for instance).
9.2.32 Remark The first APN permutation of F26 (n = 6) was presented by Dillon at the conference
Finite Fields and their Applications (Fq 9) in 2009. Thus, a long-standing (and famous)
conjecture stating that there is no APN permutation of F2n when n is even was disproved.
The method, explained in [424], uses mostly the representation of APN functions by codes
(see Theorem 9.2.47). However the existence of APN permutations of an even number n
(with $n \ge 8$) of variables remains a research problem of great interest. The next theorem
is due to several authors [228, 1541, 2303].
∗ Note that, as a function on $\mathbb{F}_{2^n}$, $F(x) = x^{2^n - 2}$ so that $F(0) = 0$.
9.2.37 Remark It was conjectured in [525] that for any AB function F , there exists a linear function
L such that F + L is a permutation. A counterexample for this conjecture is given in [449,
Remark 4].
9.2.38 Example [447, Proposition 1] Let $n = 3k$ where $k$ is odd and not divisible by 3. Let $u$ be
any element in $\mathbb{F}_{2^n}^*$ of order $2^{2k} + 2^k + 1$. Then any function
$$F(x) = x^{2^s + 1} + u x^{2^{ik} + 2^{tk + s}}$$
9.2.39 Definition Let $F$ and $F'$ be two functions on $\mathbb{F}_{2^n}$. They are equivalent if $F'$ is obtained
from $F$ by compositions of 1 and/or 2:
1. $F \mapsto A_1 \circ F \circ A_2 + A$, where $A_1$ and $A_2$ are affine permutations and $A$ is any
function which is affine or constant;
2. $F \mapsto F^{-1}$, the inverse function of $F$ when $F$ is a permutation.
They are extended affine equivalent (EA-equivalent) when $F'$ is obtained from $F$ by
transformations of type 1.
9.2.40 Proposition [525] Let F be a function on F2n which is APN (resp. AB). Then any function
which is equivalent to F is an APN (resp. AB) function too.
9.2.41 Remark The definition of Carlet-Charpin-Zinoviev (CCZ)-equivalence, naturally derived
from [525, Proposition 3], was proposed in [449]. It requires introducing the graph of
any function $F$ on $\mathbb{F}_{2^n}$:
$$G_F = \{ (x, F(x)) \mid x \in \mathbb{F}_{2^n} \}.$$
9.2.42 Definition Let $F$ and $F'$ be two functions on $\mathbb{F}_{2^n}$. They are CCZ-equivalent if their graphs
are affine equivalent, i.e., if there exists an affine automorphism $A$ of $\mathbb{F}_{2^n} \times \mathbb{F}_{2^n}$ such
that $A(G_F) = G_{F'}$.
9.2.43 Theorem [449, Proposition 2] Let $F, F'$ be CCZ-equivalent functions. Then $F$ is APN (resp.
AB) if and only if $F'$ is APN (resp. AB). In this case, $F$ and $F'$ have the same nonlinearity
but may have different algebraic degrees.
9.2.44 Remark In [449], Budaghyan, Carlet, and Pott proved that CCZ-equivalence is more general
than equivalence. They are then able to present APN functions which are EA-inequivalent
to all known APN functions.
9.2.45 Remark The important question whether affinely inequivalent functions are CCZ-equivalent
was first investigated by Edel, Kyureghyan, and Pott [953]. They exhibited an APN
binomial, which is new since it cannot be obtained from another known APN function (see
example below). A number of constructions of APN functions followed the definition of
CCZ-equivalence (see, for instance: [386, 423, 443, 447]). A classification up to CCZ-equivalence,
for small dimensions, is proposed in [417].
9.2.46 Example [953, Theorem 2] Let $F$ be the function on $\mathbb{F}_{2^{10}}$ defined by $F(x) = x^3 + u x^{36}$. Let $\omega \in \mathbb{F}_4 \setminus \{0, 1\}$.
Then $F$ is APN for any $u = \omega^i y$, $i \in \{1, 2\}$ and $y \in \mathbb{F}_{2^5}^*$. Moreover, these APN functions
are not CCZ-equivalent to any APN monomial on $\mathbb{F}_{2^{10}}$.
9.2.47 Theorem [525] Let $F$ be a function on $\mathbb{F}_{2^n}$ such that $F(0) = 0$. Let $\alpha$ be a primitive element
of $\mathbb{F}_{2^n}$. Let us denote by $C_F$ the linear binary code of length $2^n - 1$ defined by its parity
check matrix
$$H_F = \begin{pmatrix} 1 & \alpha & \alpha^2 & \cdots & \alpha^{2^n - 2} \\ F(1) & F(\alpha) & F(\alpha^2) & \cdots & F(\alpha^{2^n - 2}) \end{pmatrix}$$
where each entry is viewed as a binary vector. The dual code is denoted by $(C_F)^\perp$. Then
we have
1. The function $F$ is APN if and only if the code $C_F$ has minimum distance five.
2. The function $F$ is AB if and only if the weights of the non zero codewords of the
code $(C_F)^\perp$ form the following set: $\{2^{n-1},\ 2^{n-1} \pm 2^{(n-1)/2}\}$.
9.2.48 Corollary [525] Let $F$ be a function on $\mathbb{F}_{2^n}$ such that $F(0) = 0$. Then we have
1. If $F$ is APN then the dimension of $C_F$ is equal to $2^n - 2n - 1$.
2. If $F$ is APN then $(C_F)^\perp$ does not contain the all-one vector.
3. If $F$ is AB then the weight distribution of $(C_F)^\perp$ is unique and the same as the
weight distribution of the dual of the 2-error-correcting BCH code.
9.2.49 Remark The weight distribution of (CF )⊥ exactly corresponds to the Walsh spectrum of F ,
that is the multiset of the values WF (β, λ) given by (9.2.3). Thus, if this weight distribution
is known, the nonlinearity of F is also known.
9.2.50 Definition A binary code $C$ is $2^\ell$-divisible if the weight of any of its codewords is divisible
by $2^\ell$. Moreover $C$ is exactly $2^\ell$-divisible if, additionally, it contains at least one codeword
whose weight is not divisible by $2^{\ell + 1}$.
9.2.51 Theorem [494] Let $F$ be a function on $\mathbb{F}_{2^n}$, with $n = 2\ell + 1$. Then $F$ is AB if and only if
$F$ is APN and the code $(C_F)^\perp$, defined in Theorem 9.2.47, is $2^\ell$-divisible.
9.2.52 Remark The determination of the weight distributions of codes of type CF remains an open
problem except when the code is optimal in a certain sense. The work of Kasami remains
fundamental, proving notably the uniqueness of the weight enumerator of these optimal
codes ([1688, 1689]; see also an extensive study in [590, Section 3.4.2]).
9.2.53 Example The functions $F(x) = x^d$, $d = 2^{2i} - 2^i + 1$ with $\gcd(i, n) = 1$ are APN for any $n$.
Such exponents $d$ are called the Kasami exponents. When $n$ is odd, the code $C_F$ is equivalent
to the cyclic code with two zeros $\alpha^{2^i + 1}$ and $\alpha^{2^{3i} + 1}$ ($\alpha$ being a primitive root of $\mathbb{F}_{2^n}$) and $F$
is AB. The proof is due to Kasami [1688] for odd $n$ (see also [590, Theorem 3.32]) and to
Janwa and Wilson [1595] for even $n$.
9.2.54 Remark The code CF always contains a subcode which is a cyclic code. This is of most
interest when F is a monomial as we will see later (see [525] for more details).
9.2.56 Proposition [525, Section 3.4] Let F be a quadratic function on F2n with n odd. Then F
is AB if and only if F is APN.
9.2.57 Proposition [525] Let F be a quadratic function. Then F is APN if and only if the code
CF does not contain any codeword of weight three.
9.2.58 Remark When F is quadratic, CF⊥ is contained in the punctured Reed-Muller code of order
2 (see [1991, Chapter 15]).
9.2.59 Example The functions $F(x) = x^{2^i + 1}$ with $\gcd(i, n) = 1$ are APN for any $n$ (see [1689,
2303]). Such quadratic exponents are called Gold exponents.
9.2.60 Remark There are several constructions of quadratic non-monomial APN functions; no-
tably, there are infinite families of such functions (see [272, 386, 388, 443, 447, 449] and
Remark 9.2.28). There is only one infinite family of binomials, which was introduced and
extensively studied in [447, Section II] by Budaghyan, Carlet, and Leander. Their Walsh
spectrum was computed in [387]. Some such binomials are bijective (see Example 9.2.38).
9.2.61 Problem The Walsh spectrum of a non-monomial quadratic function is generally not known.
For APN functions of this kind, the problem is discussed in [387]. For the computation of
some spectra, see [384, 385, 387].
9.2.62 Example The function $F(x) = x^3 + \mathrm{Tr}(x^9)$ was recently discovered [448]. It is APN for
any n and inequivalent (in any sense) to the Gold functions, while it has the same Walsh
spectrum [385].
9.2.63 Theorem [389, Theorem 3] Let F be a quadratic APN function and G be a Gold function,
i.e., $G(x) = x^{2^k+1}$ with gcd(k, n) = 1, on $\mathbb{F}_{2^n}$. If F and G are CCZ-equivalent then they
are EA-equivalent.
9.2.64 Theorem [228, Theorem 6] There are no APN quadratic mappings on $\mathbb{F}_{2^n}$ of the form
$$F(x) = \sum_{i=1}^{n-1} c_i x^{2^i+1}, \quad c_i \in \mathbb{F}_{2^n}, \qquad (9.2.4)$$
unless F is a monomial: $F(x) = c x^{2^k+1}$ with gcd(k, n) = 1, $c \in \mathbb{F}_{2^n}$.
9.2.65 Definition The APN function F on $\mathbb{F}_{2^n}$ is crooked if for every $a \in \mathbb{F}_{2^n}^*$ the image of its
derivative $D_a F$ is a hyperplane or the complement of a hyperplane.
9.2.72 Remark In cryptography, monomials are usually called power functions. They were inten-
sively studied, since they have a lower implementation cost in hardware. Moreover, their
properties regarding differential attacks can be studied more easily, since they are related
to the weight enumerators of some cyclic codes with two zeros [590, Section 3.4.2].
9.2.73 Proposition [525] Let $F(x) = x^d$ be a function on $\mathbb{F}_{2^n}$. Then $C_F$, defined as in Theorem
9.2.47, is the binary cyclic code of length $2^n - 1$ whose zeros are $\alpha$, $\alpha^d$ and their conjugates;
$C_F$ is a cyclic code with two zeros.
9.2.74 Remark According to Corollary 9.2.48, if F is APN then the cyclotomic coset of d modulo
$2^n - 1$ has size n (with $F(x) = x^d$). In this case, $C_F$ has minimum distance 5 and dimension
$2^n - 2n - 1$.
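The coset size and the dimension count in Remark 9.2.74 are easy to reproduce numerically. The sketch below is a hedged illustration: the function names are hypothetical, and the dimension is computed from the general fact that a cyclic code's dimension equals its length minus the number of its zeros, applied to the zero set $\alpha$, $\alpha^d$ and their conjugates.

```python
def cyclotomic_coset(d, n):
    """2-cyclotomic coset of d modulo 2^n - 1, returned as a sorted list."""
    modulus = (1 << n) - 1
    coset = set()
    e = d % modulus
    while e not in coset:
        coset.add(e)
        e = (2 * e) % modulus  # conjugation corresponds to doubling the exponent
    return sorted(coset)


def two_zero_code_dimension(d, n):
    """Dimension of the length-(2^n - 1) binary cyclic code whose zeros are
    alpha, alpha^d and their conjugates: length minus number of zeros."""
    zeros = set(cyclotomic_coset(1, n)) | set(cyclotomic_coset(d, n))
    return (1 << n) - 1 - len(zeros)
```

When the coset of d has full size n and differs from the coset of 1, the zero set has 2n elements, which gives the dimension $2^n - 2n - 1$ stated above.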
9.2.75 Example The Melas code is the cyclic code of length $2^n - 1$ with two zeros $\alpha$ and $\alpha^{-1}$.
Its minimum distance is 5 for odd n and 3 for even n. It is the code $C_F$ corresponding to
the inverse function $F(x) = x^{-1}$ (see Example 9.2.30). Lachaud and Wolfmann described
in [1828] the set of weights of $C_F^\perp$.
9.2.76 Proposition [494, 1467] Let $F(x) = x^d$ with $\gcd(d, 2^n - 1) = 1$. Then $C_F^\perp$ is exactly
2-divisible if and only if $C_F$ is the Melas code, i.e., $d = -1$.
9.2.77 Remark Let $F(x) = x^d$ with $\gcd(d, 2^n - 1) = 1$. The study of the weights of the code $C_F^\perp$
corresponds to the study of the crosscorrelation of a pair of maximal length linear sequences,
called m-sequences (see Section 10.3).
9.2.78 Remark Janwa, McGuire, and Wilson characterized several classes of cyclic codes with two
zeros whose minimum distance is at most four. More precisely, by applying a form of Weil’s
theorem, they showed that, for a large class of such codes, only a finite number could be
good, i.e., have minimum distance five [1594, 1595].
9.2.79 Theorem [1594] For any fixed d satisfying $d \equiv 3 \pmod 4$ and d > 3, the cyclic codes $C_F$,
$F(x) = x^d$, of length $2^n - 1$, with two zeros $\alpha$ and $\alpha^d$, have codewords of weight 4 for all
but finitely many values of n.
9.2.80 Remark By their work [1594], the authors strengthened the conjecture that APN (a fortiori
AB) functions are exceptional. Their main conjecture was solved recently by Hernando and
McGuire (see Theorem 9.2.83 below).
9.2.81 Definition Let $F(x) = x^d$. The exponent d is exceptional if F is APN on infinitely many
extension fields of $\mathbb{F}_2$.
9.2.82 Remark The epithet exceptional is due to Dillon who observed that Gold and Kasami
exponents can be defined by means of exceptional polynomials (see Section 8.4). Dillon
extensively describes the links between these exceptional codes, maps, difference sets, and
polynomials in [863], a review of great interest. More can be found in [862, 864]. Moreover,
reversed permutation Dickson polynomials are closely related with APN monomials [1547].
9.2.83 Theorem [1490] Only the Gold and the Kasami numbers are exceptional exponents.
9.2.84 Problem Gold functions and Kasami functions, presented in Examples 9.2.59 and 9.2.53,
form the exceptional classes of APN power functions. The other classes, known as sporadic,
are listed in Example 9.2.85. It is currently conjectured that any APN power function, which
is not exceptional, belongs to one of the known “sporadic” classes (up to equivalence).
9.2.85 Example The known “sporadic” classes are the following classes of APN power
functions. Such a function F on $\mathbb{F}_{2^n}$, $F(x) = x^d$, is designated by its exponent d.
1. The Welch functions: $d = 2^t + 3$ with $n = 2t + 1$. Any Welch function is AB,
proved by Canteaut, Charpin, and Dobbertin [493] (see also [492, 494]). This was
conjectured by Welch around 1968, concerning the crosscorrelation function of
binary m-sequences. Dobbertin previously proved that F is APN [904].
2. The Niho functions: for $n = 2t + 1$,
$$d = \begin{cases} 2^t + 2^{t/2} - 1 & \text{if } t \text{ is even}, \\ 2^t + 2^{(3t+1)/2} - 1 & \text{otherwise.} \end{cases}$$
See Also
References Cited: [224, 228, 264, 272, 277, 278, 331, 384, 385, 386, 387, 388, 389, 417, 423,
424, 443, 447, 448, 449, 489, 492, 493, 494, 495, 496, 524, 525, 577, 590, 808, 862, 863, 864,
903, 904, 906, 908, 953, 954, 1467, 1479, 1490, 1526, 1541, 1547, 1594, 1595, 1688, 1689,
1816, 1828, 1939, 1991, 2026, 2287, 2302, 2303, 2304, 2838, 2885]
9.3.1 Remark In this section, we give basics and mainly focus on recent results on Boolean bent
functions and also provide a rather comprehensive survey of generalized bent functions. For
other results on Boolean bent functions, together with most of the proofs, the reader is
referred to [523].
denotes some non-degenerate symmetric bilinear form on $\mathbb{F}_p^n$, sometimes also called an
inner product. If $f : \mathbb{F}_p^n \to \mathbb{F}_p$, we may define a corresponding complex-valued function
$f_c$ by $f_c(x) = \zeta_p^{f(x)}$. The Walsh transform of f is by definition the Walsh transform of
$f_c$. The normalized Walsh coefficients are the numbers $p^{-n/2} \hat{f}(y)$.
9.3.3 Remark
9.3.4 Remark
1. If we identify $\mathbb{F}_p^n$ with $\mathbb{F}_{p^n}$, all p-ary functions can be described by $\mathrm{Tr}_n(F(x))$
for some function $F : \mathbb{F}_{p^n} \to \mathbb{F}_{p^n}$ of degree at most $p^n - 1$. This is the univariate
representation. If we do not identify $\mathbb{F}_p^n$ with $\mathbb{F}_{p^n}$, the p-ary function has a
representation as a multinomial in $x_1, \ldots, x_n$, where the variables $x_i$ occur with
exponent at most p − 1. This is the multivariate representation.
2. The multivariate representation is unique.
3. The univariate representation is not unique. The unique univariate form of a Boolean
function, the trace representation, is discussed in Section 9.1. The case of odd p is
similar.
9.3.5 Definition The algebraic degree of a p-ary function is the degree of the polynomial giving
its multivariate representation.
9.3.6 Remark The algebraic degree of a p-ary function is not the degree of the polynomial F
in the univariate representation, see Section 9.1. Throughout this section, degree always
means the algebraic degree.
9.3.7 Remark The Walsh transform can also be defined for functions $F : \mathbb{F}_p^n \to \mathbb{F}_p^m$. In this case,
one considers all functions $F_a : \mathbb{F}_p^n \to \mathbb{F}_p$ of the form $F_a(x) = a \cdot F(x)$, where $a \cdot x$ denotes
an inner product on $\mathbb{F}_p^m$. In this section, we mostly consider functions $\mathbb{F}_p^n \to \mathbb{F}_p$, where p
$$|\hat{f}(y)| = \Big| \sum_{x \in \mathbb{F}_2^n} (-1)^{f(x) + y \cdot x} \Big| = 2^{n/2}$$
9.3.10 Remark Definition 9.3.9 is a special case of Definition 9.3.8. We include it here since most
papers on bent functions just deal with the Boolean case. In the Boolean case, the dimension
n must obviously be even. Boolean bent functions are regular.
9.3.11 Remark Boolean bent functions that are equal to their dual are self-dual and those whose
dual is the complement of the function are anti self-dual [526, 1543]. The class of self-dual
bent functions can be extended by defining formally self-dual Boolean functions [1565].
9.3.12 Example
1. The function
is bent on $\mathbb{F}_2^{2m}$.
2. If we identify $\mathbb{F}_p^n$ with the additive group of $\mathbb{F}_{p^n}$, the function $f(x) = \mathrm{Tr}_n(x^2)$ is
p-ary bent for all odd primes p. This example shows that in the odd characteristic
case, the dimension n need not be even.
9.3.15 Remark
1. EA-equivalence of Boolean functions from Definition 9.1.48 is the case p = 2 in
Definition 9.3.13.
2. If a function f is not affine then any function EA-equivalent to it has the same
algebraic degree as f . On the other hand, the function f and its dual f ∗ do not
necessarily have the same degree (this can be seen in many examples below).
3. Any function which is EA-equivalent to a bent function is bent.
4. For bent functions, CCZ equivalence coincides with EA equivalence (see [451,
Theorem 3]).
5. Each Boolean quadratic bent function over $\mathbb{F}_2^{2m}$ is EA-equivalent to the function
in Example 9.3.12 (1). Thus, the class of Boolean quadratic bent functions is
complete and consists of a single equivalence class. For the case of odd p, see
Remark 9.3.36.
9.3.16 Definition [3038] A Boolean function f over $\mathbb{F}_{2^n}$ is hyper bent if $f(x^k)$ is bent for any k
coprime with $2^n - 1$ [529, 591].
$$2 \le \deg f \le \frac{(p-1)n}{2} + 1.$$
Moreover, if f is a weakly regular bent function (that includes the case of Boolean bent
functions) with $(p - 1)n \ge 4$ then
$$2 \le \deg f \le \frac{(p-1)n}{2}.$$
9.3.18 Remark The Maiorana-McFarland and Dillon classes of bent functions (see Construc-
tion 9.3.37, Remark 9.3.39, and Theorem 9.3.59) contain examples of regular functions
where the upper bound on the degree is achieved when n is even. Other examples of ternary
(weakly) regular bent functions in even dimension that achieve the maximal degree come
from Theorem 9.3.62 and are given by Coulter-Matthews exponents with k = n − 1 (see
Remark 9.3.65). Infinite classes of ternary bent functions in odd dimension (both weakly
and non weakly regular) attaining the bound are obtained from plateaued functions using
Construction 9.3.69. This construction also gives examples of weakly regular functions in
odd dimension with p = 5 that have maximal degree.
9.3.19 Problem It is not known whether the bound for the degree of non weakly regular bent
functions is sharp for general p. The same question remains open for weakly regular bent
functions with odd n and p > 5.
9.3.20 Theorem [1812, 2486] The dual of a weakly regular bent function is again a weakly regular
bent function.
9.3.21 Theorem [1812, Property 8] The Walsh transform coefficients of a p-ary bent function f
with odd p satisfy
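$$p^{-n/2} \hat{f}(y) = \begin{cases} \pm\, \zeta_p^{f^*(y)} & \text{if } n \text{ is even or } n \text{ is odd and } p \equiv 1 \pmod 4, \\ \pm\, i\, \zeta_p^{f^*(y)} & \text{if } n \text{ is odd and } p \equiv 3 \pmod 4, \end{cases}$$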
where i is a complex primitive 4-th root of unity. Regular bent functions can only be found
for even n and for odd n with p ≡ 1 (mod 4). Also, for a weakly regular bent function, the
constant u in Definition 9.3.8 can only be equal to ±1 and ±i.
9.3.23 Example If $f(x) = \mathrm{Tr}_n(x^{p+1})$, defined on $\mathbb{F}_{p^n}$ with p odd and n odd, then
$D_a(f)(x) = f(x + a) - f(x) = \mathrm{Tr}_n(x^p a + a^p x + a^{p+1})$. The polynomial $x^p a + a^p x$ is linearized and
$x^p a + a^p x = 0$ if and only if x = 0 or $(x/a)^{p-1} + 1 = 0$. But the latter expression has no
solution if n is odd, hence $x^p a + a^p x$ is a permutation polynomial and $\mathrm{Tr}_n(x^p a + a^p x + a^{p+1})$
is balanced, hence f is bent.
9.3.24 Theorem Let n = 2m and f : Fpn → Fp be a p-ary bent function. Then
where $\mu = \pm 1$ and $N(j) = \#\{x \in \mathbb{F}_{p^n} : f(x) = j\}$. In particular, $\hat{f}(0) = \pm p^m$.
9.3.25 Remark Weakly regular bent functions are useful for constructing certain combinatorial
objects such as partial difference sets, strongly regular graphs, and association schemes (see
[598, 1058, 2424, 2775]).
is a difference set in $\mathbb{F}_2^n$ with parameters $(2^n, 2^{n-1} \pm 2^{(n/2)-1}, 2^{n-2} \pm 2^{(n/2)-1}; 2^{n-2})$ (see
Section 14.6.4). Equivalent difference sets give rise to EA-equivalent bent functions, but
EA-equivalent bent functions need not necessarily give rise to equivalent difference sets, see
Example 9.3.28.
9.3.27 Remark Theorem 9.3.26 does not hold for p-ary bent functions with p odd. For general p,
the set $R_f := \{(x, f(x)) : x \in \mathbb{F}_p^n\} \subset \mathbb{F}_p^{n+1}$ is a relative difference set [2423].
9.3.28 Example If f and g are EA-equivalent Boolean bent functions, the corresponding differ-
ence sets need not be equivalent. One reason is that the complemented function f , i.e.,
f + 1 describes the complementary difference set. But, more seriously, adding a linear map-
ping l to f may not preserve equivalence of the corresponding difference sets. Using the
bivariate notation introduced in Remark 9.3.52, the functions $f(x, y) = \mathrm{Tr}_4(x \cdot y^7)$ and
$g(x, y) = f(x, y) + \mathrm{Tr}_4(x)$ with $x, y \in \mathbb{F}_{2^4}$, are EA equivalent. Both functions describe
difference sets with parameters (256, 120, 56; 64). These difference sets are inequivalent since
the corresponding designs are not isomorphic.
9.3.29 Theorem [1475] The function $f : \mathbb{F}_2^n \to \mathbb{F}_2$ is bent if and only if
9.3.30 Remark Theorem 9.3.29 shows that f has the largest distance to all affine functions, hence
bent functions solve the covering radius problem for first order Reed-Muller codes of length
$2^n$ with n even. In other words, bent functions are maximum nonlinear functions $\mathbb{F}_2^n \to \mathbb{F}_2$
if n is even.
9.3.31 Example [2486] The following is a complete list of Boolean bent functions on $\mathbb{F}_2^{2m}$ for
$1 \le m \le 3$ (up to equivalence). For the functions $F_3$, $F_4$, $F_5$, and $F_6$, we have m = 3.
1. $x_1x_2$ for m = 1,
2. $x_1x_2 + x_3x_4$ for m = 2,
3. $x_1x_4 + x_2x_5 + x_3x_6 = F_3$,
4. $F_3 + x_1x_2x_3 = F_4$,
5. $F_4 + x_2x_4x_6 + x_1x_2 + x_4x_6 = F_5$,
6. $F_5 + x_3x_4x_5 + x_1x_2 + x_3x_5 + x_4x_5 = F_6$.
9.3.32 Remark The corresponding difference sets are inequivalent, but the design corresponding
to the difference set from F3 is equivalent to the design corresponding to F4 .
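The list in Example 9.3.31 can be checked numerically against the Boolean bent condition $|\hat{f}(y)| = 2^{n/2}$. The following Python sketch is an illustration under stated assumptions: the names `walsh_spectrum`, `is_bent`, and `EXAMPLE_9_3_31` are hypothetical, the standard dot product is used as the inner product, and the fast Walsh-Hadamard butterfly is swapped in for the defining sum purely for convenience.

```python
def walsh_spectrum(truth_table):
    """Fast Walsh-Hadamard transform of (-1)^f, i.e. the list of sums
    over x of (-1)^(f(x) + y.x), indexed by y (standard dot product)."""
    spectrum = [1 - 2 * bit for bit in truth_table]
    step = 1
    while step < len(spectrum):
        for start in range(0, len(spectrum), 2 * step):
            for k in range(start, start + step):
                u, v = spectrum[k], spectrum[k + step]
                spectrum[k], spectrum[k + step] = u + v, u - v
        step *= 2
    return spectrum


def is_bent(func, n):
    """Check |W_f(y)| = 2^(n/2) for all y; func takes n arguments in {0, 1}."""
    if n % 2:  # Boolean bent functions exist only in even dimension
        return False
    table = []
    for index in range(1 << n):
        x = [(index >> i) & 1 for i in range(n)]  # x[0] plays the role of x1
        table.append(func(*x) & 1)                # reduce the value modulo 2
    return all(abs(w) == 1 << (n // 2) for w in walsh_spectrum(table))


def F3(x1, x2, x3, x4, x5, x6):
    return x1 * x4 + x2 * x5 + x3 * x6


def F4(x1, x2, x3, x4, x5, x6):
    return F3(x1, x2, x3, x4, x5, x6) + x1 * x2 * x3


def F5(x1, x2, x3, x4, x5, x6):
    return F4(x1, x2, x3, x4, x5, x6) + x2 * x4 * x6 + x1 * x2 + x4 * x6


def F6(x1, x2, x3, x4, x5, x6):
    return F5(x1, x2, x3, x4, x5, x6) + x3 * x4 * x5 + x1 * x2 + x3 * x5 + x4 * x5


# (function, number of variables) pairs following the list in Example 9.3.31
EXAMPLE_9_3_31 = [
    (lambda x1, x2: x1 * x2, 2),
    (lambda x1, x2, x3, x4: x1 * x2 + x3 * x4, 4),
    (F3, 6), (F4, 6), (F5, 6), (F6, 6),
]
```

A call such as `all(is_bent(f, n) for f, n in EXAMPLE_9_3_31)` runs the check over the whole list. The sketch says nothing about equivalence of the functions, only about bentness.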
9.3.33 Proposition [1991, Chapter 15] Let f be a quadratic function $\sum_{i,j} a_{i,j} x_i x_j$ in 2m variables
over $\mathbb{F}_2$. Then f is bent if and only if one of the following equivalent conditions is satisfied:
1. The matrix $(a_{i,j} + a_{j,i})_{i,j=1,\ldots,2m} \in \mathbb{F}_2^{(2m,2m)}$ is invertible.
2. The symplectic form $B(x, y) := f(x + y) + f(x) + f(y)$ is nondegenerate.
9.3.34 Example
The matrix corresponding to the Boolean function $f(x) = x_1x_2 + x_2x_3 + x_3x_4$ is
$$\begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}$$
and it is invertible, hence f is bent. Similarly, the function $x_1x_2 + x_2x_3 + x_3x_4 + x_1x_4$
is not bent, since the corresponding matrix
$$\begin{pmatrix} 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \end{pmatrix}$$
is singular.
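The invertibility test of Proposition 9.3.33 reduces to a rank computation over $\mathbb{F}_2$. The sketch below is a minimal illustration; the helper names and the representation of a quadratic function as a list of 1-based index pairs (i, j), one per monomial $x_i x_j$, are assumptions of this example.

```python
def gf2_rank(matrix):
    """Rank over F_2 of a 0/1 matrix given as a list of rows."""
    rows = [int("".join(map(str, row)), 2) for row in matrix]  # pack rows into ints
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit of the pivot row
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank


def symplectic_matrix(monomials, n):
    """Matrix (a_ij + a_ji) over F_2 for f = sum of x_i x_j over the given pairs."""
    m = [[0] * n for _ in range(n)]
    for i, j in monomials:
        m[i - 1][j - 1] ^= 1
        m[j - 1][i - 1] ^= 1
    return m


def quadratic_is_bent(monomials, n):
    """Bent iff the symplectic matrix is invertible, i.e. has full rank n."""
    return gf2_rank(symplectic_matrix(monomials, n)) == n
```

With this encoding, the two functions of Example 9.3.34 correspond to `[(1, 2), (2, 3), (3, 4)]` and `[(1, 2), (2, 3), (3, 4), (1, 4)]` with n = 4.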
9.3.35 Proposition [1470] Any quadratic p-ary function f mapping Fpn to Fp is bent if and only if
the bilinear form B(x, y) := f (x+y)−f (x)−f (y)+f (0) associated with f is nondegenerate.
Moreover, all quadratic p-ary bent functions are (weakly) regular.
9.3.36 Remark There are exactly two inequivalent nondegenerate quadratic forms on Fpn with p
odd:
function. Moreover, the bijectiveness of π is necessary and sufficient for f being bent. Such
bent functions are regular and the dual function is equal to $f^*(x, y) = y \cdot \pi^{-1}(x) + \sigma(\pi^{-1}(x))$.
Boolean quadratic bent functions all belong to the completed Maiorana-McFarland class.
On the other hand, for odd p and even n, quadratic bent function f2 in Remark 9.3.36 does
not belong to the completed Maiorana-McFarland class [445].
9.3.38 Example The Boolean function $x_1x_4 + x_2x_5 + x_3x_6 + x_4x_5x_6$ is a Maiorana-McFarland bent
function on 6 variables.
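The dual formula for Maiorana-McFarland functions can be verified by direct summation in the Boolean case. The sketch below assumes the usual form $f(x, y) = x \cdot \pi(y) + \sigma(y)$ on $\mathbb{F}_2^m \times \mathbb{F}_2^m$ (inferred here from the dual formula, since the statement of the construction is not reproduced above), the standard dot product, and the convention $\hat{f} = 2^m (-1)^{f^*}$ for regular bent functions; all function names are hypothetical.

```python
from itertools import product


def dot(u, v):
    """Standard inner product on F_2^m."""
    return sum(a * b for a, b in zip(u, v)) % 2


def mm_function(pi, sigma):
    """f(x, y) = x . pi(y) + sigma(y); pi is a dict on m-tuples (assumed form)."""
    return lambda x, y: (dot(x, pi[y]) + sigma(y)) % 2


def mm_dual(pi, sigma):
    """f*(x, y) = y . pi^{-1}(x) + sigma(pi^{-1}(x)); requires pi bijective."""
    inverse = {image: arg for arg, image in pi.items()}
    return lambda x, y: (dot(y, inverse[x]) + sigma(inverse[x])) % 2


def walsh(f, m, a, b):
    """Sum over (x, y) of (-1)^(f(x, y) + a.x + b.y)."""
    points = list(product((0, 1), repeat=m))
    return sum(
        (-1) ** ((f(x, y) + dot(a, x) + dot(b, y)) % 2)
        for x in points
        for y in points
    )
```

Example 9.3.38 corresponds to m = 3, $\pi$ the identity on $(x_4, x_5, x_6)$ and $\sigma(y) = y_1 y_2 y_3$; for a bijective $\pi$ one can compare `walsh(f, m, a, b)` with `2**m * (-1) ** mm_dual(pi, sigma)(a, b)` at every point.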
9.3.39 Remark The Maiorana-McFarland construction can be used to construct bent functions of
degree (p − 1)m by choosing a function σ of degree (p − 1)m.
9.3.41 Remark These functions are bent functions of partial spread type, since a collection of
subspaces of dimension m with pairwise trivial intersection is a partial spread. The functions
f + are of type PS + , the others of type PS − .
9.3.42 Example Let $q = 2^m$, and view $\mathbb{F}_{q^2}$ as a 2-dimensional space over $\mathbb{F}_q$, but also as a
2m-dimensional vector space over $\mathbb{F}_2$. Then the 1-dimensional subspaces of $\mathbb{F}_{q^2}$ (viewed as a
2-dimensional vector space) are m-dimensional subspaces over $\mathbb{F}_2$. The q + 1 subspaces of
dimension 1 over $\mathbb{F}_q$ are
$$U_\alpha := \{\alpha \cdot x : x \in \mathbb{F}_q\}$$
where $\alpha^{q+1} = 1$, i.e., $\alpha$ is in the multiplicative group of order q + 1 in $\mathbb{F}_{q^2}$. These subspaces
intersect pairwise trivially, hence we may take any $2^{m-1}$ or $2^{m-1} + 1$ of these subspaces to
construct $f^-$ or $f^+$.
9.3.43 Remark
1. The parameters of DI+ are complementary to the parameters of DI− , but the
difference sets are, in general, not complements of each other.
2. All functions in PS − have algebraic degree n/2. On the contrary, if n/2 is even
then class PS + contains all quadratic bent functions [861].
3. A partial spread where the union of the subspaces covers the entire vector space
is called a spread.
4. The spread constructed in Example 9.3.42 is the regular spread.
5. Spreads of subspaces of dimension m in 2m-dimensional subspaces can be used
to describe translation planes.
6. There are numerous spreads, hence many partial spreads. Many partial spreads
are not contained in spreads. For partial spreads contained in spreads, the two
constructions in Construction 9.3.40 are complements of each other, i.e., for any
f − , there is another partial spread such that f − = f + + 1.
9.3.44 Definition The class PS ap is a subclass of PS − where a subspread of the regular spread
is used.
9.3.45 Remark [523] All of the functions in the PS ap class are hyper bent (see Definition 9.3.16).
9.3.46 Remark Some of the known constructions of bent functions are direct, that is, do not use as
building blocks previously constructed bent functions. We call these primary constructions.
The others, sometimes leading to recursive constructions, will be called secondary construc-
tions. Most of the secondary constructions of Boolean bent functions are not explained here
and can be found in [523, Section 6.4.2].
9.3.48 Remark This construction was formulated originally in a much more general form using
relative difference sets [2423]. Since the function $f : \mathbb{F}_p \to \mathbb{F}_p$ with $f(x) = x^2$ is a p-ary bent
function if p is odd, Construction 9.3.47 yields bent functions $\mathbb{F}_p^n \to \mathbb{F}_p$ for all n (p odd).
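The claim that $x^2$ is bent on $\mathbb{F}_p$ for odd p amounts to the quadratic Gauss sum having modulus $\sqrt{p}$, which can be observed numerically. The sketch below is illustrative: it assumes the Walsh transform convention $\hat{f}(y) = \sum_x \zeta_p^{f(x) - yx}$ with $\zeta_p = e^{2\pi i/p}$ (the sign in the exponent does not affect the modulus), and the function names and the tolerance are choices made here.

```python
import cmath


def walsh_coefficient(f, p, y):
    """Sum over x in F_p of zeta_p^(f(x) - y*x), with zeta_p = exp(2*pi*i/p)."""
    zeta = cmath.exp(2j * cmath.pi / p)
    return sum(zeta ** ((f(x) - y * x) % p) for x in range(p))


def is_bent_on_fp(f, p, tol=1e-9):
    """Bent on F_p (dimension n = 1): every Walsh coefficient has squared modulus p."""
    return all(
        abs(abs(walsh_coefficient(f, p, y)) ** 2 - p) < tol for y in range(p)
    )
```

For example, `is_bent_on_fp(lambda x: x * x, 7)` tests the quadratic function on $\mathbb{F}_7$, whereas a linear function such as `lambda x: x` concentrates its whole spectrum in a single coefficient and fails the test.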
9.3.49 Example The following monomial functions $f(x) = \mathrm{Tr}_n(\alpha x^d)$ are bent on $\mathbb{F}_{2^n}$ with
n = 2m:
1. $d = 2^k + 1$ with n/gcd(k, n) being even and $\alpha \notin \{y^d : y \in \mathbb{F}_{2^n}\}$ [1295];
2. $d = r(2^m - 1)$ with $\gcd(r, 2^m + 1) = 1$ and $\alpha \in \mathbb{F}_{2^m}$ being −1 of the Kloosterman
5. $d = 2^{2k} + 2^k + 1$ with n = 6k and k > 1, $\alpha \in \mathbb{F}_{2^{3k}}$ with $\mathrm{Tr}_{\mathbb{F}_{2^{3k}}/\mathbb{F}_{2^k}}(\alpha) = 0$ [495].
9.3.50 Remark These functions are monomial bent functions. Functions in Part 1 are quadratic,
and those in Parts 4 and 5 belong to the Maiorana-McFarland class. Functions in Part 2
are in the class PS ap . An exhaustive search shows that there are no other monomial bent
functions for n ≤ 20.
9.3.51 Example [531, 532, 907, 1474, 1878] (Niho bent functions) A positive integer d (always
understood modulo $2^n - 1$ with n = 2m) is a Niho exponent if $d \equiv 2^j \pmod{2^m - 1}$ for
some j < n. As we consider $\mathrm{Tr}_n(a x^d)$ with $a \in \mathbb{F}_{2^n}$, without loss of generality, we can
assume that d is in normalized form, i.e., with j = 0. Then we have a unique representation
$d = (2^m - 1)s + 1$ with $2 \le s \le 2^m$. Following are examples of bent functions consisting of
one or more Niho exponents:
1. Quadratic function $\mathrm{Tr}_m(a x^{2^m+1})$ with $a \in \mathbb{F}_{2^m}^*$ (here $s = 2^{m-1} + 1$).
2. Binomials of the form $f(x) = \mathrm{Tr}_n(\alpha_1 x^{d_1} + \alpha_2 x^{d_2})$, where $2d_1 \equiv 2^m + 1 \pmod{2^n - 1}$
and $\alpha_1, \alpha_2 \in \mathbb{F}_{2^n}^*$ are such that $(\alpha_1 + \alpha_1^{2^m})^2 = \alpha_2^{2^m+1}$. Equivalently, denoting
$a = (\alpha_1 + \alpha_1^{2^m})^2$ and $b = \alpha_2$ we have $a = b^{2^m+1} \in \mathbb{F}_{2^m}^*$ and
$$f(x) = \mathrm{Tr}_m(a x^{2^m+1}) + \mathrm{Tr}_n(b x^{d_2}).$$
$$d_2 = (2^m - 1)3 + 1,$$
$$6d_2 = (2^m - 1) + 6 \quad \text{(with the condition that } m \text{ is even).}$$
These functions have algebraic degree m and do not belong to the completed
Maiorana-McFarland class [446].
3. Take r > 1 with gcd(r, m) = 1 and define
$$f(x) = \mathrm{Tr}_n\Big(a x^{2^m+1} + \sum_{i=1}^{2^{r-1}-1} x^{d_i}\Big),$$
where $2^r d_i = (2^m - 1)i + 2^r$ and $a \in \mathbb{F}_{2^n}$ is such that $a + a^{2^m} = 1$. The dual of
f, calculated using Proposition 9.3.56, is equal to
$$f^*(y) = \mathrm{Tr}_m\Big(\big(u(1 + y + y^{2^m}) + u^{2^{n-r}} + y^{2^m}\big)(1 + y + y^{2^m})^{1/(2^r-1)}\Big),$$
where $u \in \mathbb{F}_{2^n}$ is arbitrary with $u + u^{2^m} = 1$. Moreover, if d < m is a positive
integer defined uniquely by $dr \equiv 1 \pmod m$ then the algebraic degree of $f^*$ is
equal to d + 1. Both functions f and its dual belong to the completed
Maiorana-McFarland class. On the other hand, $f^*$ is not a Niho bent function.
4. Bent functions in a bivariate representation obtained from the known
o-polynomials (see Remarks 9.3.52 and 9.3.54).
9.3.52 Remark The bivariate representation of a Boolean function is defined as follows: we identify
$\mathbb{F}_2^n$ with $\mathbb{F}_{2^m} \times \mathbb{F}_{2^m}$ and consider the argument of f as an ordered pair (x, y) of elements
in $\mathbb{F}_{2^m}$. There exists a unique bivariate polynomial $\sum_{0 \le i,j \le 2^m-1} a_{i,j} x^i y^j$ over $\mathbb{F}_{2^m}$ that
represents f. The algebraic degree of f is equal to $\max_{(i,j)\,:\,a_{i,j} \ne 0} (w_2(i) + w_2(j))$ ($w_2(j)$
is the Hamming weight of the binary expansion of j). Since f is Boolean the bivariate
representation can be written in the form of $f(x, y) = \mathrm{Tr}_m(P(x, y))$, where P(x, y) is some
polynomial of two variables over $\mathbb{F}_{2^m}$.
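9.3.53 Construction [532, 861] Define class H of functions in their bivariate representation as
follows
$$g(x, y) = \begin{cases} \mathrm{Tr}_m\big(x H(y/x)\big) & \text{if } x \ne 0, \\ \mathrm{Tr}_m(\mu y) & \text{if } x = 0, \end{cases} \qquad (9.3.1)$$
where $\mu \in \mathbb{F}_{2^m}$ and H is a mapping from $\mathbb{F}_{2^m}$ to itself with $G(z) := H(z) + \mu z$ satisfying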
9.3.54 Remark
1. Any mapping G on F2m that satisfies (9.3.2) is an o-polynomial.
2. Any o-polynomial defines a hyperoval in PG(2, 2m ). Using Construction 9.3.53,
every o-polynomial results in a bent function in class H. For a list of known, up
to equivalence, o-polynomials, see [532].
9.3.55 Remark A bent function $\mathbb{F}_{2^n} \to \mathbb{F}_2$ with n = 2m belongs to class H if and only if its
restriction to each coset $u\mathbb{F}_{2^m}$ with $u \in \mathbb{F}_{2^n}^*$ is linear. Thus, Niho bent functions from
Example 9.3.51 are just functions of class H viewed in their univariate representation. In
particular, binomial Niho bent functions with $d_2 = (2^m - 1)3 + 1$ correspond to Subiaco
hyperovals [1474], functions with $6d_2 = (2^m - 1) + 6$ correspond to Adelaide hyperovals and
functions listed under number 3 (consisting of $2^r$ terms) are obtained from Frobenius map
$G(z) = z^{2^{m-r}}$ (i.e., translation hyperovals) [531].
9.3.56 Proposition [532] Let g be a bent function having the form of (9.3.1). Then its dual function
g ∗ , represented in its bivariate form, satisfies g ∗ (α, β) = 1 if and only if the equation
H(z) + βz = α has no solutions in F2m .
9.3.57 Theorem [1470] Consider an odd prime p and nonzero $a \in \mathbb{F}_{p^n}$. Then for any $j \in \{1, \ldots, n\}$,
the quadratic p-ary function f mapping $\mathbb{F}_{p^n}$ to $\mathbb{F}_p$ and given by
$$f(x) = \mathrm{Tr}_n\big(a x^{p^j+1}\big)$$
is bent if and only if
$$p^{\gcd(2j,n)} - 1 \nmid \frac{p^n - 1}{2} - \mathrm{ind}(a)(p^j - 1), \qquad (9.3.3)$$
where ind(a) denotes the unique integer $t \in \{0, \ldots, p^n - 2\}$ with $a = \xi^t$ and $\xi$ being a
primitive element of $\mathbb{F}_{p^n}$.
9.3.58 Remark The functions in Theorem 9.3.57 are quadratic, hence they are (weakly) regular.
In [1471], the dual function has been determined explicitly.
9.3.59 Theorem [1470] Let n = 2m and t be an arbitrary positive integer with $\gcd(t, p^m + 1) = 1$
for an odd prime p. For any nonzero $a \in \mathbb{F}_{p^n}$, define the following p-ary function mapping
$\mathbb{F}_{p^n}$ to $\mathbb{F}_p$
$$f(x) = \mathrm{Tr}_n\big(a x^{t(p^m-1)}\big). \qquad (9.3.4)$$
Let K denote the Kloosterman sum (see Definition 6.1.118). Then for any $y \in \mathbb{F}_{p^n}^*$, the
corresponding Walsh transform coefficient of f is equal to
$$\hat{f}(y) = 1 + K\big(a^{p^m+1}\big) + p^m \zeta_p^{-\mathrm{Tr}_n\left(a^{p^m} y^{t(p^m-1)}\right)}$$
and
$$\hat{f}(0) = 1 - (p^m - 1) K\big(a^{p^m+1}\big).$$
Assuming $p^m > 3$, then f is bent if and only if $K\big(a^{p^m+1}\big) = -1$. Moreover, if the latter
9.3.60 Remark According to the result of Katz and Livné [1695] (see also [2121, Theorem 6.4]),
with c running over $\mathbb{F}_{3^m}^*$, the Kloosterman sum K(c) takes on all the integer values in the
range $(-2\sqrt{3^m}, 2\sqrt{3^m})$ that are equal to −1 modulo 3. In particular, there exists at least
one $a \in \mathbb{F}_{3^n}$ such that $K\big(a^{3^m+1}\big) = -1$. This means that in the ternary case (i.e., when
p = 3) and given the conditions of Theorem 9.3.59, there exists at least one $a \in \mathbb{F}_{3^n}$ such
that the function (9.3.4) is bent. Moreover, there are no bent functions having the form of
(9.3.4) when p > 3 since in this case, the Kloosterman sum never takes on the value −1 as
shown in [1785]. More on Kloosterman sums is contained in Section 6.1.
9.3.61 Remark Let p = 2 and without loss of generality assume $a \in \mathbb{F}_{2^m}$. Then exactly the same
result as in Theorem 9.3.59 holds for any m in the binary case giving the Dillon class of
bent functions [591, 861, 1879] (see Example 9.3.49 Part 2). Moreover, the Kloosterman
sum over $\mathbb{F}_{2^m}$ takes on all the integer values in the closed range $[-2\sqrt{2^m}, 2\sqrt{2^m}]$ that are
equal to −1 modulo 4 [1828]. This means that binary Dillon bent functions exist.
9.3.62 Theorem [1315, 1469, 1470] Let n = 2m with m odd and $a = \xi^{(3^m+1)/4}$, where $\xi$ is a primitive
element of $\mathbb{F}_{3^n}$. Then the ternary function f mapping $\mathbb{F}_{3^n}$ to $\mathbb{F}_3$ and given by
$$f(x) = \mathrm{Tr}_n\Big(a x^{\frac{3^n-1}{4} + 3^m + 1}\Big)$$
where o(t) is the size of the cyclotomic coset modulo $3^m - 1$ that contains t and the set $I_m$
is defined as follows. Select all integers in the range $\{0, \ldots, 3^m - 1\}$ whose ternary expansion
contains no digit 2 and no two adjacent digits 1 (the least significant digit
is cyclically linked with the most significant). Split this set into cyclotomic cosets modulo
$3^m - 1$, take coset leaders and denote this subset $I_m$.
9.3.63 Theorem [1472] Let n = 4k. Then the p-ary function f mapping $\mathbb{F}_{p^n}$ to $\mathbb{F}_p$ and given by
$$f(x) = \mathrm{Tr}_n\big(x^{p^{3k} + p^{2k} - p^k + 1} + x^2\big)$$
is a weakly regular bent function of degree (p − 1)k + 2. Moreover, for any $y \in \mathbb{F}_{p^n}$ the
corresponding Walsh transform coefficient of f is equal to
9.3.64 Remark Some sporadic examples of ternary (non) weakly regular bent functions consisting
of one or two terms in the univariate representation can be found in [1473].
9.3.65 Remark The definition of planar functions and further information on them can be found in Section 9.5.
In particular, except for one class, all known planar functions are quadratic [451], which
means that they can be represented by so-called Dembowski-Ostrom polynomials (see
Definition 9.5.17 and [732]). The only known example of a nonquadratic planar function is
$F(x) = x^{(3^k+1)/2}$ over $\mathbb{F}_{3^n}$ with gcd(k, n) = 1 and odd k, known as the Coulter-Matthews function
[732]. The following theorem shows that every planar function gives a family of generalized
bent functions.
9.3.66 Theorem [527] A function F mapping Fpn to itself is planar if and only if for every nonzero
a ∈ Fpn the function Trn (aF ) is generalized bent.
9.3.67 Remark The generalized bent functions Trn (aF ) obtained from Dembowski-Ostrom poly-
nomials are quadratic, hence they are weakly regular (see Proposition 9.3.35). It was shown
in [1055, 3042] that the bent functions coming from the Coulter-Matthews planar functions
are also weakly regular.
9.3.68 Definition A function f mapping $\mathbb{F}_{p^n}$ to $\mathbb{F}_p$ is s-plateaued if its Walsh coefficients either
equal zero or satisfy $|\hat{f}(y)|^2 = p^{n+s}$. The case s = 0 corresponds to bent functions.
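For p = 2 the plateaued condition can be tested directly from the definition. The following sketch is illustrative and restricted to Boolean functions given in multivariate form; the names `walsh_spectrum` and `plateau_order`, and the use of the standard dot product, are assumptions of this example.

```python
def walsh_spectrum(func, n):
    """Walsh coefficients of a Boolean function: for each y, the sum over x
    of (-1)^(func(x) + y.x), with the standard dot product on F_2^n."""
    spectrum = []
    for y in range(1 << n):
        total = 0
        for x in range(1 << n):
            bits = [(x >> i) & 1 for i in range(n)]  # bits[0] plays the role of x1
            parity = bin(x & y).count("1")           # y . x before reduction mod 2
            total += (-1) ** ((func(*bits) + parity) % 2)
        spectrum.append(total)
    return spectrum


def plateau_order(func, n):
    """Return s if func is s-plateaued (all nonzero |W|^2 equal 2^(n+s)), else None."""
    squares = {w * w for w in walsh_spectrum(func, n) if w != 0}
    if len(squares) != 1:
        return None
    value = squares.pop()
    s = value.bit_length() - 1 - n
    return s if s >= 0 and value == 1 << (n + s) else None
```

A return value of 0 identifies a bent function, in line with the last sentence of Definition 9.3.68; for instance `plateau_order(lambda x1, x2, x3: x1 * x2 + x3, 3)` examines a quadratic function in odd dimension.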
9.3.69 Construction [571, 572] For every $a = (a_1, a_2, \ldots, a_s) \in \mathbb{F}_p^s$ let $f_a$ be an s-plateaued
function from $\mathbb{F}_{p^n}$ to $\mathbb{F}_p$ such that $\hat{f}_a(t) \cdot \hat{f}_b(t) \equiv 0$ for any $a, b \in \mathbb{F}_p^s$ with $a \ne b$ and $t \in \mathbb{F}_{p^n}$.
Then function $f(x, y_1, y_2, \ldots, y_s)$ from $\mathbb{F}_{p^n} \times \mathbb{F}_p^s$ to $\mathbb{F}_p$ defined by
$$f(x, y_1, y_2, \ldots, y_s) = (-1)^s \sum_{a \in \mathbb{F}_p^s} \frac{\prod_{i=1}^{s} y_i (y_i - 1) \cdots (y_i - (p - 1))}{(y_1 - a_1) \cdots (y_s - a_s)} f_a(x)$$
is bent. Moreover, for any $t \in \mathbb{F}_{p^n}$ and $a \in \mathbb{F}_p^s$, the corresponding Walsh transform coefficient
of f is equal to
$$\hat{f}(t, a) = \zeta_p^{-a \cdot y} \hat{f}_y(t),$$
where $y \in \mathbb{F}_p^s$ is unique with $\hat{f}_y(t) \ne 0$.
9.3.70 Remark Quadratic functions are always plateaued. Then one can construct s-plateaued
functions with the prescribed properties by adding suitably chosen linear functions to such
quadratic s-plateaued functions. In this case, the constructed bent function f has degree
(p − 1)s + 2 (resp. (p − 1)s + 1) if $\sum_{a \in \mathbb{F}_p^s} f_a$ is quadratic (resp. affine). It is possible to
construct specific families of quadratic s-plateaued functions that lead to weakly regular
bent functions when n − s is even. Similar constructions with n − s odd can lead both to
weakly and non weakly regular bent functions. On the other hand, one can take a suitable
family of (n − 1)-plateaued quadratic functions (maximal order achievable by a non affine
function). Then in the case when the degree of the corresponding bent function exceeds
(p − 1)n/2 being the maximum for a weakly regular function (see Theorem 9.3.17), the
obtained functions are non weakly regular.
functions $f_a : \mathbb{F}_p^n \to \mathbb{F}_p$ defined by $f_a(x) = a \cdot x$ are bent for all $a \ne 0$. Here “·” denotes
a symmetric bilinear form (see Definition 9.3.2).
9.3.74 Remark A collection of $2^{2m-1} - 1$ quadratic Boolean bent functions in 2m variables such
that the sum of any two of them is bent again, gives rise to a Kerdock code of length $2^{2m}$.
The Boolean functions in Example 9.3.72 Part 2 only give rise to $2^m - 1$ such bent functions.
See Also
References Cited: [445, 446, 451, 495, 523, 524, 526, 527, 529, 531, 532, 571, 572, 591, 594,
598, 732, 809, 861, 864, 907, 1055, 1058, 1295, 1315, 1409, 1469, 1470, 1471, 1472, 1473,
1474, 1475, 1539, 1543, 1565, 1695, 1785, 1812, 1828, 1842, 1856, 1878, 1879, 1991, 2051,
2121, 2224, 2302, 2423, 2424, 2486, 2775, 3038, 3042]
9.4.2 Theorem [2034] The polynomial f (x1 , . . . , xn ) is a κ-polynomial over Fq if and only if
$$\sum_{a \in \mathbb{F}_q^n} \chi(f(a)) = 0$$
9.4.5 Definition Let $S = \langle S, +, \star \rangle$ be a set with two binary operations, addition + and
multiplication $\star$ and where $\langle S, + \rangle$ forms a group.
1. S is a Cartesian group if it has no zero divisors and $\langle S^*, \star \rangle$ forms a loop.
2. S is a left (resp. right) quasifield if, in addition to being a Cartesian group, S
also has a left (resp. right) distributive law.
3. S is a semifield if it is a non-associative division ring – that is, a Cartesian group
with both distributive laws.
If we do not insist on a multiplicative identity, so that $\langle S^*, \star \rangle$ forms a quasigroup
instead of a loop, then we may speak of a pre-Cartesian group, prequasifield, or
presemifield.
9.4.6 Remark Each algebraic object corresponds with a type of projective plane under the Lenz-
Barlotti classification. Specifically, Cartesian groups correspond to type II, quasifields to
type IV, and semifields to type V. For more information along these lines see Section 14.3,
as well as the classical texts of Dembowski [807] and Hughes and Piper [1560]. See also
Subsection 2.1.7.6.
9.4.7 Theorem
1. The additive group of a quasifield is necessarily an elementary abelian p-group.
2. [1764] Any semifield of prime or prime squared order is necessarily a finite field.
3. [1764] Proper semifields, i.e., semifields which are not fields, exist for all prime
power orders $p^e \ge 16$ with $e \ge 3$.
9.4.8 Theorem Let M (x, y) be a κ-polynomial over Fq satisfying M (x, y) = 0 if and only if
xy = 0. Define the sets L and R as follows:
9.4.10 Remark There are many examples of quasifield planes – for example, the classification
of translation planes of order 49 by Mathon and Royle [2025] reveals there are 1347 non-
isomorphic such planes. However, there does not appear to have been any attempt to study
the corresponding κ-polynomials, nor the non-linearized permutation polynomials in L or
R that arise from quasifields (through the known examples or by attempting to prove
restrictions on the permissible sets of permutation polynomials which can represent the
non-distributive side of a quasifield).
9.4.11 Remark Unsurprisingly, given the significant amount of extra structure, much more is
known about semifields than quasifields or Cartesian groups. Consequently, the remainder
of this section focuses on the semifield case.
9.4.12 Definition Let $S_1 = \langle \mathbb{F}_q, +, \star \rangle$ and $S_2 = \langle \mathbb{F}_q, +, * \rangle$ be two presemifields. Then $S_1$ and $S_2$
are isotopic if there exist three non-singular linear transformations $L, M, N \in \mathbb{F}_q[x]$ of
$\mathbb{F}_q$ over $\mathbb{F}_p$ such that
9.4.13 Remark This definition of equivalence, which is clearly much weaker than the standard
ring isomorphism, arises from projective geometry, where its importance is underlined by
the following result of Albert.
9.4.14 Theorem [71] Two presemifields coordinatize isomorphic planes if and only if they are
isotopic.
9.4.15 Remark Any presemifield S = ⟨Fq, +, ◦⟩ is isotopic to a semifield via the following
transformation: choose any α ∈ F∗q and define a new multiplication ⋆ by
(x ◦ α) ⋆ (α ◦ y) = x ◦ y
9.4.17 Proposition (Dickson’s commutative semifields) Let q = p^e with p an odd prime, e > 1
and let {1, λ} be a basis for F_{q^2} over Fq. For any j a non-square of Fq and any non-trivial
automorphism σ of Fq, we define a binary operation ⋆ by
9.4.18 Proposition (Albert’s generalized twisted fields) [72, 73] For any prime power q, select
σ, τ ∈ Aut(Fq) and let j ∈ Fq be any element such that (xy)^{-1} x^σ y^τ = j has no solution for
x, y ∈ F∗q. Define a binary operation ⋆ by
9.4.19 Remark The Dickson semifields of Proposition 9.4.17 were the first published examples
of non-associative finite division rings, while Albert’s construction is both historically im-
portant and fundamental, as it has played a significant role in subsequent classification
results, see Theorem 9.4.24 below. These two constructions are the only ones needed for our
discussion. To highlight the variety of construction techniques developed, even for isotopic
semifields, we direct the interested reader to the following non-exhaustive list:
1. the paper [1764] of Knuth which contains several constructions;
2. the Cohen-Ganley and Ganley commutative semifields [689, 1170];
3. the Jha-Johnson semifields [1608, 1680];
4. the Hiramine-Matsumoto-Oyama quasifield construction [1508, 1679];
5. the Kantor-Williams semifields [1682];
6. the semifield construction using spread sets viewed as linear maps [518, 952].
9.4.20 Definition Let S = ⟨Fq, +, ⋆⟩ be a finite semifield. We define the left, middle, and right
nucleus of S, denoted by N_l, N_m and N_r, respectively, as follows:
9.4.21 Remark It is easy to show that all nuclei are finite fields. The nuclei measure how far S is
from being associative. Moreover, the orders of the nuclei are invariants of S under isotopy
and so act as a coarse signature of the semifield. We note that if S is commutative, then
Nl = Nr ⊂ Nm .
9.4.22 Theorem (Restricting the corresponding κ-polynomial) [729] Let n and e be natural
numbers. Set q = p^e for some odd prime p and t(x) = x^q − x. Let S be a semifield of order q^n
with middle nucleus containing Fq. Then there exists a semifield S′ = ⟨F_{q^n}, +, ⋆⟩, isotopic to
K(x, y) = \sum_{i,j=0}^{(n-1)e-1} a_{ij} x^{p^i} y^{p^j}.
9.4.23 Remark Any semifield S can be represented as a right vector space over N_l, a left vector
space over N_r, and both a left and a right vector space over N_m. The concept of dimension
therefore naturally arises in the study of semifields.
1. [2081] Any three dimensional semifield over N is necessarily either a finite field
or a twisted field.
2. [2082] Fix d to be a prime. For sufficiently large q, any semifield of dimension d
over N = Fq is necessarily either a finite field or a twisted field.
9.4.25 Theorem (Dimension two results for commutative semifields) Let S be a commutative
semifield of order q 2n with [S : Nm ] = 2.
9.4.26 Theorem (Strong isotopy for commutative semifields) [728] Let S1 = ⟨Fq, +, ⋆⟩ and
S2 = ⟨Fq, +, ∗⟩ be isotopic commutative presemifields and let S1′ be any commutative
semifield corresponding to S1. Set d = [N_m(S1′) : N(S1′)].
See Also
[138] For a recent attempt to resolve the problem of establishing inequivalence between
projective planes. Whether dealing with Cartesian groups, quasifields or
semifields, present methods for establishing the inequivalence of examples are
technical and generally unwieldy. Of major interest would be a new and
efficient method for doing so.
[787] Determines many ovals in semifields, as well as in the planes generated by
the planar functions of Proposition 9.5.11 Part 2 of the next section.
[1560] Gives a good discussion on the coordinatization method for projective planes.
[1678] Gives a conjecture concerning the asymptotic number of pairwise non-isotopic
semifields of fixed characteristic. The conjecture is proved for characteristic
two by Kantor [1677]; see also [1679, 1682].
[1871] For a recent survey emphasizing the geometric approach to semifields by two
of the leading authors in that field.
[2969] Conjectures the existence of left or right primitive elements (suitably defined)
in finite semifields. This was proved by Gow and Sheekey [1346] for semifields
of sufficiently large order relative to the characteristic. See [1487, 2490] for
counterexamples of small order in characteristic two.
References Cited: [71, 138, 327, 466, 518, 689, 728, 729, 787, 807, 810, 845, 847, 952, 1170,
1346, 1487, 1508, 1560, 1608, 1677, 1678, 1679, 1680, 1682, 1751, 1764, 1871, 2025, 2034,
2081, 2082, 2490, 2491, 2492, 2521, 2891, 2969]
9.5.1 Definition Let G and H be arbitrary finite groups, written additively, but not necessarily
abelian. A function f : G → H is a planar function if for every non-identity a ∈ G the
functions ∆f,a : x ↦ f(a + x) − f(x) and ∇f,a : x ↦ −f(x) + f(x + a) are bijections.
A polynomial f ∈ Fq[x] is planar over Fq if the function induced by f on Fq is a planar
function (on ⟨Fq, +⟩).
9.5.2 Remark A few points should be made clear from the outset.
9.5.6 Remark Given groups G and H as in Definition 9.5.1 and a function f : G → H, an incidence
structure I(G, H; f ) may be defined as follows: “Points” are the elements of G × H; “Lines”
are the symbols L(a, b) with (a, b) ∈ G × H, together with the symbols L(c) with c ∈ G;
incidence is defined by
9.5.11 Proposition
1. [732] x^{p^α+1} is planar over F_{p^e} if and only if e/gcd(α, e) is odd.
2. [732] x^{(3^α+1)/2} is planar over F_{3^e} if and only if gcd(α, 2e) = 1.
3. [732, 874] For any a ∈ F_{3^e}, x^10 + a x^6 − a^2 x^2 is planar over F_{3^e} if and only if e is
odd or e = 2.
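Part 1 can be spot-checked by brute force on a small field. The following is a minimal sketch, not taken from the cited sources: it assumes the model F_9 = F_3[i] with i^2 = −1 (valid because x^2 + 1 is irreducible over F_3), and the helper names are illustrative only.

```python
from math import gcd

P = 3  # characteristic; F_9 is modelled as F_3[i] with i^2 = -1 (an assumption of this sketch)

def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def sub(u, v):
    return ((u[0] - v[0]) % P, (u[1] - v[1]) % P)

def mul(u, v):
    # (a + b i)(c + d i) = (ac - bd) + (ad + bc) i
    return ((u[0] * v[0] - u[1] * v[1]) % P, (u[0] * v[1] + u[1] * v[0]) % P)

def power(u, n):
    result = (1, 0)
    for _ in range(n):
        result = mul(result, u)
    return result

FIELD = [(a, b) for a in range(P) for b in range(P)]

def is_planar_monomial(n):
    """True if x -> (x + a)^n - x^n is a bijection of F_9 for every a != 0."""
    for a in FIELD:
        if a == (0, 0):
            continue
        images = {sub(power(add(x, a), n), power(x, n)) for x in FIELD}
        if len(images) != len(FIELD):
            return False
    return True

e = 2  # F_9 = F_{3^2}
# Brute-force answer for the exponents 3^alpha + 1, alpha = 0, 1, 2 ...
results = {alpha: is_planar_monomial(P**alpha + 1) for alpha in range(3)}
# ... against the criterion "e / gcd(alpha, e) is odd" (math.gcd(0, e) equals e).
predicted = {alpha: (e // gcd(alpha, e)) % 2 == 1 for alpha in range(3)}
```

On this field the two dictionaries agree: x^2 and x^10 are planar, while x^4 is not.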
9.5.12 Remark Many more examples can be generated from commutative semifields of odd order,
see Section 9.5.5. The examples of Proposition 9.5.11 Part 2 do not generalize to larger
characteristic – apply Proposition 9.5.24 Part 2. When e is odd, these examples generate
the only Lenz-Barlotti type II planes of non-square order known [732]; see [263, 787, 1758]
for more concerning these planes.
9.5.13 Definition Let f, g ∈ Fq [x] be two planar polynomials over Fq . Set f1 (respectively g1 )
to be the polynomial which results when f (respectively g) is stripped of all linearized
and constant terms. Then f and g are planar equivalent if there exist two linearized
permutation polynomials L, M ∈ Fq [x] satisfying L(f1 (x)) ≡ g1 (M (x)) mod (xq − x).
9.5.14 Theorem [732] Let f ∈ Fq [x] and let L ∈ Fq [x] be a linearized polynomial. Then the
following are equivalent.
1. f (L) is a planar polynomial.
2. L(f ) is a planar polynomial.
3. f is a planar polynomial and L is a permutation polynomial.
9.5.15 Theorem (Isomorphic planes and planar equivalence)
9.5.17 Definition A Dembowski-Ostrom (or DO) polynomial over a field of characteristic p is any
polynomial of the shape
\sum_{i,j} a_{ij} x^{p^i + p^j}.
9.5.18 Theorem [732] Let f ∈ Fq [x] with deg(f ) < q. The following are equivalent.
to those examples remain the only known counterexamples in any characteristic and it is
eminently possible that no other non-DO examples exist.
1. [1286, 1506, 2481] The polynomial f ∈ Fp[x] is planar over Fp if and only if the
reduced form of f is a quadratic.
2. [726] The polynomial x^n is planar over F_{p^2} if and only if n ≡ 2p^i mod (p^2 − 1)
for some integer i ∈ {0, 1}.
3. [731] For prime p ≥ 5, the polynomial x^n is planar over F_{p^4} if and only if n ≡
2p^i mod (p^4 − 1) for some integer i ∈ {0, 1, 2, 3}.
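The prime-field statement in Part 1 is easy to test exhaustively for small p. The sketch below applies Definition 9.5.1 directly over F_p (where the two difference maps coincide because the group is abelian); the function names are hypothetical and the polynomial is given by a coefficient list, lowest degree first.

```python
def is_planar(coeffs, p):
    """Brute-force planarity test over F_p; coeffs[i] is the coefficient of x^i."""
    def f(x):
        return sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p

    for a in range(1, p):
        # x -> f(x + a) - f(x) must hit every element of F_p exactly once.
        differences = {(f((x + a) % p) - f(x)) % p for x in range(p)}
        if len(differences) != p:
            return False
    return True

def planar_monomial_exponents(p):
    """Exponents n with 1 <= n <= p - 1 for which x^n is planar over F_p."""
    return [n for n in range(1, p) if is_planar([0] * n + [1], p)]
```

For the odd primes tried in the accompanying checks, the only planar monomial of degree below p is x^2, as the classification predicts.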
9.5.22 Remark These results represent the only general classification results on planar polynomials
so far obtained.
9.5.23 Proposition (General necessary conditions) [734, 871, 1818] Let f ∈ Fq [x] and let V (f ) =
{f (a) : a ∈ Fq }. A necessary condition for f to be planar over Fq is #V (f ) ≥ (q + 1)/2. If
f is a DO polynomial, then this condition is also sufficient.
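As a small illustration of the value-set bound, restricted to prime fields for simplicity, one can count images directly. This is only a sketch; `value_set_size` is a name introduced here, not one from the cited papers.

```python
def value_set_size(coeffs, p):
    """Number of distinct values taken on F_p by the polynomial with coefficient list coeffs."""
    return len({sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p
                for x in range(p)})

# The planar DO polynomial x^2 attains the bound (p + 1)/2 exactly.
squares_over_f11 = value_set_size([0, 0, 1], 11)

# x^3 over F_7 takes too few values, so it cannot be planar.
cubes_over_f7 = value_set_size([0, 0, 0, 1], 7)
```

Here x^2 meets the bound with equality, while x^3 over F_7 takes only the values 0, 1 and 6, fewer than (7 + 1)/2 = 4.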
9.5.24 Proposition (Necessary conditions for monomials) Let x^n be planar over F_{p^e}. The following
statements hold.
1. [808] gcd(n, p^e − 1) = 2.
2. [1611] n ≡ 2 mod (p − 1).
3. [726] If 2|e, then n ≡ 2p^i mod (p^2 − 1).
4. [731] If 4|e, then n ≡ 2p^i mod (p^4 − 1).
9.5.27 Remark By Theorem 9.5.25, classifying commutative semifields of odd order is equivalent to
determining the “isotopism classes” of planar DO polynomials. Moreover, Theorem 9.4.26
shows that any strong isotopism class of commutative semifields can split into at most
two isotopism classes, so that there are sound reasons for considering the strong isotopism
problem on planar DO polynomials instead of the more difficult isotopism problem.
9.5.28 Theorem (Strong isotopy and planar equivalence) [728] Let f, g ∈ Fq [x] be planar DO
polynomials with corresponding commutative presemifields Sf and Sg . There is a strong
isotopism (N, N, L) between Sf and Sg if and only if f (N (x)) ≡ L(g(x)) mod (xq − x).
See Also
§9.2 For APN functions that are closely related to planar functions.
§9.3 For bent functions that are closely related to planar functions.
§14.3 For affine and projective planes; the seminal paper [808] clearly outlines the
main properties of the planes constructed via planar functions.
§14.6 Discusses difference sets. Ding and Yuan [874] used the examples of
Proposition 9.5.11 Part 3 to disprove a long-standing conjecture on skew
Hadamard difference sets; see also [871, 2972].
[273] Construct further classes of planar DO polynomials; see also [274], [450], [451],
[2383], [3059], [3060]. The problem of planar (in)equivalence between these
constructions is not completely resolved at the time of writing. An incredible
new class, which combines Albert’s twisted fields with Dickson’s semifields,
was very recently discovered by Pott and Zhou [2425].
[728] Classifies planar DO polynomials over fields of order p2 and p3 . This does not
constitute a classification of planar polynomials over fields of these orders.
[729] Applies Theorem 9.4.22 to commutative presemifields of odd order to restrict
both the form of the DO polynomials and the isotopisms that need to be
considered; see also [1799]. A promising alternative approach (which applies
also to APN functions, see Section 9.2) is outlined in [274], while a third
approach was given recently in [2973].
[1507] For results on possible forms of planar functions not defined over finite fields.
[1799] Gives specific forms for planar DO polynomials corresponding to the Dickson
semifields [847], the Cohen-Ganley semifields [689], the Ganley semifields [1170],
and the Penttila-Williams semifield [2383].
References Cited: [263, 273, 274, 326, 450, 451, 689, 726, 728, 729, 731, 732, 734, 787, 808,
811, 847, 871, 874, 1170, 1286, 1506, 1507, 1611, 1758, 1799, 1818, 2383, 2393, 2425, 2481,
2972, 2973, 3059, 3060]
9.6.1 Basics
9.6.1 Definition Let n be a positive integer. For a ∈ Fq, we define the n-th Dickson polynomial
of the first kind D_n(x, a) over Fq by
D_n(x, a) = \sum_{i=0}^{\lfloor n/2 \rfloor} \frac{n}{n-i} \binom{n-i}{i} (-a)^i x^{n-2i}.
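The defining sum translates directly into code. The sketch below is restricted to a prime field F_p for simplicity (an assumption of this example; the definition itself is over any Fq) and returns coefficients lowest degree first.

```python
from math import comb

def dickson_first_kind(n, a, p):
    """Coefficients of D_n(x, a) reduced mod p; entry i is the coefficient of x^i (n >= 1)."""
    coeffs = [0] * (n + 1)
    for i in range(n // 2 + 1):
        # n/(n-i) * C(n-i, i) is an integer, so the integer division below is exact.
        c = n * comb(n - i, i) // (n - i)
        coeffs[n - 2 * i] = (c * (-a) ** i) % p
    return coeffs

# D_12(x, 2) over F_5, the polynomial used in Examples 9.6.13 and 9.6.31.
d12 = dickson_first_kind(12, 2, 5)
```

With these inputs the coefficient list corresponds to x^12 + x^10 + x^8 + 4x^6 + 3x^2 + 3.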
9.6.2 Theorem (Waring’s formula, [1939, Theorem 1.76]) Let σ_1, . . . , σ_k be elementary symmetric
polynomials in the variables x_1, . . . , x_k over a ring R and s_n = s_n(x_1, . . . , x_k) = x_1^n + · · · + x_k^n
∈ R[x_1, . . . , x_k] for n ≥ 1. Then we have
s_n = \sum (-1)^{i_2 + i_4 + i_6 + \cdots} \frac{(i_1 + i_2 + \cdots + i_k - 1)!\, n}{i_1!\, i_2! \cdots i_k!} \sigma_1^{i_1} \sigma_2^{i_2} \cdots \sigma_k^{i_k},
for n ≥ 1, where the summation is extended over all tuples (i_1, i_2, . . . , i_k) of nonnegative
integers with i_1 + 2i_2 + · · · + k i_k = n. The coefficients of the σ_1^{i_1} σ_2^{i_2} · · · σ_k^{i_k} are integers.
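A numerical sanity check of Waring's formula over the integers can be sketched as follows. The helper names are illustrative; the code enumerates the exponent tuples by brute force, so it is only meant for small n and k.

```python
from itertools import combinations
from math import factorial, prod

def elementary_symmetric(xs):
    """Return [sigma_1, ..., sigma_k] evaluated at the numbers xs."""
    return [sum(prod(c) for c in combinations(xs, r)) for r in range(1, len(xs) + 1)]

def exponent_tuples(n, k):
    """All (i_1, ..., i_k) of nonnegative integers with i_1 + 2 i_2 + ... + k i_k = n."""
    def rec(remaining, j):
        if j == 0:
            if remaining == 0:
                yield ()
            return
        for i_j in range(remaining // j + 1):
            for head in rec(remaining - j * i_j, j - 1):
                yield head + (i_j,)
    return list(rec(n, k))

def waring_power_sum(xs, n):
    """Evaluate the right-hand side of Waring's formula for s_n at xs (n >= 1)."""
    sigmas = elementary_symmetric(xs)
    total = 0
    for idx in exponent_tuples(n, len(xs)):
        sign = (-1) ** sum(idx[1::2])           # i_2 + i_4 + i_6 + ...
        numerator = factorial(sum(idx) - 1) * n
        denominator = prod(factorial(i) for i in idx)
        coefficient = numerator // denominator   # an integer, per the theorem
        total += sign * coefficient * prod(s ** i for s, i in zip(sigmas, idx))
    return total
```

For example, with (x_1, x_2) = (3, 5) and n = 4 the three terms are 4096 − 3840 + 450 = 706 = 3^4 + 5^4.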
9.6.3 Theorem Dickson polynomials of the first kind are the unique monic polynomials satisfying
the functional equation
D_n(y + a/y, a) = y^n + a^n / y^n,
where a ∈ Fq and y ∈ F_{q^2}. Moreover, they satisfy the recurrence relation
9.6.5 Definition For a ∈ Fq, we define the n-th Dickson polynomial of the second kind E_n(x, a)
over Fq by
E_n(x, a) = \sum_{i=0}^{\lfloor n/2 \rfloor} \binom{n-i}{i} (-a)^i x^{n-2i}.
9.6.6 Theorem Dickson polynomials of the second kind have a functional equation
9.6.9 Remark Permutation properties of Dickson polynomials are important; see Section 8.1. The
famous Schur conjecture postulating that every integral polynomial that is a permutation
polynomial for infinitely many primes is a composition of linear polynomials and Dickson
polynomials was proved by Fried [1109]. We refer readers to Section 9.7.
9.6.2 Factorization
9.6.10 Remark The factorization of the Dickson polynomials of the first kind over Fq was given
in [626] and simplified in [266].
9.6.11 Theorem [266, 626] If q is even and a ∈ F∗q then D_n(x, a) is the product of squares of
irreducible polynomials over Fq which occur in cliques corresponding to the divisors d of
n, d > 1. Let k_d be the least positive integer such that q^{k_d} ≡ ±1 (mod d). To each such d
there correspond φ(d)/(2k_d) irreducible factors of degree k_d, each of which has the form
\prod_{i=0}^{k_d - 1} (x - \sqrt{a} (\zeta^{q^i} + \zeta^{-q^i}))
9.6.13 Example Let (q, n) = (5, 12). Then D_12(x, 2) = x^12 + x^10 + x^8 + 4x^6 + 3x^2 + 3 is the product
of irreducible polynomials over F5 which occur in cliques corresponding to the divisors d = 4
and d = 12 of n = 12. By direct computation, m_4 = N_4 = 4 and m_12 = N_12 = 4. For d = 4,
there corresponds one irreducible factor of degree 4, while there are two irreducible factors
of degree 4 for d = 12, each of which has the form
\prod_{i=0}^{N_d - 1} (x - \sqrt{a^{q^i}} (\zeta^{q^i} + \zeta^{-q^i})),
where ζ is a 4d-th root of unity.
9.6.14 Remark Similar results hold for Dickson polynomials of the second kind and they can be
found in [266] and [626]. Dickson polynomials of other kinds are defined in [2945] and the
factorization of the Dickson polynomial of the third kind is obtained similarly in [2945]. We
note that the factors appearing in the above results are over Fq , although their description
uses elements in an extension field of Fq . In [1079] Fitzgerald and Yucas showed that these
factors can be obtained from the factors of certain cyclotomic polynomials. This in turn
gives a relationship between a-self-reciprocal polynomials and these Dickson factors. In the
subsequent subsections we explain how this works. These results come mainly from [1079].
9.6.15 Definition Let q be an odd prime power and fix a ∈ F∗q. For a monic polynomial f over
Fq of degree n, with f(0) ≠ 0, define the a-reciprocal of f by
\hat{f}_a(x) = \frac{x^n}{f(0)} f(a/x).
9.6.16 Remark We note that the notion of a 1-self-reciprocal is the usual notion of a self-reciprocal.
9.6.17 Lemma
1. If α is a root of f then a/α is a root of fˆa .
2. The polynomial f is irreducible over Fq if and only if fˆa is irreducible over Fq .
9.6.18 Remark The a-reciprocal of an irreducible polynomial f may not have the same order as f .
For example, consider f (x) = x3 + 3 when q = 7. Then f has order 9 while fˆ3 (x) = x3 + 2
has order 18.
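The numbers in this example can be reproduced with a few lines of polynomial arithmetic over a prime field. The sketch below uses coefficient lists, lowest degree first; the function names are introduced here for illustration, and `pow(x, -1, p)` requires Python 3.8 or later.

```python
def a_reciprocal(f, a, p):
    """a-reciprocal of a monic f over F_p with f(0) != 0 (Definition 9.6.15)."""
    n = len(f) - 1
    inv_f0 = pow(f[0], -1, p)
    g = [0] * (n + 1)
    for i, c in enumerate(f):
        # x^n f(a/x) / f(0) has coefficient f[i] * a^i / f(0) at x^(n - i).
        g[n - i] = (c * pow(a, i, p) * inv_f0) % p
    return g

def order(f, p, limit=10**6):
    """Least e >= 1 with f | x^e - 1, for monic f over F_p with f(0) != 0."""
    n = len(f) - 1
    one = [1] + [0] * (n - 1)
    r = one[:]                                   # r = x^0 mod f
    for e in range(1, limit + 1):
        top = r[-1]
        r = [0] + r[:-1]                         # multiply by x ...
        r = [(ri - top * fi) % p for ri, fi in zip(r, f[:n])]   # ... and reduce mod f
        if r == one:
            return e
    raise ValueError("order not found below limit")

f = [3, 0, 0, 1]                 # x^3 + 3 over F_7
f_hat = a_reciprocal(f, 3, 7)    # expected to be x^3 + 2
```

Running `order` on both polynomials gives 9 and 18 respectively, matching the remark.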
9.6.19 Theorem [1079] Suppose f is a polynomial of even degree n = 2m over Fq . The following
statements are equivalent:
1. f is a-self-reciprocal;
2. n = 2m and f has the form
f(x) = b_m x^m + \sum_{i=0}^{m-1} b_{2m-i} (x^{2m-i} + a^{m-i} x^i)
for some b_j ∈ Fq.
9.6.23 Definition Define the mapping Φa : Pm → Sn from the polynomials over Fq of degree m
to the a-self-reciprocal polynomials over Fq of degree n = 2m by
Φa (f (x)) = xm f (x + a/x).
9.6.24 Remark In the case a = 1 this transformation has appeared often in the literature. The
first occurrence is Carlitz [547]. Other authors writing about Φ are Chapman [589], Cohen
[678], Fitzgerald-Yucas [1079], Kyuregyan [1821], Miller [2100], Meyn [2091], and Scheerhorn
[2538].
9.6.26 Theorem Maps Φa and Ψa are multiplicative and are inverses of each other.
9.6.27 Theorem [1079] The polynomial Dn (x, a) is mapped to x2n + an by the above defined Φa ,
namely, Φa (Dn (x, a)) = x2n + an .
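This identity can be checked mechanically over a prime field. The sketch below computes Φ_a via the expansion x^m f(x + a/x) = Σ_i f_i x^{m−i} (x^2 + a)^i; the helper names and the restriction to F_p are assumptions of the example.

```python
from math import comb

def poly_mul(f, g, p):
    out = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] = (out[i + j] + fi * gj) % p
    return out

def dickson_first_kind(n, a, p):
    """Coefficients (lowest degree first, mod p) of D_n(x, a), n >= 1."""
    coeffs = [0] * (n + 1)
    for i in range(n // 2 + 1):
        coeffs[n - 2 * i] = (n * comb(n - i, i) // (n - i) * (-a) ** i) % p
    return coeffs

def phi(f, a, p):
    """Phi_a(f)(x) = x^m f(x + a/x) for f of degree m, as a coefficient list mod p."""
    m = len(f) - 1
    result = [0] * (2 * m + 1)
    power = [1]                                  # (x^2 + a)^i, starting with i = 0
    for i, c in enumerate(f):
        for j, pj in enumerate(power):
            result[m - i + j] = (result[m - i + j] + c * pj) % p
        power = poly_mul(power, [a % p, 0, 1], p)
    return result

image = phi(dickson_first_kind(12, 2, 5), 2, 5)  # should represent x^24 + 2^12 over F_5
```

Since 2^12 ≡ 1 (mod 5), the result here is the coefficient list of x^24 + 1.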
9.6.28 Theorem [1079] The polynomial x^{2n} + a^n factors over Fq as
x^{2n} + a^n = \prod f(x),
1. Factor x^{2n} + a^n.
2. For each factor f of x^{2n} + a^n which is not a-self-reciprocal, multiply f with \hat{f}_a.
3. Apply Ψa.
9.6.31 Example We factor D_12(x, 2) = x^12 + x^10 + x^8 + 4x^6 + 3x^2 + 3 when q = 5.
x^24 + 2^12 = [(x^4 + x^2 + 2)(x^4 + 2x^2 + 3)][(x^4 + 3)(x^4 + 2)][(x^4 + 4x^2 + 2)(x^4 + 3x^2 + 3)]
= (x^8 + 3x^6 + 2x^4 + 2x^2 + 1)(x^8 + 1)(x^8 + 2x^6 + 2x^4 + 3x^2 + 1).
D_12(x, 2) = (D_4(x, 2) + 3D_2(x, 2) + 2) D_4(x, 2) (D_4(x, 2) + 2D_2(x, 2) + 2)
= (x^4 + 3)(x^4 + 2x^2 + 3)(x^4 + 4x^2 + 2).
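Both products in this example can be confirmed by multiplying out over F_5. The sketch below only verifies the stated factorizations; it does not implement the factoring procedure itself. Coefficient lists are lowest degree first.

```python
def poly_mul(f, g, p):
    out = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] = (out[i + j] + fi * gj) % p
    return out

def product(factors, p):
    result = [1]
    for f in factors:
        result = poly_mul(result, f, p)
    return result

p = 5
# The three quartic factors claimed for D_12(x, 2).
dickson_factors = [
    [3, 0, 0, 0, 1],   # x^4 + 3
    [3, 0, 2, 0, 1],   # x^4 + 2x^2 + 3
    [2, 0, 4, 0, 1],   # x^4 + 4x^2 + 2
]
# The three a-self-reciprocal octic factors claimed for x^24 + 2^12.
octic_factors = [
    [1, 0, 2, 0, 2, 0, 3, 0, 1],   # x^8 + 3x^6 + 2x^4 + 2x^2 + 1
    [1, 0, 0, 0, 0, 0, 0, 0, 1],   # x^8 + 1
    [1, 0, 3, 0, 2, 0, 2, 0, 1],   # x^8 + 2x^6 + 2x^4 + 3x^2 + 1
]
d12 = [3, 0, 3, 0, 0, 0, 4, 0, 1, 0, 1, 0, 1]        # D_12(x, 2) over F_5
x24_plus_2_12 = [pow(2, 12, p)] + [0] * 23 + [1]      # x^24 + 2^12 over F_5
```

Multiplying the quartics reproduces D_12(x, 2), and multiplying the octics reproduces x^24 + 2^12 = x^24 + 1 over F_5.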
\eta(n, a) = \begin{cases} n \cdot \mathrm{ord}(a^n) & \text{if } n \text{ is odd and } a \text{ is a non-square}, \\ 4n \cdot \mathrm{ord}(a^n) & \text{otherwise.} \end{cases}
9.6.33 Theorem [1079] For a monic irreducible polynomial f over Fq and a ∈ F∗q , the following
statements are equivalent:
1. f divides Dn (x, a).
2. There exists a divisor d of n with n/d odd and ord(Φa (f )) = η(d, a), where Φa
is defined in Definition 9.6.23.
9.6.34 Definition For a ∈ F∗q, define the a-cyclotomic polynomial Q_a(n, x) over Fq by
Q_a(n, x) = \prod_{d \mid n,\ d \text{ even}} (x^d - a^{d/2})^{\mu(n/d)}.
9.6.35 Remark When n ≡ 0 (mod 4), we have Q_1(n, x) = Q(n, x), the n-th cyclotomic
polynomial over Fq. When n ≡ 2 (mod 4), we have Q_1(n, x) = Q(n/2, −x^2). Similar to the
factorization of x^n − 1 = \prod_{d \mid n} Q(d, x) [1939], we can reduce the factorization of x^{2n} ± a^n
9.6.37 Remark A factorization of these a-cyclotomic polynomials Qa (m, x) is also given in [1079].
9.6.38 Definition [2945] For a ∈ Fq, any integers n ≥ 0 and 0 ≤ k < p, we define the n-th Dickson
polynomial of the (k + 1)-th kind D_{n,k}(x, a) over Fq by D_{0,k}(x, a) = 2 − k and
D_{n,k}(x, a) = \sum_{i=0}^{\lfloor n/2 \rfloor} \frac{n - ki}{n - i} \binom{n-i}{i} (-a)^i x^{n-2i}.
9.6.39 Definition [2945] For a ∈ Fq, any integers n ≥ 0 and 0 ≤ k < p, we define the n-th reversed
Dickson polynomial of the (k + 1)-th kind D_{n,k}(a, x) over Fq by D_{0,k}(a, x) = 2 − k and
D_{n,k}(a, x) = \sum_{i=0}^{\lfloor n/2 \rfloor} \frac{n - ki}{n - i} \binom{n-i}{i} (-1)^i a^{n-2i} x^i.
9.6.40 Remark [2945] It is easy to see that Dn,0 (x, a) = Dn (x, a) and Dn,1 (x, a) = En (x, a).
Moreover, if char(Fq ) = 2, then Dn,k (x, a) = Dn (x, a) if k is even and Dn,k (x, a) = En (x, a)
if k is odd.
9.6.41 Theorem [2945] For any integer k ≥ 1, we have
Dn,k (x, a) = kDn,1 (x, a) − (k − 1)Dn,0 (x, a) = kEn (x, a) − (k − 1)Dn (x, a).
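The linear relation can be checked coefficient by coefficient from the explicit sum in Definition 9.6.38. The sketch below works over a prime field F_p and uses a hypothetical helper `dickson_kind`; the cases k = 0 and k = 1 give D_n and E_n respectively.

```python
from math import comb

def dickson_kind(n, k, a, p):
    """Coefficients (lowest degree first, mod p) of D_{n,k}(x, a) for n >= 1."""
    coeffs = [0] * (n + 1)
    for i in range(n // 2 + 1):
        # (n - k i)/(n - i) * C(n-i, i) equals k*C(n-i, i) - (k-1)*n/(n-i)*C(n-i, i),
        # a difference of integers, so the integer division is exact.
        c = (n - k * i) * comb(n - i, i) // (n - i)
        coeffs[n - 2 * i] = (c * (-a) ** i) % p
    return coeffs

def relation_holds(n, k, a, p):
    """Check D_{n,k} = k E_n - (k - 1) D_n coefficientwise over F_p."""
    d = dickson_kind(n, 0, a, p)   # first kind
    e = dickson_kind(n, 1, a, p)   # second kind
    combo = [(k * ei - (k - 1) * di) % p for di, ei in zip(d, e)]
    return combo == dickson_kind(n, k, a, p)
```

For instance, `relation_holds(6, 3, 2, 7)` compares the fourth-kind polynomial of degree 6 over F_7 with the corresponding combination of E_6 and D_6.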
9.6.43 Theorem [2945] The Dickson polynomial of the (k + 1)-th kind satisfies the following re-
currence relation
Dn,k (x, a) = xDn−1,k (x, a) − aDn−2,k (x, a),
for n ≥ 2 with initial values D0,k (x, a) = 2 − k and D1,k (x, a) = x.
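The recurrence gives a convenient way to generate these polynomials without binomial coefficients. A minimal sketch over a prime field F_p follows; the function name is introduced here, and coefficients are stored lowest degree first.

```python
def dickson_kind_recurrence(n, k, a, p):
    """D_{n,k}(x, a) over F_p from D_{0,k} = 2 - k, D_{1,k} = x and the recurrence."""
    prev = [(2 - k) % p]           # D_{0,k}
    if n == 0:
        return prev
    cur = [0, 1]                   # D_{1,k} = x
    for _ in range(2, n + 1):
        nxt = [0] + cur            # x * D_{m-1,k}
        for j, c in enumerate(prev):
            nxt[j] = (nxt[j] - a * c) % p   # subtract a * D_{m-2,k}
        prev, cur = cur, nxt
    return cur

first_kind = dickson_kind_recurrence(12, 0, 2, 5)   # k = 0: D_12(x, 2) over F_5
second_kind = dickson_kind_recurrence(3, 1, 1, 7)   # k = 1: E_3(x, 1) = x^3 - 2x over F_7
```

With k = 0 this reproduces the polynomial D_12(x, 2) = x^12 + x^10 + x^8 + 4x^6 + 3x^2 + 3 used in the factorization examples.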
9.6.44 Theorem [2945] The generating function of D_{n,k}(x, a) is
\sum_{n=0}^{\infty} D_{n,k}(x, a) z^n = \frac{2 - k + (k - 1)xz}{1 - xz + az^2}.
9.6.45 Remark The Dickson polynomial Dn,k (x, a) of the (k + 1)-th kind satisfies a second order
differential equation; see [2723, 2945] for more details.
9.6.46 Theorem [2945] Suppose ab is a square in F∗q. Then D_{n,k}(x, a) is a PP of Fq if and only if
D_{n,k}(x, b) is a PP of Fq. Furthermore,
D_{n,k}(\alpha, a) = (\sqrt{a/b})^n D_{n,k}((\sqrt{b/a})\alpha, b).
S_{q−1} = {α ∈ Fq : u_α^{q−1} = 1}, S_{q+1} = {α ∈ Fq : u_α^{q+1} = 1}, S_p = {±2},
where for positive integers n and r we use the notation (n)r to denote n (mod r), the
smallest positive integer congruent to n modulo r.
9.6.49 Theorem [2945] Let α = u_α + 1/u_α where u_α ∈ F_{q^2} and α ∈ Fq. Let ε_α = u_α^c ∈ {±1} where
c = p(q^2 − 1)/4. As functions on Fq we have
9.6.51 Definition [1936] The Dickson polynomial of the first kind D_n^{(i)}(x_1, . . . , x_t, a), 1 ≤ i ≤ t,
is given by the functional equations
where x_i = s_i(u_1, . . . , u_{t+1}) are elementary symmetric functions and u_1 · · · u_{t+1} = a. The
vector D(t, n, a) = (D_n^{(1)}, . . . , D_n^{(t)}) of the t Dickson polynomials is a Dickson polynomial
vector.
9.6.55 Remark Much less is known for the multivariate Dickson polynomials of the second kind.
The same recurrence relation of D_n^{(1)}(x_1, . . . , x_t, a) is used to define the multivariate Dickson
polynomials of the second kind E_n^{(1)}(x_1, . . . , x_t, a) with the initial condition E_0 = 1,
E_j = \sum_{r=1}^{j} (-1)^{r-1} x_r E_{j-r} for 1 ≤ j ≤ t. The generating function is
\sum_{n=0}^{\infty} E_n z^n = \frac{1}{\sum_{i=0}^{t+1} (-1)^i x_i z^i};
see [1936] for more details.
See Also
References Cited: [266, 547, 589, 626, 678, 1079, 1109, 1116, 1821, 1936, 1939, 2091, 2100,
2538, 2723, 2945]
9.7.1 Remark (Extend values) The historical functions of this section are polynomials and ratio-
nal functions: f (x) = Nf (x)/Df (x) with Nf and Df relatively prime (nonzero) polynomials,
denoted f ∈ F (x), F a field (almost always Fq or a number field). The subject takes off
by including functions f – covers – where the domain and range are varieties of the same
dimension. Still, we emphasize functions between projective algebraic curves (nonsingular),
often where the target and domain are projective 1-space.
9.7.2 Definition The degree of f ∈ F (x), deg(f ), is the maximum of deg(Nf ) and deg(Df ).
Add a point at ∞ to F , F ∪ {∞} = P1x (F ), to get the F points of projective 1-space.
9.7.3 Remark (Plug in ∞) Using Definition 9.7.2 requires plugging in and getting out ∞. We
sometimes use the notion of value sets Vf and their cardinality #Vf (Section 8.3).
1. The value of f (x0 ) for x0 ∈ F is ∞ if x0 is a zero of Df (x).
2. The value of f (∞) is respectively ∞, 0, or the ratio of the Nf and Df leading
coefficients, if the degree of Nf is greater, less than, or equal to the degree of Df .
If z is a variable indicating the range, this gives f as a function from P1x (F ) to P1z (F ). We
abbreviate this as f : P1x → P1z .
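The two evaluation rules above can be sketched over a prime field as follows. The marker `INF` and the function names are hypothetical choices for this illustration; numerator and denominator are assumed coprime, nonzero, reduced mod p, and given as coefficient lists with the lowest degree first (`pow(x, -1, p)` needs Python 3.8 or later).

```python
INF = "inf"  # hypothetical marker for the point at infinity of P^1(F_p)

def degree(poly):
    return max(i for i, c in enumerate(poly) if c != 0)

def evaluate(poly, x, p):
    return sum(c * pow(x, i, p) for i, c in enumerate(poly)) % p

def apply_rational(num, den, x0, p):
    """Value of f = num/den at a point x0 of P^1(F_p)."""
    if x0 == INF:
        dn, dd = degree(num), degree(den)
        if dn > dd:
            return INF
        if dn < dd:
            return 0
        return (num[dn] * pow(den[dd], -1, p)) % p   # ratio of leading coefficients
    d = evaluate(den, x0, p)
    if d == 0:
        return INF                                    # x0 is a zero of the denominator
    return (evaluate(num, x0, p) * pow(d, -1, p)) % p

def value_set(num, den, p):
    points = list(range(p)) + [INF]
    return {apply_rational(num, den, x0, p) for x0 in points}
```

For example, x^3 on P^1(F_5) has a value set of size 6 = 5 + 1, whereas x^2 takes only the four values 0, 1, 4 and ∞.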
9.7.4 Definition (Möbius equivalence) Denote the group – under composition – of Möbius
transformations x ↦ (ax + b)/(cx + d) with ad − bc ≠ 0, a, b, c, d ∈ F by PGL(F). Refer to f1, f2 ∈ F(x)
as Möbius equivalent if f2 = α ◦ f1 ◦ β for α, β ∈ PGL(F).
9.7.5 Example If f(x) = x^n, with gcd(n, q − 1) = 1, then #Vf = q^k + 1 on P1x(F_{q^k}) exactly for
those infinitely many k with gcd(n, q^k − 1) = 1.
9.7.6 Remark Initial motivation came from Schur’s Conjecture (Theorem 9.7.32), which starts over
a number field K – a finite extension of Q, the rational numbers – with its ring of integers
O_K. That asks about Vf over residue class fields O_K/p of prime ideals p, denoting this
Vf(O/p) (Vf(Fp) if O = Z). Assume Nf and Df have coefficients in O_K. Avoid p – it is a
bad prime – if it contains the leading coefficient of either Nf or Df.
9.7.9 Proposition [2203, p. 390] Consider Xh0 = {(x, y) | h(x, y) = 0}, an algebraic curve, defined
by h ∈ K[x, y]. Then, there is a unique nonsingular curve Xh – the normalization of Xh0
– and a morphism µh : Xh0 → Xh that is an isomorphism on the complement of a finite
subset of points in Xh . Indeed, every variety Xh0 has such a unique normalization, but in
higher dimensions it may be singular, and µh is an isomorphism off a codimension 1 set.
9.7.10 Definition (Components) A definition field for an algebraic set W is a field containing
all coefficients of all polynomials defining W . Components of W over F are algebraic
subsets which are not the union of two closed non-empty proper algebraic subsets over
F [1427, p. 3]. We say W is a variety if it has just one component. It is absolutely
irreducible if it has just one component over F̄ , an algebraic closure of F .
9.7.11 Remark (Points on varieties) [1427, Chapters 1 and 2] and [2203, Section 2] introduce
affine and projective algebraic sets, and their components (Definition 9.7.10), except they
are over an algebraically closed field. For perfect fields F (including finite fields and number
fields) this extends for normal varieties. Since their components do not meet, taking any
disjoint union of distinct varieties under the action of the absolute Galois group of F defines
components in general. Points on an algebraic set X over F refers here to geometric points:
points with coordinates in F̄ . It is an F point if its coordinates are in F .
9.7.12 Definition A general f : X → Z is a cover if it is a finite, flat morphism (see Definition
9.7.25) of quasi-projective varieties [2203, p. 432, Proposition 2].
9.7.13 Lemma Definition 9.7.12 simplifies for curves, because all our varieties will be normal, and
so for curves, nonsingular. Then, any nonconstant morphism is a cover: That includes any
nonconstant rational function f : P1x → P1z .
9.7.14 Example If f : X → Z is finite and X and Z are nonsingular, generalizing what happens
for curves, and no matter their dimension, then f is automatically flat [1427, p. 266, 9.3a)].
This does not extend to weakening nonsingular to normal varieties. [2203, p. 434] has a
finite morphism, where X is nonsingular (it is affine 2-space), and Z is normal. But, the
fiber degree is 2 over each z ∈ Z, excluding one point where it is 3.
9.7.15 Remark (Assuming normality) Starting with Subsection 9.7.2 all results assume that the
algebraic sets are normal. Some constructions (especially Definition 9.7.45) momentarily
produce nonnormal sets, that we immediately replace with their normalizations.
9.7.16 Definition An f ∈ Fq (x) is exceptional if it maps one-one on P1 (Fqk ) for infinitely many
k. Similarly, with K a number field, f ∈ K(x) is exceptional if it is exceptional mod p
for infinitely many primes p .
9.7.17 Remark We use K, allowing decoration, for a number field. Section 8.3 refers to the split-
ting field, Ωf (respectively Ωf,F̄ ), of f (x) − z over F (z) (respectively over F̄ (z)). The
automorphism group of the extension Ωf /F (z) (respectively Ωf,F̄ /F̄ (z)) is the arithmetic
(respectively geometric) monodromy group A (respectively G) of a separable function
(Definition 9.7.25) f ∈ F(x). When there are several functions, we denote these Af and Gf.
They act on the zeros, {x1 , . . . , xn } (often denoted {1, . . . , n}), of f (x) − z, giving a natural
permutation representation on n symbols.
9.7.18 Definition Every cover f : X → Z over a field F with X irreducible has an associated
extension of function fields that determines the cover up to birational morphisms (see
Lemma 9.7.43).
9.7.19 Remark Essentially all the Galois theory of fields translates to useful statements about a
cover f : X → Z (over F ) of an irreducible variety Z. It does this by corresponding to f the
composite of the function field extensions F (X 0 )/F (Z) where X 0 runs over the components
of X [2203, p. 396]. Several papers in our references (say, [1114, Section 0.C]) give oft-used
examples, with Lemma 9.7.20 a simple archetype.
9.7.20 Lemma (See Remark 9.7.21) Any separable cover f : X → Y over F has a Galois closure
cover fˆ : X̂ → Z over F . Then, Af is the group of fˆ with its natural permutation repre-
sentation TAf (of degree the degree of f ). Do this over F̄ to get the geometric monodromy
Gf . Then, X is irreducible (resp. absolutely irreducible) if and only if TAf (resp. TGf ) is
transitive. For f a rational function it is automatic that TGf (and so TAf ) is transitive.
9.7.21 Remark [1118, Section 2.1] explains how to form the Galois closure cover of a cover using
fiber products (see Remark 9.7.54). This shows how to form the Galois closure cover of any
collection of covers as in Lemma 9.7.50.
9.7.22 Remark Normalization gives a nearly invertible process to Remark 9.7.19: going from field
extensions of F (Z) to covers of Z. While this does not translate all arithmetic cover prob-
lems to Galois theory, we apply the phrase “monodromy precision” (Remark 9.7.26) to when
it does. Example: It does in the topic of exceptional covers, as in Proposition 9.7.28.
9.7.23 Definition Denote the elements of a group G, under a representation TG , that fix 1 by
G(1). When TG is transitive, refer to TG as primitive (respectively doubly transitive)
if there is no group properly between G(1) and G (respectively G(1) is transitive on
{2, . . . , n}).
9.7.24 Theorem [1112, Theorem 1]: An f ∈ Fq (x) is exceptional if and only if the following holds
for each orbit O of Af (1) on {2, . . . , n}:
O breaks into strictly smaller orbits under Gf (1). (9.7.1)
9.7.25 Definition (Covers) Let f ∈ Fq (x) be nonconstant and separable: not g(xp ) for some
g ∈ Fq (x). Then, f : P1x (F¯q ) → P1z (F¯q ) by x 7→ f (x) has these cover properties.
1. Excluding a finite set {z1 , . . . , zr } ⊂ P1z (F¯q ), branch points of f , there are exactly
n = deg(f ) points over z 0 .
2. For z 0 a branch point, counting zeros, x0 of f (x) − z 0 with multiplicity, the sum
at all x0 s over z 0 is still n. An x0 ∈ P1x with multiplicity > 1 is a ramified point.
For K a number field, the same properties hold, without any separable condition.
9.7.26 Remark (MacCluer’s Theorem) Theorem 9.7.24 has a surprise: (9.7.1) implies exceptionality
over Fq . An error term in applying Chebotarev’s density theorem with branch points (as in
Section 8.3, in Section 8.3.3) vanishes. A ramified point with p not dividing its multiplicity
is tame.
MacCluer’s thesis [1986] responded to a Davenport-Lewis conjecture [777] by showing
Theorem 9.7.24 for a polynomial tame at every point. We say: MacCluer’s Theorem shows tame
polynomial exceptional covers exhibit monodromy precision [1119, Section 3.2.1]. Proposi-
tion 9.7.28 shows monodromy precision holds for general exceptional covers.
9.7.27 Example A polynomial f over Fq for which p| deg(f ) is not tame at ∞.
9.7.28 Proposition [1112] combined with [1118, Principle 3.1]: Let f : X → Z be any cover
(Definition 9.7.12) over Fq with X absolutely irreducible. Then [1118, Corollary 2.5]:
1. the extended meaning of (9.7.1) is that the 2-fold fiber product (Section 9.7.3)
of f minus the diagonal has no absolutely irreducible Fq components; and
2. (9.7.1) is equivalent to f being exceptional: X(Fqk ) → Y (Fqk ) is one-one (and
onto) for infinitely many k.
9.7.29 Remark As noted in [1118, Comments on Principle 3.1], the proof of [1112] applies without
change to give Proposition 9.7.28 Part 2 when X and Z are non-singular; indeed, it applies
to pr-exceptionality (Definition 9.7.93). Without, however, this nonsingularity assumption,
there are complications considered in [1119, Section A.4.1] (see Example 9.7.14).
9.7.30 Definition Let f in Proposition 9.7.28 over Fq be an exceptional cover. Denote values k
where (9.7.1) holds with Fqk replacing Fq , by Ef,q : the exceptionality set of f .
Similarly, for f satisfying the hypotheses of Proposition 9.7.28 over a number field
K, denote those primes p where f mod p has E_{f,O/p} infinite, by E_{f,K}.
9.7.31 Definition The equation Tu (cos(θ)) = cos(uθ) defines the u-th Chebychev polynomial,
Tu . From it define a Chebychev conjugate: α ◦ Tu ◦ α−1 with α(x) = αz0 (x) = z 0 x and
either z 0 = 1, or z 0 and −z 0 are conjugate in a quadratic extension of K.
9.7.32 Theorem (Schur’s Conjecture) [1109, Theorem 2]: With K a number field, the f ∈ O[x]
for which Ef,K is infinite are compositions with maps a 7→ ax + b (affine) over K with
polynomials of the following form for some odd prime u:
9.7.33 Remark Many still refer to Theorem 9.7.32 as Schur’s Conjecture, though Schur conjectured
it only over Q. Paper [1109] refers to all Chebychev conjugates as Chebychev polynomials,
rather than Dickson as in Remark 9.7.34. Reference [1936] assiduously distinguishes Dickson
polynomials.
Here is a simple branch point Chebychev Conjugate characterization: f has two finite
(≠ ∞) branch points, ±z′ ∈ P1z(Q̄), which identify with the unique unramified points (in
P1x(Q̄)) over the branch points, as in [1109, Proof of Lemma 9].
1. A corollary of [1124, Theorem 3.5] is that any cover with a unique totally and
tamely ramified point decomposes over F if and only if it decomposes over F̄ .
This applies if f ∈ F [x] has deg(f ) prime to the characteristic of F .
2. If f from Part 1 is indecomposable, then Gf is primitive (see Definition 9.7.23)
and it contains an n-cycle.
3. If f ∈ K[x] is exceptional, since (9.7.1) says Gf cannot be doubly transitive, up
to composing with K affine maps, f from Part 2 is in (9.7.2).
9.7.34 Remark (Dickson doppelgangers, see Section 9.6) Each Chebychev conjugate is a constant
times a Dickson polynomial [1118, Proposition 5.3]. The Remark 9.7.33 characterization –
by locating their branch points – avoids using equations. That is the distinction at the last
step between the proof of Theorem 9.7.32 and [1936, Chapter 6].
9.7.35 Remark Use the notation in Theorem 9.7.32. Suppose $f \in \mathcal{O}_K[x]$ is an exceptional polynomial.
Define $n_{f,c}$ (resp. $n_{f,C}$) to be the product of distinct primes s for which f has a degree s
cyclic (resp. Chebychev conjugate) composition factor. One can check that Corollary 9.7.36
follows from Proposition 9.7.28 combined with Theorem 9.7.32.
9.7.36 Corollary For $f \in \mathcal{O}_K[x]$ an exceptional polynomial, one can determine $E_{f,K}$ (excluding
bad primes, Remark 9.7.6) from $n_{f,c}$ and $n_{f,C}$ by congruences. When $\mathcal{O}_K = \mathbb{Z}$, then $p \in E_{f,\mathbb{Q}}$
if and only if $\gcd(p - 1, s) = 1$ for each $s \mid n_{f,c}$ and $\gcd(p^2 - 1, s) = 1$ for each $s \mid n_{f,C}$.
9.7.37 Example (Infinite $E_{f,\mathbb{Q}}$) It is necessary that $\gcd(2, n_c) = 1$ and $\gcd(6, n_C) = 1$ for there
to be infinitely many p that satisfy the conclusion of Corollary 9.7.36. But it is sufficient,
too. Without loss, assume $\gcd(n_c, n_C) = 1$. If $3 \nmid n_c$, then Dirichlet’s Theorem on primes
in arithmetic progressions gives an infinite set of $p \equiv 3 \pmod{n_c n_C}$. They are in $E_{f,\mathbb{Q}}$. If
$3 \mid n_c$, the Chinese remainder theorem gives an arithmetic progression of p satisfying $p \equiv 3 \pmod{n_C}$
and $p \equiv -1 \pmod{n_c}$. So, $E_{f,\mathbb{Q}}$ is infinite whenever it has a chance to be.
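The congruence test of Corollary 9.7.36 over Q is easy to experiment with. The following is a minimal sketch, not taken from the source: the function names and the sample values n_c = 5, n_C = 7 are hypothetical, and the sketch ignores the exclusion of bad primes (Remark 9.7.6).

```python
from math import gcd

def prime_factors(n):
    """Return the set of distinct prime divisors of n (trial division)."""
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors

def satisfies_congruences(p, n_c, n_C):
    """gcd(p-1, s) = 1 for primes s | n_c and gcd(p^2-1, s) = 1 for primes s | n_C."""
    cyclic_ok = all(gcd(p - 1, s) == 1 for s in prime_factors(n_c))
    chebychev_ok = all(gcd(p * p - 1, s) == 1 for s in prime_factors(n_C))
    return cyclic_ok and chebychev_ok

def is_prime(m):
    """Naive primality test, adequate for small illustrative ranges."""
    return m > 1 and all(m % d for d in range(2, int(m ** 0.5) + 1))

# Hypothetical sample: n_c = 5 and n_C = 7 (bad primes are NOT filtered out here).
candidates = [p for p in range(2, 50) if is_prime(p) and satisfies_congruences(p, 5, 7)]
print(candidates)
```

With these sample values the test rejects p = 11 (since 5 divides p - 1) and p = 13 (since 7 divides p^2 - 1), in line with the arithmetic-progression description in Example 9.7.37.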
9.7.38 Remark Combine [1113, Lemma 1] with monodromy precision in Proposition 9.7.39. This
shows that the Proposition 9.7.28 fiber product statement is equivalent to $f \in K(x)$ being
exceptional, and therefore permutation, mod $\mathfrak{p}$. If $\mathcal{O}_K/\mathfrak{p}$ is sufficiently large, the fiber product
statement is also necessary for f to be permutation (well-known, for example [1109, proof
of Theorem 2, last paragraph]).
9.7.39 Proposition (Permutation functions) From Remark 9.7.38, for $f \in \mathbb{F}_q(x)$, the set of k for which f
permutes $\mathbb{P}^1(\mathbb{F}_{q^k})$ contains $E_{f,\mathbb{F}_q}$ as a cofinite subset. Similarly, for K a number field, the set of
$\mathfrak{p}$ for which f functionally permutes $\mathbb{P}^1(\mathcal{O}/\mathfrak{p})$ contains $E_{f,K}$ as a cofinite subset.
9.7.40 Remark Section 8.1 shows permutation polynomials are abundant. Exceptional polynomials
satisfy a much stronger property, but Corollary 9.7.36 shows they are abundant, too. One
difference: Section 9.7.3 combines them in ways with no analog for permutation polynomials.
9.7.41 Corollary An analog of Theorem 9.7.32 holds over $\mathbb{F}_q$ to characterize exceptional polynomials
of degree prime to p ([1120, Introduction to Section 5] or [1118, Proposition 5.1]).
There, $z'$ in $\alpha_{z'}$ is either 1 or in the unique quadratic extension of $\mathbb{F}_q$. Consider a Chebychev
Special functions over finite fields 295
9.7.42 Definition For any field extension $F_1/F_2$ containing $\mathbb{F}_p$, there is the notion of being separable
[1121, p. 111]. For $f \in \mathbb{F}_q(x)$, the extension $\bar{\mathbb{F}}_q(x)/\bar{\mathbb{F}}_q(f(x))$ being separable is
equivalent to f being separable (Definition 9.7.25). Many of our examples inherit separableness
from this special case.
9.7.43 Lemma (Curve covering maps [1427, Chapter I, Section 6]) Any nonsingular projective al-
gebraic curve X over a perfect field F has a field of functions F (X) that uniquely determines
X up to isomorphism over F .
Each non-constant element f ∈ F (X) determines a finite map X → P1z over F [1427,
Chapter I, Exercise 6.4]. If F (X)/F (f ) is separable, then f has the covering properties
of (9.7.25): finite number of branch points, and uniform count of points in a fiber over F̄
(including multiplicity in the fiber) [1427, Chapter IV, Proposition 2.2].
9.7.44 Definition Refer to any f in the conclusion of Lemma 9.7.43 as a nonsingular cover of P1z .
9.7.45 Definition (Fiber product) Let fi : Xi → P1z , i = 1, 2, be two nonsingular covers of P1z .
The set theoretic fiber product consists of the algebraic curve
9.7.46 Remark Definition 9.7.45 works equally for any covers Xi → Z, i = 1, 2, with Z a normal
projective variety. Then, X1 ×Z X2 is normal and projective (possibly with several compo-
nents) with natural maps pri : X1 ×Z X2 → Xi , i = 1, 2, given by its projection on each
factor. The functions fi ◦ pri , i = 1, 2 are identical, giving a well-defined map:
(f1 , f2 ) : X1 ×Z X2 → Z. (9.7.3)
9.7.49 Definition The fiber product Xf,f = X ×Z X for a cover f : X → Z of degree exceeding
1 has at least two components. One is the diagonal : the set ∆(X) = {(x, x) | x ∈ X}.
The normal variety X ×Z X \ ∆(X) generalizes the set in Theorem 9.7.24.
296 Handbook of Finite Fields
9.7.50 Lemma (Fiber product monodromy [1118, Section 2.1.3]) Consider the covers in Definition 9.7.45.
To each $f_j$ there is an arithmetic (resp. geometric) monodromy group $A_{f_j}$
(resp. $G_{f_j}$), j = 1, 2. Similarly, for $(f_1, f_2)$ in (9.7.3). Then, $A_{(f_1,f_2)}$ maps naturally, surjectively,
to $A_{f_j}$ by homomorphisms $\mathrm{pr}^*_j$, j = 1, 2. There is a largest simultaneous quotient, H,
of both $A_{f_j}$ given by homomorphisms $m_j : A_{f_j} \to H$, j = 1, 2, so that
9.7.52 Definition (Absolute components) Given $X_1 \times_Z X_2$ in Corollary 9.7.51, denote the union
of its absolutely irreducible F components by $X_1 \times^{\mathrm{abs}}_Z X_2$. Denote the complementary
set, $X_1 \times_Z X_2 \setminus X_1 \times^{\mathrm{abs}}_Z X_2$, of components by $X_1 \times^{\mathrm{cp}}_Z X_2$.
9.7.53 Theorem (Explicit $E_{f,q}$ – see Remark 9.7.54) Let f : X → Z (as in Proposition 9.7.28) be
an exceptional cover over $\mathbb{F}_q$. For $X'_i$, an $\mathbb{F}_q$ component of $X \times^{\mathrm{cp}}_Z X$, denote the number of
components in its breakup over $\bar{\mathbb{F}}_q$ by $s_i$, i = 1, . . . , u.
The group $G(\mathbb{F}_{q^{s_{\mathrm{exc}}}}/\mathbb{F}_q)$ is naturally a quotient of $A_f/G_f$. We can interpret all quantities
using $A_f$ and $G_f$.
9.7.54 Remark All but the last sentence of Theorem 9.7.53 is [1118, Corollary 2.8]. The last sen-
tence is from [1118, Lemma 2.6], using that the Galois closure cover of f is a(ny) component
(over Fq ) of the deg(f ) = n-fold fiber product of f with itself. Project that fiber product
onto the 2-fold fiber product of f over Fq to finish. Corollary 9.7.51 shows the orbit lengths
of Af (1) on {2, . . . , n} divided by the corresponding orbit lengths of Gf (1), give the si s.
9.7.55 Theorem (Explicit Ef,K – see Remark 9.7.56) Now change Fq to K (number field) in the
first sentence of Theorem 9.7.53. For each cyclic subgroup C ≤ Af /Gf denote those σ ∈ Af
that map to C by AC . As previously, denote the stabilizers of 1 in the representation by
AC (1) and GC (1). Consider the set, Cf,K , of cyclic C (as in (9.7.1)):
{C | each orbit of AC (1) on {2, . . . , n} breaks into strictly smaller orbits under GC (1)}.
Then, f is exceptional over K if and only if Cf,K is nonempty. Further, Ef,K consists of
those primes p for which the Frobenius attached to p is a generator of some C ∈ Cf,K .
9.7.56 Remark Theorem 9.7.55 comes from applying [1113, Section 2] exactly as in Remark 9.7.28.
If Ef,K is infinite, then X ×Z X \ ∆(X) has no absolutely irreducible component. The
converse, however, does not hold.
9.7.57 Definition (Category of exceptional covers) For Z absolutely irreducible over Fq , denote
the collection of exceptional covers of Z over Fq by TZ,Fq .
Also, there is at most one morphism between any two objects in TZ,Fq .
9.7.59 Remark (When f1 = f2 in Theorem 9.7.58) We definitely include the fiber product of
a cover in TZ,Fq with itself. Then, the only absolutely irreducible component of the fiber
product is the diagonal (Definition 9.7.49), which is equivalent to the original cover.
9.7.61 Remark Consider (fi , Xi ) ∈ TZ,Fq , i = 1, 2, for which there exists ψ : X1 → X2 over Fq
that factors through f2 : f2 ◦ ψ = f1 . Then, Theorem 9.7.58 says ψ is unique.
9.7.62 Corollary For (f, X) ∈ TZ,Fq , denote the group of the Galois closure cover of f over X by
Af (1). Then, Af has the representation Tf by acting on cosets of Af (1). If (fi , Xi ) ∈ TZ,Fq ,
i = 1, 2, we write (f1 , X1 ) > (f2 , X2 ) if f1 factors through X2 . [1118, Proposition 4.3] pro-
duces from these pairs a canonical group AZ,Fq with a profinite permutation representation
TZ,Fq .
9.7.63 Remark (A projective limit) Given (fi , Xi ) ∈ TZ,Fq , i = 1, 2, there is a 3rd (f, X) ∈ TZ,Fq ,
given by the fiber product, that factors through both. This is the condition defining a
projective sequence. So, AZ,Fq in Corollary 9.7.62 is a projective limit.
9.7.64 Definition (AZ,Fq , TZ,Fq ) is the (arithmetic) monodromy group, in its natural permutation
representation, of the exceptional tower TZ,Fq .
9.7.68 Definition Definition 9.7.31 explains Chebychev conjugates. Consider $l_{z'} : x \mapsto \frac{x - z'}{x + z'}$,
mapping $\pm z'$ to $0, \infty$, with $a = (z')^2 \in K$, $z' \notin K$. Then, for n odd, characterize
$R_{n,a} = (l_{z'})^{-1} \circ (l_{z'}(x))^n$, a cyclic conjugate, by these conditions:
$\pm z'$ are its sole ramified points, $R_{n,a}(\pm z') = \pm z'$ and it maps $\infty \mapsto \infty$. (9.7.4)
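To see a cyclic conjugate concretely, one can evaluate $R_{n,a}$ on the projective line over a prime field. The sketch below is an illustration under stated assumptions rather than the source's construction: it uses the expansion $(x + z')^n = G(x) + H(x) z'$ with $(z')^2 = a$, so that $R_{n,a} = G/H$, takes a to be a nonresidue mod p, and encodes the point at infinity as `None`. The function names are hypothetical.

```python
from math import comb

def redei(n, a, p, x):
    """Evaluate R_{n,a} at x in P^1(F_p); None stands for the point at infinity."""
    if x is None:
        return None  # R_{n,a} maps infinity to infinity
    # (x + z')^n = G + H*z' with z'^2 = a, hence R_{n,a}(x) = G / H.
    G = sum(comb(n, k) * pow(x, n - k, p) * pow(a, k // 2, p)
            for k in range(0, n + 1, 2)) % p
    H = sum(comb(n, k) * pow(x, n - k, p) * pow(a, (k - 1) // 2, p)
            for k in range(1, n + 1, 2)) % p
    if H == 0:
        return None  # a pole: the value is the point at infinity
    return G * pow(H, -1, p) % p

def permutes_projective_line(n, a, p):
    """Brute-force check that R_{n,a} is a bijection of P^1(F_p)."""
    points = list(range(p)) + [None]
    images = {redei(n, a, p, x) for x in points}
    return len(images) == p + 1

# 3 is a nonresidue mod 7 and gcd(3, 7 + 1) = 1: a permutation.
print(permutes_projective_line(3, 3, 7))
# 2 is a nonresidue mod 5 but gcd(3, 5 + 1) = 3: not a permutation.
print(permutes_projective_line(3, 2, 5))
```

The two sample calls agree with the classical criterion for Rédei functions: for a nonresidue a, bijectivity on $\mathbb{P}^1(\mathbb{F}_p)$ corresponds to gcd(n, p + 1) = 1.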
9.7.69 Remark According to [1936, Chapter 2, Section 5], the function $R_{n,a}$ in Definition 9.7.68 is
a Rédei function. From [1936, Theorem 3.11], under the hypotheses on $z'$, the exceptionality
set $E_{R_{n,a},K}$ is
9.7.70 Remark (Addendum to Remark 9.7.69) Quadratic reciprocity determines nonempty arithmetic
progressions for which $z'$ is a quadratic residue and those for which it is not. If $z'$ in
Definition 9.7.68 were in K, then – of course – the exceptional set is the same as for $x^n$.
Whether or not $z' \in K$, we refer to $R_{n,a}$ as a cyclic conjugate.
9.7.71 Definition [1118, Section 4.2] Suppose a collection C of covers from an exceptional tower
TY,Fq is closed under the categorical fiber product. We say C is a subtower. We also
speak of the (minimal) subtower any collection generates under fiber product.
9.7.72 Remark Section 4.3 of [1118] uses that the fiber product of two unramified covers is unramified
to create cryptographic exceptional subtowers. Section 5.2.3 of [1118] computes the
arithmetic monodromy attached to the Dickson subtower generated by all the exceptional
Chebychev conjugates over $\mathbb{F}_q$. The analog of Remark 9.7.69 over $\mathbb{F}_q$ gives a similar – Rédei
– subtower of $T_{\mathbb{P}^1_z,\mathbb{F}_q}$ generated by exceptional cyclic conjugates.
9.7.73 Remark Theorem 9.7.65 requires common exceptional intersection (Remark 9.7.66) to form
fiber products in TZ,K , Z absolutely irreducible over a number field K. For fiber products
(or composites) of Chebychev and cyclic conjugates, we easily decide if exceptional sets have
infinite intersection. Exceptional rational functions from Serre’s O(pen) I(mage) T(heorem)
give much harder versions of such problems.
9.7.74 Definition (j-line $\mathbb{P}^1_j$) A special copy of projective 1-space, the j-line, occurs in the study
of modular curves (see Theorem 9.7.76). Each $j \in (\mathbb{P}^1_j \setminus \{\infty\})(\bar{\mathbb{Q}}) = \mathbb{A}^1_j(\bar{\mathbb{Q}})$ has an attached
isomorphism class of elliptic curves $E_j$. For each integer n > 0, consider a special case of
a modular curve, $\mu_0(n) : X_0(n) \to \mathbb{P}^1_j$, with its cover of $\mathbb{P}^1_j$. Denote the points of $X_0(n)$
not lying over $j = \infty$ by $Y_0(n)$.
9.7.75 Definition For E an elliptic curve, denote by E → E/C an isogeny from quotienting E by
a (finite) torsion subgroup C of E. When C is cyclic, generated by $e' \in E$ (resp. all
torsion points killed by multiplication by n), write $C = \langle e' \rangle$ (resp. $C_n$).
9.7.76 Theorem There are two approaches to giving “meaning” to each algebraic point $y \in Y_0(n)$,
whose image in $\mathbb{P}^1_j$ is $j_y$:
1. [2311, p. 108] or [1115, p. 158]: $y \mapsto [E_{j_y} \to E_{j_y}/\langle e'_y \rangle]$ with $e'_y \in E_{j_y}$ of order n
where brackets, [ ], indicate an isomorphism class of isogenies.
2. [1115, Lemma 2.1]: $y \mapsto f_y \in \bar{\mathbb{Q}}(x)$ (up to Möbius equivalence) of degree n.
9.7.77 Theorem [1115, Theorem 2.1] Suppose f ∈ K(x) is exceptional and of prime degree u.
Then, f is Möbius equivalent over K to either:
1. a cyclic (Remark 9.7.70) or a Chebychev (Remark 9.7.34) conjugate; or
2. to some fy (u = n) in Theorem 9.7.76, Part 2.
9.7.78 Definition For a dense set of $j' \in \mathbb{A}^1_j$, the corresponding $E_{j'}$ is of CM-type if its ring
of isogenies, tensored by $\mathbb{Q}$, has dimension 2 over $\mathbb{Q}$. Such isogenies form a complex
quadratic extension of $\mathbb{Q}$ (containing $j'$, which is an algebraic integer; [2586, II-28] or
[2614, Chapter 2, Section 5.2]). Otherwise, $j'$ is of $GL_2$-type.
9.7.79 Theorem [1115, (2.10)] Continue the notation of Theorem 9.7.77. Except for the two cases
where $j_y$ is one of the two finite branch points of $\mu_0(u)$, the geometric monodromy $G_{f_y}$ is
the order 2u dihedral group $D_u$, and $f_y$ has four branch points (Definition 9.7.25). For u in
Theorem 9.7.77, Part 2, for which $E_{j'}$ has good reduction, the coordinates of $e'_y$ generate a
constant extension of K with group $A_{f_y}/G_{f_y}$ (explained in Theorem 9.7.85).
9.7.80 Theorem [1115, Section 2.B] For $j'$ of CM-type, complex multiplication theory gives (an
infinite) $E_{f_y,K}$. Computing this would use [1370, Sections 6.3.1-6.3.2].
9.7.81 Remark (Addendum to Theorem 9.7.80) Using adelic (modular) arithmetic gives analogs
of Corollary 9.7.36 and Corollary 9.7.41 for explicitly finding the functional inverse of a
CM-type function reduced modulo a prime in the exceptional set $E_{f_y,K}$. If $K = \mathbb{Q}(j')$, then $E_{f_y,K}$
depends on the congruence defining the Frobenius in the (cyclic of degree u − 1 over K)
constant field. Only finitely many $j'$ in $\mathbb{Q}$ have CM-type, corresponding to class number 1
for complex quadratic extensions.
9.7.82 Problem Take one of the CM-type values of j in $\mathbb{Q}$. Then, consider two allowed values of u, $u_i$,
i = 1, 2, denoting the corresponding $f_y$ by $f_i$, i = 1, 2. Test for explicitness in Remark
9.7.81 as to whether $E_{f_1,\mathbb{Q}} \cap E_{f_2,\mathbb{Q}}$ is infinite.
9.7.83 Definition (Composition factor definition field) For f ∈ F (x) consider a minimal field
Ff (ind) over which f decomposes into composition factors indecomposable over F̄ .
Similarly, denote the minimal field over which Xf,f \ ∆ in Theorem 9.7.24 breaks into
absolutely irreducible components by F̂f (2).
9.7.84 Proposition [1118, Proposition 6.5] If f : X → Z is a cover over F , then Ff (ind) ⊂ F̂f (2).
9.7.85 Theorem (See Remark 9.7.87) Assume $j' \in \mathbb{A}^1_j$ is of $GL_2$-type. For $K = \mathbb{Q}(j')$, consider
$C = C_u$ in Definition 9.7.75 with u a prime. The corresponding $f_y \in K(x)$, y over $j'$, has
degree $u^2$. Use the monodromy groups of Definition 9.7.17.
There is a constant $M_{1,j'}$ so that if $u > M_{1,j'}$, then the arithmetic/geometric monodromy
quotient $A_{f_y}/G_{f_y}$ is $GL_2(\mathbb{Z}/u)/\{\pm 1\}$. Further, $f_y$ decomposes into two degree u rational
functions over $K_f(\mathrm{ind})$, but it is indecomposable over K.
9.7.86 Theorem [1118, Proposition 6.6] Continue Theorem 9.7.85 hypotheses. For a second constant
$M_{2,j'}$, and for any prime $\mathfrak{p}$ of $\mathcal{O}_K$ with $|\mathcal{O}_K/\mathfrak{p}| > M_{2,j'}$ assume $A_{\mathfrak{p}} \in GL_2(\mathbb{Z}/u)/\langle \pm 1 \rangle$
represents the conjugacy class of the Frobenius for $\mathfrak{p}$. Then, $f_y \bmod \mathfrak{p}$ is an exceptional
indecomposable rational function, and it decomposes over the algebraic closure of $\mathcal{O}_K/\mathfrak{p}$,
precisely when $\langle A_{\mathfrak{p}} \rangle$ acts irreducibly on $(\mathbb{Z}/u)^2 = V_u$. This holds for infinitely many primes
$\mathfrak{p}$. In particular, $f_y$ is exceptional over K (Definition 9.7.30).
9.7.87 Remark (Using Serre’s OIT) [2586] lays the groundwork for [2587]. The latter has the
existence of the constant $M_{1,j'}$. Appendix A.1 and Section 3.2 of [2586] prove that it exists
when $j' \in \mathbb{A}^1_j(\bar{\mathbb{Q}})$ is not an algebraic integer. Then, the computation of $M_{i,j'}$, i = 1, 2,
in Theorems 9.7.85 and 9.7.86 is effective. Even after all these years, there is no effective
computation of these constants when $j'$ is not CM-type, but is an algebraic integer. Section
2 of [1115] gets Theorem 9.7.85 from the OIT using the relation between Parts 1 and 2 in
Theorem 9.7.76.
9.7.88 Remark (More elementary, but less precise than Theorem 9.7.86) Theorem 2.2 of [1115]
shows, for every K and any prime u > 3, the $j' \in K$, with $f_y$ satisfying the exceptionality
and decomposability conclusions of Theorem 9.7.86, are dense. Applying the [1113, Theorem
3] (or [1121, Theorem 12.7]) version of Hilbert’s Irreducibility Theorem to $X_0(u)$ gives the
corresponding $M_{2,j'}$ explicitly.
9.7.89 Example ($M_{1,j'}$ effectiveness?) Appendix A.1 and Section 3.3 of [2586] give Ogg’s example
[2310] with $j' \in \mathbb{Q}$. Section 6.2.2 of [1118] reviews this case, where $M_{2,j'} = 6$, to show how
to pick an $A_p$ acting irreducibly on $V_u$ as in Theorem 9.7.86 (for infinitely many p), assuring
that $E_{f_y,\mathbb{Q}}$ is infinite for $u > M_{2,j'}$.
Section 6.3.2 of [1118] – still Ogg’s case – aims at finding an automorphic function, à
la the Langlands Program, that would characterize the primes in $E_{f_y,\mathbb{Q}}$. This is akin to the
unrelated examples of [2595], but uses results on automorphic functions in [2590, Theorem
22]. Primes of $E_{f_y,\mathbb{Q}}$ do not lie in arithmetic progressions. So, Problem 9.7.90 is much harder
than Problem 9.7.82.
9.7.90 Problem (Analog of Problem 9.7.82) For the Ogg curve in Example 9.7.89, consider two
allowed values of u, ui , i = 1, 2, denoting the corresponding fy s by fi , i = 1, 2. Test for
explicitness in Remark 9.7.81 as to whether Ef1 ,Q ∩ Ef2 ,Q is infinite.
9.7.91 Remark Paper [1123] connects “variables separated factors” of Xf,f and composition factors
of f . Reference [84] used this to effectively test for composition factors (and primitivity) of
covers.
9.7.92 Theorem [1370, Chapter 3] Excluding finitely many degrees, all indecomposable exceptional
f ∈ K(x) (K a number field) are Möbius equivalent to a cyclic or Chebychev conjugate, or
to a CM function from Theorem 9.7.77 of prime degree; or they are from Theorem 9.7.86
and of prime degree squared.
9.7.93 Definition [1118, Definition 2.2] Consider f : X → Z, a cover of normal varieties over Fq ,
with Z absolutely irreducible, but X possibly reducible. Then f is pr-exceptional if it
is surjective on Fqk points for infinitely many k. There is a similar definition extending
Definition 9.7.30 over a number field, and for both a notation for exceptional sets.
9.7.94 Definition Use the value set notation of Remark 9.7.3. We say $f_i \in \mathbb{F}_q(x)$, i = 1, 2, are a
Davenport pair over $\mathbb{F}_q$ if $V_{f_1}(\mathbb{P}^1(\mathbb{F}_{q^k})) = V_{f_2}(\mathbb{P}^1(\mathbb{F}_{q^k}))$ for infinitely many k. So, take
$f_2(x) = x$ to see Davenport pairs generalize exceptional functions. The notion applies to
any pair of covers $f_i : X_i \to Z$, i = 1, 2. For K a number field, this similarly generalizes
Definition 9.7.30: $f_1, f_2 \in K(x)$ are a Davenport pair if they are a Davenport pair for
infinitely many residue class fields.
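A single-field instance of the value set comparison in Definition 9.7.94 can be checked by brute force. The sketch below is illustrative only: it handles polynomials over a prime field $\mathbb{F}_p$ (where ∞ maps to ∞, so comparing value sets on $\mathbb{F}_p$ suffices), it tests one field at a time rather than infinitely many, and the function names and coefficient-list encoding are assumptions of this example.

```python
def value_set(coeffs, p):
    """Value set on F_p of the polynomial sum(coeffs[i] * x**i)."""
    return {sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p
            for x in range(p)}

def same_value_set(f1, f2, p):
    """True when f1 and f2 have equal value sets on F_p."""
    return value_set(f1, p) == value_set(f2, p)

cube, identity = [0, 0, 0, 1], [0, 1]      # x^3 and x
print(same_value_set(cube, identity, 5))   # gcd(3, 5 - 1) = 1: x^3 permutes F_5
print(same_value_set(cube, identity, 7))   # gcd(3, 7 - 1) = 3: x^3 is not onto F_7
```

Taking the second polynomial to be x recovers the permutation test, matching the remark in the definition that Davenport pairs generalize exceptional functions.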
9.7.95 Theorem [1118, Corollary 3.6] Monodromy precision (Definition 9.7.26) applies to pr-
exceptional covers and so to Davenport pairs. That is, generalizing Theorems 9.7.53
and 9.7.55, a precise monodromy statement generalizes MacCluer’s Theorem (Proposition
9.7.28) to pr-exceptional covers and to Davenport pairs.
9.7.96 Theorem [1118, Section 3.1.2] With the notation of Definition 9.7.93, a pr-exceptional cover
over Fq is exceptional if and only if X is absolutely irreducible.
9.7.97 Remark The proof of Schur’s Conjecture began the solution of Davenport’s problem for
polynomial pairs (f1 , f2 ) over a number field, the main result of [1110]. Section 3.2 in [1118]
shows the exceptional set characterization for Davenport pairs in general is given by the
intersection of exceptionality sets for pr-exceptionality correspondences. A full description
of many authors’ results that came from the solution of Davenport’s problem – especially
the study of general zeta functions attached to diophantine problems – is in [1119, Section
7.3].
9.7.98 Remark (The Genus 0 Problem) Geometric monodromy groups of rational functions are
severely limited. The mildest statement for f ∈ Q̄(x) is that excluding cyclic and alternating
groups the composition factors of Gf fall among a finite set of simple groups. That is the
original genus 0 problem.
There is a large literature distinguishing between geometric monodromy of $f \in \bar{\mathbb{Q}}(x)$
and those in $\bar{\mathbb{F}}_q(x)$, because of wild (not tame; Remark 9.7.26) ramification. The contrast
starts from the [2191, Section 8.1.2, Guralnick’s Optimistic Conjecture] list of all primitive
monodromy groups of indecomposable $f \in \bar{\mathbb{Q}}[x]$.
9.7.99 Example (Davenport pairs) A significant part of the exceptional primitive monodromy
groups (Remark 9.7.98), without cyclic or alternating group composition factors, came
from the finitely many possible degrees of Davenport pairs $f_1, f_2 \in K[x]$ (polynomials) over
number fields, with $f_1$ indecomposable and $V_{f_1}(\mathcal{O}_K/\mathfrak{p}) = V_{f_2}(\mathcal{O}_K/\mathfrak{p})$.
Important hints about what to expect for primitive monodromy groups of $f \in \bar{\mathbb{F}}_q(x)$
came also from Davenport pairs. Section 3.3.3 in [1118] (explicitly in [332]): Over every
$\mathbb{F}_q$, there are infinitely many degrees of Davenport pairs, where $(\deg(f_1), p) = 1$, $f_1$ is
indecomposable, and $V_{f_1}(\mathbb{F}_{q^k}) = V_{f_2}(\mathbb{F}_{q^k})$ for all k.
9.7.100 Example Theorem 14.1 in [696] described the geometric monodromy ($PSL_2(p^a)$, p = 2, 3, a
odd) of the only possible exceptional polynomials over $\mathbb{F}_p$ whose degrees were neither prime
to p nor a power of p. Then, [1120] produced these: the first exceptional polynomials over
finite fields with nonsolvable monodromy.
9.7.101 Remark (Zeta functions attached to problems) Chapters 25 and 26 in [1121] detail how
Davenport pairs led to attaching Poincaré series – based on the Galois stratification procedure
of [1117] – to counting the values of parameters for any diophantine problem interpretable
over all extensions of $\mathbb{F}_q$, or for infinitely many primes p of K.
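9.7.102 Example Denote $w_1, \ldots, w_u$ by $\mathbf{w}$. Suppose $f(\mathbf{w}, x), g(\mathbf{w}, y) \in \mathbb{F}_q[\mathbf{w}, x, y]$. Denote the
cardinality of $\mathbf{w}' \in \mathbb{A}^u(\mathbb{F}_{q^k})$ with
\[ V(f(\mathbf{w}', x))(\mathbb{P}^1(\mathbb{F}_{q^k})) = V(g(\mathbf{w}', x))(\mathbb{P}^1(\mathbb{F}_{q^k})) \tag{9.7.5} \]
by $N_{f,g,k}$. Define $P_{f,g,\mathbb{F}_q}(t)$ to be the Poincaré series $\sum_{k=1}^{\infty} N_{f,g,k}\, t^k$.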
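9.7.103 Example With notation over $\mathbb{Z}$, as in Example 9.7.102, suppose $f(\mathbf{w}, x), g(\mathbf{w}, y) \in \mathbb{Z}[\mathbf{w}, x, y]$.
Denote the cardinality of $\mathbf{w}' \in \mathbb{A}^u(\mathbb{F}_{p^k})$ with (9.7.5) holding over $\mathbb{F}_{p^k}$ by $N_{f,g,\mathbb{Z}/p,k}$. Define
$P_{f,g,\mathbb{Z}/p}(t)$ to be $\sum_{k=1}^{\infty} N_{f,g,\mathbb{Z}/p,k}\, t^k$.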
9.7.104 Theorem [1121, Chapter 25], [1119, Section 7.3.3] For any diophantine problem over Fq ex-
pressed in a first order language, the attached Poincaré series is a rational function. Further,
there is an effective computation of the coefficients of its numerator and denominator based
on expressing those coefficients in p-adic Dwork cohomology.
9.7.105 Theorem (Theorem 9.7.104 continued) Given a diophantine problem D over $\mathbb{Z}$ (or $\mathcal{O}_K$)
expressed in a first order language, there is an effective split of the primes of $\mathbb{Q}$ (or over K)
into two sets: $L_{D,1}$ and $L_{D,2}$, with $L_{D,2}$ finite. Further, there is a set of varieties $V_1, \ldots, V_s$
over $\mathbb{Z}$, from which we produce linear equations in variables $Y_1, \ldots, Y_{s'}$ that serve as the
coefficients of the numerator and denominator of a rational function $P_D(t)$. To each $(p, Y_i)$,
$p \in L_{D,1}$ there is a universal attachment of a p-adic Dwork cohomology group, $H(p, Y_i)$,
computed in the category of such Dwork cohomology attached to $V_1, \ldots, V_s$.
The corresponding Poincaré series $P_{D,p}$ at $p \in L_{D,1}$ comes by substituting $H(p, Y_i)$ for
each $Y_1, \ldots, Y_{s'}$ in $P_D(t)$. Then apply the Frobenius operator at p to these coefficients.
9.7.106 Remark [814] In Theorem 9.7.105 it is possible to take V1 , . . . , Vs to be nonsingular projec-
tive varieties with Yi representing a Chow motive (over Q). Applying the Frobenius operator
at p is meaningful as Chow motives are formed from étale cohomology groups of V1 , . . . , Vs .
9.7.107 Remark The effectiveness of Theorem 9.7.104 is based on Dwork cohomology [943], and the
explicit calculations of [341]. Theorem 9.7.105 and Remark 9.7.106 both rest on the Galois
stratification procedure of [1117] or [1121, Chapter 24].
On the plus side, the uniform use of étale cohomology from characteristic 0 produces
wonderful invariants – like, Euler characteristics – attached to diophantine problems. On
the negative, all the effectiveness disappears. In particular, the relation between the sets
denoted LD,1 in the two results is a mystery.
9.7.108 Remark Relating exceptional covers (and Davenport pairs) and other problems about al-
gebraic equations is a running theme in [1118] and [1119]. Detecting these relations comes
from pr-exceptional correspondences [1118, Section 3.2]. We catch the possible appearance
of such correspondences when two Poincaré series have infinitely many identical coefficients.
9.7.109 Example An exceptional cover, X → P1z , over Q, will be a curve whose Poincaré series is the
same as that of P1z at infinitely many primes. The systematic use of such characterizations
combines monodromy precision (where it applies) and Theorem 9.7.110.
9.7.110 Theorem ([1119, Proposition 7.17], based on [1021]) The zero support of the difference of
two Poincaré series consists of the union of arithmetic progressions.
See Also
References Cited: [84, 332, 341, 696, 777, 814, 943, 1021, 1109, 1110, 1111, 1112, 1113,
1114, 1115, 1117, 1118, 1119, 1120, 1121, 1123, 1124, 1370, 1427, 1936, 1986, 2031, 2191,
2203, 2310, 2311, 2586, 2587, 2590, 2595, 2614]
10
Sequences over finite fields
10.1.3 Definition Let G be a finite abelian group. The Fourier transform of any function
$f : G \to \mathbb{C}$ is the function $\hat{f} : \hat{G} \to \mathbb{C}$ defined by
\[ \hat{f}(\chi) = \sum_{x \in G} f(x)\,\overline{\chi(x)} \]
where χ is a character of G.
10.1.4 Remark The Fourier transform can also be defined without the complex conjugation of
χ(x).
10.1.5 Remark It is common to choose an identification of G and $\hat{G}$. If $\chi_\alpha$ denotes the image of
$\alpha \in G$ under some isomorphism from G to $\hat{G}$, we write the Fourier transform as
\[ \hat{f}(\alpha) = \sum_{x \in G} f(x)\,\overline{\chi_\alpha(x)} \]
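\[ \mu_\alpha(x) = \zeta_p^{\langle \alpha, x \rangle} \]
where $\zeta_p$ is a fixed primitive p-th root of unity, and $\langle \cdot, \cdot \rangle$ is any $\mathbb{F}_p$-valued inner product on
$\mathbb{F}_q$. If we take $\langle \alpha, x \rangle$ to be $\mathrm{Tr}(\alpha x)$ (absolute trace) then the Fourier transform of a function
$f : \mathbb{F}_q \to \mathbb{C}$ becomes
\[ \hat{f}(\alpha) = \sum_{x \in \mathbb{F}_q} f(x)\,\zeta_p^{-\mathrm{Tr}(\alpha x)}. \]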
10.1.7 Example The characters of $\mathbb{F}_q^*$ (the multiplicative group of nonzero elements of $\mathbb{F}_q$) have
the form
\[ \chi_j(\gamma^k) = \zeta_{q-1}^{jk} \]
where $0 \le j \le q - 2$, $\zeta_{q-1}$ is a fixed primitive complex (q − 1)-th root of unity, and γ is a
fixed generator of $\mathbb{F}_q^*$. For example, if q is odd and j = (q − 1)/2 then $\chi_j$ is the quadratic
character. The Fourier transform of a function $f : \mathbb{F}_q^* \to \mathbb{C}$ can then be written
\[ \hat{f}(j) = \sum_{k=0}^{q-2} f(\gamma^k)\,\zeta_{q-1}^{-jk} \]
10.1.8 Example Let n > 1 be an integer. If $G = \mathbb{Z}/n\mathbb{Z}$, the characters are the functions
$\chi_j : \mathbb{Z}/n\mathbb{Z} \to \mathbb{C}$, for $j \in \mathbb{Z}/n\mathbb{Z}$, where
\[ \chi_j(k) = \zeta_n^{jk} \]
and $\zeta_n$ is a complex primitive n-th root of unity. The Fourier transform of a function
$f : \mathbb{Z}/n\mathbb{Z} \to \mathbb{C}$ can then be written
\[ \hat{f}(j) = \sum_{k=0}^{n-1} f(k)\,\zeta_n^{-jk}. \]
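The transform on $\mathbb{Z}/n\mathbb{Z}$ can be computed directly from this sum. The following is a minimal sketch, assuming the particular choice $\zeta_n = e^{2\pi i/n}$ and an inverse with the factor 1/n; the function names are illustrative and not from the source.

```python
import cmath

def dft(f):
    """Fourier transform on Z/nZ: fhat(j) = sum_k f(k) * zeta_n^(-j*k)."""
    n = len(f)
    zeta = cmath.exp(2j * cmath.pi / n)  # assumed choice of primitive n-th root of unity
    return [sum(f[k] * zeta ** (-j * k) for k in range(n)) for j in range(n)]

def inverse_dft(fhat):
    """Recover f(k) = (1/n) * sum_j fhat(j) * zeta_n^(j*k)."""
    n = len(fhat)
    zeta = cmath.exp(2j * cmath.pi / n)
    return [sum(fhat[j] * zeta ** (j * k) for j in range(n)) / n for k in range(n)]

sample = [1, 2, 3, 4]
roundtrip = inverse_dft(dft(sample))
print([round(abs(v), 6) for v in dft([1, 0, 0, 0])])   # transform of a point mass
print([round(v.real, 6) for v in roundtrip])           # recovers the sample values
```

This direct evaluation costs on the order of n^2 operations; it is meant to mirror the formula, not to replace a fast transform.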
10.1.9 Remark The space $\mathbb{C}^G$ of all complex-valued functions defined on G is a Hermitian inner
product space via
\[ \langle f, g \rangle = \frac{1}{|G|} \sum_{x \in G} f(x)\,\overline{g(x)}. \]
\[ f(x) = \frac{1}{|G|} \sum_{\chi \in \hat{G}} \langle f, \chi \rangle\, \chi(x) \]
for all x ∈ G. This expresses f as a linear combination of the basis of characters. Note that
it also expresses each value of f as a linear combination of roots of unity. This expression
explains why the values $\hat{f}(\chi)$ are sometimes called the Fourier coefficients of f.
10.1.13 Remark Next we present some of the fundamental theorems in Fourier analysis. Proofs can
be found in [2789].
10.1.14 Theorem Plancherel’s Theorem states that
\[ \langle f, g \rangle = \frac{1}{|G|} \langle \hat{f}, \hat{g} \rangle. \]
10.1.16 Theorem The Poisson Summation Formula states that, for any subgroup H of G,
\[ \frac{1}{|H|} \sum_{x \in H} f(x) = \frac{1}{|G|} \sum_{\chi \in H^{\perp}} \hat{f}(\chi) \]
for all a ∈ G.
10.1.19 Theorem The convolution theorem of Fourier analysis states that the Fourier transform of
a convolution of two functions is equal to the ordinary product of their Fourier transforms:
\[ \widehat{f * g} = \hat{f} \cdot \hat{g}. \]
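The convolution theorem can be checked numerically on $G = \mathbb{Z}/n\mathbb{Z}$. The sketch below assumes the unnormalized convolution $(f * g)(a) = \sum_{x \in G} f(a - x) g(x)$ and the root of unity $e^{2\pi i/n}$; both conventions, like the function names, are assumptions of this illustration.

```python
import cmath

def dft(f):
    """Fourier transform on Z/nZ: fhat(j) = sum_k f(k) * zeta_n^(-j*k)."""
    n = len(f)
    zeta = cmath.exp(2j * cmath.pi / n)  # assumed primitive n-th root of unity
    return [sum(f[k] * zeta ** (-j * k) for k in range(n)) for j in range(n)]

def convolve(f, g):
    """Cyclic convolution (f * g)(a) = sum_x f(a - x) g(x), with no normalizing factor."""
    n = len(f)
    return [sum(f[(a - x) % n] * g[x] for x in range(n)) for a in range(n)]

f, g = [1, 2, 0, -1], [3, 0, 1, 1]
lhs = dft(convolve(f, g))                       # transform of the convolution
rhs = [u * v for u, v in zip(dft(f), dft(g))]   # pointwise product of transforms
max_error = max(abs(u - v) for u, v in zip(lhs, rhs))
print(max_error < 1e-9)
```

If the convolution carries a normalizing factor such as 1/|G|, the same factor appears on the right-hand side.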
10.1.20 Remark We have defined above the Fourier transform of a function G −→ C. The definition
can be extended to functions defined on G taking values in another abelian group, B. The
definition involves characters of B as well as characters of G.
10.1.21 Definition Let f : G → B be a function between finite abelian groups. The Fourier
transform of f is the function $\hat{f} : \hat{G} \times \hat{B} \to \mathbb{C}$ defined by
\[ \hat{f}(\chi, \psi) = \sum_{x \in G} \psi(f(x))\,\overline{\chi(x)}. \tag{10.1.1} \]
because the additive characters (see Example 10.1.6) have the form Tr(αx).
10.1.26 Remark We note that in the important special case where $f(x) = x^d$ and $\gcd(d, q - 1) = 1$,
we may write any nonzero $\beta \in \mathbb{F}_q$ as $c^d$, and then
\[ \hat{f}(\alpha, \beta) = \sum_{x \in \mathbb{F}_q} \zeta^{\mathrm{Tr}(c^d x^d - \alpha x)} = \sum_{x \in \mathbb{F}_q} \zeta^{\mathrm{Tr}(x^d - \alpha c^{-1} x)} = \hat{f}(\alpha c^{-1}, 1). \]
It follows that, when $f(x) = x^d$ and $\gcd(d, q - 1) = 1$, we may often assume without loss of
generality that β = 1.
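The reduction to β = 1 can be checked numerically over a prime field, where the absolute trace is the identity map. This is a minimal sketch under that simplifying assumption, with $\zeta = e^{2\pi i/p}$; the function names and the sample values p = 11, d = 3 are hypothetical.

```python
import cmath

def fhat(alpha, beta, d, p):
    """Sum over x in F_p of zeta^(beta*x^d - alpha*x); Tr is the identity on F_p."""
    zeta = cmath.exp(2j * cmath.pi / p)  # assumed primitive p-th root of unity
    return sum(zeta ** ((beta * pow(x, d, p) - alpha * x) % p) for x in range(p))

def max_deviation(p, d):
    """Largest |fhat(alpha, c^d) - fhat(alpha / c, 1)| over nonzero c and all alpha."""
    worst = 0.0
    for c in range(1, p):
        beta, c_inv = pow(c, d, p), pow(c, -1, p)
        for alpha in range(p):
            gap = abs(fhat(alpha, beta, d, p) - fhat(alpha * c_inv % p, 1, d, p))
            worst = max(worst, gap)
    return worst

print(max_deviation(11, 3) < 1e-9)   # gcd(3, 11 - 1) = 1
```

The check works because the substitution x ↦ x/c turns $c^d x^d$ into $x^d$ and αx into $\alpha c^{-1} x$.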
10.1.27 Example For a Boolean function $f : \mathbb{F}_2^n \to \mathbb{F}_2$, the characters of $\mathbb{F}_2^n$ have the form
\[ \chi_\alpha(x) = (-1)^{\langle \alpha, x \rangle} \]
where $\langle \cdot, \cdot \rangle$ is any inner product on $\mathbb{F}_2^n$. Also, the only nonzero β in $\mathbb{F}_2$ is β = 1, so we may
drop the dependence on β and the Fourier transform is written
\[ \hat{f}(\alpha) = \sum_{x \in \mathbb{F}_2^n} (-1)^{f(x) + \langle \alpha, x \rangle}. \]
This is the Walsh transform of the Boolean function f, or the Hadamard transform of the
±1 valued function $(-1)^f$.
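A direct evaluation of the Walsh transform from a truth table looks as follows. This is a sketch with assumed conventions: vectors in $\mathbb{F}_2^n$ are encoded as integers 0, ..., 2^n − 1 via their bits, and the inner product is the usual dot product mod 2; the function name is illustrative.

```python
def walsh_transform(truth_table, n):
    """fhat(alpha) = sum_x (-1)^(f(x) + <alpha, x>), <,> the dot product on F_2^n."""
    def dot(a, x):
        # parity of the bitwise AND is the dot product of the bit vectors mod 2
        return bin(a & x).count("1") % 2
    size = 2 ** n
    return [sum((-1) ** (truth_table[x] ^ dot(a, x)) for x in range(size))
            for a in range(size)]

# f(x1, x2) = x1 * x2 (logical AND), listed for x = 00, 01, 10, 11.
and_table = [0, 0, 0, 1]
print(walsh_transform(and_table, 2))   # every entry has absolute value 2
```

For this sample function every transform value has absolute value $2^{n/2}$, the flat-spectrum behaviour associated with bent functions.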
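10.1.28 Remark One often uses the finite field $\mathbb{F}_{2^n}$ for the vector space $\mathbb{F}_2^n$, and the trace inner
product $\langle \alpha, x \rangle = \mathrm{Tr}(\alpha x)$. In this case the Walsh transform of a Boolean function f is
\[ \hat{f}(\alpha) = \sum_{x \in \mathbb{F}_{2^n}} (-1)^{f(x) + \mathrm{Tr}(\alpha x)}. \]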
$(f(g_1), \ldots, f(g_n)) \in B^n$
10.1.32 Remark If we identify a function $f : G \to \mathbb{C}$ with its vector $(f(g_1), \ldots, f(g_n))$, the Fourier
Transform of f is the vector obtained by multiplying the vector f by the character table of
G:
\[ \hat{f} = Xf \]
where X is the character table of G.
10.1.33 Remark We have already presented a case of the Discrete Fourier Transform in Example
10.1.8. However, we shall give the matrix formulation here, which is more common. In fact,
the field K below does not need to be a finite field. Further information can be found in
many places; see [2789] for example.
10.1.34 Definition Let K be a field containing all the n-th roots of unity. Let $\zeta_n$ be a primitive
n-th root of unity in K. Let $F_n$ be the n × n matrix whose (i, j) entry is $\zeta_n^{ij}$, where
$0 \le i, j \le n - 1$. The matrix $F_n$ is the n-th Fourier matrix.
10.1.38 Remark The Fourier matrix Fn is the character table of G = Z/nZ, and this definition is
a case of Remark 10.1.32.
10.1.39 Remark If K = C the Discrete Fourier Transform (DFT) is the same as Example 10.1.8.
If K = Fq and n is a divisor of q − 1, the DFT is known as the Discrete Fourier Transform
over a finite field.
10.1.40 Remark The inverse of $F_n$ has (i, j) entry $\frac{1}{n}\zeta_n^{-ij}$, i.e., $D_n F_n = nI_n$ (so $D_n$ has (i, j) entry
$\zeta_n^{-ij}$). Therefore the Discrete
Fourier Transform has an inverse, which is almost a Discrete Fourier Transform itself, except
for the factor of 1/n.
Sometimes authors include a factor of $1/\sqrt{n}$ in the definition, which then also appears
in the inverse, so with this definition the inverse DFT is also a DFT.
If K is a field of characteristic p, we assume here that n is relatively prime to p so that
1/n exists in K. If p divides n, a generalized DFT has been defined in [2013] using the
values of the Hasse derivatives.
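The inverse relation for the Fourier matrix is easy to confirm over a small prime field. The following sketch is an illustration with hypothetical sample values: K = $\mathbb{F}_{13}$, n = 4 (a divisor of 13 − 1), and $\zeta_4 = 5$, which has order 4 modulo 13.

```python
def fourier_matrix(n, zeta, p):
    """n x n matrix over F_p whose (i, j) entry is zeta^(i*j)."""
    return [[pow(zeta, i * j, p) for j in range(n)] for i in range(n)]

def mat_mul(A, B, p):
    """Matrix product over F_p."""
    return [[sum(a * b for a, b in zip(row, col)) % p for col in zip(*B)]
            for row in A]

p, n = 13, 4            # n divides p - 1, so F_13 contains the 4th roots of unity
zeta = 5                # 5^2 = 25 = -1 (mod 13), hence 5 has order 4
F = fourier_matrix(n, zeta, p)
D = fourier_matrix(n, pow(zeta, -1, p), p)   # entries zeta^(-i*j)
product = mat_mul(D, F, p)
print(product)          # n times the identity matrix, reduced mod p
```

Multiplying `D` by the inverse of n modulo p then gives the inverse matrix itself, which is where the assumption that n is prime to the characteristic enters.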
10.1.41 Remark The DFT as defined here may also be viewed as the DFT on the group ring K[G]
where G is a cyclic group. This definition can be generalized to a DFT on other group rings.
Rings which support Fourier transforms are characterized in [418].
10.1.42 Remark Sometimes we identify the vector $f = (f_0, f_1, \ldots, f_{n-1})$ with the polynomial
$f(x) = f_0 + f_1 x + \cdots + f_{n-1} x^{n-1}$, and then
Thus the Discrete Fourier Transform of f is the vector of values of the polynomial f at the
n-th roots of unity.
10.1.43 Remark The Fast Fourier Transform is a computationally efficient way of computing the
Discrete Fourier Transform, and has many applications. Traditionally, n is a power of 2 and
one uses polynomial evaluations as in Remark 10.1.42. There is a huge literature on this
topic, see [2789] for example, so we do not go into this here.
10.1.44 Remark Given a vector $f = (f_0, f_1, \ldots, f_{n-1})$ in $K^n$, we construct a circulant matrix F
with f as its top row. Similarly we construct a circulant matrix $\hat{F}$ with $\hat{f} = (\hat{f}_0, \hat{f}_1, \ldots, \hat{f}_{n-1})$
as its top row. There is a result sometimes known as Blahut’s theorem [141] stating that
the weight of f is equal to the rank of this circulant $\hat{F}$, and the weight of $\hat{f}$ is equal to the
rank of F. This is true because if we let D be the diagonal matrix $\mathrm{diag}(f_0, f_1, \ldots, f_{n-1})$, we
observe that $F_n D F_n = \hat{F}$, and because $F_n$ is invertible, the rank of D (which is the weight
of f) is equal to the rank of $\hat{F}$. Because the linear complexity of f is equal to the rank of
F, this result can be useful for linear complexities of sequences.
10.1.45 Remark In this subsection, as before, G and B denote finite abelian groups.
10.1.46 Remark The Fourier spectrum is the set of values of the Fourier transform. Most authors
take the Fourier spectrum to be the multi-set of values, i.e., the values including their
multiplicities. The Fourier spectrum is an important invariant in many applications.
10.1.47 Example The Fourier spectrum of the function $x^3$ on $\mathbb{F}_{2^n}$ is $\{0, \pm 2^{(n+1)/2}\}$ if $n$ is odd, and
$\{0, \pm 2^{n/2}, \pm 2^{(n+2)/2}\}$ if $n$ is even.
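For small n these values can be checked by brute force. The sketch below assumes that the Fourier coefficients in question are the sums of (−1)^Tr(b x^3 + a x) over x, taken over all a and all nonzero b; the helper names and the reduction polynomials x^3 + x + 1 and x^4 + x + 1 are choices made for this illustration.

```python
def gf_mul(a, b, n, poly):
    """Multiply a and b in F_{2^n}; poly is the reduction polynomial as a bit mask."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> n) & 1:
            a ^= poly
    return r


def trace(x, n, poly):
    """Absolute trace x + x^2 + ... + x^(2^(n-1)); the result is 0 or 1."""
    t, y = 0, x
    for _ in range(n):
        t ^= y
        y = gf_mul(y, y, n, poly)
    return t


def cube_spectrum(n, poly):
    """Set of values of sum_x (-1)^Tr(b*x^3 + a*x) over all a and all b != 0."""
    size = 1 << n
    cubes = [gf_mul(x, gf_mul(x, x, n, poly), n, poly) for x in range(size)]
    values = set()
    for b in range(1, size):
        for a in range(size):
            total = 0
            for x in range(size):
                e = gf_mul(b, cubes[x], n, poly) ^ gf_mul(a, x, n, poly)
                total += -1 if trace(e, n, poly) else 1
            values.add(total)
    return values


spectrum_n3 = cube_spectrum(3, 0b1011)    # F_8 via x^3 + x + 1
spectrum_n4 = cube_spectrum(4, 0b10011)   # F_16 via x^4 + x + 1
```

For n = 3 the set is {0, ±4}, and for n = 4 it is {0, ±4, ±8}, in agreement with the example.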
10.1.48 Remark There are many papers calculating the Fourier spectrum of specific functions that
are of particular interest, see [384] or [387] for example. There are also general papers on
the structure of the Fourier spectrum of arbitrary functions. The Weil bound gives an upper
bound on the absolute value of any Fourier coefficient, a result that we state here, and that
has many variations.
10.1.49 Definition The Weil sum associated to an additive character µ of Fq and a polynomial
f ∈ Fq [x] is
$$\sum_{x \in \mathbb{F}_q} \mu(f(x)).$$
$$\left| \sum_{x \in \mathbb{F}_q} \mu(f(x)) \right| \le (\deg(f) - 1)\sqrt{q}.$$
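A numerical check of this inequality can be sketched as follows. The example assumes a prime field F_p, the nontrivial additive character µ(x) = exp(2πi x/p), and a polynomial whose degree is coprime to p (the usual hypotheses for the Weil bound); the function name `weil_sum` and the test polynomial x^3 + x are illustrative choices.

```python
import cmath
import math


def weil_sum(coeffs, p):
    """Sum of exp(2*pi*i*f(x)/p) over x in F_p; coeffs lists f low degree first."""
    total = 0
    for x in range(p):
        fx = sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p
        total += cmath.exp(2j * math.pi * fx / p)
    return total


# f(x) = x^3 + x has degree 3, so the bound reads |sum| <= 2 * sqrt(p).
checks = {p: (abs(weil_sum([0, 1, 0, 1], p)), 2 * math.sqrt(p))
          for p in (7, 11, 13, 101)}
```

Each entry of `checks` pairs the absolute value of the sum with the corresponding bound.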
10.1.4.2 Nonlinearity
10.1.55 Remark The Fourier Transform of χA can be used to obtain information about A. There
are some general principles, such as: if all the Fourier coefficients of χA are small, relatively
speaking, then A is usually a fairly “random” subset; see [155] for further details. In a
different direction, the paper [865] is an example of utilizing the Fourier Transform of the
characteristic function of the support of a specific function of interest.
10.1.56 Definition The Gauss sum associated to a multiplicative character χ (i.e., a character of
$\mathbb{F}_q^*$) and an additive character µ (i.e., a character of $\mathbb{F}_q$) is
$$S(\chi, \mu) = \sum_{x \in \mathbb{F}_q^*} \chi(x)\mu(x). \qquad (10.1.4)$$
10.1.57 Remark Here is one interpretation of Gauss sums. Working in the space of functions F∗q −→
C we consider µ as a function on F∗q by restriction. Using Fourier inversion on (10.1.4), we
can express an additive character µ in terms of the basis of multiplicative characters; and
the coefficients in this expansion are Gauss sums.
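As a concrete instance of (10.1.4), the sketch below evaluates the quadratic Gauss sum over a prime field F_p, assuming χ is the Legendre symbol and µ(x) = exp(2πi x/p). It relies on the standard fact (not stated in this section) that a Gauss sum of nontrivial characters has absolute value √q; the function name is hypothetical.

```python
import cmath
import math


def quadratic_gauss_sum(p):
    """S(chi, mu) for chi the Legendre symbol mod p and mu(x) = exp(2*pi*i*x/p)."""
    total = 0
    for x in range(1, p):
        # Euler's criterion: x is a nonzero square mod p iff x^((p-1)/2) = 1.
        chi = 1 if pow(x, (p - 1) // 2, p) == 1 else -1
        total += chi * cmath.exp(2j * math.pi * x / p)
    return total


magnitudes = {p: abs(quadratic_gauss_sum(p)) for p in (3, 7, 13, 101)}
```

In each case the magnitude agrees with √p up to floating-point error.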
10.1.58 Remark If G is any finite abelian group, the uncertainty principle [2789] states that
Thus, a function and its Fourier transform cannot both have “small” support. Tao [2780]
recently showed that, in the case q = p, the uncertainty principle can be improved as follows:
See Also
References Cited: [141, 155, 384, 387, 418, 865, 2013, 2780, 2789]
10.2.1 Definition Let k be a positive integer and let a0 , a1 , . . . , ak−1 be fixed elements of the
finite field Fq . A sequence s0 , s1 , . . . of elements of Fq satisfying the linear recurrence
relation
$$s_{n+k} = \sum_{i=0}^{k-1} a_i s_{n+i} \quad \text{for } n = 0, 1, \dots$$
is an LFSR sequence (or a linear feedback shift register sequence, also a linear recurring
sequence) in Fq . The integer k is the order of the LFSR sequence or of the linear
recurrence relation.
10.2.2 Remark We usually abbreviate the sequence s0 , s1 , . . . by (sn ). The sequence (sn ) in Def-
inition 10.2.1 is uniquely determined by the linear recurrence relation and by the initial
values s0 , s1 , . . . , sk−1 .
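The recurrence of Definition 10.2.1 translates directly into code. The sketch below is restricted to prime fields F_p so that arithmetic is plain modular arithmetic; that restriction and the function name `lfsr_sequence` are assumptions of this example.

```python
def lfsr_sequence(coeffs, init, count, p):
    """First `count` terms of s_{n+k} = sum_i coeffs[i] * s_{n+i} over F_p.

    coeffs = [a_0, ..., a_{k-1}] and init = [s_0, ..., s_{k-1}].
    """
    k = len(coeffs)
    if len(init) != k:
        raise ValueError("need exactly k initial values")
    s = [v % p for v in init]
    while len(s) < count:
        n = len(s) - k
        s.append(sum(a * s[n + i] for i, a in enumerate(coeffs)) % p)
    return s[:count]


# Binary example: s_{n+3} = s_{n+1} + s_n with initial values 0, 0, 1.
binary_terms = lfsr_sequence([1, 1, 0], [0, 0, 1], 10, 2)
# Fibonacci numbers modulo 7: s_{n+2} = s_{n+1} + s_n.
fib_mod7 = lfsr_sequence([1, 1], [0, 1], 8, 7)
```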
10.2.3 Remark In electrical engineering, LFSR sequences in Fq are generated by special switching
circuits called linear feedback shift registers. A linear feedback shift register consists of
adders and multipliers for arithmetic in Fq as well as delay elements. In the binary case
q = 2, multipliers are not needed.
10.2.4 Theorem Any LFSR sequence $(s_n)$ in $\mathbb{F}_q$ of order $k$ is ultimately periodic with least period
at most $q^k - 1$. A sufficient condition for $(s_n)$ to be (purely) periodic is that the coefficient
$a_0$ in the linear recurrence relation in Definition 10.2.1 is nonzero.
10.2.5 Definition Let (sn ) be an LFSR sequence in Fq of order k satisfying the linear recurrence
relation in Definition 10.2.1. Then the polynomial
$$f(x) = x^k - \sum_{i=0}^{k-1} a_i x^i \in \mathbb{F}_q[x]$$
10.2.6 Definition Let (sn ) be an LFSR sequence in Fq of order k. Then for n = 0, 1, . . ., the
vector
$$\mathbf{s}_n = (s_n, s_{n+1}, \dots, s_{n+k-1}) \in \mathbb{F}_q^k$$
is the n-th state vector of (sn ).
10.2.7 Remark Let $(s_n)$ be an LFSR sequence in $\mathbb{F}_q$ with characteristic polynomial $f \in \mathbb{F}_q[x]$
and let $A$ be the companion matrix of $f$. Then the linear recurrence relation for $(s_n)$ in
Definition 10.2.1 can be written as the identity $\mathbf{s}_{n+1} = \mathbf{s}_n A$, $n = 0, 1, \dots$, for the state
vectors. Consequently, we get $\mathbf{s}_n = \mathbf{s}_0 A^n$ for $n = 0, 1, \dots$. Since $A^n$ can be calculated
by $O(\log n)$ matrix multiplications using the standard square-and-multiply technique, this
identity leads to an efficient algorithm for computing remote terms of the LFSR sequence
$(s_n)$.
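The square-and-multiply computation can be sketched as follows. The example works over a prime field F_p and assumes one particular layout of the companion matrix (ones on the subdiagonal, the coefficients a_0, ..., a_{k-1} in the last column), chosen so that row vectors satisfy s_{n+1} = s_n A; other texts may use the transpose. All function names are hypothetical.

```python
def mat_mul(X, Y, p):
    """Product of two matrices (lists of rows) with entries reduced mod p."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y))) % p
             for j in range(len(Y[0]))] for i in range(len(X))]


def companion(coeffs):
    """Companion matrix A with (s_n, ..., s_{n+k-1}) A = (s_{n+1}, ..., s_{n+k})."""
    k = len(coeffs)
    A = [[0] * k for _ in range(k)]
    for i in range(k - 1):
        A[i + 1][i] = 1            # shifts the state one position to the left
    for i in range(k):
        A[i][k - 1] = coeffs[i]    # last column produces the new term s_{n+k}
    return A


def remote_state(coeffs, init, n, p):
    """State vector s_n = s_0 * A^n using O(log n) matrix multiplications."""
    k = len(coeffs)
    result = [[int(i == j) for j in range(k)] for i in range(k)]  # identity
    base = companion(coeffs)
    e = n
    while e:
        if e & 1:
            result = mat_mul(result, base, p)
        base = mat_mul(base, base, p)
        e >>= 1
    return mat_mul([list(init)], result, p)[0]


# Fibonacci recurrence modulo 101: state number 10 is (F_10, F_11) = (55, 89).
state_10 = remote_state([1, 1], [0, 1], 10, 101)
```

The first entry of the returned state vector is the term s_n itself.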
10.2.8 Remark Let $\mathbb{F}_q^\infty$ be the sequence space over $\mathbb{F}_q$, viewed as a vector space over $\mathbb{F}_q$ under
termwise operations for sequences. Let $T$ be the shift operator
$$T\omega = (w_{n+1}) \quad \text{for all } \omega = (w_n) \in \mathbb{F}_q^\infty.$$
10.2.9 Definition The uniquely determined monic polynomial over Fq generating the ideal I(σ)
in Remark 10.2.8 is the minimal polynomial of the LFSR sequence σ in Fq .
10.2.10 Remark If σ is the zero sequence, then its minimal polynomial is the constant polynomial
1. If σ is a nonzero LFSR sequence, then its minimal polynomial has positive degree and is
the characteristic polynomial of the linear recurrence relation of least possible order satisfied
by σ.
10.2.11 Theorem The minimal polynomial of an LFSR sequence (sn ) in Fq divides any characteristic
polynomial of (sn ). A characteristic polynomial of (sn ) of degree k ≥ 1 is the minimal
polynomial of (sn ) if and only if the corresponding state vectors s0 , s1 , . . . , sk−1 are linearly
independent over Fq .
10.2.12 Example Let (sn ) be an LFSR sequence in Fq of order k with initial values s0 = s1 = · · · =
sk−2 = 0, sk−1 = 1 (s0 = 1 if k = 1). Then the linear independence property in the second
part of Theorem 10.2.11 is clearly satisfied, and so the characteristic polynomial of (sn )
of degree k is also the minimal polynomial of (sn ). An LFSR sequence with these special
initial values is an impulse response sequence.
10.2.13 Theorem If m ∈ Fq [x] is the minimal polynomial of the LFSR sequence σ in Fq , then the
least period of σ is equal to the order of m and the least preperiod of σ is equal to the
multiplicity of 0 as a root of m.
10.2.14 Corollary An LFSR sequence in $\mathbb{F}_q$ is periodic if and only if its minimal polynomial
$m \in \mathbb{F}_q[x]$ satisfies $m(0) \neq 0$.
10.2.15 Remark The basic theory of LFSR sequences in finite fields, as presented above, has quite
a long history. An important early paper is Zierler [3070]. Other milestones in the history
of LFSR sequences in finite fields are the lecture notes of Selmer [2579] and the book of
Golomb [1300]. A treatment of LFSR sequences in the wider context of general feedback
shift register sequences in finite fields, i.e., those including also nonlinear feedback functions,
is given in the monograph of Ronse [2475]. The proofs of many results in this section can
be found in Chapters 6 and 7 of the book [1938].
10.2.16 Theorem [1063] Let (sn ) be an LFSR sequence in Fq with characteristic polynomial
f ∈ Fq [x]. Let e0 be the multiplicity of 0 as a root of f , where we can have e0 = 0, and
let α1 , . . . , αh be the distinct nonzero roots of f (in its splitting field F over Fq ) with
multiplicities e1 , . . . , eh , respectively. Then
$$s_n = t_n + \sum_{i=1}^{h} \sum_{j=0}^{e_i - 1} \binom{n+j}{j} \beta_{ij} \alpha_i^n \quad \text{for } n = 0, 1, \dots,$$
10.2.18 Theorem For each i = 1, . . . , h, let σi be an LFSR sequence in Fq with minimal polynomial
mi ∈ Fq [x] and least period ri . If m1 , . . . , mh are pairwise coprime, then the minimal
polynomial of the (termwise) sum sequence σ1 + · · · + σh is equal to the product m1 · · · mh
and the least period of σ1 + · · · + σh is equal to the least common multiple of r1 , . . . , rh .
10.2.19 Remark In general, operations with LFSR sequences are treated in terms of the spaces
$S(f)$, where for a monic $f \in \mathbb{F}_q[x]$ we let $S(f)$ be the kernel of the linear operator $f(T)$
on $\mathbb{F}_q^\infty$ (compare with Remark 10.2.8). Any $S(f)$ is a linear subspace of $\mathbb{F}_q^\infty$ of dimension
for some monic g ∈ Fq [x]. If each fi , 1 ≤ i ≤ h, is nonconstant and has only simple roots,
then g is the monic polynomial whose roots are the distinct elements of the form α1 · · · αh ,
where each αi is a root of fi in the splitting field of f1 · · · fh over Fq . For the general case,
a procedure to determine g can be found in Zierler and Mills [3072].
10.2.24 Theorem [939, 2237] Let σ be an LFSR sequence in $\mathbb{F}_q$. Then so is $\sigma^{(d)}$ for any positive
integer $d$. If $f$ is a characteristic polynomial of σ and $f(x) = \prod_{j=1}^{k} (x - \beta_j)$ is the factorization
of $f$ in its splitting field over $\mathbb{F}_q$, then $g_d(x) = \prod_{j=1}^{k} (x - \beta_j^d)$ is a characteristic polynomial of
$\sigma^{(d)}$. Furthermore, if $f$ is the minimal polynomial of σ and $d$ is coprime to the least period
of σ, then $g_d$ is the minimal polynomial of $\sigma^{(d)}$.
10.2.26 Remark Characteristic sequences play an important role in the Niederreiter algorithm
for factoring polynomials over finite fields [2249, 2260]. Characteristic sequences can be
described explicitly in terms of their generating functions (see Theorem 10.2.35 below).
10.2.27 Remark In the case q = 2, an interesting operation on sequences is that of binary
complementation. If σ is a sequence of elements of $\mathbb{F}_2$, then its binary complement $\bar{\sigma}$ is obtained
by replacing each term 0 in σ by 1 and each term 1 in σ by 0.
10.2.28 Theorem Let σ be an LFSR sequence in $\mathbb{F}_2$ with minimal polynomial $m \in \mathbb{F}_2[x]$. Write $m$ in
the form $m(x) = (x + 1)^h m_1(x)$ with an integer $h \ge 0$ and $m_1 \in \mathbb{F}_2[x]$ satisfying $m_1(1) = 1$.
Then the minimal polynomial $\bar{m}$ of the binary complement $\bar{\sigma}$ is given by $\bar{m}(x) = (x+1)m(x)$
if $h = 0$, $\bar{m}(x) = m_1(x)$ if $h = 1$, and $\bar{m}(x) = m(x)$ if $h \ge 2$.
10.2.29 Remark LFSR sequences in $\mathbb{F}_q$ can be characterized in terms of Hankel determinants. For
an arbitrary sequence $(s_n)$ of elements of $\mathbb{F}_q$ and for integers $n \ge 0$ and $b \ge 1$, define the
Hankel determinant
$$D_n^{(b)} = \det\big((s_{n+i+j})_{0 \le i,j \le b-1}\big).$$
10.2.30 Theorem The sequence $(s_n)$ of elements of $\mathbb{F}_q$ is an LFSR sequence in $\mathbb{F}_q$ if and only if
there exists an integer $b \ge 1$ such that $D_n^{(b)} = 0$ for all sufficiently large $n$. Furthermore,
$(s_n)$ is an LFSR sequence in $\mathbb{F}_q$ with minimal polynomial of degree $k$ if and only if $D_0^{(b)} = 0$
for all $b \ge k + 1$ and $k + 1$ is the least positive integer for which this holds.
10.2.31 Remark If an LFSR sequence in Fq is known to have a minimal polynomial of degree at
most k for some integer k ≥ 1, then the Berlekamp-Massey algorithm produces the minimal
polynomial from the first 2k terms of the sequence [231, 2011].
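A compact version of the Berlekamp-Massey algorithm is sketched below. It is written for prime fields F_p only (an assumption of this example) and follows Massey's standard formulation with a connection polynomial C(x) = 1 + c_1 x + ... + c_L x^L; the minimal polynomial in the sense of Definition 10.2.9 is then the reciprocal x^L C(1/x). The function name and return convention are hypothetical.

```python
def berlekamp_massey(s, p):
    """Return (L, minpoly) for the sequence s over F_p.

    L is the linear complexity of the given terms and minpoly lists the
    coefficients of x^L + c_1 x^(L-1) + ... + c_L, constant term first.
    """
    C, B = [1], [1]        # current and previous connection polynomials
    L, m, b = 0, 1, 1      # complexity, shift since last length change, old discrepancy
    for n in range(len(s)):
        # Discrepancy between s[n] and the prediction of the current recurrence.
        d = s[n]
        for i in range(1, L + 1):
            if i < len(C):
                d += C[i] * s[n - i]
        d %= p
        if d == 0:
            m += 1
            continue
        coef = d * pow(b, -1, p) % p     # modular inverse needs Python 3.8+
        T = C[:]
        if len(C) < len(B) + m:
            C += [0] * (len(B) + m - len(C))
        for i, bi in enumerate(B):
            C[i + m] = (C[i + m] - coef * bi) % p
        if 2 * L <= n:
            L, B, b, m = n + 1 - L, T, d, 1
        else:
            m += 1
    C = (C + [0] * (L + 1))[:L + 1]
    return L, list(reversed(C))


# First 2k = 4 Fibonacci numbers mod 7 recover x^2 - x - 1 = x^2 + 6x + 6.
fib_result = berlekamp_massey([0, 1, 1, 2], 7)
# First 2k = 6 terms of the binary sequence with s_{n+3} = s_{n+1} + s_n.
binary_result = berlekamp_massey([0, 0, 1, 0, 1, 1], 2)
```

With fewer than 2k terms the returned polynomial need not be the minimal polynomial of the infinite sequence.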
10.2.32 Remark LFSR sequences in Fq can also be characterized in terms of their generating
functions. There are two different characterizations, depending on whether the generating
function is a formal power series in x or in x−1 .
10.2.33 Theorem The sequence (sn ) of elements of Fq is an LFSR sequence in Fq of order k with
connection polynomial c ∈ Fq [x] if and only if
$$\sum_{n=0}^{\infty} s_n x^n = \frac{g(x)}{c(x)}$$
where $p_1, \dots, p_h \in \mathbb{F}_q[x]$ are the distinct monic irreducible factors of $f$ and $p_i'$ is the first
derivative of $p_i$.
10.2.37 Theorem Any maximal period sequence σ in $\mathbb{F}_q$ is periodic with least period $q^k - 1$, where
$k$ is the degree of the minimal polynomial of σ.
10.2.38 Remark The terminology "maximal period sequence" stems from the fact that, by
Theorems 10.2.4 and 10.2.37, $q^k - 1$ is the largest value that can be achieved by the least period
of an LFSR sequence in $\mathbb{F}_q$ of order $k$.
10.2.39 Theorem Let $(s_n)$ be a maximal period sequence in $\mathbb{F}_q$ with minimal polynomial of degree
$k$. Then the state vectors $\mathbf{s}_0, \mathbf{s}_1, \dots, \mathbf{s}_{q^k-2}$ of $(s_n)$ run exactly through all nonzero vectors
in $\mathbb{F}_q^k$.
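Theorem 10.2.39 is easy to observe on a small case. The sketch below assumes the binary recurrence s_{n+4} = s_{n+1} + s_n, whose characteristic polynomial x^4 + x + 1 is primitive over F_2, and lists the first q^k − 1 = 15 state vectors; the function name is hypothetical.

```python
def state_vectors(coeffs, init, p):
    """Return the first p^k - 1 state vectors and the state that follows them."""
    k = len(coeffs)
    state = tuple(init)
    states = []
    for _ in range(p ** k - 1):
        states.append(state)
        nxt = sum(a * s for a, s in zip(coeffs, state)) % p
        state = state[1:] + (nxt,)
    return states, state


# coeffs = [a_0, a_1, a_2, a_3] for s_{n+4} = s_{n+1} + s_n over F_2.
states, following = state_vectors([1, 1, 0, 0], [0, 0, 0, 1], 2)
all_distinct = len(set(states)) == 15
```

The 15 states are pairwise distinct and nonzero, and the state that follows them equals the initial state, so the least period is 15.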
10.2.40 Theorem A nonzero periodic sequence σ of elements of $\mathbb{F}_q$ is a maximal period sequence in
$\mathbb{F}_q$ if and only if all its shifted sequences $T^h\sigma$, $h = 0, 1, \dots$, together with the zero sequence
form an $\mathbb{F}_q$-linear subspace of $\mathbb{F}_q^\infty$.
10.2.41 Theorem Let σ be a maximal period sequence in $\mathbb{F}_q$ with minimal polynomial of degree $k$.
Then every LFSR sequence in $\mathbb{F}_q$ having an irreducible minimal polynomial $g$ with $g(0) \neq 0$
and $\deg(g)$ dividing $k$ can be obtained from σ by applying a shift and then a decimation.
10.2.42 Remark For maximal period sequences, the autocorrelation function has a simple form.
Let (sn ) be a maximal period sequence in Fq with least period r. Then its autocorrelation
function Cr is defined by
$$C_r(h) = \sum_{n=0}^{r-1} \chi(s_n - s_{n+h})$$
10.2.44 Remark For LFSR sequences in Fq for which the least period is sufficiently large compared
to the order, the terms in the full period are almost evenly distributed over Fq . Without
loss of generality, it suffices to consider periodic LFSR sequences in Fq . For such a sequence
σ = (sn ) with least period r and for b ∈ Fq , let Z(b; σ) be the number of integers n with
0 ≤ n ≤ r − 1 such that sn = b. In other words, Z(b; σ) is the number of occurrences of b
in a full period of σ.
10.2.45 Theorem Let σ be a periodic LFSR sequence in Fq of order k and with least period r. Then
for any b ∈ Fq we have
$$\left| Z(b; \sigma) - \frac{r}{q} \right| \le \left(1 - \frac{1}{q}\right) q^{k/2}.$$
10.2.46 Remark If σ is a maximal period sequence in $\mathbb{F}_q$ with minimal polynomial of degree $k$, then
it follows from Theorem 10.2.39 that $Z(b; \sigma) = q^{k-1}$ for $b \neq 0$ and $Z(0; \sigma) = q^{k-1} - 1$.
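These counts can be confirmed on a small ternary example. The sketch assumes the recurrence s_{n+2} = 2 s_{n+1} + s_n over F_3, whose characteristic polynomial x^2 + x + 2 is primitive, so that the least period is 3^2 − 1 = 8; the function name is hypothetical.

```python
from collections import Counter


def full_period_counts(coeffs, init, p):
    """Count occurrences Z(b; sigma) over the first p^k - 1 terms."""
    k = len(coeffs)
    s = list(init)
    while len(s) < p ** k - 1:
        n = len(s) - k
        s.append(sum(a * s[n + i] for i, a in enumerate(coeffs)) % p)
    return s, Counter(s)


period, counts = full_period_counts([1, 2], [0, 1], 3)
```

Here q = 3 and k = 2, so each nonzero element occurs q^(k-1) = 3 times and 0 occurs q^(k-1) − 1 = 2 times.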
10.2.47 Remark Let the sequence σ = (sn ) be as in Remark 10.2.44. For b ∈ Fq and a positive
integer N , let Z(b; N ; σ) be the number of integers n with 0 ≤ n ≤ N − 1 such that sn = b.
10.2.48 Theorem [2231] Let σ be a periodic LFSR sequence in Fq of order k and with least period
r. Then for any b ∈ Fq and any integer N with 1 ≤ N ≤ r we have
$$\left| Z(b; N; \sigma) - \frac{N}{q} \right| \le \left(1 - \frac{1}{q}\right) q^{k/2} \left( \frac{2}{\pi} \log r + \frac{7}{5} \right).$$
10.2.49 Remark The distribution of blocks of elements in a periodic LFSR sequence $\sigma = (s_n)$ in
$\mathbb{F}_q$ has also been investigated. Let $r$ be the least period of σ. For $\mathbf{b} = (b_1, \dots, b_t) \in \mathbb{F}_q^t$ with
a positive integer $t$, let $Z(\mathbf{b}; \sigma)$ be the number of integers $n$ with $0 \le n \le r - 1$ such that
$s_{n+i-1} = b_i$ for $1 \le i \le t$. The simplest case is that of a maximal period sequence σ in $\mathbb{F}_q$.
If $k$ is the degree of the minimal polynomial of σ and $1 \le t \le k$, then $Z(\mathbf{b}; \sigma) = q^{k-t}$ for
$\mathbf{b} \in \mathbb{F}_q^t$ with $\mathbf{b} \neq \mathbf{0}$, whereas $Z(\mathbf{0}; \sigma) = q^{k-t} - 1$.
10.2.50 Theorem [2233] Let σ be a periodic LFSR sequence in Fq with least period r. Let m ∈ Fq [x]
be the minimal polynomial of σ and put k = deg(m). Suppose that the positive integer t is
less than or equal to the degree of any irreducible factor of m in Fq [x]. Then for any b ∈ Ftq
we have
$$\left| Z(\mathbf{b}; \sigma) - \frac{r}{q^t} \right| \le \left(1 - \frac{1}{q^t}\right) q^{k/2}.$$
10.2.51 Remark More general results on distribution properties of LFSR sequences are available
in [2233]. For instance, there is an analog of Theorem 10.2.48 for the distribution of blocks
of elements in parts of the full period. Furthermore, one can consider not only blocks of
successive terms of an LFSR sequence as in Remark 10.2.49, but also blocks of terms with
arbitrary lags. Refined results for various special cases can be found in [2635, 2662].
10.2.52 Remark LFSR sequences in Fq have numerous applications. In this subsection, we mention
some typical applications. We start with an application to combinatorics.
10.2.54 Remark Let $s_0, s_1, \dots$ be an impulse response sequence of order $k$ (see Example 10.2.12)
which is also a maximal period sequence in $\mathbb{F}_q$ with minimal polynomial of degree $k$. Then
$0, s_0, s_1, \dots, s_{q^k-2}$ is a $(q, k)$ de Bruijn sequence. This follows immediately from Theorem
10.2.39.
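The construction in this remark can be sketched as follows. The example assumes prime fields, and it takes a (q, k) de Bruijn sequence to mean a cyclic sequence of length q^k in which every k-tuple over F_q appears exactly once as a window; both function names are hypothetical.

```python
def de_bruijn_from_lfsr(coeffs, p):
    """Prepend 0 to one period of the impulse response sequence over F_p."""
    k = len(coeffs)
    s = [0] * (k - 1) + [1]          # impulse response initial values
    while len(s) < p ** k - 1:
        n = len(s) - k
        s.append(sum(a * s[n + i] for i, a in enumerate(coeffs)) % p)
    return [0] + s


def is_de_bruijn(seq, k, p):
    """Check that the cyclic windows of length k are exactly all p^k tuples."""
    length = len(seq)
    windows = {tuple(seq[(i + j) % length] for j in range(k)) for i in range(length)}
    return length == p ** k and len(windows) == length


binary_db = de_bruijn_from_lfsr([1, 1, 0], 2)    # from x^3 + x + 1 over F_2
ternary_db = de_bruijn_from_lfsr([1, 2], 3)      # from x^2 + x + 2 over F_3
```

Both characteristic polynomials are primitive, which is what makes the underlying sequences maximal period sequences.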
10.2.55 Remark Any periodic sequence σ = (sn ) of elements of Fq , say with period r, satisfies the
linear recurrence relation sn+r = sn for n = 0, 1, . . . and is thus an LFSR sequence in Fq .
The linear complexity (or the linear span) of σ is defined to be the degree of the minimal
polynomial of σ. The linear complexity is an important complexity measure in the theory
of stream ciphers in cryptology. For details on the linear complexity, the reader is referred
to Section 10.4.
10.2.56 Remark There is a family of cryptosystems which are based on LFSR sequences in Fq and
the operation of decimation (see Definition 10.2.23). These cryptosystems were introduced
in [2241] and are FSR cryptosystems.
10.2.57 Remark LFSR sequences in Fq can be used for the encoding of cyclic codes. We refer to
Section 8.7 in the book of Peterson and Weldon [2390] for an account of this application.
This connection between LFSR sequences and cyclic codes, when combined with results of
the type stated in Theorem 10.2.45, yields information on the weight distribution of cyclic
codes [2232].
10.2.58 Remark Maximal period sequences in finite prime fields are used in methods for generating
uniform pseudorandom numbers in the interval [0, 1]. Well-known methods of this type are
the digital multistep method and the generalized feedback shift register (GFSR) method.
We refer to Chapter 9 in the book [2248] for a detailed discussion of these methods.
10.2.59 Remark LFSR sequences in Fq have important applications in digital communication sys-
tems. A celebrated example is code division multiple access (CDMA) in wireless communi-
cation. The book of Viterbi [2880] is the standard reference for CDMA.
See Also
References Cited: [231, 939, 1063, 1294, 1300, 1938, 2011, 2231, 2232, 2233, 2237, 2241,
2248, 2249, 2260, 2390, 2475, 2579, 2635, 2662, 2880, 3070, 3072]
10.3.1 Definition Let {u(t)} and {v(t)} be two complex-valued sequences of period n. The
periodic correlation of {u(t)} and {v(t)} at shift τ is the inner product
$$\theta_{u,v}(\tau) = \sum_{t=0}^{n-1} u(t + \tau)\,\overline{v(t)}, \quad 0 \le \tau < n,$$
10.3.2 Remark Sequences with good correlation properties have numerous applications in commu-
nication systems and lead to many challenging problems in finite fields. The main problems
from an application point of view are to find single sequences with low autocorrelation
for all nonzero shifts and families of sequences where the maximum nontrivial auto- and
crosscorrelation values between any two sequences in the family is low. For a more detailed
survey on the design and analysis of sequences with low correlation the reader is referred to
[1475]. Other related references are [1215, 1303, 1476, 2526, 2673] and Chapter V (Section
7) in [706].
10.3.3 Remark The crosscorrelation between two sequences $\{a(t)\}$ and $\{b(t)\}$ that take on values
in $\mathbb{Z}_q = \{0, 1, \dots, q - 1\}$ is defined using Definition 10.3.1 where $u(t) = \omega^{a(t)}$, $v(t) = \omega^{b(t)}$
and ω is a complex $q$-th root of unity, i.e.,
$$\theta_{a,b}(\tau) = \sum_{t=0}^{n-1} \omega^{a(t+\tau) - b(t)}, \quad 0 \le \tau < n.$$
10.3.4 Definition A sequence $\{s(t)\}$ has ideal autocorrelation if $\theta_s(\tau) = 0$ for all $\tau \not\equiv 0 \pmod{n}$.
10.3.5 Remark Sequences with ideal autocorrelation do not always exist, so a sequence is said to have
optimal autocorrelation if the maximal value of its autocorrelation is as small as it can be for a
sequence of the given period and symbol alphabet. Many optimal sequences have constant
autocorrelation of −1 for all out-of-phase shifts, i.e., when $\tau \not\equiv 0 \pmod{n}$.
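The correlation of Remark 10.3.3 and the constant out-of-phase value −1 can be illustrated on a binary sequence of period 15. The sketch assumes ω = exp(2πi/q) and uses the sequence generated by s(t+4) = s(t+1) + s(t) from the initial state (0, 0, 0, 1); the function names are hypothetical.

```python
import cmath
import math


def m_sequence_15():
    """One period of the binary sequence with s(t+4) = s(t+1) + s(t)."""
    s = [0, 0, 0, 1]
    while len(s) < 15:
        s.append((s[-4] + s[-3]) % 2)
    return s


def periodic_correlation(a, b, q, tau):
    """theta_{a,b}(tau) = sum_t omega^(a(t+tau) - b(t)) with omega = exp(2*pi*i/q)."""
    n = len(a)
    omega = cmath.exp(2j * math.pi / q)
    return sum(omega ** ((a[(t + tau) % n] - b[t]) % q) for t in range(n))


s = m_sequence_15()
autocorrelation = [round(periodic_correlation(s, s, 2, tau).real) for tau in range(15)]
```

The in-phase value is 15 and all 14 out-of-phase values are −1.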
10.3.6 Definition A q-ary maximal-length linear sequence $\{s(t)\}$ (or m-sequence) is a sequence of
elements with symbols from $\mathbb{F}_q$ of period $q^m - 1$ generated from a nonzero initial state
$(s(0), s(1), \dots, s(m - 1))$ and a linear recursion of degree $m$ given by
$$\sum_{i=0}^{m} f_i s(t + i) = 0,$$
where the characteristic polynomial of the recursion, defined by $f(x) = \sum_{i=0}^{m} f_i x^i$, is a
primitive polynomial in $\mathbb{F}_q[x]$ of degree $m$.
10.3.7 Theorem [1939, Theorem 8.24] An m-sequence, after a suitable cyclic shift, can be described
using the trace function from F = Fqm to K = Fq as
10.3.12 Remark There is a close connection between a balanced binary sequence $\{s(t)\}$ of period
$2^m - 1$ with autocorrelation −1 for all shifts $\tau \not\equiv 0 \pmod{2^m - 1}$ and difference sets with
Singer parameters $(2^m - 1, 2^{m-1} - 1, 2^{m-2} - 1)$. The connection is that $\{s(t)\}$ gives a Singer
difference set $\pmod{2^m - 1}$ defined by $D = \{t \mid s(t) = 0\}$ (see Section 14.6).
10.3.13 Theorem [864] Let $1 \le k < \frac{m}{2}$ and $\gcd(k, m) = 1$ and let α be a primitive element in $\mathbb{F}_{2^m}$.
Define the subset of $\mathbb{F}_{2^m}$ by
$$U_k = \left\{ (x + 1)^{2^{2k} - 2^k + 1} + x^{2^{2k} - 2^k + 1} + 1 \mid x \in \mathbb{F}_{2^m} \right\}.$$
10.3.15 Theorem [2388] If $p \equiv 3 \pmod 4$ then the Legendre sequence has a two-valued
autocorrelation with values $\theta_s(0) = p$ and $\theta_s(\tau) = -1$ for $\tau \not\equiv 0 \pmod p$. If $p \equiv 1 \pmod 4$ the
autocorrelation is inferior and takes on values 1 and −3 when $\tau \not\equiv 0 \pmod p$ in addition to
$\theta_s(0) = p$.
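The two cases of this theorem can be observed numerically. The construction of the Legendre sequence is not reproduced in this excerpt, so the sketch assumes a common convention: s(t) = 0 if t is a nonzero quadratic residue modulo p and s(t) = 1 otherwise (in particular s(0) = 1). The function names are hypothetical.

```python
def legendre_sequence(p):
    """Binary sequence of period p: 0 on nonzero squares mod p, 1 elsewhere."""
    residues = {(x * x) % p for x in range(1, p)}
    return [0 if t in residues else 1 for t in range(p)]


def autocorrelation(s, tau):
    """theta_s(tau) = sum_t (-1)^(s(t+tau) - s(t)) for a binary sequence s."""
    n = len(s)
    return sum((-1) ** (s[(t + tau) % n] ^ s[t]) for t in range(n))


def out_of_phase_values(p):
    s = legendre_sequence(p)
    return {autocorrelation(s, tau) for tau in range(1, p)}


values_p7 = out_of_phase_values(7)      # 7 = 3 (mod 4)
values_p13 = out_of_phase_values(13)    # 13 = 1 (mod 4)
```

For p = 7 the only out-of-phase value is −1, while p = 13 gives the two values 1 and −3.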
10.3.16 Construction (Binary Sidelnikov sequences) Let α be a primitive element in $\mathbb{F}_{p^m}$. The
binary Sidelnikov sequence has period $p^m - 1$ and is defined by
$$s(t) = \begin{cases} 1 & \text{if } \alpha^t + 1 \text{ is a nonsquare in } \mathbb{F}_{p^m}, \\ 0 & \text{otherwise.} \end{cases}$$
10.3.17 Theorem [2659] Binary Sidelnikov sequences are balanced with three-valued optimal au-
tocorrelation with out-of-phase values 0 or −4 when n ≡ 0 (mod 4) and 2 and −2 when
n ≡ 2 (mod 4).
10.3.18 Theorem [1475] Let h be a function from Zq to Zq . The sequence {h(t)} of period q with
symbols from Zq has ideal autocorrelation if and only if h is a bent function.
10.3.19 Remark Information on bent functions can be found in Section 9.3.
10.3.22 Remark There are three well-known bounds on $\theta_{\max}$ for a family of sequences with given
period $n$ and family size $M$. These bounds are due to Welch [2964], and Sidelnikov and
Levenshtein [1475]. The Welch and Sidelnikov bounds are based on bounds on the inner
products between complex vectors. In the Welch (resp. Sidelnikov) bound the sequences are
considered as complex vectors of norm $\sqrt{n}$ (resp. complex $q$-th roots of unity).
10.3.23 Theorem (The Welch bound) Let $k \ge 1$ be an integer and $\mathcal{F}$ a family of $M$ cyclically
distinct sequences of period $n$. Then
$$(\theta_{\max})^{2k} \ge \frac{1}{Mn - 1} \left( \frac{M n^{2k+1}}{\binom{k+n-1}{n-1}} - n^{2k} \right).$$
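The right-hand side of the Welch bound is straightforward to evaluate. The following sketch computes the resulting lower bound on θ_max for given n, M, and k; the function name and the sample parameters n = 15, M = 4 are hypothetical choices for illustration.

```python
from math import comb


def welch_bound(n, M, k=1):
    """Lower bound on theta_max for M sequences of period n (Welch, order k)."""
    rhs = (M * n ** (2 * k + 1) / comb(k + n - 1, n - 1) - n ** (2 * k)) / (M * n - 1)
    # The bound is vacuous when the right-hand side is negative.
    return max(rhs, 0.0) ** (1 / (2 * k))


# For k = 1 the bound simplifies to theta_max^2 >= n^2 (M - 1) / (M n - 1).
bound_15_4 = welch_bound(15, 4, k=1)
```

For n = 15 and M = 4 the bound is roughly 3.38, so any such family has θ_max of at least that size.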
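$$(\theta_{\max})^2 > (2k + 1)(n - k) + \frac{k(k + 1)}{2} - \frac{2^k n^{2k+1}}{M (2k)! \binom{n}{k}}, \quad 0 \le k < \frac{2n}{5}.$$
$$(\theta_{\max})^2 > \frac{k + 1}{2}(2n - k) - \frac{2^k n^{2k+1}}{M (k!)^2 \binom{2n}{k}}, \quad k \ge 0.$$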
10.3.27 Remark The crosscorrelation between two m-sequences of the same period and with
symbols from $K = \mathbb{F}_p$, $p$ prime, is equivalent to calculating the following exponential sum for
all nonzero $c \in F = \mathbb{F}_{p^m}$,
$$\theta_{a,b}(\tau) = -1 + \sum_{x \in F} \omega^{\mathrm{Tr}_{F/K}(cx - x^d)},$$
where two zeros α and β of the two characteristic polynomials are related by $\beta = \alpha^d$
where $\gcd(d, p^m - 1) = 1$ and $c = \alpha^\tau$. General results and open problems can be found in
[1467, 1475].
10.3.28 Remark Many binary families of sequences with excellent correlation properties are con-
structed from m-sequences. The most well-known is the family of Gold sequences.
10.3.29 Construction (Gold sequences) Let $m$ be odd, $d = 2^k + 1$ and $\gcd(k, m) = 1$. Let $\{s(t)\}$
be an m-sequence of period $n = 2^m - 1$. The family of Gold sequences is defined by
10.3.30 Theorem [1295] The parameters of the Gold family are $n = 2^m - 1$, $M = 2^m + 1$, and
$\theta_{\max} = 2^{(m+1)/2} + 1$.
10.3.31 Remark [2660] The Gold sequence family is optimal with respect to the Sidelnikov bound
and has the smallest possible θmax for the given period, alphabet, and family size.
10.3.32 Construction (The small family of Kasami sequences) Let $m = 2k$, $k \ge 2$, and α be a
primitive element of $F = \mathbb{F}_{2^m}$. Let also $M = \mathbb{F}_{2^k}$, $K = \mathbb{F}_2$, and
$$s_a(t) = \mathrm{Tr}_{F/K}(\alpha^t) + \mathrm{Tr}_{M/K}\left(a \alpha^{(2^k + 1)t}\right)$$
$$\mathcal{K} = \{\{s_a(t)\} \mid a \in M\}.$$
10.3.33 Theorem [1688] The parameters of the small Kasami family are $n = 2^m - 1$, $M = 2^k$, and
$\theta_{\max} = 2^k + 1$.
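For the smallest case m = 4, k = 2 the family can be generated and its correlations checked exhaustively. The sketch below assumes F = F_16 is built from the primitive polynomial x^4 + x + 1 with α = x, and represents the nonzero elements of M = F_4 as α^0, α^5, α^10; all function names are hypothetical.

```python
def gf16_exp_table():
    """EXP[i] = alpha^i in F_16 = F_2[x]/(x^4 + x + 1), as 4-bit integers."""
    exp = [1]
    for _ in range(14):
        v = exp[-1] << 1
        if v & 0b10000:
            v ^= 0b10011
        exp.append(v)
    return exp


EXP = gf16_exp_table()


def tr_F(i):
    """Tr_{F/K}(alpha^i) = y + y^2 + y^4 + y^8 for y = alpha^i; value 0 or 1."""
    return EXP[i % 15] ^ EXP[2 * i % 15] ^ EXP[4 * i % 15] ^ EXP[8 * i % 15]


def tr_M(i):
    """Tr_{M/K}(alpha^i) = y + y^2 for y = alpha^i in F_4 (i a multiple of 5)."""
    return EXP[i % 15] ^ EXP[2 * i % 15]


def kasami_family():
    family = [[tr_F(t) for t in range(15)]]                 # a = 0
    for j in range(3):                                      # a = alpha^(5j)
        family.append([tr_F(t) ^ tr_M(5 * j + 5 * t) for t in range(15)])
    return family


def correlation(a, b, tau):
    n = len(a)
    return sum((-1) ** (a[(t + tau) % n] ^ b[t]) for t in range(n))


def theta_max(family):
    """Largest |correlation| over all pairs and shifts, excluding the in-phase peak."""
    best = 0
    for i, a in enumerate(family):
        for j, b in enumerate(family):
            for tau in range(len(a)):
                if i == j and tau == 0:
                    continue
                best = max(best, abs(correlation(a, b, tau)))
    return best


family = kasami_family()
peak = theta_max(family)
```

With n = 15 and M = 4 the computed maximum is 5, which equals 2^k + 1 for k = 2.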
10.3.34 Remark [1475] The small Kasami family is optimal with respect to the Welch bound and
has the smallest possible θmax for the given period, alphabet, and family size.
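10.3.35 Construction (No sequences) Let $m = 2k$, $k \ge 2$, and α be a primitive element of $F = \mathbb{F}_{2^m}$.
Let also $1 \le r \le 2^k - 1$, $r \neq 2^i$ for any $i$ and $\gcd(r, 2^k - 1) = 1$. Let $M = \mathbb{F}_{2^k}$, $K = \mathbb{F}_2$, and
define
$$s_a(t) = \mathrm{Tr}_{M/K}\left( \left[ \mathrm{Tr}_{F/M}\left( \alpha^t + a \alpha^{(2^k + 1)t} \right) \right]^r \right)$$
where $a \in F$. The No sequence family is defined by
$$\mathcal{N} = \{\{s_a(t)\} \mid a \in F\}.$$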
10.3.36 Theorem [2295] The parameters of the No sequence family are the same as for the small
Kasami family. No sequences have higher linear complexity than the Kasami sequences.
10.3.37 Theorem [2661] (Sidelnikov sequence family) Let $p$ be a prime, $0 < d < p$ and α be an
element of order $p^m - 1$. Let also $F = \mathbb{F}_{p^m}$ and $K = \mathbb{F}_p$. Let $\{s(t)\}$ be a sequence over $\mathbb{F}_p$
of the form
$$s(t) = \mathrm{Tr}_{F/K}\left( \sum_{k=1}^{d} a_k \alpha^{kt} \right),$$
10.3.40 Definition (Lifting of $f(x)$) The lifting of a binary polynomial $f(x)$ is the quaternary
polynomial $g(x) = \sum_{i=0}^{m} g_i x^i$ where $g(x^2) \equiv (-1)^m f(x)f(-x) \pmod 4$.
10.3.41 Example The lifting of the primitive binary polynomial $f(x) = x^3 + x + 1$ is
$g(x) \equiv x^3 + 2x^2 + x + 3 \pmod 4$ since $g(x^2) = (-1)^3 f(x)f(-x) \equiv x^6 + 2x^4 + x^2 + 3 \pmod 4$.
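The lifting can be computed mechanically from its definition. The sketch below multiplies f(x) by f(−x) over the integers, reads off the even-degree coefficients, and reduces modulo 4; the function name `lift` and the coefficient ordering (constant term first) are choices made for this example.

```python
def lift(f):
    """Quaternary lifting g of a binary polynomial f (coefficients, constant first).

    Uses g(x^2) = (-1)^m f(x) f(-x) mod 4, where m = deg f.
    """
    m = len(f) - 1
    f_neg = [c * (-1) ** i for i, c in enumerate(f)]      # coefficients of f(-x)
    prod = [0] * (2 * m + 1)
    for i, a in enumerate(f):
        for j, b in enumerate(f_neg):
            prod[i + j] += a * b                          # integer arithmetic
    sign = (-1) ** m
    # f(x) f(-x) is an even polynomial, so only even-degree terms are needed.
    return [(sign * prod[2 * i]) % 4 for i in range(m + 1)]


# f(x) = 1 + x + x^3 lifts to g(x) = 3 + x + 2x^2 + x^3.
g = lift([1, 1, 0, 1])
```

The result agrees with Example 10.3.41.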
10.3.42 Construction (Family A) Let $g(x)$ be a lifting of a binary primitive polynomial of degree $m$.
The recursion $\sum_{i=0}^{m} g_i s(t + i) \equiv 0 \pmod 4$ generates $4^m - 1$ quaternary nonzero sequences
corresponding to all nonzero initial states $(s(0), s(1), \dots, s(m - 1))$. These sequences are
known to have period $2^m - 1$. Family A is constructed by selecting $2^m + 1$ cyclically distinct
sequences from this set.
10.3.43 Theorem [383, 2692] Family A has parameters $n = 2^m - 1$, $M = 2^m + 1$ and $\theta_{\max} \le 2^{m/2} + 1$.
10.3.45 Definition Let s = a + 2b ∈ Z4 where a, b ∈ Z2 . The most significant bit map π is defined
by π(s) = b.
10.3.46 Remark Any quaternary sequence $\{s(t)\}$ of odd period $n$ defines a quaternary sequence
$\{(-1)^t s(t)\}$ of period $2n$. Thus a binary sequence $\{b(t)\}$ of period $2n$ is obtained by
$b(t) = \pi((-1)^t s(t))$.
10.3.47 Construction (Family of binary Kerdock sequences) Let $g(x)$ be the lifting of a primitive
binary polynomial of odd degree $m$ and define $h(x) = -g(-x)$. The recursion with
characteristic polynomial $h(x)$ is applied to all initial states that are nonzero modulo 2. This
generates $4^m - 2^m$ quaternary sequences of period $2(2^m - 1)$. Selecting cyclically distinct
sequences from this set and applying the most significant bit map π to these sequences leads
to the Kerdock family $\mathcal{K}$ of $2^{m-1}$ binary sequences of period $2(2^m - 1)$.
10.3.48 Theorem [1475] The parameters of family $\mathcal{K}$ are $n = 2(2^m - 1)$, $M = 2^{m-1}$ and
$\theta_{\max} \le 2^{(m+1)/2} + 2$.
10.3.49 Remark The sequence family K is superior to the small Kasami set. The size of family K
can be increased by a factor of 2 without increasing θmax . For further details see [1475].
10.3.50 Definition The aperiodic correlation of two complex-valued sequences {u(t)} and {v(t)}
for t = 0, 1, · · · , n − 1 is defined by
$$\rho_{u,v}(\tau) = \sum_{t=\max\{0,-\tau\}}^{\min\{n-1,\,n-1-\tau\}} u(t + \tau)\,\overline{v(t)}, \quad -(n - 1) \le \tau \le n - 1.$$
10.3.51 Remark If $\{s(t)\}$ is a binary $\{+1, -1\}$ sequence then $\rho_{s,s}(\tau) = \rho_{s,s}(-\tau)$ and the aperiodic
autocorrelation is determined by the values
$$\rho_s(\tau) = \sum_{t=0}^{n-1-\tau} s(t + \tau)s(t), \quad 0 \le \tau \le n - 1.$$
10.3.52 Definition A Barker sequence is a binary {−1, +1} sequence of length n if the aperiodic
values ρs (τ ) satisfy |ρs (τ )| ≤ 1 for all τ , 1 ≤ τ ≤ n − 1.
10.3.53 Remark Barker sequences are only known for the following lengths: n = 2, 3, 4, 5, 7, 11, 13.
For example the sequence (+1, +1, +1, +1, +1, −1, −1, +1, +1, −1, +1, −1, +1) of length
n = 13 is the longest known Barker sequence; see also Section 17.3.
10.3.54 Remark It has been shown by Turyn and Storer [2828] that there are no Barker sequences
of odd length n > 13 and that if Barker sequences of length n > 13 exist then $n \equiv 0 \pmod 4$.
There is overwhelming evidence that no Barker sequence of length n > 13 exists.
10.3.55 Definition The merit factor of a binary {−1, +1} sequence {s(t)} of length n is defined
by
$$F = \frac{n^2}{2 \sum_{\tau=1}^{n-1} \rho_s^2(\tau)}.$$
10.3.56 Remark The highest known merit factor for a sequence is F = 14.08 coming from a Barker
sequence of length 13. For a long time the largest proven asymptotical value of the merit
factor for a family of arbitrarily long sequences was 6 [1523]. In [352] and [1806] new con-
structions were presented of families of sequences with asymptotic merit factor believed to
be greater than 6.34. Recently, this claim has been proved [1605].
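The value F = 14.08 for the Barker sequence of length 13 can be reproduced directly from the definitions of the aperiodic autocorrelation and the merit factor. The sketch below uses the sequence quoted in Remark 10.3.53; the function names are hypothetical.

```python
def aperiodic_autocorrelation(s, tau):
    """rho_s(tau) = sum_{t=0}^{n-1-tau} s(t + tau) * s(t) for 0 <= tau <= n - 1."""
    return sum(s[t + tau] * s[t] for t in range(len(s) - tau))


def merit_factor(s):
    """F = n^2 / (2 * sum_{tau=1}^{n-1} rho_s(tau)^2)."""
    n = len(s)
    energy = sum(aperiodic_autocorrelation(s, tau) ** 2 for tau in range(1, n))
    return n * n / (2 * energy)


barker13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]
sidelobes = [aperiodic_autocorrelation(barker13, tau) for tau in range(1, 13)]
factor = merit_factor(barker13)
```

All out-of-phase values have absolute value at most 1, and the merit factor equals 169/12, which is approximately 14.08.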
10.3.57 Remark Low correlation zone sequences (LCZ) are designed with small auto- and crosscor-
relation values for small values of their relative time shifts. The parameters of a family of
LCZ sequences are the period n of the sequences, the number M of sequences, the length
L of the low correlation zone, and the upper bound δ on the correlation value in the low
correlation zone. For further details see [1215].
10.3.58 Definition A family of low correlation zone sequences is defined by the parameters
(n, M, L, δ), where
10.3.59 Theorem [2778] For an (n, M, L, δ) LCZ sequence family it holds that
$$ML - 1 \le \frac{n - 1}{1 - \frac{\delta^2}{n}}.$$
10.3.60 Definition The periodic Hamming correlation between a pair of binary sequences {s1 (t)}
and {s2 (t)} of period n is the integer
$$\theta_{1,2}(\tau) = \sum_{t=0}^{n-1} s_1(t + \tau)s_2(t), \quad 0 \le \tau < n.$$
10.3.61 Remark The periodic Hamming correlation is important in evaluating optical communica-
tion systems where 0s and 1s indicate presence or absence of pulses of transmitted light.
F = {{si (t)} | i = 1, 2, . . . , M }
10.3.63 Remark The close relations between (n, w, λ) OOCs and constant weight codes provide
good bounds on OOCs from known bounds on constant weight codes. For further information
see [1475].
10.3.64 Remark Other correlation measures include the partial-period correlation between two very
long sequences where the correlation is calculated over a partial period. In practice, there
is also some interest in the mean-square correlation of a sequence family rather than in
θmax . For the evaluation of these correlation measures coding theory sometimes plays an
important role. For more information the reader is referred to [1475].
See Also
References Cited: [352, 383, 550, 706, 864, 1215, 1295, 1303, 1324, 1467, 1475, 1476, 1523,
1605, 1688, 1806, 1939, 2295, 2388, 2526, 2555, 2659, 2660, 2661, 2673, 2692, 2778, 2828,
2964]
10.4.2 Definition The minimal polynomial of a linear recurring sequence S is the uniquely defined
monic polynomial M ∈ Fq [x] of smallest degree for which S is a linear recurring sequence
with characteristic polynomial M . The linear complexity L(S) of S is the degree of the
minimal polynomial M .
10.4.3 Remark Without loss of generality one can assume that $f$ is monic, i.e., $c_l = 1$. A sequence
$S$ over $\mathbb{F}_q$ is a linear recurring sequence if and only if $S$ is ultimately periodic; if $c_0$ in (10.4.1)
is nonzero then $S$ is purely periodic, see [1939, Chapter 8]. Consequently Definition 10.4.1
is only meaningful for (ultimately) periodic sequences. Using the notation of [1134, 2064],
we let $\mathcal{M}_q^{(1)}(f)$ be the set of sequences over $\mathbb{F}_q$ with characteristic polynomial $f$. The set
of sequences with a fixed period $N$ is then $\mathcal{M}_q^{(1)}(f)$ with $f(x) = x^N - 1$. The minimal
polynomial $M$ of a sequence $S \in \mathcal{M}_q^{(1)}(f)$ is always a divisor of $f$. For an $N$-periodic
sequence $S$ we have $L(S) \le N$; see Section 10.2.
10.4.4 Remark The linear complexity of a sequence S can alternatively be defined as the length
of the shortest linear recurrence relation satisfied by S. In engineering terms, L(S) is 0 if
S is the zero sequence and otherwise it is the length of the shortest linear feedback shift
register (Section 10.2) that can generate S [1631, 1939, 2502, 2503].
10.4.5 Definition For n ≥ 1 the n-th linear complexity L(S, n) of a sequence S over Fq is the
length L of a shortest linear recurrence relation
10.4.6 Remark Again one may assume that the n-th minimal polynomial is monic. Then it is
unique whenever L ≤ n/2. Definition 10.4.5 is also applicable for finite sequences, i.e.,
strings of elements of Fq of length n.
10.4.8 Remark Linear complexity and linear complexity profile of a given sequence (as well as the
linear recurrence defining it) can be determined by using the Berlekamp-Massey algorithm;
see Section 15.1 or [1631, Section 6.7], and [2011]. The algorithm is efficient for sequences
with low linear complexity and hence such sequences can easily be predicted.
10.4.9 Remark A sequence used as a keystream in stream ciphers must consequently have a large
linear complexity, but also altering a few terms of the sequence should not cause a significant
decrease of the linear complexity. An introduction to the stability theory of stream ciphers
is the monograph [873]. For a general comprehensive survey on the theory of stream ciphers
we refer to [2502, 2503].
10.4.10 Definition The k-error linear complexity $L_k(S, n)$ of a sequence S of length n is defined
by
$$L_k(S, n) = \min_{T} L(T, n),$$
where the minimum is taken over all sequences T of length n with Hamming distance
d(T, S) from S at most k. For an N-periodic sequence S over Fq the k-error linear
complexity is defined by [2700]
$$L_k(S) = \min_{T} L(T),$$
where the minimum is taken over all N-periodic sequences T over Fq for which the first
N terms differ in at most k positions from the corresponding terms of S.
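For very short binary periods the periodic k-error linear complexity can be evaluated directly from the definition. The sketch below is a brute-force illustration only (its cost grows combinatorially in k). It relies on the standard identity L(S) = N − deg gcd(x^N − 1, S^N(x)) for a single N-periodic sequence, where S^N(x) = s_0 + s_1 x + ... + s_{N−1} x^{N−1}. The function names are hypothetical, and polynomials over F_2 are encoded as Python integers (bit i is the coefficient of x^i).

```python
from itertools import combinations


def gf2_mod(a, b):
    """Remainder of a modulo b for polynomials over F_2 encoded as integers (b != 0)."""
    db = b.bit_length()
    while a.bit_length() >= db:
        a ^= b << (a.bit_length() - db)
    return a


def gf2_gcd(a, b):
    """Greatest common divisor of two polynomials over F_2."""
    while b:
        a, b = b, gf2_mod(a, b)
    return a


def periodic_lc_gf2(period):
    """Linear complexity of the binary sequence with the given period, via the gcd identity."""
    n = len(period)
    s_poly = sum(bit << i for i, bit in enumerate(period))
    g = gf2_gcd((1 << n) | 1, s_poly)          # x^N - 1 equals x^N + 1 over F_2
    return n - (g.bit_length() - 1)


def k_error_lc_gf2(period, k):
    """Minimum linear complexity over all periods differing in at most k positions."""
    best = periodic_lc_gf2(period)
    for weight in range(1, k + 1):
        for positions in combinations(range(len(period)), weight):
            t = list(period)
            for i in positions:
                t[i] ^= 1
            best = min(best, periodic_lc_gf2(t))
    return best


print(k_error_lc_gf2([1, 1, 1, 1, 1, 1, 1, 0], 0), k_error_lc_gf2([1, 1, 1, 1, 1, 1, 1, 0], 1))
```

The example period 1, 1, 1, 1, 1, 1, 1, 0 has the maximal linear complexity 8, but a single changed term yields the all-ones sequence of linear complexity 1, which shows how unstable the plain linear complexity can be.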
10.4.11 Remark The concept of the k-error linear complexity is based on the sphere complexity
introduced in [873].
10.4.12 Remark Recent developments in stream ciphers point toward an increasing interest in word-
based or vectorized stream ciphers (see for example [784, 1445]), which requires the study
of multisequences.
10.4.16 Definition For an integer n ≥ 1 the n-th joint linear complexity L(S, n) of an m-fold
multisequence S = (S1, . . . , Sm) is the length of the shortest linear recurrence relation
that the first n terms of the m parallel sequences S1, . . . , Sm satisfy simultaneously. The joint
linear complexity profile of S is the non-decreasing integer sequence L(S, 1), L(S, 2), . . ..
10.4.20 Definition For an integer k with 0 ≤ k ≤ mn, the (n-th) k-error joint linear complexity
Lk (S, n) of an m-fold multisequence S over Fq is defined by
For an integer 0 ≤ k ≤ n the (n-th) k-error Fq-linear complexity $L_k^q(S, n)$ of S is defined
by
$$L_k^q(S, n) = \min_{T \in \mathbb{F}_q^{m \times n},\ d_C(S,T) \le k} L(T, n).$$
i.e., the minimum is taken over all m-fold length n multisequences T = (T1 , . . . , Tm )
over Fq with Hamming distances dH (Si , Ti ) ≤ ki , 1 ≤ i ≤ m.
The definitions for periodic multisequences are analogous.
10.4.22 Remark For more information and discussion of linear recurring sequences, we refer to
Section 10.2.
10.4.23 Remark The reciprocal of a characteristic polynomial of a sequence S is also called a
feedback polynomial of S.
10.4.24 Remark Proposition 10.4.21 implies a one-to-one correspondence between sequences in
(1)
Mq (f ) and rational functions g/f with deg(g) < deg(f ) (when the approach via Laurent
series is used), and more generally between m-fold multisequences in Mq (f1 , . . . , fm ) and
m-tuples of rational functions (g1 /f1 , . . . , gm /fm ) with deg(gi ) < deg(fi ), 1 ≤ i ≤ m. We
note that in Proposition 10.4.21 Part 2 it is more convenient to start the indices for the
sequence elements si with i = 1.
10.4.25 Proposition [1135] Let (g1/f1, . . . , gm/fm) be the m-tuple of rational functions
corresponding to S ∈ Mq(f1, . . . , fm). The joint minimal polynomial of S is the unique monic
polynomial M ∈ Fq[x] such that $\frac{g_1}{f_1} = \frac{h_1}{M}, \ldots, \frac{g_m}{f_m} = \frac{h_m}{M}$ for some (unique) polynomials
h1, . . . , hm ∈ Fq[x] with gcd(M, h1, . . . , hm) = 1.
10.4.26 Remark For an N-periodic sequence S = s0, s1, . . ., let $S^N(x)$ be the polynomial
$S^N(x) = s_0 + s_1 x + \cdots + s_{N-1} x^{N-1}$ of degree at most N − 1. Then
$$\sum_{i=0}^{\infty} s_i x^i = S^N(x)/(1 - x^N),$$
which gives rise to the following theorem.
10.4.27 Theorem [759, Lemma 8.2.1], [2061] The joint linear complexity of an N -periodic m-fold
multisequence S = (S1 , . . . , Sm ) is given by
10.4.28 Remark Theorem 10.4.27 implies the famous Blahut theorem [303, 2503], [1631, Theorem
6.8.2] for the linear complexity of N-periodic sequences over Fq, gcd(N, q) = 1, which we
state in three commonly used versions.
10.4.29 Theorem (Blahut’s Theorem) Let S be an N -periodic sequence over Fq , let gcd(N, q) = 1,
and let α be a primitive N -th root of unity in an extension field of Fq . Then
10.4.30 Theorem (Blahut’s Theorem) Let gcd(N, q) = 1, α be a primitive N-th root of unity in an
extension field of Fq and let A = (a_{ij}) be the N × N Vandermonde matrix with $a_{ij} = \alpha^{ij}$,
0 ≤ i, j ≤ N − 1. Let s = (s0, s1, . . . , s_{N−1}) be the vector corresponding to one period of an
N-periodic sequence S over Fq. The linear complexity L(S) of S is the Hamming weight of
the vector $A s^T$.
10.4.31 Remark The vector $a = A s^T$ is called the discrete Fourier transform of s. Several
generalizations of the discrete Fourier transform have been suggested in the literature that can
be used to determine the linear complexity of periodic sequences and multisequences with
period not relatively prime to the characteristic of the field. We refer to [297, 2013, 2061].
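The matrix version of Blahut's theorem is easy to try out when the root of unity already lies in the ground field. The following Python sketch makes that simplifying assumption: p is prime, N divides p − 1, and α is an element of F_p of order exactly N, so no extension field arithmetic is needed. The function name `dft_linear_complexity` is hypothetical.

```python
def dft_linear_complexity(period, alpha, p):
    """Hamming weight of the discrete Fourier transform of one period over F_p.

    Assumes p is prime and alpha has multiplicative order N = len(period) in F_p.
    """
    n = len(period)
    if pow(alpha, n, p) != 1 or any(pow(alpha, d, p) == 1 for d in range(1, n)):
        raise ValueError("alpha must have order exactly N in F_p")
    # a_i = sum_j alpha^(i*j) * s_j, i.e. the i-th entry of A s^T
    transform = [
        sum(pow(alpha, i * j, p) * s_j for j, s_j in enumerate(period)) % p
        for i in range(n)
    ]
    return sum(1 for a in transform if a != 0)


# Over F_5 with alpha = 2 (order 4): the geometric sequence 1, 2, 4, 3, ... satisfies
# s_{j+1} = 2 s_j, so its linear complexity is 1.
print(dft_linear_complexity([1, 2, 4, 3], 2, 5))
```

The period 1, 0, 0, 0 has a transform with no zero entry, giving the maximal value 4, while the constant period 1, 1, 1, 1 gives 1.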
10.4.32 Theorem (Blahut’s Theorem) Let S = s0, s1, . . . be a sequence over Fq with period N
dividing q − 1, and let g ∈ Fq[x] be the unique polynomial of degree at most N − 1 satisfying
$g(\alpha^j) = s_j$, j = 0, 1, . . ., where α is a fixed element of Fq of order N. Then L(S) = w(g),
where w(g) denotes the weight of g, i.e., the number of nonzero coefficients of g.
10.4.33 Theorem [298, Theorem 8] Let f be a polynomial over a prime field Fp with degree of f at
most p − 1 and let S = s0 , s1 , . . . be the p-periodic sequence over Fp defined by sj = f (j),
j = 0, 1, . . .. Then L(S) = deg(f ) + 1.
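Theorem 10.4.33 can be checked numerically for small primes. The sketch below uses two standard facts that are assumptions of this illustration rather than part of the statement above: the identity L(S) = N − deg gcd(x^N − 1, S^N(x)) for an N-periodic sequence, and the factorization x^p − 1 = (x − 1)^p over F_p, so that for N = p the gcd is a power of x − 1. The function names are hypothetical.

```python
def polynomial_sequence_period(coeffs, p):
    """One period s_0, ..., s_{p-1} of s_j = f(j) mod p; coeffs are listed low degree first."""
    return [sum(c * pow(j, e, p) for e, c in enumerate(coeffs)) % p for j in range(p)]


def p_periodic_linear_complexity(period, p):
    """Linear complexity of a p-periodic sequence over F_p (p prime).

    Computes p minus the multiplicity of the root 1 of S^N(x), by repeated
    synthetic division by (x - 1).
    """
    a = [c % p for c in period]
    if not any(a):
        return 0
    multiplicity = 0
    while sum(a) % p == 0:                 # S(1) = 0, so (x - 1) divides the polynomial
        quotient, acc = [0] * (len(a) - 1), 0
        for k in range(len(a) - 1, 0, -1):
            acc = (acc + a[k]) % p
            quotient[k - 1] = acc
        a = quotient
        multiplicity += 1
    return p - multiplicity


# f(x) = x^2 over F_5 gives the period 0, 1, 4, 4, 1 and linear complexity deg(f) + 1 = 3.
period = polynomial_sequence_period([0, 0, 1], 5)
print(period, p_periodic_linear_complexity(period, 5))
```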
10.4.34 Remark Theorem 8 in [298] more generally describes the linear complexity of $p^r$-periodic
sequences over Fp. A generalization of Theorem 10.4.33 to arbitrary finite fields is given in
Theorem 1 of [2066].
10.4.35 Remark The linear complexity of an N-periodic sequence over Fq can be determined by the
Berlekamp-Massey algorithm in $O(N^2)$ elementary operations. For some classes of period
lengths, faster algorithms (of complexity O(N)) are known, the earliest being the
Games-Chan algorithm [1169] for binary sequences with period $N = 2^v$. A collection of algorithms
for several period lengths can be found in [3013] (see also [2958, 2959, 2960, 3014]). Some
techniques to establish fast algorithms for arbitrary periods are presented in [85, 599, 600,
2057]. Stamp and Martin [2700] established a fast algorithm for the k-error linear complexity
for binary sequences with period $N = 2^v$. Generalizations are presented in [1642, 1864, 2520],
and for odd characteristic in [2056, 3013].
10.4.36 Remark In contrast to the faster algorithms introduced in the literature for certain period
lengths, the Berlekamp-Massey algorithm also can determine the linear complexity profile
of a (single) sequence. As an application, the general behavior of linear complexity profiles
can be analyzed.
10.4.37 Theorem [1631, Theorem 6.7.4],[2502] Let S = s1 , s2 , . . . be a sequence over Fq . If
L(S, n) > n/2 then L(S, n + 1) = L(S, n). If L(S, n) ≤ n/2, then L(S, n + 1) = L(S, n)
for exactly one choice of sn+1 ∈ Fq and L(S, n + 1) = n + 1 − L(S, n) for the remaining q − 1
choices of sn+1 ∈ Fq .
10.4.38 Remark The linear complexity profile is uniquely described by the increment sequence
of S, i.e., by the sequence of the positive integers among L(S, 1), L(S, 2) − L(S, 1),
L(S, 3) − L(S, 2), . . . [2246, 2935, 2937]. Another tool for the analysis of the linear com-
plexity profile arises from a connection to the continued fraction expansion of Laurent
series [2239, 2240].
10.4.39 Theorem [2240] Let S = s1, s2, . . . be a sequence over Fq, let
$S(x) = \sum_{i=1}^{\infty} s_i x^{-i} \in \mathbb{F}_q((x^{-1}))$ be the corresponding formal Laurent series, and let
A1, A2, . . . be the polynomials in the continued fraction expansion of S(x), i.e.,
S(x) = 1/(A1 + 1/(A2 + · · · )) where Aj ∈ Fq[x], deg(Aj) ≥ 1, j ≥ 1. Let $Q_{-1} = 0$, $Q_0 = 1$ and
$Q_j = A_j Q_{j-1} + Q_{j-2}$ for j ≥ 1.
Then L(S, n) = deg(Qj) where j is determined by
The n-th minimal polynomials are all (monic) polynomials of the form $M = aQ_j + gQ_{j-1}$,
$a \in \mathbb{F}_q^*$, $g \in \mathbb{F}_q[x]$ with $\deg(g) \le 2\deg(Q_j) - n - 1$. In particular, the increment sequence of
S is deg(A1), deg(A2), . . ..
10.4.40 Remark Generalizations of the Berlekamp-Massey algorithm and of continued fraction anal-
ysis for the linear complexity of multisequences can be found in [127, 763, 764, 765, 766,
873, 1053, 1671, 2516, 2517, 2944].
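$$E_n^{(1)} = \frac{1}{q^n} \sum_{S \in \mathbb{F}_q^n} L(S, n) =
\begin{cases}
\dfrac{n}{2} + \dfrac{q}{(q+1)^2} - q^{-n}\,\dfrac{n(q+1)+q}{(q+1)^2} & \text{for even } n, \\[2ex]
\dfrac{n}{2} + \dfrac{q^2+1}{2(q+1)^2} - q^{-n}\,\dfrac{n(q+1)+q}{(q+1)^2} & \text{for odd } n.
\end{cases}$$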
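The closed formula for the expected n-th linear complexity can be compared with an exhaustive average over all sequences of length n for small n. The Python sketch below does this over a prime field F_p. It is only a sanity check under stated assumptions: the names `linear_complexity`, `brute_force_expectation` and `closed_form_expectation` are hypothetical, the field is prime so that `pow(b, -1, p)` (Python 3.8 or later) gives inverses, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction
from itertools import product


def linear_complexity(s, p):
    """n-th linear complexity of the finite sequence s over F_p via Berlekamp-Massey."""
    C, B, L, m, b = [1], [1], 0, 1, 1
    for n, s_n in enumerate(s):
        d = (s_n + sum(C[i] * s[n - i] for i in range(1, min(L, len(C) - 1) + 1))) % p
        if d == 0:
            m += 1
            continue
        coef, T = d * pow(b, -1, p) % p, C[:]
        C = C + [0] * max(0, len(B) + m - len(C))
        for i, b_i in enumerate(B):
            C[i + m] = (C[i + m] - coef * b_i) % p
        if 2 * L <= n:
            L, B, b, m = n + 1 - L, T, d, 1
        else:
            m += 1
    return L


def brute_force_expectation(n, p):
    """Average of L(S, n) over all p^n sequences of length n."""
    total = sum(linear_complexity(seq, p) for seq in product(range(p), repeat=n))
    return Fraction(total, p ** n)


def closed_form_expectation(n, q):
    """The displayed formula for E_n^(1), evaluated exactly."""
    tail = Fraction(n * (q + 1) + q, (q + 1) ** 2) / q ** n
    if n % 2 == 0:
        return Fraction(n, 2) + Fraction(q, (q + 1) ** 2) - tail
    return Fraction(n, 2) + Fraction(q * q + 1, 2 * (q + 1) ** 2) - tail


print(brute_force_expectation(4, 2), closed_form_expectation(4, 2))
```

For q = 2 and n = 4 both values equal 17/8, slightly above n/2.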
10.4.43 Remark Theorem 10.4.42 was obtained by an analysis of the Berlekamp-Massey algorithm.
Rueppel and Smeets [2502, 2685] provide closed formulas for the variance, showing that the
variance is small. A detailed analysis of the linear complexity profile of sequences over Fq is
given by Niederreiter in the series of papers [2235, 2239, 2240, 2243, 2246]. As a main tool,
the continued fraction expansion of formal Laurent series is used. For a more elementary
combinatorial approach, see [2242].
10.4.44 Theorem [2239] The linear complexity profile of a random sequence follows closely but
irregularly the n/2-line; deviations from n/2 of the order of magnitude log n must appear
for infinitely many n.
10.4.45 Remark The asymptotic behavior of the joint linear complexity is investigated by Niederre-
iter and Wang in the series of papers [2275, 2276, 2933] using a sophisticated multisequence
linear feedback shift-register synthesis algorithm based on a lattice basis reduction algorithm
in function fields [2549, 2928, 2934].
10.4.46 Theorem [2253, 2275, 2276]
10.4.49 Remark Feng and Dai [1059] obtained their result with different methods, namely with
multi-dimensional continued fractions.
10.4.50 Conjecture [2276]
$$E_n^{(m)} = \frac{mn}{m+1} + O(1) \quad \text{as } n \to \infty.$$
10.4.51 Remark For a detailed survey on recent developments in the theory of the n-th joint linear
complexity of m-fold multisequences we refer to [2257].
10.4.52 Theorem [1134, 1136] For a monic polynomial f ∈ Fq [x] with deg(f ) ≥ 1, let
10.4.53 Remark In [1134, 1136] an explicit formula for the variance $\mathrm{Var}^{(m)}(f)$ of the joint linear
complexity of random multisequences of $M_q^{(m)}(f)$ is given. In [1135, 1136] it is shown
how to obtain from Theorem 10.4.52 closed formulas for the more general case of m-fold
multisequences in Mq(f1, . . . , fm).
10.4.54 Remark Since for $f(x) = x^N - 1$ the set $M_q^{(m)}(f)$ is the set of N-periodic sequences,
earlier formulas on expectation (and variance) of the (joint) linear complexity of periodic
(multi)sequences can be obtained as a corollary of Theorem 10.4.52: [2059, Theorem 3.2],
[2060, Theorem 1], [3025, Theorem 1] on $E^{(1)}(x^N - 1)$, and [1137, Theorem 1], [2061,
Theorem 1] on $E^{(m)}(x^N - 1)$ for arbitrary m.
10.4.55 Remark In [1137, 2061] lower bounds on the expected joint linear complexity for periodic
multisequences are presented, estimating the magnitude of the formula for $E^{(m)}(x^N - 1)$
in Theorem 10.4.52. In [1137] it is also noted that the variance $\mathrm{Var}^{(m)}(x^N - 1)$ is small,
showing that for random N-periodic multisequences over Fq the joint linear complexity is
close to N (the trivial upper bound), with a small variance.
10.4.56 Remark Lower bounds for the expected n-th k-error joint linear complexity, the expected n-
th k-error Fq -linear complexity and the expected n-th k-error joint linear complexity for an
integer vector k = (k1 , . . . , km ) for a random m-fold multisequence over Fq are established
in [2063]. These results generalize earlier bounds for the case m = 1 presented in [2058].
10.4.57 Remark For periodic sequences, lower bounds on the expected k-error linear complexity
have been established in [2059, 2060]. For periodic multisequences (with prime period N
different from the characteristic), lower bounds for the expected error linear complexity are
presented in [2063] for all 3 multisequence error linear complexity measures.
10.4.58 Remark The papers [2062, 2254, 2273, 2274, 2866] address the question of whether linear
complexity and k-error linear complexity can be large simultaneously. Among other results, the
existence of N-periodic sequences attaining the upper bounds N and N − 1 for linear and
k-error linear complexity is shown for infinitely many period lengths (and a certain range
for k depending on the period length), and it is shown that for several classes of period
lengths a large number of N-periodic (multi)sequences with (joint) linear complexity N also
exhibit a large k-error linear complexity.
10.4.59 Remark In [3016] methods from function fields are used to construct periodic multise-
quences with large linear complexity and k-error linear complexity simultaneously for vari-
ous period lengths.
10.4.62 Remark We note that $j^{p-2} = j^{-1}$ for $j \in \mathbb{F}_p^*$. Since inversion is a fast operation, this
sequence is, despite its high n-th linear complexity, still highly predictable.
10.4.63 Remark Analogous sequences of (10.4.2) over arbitrary finite fields Fq are studied in [2067].
Multisequences of this form are investigated in [2070]. Explicit inversive sequences and
multisequences can also be defined using the multiplicative structure of Fq .
10.4.65 Remark Sequences of the form (10.4.3) are analyzed in [2069, 2070]. With an appropriate
choice of the parameters one can obtain (multi)sequences with perfect linear complexity
profile, i.e., L(Z, n) ≥ mn/(m + 1).
10.4.66 Theorem [2070] Let m < (q − 1)/N and let C1, . . . , Cm be different cosets of the group ⟨γ⟩
generated by γ, such that none of them contains the element −1. For 1 ≤ i ≤ m choose
αi, βi such that $\alpha_i \beta_i^{-1} \in C_i$, then
$$L(Z, n) \ge \min\left\{\frac{mn}{m+1}, N\right\}, \quad n \ge 1.$$
$$L(Q, n) \ge \frac{\min\{n, N\}}{2}, \quad n \ge 1.$$
10.4.69 Remark The period N of Q is at least half of the multiplicative order of ϑ.
with some initial value u0 ∈ Fp such that U is purely periodic with some period N ≤ p.
10.4.72 Remark For some special classes of polynomials much better results are available, see [1359,
1382, 2642]. For instance, in case of the largest possible period N = p we have
$$y_{j+1} = a y_j^{p-2} + b, \quad j \ge 0,$$
$$L(Y, n) \ge \min\left\{\frac{n-1}{3}, \frac{N-1}{2}\right\}, \quad n \ge 1.$$
10.4.74 Theorem [1359, 2642] The power sequence P = p0, p1, . . ., defined as
$$p_{j+1} = p_j^e, \quad j \ge 0,$$
$$L(P, n) \ge \min\left\{\frac{n^2}{4(p-1)}, \frac{N^2}{p-1}\right\}, \quad n \ge 1.$$
10.4.75 Remark Two more classes of nonlinear sequences provide much better results than in the
general case, nonlinear sequences with Dickson polynomials [87] and Rédei functions [2072].
See Section 9.6 and [1936] for the definitions.
10.4.77 Theorem [759, 2829] The linear complexity of the Legendre sequence is
$$L(\Lambda) =
\begin{cases}
(p-1)/2 & \text{if } p \equiv 1 \pmod 8, \\
p & \text{if } p \equiv 3 \pmod 8, \\
p-1 & \text{if } p \equiv 5 \pmod 8, \\
(p+1)/2 & \text{if } p \equiv 7 \pmod 8.
\end{cases}$$
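The four cases of Theorem 10.4.77 can be reproduced for small primes. The sketch below is an illustration under explicit assumptions: the Legendre sequence is taken to be the p-periodic binary sequence whose j-th term is 1 exactly when j is a quadratic nonresidue modulo p (the convention assumed here, since the definition is not reproduced above), and the linear complexity over F_2 is computed from the standard identity L(S) = N − deg gcd(x^N − 1, S^N(x)). Function names are hypothetical, and polynomials over F_2 are encoded as integers.

```python
def gf2_gcd(a, b):
    """Gcd of two polynomials over F_2 encoded as integers (bit i = coefficient of x^i)."""
    while b:
        db = b.bit_length()
        while a.bit_length() >= db:       # reduce a modulo b
            a ^= b << (a.bit_length() - db)
        a, b = b, a
    return a


def legendre_sequence(p):
    """One period of the Legendre sequence: 1 at quadratic nonresidues mod p, else 0."""
    residues = {(j * j) % p for j in range(1, p)}
    return [0 if (j == 0 or j in residues) else 1 for j in range(p)]


def periodic_lc_gf2(period):
    """Linear complexity over F_2 of the sequence with the given period."""
    n = len(period)
    s_poly = sum(bit << i for i, bit in enumerate(period))
    g = gf2_gcd((1 << n) | 1, s_poly)
    return n - (g.bit_length() - 1)


def predicted_lc(p):
    """Value stated in Theorem 10.4.77."""
    return {1: (p - 1) // 2, 3: p, 5: p - 1, 7: (p + 1) // 2}[p % 8]


for p in (3, 5, 7, 11, 13, 17):
    print(p, periodic_lc_gf2(legendre_sequence(p)), predicted_lc(p))
```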
10.4.78 Theorem [2644, Theorem 9.2] The linear complexity profile of the Legendre sequence
satisfies
$$L(\Lambda, n) > \frac{\min\{n, p\}}{1 + p^{1/2}(1 + \log p)} - 1, \quad n \ge 1.$$
10.4.79 Remark For similar sequences, that are defined by the use of the quadratic character of
arbitrary finite fields and the study of their linear complexity profiles, see [1786, 2065, 2994].
10.4.80 Definition Let γ be a primitive element and η be the quadratic character of the finite field
Fq of odd characteristic. The Sidelnikov sequence σ = σ0, σ1, . . . for j ≥ 0, is
$$\sigma_j =
\begin{cases}
1 & \text{if } \eta(\gamma^j + 1) = -1, \\
0 & \text{otherwise.}
\end{cases}$$
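Definition 10.4.80 translates directly into code when q = p is an odd prime. The following Python sketch makes that restriction (an assumption of this illustration; general Fq would need extension field arithmetic), evaluates the quadratic character with Euler's criterion, and returns the first p − 1 terms, after which the sequence repeats because γ^{p−1} = 1. The function name is hypothetical.

```python
def sidelnikov_sequence(p, gamma):
    """First p - 1 terms of the Sidelnikov sequence over the prime field F_p (p odd)."""
    powers = [pow(gamma, j, p) for j in range(p - 1)]
    if len(set(powers)) != p - 1:
        raise ValueError("gamma must be a primitive element of F_p")

    def eta(a):
        """Quadratic character of F_p, with eta(0) = 0."""
        a %= p
        if a == 0:
            return 0
        return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

    return [1 if eta(g + 1) == -1 else 0 for g in powers]


# p = 7 with the primitive element gamma = 3
print(sidelnikov_sequence(7, 3))
```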
10.4.81 Remark In many cases one is able to determine the linear complexity L(σ) over F2 exactly,
see Meidl and Winterhof [2071]. For example, if (q − 1)/2 is an odd prime such that 2 is
a primitive root modulo (q − 1)/2, then σ attains the largest possible linear complexity
L(σ) = q − 1. Moreover we have the lower bound [2071]
$$L(\sigma, n) \gg \frac{\min\{n, q\}}{q^{1/2} \log q}, \quad n \ge 1.$$
The k-error linear complexity of the Sidelnikov sequence seen as a sequence over Fp has
been estimated in [86, 641, 1198]. For results on similar sequences with composite modulus
see [392] and [759, Chapter 8.2].
10.4.82 Definition Let p > 3 be a prime and E be an elliptic curve over Fp of the form
$$Y^2 = X^3 + aX + b$$
with coefficients a, b ∈ Fp such that $4a^3 + 27b^2 \ne 0$. For a given initial point W0 ∈ E(Fp),
a fixed point G ∈ E(Fp) of order N and a rational function f ∈ Fp(E) the elliptic curve
congruential sequence W = w0, w1, . . . (with respect to f) is
10.4.84 Remark For example, choosing the function f (x, y) = x, the work of Hess and Shparlin-
ski [1493] gives the lower bound
10.4.85 Remark The Kolmogorov complexity is a central topic in algorithmic information theory.
The Kolmogorov complexity of a binary sequence is, roughly speaking, the length of the
shortest computer program that generates the sequence. The relationship between linear
complexity and Kolmogorov complexity was studied in [257, 2946]. The Kolmogorov com-
plexity is twice the linear complexity for almost all sequences over F2 of sufficiently (but only
moderately) large length. In contrast to the linear complexity the Kolmogorov complexity
is in general not computable and so of no practical significance.
10.4.86 Definition Let S = s0, s1, . . . be a sequence over Fq, and for s ≥ 1 let V(S, s) be the
subspace of $\mathbb{F}_q^s$ spanned by the vectors $\mathbf{s}_j - \mathbf{s}_0$, j = 1, 2, . . ., where
The sequence S passes the s-dimensional lattice test for some s ≥ 1, if $V(S, s) = \mathbb{F}_q^s$.
For given s ≥ 1 and n ≥ 2 we say that S passes the s-dimensional n-lattice test if
the subspace spanned by the vectors $\mathbf{s}_j - \mathbf{s}_0$, 1 ≤ j ≤ n − s, is $\mathbb{F}_q^s$. The largest s for
which S passes the s-dimensional n-lattice test is the lattice profile at n and is denoted
by S(S, n).
where the maximum is taken over all D = (d1 , d2 , . . . , dk ) with non-negative integers
d1 < d2 < · · · < dk and M such that M − 1 + dk ≤ T − 1. Obviously, C2 (S) is bounded
by the maximal absolute value of the aperiodic autocorrelation of S.
10.4.90 Remark The correlation measure of order k was introduced by Mauduit and Sárközy in
[2037]. The linear complexity profile of a given N -periodic sequence can be estimated in
terms of its correlation measure and a lower bound on L(S, n) can be obtained whenever
an appropriate bound on max Ck (S) is known.
10.4.91 Theorem [393] We have
10.4.92 Remark In [1748] an alternative feedback shift register architecture was presented, feedback
with carry shift registers (FCSR). For binary sequences the procedure is as follows: In contrast
to linear recurring sequences, the bits are added as integers (again following a linear
recurrence relation). The result is added to the content of a memory, which is a nonnegative
integer m, to obtain an integer σ. The parity bit σ (mod 2) of σ is then the next term of
the sequence, and the higher order bits ⌊σ/2⌋ are the new content of the memory.
FCSR-sequences share many properties with linear recurring sequences, but for their
analysis arithmetic in the 2-adic numbers is used instead of arithmetic in finite fields (or,
in the more general case of sequences modulo p, arithmetic in the p-adic numbers).
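The update rule described in Remark 10.4.92 can be simulated directly. The Python sketch below follows that description for binary sequences. The function name `fcsr_bits`, the tap convention (`taps[i-1]` multiplies the term i positions back) and the way the initial state is passed are assumptions made for this illustration.

```python
def fcsr_bits(taps, initial_bits, memory, count):
    """Generate `count` terms of a binary feedback-with-carry shift register sequence.

    taps[i-1] is the integer multiplier of a_{n-i}; initial_bits are a_0, ..., a_{r-1};
    memory is the initial nonnegative integer content of the carry register.
    """
    r = len(taps)
    bits = list(initial_bits)
    if len(bits) != r:
        raise ValueError("need exactly one initial bit per tap")
    m = memory
    while len(bits) < count:
        n = len(bits)
        # integer (not mod 2) linear combination of the last r bits, plus the memory
        sigma = m + sum(taps[i - 1] * bits[n - i] for i in range(1, r + 1))
        bits.append(sigma % 2)   # parity bit is the next term
        m = sigma // 2           # higher order bits become the new memory
    return bits[:count]


print(fcsr_bits([1, 1], [1, 0], 0, 10))
```

With both taps equal to 1 and initial state 1, 0 the output becomes periodic with period 4 after the first two terms.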
10.4.5.5 Discrepancy
10.4.93 Definition Let X = x0, x1, . . . be a sequence in the unit interval [0, 1). For 0 ≤ d1 < · · · <
dk < n we put
$$\sup_{I} \left| \frac{A(I, x_1, \ldots, x_{n-d_k})}{n - d_k} - V(I) \right|,$$
where the supremum is taken over all subintervals of $[0, 1)^k$, V(I) is the volume of I
and $A(I, x_1, \ldots, x_{n-d_k})$ is the number of points $x_j$, j = 1, . . . , n − dk, in the interval I.
See Also
References Cited: [85, 86, 87, 127, 129, 257, 297, 298, 303, 392, 393, 599, 600, 641, 759, 763,
764, 765, 766, 784, 873, 913, 914, 1053, 1059, 1134, 1135, 1136, 1137, 1169, 1198, 1330, 1331,
1359, 1378, 1382, 1445, 1493, 1631, 1642, 1671, 1748, 1749, 1786, 1864, 1936, 1939, 2011,
2013, 2036, 2037, 2056, 2057, 2058, 2059, 2060, 2061, 2062, 2063, 2064, 2065, 2066, 2067,
2069, 2070, 2071, 2072, 2235, 2239, 2240, 2242, 2243, 2246, 2253, 2254, 2257, 2273, 2274,
2275, 2276, 2502, 2503, 2516, 2517, 2520, 2549, 2642, 2644, 2685, 2700, 2774, 2829, 2866,
2928, 2933, 2934, 2935, 2937, 2944, 2946, 2958, 2959, 2960, 2994, 3013, 3014, 3016, 3025]
10.5.1 Introduction
10.5.2 Remark ADSs have proved to be exciting and challenging mathematical objects. They have
very interesting algebraic and number theoretic properties and also exhibit a very complex
behavior; see [91, 964, 1020, 1087, 1313, 1618, 1619, 2421, 2547, 2668] for the foundations of
the theory. This makes them an invaluable building block for various applications including
pseudorandom number generators (PRNGs), which are of crucial value in quasi-Monte Carlo
methods and cryptography; see [2271, 2648, 2814]. Recently, very surprising links with
other natural sciences such as biology [1596, 1860, 2717] and physics [106, 154, 216, 222,
1348, 1498, 2464, 2870, 2871] have emerged.
or
$$u_{n+k} = F^{(k)}(u_n).$$
10.5.5 Remark If F1, . . . , Fm are rational functions, then one has to decide what to do if un is a
pole of some of them. A canonical way to resolve this is to define these functions on the
set of their poles separately (for example, define $0^{-1} = 0$). Certainly this problem does not
occur if all functions F1, . . . , Fm are polynomials.
10.5.6 Remark Recently, new applications have emerged to cryptography and quasi-Monte Carlo
methods, where ADSs have been shown to provide a very attractive alternative to the
classical linear congruential PRNG.
10.5.7 Definition An attack on a PRNG is an algorithm that observes several outputs of a PRNG
and then is able to continue to generate the same sequence with a nontrivial probability.
10.5.8 Remark The interest in nonlinear dynamical systems as sources of pseudorandom numbers
[2271, 2814] has been driven by a series of devastating attacks on traditional linear con-
structions which have made them useless for cryptographic purposes; see [716, 1831] and
references therein.
10.5.9 Remark Nonlinear PRNGs are believed to be cryptographically stronger; see,
however, [210, 299, 300, 301, 1309, 1380] for some attacks. Yet obtaining concrete efficient
constructions with good, rigorously proven estimates on their statistical and other properties
has also been a challenging task [1358, 1379, 2329].
$$\delta(F) = \lim_{n \to \infty} \frac{\log D_n(F)}{n},$$
where $D_k(F)$ is the degree of $F^{(k)}$, defined as the largest degree of the components
$F_1^{(k)}, \ldots, F_m^{(k)}$.
10.5.11 Remark The existence of the limit follows immediately from the inequality
$D_{k+m}(F) \le D_k(F) D_m(F)$.
10.5.12 Remark Studying how the iterates of an ADS grow is a classical research direction with
a rich history and a variety of results. In the case of ADSs over the complex and p-adic
numbers there is a well-studied measure of “size” called the height, [91, 964, 1020, 2668].
Unfortunately this measure does not apply to ADSs over finite fields. However, in this case
the degree Dk (F) provides a very natural and adequate substitute for the notion of height.
Thus, the algebraic entropy plays a very essential role in the theory of ADSs over finite
fields [215, 216, 222, 1498, 2817, 2870, 2871].
10.5.13 Remark The degree growth is important for many applications of ADSs. In the univariate
case, it is obvious that the n-th iterate of a polynomial of degree d is a polynomial of degree
$d^n$. This however is not true anymore in the multidimensional case. Yet, the exponential
growth is expected for a “typical” ADS. For example in [1358, 1379] the exponential growth
is shown for some very special classes of polynomial systems, corresponding to nonlinear
recurrence sequences of order m, that is, sequences satisfying a recurrence relation of the
form
$$w_{n+m} = F(w_{n+m-1}, \ldots, w_n), \quad n = 0, 1, \ldots,$$
with F ∈ Fq(X1, . . . , Xm). An alternative approach with a combinatorial flavor to studying
such sequences has been suggested in [2329].
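The univariate statement, that the n-th iterate of a polynomial of degree d has degree d^n, can be confirmed by composing polynomials explicitly. The Python sketch below works over a prime field F_p with coefficient lists stored low degree first. The helper names are hypothetical, and the code covers the univariate case only; it says nothing about multivariate systems.

```python
def poly_mul(a, b, p):
    """Product of two polynomials over F_p (coefficient lists, low degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            out[i + j] = (out[i + j] + a_i * b_j) % p
    return out


def poly_compose(f, g, p):
    """Coefficients of f(g(x)) over F_p, by Horner's rule."""
    result = [0]
    for c in reversed(f):
        result = poly_mul(result, g, p)
        result[0] = (result[0] + c) % p
    while len(result) > 1 and result[-1] == 0:   # strip leading zero coefficients
        result.pop()
    return result


def iterate_degree(f, n, p):
    """Degree of the n-th iterate f^(n) = f(f(...f(x)...)) over F_p."""
    iterate = [0, 1]                              # the polynomial x
    for _ in range(n):
        iterate = poly_compose(f, iterate, p)
    return len(iterate) - 1


# f(x) = x^2 + 1 over F_5: degrees 2, 4, 8 for the first three iterates
print([iterate_degree([1, 0, 1], n, 5) for n in (1, 2, 3)])
```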
10.5.14 Remark Recently, a family of multivariate ADSs with polynomial degree growth of their
iterates has been constructed in [1312, 2328, 2330, 2332, 2334]. These ADSs are formed by
systems of rational functions of the form:
$g_m, h_m \in \mathbb{F}_q$, $g_m \ne 0$,
$g_i \ne 0$, $\deg_{X_j} \widetilde{G}_i < s_{i,j}$, $\deg_{X_j} H_i \le s_{i,j}$,
for 1 ≤ i < j ≤ m.
10.5.15 Remark The structure and the degree of iterates of ADSs in 10.5.14 is given by Theo-
rem 10.5.16 which is essentially [2332, Lemma 1].
10.5.16 Theorem [2332, Lemma 1] In the case of ei = 1, i = 1, . . . , m, for any ADS of the
form (10.5.2), we have
$$F_i^{(k)} = X_i G_{i,k}(X_{i+1}, \ldots, X_m) + H_{i,k}(X_{i+1}, \ldots, X_m), \quad i = 1, \ldots, m, \ k = 0, 1, \ldots,$$
where
$$G_{i,k}, H_{i,k} \in \mathbb{F}_q[X_{i+1}, \ldots, X_m], \quad i = 1, \ldots, m - 1$$
and
$$\deg G_{i,k} = \frac{1}{(m-i)!}\, k^{m-i} s_{i,i+1} \cdots s_{m-1,m} + \psi_i(k),$$
10.5.21 Remark Surprisingly enough, investigating the distribution and other number theoretic
properties of PRNGs ultimately requires studying additive character sums with their
elements. In turn, this leads to the investigation of their algebraic properties such as the degree
growth or absolute irreducibility of certain linear combinations of the iterates. Deeper
algebraic properties such as the dimension of the singularity locus of certain associated algebraic
varieties are also of interest. Below we explain how these properties become ultimately
related to the rather analytic question of the uniformity of distribution.
10.5.22 Remark An application of the celebrated Koksma–Szüsz inequality (see the original
works [1780, 2759] and also Section 6.3) reduces the question of studying the distribu-
tion of the vectors (10.5.1) to the question of estimating the following additive character
sums
$$S_a(N) = \sum_{n=0}^{N-1} \psi\left(\sum_{i=1}^{m} a_i u_{n,i}\right),$$
where X = (X1 , . . . , Xm ). So a natural next step is to use the Weil bound, see Section 6.3.
However, for this the rational function La,k,l (X) has to be “exponential sums friendly,” that
is,
1. nontrivial (that is, with a nontrivial trace);
2. of small degree (so that the Weil bound is nontrivial for this function);
3. if possible, linear in some of the variables (thus to avoid using the Weil bound at
all);
4. if possible, have a low dimensional locus of singularity (to apply the Deligne
bound, see Section 6.3 or Deligne-like bounds by Katz [1704]).
10.5.23 Problem Find general (necessary and sufficient) conditions under which the linear forms
$L_{a,k,l}(X)$ are not constant for any non-zero vector $a \in \mathbb{F}_q^m$ and $k \ne l$.
10.5.24 Remark Over Fq, for a special class of iterations, some sufficient conditions are given
in [1358, 1379]. In [2329], in some special case, the above condition on $L_{a,k,l}(X)$ has been
replaced by some combinatorial argument, which, however, does not seem to generalize any
further.
10.5.25 Remark For ADSs of the shape (10.5.2) the degree of the iterates grows polynomially.
Furthermore the iterates are linear in one of the variables. Both properties together have led
to rather strong results about the distribution of the vectors (10.5.1); see [2330, 2332]. This
idea and construction have been further developed in [2326, 2327, 2336, 2337]. Furthermore,
in [2332] a new construction of hash functions is suggested that is based on the above ADSs.
10.5.26 Remark It is clear that the systems of the shape (10.5.2) with ei = 1, i = 1, . . . , m (or for
arbitrary ei = ±1, if one defines $0^{-1} = 0$) define a permutation of $\mathbb{F}_q^m$ if and only if the
“coefficients” Gi, i = 1, . . . , m − 1, have no zeros over Fq. For such permutation systems,
Ostafe [2326] has established rather strong results about the distribution of elements in
trajectories on average over all initial values; see also [2334, 2653].
10.5.27 Remark At the same time it has become clear that in order to improve the results
of [2330, 2332] one needs to obtain more detailed information about the algebraic structure
of polynomial iterates, in particular about the number of absolutely irreducible components
of the polynomials Gi,k1 − Gi,k2 where 0 ≤ k2 < k1 and Gi,k is as in Theorem 10.5.16.
10.5.28 Remark Sums of multiplicative characters along trajectories of ADSs have also been con-
sidered in the literature [2272, 2277, 2337]. Such sums can be estimated within the same
lines that have been used for additive character sums, however instead of linear combi-
nations La,k,l (X) one has to study some other algebraic expressions including the iter-
ates. For instance, the argument of [2337] is based on studying the bilinear combinations
Gi,k Hi,l − Gi,l Hi,k (in the notation of Theorem 10.5.16) and showing that they are not
constant.
10.5.29 Theorem [1173, Theorem 1.4] Suppose that a polynomial f ∈ Fq[X] is not a monomial or
a binomial of the form $ax^{p^\ell} + b$ where p is the characteristic of Fq. Then for any integer
N ≥ 1, the polynomials $f, f^{(1)}, \ldots, f^{(N)}$ are multiplicatively independent.
10.5.30 Remark Theorem 10.5.29 is used in a construction of elements of large multiplicative order
over finite fields; see [1173, Theorem 1.1].
10.5.31 Remark There are several possible interpretations of what a multivariate analogue of The-
orem 10.5.29 may look like. All of them are interesting, however no results in this direction
have been obtained so far.
10.5.32 Remark Clearly the sequence of vectors {un }, given by (10.5.1) is eventually periodic with
some period τ . That is, for some integer s ≥ 0 we have un+τ = un for n ≥ s.
10.5.33 Definition If s is the smallest integer with the property of Remark 10.5.32, then
T = s + τ is the trajectory length.
10.5.34 Remark If we work in a finite field of q elements, then the trajectory length T satisfies
$T \le q^m$, where m is the number of variables.
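The quantities s, τ and T = s + τ can be computed for any concrete map on a finite set by recording the first time each state is visited. The Python sketch below does this for states given as tuples over F_p. The function name `trajectory_length` and the representation of the map as a Python callable are assumptions of this illustration; memory use is proportional to T, so it is meant for small examples only.

```python
def trajectory_length(step, u0):
    """Return (s, tau, T) for the sequence u_{n+1} = step(u_n) started at u0.

    s is the preperiod, tau the least period, and T = s + tau the trajectory length.
    States must be hashable (for example tuples of integers).
    """
    first_seen = {}
    u, n = u0, 0
    while u not in first_seen:
        first_seen[u] = n
        u = step(u)
        n += 1
    s = first_seen[u]        # index at which the repeated state first appeared
    tau = n - s
    return s, tau, s + tau


# The univariate map x -> x^2 + 1 over F_7, started at 0: 0, 1, 2, 5, 5, ...
print(trajectory_length(lambda u: ((u[0] * u[0] + 1) % 7,), (0,)))
```

In this example s = 3, τ = 1 and T = 4, well below the upper bound q^m = 7, while the map x -> x + 1 over F_7 attains T = 7.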
10.5.35 Remark No general lower bounds on the trajectory length of sequences generated by ADSs
are known. Furthermore, assuming that the map generated by F behaves as a random map
(and for a generic polynomial system this seems to be a natural and well tested assumption),
one should expect that in fact T is of order $q^{m/2}$. On the other hand, most of the results
about the distribution of the sequence (10.5.1) are nontrivial only if the trajectory length
T is close to its largest possible value [2326, 2327, 2330, 2332, 2336, 2337]. Hence we see
that “generic” ADSs are not likely to satisfy this property. Thus one needs constructions of
special ADSs tailored for these applications.
10.5.36 Remark Very little is known about constructing ADSs with guaranteed large trajectory
length. The only known rigorous results are those of Ostafe [2328] and their generalizations
in [2334], which give a complete characterization (which in turn leads to explicit
constructions) of ADSs of the type (10.5.2) which achieve the largest possible value of the trajectory
length $T = q^m$.
10.5.37 Remark A result of [2669] gives a nontrivial (albeit very weak) lower bound on the length
of a reduction of a trajectory of an ADS over the rationals modulo a prime p that holds for
almost all p (in fact the result applies to more general settings).
10.5.38 Remark As we have seen in Section 10.5.4, the algebraic structure of iterates becomes very
important for studying the distribution of trajectories. Questions of this type are also of
great intrinsic interest for the theory of ADSs. Unfortunately, they are notoriously hard, and
most of the few known results apply only to iterates of univariate quadratic polynomials;
see [76, 152, 153, 1310, 1313, 1618, 1619, 1620, 2331] and references therein.
10.5.39 Definition A polynomial f over a field K is stable if all its iterations $f^{(n)}$ are irreducible
over K.
where γ = −b/2a.
10.5.41 Remark Clearly γ in Definition 10.5.40 is the unique critical point of f (that is, the zero
of the derivative $f'$).
10.5.42 Remark It is shown in [1618, 1619, 1620] that critical orbits play a very important role in
the dynamics of polynomial iterations.
10.5.43 Remark Capelli’s Lemma, describing the conditions on the irreducibility of polynomial
compositions, plays a prominent role in this area.
10.5.44 Theorem [1620, Proposition 3] A quadratic polynomial f ∈ K[X] is stable if the set
{−f (γ)} ∪ Orb(f ) contains no squares. If K = Fq is a finite field of odd characteristic,
this property is also necessary.
10.5.45 Remark If K = Fq is a finite field, there is some integer t such that $f^{(t)}(\gamma) = f^{(s)}(\gamma)$ for
some positive integer s < t. Then $f^{(n+t)}(\gamma) = f^{(n+s)}(\gamma)$ for any n ≥ 0. Accordingly, for the
smallest value of t with the above condition denoted by $t_f$, we have
10.5.46 Remark Remark 10.5.45 immediately implies that a quadratic polynomial f ∈ Fq [X] can
be tested for stability in $q^{1+o(1)}$ arithmetic operations over Fq . Using bounds of character
sums, it has been shown in [2331] that this can be improved.
10.5.47 Theorem [2331] A quadratic polynomial over Fq can be tested for stability in $q^{3/4+o(1)}$
arithmetic operations over Fq .
10.5.48 Remark In [50], a generalization of Theorem 10.5.47 is given to polynomials g(f ) where f
is quadratic and g is an arbitrary polynomial over Fq .
10.5.49 Remark Since a random polynomial of degree d is irreducible over Fq with probability
about 1/d and the degree of the iterations f (n) grows exponentially with n, it is natural to
expect that there are only very few stable polynomials over Fq . For quadratic polynomials
this has been confirmed in a quantitative form by Gomez and Nicolás [1310]. With several
ingenious extensions of the method of [1310], Gomez, Nicolás, Ostafe, and Sadornil [1311]
have shown this for polynomials of arbitrary degree d ≥ 2.
10.5.50 Theorem [1310] For any odd prime power q, there are at most $q^{5/2+o(1)}$ stable quadratic
polynomials over Fq .
10.5.51 Remark The number of irreducible divisors and other arithmetic properties of iterations of
polynomials over large finite fields have been studied in [1313].
10.5.52 Problem Give explicit constructions of polynomial systems, over some “interesting” fields
such as Q and Fq , such that all polynomials $F_i^{(k)}$, i = 1, . . . , m, k = 1, 2, . . ., are
1. irreducible over Fq ;
2. absolutely irreducible over Fq .
10.5.53 Remark Over the field of rational numbers Q we define the class $E_{p,m}$ (where p is an
arbitrary prime) of m-variate analogues of the Eisenstein polynomials: We say that F ∈
Z[X1 , . . . , Xm ] belongs to $E_{p,m}$ if
$$F(X_1, \ldots, X_m) = A_0 X_1^{d_1} \cdots X_m^{d_m} + p f(X_1, \ldots, X_m)$$
for some f ∈ Z[X1 , . . . , Xm ], where $A_0 \not\equiv 0 \pmod{p}$, $d_1 + \cdots + d_m > \deg f$, and
$F(\mathbf{0}) \not\equiv 0 \pmod{p^2}$, where $\mathbf{0} = (0, \ldots, 0)$. Clearly, if F is reducible, say F = GH with
G, H ∈ Z[X1 , . . . , Xm ], then F ≡ GH (mod p). Since F is a monomial modulo p, so are G and H.
As $\deg F = d_1 + \cdots + d_m$, we conclude that G and H are nonconstant monomials modulo p.
Therefore $G(\mathbf{0}) \equiv H(\mathbf{0}) \equiv 0 \pmod{p}$, which implies that $F(\mathbf{0}) = G(\mathbf{0})H(\mathbf{0}) \equiv 0 \pmod{p^2}$,
contradicting $F \in E_{p,m}$. If $F, G_1, \ldots, G_m \in E_{p,m}$ then because
$$F(G_1(\mathbf{0}), \ldots, G_m(\mathbf{0})) \equiv F(\mathbf{0}) \not\equiv 0 \pmod{p^2},$$
we also have $F(G_1, \ldots, G_m) \in E_{p,m}$. Therefore if $F_1, \ldots, F_m \in E_{p,m}$ for some prime p then
their iterations $F_1^{(k)}, \ldots, F_m^{(k)}$, k = 1, 2, . . ., are all irreducible over Q. This however does
not lead to a construction of absolutely irreducible iterates. There is no obvious finite field
analogue of this construction.
where ‖u‖ is the Euclidean norm of a vector u and we assume that the elements of Fp
are represented by the set {0, . . . , p − 1}.
10.5.55 Remark Certainly results about the asymptotically uniform distribution of the first N
vectors (10.5.1) (mentioned in Section 10.5.4) immediately imply that $D_{F,u_0}(N)$ is close to
the largest possible value $n^{1/2} p$ in this case. However such results are known only for very
long segments of the trajectories.
10.5.56 Remark The notion of the diameter (under a slightly different name) is introduced in [1381]
and then has also been studied in [585, 586, 644] where a wide variety of methods has been
used. However, the only known results about the diameter are in the univariate case, that
is, when F = {f } ⊆ Fp [X] and u0 = u0 ∈ Fp .
10.5.57 Theorem [1381, Theorem 6] For any fixed ε > 0 and $T_{f,u_0} \ge N \ge p^{1/2+\varepsilon}$, where $T_{f,u_0}$ is the
trajectory length corresponding to iterations of f ∈ Fp [X] originating at $u_0$, we have
$$D_{f,u_0}(N) = p^{1+o(1)}$$
as p → ∞.
10.5.58 Remark Theorem 10.5.57, when it applies, provides the asymptotically best possible bound.
For smaller values of N , one can use a variety of the results from [585, 586, 644] that are
nontrivial in essentially the best possible range $T_{f,u_0} \ge N \ge p^{\varepsilon}$ for any fixed ε > 0.
10.5.59 Problem Let Fqs be an extension of degree s ≥ 2 of Fq . Obtain a lower bound on the smallest
dimension of an affine space over Fq containing the first N elements of the trajectory of
iterations of f ∈ Fqs [X] originating at u0 ∈ Fqs .
10.5.60 Problem In the multidimensional case, besides estimating $D_{F,u_0}(N)$, one can also study
other geometric characteristics, such as
1. the volume and the number of vertices of the convex hull of the set
{u0 , . . . , uN −1 };
2. the number of directions and the number of distances defined by pairs of vectors
(uk , un ), 0 ≤ k, n ≤ N − 1;
3. the number of directions and the number of distances defined by pairs of consec-
utive vectors (un , un+1 ), 0 ≤ n ≤ N − 1.
References Cited: [50, 76, 91, 106, 152, 153, 154, 210, 215, 216, 222, 299, 300, 301, 585,
586, 644, 657, 716, 964, 1020, 1087, 1309, 1310, 1311, 1312, 1313, 1348, 1358, 1379, 1380,
1381, 1498, 1596, 1618, 1619, 1620, 1704, 1780, 1831, 1860, 2269, 2270, 2271, 2272, 2277,
2279, 2326, 2327, 2328, 2329, 2330, 2331, 2332, 2334, 2335, 2336, 2337, 2421, 2464, 2547,
2648, 2653, 2668, 2669, 2717, 2759, 2814, 2817, 2870, 2871]
11
Algorithms
This section presents algorithmic methods for finite fields. It deals with concrete imple-
mentations and describes how some fundamental operations on the elements are performed.
Finite fields play a central role in many areas, such as cryptography, coding theory, and random
number generation, where the speed of the computations is paramount. Therefore we
discuss algorithms with particular attention to efficiency and we address implementation
techniques and complexity aspects as often as possible. In many instances, the complexity
of the algorithms that we describe depends on the field multiplication method that is implemented.
In the following, M(log p) denotes the complexity to multiply two positive integers
less than p. Similarly, Mq (n) represents the complexity to multiply two polynomials in Fq [x]
of degree less than n. We note that many software tools or libraries implement some of the al-
gorithms that are presented next. A non-exhaustive list includes Magma [2798], Pari [209],
Sage [2709], Mathematica [3004], Maple [2002], NTL [2633], GMP [1102], MPFQ [1255],
FLINT [1426], and ZEN [576]. Detailed algorithms and complexity analyses can be found
in [660, 661, 751, 1227, 1413, 1768, 2080, 2632].
11.1.1 Preliminaries
11.1.1.1 Prime field generation
11.1.1 Remark For many applications, a random prime field of a given size is needed. This implies
finding a prime number of a given bit length that is random in the following sense: the
probability for any particular prime of that size to be selected is sufficiently small so that
it is impossible for anyone to take advantage of this event. There are two different ways
to find such a random prime number. The first technique is to identify a prime number
among integers of a prescribed size successively picked at random. The second technique is
to construct integers with special properties so that it is easy to prove that they are prime
and whose distribution over the set of all the primes of the desired size is close to uniform.
11.1.2 Algorithm (Prime number random search)
Input: An integer ℓ ≥ 2.
Output: An ℓ-bit prime number.
1. repeat
2.   Pick an ℓ-bit random integer n
3. until n is not composite
4. return n
11.1.3 Remark The prime number theorem [2080, Section 4.1] ensures that it takes O(ℓ) random
draws among ℓ-bit integers before picking an integer that is actually prime.
11.1.4 Remark The approach used at Line 3 of Algorithm 11.1.2 to test if n is composite or not
determines the quality of the prime number that is returned. In practice, Algorithm 11.1.2
relies on trial division, to quickly eliminate most composite numbers, and on the Miller–
Rabin compositeness test, properly set up depending on the size of the desired prime. See Re-
mark 11.1.5 and [2080, Section 4.4] for implementation details. In the end, Algorithm 11.1.2
returns a probable prime, i.e., not certified but prime with very high probability.
11.1.5 Remark The first compositeness test of real practical significance is due to Solovay and
Strassen [2694]. It was superseded by the Miller–Rabin test [2099, 2435], which is faster and
easier to implement. The error probability in the Miller–Rabin test can be adjusted using
a security parameter t that also drives the complexity of the algorithm. The Miller–Rabin
test is always correct when it declares that an integer n is composite, but it may be wrong
with probability at most $4^{-t}$ when it declares that n is prime. Taking into account the
distribution of prime numbers and using advanced probabilistic analysis techniques, it can
be shown that Algorithm 11.1.2 coupled with the Miller–Rabin test requires a very small t
in order to return a high quality probable prime. For instance, t = 5 is enough to produce
a 500-bit integer that is prime with probability greater than $1 - 2^{-85}$; see [2080, Note 4.47
and Fact 4.48] and [2632, Section 10.3] for more examples.
11.1.6 Remark Given the parameter t and the factorization of n − 1 as $2^s m$ with m odd, the Miller–
Rabin test computes at most t modular exponentiations to determine if n is composite.
Each exponentiation is of the form $a^k$, where a is a random value of size O(n) and $k = 2^r m$
for some r < s. Its complexity is therefore $O(t \log^3 n)$. As noted in [660], the algorithm is
significantly faster when single precision integers a are used instead of random values in
[0, n − 1], although the error bound may no longer hold in that case.
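The search of Algorithm 11.1.2, combined with trial division and the Miller–Rabin test as described in Remarks 11.1.4 to 11.1.6, can be sketched as follows. This is only an illustration under stated assumptions: the function names, the list of small primes, and the default t = 5 are choices made here, and Python's `random` module is not a cryptographically secure source of randomness.

```python
import random

# Assumption: a short list of small primes used for trial division.
SMALL_PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]

def is_probable_prime(n, t=5, rng=random):
    """Miller-Rabin with t random bases; False means n is certainly composite."""
    if n < 2:
        return False
    # Trial division quickly eliminates most composite numbers.
    for p in SMALL_PRIMES:
        if n % p == 0:
            return n == p
    # Write n - 1 = 2^s * m with m odd.
    s, m = 0, n - 1
    while m % 2 == 0:
        s += 1
        m //= 2
    for _ in range(t):
        a = rng.randrange(2, n - 1)
        x = pow(a, m, n)
        if x == 1 or x == n - 1:
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True

def random_probable_prime(bits, t=5, rng=random):
    """Algorithm 11.1.2: draw bits-bit integers until one is not declared composite."""
    if bits < 2:
        raise ValueError("bits must be at least 2")
    while True:
        # Setting the top bit guarantees that n has exactly `bits` bits.
        n = rng.getrandbits(bits) | (1 << (bits - 1))
        if is_probable_prime(n, t, rng):
            return n
```

For example, `random_probable_prime(512)` returns a 512-bit probable prime; for cryptographic use one would presumably pass an instance of `random.SystemRandom` as `rng`.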
11.1.7 Remark An interesting variant of Algorithm 11.1.2 consists in testing integers in an arith-
metic progression. For instance, considering the odd numbers in some interval may reduce
the complexity of the process, see [2080, Section 4.4].
11.1.8 Remark When a provable prime is required, we can run a more expensive primality test
on the integer n returned by Algorithm 11.1.2. There are mainly two primality tests used
in practice. APRCL [660, 662], named after its inventors, relies on Jacobi sums and is a
simplified version of the test presented in [19]. Its complexity is $O((\log n)^{c \log\log\log n})$, for some
effective constant c. The Elliptic Curve Primality Proving test (ECPP) [143] uses elliptic
curves and its complexity is $\tilde{O}(\log^5 n)$. The fast version of ECPP, called fastECPP, runs
in $\tilde{O}(\log^4 n)$; see [2141] for implementation details. Both ECPP and fastECPP offer the
advantage of producing a certificate that can be used to prove that n is prime much quicker
than running the test again. In 2002, Agrawal, Kayal, and Saxena [43] announced the first
deterministic polynomial time primality testing algorithm. For an input n, the complexity
of the so-called AKS algorithm is $O(\log^{10.5} n)$. One variant of AKS [610] runs in time
$\tilde{O}(\log^4 n)$. Although this algorithm has the same complexity as fastECPP, the constants are
much larger and as a result fastECPP is still the method of choice to prove the primality of
general integers.
11.1.9 Remark As hinted in Remark 11.1.1, a radically different approach to find a prime number
is to construct it from scratch. There are two popular provable prime generation methods in
the literature, respectively due to Mihăilescu [2095] and Maurer [2038, 2080]. They both use
Pocklington’s lemma recursively, but Maurer’s method generates random provable primes
whose distribution is close to uniform over the set of all primes of a given size, whereas
Mihăilescu’s approach is more efficient but slightly reduces the set of primes that may be
produced.
11.1.10 Remark To define Fqn , it is enough to construct a basis of the vector space Fqn over
Fq . A normal basis offers several advantages, such as a cheap evaluation of the Frobenius
automorphism, see Section 5.2. Alternatively, any irreducible polynomial of degree n with
coefficients in Fq can be used to represent Fqn . This gives rise to a polynomial basis; see
Definition 2.1.96.
11.1.11 Remark Quite similarly to the prime field case, there are two different ways to find an
irreducible polynomial of a given degree. We can identify an irreducible polynomial among
many polynomials generated at random. We may also construct an irreducible polynomial
directly. This is well illustrated in [1187].
11.1.12 Remark It follows from Theorem 2.1.24 that a random monic polynomial of degree n with
coefficients in Fq is irreducible with probability close to 1/n. So it takes an expected O(n)
attempts to find an irreducible polynomial of degree n at random; see Subsection 11.3.2,
for a description of some irreducibility testing algorithms.
11.1.13 Remark It is well known that different irreducible polynomials of degree n generate iso-
morphic finite fields. The isomorphism can even be computed explicitly [77, 1895]. So it
may seem that the choice of the irreducible polynomial used to generate Fqn is irrelevant.
However, certain polynomials are more suitable than others when it comes to the efficiency
of the operations in Fqn . In particular, polynomials with a low number of nonzero terms
have a clear advantage, as they provide a faster modular reduction, see Subsection 11.1.3.2.
11.1.14 Definition Let f ∈ Fq [x]. The polynomial f is s-sparse, if f has s nonzero terms. The
terms binomial , trinomial , quadrinomial , and pentanomial are frequently used to refer
to 2-sparse, 3-sparse, 4-sparse, and 5-sparse polynomials, respectively. Furthermore, f
is t-sedimentary if $f = x^n + h$ where deg h = t.
11.1.15 Remark According to Definition 11.1.14, any polynomial is s-sparse for some s. Similarly, any
polynomial is t-sedimentary for some t. However, only those with sufficiently small parameters s or t are
considered in practice. The relevance of s-sparse polynomials has long been known,
whereas the interest in t-sedimentary polynomials is more recent [717, 2306].
11.1.16 Definition We denote by σq (n) the minimal s such that there exists an irreducible s-sparse
polynomial of degree n in Fq [x]. Similarly, τq (n) denotes the minimal t such that there
exists an irreducible polynomial of degree n in Fq [x] of the form $x^n + h$ with deg h = t.
11.1.17 Conjecture Let n be a positive integer and let q be a power of a prime greater than 2, then
we have σq (n) ≤ 4. If q = 2, then σq (n) ≤ 5 [1233].
11.1.18 Remark Conjecture 11.1.17 states that except for certain extensions of degree n of F2 where
a pentanomial is required because there is no irreducible trinomial of degree n, it is always
possible to find an irreducible binomial, trinomial, or quadrinomial to define Fqn over Fq .
A similar conjecture exists for primitive polynomials, see Section 4.1.
11.1.19 Example We have $\sigma_2(8) = 5$. An exhaustive search shows that there is no irreducible
trinomial of degree 8 over F2 but it is easy to see that $x^8 + x^4 + x^3 + x + 1$ is irreducible
over F2 . The field $\mathbb{F}_{2^8}$ is the smallest extension of F2 that requires a pentanomial. Similarly,
$\sigma_3(49) = 4$. Again, an exhaustive search shows that there is no irreducible binomial or
trinomial of degree 49 over F3 but $x^{49} + 2x^3 + x^2 + 1$ is irreducible over F3 . The field $\mathbb{F}_{3^{49}}$
is the smallest extension of F3 that requires a quadrinomial.
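The first claim of Example 11.1.19 is small enough to check by brute force. The sketch below encodes a polynomial over F2 as a Python integer (bit i is the coefficient of x^i) and tests irreducibility by trial division by every polynomial of degree at most n/2. The encoding and the function names are assumptions made for this illustration; it is not one of the irreducibility tests of Subsection 11.3.2. Only trinomials with constant term 1 are listed, since the others are divisible by x.

```python
def gf2_mod(a, f):
    """Remainder of a modulo f over F_2 (bit i of an integer = coefficient of x^i)."""
    deg_f = f.bit_length() - 1
    while a.bit_length() - 1 >= deg_f:
        a ^= f << (a.bit_length() - 1 - deg_f)
    return a

def is_irreducible_gf2(f):
    """Brute force: f is irreducible iff no polynomial of degree 1..deg(f)//2 divides it."""
    n = f.bit_length() - 1
    if n < 1:
        return False
    # The integers 2 .. 2^(n//2 + 1) - 1 encode all polynomials of degree 1 .. n//2.
    for g in range(2, 1 << (n // 2 + 1)):
        if gf2_mod(f, g) == 0:
            return False
    return True

# All trinomials x^8 + x^k + 1 with 1 <= k <= 7.
trinomials = [(1 << 8) | (1 << k) | 1 for k in range(1, 8)]
irreducible_trinomials = [f for f in trinomials if is_irreducible_gf2(f)]

pentanomial = 0b100011011  # x^8 + x^4 + x^3 + x + 1

print(irreducible_trinomials)            # empty list: no irreducible trinomial of degree 8
print(is_irreducible_gf2(pentanomial))   # True
```

The exhaustive trial division costs about 2^(n/2) reductions, so it is only meant for very small degrees such as this one.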
11.1.20 Conjecture Let n be a positive integer and let q ≥ 2 be a prime power, then we have
τq (n) ≤ 3 + logq n [1233].
11.1.21 Remark Conjectures 11.1.17 and 11.1.20 are supported by extensive computations [1181,
1187, 1233, 1327].
11.1.22 Remark We refer to [2582] for a table of irreducible trinomials and pentanomials in F2 [x]
of degree ranging from 2 to 10,000; see also Section 2.2.
11.1.23 Remark See [403] for a dedicated algorithm with reduced space complexity designed to test
the irreducibility of trinomials of large degree in F2 [x]. See [1223] for additional algorithms
specifically designed for trinomials.
11.1.24 Remark Finally, we refer to Subsection 11.3.2 for methods to construct an irreducible
polynomial of degree n over Fq . We note that there exist infinite families of irreducible
polynomials. For instance, the polynomials $x^{2 \cdot 3^k} + x^{3^k} + 1$ with k ≥ 0 are all irreducible in
F2 [x] [1187].
11.1.25 Remark Finding a primitive element of a finite field is an interesting problem both from a
theoretical and a practical point of view [439, 927, 2627, 2628, 2640]. Given the complete
factorization of q − 1, Algorithm 11.1.26 returns a generator of F∗q in polynomial time in
log q. No efficient method is available when the factorization of q − 1 is not known.
11.1.26 Algorithm (Primitive element random search)
Input: A prime power q and the complete factorization of q − 1 as $p_1^{d_1} \cdots p_k^{d_k}$.
Output: A generator γ of $\mathbb{F}_q^*$.
1. for i = 1 to k do
2.   repeat
3.     Choose $\alpha \in \mathbb{F}_q^*$ at random and compute $\beta = \alpha^{(q-1)/p_i}$
4.   until $\beta \neq 1$
5.   $\gamma_i = \alpha^{(q-1)/p_i^{d_i}}$
6. end for
7. return $\gamma = \prod_{i=1}^{k} \gamma_i$
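For a prime field, the steps of Algorithm 11.1.26 translate almost line by line into code. The sketch below is restricted to q = p prime, so that field elements are integers modulo p. The function name and the dictionary used to pass the factorization of p − 1 are assumptions made for this illustration.

```python
import random

def primitive_root(p, factorization, rng=random):
    """Generator of (Z/pZ)^*, given the factorization of p - 1.

    `factorization` maps each prime p_i dividing p - 1 to its exponent d_i.
    """
    gamma = 1
    for p_i, d_i in factorization.items():
        # Repeat until alpha^((p-1)/p_i) != 1.
        while True:
            alpha = rng.randrange(1, p)
            if pow(alpha, (p - 1) // p_i, p) != 1:
                break
        # gamma_i has order exactly p_i^d_i.
        gamma_i = pow(alpha, (p - 1) // p_i**d_i, p)
        gamma = gamma * gamma_i % p
    return gamma

# Example: p = 101, p - 1 = 100 = 2^2 * 5^2.
g = primitive_root(101, {2: 2, 5: 2})
```

The product is a generator because the orders of the factors are pairwise coprime and multiply to p − 1.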
11.1.27 Remark Algorithm 11.1.26 is more efficient than the naïve approach, which consists in
searching for an element γ such that $\gamma^{(q-1)/p_i} \neq 1$, for all i. This is especially true as
the number of prime factors of q − 1 grows. The complexity of finding a generator of the
multiplicative group $\mathbb{F}_p^*$ of a prime field with Algorithm 11.1.26 is $O(\log^4 p)$ [2632].
11.1.28 Remark To generate a random prime field Fp together with a primitive element, simply
modify Algorithm 11.1.2 so that the random integer n produced at Line 2 is of the form
n = m + 1, where all the prime factors of m are known. An algorithm to generate a random
factored number is given in [2632, Section 9.6]. Then use Algorithm 11.1.26 to return a
generator of F∗p .
11.1.29 Remark The order of a polynomial is introduced in Definition 2.1.51. The order of an
irreducible polynomial f is equal to the order of any of its roots. It can be found with
Algorithm 11.1.30 [2473].
11.1.30 Algorithm (Order of an irreducible polynomial)
Input: An irreducible polynomial f of degree n with coefficients in Fq and the complete
factorization of $q^n - 1$ as $q_1^{e_1} \cdots q_\ell^{e_\ell}$.
Output: The order of f .
1. for i = 1 to ℓ do
2.   Find the smallest nonnegative integer $f_i$ such that $f \mid x^{q_1^{e_1} \cdots q_i^{f_i} \cdots q_\ell^{e_\ell}} - 1$
3. end for
4. return $q_1^{f_1} \cdots q_\ell^{f_\ell}$
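Algorithm 11.1.30 can be illustrated over F2, where polynomials are conveniently stored as integers (bit i is the coefficient of x^i). In this sketch, which assumes deg f ≥ 2 and uses helper names chosen here, the exponent f_i is found by raising x to the cofactor (2^n − 1)/q_i^{e_i} and then repeatedly raising the result to the power q_i until 1 is reached.

```python
def gf2_mulmod(a, b, f):
    """Product a*b modulo f over F_2; a and b must have degree < deg f."""
    n = f.bit_length() - 1
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> n) & 1:
            a ^= f
    return r

def gf2_powmod(a, e, f):
    """a^e modulo f by square-and-multiply."""
    result = 1
    while e:
        if e & 1:
            result = gf2_mulmod(result, a, f)
        a = gf2_mulmod(a, a, f)
        e >>= 1
    return result

def polynomial_order(f, factorization):
    """Order of an irreducible f over F_2 (deg f >= 2 assumed).

    `factorization` maps each prime q_i dividing 2^n - 1 to its exponent e_i.
    """
    n = f.bit_length() - 1
    group_order = (1 << n) - 1
    order = 1
    for q_i, e_i in factorization.items():
        # y = x^(group_order / q_i^e_i) has order q_i^f_i.
        y = gf2_powmod(0b10, group_order // q_i**e_i, f)
        f_i = 0
        while y != 1:
            y = gf2_powmod(y, q_i, f)
            f_i += 1
        order *= q_i**f_i
    return order

print(polynomial_order(0b10011, {3: 1, 5: 1}))  # x^4 + x + 1 has order 15
print(polynomial_order(0b11111, {3: 1, 5: 1}))  # x^4 + x^3 + x^2 + x + 1 has order 5
```

The first polynomial is therefore primitive, while the second is irreducible but not primitive.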
11.1.31 Remark If f is monic of order $q^n - 1$ such that f (0) ≠ 0 then f is a primitive polynomial.
In fact, the conditions are equivalent [1939, Theorem 3.16]; see also Section 4.1 for more
details on primitive polynomials. Algorithm 11.1.32 [1416] is specifically designed to quickly
test if a random polynomial is primitive or not.
11.1.32 Algorithm (Primitive polynomial testing)
Input: An irreducible polynomial f of degree n with coefficients in Fq and the list of
all the prime factors of $q^n - 1$.
Output: true if f is primitive and false otherwise.
11.1.36 Remark The simplest approach to find the minimal polynomial of $\alpha \in \mathbb{F}_{q^n}$ is to compute
the different conjugates $\alpha^{q^i}$ of α until we have $\alpha^{q^k} = \alpha$. The minimal polynomial of α is
then equal to the product $(x - \alpha)(x - \alpha^q) \cdots (x - \alpha^{q^{k-1}})$. The worst case complexity of this
algorithm is $O(M_q(n)\, n \log q)$ in time and O(n) elements in space. This technique is refined
in [1328] for the special case Fq = F2 .
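The conjugate-product approach of Remark 11.1.36 can be sketched for q = 2 as follows. Field elements of F2[x]/(f) are stored as integers (bit i is the coefficient of x^i), and the minimal polynomial is returned as a list of coefficients, lowest degree first. The function names and the example field, defined by x^4 + x + 1, are assumptions made for this illustration.

```python
def gf2_mulmod(a, b, f):
    """Product a*b modulo f over F_2; a and b must have degree < deg f."""
    n = f.bit_length() - 1
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> n) & 1:
            a ^= f
    return r

def minimal_polynomial(alpha, f):
    """Minimal polynomial over F_2 of alpha in F_2[x]/(f), as coefficients (lowest degree first)."""
    # Collect the conjugates alpha, alpha^2, alpha^4, ... until the cycle closes.
    conjugates = [alpha]
    beta = gf2_mulmod(alpha, alpha, f)
    while beta != alpha:
        conjugates.append(beta)
        beta = gf2_mulmod(beta, beta, f)
    # Multiply out the factors (X - c); in characteristic 2, -c = c.
    poly = [1]
    for c in conjugates:
        new = [0] * (len(poly) + 1)
        for i, coeff in enumerate(poly):
            new[i + 1] ^= coeff                  # coeff * X
            new[i] ^= gf2_mulmod(coeff, c, f)    # coeff * c
        poly = new
    return poly

F16 = 0b10011  # modulus x^4 + x + 1
print(minimal_polynomial(0b10, F16))   # [1, 1, 0, 0, 1], i.e. X^4 + X + 1
print(minimal_polynomial(0b110, F16))  # [1, 1, 1], i.e. X^2 + X + 1
```

The intermediate coefficients live in the extension field, but the final coefficients all lie in F2, as expected.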
11.1.37 Remark In [2631], Shoup proposes a more general algorithm with a better time complexity
but also with an increased space complexity. Indeed, given the ring Fq [α][β] of dimension
n over Fq , Shoup’s algorithm finds the minimal polynomial of an element in that ring
with complexity $O(M_q(n)\, n^{1/2} + n^2)$ in time and $O(n^{3/2})$ elements in space. It relies on
several subroutines, including Wiedemann’s projection method [2976], a polynomial evaluation
technique by Brent–Kung [401], and finally, the Berlekamp/Massey algorithm [2011]
applied to recover the minimal polynomial from a sequence of projected points. Kedlaya and
Umans [1721] give an algorithm for constructing a minimal polynomial in $n^{1+o(1)} \log^{1+o(1)} q$
bit operations. It relies on a fast modular composition method via multivariate multipoint
evaluation, see Remark 11.1.86.
11.1.38 Remark There are essentially three different ways to represent the elements of a finite
field. Small finite fields can be represented with Jacobi logarithms, also known as Zech’s
logarithms, see Subsection 2.1.7.5. Extension fields Fqn can always be represented with a
normal basis over Fq , see Section 5.2. Finally, prime fields and extension fields can be seen
as a quotient set, respectively modulo a prime number and an irreducible polynomial.
11.1.39 Remark Elements in Fp are usually represented as integers in [0, p−1] or in [−⌊p/2⌋, ⌊p/2⌋].
We can also use alternative systems, such as the Montgomery representation, see Re-
mark 11.1.46, redundant systems [3030], or even floating point numbers [933].
11.1.40 Remark When Fqn is defined as Fq [x]/(f ), where f is an irreducible polynomial of degree
n in Fq [x], elements are usually represented as polynomials in Fq [x] of degree strictly less
than n. However, a generalization of Montgomery representation, see Remark 11.1.57, and
various redundant systems [405, 909, 3009] exist for extension fields as well.
11.1.41 Remark The Kronecker substitution [2111] makes it possible to represent a polynomial in Fq [x] as an
integer by formally replacing x by a suitable integer, usually a sufficiently large power of 2.
Polynomial multiplication can be done faster with this system [930, 1430]. This technique
is implemented in the FLINT library [1426].
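The idea of Remark 11.1.41 can be sketched for a prime field Fp: the coefficients are packed into one large integer by evaluating at a power of 2 large enough that the coefficients of the product cannot overlap, a single integer multiplication is performed, and the result is unpacked and reduced modulo p. The function name and the choice of slot width are assumptions made for this illustration, not the FLINT implementation.

```python
def kronecker_multiply(a, b, p):
    """Product of two polynomials over F_p via one integer multiplication.

    a and b are coefficient lists (lowest degree first) with entries in [0, p-1].
    """
    if not a or not b:
        return []
    # Every coefficient of the product over Z is at most min(len(a), len(b)) * (p-1)^2,
    # so `width` bits per slot are enough to avoid carries between slots.
    bound = min(len(a), len(b)) * (p - 1) ** 2
    width = max(bound.bit_length(), 1)

    def pack(coeffs):
        # Evaluate the polynomial at x = 2^width.
        return sum(c << (i * width) for i, c in enumerate(coeffs))

    product = pack(a) * pack(b)
    mask = (1 << width) - 1
    result = []
    for _ in range(len(a) + len(b) - 1):
        result.append((product & mask) % p)
        product >>= width
    return result

# (1 + 2x)(3 + 4x) = 3 + 10x + 8x^2, which is 3 + 3x^2 over F_5.
print(kronecker_multiply([1, 2], [3, 4], 5))  # [3, 0, 3]
```

The benefit in practice comes from reusing a highly optimized big-integer multiplication in place of a polynomial multiplication routine.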
11.1.42 Remark The reduction of an integer u modulo a prime number p and the reduction of a
polynomial g modulo an irreducible polynomial f are denoted by u (mod p) and g (mod f ),
respectively. In any case, an efficient reduction method is crucial for fast finite field arith-
metic. Indeed, a reduction is usually performed after each operation to ensure that the size
of the operands, i.e., the bit length or the degree, remains under control. Assuming that the
operands are always reduced, an addition requires at worst one straightforward reduction.
On the contrary, reducing the result of a multiplication is more involved even if we know
that the size of the product is at most twice the size of the modulus.
11.1.43 Remark We assume that integers are represented using radix b. For most architectures, we
have $b = 2^w$, for a fixed w ≥ 2. A word then corresponds to w bits and we assume that
multiplications and divisions by b, which correspond respectively to word left shift and word
right shift instructions, are free. The following is presented in [2080, Algorithm 14.42].
11.1.48 Remark The computation of Redc(u) does not require any division, only n word right shifts
at Line 6. A precise analysis shows that Algorithm 11.1.47 needs n(n + 1) single-precision
multiplications [2080, Note 14.34]. A classical reduction modulo p of a 2n-word integer relying
on a Euclidean division also requires approximately $n^2$ single-precision multiplications
but it also needs n single-precision divisions on top.
11.1.49 Remark The conversion to the Montgomery representation is not particularly cheap. How-
ever, once the conversion is done, it is possible to efficiently add, multiply, and even invert
values within this system, see Remarks 11.1.67 and 11.1.101. At the end of the whole com-
putation, for instance a modular exponentiation, the result is then converted back to its
normal representation using the relation Redc([z]) = z.
11.1.50 Remark Other representation systems, such as the residue number system [163, 2754],
the modular number system [166], and the polynomial modular number system [165] offer
interesting features for prime field arithmetic. All the computations can easily be parallelized
for increased performance.
11.1.51 Remark If working with a random prime is not crucial, choosing a modulus of a special
form can lead to substantial savings. An extreme example is illustrated by a Mersenne prime
$p = 2^k - 1$, for which a reduction modulo p is extremely cheap. Indeed, since a shift by
a full word is free, reducing $x < p^2$ modulo p only requires shifting x by exactly r = k
(mod w) bits, i.e., less than w bits. Unfortunately, Mersenne primes are too scarce to be of
any use for practical applications. This explains the introduction of generalizations of the
form $p = 2^k + c$ with c small [750] and later $p = 2^{n_k w} \pm 2^{n_{k-1} w} \pm \cdots \pm 2^{n_1 w} \pm 1$ where w = 16,
32, or 64 [2693]. For instance, $p = 2^{192} - 2^{64} - 1$ and $p = 2^{256} - 2^{224} + 2^{192} + 2^{96} - 1$ are prime
and their low Hamming weight allows a fast reduction. Those integers are part of a list of
recommended primes of various sizes known as NIST primes [1066]. A further generalization
is to consider prime numbers of the form p = g(t), where g is a sparse polynomial and t is
an integer not necessarily equal to 2 [640].
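The Mersenne case of Remark 11.1.51 is easy to illustrate. Since 2^k ≡ 1 (mod p), the high part of x can simply be shifted down and added to the low part, after which at most one subtraction remains. The function name is an assumption made for this sketch, and Python's arbitrary-precision integers hide the word-level details discussed in the remark.

```python
def reduce_mersenne(x, k):
    """Reduce 0 <= x < p^2 modulo p = 2^k - 1 with one shift, one mask, and one addition."""
    p = (1 << k) - 1
    # x = hi * 2^k + lo and 2^k = 1 (mod p), hence x = hi + lo (mod p).
    x = (x >> k) + (x & p)
    # For x < p^2 the folded value is smaller than 2p: one conditional subtraction suffices.
    if x >= p:
        x -= p
    return x

p = (1 << 127) - 1
x = (p - 1) ** 2
print(reduce_mersenne(x, 127) == x % p)  # True
```

No division and no multiplication are needed, which is what makes such moduli attractive.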
11.1.52 Remark The remainder g (mod f ) can always be computed with the Euclidean division
algorithm, but if f is a sparse polynomial, the following algorithm [1233] is particularly well
suited and more efficient.
11.1.53 Algorithm (Polynomial modular reduction)
Input: Two polynomials f and g with coefficients in Fq , where $f = x^n + \sum_{i=1}^{s-1} a_i x^{b_i}$
with $0 = b_1 < b_2 < \cdots < b_{s-1} < n$.
Output: The polynomial r such that g ≡ r (mod f ) with deg r < n.
1. r ← g
2. while deg(r) ≥ n do
3.   $k \leftarrow \max\{n, \deg r - n + b_{s-1} + 1\}$
4.   Write r as $r_1 x^k + r_2$ with $\deg r_2 < k$
5.   $r \leftarrow r_2 - r_1 (f - x^n) x^{k-n}$
6. end while
7. return r
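Algorithm 11.1.53 can be sketched over F2, where subtraction is an exclusive or and all nonzero coefficients a_i equal 1. Polynomials are stored as integers (bit i is the coefficient of x^i), and f is described by its degree n and the list of exponents b_i of its lower terms. This encoding and the function name are assumptions made for the illustration.

```python
def sparse_reduce_gf2(g, n, tail_exponents):
    """g mod f over F_2, where f = x^n + sum of x^b for b in tail_exponents (all b < n)."""
    b_max = max(tail_exponents)
    r = g
    while r.bit_length() - 1 >= n:
        deg_r = r.bit_length() - 1
        k = max(n, deg_r - n + b_max + 1)
        # Split r = r1 * x^k + r2 with deg r2 < k.
        r1, r2 = r >> k, r & ((1 << k) - 1)
        # r1 * x^k = r1 * (f - x^n) * x^(k-n) (mod f); one shifted copy of r1 per term of f - x^n.
        for b in tail_exponents:
            r2 ^= r1 << (b + k - n)
        r = r2
    return r

# f = x^8 + x^4 + x^3 + x + 1
print(bin(sparse_reduce_gf2(1 << 8, 8, [0, 1, 3, 4])))   # 0b11011
print(hex(sparse_reduce_gf2(1 << 15, 8, [0, 1, 3, 4])))  # 0x2f
```

Each pass through the loop costs one shifted exclusive or per lower term of f, which makes the dependence on the sparsity s visible.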
11.1.54 Remark For an s-sparse polynomial f of degree n, the reduction of g modulo f requires at
most 2(s − 1)(deg g − n + 1) operations in Fq . The cost thus grows linearly with s, which
justifies the interest in low weight irreducible
polynomials; see Subsection 11.1.1.2.
11.1.55 Remark The concept of an almost irreducible trinomial, i.e., a trinomial f defined over F2
and having an irreducible factor of degree n, is introduced in [405] and [909]. The arithmetic
is then performed in the ring F2 [x]/(f ) containing the field F2n , using a redundant set of
representatives. In [2573], Scott observes that for a given architecture, some well chosen
irreducible pentanomials over F2 may provide a faster arithmetic than trinomials, including
irreducible trinomials.
11.1.56 Remark Dedicated reduction methods have been developed for specific polynomials. For
instance, [1413, Algorithms 2.41 and 2.42] gives highly optimized reduction methods modulo
$x^{163} + x^7 + x^6 + x^3 + 1$ and $x^{233} + x^{74} + 1$.
11.1.57 Remark A generalization of the Montgomery representation for polynomials over finite
fields of characteristic 2 is described in [1775].
11.1.58 Remark For cryptographic applications, especially for applications running on embedded
devices, a new type of finite field, called an optimal extension field, has been recently intro-
duced [161, 2096].
11.1.59 Definition An optimal extension field is one of the form Fpn where p is a generalized
Mersenne prime of the form $2^k + c$ that fits in a word and such that the irreducible
11.1.60 Remark Considering finite fields of similar cardinality, optimal extension fields compare
favorably against prime fields thanks to an inversion that is usually faster. Also, due to
the lack of dedicated instructions in some old processors to multiply polynomials in F2 [x],
a multiplication in an optimal extension field is usually faster than in characteristic 2 on
those platforms.
11.1.4 Addition
11.1.61 Remark Adding, or subtracting, elements in a finite field is in general pretty straightfor-
ward; see for instance [1413, Algorithms 2.7 and 2.32]. This is not the case when nonzero
elements are expressed as a power of a generator of the multiplicative group. The notion of
Jacobi logarithm, see Subsection 2.1.7.5, gives a way to add elements easily. However, Ja-
cobi logarithms must be precomputed and stored for each nonzero element [661, Subsection
2.3.3] and this explains why they are only used for relatively small fields.
11.1.62 Remark For prime fields represented as Z/pZ, a reduction is sometimes needed after adding
two integers modulo a prime number p and this might lead to branch mispredictions [1428].
In [3078], Zimmermann discusses specifically designed algorithms, which do not need any
adjustment step to perform additions, subtractions, and multiplications under certain con-
ditions.
11.1.5 Multiplication
11.1.63 Remark Multiplying two elements is a task significantly more complicated than adding
them. There is a wide range of multiplication methods whose efficiency and level of sophis-
tication increase with the size of the operands.
11.1.64 Remark One approach to perform a modular multiplication is to compute the product first
and then reduce it independently. This is especially effective for large values where it is worth
using advanced multiplication techniques, such as Karatsuba [1684], Toom-Cook [1768], or
fast Fourier transform [751]. For smaller values, Algorithm 11.1.65, which is based on the
schoolbook method, reduces the result while it is computed for increased performance.
11.1.65 Algorithm (Interleaved multiplication-reduction)
Input: The n-word prime p and two n-word integers $u = (u_{n-1} \ldots u_0)_b$ and v.
Output: An integer r such that r ≡ uv (mod p).
1. r ← 0
2. for i = 0 to n − 1 do
3.   $r \leftarrow rb + u_{n-i-1} v$
4.   Approximate q = ⌊r/p⌋ by q̂
5.   r ← r − q̂p
6. end for
7. while r ≥ p do
8.   r ← r − p
9. end while
10. return r
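The structure of Algorithm 11.1.65 can be sketched as follows. One simplification is made and should be read as an assumption: the quotient at Line 4 is computed exactly with Python's integer division instead of being approximated, so the final correction loop never actually runs here. It is kept to mirror the algorithm, where an underestimate q̂ would leave a small excess to subtract. The function name and the default word size w = 32 are also choices made for this illustration.

```python
def interleaved_mulmod(u, v, p, w=32):
    """Return u*v mod p, reducing after each word of u is processed (radix b = 2^w)."""
    b = 1 << w
    n = -(-p.bit_length() // w)  # number of w-bit words of p
    r = 0
    for i in range(n):
        # Word u_{n-i-1}: the words of u are consumed from the most significant one.
        u_word = (u >> (w * (n - 1 - i))) & (b - 1)
        r = r * b + u_word * v
        q_hat = r // p   # exact quotient, standing in for the approximation of Line 4
        r -= q_hat * p
    while r >= p:        # correction step, needed when q_hat underestimates the quotient
        r -= p
    return r

p = (1 << 127) - 1
print(interleaved_mulmod(p - 5, p - 7, p))  # 35
```

Because r is reduced at every iteration, it never grows beyond roughly one word more than p, in contrast with the full double-length product.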
11.1.66 Remark Although r is relatively small, different techniques, some of them quite involved,
exist to determine q̂ at Line 4 of Algorithm 11.1.65; see [829, Section 2.2] and [661, Subsec-
tion 11.1.2].
11.1.67 Remark Remark 11.1.46 gives a presentation of the notions of Montgomery representation
and of Montgomery reduction. Montgomery multiplication consists in multiplying elements
in Montgomery representation, before applying Montgomery reduction in order to have the
result of the product, again, in Montgomery representation. Formally, we have the relation
Redc([x][y]) = [xy].
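The relations Redc([x][y]) = [xy] and Redc([z]) = z can be checked numerically. The sketch below assumes the usual conventions, which are not restated in this part of the text: R = 2^k > p with p odd, [x] = xR mod p, and Redc(u) = uR^{-1} mod p for 0 ≤ u < pR. The function names are hypothetical, and the three-argument `pow` with a negative exponent requires Python 3.8 or later.

```python
def montgomery_setup(p, k):
    """Return p' = -p^(-1) mod R for R = 2^k (p odd, R > p)."""
    R = 1 << k
    return (-pow(p, -1, R)) % R

def redc(u, p, k, p_prime):
    """Montgomery reduction: u * R^(-1) mod p for 0 <= u < p * R, without dividing by p."""
    mask = (1 << k) - 1
    m = ((u & mask) * p_prime) & mask   # m = u * p' mod R
    t = (u + m * p) >> k                # u + m*p is divisible by R
    return t - p if t >= p else t

def to_montgomery(x, p, k):
    """[x] = x * R mod p (the conversion itself uses an ordinary reduction)."""
    return (x << k) % p

p, k = (1 << 127) - 1, 128
p_prime = montgomery_setup(p, k)
x, y = 123456789, 987654321
mx, my = to_montgomery(x, p, k), to_montgomery(y, p, k)
print(redc(mx * my, p, k, p_prime) == to_montgomery(x * y % p, p, k))  # True
print(redc(mx, p, k, p_prime) == x)                                    # True
```

Only shifts, masks, and multiplications appear in `redc`, which is the point of the representation.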
11.1.68 Remark Karatsuba’s method relies on a clever use of the divide and conquer strategy. While
the schoolbook multiplication has complexity $O(n^2)$ to compute the product of two n-word
integers, Karatsuba multiplication [1683, 1684] has asymptotic complexity $O(n^{\log_2 3}) = O(n^{1.585})$.
Karatsuba multiplication is thus faster than the naïve approach, but due to a
certain overhead this occurs only for integers of size larger than some threshold n0 , which
depends on the platform. It is reported in [409] that this threshold can vary from 10 up to
more than 100 words. Note that GMP [1102] uses specifically optimized values for a wide
range of architectures.
11.1.69 Algorithm (Karatsuba multiplication)
Input: Two n-word integers $u = (u_{n-1} \ldots u_0)_b$, $v = (v_{n-1} \ldots v_0)_b$, and $n_0 \ge 2$.
Output: The 2n-word integer uv.
1. if $n \le n_0$ then
2.   return uv computed with the schoolbook method
3. else
4.   k ← ⌈n/2⌉
5.   Split u and v in two parts:
6.   $U_1 \leftarrow (u_{n-1} \ldots u_k)_b$ and $U_0 \leftarrow (u_{k-1} \ldots u_0)_b$
7.   $V_1 \leftarrow (v_{n-1} \ldots v_k)_b$ and $V_0 \leftarrow (v_{k-1} \ldots v_0)_b$
8.   $U_s \leftarrow U_0 + U_1$ and $V_s \leftarrow V_0 + V_1$
9.   Compute recursively $U_0 V_0$, $U_1 V_1$, and $U_s V_s$
10.  return $U_1 V_1 b^{2k} + (U_s V_s - U_1 V_1 - U_0 V_0) b^k + U_0 V_0$
11. end if
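A direct transcription of Algorithm 11.1.69 is sketched below. Two deviations are assumptions of this illustration: Python's built-in product stands in for the schoolbook method below the threshold, and the sums U_s and V_s are treated as (k + 1)-word operands because of a possible carry, which is why the threshold is required to be at least 3 so that the recursion terminates. The function name and the default values of n0 and w are also choices made here.

```python
def karatsuba(u, v, n, n0=4, w=32):
    """Product of two n-word nonnegative integers (words of w bits)."""
    if n0 < 3:
        raise ValueError("n0 must be at least 3 in this sketch")
    if n <= n0:
        return u * v  # stands in for the schoolbook method
    k = (n + 1) // 2  # ceil(n / 2)
    mask = (1 << (k * w)) - 1
    # Split u = U1 * b^k + U0 and v = V1 * b^k + V0 with b = 2^w.
    u1, u0 = u >> (k * w), u & mask
    v1, v0 = v >> (k * w), v & mask
    us, vs = u0 + u1, v0 + v1
    # Three recursive products instead of four.
    p0 = karatsuba(u0, v0, k, n0, w)
    p1 = karatsuba(u1, v1, n - k, n0, w)
    ps = karatsuba(us, vs, k + 1, n0, w)  # the sums may carry into one extra word
    return (p1 << (2 * k * w)) + ((ps - p1 - p0) << (k * w)) + p0

u = (1 << 1279) - 12345
v = (1 << 1280) - 98766
print(karatsuba(u, v, 40) == u * v)  # True
```

The shifts by multiples of k·w bits correspond to the free multiplications by powers of b.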
11.1.70 Remark Since multiplications by powers of b are free, multiplying two n-word integers with
Algorithm 11.1.69 requires only three multiplications of size n/2 and a few additions of size
n. This observation applied recursively justifies the complexity $O(n^{\log_2 3})$ of Karatsuba’s
method.
11.1.71 Remark Using polynomial evaluation and interpolation techniques, Toom 3-ways reduces
one multiplication of size n to five multiplications of size n/3 and thus runs in $O(n^{\log_3 5}) = O(n^{1.465})$
[1768, Subsection 4.3.3.A]. Generalizing this idea, Schönhage–Strassen multiplication
[2559] using the fast Fourier transform runs in O(n log n log log n) [751, Section 9.5]. The
overhead is such that this method is only worth using for integers having several thousand
digits [1102].
11.1.5.2 Extension fields
11.1.72 Remark For finite fields $\mathbb{F}_{q^n}$ defined via an irreducible polynomial $f \in \mathbb{F}_q[x]$ of degree n, all
the techniques presented for prime fields apply in this context as well. In particular, there
is a generalization of Algorithm 11.1.65 where polynomial multiplications and reductions
modulo f are interleaved. More advanced multiplication methods, such as Karatsuba’s
or those based on the fast Fourier transform, are also available, with complexities similar to
those of the integer case.
11.1.73 Remark Many practical applications use finite fields of characteristic two and specific mul-
tiplication methods, such as the right-to-left comb, left-to-right comb, or even window meth-
ods involving some precomputations, have been developed in this context; see [1413] for some
explicit algorithms and [400] for a discussion focused on the implementation of Karatsuba,
Toom-Cook, Schönhage, and Cantor methods.
11.1.74 Remark We refer to Section 5.3 for multiplication in a normal basis, where the complexity
is thoroughly discussed and the concept of optimal normal basis is introduced.
11.1.6 Squaring
11.1.6.1 Finite fields of odd characteristic
suggests that it takes less effort to compute $u^2$ than to compute uv, where u and v are
distinct integers of the same size. In practice, a dedicated modular squaring algorithm can
be up to 20% faster than its general counterpart [661].
11.1.76 Remark For each multiplication algorithm, there exists a specific variant exclusively de-
signed to square elements; see [661, Subsection 10.3.3] for a description of the schoolbook
and Karatsuba squaring methods.
11.1.77 Remark For the binary field $\mathbb{F}_{2^n}$ defined over $\mathbb{F}_2$ with a normal basis, a squaring corresponds
to a circular shift of the coordinates. With a polynomial basis modulo f, the squaring of an
element requires slightly more work, as we have
$$\left( \sum_{i=0}^{n-1} a_i x^i \right)^{2} = \sum_{i=0}^{n-1} a_i x^{2i}.$$
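The identity above says that squaring in a polynomial basis amounts to spreading the coefficients to the even positions and then reducing modulo f. A minimal Python sketch follows, under the assumption that a polynomial over F_2 is stored as an integer bit mask (bit i holds the coefficient of x^i); the function names are hypothetical.

```python
def gf2_square(a):
    """Square a polynomial over F_2: coefficient a_i moves from x^i to x^(2i)."""
    result = 0
    i = 0
    while a:
        if a & 1:
            result |= 1 << (2 * i)
        a >>= 1
        i += 1
    return result


def gf2_mod(a, f):
    """Reduce the polynomial a modulo the nonzero polynomial f over F_2."""
    deg_f = f.bit_length() - 1
    while a.bit_length() - 1 >= deg_f:
        # Cancel the leading term of a with a shifted copy of f.
        a ^= f << (a.bit_length() - 1 - deg_f)
    return a
```

For instance, with f = x^3 + x + 1 (bit mask 0b1011), the square of x^2 + x is gf2_mod(gf2_square(0b110), 0b1011).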
11.1.7 Exponentiation
11.1.78 Remark The exponentiation operation consists in computing $\alpha^m$, for a nonnegative integer
m and α ∈ Fq . It should be optimized as much as possible as it is a crucial subroutine in
many algorithms. For instance, exponentiation is key for finding a generator of the multi-
plicative group or a primitive polynomial, for computing an inverse or the square root of
an element, or for factoring a polynomial; see [1232] for a very complete survey dedicated
to exponentiation methods in finite fields.
11.1.79 Remark When the exponent m is fixed, it may be worth searching for an addition chain
with low complexity computing m; see [661, Section 9.2]. If the element α ∈ Fp is fixed while
the exponent varies, precomputing some powers of α can lead to considerable speedups. See
in particular Yao’s method [1767, 3032] also known as BGMW [414] and fixed-base comb
algorithm [242, 2396] presented in [1941] as well. More details are available in [661, Section
9.3]. In the general case, where both the element and the exponent vary, Algorithm 11.1.80
is the most efficient exponentiation method that is known.
11.1.80 Algorithm (Sliding window exponentiation)
Input: An element $\alpha \in \mathbb{F}_p$, a nonnegative exponent $m = \sum_{i=0}^{\ell-1} m_i 2^i$, a parameter
$k \ge 1$, and the stored values $\alpha, \alpha^3, \ldots, \alpha^{2^k - 1}$.
Output: The element $\alpha^m \in \mathbb{F}_p$.
1. β ← 1 and i ← ℓ − 1
2. while i ≥ 0 do
3.     if m_i = 0 then
4.         β ← β^2 (mod p) and i ← i − 1
5.     else
6.         s ← max{i − k + 1, 0}
7.         while m_s = 0 do
8.             s ← s + 1
9.         end while
10.        for j = 1 to i − s + 1 do
11.            β ← β^2 (mod p)
12.        end for
13.        Form t = (m_i . . . m_s)_2
14.        β ← β × α^t (mod p)
15.        i ← s − 1
16.    end if
17. end while
18. return β
11.1.81 Remark The parameter k controls the size of the window used to scan the bits of m. For
k = 1, Algorithm 11.1.80 coincides with the well-known square and multiply method. For
k > 1, Algorithm 11.1.80 relies on $2^{k-1} - 1$ precomputed values obtained at a cost of $2^{k-1}$
multiplications. Given an ℓ-bit exponent m, it requires $\ell/(k+1)$ extra field multiplications,
on average [2305]. The size of the window k should therefore be selected to minimize the
quantity $2^{k-1} + \ell/(k+1)$. The number of squarings is independent of k and is equal to ℓ in
any case. Algorithm 11.1.80 is implemented in NTL [2633].
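The sliding window method of Algorithm 11.1.80 can be sketched in Python as follows. This is only an illustration over a prime field represented by integers modulo p; the function name, the default window size k = 4, and the dictionary used to hold the odd powers are our own choices, and the precomputation is done inside the function rather than supplied as input.

```python
def sliding_window_pow(alpha, m, p, k=4):
    """Return alpha**m mod p by scanning the bits of m with windows of size <= k."""
    alpha %= p
    # Precompute the odd powers alpha, alpha^3, ..., alpha^(2^k - 1).
    alpha_sq = alpha * alpha % p
    odd_powers = {1: alpha}
    for t in range(3, 1 << k, 2):
        odd_powers[t] = odd_powers[t - 2] * alpha_sq % p
    beta = 1 % p
    i = m.bit_length() - 1
    while i >= 0:
        if (m >> i) & 1 == 0:
            beta = beta * beta % p
            i -= 1
        else:
            # Longest window m_i ... m_s of length <= k that ends with a 1 bit.
            s = max(i - k + 1, 0)
            while (m >> s) & 1 == 0:
                s += 1
            for _ in range(i - s + 1):
                beta = beta * beta % p
            t = (m >> s) & ((1 << (i - s + 1)) - 1)  # t = (m_i ... m_s)_2, odd
            beta = beta * odd_powers[t] % p
            i = s - 1
    return beta
```

With k = 1 the table contains only alpha and the loop degenerates to square and multiply, as stated in Remark 11.1.81.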
11.1.82 Remark The main difference with the prime field case is the existence of the Frobenius
automorphism that can be used in an extension field to speed up the computation of an
exponentiation.
11.1.83 Algorithm (Fast exponentiation in polynomial basis)
Input: Two polynomials f and g with coefficients in $\mathbb{F}_q$ such that deg f = n and
deg g < n, an exponent $0 < m < q^n$, and a positive integer r.
Output: The polynomial $g^m \pmod f$.
1. Write m in base $q^r$ as $m = (m_{\ell-1} \ldots m_0)_{q^r}$
2. for i = 0 to ℓ − 1 do
3.     Compute and store g^{m_i} (mod f)
4. end for
5. h ← x^{q^r} (mod f) and t ← 1
6. for i = ℓ − 1 down to 0 do
7.     t ← t(h) (mod f)
8.     t ← t g^{m_i} (mod f)
9. end for
10. return t
11.1.84 Remark Algorithm 11.1.83 relies on a generalization of the square and multiply method in
base $q^r$. It is discussed in [1180] where r is set to be $\lceil n/\log_q n \rceil$ and where the computation of
the residues $g^{m_i} \pmod f$ at Line 3 is done with the BGMW method [414]. The computation
t(h) (mod f) at Line 7 is a modular composition; see Remarks 11.1.85 and 11.1.86.
11.1.85 Remark The first nontrivial method to compute t(h) (mod f) is due to Brent and
Kung [401]. Assuming that deg h, deg t < deg f = n and that square matrices of dimension
n can be multiplied with $O(n^w)$ field multiplications, the Brent–Kung approach
takes $O(M_q(n) n^{1/2} + n^{(w+1)/2})$ field operations. With $M_q(n) = O(n^{\log_2 3})$ corresponding
to Karatsuba’s multiplication (Subsection 11.1.5), the overall complexity becomes
$O(n^{\log_2 3 + 1/2}) = O(n^{2.085})$. If instead fast integer multiplication methods à la Schönhage–
Strassen [2559] are used, the complexity is $O(n^{(w+1)/2})$. The natural bound w = 3 was
improved by Strassen, who introduced a method with $w = \log_2 7 < 2.8074$ [2731]. The best
known upper bound is w < 2.3727 [2984, 2985].
11.1.86 Remark Umans proposed a completely different approach to perform a modular composition,
based on multivariate multipoint evaluation [2836]. Initially, this work only addressed
characteristic at most $n^{o(1)}$ but was later generalized to any characteristic by Umans and
Kedlaya [1721, 1722]. In any case, the asymptotic complexity is $n^{1+o(1)} \log^{1+o(1)} q$ bit operations,
which is optimal up to lower order terms.
11.1.87 Remark For the particular case $\mathbb{F}_q = \mathbb{F}_{2^n}$ and for $r = \lceil n/\log_2 n \rceil$, the complexity of
Algorithm 11.1.83 becomes $O(M_2(n)\, n/\log n)$. This complexity does not depend on the
method, either [401] or [1722], used to perform the modular composition.
11.1.88 Remark To compute $\alpha^m \in \mathbb{F}_{q^n}$, where $\mathbb{F}_{q^n}$ is represented using a normal basis over $\mathbb{F}_q$, the
exponent m is again expressed in base $q^k$, for some fixed k, in order to take advantage of
the Frobenius automorphism that allows one to compute $\alpha^q$ with just a cyclic shift of the
coordinates of α. In this case, only Line 7 of Algorithm 11.1.83 needs to be modified and
replaced by $t \leftarrow t^{q^k}$; see also Subsection 5.3.5.
11.1.89 Remark When Fq is an optimal extension field (Definition 11.1.59), the action of the Frobe-
nius automorphism can also be computed extremely efficiently, leading to very fast expo-
nentiation techniques; see [2097] and [661, Subsection 11.3.3].
11.1.8 Inversion
11.1.90 Remark There are two ways to compute the inverse of a field element $\alpha \in \mathbb{F}_{q^n}$. If the
field is defined as a quotient set, we can use the extended Euclidean algorithm; see Algorithm
11.1.92. Alternatively, Lagrange’s theorem implies that we have $\alpha^{q^n - 2} = 1/\alpha$. This
last method is totally general but it is particularly adapted to finite fields defined with a
normal basis or when the action of the Frobenius automorphism $\alpha \mapsto \alpha^q$ can be computed
efficiently.
11.1.91 Remark Let R be a Euclidean ring with Euclidean function ϕ. For elements a, b ∈ R, we
can always write a = bq + r with ϕ(r) < ϕ(b). In practice, R = Z with natural order < or
R = Fq [x] with the degree function. Assuming that b is a prime number or an irreducible
polynomial and a is an element such that ϕ(a) < ϕ(b), we then have gcd(a, b) = 1. The
Bézout identity au + bv = 1, whose coefficients u and v are returned by Algorithm 11.1.92,
implies that au ≡ 1 (mod b), i.e., u is the inverse of a modulo b.
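The role of the Bézout identity described in Remark 11.1.91 can be illustrated with a short Python sketch for R = Z. This is a generic extended Euclidean loop that only tracks the coefficient of a; it is not a transcription of Algorithm 11.1.92, and the function and variable names are our own.

```python
def inverse_mod(a, b):
    """Return u with a*u congruent to 1 modulo b, assuming gcd(a, b) = 1."""
    # Invariants: u*a is congruent to c and w*a is congruent to d, modulo b.
    u, w = 1, 0
    c, d = a % b, b
    while c != 0:
        q = d // c              # Euclidean quotient
        d, c = c, d - q * c     # remainder step
        w, u = u, w - q * u     # same linear combination on the coefficients
    if d != 1:
        raise ValueError("a is not invertible modulo b")
    return w % b
```

The same loop works in F_q[x] once the integer division is replaced by polynomial division with remainder.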
11.1.93 Remark The division at Line 6 of Algorithm 11.1.92 is exact. Dedicated methods to compute
the quotient of an exact division are given in [1601, 1805].
11.1.94 Remark In a finite field, an inversion is in general considerably more expensive than a multiplication.
When k elements $a_1, \ldots, a_k$ need to be inverted, a trick due to Montgomery [2133]
allows one to replace k inversions by one inversion and 3k − 3 field multiplications. The principle
is to compute the inverse $(a_1 \cdots a_k)^{-1}$ and then multiply it by suitable precomputed
terms in order to recover successively each inverse individually. It was introduced initially
to speed up the elliptic curve factorization method and is therefore described for integers
defined modulo a composite number. However, this trick can be applied to any structure
where the notion of inverse exists. It is also presented in [660] and [661].
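A minimal Python sketch of Montgomery's simultaneous inversion trick, for integers modulo a prime p, is given below. The function name is hypothetical, the input list is assumed to be nonempty with all entries invertible modulo p, and the single inversion is delegated to Python's built-in pow(x, -1, p), which requires Python 3.8 or later.

```python
def batch_inverse(values, p):
    """Invert all entries of values modulo p with one inversion and 3k - 3 products."""
    k = len(values)
    # prefix[i] = a_1 * ... * a_{i+1} mod p  (k - 1 multiplications)
    prefix = [values[0] % p]
    for a in values[1:]:
        prefix.append(prefix[-1] * a % p)
    inv = pow(prefix[-1], -1, p)  # the single inversion: (a_1 ... a_k)^(-1)
    result = [0] * k
    for i in range(k - 1, 0, -1):
        result[i] = inv * prefix[i - 1] % p  # inverse of values[i]
        inv = inv * values[i] % p            # now the inverse of a_1 ... a_i
    result[0] = inv
    return result
```

The backward loop performs two multiplications per element, which together with the k − 1 prefix products gives the 3k − 3 multiplications mentioned above.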
11.1.95 Remark In order to compute the division e/a, where e, a ∈ Fqn , one could invert a and
multiply the result by e. However, e/a can be obtained directly with Algorithm 11.1.92.
Simply replace u ← 1 by u ← e at Line 1.
11.1.8.1 Prime fields
11.1.96 Remark Algorithm 11.1.92 is usually preferred over Lagrange’s method to compute the
inverse of α modulo p. There are at least two reasons for that. Computing the exponentiation
$\alpha^{p-2} \pmod p$ requires on average twice as many arithmetic operations as the extended
Euclidean method [709]. Also, there are a number of improvements and variants of Algorithm
11.1.92 that can be implemented to speed up its execution on different platforms.
11.1.97 Remark Algorithm 11.1.92 returns the inverse modulo p after O(log p) steps. When suitably
implemented, in particular if the precision of the Euclidean division is adjusted and decreases
with the size of its arguments d and c, then the overall time complexity is $O(\log^2 p)$ [660,
Section 1.3]. In [1888], Lehmer suggests replacing the exact computation of the quotient q at
Line 3 of Algorithm 11.1.92 by an approximation obtained by dividing the most significant
digits of d and c. This remark has been explored further by many authors [710, 1602, 1903].
11.1.98 Remark In a variant introduced by Brent and Kung [402, 661], divisions are eliminated and
replaced by shifts, additions, and subtractions. This approach, known as the binary method
or the plus-minus method , is well-suited for architectures where a division is expensive. Not
surprisingly, the number of steps needed is still O(log p). Indeed, the quotient q at each step
of Algorithm 11.1.92 is small most of the time. The probability that q = 1 is close to 0.415
and q ≤ 5 in more than 77% of the cases [660].
11.1.99 Remark Schönhage [2557], improving on Knuth’s work [1766], showed how the ex-
tended Euclidean algorithm, and hence inversion, can be done asymptotically in time
O(M (log p) log log p); see also [3033] and [56] for a description of the method. Stehlé and
Zimmermann [2705] developed a binary recursive gcd method. Although it does not im-
prove on the O(M (log p) log log p) asymptotic complexity, its description, implementation,
and proof of correctness are simpler than Schönhage’s method.
11.1.100 Remark A very simple approach presented in [2803] allows one to find an inverse modulo
p. Interestingly, it is not related to the extended Euclidean gcd method nor Lagrange’s
method. It is particularly efficient for certain types of primes, such as Mersenne primes.
The method is recalled in [661, Subsection 11.1.3].
11.1.101 Remark There is a notion of Montgomery inverse [2132], which completes the other op-
erations already existing in Montgomery representation; see Remarks 11.1.46 and 11.1.67,
and [1644, 2536] for additional improvements regarding the Montgomery inverse.
11.1.102 Remark As in the integer case, there is a binary version of Algorithm 11.1.92 that does
not require any division; see [436] for an efficient version in even characteristic.
11.1.103 Remark A notion of reduction in a finite field defined by a normal basis is discussed
in [2748]. It is then possible to compute the inverse of an element with a variant of Algorithm
11.1.92. However, since the Frobenius automorphism can be evaluated for free
in a normal basis, Lagrange’s method, which computes $\alpha^{-1}$ as $\alpha^{q^n - 2}$, is preferred. Algorithm
11.1.104 follows this idea with an additional improvement, i.e., the use of addition
chains computing q − 2 and n − 1 [1576]. See [661, Section 9.2] for a definition of addition
chains and related techniques to find short chains.
11.1.104 Algorithm (Inversion using Lagrange’s theorem)
Input: An element $\alpha \in \mathbb{F}_{q^n}^*$, two addition chains, namely $(a_0, a_1, \ldots, a_{s_1})$ computing
q − 2 and $(b_0, b_1, \ldots, b_{s_2})$ computing n − 1.
Output: The inverse of α, i.e., $\alpha^{-1} = \alpha^{q^n - 2}$.
1. Compute β ← α^{q−2} using the addition chain (a_0, a_1, . . . , a_{s_1})
2. T[0] ← α × β
3. for i = 1 to s_2 do
4.     γ ← T[k]^{q^{b_j}} where b_i = b_k + b_j
5.     T[i] ← γ × T[j]
6. end for
7. γ ← T[s_2]
8. return β × γ^q
11.1.108 Remark Let us assume that p is an odd prime number and that q is some power of p. We
know that $\mathbb{F}_q^*$ is a cyclic group, generated by, say, γ. Because the cardinality of $\mathbb{F}_q^*$ is even,
all the square elements in $\mathbb{F}_q^*$ must be even powers of γ and all the nonsquare elements
correspond to the odd powers of γ. Thus, there are (q − 1)/2 squares and just as many
nonsquare elements in $\mathbb{F}_q^*$.
1. k ← 1
2. while p ≠ 1 do
3.     if α = 0 then
4.         return 0
5.     end if
6.     v ← 0
7.     while α ≡ 0 (mod 2) do
8.         v ← v + 1 and α ← α/2
9.     end while
10.    if v ≡ 1 (mod 2) and p ≡ ±3 (mod 8) then
11.        k ← −k
12.    end if
13.    if α ≡ 3 (mod 4) and p ≡ 3 (mod 4) then
14.        k ← −k
15.    end if
16.    r ← α, α ← p (mod r), and p ← r
17. end while
18. return k
11.1.112 Remark Algorithm 11.1.111 relies on a quadratic reciprocity law that allows one to reduce
the size of the operands in a way that is similar to the computation of the gcd with Euclid’s
algorithm.
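The reciprocity-based loop listed above translates almost line by line into Python. The sketch below assumes that p is an odd prime and reduces α modulo p first; the function name is our own.

```python
def legendre(alpha, p):
    """Legendre symbol of alpha modulo an odd prime p, via quadratic reciprocity."""
    alpha %= p
    k = 1
    while p != 1:
        if alpha == 0:
            return 0
        # Pull out the powers of two: (2/p) = -1 exactly when p = +-3 mod 8.
        v = 0
        while alpha % 2 == 0:
            v += 1
            alpha //= 2
        if v % 2 == 1 and p % 8 in (3, 5):
            k = -k
        # Reciprocity: the sign flips when both arguments are 3 mod 4.
        if alpha % 4 == 3 and p % 4 == 3:
            k = -k
        alpha, p = p % alpha, alpha
    return k
```

The result can be cross-checked against Euler's criterion, that is, against pow(alpha, (p - 1) // 2, p).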
11.1.113 Remark The evaluation of $\left(\frac{\alpha}{p}\right)$ via the exponentiation $\alpha^{(p-1)/2} \pmod p$ has complexity
$O(\log^3 p)$ with the square and multiply method. By contrast, Algorithm 11.1.111 has complexity
$O(\log^2 p)$. In [407], Brent and Zimmermann describe a new algorithm to compute
$\left(\frac{\alpha}{p}\right)$ with complexity $O(M(\log p) \log\log p)$.
11.1.114 Remark There is a generalization of the Legendre symbol for the elements of $\mathbb{F}_{p^n}$. Let
$f \in \mathbb{F}_p[x]$ be an irreducible polynomial of degree n and let $g \in \mathbb{F}_p[x]$. The Legendre symbol
for polynomials, again denoted by $\left(\frac{g}{f}\right)$, satisfies $\left(\frac{0}{f}\right) = 0$ and $\left(\frac{g}{f}\right) = g^{(q-1)/2} \pmod f$, with
$q = p^n$, for nonzero g. Algorithm 11.1.115 also relies on a quadratic reciprocity law, much like
Algorithm 11.1.111, and makes it possible to determine efficiently whether g is a square modulo f.
1. k ← 1
2. repeat
3.     if g = 0 then
4.         return 0
5.     end if
6.     a ← the leading coefficient of g
7.     g ← g/a
8.     if deg f ≡ 1 (mod 2) then
9.         $k \leftarrow \left(\frac{a}{p}\right) k$
10.    end if
11.    if p^{deg f} ≡ 3 (mod 4) and deg f deg g ≡ 1 (mod 2) then
12.        k ← −k
13.    end if
14.    r ← g, g ← f (mod r), and f ← r
15. until deg f = 0
16. return k
11.1.116 Remark Once we know that a nonzero element α ∈ Fp is a square, it may be necessary to
compute a square root of α, i.e., an element $\rho \in \mathbb{F}_p$ satisfying $\rho^2 = \alpha$. If ρ is one square root,
then −ρ is the other one and there exist closed formulas to compute ρ in most cases [660,
Section 1.5]. Indeed, we have
11.1.117 Remark For the other cases, i.e., when q ≡ 1 (mod 8) or when q ≡ 5 (mod 8), $\alpha^{(q-1)/4} =
-1$, and q is an even power of p, there exist several polynomial time methods to compute
a square root. The most efficient factorization methods, applied to the polynomial $x^2 - \alpha$,
return a square root of α in $\mathbb{F}_q$ using O(log q) field operations; see Remark 11.4.3. There
are also dedicated methods, such as the Tonelli–Shanks algorithm [660], which returns a
square root in the prime field $\mathbb{F}_p$ in time $O(\log^4 p)$. Another example is Algorithm 11.1.118,
which is a generalization of Cipolla’s method [751, Subsection 2.3.9]. Algorithm 11.1.118 is
remarkably simple and easy to implement. It works in the quadratic extension $\mathbb{F}_{q^2}$ and also
requires O(log q) field operations.
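To give an idea of how a Cipolla-type method operates in a quadratic extension, here is a minimal Python sketch of the classical Cipolla algorithm over a prime field. It is not a transcription of Algorithm 11.1.118: it assumes that p is an odd prime, it searches for the auxiliary element t deterministically (a random choice is more usual), and all names are our own.

```python
def cipolla_sqrt(alpha, p):
    """Return r with r*r congruent to alpha modulo the odd prime p."""
    alpha %= p
    if alpha == 0:
        return 0
    if pow(alpha, (p - 1) // 2, p) != 1:
        raise ValueError("alpha is not a square modulo p")
    # Find t such that d = t^2 - alpha is a nonsquare modulo p.
    t = 0
    while True:
        d = (t * t - alpha) % p
        if d == 0:
            return t  # t itself is a square root of alpha
        if pow(d, (p - 1) // 2, p) == p - 1:
            break
        t += 1

    # Arithmetic in F_p[w]/(w^2 - d); the pair (a, b) stands for a + b*w.
    def mul(x, y):
        return ((x[0] * y[0] + x[1] * y[1] * d) % p,
                (x[0] * y[1] + x[1] * y[0]) % p)

    # Compute (t + w)^((p + 1) / 2) by square and multiply.
    result, base, e = (1, 0), (t, 1), (p + 1) // 2
    while e:
        if e & 1:
            result = mul(result, base)
        base = mul(base, base)
        e >>= 1
    return result[0]  # the w-component vanishes when alpha is a square
```

The correctness rests on the fact that the norm of t + w is t^2 − d = α, so that (t + w)^{p+1} = α in the quadratic extension.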
11.1.119 Remark Every element $\alpha \in \mathbb{F}_{2^n}$ is a square and the square root ρ of α can be easily
obtained thanks to the multiplicative structure of $\mathbb{F}_{2^n}^*$, which implies that $\rho = \alpha^{2^{n-1}}$. With
a normal basis, the computation is immediate. Using a polynomial representation, modulo
an irreducible polynomial f, there is a different approach. If α is represented by $\sum_{i=0}^{n-1} g_i x^i$,
then observe that
$$\sqrt{\alpha} = \sum_{i \text{ even}} g_i x^{i/2} + \sqrt{x} \sum_{i \text{ odd}} g_i x^{(i-1)/2},$$
where $\sqrt{x}$ has been precomputed modulo f. We note that $\sqrt{x}$ can be obtained very easily
when f is a trinomial of odd degree; see [661, Subsection 11.2.6].
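The even/odd splitting can be sketched in Python as follows. The sketch assumes that elements are stored as integer bit masks (bit i holds the coefficient of x^i), that f is irreducible of degree n ≥ 2, and that the square root of x is obtained as x^{2^{n-1}} by repeated squaring; the function names are hypothetical.

```python
def gf2_mulmod(a, b, f):
    """Product of a and b modulo f over F_2; a and b have degree < deg f."""
    n = f.bit_length() - 1
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1              # multiply a by x
        if (a >> n) & 1:
            a ^= f           # and reduce modulo f
    return result


def gf2_sqrt(alpha, f):
    """Square root of alpha in F_2[x]/(f) using the even/odd decomposition."""
    n = f.bit_length() - 1
    # sqrt(x) = x^(2^(n-1)) mod f, obtained with n - 1 squarings.
    sqrt_x = 0b10
    for _ in range(n - 1):
        sqrt_x = gf2_mulmod(sqrt_x, sqrt_x, f)
    even = odd = 0
    for i in range(n):
        if (alpha >> i) & 1:
            if i % 2 == 0:
                even |= 1 << (i // 2)        # g_i x^(i/2)
            else:
                odd |= 1 << ((i - 1) // 2)   # g_i x^((i-1)/2)
    return even ^ gf2_mulmod(sqrt_x, odd, f)
```

In practice the value of sqrt_x would be computed once per field and stored, as the remark indicates.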
11.1.120 Remark Unlike for finite fields of odd characteristic, solving a quadratic equation in $\mathbb{F}_{2^n}$ is
more involved than just extracting square roots. Considering the polynomial $x^2 + x + \beta$ in
$\mathbb{F}_{2^n}[x]$, we can show that it has a root in $\mathbb{F}_{2^n}$ if and only if the trace of β is zero. In that
case, a solution τ is given by
$$\tau = \sum_{i=0}^{(n-3)/2} \beta^{2^{2i+1}}$$
where $\omega \in \mathbb{F}_{2^n}$ is any element of trace 1. In any case, the other solution is τ + 1. Reference
[1088] gives techniques to speed up the computation of a square root in characteristic two
at the expense of extra storage.
See Also
References Cited: [19, 43, 56, 77, 143, 161, 163, 165, 166, 204, 209, 242, 400, 401, 402, 403,
404, 405, 406, 407, 408, 409, 414, 436, 439, 576, 610, 640, 660, 661, 662, 709, 710, 717, 750,
751, 829, 909, 927, 930, 933, 1066, 1088, 1102, 1180, 1181, 1187, 1223, 1227, 1232, 1233,
1255, 1327, 1328, 1410, 1413, 1416, 1426, 1428, 1430, 1576, 1601, 1602, 1644, 1683, 1684,
1721, 1722, 1766, 1767, 1768, 1775, 1781, 1805, 1888, 1895, 1903, 1939, 1941, 2002, 2011,
2038, 2080, 2095, 2096, 2097, 2099, 2111, 2132, 2133, 2141, 2194, 2305, 2306, 2396, 2420,
2435, 2473, 2536, 2557, 2559, 2573, 2582, 2627, 2628, 2631, 2632, 2633, 2640, 2693, 2694,
2705, 2709, 2731, 2748, 2754, 2798, 2803, 2836, 2976, 2984, 2985, 3004, 3009, 3030, 3032,
3033, 3078]
We∗ give basic counting estimates for univariate polynomials over finite fields. First, we
provide some classical counting results. Then we focus on a methodology based on analytic
combinatorics that allows the derivation of many nontrivial counting results. A series of
counting results for univariate polynomials over finite fields, some of them linked to the
analysis of algorithms, is then provided.
11.2.1 Remark The most classical counting estimate is for the number In of monic irreducible
polynomials of degree n over Fq ; see Theorem 2.1.24. (We observe that in the results of
this section there is only one finite field involved, and so we drop the notation Iq (n) for the
simpler one In . Sometimes, however, we let q go to infinity.)
$$I_n = \frac{1}{n} \sum_{d \mid n} \mu(d)\, q^{n/d} = \frac{q^n}{n} + O\!\left( \frac{q^{n/2}}{n} \right).$$
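The exact formula for I_n is easy to evaluate. The following Python sketch computes it with a straightforward Möbius function; both function names are ours, and the trial-division Möbius routine is only meant for small n.

```python
def mobius(n):
    """Moebius function of the positive integer n, by trial division."""
    result = 1
    d = 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0  # n has a square factor
            result = -result
        d += 1
    if n > 1:
        result = -result  # one remaining prime factor
    return result


def count_irreducible(q, n):
    """Number I_n of monic irreducible polynomials of degree n over F_q."""
    total = sum(mobius(d) * q ** (n // d)
                for d in range(1, n + 1) if n % d == 0)
    return total // n
```

For q = 2 the first values are 2, 1, 2, 3, 6, 9, 18, 30, and the ratio count_irreducible(q, n) / q**n approaches 1/n as n grows.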
11.2.3 Remark We have $I_n > 0$ for any prime power q and integer n > 1. Since the number of monic
polynomials of degree n over $\mathbb{F}_q$ is $q^n$, the estimate above shows that the probability of a
monic polynomial of degree n being irreducible is close to 1/n. This probability tends to zero
as n → ∞.
11.2.4 Remark A polynomial is squarefree if it has no repeated factors. The number of squarefree
polynomials over Fq was first given by Carlitz [537].
11.2.5 Theorem [537] Let $Q_n$ be the number of squarefree polynomials of degree n over $\mathbb{F}_q$. Then,
$$Q_n = \begin{cases} q^n - q^{n-1} & \text{for } n \ge 2, \\ q^n & \text{for } n = 0, 1. \end{cases}$$
11.2.6 Remark Theorem 11.2.5 implies, for n ≥ 2, that the proportion of squarefree polynomials
is 1 − 1/q. As a consequence, for large finite fields Fq most polynomials are squarefree.
11.2.7 Remark Let us consider the number of irreducible factors of fixed degree d in a random
polynomial of degree n over Fq . Zsigmondy [3083] concentrates on the prime field case Fp
and gives results for the number of monic polynomials having no irreducible factors of degree
d, 1 ≤ d ≤ n, and for the number of monic polynomials having a given number of distinct
roots. We will return to this problem in Subsection 11.2.3.2.
∗ Originally, this section was to be written by Philippe Flajolet and the author, but sadly, Philippe passed
away before we started working on it. This section is dedicated to the memory of my friend Philippe
Flajolet, for all his many lessons and guidance.
11.2.8 Remark The notes at the end of Chapter 4 of Lidl and Niederreiter’s book [1939] point to
several classical counting references published before 1983.
11.2.9 Remark Other classical results related to polynomials with prescribed trace or norm and
to self-reciprocal polynomials are given in Sections 3.1 and 3.5, respectively.
11.2.10 Remark As a final classical estimate we consider the probability that two polynomials are
coprime. It has been known since at least the 1960s (see Berlekamp [231] and Knuth [1765])
but most likely for a long time before then, that with probability 1 − 1/q, the greatest
common divisor of two polynomials over a finite field is 1, independently of the degrees of the
polynomials. This result has been reinvented a large number of times. In Subsection 11.2.3.4,
a much more precise refinement of this classical result is given.
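The value 1 − 1/q can be checked by brute force for small cases. The Python sketch below does so over F_2, where the expected proportion is 1/2. It assumes that polynomials are stored as integer bit masks (bit i holds the coefficient of x^i) and that both polynomials are monic of positive degree; the function names are hypothetical.

```python
from fractions import Fraction


def gf2_gcd(a, b):
    """Greatest common divisor of two polynomials over F_2 (bit masks)."""
    while b:
        # Reduce a modulo b by cancelling leading terms.
        while a.bit_length() >= b.bit_length():
            a ^= b << (a.bit_length() - b.bit_length())
        a, b = b, a
    return a


def coprime_fraction(n, m):
    """Proportion of coprime pairs among monic polynomials of degrees n and m over F_2."""
    def monic(d):
        return range(1 << d, 1 << (d + 1))  # all polynomials of exact degree d

    pairs = [(f, g) for f in monic(n) for g in monic(m)]
    coprime = sum(1 for f, g in pairs if gf2_gcd(f, g) == 1)
    return Fraction(coprime, len(pairs))
```

For example, coprime_fraction(3, 2) enumerates the 32 pairs of monic polynomials of degrees 3 and 2.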
11.2.11 Remark Flajolet has made major methodological contributions to the research area known
as analytic combinatorics. Among other things, analytic combinatorics provides a general
methodology that can be successfully applied to the analysis of algorithms from many di-
verse areas. Its main and classical reference is the book by Flajolet and Sedgewick [1083]. In
this section this general framework is presented only in relation to polynomials over a finite
field Fq although it can be used in much more general settings. For a longer introduction to
this methodology and its application to the analysis of polynomial factorization algorithms
see Flajolet, Gourdon, and Panario [1080].
11.2.12 Remark This framework has two basic components: generating functions to express, com-
binatorially, properties of interest, and asymptotic analysis for the derivation of estimates
when exact extraction of coefficients is not possible. The studied properties could be for pure
mathematical interest or for their application to the analysis of algorithms for polynomials
over finite fields [1080, 2347].
11.2.13 Remark Generating functions for counting some properties of polynomials over finite fields
have been previously used in some specific cases by Berlekamp [231, Chapter 3], Knuth
[1765, Subsection 4.6.2], and Odlyzko [2306]. The global usage of this technique to count
many interesting expressions, and its usage in the analysis of algorithms for polynomials over
finite fields, only became possible after the establishment of analytic combinatorics [1083].
11.2.16 Remark Since $[z^n]P(z)$ is $q^n$, it follows that $P(z) = (1 - qz)^{-1}$. This expression and the
one in Proposition 11.2.15 implicitly determine that $I_n$ satisfies
$$I_n = \frac{1}{n} \sum_{k \mid n} \mu(k)\, q^{n/k}.$$
11.2.17 Remark Carlitz’s result [537] for squarefree polynomials (Theorem 11.2.5) can be easily
recovered under this framework. Indeed, the generating function for squarefree polynomials
is
$$Q(z) = \prod_{k \ge 1} \left( 1 + z^k \right)^{I_k}.$$
Moreover, considering the multiplicity of its irreducible factors, each polynomial f factors
as $f = s t^2$, where s is squarefree and t is an arbitrary polynomial. We thus have $P(z) =
Q(z) P(z^2)$, and therefore,
$$Q(z) = \frac{P(z)}{P(z^2)} = \frac{1 - q z^2}{1 - q z}.$$
Carlitz’s estimate, given in Theorem 11.2.5, can be recovered after extracting coefficients
from Q(z).
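The coefficient extraction can be written out explicitly. The short derivation below uses only the geometric series for $(1 - qz)^{-1}$ and the closed form of Q(z) obtained in this remark; it is a worked check rather than a new result.

```latex
\begin{align*}
Q(z) &= (1 - q z^2) \sum_{n \ge 0} q^n z^n
      = \sum_{n \ge 0} q^n z^n - \sum_{n \ge 2} q^{n-1} z^n \\
     &= 1 + q z + \sum_{n \ge 2} \left( q^n - q^{n-1} \right) z^n .
\end{align*}
```

Hence $Q_0 = 1$, $Q_1 = q$, and $Q_n = q^n - q^{n-1}$ for $n \ge 2$, in agreement with Theorem 11.2.5.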
11.2.18 Remark Generating functions encode exact counting information in their coefficients. How-
ever, their extraction from a given generating function is in general a difficult task. Neverthe-
less, when considering generating functions as analytic functions, their behavior near their
dominant singularities (those with smallest modulus) is an important source of information
to extract coefficient asymptotics as their index tends to infinity.
11.2.19 Remark Most of the generating functions f (z) of interest in this section are singular at
z = 1/q with an isolated singularity of the algebraic-logarithmic type. In that case, we can
apply the following important result due to Flajolet and Odlyzko [1081].
where $z_1 > 1/q$ and ε are positive real numbers. Let k ≥ 0 be any integer, and α a real
number with α ≠ 0, −1, −2, . . .. If in a neighborhood of z = 1/q, f(z) has an expansion of
the form
$$f(z) = \frac{1}{(1 - qz)^{\alpha}} \left( \log \frac{1}{1 - qz} \right)^{k} (1 + o(1)),$$
then the coefficients satisfy, asymptotically as n → ∞,
$$[z^n] f(z) = q^n\, \frac{n^{\alpha - 1}}{\Gamma(\alpha)}\, (\log n)^k\, (1 + o(1)).$$
11.2.21 Remark This theorem requires analytic continuation of f (z) outside its circle of conver-
gence. However, there are some situations in which generating functions do not satisfy this
hypothesis. For instance in some of the generating functions related to smooth polynomials
(Subsection 11.2.3.3) analytic continuation is not possible. Saddle point methods are used
in these cases. All asymptotic enumeration methods required in this section are explained in
detail in the excellent presentations by Flajolet and Sedgewick [1083] and Odlyzko [2307].
11.2.22 Remark We list several results that are mostly derived in the framework of analytic com-
binatorics.
11.2.23 Remark Bivariate generating functions are used to study important parameters of inter-
est. The exact counting problem is now refined with two parameters, namely, the degree
of the polynomial and an additional property to be studied (for example, the number of
its irreducible factors). With an appropriate normalization, successive differentiation of the
bivariate generating function with respect to the additional parameter (evaluated at 1)
gives the factorial moments of interest. The classical book by Flajolet and Sedgewick [1083]
presents a comprehensive explanation of this methodology. The complete study of the num-
ber of irreducible factors of a random polynomial is presented below as an example.
where $[u^k z^n] P(u, z)$ is the number of polynomials of degree n with k irreducible factors.
Successive differentiation of P(u, z) with respect to u (evaluated at u = 1) gives univariate
generating functions for the factorial moments of this parameter. Asymptotic analysis of
these univariate generating functions gives an expectation of log n and standard deviation
$\sqrt{\log n}$. More can be said for this problem as the next theorem states. Flajolet and Soria
[1084] prove that the number of irreducible factors in a random polynomial over a finite
field satisfies a central limit theorem with mean and variance asymptotic to log n. We note
that this result is analogous to the Erdős–Kac theorem stating that the number of prime
factors in a random integer at most n satisfies a central limit theorem with mean and variance
asymptotic to log log n. We refer to Subsection 11.2.3.5 for analogies among irreducible
decompositions of polynomials over finite fields, prime decompositions of integers, and cycle
decompositions of permutations.
11.2.25 Theorem Let Ωn be a random variable counting the number of irreducible factors of a
random polynomial of degree n over Fq , where each factor is counted with its order of
multiplicity.
1. The mean value of Ωn is asymptotic to log n [231, 1765].
2. The variance of Ωn is asymptotic to log n [1084, 1765].
11.2.26 Remark As a first example of a factorization pattern, let us consider the number of irre-
ducible factors in a random polynomial of a given fixed degree. As it is indicated in Remark
11.2.7, the number of roots was studied by Zsigmondy [3083] for prime fields. Knopfmacher
and Knopfmacher [1760] present a detailed analysis, for any finite field, including variance.
11.2.27 Remark The case of polynomials with no roots is interesting when studying the distinct
values that a polynomial can take. This is related to permutation polynomials (Section 8.1)
and was studied by Uchiyama [2833]; see also [670].
11.2.29 Remark The number of irreducible factors of a given degree d in a polynomial of degree n
was studied by Williams [2983]. A detailed analysis of this problem can be found in [1761],
together with a determination of the variance, both in the case where repetitions are allowed
and in the case where they are not.
11.2.30 Remark Knopfmacher, Knopfmacher, and Warlimont [1762] provide the mean and variance
of what they call the “length” of a general factorization pattern by studying the number of
polynomial factorizations into exactly k factors.
11.2.31 Remark When factoring univariate polynomials (Section 11.4) using the method based on
the squarefree, distinct-degree, and equal-degree factorizations, it is relevant to determine if
a polynomial has all its irreducible factors of different degrees. In this case, the third stage,
the “equal-degree factorization,” is not required.
11.2.32 Theorem [1080, 1763] The probability that all irreducible factors of a random polynomial
of degree n over $\mathbb{F}_q$ have different degrees (but with single factors possibly repeated) is
asymptotic to
$$c_q = \prod_{k \ge 1} \left( 1 + \frac{I_k}{q^k - 1} \right) \left( 1 - q^{-k} \right)^{I_k},$$
11.2.34 Remark Information on the degrees of the largest irreducible factors is crucial to measure
stopping rules for factorization algorithms [1080]. Information on the degrees of the smallest
irreducible factors helps in the analysis of irreducibility test algorithms [2349, 2350].
11.2.35 Remark A random polynomial of degree n has with high probability several irreducible
factors whose degrees sum to near n; see Theorem 11.2.37 and Remark 11.2.38. Let $D_n^{[j]}$ be
the j-th largest degree of the factors of a random polynomial of degree n over $\mathbb{F}_q$. Car [510]
obtained an asymptotic expression for the cumulative distribution function of $D_n^{[1]}$ in terms of
the Dickman function. This number-theoretic function was originally introduced to describe
the distribution of the largest prime divisor of a random integer [2788].
11.2.36 Definition The Dickman function is defined as the unique continuous solution of the
difference-differential equation
$$\rho(u) = 1 \quad (0 \le u \le 1), \qquad u \rho'(u) = -\rho(u - 1) \quad (u > 1).$$
11.2.37 Theorem The distribution of the largest degree $D_n^{[1]}$ satisfies for all x ∈ (0, 1):
where $F_1(x) = \rho(1/x)$ and ρ denotes the Dickman function. In particular, one has $E(D_n^{[1]}) \sim
gn$, where g = 0.62432 . . . is known as the Golomb–Dickman constant.
11.2.38 Remark The most complete results about $D_n^{[j]}$, for any fixed positive integer j, are due
to Gourdon [1341]; see also [1080]. For instance, we also have $E(D_n^{[2]}) \sim 0.20958 \ldots n$, and
$E(D_n^{[3]}) \sim 0.08831 \ldots n$.
11.2.39 Remark Information on the relation between the first and second largest degree irreducible
factors is used to compute the average-case analysis of the classical factorization method
based on squarefree, distinct-degree, and equal-degree factorization under the “early-abort”
stopping strategy [1080]. This also requires information on the joint distribution of the first
two largest degree irreducible factors.
11.2.40 Remark Estimates for the largest degree of the irreducible factors are related to the study
of smooth polynomials that play an important role in the discrete logarithm problem in
finite fields, especially in the index calculus method; see Section 11.6.
11.2.41 Definition A polynomial of degree n over Fq is m-smooth if all its irreducible factors have
degree at most m.
11.2.42 Remark In the index calculus method a search is repeated until an m-smooth polynomial is
found. Hence, the analysis of the index calculus method requires information on the number
of polynomials that are m-smooth. The generating function for the number Nq (m; n) of
monic polynomials over Fq of degree n which are m-smooth is
$$S_m(z) = \sum_{n \ge 0} N_q(m; n)\, z^n = \prod_{k=1}^{m} \left(\frac{1}{1 - z^k}\right)^{I_k}.$$
For the cryptographic applications, m tends to infinity with n. More precisely, we have
$m = \sqrt{n \log n}/\sqrt{2 \log 2}$; see [2306]. Hence, singularity analysis does not apply since analytic
continuation is not possible. Odlyzko [2306] uses the saddle point method for deriving an
asymptotic estimate for the numbers $N_q(m; n)$ as $n \to \infty$, uniformly for m in the range
$n^{1/100} \le m \le n^{99/100}$. (Actually, his results hold for $n^{\delta} \le m \le n^{1-\delta}$, where δ > 0.)
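For small parameters the numbers $N_q(m; n)$ can be read off the generating function directly. The following sketch expands the product $\prod_{k \le m} (1 - z^k)^{-I_k}$ up to degree `n_max`, taking $I_k$ (the number of monic irreducible polynomials of degree k) from the Möbius-inversion formula $I_k = \frac{1}{k}\sum_{d \mid k} \mu(d) q^{k/d}$. The function names are illustrative, and the method is exact coefficient extraction, not the saddle point analysis of [2306].

```python
def mobius(n):
    """Moebius function via trial division."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0  # n has a squared prime factor
            result = -result
        d += 1
    if n > 1:
        result = -result
    return result

def num_irreducible(q, k):
    """Number of monic irreducible polynomials of degree k over F_q."""
    total = sum(mobius(d) * q ** (k // d) for d in range(1, k + 1) if k % d == 0)
    return total // k

def smooth_counts(q, m, n_max):
    """Coefficients N_q(m; 0), ..., N_q(m; n_max) of the product over k <= m."""
    coeffs = [1] + [0] * n_max
    for k in range(1, m + 1):
        # multiply by 1/(1 - z^k) once for each irreducible of degree k
        for _ in range(num_irreducible(q, k)):
            for j in range(k, n_max + 1):
                coeffs[j] += coeffs[j - k]
    return coeffs

counts = smooth_counts(2, 2, 4)
print(counts)                 # numbers of 2-smooth monic polynomials over F_2, degrees 0..4
print(counts[4] / 2 ** 4)     # proportion of 2-smooth polynomials of degree 4
```

When m ≥ n every monic polynomial of degree n is m-smooth, so the coefficient must equal $q^n$; this is a convenient consistency check.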
11.2.43 Remark A variant of the index calculus method over $\mathbb{F}_{2^n}$ (the Waterloo algorithm) was
introduced in [306]; see also [2306]. It improves the running time of the method but it does
not improve its asymptotic order. The running time was proven rigorously by Drmota and
Panario [921] using a bivariate saddle point analysis that follows closely Odlyzko’s estimates
in [2306].
11.2.44 Remark For related estimates for the index calculus method without using smooth
polynomials see [1212, 1213]. The fastest variant of the index calculus method for $\mathbb{F}_{2^n}$ is still
due to Coppersmith [717]; see Section 11.6 for more details and references.
11.2.45 Remark We focus now on the smallest degrees of the irreducible factors of a polynomial.
These estimates are useful, for example, to analyze Ben-Or’s irreducibility test; see Section
11.3. This analysis requires the study of the probability that a random polynomial of degree
n contains no irreducible factors of degree up to a certain value m (such polynomials are
sometimes called m-rough); these probabilities are related to the Buchstab function.
11.2.46 Definition The Buchstab function is the unique continuous solution of the difference-
differential equation
$$u\omega(u) = 1 \quad (1 \le u \le 2), \qquad (u\omega(u))' = \omega(u-1) \quad (u > 2).$$
11.2.47 Remark This function was introduced by Buchstab [442] when studying the analogous
problem for integers, that is, numbers with no small prime factors. This function
has been studied extensively [2788]. It is known that it tends quickly to $e^{-\gamma} = 0.56145\ldots$, where
γ is Euler’s constant.
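The fast convergence of ω(u) to $e^{-\gamma}$ can be observed numerically. The sketch below tabulates $W(u) = u\omega(u)$ from Definition 11.2.46 with the trapezoidal rule, using $W'(u) = W(u-1)/(u-1)$ for u > 2. The grid size, the cutoff, and the hard-coded value of Euler's constant are assumptions of this example.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded for this sketch

def buchstab_table(u_max=10, steps_per_unit=1000):
    """Tabulate W(u) = u*omega(u) on [1, u_max]; index i corresponds to u = 1 + i*h."""
    n = steps_per_unit
    h = 1.0 / n
    w = [1.0] * (n + 1)  # u*omega(u) = 1 for 1 <= u <= 2
    for i in range(n, (u_max - 1) * n):
        # W'(u) = omega(u - 1) = W(u - 1)/(u - 1), and u - 1 = i*h at index i
        slope0 = w[i - n] / (i * h)
        slope1 = w[i + 1 - n] / ((i + 1) * h)
        w.append(w[i] + 0.5 * h * (slope0 + slope1))
    return w, h

def omega(w, h, u):
    """Look up omega(u) at a grid point u in [1, u_max]."""
    i = round((u - 1) / h)
    return w[i] / (1 + i * h)

w, h = buchstab_table()
for u in (2, 3, 4, 6, 10):
    print(u, omega(w, h, u), math.exp(-EULER_GAMMA))
```

On 2 ≤ u ≤ 3 the equation integrates to ω(u) = (1 + log(u − 1))/u, which provides an exact value to compare against at u = 3.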
11.2.48 Remark Car [510] gives estimates for m-roughness that depend on the Buchstab function
for m large with respect to n, say $m > c_1 n \log\log n/\log n$. Gao and Panario [1187] show
that for m small with respect to n, say $m < c_2 \log n$, the estimate $e^{-\gamma}/m$ holds; see also
[2351].
11.2.49 Remark The probability that a random polynomial is m-rough is studied for the complete
range 1 ≤ m ≤ n by Panario and Richmond [2352]. The estimates are in terms
of the Buchstab function when m → ∞. When m is fixed, the singularity analysis of Flajolet
and Odlyzko is applied.
11.2.50 Theorem [2352] The smallest degree $S_n$ among the irreducible factors of a random
polynomial of degree n over $\mathbb{F}_q$ satisfies
$$\Pr(S_n \ge m) = \frac{1}{m}\,\omega\!\left(\frac{n}{m}\right) + O\!\left(\max\left\{\frac{1}{m^2}, \frac{\log n}{mn}\right\}\right),$$
when m tends to infinity with n.
11.2.51 Remark Using Theorem 11.2.50 it is not difficult to prove that the expected smallest
degree among the irreducible factors of a random polynomial is asymptotic to $e^{-\gamma} \log n$
[2352]. More generally, the expected r-th smallest degree among the irreducible factors of a
random polynomial is asymptotic to $e^{-\gamma} \log^r n / r!$.
11.2.52 Remark As pointed out in Remark 11.2.10, two polynomials over Fq are coprime with
probability 1 − 1/q. Much more can be said about the distribution of the degrees of the
irreducible factors in the gcd of several polynomials. Indeed, the limiting distribution of a
random variable counting the total degree of the greatest common divisor of two or more
random univariate polynomials over the finite field Fq is geometric, and the distributions of
random variables counting the number of common factors (with and without repetitions)
are very close to Poisson distributions when q is large. The main reference for these results,
from where we extract a couple of main theorems, is [1189]. For simplicity we state the
results for two polynomials but they immediately generalize to several polynomials.
11.2.53 Theorem [1189] Let us consider two polynomials over Fq of degrees n1 and n2 , respectively,
and the random variables Zd (n1 , n2 ) for the number of distinct irreducible factors in the
gcd, Zr (n1 , n2 ) for the number of irreducible factors in the gcd counting repetitions, and
Zt (n1 , n2 ) for the total degree of the gcd of the two polynomials. Then, as n1 → ∞ and
n2 → ∞ and for I(z) the generating function of irreducible polynomials, we have
1. the probability generating function for $Z_d$ is
$$P_D(u) = \exp\left(-\sum_{m \ge 1} \frac{(1-u)^m}{m}\, I\!\left(q^{-2m}\right)\right);$$
11.2.54 Remark As a corollary we obtain, for example, the probability that the gcd has zero
(coprime polynomials), one or two irreducible factors:
$P(Z_r = 0) = 1 - 1/q$,
$P(Z_r = 1) = (1 - 1/q)\, I(1/q^2)$, and
$P(Z_r = 2) = \tfrac{1}{2}(1 - 1/q)\left(I^2(1/q^2) + I(1/q^4)\right)$.
We also obtain, from the total degree results, that the probability that the gcd has degree
k is asymptotic to $q^{-k}(1 - q^{-1})$ as the degrees grow.
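The geometric law for the degree of the gcd is easy to observe by exhaustive enumeration over $\mathbb{F}_2$. The sketch below assumes polynomials over $\mathbb{F}_2$ are encoded as Python integers (bit i holds the coefficient of $x^i$) and tabulates the degree of the gcd over all pairs of polynomials of degrees n1 and n2. The encoding and the function names are choices made for this example.

```python
from collections import Counter

def gf2_mod(a, b):
    """Remainder of a divided by b in F_2[x], polynomials encoded as bitmasks."""
    db = b.bit_length()
    while a.bit_length() >= db:
        a ^= b << (a.bit_length() - db)
    return a

def gf2_gcd(a, b):
    """Euclidean algorithm in F_2[x]."""
    while b:
        a, b = b, gf2_mod(a, b)
    return a

def gcd_degree_distribution(n1, n2):
    """Count pairs (a, b) with deg a = n1, deg b = n2 by the degree of gcd(a, b)."""
    counts = Counter()
    for a in range(1 << n1, 1 << (n1 + 1)):
        for b in range(1 << n2, 1 << (n2 + 1)):
            counts[gf2_gcd(a, b).bit_length() - 1] += 1
    return counts

dist = gcd_degree_distribution(4, 4)
total = sum(dist.values())
for k in sorted(dist):
    print(k, dist[k] / total)   # compare with q^(-k) * (1 - 1/q) for q = 2
```

For k smaller than both degrees the observed proportions agree with $q^{-k}(1 - q^{-1})$; the largest possible degree collects the remaining mass, which is why the statement in the text is asymptotic in the degrees.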
11.2.55 Remark Exact values of these probabilities can be computed for small values of the degrees
n1 and n2 and of field size q. For tables of probabilities for a few common irreducible factors
(counted with, or without repetitions), and for the mean and variance of Zd and Zr for small
values of q, see [1189].
11.2.56 Remark There are several analogies among the irreducible decomposition of polynomi-
als over finite fields, the prime decomposition of integers, and the cycle decomposition of
permutations. We exemplify below with several specific results and then give a heuristic
argument to justify these analogies.
3. $M_3(q)$: the class of all monic square-free polynomials over $\mathbb{F}_q$ whose irreducible
factors have distinct degrees. In this class the number of polynomials of degree
n is $a(n, q)q^n$, where $a(n, q) \to a(q) := \prod_{k \ge 1} (1 + I_k q^{-k}) \exp(-1/k)$ as $n \to \infty$.
Let us define, for x > 0,
$$\Phi_n(x) := \left\{ \lambda \vdash n : \left| \log m(\lambda) - \tfrac{1}{2}(\log n)^2 \right| > \tfrac{x}{\sqrt{3}} (\log n)^{3/2} \right\}.$$
11.2.59 Theorem [901] Fix one of the classes $M_i(q)$ described above. For each $\lambda \vdash n$, let $w_i(\lambda, q)$
denote the proportion of polynomials in this class whose factorizations have shape λ. Then
there exists a constant $c_0 > 0$ (independent of the class) such that for each x ≥ 1 there
exists $n_0(x)$ such that
$$\sum_{\lambda \in \Phi_n(x)} w_i(\lambda, q) \le c_0 e^{-x/4} \quad \text{for all } q \text{ and all } n \ge n_0(x).$$
In particular, almost all polynomials of degree n over $\mathbb{F}_q$ in $M_i(q)$ have splitting fields of
degree $\exp((\tfrac{1}{2} + o(1))(\log n)^2)$, as $n \to \infty$.
11.2.60 Theorem [901] In each of the classes described above the average degree $E_n(q)$ of a splitting
field of a polynomial of degree n in that class satisfies
$$\log E_n(q) = C\sqrt{\frac{n}{\log n}} + O\!\left(\frac{\sqrt{n}\,\log\log n}{\log n}\right),$$
uniformly in q, and for C = 2.99047... an explicitly defined constant.
11.2.61 Remark The constant C in the above theorem was obtained by Goh and Schmutz [1290]
and Stong [2724] in their study of the analogous problem in the symmetric group $S_n$. A
permutation in $S_n$ is of type $\lambda = 1^{k_1} 2^{k_2} \cdots n^{k_n}$ if it has exactly $k_s$ cycles of length s for
each s, and its order is then equal to m(λ). If w(λ) denotes the proportion of permutations
in $S_n$ which are of type λ, then the average order of a permutation in $S_n$ is equal to
$E_n := \sum_{\lambda \vdash n} w(\lambda) m(\lambda)$. We can think of m(λ) as a random variable where λ ranges over
the partitions of n and the probability of λ is w(λ). Properties of the random variable m(λ)
(and related random variables) under the distribution w(λ) have been studied by Erdős and
Turán [983, 984, 985, 986]. In particular, the distribution of log m(λ) is approximated by a
normal distribution with mean $\tfrac{1}{2}(\log n)^2$ and variance $\tfrac{1}{3}(\log n)^3$ in a precise sense [985].
11.2.62 Remark There is a general heuristic that Flajolet, Gourdon, and Panario [1080] call the
permutation model. Probabilistic properties of the decomposition of polynomials into irre-
ducible factors are expected to have a shape resembling (as q → ∞) that of the correspond-
ing properties of the cycle decomposition of permutations. As the cardinality q of the finite
field Fq goes to infinity (with n staying fixed!), the joint distribution of the degrees of the
irreducible factors in a random polynomial of degree n converges to the joint distribution
of the lengths of cycles in a random permutation of size n. As stated in [1080]: This prop-
erty is visible at the generating functions level when any generating function of polynomials
taken at z/q converges (as q → ∞) to the corresponding exponential generating function of
permutations. For example, the generating function of monic polynomials, when normalized
with the change of variable $z \mapsto z/q$, is the exponential generating function of permutations:
$$P\!\left(\frac{z}{q}\right) = \frac{1}{1 - z} = \sum_{n=0}^{\infty} n!\, \frac{z^n}{n!}.$$
Similarly, we have
$$I\!\left(\frac{z}{q}\right) \to \log\frac{1}{1 - z} = \sum_{n=1}^{\infty} (n-1)!\, \frac{z^n}{n!} \qquad (q \to \infty).$$
11.2.63 Remark Several of the results for polynomials and permutations have been generalized to
problems in the exp-log combinatorial class; see for instance [1084, 1085, 1341, 1342, 2351].
The exp-log class includes problems such as 2-regular graphs, several types of permutations,
random mappings (functional digraphs), polynomials over finite fields, random mapping
patterns for unlabelled objects, and arithmetical semigroups.
11.2.64 Remark Relations between the irreducible decomposition of polynomials over finite fields
and the prime decomposition of integers are discussed in detail in Section 13.1.
See Also
References Cited: [231, 306, 442, 509, 510, 537, 670, 665, 717, 901, 921, 983, 984, 985, 986,
1080, 1081, 1083, 1084, 1085, 1187, 1189, 1190, 1212, 1213, 1226, 1235, 1239, 1290, 1341,
1342, 1564, 1667, 1760, 1761, 1762, 1763, 1765, 1939, 2306, 2307, 2347, 2349, 2350, 2351,
2352, 2724, 2788, 2833, 2983, 3083]
11.3.1 Introduction
We consider the problem of testing polynomials for irreducibility and constructing irre-
ducible polynomials of prescribed degree. Such constructions are essential in many compu-
tations with finite fields, including cryptosystems, error-correcting codes, random number
generators, combinatorial designs, complexity theory, and many other mathematical com-
putations. In particular, they allow us to construct finite fields of specified order. We confine
ourselves to univariate polynomials in this section. Factorization algorithms and irreducibil-
ity tests for multivariate polynomials are discussed in Section 11.5.
11.3.1 Definition The following notation will be used in the analyses of algorithms.
M(n) : N → N is defined such that two polynomials over a field F of degree at most n
can be multiplied with O(M(n)) operations in F. $M(n) = n^2$ using the “school”
method, while $M(n) = n \log n \log\log n$ for FFT-based methods [498].
MM(n) : N → N is defined such that two n × n matrices over a field F can be
multiplied with O(MM(n)) operations in F. Using the standard algorithm,
$MM(n) = n^3$, and $MM(n) = n^{2.3727}$ for the best currently known algorithm
[2985].
For f, g : R → R, $f = \tilde{O}(g)$ if $f = O(g(\log |g|)^c)$ for some absolute constant c ≥ 0.
11.3.2 Remark An essential early construction for certifying irreducibility, as well as for factoring,
was established by Petr in 1937 [2391], though it was not presented in algorithmic terms.
11.3.3 Definition Let $f \in \mathbb{F}_q[x]$ be a squarefree polynomial of degree n over a finite field $\mathbb{F}_q$. The
Petr/Berlekamp matrix $Q \in \mathbb{F}_q^{n \times n}$ of f is defined such that, for $0 \le i < n$,
$$x^{iq} \equiv \sum_{0 \le j < n} Q_{ij}\, x^j \pmod{f}.$$
This is the matrix representation of the Frobenius map ($a \mapsto a^q \pmod{f}$) on the basis
$\langle 1, x, x^2, \ldots, x^{n-1} \rangle$ for $\mathbb{F}_q[x]/(f)$.
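11.3.4 Theorem [2391, 2567, 2570] Let $f = f_1^{e_1} \cdots f_k^{e_k} \in \mathbb{F}_q[x]$ have degree n, for distinct irreducible
$f_1, \ldots, f_k \in \mathbb{F}_q[x]$ of degrees $d_1, \ldots, d_k$, and have Petr/Berlekamp matrix $Q \in \mathbb{F}_q^{n \times n}$.
The characteristic polynomial $\det(Q - \lambda I) \in \mathbb{F}_q[\lambda]$ of Q satisfies
$$\det(Q - \lambda I) = (-1)^n \prod_{1 \le i \le k} (\lambda^{d_i} - 1) \cdot \lambda^{\sum_{i} d_i (e_i - 1)}.$$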
11.3.5 Remark Theorem 11.3.4 could, in principle, have been cast as an algorithm for testing irre-
ducibility with the technology of the day, using the method for computing the characteristic
polynomial of a matrix by Danilevsky [768] from 1937. For an irreducible polynomial f of
degree n, the characteristic polynomial of Q has the form $\lambda^n - 1$ (up to sign).
11.3.6 Remark In 1954 Butler [468] presented an explicit method for determining the number of
irreducible factors of a polynomial based on the following theorem.
11.3.7 Theorem [468] Let $f \in \mathbb{F}_q[x]$ have degree n, with Petr/Berlekamp matrix $Q \in \mathbb{F}_q^{n \times n}$. Then
rank(Q − I) = n − k, where k is the number of distinct irreducible factors of f . A squarefree
f is irreducible if and only if rank(Q − I) = n − 1.
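Butler's rank criterion translates directly into code. The following sketch works over a prime field $\mathbb{F}_p$ only (so q = p), represents polynomials as coefficient lists with the constant term first, builds the Petr/Berlekamp matrix row by row from $x^{ip} \bmod f$, and returns n − rank(Q − I). All function names are illustrative, and the arithmetic is schoolbook, not the fast arithmetic assumed in the cost estimates. It relies on `pow(a, -1, p)`, available in Python 3.8 and later.

```python
def trim(a):
    """Drop leading zero coefficients in place (constant term is a[0])."""
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_mul(a, b, p):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out

def poly_rem(a, f, p):
    """Remainder of a modulo f over F_p; f must have a nonzero leading coefficient."""
    a = trim([c % p for c in a])
    inv = pow(f[-1], -1, p)
    while len(a) >= len(f):
        c = a[-1] * inv % p
        shift = len(a) - len(f)
        for i, fc in enumerate(f):
            a[shift + i] = (a[shift + i] - c * fc) % p
        trim(a)
    return a

def poly_powmod(base, e, f, p):
    result, base = [1], poly_rem(base, f, p)
    while e:
        if e & 1:
            result = poly_rem(poly_mul(result, base, p), f, p)
        base = poly_rem(poly_mul(base, base, p), f, p)
        e >>= 1
    return result

def berlekamp_matrix(f, p):
    """Row i holds the coefficients of x^(i*p) mod f."""
    n = len(f) - 1
    xp = poly_powmod([0, 1], p, f, p)
    rows, cur = [], [1]
    for _ in range(n):
        rows.append(cur + [0] * (n - len(cur)))
        cur = poly_rem(poly_mul(cur, xp, p), f, p)
    return rows

def rank_mod_p(m, p):
    """Rank over F_p by Gauss-Jordan elimination."""
    m = [row[:] for row in m]
    rank = 0
    for c in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][c] % p), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][c], -1, p)
        m[rank] = [v * inv % p for v in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][c] % p:
                factor = m[r][c]
                m[r] = [(v - factor * w) % p for v, w in zip(m[r], m[rank])]
        rank += 1
    return rank

def count_distinct_factors(f, p):
    """Number of distinct irreducible factors of f over F_p: n - rank(Q - I)."""
    n = len(f) - 1
    q = berlekamp_matrix(f, p)
    for i in range(n):
        q[i][i] = (q[i][i] - 1) % p
    return n - rank_mod_p(q, p)

print(count_distinct_factors([1, 1, 0, 0, 1], 2))  # x^4 + x + 1 over F_2
print(count_distinct_factors([1, 0, 0, 1], 2))     # x^3 + 1 = (x + 1)(x^2 + x + 1) over F_2
```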
11.3.8 Remark The Petr/Butler approach was developed into a complete algorithm for polynomial
factorization by Berlekamp [230] in 1967. While the cost of Butler’s method was not analyzed
in the modern sense, it is straightforward to see that constructing Q and taking the rank of Q − I
costs O(M(n)(n + log q) + MM(n)), or $\tilde{O}(n \log q + MM(n))$, operations in Fq .
11.3.9 Remark In 1980, Rabin [2434] based a more efficient test around the following theorem. It
was originally presented for polynomials over prime fields, but works for polynomials over
any finite field Fq .
11.3.10 Theorem A polynomial $f \in \mathbb{F}_q[x]$ of degree n ≥ 1 is irreducible if and only if
1. $f \mid x^{q^n} - x$, and
2. $\gcd(x^{q^m} - x, f) = 1$ for all proper divisors m of n (it suffices to check m = n/t
for the prime divisors t of n).
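The two conditions of Theorem 11.3.10 can be checked as follows. This is a minimal sketch over a prime field $\mathbb{F}_p$ (q = p), with polynomials stored as coefficient lists (constant term first) and schoolbook arithmetic. It tests the gcd condition only for m = n/t with t a prime divisor of n, which suffices because every proper divisor of n divides some n/t. The helper names are illustrative, and `pow(a, -1, p)` requires Python 3.8 or later.

```python
def trim(a):
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_mul(a, b, p):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out

def poly_rem(a, f, p):
    """Remainder of a modulo f over F_p."""
    a = trim([c % p for c in a])
    inv = pow(f[-1], -1, p)
    while len(a) >= len(f):
        c = a[-1] * inv % p
        shift = len(a) - len(f)
        for i, fc in enumerate(f):
            a[shift + i] = (a[shift + i] - c * fc) % p
        trim(a)
    return a

def poly_sub(a, b, p):
    n = max(len(a), len(b))
    return trim([((a[i] if i < len(a) else 0) - (b[i] if i < len(b) else 0)) % p
                 for i in range(n)])

def poly_gcd(a, b, p):
    a, b = trim([c % p for c in a]), trim([c % p for c in b])
    while b:
        a, b = b, poly_rem(a, b, p)
    return a

def poly_powmod(base, e, f, p):
    result, base = [1], poly_rem(base, f, p)
    while e:
        if e & 1:
            result = poly_rem(poly_mul(result, base, p), f, p)
        base = poly_rem(poly_mul(base, base, p), f, p)
        e >>= 1
    return result

def prime_divisors(n):
    out, d = [], 2
    while d * d <= n:
        if n % d == 0:
            out.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        out.append(n)
    return out

def is_irreducible_rabin(f, p):
    """Rabin's test for f over F_p, f given with nonzero leading coefficient."""
    n = len(f) - 1
    x = [0, 1]

    def frobenius_minus_x(m):
        # (x^(p^m) mod f) - x, computed by m successive p-th powers modulo f
        h = x
        for _ in range(m):
            h = poly_powmod(h, p, f, p)
        return poly_sub(h, x, p)

    for t in prime_divisors(n):
        if len(poly_gcd(f, frobenius_minus_x(n // t), p)) != 1:
            return False          # f shares a factor with x^(p^(n/t)) - x
    return not poly_rem(frobenius_minus_x(n), f, p)   # f divides x^(p^n) - x

print(is_irreducible_rabin([1, 1, 0, 0, 1], 2))   # x^4 + x + 1 over F_2
print(is_irreducible_rabin([1, 0, 1, 0, 1], 2))   # x^4 + x^2 + 1 = (x^2 + x + 1)^2
```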
the bit complexity model by Kedlaya and Umans [1722]. These are slower than the above
irreducibility tests; see Section 11.4.
11.3.19 Remark Density estimates for irreducible polynomials easily reduce the problem of finding
irreducible polynomials to that of certifying a polynomial is irreducible, assuming we have
a way to generate random field elements. Such an approach was even suggested by Galois
in 1830 [1168]. However, one can do considerably better than the most naïve reduction,
and both the asymptotic complexity of this problem and more practical concerns hold
considerable interest.
11.3.20 Remark Gauss gives an explicit formula for the number $I_q(n)$ of monic irreducible polynomials of
degree n over $\mathbb{F}_q$, from which it is easily derived that $q^n/(2n) \le I_q(n) \le q^n/n$ (see [1939], Exercises
3.26 and 3.27). Thus, given a way to randomly generate elements of Fq , any algorithm for
testing irreducibility also yields a probabilistic algorithm for finding irreducible polynomials
of any specified degree: a random monic polynomial of degree n is irreducible with probability
at least 1/(2n), so the expected number of operations is O(n) times the cost of the chosen
irreducibility test.
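Gauss's formula, $I_q(n) = \frac{1}{n} \sum_{d \mid n} \mu(d)\, q^{n/d}$, is short enough to evaluate directly. The sketch below computes $I_q(n)$ and the resulting density of irreducible polynomials among monic polynomials of degree n, whose reciprocal is the expected number of random trials. The function names are illustrative.

```python
def mobius(n):
    """Moebius function via trial division."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0  # squared prime factor
            result = -result
        d += 1
    if n > 1:
        result = -result
    return result

def num_irreducible(q, n):
    """Number of monic irreducible polynomials of degree n over F_q."""
    total = sum(mobius(d) * q ** (n // d) for d in range(1, n + 1) if n % d == 0)
    return total // n

def expected_trials(q, n):
    """Expected number of uniform random monic degree-n polynomials drawn
    before an irreducible one is found."""
    return q ** n / num_irreducible(q, n)

for n in (2, 8, 64, 163):
    print(n, num_irreducible(2, n), expected_trials(2, n))
```

The expected number of trials stays between n and 2n, in line with the bounds quoted above.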
11.3.21 Remark In 1981, Ben-Or [223] described a simple probabilistic algorithm for constructing
an irreducible polynomial with a better expected cost than the straightforward approach
suggested in Remark 11.3.20.
variance of the smallest degree among the irreducible factors of a random polynomial of
degree n. Results are stated in terms of the Buchstab function, a classical number-theoretic
function used in the study of numbers with no prime factors smaller than some bound;
see [2350] for definitions and details. They also show that the expected average-case cost
of an irreducibility test in Ben-Or’s algorithm is O(M(n) log(n)(log n + log q)) operations
in Fq , and provide an asymptotic analysis with explicit constants for this complexity. The
expected cost of actually finding an irreducible polynomial is n times this cost.
11.3.28 Remark Von zur Gathen and Gerhard [1227, Section 14.9] suggest that a reasonable
approach to finding irreducible polynomials might be a hybrid of Ben-Or’s algorithm, with
a switch to the irreducibility test of Rabin when testing for factors of degrees over some
prescribed bound such as $\log_2 n$.
11.3.29 Remark Shoup [2629] gives an asymptotically faster probabilistic algorithm for constructing
an irreducible polynomial of degree n, which requires $O((n^2 \log n + n \log q) \log n \log\log n)$
operations in $\mathbb{F}_q$. It also has the benefit of reducing the use of randomness to O(n) random
elements from $\mathbb{F}_q$ (the algorithms described above all require an expected $O(n^2)$ random
elements). Shoup’s algorithm is very different from Rabin’s and Ben-Or’s, and more
closely resembles the deterministic constructions discussed in Section 11.3.7 below. The
algorithm proceeds by looking at each prime power $r^e$ dividing n (where r is prime). There
are a number of special cases. In particular, if r is the characteristic of $\mathbb{F}_q$, then Artin-
Schreier type polynomials can be employed (see Remark 11.3.33 below), and when r = 2 a
simple construction suffices. In the remaining cases, for each prime-power factor $r^e$ of n, it
first factors the r-th cyclotomic polynomial $\Phi_r \in \mathbb{F}_q[x]$. This can be done quickly. It then
finds an r-th power non-residue $\xi \in \mathbb{F}_q[\theta]$, where θ is an adjoined root of $\Phi_r$; then $x^{r^e} - \xi$
is irreducible of degree $r^e$ in $\mathbb{F}_q[\theta][x]$, from which an irreducible polynomial of degree $r^e$
over $\mathbb{F}_q$ is constructed via traces. The good asymptotic cost is maintained through a
fast algorithm for finding minimal polynomials in algebraic extensions, employing a very
elegant use of the “Transposition Principle” (also known as Tellegen’s theorem). Despite
the intricate construction, the polynomial produced is uniformly selected from the set of all
irreducible polynomials of degree n in $\mathbb{F}_q[x]$.
11.3.30 Remark An algorithm for finding irreducible polynomials of degree n, along the lines of
Shoup’s algorithm above, is exhibited by Couveignes and Lercier [747] and has an expected
cost of $n^{1+o(1)} (\log q)^{5+o(1)}$. The key innovation is the fast construction of irreducible
polynomials (for all but some exceptional degrees) using isogenies of a random elliptic curve. This
is combined with the recent fast modular composition algorithm of Kedlaya and Umans
[1722], along with the fast algorithm of Bostan et al. [363] to compute “composed sums” of
polynomials.
11.3.31 Remark Deterministic algorithms for constructing irreducible univariate polynomials of
specified degree over a finite field of characteristic p are considerably more difficult. The goal is
algorithms which require time polynomial in n and log p. Given such an algorithm it is
straightforward to construct an irreducible polynomial over an extension field Fq of Fp ,
at least up to polynomial time in n and log q. We thus concern ourselves with constructing
irreducible polynomials over prime fields Fp . A more efficient algorithm for non-prime fields
is established by Shoup [2625].
11.3.32 Remark The algorithms of Chistov from 1984 [621] and Semaev [2580] are the first to
require time $(np)^{O(1)}$, and hence are effective over small finite fields; see also [2856].
11.3.33 Remark In 1986 Adleman and Lenstra [14] exhibited an algorithm for constructing
irreducible polynomials which runs in polynomial time in n and log p, assuming that the
Extended Riemann Hypothesis is true. Their algorithm proceeds by first factoring $n = p^e n_0$,
where $\gcd(n_0, p) = 1$. An extension of degree $n_0$ of $\mathbb{F}_p$ is built, and then an extension of
degree $p^e$ is constructed on top of that. A minimal polynomial of a generating element is
irreducible of degree n. The degree $n_0$ extension requires finding a prime $r \equiv 1 \pmod{n_0}$
such that p is inert in the unique subfield K of the r-th cyclotomic field such that $[K : \mathbb{Q}] = n_0$.
From this an extension of $\mathbb{F}_p$ of degree $n_0$ is constructed via Gauss periods; see Section 5.3.
To find the inert primes, they simply test numbers of the form $n_0 k + 1$ for k = 1, 2, . . . in
sequence for primality and inertness. The Extended Riemann Hypothesis guarantees that a
suitable $k = O(n_0^4 (\log np)^2)$ exists, but without this assumption (or randomization) there is no way
known to find such a prime r. To find an extension of degree $p^e$, Artin-Schreier polynomials of
the form $x^p - x - \alpha$ are employed, which are irreducible over a finite extension L of $\mathbb{F}_p$
when $\alpha \in L$ and $\alpha \ne \beta^p - \beta$ for all $\beta \in L$. A tower of fields is constructed, starting with $\mathbb{F}_{p^{n_0}}$,
and ending with $\mathbb{F}_{p^n} = \mathbb{F}_{p^{n_0 p^e}}$.
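The Artin-Schreier criterion is easy to test in the simplest case, the prime field itself. Over $\mathbb{F}_p$ every β satisfies $\beta^p - \beta = 0$, so $x^p - x - \alpha$ should be irreducible exactly when α ≠ 0. The sketch below confirms this for small p by brute-force trial division by all monic polynomials of degree at most p/2. This exhaustive check is only an illustration for tiny p and is not the construction of [14]; the function names are hypothetical.

```python
from itertools import product

def trim(a):
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_rem(a, f, p):
    """Remainder of a modulo f over F_p (coefficient lists, constant term first)."""
    a = trim([c % p for c in a])
    inv = pow(f[-1], -1, p)
    while len(a) >= len(f):
        c = a[-1] * inv % p
        shift = len(a) - len(f)
        for i, fc in enumerate(f):
            a[shift + i] = (a[shift + i] - c * fc) % p
        trim(a)
    return a

def is_irreducible_bruteforce(f, p):
    """Trial division by every monic polynomial of degree 1..deg(f)//2."""
    n = len(f) - 1
    for d in range(1, n // 2 + 1):
        for low in product(range(p), repeat=d):
            if not poly_rem(f, list(low) + [1], p):
                return False
    return True

def artin_schreier(alpha, p):
    """Coefficient list of x^p - x - alpha over F_p."""
    f = [0] * (p + 1)
    f[0] = -alpha % p
    f[1] = -1 % p
    f[p] = 1
    return f

for p in (2, 3, 5):
    print(p, [is_irreducible_bruteforce(artin_schreier(a, p), p) for a in range(p)])
```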
11.3.34 Remark Evdokimov [1017], in 1986, independently presented a similar result to Adleman
and Lenstra’s. In this construction n is factored as a product $r_1^{e_1} \cdots r_\ell^{e_\ell}$ for distinct primes
$r_i$, and an irreducible polynomial of each degree $r_i^{e_i}$ is constructed. When $r_i = p$ the Artin-
Schreier polynomials are again employed. When $r_i \ne p$ he employs a reducibility theorem
for the polynomial $x^{r_i^{e_i}} - a$ (where a lies in a small cyclotomic extension) which requires
finding $r_i$-th non-residues, small examples of which are ensured by the Extended Riemann
Hypothesis. Irreducible polynomials of different prime power degrees are combined by
computing the minimal polynomial of the sum of the roots of the two polynomials. This is easily
accomplished with basic linear algebra, though the “composed sum” algorithm of [363] can
now accomplish this in quasi-linear time; see Remark 11.3.30 above.
11.3.35 Remark In 1989 Shoup [2624, 2625] gave the currently fastest completely unconditional
deterministic algorithm for finding irreducible polynomials, along similar lines to [1017].
He demonstrates a polynomial-time (in n and log p) reduction from the problem of finding
irreducible polynomials of degree n to the problem of factoring polynomials in $\mathbb{F}_p[x]$. Shoup
shows that if, for every prime r | n, we are given a representation of a splitting field K of
$x^r - 1$ (i.e., a non-trivial irreducible factor of $x^r - 1$), and an r-th nonresidue in K, then
we can construct an irreducible polynomial of degree n. Both finding the factor of $x^r - 1$
and the non-residue (by repeatedly taking r-th roots in K) can be accomplished through
factorization. Shoup [2624, 2625] also provides the fastest (unconditional) deterministic
algorithm for factoring polynomials over $\mathbb{F}_p[x]$. Employing this in his irreducible polynomial
construction method yields a deterministic algorithm requiring $O(p^{1/2} (\log p)^3 n^3 (\log n)^c +
(\log p)^2 n^4 (\log n)^c)$ or $\tilde{O}(p^{1/2} n^3 + n^4)$ operations in $\mathbb{F}_p$, for some absolute constant c > 0.
11.3.36 Remark If the requirement that the degree of the constructed polynomial in Fp [x] be exactly
n is relaxed, and only an approximation of this degree is required, some unconditional
algorithms are known.
11.3.37 Remark In 1986, von zur Gathen [1219] gave a polynomial-time algorithm (in n and log p)
which finds an irreducible polynomial of degree at least n, assuming we can do preprocessing
only with respect to p (and requiring time polynomial in p).
11.3.38 Remark Adleman and Lenstra [14] present an algorithm which outputs an irreducible
polynomial f ∈ Fp [x] with n/(c log p) < deg f ≤ n, for some absolute constant c, and
requires time polynomial in n and log p.
11.3.39 Remark For a fixed prime p and ñ ∈ N, Shparlinski [2639, Theorem 1] gives a
deterministic, unconditional algorithm that finds an irreducible polynomial of degree
$\tilde{n} \cdot (1 + \exp(-(\log\log \tilde{n})^{1/2 - o(1)}))$, and requires time polynomial in p and log ñ.
11.3.40 Remark For a sufficiently large q̃, Shparlinski [2639, Theorem 2] provides a deterministic,
unconditional algorithm that constructs a prime p, an integer n ≥ 0, and an irreducible
polynomial $f \in \mathbb{F}_p[x]$ of degree n, such that $p^n = \tilde{q} + o(\tilde{q})$. The algorithm requires time
polynomial in log q̃; see also [2636, Section 2].
See Also
[1227, Section 14.9] For an exposition of Rabin’s algorithm for testing irreducible
polynomials, and Ben-Or’s algorithm for generating irreducible
polynomials, and their complexity analyses.
[2636, Section 2] For early deterministic algorithms for constructing irreducible
polynomials; see also the introduction of [2625].
References Cited: [14, 223, 230, 363, 468, 482, 498, 621, 747, 768, 1017, 1168, 1187, 1219,
1227, 1239, 1667, 1722, 1939, 2349, 2350, 2357, 2391, 2434, 2567, 2570, 2580, 2624, 2625,
2629, 2636, 2639, 2856, 2985]
11.4.1 Remark The pioneering algorithms for the factorization of univariate polynomials over a
finite field are due to Berlekamp [230, 232]. Subsequent improvements of the asymptotic
running time were presented by Cantor and Zassenhaus [499], von zur Gathen and Shoup
[1239], Huang and Pan [1554], and Kaltofen and Shoup [1667]. The currently fastest method
yields the following main result of this section.
11.4.2 Theorem [1722] The factorization of a univariate polynomial of degree n over $\mathbb{F}_q$ can be
computed with an expected number $(n \log q)^{1+o(1)} \cdot (n^{1/2} + \log q)$ of bit operations.
11.4.3 Remark Ignoring asymptotically small factors, this running time corresponds to $n^{3/2} +
n \log q$ field operations. These methods rely on a long line of algorithmic developments. The
most basic nontrivial task is that of multiplying polynomials. At degree n, its cost M(n)
is discussed in Definition 11.3.1. The current record of Fürer [1148] for integer multiplication
stands at $n \log n \, 2^{O(\log^* n)}$ bit operations, where $\log^* n$ is the smallest k for which the
k-fold iterated logarithm log log · · · log n is less than 1. Fürer’s algorithm can be adapted to
multiplication of polynomials. For a k-bit prime p and two polynomials in $\mathbb{F}_p[x]$ of degree
at most n it takes O(M(n(k + log n))) bit operations. Next come division with remainder
at O(M(n)) (Sieveking [2664], Brent and Kung [401]), and univariate gcd, multipoint
evaluation, and interpolation at O(M(n) log n) field operations (Strassen [2734]). Further tasks
include squarefree factorization, with O(M(n) log n) operations (Yun [3051]). Furthermore,
the matrix multiplication exponent ω is such that $MM(n) \le n^{\omega}$; see Definition 11.3.1. It
appears in an algorithm of Brent and Kung [401] for modular composition, which computes
f ◦ g (mod h) from f , g, and h with $O(n^{1.687})$ operations.
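To fix ideas about what modular composition computes, here is a baseline sketch using Horner's rule over a prime field $\mathbb{F}_p$: it evaluates f at g, reducing modulo h after every step. This costs about deg f polynomial multiplications modulo h and is deliberately not the baby-step/giant-step method of Brent and Kung [401] nor the algorithm of [1722]. Polynomials are coefficient lists with the constant term first, and the function names are illustrative.

```python
def trim(a):
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_mul(a, b, p):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out

def poly_rem(a, f, p):
    """Remainder of a modulo f over F_p."""
    a = trim([c % p for c in a])
    inv = pow(f[-1], -1, p)
    while len(a) >= len(f):
        c = a[-1] * inv % p
        shift = len(a) - len(f)
        for i, fc in enumerate(f):
            a[shift + i] = (a[shift + i] - c * fc) % p
        trim(a)
    return a

def modular_composition(f, g, h, p):
    """Return f(g(x)) mod h over F_p by Horner's rule."""
    acc = []
    for c in reversed(f):
        acc = poly_mul(acc, g, p)        # acc <- acc * g
        if acc:
            acc[0] = (acc[0] + c) % p    # acc <- acc + c
        else:
            acc = [c % p]
        acc = poly_rem(acc, h, p)        # keep the degree below deg h
    return acc

# (x^3) composed with x^2 modulo x^4 + x + 1 over F_2, i.e. x^6 mod (x^4 + x + 1)
print(modular_composition([0, 0, 0, 1], [0, 0, 1], [1, 1, 0, 0, 1], 2))
```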
Multivariate multipoint evaluation asks for the evaluation of a multivariate polynomial,
of degree at most d in each of its r variables, at n points in $\mathbb{F}_q^r$. One can achieve this with
$O(d^{(\omega+1)(r-1)+1})$ field operations (Nüsken and Ziegler [2301]). A major advance of Kedlaya
and Umans [1722], at the heart of their subsequent results, was a modular approach to
this problem. They consider it as a problem over the integers, say for a prime field $\mathbb{F}_q$,
and solve it in a standard modular fashion, modulo various small primes. This leads to a
method using $r(d^r + q^r + n)(\log q)^{o(1)}$ bit operations. It also yields a univariate modular
composition method with $(n \log q)^{1+o(1)}$ bit operations, roughly corresponding to only O(n)
field operations. The more efficient methods use a “polynomial representation of the
Frobenius automorphism,” suggested by Erich Kaltofen, fast algorithms for it, and an interval
blocking strategy. This appeared first in von zur Gathen and Shoup [1239].
11.4.4 Remark Many other works have contributed to the factoring problem. We only mention
Serret [2600], Arwin [137], Petr [2391], Kempfert [1726], Niederreiter [2249, 2250], Kaltofen
and Lobo [1663], Gao and von zur Gathen [1178], Kaltofen and Shoup [1667], von zur
Gathen and Gerhard [1226].
11.4.5 Remark Berlekamp [230] presented a deterministic algorithm to factor in $\mathbb{F}_p[x]$ with $O(n^{\omega} +
p n^{2+o(1)})$ field operations; see [1220] for this estimate. It was improved by Shoup [2624] to
$O(p^{1/2} \log^2 p + \log p \cdot n^{2+o(1)})$ operations. Berlekamp [232] introduced the fundamental idea
of probabilistic algorithms, although it became widely accepted in computer science only
after the polynomial-time primality test of Solovay and Strassen [2694]. The factorization
algorithms mentioned are all probabilistic, in the Las Vegas sense where the output can be
verified and thus guaranteed to be correct, but the running time is a random variable (with
exponentially decaying tails). Removing randomness while conserving polynomial time is
still an open question, of great theoretical interest but presumably no practical import.
Already Berlekamp [232] observed that it boils down to the following.
11.4.6 Open Question Given the coefficients of a polynomial which is a product of linear factors
over a prime field Fp , can one find a nontrivial factor deterministically in polynomial time?
11.4.7 Remark Several papers provide steps in this direction, often assuming the Extended Rie-
mann Hypothesis: Moenck [2112], Adleman, Manders, and Miller [15], Huang [1552, 1553],
von zur Gathen [1220], Rónyai [2476, 2477, 2478, 2479], Mignotte and Schnorr [2094], Ev-
dokimov [1019, 1018], Bach, von zur Gathen, H. Lenstra [158], Rónyai and Szántó [2480],
Gao [1176], Ivanyos, Karpinski, and Saxena [1579], Ivanyos, Karpinski, Rónyai, and Saxena
[1578], Arora, Ivanyos, Karpinski, and Saxena [132].
11.4.8 Remark Computations just using field operations can be modeled by arithmetic circuits
(Strassen [2732]). All algorithms discussed here, except those of Kedlaya and Umans [1722]
and Fürer’s multiplication, are of this type. Their running time is Ω(n log q) field operations.
11.4.9 Open Question Do all arithmetic circuits over Fq that factor univariate polynomials of
degree n require Ω(n log q) arithmetic operations?
11.4.10 Remark A positive answer is known only for n = 2 (von zur Gathen and Seroussi [1237]).
11.4.11 Remark The survey articles of Kaltofen [1646, 1655], von zur Gathen and Panario [1234]
and the textbooks by Shparlinski [2637, 2641] and von zur Gathen and Gerhard [1227]
present the details of most of these algorithms, more references, and historical information.
It is intriguing that the basics of most modern algorithms go back to Legendre, Gauß, and
Galois.
11.4.12 Remark The average cost of factoring algorithms, for polynomials picked uniformly at
random from those of a fixed degree, is analyzed in Flajolet, Gourdon, and Panario [1080]
and von zur Gathen, Panario, and Richmond [1235].
See Also
References Cited: [15, 132, 137, 158, 230, 232, 401, 498, 499, 723, 1018, 1019, 1080, 1148,
1176, 1178, 1220, 1225, 1226, 1227, 1234, 1235, 1237, 1239, 1552, 1553, 1554, 1578, 1579,
1646, 1655, 1663, 1667, 1684, 1722, 1726, 2094, 2112, 2249, 2250, 2301, 2391, 2476, 2477,
2478, 2479, 2480, 2558, 2559, 2600, 2624, 2637, 2641, 2664, 2694, 2732, 2734, 2984, 3051]
We extend the univariate factorization techniques of the previous section to several variables.
Two major ingredients are the reduction from the bivariate case to the univariate one, and
the reduction from any number to two variables. We present most of the known techniques
according to the representation of the input polynomial.
11.5.1 Remark In this subsection we are concerned with different kinds of factorizations of a
multivariate polynomial f ∈ Fq [x1 , . . . , xn ] stored in dense representation.
11.5.3 Remark The representation of multivariate polynomials is an important issue, which has
been discussed from the early ages of computer algebra [761, 778, 1519, 1616, 2129, 2130,
2131, 2730, 3021].
11.5.4 Remark Separable factorization can be seen as a preprocessing step for the other factorizations
(squarefree, irreducible, and absolutely irreducible, as defined below), which allows one to
reduce to the case of separable polynomials.
11.5.5 Definition Let R be an integral domain. A polynomial f ∈ R[x] is primitive if the common
divisors in R of all the coefficients of f are invertible in R.
11.5.6 Definition Let R be a unique factorization domain of characteristic p, and let $E_p$ represent
{1} if p = 0 and $\{1, p, p^2, p^3, \ldots\}$ otherwise. If f is a primitive polynomial in R[y] of
degree d ≥ 1, then the separable decomposition of f , written Sep(f ), is defined to be
the set $\mathrm{Sep}(f) := \{(f_1, q_1, m_1), \ldots, (f_s, q_s, m_s)\} \subseteq (R[y] \setminus R) \times E_p \times \mathbb{N}$, satisfying the
following properties:
1. $f(y) = \prod_{i=1}^{s} f_i(y^{q_i})^{m_i}$,
2. for all $i \ne j$ in {1, . . . , s}, $f_i(y^{q_i})$ and $f_j(y^{q_j})$ are coprime,
3. for all i ∈ {1, . . . , s}, $m_i \bmod p \ne 0$,
4. for all i ∈ {1, . . . , s}, $f_i$ is separable and primitive,
5. for all $i \ne j$ in {1, . . . , s}, $(q_i, m_i) \ne (q_j, m_j)$.
The process of computing the separable decomposition is the separable factorization.
Let us recall that $f(d) \in \tilde{O}(g(d))$ means that $f(d) \in g(d)(\log_2(3 + g(d)))^{O(1)}$. With the
second randomized algorithm, the output is always correct, and the cost estimate is the
average of the number of operations in F taken over all the possible executions.
11.5.15 Definition For convenience, we say that a polynomial f ∈ R[x1 , . . . , xn ] is primitive (resp.
separable) in xi if it is so when seen in R[x1 , . . . , xi−1 , xi+1 , . . . , xn ][xi ].
11.5.21 Remark We do not discuss specific algorithms for computing the absolute factoriza-
tion. In fact, whenever F is a finite field, the absolutely irreducible decomposition of
f ∈ F [x1 , . . . , xn ] can be obtained from the irreducible decomposition over the algebraic
extension of F of degree deg f . For more details and advanced algorithms we refer the reader
to [617].
11.5.22 Theorem [1883, Theorem 2] Let $q = p^k$, and let f ∈ Fq [x, y] be a polynomial of degree
$d_x$ in x and $d_y$ in y. If $q \ge 10\, d_x d_y$ then Irr(f ) can be computed by factoring several
polynomials in Fq [y] whose degree sum does not exceed $d_x + d_y$, plus an expected number
of $\tilde{O}(k (d_x d_y)^{1.5})$ operations in Fp .
11.5.23 Remark If q is not sufficiently large to apply Theorem 11.5.22 then one can compute
the irreducible factorization of f over a slightly larger finite field, and then recover the
factorization over Fq by computing the norm of the factors.
The algorithm underlying Theorem 11.5.22 is summarized in the following.
11.5.24 Algorithm (Sketch of the lifting and recombination technique)
Input: a primitive and separable polynomial f ∈ Fq [x][y], of partial degrees dx in x
and dy in y.
Output: the irreducible decomposition Irr(f ) of f .
1. Normalization. If the cardinality of Fq is sufficiently large then a suitable shift of
the variable x reduces the problem to the normalized case defined as follows:
$$\deg f(0, y) = d_y \quad \text{and} \quad \mathrm{Res}\Big(f(0, y),\ \frac{\partial f}{\partial y}(0, y)\Big) \neq 0.$$
11.5.28 Remark In [1177], Gao designed the first softly quadratic time probabilistic reduction
of the factorization problem from two to one variable whenever the characteristic of the
coefficient field is zero or sufficiently large. His algorithm makes use of the first algebraic de
Rham cohomology group of F [x, y, 1/f (x, y)], as previously used by Ruppert [2504, 2505]
for testing the absolute irreducibility. In fact, if f factors into f1 · · · fr over the algebraic
closure of F then
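$$\left( \frac{\hat f_i\, \frac{\partial f_i}{\partial x}}{f}\, dx + \frac{\hat f_i\, \frac{\partial f_i}{\partial y}}{f}\, dy \right)_{i \in \{1, \ldots, r\}}$$
is a basis of the latter group, where $\hat f_i := f / f_i$ [2504, Satz 2]. In consequence, this group can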
be obtained by searching for closed differential 1-forms with denominators f and numerators
of degrees at most deg f − 1, which can be easily done by solving a linear system. A nice
presentation of Ruppert’s results is made in Schinzel’s book [2542, Chapter 3]. The algorithm
underlying Theorem 11.5.22 makes use of these ideas in order to show that a precision
σ = dx + 1 of the series in the Hensel lifting suffices.
11.5.29 Remark Let f ∈ F [x1 , . . . , xn ] continue to denote a polynomial in n variables over a field
F of total degree d. For any points (α1 , . . . , αn ), (β1 , . . . , βn ) and (γ1 , . . . , γn ) in F n , we
define the bivariate polynomial fα,β,γ in the variables x and y by fα,β,γ := f (α1 x + β1 y +
γ1 , . . . , αn x + βn y + γn ).
11.5.30 Theorem (Bertini’s theorem) [2602, Chapter II, Section 6.1] If f is irreducible, then there
exists a proper Zariski open subset of (F n )3 such that fα,β,γ is irreducible for any triple
(α1 , . . . , αn ), (β1 , . . . , βn ), (γ1 , . . . , γn ) in this subset.
11.5.31 Definition For any irreducible factor g of f , a triple (α1 , . . . , αn ), (β1 , . . . , βn ), (γ1 , . . . , γn )
in (F n )3 is a Bertinian good point for g if g(α1 x + β1 y + γ1 , . . . , αn x + βn y + γn ) is
irreducible with the same total degree as g. In other words, the irreducible factors of
f are in one-to-one correspondence with those of fα,β,γ . The complementary set of
Bertinian good points is written B(f ) and is the set of Bertinian bad points.
11.5.32 Remark For algorithmic purposes, the entries of (α1 , . . . , αn ), (β1 , . . . , βn ) and (γ1 , . . . , γn )
must be taken in a finite subset S of F , so that we are naturally interested in upper bounding
the number of Bertinian bad points in (S n )3 . We refer to such a bound as a quantitative
Bertini theorem. The density of Bertinian bad points with entries in a non-empty finite
subset S of F is
$$B(f, S) := \frac{|B(f) \cap (S^n)^3|}{|S|^{3n}},$$
where |S| represents the cardinality of S.
11.5.33 Theorem (Quantitative Bertini theorem) [1658, Corollary 2] and [1881, Corollary 8] If F
is a perfect field of characteristic p, and according to the above notation, we have that:
1. $B(f, S) \le (3d(d - 1) + 1)/|S|$ if $p \ge d(d - 1) + 1$;
2. $B(f, S) \le 2d^4/|S|$ otherwise.
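To get a feeling for how large S must be in practice, the two bounds of Theorem 11.5.33 can be evaluated directly. The following is only a sketch: the function names are ours, p is assumed to be a prime characteristic (the finite field case), and `failure` is a hypothetical target for the density of bad points.

```python
from fractions import Fraction
from math import ceil

def bertini_numerator(d, p):
    """Numerator of the bound in Theorem 11.5.33 for total degree d, characteristic p."""
    if p >= d * (d - 1) + 1:
        return 3 * d * (d - 1) + 1
    return 2 * d**4

def bertini_bad_density_bound(d, set_size, p):
    """Upper bound on B(f, S) when |S| = set_size."""
    return Fraction(bertini_numerator(d, p), set_size)

def set_size_for(d, p, failure):
    """Smallest |S| for which the bound is at most `failure` (a Fraction in (0, 1])."""
    return ceil(bertini_numerator(d, p) / failure)

# Degree 3: characteristic 7 falls under case 1, characteristic 5 under case 2.
print(bertini_bad_density_bound(3, 1000, 7))   # 19/1000
print(bertini_bad_density_bound(3, 1000, 5))   # 81/500
print(set_size_for(3, 7, Fraction(1, 10)))     # 190
```

When the base field is smaller than the required |S|, the text's remedy is to pass to an extension field.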
11.5.34 Remark What we call “Bertini’s theorem” here is a particular but central case of more gen-
eral theorems such as in [2602, Chapter II, Section 6.1]. As pointed out by Kaltofen [1658],
the special application of Bertini’s theorem to reduce the factorization problem from several
to two variables was already known by Hilbert [1499, p. 117]. This is why Kaltofen and some
authors say “(effective) Hilbert Irreducibility Theorem” instead of “Bertini’s theorem.” For
more historical details about Bertini’s work, we refer the reader to [1623, 1750].
11.5.35 Remark Bertini’s theorem was introduced in complexity theory by Heintz and Sievek-
ing [1462], and Kaltofen [1645]. It quickly became a cornerstone of many randomized fac-
torization or reduction techniques including [1218, 1229, 1647, 1648, 1649]. Over the field of
complex numbers, Bajaj et al. [162] obtained the bound $B(f, S) \le (d^4 - 2d^3 + d^2 + d + 1)/|S|$
by following Mumford’s proof [2202, Theorem 4.17] of Bertini’s theorem. Gao [1177] proved
the bound $B(f, S) \le 2d^3/|S|$ whenever F has characteristic 0 or larger than $2d^2$. Then Chèze
pointed out [616, Chapter 1] that the latter bound can be refined to $B(f, S) \le d(d^2 - 1)/|S|$
by using directly [2504, Satz C]. The paper [1218] contains a version for non-perfect fields
with a bound that is exponential in d. If the cardinality |F | is too small, one can switch to
an extension (see Remark 11.5.67 below).
11.5.36 Corollary Let S(n, d) represent a cost function for the product of two power series over
a field F in n variables truncated to precision d. Let f ∈ Fq [x1 , . . . , xn ] be a polynomial
of total degree d. If $q \ge 4d^4$ then Irr(f ) can be computed with an expected number of
O(1) factorizations of polynomials in Fq [x, y] of total degree d, plus an expected number of
$\tilde{O}(d\, S(n - 1, d))$ operations in Fq .
11.5.37 Remark Softly optimal series products exist in particular cases [1519], for which the fac-
torization thus reduces to the univariate case in expected softly linear time as soon as
n ≥ 3.
11.5.38 Remark The first deterministic polynomial time multivariate factorization algorithms are
due to Kaltofen [1645, 1646]. Kaltofen constructed polynomial-time reductions to bi- (in
1981) and univariate (in 1982) factorization over an abstract field, which were discov-
ered independently of the 1982 univariate factorization algorithm over the rationals by
A. K. Lenstra, H. W. Lenstra, and Lovász [1893]. Kaltofen’s reduction to univariate factor-
ization, however, was inspired by Zassenhaus’s algorithm [3053]. For more references to work
by others (Chistov, von zur Gathen, Grigoriev, A. K. Lenstra) that immediately followed,
we refer the reader to Kaltofen’s surveys [1654, 1655, 1659], and to [1227].
11.5.39 Remark Polynomial factorization over finite fields has been implemented in Maple by
Bernardin and Monagan [239]. Other practical techniques have been reported in [2299]. At
the present time, the most general algorithm is due to Steel [2703]: it handles all coefficient
fields being explicitly finitely generated over their prime field, and it has been implemented
within the Magma computer algebra system [360]. Steel’s algorithm actually completes and
improves a previous approach investigated by Davenport and Trager [779].
11.5.40 Remark It is possible, via the rank of the Petr matrix or the distinct degree factorization
algorithm, to count the number of irreducible factors of a univariate polynomial over a field
Fq of characteristic p in deterministic polynomial time in log p. The same remains true for
multivariate polynomials [1182, 1651], but the algorithms are not straightforward. In [1182] a
multivariate deterministic distinct degree factorization is presented. There “distinct degree”
is with respect to any degree order.
11.5.45 Definition A polytope in $\mathbb{R}^n$ is integral if all of its vertices are in $\mathbb{Z}^n$. An integral polytope
P is integrally decomposable if there exist two integral polytopes Q and R such that
P = Q + R, where both Q and R have at least two points. Otherwise, P is integrally
indecomposable.
11.5.46 Definition The Newton polytope of f , written N (f ), is the convex hull in $\mathbb{R}^n$ of Supp(f ).
The integral convex hull of f is the subset of points in $\mathbb{Z}^n$ lying in N (f ).
11.5.47 Theorem (Ostrowski’s theorem) [2338], translated in [2339] If f factors into gh then we
have N (f ) = N (g) + N (h).
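Ostrowski's theorem can be checked numerically on small bivariate examples. The sketch below is illustrative only: polynomials over a prime field are stored as dictionaries mapping exponent pairs to coefficients (our own convention), the hull is computed with Andrew's monotone chain, and all function names are ours.

```python
from itertools import product

def poly_mul(g, h, p):
    """Multiply sparse bivariate polynomials {(i, j): coeff} over F_p."""
    out = {}
    for (e1, c1), (e2, c2) in product(g.items(), h.items()):
        e = (e1[0] + e2[0], e1[1] + e2[1])
        out[e] = (out.get(e, 0) + c1 * c2) % p
    return {e: c for e, c in out.items() if c}

def hull(points):
    """Vertices of the convex hull of planar integer points (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half(seq):
        chain = []
        for q in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], q) <= 0:
                chain.pop()
            chain.append(q)
        return chain[:-1]

    return half(pts) + half(reversed(pts))

def newton_vertices(f):
    """Sorted vertices of the Newton polytope N(f)."""
    return sorted(hull(f.keys()))

def minkowski_vertices(P, Q):
    """Sorted vertices of the Minkowski sum of two polygons given by vertices."""
    return sorted(hull([(a[0] + b[0], a[1] + b[1]) for a in P for b in Q]))

# g = x^2 + y + 1 and h = x + y^2 + 2 over F_5.
g = {(2, 0): 1, (0, 1): 1, (0, 0): 1}
h = {(1, 0): 1, (0, 2): 1, (0, 0): 2}
f = poly_mul(g, h, 5)
print(newton_vertices(f))
print(minkowski_vertices(newton_vertices(g), newton_vertices(h)))
```

Both printed lists agree, as the theorem predicts: cancellation modulo p can remove interior terms of the product but never a vertex of N(g) + N(h).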
11.5.48 Remark The previous theorem leads to the following irreducibility test.
11.5.49 Corollary (Irreducibility criterion) [1175, p. 507] If f ∈ F [x1 , . . . , xn ] is a nonzero polyno-
mial not divisible by any xi , and if N (f ) is integrally indecomposable, then f is irreducible
over any algebraic extension of F .
11.5.50 Theorem [1175, Theorem 4.2] Let P be an integral polytope in Rn contained in a hyperplane
H and let e ∈ Zn be a point lying outside of H. If e1 , . . . , ek are all the vertices of P , then
the convex hull of P and e is integrally indecomposable if, and only if, all the entries of
e − e1 , e − e2 , . . . , e − ek are coprime.
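The criterion of Theorem 11.5.50 amounts to a single gcd computation. A minimal sketch follows; the function name is ours, and the caller is assumed to have checked that the base vertices lie in a hyperplane not containing the apex.

```python
from functools import reduce
from math import gcd

def pyramid_is_indecomposable(base_vertices, apex):
    """Test whether conv(P, e) is integrally indecomposable (Theorem 11.5.50).

    base_vertices: the vertices e_1, ..., e_k of P, as integer tuples.
    apex: the integer point e outside the hyperplane containing P.
    The test is that all entries of e - e_1, ..., e - e_k have gcd 1.
    """
    entries = [a - v for vertex in base_vertices for a, v in zip(apex, vertex)]
    return reduce(gcd, entries, 0) == 1

# Triangle with vertices (0,0), (3,0), (0,5): the Newton polygon of x^3 + y^5 + 1.
print(pyramid_is_indecomposable([(3, 0), (0, 5)], (0, 0)))   # True
# Triangle with vertices (0,0), (4,0), (0,6): every entry is even.
print(pyramid_is_indecomposable([(4, 0), (0, 6)], (0, 0)))   # False
```

Combined with Corollary 11.5.49, the first call shows that a polynomial whose Newton polygon is that triangle, and which is divisible by neither x nor y, is irreducible over every algebraic extension of its coefficient field. The second call is inconclusive for irreducibility: a decomposable polytope does not imply a reducible polynomial.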
11.5.51 Theorem [1175, Theorem 4.11] Let P be an indecomposable integral polytope in $\mathbb{R}^n$ with
at least two points, that is contained in a hyperplane H, and let e ∈ $\mathbb{R}^n$ be a point outside
of H. Let S be any subset of points in $\mathbb{Z}^n$ contained in the convex hull of e and P . Then
the convex hull of S and P is integrally indecomposable.
11.5.52 Remark Let f ∈ Fp [x, y] be a polynomial with t nonzero terms and of total degree d such
that t < d. Let r be a vector in R2 , and let Γ be a subset of edges of N (f ) satisfying the
following properties:
1. N (f ) ⊆ Γ + rR≥0 ,
2. each of the two infinite edges of Γ + rR≥0 contains exactly one point of N (f ),
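11.5.56 Definition The affine group over $\mathbb{Z}^2$, written $\mathrm{Aff}(\mathbb{Z}^2)$, is the set of the maps U
$$U : (i, j) \mapsto \begin{pmatrix} \alpha & \beta \\ \alpha' & \beta' \end{pmatrix} \begin{pmatrix} i \\ j \end{pmatrix} + \begin{pmatrix} \gamma \\ \gamma' \end{pmatrix}, \qquad (11.5.1)$$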
11.5.57 Definition Let S be a finite subset of Z2 . The set S is normalized if it belongs to N2 and
if it contains at least one point in {0} × N, and also at least one point in N × {0}.
11.5.58 Theorem [256, Theorem 1.2] For any normalized finite subset S of Z2 , of cardinality σ,
convex size π, and included in [0, dx ] × [0, dy ], one can compute an affine map U ∈ Aff(Z2 )
as in (11.5.1), together with U (S), with O(σ log2 ((dx + 1)(dy + 1))) bit-operations, such that
U (S) is normalized and contained in a block $[0, d'_x] \times [0, d'_y]$ satisfying $(d'_x + 1)(d'_y + 1) \le 9\pi$.
11.5.59 Lemma For any field F , for any f ∈ F [x, y] not divisible by x and y, for any U as in
Equation (11.5.1), the polynomial
$$U(f) := \sum_{(e_x, e_y) \in \mathrm{Supp}(f)} f_{(e_x, e_y)}\, x^{\alpha e_x + \beta e_y + \gamma}\, y^{\alpha' e_x + \beta' e_y + \gamma'}$$
11.5.60 Remark In order to compute the irreducible factorization of f , we can compute a reduction
map U as in Theorem 11.5.58 for Supp(f ), then compute the irreducible factorization of
U (f ), and finally apply $U^{-1}$ to each factor. In this way we benefit from complexity bounds
that only depend on the convex size π of f instead of its dense size $(d_x + 1)(d_y + 1)$.
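The effect of an affine map on a sparse bivariate polynomial is just a relabelling of exponents. The sketch below illustrates this on a support lying near a diagonal. The dictionary representation, the tuple encoding of U, and the choice of the map (i, j) ↦ (i − j, j) are all our own illustrative assumptions; in particular this is not the algorithm of Theorem 11.5.58, which computes a suitable U.

```python
def apply_affine_map(f, U):
    """Apply U = (alpha, beta, gamma, alpha2, beta2, gamma2) to the exponents of f.

    f maps exponent pairs (ex, ey) to coefficients; the image monomial is
    x^(alpha*ex + beta*ey + gamma) * y^(alpha2*ex + beta2*ey + gamma2).
    """
    alpha, beta, gamma, alpha2, beta2, gamma2 = U
    return {
        (alpha * ex + beta * ey + gamma, alpha2 * ex + beta2 * ey + gamma2): c
        for (ex, ey), c in f.items()
    }

def dense_size(f):
    """(d_x + 1)(d_y + 1) for the bounding block [0, d_x] x [0, d_y] of Supp(f)."""
    dx = max(ex for ex, _ in f)
    dy = max(ey for _, ey in f)
    return (dx + 1) * (dy + 1)

# f = 1 + 3xy + 2x^5 y^4 + x^10 y^8 (coefficients in some field, here small integers).
f = {(0, 0): 1, (1, 1): 3, (5, 4): 2, (10, 8): 1}
U = (1, -1, 0, 0, 1, 0)        # (i, j) -> (i - j, j), determinant 1
U_inv = (1, 1, 0, 0, 1, 0)     # (i, j) -> (i + j, j)

Uf = apply_affine_map(f, U)
print(dense_size(f), dense_size(Uf))          # 99 27
print(apply_affine_map(Uf, U_inv) == f)       # True
```

A factorization routine run on `Uf` works in a 3 × 9 block instead of an 11 × 9 one, and each factor is mapped back with `U_inv`, as described in Remark 11.5.60.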
The variable υ101 represents a polynomial in F[x1 , x2 , . . .], which can be evaluated by use of
the straight-line program. For instance, the determinant of an n × n matrix whose entries
are $n^2$ variables can be represented, via Gaussian elimination, by a straight-line program
with divisions of length $O(n^3)$. Because those divisions can cause divisions by zero on
evaluation at certain points, it is desirable to remove them from such programs [2733]:
the shortest division-free straight-line program for the determinant that is known today
has length $O(n^{2.7})$ and uses no constants other than 1 and −1 in F [1670]. In any case,
divisions can be removed by increasing the length by a factor $O((\deg f)^{1+\epsilon})$ for any $\epsilon > 0$.
The 1986 algorithm in [1650, 1653] produces, from a straight-line program of length l for
a polynomial of degree d, in Monte Carlo random polynomial time, straight-line programs
for the irreducible factors (and their multiplicities). The factor programs themselves have
length $O(d^2 l + d^{3+\epsilon})$. Over finite fields of characteristic p, for an irreducible factor g of
multiplicity $p^m m'$, where $\gcd(p, m') = 1$, a straight-line program for $g^{p^m}$ is returned; see
Remark 11.5.68 below. The algorithm is implemented in the Dagwood system [1100] and can
factor matrix determinants. A shortcoming of the straight-line representations, which later
were adopted by the TERA project, was exposed by the Dagwood program: the lengths,
while polynomial in the input lengths, become quite large (over a million assignments). The
construction, however, plays a key role in complexity theory [1640].
11.5.65 Remark Since polynomials represented by straight-line programs can be converted to sparse
polynomials in polynomial-time in their sparse size by the algorithm in [3080], the straight-
line factorization algorithm brought to a successful conclusion the search for polynomial-
time sparse factorizers. Previous attempts based on sparse Hensel lifting [1216, 1230, 1649,
3080, 3081], retained an exponential substep for many factors, namely the computation
of the so-called right-side Hensel correction coefficients. The problem of computing the
coefficient of a given term in a sparse product is in general #P-hard. Nonetheless, if a
polynomial has only a few sparse factors, such sparse lifting can be quite efficient, in practice.
11.5.66 Remark Instead of straight-line programs, one can use a full-fledged programmed procedure
that evaluates the input polynomial. The irreducible factors are then evaluated at values
for the variables by another procedure that makes (“oracle”) calls to the input evaluation
procedure. Thus is the genesis of algorithms for black box polynomials [1668, 1669].
The idea is the following: Suppose one can call a black evaluation box for the polynomial
f (x1 , . . . , xn ) ∈ F[x1 , . . . , xn ]. First, uniformly randomly select from a sufficiently large finite
set field elements ai , ci (2 ≤ i ≤ n) and bj (1 ≤ j ≤ n) and interpolate and factor the
bivariate image
$$\hat f(X, Y) = f(X + b_1, c_2 Y + a_2 X + b_2, \ldots, c_n Y + a_n X + b_n) = \prod_{k=1}^{r} \hat g_k(X, Y)^{e_k}.$$
By the effective Hilbert Irreducibility Theorem 11.5.33 above, the irreducible polynomials
ĝk are with high probability bivariate images of the irreducible factors hk (x1 , . . . , xn ) of f .
For small coefficient fields we shall assume that the black box can evaluate f at elements
in a finite algebraic extension E of F. Already the bivariate interpolation algorithm may
require such an extension in order to have sufficiently many distinct points.
11.5.67 Remark If one selects an extension E of degree [E : F] > deg(f ) that is a prime number,
all hk remain irreducible over that extension. Indeed, the Frobenius norm NormE/F (h̃) ∈
F[x1 , . . . , xn ] of a possible non-trivial irreducible factor h̃ ∈ E[x1 , . . . , xn ] of an hk must
be a power of an irreducible polynomial over F, hence a power of hk itself. For otherwise
gcd(hk , NormE/F (h̃)) would constitute a non-trivial factor of hk over F. But then deg(h̃ )·[E :
F] = deg(hk ) · m, where m is the exponent of that power, and because [E : F] is a prime
> deg(f ) ≥ deg(hk ), we obtain the contradiction deg(h̃ ) = deg(hk ) · (m/[E : F]) ≥ deg(hk ).
Remark 11.5.66 continued. Now the black box for evaluating all hk (ξ1 , . . . , ξn ) at field
elements ξi ∈ F stores (“hard-wires”) the ai , bj and the factors gk (X) = ĝk (X, 0) in its
constant pool. We note that the gk are not necessarily irreducible, but with high probability
they are pairwise relatively prime [1669, Section 2, Step 3], and their leading terms only
depend on the variable X. The black box first interpolates
f¯(X, Y ) = f (X + b1 , Y (ξ2 − a2 (ξ1 − b1 ) − b2 ) + a2 X + b2 ,
. . ., Y (ξn − an (ξ1 − b1 ) − bn ) + an X + bn ) (11.5.2)
We note that again the h̄k are not necessarily irreducible. One may Hensel-lift the factor-
ization
$$f(X + b_1, a_2 X + b_2, \ldots, a_n X + b_n) = \prod_{k=1}^{r} g_k(X)^{e_k} \qquad (11.5.4)$$
provided none of the multiplicities ek is divisible by p. Otherwise, one can fully factor
f¯(X, Y ) and lump (multiply) those irreducible factors h̄κ (X, Y ) together where h̄κ (X, 0)
divide one and the same $g_k(X)$. Alternatively, if $p^m$ divides $e_k$ one could lift the $p^m$-th
power of $g_k$ and take a $p^m$-th root of the lifted factor. We have $\bar f(\xi_1 - b_1, 1) = f(\xi_1, \ldots, \xi_n)$,
and for all k we obtain h̄k (ξ1 − b1 , 1) = hk (ξ1 , . . . , ξn ). We observe that the scalar multiple
of hk is fixed in all evaluations by the choice of gk .
11.5.68 Remark Over finite coefficient fields, there is no restriction on the multiplicities ek . One
does not obtain a pure straight-line program for the polynomial hk because a bivariate
factorization of $\bar f$ or a $p^m$-th root of the lifted factor, which depend on the evaluation
points $\xi_i$, are performed on each evaluation. One can obtain straight-line polynomials that
equal the irreducible factors modulo $(x_1^q - x_1, \ldots, x_n^q - x_n)$ by powering by $q/p^m$, where
q < ∞ is the cardinality of the coefficient field. Those straight-line programs produce correct
evaluations of the irreducible factors.
11.5.69 Remark The blackbox factorization algorithm is implemented in the FoxBox system [830].
The size blowup experienced in the straight-line factorization algorithm does not occur. In
fact, the factor evaluation black box makes $O(\deg(f)^2)$ calls to the black box for f and
factors a bivariate polynomial, either by lifting (11.5.4) or, if multiplicities are divisible by
the characteristic, by factoring f¯. The program is fixed except for the constants ai , bj and
the polynomials gk .
11.5.70 Remark We conclude that the sparse representations of the factors can be recovered by
sparse interpolation over a finite field; see [1662] and the literature cited there. Dense factors
can be identified to have more than a given number of terms and skipped.
See Also
References Cited: [10, 11, 162, 219, 237, 238, 239, 256, 360, 365, 616, 617, 761, 778, 779,
830, 1100, 1175, 1177, 1182, 1183, 1216, 1217, 1218, 1227, 1229, 1230, 1271, 1462, 1499,
1518, 1519, 1616, 1623, 1640, 1645, 1646, 1647, 1648, 1649, 1650, 1651, 1653, 1654, 1655,
1658, 1659, 1660, 1661, 1662, 1668, 1669, 1670, 1716, 1739, 1750, 1880, 1881, 1882, 1883,
1893, 1897, 1983, 2108, 2129, 2130, 2131, 2202, 2209, 2299, 2338, 2339, 2400, 2504, 2505,
2528, 2529, 2530, 2542, 2602, 2703, 2730, 2733, 2938, 2939, 3021, 3052, 3053, 3080, 3081]
Surveys and detailed expositions with proofs can be found in [661, 1939, 1938, 2080,
2308, 2306, 2720].
11.6.1 Remark Discrete exponentiation in a finite field is a direct analog of ordinary exponentiation.
The exponent can only be an integer, say n, but for w in a field F , $w^n$ is defined except
when w = 0 and n ≤ 0, and satisfies the usual properties, in particular $w^{m+n} = w^m w^n$
and (for u and v in F ) $(uv)^m = u^m v^m$. The discrete logarithm is the inverse function, in
analogy with the ordinary logarithm for real numbers. If F is a finite field, then it has at
least one primitive element g; i.e., all nonzero elements of F are expressible as powers of g,
see Chapter 2.
11.6.2 Definition Given a finite field F , a primitive element g of F , and a nonzero element w of
F , the discrete logarithm of w to base g, written as $\log_g(w)$, is the least non-negative
integer n such that $w = g^n$.
11.6.3 Remark The value $\log_g(w)$ is unique modulo q − 1, where q is the cardinality of F , and
$0 \le \log_g(w) \le q - 2$. It is often convenient to allow it to be represented by any integer n
such that $w = g^n$.
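For a prime field of toy size, Definition 11.6.2 can be implemented literally by trying exponents in increasing order. This is only a sketch to fix notation (the function name is ours, and the prime p = 23 with primitive element g = 5 is chosen for illustration); it takes up to q − 1 multiplications and is useless at cryptographic sizes.

```python
def discrete_log(g, w, p):
    """Least n >= 0 with g**n == w in F_p, where g is primitive modulo the prime p."""
    x = 1
    for n in range(p - 1):
        if x == w % p:
            return n
        x = x * g % p
    raise ValueError("w must be nonzero modulo p and g must be primitive")

n = discrete_log(5, 8, 23)
print(n, pow(5, n, 23))          # 6 8
# Any integer congruent to n modulo q - 1 = 22 represents the same logarithm.
print(pow(5, n + 22, 23))        # 8
```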
11.6.4 Remark The discrete logarithm of w to base g is often called the index of w with respect to
the base g. More generally, we can define discrete logarithms in groups. They are commonly
called generic discrete logarithms.
11.6.6 Remark The definition of a group discrete logarithm allows for consideration of discrete
logarithms in finite fields when the base g is not primitive, provided the argument is in
the group $\langle g \rangle$. This situation arises in some important applications, in particular in the
U.S. government standard for the Digital Signature Algorithm (DSA). DSA operations are
performed in a field Fp with p a prime (nowadays recommended to be at least 2048 bits).
This prime p is selected so that p − 1 is divisible by a much smaller prime r (specified
in the standard to be of 160, 224, or 256 bits), and an element h of Fp is chosen to have
multiplicative order r (say by finding a primitive element g of Fp and setting $h = g^{(p-1)/r}$).
The main element of the signature is of the form $h^s$ for an integer s, and ability to compute
s would break DSA. DSA can be attacked either by using generic finite group discrete
logarithm algorithms in the group $\langle h \rangle$ or finite field algorithms in the field Fp (which can
then easily yield a solution in $\langle h \rangle$).
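The construction of h can be illustrated with toy parameters. In the sketch below p = 23, r = 11 and g = 5 are our own choices for readability; real DSA parameters are thousands of bits long, and the function names are not taken from any standard or library.

```python
def element_of_order_r(g, p, r):
    """Return h = g^((p-1)/r) mod p for a primitive element g and a prime r dividing p - 1."""
    if (p - 1) % r != 0:
        raise ValueError("r must divide p - 1")
    return pow(g, (p - 1) // r, p)

def multiplicative_order(h, p):
    """Order of h in F_p^* by repeated multiplication (toy sizes only)."""
    x, n = h % p, 1
    while x != 1:
        x = x * h % p
        n += 1
    return n

p, r, g = 23, 11, 5
h = element_of_order_r(g, p, r)
print(h, multiplicative_order(h, p))      # 2 11

# Recovering s from h^s only requires searching the subgroup of order r.
s = 7
target = pow(h, s, p)
print(next(k for k in range(r) if pow(h, k, p) == target))   # 7
```

The last line is the generic attack in the group generated by h reduced to exhaustive search; its cost depends on r, not on p.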
11.6.7 Remark The basic properties of discrete logarithms given below, such as the change of base
formula, apply universally. On the other hand, many of the discrete logarithm algorithms
described later are valid only in finite fields. Generally speaking, discrete logarithms are
comparatively easy to compute in finite fields, since they have a rich algebraic structure that
can be exploited for cryptanalytic purposes. Much of the research on discrete logarithms
in other settings has been devoted to embedding the relevant groups inside finite fields in
order to apply finite field discrete logarithm algorithms.
11.6.8 Remark This section is devoted to finite field discrete logarithms, and only gives a few ref-
erences to other ones. For elliptic curve discrete logarithms, the most prominent collection,
see Section 16.4. However, other groups have also been used, for example class groups of
number fields [2044].
11.6.9 Remark Most popular symbolic algebra systems contain some implementations of discrete
logarithm algorithms. For example, Maple has the mlog function, while Mathematica has
FieldInd. More specialized systems for number theoretic and algebraic computations, such
as Magma, PARI, and Sage, also have implementations, and typically can handle larger
problems. Thus for all but the largest problems that are at the edge of computability with
modern methods, widely available and easy to use programs are sufficient. Tables of finite
fields, such as those in [1939], are now seldom printed in books.
11.6.10 Remark Until the mid-1970s, the main applications for discrete logarithms were similar to
those of ordinary logarithms, namely in routine computations, but this time in finite fields.
They allowed replacement of relatively hard multiplications by easier additions. What was
frequently used was Zech’s logarithm (also called Jacobi’s logarithm, cf. [1939]), which is a
modification of the ordinary discrete logarithm. In a finite field F with primitive element g,
Zech’s logarithm of an integer n is defined as the integer Z(n) mod (q − 1) which satisfies
$g^{Z(n)} = 1 + g^n$. This provides a quick way to add elements given in terms of their discrete
logarithms: aside from boundary cases, $g^m + g^n = g^m (1 + g^{n-m}) = g^{m + Z(n-m)}$.
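A table of Zech logarithms for a small prime field makes the addition rule concrete. The following sketch uses q = 7 and g = 3 (our choice); the boundary case $1 + g^n = 0$ is recorded as `None`, which is a convention of this example rather than of the text.

```python
def zech_table(g, p):
    """Map n -> Z(n) with g^Z(n) = 1 + g^n in F_p, or None when 1 + g^n = 0."""
    log = {}
    x = 1
    for n in range(p - 1):
        log[x] = n                 # log[g^n] = n
        x = x * g % p
    table = {}
    x = 1
    for n in range(p - 1):
        table[n] = log.get((1 + x) % p)   # 0 has no logarithm, giving None
        x = x * g % p
    return table

def add_by_logs(m, n, table, p):
    """Logarithm of g^m + g^n, or None if the sum is zero."""
    z = table[(n - m) % (p - 1)]
    return None if z is None else (m + z) % (p - 1)

table = zech_table(3, 7)
print(table)                          # {0: 2, 1: 4, 2: 1, 3: None, 4: 5, 5: 3}
print(add_by_logs(1, 2, table, 7))    # 5, since 3 + 2 = 5 = 3^5 in F_7
print(add_by_logs(0, 3, table, 7))    # None, since 1 + 6 = 0 in F_7
```

With such a table, multiplication is addition of logarithms modulo q − 1 and addition costs one table lookup.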
11.6.11 Remark As with ordinary logarithms, where slide rules and logarithm tables have been re-
placed by calculators, such routine applications of discrete logarithms in small or moderately
large fields now rely on computer algebra systems.
11.6.12 Remark Interest in discrete logarithms jumped dramatically in the mid-1970s with the
invention of public key cryptography, see Chapter 16. While discrete exponentiation is easy,
the discrete logarithm, its inverse, appeared hard, and this motivated the invention of the
Diffie–Hellman key exchange protocol, the first practical public key cryptosystem. Efficient
algorithms for discrete logarithms in the field over which this protocol is implemented would
make it insecure.
11.6.13 Remark The Diffie–Hellman problem is to compute $g^{xy}$, the key that the two parties to
the Diffie–Hellman protocol obtain, from the $g^x$ and $g^y$ that are visible to the eavesdropper.
Although this problem has attracted extensive attention, it has not been solved, and for the
most important cases of finite field and elliptic curve discrete logarithms, it is still unknown
whether the Diffie–Hellman problem is as hard as the discrete logarithm one; see [410] for
recent results and references.
11.6.14 Remark It is known that single bits of discrete logarithms are about as hard to compute
as the entire discrete logarithms [1444].
11.6.15 Remark There are some rigorous lower bounds on discrete log problems, but only for groups
given in ways that restrict what can be done in them [2219, 2630].
11.6.16 Remark Various cryptosystems other than the Diffie-Hellman one have been proposed
whose security similarly depends on the intractability of the discrete logarithm problem;
see Section 16.1. Many of them can be used in settings other than finite fields.
11.6.17 Remark There are close analogies between integer factorization and discrete logarithms
in finite fields, and most (but not all) of the algorithms in one area have similar ones in
the other. This will be seen from some of the references later. In general, considerably less
attention has been devoted to discrete logarithms than to integer factorization. Hence the
smaller sizes of discrete logarithm problems that have been solved result both from the
greater technical difficulty of this problem as compared to integer factorization and from
less effort being devoted to it.
11.6.18 Remark Shor’s 1994 result [2623] shows that if quantum computers become practical,
discrete logarithms will become easy to compute. Therefore cryptosystems based on discrete
logarithms may all become suddenly insecure.
11.6.19 Remark Suppose that G is a group, and g an element of finite order m in G. If u and v are
two elements of $\langle g \rangle$, then
$$\log_g(uv) \equiv \log_g(u) + \log_g(v) \pmod{m}.$$
11.6.20 Remark (Change of base formula) Suppose that G is a group, and that g and h are two
elements of G that generate the same cyclic subgroup $\langle g \rangle = \langle h \rangle$ of order m. If u is an
element of $\langle g \rangle$, then
$$\log_g(u) \equiv \log_h(u) \cdot \log_g(h) \pmod{m},$$
and therefore
$$\log_g(h) \equiv 1/\log_h(g) \pmod{m}.$$
These formulas mean that one can choose the most convenient primitive element to work
with in many applications. For example, in finite fields $\mathbb{F}_{2^k}$, elements are usually represented
as polynomials with binary coefficients, and one can find (as verified by experiment and
inspired by heuristics, but not proved rigorously) primitive elements that are represented as
polynomials of very low degree. This can offer substantial efficiencies in implementations.
However, it does not affect the security of the system. If discrete logarithms are easy to
compute in one base, they are easy to compute in other bases. Similarly, the change of the
irreducible polynomial that defines the field has little effect on the difficulty of the discrete
logarithm problem.
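The change of base formula is easy to confirm numerically in a small field. The sketch below works in the multiplicative group of F_23 (order m = 22) with the two primitive elements 5 and 7; these values and the brute-force helper `dlog` are our own illustrative choices.

```python
def dlog(base, target, p):
    """Brute-force logarithm of target to the given base in F_p^* (toy sizes only)."""
    x = 1
    for n in range(p - 1):
        if x == target % p:
            return n
        x = x * base % p
    raise ValueError("target is not a power of base")

p, m = 23, 22
g, h, u = 5, 7, 8

lhs = dlog(g, u, p)
rhs = dlog(h, u, p) * dlog(g, h, p) % m
print(lhs, rhs)                                   # 6 6
print(dlog(g, h, p) * dlog(h, g, p) % m)          # 1
```

The second line shows that $\log_g(h)$ and $\log_h(g)$ are inverse to each other modulo m, so converting between bases costs one logarithm and one modular multiplication.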
11.6.21 Remark If the order of the element g can be factored even partially, the discrete logarithm
problem reduces to easier ones. This is the Silver–Pohlig–Hellman technique [2406]. Suppose
that g is an element of finite order m in a group G, and m is written as m = m1 m2 with
$\gcd(m_1, m_2) = 1$. Then the cyclic group $\langle g \rangle$ is the direct product of the cyclic groups
$\langle g^{m_2} \rangle$ and $\langle g^{m_1} \rangle$ of orders $m_1$ and $m_2$, respectively. If we determine $a = \log_{g^{m_2}}(w^{m_2})$ and
$b = \log_{g^{m_1}}(w^{m_1})$, the Chinese Remainder Theorem tells us that $\log_g(w)$ is determined
completely, and in fact we obtain
$$\log_g(w) \equiv b\, x\, m_1 + a\, y\, m_2 \pmod{m},$$
where x and y come from the Euclidean algorithm computation of $\gcd(m_1, m_2)$, namely
$1 = x m_1 + y m_2$. This procedure extends easily to more than two relatively prime factors.
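The two-factor reduction can be sketched as follows. The code assumes a prime field F_p, uses exhaustive search for the two subgroup logarithms, and recombines them as $b\,x\,m_1 + a\,y\,m_2$ modulo m, which is the Chinese Remainder recombination for $1 = x m_1 + y m_2$ (since $a \equiv \log_g(w) \bmod m_1$ and $b \equiv \log_g(w) \bmod m_2$). Function names and the example parameters are ours.

```python
def ext_gcd(a, b):
    """Return (d, x, y) with d = gcd(a, b) = x*a + y*b."""
    if b == 0:
        return a, 1, 0
    d, x, y = ext_gcd(b, a % b)
    return d, y, x - (a // b) * y

def subgroup_log(base, target, order, p):
    """Exhaustive-search logarithm in the subgroup of F_p^* generated by base."""
    x = 1
    for n in range(order):
        if x == target:
            return n
        x = x * base % p
    raise ValueError("target is not in the subgroup generated by base")

def dlog_two_factors(g, w, m1, m2, p):
    """log_g(w) in F_p^*, where g has order m1*m2 with gcd(m1, m2) = 1."""
    a = subgroup_log(pow(g, m2, p), pow(w, m2, p), m1, p)   # log_g(w) mod m1
    b = subgroup_log(pow(g, m1, p), pow(w, m1, p), m2, p)   # log_g(w) mod m2
    d, x, y = ext_gcd(m1, m2)
    if d != 1:
        raise ValueError("m1 and m2 must be coprime")
    return (b * x * m1 + a * y * m2) % (m1 * m2)

# g = 5 has order 22 = 2 * 11 modulo 23, and 8 = 5^6.
print(dlog_two_factors(5, 8, 2, 11, 23))   # 6
```

The work is governed by the larger of m1 and m2 rather than by their product.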
11.6.22 Remark When m, the order of g, is a prime power, say $m = p^k$, the computation of $\log_g(w)$
reduces to k discrete logarithm computations in a cyclic group of p elements. For example,
if $r = p^{k-1}$ and $h = g^r$, $u = w^r$, then h has order p, and computing $\log_h(u)$ yields the
reduction of $\log_g(w)$ (mod p). This process can then be iterated to obtain the reduction
modulo $p^2$, and so on.
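The iteration described in Remark 11.6.22 recovers the base-p digits of the logarithm one at a time. In the sketch below the prime dividing the order is called `ell` and the field modulus is called `modulus`, to avoid a clash of names; both names, and the use of exhaustive search for each digit, are our own choices. Negative exponents in `pow` require Python 3.8 or later.

```python
def dlog_prime_power(g, w, ell, k, modulus):
    """log_g(w) in F_modulus^*, where g has order ell**k for a prime ell."""
    h = pow(g, ell ** (k - 1), modulus)          # element of order ell
    n = 0                                        # digits of the logarithm found so far
    for j in range(k):
        # Remove the known low digits; what remains has logarithm divisible by ell**j.
        rest = w * pow(g, -n, modulus) % modulus
        # Raising to ell**(k-1-j) lands in <h> and isolates digit j.
        u = pow(rest, ell ** (k - 1 - j), modulus)
        digit = next(d for d in range(ell) if pow(h, d, modulus) == u)
        n += digit * ell ** j
    return n

# g = 3 is primitive modulo 17, so it has order 16 = 2^4.
print(dlog_prime_power(3, 13, 2, 4, 17))    # 4, since 3^4 = 81 = 13 (mod 17)
```

Each of the k rounds solves one logarithm in a group of `ell` elements, which is where the square-root methods of the following paragraphs would be applied for large `ell`.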
11.6.23 Remark The above remarks, combined with results of the next section, show that when
the complete factorization of the order of g can be obtained, discrete logarithms can be
computed in not much more than $r^{1/2}$ operations in the group, where r is the largest prime
in the factorization.
11.6.24 Remark In a finite field, any function can be represented by a polynomial. For the discrete
logarithm, such polynomials do turn out to have some aesthetically pleasing properties, see
[2189, 2244, 2922, 2968]. However, so far they have turned out to be of no practical use
whatsoever.
11.6.25 Remark We next consider some algorithms for discrete logarithms that work in very general
groups. The basic one is the baby steps–giant steps method that combines time and space,
due to Shanks [2607].
11.6.26 Algorithm Baby steps–giant steps algorithm: Suppose that G is a group and g is an element
of G of finite order m. If $h \in \langle g \rangle$, $h = g^k$, and $w = \lceil m^{1/2} \rceil$, then k can be written as $k = aw + b$
for some (often non-unique) a, b with 0 ≤ a, b < w. To find such a representation, compute
the set $A = \{g^{jw} : 0 \le j < w\}$ and sort it. This takes $m^{1/2} + O(\log(m))$ group operations
and $O(m^{1/2} \log(m))$ sorting steps, which are usually very easy, since they can be performed
on bit strings, or even initial segments of bit strings. Next, for 0 ≤ i < w, compute $h g^{-i}$ and
check whether it is present in A. When it is, we obtain the desired representation $k = jw + i$.
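The algorithm translates almost line by line into code. The sketch below works in the multiplicative group of a prime field F_p and, as a deliberate substitution, stores the giant steps in a hash table (a Python dictionary) instead of a sorted list; the function name and example values are ours.

```python
from math import isqrt

def baby_steps_giant_steps(g, h, m, p):
    """Return k with g**k == h in F_p^*, where g has order m; None if h is not in <g>."""
    w = isqrt(m - 1) + 1                 # w = ceil(sqrt(m)) for m >= 1
    giant = pow(g, w, p)
    table = {}                           # maps g^(j*w) to j for 0 <= j < w
    x = 1
    for j in range(w):
        table.setdefault(x, j)
        x = x * giant % p
    g_inv = pow(g, -1, p)                # modular inverse, Python 3.8+
    y = h % p                            # y runs through h * g^(-i)
    for i in range(w):
        if y in table:
            return (table[y] * w + i) % m
        y = y * g_inv % p
    return None

print(baby_steps_giant_steps(5, 8, 22, 23))    # 6
```

Both loops run at most w times, so the running time and the storage are each about $m^{1/2}$ group elements.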
11.6.27 Remark The baby steps–giant steps technique has the advantage of being fully deterministic.
Its principal disadvantage is that it requires storage of approximately $m^{1/2}$ group
elements. A space-time tradeoff is available, in that one can store a smaller list (the set
A in the notation above, with fewer but larger “giant steps”) but then have to do more
computing (more “baby steps”).
11.6.28 Remark The baby steps–giant steps algorithm extends easily to many cases where the
discrete logarithm is restricted in some way. For example, if it is known that logg (w) lies
in an interval of length n, the basic approach sketched above can be modified to find it in
$O(n^{1/2})$ group operations (plus the usual sorting steps). Similarly, if the discrete logarithm
k is allowed to have only small digits when represented in some base (say binary digits in
base 10), then the running time will be about the square root of the number of possibilities
for k. For some other recent results, see [2708].
11.6.29 Remark In 1978, Pollard invented two randomized methods for computing discrete log-
arithms in any group, the rho method, and the kangaroo (or lambda) technique [2413].
Just like Pollard’s earlier rho method for integer factorization, they depend on the birthday
paradox, which says that if one takes a (pseudo) random walk on a completely connected
graph of n vertices, one is very likely to revisit the same vertex in about $n^{1/2}$ steps. These
discrete logarithm algorithms also depend, just as the original rho method does, on the
Floyd algorithm (Section 3.1 of [1765]) for detecting cycles with little memory at some cost
in running time, in that they compare $x_{2i}$ to $x_i$, where $x_i$ is the position of the random
walk at time i.
11.6.30 Remark Since the rho and kangaroo methods for discrete logarithms are probabilistic, they
cannot guarantee a solution, but heuristics suggest, and experiments confirm, that both run
in expected time $O(m^{1/2})$, where m is the order of the group. This is the same computational
effort as for the baby steps–giant steps algorithm. However, the rho and kangaroo methods
have two advantages. One is that they use very little memory. Another one is that, as was
first shown by van Oorschot and Wiener [2852], they can be parallelized, with essentially
linear speedup, so that k processors find a solution about k times faster than a single one.
We sketch just the standard version of the rho method, and only briefly.
11.6.31 Algorithm Rho algorithm for discrete logarithms: Partition the group $\langle g \rangle$ of order m into
three roughly equal sets $S_1$, $S_2$, and $S_3$, using some property that is easy to test, such as
the first few bits of a canonical representation of the elements of G. To compute $\log_g(h)$,
define a sequence $w_0, w_1, \ldots$ by $w_0 = g$ and for $i \ge 0$, $w_{i+1} = w_i^2$, $w_i g$, or $w_i h$, depending
on whether $w_i \in S_1$, $S_2$, or $S_3$. Then each $w_i$ is of the form
$$w_i = g^{a_i} h^{b_i}$$
for some integers $a_i, b_i$. If the procedure of moving from $w_i$ to $w_{i+1}$ behaves like a random
walk (as is expected), then in $O(m^{1/2})$ steps we will find i such that $w_i = w_{2i}$, and this will
give a congruence
$$a_i + b_i \log_g(h) \equiv a_{2i} + b_{2i} \log_g(h) \pmod{m}.$$
Depending on the greatest common divisor of m and $b_i - b_{2i}$ this congruence will typically
either yield $\log_g(h)$ completely, or give some stringent congruence conditions, which with
the help of additional runs of the algorithm will provide a complete solution.
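A minimal Python sketch of the rho iteration, for a subgroup of order m of the multiplicative group modulo a prime p, might look as follows. Several choices here are assumptions made for illustration and are not taken from the text: the partition uses the residue modulo 3 of the integer representative, a degenerate collision triggers a restart from $g^{a_0}$ with a new $a_0$, and the names `rho_step` and `rho_dlog` are hypothetical.

```python
from math import gcd

def rho_step(w, a, b, g, h, p, m):
    """One step of the walk, maintaining w = g^a * h^b (mod p)."""
    s = w % 3                                  # assumed partition into S1, S2, S3
    if s == 0:                                 # S1: square
        return w * w % p, 2 * a % m, 2 * b % m
    if s == 1:                                 # S2: multiply by g
        return w * g % p, (a + 1) % m, b
    return w * h % p, a, (b + 1) % m           # S3: multiply by h

def rho_dlog(g, h, p, m, max_starts=50):
    """Return k in [0, m) with g^k = h (mod p), or None if every start degenerates."""
    for a0 in range(1, max_starts + 1):
        slow = (pow(g, a0, p), a0 % m, 0)      # a0 = 1 gives w_0 = g as in the text
        fast = slow
        while True:                            # Floyd: compare w_i with w_{2i}
            slow = rho_step(*slow, g, h, p, m)
            fast = rho_step(*rho_step(*fast, g, h, p, m), g, h, p, m)
            if slow[0] == fast[0]:
                break
        (_, a_i, b_i), (_, a_2i, b_2i) = slow, fast
        db = (b_i - b_2i) % m                  # db * log_g(h) = da (mod m)
        da = (a_2i - a_i) % m
        d = gcd(db, m)
        if d == m or da % d:                   # degenerate collision: try another start
            continue
        m_red = m // d
        k0 = (da // d) * pow(db // d, -1, m_red) % m_red
        for t in range(d):                     # d candidate solutions modulo m
            k = k0 + t * m_red
            if pow(g, k, p) == h % p:
                return k
    return None
```

With p = 1019 and g = 4, which generates the subgroup of prime order m = 509, the call `rho_dlog(4, pow(4, 123, 1019), 1019, 509)` is expected to recover 123 while storing only two walk positions.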
11.6.32 Remark The low memory requirements and parallelizability of the rho and kangaroo al-
gorithms have made them the methods of choice for solving general discrete logarithm
problems. There is a substantial literature on various modifications, although they do not
improve too much on the original parallelization observations of [2852]. Some references are
[613, 1734, 2414, 2790].
11.6.33 Remark The rho method, as outlined above, requires knowledge of the exact order m of the
group. The kangaroo method only requires an approximation to m. The kangaroo algorithm
can also be applied effectively when the discrete logarithm is known to lie in a restricted
range.
11.6.34 Remark The rest of this section is devoted to a brief overview of index calculus algo-
rithms for discrete logarithms. Unlike the Shanks and Pollard methods of the previous two
subsections, which take exponential time, about $m^{1/2}$ for a group of order m, the index
calculus techniques are subexponential, with running times closer to $\exp((\log m)^{1/2})$ and
even $\exp((\log m)^{1/3})$. However, they apply directly only to finite fields. That is why much
of the research on discrete logarithms in other groups of cryptographic interest, such as on
elliptic curves, is devoted to finding ways to reduce those problems to ones in finite fields.
11.6.35 Remark In the case of DSA mentioned at the beginning of this section, the recommended
size of the modulus p has increased very substantially, from 512 to 1024 bits when DSA was
first adopted, to the range of 2048 to 3072 bits more recently. The FIPS 186-3 standard
specifies bit lengths for the two primes p and r of (1024, 160), (2048, 224), (2048, 256), and
(3072, 256). The relative sizes of p and r were selected to offer approximately equal levels
of security against index calculus algorithms (p) and generic discrete logarithm attacks (r).
The reason for the much faster growth in the size of p is that with the subexponential
running time estimates, the effect of growing computing power is far more pronounced on
the p side than on the r side. In addition, while there has been no substantial theoretical
advance in index calculus algorithms in the last two decades, there have been numerous
small incremental improvements, several cited later in more detailed discussions. On the
other hand, there has been practically no progress in generic discrete logarithm algorithms,
except for parallelization.
11.6.36 Remark The basic idea of index calculus algorithms dates back to Kraitchik, and is also
key to all fast integer factorization algorithms. In a finite field $\mathbb{F}_q$ with primitive element g,
if we find some elements $x_i, y_j \in \mathbb{F}_q$ such that
$$\prod_{i=1}^{r} x_i = \prod_{j=1}^{s} y_j,$$
then
$$\sum_{i=1}^{r} \log_g x_i \equiv \sum_{j=1}^{s} \log_g y_j \pmod{q - 1}.$$
If enough equations are collected, this linear system can be solved for the $\log_g x_i$ and $\log_g y_j$.
Singular systems are not a problem in practice, since typically computations generate con-
siderably more equations than unknowns, and one can arrange for g itself to appear in the
multiplicative relations.
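As a toy illustration of how multiplicative relations become linear congruences modulo q − 1, the following Python sketch collects relations of the form $g^e = \prod_i q_i^{c_i}$ in a small prime field. It is a sketch only: the factor base, the search over consecutive exponents, and the names `factor_over_base` and `collect_relations` are assumptions made here, and the linear algebra step is omitted.

```python
def factor_over_base(n, base):
    """Return the exponent list of n over `base` if n factors completely, else None."""
    exps = []
    for q in base:
        e = 0
        while n % q == 0:
            n //= q
            e += 1
        exps.append(e)
    return exps if n == 1 else None

def collect_relations(g, p, base, count):
    """Collect up to `count` relations (e, exps) with g^e = prod(base[i]^exps[i]) mod p.

    Each relation gives the congruence
        e = sum(exps[i] * log_g(base[i]))  (mod p - 1).
    """
    relations = []
    e = 1
    while len(relations) < count and e < p - 1:
        exps = factor_over_base(pow(g, e, p), base)
        if exps is not None:
            relations.append((e, exps))
        e += 1
    return relations
```

For instance, with p = 1019 and the primitive element g = 2, the exponent e = 10 yields $2^{10} \equiv 5$, that is, the relation $10 \equiv \log_2 5 \pmod{1018}$.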
11.6.37 Remark To compute $\log_g w$ for some particular w ∈ F with index calculus algorithms,
it is often necessary to run a second stage that produces a relation involving w and the
previously computed discrete logarithms. In some algorithms the second stage is far easier
than the initial computation, in others it is of comparable difficulty.
11.6.38 Remark For a long time (see [2306] for references), the best index calculus algorithms for
both integer factorization and discrete logarithms had running times of the form
$$\exp\bigl((c + o(1)) (\log q)^{1/2} (\log \log q)^{1/2}\bigr) \quad \text{as } q \to \infty$$
for various constants c > 0, where q denotes the integer being factored or the size of the
finite field. The first practical method that broke through this running time barrier was
Coppersmith’s algorithm [717] for discrete logarithms in fields of size $q = 2^k$ (and more
generally, of size $q = p^k$ where p is a small prime and k is large). It had running time of
approximately
$$\exp\bigl(C (\log q)^{1/3} (\log \log q)^{2/3}\bigr),$$
where the C varied slightly, depending on the distance from k to the nearest power of p, and
in the limit as k → ∞ it oscillated between two bounds [2306]. The function field sieve of
Adleman [16], which also applies to fields with $q = p^k$ where p is relatively small, improves
on the Coppersmith method, but has a similar asymptotic running time estimate. For the
latest results on its developments, see [1626, 1628, 2543].
11.6.39 Remark The running time of Coppersmith’s algorithm turned out to also apply to the
number field sieve. This method, which uses algebraic integers, was developed for integer
factorization by Pollard and H. Lenstra, with subsequent contributions by many others. It
was adopted for discrete log computations in prime fields by Gordon [1325], with substantial
improvements by other researchers. For the latest estimates and references, see [711, 1627,
2544, 2545].
11.6.40 Remark The index calculus algorithms depend on a multiplicative splitting of some ele-
ments, such as integers or polynomials, into such elements drawn from a smaller collection.
This smaller collection usually is made up of elements that by some measure (norm) are
small. The essence of index calculus algorithms is to select general elements from the large
set at random, but as intelligently as possible in order to maximize the chances they will
have the desired type of splitting. Usually elements that do have such splittings are called
“smooth.”
11.6.41 Remark There are rigorous analyses that provide estimates of how often elements in various
domains are “smooth.” For ordinary integers, there are the estimates of [1501]. For algebraic
integers, we can use [441, 2574]. For polynomials over finite fields, recent results are [2348].
11.6.42 Remark Index calculus algorithms for discrete logarithms require the solution of linear
equations modulo q − 1, where q is the size of the field. As in the Silver–Pohlig–Hellman
method, the Chinese Remainder Theorem (and an easy reduction of the case of a power
of a prime to that of the prime itself) reduces the problem to that of solving the system
modulo primes r that divide q − 1. (For more extensive discussion of linear algebra over
finite fields, see Section 13.4.)
11.6.43 Remark The linear algebra problems that arise in index calculus algorithms for integer
factorization are very similar, but simpler, in that they are all just modulo 2. For discrete
logarithm problems to be hard, they have to be resistant to the Silver–Pohlig–Hellman at-
tack. Hence q − 1 has to have at least one large prime factor r, and so the linear system
has to be solved modulo a large prime. That increases the complexity of the linear solution
computation, and thus provides slightly higher security for discrete logarithm cryptosys-
tems.
11.6.44 Remark A key factor that enables the solution of the very large linear systems that arise in
index calculus algorithms is that these systems are very sparse. (Those “smooth” elements
do not involve too many of the “small” elements in the multiplicative relations.) Usually
the structured Gaussian elimination method (proposed in [2306] and called there intelligent
Gaussian elimination, afterwards renamed in the first practical demonstration of it [1839],
now sometimes called filtering) is applied first. It combines the relations in ways that re-
duce the system to be solved and do not destroy the sparsity too far. Then the conjugate
gradient, the Lanczos, or the Wiedemann methods (developed in [721, 2976], the first two
demonstrated in practice in [1839]) that exploit sparsity are used to obtain the final solution.
11.6.45 Remark For the extremely large linear systems that are involved in record-setting com-
putations, distributed computation is required. The methods of choice, once structured
Gaussian elimination is applied, are the block Lanczos and block Wiedemann methods
[718, 719, 2134].
11.6.46 Remark Some symbolic algebra systems incorporate implementations of the sparse linear
system solvers mentioned above.
11.6.47 Remark As a demonstration of the effectiveness of the sparse methods, the record factor-
ization of RSA768 [1753], mentioned below, produced 64 billion linear relations. These were
reduced, using structured Gaussian elimination, to a system of almost 200 million equations
in about that many unknowns. This system was still sparse, with the average equation
involving about 150 unknowns. The block Wiedemann method was then used to solve the
resulting system.
11.6.48 Remark Extreme caution should be exercised when drawing any inferences about relative
performance of various integer factorization and discrete logarithm algorithms from the
record results listed here. The computing resources, as well as effort involved in program-
ming, varied widely among the various projects.
11.6.49 Remark As of the time of writing (early 2012), the largest cryptographically hard integer
(i.e., one that was chosen specifically to resist all known factoring attacks, and is a product
of two roughly equal primes) that has been factored is RSA768, a 768-bit (232 decimal digit)
integer from the RSA challenge list [1753]. This was the result of a large collaboration across
the globe stretching over more than two years, and used the general number field sieve.
11.6.50 Remark The largest discrete logarithm case for a prime field Fp (with p chosen to resist
simple attacks) that has been solved is for a 530-bit (160 decimal digit) prime p. This was
accomplished by Kleinjung in 2007 [1752]. The number field sieve was used.
11.6.51 Remark In fields of characteristic two, the largest case that has been solved is that of Fq
with $q = 2^{613}$, using the function field sieve. (An earlier record was for $q = 2^{607}$ using the
Coppersmith algorithm.) This computation took several weeks on a handful of processors,
and was carried out by Joux and Lercier in 2005 [1625].
11.6.52 Remark The largest generic discrete logarithm problem that has been solved in a hard
case is that of discrete logarithms over an elliptic curve modulo a 112-bit prime, thus a
group of size about $2^{112}$. This is due to Bos and Kaihara [353], and was done in 2009. Right
now, a large multi-year collaborative effort is under way to break the Certicom ECC2K-130
challenge, which involves computing discrete logarithms on an elliptic curve over a field
with $2^{131}$ elements [160]. All these efforts rely on parallelized versions of the Pollard rho
method.
See Also
References Cited: [16, 160, 353, 410, 441, 613, 661, 711, 717, 718, 719, 721, 1325, 1444,
1501, 1625, 1626, 1627, 1628, 1734, 1752, 1753, 1765, 1839, 1938, 1939, 2044, 2080, 2134,
2189, 2219, 2244, 2306, 2308, 2348, 2406, 2413, 2414, 2543, 2544, 2545, 2574, 2607, 2623,
2630, 2708, 2720, 2790, 2852, 2922, 2968, 2976]
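11.7.1 Definition Let p be a prime number and let n be a positive integer. An explicit model for a
finite field of size $p^n$ is a field whose underlying additive group is $\mathbb{F}_p^n = \mathbb{F}_p \times \mathbb{F}_p \times \cdots \times \mathbb{F}_p$.
The multiplication of such a field is $\mathbb{F}_p$-bilinear, so it is determined by the $n^3$ structure
constants $a_{ijk} \in \mathbb{F}_p$ with $e_i \cdot e_j = \sum_k a_{ijk} e_k$, where $e_0, \ldots, e_{n-1}$ is the standard basis of $\mathbb{F}_p^n$.
Thus, one can specify an explicit model using $O(n^3 \log p)$ bits. When we say that for an
algorithm an explicit model is input or output we assume that the explicit data consisting of
p and $(a_{ijk})$ are given as input or output. There is a deterministic polynomial time algorithm
that given such data first decides whether p is prime [43], and then decides whether it defines
an explicit model for a finite field [1895, Section 2].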
11.7.3 Remark Let A be a field of characteristic p > 0 and size $p^n$ and let $b_0, \ldots, b_{n-1}$ be a basis
of A as a vector space over $\mathbb{F}_p$. We then obtain an explicit model as follows. Write ψ for
the unique $\mathbb{F}_p$-vector space isomorphism $\mathbb{F}_p^n \to A$ sending $e_i$ to $b_i$ for $0 \le i < n$. Define a
multiplication map on $\mathbb{F}_p^n$ by $v \cdot w = \psi^{-1}(\psi(v) \cdot \psi(w))$, for $v, w \in \mathbb{F}_p^n$. Together with vector
addition, this multiplication makes $\mathbb{F}_p^n$ into a field.
11.7.4 Remark An alternative space-efficient way to give explicit models is to give an irreducible
polynomial $f \in \mathbb{F}_p[x]$ of degree n over $\mathbb{F}_p$, which we can encode with $O(n \log p)$ bits. Using
the $\mathbb{F}_p$-basis $1, x, \ldots, x^{n-1}$ of the field $\mathbb{F}_p[x]/(f)$ we then obtain an explicit model of a
field of size $p^n$ as in Remark 11.7.3. One can convert between this representation and
the representation with $n^3$ elements by deterministic polynomial time algorithms [1895,
Theorem 1.1]. Since our concern in this section is only about whether an algorithm runs in
polynomial time or not, and not about the degree of the polynomial if it does, we use the
more flexible setup of the explicit data consisting of the $n^3$ elements $a_{ijk}$ in $\mathbb{F}_p$.
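The passage from an irreducible polynomial to the $n^3$ elements $a_{ijk}$ can be sketched in Python as follows. This is a minimal sketch under assumptions: `a[i][j][k]` is taken to be the coefficient of $x^k$ in $x^i \cdot x^j \bmod f$, the polynomial f is given as a list of coefficients in increasing degree and is assumed monic and irreducible (this is not checked), and the function names are hypothetical.

```python
def structure_constants(f, p):
    """Return a with a[i][j][k] = coefficient of x^k in (x^i * x^j mod f) over F_p.

    f = [f_0, ..., f_{n-1}, 1] lists the coefficients of a monic polynomial
    of degree n; irreducibility is assumed, not verified.
    """
    n = len(f) - 1
    powers = []                      # powers[t] = coefficients of x^t mod f
    cur = [1] + [0] * (n - 1)
    for _ in range(2 * n - 1):
        powers.append(cur)
        top = cur[-1]
        # Multiply by x and reduce with x^n = -(f_0 + f_1 x + ... + f_{n-1} x^{n-1}).
        cur = [(-top * f[0]) % p] + [(cur[k - 1] - top * f[k]) % p for k in range(1, n)]
    return [[powers[i + j] for j in range(n)] for i in range(n)]

def multiply(v, w, a, p):
    """Multiply two vectors of F_p^n using the structure constants a."""
    n = len(v)
    return [sum(v[i] * w[j] * a[i][j][k] for i in range(n) for j in range(n)) % p
            for k in range(n)]
```

For the field of size 9 given by $f = x^2 + 1$ over $\mathbb{F}_3$, `structure_constants([1, 0, 1], 3)` encodes $x \cdot x = -1$, so that `multiply([0, 1], [0, 1], a, 3)` returns `[2, 0]`.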
11.7.5 Theorem [789] There is a deterministic polynomial time algorithm such that
1. on input two explicit models for finite fields A, B of the same cardinality, it
produces a field isomorphism $\varphi_{A,B} : A \to B$;
2. for any three explicit models A, B, C for finite fields of the same size we have
$\varphi_{B,C} \circ \varphi_{A,B} = \varphi_{A,C}$.
11.7.6 Remark The isomorphism that the algorithm produces is given as explicit output by listing
the entries of the square matrix associated to the underlying linear map over the prime
field; see [1895, Section 2] for a proof of this theorem without Property 2. By Property 2
this algorithm can be used for “coercion” in computer algebra [361].
11.7.7 Definition An algorithmic model for finite fields is a sequence $(A_q)_q$, where q runs over all
prime powers and $A_q$ is an explicit model for a finite field of size q, such that there is
an algorithm that on input q produces $A_q$.
11.7.8 Example For each prime p and n ≥ 1 one can take the explicit model $A_{p^n}$ to be given as
in Remark 11.7.4 by the first irreducible polynomial of degree n over $\mathbb{F}_p$ with respect to a
lexicographic ordering of polynomials of degree n over $\mathbb{F}_p$.
11.7.9 Example Conway polynomials [1452, 1966] provide an algorithmic model for finite fields
that has some additional properties. However computing Conway polynomials is laborious
and there is only a rather limited table of known Conway polynomials.
11.7.10 Theorem [789] There is an algorithmic model $(S_q)_q$ such that there is a deterministic
polynomial time algorithm that on input an explicit model A for a field of size q
1. computes the model $S_q$;
2. computes an isomorphism of fields $A \to S_q$.
11.7.11 Remark The theorem in fact determines $(S_q)_q$ uniquely up to “polynomial time base
change.” More precisely, suppose that for every prime power $q = p^n$ we have an invertible
n × n-matrix $M_q$ over $\mathbb{F}_p$ and suppose that there is a deterministic polynomial time
algorithm that on input an explicit model for a field of size q produces $M_q$. Given $(S_q)_q$ as
in the theorem we let $(S'_q)_q$ be the algorithmic model that one gets in the manner of Remark
11.7.3 by considering the columns of $M_q$ as an $\mathbb{F}_p$-basis of $S_q$ for every q. Then Theorem
11.7.10 also holds for $(S'_q)_q$. Moreover, every algorithmic model $(S'_q)_q$ with the Properties 1
and 2 of the theorem arises in this way from $(S_q)_q$.
11.7.12 Remark Theorem 11.7.10 implies Theorem 11.7.5. To prove Theorem 11.7.10 one can show
[789] that it holds for the algorithmic model of finite fields (Sq )q , where Sq is the standard
model for a finite field of size q defined below.
Step 2: prime ideals. Let p, r be prime numbers with $p \ne r$, and let l be the number
of factors r in the integer $(p^{\varphi(r)} - 1)/(r^2/r)$. Denote by $S_{p,r}$ the set of prime ideals $\mathfrak{p}$ of
$B_r$ that satisfy $p \in \mathfrak{p}$. This set is finite of cardinality $r^l$, and for each $\mathfrak{p} \in S_{p,r}$ there exists
a unique system $(a_{\mathfrak{p},j})_{0 \le j < lr}$ of integers $a_{\mathfrak{p},j} \in \{0, 1, \ldots, p - 1\}$ such that $\mathfrak{p}$ is generated
by p together with $\{\eta_{r,k+1,i} - a_{\mathfrak{p},i+kr} : 0 \le k < l,\ 0 \le i < r\}$. We define a total ordering
on $S_{p,r}$ by putting $\mathfrak{p} < \mathfrak{q}$ if there exists $h \in \{0, 1, \ldots, lr - 1\}$ such that $a_{\mathfrak{p},j} = a_{\mathfrak{q},j}$ for all
j < h and $a_{\mathfrak{p},h} < a_{\mathfrak{q},h}$. The smallest element of $S_{p,r}$ in this ordering is denoted by $\mathfrak{p}_{p,r}$.
We define $\mathbb{F}_{p,r}$ to be the ring $B_r/\mathfrak{p}_{p,r}$, and for $k \in \mathbb{Z}_{>0}$ we define $\alpha_{p,r,k} \in \mathbb{F}_{p,r}$ to be
the residue class of $\eta_{r,k+l,0}$ modulo $\mathfrak{p}_{p,r}$.
Step 3: equal characteristic. Let p be a prime number and put $\mathbb{F}_p = \mathbb{Z}/p\mathbb{Z}$. Let the
element f = f(x, y) of the polynomial ring $\mathbb{F}_p[x, y]$ be defined by $f = x^p - 1 - y \cdot \sum_{i=1}^{p-1} x^i$.
We define $\mathbb{F}_{p,p}$ to be the polynomial ring $\mathbb{F}_p[x_1, x_2, x_3, \ldots]$ modulo the ideal generated
by $\{f(x_1, 1),\ f(x_{k+1}, x_k) : k > 0\}$. For $k \in \mathbb{Z}_{>0}$ we denote the image of $x_k$ in $\mathbb{F}_{p,p}$ by
$\alpha_{p,p,k}$.
Step 4: an algebraic closure. Let p be a prime number. Then for any prime number r it
is true that the ring $\mathbb{F}_{p,r}$ is a field containing $\mathbb{F}_p$; that for each $k \in \mathbb{Z}_{>0}$, the element $\alpha_{p,r,k}$
of $\mathbb{F}_{p,r}$ is algebraic of degree $r^k$ over $\mathbb{F}_p$; and that one has $\mathbb{F}_{p,r} = \mathbb{F}_p(\alpha_{p,r,1}, \alpha_{p,r,2}, \ldots)$.
We write $\bar{\mathbb{F}}_p$ for the tensor product, over $\mathbb{F}_p$, of the rings $\mathbb{F}_{p,r}$, with r ranging over
the set of all prime numbers. For any prime number r and $k \in \mathbb{Z}_{>0}$, the image of $\alpha_{p,r,k}$
under the natural ring homomorphism $\mathbb{F}_{p,r} \to \bar{\mathbb{F}}_p$ is again denoted by $\alpha_{p,r,k}$.
The ring $\bar{\mathbb{F}}_p$ is a field containing $\mathbb{F}_p$, and it is an algebraic closure of $\mathbb{F}_p$. We have
$\bar{\mathbb{F}}_p = \mathbb{F}_p(\alpha_{p,r,k} : r \text{ prime},\ k \in \mathbb{Z}_{>0})$, each $\alpha_{p,r,k}$ being algebraic of degree $r^k$ over $\mathbb{F}_p$.
Step 5: a vector space basis. Let p be a prime number. For each $s \in \mathbb{Q}/\mathbb{Z}$, the element
$\varepsilon_s \in \bar{\mathbb{F}}_p$ is defined as follows. There exists a unique system of integers $(c_{r,k})_{r,k}$, with r
ranging over the set of prime numbers and k over $\mathbb{Z}_{>0}$, such that each $c_{r,k}$ belongs to
$\{0, 1, \ldots, r - 1\}$ and s equals the residue class of $\sum_{r,k} c_{r,k}/r^k$ modulo $\mathbb{Z}$, the sum being
finite in the sense that $c_{r,k} = 0$ for all but finitely many pairs r, k. With that notation,
$\varepsilon_s$ is defined to be the finite product $\prod_{r,k} \alpha_{p,r,k}^{c_{r,k}}$.
The system $(\varepsilon_s)_{s \in \mathbb{Q}/\mathbb{Z}}$ is a vector space basis of $\bar{\mathbb{F}}_p$ over $\mathbb{F}_p$. In addition, for each
$s \in \mathbb{Q}/\mathbb{Z}$ the degree of $\varepsilon_s$ over $\mathbb{F}_p$ equals the order of s in the additive group $\mathbb{Q}/\mathbb{Z}$.
For n ≥ 1 the standard model $S_{p^n}$ for a field of size $p^n$ is the explicit model
that one obtains in the manner of Remark 11.7.3 by considering the $\mathbb{F}_p$-basis
$\varepsilon_0, \varepsilon_{1/n}, \varepsilon_{2/n}, \ldots, \varepsilon_{(n-1)/n}$ of the unique subfield $\mathbb{F}_{p^n}$ of $\bar{\mathbb{F}}_p$ of size $p^n$.
11.7.14 Example Suppose that p is an odd prime number. The standard models for fields of size $p^n$
where n is a power of 2 can be computed as follows. Put $l = \mathrm{ord}_2((p^2 - 1)/8)$ and for each
$k \ge -l$ consider the image $\alpha_{p,2,k}$ of $\eta_{2,k+l,0} = \zeta_{2^{k+l+2}} + \zeta_{2^{k+l+2}}^{-1}$ in the field $\mathbb{F}_{p,2} = B_2/\mathfrak{p}_{p,2}$.
We have $\alpha_{p,2,-l} = 0$, and for each $k \ge -l$ the element $\alpha_{p,2,k+1}$ is a root of the quadratic
polynomial $f_k = x^2 - 2 - \alpha_{p,2,k} \in \mathbb{F}_p(\alpha_{p,2,k})[x]$.
If k < 0 then $f_k$ has two roots in $\mathbb{F}_p$ and by the choice of the prime $\mathfrak{p}_{p,2}$ in Definition
11.7.13, the element $\alpha_{p,2,k+1}$ is the smallest of the two roots if we order $\mathbb{F}_p$ as 0, 1, . . . , p − 1.
Starting from $\alpha_{p,2,-l} = 0$, this enables us to find $\alpha_{p,2,-l+1}, \ldots, \alpha_{p,2,0} \in \mathbb{F}_p$.
One can show that $f_k$ is irreducible in $\mathbb{F}_p(\alpha_{p,2,k})[x]$ for all k ≥ 0. In particular, the
polynomial $f_0 = x^2 - 2 - \alpha_{p,2,0} \in \mathbb{F}_p[x]$ gives the standard model for a field of size $p^2$.
Moreover, for every k ≥ 1 the irreducible polynomial over $\mathbb{F}_p$ of the generator $\alpha_{p,2,k}$ of $\mathbb{F}_{p^{2^k}}$
over $\mathbb{F}_p$ is $(x^2 - 2)^{\circ k} - \alpha_{p,2,0}$, where for a polynomial $f \in \mathbb{F}_p[x]$ we let $f^{\circ k}$ denote the k-fold
composition f(f(· · · f(x) · · · )).
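For small odd primes, the computation of $\alpha_{p,2,0}$ described in this example can be sketched in Python. The sketch makes simplifying assumptions: a brute-force search over $\{0, \ldots, p-1\}$ stands in for a proper square-root algorithm, so it is only usable for tiny p, and the names `ord2` and `alpha_p2_0` are hypothetical.

```python
def ord2(n):
    """Number of factors 2 in the positive integer n."""
    e = 0
    while n % 2 == 0:
        n //= 2
        e += 1
    return e

def alpha_p2_0(p):
    """Return alpha_{p,2,0} in {0, ..., p-1} for an odd prime p.

    Starts from alpha_{p,2,-l} = 0 and repeatedly takes the smallest root
    of x^2 - 2 - alpha in F_p, as described in the example.
    """
    l = ord2((p * p - 1) // 8)
    alpha = 0
    for _ in range(l):
        target = (2 + alpha) % p
        # Brute-force root search; the text guarantees a root exists for k < 0.
        alpha = min(x for x in range(p) if (x * x - target) % p == 0)
    return alpha
```

For p = 17 one has l = 2, and the chain is 0, 6, 5, so that $f_0 = x^2 - 7$; Euler's criterion confirms that 7 is a non-square modulo 17, in line with the irreducibility claim.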
11.7.15 Remark Note that the definition above also provides a standard embedding $S_q \to S_{q^d}$ for
every prime power q and every integer d ≥ 1, which is induced by the inclusion $\mathbb{F}_q \subset \mathbb{F}_{q^d}$.
A composition of standard embeddings is again a standard embedding.
11.7.16 Theorem [789] There is a probabilistic algorithm that on input a prime p and n ≥ 1
computes $S_{p^n}$ in expected time bounded by a polynomial in log p and n.
11.7.17 Remark This theorem is an easy consequence of the fact that explicit models can be made
in probabilistic polynomial time [1892], and Theorem 11.7.10 above. Under the assump-
tion of the generalized Riemann hypothesis, this can also be achieved with a deterministic
polynomial time algorithm [14].
11.7.18 Remark An algorithm as in Theorem 11.7.16 is semi-deterministic: it is a probabilistic
algorithm that gives the same output when it runs twice on the same input. Thus, the
output does not depend on the random numbers drawn by the algorithm.
11.7.19 Example One way to find a non-square in $\mathbb{F}_p$ for a given odd prime p in semi-deterministic
polynomial time is as follows. Start with the value −1. If it is a square, find the two square
roots with the probabilistic method of Tonelli-Shanks [660, 1.5.1] and select the root that
has the smallest representative in the set {1, . . . , p − 1}. If this is a square in $\mathbb{F}_p$, repeat.
Within O(log p) iterations we find a non-square $u_p$ in $\mathbb{F}_p$. By considering the $\mathbb{F}_p$-basis $1, \sqrt{u_p}$
of the field $\mathbb{F}_p(\sqrt{u_p})$ we find a semi-deterministic algorithm that given p produces an explicit
model for a field of size $p^2$ in polynomial time. This proves a special case of Theorem 11.7.16.
The models produced in this way are not the standard models, which as we saw in
Example 11.7.14 are obtained by taking inverse images of 0 under iterates of the map
$x \mapsto x^2 - 2$ rather than inverse images of −1 under iterates of the map $x \mapsto x^2$. In both
cases the algorithm produces quadratic polynomials until it encounters an irreducible one,
but only the method of Example 11.7.14 has the advantage that for all p all subsequent
quadratic polynomials are also irreducible.
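The iteration of this example can be sketched in Python for small primes. As a simplifying assumption, a brute-force search replaces the Tonelli-Shanks method mentioned in the text, which makes the sketch fully deterministic but only practical for tiny p; the function names are hypothetical.

```python
def is_square(u, p):
    """Euler's criterion for a nonzero residue u modulo an odd prime p."""
    return pow(u, (p - 1) // 2, p) == 1

def nonsquare(p):
    """Return the non-square u_p obtained by iterated square roots of -1."""
    u = p - 1                                   # the value -1 in F_p
    while is_square(u, p):
        # Smallest representative in {1, ..., p-1} of the two square roots of u.
        u = min(x for x in range(1, p) if x * x % p == u)
    return u
```

For p = 17 the chain is 16, 4, 2, 6, and 6 is a non-square modulo 17, so $u_{17} = 6$ under this rule.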
11.7.20 Remark In a similar way, one can prove that there is a semi-deterministic algorithm that
given a prime power q, and a prime l and r ≥ 1 with $l^r \mid q - 1$, computes a root of unity of
order $l^r$ in $S_q$ in expected time polynomial in l and log q.
See Also
References Cited: [14, 43, 361, 660, 789, 790, 1452, 1892, 1895, 1966]
12
Curves over finite fields
The theory of algebraic curves is essentially equivalent to the theory of algebraic function
fields. The latter requires less background and is closer to the theory of finite fields; therefore
we present here the theory of function fields. At the end of the section, we give a brief
introduction to the language of algebraic curves. Our exposition follows mainly the book
[2714]∗ , other references are [1147, 1296, 1511, 2280, 2281, 2872].
Throughout this section, K denotes a finite field. However, almost all results of this
section hold for arbitrary perfect fields.
12.1.1 Definition An algebraic function field over K is an extension field F/K with the following
properties:
1. There is an element x ∈ F such that x is transcendental over K and the exten-
sion F/K(x) has finite degree.
2. No element z ∈ F \ K is algebraic over K.
The field K is the constant field of F .
12.1.2 Remark
1. We often use the term function field rather than algebraic function field.
2. Property 2 in Definition 12.1.1 is often referred to as: K is algebraically closed in
F , or K is the full constant field of F .
3. If F/K is a function field, then the degree [F : K(z)] is finite for every z ∈ F \ K.
4. Every function field F/K can be generated by two elements, F = K(x, y), where
the extension F/K(x) is finite and separable.
Throughout this section, F/K always means a function field over K.
12.1.3 Example (Rational function fields) The simplest example of a function field over K is the
rational function field F = K(x), with x being transcendental over K. The elements of
K(x) are the rational functions z = f (x)/g(x) where f, g are polynomials over K and g is
not the zero polynomial.
12.1.4 Example (Elliptic and hyperelliptic function fields) Let F be an extension of the rational
function field K(x) of degree [F : K(x)] = 2. For simplicity we assume that char K ≠ 2.
Then there exists an element y ∈ F such that F = K(x, y), and y satisfies an equation over
K(x) of the form
$$y^2 = f(x), \quad \text{with } f \in K[x] \text{ square-free}$$
(i.e., f is not divisible by the square of a polynomial h ∈ K[x] of degree ≥ 1). One shows
that F is rational if deg(f) = 1 or 2. F is an elliptic function field if deg(f) = 3 or 4, and
it is a hyperelliptic function field if deg(f) ≥ 5. See also Definition 12.1.108 and Example
12.1.109. A detailed exposition of elliptic and hyperelliptic function fields is given in Sections
12.2 and 12.4.
∗ The authors thank Springer Science+Business Media for their permission to use in Section 12.1 parts
from H. Stichtenoth’s book Algebraic Function Fields and Codes, GTM 254, 2009.
12.1.5 Remark In case of char K = 2, the definition of elliptic and hyperelliptic function fields
requires some modification, see [2714, Chapters 6.1, 6.2].
12.1.6 Definition A valuation of F/K is a map ν : F → Z ∪ {∞} with the following properties:
1. ν(x) = ∞ if and only if x = 0.
2. ν(xy) = ν(x) + ν(y) for all x, y ∈ F .
3. ν(x + y) ≥ min{ν(x), ν(y)} for all x, y ∈ F .
4. There exists an element z ∈ F such that ν(z) = 1.
5. ν(a) = 0 for all a ∈ K \ {0}.
12.1.7 Remark The symbol ∞ denotes an element not in Z such that ∞ + ∞ = ∞ + n = n + ∞ = ∞
and ∞ > m for all m, n ∈ Z. It follows that $\nu(x^{-1}) = -\nu(x)$ for every nonzero element
x ∈ F. Property 3 above is the Triangle Inequality. The following proposition is often useful.
12.1.8 Proposition (Strict triangle inequality) Let ν be a valuation of the function field F/K and
let x, y ∈ F such that ν(x) ≠ ν(y). Then ν(x + y) = min{ν(x), ν(y)}.
12.1.9 Remark For a valuation ν of F/K, consider the following subsets O, O∗ , P of F :
12.1.10 Definition
1. A subset P ⊆ F is a place of F/K if there exists a valuation ν of F/K such that
P = {z ∈ F | ν(z) > 0}. The valuation ν is uniquely determined by the place
P . Therefore we write ν =: νP and say that νP is the valuation corresponding
to the place P .
2. If P is a place of F/K and νP is the corresponding valuation, then the ring
OP := {z ∈ F | νP (z) ≥ 0} is the valuation ring of F corresponding to P .
3. An element t ∈ F with νP (t) = 1 is a prime element at the place P .
4. Let PF := {P | P is a place of F }.
12.1.11 Remark Since P is a maximal ideal of its valuation ring OP , the residue class ring OP /P
is a field. The constant field K is contained in $\mathcal{O}_P$, and P ∩ K = {0}. Hence one has a
canonical embedding $K \hookrightarrow \mathcal{O}_P/P$. We always consider K as a subfield of $\mathcal{O}_P/P$ via this
embedding.
2. The degree of the field extension FP /K is finite and is the degree of the place
P . We write deg P := [FP : K].
3. A place P ∈ PF is rational if deg P = 1. This means that FP = K.
4. For $z \in \mathcal{O}_P$, denote by $z(P) \in F_P$ the residue class of z in $F_P$. For $z \in F \setminus \mathcal{O}_P$,
set z(P) := ∞. The map from F to $F_P \cup \{\infty\}$ given by $z \mapsto z(P)$ is the residue
class map at P.
12.1.13 Remark For a rational place $P \in \mathbb{P}_F$ and an element $z \in \mathcal{O}_P$, the residue class z(P) is the
(unique) element a ∈ K such that $\nu_P(z - a) > 0$. In this case, one calls the map $z \mapsto z(P)$
from $\mathcal{O}_P$ to K the evaluation map at the place P. We note that the evaluation map is
K-linear. This map plays an important role in the theory of algebraic–geometry codes, see
Section 15.2.
12.1.14 Example We want to describe all places of the rational function field K(x)/K.
The residue class field of this place is isomorphic to K[x]/(h) and therefore we
have deg P = deg(h).
2. Another valuation of K(x)/K is defined by $\nu(z) = \deg(g) - \deg(f)$ for $z = f(x)/g(x) \ne 0$.
The corresponding place is called the place at infinity and is
denoted by $P_\infty$ or (x = ∞). It follows from the definition that
$$P_\infty = \left\{ \frac{f(x)}{g(x)} \;\middle|\; \deg(f) < \deg(g) \right\}.$$
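The valuation at infinity is easy to experiment with. The following Python sketch represents a rational function over a prime field $\mathbb{F}_p$ as a pair of coefficient lists (numerator, denominator) in increasing degree, and computes $\nu(z) = \deg(g) - \deg(f)$; the representation and all function names are assumptions made for illustration.

```python
def trim(f, p):
    """Reduce coefficients mod p and drop trailing zeros (the zero polynomial is [])."""
    f = [c % p for c in f]
    while f and f[-1] == 0:
        f.pop()
    return f

def poly_add(f, g, p):
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return trim([a + b for a, b in zip(f, g)], p)

def poly_mul(f, g, p):
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return trim(out, p)

def nu_inf(num, den, p):
    """Valuation at infinity of num/den: deg(den) - deg(num), and infinity for 0."""
    num, den = trim(num, p), trim(den, p)
    if not num:
        return float("inf")
    return (len(den) - 1) - (len(num) - 1)
```

Over $\mathbb{F}_3$, take $z = x/(x^2+1)$ and $w = x^2 + 1$, with valuations 1 and −2. Their product has valuation −1, and their sum has valuation −2 = min{1, −2}, as the strict triangle inequality predicts.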
12.1.15 Remark [2714, Corollary 1.3.2] Every function field F/K has infinitely many places.
12.1.16 Remark The following theorem states that distinct valuations of F/K are independent of
each other.
12.1.17 Theorem (Approximation theorem) [2714, Theorem 1.3.1] Let P1 , . . . , Pn ∈ PF be pairwise
distinct places of F . Let x1 , . . . , xn ∈ F and r1 , . . . , rn ∈ Z. Then there exists an element
z ∈ F such that
νPi (z − xi ) = ri for i = 1, . . . , n.
12.1.19 Remark
1. A nonzero element a ∈ K has neither zeros nor poles.
2. For all x ≠ 0 and $P \in \mathbb{P}_F$, P is a pole of x if and only if P is a zero of $x^{-1}$.
12.1.20 Theorem [2714, Theorem 1.4.11] For x ∈ F \ K the following hold:
1. x has at least one zero and one pole.
2. The number of zeros and poles of x is finite.
3. Let $P_1, \ldots, P_r$ and $Q_1, \ldots, Q_s$ be all zeros and poles of x, respectively. Then
$$\sum_{i=1}^{r} \nu_{P_i}(x) \deg P_i = \sum_{j=1}^{s} -\nu_{Q_j}(x) \deg Q_j = [F : K(x)].$$
12.1.21 Definition
1. The divisor group of F/K is the free abelian group generated by the set of
places of F/K. It is denoted by Div(F). The elements of Div(F) are divisors of
F. That means, a divisor of F is a formal sum
$$D = \sum_{P \in \mathbb{P}_F} n_P P \quad \text{with } n_P \in \mathbb{Z} \text{ and } n_P \ne 0 \text{ for at most finitely many } P.$$
$D = n_1 P_1 + \cdots + n_k P_k$ where $n_i = n_{P_i}$.
Two divisors $D = \sum_P n_P P$ and $E = \sum_P m_P P$ are added coefficientwise, that
is $D + E = \sum_P (n_P + m_P) P$. The zero divisor is the divisor $0 = \sum_P r_P P$ where
all $r_P = 0$.
2. A divisor of the form D = P with $P \in \mathbb{P}_F$ is a prime divisor.
3. The degree of the divisor $D = \sum_P n_P P$ is
$$\deg D := \sum_{P \in \mathbb{P}_F} n_P \cdot \deg P.$$
We note that this is a finite sum since $n_P \ne 0$ only for finitely many P.
4. A partial order on Div(F) is defined as follows: if $D = \sum_P n_P P$ and $E = \sum_P m_P P$, then
12.1.22 Remark Since every nonzero element x ∈ F has only finitely many zeros and poles, the
following definitions are meaningful.
12.1.23 Definition For a nonzero element x ∈ F, let Z and N denote the set of zeros and poles of
x, respectively.
1. The divisor $(x)_0 := \sum_{P \in Z} \nu_P(x) P$ is the zero divisor of x.
2. The divisor $(x)_\infty := -\sum_{P \in N} \nu_P(x) P$ is the divisor of poles of x.
3. The divisor $\mathrm{div}(x) := \sum_{P \in \mathbb{P}_F} \nu_P(x) P = (x)_0 - (x)_\infty$ is the principal divisor of
x.
12.1.24 Remark
1. We note that both divisors (x)0 and (x)∞ are positive divisors. By Theorem
12.1.20, deg(x)0 = deg(x)∞ and hence deg(div(x)) = 0.
2. For x ∈ F \ K we have deg(x)0 = deg(x)∞ = [F : K(x)]. The principal divisor
of a nonzero element a ∈ K is the zero divisor. We observe that for the element
0 ∈ K, no principal divisor is defined.
3. The sum of two principal divisors and the negative of a principal divisor are
principal, since div(xy) = div(x) + div(y) and $\mathrm{div}(x^{-1}) = -\mathrm{div}(x)$. Therefore the
principal divisors form a subgroup of the divisor group of F .
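12.1.25 Example We consider again the rational function field F = K(x). Let f ∈ K[x] be a
nonzero polynomial and write f as a product of irreducible polynomials,
\[ f(x) = a \cdot p_1(x)^{n_1} \cdots p_r(x)^{n_r} \]
with a nonzero constant a ∈ K and pairwise distinct monic irreducible polynomials p_i ∈ K[x].
If P_i denotes the place of K(x) corresponding to p_i and P∞ the infinite place, then
\[ \operatorname{div}(f) = n_1 P_1 + \cdots + n_r P_r - \deg(f)\, P_\infty. \]
As every element of K(x) is a quotient of two polynomials, we thus obtain the principal
divisor for any nonzero element z ∈ K(x) in this way.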
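The divisor calculus for K(x) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a divisor is stored as a dictionary from place labels to integer coefficients, the caller supplies the irreducible factorization (nothing is factored here), and the principal divisor of a polynomial is taken to be the sum of its irreducible factors with multiplicity minus deg(f) times the infinite place. All function names and the place labels are hypothetical.

```python
from collections import Counter

INF = "P_inf"  # label chosen here for the infinite place of K(x)

def add(D, E):
    """Coefficientwise sum of two divisors; zero coefficients are dropped."""
    S = Counter(D)
    for place, n in E.items():
        S[place] += n
    return {P: n for P, n in S.items() if n != 0}

def neg(D):
    """The divisor -D."""
    return {P: -n for P, n in D.items()}

def degree(D, deg_of_place):
    """deg D = sum over P of n_P * deg P."""
    return sum(n * deg_of_place[P] for P, n in D.items())

def leq(D, E):
    """Partial order: D <= E iff every coefficient of E - D is >= 0."""
    return all(n >= 0 for n in add(E, neg(D)).values())

def div_polynomial(factors, deg_of_place):
    """Principal divisor of a polynomial given as {irreducible factor: multiplicity}."""
    deg_f = sum(n * deg_of_place[P] for P, n in factors.items())
    return add(dict(factors), {INF: -deg_f})

def div_rational(num, den, deg_of_place):
    """Principal divisor of num/den, using div(f/g) = div(f) - div(g)."""
    return add(div_polynomial(num, deg_of_place),
               neg(div_polynomial(den, deg_of_place)))

# Example over K = F_5: z = x^2 (x^2 + 2) / (x + 1)^3.
# x^2 + 2 is irreducible mod 5 because -2 = 3 is not a square mod 5.
deg_of_place = {"x": 1, "x+1": 1, "x^2+2": 2, INF: 1}
D = div_rational({"x": 2, "x^2+2": 1}, {"x+1": 3}, deg_of_place)
zeros = {P: n for P, n in D.items() if n > 0}   # the zero divisor (z)_0
poles = {P: -n for P, n in D.items() if n < 0}  # the pole divisor (z)_inf
print(D, degree(D, deg_of_place))
```

In this example the zero divisor and the pole divisor both have degree 4, which matches deg(div(z)) = 0 and the degree [K(x) : K(z)] = 4 of z as a rational function.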
12.1.26 Definition
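1. Two divisors D, E ∈ Div(F) are equivalent if E = D + div(x) for some nonzero x ∈ F.
This is an equivalence relation on the divisor group of F/K. We write D ∼ E.
2. The subgroup of principal divisors of F (see Remark 12.1.24) is denoted by Princ(F).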
3. The factor group Cl(F ) := Div(F )/Princ(F ) is the divisor class group of F .
4. For a divisor D ∈ Div(F ) we denote by [D] ∈ Cl(F ) its class in the divisor class
group.
12.1.27 Remark The equivalence relation ∼ as defined in Definition 12.1.26 is often called
linear equivalence of divisors.
12.1.28 Remark
1. It follows from the definitions that D ∼ E if and only if [D] = [E].
2. D ∼ E implies deg D = deg E.
3. In a rational function field K(x), the converse of Part 2 also holds. If F/K is
non-rational, then there exist, in general, divisors of the same degree which are
not equivalent.
12.1.29 Definition Let F/K be a function field and let A ∈ Div(F ) be a divisor of F . Then the
set
L(A) := {x ∈ F | div(x) ≥ −A } ∪ {0}
is the Riemann–Roch space associated to the divisor A.
The space L(A) is a finite-dimensional vector space over K, and the integer ℓ(A) := dim L(A)
is the dimension of A. We point out that dim L(A) denotes here the dimension as a
vector space over K.
12.1.32 Remark
1. If A ∼ B then the spaces L(A) and L(B) are isomorphic (as K-vector spaces).
Hence A ∼ B implies ℓ(A) = ℓ(B).
2. A ≤ B implies L(A) ⊆ L(B) and hence ℓ(A) ≤ ℓ(B).
3. deg A < 0 implies ℓ(A) = 0.
4. L(0) = K and hence ℓ(0) = 1.
12.1.33 Remark The following theorem is one of the main results of the theory of function fields.
12.1.34 Theorem (Riemann–Roch theorem) [2714, Theorem 1.5.15] Let F/K be a function field.
Then there exist an integer g ≥ 0 and a divisor W ∈ Div(F) with the following property:
for all divisors A ∈ Div(F),
\[ \ell(A) = \deg A + 1 - g + \ell(W - A). \]
12.1.35 Definition The integer g =: g(F ) is the genus of F , and the divisor W is a canonical
divisor of F .
12.1.40 Example Consider the rational function field F = K(x). The following hold:
1. The genus of K(x) is 0.
2. Let P∞ be the infinite place of K(x), see Example 12.1.14. For every k ≥ 0 we
obtain
L(kP∞ ) = {f ∈ K[x] | deg(f ) ≤ k}.
This shows that Riemann–Roch spaces are natural generalizations of spaces of
polynomials.
3. The divisor W = −2P∞ is canonical.
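As a consistency check, the three statements of this example can be tied together by a short computation. The sketch assumes the Riemann–Roch theorem in its usual form ℓ(A) = deg A + 1 − g + ℓ(W − A), that deg P∞ = 1, and that ℓ(A) = 0 whenever deg A < 0.

```latex
\begin{align*}
\deg(W - kP_\infty) &= -2 - k < 0
  \;\Longrightarrow\; \ell(W - kP_\infty) = 0 \qquad (k \ge 0),\\
\ell(kP_\infty) &= \deg(kP_\infty) + 1 - g + \ell(W - kP_\infty)
  = k + 1 - 0 + 0 = k + 1,\\
\dim_K \{ f \in K[x] \mid \deg f \le k \} &= k + 1
  \qquad \text{(basis } 1, x, \dots, x^k\text{)}.
\end{align*}
```

Both counts give k + 1, as they should. Note also that deg W = −2 = 2g − 2 for g = 0.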
12.1.41 Remark Conversely, if F/K is a function field of genus g(F ) = 0, then there exists an
element x ∈ F such that F = K(x). (This does not hold in general if K is not a finite field.)
12.1.42 Remark For divisors of degree deg A > 2g − 2, Riemann’s Theorem gives a precise formula
for ℓ(A). On the other hand, ℓ(A) = 0 if deg A < 0. For the interval 0 ≤ deg A ≤ 2g − 2,
there is no exact formula for ℓ(A) in terms of deg A.
12.1.43 Theorem (Clifford’s theorem) [2714, Theorem 1.6.13] For all divisors A ∈ Div(F) with
0 ≤ deg A ≤ 2g − 2,
\[ \ell(A) \le 1 + \frac{1}{2} \cdot \deg A. \]
12.1.44 Remark The genus g(F ) of a function field F is its most important numerical invariant. In
general it is a difficult task to determine g(F ). Some methods are discussed in Subsection
12.1.3. Here we give upper bounds for g(F ) in some special cases.
12.1.45 Remark Assume that F = K(x, y) is a function field over K, where x, y satisfy an equation
ϕ(x, y) = 0 with an irreducible polynomial ϕ(X, Y ) ∈ K[X, Y ] of degree d. Then
\[ g(F) \le \frac{(d-1)(d-2)}{2}. \]
Equality holds if and only if the plane projective curve which is defined by the affine equation
ϕ(X, Y ) = 0, is nonsingular. (These terms are explained in Subsection 12.1.5.)
12.1.46 Remark (Riemann’s inequality) [2714, Corollary 3.11.4] Suppose that F = K(x, y). Then
g(F ) ≤ ([F : K(x)] − 1)([F : K(y)] − 1).
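The two upper bounds can be compared on a family where the genus is known. The following sketch takes F = K(x, y) with y² = f(x), char K odd and f square-free of odd degree 2m + 1, for which g(F) = m (Example 12.1.82). It assumes that the defining polynomial Y² − f(X) has total degree d = 2m + 1 (so m ≥ 1), and that [F : K(x)] = 2 and [F : K(y)] = 2m + 1. Function names are illustrative.

```python
def plane_curve_bound(d):
    """Bound (d - 1)(d - 2)/2 for a plane curve of degree d (Remark 12.1.45)."""
    return (d - 1) * (d - 2) // 2

def riemann_bound(deg_over_x, deg_over_y):
    """Riemann's inequality: g <= ([F:K(x)] - 1)([F:K(y)] - 1)."""
    return (deg_over_x - 1) * (deg_over_y - 1)

rows = []
for m in range(1, 6):
    d = 2 * m + 1                      # degree of f and of Y^2 - f(X)
    rows.append((m, plane_curve_bound(d), riemann_bound(2, d)))

for genus, plane, riemann in rows:
    print(f"genus {genus}: plane bound {plane}, Riemann bound {riemann}")
```

For m = 1 the plane bound is attained (a nonsingular cubic), while for m ≥ 2 Riemann's inequality gives the smaller bound 2m, and neither bound is sharp.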
12.1.56 Theorem [2714, Corollary 3.5.5] If F′/F is a finite separable extension of function fields,
then at most finitely many places of F are ramified in F′/F.
12.1.57 Remark More precise information about the ramified places in F′/F is given in Theorem
12.1.72.
12.1.59 Remark Con_{F′/F} is a homomorphism from the divisor group of F to the divisor group of
F′, which sends principal divisors of F to principal divisors of F′.
12.1.60 Remark For every divisor A ∈ Div(F), one has
\[ \deg \operatorname{Con}_{F'/F}(A) = \frac{[F' : F]}{[K' : K]} \cdot \deg A. \]
12.1.61 Definition Let F′/K′ be a finite extension of F/K, let P ∈ PF and O_P its valuation
ring.
1. An element z ∈ F′ is integral over O_P if there exist elements u_0, . . . , u_{m−1} ∈ O_P
such that z^m + u_{m−1} z^{m−1} + · · · + u_1 z + u_0 = 0. Such an equation is an integral
equation for z over O_P.
2. The set O′_P := {z ∈ F′ | z is integral over O_P} is a subring of F′. It is the
integral closure of O_P in F′.
12.1.62 Proposition [2714, Chapter 3.2, 3.3] With notation as in Definition 12.1.61, the following
hold:
1. z ∈ F′ is integral over O_P if and only if the coefficients of the minimal polynomial
of z over F are in O_P.
2. $O'_P = \bigcap_{P' \mid P} O_{P'}$.
3. There exists a basis (z_1, . . . , z_n) of F′/F such that $O'_P = \sum_{i=1}^{n} z_i O_P$, that is,
every element z ∈ F′ which is integral over O_P has a unique representation
$z = \sum x_i z_i$ with x_i ∈ O_P. Such a basis (z_1, . . . , z_n) is an integral basis at the
place P.
4. Every basis (y_1, . . . , y_n) of F′/F is an integral basis for almost all places P ∈ PF
(that is, for all P with only finitely many exceptions). In particular, if F′ = F(y)
then (1, y, . . . , y^{n−1}) is an integral basis for almost all P.
12.1.63 Remark Using integral bases one can often determine all extensions of a place P ∈ PF in
F′. In the following theorem, denote by ū := u(P) ∈ F_P the residue class of an element
u ∈ O_P in the residue class field F_P = O_P/P. For a polynomial $\psi(T) = \sum u_i T^i \in O_P[T]$
we set $\bar\psi(T) := \sum \bar{u}_i T^i \in F_P[T]$.
12.1.64 Theorem (Kummer’s theorem) [2714, Theorem 3.3.7] Suppose that F′ = F(y) with y
integral over O_P. Let ϕ ∈ O_P[T] be the minimal polynomial of y over F and decompose ϕ̄
into irreducible factors over F_P
\[ \bar\varphi(T) = \gamma_1(T)^{\varepsilon_1} \cdots \gamma_r(T)^{\varepsilon_r} \]
with distinct irreducible monic polynomials γ_i ∈ F_P[T] and ε_i ≥ 1. Choose monic
polynomials ϕ_i ∈ O_P[T] such that ϕ̄_i = γ_i. Then the following hold:
1. For each i ∈ {1, . . . , r} there exists a place P_i|P such that ϕ_i(y) ∈ P_i. The relative
degree of P_i|P satisfies f(P_i|P) ≥ deg(γ_i).
2. If (1, y, . . . , y^{n−1}) is an integral basis at P, then there exists for each i ∈ {1, . . . , r}
a unique place P_i|P with ϕ_i(y) ∈ P_i, and we have e(P_i|P) = ε_i and f(P_i|P) =
deg(γ_i).
3. If $\bar\varphi(T) = \prod_{i=1}^{n} (T - a_i)$ with distinct elements a_1, . . . , a_n ∈ K, then P splits
completely in F′/F.
12.1.65 Example Consider a field K with char K ≠ 2 and a function field F = K(x, y), where
y satisfies an equation y² = f(x) with a polynomial f(x) ∈ K[x] of odd degree. Then
[F : K(x)] = 2, and ϕ(T) = T² − f(x) is the minimal polynomial of y over K(x). Let
a ∈ K.
1. If f(a) is a nonzero square in K (that is, f(a) = c² with 0 ≠ c ∈ K), then the
place (x = a) of K(x) (see Example 12.1.14) splits into two rational places of F.
2. If f (a) is a non-square in K, then the place (x = a) has exactly one extension Q
in F , and deg Q = 2.
3. If a ∈ K is a simple root of the equation f (x) = 0, then the place (x = a) of
K(x) is totally ramified in F/K(x), and its unique extension P ∈ PF is rational.
For more examples see Section 12.5.
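The three cases of Example 12.1.65 are easy to test by machine when K = F_p with p an odd prime. The sketch below is an illustration only: it represents f by its coefficient list (constant term first), decides squareness with Euler's criterion, and reports "not covered" for a multiple root of f, a case about which the example makes no statement. The function names are hypothetical.

```python
def poly_eval(coeffs, a, p):
    """Evaluate a polynomial (coefficients from the constant term up) at a mod p."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * a + c) % p
    return acc

def derivative(coeffs):
    """Formal derivative of a polynomial given by its coefficient list."""
    return [i * c for i, c in enumerate(coeffs)][1:]

def classify_place(coeffs, a, p):
    """Behaviour of the place (x = a) of F_p(x) in F_p(x, y), y^2 = f(x), p odd."""
    v = poly_eval(coeffs, a, p)
    if v == 0:
        if poly_eval(derivative(coeffs), a, p) != 0:
            return "totally ramified"      # simple root of f (case 3)
        return "not covered"               # multiple root: no statement in the example
    # Euler's criterion: v is a square mod p iff v^((p-1)/2) = 1
    if pow(v, (p - 1) // 2, p) == 1:
        return "splits"                    # two rational places (case 1)
    return "inert"                         # one place of degree 2 (case 2)

p = 11
f = [1, 1, 0, 1]                           # f(x) = x^3 + x + 1
behaviour = {a: classify_place(f, a, p) for a in range(p)}
for a, kind in behaviour.items():
    print(f"(x = {a}): {kind}")
```

For this choice of f, six places split and one is totally ramified, giving 2 · 6 + 1 = 13 rational places of F above the places (x = a), a ∈ F_11.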
12.1.66 Remark In what follows, we assume that F′/F is a separable extension of function fields of
degree [F′ : F] = n. As before, P denotes a place of F and O′_P is the integral closure of O_P
in F′. By Tr_{F′/F} : F′ → F we denote the trace mapping. For information about separable
extensions and the trace map, see any standard textbook on algebra, e.g., [1846].
12.1.67 Definition
1. For P ∈ PF, the set
C_P := {z ∈ F′ | Tr_{F′/F}(z · O′_P) ⊆ O_P}
is the complementary module of P in F′.
12.1.69 Definition The different of a finite separable extension of function fields F′/F is the divisor
of the function field F′ defined as
\[ \operatorname{Diff}(F'/F) := \sum_{P \in \mathbb{P}_F} \sum_{P' \mid P} d(P' \mid P)\, P'. \]
12.1.70 Theorem [2714, Theorems 3.4.6, 3.4.13] Let F′/K′ be a finite separable extension of F/K.
1. If W is a canonical divisor of F/K, then the divisor
W′ := Con_{F′/F}(W) + Diff(F′/F)
is a canonical divisor of F′/K′.
2. (Hurwitz genus formula) The genera of F′ and F satisfy the equation
\[ 2g(F') - 2 = \frac{[F' : F]}{[K' : K]}\,(2g(F) - 2) + \deg \operatorname{Diff}(F'/F). \]
12.1.71 Remark We note that Part 2 is an immediate consequence of Part 1 since the degree of
a canonical divisor of F is 2g(F) − 2. Next we give some results that help to compute the
different exponents d(P′|P).
12.1.72 Theorem (Dedekind’s different theorem) [2714, Theorem 3.5.1] Let F′/F be a finite
separable extension of function fields, let P ∈ PF and P′ ∈ PF′ with P′|P. Then
1. d(P′|P) ≥ e(P′|P) − 1 ≥ 0.
2. d(P′|P) = e(P′|P) − 1 if and only if the characteristic of F does not divide
e(P′|P).
12.1.73 Remark In other words, the different of F′/F contains exactly the places of F′ which are
ramified in F′/F. In particular it follows that only finitely many places are ramified. The
following definition is motivated by Dedekind’s Different Theorem.
12.1.75 Lemma In a tower of separable extensions F″ ⊇ F′ ⊇ F, the different is transitive, that is:
12.1.76 Proposition [2714, Theorem 3.5.10] Let F′ = F(y) be a separable extension of degree
[F′ : F] = n. Let P ∈ PF and assume that the minimal polynomial ϕ of y has all of its
coefficients in O_P. Let P_1, . . . , P_r be all extensions of P in F′. Then one has:
e(P) · f(P) · r = [F′ : F].
12.1.80 Proposition (Kummer extensions) [2714, Proposition 3.7.3] Let F′ = F(y) be an extension
of function fields of degree [F′ : F] = n, where the constant field of F is the finite field Fq.
Assume that
y^n = u ∈ F and n divides (q − 1).
Then F′/F is Galois, and the Galois group Gal(F′/F) is cyclic of order n.
1. For P ∈ PF define r_P := gcd(n, ν_P(u)), the greatest common divisor of n and
ν_P(u). Then
\[ e(P' \mid P) = \frac{n}{r_P} \quad\text{and}\quad d(P' \mid P) = \frac{n}{r_P} - 1 \quad\text{for all } P' \mid P. \]
\[ g(F') = -n + 1 + \frac{1}{2} \sum_{P \in \mathbb{P}_F} \bigl(n - \gcd(n, \nu_P(u))\bigr) \deg P. \]
12.1.82 Example Assume that the characteristic of K is odd. Let F = K(x, y) with y² = f(x),
where f ∈ K[x] is a square-free polynomial of degree deg(f ) = 2m + 1. This means that
f = f1 · · · fs with pairwise distinct irreducible polynomials fi ∈ K[x]. Let Pi ∈ PK(x)
be the place corresponding to fi , i = 1, . . . , s, and P∞ be the pole of x in K(x). For
P ∈ {P1 , . . . , Ps , P∞ } we have gcd(2, νP (f )) = 1, and for all other places Q ∈ PK(x) we
have νQ (f ) = 0. Then Part 3 of the Proposition above yields g(F ) = (deg(f ) − 1)/2 = m.
Hence for every integer m ≥ 0 there exist function fields F/K of genus g(F ) = m.
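The genus computation of Example 12.1.82 can be replayed numerically. The sketch below evaluates the displayed formula g(F′) = −n + 1 + ½ Σ (n − gcd(n, ν_P(u))) deg P. It assumes, as in the example, that the base field is a rational function field and that the caller lists every place where u has a zero or a pole as a pair (valuation, degree); places with valuation 0 contribute nothing. The Fermat case assumes n | q − 1, so that 1 − x^n splits into n distinct linear factors. All names are hypothetical.

```python
from math import gcd

def kummer_genus(n, places):
    """Genus of K(x, y) with y^n = u, from pairs (nu_P(u), deg P)."""
    total = sum((n - gcd(n, v)) * d for v, d in places)
    if total % 2:
        raise ValueError("inconsistent data: the sum must be even")
    return -n + 1 + total // 2

def hyperelliptic_places(factor_degrees):
    """u = f square-free with irreducible factors of the given degrees."""
    deg_f = sum(factor_degrees)
    # each factor gives valuation 1; the pole of x gives valuation -deg f
    return [(1, d) for d in factor_degrees] + [(-deg_f, 1)]

def fermat_places(n):
    """u = 1 - x^n with n distinct roots in the constant field."""
    return [(1, 1)] * n + [(-n, 1)]

print(kummer_genus(2, hyperelliptic_places([3, 4])))  # deg f = 7 = 2*3 + 1
print(kummer_genus(4, fermat_places(4)))              # y^4 = 1 - x^4
```

The first call reproduces g = m for deg f = 2m + 1 = 7. The second gives genus 3 = (4 − 1)(4 − 2)/2 for the Fermat curve of degree 4, in agreement with the equality case of Remark 12.1.45.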
12.1.83 Proposition (Artin–Schreier extensions) [2714, Proposition 3.7.8] Let F/K be a function
field, where K is a finite field of characteristic p. Let F′ = F(y) with y^p − y = u ∈ F. We
assume that for all poles P of u in F, p does not divide ν_P(u), and that u ∉ K. Then the
following hold:
12.1.87 Remark If E/K′ is a finite extension of F/K (meaning that E/F is a finite extension and
K′ is the constant field of E), we consider the intermediate field F ⊆ F′ := FK′ ⊆ E. Then
F′/K′ is a constant field extension of F/K, and E/F′ is an extension of function fields
having the same constant field K′.
12.1.88 Theorem [2714, Chapter 3.6] Let F′ = FK′ be a constant field extension of F. Then the
following hold:
12.1.4 Differentials
12.1.89 Remark In this subsection we consider a function field F/K where K = Fq is a finite field
of characteristic p. The aim is to give an interpretation of the canonical divisors of F .
12.1.90 Remark The set F^p := {z^p | z ∈ F} is a subfield of F which contains K. The extension
F/F^p has degree [F : F^p] = p and is purely inseparable. An element z ∈ F \ F^p is called a
separating element for F/K. For every separating element z, the extension F/K(z) is finite
and separable.
12.1.91 Remark Recall that a module over a field L is just a vector space over L.
νP (ω) := νP (u).
This definition is independent of the choice of the prime element t, and one can
show that νP (ω) = 0 for almost all P ∈ PF .
2. The divisor of ω is
\[ \operatorname{div}(\omega) := \sum_{P \in \mathbb{P}_F} \nu_P(\omega) P. \]
12.1.99 Remark Divisors have the property div(uω) = div(u) + div(ω) for u ∈ F \ {0} and
ω ∈ ΩF \ {0}. Therefore div(ω) ∼ div(η) for any two nonzero differentials ω, η ∈ ΩF .
12.1.100 Remark Recall that the divisor of poles of an element 0 ≠ x ∈ F is denoted by (x)∞.
12.1.101 Proposition [2714, Chapter 4.3] Let x ∈ F be a separating element for F/K. Then
12.1.102 Theorem [2714, Chapter 4.3] Let ω ∈ ΩF be a nonzero differential of F/K. Then the divisor
W := div(ω) is a canonical divisor of F . In particular,
2g(F ) − 2 = deg(div(ω)).
12.1.108 Definition
1. A function field F/K of genus g(F ) = 1 is an elliptic function field.
2. A function field F/K is hyperelliptic if g(F ) ≥ 2, and there exists an element
x ∈ F such that [F : K(x)] = 2.
12.1.109 Example [2714, Chapters 6.1, 6.2] Let K be a finite field of characteristic ≠ 2, and let
F/K be an elliptic or hyperelliptic function field of genus g. Assume that F has at least
one rational place P. Then there exist x, y ∈ F such that F = K(x, y) and y² = f(x) with
a square-free polynomial f ∈ K[x] of degree 2g + 1. The differentials
\[ \omega_i := \frac{x^i}{y}\, dx, \quad i = 0, \dots, g - 1 \]
12.1.112 Definition
1. The n-dimensional affine space An = An (K̄) over K̄ is the set of all n-tuples
of elements of K̄. An element P = (a1 , . . . , an ) ∈ An is a point, and a1 , . . . , an
are its coordinates.
2. Let f1 , . . . , fm ∈ K[X1 , . . . , Xn ] be polynomials. Then the set V := {P ∈
An | f1 (P ) = · · · = fm (P ) = 0} is the affine algebraic set defined by f1 =
· · · = fm = 0. We say that V is defined over K since the polynomials f1 , . . . , fm
have coefficients in K.
3. Let V be as in 2. The set I(V ) := {f ∈ K̄[X1 , . . . , Xn ] | f (P ) = 0 for all P ∈ V }
is an ideal of K̄[X1 , . . . , Xn ], which is the ideal of V .
4. The algebraic set V is absolutely irreducible if I(V ) is a prime ideal of
K̄[X1 , . . . , Xn ]. Then the residue class ring Γ(V ) := K̄[X1 , . . . , Xn ]/I(V ) is an
integral domain, and its quotient field K̄(V ) := Quot(Γ(V )) is the field of ratio-
nal functions on V . The residue class of Xi in K̄(V ) is the i-th coordinate func-
tion on V and is denoted by xi . The subfield K(V ) := K(x1 , . . . , xn ) ⊆ K̄(V )
is the field of K-rational functions on V .
5. An absolutely irreducible affine algebraic set V is an absolutely irreducible affine
algebraic curve over K (briefly, an affine curve over K), if the field K(V ) as
defined in Part 4 has transcendence degree one over K. This means that K(V )
is an algebraic function field over K, as in Definition 12.1.1. The curve V is a
plane affine curve if V ⊆ A2 .
6. Let V be an affine curve over K. A point P ∈ V is K-rational if all its coordi-
nates are in K. We set V (K) := {P ∈ V | P is K-rational}.
7. Two affine curves V1 and V2 are birationally equivalent if their function fields
K(V1 ) and K(V2 ) are isomorphic.
12.1.113 Example Let F/K be an algebraic function field. Then there exist elements x, y ∈ F such
that F = K(x, y), and there is an irreducible polynomial f ∈ K[X, Y ] such that f (x, y) = 0.
Let V ⊆ A2 be the plane affine curve defined by f = 0. Then K(V ) = F .
12.1.114 Definition
1. Let V be an affine curve as in Definition 12.1.112, and let P ∈ V . A rational
function ϕ ∈ K̄(V ) is defined at P if ϕ = g(x1 , . . . , xn )/h(x1 , . . . , xn ) with
g, h ∈ K̄[x1 , . . . , xn ] and h(P) ≠ 0. The set OP (V ) of all rational functions on
V which are defined at P , is a ring and it is the local ring of V at P .
2. The point P is non-singular if its local ring is integrally closed. This means, by
definition, that every z ∈ K̄(V ) which satisfies an integral equation over OP (V ),
is in OP (V ); see Definition 12.1.61.
3. The curve V is non-singular if all of its points are non-singular.
12.1.115 Remark Let f ∈ K[X, Y ] be an absolutely irreducible polynomial (that is, f is irreducible
in K̄[X, Y ]). Then the equation f = 0 defines a plane affine curve C ⊆ A2 (K̄). A point
P ∈ C is non-singular if and only if f_X(P) ≠ 0 or f_Y(P) ≠ 0, where f_X(X, Y) and f_Y(X, Y)
denote the partial derivatives with respect to X and Y , respectively.
12.1.116 Example Let n > 0 be relatively prime to the characteristic of K. Then the Fermat curve
C which is defined by the equation f(X, Y) = X^n + Y^n − 1 = 0, is non-singular.
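The claim follows directly from the partial-derivative criterion of Remark 12.1.115. A short verification, assuming only that n is invertible in K, runs as follows.

```latex
\begin{align*}
f(X, Y) &= X^n + Y^n - 1, \qquad
f_X = nX^{n-1}, \qquad f_Y = nY^{n-1}.\\
\intertext{For $n = 1$ we have $f_X = 1 \neq 0$ everywhere. For $n \geq 2$ and a point $P = (a, b)$,}
f_X(P) = f_Y(P) = 0
  &\iff a^{n-1} = b^{n-1} = 0 \iff a = b = 0
  \qquad (\text{since } n \neq 0 \text{ in } K),\\
f(0, 0) &= -1 \neq 0 .
\end{align*}
```

So the only common zero of the two partial derivatives does not lie on C, and every point of C is non-singular.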
12.1.117 Remark In a sense, affine curves are not “complete”: one has to add a finite number of
points “at infinity.” To be precise, one introduces the projective space Pn over K̄ and the
“projective closure” of an affine curve in Pn . This leads to the concept of projective curves.
We do not give details here and refer to textbooks on algebraic geometry, for example
[1147, 1427, 2281].
12.1.118 Remark
1. Two projective curves are birationally equivalent if their function fields are iso-
morphic.
2. For every projective curve C there exists a non-singular projective curve X which
is birationally equivalent to C. The curve X is uniquely determined up to isomor-
phism and it is the non-singular model of C.
12.1.119 Remark There is a 1–1 correspondence between {algebraic function fields F/K, up to
isomorphism} and {absolutely irreducible, non-singular, projective curves X defined over K, up
to isomorphism}. Under this correspondence, extensions F′/F of function fields correspond
to coverings X′ → X of curves, composites of function fields E = F1F2 correspond to fibre
products of curves, etc. What corresponds to a place P of a function field F/K? If P is
rational, then it corresponds to a K-rational point of the associated projective curve. Now
let K = Fq and let P be a place of F with deg P = n. Then P corresponds to exactly n
points on the associated projective curve, with coordinates in the field Fqn . These points
form an orbit under the Frobenius map, which is the map that raises the coordinates of
points to the q-th power. For details, see [2281].
References Cited: [1147, 1296, 1427, 1511, 1846, 2280, 2281, 2714, 2872]
12.2.1 Remark Most of the background material for this section may be found in [2670, Chap-
ters III, V, XI]. Other standard references for the theory of elliptic curves include the
books [557, 1563, 1756, 1773, 1843, 1845, 2054, 2107, 2667, 2672, 2950] and survey arti-
cles [556, 2784].
\[ y^2 + a_1 xy + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6 \quad\text{with } a_1, a_2, a_3, a_4, a_6 \in K. \tag{12.2.1} \]
12.2.4 Remark The discriminant vanishes if and only if the curve defined by the Weierstrass
equation has a singular point, i.e., a point (x_0, y_0) on the curve where both partial derivatives
vanish:
2y_0 + a_1 x_0 + a_3 = 0 and 3x_0² + 2a_2 x_0 + a_4 − a_1 y_0 = 0.
12.2.5 Remark If char(K) ≠ 2, then the substitution $y \mapsto \tfrac{1}{2}(y - a_1 x - a_3)$ transforms the
Weierstrass equation into the simpler form
\[ y^2 = 4x^3 + b_2 x^2 + 2b_4 x + b_6. \]
If also char(K) ≠ 3, then the substitution $(x, y) \mapsto \left( \frac{x - 3b_2}{36}, \frac{y}{108} \right)$ yields the further
simplification
\[ y^2 = x^3 - 27c_4 x - 54c_6. \]
12.2.6 Definition An elliptic curve E defined over a field K is a Weierstrass equation over K that
is nonsingular, i.e., ∆ ≠ 0, together with an extra point “at infinity” which is denoted O.
For any extension field L of K, the set of points of E defined over L is the set
\[ E(L) = \{ (x, y) \in L^2 : y^2 + a_1 xy + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6 \} \cup \{ O \}. \]
12.2.7 Remark A fancier definition of an elliptic curve over K is a nonsingular projective curve
of genus one defined over K with a marked point whose coordinates are in K. Using the
Riemann–Roch theorem, one can show that every such curve is given by a Weierstrass
equation, with the marked point being the point O, which is the unique point at infinity;
see [2670, III.3.1].
12.2.8 Example The real points on an elliptic curve defined over R may have one or two compo-
nents, as illustrated in Figure 12.2.8∗ .
∗ The author thanks Springer Science+Business Media for their permission to include Figures 3.1 and 3.3
from his book The Arithmetic of Elliptic Curves, GTM 106, 2009.
[Figure 12.2.8: real points of the elliptic curves y² = x³ − 3x + 3, y² = x³ + x, and y² = x³ − x.]
12.2.9 Proposition
The substitution
\[ x \longmapsto u^2 x + r, \qquad y \longmapsto u^3 y + u^2 s x + t, \tag{12.2.2} \]
transforms the Weierstrass equation (12.2.1) into a Weierstrass equation
\[ y^2 + a_1' xy + a_3' y = x^3 + a_2' x^2 + a_4' x + a_6' \]
whose coefficients and associated quantities satisfy
\begin{align*}
u a_1' &= a_1 + 2s\\
u^2 a_2' &= a_2 - s a_1 + 3r - s^2\\
u^3 a_3' &= a_3 + r a_1 + 2t\\
u^4 a_4' &= a_4 - s a_3 + 2r a_2 - (t + rs) a_1 + 3r^2 - 2st\\
u^6 a_6' &= a_6 + r a_4 + r^2 a_2 + r^3 - t a_3 - t^2 - rt a_1\\
u^2 b_2' &= b_2 + 12r\\
u^4 b_4' &= b_4 + r b_2 + 6r^2\\
u^6 b_6' &= b_6 + 2r b_4 + r^2 b_2 + 4r^3\\
u^8 b_8' &= b_8 + 3r b_6 + 3r^2 b_4 + r^3 b_2 + 3r^4\\
u^4 c_4' &= c_4, \qquad u^6 c_6' = c_6, \qquad u^{12} \Delta' = \Delta, \qquad j' = j.
\end{align*}
12.2.10 Definition Two elliptic curves E/K and E′/K are isomorphic over K if there is a
substitution (12.2.2) with u ∈ K^∗ and r, s, t ∈ K that transforms the Weierstrass equation
of E into the Weierstrass equation of E′.
12.2.11 Theorem [2670, III.1.4] Let K be an algebraically closed field. Then E/K and E′/K are
isomorphic over K if and only if they have the same j-invariant, i.e., if and only if j(E) = j(E′).
12.2.12 Example According to Theorem 12.2.11, there are two F̄2-isomorphism classes of elliptic
curves defined over F2, namely those with j-invariant 0 and those with j-invariant 1.
However, there are five F2-isomorphism classes of elliptic curves defined over F2. An example
from each isomorphism class is listed in the following table.
curve                       j
y² + y = x³                 0
y² + y = x³ + x             0
y² + y = x³ + x + 1         0
y² + xy = x³ + 1            1
y² + xy + y = x³ + 1        1
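One quick way to see that the five curves in the table are pairwise non-isomorphic over F2 is to count their F2-rational points, since isomorphic curves have the same number of points. The brute-force sketch below does this. The tuple (a1, a2, a3, a4, a6) follows the Weierstrass equation (12.2.1), and the function name is illustrative.

```python
def count_points_f2(a1, a2, a3, a4, a6):
    """#E(F_2) for y^2 + a1 xy + a3 y = x^3 + a2 x^2 + a4 x + a6, including O."""
    affine = sum(
        1
        for x in (0, 1)
        for y in (0, 1)
        if (y * y + a1 * x * y + a3 * y
            - (x ** 3 + a2 * x * x + a4 * x + a6)) % 2 == 0
    )
    return affine + 1  # add the point at infinity

curves = {
    "y^2 + y = x^3":          (0, 0, 1, 0, 0),
    "y^2 + y = x^3 + x":      (0, 0, 1, 1, 0),
    "y^2 + y = x^3 + x + 1":  (0, 0, 1, 1, 1),
    "y^2 + xy = x^3 + 1":     (1, 0, 0, 0, 1),
    "y^2 + xy + y = x^3 + 1": (1, 0, 1, 0, 1),
}
counts = {name: count_points_f2(*a) for name, a in curves.items()}
for name, n in counts.items():
    print(f"{name}: {n} points")
```

The five counts are all different, so no two of the curves can be isomorphic over F2.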
12.2.13 Theorem [2670, X.5.4.1] Let p ≥ 5, let q be a power of p, and let E1/Fq and E2/Fq be
elliptic curves. Then E1 and E2 are isomorphic over Fq if and only if j(E1) = j(E2) and
there exists a u ∈ F∗q such that
\[ \begin{cases}
c_4(E_1) c_6(E_1) = u^2 c_4(E_2) c_6(E_2) & \text{if } j(E_1) \neq 0 \text{ and } j(E_1) \neq 1728,\\
c_4(E_1) = u^4 c_4(E_2) & \text{if } j(E_1) = 1728,\\
c_6(E_1) = u^6 c_6(E_2) & \text{if } j(E_1) = 0.
\end{cases} \]
12.2.14 Definition Let Eq be the set of Fq -isomorphism classes of elliptic curves defined over Fq .
12.2.15 Remark The set E2 is described in Example 12.2.12; it has five elements. For p ≥ 5 and q
a power of p, we have
12.2.16 Remark There are curves over finite fields Fq that have no points with coordinates in Fq ,
but a theorem of Lang says that this does not happen for curves of genus one.
12.2.17 Theorem [2670, Exercise 10.6] Let C/Fq be a smooth projective curve of genus one.
Then C(Fq) is not empty. More generally, if V/Fq is a variety that is isomorphic over F̄q to
an abelian variety, then V(Fq) is not empty.
12.2.18 Remark In particular, if F (X, Y, Z) ∈ Fq [X, Y, Z] is homogeneous of degree 3 and if the
associated curve F = 0 is nonsingular, then there are values x, y, z ∈ Fq , not all zero, such
that F (x, y, z) = 0.
12.2.19 Definition Let E/K be an elliptic curve defined over a field K, and let L/K be an extension
field. The set of points E(L) forms an abelian group using the following rules:
1. The point O is the identity element. In what follows, we use the convention
that O is on every vertical line.
2. The negative of the point P = (x, y) is the point −P = (x, −y −a1 x−a3 ). If E is
given by a simpler Weierstrass equation as in Remark 12.2.5, then a1 = a3 = 0,
and −P = (x, −y) is the reflection of P about the x-axis.
3. The sum of distinct points P and Q is obtained by intersecting the line
through P and Q with E. This yields three points P , Q, and R (counted with
appropriate multiplicities). Then P + Q equals −R.
4. The sum of P with itself is obtained similarly, using the tangent line to E at P .
The geometric definition of the group law is illustrated in Figure 12.2.2.
12.2.20 Remark All of the group axioms are easy to verify except for associativity; see [2670, III.3.4]
for a proof of associativity.
12.2.21 Algorithm Let E/K be an elliptic curve defined over a field K, and let L/K be an extension
field. The following addition algorithm gives the group structure on the set of points E(L).
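1. The point O is the identity element, so P + O = O + P = P for all P ∈ E(L).
2. If P = (x, y), then −P = (x, −y − a_1 x − a_3).
3. Let P_1 = (x_1, y_1) and P_2 = (x_2, y_2). If x_1 = x_2 and y_1 + y_2 + a_1 x_2 + a_3 = 0, then
P_1 + P_2 = O.
4. Otherwise, define λ and ν by
\[ \begin{array}{c|cc}
 & \lambda & \nu \\ \hline
x_1 \neq x_2 & \dfrac{y_2 - y_1}{x_2 - x_1} & \dfrac{y_1 x_2 - y_2 x_1}{x_2 - x_1} \\[2ex]
x_1 = x_2 & \dfrac{3x_1^2 + 2a_2 x_1 + a_4 - a_1 y_1}{2y_1 + a_1 x_1 + a_3} & \dfrac{-x_1^3 + a_4 x_1 + 2a_6 - a_3 y_1}{2y_1 + a_1 x_1 + a_3}
\end{array} \]
Then P_3 = P_1 + P_2 is given by
P_3 = (x_3, y_3) with x_3 = λ² + a_1 λ − a_2 − x_1 − x_2 and y_3 = −(λ + a_1) x_3 − ν − a_3.
[Figure 12.2.2: the geometric group law; the panels illustrate P + Q + R = O with the sum P + Q, the tangent-line construction of P + P, and a point T with T + T = O.]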
12.2.22 Remark A special case of the addition algorithm is the duplication formula. Let P = (x, y) ∈
E(L). Then the x-coordinate of 2P is
\[ x(2P) = \frac{x^4 - b_4 x^2 - 2b_6 x - b_8}{4x^3 + b_2 x^2 + 2b_4 x + b_6}. \]
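The duplication formula is easy to evaluate over a prime field F_p. The sketch below is an illustration under assumptions: the quantities b2, b4, b6, b8, c4, ∆, and j are computed from what we take to be their standard definitions (as in [2670, III.1]), which are not displayed in the text above, and modular inverses use the three-argument `pow` with exponent −1 (Python 3.8 or later). The function names are hypothetical.

```python
def invariants(a1, a2, a3, a4, a6, p):
    """Standard Weierstrass quantities over F_p (formulas assumed, see lead-in)."""
    b2 = (a1 * a1 + 4 * a2) % p
    b4 = (2 * a4 + a1 * a3) % p
    b6 = (a3 * a3 + 4 * a6) % p
    b8 = (a1 * a1 * a6 + 4 * a2 * a6 - a1 * a3 * a4 + a2 * a3 * a3 - a4 * a4) % p
    c4 = (b2 * b2 - 24 * b4) % p
    disc = (-b2 * b2 * b8 - 8 * b4 ** 3 - 27 * b6 * b6 + 9 * b2 * b4 * b6) % p
    j = (c4 ** 3 * pow(disc, -1, p)) % p if disc else None
    return b2, b4, b6, b8, disc, j

def x_of_double(x, b2, b4, b6, b8, p):
    """x-coordinate of 2P from the duplication formula; None means 2P = O."""
    num = (x ** 4 - b4 * x * x - 2 * b6 * x - b8) % p
    den = (4 * x ** 3 + b2 * x * x + 2 * b4 * x + b6) % p
    if den == 0:
        return None
    return num * pow(den, -1, p) % p

# E : y^2 = x^3 + x + 1 over F_11 (the curve of Example 12.2.25)
p = 11
b2, b4, b6, b8, disc, j = invariants(0, 0, 0, 1, 1, p)
print(disc, j)                               # discriminant and j-invariant
print(x_of_double(6, b2, b4, b6, b8, p))     # x([2]Q) for Q = (6, 5)
print(x_of_double(2, b2, b4, b6, b8, p))     # P = (2, 0) has order 2
```

The values agree with Example 12.2.25: ∆ = 10, j = 9, and x([2]Q) = 0 for Q = (6, 5). The denominator vanishes at x = 2, reflecting that P = (2, 0) satisfies [2]P = O.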
12.2.23 Remark Algorithm 12.2.21 explains how to add and double points on an elliptic curve. For
cryptographic applications, it is important to do these operations as efficiently as possible.
There are tradeoffs between using affine versus projective coordinates. It may also be possi-
ble to use the Frobenius endomorphism in place of the doubling map for “double-and-add”
algorithms, and alternative equations for elliptic curves may also allow for more efficient
operations. For details, see Sections 12.3 and 16.4.
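12.2.24 Definition For a positive integer m, the multiplication-by-m map on E(L) is the map
\[ [m] : E(L) \longrightarrow E(L), \qquad [m](P) = P + P + \cdots + P \quad (m \text{ terms}). \]
For m < 0 define [m](P) = −[−m](P), and set [0](P) = O. The kernel of multiplication-
by-m is the subgroup
\[ E(L)[m] = \{ P \in E(L) : [m](P) = O \}. \]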
12.2.25 Example The elliptic curve E : y² = x³ + x + 1 over the field F11 has discriminant ∆ = 10
and j-invariant j = 9. The points P = (2, 0), Q = (6, 5), and R = (8, 9) are in E(F11). The
addition algorithm (Algorithm 12.2.21) allows us to compute quantities such as
P + Q = (8, 9), Q + R = (1, 5), [2]Q = (0, 1), [3]P + [4]R = (4, 5).
The group E(F11) has 14 elements. The elements P, Q, and R have orders 2, 7, and 14, respectively.
So for example, the kernel of multiplication by 7 in E(F11) consists of the seven points
E(F11)[7] = {[n](Q) : 0 ≤ n ≤ 6}, where Q = (6, 5).
However, going to an extension field, one finds that E(F̄11)[7] contains 49 points; see
Theorem 12.2.60.
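The computations in this example can be reproduced with a direct implementation of the chord-and-tangent formulas over F_p. The following is a minimal sketch, not an optimized or constant-time implementation. It assumes the standard slope and intercept formulas for a general Weierstrass equation (12.2.1), represents O by `None`, and uses `pow(·, -1, p)` for inverses (Python 3.8 or later). The names `ec_add` and `ec_mul` are hypothetical.

```python
O = None  # the point at infinity

def ec_add(P1, P2, a, p):
    """Add points on y^2 + a1 xy + a3 y = x^3 + a2 x^2 + a4 x + a6 over F_p."""
    a1, a2, a3, a4, a6 = a
    if P1 is O:
        return P2
    if P2 is O:
        return P1
    x1, y1 = P1
    x2, y2 = P2
    if x1 == x2 and (y1 + y2 + a1 * x2 + a3) % p == 0:
        return O                                   # P2 = -P1
    if x1 != x2:                                   # chord through P1 and P2
        inv = pow((x2 - x1) % p, -1, p)
        lam = (y2 - y1) * inv % p
        nu = (y1 * x2 - y2 * x1) * inv % p
    else:                                          # tangent line at P1 = P2
        inv = pow((2 * y1 + a1 * x1 + a3) % p, -1, p)
        lam = (3 * x1 * x1 + 2 * a2 * x1 + a4 - a1 * y1) * inv % p
        nu = (-x1 ** 3 + a4 * x1 + 2 * a6 - a3 * y1) * inv % p
    x3 = (lam * lam + a1 * lam - a2 - x1 - x2) % p
    y3 = (-(lam + a1) * x3 - nu - a3) % p
    return (x3, y3)

def ec_mul(m, P, a, p):
    """[m]P for m >= 0 by double-and-add."""
    result, base = O, P
    while m > 0:
        if m & 1:
            result = ec_add(result, base, a, p)
        base = ec_add(base, base, a, p)
        m >>= 1
    return result

# E : y^2 = x^3 + x + 1 over F_11
p, a = 11, (0, 0, 0, 1, 1)
P, Q, R = (2, 0), (6, 5), (8, 9)
affine = [(x, y) for x in range(p) for y in range(p)
          if (y * y - (x ** 3 + x + 1)) % p == 0]
print(len(affine) + 1)                                        # group order
print(ec_add(P, Q, a, p), ec_mul(2, Q, a, p))
print(ec_add(ec_mul(3, P, a, p), ec_mul(4, R, a, p), a, p))
```

Double-and-add needs about log2(m) doublings, which is the reason the efficiency of the doubling step matters in practice.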
12.2.28 Definition The set of isogenies from E1 to E2 is denoted Hom(E1 , E2 ), the set of en-
domorphisms of E is denoted End(E), and the set of automorphisms of E is de-
noted Aut(E). A subscript K indicates that we take only maps that are defined over K,
thus HomK (E1 , E2 ), EndK (E), and AutK (E).
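\[ \phi : E_1 \longrightarrow E_2, \qquad (x, y) \longmapsto \left( \frac{y^2}{x^2}, \frac{y(b - x^2)}{x^2} \right). \]
\[ \hat\phi : E_2 \longrightarrow E_1, \qquad (X, Y) \longmapsto \left( \frac{Y^2}{4X^2}, \frac{Y(a^2 - 4b - X^2)}{8X^2} \right). \]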
A direct computation shows that φ̂ ◦ φ = [2] on E1 and φ ◦ φ̂ = [2] on E2 . The maps φ and φ̂
are examples of dual isogenies as described in Theorem 12.2.41.
12.2.31 Example Let K be a field of characteristic p > 0, let q be a power of p, let E/K be an
elliptic curve given by a Weierstrass equation (12.2.1), and define an elliptic curve E^{(q)}/K
using the Weierstrass equation
12.2.32 Proposition [2670, II.2.3] Let E1/K and E2/K be elliptic curves and let φ : E1 → E2 be
an isogeny. Then either φ(P) = O for all P ∈ E1(K̄), or else φ(E1(K̄)) = E2(K̄). (The
constant map φ(P) = O is the zero isogeny.)
12.2.33 Definition In general, the degree of a finite map φ : C1 → C2 between algebraic curves is
the degree of the extension of function fields K(C1 )/φ∗ K(C2 ). The map is separable if
the field extension K(C1 )/φ∗ K(C2 ) is separable, and otherwise it is inseparable (which
can only happen in finite characteristic). The inseparability degree of φ, denoted degi (φ),
is the inseparability degree of the extension K(E1 )/φ∗ K(E2 ). Thus φ is separable if and
only if degi (φ) = 1, and φ is purely inseparable if degi (φ) = deg(φ).
12.2.34 Example The Frobenius map φq defined in Example 12.2.31 is purely inseparable. In gen-
eral, for integers m and n, the map
12.2.40 Remark For every isogeny E1 → E2 there is an isogeny going in the opposite direction, as
described in the next theorem.
12.2.41 Theorem [2670, III.6.1, III.6.2] Let E1/K and E2/K be elliptic curves, and let φ : E1 →
E2 be an isogeny of degree n defined over K.
12.2.42 Remark Let E/K be an elliptic curve. Recall that Div(E) is the group of formal sums
$\sum_{P \in E(\bar K)} n_P (P)$, where the n_P are integers and only finitely many of them are nonzero.
Also Pic(E), the Picard group of E, is the quotient of Div(E) by the subgroup of principal
divisors, i.e., divisors of functions. So there is an exact sequence
\[ 1 \longrightarrow \bar K^* \longrightarrow \bar K(E)^* \longrightarrow \operatorname{Div}(E) \longrightarrow \operatorname{Pic}(E) \longrightarrow 0. \]
(For background on divisors, see Section 12.1.) On an elliptic curve, the group law can be
used to describe the principal divisors, as in the next result.
12.2.43 Proposition [2670, III.3.5] Let E/K be an elliptic curve. A divisor $D = \sum n_P (P) \in \operatorname{Div}(E)$
is principal if and only if
\[ \sum_{P \in E(\bar K)} n_P = 0 \quad\text{and}\quad \sum_{P \in E(\bar K)} [n_P](P) = O. \]
(We note that the first sum is a sum of integers, while the second sum is a sum of points
on the elliptic curve E.)
12.2.44 Proposition [2670, III.3.4] Let E/K be an elliptic curve, let Div⁰(E) be the group of
divisors of degree 0, and let Pic⁰(E) be the corresponding group of divisor classes. Then
there is an isomorphism
12.2.45 Remark If E/Fq is an elliptic curve defined over a finite field, then E(Fq ) is a finite (abelian)
group.
12.2.46 Theorem (Hasse–Weil estimate) [2670, V.3.1] Let E/Fq be an elliptic curve defined over a
finite field. Then
|q + 1 − #E(Fq)| ≤ 2√q.
aq (E) = q + 1 − #E(Fq )
12.2.48 Remark The following theorem of Birch describes how aq (E) is distributed as E ranges
over all isomorphism classes of elliptic curves defined over Fq .
12.2.49 Theorem [284], [2101, Appendix B]. For each E ∈ Eq (see Definition 12.2.14), write
a_q(E) = 2√q cos θ_q(E) with 0 ≤ θ_q(E) ≤ π. Then for all 0 ≤ α ≤ β ≤ π,
\[ \lim_{q \to \infty} \frac{\#\{E \in \mathcal{E}_q : \alpha \le \theta_q(E) \le \beta\}}{\#\mathcal{E}_q} = \frac{2}{\pi} \int_\alpha^\beta \sin^2(t)\, dt. \]
12.2.50 Remark The number of points in E(Fq ) is constrained by Theorem 12.2.46. The exact
orders that occur are given in the following theorem.
12.2.51 Theorem [2954] Let q = p^n be a prime power, and let b be an integer with |b| ≤ 2√q. Then
there exists an elliptic curve E/Fq with #E(Fq) = q + 1 − b if and only if b satisfies one of
the following conditions:
1. gcd(b, p) = 1;
2. n is even and b = ±2√q;
3. n is even and p ≢ 1 (mod 3) and b = ±√q;
4. n is odd and p equals 2 or 3 and b = ±p^{(n+1)/2};
5. n is odd and b = 0;
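For a prime field F_p with p ≥ 5, both the Hasse–Weil estimate and the list of admissible values of b can be checked by brute force. The sketch below rests on two assumptions: every elliptic curve over F_p with p ≥ 5 has a model y² = x³ + ax + b that is nonsingular exactly when 4a³ + 27b² ≠ 0, and for q = p ≥ 5 only conditions 1 and 5 apply, so that every integer of absolute value at most 2√p should occur as a trace. Function names are illustrative.

```python
from math import isqrt

def count_points(a, b, p):
    """#E(F_p) for y^2 = x^3 + a x + b over F_p, p an odd prime, including O."""
    total = 1                                   # the point at infinity
    for x in range(p):
        v = (x ** 3 + a * x + b) % p
        if v == 0:
            total += 1                          # one point with y = 0
        elif pow(v, (p - 1) // 2, p) == 1:      # Euler's criterion: v is a square
            total += 2
    return total

def traces(p):
    """All values p + 1 - #E(F_p) over nonsingular y^2 = x^3 + ax + b."""
    found = set()
    for a in range(p):
        for b in range(p):
            if (4 * a ** 3 + 27 * b * b) % p == 0:
                continue                        # singular equation, skip
            found.add(p + 1 - count_points(a, b, p))
    return found

for p in (5, 7, 11, 13):
    T = traces(p)
    bound = isqrt(4 * p)                        # largest integer <= 2*sqrt(p)
    print(p, sorted(T), all(t * t <= 4 * p for t in T),
          T == set(range(-bound, bound + 1)))
```

For each of these primes the observed traces fill the whole interval allowed by the Hasse–Weil estimate, as the theorem predicts.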
12.2.5 Twists
12.2.55 Definition Let E/K be an elliptic curve. A twist of E is an elliptic curve E′/K such that E′
is isomorphic to E over K̄, but not necessarily isomorphic over K. Two twists E′/K
and E″/K are equivalent if they are isomorphic over K.
1. $\sum_{D \in \mathbb{F}_q^*/(\mathbb{F}_q^*)^d} \#E_D(\mathbb{F}_q) = d(q + 1)$;
2. $\prod_{D \in \mathbb{F}_q^*/(\mathbb{F}_q^*)^d} \#E_D(\mathbb{F}_q) = \#E(\mathbb{F}_{q^d})$.
12.2.60 Theorem [2670, III.6.4] Let E/K be an elliptic curve, and let K̄ be an algebraic closure
of K.
1. Let p = char(K). Then
E(K̄)[m] ≅ Z/mZ × Z/mZ for all m ≥ 1 with p ∤ m.
(If char(K) = 0, then this holds for all m ≥ 1.)
2. If char(K) = p > 0, then one of the following is true:
E(K̄)[p^r] ≅ Z/p^r Z for all r ≥ 1,
E(K̄)[p^r] = 0 for all r ≥ 1.
12.2.61 Example Let E be an elliptic curve defined by a Weierstrass equation (12.2.1). One can
determine conditions on the coordinates of a point P = (x, y) ∈ E for it to be a torsion
point. For example
[2](P) = O if and only if 4x³ + b_2 x² + 2b_4 x + b_6 = 0
and
[3](P) = O if and only if 3x⁴ + b_2 x³ + 3b_4 x² + 3b_6 x + b_8 = 0.
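The 3-torsion criterion can be tested on a small curve by comparing it with the group law. The sketch below is an illustration under assumptions: it uses the curve y² = x³ + 1 over F_7 (chosen here because it has rational points of order 3), the short Weierstrass form y² = x³ + a4·x + a6, and the values b2 = 0, b4 = 2a4, b6 = 4a6, b8 = −a4² that the standard definitions give for such an equation. All names are hypothetical.

```python
O = None  # the point at infinity

def add(P, Q, a4, p):
    """Group law on y^2 = x^3 + a4 x + a6 over F_p (a6 is not needed here)."""
    if P is O:
        return Q
    if Q is O:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return O
    if x1 != x2:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    else:
        lam = (3 * x1 * x1 + a4) * pow(2 * y1 % p, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def psi3(x, a4, a6, p):
    """3x^4 + b2 x^3 + 3 b4 x^2 + 3 b6 x + b8 for the short Weierstrass form."""
    b2, b4, b6, b8 = 0, 2 * a4, 4 * a6, -a4 * a4
    return (3 * x ** 4 + b2 * x ** 3 + 3 * b4 * x * x + 3 * b6 * x + b8) % p

p, a4, a6 = 7, 0, 1
points = [(x, y) for x in range(p) for y in range(p)
          if (y * y - (x ** 3 + a4 * x + a6)) % p == 0]
by_group_law = {P for P in points if add(add(P, P, a4, p), P, a4, p) is O}
by_polynomial = {P for P in points if psi3(P[0], a4, a6, p) == 0}
print(sorted(by_group_law), sorted(by_polynomial))
```

Both methods single out the same two affine points, (0, 1) and (0, 6), which together with O form the subgroup E(F_7)[3].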
In general, there is a division polynomial ψn (x, y) ∈ K[x, y] with the property that
[n](P ) = O if and only if ψn (x, y) = 0; see [2670, Exercise 3.7]. These polynomials can
be computed recursively using the formula
12.2.62 Remark Letting m = ℓ^i run over larger and larger powers of a prime ℓ, we obtain a module
over the ring of ℓ-adic integers Z_ℓ. This is convenient because Z_ℓ is a ring of characteristic
zero.
12.2.63 Definition Let E/K be an elliptic curve. The ℓ-adic Tate module of E is the inverse limit
12.2.64 Remark Let E/Fq be an elliptic curve, where q is a power of p. The Frobenius map φ_q :
E(F̄q) −→ E(F̄q) (Example 12.2.31) maps E(F̄q)[m] to E(F̄q)[m]. If p ∤ m, then E(F̄q)[m]
is a free Z/mZ-module of rank two, so choosing a Z/mZ-basis T_1, T_2 for E(F̄q)[m], the
Frobenius map satisfies
φ_q(T_1) = aT_1 + cT_2 and φ_q(T_2) = bT_1 + dT_2 for some a, b, c, d ∈ Z/mZ.
Thus the action of φ_q on E(F̄q)[m] is represented by the matrix $\tilde\phi_{q,m} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. The matrix
depends on the choice of the basis for E(F̄q)[m], but its trace and determinant do not.
Curves over finite fields 433
The maps φ̃q,`i : E(Fq )[`i ] → E(Fq )[`i ] fit together to give a map φq,` : T` (E) → T` (E).
Choosing a basis for T` (E) ∼= Z2` allows us to evaluate the trace and the determinant of φq,`
as `-adic numbers.
12.2.65 Remark The following theorem explains why the quantity aq (E) in Definition 12.2.47 is
the trace of the Frobenius map.
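12.2.66 Theorem [2670, V.2.6] Let $E/\mathbb{F}_q$ be an elliptic curve with q a power of the prime p. Then
for every integer m with $p \nmid m$,
$\operatorname{Trace}(\tilde\phi_{q,m}) \equiv a_q(E) \pmod m$ and $\det(\tilde\phi_{q,m}) \equiv q \pmod m$.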
12.2.67 Remark More generally, any isogeny $\phi : E_1 \to E_2$ induces a homomorphism of the associated
Tate modules $T_\ell(E_1) \to T_\ell(E_2)$, and if $\phi$ is defined over K, then the induced map is
$\operatorname{Gal}(\bar K/K)$-invariant.
12.2.68 Theorem [2783] Let $E_1/\mathbb{F}_q$ and $E_2/\mathbb{F}_q$ be elliptic curves defined over a finite field $K = \mathbb{F}_q$, and let $\ell$
be a prime different from the characteristic of $\mathbb{F}_q$. Then the natural map
$\operatorname{Hom}_K(E_1, E_2) \otimes \mathbb{Z}_\ell \longrightarrow \operatorname{Hom}_{\mathbb{Z}_\ell}\bigl(T_\ell(E_1), T_\ell(E_2)\bigr)^{\operatorname{Gal}(\bar K/K)}$
is an isomorphism.
12.2.69 Remark Theorem 12.2.68 is also true for elliptic curves defined over number fields. This
was conjectured by Tate and proven by Faltings [1024].
12.2.70 Theorem (Weil pairing) [2670, III.8.1, III.8.2] Let E/K be an elliptic curve, let m ≥ 1 be
an integer that is prime to the characteristic of K, and let $\mu_m \subset \bar K^*$ denote the group of
m-th roots of unity. There is a pairing
$e_m : E(\bar K)[m] \times E(\bar K)[m] \longrightarrow \mu_m$
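12.2.72 Remark The Weil pairings on $E(\bar K)[\ell^i]$ fit together to give a bilinear, alternating, non-degenerate,
Galois invariant pairing
$e_{\ell^\infty} : T_\ell(E) \times T_\ell(E) \longrightarrow T_\ell(\mu)$,
where $T_\ell(\mu)$ is the inverse limit of $\mu_{\ell^i}$. Further, $e_{\ell^\infty}(P, \hat\phi(Q)) = e_{\ell^\infty}(\phi(P), Q)$.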
12.2.73 Theorem Each of the following recipes computes the Weil pairing $e_m(P, Q)$.
1. Choose a function $f_P \in \bar K(E)$ whose divisor satisfies $\operatorname{div}(f_P) = m(P) - m(O)$.
Choose a function $g_P \in \bar K(E)$ satisfying $f_P \circ [m] = g_P^m$. Then the quantity
$g_P(S + Q)/g_P(S)$ does not depend on the point $S \in E(\bar K)$, and its value
is $e_m(P, Q)$.
2. Choose arbitrary points $S, T \in E(\bar K)$ and choose functions $F_P$ and $F_Q$ in $\bar K(E)$
whose divisors satisfy $\operatorname{div}(F_P) = m(S + P) - m(S)$ and $\operatorname{div}(F_Q) = m(T + Q) - m(T)$.
Then the quantity
$\frac{F_Q(T + P)}{F_Q(T)} \Big/ \frac{F_P(S + Q)}{F_P(S)}$
defined as follows: Let $T \in E(K)[m]$ and $P \in E(K)$. Choose a point $Q \in E(\bar K)$ with
$[m](Q) = P$. Then there exists an $\alpha \in K^*$ such that
$e_m(\sigma(Q) - Q, T) = (\sqrt[m]{\alpha})^\sigma / \sqrt[m]{\alpha}$ for all $\sigma \in \operatorname{Gal}(\bar K/K)$,
and we set $\mathcal{T}(T, P) = \alpha \bmod (K^*)^m$. The Tate pairing may be computed by choosing a
function $f_T \in K(E)$ with divisor $\operatorname{div}(f_T) = m(T) - m(O)$, and then $\mathcal{T}(T, P) = f_T(P + Q)/f_T(Q)$
for any $Q \in E(K)$ such that the functions are defined and nonzero.
12.2.76 Remark If m is large, it is not clear in practice how to compute the functions used to
evaluate the Weil pairing (Theorem 12.2.73) and the Tate pairing (Theorem 12.2.75). A
double-and-add algorithm due to Miller allows these pairings to be computed quite effi-
ciently; see Theorem 16.4.38.
12.2.77 Remark There are functorial definitions of the Weil and Tate pairings from which our
rather ad hoc definitions may be derived. Briefly, the Weil pairing is a pairing between the
m-torsion on an abelian variety A and the m-torsion on its dual $\hat A \cong \operatorname{Ext}(A, \mathbb{G}_m)$, combined
with an identification of $\hat A$ with the Picard group $\operatorname{Pic}^0(A)$. The Tate pairing is a cup product
pairing on Galois cohomology that uses the Weil pairing to map $A[m] \otimes \hat A[m]$ to $\mu_m$. For
details, see for example [2671].
12.2.78 Definition Let A be a finite Q-algebra, i.e., A is a ring that contains Q as a subring and
that is a finite dimensional Q-vector space. An order of A is a subring of A that is
finitely generated as a Z-module and that contains a Q-basis of A. A maximal order is
an order that is contained in no other orders.
12.2.79 Example Let D ∈ Z be a positive integer that is not a perfect square and let Z[δ] be the
ring of integers of $\mathbb{Q}(\sqrt{-D})$. For example, we can take $\delta = \frac{-D + \sqrt{-D}}{2}$. Then every order
in $\mathbb{Q}(\sqrt{-D})$ has the form $\mathbb{Z} + f\mathbb{Z}[\delta]$ for some integer f ≥ 1. The integer f is the conductor
of the order $\mathbb{Z} + f\mathbb{Z}[\delta]$.
12.2.81 Remark A number field has a unique maximal order, its ring of integers. Quaternion alge-
bras may have many maximal orders.
12.2.82 Theorem [2670, III.9.4] Let E/K be an elliptic curve. The endomorphism ring End(E) of E
has one of the following forms:
1. End(E) = Z;
2. End(E) is an order in a quadratic imaginary field $\mathbb{Q}(\sqrt{-D})$;
3. End(E) is an order in a quaternion algebra. (This form is only possible if K is a
finite field.)
12.2.83 Theorem [2670, III.10.1] Let E/K be an elliptic curve. Then its automorphism
group Aut(E) is a finite group whose order is given in the following table, where j(E)
is the j-invariant of E:
12.2.84 Example Let K be a field whose characteristic is neither 2 nor 3, and let $A, B \in K$ with
$4A^3 + 27B^2 \ne 0$. Then
$E_{A,B} : y^2 = x^3 + Ax + B$
is an elliptic curve whose automorphism group is as follows:
1. If $AB \ne 0$, then $\operatorname{Aut}(E_{A,B}) = \mu_2$.
2. If B = 0, then j(E) = 1728 and $\operatorname{Aut}(E_{A,0}) = \mu_4$, where $[\zeta](x, y) = (\zeta^2 x, \zeta y)$ for
$\zeta \in \mu_4$.
3. If A = 0, then j(E) = 0 and $\operatorname{Aut}(E_{0,B}) = \mu_6$, where $[\zeta](x, y) = (\zeta^2 x, \zeta^3 y)$ for
$\zeta \in \mu_6$.
436 Handbook of Finite Fields
12.2.85 Remark Theorem 12.2.51 lists the possible values of #E(Fq ). These determine the associ-
ated endomorphism rings as described in the following theorem.
12.2.86 Theorem [2954] Let E/Fq be an elliptic curve, let φq ∈ End(E) be the q-power Frobenius
map, and let A = End(E) ⊗ Q. Referring to the classification in Theorem 12.2.51:
In Case 1, A = Q(φq ) is a quadratic imaginary field and End(E) is an arbitrary order
in A.
In Case 2, A is a quaternion algebra, φq ∈ Z, and End(E) is a maximal order in A.
In Cases 3–6, A = Q(φq ) is a quadratic imaginary field and End(E) is an order in A
whose conductor is not divisible by p.
12.2.87 Theorem [2670, V.3.1] Let E/K be an elliptic curve defined over a field of characteristic
p > 0. The following are equivalent:
1. $E[p^n] = 0$ for some n ≥ 1, equivalently for all n ≥ 1.
2. The dual of the Frobenius map, $\hat\phi_{p^n} : E^{(p^n)} \to E$, is purely inseparable for some
n ≥ 1, equivalently for all n ≥ 1.
3. The multiplication-by-$p^n$ map $[p^n] : E \to E$ is purely inseparable for some n ≥ 1,
equivalently for all n ≥ 1.
4. The endomorphism ring End(E) is an order in a quaternion algebra.
5. The formal group associated to E has height 2. (See [2670, IV §1] for the construction
of the formal group associated to an elliptic curve and [2670, IV §7] for
the definition of the height of a formal group.)
12.2.88 Definition Let K be a field of positive characteristic p. An elliptic curve E/K is supersingular
if one (equivalently all) of the conditions in Theorem 12.2.87 holds. Otherwise E
is ordinary.
12.2.94 Example The factorizations of the first few polynomials $H_p(T)$ modulo p are listed in the
following table.
p     H_p(T) (mod p)
3     T + 1
5     T^2 + 4T + 1
7     (T + 1)(T + 3)(T + 5)
11    (T + 1)(T + 5)(T + 9)(T^2 + 10T + 1)
13    (T^2 + 4T + 9)(T^2 + 7T + 1)(T^2 + 12T + 3)
17    (T^2 + T + 16)(T^2 + 14T + 1)(T^2 + 16T + 1)(T^2 + 16T + 16)
19    (T + 1)(T + 9)(T + 17)(T^2 + 4T + 1)(T^2 + 13T + 6)(T^2 + 18T + 16)
$D = 4T(1 - T)\frac{d^2}{dT^2} + 4(1 - 2T)\frac{d}{dT} - 1.$
The polynomial $H_p(T)$ in Theorem 12.2.93, Part 3 satisfies
$D H_p(T) = p \sum_{i=0}^{(p-1)/2} (p - 2 - 4i) \binom{(p-1)/2}{i}^2 T^i.$
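The identity can be checked numerically for small primes. The sketch below assumes that $H_p(T) = \sum_i \binom{(p-1)/2}{i}^2 T^i$, which is consistent with the table in Example 12.2.94 but is not stated in this excerpt, and it applies the operator $D = 4T(1-T)\,d^2/dT^2 + 4(1-2T)\,d/dT - 1$ coefficient by coefficient. The function names are illustrative only.

```python
from math import comb

def hasse_poly(p):
    """Integer coefficients c_0..c_m of the assumed H_p(T) = sum_i C(m, i)^2 T^i, m = (p-1)/2."""
    m = (p - 1) // 2
    return [comb(m, i) ** 2 for i in range(m + 1)]

def apply_D(coeffs):
    """Apply D = 4T(1-T) d^2/dT^2 + 4(1-2T) d/dT - 1 to sum_i coeffs[i] * T^i.

    On a monomial, D(T^i) = 4 i^2 T^(i-1) - (2i+1)^2 T^i, so the degree never grows.
    """
    out = [0] * len(coeffs)
    for i, c in enumerate(coeffs):
        out[i] -= (2 * i + 1) ** 2 * c
        if i > 0:
            out[i - 1] += 4 * i * i * c
    return out

def right_hand_side(p):
    """Coefficients of p * sum_i (p - 2 - 4i) C(m, i)^2 T^i."""
    m = (p - 1) // 2
    return [p * (p - 2 - 4 * i) * comb(m, i) ** 2 for i in range(m + 1)]

if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13, 17, 19):
        print(p, apply_D(hasse_poly(p)) == right_hand_side(p))
```

Since every coefficient on the right-hand side carries a factor p, the check also illustrates that $D H_p(T) \equiv 0 \pmod p$.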
12.2.98 Remark Let E/Q be an elliptic curve given by a Weierstrass equation with integer coefficients.
Reducing the coefficients modulo p gives an elliptic curve $\tilde E_p/\mathbb{F}_p$ for all primes $p \nmid \Delta$.
If End(E) is an order in a quadratic imaginary field k, then $\tilde E_p$ is supersingular if p is inert
in k and ordinary if p is split in k; see Definition 12.4.3.
12.2.99 Remark The situation if End(E) = Z is more complicated. Serre and Elkies [969, 2590] have
proven that $SS(X) = \#\{p < X : \tilde E_p \text{ is supersingular}\}$ is smaller than $X^{3/4+\epsilon}$ as X → ∞.
Lang and Trotter have conjectured [1848] that SS(X) is asymptotic to $C\sqrt{X}/\log(X)$ for
a certain positive constant C. In the opposite direction, Elkies [968] has proven that $\tilde E_p$ is
supersingular for infinitely many p, i.e., SS(X) → ∞ as X → ∞.
12.2.100 Remark This section describes the zeta function of an elliptic curve. See Sections 11.6
and 12.7 for zeta functions of arbitrary curves and higher dimensional algebraic varieties.
12.2.101 Definition Let $E/\mathbb{F}_q$ be an elliptic curve. The zeta function of $E/\mathbb{F}_q$ is the formal power
series
$Z(E/\mathbb{F}_q, T) = \exp\Bigl( \sum_{n=1}^{\infty} \#E(\mathbb{F}_{q^n}) \frac{T^n}{n} \Bigr).$
12.2.102 Remark It might seem more natural to use the series $\sum_{n=1}^{\infty} \#E(\mathbb{F}_{q^n}) T^n$, but the series
defining $Z(E/\mathbb{F}_q, T)$ has better transformation properties.
12.2.103 Theorem [2670, V.2.4] Let $E/\mathbb{F}_q$ be an elliptic curve, and let $a = a_q(E)$ be the trace of
Frobenius for E (Definition 12.2.47). Then
$Z(E/\mathbb{F}_q, T) = \frac{1 - aT + qT^2}{(1 - T)(1 - qT)}.$ (12.2.3)
This is an instance of Poincaré duality. For the general statement of Poincaré duality and
the associated functional equation for zeta functions of varieties over finite fields, see The-
orems 12.7.18 and 12.7.20.
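12.2.105 Remark The formula (12.2.3) for $Z(E/\mathbb{F}_q, T)$ is equivalent to the statement that if we
factor $1 - aT + qT^2 = (1 - \alpha T)(1 - \beta T)$ over the complex numbers, then
$\#E(\mathbb{F}_{q^n}) = q^n + 1 - \alpha^n - \beta^n$ for all $n \ge 1$.
Further, the Hasse–Weil estimate (Theorem 12.2.46) is equivalent to the statement that α
and β are complex conjugates satisfying $|\alpha| = |\beta| = \sqrt{q}$. So in particular,
$|\#E(\mathbb{F}_{q^n}) - q^n - 1| \le 2q^{n/2}$ for all $n \ge 1$.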
12.2.106 Example Consider the curve $E/\mathbb{F}_2$ defined by the Weierstrass equation
$E : y^2 + xy = x^3 + 1.$
Then $\#E(\mathbb{F}_2) = 4$ and $a_2(E) = -1$. The reciprocals of the roots of $1 + T + 2T^2$ are $\frac{-1 \pm \sqrt{-7}}{2}$, so for all n ≥ 1,
$\#E(\mathbb{F}_{2^n}) = 2^n + 1 - \Bigl(\frac{-1 + \sqrt{-7}}{2}\Bigr)^n - \Bigl(\frac{-1 - \sqrt{-7}}{2}\Bigr)^n.$
Using this formula, it is easy to compute $\#E(\mathbb{F}_{2^n})$ for large values of n, for example,
$\#E(\mathbb{F}_{2^{173}}) = 2^{173} + 1 - 67870783603944754053042229.$
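In practice the power sums $\alpha^n + \beta^n$ need no complex arithmetic: since $\alpha + \beta = a$ and $\alpha\beta = q$, they satisfy the integer recurrence $s_n = a\,s_{n-1} - q\,s_{n-2}$ with $s_0 = 2$, $s_1 = a$. The following sketch (function names are illustrative, not from the text) evaluates the point-count formula this way.

```python
def trace_power_sums(a, q, n_max):
    """Return [s_0, ..., s_n_max] with s_n = alpha^n + beta^n,
    where 1 - a*T + q*T^2 = (1 - alpha*T)(1 - beta*T)."""
    s = [2, a]
    for _ in range(2, n_max + 1):
        # Newton-type recurrence from alpha + beta = a and alpha * beta = q.
        s.append(a * s[-1] - q * s[-2])
    return s[: n_max + 1]

def count_points(a, q, n):
    """#E(F_{q^n}) = q^n + 1 - (alpha^n + beta^n), for a curve with trace a over F_q."""
    return q ** n + 1 - trace_power_sums(a, q, n)[n]

if __name__ == "__main__":
    # The curve y^2 + xy = x^3 + 1 over F_2 has trace a = -1.
    for n in (1, 2, 3, 173):
        print(n, count_points(-1, 2, n))
```

For the curve of this example, `count_points(-1, 2, n)` gives 4, 8, 4 for n = 1, 2, 3.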
12.2.107 Remark The elliptic curve discrete logarithm problem (ECDLP) underlies the use of elliptic
curves in cryptography. In this section we discuss ECDLP and some related problems. For
applications to cryptography, see Section 16.4.
12.2.108 Definition Let $E/\mathbb{F}_q$ be an elliptic curve and let $P, Q \in E(\mathbb{F}_q)$. A (discrete) logarithm of
Q to the base P is an integer N such that Q = [N](P). The discrete logarithm, which
is denoted by $\log_P(Q)$, is well-defined modulo the order $m_P$ of the element P in the
group $E(\mathbb{F}_q)$, so one may view $\log_P$ as a group homomorphism
$\log_P : \langle P \rangle \longrightarrow \mathbb{Z}/m_P\mathbb{Z}$, where $\langle P \rangle$ is the subgroup generated by P.
The elliptic curve discrete logarithm problem (ECDLP) is the problem of computing
$\log_P(Q)$ for given points P and Q. (Note the analogy with the classical discrete
logarithm problem for the multiplicative group $\mathbb{F}_q^*$; see Section 11.6.)
12.2.109 Remark If the order $m_P$ of P is prime, then the fastest known general algorithm for solving
the ECDLP (as of 2012) has running time on the order of $\sqrt{m_P}$. This may be compared to
the DLP in $\mathbb{F}_q^*$, for which there are algorithms with running times that are subexponential
in log q.
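To make the square-root running time concrete, here is a minimal sketch of one generic square-root method, baby-step giant-step, on a short Weierstrass curve $y^2 = x^3 + ax + b$ over a prime field. It is an illustration only (the text does not single out a specific algorithm, and Pollard-type methods are usually preferred because they need little memory). The curve, the point, and all function names are assumptions chosen for the example.

```python
from math import isqrt

def ec_add(P, Q, a, p):
    """Affine addition on y^2 = x^3 + a*x + b over F_p (p odd); None encodes O."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                      # P + (-P) = O
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def ec_mul(n, P, a, p):
    """Compute [n]P for n >= 0 by double-and-add."""
    R = None
    while n > 0:
        if n & 1:
            R = ec_add(R, P, a, p)
        P = ec_add(P, P, a, p)
        n >>= 1
    return R

def bsgs_log(P, Q, a, p, bound):
    """Return some N in [0, bound) with [N]P = Q, or None if there is none.

    Uses about sqrt(bound) group operations and stored points.
    """
    m = isqrt(bound - 1) + 1             # guarantees m * m >= bound
    baby = {}
    R = None
    for j in range(m):                   # baby steps: store [j]P
        baby.setdefault(R, j)
        R = ec_add(R, P, a, p)
    neg_mP = None if R is None else (R[0], -R[1] % p)   # R is [m]P here
    G = Q
    for i in range(m):                   # giant steps: Q - [i*m]P
        if G in baby:
            return i * m + baby[G]
        G = ec_add(G, neg_mP, a, p)
    return None

if __name__ == "__main__":
    p, a = 101, -3                       # y^2 = x^3 - 3x + 7 over F_101
    P = (3, 5)
    Q = ec_mul(57, P, a, p)
    N = bsgs_log(P, Q, a, p, p + 3 + 2 * isqrt(p))      # Hasse bound on the group order
    print(N, ec_mul(N, P, a, p) == Q)
```

The returned N agrees with the secret multiplier only modulo the order of P; the final check `[N]P == Q` is what the definition of a discrete logarithm requires.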
12.2.110 Definition Let E/Fq be an elliptic curve and let P ∈ E(Fq ). The (computational ) el-
liptic curve Diffie–Hellman problem (ECDHP-comp) is the following: Given the values
of P , [M ](P ), and [N ](P ), compute the value of [M N ](P ).
12.2.111 Definition The (decisional ) elliptic curve Diffie–Hellman problem (ECDHP-dec) is the
following: Given the values of P , [M ](P ), and [N ](P ), with better than equal probability,
distinguish between the points [M N ](P ) and a randomly chosen point Q.
12.2.112 Definition The embedding degree of the integer m in the field $\mathbb{F}_q$ is the smallest positive integer k
such that $\mu_m \subset \mathbb{F}_{q^k}^*$, where $\mu_m$ is the group of m-th roots of unity. Equivalently, it is
the smallest positive integer k such that $q^k \equiv 1 \pmod m$.
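The second characterization makes the embedding degree easy to compute: it is the multiplicative order of q modulo m. A minimal sketch follows, assuming gcd(q, m) = 1 (otherwise no such k exists); the function name is illustrative.

```python
from math import gcd

def embedding_degree(q, m):
    """Smallest k >= 1 with q^k = 1 (mod m), i.e. the order of q in (Z/mZ)^*."""
    if m < 1 or gcd(q, m) != 1:
        raise ValueError("need m >= 1 and gcd(q, m) = 1")
    k, power = 1, q % m
    while power != 1 % m:        # 1 % m handles the trivial case m = 1
        power = power * q % m
        k += 1
    return k

if __name__ == "__main__":
    print(embedding_degree(2, 7))    # 2^3 = 8 = 1 (mod 7)
    print(embedding_degree(11, 5))   # 11 = 1 (mod 5)
```

For example, `embedding_degree(2, 7)` is 3 and `embedding_degree(11, 5)` is 1. The loop takes k steps, so for cryptographic sizes one would instead test the divisors of the order of $(\mathbb{Z}/m\mathbb{Z})^*$.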
12.2.113 Remark The importance of the embedding degree is that it describes the degree of an
extension field over which the Weil and Tate pairings are defined.
12.2.114 Remark [2079], [2670, XI.6.1] Let $E/\mathbb{F}_q$, m ≥ 1, and $T \in E(\mathbb{F}_q)[m]$ be as in Theorem 12.2.75,
and let $\mathcal{T}$ be the Tate pairing. Then Menezes, Okamoto, and Vanstone have
noted that the ECDLP in $E(\mathbb{F}_q)$ can be reduced to the DLP in $\mathbb{F}_q^*$, since if $N = \log_P(Q)$,
then $\mathcal{T}(P, Q) = \mathcal{T}(P, P)^N$. Similarly, the decisional ECDHP is as easy to solve as computing
the Tate pairing, since (with rare exceptions) $\mathcal{T}([M]P, [N]P) = \mathcal{T}(P, Q)$ if and only if
Q = [MN]P.
12.2.115 Definition An elliptic curve E/Fp is anomalous if ap (E) = 1, that is, if #E(Fp ) = p.
12.2.116 Remark Let p ≥ 3 and let E/Fp be an anomalous elliptic curve. Then Araki, Satoh,
Semaev, and Smart [2535, 2581, 2682] observed that the ECDLP in E(Fp ) can be solved in
essentially linear time [2670, XI.6.5].
See Also
References Cited: [284, 556, 557, 825, 968, 969, 1024, 1563, 1756, 1773, 1843, 1845, 1848,
2054, 2079, 2101, 2107, 2497, 2535, 2581, 2590, 2667, 2670, 2671, 2672, 2682, 2783, 2784,
2950, 2954]
12.3.1 Remark Section 12.2 defined elliptic curves using Weierstrass equations (12.2.1). The fol-
lowing definitions present other curve shapes which have algorithmic advantages.
12.3.2 Definition A short Weierstrass curve over a field K of characteristic not equal to 2 is a
curve of the form $y^2 = x^3 + ax + b$ with a, b ∈ K and $4a^3 + 27b^2 \ne 0$.
12.3.3 Definition A short Weierstrass curve over a field K of characteristic 2 is a curve of the
form $y^2 + xy = x^3 + ax^2 + b$ with a, b ∈ K and $b \ne 0$.
12.3.4 Remark The curve shape defined in Definition 12.3.3 covers only ordinary curves; supersingular
curves over a field K of characteristic 2 have short Weierstrass equations of the form
$y^2 + cy = x^3 + ax + b$ with a, b, c ∈ K and $c \ne 0$.
12.3.5 Definition A Montgomery curve over a field K of characteristic not equal to 2 is a curve
of the form $by^2 = x^3 + ax^2 + x$ with $a \in K \setminus \{-2, 2\}$ and $b \in K \setminus \{0\}$.
12.3.6 Definition An Edwards curve over a field K of characteristic not equal to 2 is a curve of
the form $x^2 + y^2 = 1 + dx^2y^2$ with $d \in K \setminus \{0, 1\}$.
A twisted Edwards curve over a field K of characteristic not equal to 2 is a curve of
the form $ax^2 + y^2 = 1 + dx^2y^2$ with $a, d \in K \setminus \{0\}$ and $a \ne d$.
12.3.7 Definition A Hessian curve over a field K is a curve of the form $x^3 + y^3 + 1 = dxy$ with
d ∈ K and $d^3 \ne 27$.
A twisted Hessian curve over a field K is a curve of the form $ax^3 + y^3 + 1 = dxy$
with a, d ∈ K, $a \ne 0$, and $d^3 \ne 27a$.
12.3.8 Remark Other curve shapes studied in the literature include binary Edwards curves, Jacobi
quartics, Jacobi intersections, and Doche-Icart-Kohel curves. Chudnovsky and Chudnovsky’s
work [636] studied addition formulas on Hessian curves, Jacobi quartics, Jacobi
intersections, and Weierstrass curves. Montgomery curves were proposed by Montgomery
in [2133] in the context of the elliptic-curve method to factor integers. Edwards curves
were introduced by Edwards in [956] in the form $x^2 + y^2 = a^2(1 + x^2y^2)$, and generalized
to $x^2 + y^2 = 1 + dx^2y^2$ by Bernstein and Lange in [247]. Twisted Edwards curves were
introduced by Bernstein, Birkner, Joye, Lange, and Peters in [244].
12.3.2 Addition
12.3.9 Remark Given a nonsingular projective genus one curve with a specified neutral element
O, one can abstractly define addition of points P on the curve to correspond to addition of
divisors P −O modulo principal divisors. However, computations use more concrete addition
laws specified as rational functions of the input coordinates.
12.3.10 Remark Addition on Weierstrass curves is described concretely in Definition 12.2.19 and
Algorithm 12.2.21. It is necessary to distinguish generic additions from doublings and from
computations involving O as input or output; the generic addition formulas fail for those
cases.
There are other addition formulas that do not require so many distinctions. An addition
law is strongly unified if it can be used to double generic points. It is K-complete (abbre-
viated complete when K is clear from context) if it can be used to add each pair of points
defined over K.
12.3.11 Example Consider the twisted Edwards curve $ax^2 + y^2 = 1 + dx^2y^2$ over K with neutral
element (0, 1). The negative of (x, y) on this curve is (−x, y). The Edwards addition law,
valid for almost all points $P = (x_1, y_1)$ and $Q = (x_2, y_2)$ on the curve, states that the sum
P + Q is
$\left( \frac{x_1y_2 + y_1x_2}{1 + dx_1x_2y_1y_2}, \; \frac{y_1y_2 - ax_1x_2}{1 - dx_1x_2y_1y_2} \right).$
The Edwards addition law is strongly unified: it does not make an exception for P = Q. If
a is a square in K and d is not a square in K then the Edwards addition law is K-complete.
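The completeness claim can be observed directly on a small example. The sketch below implements the Edwards addition law over a prime field $\mathbb{F}_p$; the parameters p = 13, a = 1, d = 2 are an arbitrary illustrative choice (a is a square and d is a non-square modulo 13), and the function names are not from the text.

```python
def on_curve(P, a, d, p):
    """Check a*x^2 + y^2 = 1 + d*x^2*y^2 over F_p."""
    x, y = P
    return (a * x * x + y * y - 1 - d * x * x * y * y) % p == 0

def edwards_add(P, Q, a, d, p):
    """Twisted Edwards addition law over F_p.

    pow(., -1, p) raises ValueError if a denominator vanishes, which cannot
    happen for F_p-points when a is a square and d is a non-square mod p.
    """
    x1, y1 = P
    x2, y2 = Q
    t = d * x1 * x2 * y1 * y2 % p
    x3 = (x1 * y2 + y1 * x2) * pow(1 + t, -1, p) % p
    y3 = (y1 * y2 - a * x1 * x2) * pow(1 - t, -1, p) % p
    return (x3, y3)

def curve_points(a, d, p):
    """Brute-force list of all affine F_p-points (only sensible for tiny p)."""
    return [(x, y) for x in range(p) for y in range(p) if on_curve((x, y), a, d, p)]

if __name__ == "__main__":
    a, d, p = 1, 2, 13
    pts = curve_points(a, d, p)
    sums_ok = all(on_curve(edwards_add(P, Q, a, d, p), a, d, p) for P in pts for Q in pts)
    print(len(pts), sums_ok)
```

The same function handles P = Q, P = −Q, and the neutral element (0, 1) without case distinctions, which is the practical meaning of "strongly unified" and "complete".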
12.3.12 Example Consider the twisted Hessian curve $ax^3 + y^3 + 1 = dxy$ over K with neutral
element (0, −1). The negative of (x, y) on this curve is (x/y, 1/y). The rotated Hessian
addition law, valid for almost all points $P = (x_1, y_1)$ and $Q = (x_2, y_2)$ on the curve, states
that the sum P + Q is
$\left( \frac{x_1 - y_1^2x_2y_2}{ax_1y_1x_2^2 - y_2}, \; \frac{y_1y_2^2 - ax_1^2x_2}{ax_1y_1x_2^2 - y_2} \right).$
12.3.14 Remark Additions in affine coordinates require inversions in the underlying field K. Inversions
are computationally expensive compared to multiplications and additions. Implementations
thus work with fractions, delaying the inversions for as long as possible. Mathematically
this means working in projective coordinates.
12.3.15 Definition The projective representations over a field K of a vector $(x, y) \in K^2$ are the
vectors $(X, Y, Z) \in K^3$ such that $Z \ne 0$, X/Z = x, and Y/Z = y.
Bernstein and Lange showed in [249] that these two addition laws cover all possible pairs
of curve points; the outputs coincide if they are both defined; each defined output is on the
curve; and this addition turns the set of curve points into a group.
An analogous 56-monomial pair of addition laws for Weierstrass curves was presented
by Bosma and Lenstra in [362] before Edwards curves were introduced.
12.3.24 Definition Addition formulas with inputs and output in projective coordinates are pro-
jective addition formulas. Addition formulas with one input and output in projective
coordinates but the other input in affine coordinates are mixed addition formulas.
12.3.25 Remark Affine coordinates are equivalent to projective coordinates with Z = 1; mixed
addition formulas eliminate multiplications by 1 and sometimes save time in other ways.
The literature also contains various speedups for other types of restricted representations,
such as additions of projective points having X = 1 or of two points having the same
Z-coordinate.
12.3.26 Remark Often an addition formula involves some computations that depend only on one
input point. A readdition of the same input point reuses those computations.
12.3.27 Remark The following subsections state the most efficient explicit formulas known for the
most popular curve shapes used in computations. Cost is reported as squarings (S), multiplications
by curve constants (D), and general multiplications (M); multiplications by
curve constants are counted separately because often these constants can be chosen small.
Additions and subtractions are ignored. Coordinate systems here are chosen primarily to
minimize doubling cost and secondarily to minimize addition cost, where cost counts multiplications
and a fraction of squarings; the fraction is 0.8 in characteristic ≠ 2 and 0.2 in characteristic 2.
See the Explicit Formulas Database [246] (http://hyperelliptic.org/EFD/)
for a much more comprehensive collection of curve shapes, coordinate systems, computer-verified
addition formulas, and references.
12.3.28 Remark The following algorithms choose weights (2, 3, 1) and a = −3. These choices min-
imize the cost of known doubling algorithms on short Weierstrass curves, except for a few
curves (at most 6 isomorphism classes) having a = 0. All of the standard NIST curves over
prime fields have a = −3, and almost all curves over prime fields have low-degree isogenies
to curves with a = −3.
12.3.29 Algorithm: Doubling.
Input: $P_1 = (X_1 : Y_1 : Z_1)$.
Output: $P_3 = (X_3 : Y_3 : Z_3) = 2P_1$.
$\delta = Z_1^2$; $\gamma = Y_1^2$; $\beta = X_1\gamma$; $\alpha = 3(X_1 - \delta)(X_1 + \delta)$; $X_3 = \alpha^2 - 8\beta$; $Z_3 = (Y_1 + Z_1)^2 - \gamma - \delta$;
$Y_3 = \alpha(4\beta - X_3) - 8\gamma^2$.
12.3.30 Algorithm: Addition.
Input: $P_1 = (X_1 : Y_1 : Z_1)$, $P_2 = (X_2 : Y_2 : Z_2)$.
Output: $P_3 = (X_3 : Y_3 : Z_3) = P_1 + P_2$.
$A = Z_1^2$; $B = Z_2^2$; $U_1 = X_1 \cdot B$; $U_2 = X_2 \cdot A$; $S_1 = Y_1 \cdot Z_2 \cdot B$; $S_2 = Y_2 \cdot Z_1 \cdot A$; $H = U_2 - U_1$;
$I = (2H)^2$; $J = H \cdot I$; $r = 2(S_2 - S_1)$; $V = U_1 \cdot I$; $X_3 = r^2 - J - 2V$; $Y_3 = r \cdot (V - X_3) - 2S_1 \cdot J$;
$Z_3 = ((Z_1 + Z_2)^2 - A - B) \cdot H$.
12.3.31 Remark Doubling takes 3M + 5S. Addition takes 11M + 5S. Readdition of $P_2$ saves 1M + 1S
by caching B and $Z_2 \cdot B$. Mixed addition with $Z_2 = 1$ takes 7M + 4S: it skips all operations
involving $Z_2$ and computes $K = H^2$; $I = 4K$; $Z_3 = (Z_1 + H)^2 - A - K$.
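The doubling and addition formulas above translate line by line into code. The following sketch works over a prime field $\mathbb{F}_p$ with weights (2, 3, 1), so that x = X/Z² and y = Y/Z³, and with a = −3; the test curve $y^2 = x^3 - 3x + 7$ over $\mathbb{F}_{101}$ and the function names are illustrative assumptions. As in the text, the generic addition is not meant for inputs with $P_1 = \pm P_2$ or with a point at infinity.

```python
def jac_double(P, p):
    """Algorithm 12.3.29: doubling for a = -3, weights (2, 3, 1)."""
    X1, Y1, Z1 = P
    delta = Z1 * Z1 % p
    gamma = Y1 * Y1 % p
    beta = X1 * gamma % p
    alpha = 3 * (X1 - delta) * (X1 + delta) % p
    X3 = (alpha * alpha - 8 * beta) % p
    Z3 = ((Y1 + Z1) ** 2 - gamma - delta) % p
    Y3 = (alpha * (4 * beta - X3) - 8 * gamma * gamma) % p
    return (X3, Y3, Z3)

def jac_add(P, Q, p):
    """Algorithm 12.3.30: generic addition (requires P != +-Q, both finite)."""
    X1, Y1, Z1 = P
    X2, Y2, Z2 = Q
    A = Z1 * Z1 % p
    B = Z2 * Z2 % p
    U1 = X1 * B % p
    U2 = X2 * A % p
    S1 = Y1 * Z2 * B % p
    S2 = Y2 * Z1 * A % p
    H = (U2 - U1) % p
    I = (2 * H) ** 2 % p
    J = H * I % p
    r = 2 * (S2 - S1) % p
    V = U1 * I % p
    X3 = (r * r - J - 2 * V) % p
    Y3 = (r * (V - X3) - 2 * S1 * J) % p
    Z3 = ((Z1 + Z2) ** 2 - A - B) * H % p
    return (X3, Y3, Z3)

def to_affine(P, p):
    """Map (X : Y : Z) with Z != 0 to (X/Z^2, Y/Z^3); the only inversion is here."""
    X, Y, Z = P
    zinv = pow(Z, -1, p)
    return (X * zinv ** 2 % p, Y * zinv ** 3 % p)

if __name__ == "__main__":
    p = 101
    P = (3, 5, 1)                  # the point (3, 5) on y^2 = x^3 - 3x + 7
    P2 = jac_double(P, p)
    P3 = jac_add(P2, P, p)
    print(to_affine(P2, p), to_affine(P3, p))
```

On this curve the sketch yields 2P = (24, 86) and 3P = (93, 23), matching the affine chord-and-tangent formulas; the point of the projective form is that a whole scalar multiplication needs only the single inversion in `to_affine`.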
12.3.32 Remark The following algorithms for short binary Weierstrass curves choose weights
(1, 2, 1), called Lopez-Dahab coordinates. The formulas use $\sqrt{a_6}$ as a curve constant, assuming
implicitly that $a_6$ is a square (which is automatic for characteristic-2 finite fields
and other characteristic-2 perfect fields).
For fields $\mathbb{F}_{2^n}$ with n odd each isomorphism class contains a curve with $a_2 = 1$ or one
with $a_2 = 0$. Optimizations are different, as the following algorithms show. Lopez-Dahab
coordinates (X : Y : Z) are extended to include $T = Z^2$, and further extended to include
W = XZ for $a_2 = 0$.
Differential-addition formulas represent a point by its x-coordinate, and represent x in
turn as a fraction X/Z. These formulas do not differ between $a_2 = 0$ and $a_2 = 1$.
12.3.33 Algorithm: Doubling, $a_2 = 1$.
Input: $P_1 = (X_1 : Y_1 : Z_1 : T_1)$.
Output: $P_3 = (X_3 : Y_3 : Z_3 : T_3) = 2P_1$.
$A = X_1^2$; $B = Y_1^2$; $Z_3 = T_1 \cdot A$; $T_3 = Z_3^2$; $X_3 = (A + \sqrt{a_6}\,T_1)^2$; $Y_3 = B \cdot (B + X_3 + Z_3) + a_6T_3 + T_3$.
12.3.34 Algorithm: Addition, $a_2 = 1$.
Input: $P_1 = (X_1 : Y_1 : Z_1 : T_1)$, $P_2 = (X_2 : Y_2 : Z_2 : T_2)$.
Output: $P_3 = (X_3 : Y_3 : Z_3 : T_3) = P_1 + P_2$.
$A = X_1 \cdot Z_2$; $B = X_2 \cdot Z_1$; $C = A^2$; $D = B^2$; $E = A + B$; $F = C + D$; $G = Y_1 \cdot T_2$;
$H = Y_2 \cdot T_1$; $I = G + H$; $J = I \cdot E$; $Z_3 = F \cdot Z_1 \cdot Z_2$; $T_3 = Z_3^2$; $X_3 = A \cdot (H + D) + B \cdot (C + G)$;
$Y_3 = (A \cdot J + F \cdot G) \cdot F + (J + Z_3) \cdot X_3$.
12.3.35 Algorithm: Mixed addition, $a_2 = 1$.
Input: $P_1 = (X_1 : Y_1 : Z_1 : T_1)$, $P_2 = (X_2 : Y_2 : 1 : 1)$.
Output: $P_3 = (X_3 : Y_3 : Z_3 : T_3) = P_1 + P_2$.
$A = X_1 + X_2 \cdot Z_1$; $B = Y_1 + Y_2 \cdot T_1$; $C = A \cdot Z_1$; $D = C \cdot (B + C)$; $Z_3 = C^2$; $T_3 = Z_3^2$;
$X_3 = B^2 + C \cdot A^2 + D$; $Y_3 = (X_3 + X_2 \cdot Z_3) \cdot D + (X_2 + Y_2) \cdot T_3$.
12.3.36 Remark For a2 = 1, doubling takes 2M + 4S + 2D; addition takes 13M + 3S; readdition
does not save anything; mixed addition with Z2 = 1 takes 8M + 4S.
12.3.37 Algorithm: Doubling, $a_2 = 0$.
Input: $P_1 = (X_1 : Y_1 : Z_1 : T_1 : W_1)$.
Output: $P_3 = (X_3 : Y_3 : Z_3 : T_3 : W_3) = 2P_1$.
$A = X_1^2$; $B = Y_1^2$; $Z_3 = W_1^2$; $X_3 = (A + \sqrt{a_6}\,T_1)^2$; $T_3 = Z_3^2$; $W_3 = X_3 \cdot Z_3$;
$Y_3 = B \cdot (B + X_3 + Z_3) + a_6T_3 + W_3$.
12.3.38 Algorithm: Addition, $a_2 = 0$.
Input: $P_1 = (X_1 : Y_1 : Z_1 : T_1 : W_1)$, $P_2 = (X_2 : Y_2 : Z_2 : T_2 : W_2)$.
Output: $P_3 = (X_3 : Y_3 : Z_3 : T_3 : W_3) = P_1 + P_2$.
$A = X_1 \cdot Z_2$; $B = X_2 \cdot Z_1$; $C = A^2$; $D = B^2$; $E = A + B$; $F = C + D$; $G = Y_1 \cdot T_2$;
$H = Y_2 \cdot T_1$; $I = G + H$; $J = I \cdot E$; $Z_3 = F \cdot Z_1 \cdot Z_2$; $X_3 = A \cdot (H + D) + B \cdot (C + G)$;
$Y_3 = (A \cdot J + F \cdot G) \cdot F + (J + Z_3) \cdot X_3$; $T_3 = Z_3^2$; $W_3 = X_3 \cdot Z_3$.
12.3.39 Algorithm: Mixed addition, $a_2 = 0$.
Input: $P_1 = (X_1 : Y_1 : Z_1 : T_1 : W_1)$, $P_2 = (X_2 : Y_2 : 1 : 1 : 1)$.
Output: $P_3 = (X_3 : Y_3 : Z_3 : T_3 : W_3) = P_1 + P_2$.
$A = Y_1 + Y_2 \cdot T_1$; $B = X_1 + X_2 \cdot Z_1$; $C = B \cdot Z_1$; $Z_3 = C^2$; $T_3 = Z_3^2$; $D = X_2 \cdot Z_3$;
$X_3 = A^2 + C \cdot (A + B^2 + a_2C)$; $Y_3 = (D + X_3) \cdot (A \cdot C + Z_3) + (Y_2 + X_2) \cdot T_3$; $W_3 = X_3 \cdot Z_3$.
12.3.40 Remark For a2 = 0, doubling takes 2M + 5S + 2D; addition takes 14M + 3S; readdition
does not save anything; mixed addition with Z2 = 1 takes 8M + 4S + 1D.
12.3.41 Algorithm: Differential addition and doubling.
Input: $P_2 = (X_2 : Z_2)$, $P_3 = (X_3 : Z_3)$ with $x(P_3 - P_2) = x_1$.
Output: $P_4 = (X_4 : Z_4) = 2P_2$, $P_5 = (X_5 : Z_5) = P_2 + P_3$.
$A = X_2 \cdot Z_3$; $B = X_3 \cdot Z_2$; $C = X_2^2$; $D = Z_2^2$; $Z_5 = (A + B)^2$; $X_5 = x_1 \cdot Z_5 + A \cdot B$;
$X_4 = (C + \sqrt{a_6}\,D)^2$; $Z_4 = C \cdot D$.
12.3.42 Remark The combined operation of differential addition and doubling takes 5M + 4S + 1D.
The Montgomery ladder thus takes 5M + 4S + 1D per bit of the scalar.
12.3.43 Remark Montgomery curves are of interest primarily for their efficient differential addition.
Points are represented via their x-coordinate x(P ) = X/Z. The formulas depend on the
curve constant c = (a + 2)/4.
12.3.44 Algorithm: Differential addition and doubling.
Input: P2 = (X2 : Z2 ), P3 = (X3 : Z3 ) with x(P3 − P2 ) = x1 .
Output: P4 = (X4 : Z4 ) = 2P2 , P5 = (X5 : Z5 ) = P2 + P3 .
$A = X_2 + Z_2$; $\bar A = A^2$; $B = X_2 - Z_2$; $\bar B = B^2$; $E = \bar A - \bar B$; $C = X_3 + Z_3$; $D = X_3 - Z_3$;
$F = D \cdot A$; $G = C \cdot B$; $X_5 = (F + G)^2$; $Z_5 = x_1 \cdot (F - G)^2$; $X_4 = \bar A \cdot \bar B$; $Z_4 = E \cdot (\bar B + cE)$.
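Iterating this combined step once per scalar bit gives the Montgomery ladder. The sketch below is one possible arrangement over a prime field $\mathbb{F}_p$: it keeps a pair of x-coordinates whose difference is always the base point, and uses Z = 0 to encode the point at infinity. The function names are illustrative, and the input is assumed to satisfy $x_1 \ne 0$.

```python
def ladder_step(P2, P3, x1, c, p):
    """Differential addition and doubling (formulas of Algorithm 12.3.44).

    Given (X : Z) pairs for P2, P3 with x(P3 - P2) = x1, return (2*P2, P2 + P3).
    """
    X2, Z2 = P2
    X3, Z3 = P3
    A = (X2 + Z2) % p
    AA = A * A % p
    B = (X2 - Z2) % p
    BB = B * B % p
    E = (AA - BB) % p
    C = (X3 + Z3) % p
    D = (X3 - Z3) % p
    F = D * A % p
    G = C * B % p
    X5 = (F + G) ** 2 % p
    Z5 = x1 * (F - G) ** 2 % p
    X4 = AA * BB % p
    Z4 = E * (BB + c * E) % p
    return (X4, Z4), (X5, Z5)

def ladder(n, x1, a, p):
    """Return (X : Z) with X/Z = x([n]P) for a point P with x(P) = x1 != 0.

    The curve is b*y^2 = x^3 + a*x^2 + x; the x-only formulas do not involve b.
    """
    c = (a + 2) * pow(4, -1, p) % p
    R0, R1 = (1, 0), (x1 % p, 1)          # invariant: R1 - R0 = P
    for bit in bin(n)[2:]:
        if bit == "1":
            R1, R0 = ladder_step(R1, R0, x1, c, p)   # (2*R1, R0 + R1)
        else:
            R0, R1 = ladder_step(R0, R1, x1, c, p)   # (2*R0, R0 + R1)
    return R0

if __name__ == "__main__":
    p, a = 1009, 6
    X, Z = ladder(5, 2, a, p)
    print(X, Z)
```

Both branches perform the same sequence of field operations, which is why the ladder has a uniform cost per bit of the scalar.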
12.3.45 Remark The combined operation of differential addition and doubling takes 5M + 4S + 1D.
A Montgomery curve is birationally equivalent to a twisted Edwards curve. Differential
addition formulas on twisted Edwards curves trade 1M for 1D; if the curve constant can
be chosen small those formulas are more efficient.
See Also
References Cited: [244, 246, 247, 249, 362, 636, 956, 2133]
Recall that by Remark 12.1.119 every K-rational point on a projective curve over a field
K, and hence also every K-rational point on an affine curve over K, corresponds to a degree
one place of the associated function field. This section predominantly uses the language of
curves and points rather than function fields and places.
12.4.2 Remark The curve defined by a hyperelliptic equation (12.4.1) is non-singular if and only
if for no point $(x_0, y_0)$ on the curve, both partial derivatives vanish, i.e., there is no point on the curve with
$2y_0 + h(x_0) = 0$ and $h'(x_0)y_0 - f'(x_0) = 0$.
12.4.3 Definition A hyperelliptic curve C of genus g defined over a field K is given by a hyper-
elliptic equation over K that is irreducible in K(x, y), non-singular, and satisfies one of
the following three conditions:
1. deg(f ) = 2g + 1 and deg(h) ≤ g;
2. deg(f ) ≤ 2g + 1 and h is monic of degree g + 1, or deg(f ) = 2g + 2 and
a. either K has characteristic different from 2, deg(h) ≤ g and the leading
coefficient of f is a square in K,
b. or K has characteristic 2, or h is monic of degree g + 1 and the leading
coefficient of f is of the form $s^2 + s$ for some $s \in K^*$;
3. deg(f ) = 2g + 2 and
a. either K has characteristic different from 2, deg(h) ≤ g, and the leading
coefficient of f is not a square in K,
b. or K has characteristic 2, or h is monic of degree g + 1 and the leading
coefficient of f is not of the form $s^2 + s$ for any $s \in K^*$.
A curve C is imaginary or ramified in case (1), real or split in case (2), and unusual or
inert in case (3).
12.4.4 Remark An elliptic curve can be thought of as an imaginary hyperelliptic curve of genus
g = 1.
12.4.5 Remark Every unusual hyperelliptic curve is a real curve over a quadratic extension of K.
12.4.6 Remark It is customary, although not necessary, to take h to be identically zero if K does
not have characteristic 2. This is always possible by completing the square, i.e., adding h2 /4
to both sides of (12.4.1) and replacing y by y − h/2.
12.4.7 Definition The infinite places of a hyperelliptic curve C/K as given in (12.4.1) are exactly
the poles of x (see Definition 12.1.18).
12.4.8 Definition Let C/K be a hyperelliptic curve. For any extension field L of K, the set of
finite points of C defined over L is the set of solutions $(x_0, y_0) \in L \times L$ to the hyperelliptic
equation (12.4.1) defining C. The set of points at infinity of C defined over L is the set
$S = \begin{cases} \{\infty\} & \text{if } C/L \text{ is imaginary}, \\ \{\infty, \overline{\infty}\} & \text{if } C/L \text{ is real}, \\ \emptyset & \text{if } C/L \text{ is unusual}. \end{cases}$
The finite points and points at infinity defined over L together form the set of points of
C defined over L or the set of L-rational points of C, denoted by C(L).
12.4.9 Definition The hyperelliptic involution of a hyperelliptic curve C/K is the map
$\iota : C(K) \to C(K)$ that sends a finite point $P = (x_0, y_0)$ of C to the point
$\overline{P} = \iota(P) = (x_0, -y_0 - h(x_0))$ of C. If C is imaginary, then $\iota(\infty) = \infty$. If C is real,
then $\iota(\infty) = \overline{\infty}$ and $\iota(\overline{\infty}) = \infty$.
12.4.10 Remark A point on a hyperelliptic curve is ramified (when viewed as a place) if it is fixed
by ι, and unramified otherwise.
12.4.11 Remark If C is imaginary, then the infinite place of K(x) is a ramified K-rational point on
C; if C is real, then it is an unramified K-rational point on C, and if C is unusual, then it
is not a point on C, but rather a place of degree 2.
12.4.12 Theorem (Generalization of [2373, Section 5] and [1587, Proposition 2.1]) Let C be a
hyperelliptic curve of genus g over a perfect field K defined by (12.4.1), and let x0 , y0 ∈ K.
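Substituting
$x = t^{-1} + x_0, \quad y = \frac{bz}{t^{g+1}} + a$
into (12.4.1), with
$a = \begin{cases} y_0 & \text{if } \operatorname{char}(K) = 2, \\ -h(x_0)/2 & \text{otherwise}, \end{cases}$
and
$b = \begin{cases} 1 & \text{if } h(x_0) + 2y_0 = 0, \\ h(x_0) + 2y_0 & \text{otherwise}, \end{cases}$
yields the equation $z^2 + H(t)z = F(t)$, where
$H(t) = b^{-1}t^{g+1}\bigl(h(t^{-1} + x_0) + 2a\bigr), \quad F(t) = b^{-2}t^{2g+2}\bigl(f(t^{-1} + x_0) - ah(t^{-1} + x_0) - a^2\bigr),$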
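The shape of H(t) and F(t) follows from a direct substitution. The short derivation below is a sketch under the assumption that (12.4.1) is the equation $y^2 + h(x)y = f(x)$, which is not reproduced in this excerpt but is consistent with the hyperelliptic involution of Definition 12.4.9; here h and f are shorthand for $h(t^{-1} + x_0)$ and $f(t^{-1} + x_0)$.

```latex
% Assumption: (12.4.1) reads y^2 + h(x) y = f(x).
\begin{align*}
  \Bigl(\frac{bz}{t^{g+1}} + a\Bigr)^{2} + h\,\Bigl(\frac{bz}{t^{g+1}} + a\Bigr) &= f
    && \text{(substitute } y = bz/t^{g+1} + a\text{)} \\
  \frac{b^{2}z^{2}}{t^{2g+2}} + \frac{(h + 2a)\,b\,z}{t^{g+1}} &= f - a h - a^{2}
    && \text{(expand and collect the terms free of } z\text{)} \\
  z^{2} + b^{-1}t^{g+1}(h + 2a)\,z &= b^{-2}t^{2g+2}\bigl(f - a h - a^{2}\bigr)
    && \text{(multiply by } t^{2g+2}/b^{2}\text{)} \\
  z^{2} + H(t)\,z &= F(t).
\end{align*}
```

The factors $t^{g+1}$ and $t^{2g+2}$ clear the denominators introduced by $x = t^{-1} + x_0$, since deg(h) ≤ g + 1 and deg(f) ≤ 2g + 2 in every case of Definition 12.4.3.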
12.4.13 Example A hyperelliptic curve $y^2 = f(x)$ of genus g defined over R may have as few as one
and as many as g + 1 connected components, as illustrated in Figure 12.4.1, depending on
the number of real roots of f.
[Figure 12.4.1: Examples of imaginary, real, and unusual hyperelliptic curves of genus 2 defined over R. Panels: (a) $y^2 = x^5 + x^4 - 5x^3 - 2x^2 + 3x + 1$ (imaginary); (b) $y^2 = x^6 + x^4 - 5x^3 - 2x^2 + 3x + 1$ (real).]
12.4.14 Remark Unlike the case of elliptic curves, the set of points of C defined over any exten-
sion field of K does not form an abelian group. Instead, one needs to use divisors, as in
Definition 12.1.21.
12.4.15 Definition Let C/K be a hyperelliptic curve and L an extension field of K. A divisor D of
C is defined over L if $D^\sigma = \sum_P n_P P^\sigma = D$ for all L-automorphisms σ of $\bar K$, where
$P^\sigma = (\sigma(x), \sigma(y))$ if P = (x, y), $\infty^\sigma = \infty$ and (in case C is real) $\overline{\infty}^\sigma = \overline{\infty}$.
12.4.16 Definition [661, Definition 14.3] Let C/K be a hyperelliptic curve. The set of degree zero
divisors of C defined over K, denoted by $\operatorname{Div}^0_K(C)$, is a subgroup of the divisor group
of C/K (see Definition 12.1.21). The set of principal divisors defined over K, denoted
by $\operatorname{Princ}_K(C)$, is a subgroup of $\operatorname{Div}^0_K(C)$ (see Definition 12.1.26).
12.4.17 Definition Let C/K be a hyperelliptic curve. The degree zero divisor class group defined
over K is given by $\operatorname{Pic}^0_K(C) = \operatorname{Div}^0_K(C)/\operatorname{Princ}_K(C)$. We denote $\operatorname{Pic}^0(C) = \operatorname{Pic}^0_{\bar K}(C)$.
12.4.18 Remark In the case of genus 1, i.e., elliptic curves, $\operatorname{Pic}^0_K(C)$ is isomorphic to the group
of points defined over K on the elliptic curve; see Proposition 12.2.44. This is not true for
hyperelliptic curves of genus g > 1.
12.4.19 Remark The group $\operatorname{Pic}^0_K(C)$ is isomorphic to the group of K-rational points on the Jacobian
variety $J_C(K)$ of C; see [661, Section 4.4.6.a]. As a result, the two terminologies and
notations are sometimes used interchangeably in the literature.
12.4.20 Definition Let $K = \mathbb{F}_q$. Then $\operatorname{Pic}^0_{\mathbb{F}_q}(C)$ is a finite abelian group. Its order, denoted by h,
is the degree zero divisor class number, or class number, of $C/\mathbb{F}_q$.
12.4.21 Definition For any divisor D of C, [D] denotes the divisor class of $\operatorname{Pic}^0_K(C)$ represented
by D.
12.4.22 Definition Let $C/\mathbb{F}_q$ be a real hyperelliptic curve. The order R of the subgroup of $\operatorname{Pic}^0_K(C)$
generated by $[\infty - \overline{\infty}]$ is the regulator of C/K.
12.4.23 Remark The divisor $R(\infty - \overline{\infty})$ is principal, and thus the divisor of a function. This
function is a fundamental unit of the maximal order of the corresponding function field; see
Remark 12.1.119.
12.4.24 Remark Let $C/\mathbb{F}_q$ be a hyperelliptic curve and $F/\mathbb{F}_q$ the corresponding algebraic function
field; see Remark 12.1.119. Let $\mathcal{C}$ denote the ideal class group of the maximal order of
$F/\mathbb{F}_q(x)$. Then the following relationships hold between $\operatorname{Pic}^0_K(C)$ and $\mathcal{C}$.
1. If C/K is imaginary, then $h = |\mathcal{C}|$, and the groups $\operatorname{Pic}^0_K(C)$ and $\mathcal{C}$ are isomorphic.
2. If C/K is real, then $h = R|\mathcal{C}|$.
3. If C/K is unusual, then $h = |\mathcal{C}|/2$.
12.4.25 Remark The ideal class number $|\mathcal{C}|$ of a real hyperelliptic curve is expected to be small
(frequently 1), so we expect that R ≈ h [1125]; see also [1129, 1130, 1131] for the genus 1
case.
12.4.26 Remark In the case of imaginary and real hyperelliptic curves, and to a lesser extent for
unusual curves, there exist algorithms for efficient arithmetic in $\operatorname{Pic}^0_K(C)$. We restrict our
attention to $K = \mathbb{F}_q$.
12.4.27 Definition A divisor $D = \sum_P n_P P$ of a hyperelliptic curve C/Fq is finitely effective if
nP ≥ 0 for all finite points P of C.
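12.4.28 Remark Every finitely effective degree zero divisor D of a hyperelliptic curve C/Fq can be
represented uniquely as
$$
D = \begin{cases}
D_S - \deg(D_S)\,\infty & \text{if } C/\mathbb{F}_q \text{ is imaginary,}\\
D_S - \deg(D_S)\,\overline{\infty} + n_\infty(\infty - \overline{\infty}) & \text{if } C/\mathbb{F}_q \text{ is real,}\\
D_S - (\deg(D_S)/2)\,\infty & \text{if } C/\mathbb{F}_q \text{ is unusual,}
\end{cases}
$$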
12.4.29 Definition A degree zero divisor D of a hyperelliptic curve C/Fq with a representation as
given in Remark 12.4.28 is semi-reduced if for all P ∈ supp(DS ) we have $\overline{P} \notin \mathrm{supp}(D_S)$,
unless $P = \overline{P}$, in which case nP = 1.
12.4.31 Remark If C/Fq is imaginary, then every divisor class of Pic0Fq (C) contains a unique reduced
divisor.
12.4.32 Remark If C/Fq is real, then divisor classes of Pic0Fq (C) do not generically contain unique
reduced divisors. However, each class does contain a unique reduced divisor D such that
n∞ lies in a specified range of length g + 1 − deg(DS ). Paulus and Rück [2373] proposed
the interval [0, g − deg(DS )]. Galbraith, Harrison, and Mireles Morales [1155] proposed a
balanced divisor representation, using the interval centered around ⌈deg(DS )/2⌉.
12.4.33 Remark If C/Fq is unusual, then every divisor class of Pic0Fq (C) contains at most one
reduced divisor. If the class contains a divisor of the form given in Remark 12.4.28, then it
contains either a unique reduced divisor or q + 1 divisors of the form as in Remark 12.4.28
with deg(DS ) = g +1. Arithmetic in Pic0Fq (C) when C/Fq is unusual is not as well developed
as for imaginary and real hyperelliptic curves C/Fq . For details, see [133].
12.4.34 Theorem [1774, Theorem 5.1] Let C/Fq be a hyperelliptic curve. If D is a semi-reduced
divisor defined over Fq as given in Remark 12.4.28, then DS can be represented uniquely
by a pair of polynomials u, v ∈ Fq [x] where
$$
u(x) = \prod_{P \in \mathrm{supp}(D_S)} (x - x_P)^{n_P}
$$
is monic and v is the unique polynomial such that deg(v) < deg(u), v(xP ) = −yP for all
P = (xP , yP ) ∈ supp(DS ), and $u \mid v^2 + hv - f$.
12.4.36 Remark In some sources such as [1774], the condition v(xP ) = −yP is replaced by
v(xP ) = yP . This also describes a unique representation of semi-reduced divisors in which
DS from Theorem 12.4.34 is replaced by $\overline{D}_S = \sum_P n_P \overline{P}$.
12.4.37 Remark Alternative representations to [u, v] can be obtained by taking [u, v′] where v′
is any polynomial with v′ ≡ v (mod u) that satisfies the same interpolation condition.
In the real case, it is sometimes computationally advantageous to use an alternative with
deg(v′) = g + 1 [988].
12.4.38 Remark If C/Fq is imaginary, then semi-reduced divisors are uniquely determined by
their Mumford representation. Thus, the Mumford representation gives an explicit, ef-
ficient representation of divisor classes of Pic0Fq (C) via reduced representatives D with
deg(v) < deg(u) ≤ g. The identity divisor class, PrincFq (C), is represented by [1, 0], and
the inverse of [u, v] is given by [u, −h − v]. Cantor’s algorithm [497] (see also [661, Al-
gorithm 14.7]) describes how to compute the reduced sum of two divisors in Mumford
representation in polynomial time using only arithmetic with polynomials in Fq [x].
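As an illustration of Remark 12.4.38, the following is a minimal sketch of Cantor's composition and reduction steps in the simplest setting: an imaginary curve $y^2 = f(x)$ with h = 0 over a prime field. The curve $y^2 = x^5 + 1$ over F7, the list-based polynomial encoding, and all function names are assumptions chosen for the example; they are not taken from [497] or [661]. With h = 0, passing between the conventions v(xP ) = −yP and v(xP ) = yP only replaces v by −v, so the same routine serves both.

```python
# Sketch: Cantor's algorithm on Mumford pairs [u, v] for y^2 = f(x) over F_P (h = 0).
# Polynomials are coefficient lists, lowest degree first; [] is the zero polynomial.
P = 7                          # assumed small odd prime
F = [1, 0, 0, 0, 0, 1]         # assumed curve: f(x) = x^5 + 1, genus 2

def trim(a):
    a = [c % P for c in a]
    while a and a[-1] == 0:
        a.pop()
    return a

def add(a, b):
    n = max(len(a), len(b))
    return trim([(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
                 for i in range(n)])

def neg(a):
    return trim([-c for c in a])

def mul(a, b):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)

def divmod_poly(a, b):
    """Quotient and remainder of a by a nonzero polynomial b."""
    a, q = trim(a), []
    inv = pow(b[-1], -1, P)
    while len(a) >= len(b):
        term = [0] * (len(a) - len(b)) + [a[-1] * inv % P]
        q = add(q, term)
        a = add(a, neg(mul(term, b)))
    return q, a

def xgcd(a, b):
    """Return (d, s, t) with d = s*a + t*b and d monic."""
    r0, r1, s0, s1, t0, t1 = trim(a), trim(b), [1], [], [], [1]
    while r1:
        q, r = divmod_poly(r0, r1)
        r0, r1 = r1, r
        s0, s1 = s1, add(s0, neg(mul(q, s1)))
        t0, t1 = t1, add(t0, neg(mul(q, t1)))
    inv = [pow(r0[-1], -1, P)]
    return mul(r0, inv), mul(s0, inv), mul(t0, inv)

def cantor_add(D1, D2, g=2):
    """Reduced sum of two divisor classes given by Mumford pairs (u, v)."""
    (u1, v1), (u2, v2) = D1, D2
    d1, e1, e2 = xgcd(u1, u2)
    d, c1, c2 = xgcd(d1, add(v1, v2))
    s1, s2, s3 = mul(c1, e1), mul(c1, e2), c2
    # Composition: u = u1*u2/d^2, v = (s1*u1*v2 + s2*u2*v1 + s3*(v1*v2 + f))/d mod u.
    u = divmod_poly(mul(u1, u2), mul(d, d))[0]
    num = add(add(mul(mul(s1, u1), v2), mul(mul(s2, u2), v1)),
              mul(s3, add(mul(v1, v2), F)))
    v = divmod_poly(divmod_poly(num, d)[0], u)[1]
    # Reduction: replace u by (f - v^2)/u and v by -v mod u until deg(u) <= g.
    while len(u) - 1 > g:
        u = divmod_poly(add(F, neg(mul(v, v))), u)[0]
        v = divmod_poly(neg(v), u)[1]
    return mul(u, [pow(u[-1], -1, P)]), v

D1 = ([0, 1], [1])             # u = x,     v = 1
D2 = ([6, 1], [3])             # u = x - 1, v = 3
print(cantor_add(D1, D2))      # ([0, 6, 1], [1, 2]), i.e. u = x^2 - x, v = 2x + 1
print(cantor_add(D1, ([0, 1], [6])))   # ([1], []), the identity class [1, 0]
```

The quadratic-time list arithmetic is meant only to make the steps visible; a practical implementation would rely on an optimized polynomial library or on the explicit formulas mentioned in Remark 12.4.40.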
12.4.39 Remark If C/Fq is real, then semi-reduced divisors are uniquely determined by their Mum-
ford representation and n∞ -value. Thus, the Mumford representation, together with the
integer n∞ , can be used to represent divisor classes in Pic0Fq (C). The identity class is rep-
resented by u = 1, v = 0, and n∞ = 0. The inverse of the divisor class [u, v, n∞ ] is given
by [u, −h − v, −n∞ − deg(u)], after which additional adjustment steps are performed to
obtain a value n∞ in the required range, as described in [1155, 2373]. A modification of
Cantor’s algorithm (see, for example, [1586]) can be used to compute the reduced sum of
two reduced divisors, after which additional operations are performed to obtain a value n∞
in the required range.
12.4.40 Remark A generalization of Shanks’ NUCOMP algorithm is more efficient than Cantor’s
algorithm for moderate and large genus [1588], and can be adapted for use in both the
imaginary and real models. For g ≤ 4, optimized explicit formulas exist in the imaginary
case that describe the addition and reduction algorithm in terms of operations in Fq ; see
[661, Chapter 14] for a survey. Explicit formulas for genus 2 also exist in the real case
[987, 988].
12.4.41 Remark The multiplication-by-m map on elliptic curves (see Definition 12.2.24) generalizes
naturally to Pic0K (C). The double-and-add method, as well as more advanced methods for
scalar multiplication, can also be applied to compute this map efficiently; see [661, Chapter 9]
for a survey.
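The double-and-add method of Remark 12.4.41 can be sketched for any abelian group given by its operations. The function name and the stand-in group (integers modulo 101 under addition) are assumptions made for the example; for Pic0K (C) one would pass a divisor class addition such as Cantor's algorithm and the inversion [u, v] ↦ [u, −h − v] instead.

```python
def scalar_mul(m, D, add, identity, neg):
    """Compute [m]D with O(log |m|) group operations (binary double-and-add)."""
    if m < 0:
        return scalar_mul(-m, neg(D), add, identity, neg)
    result, base = identity, D
    while m:
        if m & 1:                  # current binary digit of m is 1: add
            result = add(result, base)
        base = add(base, base)     # double
        m >>= 1
    return result

# Stand-in group for illustration only: Z/101Z under addition.
n = 101
print(scalar_mul(45, 7, lambda a, b: (a + b) % n, 0, lambda a: -a % n))  # 12
```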
12.4.42 Definition Let C/Fq be a real hyperelliptic curve. The infrastructure of C/Fq , denoted
by R, is the finite set of all reduced principal divisors D with 0 ≥ n∞ > −R.
12.4.43 Remark The infrastructure is often described in terms of reduced principal ideals of the
maximal order of the function field associated to C/Fq ; see, for example [2707].
12.4.44 Definition Let C/Fq be a real hyperelliptic curve, and D ∈ R. The distance of D is
δ(D) = −n∞ .
12.4.45 Remark Divisors in the infrastructure can be represented using a combination of the Mum-
ford representation and distance.
12.4.46 Remark Distance imposes a natural ordering on the set R. The baby step operation moves
cyclically from one divisor to the next in this ordering [2706]. The distance obtained by
traversing one entire cycle is exactly the regulator R.
12.4.47 Remark A modification of Cantor’s algorithm applied to two divisors in the infrastructure,
where the reduction process terminates as soon as a reduced divisor is obtained, produces
another infrastructure divisor [2706]. NUCOMP [1588] and explicit formulas for genus 2
[987, 988] can also be used for this purpose.
12.4.48 Definition The operation of computing the reduced sum of two divisors in R, as described
in the previous remark, is a giant step.
12.4.49 Remark The divisor [1, 0] with distance 0 acts as the identity with respect to the giant
step operation. The inverse of D = [u, v] ≠ [1, 0] with respect to the giant step operation is
[u, −h − v] and has distance R + deg(u) − δ(D).
12.4.50 Remark Giant steps move through R in larger steps than baby steps, because the distance
of a giant step applied to inputs D and D′ is δ(D) + δ(D′) − d, where 0 ≤ d ≤ 2g; see,
for example [1587]. Distances are not exactly additive due to the adjustments required to
12.4.52 Definition Let C/K be a hyperelliptic curve. An endomorphism of Pic0 (C) is a group
homomorphism of Pic0 (C). An endomorphism of Pic0 (C) is defined over K if it is
a group homomorphism of Pic0K (C). The set of endomorphisms of Pic0 (C) is de-
noted by End(Pic0 (C)), and the set of endomorphisms defined over K is denoted by
EndK (Pic0 (C)).
12.4.53 Remark If C has genus 1, then Definition 12.4.52 agrees with the definition of an endomor-
phism of an elliptic curve as given in Definition 12.2.27.
12.4.54 Remark As in the elliptic curve case (see Definition 12.2.39), End(Pic0 (C)) and
EndK (Pic0 (C)) are rings.
12.4.55 Example The multiplication-by-m map [m] : Pic0 (C) −→ Pic0 (C) is an endomorphism of
Pic0 (C) that is defined over K. Thus, End(Pic0 (C)) and EndK (Pic0 (C)) always contain Z.
12.4.56 Definition [661, Definition 14.13] Let C/K be a hyperelliptic curve. If End(Pic0 (C))
contains an order of a number field of degree 2g over Q, then End(Pic0 (C)) has complex
multiplication.
12.4.57 Example Let C/Fq be a hyperelliptic curve. As in the elliptic curve case (see Exam-
ple 12.2.31), the Frobenius automorphism of $\overline{\mathbb{F}}_q$ that sends an element a to $a^q$ extends
to an endomorphism of Pic0 (C) that is defined over K and is different from [m] for all
m ∈ Z.
12.4.58 Definition Let C/Fq be a hyperelliptic curve. The group Pic0 (C) (or, more properly, the
Jacobian JC ) is supersingular if it is isogenous to the product of supersingular elliptic
curves.
12.4.59 Remark If C/Fq is a hyperelliptic curve that is not supersingular, then C may have com-
plex multiplication. The Frobenius endomorphism satisfies a monic polynomial equation of
degree 2g with integer coefficients (its characteristic polynomial, see Remark 12.4.60). If
that polynomial is irreducible, then the Frobenius corresponds to an algebraic integer of
degree 2g and C/Fq has complex multiplication.
12.4.60 Remark The zeta function of a hyperelliptic curve C/Fq of genus g (Definition 12.5.12) is
of the form
$$
Z(C/\mathbb{F}_q, t) = \frac{L(t)}{(1 - t)(1 - qt)},
$$
where L(t) is the L-polynomial of C/Fq (Theorem 12.5.13). The reciprocal polynomial
$P(t) = t^{2g} L(1/t)$ is the characteristic polynomial of the Frobenius endomorphism (Re-
mark 12.5.15), a polynomial of degree 2g with integer coefficients whose roots have absolute
value $\sqrt{q}$ (Theorem 12.5.17).
12.4.61 Theorem (Special case of Theorem 12.5.13) Let C/Fq be a hyperelliptic curve of genus g
and class number h. Then h = L(1).
12.4.62 Remark Theorem 12.4.61 and Remark 12.4.60 immediately imply the bounds
$(\sqrt{q} - 1)^{2g} \le h \le (\sqrt{q} + 1)^{2g}$. The interval $[(\sqrt{q} - 1)^{2g}, (\sqrt{q} + 1)^{2g}]$ is called the
Hasse–Weil interval for hyperelliptic curves of genus g over Fq .
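For a sense of scale, the integers lying in the Hasse–Weil interval can be computed directly from q and g. The helper below is an illustrative sketch (the function name is an assumption), and it uses floating-point square roots, so it is only reliable for small parameters.

```python
import math

def hasse_weil_interval(q, g):
    """Smallest and largest integers in [(sqrt(q) - 1)^(2g), (sqrt(q) + 1)^(2g)]."""
    lo = (math.sqrt(q) - 1) ** (2 * g)
    hi = (math.sqrt(q) + 1) ** (2 * g)
    return math.ceil(lo), math.floor(hi)

print(hasse_weil_interval(9, 1))   # (4, 16)
print(hasse_weil_interval(7, 2))   # (8, 176)
```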
12.4.63 Remark There is an extensive body of literature on algorithms for computing class numbers
of hyperelliptic curves over finite fields; see [815, 817, 1249, 1555, 1556, 1718, 1719, 1865,
1866, 1905, 2089, 2867]. For genus 2 curves, see also [2399]. An overview of the main methods
can be found in [661, Section 17.3].
12.4.64 Definition The kernel of the multiplication-by-m map on Pic0 (C) is the subgroup
12.4.65 Definition Let C/Fq be a hyperelliptic curve. Let m ≥ 1 be a prime integer with embedding
degree k in Fq (see Definition 12.2.112). The Tate-Lichtenbaum pairing is defined as
and
T ([D1 ], [D2 ] + [D3 ]) = T ([D1 ], [D2 ])T ([D1 ], [D3 ]),
non-degenerate (if T ([D1 ], [D2 ]) = 1 for all [D2 ] ∈ Pic0K (C)[m] then [D1 ] = PrincK (C)),
and the result is independent of the divisor class representatives used.
12.4.67 Remark The Tate-Lichtenbaum pairing can be computed using an analogue of Miller’s
algorithm for elliptic curves [182, Section 5].
12.4.68 Remark There exist hyperelliptic curves with small embedding degree, i.e., curves for which
computing the Tate-Lichtenbaum pairing is efficient. For example, Galbraith [1154] proved that
supersingular hyperelliptic curves of genus g have embedding degree bounded by an integer
k(g). For g ≤ 6, Rubin and Silverberg [2493] show that k(g) ≤ 7.5g. Various constructive
methods for non-supersingular hyperelliptic curves also exist; see [182] for a recent survey.
12.4.69 Definition Let C/Fq be a hyperelliptic curve. Let m ≥ 1 be a prime integer with embedding
degree k in Fq . Suppose that Pic0Fq (C) contains no elements of order m2 . The modified
Tate-Lichtenbaum pairing is defined as
12.4.70 Remark The main advantage of the modified Tate-Lichtenbaum pairing over the Tate-
Lichtenbaum pairing is that it takes specific values in µm as opposed to equivalence classes.
12.4.71 Remark There are other types of pairings and algorithms to compute them, many designed
to have computational advantages over the Tate-Lichtenbaum pairing. For a recent survey,
see [182].
12.4.7 The hyperelliptic curve discrete logarithm problem
12.4.72 Remark Similar to the elliptic curve discrete logarithm problem (see Section 12.2.11), the
hyperelliptic curve discrete logarithm problem (HCDLP) is the basis of many hyperelliptic
curve cryptosystems. In this section we discuss the HCDLP and some related problems. For
applications to cryptography, see Section 16.5.
12.4.73 Definition Let C/Fq be a hyperelliptic curve and [D1 ], [D2 ] ∈ Pic0Fq (C). The hyperelliptic
curve discrete logarithm problem (HCDLP ) is the problem of computing n ∈ Z such
that [D1 ] = n[D2 ], if it exists.
12.4.74 Remark If the order l of [D2 ] is prime, then the fastest known general algorithm for solving
the HCDLP (as of 2011) has running time on the order of $\sqrt{l}$. As with the ECDLP, there
are a number of cases where the problem can be solved more easily; see [661, Part V] and
[1586] for recent surveys.
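One standard generic method with a square-root running time is Shanks' baby-step giant-step search, sketched below for an arbitrary group of order l whose elements can be hashed. This is offered only as an example of a generic algorithm, not as the specific method the surveys single out, and it is unrelated to the infrastructure baby steps and giant steps of Remark 12.4.46. The stand-in group (the order-53 subgroup of the multiplicative group of F107) and all names are assumptions.

```python
import math

def bsgs(D1, D2, l, add, neg, identity):
    """Find n in [0, l) with D1 = n*D2 using about 2*sqrt(l) group operations."""
    m = math.isqrt(l - 1) + 1            # m*m >= l
    table, cur = {}, identity
    for j in range(m):                   # store j*D2 for 0 <= j < m
        table.setdefault(cur, j)
        cur = add(cur, D2)
    step = neg(cur)                      # -(m*D2)
    gamma = D1
    for i in range(m):                   # look for D1 - i*m*D2 in the table
        if gamma in table:
            return i * m + table[gamma]
        gamma = add(gamma, step)
    return None                          # D1 is not a multiple of D2

# Stand-in group for illustration: the subgroup of order 53 generated by 4 in F_107^*.
p, base, l = 107, 4, 53
target = pow(base, 20, p)
print(bsgs(target, base, l, lambda a, b: a * b % p, lambda a: pow(a, -1, p), 1))  # 20
```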
12.4.75 Remark If the genus g is sufficiently large compared to the finite field order q, then the
HCDLP can be solved in expected time subexponential in $q^g$ using the index-calculus
method [17, 977, 980]. The current state-of-the-art is Enge and Gaudry’s result [980] that if
g/ logg q > ϑ, the expected bit complexity is $L_{q^g}[1/2, \sqrt{2}((1 + 1/(2\vartheta))^{1/2} + (1/(2\vartheta))^{1/2})]$ where
12.4.76 Remark Index-calculus can also be used to solve the HCDLP faster than the generic
methods for smaller genera [1245, 1256, 2802]. The current state-of-the-art is Gaudry,
Thomé, Thériault, and Diem’s result [1256] that the HCDLP can be solved in expected
time $O(g^5 q^{2 - \frac{2}{g} + \epsilon})$ if q > g!. This is asymptotically faster than the generic algorithms for
g ≥ 3.
12.4.77 Remark Frey, Müller, and Rück [1106, 1413] showed how the modified Tate-Lichtenbaum
pairing can be used to reduce the HCDLP to the DLP in the group of m-th roots of unity
$\mu_m \subset \mathbb{F}_{q^k}$, where k is the embedding degree of m in the field Fq (see Definition 12.2.112).
If k is sufficiently small, for example if C/Fq is supersingular, this is more efficient than the
generic algorithms.
12.4.78 Remark If $m = p^n$ where p is the characteristic of Fq , then an algorithm of Rück [2498]
can be used to solve the HCDLP in time $O(n^2 \log p)$.
12.4.79 Remark If $\mathbb{F}_q = \mathbb{F}_{p^n}$ where n is composite, the Weil descent methodology [661, Section 22.3]
can in certain cases be used to reduce the HCDLP to another instance of the HCDLP on
a curve of higher genus over a smaller finite field, where faster non-generic algorithms may
apply.
12.4.80 Remark The Diffie-Hellman and decisional Diffie-Hellman problems also generalize to
Pic0Fq (C), and are also sometimes used as the underlying security assumption of certain
cryptographic protocols (see Section 16.5). Both reduce to the HCDLP, but equivalence is
not known.
12.4.81 Definition Let C/Fq be a real hyperelliptic curve and let D ∈ R. The infrastructure
discrete logarithm problem (IDLP ) is the problem of computing δ(D).
12.4.82 Remark This problem has also been used as the underlying security assumption of crypto-
graphic protocols [1587]. It is computationally easy to compute a divisor in the infrastructure
close to a given distance [1587], but solving the IDLP is believed to be difficult. In fact,
the IDLP can be reduced to the DLP in the subgroup of Pic0Fq (C) generated by $[\infty - \overline{\infty}]$
[1089, 2142].
See Also
§12.1 For analogous material for general function fields and curves.
§12.2 For analogous material for the genus 1 case (i.e., elliptic curves).
§16.5 For applications of hyperelliptic curves to cryptography.
References Cited: [17, 133, 182, 497, 661, 815, 817, 977, 980, 987, 988, 1089, 1105, 1106,
1125, 1129, 1130, 1131, 1154, 1155, 1245, 1249, 1256, 1413, 1555, 1556, 1586, 1587, 1588,
1718, 1719, 1774, 1865, 1866, 1905, 2089, 2142, 2373, 2399, 2493, 2498, 2706, 2707, 2802,
2867]
12.5.1 Remark In this section we use the language of function fields rather than algebraic curves,
see Section 12.1. A simple way for switching from function fields to algebraic curves is as
follows.
A function field F/Fq of genus g corresponds to a curve X of genus g over Fq , that is
an absolutely irreducible, non-singular, projective curve which is defined over Fq . If F =
Fq (x, y) and x, y satisfy the equation ϕ(x, y) = 0 for an irreducible polynomial ϕ(X, Y ) ∈
Fq [X, Y ], then X is a non-singular, projective model of the plane curve which is defined by
ϕ(X, Y ) = 0. By abuse of notation, we say briefly that the curve X is given by ϕ(x, y) = 0.
Rational places of the function field correspond to Fq -rational points of X .
12.5.2 Remark Let F be a function field over Fq . Then F has only finitely many rational places.
12.5.4 Example For the rational function field F = Fq (x) we have N (F ) = q + 1. The rational
places are the zeros of x − a with a ∈ Fq , and the pole P∞ of x.
12.5.5 Lemma [2714, Lemma 5.1] Let F′/F be a finite extension of function fields having the same
constant field Fq . Then the following hold.
1. Let P be a place of F and P′ a place of F′ lying above P . If P′ is rational, then
P is rational.
2. N (F′) ≤ [F′ : F ] · N (F ).
12.5.6 Remark The following special case of Kummer’s Theorem [2714, Theorem 3.3.7] is often
useful to determine rational places of a function field.
12.5.7 Lemma Let P be a rational place of F and let OP be its valuation ring. Consider a finite
extension E = F (y) of F such that Fq is also the full constant field of E. Assume that the
minimal polynomial ϕ(T ) of y over F has all its coefficients in OP (that is, y is integral
over OP ). Suppose that the reduction ϕ̄(T ) of ϕ(T ) modulo P (which is a polynomial over
the residue class field OP /P = Fq ) splits over Fq as follows:
ϕ̄(T ) = (T − a1 ) · · · (T − as ) · p1 (T ) · · · pr (T )
12.5.8 Example Assume that $q = 2^m$ with m ≥ 2, and consider the function field F = Fq (x, y)
with
$$
y^2 + y = x^{q-1}.
$$
The pole P∞ of x is totally ramified in the extension F/Fq (x); this gives one rational place
of F . Next we consider the place P = (x = a) of Fq (x) which is the zero of x − a with
a ∈ Fq . The reduction of the minimal polynomial $\varphi(T) = T^2 + T + x^{q-1}$ modulo P is then
$$
\bar{\varphi}(T) = \begin{cases} T^2 + T + 1 & \text{if } a \ne 0,\\ T^2 + T & \text{if } a = 0. \end{cases}
$$
1. For every n ≥ 0, there are only finitely many divisors A ≥ 0 with deg A = n.
2. The class group Cl0 (F ) is a finite group.
12.5.12 Definition The Zeta function of F is defined by the power series in C[[t]] below (here C is
the complex number field):
$$
Z(t) := \sum_{n=0}^{\infty} A_n t^n,
$$
1. The power series Z(t) converges for all t ∈ C with $|t| < q^{-1}$.
2. Z(t) can be written as
$$
Z(t) = \frac{L(t)}{(1 - t)(1 - qt)}
$$
with a polynomial $L(t) = a_0 + a_1 t + \cdots + a_{2g} t^{2g} \in \mathbb{Z}[t]$ of degree 2g. This
polynomial is the L-polynomial of F .
3. (Functional equation of the L-polynomial) The coefficients of the L-polynomial
of F satisfy
a. $a_0 = 1$ and $a_{2g} = q^g$,
b. $a_{2g-i} = q^{g-i} a_i$ for 0 ≤ i ≤ g.
4. $N(F) = a_1 + q + 1$.
5. $L(1) = h_F$ is the class number of F .
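Items 3 to 5 show that the coefficients a1, ..., ag determine the whole L-polynomial and hence both N(F) and the class number. The short sketch below makes this explicit; the function names are assumptions. The sample input q = 4, g = 1, a1 = 4 corresponds to the Hermitian function field of Example 12.5.39 with ℓ = 2, which has 9 rational places.

```python
def l_polynomial(q, g, a_low):
    """Coefficients [a_0, ..., a_2g] from a_low = [a_1, ..., a_g],
    using a_0 = 1 and the functional equation a_{2g-i} = q^(g-i) * a_i."""
    a = [1] + list(a_low)
    if len(a) != g + 1:
        raise ValueError("expected exactly g coefficients a_1, ..., a_g")
    return a + [q ** (g - i) * a[i] for i in range(g - 1, -1, -1)]

def class_number(coeffs):
    return sum(coeffs)              # h_F = L(1)

def rational_places(q, coeffs):
    return coeffs[1] + q + 1        # N(F) = a_1 + q + 1

L = l_polynomial(4, 1, [4])
print(L, class_number(L), rational_places(4, L))   # [1, 4, 4] 9 9
```

In genus 1 the two outputs agree, as expected from Remark 12.4.18.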
12.5.16 Remark The following theorem is fundamental for the theory of function fields over finite
fields. It was first proved by Hasse for g = 1; the generalization to all g ≥ 1 is due to Weil.
12.5.17 Theorem (Hasse–Weil theorem) [2714, Theorem 5.2.1] The reciprocals of the roots of the
L-polynomial satisfy
$$
|\omega_j| = q^{1/2} \quad \text{for } 1 \le j \le 2g.
$$
12.5.18 Remark The Hasse–Weil theorem is often referred to as the Riemann Hypothesis for func-
tion fields over finite fields.
12.5.19 Remark The next result is an easy consequence of the Hasse–Weil theorem 12.5.17.
12.5.20 Theorem (Hasse–Weil bound) [2714, Theorem 5.2.3] The number N = N (F ) of rational
places of a function field F/Fq of genus g satisfies the inequality
12.5.24 Remark Clearly $N_q(g) \le q + 1 + g \lfloor 2q^{1/2} \rfloor$. Further improvements of this bound can be
obtained.
12.5.25 Proposition (Serre’s explicit formulas) [2592], [2714, Proposition 5.3.4] Suppose that
u1 , . . . , um are non-negative real numbers, not all of them equal to zero, satisfying
$1 + \sum_{n=1}^{m} u_n \cos n\theta \ge 0$ for all θ ∈ R. Then
$$
N_q(g) \le 1 + \frac{2g + \sum_{n=1}^{m} u_n q^{n/2}}{\sum_{n=1}^{m} u_n q^{-n/2}}.
$$
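The bound of Proposition 12.5.25 is easy to evaluate numerically for a given choice of u1, ..., um. The sketch below uses assumed function names, and its positivity test only samples θ on a grid, so it is a sanity check rather than a proof. The choice m = 1, u1 = 1 is admissible because 1 + cos θ ≥ 0, and it gives back $q + 1 + 2gq^{1/2}$.

```python
import math

def serre_bound(q, g, u):
    """Right-hand side of the explicit formula for u = [u_1, ..., u_m]."""
    num = 2 * g + sum(un * q ** (n / 2) for n, un in enumerate(u, start=1))
    den = sum(un * q ** (-n / 2) for n, un in enumerate(u, start=1))
    return 1 + num / den

def looks_admissible(u, samples=10000, tol=1e-12):
    """Grid check of 1 + sum u_n cos(n*theta) >= 0 (numerical, not rigorous)."""
    for k in range(samples):
        t = 2 * math.pi * k / samples
        if 1 + sum(un * math.cos(n * t) for n, un in enumerate(u, start=1)) < -tol:
            return False
    return True

print(looks_admissible([1.0]), serre_bound(4, 3, [1.0]))   # True 17.0
print(looks_admissible([3.0]))                             # False
```

Better bounds come from admissible choices with m > 1; for instance u = (4/3, 1/3) is admissible, since $1 + \frac{4}{3}\cos\theta + \frac{1}{3}\cos 2\theta = \frac{2}{3}(1 + \cos\theta)^2$.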
12.5.26 Remark The results of the examples and tables below are proved in the following way.
First one derives upper bounds for Nq (g) using Serre’s explicit formulas. In some cases,
these upper bounds can be improved slightly by rather subtle arguments [1549]. Lower
bounds for Nq (g) are usually obtained by providing explicit examples of function fields
having that number of rational places. Many methods of construction have been proposed,
see [1550, 2280, 2846] for some of them.
12.5.27 Example (The case g = 1) [2591] Let $q = p^e$ with a prime number p.
1. If e is odd, e ≥ 3 and p divides $\lfloor 2q^{1/2} \rfloor$, then $N_q(1) = q + \lfloor 2q^{1/2} \rfloor$.
2. $N_q(1) = q + 1 + \lfloor 2q^{1/2} \rfloor$, otherwise.
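The two cases of Example 12.5.27 translate directly into exact integer arithmetic, since $\lfloor 2q^{1/2} \rfloor$ is the integer square root of 4q. The function name below is an assumption made for the sketch.

```python
import math

def n_q_genus1(p, e):
    """N_q(1) for q = p^e, following the two cases of Example 12.5.27."""
    q = p ** e
    m = math.isqrt(4 * q)                      # floor(2 * sqrt(q)), computed exactly
    if e % 2 == 1 and e >= 3 and m % p == 0:
        return q + m
    return q + 1 + m

print([n_q_genus1(p, 1) for p in (2, 3, 5, 7, 11, 13)])   # [5, 7, 10, 13, 18, 21]
print(n_q_genus1(2, 7))                                   # 150, the exceptional case
```

The first line reproduces the beginning of the row Nq (1) in Example 12.5.32.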
12.5.28 Example (The case g = 2) For all prime powers q,
$$
q - 2 + 2\lfloor 2q^{1/2} \rfloor \le N_q(2) \le q + 1 + 2\lfloor 2q^{1/2} \rfloor.
$$
g       0      1      2      3      4      5      6      7      8      9      10     20
N2 (g)  3      5      6      7      8      9      10     10     11     12     13     19-21
N4 (g)  5      9      10     14     15     17     20     21     21-24  26     27     40-45
N8 (g)  9      14     18     24     25     29     33-34  34-38  35-42  45     42-49  76-83
12.5.32 Example (Values of Nq (g) for 1 ≤ g ≤ 4 and prime numbers q ≤ 43) (at the time of
printing)
q       2      3      5      7      11     13     17     19     23     29     31     37     41     43
Nq (1)  5      7      10     13     18     21     26     28     33     40     43     50     54     57
Nq (2)  6      8      12     16     24     26     32     36     42     50     52     60     66     68
Nq (3)  7      10     16     20     28     32     40     44     48     60     62     72     78     80
Nq (4)  8      12     18     24     33     38     46     48-50  57     67-70  72     82     88-90  92
12.5.33 Remark If the genus g(F ) is large with respect to q, the Hasse–Weil bound can be improved
considerably.
12.5.34 Proposition (Ihara’s bound) [1569], [2714, Proposition 5.3.3] Suppose that
$N_q(g) = q + 1 + 2gq^{1/2}$. Then $g \le q^{1/2}(q^{1/2} - 1)/2$.
12.5.35 Example Let q be a square. Then there exists a function field of genus $g = q^{1/2}(q^{1/2} - 1)/2$
having $q + 1 + 2gq^{1/2}$ rational places. For more about function fields which attain the
Hasse–Weil upper bound, see Subsection 12.5.4.
12.5.36 Example
12.5.37 Definition A function field F/Fq is maximal if g(F ) > 0 and N (F ) attains the Hasse-Weil
upper bound $N(F) = q + 1 + 2gq^{1/2}$.
12.5.38 Remark It is clear that q must be the square of a prime power, if there exists a maximal
function field F/Fq . Therefore we assume in this subsection that $q = \ell^2$ is a square. By
Ihara’s bound 12.5.34, the genus of a maximal function field F over $\mathbb{F}_{\ell^2}$ satisfies
$1 \le g(F) \le \ell(\ell - 1)/2$.
12.5.39 Example [2714, Lemma 6.4.4] Let $H := \mathbb{F}_{\ell^2}(x, y)$ where x, y satisfy the equation
$y^\ell + y = x^{\ell+1}$. Then H is a maximal function field over $\mathbb{F}_{\ell^2}$ with $g(H) = \ell(\ell - 1)/2$ and
$N(H) = \ell^3 + 1 = \ell^2 + 1 + 2g(H)\ell$. The field H is called the Hermitian function field over $\mathbb{F}_{\ell^2}$.
12.5.40 Remark The rational places of the Hermitian function field H are the following: there is
a unique common pole of x and y, and for any $\alpha, \beta \in \mathbb{F}_{\ell^2}$ with $\alpha^\ell + \alpha = \beta^{\ell+1}$ there is a
unique common zero of y − α and x − β. In this way one obtains all $1 + \ell^3$ rational places
of H.
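The count in Remark 12.5.40 can be checked by brute force for small ℓ. The sketch below restricts to a prime ℓ = p with p ≡ 3 (mod 4), so that the field with $p^2$ elements can be modelled as Fp (i) with $i^2 = -1$; this restriction, the pair encoding of field elements, and the function name are assumptions made to keep the example short.

```python
def hermitian_affine_count(p):
    """Number of pairs (beta, alpha) in F_{p^2} x F_{p^2} with
    alpha^p + alpha = beta^(p+1), for a prime p = 3 mod 4."""
    if p % 4 != 3:
        raise ValueError("this sketch models F_{p^2} as F_p(i), which needs p = 3 mod 4")

    def mul(a, b):                      # (a0 + a1*i)(b0 + b1*i) with i^2 = -1
        return ((a[0] * b[0] - a[1] * b[1]) % p, (a[0] * b[1] + a[1] * b[0]) % p)

    def power(a, n):                    # square-and-multiply in F_{p^2}
        r = (1, 0)
        while n:
            if n & 1:
                r = mul(r, a)
            a, n = mul(a, a), n >> 1
        return r

    field = [(a, b) for a in range(p) for b in range(p)]
    lhs = {}                            # value of alpha^p + alpha -> number of alpha
    for alpha in field:
        ap = power(alpha, p)
        t = ((ap[0] + alpha[0]) % p, (ap[1] + alpha[1]) % p)
        lhs[t] = lhs.get(t, 0) + 1
    return sum(lhs.get(power(beta, p + 1), 0) for beta in field)

print(hermitian_affine_count(3) + 1)    # 28 = 3^3 + 1, including the common pole
```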
12.5.41 Remark There are generators u, v of the Hermitian function field H which satisfy the
equation $u^{\ell+1} + v^{\ell+1} = 1$. Hence the Hermitian function field is a special case of a Fermat
function field, which is defined by an equation $u^n + v^n = 1$ with gcd(n, q) = 1.
12.5.42 Proposition
1. Suppose that $F/\mathbb{F}_{\ell^2}$ is a maximal function field of genus $g(F) = \ell(\ell - 1)/2$. Then
F is isomorphic to the Hermitian function field H [2499].
2. There is no maximal function field $E/\mathbb{F}_{\ell^2}$ whose genus satisfies
$\frac{1}{4}(\ell - 1)^2 < g(E) < \frac{1}{2}\ell(\ell - 1)$ for ℓ odd (and $\frac{1}{4}\ell(\ell - 2) < g(E) < \frac{1}{2}\ell(\ell - 1)$ for ℓ even) [1145].
3. Up to isomorphism there is a unique maximal function field $E/\mathbb{F}_{\ell^2}$ of genus
$g(E) = \frac{1}{4}(\ell - 1)^2$ for ℓ odd (and $g(E) = \frac{1}{4}\ell(\ell - 2)$ for ℓ even) [3, 1144].
12.5.43 Proposition [1826]. Let F be a maximal function field over Fq . Then every function field
E of positive genus with Fq ⊂ E ⊆ F is also maximal over Fq .
12.5.44 Remark The Hermitian function field $H/\mathbb{F}_{\ell^2}$ has a large automorphism group G. Every
subgroup U ⊆ G whose fixed field is not rational then provides an example of a maximal
function field $H^U$ over $\mathbb{F}_{\ell^2}$. Most known examples of maximal function fields over $\mathbb{F}_{\ell^2}$ have
been constructed in this way [474, 1205, 1280], and [1511, Chapter 10].
12.5.45 Example [1279] Over the field Fq with $q = r^6$, consider the function field F = Fq (x, y, z)
which is defined by the equations
$$
x^r + x = y^{r+1} \quad\text{and}\quad y \cdot \frac{x^{r^2} - x}{x^r + x} = z^{\frac{r^3+1}{r+1}}.
$$
Here F is the Giulietti–Korchmáros function field; it is maximal over Fq of genus
$g(F) = (r - 1)(r^4 + r^3 - r^2)/2$. It is (at the time of printing) the only known example of a maximal
function field over Fq which is not a subfield of the Hermitian function field H/Fq .
12.5.46 Remark [2722] An important ingredient in many proofs of results on maximal function
fields (for example, Parts 2 and 3 of Proposition 12.5.42) is the Stöhr–Voloch theory which
sometimes gives an improvement of the Hasse–Weil upper bound. The method of Stöhr–
Voloch involves the construction of an auxiliary function which has zeros of high order at
the Fq -rational points of the corresponding non-singular curve. We illustrate this method
in the case of plane curves. Let f (X, Y ) ∈ Fq [X, Y ] be an absolutely irreducible polynomial
that defines a non-singular projective plane curve. Recall that an affine point (a, b) with
f (a, b) = 0 is non-singular if at least one of the partial derivatives fX (X, Y ) or fY (X, Y )
does not vanish at the point (a, b). The auxiliary function h(X, Y ) in this case is obtained
from the equation of the tangent line as $h(X, Y) = (X - X^q) f_X(X, Y) + (Y - Y^q) f_Y(X, Y)$.
Suppose now that f (X, Y ) does not divide h(X, Y ). Then
N (F ) ≤ d(d + q − 1)/2 ,
where F = Fq (x, y) with f (x, y) = 0 is the corresponding function field, and d denotes the
degree of the polynomial f (X, Y ). As an example consider the case d = 4. The genus of F is
g(F ) = (d − 1)(d − 2)/2 = 3. The bound above gives N (F ) ≤ 2q + 6 which is better than the
Hasse–Weil upper bound for all q ≤ 23. We note that Nq (3) = 2q + 6 for q = 5, 7, 11, 13, 17
and 19; see Example 12.5.32.
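The comparison made in Remark 12.5.46 for d = 4 can be reproduced with a few lines. The sketch assumes the hypothesis of the remark (f does not divide h) and compares d(d + q − 1)/2 with the Hasse–Weil upper bound $q + 1 + 2gq^{1/2}$ for the genus g = (d − 1)(d − 2)/2 of a non-singular plane curve; the function name and the list of prime powers are assumptions.

```python
import math

def stohr_voloch_is_better(q, d):
    """True if d(d + q - 1)/2 < q + 1 + 2*g*sqrt(q), where g = (d - 1)(d - 2)/2."""
    g = (d - 1) * (d - 2) // 2
    return d * (d + q - 1) / 2 < q + 1 + 2 * g * math.sqrt(q)

prime_powers = [2, 3, 4, 5, 7, 8, 9, 11, 13, 16, 17, 19, 23, 25, 27, 29, 31, 32]
print([q for q in prime_powers if stohr_voloch_is_better(q, 4)])
# [2, 3, 4, 5, 7, 8, 9, 11, 13, 16, 17, 19, 23]
```

For d = 4 the inequality reduces to $q + 5 < 6q^{1/2}$, which holds exactly for 1 < q < 25, in agreement with the range q ≤ 23 stated above.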
12.5.47 Remark In this subsection we give some results about the asymptotic growth of the numbers
Nq (g), see 12.5.23. As was mentioned in Proposition 12.5.34, the Hasse-Weil upper bound
$N_q(g) \le q + 1 + 2gq^{1/2}$ cannot be attained if the genus is large with respect to q.
12.5.48 Definition The real number $A(q) := \limsup_{g \to \infty} N_q(g)/g$ is Ihara’s quantity.
12.5.49 Remark As follows from the Hasse–Weil bound, $A(q) \le 2q^{1/2}$. The following bound is a
significant improvement of this estimate.
12.5.50 Theorem (Drinfeld–Vlǎduţ bound) [2714, Theorem 7.1.3], [2882]
$$
A(q) \le q^{1/2} - 1.
$$
12.5.51 Remark The proof of the Drinfeld–Vlǎduţ bound is a clever application of Serre’s explicit
formulas 12.5.25. If q is a square, the Drinfeld–Vlǎduţ bound is sharp.
12.5.52 Theorem [1569, 2821]
$$
A(q) = q^{1/2} - 1 \quad \text{if } q \text{ is a square.}
$$
12.5.53 Remark If q is a non-square, the exact value of A(q) is not known. The lower bounds for
A(q), given below, are proved by providing specific sequences of function fields Fn /Fq such
that $\lim_{n \to \infty} N(F_n)/g(F_n) > 0$. Every such sequence then gives a lower bound for A(q).
For details, see Section 12.6.
12.5.54 Theorem
1. [2280, Theorem 5.2.9], [2592] There is an absolute constant c > 0 such that
A(q) > c · log q for all prime powers q.
2. [265, 3079]
$$
A(q^3) \ge 2(q^2 - 1)/(q + 2).
$$
12.5.55 Remark Recall that ⌊α⌋ and ⌈α⌉ denote the floor and the ceiling of a real number α.
The harmonic mean of two positive real numbers α, β is given by the formula H(α, β) =
2αβ/(α + β). The following result contains Theorem 12.5.52 and Part 2 of Theorem 12.5.54
as special cases.
12.5.56 Theorem [1203] (see also Example 12.6.29) For every prime number p and every n ≥ 2,
For example, one has for p = 2 and all sufficiently large odd integers n,
12.5.57 Example [105, 938] The best known lower bounds for A(q) for q = 2, 3, 5 were obtained
from class field towers:
A(2) ≥ 0.316999... ,
A(3) ≥ 0.492876... ,
A(5) ≥ 0.727272... .
See Also
References Cited: [3, 105, 265, 474, 938, 973, 1144, 1145, 1203, 1205, 1279, 1280, 1414,
1511, 1549, 1550, 1569, 1826, 1959, 2280, 2459, 2499, 2591, 2592, 2593, 2714, 2722, 2783,
2821, 2846, 2882, 3079]
12.6 Towers
Arnaldo Garcia, IMPA
Henning Stichtenoth, Sabanci University
We use terminology as in Sections 12.1 and 12.5; see also [2714]. We discuss some methods
for obtaining lower bounds for Ihara’s quantity A(q); see Definition 12.5.48. Such
bounds have a great impact in applications, for instance in coding theory; see Section 15.2.
12.6.1 Remark Lower bounds for A(q) are usually obtained in the following way: one con-
structs a sequence of function fields $(F_i/\mathbb{F}_q)_{i \ge 0}$ with g(Fi ) → ∞ such that the limit
$\lim_{i \to \infty} N(F_i)/g(F_i)$ exists. If this limit is positive, then it provides a non-trivial lower
bound for A(q).
12.6.2 Remark Essentially three methods are known for constructing such sequences of function
fields: modular towers, class field towers, and explicit towers. In the following two remarks
we give a very brief description of the first two methods.
12.6.3 Remark (Modular towers) [217, 971, 972, 1569, 2821] Modular towers were introduced by
Ihara, and independently by Tsfasman, Vlăduţ, and Zink. Let N be a positive integer and
p a prime number not dividing N . There exists an affine algebraic curve Y0 (N ) defined over
Fp such that, for any field K of characteristic p, Y0 (N ) parametrizes the set of isomorphy
classes of pairs (E, C), where E is an elliptic curve (see Section 12.2) and C is a cyclic
subgroup of E of order N , defined over K, in a functorial way. The construction of Y0 (N ) is
independent of p and can be done in characteristic zero also. The complete curve obtained
from Y0 (N ) is denoted X0 (N ). If ℓ ≠ p is another prime, then the curves $X_0(\ell^n)$, n = 1, 2, . . .
form a tower with the maps sending (E, C) to (E, C′) where C′ is the unique subgroup of C
of index ℓ. Over $\mathbb{F}_{p^2}$, the supersingular elliptic curves (see Subsection 12.2.9) together with
all their cyclic subgroups of order $\ell^n$ give rational points on $X_0(\ell^n)(\mathbb{F}_{p^2})$, because Frobenius
is multiplication by −p on those curves. This gives a tower of curves over $\mathbb{F}_{p^2}$ which attains
the Drinfel’d–Vlăduţ bound.
For $\mathbb{F}_{q^2}$, with q arbitrary, a similar construction can be made using Shimura curves which
parametrize abelian varieties of higher dimension with additional structure.
12.6.4 Remark (Class field towers) [938, 2280, 2561, 2592] Starting with any function field F0 of
genus g0 ≥ 2 and a set S0 of rational places of F0 , one defines inductively the field Fn+1 to
be the maximal abelian unramified extension of Fn in which all places of Sn split completely,
and Sn+1 to be the set of all places of Fn+1 which lie over Sn . If $F_n \subsetneq F_{n+1}$ for all n (which
is not always the case), the tower thus obtained is called a class field tower, and its limit
(see Definition 12.6.8) is at least |S0 |/(g0 − 1). The hard part is to choose F0 , S0 so that the
tower is infinite. This is analogous to the corresponding problem in the number field case
of infinite class field towers which was solved by Golod and Shafarevich. A choice of F0 , S0
then can be used to show that A(p) ≥ c · log p, for p prime, with an absolute constant c > 0.
This approach which is due to Serre [2592], is so far the only way to prove that A(p) > 0
holds for prime numbers p.
12.6.5 Remark (Explicit towers of function fields) These towers were introduced by Garcia and
Stichtenoth [1200, 2714]. The method, which is more elementary than modular towers and
class field towers, is presented below in some detail.
12.6.7 Remark Items 2 and 3 imply that g(Fi ) → ∞ as i → ∞. The following limit exists for
every tower over Fq [2714, Lemma 7.2.3].
12.6.8 Definition Let F = (F0 , F1 , . . .) be a tower of function fields over Fq . The limit
$\lambda(F) := \lim_{i \to \infty} N(F_i)/g(F_i)$ is the limit of the tower F.
12.6.9 Remark We note that the inequalities 0 ≤ λ(F) ≤ A(q) hold for every tower over Fq .
12.6.10 Definition A tower F/Fq is asymptotically good if λ(F) > 0. It is asymptotically bad if
λ(F) = 0.
12.6.11 Remark The notion of asymptotically good (bad) towers is related to the notion of asymp-
totically good (bad) sequences of codes; see Section 15.2. The remark below follows imme-
diately from the definitions.
12.6.12 Remark As A(q) ≥ λ(F), every asymptotically good tower F over Fq provides a non-trivial
lower bound for Ihara’s quantity.
12.6.13 Remark Most towers turn out to be asymptotically bad and some effort is needed to find
asymptotically good ones. We discuss now some criteria which ensure that a tower is good.
12.6.15 Remark The splitting locus is always finite (it may be empty). The ramification locus can
be finite or infinite.
12.6.16 Theorem [2714, Theorem 7.2.10] Assume that the tower F = (F0 , F1 , . . .) over Fq has the
following properties.
1. The splitting locus Split(F/F0 ) is non-empty.
2. The ramification locus Ram(F /F0 ) is finite.
3. For every P ∈ Ram(F/F0 ) there is a constant cP ∈ R such that for all n ≥ 0 and
all places Q of Fn lying over P , the different exponent d(Q|P ) is bounded by
\[ d(Q|P) \le c_P \cdot (e(Q|P) - 1). \]
Then the tower F is asymptotically good, and its limit satisfies the inequality
\[ \lambda(\mathcal{F}) \ge \frac{s}{g(F_0) - 1 + r}, \]
where
\[ s := |\mathrm{Split}(\mathcal{F}/F_0)| \quad \text{and} \quad r := \frac{1}{2} \sum_{P \in \mathrm{Ram}(\mathcal{F}/F_0)} c_P \cdot \deg P. \]
12.6.17 Remark Of course, one should choose the constant cP as small as possible (if it exists). In
general it is a difficult task to prove its existence in towers having wild ramification.
12.6.18 Remark A tower F/F0 is tame if all places P ∈ Ram(F/F0 ) are tame in all extensions
Fn /F0 ; that is, the ramification index e(Q|P ) is relatively prime to q for all places Q of Fn
lying over P .
12.6.19 Remark For a tame tower, the constants cP in Theorem 12.6.16 can be chosen as cP = 1.
Hence a tame tower with finite ramification locus and non-empty splitting locus is asymp-
totically good, and the inequality for λ(F) given in Theorem 12.6.16 holds with
\[ r := \frac{1}{2} \sum_{P \in \mathrm{Ram}(\mathcal{F}/F_0)} \deg P. \]
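The bound of Theorem 12.6.16 is plain arithmetic once the splitting and ramification data are known. The following is a minimal sketch of that arithmetic, not part of the source: the function name `tower_limit_lower_bound` and its argument layout are hypothetical, and the caller is assumed to supply correct values of s, g(F0), and the pairs (cP, deg P).

```python
from fractions import Fraction

def tower_limit_lower_bound(split_count, genus_f0, ramified):
    """Return s / (g(F0) - 1 + r) as in Theorem 12.6.16.

    `ramified` holds one pair (c_P, deg P) per place P of the ramification
    locus; for a tame tower every c_P may be taken equal to 1.
    """
    r = Fraction(1, 2) * sum(Fraction(c) * deg for c, deg in ramified)
    denominator = genus_f0 - 1 + r
    if denominator <= 0:
        raise ValueError("the bound needs g(F0) - 1 + r > 0")
    return Fraction(split_count) / denominator

# Illustrative data: rational F0 (genus 0), one splitting place,
# three ramified places of degree 1 in a tame tower.
print(tower_limit_lower_bound(1, 0, [(1, 1)] * 3))
```

Exact rational arithmetic is used so that the result can be compared directly with closed-form bounds.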
12.6.20 Remark All known asymptotically good towers of function fields have properties 1, 2, and
3 of Theorem 12.6.16.
12.6.21 Definition Let f (Y ) ∈ Fq (Y ) and h(X) ∈ Fq (X) be non-constant rational functions, and
let F = (F0 , F1 , . . .) be a tower of function fields over Fq . The tower F is recursively
defined by the equation f (Y ) = h(X), if there exist elements xi ∈ Fi (i = 0, 1, . . .) such
that
1. F0 = Fq (x0 ) is a rational function field;
2. Fi = Fi−1 (xi ) for all i ≥ 1;
3. for all i ≥ 1, the elements xi−1 , xi satisfy the equation f (xi ) = h(xi−1 ).
12.6.22 Example [2714, Proposition 7.3.2] Let q = ℓ² be a square, ℓ > 2. Then the equation
\[ Y^{\ell-1} = 1 - (X+1)^{\ell-1} \]
defines an asymptotically good tame tower F over Fq . The ramification locus of this tower
is the set of all places (x0 = α) with α ∈ Fℓ , and the place (x0 = ∞) splits completely. By
Theorem 12.6.16 the limit satisfies the inequality
\[ \lambda(\mathcal{F}) \ge \frac{2}{\ell - 2}. \]
12.6.23 Example [2714, Proposition 7.3.3] Let q = ℓ^e with e ≥ 2 and set m := (q − 1)/(ℓ − 1). Then
the equation
\[ Y^m = 1 - (X+1)^m \]
defines an asymptotically good tame tower F over Fq with limit
\[ \lambda(\mathcal{F}) \ge \frac{2}{q - 2}. \]
This gives a simple proof that A(q) > 0 for all non-prime values of q. For q = 4 the tower
attains the Drinfeld–Vlăduţ bound λ(F) = 1 = √4 − 1.
12.6.24 Example [1204] Let q = p² where p is an odd prime. Then the equation
\[ Y^2 = \frac{X^2 + 1}{2X} \]
defines a tame tower F over Fq . Its ramification locus is
\[ \mathrm{Ram}(\mathcal{F}/F_0) = \{ (x_0 = \alpha) \mid \alpha^4 = 1 \text{ or } \alpha = 0 \text{ or } \alpha = \infty \}. \]
There are 2(p − 1) rational places of F0 which split completely in the tower. The inequality
in Theorem 12.6.16 gives λ(F) ≥ p − 1 which coincides with the Drinfeld–Vlăduţ bound.
So,
\[ \lambda(\mathcal{F}) = p - 1. \]
The fact that the splitting locus of this tower has cardinality 2(p − 1) is not easy to prove.
For p = 3, 5 one can check directly that the places (x0 = α) with α⁴ + 1 = 0 (for p = 3)
and α⁸ − α⁴ + 1 = 0 (for p = 5) split completely in F.
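The direct check for p = 3 and p = 5 can be sketched by brute force. The code below is an illustration under stated assumptions, not the argument of [1204]: it models F_{p²} as pairs a + b·t with t² a non-residue, and it tests that the root set S of the given polynomial has 2(p − 1) elements and that for every α in S the equation β² = (α² + 1)/(2α) has two solutions in F_{p²}, both again in S. This closure property is what makes every place lying over (x0 = α) rational at each step. The helper names `quadratic_field` and `is_closed_under_recursion` are hypothetical.

```python
from itertools import product

def quadratic_field(p):
    """Model F_{p^2} as pairs (a, b) = a + b*t with t^2 = n, n a non-residue mod p."""
    n = next(c for c in range(2, p) if pow(c, (p - 1) // 2, p) == p - 1)

    def add(u, v):
        return ((u[0] + v[0]) % p, (u[1] + v[1]) % p)

    def mul(u, v):
        return ((u[0] * v[0] + n * u[1] * v[1]) % p,
                (u[0] * v[1] + u[1] * v[0]) % p)

    def inv(u):
        # u^(q-2) with q = p^2, computed by square-and-multiply
        result, base, e = (1, 0), u, p * p - 2
        while e:
            if e & 1:
                result = mul(result, base)
            base = mul(base, base)
            e >>= 1
        return result

    elements = list(product(range(p), repeat=2))
    return elements, add, mul, inv

def is_closed_under_recursion(p, coeffs):
    """coeffs: integer coefficients (low degree first) of the polynomial cutting out S."""
    elements, add, mul, inv = quadratic_field(p)

    def evaluate(x):
        acc = (0, 0)
        for c in reversed(coeffs):          # Horner's rule
            acc = add(mul(acc, x), (c % p, 0))
        return acc

    roots = {x for x in elements if evaluate(x) == (0, 0)}
    if len(roots) != 2 * (p - 1):
        return False
    for alpha in roots:
        # right-hand side (alpha^2 + 1) / (2 alpha) of the defining equation
        rhs = mul(add(mul(alpha, alpha), (1, 0)), inv(mul((2 % p, 0), alpha)))
        next_values = {b for b in elements if mul(b, b) == rhs}
        if len(next_values) != 2 or not next_values <= roots:
            return False
    return True

print(is_closed_under_recursion(3, [1, 0, 0, 0, 1]))               # alpha^4 + 1
print(is_closed_under_recursion(5, [1, 0, 0, 0, -1, 0, 0, 0, 1]))  # alpha^8 - alpha^4 + 1
```

The exhaustive search over all p² field elements is only practical for very small p, which is all the remark above asks for.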
12.6.25 Remark Now we give some examples of wild towers, that is, there are some places of F0
whose ramification index in some extension Fn /F0 is divisible by the characteristic of Fq .
In wild towers, it is usually difficult to find a bound, if it exists, for the different exponents
in terms of ramification indices (see Theorem 12.6.16).
12.6.26 Example [1200] Let q = ℓ² be a square and define the tower F = (F0 , F1 , . . .) over Fq as
follows: F0 := Fq (x0 ) is the rational function field, and for all n ≥ 0, set Fn+1 := Fn (xn+1 )
with
\[ (x_{n+1} x_n)^{\ell} + x_{n+1} x_n = x_n^{\ell+1}. \]
The ramification locus of F is Ram(F/F0 ) = { (x0 = 0), (x0 = ∞) }, and all other rational
places of F0 split completely in the tower. We note however that Theorem 12.6.16 is not
directly applicable to determine the limit λ(F). One can show that
λ(F) = ℓ − 1,
ℓ − 1. The determination of the splitting locus and the ramification locus for this tower is
easy. The hard part is to show that cP = 2 for all ramified places (for the definition of cP
see Theorem 12.6.16).
12.6.28 Example [265, 2845] Over the field Fq with q = ℓ³ , the equation
\[ Y^{\ell} - Y^{\ell-1} = 1 - X + X^{-(\ell-1)} \]
defines a tower F whose limit satisfies
\[ \lambda(\mathcal{F}) \ge \frac{2(\ell^2 - 1)}{\ell + 2}. \]
It follows that
\[ A(\ell^3) \ge \frac{2(\ell^2 - 1)}{\ell + 2}. \]
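12.6.29 Example [1203] Let q = ℓ^n with n ≥ 2. For every partition n = j + k of n into relatively prime parts
j and k, the equation
\[ \mathrm{Tr}_j\!\left(\frac{Y}{X^{\ell^k}}\right) + \mathrm{Tr}_k\!\left(\frac{Y^{\ell^j}}{X}\right) = 1, \]
where
\[ \mathrm{Tr}_a(T) = T + T^{\ell} + T^{\ell^2} + \cdots + T^{\ell^{a-1}} \quad \text{for any } a \in \mathbb{N}, \]
defines a tower F over Fq whose limit satisfies
\[ \lambda(\mathcal{F}) \ge 2 \left( \frac{1}{\ell^j - 1} + \frac{1}{\ell^k - 1} \right)^{-1}, \]
which is the harmonic mean of ℓ^j − 1 and ℓ^k − 1. This tower gives the best known lower
bound for Ihara’s quantity A(q), for all non-prime fields Fq (at the time of printing).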
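As a small sanity check on the harmonic-mean bound, the sketch below evaluates it exactly for a given ℓ and a coprime partition n = j + k. The function name `harmonic_bound` is hypothetical. For n = 2 (j = k = 1) the value is ℓ − 1, and for n = 3 (j = 1, k = 2) it simplifies to 2(ℓ² − 1)/(ℓ + 2), the bound quoted for cubic fields in Example 12.6.28.

```python
from fractions import Fraction
from math import gcd

def harmonic_bound(ell, j, k):
    """Harmonic mean of ell^j - 1 and ell^k - 1, for coprime positive j and k."""
    if j < 1 or k < 1 or gcd(j, k) != 1:
        raise ValueError("j and k must be positive and relatively prime")
    return 2 / (Fraction(1, ell**j - 1) + Fraction(1, ell**k - 1))

for ell in (2, 3, 4, 5):
    # n = 3: agrees with 2(ell^2 - 1)/(ell + 2)
    print(ell, harmonic_bound(ell, 1, 2), Fraction(2 * (ell**2 - 1), ell + 2))
```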
12.6.30 Remark None of the towers in Examples 12.6.22 - 12.6.24 or 12.6.26 - 12.6.29 is Galois over
F0 , that is, not all of the extensions Fn /F0 , n ≥ 0 are Galois extensions. In some special
cases however, one can prove that the tower F̂ := (F̂0 , F̂1 , . . .), where F̂n is the Galois
closure of Fn /F0 , is also asymptotically good [1202, 2713].
12.6.31 Remark There are examples of function fields with many rational points which are abelian
extensions of a rational function field (for instance, the Hermitian function field H; see
Example 12.5.39). Other abelian extensions over Fq (x) having many rational places can
be obtained via the method of cyclotomic function fields [2280]. However, abelian ex-
tensions F/Fq (x) of large genus have only few rational places. More precisely, if (Fi )i≥0
is a sequence of abelian extensions of a rational function field with g(Fi ) → ∞, then
limi→∞ N (Fi )/g(Fi ) = 0 [1107].
12.6.32 Remark We conclude this section with a warning: not every irreducible equation f (Y ) =
h(X) defines a recursive tower. For instance, if one replaces X + 1 by X in Examples 12.6.22
and 12.6.23, one just gets a finite extension F/F0 but not a tower. Also, one has to show
that Fq is algebraically closed in each field Fi of the tower. In most of the examples above
this follows from the fact that there is some place which is totally ramified in all extensions
Fi /F0 .
See Also
References Cited: [217, 265, 938, 971, 972, 1107, 1200, 1201, 1202, 1203, 1204, 1569, 2280,
2561, 2592, 2713, 2714, 2821, 2845]
We use the terminology in [1427]. For the definitions of schemes, morphisms between
schemes, and the affine Spec A for a commutative ring A, see [1427, II 2]. For the definitions
of schemes or morphisms of finite type, see [1427, II 3]. For the definitions of separated,
proper or projective schemes or morphisms, see [1427, II 4]. For the definition of smooth
morphisms, see [1427, III 10].
Throughout this section, we assume our schemes are separated. Let X be a scheme of
finite type over Z. Denote the set of Zariski closed points in X by |X| (observe that in the
rest of the handbook this notation indicates the cardinality of the set X). For any x ∈ |X|,
the residue field k(x) of X at x is a finite field. Let N (x) be the number of elements of k(x).
12.7.2 Remark When X is the affine scheme Spec Z, ζX (s) is just the Riemann zeta function
\[ \zeta(s) = \prod_{p} \frac{1}{1 - p^{-s}} = \sum_{n=1}^{\infty} \frac{1}{n^s}. \]
12.7.3 Remark We are concerned with the case where X is a scheme of finite type over a finite
field Fq with q elements of characteristic p. For any x ∈ |X|, k(x) is a finite extension of Fq .
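Set deg(x) = [k(x) : Fq ]. Then we have N (x) = q^{deg(x)} .
\[ \zeta_X(s) = Z(X, q^{-s}). \]
We have
\[ Z(X, t) = \sum_{\alpha} t^{\deg(\alpha)}, \]
\[ \#\mathbb{A}^n_{\mathbb{F}_q}(\mathbb{F}_{q^m}) = q^{mn}, \]
\[ Z(\mathbb{A}^n_{\mathbb{F}_q}, t) = \frac{1}{1 - q^n t}. \]
\[ \#\mathbb{P}^n_{\mathbb{F}_q}(\mathbb{F}_{q^m}) = 1 + q^m + \cdots + q^{mn}, \]
\[ Z(\mathbb{P}^n_{\mathbb{F}_q}, t) = \frac{1}{(1 - t)(1 - qt) \cdots (1 - q^n t)}. \]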
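The two formulas for projective space can be compared numerically. The sketch below assumes the standard expression Z(X, t) = exp(Σ_{m≥1} #X(F_{q^m}) t^m / m) for the zeta function (an assumption here, since that formula does not appear in the lines above) and expands both sides as power series in t. The function names are hypothetical.

```python
from fractions import Fraction

def zeta_from_counts(counts):
    """Coefficients z_0..z_K of exp(sum_{m>=1} counts[m-1] t^m / m).

    Uses t Z'(t) = Z(t) * sum_m N_m t^m, i.e. k z_k = sum_{m=1}^{k} N_m z_{k-m}.
    """
    z = [Fraction(1)]
    for k in range(1, len(counts) + 1):
        z.append(sum(counts[m - 1] * z[k - m] for m in range(1, k + 1)) / k)
    return z

def projective_space_zeta(q, n, order):
    """Coefficients up to t^order of 1 / ((1 - t)(1 - q t)...(1 - q^n t))."""
    c = [Fraction(1)] + [Fraction(0)] * order
    for i in range(n + 1):
        for k in range(1, order + 1):
            c[k] += q**i * c[k - 1]      # multiply the series by 1 / (1 - q^i t)
    return c

q, n, order = 2, 2, 6
counts = [sum(q ** (m * i) for i in range(n + 1)) for m in range(1, order + 1)]
print(zeta_from_counts(counts) == projective_space_zeta(q, n, order))
```

The recurrence avoids computing a formal exponential directly and keeps all coefficients exact.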
such that Pi (X, t) ∈ Z[t] and all reciprocal zeros of Pi (X, t) lie on the circle
\[ |t| = q^{i/2}. \]
12.7.9 Remark The above theorem is usually called the Weil conjecture. It was proved to be true
by Dwork, Grothendieck, and Deligne. Weil points out that to prove his conjecture, one
needs to construct a cohomology theory for schemes over an abstract field. One such theory
is the ℓ-adic cohomology theory constructed by Grothendieck, where ℓ is a prime number
distinct from the characteristic of the ground field. Results on ℓ-adic cohomology theory
used in this section can be found in [136, 797, 1570].
12.7.10 Remark In the Riemann hypothesis, we consider the Archimedean absolute values of the
zeros and poles of Z(X, t). One can show that for any prime number ℓ ≠ p, all the zeros
and poles of Z(X, t) are ℓ-adic units. (This follows from ℓ-adic cohomology theory.) It is
interesting to study the p-adic absolute values of the zeros and poles of Z(X, t); see Section
12.8.
12.7.11 Definition [1570, Exp. XV] Let FrX : X → X be the morphism of schemes (FrX , Fr#X ) :
(X, OX ) → (X, OX ) such that on the underlying topological space, FrX : X → X is the
identity, and Fr#X : OX → OX maps a section s of OX to s^q . Fix an algebraic closure F
12.7.12 Remark There is a canonical one-to-one correspondence between the set X(Fqm ) of Fqm -rational
points in X, and the set of fixed points of F_X^m on X.
12.7.13 Remark Let H^i(X, Qℓ) and H^i_c(X, Qℓ) be the ℓ-adic cohomology groups and the ℓ-adic
cohomology groups with compact support of X, respectively. They are finite dimensional
vector spaces, and they vanish if i ∉ [0, 2 dim X]. If X is proper, we have H^i(X, Qℓ) ≅ H^i_c(X, Qℓ).
12.7.14 Theorem [797, Rapport 3.2] (Lefschetz fixed point theorem) We have
\[ \#X(\mathbb{F}_{q^m}) = \sum_{i=0}^{2\dim X} (-1)^i \, \mathrm{Tr}\bigl(F_X^m, H^i_c(X, \mathbb{Q}_\ell)\bigr). \]
12.7.15 Remark The following theorem follows from Theorems 12.7.6 and 12.7.14. It proves the
rationality of the function Z(X, t).
12.7.16 Theorem [797, Rapport 3.1] (Grothendieck’s formula) Let X be a scheme of finite type
over Fq . We have
\[ Z(X, t) = \prod_{i=0}^{2\dim X} \det\bigl(1 - F_X t, H^i_c(X, \mathbb{Q}_\ell)\bigr)^{(-1)^{i+1}}. \]
12.7.17 Remark Suppose a finite group G acts on X. Then each H^i_c(X, Qℓ) is a representation of G.
Let
\[ H^i_c(X, \mathbb{Q}_\ell) = \bigoplus_{j \in I_i} V_{ij} \]
be the isotypic decomposition of this representation. Then each Vij is invariant under the
action of FX , and we have a further factorization for the formula of Z(X, t) in Theorem
12.7.16:
\[ Z(X, t) = \prod_{i=0}^{2\dim X} \prod_{j \in I_i} \det(1 - F_X t, V_{ij})^{(-1)^{i+1}}. \]
In this way, we can get further information about Z(X, t). As an example, let Xλ be the
Dwork hypersurface in P^{n−1}_{Fq} defined by the equation
12.7.19 Remark Poincaré duality implies the functional equation for Z(X, t).
12.7.20 Theorem [795, Paragraph 2.6] Suppose X is proper smooth over Fq and pure of dimension
n. We have
\[ Z\left(X, \frac{1}{q^n t}\right) = \epsilon \, q^{n\chi(X)/2} \, t^{\chi(X)} \, Z(X, t), \]
where χ(X) = Σ_{i=0}^{2n} (−1)^i dim H^i(X, Qℓ) is the Euler characteristic of X, and if N is the
multiplicity of the eigenvalue q^{n/2} of FX acting on H^n(X, Qℓ), then we have
\[ \epsilon = \begin{cases} 1 & \text{if } n \text{ is odd,} \\ (-1)^N & \text{if } n \text{ is even.} \end{cases} \]
12.7.21 Remark Together with Theorem 12.7.16, the following theorem of Deligne proves the Rie-
mann hypothesis for Z(X, t).
12.7.22 Theorem [795], [798, Corollaires 3.3.4-3.3.5] Suppose X is a scheme of finite type over Fq .
For any eigenvalue α of FX on H^i_c(X, Qℓ), α is an algebraic integer, and all the Galois
conjugates of α have Archimedean absolute value q^{w/2} for some integer w ≤ i. The equality
w = i holds if X is proper smooth over Fq .
12.7.23 Remark For each 0 ≤ i ≤ 2 dim X, let bi = dim H^i_c(X, Qℓ), and let αij (j = 1, . . . , bi ) be all
the eigenvalues of FX on H^i_c(X, Qℓ). By Theorem 12.7.14, we have
\[ \#X(\mathbb{F}_q) = \sum_{i=0}^{2\dim X} (-1)^i \sum_{j=1}^{b_i} \alpha_{ij}. \]
Theorem 12.7.22 provides bounds for |αij |. To get a bound for #X(Fq ), it suffices to find a
bound for Σ_i bi .
12.7.24 Remark Suppose X is proper smooth over Fq , pure of dimension n, and geometrically
connected (i.e., X ⊗Fq F is connected, where F is an algebraic closure of Fq ). Then H^0(X, Qℓ) and
H^{2n}(X, Qℓ) are one dimensional, and FX acts on them by scalar multiplications 1 and q^n , respectively.
12.7.25 Corollary Suppose X is proper smooth over Fq , pure of dimension n, and geometrically
connected. Then we have
\[ |\#X(\mathbb{F}_q) - (1 + q^n)| \le \sum_{i=1}^{2n-1} b_i \, q^{i/2}. \]
\[ Z(X, t) = \frac{Q(t)}{P(t)}, \]
12.7.28 Remark Using Dwork’s theory and Theorem 12.7.22, Bombieri obtains the following bound
for totdeg Z(X, t); see [341, Theorems 1, 2, and Proposition in IV] and [339, Theorem 1].
12.7.29 Theorem Let X be a closed affine subvariety in A^N_{Fq} defined by the vanishing of r
polynomials f1 , . . . , fr ∈ Fq [t1 , . . . , tN ] of degrees ≤ d. Then we have
We have
\[ \sum_{i=0}^{2\dim X} (-1)^i \dim H^i_c(X, \mathbb{Q}_\ell) \le 2^r D_N(1, d_1 + 1, \ldots, d_r + 1), \]
totdeg Z(X, t) ≤ (2e3 )N (2e3 + 1)N (5 max{d1 , . . . , dr } + 1)N .
12.7.32 Remark Starting from a universal bound for |Σ_{i=0}^{2 dim X} (−1)^i dim H^i_c(X, Qℓ)|, Katz [1706]
deduces a bound for Σ_{i=0}^{2 dim X} dim H^i_c(X, Qℓ), and hence a bound for totdeg Z(X, t). In
particular, Katz gets the following estimate from those of Bombieri and Adolphson-Sperber.
12.7.2 L-functions
12.7.34 Remark For any scheme X of finite type over Fq and any ℓ-adic sheaf F, we can associate
an L-function L(X, F, t). For the definition of an ℓ-adic sheaf on a scheme, we refer to [1570,
Exp. VI]. We simply mention that in the case where X = Spec F for a field F , giving an
ℓ-adic sheaf on X is equivalent to giving a continuous ℓ-adic Galois representation
\[ \mathrm{Gal}(\overline{F}/F) \to \mathrm{GL}(n, \mathbb{Q}_\ell). \]
Suppose X is a scheme of finite type over Fq . An Fqm -rational point x ∈ X(Fqm ) is an
Fq -morphism Spec Fqm → X. Let F be an ℓ-adic sheaf on X. Then the inverse image of F
on Spec Fqm defines a Galois representation which we denote by
Gal(F/Fqm ) → GL(Fx̄ ).
Here Fx̄ is the stalk of F at the geometric point Spec F → X over x. The Galois group
Gal(F/Fqm ) has a special element, the Frobenius substitution
φx : Fqm → Fqm , φx (α) = α^{q^m} .
Denote by Fx the inverse of φx and call it the geometric Frobenius at x. Let x ∈ |X| be a
Zariski closed point in X. Then we have a closed immersion Spec k(x) → X, and hence x
defines a k(x)-rational point in X. We denote the corresponding geometric Frobenius also
by Fx .
12.7.35 Definition The L-function L(X, F, t) is the formal power series with variable t and with
coefficients in Qℓ defined by
\[ L(X, \mathcal{F}, t) = \prod_{x \in |X|} \frac{1}{\det(1 - F_x t^{\deg(x)}, \mathcal{F}_{\bar{x}})}. \]
12.7.36 Remark When F is the constant ℓ-adic sheaf Qℓ , L(X, F, t) coincides with Z(X, t).
12.7.37 Theorem [797, Rapport 3] For any positive integer m, let
\[ S_m(X, \mathcal{F}) = \sum_{x \in X(\mathbb{F}_{q^m})} \mathrm{Tr}(F_x, \mathcal{F}_{\bar{x}}). \]
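12.7.38 Remark [797, Sommes trig.] Let ψ : Fq → Qℓ^* be a nontrivial additive character. One can
construct an ℓ-adic sheaf Lψ on A^1_{Fq} such that for any Fqm -rational point x ∈ A^1_{Fq}(Fqm ) =
Fqm , we have
\[ \mathrm{Tr}(F_x, (\mathcal{L}_\psi)_{\bar{x}}) = \psi(\mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q}(x)). \]
Let f ∈ Fq [t1 , . . . , tN ] be a polynomial. It defines a morphism f : A^N_{Fq} → A^1_{Fq}. For any
Fqm -rational point x ∈ A^N_{Fq}(Fqm ) = (Fqm )^N with coordinates x = (x1 , . . . , xN ), the ℓ-adic sheaf
f^*Lψ has the property
\[ \mathrm{Tr}(F_x, (f^*\mathcal{L}_\psi)_{\bar{x}}) = \psi(\mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q}(f(x_1, \ldots, x_N))). \]
We note that
\[ S_m(\mathbb{A}^N_{\mathbb{F}_q}, f^*\mathcal{L}_\psi) = \sum_{x_1, \ldots, x_N \in \mathbb{F}_{q^m}} \psi(\mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q}(f(x_1, \ldots, x_N))) \]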
the homomorphisms induced by this pair on cohomology groups with compact support by
F : Hci (X, F) → Hci (X, F).
12.7.40 Theorem [797, Rapport 3.2] (Grothendieck trace formula) We have
\[ \sum_{x \in X(\mathbb{F}_{q^m})} \mathrm{Tr}(F_x, \mathcal{F}_{\bar{x}}) = \sum_{i=0}^{2\dim X} (-1)^i \, \mathrm{Tr}\bigl(F^m, H^i_c(X, \mathcal{F})\bigr). \]
12.7.41 Remark The following theorem follows from Theorems 12.7.37 and 12.7.40. It proves the
rationality of the function L(X, F, t).
12.7.42 Theorem [797, Rapport 3.1] (Grothendieck’s formula) Let X be a scheme of finite type
over Fq . We have
\[ L(X, \mathcal{F}, t) = \prod_{i=0}^{2\dim X} \det\bigl(1 - F t, H^i_c(X, \mathcal{F})\bigr)^{(-1)^{i+1}}. \]
12.7.44 Remark Together with 12.7.42, the following theorem of Deligne proves the Riemann hy-
pothesis for L(X, F, t).
12.7.45 Theorem [798, Corollaire 3.3.4] Suppose X is a scheme of finite type over Fq and F is a
mixed ℓ-adic sheaf of weights ≤ w on X. Then any eigenvalue of F on H^i_c(X, F) is pure of
weight ≤ i + w relative to q.
12.7.46 Remark For each 0 ≤ i ≤ 2 dim X, let bi = dim H^i_c(X, F), and let αij (j = 1, . . . , bi ) be all
the eigenvalues of F on H^i_c(X, F). By 12.7.40, we have
\[ S_m(X, \mathcal{F}) = \sum_{x \in X(\mathbb{F}_{q^m})} \mathrm{Tr}(F_x, \mathcal{F}_{\bar{x}}) = \sum_{i=0}^{2\dim X} (-1)^i \sum_{j=1}^{b_i} \alpha_{ij}^m. \]
Theorem 12.7.45 provides bounds for |αij |. To get a bound for Sm (X, F), it suffices to find
bounds for Σ_i bi .
12.7.47 Remark Applying Theorem 12.7.45 to the ℓ-adic sheaf f^*Lψ in Remark 12.7.38, we can
get a bound for the exponential sum |Σ_{x1 ,...,xN ∈ Fqm} ψ(Tr_{Fqm /Fq}(f (x1 , . . . , xN )))|. We have
the following results.
12.7.48 Theorem [795, Théorème 8.4], [798, Paragraphs 3.7.2-3.7.4] Let f ∈ Fq [t1 , . . . , tN ] be a
polynomial of degree d, and let fd be the homogeneous part of f of degree d. Suppose fd
defines a smooth hypersurface in P^{N−1}_{Fq} and d is relatively prime to p.
1. H^i_c(A^N_F , f^*Lψ ) = 0 for i ≠ N .
2. dim H^N_c(A^N_F , f^*Lψ ) = (d − 1)^N .
3. All eigenvalues of F on H^N_c(A^N_F , f^*Lψ ) are pure of weight N .
4. L(A^N_{Fq} , f^*Lψ , t) = P (t)^{(−1)^{N+1}} for a polynomial P (t) of degree (d − 1)^N so that all
reciprocal roots of P (t) have Archimedean absolute value q^{N/2} .
5. |Σ_{x1 ,...,xN ∈ Fqm} ψ(Tr_{Fqm /Fq}(f (x1 , . . . , xN )))| ≤ (d − 1)^N q^{Nm/2} .
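The estimate in item 5 can be tested numerically for small primes. The sketch below is an illustration under assumptions that are not in the source: it takes m = 1 and q = p prime, and it replaces the ℓ-adic character ψ by the complex character x ↦ exp(2πi x/p), which is the usual way such sums are evaluated in practice. The function names are hypothetical, and the caller is responsible for choosing f whose top-degree part is smooth and whose degree is prime to p.

```python
import cmath
import math
from itertools import product

def exponential_sum(f, p, n_vars):
    """Sum of exp(2*pi*i*f(x)/p) over all x in F_p^n_vars (p prime)."""
    total = 0
    for xs in product(range(p), repeat=n_vars):
        total += cmath.exp(2j * math.pi * (f(*xs) % p) / p)
    return total

def deligne_bound(d, p, n_vars):
    """The bound (d - 1)^N * p^(N/2) of item 5 with m = 1 and q = p."""
    return (d - 1) ** n_vars * p ** (n_vars / 2)

# One variable: f = x^3 + x over F_7 (degree 3 is prime to 7).
s1 = exponential_sum(lambda x: x**3 + x, 7, 1)
print(abs(s1), deligne_bound(3, 7, 1))

# Two variables: the top-degree part x^3 + y^3 has three distinct zeros on the
# projective line in characteristic 7, so it defines a smooth hypersurface.
s2 = exponential_sum(lambda x, y: x**3 + y**3 + x * y, 7, 2)
print(abs(s2), deligne_bound(3, 7, 2))
```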
12.7.49 Theorem [23, Theorem 4.2], [812, Theorem 1.3] Let f ∈ Fq [t1 , . . . , tN , 1/t1 , . . . , 1/tN ] be a
Laurent polynomial. Write
\[ f = \sum_{i_1, \ldots, i_N} c_{i_1 \ldots i_N} \, t_1^{i_1} \cdots t_N^{i_N}. \]
Let ∆∞ (f ) be the convex hull in Q^N of the set {(i1 , . . . , iN ) | c_{i1 ...iN} ≠ 0} ∪ {0}. For any face
τ of ∆∞ (f ), let fτ = Σ_{(i1 ,...,iN )∈τ} c_{i1 ...iN} t1^{i1} · · · tN^{iN} . Suppose f is nondegenerate with respect
to ∆∞ (f ) in the sense that for any face τ of ∆∞ (f ) that does not contain the origin, the
subscheme of G^N_{m,Fq} = (A^1_{Fq} − {0})^N defined by
\[ \frac{\partial f_\tau}{\partial t_1} = \cdots = \frac{\partial f_\tau}{\partial t_N} = 0 \]
is empty.
12.7.50 Remark The estimates in Theorem 12.7.29, and Propositions 12.7.31 and 12.7.33 can also
be extended to the L-functions associated to exponential sums [22, 339, 341, 1706].
12.7.51 Remark Suppose X is a geometrically connected smooth projective curve over Fq of genus
g. Then H^1(X, Qℓ) can be identified with Tℓ(JX ) ⊗Zℓ Qℓ , where Tℓ(JX ) is the Tate module
of the Jacobian JX of X, and dim H^1(X, Qℓ) = 2g. By Theorems 12.7.16 and 12.7.22 and
Corollary 12.7.25, we have the following.
12.7.52 Theorem Suppose X is a geometrically connected smooth projective curve over Fq of genus
g. We have
\[ Z(X, t) = \frac{P(t)}{(1 - t)(1 - qt)}, \]
where P (t) = det(1 − F t, H^1(X, Qℓ)) is a polynomial with integer coefficients, and all its
reciprocal roots have Archimedean absolute value √q. Moreover, we have
\[ |\#X(\mathbb{F}_q) - (1 + q)| \le 2g\sqrt{q}. \]
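For genus g = 1 the inequality can be checked by direct point counting. The sketch below assumes a prime p > 3 and a curve in short Weierstrass form y² = x³ + ax + b; both the model and the function name `count_points` are choices made for this illustration, not notation from the text. The count includes the single point at infinity.

```python
def count_points(a, b, p):
    """Number of F_p-points, including the point at infinity, on y^2 = x^3 + a x + b."""
    if (4 * a**3 + 27 * b**2) % p == 0:
        raise ValueError("singular curve")
    # number of y in F_p with y^2 equal to a given residue
    square_roots = {}
    for y in range(p):
        square_roots[y * y % p] = square_roots.get(y * y % p, 0) + 1
    affine = sum(square_roots.get((x**3 + a * x + b) % p, 0) for x in range(p))
    return affine + 1

p = 101
for b in range(1, 6):
    n_points = count_points(1, b, p)
    # |N - (1 + p)| <= 2 sqrt(p), written without floating point
    print(b, n_points, (n_points - (p + 1)) ** 2 <= 4 * p)
```

With N = #X(F_p), the numerator of the zeta function is then P(t) = 1 − (p + 1 − N)t + pt².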
12.7.53 Remark The above theorem was proved by Hasse for elliptic curves and by Weil for curves
of higher genus. An elementary proof was given by Stepanov, Schmidt, and Bombieri [340].
12.7.54 Definition Let K(X) be the function field of X, and let ρ : Gal(\overline{K(X)}/K(X)) → GL(V )
be a continuous Galois representation, where V is a finite dimensional vector space over
Qℓ . Suppose there exists a finite subset S of |X| such that ρ is unramified everywhere
on X − S. We define the L-function L(X, ρ, t) to be
\[ L(X, \rho, t) = \prod_{x \in |X|} \frac{1}{\det(1 - F_x t^{\deg(x)}, V^{I_x})}, \]
12.7.55 Remark The Galois representation ρ defines an ℓ-adic sheaf FV on X − S such that for any
x ∈ |X − S|, the Galois representation Gal(\overline{k(x)}/k(x)) → GL(Fx̄ ) coincides with the Galois
representation ρ|Gal(\overline{k(x)}/k(x)) . Let j : X − S ↪ X be the open immersion. Then we have
We have
\[ H^0(X, j_*\mathcal{F}_V) \cong V^{\mathrm{Gal}(\overline{K(X)}/K(X))}, \]
\[ H^2(X, j_*\mathcal{F}_V) \cong V_{\mathrm{Gal}(\overline{K(X)}/K(X))}, \]
12.7.56 Theorem [136, Exp. XVIII. Paragraph 3.2.6], [797, Dualité 1.3] (Poincaré duality) We have
a perfect pairing
\[ (\ ,\ ) : H^i(X, j_*\mathcal{F}_V) \times H^{2-i}(X, j_*\mathcal{F}_{V^*}) \to \mathbb{Q}_\ell, \]
12.7.57 Remark By Poincaré duality and Theorem 12.7.42, we have the following functional equa-
tion for L-functions.
where
\[ \chi(X, \rho) = \sum_{i=0}^{2} (-1)^i \dim H^i(X, j_*\mathcal{F}_V), \]
\[ \epsilon(X, \rho) = \prod_{i=0}^{2} \det\bigl(-F, H^i(X, j_*\mathcal{F}_V)\bigr)^{(-1)^{i+1}}. \]
12.7.59 Remark Using Theorem 12.7.45 and Poincaré duality, one can prove the following, which
gives the Riemann hypothesis for L(X, ρ, t) by Remark 12.7.55.
12.7.60 Theorem Suppose for any x ∈ |X − S|, all eigenvalues of Fx on V are pure of weight w
relative to q deg(x) . Then any eigenvalue of F on H i (X, j∗ F) is pure of weight i + w relative
to q.
12.7.61 Remark For any point x ∈ |X|, let Kx be the completion of K(X) with respect to the
valuation corresponding to x, and let ρx : Gal(\overline{Kx}/Kx ) → GL(V ) be the restriction of the
representation ρ. In Theorem 12.7.58, the Euler characteristic χ(X, ρ) and the constant
ε(X, ρ) in the functional equation can be expressed in terms of the invariants of the Galois
representations ρx (x ∈ |X|) of the local fields Kx .
\[ \chi(X, \rho) = (2 - 2g) \dim V - \sum_{x \in |X|} \deg(x) \, a_x(\rho_x), \]
12.7.63 Theorem [1868, Théorème 3.2.1.1] (Laumon’s product formula) Let ω be any nonzero
meromorphic differential 1-form on X. Then we have
\[ \epsilon(X, \rho) = q^{(1-g)\dim(V)} \prod_{x \in |X|} \epsilon(K_x, \rho_x, \omega|_{\mathrm{Spec}\, K_x}), \]
See Also
References Cited: [22, 23, 136, 339, 340, 341, 435, 486, 794, 795, 797, 798, 812, 1345, 1427,
1570, 1706, 1710, 1868]
12.8.1 Introduction
12.8.1 Remark We know from the preceding section that the reciprocal roots and poles of zeta
and L-functions defined over the finite field Fq are algebraic integers, which are units at all
primes except the Archimedean ones and those lying over p. Many Archimedean estimates
(the Riemann hypothesis over finite fields) have been given in Section 12.7; in this section
we are interested in p-adic estimates, i.e., the p-adic Riemann hypothesis. We often refer to
notations and results from Section 12.9.
12.8.2 Remark The first result in this direction seems to be Stickelberger’s congruence which
gives the valuation of Gauss sums and Jacobi sums (see Section 6.1). The modern results
come from the work of Dwork [940], who was the first to prove the rationality of zeta and
L-functions, by p-adic means. From his pioneering work, many p-adic cohomology theories
originated, such as Monsky-Washnitzer, crystalline, or rigid cohomology [1572]; most of the
results described below follow from the explicit description of these cohomologies. Note also
that they have proved very useful for explicit calculations.
12.8.3 Remark One can describe the variation of the p-adic cohomology spaces from differential
equations, such as the Picard-Fuchs one; they sometimes allow one to give an analytic
expression for roots or poles of some zeta and L-functions. The best known example is the
family of ordinary elliptic curves in Legendre form, which is linked to a hypergeometric
function ₂F₁ [941]; the Gross-Koblitz formula (Theorem 6.1.113) links Gauss sums with
the p-adic gamma function. As a consequence one gets a p-adic expression for Jacobi sums
and the zeta function of a diagonal hypersurface. Other examples are cubic sums linked
to the solutions of the Airy differential equation [1397], Kloosterman sums to the Bessel
differential equation [942], or the zeta function of a monomial deformation of a diagonal
hypersurface which can be expressed from hypergeometric functions ₙF₍ₙ₋₁₎ [1754, 3041].
12.8.4 Remark Here we shall be concerned with p-adic valuations, and we mention the subjects
above only briefly. Nor do we discuss unit root functions; the reader interested in this subject
should refer to [2895, 2896, 2897] and the references therein.
12.8.5 Remark [798] Let L(T ) be an L-function as in Definition 12.7.35, coming from a scheme
over Fq of dimension n and a punctually pure sheaf of weight 0. From Deligne’s integrality
theorem and Poincaré duality (see Section 12.7), the q-adic valuations of its reciprocal roots
and poles are rational numbers lying in the interval [0, n].
12.8.6 Definition Let F be a p-adic field, OF its ring of integers, π a uniformizing parameter,
and vq the valuation on F normalized by vq (q) = 1. If P = Σ_{i=0}^{d} ai T^i ∈ F [T ] is a one
variable polynomial, its q-adic Newton polygon, denoted NPq (P ), is the lower convex
hull of the set of points {(i, vq (ai )), 0 ≤ i ≤ d}.
12.8.7 Theorem [1770] Assume that P (0) = 1. Let s1 , . . . , sr be the slopes of NPq (P ), of respective
multiplicity li (i.e., each si is the slope of a segment of horizontal length li ); then the
polynomial P has exactly li reciprocal roots of q-adic valuation si for any 1 ≤ i ≤ r.
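The Newton polygon and its slopes are easy to compute for a polynomial with integer coefficients. The sketch below is restricted, as an assumption for simplicity, to q = p prime and coefficients in Z, so that vq is the ordinary p-adic valuation; the function names are hypothetical. It returns the list of (slope, horizontal length) pairs, which by Theorem 12.8.7 are the valuations of the reciprocal roots with their multiplicities.

```python
from fractions import Fraction

def valuation(a, p):
    """p-adic valuation of a nonzero integer a."""
    v = 0
    while a % p == 0:
        a //= p
        v += 1
    return v

def newton_slopes(coeffs, p):
    """Slopes of the p-adic Newton polygon of sum(coeffs[i] * T**i).

    Returns a list of (slope, horizontal_length) pairs, slopes increasing.
    """
    points = [(i, valuation(a, p)) for i, a in enumerate(coeffs) if a != 0]
    hull = []
    for x, y in points:
        # keep only points on the lower convex hull (monotone chain)
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            if (x2 - x1) * (y - y1) - (y2 - y1) * (x - x1) <= 0:
                hull.pop()
            else:
                break
        hull.append((x, y))
    return [(Fraction(y2 - y1, x2 - x1), x2 - x1)
            for (x1, y1), (x2, y2) in zip(hull, hull[1:])]

print(newton_slopes([1, 3, 5], 5))   # 1 + 3T + 5T^2
print(newton_slopes([1, 0, 5], 5))   # 1 + 5T^2
```

The first polynomial has one reciprocal root that is a 5-adic unit and one of valuation 1; the second has two reciprocal roots of valuation 1/2.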
12.8.8 Remark One can give a more general statement about the valuations of the roots of P ,
removing the hypothesis P (0) = 1. However, the theorem above is sufficient in the following
results.
12.8.2 Lower bounds for the first slope
12.8.9 Remark We give lower bounds for the first slope of the Newton polygon of L-functions
attached to families of (additive) exponential sums over affine space, first uniform, then
depending on the characteristic. We end the subsection with an (incomplete) historical
account on these questions.
12.8.10 Proposition [150] Let X be a scheme of finite type over Fq and L(X, F, T ) be an L-function
as in Definition 12.7.35; for µ ∈ R+ , the following statements are equivalent:
1. The q-adic valuations of the reciprocal roots and poles of L(X, F, T ) are greater than
or equal to µ.
2. For any m, we have vqm (Sm (X, F)) ≥ µ.
3. All slopes of the q-adic Newton polygons of the factors of L(X, F, T ) are greater than
or equal to µ.
12.8.11 Theorem [21, Theorem 1.2] Let f ∈ Fq [x1 , . . . , xN ] be a polynomial, and ∆ := ∆∞ (f ) its
Newton polytope at infinity (see Theorem 12.7.49); denote by ω(∆) the smallest positive
rational number such that ω(∆)∆, the dilation of ∆ by the factor ω(∆), contains a lattice
point with all coordinates positive (a point in Z^N_{>0} ). Then every reciprocal root or pole of
L(AN , f ∗ Lψ , T ) has q-adic valuation greater than or equal to ω(∆).
12.8.12 Remark Note that we make no assumption about the polynomial being non-degenerate
with respect to its Newton polytope here.
12.8.13 Remark (see Section 7.1) One can deduce divisibility results on the numbers of points
of algebraic varieties via the orthogonality relation on additive characters. Actually the
Chevalley-Warning, Ax and Katz theorems are all consequences of the theorem above.
12.8.14 Definition Let D ⊂ (N\{0})N be a finite subset, which is not contained in some Nk ,
k < N ; for any m ≥ 1, define the subset ED,p (m) of {0, . . . , p^m − 1}^{#D} consisting of
all (ud )d∈D such that Σ_{d∈D} d·ud ≡ 0 (mod p^m − 1) and Σ_{d∈D} d·ud has all coordinates
positive. We set
\[ \sigma_{D,p}(m) := \min \Bigl\{ \sum_{d \in D} \sigma_p(u_d), \ (u_d) \in E_{D,p}(m) \Bigr\}. \]
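For very small parameters the quantity σD,p(m) can be found by exhaustive search. The sketch below rests on an assumption that is not stated in the lines above: σp(u) is taken to be the sum of the base-p digits of u. The set D is represented as a list of integer tuples of equal length N, and the congruence is imposed coordinate by coordinate. The function names are hypothetical.

```python
from itertools import product

def digit_sum(u, p):
    """Sum of the base-p digits of u (assumed meaning of sigma_p)."""
    s = 0
    while u:
        s += u % p
        u //= p
    return s

def sigma_D_p(D, p, m):
    """Brute-force value of sigma_{D,p}(m); returns None if E_{D,p}(m) is empty."""
    modulus = p**m - 1
    n = len(D[0])
    best = None
    for u in product(range(p**m), repeat=len(D)):
        # coordinates of sum_{d in D} u_d * d
        total = [sum(d[i] * ud for d, ud in zip(D, u)) for i in range(n)]
        if all(t > 0 and t % modulus == 0 for t in total):
            weight = sum(digit_sum(ud, p) for ud in u)
            if best is None or weight < best:
                best = weight
    return best

# One variable, the single exponent 3, p = 7.
print(sigma_D_p([(3,)], 7, 1), sigma_D_p([(3,)], 7, 2))
```

The search space has p^{m·#D} elements, so this is only usable as a check on tiny cases.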
12.8.15 Proposition [293] The set {σD,p (m)/m}_{m≥1} has a minimum.
12.8.17 Theorem Let Fq [x1 , . . . , xN ]D be the vector space of polynomials whose monomials have
their exponents in D.
1. For any f ∈ Fq [x1 , . . . , xN ]D , the reciprocal roots and poles of L(A^N , f^*Lψ , T ) have
q-adic valuation greater than or equal to πp (D).
2. Moreover, this bound is optimal in the sense that there exists a polynomial f in
F[x1 , . . . , xN ]D such that a reciprocal root or pole of L(AN , f ∗ Lψ , T ) has q-adic valuation
equal to πp (D).
12.8.18 Remark For f ∈ Fq [x1 , . . . , xN ] a degree d polynomial, there are many lower bounds in the
literature for the q-adic valuation of the exponential sum
\[ S(f) := \sum_{(x_1, \ldots, x_N) \in \mathbb{F}_q^N} \psi(f(x_1, \ldots, x_N)), \]
giving in turn lower bounds for the valuations of the reciprocal roots and poles of the
associated L-function. We give a brief account of these results here.
1. Sperber [2698] proves the uniform bound vq (S(f )) ≥ N/d; then, with Adolphson, he
gives the bound in Theorem 12.8.11. This last bound is the best possible uniform one, since
it is attained for some large enough p.
2. Later on, Moreno and Moreno take into account the characteristic [2151] via Weil
descent; this leads to generally better bounds, which are less uniform however. Recently, Moreno,
Shum, Castro, and Kumar [2153] give a bound depending on the exponents effectively
appearing in the polynomial, and on the cardinality of the field (namely σD,p (m)/(m(p − 1)), with
q = p^m ). Note this last bound depends on too many parameters to say anything about the
valuations of the reciprocal roots and poles of the L-function.
12.8.19 Remark In this subsection, we describe lower bounds for the Newton polygons associated
to zeta functions of smooth projective varieties, to L-functions associated to an additive
character and a Laurent polynomial (toric exponential sums) or a rational function of one
variable. In the case of zeta functions, these bounds come from the Hodge numbers of related
varieties in characteristic 0; for this reason we shall call these bounds Hodge polygons.
12.8.20 Definition Let X be a smooth projective variety of dimension n, defined over Fq . For
any 0 ≤ m ≤ 2n, define NPm (X) as the q-adic Newton polygon of the characteristic
polynomial det(1 − FX t, H^m(X, Qℓ)) of the action of Frobenius on the m-th étale
cohomology space.
Define the Hodge polygon in degree m of X as the polygon HPm (X) having slope i
with multiplicity h^{i,m−i} := dim H^{m−i}(X, Ω^i) for 0 ≤ i ≤ m.
12.8.21 Theorem [29, 255, 2041] For any 0 ≤ m ≤ 2n, the polygon NPm (X) lies on or above the
polygon HPm (X), and they have the same endpoints.
12.8.22 Remark There is an analogous result in the case of a smooth complete intersection in
Gm × An [27].
12.8.23 Definition Let X be a smooth projective variety of dimension n, defined over Fq . The
variety X is ordinary when we have NPm (X) = HPm (X) for any 0 ≤ m ≤ 2n.
12.8.24 Definition Notations and assumptions are as in Theorem 12.7.49. We set ∆ := ∆∞ (f ), and
denote by NPq (f ) the q-adic Newton polygon of the polynomial L(G^N_m , f^*Lψ , T )^{(−1)^{N+1}} .
Denote by C(∆) := R+ ∆ the cone of ∆ in R^N , M∆ := C(∆) ∩ Z^N the monoid
associated to this cone, and A∆ the algebra k[x^{M∆} ]. One can define a map from C(∆)
to R+ , the weight associated to ∆, by
to which we associate the Poincaré series P_{A∆}(t) := Σ_{i≥0} dim A_{∆, i/D} t^i .
12.8.25 Proposition [1802, Lemme 2.9] The series P_{A∆}(t) is a rational function. Precisely, the series
P∆ (t) := (1 − t^D )^N P_{A∆}(t) is a polynomial with degree less than or equal to N D, such that
P∆ (1) = N ! Vol(∆).
12.8.26 Example Let f be a polynomial of degree d in the N variables x1 , . . . , xN , containing
the monomials x1^d , . . . , xN^d with non-zero coefficients; its Newton polytope at infinity is the
simplex with vertices (0, . . . , 0), (d, . . . , 0), . . . , (0, . . . , d). The associated cone is R^N_+ , the
weight is w∆ (u1 , . . . , uN ) = (1/d) Σ ui , and the denominator is d. In this case the Poincaré series
can be written P_{A∆}(t) = 1/(1 − t)^N .
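For this simplex the statement of Proposition 12.8.25 can be verified by hand or by machine. The sketch below, with the hypothetical function name `poincare_numerator`, expands 1/(1 − t)^N by counting monomials of each degree, multiplies by (1 − t^d)^N, and checks that the result is a polynomial of degree at most N(d − 1) whose value at t = 1 is d^N = N! Vol(∆), using Vol(∆) = d^N/N! for the simplex.

```python
from math import comb

def poincare_numerator(N, d):
    """Coefficients, up to degree N*d, of (1 - t^d)^N / (1 - t)^N."""
    K = N * d
    # dim of the degree-i part of k[x_1..x_N] is C(i + N - 1, N - 1)
    series = [comb(i + N - 1, N - 1) for i in range(K + 1)]
    for _ in range(N):
        # multiply the truncated series by (1 - t^d)
        series = [series[i] - (series[i - d] if i >= d else 0)
                  for i in range(K + 1)]
    return series

coeffs = poincare_numerator(2, 3)
print(coeffs, sum(coeffs))
```

Equivalently, the numerator is (1 + t + · · · + t^{d−1})^N, which makes both properties visible at a glance.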
12.8.31 Definition Let d1 , . . . , ds denote positive integers. We define the Hodge polygon
HP(d1 , . . . , ds ) as the polygon with slopes 0 and 1 with multiplicity s − 1, and slopes
1/d1 , . . . , (d1 − 1)/d1 , . . . , 1/ds , . . . , (ds − 1)/ds , each with multiplicity 1.
12.8.32 Theorem [3069, Theorem 1.1] Let f ∈ Fq (x) be a rational function having s poles of prime
to p orders d1 , . . . , ds , and X denote the projective line with the poles removed. Then the
q-adic Newton polygon of the function L(X, f ∗ Lψ , T ) lies on or above the Hodge polygon
HP(d1 , . . . , ds ), and they have the same endpoints.
12.8.33 Remark Lower bounds for Newton polygons of L-functions associated to a character of order
p^l , evaluated at a Witt vector of functions, pure or twisted by a multiplicative character can
be found in [1948, 1950]. See also [1949], in which T -adic exponential sums are introduced,
giving a framework in which to unify the study of the p-adic properties of these L-functions
when l varies.
12.8.34 Remark There are also some results and conjectures on the p-adic theory of L-functions
associated to multiplicative characters; see [20] and the references therein.
12.8.38 Definition The polygon GNP(S, p) defined above is the generic Newton polygon of the
family ft , t ∈ S. When it exists, the Hasse polynomial for the generic Newton polygon
of this family is the polynomial generating the ideal defining the Zariski closed subset
S\U.
12.8.39 Remark One can define more general Hasse polynomials, for instance for the first vertex,
or for the first m vertices.
12.8.40 Example Set $f_\lambda(x, y, z) := z(y^2 - x(x-1)(x-\lambda))$ for $\lambda \in \mathbb{A}^1_{\mathbb{F}_q}(\mathbb{F}_p) \setminus \{0, 1\}$. The generic
Newton polygon has vertices (0, 0), (1, 1), (2, 3), and the Hasse polynomial for the generic
Newton polygon (actually for the vertex (1, 1)) is the polynomial
$F(\lambda) = \sum_{i=0}^{(p-1)/2} \binom{(p-1)/2}{i}^{2} \lambda^i$. In
other words, the supersingular elliptic curves in Legendre form are those for which F (λ) = 0.
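The criterion F(λ) = 0 can be checked numerically for small primes. The following is a minimal sketch, assuming p is an odd prime and restricting to parameters λ in the prime field; the function names are illustrative and not taken from the text. It also compares the criterion with a brute-force count of affine points, using the standard fact that a curve over F_p with p odd is supersingular exactly when its trace of Frobenius vanishes modulo p.

```python
from math import comb

def legendre_hasse_coeffs(p):
    """Coefficients (low to high) of F(lambda) = sum_i C((p-1)/2, i)^2 lambda^i mod p."""
    m = (p - 1) // 2
    return [comb(m, i) ** 2 % p for i in range(m + 1)]

def eval_poly_mod(coeffs, x, p):
    """Evaluate a polynomial (low-to-high coefficients) at x modulo p (Horner)."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def supersingular_lambdas(p):
    """Parameters lambda in F_p, lambda not 0 or 1, with F(lambda) = 0."""
    F = legendre_hasse_coeffs(p)
    return [lam for lam in range(2, p) if eval_poly_mod(F, lam, p) == 0]

def affine_points(p, lam):
    """Brute-force number of (x, y) in F_p^2 with y^2 = x(x-1)(x-lam)."""
    roots = {}
    for y in range(p):
        roots[y * y % p] = roots.get(y * y % p, 0) + 1
    return sum(roots.get(x * (x - 1) * (x - lam) % p, 0) for x in range(p))
```

For instance, `supersingular_lambdas(7)` lists the supersingular Legendre parameters lying in F_7; parameters in larger extensions are outside the scope of this sketch.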
12.8.41 Theorem [2899, 2910] Let $S_\Delta$ parametrize the space of Laurent polynomials over $\mathbb{F}_p$ with
Newton polytope $\Delta \subset \mathbb{R}^N$, non-degenerate with respect to it.
1. If N ≤ 3, then $\mathrm{GNP}(S_\Delta, p) = \mathrm{HP}(\Delta)$ for any p ≡ 1 (mod D(∆)).
2. If N ≥ 4, there exists an integer $D'(\Delta)$ (in general strictly greater than D(∆))
depending only on ∆ such that $\mathrm{GNP}(S_\Delta, p) = \mathrm{HP}(\Delta)$ for any large enough prime p such
that p ≡ 1 (mod $D'(\Delta)$).
12.8.42 Remark The L-function associated to f has its coefficients in $\mathbb{Q}_p(\zeta_p)$, which is a totally
ramified extension of $\mathbb{Q}_p$ of degree p − 1. Since the ordinates for the vertices of the Hodge
polygon are in $\frac{1}{D(\Delta)}\mathbb{N}$, a necessary condition in order to have $\mathrm{GNP}(S_\Delta, p) = \mathrm{HP}(\Delta)$ is p ≡ 1
(mod D(∆)). The theorem above shows that it is not a sufficient condition for N ≥ 4.
12.8.43 Remark When N = 1, something stronger is true: if p ≡ 1 (mod lcm(d, d′)), then for any
$f(x) = \sum_{i=-d'}^{d} a_i x^i \in \mathbb{F}_q[x, x^{-1}]$ with $a_d a_{-d'} \neq 0$, we have $\mathrm{NP}(f) = \mathrm{HP}([-d', d])$.
12.8.44 Remark [3069] More generally, this result remains true for rational functions of one variable
with poles of orders d1 , . . . , ds when we have p ≡ 1 (mod lcm(d1 , . . . , ds )).
12.8.45 Theorem [2553] Let D = {1, . . . , d}; if p > 2d, the first vertex of the generic Newton
polygon $\mathrm{GNP}(S_D, p)$ is $\left(1, \frac{1}{p-1}\left\lceil \frac{p-1}{d} \right\rceil\right)$, with Hasse polynomial
$\{f^{\lceil (p-1)/d \rceil}\}_{p-1}$, the polynomial
associating to the coefficients $a_1, \dots, a_d$ of f the degree p − 1 coefficient of the $\lceil \frac{p-1}{d} \rceil$-th
power of f .
12.8.46 Theorem [293] Let D ⊂ N be finite, d = max D, and consider SD , the affine variety
parametrizing degree d polynomials whose monomials have their exponents in D. The first
slope of the generic Newton polygon GNP(SD , p) is equal to the p-density πp (D) of the set
D.
12.8.47 Theorem [291] Let p = 2, and $D = \{1 \le i \le d,\ 2 \nmid i\}$; set $S_D = \operatorname{Spec} \mathbb{F}_p[\{a_i\}_{i \in D}, a_d^{-1}]$ (we
parametrize the polynomials by their coefficients). The first vertex of the generic Newton
polygon $\mathrm{GNP}(S_D, p)$ is
1. (n, 1) if $2^n - 1 \le d < 2^{n+1} - 3$, with Hasse polynomial $a_{2^n-1}$;
2. (2n, 2) if $d = 2^{n+1} - 3$, with Hasse polynomial $a_{3 \cdot 2^{n-1}-1}$;
3. in the second case, if we consider $D' = D \setminus \{3 \cdot 2^{n-1} - 1\}$, the first vertex of the generic
Newton polygon $\mathrm{GNP}(S_{D'}, p)$ is (n, 1), with Hasse polynomial $a_{2^n-1}$.
12.8.48 Remark The first result in the direction of Theorem 12.8.47 can be found in [2552] where
the authors determine the first slope and necessary conditions to get it. There are also
results for the first vertex in any characteristic p [291], but only dealing with the cases
$p^n - 1 \le d \le 2p^n - 2$.
12.8.49 Theorem [295] Let d ≥ 2 be an integer prime to p, and $S_d = \operatorname{Spec} \mathbb{F}_p[a_1, \dots, a_d, a_d^{-1}]$
parametrize the degree d polynomials. If p ≥ 3d, the generic Newton polygon $\mathrm{GNP}(S_d, p)$
has vertices
$\left(i, \frac{Y_i}{p-1}\right)_{0 \le i \le d-1}, \qquad Y_i = \sum_{j=1}^{i} \left\lceil \frac{pj - i}{d} \right\rceil,$
and the Hasse polynomial for the i-th vertex is a polynomial in $\mathbb{F}_p[a_1, \dots, a_d]$, homogeneous
of degree $Y_i$.
12.8.50 Remark The Newton polygons corresponding to degree 4 [1528], degree 6 [1529] polynomials,
and to the family $x^d + \lambda x$ when p is large enough and p ≡ −1 (mod d) [3028] are
completely determined.
12.8.51 Remark The generic Newton polygons associated to twisted one variable sums (coming
from the product of an additive character evaluated at a Laurent polynomial and of a
multiplicative character evaluated at the variable) are given in [296] for p large enough.
Using the Poisson formula, they give the generic Newton polygons attached to families of
polynomials $P(x^s)$, deg P = d.
12.8.52 Theorem [3067, 3068]
1. We have the limit $\lim_{p \to \infty} \mathrm{GNP}(S_d, p) = \mathrm{HP}(d)$.
2. There is a Zariski dense open subset U in $\operatorname{Spec} \mathbb{Q}[a_1, \dots, a_d, a_d^{-1}]$ (parametrizing the
degree d polynomials defined over Q) such that for any f ∈ U we have $\lim_{p \to \infty} \mathrm{NP}(f \bmod p) = \mathrm{HP}(d)$.
12.8.53 Remark In the case of exponential sums twisted by a multiplicative character of order s
(or for the family of polynomials $P(x^s)$, deg P = d), the limit no longer exists as in the first
assertion above. Actually the limit exists if we restrict to the primes in a fixed residue class
modulo s [296], and there is a result similar to the second assertion in this case.
12.8.54 Remark For L-functions associated to a one variable rational function of fixed pole orders
d1 , . . . , ds , the generic Newton polygon tends to the Hodge polygon HP(d1 , . . . , ds ) when p
tends to infinity [1915].
12.8.55 Problem There are conjectures by Wan [2899] asserting that under certain additional
hypotheses, Theorem 12.8.52 remains true for the space $S_\Delta$ of polynomials over $\mathbb{F}_p$ with
Newton polytope at infinity $\Delta \subset \mathbb{R}^N$, non-degenerate with respect to it. Actually a
consequence of Theorem 12.8.41 is that $\liminf_{p \to \infty} \mathrm{GNP}(S_\Delta, p) = \mathrm{HP}(\Delta)$. Some special cases of
these conjectures are proved in [292].
12.8.56 Remark We consider the Newton polygons of curves. To each curve one can associate its
Jacobian variety, and more generally we consider the Newton polygons of abelian varieties.
They encode useful invariants, such as the p-rank. We give the stratification of the space of
principally polarized abelian varieties by their Newton polygon as described in the work of
Oort and others. Then we focus on curves; even if the situation is less well-known than in
the case of abelian varieties, the subject has drawn much attention and we give the principal
results. We end the section with some remarks about Artin-Schreier curves.
12.8.57 Definition The Newton polygon of a curve C defined over $\mathbb{F}_q$ is the q-adic Newton polygon
of the numerator L(C, T ) of its zeta function.
Let A be an abelian variety of dimension g defined over $\mathbb{F}_q$; for any prime $\ell \neq p$, the
inverse limit of the $\ell$-th power torsion subgroups of A is the $\ell$-adic Tate module of A,
a $\mathbb{Z}_\ell$-module of rank 2g. Let $P_A(T)$ denote the characteristic polynomial of the action
of Frobenius (q-th power) on the $\mathbb{Q}_\ell$-vector space $T_\ell(A) \otimes \mathbb{Q}_\ell$. The Newton polygon of
the abelian variety A defined over $\mathbb{F}_q$ is the q-adic Newton polygon of the polynomial
$T^{2g} P_A(1/T)$.
12.8.58 Remark The Newton polygon of a curve C coincides with that of its Jacobian variety $J_C$,
as defined above.
12.8.59 Remark For a curve of genus g, or an abelian variety of dimension g, the Newton polygon
starts at (0, 0) and ends at (2g, g). Moreover, it follows from Poincaré duality that it is
symmetric in the sense that if it contains the slope s with multiplicity m, it also contains
the slope 1 − s with the same multiplicity.
12.8.60 Remark Some authors consider the Newton polygon of the polynomial PA (T ); this Newton
polygon is symmetric to the one we consider here, with respect to the line x = g.
12.8.61 Definition A genus g curve (an abelian variety of dimension g) is ordinary when its Newton
polygon has the slopes 0 and 1 each with multiplicity g; it is supersingular when it has
the slope 1/2 with multiplicity 2g.
A polygon satisfying the requirements of Remark 12.8.59 is admissible.
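To make the definitions concrete, the q-adic Newton polygon of an integer polynomial can be computed as the lower convex hull of the points (i, v_p(c_i)/h), where q = p^h and c_i is the coefficient of T^i. The sketch below is one straightforward way to do this; the function names are illustrative, and the sample L-polynomials used with it are chosen only to show the ordinary and supersingular shapes.

```python
from fractions import Fraction

def valuation(n, p):
    """p-adic valuation of a nonzero integer n."""
    n, v = abs(n), 0
    while n % p == 0:
        n //= p
        v += 1
    return v

def newton_polygon(coeffs, p, h=1):
    """Vertices of the q-adic Newton polygon (q = p^h) of sum_i coeffs[i] * T^i.

    The polygon is the lower convex hull of the points (i, v_p(coeffs[i]) / h).
    """
    pts = [(i, Fraction(valuation(c, p), h)) for i, c in enumerate(coeffs) if c != 0]
    hull = []
    for x, y in pts:
        # Drop the last vertex while it lies on or above the segment to the new point.
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            if (y2 - y1) * (x - x1) >= (y - y1) * (x2 - x1):
                hull.pop()
            else:
                break
        hull.append((x, y))
    return hull

def slopes(vertices):
    """Slopes of the polygon, each repeated according to its multiplicity."""
    out = []
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:]):
        out += [(y2 - y1) / (x2 - x1)] * (x2 - x1)
    return out
```

For a genus 1 curve over F_3 with L(C, T) = 1 + 3T^2, `slopes(newton_polygon([1, 0, 3], 3))` gives the slope 1/2 twice (supersingular), while the coefficient list `[1, -1, 3]` gives slopes 0 and 1 (ordinary). In both cases the polygon ends at (2g, g) and the slope multiset is stable under s ↦ 1 − s, as in Remark 12.8.59.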
12.8.62 Theorem [1769] Curves of genus g (resp. abelian varieties of dimension g) are generically
ordinary.
12.8.63 Theorem [2322] Let $\mathcal{A}_{g,1} \otimes \mathbb{F}_p$ denote the space parametrizing principally polarized abelian
varieties of dimension g defined over $\mathbb{F}_p$. It has dimension $\frac{g(g+1)}{2}$.
Let ζ be an admissible polygon; the space Wζ of principally polarized abelian varieties
having their Newton polygon lying on or above ζ is closed, and has dimension
12.8.64 Remark There are more precise results by Li and Oort on the supersingular stratum [1919].
12.8.65 Problem As a consequence, every admissible polygon is the Newton polygon of an abelian
variety. It is not known whether this is true for curves. Van der Geer and Van der Vlugt
have shown that for p = 2, there are supersingular curves of every genus [2843], but this is
not even known in odd characteristic.
12.8.66 Definition The p-rank of the curve C (resp. of the abelian variety A) is the integer in
{0, . . . , g} defined as either the length of the horizontal segment of its Newton polygon,
or the dimension of the Fp -vector space JC [p] (resp. A[p]).
12.8.67 Remark From Theorem 12.8.63, the space of principally polarized abelian varieties having
p-rank f has codimension g − f in Ag,1 ⊗ Fp ; this is also true in the space Mg of genus
g curves [1022] (which has dimension 3g − 3) and in the space Hg of genus g hyperelliptic
curves [1281] (of dimension 2g − 1).
12.8.68 Definition The Hasse-Witt matrix of a non-singular curve C of genus g is the matrix of
the Frobenius (p-th power) mapping on the g dimensional space H 1 (C, OC ).
12.8.69 Remark Via Serre’s duality, the Hasse-Witt matrix is the transpose of the matrix of the
Cartier-Manin operator [553] on the space of differentials of the first kind.
12.8.70 Theorem [1999] Let H denote the Hasse-Witt matrix of C, a curve defined over $\mathbb{F}_q$, $q = p^m$,
and $H^{(p^i)}$ denote the matrix obtained by raising all coefficients of H to the power $p^i$. Let
$H_a := H H^{(p)} \cdots H^{(p^{a-1})}$. Then
1. The rank of $H_g$ is the p-rank of the curve C.
2. We have the congruence $L(C, T) \equiv \det(I_g - T H_m) \pmod{p}$.
12.8.71 Example With respect to the basis dual to the one given in Example 12.1.85, the Hasse-Witt
matrix of the hyperelliptic curve $y^2 = f(x)$ is the matrix $\left(\{f^{(p-1)/2}\}_{pi-j}\right)_{1 \le i,j \le g}$.
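The matrix in this example, together with the rank statement of Theorem 12.8.70, can be computed directly for small cases. The following is a minimal sketch under several assumptions: the curve is defined over the prime field F_p with p odd, f has degree 2g + 1 or 2g + 2 and is given by its coefficient list (low to high), and the notation {h}_k is read as the coefficient of x^k in h. Over F_p the twisted product H_g reduces to the ordinary power H^g. All function names are illustrative.

```python
def poly_mul(a, b, p):
    """Product of two polynomials (low-to-high coefficient lists) over F_p."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                out[i + j] = (out[i + j] + ai * bj) % p
    return out

def poly_pow(f, e, p):
    """f**e over F_p by repeated squaring."""
    result, base = [1], [c % p for c in f]
    while e:
        if e & 1:
            result = poly_mul(result, base, p)
        base = poly_mul(base, base, p)
        e >>= 1
    return result

def hasse_witt(f, p):
    """Matrix (c_{p*i - j}), 1 <= i, j <= g, with c_k the x^k-coefficient of f^((p-1)/2)."""
    g = (len(f) - 2) // 2                      # deg f = 2g + 1 or 2g + 2
    c = poly_pow(f, (p - 1) // 2, p)
    coeff = lambda k: c[k] if 0 <= k < len(c) else 0
    return [[coeff(p * i - j) for j in range(1, g + 1)] for i in range(1, g + 1)]

def mat_mul(A, B, p):
    return [[sum(a * b for a, b in zip(row, col)) % p for col in zip(*B)] for row in A]

def rank_mod_p(M, p):
    """Rank of an integer matrix over F_p by Gaussian elimination."""
    M = [[x % p for x in row] for row in M]
    rank = 0
    for col in range(len(M[0]) if M else 0):
        pivot = next((r for r in range(rank, len(M)) if M[r][col]), None)
        if pivot is None:
            continue
        M[rank], M[pivot] = M[pivot], M[rank]
        inv = pow(M[rank][col], -1, p)
        M[rank] = [x * inv % p for x in M[rank]]
        for r in range(len(M)):
            if r != rank and M[r][col]:
                factor = M[r][col]
                M[r] = [(x - factor * y) % p for x, y in zip(M[r], M[rank])]
        rank += 1
    return rank

def p_rank(f, p):
    """Rank of H^g, which equals H_g when the curve is defined over F_p."""
    H = hasse_witt(f, p)
    P = H
    for _ in range(len(H) - 1):
        P = mat_mul(P, H, p)
    return rank_mod_p(P, p)
```

As a sanity check, `p_rank([0, -1, 0, 1], 3)` returns 0 for the curve y^2 = x^3 − x over F_3, which is supersingular. For f of odd degree 2g + 1 the trace of H is also congruent modulo p to minus the number of affine points, in line with the congruence of Theorem 12.8.70.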
12.8.72 Remark There are more general congruences for the factor of the zeta function of an
hypersurface coming from the primitive middle cohomology space [799].
12.8.73 Theorem [752, Corollary 1.8] (Deuring-Shafarevich formula) Let C and C′ be two curves
with function fields E and F ; assume that the extension E/F is Galois with Galois group
G a p-group. If f and f′ denote the respective p-ranks of C and C′, then the relation
$f - 1 = \#G\,(f' - 1) + \sum_{x \in C(k)} (e_x - 1)$
holds, where $e_x$ is the ramification index of the place x in the extension E/F .
12.8.74 Example Let E = k(x, y) be the extension of the rational function field k(x) defined by
the (Artin-Schreier) equation $y^p - y = f(x)$; assume that f has s poles with orders prime
to p. Then E/k(x) is a Galois extension with Galois group Z/pZ, each pole of f is totally
ramified, and the p-rank of E is f = (s − 1)(p − 1); in particular f = s − 1 when p = 2.
12.8.75 Remark One can deduce a stratification by the p-rank of the space of Artin-Schreier curves
(i.e., Artin-Schreier coverings of the projective line) [2430].
12.8.76 Remark [3068] We have an expression for the numerator of the zeta function of the
Artin-Schreier curve $C : y^p - y = f(x)$ from the L-functions L(af, T ) associated to the sums
$\sum'_{x \in \mathbb{P}^1} \psi(af(x))$, where the sum $\sum'$ is taken over the points which are not poles of f . Precisely
we have $L(C, T) = \prod_{a \in \mathbb{F}_p^{*}} L(af, T)$. The L-functions on the right are conjugate under the
action of $\mathrm{Gal}(\mathbb{Q}(\zeta_p)/\mathbb{Q})$; thus they all have the same Newton polygon, and $\mathrm{NP}_q(C)$ is the
dilation by the factor p − 1 of $\mathrm{NP}_q(f)$. As a consequence, the above determination of (parts
of) Newton polygons of L-functions associated to one variable exponential sums translates
to results on the Newton polygons of Artin-Schreier curves.
12.8.77 Remark The same argument gives the Newton polygon for curves with equation A(y) =
f (x), A an additive polynomial, as a dilation of the Newton polygon NPq (f ).
12.8.78 Remark We focus on the case p = 2: here Artin-Schreier curves are hyperelliptic curves.
From the remark above, we have $\mathrm{NP}_q(C) = \mathrm{NP}_q(f)$ for $C : y^2 + y = f(x)$. The stratification
of $\mathcal{H}_g$ by the 2-rank is described in [2430]: the irreducible components of the stratum $\mathcal{H}_{g,f}$ of
curves with 2-rank f are in bijection with the partitions of g + 1 into f + 1 positive
integers. Inside $\mathcal{H}_{g,0}$, one can reduce to f a polynomial, and Theorem 12.8.47 gives the first
vertex for the generic Newton polygon in this space. One can also deduce from these results
a theorem originally proved by Scholten and Zhu: there is no supersingular hyperelliptic
curve of genus $g = 2^n - 1$, n ≥ 2 in characteristic 2 [2552].
12.8.79 Remark This result stands in striking contrast with the situation of genus g = 2n [2842].
In this case the dimension of the space of hyperelliptic supersingular curves is greater than
or equal to n; it can be as large as possible.
12.8.80 Remark For g ≤ 8, the supersingular hyperelliptic curves are completely determined when
p = 2 in [2551], and when p = 3 in [291]. When p = 2 one can also give the first vertices
occurring for Newton polygons of curves in $\mathcal{H}_{g,0}$ for g ≤ 9 [291].
12.8.81 Remark There are results asserting the non-existence of supersingular Artin-Schreier curves
in odd characteristic for some infinite families of genera [291].
See Also
References Cited: [20, 21, 23, 25, 26, 27, 29, 31, 150, 255, 291, 292, 293, 295, 296, 553, 752,
798, 799, 940, 941, 942, 1022, 1281, 1397, 1528, 1529, 1571, 1699, 1754, 1769, 1770, 1802,
1915, 1919, 1948, 1949, 1950, 1999, 2041, 2151, 2153, 2322, 2430, 2551, 2552, 2553, 2698,
2842, 2843, 2895, 2896, 2897, 2899, 2910, 3028, 3041, 3067, 3068, 3069]
12.9.1 Remark As in Section 7.1, we shall restrict to the case of hypersurfaces. We focus on
theoretical and deterministic results. Probabilistic algorithms and improvements are not
discussed. As always, Fq denotes a finite field of characteristic p. The time for algorithms
means the number of field operations.
12.9.2 Definition For a polynomial $f \in \mathbb{F}_q[x_1, \dots, x_n]$, the sparse representation of f is the sum
of its non-zero terms
$f(x_1, \dots, x_n) = \sum_{j=1}^{m} a_j x^{V_j}, \quad a_j \in \mathbb{F}_q^{*},$
where
$V_j = (v_{1j}, \dots, v_{nj}), \quad x^{V_j} = x_1^{v_{1j}} \cdots x_n^{v_{nj}}.$
The point counting problem is to compute the number $\#A_f(\mathbb{F}_q)$ of $\mathbb{F}_q$-rational points of
the equation f = 0.
12.9.3 Remark For this problem, we may replace $x_i^q$ by $x_i$ and assume that the degree of f in each
variable is at most q − 1. The sparse input size of f is then mn log(q).
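The sparse representation and the exponent reduction of the remark above can be illustrated by a naive counter. This is only a sketch under stated assumptions: q = p is prime (so field arithmetic is arithmetic modulo p), a polynomial is a list of pairs (coefficient, exponent vector), and the count is by exhaustive search over all p^n points, which is exponential in the sparse input size. The function names are illustrative.

```python
from itertools import product

def reduce_exponents(terms, p):
    """Replace x_i^p by x_i: a positive exponent v becomes ((v - 1) mod (p - 1)) + 1."""
    reduced = {}
    for coeff, exps in terms:
        key = tuple(0 if v == 0 else (v - 1) % (p - 1) + 1 for v in exps)
        reduced[key] = (reduced.get(key, 0) + coeff) % p
    # Keep only non-zero terms, as required by the sparse representation.
    return [(c, e) for e, c in reduced.items() if c]

def count_points(terms, n, p):
    """Brute-force #A_f(F_p) for f given as a list of (coefficient, exponent vector)."""
    count = 0
    for x in product(range(p), repeat=n):
        value = 0
        for coeff, exps in terms:
            mono = coeff
            for xi, v in zip(x, exps):
                mono = mono * pow(xi, v, p) % p
            value = (value + mono) % p
        count += value == 0
    return count
```

For example, the circle x^2 + y^2 − 1 over F_5 is `[(1, (2, 0)), (1, (0, 2)), (4, (0, 0))]`, and reducing the exponents of x^6 + y^10 − 1 yields the same polynomial function, hence the same count.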
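12.9.4 Example For non-zero elements $a_i \in \mathbb{F}_q$ and $b \in \mathbb{F}_q$, let
$f(x) = a_1 x_1^{q-1} + \cdots + a_n x_n^{q-1} + b.$
Since $x_i^{q-1}$ takes only the values 0 and 1 on $\mathbb{F}_q$, we have $\#A_f(\mathbb{F}_q) > 0$ if and only if
there is a subset $\{i_1, \dots, i_k\} \subseteq \{1, \dots, n\}$ such that
$a_{i_1} + \cdots + a_{i_k} + b = 0.$
The latter problem is the subset sum problem over $\mathbb{F}_q$, which is well known to be NP-
complete. The fastest known deterministic algorithm for deciding if $\#A_f(\mathbb{F}_q) > 0$ in this
case is the baby-step-giant-step method, which runs in time $O(n 2^{n/2} \log(q))$.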
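The baby-step-giant-step idea for this example can be sketched as a meet-in-the-middle search: split the coefficients into two halves, tabulate the subset sums of each half, and look for a pair of sums adding up to −b. The code below assumes q = p is prime and stores the sums in hash sets rather than sorted tables, so it illustrates the 2^{n/2} behaviour without reproducing the exact operation count stated above; the names are illustrative.

```python
def subset_sums(values, p):
    """Set of all subset sums of values modulo p (the empty subset gives 0)."""
    sums = {0}
    for v in values:
        sums |= {(s + v) % p for s in sums}
    return sums

def has_zero(a, b, p):
    """Decide whether a_1 x_1^(p-1) + ... + a_n x_n^(p-1) + b has a zero over F_p.

    Since x^(p-1) is 0 or 1 on F_p, this asks for a (possibly empty) subset
    of the a_i whose sum is -b modulo p.
    """
    half = len(a) // 2
    left = subset_sums(a[:half], p)      # at most 2^(n/2) sums
    right = subset_sums(a[half:], p)     # at most 2^(n/2) sums
    target = (-b) % p
    return any((target - s) % p in right for s in left)
```

The empty subset corresponds to the point x = (0, . . . , 0), so `has_zero(a, 0, p)` is always true.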
12.9.5 Theorem [1231] Computing #Af (Fq ) is NP-hard, even in the case n = 2 or deg(f ) = 3.
12.9.6 Remark For a positive integer r > 1, the modular counting problem is to compute the
residue class of $\#A_f(\mathbb{F}_q)$ modulo r. It is clear that $0 \le \#A_f(\mathbb{F}_q) \le q^n$. Thus, if one can
compute $\#A_f(\mathbb{F}_q)$ modulo r for a single large $r > q^n$ or for many small r, the Chinese
remainder theorem implies that one can compute $\#A_f(\mathbb{F}_q)$ as well. This suggests that even
for small r, the modular counting problem is not going to be much easier than the full
counting problem.
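The reduction from full counting to modular counting can be made explicit with the Chinese remainder theorem: once the residues of the count are known modulo pairwise coprime moduli whose product exceeds q^n, the count itself is determined. A minimal sketch (the function name and the sample numbers are illustrative):

```python
from math import gcd

def crt(residues, moduli):
    """Combine x = r_i (mod m_i), pairwise coprime m_i, into (x mod M, M) with M = prod m_i."""
    x, M = 0, 1
    for r, m in zip(residues, moduli):
        assert gcd(M, m) == 1, "moduli must be pairwise coprime"
        # Choose t with x + M * t = r (mod m); the new x stays in [0, M * m).
        t = (r - x) * pow(M, -1, m) % m
        x += M * t
        M *= m
    return x, M
```

For instance, with n = 4 and q = 5 the count is at most 625, so its residues modulo 7, 11 and 13 (product 1001) already determine it.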
12.9.7 Theorem [1318] Let r be a positive integer. Let $q = p^h$. If r is not a power of p, then
computing $\#A_f(\mathbb{F}_q)$ modulo r is NP-hard. If $r = p^b$ is a power of p, computing $\#A_f(\mathbb{F}_q)$
modulo r is also NP-hard, if either p ≥ 2n or h ≥ 2n or b > nh, that is, $r = p^b > q^n$.
12.9.8 Remark This complexity result shows that if r is not a power of p, one cannot expect a
fast algorithm to compute $\#A_f(\mathbb{F}_q)$ modulo r. Even in the case $r = p^b$ is a power of p,
any general algorithm computing $\#A_f(\mathbb{F}_q)$ modulo $p^b$ is expected to be fully exponential in
each of the three parameters {p, b, h}. The next two results provide non-trivial algorithms
in this direction.
12.9.9 Theorem [1318] Let $q = p^h$ and $r = p^b$. The number $\#A_f(\mathbb{F}_q)$ modulo $p^b$ can be computed
in time $O(nm^{2qb}) = O(nm^{2p^h b})$, where m is the number of monomials of f .
12.9.10 Theorem [2903] Let $q = p^h$ and $r = p^b$. The number $\#A_f(\mathbb{F}_q)$ modulo $p^b$ can be computed
in time $O(n(8m)^{p(h+b)})$, where m is the number of monomials of f .
12.9.11 Problem Improve the exponent p(h + b) to O(p + h + b) if possible.
12.9.13 Remark Replacing $x_i^q$ by $x_i$, we may again assume that the degree of f in each variable
is at most q − 1. The dense input size of f is then $(d + 1)^n \log(q)$.
12.9.14 Remark There is no known complexity result for the general point counting problem with
dense input. On the contrary, there are polynomial time algorithms in various special cases.
This suggests that the dense input point counting problem may have polynomial time
algorithms in much greater generality. We shall describe some of these positive results
below. In the special case that both n and d are fixed, the sparse input size agrees with the
dense input size. This is the case for elliptic curves for instance.
12.9.15 Theorem [1866] There is a p-adic algorithm which computes the number $\#A_f(\mathbb{F}_q)$ in time
$O(p^{2n+4} (d^n \log_p q)^{3n+7})$.
12.9.16 Remark This is a general purpose algorithm, which runs in polynomial time if p is small and
n is fixed. It assumes no conditions on the affine hypersurface Af . If one assumes additional
conditions on f , significant improvements can be made. In the following, we give several
such examples.
12.9.17 Theorem [1862] Assume that both the affine hypersurface $A_f$ and its infinite part are
smooth of degree d not divisible by p > 2. Then, the number $\#A_f(\mathbb{F}_q)$ can be computed in
time $O(p^{2+\epsilon} (d^n \log_p q)^{O(1)})$.
12.9.18 Remark It may be possible to improve the factor $p^{2+\epsilon}$ to $p^{0.5+\epsilon}$. This has been done in the
special case of hyperelliptic curves and more generally superelliptic curves [1429, 2109].
12.9.19 Theorem [18, 2394, 2560] Let n = 2 and assume that the projective curve defined by f is
smooth. Then, the number $\#A_f(\mathbb{F}_q)$ can be computed in time $O((\log q)^{c_d})$, where $c_d$ is a
constant depending only on d.
12.9.20 Remark The constant $c_d$ is in general exponential in d. For hyperelliptic curves, the constant
$c_d$ can be taken to be a polynomial in d. It is not clear if the same theorem is true for singular
plane curves.
12.9.21 Remark In the case of the diagonal hypersurface
$f(x_1, \dots, x_n) = a_1 x_1^{d_1} + \cdots + a_n x_n^{d_n} + b,$
the number $\#A_f(\mathbb{F}_q)$ has a compact expression in terms of Jacobi sums, which can be
computed in polynomial time using LLL lattice basis reduction. This is worked out in some
cases in [455].
is a rational function in T ; see Section 12.8 for more details. The degrees of its numerator
and the denominator can be bounded by a function depending only on the degree d of
the polynomial f and the number n of variables. The output size for the zeta function is
comparable to the dense input size $O(d^n \log q)$ of f . Thus, in computing the zeta function,
we always use the dense input size.
12.9.23 Theorem [2894] There is an algorithm, that given f ∈ Fq [x1 , . . . , xn ] of degree d, computes
the reduction of the zeta function Z(Af , T ) modulo p in time bounded by a polynomial in
p nd log q.
12.9.24 Remark This is a polynomial time algorithm if p is small. However, it only gives the modulo
p reduction of the zeta function. For any other prime $\ell \neq p$, no nontrivial general algorithm
is known which computes the reduction of the zeta function modulo $\ell$, except when n ≤ 2.
By the Chinese remainder theorem, this is not much easier than computing the full zeta
function in general.
12.9.25 Theorem [1866] There is an algorithm that, given $f \in \mathbb{F}_q[x_1, \dots, x_n]$ of degree d, computes
the zeta function $Z(A_f, T)$ in time bounded by a polynomial in $(d^n p \log q)^n$.
12.9.26 Remark This is a polynomial time algorithm if p is small and n is fixed. In the case that
f is sufficiently smooth, this result can be greatly improved as follows.
12.9.27 Theorem [1862, 1863] There is an algorithm that, given $f \in \mathbb{F}_q[x_1, \dots, x_n]$ of degree d,
computes the zeta function $Z(A_f, T)$ in time bounded by a polynomial in $d^n p \log q$, provided
that the affine hypersurface $A_f$ and its infinite part are both smooth and d is not divisible
by p > 2.
12.9.28 Remark In various special cases, one can expect significantly better results. For instance,
when n = 1, there is always a polynomial time algorithm which computes the zeta function
$Z(A_f, T)$ of the zero-dimensional hypersurface [2894]. The case n = 2 (the curve case)
has been studied most extensively. We state two such results in the next subsection. For a
diagonal hypersurface
$f(x_1, \dots, x_n) = a_1 x_1^{d_1} + \cdots + a_n x_n^{d_n} + b,$
the zeta function has an explicit expression in terms of Jacobi sums, which can then be
computed in polynomial time using LLL basis reduction; see [455] for the main ideas.
See Also
References Cited: [18, 455, 552, 564, 579, 816, 817, 1231, 1248, 1318, 1429, 1556, 1718,
1720, 1861, 1862, 1863, 1865, 1866, 1905, 2109, 2394, 2533, 2560, 2894, 2901, 2903].
13
Miscellaneous theoretical topics
The arithmetic structures of the ring of integers Z and the ring of polynomials Fq [x],
where q is a prime power, are strikingly similar. In particular, the densities of irreducible
elements in these rings are virtually identical, leading to very closely analogous theorems
(and conjectures) in the two settings. For a general exposition on the ideas contained in this
section, see, for example, [963]. Here we state some of these analogous definitions, theorems,
and conjectures, listing first the item involving Z and second its analogue in Fq [x]. The latter
of these first two definitions will be used throughout this section.
13.1.1 Definition For $m\ (\neq 0) \in \mathbb{Z}$, the absolute value of m, denoted |m|, is $|\mathbb{Z}/\langle m \rangle|$.
13.1.2 Definition For $f\ (\neq 0) \in \mathbb{F}_q[x]$, the absolute value of f , denoted |f |, is $|\mathbb{F}_q[x]/\langle f \rangle|$.
13.1.3 Remark Suppose the degree of f in the above definition is n. Since the quotient ring consists
of all polynomials of degree less than n, we see that in fact $|f| = q^n$. We note also that
$n = \log_q(|f|)$. These facts will be relevant frequently in what follows.
13.1.4 Remark Throughout this section we shall use the notation f (k) ∼ g(k) to mean that
limk→∞ f (k)/g(k) = 1. In addition, the notation log denotes the natural logarithm.
13.1.5 Theorem (The Prime Number Theorem) Let π(m) be the number of prime numbers less
than or equal to m. Then
$\pi(m) \sim \frac{m}{\log(m)}.$
13.1.6 Remark For the polynomial case, we employ the notation of Definition 2.1.23 but, in analogy
with integers, add the notation πq (n) to mean the number of monic irreducible polynomials
over Fq of degree less than or equal to n. The next result then follows immediately from
Theorem 2.1.24.
13.1.7 Theorem (The Polynomial Prime Number Theorem) If $I_q(n)$ is as in Definition 2.1.23, we
have
$I_q(n) \sim \frac{q^n}{n}.$
13.1.8 Remark We note that if $f \in \mathbb{F}_q[x]$ is of degree n, this theorem says that
$I_q(n) \sim \frac{|f|}{\log_q(|f|)},$
$\pi_q(n) \sim \frac{q}{q-1} \cdot \frac{q^n}{n}.$
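These asymptotics are easy to examine numerically. The sketch below assumes the classical closed form I_q(n) = (1/n) Σ_{d | n} μ(d) q^{n/d} for the number of monic irreducible polynomials of degree n over F_q (taken here as the content of the result cited as Theorem 2.1.24), and sums it to obtain π_q(n); the function names are illustrative.

```python
def mobius(n):
    """Moebius function mu(n) by trial division."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0            # square factor
            result = -result
        d += 1
    return -result if n > 1 else result

def irreducible_count(q, n):
    """I_q(n) = (1/n) * sum over d | n of mu(d) * q^(n/d)."""
    total = sum(mobius(d) * q ** (n // d) for d in range(1, n + 1) if n % d == 0)
    return total // n

def pi_q(q, n):
    """Number of monic irreducibles over F_q of degree at most n."""
    return sum(irreducible_count(q, k) for k in range(1, n + 1))

def ratio_to_estimate(q, n):
    """I_q(n) divided by the estimate q^n / n; tends to 1 as n grows."""
    return irreducible_count(q, n) * n / q ** n
```

Note that the ratio of π_q(n) to (q/(q − 1)) q^n / n converges to 1 more slowly, with a relative error of order 1/n.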
13.1.12 Theorem (Primes in Arithmetic Progression) Let the Euler function φ be as in Defini-
tion 2.1.43 and suppose a and d are relatively prime positive integers. By $\pi_{a,d}(m)$ we mean
the number of primes less than or equal to m which are congruent to a modulo d. Then
$\pi_{a,d}(m) \sim \frac{m}{\log(m)\,\varphi(d)}.$
13.1.13 Remark This then says that for any such pair {a, d} there are infinitely many primes
which are congruent to a (mod d), and moreover that the primes are “ultimately uniformly
distributed” among the eligible congruence classes of d. Artin [134] proved the following
analogous result for polynomials.
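The "ultimately uniform" distribution among the eligible classes can be observed with a simple sieve. The following sketch counts the primes up to m in each residue class a modulo d with gcd(a, d) = 1 and computes the common estimate m / (φ(d) log m); the helper names are illustrative, and convergence of the counts to the estimate is slow.

```python
from math import gcd, isqrt, log

def primes_up_to(m):
    """Sieve of Eratosthenes; returns the list of primes <= m (for m >= 1)."""
    sieve = bytearray([1]) * (m + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, isqrt(m) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, m + 1, i)))
    return [i for i in range(m + 1) if sieve[i]]

def counts_by_class(m, d):
    """Number of primes <= m in each class a mod d with gcd(a, d) = 1."""
    counts = {a: 0 for a in range(d) if gcd(a, d) == 1}
    for p in primes_up_to(m):
        if p % d in counts:
            counts[p % d] += 1
    return counts

def estimate(m, d):
    """The asymptotic value m / (phi(d) * log m) shared by all eligible classes."""
    phi = sum(1 for a in range(1, d + 1) if gcd(a, d) == 1)
    return m / (phi * log(m))
```

Comparing `counts_by_class(10**6, 4)` with `estimate(10**6, 4)` shows the two classes 1 and 3 modulo 4 receiving nearly equal shares of the primes.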
13.1.14 Theorem [134] Let the “polynomial Euler function” $\Phi_q$ be as in Definition 2.1.111 and
suppose A and D are relatively prime polynomials in $\mathbb{F}_q[x]$. By $I_{q;A,D}(n)$ we mean the
number of monic irreducible polynomials of degree n which are congruent to A modulo D.
Then
$I_{q;A,D}(n) \sim \frac{q^n}{n\,\Phi_q(D)}.$
13.1.15 Remark Hayes [1447] generalizes this result to a much broader class of congruence relations
which he calls “arithmetically distributed” relations. As an example, he shows that the
relation of two monic polynomials having the same first k and last m coefficients (see
Definition 3.5.1 in [1447]) is arithmetically distributed, and so the following theorem holds.
13.1.16 Theorem [1447] Let $I_{q;k,m}(n)$ be the number of monic irreducible polynomials of degree n
for which the first k and last m coefficients are prescribed. Then
$I_{q;k,m}(n) \sim \frac{q}{q-1} \cdot \frac{q^{n-k-m}}{n}.$
13.1.17 Remark This says that monic irreducibles are “ultimately uniformly distributed” with
respect to their first k and last m coefficients, provided of course that the constant term is
not 0. For much more information on irreducible polynomials with prescribed coefficients,
see Section 3.5.
13.1.18 Definition Two odd prime numbers are twin primes if the absolute value of their difference
is as small as possible, i.e., is 2.
13.1.19 Definition Two monic irreducible polynomials over Fq are twin irreducibles if the absolute
value of their difference is as small as possible, i.e., is 1 if q > 2 and is 4 if q = 2.
13.1.20 Remark This last definition implies that for q > 2, two monic irreducibles are twins provided
they are identical except in their constant terms (so their difference has degree 0). For q = 2,
however, all irreducibles of degree at least 2 have 1 as their constant coefficient and all have an
odd number of terms (otherwise they are divisible by x + 1), and so in this case twins will differ
in their linear and quadratic terms (so their difference has degree 2).
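For small parameters, twin irreducibles can be enumerated by brute force. The sketch below assumes q = p is an odd prime, represents a polynomial by its coefficient list (low to high), tests irreducibility by trial division, and counts unordered pairs of monic irreducibles of degree n that differ only in the constant term. It is meant for experiments with small n and p only, and the function names are illustrative.

```python
from itertools import product

def poly_rem(a, b, p):
    """Remainder of a on division by a monic polynomial b over F_p."""
    a = list(a)
    while len(a) >= len(b):
        lead = a[-1]
        if lead:
            shift = len(a) - len(b)
            for i, bi in enumerate(b):
                a[shift + i] = (a[shift + i] - lead * bi) % p
        a.pop()                      # the leading coefficient is now zero
    return a

def is_irreducible(f, p):
    """Trial division of monic f by all monic polynomials of degree 1..deg(f)//2."""
    n = len(f) - 1
    for d in range(1, n // 2 + 1):
        for tail in product(range(p), repeat=d):
            if not any(poly_rem(f, list(tail) + [1], p)):
                return False
    return n >= 1

def twin_pairs(p, n):
    """Unordered pairs of monic irreducibles of degree n over F_p (p odd)
    that agree except in their constant terms."""
    pairs = 0
    for upper in product(range(p), repeat=n - 1):   # coefficients of x^1 .. x^(n-1)
        k = sum(is_irreducible([c] + list(upper) + [1], p) for c in range(1, p))
        pairs += k * (k - 1) // 2
    return pairs
```

Over F_3 in degree 3 this finds the single pair x^3 + 2x + 1, x^3 + 2x + 2, and over F_5 in degree 2 it finds five pairs.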
13.1.21 Conjecture [1418] Let $\pi_2(m)$ be the number of twin prime pairs less than or equal to m.
Then
$\pi_2(m) \sim 2\,\frac{m}{(\log m)^2} \prod_{\text{odd } p} \left(1 - \frac{1}{(p-1)^2}\right).$
13.1.22 Remark The product over odd primes in the above conjecture is the “twin primes constant”
and has value approximately 0.66016. If proved, this conjecture would imply the
“Twin Primes Conjecture,” i.e., the existence of infinitely many twin prime pairs. We have
an analogous conjecture in the polynomial setting.
13.1.23 Conjecture [962] Let $I_{2,q}(n)$ be the number of twin irreducible polynomials of degree n.
Then
$I_{2,q}(n) \sim \delta\,\frac{q-1}{2} \cdot \frac{q^n}{n^2} \prod_{P} \left(1 - \frac{1}{(|P|-1)^2}\right),$
where either δ = 1 and the product is over all monic irreducibles P provided q > 2, or δ = 4
and the product excludes linear irreducibles when q = 2.
13.1.24 Remark Though unproven, these two conjectures are strongly supported by numerical
evidence. For example, for polynomials of degree 16 over F3 , the above formula accurately
predicts the 66606 twin irreducible pairs. For more information on what is currently known
along the lines of Conjecture 13.1.23, especially for the “fixed n, q going to ∞” case, see
Pollack [2410] and [2411].
13.1.25 Remark Though the Twin Primes Conjecture remains unproven, it was first observed by
Hall [1402] and then further explored by Pollack [2409] and Effinger [959], that the “Polyno-
mial Twin Primes Conjecture” holds provided only that q > 2. The proof technique makes
use of the fact that unlike prime numbers, irreducible polynomials have “internal struc-
ture.” Specifically, “irreducibility preserving substitutions” can be exploited to guarantee
the following theorem.
13.1.26 Theorem [959] (Polynomial Twin Primes Theorem) Over every Fq with q > 2, there exist
infinitely many twin irreducible pairs.
13.1.27 Remark In fact one can prove the existence of “t-tuplets” (twins being 2-tuplets) in the
polynomial case. For example, over F7 we can guarantee the existence of infinitely many
4-tuplets. This follows from the next result.
13.1.28 Theorem [959] Let $\mathbb{F}_q$ satisfy q ≥ 4. If q ≡ 0 (mod 4) or q ≡ 1 (mod 4) and if p is
any prime dividing q − 1, then there exist exactly $t = \frac{(p-1)(q-1)}{p}$ irreducible polynomials of
the form $x^{p^k} - a$ over $\mathbb{F}_q$ for every k ≥ 1. If q ≡ 3 (mod 4), the conclusion holds for all odd
primes p, but no irreducibles of the form $x^{2^k} - a$ exist provided k ≥ 2.
13.1.29 Problem The polynomial twin primes conjecture remains unsolved over F2 .
13.1.30 Definition If χ is a Dirichlet character (see, for example, Section 1.4.3 of [751]), then the
corresponding Dirichlet L-function is
$L(s, \chi) = \prod_{p} \left(1 - \frac{\chi(p)}{p^s}\right)^{-1},$
13.1.31 Remark The function L(s, χ), which converges for all Re(s) > 1, can, like the Riemann
zeta function, be extended analytically to a meromorphic function on the whole complex
plane.
13.1.32 Conjecture (The Generalized Riemann Hypothesis) For every Dirichlet character χ, if
Re(s) > 0 and if L(s, χ) = 0, then Re(s) = 1/2.
13.1.33 Definition If χ is a character “of Dirichlet type” on $\mathbb{F}_q[x]$ (see, for example, Chapter 5 of
[961]), then the corresponding polynomial Dirichlet L-function is
$L_q(s, \chi) = \prod_{P} \left(1 - \frac{\chi_q(P)}{|P|^s}\right)^{-1},$
13.1.34 Remark The following polynomial analogue of the Generalized Riemann Hypothesis follows
from the deep results of André Weil [2962] and is a key ingredient of the results of our next
subsection and of many other results in the number theory of polynomials over finite fields.
For example, see [1606] for an exposition of the polynomial analogue of Artin’s conjecture
on primitive roots.
13.1.35 Theorem [2962] (A Polynomial Generalized Riemann Hypothesis) The function $L_q(s, \chi)$ is
a complex polynomial $F_\chi$ in $q^{-s}$ and when factored into
$F_\chi(q^{-s}) = \prod_{i} (1 - \gamma_i q^{-s}),$
13.1.36 Remark In both the integer and polynomial cases, the “two-primes” Goldbach problem of
writing an appropriate element of the ring Z or Fq [x] as a sum of two irreducible elements
remains unsolved. However, if one moves to the “three-primes” case, a great deal more can
be said. The first giant step forward was the pioneering work of Hardy and Littlewood in
the 1920s. In the following result, Hypothesis R (a Weak Generalized Riemann Hypothesis)
replaces the 1/2 in Conjecture 13.1.32 with Θ, where 1/2 ≤ Θ < 3/4.
13.1.37 Theorem [1420] If Hypothesis R holds, then, as m → ∞, the number $N_3(m)$ of represen-
tations of the odd integer m as a sum of three odd primes satisfies:
$N_3(m) \sim \frac{m^2}{(\log m)^3} \prod_{p > 2} \left(1 + \frac{1}{(p-1)^3}\right) \prod_{p \mid m} \left(1 - \frac{1}{p^2 - 3p + 3}\right),$
13.1.41 Remark Obviously, even polynomials only exist over F2 and are then ones which are divis-
ible by x or x + 1.
13.1.42 Theorem [961] Let f be an odd monic polynomial of degree n over $\mathbb{F}_q$. As q → ∞ or
n → ∞, the number $M_3(f)$ of representations of $f = P_1 + P_2 + P_3$, where $\deg(P_1) = n$,
$\deg(P_2) < n$ and $\deg(P_3) < n$, satisfies:
$M_3(f) \sim \frac{q^{2n}}{n^3} \prod_{|P| > 2} \left(1 + \frac{1}{(|P|-1)^3}\right) \prod_{P \mid f} \left(1 - \frac{1}{|P|^2 - 3|P| + 3}\right),$
13.1.45 Remark Hence, we get “complete 3-primes theorems” in both the integer and polynomial
cases provided that we have a proven Generalized Riemann Hypothesis. In the polynomial
case, thanks to Weil, we do; in the integer case we still do not.
13.1.46 Remark The general problem of writing an arbitrary positive integer as a sum of a limited
number of k-th powers, known as the Waring Problem, was first settled by Hilbert in 1909.
13.1.47 Theorem (The Hilbert-Waring Theorem) [1500] For every positive integer k, there exists
an integer s(k) such that every positive integer m can be written as a sum of at most s(k)
k-th powers.
13.1.48 Remark Tremendous effort over the years has been put toward the question: Given k, what
is s(k)? To that end we have following definitions, the latter having first been introduced
by Hardy and Littlewood [1419].
13.1.49 Definition Let g(k) be the smallest s such that every positive integer m is a sum of at most
s k-th powers. Let G(k) be the smallest s for which every sufficiently large positive
integer is a sum of at most s k-th powers.
13.1.50 Remark It is known that g(2) = G(2) = 4, that g(3) = 9, and that 4 ≤ G(3) ≤ 7. Exact
formulas are now known for g(k) for all k. Although exact values of G(k) are known only
for k = 2 and k = 4 (G(4) = 16), considerable progress has been made on good lower and
upper bounds for G(k). For excellent surveys of the Waring Problem, see Ellison [974] and
the more current Vaughan and Wooley [2862].
13.1.51 Remark Turning to the polynomial case, one must first observe that no cancellation occurs
in the integer Waring Problem, and so the most appropriate analogue in the polynomial
case will allow as little cancellation as possible. This is achieved as follows:
13.1.52 Definition The representation $f = X_1^k + \cdots + X_s^k$ of $f \in \mathbb{F}_q[x]$ as a sum of k-th powers
of polynomials in $\mathbb{F}_q[x]$ is a strict sum provided that for every $1 \le i \le s$,
$\deg(X_i) \le \lceil \deg(f)/k \rceil$ (equivalently, provided that $k \deg(X_i) < k + \deg(f)$).
13.1.53 Remark The following result, first proved independently by Car [508], Webb [2955], and
Kubota [1810], is the best analogue to the Hilbert-Waring Theorem.
13.1.54 Theorem (Polynomial Waring Theorem) Suppose k < p = char(Fq ). Then there exists an
integer s(k), independent of q, such that every f ∈ Fq [x] is a strict sum of s(k) k-th powers
of polynomials in Fq [x].
13.1.55 Remark Just as in the integer case, the question becomes: Given k, what is s(k)? The
following definition is from Section 1.1 of [961].
13.1.56 Definition Given k ≥ 2, let gpoly (k) be the smallest value of s such that for every q with
p = char(Fq ) > k, every f ∈ Fq [x] is a strict sum of s k-th powers in Fq [x]. Let Gpoly (k)
be the smallest s such that this same condition holds except possibly for a finite number
of polynomials in the collection of all Fq [x] with p > k.
13.1.57 Remark As discussed in Section 1.2 of [961], the case of k = 2 is settled by Serre, with
Webb showing that the only exceptions are two polynomials of degree 3 and six polynomials
of degree 4 over F3 which require four squares.
13.1.59 Remark To date no exact values of gpoly (k) or Gpoly (k) are known for k > 2. The cases
of k = 3 and k = 4 have been extensively studied. See the Introduction of [513] for a
good summary of what is currently known, including cases not satisfying the hypothesis of
Theorem 13.1.54 (i.e., that k < p). Following our Definition 13.1.56, however, it is known
that gpoly (3) ≤ 9, Gpoly (3) ≤ 7, and that Gpoly (4) ≤ 11; see [514, 515]. For results on upper
bounds for gpoly (k), especially for large k, see [1167]. Finally, in [513], the following upper
bounds for general k are established for both parameters. Note that in the case of gpoly (k),
there is dependence on q.
See Also
References Cited: [134, 508, 513, 514, 515, 602, 751, 823, 957, 958, 959, 960, 961, 962, 963,
974, 1166, 1167, 1402, 1418, 1419, 1420, 1447, 1448, 1497, 1500, 1606, 1810, 1891, 2016,
2409, 2410, 2411, 2412, 2862, 2877, 2955, 2962]
On one hand we collect results about the numbers of matrices of various types over Fq ;
on the other hand, we shall also discuss matrix representations of the field Fqm over Fq and
give a few results concerning determinants.
13.2.1 Remark As noted in Remark 2.1.90, the vector space of all $m \times n$ matrices over a field
$F = \mathbb{F}_q$ has dimension $mn$, and thus the number of $m \times n$ matrices is $q^{mn}$. We shall
mainly concentrate on square matrices. Clearly, the $m \times m$ matrices over $\mathbb{F}_q$ constitute
a ring $R = \mathbb{F}_q^{(m,m)}$, and the invertible matrices in $R$ form a group $G = GL(m, q)$, the
general linear group. The elements of $G$ are exactly those matrices in $R$ which transform
the elements of a fixed ordered basis $B = \{\alpha_1, \ldots, \alpha_m\}$ for $\mathbb{F}_{q^m}$ over $\mathbb{F}_q$ into an ordered
basis again. Hence the order of $G$ agrees with the total number of distinct ordered bases:
\[ |GL(m, q)| = q^{m(m-1)/2} (q^m - 1)(q^{m-1} - 1) \cdots (q - 1). \tag{13.2.1} \]
Trivially, the determinants of the elements of $G$ are uniformly distributed over $F^*$. In
particular, the matrices in $G$ with determinant 1 form a group $SL(m, q)$ of order
\[ |SL(m, q)| = q^{m(m-1)/2} (q^m - 1)(q^{m-1} - 1) \cdots (q^2 - 1), \]
called the special linear group. These two groups are the most elementary instances of the
classical groups, which are discussed in detail in Section 13.4.
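The order formula (13.2.1) can be sanity-checked by brute force for very small parameters. The following is a minimal sketch, assuming q = p is prime so that arithmetic modulo p models the field; the helper names (`det_mod_p`, `gl_order_formula`, `gl_order_brute_force`) are illustrative and not taken from the text.

```python
from itertools import product


def det_mod_p(rows, p):
    """Determinant over the prime field F_p by Laplace expansion (tiny matrices only)."""
    if len(rows) == 1:
        return rows[0][0] % p
    total = 0
    for j, entry in enumerate(rows[0]):
        if entry:
            minor = [r[:j] + r[j + 1:] for r in rows[1:]]
            total += (-1) ** j * entry * det_mod_p(minor, p)
    return total % p


def gl_order_formula(m, q):
    """|GL(m, q)| as given by Equation (13.2.1)."""
    order = q ** (m * (m - 1) // 2)
    for i in range(1, m + 1):
        order *= q ** i - 1
    return order


def gl_order_brute_force(m, p):
    """Count invertible m x m matrices over F_p (p prime) by enumeration."""
    count = 0
    for entries in product(range(p), repeat=m * m):
        rows = [entries[i * m:(i + 1) * m] for i in range(m)]
        if det_mod_p(rows, p) != 0:
            count += 1
    return count


for m, p in [(1, 5), (2, 2), (2, 3), (3, 2)]:
    assert gl_order_brute_force(m, p) == gl_order_formula(m, p)
    # |SL(m, p)| is |GL(m, p)| divided by the p - 1 possible determinants.
    print(m, p, gl_order_formula(m, p), gl_order_formula(m, p) // (p - 1))
```

The enumeration grows like p^(m^2), so it is only practical for the tiny cases shown.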
13.2.2 Remark Generalizing Equation (13.2.1), we give a formula for the number of all m × n
matrices with prescribed rank k over Fq . This is closely related to the number of subspaces
of prescribed dimension in a vector space over Fq .
13.2.4 Lemma The Gaussian coefficient $\begin{bmatrix} m \\ k \end{bmatrix}_q$ is given explicitly as follows:
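13.2.6 Remark Clearly, the probability that a given entry of an $m \times n$ matrix of rank $k$ over $\mathbb{F}_q$ is
nonzero does not depend on the position of the entry. In [2093] this probability is determined
to be
\[ \frac{\left(1 - \frac{1}{q}\right)\left(1 - \frac{1}{q^k}\right)}{\left(1 - \frac{1}{q^m}\right)\left(1 - \frac{1}{q^n}\right)}. \]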
In [412], functions counting matrices of given rank over a finite field with specified positions
equal to 0 are studied. Such matrices may be viewed as q-analogues of permutations with
certain restricted values. In particular, a simple closed formula for the number of invertible
matrices with zero diagonal is obtained (see below), as well as recursions to enumerate
matrices with zero diagonal by rank.
13.2.7 Theorem [2093] The number of invertible $m \times m$ matrices over $\mathbb{F}_q$ whose diagonal consists
entirely of zeros is
\[ q^{\binom{m-1}{2} - 1} (q - 1)^m \left( \sum_{i=0}^{m} (-1)^i \binom{m}{i} [m - i]_q! \right), \]
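The zero-diagonal count can be compared with direct enumeration for small cases. The sketch below assumes q = p is prime and assumes the usual convention [n]_q! = (1)(1 + q)...(1 + q + ... + q^(n-1)) for the q-factorial, which is not spelled out in the surviving text; all function names are illustrative.

```python
from itertools import product
from math import comb


def det_mod_p(rows, p):
    """Determinant over F_p by Laplace expansion (tiny matrices only)."""
    if len(rows) == 1:
        return rows[0][0] % p
    total = 0
    for j, entry in enumerate(rows[0]):
        if entry:
            minor = [r[:j] + r[j + 1:] for r in rows[1:]]
            total += (-1) ** j * entry * det_mod_p(minor, p)
    return total % p


def q_factorial(n, q):
    """Assumed convention: [n]_q! = prod_{j=1..n} (1 + q + ... + q^(j-1))."""
    result = 1
    for j in range(1, n + 1):
        result *= sum(q ** t for t in range(j))
    return result


def zero_diag_formula(m, q):
    """Closed formula; the factor q^(-1) is applied last to stay in integers."""
    total = sum((-1) ** i * comb(m, i) * q_factorial(m - i, q) for i in range(m + 1))
    numerator = q ** comb(m - 1, 2) * (q - 1) ** m * total
    assert numerator % q == 0
    return numerator // q


def zero_diag_brute_force(m, p):
    """Count invertible matrices over F_p whose diagonal is identically zero."""
    positions = [(i, j) for i in range(m) for j in range(m) if i != j]
    count = 0
    for values in product(range(p), repeat=len(positions)):
        rows = [[0] * m for _ in range(m)]
        for (i, j), v in zip(positions, values):
            rows[i][j] = v
        if det_mod_p(rows, p) != 0:
            count += 1
    return count


for m, p in [(1, 2), (2, 2), (2, 3), (3, 2), (3, 3)]:
    assert zero_diag_brute_force(m, p) == zero_diag_formula(m, p)
```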
13.2.8 Remark By basic group theory, the order of an invertible m × m matrix over Fq has to
divide the order of GL(m, q) given in Equation (13.2.1). This elementary result can be
strengthened in two ways.
13.2.9 Theorem [2293] The least common multiple of all orders of matrices in $GL(m, q)$ is $p^e M$,
where $q$ is a power of the prime $p$, $e$ is the least integer satisfying $m \le p^e$, and $M$ is the least
common multiple of $q - 1, q^2 - 1, \ldots, q^m - 1$. In particular, the order $o(A)$ of $A \in GL(m, q)$
divides $p^e M$. Moreover, $o(A)$ is always bounded by $q^m - 1$.
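For tiny groups the statement can be checked by computing every element order directly. This is a sketch under the assumption that q = p is prime; `element_orders` and `exponent_formula` are hypothetical helper names, and `math.lcm` requires Python 3.9 or later.

```python
from itertools import product
from math import lcm  # available from Python 3.9


def det_mod_p(rows, p):
    """Determinant over F_p by Laplace expansion (tiny matrices only)."""
    if len(rows) == 1:
        return rows[0][0] % p
    total = 0
    for j, entry in enumerate(rows[0]):
        if entry:
            minor = [r[:j] + r[j + 1:] for r in rows[1:]]
            total += (-1) ** j * entry * det_mod_p(minor, p)
    return total % p


def mat_mul(a, b, p):
    """Product of two square matrices over F_p, returned as a tuple of tuples."""
    n = len(a)
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(n)) % p for j in range(n))
        for i in range(n)
    )


def element_orders(m, p):
    """Multiplicative orders of all matrices in GL(m, p)."""
    identity = tuple(tuple(int(i == j) for j in range(m)) for i in range(m))
    orders = []
    for entries in product(range(p), repeat=m * m):
        a = tuple(entries[i * m:(i + 1) * m] for i in range(m))
        if det_mod_p(a, p) == 0:
            continue
        power, k = a, 1
        while power != identity:
            power = mat_mul(power, a, p)
            k += 1
        orders.append(k)
    return orders


def exponent_formula(m, p):
    """p^e * M with e minimal such that m <= p^e and M = lcm(p - 1, ..., p^m - 1)."""
    e = 0
    while p ** e < m:
        e += 1
    return p ** e * lcm(*(p ** i - 1 for i in range(1, m + 1)))


for m, p in [(2, 2), (2, 3), (3, 2)]:
    orders = element_orders(m, p)
    assert lcm(*orders) == exponent_formula(m, p)
    assert max(orders) == p ** m - 1
```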
13.2.10 Example Consider $F = \mathbb{F}_{q^m}$ as a vector space over $K = \mathbb{F}_q$, and let $\gamma$ be any element of
$F^*$. Then $\gamma$ defines a $K$-linear mapping $T_\gamma : F \to F$ via
\[ T_\gamma : \xi \mapsto \gamma\xi \quad \text{for } \xi \in F, \]
and the order of $T_\gamma$ equals the order $o(\gamma)$ of $\gamma$ in $F^*$. Now represent $T_\gamma$ with respect to
some basis $B$ of $F$ over $K$; then the associated matrix $A_\gamma(B)$ is a matrix of order $o(\gamma)$ in
$GL(m, q)$. In particular, choosing $\gamma$ as a primitive element $\omega$ for $F$ gives a matrix $A_\omega(B)$ of
the maximum possible order $q^m - 1$.
13.2.12 Theorem [1269] Any two Singer subgroups of $G = GL(m, q)$ are conjugate in $G$. The
number $S(m, q)$ of Singer subgroups of $G$ equals
\[ S(m, q) = \frac{|GL(m, q)|}{m (q^m - 1)}, \]
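\[ i(m, q) = \begin{cases} g_m \cdot \sum_{t=0}^{m} g_t^{-1} g_{m-t}^{-1} & \text{if } q \text{ is odd}, \\ g_m \cdot \sum_{t=0}^{\lfloor m/2 \rfloor} g_t^{-1} g_{m-2t}^{-1} \, q^{-t(2m-3t)} & \text{if } q \text{ is even}, \end{cases} \]
where $g_t$ denotes the number of invertible $t \times t$ matrices over $\mathbb{F}_q$ as given in (13.2.1) (for
$t \ne 0$) and $g_0 = 1$.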
13.2.14 Remark [2166] More generally, there is a (rather involved) formula for the number of $m \times m$
matrices of order $k$ over $\mathbb{F}_q$. Clearly, the probability that a randomly chosen $m \times m$ matrix
over $\mathbb{F}_q$ is invertible (and therefore has an order) is $g_m / q^{m^2}$, where $g_m$ denotes the number
of invertible $m \times m$ matrices over $\mathbb{F}_q$ as given in (13.2.1). For fixed $q$, this probability has
a limit:
\[ \lim_{m \to \infty} \frac{g_m}{q^{m^2}} = \prod_{r \ge 1} \left(1 - \frac{1}{q^r}\right). \]
For $q = 2$, this limit is 0.28878...; for $q = 3$, the limit is 0.56012...; and, as $q \to \infty$, the
probability of being invertible goes to 1.
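The quoted limits are easy to reproduce numerically. The sketch below uses the fact that, by (13.2.1), g_m / q^(m^2) equals the finite product of (1 - q^(-r)) for r = 1, ..., m; the function name is an illustrative choice.

```python
def invertible_probability(m, q):
    """g_m / q^(m^2), written as the product of (1 - q^(-r)) for r = 1..m."""
    probability = 1.0
    for r in range(1, m + 1):
        probability *= 1 - q ** -r
    return probability


# The finite products stabilise quickly, so m = 60 already approximates the limit.
for q in (2, 3, 5, 101):
    print(q, round(invertible_probability(60, q), 5))
```

Since each factor is at least 1 - 1/q, the product tends to 1 as q grows, matching the last sentence of the remark.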
13.2.15 Remark [1064] On the other end of the spectrum, the number of nilpotent $m \times m$ matrices
over $\mathbb{F}_q$ (that is, matrices $A$ satisfying $A^k = 0$ for some $k$) is given by a very simple
formula: it equals $q^{m(m-1)}$.
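As a quick check of the nilpotent count, one can enumerate all matrices for small m and prime q = p. The sketch relies on the standard fact that an m × m matrix is nilpotent exactly when its m-th power vanishes; the names are illustrative.

```python
from itertools import product


def mat_mul(a, b, p):
    """Product of two square matrices over F_p, returned as a tuple of tuples."""
    n = len(a)
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(n)) % p for j in range(n))
        for i in range(n)
    )


def nilpotent_count(m, p):
    """Number of m x m matrices A over F_p with A^m = 0."""
    zero = tuple(tuple(0 for _ in range(m)) for _ in range(m))
    count = 0
    for entries in product(range(p), repeat=m * m):
        a = tuple(entries[i * m:(i + 1) * m] for i in range(m))
        power = a
        for _ in range(m - 1):
            power = mat_mul(power, a, p)  # after the loop, power == a^m
        if power == zero:
            count += 1
    return count


for m, p in [(1, 3), (2, 2), (2, 3), (3, 2)]:
    assert nilpotent_count(m, p) == p ** (m * (m - 1))
```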
13.2.16 Definition Any subring $R$ of the full matrix ring $K^{(m,m)}$ over a field $K$ which is itself a
field is a matrix field (of degree $m$) over $K$. If $K$ is a finite field and if $R \cong \mathbb{F}_q$, $R$ is a
matrix representation of degree $m$ for $\mathbb{F}_q$ (over $K$).
13.2.17 Remark The following result concerning the existence of matrix representations for Fq is
an obvious consequence of Theorem 13.2.9 and Example 13.2.10. For details, see Example
13.2.19.
13.2.18 Theorem Any matrix representation for Fqm over Fq has degree at least m. Moreover, there
always exists a representation of degree m.
13.2.19 Example With the notation of Example 13.2.10, the $q^m - 1$ matrices $A_\gamma(B)$ with $\gamma \in F^*$
together with the zero matrix form a matrix representation $R(B)$ of degree $m$ for $\mathbb{F}_{q^m}$ over
$\mathbb{F}_q$. Clearly, one may write $R(B)$ as the $q^m - 1$ powers of the Singer cycle $A_\omega(B)$ together
with the zero matrix. Now let $f(x) = x^m + f_{m-1} x^{m-1} + \cdots + f_1 x + f_0$ be the minimal
polynomial of a primitive element $\omega$ (so that $f$ is a primitive polynomial of degree $m$ over
$K$), and let $B = \{1, \omega, \omega^2, \ldots, \omega^{m-1}\}$ be the associated polynomial basis. Then $A_\omega(B)$ is
the companion matrix of $f$, that is,
\[ A_\omega(B) = \begin{pmatrix} 0 & 0 & \cdots & 0 & -f_0 \\ 1 & 0 & \cdots & 0 & -f_1 \\ 0 & 1 & \ddots & \vdots & \vdots \\ \vdots & \ddots & \ddots & 0 & -f_{m-2} \\ 0 & \cdots & 0 & 1 & -f_{m-1} \end{pmatrix}. \]
As this example shows, Singer cycles in $GL(m, q)$ give rise to matrix representations for
$\mathbb{F}_{q^m}$ over $\mathbb{F}_q$ of the smallest possible degree. Essentially, all minimal degree representations
may be obtained in this way.
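The construction of Example 13.2.19 can be tried out concretely. The sketch below builds the companion matrix of a primitive polynomial over a prime field and checks that its powers, together with the zero matrix, are closed under addition. The two polynomials used (x^3 + x + 1 over F_2 and x^2 + x + 2 over F_3) are standard primitive polynomials chosen here for illustration, and the helper names are hypothetical.

```python
def mat_mul(a, b, p):
    """Product of two square matrices over F_p, returned as a tuple of tuples."""
    n = len(a)
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(n)) % p for j in range(n))
        for i in range(n)
    )


def mat_add(a, b, p):
    """Entrywise sum of two matrices over F_p."""
    return tuple(tuple((x + y) % p for x, y in zip(ra, rb)) for ra, rb in zip(a, b))


def companion(coeffs, p):
    """Companion matrix of x^m + f_{m-1} x^{m-1} + ... + f_0, with coeffs = [f_0, ..., f_{m-1}]."""
    m = len(coeffs)
    return tuple(
        tuple((int(i == j + 1) if j < m - 1 else -coeffs[i]) % p for j in range(m))
        for i in range(m)
    )


def singer_representation(coeffs, p):
    """Return the list [A, A^2, ..., A^(p^m - 1)] for the companion matrix A."""
    a = companion(coeffs, p)
    powers, power = [], a
    for _ in range(p ** len(coeffs) - 1):
        powers.append(power)
        power = mat_mul(power, a, p)
    return powers


for coeffs, p in [([1, 1, 0], 2), ([2, 1], 3)]:
    m = len(coeffs)
    powers = singer_representation(coeffs, p)
    zero = tuple((0,) * m for _ in range(m))
    field = set(powers) | {zero}
    # A has order exactly p^m - 1, so the powers are pairwise distinct.
    assert len(field) == p ** m
    # Closure under addition: the powers of A together with 0 form a field.
    assert all(mat_add(x, y, p) in field for x in field for y in field)
```

For the second polynomial the companion matrix has determinant 2, a primitive element of F_3, in line with Theorem 13.2.20.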
13.2.20 Theorem [2980] Let $R \subset \mathbb{F}_q^{(m,m)}$ be a matrix representation for $F = \mathbb{F}_{q^m}$ over $K = \mathbb{F}_q$.
Then there exists a matrix $A \in R$ such that
\[ R = \{A^k : k = 1, \ldots, q^m - 1\} \cup \{0\}. \]
Moreover, $A$ is similar to the companion matrix of a primitive polynomial of degree $m$ over
$K$, and $\det A$ is a primitive element of $K$.
13.2.21 Remark Theorem 13.2.20 characterizes the matrix representations for Fqm of smallest de-
gree. There are considerably more general results due to Willett [2980] who classified all
matrix fields of degree m over Fq in terms of primitive polynomials over the underlying
prime field Fp . The general result is rather technical, so we only give the special case q = p
here. Proofs for all these results can also be found in Section 1.5 of [1631].
13.2.22 Theorem [2980] Let $p$ be a prime, and let $R$ be any representation of $\mathbb{F}_{p^m}$ as a matrix field
of degree $n$ over $\mathbb{F}_p$. Up to similarity, $R$ has the form
\[ R = \{\mathrm{diag}(0, \ldots, 0, A^k, \ldots, A^k) : k = 1, \ldots, p^m - 1\} \cup \{0\}, \]
13.2.24 Lemma [2584] Let $B$ be a basis for $F = \mathbb{F}_{q^m}$ over $\mathbb{F}_q$, and let $R(B)$ be the associated matrix
representation. Then $R(B)$ is symmetric if and only if there exists an element $\lambda \in F^*$ such
that the dual basis $B^*$ of $B$ satisfies $B^* = \lambda B$, where $\lambda B$ consists of all elements $\lambda\beta$ with
$\beta \in B$.
13.2.25 Theorem [2584] Every finite field $F = \mathbb{F}_{q^m}$ admits a basis $B$ over $\mathbb{F}_q$ such that the associated
matrix representation $R(B)$ is symmetric.
13.2.26 Theorem [2584] Let $B$ be a basis of $F = \mathbb{F}_{q^m}$ over $\mathbb{F}_q$ with associated matrix representation
$R(B)$, and assume that $q$ is even or that $q$ and $m$ are both odd. Then the dual basis $B^*$
of $B$ satisfies $B^* = \lambda B$ for some $\lambda \in F^*$ (so that $R(B)$ is symmetric) if and only if $\lambda$ is a
square, say $\lambda = \mu^2$, and the basis $\mu B$ is self-dual.
13.2.27 Theorem [2584] Let $B$ be a basis of $F = \mathbb{F}_{q^m}$ over $\mathbb{F}_q$, and assume that the associated matrix
representation $R(B)$ is closed under transposition of matrices. Then $R(B)$ is symmetric
provided that either $q$ is even or $q$ and $m$ are both odd. If $q$ is odd and $m$ is even, then
$R(B)$ is not symmetric if and only if $B^* = (\lambda B)^{q^{m/2}}$ for some $\lambda \in F^*$.
13.2.28 Remark There are many results on the number of matrices of special types over Fq . In
the remaining subsections, we present a selection of such results. We begin with those cases
which are related to the enumeration of various types of bases as discussed in Chapter 5.
Let us summarize these connections as follows:
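13.2.29 Definition An $m \times m$ matrix $C = (c_{ij})_{i,j=1,\ldots,m}$ is circulant if its rows are generated by
successive cyclic shifts of its first row, that is
\[ c_{i+1,j+1} = c_{i,j} \quad \text{for all } i, j, \tag{13.2.2} \]
where indices are computed modulo $m$. Thus $C$ is specified by the entries $c_j = c_{1,j}$ in
its first row:
\[ C = \begin{pmatrix} c_1 & c_2 & \cdots & c_{m-1} & c_m \\ c_m & c_1 & \cdots & c_{m-2} & c_{m-1} \\ c_{m-1} & c_m & \cdots & c_{m-3} & c_{m-2} \\ \vdots & \vdots & & \vdots & \vdots \\ c_2 & c_3 & \cdots & c_m & c_1 \end{pmatrix}. \]
Such a matrix $C$ is denoted as $\mathrm{circ}(c_1, \ldots, c_m)$.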
13.2.30 Remark The following three results are well-known. They may be found, for example, in
[1631].
13.2.31 Lemma Let $F$ be a field. Mapping the matrix $C = \mathrm{circ}(c_1, \ldots, c_m)$ to the coset of the
polynomial $c(x) = c_1 + c_2 x + \cdots + c_m x^{m-1}$ modulo $x^m - 1$ gives an isomorphism between
the ring of all circulant $m \times m$ matrices over $F$ and the ring $R = F[x]/(x^m - 1)$. In particular,
$C$ is invertible if and only if the associated polynomial $c(x)$ is a unit in $R$.
13.2.32 Corollary Let $C(m, q)$ denote the multiplicative group of all invertible circulant $m \times m$
matrices over $\mathbb{F}_q$. Then the order of $C(m, q)$ equals $\Phi_q(x^m - 1)$, where $\Phi_q$ is the function
introduced in Definition 2.1.111.
13.2.33 Theorem Let $q$ be a power of the prime $p$, let $m$ be a positive integer, and write $m = p^b n$,
where $p$ does not divide $n$. Then
\[ |C(m, q)| = q^m \prod_{d \mid n} \left(1 - q^{-o_d(q)}\right)^{\phi(d)/o_d(q)}, \tag{13.2.3} \]
where $\phi$ is the Euler function given in Definition 2.1.43 and where $o_d(q)$ denotes the
multiplicative order of $q$ modulo $d$.
13.2.34 Corollary Assume that $q$ and $m$ are co-prime. Then $|C(m, q)| = \prod_{j=1}^{r} (q^{m_j} - 1)$, where
$m_1, \ldots, m_r$ are the degrees of the irreducible factors of $x^m - 1$ over $\mathbb{F}_q$.
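The count of invertible circulant matrices can be checked by enumeration for small parameters. The sketch below assumes q = p is prime, reads o_d(q) as the multiplicative order of q modulo d, and evaluates the product of factors (1 - q^(-o_d(q))) raised to φ(d)/o_d(q) exactly with rational arithmetic; the function names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import gcd


def det_mod_p(rows, p):
    """Determinant over F_p by Laplace expansion (tiny matrices only)."""
    if len(rows) == 1:
        return rows[0][0] % p
    total = 0
    for j, entry in enumerate(rows[0]):
        if entry:
            minor = [r[:j] + r[j + 1:] for r in rows[1:]]
            total += (-1) ** j * entry * det_mod_p(minor, p)
    return total % p


def mult_order(q, d):
    """Multiplicative order of q modulo d (taken to be 1 when d == 1)."""
    if d == 1:
        return 1
    k, x = 1, q % d
    while x != 1:
        x = x * q % d
        k += 1
    return k


def euler_phi(d):
    return sum(1 for a in range(1, d + 1) if gcd(a, d) == 1)


def circulant_units_formula(m, p):
    """q^m * prod over d | n of (1 - q^(-o_d(q)))^(phi(d)/o_d(q)), with m = p^b * n."""
    n = m
    while n % p == 0:
        n //= p
    value = Fraction(p ** m)
    for d in range(1, n + 1):
        if n % d == 0:
            o = mult_order(p, d)
            value *= (1 - Fraction(1, p ** o)) ** (euler_phi(d) // o)
    assert value.denominator == 1
    return value.numerator


def circulant_units_brute_force(m, p):
    """Count invertible circulant m x m matrices over F_p."""
    count = 0
    for c in product(range(p), repeat=m):
        rows = [tuple(c[(j - i) % m] for j in range(m)) for i in range(m)]
        if det_mod_p(rows, p) != 0:
            count += 1
    return count


for m, p in [(2, 2), (3, 2), (3, 3), (4, 3), (5, 2)]:
    assert circulant_units_brute_force(m, p) == circulant_units_formula(m, p)
```

For instance, x^4 - 1 factors over F_3 into two linear factors and one quadratic factor, so Corollary 13.2.34 predicts (3 - 1)(3 - 1)(9 - 1) = 32 invertible circulant 4 × 4 matrices.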
13.2.36 Remark The preceding terminology is somewhat ambiguous, as the term orthogonal group
usually refers to the group of isometries of an orthogonal geometry, that is, of a vector space
equipped with a quadratic form. In the case of finite fields, no ambiguity arises if both q
and m are odd. However, if q is odd and m is even, then there are two distinct orthogonal
groups (of different orders); and if q is even, an orthogonal geometry is by definition a
symplectic geometry refined by an additional quadratic form, which forces m to be even.
In particular, the standard text books do not contain the order of O(m, q) for even values
of q. Nevertheless, the definition given in 13.2.35 makes sense for all cases. See Section 13.4
for the “classical” orthogonal groups.
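13.2.37 Theorem [1989] Assume that either $q$ is even or both $q$ and $m$ are odd. Then
\[ |O(m, q)| = \gamma \prod_{i=1}^{m-1} \left(q^i - \epsilon_i\right), \tag{13.2.4} \]
where
\[ \epsilon_i = \begin{cases} 1 & \text{if } i \text{ is even}, \\ 0 & \text{if } i \text{ is odd}, \end{cases} \quad \text{and} \quad \gamma = \begin{cases} 1 & \text{if } q \text{ is even}, \\ 2 & \text{if } q \text{ and } m \text{ are odd}. \end{cases} \]
Now let $q$ be odd and $m$ even, say $m = 2s$. Then
\[ |O(m, q)| = 2 q^{s(s-1)} (q^s + \epsilon) \prod_{i=1}^{s-1} \left(q^{2i} - 1\right), \tag{13.2.5} \]
where
\[ \epsilon = \begin{cases} 1 & \text{if } s \text{ is odd and } q \equiv 3 \pmod 4, \\ -1 & \text{otherwise}. \end{cases} \]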
13.2.39 Remark We require some notation to state the number of orthogonal circulant $m \times m$
matrices over $\mathbb{F}_q$; clearly all these matrices form a group which is denoted by $OC(m, q)$.
First assume that $q$ and $m$ are co-prime. (The general case is reduced recursively to this
special case.) Then let $x - 1$, possibly $x + 1$ (this factor arises only if $q$ is odd and $m$ is
even) and $f_1, \ldots, f_r$ be the monic irreducible factors of $x^m - 1$. Some of the $f_i$ may be
self-reciprocal (that is, $f_i = f_i^*$, see Definition 2.1.48), say $f_1, \ldots, f_s$. Then the remaining
$f_j$ split into pairs of the form $\{f, f^\wedge\}$ with $f \ne f^\wedge$, where $f^\wedge = f^*/f_0$ and where $f_0$ is the
constant term of $f$; say $r = s + 2t$, and $(f_{s+j})^\wedge = f_{s+t+j}$ for $j = 1, \ldots, t$.
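13.2.40 Theorem [258, 471, 1633, 1990] Assume that $q$ and $m$ are co-prime, and write, using the
notation introduced in Remark 13.2.39,
where
\[ \gamma = \begin{cases} 1 & \text{if } q \text{ is even}, \\ 2 & \text{if } q \text{ and } m \text{ are odd}, \\ 4 & \text{if } q \text{ is odd and } m \text{ is even}. \end{cases} \]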
13.2.42 Remark In order to prove the results on (circulant) orthogonal matrices presented in the
previous subsection, one needs a connection with and enumeration results for two other
interesting classes of matrices, which are the topic of the present subsection. Again, proofs
for these results are in [1631].
13.2.44 Definition Let $A$ be a symmetric invertible matrix over $\mathbb{F}_q$. Then any (invertible) $m \times m$
matrix $M$ satisfying $A = M M^T$ is a factor of $A$.
13.2.45 Lemma [1989] The invertible symmetric $m \times m$ matrices over $\mathbb{F}_q$ admitting a factor are in
a 1-to-1 correspondence with the cosets of $O(m, q)$ in $GL(m, q)$. Hence
\[ |O(m, q)| = \frac{|GL(m, q)|}{sf(m, q)}, \]
where $sf(m, q)$ denotes the number of invertible symmetric matrices over $\mathbb{F}_q$ admitting a
factor.
13.2.46 Lemma [69] Let A be a symmetric invertible matrix over Fq . If q is odd, A admits a factor
if and only if det A is a square; and if q is even, A admits a factor if and only if A has at
least one non-zero diagonal entry.
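Lemmas 13.2.45 and 13.2.46 lend themselves to a direct experiment for small parameters. The sketch below assumes q = p is prime and takes O(m, q) to be the group of matrices M with M M^T = I (the reading suggested by Definition 13.2.44); it enumerates GL(m, p), collects all products M M^T, and tests both the coset count and the factorization criterion. The function name is hypothetical.

```python
from itertools import product


def det_mod_p(rows, p):
    """Determinant over F_p by Laplace expansion (tiny matrices only)."""
    if len(rows) == 1:
        return rows[0][0] % p
    total = 0
    for j, entry in enumerate(rows[0]):
        if entry:
            minor = [r[:j] + r[j + 1:] for r in rows[1:]]
            total += (-1) ** j * entry * det_mod_p(minor, p)
    return total % p


def mat_mul(a, b, p):
    n = len(a)
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(n)) % p for j in range(n))
        for i in range(n)
    )


def transpose(a):
    return tuple(zip(*a))


def check_factor_lemmas(m, p):
    """Return (|O(m, p)|, sf(m, p)) after asserting Lemmas 13.2.45 and 13.2.46."""
    identity = tuple(tuple(int(i == j) for j in range(m)) for i in range(m))
    squares = {x * x % p for x in range(1, p)}
    matrices = [
        tuple(e[i * m:(i + 1) * m] for i in range(m))
        for e in product(range(p), repeat=m * m)
    ]
    gl_size = orthogonal = 0
    factored = set()
    for a in matrices:
        if det_mod_p(a, p) == 0:
            continue
        gl_size += 1
        gram = mat_mul(a, transpose(a), p)
        factored.add(gram)
        orthogonal += gram == identity
    # Lemma 13.2.45: |O(m, q)| = |GL(m, q)| / sf(m, q).
    assert gl_size == orthogonal * len(factored)
    # Lemma 13.2.46: criterion for an invertible symmetric matrix to admit a factor.
    for a in matrices:
        d = det_mod_p(a, p)
        if d == 0 or a != transpose(a):
            continue
        criterion = any(a[i][i] for i in range(m)) if p == 2 else d in squares
        assert criterion == (a in factored)
    return orthogonal, len(factored)


for m, p in [(2, 2), (2, 3), (3, 2)]:
    print(m, p, check_factor_lemmas(m, p))
```

The first component of each returned pair is a brute-force value of |O(m, p)| that can be compared with the closed formulas for the order of the orthogonal group.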
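13.2.47 Theorem [544, 1989] The number $N(m, r)$ of symmetric $m \times m$ matrices of rank $r$ over $\mathbb{F}_q$
equals
\[ N(m, r) = \prod_{i=1}^{s} \frac{q^{2i}}{q^{2i} - 1} \cdot \prod_{i=0}^{r-1} \left(q^{m-i} - 1\right), \]
where $r \le m$ and $s = \lfloor r/2 \rfloor$. In particular, the number of invertible symmetric $m \times m$
matrices over $\mathbb{F}_q$ is given by
\[ N(m, m) = \prod_{i=1}^{m} \left(q^i - \delta_i\right), \quad \text{where } \delta_i = \begin{cases} 0 & \text{if } i \text{ is even}, \\ 1 & \text{otherwise}. \end{cases} \]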
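The rank distribution of symmetric matrices can be tabulated by brute force and compared with the formula for N(m, r). This is a sketch assuming q = p is prime; `rank_mod_p` is an illustrative Gaussian-elimination helper (it uses `pow(x, -1, p)`, available from Python 3.8).

```python
from fractions import Fraction
from itertools import product


def rank_mod_p(rows, p):
    """Rank of a matrix over the prime field F_p by Gaussian elimination."""
    a = [list(r) for r in rows]
    rank = 0
    for col in range(len(a[0])):
        pivot = next((r for r in range(rank, len(a)) if a[r][col] % p), None)
        if pivot is None:
            continue
        a[rank], a[pivot] = a[pivot], a[rank]
        inv = pow(a[rank][col], -1, p)
        a[rank] = [x * inv % p for x in a[rank]]
        for r in range(len(a)):
            if r != rank and a[r][col] % p:
                factor = a[r][col]
                a[r] = [(x - factor * y) % p for x, y in zip(a[r], a[rank])]
        rank += 1
    return rank


def symmetric_rank_formula(m, r, q):
    """N(m, r) from Theorem 13.2.47, evaluated exactly."""
    value = Fraction(1)
    for i in range(1, r // 2 + 1):
        value *= Fraction(q ** (2 * i), q ** (2 * i) - 1)
    for i in range(r):
        value *= q ** (m - i) - 1
    assert value.denominator == 1
    return value.numerator


def symmetric_rank_counts(m, p):
    """counts[r] = number of symmetric m x m matrices of rank r over F_p."""
    upper = [(i, j) for i in range(m) for j in range(i, m)]
    counts = [0] * (m + 1)
    for values in product(range(p), repeat=len(upper)):
        a = [[0] * m for _ in range(m)]
        for (i, j), v in zip(upper, values):
            a[i][j] = a[j][i] = v
        counts[rank_mod_p(a, p)] += 1
    return counts


for m, p in [(2, 2), (2, 3), (3, 2), (3, 3)]:
    expected = [symmetric_rank_formula(m, r, p) for r in range(m + 1)]
    assert symmetric_rank_counts(m, p) == expected
```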
13.2.48 Theorem [545, 1989] The number $N_0(m, r)$ of skew-symmetric $m \times m$ matrices of rank $r$
over $\mathbb{F}_q$ is given by
\[ N_0(m, 2s) = \prod_{i=1}^{s} \frac{q^{2i-2}}{q^{2i} - 1} \cdot \prod_{i=0}^{2s-1} \left(q^{m-i} - 1\right) \quad \text{and} \quad N_0(m, 2s + 1) = 0, \]
13.2.49 Remark (Infinite) Toeplitz and Hankel matrices with complex entries play a prominent role
in classical linear algebra and have many important applications; see, for instance, [1354] for
an introductory treatment or [367] for a monograph on large finite Toeplitz matrices. In this
subsection, we discuss such matrices over Fq . Both classes of matrices are closely related,
as a Hankel matrix may be viewed as an “upside-down” Toeplitz matrix, so it suffices to
concentrate on one of these two classes.
13.2.51 Remark Note that indices are not computed modulo m in the defining condition (13.2.9) for
a Toeplitz matrix. If we did so, we would obtain further restrictions and arrive at an equivalent
formulation of the defining condition (13.2.2) for circulant matrices. Thus circulant matrices
constitute a special class of Toeplitz matrices.
13.2.55 Corollary For any positive integer r, the number of m × m Toeplitz (or Hankel) matrices
over Fq with rank r is constant for every m ≥ r + 1.
13.2.56 Remark We note that Toeplitz and Hankel matrices over finite fields have some interesting
applications. For instance, Toeplitz matrices are used as pre-conditioners in the process of
solving linear systems having unstructured coefficient matrices [1665]. In [1207], an explicit
surjective map σ from the set of all ordered pairs of coprime monic polynomials of degree m
over Fq to the set of all invertible m × m Hankel matrices over Fq is constructed; this map
has the additional property that, for any such matrix $A$, the pre-image $\sigma^{-1}(A)$ is in a
one-to-one correspondence with $\mathbb{F}_q$. Therefore Theorem 13.2.54 gives a proof for the following
result.
13.2.57 Theorem [226] The number of ordered pairs of coprime monic polynomials of degree $m$
over $\mathbb{F}_q$ equals $q^{2m-1}(q - 1)$.
13.2.58 Corollary The probability that two randomly chosen monic polynomials of the same
positive degree with coefficients in $\mathbb{F}_q$ are coprime is $1 - (1/q)$.
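The count of coprime pairs is simple to confirm experimentally. The sketch below assumes q = p is prime and represents a polynomial as a list of coefficients with the lowest degree first; the Euclidean algorithm and all function names are illustrative choices, not taken from the cited sources.

```python
from itertools import product


def poly_rem(f, g, p):
    """Remainder of f modulo g over F_p (coefficient lists, lowest degree first)."""
    f = list(f)
    inv_lead = pow(g[-1], -1, p)
    while len(f) >= len(g):
        factor = f[-1] * inv_lead % p
        shift = len(f) - len(g)
        for i, coeff in enumerate(g):
            f[shift + i] = (f[shift + i] - factor * coeff) % p
        while f and f[-1] == 0:  # drop the cancelled leading terms
            f.pop()
    return f


def are_coprime(f, g, p):
    """True if gcd(f, g) is a nonzero constant (Euclidean algorithm)."""
    a, b = list(f), list(g)
    while b:
        a, b = b, poly_rem(a, b, p)
    return len(a) == 1


def coprime_monic_pairs(m, p):
    """Number of ordered pairs of coprime monic polynomials of degree m over F_p."""
    monic = [list(c) + [1] for c in product(range(p), repeat=m)]
    return sum(1 for f in monic for g in monic if are_coprime(f, g, p))


for m, p in [(1, 2), (1, 3), (2, 2), (2, 3), (3, 2)]:
    assert coprime_monic_pairs(m, p) == p ** (2 * m - 1) * (p - 1)
```

Dividing by the total number q^(2m) of ordered pairs gives the probability 1 - 1/q of Corollary 13.2.58, independent of m.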
13.2.59 Remark Further results on the greatest common divisor of polynomials are given in Section
11.2.
13.2.7 Determinants
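13.2.60 Definition Let $\{\alpha_1, \ldots, \alpha_m\}$ be a set of $m$ elements of $\mathbb{F}_{q^m}$. Then the determinant
\[ \det M(\alpha_1, \ldots, \alpha_m) = \begin{vmatrix} \alpha_1 & \cdots & \alpha_m \\ \alpha_1^{q} & \cdots & \alpha_m^{q} \\ \vdots & \ddots & \vdots \\ \alpha_1^{q^{m-1}} & \cdots & \alpha_m^{q^{m-1}} \end{vmatrix} \]
is the Moore determinant of $\{\alpha_1, \ldots, \alpha_m\}$.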
13.2.61 Remark By Corollary 2.1.95, the set {α1 , . . . , αm } is a basis for Fqm over Fq if and only if
its Moore determinant is nonzero. The Moore determinant may be viewed as a finite field
analogue of the well-known Vandermonde determinant. It is used extensively in the theory
of Drinfeld modules, see Section 13.3. Moore [2140] proved the following formula for his
determinant, which immediately implies the validity of Corollary 2.1.95.
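13.2.62 Theorem Let $\{\alpha_1, \ldots, \alpha_m\}$ be a set of $m$ elements of $\mathbb{F}_{q^m}$. Then its Moore determinant
$\det M(\alpha_1, \ldots, \alpha_m)$ is given by
\[ \det M(\alpha_1, \ldots, \alpha_m) = \alpha_1 \prod_{i=1}^{m-1} \; \prod_{c_1, \ldots, c_i \in \mathbb{F}_q} \left( \alpha_{i+1} - \sum_{j=1}^{i} c_j \alpha_j \right). \]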
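Moore's product formula can be verified exhaustively in a small field. The sketch below works in F_8 over F_2 (so q = 2, m = 3), with elements encoded as 3-bit integers and multiplication taken modulo x^3 + x + 1; this encoding and the function names are assumptions made for illustration. In characteristic 2 signs are irrelevant, so the determinant is a plain sum over permutations.

```python
from itertools import permutations, product


def gf8_mul(a, b):
    """Multiply two elements of F_8 = F_2[x]/(x^3 + x + 1), encoded as 3-bit integers."""
    result = 0
    for _ in range(3):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:
            a ^= 0b1011  # reduce modulo x^3 + x + 1
    return result


def moore_det(alphas):
    """Moore determinant for q = 2: row i holds alpha_j^(2^i)."""
    m = len(alphas)
    rows = [list(alphas)]
    for _ in range(m - 1):
        rows.append([gf8_mul(x, x) for x in rows[-1]])
    total = 0
    for perm in permutations(range(m)):
        term = 1
        for i, j in enumerate(perm):
            term = gf8_mul(term, rows[i][j])
        total ^= term  # addition in characteristic 2
    return total


def moore_product(alphas):
    """Right-hand side of Theorem 13.2.62 for q = 2."""
    result = alphas[0]
    for i in range(1, len(alphas)):
        for cs in product((0, 1), repeat=i):
            combo = 0
            for c, a in zip(cs, alphas):
                if c:
                    combo ^= a
            result = gf8_mul(result, alphas[i] ^ combo)
    return result


triples = list(product(range(8), repeat=3))
assert all(moore_det(t) == moore_product(t) for t in triples)
# Nonzero Moore determinant <=> ordered basis of F_8 over F_2; there are |GL(3, 2)| of them.
assert sum(1 for t in triples if moore_det(t) != 0) == 168
```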
13.2.63 Remark [910] Equation (13.2.1) immediately gives an exact formula for the probability that
the determinant of a random $m \times m$ matrix $A$ over $\mathbb{F}_q$ equals 0. From this one easily sees
that the probability in question is of the order of magnitude $\mathrm{prob}(\det A = 0) = \frac{1}{q} + O\!\left(\frac{1}{q^2}\right)$.
It is remarkable that one actually has the following much more general result for both
determinants and permanents:
\[ \mathrm{prob}(\det A = \alpha) = \mathrm{prob}(\mathrm{per}\, A = \alpha) = \frac{1}{q} + O\!\left(\frac{1}{q^2}\right) \]
for every $\alpha \in \mathbb{F}_q$. This result is obtained in [910] as a byproduct of the authors' study of
the Polya permanent problem for matrices over finite fields.
13.2.64 Remark Several further results concerning the enumeration of various types of matrices over
finite fields can be found in the nice survey article [2166]. We have considered the number
of invertible m × m matrices over Fq with the smallest (viz. 2) and the largest (viz. q m − 1)
possible orders in Subsection 13.2.2. Matrices of a specified order k are, of course, closely
related to the solutions of the matrix equation f (X) = 0, where f (x) = xk − 1. There are
many papers concerning matrix equations over finite fields; see the notes to Section 6.2 in
[1939] for a collection of references.
See Also
References Cited: [69, 226, 258, 367, 412, 471, 544, 545, 910, 1064, 1070, 1207, 1269, 1354,
1517, 1631, 1633, 1665, 1989, 1990, 2093, 2140, 2166, 2293, 2584, 2980]
13.3.1 Definition Let Fq be a finite field with q elements and n an integer > 1. The set of n × n
nonsingular matrices over Fq forms a group with respect to matrix multiplication, called
the general linear group of degree n over Fq and denoted by GLn (q). The set of n × n
matrices over Fq of determinant 1 forms a subgroup of GLn (q), called the special linear
group of degree n over Fq and denoted by SLn (q).
The group GLn (q) can also be defined as the group consisting of nonsingular
linear transformations of an n-dimensional vector space over Fq .
13.3.2 Remark In the group theory literature, the groups $GL_n(q)$ and $SL_n(q)$ are sometimes written
as $GL(n, q)$ and $SL(n, q)$, respectively.
13.3.4 Theorem [1589] The center $Z_n$ of $GL_n(q)$ consists of those matrices $\lambda I_n$, where $\lambda \in \mathbb{F}_q^*$ and
$I_n$ is the $n \times n$ identity matrix. The center of $SL_n(q)$ is $SL_n(q) \cap Z_n$, which consists of those
matrices $\lambda I_n$, where $\lambda \in \mathbb{F}_q^*$ and $\lambda^n = 1$.
13.3.5 Definition The factor group $GL_n(q)/Z_n$ is the projective general linear group of degree $n$
over $\mathbb{F}_q$ and is denoted by $PGL_n(q)$. The factor group $SL_n(q)/(SL_n(q) \cap Z_n)$ is the projective
special linear group of degree $n$ over $\mathbb{F}_q$ and is denoted by $PSL_n(q)$.
and the orders of these groups are 48, 24, 8, 4, 2, respectively. Thus P SL2 (3) is not simple.
13.3.12 Theorem [858] The only isomorphisms between the groups $PSL_n(q)$ are
1. $PSL_2(4) \simeq PSL_2(5)$, and
2. $PSL_2(7) \simeq PSL_3(2)$.
13.3.13 Theorem [858] The only isomorphisms between the groups $PSL_n(q)$ and $A_m$ are
1. $PSL_2(3) \simeq A_4$,
2. $PSL_2(4) \simeq PSL_2(5) \simeq A_5$,
3. $PSL_2(9) \simeq A_6$, and
4. $PSL_4(2) \simeq A_8$.
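A necessary condition for each listed isomorphism is that the group orders agree, which is quick to check. The sketch below assumes the standard order formula |PSL_n(q)| = |GL_n(q)| / ((q - 1) gcd(n, q - 1)), which follows from the surjectivity of the determinant and the description of the center of SL_n(q) in Theorem 13.3.4; the function names are illustrative.

```python
from math import factorial, gcd


def gl_order(n, q):
    """|GL_n(q)| = q^(n(n-1)/2) * (q^n - 1)(q^(n-1) - 1)...(q - 1)."""
    order = q ** (n * (n - 1) // 2)
    for i in range(1, n + 1):
        order *= q ** i - 1
    return order


def psl_order(n, q):
    """|PSL_n(q)|: divide by q - 1 (determinants) and by gcd(n, q - 1) (center of SL)."""
    return gl_order(n, q) // ((q - 1) * gcd(n, q - 1))


def alt_order(m):
    """|A_m| = m!/2 for m >= 2."""
    return factorial(m) // 2


assert psl_order(2, 4) == psl_order(2, 5) == alt_order(5)
assert psl_order(2, 7) == psl_order(3, 2)
assert psl_order(2, 3) == alt_order(4)
assert psl_order(2, 9) == alt_order(6)
assert psl_order(4, 2) == alt_order(8)
# Equal orders are necessary but not sufficient: PSL_3(4) also has order |A_8|,
# yet it does not appear in the list of isomorphisms above.
assert psl_order(3, 4) == alt_order(8)
```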
\[ P \mapsto PA. \]
The subset of subspaces of the same dimension forms an orbit. The cardinality of the
orbit of $m$-dimensional subspaces ($0 \le m \le n$) is denoted by $N(m, n)$.
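13.3.15 Theorem [2920] We have $N(m, n) = \begin{bmatrix} n \\ m \end{bmatrix}_q$, where $\begin{bmatrix} n \\ m \end{bmatrix}_q$ is the Gaussian coefficient
\[ \begin{bmatrix} n \\ m \end{bmatrix}_q = \frac{(q^n - 1)(q^{n-1} - 1) \cdots (q^{n-m+1} - 1)}{(q^m - 1)(q^{m-1} - 1) \cdots (q - 1)}. \]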
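The Gaussian coefficient is straightforward to compute, and for q = 2 the subspace count can be confirmed by listing subspaces explicitly. In the sketch below, vectors of F_2^n are encoded as integers (bitwise XOR is vector addition); this encoding and the function names are illustrative assumptions.

```python
from itertools import combinations


def gaussian_coefficient(n, m, q):
    """Gaussian coefficient [n choose m]_q as in Theorem 13.3.15."""
    numerator = denominator = 1
    for i in range(m):
        numerator *= q ** (n - i) - 1
        denominator *= q ** (m - i) - 1
    return numerator // denominator


def count_subspaces_f2(n, m):
    """Number of m-dimensional subspaces of F_2^n, found by spanning m-subsets of vectors."""
    subspaces = set()
    for generators in combinations(range(1, 2 ** n), m):
        span = {0}
        for g in generators:
            span |= {v ^ g for v in span}
        if len(span) == 2 ** m:  # the generators were linearly independent
            subspaces.add(frozenset(span))
    return len(subspaces)


for n in range(1, 5):
    for m in range(n + 1):
        assert count_subspaces_f2(n, m) == gaussian_coefficient(n, m, 2)
```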
13.3.16 Remark For more details on Gaussian coefficients, see Section 13.2.
13.3.18 Theorem [857] All elements of $Sp_{2\nu}(q, K)$ are nonsingular matrices with determinant 1
and, hence, $Sp_{2\nu}(q, K)$ is a subgroup of $SL_{2\nu}(q)$. Moreover, $Sp_2(q, K) = SL_2(q)$.
13.3.20 Theorem [857] If $K_1$ and $K_2$ are two cogredient $2\nu \times 2\nu$ nonsingular alternate matrices,
then $Sp_{2\nu}(q, K_1) \simeq Sp_{2\nu}(q, K_2)$.
13.3.21 Remark By the previous theorem it is sufficient to consider the symplectic group
$Sp_{2\nu}(q, K_0)$ where
\[ K_0 = \begin{pmatrix} 0 & I_\nu \\ -I_\nu & 0 \end{pmatrix}. \]
The group $Sp_{2\nu}(q, K_0)$ is simply called the symplectic group of degree $2\nu$ over $\mathbb{F}_q$ and is denoted by
$Sp_{2\nu}(q)$.
13.3.22 Theorem [1589] The order of $Sp_{2\nu}(q)$ is
\[ |Sp_{2\nu}(q)| = q^{\nu^2} \prod_{i=1}^{\nu} \left(q^{2i} - 1\right). \]
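The order formula can be confirmed by enumeration in the two smallest interesting cases. The sketch below assumes q = p is prime and assumes the defining condition T K_0 T^t = K_0 for the symplectic group, by analogy with the definitions of the unitary and orthogonal groups in this section (the definition of the symplectic group itself is not reproduced above). Row i and row j of T must then pair to the (i, j) entry of K_0. Function names are illustrative.

```python
from itertools import product


def sp_order_formula(nu, q):
    """|Sp_{2 nu}(q)| = q^(nu^2) * prod_{i=1..nu} (q^(2i) - 1)."""
    order = q ** (nu * nu)
    for i in range(1, nu + 1):
        order *= q ** (2 * i) - 1
    return order


def sp_order_brute_force(nu, p):
    """Count 2nu x 2nu matrices T over F_p with T K0 T^t = K0, K0 = [[0, I], [-I, 0]]."""
    n = 2 * nu
    vectors = list(product(range(p), repeat=n))

    def form(x, y):
        # x K0 y^t modulo p
        return sum(x[k] * y[nu + k] - x[nu + k] * y[k] for k in range(nu)) % p

    def k0(i, j):
        if j == i + nu:
            return 1
        if i == j + nu:
            return (p - 1) % p  # the entry -1 of K0
        return 0

    table = {(x, y): form(x, y) for x in vectors for y in vectors}
    count = 0
    for rows in product(vectors, repeat=n):
        if all(table[rows[i], rows[j]] == k0(i, j) for i in range(n) for j in range(n)):
            count += 1
    return count


assert sp_order_brute_force(1, 2) == sp_order_formula(1, 2)
assert sp_order_brute_force(1, 3) == sp_order_formula(1, 3)
assert sp_order_brute_force(2, 2) == sp_order_formula(2, 2)
```

The last case gives 720 = 6!, the order of the symmetric group S_6.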
13.3.23 Definition Let T ∈ Sp2ν (q). If I2ν − T is of rank 1, then T is a symplectic transvection.
where T ∈ Sp2ν (q), and diag(λ, 0, . . . , 0) is a diagonal matrix with λ, 0, . . . , 0 along the main
diagonal and λ ∈ F∗q .
13.3.25 Theorem [1589] The group Sp2ν (q) is generated by symplectic transvections.
13.3.26 Theorem [1589] The group Sp2ν (q) is its own commutator subgroup in all cases except
ν = 1, q = 2 or 3 and ν = 2, q = 2.
13.3.27 Remark The group $Sp_2(2) = SL_2(2) \simeq S_3$ and its commutator subgroup is $A_3$. The group
$Sp_2(3) = SL_2(3)$ is of order 24 and is solvable.
13.3.28 Theorem [2786] The group $Sp_4(2) \cong S_6$.
13.3.29 Theorem [2786] For ν ≥ 2, the symmetric group S2ν+2 is a subgroup of Sp2ν (2).
13.3.30 Theorem [1589] The center of Sp2ν (q) consists of I2ν and −I2ν .
13.3.31 Definition The factor group $Sp_{2\nu}(q)/\{\pm I_{2\nu}\}$ is the projective symplectic group of degree
$2\nu$ over $\mathbb{F}_q$ and is denoted by $PSp_{2\nu}(q)$.
13.3.32 Theorem (Dickson) [1589] The group P Sp2ν (q) is a simple group except for the cases
P Sp2 (2), P Sp2 (3), and P Sp4 (2).
13.3.34 Theorem [2920] The action of $GL_{2\nu}(q)$ on the subspaces of $\mathbb{F}_q^{2\nu}$ induces an action of $Sp_{2\nu}(q)$
on the subspaces of $\mathbb{F}_q^{2\nu}$. Two subspaces $P$ and $Q$ are in the same orbit of $Sp_{2\nu}(q)$ if and
only if they are of the same type.
13.3.35 Theorem [2920] Denote the cardinality of the orbit of subspaces of type $(m, s)$ by
$N(m, s; 2\nu)$. Then
\[ N(m, s; 2\nu) = q^{2s(\nu+s-m)} \, \frac{\prod_{i=\nu+s-m+1}^{\nu} (q^{2i} - 1)}{\prod_{i=1}^{s} (q^{2i} - 1) \, \prod_{i=1}^{m-2s} (q^{i} - 1)}. \]
13.3.36 Definition Let $\mathbb{F}_{q^2}$ be a finite field with $q^2$ elements, where $q$ is a prime power. The
Frobenius automorphism (refer to Remark 2.1.77)
\[ a \mapsto a^q \]
of $\mathbb{F}_{q^2}$ is denoted by a bar, i.e., $\bar{a} = a^q$ for all $a \in \mathbb{F}_{q^2}$, and is the involution of $\mathbb{F}_{q^2}$.
13.3.37 Definition Let $n$ be an integer > 1, and $H = (h_{ij})$ be an $n \times n$ matrix over $\mathbb{F}_{q^2}$. The
matrix $(\bar{h}_{ij})$ is denoted by $\bar{H}$. If ${}^t\bar{H} = H$, $H$ is a Hermitian matrix. Let $H$ be an
$n \times n$ nonsingular Hermitian matrix over $\mathbb{F}_{q^2}$. The set of $n \times n$ matrices $T$ satisfying
$T H \, {}^t\bar{T} = H$ forms a group with respect to matrix multiplication, the unitary group
of degree $n$ with respect to $H$ over $\mathbb{F}_{q^2}$, denoted by $U_n(q^2, H)$. The subgroup of
$U_n(q^2, H)$ consisting of those $T \in U_n(q^2, H)$ with determinant 1 is the special unitary
group, denoted by $SU_n(q^2, H)$.
Let $H = (h_{ij})$ be an $n \times n$ matrix over $\mathbb{F}_{q^2}$ and $H(x, y) = \sum h_{ij} x_i \bar{y}_j$ be
the corresponding Hermitian form, which is nonsingular if $\det H \ne 0$. Let $H(x, y)$
be a nonsingular Hermitian form on an $n$-dimensional vector space $V$ over $\mathbb{F}_{q^2}$. Then
$U_n(q^2, H)$ can also be defined as the group of linear transformations $T$ of $V$ satisfying
$H(xT, yT) = H(x, y)$ for $x, y \in V$.
13.3.38 Theorem [857] All elements of Un (q 2 , H) are nonsingular matrices and, hence, Un (q 2 , H)
is a subgroup of GLn (q 2 ) and SUn (q 2 , H) is a subgroup of SLn (q 2 ).
13.3.39 Definition Two $n \times n$ Hermitian matrices $H_1$ and $H_2$ over $\mathbb{F}_{q^2}$ are cogredient if there is
an $n \times n$ nonsingular matrix $P$ over $\mathbb{F}_{q^2}$ such that $H_1 = P H_2 \, {}^t\bar{P}$.
13.3.40 Theorem [857] If $H_1$ and $H_2$ are two cogredient nonsingular Hermitian matrices, then
$U_n(q^2, H_1) \simeq U_n(q^2, H_2)$, and also $SU_n(q^2, H_1) \simeq SU_n(q^2, H_2)$.
13.3.41 Remark When $n$ is even, write $n$ as $n = 2\nu$; and when $n$ is odd, write $n$ as $n = 2\nu + 1$. By
the above theorem, it is sufficient to consider the unitary groups $U_n(q^2, H_0)$, $SU_n(q^2, H_0)$
and $U_n(q^2, H_1)$, $SU_n(q^2, H_1)$, where
\[ H_0 = \begin{pmatrix} 0 & I_\nu \\ I_\nu & 0 \end{pmatrix} \quad \text{and} \quad H_1 = \begin{pmatrix} 0 & I_\nu & 0 \\ I_\nu & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \]
Here $U_n(q^2, H_0)$ and $U_n(q^2, H_1)$ are the unitary groups of degree $n$ over $\mathbb{F}_{q^2}$, and are denoted
by $U_{2\nu}(q^2)$ and $U_{2\nu+1}(q^2)$, respectively. Similarly, we have $SU_{2\nu}(q^2)$ and $SU_{2\nu+1}(q^2)$. We
use the symbols $U_n(q^2)$ and $SU_n(q^2)$ to cover these two cases. In the group theory literature,
these groups are sometimes written as $U_n(q)$ instead of $U_n(q^2)$ and $SU_n(q)$ instead
of $SU_n(q^2)$.
13.3.42 Theorem [2786] The orders of $U_n(q^2)$ and $SU_n(q^2)$ are, respectively,
\[ |U_n(q^2)| = q^{\frac{1}{2} n(n-1)} \prod_{i=1}^{n} \left(q^i - (-1)^i\right), \]
and
\[ |SU_n(q^2)| = q^{\frac{1}{2} n(n-1)} \prod_{i=2}^{n} \left(q^i - (-1)^i\right). \]
\[ W_n = U_n(q^2) \cap Z_n = \{a I_n : a\bar{a} = 1\} \]
and the center of $SU_n(q^2)$ is
\[ SU_n(q^2) \cap Z_n = \{a I_n : a\bar{a} = 1 \text{ and } a^n = 1\}. \]
(Recall that $Z_n$ is the subgroup of $n \times n$ nonsingular scalar matrices over $\mathbb{F}_{q^2}$, i.e., the center
of $GL_n(q^2)$.)
13.3.45 Definition The factor group $U_n(q^2)/W_n$ is the projective unitary group of degree $n$ over
$\mathbb{F}_{q^2}$ and is denoted by $PU_n(q^2)$. Similarly, the factor group $SU_n(q^2)/(SU_n(q^2) \cap Z_n)$ is the
projective special unitary group of degree $n$ over $\mathbb{F}_{q^2}$ and is denoted by $PSU_n(q^2)$.
and
\[ |PSU_n(q^2)| = (\gcd(n, q + 1))^{-1} \, q^{\frac{1}{2} n(n-1)} \prod_{i=2}^{n} \left(q^i - (-1)^i\right). \]
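13.3.48 Lemma [857] Every unitary transvection can be expressed in the form
\[ T \begin{pmatrix} I_\nu & 0 \\ \mathrm{diag}(\lambda, 0, \ldots, 0) & I_\nu \end{pmatrix} T^{-1} \quad \text{or} \quad T \begin{pmatrix} I_\nu & 0 & 0 \\ \mathrm{diag}(\lambda, 0, \ldots, 0) & I_\nu & 0 \\ 0 & 0 & 1 \end{pmatrix} T^{-1} \]
the group
\[ PSU_2(3^2) \simeq PSL_2(3) \simeq A_4, \]
and the group $PSU_3(2^2)$ is of order $2^3 \cdot 3^2 = 72$ and is solvable.
13.3.52 Theorem [858] We have $PSU_4(2^2) \simeq PSp_4(3)$.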
13.3.56 Definition Let $n$ be an integer > 1, $\mathbb{F}_q$ be a finite field with $q$ elements where $q$ is an odd
prime power, and $S$ be an $n \times n$ nonsingular symmetric matrix over $\mathbb{F}_q$. The set of $n \times n$
matrices $T$ satisfying $T S \, {}^tT = S$ forms a group with respect to matrix multiplication,
the orthogonal group of degree $n$ with respect to $S$ over $\mathbb{F}_q$, denoted by $O_n(q, S)$.
Let $S = (s_{ij})$ be an $n \times n$ symmetric matrix over $\mathbb{F}_q$ and $Q(x) = \sum s_{ij} x_i x_j$ be
the corresponding quadratic form, which is nonsingular if $S$ is nonsingular. Let $Q(x)$
be a nonsingular quadratic form on an $n$-dimensional vector space $V$ over $\mathbb{F}_q$. Then
$O_n(q, S)$ can also be defined as the group of linear transformations $T$ of $V$ such that
$Q(xT) = Q(x)$ for all $x \in V$.
13.3.57 Theorem [1589] All elements of On (q, S) are nonsingular matrices of determinant ±1 and,
hence, On (q, S) is a subgroup of GLn (q).
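13.3.58 Theorem [1589] If $S_1$ and $S_2$ are two cogredient $n \times n$ nonsingular symmetric matrices,
then $O_n(q, S_1) \simeq O_n(q, S_2)$. Moreover, for any $n \times n$ nonsingular symmetric matrix $S$ over
$\mathbb{F}_q$ and any $\lambda \in \mathbb{F}_q^*$, $O_n(q, S) = O_n(q, \lambda S)$.
13.3.59 Remark Choose a fixed non-square element $z$ of $\mathbb{F}_q^*$. By the previous theorem it is sufficient
to consider the four orthogonal groups with respect to the following four $n \times n$ nonsingular
symmetric matrices
\[ S_{2\nu} = \begin{pmatrix} 0 & I_\nu \\ I_\nu & 0 \end{pmatrix}, \]
\[ S_{2\nu+1,1} = \begin{pmatrix} 0 & I_\nu & 0 \\ I_\nu & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad S_{2\nu+1,z} = \begin{pmatrix} 0 & I_\nu & 0 \\ I_\nu & 0 & 0 \\ 0 & 0 & z \end{pmatrix}, \]
\[ S_{2\nu+2} = \begin{pmatrix} 0 & I_\nu & 0 & 0 \\ I_\nu & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -z \end{pmatrix}, \]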
13.3.64 Theorem [857] Every element of O2ν+δ (q) is a product of at most 2ν + δ symmetries.
13.3.65 Definition Elements of O2ν+δ (q) are orthogonal matrices and those of determinant 1 are
proper orthogonal matrices. All proper orthogonal matrices form a subgroup of O2ν+δ (q),
the proper orthogonal group also referred to as the special orthogonal group, and denoted
by SO2ν+δ (q). The commutator subgroup of O2ν+δ (q) is denoted by Ω2ν+δ (q).
13.3.67 Definition The factor group O2ν+δ (q)/{±I2ν+δ } over Fq is the projective orthogonal group
of degree 2ν + δ with respect to S2ν+δ over Fq and is denoted by P O2ν+δ (q). Similarly,
the factor group SO2ν+δ (q)/(SO2ν+δ (q) ∩ {±I2ν+δ }) is the projective proper orthogonal
group of degree 2ν + δ with respect to S2ν+δ over Fq and is denoted by P SO2ν+δ (q).
We also define P Ω2ν+δ (q) = Ω2ν+δ (q)/(Ω2ν+δ (q) ∩ {±I2ν+δ }).
O2ν+δ (q) ⊃ SO2ν+δ (q) ⊃ Ω2ν+δ (q) ⊃ Ω2ν+δ (q) ∩ {±I2ν+δ } ⊃ {I}.
13.3.70 Theorem (Dickson) [1589] The group $P\Omega_{2\nu+\delta}(q)$ is a simple group except in the following
cases:
1. $\nu = 2$, $\delta = 0$,
2. $\nu = 1$, $\delta = 1$ and $q = 3$.
13.3.71 Remark For the exceptional cases in the above Theorem, we have
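and
\[ M(m, 2s + 2, s) = \begin{pmatrix} 0 & I_s & 0 & 0 & 0 \\ I_s & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & -z & 0 \\ 0 & 0 & 0 & 0 & 0^{(m-2s-2)} \end{pmatrix}. \]
Then $P$ is a subspace of type $(m, 2s, s)$, $(m, 2s + 1, s, 1)$, $(m, 2s + 1, s, z)$, and
$(m, 2s + 2, s, \mathrm{diag}(1, -z))$, respectively.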
13.3.74 Theorem [2920] Two subspaces $P$ and $Q$ of $\mathbb{F}_q^{2\nu+\delta}$ belong to the same orbit of $O_{2\nu+\delta}(q)$ if
and only if they are of the same type.
13.3.75 Remark The cardinality of any orbit of O2ν+δ (q) has already been determined [2920].
13.3.76 Definition Let $q$ be a power of 2 and $n$ be an integer > 1. Let $G$ be an $n \times n$ regular matrix
over $\mathbb{F}_q$. An $n \times n$ matrix $T$ over $\mathbb{F}_q$ is orthogonal with respect to $G$ if $T G \, {}^tT + G$ is an
alternate matrix. The set of $n \times n$ orthogonal matrices with respect to $G$ over $\mathbb{F}_q$ forms
a group with respect to matrix multiplication, the orthogonal group of degree $n$ with
respect to $G$ over $\mathbb{F}_q$, denoted by $O_n(q, G)$.
Let G(x) be a nonsingular quadratic form on an n-dimensional vector space V
over Fq . Then On (q, G) can also be defined as the group of linear transformations T of
V such that G(vT ) = G(v) for all v ∈ V .
13.3.77 Theorem [857] All elements of On (q, G) are nonsingular matrices of determinant 1 and,
hence, On (q, G) is a subgroup of SLn (q).
13.3.78 Theorem [857] Let G1 and G2 be two cogredient n × n regular matrices. Then
$O_n(q, G_1) \cong O_n(q, G_2)$.
13.3.79 Remark Choose a fixed element α such that α cannot be expressed in the form x2 + x
where x ∈ Fq . Write n = 2ν + δ where δ = 0, 1 or 2. Assume ν > 0. By the previous theorem
it is sufficient to consider the orthogonal groups O2ν+δ (q, G), where G is an n × n matrix
of one of the following forms
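\[
\begin{pmatrix} 0 & I_\nu \\ 0 & 0 \end{pmatrix}, \qquad
\begin{pmatrix} 0 & I_\nu & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad
\begin{pmatrix} 0 & I_\nu & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & \alpha & 1 \\ 0 & 0 & 0 & \alpha \end{pmatrix},
\]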
corresponding to the cases n = 2ν, n = 2ν + 1 and n = 2ν + 2, respectively. The orthogonal
group with respect to G over Fq will be denoted by O2ν+δ (q).
13.3.80 Theorem [857] We have $O_{2\nu+1}(q) \cong Sp_{2\nu}(q)$.
13.3.81 Remark In the following we consider only the groups O2ν (q) and O2ν+2 (q). In the literature
on group theory, O2ν (q) and O2ν+2 (q) are sometimes denoted by $O^+_{2\nu}(q)$ and $O^-_{2\nu+2}(q)$, and
called the plus type and the minus type, respectively.
13.3.82 Theorem [2786] The order of O2ν+δ (q), where δ = 0 or 2, is
\[
|O_{2\nu+\delta}(q)| = q^{\nu(\nu+\delta-1)} \prod_{i=1}^{\nu} (q^i - 1) \prod_{i=0}^{\nu+\delta-1} (q^i + 1).
\]
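The order formula of Theorem 13.3.82 can be evaluated directly. The following is a minimal sketch; the function name is hypothetical and q is assumed to be a power of 2, as in this part of the section.

```python
from math import prod


def orthogonal_group_order_char2(nu, delta, q):
    """Order of O_{2nu+delta}(q) for q a power of 2 and delta in {0, 2},
    evaluated term by term from the formula of Theorem 13.3.82."""
    if delta not in (0, 2):
        raise ValueError("the formula is stated for delta = 0 or 2 only")
    power_of_q = q ** (nu * (nu + delta - 1))
    first_product = prod(q**i - 1 for i in range(1, nu + 1))
    second_product = prod(q**i + 1 for i in range(0, nu + delta))
    return power_of_q * first_product * second_product


if __name__ == "__main__":
    # plus type in dimension 4 and minus type in dimension 4 over F_2
    print(orthogonal_group_order_char2(2, 0, 2))
    print(orthogonal_group_order_char2(1, 2, 2))
```

For q = 2 the two calls give 72 and 120, the latter agreeing with the order of the symmetric group on five letters.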
13.3.84 Theorem [857] Every element of O2ν+δ (q) is a product of at most 2ν+δ orthogonal transvec-
tions except for the case n = 4, ν = 2 and q = 2; if an element of O2ν+δ (q) is expressed as
a product of an even number of orthogonal transvections, so is every such expression.
13.3.85 Definition Except for the case n = 4, ν = 2 and q = 2, an orthogonal matrix T ∈ O2ν+δ (q)
which is a product of an even number of orthogonal transvections is a rotation. The
set of rotations forms a subgroup of O2ν+δ (q), the group of rotations and denoted by
SO2ν+δ (q). The commutator subgroup of O2ν+δ (q) is denoted by Ω2ν+δ (q).
13.3.86 Lemma For n ≥ 4 the center of Ω2ν+δ (q) consists of the identity element only.
13.3.87 Remark Except for the case n = 4, ν = 2, and q = 2, we have the normal series of O2ν+δ (q)
13.3.89 Theorem (Dickson) [857] The group Ω2ν+δ (q) is a simple group except for the case n =
4, ν = 2.
13.3.90 Theorem [858] If q ≠ 2 and n = 4, ν = 2, then
If q = 2 and n = 4, ν = 2, then SO2·2 (2) = SL2 (2) × SL2 (2) and Ω2·2 (2) is a direct product
of two cyclic groups of order 3.
13.3.91 Remark The action of GL2ν+δ (q) on the subspaces of $\mathbb{F}_q^{2\nu+\delta}$ induces an action of O2ν+δ (q)
on the subspaces of $\mathbb{F}_q^{2\nu+\delta}$. The cardinality of any orbit of O2ν+δ (q) has already been
determined [2920].
We present algorithms for efficient computation of linear algebra problems over finite
fields. Implementations∗ of the proposed algorithms are available through the Magma,
Maple (within the LinearAlgebra[Modular] subpackage) and Sage systems; some parts
can also be found within the C/C++ libraries NTL, FLINT, IML, M4RI, and the special
purpose LinBox template library for exact, high-performance linear algebra computation
with dense, sparse, and structured matrices over the integers and over finite fields [931].
13.4.2 Remark The classical triple loop implementation of matrix multiplication gives MM(m, k, n) ≤
2mkn. The best published estimate to date gives MM(n, n, n) ≤ O(n^ω) with ω ≈ 2.3755
[723], though improvements to 2.3737 and 2.3727 are now claimed [2728, 2985]. For very
rectangular matrices one also has astonishing results like MM(n, n, n^α) ≤ O(n^{2+ε}) for a
constant α > 0.294 and any ε > 0 [720]. Nowadays practical implementations mostly use
Strassen-Winograd’s algorithm, see Subsection 13.4.1.4, with an intermediate complexity
and ω ≈ 2.8074.
13.4.3 Remark The practical efficiency of matrix multiplication depends highly on the
representation of field elements. We thus present three kinds of compact representations for
elements of a finite field with very small cardinality: bit-packing (for F2 ), bit-slicing (for, say,
$\mathbb{F}_3$, $\mathbb{F}_5$, $\mathbb{F}_7$, $\mathbb{F}_{2^3}$, or $\mathbb{F}_{3^2}$), and Kronecker substitution. These representations are designed to
allow efficient linear algebra operations, including matrix multiplication.
13.4.4 Algorithm (Greasing) Over F2 , the method of the four Russians [124], also called Greasing,
can be used as follows:
1. A 64 bit machine word can be used to represent a row vector of dimension 64.
2. Matrix multiplication of an m × k matrix A by a k × n matrix B can be done
by first storing all 2^k k-dimensional linear combinations of rows of B in a table.
Then the i-th row of the product is copied from the row of the table indexed by
the i-th row of A.
3. By ordering indices of the table according to a binary Gray Code, each row of
the table can be deduced from the previous one, using only one row addition.
This brings the bit operation count to build the table from k 2^k n to 2^k n.
4. Choosing k = log2 n in the above method implies MM(n) = O(n^3 / log n) over F2 .
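The steps above can be sketched as follows. This is an illustration only, under stated assumptions: rows are stored as Python integers (bit j of a row of A is the entry in column j), the inner dimension is processed in slices of t rows of B, and the function names and the default slice width are hypothetical.

```python
def gray_table(rows):
    """All 2^t sums over F_2 of the given t rows (ints used as bit vectors).

    table[mask] is the XOR of rows[j] over the bits j set in mask. The masks
    are visited in Gray code order, so each entry costs one row addition."""
    table = [0] * (1 << len(rows))
    prev = 0
    for i in range(1, 1 << len(rows)):
        gray = i ^ (i >> 1)
        flipped = (gray ^ prev).bit_length() - 1  # the single bit that changed
        table[gray] = table[prev] ^ rows[flipped]
        prev = gray
    return table


def greased_multiply(a_rows, b_rows, t=4):
    """Product over F_2 of A (rows as ints over k columns) by B (k rows as ints)."""
    k = len(b_rows)
    c_rows = [0] * len(a_rows)
    for start in range(0, k, t):
        block = b_rows[start:start + t]
        table = gray_table(block)
        mask = (1 << len(block)) - 1
        for i, a in enumerate(a_rows):
            # the slice of row i of A selects one precomputed combination
            c_rows[i] ^= table[(a >> start) & mask]
    return c_rows


def naive_multiply(a_rows, b_rows):
    """Reference product: XOR the rows of B selected by the bits of each row of A."""
    out = []
    for a in a_rows:
        acc = 0
        for j, b in enumerate(b_rows):
            if (a >> j) & 1:
                acc ^= b
        out.append(acc)
    return out
```

Building one table per slice of t rows, rather than a single table of size 2^k, is the usual way to keep the table small; with t close to log2 n this corresponds to the choice made in step 4.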
13.4.5 Definition [349] Bit slicing consists in representing an n-dimensional vector of k-bit sized
coefficients using k binary vectors of dimension n. In particular, one can use Boolean
word instructions to perform arithmetic on 64-dimensional vectors.
1. Over F3 , the binary representation 0 ≡ [0, 0], 1 ≡ [1, 0], −1 ≡ [1, 1] allows one to add
and subtract two elements in 6 Boolean operations (results are returned in the same
[x0 , x1 ] order as the inputs):
Add([x0 , x1 ], [y0 , y1 ]) : s ← x0 ⊕ y1 , t ← x1 ⊕ y0
Return((s ⊕ x1 ) ∨ (t ⊕ y1 ), s ∧ t)
Sub([x0 , x1 ], [y0 , y1 ]) : t ← x0 ⊕ y0
Return(t ∨ (x1 ⊕ y1 ), (t ⊕ y1 ) ∧ (y0 ⊕ x1 ))
            F3    F5    F7
Addition     6    20    17
Negation     1     5     3
Double             5     0
Table 13.4.1 Boolean operation counts for basic arithmetic using bit slicing
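The bit-sliced arithmetic over F3 of Definition 13.4.5 can be checked exhaustively with a short sketch. The assumptions here are that Python integers stand in for machine words, that x0 holds the "non-zero" bits and x1 the "sign" bits (so 0 ≡ [0, 0], 1 ≡ [1, 0], −1 ≡ [1, 1]), and that results are returned in the same [x0, x1] order as the inputs. The helper names are hypothetical.

```python
def encode(values):
    """Pack a list of residues mod 3 into two bit vectors (x0, x1)."""
    x0 = x1 = 0
    for i, v in enumerate(values):
        v %= 3
        if v != 0:
            x0 |= 1 << i      # non-zero flag
        if v == 2:
            x1 |= 1 << i      # 2 = -1 carries the sign flag as well
    return x0, x1


def decode(x, n):
    """Inverse of encode: list of n residues in {0, 1, 2}."""
    x0, x1 = x
    return [((x0 >> i) & 1) + ((x1 >> i) & 1) for i in range(n)]


def add3(x, y):
    """Coefficient-wise addition mod 3 in 6 Boolean word operations."""
    x0, x1 = x
    y0, y1 = y
    s = x0 ^ y1
    t = x1 ^ y0
    return (s ^ x1) | (t ^ y1), s & t


def sub3(x, y):
    """Coefficient-wise subtraction mod 3 in 6 Boolean word operations."""
    x0, x1 = x
    y0, y1 = y
    t = x0 ^ y0
    return t | (x1 ^ y1), (t ^ y1) & (y0 ^ x1)


if __name__ == "__main__":
    a = [0, 0, 0, 1, 1, 1, 2, 2, 2]
    b = [0, 1, 2, 0, 1, 2, 0, 1, 2]
    print(decode(add3(encode(a), encode(b)), 9))
    print(decode(sub3(encode(a), encode(b)), 9))
```

All nine pairs of residues are processed in parallel by a single call, which is the point of the representation: one word operation acts on up to 64 coefficients at once on a 64 bit machine.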
13.4.6 Definition Bit packing consists in representing a vector of field elements as an integer
fitting in a single machine word using a 2^k-adic representation:
Elements of extension fields are viewed as polynomials and stored as the evaluation
of this polynomial at the characteristic of the field. The latter evaluation is known as
Kronecker substitution.
13.4.7 Remark We first need a way to simultaneously reduce coefficients modulo the characteristic,
see [929].
13.4.8 Algorithm (REDQ: Q-adic REDuction)
Require: three integers p, q and $\tilde{r} = \sum_{i=0}^{d} \tilde{\mu}_i q^i \in \mathbb{Z}$
Ensure: ρ ∈ Z, with $\rho = \sum_{i=0}^{d} \mu_i q^i$ where $\mu_i = \tilde{\mu}_i \bmod p$
REDQ COMPRESSION
1. $s = \lfloor \tilde{r}/p \rfloor$;
2. for i = 0 to d do
3. $u_i = \lfloor \tilde{r}/q^i \rfloor - p \lfloor s/q^i \rfloor$;
4. end for
REDQ CORRECTION {only when $p \nmid q$, otherwise $\mu_i = u_i$ is correct}
5. $\mu_d = u_d$;
6. for i = 0 to d − 1 do
7. $\mu_i = u_i - q u_{i+1} \bmod p$;
8. end for
9. Return $\rho = \sum_{i=0}^{d} \mu_i q^i$;
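The two phases of REDQ can be transcribed almost line by line. The sketch below uses exact Python integers in place of machine words or floating point numbers, and assumes that every packed coefficient is non-negative and smaller than q; the function names are hypothetical.

```python
def pack(coeffs, q):
    """q-adic packing: the integer sum(coeffs[i] * q**i)."""
    return sum(c * q**i for i, c in enumerate(coeffs))


def unpack(word, q, length):
    """Inverse of pack for coefficients in [0, q)."""
    return [(word // q**i) % q for i in range(length)]


def redq(r, p, q, d):
    """Reduce all d+1 packed coefficients of r modulo p simultaneously."""
    # REDQ COMPRESSION
    s = r // p
    u = [r // q**i - p * (s // q**i) for i in range(d + 1)]
    # REDQ CORRECTION (needed when p does not divide q)
    mu = [0] * (d + 1)
    mu[d] = u[d]
    for i in range(d):
        mu[i] = (u[i] - q * u[i + 1]) % p
    return pack(mu, q)


if __name__ == "__main__":
    word = pack([123, 45, 678], 1000)
    print(unpack(redq(word, 7, 1000, 2), 1000, 3))
```

In an actual implementation the divisions by p and by powers of q would be carried out by a few word-level or floating point operations, which is where the gain over coefficient-by-coefficient reduction comes from.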
13.4.9 Remark Once we can pack and simultaneously reduce finite field coefficients in a single
machine word, the obtained parallelism can be used for matrix multiplication. Depending on
the respective sizes of the matrices in the multiplication one can pack only the left operand,
only the right one, or both [930]. We give here only a generic algorithm for packed matrices,
which uses multiplication of a right packed matrix by a non-packed left matrix.
13.4.10 Algorithm (Right packed matrix multiplication)
Require: a prime p and $A_c \in \mathbb{F}_p^{m\times k}$ and $B_c \in \mathbb{F}_p^{k\times n}$, stored with several field elements
per machine word
Ensure: $C_c = A_c \times B_c \in \mathbb{F}_p^{m\times n}$
1. A = Uncompress($A_c$); {extract the coefficients}
2. $C_c = A \times B_c$; {Using e.g., Algorithm 13.4.14}
3. Return REDQ($C_c$);
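A minimal sketch of the right packed product follows. It departs from Algorithm 13.4.10 in several labeled ways: A is taken already uncompressed, each row of B is packed into one unbounded Python integer rather than a machine word, the radix q = k(p − 1)^2 + 1 is an illustrative choice that prevents carries between packed coefficients, and the final reduction is done coefficient by coefficient where the algorithm would call REDQ. All names are hypothetical.

```python
def pack_row(row, q):
    """Pack a row of coefficients into the integer sum(row[j] * q**j)."""
    return sum(c * q**j for j, c in enumerate(row))


def unpack_row(word, q, n):
    """Recover n coefficients in [0, q) from a packed word."""
    return [(word // q**j) % q for j in range(n)]


def right_packed_matmul(A, B, p):
    """Rows of A times packed rows of B; returns packed, reduced rows of A*B mod p."""
    k, n = len(B), len(B[0])
    q = k * (p - 1) ** 2 + 1          # every coefficient of A*B stays below q
    Bc = [pack_row(row, q) for row in B]
    packed_rows = []
    for row in A:
        # one scalar-times-word product handles n coefficients at once
        word = sum(a * b_word for a, b_word in zip(row, Bc))
        reduced = [c % p for c in unpack_row(word, q, n)]   # stands in for REDQ
        packed_rows.append(pack_row(reduced, q))
    return packed_rows, q


def naive_matmul(A, B, p):
    """Reference product mod p."""
    return [[sum(a * b for a, b in zip(row, col)) % p for col in zip(*B)] for row in A]
```

The design point illustrated is that each multiplication of a coefficient of A by a packed word of B performs n field multiplications in one integer operation, provided the radix leaves enough room for the accumulated sums.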
13.4.11 Remark Over extensions, fast floating point operations can then be used on the Kronecker
substitution of the elements. Indeed, it is very often desirable to use floating point
arithmetic, exactly. For instance, floating point routines can more easily use large hardware
registers, they can more easily optimize the memory hierarchy usage [1336, 2974], and
portable implementations are more widely available. We present next the dot product;
the matrix multiplication is then straightforward [929, 930, 932].
13.4.13 Remark Over word-size prime fields one can also use the reduction to floating point routines
of Algorithm 13.4.12. The main point is to be able to perform efficiently the matrix
multiplication of blocks of the initial matrices without modular reduction, thus delaying the
reduction as much as possible, depending on the algorithm and internal representations,
in order to amortize its cost. We present next such a delaying with the classical matrix
multiplication algorithm and a centered representation [933].
Ensure: $C = A \times B \in \mathbb{F}_p^{m\times n}$
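The idea of delayed reduction with a centered representation can be sketched as follows. This is not a transcription of Algorithm 13.4.14: it assumes IEEE double precision floats with a 53 bit mantissa, uses the simple bound that a sum of t products of centered residues has absolute value at most t((p − 1)/2)^2, and all function names and the optional block parameter are hypothetical.

```python
def centered(x, p):
    """Representative of x mod p of smallest absolute value."""
    r = x % p
    return r - p if r > p // 2 else r


def max_delay(p, mantissa_bits=53):
    """Number of products that can be accumulated exactly before reducing."""
    bound = max(1, ((p - 1) // 2) ** 2)
    return (2**mantissa_bits - 1) // bound


def matmul_delayed(A, B, p, block=None):
    """A*B mod p with float accumulation, reducing once per block of the inner dimension."""
    k, n = len(B), len(B[0])
    Af = [[float(centered(a, p)) for a in row] for row in A]
    Bf = [[float(centered(b, p)) for b in row] for row in B]
    step = block or max_delay(p)
    C = [[0] * n for _ in A]
    for start in range(0, k, step):
        stop = min(k, start + step)
        for i, row in enumerate(Af):
            for j in range(n):
                acc = 0.0
                for l in range(start, stop):
                    acc += row[l] * Bf[l][j]     # exact: stays below 2^53
                C[i][j] = (C[i][j] + int(acc)) % p   # the only reduction
    return C
```

With p = 65521 the bound allows several million products between two reductions, so for matrices of practical dimension a single reduction per output coefficient suffices in this classical setting.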
13.4.15 Remark If the field is too large for the strategy 13.4.14 over machine words, then two main
approaches would have to be considered:
1. Use extended arithmetic, either arbitrary or fixed precision, if the characteristic
is large, and a polynomial representation for extension fields. The difficulty here
is to preserve an optimized memory management and to have an almost linear
time extended precision polynomial arithmetic.
2. Use a residue number system and an evaluation/interpolation scheme: one can
use Algorithm 13.4.14 for each prime in the residue number system (RNS) and
each evaluation point. For $\mathbb{F}_{p^k}$, the number of needed primes is roughly $2 \log_{2^\beta}(p)$
and the number of evaluation points is 2k − 1.
13.4.16 Remark With matrices of large dimension, sub-cubic time complexity algorithms, such
as Strassen-Winograd’s [2987] can be used to decrease the number of operations. Algo-
rithm 13.4.17 describes how to compute one recursive level of the algorithm, using seven
recursive calls and 15 block additions.
C11 ← P1 + P2 ; U2 ← P1 + P6 ; U3 ← U2 + P7 ; U4 ← U2 + P5 ;
C12 ← U4 + P3 ; C21 ← U3 − P4 ; C22 ← U3 + P5 ;
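One recursive level of Strassen-Winograd's algorithm can be sketched as below. The final additions are those shown above; the pre-additions S1..S4, T1..T4 and the seven products P1..P7 are not visible in the surviving lines, so the ones used here are the standard Winograd variant and should be read as an assumption. The sketch handles square matrices over F_p stored as lists of lists, and the threshold parameter is hypothetical.

```python
def add(X, Y, p):
    return [[(x + y) % p for x, y in zip(r, s)] for r, s in zip(X, Y)]


def sub(X, Y, p):
    return [[(x - y) % p for x, y in zip(r, s)] for r, s in zip(X, Y)]


def classical(X, Y, p):
    return [[sum(a * b for a, b in zip(row, col)) % p for col in zip(*Y)] for row in X]


def strassen_winograd(A, B, p, threshold=2):
    """Product of two n x n matrices mod p: 7 recursive calls, 15 block additions."""
    n = len(A)
    if n <= threshold or n % 2:
        return classical(A, B, p)          # base case below the threshold
    h = n // 2

    def quadrants(M):
        return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
                [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])

    A11, A12, A21, A22 = quadrants(A)
    B11, B12, B21, B22 = quadrants(B)
    # 8 pre-additions
    S1 = add(A21, A22, p); S2 = sub(S1, A11, p)
    S3 = sub(A11, A21, p); S4 = sub(A12, S2, p)
    T1 = sub(B12, B11, p); T2 = sub(B22, T1, p)
    T3 = sub(B22, B12, p); T4 = sub(T2, B21, p)

    def mul(X, Y):
        return strassen_winograd(X, Y, p, threshold)

    # 7 recursive products
    P1 = mul(A11, B11); P2 = mul(A12, B21); P3 = mul(S4, B22); P4 = mul(A22, T4)
    P5 = mul(S1, T1); P6 = mul(S2, T2); P7 = mul(S3, T3)
    # 7 post-additions
    C11 = add(P1, P2, p); U2 = add(P1, P6, p); U3 = add(U2, P7, p); U4 = add(U2, P5, p)
    C12 = add(U4, P3, p); C21 = sub(U3, P4, p); C22 = add(U3, P5, p)
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]
    bottom = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bottom
```

Reducing modulo p after every block addition is the simplest choice for a sketch; an implementation aiming at speed would delay these reductions as far as the bound on intermediate values allows.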
13.4.18 Remark In practice, one uses a threshold in the matrix dimension to switch to a base case
algorithm, that can be any of the previously described ones. Following Subsection 13.4.1.2,
one can again delay the modular reductions, but the intermediate computations of Strassen-
Winograd’s algorithm impose a tighter bound.
13.4.19 Theorem [933] Let $A \in \mathbb{Z}^{m\times k}$, $B \in \mathbb{Z}^{k\times n}$, $C \in \mathbb{Z}^{m\times n}$ and β ∈ Z with $a_{i,j}, b_{i,j}, c_{i,j}, \beta \in$
{0, . . . , p − 1}. Then every intermediate value z involved in the computation of A × B + βC
13.4.21 Remark We present algorithms computing the determinant and inverse of square matrices;
the rank, rank profile, nullspace, and system solving for arbitrary shape and rank matrices.
All these problems are solved a la Gaussian elimination, but recursively in order to effectively
incorporate matrix multiplication. The latter is denoted generically gemm and, depending
on the underlying field, can be implemented using any of the techniques of Subsections
13.4.1.1, 13.4.1.2, or 13.4.1.3.
13.4.22 Remark Special care is given to the asymptotic time complexities: the exponent is reduced
to that of matrix multiplication using block recursive algorithms, and the constants are also
carefully compared. Meanwhile, this approach is also effective for implementations: grouping
arithmetic operations into matrix-matrix products allows better optimization of cache accesses.
13.4.23 Remark Algorithms 13.4.24, 13.4.25, 13.4.26, and 13.4.27 show how to reduce the computa-
tion of triangular matrix systems, triangular matrix multiplications, and triangular matrix
inversions to matrix-matrix multiplication. Note that they do not require any temporary
storage other than the input and output arguments.
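13.4.24 Algorithm (trsm: Triangular System Solve with Matrix right hand side)
Require: $A \in \mathbb{F}_q^{m\times m}$ non-singular upper triangular, $B \in \mathbb{F}_q^{m\times n}$
Ensure: $X \in \mathbb{F}_q^{m\times n}$ s.t. AX = B
Using the conformal block decomposition:
\[
\begin{pmatrix} A_1 & A_2 \\ & A_3 \end{pmatrix} \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} = \begin{pmatrix} B_1 \\ B_2 \end{pmatrix}
\]
1. if m = 1 then return $X = A_{1,1}^{-1} \times B$ end if
2. $X_2$ = trsm($A_3$, $B_2$);
3. $B_1 = B_1 - A_2 X_2$; {using gemm, e.g., via Algorithm 13.4.14}
4. $X_1$ = trsm($A_1$, $B_1$);
5. return $X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}$;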
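The recursion of trsm can be sketched over a prime field F_p as follows. Two simplifications are assumed: the field is a prime field (so inverses are computed with Python's modular pow), and the sketch copies blocks instead of working in place, whereas the algorithm above needs no temporary storage. The function name mirrors the algorithm; the rest is hypothetical.

```python
def trsm(A, B, p):
    """Solve A X = B over F_p for A non-singular upper triangular (lists of lists)."""
    m, n = len(A), len(B[0])
    if m == 1:
        inv = pow(A[0][0], -1, p)                 # inverse of the 1 x 1 block
        return [[inv * b % p for b in B[0]]]
    h = m // 2
    A1 = [row[:h] for row in A[:h]]               # top-left block
    A2 = [row[h:] for row in A[:h]]               # top-right block
    A3 = [row[h:] for row in A[h:]]               # bottom-right block
    X2 = trsm(A3, B[h:], p)
    # B1 <- B1 - A2 * X2, the matrix product that dominates the cost
    B1 = [[(B[i][j] - sum(A2[i][l] * X2[l][j] for l in range(m - h))) % p
           for j in range(n)] for i in range(h)]
    X1 = trsm(A1, B1, p)
    return X1 + X2


def matmul(A, B, p):
    return [[sum(a * b for a, b in zip(row, col)) % p for col in zip(*B)] for row in A]
```

Since the update of B1 is a plain matrix product, any fast multiplication routine can be substituted for the explicit sums, which is what brings the cost down to that of matrix multiplication.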
13.4.28 Remark Dense Gaussian elimination over finite fields can be reduced to matrix multiplica-
tion, using the usual techniques for the LU decomposition of numerical linear algebra [457].
However, in applications over a finite field, the input matrix often has non-generic rank
profile and special care needs to be taken about linear dependencies and rank deficiencies.
The PLE decomposition is thus a generalization of the PLU decomposition for matrices
with any rank profile.
13.4.29 Definition A matrix is in row-echelon form if all its zero rows occupy the last row positions
and the leading coefficient of any non-zero row except the first one is strictly to the right
of the leading coefficient of the previous row. Moreover, it is in reduced row-echelon form
if all coefficients above a leading coefficient are zeros.
13.4.30 Definition For any matrix $A \in \mathbb{F}_q^{m\times n}$ of rank r, there is a PLE decomposition A = P LE
where P is a permutation matrix, L is an m × r lower triangular matrix and E is an r × n
matrix in row-echelon form, with unit leading coefficients.
13.4.31 Remark Algorithm 13.4.32 shows how to compute such a decomposition by a block recursive
algorithm, thus reducing the complexity to that of matrix multiplication.
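3. Let j be the column index of the first non-zero entry of A and $P = T_{1,j}$ the
transposition between indices 1 and j;
4. return (P, P A, [1]);
5. else
{Split A columnwise in halves: $A = \begin{pmatrix} A_1 & A_2 \end{pmatrix}$}
6. $(P_1, L_1, E_1)$ = PLE($A_1$); {recursively}
7. $A_2 = P_1 A_2$;
{Split $A_2 = \begin{pmatrix} A_3 \\ A_4 \end{pmatrix}$, $L_1 = \begin{pmatrix} L_{1,1} \\ L_{1,2} \end{pmatrix}$, where $A_3$ and $L_{1,1}$ have $r_1$ rows}
8. $A_3 = L_{1,1}^{-1} A_3$; {using trsm}
9. $A_4 = A_4 - L_{1,2} A_3$; {using gemm}
10. $(P_2, L_2, E_2)$ = PLE($A_4$); {recursively}
11. return $\left( P_1 \begin{pmatrix} I_{r_1} & \\ & P_2 \end{pmatrix}, \begin{pmatrix} L_{1,1} & \\ P_2 L_{1,2} & L_2 \end{pmatrix}, \begin{pmatrix} E_1 & A_3 \\ & E_2 \end{pmatrix} \right)$;
12. end if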
13.4.33 Remark The row-echelon and reduced row-echelon forms can be obtained from the PLE
decomposition, using additional operations: trsm, trtri, and trtrm, as shown in Algo-
rithms 13.4.34 and 13.4.35.
13.4.36 Remark Figure 13.4.2 shows the various steps between the classical Gaussian elimination
(LU decomposition), the computation of the echelon form and of the reduced echelon form,
together with the various problems that each of them solves. Table 13.4.3 shows the leading
constant $K_\omega$ in the asymptotic time complexity of these algorithms, assuming that two
n × n matrices can be multiplied in $C_\omega n^\omega + o(n^\omega)$.
13.4.37 Remark If the rank r is very small compared to the dimensions m × n of the matrix, a
system Ax = b can be solved in time bounded by $O((m + n)r^2)$ [2170, Theorem 1].
13.4.39 Remark The computation of the minimal and characteristic polynomials is closely related
to that of the Frobenius normal form.
13.4.40 Definition Any matrix $A \in \mathbb{F}_q^{n\times n}$ is similar to a unique block diagonal matrix
$F = P^{-1}AP = \mathrm{diag}(C_{f_1}, \ldots, C_{f_t})$ where the blocks $C_{f_i}$ are companion matrices of the
polynomials $f_i$, which satisfy $f_{i+1} \mid f_i$. The $f_i$ are the invariant factors of A and F is the
Frobenius normal form of A.
13.4.41 Remark Most algorithms computing the minimal and characteristic polynomial or the
Frobenius normal form rely on Krylov basis computations.
13.4.42 Definition
1. The Krylov matrix of order d for a vector v with respect to a matrix A is the
matrix $K_{A,v,d} = \begin{pmatrix} v & Av & \cdots & A^{d-1}v \end{pmatrix} \in \mathbb{F}_q^{n\times d}$.
2. The minimal polynomial $P_{\min}^{A,v}$ of A and v is the least degree monic polynomial
P such that P (A)v = 0.
13.4.43 Theorem
1. $A K_{A,v,d} = K_{A,v,d} C_{P_{\min}^{A,v}}$, where $d = \deg(P_{\min}^{A,v})$.
2. For linearly independent vectors $(v_1, \ldots, v_k)$, if $K = \begin{pmatrix} K_{A,v_1,d_1} & \cdots & K_{A,v_k,d_k} \end{pmatrix}$ is
non-singular, then
\[
AK = K \begin{pmatrix}
C_{P_{\min}^{A,v_1}} & B_{1,2} & \cdots & B_{1,k} \\
B_{2,1} & C_{P_{\min}^{A,v_2}} & \cdots & B_{2,k} \\
\vdots & \vdots & \ddots & \vdots \\
B_{k,1} & B_{k,2} & \cdots & C_{P_{\min}^{A,v_k}}
\end{pmatrix},
\]
where the blocks $B_{i,j}$ are zero except on the last column.
3. For linearly independent vectors $(v_1, \ldots, v_k)$, let $(d_1, \ldots, d_k)$ be the lexicographically
largest sequence of degrees such that $K = \begin{pmatrix} K_{A,v_1,d_1} & \cdots & K_{A,v_k,d_k} \end{pmatrix}$ is
non-singular. Then
13.4.44 Remark
1. Some choices of vectors v1 , . . . , vk lead to a block diagonal matrix H: this is the
Frobenius normal form [1171].
2. The matrix obtained from Equation (13.4.1) is a Hessenberg form. It suffices to
compute the characteristic polynomial from its diagonal blocks.
13.4.48 Remark These probabilistic algorithms depend on the ability to sample uniformly from a
large set of coefficients from the field. Over small fields, it is always possible to embed the
problem into an extension field, in order to make the random sampling set sufficiently large.
In the worst case, this could add an O (log(n)) factor to the arithmetic cost and prevent most
of the bit-packing techniques. Instead, the effort of [944] is to handle cleanly the small finite
field case.
13.4.49 Remark We consider now the case where the input matrix is sparse, i.e., has many zero
elements, or has a structure which enables fast matrix-vector products. Gaussian elimination
would fill-in the sparse matrix or modify the interesting structure. Therefore one can use
iterative methods instead which only use matrix-vector iterations (blackbox methods [1669]).
There are two major differences with numerical iterative routines: over finite fields there
exist isotropic vectors and there is no notion of convergence, hence the iteration must
proceed until exactness of the result [1840]. Probabilistic early termination can nonetheless
be applied when the degree of the minimal polynomial is smaller than the dimension of the
matrix [935, 945, 1664]. More generally the probabilistic nature of the algorithms presented
in this section is subtle: e.g., the computation of the minimal polynomial is Monte-Carlo, but
that of system solving, using the minimal polynomial, is Las Vegas (by checking consistency
of the produced solution with the system). Making some of the Monte-Carlo solutions Las
Vegas is a key open problem in this area.
13.4.50 Remark The first iterative algorithm and its analysis are due to Wiedemann [2976]. The
algorithm computes the minimal polynomial in the Monte-Carlo probabilistic fashion.
13.4.51 Definition For a linearly recurring sequence $S = (S_i)$, its minimal polynomial is denoted
by $\Pi_S$.
1. The minimal polynomial of a matrix is denoted $\Pi_A = \Pi_{(A^i)}$.
2. For a matrix A and a vector b, we note $\Pi_{A,b} = \Pi_{(A^i \cdot b)}$.
3. For another vector u, we note $\Pi_{u,A,b} = \Pi_{(u^T \cdot A^i \cdot b)}$.
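The polynomial Π_{u,A,b} of the scalar sequence (u^T A^i b) can be computed with the Berlekamp-Massey algorithm, which is the core of Wiedemann's approach. The sketch below is an illustration over a prime field F_p, not the handbook's algorithm: it uses 2n terms of the sequence for an n × n matrix, takes u and b as given rather than random, and all function names are hypothetical.

```python
def berlekamp_massey(seq, p):
    """Minimal polynomial of a linearly recurring sequence over F_p.

    Returns the monic polynomial as a coefficient list, lowest degree first."""
    C, B = [1], [1]          # current and previous connection polynomials
    L, m, b = 0, 1, 1
    for n in range(len(seq)):
        d = seq[n] % p       # discrepancy of the current recurrence at index n
        for i in range(1, L + 1):
            if i < len(C):
                d = (d + C[i] * seq[n - i]) % p
        if d == 0:
            m += 1
            continue
        T = C[:]
        coef = d * pow(b, -1, p) % p
        C = C + [0] * (len(B) + m - len(C))
        for i, bi in enumerate(B):
            C[i + m] = (C[i + m] - coef * bi) % p
        if 2 * L <= n:
            L, B, b, m = n + 1 - L, T, d, 1
        else:
            m += 1
    C = (C + [0] * (L + 1 - len(C)))[: L + 1]
    return C[::-1]           # reverse of the connection polynomial


def mat_vec(A, v, p):
    return [sum(a * x for a, x in zip(row, v)) % p for row in A]


def projected_minpoly(A, u, b, p):
    """Pi_{u,A,b}: only matrix-vector products with A are used (blackbox access)."""
    seq, w = [], b[:]
    for _ in range(2 * len(A)):
        seq.append(sum(x * y for x, y in zip(u, w)) % p)
        w = mat_vec(A, w, p)
    return berlekamp_massey(seq, p)
```

For the matrix with rows (0, 1) and (1, 1) over F_101, with u = (1, 0) and b = (0, 1), the sequence is the Fibonacci sequence and the sketch returns the coefficients of x^2 − x − 1. In general the result is only a factor of Π_{A,b}, which is why random projections are used.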
13.4.53 Definition [See Definition 2.1.111] We extend Euler’s totient function by
$\Phi_{q,k}(f) = \prod_i (1 - q^{-k d_i})$, where $d_i$ are the degrees of the distinct monic irreducible factors of the
polynomial f .
13.4.54 Theorem For $u_1, \ldots, u_k$ selected uniformly at random, the probability that
$\mathrm{lcm}_j(\Pi_{u_j,A,b}) = \Pi_{A,b}$ is at least $\Phi_{q,k}(\Pi_{A,b})$.
13.4.55 Theorem For b1 , . . . , bk selected uniformly at random, the probability that lcm(ΠA,bi ) = ΠA
is at least Φq,k (ΠA ).
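The lower bounds of Theorems 13.4.54 and 13.4.55 are easy to evaluate once the degrees of the distinct irreducible factors are known. The following sketch uses exact rational arithmetic; the function name is hypothetical and the degrees are supplied by the caller rather than obtained by factoring.

```python
from fractions import Fraction


def extended_totient(q, k, degrees):
    """Phi_{q,k}(f) = prod(1 - q^(-k*d)) over the degrees d of the distinct
    monic irreducible factors of f."""
    result = Fraction(1)
    for d in degrees:
        result *= 1 - Fraction(1, q ** (k * d))
    return result


if __name__ == "__main__":
    # f = x(x + 1) over F_2: two distinct linear factors
    for k in (1, 2, 3, 8):
        print(k, extended_totient(2, k, [1, 1]))
```

Over F_2 with two linear factors the bound is 1/4 for k = 1 and 49/64 for k = 3, which shows how using several vectors compensates for a small field.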
13.4.56 Remark It is possible to compute the rank, determinant, and characteristic polynomial of a
matrix from its minimal polynomial. All these reductions require preconditioning the matrix
so that the minimal polynomial of the obtained matrix will reveal the information sought,
while keeping a low cost for the matrix-vector product [607, 935, 947, 1666, 2824, 2874, 2875].
13.4.57 Theorem [947] Let S be a finite subset of a field F that does not include 0. Let $A \in F^{m\times n}$
have rank r. Let $D_1 \in S^{n\times n}$ and $D_2 \in S^{m\times m}$ be two random diagonal matrices. Then
$\deg(\mathrm{minpoly}(D_1 \times A^T \times D_2 \times A \times D_1)) = r$, with probability at least $1 - \frac{11n^2 - n}{2|S|}$.
13.4.58 Theorem [2824] Let S be a finite subset of a field F that does not include 0. Let $U \in S^{n\times n}$
be a unit upper bi-diagonal matrix where the second diagonal elements $u_1, \ldots, u_{n-1}$ are
randomly selected in S. For $A \in F^{n\times n}$, the term of degree 0 of the minimal polynomial of
U A is the determinant of A with probability at least $1 - \frac{n^2 - n}{2|S|}$.
13.4.59 Remark If A is known to be non-singular the algorithm can be repeated with different
matrices U until the obtained minimal polynomial is of degree n. Then it is the characteristic
polynomial of U A and the determinant is certified. Alternatively if the matrix is singular
then X divides the minimal polynomial. As Wiedemann’s algorithm always returns a factor
of the true minimal polynomial, and U is invertible, the algorithm can be repeated on
U A until either the obtained polynomial is of degree n or it is divisible by X. Overall the
determinant has a Las-Vegas blackbox solution.
13.4.60 Theorem [2874, 2875] Let S be a finite subset of a field F that does not include 0 and
$A \in F^{n\times n}$ with $s_1, \ldots, s_t$ as invariant factors. Let $U \in S^{n\times k}$ and $V \in S^{k\times n}$ be randomly chosen
rank k matrices in F. Then $\gcd(\Pi_A, \Pi_{A+UV}) = s_{k+1}$ with probability at least $1 - \frac{nk+n+1}{|S|}$.
13.4.61 Remark Using the divisibility of the invariant factors and the fact that their product is of
degree n, one can see that the number of degree changes between successive invariant factors
is of order $O(\sqrt{n})$ [2874]. Thus by a binary search over successive applications of Theorem
13.4.60 one can recover all of the invariant factors and thus the characteristic polynomial
of the matrix in a Monte-Carlo fashion.
13.4.62 Remark For the solution of a linear system Ax = b, one could compute the minimal
polynomial $\Pi_{A,b}$ and then derive a solution of the system as a linear combination of the $A^i b$.
The following Lanczos approach is more efficient for system solving as it avoids recomputing
(or storing) the latter vectors [947, 1273].
13.4.63 Algorithm (Lanczos system solving)
Require: $A \in F^{m\times n}$, $b \in F^m$
Ensure: $x \in F^n$ such that Ax = b or failure
1. Let $\tilde{A} = D_1 A^T D_2 A D_1$ and $\tilde{b} = D_1 A^T D_2 b + \tilde{A} v$ with $D_1$ and $D_2$ random diagonal
matrices and v a random vector;
2. $w_0 = \tilde{b}$; $v_1 = \tilde{A} w_0$; $t_0 = v_1^T w_0$; $\gamma = \tilde{b}^T w_0 t_0^{-1}$; $x_0 = \gamma w_0$;
3. repeat
4. $\alpha = v_{i+1}^T v_{i+1} t_i^{-1}$; $\beta = v_{i+1}^T v_i t_{i-1}^{-1}$; $w_{i+1} = v_{i+1} - \alpha w_i - \beta w_{i-1}$;
5. $v_{i+2} = \tilde{A} w_{i+1}$; $t_{i+1} = w_{i+1}^T v_{i+2}$;
6. $\gamma = \tilde{b}^T w_{i+1} t_{i+1}^{-1}$; $x_{i+1} = x_i + \gamma w_{i+1}$;
7. until $w_{i+1} = 0$ or $t_{i+1} = 0$;
8. Return $x = D_1 (x_{i+1} - v)$;
13.4.64 Remark The probability of success of Algorithm 13.4.63 follows from Theorem 13.4.57.
13.4.65 Remark Over small fields, if the rank of the matrix is known, the diagonal matrices of line
1 can be replaced by sparse preconditioners with O (n log(n)) non-zero coefficients to avoid
the need of field extensions [607, Corollary 7.3].
13.4.66 Remark If the system with A and b is known to have a solution then the algorithm can
be turned Las Vegas by checking that the output x indeed satisfies Ax = b. In general, we
do not know if this algorithm returns failure because of bad random choices or because the
system is inconsistent. However, Giesbrecht, Lobo, and Saunders have shown that when the
system is inconsistent, it is possible to produce a certificate vector u such that $u^T A = 0$
together with $u^T b \neq 0$ within the same complexity [1273, Theorem 2.4]. Overall, system
solving can be performed by blackbox algorithms in a Las-Vegas fashion.
13.4.70 Remark Originating from the seminal paper [1643] most of the algorithms dealing with
structured matrices use the displacement rank approach [2346].
13.4.71 Definition For $A \in F^{m\times m}$ and $B \in F^{n\times n}$, the Sylvester (respectively Stein) linear
displacement operator $\nabla_{A,B}$ (respectively $\Delta_{A,B}$) satisfies for $M \in F^{m\times n}$:
$\nabla_{A,B}(M) = AM - MB$,
$\Delta_{A,B}(M) = M - AMB$.
13.4.72 Remark The main idea behind algorithms for structured matrices is to use such generators
as a compact data structure, in cases where the displacement has low rank.
13.4.73 Remark Usual choices of matrices A and B are diagonal matrices and cyclic down shift
matrices.
13.4.74 Definition We denote the diagonal matrix whose (i, i) entry is xi by Dx , x ∈ Fn and by
Zn,ϕ , ϕ ∈ F the n × n unit circulant matrix having ϕ at position (1, n), ones in the
subdiagonal (i + 1, i) and zeros elsewhere.
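The low displacement rank of a structured matrix can be observed on a small example. The sketch below builds a Toeplitz matrix over a prime field F_p and applies the Sylvester operator with A = B = Z_{n,0} and the Stein operator with A = Z_{n,0} and B its transpose; in both cases the displacement is supported on one row and one column, hence has rank at most 2. The choice of operators, the prime field and all function names are assumptions made for illustration.

```python
def shift_matrix(n, phi=0):
    """Z_{n,phi}: ones on the subdiagonal (i+1, i) and phi at position (1, n)."""
    Z = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        Z[i + 1][i] = 1
    Z[0][n - 1] = phi
    return Z


def matmul(A, B, p):
    return [[sum(a * b for a, b in zip(row, col)) % p for col in zip(*B)] for row in A]


def sylvester(A, B, M, p):
    """Sylvester displacement AM - MB."""
    AM, MB = matmul(A, M, p), matmul(M, B, p)
    return [[(x - y) % p for x, y in zip(r, s)] for r, s in zip(AM, MB)]


def stein(A, B, M, p):
    """Stein displacement M - AMB."""
    AMB = matmul(matmul(A, M, p), B, p)
    return [[(x - y) % p for x, y in zip(r, s)] for r, s in zip(M, AMB)]


def rank_mod_p(M, p):
    """Rank over F_p by Gaussian elimination on a copy of M."""
    M = [[x % p for x in row] for row in M]
    rank = 0
    for c in range(len(M[0])):
        piv = next((r for r in range(rank, len(M)) if M[r][c]), None)
        if piv is None:
            continue
        M[rank], M[piv] = M[piv], M[rank]
        inv = pow(M[rank][c], -1, p)
        M[rank] = [x * inv % p for x in M[rank]]
        for r in range(len(M)):
            if r != rank and M[r][c]:
                f = M[r][c]
                M[r] = [(x - f * y) % p for x, y in zip(M[r], M[rank])]
        rank += 1
    return rank


def toeplitz(t, n):
    """n x n Toeplitz matrix with entry (i, j) equal to t[i - j + n - 1]."""
    return [[t[i - j + n - 1] for j in range(n)] for i in range(n)]
```

A generator of such a displacement, that is a pair of thin matrices whose product is the displacement, then stores the n × n matrix with a number of field elements linear in n.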
Table 13.4.4 Complexity of the matrix-vector product for some structured matrices
13.4.75 Remark As computing matrix-vector products with such structured matrices has close
algorithmic correlation to computations with polynomials and rational functions, these
matrices can be quickly multiplied by vectors, in nearly linear time as shown in Table 13.4.4.
Therefore the algorithms of Subsection 13.4.4 can naturally be applied to structured
matrices, to yield almost $O(n^2)$ time linear algebra.
13.4.76 Remark If the displacement rank α is small there exist algorithms quasilinear in n, the
dimension of the matrices, which over finite fields are essentially variations or extensions of the
Morf/Bitmead-Anderson divide-and-conquer [290, 2154] or Cardinal’s [517] approaches. The
method is based on dividing the original problem repeatedly into two subproblems with one
leading principal submatrix and the related Schur complement. This leads to $O(\alpha^2 n^{1+o(1)})$
system solvers, whose complexity bound has recently been reduced to $O(\alpha^{\omega-1} n^{1+o(1)})$
[364, 1600]. With few exceptions, all algorithms thus need matrices in generic rank profile.
Over finite fields this can be achieved using Kaltofen and Saunders unit upper triangular
Toeplitz preconditioners [1666] and by controlling the displacement rank growth and
non-singularity issues [1656].
13.4.77 Remark Overall, as long as the matrix fits into memory, Gaussian elimination methods over
finite fields are usually faster than iterative methods [935]. There are heuristics trying to
take advantage of both strategies. Among those we briefly mention the most widely used:
1. Perform the Gaussian elimination with reordering 13.4.68 until the matrix is
almost filled up. If the remaining non-eliminated part would fit as a dense matrix,
switch to the dense methods of Subsection 13.4.2.
2. Maintain two sets of rows (or columns), sparse and dense. Favor elimination on
the sparse set. This is particularly adapted to index calculus [1839].
3. Perform a preliminary reordering in order to cut the matrix into four quadrants,
the upper left one being triangular. This, together with the above strategies has
proven effective on matrices which are already quasi-triangular, e.g., Gröbner
bases computations in finite fields [1043].
4. If the rank is very small compared to the dimension of the matrix, one can use left
and right highly rectangular projections to manipulate smaller structures [2040].
5. The arithmetic cost and thus timing predictions are easier on iterative methods
than on elimination methods. On the other hand the number of non-zero elements
at a given point of the elimination is usually increasing during an elimination,
thus providing a lower bound on the remaining time to triangularize. Thus a
heuristic is to perform one matrix-vector product with the original matrix and
then eliminate using Gaussian elimination. If at one point the lower bound for
elimination time surpasses the predicted iterative one or if the algorithm runs
out of memory, stop the elimination and switch to the iterative methods [937].
13.4.78 Remark Iterative methods based on one-dimensional projections, such as the Wiedemann and
Lanczos algorithms, can be generalized with block projections. Via efficient preconditioning
[607] these extensions to the scalar iterative methods can present enhanced properties:
1. Usage of dense sub-blocks, after multiplications of blocks of vectors with the
sparse matrix or the blackboxes, allows for a better locality and optimization of
memory accesses, via the application of the methods of Subsection 13.4.1.
2. Applying the matrix to several vectors simultaneously introduces more
parallelism [718, 719, 1664].
3. The probability of success increases with the size of the considered blocks,
especially over small fields [1657, 2873].
13.4.80 Theorem The degree d matrix minimal polynomial of a block sequence $(H_i) \in (\mathbb{F}_q^{k\times k})^{\mathbb{Z}}$
can be computed in $O(k^3 d^2)$ using block versions of Hermite-Padé approximation and
extended Euclidean algorithm [214] or Berlekamp-Massey algorithm [719, 1657, 2873]. Further
improvements by [214, 1275, 1670, 2804] bring this complexity down to $O((k^\omega d)^{1+o(1)})$,
using a matrix extended Euclidean algorithm.
13.4.82 Remark These block-Krylov techniques are used to achieve the best known time com-
plexities for several computations with black-box matrices over a finite field or the ring of
integers: computing the determinant, the characteristic polynomial [1670], and the solution
of a linear system of equations [946].
See Also
References Cited: [89, 124, 214, 290, 349, 364, 457, 517, 607, 718, 719, 720, 723, 929, 930,
931, 932, 933, 934, 935, 937, 944, 945, 946, 947, 1043, 1171, 1273, 1275, 1336, 1485, 1600,
1643, 1656, 1657, 1664, 1666, 1669, 1670, 1723, 1839, 1840, 2040, 2154, 2170, 2346, 2386,
2726, 2727, 2728, 2804, 2824, 2873, 2874, 2875, 2974, 2976, 2985, 2987, 3031, 3082]
13.5.1 Remark Much of the theory presented here will be familiar to the reader from the theory
of elliptic curves. Moreover, the reader may profit on first reading this by only focusing on
the basic case A = Fq [t]. We have given a very rapid introduction to Drinfeld modules and
the like and have naturally omitted many topics; the subject is quite active and changing
rapidly. For much of the elided material please consult [919, 920, 1333, 1451, 2793].∗
13.5.2 Remark One of the salient points about Drinfeld modules is that they allow one to go
as far as possible in replacing the integers Z as the fundamental object in arithmetic. In
other words, while all of the fields involved obviously lie over Spec(Z), the theory allows
one to study invariants coming from characteristic 0 (such as class groups, etc.) as well
as analogous invariants arising only in finite characteristic (such as the “class module” of
Subsection 13.5.6 below).
∗ This survey is dedicated to the memory of my friend, and function field pioneer, David Hayes.
13.5.3 Remark It is well known that the most important function in classical analysis is the
exponential function $e^z$; in analysis in characteristic p, it is the q-th power mapping $\tau_q(z) := z^q$
($q = p^{n_0}$, $n_0$ a positive integer) and functions created out of it. Note that $\tau_q$ is clearly an
Fq -linear mapping by the Binomial Theorem and separable polynomials in $\tau_q$ are characterized
among all separable polynomials as those having their zeroes form an Fq vector space. For
any Fq -field E we let $E\{\tau_q\}$ be the algebra of polynomials in $\tau_q$ (which is noncommutative
in general); the ring $E\{\tau_q\}$ is the ring of algebraic Fq -linear endomorphisms of the additive
group over E. We also let Ē be a fixed algebraic closure of E.
13.5.4 Remark Let E be an arbitrary field (of arbitrary characteristic).
13.5.6 Example Let E = Fq (t) where t is an indeterminate. Let g ∈ E and suppose that g(t) has
a zero of order j at ∞; set $|g| := q^{-j}$.
13.5.7 Remark The field E is complete if every Cauchy sequence converges to an element of E.
If the field is not complete one can complete it by mimicking the construction of the real
numbers (the field E of Example 13.5.6 completes to the formal Laurent series field Fq ((1/t))
in 1/t). We assume that E is complete for the rest of this subsection.
13.5.8 Remark Property 4 in Definition 13.5.5 immediately implies that a series with coefficients
in E converges if and only if the n-th term goes to 0.
13.5.9 Remark Let F be a finite extension of E. It is well-known that | · | extends uniquely to F
(and thus to any algebraic closure Ē of E).
13.5.10 Proposition [1846] Let F be a normal extension of E. Let σ : F → F be an E-
automorphism. Then |σ(x)| = |x| for all x ∈ F .
13.5.11 Corollary [1846] Let F/E be finite of degree d and let $N_E^F$ be the norm. Then
$|x| = |N_E^F(x)|^{1/d}$ for all x ∈ E.
13.5.12 Remark The absolute value | · | is also readily seen to extend uniquely to the completion
of Ē which remains algebraically closed.
13.5.13 Definition A power series $\sum_{i=0}^{\infty} a_i x^i$ with coefficients in E is entire if it converges for all
x ∈ E.
13.5.14 Theorem [1333] Let f (x) be an entire power series with coefficients in E. Then there
exists a nonnegative integer j, an element c ∈ E and a (possibly empty or finite) sequence
$\{0 \neq \lambda_i\} \subset \bar{E}$, where $|\lambda_i| \to \infty$ as i → ∞, with $f(x) = c x^j \prod_i (1 - x/\lambda_i)$.
13.5.15 Remark Conversely a product, as in Theorem 13.5.14, defines an E-entire power series
if {λi } is stable under E-automorphisms of Ē and inseparability indices are taken into
account. Note that Theorem 13.5.14 immediately implies that the power series for $e^z$ is
never entire as an E-power series for non-Archimedean E of characteristic 0.
13.5.16 Remark The notion of analytic continuation in complex analysis is essential; it is what
allows analytic functions to be defined locally. In non-Archimedean analysis, as just de-
scribed, there are far too many open sets for any analogous theory. Following ideas from
algebraic geometry (and Grothendieck), Tate had the fantastic idea to use a “Grothendieck
topology” to very seriously cut down on the number of such open sets. This theory is called
“rigid analysis” and it does in fact allow for analytic continuation, etc. More importantly,
via this theory one is able to pass between (rigid) analytic sheaves and algebraic sheaves
in the manner of Serre’s famous G.A.G.A. paper [2585]. In particular, one is then able to
construct algebraic functions via analysis. For details on all of this, see for example [354].
13.5.17 Remark Let X be a smooth, complete, geometrically irreducible curve over Fq with function
field k. Note that k is precisely a global field of characteristic p. We let ∞ ∈ X be a fixed
closed point of degree d∞ over Fq , and A the ring of functions holomorphic away from ∞;
set $K := k_\infty$, the completion of $k$ under the absolute value $|x|_\infty = q^{-v(x)}$ where $v(x)$ is
the order of zero of $x$ at $\infty$. We set $C_\infty$ to be the completion of a fixed algebraic closure $\bar{K}$
of $K$. Let $K_1 \subset C_\infty$ be a finite extension of $K$ (automatically also complete) with separable
closure $K_1^{\mathrm{sep}} \subset C_\infty$.
13.5.18 Definition 1. A $K_1$-lattice is a discrete, finitely generated, $\mathrm{Gal}(K_1^{\mathrm{sep}}/K_1)$-stable $A$-submodule
$M$ of $K_1^{\mathrm{sep}}$. 2. A morphism between two lattices $M_1$ and $M_2$ of the same
$A$-rank is a scalar $c$ such that $cM_1 \subseteq M_2$. 3. To a lattice $M$ we attach the exponential function
$e_M(z) := z \prod_{0 \neq \beta \in M} (1 - z/\beta)$.
13.5.19 Remark By Theorem 13.5.14, eM (z) is entire with K1 coefficients. As M can be exhausted
by finite dimensional Fq -vector spaces, one deduces that eM (z) is a limit of Fq -linear poly-
nomials and so is, itself, an Fq -linear surjection from C∞ to itself. We obtain an exact
sequence
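$$0 \longrightarrow M \longrightarrow C_\infty \xrightarrow{\;e_M(z)\;} C_\infty \longrightarrow 0. \tag{13.5.1}$$
Let $a \in A$; by transport of structure (and matching divisors) we now obtain an exotic
$A$-module structure $\phi^M_a(z)$ on $C_\infty$ via the functional equation
$$e_M(az) := \phi^M_a(e_M(z)) = a \cdot e_M(z) \prod_{0 \neq \alpha \in a^{-1}M/M} \left(1 - e_M(z)/e_M(\alpha)\right). \tag{13.5.2}$$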
13.5.21 Remark Via $\phi$, $E$ and $\bar{E}$ become $A$-modules with $a * z := \phi_a(z)$ for $a \in A$, $z \in \bar{E}$. Let $I \subseteq A$
be an ideal and let $\phi[I] \subset \bar{E}$ be those points annihilated by all $i \in I$; commutativity of $A$
implies that $\phi[I]$ is a finite $A$-module. Now let $a \in A$ with $a \notin \ker \imath$. As $A$ is a Dedekind domain,
simple counting arguments show the existence of an integer $d$, independent of $a$, such that
$\phi[(a)] \simeq (A/(a))^d$; $d$ is the rank of $\phi$. Isogenies are only possible between Drinfeld modules of
the same rank and isomorphisms correspond to those of the form $c\tau_q^0$ where $c \neq 0$.
We then have the following fundamental result of Drinfeld.
13.5.22 Theorem [919] The exponential construction gives an equivalence of categories between
Drinfeld modules of rank d over K1 and rank d K1 -lattices.
13.5.23 Example Let $A = \mathbb{F}_q[t]$. The Carlitz module $C$ is the rank 1 Drinfeld module over $\mathbb{F}_q(t)$
given by $C_t := t\tau_q^0 + \tau_q$. The associated Carlitz exponential is $e_C(z) := \sum_{j=0}^{\infty} z^{q^j}/D_j$,
where $D_j$ is the product of all monic polynomials of degree $j$, and the Carlitz lattice is $A\xi$
where $\xi := (-t)^{q/(q-1)} \prod_{j=1}^{\infty} (1 - t^{1-q^j})^{-1}$ is the Carlitz period. For general $A$, Hayes has
constructed generalizations of $C$ which are defined over a Hilbert class field of $k$.
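As a small numerical illustration of Example 13.5.23, one can compare coefficients of $z^{q^j}$ in the functional equation $e_C(tz) = C_t(e_C(z))$ (equation (13.5.2) with $a = t$); this suggests the recursion $D_j = (t^{q^j} - t)\,D_{j-1}^q$ with $D_0 = 1$. The following sketch checks that recursion against the brute-force product of all monic polynomials of degree $j$. It assumes $q = p$ is prime, and the helper names (`poly_mul`, `monic_product`, `recursion_rhs`) are illustrative only, not taken from the text.

```python
from itertools import product

def poly_mul(a, b, p):
    """Multiply polynomials over F_p; coefficient lists, lowest degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return out

def monic_product(j, p):
    """D_j by brute force: product of all monic polynomials of degree j in F_p[t]."""
    out = [1]
    for low in product(range(p), repeat=j):
        out = poly_mul(out, list(low) + [1], p)
    return out

def recursion_rhs(j, p):
    """(t^(p^j) - t) * D_{j-1}^p for j >= 1, assuming q = p is prime."""
    factor = [0] * (p ** j + 1)
    factor[1] = (-1) % p      # coefficient of t
    factor[p ** j] = 1        # coefficient of t^(p^j)
    out = factor
    prev = monic_product(j - 1, p)
    for _ in range(p):
        out = poly_mul(out, prev, p)
    return out

if __name__ == "__main__":
    for j, p in [(1, 2), (2, 2), (3, 2), (1, 3), (2, 3)]:
        print(j, p, monic_product(j, p) == recursion_rhs(j, p))
```

The check is by exhaustive multiplication, so it is only practical for very small $p$ and $j$.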
13.5.24 Remark It was a fundamental observation of Carlitz that adding the elements of C[a] to k
gave an abelian extension. These extensions were then shown to provide the abelian closure
of k which is tamely ramified at ∞ by Hayes [1449] and generalized by him to arbitrary
A in [1450] in analogy with cyclotomic fields. Drinfeld gave a modular approach to the
construction of these class fields in his paper [919]; in [920] he obtained the full abelian
closure. This full abelian closure has also recently been explicitly constructed by Zywina
[3084].
13.5.25 Remark Carlitz originally constructed his module very concretely based on the famous
combinatorial formula called the Moore Determinant (a finite characteristic analog of the
Wronskian as well as the Vandermonde determinant) as given in the next definition (see
also Section 13.2).
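$$\Delta_q(w_1, \dots, w_n) := \begin{vmatrix} w_1 & \cdots & w_n \\ w_1^{q} & \cdots & w_n^{q} \\ \vdots & & \vdots \\ w_1^{q^{n-1}} & \cdots & w_n^{q^{n-1}} \end{vmatrix} = \begin{vmatrix} \tau_q^0(w_1) & \cdots & \tau_q^0(w_n) \\ \tau_q(w_1) & \cdots & \tau_q(w_n) \\ \vdots & & \vdots \\ \tau_q^{n-1}(w_1) & \cdots & \tau_q^{n-1}(w_n) \end{vmatrix}$$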
13.5.27 Remark Moore then shows the following basic result analogous to Abel’s calculation of the
Wronskian.
13.5.28 Proposition [1333] $\Delta_q(w_1, \dots, w_d)$ equals
$$\prod_{i=1}^{d} \; \prod_{k_{i-1} \in \mathbb{F}_q} \cdots \prod_{k_1 \in \mathbb{F}_q} (w_i + k_{i-1} w_{i-1} + \cdots + k_1 w_1).$$
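The product formula of Proposition 13.5.28 can be checked by direct computation in small cases. The sketch below works in $\mathbb{F}_p[t]$ with $q = p$ prime (an assumption made for simplicity), expands the Moore determinant by the Leibniz formula, and compares it with the iterated product. All function names are illustrative.

```python
from itertools import permutations, product

def trim(a):
    """Drop trailing zero coefficients (lowest degree first)."""
    while len(a) > 1 and a[-1] == 0:
        a = a[:-1]
    return a

def add(a, b, p):
    n = max(len(a), len(b))
    return trim([((a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)) % p
                 for i in range(n)])

def mul(a, b, p):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return trim(out)

def power(a, e, p):
    out = [1]
    for _ in range(e):
        out = mul(out, a, p)
    return out

def sign(perm):
    inversions = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def moore_det(ws, p):
    """Determinant of the matrix with (r, c) entry ws[c]^(p^r), by Leibniz expansion."""
    d = len(ws)
    rows = [[power(w, p ** r, p) for w in ws] for r in range(d)]
    total = [0]
    for perm in permutations(range(d)):
        term = [sign(perm) % p]
        for r in range(d):
            term = mul(term, rows[r][perm[r]], p)
        total = add(total, term, p)
    return total

def moore_product(ws, p):
    """Product over i of all w_i + k_{i-1} w_{i-1} + ... + k_1 w_1 with k's in F_p."""
    out = [1]
    for i, w in enumerate(ws):
        for ks in product(range(p), repeat=i):
            factor = w
            for k, v in zip(ks, ws[:i]):
                factor = add(factor, mul([k], v, p), p)
            out = mul(out, factor, p)
    return out

if __name__ == "__main__":
    ws = [[1], [0, 1], [0, 0, 1]]   # w_1 = 1, w_2 = t, w_3 = t^2
    print(moore_det(ws, 2) == moore_product(ws, 2))
```

For $w_1 = 1$, $w_2 = t$ both sides reduce to $t^q - t = \prod_{k \in \mathbb{F}_q}(t + k)$, which is the simplest instance of the identity.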
13.5.29 Remark Proposition 13.5.28 allowed Carlitz to calculate various products of polynomials
over finite fields, and thus to explicitly construct his exponential, lattice, and module.
13.5.30 Remark Returning to a general Drinfeld module $\phi$, let $p$ now be a prime of $A$ with
completions $k_p$, $A_p$. Let $\phi[p^\infty]$ be the set of all $p$-power torsion points.
13.5.31 Definition We set $T_p(\phi) := \mathrm{Hom}_A(k_p/A_p, \phi[p^\infty])$; this is the $p$-adic Tate module of $\phi$.
13.5.32 Remark If $p \neq \ker \imath$, then $T_p(\phi)$ is isomorphic to $A_p^d$ as $A$-module; the construction of Tate
modules is functorial.
13.5.33 Remark Let $E$, as in Definition 13.5.20, now also be finite; let $\phi$ be a Drinfeld module over
$E$ of rank $d$. Let $F_E := \tau_q^t$, which is an endomorphism of $\phi$. Set $b := \ker \imath$ (which is a maximal
ideal of $A$) and let $p \neq b$ be another nontrivial prime of $A$; set $f_\phi(u) := \det(1 - uF_E \mid T_p(\phi))$.
Very clever use of central simple arithmetic allows one to establish the following results.
13.5.34 Theorem [920] The polynomial $f_\phi(u)$ depends only on the isogeny class of $\phi$ and has
coefficients in $A$ which are independent of the choice of $p$. The reciprocal roots of $f_\phi(u)$ in
$C_\infty$ have absolute value $q^{t/d}$ and their product generates the ideal $b^{[E : A/b]}$.
13.5.36 Remark Clearly the set of Weil numbers of rank d for E is acted on by the group of
$k$-automorphisms of $\bar{k} \subset C_\infty$ and we let $W_d(E)$ be the set of orbits under this action.
13.5.37 Theorem [920] The mapping from the set of isogeny classes of Drinfeld modules over E of
rank d to Wd (E) given by Theorem 13.5.34 is a bijection.
13.5.38 Remark Let E now be a finite extension of k equipped with a Drinfeld module φ of rank d;
let O be the A-integers of E. As A is a finitely generated Fq -algebra, for almost all O-primes
P we can reduce the coefficients of φ modulo P to obtain a Drinfeld module of rank d over
O/P. For the other primes, we now localize and assume that E is equipped with a nontrivial
discrete valuation v with v(A) nonnegative; let Ov ⊂ E be the associated valuation ring of
v-integers and let Mv ⊂ Ov be the maximal ideal.
13.5.39 Definition 1. The Drinfeld module φ has stable reduction at v if there exists ψ which is
isomorphic to φ such that ψ has coefficients in Ov and such that the reduction ψ v of ψ
is a Drinfeld module (of rank d1 ≤ d). 2. The module φ has good reduction at v if it has
stable reduction and d1 = d. 3. The module φ has potential stable (resp. potential good )
reduction if there is an extension (E1 , w) of (E, v) such that φ has stable (resp. good)
reduction at w.
13.5.40 Remark Drinfeld [919] established that every Drinfeld module has potential stable reduction.
Let $p$ be a prime of $A$ not contained in $M_v$ and view $T_p(\phi)$ as a $\mathrm{Gal}(E^{\mathrm{sep}}/E)$-module.
13.5.41 Theorem [2767] The Drinfeld module $\phi$ has good reduction at $v$ if and only if $T_p(\phi)$ is
unramified at $v$ as a $\mathrm{Gal}(E^{\mathrm{sep}}/E)$-module.
13.5.42 Remark Theorem 13.5.41 is an obvious analog of the theorem of Ogg-Néron-Shafarevich and
is due to Takahashi. Taguchi [2765] has established that $T_p(\phi)$ is a semisimple $\mathrm{Gal}(E^{\mathrm{sep}}/E)$-module;
this was also independently established by Tamagawa [2773].
13.5.43 Remark Let E continue to be a finite extension of k equipped with a Drinfeld module φ of
rank d; using φ, E becomes a natural A-module which is denoted “φ(E)”.
13.5.44 Theorem [2419] The A-module φ(E) is isomorphic to T ⊕ N where T is a finite torsion
module and N is a free A-module of rank ℵ0 .
13.5.45 Remark The above result, due to Poonen, applies more generally to other modules of
rational points (such as a ring OS of S-integers which contains the coefficients of φ). The
main technique in the proof is to establish the tameness of φ(L); that is, let M ⊆ φ(L) be a
submodule such that k ⊗A M is finite dimensional, then M is itself finitely generated. This
result is established via the use of heights and is analogous to the theorem of Mordell-Weil
for elliptic curves/abelian varieties except that the rank is always infinite.
13.5.46 Remark We now present very recent work of Taelman [2760, 2761] that establishes an
analog of the classical class group/Tate-Shafarevich group and the group of units/Mordell-
Weil group in the theory of Drinfeld modules.
13.5.47 Remark As above, let E be a finite extension of k with A-integers O; set E∞ := E ⊗k K.
Let φ be a Drinfeld A-module of rank d over E whose coefficients will be assumed to lie in
O (N.B., this does not imply that φ has good reduction at all primes of O); let eφ (z) be the
associated exponential function. The functional equation satisfied by $e_\phi(z)$ implies that the
coefficients of $e_\phi(z)$ lie in $E$. Thus $e_\phi(z)$ induces a natural $\mathbb{F}_q$-linear, continuous endomorphism
of $E_\infty$ (also denoted $e_\phi$); it is an open map as the derivative of $e_\phi(z)$ is identically 1. Let
$\hat{M}(\phi/O) := e_\phi^{-1}(O) \subset E_\infty$. The openness of $e_\phi(z)$ immediately implies the next result.
13.5.48 Proposition [2760, 2761] The A-module M̂ (φ/O) is discrete and cocompact in E∞ .
13.5.49 Definition We set M (φ/O) ⊂ φ(O) to be the image of M̂ (φ/O) under eφ (z).
13.5.50 Corollary [2760, 2761] M (φ/O) is a finitely generated A-module under the Drinfeld action.
13.5.51 Remark M (φ/O) is an analog of both the group of units of a number field and the Mordell-
Weil group of an elliptic curve.
13.5.53 Remark As O is cocompact in E∞ and eφ (E∞ ) is open in E∞ , we deduce the next result.
13.5.54 Proposition [2760, 2761] H(φ, O) is a finite A-module.
13.5.55 Remark H(φ, O) is the class module or the Tate-Shafarevich module of φ over O.
13.5.57 Remark One can reformulate the invariants of the Carlitz module in terms of Zariski sheaves
and shtuka (see Remark 13.5.116) thereby reinterpreting them à la motivic cohomology, see
[2762].
13.5.58 Remark Let $\mathbb{F}_\infty \subset K$ be the field of constants and let $\pi \in K$ be a fixed element of order 1
at $\infty$. Every element $x \in K^*$ has a decomposition $x = \zeta_{x,\pi}\, \pi^{v_\infty(x)}\, \langle x \rangle_\pi$ with $\zeta_{x,\pi} \in \mathbb{F}_\infty^*$ and
$\langle x \rangle_\pi \equiv 1 \pmod{\pi}$; $x$ is positive if $\zeta_{x,\pi} = 1$ (generalizing the notion of “monic polynomial”).
Let $I$ be the group of $A$-fractional ideals with subgroups $P^+ \subseteq P$ where $P$ (resp. $P^+$) is
the group of principal (resp. principal and positively generated) ideals; $I/P^+$ is finite, etc.
13.5.59 Remark Let $U_1 \subset C_\infty$ be the group of elements $1 + w$ where $|w|_\infty < 1$. The binomial
theorem implies that $U_1$ is a $\mathbb{Z}_p$-module in the usual exponential fashion; as we can also
take $p$-power roots uniquely in $U_1$, it is in fact a $\mathbb{Q}_p$-vector space. Let $I = (i) \in P^+$ where
$i$ is positive and define $\langle I \rangle_\pi := \langle i \rangle_\pi$. As $U_1$ is divisible the next result follows directly.
13.5.60 Proposition [1333] The map $I \mapsto \langle I \rangle_\pi$ extends uniquely to a homomorphism $I \mapsto \langle I \rangle_\pi$ of
$I$ to $U_1$.
13.5.61 Remark Let deg(I) denote the degree over Fq of the associated divisor on Spec(A).
13.5.66 Remark Theorem 13.5.34 immediately implies that L(φ, s) converges on a “half-plane” of
S∞ consisting of (x, y) with |x|∞ bounded below (see [1208] for factors at the finitely many
bad primes).
13.5.67 Theorem [335] The function $L(\phi, s)$ analytically continues to a continuous (in $y \in \mathbb{Z}_p$)
family of entire power series in $x^{-1}$ which is also continuous on $S_\infty$.
13.5.68 Remark Note that these entire power series are obtained by expanding the Euler products
and “summing by degree.”
13.5.69 Corollary [335] Let $j$ be a nonnegative integer. Then $L(\phi, (x, -j))$ is a polynomial in $x^{-1}$.
13.5.70 Remark These special polynomials play an essential role in the theory. By [335], their degree
grows logarithmically in j. They interpolate continuously to entire families at all places of
k. Moreover, this logarithmic growth allows one to analogously handle all associated partial
L-series obtained by summing only over a residue class [1334].
13.5.72 Example Let A = Fq [t] and let “positive=monic.” Then L(C, s) = ζA (s − 1).
13.5.73 Remark Let ξ be a Hayes-period (as in Example 13.5.23). Judicious use of the associated
exponential function gives the next result.
13.5.74 Theorem [1333] Let $j$ be a positive integer divisible by $q^{d_\infty} - 1$. Then $0 \neq \zeta_A(j)/\xi^j$ is
algebraic over $k$.
13.5.75 Example In the case A = Fq [t], Carlitz in the 1930’s defined a suitable “factorial” element
and thus analogs of Bernoulli numbers; these are called “Bernoulli-Carlitz” elements.
13.5.85 Remark Let $A = \mathbb{F}_q[t]$, $p$ a prime of $A$ of degree $d$, and let $\pi_p \in A$ have order 1 at $p$; thus
the completion $A_p$ is isomorphic to $\mathbb{F}[[\pi_p]]$ where $\mathbb{F} \simeq \mathbb{F}_{q^d}$. Set $R = A_p$ equipped with the
canonical topology.
13.5.87 Remark Let $f : R \to R$ be a continuous function and $\mu$ a measure; it is easy to see that the
associated “Riemann sums” will converge to an element denoted “$\int_R f(t)\, d\mu(t)$.” Let $\mu$ and
ν be two R-valued measures; it is clear that their sum is also an R-valued measure. Their
product is defined, as usual, by convolution as below.
13.5.89 Remark It is easy to check that the space of measures forms a commutative $R$-algebra
under convolution. Now let $D_j$ be the hyperdifferential operator given by $D_j x^i := \binom{i}{j} x^{i-j}$.
13.5.90 Theorem [1333] There is an isomorphism (which depends on the basis constructed) of the
R-algebra of measures and R{{D}}.
13.5.91 Remark Let $y \in \mathbb{Z}_p$ be written as $y = \sum_{i=0}^{\infty} c_i q^i$, where $0 \le c_i < q$. Let $\rho$ be a permutation
of the set $\{0, 1, 2, \dots\}$.
13.5.92 Definition We define $\rho_*(y)$, $y \in \mathbb{Z}_p$, by $\rho_*(y) := \sum_{i=0}^{\infty} c_i q^{\rho(i)}$. We let $S_{(p)}$ denote the
induced group of bijections of $\mathbb{Z}_p$.
13.5.93 Theorem [1335] The map $\rho_*$ is a homeomorphism of $\mathbb{Z}_p$ and $\rho_*(y_0 + y_1) = \rho_*(y_0) + \rho_*(y_1)$ if
there is no carry-over of $q$-adic digits in the sum of $y_0$ and $y_1$. Furthermore, $\rho_*$ stabilizes
both the positive and negative integers. Finally, $n \equiv \rho_*(n) \pmod{q - 1}$ for all integers $n$.
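For nonnegative integers, whose $q$-adic expansions are finite, the digit permutation of Definition 13.5.92 is easy to experiment with. The sketch below is restricted to that case (general elements of $\mathbb{Z}_p$ have infinitely many digits and are not modelled here); the particular permutation `swap_0_3` and all function names are hypothetical choices made for illustration.

```python
def digits(n, q):
    """Base-q digits of a nonnegative integer, least significant first."""
    out = []
    while n:
        n, r = divmod(n, q)
        out.append(r)
    return out

def rho_star(n, q, rho):
    """Move the i-th q-adic digit of n to position rho(i)."""
    return sum(c * q ** rho(i) for i, c in enumerate(digits(n, q)))

def swap_0_3(i):
    """A hypothetical permutation of digit positions: exchange positions 0 and 3."""
    return {0: 3, 3: 0}.get(i, i)

def no_carry(a, b, q):
    """True if adding a and b in base q produces no carries."""
    return all(x + y < q for x, y in zip(digits(a, q), digits(b, q)))

if __name__ == "__main__":
    q = 3
    # congruence modulo q - 1
    print(all(rho_star(n, q, swap_0_3) % (q - 1) == n % (q - 1) for n in range(q ** 4)))
    # additivity whenever there is no carry-over
    print(all(rho_star(a + b, q, swap_0_3) ==
              rho_star(a, q, swap_0_3) + rho_star(b, q, swap_0_3)
              for a in range(q ** 4) for b in range(q ** 4) if no_carry(a, b, q)))
```

The congruence modulo $q - 1$ reflects the fact that $q^i \equiv 1 \pmod{q-1}$ for every $i$, so only the digit sum matters.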
13.5.94 Remark It is remarkable that S(q) appears to act as a group of symmetries of the L-series
of Drinfeld modules. We present some evidence of this here and refer the reader to [1335] for
more details. Our first result is completely general and is an application of Lucas’ formula
for the reduction modulo p of binomial coefficients.
13.5.95 Proposition [1335] Let $\rho_* \in S_{(p)}$, $y \in \mathbb{Z}_p$, and $j$ a nonnegative integer. Then we have
$$\binom{\rho_*(y)}{\rho_*(j)} \equiv \binom{y}{j} \pmod{p}.$$
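The congruence of Proposition 13.5.95 can be tested numerically when $y$ is a nonnegative integer. The following sketch does this by brute force for small ranges; it handles only finite digit expansions, and the permutation `cycle_012` as well as the function names are illustrative assumptions.

```python
from math import comb

def rho_star(n, q, rho):
    """Permute the q-adic digit positions of a nonnegative integer n by rho."""
    total, i = 0, 0
    while n:
        n, c = divmod(n, q)
        total += c * q ** rho(i)
        i += 1
    return total

def cycle_012(i):
    """A hypothetical permutation: cycle digit positions 0 -> 1 -> 2 -> 0."""
    return {0: 1, 1: 2, 2: 0}.get(i, i)

def lucas_invariant(q, p, rho, bound):
    """Check binom(rho*(y), rho*(j)) = binom(y, j) mod p for 0 <= y, j < bound."""
    return all(comb(rho_star(y, q, rho), rho_star(j, q, rho)) % p == comb(y, j) % p
               for y in range(bound) for j in range(bound))

if __name__ == "__main__":
    print(lucas_invariant(3, 3, cycle_012, 27))   # q = p = 3
    print(lucas_invariant(4, 2, cycle_012, 64))   # q = 4, p = 2
```

The case $q = 4$, $p = 2$ illustrates that the digits are taken in base $q$ while the congruence is modulo the characteristic $p$.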
13.5.96 Theorem [1335] Let $j$ be a nonnegative integer. Then, as polynomials in $x^{-1}$, $\zeta_A(x, -j)$
and $\zeta_A(x, -\rho_*(j))$ have the same degree.
13.5.97 Remark Theorem 13.5.96 also depends on another old formula of Carlitz (resurrected by
Thakur) that allows one to compute the relevant degrees; these degrees then turn out to be
invariant under the $\rho_*$ action as a corollary of Theorem 13.5.93. Using Lucas’ formula again,
one also establishes the next result.
13.5.98 Theorem [1335] Let $\rho_*$ be as above. Then the map $D_i \mapsto D_{\rho_*(i)}$ gives rise to an automorphism
of $R\{\{D\}\}$.
13.5.99 Remark Thus, of course, S(p) also acts as automorphisms of measure algebras. Moreover,
long ago Carlitz computed the denominators of his Bernoulli-Carlitz elements (as in Exam-
ple 13.5.75) in analogy with the classical result of von Staudt-Clausen. It is quite remarkable
that the condition that a prime $f \in A$ of degree $d$ divides this denominator is invariant
under $S_{(q^d)}$.
13.5.100 Remark We finish this subsection by noting that, in examples including A = Fq [t], the
zeroes of ζA (s) “lie on a line” (with perhaps finitely many exceptions for nonrational A)
with orders agreeing with classical predictions. For this, see [2611, 2892].
13.5.10 Multizeta
13.5.101 Remark L-series of Drinfeld modules give rise to entire power series upon summing by
degree. If one takes these sums and intermixes them, one obtains “multi-L-series” whose
study was initiated by Thakur (see for example [2792]). One obtains a remarkably rich
edifice with analogs of many essential classical results such as shuffle relations and the
realization of multizeta values as periods, etc. For instance, Anderson and Thakur [97] have
shown the analog of a result of Terasoma giving an interpretation of multizeta values as
periods of “mixed Carlitz-Tate t-modules.”
13.5.102 Remark Let $A$ again be a general base ring. Let $M$ be a rank $d$ lattice as in Definition
13.5.18. Let $\{m_1, \dots, m_d\} \subset M$ be chosen so that $M = Am_1 + \cdots + Am_{d-1} + Im_d$ where $I$ is
a nonzero ideal of $A$ (as can always be done since $A$ is a Dedekind domain). The discreteness
of $M$ implies that $(m_1, \dots, m_d)$ does not belong to any hyperplane defined over $K = k_\infty$.
13.5.103 Definition We define $\Omega^d := \mathbb{P}^{d-1}(C_\infty) \setminus Y_d$, where $Y_d$ is the subset of all points lying in
$K$-hyperplanes.
13.5.104 Remark The space Ωd comes equipped with the structure of a rigid analytic space. It is
connected (in the appropriate rigid sense) [2847]; in particular, functions are determined by
their local expansions.
13.5.105 Example Let $d = 2$. Then $\Omega^2 = \mathbb{P}^1(C_\infty) - \mathbb{P}^1(K)$. This space is analogous to the classical
upper and lower half-planes.
13.5.106 Remark Let $M$, $\{m_1, \dots, m_d\}$, be as above; clearly $M$ is isomorphic (Definition 13.5.18) to
the lattice $\hat{M} := A + A\frac{m_2}{m_1} + \cdots + A\frac{m_{d-1}}{m_1} + I\frac{m_d}{m_1}$ which is, itself, associated to the element
$(m_2/m_1, \dots, m_d/m_1) \in \Omega^d$. It is therefore reasonable that modular spaces associated to
Drinfeld modules (i.e., spaces whose points parametrize Drinfeld modules with, perhaps,
some added structure on their torsion points) can be described using $\Omega^d$ in a manner
exactly analogous to elliptic modular curves and the classical upper half plane. Indeed, this
was accomplished by Drinfeld in his original paper [919].
13.5.107 Remark Drinfeld’s construction can be quite roughly sketched as follows: For simplicity,
assume that $I = A$ and let $J \subset A$ be a nontrivial ideal. Let $\Gamma_J \subset \mathrm{GL}_d(A)$ be the principal
congruence subgroup. Then $\Gamma_J$ acts on $\Omega^d$ in complete analogy with the standard action of
congruence subgroups of $\mathrm{SL}_2(\mathbb{Z})$. This action of $\Gamma_J$ is rigid analytic and the quotient $\Gamma_J \backslash \Omega^d$
also exists as a rigid space which is denoted $M_J^d$.
13.5.108 Theorem [919] The space $M_J^d$ is a regular affine variety of dimension $d$. It is equipped with
a natural morphism to $\mathrm{Spec}(A)$ which is flat and smooth away from the primes dividing $J$.
13.5.109 Remark The space $M_J^2$ is a relative curve highly analogous to elliptic modular curves.
Indeed, it also can be compactified via “Tate objects” at a finite number of cusps.
13.5.110 Example Let $I = A$ as before and let $\Gamma = \mathrm{SL}_2(A)$. Let $e_A(z)$ be the exponential function
associated to $A$ considered as a rank 1 lattice (Definition 13.5.18). Set $E_A(z) := e_A(z)^{-1}$.
Then one shows [1332] that $E_A(z)$ is a uniformizing parameter at the infinite cusp for $\Gamma$
acting on $\Omega^2$ in analogy with $e^{2\pi i z}$ and the upper half plane classically. (We note that our
notation “$E_A(z)$” is highly nonstandard but necessary as the commonly used symbols are
already taken here.)
13.5.111 Remark Given the analogy between the spaces $\Omega^d$ (and especially $\Omega^2$) and the classical
upper half plane, it makes sense to discuss modular forms in this context [1332]. For simplicity,
let $\Gamma = \mathrm{GL}_2(A)$.
13.5.112 Definition [1267] A rigid analytic function $f$ on $\Omega^2$ is a modular form of weight $k$ and type
$m$ (for a nonnegative integer $k$ and class $m$ of $\mathbb{Z}/(q - 1)$) if and only if
1. for $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma$ one has $f\left(\frac{az+b}{cz+d}\right) = (\det \gamma)^{-m} (cz + d)^k f(z)$,
2. $f$ is holomorphic at the cusps.
13.5.113 Remark Property 1 of Definition 13.5.112 implies that f (z) has a Laurent-series expansion
in terms of EA (z) and Property 2 is the requirement that this expansion has no negative
terms; similar expansions are mandated at the other cusps. Modular forms of a given weight
and type then comprise finite dimensional vector spaces.
13.5.114 Example Let $A = \mathbb{F}_q[t]$. A Drinfeld module $\rho$ of rank 2 is determined by $\rho_t = t\tau^0 + g\tau + \Delta\tau^2$.
Here $g$ is a modular form of weight $q - 1$ and type 0 and $\Delta$ is a form of weight $q^2 - 1$ and
type 0.
13.5.115 Remark One can readily equip such modular forms with an action of the Hecke operators
which are related to Galois representations. For this see, for example, [125, 126, 334, 1267].
13.5.116 Remark The modular theory has been highly important in a number of ways. In analogy
with work on the Korteweg-de Vries equation, Drinfeld found a way to sheafify his modules
(see, for example, [2201] for a very nice account). These sheaves, called “shtuka,” have been
essential to the work of Lafforgue completing the Langlands program for function fields and
the general linear group [1829, 1830].
13.5.117 Remark Building on the notion of shtuka, in a fundamental paper Anderson [94] generalized
Drinfeld modules to “t-modules” by replacing the additive group with additive n-space. Such
t-modules have a tensor product as well as associated exponential functions and Anderson
presents an extremely useful criterion for the surjectivity of these functions. The paper [94]
has played a key role in many developments.
13.5.118 Remark Another application of the Drinfeld modular curves has been in the algebraic-
geometric construction of codes due to Goppa. For example, the reader may consult [2280,
2881].
13.5.119 Remark Finally, we mention the very recent result of Pellarin [2375] which establishes a
two-variable deformation of Theorem 13.5.74. In particular, one obtains deformations of
Bernoulli-Carlitz elements.
13.5.120 Remark There is a vast theory related to the transcendency of elements arising in the
theory of Drinfeld modules and the like. For instance, the Carlitz period (Example 13.5.23)
was shown to be transcendental by Wade [2888]. Since then a vast number of other results
and techniques were introduced into the theory including ideas from automata theory. For
a sampling of what can be found, we refer the reader to [78, 79, 96, 581, 582, 3039, 3040].
References Cited: [78, 79, 94, 96, 97, 125, 126, 334, 335, 354, 581, 582, 919, 920, 1208,
1267, 1332, 1333, 1334, 1335, 1449, 1450, 1451, 1829, 1830, 1846, 2201, 2280, 2375, 2419,
2585, 2611, 2760, 2761, 2762, 2763, 2764, 2765, 2767, 2773, 2792, 2793, 2847, 2881, 2888,
2892, 3039, 3040, 3084]
III
Applications
14 Combinatorial
Latin squares • Lacunary polynomials over finite fields • Affine and
projective planes • Projective spaces • Block designs • Difference sets
• Other combinatorial structures • (t, m, s)-nets and (t, s)-sequences •
14.1.1 Definition A latin square of order n is an n × n array based upon n distinct symbols with
the property that each row and each column contains each of the n symbols exactly
once.
14.1.2 Example The following are latin squares of orders 3 and 5, respectively:
0 1 2
1 2 0
2 0 1

1 2 3 4 0
3 4 0 1 2
0 1 2 3 4
2 3 4 0 1
4 0 1 2 3
14.1.3 Remark Given any latin square of prime power order q (with symbols from Fq , the finite
field of order q), using the Lagrange Interpolation Formula from Theorem 2.1.131, we can
construct a polynomial P (x, y) of degree at most q − 1 in both x and y which represents
the given latin square. The field element P (a, b) is placed at the intersection of row a and
column b. For example, the two squares given in the previous example can be represented
by the polynomials x + y over F3 and 2x + y + 1 over F5 .
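The polynomial representation of Remark 14.1.3 can be illustrated for prime order $p$, where field arithmetic is just arithmetic modulo $p$ (prime powers would need a genuine finite field implementation, which is not attempted here). The sketch builds a square from a polynomial and, conversely, recovers a representing polynomial from a square through the interpolation formula $P(x, y) = \sum_{a,b} L(a,b)\,(1 - (x-a)^{p-1})(1 - (y-b)^{p-1})$. Function names are illustrative.

```python
def square_from_poly(P, p):
    """Place P(x, y) mod p at the intersection of row x and column y."""
    return [[P(x, y) % p for y in range(p)] for x in range(p)]

def is_latin(sq):
    """Each row and each column contains each of 0..n-1 exactly once."""
    symbols = set(range(len(sq)))
    return (all(set(r) == symbols for r in sq)
            and all(set(c) == symbols for c in zip(*sq)))

def interpolate(sq, p):
    """Return P with P(a, b) = sq[a][b], of degree at most p - 1 in each variable."""
    def P(x, y):
        total = 0
        for a in range(p):
            for b in range(p):
                # 1 - (x - a)^(p-1) is 1 at x = a and 0 elsewhere (Fermat's little theorem)
                ind = (1 - pow(x - a, p - 1, p)) * (1 - pow(y - b, p - 1, p))
                total += sq[a][b] * ind
        return total % p
    return P

if __name__ == "__main__":
    s3 = square_from_poly(lambda x, y: x + y, 3)
    s5 = square_from_poly(lambda x, y: 2 * x + y + 1, 5)
    for row in s5:
        print(*row)
    print(is_latin(s3), is_latin(s5))
    print(square_from_poly(interpolate(s5, 5), 5) == s5)
```

The two squares produced are the ones attributed to $x + y$ over $\mathbb{F}_3$ and $2x + y + 1$ over $\mathbb{F}_5$ in Remark 14.1.3.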
14.1.4 Definition Assume that a latin square of order n is based upon the n distinct symbols
1, 2, . . . , n. Such a latin square of order n is reduced if the first row and first column are
in the standard order 1, 2, . . . , n. Let $l_n$ denote the number of reduced latin squares of
order n. Let $L_n$ denote the total number of distinct latin squares of order n.
14.1.7 Definition Two latin squares of order $n$ are orthogonal if upon placing one of the squares
on top of the other, we obtain each of the possible $n^2$ distinct ordered pairs exactly
once. In addition, a set $\{L_1, \dots, L_t\}$ of latin squares all of the same order is orthogonal
if each distinct pair of squares is orthogonal, i.e., if $L_i$ is orthogonal to $L_j$ whenever
$i \neq j$. Such a set of squares is a set of mutually orthogonal latin squares (MOLS).
14.1.8 Remark There are numerous combinatorial objects which are equivalent to sets of MOLS.
These include transversal designs, orthogonal arrays, edge-partitions of a complete bipartite
graph, and (k, n)-nets. We refer to Chapter III, Theorem 3.18 of [706] for a more detailed
discussion of these topics; see also Sections 14.5 and 14.7.
14.1.9 Definition Let N (n) denote the maximum number of mutually orthogonal latin squares
(MOLS) of order n.
14.1.12 Theorem [355] For any prime power $q$, the polynomials $ax + y$ with $a \in \mathbb{F}_q$, $a \neq 0$, represent a
complete set of $q - 1$ MOLS of order $q$ by placing the field element $ax + y$ at the intersection
of row $x$ and column $y$ of the $a$-th square.
14.1.13 Remark In Subsection 14.1.5 we discuss connections of complete sets of MOLS with other
combinatorial objects; in particular with affine and projective planes where it is stated that
the existence of a complete set of MOLS of order n is equivalent to the existence of an
affine, or projective, plane of order n.
14.1.14 Example The following gives a complete set of 4 MOLS of order 5, arising from the poly-
nomials x + y, 2x + y, 3x + y, 4x + y over the field F5
0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4
1 2 3 4 0 2 3 4 0 1 3 4 0 1 2 4 0 1 2 3
2 3 4 0 1 , 4 0 1 2 3 , 1 2 3 4 0 , 3 4 0 1 2 .
3 4 0 1 2 1 2 3 4 0 4 0 1 2 3 2 3 4 0 1
4 0 1 2 3 3 4 0 1 2 2 3 4 0 1 1 2 3 4 0
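The construction of Theorem 14.1.12 is straightforward to reproduce for prime $q = p$, where $\mathbb{F}_p$ is the integers modulo $p$. The sketch below (names are illustrative; prime powers are not covered) builds the $p - 1$ squares and verifies pairwise orthogonality by counting superimposed pairs.

```python
from itertools import combinations

def mols_prime(p):
    """The squares a*x + y over F_p for a = 1, ..., p - 1 (row x, column y)."""
    return [[[(a * x + y) % p for y in range(p)] for x in range(p)]
            for a in range(1, p)]

def are_orthogonal(A, B):
    """Superimposing A and B yields every ordered pair exactly once."""
    n = len(A)
    pairs = {(A[i][j], B[i][j]) for i in range(n) for j in range(n)}
    return len(pairs) == n * n

if __name__ == "__main__":
    squares = mols_prime(5)
    print(len(squares))
    print(all(are_orthogonal(A, B) for A, B in combinations(squares, 2)))
```

Orthogonality of $ax + y$ and $a'x + y$ amounts to the unique solvability of the corresponding $2 \times 2$ linear system, whose determinant is $a - a' \neq 0$.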
14.1.20 Definition If $A$ is a latin square of order $m$ and $B$ is a latin square of order $n$, denote the
entry at row $i$ and column $j$ of $A$ by $a_{ij}$. Similarly we denote the $(i, j)$ entry of $B$ by
$b_{ij}$. Then the Kronecker product of $A$ and $B$ is the $mn \times mn$ square $A \otimes B$ whose $(i, j)$ block,
for $1 \le i, j \le m$, is the $n \times n$ array with $(k, l)$ entry the ordered pair $(a_{ij}, b_{kl})$.
For example, let
A =
0 1
1 0

B =
0 1 2
1 2 0
2 0 1

Then the Kronecker product construction using $A$ and $B$ yields the following $6 \times 6$ square
whose elements are ordered pairs:
00 01 02 10 11 12
01 02 00 11 12 10
02 00 01 12 10 11
10 11 12 00 01 02
11 12 10 01 02 00
12 10 11 02 00 01
14.1.22 Lemma If H and K are latin squares of orders n1 and n2 , then H ⊗ K is a latin square of
order n1 n2 .
14.1.23 Lemma If H1 and H2 are orthogonal latin squares of order n1 and K1 and K2 are orthogonal
latin squares of order n2 , then H1 ⊗ K1 and H2 ⊗ K2 are orthogonal latin squares of order
n1 n2 .
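Lemmas 14.1.22 and 14.1.23 can be checked on small inputs. The following sketch implements the Kronecker product with ordered pairs as symbols and tests it on squares of prime orders 3 and 5 built from the polynomials $ax + y$; the helper names are illustrative.

```python
def kronecker(A, B):
    """Kronecker product of latin squares; entries are ordered pairs (a_ij, b_kl)."""
    m, n = len(A), len(B)
    return [[(A[i // n][j // n], B[i % n][j % n]) for j in range(m * n)]
            for i in range(m * n)]

def is_latin(sq):
    """n distinct symbols overall, each row and column a permutation of them."""
    n = len(sq)
    return (len({s for r in sq for s in r}) == n
            and all(len(set(r)) == n for r in sq)
            and all(len(set(c)) == n for c in zip(*sq)))

def are_orthogonal(A, B):
    n = len(A)
    return len({(A[i][j], B[i][j]) for i in range(n) for j in range(n)}) == n * n

def linear_square(a, p):
    """The square a*x + y over F_p, p prime."""
    return [[(a * x + y) % p for y in range(p)] for x in range(p)]

if __name__ == "__main__":
    A = [[0, 1], [1, 0]]
    B = [[0, 1, 2], [1, 2, 0], [2, 0, 1]]
    for row in kronecker(A, B):
        print(" ".join(f"{s}{t}" for s, t in row))
    H1, H2 = linear_square(1, 3), linear_square(2, 3)
    K1, K2 = linear_square(1, 5), linear_square(2, 5)
    print(are_orthogonal(kronecker(H1, K1), kronecker(H2, K2)))
```

The printed $6 \times 6$ array reproduces the example given with Definition 14.1.20.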
14.1.24 Corollary If there is a pair of MOLS of order n and a pair of MOLS of order m, then there
is a pair of MOLS of order mn.
14.1.25 Theorem If $n = q_1 \cdots q_r$, where the $q_i$ are powers of distinct primes with $q_1 < \cdots < q_r$, then
$N(n) \ge q_1 - 1$.
14.1.26 Remark In 1922, MacNeish [1988] conjectured that N (n) = q1 − 1. This has been shown
to be false for all non-prime power values of n ≤ 62; it is in fact conjectured in [1876] that
this conjecture is false at all values of n other than 6 and prime powers.
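The lower bound of Theorem 14.1.25 depends only on the prime power factorization of $n$. A minimal sketch that computes $q_1 - 1$ by trial division is given below; the function name is an illustrative choice, and the value returned is only the lower bound of the theorem, not $N(n)$ itself.

```python
def macneish_bound(n):
    """q_1 - 1, where q_1 is the smallest prime power in the factorization of n >= 2."""
    bound = None
    m, p = n, 2
    while p * p <= m:
        if m % p == 0:
            q = 1
            while m % p == 0:      # collect the full power of p dividing n
                m //= p
                q *= p
            bound = q - 1 if bound is None else min(bound, q - 1)
        p += 1
    if m > 1:                      # one remaining prime factor to the first power
        bound = m - 1 if bound is None else min(bound, m - 1)
    return bound

if __name__ == "__main__":
    for n in (6, 9, 10, 12, 15, 20):
        print(n, macneish_bound(n))
```

For a prime power $n = q$ the bound equals $q - 1$, which is attained by the complete set of Theorem 14.1.12.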
14.1.27 Definition Let n = λm. An F (n; λ) frequency square is an n × n square based upon m
distinct symbols so that each of the m symbols occurs exactly λ times in each row and
column. Thus an F (n; 1) frequency square is a latin square of order n. Two F (n; λ)
frequency squares are orthogonal if when one square is placed on top of the other, each
of the $m^2$ possible distinct ordered pairs occurs exactly $\lambda^2$ times [2173]. A set of such
squares is orthogonal if any two distinct squares are orthogonal. Such a set of mutually
orthogonal squares is a set of MOFS.
14.1.28 Theorem [1456] The maximum number of MOFS of the form $F(n; \lambda)$ is bounded above by
$(n - 1)^2/(m - 1)$.
14.1.29 Theorem [2173] If $q$ is a prime power and $i \ge 1$ is an integer, a complete set of $(q^i - 1)^2/(q - 1)$
MOFS of the form $F(q^i; q^{i-1})$ can be constructed using the linear polynomials $a_1 x_1 + \cdots + a_{2i} x_{2i}$
over the field $\mathbb{F}_q$ where
1. The vector $(a_1, \dots, a_i) \neq (0, \dots, 0)$,
2. The vector $(a_{i+1}, \dots, a_{2i}) \neq (0, \dots, 0)$,
3. The vector $(a_1', \dots, a_{2i}') \neq e(a_1, \dots, a_{2i})$ for any $e \neq 0 \in \mathbb{F}_q$.
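The construction of Theorem 14.1.29 can be tried out for prime $q$. In the sketch below rows are indexed by $(x_1, \dots, x_i)$ and columns by $(x_{i+1}, \dots, x_{2i})$, which is one natural reading of the construction and is stated here as an assumption; condition 3 is enforced by keeping one coefficient vector per scalar class. All names are illustrative.

```python
from collections import Counter
from itertools import combinations, product

def coefficient_vectors(q, i):
    """Vectors (a_1..a_2i) over F_q satisfying conditions 1-3, q prime."""
    vectors = []
    for a in product(range(q), repeat=2 * i):
        if not any(a[:i]) or not any(a[i:]):
            continue                          # conditions 1 and 2
        if next(c for c in a if c) == 1:      # condition 3: one vector per scalar class
            vectors.append(a)
    return vectors

def frequency_square(a, q, i):
    """Entry a_1 x_1 + ... + a_2i x_2i at row (x_1..x_i), column (x_{i+1}..x_2i)."""
    points = list(product(range(q), repeat=i))
    return [[sum(c * x for c, x in zip(a, row + col)) % q for col in points]
            for row in points]

def is_frequency_square(sq, q, lam):
    expected = Counter({s: lam for s in range(q)})
    return (all(Counter(r) == expected for r in sq)
            and all(Counter(c) == expected for c in zip(*sq)))

def are_orthogonal(A, B, q, lam):
    counts = Counter((x, y) for ra, rb in zip(A, B) for x, y in zip(ra, rb))
    return len(counts) == q * q and all(v == lam * lam for v in counts.values())

if __name__ == "__main__":
    q, i = 2, 2
    lam = q ** (i - 1)
    squares = [frequency_square(a, q, i) for a in coefficient_vectors(q, i)]
    print(len(squares), (q ** i - 1) ** 2 // (q - 1))
    print(all(is_frequency_square(s, q, lam) for s in squares))
    print(all(are_orthogonal(A, B, q, lam) for A, B in combinations(squares, 2)))
```

For $q = 2$, $i = 2$ this yields nine $F(4; 2)$ squares, matching the bound $(n-1)^2/(m-1) = 9$ of Theorem 14.1.28.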
14.1.4 Hypercubes
14.1.31 Definition Two hypercubes are orthogonal if, when superimposed, each of the $n^2$ ordered
pairs appears $n^{d-2}$ times. Again the $d = 2$ case reduces to that of latin squares. A set
of t ≥ 2 hypercubes is orthogonal if every pair of distinct hypercubes is orthogonal.
14.1.32 Theorem [1876] The maximum number of mutually orthogonal hypercubes of order $n \ge 2$,
dimension $d \ge 2$, and fixed type $j$ with $0 \le j \le d - 1$ is bounded above by
$$\frac{1}{n-1}\left(n^d - 1 - \binom{d}{1}(n-1) - \binom{d}{2}(n-1)^2 - \cdots - \binom{d}{j}(n-1)^j\right).$$
14.1.33 Corollary The maximum number $N_d(n)$ of mutually orthogonal hypercubes of order $n$, dimension $d$,
and type 1 satisfies
$$N_d(n) \le \frac{n^d - 1}{n - 1} - d.$$
14.1.34 Remark In the case that d = 2, Nd (n) reduces to the familiar bound of n − 1 for sets of
MOLS of order n. As was the case for d = 2, the bound for d > 2 can always be realized
when n is a prime power.
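The bound of Theorem 14.1.32 is simple to evaluate. The sketch below (the function name is illustrative) computes it and checks the two special cases mentioned above: type 1 gives $(n^d - 1)/(n - 1) - d$, and $d = 2$ gives the familiar $n - 1$.

```python
from math import comb

def hypercube_bound(n, d, j):
    """Upper bound of Theorem 14.1.32 for order n, dimension d, type j."""
    total = n ** d - 1 - sum(comb(d, k) * (n - 1) ** k for k in range(1, j + 1))
    # total equals the sum of comb(d, k) * (n - 1)^k over k > j, so the division is exact
    return total // (n - 1)

if __name__ == "__main__":
    print(hypercube_bound(3, 3, 1))                       # order 3, dimension 3, type 1
    print([hypercube_bound(n, 2, 1) for n in range(2, 8)])
```

For $n = 3$, $d = 3$, $j = 1$ the bound evaluates to 10.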
14.1.35 Corollary There are at most
$$(n-1)^{d-1} + \binom{d}{d-1}(n-1)^{d-2} + \cdots + \binom{d}{j+1}(n-1)^{j}$$
mutually orthogonal hypercubes of order $n$, dimension $d$, and type $j$.
cube 1  cube 2  cube 3  cube 4  cube 5  cube 6  cube 7  cube 8  cube 9  cube 10
012     012     012     012     000     000     012     012     012     012
120     201     012     012     111     111     120     201     201     120
201     120     012     012     222     222     201     120     120     201
012     012     120     201     111     222     120     201     120     201
120     201     120     201     222     000     201     120     012     012
201     120     120     201     000     111     012     012     201     120
012     012     201     120     222     111     201     120     201     120
120     201     201     120     000     222     012     012     120     201
201     120     201     120     111     000     120     201     012     012
14.1.39 Remark Affine and projective planes are discussed in Section 14.3. We first state the
following fundamental result and then discuss a few other related results.
14.1.40 Theorem [355], [706, Theorem III.3.20] There exists a projective plane (or an affine plane)
of order n if and only if there exists a complete set of MOLS of order n.
14.1.41 Definition Two complete sets of MOLS of order n are isomorphic if after permuting the
rows, permuting the columns with a possibly different permutation, and permuting the
symbols with a third possibly different permutation of each square of the first set, we
obtain the second set of MOLS. See Part III of [706] for further discussion of non-
isomorphic sets of MOLS, affine, and projective planes.
14.1.42 Conjecture If p is a prime, any two complete sets of MOLS of order p are isomorphic.
14.1.43 Remark The above conjecture is only known to be true for p = 3, 5, 7. Truth of the conjec-
ture would imply that all planes of prime order are Desarguesian.
14.1.44 Theorem [2917, 2918] For $q = p^n$, let $0 \le k < n$, $N = (q - 1)/(q - 1, p^k - 1)$ and
set $e = q - N$. Let $u$ be a primitive $N$-th root of unity in $\mathbb{F}_q$. Assume that $x^{p^k} + c_i x$
is a permutation polynomial for $e$ elements $c_1, \dots, c_e \in \mathbb{F}_q$, where one can assume that
$c_1 - 1 = c_2$. Let $a \neq 0$ and $c_1$ be such that $f(x) = a x^{p^k} + c_1 x$ is an orthomorphism of $\mathbb{F}_q$
(so $f$ is a permutation polynomial with $f(0) = 0$, and $f(x) - x$ is also a permutation). Let
$d_i = c_1 - c_i$. Then the polynomials $a u^j x^{p^k} + c_1 x + y$, $j = 1, \dots, N$; $d_i x + y$, $i = 3, \dots, e$; $x + y$
represent a complete set of $q - 1$ MOLS of order $q$.
14.1.45 Corollary For each $n \ge 2$ and any odd prime $p$, the above construction gives $\tau(n) \ge 2$
non-isomorphic complete sets of MOLS of order $p^n$, where $\tau(n)$ denotes the number of positive
divisors of $n$.
14.1.46 Example For any odd prime $p$, this construction gives an example of a non-Desarguesian
affine translation plane of order $p^2$, constructed without the use of a right quasifield as used
in [818].
14.1.47 Remark For $q = 9$, let $\mathbb{F}_9$ be generated by the primitive polynomial $f(x) = x^2 + 2x + 2$
over $\mathbb{F}_3$. Let $\alpha$ be a root of $f(x)$. The Desarguesian plane of order 9 may be constructed by
using the polynomials $\alpha^i x + y$, $i = 0, \ldots, 7$. Since $u = \alpha^2$ is a primitive 4-th root of unity,
the construction from the above corollary leads to the polynomials $\alpha x^3 + y$, $\alpha^3 x^3 + y$,
$\alpha^5 x^3 + y$, $\alpha^7 x^3 + y$, which represent four MOLS of order 9. To extend these four MOLS to a complete
set of 8 MOLS of order 9, we consider the polynomials $x + y$, $\alpha^2 x + y$, $\alpha^4 x + y$, $\alpha^6 x + y$.
Thus four of the latin squares are the same in both the Desarguesian and non-Desarguesian
constructions.
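The claims of Remark 14.1.47 can be checked by direct computation. The following is a minimal sketch, not taken from the source: it assumes a representation of $\mathbb{F}_9$ by pairs (c0, c1) standing for c0 + c1·α, and the helper names (`square`, `is_latin`, `orthogonal`) are illustrative only.

```python
from itertools import combinations

# F_9 = F_3[alpha]/(alpha^2 + 2*alpha + 2); the pair (c0, c1) stands for c0 + c1*alpha.
ELEMENTS = [(c0, c1) for c0 in range(3) for c1 in range(3)]
ALPHA = (0, 1)

def add(u, v):
    return ((u[0] + v[0]) % 3, (u[1] + v[1]) % 3)

def mul(u, v):
    # Reduce with alpha^2 = -2*alpha - 2 = alpha + 1 (characteristic 3).
    low = u[0] * v[0]
    mid = u[0] * v[1] + u[1] * v[0]
    high = u[1] * v[1]
    return ((low + high) % 3, (mid + high) % 3)

def power(u, k):
    result = (1, 0)
    for _ in range(k):
        result = mul(result, u)
    return result

def square(coeff, exponent):
    """Array with entry coeff * x^exponent + y in row x, column y."""
    return {(x, y): add(mul(coeff, power(x, exponent)), y)
            for x in ELEMENTS for y in ELEMENTS}

def is_latin(sq):
    rows = all(len({sq[x, y] for y in ELEMENTS}) == 9 for x in ELEMENTS)
    cols = all(len({sq[x, y] for x in ELEMENTS}) == 9 for y in ELEMENTS)
    return rows and cols

def orthogonal(s1, s2):
    # Superimposing the two squares must produce all 81 ordered pairs.
    return len({(s1[cell], s2[cell]) for cell in s1}) == 81

cubic = [square(power(ALPHA, i), 3) for i in (1, 3, 5, 7)]
linear = [square(power(ALPHA, i), 1) for i in (0, 2, 4, 6)]
mixed_set = cubic + linear
desarguesian_set = [square(power(ALPHA, i), 1) for i in range(8)]

mixed_is_complete = (all(is_latin(s) for s in mixed_set)
                     and all(orthogonal(a, b) for a, b in combinations(mixed_set, 2)))
shared = sum(1 for s in mixed_set if s in desarguesian_set)
```

Here `mixed_is_complete` tests that the eight squares are latin and pairwise orthogonal, and `shared` counts the squares common to both constructions.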
14.1.48 Remark There are other finite field constructions for sets of MOLS; here we briefly allude to
a few of them which are described in much more detail in [706]. Quasi-difference matrices and
V (m, t) vectors are discussed in Section VI.17.4; self-orthogonal latin squares are considered
in Section III.5.6; MOLS with holes are considered in Section III.1.7; starters are studied
in Section VI.55; and atomic latin squares are studied in Section III.1.6.
See Also
References Cited: [355, 397, 398, 706, 818, 992, 993, 1456, 1875, 1876, 1988, 2053, 2155,
2173, 2177, 2737, 2917, 2918]
14.2.1 Introduction
14.2.1 Remark In 1970 Rédei published his treatise Lückenhafte Polynome über endlichen Körpern
[2444], soon followed by the English translation Lacunary Polynomials over Finite Fields
[2445], the title of this chapter. One of the important applications of his theory is to give
information about the following two sets.
14.2.2 Definition For $f : \mathbb{F}_q \to \mathbb{F}_q$, or $f \in \mathbb{F}_q[X]$, define the set of directions (slopes of secants of
the graph):
$$D(f) := \left\{ \frac{f(x) - f(y)}{x - y} \;\middle|\; x \ne y \in \mathbb{F}_q \right\}.$$
14.2.4 Remark The sets $P(f)$ and $D(f)$ partition $\mathbb{F}_q$. If $(f(x) - f(y))/(x - y) = m$ then
$f(x) - mx = f(y) - my$, so $m$ is a direction determined by $f$ precisely when
$f(X) - mX$ is not a permutation polynomial (on $\mathbb{F}_q$).
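The link between directions and permutation polynomials can be illustrated over a prime field. The sketch below is an illustration under stated assumptions: it works in $\mathbb{F}_p$ with $p = 7$ and the test function $f(X) = X^{(p+1)/2}$, both chosen here for convenience. It uses the fact that a slope $m$ occurs exactly when $x \mapsto f(x) - mx$ takes some value twice.

```python
def directions(f, p):
    """Set of slopes (f(x) - f(y)) / (x - y) over distinct x, y in F_p."""
    slopes = set()
    for x in range(p):
        for y in range(p):
            if x != y:
                inv = pow((x - y) % p, -1, p)   # modular inverse, Python 3.8+
                slopes.add((f(x) - f(y)) * inv % p)
    return slopes

def is_permutation(g, p):
    """True if x -> g(x) mod p is a bijection of F_p."""
    return len({g(x) % p for x in range(p)}) == p

p = 7                                        # illustrative choice
f = lambda x: pow(x, (p + 1) // 2, p)        # illustrative choice of function
D = directions(f, p)
not_permuting = {m for m in range(p)
                 if not is_permutation(lambda x, m=m: f(x) - m * x, p)}
```

The two sets `D` and `not_permuting` coincide, which is the content of the observation about secant slopes.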
14.2.6 Definition Denote by f ◦ the degree of f , and by f ◦◦ the second degree, the degree of the
polynomial we obtain by removing the leading term.
14.2.7 Definition If f ◦◦ < f ◦ − 1 then f is lacunary and the difference f ◦ − f ◦◦ is the gap.
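As a small illustration of Definitions 14.2.6 and 14.2.7, the following sketch computes the degree, second degree, and gap from a coefficient list. The function names and the list representation are assumptions made here. So is the convention that a monomial has second degree minus infinity (the degree of the zero polynomial), which the source does not spell out.

```python
def degrees(coeffs):
    """coeffs[i] is the coefficient of X^i. Returns (degree, second degree).

    Assumed convention: for a monomial the second degree is -infinity.
    """
    support = [i for i, c in enumerate(coeffs) if c != 0]
    if not support:
        raise ValueError("the zero polynomial is not handled")
    second = support[-2] if len(support) > 1 else float("-inf")
    return support[-1], second

def gap(coeffs):
    deg, second = degrees(coeffs)
    return deg - second

def is_lacunary(coeffs):
    deg, second = degrees(coeffs)
    return second < deg - 1

example = [1, 0, 1, 0, 0, 0, 0, 1]   # X^7 + X^2 + 1
```

For `example` the degree is 7, the second degree is 2, and the gap is 5.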
14.2.8 Remark We want to survey what is known about lacunary polynomials (with a large gap)
that are fully reducible. In many applications however the gap is not between the degree and
the second degree, so instead of being of the form $f(X) = X^n + h(X)$, where $h^\circ \le n - 2$, it
is of the more general form $f(X) = g(X)X^n + h(X)$, where $h^\circ \le n - 2$, for some polynomial
$g$.
14.2.9 Example For $d \mid (q - 1)$ the field $K = \mathbb{F}_q$ contains the $d$-th roots of unity, so the polynomial
$X^d - a^d$ is fully reducible.
14.2.10 Remark In many applications the degree f ◦ = q, as is the case in the following examples.
14.2.11 Example The lacunary polynomials $X^q + c$, $X^q - X$, and if $q$ is odd then $X^q \pm X^{(q+1)/2}$
and $X^q \pm 2X^{(q+1)/2} + X$, are fully reducible in $\mathbb{F}_q[X]$.
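Example 14.2.11 can be tested numerically when $q = p$ is prime. The sketch below is a hedged illustration: the helper names are invented here, the prime $p = 7$ and the constant $c = 3$ are arbitrary choices, and "fully reducible" is tested by dividing out every root in $\mathbb{F}_p$ with its multiplicity.

```python
def evaluate(coeffs, x, p):
    value = 0
    for c in reversed(coeffs):          # Horner's rule, coefficients low degree first
        value = (value * x + c) % p
    return value

def splits_completely(coeffs, p):
    """True if the nonzero polynomial is a product of linear factors over F_p."""
    coeffs = [c % p for c in coeffs]
    while coeffs and coeffs[-1] == 0:
        coeffs.pop()
    if not coeffs:
        raise ValueError("zero polynomial")
    for r in range(p):
        while len(coeffs) > 1 and evaluate(coeffs, r, p) == 0:
            quotient, carry = [], 0     # synthetic division by (X - r)
            for c in reversed(coeffs):
                carry = (carry * r + c) % p
                quotient.append(carry)
            coeffs = quotient[-2::-1]   # drop the zero remainder, restore low-first order
    return len(coeffs) == 1

def from_terms(terms, p):
    """Coefficient list (low degree first) from a dict {exponent: coefficient}."""
    coeffs = [0] * (max(terms) + 1)
    for e, c in terms.items():
        coeffs[e] = c % p
    return coeffs

p = 7                                   # illustrative prime
h = (p + 1) // 2
examples = {
    "X^p + 3": from_terms({p: 1, 0: 3}, p),
    "X^p - X": from_terms({p: 1, 1: -1}, p),
    "X^p + X^h": from_terms({p: 1, h: 1}, p),
    "X^p - X^h": from_terms({p: 1, h: -1}, p),
    "X^p + 2X^h + X": from_terms({p: 1, h: 2, 1: 1}, p),
    "X^p - 2X^h + X": from_terms({p: 1, h: -2, 1: 1}, p),
}
results = {name: splits_completely(c, p) for name, c in examples.items()}
```

Every entry of `results` is expected to be `True`. For contrast, $X^2 + 1$ does not split over $\mathbb{F}_7$.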
14.2.12 Theorem [2445] Let $f(X) = X^p + g(X)$, with $g^\circ = f^{\circ\circ} < p$, be fully reducible in $\mathbb{F}_p[X]$, $p$
prime. Then either $g$ is constant, or $g = -X$, or $g^\circ$ is at least $(p + 1)/2$.
14.2.13 Remark Let $s(X)$ be the zeros polynomial of $f$, that is, the polynomial with the same set
of zeros as $f$, but each with multiplicity one. So $s = \gcd(f, X^p - X)$. It follows that
$$s \mid f - (X^p - X) = X + g.$$
We may write $f = s \cdot r$, where $r$ is the fully reducible polynomial that has the zeros of $f$
with multiplicity one less. Hence $r$ divides the derivative $f' = g'$. So we conclude that
$$f = s \cdot r \mid (X + g)\, g'.$$
If the right hand side is zero, then either $g = -X$, corresponding to the fully reducible
polynomial $f(X) = X^p - X$, or $g' = 0$, which (since $g^\circ < p$) implies $g(X) = c$ for some
$c \in \mathbb{F}_p$ and $f(X) = X^p + c = (X + c)^p$. If the right hand side is nonzero, then, being divisible
by $f$, it has degree at least $p$, so $g^\circ + g^\circ - 1 \ge p$, which gives $g^\circ \ge (p + 1)/2$.
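For very small primes the statement of Theorem 14.2.12 can be confirmed by exhaustive search. The following sketch is an illustration only, with invented function names; it enumerates every $g$ of degree less than $p$ and looks for a fully reducible $X^p + g$ that falls outside the three cases of the theorem.

```python
from itertools import product

def evaluate(coeffs, x, p):
    value = 0
    for c in reversed(coeffs):          # coefficients are stored low degree first
        value = (value * x + c) % p
    return value

def splits_completely(coeffs, p):
    """True if the polynomial (nonzero leading coefficient) splits over F_p."""
    coeffs = list(coeffs)
    for r in range(p):
        while len(coeffs) > 1 and evaluate(coeffs, r, p) == 0:
            quotient, carry = [], 0     # synthetic division by (X - r)
            for c in reversed(coeffs):
                carry = (carry * r + c) % p
                quotient.append(carry)
            coeffs = quotient[-2::-1]
    return len(coeffs) == 1

def redei_violations(p):
    """All g with X^p + g fully reducible but g outside the cases of the theorem."""
    bad = []
    minus_x = (0, p - 1) + (0,) * (p - 2)
    for g in product(range(p), repeat=p):            # every g of degree < p
        if not splits_completely(list(g) + [1], p):  # f = X^p + g
            continue
        deg_g = max((i for i, c in enumerate(g) if c), default=-1)
        if deg_g > 0 and g != minus_x and 2 * deg_g < p + 1:
            bad.append(g)
    return bad
```

Calling `redei_violations(3)` or `redei_violations(5)` should return an empty list. The search space has $p^p$ elements, so this brute-force approach is only practical for the smallest primes.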
14.2.14 Remark In the next section we see how this result can be applied to obtain information
about the number of directions determined by a function.
14.2.15 Definition Let AG(2, q) be the Desarguesian affine plane of order q, where points of
AG(2, q) are denoted by pairs (a, b), a, b ∈ Fq .
14.2.16 Definition Let P G(2, q) be the Desarguesian projective plane of order q with homogeneous
point coordinates (a : b : c) and line coordinates [u : v : w]. The point (a : b : c) is
incident with the line [u : v : w] precisely when au + bv + cw = 0. The equation of the
line [u : v : w] is then uX + vY + wZ = 0.
14.2.17 Remark We consider AG(2, q) as part of the projective plane P G(2, q) where [0 : 0 : 1] is
the line at infinity, the line with equation Z = 0. The affine point (a, b) corresponds to the
projective point (a : b : 1).
14.2.18 Definition Let u = (u1 , u2 ) and v = (v1 , v2 ) be two affine points. The pair u, v determines
the direction m if the line joining them has slope m, or equivalently, if (u2 − v2 )/(u1 −
v1 ) = m.
14.2.19 Definition Let $R$ be a set of $q$ points in AG(2, q). We define $D_R \subseteq \mathbb{F}_q \cup \{\infty\}$ to be the set
of directions determined by the pairs of points in $R$.
14.2.20 Remark The reason we take $R$ to have size $q$ is two-fold. Firstly, in Rédei’s formulation of
the problem $R$ is the graph of a function $f$ and $D_R = D(f)$. Secondly, any set with more than
$q$ points determines all directions, by the pigeonhole principle: there are exactly $q$ lines in
every parallel class, so if $|R| > q$, then there is a line with at least two points of $R$ in each
parallel class. For results concerning the case $|R| < q$, see [2756].
14.2.25 Remark The third case in Theorem 14.2.24, $2 + (q - 1)/(p^e + 1) \le N \le (q - 1)/(p^e - 1)$ for
some $e$ satisfying $1 \le e \le \lfloor n/2 \rfloor$, is not sharp. The following are some examples of functions
that determine few directions.
14.2.26 Example The function $f(X) = X^{(q+1)/2}$, where $q$ is odd, determines $(q + 3)/2$ directions.
14.2.27 Example The function $f(X) = X^s$, where $s = p^e$ is the order of a subfield of $\mathbb{F}_q$, determines
$(q - 1)/(s - 1)$ directions.
14.2.28 Example The function $f(X) = \mathrm{Tr}_{\mathbb{F}_q/\mathbb{F}_s}(X)$, the trace from $\mathbb{F}_q$ to the subfield $\mathbb{F}_s$, determines
$(q/s) + 1$ directions.
14.2.29 Example If $f(X) \in \mathbb{F}_q[X^s]$, where $s$ is the order of a subfield of $\mathbb{F}_q$ and is chosen maximal
with this property, in other words, $f$ is $\mathbb{F}_s$-linear (apart from the constant term) but not
linear over a larger subfield, then $(q/s) + 1 \le N \le (q - 1)/(s - 1)$.
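The direction counts in Examples 14.2.26 to 14.2.28 can be checked in the smallest non-prime case. The sketch below is a hedged illustration with $q = 9$ and $s = 3$; the pair representation of $\mathbb{F}_9$ via $\alpha^2 = \alpha + 1$ and all function names are assumptions made for the example.

```python
# F_9 = F_3[alpha]/(alpha^2 + 2*alpha + 2); (c0, c1) stands for c0 + c1*alpha.
ELEMENTS = [(c0, c1) for c0 in range(3) for c1 in range(3)]

def add(u, v):
    return ((u[0] + v[0]) % 3, (u[1] + v[1]) % 3)

def sub(u, v):
    return ((u[0] - v[0]) % 3, (u[1] - v[1]) % 3)

def mul(u, v):
    high = u[1] * v[1]                   # coefficient of alpha^2 = alpha + 1
    return ((u[0] * v[0] + high) % 3, (u[0] * v[1] + u[1] * v[0] + high) % 3)

def power(u, k):
    result = (1, 0)
    for _ in range(k):
        result = mul(result, u)
    return result

def inverse(u):
    return power(u, 7)                   # u^8 = 1 for every nonzero u in F_9

def directions(f):
    return {mul(sub(f(x), f(y)), inverse(sub(x, y)))
            for x in ELEMENTS for y in ELEMENTS if x != y}

q, s = 9, 3
counts = {
    "X^((q+1)/2)": len(directions(lambda x: power(x, (q + 1) // 2))),
    "X^s": len(directions(lambda x: power(x, s))),
    "Tr": len(directions(lambda x: add(x, power(x, s)))),   # trace F_9 -> F_3
}
```

The three counts should equal $(q + 3)/2 = 6$, $(q - 1)/(s - 1) = 4$, and $(q/s) + 1 = 4$ respectively.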
14.2.30 Remark Motivated by the form of the examples the following theorem was obtained (in
a number of steps) by Ball, Blokhuis, Brouwer, Storme, and Szőnyi. Initial results are in
[322], then the classification was all but obtained in [321], and completed in [185].
14.2.31 Theorem [185] If, for $f : \mathbb{F}_q \to \mathbb{F}_q$, with $f(0) = 0$, the number $N = |D(f)| > 1$ of directions
determined by $f$ is less than $(q + 3)/2$, then for a subfield $\mathbb{F}_s$ of $\mathbb{F}_q$
$$\frac{q}{s} + 1 \le N \le \frac{q - 1}{s - 1},$$
and if $s > 2$ then $f$ is $\mathbb{F}_s$-linear.
14.2.32 Remark This result is obtained using several lemmas about fully reducible lacunary poly-
nomials which are of independent interest.
14.2.34 Lemma [185] Let $s$ be a power of $p$ with $1 \le s < q$ and suppose that $X^{q/s} + g(X)$
is fully reducible over $\mathbb{F}_q$. If $s > 2$, $g^\circ = q/s^2$ and $2(g')^\circ < g^\circ$ then $X^{q/s} + g(X)$ is $\mathbb{F}_s$-linear.
14.2.35 Remark Theorem 14.2.31 completely characterizes the case in which the number of direc-
tions is small, that is less than (q + 3)/2. In the case that q = p is prime, N < (p + 3)/2
implies N = 1, and the characterization of N = (p + 3)/2 directions was given by Lovász
and Schrijver [1961].
14.2.36 Theorem [1961] If $f \in \mathbb{F}_p[X]$, $p$ prime, determines $(p + 3)/2$ directions, then
$f(X) = X^{(p+1)/2}$ up to affine equivalence.
14.2.37 Remark Much more can be said in this case. The following surprising theorem by Gács
[1151] shows that there is a huge gap in the spectrum of possible numbers of directions.
is large [191].
14.2.41 Remark Let R be a subset of AG(2, q) of size q. The set of points of P G(2, q)
14.2.42 Definition A blocking set of PG(2, q) is a set of points B of PG(2, q) with the property
that every line of PG(2, q) is incident with a point of B.
14.2.43 Lemma [358] A blocking set of P G(2, q) has at least q + 1 points and equality can only be
obtained if these points all are on a line.
14.2.45 Remark We tacitly assume that all blocking sets under consideration are minimal, so
they do not contain a proper subset that is also a blocking set. For blocking sets of non-
Desarguesian planes and for further reading on blocking sets see [320, 329, 430, 432, 433,
1150, 1153] and for more recent references, see Remark 14.2.54.
14.2.46 Lemma [319] Suppose that $B$ is a blocking set of size $q + k + 1$ and that $(1 : 0 : 0) \in B$
and assume that the line with equation $Z = 0$, that is $[0 : 0 : 1]$, is a tangent to $B$. Then
the non-horizontal lines $[1 : u : v]$ are blocked by the affine points of $B$ and the Rédei
polynomial of the affine part of $B$ can be written as
$$F_0 = V^q G_0 + W^q H_0,$$
with
$$F_0(V, W) = \prod_{(a:b:1) \in B} (bV + W).$$
$$f(W) = g(W) + W^q h(W).$$
14.2.48 Lemma [320] Let $f \in \mathbb{F}_q[X]$ be fully reducible, and suppose that $f(X) = X^q g(X) + h(X)$,
where $g$ and $h$ have no common factor. Let $k$ be the maximum of the degrees of $g$ and
$h$. Then $k = 0$, or $k = 1$ and $f(X) = a(X^q - X)$ for some $a \in \mathbb{F}_q^*$, or $q$ is prime and
$k \ge (q + 1)/2$, or $q$ is a square and $k \ge \sqrt{q}$, or $q = p^{2e+1}$ for some prime $p$ and $k \ge p^{e+1}$.
14.2.49 Theorem [430] A non-trivial blocking set $B$ in PG(2, q), $q$ square, has at least $q + \sqrt{q} + 1$
points. If equality holds then $B$ consists of the points of a subplane of order $\sqrt{q}$.
14.2.50 Theorem [320] A non-trivial blocking set $B$ in PG(2, q), $q = p^{2e+1}$, $p$ prime, $q \ne p$, has at
least $q + p^{e+1} + 1$ points. This bound is sharp only in the case $e = 1$.
14.2.51 Theorem [319] A non-trivial blocking set $B$ in PG(2, p), $p$ prime, has at least $\tfrac{3}{2}(p + 1)$
points. If equality holds then every point of $B$ is on precisely $\tfrac{1}{2}(p - 1)$ tangents.
14.2.52 Remark The bound in Theorem 14.2.51 was conjectured in [831].
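The equality case of Theorem 14.2.51 can be observed on the set that Theorem 14.2.58 lists for $p = 7$, namely the three coordinate points together with the points $(a : b : 1)$, $a, b \in \{1, 2, 4\}$. The sketch below is illustrative: the normalization of homogeneous triples and the helper names are choices made here, not part of the source.

```python
from itertools import product

def normalize(v, p):
    """Scale a nonzero triple so that its last nonzero coordinate equals 1."""
    pivot = next(c for c in reversed(v) if c % p)
    inv = pow(pivot, -1, p)
    return tuple(c * inv % p for c in v)

def projective_triples(p):
    return sorted({normalize(v, p) for v in product(range(p), repeat=3) if any(v)})

def incident(point, line, p):
    return sum(a * u for a, u in zip(point, line)) % p == 0

p = 7
points = projective_triples(p)
lines = projective_triples(p)      # lines [u : v : w] use the same normalized triples
squares = {1, 2, 4}                # nonzero squares modulo 7
B = {(1, 0, 0), (0, 1, 0), (0, 0, 1)} | {(a, b, 1) for a in squares for b in squares}

meets = {line: [pt for pt in B if incident(pt, line, p)] for line in lines}
is_blocking = all(meets[line] for line in lines)
contains_line = any(len(m) == p + 1 for m in meets.values())
tangents_through = {pt: sum(1 for line in lines if meets[line] == [pt]) for pt in B}
```

The set has $\tfrac{3}{2}(p + 1) = 12$ points, meets every line, contains no line, and each of its points should lie on exactly $\tfrac{1}{2}(p - 1) = 3$ tangents.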
14.2.53 Remark The proof of Lemma 14.2.48 leads to the following divisibility condition
It would be good (and probably not infeasible) to characterize the case of equality in the
case p is prime, that is find all f, g, and h with f of degree q + (q + 1)/2, g and h of degree
14.2.56 Remark The blocking set problem in PG(2, p), $p$ prime, leads one to search for polynomials
$f(X)$, $g(X)$, $h(X)$, where $f = X^p g + h$ factors completely into linear factors and $g$ and $h$
have degree at most $\tfrac{1}{2}(p + 1)$. More precisely, given a blocking set $B$ of size $\tfrac{3}{2}(p + 1)$, for
each point $P \in B$, and each tangent $\ell$ passing through $P$, there is a polynomial $f$ with the
above property. A factor of $f$ of multiplicity $e$ corresponds to a line incident with $P$ distinct
from $\ell$ meeting $B$ in $e + 1$ points.
14.2.57 Remark The equation $f \mathrel{\hat{=}} (Xg + h)(h'g - g'h)$ has several infinite families of solutions, and
some sporadic ones, not all of them necessarily corresponding to blocking sets.
14.2.58 Theorem [323] The following list contains all non-equivalent solutions for $f = X^p g + h$,
where $f$ factors completely into linear factors and $g$ and $h$ have degree at most $\tfrac{1}{2}(p + 1)$,
for $p < 41$.
1. (For odd $p$, say $p = 2r + 1$.) Take $f(X) = X \prod (X - a)^3$ where the product is
over the nonzero squares $a$. Then $f$ satisfies $f(X) = X(X^r - 1)^3 = X^p g + h$ with
$g(X) = X^r - 3$, $h(X) = 3X^{r+1} - X$. This would correspond to line intersections
of the lines incident with $P$ (with frequencies written as exponents) $1^r, 2^2, 4^r$.
For $p = 7$ this is the function for the blocking set $\{(1 : 0 : 0), (0 : 1 : 0), (0 : 0 : 1)\}
\cup \{(a : b : 1) \mid a, b \in \{1, 2, 4\}\}$.
2. (For $p = 4t + 1$.) Take $f(X) = X \prod (X - a) \prod (X - b)^4$ where the product is over
the nonzero squares $a$ and fourth powers $b$. Here $f(X) = X(X^{2t} - 1)(X^t - 1)^4 =
X^p g + h$ with $g(X) = X^{2t} - 4X^t + 5$ and $h(X) = -5X^{2t+1} + 4X^{t+1} - X$. This
would correspond to line intersections $1^{2t}, 2^{t+2}, 6^t$.
3. (For $p = 4t + 1$.) Take $f(X) = X^{t+1} \prod (X - a) \prod (X - b)^2$ where the product is over
the nonzero squares $a$ and fourth powers $b$. Here $f(X) = X^{t+1}(X^{2t} - 1)(X^t - 1)^2 =
X^p g + h$ with $g(X) = X^t - 2$ and $h(X) = 2X^{2t+1} - X^{t+1}$. This would correspond
to line intersections $1^{2t}, 2^t, 4^t, (t + 2)^2$. For $p = 13$ this is a function for the
blocking set $\{(1 : 0 : 0), (0 : 1 : 0), (0 : 0 : 1)\} \cup \{(1 : a : 0), (0 : 1 : a), (a : 0 : 1)
\mid a^3 = -1\} \cup \{(b : c : 1) \mid b^3 = c^3 = 1\}$.
4. (For $p = 13$.) Take $f(X) = X \prod (X - a)^4 \prod (X - \tfrac{1}{2}a)$ where the product is
over the values $a$ with $a^3 = 1$. Here $f(X) = X(X^3 - 1)^4 (X^3 - \tfrac{1}{8}) = X^p g + h$
with $g(X) = X^3 + 4$ and $h(X) = 5X^7 - 5X^4 - 5X$. This corresponds to line
14.2.62 Definition A t-fold blocking set B of P G(2, q) is a set of points such that every line is
incident with at least t points of B.
14.2.63 Theorem [328] Let $B$ be a $t$-fold blocking set in PG(2, q), $q = p^h$, $p$ prime, of size $t(q+1)+c$.
Let $c_2 = c_3 = 2^{-1/3}$ and $c_p = 1$ for $p > 3$.
1. If $q = p^{2d+1}$ and $t < q/2 - c_p q^{2/3}/2$, then $c \ge c_p q^{2/3}$, unless $t = 1$ in which case
$B$, with $|B| < q + 1 + c_p q^{2/3}$, contains a line.
2. If $4 < q$ is a square, $t < q^{1/4}/2$ and $c < c_p q^{2/3}$, then $c \ge t\sqrt{q}$ and $B$ contains the
union of $t$ disjoint Baer subplanes, except for $t = 1$ in which case $B$ contains a
line or a Baer subplane.
3. If $q = p^2$, $p$ prime, and $t < q^{1/4}/2$ and $c < p \left\lceil \tfrac{1}{4} + \sqrt{\tfrac{p+1}{2}} \right\rceil$, then $c \ge t\sqrt{q}$ and $B$
contains the union of $t$ disjoint Baer subplanes, except for $t = 1$ in which case $B$
contains a line or a Baer subplane.
14.2.64 Remark For more precise results in the case $t = 2$ see [188]; for $t = 3$ see [184]; for $q = p^3$
see [2416, 2417, 2418]; for $q = p^{6n+3}$ see [328]; and for $q = p^{6n}$ see [328, 2418].
14.2.65 Remark The proof of Theorem 14.2.63 starts with the main theorem of [330] on fully
reducible lacunary polynomials.
14.2.66 Theorem [330] Let $f \in \mathbb{F}_q[X]$, $q = p^n$, $p$ prime, be fully reducible, $f(X) = X^q g(X) + h(X)$,
where $(g, h) = 1$. Let $k = \max(g^\circ, h^\circ) < q$. Let $e$ be maximal such that $f$ is a $p^e$-th power.
Then we have one of the following:
1. $e = n$ and $k = 0$;
2. $e \ge 2n/3$ and $k \ge p^e$;
3. $2n/3 > e > n/2$ and $k \ge p^{n-e/2} - (3/2)p^{n-e}$;
4. $e = n/2$ and $k = p^e$ and $f(X) = a\,\mathrm{Tr}(bX + c) + d$ or $f(X) = a\,\mathrm{Norm}(bX + c) + d$
for suitable constants $a, b, c, d$. Here Tr and Norm respectively denote the trace
and norm function from $\mathbb{F}_q$ to $\mathbb{F}_{\sqrt{q}}$;
5. $e = n/2$ and $k \ge p^e \left\lceil \tfrac{1}{4} + \sqrt{(p^e + 1)/2} \right\rceil$;
6. $n/2 > e > n/3$ and $k \ge p^{n/2+e/2} - p^{n-e} - p^e/2$, or if $3e = n + 1$ and $p \le 3$, then
$k \ge p^e(p^e + 1)/2$;
7. $n/3 \ge e > 0$ and $k \ge p^e \lceil (p^{n-e} + 1)/(p^e + 1) \rceil$;
8. $e = 0$ and $k \ge (q + 1)/2$;
9. $e = 0$, $k = 1$ and $f(X) = a(X^q - X)$.
14.2.67 Remark Lacunary polynomials over finite fields, and in particular Rédei’s theorem
(Theorem 14.2.12) and Blokhuis’ theorem (Theorem 14.2.51), have also been used in algebra,
algebraic number theory, group theory, and group factorization. For a survey of these
applications, see [2758].
Polynomials over finite fields have been used to tackle a variety of problems associated
with incidence geometries. Various extensions of the ideas first used for lacunary polynomials
have been studied. This has led to some interesting techniques involving field extensions and
algebraic curves, which in turn have led to classification, non-existence, and stability results
concerning subsets of points of a finite projective space with a certain given property. For
a recent survey, see [187].
See Also
References Cited: [184, 185, 186, 187, 188, 190, 191, 319, 320, 321, 322, 323, 328, 329, 330,
358, 430, 432, 433, 831, 1150, 1151, 1152, 1153, 1423, 1872, 1961, 1977, 1978, 2408, 2416,
2417, 2418, 2444, 2445, 2755, 2756, 2757, 2758]
All structures in this section are finite. Reference [1560] is an excellent introduction to
projective and affine planes. See Section VII.2 of [706] for a concise description of the Hall,
André, Hughes, and Figueroa planes.
14.3.1 Definition A finite projective plane is a finite incidence structure of points and lines such
that
1. every two distinct points together lie on a unique line;
2. every two distinct lines meet in a unique point;
3. there exists a quadrangle (four points with no three collinear).
14.3.2 Remark If π is a finite projective plane, then there is a positive integer n such that any
line of π has exactly n + 1 points, every point lies on exactly n + 1 lines, the total number
of points is $n^2 + n + 1$, and the total number of lines is $n^2 + n + 1$. This number n is called
the order of π.
14.3.3 Construction [1510] The classical examples of finite projective planes are constructed as
follows. Let V be a 3-dimensional vector space over the finite field Fq of order q. Take as
points the 1-dimensional subspaces of V and as lines the 2-dimensional subspaces of V , and
let incidence be given by containment. The resulting incidence structure is a finite projective
plane of order q, denoted by PG(2, q). These projective planes are Desarguesian since they
satisfy the classical configurational theorem of Desargues (for instance, see [555]). Note that
this construction shows that there exists a finite projective plane of order q for any prime
power q. Alternatively, one may use homogeneous coordinates (x : y : z) = {(f x, f y, f z) : f ∈
Fq \{0}} for the points of PG(2, q), and [a : b : c] = {[f a, f b, f c] : f ∈ Fq \{0}} for the lines
of PG(2, q), where the point (x : y : z) is incident with the line [a : b : c] if and only if
ax + by + cz = 0.
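The homogeneous-coordinate description lends itself to a direct check of the plane axioms and of the counts stated in Remark 14.3.2. The following is a minimal sketch for prime $q = p$ only (so that arithmetic is ordinary modular arithmetic); the choice $p = 3$, the normalization of triples, and the names used are assumptions made for illustration.

```python
from itertools import combinations, product

def normalize(v, p):
    """Scale a nonzero triple so that its last nonzero coordinate equals 1."""
    pivot = next(c for c in reversed(v) if c % p)
    inv = pow(pivot, -1, p)
    return tuple(c * inv % p for c in v)

def pg2(p):
    """Points of PG(2, p) and, for each line [a : b : c], the set of points on it."""
    reps = sorted({normalize(v, p) for v in product(range(p), repeat=3) if any(v)})
    on_line = {
        line: frozenset(pt for pt in reps
                        if sum(x * a for x, a in zip(pt, line)) % p == 0)
        for line in reps
    }
    return reps, on_line

p = 3
points, on_line = pg2(p)
line_sizes = {len(s) for s in on_line.values()}
unique_join = all(sum(1 for s in on_line.values() if a in s and b in s) == 1
                  for a, b in combinations(points, 2))
unique_meet = all(len(s & t) == 1 for s, t in combinations(on_line.values(), 2))
```

For $p = 3$ this yields $3^2 + 3 + 1 = 13$ points and 13 lines, each line carrying $3 + 1 = 4$ points.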
14.3.4 Remark Some non-classical finite projective planes are discussed in Subsections 14.3.3 to
14.3.5. Many other constructions can be found in [807]. One of the most difficult problems in
finite geometry is determining the spectrum of possible orders for finite projective planes. All
known examples have prime power order, but it is unknown if this must be true in general.
The Bruck-Ryser-Chowla Theorem (Section 14.5) excludes an infinite number of positive
integers as possible orders. In addition, order 10 has been excluded via a computer search
[1837]. There are precisely four different (non-isomorphic, as defined in Subsection 14.3.3)
projective planes of order 9, the smallest order for which non-classical examples exist [1836].
An overview of the state of knowledge concerning small projective planes follows:
order n    2  3  4  5  6  7  8  9  10  11  12  13  14  15  16
existence  y  y  y  y  n  y  y  y  n   y   ?   y   n   ?   y
number     1  1  1  1  0  1  1  4  0   ≥1  ?   ≥1  0   ?   ≥22
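The two "n" entries for orders 6 and 14 can be recovered from the Bruck-Ryser condition in its usual form for planes: if a projective plane of order $n$ exists and $n \equiv 1$ or $2 \pmod 4$, then $n$ is a sum of two integer squares. The sketch below applies this test to the orders in the table; the function names are illustrative. Order 10 passes the test and was excluded by the computer search cited above.

```python
from math import isqrt

def is_sum_of_two_squares(n):
    return any(isqrt(n - a * a) ** 2 == n - a * a for a in range(isqrt(n) + 1))

def bruck_ryser_excludes(n):
    """True if the Bruck-Ryser condition rules out a projective plane of order n."""
    return n % 4 in (1, 2) and not is_sum_of_two_squares(n)

excluded = [n for n in range(2, 17) if bruck_ryser_excludes(n)]
```

In the range of the table, `excluded` contains exactly the orders 6 and 14.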
14.3.5 Definition A finite affine plane is a finite incidence structure of points and lines such
that
1. any two distinct points together lie on a unique line;
2. for any point $P$ and any line $\ell$ not containing $P$, there exists a unique line $m$
through $P$ that has no point in common with $\ell$ (the “parallel axiom”);
3. there exists a triangle (three points not on a common line).
14.3.6 Remark If one defines a parallelism on the lines of an affine plane by saying that two lines
are parallel if they are equal or have no point in common, then parallelism is an equivalence
relation whose equivalence classes are called parallel classes. Each parallel class of lines is a
partition of the point set, and every line belongs to exactly one parallel class.
14.3.7 Remark If one removes a line $\ell$ together with all its points from a projective plane $\pi$, then
one obtains an affine plane $\pi_0 = \pi^{\ell}$. Two lines of the affine plane $\pi^{\ell}$ are parallel if and
only if the projective lines containing them meet the line $\ell$ in the same point. We call $\ell$
the line at infinity of $\pi_0$, and the points of $\ell$ are called the points at infinity. Conversely,
to construct a projective plane from an affine plane $\pi_0$, create a new point for each parallel
class of $\pi_0$ and adjoin this new point to each line in that parallel class. Also adjoin a new
line that contains all the new points and no other points. The resulting incidence structure
is a projective plane $\pi$, called the projective completion of $\pi_0$. The order of $\pi_0$ is the order
of its projective completion.
14.3.8 Construction The classical way to construct finite affine planes is as follows. Take as points
the ordered pairs (a, b), with a, b ∈ Fq , and as lines the sets of points (x, y) satisfying an
equation of the form Y = mX + b for some m, b ∈ Fq or an equation of the form X = c
for some c ∈ Fq . The resulting structure is an affine plane of order q, denoted by AG(2, q).
Such an affine plane is also Desarguesian since the projective completion of AG(2, q) is
(isomorphic to) PG(2, q). Alternatively, AG(2, q) may be constructed from a 2-dimensional
vector space V over Fq by taking as points all vectors in V and as lines all cosets of 1-
dimensional subspaces, where incidence is then given by containment.
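The classical construction is easy to enumerate for a prime $q = p$. The sketch below is an illustration with $p = 5$ chosen arbitrarily: it builds the lines $Y = mX + b$ and $X = c$, groups them by slope, and checks that each group partitions the point set.

```python
from itertools import combinations

def ag2_parallel_classes(p):
    """Lines of AG(2, p), grouped by slope m (or 'inf' for the vertical lines X = c)."""
    classes = {m: [frozenset((x, (m * x + b) % p) for x in range(p)) for b in range(p)]
               for m in range(p)}
    classes["inf"] = [frozenset((c, y) for y in range(p)) for c in range(p)]
    return classes

p = 5
classes = ag2_parallel_classes(p)
all_points = {(x, y) for x in range(p) for y in range(p)}
lines = [line for cls in classes.values() for line in cls]

each_class_partitions = all(
    set().union(*cls) == all_points and sum(len(line) for line in cls) == p * p
    for cls in classes.values()
)
joined_once = all(sum(1 for line in lines if a in line and b in line) == 1
                  for a, b in combinations(sorted(all_points), 2))
```

There are $p + 1 = 6$ parallel classes and $p^2 + p = 30$ lines in total, and any two distinct points lie on exactly one line.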
14.3.10 Definition If $\phi$ is a collineation of a projective plane $\pi$, and $\phi$ fixes all lines through a
point $P$ and all points on a line $\ell$, then $\phi$ is a $(P, \ell)$-perspectivity. In particular, it is a
$(P, \ell)$-elation if $P \in \ell$.
14.3.11 Definition A projective plane $\pi$ is $(P, \ell)$-transitive if for any distinct points $A$, $B$ not on $\ell$
and collinear with $P$ ($A \ne P \ne B$), there is a $(P, \ell)$-perspectivity $\phi$ in Aut($\pi$) such that
$A^{\phi} = B$. Similarly, $\pi$ is $(m, \ell)$-transitive if it is $(P, \ell)$-transitive for all points $P$ on $m$. If
$\pi$ is $(\ell, \ell)$-transitive for some line $\ell$, then $\ell$ is a translation line of $\pi$ and $\pi$ is a translation
plane with respect to $\ell$.
14.3.12 Remark If $\pi$ is a translation plane with respect to a line $\ell$, then the affine plane $\pi^{\ell} = \pi \setminus \ell$
is also a translation plane. Most often a translation plane is considered an affine plane, with
its line at infinity the translation line. The translation group of such an affine plane is the
group of all $(\ell, \ell)$-elations, which acts sharply transitively on the points of the affine plane
$\pi^{\ell}$. References [279, 1613] provide extensive information on translation planes.
14.3.13 Remark Translation planes are coordinatized by algebraic structures called quasifields (see
Section 2.1). Every quasifield has an algebraic substructure called its kernel, which in the
finite setting is necessarily a finite field. The quasifield is then a finite dimensional vector
space over its kernel, and the dimension of the translation plane is the dimension of this
vector space.
14.3.14 Definition Let Σ = PG(2t + 1, q) be a (2t + 1)-dimensional projective space for some
non-negative integer t (see Section 14.4 for the definition of projective space). A spread
of Σ is a set S of t-subspaces of Σ such that any point of Σ belongs to exactly one
element of S. The set-wise stabilizer of S in Aut(Σ) is the automorphism group Aut(S)
of the spread.
14.3.15 Construction View the finite field $F = \mathbb{F}_{q^{2t+2}}$ as a $(2t + 2)$-dimensional vector space $V$ over
its subfield $\mathbb{F}_q$, and let $\Sigma = \mathrm{PG}(2t + 1, q)$ be the associated $(2t + 1)$-dimensional projective
space. If $\theta$ is a primitive element of $F$ and $L = \mathbb{F}_{q^{t+1}}$ is the subfield of order $q^{t+1}$, then for
each positive integer $i$, $\theta^i L$ is a $(t + 1)$-dimensional vector subspace of $V$ that represents
a $t$-subspace of $\Sigma$. Moreover, $S = \{L, \theta L, \theta^2 L, \ldots, \theta^{q^{t+1}} L\}$ is a spread of $\Sigma$. The spreads
obtained in this way are regular as defined below.
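The smallest instance of Construction 14.3.15 is $q = 2$, $t = 1$, where $F = \mathbb{F}_{16}$ and the spread consists of $q^{t+1} + 1 = 5$ lines of PG(3, 2). The sketch below is an illustration under assumptions made here: $\mathbb{F}_{16}$ is represented by 4-bit integers modulo the primitive polynomial $x^4 + x + 1$, addition is XOR, and $\theta$ is the class of $x$.

```python
def gf16_mul(a, b):
    """Multiply in F_16 = F_2[x]/(x^4 + x + 1); elements are 4-bit integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b10000:
            a ^= 0b10011          # reduce by x^4 + x + 1
    return result

def gf16_pow(a, k):
    result = 1
    for _ in range(k):
        result = gf16_mul(result, a)
    return result

q, t = 2, 1
theta = 0b0010                    # the class of x, a primitive element
L = [a for a in range(16) if gf16_pow(a, q ** (t + 1)) == a]      # the subfield F_4
spread = [frozenset(gf16_mul(gf16_pow(theta, i), a) for a in L)
          for i in range(q ** (t + 1) + 1)]

# Each component is an F_2-subspace (closed under XOR) ...
closed = all((a ^ b) in comp for comp in spread for a in comp for b in comp)
# ... and the nonzero vectors are covered exactly once.
covered_once = sorted(x for comp in spread for x in comp if x) == list(range(1, 16))
```

Each of the five components has four vectors, that is, three projective points, so together they partition the 15 points of PG(3, 2).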
14.3.17 Proposition [1515, p. 200] Any three mutually skew t-subspaces of the projective space
PG(2t + 1, q) determine a unique t-regulus containing them.
14.3.18 Remark The points covered by a 1-regulus $R$ in PG(3, q) are the points of a hyperbolic
quadric. The transversals to $R$ form another 1-regulus covering the same hyperbolic quadric.
This 1-regulus is the opposite regulus $R^{\mathrm{opp}}$ to $R$.
14.3.19 Definition Let q > 2 be a prime power. A spread S in PG(2t + 1, q) is regular if for any
three elements of S, the t-regulus determined by them is contained in S. (See [1515] for
an alternative definition valid for q = 2.)
is “linear” in a well-defined way [427], then the resulting subregular planes are the André
planes which are two-dimensional over their kernels. Thus Hall planes are André planes,
but not necessarily vice versa.
14.3.24 Remark In [1508], a method is given for obtaining a spread of PG(3, $q^2$) from a spread of
PG(3, q) for every odd prime power q.
14.3.25 Definition Let S0 be a regular spread of PG(3, q). A nest N in S0 is a set of reguli in S0
such that every line of S0 belongs to 0 or 2 reguli of N . Thus a nest is a 2-cover of the
lines of S0 which are contained in the nest.
14.3.26 Remark Counting arguments show that the number t of reguli in a nest must satisfy
(q + 3)/2 ≤ t ≤ 2(q − 1). In particular, we note that q must be odd for nests to exist. If a
nest contains t reguli, it is a t-nest. If U denotes the t(q + 1)/2 lines of S0 contained in the
reguli of a t-nest N , there is a natural potential replacement set for U . Namely, if (q + 1)/2
lines can be found in the opposite regulus to each regulus of N such that the resulting set
W of t(q + 1)/2 lines are mutually disjoint, then S = (S0 \ U ) ∪ W is a non-regular spread
of PG(3, q) and hence A(S) is a non-Desarguesian translation plane. In this case, the nest
N is replaceable, and the resulting plane A(S) is a nest plane.
14.3.27 Definition An inversive plane is a $3$-$(n^2 + 1, n + 1, 1)$ design (see Section 14.5), for some
integer $n \ge 2$. That is, an inversive plane is an incidence structure of $n^2 + 1$ points and
$n(n^2 + 1)$ blocks, each block of size $n + 1$, such that every three points lie in a unique
block. The blocks are the circles of the inversive plane.
14.3.28 Construction Let $q$ be any prime power. Take as points the elements of $\mathbb{F}_{q^2}$ together with
the symbol $\infty$. Take as circles the images of $\mathbb{F}_q \cup \{\infty\}$ under the nonsingular linear fractional
mappings on $\mathbb{F}_{q^2}$, with the usual conventions on $\infty$. If incidence is given by containment, this
produces an inversive plane with $q^2 + 1$ points whose circles have size $q + 1$. This inversive
plane is Miquelian because it satisfies the classical configurational result of Miquel, and is
denoted by $M(q)$ [807].
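For $q = 3$ the construction can be carried out exhaustively. The sketch below is a hedged illustration: it assumes the pair representation of $\mathbb{F}_9$ with $\alpha^2 = \alpha + 1$, uses the string "inf" for the symbol $\infty$, and applies every nonsingular map $x \mapsto (ax + b)/(cx + d)$ to the base circle $\mathbb{F}_3 \cup \{\infty\}$. All names are chosen here for the example.

```python
from itertools import combinations, product

ELEMENTS = [(c0, c1) for c0 in range(3) for c1 in range(3)]   # c0 + c1*alpha in F_9
ZERO, INF = (0, 0), "inf"

def add(u, v):
    return ((u[0] + v[0]) % 3, (u[1] + v[1]) % 3)

def sub(u, v):
    return ((u[0] - v[0]) % 3, (u[1] - v[1]) % 3)

def mul(u, v):
    high = u[1] * v[1]                       # coefficient of alpha^2 = alpha + 1
    return ((u[0] * v[0] + high) % 3, (u[0] * v[1] + u[1] * v[0] + high) % 3)

def inverse(u):
    result = (1, 0)
    for _ in range(7):                       # u^7 = u^(-1) since u^8 = 1
        result = mul(result, u)
    return result

def fractional(a, b, c, d, x):
    """Image of x under x -> (a*x + b)/(c*x + d), with the usual rules for infinity."""
    if x == INF:
        return INF if c == ZERO else mul(a, inverse(c))
    denominator = add(mul(c, x), d)
    if denominator == ZERO:
        return INF
    return mul(add(mul(a, x), b), inverse(denominator))

base_circle = [(0, 0), (1, 0), (2, 0), INF]  # F_3 together with infinity
circles = set()
for a, b, c, d in product(ELEMENTS, repeat=4):
    if sub(mul(a, d), mul(b, c)) != ZERO:    # nonsingular maps only
        circles.add(frozenset(fractional(a, b, c, d, x) for x in base_circle))

points = ELEMENTS + [INF]
three_points_one_circle = all(
    sum(1 for circle in circles if set(triple) <= circle) == 1
    for triple in combinations(points, 3)
)
```

The result should be $n(n^2 + 1) = 30$ circles of size 4 on 10 points, with every three points on exactly one circle, in agreement with Definition 14.3.27 for $n = 3$.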
14.3.29 Theorem [427] There is a one-to-one correspondence between the points and circles of
M (q), and the lines and reguli of a regular spread of PG(3, q). There is an associated
homomorphism from the stabilizer of the regular spread to the automorphism group of
M (q), whose kernel is a cyclic group of order q + 1.
14.3.30 Remark Using the above correspondence, it is usually easier to search for nests in M (q)
rather than directly in a regular spread S0 of PG(3, q). Such nests can often be constructed
by taking the orbit of some carefully chosen “base” circle under a natural cyclic or elemen-
tary abelian subgroup of Aut(M (q)). However, to check if the resulting nest is replaceable,
one must pull back to S0 and work in PG(3, q). Some nests are replaceable and some are
not. Computations involving finite field arithmetic lead to the following results.
14.3.31 Theorem [172, 173, 949, 950, 2374] For any odd prime power $q \ge 5$, there exist replaceable
$t$-nests for $t = q - 1,\ q,\ q + 1,\ 2(q - 1)$. The resulting spreads determine non-Desarguesian
translation planes of order $q^2$ which are two-dimensional over their kernels.
14.3.32 Remark The nesting technique for constructing two-dimensional translation planes is quite
robust. In addition to the above examples, replaceable $t$-nests have been constructed for
many values of $t$ in the range $3(q + 1)/4 - \sqrt{q}/2 \le t \le 3(q + 1)/4 + \sqrt{q}/2$; see [174].
Moreover, the translation planes associated with nests often can be characterized by the
action of certain collineation groups [1609, 1612, 1614, 1615].
14.3.33 Remark Circle geometries and the notion of subregularity can be extended to higher dimen-
sions. Using algebraic pencils of Sherk surfaces, in [756] several infinite families of non-André
subregular translation planes are constructed which are 3-dimensional over their kernels.
Proofs use intricate finite field computations involving the trace, norm, and bitrace.
14.3.34 Definition An affine plane is flag-transitive if it admits a collineation group which acts
transitively on incident point-line pairs.
14.3.35 Remark A straightforward counting argument shows that transitivity on lines implies tran-
sitivity on flags for affine planes.
14.3.36 Remark By a celebrated result of Wagner [2889], every finite flag-transitive affine plane is
necessarily a translation plane, and hence arises from a spread S of PG(2t + 1, q), for some
positive integer t, according to Theorem 14.3.21. The affine plane A(S) is flag-transitive if
and only if the spread S admits a transitive collineation group.
14.3.37 Construction Let $F = \mathbb{F}_{q^{2t+2}}$ be treated as a $(2t + 2)$-dimensional vector space over its
subfield $\mathbb{F}_q$, thus serving as the underlying vector space for $\Sigma = \mathrm{PG}(2t + 1, q)$. If $\theta$ is a
primitive element of $F$, the collineation induced by multiplication by $\theta$ is a Singer cycle
of $\Sigma$; that is, the cyclic group $\langle \theta \rangle$ acts sharply transitively on the points and hyperplanes
of $\Sigma$. If $G$ denotes the Singer subgroup of order $q^{t+1} + 1$, let $O$ denote the partition of the
points of $\Sigma$ into $(q^{t+1} - 1)/(q - 1)$ $G$-orbits of size $q^{t+1} + 1$ each. As shown in [948], these
point orbits are caps when $t$ is odd (see Section 14.4 for the definition of a cap). For future
reference we let $H$ denote the index two subgroup of $G$.
14.3.38 Example [1674] Let $q$ be an odd prime power, and let $t$ be an odd integer. Using the above
model, choose $b \in F$ such that $b^{q^{t+1}-1} = -1$. Let $\sigma : F \to F$ via $\sigma : x \mapsto x^q$, and let $E$
denote the subfield of $F$ whose order is $q^{t+1}$. Then $A_1 = \{x + b x^{\sigma} : x \in E\}$ represents a
$t$-space $\Gamma_1$ of $\Sigma$ that meets half the $G$-orbits of $O$ in two points each (from different $H$-orbits)
and is disjoint from the rest. Similarly, $A_2 = \{bx + b^{\sigma+1} x^{\sigma} : x \in E\}$ represents a
$t$-space $\Gamma_2$ of $\Sigma$ that meets the $G$-orbits of $O$ which are disjoint from $\Gamma_1$ in two points each
(from different $H$-orbits). Moreover, $S = \Gamma_1^H \cup \Gamma_2^H$ is a spread of $\Sigma$ admitting a transitive
collineation group, which yields a non-Desarguesian flag-transitive affine plane $A(S)$ of order
$q^{t+1}$ with $\mathbb{F}_q$ in its kernel.
14.3.39 Example [1674, 1681] Let $q$ be an odd prime power, and let $t$ be an even integer. Using
the notation of Example 14.3.38, $\Gamma_1$ now meets every $G$-orbit of $O$ in one point each, and
hence $S_1 = \Gamma_1^G$ is a spread of $\Sigma$ which admits a transitive (cyclic) collineation group. The
resulting affine plane $A(S_1)$ is a non-Desarguesian flag-transitive affine plane of order $q^{t+1}$
with $\mathbb{F}_q$ in its kernel. If $q \equiv 1 \pmod 4$, then $\Gamma_2$ also meets each $G$-orbit of $O$ in one point
each, and these points lie in $H$-orbits that are disjoint from $\Gamma_1$. Moreover, $S_2 = \Gamma_1^H \cup \Gamma_2^H$ is a
spread of $\Sigma$ admitting a transitive (non-cyclic) collineation group, thereby yielding another
non-Desarguesian flag-transitive affine plane $A(S_2)$ of order $q^{t+1}$ with $\mathbb{F}_q$ in its kernel. This
plane does not admit a cyclic collineation group acting transitively on the line at infinity.
For $q \equiv 3 \pmod 4$, one may obtain such a spread $S_2$ by replacing $\Gamma_2$ with the $t$-space of $\Sigma$
represented by $\{\mu x^{q^{t+1}} + \mu b^{q^{t+1}} (x^{\sigma})^{q^{t+1}} : x \in E\}$, where $\mu = \theta^{(q^{t+1}-1)/(q-1)}$.
14.3.40 Remark The field automorphism $\sigma$ in the above examples may be replaced by any element
of $\mathrm{Gal}(\mathbb{F}_{q^{2t+2}}/\mathbb{F}_q)$. The resulting planes are non-Desarguesian provided $\sigma$ does not induce
the identity map on the subfield $E$. Lower bounds are given in [1674, 1681] for the number
of mutually non-isomorphic planes obtained as $b$ and $\sigma$ vary.
14.3.41 Example [1675] Let $q$ be a power of 2, and let $t \ge 2$ be an even integer. Using the notation of
Example 14.3.38, let Tr denote the trace from $E$ to $\mathbb{F}_q$, and choose some element $r \in \mathbb{F}_{q^2} \setminus \mathbb{F}_q$.
Let $\Gamma'$ be the $t$-space of $\Sigma$ represented by $\{\mathrm{Tr}(x) + rx : x \in E\}$. Then $\Gamma'$ meets every $G$-orbit
of $O$ in one point each, and hence $S' = (\Gamma')^G$ is a spread of $\Sigma$ which admits a transitive
(cyclic) collineation group. The resulting flag-transitive affine plane $A(S')$ of order $q^{t+1}$ with
$\mathbb{F}_q$ in its kernel is non-Desarguesian provided $q^{t+1} > 8$.
14.3.42 Remark Other than the Hering plane [1488] of order 27 and the Lüneburg planes [1979]
of order $2^{2d}$ for odd d ≥ 3, all known finite flag-transitive affine planes arise from spreads
consisting of a single G-orbit or the union of two H-orbits, where G and H are the Singer
subgroups defined in Construction 14.3.37. It is shown in [951] that if $q = p^e$ for some
odd prime p and some positive integer e and if $\gcd((q^{t+1} + 1)/2, (t + 1)e) = 1$, then any
flag-transitive affine plane of order $q^{t+1}$ with Fq in its kernel (other than the above Hering
plane) must arise in this way. More can be said for t = 1, 2.
14.3.43 Theorem [175] If $q = p^e$ is an odd prime power such that $\gcd((q^2 + 1)/2, e) = 1$, then any
two-dimensional flag-transitive affine plane of order $q^2$ is isomorphic to one of the planes
constructed in Example 14.3.38 with t = 1. The number of such isomorphism classes can be
determined by Möbius inversion. For e = 1 (hence q = p prime), the above gcd condition is
necessarily satisfied and the number of isomorphism classes is precisely (q − 1)/2.
14.3.44 Theorem [169, 176] If $q = p^e$ is an odd prime power such that $\gcd((q^3 + 1)/2, 3e) = 1$, then
any three-dimensional flag-transitive affine plane of order $q^3$, other than Hering’s plane of
order 27, is isomorphic to one of the planes constructed in Example 14.3.39 with t = 2. For
e = 1 (hence q = p prime), the number of isomorphism classes of each type arising from
Example 14.3.39 is precisely (q − 1)/2.
14.3.45 Problem For q even, the classification and complete enumeration of finite flag-transitive
affine planes of dimension two or three over their kernel remains an open problem. The only
known two-dimensional examples are the Lüneburg planes.
14.3.46 Problem The classification of finite flag-transitive affine planes is one of the few open cases
in the program announced in [454] to classify all finite flag-transitive linear spaces. For
arbitrary dimension over the kernel, it is not known if there exist examples of finite flag-
transitive affine planes other than the ones listed above, and the classification seems to be
quite difficult. In the projective setting, it is believed that the only flag-transitive projective
plane is the Desarguesian one, although this remains an open problem.
14.3.6 Subplanes
14.3.47 Definition Let π be a projective plane with point set P and line set L. A projective plane
π′ with point set P′ and line set L′ is a subplane of π if P′ ⊆ P and L′ ⊆ L, and π′
inherits its incidence relation from π.
14.3.48 Theorem [425] Let π be a finite projective plane of order n, and let π′ be a subplane of π
with order m < n. Then $n = m^2$ or $m^2 + m \le n$.
14.3.49 Remark It is unknown whether equality can hold in the above inequality; if so, this would
imply that the order $n = m^2 + m$ of π is not a prime power. The case $n = m^2$ is of particular
interest. In this case, every point of π \ π′ is incident with a unique line of π′, and dually
every line of π \ π′ is incident with a unique point of π′.
14.3.51 Remark In the classical setting, the lattice of subplanes follows directly from the lattice of
subfields. Namely, if $q = p^e$ for some prime p and some positive integer e, then the subplanes
of PG(2, q), up to isomorphism, are precisely PG(2, $p^k$) as k varies over all positive divisors
of e. So PG(2, q) has a Baer subplane if and only if q is a square. Moreover, one can easily
count the number of subplanes of a given order in this classical (Desarguesian) setting.
14.3.52 Theorem [1510, Lemma 4.20] If q is any prime power and n ≥ 2 is any integer, then the
number of subplanes of order q in PG(2, $q^n$), all of which are isomorphic to PG(2, q), is
$$\frac{q^{3(n-1)} (q^{3n} - 1)(q^{2n} - 1)}{(q^3 - 1)(q^2 - 1)}.$$
In particular, the number of Baer subplanes in PG(2, $q^2$) is $q^3 (q^3 + 1)(q^2 + 1)$.
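As a quick numerical check of Theorem 14.3.52, the following Python sketch evaluates the general formula and compares it with the closed form for n = 2. The function names are illustrative and do not come from the text.

```python
def subplane_count(q: int, n: int) -> int:
    """Number of subplanes of order q in PG(2, q^n), from the formula of Theorem 14.3.52."""
    numerator = q ** (3 * (n - 1)) * (q ** (3 * n) - 1) * (q ** (2 * n) - 1)
    denominator = (q ** 3 - 1) * (q ** 2 - 1)
    # Exact division: q^3 - 1 divides q^(3n) - 1 and q^2 - 1 divides q^(2n) - 1.
    return numerator // denominator


def baer_subplane_count(q: int) -> int:
    """Closed form for the special case n = 2."""
    return q ** 3 * (q ** 3 + 1) * (q ** 2 + 1)


# The general formula with n = 2 agrees with the closed form for small prime powers.
for q in (2, 3, 4, 5, 7, 8, 9):
    assert subplane_count(q, 2) == baer_subplane_count(q)

print(subplane_count(2, 2))  # 360 Baer (Fano) subplanes in PG(2, 4)
```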
14.3.53 Remark It is currently unknown if PG(2, q 2 ) has the greatest number of Baer subplanes
among all projective planes of order q 2 . No counter-examples have been found. Amazingly,
there are affine planes of order q 2 which contain more affine subplanes of order q than does
AG(2, q 2 ) [1099].
14.3.54 Definition A Baer subplane partition, or BSP for short, of PG(2, q 2 ) is a partition of the
points of PG(2, q 2 ) into subplanes, each isomorphic to PG(2, q).
14.3.55 Example [426] Consider the full Singer group of order $q^4 + q^2 + 1$ acting sharply transitively
on the points and lines of π = PG(2, $q^2$). Then the orbits under the Singer subgroup of order
$q^2 + q + 1$ are Baer subplanes, and the orbit of any one of these Baer subplanes under the
complementary Singer subgroup of order $q^2 - q + 1$ forms a BSP of π. Such a BSP is classical.
14.3.56 Remark It is shown in [171] that any spread of PG(5, q) admitting a linear cyclic sharply
transitive action corresponds to a “perfect” BSP of PG(2, q 2 ), and this spread is regular if
and only if the BSP is classical. By definition, a BSP is perfect if and only if it is an orbit
of some Baer subplane under an appropriate Singer subgroup, although the Baer subplane
itself need not be a point orbit under a Singer subgroup. Examples 14.3.39 and 14.3.41 for
t = 2 yield the following result.
14.3.57 Theorem [171] Let q ≠ 2 be a prime power. Then there exist non-classical BSPs of
PG(2, $q^2$).
14.3.58 Remark Relatively little is known about the subplane structure of non-Desarguesian planes.
There is no known example of a square order projective plane which has been shown not
to contain a Baer subplane. However, it is not known if every square order projective plane
must contain a Baer subplane. At the other extreme, the Hall planes, the Hughes planes, the
Figueroa planes, and many two-dimensional subregular translation planes have been proven
to contain subplanes of order two (that is, Fano subplanes). It has been conjectured that
every finite non-Desarguesian plane must contain a subplane of order two. More surprisingly,
it is shown in [480] that the Hughes plane of order q 2 (q odd) has a subplane of order 3 when
q ≡ 2 (mod 3). Extensive, but not exhaustive, computer searches for small q have found no
subplanes of order 3 in this plane when q ≡ 1 (mod 3). Very recently [481], subplanes of
order 3 have been proven to exist in all odd order Figueroa planes.
14.3.59 Remark Reference [205] provides an excellent introduction to the topic of unitals. Proofs
and precise statements of most results in this subsection may be found in the above reference.
14.3.60 Definition A unital is a 2-($n^3 + 1$, n + 1, 1) design for some integer n ≥ 3 (that is, a
geometry having $n^3 + 1$ points, with n + 1 points on each line such that any two distinct
points are on exactly one line).
14.3.61 Remark Here the interest is not in unitals as designs, but in unitals embedded in a projective
plane of order n2 . The lines (blocks) of the unital are then the lines of the ambient projective
plane which meet the unital in more than one point (and hence in n + 1 points).
14.3.62 Example Let PG(2, $q^2$) be represented using homogeneous coordinates. Then the points
(x : y : z) for which $x x^q + y y^q + z z^q = 0$ form a unital. This unital is a Hermitian curve.
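The point count in Example 14.3.62 can be checked by brute force for q = 3. The sketch below models $\mathbb{F}_9$ as $\mathbb{F}_3[i]$ with $i^2 = -1$ (an assumption made here for convenience; any model of the field would do) and counts the points of PG(2, 9) on the Hermitian curve; a unital embedded in a plane of order $q^2$ has $q^3 + 1$ points.

```python
from itertools import product

P = 3  # characteristic; F_9 is modelled as F_3[i] with i^2 = -1 (illustrative choice)
q = 3

def add(x, y):
    return ((x[0] + y[0]) % P, (x[1] + y[1]) % P)

def mul(x, y):
    """Multiply x = (a, b) ~ a + b*i and y = (c, d) ~ c + d*i in F_9."""
    a, b = x
    c, d = y
    return ((a * c - b * d) % P, (a * d + b * c) % P)

def power(x, e):
    result = (1, 0)
    for _ in range(e):
        result = mul(result, x)
    return result

FIELD = list(product(range(P), repeat=2))
ZERO, ONE = (0, 0), (1, 0)

# One normalised representative per point of PG(2, 9): first nonzero coordinate equals 1.
points = (
    [(ONE, y, z) for y in FIELD for z in FIELD]
    + [(ZERO, ONE, z) for z in FIELD]
    + [(ZERO, ZERO, ONE)]
)

def on_curve(pt):
    """Test x^(q+1) + y^(q+1) + z^(q+1) = 0, i.e. x x^q + y y^q + z z^q = 0."""
    total = ZERO
    for coord in pt:
        total = add(total, power(coord, q + 1))
    return total == ZERO

unital = [pt for pt in points if on_curve(pt)]
print(len(points), len(unital))  # 91 points in the plane, 28 = 3^3 + 1 on the curve
```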
14.3.65 Remark Orthogonal Buekenhout unitals embedded in PG(2, $q^2$) have been completely enumerated.
In particular, if q is an odd prime, then the number of mutually inequivalent orthogonal
Buekenhout unitals in PG(2, $q^2$) is $\frac{1}{2}(q + 1)$, one of which is the Hermitian curve.
The only nonsingular Buekenhout unital embedded in PG(2, q 2 ) is the Hermitian curve.
Exhaustive computer searches in [195, 2382] show that there are precisely two inequivalent
unitals embedded in each of PG(2, 9) and PG(2, 16), the Hermitian curve and one other
orthogonal Buekenhout unital. The enumeration of orthogonal and nonsingular Buekenhout
unitals in various non-Desarguesian translation planes may be found in [179, 180]. For
instance, if q ≥ 5 is a prime, then, up to equivalence, the Hall plane of order $q^2$ has $\frac{1}{2}(q + 1)$
nonsingular Buekenhout unitals and $1 + \lfloor 3q/4 \rfloor$ orthogonal Buekenhout unitals.
14.3.66 Remark In [915], it is shown that the Hall planes contain unitals which are not obtainable
from any Buekenhout construction. This is the only infinite family of unitals embedded in
translation planes which has been proven to be non-Buekenhout. There are also square-order
non-translation planes which are known to contain unitals, necessarily non-Buekenhout. For
instance, the Hughes planes of order q 2 are known to contain unitals for all odd prime powers
q [2482, 2947]. In [788], the Figueroa planes of order q 6 are shown to contain unitals for any
prime power q. In fact, there is no known example of a square-order projective plane which
has been shown not to contain a unital.
14.3.67 Remark Proofs of almost all results in this subsection may be found in Chapter 12 of [1510].
14.3.68 Definition A {k; r}-arc in PG(2, q) is a set K of k points such that r is the maximum
number of points in K that are collinear. A {k; 2}-arc is a k-arc.
14.3.70 Definition The {k; r}-arcs in PG(2, q) with k = (q + 1)(r − 1) + 1 are maximal {k; r}-arcs.
14.3.71 Example Singleton points (r = 1), the whole plane (r = q + 1), and the complement of
a line (r = q) are trivial maximal {k; r}-arcs. The {q + 2; 2}-arcs in PG(2, q) for q even,
also called hyperovals (see Section VII.2.9 of [706]), are examples of non-trivial maximal
{k; r}-arcs, and have been objects of intense interest for many years.
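The size formula of Definition 14.3.70 can be checked against the cases listed in Example 14.3.71. The following is a minimal sketch; the function name is hypothetical, and q = 8 is an arbitrary even prime power used for illustration.

```python
def maximal_arc_size(q: int, r: int) -> int:
    """Size k = (q + 1)(r - 1) + 1 of a maximal {k; r}-arc in PG(2, q)."""
    return (q + 1) * (r - 1) + 1


q = 8  # arbitrary even prime power, chosen for illustration
assert maximal_arc_size(q, 1) == 1                  # a single point
assert maximal_arc_size(q, q + 1) == q * q + q + 1  # the whole plane
assert maximal_arc_size(q, q) == q * q              # complement of a line
assert maximal_arc_size(q, 2) == q + 2              # a hyperoval
```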
14.3.72 Lemma If K is a non-trivial maximal {k; r}-arc in PG(2, q), then r must be a (proper)
divisor of q.
14.3.73 Theorem [189] If PG(2, q) contains a non-trivial maximal {k; r}-arc, then q must be even.
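where
$$\alpha \oplus \alpha' = \frac{\alpha\lambda + \alpha'\lambda'}{\lambda + \lambda'}, \qquad \beta \oplus \beta' = \frac{\beta\lambda + \beta'\lambda'}{\lambda + \lambda'}, \qquad \lambda \oplus \lambda' = \lambda + \lambda'.$$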
Let F ⊂ C be a set of $2^d - 1$ non-singular conics with common nucleus (0, 0, 1), which is closed
under the composition of distinct conics. Then the points of the conics in F, together with
(0, 0, 1), form a maximal {k; $2^d$}-arc in PG(2, q). To obtain one such set F, assume $q = 2^{4m+2}$
and let $\varepsilon \in \mathbb{F}_{2^{4m+2}}$ be such that $\mathrm{Tr}(\varepsilon) = 1$. Let $A = \{x \in \mathbb{F}_{2^{4m+2}} : x^2 + x \in \mathbb{F}_{2^{2m+1}}\}$, and let
$r(\lambda) = \lambda^3 + \varepsilon$ for all λ ∈ A. Then $|A| = 2^{2m+2}$ and
$$F = \{C_{1, r(\lambda), \lambda} : \lambda \in A \setminus \{0\}\}$$
is such a subset of C which determines, as indicated above, a maximal {k; $2^{2m+2}$}-arc in
PG(2, $2^{4m+2}$). These arcs do not arise from Construction 14.3.74. Other possibilities for F
may be found in [1062, 1406, 1407].
14.3.76 Remark Semifields (see Section 2.1) are algebraic structures that may be used to coor-
dinatize certain translation planes, called semifield planes. These are the only translation
planes which are also dual translation planes. Many new examples have recently been found.
Chapter 6 of [785] is an excellent source for many of these new developments.
14.3.77 Problem As previously mentioned, all known finite projective (and affine) planes have prime
power order, although it is certainly unclear whether this must be true in general. However,
it is now known that if a projective plane of order n admits an abelian collineation group
of order n2 or n2 − n, then n must be a prime power [326, 1634]. An equally important
open problem is whether any finite projective (or affine) plane of prime order p must be
Desarguesian. This appears to be a very difficult problem; the smallest open case is p = 11.
14.3.79 Remark There are other survey articles on substructures in projective planes. The sec-
tion on Finite Geometry in The Handbook of Combinatorial Designs [706] and the survey
article [1513] state the main results on arcs, {k; r}-arcs, caps, unitals, and blocking sets
in PG(2, q), where exact definitions, tables, and supplementary results are provided. In
addition, the collected work [785] contains a great variety of results on substructures in
PG(2, q), techniques for investigating these substructures, and important open problems in
this area. The linearity conjecture on small minimal blocking sets in PG(2, q) is one of the
most important such open problems (see Chapter 3 of [785] for an explicit statement of this
conjecture). Proving this conjecture would imply several new results on various substruc-
tures in PG(2, q) as well as in PG(n, q), for n > 2. In particular, this would include new
results on maximal partial spreads, minihypers, extendability of linear codes, tight sets,
and Cameron-Liebler line classes. The investigation of maximal arcs in PG(2, q) for q even,
inspired by the new construction method of Mathon described in Construction 14.3.75, and
the investigation of embedded unitals as discussed in Section 14.3.7 are central problems
on substructures in projective planes which also merit further research.
See Also
References Cited: [104, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 189, 195,
205, 279, 326, 422, 425, 426, 427, 428, 452, 454, 480, 481, 555, 706, 756, 785, 788, 807, 820,
915, 948, 949, 950, 951, 1062, 1099, 1406, 1407, 1488, 1508, 1510, 1513, 1515, 1560, 1609,
1612, 1613, 1614, 1615, 1634, 1674, 1675, 1681, 1836, 1837, 1979, 1980, 2024, 2374, 2382,
2482, 2889, 2947]
14.4.2 Definition
1. For any m = −1, 0, 1, 2, . . . , n, a subspace of dimension m, or m-space, of
PG(n, F ) is a set of points all of whose representing vectors form, together
with the zero, a subspace of dimension m + 1 of V = V (n + 1, F ); it is denoted
by Πm .
2. A subspace of dimension zero is a point; a subspace of dimension −1 is the
empty set. A subspace of dimension one is a line, of dimension two is a plane,
of dimension three is a solid. A subspace of dimension n − 1 is a hyperplane. A
subspace of dimension n − r is a subspace of codimension r.
14.4.3 Definition
1. The set of m-spaces of PG(n, F) is denoted $\mathrm{PG}^{(m)}(n, F)$ or, when F = Fq, by
$\mathrm{PG}^{(m)}(n, q)$.
2. For r, s, m, n ∈ N, let
(a) $\theta(n, q) = (q^{n+1} - 1)/(q - 1)$, also denoted by θ(n);
(b) $|\mathrm{PG}^{(m)}(n, q)| = \phi(m; n, q)$;
(c) $[r, s]_- = \prod_{i=r}^{s} (q^i - 1)$, for s ≥ r.
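The quantities in Definition 14.4.3 are easy to compute directly. The sketch below implements θ(n, q) and $[r, s]_-$ as defined, and evaluates φ(m; n, q) through the standard Gaussian-binomial count $[n-m+1, n+1]_- / [1, m+1]_-$. That expression for φ is an assumption here, since the corresponding statement does not appear in this excerpt; the function names are also illustrative.

```python
def theta(n: int, q: int) -> int:
    """Number of points of PG(n, q): (q^(n+1) - 1)/(q - 1)."""
    return (q ** (n + 1) - 1) // (q - 1)


def bracket(r: int, s: int, q: int) -> int:
    """[r, s]_- = product of (q^i - 1) for i = r..s (taken to be 1 when s < r)."""
    result = 1
    for i in range(r, s + 1):
        result *= q ** i - 1
    return result


def phi(m: int, n: int, q: int) -> int:
    """Number of m-spaces of PG(n, q), via the standard Gaussian-binomial count (assumed)."""
    return bracket(n - m + 1, n + 1, q) // bracket(1, m + 1, q)


print(theta(2, 2), phi(1, 3, 2), phi(1, 2, 3))  # 7 points of PG(2,2), 35 lines of PG(3,2), 13 lines of PG(2,3)
```

With m = 0 the count reduces to θ(n, q), and by duality the number of hyperplanes equals the number of points.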
14.4.7 Definition
1. If a point P lies in a subspace Πm , then P is incident with Πm or, equally well,
Πm is incident with P .
2. If Πr and Πs are subspaces of PG(n, F ), then the meet or intersection of Πr
and Πs , written Πr ∩ Πs , is the set of points common to Πr and Πs ; it is also a
subspace.
3. The join of Πr and Πs , written Πr Πs , is the smallest subspace containing Πr
and Πs .
1. If Πr and Π′r are both r-spaces in PG(n, F) and Π′r ⊂ Πr, then Π′r = Πr.
2. (Grassmann Identity) If Πr ∩ Πs = Πt and Πr Πs = Πm , then r + s = m + t.
3. A subspace Πm is the join of m + 1 linearly independent points; it is also the
intersection of n − m linearly independent hyperplanes.
4. Equivalently, the set of all representing vectors of the points of Πm , together with
the zero vector, is the intersection of n − m hyperplanes of the vector space V ,
which define n − m linearly independent vectors U = (u0 , . . . , un ).
14.4.9 Theorem (The principle of duality) [1510, Chapter 2] For any space S = PG(n, F ), there is
a dual space S ∗ , whose points and hyperplanes are respectively the hyperplanes and points
of S. For any theorem true in S, there is an equivalent theorem true in S ∗ . In particular, if
T is a theorem in S stated in terms of points, hyperplanes, and incidence, the same theorem
is true in S ∗ and gives a dual theorem T∗ in S by substituting “hyperplane” for “point” and
“point” for “hyperplane.” Thus “join” and “meet” are dual. Hence the dual of an r-space
in PG(n, F ) is an (n − r − 1)-space.
14.4.10 Remark For small dimensions, in PG(2, F ), point and line are dual; in PG(3, F ), point and
plane are dual, whereas the dual of a line is a line.
14.4.11 Definition
1. If H∞ is any hyperplane in PG(n, F ), then AG(n, F ) = PG(n, F )\H∞ is an
affine space of n dimensions over F . When F = Fq , write AG(n, F ) = AG(n, q).
2. The subspaces of AG(n, F ) are the subspaces of PG(n, F ), apart from H∞ , with
the points of H∞ deleted in each case.
3. This hyperplane H∞ is the hyperplane at infinity of AG(n, F ).
14.4.12 Definition
1. If S and S′ are two spaces PG(n, F), n ≥ 2, then a collineation α : S −→ S′ is
a bijection which preserves incidence; that is, if $\Pi_r \subset \Pi_s$, then $\Pi_r^\alpha \subset \Pi_s^\alpha$.
2. It is sufficient that α is a bijection such that, if $\Pi_0 \subset \Pi_1$, then $\Pi_0^\alpha \subset \Pi_1^\alpha$.
3. When n = 1, consider the lines S and S′ embedded in planes over F; then
a collineation α : S −→ S′ is a transformation induced by a collineation of
the planes; that is, if $S_0$ and $S_0'$ are planes with $S \subset S_0$ and $S' \subset S_0'$, and
$\alpha_0 : S_0 \to S_0'$ is a collineation mapping S onto S′, then let α be the restriction
of $\alpha_0$ to S.
where
$$X^{p^m} = (x_0^{p^m}, \ldots, x_n^{p^m}), \qquad T = (t_{ij});$$
that is,
$$t x_i' = x_0^{p^m} t_{0i} + \cdots + x_n^{p^m} t_{ni},$$
for i = 0, 1, . . . , n.
2. If $\{P_0, \ldots, P_{n+1}\}$ and $\{P_0', \ldots, P_{n+1}'\}$ are both subsets of PG(n, F) of cardinality
n + 2 such that no n + 1 points chosen from the same set lie in a hyperplane, then
there exists a unique projectivity α such that $P_i' = P_i^\alpha$, for all i ∈ {0, 1, . . . , n+1}.
14.4.22 Definition Let S be a space PG(n, F) and S′ its dual space superimposed on S; that is,
the points of S′ are the hyperplanes of S and the hyperplanes of S′ are the points of S.
Consider a function α : S −→ S′. If α is a collineation, it is a correlation of S and induces
a collineation, also named α, of S′ to S; that is, as the points of S are transformed to
hyperplanes, then hyperplanes are transformed to points since α preserves incidence. If
α is a projectivity, then it is a reciprocity of S. In either case, if α is involutory, that is
$\alpha^2 = 1$, where 1 is the identity, then α is a polarity of S.
14.4.23 Remark If P and P′ are points and π is a hyperplane such that $P^\alpha = \pi$ and $\pi^\alpha = P'$,
then, in a polarity, P = P′.
14.4.24 Definition Let AG(n, F ) = PG(n, F )\H∞ be an affine space over F . Then, in a given
coordinate frame where H∞ has equation x0 = 0, a point of AG(n, F ) can be written
P(1, x1 , . . . , xn ) and hence as (x1 , . . . , xn ). So the points of AG(n, F ) are the elements
of V (n, F ). The xi are the affine or non-homogeneous coordinates of the given point.
14.4.25 Remark It is assumed that, for any AG(n, F ), the coordinate frame is this one.
14.4.26 Theorem [1510, Chapter 2]
1. The subspaces of AG(n, F ) have the form X + S, where X is any vector and S
is any subspace of V (n, F ).
2. Three points X, Y, Z of AG(n, F ) are collinear if and only if there exists λ in
F \{0, 1} such that X = λY + (1 − λ)Z.
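Part 2 of Theorem 14.4.26 translates directly into a search over λ when F is a prime field. The sketch below is restricted to $\mathbb{F}_p$ with p prime and to three distinct points; the function name and the test points are illustrative assumptions.

```python
def collinear(X, Y, Z, p):
    """Return True if X = lam*Y + (1 - lam)*Z for some lam in F_p other than 0 and 1.

    X, Y, Z are distinct points of AG(n, p), given as tuples of residues mod the prime p.
    """
    for lam in range(2, p):
        if all((lam * y + (1 - lam) * z - x) % p == 0 for x, y, z in zip(X, Y, Z)):
            return True
    return False


# In AG(2, 5): three points on the line y = x, then three points in general position.
print(collinear((1, 1), (0, 0), (3, 3), 5))  # True (lam = 4)
print(collinear((0, 0), (1, 0), (0, 1), 5))  # False
```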
14.4.27 Theorem [1510, Chapter 2]
1. If $\mathbb{F}_{q^n} = \mathbb{F}_q(\beta)$, then the map
$X = (x_1, \ldots, x_n) \mapsto x = x_1 + x_2\beta + \cdots + x_n\beta^{n-1}$
$(x - y)^{q-1} = (x - z)^{q-1}$.
14.4.3 Polarities
14.4.28 Theorem [1510, Chapter 2] Suppose that α is a correlation of PG(n, F ); then it is the
product of an automorphic collineation σ and a projectivity of PG(n, F ) to its dual space
given by the matrix T . Then α is a polarity if and only if
$$\sigma^2 = 1 \quad\text{and}\quad T^{\sigma} T^{*-1} = tI,$$
14.4.32 Definition
1. In a polarity α, if $P^\alpha = \pi$ and $\pi'^\alpha = P'$, with P, P′ points and π, π′ hyperplanes,
then π is the polar (hyperplane) of P and P′ is the pole of π′. Since $\alpha^2 = 1$, the
converse is also true.
2. If P′ lies in $\pi = P^\alpha$, then P lies in $\pi' = P'^\alpha$. In this case, P and P′ are conjugate
points, and π and π′ are conjugate hyperplanes. The point P is self-conjugate
if it lies in its own polar hyperplane; the hyperplane π is self-conjugate if it
contains its own pole.
14.4.33 Remark [1510, Chapter 2] The self-conjugate points P(X) of α are given by $X^{\sigma} T X^{*} = 0$.
14.4.34 Remark For the definition of a linear complex see [1509, Section 15.2].
14.4.35 Definition
1. A quadric Q (or Qn) in PG(n, F), n ≥ 1, is the set of points P(x0, . . . , xn)
satisfying a quadratic equation
$$\sum_{\substack{i, j = 0 \\ i \le j}}^{n} a_{ij} x_i x_j = 0,$$
with aij in F and not all zero. For n = 2, a quadric is a conic; for n = 3, a
quadric is a quadric surface.
with aij in F and not all zero, with σ an automorphism of F of order 2, and
with $a_{ij}^{\sigma} = a_{ji}$. For n = 2, a Hermitian variety is a Hermitian curve; for n = 3,
a Hermitian variety is a Hermitian surface.
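2. for n odd,
$$T = \begin{pmatrix}
1 & 1 & 0 & 0 & \cdots & 0 & 0 \\
1 & 0 & 0 & 0 & \cdots & 0 & 0 \\
0 & 0 & 0 & 1 & \cdots & 0 & 0 \\
0 & 0 & 1 & 0 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & 0 & 1 \\
0 & 0 & 0 & 0 & \cdots & 1 & 0
\end{pmatrix}.$$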
14.4.40 Remark [1510, Chapter 5] Theorem 14.4.39 holds for every field F of characteristic two
with the property that $\{x^2 \mid x \in F\} = F$.
14.4.41 Theorem [1510, Chapter 5] Let Q be a non-singular quadric of PG(n, q). The coordinate
frame can be chosen so that Q has the following equation:
1. n even,
$x_0^2 + x_1 x_2 + x_3 x_4 + \cdots + x_{n-1} x_n = 0$;
2. n odd,
a. $x_0 x_1 + x_2 x_3 + \cdots + x_{n-1} x_n = 0$;
b. $f(x_0, x_1) + x_2 x_3 + \cdots + x_{n-1} x_n = 0$,
where f is any chosen irreducible quadratic form.
Hence, up to a projectivity, there is a unique non-singular quadric in PG(n, q) with n even;
for n odd, there are two types of non-singular quadric.
14.4.42 Definition
1. In case 1 of Theorem 14.4.41, the quadric is parabolic. In case 2.a, it is hyperbolic;
in case 2.b, it is elliptic;
2. The character w of a non-singular quadric in PG(n, q) is 0 if it is elliptic, 1 if it
is parabolic, and 2 if it is hyperbolic.
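The three types in Definition 14.4.42 can be told apart by their point counts. The sketch below counts the points of the canonical forms of Theorem 14.4.41 over $\mathbb{F}_3$ in the smallest dimensions. The irreducible form $f(x_0, x_1) = x_0^2 + x_1^2$ is a choice made here (it is irreducible over $\mathbb{F}_3$ because −1 is a non-square), and the expected sizes q + 1, $(q + 1)^2$, and $q^2 + 1$ are standard facts not stated in this excerpt.

```python
from itertools import product


def projective_points(n, p):
    """One representative per point of PG(n, p), p prime: first nonzero coordinate is 1."""
    pts = []
    for v in product(range(p), repeat=n + 1):
        nonzero = [c for c in v if c != 0]
        if nonzero and nonzero[0] == 1:
            pts.append(v)
    return pts


def count_zeros(form, n, p):
    """Number of points of PG(n, p) on which the quadratic form vanishes."""
    return sum(1 for v in projective_points(n, p) if form(v) % p == 0)


p = 3
parabolic = lambda x: x[0] ** 2 + x[1] * x[2]              # case 1, n = 2
hyperbolic = lambda x: x[0] * x[1] + x[2] * x[3]           # case 2.a, n = 3
elliptic = lambda x: x[0] ** 2 + x[1] ** 2 + x[2] * x[3]   # case 2.b, n = 3

print(count_zeros(parabolic, 2, p),
      count_zeros(hyperbolic, 3, p),
      count_zeros(elliptic, 3, p))  # 4, 16, 10
```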
$$x_0^{q+1} + x_1^{q+1} + \cdots + x_n^{q+1} = 0.$$
14.4.46 Definition A spread S of PG(n, q) by r-spaces is a set of r-spaces which partitions PG(n, q);
that is, every point of PG(n, q) lies in exactly one r-space of S. Hence any two r-spaces
of S are disjoint.
14.4.49 Definition Since Fq is a subfield of $\mathbb{F}_{q^k}$ for k ∈ N\{0}, PG(n, q) is naturally embedded
in PG(n, $q^k$) once the coordinate frame is fixed. Any PG(n, q) embedded in PG(n, $q^k$)
is a subgeometry of PG(n, $q^k$).
14.4.50 Theorem [1510, Chapter 4] If s(n, q, q k ) is the number of subgeometries PG(n, q) embedded
in PG(n, q k ), then
14.4.57 Definition A projectivity α which permutes the θ(n) points of PG(n, q) in a single cycle
is a cyclic projectivity; it is a Singer cycle, and the group it generates is a Singer group.
14.4.58 Theorem [1510, Chapter 4] A projectivity α of PG(n, q) is cyclic if and only if the charac-
teristic polynomial of an associated matrix is subprimitive; that is, the smallest power m of
a characteristic root that lies in Fq is m = θ(n).
14.4.59 Corollary [1510, Chapter 4] A cyclic projectivity permutes the hyperplanes of PG(n, q) in
a single cycle.
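A small instance of Definition 14.4.57 and Corollary 14.4.59 can be checked by hand or by machine. The sketch below takes the companion matrix of $x^3 + x + 1$ over $\mathbb{F}_2$ (a polynomial chosen here for illustration) and verifies that it permutes the 7 points and the 7 lines of PG(2, 2) in single cycles. Over $\mathbb{F}_2$ the points are exactly the nonzero vectors, so no normalisation is needed.

```python
from itertools import product

P = 2
# Row-vector convention X -> X T. T is the companion matrix of x^3 + x + 1 over F_2.
T = ((0, 1, 0),
     (0, 0, 1),
     (1, 1, 0))


def apply(X, T, p=P):
    n = len(X)
    return tuple(sum(X[i] * T[i][j] for i in range(n)) % p for j in range(n))


points = [v for v in product(range(P), repeat=3) if any(v)]

# Orbit of the point (1, 0, 0) under repeated application of T.
start = (1, 0, 0)
point_orbit = [start]
while apply(point_orbit[-1], T) != start and len(point_orbit) < len(points):
    point_orbit.append(apply(point_orbit[-1], T))


def line(U):
    """The line with coordinate vector U, as the set of points X with X . U = 0."""
    return frozenset(X for X in points if sum(x * u for x, u in zip(X, U)) % P == 0)


def image(L):
    return frozenset(apply(X, T) for X in L)


L0 = line((1, 0, 0))
line_orbit = [L0]
while image(line_orbit[-1]) != L0 and len(line_orbit) < len(points):
    line_orbit.append(image(line_orbit[-1]))

print(len(point_orbit), len(line_orbit))  # 7 7
```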
14.4.60 Corollary [1510, Chapter 4] The number of cyclic projectivities in PG(n, q) is given by
$$\sigma(n, q) = q^{n(n+1)/2} \prod_{i=1}^{n} (q^i - 1)\, \varphi(\theta(n))/(n + 1),$$
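The count in Corollary 14.4.60 can be compared with a brute-force enumeration in the smallest case. The sketch below assumes that ϕ denotes Euler's totient function (its definition is not visible in this excerpt) and works in PG(2, 2), where projectivities correspond one-to-one to invertible 3 × 3 matrices over $\mathbb{F}_2$ because the only nonzero scalar is 1. The helper names are illustrative.

```python
from itertools import product
from math import gcd


def euler_phi(m):
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)


def theta(n, q):
    return (q ** (n + 1) - 1) // (q - 1)


def sigma(n, q):
    """Number of cyclic projectivities of PG(n, q), from the displayed formula."""
    prod = 1
    for i in range(1, n + 1):
        prod *= q ** i - 1
    return q ** (n * (n + 1) // 2) * prod * euler_phi(theta(n, q)) // (n + 1)


def apply(X, T):
    return tuple(sum(X[i] * T[i][j] for i in range(3)) % 2 for j in range(3))


def is_single_cycle(T):
    """True if T moves (1, 0, 0) through all 7 nonzero vectors before returning."""
    X = (1, 0, 0)
    Y = X
    for step in range(1, 8):
        Y = apply(Y, T)
        if Y == X:
            return step == 7
    return False  # never returned within 7 steps (for example, T is singular)


count = 0
for entries in product(range(2), repeat=9):
    T = (entries[0:3], entries[3:6], entries[6:9])
    if is_single_cycle(T):
        count += 1

print(count, sigma(2, 2))  # 48 48
```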
14.4.5 k -Arcs
14.4.65 Definition
1. A k-arc in PG(n, q), n ≥ 2, is a set K of k points, with k ≥ n + 1, such that no
n + 1 of its points lie in a hyperplane.
2. An arc K is complete if it is not properly contained in a larger arc.
3. Otherwise, if K∪{P } is an arc for some point P of PG(n, q), the point P extends
K.
14.4.66 Definition A normal rational curve in PG(n, q), n ≥ 2, is any set of points of PG(n, q)
which is projectively equivalent to
14.4.67 Remark [1515, Chapter 27] Any normal rational curve contains q + 1 points. For n = 2,
it is a non-singular conic; for n = 3, it is a twisted cubic. Any (n + 3)-arc in PG(n, q) is
contained in a unique normal rational curve of this space.
14.4.69 Remark For a survey of solutions to Problems I, II, III, see [1512, 1513] and [1511, Chapter
13].
14.4.70 Theorem [1510, Chapter 8] Let K be a k-arc of PG(2, q). Then
1. k ≤ q + 2;
2. for q odd, k ≤ q + 1;
3. any non-singular conic of PG(2, q) is a (q + 1)-arc;
4. each (q + 1)-arc of PG(2, q), q even, extends to a (q + 2)-arc.
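Parts 2 and 3 of Theorem 14.4.70 can be observed directly in PG(2, 5). The sketch below takes the conic $x_0^2 + x_1 x_2 = 0$ (the canonical form of Theorem 14.4.41 for n = 2), checks that no three of its points are collinear, and checks that no further point can be added. It is restricted to prime fields, and the helper names are illustrative.

```python
from itertools import combinations, product


def projective_points(n, p):
    """One representative per point of PG(n, p), p prime: first nonzero coordinate is 1."""
    pts = []
    for v in product(range(p), repeat=n + 1):
        nonzero = [c for c in v if c != 0]
        if nonzero and nonzero[0] == 1:
            pts.append(v)
    return pts


def det3(a, b, c, p):
    """Determinant mod p of the 3 x 3 matrix with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0])) % p


def is_arc(points, p):
    """True if no three of the given points of PG(2, p) are collinear."""
    return all(det3(a, b, c, p) != 0 for a, b, c in combinations(points, 3))


p = 5
plane = projective_points(2, p)
conic = [x for x in plane if (x[0] ** 2 + x[1] * x[2]) % p == 0]
extensions = [pt for pt in plane if pt not in conic and is_arc(conic + [pt], p)]

print(len(conic), is_arc(conic, p), len(extensions))  # 6 True 0
```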
14.4.71 Definition
1. The (q + 1)-arcs of PG(2, q) are ovals.
2. The (q + 2)-arcs of PG(2, q), q even, are complete ovals or hyperovals.
14.4.72 Theorem [1510, Chapter 8] In PG(2, q), q odd, every oval is a non-singular conic.
14.4.73 Remark Theorem 14.4.72 is a celebrated result due to Segre [2575]. For more details on k-
arcs in PG(2, q), ovals and hyperovals see [1510, Chapters 8–10]. The fundamental Theorem
14.4.76, also due to Segre [2577], relates k-arcs of PG(2, q) to plane algebraic curves.
14.4.75 Remark At each point, K has t = q + 2 − k tangents; the total number of tangents is tk.
14.4.76 Theorem [1510, Chapter 10]
1. Let K be a k-arc in PG(2, q), with q even. Then the tk tangents of K belong to
an algebraic envelope Γt of class t, that is, the dual of a plane algebraic curve of
order t, with the following properties:
14.4.81 Theorem (The duality principle for k-arcs) [1515, Chapter 27] A k-arc of PG(n, q), with
n ≥ 2 and k ≥ n + 4, exists if and only if a k-arc of PG(k − n − 2, q) exists.
14.4.82 Corollary [1515, Chapter 27] In PG(q − 2, q), q even and q ≥ 4, there exist (q + 2)-arcs.
14.4.83 Conjecture [1511, Chapter 13]
14.4.84 Definition
1. Let C be a code of length k over an alphabet A of size q with q ≥ 2. In other
words, C is a set of (code)words, where each word is an ordered k-tuple over A.
2. For a given m with 2 ≤ m ≤ k, impose the following condition: no two words
in C agree in as many as m positions. Then $|C| \le q^m$. If $|C| = q^m$, then C is a
maximum distance separable (MDS) code.
14.4.85 Remark MacWilliams and Sloane [1991, Chapter 11] introduce the chapter on MDS codes
as “one of the most fascinating in all of coding theory.”
14.4.86 Definition
1. The (Hamming) distance between two codewords
d(C) = min{d(X, Y) | X, Y ∈ C, X ≠ Y}.
14.4.87 Theorem [2849, Chapter 5] For an MDS code, d(C) = k − m + 1; see Section 15.1.
14.4.88 Remark One of the main problems concerning MDS codes is to maximize d(C), and so k,
for given m and q. Another problem is to determine the structure of C in the optimal case.
14.4.89 Theorem [434] For an MDS code, k ≤ q + m − 1.
14.4.90 Remark For m = 2, the MDS code C gives a set of q 2 codewords of length k, no two
of which agree in more than one position. This is equivalent to the existence of a net of
order q and degree k; see also Section 14.1. It follows that k ≤ q + 1, the case of equality
corresponding to an affine plane of order q; see Section 14.3. Theorem 14.4.89 follows by an
inductive argument.
14.4.91 Remark
1. The case m = 3 and k = q + 2 is equivalent to the existence of an affine plane
of order q, with q even, containing an appropriate system of (q + 2)-arcs. For all
known examples the plane is an affine plane AG(2, q) with $q = 2^h$; see Willems
and Thas [2979].
2. For m = 4 and k = q + 3, it has been shown that either q = 2 or 36 divides q; no
example other than for q = 2 is known to exist; see Bruen and Silverman [431].
14.4.92 Remark Henceforth, only linear MDS codes are considered; that is, C is an m-dimensional
subspace of the vector space V(k, q), which is $\mathbb{F}_q^k$ with the usual addition and scalar
multiplication.
14.4.93 Theorem [1511, Chapter 13] For m ≥ 3, linear MDS codes and arcs are equivalent objects.
14.4.94 Remark Let C be an m-dimensional subspace of V (k, q) and let G be an m × k generator
matrix for C; that is, the rows of G are a basis for C. Then C is MDS if and only if any
m columns of G are linearly independent; this property is preserved under multiplication
of the columns by non-zero scalars. So consider the columns of G as points P1 , . . . , Pk of
PG(m − 1, q). It follows that C is MDS if and only if {P1 , . . . , Pk } is a k-arc of PG(m − 1, q).
This gives the relation between linear MDS codes and arcs.
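The criterion in Remark 14.4.94 can be tested on a concrete generator matrix. The sketch below builds a 4 × 6 matrix over $\mathbb{F}_5$ whose columns are the points $(1, t, t^2, t^3)$, t ∈ $\mathbb{F}_5$, together with (0, 0, 0, 1); these parameters are chosen here purely for illustration. It checks that every 4 columns are linearly independent and then computes the minimum distance by enumeration, which should equal k − m + 1 as in Theorem 14.4.87.

```python
from itertools import combinations, product

p, m = 5, 4  # prime field F_5 and dimension m, both chosen for illustration
columns = [[pow(t, i, p) for i in range(m)] for t in range(p)]
columns.append([0] * (m - 1) + [1])  # the remaining point of the curve
k = len(columns)                      # k = p + 1 = 6


def rank_mod_p(rows, p):
    """Rank over F_p (p prime) by Gaussian elimination."""
    rows = [[x % p for x in r] for r in rows]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse (Python 3.8+)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                f = rows[i][col]
                rows[i] = [(a - f * b) % p for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank


# MDS criterion: any m columns of the generator matrix are linearly independent.
is_mds = all(rank_mod_p(subset, p) == m for subset in combinations(columns, m))


def codeword(msg):
    """Encode a message (length m) as msg * G, one coordinate per column of G."""
    return [sum(a * col[i] for i, a in enumerate(msg)) % p for col in columns]


# For a linear code the minimum distance is the minimum weight of a nonzero codeword.
min_distance = min(
    sum(1 for c in codeword(msg) if c != 0)
    for msg in product(range(p), repeat=m)
    if any(msg)
)

print(is_mds, min_distance, k - m + 1)  # True 3 3
```

Equivalently, the six columns form a 6-arc of PG(3, 5).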
14.4.95 Theorem [1511, Chapter 13] For 2 ≤ m ≤ k − 2, the dual of a linear MDS code is again a
linear MDS code.
14.4.96 Remark For 3 ≤ m ≤ k − 3, Theorem 14.4.95 is the translation of Theorem 14.4.81 from
geometry to coding theory.
14.4.7 k -Caps
14.4.97 Definition
1. In PG(n, q), n ≥ 3, a set K of k points no three of which are collinear is a k-cap.
2. A k-cap is complete if it is not contained in a (k + 1)-cap.
3. A line of PG(n, q) is a secant, tangent, or external line as it meets K in 2, 1 or
0 points.
14.4.99 Definition A ($q^2 + 1$)-cap of PG(3, q), q ≠ 2, is an ovoid; the ovoids of PG(3, 2) are its
elliptic quadrics.
14.4.100 Theorem [1509, Chapter 16] At each point P of an ovoid O of PG(3, q), there is a unique
tangent plane π such that π ∩ O = {P }.
14.4.101 Theorem [1509, Chapter 16]
1. Apart from the tangent planes, every plane meets an ovoid O in a (q + 1)-arc.
2. When q is even, the ($q^2 + 1$)(q + 1) tangents of O are the totally isotropic lines
of a symplectic polarity α of PG(3, q), that is, the lines ℓ for which $\ell^\alpha = \ell$.
14.4.102 Theorem [1509, Chapter 16] In PG(3, q), q odd, every ovoid is an elliptic quadric.
14.4.103 Remark Theorem 14.4.102 is a celebrated result, due independently to Barlotti [202] and
Panella [2358]. Both proofs rely on Theorem 14.4.72.
14.4.104 Theorem [421] In PG(3, q), q even, every ovoid containing at least one conic section is an
elliptic quadric.
14.4.105 Theorem [1509, Chapter 16] In PG(3, q), let W (q) be the incidence structure formed by
all the points and the totally isotropic lines of a symplectic polarity α. Then W (q) admits
a polarity α′ if and only if $q = 2^{2e+1}$; in that case, the absolute points of α′, namely the
points lying in their image lines, form an ovoid of PG(3, q). Such an ovoid is an elliptic
quadric if and only if q = 2.
14.4.106 Definition For $q = 2^{2e+1}$, with e ≥ 1, the ovoids in Theorem 14.4.105 are Tits ovoids.
14.4.107 Theorem [1509, Chapter 16] With $q = 2^{2e+1}$, the canonical form of a Tits ovoid is
14.4.110 Definition Let O be an ovoid of PG(3, q) and let B be the set of all intersections π ∩ O,
with π a non-tangent plane of O. Then the incidence structure formed by the triple
I(O) = (O, B, ∈) is a 3-($q^2 + 1$, q + 1, 1) design. A 3-($n^2 + 1$, n + 1, 1) design
I = (P, B, ∈) is an inversive plane of order n and the elements of B are circles; the
inversive planes arising from ovoids are egglike.
14.4.111 Theorem [807, Chapter 6] Every inversive plane of even order is egglike.
14.4.112 Definition If the ovoid O is an elliptic quadric, then the inversive plane I(O), and any
inversive plane isomorphic to it, is classical or Miquelian.
14.4.113 Remark By Theorem 14.4.102, an egglike inversive plane of odd order is Miquelian. For
odd order, no other inversive planes are known.
14.4.114 Definition Let I be an inversive plane of order n. For any point P of I, the points of I
other than P, together with the circles containing P with P removed, form a 2-($n^2$, n, 1)
design, that is, an affine plane of order n. This plane is denoted $I_P$ and is the internal
plane or derived plane of I at P.
14.4.115 Remark [807, Chapter 6] For an egglike inversive plane I(O) of order q, each internal plane
is Desarguesian, that is, the affine plane AG(2, q) over Fq .
14.4.116 Theorem [2795] Let I be an inversive plane of odd order n. If, for at least one point P of
I, the internal plane $I_P$ is Desarguesian, then I is Miquelian.
14.4.117 Remark Up to isomorphism, there is a unique inversive plane of order n for the values
n = 2, 3, 4, 5, 7; see Chen [608], Denniston [821, 822], Witt [2997]. As a corollary of Theorem
14.4.116 and the uniqueness of the projective plane of order n for n = 3, 5, 7, a computer-free
proof of the uniqueness of the inversive plane of order n is obtained for these n.
14.4.118 Remark For more information about designs, see Section 14.5. For more information about
projective spaces, see [1509, 1510, 1515] and [1511, Chapter 13].
See Also
References Cited: [202, 324, 421, 431, 434, 608, 807, 821, 822, 1050, 1509, 1510, 1511,
1512, 1513, 1515, 1560, 1991, 2313, 2314, 2315, 2358, 2575, 2576, 2577, 2794, 2795, 2849,
2979, 2997]
14.5.1 Basics
14.5.1 Definition A balanced incomplete block design (BIBD) is a pair (V, B) where V is a v-set
and B is a collection of b k-subsets of V (blocks) such that each element of V is contained
in exactly r blocks and any 2-subset of V is contained in exactly λ blocks. The numbers
v, b, r, k, and λ are parameters of the BIBD. Its order is v; its replication number is r;
its blocksize is k; and its index is λ.
14.5.2 Proposition Trivial necessary conditions for the existence of a BIBD(v, b, r, k, λ) are
1. vr = bk, and
2. r(k − 1) = λ(v − 1).
Parameter sets that satisfy conditions 1 and 2 are admissible.
14.5.3 Remark The three parameters v, k, and λ determine the remaining two as
r = λ(v − 1)/(k − 1) and b = vr/k. Hence one often writes (v, k, λ) design to denote a
BIBD(v, b, r, k, λ).
14.5.4 Example The unique (6, 3, 2) design and the unique (7, 3, 1) design have blocks shown below
as columns:
0000011122 0001123
1123423433 1242534
2345554545 3654656
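The two designs above can be checked mechanically against Definition 14.5.1. The following is a minimal sketch, not part of the original text: the helper names blocks_from_columns and bibd_parameters are hypothetical, and the blocks are read column by column from the three digit rows of the example.

```python
from itertools import combinations
from collections import Counter

def blocks_from_columns(rows):
    """Read blocks column by column from equal-length digit strings."""
    return [tuple(int(row[c]) for row in rows) for c in range(len(rows[0]))]

def bibd_parameters(blocks):
    """Return (v, b, r, k, lam) if the blocks form a BIBD, otherwise None."""
    points = sorted({x for blk in blocks for x in blk})
    sizes = {len(set(blk)) for blk in blocks}
    if len(sizes) != 1:
        return None
    k = sizes.pop()
    # Replication count of every point and block count of every pair.
    repl = Counter(x for blk in blocks for x in set(blk))
    pairs = Counter(p for blk in blocks for p in combinations(sorted(set(blk)), 2))
    r_values = {repl[x] for x in points}
    lam_values = {pairs[p] for p in combinations(points, 2)}
    if len(r_values) != 1 or len(lam_values) != 1:
        return None
    return len(points), len(blocks), r_values.pop(), k, lam_values.pop()

design_6_3_2 = blocks_from_columns(["0000011122", "1123423433", "2345554545"])
design_7_3_1 = blocks_from_columns(["0001123", "1242534", "3654656"])
print(bibd_parameters(design_6_3_2))  # (v, b, r, k, lambda)
print(bibd_parameters(design_7_3_1))
```

The returned tuples can also be compared with the admissibility conditions vr = bk and r(k − 1) = λ(v − 1) of Proposition 14.5.2.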
14.5.6 Remark A Steiner triple system of order v can exist only when v − 1 is even because every
element occurs with v − 1 others, and in each block in which it occurs it appears with two
other elements. Moreover, every block contains three pairs and hence the total number of
pairs, v(v − 1)/2, must be a multiple of 3. Thus, it is necessary that v ≡ 1, 3 (mod 6). This
condition was shown to be sufficient in 1847.
14.5.7 Theorem [1741] A Steiner triple system of order v exists if and only if v ≡ 1, 3 (mod 6).
14.5.8 Theorem [708] A TS(v, λ) exists if and only if v ≠ 2 and λ ≡ 0 (mod gcd(v − 2, 6)).
14.5.9 Remark Existence theorems such as Theorems 14.5.7 and 14.5.8 are typically established by
a combination of direct constructions to make designs for specific values of v, and recursive
constructions to make solutions for large values of v from solutions with smaller values
of v. Finite fields are most often used in providing direct constructions, both to provide
ingredients for recursive constructions, and to produce solutions with specific properties.
Examples for triple systems are developed to demonstrate these; see [708].
14.5.10 Construction [2222] Let p be a prime, n ≥ 1, and p^n ≡ 1 (mod 6). Then there is an
STS(p^n). To construct one, let F_{p^n} be a finite field on a set X of size p^n = 6t + 1 with 0 as
its zero element, and ω a primitive root of unity. Then
{{ω^i + j, ω^{2t+i} + j, ω^{4t+i} + j} : 0 ≤ i < t, j ∈ X}
forms the blocks of an STS(p^n) on X.
14.5.12 Definition Two BIBDs (V1 , B1 ), (V2 , B2 ) are isomorphic if there exists a bijection α :
V1 → V2 such that B1 α = B2 . An automorphism is an isomorphism of a design with
itself. The set of all automorphisms of a design forms a group, the (full) automorphism
group. An automorphism group is any subgroup of the full automorphism group.
14.5.14 Remark In Construction 14.5.10, we can treat 𝒟 = {{ω^i, ω^{2t+i}, ω^{4t+i}} : 0 ≤ i < t} as the
base or starter triples of the design. Their development is the result of applying the action
of the elementary abelian group of order p^n to the base triples. The verification requires
that for every difference d ∈ F_{p^n} \ {0}, there is exactly one way to choose x, y ∈ D ∈ 𝒟 so
that d = x − y, with arithmetic in the elementary abelian group of order p^n.
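The difference condition of this remark can be tested numerically. The sketch below is an illustration only and makes a simplifying assumption that is not in the text: it works over a prime field Z_p (the case n = 1), with ω taken to be the smallest primitive root modulo p. The function names are hypothetical.

```python
from itertools import combinations

def primitive_root(p):
    """Smallest generator of the multiplicative group of Z_p (p prime)."""
    for g in range(2, p):
        if len({pow(g, e, p) for e in range(1, p)}) == p - 1:
            return g
    raise ValueError("no primitive root found")

def base_triples(p):
    """Starter triples {w^i, w^(2t+i), w^(4t+i)}, 0 <= i < t, for p = 6t + 1."""
    assert p % 6 == 1
    t, w = (p - 1) // 6, primitive_root(p)
    return [(pow(w, i, p), pow(w, 2 * t + i, p), pow(w, 4 * t + i, p))
            for i in range(t)]

def differences(triples, p):
    """Sorted list of all differences x - y (x != y) inside the base triples."""
    return sorted((x - y) % p for tr in triples for x in tr for y in tr if x != y)

def develop(triples, p):
    """Translate every base triple by every j in Z_p."""
    return {frozenset((x + j) % p for x in tr) for tr in triples for j in range(p)}

def is_sts(blocks, p):
    """True when every pair of points lies in exactly one triple."""
    seen = [pair for blk in blocks for pair in combinations(sorted(blk), 2)]
    return len(seen) == len(set(seen)) == p * (p - 1) // 2

for p in (7, 13, 19, 31):
    d = base_triples(p)
    # Each nonzero difference should occur exactly once among the base triples.
    print(p, differences(d, p) == list(range(1, p)), is_sts(develop(d, p), p))
```

For a proper prime power p^n the same check applies, but the arithmetic would have to be carried out in F_{p^n} rather than modulo an integer.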
14.5.15 Construction [807] Let p be a prime, n ≥ 1, and p^n ≡ 7 (mod 12). Let F_{p^n} be a finite field
on a set X of size p^n = 6t + 1 = 12s + 7 with 0 as its zero element and ω a primitive root
of unity. Then
{{ω^{2i} + j, ω^{2t+2i} + j, ω^{4t+2i} + j} : 0 ≤ i < t, j ∈ X}
forms the blocks of an STS(p^n) on X. (These are the Netto triple systems.)
14.5.16 Remark The Netto triple systems provide examples of STS(v)s that admit 2-homogeneous
automorphism groups but (for v > 7) do not admit 2-transitive groups. We give another
construction of the Netto triple systems. Let
Let ε be a primitive sixth root of unity in F_{p^n}. Then the orbit of {0, 1, ε} under the action
of Γ is the Netto triple system of order p^n. This illustrates one of the principal reasons for
using large automorphism groups, and in particular for using the additive and multiplicative
structure of the finite field – a single triple represents the entire triple system.
14.5.17 Construction [1459] Let p = 2t + 1 be an odd prime. Let ω be a primitive root of unity in
Zp satisfying ω ≡ 1 (mod 3). Then
14.5.18 Definition A set of blocks is a partial parallel class (PPC) if no two blocks in the set share
an element. A PPC is an almost parallel class if it contains (v − 1)/3 blocks; when it contains
v/3 blocks, it is a parallel class or resolution class. A partition of all blocks of a TS(v, λ)
into parallel classes is a resolution and the STS is resolvable. An STS(v) together with
a resolution of its blocks is a Kirkman triple system, KTS(v).
14.5.19 Remark In Construction 14.5.17, the triples {{j, j +p, j +2p} : j ∈ Zp } form a parallel class.
Indeed we can say much more in certain cases. The method of “pure and mixed differences”
[356] is applied, using a set of elements Fq × X, for X a finite set; a pure(x) difference is
the difference d = a − b associated with the pair {(a, x), (b, x)} and a mixed(x,y) difference
is the difference d = a − b associated with the pair {(a, x), (b, y)}.
14.5.20 Construction [2440] If q = p^α ≡ 1 (mod 6) is a prime power, then there exists a KTS(3q).
Let t = (q − 1)/6. To construct a KTS(3q), take as elements F_q × {1, 2, 3}, writing a_i for
(a, i). Let ω be a primitive element in F_q, and let B consist of triples:
1. C = {0_1, 0_2, 0_3};
2. B_{ij} = {(ω^i)_j, (ω^{i+2t})_j, (ω^{i+4t})_j}, 0 ≤ i < t, j ∈ {1, 2, 3};
3. A_i = {(ω^i)_1, (ω^{i+2t})_2, (ω^{i+4t})_3}, 0 ≤ i < 6t.
Each of the (nonzero) pure and mixed differences occurs exactly once in triples of B, and
thus B is the set of starter triples for an STS(3q). This STS(3q) is resolvable. Indeed,
R_0 = {C} ∪ {B_{ij} : 0 ≤ i < t, j ∈ {1, 2, 3}} ∪ {A_{jt+i} : j ∈ {1, 3, 5}, 0 ≤ i < t} forms a parallel
class; when developed modulo 6t + 1, it yields a further 6t parallel classes. Each A_i, when
developed modulo 6t + 1, also yields a parallel class; taking those parallel classes only from
A_{jt+i} with j ∈ {0, 2, 4} and 0 ≤ i < t thus yields a further 3t parallel classes, for a total
of 9t + 1 forming the resolution.
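The construction can be exercised on small cases. The sketch below is an illustration under stated assumptions, not the authors' code: q is taken to be a prime (so that F_q is Z_q), ω is the smallest primitive root modulo q, and the triples A_i are taken for all 0 ≤ i < 6t, which is the range that the later use of A_{jt+i} with j up to 5 requires. A point a_j is stored as the pair (a, j); all function names are hypothetical.

```python
from itertools import combinations

def primitive_root(q):
    """Smallest generator of the multiplicative group of Z_q (q prime)."""
    return next(g for g in range(2, q)
                if len({pow(g, e, q) for e in range(1, q)}) == q - 1)

def kts_3q(q):
    """Parallel classes of the KTS(3q) sketched above, for a prime q = 6t + 1."""
    assert q % 6 == 1
    t, w = (q - 1) // 6, primitive_root(q)

    def pw(e):
        return pow(w, e, q)

    def shift(block, s):
        # Develop a block by adding s to the field coordinate of each point.
        return frozenset(((a + s) % q, level) for a, level in block)

    C = [(0, 1), (0, 2), (0, 3)]
    B = [[(pw(i), j), (pw(i + 2 * t), j), (pw(i + 4 * t), j)]
         for i in range(t) for j in (1, 2, 3)]
    A = [[(pw(i), 1), (pw(i + 2 * t), 2), (pw(i + 4 * t), 3)]
         for i in range(6 * t)]
    r0 = [C] + B + [A[j * t + i] for j in (1, 3, 5) for i in range(t)]
    classes = [[shift(blk, s) for blk in r0] for s in range(q)]
    classes += [[shift(A[j * t + i], s) for s in range(q)]
                for j in (0, 2, 4) for i in range(t)]
    return classes

def is_resolution(classes, points):
    """Each class partitions the points and every pair is covered exactly once."""
    for cls in classes:
        if 3 * len(cls) != len(points) or set().union(*cls) != points:
            return False
    pairs = [frozenset(p) for cls in classes for blk in cls
             for p in combinations(blk, 2)]
    n = len(points)
    return len(pairs) == len(set(pairs)) == n * (n - 1) // 2

for q in (7, 13):
    points = {(a, j) for a in range(q) for j in (1, 2, 3)}
    classes = kts_3q(q)
    print(q, len(classes), is_resolution(classes, points))
```

The number of classes printed should equal 9t + 1, matching the count given in the construction.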
14.5.21 Construction [2440] If q = p^α ≡ 1 (mod 6) is a prime power, then there exists a KTS(2q+1).
Let t = (q − 1)/6. To construct a KTS(2q + 1), take as elements (F_q × {1, 2}) ∪ {∞}. Let ω
be a primitive element of F_q, and let m satisfy 2ω^m = ω^t + 1. Let B consist of triples
1. C = {0_1, 0_2, ∞};
2. B_i = {(ω^i)_1, (ω^{i+t})_1, (ω^{i+m})_2}, 0 ≤ i < t, 2t ≤ i < 3t, or 4t ≤ i < 5t;
3. A_i = {(ω^{i+m+t})_2, (ω^{i+m+3t})_2, (ω^{i+m+5t})_2}, 0 ≤ i < t.
Every pure and every mixed difference occurs exactly once and hence B is a set of starter
triples for an STS(2q + 1). But B itself forms a parallel class, whose development modulo q
yields the required q parallel classes for the KTS(2q + 1).
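A small numerical check of this construction is sketched below. It rests on assumptions that should be read as such: q is a prime, ω is the smallest primitive root modulo q, and the triples A_i are taken as {(ω^{i+m+t})_2, (ω^{i+m+3t})_2, (ω^{i+m+5t})_2}, the choice for which the level-2 points of the A_i avoid those already used by the B_i. The point ∞ is represented by the string "inf", and all function names are hypothetical.

```python
from itertools import combinations

INF = "inf"  # stands for the point at infinity

def primitive_root(q):
    """Smallest generator of the multiplicative group of Z_q (q prime)."""
    return next(g for g in range(2, q)
                if len({pow(g, e, q) for e in range(1, q)}) == q - 1)

def kts_2q_plus_1(q):
    """Parallel classes of the KTS(2q + 1) sketched above, for a prime q = 6t + 1."""
    assert q % 6 == 1
    t, w = (q - 1) // 6, primitive_root(q)

    def pw(e):
        return pow(w, e, q)

    # m is defined by 2 * w^m = w^t + 1; (q + 1) // 2 is the inverse of 2 mod q.
    target = (pw(t) + 1) * ((q + 1) // 2) % q
    m = next(e for e in range(q - 1) if pw(e) == target)

    starter = [[(0, 1), (0, 2), INF]]
    for lo in (0, 2 * t, 4 * t):
        for i in range(lo, lo + t):
            starter.append([(pw(i), 1), (pw(i + t), 1), (pw(i + m), 2)])
    for i in range(t):
        starter.append([(pw(i + m + t), 2), (pw(i + m + 3 * t), 2),
                        (pw(i + m + 5 * t), 2)])

    def shift(x, s):
        # The point at infinity is fixed by every translation.
        return x if x == INF else ((x[0] + s) % q, x[1])

    return [[frozenset(shift(x, s) for x in blk) for blk in starter]
            for s in range(q)]

def is_resolution(classes, points):
    """Each class partitions the points and every pair is covered exactly once."""
    for cls in classes:
        if 3 * len(cls) != len(points) or set().union(*cls) != points:
            return False
    pairs = [frozenset(p) for cls in classes for blk in cls
             for p in combinations(blk, 2)]
    n = len(points)
    return len(pairs) == len(set(pairs)) == n * (n - 1) // 2

for q in (7, 13):
    points = {(a, j) for a in range(q) for j in (1, 2)} | {INF}
    print(2 * q + 1, is_resolution(kts_2q_plus_1(q), points))
```

For q = 7 this yields a KTS(15) with 7 parallel classes of 5 triples each.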
14.5.22 Definition Let B = {b_1, ..., b_k} be a subset of an additive group G. The G-stabilizer of B
is the subgroup G_B of G consisting of all elements g ∈ G such that B + g = B. B is
full or short according to whether G_B is or is not trivial. The G-orbit of B is the set
Orb_G B of all distinct right translates of B, namely, Orb_G B = {B + s | s ∈ S} where S
is a complete system of representatives for the right cosets of G_B in G.
integer μ_g. The list of partial differences from B is the multiset ∂B where each g ∈ G
appears exactly μ_g times. (∆B = ∂B if and only if B is a full block.)
14.5.24 Definition Let G be a group of order v. A collection {B1 , ..., Bt } of k-subsets of G forms a
(v, k, λ) difference family (or difference system) if every nonidentity element of G occurs
λ times in ∂B1 ∪ · · · ∪ ∂Bt . The sets Bi are base blocks. A difference family having at
least one short block is partial.
14.5.25 Remark
1. All definitions given can be extended to a multiplicative group by replacing B + g
with B · g and b_i − b_j with b_i b_j^{−1}.
2. If t = 1, then B1 is a (v, k, λ) difference set; see Section 14.6.
3. If {B1 , . . . , Bt } is a (v, k, λ) difference family over G, OrbG (B1 ) ∪ · · · ∪ OrbG (Bt )
is the collection of blocks of a BIBD(v, k, λ) admitting G as a sharply point-
transitive automorphism group. This BIBD is cyclic (abelian, nonabelian, dihe-
dral, and so on) if the group G has the respective property. In this case the
difference family is a cyclic (abelian, nonabelian, dihedral, respectively) difference
family.
4. A BIBD(v, k, λ) with an automorphism group G acting sharply transitively on the
points is (up to isomorphism) generated by a suitable (v, k, λ) difference family.
5. Every short block of a (v, k, 1) difference family over an abelian group G is a
coset of a suitable subgroup of G.
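Items 3 and 5 can be illustrated on a small cyclic example. The sketch below is not from the original text: it uses the base blocks {0, 1, 4}, {0, 2, 8}, {0, 5, 10} in Z_15, chosen here for illustration, and assumes that the multiplicity of a difference in ∂B is its multiplicity in the full list of differences divided by the order of the stabilizer of B. The function names are hypothetical.

```python
from itertools import combinations
from collections import Counter

def orbit(block, v):
    """Distinct translates of a block in Z_v."""
    return {frozenset((x + g) % v for x in block) for g in range(v)}

def partial_differences(block, v):
    """Multiset of partial differences of a block in Z_v.

    Assumption: multiplicities in the full difference list are divided
    by the order of the stabilizer of the block.
    """
    base = frozenset(block)
    stab = sum(1 for g in range(v)
               if frozenset((x + g) % v for x in block) == base)
    full = Counter((x - y) % v for x in block for y in block if x != y)
    return Counter({g: mult // stab for g, mult in full.items()})

def is_difference_family(blocks, v, lam):
    """Every nonzero element occurs lam times among the partial differences."""
    total = Counter()
    for blk in blocks:
        total += partial_differences(blk, v)
    return all(total[g] == lam for g in range(1, v))

def is_bibd(blocks, v, lam):
    """Every pair of points lies in exactly lam blocks."""
    counts = Counter(p for blk in blocks for p in combinations(sorted(blk), 2))
    return all(counts[p] == lam for p in combinations(range(v), 2))

base_blocks = [(0, 1, 4), (0, 2, 8), (0, 5, 10)]   # the last block is short
design = set().union(*(orbit(blk, 15) for blk in base_blocks))
print(is_difference_family(base_blocks, 15, 1), len(design), is_bibd(design, 15, 1))
```

The short block {0, 5, 10} is the subgroup of order 3 in Z_15, in line with item 5, and contributes only 5 distinct translates to the 35 blocks.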
14.5.26 Theorem [261] The set of order p subgroups of F_{p^n} forms a (p^n, p, 1) difference family
generating the point-line design associated with the affine geometry AG(n, p).
14.5.27 Definition
1. C = {c1 , . . . , ck } is a multiple of B = {b1 , . . . , bk } if, for some w, ci = w · bi for
all i.
2. w is a multiplier of order n of B = {b_1, . . . , b_k} if w^n = 1 but w^i ≠ 1 for
0 < i < n, and for some g ∈ G, B = w·B + g = {w·b_1 + g, w·b_2 + g, . . . , w·b_k + g}.
3. w is a multiplier of a difference family D if, for each base block B ∈ D, there
exists C ∈ D and g ∈ G for which w · B + g = C.
4. If q is a prime power and D is a (q, k, λ) difference family over Fq in which one
base block, B, has a multiplier of order k or k − 1 and all other base blocks are
multiples of B, then D is radical.
14.5.28 Theorem [5] Suppose q ≡ 7 (mod 12) is a prime power and there exists a cube root of
unity ω in F_q such that x = ω − 1 is a primitive root. Then the following base blocks form
a (7q, 4, 1) difference family over Z_7 × F_q:
1. {(0,0), (0, (x − 1)x^{2t−1}), (0, ω(x − 1)x^{2t−1}), (0, ω^2(x − 1)x^{2t−1})} for 1 ≤ t ≤
(q − 7)/12,
2. {(0,0), (1, x^{2t}), (2, ωx^{2t}), (4, ω^2 x^{2t})} for 1 ≤ t ≤ (q − 3)/2, x^{2t} ≠ ω, and
3. {(0,0), (2^t, ω^t), (2^t, x · ω^t), (2^{t+2}, 0)} for 0 ≤ t ≤ 2.
14.5.29 Remark The (7q,4,1) difference families are obtainable by Theorem 14.5.28 for q = 7, 19,
31, 43, 67, 79, 103, 127, 151, 163, 199, 211, 367, 379, 439, 463, 487, 571, but not for q =
139, 223, 271, 283, 307, 331, 523, 547. A more general construction for (7q,4,1) difference
families with q a prime power ≡ 7 (mod 12) can be found in [5].
14.5.30 Theorem [2986] Suppose q is a prime power, and λ(q − 1) ≡ 0 (mod k(k − 1)). Then a
(q, k, λ) difference family over Fq exists if
1. λ is a multiple of k/2 or (k − 1)/2;
2. λ ≥ k(k − 1); or
3. q > C(k, 2)^{k(k−1)}, where C(k, 2) = k(k − 1)/2 is the binomial coefficient.
14.5.31 Theorem [459] Suppose q is an odd prime power. Then there exists a (q, k, λ) radical
difference family if either:
1. λ is a multiple of k/2 and q ≡ 1 (mod k − 1), or
2. λ is a multiple of (k − 1)/2 and q ≡ 1 (mod k).
14.5.32 Remark For radical difference families with λ = 1, the multiplier must have odd order (that
is, order k if k is odd, or order k − 1 if k is even).
14.5.33 Theorem [459] Let q = 12t + 1 be a prime power and 2^e be the largest power of 2 dividing
t. Then a (q, 4, 1) radical difference family in F_q exists if and only if −3 is not a 2^{e+2}-th
power in Fq . (This condition holds for q = 13, 25, 73, 97, 109, 121, 169, 181, 193, 229, 241,
277, 289, 313, 337, 409, 421, 433, 457, 529, 541, 577, 601, 625, 673, 709, 733, 757, 769, 829,
841.)
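The criterion of Theorem 14.5.33 is easy to evaluate when q is a prime. The sketch below is an illustration under that restriction (prime powers such as 25 or 121 would need arithmetic in an extension field, which is not attempted here); the function names are hypothetical. It uses the fact that, for d dividing q − 1, an element a of Z_q is a d-th power exactly when a^{(q−1)/d} = 1.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def radical_4_1_condition(q):
    """For a prime q = 12t + 1: True when -3 is not a 2^(e+2)-th power in Z_q."""
    assert is_prime(q) and q % 12 == 1
    t, e = (q - 1) // 12, 0
    while t % 2 == 0:          # 2^e is the largest power of 2 dividing t
        t //= 2
        e += 1
    d = 2 ** (e + 2)           # d divides q - 1 = 12t
    return pow(-3 % q, (q - 1) // d, q) != 1

primes = [q for q in range(13, 200, 12) if is_prime(q)]
print([q for q in primes if radical_4_1_condition(q)])
```

The primes printed can be compared with the prime entries below 200 in the list accompanying the theorem.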
14.5.34 Theorem [459] Let q = 20t + 1 be a prime power, and let 2^e be the largest power of 2
dividing t. Then a (q, 5, 1) radical difference family in F_q exists if and only if (11 + 5√5)/2
is not a 2^{e+1}-th power in F_q. (This condition holds for q = 41, 61, 81, 241, 281, 401, 421,
601, 641, 661, 701, 761, 821, 881.)
14.5.35 Remark In [460], necessary and sufficient conditions are given for a (q, k, 1) radical difference
family with k ∈ {6, 7} to exist over Fq ; a sufficient condition is also given for k ≥ 8.
14.5.36 Theorem [460, 1356, 2986] Among others, (q, k, 1) radical difference families exist for the
following values of q and k:
k=6 q ∈ {181, 211, 241, 631, 691, 1531, 1831, 1861, 2791, 2851, 3061};
k=7 q ∈ {337, 421, 463, 883, 1723, 3067, 3319, 3823, 3907, 4621, 4957,
5167, 5419, 5881, 6133, 8233, 8527, 8821, 9619, 9787, 9829};
k=8 q ∈ {449, 1009, 3137, 3697, 6217, 6329, 8233, 9869};
k=9 q ∈ {73, 1153, 1873, 2017, 6481, 7489, 7561, 8359}.
14.5.37 Theorem [1268] If there exists a (p, k, 1) radical difference family with p a prime and k odd,
there exists a cyclic RBIBD(kp, k, 1) whose resolution is invariant under the action of Zkp .
14.5.38 Theorem [708] Let p be a prime, n ≥ 1, p^n ≡ 1 (mod 6), and ω be a primitive root of F_{p^n},
p^n = 6t + 1. Let S = {ω^0, ω^{2t}, ω^{4t}}, and S_i = ω^i S.
1. For 0 ≤ c < t, the development of {0} ∪ S_c under the addition and multiplication
of F_{p^n} forms a (p^n, 4, 2) design in which the omission of the first element in each
block yields an STS(p^n).
2. For 0 ≤ c < d < t, the development of S_c ∪ S_d under the addition and
multiplication of F_{p^n} forms a (p^n, 6, 5) design.
14.5.39 Remark The STSs in Theorem 14.5.38 Part 1 have been called nested Steiner triple systems,
but the standard statistical notion of nested design is different – see Definition 14.5.40 and
[2159].
14.5.40 Definition If the blocks of a BIBD (V, D_1) with v symbols in b_1 blocks of size k_1 are
each partitioned into sub-blocks of size k_2, and the b_2 = b_1 k_1/k_2 sub-blocks themselves
constitute a BIBD (V, D_2), then the system of blocks, sub-blocks, and symbols
is a nested balanced incomplete block design (nested BIBD or NBIBD) with parameters
(v, b_1, b_2, r, k_1, k_2), r denoting the common replication. Also (V, D_1) and (V, D_2) are the
component BIBDs of the NBIBD.
14.5.41 Remark A resolvable BIBD (RBIBD) (V, D) is a nested block design (V, D1 , D2 ) where the
blocks of D1 , of size k1 = v, are the resolution classes of D, and D2 = D.
14.5.42 Remark Nested block designs may have more than two blocking systems and consequently
more than one level of nesting. A doubly nested block design is a system (V, D1 , D2 , D3 )
where both (V, D1 , D2 ) and (V, D2 , D3 ) are nested block designs. A resolvable NBIBD is a
doubly nested block design.
14.5.43 Definition A multiply nested BIBD (MNBIBD) is a nested block design (V, D1 , D2 , . . . , Ds )
with parameters (v, b1 , . . . , bs , r, k1 , . . . , ks ) for which the systems (V, Dj , Dj+1 ) are
NBIBDs for j = 1, . . . , s − 1.
14.5.45 Definition A nested row-column design is a system (V, D1 , D2 , D3 ) for which (1) each
of (V, D1 , D2 ) and (V, D1 , D3 ) is a nested block design, (2) each block of D1 may be
displayed as a k2 × k3 row-column array, one member of the block at each position in
the array, so that the columns are the D2 sub-blocks in that block, and the rows are the
D3 sub-blocks in that block.
14.5.46 Definition A (completely balanced) balanced incomplete block design with nested rows and
columns, BIBRC(v, b1 , k2 , k3 ), is a nested row-column design (V, D1 , D2 , D3 ) for which
each of (V, D1 , D2 ) and (V, D1 , D3 ) is a NBIBD.
14.5.47 Theorem [2159] If v = mpq + 1 is a prime power and p and q are relatively prime, then
initial nesting blocks for a BIBRC(v, mv, sp, tq) are A_l = x^{l−1} L ⊗ M for l = 1, . . . , m, where
L_{s×t} = (x^{i+j−2})_{i,j}, M_{p×q} = (x^{[(i−1)q+(j−1)p]m})_{i,j}, s and t are integers with st ≤ m, and x
is a primitive element of F_v. (Here ⊗ is the Kronecker product.) If m is even and pq is odd,
A_1, . . . , A_{m/2} are initial nesting blocks for BIBRC(v, mv/2, sp, tq).
14.5.48 Theorem [2159] Write x^{u_i} = 1 − x^{2mi} where x is a primitive element of F_v and v = 4tm + 1
is a prime power. Let A be the addition table with row margin (x^0, x^{2m}, . . . , x^{(4t−2)m}) and
column margin (x^m, x^{3m}, . . . , x^{(4t−1)m}), and set A_l = x^{l−1} A. If u_i − u_j ≢ m (mod 2m) for
i, j = 1, . . . , t, then A_1, . . . , A_m are initial nesting blocks for BIBRC(v, mv, 2t, 2t). Including
0 in each margin for A, if further u_i ≢ m (mod 2m) for i = 1, . . . , t, then A_1, . . . , A_m are
initial nesting blocks for BIBRC(v, mv, 2t + 1, 2t + 1).
14.5.49 Definition Let K be a subset of positive integers and let λ be a positive integer. A pairwise
balanced design (PBD(v, K, λ) or (K, λ)-PBD) of order v with block sizes from K is a
pair (V, B), where V is a finite set (the point set) of cardinality v and B is a family
of subsets (blocks) of V that satisfy (1) if B ∈ B, then |B| ∈ K and (2) every pair of
distinct elements of V occurs in exactly λ blocks of B. The integer λ is the index of the
PBD. The notations PBD(v, K) and K-PBD of order v are often used when λ = 1.
14.5.50 Example A PBD(10, {3, 4}) is given below where the blocks are listed columnwise.
1 1 1 2 2 2 3 3 3 4 4 4
2 5 8 5 6 7 5 6 7 5 6 7
3 6 9 8 9 10 10 8 9 9 10 8
4 7 10
14.5.51 Remark Many constructions of pairwise balanced designs employ sub-structures in balanced
incomplete block designs. In a (v, k, λ)-design (V, B), useful sub-structures include those
specified by a set S ⊂ V so that for every B ∈ B, |B ∩ S| ∈ L; then setting K = {k − ℓ :
ℓ ∈ L}, a (|V \ S|, K, λ)-PBD arises by removing all points of S. When the BIBD is made
by a finite field construction, such sub-structures may arise from algebraic properties of the
field. Other useful sub-structures arise from the presence of parallel classes; when a parallel
class of blocks is present, a new element can be added and adjoined to each block in the
parallel class to increase the size of some blocks by one. Example 14.5.50 is produced in
this way from a (9,3,1)-design. This can be applied to more than one parallel class, when
present [706, §IV].
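The parallel-class extension described in this remark can be carried out explicitly. The sketch below is an illustration only: it assumes that the (9,3,1)-design is realized as the affine plane over Z_3, labels the new element "inf", and uses hypothetical names throughout. One parallel class receives the new element; the resulting blocks are then checked against Definition 14.5.49.

```python
from itertools import combinations
from collections import Counter

def affine_plane_classes(p):
    """Lines of AG(2, p), p prime, grouped into p + 1 parallel classes."""
    classes = [[frozenset((x, (m * x + c) % p) for x in range(p))
                for c in range(p)] for m in range(p)]          # lines y = m x + c
    classes.append([frozenset((c, y) for y in range(p))
                    for c in range(p)])                        # vertical lines x = c
    return classes

classes = affine_plane_classes(3)
new_point = "inf"
# Adjoin the new element to every block of one parallel class.
blocks = [blk | {new_point} for blk in classes[0]]
blocks += [blk for cls in classes[1:] for blk in cls]

points = set().union(*blocks)
pair_counts = Counter(frozenset(p) for blk in blocks for p in combinations(blk, 2))
n = len(points)
is_pbd = len(pair_counts) == n * (n - 1) // 2 and set(pair_counts.values()) == {1}
block_sizes = sorted(Counter(len(blk) for blk in blocks).items())
print(n, block_sizes, is_pbd)
```

The output should show 10 points, three blocks of size 4 and nine of size 3, the same shape as Example 14.5.50.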
14.5.52 Definition Let K and G be sets of positive integers and let λ be a positive integer. A
group divisible design of index λ and order v ((K, λ)-GDD) is a triple (V, G, B), where
V is a finite set of cardinality v, G is a partition of V into parts (groups) whose sizes lie
in G, and B is a family of subsets (blocks) of V that satisfy (1) if B ∈ B then |B| ∈ K,
(2) every pair of distinct elements of V occurs in exactly λ blocks or one group, but not
both, and (3) |G| > 1. If v = a_1 g_1 + a_2 g_2 + · · · + a_s g_s, and if there are a_i groups of size g_i,
i = 1, 2, . . . , s, then the (K, λ)-GDD is of type g_1^{a_1} g_2^{a_2} . . . g_s^{a_s}. This is exponential notation
for the group type. Alternatively, if the GDD has groups G_1, G_2, . . . , G_t, then the list
T = [|G_i| : i = 1, 2, . . . , t] is the type of the GDD when more convenient. If K = {k},
then the (K, λ)-GDD is a (k, λ)-GDD. If λ = 1, the GDD is a K-GDD. Furthermore, a
({k}, 1)-GDD is a k-GDD.
14.5.53 Definition Let H be a subgroup of order h of a group G of order v. A collection {B1 , ..., Bt }
of k-subsets of G forms a (v, h, k, λ) difference family over G and relative to H if ∂B1 ∪
· · · ∪ ∂Bt covers each element of G − H exactly λ times and covers no element in H.
14.5.54 Remark
14.5.55 Theorem [4, 1357] Suppose q ≡ 1 (mod k − 1) is a prime power. Then a (kq, k, k, 1) relative
difference family over F_k × F_q exists if one of the following holds:
1. k ∈ {3, 5} (for k = 5, the initial block B can be taken as {(0, 0), (1, 1), (1, −1),
(4, x), (4, −x)} where x is any nonsquare in F_q such that exactly one of x − 1, x + 1
is a square);
2. k = 7 and q ≠ 19;
3. k = 9 and q ∉ {17, 25, 41, 97, 113};
4. k = 11, q < 1202, q is prime, and q ∉ [30,192], [240,312] or [490,492].
14.5.7 t-designs
14.5.57 Definition A t-(v, k, λ) design, a t-design in short, is a pair (X, B) where X is a v-set
of points and B is a collection of k-subsets of X (blocks) with the property that every
t-subset of X is contained in exactly λ blocks. The parameter λ is the index of the
design.
14.5.58 Example Let q be a prime power and n > 0 be an integer. Then G = PGL(2, q^n) acts
sharply 3-transitively on X = F_{q^n} ∪ {∞}. If S ⊆ X is the natural inclusion of F_q ∪ {∞},
then the orbit of S under G is a 3-(q^n + 1, q + 1, 1) design. These designs are spherical
geometries; when n = 2, they are inversive planes or Möbius planes.
14.5.59 Theorem [1559] Let B be a subgroup of the multiplicative group of nonzero elements of
Fq . Then the orbit of S = B ∪ {0, ∞} under the action of PGL(2, q) is the block set of one
of the designs:
14.5.60 Remark Packings and coverings relax the conditions on block designs, and have been ex-
tensively studied; see [2106] for a more detailed exposition.
14.5.61 Definition Let v ≥ k ≥ t. A t-(v, k, λ) covering is a pair (X, B), where X is a v-set of
elements (points) and B is a collection of k-subsets (blocks) of X, such that every t-subset
of points occurs in at least λ blocks in B. Repeated blocks in B are permitted.
L_λ(v, k, t) = ⌈(v/k) ⌈((v − 1)/(k − 1)) ⋯ ⌈λ(v − t + 1)/(k − t + 1)⌉ ⋯ ⌉⌉.
14.5.63 Definition Let v ≥ k ≥ t. A t-(v, k, λ) packing is a pair (X, B), where X is a v-set of
elements (points) and B is a collection of k-subsets of X (blocks), such that every t-
subset of points occurs in at most λ blocks in B. If λ > 1, then B is allowed to contain
repeated blocks.
14.5.64 Remark A t-(v, k, 1) packing with b blocks is equivalent to a binary code of length v, size b,
constant weight k, and minimum Hamming distance at least 2(k − t + 1); see Section 15.1.
14.5.65 Theorem (First Johnson bound) [2106] D_λ(v, k, t) ≤ ⌊v D_λ(v − 1, k − 1, t − 1)/k⌋. Iterating this
bound yields D_λ(v, k, t) ≤ U_λ(v, k, t), where
U_λ(v, k, t) = ⌊(v/k) ⌊((v − 1)/(k − 1)) ⋯ ⌊λ(v − t + 1)/(k − t + 1)⌋ ⋯ ⌋⌋.
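Both nested expressions, the iterated floor U_λ(v, k, t) and the analogous iterated ceiling L_λ(v, k, t), can be evaluated from the innermost bracket outward with integer arithmetic. The following is a minimal sketch; the function names johnson_upper and schonheim_lower are hypothetical labels introduced here.

```python
def johnson_upper(v, k, t, lam=1):
    """Iterated floor expression U_lam(v, k, t)."""
    x = lam
    # The innermost factor is (v - t + 1)/(k - t + 1); the outermost is v/k.
    for i in range(t - 1, -1, -1):
        x = ((v - i) * x) // (k - i)
    return x

def schonheim_lower(v, k, t, lam=1):
    """Iterated ceiling expression L_lam(v, k, t)."""
    x = lam
    for i in range(t - 1, -1, -1):
        x = -((-(v - i) * x) // (k - i))   # ceiling via negated floor division
    return x

for v in (7, 8, 9):
    print(v, johnson_upper(v, 3, 2), schonheim_lower(v, 3, 2))
```

For v = 7 and v = 9 the two values coincide (7 and 12), as expected when a Steiner triple system exists; for v = 8 they are 8 and 11.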
14.5.66 Theorem (Second Johnson bound) [2106] Suppose d = D_1(v, k, t) = qv + r, where 0 ≤ r ≤
v − 1. Then q(q − 1)v + 2qr ≤ (t − 1)d(d − 1), and hence
D_1(v, k, t) ≤ ⌊v(k + 1 − t)/(k^2 − v(t − 1))⌋.
See Also
References Cited: [4, 5, 261, 262, 356, 459, 460, 461, 706, 708, 807, 1268, 1356, 1357, 1459,
1559, 1741, 2106, 2159, 2222, 2440, 2719, 2986]
14.6.1 Basics
14.6.2 Example
1. The group G itself and G\{g} for an arbitrary g ∈ G are (v, v, v; 0)- and (v, v −
1, v − 2; 1)-difference sets.
2. The set {1, 3, 4, 5, 9} is a cyclic (11, 5, 2; 3)-difference set in the group (Z/11Z, +).
These are the squares modulo 11.
3. The set {1, 2, 4} ⊂ Z/7Z is a cyclic (7, 3, 1; 2)-difference set.
4. The set {(0, 1), (0, 2), (0, 3), (1, 0), (2, 0), (3, 0)} ⊂ Z/4Z × Z/4Z is a (16, 6, 2; 4)
difference set.
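The parameters claimed in items 2 to 4 can be confirmed by counting differences. The sketch below assumes the usual convention, not restated in this excerpt, that a (v, k, λ; n)-difference set is a k-subset D of a group of order v in which every nonzero element arises exactly λ times as a difference x − y with x, y ∈ D, and that n = k − λ. Group elements are written as tuples over Z_{m_1} × ⋯ × Z_{m_r}; the function names are hypothetical.

```python
from itertools import product
from collections import Counter

def difference_counts(D, moduli):
    """Count how often each group element arises as x - y with x != y in D."""
    def sub(x, y):
        return tuple((a - b) % m for a, b, m in zip(x, y, moduli))
    return Counter(sub(x, y) for x in D for y in D if x != y)

def difference_set_lambda(D, moduli):
    """Return lambda if D is a difference set in Z_m1 x ... x Z_mr, else None."""
    counts = difference_counts(D, moduli)
    nonzero = [g for g in product(*(range(m) for m in moduli)) if any(g)]
    values = {counts[g] for g in nonzero}
    return values.pop() if len(values) == 1 else None

examples = [
    ([(1,), (3,), (4,), (5,), (9,)], (11,)),
    ([(1,), (2,), (4,)], (7,)),
    ([(0, 1), (0, 2), (0, 3), (1, 0), (2, 0), (3, 0)], (4, 4)),
]
for D, moduli in examples:
    v, k = 1, len(D)
    for m in moduli:
        v *= m
    lam = difference_set_lambda(D, moduli)
    print((v, k, lam, k - lam))   # (v, k, lambda, n)
```

Each printed tuple also satisfies the counting identity λ(v − 1) = k(k − 1).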
λ · (v − 1) = k · (k − 1).
14.6.7 Remark Lemma 14.6.6 can be proved by counting differences. It also follows from Theorem
14.6.9 which shows that difference sets are the same objects as symmetric designs with a
sharply transitive (regular) automorphism group. We refer the reader to Section 14.5 for the
definition of symmetric designs and to [261] for a proof of the important Theorem 14.6.9.
14.6.8 Definition The development of a difference set D is the incidence structure dev(D) whose
points are the elements of G and whose blocks are the translates g+D := {g+d : d ∈ D}.
14.6.9 Theorem [262] The existence of a (v, k, λ; n)-difference set is equivalent to the existence of
a symmetric (v, k, λ)-design D admitting G as a point regular automorphism group; i.e., for
any two points P and Q, there is a unique group element g which maps P to Q. The design
D is isomorphic with dev (D).
14.6.10 Remark Necessary conditions on the parameters v, k and λ of a symmetric design are also
necessary conditions for the parameters of a difference set. In particular, the following two
theorems hold. We emphasize that these are necessary conditions for symmetric designs,
even if the designs are not constructed from difference sets.
14.6.11 Theorem [2566] If D is a (v, k, λ; n) difference set with v even, then n = u^2 is a square.
14.6.12 Theorem [429, 631] If D is a (v, k, λ; n) difference set with v odd, then the equation
n x^2 + (−1)^{(v−1)/2} λ y^2 = z^2 must have an integral solution (x, y, z) ≠ (0, 0, 0).
14.6.13 Example Not all symmetric designs can be constructed from difference sets. There are,
for instance, no difference sets with parameters (25, 9, 3; 6) or (31, 10, 3; 7), but symmetric
designs with these parameters exist, see the tables in [262].
14.6.14 Remark The main problem about difference sets is to give necessary and sufficient con-
ditions for their existence. These conditions sometimes depend only on the parameters
(v, k, λ; n), sometimes also on the structure of the ambient group G. Another problem is to
classify all (v, k, λ; n)-difference sets up to equivalence or isomorphism. In many nonexis-
tence theorems, the exponent of a group plays an important role.
14.6.15 Definition The exponent exp(G) of a (multiplicatively written) finite group G is the
smallest positive integer v* such that g^{v*} = 1 for all g ∈ G.
14.6.16 Remark There are many necessary conditions that the parameters of a difference set have
to satisfy. An important condition is Theorem 14.6.62. Many more restrictions are in [2546].
14.6.17 Definition Two difference sets D_1 (in G_1) and D_2 (in G_2) are equivalent if there is a group
isomorphism ϕ between G_1 and G_2 such that D_1^ϕ = {d^ϕ : d ∈ D_1} = g + D_2 for a
suitable g ∈ G_2. The difference sets are isomorphic if the designs dev(D_1) and dev(D_2)
are isomorphic.
14.6.18 Remark Equivalent difference sets yield isomorphic designs, but a design may give rise to
several inequivalent difference sets, as the following example shows.
14.6.19 Example The following three difference sets in Z/4Z × Z/4Z with parameters (16, 6, 2; 4)
are pairwise inequivalent, but the designs are all isomorphic:
D1 = {(0, 0), (1, 0), (2, 0), (0, 1), (1, 2), (2, 3)},
D2 = {(0, 0), (1, 0), (2, 0), (0, 1), (0, 3), (3, 2)},
D3 = {(0, 0), (1, 0), (0, 1), (2, 1), (1, 2), (2, 3)}.
14.6.20 Definition Some types of (v, k, λ; n)-difference sets have special names. A difference
set with λ = 1 is planar. The parameters can be written in terms of the order n as
(n^2 + n + 1, n + 1, 1; n). The corresponding design is a projective plane. Difference sets
with v = 4n are Hadamard difference sets, in which case n = u^2 must be a square, and
the parameters are (4u^2, 2u^2 − u, u^2 − u; u^2). Difference sets with v = 4n − 1, hence with
parameters (4n − 1, 2n − 1, n − 1; n), are of Paley type. Both Hadamard and Paley type
difference sets are closely related to Hadamard matrices; see Constructions 14.6.44 and
14.6.52. The parameters ((q^n − 1)/(q − 1), (q^{n−1} − 1)/(q − 1), (q^{n−2} − 1)/(q − 1); q^{n−2})
are the Singer parameters; see Theorem 14.6.22.
14.6.21 Remark The parameters of a symmetric design, hence also the parameters of a (v, k, λ; n)-
difference set satisfy 4n − 1 ≤ v ≤ n^2 + n + 1. The extremal cases are Paley type difference
sets (v = 4n − 1) and planar difference sets.
14.6.22 Theorem [2674] Let α be a generator of the multiplicative group of F_{q^n}, where q is a
prime power. Then the set of integers {i : 0 ≤ i < (q^n − 1)/(q − 1), Tr(α^i) = 0} modulo
(q^n − 1)/(q − 1) forms a (cyclic) difference set with Singer parameters.
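For q = 2 the construction can be run with a few lines of bit arithmetic. The sketch below is an illustration under explicit assumptions: elements of F_{2^n} are n-bit integers, α is the class of x modulo a polynomial that is assumed primitive (x^3 + x + 1, x^4 + x + 1 and x^5 + x^2 + 1 are used), and the trace is the absolute trace Tr(y) = y + y^2 + ⋯ + y^{2^{n−1}}. The function names are hypothetical.

```python
from collections import Counter

def gf2n_mul(a, b, poly, n):
    """Multiply in F_2[x]/(poly); elements are n-bit integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> n:          # reduce as soon as the degree reaches n
            a ^= poly
    return result

def singer_set(n, poly):
    """{i : 0 <= i < 2^n - 1, Tr(alpha^i) = 0} for alpha = x in F_2[x]/(poly)."""
    v = 2 ** n - 1
    powers, x = [], 1
    for _ in range(v):
        powers.append(x)
        x = gf2n_mul(x, 2, poly, n)      # multiply by alpha = x
    assert len(set(powers)) == v, "poly must be primitive"

    def trace(y):
        total, z = 0, y
        for _ in range(n):
            total ^= z
            z = gf2n_mul(z, z, poly, n)  # Frobenius: z -> z^2
        return total

    return [i for i, y in enumerate(powers) if trace(y) == 0]

def is_cyclic_difference_set(D, v, lam):
    counts = Counter((x - y) % v for x in D for y in D if x != y)
    return all(counts[g] == lam for g in range(1, v))

for n, poly in ((3, 0b1011), (4, 0b10011), (5, 0b100101)):
    D = singer_set(n, poly)
    v, lam = 2 ** n - 1, 2 ** (n - 2) - 1
    print(v, len(D), lam, is_cyclic_difference_set(D, v, lam))
```

With n = 3 the set obtained is {1, 2, 4}, the cyclic (7, 3, 1; 2)-difference set of Example 14.6.2.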
14.6.23 Remark
10001212201112222020211201021002212022002000 . . .
14.6.26 Remark In general, there are many difference sets with Singer parameters inequivalent to
Singer difference sets (Theorems 14.6.27 and 14.6.29). There are even non-abelian difference
sets with Singer parameters; see Example 14.6.70.
14.6.27 Theorem [1324] Let D be an arbitrary cyclic difference set with parameters
((q^s − 1)/(q − 1), (q^{s−1} − 1)/(q − 1), (q^{s−2} − 1)/(q − 1); q^{s−2}) in a group G. Let G be
embedded into F*_{q^n}/F*_q, which is possible if s | n. Let α be a primitive element in F_{q^n}.
Then the set of integers {0 ≤ i < (q^n − 1)/(q − 1) : Tr_{F_{q^n}/F_{q^s}}(α^i) ∈ D} is a difference
set with classical Singer parameters. The difference sets
are Gordon-Mills-Welch difference sets corresponding to D. Note that different embeddings
of the same difference set D may result in inequivalent difference sets [211].
14.6.28 Remark If D is a Singer difference set, the above construction may be reformulated as
follows: if s divides n and if r is relatively prime to q^s − 1, then the set of integers {0 ≤ i <
(q^n − 1)/(q − 1) : Tr_{F_{q^s}/F_q}[(Tr_{F_{q^n}/F_{q^s}}(α^i))^r] = 0} is a Gordon-Mills-Welch difference set.
14.6.29 Theorem [862, 864] Let α be a generator of the multiplicative group of F_{2^n}, and let t < n/2
be an integer relatively prime to n, and k = 4^t − 2^t + 1. Then the set D = {(x + 1)^k + x^k + 1 :
x ∈ F_{2^n}, x ≠ 0, 1} ⊆ F*_{2^n} is a Dillon-Dobbertin difference set with parameters
(2^n − 1, 2^{n−1} − 1, 2^{n−2} − 1; 2^{n−2}). If n = 3t ± 1, then the set
D = F*_{2^n} \ {(x + 1)^k + x^k : x ∈ F_{2^n}} ⊆ F*_{2^n} is
a difference set with parameters (2^n − 1, 2^{n−1} − 1, 2^{n−2} − 1; 2^{n−2}).
14.6.30 Remark The series of Gordon-Mills-Welch difference sets [1324] and Dillon-Dobbertin dif-
ference sets [862, 864] show that the number of inequivalent difference sets grows rapidly. In
these two series, inequivalent difference sets are in general also non-isomorphic; see [1676]
for the Gordon-Mills-Welch case and [905] for the Dillon-Dobbertin case.
14.6.31 Construction [1303] Cyclic difference sets can be used to construct binary sequences with
2-level autocorrelation function. Let α be a generator of the cyclic group Z/vZ, and let D
be a (v, k, λ; n) difference set in Z/vZ. Define a sequence (a_i) by a_i = 1 if α^i ∈ D, otherwise
a_i = 0. This sequence has period v, and C_t(a) := Σ_{i=0}^{v−1} (−1)^{a_i + a_{i+t}} = v − 4(k − λ) for
t = 1, . . . , v − 1, which are the off-phase autocorrelation coefficients; see also Section 10.3.
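The two-level property can be observed directly on a small case. The sketch below is an illustration, assuming the group Z/vZ is written additively with generator 1, so that a_i = 1 exactly when i ∈ D, and summing over one full period of length v. The function name autocorrelations is hypothetical.

```python
def autocorrelations(D, v):
    """Off-phase periodic autocorrelations C_t, t = 1..v-1, of the indicator of D."""
    a = [1 if i in D else 0 for i in range(v)]
    return [sum((-1) ** (a[i] + a[(i + t) % v]) for i in range(v))
            for t in range(1, v)]

# The squares modulo 11 form a (11, 5, 2; 3)-difference set.
D, v, k, lam = {1, 3, 4, 5, 9}, 11, 5, 2
values = autocorrelations(D, v)
print(values, v - 4 * (k - lam))
```

The count behind the formula: for each shift t there are λ positions where both entries are 1, v − 2k + λ where both are 0, and 2(k − λ) where they differ, giving v − 4(k − λ).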
14.6.32 Remark Cyclic Paley type difference sets yield sequences with constant off-phase autocor-
relation −1. These sequences (difference sets) have numerous applications since the autocor-
relation is small (in absolute value) [1303]. It is conjectured that no sequences with constant
off-phase autocorrelation 0 exist if v > 4; see Remark 14.6.45.
14.6.33 Conjecture (Ryser's Conjecture) [1842, 1909] If gcd(v, n) ≠ 1, then there is no
(v, k, λ; n) difference set in a cyclic group. A strengthening of this conjecture is due to
Lander: if D is a (v, k, λ; n)-difference set in an abelian group G of order v, and p is a prime
dividing v and n, then the Sylow p-subgroup of G is not cyclic.
14.6.34 Theorem [1907] Lander's conjecture is true for all abelian difference sets of order n = p^k,
where p > 3 is prime.
14.6.35 Remark
1. The smallest open case for Lander’s conjecture is a cyclic (465, 145, 45; 100) dif-
ference set.
2. More restrictions on putative counterexamples to Lander’s conjecture are con-
tained in [1909].
14.6.36 Theorem [116] Let R = {a ∈ F*_{3^m} : a = x + x^6 has 4 solutions with x ∈ F_{3^m}} with
m > 1. Then the set ρ(R) is a difference set with Singer parameters
((3^m − 1)/2, (3^{m−1} − 1)/2, (3^{m−2} − 1)/2; 3^{m−2}), where ρ is the canonical epimorphism
F*_{3^m} → F*_{3^m}/F*_3.
14.6.37 Theorem [1477] Let q = 3^e, e ≥ 1, m = 3k, d = q^{2k} − q^k + 1. If R = {x ∈ F_{q^m} :
Tr_{F_{q^m}/F_q}(x + x^d) = 1}, then ρ(R) is a difference set with parameters
((q^m − 1)/(q − 1), q^{m−1}, q^{m−1} − q^{m−2}; q^{m−2}), where ρ is the canonical epimorphism
F*_{q^m} → F*_{q^m}/F*_q.
14.6.38 Theorem [262] The following subsets of F_q are difference sets in the additive group of
F_q. They are cyclotomic difference sets. Some of these difference sets may have Singer
parameters.
1. F_q^{(2)} := {x^2 : x ∈ F_q \ {0}}, q ≡ 3 (mod 4) (quadratic residues, Paley difference
sets);
2. F_q^{(4)} := {x^4 : x ∈ F_q \ {0}}, q = 4t^2 + 1, t odd;
3. F_q^{(4)} ∪ {0}, q = 4t^2 + 9, t odd;
4. F_q^{(8)} = {x^8 : x ∈ F_q \ {0}}, q = 8t^2 + 1 = 64u^2 + 9, t, u odd;
5. F_q^{(8)} ∪ {0}, q = 8t^2 + 49 = 64u^2 + 441, t odd, u even;
6. H(q) = {x^i : x ∈ F_q \ {0}, i ≡ 0, 1 or 3 (mod 6)}, q = 4t^2 + 27, q ≡ 1 (mod 6)
(Hall difference sets).
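Small prime instances of items 1 to 4 can be checked by counting differences. The sketch below is an illustration restricted to prime q, so that F_q is Z_q; the chosen values (q = 7, 11 for squares, q = 37 = 4·3^2 + 1 for fourth powers, q = 13 = 4·1^2 + 9 for fourth powers with 0, q = 73 = 8·3^2 + 1 = 64·1^2 + 9 for eighth powers) are examples picked here, and the function names are hypothetical.

```python
from collections import Counter

def power_residues(q, e):
    """Nonzero e-th powers in Z_q (q prime)."""
    return {pow(x, e, q) for x in range(1, q)}

def difference_set_lambda(D, q):
    """Return lambda if D is a difference set in (Z_q, +), else None."""
    counts = Counter((x - y) % q for x in D for y in D if x != y)
    values = {counts[g] for g in range(1, q)}
    return values.pop() if len(values) == 1 else None

cases = [
    ("squares", 7, power_residues(7, 2)),
    ("squares", 11, power_residues(11, 2)),
    ("fourth powers", 37, power_residues(37, 4)),
    ("fourth powers and 0", 13, power_residues(13, 4) | {0}),
    ("eighth powers", 73, power_residues(73, 8)),
]
for name, q, D in cases:
    # Prints the kind of set, v = q, k = |D|, and lambda (None if not a difference set).
    print(name, q, len(D), difference_set_lambda(D, q))
```

A value of None in the last column would indicate that the subset is not a difference set for that q, which is what happens when the arithmetic condition on q in the theorem fails.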
14.6.39 Remark The proofs of the statements in Theorem 14.6.38 use cyclotomic numbers [2725].
14.6.40 Theorem Let q and q + 2 be prime powers. Then the set D = {(x, y) : x, y are both nonzero
squares or both non-squares or y = 0} is a twin prime power difference set with parameters
(q^2 + 2q, (q^2 + 2q − 1)/2, (q^2 + 2q − 3)/4; (q^2 + 2q + 1)/4).
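The smallest cases of this theorem can be verified by direct counting. The sketch below is an illustration under assumptions stated here rather than in the text: q and q + 2 are both primes, the ambient group is taken to be the additive group Z_q × Z_{q+2}, and "non-squares" is read as nonzero non-squares. The function names are hypothetical.

```python
from collections import Counter

def twin_prime_set(q):
    """The set D of the theorem inside Z_q x Z_(q+2), for twin primes q, q + 2."""
    r = q + 2
    squares_q = {x * x % q for x in range(1, q)}
    squares_r = {y * y % r for y in range(1, r)}
    D = set()
    for x in range(q):
        for y in range(r):
            if y == 0:
                D.add((x, y))
            elif x != 0 and ((x in squares_q) == (y in squares_r)):
                D.add((x, y))          # both squares or both non-squares
    return D

def difference_set_lambda(D, q):
    """Return lambda if D is a difference set in Z_q x Z_(q+2), else None."""
    r = q + 2
    counts = Counter(((a - c) % q, (b - d) % r)
                     for (a, b) in D for (c, d) in D if (a, b) != (c, d))
    values = {counts[(x, y)] for x in range(q) for y in range(r) if (x, y) != (0, 0)}
    return values.pop() if len(values) == 1 else None

for q in (3, 5, 11):
    D = twin_prime_set(q)
    v = q * (q + 2)
    # Prints v, k, lambda; the theorem predicts k = (v - 1)/2 and lambda = (v - 3)/4.
    print(v, len(D), difference_set_lambda(D, q))
```

For q = 3 this gives a (15, 7, 3; 4)-difference set, which has Paley type parameters since v = 4n − 1.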
14.6.41 Definition A difference set D in the group G is skew symmetric if D is of Paley type and
{0, d, −d : d ∈ D} = G, hence D ∩ −D = ∅.
14.6.42 Theorem The following sets are skew symmetric difference sets in the additive group of F_q,
q ≡ 3 (mod 4):
1. {x^2 : x ∈ F_q, x ≠ 0} (Paley difference sets);
2. {x^{10} ± x^6 − x^2 : x ∈ F_q, x ≠ 0} where q = 3^h, h odd [874];
3. {x^{4a+6} ± x^{2a} − x^2 : x ∈ F_q, x ≠ 0} where q = 3^h, h odd, a = 3^{(h+1)/2} [871].
14.6.43 Remark A large class of skew Hadamard difference sets in elementary abelian groups of
order $q^3$ ($q$ a prime power) has recently been constructed [2210].
14.6.44 Construction A Hadamard difference set $D$ in a group $G$ of order $4u^2$ (see Definition
14.6.20) gives rise to a Hadamard matrix (Section 14.5) as follows: Label the rows and
columns of a matrix $H = (h_{x,y})$ by the elements of $G$, and put $h_{x,y} = 1$ if $x - y \in D$,
otherwise $h_{x,y} = -1$. This matrix is a Hadamard matrix, see Section 14.5.
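Construction 14.6.44 can be sketched in a few lines. The example group is (Z_2)^4 and the set D is the solution set of x1·x2 + x3·x4 = 1 over F_2; this particular D is an assumption chosen for illustration (it is a standard (16, 6, 2; 4) Hadamard difference set), not a set quoted from the text.

```python
from itertools import product

def hadamard_from_difference_set(elements, D, subtract):
    """Matrix with entry +1 where x - y lies in D and -1 otherwise."""
    return [[1 if subtract(x, y) in D else -1 for y in elements] for x in elements]

def is_hadamard(H):
    """Check H * H^T = n * I for a square +-1 matrix H of order n."""
    n = len(H)
    return all(
        sum(H[i][k] * H[j][k] for k in range(n)) == (n if i == j else 0)
        for i in range(n) for j in range(n)
    )

def subtract_mod2(x, y):
    """Componentwise subtraction in (Z_2)^4."""
    return tuple((a - b) % 2 for a, b in zip(x, y))

# Illustrative choice of group and difference set (an assumption, see lead-in).
G = list(product((0, 1), repeat=4))
D = {x for x in G if (x[0] * x[1] + x[2] * x[3]) % 2 == 1}
H = hadamard_from_difference_set(G, D, subtract_mod2)
```

Two distinct rows agree in v − 2(k − λ) positions, so their inner product is v − 4(k − λ) = 16 − 16 = 0, which is what `is_hadamard` verifies.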
14.6.45 Remark A special case of Ryser’s Conjecture 14.6.33 is that there are no cyclic Hadamard
difference sets with $v > 4$ (if $v = 4$, there is a trivial cyclic (4, 1, 0; 1)-difference set). This is
also known as the circulant Hadamard matrix conjecture. The smallest open case for which
one cannot prove the nonexistence of a cyclic Hadamard difference set with $v = 4u^2$ so far
is u = 11715 = 3 · 5 · 11 · 71; see [1910] and also [2168] for the connection to the Barker
sequence conjecture (Section 10.3).
14.6.46 Remark [262] Hadamard difference sets in elementary abelian groups are equivalent to bent
functions (Section 9.3). The bent function is the characteristic function of the Hadamard
difference set. The following theorem gives an explicit construction.
14.6.47 Theorem The set
is a Hadamard difference set with parameters $(2^{2m},\ 2^{2m-1} - 2^{m-1},\ 2^{2m-2} - 2^{m-1};\ 2^{2m-2})$.
14.6.48 Remark There are several other constructions of difference sets with these parameters, also
in other groups. Two major construction methods are the Maiorana-McFarland method
(Sections 9.1 and 9.3) and the use of partial spreads (Section 9.3).
14.6.49 Theorem [781, 1803, 2830] An abelian Hadamard difference set in a group $G$ of order $2^{2m+2}$
exists if and only if $\exp(G) \le 2^{m+2}$.
14.6.50 Remark In the following theorem, we combine knowledge about the existence of abelian
Hadamard difference sets. Many authors contributed to this theorem.
14.6.54 Problem It is an open question whether Paley type difference sets exist for other values.
14.6.55 Theorem [2051] Let $q$ be a prime power and $d$ a positive integer. Let $G$ be a group of
order $v = q^{d+1}(q^d + \cdots + q^2 + q + 2)$ which contains an elementary abelian subgroup $E$ of
order $q^{d+1}$ in its center. View $E$ as the additive group of $\mathbb{F}_q^{d+1}$. Put $r = (q^{d+1} - 1)/(q - 1)$
and let $H_1, \ldots, H_r$ be the hyperplanes of order $q^d$ of $E$. If $g_0, \ldots, g_r$ are distinct coset
representatives of $E$ in $G$, then $D = (g_1 + H_1) \cup (g_2 + H_2) \cup \cdots \cup (g_r + H_r)$ is a McFarland
difference set with parameters
$$\left(q^{d+1}\left(1 + \frac{q^{d+1} - 1}{q - 1}\right),\ q^d \cdot \frac{q^{d+1} - 1}{q - 1},\ q^d \cdot \frac{q^d - 1}{q - 1};\ q^{2d}\right).$$
14.6.56 Remark
14.6.57 Theorem [609, 782] Let $q$ be a prime power, and let $t$ be any positive integer. Difference
sets with parameters
$$\left(4q^{2t}\,\frac{q^{2t} - 1}{q^2 - 1},\ q^{2t-1}\,\frac{2q^{2t} + q - 1}{q + 1},\ q^{2t-1}(q - 1)\,\frac{q^{2t-1} + 1}{q + 1};\ q^{4t-2}\right)$$
14.6.58 Remark The existence of difference sets is closely related to character sums. Most necessary
conditions on the existence of difference sets are derived from character sums and number
theoretic conditions.
14.6.59 Theorem [262, 2830] Let $D$ be a $(v, k, \lambda; n)$ difference set in $G$, and let $\chi$ be a homomorphism
from $G$ into the multiplicative group of a field. If $\chi(g) = 1$ for all $g \in G$ (in which case the
homomorphism is denoted $\chi_0$), then $\sum_{d \in D} \chi_0(d) = k$. If $\chi \neq \chi_0$, then
$$\left(\sum_{d \in D} \chi(d)\right) \cdot \left(\sum_{d \in D} \chi(d^{-1})\right) = n.$$
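The character sum identity of Theorem 14.6.59 is easy to test numerically for complex characters of a cyclic group. The sketch below uses the (7, 3, 1; 2) difference set {1, 2, 4} in Z_7 as an illustrative example and the characters χ_a(g) = exp(2πi·a·g/v); the function name is hypothetical.

```python
import cmath

def character_sum(D, v, a, sign=1):
    """Sum of chi_a(d) over d in D, where chi_a(g) = exp(sign * 2*pi*i * a * g / v)."""
    return sum(cmath.exp(sign * 2j * cmath.pi * a * d / v) for d in D)

# Illustrative example: quadratic residues modulo 7.
V, D = 7, {1, 2, 4}
products = [
    character_sum(D, V, a) * character_sum(D, V, a, sign=-1)
    for a in range(1, V)
]
```

Passing `sign=-1` evaluates χ on the inverses d^{-1} = −d, so each entry of `products` should be close to n = k − λ = 2, while the trivial character (a = 0) gives k = 3.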
14.6.60 Remark Theorem 14.6.59 is very useful if $\chi$ is complex-valued. In this case, the sum
$\chi(D) := \sum_{d \in D} \chi(d)$ is an element in the ring $\mathbb{Z}[\zeta_{v^*}]$, where $\zeta_{v^*} = e^{2\pi i / v^*}$ is a primitive
$v^*$-th root of unity, and $v^*$ is the exponent of $G$.
14.6.61 Remark If $\chi$ is complex-valued, Theorem 14.6.59 may also be viewed as an equation about
the ideal generated by $\chi(D)$: For $\chi \neq \chi_0$, we have $(\chi(D))(\overline{\chi(D)}) = (n)$, where $(\cdot)$ denotes
an ideal generated in $\mathbb{Z}[\zeta_{v^*}]$ and $\overline{\phantom{x}}$ is complex conjugation. Using results from algebraic
number theory, many necessary conditions can be obtained, for instance Theorem 14.6.62.
14.6.62 Theorem [2830] Let $D$ be an abelian $(v, k, \lambda; n)$ difference set in $G$, and let $w$ be a divisor
of $v$. If $p$ is prime, $p \mid n$ and $p^j \equiv -1 \pmod w$, then an integer $i$ exists such that $p^{2i} \mid n$, but
$p^{2i+1}$ is not a divisor of $n$. If $w$ is the exponent $v^*$ of $G$, then $p$ does not divide $n$.
14.6.63 Example [262, 1842] There is no (40, 13, 4; 9)-difference set in Z/2Z × Z/2Z × Z/2Z × Z/5Z
(use v ∗ = 10 and p = 3 in Theorem 14.6.62). Using a different (though similar) theorem,
one can also rule out the existence of a (40, 13, 4; 9)-difference set in Z/2Z × Z/4Z × Z/5Z.
Note that a cyclic difference set with these parameters exists (Construction 14.6.25).
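The arithmetic behind Example 14.6.63 can be sketched as follows. The function names are hypothetical, and `excluded_by_theorem` encodes only the two conclusions of Theorem 14.6.62 as stated above, nothing more.

```python
def minus_one_exponent(p, w):
    """Smallest j >= 1 with p**j congruent to -1 modulo w, or None if there is none."""
    target = (w - 1) % w
    x = p % w
    for j in range(1, w + 1):
        if x == target:
            return j
        x = x * p % w
    return None

def valuation(n, p):
    """Exponent of the prime p in the factorization of n."""
    e = 0
    while n % p == 0:
        n //= p
        e += 1
    return e

def excluded_by_theorem(n, p, w, exponent):
    """True if Theorem 14.6.62 rules out a difference set of order n for this p and w."""
    if n % p != 0 or minus_one_exponent(p, w) is None:
        return False  # hypotheses not met, the theorem says nothing
    e = valuation(n, p)
    return e % 2 == 1 or (w == exponent and e > 0)
```

With n = 9, p = 3 and w = 10 we have 3^2 ≡ −1 (mod 10). In Z/2Z × Z/2Z × Z/2Z × Z/5Z the exponent is 10, so the second conclusion applies and the set is excluded; in the cyclic group of order 40 the exponent is 40 and this test excludes nothing.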
14.6.7 Multipliers
14.6.65 Theorem If ϕ is a multiplier of the difference set D, then there is at least one translate
g + D of D which is fixed by ϕ. If D is abelian and gcd(v, k) = 1, then there is a translate
fixed by all multipliers [261].
14.6.66 Remark
See Also
§9.2 Relative difference sets are a generalization of difference sets. An important class
of relative difference sets can be described by planar functions (PN functions).
§9.3 Bent functions are equivalent to elementary abelian Hadamard difference sets.
§10.3 Cyclic difference sets are binary sequences with two-level autocorrelation function.
§14.5 Difference sets are an important tool to construct combinatorial designs.
References Cited: [116, 211, 261, 262, 429, 609, 631, 706, 781, 782, 862, 864, 871, 874, 905,
1303, 1324, 1326, 1477, 1676, 1803, 1842, 1907, 1909, 1910, 2051, 2168, 2210, 2546, 2566,
2674, 2725, 2830]
14.7.1 Definition Let d denote a positive integer, and let X be a nonempty finite set. A d-class
symmetric association scheme on X is a sequence R0 , R1 , . . . , Rd of nonempty subsets
of the Cartesian product X × X, satisfying
1. R0 = {(x, x) | x ∈ X},
2. $X \times X = R_0 \cup R_1 \cup \cdots \cup R_d$ and $R_i \cap R_j = \emptyset$ for $i \neq j$,
3. for all $i \in \{0, 1, \ldots, d\}$, $R_i^T = R_i$ where $R_i^T := \{(y, x) \mid (x, y) \in R_i\}$,
4. for all integers $h, i, j \in \{0, 1, \ldots, d\}$, and for all $x, y \in X$ such that $(x, y) \in R_h$,
the number $p_{ij}^h := |\{z \in X \mid (x, z) \in R_i,\ (z, y) \in R_j\}|$ depends only on $h, i, j$,
and not on $x$ or $y$.
14.7.2 Example The Hamming scheme $H(n, q)$ has the set $F^n$ of all words of length $n$ over an
alphabet $F$ of $q$ symbols as its vertex set. Two words are $i$-th associates if and only if the
Hamming distance between them is $i$. Generally the alphabet $F$ is $\mathbb{F}_2$, but other finite fields
are also used.
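Condition 4 of Definition 14.7.1 can be verified exhaustively for small Hamming schemes. The following sketch tabulates the intersection numbers of H(n, q) and returns None if some count depends on the chosen pair (x, y); the names are illustrative and the alphabet is taken to be {0, ..., q − 1}.

```python
from itertools import product

def hamming_distance(x, y):
    """Number of coordinates in which the words x and y differ."""
    return sum(a != b for a, b in zip(x, y))

def intersection_numbers(n, q):
    """Map (h, i, j) -> p^h_ij for H(n, q), or None if the counts are not well defined."""
    X = list(product(range(q), repeat=n))
    table = {}
    for x in X:
        for y in X:
            h = hamming_distance(x, y)
            for i in range(n + 1):
                for j in range(n + 1):
                    count = sum(
                        1 for z in X
                        if hamming_distance(x, z) == i and hamming_distance(z, y) == j
                    )
                    # setdefault stores the first count seen for (h, i, j)
                    if table.setdefault((h, i, j), count) != count:
                        return None
    return table
```

In H(3, 2), for instance, p^0_{11} is the number of words at distance 1 from a given word, and p^2_{11} counts the common neighbours of two words at distance 2.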
14.7.3 Example The cyclotomic schemes are obtained as follows. Let $q$ be a prime power and $k$ a
divisor of $q - 1$. Let $C_1$ be the subgroup of index $k$ of the multiplicative group of $\mathbb{F}_q$, and
let $C_i$, $i = 1, 2, \ldots, k$, be the cosets of $C_1$ (the cyclotomic classes). The points of the scheme
are the elements of $\mathbb{F}_q$, and two points $x, y$ are $i$-th associates if $x - y \in C_i$ (zero associates
if $x - y = 0$). In order for this to be an association scheme one must have $-1 \in C_1$ or,
equivalently, $2k$ must divide $q - 1$ if $q$ is odd.
14.7.4 Definition A Costas array of order n is an n × n array of dots and blanks that satisfies:
1. There are n dots and n(n − 1) blanks, with exactly one dot in each row and
column.
2. All the segments between pairs of dots differ in length or in slope.
C(n) denotes the number of distinct n × n Costas arrays.
14.7.5 Construction (Welch construction) Let p be prime and α be a primitive element in the
field $\mathbb{F}_p$. Let $n = p - 1$. A Costas array of order $n$ is obtained by placing a dot at $(i, j)$ if
and only if $i = \alpha^j$, for $a \le j < n + a$, $a$ a nonnegative integer, and $i = 1, \ldots, n$.
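The Welch construction and the Costas property of Definition 14.7.4 can be sketched directly. Dots are stored as (row, column) pairs; the helper names are hypothetical, and the primitive element is supplied by the caller and checked with `is_primitive`.

```python
def is_primitive(alpha, p):
    """True if alpha generates the multiplicative group modulo the prime p."""
    return len({pow(alpha, k, p) for k in range(1, p)}) == p - 1

def welch_costas(p, alpha, a=0):
    """Dots (i, j) with i = alpha**j mod p for a <= j < a + p - 1."""
    return [(pow(alpha, j, p), j) for j in range(a, a + p - 1)]

def is_costas(dots):
    """One dot per row and column, and all displacement vectors between dots distinct."""
    dots = sorted(dots, key=lambda d: d[1])
    rows = {i for i, _ in dots}
    cols = {j for _, j in dots}
    if not (len(rows) == len(cols) == len(dots)):
        return False
    seen = set()
    for k, (i1, j1) in enumerate(dots):
        for i2, j2 in dots[k + 1:]:
            vector = (i2 - i1, j2 - j1)
            if vector in seen:
                return False
            seen.add(vector)
    return True
```

Distinct displacement vectors are the same as segments that differ in length or in slope, since a vector determines both.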
14.7.6 Construction [1301] Let α and β be primitive elements in the field Fq for q a prime power.
Let n = q − 2. Costas arrays of order n are obtained by
14.7.7 Remark Using Constructions 14.7.5 and 14.7.6, C(p − 1) > 1 and C(q − 2) > 1. Also, if a
corner dot is present in a Costas array of order n, it can be removed along with its row and
column to obtain a Costas array of order n − 1.
14.7.8 Theorem [1304] If q > 2 is a prime power, then there exist primitive elements α and β in
Fq such that α + β = 1.
14.7.9 Corollary Removing the corner dot at (1, 1) in the Costas array of order $q - 2$ from
Construction 14.7.6 Part 2 yields $C(q - 3) \ge 1$.
14.7.10 Example If there exist primitive elements $\alpha$ and $\beta$ satisfying the conditions stated, then a
Costas array of order $n$ can be obtained by removing one or more corner dots.

    Conditions                                                   n
    $\alpha^1 = 2$                                               $q - 3$
    $\alpha^1 + \beta^1 = 1$ and $\alpha^2 + \beta^2 = 1$        $q - 4$
    $\alpha^2 + \alpha^1 = 1$                                    $q - 4$
    $\alpha^1 + \beta^1 = 1$ and $\alpha^2 + \beta^{-1} = 1$     $q - 4$
    and necessarily $\alpha^{-1} + \beta^2 = 1$                  $q - 5$
14.7.11 Remark All Costas arrays of order 28 are accounted for by the Golomb and Welch
construction methods [918], making 28 the first order (larger than 5) for which no sporadic
Costas array exists.
14.7.12 Remark For $n \ge 30$, $n \notin \{31, 53\}$, the only orders for which Costas arrays are known are
orders n = p − 1 or n = q − 2 or orders for which some algebraic condition exists that
guarantees corner dots whose removal leaves a smaller Costas array.
14.7.13 Remark The properties of a Costas array make it an ideal discrete waveform for Doppler
sonar. Having one dot in each row and column minimizes reverberation. Distinct segments
between pairs of dots give it a thumbtack ambiguity function because, shifted left-right in
time and up-down in frequency, copies of the pattern can only agree with the original in
one dot, no dots, or all n dots at once. Thus, the spike of the thumbtack makes a sharp
distinction between the actual shift and all the near misses. See [916] for a survey on Costas
arrays and http://www.costasarrays.org/ for up-to-date information on Costas arrays.
14.7.14 Definition A conference matrix of order $n$ is an $n \times n$ $(0, \pm 1)$-matrix $C$ with zero diagonal
satisfying $CC^T = (n - 1)I$. A conference matrix is normalized if all entries in its first
row and first column are 1 (except the (1,1) entry which is 0). A square matrix $A$
is symmetric if $A = A^T$ and skew-symmetric if $A = -A^T$. The core of a normalized
conference matrix $C$ consists of all the rows and columns of $C$ except the first row and
column.
14.7.15 Theorem [2851, page 360] If there exists a conference matrix of order n, then n is even;
furthermore, if n ≡ 2 (mod 4), then, for any prime p ≡ 3 (mod 4), the highest power of p
dividing n − 1 is even.
14.7.16 Theorem [2345] Let q be an odd prime power.
1. If q ≡ 1 (mod 4), then there is a symmetric conference matrix of order q + 1.
2. If q ≡ 3 (mod 4), then there is a skew-symmetric conference matrix of order q+1.
14.7.17 Construction In the construction for Theorem 14.7.16, let $q$ be an odd prime power and
let $\chi$ denote the quadratic character on the finite field $\mathbb{F}_q$ (i.e., $\chi(x) = 0$ if $x = 0$, $\chi(x) = 1$
if $x$ is a square and $\chi(x) = -1$ if $x$ is a nonsquare). Number the elements of $\mathbb{F}_q$:
$0 = a_0, a_1, \ldots, a_{q-1}$ and define a $q \times q$ matrix $Q$ by $q_{i,j} := \chi(a_i - a_j)$ for $0 \le i, j \le q - 1$. It
follows that $Q$ is symmetric if $q \equiv 1 \pmod 4$ and skew-symmetric if $q \equiv 3 \pmod 4$. Define
the $(q + 1) \times (q + 1)$ matrix $C$ by
$$C = \begin{pmatrix} 0 & 1 & \cdots & 1 \\ \pm 1 & & & \\ \vdots & & Q & \\ \pm 1 & & & \end{pmatrix}$$
where the terms $\pm 1$ are $+1$ when $q \equiv 1 \pmod 4$ and $-1$ when $q \equiv 3 \pmod 4$. It follows that
$C$ is a conference matrix of order $q + 1$. In the special case when $q$ is prime, $Q$ is circulant.
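Construction 14.7.17 can be sketched for prime q, where the elements of F_q are the residues 0, ..., q − 1 and the quadratic character is given by Euler's criterion. The restriction to primes and the function names are assumptions made for this illustration.

```python
def quadratic_character(x, q):
    """chi(x) on Z_q for an odd prime q: 0, +1 (square) or -1 (nonsquare)."""
    x %= q
    if x == 0:
        return 0
    return 1 if pow(x, (q - 1) // 2, q) == 1 else -1

def paley_conference_matrix(q):
    """Bordered matrix C of order q + 1 built from Q = (chi(i - j))."""
    border = 1 if q % 4 == 1 else -1  # first-column entries below the corner
    Q = [[quadratic_character(i - j, q) for j in range(q)] for i in range(q)]
    return [[0] + [1] * q] + [[border] + row for row in Q]

def is_conference_matrix(C):
    """Zero diagonal and C * C^T = (n - 1) * I."""
    n = len(C)
    if any(C[i][i] != 0 for i in range(n)):
        return False
    return all(
        sum(C[i][k] * C[j][k] for k in range(n)) == (n - 1 if i == j else 0)
        for i in range(n) for j in range(n)
    )
```

For q = 5 the result is symmetric and for q = 7 it is skew-symmetric, in line with Theorem 14.7.16.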
14.7.18 Lemma 1. If $C$ is a skew-symmetric conference matrix, then $I + C$ is a Hadamard matrix.
2. If $C$ is a symmetric conference matrix of order $n$, then
$$\begin{pmatrix} I + C & -I + C \\ -I + C & -I - C \end{pmatrix}$$
is a Hadamard matrix of order $2n$.
14.7.22 Definition When A and B are circulant matrices, the conference matrix C in Construction
14.7.21 is constructible from two circulant matrices or for short, two circulants type.
14.7.23 Theorem [1289, 2831, 2975] If q ≡ 1 (mod 4) is a prime power, then there is a symmetric
conference matrix C of order q + 1 of two circulants type.
14.7.24 Theorem [2023] There is a symmetric conference matrix of order $q^2(q + 2) + 1$ whenever $q$
is a prime power, $q \equiv 3 \pmod 4$, and $q + 3$ is the order of a conference matrix.
14.7.26 Definition The size of a covering array is the covering array number CAN(t, k, v). The
covering array is optimal if it has the minimum possible number of rows.
14.7.27 Construction [1457] Let $q$ be a prime power and $q \ge s \ge 2$. Over the finite field $\mathbb{F}_q$, let
$F = \{f_1, \ldots, f_{q^s}\}$ be the set of all polynomials of degree less than $s$. Let $A$ be a subset of
$\mathbb{F}_q \cup \{\infty\}$. Define a $q^s \times |A|$ array in which the entry in cell $(j, a)$ is $f_j(a)$ when $a \in \mathbb{F}_q$, and
is the coefficient of the term of degree $s - 1$ when $a = \infty$. The result is a $\mathrm{CA}(q^s; s, |A|, q)$.
Because every $s$-tuple is covered exactly once, it is in fact an orthogonal array of index one
and strength $s$.
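The polynomial array of Construction 14.7.27 can be built and checked for small parameters. This sketch assumes q = p is prime so that field arithmetic is arithmetic modulo p; the sentinel `INF` and the function names are hypothetical.

```python
from itertools import combinations, product

INF = "inf"  # stands for the extra point at infinity

def polynomial_array(p, s, A):
    """Rows indexed by polynomials of degree < s over Z_p, columns by the points in A."""
    rows = []
    for coeffs in product(range(p), repeat=s):  # coeffs[k] multiplies x**k
        row = []
        for a in A:
            if a == INF:
                row.append(coeffs[s - 1])  # coefficient of the term of degree s - 1
            else:
                row.append(sum(c * pow(a, k, p) for k, c in enumerate(coeffs)) % p)
        rows.append(row)
    return rows

def is_index_one_oa(rows, s, p):
    """Every choice of s columns shows each of the p**s tuples exactly once."""
    if len(rows) != p ** s:
        return False
    for cols in combinations(range(len(rows[0])), s):
        if len({tuple(r[c] for c in cols) for r in rows}) != p ** s:
            return False
    return True
```

Since the array has exactly p^s rows, seeing p^s distinct tuples in a set of s columns means each tuple occurs exactly once there.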
14.7.28 Remark Covering arrays are typically constructed by a combination of computational,
direct, and recursive constructions [705]. Finite fields arise most frequently in the direct
construction of covering arrays. One example is the use of permutation vectors to construct
covering arrays [2612]. A second, outlined next, uses Weil’s theorem and character theoretic
arguments to establish that certain cyclotomic matrices form covering arrays.
14.7.29 Construction [704] Let $\omega$ be a primitive element of $\mathbb{F}_q$, with $q \equiv 1 \pmod v$. For each
$q$ and $\omega$, form a cyclotomic vector $x_{q,v,\omega} = (x_i : i \in \mathbb{F}_q) \in \mathbb{F}_q^q$ by setting $x_0 = 0$ and
$x_i \equiv j \pmod v$ when $i = \omega^j$ for $i \in \mathbb{F}_q^*$. Choosing a different primitive element of $\mathbb{F}_q$ can
lead to the same vector $x_{q,v,\omega}$, or, for some number $m$ that is coprime to $v$, to a vector
in which each element is multiplied by $m$ and reduced modulo $v$. For our purposes, the
vectors produced are equivalent, so henceforth let $x_{q,v}$ denote any vector so obtained. From
$x_{q,v} = (x_i : i \in \mathbb{F}_q)$, form a $q \times q$ matrix $A_{q,v} = (a_{ij})$ with rows and columns indexed by
$\mathbb{F}_q$, by setting $a_{ij} = x_{j-i}$ (computing the subscript in $\mathbb{F}_q$).
14.7.30 Theorem [704] When $q > t^2 v^{2t}$, $A_{q,v}$ from Construction 14.7.29 is a covering array of
strength $t$.
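Construction 14.7.29 and the bound of Theorem 14.7.30 can be tried out for prime q. With t = 2 and v = 2 the bound reads q > 64, so q = 67 is the smallest prime that qualifies. The sketch assumes q prime (arithmetic modulo q) and uses hypothetical function names.

```python
from itertools import combinations

def primitive_element(q):
    """Smallest generator of the multiplicative group modulo the prime q."""
    for g in range(2, q):
        if len({pow(g, k, q) for k in range(1, q)}) == q - 1:
            return g
    raise ValueError("no primitive element found; is q an odd prime?")

def cyclotomic_vector(q, v):
    """x_0 = 0 and x_i = j mod v whenever i = omega**j."""
    omega = primitive_element(q)
    x = [0] * q
    power = 1
    for j in range(q - 1):
        x[power] = j % v
        power = power * omega % q
    return x

def cyclotomic_matrix(q, v):
    """Matrix with entries a_ij = x_(j - i), subscripts taken modulo q."""
    x = cyclotomic_vector(q, v)
    return [[x[(j - i) % q] for j in range(q)] for i in range(q)]

def is_covering_array(A, t, v):
    """Every set of t columns contains all v**t tuples among its rows."""
    for cols in combinations(range(len(A[0])), t):
        if len({tuple(row[c] for c in cols) for row in A}) != v ** t:
            return False
    return True
```

The check is exhaustive over all pairs of columns, so it is only practical for small q and t.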
14.7.31 Definition [1404] A Hall triple system (HTS) is a pair (S, L) where S is a set of elements
(points) and L a set of lines satisfying:
1. every line is a 3-subset of S,
2. any two distinct points lie in exactly one line, and
3. for any two intersecting lines, the smallest subsystem containing them is
isomorphic to the affine plane of nine points, AG(2,3).
14.7.36 Definition An ordered design $\mathrm{OD}_\lambda(t, k, v)$ is a $k \times \lambda \cdot \binom{v}{t} \cdot t!$ array with $v$ entries such
that
1. each column has k distinct entries, and
2. each tuple of t rows contains each column tuple of t distinct entries precisely λ
times.
14.7.37 Definition A perpendicular array $\mathrm{PA}_\lambda(t, k, v)$ is a $k \times \lambda \cdot \binom{v}{t}$ array with $v$ entries such
that
1. each column has k distinct entries, and
2. each set of t rows contains each set of t distinct entries as a column precisely λ
times.
14.7.38 Definition For $0 \le s \le t$, a $\mathrm{PA}_\lambda(t, k, v)$ is an $s$-$\mathrm{PA}_\lambda(t, k, v)$ if, for each $w \le t$ and
$u \le \min(s, w)$, the following holds. Let $E_1, E_2$ be disjoint sets of entries, $E_1 \cap E_2 = \emptyset$,
with $|E_1| = u$ and $|E_2| = w - u$. Then the number of columns containing $E_1 \cup E_2$ and
having $E_2$ in a given set $U$ of $w - u$ rows is a constant, independent of the choice of $E_1$,
$E_2$, and $U$. Authentication perpendicular arrays (APA) are 1-PA.
14.7.40 Theorem [275] Permutation groups yield special cases of t-transitive or t-homogeneous sets.
1. The groups $\mathrm{PGL}_2(q)$, $q$ a prime power, form $\mathrm{OD}_1(3, q + 1, q + 1)$; the groups
$\mathrm{PSL}_2(q)$, $q \equiv 3 \pmod 4$, are $\mathrm{APA}_3(3, q + 1, q + 1)$. The special cases of this last
family when the prime power $q \equiv 3, 11 \pmod{12}$ form the only known infinite
family of $\mathrm{APA}_\lambda(t, n, n)$ with $t > 2$ and minimal $\lambda$.
2. The groups $\mathrm{AGL}_1(q)$ ($q$ a prime power) of order $q \cdot (q - 1)$ form an $\mathrm{OD}_1(2, q, q)$;
the groups $\mathrm{ASL}_1(q)$ ($q$ a prime power $\equiv 3 \pmod 4$) of order $q \cdot (q - 1)/2$ form
$\mathrm{APA}_1(2, q, q)$.
14.7.41 Definition Let $q \equiv 3 \pmod 4$ be a prime power, $k$ odd. An APAV$(q, k)$ (V stands for
vector) is a tuple $(x_1, \ldots, x_k)$ where $x_i \in \mathbb{F}_q$ and such that for each $i$ the differences $x_i - x_j$, $j \neq i$,
are evenly distributed on squares and nonsquares [1262].
14.7.42 Remark An APAV$(q, k)$ implies the existence of $\mathrm{APA}_1(2, k, q)$. In [606] a theorem on
character sums based on the Hasse–Weil inequality is used to prove existence of an APAV$(q, k)$
when $q$ is large enough with respect to $k$.
14.7.43 Theorem [606] The following exist, for a prime power q with q ≡ 3 (mod 4),
1. APAV(q, 7) for q ≥ 7, q ∉ {11, 19},
2. APAV(q, 9) for q ≥ 19,
3. APAV(q, 11) for q ≥ 11, q ∉ {19, 27},
4. APAV(q, 13) for q ≥ 13, q ∉ {19, 23, 31}, and
5. APAV(q, 15) for q ≥ 31.
14.7.44 Definition Let n, q, t, and s be positive integers and suppose (to avoid trivialities) that
n > q ≥ t ≥ 2. Let V be a set of cardinality n and let W be a set of cardinality q. A
function f : V → W separates a subset X of V if f is an injection when restricted to X.
An (n, q, t)-perfect hash family of size s is a collection F = {f1 , f2 , . . . , fs } of functions
from V to W with the property that for all sets X ⊆ V such that |X| = t, at least one
of the functions f1 , f2 , . . . , fs separates X. The notation PHF(s; n, q, t) is used for an
(n, q, t)-perfect hash family of size s. A perfect hash family is optimal if s is as small as
possible, given n, q, t.
14.7.46 Theorem [2721] Suppose that there exists a $q$-ary code $C$ of length $N$, with $K$ codewords,
having minimum distance $D$. Then there exists a $\mathrm{PHF}(N; K, q, t)$, where $(N - D)\binom{t}{2} < N$.
14.7.47 Corollary [2721] Suppose $N$ and $v$ are given, with $v$ a prime power and $N \le v + 1$. Then
there exists a $\mathrm{PHF}(N; v^{\lceil N/\binom{t}{2} \rceil}, v, t)$ based on a Reed–Solomon code.
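The code-based construction of perfect hash families can be sketched for a small Reed–Solomon code. Here each coordinate of the code plays the role of one hash function and each codeword is one element of V. The sketch assumes a prime alphabet size p and uses hypothetical function names.

```python
from itertools import combinations, product

def reed_solomon_code(p, k, points):
    """Evaluations at `points` of all polynomials of degree < k over Z_p."""
    return [
        tuple(sum(c * pow(a, e, p) for e, c in enumerate(coeffs)) % p for a in points)
        for coeffs in product(range(p), repeat=k)
    ]

def is_perfect_hash_family(codewords, t):
    """Every t codewords are pairwise distinct in at least one coordinate."""
    length = len(codewords[0])
    for X in combinations(codewords, t):
        if not any(len({c[i] for c in X}) == t for i in range(length)):
            return False
    return True
```

With p = 5, k = 2 and four evaluation points, two distinct codewords agree in at most one coordinate, so a triple loses at most three of the four coordinates and one always separates it. With only three evaluation points the counting argument fails, and a brute-force check finds a triple that is not separated.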
14.7.48 Theorem [2853] A $\mathrm{PHF}((i + 1)^2; v^{i+1}, v, 3)$ exists whenever $v$ is a prime power, $v \ge 3$, and
$i \ge 1$. A $\mathrm{PHF}(\frac{5}{6}(2i^3 + 3i^2 + i) + i + 1; v^{i+1}, v, 4)$ exists whenever $v$ is a prime power, $v \ge 4$,
and $i \ge 1$.
14.7.49 Theorem [2721] For any prime power $q$ and for any positive integers $n, m, i$ such that $n \ge m$
and $2 \le i \le q^n$, there exists a $\mathrm{PHF}(q^n; q^{m+(i-1)n}, q^m, t)$ when $\binom{t}{2} < \frac{q^m}{i-1}$.
14.7.51 Remark In a linear PHF, columns correspond to polynomials of degree less than s over Fq .
It follows directly that two columns agree in at most $s - 1$ entries, and hence that if the
linear PHF has more than $(s - 1)\binom{t}{2}$ rows, it has strength at least $t$. By judicious selection
of the particular rows (i.e., a subset A of Fq ∪ {∞}), fewer rows can often be employed.
The key observation, developed in [206, 302, 707], is that when A is chosen properly, a
system of equations over Fq for each set of t chosen columns never admits a solution. This
is developed in an algebraic setting in [302], in a geometric setting in [206], and in a
graph-theoretic setting in [707]. The results to follow all employ this basic strategy.
14.7.52 Theorem [302] Let s ≥ 2 and t ≥ 2. When q is a sufficiently large prime power, there is an
optimal linear PHF(s(t − 1); q s , q, t).
14.7.53 Theorem [206, 207, 302]
14.7.55 Definition A starter in the odd order abelian group G (written additively), where |G| = g
is a set of unordered pairs S = {{si , ti } : 1 ≤ i ≤ (g − 1)/2} that satisfies:
1. {si : 1 ≤ i ≤ (g − 1)/2} ∪ {ti : 1 ≤ i ≤ (g − 1)/2} = G\{0}, and
2. {±(si − ti ) : 1 ≤ i ≤ (g − 1)/2} = G\{0}.
14.7.56 Definition A strong starter is a starter $S = \{\{s_i, t_i\}\}$ in the abelian group $G$ with the
additional property that $s_i + t_i = s_j + t_j$ implies $i = j$, and for any $i$, $s_i + t_i \neq 0$.
14.7.57 Definition A skew starter is a starter $S = \{\{s_i, t_i\}\}$ in the abelian group $G$ with the
additional property that $s_i + t_i = \pm(s_j + t_j)$ implies $i = j$, and for any $i$, $s_i + t_i \neq 0$.
14.7.60 Definition Let $q$ be a prime power that can be written in the form $q = 2^k t + 1$, where
$t > 1$ is odd and let $\omega$ be a primitive element in the field $\mathbb{F}_q$. Then define
1. $C_0$ to be the multiplicative subgroup of $\mathbb{F}_q \setminus \{0\}$ of order $t$,
2. $C_i = \omega^i C_0$, $0 \le i \le 2^k - 1$, to be the cosets of $C_0$ (cyclotomic classes), and
3. $\Delta = 2^{k-1}$, $H = \bigcup_{i=0}^{\Delta-1} C_i$ and $C_i^a = (1/(a - 1))\,C_i$.
14.7.61 Theorem [2198] Let $T = \{\{x, \omega^\Delta x\} : x \in H\}$. Then $T$ is a skew starter (the Mullin–Nemeth
starter) in the additive group of $\mathbb{F}_q$.
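The Mullin–Nemeth starter can be generated and tested for small prime q. This is a sketch under the assumption that q is prime (so the additive group is Z_q); it follows the cyclotomic classes of Definition 14.7.60 with ω taken as the smallest primitive root, and the function names are hypothetical.

```python
def primitive_element(q):
    """Smallest generator of the multiplicative group modulo the prime q."""
    for g in range(2, q):
        if len({pow(g, e, q) for e in range(1, q)}) == q - 1:
            return g
    raise ValueError("no primitive element found; is q an odd prime?")

def mullin_nemeth_starter(q):
    """Pairs {x, omega**Delta * x} for x in H, where q = 2**k * t + 1 with t odd."""
    k, t = 0, q - 1
    while t % 2 == 0:
        t //= 2
        k += 1
    omega = primitive_element(q)
    delta = 2 ** (k - 1)
    C0 = {pow(omega, (2 ** k) * e, q) for e in range(t)}  # subgroup of order t
    H = {pow(omega, i, q) * c % q for i in range(delta) for c in C0}
    multiplier = pow(omega, delta, q)
    return [(x, multiplier * x % q) for x in sorted(H)]

def is_starter(pairs, q):
    """Elements and signed differences of the pairs each cover Z_q minus {0} exactly."""
    nonzero = set(range(1, q))
    elements = [x for pair in pairs for x in pair]
    diffs = [d % q for s, u in pairs for d in (s - u, u - s)]
    return (sorted(elements) == sorted(nonzero)
            and sorted(diffs) == sorted(nonzero))

def is_skew(pairs, q):
    """Sums s_i + t_i and their negatives are nonzero and pairwise distinct."""
    sums = [x % q for s, u in pairs for x in (s + u, -(s + u))]
    return 0 not in sums and len(set(sums)) == 2 * len(pairs)
```

For q = 13 (k = 2, t = 3, ω = 2) this produces the pairs {1, 4}, {2, 8}, {3, 12}, {5, 7}, {6, 11}, {9, 10}.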
14.7.62 Theorem [897] For each $a \in C_\Delta$, let $S_a = \{\{x, ax\} : x \in \bigcup_{i=0}^{\Delta-1} C_i^a\}$. Then for any $a \in C_\Delta$,
$S_a$ is a strong starter in the additive group of $\mathbb{F}_q$. Further, $S_a$ and $S_b$ are orthogonal if
$a, b \in C_\Delta$ with $a \neq b$. Hence, the set $\{S_a \mid a \in C_\Delta\}$ is a set of $t$ pairwise orthogonal starters
of order $q$.
14.7.63 Theorem [623] Let $p = 2^{2^n} + 1$ be a Fermat prime with $n \ge 2$. There exists a strong starter in
the additive group of $\mathbb{F}_p$.
14.7.64 Remark [1537] No strong starter in the additive groups of F3 , F5 , or F9 exists.
14.7.65 Definition Let G be an additive abelian group of order g, and let H be a subgroup of
order h of G, where g − h is even. A Room frame starter in G\H is a set of unordered
pairs S = {{si , ti } : 1 ≤ i ≤ (g − h)/2} such that
1. {si : 1 ≤ i ≤ (g − h)/2} ∪ {ti : 1 ≤ i ≤ (g − h)/2} = G\H, and
2. {±(si − ti ) : 1 ≤ i ≤ (g − h)/2} = G\H.
14.7.66 Remark A starter is the special case of a frame starter when H = {0}. The concepts of
strong, skew, and orthogonal for Room frame starters are as for starters replacing {0} by
H and (g − 1)/2 by (g − h)/2.
14.7.67 Theorem [898] Let $q \equiv 1 \pmod 4$ be a prime power such that $q = 2^k t + 1$, where $t > 1$ is
odd. Then there exist $t$ orthogonal Room frame starters in $(\mathbb{F}_q \times (\mathbb{Z}_2)^n) \setminus (\{0\} \times (\mathbb{Z}_2)^n)$ for
all $n \ge 1$.
14.7.68 Theorem [93] If $p \equiv 1 \pmod 6$ is a prime and $p \ge 19$, then there exist three orthogonal
frame starters in $(\mathbb{F}_p \times \mathbb{Z}_3) \setminus (\{0\} \times \mathbb{Z}_3)$.
14.7.69 Definition Let S be a set of n + 1 elements (symbols). A Room square of side n (on symbol
set S), RS(n), is an n × n array, F , that satisfies the following properties:
1. every cell of F either is empty or contains an unordered pair of symbols from
S,
2. each symbol of S occurs once in each row and column of F ,
3. every unordered pair of symbols occurs in precisely one cell of F .
14.7.70 Definition A Room square of side n is standardized (with respect to the symbol ∞) if the
cell (i, i) contains the pair {∞, i}.
14.7.71 Definition A standardized Room square of side n is skew if for every pair of cells (i, j)
and (j, i) (with i ≠ j) exactly one is filled.
14.7.72 Definition A standardized Room square of side n is cyclic if S = Zn ∪{∞} and if whenever
{a, b} occurs in the cell (i, j), then {a + 1, b + 1} occurs in cell (i + 1, j + 1) where all
arithmetic is performed in Zn (and ∞ + 1 = ∞).
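14.7.73 Example Below are skew Room squares of sides 7 and 9; the Room square of side 7 is cyclic.
Empty cells are shown as "--".

Side 7:
∞0  --  --  15  --  46  23
34  ∞1  --  --  26  --  50
61  45  ∞2  --  --  30  --
--  02  56  ∞3  --  --  41
52  --  13  60  ∞4  --  --
--  63  --  24  01  ∞5  --
--  --  04  --  35  12  ∞6

Side 9:
∞1  --  49  37  28  --  56  --  --
89  ∞2  --  --  --  57  34  --  16
--  58  ∞3  --  69  24  --  17  --
--  36  78  ∞4  --  19  --  25  --
--  79  --  12  ∞5  38  --  46  --
45  --  --  --  --  ∞6  18  39  27
--  --  26  59  13  --  ∞7  --  48
67  14  --  --  --  --  29  ∞8  35
23  --  15  68  47  --  --  --  ∞9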
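The defining properties of a cyclic skew Room square (Definitions 14.7.69 to 14.7.72) can be checked mechanically. The sketch below develops a first row cyclically over Z_7; the first row used here, with ∞0 in column 0, {1, 5} in column 3, {4, 6} in column 5 and {2, 3} in column 6, is an assumption chosen to be consistent with the side 7 example, and all names are hypothetical.

```python
INF = "inf"  # the symbol that is fixed by the cyclic shift

def cyclic_room_square(n, first_row):
    """Develop row 0 (a map column -> pair) along the diagonals modulo n."""
    square = {}
    for shift in range(n):
        for col, pair in first_row.items():
            moved = frozenset(x if x == INF else (x + shift) % n for x in pair)
            square[(shift, (col + shift) % n)] = moved
    return square

def is_room_square(square, n):
    """Each symbol once per row and column; each unordered pair in exactly one cell."""
    symbols = set(range(n)) | {INF}
    for line in range(n):
        for axis in (0, 1):  # 0 checks row `line`, 1 checks column `line`
            seen = [x for cell, pair in square.items() if cell[axis] == line for x in pair]
            if len(seen) != n + 1 or set(seen) != symbols:
                return False
    pairs = list(square.values())
    return len(pairs) == len(set(pairs)) == n * (n + 1) // 2

def is_skew(square, n):
    """For i != j exactly one of the cells (i, j) and (j, i) is filled."""
    return all(((i, j) in square) != ((j, i) in square)
               for i in range(n) for j in range(i + 1, n))

FIRST_ROW = {0: (INF, 0), 3: (1, 5), 5: (4, 6), 6: (2, 3)}
square7 = cyclic_room_square(7, FIRST_ROW)
```

The filled columns of row 0 are {3, 5, 6}, whose negatives modulo 7 are {4, 2, 1}; the two sets partition the nonzero residues, which is exactly the skew condition for a cyclic square.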
14.7.74 Theorem [899] The existence of d pairwise orthogonal starters in an abelian group of order
n implies the existence of a Room d-cube of side n.
14.7.75 Remark Construction 14.7.77 is used to establish Theorem 14.7.74 when d = 2. It is easily
extended when d > 2.
14.7.76 Definition An adder for the starter S = {{si , ti } : 1 ≤ i ≤ (g − 1)/2} is an ordered set
AS = {a1 , a2 , . . . , a(g−1)/2 } of (g − 1)/2 distinct nonzero elements from G such that the
set T = {{si + ai , ti + ai } : 1 ≤ i ≤ (g − 1)/2} is also a starter in the group G.
14.7.82 Remark Theorem 14.7.83 gives the connection between frame starters and Room frames.
The construction for a Room frame from a pair of orthogonal frame starters is a
generalization of Construction 14.7.77.
14.7.83 Theorem [898] Suppose a pair of orthogonal frame starters in $G \setminus H$ exists, where $|G| = g$
and $|H| = h$. Then there exists a Room frame of type $h^{g/h}$.
14.7.84 Remark Theorems 14.7.67 and 14.7.68 in conjunction with Theorem 14.7.83 yield Corollary
14.7.85. Theorem 14.7.86 details the existence of Room frames of type $t^u$.
14.7.85 Corollary a) Let $q \equiv 1 \pmod 4$ be a prime power such that $q = 2^k t + 1$, where $t > 1$ is
odd. Then there exists a Room frame of type $(2^n)^q$ for all $n \ge 1$. b) If $p \equiv 1 \pmod 6$ is a
prime and $p \ge 19$, then there exists a Room frame of type $3^p$.
14.7.86 Theorem (Existence theorems for uniform Room frames) [706, §VI.50] and [900]
1. There does not exist a Room frame of type $t^u$ if any of the following conditions
hold: (i) $u = 2$ or 3; (ii) $u = 4$ and $t = 2$; (iii) $u = 5$ and $t = 1$; (iv) $t(u - 1)$ is
odd.
2. Suppose $t$ and $u$ are positive integers, $u \ge 4$ and $(t, u) \neq (1, 5), (2, 4)$. Then there
exists a uniform Room frame of type $t^u$ if and only if $t(u - 1)$ is even.
14.7.87 Definition A strongly regular graph with parameters (v, k, λ, µ) is a finite graph on v
vertices, without loops or multiple edges, regular of degree k (with 0 < k < v − 1, so
that there are both edges and nonedges), and such that any two distinct vertices have
λ common neighbors when they are adjacent, and µ common neighbors when they are
nonadjacent.
14.7.88 Remark There are many constructions for strongly regular graphs. Example 14.7.89 gives
several that use finite fields. For a table of the existence of strongly regular graphs with
v ≤ 280 see [706, pp. 852–866].
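Definition 14.7.87 can be checked by brute force on a small graph built from a finite field. The sketch below uses the graph on Z_q, q a prime with q ≡ 1 (mod 4), in which two vertices are adjacent when their difference is a nonzero square; the restriction to primes and the function names are assumptions for this illustration.

```python
def paley_graph(q):
    """Adjacency sets on Z_q: u ~ w when u - w is a nonzero square modulo q."""
    squares = {x * x % q for x in range(1, q)}
    return {u: {w for w in range(q) if (u - w) % q in squares} for u in range(q)}

def srg_parameters(adj):
    """Return (v, k, lambda, mu) if the graph is strongly regular, else None."""
    vertices = list(adj)
    degrees = {len(adj[u]) for u in vertices}
    lam = {len(adj[u] & adj[w]) for u in vertices for w in adj[u]}
    mu = {len(adj[u] & adj[w])
          for u in vertices for w in vertices
          if w != u and w not in adj[u]}
    if len(degrees) == len(lam) == len(mu) == 1:
        return (len(vertices), degrees.pop(), lam.pop(), mu.pop())
    return None
```

For q = 13 the parameters come out as (13, 6, 2, 3), and for q = 5 the graph is the 5-cycle with parameters (5, 2, 0, 1).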
14.7.89 Example [419]
1. Paley(q): For prime powers $q = 4t + 1$, the graph with vertex set $\mathbb{F}_q$ where two
vertices are adjacent when they differ by a square. This strongly regular graph
has parameters $(q,\ \frac{1}{2}(q - 1),\ \frac{1}{4}(q - 5),\ \frac{1}{4}(q - 1))$.
2. van Lint–Schrijver(u): a graph constructed by the cyclotomic construction in
[2850], by taking the union of u classes.
3. An−1,2 (q) or n2 q : the graph on the lines in PG(n − 1, q), adjacent when they
14.7.90 Definition A whist tournament Wh(4n) for 4n players is a schedule of games each involving
two players opposing two others, such that
1. the games are arranged into 4n − 1 rounds, each of n games;
2. each player plays in exactly one game in each round;
3. each player partners every other player exactly once;
4. each player opposes every other player exactly twice.
14.7.91 Definition Each game is denoted by an ordered 4-tuple (a, b, c, d) in which the pairs {a, c},
{b, d} are partner pairs; {a, c} is a partner pair of the first kind, and {b, d} is a partner
pair of the second kind. The other pairs are opponent pairs; in particular {a, b}, {c, d}
are opponent pairs of the first kind, and {a, d}, {b, c} are opponent pairs of the second
kind.
14.7.92 Definition A whist tournament Wh(4n + 1) for 4n + 1 players is defined as for 4n, except
that Conditions 1, 2 are replaced by
1′. the games are arranged into 4n + 1 rounds each of n games;
2′. each player plays in one game in each of 4n rounds, but does not play in the
remaining round.
14.7.93 Definition A Wh(4n) is Z-cyclic if the players are ∞, 0, 1, . . . , 4n − 2 and each round
is obtained from the previous one by adding 1 modulo 4n − 1 to each non-∞ entry.
A Wh(4n + 1) is Z-cyclic if the players are 0, 1, . . . , 4n and the rounds are similarly
developed modulo 4n + 1.
14.7.94 Theorem [167] If $p = 4n + 1$ is prime and $w$ is a primitive root of $p$, then the games
$(w^i, w^{i+n}, w^{i+2n}, w^{i+3n})$, $0 \le i \le n - 1$, form the initial round of a Z-cyclic Wh(4n + 1).
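The initial round of Theorem 14.7.94 can be developed modulo p and tested against the whist conditions, using the partner and opponent pairs of Definition 14.7.91. This is a sketch with hypothetical function names; the primitive root w is supplied by the caller.

```python
from collections import Counter
from itertools import combinations

def initial_round(p, w):
    """Games (w^i, w^(i+n), w^(i+2n), w^(i+3n)) for 0 <= i < n, where p = 4n + 1."""
    n = (p - 1) // 4
    return [tuple(pow(w, i + j * n, p) for j in range(4)) for i in range(n)]

def develop(round0, p):
    """All p rounds obtained by adding a constant modulo p to every entry."""
    return [[tuple((x + s) % p for x in game) for game in round0] for s in range(p)]

def is_whist(rounds, p):
    """Each round misses one player; partners meet once, opponents twice."""
    partner, opponent = Counter(), Counter()
    for rnd in rounds:
        players = [x for game in rnd for x in game]
        if len(players) != p - 1 or len(set(players)) != p - 1:
            return False
        for a, b, c, d in rnd:
            for pair in ((a, c), (b, d)):
                partner[frozenset(pair)] += 1
            for pair in ((a, b), (c, d), (a, d), (b, c)):
                opponent[frozenset(pair)] += 1
    pairs = [frozenset(c) for c in combinations(range(p), 2)]
    return all(partner[x] == 1 and opponent[x] == 2 for x in pairs)
```

For p = 13 and w = 2 the initial round is (1, 8, 12, 5), (2, 3, 11, 10), (4, 6, 9, 7); player 0 sits out that round.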
14.7.95 Theorem Let P denote any product of primes p with each p ≡ 1 (mod 4), and let q, r
denote primes with both q, r ≡ 3 (mod 4).
A Z-cyclic Wh(4n) is known to exist when:
1. 4n ≤ 132 (see [7, 98]);
2. $4n = 2^\alpha$ ($\alpha \ge 2$) (see [100]);
3. $4n = qP + 1$, $q \in \{3, 7, 11, 19, 23, 31, 43, 47, 59, 67, 71, 79, 83, 103, 107, 127\}$ (see
[7]);
4. $4n = 3P + 1$ (see [100]);
5. $4n = 3^{2m+1} + 1$, $m \ge 0$ (see [101]);
6. $4n = qr^2P + 1$, $q$ and $r$ distinct, $q < 60$, $r < 100$ (see [7]).
A Z-cyclic Wh(4n + 1) is known to exist when:
1. $4n + 1 = P$ or $r^2P$ and $r \le 100$ (see [7, 102]);
2. $4n + 1 \le 149$ (see [7]);
3. $4n + 1 = 3^{2m}$ or $3^{2m}P$ (see [101]);
4. $4n + 1 = 3sP$, $s \in \{7, 11, 15, 19, 23, 27, 31, 35, 39, 43, 47\}$ (see [7]).
14.7.97 Theorem A TWh(4n + 1) exists for all n ≥ 5, and possibly for n = 4. A TWh(4n) exists
for all n ≥ 1 except for n = 3; see [7, 1965].
14.7.98 Theorem A Z-cyclic TWh(4n + 1) exists when:
1. 4n + 1 is a prime p ≡ 1 (mod 4), p ≥ 29 (see [462]);
See Also
References Cited: [6, 7, 93, 98, 99, 100, 101, 102, 167, 206, 207, 261, 262, 275, 302, 419,
462, 479, 606, 623, 704, 705, 706, 707, 897, 898, 899, 900, 916, 918, 1262, 1289, 1301, 1304,
1403, 1404, 1457, 1537, 1932, 1965, 2023, 2198, 2345, 2612, 2719, 2721, 2831, 2850, 2851,
2853, 2975, 3037]
14.8.1 Remark The theory of (t, m, s)-nets and (t, s)-sequences is significant for quasi-Monte Carlo
methods in scientific computing (see the books [839] and [2248] and the recent survey
article [2259]). For both (t, m, s)-nets and (t, s)-sequences, the idea is to sample the
s-dimensional unit cube $[0, 1]^s$ in a uniform and equitable manner. In a nutshell, (t, m, s)-nets
are finite samples (or point sets) and (t, s)-sequences are infinite sequences with special
uniformity properties. The definition of a (t, m, s)-net (see Definition 14.8.2 below) has
a priori no connection with finite fields, but it turns out that most of the interesting
constructions of (t, m, s)-nets use finite fields as a tool. By a point set we mean a multiset
in the sense of combinatorics, i.e., a set in which multiplicities of elements are allowed and
taken into account.
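14.8.2 Definition [2236, 2690] For integers $b \ge 2$ and $0 \le t \le m$ and a given dimension $s \ge 1$, a
(t, m, s)-net in base b is a point set $P$ consisting of $b^m$ points in $[0, 1]^s$ such that every
subinterval of $[0, 1]^s$ of volume $b^{t-m}$ which has the form
$$\prod_{i=1}^{s} \left[a_i b^{-d_i},\ (a_i + 1) b^{-d_i}\right)$$
with integers $d_i \ge 0$ and $0 \le a_i < b^{d_i}$ for $1 \le i \le s$, contains exactly $b^t$ points of $P$.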
14.8.3 Remark It is easily seen that a (t, m, s)-net in base b is also a (u, m, s)-net in base b for all
integers u with t ≤ u ≤ m. Any point set consisting of $b^m$ points in $[0, 1)^s$ is a (t, m, s)-net in
base b with t = m. Smaller values of t mean stronger uniformity properties of a (t, m, s)-net
in base b. The number t is the quality parameter of a (t, m, s)-net in base b.
14.8.4 Definition A (t, m, s)-net P in base b is a strict (t, m, s)-net in base b if t is the least
integer u such that P is a (u, m, s)-net in base b.
14.8.5 Example Let $s = 2$ and let $b \ge 2$ and $m \ge 1$ be given integers. For any integer $n$ with
$0 \le n < b^m$, let $n = \sum_{r=0}^{m-1} a_r(n)\, b^r$ with $a_r(n) \in \mathbb{Z}_b = \{0, 1, \ldots, b - 1\}$ be the digit
expansion of $n$ in base $b$ and put $\phi_b(n) = \sum_{r=0}^{m-1} a_r(n)\, b^{-r-1}$. Then the point set consisting
of the points $(n b^{-m}, \phi_b(n)) \in [0, 1]^2$, $n = 0, 1, \ldots, b^m - 1$, is a (0, m, 2)-net in base b. This
point set is the Hammersley net in base b.
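The Hammersley net can be generated exactly with rational arithmetic and tested against the net property in its usual form: every elementary interval of volume b^(t−m), with sides of length b^(−d_i), contains exactly b^t points. That formulation, and the function names below, are assumptions made for this sketch.

```python
from fractions import Fraction
from itertools import product

def radical_inverse(n, b, m):
    """phi_b(n): mirror the m base-b digits of n about the radix point."""
    phi = Fraction(0)
    for r in range(m):
        n, digit = divmod(n, b)
        phi += Fraction(digit, b ** (r + 1))
    return phi

def hammersley_net(b, m):
    """Points (n / b^m, phi_b(n)) for n = 0, ..., b^m - 1."""
    return [(Fraction(n, b ** m), radical_inverse(n, b, m)) for n in range(b ** m)]

def is_net(points, t, m, s, b):
    """Check the net property over all interval shapes (d_1, ..., d_s) with sum m - t."""
    for d in product(range(m - t + 1), repeat=s):
        if sum(d) != m - t:
            continue
        counts = {}
        for x in points:
            # index of the elementary interval containing x for this shape
            cell = tuple(int(xi * b ** di) for xi, di in zip(x, d))
            counts[cell] = counts.get(cell, 0) + 1
        if len(counts) != b ** (m - t) or any(c != b ** t for c in counts.values()):
            return False
    return True
```

Using `Fraction` avoids floating-point trouble at interval boundaries. The diagonal point set (n/8, n/8), by contrast, fails the (0, 3, 2) test in base 2.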
14.8.6 Example Let $b \ge 2$, $s \ge 1$, and $t \ge 0$ be given integers. Then the point set consisting of
the points $(n b^{-1}, \ldots, n b^{-1}) \in [0, 1]^s$, $n = 0, 1, \ldots, b - 1$, each taken with multiplicity $b^t$, is
a (t, t + 1, s)-net in base b.
14.8.7 Remark According to Remark 14.8.3 and Example 14.8.6, a (t, m, s)-net in base b always
exists for m = t and m = t + 1. For m ≥ t + 2 there are combinatorial obstructions to
the general existence of (t, m, s)-nets in base b. This was first observed in [2236]. Later, a
combinatorial equivalence between (t, m, s)-nets in base b and ordered orthogonal arrays as
defined in Definition 14.8.8 below was established.
14.8.9 Theorem [1873, 2183] Let b ≥ 2, s ≥ 2, k ≥ 2, and t ≥ 0 be integers. Then there exists a
(t, t+k, s)-net in base b if and only if there exists an ordered orthogonal array OOAb (s, k, k−
1, bt ).
14.8.10 Corollary [2236] There exists a (0, 2, s)-net in base b if and only if there exist s−2 mutually
orthogonal latin squares of order b.
14.8.11 Corollary [2236] For m ≥ 2, a (0, m, s)-net in base b can exist only if s ≤ M (b) + 2, where
M (b) is the maximum cardinality of a set of mutually orthogonal latin squares of order b.
In particular, if m ≥ 2, then a necessary condition for the existence of a (0, m, s)-net in
base b is s ≤ b + 1.
14.8.12 Remark The equivalence between nets and ordered orthogonal arrays enunciated in The-
orem 14.8.9, when combined with extensions of standard parameter bounds for orthogonal
arrays to the case of ordered orthogonal arrays, leads to lower bounds on the quality pa-
rameter for nets. Examples of such bounds are the linear programming bound [2008], the
Rao bound [2007], and the dual Plotkin bound [271, 2009]. Extensive numerical data on
these bounds are available at http://mint.sbg.ac.at.
14.8.13 Remark Most of the known constructions of nets are based on the digital method which
goes back to [2236]. In order to describe the digital method for the construction of $(t, m, s)$-nets
in base $b$, we need the following ingredients. First of all, let integers $b \ge 2$, $m \ge 1$, and
$s \ge 1$ be given. Then we choose:
1. a commutative ring $R$ with identity and of cardinality $b$;
2. bijections $\eta_j^{(i)} : R \to \mathbb{Z}_b$ for $1 \le i \le s$ and $1 \le j \le m$;
3. $m \times m$ matrices $C^{(1)}, \ldots, C^{(s)}$ over $R$.
Now let $\mathbf{r} \in R^m$ be an $m$-tuple of elements of $R$ and define
$$\pi_j^{(i)}(\mathbf{r}) = \eta_j^{(i)}(\mathbf{c}_j^{(i)} \cdot \mathbf{r}) \in \mathbb{Z}_b \quad \text{for } 1 \le i \le s,\ 1 \le j \le m,$$
where $\mathbf{c}_j^{(i)}$ is the $j$-th row of the matrix $C^{(i)}$ and $\cdot$ denotes the standard inner product. Next
we put
$$\pi^{(i)}(\mathbf{r}) = \sum_{j=1}^{m} \pi_j^{(i)}(\mathbf{r}) b^{-j} \in [0,1] \quad \text{for } 1 \le i \le s$$
and
$$P(\mathbf{r}) = (\pi^{(1)}(\mathbf{r}), \ldots, \pi^{(s)}(\mathbf{r})) \in [0,1]^s.$$
By letting $\mathbf{r}$ range over all $b^m$ elements of $R^m$, we arrive at a point set $P$ consisting of $b^m$
points in $[0,1]^s$.
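As an illustration of the digital method, the following Python sketch specializes Remark 14.8.13 to the ring $R = \mathbb{Z}_b$ with all bijections $\eta_j^{(i)}$ taken to be identity maps. These choices, the list-of-rows matrix representation, and the function names are assumptions made for the example; the remark itself allows an arbitrary commutative ring of cardinality $b$ and arbitrary bijections.

```python
from fractions import Fraction
from itertools import product


def digital_net(b, matrices):
    """Point set of the digital method over Z_b with identity bijections.

    matrices: list of s square matrices (lists of rows) with entries in Z_b.
    Returns the b^m points P(r) for r ranging over Z_b^m.
    """
    m = len(matrices[0])
    points = []
    for r in product(range(b), repeat=m):
        point = []
        for C in matrices:
            coord = Fraction(0)
            for j, row in enumerate(C, start=1):
                # digit pi_j^(i)(r) = c_j^(i) . r, computed in Z_b
                digit = sum(c * x for c, x in zip(row, r)) % b
                coord += Fraction(digit, b ** j)
            point.append(coord)
        points.append(tuple(point))
    return points


def identity(m):
    return [[1 if i == j else 0 for j in range(m)] for i in range(m)]


def antidiagonal(m):
    return [[1 if i + j == m - 1 else 0 for j in range(m)] for i in range(m)]
```

With the identity matrix and the antidiagonal matrix as the two generating matrices, `digital_net(2, [identity(2), antidiagonal(2)])` yields the four points of the Hammersley net in base 2 with $m = 2$.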
14.8.14 Definition If the point set P constructed in Remark 14.8.13 forms a (t, m, s)-net in base b,
then P is a digital (t, m, s)-net in base b. If we want to emphasize that the construction
uses the ring R, then we speak also of a digital (t, m, s)-net over R. If P is a strict
(t, m, s)-net in base b, then P is a digital strict (t, m, s)-net in base b (or over R).
14.8.15 Remark The matrices C (1) , . . . , C (s) in Remark 14.8.13 are generating matrices of the
digital net. The quality parameter t of the digital net depends only on the generating
matrices. For a convenient algebraic condition on the generating matrices to guarantee a
certain value of t, we refer to Theorem 4.26 in [2248]. In the important case where the ring
R is a finite field, an even simpler condition is given in Theorem 14.8.18 below.
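14.8.16 Example Let $s = 2$ and let $b \ge 2$ and $m \ge 1$ be given integers. Choose $R = \mathbb{Z}_b$ and let the
bijections $\eta_j^{(i)}$ in Remark 14.8.13 be identity maps. Let $C^{(1)}$ be the $m \times m$ identity matrix
over $\mathbb{Z}_b$ and let $C^{(2)} = (c_{i,j})_{1 \le i,j \le m}$ be the $m \times m$ antidiagonal matrix over $\mathbb{Z}_b$ with $c_{i,j} = 1$
if $i + j = m + 1$ and $c_{i,j} = 0$ otherwise.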
14.8.17 Definition Let $C^{(1)}, \ldots, C^{(s)}$ be $m \times m$ matrices over the finite field $\mathbb{F}_q$ and for $1 \le i \le s$
and $1 \le j \le m$ let $\mathbf{c}_j^{(i)}$ denote the $j$-th row of the matrix $C^{(i)}$. Then $\varrho(C^{(1)}, \ldots, C^{(s)})$
is defined to be the largest nonnegative integer $d$ such that, for any integers
$0 \le d_1, \ldots, d_s \le m$ with $\sum_{i=1}^{s} d_i = d$, the vectors $\mathbf{c}_j^{(i)}$, $1 \le j \le d_i$, $1 \le i \le s$, are linearly
independent over $\mathbb{F}_q$ (this property is assumed to be vacuously satisfied for $d = 0$).
14.8.18 Theorem [2236] The point set $P$ constructed in Remark 14.8.13 with $R = \mathbb{F}_q$ and $m \times m$
generating matrices $C^{(1)}, \ldots, C^{(s)}$ over $\mathbb{F}_q$ is a digital strict $(t, m, s)$-net over $\mathbb{F}_q$ with
$t = m - \varrho(C^{(1)}, \ldots, C^{(s)})$.
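For small parameters, the quantity $\varrho$ of Definition 14.8.17 and the resulting quality parameter $t = m - \varrho$ can be computed by brute force. The sketch below assumes a prime field $\mathbb{F}_p$ (so that arithmetic is plain arithmetic modulo $p$) and matrices given as lists of rows; the function names are hypothetical. The search stops at the first $d$ that fails, which is justified because the defining property is inherited from $d$ to $d - 1$ when $d \le m$.

```python
from itertools import product


def rank_mod_p(rows, p):
    """Rank over F_p of the given row vectors (Gauss-Jordan elimination)."""
    rows = [[v % p for v in r] for r in rows]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse, Python 3.8+
        rows[rank] = [(v * inv) % p for v in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                factor = rows[i][col]
                rows[i] = [(a - factor * b) % p for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank


def rho(matrices, p):
    """Largest d such that, for all (d_1..d_s) summing to d, the first d_i rows
    of each C^(i) are jointly linearly independent over F_p."""
    m, s = len(matrices[0]), len(matrices)
    for d in range(1, m + 1):
        for ds in product(range(d + 1), repeat=s):
            if sum(ds) != d:
                continue
            rows = [C[j] for C, di in zip(matrices, ds) for j in range(di)]
            if rank_mod_p(rows, p) < d:
                return d - 1
    return m


def quality_parameter(matrices, p):
    """t = m - rho, as in Theorem 14.8.18."""
    return len(matrices[0]) - rho(matrices, p)
```

The enumeration over all $(d_1, \ldots, d_s)$ grows quickly with $s$ and $m$, so this is only meant for checking small examples.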
14.8.19 Example Let $s = 2$, let $b = p$ be a prime, and let the $m \times m$ matrices $C^{(1)}$ and $C^{(2)}$ over
$\mathbb{F}_p$ be as in Example 14.8.16. Then it is easily seen that $\varrho(C^{(1)}, C^{(2)}) = m$. Using Theorem
14.8.18, this shows again that the Hammersley net in base $b = p$ is a digital $(0, m, 2)$-net
over $\mathbb{F}_p$.
14.8.20 Remark The equivalence between nets and ordered orthogonal arrays stated in Theorem
14.8.9 has an analog for digital nets. The special family of linear ordered orthogonal arrays
was introduced in [276] and it was shown that these arrays correspond to digital nets.
14.8.21 Remark There is a very useful duality theory for digital nets which facilitates many
constructions of good digital nets. A crucial ingredient is the weight function $V_m$ on $\mathbb{F}_q^{ms}$
introduced in Definition 14.8.22 below. The main result of this duality theory is Theorem
14.8.26 below.
14.8.23 Remark The NRT weight is named after the work of Niederreiter [2234] and Rosenbloom
and Tsfasman [2483]. The NRT space is $\mathbb{F}_q^{ms}$ with the metric $d_m(A, B) = V_m(A - B)$ for
$A, B \in \mathbb{F}_q^{ms}$. For $m = 1$ the NRT space reduces to the Hamming space in coding theory.
14.8.25 Remark Let the $m \times m$ matrices $C^{(1)}, \ldots, C^{(s)}$ over $\mathbb{F}_q$ be generating matrices of a digital
net $P$. Set up an $m \times ms$ matrix $M$ over $\mathbb{F}_q$ as follows: for $1 \le j \le m$, the $j$-th row of
$M$ is obtained by concatenating the transposes of the $j$-th columns of $C^{(1)}, \ldots, C^{(s)}$. Let
$\mathcal{M} \subseteq \mathbb{F}_q^{ms}$ be the row space of $M$ and let $\mathcal{M}^{\perp}$ be its dual space, i.e.,
$$\mathcal{M}^{\perp} = \{A \in \mathbb{F}_q^{ms} : A \cdot M = 0 \text{ for all } M \in \mathcal{M}\},$$
14.8.26 Theorem [2267] Let $m \ge 1$ and $s \ge 2$ be integers. Then the point set $P$ in Remark 14.8.25
is a digital strict $(t, m, s)$-net over $\mathbb{F}_q$ with $t = m + 1 - \delta_m(\mathcal{M}^{\perp})$.
14.8.27 Corollary [2267] Let $m \ge 1$ and $s \ge 2$ be integers. Then from any $\mathbb{F}_q$-linear subspace $\mathcal{N}$
of $\mathbb{F}_q^{ms}$ with $\dim(\mathcal{N}) \ge ms - m$ we can construct a digital strict $(t, m, s)$-net over $\mathbb{F}_q$ with
$t = m + 1 - \delta_m(\mathcal{N})$.
14.8.28 Remark There are digital nets for which a property analogous to that in Definition 14.8.2
holds for a wider range of subintervals of [0, 1]s . Such generalized digital nets were introduced
in [835] and are also studied in detail in Chapter 15 of [839].
14.8.29 Remark There is a generalization of the digital method which can be viewed as a nonlinear
analog of the construction in Remark 14.8.13. For simplicity we consider only the case where
$R = \mathbb{F}_q$ (see [2258] for a general ring $R$). Compared to Remark 14.8.13, the only change is
that instead of linear forms $\mathbf{c}_j^{(i)} \cdot \mathbf{r}$ we now use polynomial functions, that is, for $1 \le i \le s$
and $1 \le j \le m$ we choose polynomials $f_j^{(i)}$ over $\mathbb{F}_q$ in $m$ variables and then we replace $\mathbf{c}_j^{(i)} \cdot \mathbf{r}$
by $f_j^{(i)}(\mathbf{r})$ for $\mathbf{r} \in \mathbb{F}_q^m$. The following criterion uses the concept of permutation polynomial
in several variables (see Section 8.2).
14.8.30 Theorem [2258] The point set constructed in Remark 14.8.29 is a $(t, m, s)$-net in base $q$
if and only if, for any integers $d_1, \ldots, d_s \ge 0$ with $\sum_{i=1}^{s} d_i = m - t$, the polynomials $f_j^{(i)}$,
$1 \le j \le d_i$, $1 \le i \le s$, have the property that all of their nontrivial linear combinations
with coefficients from $\mathbb{F}_q$ are permutation polynomials over $\mathbb{F}_q$ in $m$ variables.
14.8.31 Remark A general principle for the construction of $(t, m, s)$-nets with $s \ge 2$ is based on the
use of Proposition 14.8.50 below in conjunction with the constructions of $(t, s-1)$-sequences
in Subsection 14.8.6. In the present subsection, we describe constructions of $(t, m, s)$-nets
that are not derived from this principle. One of the first constructions of this type was that
of polynomial lattices in [2247]. Choose $f \in \mathbb{F}_q[x]$ with $\deg(f) = m \ge 1$ and an $s$-tuple
$\mathbf{g} = (g_1, \ldots, g_s) \in \mathbb{F}_q[x]^s$ with $\deg(g_i) < m$ for $1 \le i \le s$. Consider the Laurent series
expansions
$$\frac{g_i(x)}{f(x)} = \sum_{k=1}^{\infty} u_k^{(i)} x^{-k} \in \mathbb{F}_q((x^{-1})) \quad \text{for } 1 \le i \le s.$$
Then for $1 \le i \le s$ the generating matrix $C^{(i)} = (c_{j,r}^{(i)})$ is the Hankel matrix given by
$c_{j,r}^{(i)} = u_{j+r}^{(i)} \in \mathbb{F}_q$ for $1 \le j \le m$, $0 \le r \le m - 1$. The bijections $\eta_j^{(i)}$ in Remark 14.8.13 are
chosen arbitrarily. The resulting digital net over $\mathbb{F}_q$ is denoted by $P(\mathbf{g}, f)$.
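The Hankel generating matrices of a polynomial lattice can be computed by long division. The sketch below assumes a prime field $\mathbb{F}_p$ and represents polynomials as coefficient lists in increasing degree (for example `[1, 1, 1]` for $x^2 + x + 1$); both the representation and the function names are illustrative choices, not notation from the text.

```python
def laurent_coeffs(g, f, p, count):
    """Return [u_1, ..., u_count] with g/f = sum_{k>=1} u_k x^(-k) over F_p.

    Requires deg(g) < deg(f) and a leading coefficient of f that is nonzero mod p.
    """
    m = len(f) - 1
    inv_lead = pow(f[m], -1, p)  # modular inverse, Python 3.8+
    rem = [c % p for c in g] + [0] * (m - len(g))
    coeffs = []
    for _ in range(count):
        shifted = [0] + rem                # multiply the remainder by x
        u = (shifted[m] * inv_lead) % p    # next Laurent coefficient
        rem = [(shifted[i] - u * f[i]) % p for i in range(m)]
        coeffs.append(u)
    return coeffs


def hankel_generating_matrix(g, f, p):
    """m x m matrix with entries c_{j,r} = u_{j+r}, 1 <= j <= m, 0 <= r <= m-1."""
    m = len(f) - 1
    u = [0] + laurent_coeffs(g, f, p, 2 * m - 1)  # pad so that u[k] equals u_k
    return [[u[j + r] for r in range(m)] for j in range(1, m + 1)]
```

For instance, over $\mathbb{F}_2$ with $f = x^2 + x + 1$ and $g = 1$, the expansion is $x^{-2} + x^{-3} + x^{-5} + x^{-6} + \cdots$, so the Hankel matrix has rows $(0, 1)$ and $(1, 1)$.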
14.8.32 Definition Let $s \ge 2$ and let $f$ and $\mathbf{g}$ be as in Remark 14.8.31. Then the figure of merit
$\varrho(\mathbf{g}, f)$ is defined by
$$\varrho(\mathbf{g}, f) = s - 1 + \min \sum_{i=1}^{s} \deg(h_i),$$
14.8.33 Theorem [2247] For $s \ge 2$, the point set $P(\mathbf{g}, f)$ in Remark 14.8.31 is a digital strict
$(t, m, s)$-net over $\mathbb{F}_q$ with $t = m - \varrho(\mathbf{g}, f)$.
14.8.34 Remark It is clear from Theorem 14.8.33 that in order to obtain a good $(t, m, s)$-net by this
construction, i.e., a net with a small value of $t$, we need to find $\mathbf{g}$ and $f$ with a large figure of
merit $\varrho(\mathbf{g}, f)$. A systematic method for the explicit construction of good polynomial lattices
is the component-by-component algorithm in [836]; see also Chapter 10 in [839].
14.8.35 Remark Several constructions of digital nets are based on Corollary 14.8.27. A powerful
construction of this type uses algebraic function fields (see Section 12.1 for background on
algebraic function fields). We present only a simple version of this construction; more refined
versions can be found in [2261]. Let F be an algebraic function field (of one variable) with
full constant field Fq , that is, Fq is algebraically closed in F . Let N (F ) denote the number
of rational places of F . For a given dimension s ≥ 2, we assume that N (F ) ≥ s and let
P1 , . . . , Ps be s distinct rational places of F . Let G be a divisor of F . For each i = 1, . . . , s,
let ti ∈ F be a prime element at Pi and let ni ∈ Z be the coefficient of Pi in G. For f in the
Riemann-Roch space $\mathcal{L}(G)$ and a given integer $m \ge 1$, let $\theta^{(i)}(f) \in \mathbb{F}_q^m$ be the vector whose
coordinates are, in descending order, the coefficients of $t_i^j$, $j = -n_i + m - 1, -n_i + m - 2, \ldots, -n_i$,
in the local expansion of $f$ at $P_i$. Now define the $\mathbb{F}_q$-linear map $\theta : \mathcal{L}(G) \to \mathbb{F}_q^{ms}$
by
$$\theta(f) = (\theta^{(1)}(f), \ldots, \theta^{(s)}(f)) \quad \text{for all } f \in \mathcal{L}(G).$$
A digital net over Fq is then obtained by applying Corollary 14.8.27 with N being the image
of the map θ. A suitable choice of the divisor G leads to the following result.
14.8.36 Theorem [2261] Let $s \ge 2$ be an integer and let $F$ be an algebraic function field with full
constant field $\mathbb{F}_q$, genus $g \ge 1$, and $N(F) \ge s$. If $k$ and $m$ are integers with $0 \le k \le g - 1$
and $m \ge \max(1, g - k - 1)$, then there exists a digital $(g - k - 1, m, s)$-net over $\mathbb{F}_q$ provided
that
$$\binom{s + m + k - g}{s - 1} A_k(F) < h(F),$$
where $A_k(F)$ is the number of positive divisors of $F$ of degree $k$ and $h(F)$ is the divisor
class number of $F$.
14.8.37 Example Let $q = 9$ and let $F$ be the Hermitian function field over $\mathbb{F}_9$, that is, $F = \mathbb{F}_9(x, y)$
with $y^3 + y = x^4$. Then $g = 3$, $N(F) = 28$, and $h(F) = 4096$. We apply Theorem 14.8.36
with $s = 28$, $k = 0$, $m = 5$, and we obtain a digital $(2, 5, 28)$-net over $\mathbb{F}_9$. The value $t = 2$ is
the currently best value of the quality parameter for a $(t, 5, 28)$-net in base 9, according to
the website http://mint.sbg.ac.at which contains an extensive database for parameters
of $(t, m, s)$-nets.
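The arithmetic behind Example 14.8.37 is easy to check. The following sketch evaluates the sufficient condition of Theorem 14.8.36 for given numerical inputs; the function name is hypothetical, and the value $A_0(F) = 1$ used in the example call relies on the fact that the zero divisor is the only positive divisor of degree 0.

```python
from math import comb


def theorem_14_8_36_condition(s, m, k, g, A_k, h):
    """Return True if binom(s+m+k-g, s-1) * A_k(F) < h(F).

    When True, Theorem 14.8.36 yields a digital (g-k-1, m, s)-net.
    Raises ValueError if k or m is outside the range required by the theorem.
    """
    if not (0 <= k <= g - 1 and m >= max(1, g - k - 1)):
        raise ValueError("k or m outside the range required by the theorem")
    return comb(s + m + k - g, s - 1) * A_k < h


# Hermitian function field over F_9: g = 3, N(F) = 28, h(F) = 4096.
hermitian_ok = theorem_14_8_36_condition(s=28, m=5, k=0, g=3, A_k=1, h=4096)
```

Here the binomial coefficient is $\binom{30}{27} = 4060$, which is indeed smaller than $h(F) = 4096$, and the resulting quality parameter is $g - k - 1 = 2$.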
14.8.38 Remark Another construction based on Corollary 14.8.27 was introduced in [2255]. For
integers $m \ge 1$ and $s \ge 2$, consider the $\mathbb{F}_q$-linear space $\mathcal{P} = \{f \in \mathbb{F}_{q^m}[x] : \deg(f) < s\}$.
Fix $\alpha \in \mathbb{F}_{q^m}$ and define the $\mathbb{F}_q$-linear subspace $\mathcal{P}_{\alpha} = \{f \in \mathcal{P} : f(\alpha) = 0\}$ of $\mathcal{P}$. Set up a
map $\tau : \mathcal{P} \to \mathbb{F}_q^{ms}$ as follows. Write $f \in \mathcal{P}$ explicitly as $f(x) = \sum_{i=1}^{s} \gamma_i x^{i-1}$ with $\gamma_i \in \mathbb{F}_{q^m}$
for $1 \le i \le s$. For each $i = 1, \ldots, s$, choose an ordered basis $B_i$ of $\mathbb{F}_{q^m}$ over $\mathbb{F}_q$ and let
$c_i(f) \in \mathbb{F}_q^m$ be the coordinate vector of $\gamma_i$ with respect to $B_i$. Then define
$$\tau(f) = (c_1(f), \ldots, c_s(f)) \in \mathbb{F}_q^{ms} \quad \text{for all } f \in \mathcal{P}.$$
A digital net over $\mathbb{F}_q$ is now obtained by applying Corollary 14.8.27 with $\mathcal{N}$ being the image
of the subspace $\mathcal{P}_{\alpha}$ under $\tau$. The resulting digital net is a cyclic digital net over $\mathbb{F}_q$ relative
to the bases $B_1, \ldots, B_s$.
14.8.39 Remark A generalization of the construction in Remark 14.8.38 was presented in [2398].
For integers $m \ge 1$ and $s \ge 2$, consider $Q = \mathbb{F}_{q^m}^s$ as a vector space over $\mathbb{F}_q$. Fix $\alpha \in Q$ with
$\alpha \ne 0$ and put $Q_{\alpha} = \{\beta \in Q : \alpha \cdot \beta = 0\}$. Then $Q_{\alpha}$ is an $\mathbb{F}_q$-linear subspace of $Q$. Let
$\sigma : Q \to \mathbb{F}_q^{ms}$ be an isomorphism between vector spaces over $\mathbb{F}_q$. A digital net over $\mathbb{F}_q$ is now
obtained by applying Corollary 14.8.27 with $\mathcal{N}$ being the image of the subspace $Q_{\alpha}$ under
$\sigma$. The resulting digital net is a hyperplane net over $\mathbb{F}_q$. Detailed information on hyperplane
nets and cyclic digital nets can be found in Chapter 11 of [839].
14.8.40 Theorem [1874] Given a linear code over $\mathbb{F}_q$ with length $n$, dimension $k$, and minimum
distance $d \ge 3$, we can construct a digital $(n - k - d + 1, n - k, s)$-net over $\mathbb{F}_q$ with
$s = \lfloor (n - 1)/h \rfloor$ if $d = 2h + 2$ is even and $s = \lfloor n/h \rfloor$ if $d = 2h + 1$ is odd.
14.8.41 Remark Further applications of coding theory to the construction of digital nets are dis-
cussed in the survey articles [2251] and [2256]. We specifically mention some principles of
combining several digital nets to obtain a new digital net that are inspired by coding theory.
For instance, the well-known Kronecker-product construction in coding theory has an analog
for digital nets [276]. The following result is an analog of the matrix-product construction
of linear codes.
14.8.42 Theorem [2263] Let $h$ be an integer with $2 \le h \le q$. If for $k = 1, \ldots, h$ a digital
$(t_k, m_k, s_k)$-net over $\mathbb{F}_q$ is given and if $s_1 \le s_2 \le \cdots \le s_h$, then we can construct a digital
$(t, \sum_{k=1}^{h} m_k, \sum_{k=1}^{h} s_k)$-net over $\mathbb{F}_q$ with
$$t = 1 + \sum_{k=1}^{h} m_k - \min_{1 \le k \le h} (h - k + 1)(m_k - t_k + 1).$$
14.8.45 Remark There is an analog of $(t, m, s)$-nets for sequences of points in $[0,1]^s$, given in
Definition 14.8.46 below. First we need some notation. For an integer $b \ge 2$ and a real
number $x \in [0,1]$, let $x = \sum_{j=1}^{\infty} y_j b^{-j}$ with all $y_j \in \mathbb{Z}_b$ be a $b$-adic expansion of $x$, where the
case $y_j = b - 1$ for all sufficiently large $j$ is allowed. For any integer $m \ge 1$, we define the
truncation $[x]_{b,m} = \sum_{j=1}^{m} y_j b^{-j}$. Note that this truncation operates on the expansion of $x$
and not on $x$ itself, since it may yield different results depending on which $b$-adic expansion
of $x$ is used. If $\mathbf{x} = (x^{(1)}, \ldots, x^{(s)}) \in [0,1]^s$ and the $x^{(i)}$, $1 \le i \le s$, are given by prescribed
$b$-adic expansions, then we define
14.8.47 Remark It is easily seen that a (t, s)-sequence in base b is also a (u, s)-sequence in base b for
all integers u ≥ t. Smaller values of t mean stronger uniformity properties of a (t, s)-sequence
in base b. The number t is the quality parameter of a (t, s)-sequence in base b.
14.8.48 Definition A (t, s)-sequence S in base b is a strict (t, s)-sequence in base b if t is the least
integer u such that S is a (u, s)-sequence in base b.
14.8.49 Example Let $s = 1$ and let $b \ge 2$ be an integer. For $n = 0, 1, \ldots$, let $n = \sum_{r=0}^{\infty} a_r(n) b^r$ with
all $a_r(n) \in \mathbb{Z}_b$ and $a_r(n) = 0$ for all sufficiently large $r$ be the digit expansion of $n$ in base
$b$. Put $\phi_b(n) = \sum_{r=0}^{\infty} a_r(n) b^{-r-1}$. Then the sequence $\phi_b(0), \phi_b(1), \ldots$ is a $(0, 1)$-sequence in
base $b$. This sequence is the van der Corput sequence in base $b$.
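The van der Corput sequence is obtained by reflecting the base-$b$ digits of $n$ about the radix point. A minimal Python sketch (the function name is illustrative) using exact fractions:

```python
from fractions import Fraction


def van_der_corput(n, b):
    """phi_b(n) = sum_r a_r(n) b^(-r-1), where a_r(n) are the base-b digits of n."""
    value = Fraction(0)
    scale = Fraction(1, b)
    while n > 0:
        n, digit = divmod(n, b)
        value += digit * scale
        scale /= b
    return value
```

In base 2 the sequence begins $0, 1/2, 1/4, 3/4, 1/8, \ldots$, and in base 3 it begins $0, 1/3, 2/3, 1/9, \ldots$.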
14.8.50 Proposition [2236] Given a (t, s)-sequence in base b, we can construct a (t, m, s + 1)-net in
base b for any integer m ≥ t.
14.8.51 Remark The following result is obtained by combining Corollary 14.8.11 and Proposition
14.8.50.
14.8.52 Corollary [2236] A (0, s)-sequence in base b can exist only if s ≤ M (b) + 1. In particular,
a necessary condition for the existence of a (0, s)-sequence in base b is s ≤ b.
14.8.53 Remark It was shown in [2238] that for any integers b ≥ 2 and s ≥ 1 there exists a (t, s)-
sequence in base b for some value of t. Therefore it is meaningful to define tb (s) as the least
value of t for which there exists a (t, s)-sequence in base b.
14.8.54 Theorem [2283, 2565] For any integers $b \ge 2$ and $s \ge 1$, we have
$$t_b(s) \ge \frac{s}{b - 1} - c_b \log(s + 1)$$
with a constant $c_b > 0$ depending only on $b$.
14.8.55 Definition [1858] Let $b \ge 2$ and $s \ge 1$ be integers and let $\mathbb{N}_0$ denote the set of nonnegative
integers. Let $\mathbf{T} : \mathbb{N} \to \mathbb{N}_0$ be a function with $\mathbf{T}(m) \le m$ for all $m \in \mathbb{N}$. Then a sequence
$\mathbf{x}_0, \mathbf{x}_1, \ldots$ of points in $[0,1]^s$ is a $(\mathbf{T}, s)$-sequence in base $b$ if for all $k \in \mathbb{N}_0$ and $m \in \mathbb{N}$,
the points $[\mathbf{x}_n]_{b,m}$ with $k b^m \le n < (k + 1) b^m$ form a $(\mathbf{T}(m), m, s)$-net in base $b$. Here
the coordinates of all points $\mathbf{x}_n$, $n = 0, 1, \ldots$, are given by prescribed $b$-adic expansions.
A $(\mathbf{T}, s)$-sequence $S$ in base $b$ is a strict $(\mathbf{T}, s)$-sequence in base $b$ if there is no function
$\mathbf{U} : \mathbb{N} \to \mathbb{N}_0$ with $\mathbf{U}(m) \le m$ for all $m \in \mathbb{N}$ and $\mathbf{U}(m) < \mathbf{T}(m)$ for at least one $m \in \mathbb{N}$
such that $S$ is a $(\mathbf{U}, s)$-sequence in base $b$.
14.8.56 Remark If the function T in Definition 14.8.55 is such that for some integer t ≥ 0 we have
T(m) = m for m ≤ t and T(m) = t for m > t, then the concept of a (T, s)-sequence in
base b reduces to that of a (t, s)-sequence in base b.
14.8.57 Remark There is an analog of the digital method in Remark 14.8.13 for the construction
of sequences. Let integers $b \ge 2$ and $s \ge 1$ be given. Then we choose:
1. a commutative ring $R$ with identity and of cardinality $b$;
2. bijections $\psi_r : \mathbb{Z}_b \to R$ for $r = 0, 1, \ldots$, with $\psi_r(0) = 0$ for all sufficiently large $r$;
3. bijections $\eta_j^{(i)} : R \to \mathbb{Z}_b$ for $1 \le i \le s$ and $j \ge 1$;
4. $\infty \times \infty$ matrices $C^{(1)}, \ldots, C^{(s)}$ over $R$.
For $n = 0, 1, \ldots$, let $n = \sum_{r=0}^{\infty} a_r(n) b^r$ with all $a_r(n) \in \mathbb{Z}_b$ and $a_r(n) = 0$ for all sufficiently
large $r$ be the digit expansion of $n$ in base $b$. We put $\mathbf{n} = (\psi_r(a_r(n)))_{r=0}^{\infty} \in R^{\infty}$. Next we
define
$$y_{n,j}^{(i)} = \eta_j^{(i)}(\mathbf{c}_j^{(i)} \cdot \mathbf{n}) \in \mathbb{Z}_b \quad \text{for } n \ge 0,\ 1 \le i \le s, \text{ and } j \ge 1,$$
where $\mathbf{c}_j^{(i)}$ is the $j$-th row of the matrix $C^{(i)}$. Note that the inner product $\mathbf{c}_j^{(i)} \cdot \mathbf{n}$ is meaningful
since $\mathbf{n}$ has only finitely many nonzero coordinates. Then we put
$$x_n^{(i)} = \sum_{j=1}^{\infty} y_{n,j}^{(i)} b^{-j} \quad \text{for } n \ge 0 \text{ and } 1 \le i \le s.$$
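A truncated version of the digital method for sequences can be sketched as follows. The sketch assumes $R = \mathbb{Z}_b$ with identity maps for all bijections $\psi_r$ and $\eta_j^{(i)}$, represents each infinite generating matrix by a hypothetical function `entry(j, r)` returning the entry in row $j \ge 1$ and column $r \ge 0$, and keeps only the first `precision` digits of each coordinate.

```python
from fractions import Fraction


def base_digits(n, b):
    """Digits a_0(n), a_1(n), ... of n in base b (least significant first)."""
    digits = []
    while n > 0:
        n, d = divmod(n, b)
        digits.append(d)
    return digits


def digital_sequence_point(n, b, entry_functions, precision):
    """Approximate x_n: the first `precision` base-b digits of each coordinate."""
    digits = base_digits(n, b)
    point = []
    for entry in entry_functions:
        coord = Fraction(0)
        for j in range(1, precision + 1):
            # y_{n,j} = (row j of C) . n, a finite sum since n has finitely many digits
            y = sum(entry(j, r) * a for r, a in enumerate(digits)) % b
            coord += Fraction(y, b ** j)
        point.append(coord)
    return tuple(point)


def identity_entry(j, r):
    """Entry of the infinite identity matrix (rows j >= 1, columns r >= 0)."""
    return 1 if r == j - 1 else 0
```

With the identity matrix as the single generating matrix, the points coincide with the van der Corput sequence for all $n < b^{\text{precision}}$.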
14.8.58 Definition If the sequence S constructed in Remark 14.8.57 forms a (t, s)-sequence in
base b, then S is a digital (t, s)-sequence in base b. If we want to emphasize that the
construction uses the ring R, then we speak also of a digital (t, s)-sequence over R. If S
is a strict (t, s)-sequence in base b, then S is a digital strict (t, s)-sequence in base b (or
over R).
14.8.59 Definition If the sequence S constructed in Remark 14.8.57 forms a (strict) (T, s)-sequence
in base b, then S is a digital (strict) (T, s)-sequence in base b (or over R).
14.8.60 Remark The matrices C (1) , . . . , C (s) in Remark 14.8.57 are generating matrices of the
digital sequence. The value of t for a digital (t, s)-sequence and the function T for a digital
(T, s)-sequence depend only on the generating matrices. For the determination of t in the
general case, we refer to Theorem 4.35 in [2248]. For the case R = Fq , see Theorem 14.8.62
below.
14.8.61 Example Let $s = 1$ and let $b \ge 2$ be an integer. Choose $R = \mathbb{Z}_b$ and let the bijections
$\psi_r$ and $\eta_j^{(i)}$ in Remark 14.8.57 be identity maps. Let $C^{(1)}$ be the $\infty \times \infty$ identity matrix
over $\mathbb{Z}_b$. Then the van der Corput sequence in Example 14.8.49 is easily seen to be a digital
$(0, 1)$-sequence over $\mathbb{Z}_b$ with generating matrix $C^{(1)}$.
14.8.62 Theorem Let $S$ be the sequence constructed in Remark 14.8.57 with $R = \mathbb{F}_q$ and $\infty \times \infty$
generating matrices $C^{(1)}, \ldots, C^{(s)}$ over $\mathbb{F}_q$. For $1 \le i \le s$ and $m \in \mathbb{N}$, let $C_m^{(i)}$ denote the
left upper $m \times m$ submatrix of $C^{(i)}$. Then $S$ is a digital strict $(\mathbf{T}, s)$-sequence over $\mathbb{F}_q$ with
$\mathbf{T}(m) = m - \varrho(C_m^{(1)}, \ldots, C_m^{(s)})$ for all $m \in \mathbb{N}$, where $\varrho(C_m^{(1)}, \ldots, C_m^{(s)})$ is given by Definition
14.8.17.
14.8.63 Remark It was shown in [2238] that for any prime power q and any integer s ≥ 1, there
exists a digital (t, s)-sequence over Fq for some value of t. In analogy with tb (s) in Remark
14.8.53, we define dq (s) as the least value of t for which there exists a digital (t, s)-sequence
over Fq . It is trivial that tq (s) ≤ dq (s), and so Theorem 14.8.54 provides also a lower bound
on dq (s).
14.8.64 Problem With the previous notation, it is an open problem whether we can ever have
tq (s) < dq (s).
14.8.65 Remark An analog of the duality theory for digital nets described in Subsection 14.8.2 was
developed in [838] for the case of digital $(\mathbf{T}, s)$-sequences. Let the weight function $v_m$ on
$\mathbb{F}_q^m$ be as in Definition 14.8.22. For $A = (\mathbf{a}^{(1)}, \ldots, \mathbf{a}^{(s)}) \in \mathbb{F}_q^{ms}$ with $\mathbf{a}^{(i)} \in \mathbb{F}_q^m$ for $1 \le i \le s$,
we put
$$U_m(A) = \max_{1 \le i \le s} v_m(\mathbf{a}^{(i)}).$$
14.8.67 Theorem [838] Let s ≥ 2 be an integer. Then from any dual space chain (Nm )m≥1 over Fq
we can construct a digital strict (T, s)-sequence over Fq with T(m) = m + 1 − δm (Nm ) for
all m ≥ 1.
14.8.68 Remark In analogy with the generalized digital nets mentioned in Remark 14.8.28, there are
generalized digital sequences as introduced in [835] and also studied in Chapter 15 of [839].
14.8.69 Remark The nonlinear digital method described in Remark 14.8.29 can be used also for
the construction of (t, s)-sequences [2258].
14.8.70 Remark A general family of digital $(t, s)$-sequences is formed by Niederreiter sequences
[2238]. We describe only the simplest case of this construction. For a given dimension $s \ge 1$,
let $p_1, \ldots, p_s \in \mathbb{F}_q[x]$ be pairwise coprime polynomials over $\mathbb{F}_q$. Let $e_i = \deg(p_i) \ge 1$ for
$1 \le i \le s$. For $1 \le i \le s$ and integers $u \ge 1$ and $0 \le k < e_i$, consider the Laurent series
expansion
$$\frac{x^k}{p_i(x)^u} = \sum_{r=0}^{\infty} a^{(i)}(u, k, r) x^{-r-1} \in \mathbb{F}_q((x^{-1})).$$
Then define $c_{j,r}^{(i)} = a^{(i)}(Q + 1, k, r) \in \mathbb{F}_q$ for $1 \le i \le s$, $j \ge 1$, and $r \ge 0$, where $j - 1 = Q e_i + k$
with integers $Q = Q(i, j)$ and $k = k(i, j)$ satisfying $0 \le k < e_i$. The generating matrices of
the Niederreiter sequence are now given by $C^{(i)} = (c_{j,r}^{(i)})_{j \ge 1, r \ge 0}$ for $1 \le i \le s$. The bijections
$\psi_r$ and $\eta_j^{(i)}$ in Remark 14.8.57 are chosen arbitrarily.
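The generating matrices of a Niederreiter sequence can be computed row by row from the Laurent expansions in Remark 14.8.70. The sketch below is restricted to prime fields $\mathbb{F}_p$, uses coefficient lists in increasing degree for polynomials, and returns only the upper left `size` by `size` block of one matrix $C^{(i)}$; these conventions and the function names are assumptions of the example.

```python
def poly_mul(a, b, p):
    """Product of two polynomials over F_p (coefficient lists, increasing degree)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return out


def laurent_coeffs(num, den, p, count):
    """Coefficients of x^-1, x^-2, ..., x^-count in num/den, with deg(num) < deg(den)."""
    m = len(den) - 1
    inv_lead = pow(den[m], -1, p)  # modular inverse, Python 3.8+
    rem = [c % p for c in num] + [0] * (m - len(num))
    coeffs = []
    for _ in range(count):
        shifted = [0] + rem
        u = (shifted[m] * inv_lead) % p
        rem = [(shifted[i] - u * den[i]) % p for i in range(m)]
        coeffs.append(u)
    return coeffs


def niederreiter_matrix(poly, p, size):
    """Rows j = 1..size, columns r = 0..size-1 of the generating matrix for `poly`."""
    e = len(poly) - 1
    rows, power, current_q = [], [1], -1
    for j in range(1, size + 1):
        Q, k = divmod(j - 1, e)        # j - 1 = Q e + k with 0 <= k < e
        if Q != current_q:
            power = poly_mul(power, poly, p)   # now equals poly^(Q+1)
            current_q = Q
        numerator = [0] * k + [1]              # the monomial x^k
        rows.append(laurent_coeffs(numerator, power, p, size))
    return rows
```

For $p_1(x) = x$ the result is the identity matrix, and for $p(x) = x + 1$ over $\mathbb{F}_2$ the entries are the binomial coefficients $\binom{r}{j-1}$ reduced modulo 2.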
14.8.71 Theorem [837, 2238] The Niederreiter sequence based on the pairwise coprime nonconstant
polynomials $p_1, \ldots, p_s \in \mathbb{F}_q[x]$ is a digital strict $(t, s)$-sequence over $\mathbb{F}_q$ with
$t = \sum_{i=1}^{s} (\deg(p_i) - 1)$.
14.8.72 Remark If q is a prime, 1 ≤ s ≤ q, and pi (x) = x − i + 1 ∈ Fq [x] for 1 ≤ i ≤ s, then
we obtain the digital (0, s)-sequences over Fq called Faure sequences [1045]. An analogous
construction of digital (0, s)-sequences over Fq for arbitrary prime powers q and dimensions
1 ≤ s ≤ q was given in [2236]. Note that in view of Corollary 14.8.52, s ≤ q is also a
necessary condition for the existence of a (0, s)-sequence in base q. If q = 2, s ≥ 1 is an
arbitrary dimension, p1 (x) = x ∈ F2 [x], and p2 , . . . , ps are distinct primitive polynomials
over F2 , then we obtain Sobol’ sequences [2690].
14.8.73 Remark The construction of Niederreiter sequences in Remark 14.8.70 is optimized by
letting $p_1, \ldots, p_s$ be $s$ distinct monic irreducible polynomials over $\mathbb{F}_q$ of least degrees. If
with this choice we put $T_q(s) = \sum_{i=1}^{s} (\deg(p_i) - 1)$, then for fixed $q$ the quantity $T_q(s)$ is
of the order of magnitude $s \log s$ as $s \to \infty$. Let $U(s)$ denote the least value of $t$ that is
known to be achievable by Sobol' sequences for given $s$. Then $T_2(s) = U(s)$ for $1 \le s \le 7$
and $T_2(s) < U(s)$ for all $s \ge 8$.
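The quantity $T_q(s)$ can be evaluated without listing the polynomials themselves, since only their degrees matter. The sketch below uses the standard counting formula $\frac{1}{d} \sum_{e \mid d} \mu(d/e) q^e$ for the number of monic irreducible polynomials of degree $d$ over $\mathbb{F}_q$; this formula is not stated in the present section and is brought in as a known fact, and the function names are illustrative.

```python
def mobius(n):
    """Moebius function mu(n) by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # n has a squared prime factor
            result = -result
        p += 1
    return -result if n > 1 else result


def count_monic_irreducibles(q, d):
    """Number of monic irreducible polynomials of degree d over F_q."""
    return sum(mobius(d // e) * q ** e for e in range(1, d + 1) if d % e == 0) // d


def T(q, s):
    """T_q(s): sum of (deg - 1) over the s monic irreducibles of least degrees."""
    total, degree, remaining = 0, 1, s
    while remaining > 0:
        take = min(remaining, count_monic_irreducibles(q, degree))
        total += take * (degree - 1)
        remaining -= take
        degree += 1
    return total
```

For $q = 2$ this gives $T_2(s) = 0, 0, 1, 3, 5, 8, 11, 14$ for $s = 1, \ldots, 8$.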
We choose
$$w_u \in \mathcal{L}(D - n_u P_{\infty}) \setminus \mathcal{L}(D - (n_u + 1) P_{\infty}) \quad \text{for } 0 \le u \le g.$$
14.8.75 Theorem [3018] Let $F$ be an algebraic function field with full constant field $\mathbb{F}_q$ and genus
$g$ which contains at least one rational place $P_{\infty}$. Let $D$ be a divisor of $F$ with $\deg(D) = 2g$
and $P_{\infty} \notin \mathrm{supp}(D)$ and let $P_1, \ldots, P_s$ be distinct places of $F$ with $P_i \ne P_{\infty}$ for $1 \le i \le s$.
Then the corresponding Niederreiter-Xing sequence is a digital $(t, s)$-sequence over $\mathbb{F}_q$ with
$t = g + \sum_{i=1}^{s} (\deg(P_i) - 1)$.
14.8.76 Corollary [2282] For every prime power q and every dimension s ≥ 1, there exists a digital
(Vq (s), s)-sequence over Fq , where Vq (s) = min {g ≥ 0 : Nq (g) ≥ s + 1} and Nq (g) is the
maximum number of rational places that an algebraic function field with full constant field
Fq and genus g can have.
14.8.77 Remark It was shown in [2282] that Vq (s) = O(s) as s → ∞. Since tq (s) ≤ dq (s) ≤ Vq (s) by
Remark 14.8.63 and Corollary 14.8.76, we obtain tq (s) = O(s) and dq (s) = O(s) as s → ∞.
In view of Theorem 14.8.54, these asymptotic bounds are best possible.
14.8.78 Remark The only improvements on Niederreiter-Xing sequences were obtained, in some
special cases, in the more recent paper [2266]. For instance, let q be an arbitrary prime
power and let s = q + 1. Then tq (q + 1) = dq (q + 1) = 1. On the other hand, the construction
in [2266] yields a digital (T, q + 1)-sequence over Fq with T(m) = 0 for even m ≥ 2 and
T(m) = 1 for odd m ≥ 1.
See Also
References Cited: [271, 276, 835, 836, 837, 838, 839, 1045, 1858, 1873, 1874, 2007, 2008,
2009, 2183, 2234, 2236, 2238, 2247, 2248, 2251, 2255, 2256, 2258, 2259, 2261, 2263, 2266,
2267, 2282, 2283, 2398, 2483, 2565, 2690, 3018]
14.9.1 Remark The performance of several applications of polynomials, frequently primitive,
depends on the weights of multiples of the base polynomial. Many of these applications are
discussed in this Handbook.
14.9.2 Remark The multiples of a polynomial $f$ with weight $w$ influence the statistical bias of the
linear feedback shift register sequence generated from $f$. Having fewer multiples of a given
weight $w$ reduces the $w$-th moment of the Hamming weight [1622, 1944]. For more information on
bias and randomness of linear feedback shift register sequences see Section 10.2.
14.9.3 Remark In Section 15.1 the use of primitive polynomials $f$ to generate cyclic redundancy
check codes is discussed. The undetectable error patterns of these codes are precisely those
whose errors correspond to multiples of $f$. Consequently, burst errors of length up to $\deg(f)$
are always detectable, and understanding how many arbitrary errors can be detected requires
knowledge of the weights of multiples of $f$.
14.9.4 Remark In Section 15.4, turbo codes are discussed. Turbo codes use feedback polynomials
that are often primitive. The bit error rate (BER) of the turbo code’s interleaver design
depends on the weights of polynomials divisible by the feedback polynomial [2513].
14.9.5 Remark Low weight multiples of a public polynomial compromise the private key for the
TCHo cryptosystem and its security therefore rests on the difficulty of finding low weight
multiples [146, 1491]. The weight of polynomials and their multiples is important in linear
feedback shift register cryptanalysis and certain attacks depend on the sparsity of the
feedback polynomial or one of its multiples [2074]. Chapter 16 discusses the many connections
between finite fields and cryptography.
14.9.6 Remark We present a discussion of applications of polynomials and the weights of their
multiples to the construction and strength of orthogonal arrays.
14.9.7 Definition An orthogonal array of size $N$, with $k$ constraints (or $k$ factors or of degree
$k$), $s$ levels (or of order $s$), and strength $t$, denoted $\mathrm{OA}(N, k, s, t)$, is a $k \times N$ array
(sometimes $N \times k$) with entries from a set of $s \ge 2$ symbols, having the property that
in every $t \times N$ submatrix, every $t \times 1$ column vector appears the same number $\lambda = N/s^t$
of times. The parameter $\lambda$ is the index of the orthogonal array. An $\mathrm{OA}(N, k, s, t)$ is also
denoted by $\mathrm{OA}_{\lambda}(t, k, s)$.
14.9.8 Remark From the definition, an orthogonal array of strength $t$ is also an orthogonal array
of strength $t'$ for all $1 \le t' \le t$.
14.9.9 Theorem [357, 1457] Let C be a linear code over Fq with words of length n. Then the
n × |C| array formed with the words of C as the columns is a (linear) orthogonal array of
maximal strength t if and only if C ⊥ , its dual code, has minimum weight t + 1.
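Theorem 14.9.9 can be checked by brute force on small codes. The sketch below assumes a prime field $\mathbb{F}_q$ (arithmetic modulo $q$), represents codewords as tuples, and stores the array with codewords as rows rather than columns, which does not affect the strength; the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations, product


def span(generators, q):
    """All F_q-linear combinations of the generator words (q prime)."""
    n = len(generators[0])
    words = set()
    for coeffs in product(range(q), repeat=len(generators)):
        words.add(tuple(sum(c * g[i] for c, g in zip(coeffs, generators)) % q
                        for i in range(n)))
    return sorted(words)


def strength(words, q):
    """Largest t such that every choice of t coordinates shows each t-tuple equally often."""
    n = len(words[0])
    t = 0
    for size in range(1, n + 1):
        for coords in combinations(range(n), size):
            counts = Counter(tuple(w[i] for i in coords) for w in words)
            if len(counts) != q ** size or len(set(counts.values())) != 1:
                return t
        t = size
    return t


def dual_minimum_weight(generators, q):
    """Minimum weight of the dual code (assumes the dual code is nonzero)."""
    n = len(generators[0])
    return min(sum(1 for x in v if x)
               for v in product(range(q), repeat=n)
               if any(v) and all(sum(a * b for a, b in zip(v, g)) % q == 0
                                 for g in generators))
```

For the binary even-weight code of length 3, the dual is the repetition code with minimum weight 3, and the array formed from the four codewords has strength 2, as the theorem predicts.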
14.9.10 Remark The half of Theorem 14.9.9 that gives the strength of the orthogonal array from
the minimum weight of the dual code was known as early as 1947 [1457, 1727]. Delsarte was
able to generalize Theorem 14.9.9 to the case where the code and the orthogonal array are
not required to be linear [801]. We can extend Theorem 14.9.9 and exactly determine the
number of times each vector appears in any (t + 1) × n submatrix of the orthogonal array.
14.9.11 Theorem [2204] Let $C$ be a linear code of length $n$ over $\mathbb{F}_q$ and assume that the words
of $C$ form the columns of an orthogonal array of strength $t$. Then for any $(t+1)$-subset
$T = \{i_1, \ldots, i_{t+1}\} \subset \{1, \ldots, n\}$ and for any $(t+1)$-tuple $b \in \mathbb{F}_q^{t+1}$, the number of times that $b$
appears as a column of the $(t + 1) \times n$ submatrix determined by $T$, $\lambda_b^T(C)$, is
14.9.12 Theorem [2204] Let $f$ be a primitive polynomial of degree $m$ over $\mathbb{F}_q$ and let $C_n^f$ be the
set of all subintervals of the shift-register sequence with length $n$ generated by $f$, together
with the zero vector of length $n$. The dual code of $C_n^f$ is given by
$$(C_n^f)^{\perp} = \Big\{(b_1, \ldots, b_n) : \sum_{i=0}^{n-1} b_{i+1} x^i \text{ is divisible by } f\Big\}.$$
14.9.13 Remark [2204] Munemasa only proves Theorem 14.9.12 over F2 but the proof works more
generally for any finite field.
14.9.14 Remark [2354] The primitivity condition in Theorem 14.9.12 can be substantially relaxed
to polynomials with distinct roots.
14.9.15 Remark The combined effect of Theorems 14.9.9 and 14.9.12 is that to know the strength
of the orthogonal array derived from a polynomial f , and its shift register sequences, it is
essential to know about the weights of multiples of f .
0000100101100111110001101110101.
14.9.18 Remark The set of cards from a standard deck which contains the Ace, 2, 3, 4, 5, 6, and
7 of each suit and the 8 of spades, clubs, and hearts can be encoded uniquely with the
non-zero binary words of length 5. The first digit encodes the card's color, 0 for red and
1 for black. The second digit encodes whether the suit is major or minor in bridge: 0 for
clubs or diamonds; 1 for hearts or spades. The remaining three digits encode the value of
the card via the last three digits in the binary representation of the card’s value: 000 for 8,
001 for Ace, 010 for 2, 011 for 3, 100 for 4, 101 for 5, 110 for 6, and 111 for 7. This encoding
has the property that the first digit in a card’s code corresponds to the color of that card.
Other encodings have the required properties as well [833].
14.9.19 Remark Using the shift register sequence from Remark 14.9.17 and the card encoding from
Remark 14.9.18 we obtain the following sequence of cards:
A♦, 2♦, 4♦, A♥, 2♣, 5♦, 3♥, 6♣, 4♥, A♠, 3♣, 7♦, 7♥, 7♠, 6♠, 4♠,
8♠, A♣, 3♦, 6♦, 5♥, 3♠, 7♣, 6♥, 5♠, 2♠, 5♣, 2♥, 4♣, 8♥, 8♣
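The card order above can be reproduced mechanically. The sketch below assumes the linear recurrence $a_{k+5} = a_k + a_{k+2}$ over $\mathbb{F}_2$ associated with $f = x^5 + x^2 + 1$ and the initial state 00001, reads the 31 cyclic windows of length 5, and decodes each window with the encoding of Remark 14.9.18. The function and table names are illustrative.

```python
SUITS = {(0, 0): "♦", (0, 1): "♥", (1, 0): "♣", (1, 1): "♠"}  # (color, major) -> suit
VALUES = ["8", "A", "2", "3", "4", "5", "6", "7"]               # index = last three bits


def lfsr_sequence(initial=(0, 0, 0, 0, 1), length=31):
    """Bits generated by a_{k+5} = a_k + a_{k+2} (mod 2)."""
    a = list(initial)
    while len(a) < length:
        a.append(a[-5] ^ a[-3])
    return a


def decode(word):
    """Map a 5-bit word (color, major/minor, 3 value bits) to a card label."""
    color, major, b2, b1, b0 = word
    return VALUES[4 * b2 + 2 * b1 + b0] + SUITS[(color, major)]


def card_order(bits):
    """Cards read from the cyclic windows of length 5, starting at each position."""
    n = len(bits)
    return [decode(tuple(bits[(k + i) % n] for i in range(5))) for k in range(n)]
```

Because consecutive windows overlap in four bits, knowing the colors (first bits) of five consecutive cards determines the first card's full word, and the recurrence then yields the remaining four.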
14.9.20 Remark A deck of these 31 cards arranged in this order looks upon casual inspection to
be randomly ordered. The deck can be cut arbitrarily many times (since the shift register
sequence is cyclic) before removing five cards in sequence from the top of the deck. With
the knowledge which cards are black, the identity of all five chosen cards can be determined
[832].
14.9.21 Remark Due to the low weight of the primitive polynomial $f = x^5 + x^2 + 1$, the encoding
scheme and the generating polynomial are simple enough to be quickly calculated mentally,
which is important for the appearance of the trick [832].
14.9.22 Remark Much can be done to augment the impression this trick makes on an audience.
For ideas see [832, 833, 2169]. Two sets of these 31 cards with identical backs can be placed
in this order repeated to give the impression of a more normal sized deck.
14.9.23 Remark The 8♦, corresponding to the binary string 00000, can be added to the deck be-
tween the 8♣ and A♦. This deviation from the linear shift register can simply be memorized
ad-hoc or a new, nonlinear shift register sequence memorized:
ak+5 = (1 + ak+1 · ak+2 · ak+3 · ak+4 )(ak + ak+2 ) + (ak · ak+1 · ak+2 · ak+3 · ak+4 ),
14.9.26 Remark Polynomials in $\mathbb{F}_2[x]$ which have large weight or large degree will sometimes be
given in hexadecimal notation. For example the polynomial $x^8 + x^7 + x^6 + x^5 + x + 1$
is 111100011 in binary notation and, grouping these from the right into fours, is 1E3 in
hexadecimal notation. The use of the two notations for polynomials will always be clear
in the context. Be aware that some authors in the literature denote binary polynomials in
hexadecimal after deleting the rightmost 1, since most polynomials used in applications have
a constant term 1, so it can be assumed present in many contexts.
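The conversion between the two notations is a direct bit manipulation. In the following sketch a polynomial over $\mathbb{F}_2$ is given by the list of exponents of its nonzero terms; the function names are illustrative, and the second function implements the alternative convention in which the constant term 1 is dropped before conversion.

```python
def poly_to_binary_and_hex(exponents):
    """Binary and hexadecimal strings of the F_2-polynomial with the given exponents."""
    value = 0
    for e in exponents:
        value ^= 1 << e           # set the bit for x^e
    return format(value, "b"), format(value, "X")


def hex_without_constant_term(exponents):
    """Hexadecimal string after deleting the rightmost bit (the constant term)."""
    value = 0
    for e in exponents:
        value ^= 1 << e
    return format(value >> 1, "X")
```

For the polynomial with binary notation 111100011, the exponents are 8, 7, 6, 5, 1, 0, and the two conventions give 1E3 and F1 respectively.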
14.9.27 Definition The set of polynomials of degree $d$ in $\mathbb{F}_q[x]$ is denoted by $P_{q,d}$. For $f \in \mathbb{F}_q[x]$,
the dual code of length $n$, $(C_n^f)^{\perp}$, defined in Theorem 14.9.12 can be identified with all
polynomials divisible by $f$ of degree less than $n$. The minimum weight of a polynomial
from this set is denoted by $d((C_n^f)^{\perp})$. This is also the minimum weight of the code
$(C_n^f)^{\perp}$.
14.9.28 Remark We begin with some general bounds on d((Cnf )⊥ ), followed by results for polyno-
mials f of specific degree and end with results for polynomials f of specific weights.
14.9.29 Proposition An application of Theorem 14.9.12 with bounds on the period of polynomials
gives that if $f \in P_{q,m}$, then $d((C_n^f)^{\perp}) = 2$ for all $n \ge q^m - 1$.
14.9.30 Theorem [2083] Let $f \in P_{2,m}$ and $0 \le t \le (m - 1)/2$. Let $n_1(t)$ be the smallest positive
integer such that
$$\sum_{j=0}^{t+1} \binom{n_1(t)}{j} > 2^m.$$
Set $n_2(0) = \infty$ and for $t > 0$, let $n_2(t) = 2^{\lfloor (m-1)/t \rfloor} - 1$. If $n_1(t) < n_2(t)$, then for all
$n_1(t) \le n \le n_2(t)$, $d((C_n^f)^{\perp}) \le 2t + 2$. In other words, for such $n$, there will always be a
multiple of $f$ of weight less than $2t + 3$ and degree less than $n$.
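The two thresholds in Theorem 14.9.30 are simple to compute. The sketch below evaluates $n_1(t)$ by direct search and $n_2(t)$ from its closed form; the function names mirror the notation of the theorem but are otherwise illustrative.

```python
from math import comb, inf


def n1(m, t):
    """Smallest positive n with sum_{j=0}^{t+1} C(n, j) > 2^m."""
    n = 1
    while sum(comb(n, j) for j in range(t + 2)) <= 2 ** m:
        n += 1
    return n


def n2(m, t):
    """n_2(0) = infinity, and n_2(t) = 2^floor((m-1)/t) - 1 for t > 0."""
    return inf if t == 0 else 2 ** ((m - 1) // t) - 1
```

For $m = 10$ and $t = 1$ this gives $n_1(1) = 45$ and $n_2(1) = 511$, so for every $f \in P_{2,10}$ and every $45 \le n \le 511$ there is a multiple of $f$ of weight at most 4 and degree less than $n$.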
14.9.31 Theorem [2083] Let $e = \lfloor (m-1)/t \rfloor$ and $n_2(t) = 2^e - 1$. Let α be a primitive element in
$\mathbb{F}_{2^e}$ and $M^{(i)}(x)$ be the minimal polynomial of $\alpha^i$. Let
then $d((C_{2^e-1}^g)^\perp) \ge 2t + 2$ and the BCH code (see Section 15.1) generated by g can be
truncated to a code meeting the bound in Theorem 14.9.30 for any admissible $n_1(t) \le n \le n_2(t)$.
14.9.32 Proposition [2083] In Theorem 14.9.30, $n_1(t) \le t + 2^{m/(t+1)} ((t+1)!)^{1/(t+1)}$, and whenever
$m > (t+1)^2 + t(t+1)\log_2(t+1)$, we have $n_1(t) < n_2(t)$.
14.9.33 Theorem [1591] If $f \in P_{2,m}$ is primitive and if $g = x^n + x^k + 1$ is the trinomial multiple of
f with minimum degree then
$$n \le \frac{2^m + 2}{3}.$$
14.9.34 Proposition If x + 1 is a factor of f ∈ F2 [x] then f does not divide any polynomials of odd
weight.
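The parity argument behind Proposition 14.9.34 can be written out in one line. The derivation below is a short sketch; $w(g)$ denotes the weight of g, as elsewhere in this section, and $g_i$ denotes the coefficient of $x^i$ in g.

```latex
% Suppose (x + 1) divides f and g = f h in F_2[x]. Evaluating at x = 1:
\[
  g(1) = f(1)\,h(1) = 0 \cdot h(1) = 0,
  \qquad
  g(1) = \sum_{i \,:\, g_i = 1} 1 \equiv w(g) \pmod{2},
\]
% hence w(g) is even for every multiple g of f.
```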
14.9.35 Lemma [558] Let $f \in \mathbb{F}_2[x]$ have simple roots, period n and suppose (1 + x) is a factor of
f. If $d((C_n^f)^\perp) = d$ then $d((C_{n+1}^{(x+1)f})^\perp) = d$ and $d((C_j^{(x+1)f})^\perp) = 4$ for n + 2 ≤ j ≤ 2n.
14.9.36 Theorem [1591] Let f ∈ F2 [x] be an irreducible polynomial with period ρ. Then f divides
a trinomial if and only if $\gcd(x^\rho + 1, (x + 1)^\rho + 1) \ne 1$.
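The gcd in Theorem 14.9.36 can be evaluated with integer bit operations, storing a polynomial over $\mathbb{F}_2$ as an integer whose bit i is the coefficient of $x^i$. The sketch below computes that gcd and, independently, searches directly for a trinomial multiple; all function names are assumptions for illustration, and the comparison is made only on three small irreducible polynomials.

```python
def pmod(a, b):
    """Remainder of a modulo b in F_2[x] (polynomials stored as ints)."""
    db = b.bit_length()
    while a.bit_length() >= db:
        a ^= b << (a.bit_length() - db)
    return a


def pmul(a, b):
    """Carry-less product of a and b in F_2[x]."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def pgcd(a, b):
    while b:
        a, b = b, pmod(a, b)
    return a


def period(f):
    """Smallest rho with f | x^rho + 1; assumes f has constant term 1."""
    r, rho = pmod(2, f), 1
    while r != 1:
        r, rho = pmod(r << 1, f), rho + 1
    return rho


def gcd_criterion(f):
    """True when gcd(x^rho + 1, (x + 1)^rho + 1) != 1 for rho = period(f)."""
    rho = period(f)
    power = 1
    for _ in range(rho):
        power = pmul(power, 3)          # 3 encodes x + 1
    return pgcd((1 << rho) | 1, power ^ 1) != 1


def has_trinomial_multiple(f):
    """Direct search: exponents may be reduced modulo the period of f."""
    rho = period(f)
    return any(pmod((1 << n) | (1 << k) | 1, f) == 0
               for n in range(2, rho) for k in range(1, n))


for f in (0xB, 0x13, 0x1F):             # x^3+x+1, x^4+x+1, x^4+x^3+x^2+x+1
    print(hex(f), period(f), gcd_criterion(f), has_trinomial_multiple(f))
```

On these three inputs the two tests agree: the two trinomials 0xB and 0x13 pass both, while 0x1F (period 5) fails both.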
14.9.37 Theorem [1591] If $f \in P_{2,m}$ is primitive and if $g = x^n + x^k + 1$ is a trinomial divisible by
f then n and k belong to the same-length cyclotomic coset modulo $2^m - 1$.
14.9.38 Theorem [1591] All primitive $f \in P_{2,m}$ divide some 4-nomial of degree no bigger than
$$\left\lfloor \frac{1 + \sqrt{1 + 4 \cdot 2^{m+1}}}{2} \right\rfloor.$$
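The degree bound of Theorem 14.9.38 can be evaluated exactly with integer arithmetic, and for small m a brute-force search confirms that a weight 4 multiple exists within it. This is a sketch under stated assumptions: the function names are illustrative, and the two primitive polynomials 0x13 ($x^4 + x + 1$) and 0x25 ($x^5 + x^2 + 1$) are chosen only as small test cases.

```python
from math import isqrt


def four_nomial_degree_bound(m):
    """floor((1 + sqrt(1 + 4 * 2^(m+1))) / 2), computed without floats."""
    return (1 + isqrt(1 + 4 * 2 ** (m + 1))) // 2


def clmul(a, b):
    """Carry-less product in F_2[x] (polynomials stored as ints)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def has_weight4_multiple(f, max_degree):
    """Is there a multiple of f with exactly 4 terms and degree <= max_degree?"""
    m = f.bit_length() - 1
    return any(bin(clmul(f, h)).count("1") == 4
               for h in range(1, 1 << (max_degree - m + 1)))


for f in (0x13, 0x25):
    m = f.bit_length() - 1
    bound = four_nomial_degree_bound(m)
    print(hex(f), bound, has_weight4_multiple(f, bound))
```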
14.9.54 Remark The assumption in Theorem 14.9.53 is motivated by Theorem 14.9.52 and empirical
evidence. See [1362] for precise definition of the assumption and detailed discussion.
14.9.55 Remark Theorem 14.9.53 implies that it is highly likely to get a trinomial multiple with
degree no more than $2^{m/2+2}$. This is in contrast to the bound of $(2^m + 2)/3$ from
Theorem 14.9.33. In general Theorem 14.9.53 suggests that to avoid sparse multiples, f must be
picked with very large degree.
14.9.56 Remark In [1994], Maitra, Gupta, and Venkateswarlu extend this enumerative and proba-
bilistic analysis to include the product of primitive polynomials.
14.9.57 Proposition The bounds on weights of multiples of all polynomials from degree 4 to degree
16 and degrees 24 and 32 in F2 [x] are given in Table 14.9.2.2. For degrees 4 ≤ m ≤ 16,
Koopman and Chakravarty exhaustively searched all polynomials of degree m and all their
multiples of degrees m + 8 ≤ n ≤ m + 2048 [1793]. The m = 16 results are from the
theoretical work of Merkey and Posner [2083] and exhaustive searches by Castagnoli, Ganz,
and Graber [559]. The bounds on weights of multiples of degree 24 polynomials, which are
less complete than those for smaller m, are the work of Merkey and Posner [2083] and
searches by Castagnoli, Ganz, and Graber [559] and Ray and Koopman [2439]. In all cases
for m = 24 polynomials attaining the bounds are reported to be known although the specific
polynomials have not been published [559, 2083, 2439]. The even more incomplete results
for m = 32 are reported in [559, 2083].
14.9.58 Example Table 14.9.2.2 gives bounds that apply to every polynomial with the given degree.
To aid the reading of Table 14.9.2.2, we give an example from it. The information from the
deg(f)   degree range n of multiples of f   upper bound on $d((C_n^f)^\perp)$   polynomial (in hexadecimal notation) attaining the bound
4 12 ≤ n ≤ 15 3 f = 13
5 13 ≤ n ≤ 15 4 f = 2B
16 ≤ n ≤ 31 3 f = 25
6 14 ≤ n ≤ 31 4 f = 59
32 ≤ n ≤ 63 3 f = 43
7 15 ≤ n ≤ 63 4 f = B7
64 ≤ n ≤ 127 3 f = 91
8 16 ≤ n ≤ 17 5 f = 139
18 ≤ n ≤ 127 4 f = 12F
128 ≤ n ≤ 255 3 f = 14D
9 n = 17 6 f = 13C
18 ≤ n ≤ 22 5 f = 30B
23 ≤ n ≤ 255 4 f = 297
256 ≤ n ≤ 511 3 f = 2CF
10 18 ≤ n ≤ 22 6 f = 51D
23 ≤ n ≤ 31 5 f = 573
32 ≤ n ≤ 511 4 f = 633
512 ≤ n ≤ 1023 3 f = 64F
11 19 ≤ n ≤ 23 7 f = AE1
24 ≤ n ≤ 33 6 f = A65
34 ≤ n ≤ 36 5 f = BAF
37 ≤ n ≤ 1023 4 f = B07
1024 ≤ n ≤ 2047 3 f = C9B
12 20 ≤ n ≤ 23 8 f = 149F
24 ≤ n ≤ 39 6 f = 1683
40 ≤ n ≤ 65 5 f = 11F1
66 ≤ n ≤ 2047 4 f = 180F
2048 ≤ n ≤ 2060 3 f = 16EB
13 21 ≤ n ≤ 24 8 f = 216F
n = 25 7 f = 254B
26 ≤ n ≤ 65 6 f = 3213
66 ≤ n ≤ 2061 4 f = 2055
14 22 ≤ n ≤ 25 8 f = 46E3
26 ≤ n ≤ 27 7 f = 5153
28 ≤ n ≤ 71 6 f = 6E57
72 ≤ n ≤ 127 5 f = 425B
128 ≤ n ≤ 2062 4 f = 43D1
15 23 ≤ n ≤ 27 8 f = C617
28 ≤ n ≤ 31 7 f = B7AB
32 ≤ n ≤ 129 6 f = AE75
128 ≤ n ≤ 191 5 f = D51B
192 ≤ n ≤ 2063 4 f = 92ED
16 n = 18 12 f = 15BED
19 ≤ n ≤ 21 10 f = 1D22F
n = 22 9 f = 18F57
23 ≤ n ≤ 31 8 f = 11FB7
32 ≤ n ≤ 35 7 f = 126B5
36 ≤ n ≤ 151 6 f = 13D65
152 ≤ n ≤ 257 5 f = 15935
258 ≤ n ≤ 32767 4 f = 1A2EB
32768 ≤ n ≤ 65535 3 f = 1002D
24 18 ≤ n ≤ 47 12
48 ≤ n ≤ 50 10
51 ≤ n ≤ 63 9
64 ≤ n ≤ 129 8
130 ≤ n ≤ 255 7
466 ≤ n ≤ 2^11 − 1 6
5793 ≤ n ≤ 2^23 − 1 4
32 n = 18 12
568 ≤ n ≤ 2^10 − 1 8
2954 ≤ n ≤ 2^15 − 1 6
92682 ≤ n ≤ 2^31 − 1 4
f   standard   minimum distance $d((C_n^f)^\perp)$: 12 11 10 9 8 7
13D65 IEC TC57 after 1990 ranges of n: [17,20] [21,22]
1F29F ranges of n: 17 [18,22]
15B93 IEC TC57 before 1990 ranges of n: [17,19] [20,25]
15935 ranges of n: [17,19] [20,24] [25,26]
16F63 IEEE WG77.1 ranges of n: 17 18 [19,29]
1A2EB ranges of n: [17,18] [19,27]
1011B ranges of n:
1A097 IBM SDLC ranges of n: [17,24]
11021 CRC-CCITT ranges of n:
18005 CRC-ANSI ranges of n:
f   standard   minimum distance $d((C_n^f)^\perp)$: 6 5 4 2
13D65 IEC TC57 after 1990 ranges of n: [23,151] [151,∞]
1F29F ranges of n: [23,130] [131,258] [259,∞]
15B93 IEC TC57 before 1990 ranges of n: [26,128] [129,254] [255,∞]
15935 ranges of n: [27,51] [52,257] [258,∞]
16F63 IEEE WG77.1 ranges of n: 30 [31,255] [256,∞]
1A2EB ranges of n: [28,109] [110,32767] [32768,∞]
1011B ranges of n: [17,115] [116,28658] [28659,∞]
1A097 IBM SDLC ranges of n: [25,83] [84,32766] [32767,∞]
11021 CRC-CCITT ranges of n: [17,32767] [32768,∞]
18005 CRC-ANSI ranges of n: [17,32767] [32768,∞]
third line of the section for polynomials of degree 11 indicates that for every binary degree
11 polynomial $f \in P_{2,11}$, there exist multiples of f which have degrees 34, 35, and 36 and
weight less than or equal to 5. The polynomial cited in the last column, f(x) = BAF =
$x^{11} + x^9 + x^8 + x^7 + x^5 + x^3 + x^2 + x + 1$, meets this bound tightly; that is, all of its multiples
of degree 34, 35, or 36 have weight 5 or above.
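For small degrees, an entry of Table 14.9.2.2 can be checked by exhaustive search. The sketch below follows the convention of Definition 14.9.27 (nonzero multiples of f of degree less than n); the function names are assumptions for illustration, and the search is exponential in n − deg(f), so it is only practical for the first few rows of the table.

```python
def clmul(a, b):
    """Carry-less product in F_2[x] (polynomials stored as ints)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def min_multiple_weight(f, n):
    """Minimum weight over all nonzero multiples of f of degree < n."""
    m = f.bit_length() - 1
    if n <= m:
        raise ValueError("n must exceed deg(f)")
    # deg(f * h) < n  <=>  deg(h) < n - m  <=>  0 < h < 2^(n - m)
    return min(bin(clmul(f, h)).count("1") for h in range(1, 1 << (n - m)))


print(min_multiple_weight(0x13, 15))   # f = 13, degree 4
print(min_multiple_weight(0x2B, 15))   # f = 2B, degree 5
```

With these definitions, f = 13 gives 3 at n = 15 and f = 2B gives 4 at n = 15, matching the table, and both drop to 2 at n = 16, where $x^{15} + 1$ becomes available as a multiple.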
14.9.59 Remark In Table 14.9.2.2, three of the degree 16 polynomials meeting the bounds are
known to be unique. For $d((C_n^f)^\perp) = 6$, f = 13D65 and for $d((C_n^f)^\perp) = 4$, f = 1A2EB are the
unique tight polynomials, up to reciprocal. For $d((C_n^f)^\perp) = 5$, f = 15935 is unique [559].
14.9.60 Remark In contrast to Table 14.9.2.2, Tables 14.9.2 through 14.9.4 give the distance dis-
tributions of multiples of a few, specific polynomials for degrees 16, 24, and 32.
14.9.61 Remark [559] Table 14.9.2 gives the distance profiles of ten specific polynomials in $P_{2,16}$
found by Castagnoli, Ganz, and Graber. They exhaustively searched all degree 16 polynomials
for those with optimum profiles. The polynomial f = 1F29F is the unique polynomial
with $d((C_{130}^f)^\perp) = 6$ and $d((C_{258}^f)^\perp) = 4$. Up to reciprocal, f = 1011B is the unique
polynomial with $d((C_{28658}^f)^\perp) = 4$ and $d((C_{115}^f)^\perp) = 6$. The authors of [559] suggest that
any cyclic redundancy check polynomial of degree 16 should be chosen only from the list
{13D65, 1F29F, 15935, 1A2EB, 1011B}.
14.9.62 Example The third polynomial in Table 14.9.2 gives the distance distribution for the
polynomial f(x) = 15B93 = $x^{16} + x^{14} + x^{12} + x^{11} + x^9 + x^8 + x^7 + x^4 + x + 1$, which was the IEC
TC57 standard cyclic redundancy check polynomial until 1990. All its multiples of degrees
17–19 have weight 10 or more. All its multiples of degrees 20–25 have weight 8 or more. All
its multiples of degrees 26–128 have weight 6 or more. All its multiples of degrees 129–254
have weight 4 or more (f has even weight, so x + 1 divides f and, by Proposition 14.9.34, no
multiple of f has odd weight). All its multiples of degrees 255 and higher have weight at least 2.
For each degree there exist specific multiples that attain these lower bounds; for example
there is a degree 17 multiple of f with weight 10.
14.9.63 Remark Table 14.9.3 gives the distance distribution for some specific polynomials of de-
gree 24. All were constructed via the generalized BCH code method: multiplying together
f   minimum distance $d((C_n^f)^\perp)$: 16 15 14 12 11 10 9   ref.
1323009 ranges of n: [558, 2083]
1401607 ranges of n: [558, 2083]
1805101 ranges of n: [2083]
15D6DCB ranges of n: 25 26 [27,36] [558]
17B01BD ranges of n: [25,26] [27,41] [558]
131FF19 ranges of n: 25 [26,33] [558]
15BC4F5 ranges of n: [25,26] [27,28] [29,31] [32,33] [34,35] [558]
1328B63 ranges of n: [25,30] [31,36] [558]
f   minimum distance $d((C_n^f)^\perp)$: 8 7 6 5 4 2   ref.
1323009 ranges of n: [25,68] [69,2048] [2049,4094] [4095,∞] [558, 2083]
1401607 ranges of n: [25,55] [56,2048] [2049,4094] [4095,∞] [558, 2083]
1805101 ranges of n: [25,1023] [2083]
15D6DCB ranges of n: [37,83] [84,2050] [2051,4098] [4099,∞] [558]
17B01BD ranges of n: [42,95] [96,2048] [2049,4094] [4095,∞] [558]
131FF19 ranges of n: [34,37] [38,252] [253,4097] [4098,∞] [558]
15BC4F5 ranges of n: [36,41] [42,47] [78,217] [218,4095] [4096,∞] [558]
1328B63 ranges of n: [37,61] [62,846] [847,2^23 − 1] [2^23,∞] [558]
minimal polynomials of elements from $\mathbb{F}_{2^e}$ and small factors, x and x + 1 [558, 2083]. For
discussion of BCH codes, see Section 15.1.
14.9.64 Remark Table 14.9.4 gives the distance distribution for some specific polynomials of degree
32. All were obtained via the generalized BCH code method: multiplying together minimal
polynomials of elements from $\mathbb{F}_{2^e}$ and small factors, x and x + 1 [558, 2083].
14.9.65 Remark For the third polynomial in Table 14.9.4, used in many standards, Jain [1590]
has determined and published many of the minimum degree polynomials that establish the
ranges given in Table 14.9.4. The actual polynomials are given in Table 14.9.5. Jain has
determined all the polynomials that f divides which have the pattern of at most three
burst errors of length 4 each and several other specific patterns of errors.
14.9.66 Remark Koopman has performed an exhaustive search over all $f \in P_{2,32}$ for 40 ≤ n ≤
131104. His primary concern was finding cyclic redundancy check polynomials which were
simultaneously good at typical Ethernet maximum transmission unit (MTU) lengths, n =
12112, and much longer lengths n ≥ 64,000. Although his search has in principle solved
the $d((C_n^f)^\perp)$ problem for all n in this range, he did not specifically publish these results; rather, he
highlights the last three polynomials given in Table 14.9.4 and compares them to the others
[1792]. Discussion of the benefits and costs of using these various polynomials in different
scenarios appears in [558, 1590, 1792, 2083].
14.9.67 Remark We now present divisibility results that are organized by the weight of the base
polynomial.
14.9.68 Theorem [2204] Let $f(x) = x^m + x^l + 1$ be a trinomial over $\mathbb{F}_2$ such that gcd(m, l) = 1. If
g is a trinomial multiple of f of degree at most 2m, then
f   minimum distance $d((C_n^f)^\perp)$: 20 18 17 16 15 14 13 12 11 10 9 8   ref.
1404098E2 ranges of n: [33,78] [79,1023] [558, 2083]
10884C912 ranges of n: [33,79] [80,1023] [558, 2083]
104C11DB7∗ ranges of n: [33,42] [43,44] [45,53] [54,66] [67,89] [90,123] [558, 1590]
1F1922815 ranges of n: [33,44] [45,48] [49,98] [99,1024] [558]
1F4ACFB13 ranges of n: 33 [34,35] 36 37 [38,43] [44,56] [57,306] [558]
1A833982B ranges of n: [33,35] [36,49] [50,53] [54,59] [60,90] [558]
1572D7285 ranges of n: [33,34] 35 [36,38] [39,52] [53,68] [69,80] [81,110] [558]
11EDC6F41 ranges of n: 33 [34,38] [39,40] [41,52] [53,79] [80,209] [558]
1741B8CD7 ranges of n: [40,48] [49,50] [51,184] [1792]
132583499 ranges of n: [40,48] [49,58] [59,166] [1792]
120044009 ranges of n: [1792]
100210801 ranges of n: [1792]
f   minimum distance $d((C_n^f)^\perp)$: 7 6 5 4 3 2   ref.
1404098E2 ranges of n: [1024,∞] [558, 2083]
10884C912 ranges of n: [1024,∞] [558, 2083]
104C11DB7∗ ranges of n: [124,203] [204,300] [301,3006] [3007,91639] [91640,2^32 − 1] [2^32,∞] [558, 1590]
1F1922815 ranges of n: [1025,2046] [2047,∞] [558]
1F4ACFB13 ranges of n: [307,32768] [32769,65534] [65535,∞] [558]
1A833982B ranges of n: [91,113] [114,1092] [1093,65537] [65538,∞] [558]
1572D7285 ranges of n: [111,266] [267,1029] [1030,65535] [65536,∞] [558]
11EDC6F41 ranges of n: [210,5275] [5276,2^31 − 1] [2^31,∞] [558]
1741B8CD7 ranges of n: [185,16392] [16393,114695] [114696,∞] [1792]
132583499 ranges of n: [167,32769] [32770,65538] [65539,∞] [1792]
120044009 ranges of n: [40,32770] [32771,65538] [65539,∞] [1792]
100210801 ranges of n: [40,65537] [65538,∞] [1792]
$$f(x) = 1 + x + x^2 + x^{m-3} + x^m = (1 + x + x^2)(1 + x^{m-3} + x^{m-2}),$$
$$h(x) = (1 + x) + (x^3 + x^4) + \cdots + (x^{m-7} + x^{m-6}) + x^{m-4},$$
$$f(x)h(x) = g(x) = 1 + x^{2m-6} + x^{2m-4};$$ or
Table 14.9.6 Table of pentanomials which divide trinomials: “p” in type indicates that the given
polynomial f is primitive, “i” indicates that f is irreducible, and “r” indicates that f is reducible.
14.9.70 Remark All primitive polynomials satisfy the gcd condition of Theorems 14.9.68
and 14.9.69, and thus, in particular, Theorems 14.9.68 and 14.9.69 hold for all primitive
trinomials and pentanomials over F2 .
14.9.71 Corollary If $f(x) = x^m + x^l + x^k + x^j + 1$ is primitive over $\mathbb{F}_2$ and not one of the exceptions
given in Table 14.9.6 or their reciprocals, then, for m < n ≤ 2m,
14.9.72 Theorem [826] Let F be any field and f, g, h ∈ F[x], f h = g, w(f) = n > 1 and w(g) = m.
If there exists an $f_0 \in F[x]$ such that $f(x) = f_0(x^k)$ for k > 1 then there exist $g_i \in F[x]$,
$w(g_i) = m_i$ for 0 ≤ i < k such that
$$g(x) = \sum_{i=0}^{k-1} g_i(x^k)\,x^i \qquad (14.9.1)$$
$$m = \sum_{i=0}^{k-1} m_i \quad \text{and} \quad m_i \ne 1. \qquad (14.9.2)$$
14.9.73 Remark Theorem 14.9.72 can be used to simplify the analysis of multiples of f . An example
used in [2354] is given in Corollary 14.9.74 and was used in the proofs of Theorems 14.9.75
and 14.9.76.
14.9.74 Corollary [826] Let F be any field and f, g, h ∈ F[x], f h = g, w(f) = n and w(g) ≤ 3. If
there exists $f_0 \in F[x]$ such that $f(x) = f_0(x^k)$ for k > 1 then there exists $g_0 \in F[x]$ such
that $g(x) = g_0(x^k)$.
14.9.75 Theorem [826] Let $f(x) = a + bx^k + x^m$ (a, b ≠ 0) be a monic trinomial over $\mathbb{F}_3$. If
$g(x) = c + x^n$ (c ≠ 0) is a monic binomial over $\mathbb{F}_3$ with degree at most 3m divisible by f
with g = f h, then f and g are as given in Table 14.9.7.
Table 14.9.7 Polynomials over F3 such that g = f h for monic trinomial f and monic binomial g .
14.9.76 Theorem [826] Let $f(x) = a + bx^k + x^m$ (a, b ≠ 0) be a monic trinomial over $\mathbb{F}_3$. If
$g(x) = c + dx^l + x^n$ (c, d ≠ 0) is a monic trinomial over $\mathbb{F}_3$ with degree at most 3m divisible
by f with g = f h, then
1. $g = f^3$;
2. f and g are as in Table 14.9.8;
3. f and g are reciprocals of polynomials listed in Table 14.9.8.
Table 14.9.8 Table of polynomials such that g = f h with f and g monic trinomials over F3 .
See Also
References Cited: [146, 218, 357, 558, 559, 801, 826, 832, 833, 1362, 1457, 1491, 1590,
1591, 1622, 1727, 1792, 1793, 1834, 1944, 1994, 2074, 2083, 2169, 2204, 2354, 2439, 2513]
In the last two decades, the theory of Ramanujan graphs has gained prominence primarily for
two reasons. First, from a practical viewpoint, they resolve an extremal problem in commu-
nication network theory (see for example [269, 1535]). Second, from a more aesthetic view-
point, they fuse diverse branches of pure mathematics, namely, number theory, representa-
tion theory, and algebraic geometry. The purpose of this survey is to unify some of the recent
developments and expose certain open problems in the area. This survey is an expanded
version of [2208] and is by no means an exhaustive one and demonstrates a highly number-
theoretic bias. For other surveys, we refer the reader to [1535, 1922, 1967, 1968, 2525, 2837].
For a more up-to-date survey highlighting the connection between graph theory and auto-
morphic representations, we refer the reader to Li’s recent survey article [1924].
14.10.1 Definition A graph X is a pair (V, E) consisting of a vertex set V = V (X) and an edge set
E = E(X) which is a multiset of unordered pairs of (not necessarily distinct) vertices.
Each edge consists of two vertices that are called its endpoints. A loop is an edge whose
endpoints are equal. Multiple edges are edges having the same pair of endpoints. A
simple graph is a graph having no loops or multiple edges. If a graph has loops or
multiple edges, we will call it a multigraph. When two vertices u and v are endpoints
of an edge, they are adjacent, and we write u ∼ v to indicate this fact. A directed graph Y
is a pair (W, F ) consisting of a set of vertices W and a multiset F of ordered pairs of
vertices which are called arcs.
14.10.2 Remark All the graphs in this chapter are undirected unless stated explicitly otherwise.
14.10.3 Definition The degree of a vertex v of a graph X, denoted by deg(v), is the number of
edges incident with v, where we count a loop with multiplicity 1. A graph X is k-regular
if every vertex has degree k.
14.10.4 Proposition (Handshaking Lemma) For any simple graph X, $\sum_{v \in V(X)} \deg(v) = 2|E(X)|$.
If X is a k-regular graph with n vertices, then |E(X)| = kn/2.
14.10.6 Remark We remark that the adjacency matrix defined above depends on the labeling of the
vertices of X. Different labelings of the vertices of a graph X may possibly yield different
adjacency matrices. However, all these adjacency matrices are similar to each other (by
permutation matrices) and thus, their spectrum is the same.
14.10.7 Remark One can define an adjacency matrix of a directed graph Y = (W, A) similarly. Given
a labeling of the vertices W of Y , the (x, y)-th entry of the adjacency matrix corresponding
to this labeling equals the number of arcs from x to y. Adjacency matrices of directed graphs
may be non-symmetric.
14.10.8 Example The spectrum of the complete graph $K_n$ on n vertices is $(n-1)^{(1)}, (-1)^{(n-1)}$,
where the exponent signifies the multiplicity of the respective eigenvalue. The Petersen
graph has spectrum $3^{(1)}, 1^{(5)}, (-2)^{(4)}$.
14.10.9 Theorem Let X be a graph on n vertices with maximum degree ∆ and average degree d.
Then d ≤ λ1 ≤ ∆ and |λi | ≤ ∆ for every 2 ≤ i ≤ n.
14.10.11 Definition A path is a walk with no repeated vertices. A cycle is a closed walk with no
repeated vertices except the starting vertex.
14.10.12 Remark A word of caution must be inserted here. In graph theory literature, the distinction
between a walk and a path is as we have defined it above. However, in number theory circles,
the finer distinction is not made and one uses the word “path” to mean a “walk”; see for
example, [2523, 2789].
14.10.13 Definition A graph X is connected if for every two distinct vertices x and y, there is a
path from x to y.
14.10.14 Proposition For every graph X with adjacency matrix A and any integer r ≥ 1, the (x, y)-th
entry of $A^r$ equals the number of walks of length r between x and y.
14.10.15 Proposition The number of closed walks of length r in a graph X with n vertices equals
$\lambda_1^r + \lambda_2^r + \cdots + \lambda_n^r$.
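Propositions 14.10.14 and 14.10.15 can be checked on the two graphs of Example 14.10.8 by comparing the trace of $A^r$ with the power sum of the stated spectrum. The sketch below uses plain lists for matrices; the helper names and the vertex labeling of the Petersen graph (outer 5-cycle 0..4, inner pentagram 5..9) are assumptions for illustration.

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][t] * B[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def closed_walks(A, r):
    """Trace of A^r: the number of closed walks of length r (r >= 1)."""
    P = A
    for _ in range(r - 1):
        P = mat_mul(P, A)
    return sum(P[i][i] for i in range(len(A)))


def adjacency(n, edges):
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] += 1
        A[v][u] += 1
    return A


def complete_graph(n):
    return adjacency(n, [(u, v) for u in range(n) for v in range(u + 1, n)])


def petersen():
    edges = []
    for i in range(5):
        # outer cycle edge, spoke, inner pentagram edge
        edges += [(i, (i + 1) % 5), (i, i + 5), (i + 5, (i + 2) % 5 + 5)]
    return adjacency(10, edges)


for r in range(1, 6):
    # Spectrum 3^(1), 1^(5), (-2)^(4) from Example 14.10.8.
    print(r, closed_walks(petersen(), r), 3 ** r + 5 + 4 * (-2) ** r)
```

For r = 3 both columns are 0, reflecting that the Petersen graph has no triangles.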
14.10.16 Definition An independent set in a graph X is a subset of vertices that are pairwise non-
adjacent. A graph X is bipartite if its vertex set can be partitioned into two independent
sets A and B; X is complete bipartite and denoted by K|A|,|B| if it contains all the edges
between A and B.
14.10.17 Proposition A graph is bipartite if and only if it does not contain any cycles of odd length.
14.10.18 Theorem If X is a k-regular and connected graph with n vertices, then λ1 = k and the
multiplicity of k is 1 with the eigenspace of k spanned by the all 1 vector of dimension n.
If X is a k-regular and connected graph, then X is bipartite if and only if λn = −k.
14.10.19 Definition If X is a k-regular and connected graph, then the eigenvalue k of X is trivial.
All other eigenvalues of X are non-trivial. Let $\lambda(X) = \max |\lambda_i|$, where the maximum is
taken over all non-trivial eigenvalues of X. The parameter λ(X) is called the second eigenvalue
of X by some authors. The second largest eigenvalue of X is $\lambda_2(X)$ and $\lambda(X) \ge \lambda_2(X)$.
14.10.20 Definition The distance d(x, y) between two distinct vertices x and y of a connected graph
X is the length of a shortest path between x and y. The diameter D of a connected
graph X is the maximum of d(x, y), where the maximum is taken over all pairs of distinct
vertices x ≠ y of X.
14.10.21 Remark When k ≥ 3, if X is a k-regular and connected graph with n vertices and diameter
D, then
$$n \le 1 + k + k(k-1) + \cdots + k(k-1)^{D-1} = 1 + k \cdot \frac{(k-1)^D - 1}{k-2}$$
and consequently,
$$D \ge \log_{k-1}\left(\frac{(n-1)(k-2)}{k} + 1\right) > \frac{\log(n-1)}{\log(k-1)} - \frac{\log(k/(k-2))}{\log(k-1)}.$$
Thus, the diameter of any connected
k-regular graph is at least logarithmic in the order of the graph. The next theorem implies
that when the non-trivial eigenvalues of a k-regular connected graph are small, then the
above inequality is tight up to a multiplicative constant.
14.10.22 Theorem [637] If X is a connected non-bipartite k-regular graph with n vertices and
diameter D, then:
$$D \le \frac{\log(n-1)}{\log(k/\lambda(X))} + 1.$$
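The counting bound of Remark 14.10.21 and the spectral bound of Theorem 14.10.22 can be compared numerically. The sketch below evaluates both for the Petersen graph (n = 10, k = 3, λ(X) = 2, using the spectrum from Example 14.10.8); the function names are assumptions for illustration.

```python
from math import log


def moore_bound(k, D):
    """1 + k + k(k-1) + ... + k(k-1)^(D-1): the most vertices a k-regular
    graph of diameter D can have."""
    return 1 + sum(k * (k - 1) ** i for i in range(D))


def diameter_lower_bound(n, k):
    """Smallest D for which moore_bound(k, D) >= n."""
    D = 0
    while moore_bound(k, D) < n:
        D += 1
    return D


def diameter_upper_bound(n, k, lam):
    """Right-hand side of Theorem 14.10.22, with lam = lambda(X) < k."""
    return log(n - 1) / log(k / lam) + 1


print(moore_bound(3, 2))                 # the Petersen graph attains this
print(diameter_lower_bound(10, 3))
print(diameter_upper_bound(10, 3, 2))
```

The Petersen graph has diameter 2, which equals the lower bound and lies below the upper bound of roughly 6.4.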
14.10.23 Remark Kahale [1641] obtained an upper bound on the minimum distance between i subsets
of the same size of a regular graph in terms of the i-th largest eigenvalue in absolute
value. Kahale also constructed examples of k-regular graphs on n vertices having
$\lambda(X) = (1 + o(1))\,2\sqrt{k-1}$ and $D = 2(1 + o(1)) \log_{k-1} n$, showing the previous result is asymptotically
best possible. Here the o(1) term tends to 0 as n goes to infinity.
14.10.24 Remark A similar result can be derived for k-regular bipartite graphs; if X is a bipartite
k-regular and connected graph of diameter D, we have (see Quenell [2433])
$$D \le \frac{\log(n-2)/2}{\log(k/\lambda_0(X))} + 2,$$
where $\lambda_0(X)$ is the maximum absolute value of the eigenvalues of X that are neither k nor −k.
14.10.25 Remark Chung, Faber, and Manteuffel [638] and independently, Van Dam and
Haemers [2839] obtained slight improvements of the previous diameter bounds.
14.10.26 Definition The chromatic number χ(X) of a graph X is the minimum number of colors
that can be assigned to the vertices of a graph such that any two adjacent vertices
have different colors. The largest order of an independent set of vertices of X is the
independence number of X and is denoted by α(X).
14.10.27 Remark The chromatic number of X is the minimum number of independent sets that
partition the vertex set of X and consequently, $\chi(X) \ge \frac{|V(X)|}{\alpha(X)}$.
$$\alpha(X) \le \frac{n(-\lambda_n)}{k - \lambda_n}$$
and so
$$\chi(X) \ge 1 + \frac{k}{-\lambda_n}.$$
14.10.29 Remark An immediate consequence of the previous result is that $\alpha(X) \le \frac{n\lambda(X)}{k + \lambda(X)}$ and
$\chi(X) \ge 1 + \frac{k}{\lambda(X)}$ for any non-bipartite connected k-regular graph X. These facts show that
a good upper bound for the absolute values of the non-trivial eigenvalues of a regular graph
will yield non-trivial bounds for the independence and chromatic number.
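As a worked instance of the two displayed bounds in terms of $\lambda_n$, the sketch below evaluates them for the Petersen graph (n = 10, k = 3, $\lambda_n = -2$) and for $K_5$ (n = 5, k = 4, $\lambda_n = -1$), using the spectra of Example 14.10.8. The function names are assumptions for illustration.

```python
from math import ceil


def independence_upper_bound(n, k, lam_min):
    """n(-lambda_n) / (k - lambda_n) for a k-regular graph on n vertices."""
    return n * (-lam_min) / (k - lam_min)


def chromatic_lower_bound(k, lam_min):
    """1 + k / (-lambda_n)."""
    return 1 + k / (-lam_min)


# Petersen graph: at most 4 independent vertices, at least 3 colours.
print(independence_upper_bound(10, 3, -2), ceil(chromatic_lower_bound(3, -2)))
# K_5: at most 1 independent vertex, at least 5 colours.
print(independence_upper_bound(5, 4, -1), ceil(chromatic_lower_bound(4, -1)))
```

Since the chromatic number is an integer, the bound 2.5 for the Petersen graph is rounded up to 3.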
14.10.30 Remark The following theorem shows that the eigenvalues of a regular graph are closely
related to its edge distribution.
14.10.31 Theorem [81] If X is a k-regular connected graph with eigenvalues $k = \lambda_1 > \lambda_2 \ge \cdots \ge
\lambda_n \ge -k$, let $\lambda := \max(|\lambda_2|, |\lambda_n|)$. For S, T ⊂ V(X), denote by e(S, T) the number of edges
with one endpoint in S and another in T. Then for all S, T ⊂ V(X)
$$\left| e(S,T) - \frac{k|S||T|}{n} \right| \le \lambda \sqrt{|S||T| \left(1 - \frac{|S|}{n}\right)\left(1 - \frac{|T|}{n}\right)} < \lambda \sqrt{|S||T|}.$$
14.10.32 Remark The previous theorem states that k-regular graphs X with small non-trivial eigen-
values (compared to k) have their edges uniformly distributed (similar to random k-regular
graphs). Such graphs are called pseudorandom graphs and are important in many situations
(see Krivelevich-Sudakov’s survey [1807]). Bilu and Linial [282] have obtained a converse of
the previous result of Alon and Chung.
14.10.33 Definition A k-regular connected multigraph X is a Ramanujan multigraph if
$|\lambda_i| \le 2\sqrt{k-1}$ for every eigenvalue $\lambda_i \ne k$. A Ramanujan graph is a Ramanujan multigraph
having no loops nor multiple edges.
14.10.34 Remark We mention that the definition of a Ramanujan graph used by other authors
is slightly weaker. For example, Sarnak in [2525] calls a k-regular graph Ramanujan if
$\lambda_2(X) \le 2\sqrt{k-1}$.
14.10.35 Example The complete graph $K_n$ is an (n − 1)-regular Ramanujan graph as its eigenvalues
are $(n-1)^{(1)}, (-1)^{(n-1)}$, where the exponents denote the multiplicities of the eigenvalues.
The complete bipartite graph $K_{n,n}$ has eigenvalues $n^{(1)}, 0^{(2n-2)}, (-n)^{(1)}$ and is an n-regular
Ramanujan graph.
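Given the spectrum of a k-regular graph, the Ramanujan condition is a one-line test. The sketch below makes one assumption that should be read carefully: it treats both k and −k as trivial eigenvalues, which is what makes $K_{n,n}$ pass as in Example 14.10.35. The hypercube spectrum used as a contrasting case (eigenvalues n − 2i with multiplicity $\binom{n}{i}$) is a standard fact that is not taken from this section, and all function names are illustrative.

```python
from math import comb, sqrt


def is_ramanujan(k, spectrum, tol=1e-9):
    """True when every eigenvalue with |lambda| != k satisfies
    |lambda| <= 2 sqrt(k - 1). Assumption: -k also counts as trivial."""
    nontrivial = [lam for lam in spectrum if abs(abs(lam) - k) > tol]
    return all(abs(lam) <= 2 * sqrt(k - 1) + tol for lam in nontrivial)


def complete_spectrum(n):
    return [n - 1] + [-1] * (n - 1)


def complete_bipartite_spectrum(n):
    return [n] + [0] * (2 * n - 2) + [-n]


def hypercube_spectrum(n):
    return [n - 2 * i for i in range(n + 1) for _ in range(comb(n, i))]


print(is_ramanujan(5, complete_spectrum(6)))
print(is_ramanujan(3, complete_bipartite_spectrum(3)))
print(is_ramanujan(3, [3] + [1] * 5 + [-2] * 4))      # Petersen graph
print(is_ramanujan(7, hypercube_spectrum(7)))          # 5 > 2*sqrt(6)
```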
14.10.36 Remark In [80, p.95], Alon announced a proof with Boppana of the fact that for any
k-regular graph X of order n, $\lambda_2(X) \ge 2\sqrt{k-1} - O((\log_k n)^{-1})$, where the constant in
the O term depends only on k. Many researchers refer to this result as the Alon-Boppana
Theorem. Other researchers refer to the following statement proved by Nilli (pseudonym
for Alon) in [2289] as the Alon-Boppana Theorem.
14.10.37 Theorem [2289] If X is a k-regular and connected graph with diameter D ≥ 2b + 2, then
$$\lambda_2(X) \ge 2\sqrt{k-1} - \frac{2\sqrt{k-1} - 1}{b+1}.$$
14.10.38 Remark The Alon-Boppana Theorem and Remark 14.10.21 imply that if $(X_i)_{i \ge 1}$ is a
sequence of k-regular and connected graphs with $\lim_{i \to +\infty} |V(X_i)| = +\infty$, then
$$\liminf_{i \to \infty} \lambda_2(X_i) \ge 2\sqrt{k-1}.$$
14.10.39 Remark The best lower bound for the second largest eigenvalue $\lambda_2(X)$ of a k-regular graph
of diameter D is due to Friedman [1126] who showed that
$$\lambda_2(X) \ge 2\sqrt{k-1}\cos\theta_{k,t} \ge 2\sqrt{k-1}\cos\frac{\pi}{t+1} \qquad (14.10.1)$$
where t is the largest integer such that D ≥ 2t and $\theta_{k,t} \in \left[\frac{\pi}{t+5}, \frac{\pi}{t+1}\right]$ is the smallest positive
solution of the equation $\frac{k}{2k-2} = \frac{\sin((t+1)\theta)\cos\theta}{\sin(t\theta)}$. The number $2\sqrt{k-1}\cos\theta_{k,t}$ is the largest
eigenvalue of the k-regular tree $T_{k,t}$ of depth t; this tree has a root vertex x and exactly
$k(k-1)^{i-1}$ vertices at distance i from x for each 1 ≤ i ≤ t. Friedman used analytic tools
involving Dirichlet and Neumann eigenvalues for graphs with boundaries to prove (14.10.1).
Later, Nilli [2290] gave an elementary proof of a slightly weaker bound.
14.10.40 Remark We outline here an elementary proof of the inequality $\lambda_2(X) \ge 2\sqrt{k-1}\cos\frac{\pi}{t+1}$ for
every connected k-regular graph X of diameter D ≥ 2t + 2. The first ingredient of the proof
is that the largest eigenvalue of any subgraph induced by a ball of radius t of X is larger
than the largest eigenvalue of $T_{k,t}$. The second is that if u and v are vertices at distance at
least 2t + 2 in X, then the subgraph induced by the vertices at distance at most t from u or v
has exactly two components X(u) and X(v). By Cauchy eigenvalue interlacing, the second
largest eigenvalue of X is greater than the minimum of the largest eigenvalue of X(u) and
X(v) which by the previous argument is at least $2\sqrt{k-1}\cos\theta_{k,t} \ge 2\sqrt{k-1}\cos\frac{\pi}{t+1}$.
14.10.41 Remark At this point, it is worth stating that Friedman [1126] (see also Nilli [2290]) proved
the stronger statement that if X is a k-regular graph containing a subset of r points each
of distance at least 2t from one another, then $\lambda_r(X) \ge 2\sqrt{k-1}\cos\theta_{k,t} \ge 2\sqrt{k-1}\cos\frac{\pi}{t+1}$.
This implies that the r-th largest eigenvalue $\lambda_r(X)$ of any connected k-regular graph X is
at least $2\sqrt{k-1}\left(1 - \frac{\pi^2}{2f^2} + O(f^{-4})\right)$, where $f = \frac{\log_{k-1}(n/r)}{2}$.
14.10.42 Remark One might wonder if the behavior of the negative eigenvalues of a connected k-
regular graph X is similar to the behavior of the negatives of the positive eigenvalues of X.
If X is bipartite, then the spectrum of X is symmetric with respect to 0 and this settles the
previous question. In general, it turns out that additional conditions are needed in order to
obtain similar results for the negative eigenvalues. This is because there are regular graphs
with increasing order whose eigenvalues are bounded from below by an absolute constant.
For example, the eigenvalues of a line graph are at least −2. It turns out that the number
of odd cycles plays a role in the behavior of the negative eigenvalues of regular graphs. The
odd girth of a graph X is the smallest length of a cycle of odd length.
14.10.43 Theorem [1126, 2290] If X is a connected k-regular graph of order n with a subset of r
points each of distance at least 2t from one another, and odd girth at least 2t, then
$$\lambda_{n-r}(X) \le -2\sqrt{k-1}\cos\theta_{k,t} \le -2\sqrt{k-1}\cos\frac{\pi}{t+1}. \qquad (14.10.2)$$
14.10.44 Corollary [1923] If $(X_i)_{i \ge 0}$ is a sequence of k-regular graphs of increasing orders such that
the odd girth of $X_i$ tends to infinity as i → ∞, then $\limsup_{i \to \infty} \mu_l(X_i) \le -2\sqrt{k-1}$, for
each l ≥ 1, where $\mu_l(X)$ denotes the l-th smallest eigenvalue of X.
14.10.45 Theorem [646, 648] For an integer r ≥ 3, let $c_r(X)$ denote the number of cycles of length
r of a graph X. If $(X_i)_{i \ge 0}$ is a sequence of k-regular graphs of increasing orders such that
$\lim_{i \to \infty} \frac{c_{2r+1}(X_i)}{|V(X_i)|} = 0$ for each r ≥ 1, then $\limsup_{i \to \infty} \mu_l(X_i) \le -2\sqrt{k-1}$, for each l ≥ 1.
14.10.46 Remark The difficulty of constructing infinite families of Ramanujan graphs is also illus-
trated by the following result of Serre.
14.10.47 Theorem [2594] For any ε > 0, there exists a positive constant c = c(ε, k) such that
for every k-regular graph X on n vertices, the number of eigenvalues $\lambda_i$ of X such that
$\lambda_i > (2 - \varepsilon)\sqrt{k-1}$ is at least c · n.
14.10.48 Remark Different short and elementary proofs of Serre’s theorem were found indepen-
dently by Nilli [2290] and Cioabă [646, 648]. Nilli’s proof is similar to Friedman’s argument
from [1126] while Cioabă’s proof uses the fact that the trace of $A^l$ is the number of closed
walks of length l. See also [646, 648] for a similar theorem to Theorem 14.10.47 for the
smallest eigenvalues of regular graphs. These proofs, as well as extensions of the Alon-Boppana
theorem (see recent work of Mohar [2118]) rely on the notion of the universal cover of a
graph; see Definitions 14.10.108 and 14.10.109, and Theorem 14.10.110 for more details.
14.10.49 Remark The idea behind all the proofs of Serre’s theorem indicated above is that the
universal cover of a finite k-regular graph is the rooted infinite k-regular tree $T_k$. This implies
that the number of closed walks of even length starting at some vertex of a finite k-regular
graph is at least the number of closed walks of the same length starting at the root of the
infinite k-regular tree. The number $2\sqrt{k-1}$ is the spectral radius of the adjacency operator
of the infinite k-regular tree (for more details see [1126, 1535]). In some circumstances, the
lower bound for the second eigenvalue of a k-regular graph can be improved beyond $2\sqrt{k-1}$
[1927, 2118].
14.10.50 Remark Greenberg and Lubotzky (see Chapter 4 of [1967] or [646, 647] for a short
elementary proof) extended the Alon-Boppana bound to any family of general graphs with
isomorphic universal cover. If $(X_i)_{i \ge 1}$ is a family of finite connected graphs with universal
cover X̃ and ρ is the spectral radius of the adjacency operator of X̃, then $\liminf \lambda_2(X_i) \ge \rho$
as i → +∞. For extensions of the Alon-Boppana theorem and Serre’s theorem for irregular
graphs, see [646, 647, 1534, 2118].
14.10.51 Remark The notion of Ramanujan graph has been extended to hypergraphs and studied in
this setting. Again, these notions lead to the use of the Ramanujan conjecture formulated
for higher GLn in the Langlands program.
14.10.52 Definition A hypergraph X = (V, E) is a pair consisting of a vertex set V and a set of
hyperedges E consisting of subsets of V. If all the edges are of the same size r, X is an
r-uniform hypergraph or r-graph. In the familiar setting of a graph, an edge is viewed as a
2-element subset of V, and a graph is thus a 2-uniform hypergraph. One class of hypergraphs that
is studied is the class of (k, r)-regular hypergraphs, in which each edge contains r elements and
each vertex is contained in k edges. For an ordinary graph, r = 2 and this generalizes the
notion of a k-regular graph. In this special setting, the adjacency matrix A is a |V | × |V |
matrix with zero diagonal entries and the (i, j)-th entry is the number of hyperedges
that contain {i, j}.
14.10.53 Remark One can show easily that k(r − 1) is an eigenvalue of A and this is the trivial
eigenvalue. With this definition in place, a Ramanujan hypergraph is defined as a finite
connected (k, r)-regular hypergraph such that every eigenvalue λ of A with |λ| ≠ k(r − 1)
satisfies
$$|\lambda - (r-2)| \le 2\sqrt{(k-1)(r-1)}.$$
We refer the reader to the important work of Li [1925] for further details.
14.10.54 Definition For any subset A of vertices of a graph X, the edge boundary of A, denoted
∂A, is
\[ \partial A = \{ xy \in E(X) : x \in A,\ y \notin A \}. \]
That is, the edge boundary of A consists of edges with one endpoint in A and another
outside A.
\[ h(X) = \min \left\{ \frac{|\partial A|}{|A|} : A \subset X,\ |A| \le \frac{|V(X)|}{2} \right\}. \]
14.10.56 Definition A family of k-regular graphs (Xi )i≥1 with |V (Xi )| increasing with i, is a family
of expanders if there exists a positive absolute constant c such that h(Xi ) > c for every
i ≥ 1.
14.10.57 Remark Informally, a family of k-regular expanders is a family of sparse (k fixed and
|V (Xi )| → +∞ as i → +∞ imply that the number of edges of Xi is linear in its number
of vertices), but highly connected graphs (h(Xi ) > c means that in order to disconnect Xi ,
one must remove many edges).
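The quantity h(X) can be computed directly on very small graphs, which helps build intuition for the definition of expanders. The following is a minimal brute-force sketch, not an efficient algorithm; the function names and the dictionary-of-neighbour-sets representation are illustrative assumptions, and the minimum is taken over non-empty subsets A (implicit in the definition, since |A| is a denominator).

```python
from itertools import combinations

def edge_expansion(adj):
    """Brute-force h(X) = min |boundary(A)| / |A| over non-empty A with |A| <= |V|/2.

    adj maps each vertex to the set of its neighbours (simple undirected graph).
    The search is exponential in |V|, so this is only meant for tiny graphs.
    """
    vertices = list(adj)
    best = float("inf")
    for size in range(1, len(vertices) // 2 + 1):
        for subset in combinations(vertices, size):
            inside = set(subset)
            # Count edges with exactly one endpoint inside the subset.
            boundary = sum(1 for x in inside for y in adj[x] if y not in inside)
            best = min(best, boundary / size)
    return best

def cycle(n):
    """Adjacency of the n-cycle on vertices 0..n-1."""
    return {v: {(v - 1) % n, (v + 1) % n} for v in range(n)}

def complete(n):
    """Adjacency of the complete graph on vertices 0..n-1."""
    return {v: set(range(n)) - {v} for v in range(n)}
```

For cycles the minimum is attained by an arc of ⌊n/2⌋ consecutive vertices, so h tends to 0 as n grows and cycles do not form an expander family; complete graphs have large h but are not sparse.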
14.10.58 Example [2005] Let Xm denote the following 8-regular graph on m² vertices. The vertex
set of Xm is Z/mZ × Z/mZ. The neighbors of a vertex (x, y) are (x ± y, y), (x, y ± x), (x ±
y + 1, y), (x, y ± x + 1). The family (Xm )m≥4 is the first explicit family of expanders.
Margulis [2005] proved that (Xm )m≥4 are expanders using representation theory, in particular
the fact that the group SL3 (Z) has Kazhdan property T. Groups having this property
or the weaker property τ can be used to construct infinite families of constant-degree
Cayley graph expanders. We refer the reader to [1535, 1967, 1968] for nice descriptions and
explanations of these properties and their relation to expanders.
14.10.59 Remark Expander graphs play an important role in computer science, mathematics, and
the theory of communication networks; see [269, 1535]. These graphs arise in questions about
designing networks that connect many users while using only a small number of switches.
14.10.60 Theorem [80, 2117] If X is a connected k-regular graph, then
\[ \sqrt{k^2 - \lambda_2^2} \ \ge\ h(X) \ \ge\ \frac{k - \lambda_2}{2}. \]
14.10.61 Remark The previous theorem shows that constructing an infinite family of k-regular
expanders (Xi )i≥1 is equivalent to constructing an infinite family of k-regular graphs (Xi )i≥1
such that k − λ2 (Xi ) is bounded away from zero.
14.10.62 Definition Let G be a group written in multiplicative notation and let S be a subset of
elements of G that is closed under taking inverses and does not contain the identity. The
Cayley graph of G with respect to S (denoted by X(G, S)) is the graph whose vertex set
is G where x ∼ y if and only if x⁻¹y ∈ S. If G is abelian, then it is common to use the
additive notation in the definition of X(G, S): x ∼ y if and only if y − x ∈ S.
14.10.63 Remark In general, if S is an arbitrary multiset of G, denote by X(G, S) the directed graph
with vertex set G and arc set {(x, y) : x⁻¹y ∈ S}. If S is inverse-closed and does not contain
the identity, then this graph is undirected and has no loops.
14.10.64 Theorem Let G be a finite abelian group and S a symmetric subset of G of size k. The
eigenvalues of the adjacency matrix of X(G, S) are given by
\[ \lambda_\chi = \sum_{s \in S} \chi(s) \]
then the graph is connected by our earlier remarks. Thus, to construct Ramanujan graphs,
we require
\[ \left| \sum_{s \in S} \chi(s) \right| \le 2\sqrt{k - 1} \]
for every non-trivial irreducible character χ of G. This is the strategy employed in many of
the explicit constructions of Ramanujan graphs.
14.10.67 Example A simple example can be given using Gauss sums. If p ≡ 1 (mod 4) is a prime,
let G = Z/pZ and S = {x² : x ∈ Z/pZ} be the multiset of squares. The multigraph X(G, S)
is easily seen to be Ramanujan in view of the fact (see for example [2207, p. 81])
\[ \left| \sum_{x \in \mathbb{Z}/p\mathbb{Z}} e^{2\pi i a x^2 / p} \right| = \sqrt{p} \]
for any a ≠ 0. By our convention in the computation of degree of a vertex, we see that
X(G, S) is a p-regular graph; see [1807] for other related examples.
14.10.68 Example When q ≡ 1 (mod 4) is a prime power, the Paley graph of order q is the Cayley
graph X(G, S) of the additive group of a finite field G = Fq with respect to the set S of
non-zero squares. This simple and undirected graph has q vertices, is connected and regular
of degree (q − 1)/2, and its non-trivial eigenvalues are (−1 − √q)/2 and (−1 + √q)/2, each of
multiplicity (q − 1)/2. The Paley graph is Ramanujan when q ≥ 9.
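The eigenvalues of a Cayley graph on a cyclic group can be evaluated numerically as character sums, which gives a quick sanity check of the Paley graph spectrum. The sketch below is only an illustration under stated assumptions: it treats the case where q is a prime (so that Fq is Z/qZ), uses the characters a ↦ exp(2πi·a·s/q) of Z/qZ, and all function names are hypothetical.

```python
import cmath
import math

def paley_connection_set(q):
    """Non-zero squares modulo a prime q (assumed to satisfy q % 4 == 1)."""
    return sorted({(x * x) % q for x in range(1, q)})

def cayley_eigenvalues_cyclic(n, S):
    """Eigenvalues sum_{s in S} chi_a(s) of X(Z/nZ, S), chi_a(s) = exp(2*pi*i*a*s/n)."""
    eigs = []
    for a in range(n):
        lam = sum(cmath.exp(2j * math.pi * a * s / n) for s in S)
        eigs.append(lam.real)  # S is symmetric, so the imaginary parts cancel
    return eigs

def is_ramanujan(k, eigs, tol=1e-9):
    """Check |lambda| <= 2*sqrt(k - 1) for every eigenvalue with |lambda| != k."""
    bound = 2 * math.sqrt(k - 1)
    return all(abs(lam) <= bound + tol for lam in eigs if abs(abs(lam) - k) > tol)

if __name__ == "__main__":
    q = 13
    S = paley_connection_set(q)
    eigs = cayley_eigenvalues_cyclic(q, S)
    print(sorted(round(lam, 4) for lam in eigs))
    print(is_ramanujan(len(S), eigs))
```

For prime powers that are not prime, the additive characters involve the trace map and the arithmetic of Fq, which this sketch does not implement.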
14.10.69 Remark The proof of Theorem 14.10.64 is reminiscent of the Dedekind determinant formula
in number theory. This formula computes det A, where A is the matrix whose (i, j)-th
entry is f (ij⁻¹) for any function f defined on the finite abelian group G of order n. The
determinant is
\[ \prod_{\chi} \Bigl( \sum_{g \in G} f(g)\chi(g) \Bigr). \]
14.10.70 Definition Let G be an abelian group written in the additive notation and S ⊂ G. The
sum graph of G with respect to S (denoted by Y (G, S)) has G as vertex set and x ∼ y
if and only if x + y ∈ S.
14.10.71 Theorem [1922, p. 197] Let G be an abelian group. The eigenvalues of Y (G, S) are given
as follows. For each irreducible character χ of G, define
\[ e_\chi = \sum_{s \in S} \chi(s). \]
14.10.73 Theorem [2208] Let G = Fq be a finite field of q = p^m elements and f (x) a polynomial
with coefficients in Fq and of degree 2 or 3. Let S be the multiset
{f (x) : x ∈ Fq }.
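\[ \left| \sum_{x \in \mathbb{F}_q} \exp\bigl( 2\pi i\, \mathrm{Tr}_{\mathbb{F}_q/\mathbb{F}_p}(a f(x)) / p \bigr) \right| \le (\deg f - 1)\sqrt{q} \]
provided f is not identically zero; see [1922, p. 94]. In particular, if f has degree 3, we get
the estimate of 2√q for the exponential sum. For example, if u ∈ Z/pZ and we take
\[ S = \{ x^3 + ux : x \in \mathbb{Z}/p\mathbb{Z} \}, \]
\[ \left| \sum_{x \in \mathbb{Z}/p\mathbb{Z}} \exp\bigl( 2\pi i\, a (x^3 + ux) / p \bigr) \right| \le 2\sqrt{p} \]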
14.10.78 Theorem [834] Let G be a finite group and S a symmetric subset which is stable under
conjugation. The eigenvalues of the Cayley graph X(G, S) are given by
\[ \lambda_\chi = \frac{1}{\chi(1)} \sum_{s \in S} \chi(s) \]
14.10.85 Proposition If X is a k-regular graph with n vertices and adjacency matrix A, then
1. A_0 = I_n , A_1 = A;
2. A_2 = A_1² − kI_n ;
3. A_{r+1} = A_1 A_r − (k − 1)A_{r−1} for every r ≥ 2.
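The recurrence can be checked on a small graph. The sketch below assumes, as in the standard treatment of this construction, that the (x, y) entry of A_r counts the nonbacktracking walks of length r from x to y; it builds A_0, ..., A_r from the recurrence and compares them with a direct enumeration. The helper names are illustrative, and the graph is assumed simple (0/1 adjacency matrix).

```python
def mat_mul(X, Y):
    """Product of two square integer matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][t] * Y[t][j] for t in range(n)) for j in range(n)] for i in range(n)]

def nonbacktracking_matrices(A, k, r_max):
    """Return [A_0, ..., A_{r_max}] from the recurrence for a k-regular graph."""
    n = len(A)
    I = [[int(i == j) for j in range(n)] for i in range(n)]
    mats = [I, [row[:] for row in A]]
    if r_max >= 2:
        A_sq = mat_mul(A, A)
        # A_2 = A^2 - k I: remove the k walks x -> y -> x from each diagonal entry.
        mats.append([[A_sq[i][j] - k * I[i][j] for j in range(n)] for i in range(n)])
    for r in range(2, r_max):
        prod = mat_mul(A, mats[r])
        # A_{r+1} = A A_r - (k - 1) A_{r-1}
        mats.append([[prod[i][j] - (k - 1) * mats[r - 1][i][j] for j in range(n)]
                     for i in range(n)])
    return mats[: r_max + 1]

def count_nonbacktracking(A, x, y, r):
    """Directly count walks x -> y of length r that never immediately reverse an edge."""
    n = len(A)

    def extend(prev, cur, steps):
        if steps == 0:
            return int(cur == y)
        return sum(extend(cur, nxt, steps - 1)
                   for nxt in range(n) if A[cur][nxt] and nxt != prev)

    return extend(None, x, r)
```

For the complete graph K4 (k = 3) the recurrence gives A_2 = 2A, matching the two nonbacktracking walks of length 2 between any pair of distinct vertices.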
14.10.86 Proposition [780, 1969] Let U_m denote the Chebychev polynomial of the second kind
defined by expressing sin((m + 1)θ)/ sin θ as a polynomial of degree m in cos θ:
\[ U_m(\cos\theta) = \frac{\sin (m + 1)\theta}{\sin\theta}. \]
Then
\[ \sum_{r=0}^{\lfloor m/2 \rfloor} A_{m-2r} = (k - 1)^{m/2}\, U_m\!\left( \frac{A}{2\sqrt{k - 1}} \right). \]
14.10.88 Proposition [780, 1969] If X is a k-regular graph with n vertices and eigenvalues k = λ1 ≥
λ2 ≥ · · · ≥ λn , then
\[ \sum_{x \in V} \sum_{r=0}^{\lfloor l/2 \rfloor} (A_{l-2r})(x, x) = (k - 1)^{l/2} \sum_{j=1}^{n} \frac{\sin (l + 1)\theta_j}{\sin\theta_j}, \]
where cos θ_j = λ_j /(2√(k − 1)) for each 1 ≤ j ≤ n. If X is vertex-transitive of degree k, then
(A_j )(x, x) = (A_j )(y, y) for any j and x, y ∈ V (X) and thus,
\[ n \sum_{r=0}^{\lfloor l/2 \rfloor} (A_{l-2r})(x, x) = (k - 1)^{l/2} \sum_{j=1}^{n} \frac{\sin (l + 1)\theta_j}{\sin\theta_j}. \]
It can be shown that Λ(2m) is a normal subgroup of Λ(2) of finite index. Let q be a prime.
The graphs X^{p,q} and the Cayley graph of Λ(2)/Λ(2q) with respect to the set of generators
α_1 Λ(2q), ᾱ_1 Λ(2q), . . . , α_s Λ(2q), ᾱ_s Λ(2q) are isomorphic as shown by the next result.
14.10.94 Proposition Let φ : Λ(2) → PGL(2, Z/qZ) be defined as follows:
\[ \phi([a_0 + a_1 i + a_2 j + a_3 k]) = \begin{pmatrix} a_0 + u a_1 & a_2 + u a_3 \\ -a_2 + u a_3 & a_0 - u a_1 \end{pmatrix} \]
14.10.96 Remark Estimating r_Q (n) is an important and difficult problem in number theory.
According to [1969], there is no simple explicit formula for r_Q (n) comparable to Jacobi’s
formula, because of additional cusp forms that appear at the higher level. The Ramanujan
conjecture for weight 2 cusp forms and its proof by Eichler and Igusa yield a good
approximation for r_Q (n). More precisely, if p is a prime and l ≥ 0, then
\[ r_Q(p^l) = C(p^l) + O_\varepsilon\bigl( p^{l/2 + \varepsilon} \bigr) \]
14.10.97 Remark The number theoretic facts above and the connection between the eigenvalues and
the number of closed nonbacktracking walks in a regular graph were used by Lubotzky,
Phillips, and Sarnak to prove the following result.
14.10.98 Theorem [1969, 2006] The graphs X^{p,q} are Ramanujan.
14.10.99 Remark If the Legendre symbol (p/q) = −1, then X^{p,q} is bipartite of high girth; its
girth is at least 4 log_p q − log_p 4 ≈ (4/3) log_p |V (X^{p,q})|. If (p/q) = 1, then X^{p,q} also has
high girth; its girth is at least 2 log_p q ≈ (2/3) log_p |V (X^{p,q})|. From the results of Hoffman,
it also follows that these graphs have large chromatic number (at least 1 + (p + 1)/(2√p)).
14.10.100 Remark Morgenstern [2160] generalized Lubotzky, Phillips, and Sarnak’s construction and
constructed infinite families of (q + 1)-regular Ramanujan graphs for every prime power q.
14.10.101 Construction Reingold, Vadhan, and Wigderson [2448] introduced a new graph product
called the zig-zag product which they used to construct infinite families of constant-degree
expanders.
14.10.102 Definition Let X be a k-regular graph with vertex set [n] = {1, . . . , n}. Suppose the edges
incident to each vertex of X are labeled from 1 to k in some arbitrary, but fixed way.
The rotation map Rot_X : [n] × [k] → [n] × [k] is defined as follows: Rot_X (u, i) = (v, j)
if the i-th edge incident to u is the j-th edge incident to v.
14.10.103 Definition Let G1 be a D1 -regular graph with vertex set [N1 ] with rotation map Rot_{G1}
and G2 be a D2 -regular graph with vertex set [D1 ] with rotation map Rot_{G2} . The zig-zag
product G1 ⓩ G2 is the D2²-regular graph with vertex set [N1 ] × [D1 ] whose rotation map
Rot_{G1 ⓩ G2} is:
1. let (k′, i′) = Rot_{G2} (k, i);
2. let (w, l′) = Rot_{G1} (v, k′);
3. let (l, j′) = Rot_{G2} (l′, j);
4. define Rot_{G1 ⓩ G2} ((v, k), (i, j)) = ((w, l), (j′, i′)).
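The four steps translate directly into code. The sketch below is an illustration under stated assumptions: vertices and edge labels are numbered from 0 rather than 1, the two small rotation maps (for K5 and for the 4-cycle) are hypothetical labelings chosen only for the example, and the output edge label is returned in the order (j′, i′) as in Reingold, Vadhan, and Wigderson's paper, which is what makes the resulting map an involution.

```python
def rot_complete5(v, i):
    """Rotation map of K5 on vertices 0..4 with edge labels 0..3 (illustrative labeling)."""
    return (v + i + 1) % 5, 3 - i

def rot_cycle4(k, i):
    """Rotation map of the 4-cycle on vertices 0..3 with edge labels 0..1."""
    return ((k + 1) % 4, 1) if i == 0 else ((k - 1) % 4, 0)

def zigzag_rot(rot_g1, rot_g2):
    """Build the rotation map of the zig-zag product from those of G1 and G2."""
    def rot(vertex, label):
        (v, k), (i, j) = vertex, label
        k1, i1 = rot_g2(k, i)    # zig: small step inside the cloud of v
        w, l1 = rot_g1(v, k1)    # big step between clouds along edge k1 of G1
        l, j1 = rot_g2(l1, j)    # zag: small step inside the cloud of w
        return (w, l), (j1, i1)  # labels reversed so that rot is an involution
    return rot

if __name__ == "__main__":
    rot = zigzag_rot(rot_complete5, rot_cycle4)
    print(rot((0, 0), (0, 0)))
```

Here G1 = K5 has N1 = 5 and D1 = 4, and G2 is the 4-cycle with D2 = 2, so the product has 20 vertices and degree 4.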
14.10.104 Definition A graph G is an (n, d, λ)-graph if G has n vertices, is d-regular, and the absolute
value of any non-trivial eigenvalue of G is at most λd.
14.10.105 Theorem [2448] If G1 is an (N1 , D1 , µ1 )-graph and G2 is a (D1 , D2 , µ2 )-graph, then
G1 ⓩ G2 is an (N1 D1 , D2², µ1 + µ2 + µ2²)-graph.
14.10.106 Construction Using the previous theorem, Reingold, Vadhan, and Wigderson [2448] con-
structed infinite families of constant-degree expanders.
14.10.107 Construction Bilu and Linial [282] have used graph lifts to construct infinite families of
d-regular graphs whose non-trivial eigenvalues have absolute value at most C√(d log³ d), where
C is some positive absolute constant. We outline their method below.
14.10.109 Definition A surjective homomorphism f : V (G1 ) → V (G2 ) is a covering map (see [282,
1126]) if for each vertex x of G1 , the restriction of f to x and its neighbors is bijective.
Given a graph G, a cover of G is a pair (H, f ), where f : V (H) → V (G) is a covering map.
If in addition G is connected and finite and H is finite, then for each vertex y ∈ V (G),
the preimage f⁻¹(y) has the same cardinality. If |f⁻¹(y)| = t for each y ∈ V (G), then
(H, f ) is a t-cover or H is a t-lift of G.
14.10.110 Remark For every finite graph G, there is a universal cover or a largest cover G̃ which is
an infinite tree whose vertices can be identified with the set of nonbacktracking walks from
a fixed vertex x ∈ V (G). For example, the universal cover of any finite k-regular graph is
the infinite k-regular tree Tk .
14.10.111 Remark An important property of a t-cover (H, f ) of a finite graph G is that the graph H
inherits the eigenvalues of G. This is because the vertex set of H can be thought of as V (G) ×
{1, . . . , t} with the preimage (also called the fiber of y) f⁻¹(y) = {x : x ∈ V (H), f (x) =
y} = {(y, i) : 1 ≤ i ≤ t}. The edges of H are related to the edges of G as follows: each fiber
f⁻¹(y) induces an independent set in H; if yz ∈ E(G), then the subgraph of H induced
by f⁻¹(y) ∪ f⁻¹(z) = {(y, i), (z, i) : 1 ≤ i ≤ t} is a perfect matching (meaning that there
exists a permutation σ ∈ S_t such that (y, i) is adjacent to (z, σ(i)) for each 1 ≤ i ≤ t); if
yz ∉ E(G), then there are no edges between f⁻¹(y) and f⁻¹(z). The partition of the vertex
set of H as V (H) = ∪_{y∈V (G)} f⁻¹(y) is equitable (see [1287]) and its quotient matrix is the
same as the adjacency matrix of G. This implies that the eigenvalues of A(G) are also eigenvalues
of A(H). These eigenvalues of A(H) are old and the remaining eigenvalues of A(H) are new.
14.10.112 Remark In the case of a 2-lift, the new eigenvalues can be interpreted as eigenvalues of a
signed adjacency matrix as follows. If H is a 2-lift of G, then for each edge yz of G, the
subgraph induced by f⁻¹(y) ∪ f⁻¹(z) = {(y, 0), (y, 1), (z, 0), (z, 1)} in H has either (y, 0)
adjacent to (z, 0) and (y, 1) adjacent to (z, 1) (in which case set s(y, z) = s(z, y) = 1) or
(y, 0) adjacent to (z, 1) and (y, 1) adjacent to (z, 0) (in which case set s(y, z) = s(z, y) = −1).
Let s(y, z) = 0 for all other y, z ∈ V (G). The symmetric {0, −1, 1} matrix A_s whose (y, z)-th
entry is s(y, z) is the signed adjacency matrix of G with respect to the cover H. It is
known that the eigenvalues of H are the union of the eigenvalues of the adjacency matrix of
G and the eigenvalues of the signed adjacency matrix of G. Bilu and Linial [282] proved
that every graph G with maximum degree d has a signed adjacency matrix (which can be
found efficiently) whose eigenvalues have absolute value at most C√(d log³ d) where C is
some positive absolute constant.
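The relation between a signing and its 2-lift can be made concrete. The following sketch builds the adjacency matrix of the 2-lift determined by a signing and checks a consequence of the spectral statement: since the spectrum of the lift is the union of the spectra of A and of the signed matrix, the trace of every power of the lift equals the sum of the traces of the corresponding powers. The function names and the indexing of the vertex (y, b) as 2y + b are assumptions made for the example; eigenvalues themselves are not computed.

```python
def two_lift(A, sign):
    """Adjacency matrix of the 2-lift of A determined by a signing.

    sign[y][z] is +1 or -1 on edges (symmetric) and ignored elsewhere.
    Vertex (y, b) of the lift is stored at index 2*y + b.
    """
    n = len(A)
    H = [[0] * (2 * n) for _ in range(2 * n)]
    for y in range(n):
        for z in range(n):
            if A[y][z]:
                for b in (0, 1):
                    # Sign +1 keeps the layer, sign -1 swaps the two layers.
                    c = b if sign[y][z] == 1 else 1 - b
                    H[2 * y + b][2 * z + c] = 1
    return H

def trace_of_power(M, m):
    """Trace of M**m, i.e. the sum of the m-th powers of the eigenvalues of M."""
    n = len(M)
    P = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(m):
        P = [[sum(P[i][t] * M[t][j] for t in range(n)) for j in range(n)] for i in range(n)]
    return sum(P[i][i] for i in range(n))

if __name__ == "__main__":
    A = [[int(i != j) for j in range(4)] for i in range(4)]   # K4
    sign = [[1] * 4 for _ in range(4)]
    sign[0][1] = sign[1][0] = -1
    A_s = [[A[y][z] * sign[y][z] for z in range(4)] for y in range(4)]
    H = two_lift(A, sign)
    print([trace_of_power(H, m) - trace_of_power(A, m) - trace_of_power(A_s, m)
           for m in range(1, 7)])
```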
14.10.113 Construction Bilu and Linial’s idea to construct almost Ramanujan graphs is the following:
start with a k-regular Ramanujan graph G_0 (for example, the complete graph K_{k+1} ) and
then construct a 2-lift of G_i (denoted by G_{i+1} ) such that the new eigenvalues of G_{i+1} are
small in absolute value for i ≥ 0. Bilu and Linial [282] prove that every k-regular graph G
has a 2-lift H such that the new eigenvalues of H have absolute value at most C√(k log³ k)
where C is some positive absolute constant. In this way the sequence of k-regular graphs
G_i has non-trivial eigenvalues bounded from above by C√(k log³ k).
14.10.114 Remark Bilu and Linial [282] make the following conjecture which if true, would imply the
existence of infinite sequences of k-regular Ramanujan graphs for every k ≥ 3.
14.10.115 Conjecture [282] Every k-regular graph has a signed adjacency matrix whose eigenvalues
have absolute value at most 2√(k − 1).
14.10.116 Construction A different combinatorial construction of almost Ramanujan graphs was
proposed by de la Harpe and Musitelli [786], and independently by Cioabă and Murty [646,
649]. The idea of these constructions is that perturbing Ramanujan graphs by adding or
removing perfect matchings will yield graphs with small non-trivial eigenvalues. The linear
algebraic reason for this fact follows from a theorem of Weyl which bounds the eigenvalues of
a sum of two Hermitian matrices in terms of the eigenvalues of the summands. De la Harpe
and Musitelli [786] note that adding a perfect matching to any 6-regular Ramanujan graph
will yield a 7-regular graph whose 2nd largest eigenvalue is at most 2√5 + 1 ≈ 5.47, which
is larger than the Ramanujan bound of 2√6 ≈ 4.90, but strictly less than 7. Cioabă and
Murty [646, 649] use known results regarding gaps between consecutive primes to observe
that by adding or removing perfect matchings from Ramanujan graphs, one can construct
k-regular almost Ramanujan graphs for almost all k. More precisely, their result is that
given ε > 0, for almost all k ≥ 3, one can construct infinite families of k-regular graphs
whose 2nd largest eigenvalue is at most (2 + ε)√(k − 1).
14.10.117 Remark The following conjecture was made in [646]; if true, this conjecture would imply
the existence of infinite families of k-regular Ramanujan graphs for any k ≥ 3.
14.10.118 Conjecture [646] Let X be a k-regular Ramanujan graph with an even number of vertices.
Then there exists a perfect matching P with V (P ) = V (X) such that the (k + 1)-regular
graph obtained from the union of the edges of X and P is Ramanujan.
14.10.119 Remark In a recent outstanding work, Friedman [1127] solved a long-standing conjecture
of Alon from the 1980s and proved that almost all regular graphs are almost Ramanujan.
14.10.120 Theorem [1127] Given ε > 0 and k ≥ 3, the probability that a random k-regular graph
on n vertices has all non-trivial eigenvalues at most (2 + ε)√(k − 1) goes to 1 as n goes to
infinity.
14.10.122 Definition Let X be a k-regular graph and denote q = k − 1. The Ihara zeta function is
\[ Z_X(s) = \prod_{p} \left( 1 - q^{-s\,\ell(p)} \right)^{-1} \]
where the product is over all prime geodesic cycles p and ℓ(p) is the length of p.
Moreover, Z_X (s) satisfies the Riemann hypothesis (that is, all the singular points in the
region 0 < Re(s) < 1 lie on Re(s) = 1/2) if and only if X is a Ramanujan graph.
14.10.124 Remark Hashimoto [1441], as well as Stark and Terras [2701] have defined a zeta function
for an arbitrary graph and established its rationality. The definition of this zeta function is
simple enough. Let N_r be the number of closed walks γ of length r so that neither γ nor γ²
have backtracking. Then, the zeta function of the graph X is defined as
\[ Z_X(t) = \exp\left( \sum_{r=1}^{\infty} \frac{N_r t^r}{r} \right). \]
See Also
§6.1, §6.2 For details on Gauss sums and other character sums.
§12.7 For discussions on zeta functions and L-functions of curves.
§15.1, §15.3 For details on algebraic, LDPC, and expander codes.
References Cited: [80, 81, 83, 156, 157, 269, 282, 411, 637, 638, 645, 646, 647, 648, 649,
780, 786, 834, 1126, 1127, 1128, 1287, 1441, 1534, 1535, 1568, 1641, 1693, 1807, 1921, 1922,
1923, 1924, 1925, 1926, 1927, 1967, 1968, 1969, 1970, 2005, 2006, 2117, 2118, 2160, 2207,
2208, 2289, 2290, 2433, 2448, 2523, 2525, 2589, 2594, 2701, 2789, 2837, 2839]
15
Algebraic coding theory
15.1 Basic coding properties and bounds
Channel models and error correction • Linear codes • Cyclic codes • A spectral approach to
coding • Codes and combinatorics • Decoding • Codes over Z4 • Conclusion
15.2 Algebraic-geometry codes
Classical algebraic-geometry codes • Generalized algebraic-geometry codes • Function-field
codes • Asymptotic bounds
15.3 LDPC and Gallager codes over finite fields
15.4 Turbo codes over finite fields
Introduction • Convolutional codes • Permutations and interleavers • Encoding and
decoding • Design of turbo codes
15.5 Raptor codes
Tornado codes • LT and fountain codes • Raptor codes
15.6 Polar codes
Space decomposition • Vector transformation •
15.1.2 Example A channel with two inputs {0, 1} and two outputs {0, 1} is a binary symmetric
channel (BSC) if p00 = p11 = 1 − p, p01 = p10 = p; p is the crossover probability.
15.1.3 Example A channel with two inputs {0, 1} and three outputs {0, E, 1} is a binary erasure
channel (BEC) if p00 = p11 = 1 − ε, p0E = p1E = ε, and p01 = p10 = 0. An erasure in the
received word is a position containing the symbol E.
15.1.4 Remark There are many other types of channels, including q-ary channels (q inputs), burst
noise channels, and the additive white Gaussian noise (AWGN) channel, which is a contin-
uous channel that adds white Gaussian noise to the transmitted signal.
15.1.8 Definition A code C is an (n, M, d)_q code if it has M distinct codewords of length n over
the finite field Fq such that the minimum distance between any two distinct codewords
is d where
\[ d = \min_{c_1, c_2 \in C,\ c_1 \ne c_2} d(c_1, c_2). \]
The rate r of such a code is the ratio of the number of information symbols transmitted
per codeword to the number of symbols per codeword, assuming the codewords are
chosen equally likely, i.e., r = logq (M )/n.
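For a small code given as an explicit list of codewords, the parameters (n, M, d) and the rate r = log_q(M)/n can be computed by a direct search over all pairs. This is a minimal sketch under the assumption that codewords are tuples of equal length; the function names are illustrative.

```python
from itertools import combinations
from math import log

def hamming_distance(u, v):
    """Number of coordinates in which u and v differ."""
    return sum(a != b for a, b in zip(u, v))

def code_parameters(code, q):
    """Return ((n, M, d), rate) for a block code over an alphabet of size q."""
    n, M = len(code[0]), len(code)
    # Minimum over all pairs of distinct codewords.
    d = min(hamming_distance(u, v) for u, v in combinations(code, 2))
    return (n, M, d), log(M, q) / n

if __name__ == "__main__":
    repetition = [(0, 0, 0), (1, 1, 1)]
    even_weight = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
    print(code_parameters(repetition, 2))
    print(code_parameters(even_weight, 2))
```

The two examples show the trade-off the definition captures: the repetition code has larger minimum distance but lower rate than the even-weight code of the same length.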
15.1.9 Remark An important problem of coding theory is to design codes that have a large min-
imum distance for a given code size (or large size for a given minimum distance) and to
have encoding and decoding algorithms that can be implemented efficiently. Many types of
encoders and decoders are considered in the literature.
15.1.10 Remark Unless specified otherwise, a received word is the sum of a codeword transmitted
over a discrete memoryless channel and an error word. All such words are vectors over a
finite field.
15.1.11 Definition A maximum likelihood decoder (MLD) [1164] is one that, on receiving a word of
channel output symbols r, chooses the codeword ĉ ∈ C that maximizes the probability
of r being received, i.e., maximizes the conditional probability P (r | c):
\[ \hat{c} = \arg\max_{c \in C} P(r \mid c). \]
15.1.12 Definition A maximum a posteriori (MAP) decoder [1164] is one that, on receiving the
channel output word r, chooses the codeword ĉ ∈ C that maximizes the a posteriori
probability P (c | r), i.e.,
\[ \hat{c} = \arg\max_{c \in C} P(c \mid r). \]
15.1.13 Definition A minimum distance decoder [1164] is one that, on receiving the word r, chooses
a codeword ĉ ∈ C that is at minimum distance from r, i.e.,
\[ \hat{c} = \arg\min_{c \in C} d(r, c). \]
A bounded distance decoder attempts to find the codeword, if it exists, within distance
⌊(d − 1)/2⌋ of the received word.
15.1.14 Remark The output from these decoders may not be unique. In this case, in an equally
likely scenario, the decoder chooses one of the codewords satisfying the criteria. In the case
of choosing codewords equally likely, MAP is equivalent to MLD.
15.1.15 Lemma On a BSC with crossover probability p < 1/2, the minimum distance decoder is
equivalent to the MLD [304, 1943, 2389].
15.1.16 Remark If C is an (n, M, d)_q code, one may think of the codewords being surrounded with
“spheres” of radius e = ⌊(d − 1)/2⌋, i.e. the set of q-ary n-tuples at distance e or less from
the center. The spheres are then nonintersecting.
15.1.17 Lemma If C is an (n, M, d)q code and a codeword c is transmitted and word r is received,
with r errors and s erasures, then c is the unique codeword in C closest to the received r
provided that 2r + s < d.
15.1.18 Remark Shannon [2608] showed that with every discrete memoryless channel with a finite
number of input and output symbols, one may define the notion of channel capacity in bits
per channel use. His celebrated channel coding theorem and its converse then state there
exists a code that allows essentially error-free transfer of data over the channel as long as
the rate does not exceed channel capacity. Conversely, he showed that such transmission is
not possible if the channel rate exceeds capacity. More specifically, Gallager [1164] showed
that if C is the channel capacity, there is an error-rate exponent function E(R) that is
positive for rates 0 ≤ R < C and zero elsewhere, and there exists a code of rate R < C and
length n > N for which the probability Pw of word error on the channel satisfies
Pw < exp(−N E(R)) for 0 ≤ R < C.
The importance of the result is that there exists a sequence of codes of increasing lengths
n > N and of rates less than the channel capacity for which the probability of word error,
after maximum likelihood decoding, decreases exponentially to zero.
15.1.19 Remark Algebraic coding theory arose in response to the challenge of the Shannon the-
orems to define codes, as well as efficient encoding and decoding algorithms, that achieve
asymptotically low error rate at code rates arbitrarily close to capacity. There are numerous
excellent books available on the subject. The following were particularly useful references in
preparing this section: [231, 304, 311, 1558, 1639, 1943, 1945, 1991, 2047, 2136, 2389, 2390,
2405, 2484, 2511, 2849].
15.1.21 Remark The anomaly in notation from the nonlinear case with a designation of (n, M, d)q
code, where the number of codewords is M , will be clear from the context.
15.1.22 Remark To determine the minimum distance of a nonlinear (n, M, d)_q code C will in general
require a search over the distances between all \binom{M}{2} pairs of codewords. For a linear code
this can be reduced to determining the weight of a minimum weight codeword since
the distance between two codewords equals the weight of their difference, which is again a
codeword.
15.1.23 Definition The scalar (or inner) product of two vectors u = (u1 , u2 , . . . , un ), v =
(v1 , v2 , . . . , vn ) from F_q^n is
(u, v) = u1 v1 + u2 v2 + · · · + un vn ∈ Fq .
15.1.27 Remark In R^n , the only vector orthogonal to itself is the zero vector. So nonzero linear
subspaces of R^n cannot be self-orthogonal. However, the situation is markedly different when
considering subspaces of vector spaces over fields of prime characteristic; in that case,
self-orthogonal and self-dual subspaces do exist. For example, C = {(0, 0), (1, 1)} is a self-dual
(2, 1, 2)_2 code in F_2^2 .
15.1.28 Remark If C is a self-dual code, its dimension is n/2 and its length n is even.
15.1.29 Definition For C an (n, k, d)q linear code, a k × n matrix G is a generator matrix of C if
its row space equals C. Similarly an (n − k) × n matrix H is a parity check matrix for
C if its row space equals C ⊥ .
15.1.30 Remark Note that some authors use the term parity check matrix only for the binary case,
as in this case the inner product of a codeword and row of the matrix involves an even
number of ones in the sum. We retain the term parity check matrix for all sizes of fields.
15.1.31 Remark As any codeword of C is orthogonal to any word in C ⊥ , it follows that
\[ G H^T = 0, \]
where superscript T denotes matrix transpose. Thus c ∈ C if and only if Hc^T = 0^T where
0 is the zero row vector of length n − k.
15.1.32 Remark By using elementary row operations (which preserve the row space of a matrix)
and possibly coordinate position permutations, any generator matrix of a code C can be
put in the form
\[ G = [\, I_k \mid A \,], \]
with corresponding parity check matrix
\[ H = [\, -A^T \mid I_{n-k} \,]. \]
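The pair G = [I_k | A], H = [−A^T | I_{n−k}] can be built and checked mechanically. The sketch below assumes a prime field F_p represented by the integers 0, ..., p − 1 with arithmetic modulo p; the function names are illustrative, and the check is simply that every row of G is orthogonal to every row of H.

```python
def systematic_pair(A, p):
    """Given the k x (n-k) block A over F_p, return G = [I_k | A] and H = [-A^T | I_{n-k}]."""
    k, r = len(A), len(A[0])
    G = [[int(i == j) for j in range(k)] + [a % p for a in A[i]] for i in range(k)]
    H = [[(-A[i][j]) % p for i in range(k)] + [int(t == j) for t in range(r)]
         for j in range(r)]
    return G, H

def times_transpose(G, H, p):
    """Compute the matrix product G * H^T over F_p."""
    return [[sum(g * h for g, h in zip(grow, hrow)) % p for hrow in H] for grow in G]

if __name__ == "__main__":
    G, H = systematic_pair([[1, 2], [2, 2]], 3)
    print(G)
    print(H)
    print(times_transpose(G, H, 3))
```

The product vanishes because row i of G against row j of H contributes −A[i][j] from the identity block of G and +A[i][j] from the identity block of H.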
15.1.33 Definition Generator and parity check matrices of a code C in the form of Remark 15.1.32
are in systematic form.
15.1.34 Remark Some authors (e.g. [1558]) refer to the form of matrices in Remark 15.1.32 as being
in standard form and reserve “systematic” for any form where the information bits appear
explicitly in the codewords.
15.1.35 Remark From the definitions it is clear that a parity check matrix of a linear code C is a
generator matrix of the code C ⊥ , and a generator matrix of C is a parity check matrix of
the code C ⊥ .
15.1.36 Remark If C is an (n, k, d)q code with parity check matrix H, then c ∈ C if and only
if HcT = 0T . Thus if c is a minimum weight codeword, then HcT = 0T implies a linear
combination of d columns of H is the all zero vector. It follows that C has minimum distance
d if every subset of d − 1 columns of H is linearly independent, and there is a dependent
set of d columns.
15.1.37 Lemma In an (n, k, d)q code C, any d − 1 columns of a parity check matrix for C are linearly
independent, and there is at least one set of d columns that is dependent.
15.1.38 Definition Let C be an (n, M, d)_q code and let A_i (resp. B_i , if C is linear) be the number
of codewords in C (resp. C ⊥ , which exists if C is linear) of weight i. Define the weight
enumerators of these codes as
\[ W_C(x, y) = \sum_{i=0}^{n} A_i x^{n-i} y^i \quad \text{and} \quad W_{C^\perp}(x, y) = \sum_{i=0}^{n} B_i x^{n-i} y^i. \]
15.1.39 Theorem (MacWilliams identities) [304, 1558, 2849] Let W_C (x, y) (resp. W_{C⊥} (x, y)) be the
weight enumerator of an (n, k, d)_q linear code C (resp. of the (n, n − k, d′)_q dual code C ⊥ ).
Then
\[ W_{C^\perp}(x, y) = \frac{1}{q^k}\, W_C(x + (q - 1)y,\ x - y). \tag{15.1.1} \]
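The identity (15.1.1) can be verified numerically on a small code by enumerating the code and its dual by brute force and evaluating both weight enumerators at integer points. This is a sketch under stated assumptions: q is prime so that arithmetic modulo q gives the field, the generator matrix used in the example is an arbitrary illustrative choice, and the function names are hypothetical.

```python
from itertools import product

def span(G, q):
    """All codewords generated by the rows of G over F_q (q prime)."""
    n = len(G[0])
    words = set()
    for coeffs in product(range(q), repeat=len(G)):
        words.add(tuple(sum(c * row[j] for c, row in zip(coeffs, G)) % q
                        for j in range(n)))
    return words

def dual(code, n, q):
    """Brute-force dual code: all vectors orthogonal to every codeword."""
    return {v for v in product(range(q), repeat=n)
            if all(sum(a * b for a, b in zip(v, c)) % q == 0 for c in code)}

def weight_enumerator(code, n):
    """Return W(x, y) = sum_i A_i x^(n-i) y^i as a Python function."""
    counts = [0] * (n + 1)
    for c in code:
        counts[sum(1 for a in c if a)] += 1
    return lambda x, y: sum(A * x ** (n - i) * y ** i for i, A in enumerate(counts))

if __name__ == "__main__":
    q, n = 3, 4
    C = span([[1, 0, 1, 2], [0, 1, 2, 2]], q)
    D = dual(C, n, q)
    W_C, W_D = weight_enumerator(C, n), weight_enumerator(D, n)
    # q^k = |C|, so the identity reads |C| * W_D(x, y) = W_C(x + (q-1)y, x - y).
    print(all(len(C) * W_D(x, y) == W_C(x + (q - 1) * y, x - y)
              for x in range(-3, 4) for y in range(-3, 4)))
```

Evaluating at integer points keeps the check exact; agreement at sufficiently many points implies the polynomial identity for this code.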
15.1.40 Remark An alternative version of the MacWilliams identities is obtained by expanding the
terms in (15.1.1) to obtain
\[ \sum_{i=0}^{n-j} \binom{n-i}{j} A_i = q^{k-j} \sum_{i=0}^{j} \binom{n-i}{n-j} B_i \quad \text{for } 0 \le j \le n. \]
15.1.41 Remark In [801] Delsarte proposed an interesting method to examine code properties and
in particular their combinatorial aspects. It gives fundamental insight to the problem. Only
a brief look at this approach will be given.
15.1.42 Definition [801] For positive integers n and λ the Krawtchouk polynomial P_k (x) is the
polynomial of degree k over the rational numbers
\[ P_k(x) = \sum_{j=0}^{k} (-1)^j \lambda^{k-j} \binom{x}{j} \binom{n-x}{k-j} \quad \text{for } 0 \le k \le n, \]
where
\[ \binom{x}{j} = x(x - 1)(x - 2) \cdots (x - j + 1)/j!. \]
These polynomials form a set of orthogonal polynomials and have many interesting
properties such as
1. \sum_{i=0}^{n} \binom{n}{i} \lambda^i P_k(i) P_\ell(i) = \delta_{k\ell} \binom{n}{k} \lambda^k (\lambda + 1)^n ,
2. \sum_{i=0}^{n} P_\ell(i) P_i(k) = \delta_{k\ell} (\lambda + 1)^n ,
3. \sum_{k=0}^{n} \binom{n-k}{n-j} P_k(x) = (\lambda + 1)^j \binom{n-x}{j} .
Later, when required, the parameter λ will be specified.
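The definition can be evaluated directly at integer arguments, which makes the orthogonality relations easy to test for small n and λ. The sketch below is an illustration with hypothetical function names; it evaluates P_k(x) only for integers 0 ≤ x ≤ n and relies on `math.comb` (Python 3.8 or later), which returns 0 when the lower index exceeds the upper one.

```python
from math import comb

def krawtchouk(n, lam, k, x):
    """P_k(x) = sum_j (-1)^j lam^(k-j) C(x, j) C(n-x, k-j) for integer 0 <= x <= n."""
    return sum((-1) ** j * lam ** (k - j) * comb(x, j) * comb(n - x, k - j)
               for j in range(k + 1))

def krawtchouk_matrix(n, lam):
    """Matrix P whose (i, k) entry is P_k(i), 0 <= i, k <= n."""
    return [[krawtchouk(n, lam, k, i) for k in range(n + 1)] for i in range(n + 1)]

if __name__ == "__main__":
    n, lam = 4, 2
    P = krawtchouk_matrix(n, lam)
    P2 = [[sum(P[i][t] * P[t][j] for t in range(n + 1)) for j in range(n + 1)]
          for i in range(n + 1)]
    # P^2 should be (lam + 1)^n times the identity matrix.
    print(P2 == [[(lam + 1) ** n * int(i == j) for j in range(n + 1)]
                 for i in range(n + 1)])
```

For k = 1 the definition reduces to P_1(x) = λn − (λ + 1)x, a convenient spot check.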
15.1.44 Remark If the (i, k) entry of a matrix P is defined as P_k (i), 0 ≤ i, k ≤ n, the MacWilliams
transform can be expressed as A′ = AP .
15.1.45 Remark It can be shown that (A′)′ = (λ + 1)^n A for any (n + 1)-tuple A. Likewise
P² = (λ + 1)^n I_{n+1} .
15.1.46 Definition [801] Let C be an (n, M, d)_q code. Define the distance distribution of C as
\[ A_i(C) = \frac{1}{M} \left| \{ (x, y) \in C^2 \mid d(x, y) = i \} \right| \quad \text{for } 0 \le i \le n, \]
i.e., the average number of codewords at distance i from a fixed codeword. In the case
C is linear, this is just the weight distribution of C.
15.1.48 Remark The following lemma gives motivation for the appearance of Krawtchouk polyno-
mials.
15.1.49 Lemma [801] Let W_k be the set of vectors of weight k in F_q^n , and let u be a vector of weight
j in F_q^n . Then with λ = q − 1
\[ \sum_{x \in W_k} (u, x) = P_k(j). \]
15.1.50 Lemma [801] The distance distribution of any code over Fq is a positive (n + 1)-tuple for
λ = q − 1.
15.1.51 Definition Let A(C) be the distance distribution of an (n, M, d)_q code C, and define the
four fundamental parameters of C (where the parameter of the Krawtchouk polynomials
is λ = q − 1) as
1. d = d(A(C)) is the minimum distance,
2. s = s(A(C)) is the number of nonzero distances (of C),
3. d′ = d(A′(C)) is the dual distance,
4. s′ = s(A′(C)) is the external distance.
In the case the code C is linear, A′(C) is a constant multiple of the weight distribution
of C ⊥ .
15.1.52 Remark It is important to note these four parameters are defined for any code C. The
parameter d′ is the smallest i such that A′_i ≠ 0, i > 0. The parameter s′ is the number
of nonzero entries in the set {A′_i , i = 1, . . . , n}. In the case the code is linear, the four
parameters are the minimum distance and number of nonzero codeword weights of C and
the dual code C ⊥ , respectively. The following three theorems from [801] (the first theorem
was first proved by MacWilliams) are representative of the effectiveness of this approach.
15.1.53 Theorem [801] Let A be an (n + 1)-tuple with A_0 ≠ 0 and let A′ be its MacWilliams
transform. Then s′ ≥ ⌊(d − 1)/2⌋.
15.1.55 Remark A linear code over Fq is automatically distance invariant because c + C = C for all
c ∈ C.
15.1.56 Theorem [801] Let C be a code for which s ≤ d′. Then C is distance invariant.
15.1.57 Remark The following theorem justifies the name external distance given to the parameter
s′.
15.1.58 Theorem [801] Let C be a code with external distance s′. Then each point of F_q^n is at
distance at most s′ from at least one codeword.
15.1.59 Definition Let C be an (n, k, d)_q code. An encoding is an injective mapping from the set
F_q^k of all messages to the set C of codewords. A minimum distance decoding is a map
from the set F_q^n of all possible received vectors to the set C which sends the received
vector r to a codeword closest to r. The codeword is unique if the number of errors in
transmission is at most e = ⌊(d − 1)/2⌋.
15.1.60 Remark A particular decoding algorithm that works for any linear code over Fq is the
coset or standard array decoding algorithm. It is generally too inefficient for practical con-
sideration. Since a linear (n, k, d)q code C is a vector subspace of Fnq , it can be viewed as
an (additive) subgroup of Fnq . Thus Fnq can be decomposed into the cosets of C. Such a
decomposition can be used to derive a decoding algorithm.
666 Handbook of Finite Fields
15.1.61 Definition Let C be a linear (n, k, d)q code. A coset of C with representative u ∈ Fn
q is the
set u + C = {u + c | c ∈ C}. For u1 , u2 ∈ Fnq , as sets u1 + C and u2 + C are either disjoint
or identical. Since C is closed under addition, it follows that u + C equals u + c + C if
c ∈ C. There are q n−k distinct cosets of C in Fnq .
15.1.62 Definition The standard array for an (n, k, d)q code C is a q^{n−k} × q^k array of vectors of
F_q^n formed in the following manner:
1. In the first row of the array, place the codewords of C in some ordering, with
the all-zero codeword first.
2. Choose a vector u1 of minimum weight in F_q^n \ C and add this to each vector in
the first row to form the second row.
3. Choose a vector u2 of minimum weight in F_q^n \ (C ∪ (u1 + C)) and add this to
each vector in the first row to form the third row.
4. Continue the process until the vectors of F_q^n are exhausted.
Vectors in the first column of the array are coset leaders. Every vector of F_q^n appears
exactly once in the array.
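The construction above can be sketched in a few lines of Python for a small code over a prime field. This is only an illustrative sketch: the function names and the (5, 2, 3) binary generator matrix G are assumptions chosen for the example, and the brute-force enumeration is practical only for very short codes.

```python
from itertools import product

def span(gen_rows, q=2):
    """All codewords generated by the rows of a generator matrix over F_q (q prime)."""
    n = len(gen_rows[0])
    words = set()
    for coeffs in product(range(q), repeat=len(gen_rows)):
        words.add(tuple(sum(c * row[j] for c, row in zip(coeffs, gen_rows)) % q
                        for j in range(n)))
    return sorted(words)  # sorted, so the all-zero codeword comes first

def standard_array(gen_rows, q=2):
    """Rows of the standard array; the first entry of each row is its coset leader."""
    n = len(gen_rows[0])
    code = span(gen_rows, q)
    # Candidate leaders ordered by Hamming weight (ties broken lexicographically).
    candidates = sorted(product(range(q), repeat=n),
                        key=lambda v: (sum(x != 0 for x in v), v))
    seen = set(code)
    rows = [code]
    for u in candidates:
        if u in seen:
            continue
        row = [tuple((a + b) % q for a, b in zip(u, c)) for c in code]
        rows.append(row)
        seen.update(row)
    return rows

def decode(received, rows):
    """Return the codeword heading the column that contains the received vector."""
    for row in rows:
        if received in row:
            return rows[0][row.index(received)]
    raise ValueError("vector not found in the array")

# Hypothetical example: a (5, 2, 3) binary code.
G = [[1, 0, 1, 1, 0],
     [0, 1, 0, 1, 1]]
rows = standard_array(G)
```

Decoding subtracts the coset leader of the row in which the received vector appears, which is the same as reading off the codeword at the top of its column.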
\[ \pi_{m,q} = \frac{q^m - 1}{q - 1} = q^{m-1} + q^{m-2} + \cdots + 1. \]
This is the number of nonzero projective m-tuples where scalar multiples are identified.
15.1.67 Definition The m-th order Hamming code over Fq, denoted H_{π_{m,q},q}(m), has a parity check
matrix formed by constructing the m × π_{m,q} matrix whose columns are the set of distinct
nonzero projective q-ary m-tuples. The code H_{π_{m,q},q}(m) is a (π_{m,q}, π_{m,q} − m, 3)q code.
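As a rough illustration of this definition, the sketch below builds such a parity check matrix for prime q, taking as projective representative of each point the vector whose first nonzero coordinate is 1. That normalization, the function names, and the brute-force distance check are assumptions made for the example.

```python
from itertools import product

def hamming_parity_check(m, q):
    """m x n parity check matrix whose columns are the projective points of F_q^m (q prime)."""
    cols = []
    for v in product(range(q), repeat=m):
        nonzero = [x for x in v if x != 0]
        if nonzero and nonzero[0] == 1:   # one representative per projective point
            cols.append(v)
    return [[col[i] for col in cols] for i in range(m)]

def min_distance_from_parity_check(H, q):
    """Brute-force minimum weight of the nonzero vectors c with H c^T = 0."""
    n = len(H[0])
    best = n
    for c in product(range(q), repeat=n):
        if any(c) and all(sum(h * x for h, x in zip(row, c)) % q == 0 for row in H):
            best = min(best, sum(x != 0 for x in c))
    return best

H2 = hamming_parity_check(3, 2)   # binary Hamming code of length 7
H3 = hamming_parity_check(2, 3)   # ternary Hamming code of length 4
```

The number of columns produced equals (q^m − 1)/(q − 1), and for these two small cases the exhaustive search confirms minimum distance 3.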
15.1.68 Remark The dimension of H_{π_{m,q},q}(m) is easy to establish. That the code has minimum
distance 3 is seen by finding some linear combination of three columns of the parity check
matrix that equals zero. That there is no linear combination of two columns equaling zero
follows from the columns being projectively distinct.
15.1.69 Remark The binary Hamming code H_{π_{m,2},2}(m) is a (2^m − 1, 2^m − 1 − m, 3)_2 code. The
dual of this code has all nonzero words of weight 2^{m−1}, and hence, writing n = 2^m − 1, the weight
enumerator of H^⊥_{π_{m,2},2}(m) is
\[ W_{H^{\perp}_{\pi_{m,2},2}}(x, y) = x^n + n\, x^{(n-1)/2} y^{(n+1)/2}. \]
Using the MacWilliams identities it is seen that the weight enumerator of H_{π_{m,2},2}(m) is
\[ W_{H_{\pi_{m,2},2}}(x, y) = 2^{-m} \left[ (x + y)^n + n (x + y)^{(n-1)/2} (x - y)^{(n+1)/2} \right]. \]
In a similar manner it can be established that the dual of the code H_{π_{m,q},q}(m) over Fq has
all nonzero weights equal to q^{m−1}. The dual codes to Hamming codes are simplex codes.
u ⊗ v = (u1 v1 , u2 v2 , . . . , un vn ).
This is also the coordinate-wise “AND” operation. Similarly the usual coordinate-wise ad-
dition over F2 corresponds to “XOR.”
15.1.71 Definition Define the Reed-Muller generator matrix G(r, m), 0 ≤ r ≤ m, as follows.
1. The zero-th row v_0 is all ones.
2. The rows v_i, i = 1, 2, . . . , m, are formed by alternating 2^{m−i} 0s with 2^{m−i} 1s
to fill the vector of length 2^m.
3. The matrix G(r, m) consists of the above m + 1 rows together with all products
of the vectors v_i, i = 1, 2, . . . , m, taken j at a time for all j ≤ r. G(r, m) is a
\( \left( \sum_{j=0}^{r} \binom{m}{j} \right) \times 2^m \) matrix.
15.1.72 Definition The matrix G(r, m) is a generator matrix for the order r and degree m binary
Reed-Muller code RM_{2^m,2}(r, m).
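The row construction of G(r, m) can be sketched as follows. The function names are illustrative assumptions, rows of G(r, m) are taken to be v_0 together with all products of at most r of the v_i, and the minimum weight is found by exhaustive enumeration, which is feasible only for small m.

```python
from itertools import combinations, product

def rm_generator(r, m):
    """Rows of G(r, m): v_0 and all coordinate-wise products of up to r of v_1..v_m."""
    n = 2 ** m
    # v_i alternates blocks of 2^(m-i) zeros with blocks of 2^(m-i) ones.
    v = {i: [(j >> (m - i)) & 1 for j in range(n)] for i in range(1, m + 1)}
    rows = [[1] * n]  # v_0
    for size in range(1, r + 1):
        for subset in combinations(range(1, m + 1), size):
            row = [1] * n
            for i in subset:
                row = [a & b for a, b in zip(row, v[i])]  # coordinate-wise AND
            rows.append(row)
    return rows

def min_weight(rows):
    """Minimum weight over all nonzero coefficient combinations (XOR of rows)."""
    n = len(rows[0])
    best = n
    for coeffs in product((0, 1), repeat=len(rows)):
        if not any(coeffs):
            continue
        word = [0] * n
        for c, row in zip(coeffs, rows):
            if c:
                word = [a ^ b for a, b in zip(word, row)]
        best = min(best, sum(word))
    return best

G13 = rm_generator(1, 3)
```

A result of 0 from `min_weight` would signal linearly dependent rows, so a positive value also confirms that the number of rows equals the dimension.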
15.1.75 Remark [2849] The generator matrix G(r, m) of RM_{2^m,2}(r, m) can be written as
\[ G(r, m) = \begin{pmatrix} G(r, m-1) & G(r, m-1) \\ 0 & G(r-1, m-1) \end{pmatrix}. \]
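It follows that
\[ RM_{2^m,2}(r, m) = \{ (u, u + v) \mid u \in RM_{2^{m-1},2}(r, m-1),\ v \in RM_{2^{m-1},2}(r-1, m-1) \}. \]
The minimum distance of RM_{2^m,2}(r, m) can be established using Lemma 15.1.93 below.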
15.1.76 Remark Given an (n, k, d)qm code C over Fqm , there are several natural ways to obtain
codes over subfields. Two methods are of particular interest: subfield subcodes and trace
codes, introduced below.
15.1.77 Definition The subfield subcode of an (n, k, d)_{q^m} code C is the set of codewords of C with
all elements in Fq, i.e.,
\[ C^{sf}_{q^m|q} = C \cap \mathbb{F}_q^n. \]
15.1.78 Lemma The subfield subcode C^{sf}_{q^m|q} obtained from the (n, k, d)_{q^m} code C is a linear (n, ≥
mk − (m − 1)n, ≥ d)q code.
15.1.79 Remark The subfield subcode of an (n, M, d)qm nonlinear code is, as in the linear case,
the set of codewords with all coordinates in Fq . The minimum distance of the subfield sub-
code is at least d. In the linear case, the bound on the dimension of the subfield subcode
is obtained from considering the dimension of the subspaces involved. Thus replacing ele-
ments of the parity check matrix of C by m-tuple columns over Fq , the subfield subcode
is the set of n-tuples over Fq orthogonal to all (n − k)m rows and has dimension at least
n − (n − k)m = mk − (m − 1)n.
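15.1.81 Definition Given an (n, k, d)_{q^m} code C, the trace code of C is defined as
\[ C^{tr}_{q^m|q} = \{ \mathrm{Tr}_{q^m|q}(c) \mid c \in C \}, \]
where the trace is applied to each coordinate of c.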
15.1.82 Lemma The trace code of an (n, k, d)qm code C is an (n, ≤ mk, ≤ d)q code.
15.1.83 Remark The dimension bound of Lemma 15.1.82 is the trivial one; since the parent code
has (q^m)^k codewords, the trace code over Fq can have at most q^{mk} codewords. The bound
is achieved if and only if Tr_{q^m|q}(c_1) ≠ Tr_{q^m|q}(c_2) for any two distinct codewords c_1, c_2 ∈ C.
15.1.84 Remark The following result, due to Delsarte [802], shows the relation between subfield
subcodes and trace codes.
15.1.85 Lemma [802] Let C be a linear code over F_{q^m}. Then the dual of the subfield subcode of C
is the trace code of the dual code of C over F_{q^m}, i.e.,
\[ \left( C^{sf}_{q^m|q} \right)^{\perp} = \left( C^{\perp} \right)^{tr}_{q^m|q}. \]
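15.1.86 Remark The following commutative diagram may be of value in visualizing these relationships:
\[
\begin{array}{ccc}
C & \xrightarrow{\ \text{dual}\ } & C^{\perp} \\
\big\downarrow {\scriptstyle \text{subfield}} & & \big\downarrow {\scriptstyle \text{trace}} \\
C^{sf}_{q^m|q} & \xrightarrow{\ \text{dual}\ } & \left( C^{sf}_{q^m|q} \right)^{\perp} = \left( C^{\perp} \right)^{tr}_{q^m|q}
\end{array}
\]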
15.1.87 Remark There are many ways of modifying a given code. The definitions given below are
typical, although the terminology is not uniform in the literature.
15.1.88 Definition An (n, k, d)q code C can be modified in the following ways:
1. Expurgation: Certain codewords are deleted (no change of length).
2. Augmentation: The number of codewords is increased (no change of length).
3. Puncturing: The length of the code is reduced by deleting a coordinate position,
generally without reducing the size of the code.
4. Extending: The length of the code is increased without increasing the size of
the code.
5. Shortening: The length and dimension of the code are reduced.
6. Lengthening: The length and dimension of the code are increased.
15.1.89 Example Examples of the above operations on an (n, k, d)q code are given below.
1. If a row of a generator matrix is deleted, the result is an (n, k − 1, d′)q code where
d′ ≥ d. Similarly a row could be added to a parity check matrix, which is linearly
independent of the existing rows, to give the same result.
2. A row could be added to a generator matrix that is linearly independent of the
existing rows to give an (n, k + 1, d′)q code where d′ ≤ d.
3. If we delete a coordinate position contained in the nonzero positions of a codeword
of minimum weight (assuming it is not of weight 1), the result is an (n − 1, k, d − 1)q
code.
4. Adding an overall parity check produces an (n + 1, k, d′)q code where d′ = d or
d + 1 depending on the code structure. For example, if q = 2 and the original
code contains minimum weight codewords of odd weight (in which case half the
words are of odd weight and half of even weight), the extended code has minimum
distance d + 1. This is equivalent to adding a column of zeroes to the parity check
matrix and then a row of all ones.
5. If a generator matrix of the code is in systematic form and the first row and
column are deleted, the result is an (n − 1, k − 1, d′)q code where d′ ≥ d.
6. Adding a row to a generator matrix, linearly independent of the existing rows,
and then adding a column gives an (n + 1, k + 1, d′)q code where d′ ≤ d + 1.
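Several of these operations can be tried on a small binary code. The sketch below uses a systematic generator matrix of the (7, 4, 3) binary Hamming code; the matrix and all helper names are assumptions chosen for illustration, and distances are computed by brute force.

```python
from itertools import product

def codewords(G):
    """All binary codewords spanned by the rows of G."""
    k, n = len(G), len(G[0])
    return [tuple(sum(c * G[i][j] for i, c in enumerate(coeffs)) % 2 for j in range(n))
            for coeffs in product((0, 1), repeat=k)]

def min_distance(G):
    """Minimum weight of a nonzero codeword (equals the minimum distance for linear codes)."""
    return min(sum(w) for w in codewords(G) if any(w))

def extend(G):
    """Append an overall parity check coordinate to every row."""
    return [row + [sum(row) % 2] for row in G]

def puncture(G, pos):
    """Delete coordinate position pos."""
    return [row[:pos] + row[pos + 1:] for row in G]

def shorten_systematic(G):
    """Delete the first row and first column of a systematic generator matrix."""
    return [row[1:] for row in G[1:]]

# Systematic generator matrix [I_4 | A] of a (7, 4, 3) binary Hamming code.
G = [[1, 0, 0, 0, 1, 1, 0],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
```

Extending gives an (8, 4, 4) code, puncturing position 0 (which lies in the support of the weight 3 first row) gives a (6, 4, 2) code, and shortening gives a (6, 3, 3) code.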
15.1.90 Remark The trace code and subfield subcode of the previous subsection may also be thought
of as ways of modifying a given code.
15.1.91 Remark Given two codes, an (n1 , k1 , d1 )q code C1 with generator matrix G1 and an
(n2 , k2 , d2 )q code C2 with generator matrix G2 , methods to produce a third code C are
considered.
15.1.92 Remark The following construction proves useful in some situations, such as the construc-
tion of the Reed-Muller codes of Subsection 15.1.2.3, already noted.
15.1.93 Lemma Let C_i be an (n, k_i, d_i)q code, i = 1, 2. Then the code
C = {(u, u + v) | u ∈ C1, v ∈ C2}
is a (2n, k1 + k2, min{2d1, d2})q code.
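A small numerical check of the (u, u + v) construction is sketched below, assuming the standard parameters (2n, k1 + k2, min{2d1, d2}) for the resulting code. The component codes (the (4, 3, 2) even weight code and the (4, 1, 4) repetition code) and the function names are choices made for this example.

```python
from itertools import product

def codewords(G, q=2):
    """All codewords spanned by the rows of G over F_q (q prime)."""
    k, n = len(G), len(G[0])
    return [tuple(sum(c * G[i][j] for i, c in enumerate(coeffs)) % q for j in range(n))
            for coeffs in product(range(q), repeat=k)]

def u_u_plus_v(C1, C2, q=2):
    """The code {(u, u + v) : u in C1, v in C2}."""
    return [u + tuple((a + b) % q for a, b in zip(u, v)) for u in C1 for v in C2]

def min_weight(code):
    """Minimum Hamming weight of a nonzero word."""
    return min(sum(x != 0 for x in w) for w in code if any(w))

C1 = codewords([[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 1, 1]])  # (4, 3, 2) even weight code
C2 = codewords([[1, 1, 1, 1]])                              # (4, 1, 4) repetition code
C = u_u_plus_v(C1, C2)
```

Here the result is an (8, 4, 4) binary code, in agreement with min{2 · 2, 4} = 4.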
15.1.94 Definition The direct sum code is the (n1 + n2 , k1 + k2 , d)q code where d = min{d1 , d2 }
with generator matrix G1 ⊕ G2 (block diagonal sum of matrices).
15.1.95 Definition The product code C1 ⊗ C2 is the (n1 n2 , k1 k2 , d1 d2 )q code with generator matrix
G1 ⊗ G2 , the tensor product of the two component generator matrices.
15.1.96 Remark One can think of the codewords of the product code in the following manner.
Consider an array of k1 × k2 information symbols from Fq . The array is first extended to
a k1 × n2 array by completing each row to codewords in C2 . The columns of this array are
completed to an n1 × n2 array by completing each column to a codeword in C1 . Notice the
bottom right (n1 − k1 ) × (n2 − k2 ) array consists of checks on checks. The resulting n1 × n2
matrices are the codewords of C1 ⊗ C2 . The process could have been done by first completing
the columns of the information symbols with codewords in C1 and then completing rows. It
is easily verified the same final array is obtained.
15.1.97 Definition Let C1 be a linear (N, K, D)qm code and C2 a linear (n, m, d)q code. Fix a basis of
Fqm over Fq and represent elements of Fqm as m-tuples over Fq . These m-tuples represent
information positions in C2 which therefore give a one-to-one correspondence between
elements of Fqm and codewords of C2 . The concatenated code of C1 by C2 is obtained by
replacing the Fqm -elements of codewords of C1 by codewords of C2 corresponding to the
information m-tuples of the Fqm -elements.
15.1.98 Lemma The concatenated code of C1 by C2 is a linear (nN, kK, ≥ dD)q code.
15.1.99 Remark The central problem of coding theory is to construct codes with as many codewords
as possible for a given minimum distance (or as large a minimum distance as possible for a
given code size), in such a way that the code can be efficiently encoded and decoded. This
section reviews many of the bounds available. While interest is largely in linear codes, a
version of a few of the bounds holds true for nonlinear codes as well. Most of the books
on coding have a treatment similar to the one here. The treatment of bounds in [2849] is
particularly elegant and succinct.
15.1.100 Definition Define Aq (n, d) to be the size of the largest code of length n over Fq with
minimum distance d, i.e.,
Aq (n, d) = max {M | an (n, M, d)q code exists}.
15.1.102 Remark The following bound is a lower bound on the size of a maximal code.
15.1.103 Theorem (Sphere covering bound, Varshamov-Gilbert bound) [2849, 2855]
\[ A_q(n, d) \ge \frac{q^n}{s_q(d - 1, n)}. \]
15.1.104 Remark Theorem 15.1.103 is shown by surrounding each codeword of a code of maximum
size with a sphere of radius d − 1 and considering the union of such spheres. If the space
is not exhausted, it would be possible to add a word at distance at least d from every
codeword, implying the code was not optimal.
For linear codes a different argument leads to a similar result. For a given n and k, let
r = n − k, and attempt to construct an r × n parity check matrix, with the property that
any d − 1 columns are linearly independent, by adding columns sequentially. The process
may be started with the r × r identity matrix. Suppose j − 1 such columns have been found.
A j-th column can be added if
\[ \sum_{i=1}^{d-2} \binom{j-1}{i} (q - 1)^i < q^r - 1. \]
If n is the largest value of j for which this inequality holds, then an (n, k, d)q code exists.
Notice that n is also the smallest value of n for which
\[ \sum_{i=1}^{d-2} \binom{n}{i} (q - 1)^i \ge q^{n-k} - 1, \]
that is,
\[ q^k \ge q^n / s_q(d - 2, n). \]
This result can be compared to the result of Theorem 15.1.103 and is also referred to as the
Varshamov-Gilbert bound.
15.1.105 Remark The following bounds are upper bounds on the size of codes. In many cases it is
possible to find codes that meet the bounds with equality.
15.1.106 Theorem (Sphere packing bound, Hamming bound) [2849] For a positive integer d with
1 ≤ d ≤ n, let e = ⌊(d − 1)/2⌋. Then
\[ A_q(n, d) \le \frac{q^n}{\sum_{i=0}^{e} \binom{n}{i} (q - 1)^i} = \frac{q^n}{s_q(e, n)}. \]
15.1.107 Remark In an optimal code with Aq (n, d) codewords, it is possible to surround codewords
with nonintersecting spheres of radius e and the bound follows.
15.1.108 Definition A code whose size meets the bound of Theorem 15.1.106 with equality is perfect.
15.1.109 Remark In a perfect code the spheres of radius e around the codewords are nonintersect-
ing and exhaust the space. The Hamming codes, both binary and nonbinary, are perfect
codes. Besides the linear Hamming codes there are nonlinear perfect codes with the same
parameters. The (23, 12, 7)2 binary Golay code and the (11, 6, 5)3 ternary Golay code (to be
discussed later) are perfect. There are also trivial perfect codes: (i) the (n, n, 1)q code (the
complete space), e = 0, (ii) the (n, 1, n)2 binary repetition code for n odd, e = (n − 1)/2,
(iii) a code of length n over Fq consisting of a single codeword, e = n, and (iv) a binary
code of odd length containing only two words c and its complement c + 1, e = (n − 1)/2
[1558, 2810, 2811].
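The sphere covering and sphere packing bounds, and the perfect code condition, are easy to evaluate numerically. The following is a minimal sketch: the function names are illustrative, and s_q(r, n) is taken to be the number of vectors within Hamming distance r of a fixed vector, as in the formula of Theorem 15.1.106.

```python
from math import comb

def sphere_size(r, n, q):
    """s_q(r, n) = sum_{i=0}^{r} C(n, i) (q - 1)^i."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(r + 1))

def gilbert_lower_bound(n, d, q):
    """Smallest integer M with M >= q^n / s_q(d - 1, n)."""
    return -(-q ** n // sphere_size(d - 1, n, q))  # ceiling division

def hamming_upper_bound(n, d, q):
    """Largest integer M with M <= q^n / s_q(e, n), e = floor((d - 1) / 2)."""
    return q ** n // sphere_size((d - 1) // 2, n, q)

def is_perfect(n, k, d, q):
    """True if an (n, k, d)_q code would meet the Hamming bound with equality."""
    return q ** k * sphere_size((d - 1) // 2, n, q) == q ** n
```

For example, `is_perfect(23, 12, 7, 2)` and `is_perfect(11, 6, 5, 3)` check the parameters of the two Golay codes mentioned above.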
15.1.110 Theorem (Plotkin bound) [2849] If an (n, M, d)q code exists, then
\[ d \le \frac{nM(q - 1)}{(M - 1)q}. \]
15.1.111 Remark In the following it will be convenient to define θ = (q − 1)/q. By rearranging terms
in the above theorem, another expression for the Plotkin bound is
\[ A_q(n, d) \le \frac{d}{d - \theta n} \quad \text{for } d > \theta n. \]
15.1.112 Theorem (Griesmer bound) [2849] If an (n, k, d)q code exists, then
\[ n \ge \sum_{i=0}^{k-1} \left\lceil \frac{d}{q^i} \right\rceil. \]
15.1.113 Theorem (Singleton bound)
\[ A_q(n, d) \le q^{n-d+1}. \]
15.1.114 Remark Puncturing the code on d − 1 nonzero coordinate positions of a minimum weight
codeword implies all remaining codewords are distinct and the Singleton bound follows.
15.1.115 Remark For a linear (n, k, d)q code, the Singleton bound reduces to
d ≤ n − k + 1.
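The Plotkin, Griesmer, and Singleton bounds can be evaluated directly from their statements. The sketch below uses exact rational arithmetic for θ; the function names are assumptions, and the Plotkin value is rounded down because a code size is an integer.

```python
from fractions import Fraction

def singleton_max_size(n, d, q):
    """Upper bound q^(n - d + 1) on the size of a code."""
    return q ** (n - d + 1)

def plotkin_max_size(n, d, q):
    """Upper bound d / (d - theta n), valid only when d > theta n; None otherwise."""
    theta = Fraction(q - 1, q)
    if d <= theta * n:
        return None
    return int(d / (d - theta * n))  # floor of a positive rational

def griesmer_min_length(k, d, q):
    """Lower bound on n: the sum of ceil(d / q^i) for i = 0, ..., k - 1."""
    return sum(-(-d // q ** i) for i in range(k))
```

For instance, `griesmer_min_length(3, 4, 2)` and `plotkin_max_size(7, 4, 2)` return 7 and 8, the length and size of the binary (7, 3, 4) simplex code.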
15.1.116 Definition A code (either linear or nonlinear) that meets the Singleton bound with equality
is a maximum distance separable (MDS) code.
15.1.117 Remark
1. For a linear (n, k, d)q code, the Singleton bound is simply a reflection that every
set of d − 1 columns of the parity check matrix (an (n − k) × n matrix) is linearly
independent, and hence the largest value of d − 1 is n − k.
2. The dual of an (n, k, d = n − k + 1)q MDS code is an (n, n − k, k + 1)q MDS code.
3. Important examples of MDS codes are the Reed-Solomon codes discussed in
the next section. Trivial examples include the (n, 1, n)q repetition code and the
(n, n, 1)q full code.
15.1.118 Theorem (Elias bound) [2390, 2849] Assume that r ≤ θn and r^2 − 2θnr + θnd > 0. Then
\[ A_q(n, d) \le \frac{\theta n d}{r^2 - 2\theta n r + \theta n d} \cdot \frac{q^n}{s_q(r, n)}. \]
15.1.119 Remark It is of interest to consider the constraints on codes and how they are reflected
in the bounds discussed as the length of the code increases to infinity. Notice that for a
linear (n, k, d)q code, the rate is k/n, the ratio of the number of information symbols to
code symbols. Asymptotically as the length increases, this rate is denoted αq (δ) where δ
is the asymptotic normalized distance d/n. The following definition is valid for both linear
and nonlinear codes.
\[ \alpha_q(\delta) = \limsup_{n \to \infty} \frac{1}{n} \log_q A_q(n, \delta n). \]
15.1.121 Remark To discuss asymptotic versions of the bounds, the following definition of the q-ary
entropy function is needed.
15.1.123 Remark The work of Delsarte [801] on distance distributions for codes and their
MacWilliams transforms (Definition 15.1.43) suggests the following linear programming
approach to upper bounding the size M of an (n, M, d)q code [2050, 2849]. Let
A = (A_0 = 1, A_1, . . . , A_n) be the distance distribution of a putative (n, M, d)q code, where
A_1 = · · · = A_{d−1} = 0 and \( \sum_{i=0}^{n} A_i = M \). The linear program consists of maximizing
\( \sum_{i=0}^{n} A_i \) subject to A_0 = 1, A_1 = · · · = A_{d−1} = 0, A_i ≥ 0 for i = d, . . . , n, and the
MacWilliams transform constraints
\[ \sum_{i=0}^{n} A_i P_k(i) \ge 0, \quad k = 0, 1, \ldots, n. \]
This bound, often referred to as the linear programming or LP bound, is not very explicit,
but it leads to the following two theorems [2050] which are quite strong upper bounds for
binary codes. Noting that H2(x) = −x log2(x) − (1 − x) log2(1 − x), define g(x) by
g(x) = H2((1 − √(1 − x))/2).
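The MacWilliams transform constraints can be checked numerically for a known distance distribution. The sketch below assumes the usual Krawtchouk polynomial P_k(i) = Σ_j (−1)^j (q − 1)^{k−j} C(i, j) C(n − i, k − j); this explicit form and the function names are assumptions, since Definition 15.1.43 is not reproduced here. It does not solve the linear program itself.

```python
from math import comb

def krawtchouk(k, i, n, q=2):
    """P_k(i) = sum_j (-1)^j (q - 1)^(k - j) C(i, j) C(n - i, k - j)."""
    return sum((-1) ** j * (q - 1) ** (k - j) * comb(i, j) * comb(n - i, k - j)
               for j in range(k + 1))

def transform_sums(A, q=2):
    """The values sum_i A_i P_k(i) for k = 0, ..., n."""
    n = len(A) - 1
    return [sum(A[i] * krawtchouk(k, i, n, q) for i in range(n + 1))
            for k in range(n + 1)]

# Distance distribution of the binary (7, 4, 3) Hamming code.
A_hamming = [1, 0, 0, 7, 7, 0, 0, 1]
sums = transform_sums(A_hamming)
```

For a linear code these sums are the number of codewords times the weight distribution of the dual code, so here they are 16 times the distribution (1, 0, 0, 0, 7, 0, 0, 0) of the simplex code.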
15.1.125 Remark The above bound is referred to as the MRRW bound (after the initials of the four
authors of [2050]), and it implies the following one - the bounds are actually the same over
a range of values of δ.
15.1.126 Theorem For binary codes, if 0 < δ < 1/2,
q^{n H_q(r/n)}.
Apart from the MRRW bound, the following asymptotic versions of the bounds follow
directly from their finite counterparts.
15.1.130 Theorem (Asymptotic bounds) [1558, 2390, 2849]
1. Asymptotic Varshamov-Gilbert bound:
αq (δ) ≥ 1 − Hq (δ), 0 ≤ δ ≤ θ.
2. Asymptotic Singleton bound:
αq (δ) ≤ 1 − δ, 0 ≤ δ ≤ 1.
3. Asymptotic Plotkin bound:
αq (δ) ≤ 1 − δ/θ, 0 ≤ δ < θ.
4. Asymptotic Hamming bound:
αq (δ) ≤ 1 − Hq (δ/2), 0 ≤ δ ≤ θ.
5. Asymptotic Elias bound:
αq(δ) ≤ 1 − Hq(θ − √(θ(θ − δ))), 0 ≤ δ < θ.
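The five asymptotic bounds can be compared numerically. The sketch below assumes the standard q-ary entropy function H_q(x) = x log_q(q − 1) − x log_q x − (1 − x) log_q(1 − x) with H_q(0) = 0, since its definition is not reproduced above; the function names are illustrative.

```python
from math import log, sqrt

def entropy(x, q=2):
    """q-ary entropy function (assumed standard form), for 0 <= x <= 1."""
    if x == 0:
        return 0.0
    if x == 1:
        return log(q - 1, q)
    return x * log(q - 1, q) - x * log(x, q) - (1 - x) * log(1 - x, q)

def asymptotic_bounds(delta, q=2):
    """Values of the asymptotic bounds on alpha_q(delta) for 0 <= delta < theta."""
    theta = (q - 1) / q
    return {
        "gv": 1 - entropy(delta, q),                                      # lower bound
        "singleton": 1 - delta,                                           # upper bounds
        "plotkin": 1 - delta / theta,
        "hamming": 1 - entropy(delta / 2, q),
        "elias": 1 - entropy(theta - sqrt(theta * (theta - delta)), q),
    }

b = asymptotic_bounds(0.1, q=2)
```

At δ = 0.1 for binary codes the Elias bound is the tightest of the four upper bounds listed, and the gap between it and the Varshamov-Gilbert lower bound is clearly visible.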
15.1.131 Definition Let x = (x_0, x_1, . . . , x_{n−1}) ∈ F_q^n. A cyclic (right) shift of x (with wraparound)
is (x_{n−1}, x_0, x_1, . . . , x_{n−2}). Let C be a linear code. Then C is a cyclic code if every cyclic
shift of a codeword in C is also a codeword in C.
15.1.132 Remark Some benefits to assuming a code is linear have been seen. Restricting attention
further to cyclic codes allows the formulation of efficient algorithms for the construction,
encoding, and decoding of them. The simple addition of requiring cyclic shifts of codewords
to be codewords introduces a strong algebraic structure into the picture that allows these
benefits.
15.1.134 Example The ring Fq[x] is the set of polynomials in the indeterminate x with coefficients
in Fq. The rings Z, Fq[x], and Fq[x]/⟨x^n − 1⟩ are PIDs. In Z the ideal ⟨2⟩ is {2k | k ∈ Z},
i.e., the set of even integers. In Fq[x]
⟨g(x)⟩ = {a(x)g(x) | a(x) ∈ Fq[x]}
is the ideal generated by the polynomial g(x).
15.1.135 Remark In the quotient ring Fq[x]/⟨x^n − 1⟩, each coset has a unique coset representative that
is either the zero polynomial or has degree less than n. For simplicity, the coset a(x) + ⟨x^n − 1⟩
with a(x) = a_0 + a_1 x + · · · + a_{n−1} x^{n−1} will be denoted a(x). Therefore the quotient ring
Fq[x]/⟨x^n − 1⟩ is the set of polynomials
{a_0 + a_1 x + · · · + a_{n−1} x^{n−1} | a_i ∈ Fq, i = 0, 1, . . . , n − 1}
with addition and multiplication modulo x^n − 1.
There is a natural map between n-tuples over Fq and polynomials in Fq[x]/⟨x^n − 1⟩,
namely
F_q^n → Fq[x]/⟨x^n − 1⟩
(a_0, a_1, . . . , a_{n−1}) ↦ a_0 + a_1 x + · · · + a_{n−1} x^{n−1}.
Thus a linear code C can be viewed equivalently as a subspace of F_q^n and an Fq-subspace of
Fq[x]/⟨x^n − 1⟩. Notice that if (a_0, a_1, . . . , a_{n−1}) ↦ a_0 + a_1 x + · · · + a_{n−1} x^{n−1}, then the cyclic
shift (a_{n−1}, a_0, a_1, . . . , a_{n−2}) corresponds to the polynomial x(a_0 + a_1 x + · · · + a_{n−1} x^{n−1})
(mod x^n − 1) which gives the reason for interest in the quotient ring Fq[x]/⟨x^n − 1⟩.
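The correspondence between a cyclic shift and multiplication by x modulo x^n − 1 can be checked directly. In this sketch polynomials are coefficient lists (a_0, ..., a_{n−1}) over a prime field; the function names and the sample vector over F_3 are assumptions for illustration.

```python
def cyclic_shift(word):
    """Right cyclic shift with wraparound: (a_{n-1}, a_0, ..., a_{n-2})."""
    return word[-1:] + word[:-1]

def poly_mul_mod(a, b, n, q):
    """Product a(x) b(x) in F_q[x]/<x^n - 1> (q prime), as a length n coefficient list."""
    result = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            # x^(i + j) reduces to x^((i + j) mod n) because x^n = 1 in the quotient ring.
            result[(i + j) % n] = (result[(i + j) % n] + ai * bj) % q
    return result

a = [1, 2, 0, 1, 2]   # 1 + 2x + x^3 + 2x^4 over F_3, with n = 5
x = [0, 1, 0, 0, 0]   # the polynomial x
shifted = poly_mul_mod(a, x, 5, 3)
```

Multiplying by x^j in the same way gives the j-fold cyclic shift.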
15.1.136 Lemma A linear code C is a cyclic code if and only if it is an ideal in Fq[x]/⟨x^n − 1⟩.
15.1.137 Remark Since Fq[x]/⟨x^n − 1⟩ is a PID, every nonzero ideal C has a generator polynomial
g(x), i.e., C = ⟨g(x)⟩, and one such generator polynomial is the unique monic codeword
polynomial of least degree. It also follows that g(x) divides x^n − 1 (written g(x) | (x^n − 1))
as otherwise x^n − 1 = a(x)g(x) + r(x) for some a(x) and r(x) where r(x) has degree strictly
less than that of g(x). Since by definition r(x) would be in C, r(x) nonzero contradicts the
fact that g(x) was of least degree. For g(x) | (x^n − 1), g(x) generates an ideal in Fq[x]/⟨x^n − 1⟩
that is the set of polynomials divisible by g(x). More specifically
⟨g(x)⟩ = {a(x)g(x) | a(x) ∈ Fq[x], deg a(x) < n − deg g(x)}.
The term generator polynomial of a cyclic code is reserved for the unique monic polynomial
generating the code and dividing x^n − 1. Thus cyclic codes are determined by factors of x^n − 1.
Indeed, if gcd(n, q) = 1, x^n − 1 factors into distinct irreducible polynomials f_i(x) over Fq,
i = 1, 2, . . . , t, and then there are 2^t cyclic codes of length n over Fq. If gcd(n, q) ≠ 1, x^n − 1
has repeated irreducible factors. For example if gcd(n, q) = p where Fq has characteristic
p, then n = n_1 p and x^n − 1 = (x^{n_1} − 1)^p. More generally, if \( x^n - 1 = \prod_{i=1}^{t} f_i^{e_i}(x) \) is the
factorization, the possible number of cyclic codes is \( \prod_{i=1}^{t} (e_i + 1) \); see [1945]. Henceforth it
is assumed that gcd(n, q) = 1.
15.1.138 Remark A cyclic (n, k, d)q code C has a generator polynomial g(x) of degree n − k. If
g(x) = g_0 + g_1 x + · · · + g_{n−k} x^{n−k}, the code is viewed equivalently as the ideal of Fq[x]/⟨x^n − 1⟩
15.1.140 Remark Let C be an (n, k, d)q cyclic code with generator polynomial g(x) whose roots are
α_1, α_2, . . . , α_{n−k}. The roots will in general be in an extension field of Fq, say F_{q^m} where
n | (q^m − 1). Another way of defining the code is to note that c(x) = a(x)g(x) is a codeword
polynomial if and only if
c(α_i) = 0, i = 1, 2, . . . , n − k.
Thus c = (c_0, c_1, . . . , c_{n−1}) ∈ C if and only if Hc^T = 0^T where
\[
H = \begin{pmatrix}
1 & \alpha_1 & \alpha_1^2 & \cdots & \alpha_1^{n-1} \\
1 & \alpha_2 & \alpha_2^2 & \cdots & \alpha_2^{n-1} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & \alpha_{n-k} & \alpha_{n-k}^2 & \cdots & \alpha_{n-k}^{n-1}
\end{pmatrix}.
\]
15.1.141 Remark Let C1 and C2 be cyclic codes with generator polynomials g1 (x) and g2 (x), respec-
tively. Then C1 ⊆ C2 if and only if g2 (x) | g1 (x).
15.1.142 Remark In practice there is often a preference to use systematic codes where the information
symbols appear explicitly in the codeword. To achieve this one might row-reduce the
generator matrix of (15.1.2) (with possible column permutations - see Remark 15.1.32) to
be of the form
G′ = [I_k | A]
as can be done for any linear code. One could then encode the information word m (of
length k, recalling that the message length is the same as code dimension) as mG′. However
if column permutations are required to obtain G′, the resulting code with generator matrix
G′ may not be cyclic. For an (n, k, d)q cyclic code with generator polynomial g(x) of degree
n − k, one might also do the following encoding. Divide x^{n−k} m(x) by g(x), where m(x) is
the information polynomial, to obtain
x^{n−k} m(x) = a(x)g(x) + r(x), deg r(x) < n − k.
Thus x^{n−k} m(x) − r(x) = a(x)g(x) is a codeword with the k information symbols in the
“high end” and n − k parity check symbols in the “low end.”
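The division-based systematic encoding can be sketched for binary cyclic codes by storing a polynomial as an integer whose bit i is the coefficient of x^i. This representation, the function names, and the choice g(x) = x^3 + x + 1 with n = 7 are assumptions made for the example.

```python
def poly_mod(a, g):
    """Remainder of a(x) divided by g(x) over F_2; bit i is the coefficient of x^i."""
    dg = g.bit_length() - 1
    while a.bit_length() - 1 >= dg:
        a ^= g << (a.bit_length() - 1 - dg)  # cancel the current leading term
    return a

def encode_systematic(m, g, n):
    """Codeword x^(n-k) m(x) - r(x), where r(x) = x^(n-k) m(x) mod g(x)."""
    r_deg = g.bit_length() - 1               # n - k
    if m >> (n - r_deg):
        raise ValueError("message polynomial must have degree < k")
    shifted = m << r_deg                     # x^(n-k) m(x)
    return shifted ^ poly_mod(shifted, g)    # subtraction is XOR over F_2

g = 0b1011          # g(x) = x^3 + x + 1, a divisor of x^7 - 1 over F_2
n = 7
codebook = [encode_systematic(m, g, n) for m in range(16)]
```

The top k bits of each codeword are the message and the low n − k bits are the parity checks, matching the "high end" and "low end" description above.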
15.1.143 Theorem [1558] Let C be a cyclic code of length n over Fq with generator polynomial g(x)
and let v(x) ∈ Fq[x]. Then C = ⟨v(x)⟩ if and only if gcd(v(x), x^n − 1) = g(x). Equivalently
v(x) generates C if and only if the n-th roots of unity that are zeros of v(x) are precisely the
zeros of g(x).
15.1.144 Theorem Let Ci , i = 1, 2, be cyclic codes with generator polynomials gi (x), i = 1, 2. Then
C1 ∩ C2 has generator polynomial lcm(g1 (x), g2 (x)) and C1 + C2 has generator polynomial
gcd(g1 (x), g2 (x)).
15.1.145 Remark The additional constraint of cyclic codes (over linear) allows considerably more
information on the minimum distance of the code to be obtained. The minimal polynomial
over Fq of an element β in some extension field of Fq is the monic irreducible polynomial,
denoted Mβ (x), of least degree in Fq [x] that has β as a zero.
15.1.146 Remark Recall that the q-ary Hamming code H_{n,q}(m) is a (π_{m,q}, π_{m,q} − m, 3)q code where
n = π_{m,q} = (q^m − 1)/(q − 1). When gcd(n, q − 1) = 1, H_{n,q}(m) can be made cyclic in the
following manner. Let α be a primitive element in F_{q^m}. Then β = α^{q−1} is a primitive n-th
root of unity. The parity check matrix of H_{n,q}(m) can be taken as
H = [1 β β^2 · · · β^{n−1}].
Under the stated conditions the minimal polynomial of β over Fq is of degree m. It remains
to verify that no two columns of H are multiples of each other over Fq to ensure a minimum
distance of 3. Since the dual of this code has all codewords of weight q^{m−1} the weight
enumerator of the q-ary Hamming code can be obtained via the MacWilliams identities, as
in the binary case. This is an example of the first class of cyclic codes to be considered.
15.1.147 Definition Let n be a positive integer relatively prime to q. For i an integer, the q-cyclotomic
coset modulo n containing i is C_i = {i, iq, . . . , iq^{r−1}} (mod n) where r is the
smallest positive integer such that iq^r ≡ i (mod n).
15.1.148 Remark The smallest extension field of Fq that contains a primitive n-th root of unity is
F_{q^m} where m is the size |C_1| of the q-cyclotomic coset modulo n containing 1. If β is a
primitive n-th root of unity in F_{q^m}, then the minimal polynomial of β^a over Fq is
\[ M_{\beta^a}(x) = \prod_{j \in C_a} (x - \beta^j). \]
Moreover,
\[ x^n - 1 = \prod_{s} M_{\beta^s}(x), \]
where s runs through a set of representatives of the distinct q-cyclotomic cosets modulo n.
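Cyclotomic cosets are simple to compute, and with gcd(n, q) = 1 their sizes are the degrees of the irreducible factors of x^n − 1 over Fq. A minimal sketch follows; the function name is an assumption.

```python
def cyclotomic_cosets(n, q):
    """The q-cyclotomic cosets modulo n, assuming gcd(n, q) = 1."""
    seen, cosets = set(), []
    for i in range(n):
        if i in seen:
            continue
        coset, j = [], i
        while j not in coset:      # multiply by q until the orbit closes
            coset.append(j)
            j = (j * q) % n
        cosets.append(coset)
        seen.update(coset)
    return cosets

cosets_7 = cyclotomic_cosets(7, 2)
cosets_15 = cyclotomic_cosets(15, 2)
```

For n = 7 and q = 2 there are t = 3 cosets, so x^7 − 1 has three irreducible factors over F_2 (of degrees 1, 3, 3) and there are 2^3 = 8 binary cyclic codes of length 7.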
15.1.149 Definition Let β be a primitive n-th root of unity in an extension field of Fq. Let C be a
cyclic code over Fq of length n with generator polynomial g(x) ∈ Fq[x]. Then there is a
set T ⊆ {0, 1, . . . , n − 1} such that the roots of g(x) are {β^t | t ∈ T}. T is a defining set
(with respect to β) of C.
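15.1.151 Definition [359, 1329, 1516, 1558, 2035] The cyclic code BCH_{n,q}(δ) of length n and
designed distance δ has a generator polynomial of the form
\[ g(x) = \mathrm{lcm}\{ M_{\beta^i}(x) \mid i = a, a + 1, \ldots, a + \delta - 2 \}, \]
where β is a primitive n-th root of unity and a is an integer.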
15.1.152 Remark These codes are referred to as BCHn,q (δ) codes, where BCH stands for the
code name (Bose-Chaudhuri-Hocquenghem after the code originators [359, 1516]), and the
subscripts on BCH are the length and field of definition, and δ is the designed distance of
the code. The relationship between the designed distance and the true minimum distance
of BCHn,q (δ) is discussed below.
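15.1.153 Remark The reason for requiring the generator polynomial of the BCH code to have a
consecutive sequence of roots derives from properties of Vandermonde matrices as follows.
Let β_1, β_2, . . . , β_ℓ be distinct elements in some extension field of Fq. The determinant of the
matrix
\[
\begin{pmatrix}
\beta_1 & \beta_2 & \cdots & \beta_\ell \\
\beta_1^2 & \beta_2^2 & \cdots & \beta_\ell^2 \\
\beta_1^3 & \beta_2^3 & \cdots & \beta_\ell^3 \\
\vdots & \vdots & & \vdots \\
\beta_1^\ell & \beta_2^\ell & \cdots & \beta_\ell^\ell
\end{pmatrix}
\tag{15.1.3}
\]
is \( \prod_{i=1}^{\ell} \beta_i \) times the determinant of the matrix
\[
\begin{pmatrix}
1 & 1 & \cdots & 1 \\
\beta_1 & \beta_2 & \cdots & \beta_\ell \\
\beta_1^2 & \beta_2^2 & \cdots & \beta_\ell^2 \\
\vdots & \vdots & & \vdots \\
\beta_1^{\ell-1} & \beta_2^{\ell-1} & \cdots & \beta_\ell^{\ell-1}
\end{pmatrix}.
\tag{15.1.4}
\]
The determinant of this matrix is \( \prod_{i>j} (\beta_i - \beta_j) \). If β_i = β_j for some i ≠ j, the determinant
of (15.1.4) is zero and the matrix is singular. If the entries of the second row are distinct,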
the determinant is nonzero and the matrix is nonsingular. The determinant of the matrix
in (15.1.3) is
\[ \left( \prod_{i=1}^{\ell} \beta_i \right) \prod_{i>j} (\beta_i - \beta_j). \]
15.1.154 Remark To complete the discussion, it is straightforward to find the inverse of a Vandermonde
matrix of the form (15.1.4) in the following manner. Define the polynomials
\[ f_i(x) = \sum_{j=0}^{\ell-1} f_{ij} x^j = \prod_{j=1,\, j \ne i}^{\ell} \frac{x - \beta_j}{\beta_i - \beta_j} \]
that take on the values of 0 for x = β_j, j ≠ i and 1 for x = β_i. The inverse of the
Vandermonde matrix (15.1.4) is then the matrix [f_{ij}].
15.1.155 Remark To consider the dimension and minimum distance of the BCH code of Definition
15.1.151, note that the parity check matrix may be written in the form
\[
H = \begin{pmatrix}
1 & \beta^{a} & \beta^{2a} & \cdots & \beta^{(n-1)a} \\
1 & \beta^{a+1} & \beta^{2(a+1)} & \cdots & \beta^{(n-1)(a+1)} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & \beta^{a+\delta-2} & \beta^{2(a+\delta-2)} & \cdots & \beta^{(n-1)(a+\delta-2)}
\end{pmatrix}.
\]
The word c ∈ F_q^n is in the BCH code if and only if Hc^T = 0^T. Since by the Vandermonde
argument, any δ − 1 columns of H are linearly independent, the code has minimum distance
at least δ. In addition, as the entries of H are in F_{q^m}, they can be expressed as column
m-tuples over Fq. Therefore the rank of H over Fq is at most m(δ − 1). Thus the BCH code
of Definition 15.1.151 is an (n, ≥ (n − m(δ − 1)), ≥ δ)q cyclic code.
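The exact dimension of a BCH code can be computed from cyclotomic cosets and compared with the bound n − m(δ − 1). The sketch below assumes that the defining set is the union of the q-cyclotomic cosets of a, a + 1, ..., a + δ − 2, with a = 1 as default; this reading of the definition and the function name are assumptions.

```python
def bch_parameters(n, q, delta, a=1):
    """Dimension data for a BCH code of length n over F_q (n > 1, gcd(n, q) = 1)."""
    m = 1
    while pow(q, m, n) != 1:       # m = multiplicative order of q modulo n
        m += 1
    T = set()                      # defining set
    for i in range(a, a + delta - 1):
        j = i % n
        while j not in T:          # add the whole cyclotomic coset of i
            T.add(j)
            j = (j * q) % n
    return {"n": n, "k": n - len(T), "designed_distance": delta,
            "m": m, "defining_set": sorted(T)}

p = bch_parameters(15, 2, 5)
```

For n = 15, q = 2, δ = 5 the defining set is C_1 ∪ C_3 with 8 elements, giving a (15, 7, ≥ 5) code, whereas the general bound n − m(δ − 1) = 15 − 16 is vacuous here.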
15.1.156 Remark There are numerous techniques to improve the bound on the actual minimum
distance of a BCH code, i.e., improve on the lower bound of the designed distance.
Reed-Solomon (RS) codes:
15.1.157 Definition [2447] A q-ary Reed-Solomon code RS_{q−1,q}(k) of dimension k is a BCH code
of length n = q − 1 and designed distance δ = n − k + 1 = q − k over Fq, i.e., for α ∈ Fq
a primitive element, the code has a generator polynomial of the form
\[ g(x) = \prod_{i=a}^{a+\delta-2} (x - \alpha^i). \]
15.1.158 Remark The RS code defined above is a (q −1, k, d = q −k)q MDS code. That the minimum
distance d is exactly q − k is shown as follows. By the Singleton bound d ≤ n − k + 1 = q − k.
Since a BCH code has minimum distance at least its designed distance δ, d ≥ δ = q − k.
Hence d = q − k. If 1 is not a root of g(x), then the extended narrow sense RS code,
obtained by adding an overall parity check, is a (q, k, d + 1)q code and hence is also MDS.
With appropriate care one can define RS codes of length less than q − 1.
Notice also that the code defined as
C = {(f(1), f(α), . . . , f(α^{q−2})) | f(x) ∈ Fq[x], deg f(x) ≤ k − 1}
is the (q − 1, k, q − k)q narrow sense RS code, as now outlined. The code C is certainly linear
and k-dimensional as distinct polynomials of degree at most k − 1 cannot produce equal
15.1.160 Definition A set of k column positions of a linear (n, k, d)q code is an information set if
the corresponding columns of a code generator matrix are linearly independent.
15.1.161 Lemma If C is an (n, k, n − k + 1)q MDS code, then C ⊥ is an (n, n − k, k + 1)q MDS code.
Any k columns of a generator matrix for an (n, k, d)q MDS code C are linearly independent
and hence these column positions form an information set. Similarly any n − k columns of
the parity check matrix for C are linearly independent and any such set of columns forms
an information set for C ⊥ .
15.1.162 Remark If Fq has characteristic 2 (q = 2^s for some s), then by fixing a basis of Fq over
F2 one could expand the elements of Fq in each coordinate position of each codeword of a
(q − 1, k, q − k)q RS code to obtain an (s(q − 1), sk, ≥ (n − k + 1))_2 code that is effective in
correcting burst errors. This is simply a concatenated code with trivial inner code.
15.1.163 Remark It is easy to see that a BCH code can be viewed as a subfield subcode of an RS
code.
15.1.164 Remark [304, 1945, 2484] In order to study the structure of other codes, the notion of
a generalized RS (GRS) code is defined. Let α_1, α_2, . . . , α_n be n ≤ q distinct elements of
Fq and v = (v_1, v_2, . . . , v_n) ∈ (F_q^*)^n where F_q^* = Fq \ {0}. Let k ≤ n. The GRS code
GRS_{n,q}(α, v, k) is defined as
GRS_{n,q}(α, v, k) = {(v_1 f(α_1), v_2 f(α_2), . . . , v_n f(α_n)) | f(x) ∈ Fq[x], deg f(x) ≤ k − 1}.
15.1.165 Theorem The above code GRS_{n,q}(α, v, k) is an (n, k, n − k + 1)q MDS code. Further, the
dual of a GRS code is a GRS code: there is a vector w ∈ (F_q^*)^n such that
GRS_{n,q}(α, v, k)^⊥ = GRS_{n,q}(α, w, n − k).
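The MDS property of a GRS code can be confirmed by exhaustive enumeration over a small prime field. In the sketch below the field F_7, the evaluation points, and the multiplier vector v are arbitrary choices for illustration, and the function names are assumptions.

```python
from itertools import product

def grs_codewords(alphas, v, k, p):
    """All codewords (v_1 f(a_1), ..., v_n f(a_n)) with deg f <= k - 1 over F_p (p prime)."""
    words = []
    for coeffs in product(range(p), repeat=k):
        words.append(tuple(vi * sum(c * pow(a, j, p) for j, c in enumerate(coeffs)) % p
                           for a, vi in zip(alphas, v)))
    return words

def min_weight(code):
    """Minimum Hamming weight of a nonzero codeword."""
    return min(sum(x != 0 for x in w) for w in code if any(w))

p = 7
grs = grs_codewords([0, 1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6, 1], 3, p)

# Reed-Solomon special case: evaluation points 1, alpha, ..., alpha^(q-2) with alpha = 3,
# a primitive element of F_7, and v the all-ones vector.
rs = grs_codewords([pow(3, i, p) for i in range(6)], [1] * 6, 3, p)
```

The first code has parameters (7, 3, 5) and the second (6, 3, 4), both meeting the Singleton bound d = n − k + 1.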
15.1.166 Remark The proof of the above theorem is a simple generalization of the RS/BCH ar-
gument. Notice that the vector of nonzero elements v has the effect of multiplying each
coordinate position by a nonzero element which has little effect on code parameters. How-
ever when subfield subcodes are considered, the effect can be more significant.
15.1.167 Remark A code being MDS is a strong condition. In particular, as the following theorem
shows, its weight enumerator is uniquely determined.
15.1.168 Theorem [2390] If C is an (n, k, d)q MDS code, then its weight distribution {A_i, i =
0, 1, . . . , n} is given by A_0 = 1, A_i = 0 for i = 1, 2, . . . , d − 1, and
\[ A_i = \binom{n}{i} (q - 1) \sum_{j=0}^{i-d} (-1)^j \binom{i-1}{j} q^{\,i-d-j} \quad \text{for } i = d, d + 1, \ldots, n. \]
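The weight distribution formula is straightforward to evaluate, and the values must sum to q^k. A minimal sketch, with an illustrative function name and the (6, 3, 4) Reed-Solomon parameters over F_7 as the example:

```python
from math import comb

def mds_weight_distribution(n, k, q):
    """Weight distribution A_0, ..., A_n of an (n, k, n - k + 1)_q MDS code."""
    d = n - k + 1
    A = [0] * (n + 1)
    A[0] = 1
    for i in range(d, n + 1):
        inner = sum((-1) ** j * comb(i - 1, j) * q ** (i - d - j)
                    for j in range(i - d + 1))
        A[i] = comb(n, i) * (q - 1) * inner
    return A

A = mds_weight_distribution(6, 3, 7)
```

For these parameters the nonzero values are A_4 = 90, A_5 = 108, and A_6 = 144, which together with A_0 = 1 account for all 7^3 = 343 codewords.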
15.1.169 Remark The existence of MDS codes is of interest. Only linear codes are considered here.
As noted, there are the trivial (n, 1, n)q codes (for any alphabet of size q) and parity
check (n, n − 1, 2)q codes for any length n. If one considers the parity check matrix of
a (q − 1, k, q − k)q RS (MDS) code, it is always possible to add columns (1, 0, . . . , 0)T and
(0, 0, . . . , 0, 1)T to obtain a (q + 1, k, q + 2 − k)q code for 2 ≤ k ≤ q − 1. This “doubly
extended” construction also works for BCH codes [2999]. When q is even, it is possible to
obtain a triply extended (q + 2, q − 1, 4)q MDS code and its dual (q + 2, 3, q)q code.
15.1.170 Conjecture [1558] It is postulated that if there is a nontrivial (n, k, n − k + 1)q MDS code
over Fq , then n ≤ q + 1, except when q is even and k = 3 or k = q − 1 in which case
n ≤ q + 2.
15.1.171 Remark Reed-Solomon codes are ubiquitous in communication and storage system stan-
dards of all kinds. Only a few of these are mentioned as examples: digital video broadcast-
ing (DVB) (EN 300-421 (satellite) and 429 (cable)); ADSL (asymmetric digital subscriber
loops) for low rate communications over telephone lines (ANSI T1.413); Internet high speed
(gigabit) optical networks (ITU-T G.795 and G.709 and OC-192); wireless broadband ac-
cess networks (including metropolitan (known commercially as WiMax) and local) in IEEE
802.16 (which is a family of standards covering the many types of networks in such appli-
cations); deep space missions (Voyager and Mariner, among others, although more recent
missions have tended to favor LDPC and turbo codes); satellite communication systems
(IESS 308); two-dimensional bar codes; data storage including CDs, CD-ROMs, DRAMs,
and DVDs and RAID (redundant arrays of inexpensive disks) systems. In each application the
alphabet size and code parameters are carefully chosen for given “channel” conditions. For
example ordinary music CDs use (32, 28, 5)256 and (28, 24, 5)256 codes with 8 bit symbols,
obtained by shortening a (255, 251, 5)256 Reed-Solomon code, and then cross interleaved in
a clever way to enable the system to withstand bursts of lengths up to 4,000 bits, caused
by a scratch of up to 2.5 mm on the disk surface, without replay error. The optical network
standard OC-192 uses symbols from 3 to 12 bits (and RS codes of length up to 4095 symbols
with a maximum of 256 parity symbols).
Duadic codes:
15.1.172 Definition A vector $v = (v_0, v_1, \ldots, v_{n-1}) \in \mathbb{F}_q^n$ is even-like if $\sum_{i=0}^{n-1} v_i = 0$; otherwise v
is odd-like. A code C is even-like if all of its codewords are even-like; C is odd-like if it is
not even-like.
15.1.173 Remark The terms even-like and odd-like generalize the notion of even and odd weight for
binary vectors; i.e., if v ∈ Fn2 , v is even-like if and only if v has even weight. If C is a cyclic
code of length n over Fq with defining set T , then C is even-like if and only if 0 ∈ T .
15.1.174 Definition Let S1 and S2 be subsets of {1, 2, . . . , n − 1} that are unions of q-cyclotomic
cosets modulo n. Assume further that S1 ∩ S2 = ∅, S1 ∪ S2 = {1, 2, . . . , n − 1}, and
that there exists b relatively prime to n such that S2 = {sb (mod n) | s ∈ S1 } and
S1 = {sb (mod n) | s ∈ S2 }. The pair of sets S1 and S2 form a splitting of n given by b
over Fq . For i ∈ {1, 2}, let Di be the cyclic code of length n over Fq with defining set Si .
The pair D1 and D2 is a pair of odd-like duadic codes. For i ∈ {1, 2}, let Ti = {0} ∪ Si ,
and let Ci be the cyclic code of length n over Fq with defining set Ti . The pair C1 and
C2 is a pair of even-like duadic codes.
15.1.175 Remark Binary duadic codes were first defined in [1901]. They were later generalized to
other fields in [2402, 2403, 2506, 2687]. Duadic codes include quadratic residue codes de-
scribed later; quadratic residue codes exist only for prime lengths, but duadic codes can
exist for composite lengths.
15.1.176 Theorem With the notation of Definition 15.1.174, the following hold.
1. Duadic codes of length n over Fq exist if and only if n is odd and q is a square
modulo n.
2. For i ∈ {1, 2}, Ci has dimension (n − 1)/2, and there is a permutation of coordi-
nates that sends C1 to C2 .
3. For i ∈ {1, 2}, Ci ⊆ Di , Di has dimension (n + 1)/2, and there is a permutation
of coordinates that sends D1 to D2 .
15.1.177 Remark Part 1 of Theorem 15.1.176 can be used to find precisely the values of n
such that duadic codes of length n exist, in a manner similar to the following. Assume
$n = p_1^{a_1} p_2^{a_2} \cdots p_r^{a_r}$ where $p_1, p_2, \ldots, p_r$ are distinct odd primes. Binary duadic codes exist if
and only if pi ≡ ±1 (mod 8) for 1 ≤ i ≤ r. Duadic codes over F3 exist if and only if pi ≡ ±1
(mod 12) for 1 ≤ i ≤ r. Duadic codes over F4 exist if and only if n is odd.
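The criteria in Remark 15.1.177 can be compared numerically with Part 1 of Theorem 15.1.176. The sketch below is only an illustration: the helper names are hypothetical, the square test is a brute-force search suitable for small n, and gcd(n, q) = 1 is assumed as usual for cyclic codes.

```python
from math import gcd

def prime_factors(n):
    """Set of prime divisors of n, found by trial division."""
    factors, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors

def duadic_codes_exist(n, q):
    """Theorem 15.1.176, part 1: n is odd and q is a square modulo n
    (gcd(n, q) = 1 is assumed here)."""
    if n % 2 == 0 or gcd(n, q) != 1:
        return False
    return any(x * x % n == q % n for x in range(n))

def binary_criterion(n):
    """Every prime divisor of n is congruent to +1 or -1 modulo 8."""
    return n % 2 == 1 and all(p % 8 in (1, 7) for p in prime_factors(n))

def ternary_criterion(n):
    """Every prime divisor of n is congruent to +1 or -1 modulo 12."""
    return n % 2 == 1 and all(p % 12 in (1, 11) for p in prime_factors(n))

# Lengths below 60 admitting binary duadic codes according to the theorem.
print([n for n in range(3, 60) if duadic_codes_exist(n, 2)])
```

For small lengths the brute-force test and the prime-factor criteria should agree, and every odd n passes the test for q = 4 because 4 is always a square.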
15.1.178 Remark Much is known about the structure of duadic codes as illustrated by the next two
theorems; see Chapter 6 of [1558].
15.1.179 Theorem Let C be a cyclic (n, (n − 1)/2, d)q code over Fq . Then C is self-orthogonal if and
only if C is an even-like duadic code where −1 gives the splitting of n over Fq . In that case,
if C is one of the duadic pair C1 and C2 , Ci⊥ = Di for i ∈ {1, 2}. Furthermore, if n = p is a
prime with p ≡ −1 (mod 8), every splitting of p over F2 is given by −1 and every binary
duadic code of length p is self-orthogonal.
15.1.180 Theorem Let D1 and D2 be a pair of odd-like binary duadic codes of length n where the
splitting is given by −1 over F2 . Then for i ∈ {1, 2},
1. the weight of every even weight codeword of Di is divisible by 4, and the weight
of every odd weight codeword is congruent to n (mod 4); furthermore,
2. the extended codes $\widehat{D}_i$ are self-dual. In $\widehat{D}_i$, if n ≡ −1 (mod 8), all codewords have
weights divisible by 4, and if n ≡ 1 (mod 8), all codewords have even weights but
some codewords have weights not divisible by 4.
15.1.181 Remark Quadratic residue codes, considered next, are special cases of duadic codes.
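15.1.182 Definition For an odd prime p define R to be the set of integers modulo p where
\[ R = \{ r \in \mathbb{Z}_p \mid r \neq 0,\ r \equiv a^2 \pmod{p} \text{ for some } a \in \mathbb{Z}_p \}. \]
The set R is the set of quadratic residues modulo p. The set N of nonzero elements of
$\mathbb{Z}_p$ that are not in R is the set of quadratic nonresidues modulo p.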
15.1.183 Remark Clearly | R | = | N | = (p − 1)/2. The sets observe a parity of sorts since a residue
times a residue modulo p is a residue, a residue times a nonresidue is a nonresidue, and
a nonresidue times a nonresidue is a residue. This implies that, whenever q is a quadratic
residue modulo p, the pair of sets R and N form a splitting of p given by any nonresidue
over Fq . For ν a primitive element of Zp (i.e., ν is a generator of the multiplicative group
$\mathbb{Z}_p^* = \mathbb{Z}_p \setminus \{0\}$), it is clear that $R = \{\nu^{2i} \mid i = 1, 2, \ldots, (p-1)/2\}$, independent of the
primitive element chosen. The theory of quadratic residues is a fundamental part of number
theory.
15.1.184 Definition [311, 2405] Let p be an odd prime and q a prime power with gcd(p, q) = 1.
Assume further that $q^{(p-1)/2} \equiv 1 \pmod{p}$; hence q is a quadratic residue modulo p.
Let R and N be the quadratic residues and quadratic nonresidues modulo p. Relative
to a primitive p-th root of unity β in some extension field of $\mathbb{F}_q$, denote by $D_R$ and
$D_N$ the $(p, (p+1)/2, d)_q$ odd-like duadic codes over $\mathbb{F}_q$ with defining sets R and N,
respectively. Denote by $C_R$ and $C_N$ the $(p, (p-1)/2, d')_q$ even-like duadic codes over $\mathbb{F}_q$
with defining sets {0} ∪ R and {0} ∪ N , respectively. The four codes DR , DN , CR , and
CN are quadratic residue (QR) codes.
15.1.185 Remark Much is known about QR codes and their relatives, their minimum distances,
and automorphism groups. As noted previously, the only two perfect codes with minimum
distance greater than 3 are cosets of two linear Golay codes, which are quadratic residue
codes treated in the next two examples.
15.1.186 Example The binary $(23, 12, 7)_2$ Golay code [2405, 2849]: Let p = 23 and q = 2, and note
that 2 is a quadratic residue modulo 23 as $5^2 \equiv 2 \pmod{23}$. The smallest extension field of
$\mathbb{F}_2$ that contains a primitive 23-rd root of unity is $\mathbb{F}_{2^{11}}$. If α is a primitive element of $\mathbb{F}_{2^{11}}$,
then $\beta = \alpha^{89}$ is a primitive 23-rd root of unity. With the appropriate choice of primitive
element, the polynomial
\[ g_R(x) = \prod_{r \in R} (x - \beta^r) = x^{11} + x^9 + x^7 + x^6 + x^5 + x + 1 \in \mathbb{F}_2[x] \]
generates the $(23, 12, d)_2$ quadratic residue code $D_R$. Theorem 15.1.180 implies d ≡ 3
(mod 4), and the minimum distance d can be determined to be 7. Notice that since
$\sum_{i=0}^{3} \binom{23}{i} = 2^{11}$, this code is perfect.
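The claims of Example 15.1.186 are small enough to check by exhaustive computation. The sketch below is an illustration under stated assumptions: polynomials over F_2 are stored as Python integers (bit i is the coefficient of x^i), and the helper names `gf2_mul` and `gf2_mod` are hypothetical.

```python
from math import comb

def gf2_mul(a, b):
    """Carry-less product of two polynomials over F_2 stored as bit masks."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result

def gf2_mod(a, m):
    """Remainder of a modulo m in F_2[x]."""
    while a and a.bit_length() >= m.bit_length():
        a ^= m << (a.bit_length() - m.bit_length())
    return a

# g_R(x) = x^11 + x^9 + x^7 + x^6 + x^5 + x + 1 from the example.
g = sum(1 << e for e in (11, 9, 7, 6, 5, 1, 0))

# g_R(x) divides x^23 + 1, so it generates a cyclic code of length 23.
divides = gf2_mod((1 << 23) | 1, g) == 0

# Every nonzero codeword is m(x) g_R(x) with deg m < 12; no reduction is
# needed because the product has degree at most 22.
weights = [bin(gf2_mul(msg, g)).count("1") for msg in range(1, 1 << 12)]
d = min(weights)

# Sphere-packing equality: balls of radius 3 around 2^12 codewords fill F_2^23.
perfect = sum(comb(23, i) for i in range(4)) == 2 ** 11

print(divides, d, perfect)
```

Enumerating all 4095 nonzero codewords is feasible only because the code is tiny; it is not how minimum distances are determined in general.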
15.1.187 Example The ternary $(11, 6, 5)_3$ Golay code [2405, 2849]: For p = 11 and q = 3, note that
$6^2 \equiv 3 \pmod{11}$, and hence 3 is a quadratic residue modulo 11. The smallest extension field
of $\mathbb{F}_3$ containing a primitive 11-th root of unity is $\mathbb{F}_{3^5}$, and if α is a primitive element of $\mathbb{F}_{3^5}$,
then $\beta = \alpha^{22}$ is a primitive 11-th root of unity. With the appropriate choice of primitive
element,
\[ g_R(x) = \prod_{r \in R} (x - \beta^r) = x^5 + x^4 + 2x^3 + x^2 + 2 \in \mathbb{F}_3[x] \]
is a generator polynomial of the $(11, 6, d)_3$ quadratic residue code $D_R$; it can be shown that
d = 5. Since $\sum_{i=0}^{2} \binom{11}{i} 2^i = 3^5$, this code is also perfect. In fact, as noted earlier, this code
and the binary $(23, 12, 7)_2$ Golay code are the only two possible perfect linear codes with
minimum distance greater than 3. This ternary perfect code was actually discovered before
Golay by Virtakallio [1522] in connection with a football pool problem.
15.1.188 Remark It is interesting to note that the Golay codes were discovered very early in the
history of coding [1292] and took only half a page in the Proceedings of the Institute of
Radio Engineers, (IRE, precursor to the IEEE), appearing prior to the paper of Hamming
[1408]. The original paper of Shannon [2608] actually contained an example of a Hamming
code, prior to the appearance of the Hamming paper. The parameters of the Golay codes
were found by examining Pascal’s triangle for sequential summations along a row that add
to an appropriate power. Once the required relationship was found, Golay was able to find
generator matrices for the two (linear) codes. He also noted that
\[ \sum_{i=0}^{2} \binom{90}{i} = 2^{12}, \]
raising the possibility of a $(90, 78, 5)_2$ perfect binary code, but no such code can exist
[2810, 2849].
Alternant codes:
15.1.189 Remark Alternant and GRS codes bear a similar relationship as BCH and RS codes in
that a BCH code of block length q m − 1 over Fq is a subfield subcode of an RS code of this
length over Fqm .
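15.1.190 Definition An alternant code $A_{n,q}(\alpha, v)$ is the subfield subcode of $\mathrm{GRS}_{n,q^m}(\alpha, v, k)$, i.e.,
the set of codewords of $\mathrm{GRS}_{n,q^m}(\alpha, v, k)$ all of whose coordinates lie in $\mathbb{F}_q$.
15.1.191 Lemma [1991, 2484] If $A_{n,q}(\alpha, v) = \mathrm{GRS}_{n,q^m}(\alpha, v, k)^{\mathrm{sf}}_{q^m|q}$ is an $(n, k', d')_q$ code, then
$k' \le k$ and $d' \ge n - k + 1$.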
15.1.192 Remark The class of alternant codes is quite large, containing all narrow sense BCH and
Goppa codes. The duals of alternant codes are also of interest for which we have the following
diagram [802]:
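\[
\begin{array}{ccc}
\mathrm{GRS}_{n,q^m}(\alpha, v, k) & \xrightarrow{\ \text{dual}\ } & \mathrm{GRS}_{n,q^m}(\alpha, w, n-k) \\
\downarrow \text{subfield} & & \downarrow \text{trace} \\
A_{n,q}(\alpha, v) = \mathrm{GRS}_{n,q^m}(\alpha, v, k)^{\mathrm{sf}}_{q^m|q} & \xrightarrow{\ \text{dual}\ } & A_{n,q}(\alpha, v)^{\perp} = \mathrm{GRS}_{n,q^m}(\alpha, w, n-k)^{\mathrm{tr}}_{q^m|q}.
\end{array}
\]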
Goppa codes:
15.1.193 Remark [1319, 1320, 2047, 2849] To discuss Goppa codes it is instructive to consider a
natural transition from BCH codes in the following manner. For a variable x and α a
nonzero element in a field, note that
\[ \frac{1}{1 - \alpha^{-1}x} = 1 + \alpha^{-1}x + \alpha^{-2}x^2 + \cdots + \alpha^{-(2t-1)}x^{2t-1} + \alpha^{-2t}x^{2t} + \alpha^{-(2t+1)}x^{2t+1} + \cdots \]
as can be verified by multiplying each side by $(1 - \alpha^{-1}x)$. The effect of taking this equation
modulo $x^{2t}$ is to truncate the series on the right hand side, discarding the terms of degree $2t$ and higher.
Consider the primitive narrow sense BCH code of length $n = q^m - 1$ and designed distance
$2t + 1$ with defining set $C_1 \cup C_2 \cup \cdots \cup C_{2t}$ relative to α, a primitive element of $\mathbb{F}_{q^m}$. For
convenience denote the nonzero field elements $\alpha^i$ as $\alpha_i^{-1}$. Then $c = (c_0, c_1, \ldots, c_{n-1})$ is a
codeword of the BCH code if and only if
\[ \sum_{i=0}^{n-1} c_i \alpha_i^{-j} = 0 \quad \text{for } j = 1, 2, \ldots, 2t. \]
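For a received word $r = (r_0, r_1, \ldots, r_{n-1})$, the sum of a codeword and error word (presumed
of weight not more than t), define the syndromes as
\[ S_j = \sum_{i=0}^{n-1} r_i \alpha_i^{-j} \quad \text{for } j = 1, 2, \ldots, 2t, \]
and define the syndrome polynomial as $\sum_{j=1}^{2t} S_j x^{j-1}$. Note that in the absence of errors
(the received word is a codeword), this is the zero polynomial. However this polynomial can
be expressed as
\[ \sum_{j=1}^{2t} S_j x^{j-1} = \sum_{j=1}^{2t} \left( \sum_{i=0}^{n-1} r_i \alpha_i^{-j} \right) x^{j-1} = \sum_{i=0}^{n-1} r_i \alpha_i^{-1} \sum_{j=1}^{2t} (\alpha_i^{-1} x)^{j-1} \equiv -\sum_{i=0}^{n-1} \frac{r_i}{x - \alpha_i} \pmod{x^{2t}}. \]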
In this guise the transition from BCH to Goppa codes is more intuitive.
15.1.194 Definition [1945, 2047, 2849] Let $g(x) \in \mathbb{F}_{q^m}[x]$ be monic and let $L = \{\alpha_0, \alpha_1, \ldots, \alpha_{n-1}\}$
be a subset of $\mathbb{F}_{q^m}$ such that $g(\alpha_i) \neq 0$ for $0 \le i \le n-1$. Then $(c_0, c_1, \ldots, c_{n-1}) \in \mathbb{F}_q^n$
is in the Goppa code $G_{n,q}(L, g)$ if and only if
\[ \sum_{i=0}^{n-1} \frac{c_i}{x - \alpha_i} \equiv 0 \pmod{g(x)}. \]
The Goppa code is more often denoted Γ(L, g) in the literature, but the notation used
here is consistent with our earlier code designations.
15.1.195 Remark If the Goppa polynomial is irreducible over Fqm , the code is called irreducible. By
choosing the Goppa polynomial as $g(x) = x^{d-1}$ and L as the set of n-th roots of unity in
Fqm , the Goppa code is a BCH code with designed distance d.
15.1.196 Theorem [2047, 2849] For the Goppa code Gn,q (L, g), where |L| = n and deg g(x) = s, we
have:
1. The minimum distance is bounded by d ≥ s + 1.
2. The code dimension is bounded by k ≥ n − ms.
15.1.197 Remark [1945, 2047] For any monic polynomial g(x) of degree s such that $g(\alpha) \neq 0$, it is
clear from previous arguments that
\[ \frac{1}{x - \alpha} \equiv -\frac{1}{g(\alpha)} \, \frac{g(x) - g(\alpha)}{x - \alpha} \pmod{g(x)} \]
is a polynomial of degree less than s. (Note that this follows as x − α is a factor of the
numerator since the numerator has α as a zero.) Thus c is a codeword in Gn,q (L, g) if and
15.1.200 Remark The binary RM codes are generalized in the following manner. Construct the
(m + 1) × (q m − 1) matrix over Fq as follows. The first row consists of all ones and is labeled
v0 . Among the m-tuples of the q m − 1 columns in rows v1 through vm , all nonzero m-tuples
over Fq occur. A simple way of viewing this is to choose row v1 as a pseudo-noise (pn)
sequence over Fq , a sequence generated by a linear feedback shift register whose feedback
coefficients are the coefficients of a primitive polynomial of degree m over Fq . Row vi+1 is
row 1 shifted by i positions for 1 ≤ i ≤ m − 1. Let Fq [X1 , . . . , Xm ]ν = Fq [X]ν denote the set
of polynomials of degree at most ν over Fq in m variables. Define the matrix Gν,m over Fq
whose first row is v0 and whose remaining rows are generated by all monomials in Fq [X]ν
acting on rows v1 , v2 , . . . , vm . Let GRMqm −1,q (ν, m) be the code with generator matrix
Gν,m ; let GRMqm ,q (ν, m) be the extended code obtained from GRMqm −1,q (ν, m) by adding
an overall parity check. The exponent of a monomial is limited to degree at most q − 1 in
any variable since $x_i^q = x_i$ in $\mathbb{F}_q$; thus we only consider values of ν with ν ≤ m(q − 1).
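15.1.201 Theorem [805, 1691] The parameters of $\mathrm{GRM}_{q^m-1,q}(\nu, m)$ are
\[ n = q^m - 1, \qquad k = \sum_{t=0}^{\nu} \sum_{j=0}^{m} (-1)^j \binom{m}{j} \binom{t - jq + m - 1}{t - jq}, \qquad d = (q - s)q^{m-r-1} - 1, \]
where $\nu = r(q-1) + s$ with $0 \le s < q - 1$.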
15.1.202 Remark The expression for the code dimension in the previous theorem is the number of
monomials of total degree at most ν. This is also the number of ways of placing at most ν balls in
m cells, no cell containing more than q − 1 balls. When ν < q, this expression reduces to
$\binom{\nu+m}{m}$, and the minimum distance is $(q - \nu)q^{m-1} - 1$.
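The counting interpretation in Remark 15.1.202 can be checked against the double sum of Theorem 15.1.201. The following is a sketch under assumptions: the function names are hypothetical, binomial coefficients with a negative lower index are taken to be zero, and the brute-force count is practical only for small q and m.

```python
from itertools import product
from math import comb

def grm_dimension(nu, m, q):
    """Double sum of Theorem 15.1.201; terms with t - j*q < 0 are zero."""
    total = 0
    for t in range(nu + 1):
        for j in range(m + 1):
            if t - j * q < 0:
                break
            total += (-1) ** j * comb(m, j) * comb(t - j * q + m - 1, t - j * q)
    return total

def count_monomials(nu, m, q):
    """Monomials in m variables of total degree at most nu whose exponent
    in each variable is at most q - 1 (balls in cells, at most q - 1 per cell)."""
    return sum(1 for e in product(range(q), repeat=m) if sum(e) <= nu)

for nu in range(6):
    print(nu, grm_dimension(nu, 3, 3), count_monomials(nu, 3, 3))
```

For ν < q the per-variable bound is never active, so both counts equal the binomial coefficient of ν + m over m.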
15.1.203 Remark To show the GRM codes are cyclic requires the following notion. The radix-q
weight of an integer j is defined as
wq (j) = j0 + j1 + · · · where j = j0 + j1 q + j2 q 2 + · · · with 0 ≤ ji < q.
15.1.204 Theorem Let α be a primitive element of Fqm . Then the code GRMqm −1,q (ν, m), for
ν ≤ m(q − 1), is cyclic with defining set relative to α given by
\[ \{ j \mid 0 < j < q^m - 1,\ 0 < w_q(j) \le m(q-1) - \nu - 1 \}. \]
The code is a subcode of the primitive narrow sense BCH code with designed distance
$(q - s)q^{m-r-1} - 1$ where $\nu = r(q-1) + s$ as in Theorem 15.1.201.
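The radix-q weight and the defining set of Theorem 15.1.204 are straightforward to compute. This is a minimal sketch; the names `radix_weight` and `grm_defining_set` and the binary example with m = 4, ν = 1 are illustrative assumptions.

```python
def radix_weight(j, q):
    """Sum of the base-q digits of the nonnegative integer j."""
    w = 0
    while j:
        j, digit = divmod(j, q)
        w += digit
    return w

def grm_defining_set(nu, m, q):
    """Exponents j with 0 < j < q^m - 1 and 0 < w_q(j) <= m(q-1) - nu - 1."""
    bound = m * (q - 1) - nu - 1
    return [j for j in range(1, q ** m - 1) if 0 < radix_weight(j, q) <= bound]

# Binary case, m = 4, nu = 1: a cyclic code of length 15.
T = grm_defining_set(1, 4, 2)
print(T, 15 - len(T))
```

In this example the defining set has 10 elements, so the cyclic code has dimension 15 − 10 = 5, which matches the count of monomials of degree at most 1 in four variables.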
15.1.205 Remark To discuss projective GRM codes denote by Fq [X0 , . . . , Xm ]0ν = Fq [X]0ν the set of
homogeneous polynomials in m + 1 variables of degree at most ν where ν ≤ (m + 1)(q − 1).
(The difference in the use of X, compared to the nonprojective case, is resolved by context.)
There are $\pi_{m+1,q}$ projective (m + 1)-tuples, $\mathbb{P}^m_q$, which coordinatize the positions of the
projective code. The projective codes are defined as
\[ \mathrm{PGRM}_{\pi_{m+1,q},q}(\nu, m) = \{ (f(x)) \mid f(x) \in \mathbb{F}_q[X]^0_\nu,\ x \in \mathbb{P}^m_q \}. \]
The basic parameters of the code are given in the following theorem.
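15.1.206 Theorem [1827, 2696] The code $\mathrm{PGRM}_{\pi_{m+1,q},q}(\nu, m)$ has the following parameters:
\[ n = \pi_{m+1,q} = \frac{q^{m+1} - 1}{q - 1}, \qquad k = \sum_{\substack{t \equiv \nu \ (\mathrm{mod}\ q-1) \\ 0 < t \le \nu}} \ \sum_{j=0}^{m+1} (-1)^j \binom{m+1}{j} \binom{t - jq + m}{t - jq} \]
and
\[ d = (q - s)q^{m-r-1}, \]
where $\nu - 1 = r(q-1) + s$, $0 \le s < q - 1$. In the case ν < q these expressions reduce to
\[ n = \pi_{m+1,q}, \qquad k = \binom{\nu + m}{m}, \qquad d = (q - \nu + 1)q^{m-1}. \]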
15.1.207 Remark The binary (extended) RM codes have an interesting interpretation in terms of
finite geometries for which some terminology is needed. Denote the n-dimensional projective
geometry over Fq by P G(n, q) and the Euclidean geometry by EG(n, q). (Some authors
denote EG(n, q) by AG(n, q); see Chapter 14.)
15.1.208 Remark [311] Projective geometries are discussed first; see also Section 14.4. Let α denote
a primitive element of Fqn+1 and v = (q n+1 − 1)/(q − 1) = πn+1,q . Then β = αv is a
primitive element of Fq . No two points αj with j = 0, 1, . . . , v − 1 are Fq -multiples of each
other, and hence these first v powers of α can be taken as the points of P G(n, q). Let
αj1 , αj2 , . . . , αjm+1 , m ≥ 1, be m + 1 linearly independent points in P G(n, q), i.e., there is
no linear relationship between them over Fq . The points
{aj1 αj1 + aj2 αj2 + · · · + ajm+1 αjm+1 | aji ∈ Fq },
with multiples over Fq identified, give the πm+1,q = (q m+1 − 1)/(q − 1) points of an m-flat in
P G(n, q). The flat is a P G(m, q). In particular, the number of points on a line in P G(n, q) is
q + 1. In P G(2, q) any two distinct lines intersect in a single point, and the number of lines through
a fixed point in P G(n, q) is given by (q n − 1)/(q − 1).
To construct an EG(n, q) from a P G(n, q), subtract an (n − 1)-flat from P G(n, q). An
(n − 1)-flat can be thought of as the set of points in P G(n, q) orthogonal to a given line.
Alternatively the points of P G(n, q) can be divided into the two groups
The points of S2 form an (n − 1)-flat and those of S1 an EG(n, q). The points of this
geometry can be represented as the elements of Fqn . The points of an m-flat in EG(n, q)
through a given point αi is the set of points
where the αjk ’s are independent. Lines can be parallel in EG(n, q). In EG(2, q) the (q + 1)q
lines can be divided into q + 1 parallel classes, where the lines in each class are parallel,
each class containing q lines. Lines in different classes intersect. In EG(n, q) there are
q n−1 (q n − 1)/(q − 1) lines and (q n − 1)/(q − 1) parallel classes, each class containing q n−1
lines and each line containing q points.
15.1.209 Remark The generator matrix G(r, m) of the r-th order binary Reed-Muller code
RM2m ,2 (r, m) (Definitions 15.1.71 and 15.1.72) has the following interpretation. The zero-th
row is the incidence vector of the Euclidean space EG(m, 2). Rows 1 through m are inter-
preted as incidence vectors for (m − 1)-flats. The product of two such rows is the incidence
vector of the intersection of two such flats (an (m − 2)-flat) etc. One aim in pursuing this
geometric view for coding is the consideration of codes derived from such finite geometries
and their majority logic decoding. For the remainder of this section assume $q = p^s$ for some
prime p, the characteristic of Fq , and integer s.
15.1.210 Definition [1691, 2390] The r-th order Euclidean geometry code of length q m over Fp ,
denoted EGqm ,p (r), is the largest linear code over Fp that contains in its null space the
incidence vectors of all (r + 1)-flats in EG(m, q).
15.1.211 Remark Note that the incidence vectors are, by definition, binary and so the codes could
be defined over any finite field. However, not choosing the field Fp would complicate the
analysis considerably. It turns out that these codes are in fact extended cyclic codes as the
following theorem shows.
15.1.212 Theorem [304, 1691] Let α be a primitive element of Fqm = Fpsm . Then EGqm ,p (r) is the
code obtained by extending the cyclic code whose defining set relative to α is
15.1.213 Definition The r-th order projective geometry code, denoted P G(qm −1)/(q−1),p (r), is the
largest linear code containing the incidence vectors of all r-flats in its null space.
15.1.214 Theorem [1692] Let β ∈ Fqm = Fpsm be a primitive (q m − 1)/(q − 1) root of unity. The
code P G(qm −1)/(q−1),p (r) is a cyclic code whose defining set relative to β is
\[ \{ j \mid 0 < j < (q^m - 1)/(q - 1),\ 0 < \max_{0 \le i < s} w_q(j(q-1)p^i) \le (q-1)(m - r + 1) \}. \]
15.1.215 Remark The EG and PG codes are defined as the largest codes having the appropriate
flats in their dual codes. In the binary case, the EG codes are extended PG codes, and
the PG codes can be made cyclic. The codes were investigated extensively in [1691, 1692,
1942, 2966]. Lower bounds on their minimum distances can be obtained from the number
of errors they are capable of correcting with majority logic decoding; see Section 15.1.6.6.
The RM, GRM, EG, and P G codes, as well as many others, can be discussed under a
very general class of polynomial codes [1692].
Justesen codes:
15.1.216 Remark There have been many attempts to explicitly construct codes for which the nor-
malized rate and distance functions do not both tend to zero with increasing block length.
It is known that the class of BCH codes cannot achieve this. While the class of Goppa
codes can be used for such a purpose, the construction is not explicit in that it calls for
the construction of a sequence of suitable polynomials which are known to exist but are
not given [2849]. Justesen [1638] provided the first explicit construction. An outline of that
construction is given. Let $N = 2^m - 1$, and let α be a primitive element in $\mathbb{F}_{2^m}$. Let $C_{m,K}$
be the $(N, K, D)_{2^m}$ RS code given by
\[ \{ (f(1), f(\alpha), \ldots, f(\alpha^{N-1})) \mid f(x) \in \mathbb{F}_{2^m}[x],\ \deg f(x) \le K - 1 \}. \]
Let $C'_{m,K}$ be the $(2N, K, 2D)_{2^m}$ code given by
\[ C'_{m,K} = \{ (a_0, a_1, \ldots, a_{N-1}, a_0, \alpha a_1, \ldots, \alpha^{N-1} a_{N-1}) \mid (a_0, a_1, \ldots, a_{N-1}) \in C_{m,K} \}. \]
The Justesen code $C''_{m,K}$ is found by expanding the components of each codeword, which
are in $\mathbb{F}_{2^m}$, into binary m-tuples with respect to some fixed basis.
15.1.217 Theorem [1638] For any given code rate R < 1/2 and given m = 1, 2, . . ., choose $K_m$ to be
the smallest integer K such that K/2N ≥ R where $N = 2^m - 1$. The code $C''_{m,K_m}$ is linear
over $\mathbb{F}_2$ with length n = 2mN, dimension $mK_m$, rate $K_m/2N \ge R$, and minimum distance
$d_n$ asymptotically bounded by
15.1.218 Remark The normalized rate and distance of the code for a given rate R are both clearly
nonzero. The construction calls for the expansion of elements of F2m into binary m-tuples
as well as a primitive element α ∈ F2m . To make the procedure entirely constructive (i.e., to
explicitly give such an expansion) an irreducible polynomial of each degree, as the degrees
tend to infinity, is needed. The issue is discussed in [1638] where it is noted that an irreducible
polynomial of the form $x^{2 \cdot 3^{\ell}} + x^{3^{\ell}} + 1$ can be used, ℓ = 1, 2, . . .. Given such an irreducible
polynomial for a given degree and known order, a primitive element can be determined.
15.1.219 Remark Suppose $a = (a_0, a_1, \ldots, a_{n-1}) \mapsto a(x) = a_0 + a_1 x + \cdots + a_{n-1}x^{n-1} \in \mathbb{F}_q[x]$ and
$n \mid (q^m - 1)$, i.e., $\mathbb{F}_{q^m}$ contains a primitive n-th root of unity α. The Mattson-Solomon
polynomial of a is defined as
\[ A(x) = \sum_{i=1}^{n} A_{n-i} x^{n-i} = \sum_{i=1}^{n} a(\alpha^i) x^{n-i}, \]
so that
\[ A_i = a(\alpha^{n-i}) \longleftrightarrow a_i = \frac{1}{n} A(\alpha^i). \]
The utility of such an approach stems from the following theorem.
15.1.220 Theorem [2849] If the Mattson-Solomon polynomial A of a vector a has r n-th roots of
unity as zeros, then the weight of a must be at least n − r.
15.1.221 Remark This theorem follows as ai = A(αi )/n for 0 ≤ i < n. The theorem allows a
spectral approach to coding to be taken (as for example in [304]) where the correspondence
between weights of vectors and zeros of Mattson-Solomon polynomials can be exploited. The
coefficients of the Mattson-Solomon polynomial A(x) can be viewed as a discrete Fourier
transform of the coefficients of the original polynomial a(x) and hence enjoy many useful
properties, similar to those of a discrete Fourier transform.
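The transform pair of Remark 15.1.219 can be illustrated in the simplest setting where the root of unity already lies in the base field. The sketch below assumes q = 7, m = 1, n = 6 and α = 3 (a primitive element modulo 7); these values and the function names are illustrative choices, not taken from the text.

```python
p, n, alpha = 7, 6, 3  # assumed toy parameters: alpha has order 6 modulo 7

def evaluate(coeffs, x):
    """Horner evaluation modulo p; coeffs[k] is the coefficient of x^k."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def mattson_solomon(a):
    """Coefficients A_0..A_{n-1} with A_k = a(alpha^(n-k))."""
    return [evaluate(a, pow(alpha, (n - k) % n, p)) for k in range(n)]

def inverse_mattson_solomon(A):
    """Recover a_i = A(alpha^i) / n."""
    n_inv = pow(n, -1, p)
    return [n_inv * evaluate(A, pow(alpha, i, p)) % p for i in range(n)]

a = [1, 0, 0, 2, 0, 0]          # a(x) = 1 + 2x^3, weight 2
A = mattson_solomon(a)
zeros = sum(1 for i in range(n) if evaluate(A, pow(alpha, i, p)) == 0)
weight = sum(1 for c in a if c)
print(A, inverse_mattson_solomon(A), zeros, weight)
```

Here A(x) vanishes at four of the six 6-th roots of unity and a has weight 2 = 6 − 4, in line with Theorem 15.1.220.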
15.1.222 Remark The relationships between codes and certain combinatorial structures are deep
and of great interest. They include connections between codes and association schemes,
difference sets, finite geometries and designs, among many others. Only a basic relationship
between the weight classes of certain codes as incidence vectors for designs will be noted
here. The reader is referred to many chapters in the handbook by Pless et al. [2405] for a
recent view of the subject and to Chapter 14 of this Handbook.
15.1.223 Definition A t-(v, k, λ) design is a pair (P, B) where P is a collection of v distinct points
and B is a collection of subsets of P, called blocks, each of size k, with the property that
any subset of P of size t is contained in exactly λ blocks. The number of blocks of the
design is given by $b = \lambda \binom{v}{t} / \binom{k}{t}$.
15.1.224 Remark An interesting technique for obtaining classes of t-designs is to consider the code-
words of fixed weight in a binary code and to determine under what conditions they might
form incidence vectors for the blocks of a t-design. Such investigations have often illumi-
nated the structure of a code and produced interesting classes of designs for combinatorial
use. An important tool in this investigation has been the Assmus-Mattson theorem below.
While quite technical, it is straightforward to apply if sufficient information is known about
the code and its dual. The support of a vector is the list of coordinate positions where the
vector is nonzero. The codewords of some fixed weight hold a design if the supports of the
codewords form a t-design for some t.
15.1.225 Theorem [142] Let C be an (n, k, d)q code with dual code C ⊥ having minimum distance
d⊥ . If q = 2, let w = n; for q > 2, let w be the largest integer such that
\[ w - \left\lfloor \frac{w + q - 2}{q - 1} \right\rfloor < d. \]
Define $w^{\perp}$ analogously for $C^{\perp}$. Suppose $\{A_i\}$ and $\{B_i\}$ are the weight distributions for C
and $C^{\perp}$ respectively. Let s be the number of i with $B_i \neq 0$ for $0 < i \le n - t$ for some integer
t. Suppose t < d and s ≤ d − t. Then the codewords of weight i in C hold a t-design provided
$A_i \neq 0$ and d ≤ i ≤ w. The words of weight i in $C^{\perp}$ hold a t-design provided $B_i \neq 0$ and
$d^{\perp} \le i \le \min\{n - t, w^{\perp}\}$.
15.1.226 Example The extended binary (24, 12, 8)2 Golay code (the extension of the quadratic
residue (23, 12, 7)2 code) has weight distribution {A0 = 1, A8 = 759, A12 = 2576, A16 =
759, A24 = 1} and is self-dual. For t = 5, the number of nonzero weights less than 24−5 = 19
is s = 3. Thus the conditions of the theorem are satisfied, i.e.,
each weight class of the code holds a 5-design. In particular the 759 codewords of weight 8
hold a Steiner design (λ = 1), in fact a 5-(24, 8, 1) design, one of the more interesting designs
known. This design is also related to the Leech lattice which leads to a particularly dense
sphere packing in 24-dimensional Euclidean space. This connection is explored in detail in
[2805].
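The numbers in Example 15.1.226 can be cross-checked against the block count of Definition 15.1.223 and the hypothesis s ≤ d − t of Theorem 15.1.225. The following is a small sketch; the function name `design_block_count` and the dictionary layout are assumptions made for illustration.

```python
from math import comb

def design_block_count(t, v, k, lam):
    """Number of blocks b = lam * C(v, t) / C(k, t) of a t-(v, k, lam) design."""
    numerator, denominator = lam * comb(v, t), comb(k, t)
    if numerator % denominator:
        raise ValueError("parameters do not give an integral block count")
    return numerator // denominator

# Weight distribution of the extended binary Golay code from the example.
golay_weights = {0: 1, 8: 759, 12: 2576, 16: 759, 24: 1}
n, d, t = 24, 8, 5

# The code is self-dual, so the same distribution serves for the dual code.
s = sum(1 for i, count in golay_weights.items() if 0 < i <= n - t and count)

print(design_block_count(5, 24, 8, 1), s, s <= d - t)
```

The block count of a 5-(24, 8, 1) design equals the number of weight 8 codewords, as the example states.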
15.1.227 Example In a similar manner, the extended ternary (12, 6, 6)3 Golay code is of interest.
This code has nonzero codewords only of weights 6, 9, and 12, and is self-dual. Taking
t = 5, the number of nonzero code weights less than 12 − 5 = 7 is s = 1 and so the weight
classes of the code each hold 5-designs. This code has 264 codewords of weight 6, 440 of
weight 9 and 24 of weight 12.
15.1.228 Remark The above discussion suggests that self-dual codes with few weights and large
distance might be of interest in terms of applying the Assmus-Mattson theorem and such
has been the case. Particular interest is in the case of binary and ternary self-dual codes. If
C is a binary self-dual code, its length is even and every codeword has even weight. If C is a
ternary self-orthogonal code, every weight is divisible by 3. A binary code is even if every
codeword has even weight. A binary code is doubly-even if every weight is divisible by 4.
Formally self-dual codes are those for which WC (x, y) = WC ⊥ (x, y) and all self-dual codes
are formally self-dual. Many results are known about such codes and a few are touched on
here [1991, 2405].
15.1.229 Lemma [2405] Let C be a code over Fq with every weight divisible by ∆. If q = 2 and ∆ = 4
or if q = 3 and ∆ = 3, then C is self-orthogonal.
15.1.230 Lemma [2405] Binary self-dual doubly-even codes of length n exist if and only if 8 | n.
Ternary self-dual codes exist if and only if 4 | n.
15.1.232 Remark Codes which achieve the upper bounds in the above theorem are referred to as
extremal and often have weight classes of codewords that hold designs. The binary and
ternary Golay codes are extremal. The existence of other extremal codes has long been a
matter of interest.
15.1.233 Remark As noted, the work of Delsarte [801, 806] has been influential in the relationship
between codes and combinatorics, particularly t-designs, orthogonal arrays and graphs. The
result below is but one such example.
15.1.234 Theorem [801] Let C be a binary code of minimum distance d and external distance $s'$, with
$d \ge s'$. Then C is distance invariant and the codewords of a given weight hold a t-design
with $t = d - s'$.
15.1.6 Decoding
15.1.6.1 Decoding BCH codes
15.1.235 Remark Decoding a primitive narrow sense BCH code is considered. The modifications
needed for a nonnarrow sense or nonprimitive code, as well as for RS and other cyclic codes,
will be clear. Also, in most cases, such errors-only decoding algorithms can be extended
to errors-and-erasures algorithms. Assume the code has length $n = q^m - 1$ and designed
distance d = 2t+1 and every codeword polynomial has roots $\alpha^i$ for i = 1, 2, . . . , 2t where α is
a primitive element of $\mathbb{F}_{q^m}$. The development below is fairly standard [231, 304, 1329, 1943,
2136, 2390, 2484]. It is convenient to think of codewords and received words as polynomials.
Assume a codeword polynomial c(x) is sent and the polynomial r(x) = c(x) + e(x) is
received, where e(x) is the error polynomial, assumed to be of weight at most t. If more
than t errors occur, the algorithm may fail. For example, in the extreme case, if a certain set
of d errors occurs, it may happen that an incorrect codeword is received and the algorithm
will assume no errors occurred in transmission.
15.1.236 Remark The algorithm discussed in this section was first described by Peterson [2389] for
binary codes and by Gorenstein and Zierler [1329] for nonbinary codes. Denote the i-th
position of a codeword by $\alpha^i$ for $i = 0, 1, \ldots, q^m - 2$. Assume the actual number of errors
that occurred (unknown) is $\nu \le t = \lfloor (d-1)/2 \rfloor$. Let the errors be in coordinate positions
$X_i = \alpha^{j_i}$ with error values $Y_i$ for i = 1, 2, . . . , ν. Noting that $c(\alpha^j) = 0$ for j = 1, 2, . . . , 2t,
the information of value in the received word is from its syndromes evaluated as
\[ S_j = r(\alpha^j) = c(\alpha^j) + e(\alpha^j) = e(\alpha^j) = \sum_{i=1}^{\nu} Y_i X_i^j \quad \text{for } j = 1, 2, \ldots, 2t. \]
i.e., the errors occur at the inverses of the zeros of σ(x). It follows that
Since ν ≤ t, all syndromes are known. If the number of errors made, ν, was known, the
equations could be solved for the coefficients σi and error locations found from the roots of
σ(x). The following lemma suggests a method to find ν.
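15.1.237 Lemma The matrix
\[ \begin{pmatrix} S_1 & S_2 & \cdots & S_\ell \\ S_2 & S_3 & \cdots & S_{\ell+1} \\ \vdots & \vdots & & \vdots \\ S_\ell & S_{\ell+1} & \cdots & S_{2\ell-1} \end{pmatrix} \]
is nonsingular if ℓ = ν and singular if ℓ > ν.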
15.1.238 Remark To find the actual number of errors that occurred in transmission, first form the
matrix in Lemma 15.1.237 for ` = t. If nonsingular, assume t errors occurred. If singular,
set ` to t − 1 and repeat until the matrix is nonsingular. The algorithm is then:
1. Compute the syndromes Si for i = 1, 2, . . . , 2t.
2. Determine the actual number of errors that occurred in transmission from Lemma
15.1.237.
3. Find the error locations by finding the roots of σ(x) (by trying all nonzero field
elements if necessary).
4. Find the error values using (15.1.5).
Several steps of this algorithm are computationally expensive, such as finding successive
determinants. More efficient methods are introduced next.
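The four steps above can be sketched for a Reed-Solomon code over a prime field, where field arithmetic is just arithmetic modulo p. Everything specific here is an assumption for illustration: the parameters p = 11, α = 2, n = 10, t = 2, the convention σ(x) = 1 + σ_1 x + ... + σ_ν x^ν = ∏(1 − X_i x), the function names, and the use of a direct linear solve for the error values instead of a closed formula.

```python
def poly_eval(coeffs, x, p):
    """Horner evaluation modulo p; coeffs[k] is the coefficient of x^k."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def solve_mod(M, rhs, p):
    """Solve M x = rhs over GF(p) by Gauss-Jordan; None if M is singular."""
    size = len(M)
    A = [row[:] + [b] for row, b in zip(M, rhs)]
    for col in range(size):
        pivot = next((r for r in range(col, size) if A[r][col] % p), None)
        if pivot is None:
            return None
        A[col], A[pivot] = A[pivot], A[col]
        inv = pow(A[col][col], -1, p)
        A[col] = [v * inv % p for v in A[col]]
        for r in range(size):
            if r != col and A[r][col]:
                f = A[r][col]
                A[r] = [(v - f * w) % p for v, w in zip(A[r], A[col])]
    return [A[r][size] for r in range(size)]

def pgz_decode(r, p, alpha, t):
    """Correct up to t errors in r, a word of a code with zeros alpha^1..alpha^2t."""
    n = len(r)
    # Step 1: syndromes, stored so that S[j - 1] = S_j = r(alpha^j).
    S = [poly_eval(r, pow(alpha, j, p), p) for j in range(1, 2 * t + 1)]
    if not any(S):
        return r[:]
    # Step 2: largest nu whose syndrome matrix is nonsingular; the same
    # solve yields (sigma_nu, ..., sigma_1).
    for nu in range(t, 0, -1):
        M = [[S[i + j] for j in range(nu)] for i in range(nu)]
        sol = solve_mod(M, [(-S[nu + i]) % p for i in range(nu)], p)
        if sol is not None:
            break
    else:
        raise ValueError("decoding failure")
    sigma = [1] + sol[::-1]
    # Step 3: position k is in error when sigma(alpha^(-k)) = 0.
    positions = [k for k in range(n) if poly_eval(sigma, pow(alpha, -k, p), p) == 0]
    if len(positions) != nu:
        raise ValueError("decoding failure")
    # Step 4: error values from S_j = sum_i Y_i X_i^j, j = 1..nu.
    V = [[pow(alpha, k * j, p) for k in positions] for j in range(1, nu + 1)]
    Y = solve_mod(V, S[:nu], p)
    corrected = r[:]
    for k, y in zip(positions, Y):
        corrected[k] = (corrected[k] - y) % p
    return corrected

received = [0] * 10          # all-zero codeword with two errors added
received[2], received[8] = 5, 2
print(pgz_decode(received, 11, 2, 2))
```

The repeated linear solves make this cubic in t, which is the cost the text attributes to this approach.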
15.1.239 Remark The above decoding technique is easily generalized to RS, GRS, alternant, and
Goppa codes.
15.1.240 Remark [231, 2011] Equation (15.1.6) suggests the following interpretation for determining
the error locator polynomial: Given the sequence of 2t syndromes, Si for i = 1, 2, . . . , 2t,
determine the linear feedback shift register of minimum length such that if S1 , S2 , . . . are
initially loaded into it, it will generate all 2t syndromes. Clearly the feedback coefficients
of such a shift register will be the coefficients of the error locator polynomial. An efficient
algorithm to determine these coefficients was given in Berlekamp [231] and this algorithm
was interpreted in the above manner by Massey [2011]. The details of the development
are intricate and omitted here. The following lemma from [2011] gives the flavor of the
argument. Denote the feedback connection polynomial at the r-th stage by $\sigma^{(r)}(x)$ and the
length of the register by $\ell_r$ so that the shift register generates $S_1, S_2, \ldots, S_r$. The lemma
decides on the length of the minimum length register in going from the (r − 1)-st stage to
the r-th stage.
15.1.241 Lemma [2011] Let $\{\ell_i, \sigma^{(i)}(x)\}$ be a sequence of minimum-length shift registers such that
$(\ell_i, \sigma^{(i)}(x))$ generates the sequence up to $S_i$, for i = 1, 2, . . . , r − 1. Then if $\sigma^{(r)}(x) \ne \sigma^{(r-1)}(x)$,
the length of the r-th shift register is
\[ \ell_r = \max\{\ell_{r-1},\ r - \ell_{r-1}\}. \]
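The shift-register interpretation can be made concrete with a short sketch of the Berlekamp-Massey iteration. This is one common textbook formulation, written here over a prime field GF(p) as a simplifying assumption; the function name and variable names are illustrative and not taken from [231] or [2011]. The length update inside the loop is the rule of Lemma 15.1.241.

```python
def berlekamp_massey(S, p):
    """Shortest LFSR for the sequence S over GF(p), p prime.

    Returns (L, C) with C = [1, c_1, ..., c_L] such that
    S[n] + c_1*S[n-1] + ... + c_L*S[n-L] == 0 (mod p) for L <= n < len(S).
    """
    C = [1]   # current connection polynomial, lowest degree first
    B = [1]   # connection polynomial before the last length change
    L = 0     # current register length
    m = 1     # number of steps since the last length change
    b = 1     # discrepancy at the last length change
    for n in range(len(S)):
        # Discrepancy between S[n] and the value predicted by the register.
        d = sum(C[i] * S[n - i] for i in range(len(C))) % p
        if d == 0:
            m += 1
            continue
        coef = d * pow(b, p - 2, p) % p
        previous = C[:]
        # C(x) <- C(x) - (d/b) * x^m * B(x)
        if len(C) < len(B) + m:
            C = C + [0] * (len(B) + m - len(C))
        for i, bi in enumerate(B):
            C[i + m] = (C[i + m] - coef * bi) % p
        if 2 * L <= n:
            # Length change: new length is max(L, (n + 1) - L) = n + 1 - L.
            L = n + 1 - L
            B, b, m = previous, d, 1
        else:
            m += 1
    return L, C + [0] * (L + 1 - len(C))
```

For a syndrome sequence, the returned list holds the coefficients of a connection polynomial whose roots are the inverses of the error locators, under the assumptions stated above.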
15.1.242 Remark Once the length of the shift register for the r-th stage is determined, an efficient
procedure to update the coefficients of $\sigma^{(r)}(x)$ is known ([304, Theorem 7.4.1], [2390, Theorem 9.10],
and [231, Algorithm 7.4]). The final step of the improvements for the Berlekamp-Massey
algorithm involves determining the error values, once their locations are known, due
and
\[ Y_i = \frac{\omega(X_i^{-1})}{X_i^{-1}\,\omega'(X_i^{-1})} \]
where ω′ is the formal derivative of ω.
15.1.244 Remark Once the error locations are known, ω(x) can be used to compute the error values
(rather than invert a ν × ν matrix over Fq ). The degree of ω(x) is less than that of σ(x).
Notice that this last step can be omitted for binary codes, since the error values at error
locations are 1.
15.1.245 Remark Sugiyama et al. [2741] showed how the extended Euclidean algorithm (EEA) can be
used to decode Goppa (and hence RS and BCH) codes. The technique was further developed
in [2047]. If F is any field, the EEA is used to determine the greatest common divisor
d(x) ∈ F[x] of two polynomials a(x), b(x) ∈ F[x] and to also determine two polynomials
s(x), t(x) ∈ F[x] such that
s(x)a(x) + t(x)b(x) = d(x).
The application of this algorithm to solving the key equation for decoding will be seen later.
To discuss this, some properties of the algorithm are required. The treatment of [2047] is
followed.
Three sequences of polynomials are derived: $\{r_i(x), s_i(x), t_i(x)\}$ with the initial conditions
\[ \begin{aligned} r_{-1}(x) &= a(x), & s_{-1}(x) &= 1, & t_{-1}(x) &= 0, \\ r_0(x) &= b(x), & s_0(x) &= 0, & t_0(x) &= 1, \end{aligned} \]
where it is assumed deg a(x) ≥ deg b(x). The sequences of polynomials $\{r_i(x)\}$ and $\{q_i(x)\}$
are derived via the equation
\[ r_{i-2}(x) = q_i(x)\, r_{i-1}(x) + r_i(x), \]
where $\deg r_i(x) < \deg r_{i-1}(x)$; thus the degrees of the $r_i(x)$'s are strictly decreasing. Using
the polynomials $q_i(x)$, two auxiliary polynomial sequences are defined:
\[ s_i(x) = s_{i-2}(x) - q_i(x)\, s_{i-1}(x) \quad \text{and} \quad t_i(x) = t_{i-2}(x) - q_i(x)\, t_{i-1}(x). \]
Also $\deg t_i(x) + \deg r_{i-1}(x) = \deg a(x)$ for all i > 0, and since the degrees of the $r_i(x)$'s
are strictly decreasing, the degrees of the $t_i(x)$'s are strictly increasing and in particular
$\deg t_i(x) + \deg r_i(x) < \deg a(x)$. If the last nonzero remainder polynomial $r_i(x)$ is at step
n, i.e., $r_n(x) \ne 0$ but $r_{n+1}(x) = 0$, then
\[ d(x) = r_n(x) = s_n(x)\, a(x) + t_n(x)\, b(x). \]
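The three sequences can be generated mechanically. The sketch below is an illustration only, assuming the prime field GF(7) and polynomials stored as coefficient lists with the constant term first; the helper names are hypothetical. It returns every triple (r_i, s_i, t_i), so the relation s_i(x)a(x) + t_i(x)b(x) = r_i(x) can be checked at each step.

```python
P = 7  # assumed prime field size; polynomials are lists, lowest degree first


def trim(f):
    """Reduce coefficients mod P and drop leading zero coefficients."""
    f = [c % P for c in f]
    while f and f[-1] == 0:
        f.pop()
    return f


def deg(f):
    return len(f) - 1  # the zero polynomial [] gets degree -1 here


def add(f, g):
    size = max(len(f), len(g))
    return trim([(f[i] if i < len(f) else 0) + (g[i] if i < len(g) else 0)
                 for i in range(size)])


def neg(f):
    return trim([-c for c in f])


def mul(f, g):
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return trim(out)


def divmod_poly(f, g):
    """Quotient and remainder of f by a nonzero polynomial g."""
    f, g = trim(f), trim(g)
    q = [0] * max(len(f) - len(g) + 1, 0)
    inv = pow(g[-1], P - 2, P)
    while len(f) >= len(g):
        shift = len(f) - len(g)
        c = f[-1] * inv % P
        q[shift] = c
        f = trim([fc - c * g[i - shift] if i >= shift else fc
                  for i, fc in enumerate(f)])
    return trim(q), f


def eea(a, b):
    """All triples (r_i, s_i, t_i), starting at i = -1; the last r_i is zero."""
    r_prev, r_cur = trim(a), trim(b)
    s_prev, s_cur = [1], []
    t_prev, t_cur = [], [1]
    rows = [(r_prev, s_prev, t_prev), (r_cur, s_cur, t_cur)]
    while r_cur:
        q, r_next = divmod_poly(r_prev, r_cur)
        s_next = add(s_prev, neg(mul(q, s_cur)))
        t_next = add(t_prev, neg(mul(q, t_cur)))
        r_prev, s_prev, t_prev = r_cur, s_cur, t_cur
        r_cur, s_cur, t_cur = r_next, s_next, t_next
        rows.append((r_cur, s_cur, t_cur))
    return rows
```

The greatest common divisor, up to a nonzero scalar, is the remainder in the second-to-last row, `eea(a, b)[-2][0]`.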
15.1.246 Remark Applying the EEA to the polynomials of the key Equation (15.1.7) with $a(x) = x^{2t}$
and b(x) = S(x) leads to three sequences of polynomials $\{r_i(x)\}$, $\{s_i(x)\}$, and $\{t_i(x)\}$.
The degrees of the $t_i(x)$'s are increasing and those of the $r_i(x)$'s are decreasing. Stopping
the algorithm at the point where $\deg t_j(x)$ first exceeds $\deg r_j(x)$ yields the polynomials
$\sigma(x) = t_j(x)$ and $\omega(x) = r_j(x)$; also
\[ s_j(x)\, x^{2t} + t_j(x)\, S(x) = r_j(x). \]
Interpreting this equation modulo $x^{2t}$ then solves the key Equation (15.1.7).
15.1.247 Remark The decoding method for GRS codes considered in [2965] relies on forming the
syndrome polynomial where the syndromes are a form of Fourier transform of the received
word. Welch-Berlekamp decoding operates directly on the received word. There are many
variants of the algorithm in the literature and only the basic ideas are given here. For
simplicity we consider $C = \mathrm{GRS}_{n,q}(x, 1, k + 1)$ where
\[ C = \{(f(x_1), f(x_2), \ldots, f(x_n)) \mid f(x) \in \mathbb{F}_q[x],\ \deg f(x) \le k\}, \]
with distinct code positions $x_i$ in $\mathbb{F}_q$, and let t = ⌊(n − k)/2⌋. Suppose the codeword corresponding
to the polynomial a(x) is sent, i.e., $c = (c_1, c_2, \ldots, c_n) = (a(x_1), a(x_2), \ldots, a(x_n))$,
and the word $r = (r_1, r_2, \ldots, r_n)$ is received. For decoding, it suffices to determine the codeword
polynomial a(x) from r. Suppose an unknown number e ≤ t of errors has occurred in
transmission and $r_i = c_i + e_i$ for i = 1, 2, . . . , n. Two polynomials $D(x), N(x) \in \mathbb{F}_q[x]$ are
sought with the properties
a. deg D(x) ≤ t,
b. deg N(x) ≤ t + k − 1, (15.1.8)
c. $N(x_i) = r_i D(x_i)$ for i = 1, 2, . . . , n.
15.1.248 Theorem There exist polynomials N (x), D(x) which satisfy the above conditions and can
be efficiently computed. Furthermore, the ratio N (x)/D(x) is the unique polynomial that
gives the closest codeword to r if fewer than t errors were made in transmission.
15.1.249 Remark The thinking behind this theorem is straightforward. Let $E = \{i \mid r_i \ne c_i\}$,
e = |E|, i.e., the (unknown) error positions and their number. The polynomial
\[ D(x) = \prod_{j \in E} (x - x_j) \]
is the error locator polynomial, i.e., its zeros are the error locations. The polynomial corresponding
to the codeword is a(x) = N(x)/D(x). It can be shown that the ratio of any
pair of solutions of the system in Part (c) of (15.1.8), under the conditions given in Part
(a) for D(x) and Part (b) for N(x), gives this polynomial. Note that $N(x_i) = r_i D(x_i)$ for
i = 1, 2, . . . , n as follows. If i ∈ E, $N(x_i) = 0 = r_i D(x_i)$ since N(x) = a(x)D(x) and
$D(x_i) = 0$; if i ∉ E, $N(x_i) = a(x_i)D(x_i) = c_i D(x_i) = r_i D(x_i)$.
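The system (15.1.8) is linear in the unknown coefficients of D(x) and N(x), so a small sketch can solve it by Gaussian elimination and then divide. The code below is an illustration under stated assumptions, not one of the optimized variants from the literature: it works over a prime field GF(p), takes the message polynomial to have degree at most k − 1 (so that N = aD meets the bound in Part (b)), and uses hypothetical helper names.

```python
def trim(f, p):
    f = [c % p for c in f]
    while f and f[-1] == 0:
        f.pop()
    return f


def evaluate(f, x, p):
    return sum(c * pow(x, j, p) for j, c in enumerate(f)) % p


def poly_divmod(f, g, p):
    """Quotient and remainder of f by a nonzero g (lists, lowest degree first)."""
    f, g = trim(f, p), trim(g, p)
    q = [0] * max(len(f) - len(g) + 1, 0)
    inv = pow(g[-1], p - 2, p)
    while len(f) >= len(g):
        shift, c = len(f) - len(g), f[-1] * inv % p
        q[shift] = c
        f = trim([fc - c * g[i - shift] if i >= shift else fc
                  for i, fc in enumerate(f)], p)
    return trim(q, p), f


def nullspace_vector(rows, ncols, p):
    """One nonzero solution v of rows * v = 0 over GF(p), or None."""
    m = [[c % p for c in row] for row in rows]
    pivots = {}  # pivot column -> row index
    r = 0
    for col in range(ncols):
        pr = next((i for i in range(r, len(m)) if m[i][col]), None)
        if pr is None:
            continue
        m[r], m[pr] = m[pr], m[r]
        inv = pow(m[r][col], p - 2, p)
        m[r] = [c * inv % p for c in m[r]]
        for i in range(len(m)):
            if i != r and m[i][col]:
                f = m[i][col]
                m[i] = [(a - f * b) % p for a, b in zip(m[i], m[r])]
        pivots[col] = r
        r += 1
        if r == len(m):
            break
    free = next((c for c in range(ncols) if c not in pivots), None)
    if free is None:
        return None
    v = [0] * ncols
    v[free] = 1
    for col, row in pivots.items():
        v[col] = -m[row][free] % p
    return v


def welch_berlekamp(xs, received, k, p):
    """Recover a(x), deg a <= k - 1, from at most t = (n - k) // 2 errors."""
    n = len(xs)
    t = (n - k) // 2
    rows = []
    for x, r in zip(xs, received):
        d_part = [(-r * pow(x, j, p)) % p for j in range(t + 1)]   # -r_i * D(x_i)
        n_part = [pow(x, j, p) for j in range(t + k)]              # N(x_i)
        rows.append(d_part + n_part)
    v = nullspace_vector(rows, 2 * t + k + 1, p)
    if v is None:
        return None
    D, N = v[:t + 1], v[t + 1:]
    quotient, remainder = poly_divmod(N, D, p)
    return quotient if not remainder else None
```

Any nonzero solution of the linear system is acceptable here, in line with the observation that the ratio of any pair of solutions gives the codeword polynomial.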
15.1.250 Remark When the code is presented in one of its alternative forms, for example in terms of
a generator polynomial, the Welch-Berlekamp equations above are slightly modified. Such
an approach is given in [578, 2164].
15.1.251 Remark Majority logic decoding involves taking a majority vote on the evaluations of
certain parity check equations. It is a very simple (both to discuss and implement) decoding
technique that is effective when the codes possess certain structure. The Euclidean and
projective geometry codes are perhaps the prime examples, but by no means the only codes
in this class. Early work on this subject is due to Massey [2010].
15.1.252 Definition If a code has J check sums (equations) which each check coordinate position
j and checks other coordinate positions at most once, the checks are orthogonal on
position j.
15.1.253 Lemma If it is possible to construct J parity checks orthogonal on each code coordinate
position, the code can correct ⌊J/2⌋ errors and has minimum distance at least J + 1.
15.1.254 Definition A set of check sums is orthogonal on a set S if each check sum contains all
coordinate positions of S and checks each coordinate position not in S at most once.
\[ d_{ML} = \frac{q^{m-r} - 1}{q - 1} \]
15.1.257 Remark In a practical communication scheme, codeword symbols are modulated in a man-
ner suitable for channel transmission in that the bandwidth of the scheme must be within
the bandwidth assigned. The receiver typically observes the demodulated waveform for the
period of transmission for a symbol and makes a decision as to the symbol transmitted.
This decision often has associated with it a reliability or confidence measure indicating how
likely the decision is to be correct. For example if the output of a matched filter
is quantized at the decision time to two levels, this is equivalent to making a hard decision.
If more levels are used, it would correspond to a soft decision. Thus in a hard decision
receiver this soft information is discarded and only the hard decisions on the symbols are
used. This leads to a decrease in the error performance of the system and techniques to
incorporate the soft information on the received symbols to be used in the error control
process have been investigated. The generalized minimum distance (GMD) technique, due
to Forney [1092], is perhaps the simplest such technique. It is discussed here for binary codes
only. Assume the {0, 1} code has Hamming distance d. The generalization to the nonbinary
case is straightforward.
15.1.258 Remark For the binary case assume the code is over the alphabet {−1, +1} by changing 0s
in the code over F2 to −1 and 1s to +1 and assume d is the minimum Hamming distance
of the code. Assume further the received word is of the form y = (y0 , y1 , . . . , yn−1 ) and
define the reliability word r = (r0 , r1 , . . . , rn−1 ) where ri represents the reliability of the
i-th symbol. The larger ri is in magnitude (positive or negative), the more reliable the
transmitted symbol is (in the {±1} code). For maximum likelihood decoding this is the log
likelihood ratio of the two possibilities, i.e.,
\[ r_i = \log \frac{P(y_i \mid 1)}{P(y_i \mid 0)}. \]
Thus r′ is the magnitude limited version of the reliability measure formed on each coordinate
position of the reliability word. The Euclidean distance from r′ to any {±1} codeword c is
then
\[ |r'|^2 - 2(r', c) + |c|^2. \]
15.1.259 Theorem [1092] There is at most one codeword c in the binary {±1} linear code with
Hamming distance d from the received word such that
\[ (r', c) > n - d \]
15.1.262 Remark Successively increasing the number of erasures (but not beyond (d − 1)/2) and
using errors-and-erasures decoding will find the codeword closest to the quantized received
word r′ under the conditions noted.
15.1.263 Remark It is assumed that C is a GRS code, although the results are applicable
to other code classes such as algebraic geometry codes. For simplicity we consider
$C = \mathrm{GRS}_{n,q}(x, 1, k + 1)$ where
\[ C = \{(f(x_1), f(x_2), \ldots, f(x_n)) \mid f(x) \in \mathbb{F}_q[x],\ \deg f(x) \le k\}, \]
with distinct code positions $x_i$ and code rate (k + 1)/n. For such a code if the number of
errors made in transmission on the channel is at most e = ⌊(d − 1)/2⌋, the bounded distance
decoding algorithm produces the unique correct codeword from the received word r. A list-ℓ
decoder with decoding radius τ produces, for any received word $r \in \mathbb{F}_q^n$, a list of at most ℓ
codewords within a radius τ of r. The decoding is successful if the transmitted codeword
is in the list and fails otherwise. The relationship between the decoding radius τ and the
size of the list is of interest.
15.1.264 Remark The list decoding problem is closely related to the following polynomial interpo-
lation problem [2739]; for the Lagrange Interpolation Formula, see Theorem 2.1.131.
15.1.265 Definition The polynomial interpolation problem is defined as follows. For a set of pairs of
elements (xi , yi ) ∈ Fq × Fq , the xi distinct, and for positive integers k and τ , determine
a list of all polynomials f (x) ∈ Fq [x] of degree at most k such that
| {i | f (xi ) = yi } | ≥ τ.
15.1.266 Remark In other words, the desire is to find the set of all polynomials of degree at most k
for which f (xi ) = yi in at least τ out of the n places. The relationship of this problem to
the list decoding problem is immediate.
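For very small parameters the polynomial interpolation problem can be solved by exhaustive search, which makes the definition concrete. The following sketch is purely illustrative (it enumerates all p^{k+1} candidate polynomials over an assumed prime field GF(p), and the function name is hypothetical); it is not the algorithm of [2739].

```python
from itertools import product


def interpolation_list(points, k, tau, p):
    """All f in GF(p)[x] with deg f <= k that agree with >= tau of the points.

    Each polynomial is returned as a tuple of coefficients, constant term first.
    """
    result = []
    for coeffs in product(range(p), repeat=k + 1):
        agreements = sum(
            1 for x, y in points
            if sum(c * pow(x, j, p) for j, c in enumerate(coeffs)) % p == y
        )
        if agreements >= tau:
            result.append(coeffs)
    return result
```

For instance, with the points (0,1), (1,3), (2,1), (3,2), (4,4) over GF(5), which are the values of 1 + 2x with the entry at x = 2 altered, requiring τ = 4 agreements leaves only 1 + 2x, while lowering τ to 2 admits further lines such as the constant polynomial 1.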
15.1.267 Remark The technique of Sudan [2739] is to determine a nonzero bivariate polynomial
Q(x, y), of a certain degree, that is zero on the set {(xi , yi ) | i = 1, 2, . . . , n} and show that
factors of this polynomial of the form (y − f (x)) with deg f (x) ≤ k correspond to codewords
within the required distance of the received word. In the above formulation we would like
τ to be as small as possible (the number of errors as large as possible). The concept of
weighted degree of Q(x, y) is of interest: the (1, k) degree of a monomial $x^i y^j$ is i + jk, and
the (1, k) degree of Q(x, y) is the maximum of m + ℓk over all monomials $x^m y^\ell$
of Q(x, y). This arises as we need the x-degree of Q(x, y) when a polynomial of degree k in
x is substituted for y.
15.1.268 Remark The algorithm proceeds as follows:
1. Find a nonzero bivariate polynomial Q(x, y) of weighted degree at most m + ℓk
(chosen later) such that $Q(x_i, y_i) = 0$ for i = 1, 2, . . . , n.
2. Factor Q(x, y) into irreducible factors and output all factors of the form y − f(x)
with deg f(x) ≤ k where $f(x_i) = y_i$ for at least τ values of i.
The number of variables $q_{ab}$ in the linear system
\[ \sum_{b=0}^{\ell} \sum_{a=0}^{m+(\ell-b)k} q_{ab}\, x_i^a y_i^b = 0, \quad i = 1, 2, \ldots, n, \]
is
\[ (m + 1)(\ell + 1) + k \binom{\ell + 1}{2}. \]
If this quantity is greater than n, the system of n linear equations has a nontrivial solution
since the number of unknowns exceeds the number of equations. The polynomial Q(x, y) is
given by
\[ Q(x, y) = \sum_{b=0}^{\ell} \sum_{a=0}^{m+(\ell-b)k} q_{ab}\, x^a y^b. \]
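The count of unknowns can be checked by listing the admissible exponent pairs directly. The sketch below (function names are illustrative) enumerates the monomials of the double sum and compares their number with the closed form (m + 1)(ℓ + 1) + k·C(ℓ + 1, 2); it also confirms that each monomial has (1, k) degree at most m + ℓk.

```python
from math import comb


def monomial_exponents(m, ell, k):
    """Exponent pairs (a, b) with 0 <= b <= ell and 0 <= a <= m + (ell - b) * k."""
    return [(a, b) for b in range(ell + 1) for a in range(m + (ell - b) * k + 1)]


def closed_form_count(m, ell, k):
    return (m + 1) * (ell + 1) + k * comb(ell + 1, 2)


def weighted_degree(exponents, k):
    """The (1, k) degree of a polynomial given by its exponent pairs."""
    return max(a + b * k for a, b in exponents)
```

A nontrivial Q(x, y) is then guaranteed whenever `closed_form_count(m, ell, k)` exceeds the number n of interpolation points.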
15.1.269 Lemma [2739] If Q(x, y) is as in Part 1 above and f(x) is such that deg f(x) ≤ k with
$f(x_i) = y_i$ for at least τ values where τ > m + ℓk, then y − f(x) divides Q(x, y).
15.1.270 Remark [2739] In the above development if one chooses m = ⌈k/2⌉ − 1 and
$\ell = \sqrt{2(n + 1)/k} - 1$, it can be shown that
\[ \tau \ge k\sqrt{2(n + 1)/k} - k/2 \quad \text{or} \quad \tau > \sqrt{2kn}. \]
Thus if fewer than $n - \sqrt{2kn} \approx n(1 - \sqrt{2R})$ errors are made in transmission, the transmitted
codeword will be in the list. Notice this implies the results are valid only for fairly low rate
codes.
15.1.271 Remark There are a number of items to be noted.
1. As τ decreases (number of errors increases), the size of the list tends to increase
although not always strictly monotonically. For a given decoding radius there are
estimates of the list size which are not discussed here.
2. A significant improvement in the above bound was achieved by Guruswami and
Sudan [1377] who increased the relative fraction of errors to $\sim 1 - \sqrt{R}$. This was
achieved by allowing the bivariate polynomial Q(x, y) to have zeros of multiplicity
m > 1 and optimizing the results over this parameter. Parvaresh and Vardy [2361]
described a class of codes that could be list-decoded to this radius.
3. Define the entropy function $h_q(p) = -p \log_q(p) - (1 - p) \log_q(1 - p)$ and note
this is different from the q-ary entropy function of Definition 15.1.122. Denote a
(p, L) list-decodable code as one capable of correcting a fraction p of errors with
a list of size L. One can show [2501] that for code rates $R \le 1 - h_q(p) - \epsilon$ there
exists a (p, O(1/ε))-list decodable code while for $R > 1 - h_q(p) + \epsilon$ every (p, L)-list
decodable code has L exponential in q.
4. The work is extended in [1376, 1377].
15.1.273 Remark The study of codes over Z4 began in earnest after the publication of [1409]. In that
paper, a relationship, via the Gray map, was discovered between certain families of binary
nonlinear codes and Z4 -linear codes. That paper heightened interest in studying codes over
Zr and eventually codes over other rings; see Section 2.1.7.7 for a discussion of Galois rings.
15.1.274 Remark Since a subspace of $\mathbb{F}_q^n$ is an $\mathbb{F}_q$-submodule of $\mathbb{F}_q^n$, a linear code of length n over
C. By permuting coordinates, the generator matrix for C can be put in the form
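\[ G = \begin{pmatrix} I_{k_1} & A & B_1 + 2B_2 \\ O & 2I_{k_2} & 2C \end{pmatrix}, \qquad (15.1.9) \]
where A, $B_1$, $B_2$, and C are matrices with entries from {0, 1}, $I_{k_1}$ and $I_{k_2}$ are $k_1 \times k_1$ and
$k_2 \times k_2$ identity matrices, and O is the $k_2 \times k_1$ zero matrix. The number of codewords in C,
called the type of C, is $4^{k_1} 2^{k_2}$. If C is a $\mathbb{Z}_4$-linear code with generator matrix (15.1.9), the
generator matrices for Res(C) and Tor(C) are, respectively,
\[ G_{\mathrm{Res}} = \begin{pmatrix} I_{k_1} & A & B_1 \end{pmatrix} \quad \text{and} \quad G_{\mathrm{Tor}} = \begin{pmatrix} I_{k_1} & A & B_1 \\ O & I_{k_2} & C \end{pmatrix}. \]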
defined by
(x, y) = x1 y1 + x2 y2 + · · · + xn yn (mod 4),
where $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$ are in $\mathbb{Z}_4^n$. If C is a $\mathbb{Z}_4$-linear code of
length n, define $C^\perp = \{v \in \mathbb{Z}_4^n \mid (v, c) = 0 \text{ for all } c \in C\}$, the dual of C. If $C \subseteq C^\perp$, the
code C is self-orthogonal. If C = C ⊥ , it is self-dual.
\[ G^\perp = \begin{pmatrix} -(B_1 + 2B_2)^T - C^T A^T & C^T & I_{n-k_1-k_2} \\ 2A^T & 2I_{k_2} & O \end{pmatrix}, \]
The octacode is self-dual, has type $4^4$, and has minimum Lee weight 6. $\mathrm{Res}(o_8) = \mathrm{Tor}(o_8)$ is
equivalent to the [8, 4, 4] extended binary Hamming code. By Part 3 of Theorem 15.1.280,
the Gray image $G(o_8)$ is a $(16, 256, 6)_2$ code. By [1093, 2298, 2689], $G(o_8)$ is the Nordstrom-
Robinson code, a nonlinear binary code that is the largest possible binary code of length
16 and minimum Hamming distance 6. This $(16, 256, 6)_2$ code is the extension, using an
overall parity check, of a $(15, 256, 5)_2$ code originally defined by Nordstrom and Robinson
in [2298]. The latter code consisted of length 15 binary vectors with the first 8 coordinate
positions arbitrary and the last 7 positions Boolean combinations of the first 8 coordinates.
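The stated properties of the octacode can be checked by enumeration. The sketch below rests on two assumptions that are not taken from the text above: the generator matrix is one commonly quoted form of the octacode (the parity-extended cyclic code generated by x^3 + 3x^2 + 2x + 3), and the Gray map is taken in the usual convention 0 → 00, 1 → 01, 2 → 11, 3 → 10. All function names are illustrative.

```python
from itertools import product

# Assumed generator matrix of the octacode (a commonly quoted form).
G = [
    [3, 3, 2, 3, 1, 0, 0, 0],
    [3, 0, 3, 2, 3, 1, 0, 0],
    [3, 0, 0, 3, 2, 3, 1, 0],
    [3, 0, 0, 0, 3, 2, 3, 1],
]
GRAY = {0: (0, 0), 1: (0, 1), 2: (1, 1), 3: (1, 0)}  # assumed Gray map convention
LEE = {0: 0, 1: 1, 2: 2, 3: 1}                       # Lee weights of Z4 symbols


def codewords(gen):
    """All Z4-linear combinations of the rows of gen."""
    n = len(gen[0])
    return {
        tuple(sum(c * row[j] for c, row in zip(coeffs, gen)) % 4 for j in range(n))
        for coeffs in product(range(4), repeat=len(gen))
    }


def inner(x, y):
    """The inner product on Z4^n, reduced mod 4."""
    return sum(a * b for a, b in zip(x, y)) % 4


def lee_weight(word):
    return sum(LEE[s] for s in word)


def gray_image(word):
    """Binary vector of twice the length obtained by applying GRAY symbolwise."""
    return tuple(bit for s in word for bit in GRAY[s])


code = codewords(G)
min_lee = min(lee_weight(w) for w in code if any(w))
image = {gray_image(w) for w in code}
```

Under these assumptions the enumeration gives 256 codewords, pairwise orthogonal generator rows, and minimum Lee weight 6, and the 256 binary vectors in `image` are not closed under addition, in agreement with the nonlinearity of the Nordstrom-Robinson code.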
15.1.282 Remark In 1968, Preparata [2427] defined a family of nonlinear $(2^{m+1}, 2^{2^{m+1}-2m-2}, 6)_2$
binary codes when m is odd. These codes have twice as many codewords as a linear
$(2^{m+1}, 2^{m+1} - 2m - 3, 6)_2$ extended binary BCH code. Each Preparata code has the largest
number of codewords of any binary code of its length $2^{m+1}$ and minimum distance 6;
see Chapter 17, Section 3 of [1991]. These codes lie between $\mathrm{RM}_{2^{m+1},2}(m - 2, m + 1)$
and $\mathrm{RM}_{2^{m+1},2}(m - 1, m + 1)$. In 1972, Kerdock [1728] defined a family of nonlinear
$(2^{m+1}, 4^{m+1}, 2^m - 2^{(m-1)/2})_2$ binary codes when m is odd. These codes lie between
$\mathrm{RM}_{2^{m+1},2}(1, m + 1)$ and $\mathrm{RM}_{2^{m+1},2}(2, m + 1)$. Amazingly, the weight distribution of the
$(2^{m+1}, 2^{2^{m+1}-2m-2}, 6)_2$ Preparata code is the MacWilliams transform of the weight distribution
of the $(2^{m+1}, 4^{m+1}, 2^m - 2^{(m-1)/2})_2$ Kerdock code. When m = 3, both codes are equivalent
to the Nordstrom–Robinson code of Example 15.1.281. In [1409], an extended cyclic
$\mathbb{Z}_4$-linear code of length $2^m$ and type $4^{m+1}$, denoted $\widehat{K}(m + 1)$, was defined; the Gray image
$G(\widehat{K}(m + 1))$ is equivalent, by permuting coordinates, to the $(2^{m+1}, 4^{m+1}, 2^m - 2^{(m-1)/2})_2$
binary code defined by Kerdock. The Lee weight distribution $\{L_0, L_1, \ldots, L_{2^{m+1}}\}$ of $\widehat{K}(m + 1)$,
and hence the weight distribution of $G(\widehat{K}(m + 1))$, is
\[ L_i = \begin{cases} 1 & \text{if } i = 0 \text{ or } i = 2^{m+1}, \\ 2^{m+1}(2^m - 1) & \text{if } i = 2^m \pm 2^{(m-1)/2}, \\ 2^{m+2} - 2 & \text{if } i = 2^m, \\ 0 & \text{otherwise.} \end{cases} \]
Letting $P(m + 1) = \widehat{K}(m + 1)^\perp$, the Gray image G(P(m + 1)), although generally
not the same as the original Preparata code, has the same weight distribution as the
$(2^{m+1}, 2^{2^{m+1}-2m-2}, 6)_2$ binary Preparata code; this weight distribution can be obtained by
using the MacWilliams transform on the weight distribution of $G(\widehat{K}(m + 1))$. Preparata's
original construction begins with a single-error correcting binary BCH code and a double-
error correcting subcode of this code. A linear code is created using a variation of the
construction in Lemma 15.1.93, and then cosets of this code are adjoined to give a
$(2^{m+1} - 1, 2^{2^{m+1}-2m-2}, 5)_2$ code for m odd. Adding an overall parity check yields a
$(2^{m+1}, 2^{2^{m+1}-2m-2}, 6)_2$ code. There is a similar connection [1409] between the nonlinear
binary codes of minimum distance 8 of Goethals [1288] and the nonlinear binary codes of
high minimum distance of Delsarte and Goethals [804].
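The weight distribution of the Kerdock code and its MacWilliams transform are easy to tabulate. The sketch below is an illustration with hypothetical function names; it encodes the distribution given in the remark for odd m and applies the binary MacWilliams transform through Krawtchouk polynomials, using exact rational arithmetic.

```python
from fractions import Fraction
from math import comb


def kerdock_weight_distribution(m):
    """Length n = 2^(m+1) and nonzero weight counts of the Kerdock code, m odd."""
    n = 2 ** (m + 1)
    half = 2 ** m
    delta = 2 ** ((m - 1) // 2)
    dist = {0: 1, n: 1, half: 2 ** (m + 2) - 2}
    dist[half - delta] = dist[half + delta] = 2 ** (m + 1) * (2 ** m - 1)
    return n, dist


def krawtchouk(n, j, i):
    """Binary Krawtchouk polynomial K_j(i) for length n."""
    return sum((-1) ** s * comb(i, s) * comb(n - i, j - s) for s in range(j + 1))


def macwilliams_transform(n, dist):
    """B_j = (1/|C|) * sum_i A_i K_j(i); only nonzero B_j are returned."""
    size = sum(dist.values())
    out = {}
    for j in range(n + 1):
        total = sum(a * krawtchouk(n, j, i) for i, a in dist.items())
        if total:
            out[j] = Fraction(total, size)
    return out
```

For m = 3 the distribution has 1, 112, 30, 112, 1 words of weights 0, 6, 8, 10, 16, and the transform returns the same distribution, which is consistent with the Kerdock and Preparata codes both being the Nordstrom-Robinson code in that case.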
15.1.8 Conclusion
15.1.283 Remark This section has outlined certain aspects of algebraic coding theory including much
of the early work in the subject. Two books [233, 305] compile and comment on initial
papers from the first 25 years of coding theory that greatly influenced the development of
the discipline. The references cited in this section found in [233] include [359, 1090, 1092,
1292, 1319, 1320, 1329, 1408, 1516, 1638, 2035, 2298, 2447, 2811]. References cited in this
section found in [305] include [359, 1292, 1329, 1408, 1516, 1638, 1942, 2011, 2035, 2298,
2427, 2447, 2855].
See Also
[13] Gröbner bases are useful in the theory of algebraic geometry codes.
[936] Describes the essential properties of concatenated codes.
[1525] Develops the theory of Goppa and algebraic geometry codes.
[1530] For a proof of the classification of perfect codes.
References Cited: [13, 142, 231, 233, 304, 305, 311, 359, 578, 801, 802, 804, 805, 806, 936,
1090, 1092, 1093, 1164, 1288, 1292, 1319, 1320, 1329, 1376, 1377, 1408, 1409, 1516, 1522,
1525, 1530, 1558, 1638, 1639, 1691, 1692, 1728, 1827, 1901, 1942, 1943, 1945, 1991, 2010,
2011, 2035, 2047, 2050, 2136, 2164, 2298, 2361, 2389, 2390, 2402, 2403, 2405, 2427, 2447,
2484, 2501, 2506, 2511, 2608, 2687, 2689, 2696, 2739, 2741, 2805, 2810, 2811, 2849, 2855,
2965, 2966, 2999]
15.2.1 Remark Algebraic-geometry codes constitute a powerful family of codes which were intro-
duced in their classical form by Goppa in the years 1977 to 1982 [1321, 1322, 1323]. Goppa
used the language of algebraic curves over finite fields to define algebraic-geometry codes.
Modern expositions prefer a description in terms of algebraic function fields over finite fields.
In order to be consistent with the recent literature, we follow this practice in the present
section.
15.2.2 Remark We follow the terminology for algebraic function fields in Chapter 12. Let F be an
algebraic function field (of one variable) with full constant field Fq , that is, Fq is algebraically
closed in F . A divisor of F is a finite Z-linear combination of places of F . If P denotes the
set of all places of F , then a divisor G of F can be uniquely written in the form
\[ G = \sum_{P \in \mathcal{P}} m_P P, \]
where $m_P \in \mathbb{Z}$ for all $P \in \mathcal{P}$ and $m_P \ne 0$ for only finitely many $P \in \mathcal{P}$. The support of G
is the set of all $P \in \mathcal{P}$ with $m_P \ne 0$. Consequently, the support of G is a finite set. The
degree deg(G) of G is defined by
\[ \deg(G) = \sum_{P \in \mathcal{P}} m_P \deg(P), \]
where deg(P ) denotes the degree of the place P ; see Section 12.1. The divisor G is positive
(written G ≥ 0) if mP ≥ 0 for all P ∈ P. The divisors of F form an abelian group with
respect to addition, where two divisors of F are added by adding the corresponding integer
coefficients in the above unique representation of divisors.
15.2.3 Remark A basic object in the construction of an algebraic-geometry code is a Riemann-Roch
space. For any $f \in F^*$, the principal divisor div(f) of f is given by
\[ \mathrm{div}(f) = \sum_{P \in \mathcal{P}} \nu_P(f)\, P, \]
where $\nu_P$ denotes the normalized discrete valuation of F associated with the place P. Note
that the image of $\nu_P$ as a map is $\mathbb{Z} \cup \{\infty\}$. If $f \in F^*$ is given, then $\nu_P(f) \ne 0$ for at most
It is an important fact that L(G) is a finite-dimensional vector space over $\mathbb{F}_q$. We write ℓ(G)
for the dimension of this vector space. The Riemann-Roch theorem (Chapter 12) provides
important information on ℓ(G).
15.2.4 Definition Let n be a positive integer and let q be an arbitrary prime power. Let F be an
algebraic function field with full constant field Fq , genus g, and at least n rational places.
Choose n distinct rational places P1 , . . . , Pn of F and a divisor G of F such that none of
the Pi , 1 ≤ i ≤ n, is in the support of G. The algebraic-geometry code C(P1 , . . . , Pn ; G)
is defined as the image of the $\mathbb{F}_q$-linear map $\psi : L(G) \to \mathbb{F}_q^n$ given by
15.2.5 Remark Note that νPi (f ) ≥ 0 for 1 ≤ i ≤ n and all f ∈ L(G), since Pi is not in the support
of G. Thus, f belongs to the valuation ring OPi of Pi for 1 ≤ i ≤ n, and so the residue
class f (Pi ) of f , that is, the image of f under the residue class map of the place Pi , is well
defined. Since Pi is a rational place, that is, a place of degree 1, we can identify f (Pi ) with
an element of $\mathbb{F}_q$. Thus, $C(P_1, \ldots, P_n; G)$ is indeed a subset of $\mathbb{F}_q^n$.
15.2.6 Theorem With the notation and assumptions in Definition 15.2.4, suppose that the divisor
G of F satisfies also g ≤ deg(G) < n. Then the algebraic-geometry code C(P1 , . . . , Pn ; G)
is a linear code over Fq with length n, dimension
\[ k = \ell(G) \ge \deg(G) + 1 - g, \]
15.2.18 Remark The condition g ≤ deg(G) < n in Theorem 15.2.6 implies that the number N of
rational places of the algebraic function field F must be greater than the genus g of F , in
order for Theorem 15.2.6 to be applicable. However, for small values of q such as q = 2 and
q = 3, the condition N > g can be satisfied only for small values of g. Consequently, for
small values of q the construction of classical algebraic-geometry codes in Definition 15.2.4
can yield good codes only for small lengths. A possible remedy is to devise constructions
that use not only rational places, but also places of higher degree. Such constructions will
be discussed in this subsection.
15.2.19 Remark The construction of NXL codes due to Niederreiter, Xing, and Lam [2285] uses
places of arbitrary degree. We present this construction in the more general form given in
Chapter 5 of [2281]. Let F be an algebraic function field with full constant field Fq and
\[ \{f_{i,j} + L(D) : 1 \le j \le s_i\} \]
of the factor space $L(D + G_i)/L(D)$. The n-dimensional factor space $L(D + \sum_{i=1}^{r} G_i)/L(D)$
has then the $\mathbb{F}_q$-basis
\[ \{f_{i,j} + L(D) : 1 \le j \le s_i,\ 1 \le i \le r\} \]
which we order in a lexicographic manner. We note further that every
\[ f \in L\Bigl(D + \sum_{i=1}^{r} G_i - E\Bigr) \subseteq L\Bigl(D + \sum_{i=1}^{r} G_i\Bigr) \]
15.2.20 Definition The NXL code $C(G_1, \ldots, G_r; D, E)$ is defined as the image of the $\mathbb{F}_q$-linear
map
\[ \eta : L\Bigl(D + \sum_{i=1}^{r} G_i - E\Bigr) \to \mathbb{F}_q^n \]
given by
\[ \eta(f) = (c_{1,1}, \ldots, c_{1,s_1}, \ldots, c_{r,1}, \ldots, c_{r,s_r}) \]
for all $f \in L(D + \sum_{i=1}^{r} G_i - E)$, where the $c_{i,j}$ are as in Remark 15.2.19.
15.2.21 Theorem Assume that the hypotheses in Remark 15.2.19 are satisfied and put
m = deg(E − D). Then the NXL code $C(G_1, \ldots, G_r; D, E)$ is a linear code over $\mathbb{F}_q$ with
length $n = \sum_{i=1}^{r} s_i$, dimension
\[ k = \ell\Bigl(D + \sum_{i=1}^{r} G_i - E\Bigr) \ge n - m - g + 1, \]
15.2.23 Remark The following construction of XNL codes is due to Xing, Niederreiter, and
Lam [3019]. It is a powerful method of combining data from algebraic function fields over
finite fields with (short) linear codes in order to produce a longer linear code as the output.
We present the slightly more general version of XNL codes in Chapter 5 of [2281].
15.2.24 Definition Let F be an algebraic function field with full constant field Fq . Let P1 , . . . , Pr
be distinct places of F which can have arbitrary degrees. Let G be a divisor of F such
that none of the Pi , 1 ≤ i ≤ r, is in the support of G. For each i = 1, . . . , r, let Ci be a
linear code over $\mathbb{F}_q$ with length $n_i$, dimension $k_i \ge \deg(P_i)$, and minimum distance $d_i$,
and let $\varphi_i$ be an injective $\mathbb{F}_q$-linear map from the residue class field of $P_i$ into $C_i$. Put
$n = \sum_{i=1}^{r} n_i$. Then the XNL code $C(P_1, \ldots, P_r; G; C_1, \ldots, C_r)$ is defined as the image
of the $\mathbb{F}_q$-linear map $\beta : L(G) \to \mathbb{F}_q^n$ given by
15.2.25 Theorem With the notation and assumptions in Definition 15.2.24, suppose that the divisor
G of F satisfies also $g \le \deg(G) < \sum_{i=1}^{r} \deg(P_i)$, where g is the genus of F. Then the
XNL code $C(P_1, \ldots, P_r; G; C_1, \ldots, C_r)$ is a linear code over $\mathbb{F}_q$ with length $n = \sum_{i=1}^{r} n_i$,
dimension
\[ k = \ell(G) \ge \deg(G) + 1 - g, \]
and minimum distance d ≥ d0, where d0 is the minimum of $\sum_{i \in \overline{M}} d_i$ taken over all subsets
M of {1, . . . , r} for which $\sum_{i \in M} \deg(P_i) \le \deg(G)$, with $\overline{M}$ denoting the complement of M
in {1, . . . , r}. Moreover, if deg(G) ≥ 2g − 1, then k = deg(G) + 1 − g.
15.2.26 Corollary If in addition $\deg(P_i) \ge d_i$ for 1 ≤ i ≤ r, then the minimum distance d of the
XNL code $C(P_1, \ldots, P_r; G; C_1, \ldots, C_r)$ satisfies
\[ d \ge \sum_{i=1}^{r} d_i - \deg(G). \]
15.2.27 Remark If P1 , . . . , Pr are distinct rational places of F and for each i = 1, . . . , r we choose
Ci to be the trivial linear code over Fq with ni = ki = di = 1 and φi the identity map
on Fq , then the construction of XNL codes reduces to that of classical algebraic-geometry
codes. Theorem 15.2.6 is thus a special case of Theorem 15.2.25 and Corollary 15.2.26.
15.2.28 Example Many excellent examples of XNL codes were found in [875] and [3020]. The
following simple example is typical. Let q = 2 and let F = F2 (x, y) be the elliptic function
field defined by $y^2 + y = x + x^{-1}$. We choose r = 6 and let $P_1, P_2, P_3, P_4$ be the four rational
places and P5 and P6 the two places of degree 2 of F . The linear codes Ci have the following
parameters: for 1 ≤ i ≤ 4 we let ni = ki = di = 1 and for i = 5, 6 we let ni = 3, ki = 2,
di = 2. Then for m = deg(G) = 1, . . . , 7 the corresponding XNL code is a linear code over
F2 with parameters n = 10, k = m, and d ≥ 8 − m. The linear codes with m = 2, 3, 4 and
d = 8 − m are optimal.
15.2.29 Example Let q = 3 and let F = F3 (x, y) be the elliptic function field defined by
$y^2 = x(x^2 + x - 1)$. We choose r = 9 and let $P_1, P_2, P_3, P_4, P_5, P_6$ be the six rational places
and P7 , P8 , P9 the three places of degree 2 of F . The linear codes Ci have the following
parameters: for 1 ≤ i ≤ 6 we let ni = ki = di = 1 and for 7 ≤ i ≤ 9 we let ni = 3, ki = 2,
di = 2. Then for m = deg(G) = 1, . . . , 11 the corresponding XNL code is a linear code over
F3 with parameters n = 15, k = m, and d ≥ 12 − m. The linear code with m = 3 and d = 9
is optimal.
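The parameters quoted in Examples 15.2.28 and 15.2.29 follow from Theorem 15.2.25 by a finite search over subsets of places. The sketch below is an illustration with hypothetical names: each place is described by the tuple (deg P_i, n_i, k_i, d_i), and the function returns the length n, the lower bound deg(G) + 1 − g on the dimension, and the bound d0 on the minimum distance.

```python
from itertools import combinations


def xnl_parameters(places, deg_G, genus):
    """Return (n, deg(G) + 1 - g, d0) for places given as (deg P, n_i, k_i, d_i)."""
    total_degree = sum(place[0] for place in places)
    if not genus <= deg_G < total_degree:
        raise ValueError("need g <= deg(G) < sum of deg(P_i)")
    n = sum(place[1] for place in places)
    k_lower = deg_G + 1 - genus
    indices = range(len(places))
    # d0: minimum over subsets M with sum_{i in M} deg(P_i) <= deg(G)
    # of the sum of d_i over the complement of M.
    d0 = min(
        sum(places[i][3] for i in indices if i not in M)
        for size in range(len(places) + 1)
        for M in map(set, combinations(indices, size))
        if sum(places[i][0] for i in M) <= deg_G
    )
    return n, k_lower, d0


# Example 15.2.28: four rational places and two places of degree 2, genus 1.
example_28 = [(1, 1, 1, 1)] * 4 + [(2, 3, 2, 2)] * 2
# Example 15.2.29: six rational places and three places of degree 2, genus 1.
example_29 = [(1, 1, 1, 1)] * 6 + [(2, 3, 2, 2)] * 3
```

With these inputs, `xnl_parameters(example_28, m, 1)` gives n = 10, k = m, and d0 = 8 − m for m = 1, . . . , 7, and `xnl_parameters(example_29, m, 1)` gives n = 15, k = m, and d0 = 12 − m for m = 1, . . . , 11; here the dimension bound is an equality because deg(G) ≥ 2g − 1.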
15.2.30 Remark By the same argument as in Remark 15.2.12, the condition in Definition 15.2.24
that none of the Pi , 1 ≤ i ≤ r, is in the support of G can be dropped if we replace G by a
suitable divisor G0 .
15.2.31 Remark Decoding algorithms for generalized algebraic-geometry codes can be found
in [1496].
15.2.32 Remark As for classical algebraic-geometry codes (Remark 15.2.14), the dual codes of
XNL codes can be described in terms of differentials and residues for algebraic function
fields [912].
15.2.33 Remark A very general perspective on the construction of algebraic-geometry codes is that
of function-field codes. A function-field code is a special type of subspace of an algebraic
function field over a finite field from which linear codes can be derived. Function-field codes
were introduced in Chapter 6 of [2280] and studied in detail by Hachenberger, Niederreiter,
and Xing [1396].
15.2.34 Remark Let F be an algebraic function field with full constant field Fq . For a place P of
F , let OP denote the valuation ring of P and let MP be the unique maximal ideal of OP .
For a finite nonempty set Q of places of F, we write
\[ O_Q = \bigcap_{P \in Q} O_P, \qquad M_Q = \bigcap_{P \in Q} M_P. \]
15.2.35 Definition Let F be an algebraic function field with full constant field Fq and let Q be a
finite nonempty set of places of F . A function-field code (in F with respect to Q) is a
nonzero finite-dimensional Fq -linear subspace V of F which satisfies the two conditions
V ⊆ OQ and V ∩ MQ = {0}.
15.2.36 Theorem Let Q be a finite nonempty set of places of F and let G be a divisor of F such
that ℓ(G) ≥ 1, none of the places in Q is in the support of G, and
\[ \deg(G) < \sum_{P \in Q} \deg(P). \]
are linearly independent over $\overline{\mathbb{F}}_q$, where $\overline{\mathbb{F}}_q$ is an algebraic closure of $\mathbb{F}_q$. Then the $\mathbb{F}_q$-linear
subspace of F spanned by f1 , . . . , fk is a function-field code in F with respect to Q.
15.2.38 Remark For suitable Q and k in Theorem 15.2.37, appropriate elements f1 , . . . , fk can be
constructed by the approximation theorem for valuations.
15.2.39 Remark Any nonzero Fq -linear subspace of a function-field code in F with respect to Q is
again a function-field code in F with respect to Q.
15.2.40 Remark The most powerful method of deriving linear codes from function-field codes is
described in Definition 15.2.41 below and generalizes the construction of XNL codes in
Definition 15.2.24.
where on the right-hand side we use concatenation of vectors. Then the linear code
CQ (V ; C1 , . . . , Cr ) over Fq is defined as the image of V under γ.
15.2.42 Remark The following notation is convenient. For I ⊆ {1, . . . , r} we write $\overline{I}$ for the complement
of I in {1, . . . , r} and $Q(I) = \{P_i : i \in I\} \subseteq Q$. We put
\[ d0 = \min_{I} \sum_{i \in I} d_i, \]
where the minimum is extended over all I ⊆ {1, . . . , r} for which $V \cap M_{Q(\overline{I})} \ne \{0\}$. The last
condition is always assumed to be satisfied for I = {1, . . . , r}. The condition $V \cap M_Q = \{0\}$
in Definition 15.2.35 implies that d0 ≥ 1.
15.2.43 Theorem The code $C_Q(V; C_1, \ldots, C_r)$ in Definition 15.2.41 is a linear code over $\mathbb{F}_q$ with
length n, dimension k, and minimum distance d, where
\[ n = \sum_{i=1}^{r} n_i, \qquad k = \dim(V), \qquad d \ge d0, \]
The number
\[ \vartheta_Q(V) = \min\{\vartheta_Q(f) : f \in V,\ f \ne 0\} \]
is the minimum block weight of V (with respect to Q).
15.2.45 Theorem If the notation is arranged in such a way that $d_1 \le d_2 \le \cdots \le d_r$, then the
minimum distance d of the code $C_Q(V; C_1, \ldots, C_r)$ in Definition 15.2.41 satisfies
\[ d \ge \sum_{i=1}^{\vartheta_Q(V)} d_i, \]
In fact, as the following theorem from Chapter 6 in [2280] shows, a special family of these
codes suffices to represent any linear code.
15.2.47 Theorem Let C be an arbitrary linear code over Fq with length n and dimension k. Then
there exists an algebraic function field F with full constant field Fq , a set Q of n distinct
rational places of F , and a k-dimensional function-field code V in F with respect to Q such
that C is equal to the code CQ (V ; C1 , . . . , Cn ), where Ci is the trivial linear code over Fq
of length 1 and dimension 1 for 1 ≤ i ≤ n.
15.2.48 Remark A stronger result than Theorem 15.2.47, according to which an even smaller family
of codes can represent any linear code, was proved in [2380].
15.2.49 Remark The asymptotic theory of codes studies the set of ordered pairs of asymptotic
relative minimum distances and asymptotic information rates as the length of codes goes
to ∞. This theory considers general (including nonlinear) codes as well as the special case
of linear codes. Algebraic-geometry codes play a decisive role in this theory.
15.2.50 Remark We write n(C) for the length of a code C and d(C) for the minimum distance of
C. Furthermore, we use logq to denote the logarithm to the base q.
15.2.51 Definition For a given prime power q, let U_q (respectively U_q^{lin}) be the set of all ordered
pairs (δ, R) ∈ [0, 1]^2 for which there exists a sequence C1 , C2 , . . . of general (respectively
linear) codes over Fq such that n(Ci ) → ∞ as i → ∞ and
15.2.52 Proposition [2820] There exists a function α_q (respectively α_q^{lin}) on [0, 1] such that
U_q = {(δ, R) : 0 ≤ R ≤ α_q(δ), 0 ≤ δ ≤ 1}
and
U_q^{lin} = {(δ, R) : 0 ≤ R ≤ α_q^{lin}(δ), 0 ≤ δ ≤ 1}.
15.2.53 Proposition [2820] The functions α_q and α_q^{lin} have the following properties:
15.2.55 Definition For 0 < δ < 1, the q-ary entropy function Hq is defined by (Definition 15.1.122)
15.2.56 Remark The benchmark bound in the asymptotic theory of codes is the asymptotic Gilbert-
Varshamov bound in Theorem 15.2.57 below which dates from the 1950s.
15.2.57 Theorem For any prime power q, we have
α_q(δ) ≥ α_q^{lin}(δ) ≥ R_GV(q, δ) := 1 − H_q(δ) for 0 < δ < (q − 1)/q.
15.2.58 Theorem [3017] For any prime power q and any real number δ with 0 < δ < (q − 1)/q, there
exists a sequence C1 , C2 , . . . of classical algebraic-geometry codes over Fq with n(Ci ) → ∞
as i → ∞ which yields
α_q^{lin}(δ) ≥ R_GV(q, δ).
15.2.59 Remark Since no improvement on Theorem 15.2.57 was obtained for a long time, there was
speculation that maybe α_q^{lin}(δ) = R_GV(q, δ) for 0 < δ < (q − 1)/q. However, this conjecture
was disproved by the use of algebraic-geometry codes. The crucial result in this context
is the TVZ bound (named after Tsfasman, Vlăduţ, and Zink [2821]) in Theorem 15.2.60
below. For the formulation of this bound, we need the quantity
\[ A(q) = \limsup_{g \to \infty} \frac{N_q(g)}{g}, \]
where q is an arbitrary prime power and N_q(g) is the maximum number of rational places
that an algebraic function field with full constant field Fq and genus g can have. We note
that A(q) > 0 for all prime powers q and that A(q) = q^{1/2} − 1 if q is a square. The proof of
the TVZ bound uses appropriate sequences of classical algebraic-geometry codes for which
the length goes to ∞.
15.2.60 Theorem [2821] For any prime power q, we have
\[ \alpha_q(\delta) \ge \alpha_q^{\mathrm{lin}}(\delta) \ge R_{\mathrm{TVZ}}(q, \delta) := 1 - \frac{1}{A(q)} - \delta \quad \text{for } 0 \le \delta \le 1. \]
15.2.61 Remark The following two theorems show that for certain sufficiently large values of the
prime power q, we can beat the asymptotic Gilbert-Varshamov bound in Theorem 15.2.57
by using suitable sequences of classical algebraic-geometry codes. Both theorems are simple
consequences of the TVZ bound in Theorem 15.2.60 and information about the quantity
A(q) defined in Remark 15.2.59.
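The comparison between the two bounds can be checked numerically. The following is a minimal sketch, not part of the cited proofs: it assumes the standard q-ary entropy function H_q(δ) = δ log_q(q − 1) − δ log_q δ − (1 − δ) log_q(1 − δ), takes A(q) = q^{1/2} − 1 (so it applies only when q is a square), and evaluates both rates at δ = (q − 1)/(2q − 1). The function names are illustrative only.

```python
import math

def q_ary_entropy(q: int, delta: float) -> float:
    """Standard q-ary entropy H_q(delta) for 0 < delta < 1 (logarithms to base q)."""
    def log_q(x: float) -> float:
        return math.log(x) / math.log(q)
    return (delta * log_q(q - 1)
            - delta * log_q(delta)
            - (1 - delta) * log_q(1 - delta))

def r_gv(q: int, delta: float) -> float:
    """Asymptotic Gilbert-Varshamov rate R_GV(q, delta) = 1 - H_q(delta)."""
    return 1.0 - q_ary_entropy(q, delta)

def r_tvz_square(q: int, delta: float) -> float:
    """TVZ rate 1 - 1/A(q) - delta, assuming q is a square so A(q) = sqrt(q) - 1."""
    a_q = math.isqrt(q) - 1
    return 1.0 - 1.0 / a_q - delta

# Compare the two rates at delta = (q - 1)/(2q - 1) for a few square q.
for q in (25, 49, 64):
    delta = (q - 1) / (2 * q - 1)
    print(q, r_gv(q, delta), r_tvz_square(q, delta))
```

Under these assumptions the TVZ rate falls below the Gilbert-Varshamov rate for q = 25 and exceeds it for q = 49 and q = 64, in line with the threshold q ≥ 49 for squares.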
15.2.62 Theorem Let q ≥ 49 be the square of a prime power. Then there exists an open subinterval
(δ1 , δ2 ) of (0, (q − 1)/q) containing (q − 1)/(2q − 1) such that
15.2.63 Theorem Let q ≥ 343 be the cube of a prime power. Then there exists an open subinterval
(δ1 , δ2 ) of (0, (q − 1)/q) containing (q − 1)/(2q − 1) such that
15.2.64 Remark More generally, it was shown in [2284] that a result like Theorem 15.2.62 holds for
all sufficiently large composite nonsquare prime powers q. It is not known whether such a
result holds also for all sufficiently large primes q.
15.2.65 Remark If one considers general (i.e., not necessarily linear) codes, then global improve-
ments on the TVZ bound for αq (δ) in Theorem 15.2.60 can be obtained. By a global
improvement we mean an improvement for any q and δ by a positive quantity independent
of δ. The currently best global improvement on the TVZ bound is the Niederreiter-Özbudak
bound in Theorem 15.2.66 below.
15.2.66 Theorem [2262] For any prime power q, we have
\[ \alpha_q(\delta) \ge R_{\mathrm{NO}}(q, \delta) := 1 - \frac{1}{A(q)} - \delta + \log_q\left(1 + \frac{1}{q^3}\right) \quad \text{for } 0 \le \delta \le 1. \]
15.2.67 Remark The proof of the Niederreiter-Özbudak bound in [2262] is quite involved. A simpler
proof using a new variant of the construction of algebraic-geometry codes was presented
in [2715], but this proof works only for a slightly restricted range of the parameter δ. The
new construction proceeds as follows. Let n be a positive integer (which will be the length
of the code) and let F be an algebraic function field with full constant field Fq such that
F has at least n + 1 rational places. Choose n + 1 distinct rational places P0 , P1 , . . . , Pn
of F and put Q = {P1 , . . . , Pn }. Furthermore, let m, r, and s be integers with m ≥ 1 and
1 ≤ r ≤ min(n, s). We let 𝒢(Q; r, s) be the set of all positive divisors G of F of degree s
such that the support of G is a subset of Q of cardinality r. Next we define
\[ S(mP_0; Q; r, s) = \bigcup_{G \in \mathcal{G}(Q; r, s)} \{ f \in L(mP_0 + G) : \nu_{P_0}(f) = -m \}. \]
See Also
References Cited: [92, 875, 912, 1321, 1322, 1323, 1375, 1396, 1415, 1496, 1524, 2262, 2264,
2265, 2280, 2281, 2284, 2285, 2380, 2714, 2715, 2820, 2821, 3017, 3019, 3020, 3029]
15.3.1 Definition [1987] A binary linear code of length n is a Gallager code provided it has an
m × n parity check matrix H where every column has fixed weight c and every row has
weight as close to nc/m as possible. This code will be denoted Galn,2 (c, H). If r = nc/m
is an integer and every row of H has weight r, the Gallager code is called regular.
15.3.2 Definition A binary linear code of length n is a low density parity check (LDPC) code
provided it is a regular Gallager code Galn,2 (c, H). This code will be denoted by
LDPCn,2 (c, r, H) where c is the weight of each column of H and r is the weight of
each row of H.
15.3.3 Remark Note that the m × n matrix H used to define a Gallager or LDPC code C is not
assumed to have independent rows and so is technically not a parity check matrix in the
sense often used. However, H can be row reduced and the zero rows removed to form a
parity check matrix for C with independent rows. Thus the dimension of C is at least n − m.
Generally the column and row weights are chosen to be relatively small compared to n,
and thus the density of 1s in H is low. A natural generalization for Gallager and LDPC
codes is to allow the row weights and column weights of H to both vary, in some controlled
manner. The resulting codes are sometimes termed irregular LDPC codes. Unless specific
parameters are given, for the remainder of this section, the term “LDPC” will refer to either
regular or irregular LDPC codes.
15.3.4 Remark Gallager and LDPC codes were developed by R. G. Gallager in [1162, 1163].
Gallager also presented two iterative decoding algorithms designed to decode these codes of
long length, several thousand bits for example. These algorithms are presented in Remarks
15.3.8 and 15.3.12.
15.3.5 Remark Gallager codes are useful because they achieve close to optimal
properties. Roughly speaking, for any c ≥ 3 and any λ > 1, there exist Gallager codes
of long enough length and rates up to 1 − 1/λ such that virtually error-free transmission
occurs; the specific codes that achieve this property are dependent upon the characteristics
of the communication channel being used. Additionally, for an appropriate choice of c, there
exist Gallager codes Galn,2 (c, H), where H is m × n, of rates arbitrarily close to channel
capacity and c/m arbitrarily small. The precise formulation of these statements can be
found in [1987]. Furthermore, there exist irregular LDPC codes whose performance very
closely approaches the Shannon channel capacity for the binary additive white Gaussian
noise (AWGN) channel [642, 2457]. These codes compare favorably with, and can even surpass,
the performance of turbo codes [1987, 2457].
15.3.6 Remark LDPC codes are part of several standards used in digital television, optical com-
munication, and mobile wireless communication, including DVB-T2, DVB-S2, WiMAX-
IEEE 802.16e, and IEEE 802.11n [1046]. Both the DVB-T2 standard [995], used in digital
television, and the DVB-S2 standard [994], used in a variety of satellite applications, employ
LDPC codes concatenated with BCH codes. In addition, a 2007 paper from the Consulta-
tive Committee for Space Data Systems (CCSDS) outlines the experimental specifications
for the use of LDPC codes for near-Earth and deep space communications [573].
15.3.7 Remark Gallager and LDPC codes can be defined over other alphabets. Such codes were
examined in Gallager’s original work [1163]; see also [2455]. Only binary codes are considered
here.
15.3.8 Remark Gallager’s first decoding algorithm is an iterative algorithm that involves flipping
bits in the received vector. This algorithm works best on a binary symmetric channel (BSC)
when the code rate is well below channel capacity. Assume that the codeword c is trans-
mitted and y = (y1 , y2 , . . . , yn ) is received using Galn,2 (c, H). In the computation of the
syndrome HyT , each received bit yi affects at most c components of that syndrome, as the
i-th bit is in c parity check equations. Let Si denote the set of bits involved in these c parity
check equations. If among all the bits Si involved in these c parity check equations only the
i-th is in error, then the c components of HyT , arising from these c parity check equations,
will equal 1 indicating the parity check equations are not satisfied. Even if there are some
other errors among the bits Si , one expects that several of these c components of HyT will
equal 1. This is the basis of the Gallager Bit-Flipping Decoding Algorithm.
I. Compute HyT and determine the unsatisfied parity check equations, i.e., the
parity check equations where the components of HyT equal 1.
II. For each of the n bits, compute the number of these unsatisfied parity check
equations involving that bit.
III. Flip the bits of y, from 0 to 1 or 1 to 0, that are involved in the largest number
of these unsatisfied parity check equations; call the resulting vector y again.
IV. Iteratively repeat I, II, and III until either HyT = 0T , in which case the received
vector is decoded as this latest y, or until a certain number of iterations is
reached, in which case the received vector is not decoded and a decoding failure
is declared.
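Steps I to IV above can be sketched in a few lines. This is a minimal illustration, not an optimized decoder. The matrix H below is a hypothetical toy example chosen for this sketch (the vertex-edge incidence matrix of the complete graph on four vertices, so every column has weight c = 2 and every row has weight 3), and the function name and iteration limit are assumptions.

```python
def bit_flip_decode(H, y, max_iters=50):
    """Gallager bit flipping over F_2.

    H is a list of m rows of 0/1 entries, y the received 0/1 vector.
    Returns (vector, True) on success and (last vector, False) on failure.
    """
    y = list(y)
    n = len(y)
    for iteration in range(max_iters + 1):
        # Step I: syndrome H y^T; an entry 1 marks an unsatisfied check.
        syndrome = [sum(h * b for h, b in zip(row, y)) % 2 for row in H]
        if not any(syndrome):
            return y, True
        if iteration == max_iters:
            break
        # Step II: for each bit, count the unsatisfied checks involving it.
        counts = [sum(s for row, s in zip(H, syndrome) if row[k])
                  for k in range(n)]
        # Step III: flip every bit that attains the largest count.
        worst = max(counts)
        y = [b ^ 1 if c == worst else b for b, c in zip(y, counts)]
    return y, False

# Hypothetical toy parity check matrix with column weight 2 and row weight 3.
H = [[1, 1, 1, 0, 0, 0],
     [1, 0, 0, 1, 1, 0],
     [0, 1, 0, 1, 0, 1],
     [0, 0, 1, 0, 1, 1]]
codeword = [1, 1, 0, 1, 0, 0]
received = [1, 1, 0, 1, 1, 0]   # the codeword with its fifth bit flipped
print(bit_flip_decode(H, received))
```

In this toy case the flipped bit is the only one lying in two unsatisfied checks, so a single pass through Steps I to III restores the codeword.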
15.3.9 Remark In order to describe the second of Gallager’s decoding algorithms, we need the
concept of a Tanner graph [2779], which can be defined for any code.
15.3.10 Definition The Tanner graph is a bipartite graph constructed from an m × n parity
check matrix H for a binary code. The Tanner graph has two types of vertices: variable
and check nodes. There are n variable nodes, one corresponding to each coordinate or
column of H. There are m check nodes, one for each parity check equation or row of H.
The only edges of the Tanner graph join variable nodes to check nodes; a given check
node is connected to precisely those variable nodes whose columns of H have a 1 in the
row of that check node. Since a code can have many different parity check matrices, there are
many different Tanner graphs for a code. If the code is Galn,2 (c, H), the degree of each variable
node is c; if the code is LDPCn,2 (c, r, H), the degree of each variable node is c, and the
degree of each check node is r.
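As a small illustration of this definition, the adjacency lists of a Tanner graph can be read directly off the rows and columns of H. The sketch below uses 0-based indices rather than the 1-based numbering of the text; the function name and the toy matrix are assumptions made for this example.

```python
def tanner_graph(H):
    """Return (V, C) for the Tanner graph of the parity check matrix H.

    V[j] lists the variable nodes adjacent to check node j (the 1s in row j);
    C[k] lists the check nodes adjacent to variable node k (the 1s in column k).
    """
    m, n = len(H), len(H[0])
    V = [[k for k in range(n) if H[j][k]] for j in range(m)]
    C = [[j for j in range(m) if H[j][k]] for k in range(n)]
    return V, C

# Hypothetical toy matrix: column weight 2, row weight 3.
H = [[1, 1, 1, 0, 0, 0],
     [1, 0, 0, 1, 1, 0],
     [0, 1, 0, 1, 0, 1],
     [0, 0, 1, 0, 1, 1]]
V, C = tanner_graph(H)
print([len(c) for c in C])  # variable node degrees
print([len(v) for v in V])  # check node degrees
```

For this matrix every variable node has degree 2 and every check node has degree 3, as expected for a regular code.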
15.3.11 Remark The second of Gallager’s algorithms is an example of an iterative “message pass-
ing” decoding algorithm. Message passing decoding algorithms have the following general
characteristics.
1. Message passing algorithms are performed in iterations, also called rounds.
2. Messages are passed from a variable node to a check node and from a check node
to a variable node along the edges of the Tanner graph of a code. An outgoing
message sent from a node along an adjacent edge e depends on all the incoming
messages to that node except the incoming message along e.
3. There is an initial passing of messages from the variable nodes to the check nodes
(or perhaps from the check nodes to the variable nodes).
4. One iteration, or round, consists of passing messages from all check nodes to
all adjacent variable nodes followed by the passing of messages from all variable
nodes to all adjacent check nodes (or perhaps from variable nodes to check nodes
and then back to variable nodes).
5. At the end of each round, a computation is done that will either end the algorithm
or indicate that another iteration should be performed.
6. There is a preset maximum number of rounds that the algorithm will be allowed
to run.
15.3.12 Remark The description of Gallager’s message passing decoding algorithm comes from
[1987] and applies to binary channels in which noise bits are independent, such as the
memoryless BSC or the binary AWGN channel. The algorithm falls in a class of algorithms
called “sum-product algorithms” and is implemented using message passing.
Let C = Galn,2 (c, H) where H is an m × n matrix. In the Tanner graph T for C, number
the variable nodes 1, 2, . . . , n and the check nodes 1, 2, . . . , m. Let V (j) denote the variable
nodes connected to the j-th check node, and let C(k) denote the c check nodes connected
to the k-th variable node.
Suppose the codeword c is transmitted and y = c + e is received where e is the unknown
error vector. Given the syndrome Hy^T = He^T = z^T, the object of the decoder is to compute
the conditional probabilities P(e_k = 1 | z) for 1 ≤ k ≤ n. From there, the most likely error
vector and hence most likely codeword can be found. The algorithm is an iterative message
passing algorithm in which each message is a pair of probabilities. For 1 ≤ k ≤ n, the
initial message passed from the variable node k to the check node j ∈ C(k) (in Step I of
the algorithm) is the pair (q^0_{j,k}, q^1_{j,k}) = (p^0_k, p^1_k), where p^0_k = P(e_k = 0) and p^1_k = P(e_k = 1)
come directly from the channel statistics. For example, if communication is over a BSC
with crossover probability p, then p^0_k = 1 − p and p^1_k = p. Each iteration consists of first
passing messages from check nodes to variable nodes (in Step II of the algorithm) and then
passing messages from variable nodes back to check nodes (in Step III of the algorithm).
The message from the check node j to the variable node k for k ∈ V(j) is the pair (r^0_{j,k}, r^1_{j,k}),
where r^e_{j,k} for e ∈ {0, 1} is the probability that the j-th check equation is satisfied given
that e_k = e and that the other bits e_i for i ∈ V(j) \ {k} have probability distribution given
by {q^0_{j,i}, q^1_{j,i}}. The message from the variable node k to the check node j ∈ C(k) is the pair
(q^0_{j,k}, q^1_{j,k}), where q^e_{j,k} for e ∈ {0, 1} is the probability that e_k = e given information obtained
from checks C(k) \ {j}. The Gallager Message Passing Sum-Product Decoding Algorithm
for binary Gallager codes is the following.
I. For 1 ≤ k ≤ n, pass the message (q^0_{j,k}, q^1_{j,k}) = (p^0_k, p^1_k) from variable node k to
check node j ∈ C(k).
II. Update the values of r^e_{j,k} for k ∈ V(j) and e ∈ {0, 1} according to the equation
\[ r^e_{j,k} = \sum_{e_i \in \{0,1\},\ i \in V(j) \setminus \{k\}} P\bigl(z_j \mid e_k = e, \{e_i \mid i \in V(j) \setminus \{k\}\}\bigr) \prod_{i \in V(j) \setminus \{k\}} q^{e_i}_{j,i}. \]
For 1 ≤ j ≤ m, pass the message (r^0_{j,k}, r^1_{j,k}) from check node j to variable node
k ∈ V(j).
III. Update the values of q^e_{j,k} for j ∈ C(k) and e ∈ {0, 1} according to the equation
\[ q^e_{j,k} = \alpha_{j,k}\, p^e_k \prod_{i \in C(k) \setminus \{j\}} r^e_{i,k}, \]
where α_{j,k} is chosen so that q^0_{j,k} + q^1_{j,k} = 1. For 1 ≤ k ≤ n, pass the message
(q^0_{j,k}, q^1_{j,k}) from variable node k to check node j ∈ C(k).
for 1 ≤ k ≤ n and e ∈ {0, 1}. Then for 1 ≤ k ≤ n, set ê_k = 0 if q^0_k > q^1_k and ê_k = 1
if q^1_k > q^0_k. Let ê = (ê_1, ê_2, . . . , ê_n). If Hê^T = z^T, decode by setting e = ê and stop the
algorithm; otherwise repeat Steps II and III unless the predetermined maximum number of
iterations has been reached. If the maximum number of iterations has been reached and if
Hê^T never equals z^T, stop the algorithm and declare a decoding failure.
15.3.13 Remark If the Tanner graph T is without cycles and the algorithm successfully halts, then
the probabilities P(e_k = e | z) equal α_k q^e_k for 1 ≤ k ≤ n where α_k is chosen so that
α_k q^0_k + α_k q^1_k = 1. If the graph has cycles, the values α_k q^e_k are approximations to
P(e_k = e | z). In the end, one does not care exactly what these probabilities are; one only cares
about obtaining a solution to He^T = z^T. Thus the algorithm can be used effectively even when
there are long cycles present. The algorithm has been successful for codes of length a few thousand,
say n = 10,000, particularly with c small, say c = 3. Analysis of the algorithm can be found
in [1987].
15.3.14 Remark In Step II of the algorithm, the conditional probabilities P(z_j | e_k = e, {e_i | i ∈
V(j) \ {k}}) for k ∈ V(j) and e ∈ {0, 1} are required. These probabilities are either 0 or
1. Notice that the j-th check equation is the modulo 2 sum of the values e_k for k ∈ V(j).
Thus
\[ P\bigl(z_j \mid e_k = e, \{e_i \mid i \in V(j) \setminus \{k\}\}\bigr) = \begin{cases} 1 & \text{if } z_j \equiv e + \sum_{i \in V(j) \setminus \{k\}} e_i \pmod{2}, \\ 0 & \text{otherwise.} \end{cases} \]
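The message updates of Remark 15.3.12, with the 0/1 conditional probabilities of Remark 15.3.14, can be sketched as follows. This is a minimal illustration for small examples only. It enumerates the sum in Step II directly, which is exponential in the row weight. It also assumes that the tentative decision uses the pseudo-posterior q^e_k = p^e_k ∏_{j ∈ C(k)} r^e_{j,k}, as is usual for sum-product decoding, and that 0 < P(e_k = 1) < 1. The function name, the 0-based indexing, and the toy matrix are assumptions of this sketch.

```python
from itertools import product

def sum_product_decode(H, z, p1, max_iters=20):
    """Estimate an error vector e with H e^T = z^T by sum-product decoding.

    H: m x n list of 0/1 rows; z: syndrome bits; p1[k] = P(e_k = 1).
    Returns (e_hat, True) on success, (None, False) on decoding failure.
    """
    m, n = len(H), len(H[0])
    V = [[k for k in range(n) if H[j][k]] for j in range(m)]
    C = [[j for j in range(m) if H[j][k]] for k in range(n)]
    prior = [(1.0 - p, p) for p in p1]
    # Step I: initial variable-to-check messages (p_k^0, p_k^1).
    q = {(j, k): prior[k] for j in range(m) for k in V[j]}
    for _ in range(max_iters):
        # Step II: check-to-variable messages r[(j, k)] = (r^0, r^1).
        r = {}
        for j in range(m):
            for k in V[j]:
                others = [i for i in V[j] if i != k]
                msg = [0.0, 0.0]
                for bits in product((0, 1), repeat=len(others)):
                    weight = 1.0
                    for i, b in zip(others, bits):
                        weight *= q[(j, i)][b]
                    # The 0/1 probability keeps only the value e with
                    # z_j = e + sum of the other bits (mod 2).
                    msg[(z[j] + sum(bits)) % 2] += weight
                r[(j, k)] = tuple(msg)
        # Step III: variable-to-check messages, normalized to sum to 1.
        for k in range(n):
            for j in C[k]:
                a0, a1 = prior[k]
                for i in C[k]:
                    if i != j:
                        a0 *= r[(i, k)][0]
                        a1 *= r[(i, k)][1]
                q[(j, k)] = (a0 / (a0 + a1), a1 / (a0 + a1))
        # Tentative decision from the assumed pseudo-posterior over all of C(k).
        e_hat = []
        for k in range(n):
            b0, b1 = prior[k]
            for j in C[k]:
                b0 *= r[(j, k)][0]
                b1 *= r[(j, k)][1]
            e_hat.append(1 if b1 > b0 else 0)
        if all(sum(H[j][k] * e_hat[k] for k in range(n)) % 2 == z[j]
               for j in range(m)):
            return e_hat, True
    return None, False

# Hypothetical toy matrix; z is the syndrome of a single error in position 4.
H = [[1, 1, 1, 0, 0, 0],
     [1, 0, 0, 1, 1, 0],
     [0, 1, 0, 1, 0, 1],
     [0, 0, 1, 0, 1, 1]]
print(sum_product_decode(H, [0, 1, 0, 1], [0.1] * 6))
```

A practical decoder would replace the enumeration in Step II by a closed-form parity computation and would typically work with log likelihood ratios; the enumeration is kept here only because it mirrors the displayed equation.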
15.3.15 Remark There is a message passing algorithm for Gallager codes on the binary erasure
channel (BEC). For the BEC, there are two inputs {0, 1} to the channel, three possible
outputs {0, E, 1} from channel transmission, and an associated probability ε. In a BEC an
input symbol x is received as an output symbol y with the following probabilities P (y | x):
P (0 | 1) = P (1 | 0) = 0, P (0 | 0) = P (1 | 1) = 1 − ε, and P (E | 0) = P (E | 1) = ε. So
the output that is received is either an erasure, denoted E, or is the original input symbol;
0 is never received as 1 and 1 is never received as 0. The message passing algorithm for
the BEC, which is described and thoroughly analyzed in [2455], employs the notation from
Remark 15.3.12. Suppose that c is sent and y = (y1 , y2 , . . . , yn ) is received. The messages
on each edge of the Tanner graph are elements of {0, E, 1}. For binary Gallager codes the
Message Passing Decoding Algorithm for a BEC is the following.
I. For 1 ≤ j ≤ m, send E from every check node j to every variable node k ∈ V (j).
II. For 1 ≤ k ≤ n, determine the message to send from the variable node k to
the check node j ∈ C(k) as follows. If yk and all of the incoming messages to
the variable node k from the check nodes i ∈ C(k) \ {j} equal E, the outgoing
message from the variable node k to the check node j ∈ C(k) is E. Otherwise, if
one of yk or the incoming messages to the variable node k from the check nodes
i ∈ C(k) \ {j} equals 0, respectively 1, send 0, respectively 1.
III. For 1 ≤ j ≤ m, determine the message to send from the check node j to the
variable node k ∈ V (j) as follows. If any of the incoming messages to the check
node j from the variable nodes i ∈ V (j) \ {k} is E, the outgoing message from
the check node j to the variable node k ∈ V (j) is E. Otherwise, if none of the
incoming messages to the check node j from the variable nodes i ∈ V (j) \ {k} is
E, the outgoing message from the check node j to the variable node k ∈ V (j) is
the modulo 2 sum of the incoming messages to the check node j from the variable
nodes i ∈ V (j) \ {k}. In addition, if yk = E, update yk to be the value of
this message.
Repeat Steps II and III until all yk ’s have been determined, in which case the updated y is
declared to equal the transmitted codeword c, or until the number of iterations has reached
a predetermined value. In the latter case, declare a decoding failure.
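The three steps and the stopping rule can be sketched as follows. This is a minimal illustration under stated assumptions: the erasure symbol E is represented by None, indices are 0-based, and the function name, iteration limit, and toy matrix are hypothetical choices for this example.

```python
E = None  # erasure symbol

def bec_decode(H, y, max_iters=50):
    """Message passing decoding on the BEC; entries of y are 0, 1, or E.

    Returns (vector, True) if every erasure is resolved, else (vector, False).
    """
    m, n = len(H), len(H[0])
    V = [[k for k in range(n) if H[j][k]] for j in range(m)]
    C = [[j for j in range(m) if H[j][k]] for k in range(n)]
    y = list(y)
    # Step I: every check-to-variable message starts as an erasure.
    c2v = {(j, k): E for j in range(m) for k in V[j]}
    for _ in range(max_iters):
        if E not in y:
            return y, True
        # Step II: a variable node forwards any known value it has seen.
        v2c = {}
        for k in range(n):
            for j in C[k]:
                seen = [y[k]] + [c2v[(i, k)] for i in C[k] if i != j]
                known = [b for b in seen if b is not E]
                v2c[(k, j)] = known[0] if known else E
        # Step III: a check node sends the mod 2 sum of the other inputs,
        # or E if any of them is erased; erased y_k are then updated.
        for j in range(m):
            for k in V[j]:
                incoming = [v2c[(i, j)] for i in V[j] if i != k]
                c2v[(j, k)] = E if E in incoming else sum(incoming) % 2
                if y[k] is E and c2v[(j, k)] is not E:
                    y[k] = c2v[(j, k)]
    return y, E not in y

# Hypothetical toy matrix and a codeword with two erased positions.
H = [[1, 1, 1, 0, 0, 0],
     [1, 0, 0, 1, 1, 0],
     [0, 1, 0, 1, 0, 1],
     [0, 0, 1, 0, 1, 1]]
print(bec_decode(H, [E, 1, 0, 1, 0, E]))
```

In this example each erased position lies in a check whose other two bits are known, so both erasures are filled in during the first round.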
15.3.16 Remark In Step II of the algorithm, there is no ambiguity in assigning the value to the
message to be sent. It is not possible for one of yk or the incoming messages to have the
value 0 and another of yk or the incoming messages to have the value 1. In Step III, the
only time the value 0 or 1 is passed as a message from the check node j to the variable node
k is when there is no erasure from any other variable node sent to check node j and hence
the value of the k-th variable is determined uniquely by the j-th parity check equation.
15.3.17 Remark When applying the message passing algorithm of Remark 15.3.15, the iterations
may stabilize so that there are erasures left but no further iteration will remove any of these
remaining erasures. This will occur because of the presence of stopping sets.
15.3.18 Definition Let V be the set of variable nodes in the Tanner graph of Galn,2 (c, H). A subset
S of V is a stopping set if all the check nodes connected to a variable node in S are
connected to at least two variable nodes in S.
15.3.19 Theorem [2455] Suppose that a codeword of Galn,2 (c, H) is sent over a BEC and that the
received vector has erasures on the set E of variable nodes. Assume that the message passing
decoder of Remark 15.3.15 is allowed to run on the received vector until either all erasures
are corrected or the process fails to make progress. If the algorithm fails to make progress
and erasures remain, then the set of variable nodes that still contain erasures is the unique
maximal stopping set inside E.
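The maximal stopping set inside a given erasure pattern can be found by repeatedly discarding any variable node that is the only member of the set in some check, since such a check could resolve it. The following is a minimal sketch of that peeling procedure; the function name and the toy matrix are assumptions made for illustration, and the empty set is treated as a (trivial) stopping set.

```python
def maximal_stopping_set(H, erased):
    """Return the largest stopping set contained in the index set `erased`."""
    S = set(erased)
    changed = True
    while changed:
        changed = False
        for row in H:
            hits = [k for k in S if row[k]]
            if len(hits) == 1:
                # This check meets S in exactly one node, so S is not yet
                # a stopping set; remove that node and rescan.
                S.discard(hits[0])
                changed = True
    return S

# Hypothetical toy matrix: column weight 2, row weight 3.
H = [[1, 1, 1, 0, 0, 0],
     [1, 0, 0, 1, 1, 0],
     [0, 1, 0, 1, 0, 1],
     [0, 0, 1, 0, 1, 1]]
print(maximal_stopping_set(H, {0, 5}))
print(maximal_stopping_set(H, {0, 1, 3, 5}))
```

For this matrix the erasure pattern {0, 5} contains no nonempty stopping set, while {0, 1, 3, 5} contains the stopping set {0, 1, 3}, which is exactly the set of positions a message passing decoder would be unable to recover.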
15.3.20 Remark So far, the focus of this section has been on decoding Gallager codes. The algo-
rithms are efficient because they take advantage of the sparsity of the parity check matrix
for such codes. One might ask if this sparseness of the parity check matrix can be used
to efficiently encode Gallager codes. The answer is yes, as described in [2455]. Begin with
the m × n parity check matrix H for Galn,2 (c, H), which we assume has rank m. By using
row and column permutations, which do not affect the column or row weights, H can be
transformed into an approximate upper triangular form
\[ \begin{pmatrix} T & A & B \\ E & C & D \end{pmatrix}. \]
LDPC code constructions, performance has to be obtained through simulation. The ref-
erences [1801, 1841, 3055, 3061] are representative of this approach. There are numerous
decoding algorithms, some variations of those given here, and their performance analysis
is also an important topic of research. When using iterative decoding, one needs to know
if the algorithm will converge to a codeword and if that codeword is the original codeword
transmitted. The presence of “pseudocodewords” plays a role in describing convergence of
an iterative decoder; see [151, 1724, 1778] and the references in these papers. The error
performance of a maximum likelihood decoder is determined by the distance distribution of
the codewords in the code; in the case of an iterative decoder, error performance is deter-
mined by the distribution of pseudocodewords [1724]. There are three common notions of
pseudocodewords: computation tree pseudocodewords, graph cover pseudocodewords, and
linear programming pseudocodewords; see [151] for connections between the three types.
For instance, computation tree pseudocodewords arise as follows [1724]. From the Tanner
graph of an LDPC code, the computation tree can be constructed starting from a fixed
variable root node. The actual structure of the tree is determined by the scheduling used
by the message passing algorithm and the number of iterations used. In general a variable
node from the Tanner graph will appear multiple times in the computation tree, as will
the check nodes. A codeword in the LDPC code will be a binary assignment of variable
nodes in the Tanner graph so that the binary sum of the neighbors of each check node is
0. The same assignment made in the computation tree will also yield, for the neighbors of
each check node, a binary sum equal to 0. However, it is possible to make an assignment of
the variable nodes in the computation tree so that the binary sum of the neighbors of each
check node is 0, but the assignment gives different values to some nodes of the computation
tree that actually represent the same variable node of the Tanner graph. Such an assignment
of values is a computation tree pseudocodeword.
15.3.22 Remark The special case of linear codes that can be represented by Tanner graphs with no
cycles has been investigated. Such a code can be decoded with maximum likelihood by using
the min-sum algorithm with complexity O(n^2). However, it can be shown that such
cycle-free Tanner graphs cannot support good codes, in terms of either the trade-off between
rate and minimum distance or performance. Aspects of cycle-free Tanner graphs are explored
further in [996, 1401, 2518].
15.3.23 Remark The analysis of LDPC codes, particularly for finite block lengths, poses challenges.
Interest is in performance under a class of belief propagation decoding algorithms, a subclass
of message passing algorithms, where the messages sent between variable and check nodes
represent the probability or belief that a given variable node has a particular value. This usually
involves either likelihoods or log likelihood ratios, conditioned on values received in the pre-
vious round. While such algorithms are suboptimal, they are more efficient than maximum
likelihood algorithms and generally yield good performance. In the first round of decoding,
the message sent from a variable node v to its check neighbors is the log likelihood ratio of
the received data, i.e., the log of the ratio of the probability the sent bit was a 0 to that
it was a 1, given the received value (which may be discrete or continuous). In subsequent
rounds, a message from a check node c to a neighbor variable node v is a log likelihood of
the message arriving at v, involving information sent from all neighbor variable nodes other
than v, in the previous round. After the first round, a message from a variable node v to
a neighbor check node c is a log likelihood of the value of the message sent from v, conditioned
on values received in the previous round from neighbor check nodes other than c.
The analysis requires evaluation of probability density functions of data received by a node
that involves convolutions on the order of the degree of the node, either check or variable.
It can be shown [2458] that the behavior of these densities is narrowly concentrated around
their expected values and thus it suffices to consider only the expected values of the densities.
The updating formulae for these expected densities, referred to as density evolution,
give a method of evaluating the performance of the algorithm. The equations assume the
variables involved in the convolutions are independent which is only true to the extent that
the graph, grown out from a given variable node, is a tree to a certain level. At some point
this ceases to be true. However, the analysis given with this assumption yields results which
are accurate for most codes. The technique of density evolution introduced in [2458] has
been a cornerstone of the analysis of LDPC codes.
15.3.24 Remark For many LDPC codes, as the signal to noise ratio (SNR) is increased, the
probability of error curve flattens out rather than continuing as a
typical “waterfall” curve. This is termed an error floor and is an undesirable feature for
applications requiring very small probabilities of error, in the range of 10^{−12} for many
standards. It is difficult to simulate curves down to such a low value; even with special
hardware, simulations can typically be done only to about 10^{−9} [2456]. Much effort has
been expended on finding analytical techniques to determine codes whose error floors are
below such levels. The key reference for such work is [2456] where the notion of trapping
sets is introduced. An (a, b) trapping set is a subgraph of the Tanner graph of the code
induced by a set of variable nodes of size a with the property that the subgraph has b
check nodes of odd degree. Although the sizes of the trapping sets are not the only relevant
parameters for predicting error performance of the code [2456], it can be shown they are
a major influence for error floors in performance curves. In particular such floors tend to
be caused by overlapping clusters of fairly small trapping sets. The determination of such
trapping sets tends to be a feasible computational task, either analytically or via simulation
[2456], for many codes of interest. The design of codes without such trapping sets is a major
goal of LDPC coding theorists. Trapping sets can be thought of as an analog of stopping
sets for erasure codes.
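The parameters (a, b) of the subgraph induced by a given set of variable nodes follow directly from H: a is the size of the set, and b counts the rows of H that meet the set in an odd number of positions. The sketch below illustrates only this definition and does not search for trapping sets; the function name and the toy matrix are assumptions.

```python
def trapping_set_parameters(H, variable_nodes):
    """Return (a, b) for the subgraph induced by `variable_nodes`.

    a is the number of variable nodes; b is the number of check nodes whose
    degree in the induced subgraph is odd.
    """
    nodes = set(variable_nodes)
    odd_checks = sum(1 for row in H
                     if sum(row[k] for k in nodes) % 2 == 1)
    return len(nodes), odd_checks

# Hypothetical toy matrix: column weight 2, row weight 3.
H = [[1, 1, 1, 0, 0, 0],
     [1, 0, 0, 1, 1, 0],
     [0, 1, 0, 1, 0, 1],
     [0, 0, 1, 0, 1, 1]]
print(trapping_set_parameters(H, {0, 1}))
print(trapping_set_parameters(H, {0, 1, 3}))
```

The second set is the support of a codeword of the toy code, so every check meets it an even number of times and b = 0.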
References Cited: [151, 573, 642, 994, 995, 996, 1046, 1162, 1163, 1401, 1724, 1778, 1801,
1841, 1987, 2455, 2456, 2457, 2458, 2518, 2779, 3055, 3061]
15.4.1 Introduction
15.4.1.1 Historical background
Turbo codes are a class of error control codes that was first introduced by Berrou, Glavieux,
and Thitimajshima in 1993 [251]. Turbo coding was a paradigm shift in the design of error
control codes that enabled them to achieve very good performance.
15.4.1.2 Terminology
15.4.1 Remark Basic coding notation and code properties are covered in Section 15.1 of this
handbook. We shall need to add some terminology in order to properly address turbo
codes.
15.4.2 Remark In many practical communication systems, we are often interested in encoding
binary (the alphabet is restricted to two symbols) messages of a given manageable length k,
i.e., messages are transmitted in packets of a predetermined size. This is because of a fixed
amount of memory or other hardware limitations. However, the actual amount of data to be
transmitted may be much more than k bits. In this case, long data streams are simply split
into multiple messages and the encoding process is repeated as necessary. Those messages
may be represented by a k-bit vector u. In this section, we work only over F2 (= GF (2)).
However, many of these ideas can be considered over general finite fields.
15.4.5 Definition The set of distinct messages of length k is U = F_2^k, where F_2^k is the Cartesian
product of F2 taken k times.
15.4.6 Remark Let t and s be two vectors. One is often interested in chaining them together to
form a single vector r.
15.4.7 Definition Let t = (t_0, t_1, t_2, . . . , t_{k_1−1}) and s = (s_0, s_1, s_2, . . . , s_{k_2−1}) be two
vectors. Define the chained vector r = t|s = (t_0, t_1, t_2, . . . , t_{k_1−1}, s_0, s_1, s_2, . . . , s_{k_2−1}) =
(r_0, r_1, r_2, . . . , r_{k_1+k_2−1}).
15.4.8 Remark Order matters with the chaining operator, i.e., t|s is not the same as s|t in general.
The chaining of more than two vectors follows in a similar way.
15.4.9 Example Let three vectors r, s, and t be chained together in this order. Their chaining is
denoted r|s|t.
15.4.10 Definition Let C be an (n, k) linear code. An encoder C^e for C is a one-to-one
function C^e : F_2^k → F_2^n, C^e : u ↦ v, whose image is C, where n ≥ k, u ∈ F_2^k, v ∈ F_2^n, and the set of
codewords C = {v ∈ F_2^n | v = C^e(u)} forms a linear subspace of F_2^n.
15.4.11 Remark A few methods for deriving codes from existing ones are explained in Section 15.1.
The methods therein modify one code in order to derive a new code. Let us denote by the
term compound derivation of a code any technique that derives a new code by combining
two or more codes. An important classical method of compound derivation of a code by
combining two codes is code concatenation [1091].
A classical (n_2, k_1) concatenated code C_3 over F_2^{n_2} is defined by an encoder function C_3^e
generated by the function composition C_3^e = C_2^e ∘ C_1^e; C_1 and C_2 are the constituent codes
of C_3. Further, C_1 is the outer code and C_2 the inner code.
15.4.13 Remark In the definition above, the symbols for both C1 and C2 are over the same field
F2 . The interested reader may refer to [1091] and learn that when the above codes are
defined with appropriate lengths and symbols over extension fields of different lengths, the
minimum distance of the resulting code C3 is at least the product of the minimum distances
of C1 and C2 .
15.4.14 Remark The above definition of a concatenated code is precisely what is known as a
serially-concatenated turbo code [225]. The novelty in turbo coding is a proper choice of
the constituent codes for the encoder and a decoding method for these codes. The following
type of compound derived code is used to form the classical parallel-concatenated turbo
code.
15.4.15 Definition Let $v_1$ and $v_2$ be the code vectors corresponding to a message vector u for codes $C_1$ and
$C_2$, respectively. An $(n_2 + n_1, k)$ parallel-concatenated code $C_3$ over $\mathbb{F}_2^{n_2+n_1}$ is formed by
vector chaining these code vectors, i.e., encoding a message vector u with $C_3$ produces
$v_3 = v_1|v_2$. The codes $C_1$ and $C_2$ are the constituent codes of $C_3$.
15.4.16 Remark Several variations of a turbo encoder have been studied, but we focus on the classi-
cal parallel-concatenated version. A parallel-concatenated turbo code, which is a compound
derived code, has a recursive convolutional code as one of its constituent codes. We shall
see that the second constituent code itself is also a compound code derived from a recursive
convolutional code and a permutation code. A permutation code is implemented in practice
with a device called an interleaver.
15.4.17 Remark Convolutional codes are a popular type of error control code because of the sim-
plicity of their encoding and the practicality of optimal decoding for many channels. They
have been used in a variety of applications from home wireless networks to deep-space
communication systems. Encoding is performed by polynomial operations over the ring of
polynomials with coefficients over F2 . The Viterbi algorithm [2879] is widely used to decode
convolutional codes in practice because it is known to be optimal (maximum likelihood
decoding) for any memoryless channel; see Section 15.1 for a discussion of these concepts.
15.4.18 Remark There are two major classes of convolutional codes: non-recursive convolutional
codes and recursive convolutional codes. Many of the traditional applications use non-
recursive convolutional codes, but turbo codes require recursive convolutional codes for
reasons briefly explained in Section 15.4.5.
15.4.19 Remark In convolutional coding, a message vector u is traditionally represented in poly-
nomial form on the variable D.
15.4.21 Example The polynomial representation of the information vector in Example 15.4.4 is
$U(D) = D + D^2 + D^4$.
15.4.22 Definition The set of all polynomials in the polynomial ring on the variable D with
coefficients in F2 and whose degree is smaller than k is denoted F2 [D]<k .
722 Handbook of Finite Fields
15.4.23 Definition Typically convolutional codes are encoded in “non-recursive” form. The encoder
is defined by a generator matrix G(D) whose entries are polynomials in F2 [D]<d , where
dmax = d − 1 is the maximum degree among the polynomials in the matrix.
15.4.24 Example An example generator matrix with $d_{max} = 2$ and elements in $\mathbb{F}_2[D]_{<3}$ is
\[ G(D) = [\, g_1(D) \;\; g_2(D) \,] = [\, 1 + D + D^2 \;\;\; 1 + D^2 \,]. \]
15.4.25 Definition The encoding of a message U (D) is performed by multiplying it by the generator
matrix. The resulting matrix V (D) = U (D)G(D) is the codeword in polynomial form.
15.4.26 Remark It is worthwhile noting that the codeword length n of convolutional codes is a
function of the message space F2 [D]<k . This means that a generator matrix defines a family
of codes. However, in practice k is chosen to be a fixed value.
15.4.27 Example We illustrate the encoding operation with an example by encoding the message
in Example 15.4.21 with the generator matrix in Example 15.4.24
V (D) = U (D)G(D) = U (D) [g1 (D) g2 (D)] = [v1 (D) v2 (D)] .
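The product in this example can be carried out mechanically. The following is a minimal sketch, assuming that a polynomial over F2 is stored as a Python integer whose bit i is the coefficient of D^i; the helper names `gf2_mul` and `poly_str` are illustrative choices and do not come from the text.

```python
def gf2_mul(a: int, b: int) -> int:
    """Multiply two polynomials over F2 stored as bit masks (bit i <-> D^i)."""
    product = 0
    while b:
        if b & 1:          # current coefficient of b is 1
            product ^= a   # addition over F2 is XOR
        a <<= 1            # multiply a by D
        b >>= 1
    return product


def poly_str(p: int) -> str:
    """Render a bit-mask polynomial in the variable D."""
    terms = ["1" if i == 0 else "D" if i == 1 else f"D^{i}"
             for i in range(p.bit_length()) if (p >> i) & 1]
    return " + ".join(terms) if terms else "0"


U = 0b10110    # U(D)  = D + D^2 + D^4   (Example 15.4.21)
g1 = 0b111     # g1(D) = 1 + D + D^2     (Example 15.4.24)
g2 = 0b101     # g2(D) = 1 + D^2

v1 = gf2_mul(U, g1)
v2 = gf2_mul(U, g2)
print("v1(D) =", poly_str(v1))
print("v2(D) =", poly_str(v2))
```

Under these assumptions the sketch yields v1(D) = D + D^5 + D^6 and v2(D) = D + D^2 + D^3 + D^6.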
15.4.28 Remark The encoding operation may be generalized to multi-dimensional message matrices
and generator polynomials. However, many practical systems have a U (D) with a single
message polynomial (M = N = 1) and G(D) is a simple one-by-two matrix (P = 2). This
implies V (D) is typically a one-by-two matrix.
15.4.29 Definition The Hamming weight w(P (D)) of a polynomial P (D) ∈ F2 [D]<k is the number
of non-zero monomials in P (D).
15.4.30 Remark The Hamming weight of a vector as defined in Section 15.1 of the corresponding
vector p gives the same value, i.e., w(p) = w(P (D)).
15.4.31 Definition The Hamming weight w(M (D)) of a matrix M (D) whose entries are polyno-
mials in F2 [D]<k is the sum of the Hamming weights of each of the polynomials.
15.4.32 Remark We examine the non-recursive convolutional code with a fixed maximum degree
dmax = 2 in Example 15.4.24. It is clear that its minimum distance is no larger than the
weight of the generator matrix w(G(D)) = 5 regardless of the length n of the code since, by
letting the message polynomial be U (D) = 1, we obtain a code polynomial V (D) = G(D).
Therefore, a convolutional code with a fixed generator matrix is asymptotically bad.
15.4.33 Remark A message of the form $U(D) = D^x$, $0 \le x < k$, generates a code polynomial whose
weight is w(G(D)).
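This observation is easy to check numerically for the generator matrix of Example 15.4.24. The sketch below assumes the same bit-mask representation of polynomials over F2 (bit i is the coefficient of D^i); the message length `k = 16` is an arbitrary illustrative value.

```python
def weight(p: int) -> int:
    """Hamming weight of a polynomial over F2 stored as a bit mask."""
    return bin(p).count("1")


g1, g2 = 0b111, 0b101          # 1 + D + D^2 and 1 + D^2
w_G = weight(g1) + weight(g2)  # weight of the generator matrix

k = 16  # hypothetical message length
# Multiplying by D^x is a left shift by x positions, which preserves the weight.
codeword_weights = [weight(g1 << x) + weight(g2 << x) for x in range(k)]
print(w_G, set(codeword_weights))
```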
15.4.34 Remark In order to improve the minimum distance of a convolutional code, one may
increase the maximum degree $d_{max}$ of the generator matrix; however, the minimum distance
is expected to increase only linearly with $d_{max}$, whereas the complexity of Viterbi decoding
grows exponentially in $d_{max}$. A typical value for $d_{max}$ is six to keep decoding complexity
manageable.
15.4.35 Theorem Non-recursive convolutional codes C with a fixed generator matrix G(D) are
asymptotically bad.
15.4.36 Theorem Any low-weight message polynomial encoded by a non-recursive convolutional
code C produces a low-weight codeword polynomial.
15.4.37 Remark From the definition of encoding, it follows that v1 (D) and v2 (D) have g1 (D) and
g2 (D) as their factors, respectively.
15.4.38 Theorem There is a one-to-one and onto mapping between each U (D) and v1 (D) as well
as U (D) and v2 (D).
15.4.39 Remark We examine v1 (D) = U (D)g1 (D) more carefully. Let the degree of g1 (D) be
dmax = d − 1. We may write v1 (D) as
15.4.44 Remark We illustrate how to encode a message when using recursive convolutional codes.
While for a non-recursive convolutional code we simply multiplied the message polynomial
by the generator matrix, for a recursive convolutional code there is an additional step.
15.4.45 Remark Let $U'(D) \in \mathbb{F}_2[D]_{<k}$ be a message to be encoded. We first form a polynomial
$Z(D) = D^{d-1} U'(D) + T(D)$, where $T(D)$ is the negative of the remainder of the division
of $D^{d-1} U'(D)$ by $g_1(D)$. Naturally, $Z(D)$ is divisible by $g_1(D)$ by construction.
Next we simply multiply $Z(D)$ by the generator matrix $G_R(D)$. The resulting matrix
$V(D) = Z(D) G_R(D)$ is the codeword in polynomial form.
15.4.46 Example We illustrate the encoding operation with an example by encoding the message
$U'(D) = D^3 + D^4$. The remainder of the division of $D^2 U'(D) = D^2(D^3 + D^4) = D^5 + D^6$ by
$g_1(D) = 1 + D + D^2$ is $R(D) = D$. The polynomial $Z(D) = D + D^5 + D^6$ is then multiplied
by the generator matrix $G_R(D)$:
\[ V(D) = Z(D) G_R(D) = Z(D) \left[\, 1 \;\; \frac{g_2(D)}{g_1(D)} \,\right] = [\, v_1(D) \;\; v_2(D) \,]. \]
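The two-step recursive encoding (long division, then multiplication by the generator matrix) can be sketched as follows. This is only an illustration under stated assumptions: polynomials over F2 are stored as Python integers (bit i is the coefficient of D^i), g2(D) = 1 + D^2 is taken from Example 15.4.24, and the function names are hypothetical.

```python
def gf2_mul(a: int, b: int) -> int:
    """Product of two polynomials over F2 stored as bit masks."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        b >>= 1
    return product


def gf2_divmod(a: int, b: int) -> tuple[int, int]:
    """Long division over F2: return (quotient, remainder) of a by b."""
    quotient, deg_b = 0, b.bit_length() - 1
    while a and a.bit_length() - 1 >= deg_b:
        shift = a.bit_length() - 1 - deg_b
        quotient ^= 1 << shift
        a ^= b << shift      # subtract (= XOR) the shifted divisor
    return quotient, a


def rsc_encode(u_prime: int, g1: int, g2: int) -> tuple[int, int, int]:
    """Return (T, v1, v2) for the recursive encoder with G_R = [1, g2/g1]."""
    shifted = u_prime << (g1.bit_length() - 1)   # D^(d-1) U'(D), d-1 = deg g1
    _, t = gf2_divmod(shifted, g1)               # over F2, -remainder = remainder
    z = shifted ^ t                              # Z(D), divisible by g1(D)
    q, r = gf2_divmod(z, g1)
    assert r == 0
    return t, z, gf2_mul(q, g2)                  # v1 = Z, v2 = Z * g2 / g1


t, v1, v2 = rsc_encode(0b11000, g1=0b111, g2=0b101)   # U'(D) = D^3 + D^4
print(bin(t), bin(v1), bin(v2))
```

With these inputs the sketch reproduces R(D) = D and Z(D) = D + D^5 + D^6, and gives v2(D) = D + D^2 + D^3 + D^6. The quotient Z(D)/g1(D) is D + D^2 + D^4, so the same codeword arises from the non-recursive encoder applied to that polynomial.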
15.4.47 Remark We note that the codeword in vector format is $v = v_1|v_2 = t|u'|v_2$.
15.4.48 Remark While non-recursive convolutional code encoding involves multiplication of poly-
nomials, recursive convolutional encoding involves long-division.
15.4.49 Remark Recursive convolutional codes are asymptotically bad because the set of codeword
polynomials is identical to an equivalent non-recursive convolutional code.
15.4.50 Theorem Recursive convolutional codes C with a fixed generator matrix GR (D) are asymp-
totically bad.
15.4.51 Remark Despite the fact that the weight distribution of the codewords in a non-recursive
convolutional code and an equivalent recursive convolutional code are the same, there are
fundamental differences in the relationship between the Hamming weights of the message
vectors and the Hamming weights of the corresponding codewords.
15.4.52 Remark A low-weight message polynomial encoded by a recursive convolutional code C
does not necessarily produce a low-weight codeword polynomial.
15.4.53 Remark A message polynomial of the form $U'(D) = D^x$, $0 \le x < k$, produces codeword
polynomials whose weights are lower bounded by $\alpha x$ for some positive constant $\alpha$. Compare
this with Remark 15.4.33. This fundamental property of recursive convolutional codes, a
consequence of the long-division encoding process, makes them suitable for turbo coding.
15.4.54 Remark Permutations have long been used along with error control codes in the area of
digital communications. Their main use is to reorder the elements of a vector. A device that
permutes the elements of a vector is an interleaver.
15.4.56 Definition Let a vector $u = (u_0, u_1, u_2, \ldots, u_{k-1}) \in \mathbb{F}_2^k$. A permuted version $u_\Pi$ of u under
the permutation function $\Pi$ is $u_\Pi = (u_{\Pi(0)}, u_{\Pi(1)}, u_{\Pi(2)}, \ldots, u_{\Pi(k-1)})$.
15.4.57 Definition The design requirements for an interleaver in typical applications are minimal.
A simple block interleaver may in general suffice. A block interleaver writes the elements
of the vector on a matrix row-wise and then reads them back column-wise to permute
the elements of the vector.
15.4.58 Example Let a vector $a = (a_0, a_1, a_2, a_3, a_4, a_5)$. A 2 by 3 block interleaver first writes the
elements of a on a 2 by 3 matrix row-wise:
\[ \begin{bmatrix} a_0 & a_1 & a_2 \\ a_3 & a_4 & a_5 \end{bmatrix}. \]
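A block interleaver as described in Definition 15.4.57 can be sketched in a few lines. The function name `block_interleave` is an illustrative choice; the sketch assumes the vector length equals the number of matrix entries.

```python
def block_interleave(vec, rows, cols):
    """Write vec row-wise into a rows x cols matrix, then read it column-wise."""
    if len(vec) != rows * cols:
        raise ValueError("vector length must equal rows * cols")
    # Entry (r, c) of the row-wise matrix is vec[r * cols + c].
    return [vec[r * cols + c] for c in range(cols) for r in range(rows)]


a = ["a0", "a1", "a2", "a3", "a4", "a5"]
interleaved = block_interleave(a, rows=2, cols=3)
# The same permutation written as the index map Pi of Definition 15.4.56.
Pi = block_interleave(list(range(6)), rows=2, cols=3)
print(interleaved, Pi)
```

For the 2 by 3 case, reading column-wise gives (a0, a3, a1, a4, a2, a5), i.e., Π = (0, 3, 1, 4, 2, 5).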
15.4.60 Definition A linear polynomial function (LPF) over $\mathbb{Z}_n$ is defined by the function
$L : \mathbb{Z}_n \to \mathbb{Z}_n$, $L : x \mapsto f_1 x + f_0$, where $f_1, f_0 \in \mathbb{Z}_n$.
15.4.61 Remark Let $f_1$ be co-prime to n over $\mathbb{Z}$. The linear polynomial function
$L : \mathbb{Z}_n \to \mathbb{Z}_n$, $L : x \mapsto f_1 x + f_0$ is a linear permutation polynomial (LPP).
15.4.62 Definition A quadratic polynomial function (QPF) over $\mathbb{Z}_n$ is defined by the function
$Q : \mathbb{Z}_n \to \mathbb{Z}_n$, $Q : x \mapsto f_2 x^2 + f_1 x + f_0$, where $f_2, f_1, f_0 \in \mathbb{Z}_n$.
15.4.63 Proposition Let n be a power of 2. Let $f_1$ be odd and $f_2$ even. The QPF
$Q : \mathbb{Z}_n \to \mathbb{Z}_n$, $Q : x \mapsto f_2 x^2 + f_1 x + f_0$ is a quadratic permutation polynomial (QPP).
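The permutation conditions for LPPs and QPPs can be checked by brute force for small n. The sketch below is illustrative only: the helper names and the particular coefficients (for example Q(x) = 4x^2 + 3x over Z_16) are assumptions chosen for the demonstration, not values from the text.

```python
from math import gcd


def poly_values(coeffs, n):
    """Values P(0), ..., P(n-1) over Z_n for P(x) = f0 + f1*x + f2*x^2 + ..."""
    return [sum(f * x ** j for j, f in enumerate(coeffs)) % n for x in range(n)]


def is_permutation(values, n):
    """True if the values are exactly 0, ..., n-1 in some order."""
    return sorted(values) == list(range(n))


n = 16
# Proposition 15.4.63: f1 odd and f2 even gives a permutation when n is a power of 2.
qpp_ok = all(is_permutation(poly_values((f0, f1, f2), n), n)
             for f0 in (0, 5) for f1 in range(1, n, 2) for f2 in range(0, n, 2))

qpp_interleaver = poly_values((0, 3, 4), n)          # Q(x) = 4x^2 + 3x
even_f1 = is_permutation(poly_values((0, 2, 4), n), n)
odd_f2 = is_permutation(poly_values((0, 1, 1), n), n)

# Remark 15.4.61: a linear polynomial permutes Z_n when gcd(f1, n) = 1.
lpp_ok = is_permutation(poly_values((1, 5), 6), 6) and gcd(5, 6) == 1
lpp_bad = is_permutation(poly_values((1, 2), 6), 6)
print(qpp_ok, qpp_interleaver, even_f1, odd_f2, lpp_ok, lpp_bad)
```

The two QPF counterexamples (f1 even, or f2 odd) fail to be permutations over Z_16, which shows that the hypotheses of the proposition cannot simply be dropped.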
15.4.64 Remark Necessary and sufficient conditions for QPPs for arbitrary n, and other
properties, have been investigated in [2742, 2768].
15.4.67 Example Let u be a message vector. First we encode u with the recursive convolutional
code $C_1$ and obtain the code vector $v^{(1)} = t^{(1)}|u|v_2^{(1)}$. Next we form the code vector for $C_2$.
We first encode u with the permutation code $C_\Pi$ to obtain $u_\Pi = \Pi(u)$. Then $u_\Pi$ is encoded
with a recursive convolutional code $C_1$ to obtain the code vector $v^{(s)} = t^{(2)}|u_\Pi|v_2^{(2)}$ for $C_s$.
The code vector for $C_2$ becomes $v^{(2)} = t^{(2)}|v_2^{(2)}$ by puncturing the message symbols $u_\Pi$ of
$C_s$. Finally, the code vector for the turbo code becomes $v^{(T)} = t^{(1)}|u|v_2^{(1)}|t^{(2)}|v_2^{(2)}$.
15.4.68 Remark The decoding of turbo codes is performed using iterative algorithms [1398]; for
iterative algorithms see Section 15.1.
15.4.69 Remark We focus on the design of parallel-concatenated turbo codes. Naturally, the design
narrows down to an understanding of the overall turbo code principle and to the selection
of the constituent recursive convolutional codes and an interleaver.
15.4.70 Remark Recall the definition of a linear code C of length n as a linear subspace of $\mathbb{F}_2^n$ from
Section 15.1. Let C be a code over $\mathbb{F}_2^n$ (not necessarily linear). The Holy Grail design for C
has been to maximize its minimum distance as defined in Section 15.1. For a linear code,
this translates to maximizing its minimum Hamming weight. This task becomes even more
challenging when we require a design for a family of codes that are asymptotically good,
i.e., a family of codes with a fixed code-rate such that the minimum distance to code length
ratio does not vanish as the code length goes to infinity.
15.4.71 Remark The error correction capability of a code is mostly dictated by its minimum dis-
tance, with its weight distribution a secondary factor. In turbo coding, the relationship
between the weight of message vectors and the weight of codeword vectors becomes crucial.
When an error control code is used, we ultimately need not only a set of codewords C but
an encoder to map messages from U to C and a decoder to map codewords (possibly with
errors) back to messages.
15.4.72 Remark We observe again the codeword of a parallel-concatenated turbo code:
\[ v^{(T)} = t^{(1)}|u|v_2^{(1)}|t^{(2)}|v_2^{(2)}. \tag{15.4.1} \]
The design principles for a turbo code are related to the following question: How do we
minimize the chances of producing simultaneously low weight codewords for $C_1$ ($t^{(1)}|u|v_2^{(1)}$)
and $C_2$ ($t^{(2)}|v_2^{(2)}$)?
15.4.73 Remark The first design principle is that, since the codeword includes the message vector
u, it is more important to focus on the case when w(u) is small.
15.4.74 Remark The second design principle is, given that w(u) is small, identify a good permuta-
tion function that minimizes the chances of producing simultaneously low weight codewords
for C1 and C2 .
15.4.75 Remark The main design parameters of constituent recursive convolutional codes are the
maximum degree dmax of the generator polynomials and the choice of the generator matri-
ces. Surprisingly, it has been found that dmax = 3 or 4 is sufficient to achieve very good
performance with turbo coding. Typically, the polynomial g1 (D) is chosen to be a primitive
polynomial.
15.4.76 Remark The reason for choosing $g_1(D)$ to be primitive is that it maximizes the smallest
degree $\tau$ for which a polynomial of the form $U(D) = 1 + D^{\tau}$ is divisible by $g_1(D)$; this is known
to improve the codeword weight of recursive convolutional codes with respect to message
vectors with $w(U(D)) = 2$. One of the latest communication systems using turbo codes is
the fourth generation wireless cellular 4G LTE standard.
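The smallest τ for which g1(D) divides 1 + D^τ is the multiplicative order of D modulo g1(D), and it can be found by repeated multiplication by D. The sketch below assumes the bit-mask representation of polynomials over F2 (bit i is the coefficient of D^i) and a polynomial with nonzero constant term; `smallest_tau` is an illustrative name.

```python
def smallest_tau(g: int):
    """Smallest tau > 0 such that g(D) divides 1 + D^tau over F2, or None."""
    deg = g.bit_length() - 1
    r = 1                             # r = D^0 mod g
    for tau in range(1, 2 ** deg):
        r <<= 1                       # multiply by D
        if (r >> deg) & 1:
            r ^= g                    # reduce modulo g
        if r == 1:                    # D^tau = 1 (mod g), so g | 1 + D^tau
            return tau
    return None


print(smallest_tau(0b1101))    # 1 + D^2 + D^3, primitive of degree 3
print(smallest_tau(0b111))     # 1 + D + D^2, primitive of degree 2
print(smallest_tau(0b10011))   # 1 + D + D^4, primitive of degree 4
print(smallest_tau(0b11111))   # 1 + D + D^2 + D^3 + D^4, irreducible but not primitive
```

A primitive polynomial of degree m attains the largest possible value 2^m − 1 (here 7, 3, and 15), whereas the non-primitive degree-4 polynomial only reaches 5.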
15.4.77 Example The 4G LTE standard uses the generator matrix with $d_{max} = 3$:
\[ G_R(D) = \left[\, 1 \;\; \frac{1 + D + D^3}{1 + D^2 + D^3} \,\right]. \]
15.4.78 Remark Interleavers or permutation functions have been extensively investigated in turbo
coding applications [252, 725, 755, 767, 911, 1277, 2291, 2513, 2742, 2770, 2771]. Shannon’s
random coding bound theorem [2608] naturally led researchers to experiment with differ-
ent pseudo-random or modified pseudo-random constructions [911]. There were two major
drawbacks to early pseudo-random constructions: first, practical implementation calls for
interleavers that require little storage and computation; and second, pseudo-random con-
structions often generate turbo codes that suffer from a performance deficiency known as
an “error-floor”; see Remark 15.3.24.
15.4.79 Remark Some researchers attempted to use simple interleavers with turbo coding such as
block interleavers. However, a block interleaver has been shown to have too much regu-
larity; it has also been shown that interleavers based on LPPs behave similarly to block
interleavers [2770]. A good compromise between interleaver complexity and turbo code per-
formance can be achieved with QPP interleavers; the second degree polynomial brings a
“non-linear” feature that is very desirable for turbo coding [2768, 2771]. The simplicity of
QPP interleavers due to their algebraic structure and their very good performance resulted
in them becoming part of the 4G LTE standard.
See Also
References Cited: [225, 251, 252, 725, 755, 767, 911, 1091, 1277, 1398, 2291, 2495, 2496,
2513, 2519, 2608, 2742, 2768, 2770, 2771, 2879]
15.5.1 Remark This section is concerned with coding for the binary erasure channel (BEC) (see
Section 15.1), except for the last subsection which includes comments on the performance of
raptor codes on the binary symmetric channel (BSC). Recall that for the BEC a nonerased
symbol is correct and the goal is to design codes which can be encoded and decoded efficiently.
The capacity of the BEC with erasure probability $\epsilon$ is $1 - \epsilon$. If C is a linear $(n, k, d)_q$
code with parity check matrix H, it is capable of correcting up to d − 1 erasures. If a codeword
c is transmitted and word r received (whose components are in the set {0, 1, E}), the
problem of correcting erasures is the problem of solving a set of linear equations of the form
$H' x^T = y^T$ where $H'$ is the matrix of columns of the original parity check matrix H of
the code corresponding to erased positions of r, x is a vector of variables corresponding to
the erased positions of r, whose solutions are required, and y the sum of columns of the
parity check matrix corresponding to the positions of r containing 1s. This follows from the
fact that a codeword c satisfies the equation $H c^T = 0^T$, the vector of all zeros. The complexity
of decoding is that of matrix reduction, assumed to be $O(d^3)$ when the maximum
number of erasures d − 1 has occurred (unless a more sophisticated algorithm is used). It
is quite remarkable that the codes in this section achieve essentially linear complexity for
both encoding and decoding while achieving capacity. This is accomplished by constructing
a code by means of a bipartite graph and interpreting the decoding operation as a graph
algorithm. The forerunners of raptor codes are tornado and LT codes, considered in the
following sections. It should be noted that packet loss on the Internet, due to traffic conditions
or node buffer overflow, gives a natural model of an erasure channel, and the results
of this section yield an excellent solution for Internet multicast data transmission with no
feedback channel to request missing packets. If a block code is used for such a purpose,
one would have to know the erasure probability for effective code design. This
is usually difficult to estimate and is often time varying. The LT and raptor codes described
here are rateless and such an estimate is not required. The tornado codes are block codes,
unlike LT and raptor codes, which are rateless (fountain) codes having no concept of block
length or dimension. The tornado codes are considered for the important ideas they
introduced that led, in a sense, to the fountain and LT codes. All codes considered are binary.
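The reduction of erasure correction to a linear system can be illustrated on a small code. The sketch below is an assumption-laden example rather than anything prescribed by the text: it uses the binary (7, 4, 3) Hamming code with the columns of H equal to the binary expansions of 1, ..., 7, represents an erasure E by `None`, and solves H'x^T = y^T by plain Gaussian elimination over F2.

```python
def solve_gf2(A, b):
    """Solve A x = b over F2; return the unique solution or None if not unique."""
    rows = [row[:] + [rhs] for row, rhs in zip(A, b)]
    ncols = len(A[0])
    pivot_row = 0
    for col in range(ncols):
        pr = next((r for r in range(pivot_row, len(rows)) if rows[r][col]), None)
        if pr is None:
            return None                    # no pivot: erasures not uniquely solvable
        rows[pivot_row], rows[pr] = rows[pr], rows[pivot_row]
        for r in range(len(rows)):
            if r != pivot_row and rows[r][col]:
                rows[r] = [x ^ y for x, y in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    return [rows[i][-1] for i in range(ncols)]


def correct_erasures(H, r):
    """Fill the erased positions (None) of r using the parity check matrix H."""
    erased = [j for j, s in enumerate(r) if s is None]
    # y = sum of the columns of H at the positions where r contains a 1
    y = [sum(row[j] for j, s in enumerate(r) if s == 1) % 2 for row in H]
    H_erased = [[row[j] for j in erased] for row in H]   # the matrix H'
    x = solve_gf2(H_erased, y)
    if x is None:
        return None
    c = list(r)
    for j, bit in zip(erased, x):
        c[j] = bit
    return c


H = [[0, 0, 0, 1, 1, 1, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [1, 0, 1, 0, 1, 0, 1]]
two_erasures = correct_erasures(H, [None, 1, 1, 0, None, 0, 0])
three_erasures = correct_erasures(H, [None, None, None, 0, 0, 0, 0])
print(two_erasures, three_erasures)
```

Any d − 1 = 2 erasures are corrected; the chosen pattern of three erasures hits linearly dependent columns of H, so the system has no unique solution and the sketch returns `None`.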
15.5.2 Remark The most complete description of tornado codes is given in [1975] (although not
referred to as tornado codes there). Earlier work includes the papers [1972, 1974, 1973, 2619].
The paper [1975] has been influential in the last decade of progress on the problem of coding
with irregular bipartite graphs.
15.5.3 Remark Consider a bipartite graph B with k information nodes on the left and βk check
nodes on the right. This is a version of the Tanner graph of a code; see Section 15.3. The
information nodes are associated with information symbols. Since only XOR operations are
used in the encoding, the information symbols can be taken as either bits or strings of
bits of the same length (packets). The manner of choosing edges in the graph is critical to
code performance and is discussed later. Given the graph, the complexity of encoding and
decoding is proportional to the number of edges in the graph. For the bipartite graph B,
denote the associated code by C(B). The decoding is considered next. A codeword consists
of the set of information and check symbols.
15.5.4 Remark Decoding process: For the code C(B) and its associated bipartite graph B, consider
a received vector y, the result of transmitting a codeword through the BEC. The positions
of y contain symbols from the alphabet {0, 1, E} or strings of such symbols. Associate the
received symbols with the appropriate information and check graph nodes. The decoding
proceeds as follows: given the correct value of a check symbol and all but one of the informa-
tion symbol neighbors, set the missing information symbol to the XOR value of the check
and the known information symbols. The process continues until all information nodes are
retrieved or decoding fails. The algorithm is a version of the belief propagation (BP) or
message passing algorithms used for noisy channels.
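The decoding rule can be sketched directly. In the following illustration the graph is given as a list of neighbor lists (one per check node), symbols are small integers standing in for bits or packets, and an erased symbol is represented by `None`; these conventions and the function names are assumptions made for the example. The simple repeated sweep is not linear-time; an efficient implementation would track check degrees incrementally.

```python
from functools import reduce
from operator import xor


def encode_checks(info, check_nbrs):
    """Each check symbol is the XOR of its information neighbors."""
    return [reduce(xor, (info[i] for i in nbrs), 0) for nbrs in check_nbrs]


def peel_decode(info, checks, check_nbrs):
    """Recover erased information symbols (None) from known checks."""
    info = list(info)
    progress = True
    while progress and None in info:
        progress = False
        for c, nbrs in enumerate(check_nbrs):
            if checks[c] is None:
                continue                      # erased check symbols are unusable
            missing = [i for i in nbrs if info[i] is None]
            if len(missing) == 1:             # all but one neighbor is known
                value = checks[c]
                for i in nbrs:
                    if info[i] is not None:
                        value ^= info[i]
                info[missing[0]] = value
                progress = True
    return info                               # may still contain None on failure


check_nbrs = [[0, 1], [1, 2], [2, 3], [0, 3]]   # hypothetical small graph
message = [1, 0, 1, 1]
checks = encode_checks(message, check_nbrs)

recovered = peel_decode([1, None, None, 1], checks, check_nbrs)
stalled = peel_decode([None, None, 1, 1], [None, None, checks[2], None], check_nbrs)
print(recovered, stalled)
```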
15.5.5 Remark The success of the decoding procedure of C(B) depends on the existence, at every
stage through to the completion of decoding, of check symbols with the required property.
Before considering this condition, the construction of the codes is given. The term tornado
derives from the observation [470] that one can initiate decoding on such graphs as coded
symbols arrive, and the decoder stalls until, typically, the arrival of a single further symbol
allows it to proceed quickly to completion.
15.5.6 Remark The bipartite graph B is constructed randomly as follows. Refer to an edge as
one of degree i on the left (right) if the left (right) node it is connected to has degree
i. The following approach is taken. The graph is to have e edges. Two distributions (to
be discussed later) are defined: a left distribution (on edges, not nodes) $(\lambda_1, \ldots, \lambda_n)$ and
a right distribution $(\rho_1, \ldots, \rho_n)$, where $\lambda_i$ (resp. $\rho_i$) is the fraction of edges in the graph
of left degree i (resp. right degree i), for some appropriate integer n. The number of left
nodes of degree i is $e\lambda_i/i$ and $k = e \sum_i \lambda_i/i$, and similarly for the right nodes. The average
left degree of the information nodes is $1/(\sum_i \lambda_i/i)$, denoted by $a_\ell$, and similarly for right
degrees $a_r = 1/(\sum_i \rho_i/i)$. If the graph has e edges, k left information nodes and $\beta k$ right
check nodes, then $a_\ell = a_r \beta$. As discussed below, it is possible to choose the left and right
distributions to ensure complete decoding with high probability.
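The relations between edge fractions, node counts, and average degrees can be checked on a toy example. The distributions below (half of the edges of left degree 2, half of left degree 3, all edges of right degree 4, with e = 12) are hypothetical values chosen so that all counts are integers.

```python
from fractions import Fraction


def node_counts(edge_fractions, e):
    """Number of nodes of each degree i, namely e * fraction_i / i."""
    return {i: e * frac / i for i, frac in edge_fractions.items()}


def average_degree(edge_fractions):
    """Average node degree 1 / sum_i(fraction_i / i)."""
    return 1 / sum(frac / i for i, frac in edge_fractions.items())


lam = {2: Fraction(1, 2), 3: Fraction(1, 2)}   # left (edge) distribution
rho = {4: Fraction(1)}                         # right (edge) distribution
e = 12

left = node_counts(lam, e)        # 3 nodes of degree 2, 2 nodes of degree 3
right = node_counts(rho, e)       # 3 check nodes of degree 4
k = sum(left.values())
beta = sum(right.values()) / k
a_left, a_right = average_degree(lam), average_degree(rho)
print(k, beta, a_left, a_right)
```

Here k = 5, β = 3/5, a_ℓ = 12/5 and a_r = 4, in agreement with a_ℓ = a_r β.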
15.5.7 Remark Code construction: With the terminology of the previous remark, the construction
of the bipartite graph, B, is described, given the left and right distributions. Consider a
sequence of four columns of nodes. The first column has k information nodes, the second
and third e nodes each, and the fourth βk check nodes. Edges appear only between adjacent
columns. A fraction λi of the e edges have left degree i. For each i, connect i of the nodes of
the second column to a node of the first column, eλi /i times (truncated to form integers), to
give this number of left nodes of degree i. Similarly, connect eρi /i of the nodes in the third
column to a node in the fourth column to give this number of right vertices of this degree.
Nodes in the second and third columns are all of degree 1. To complete the process connect
the e nodes of the second and third columns by a random matching (permutation). The
second and third columns of nodes are removed with edges now connecting the information
and check nodes. The process may yield a small number of multiple edges between an
information and check node. These are replaced with a single edge with negligible effect.
The graph B is the random bipartite graph.
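The four-column construction amounts to listing one "socket" per edge endpoint on each side and joining the two lists by a random permutation. The sketch below assumes the node degrees have already been derived from the distributions; the degree lists, the seed, and the function name are illustrative.

```python
import random


def random_bipartite_graph(left_degrees, right_degrees, rng):
    """Return a set of (information node, check node) edges."""
    # Columns two and three: one socket per edge endpoint, labeled by its node.
    left_sockets = [v for v, d in enumerate(left_degrees) for _ in range(d)]
    right_sockets = [c for c, d in enumerate(right_degrees) for _ in range(d)]
    if len(left_sockets) != len(right_sockets):
        raise ValueError("both sides must account for the same number of edges")
    rng.shuffle(right_sockets)                     # random matching (permutation)
    # Using a set replaces any multiple edge by a single edge.
    return set(zip(left_sockets, right_sockets))


rng = random.Random(2013)                          # fixed seed for repeatability
left_degrees = [2, 2, 2, 3, 3]                     # k = 5 information nodes, e = 12
right_degrees = [4, 4, 4]                          # 3 check nodes
edges = random_bipartite_graph(left_degrees, right_degrees, rng)
print(sorted(edges))
```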
15.5.8 Remark The two distributions are chosen from the analysis of the probability of successful
decoding of the algorithm of Remark 15.5.4. The decoding algorithm is a Markov process
that leads to a certain differential equation. It is convenient for the analysis to define the
two distribution polynomials
\[ \lambda(x) = \sum_i \lambda_i x^{i-1}, \qquad \rho(x) = \sum_i \rho_i x^{i-1}. \]
The use of $x^{i-1}$ rather than $x^i$ is for convenience in the analysis. The analysis leads to the
following important results.
15.5.9 Theorem [1975, Proposition 2] Let B be a bipartite graph with k information nodes that
is chosen at random with edge degree distributions λ(x) and ρ(x), and suppose that δ is
fixed such that
ρ(1 − δλ(x)) > 1 − x for all 0 < x ≤ 1. (15.5.1)
For all $\eta > 0$ there is some $k_0$ such that for all $k > k_0$, if the message symbols of C(B) are
erased independently with probability $\delta$, then with probability at least $1 - k^{2/3} \exp(-k^{3/2}/2)$
the decoding algorithm terminates with at most $\eta k$ information symbols erased.
15.5.10 Remark While not quite enough to show the decoding terminates successfully, with a slight
modification of the conditions [1975] it can be shown:
15.5.11 Lemma [1975, Lemma 1] Let B be a bipartite graph with k left information nodes, chosen
at random according to the distributions λ(x) and ρ(x) satisfying Condition (15.5.1) such
that $\lambda_1 = \lambda_2 = 0$. Then there is some $\eta > 0$ such that, with probability $1 - O(k^{-3/2})$,
the decoding process restricted to the subgraph induced by any η-fraction of the left nodes
terminates successfully.
15.5.12 Remark Condition (15.5.1) was relaxed in [1972] to:
δλ(1 − ρ(1 − x)) < x for x ∈ (0, δ].
It remains to show that distributions satisfying (15.5.1) exist. Distributions that satisfy this
condition are referred to as capacity achieving (CA). Numerous works have addressed the
problem of finding CA sequences of distributions including [1976, 2340, 2457, 2618, 2620].
A variety of techniques are used. One such example is given in the following.
15.5.13 Example This example is taken from [1975]. Let D be a positive integer which is chosen
to trade off average left degree with how well the decoding process works. Let H(D) be the
truncated Harmonic series: $H(D) = \sum_{i=1}^{D} 1/i \approx \ln(D)$. Let
\[ \lambda_i = \frac{1}{H(D)(i-1)}, \quad i = 2, 3, \ldots, D+1, \quad \text{and hence} \quad \lambda_D(x) = \frac{1}{H(D)} \sum_{i=1}^{D} \frac{x^i}{i}. \]
The average left degree is then $a_\ell = H(D)(D+1)/D$. For the right distribution choose
\[ \rho_i = \frac{e^{-\alpha} \alpha^{i-1}}{(i-1)!}, \quad i = 1, 2, \ldots, \qquad \rho_D(x) = e^{\alpha(x-1)}, \]
i.e., the Poisson distribution, where $\alpha$ is chosen so that $a_r = \alpha e^{\alpha}/(e^{\alpha} - 1) = a_\ell/\beta$, i.e.,
to make the fractions of nodes on the left and the right the same. It can be shown [1975]
that the two distributions satisfy (15.5.1) for $\delta \le \beta/(1 + 1/D)$ and hence are CA. These
distributions are referred to as heavy tail/Poisson or tornado [1975, 2619].
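Condition (15.5.1) can be checked numerically for the heavy tail/Poisson pair. The sketch below is a plausibility check on a finite grid, not a proof; the parameter values D = 10 and β = 0.5, the bisection used to solve for α, and all function names are assumptions made for the illustration.

```python
import math


def heavy_tail_poisson(D, beta):
    """Return (lambda_i coefficients, alpha, average left degree)."""
    H = sum(1 / i for i in range(1, D + 1))
    lam = {i: 1 / (H * (i - 1)) for i in range(2, D + 2)}
    a_left = H * (D + 1) / D
    target = a_left / beta                  # required average right degree a_r
    lo, hi = 1e-9, target + 1.0
    for _ in range(200):                    # bisection on alpha / (1 - e^-alpha)
        mid = (lo + hi) / 2
        if mid / (1 - math.exp(-mid)) < target:
            lo = mid
        else:
            hi = mid
    return lam, (lo + hi) / 2, a_left


def lam_poly(lam, x):
    """lambda(x) = sum_i lambda_i x^(i-1)."""
    return sum(c * x ** (i - 1) for i, c in lam.items())


def satisfies_condition(lam, alpha, delta, steps=1000):
    """Check rho(1 - delta*lambda(x)) > 1 - x on a grid of x in (0, 1]."""
    return all(math.exp(alpha * (-delta * lam_poly(lam, x))) > 1 - x
               for x in (j / steps for j in range(1, steps + 1)))


D, beta = 10, 0.5
lam, alpha, a_left = heavy_tail_poisson(D, beta)
print(satisfies_condition(lam, alpha, beta / (1 + 1 / D)))   # delta at the stated bound
print(satisfies_condition(lam, alpha, 1.2 * beta))           # delta beyond the bound
```

On this grid the condition holds at δ = β/(1 + 1/D) and fails for the larger value δ = 1.2β, consistent with the stated range.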
15.5.14 Remark It is to be noted that CA sequences lead to a graph having a high probability of
successfully completing decoding on the graph.
15.5.15 Theorem [1975, Theorem 3] For any rate R, 0 < R < 1, any $\epsilon$ with $0 < \epsilon < 1$, and
sufficiently large block length n, there is a linear code and decoding algorithm that, with
probability $1 - O(n^{-3/4})$, is able to correct a random $(1 - R)(1 - \epsilon)$ fraction of erasures in
time proportional to $n \ln(1/\epsilon)$.
15.5.16 Remark The original construction of tornado codes involved a cascade of m + 1 sections of
random bipartite graphs, the i-th section having $\beta^i k$ left nodes and $\beta^{i+1} k$ check nodes, with
a final code C as a good erasure correcting code. Works subsequent to [1975] typically used
only a single section (and no final erasure correcting code) and perhaps this is the important
impact of the work on tornado codes. The ideas of tornado codes (although not always
referred to as tornado codes) first appeared in [1974]. Reference [1972] simplified the analysis.
The notion of the graph construction and CA distribution sequences has proved important
and a rather large set of papers on this subject now exists, many giving new techniques,
including linear programming, for finding CA distributions. These include [2457, 2618, 2620].
The influence of the paper [1975] is evident in subsequent work on the problem. The ability
to encode and decode codes on the BEC in linear time is a remarkable achievement (recall
the initial comments on the equivalence of this problem to matrix reduction and Gaussian
elimination).
15.5.17 Definition Define a bipartite graph $B_0$ with k information nodes and $\beta k$ check nodes,
$0 < \beta < 1$. Recursively form the bipartite graph $B_i$ whose left nodes are the $\beta^i k$ right
nodes of $B_{i-1}$, with $\beta^{i+1} k$ right nodes, $1 \le i \le m$, for an integer m chosen so that
$\beta^{m+1} k$ is roughly $\sqrt{k}$. Choose a final code C as a good erasure correcting code of rate
$1 - \beta$ with $\beta^{m+1} k$ information nodes. The resulting cascaded code, $C(B_0, B_1, \cdots, B_m, C)$
has k information symbols (the leftmost k symbols as input to the graph $B_0$) and
\[ \sum_{i=1}^{m+1} \beta^i k + \beta^{m+2} k/(1 - \beta) = k\beta/(1 - \beta) \]
check symbols.
15.5.18 Remark The notion of fountain codes first appears in the work [470]. The term “fountain”
refers to the process whereby k information symbols (either bits or packets) are encoded
into coded symbols by XOR’ing subsets of information symbols in such a way that a receiver
may collect any $k(1+\epsilon)$ coded symbols, for some $\epsilon$ appropriate to the system, to recover all k
information symbols. The image is of a coder producing a digital fountain of coded symbols
and a receiver collects a sufficient number of such symbols to retrieve the information. Codes
with such a property can be described as universal rateless erasure codes. The codes are
rateless since there is no concept of code dimension. LT codes are the first realization of
such erasure codes. The term “LT” refers to “Luby transform.” The codes were devised for
a situation where packets are lost, or dropped, on the Internet and there was no efficient
way a receiver could request a retransmission for such a missing packet. This is a typical
situation for a multicast internet protocol where a single information source transmits a
large amount of data to a large user community. For a typical block erasure correcting
code, each user would have to receive a specific set of codeword symbols in order to decode
the remaining erasures. In the LT scenario, any sufficiently large collection of coded symbols
will do. While often thought of as erasure correcting codes, there are no erasures in this
situation - in effect erased packets are those that are lost.
15.5.19 Remark As a matter of notation, the terms used in this section are information symbols
and coded symbols. Many papers use the terms input and output symbols which are also
natural but in some situations, where more than one level of graphs is considered, they
might be ambiguous.
15.5.20 Remark In constructing bipartite graphs for coding, two forms appear in the literature. In
the first form, there are n nodes on the left, and n − k check nodes on the right. This is
often called the Tanner graph of the code. The other form is the one used above with k
information nodes on the left and n − k check nodes on the right.
15.5.22 Definition (The LT decoding rule) [1971] From a collection of K (slightly larger than
k) received coded symbols, a bipartite graph is formed with k left information nodes
(initially of unknown value) and K right received code nodes. Edge connections are
formed from the header information of the received coded symbols. All coded symbols
of degree one are said to cover their unique information neighbors and this set of covered
information symbols is the ripple. Information symbols in the ripple are determined as
they are the same as the coded symbols covering them. At the first stage an information
symbol in the ripple is XOR’ed with all of its coded neighbors and all edges to the
coded neighbors removed. This might produce coded nodes of degree one and their
unique information symbols are added to the ripple. The process continues iteratively
until either all information symbols are recovered or there are no coded symbols of
degree one. If at the end some information symbols remain uncovered, decoding failure
is declared.
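The LT decoding rule can be sketched as follows. The representation is an assumption made for the illustration: each received coded symbol is a pair (set of information indices taken from its header, XOR value), symbols are small integers standing in for packets, and the function names are hypothetical.

```python
from functools import reduce
from operator import xor


def lt_encode_symbol(info, neighbors):
    """A coded symbol is the XOR of the chosen information symbols."""
    return (set(neighbors), reduce(xor, (info[i] for i in neighbors), 0))


def lt_decode(k, coded):
    """Return the k information symbols, or None on decoding failure."""
    nbrs = [set(n) for n, _ in coded]
    vals = [v for _, v in coded]
    info = [None] * k
    adj = [[] for _ in range(k)]              # coded neighbors of each information node
    for c, n in enumerate(nbrs):
        for i in n:
            adj[i].append(c)
    ripple = []

    def release(c):
        """A degree-one coded symbol covers its unique information neighbor."""
        (i,) = nbrs[c]
        if info[i] is None:
            info[i] = vals[c]
            ripple.append(i)

    for c in range(len(coded)):
        if len(nbrs[c]) == 1:
            release(c)
    while ripple:
        i = ripple.pop()
        for c in adj[i]:
            if i in nbrs[c] and len(nbrs[c]) > 1:
                vals[c] ^= info[i]            # XOR the recovered symbol into c
                nbrs[c].discard(i)            # and remove the edge
                if len(nbrs[c]) == 1:
                    release(c)
    return info if None not in info else None


message = [5, 9, 12, 3]
headers = [{0}, {0, 1}, {1, 2}, {0, 2, 3}]
received = [lt_encode_symbol(message, h) for h in headers]
decoded = lt_decode(4, received)

no_degree_one = [lt_encode_symbol(message, h) for h in [{0, 1}, {1, 2}, {2, 3}, {0, 3}]]
failed = lt_decode(4, no_degree_one)
print(decoded, failed)
```

The second collection contains no coded symbol of degree one, so the ripple is empty from the start and decoding failure is declared.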
15.5.23 Remark The decoding process above has a complexity proportional to the number of edges
in the graph. The number K of coded symbols required to have a reasonably high probability
of decoding success is typically in the range of 1.01 to 1.05 times k, for sufficiently large k.
From a balls in bins analysis it can be shown that for such a coding process it is necessary
that at least k ln(k/δ) balls must be thrown into the k (information) bins in order to have
a probability of at least 1 − δ that each of the bins has at least one ball (i.e., that each information
symbol is used (covered) by at least one of the coded symbols - for if one is not covered it
could not be recovered). Thus in the encoding process the average degree of an information
symbol must be at least ln(k/δ) no matter what distribution is used for producing the coded
symbols.
15.5.24 Remark The random behavior of the LT decoding process is determined entirely by the
distribution {ρi }, the number of information symbols k, and the number of coded symbols
K obtained, which is typically slightly larger than k. The desirable properties for the
distribution are
1. to ensure the fewest possible number of coded symbols to be able to complete
the decoding with high probability and
2. the average degree of the coded symbol nodes is as small as possible (although,
as noted, at least ln(k)).
15.5.25 Remark In analyzing the LT decoding process, a desirable feature is to have the rate at
which information symbols are added to the ripple to be about the same as the rate they
are processed. The ripple size should be large enough to ensure the decoding process can
continue to completion, but not so large that a coded symbol has too high a probability
of covering an information node already in the ripple. A distribution that has desirable
properties in terms of this and other properties is the soliton distribution given by:
$$\rho_i = \begin{cases} 1/k, & i = 1, \\ 1/(i(i-1)), & i = 2, 3, \ldots, k, \end{cases} \qquad (15.5.2)$$
and $\rho_S(x) = \sum_i \rho_i x^i$. (The name derives from physics where it arises from a similar property
in a refraction problem.) The expected value of this distribution is $H(k) = \sum_{i=1}^{k} 1/i$, the
harmonic sum to k. In [2621], a different analysis of the decoding process of Definition 15.5.22
is given. It is shown there that if, at each step of the decoding process one wants an expected
number of degree one coded nodes, the degree distribution must satisfy the (asymptotic)
equation
$$(1 - x)\,\eta''(x) = 1, \qquad 0 < x < 1,$$
where $\eta(x) = \sum_i \eta_i x^i$. The solution of this equation is almost the soliton distribution
- except that η1 = 0 and the range is infinite. The fact that such a distribution yields
no coded symbols of degree one is a problem since the decoding algorithm cannot start.
While this distribution is ideal in the sense that the expected number of coded symbols
needed to recover the information in the ripple is one, in practice it performs poorly as the
probability the ripple disappears before completion is high. The robust soliton distribution
{µi } is proposed to correct these defects. It is defined in the following manner. Let δ be
the probability of decoder failure for k information symbols and K coded symbols and
$R = c \ln(k/\delta)\sqrt{k}$ for some suitable constant c:
$$\mu_i = (\rho_i + \tau_i)/\beta, \quad \text{where } \beta = \sum_{i=1}^{k} (\rho_i + \tau_i). \qquad (15.5.3)$$
The intuition is that the probability a random walk of length k deviates by more than
$\ln(k/\delta)\sqrt{k}$ from its mean is at most δ [1971]. With this background it is possible to show
the following theorem.
15.5.26 Theorem [1971, Theorems 12, 13 and 17] With the above notation, the average degree of
a coded symbol is D = O(ln(k/δ)), and the number of coded symbols required to achieve a
decoding failure probability of at most δ is $K = k + O(\sqrt{k}\,\ln^2(k/\delta))$.
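The soliton distribution (15.5.2) and the normalization in (15.5.3) can be tabulated numerically. The sketch below is illustrative only. The text above does not spell out τ_i, so the form used here is the one commonly attributed to Luby's paper [1971] (τ_i = R/(ik) for i < k/R, a spike R ln(R/δ)/k at i = k/R, and zero beyond) and should be read as an assumption; rounding k/R to an integer and the function names are also choices made for the example.

```python
import math

def ideal_soliton(k):
    """rho[i] for i = 1..k as in (15.5.2); rho[0] is unused."""
    rho = [0.0] * (k + 1)
    rho[1] = 1.0 / k
    for i in range(2, k + 1):
        rho[i] = 1.0 / (i * (i - 1))
    return rho

def robust_soliton(k, c, delta):
    """Return (mu, beta) with mu[i] = (rho[i] + tau[i]) / beta.

    tau follows the commonly quoted definition (an assumption here,
    since it is not reproduced in the surrounding text).
    """
    rho = ideal_soliton(k)
    R = c * math.log(k / delta) * math.sqrt(k)
    spike = int(round(k / R))            # assumption: k/R rounded to an integer
    tau = [0.0] * (k + 1)
    for i in range(1, min(spike, k + 1)):
        tau[i] = R / (i * k)
    if 1 <= spike <= k:
        tau[spike] = R * math.log(R / delta) / k
    beta = sum(rho[i] + tau[i] for i in range(1, k + 1))
    mu = [0.0] + [(rho[i] + tau[i]) / beta for i in range(1, k + 1)]
    return mu, beta

k = 1000
rho = ideal_soliton(k)
mean_degree = sum(i * rho[i] for i in range(1, k + 1))   # equals H(k)
mu, beta = robust_soliton(k, c=0.1, delta=0.5)
print(mean_degree, beta, sum(mu))
```

The printed mean degree of the ideal soliton distribution agrees with the harmonic sum H(k), and the robust distribution sums to one by construction.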
15.5.27 Remark While the LT codes represent a remarkable step forward for coding on an erasure
channel, such as the Internet, their complexity or running time is not linear in the number
of input information symbols. The following result emphasizes this. For a code with k
information symbols, we say a decoding algorithm is reliable if it fails to decode with a
probability at most $1/k^c$ for some positive constant c. The overhead of the code and decoding
algorithm is ε if the decoder needs (1 + ε)k coded symbols in order for the decoder to succeed
with high probability. The term space complexity refers to the amount of memory required
to implement decoding. The reference [2621] is followed closely here, although the terms
information and coded symbols are used rather than input and output symbols. The term
raptor derives from RAPid TORnado although the tornado and LT codes have very different
constructions.
15.5.28 Lemma [2621, Proposition 1] If an LT code with k information symbols possesses a reliable
decoding algorithm, then there is a constant c such that the graph associated to the decoder
has at least ck log(k) edges.
15.5.29 Remark Recall that if K coded symbols are gathered to decode for the k information
symbols, then Gaussian elimination is maximum likelihood and has a complexity of $O(Kk^2)$,
i.e., from the K coded symbols a K × k matrix equation can be set up to be solved for
the k information symbols. The decoding algorithm for LT codes is able to improve on
this complexity considerably by choosing an appropriate distribution {ρi } and decoding
algorithm, as noted above. An LT code generated in this manner is referred to as a (k, ρ(x))
LT code.
15.5.30 Remark The idea behind the raptor codes is to first add parity check symbols to the
information symbols, by use of an efficient linear block erasure correcting code, to form the
set of intermediate symbols and then LT encoding this set. This relieves the LT decoder
from having to correct all the erasures. A few can be left to the block code to correct and
this eases the burden of the LT decoder considerably and allows a linear decoding time,
for a good choice of code. Let Cn be a linear code of dimension k and block length n. It is
called the precode of the raptor code. The intermediate symbols are then the k information
symbols and n − k parity checks of the precode Cn . For the LT code, a modification of the
above soliton like distribution is suggested [2621]:
$$\rho_D(x) = \frac{1}{\mu+1}\left(\mu x + \sum_{i=2}^{D} \frac{x^i}{(i-1)i} + \frac{x^{D+1}}{D}\right)$$
where $D = \lceil 4(1+\varepsilon)/\varepsilon \rceil$ for a given real number ε and $\mu = (\varepsilon/2) + (\varepsilon/2)^2$. Thus this is a
soliton-like distribution with a positive probability of degree 1 and truncated at D + 1. The
LT code is designated an (n, ρD (x)) code. The entire code is the raptor code and designated
a (k, Cn , ρD (x)) raptor code. A key result on the way to the final one is the following lemma.
15.5.31 Lemma [2621, Lemma 4] There exists a positive number c (depending on ε) such that with
an error probability of at most $e^{-cn}$ any set of (1 + ε/2)n + 1 coded symbols of the LT code
with parameters (n, ρD (x)) is sufficient to recover at least (1 − δ)n information symbols,
where δ = (ε/4)/(1 + ε).
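As a quick check on the distribution ρ_D(x) of Remark 15.5.30, the following sketch tabulates its coefficients for a given ε and confirms that they sum to one. The function name and the list representation are assumptions made for illustration.

```python
import math

def raptor_degree_distribution(eps):
    """Coefficients of rho_D(x); entry i is the probability of degree i."""
    D = math.ceil(4 * (1 + eps) / eps)
    mu = eps / 2 + (eps / 2) ** 2
    coeff = [0.0] * (D + 2)
    coeff[1] = mu                          # positive probability of degree 1
    for i in range(2, D + 1):
        coeff[i] = 1.0 / ((i - 1) * i)     # soliton-like part
    coeff[D + 1] = 1.0 / D                 # truncation term at degree D + 1
    return D, mu, [c / (mu + 1) for c in coeff]

D, mu, probs = raptor_degree_distribution(0.5)
print(D, mu, sum(probs))
```

The sum telescopes: the terms for i = 2, ..., D add up to 1 − 1/D, the last term contributes 1/D, and dividing by µ + 1 absorbs the degree-one weight µ.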
15.5.32 Remark Two extreme cases of raptor codes are noted. If $\rho_w = \binom{k}{w}/2^k$, corresponding to the
distribution polynomial $\rho(x) = (1 + x)^k/2^k$, it leads to the probability of any particular
binary k-tuple being chosen as $1/2^k$, i.e., a uniform distribution over $\mathbb{F}_2^k$, since a vector of
weight w is chosen with probability $\binom{k}{w}/2^k$ and each of the $\binom{k}{w}$ vectors is chosen equally likely. If
such a distribution was used in the LT process, performance would be very poor. The degree
distribution is too large to permit the process to succeed. At the other extreme, suppose
the LT process of the raptor code (k, Cn , ρ(x)) uses a trivial distribution ρ1 = 1, i.e., after
the precoding, the coder chooses an intermediate symbol at random and declares it a coded
symbol. Such a code is referred to as a precode only (PCO) raptor code. One can see that
such a code can achieve a low overhead only for very low rate codes Cn .
15.5.33 Remark A suitable precode has the properties:
1. The rate R of Cn is (1 + ε/2)/(1 + ε).
2. The BP decoder can decode Cn on a BEC with erasure probability
δ = (ε/4)/(1 + ε) = (1 − R)/2,
in O(n log(1/ε)) operations. (Note this is half of capacity for the rate of the code.)
It is suggested that several types of codes meet these criteria, such as tornado codes and
right regular codes (coded symbol nodes have the same degree).
15.5.34 Theorem [2621, Theorem 5] Let ε be a positive real number, k the number of information
symbols, $D = \lceil 4(1+\varepsilon)/\varepsilon \rceil$, $R = (1+\varepsilon/2)/(1+\varepsilon)$, $n = \lceil k/(1-R) \rceil$, and let Cn be a code
with properties described above. Then the raptor code (k, Cn , ρD (x)) has space complexity
1/R, overhead ε and a cost of O(log(1/ε)) with respect to BP decoding of both the precode
and LT code.
15.5.35 Remark A problem with raptor codes is that they are not systematic. A technique is given
in [2621] to generate systematic raptor codes but is not discussed here. Many applications
of raptor codes in practice prefer systematic codes. Raptor codes have been incorporated
into numerous standards for the reliable delivery of data objects. The codes are described in
IETF RFC 5053 and IETF RFC 6330 for such applications as the DVB-H IPDC (IP datacast
to handheld devices) and 3GPP TS for multimedia broadcast/multicast service (MBMS)
and future standards of IEEE P2220 (a draft standard protocol for stream management in
media client devices) and 3GPP eMBMS, among others. The latter standards use RaptorQ
codes defined over the finite field $\mathbb{F}_{2^8}$. All of these standards use systematic raptor codes.
The monograph [2622] has an extensive discussion of these codes for standards.
15.5.36 Remark A good erasure correcting (linear) code should have a good minimum distance.
Thus a reasonable question to ask is how such a code would perform on a noisy channel,
such as a binary symmetric channel (BSC). The performance of raptor codes on such a
channel is considered in [991]. Many of the results for the erasure channel are generalized
there. It builds on the landmark paper [2458] which considered more general classes of low
density parity check codes and introduced such fundamental concepts as density evolution.
The application of these ideas to raptor codes generalizes many of the results for the erasure
channel.
References Cited: [470, 991, 1971, 1972, 1973, 1974, 1975, 1976, 2340, 2457, 2458, 2618,
2619, 2620, 2621, 2622]
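15.6.1 Definition Let $F = \mathbb{F}_q$ be the field of cardinality q. For a positive integer ℓ, a decomposition
of $F^\ell$ is defined recursively. Let $T_0 = F^\ell$, $|T_0| = q^\ell$, and
$$T_0 = \bigcup_{a_0 \in F} T_1^{(a_0)} = \bigcup_{a_0 \in F} \bigcup_{a_1 \in F} T_2^{(a_0,a_1)} = \cdots = \bigcup_{a_0 \in F} \bigcup_{a_1 \in F} \cdots \bigcup_{a_{\ell-1} \in F} T_\ell^{(a_0,a_1,\ldots,a_{\ell-1})},$$
where
$$\left| T_i^{(a_0,a_1,\ldots,a_{i-1})} \right| = q^{\ell-i}, \quad i = 1, \ldots, \ell.$$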
where the minimum is taken over all possible choices of $a_0, \ldots, a_{i-1}$, and d(·) is the
minimum Hamming distance of the code. Then the distance hierarchy of a space decomposition
is the vector
$$\mathbf{d} = (d_0 = 1, d_1, \ldots, d_{\ell-1}).$$
15.6.10 Remark The distance hierarchy of a decomposition is a parameter used to determine the
decay rate of decoding error probability.
15.6.11 Example The distance hierarchy of the decomposition from Example 15.6.3 is (1, 2). The
distance hierarchy of the decomposition from Example 15.6.8 is (1, 2, 3, . . . , q).
15.6.12 Definition A space decomposition is proper if for at least one vector $(a_0, \ldots, a_{\ell-1}) \in F^\ell$,
$d\left(T^{(a_0,\ldots,a_{\ell-1})}\right) \geq 2$.
15.6.13 Remark Given a space decomposition and a vector $a = (a_0, \ldots, a_{\ell-1}) \in F^\ell$, one may define
a transform $g : F^\ell \to F^\ell$ associated to the decomposition as
$$g(a) = T_\ell^{(a_0,a_1,\ldots,a_{\ell-1})}.$$
Recall that $T_\ell^{(a_0,a_1,\ldots,a_{\ell-1})}$ here is a vector from $F^\ell$. An extended transform of vectors of
lengths greater than ℓ can be defined as follows.
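15.6.14 Definition Let $b = (b_0, \ldots, b_{\ell s-1}) \in F^{\ell s}$ be a vector, and B be the matrix of size ℓ × s,
$$B = \begin{pmatrix} b_0 & b_1 & \cdots & b_{s-1} \\ b_s & b_{s+1} & \cdots & b_{2s-1} \\ \vdots & \vdots & \ddots & \vdots \\ b_{(\ell-1)s} & b_{(\ell-1)s+1} & \cdots & b_{\ell s-1} \end{pmatrix} = (\mathbf{b}_0, \ldots, \mathbf{b}_{s-1}),$$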
15.6.15 Example Using the decomposition from Example 15.6.3 with s = 4 we have for
b = (01011100),
$$B = \begin{pmatrix} 0 & 1 & 0 & 1 \\ 1 & 1 & 0 & 0 \end{pmatrix},$$
and
ĝ(b) = (g(01), g(11), g(00), g(10)) = (11100001).
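Example 15.6.15 can be reproduced mechanically: write b row by row into an ℓ × s matrix, apply g to each column, and concatenate the results. In the sketch below the table `G_TABLE` is read off from the four values of g implied by the example (g(00) = 00, g(01) = 11, g(10) = 01, g(11) = 10); the function name `extended_transform` and the dictionary representation of g are assumptions made for illustration.

```python
# Values of g on F_2^2 as implied by Example 15.6.15.
G_TABLE = {(0, 0): (0, 0), (0, 1): (1, 1), (1, 0): (0, 1), (1, 1): (1, 0)}

def extended_transform(b, ell, g_table):
    """Apply g to every column of the ell x s matrix whose rows are
    consecutive length-s segments of b, and concatenate the images."""
    s = len(b) // ell
    if len(b) != ell * s:
        raise ValueError("length of b must be a multiple of ell")
    rows = [b[r * s:(r + 1) * s] for r in range(ell)]
    out = []
    for j in range(s):
        column = tuple(rows[r][j] for r in range(ell))
        out.extend(g_table[column])
    return out

b = [0, 1, 0, 1, 1, 1, 0, 0]
print(extended_transform(b, 2, G_TABLE))   # [1, 1, 1, 0, 0, 0, 0, 1]
```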
15.6.16 Definition An ℓ-step transform $G : F^{\ell^m} \to F^{\ell^m}$ is defined as follows. Let $b^{(0)} = b \in F^{\ell^m}$.
At the i-th step of the transform, i = 0, . . . , m − 1, the vector $b^{(i)}$ is partitioned to the
successive sub-vectors of length $\ell^{i+1}$,
$$b^{(i)} = \left(\mathbf{b}_0^{(i)}, \mathbf{b}_1^{(i)}, \ldots, \mathbf{b}_{\ell^{m-i-1}-1}^{(i)}\right),$$
where
$$\mathbf{b}_j^{(i)} = \left(b^{(i)}_{j\ell^{i+1}}, b^{(i)}_{j\ell^{i+1}+1}, \ldots, b^{(i)}_{(j+1)\ell^{i+1}-1}\right), \quad j = 0, \ldots, \ell^{m-i-1} - 1.$$
Then
$$b^{(i+1)} = \left(\hat{g}(\mathbf{b}_0^{(i)}), \hat{g}(\mathbf{b}_1^{(i)}), \ldots, \hat{g}(\mathbf{b}_{\ell^{m-i-1}-1}^{(i)})\right).$$
c = G(b) = $b^{(m)}$.
where the union is taken over all $q^k$ choices of $b \in F^n$ such that the components of b
are set to zero in the coordinates having index belonging to J.
15.6.19 Remark The way to choose the set J will be discussed later.
15.6.20 Lemma If the space decomposition is linear, the polar code is a linear code.
15.6.3 Decoding
15.6.21 Lemma Encoding of a polar code requires m steps each having complexity linear in the
length of the code. The complexity of encoding a polar code is O(n log n).
15.6.22 Remark Let c = G(b) be transmitted over a memoryless channel. At the output of the
channel we obtain for each of the n coordinates a set of q probabilities, one for each of the
field elements, that has been transmitted at this position. Based on this we have to conclude
what is the most likely code vector that had been transmitted.
15.6.23 Remark Decoding polar codes can be done recursively. The algorithm is called Successive
Cancelation. The defined transform G implies a natural order of the elements of b. The
elements of b are processed in this order under the assumption that the previous elements
of b have been determined. It can be shown that asymptotically in the length of the code
a subset J of the elements have negligible probability of being wrongly decoded, while the
rest of elements are correctly decoded only with probability tending to 1/q. This allows one to
employ the following encoding: place the encoded information on the positions of J, while
the values of the rest of the symbols are fixed to some prescribed value, e.g., to 0. The
code rate, equal to the proportion |J|/n, asymptotically achieves the symmetric capacity of
memoryless channels. A detailed description of the algorithm is given below.
15.6.24 Remark Noticing that the last step of the transform G is just a concatenation of transformations
g of the transposed columns of a matrix of size $\ell \times \ell^{m-1}$, we may reconstruct the
rows of the matrix row by row. Moreover, we use knowledge from the previously decoded
rows.
15.6.25 Remark The last step of transform G is $\hat{g}(\mathbf{b}_0^{m-1})$, i.e., if
$$\mathbf{b}_0^{m-1} = (b_0^{m-1}, b_1^{m-1}, \ldots, b_{n-1}^{m-1}),$$
Given that we managed to decode the first (i − 1) rows, we may compute the q probabilities
for the entries of the i-th row from the channel output as follows: the probability of ai,j ,
i = 0, . . . , ℓ − 1, j = 0, . . . , n/ℓ − 1, to be β ∈ F, is just the probability that in the j-th segment
of length ℓ in the transmitted code word was a vector belonging to $T^{(a_{0,j},\ldots,a_{(i-2),j},\beta)}$, where
$a_{0,j}, \ldots, a_{(i-2),j}$ are known from the previous decoded rows.
15.6.26 Remark The problem of decoding a polar code of length $\ell^m$ is thus reduced to ℓ decodings
of polar codes of length $\ell^{m-1}$, encoding the rows of the matrix. Decoding of each row could
be split into ℓ decodings of codes of length $\ell^{m-2}$, etc. Finally, we arrive at decodings of
single symbols, being the entries of the initial vector b. If this entry has index belonging
to the set J, then it is zero, otherwise we may choose the most probable element of F as
our decision. It was shown in [119] that when the rate of the code is less than the channel
capacity, there exists a choice of set J allowing negligible probabilities of errors in the entries
where we make a choice.
15.6.27 Theorem The complexity of the successive cancellation decoding is O(n log n).
15.6.28 Theorem Let the polar code be based on a proper space decomposition and g be the
corresponding transform. Let $\mathbf{d} = (d_0, d_1, \ldots, d_{\ell-1})$ be its distance hierarchy. Then for any
rate less than the capacity of the channel and growing code length n, the probability of
decoding error decays as $O(q^{-n^{E(g)}})$, where the decomposition exponent E(g) satisfies
$$E(g) \geq \frac{1}{\ell} \sum_{i=0}^{\ell-1} \log_\ell d_i.$$
15.6.30 Example For the binary decomposition from Example 15.6.3, E(g) = 0.5. For the quaternary
(q = 4, ℓ = 4) decomposition from Example 15.6.8, E(g) = 0.573120 . . . .
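The right-hand side of the bound in Theorem 15.6.28 is easy to evaluate from a distance hierarchy. The short sketch below (the function name is an assumption) does so for the hierarchies (1, 2) and (1, 2, 3, 4) given in Example 15.6.11 and yields the two values quoted in Example 15.6.30.

```python
import math

def exponent_lower_bound(d):
    """(1/ell) * sum of log_ell(d_i) for a distance hierarchy d of length ell."""
    ell = len(d)
    return sum(math.log(di, ell) for di in d) / ell

print(exponent_lower_bound([1, 2]))         # binary case, ell = 2
print(exponent_lower_bound([1, 2, 3, 4]))   # quaternary case, ell = 4
```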
15.6.31 Remark Polar codes were proposed by Arikan [119] and provided a scheme for achieving
the symmetric capacity of binary memoryless channels (BMC) with polynomial encoding
and decoding complexity. The original construction by Arikan yields a binary code of block
length $n = 2^m$, and a flexible rate.
15.6.32 Remark Different decompositions were considered in [1796, 2161, 2163, 2428, 2429, 2531,
2776]. For the binary case the decomposition from Example 15.6.3 gives the best expo-
nent for all lengths up to 13, see [1796]. In [2428] non-linear decompositions of lengths 14,
15, and 16 based on partitions of a Hamming code to cosets of the Nordstrom-Robinson
code, which in turn is partitioned to cosets of the first-order Reed-Muller codes, are de-
scribed. These decompositions provide a better exponent than any linear ones. However,
for linear decompositions the smallest ℓ for which the exponent is greater than 0.5 is 16,
for which E(g) = 0.51828; see [1796]. In [2163], along with extended Reed-Solomon codes,
nested families of algebraic-geometric codes are used to construct non-binary decomposi-
tions. In [1796] it was suggested to use nested families of BCH codes and codes achieving the
Gilbert-Varshamov bound to construct efficient decompositions. It was shown that when ℓ
increases, the best error exponent tends to 1. A construction of polar codes using several
decompositions over different fields is proposed in [2429].
15.6.33 Remark As mentioned in [119, Section I.D] the notion of polar coding is strongly related to
previous ideas in coding theory, such as multi-level coding and Reed-Muller codes. Another
strong origin of polar coding is a previous paper by Arikan [117] where the channel combining
and splitting were used to demonstrate that improvements can be obtained for the sum
cutoff rate of some appropriate channels.
15.6.34 Remark In [118, 119, 1442, 2162, 2531] the problem of optimizing the choice of the infor-
mation subset was considered for different channels.
15.6.35 Remark The tradeoff between the block length, the gap to capacity and the asymptotic
decoding error probability was considered in [1795]. Decoding implementations were considered
in [119, 120, 1906]. A list decoding for polar codes was introduced in [2772]. Using
polar codes in concatenated schemes was discussed in [181, 1814].
15.6.36 Remark Use of polar codes in other areas of information theory was considered in [2, 103,
121, 314, 753, 1520, 1685, 1794, 1797, 1993, 2532].
References Cited: [2, 103, 117, 118, 119, 120, 121, 122, 181, 314, 753, 1442, 1520, 1685, 1794,
1795, 1796, 1797, 1814, 1906, 1993, 2161, 2162, 2163, 2428, 2429, 2531, 2532, 2772, 2776]
16
Cryptography
16.1.1 Definition Symmetric-key encryption schemes, also called ciphers, are usually classified as
being a stream cipher in which encryption is performed one character (or bit) at a time,
or a block cipher in which encryption is performed on a block of characters (or bits).
16.1.2 Remark Stream ciphers are generally preferred over block ciphers in applications where
buffering is limited and message characters must be individually processed as they are
received.
16.1.3 Example A classical example of a stream cipher is the simple substitution cipher. The secret
key is a randomly selected permutation π of the English alphabet. An English plaintext
16.1.8 Definition A block cipher consists of a family of encryption functions $E_k : \{0,1\}^n \to \{0,1\}^n$
parameterized by an ℓ-bit key k. Each function in the family is invertible. The
inverse of $E_k$ is the decryption function $D_k$.
16.1.9 Remark If two parties wish to communicate securely, they first agree upon a secret key
$k \in \{0,1\}^\ell$. Then, to transmit a message $m \in \{0,1\}^n$, a party computes $c = E_k(m)$ and
sends c. The recipient computes $m = D_k(c)$.
16.1.10 Example Feistel ciphers [1048] are a general class of block ciphers. The parameters of a
Feistel cipher are n (the block length), ℓ (the key length), and h (the number of rounds).
The ingredients are a key scheduling algorithm that determines subkeys $k_1, k_2, \ldots, k_h$ from
k, and component functions $f_i : \{0,1\}^{n/2} \to \{0,1\}^{n/2}$ for 1 ≤ i ≤ h, where $f_i$ depends on
$k_i$.
A plaintext block $m \in \{0,1\}^n$ is encrypted as follows:
1. Write $m = (m_0, m_1)$, where $m_0, m_1 \in \{0,1\}^{n/2}$.
where DES denotes the DES encryption function. The ciphertext block c can be decrypted
as follows:
$$m = \mathrm{DES}^{-1}_{k'}(\mathrm{DES}^{-1}_{k''}(\mathrm{DES}^{-1}_{k'''}(c))),$$
where $\mathrm{DES}^{-1}$ denotes the DES decryption function. The secret key has bitlength 168, rendering
exhaustive key search infeasible. Given a few plaintext-ciphertext pairs, the secret
key can be recovered by a meet-in-the-middle attack that has running time approximately
$2^{112}$ steps; however, this attack is considered infeasible in practice.
16.1.13 Example Block ciphers encrypt a long message n bits at a time. The drawback of this
method is that identical plaintext blocks result in identical ciphertext blocks, and hence
the ciphertext may leak information about the plaintext. To circumvent this weakness, long
messages can be encrypted using the cipher-block-chaining (CBC) mode of encryption. A
long message m is first broken up into blocks $m_1, m_2, \ldots, m_t$, each of bitlength n. Then,
a random initialization vector $c_0 \in \{0,1\}^n$ is chosen, and $c_i = E_k(m_i \oplus c_{i-1})$ is computed
for i = 1, 2, . . . , t. The ciphertext is $c = (c_0, c_1, \ldots, c_t)$. Decryption is accomplished by
computing $m_i = E_k^{-1}(c_i) \oplus c_{i-1}$ for i = 1, 2, . . . , t.
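The chaining rule of Example 16.1.13 can be illustrated with a toy cipher. In the sketch below, `toy_encrypt` is a made-up invertible map on 16-bit blocks that merely stands in for E_k; it offers no security and is purely an assumption for the example, as are the function names.

```python
import secrets

N_BITS = 16
MASK = (1 << N_BITS) - 1
MULT = 0x6487                       # odd, hence invertible modulo 2^16

def toy_encrypt(k, x):
    """Toy stand-in for E_k on 16-bit blocks (NOT a secure cipher)."""
    return ((x ^ k) * MULT + k) & MASK

def toy_decrypt(k, y):
    """Inverse of toy_encrypt for the same key k."""
    inv = pow(MULT, -1, 1 << N_BITS)
    return (((y - k) * inv) & MASK) ^ k

def cbc_encrypt(k, blocks, iv=None):
    """Return (c_0, c_1, ..., c_t) with c_i = E_k(m_i XOR c_{i-1})."""
    c_prev = secrets.randbits(N_BITS) if iv is None else iv
    out = [c_prev]
    for m in blocks:
        c_prev = toy_encrypt(k, m ^ c_prev)
        out.append(c_prev)
    return out

def cbc_decrypt(k, cipher):
    """Recover m_i = E_k^{-1}(c_i) XOR c_{i-1} for i = 1, ..., t."""
    return [toy_decrypt(k, cipher[i]) ^ cipher[i - 1]
            for i in range(1, len(cipher))]

key = 0x1234
message = [7, 7, 7]                 # identical plaintext blocks
cipher = cbc_encrypt(key, message)
print(cipher, cbc_decrypt(key, cipher))
```

Because each block is XOR-ed with the previous ciphertext block before encryption, the three identical plaintext blocks above do not, in general, produce identical ciphertext blocks.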
16.1.14 Remark The RSA encryption and signature schemes were introduced in a 1978 paper by
Rivest, Shamir, and Adleman [2462].
16.1.15 Algorithm [RSA key generation] Each party does the following:
16.1.16 Remark The adversary’s task of computing the private key d corresponding to a public key
(n, e) can be shown to be equivalent to the problem of factoring n. As of 2012, factoring 1024-
bit RSA moduli n is out of reach of the fastest integer factorization algorithms. However,
2048-bit moduli and 3072-bit moduli are recommended for medium- and long-term security.
16.1.17 Remark Presented next are the basic versions of the RSA encryption and signature schemes.
In the signature scheme H : {0, 1}∗ → [0, n − 1] is a cryptographic hash function (Re-
mark 16.1.23).
16.1.18 Remark In what follows, a = b mod n is understood to mean that a is the remainder of b
when divided by n, and as usual, a ≡ b (mod n) means a and b are congruent modulo n.
16.1.19 Algorithm (RSA encryption scheme) To encrypt a message m ∈ [0, n − 1] for party A, do
the following:
1. Compute $m = c^d \bmod n$.
16.1.20 Algorithm (RSA signature scheme) To sign a message m, party A with public key (n, e)
and private key d does the following:
1. Compute h = H(m).
2. Compute $s = h^d \bmod n$.
3. A’s signature on m is the integer s.
16.1.21 Remark The RSA encryption and signature schemes work because
$$(m^e)^d \equiv m \pmod{n}$$
for all m ∈ [0, n − 1], a property that can easily be verified using Fermat’s little theorem.
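The identity in Remark 16.1.21 can be checked exhaustively for a toy modulus. The sketch below uses the small textbook primes 61 and 53 and takes d to be the inverse of e modulo (p − 1)(q − 1), which is one common choice and an assumption here. The verification check $s^e \bmod n = H(m)$ is the standard counterpart of the signing steps above and is likewise assumed, as is reducing a SHA-256 digest modulo n to obtain a hash into [0, n − 1]. Such parameters are far too small for any real use.

```python
import hashlib

p, q = 61, 53                     # toy primes (illustration only)
n = p * q                         # 3233
phi = (p - 1) * (q - 1)           # 3120
e = 17
d = pow(e, -1, phi)               # modular inverse (Python 3.8+)

def H(message: bytes) -> int:
    """Hash into [0, n - 1]; reducing SHA-256 mod n is an assumption."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def rsa_encrypt(m: int) -> int:
    return pow(m, e, n)

def rsa_decrypt(c: int) -> int:
    return pow(c, d, n)

def rsa_sign(message: bytes) -> int:
    return pow(H(message), d, n)

def rsa_verify(message: bytes, s: int) -> bool:
    return pow(s, e, n) == H(message)

# (m^e)^d = m (mod n) for every m in [0, n - 1]
print(all(rsa_decrypt(rsa_encrypt(m)) == m for m in range(n)))
print(rsa_verify(b"finite fields", rsa_sign(b"finite fields")))
```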
16.1.22 Remark Security is based on the intractability of the problem of computing e-th roots
modulo n. While it is clear that this problem is no harder than that of factoring n, the
equivalence of the two problems has not been proven.
16.1.23 Remark A hash function $H : \{0,1\}^* \to \{0,1\}^\ell$ is an efficiently-computable function that
meets some cryptographic requirements such as one-wayness (given a randomly chosen element
$h \in \{0,1\}^\ell$ it is computationally infeasible to find any $m \in \{0,1\}^*$ with H(m) = h) and
collision resistance (it is computationally infeasible to find distinct $m_1, m_2 \in \{0,1\}^*$ with
H(m1 ) = H(m2 )). Examples of commonly-used hash functions are SHA-1 and SHA-256
[1065].
16.1.25 Remark The fastest generic algorithm known for solving the discrete logarithm problem is
Pollard’s rho algorithm [2413], which has a running time of $O(\sqrt{n})$, and its parallelization
by van Oorschot and Wiener [2852].
16.1.26 Example The first example of a discrete logarithm cryptographic system was the Diffie-
Hellman key agreement protocol [859]. The purpose of this protocol is to enable two parties
A and B to agree upon a shared secret by exchanging messages over a communications
channel whose contents are authenticated but not secret. Given a cyclic group $G = \langle g \rangle$ of
order n, party A randomly selects an integer a ∈ [1, n − 1] and sends $g^a$ to B. Similarly,
party B randomly selects an integer b ∈ [1, n − 1] and sends $g^b$ to A. Both parties can
compute the shared secret $k = g^{ab}$. An eavesdropper is faced with the task of determining
$g^{ab}$ given g, $g^a$ and $g^b$. This is the Diffie-Hellman problem, whose intractability is assumed
to be equal to that of the discrete logarithm problem in G [2039].
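A minimal sketch of the exchange, assuming a toy group: the subgroup of order n = 509 generated by g = 4 in the multiplicative group of integers modulo the prime p = 1019. These parameters and the function names are chosen only for illustration and are far too small to be secure.

```python
import secrets

P = 1019        # toy prime modulus with P = 2 * 509 + 1
N = 509         # prime order of the subgroup generated by G
G = 4           # a square modulo P, hence of order N

def dh_keypair():
    """Pick a secret exponent in [1, n - 1] and return it with g^a."""
    a = secrets.randbelow(N - 1) + 1
    return a, pow(G, a, P)

def dh_shared(private, other_public):
    """Shared secret (g^b)^a = g^(ab)."""
    return pow(other_public, private, P)

a, g_a = dh_keypair()     # party A sends g_a to B
b, g_b = dh_keypair()     # party B sends g_b to A
print(dh_shared(a, g_b) == dh_shared(b, g_a))
```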
16.1.27 Example ElGamal designed a closely-related scheme for public-key encryption [965]. In this
scheme, party A’s private key is a randomly selected integer a ∈ [1, n − 1] and her public
key is the group element $g^a$. To encrypt a message m ∈ G for A, a party selects a random
integer k ∈ [1, n − 1], computes $c_1 = g^k$ and $c_2 = m(g^a)^k$, and sends $(c_1, c_2)$ to A. Party
A decrypts by computing $m = c_2/c_1^a$. The basic security requirement is that an adversary
should be unable to compute m given the public key $g^a$ and ciphertext $(c_1, c_2)$. It is easy
to see that the adversary’s task is equivalent to solving an instance of the Diffie-Hellman
problem.
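The ElGamal steps of Example 16.1.27 can be sketched in the same kind of toy group (the order-509 subgroup generated by 4 modulo the prime 1019). The parameters, the function names, and the optional argument `k` for fixing the per-message randomness are assumptions made for illustration only.

```python
import secrets

P = 1019        # toy prime modulus, P = 2 * 509 + 1
N = 509         # prime order of the subgroup generated by G
G = 4

def elgamal_keygen():
    """Private key a in [1, n - 1], public key g^a."""
    a = secrets.randbelow(N - 1) + 1
    return a, pow(G, a, P)

def elgamal_encrypt(public, m, k=None):
    """Encrypt a group element m as (c1, c2) = (g^k, m * (g^a)^k)."""
    if k is None:
        k = secrets.randbelow(N - 1) + 1
    return pow(G, k, P), (m * pow(public, k, P)) % P

def elgamal_decrypt(private, ciphertext):
    """Recover m = c2 / c1^a."""
    c1, c2 = ciphertext
    return (c2 * pow(pow(c1, private, P), -1, P)) % P

a, pub = elgamal_keygen()
m = pow(G, 77, P)                       # a message in the subgroup
print(elgamal_decrypt(a, elgamal_encrypt(pub, m)) == m)
```

Decryption works because $c_1^a = g^{ka} = (g^a)^k$, so dividing $c_2$ by $c_1^a$ removes the mask.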
16.1.28 Remark The main criteria for selecting a suitable group G for implementing a discrete
logarithm cryptosystem are that (i) the group operation can be efficiently computed (so
that cryptographic operations can be efficiently performed); and (ii) the discrete logarithm
problem should be intractable. Over the years, several families of groups have been proposed
for cryptographic use including subgroups of:
1. The multiplicative group of a finite field.
2. The group E(Fq ) of Fq -rational points on an elliptic curve E defined over a finite
field Fq [1771, 2102].
3. The divisor class group of a genus-g hyperelliptic curve defined over a finite field
Fq [1772].
4. The group of Fq -rational points on an abelian variety defined over a finite field
Fq (Section 16.6).
5. The class group of an imaginary quadratic number field [440].
16.1.29 Remark Public-key cryptosystems designed using elliptic curves, hyperelliptic curves, and
abelian varieties are studied in Sections 16.4, 16.5, and 16.6, respectively. An example of a
discrete logarithm cryptographic scheme that employs a multiplicative subgroup of a finite
field is presented next.
16.1.3.3 DSA
16.1.30 Remark The digital signature algorithm (DSA) was proposed by the U.S. government’s
National Institute of Standards and Technology in 1991 [1067]. It was the first digital
16.1.36 Remark Since 2000, pairings have been widely used to design cryptographic protocols that
attain objectives not known to be achievable using conventional methods. The first such
protocol was a one-round three-party key agreement scheme due to Joux [1624]. Recall that
the Diffie-Hellman key agreement scheme is a two-party protocol where the two exchanged
messages are independent of each other, and therefore can be simultaneously exchanged.
Joux showed how pairings can be used to construct an analogous one-round key agreement
scheme for three parties.
16.1.37 Example Suppose that e is a symmetric pairing, i.e., $G_1 = G_2$. In Joux’s protocol, the
three communicating parties A, B, C randomly select integers a, b, c ∈ [1, n − 1], and
simultaneously broadcast the group elements $g_1^a$, $g_1^b$, $g_1^c$, respectively. The shared secret
is $k = e(g_1, g_1)^{abc}$, which party A, for example, can compute as $k = e(g_1^b, g_1^c)^a$. A passive
adversary’s task is to compute k given $g_1$, $g_1^a$, $g_1^b$ and $g_1^c$. This problem is the bilinear Diffie-
Hellman problem, and is assumed to be no easier than the discrete logarithm problems in
$G_1$ and $G_3$.
16.1.38 Example A fundamental pairing-based protocol is the Boneh-Franklin identity-based en-
cryption scheme [344]. The scheme has the feature that a party B can encrypt a message for
a second party A using only A’s identifier (such as A’s email address) and some publicly
available system parameters. Party A decrypts the message using a secret key that it must
obtain from a trusted third party (TTP). Unlike symmetric-key cryptography, A and B
do not have to share secret keying material. Also, unlike public-key cryptography, it is not
necessary for A to have a public key before B can encrypt a message for A.
16.1.39 Remark A basic version of the Boneh-Franklin scheme is described next using symmetric
pairings. The scheme uses a bilinear pairing e on $(G_1, G_1, G_3)$ and two cryptographic hash
functions $H_1 : \{0,1\}^* \to G_1$ and $H_2 : G_3 \to \{0,1\}^\ell$, where ℓ is the length of the message to
be encrypted.
16.1.40 Algorithm (Boneh-Franklin identity based encryption) In the setup stage, a trusted third
party (TTP) generates keying material for itself.
1. The TTP randomly selects an integer t ∈ [1, n − 1].
2. The TTP’s public key is $T = g_1^t$ and its private key is t.
At any time, a party A with identifier $ID_A$ can request its private key $d_A$ from the TTP:
1. The TTP computes $d_A = H_1(ID_A)^t$ and securely delivers $d_A$ to A.
To encrypt a message $m \in \{0,1\}^\ell$ for A, do the following:
1. Randomly select an integer r ∈ [0, n − 1].
2. Compute $R = g_1^r$ and $C = m \oplus H_2(e(H_1(ID_A), T)^r)$.
3. Send (R, C) to A.
To decrypt (R, C), party A does the following:
1. Obtain the private key $d_A$ from the TTP.
2. Compute $m = C \oplus H_2(e(d_A, R))$.
16.1.41 Remark Decryption works because
16.1.44 Remark The BLS signature scheme is described next using asymmetric pairings. The
scheme uses a bilinear pairing e on $(G_1, G_2, G_3)$ and a cryptographic hash function
$H : \{0,1\}^* \to G_1^*$.
16.1.45 Algorithm (BLS key generation) To generate a key pair, each party does the following:
16.1.48 Remark Security is based on the hardness of the following variant of the Diffie-Hellman
problem: given $M \in G_1$ and $X \in G_2$, compute $M^x$ where $X = g_2^x$.
16.1.49 Remark Pairings that are suitable for implementing Joux’s key agreement scheme, the
Boneh-Franklin identity-based encryption scheme, and the BLS short signature scheme can
be constructed from the Weil and Tate pairings defined on certain elliptic curves over finite
fields. For further details, see Section 16.4.
16.1.50 Remark Shor [2623] showed that integer factorization and discrete logarithm problems can
be efficiently solved on a quantum computer, thus rendering RSA and all discrete logarithm
cryptosystems insecure. As of 2012, the feasibility of building large-scale quantum computers
is far from certain. Nonetheless, cryptographers have been designing and analyzing public-
key cryptosystems that potentially resist attacks by quantum computers, and which could
serve as replacements to RSA and discrete logarithm cryptosystems in the event that large-
scale quantum computers become a reality. Among these post-quantum cryptosystems are
quantum key distribution [227] and conventional cryptosystems based on hash functions,
error-correcting codes, lattices, and multivariate quadratic equations [245]. Cryptosystems
based on multivariate quadratic equations are examined in Section 16.3. A code-based
cryptosystem is described next.
16.1.51 Remark In 1978, McEliece introduced a public-key encryption scheme based on error cor-
recting codes [2048]. The security of McEliece’s scheme is based on the hardness of the
general decoding problem, a problem that is known to be NP-hard [235].
16.1.52 Algorithm (McEliece key generation) Each party does the following:
1. Select a k × n generator matrix G for a t-error correcting binary (n, k)-code for
which there is an efficient decoding algorithm.
2. Randomly select a k × k binary invertible matrix S.
3. Randomly select an n × n permutation matrix P.
4. Compute the k × n matrix $\hat{G} = SGP$.
5. The party's public key is $(n, k, t, \hat{G})$; her private key is (S, G, P).
16.1.53 Algorithm (McEliece encryption scheme) To encrypt a message $m \in \{0, 1\}^k$ for party A,
do the following:
1. Obtain an authentic copy of A's public key $(n, k, t, \hat{G})$.
2. Randomly select an error vector z of length n and Hamming weight t.
3. Compute $c = m\hat{G} + z$.
4. Send c to A.
To decrypt c, party A does the following:
1. Compute $\hat{c} = cP^{-1}$.
2. Use the decoding algorithm for the code generated by G to decode $\hat{c}$ to $\hat{m}$.
3. Compute $m = \hat{m}S^{-1}$.
16.1.54 Remark Decryption works because
$\hat{c} = cP^{-1} = (m\hat{G} + z)P^{-1} = (mSGP + z)P^{-1} = (mS)G + zP^{-1}$.
Since $zP^{-1}$ is a vector of Hamming weight t, the decoding algorithm for the code generated
by G decodes $\hat{c}$ to $\hat{m} = mS$, whence $\hat{m}S^{-1} = m$.
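The key generation, encryption, and decryption steps above can be traced on a toy instance. The following is a minimal sketch, assuming the binary (7, 4) Hamming code (so t = 1) in place of a Goppa code; the matrices G and H, the helper names, and the list encoding of the permutation P are illustrative choices, and a code this small offers no security.

```python
import random

# Systematic generator matrix G = [I_4 | A] of the binary (7, 4) Hamming code (t = 1).
G = [[1, 0, 0, 0, 1, 1, 0],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
# Matching parity-check matrix H = [A^T | I_3]; its columns are the 7 nonzero syndromes.
H = [[1, 1, 0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]

def mat_mul(A, B):
    """Matrix product over F_2."""
    return [[sum(a * b for a, b in zip(row, col)) % 2 for col in zip(*B)] for row in A]

def gf2_inverse(M):
    """Gauss-Jordan inverse over F_2; returns None if M is singular."""
    n = len(M)
    aug = [row[:] + [int(i == j) for j in range(n)] for i, row in enumerate(M)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col]), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col and aug[r][col]:
                aug[r] = [x ^ y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def keygen(rng):
    """Return (public G_hat, private (S_inv, perm)) with G_hat = S G P."""
    while True:
        S = [[rng.randrange(2) for _ in range(4)] for _ in range(4)]
        S_inv = gf2_inverse(S)
        if S_inv is not None:
            break
    perm = list(range(7))
    rng.shuffle(perm)  # P sends a row vector v to (v[perm[0]], ..., v[perm[6]])
    G_hat = [[row[perm[j]] for j in range(7)] for row in mat_mul(S, G)]
    return G_hat, (S_inv, perm)

def encrypt(m, G_hat, rng):
    """c = m G_hat + z with a random error vector z of Hamming weight t = 1."""
    z = [0] * 7
    z[rng.randrange(7)] = 1
    return [(x + e) % 2 for x, e in zip(mat_mul([m], G_hat)[0], z)]

def decrypt(c, private_key):
    S_inv, perm = private_key
    c_hat = [0] * 7
    for j in range(7):          # c_hat = c P^{-1}
        c_hat[perm[j]] = c[j]
    syndrome = [sum(h * x for h, x in zip(row, c_hat)) % 2 for row in H]
    if any(syndrome):           # single error: its position is the matching column of H
        c_hat[[list(col) for col in zip(*H)].index(syndrome)] ^= 1
    m_hat = c_hat[:4]           # G is systematic, so m_hat = mS is the first 4 bits
    return mat_mul([m_hat], S_inv)[0]
```

Calling `keygen(random.Random(1))` and then `decrypt(encrypt(m, G_hat, rng), private_key)` should return `m` for each of the 16 possible messages, mirroring the identity in Remark 16.1.54.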
16.1.55 Remark McEliece’s original paper [2048] proposed using a Goppa code with parameters
n = 1024, k = 524, and t = 50. However, it has recently been shown that the McEliece
encryption scheme with these parameters is insecure [250]. Research is ongoing to determine
parameter sets for which one can have high confidence that the McEliece encryption
scheme will remain resistant to both classical and quantum attacks for the foreseeable future.
See Also
References Cited: [227, 235, 245, 250, 344, 346, 440, 965, 1048, 1065, 1067, 1068, 1521,
1624, 1694, 1771, 1772, 2039, 2048, 2080, 2102, 2413, 2462, 2609, 2623, 2720, 2852]
16.2.1 Remark We present some algorithms for stream ciphers and block ciphers. In stream cipher
cryptography a pseudorandom sequence of bits of length equal to the message length is gen-
erated by a pseudorandom sequence generator (PSG). This sequence is then bitwise XOR-ed
(addition modulo 2) with the message sequence and the resulting sequence is transmitted.
At the receiving end, deciphering is done by generating the same pseudorandom sequence
and bitwise XOR-ing the cipher bits with this sequence. The seed for the pseudorandom
bit generator is the secret key. A general algorithm for a pseudorandom sequence generator
is based on a recursive relation over a finite field, a finite state machine in general and
a feedback shift register sequence in particular, with a filtering function or some control
units.
16.2.2 Remark In block cipher cryptography, the message bits are divided into blocks and each
block is separately provided as an input to a permutation, i.e., encryption, using the same
key and transmitted. A block cipher basically is a permutation of a finite field (or a finite
ring), which is a composition of multiple permutations in a subfield (or subring) of the
finite field (or the finite ring). Most modern day block ciphers are iterated ciphers and use
substitution boxes (S-boxes) (i.e., permutations) as the nonlinear part in the scheme.
16.2.3 Remark We consider stream ciphers like RC4 [2001] and the WG stream cipher [2217],
and block ciphers like RC6 [2461] and AES [762]. The aim is to explain the underlying ideas
rather than describing complete solutions.
16.2.4 Remark Stream ciphers are a very important class of cryptographic primitives for encryp-
tion, authentication, and key derivation. The basic principle behind stream cipher encryp-
tion is simple.
16.2.5 Definition (One-time pad) Let zt , for t ≥ 0, be a random key bit sequence which is known
to both the sender and the receiver. Suppose the sender wants to send a message bit
sequence mt . The cipher bit sequence is computed as ct = mt ⊕ zt , and transmitted to
the receiver. The receiver knowing zt , computes mt = ct ⊕ zt .
16.2.6 Remark This simple scheme provides the highest level of security, called perfect secrecy.
It is unbreakable under the assumption that each key is used only for one encryption.
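As a minimal sketch of Definition 16.2.5 at the byte level (the function name `xor_bytes` and the use of Python's `secrets` module as the source of random key material are assumptions made for illustration), encryption and decryption are the same XOR operation:

```python
import secrets

def xor_bytes(data: bytes, pad: bytes) -> bytes:
    """Bitwise XOR of data with a pad that is at least as long as the data."""
    if len(pad) < len(data):
        raise ValueError("the pad must be at least as long as the message")
    return bytes(d ^ z for d, z in zip(data, pad))

message = b"finite fields"
pad = secrets.token_bytes(len(message))   # z_t: fresh random key, used only once
ciphertext = xor_bytes(message, pad)      # c_t = m_t XOR z_t
recovered = xor_bytes(ciphertext, pad)    # m_t = c_t XOR z_t
```

Reusing `pad` for a second message would reveal the XOR of the two plaintexts, which is why each key may be used for one encryption only.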
16.2.7 Remark The main problem with the one-time pad is that the key sequence is as long as the
message sequence and for each encryption we need a new random key sequence which has to
be shared by sender and receiver. This creates serious key management and key distribution
problems. One remedy is to use a pseudorandom sequence generator (PSG) also known as
keystream generator. A PSG is a deterministic algorithm which starts with a reasonably
short random bit string (called a seed ) and expands it into a very long bit string which is
used as the keystream. The seed is the secret key shared between sender and receiver. The
security of the stream cipher depends on the security of the PSG. Informally a PSG is secure
if given a segment of the generated key bits it is hard to predict the next bit. Equivalently,
it must be computationally very hard to distinguish the generated pseudorandom sequence
from a random sequence.
16.2.8 Remark Most modern stream ciphers use an initialization vector (IV) which is not secret.
The PSG is seeded by the (key, IV) pair. The same key may be used with distinct IVs and
the constraint on the protocol usage is that a (key, IV) pair should not be repeated. Current
stream ciphers have a similar structure which can be described by a finite state machine
(FSM). The Ecrypt home page [989] contains thorough information and may be referred to
by anybody who is interested in the design and analysis of stream ciphers.
16.2.9 Definition (Security assumptions) It is assumed that the adversary knows everything
except the secret key. This is known as Kerckhoffs's principle. A few details are:
1. Algorithms in a stream cipher are public.
2. The only secret information in the system is the pre-shared key.
3. An attacker can intercept communications (ciphertext) among communicating
entities.
16.2.10 Remark From Assumption 3, attackers can always obtain ciphertext. If an attacker
manages to obtain a certain amount of the corresponding plaintext, then this portion of
keystream is exposed. This is referred to as a known plaintext attack. Thus the security
of a stream cipher reduces to the randomness of the PSG output. The attacker’s goal may be to recover
the secret key or partial information about the secret key using a portion of the known
keystream, i.e., using the known portion of the output of PSG.
16.2.11 Definition (The two phases in stream cipher) A stream cipher consists of two phases:
one is the key initialization phase, for which the algorithm is key initialization algorithm
(KIA), and the other is the PSG running phase and the algorithm is PSG. Usually, the
algorithms used in these two phases are similar.
16.2.12 Remark More specifically, KIA is the same as PSG without outputs or it may be a slightly
different function. Figure 16.2.1 shows a general model of a stream cipher.
16.2.13 Remark In the initialization phase, a key initialization algorithm (KIA) is employed, which
has two inputs, one is an initial vector (IV), which is public information, and the other is a
secret key, k, which is a pre-shared encryption key. The goal of KIA is to scramble key bits
with IV in order to get a complex nonlinear function of k and IV. The output of KIA is
provided as an initial value to the PSG. KIA only executes once for each encryption session.
After the key initialization, PSG starts to output a keystream which is used in encryption.
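[Figure 16.2.1: General model of a stream cipher. On both the encryption and the decryption side, the KIA takes the IV and the key k and initializes the PSG, which outputs the keystream z_i. Encryption computes c_i = m_i + z_i; decryption computes m_i = c_i + z_i.]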
16.2.14 Definition (Stream cipher modeled as a Finite State Machine (FSM)) In general, any PSG
can be considered as a finite state machine (FSM) or some variant of it which may be
defined as follows. Suppose Y and Z are finite fields (or finite rings) and the elements
of Z are represented by m bits. An FSM is a 5-tuple $(S_0, F, G, n, m)$ where $S_0 \in Y^n$ is
the initial state, $F : Y^n \times Z \to Y^n$ is the state update function and $G : Y^n \times Z \to Z$
is the output function.
16.2.15 Remark In modeling practical stream ciphers, in many cases it is seen that, for any St ,
F (St , zt ) and G(St , zt ) do not depend on zt . In this situation we write
16.2.16 Remark RC4 was designed by Rivest in 1987 and kept as a trade secret until it was leaked
in 1994. It is widely used in Internet communications. In the open literature, RC4 is one
of the very few proposed keystream generators that are not based on shift registers. A
design approach of RC4 which has originated from the exchange-shuffle paradigm, is to use
a relatively big array/table that slowly changes with time under the control of itself. For a
detailed discussion on RC4 see the Master’s thesis of Mantin [2001].
16.2.17 Definition RC4 has an N-stage register S, which holds a permutation of all $N = 2^n$
possible n-bit integers, where n is typically chosen as 8. The initial state is derived from
a key (whose typical size is between 40 and 256 bits) by a Key-Scheduling Algorithm
(KSA), i.e., Key initialization algorithm (KIA). The PSG is referred to as the Pseudo-
Random Generation Algorithm (PRGA) in RC4.
16.2.18 Remark In what follows, a = b (mod n) is understood to mean that b is the remainder of a
when divided by n.
16.2.19 Definition (KSA of RC4) FSM for KSA for a given secret key K of length l bytes is a
4-tuple (Q0 , FK , r, m). The secret key is used to scramble S by shuffling the words in S.
Suppose S = (x0 , x1 , . . . , xN −1 ); we denote a state of RC4 by (i, j, S) or equivalently by
(i, j, x0 , x1 , . . . , xN −1 ). Let (i, j, x0 , x1 , . . . , xN −1 ) ∈ R be a state at some time instant
and let (e, d, y0 , y1 , . . . , yN −1 ) = FK (i, j, x0 , x1 , . . . , xN −1 ) ∈ R be the next state. In the
initialization process, i and j are initialized to 0, the identity permutation (0, 1, . . . , N −
1) is loaded in the array S. Thus we have the initial state Q0 = (0, 0, 0, 1, 2, . . . , 255) of
KSA.
16.2.20 Definition (PRGA of RC4.) RC4 keystream generator (PRGA) can also be represented
as a finite state machine (I0 , F, G, r, m) where F is the state updating function and G
is the output function. Let (i, j, x0 , x1 , . . . , xN −1 ) ∈ R be a state at some time instant
and let (e, d, y0 , y1 , . . . , yN −1 ) = F (i, j, x0 , x1 , . . . , xN −1 ) ∈ R be the next state. Figure
16.2.2 shows the state transition of PRGA.
[Figure 16.2.2: State transition of the PRGA of RC4. The indices are updated as e = i + 1 and d = j + x_e, the output word is x_t with t = x_e + x_d (mod N).]
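The KSA and PRGA can be sketched in a few lines. This follows the commonly published description of RC4 rather than anything specific to this section: the key-dependent swap loop in `rc4_ksa` and the swap of x_e and x_d in `rc4_prga` are taken from that standard description, and the function names are illustrative.

```python
def rc4_ksa(key: bytes, n_bits: int = 8) -> list:
    """Key scheduling: start from the identity permutation and shuffle it with the key."""
    N = 1 << n_bits
    S = list(range(N))
    j = 0
    for i in range(N):
        j = (j + S[i] + key[i % len(key)]) % N
        S[i], S[j] = S[j], S[i]
    return S

def rc4_prga(S: list, length: int) -> bytes:
    """Keystream generation: e = i + 1, d = j + x_e, swap x_e and x_d, output x_t."""
    N = len(S)
    S = S[:]                      # work on a copy so the caller's state is untouched
    i = j = 0
    out = []
    for _ in range(length):
        i = (i + 1) % N           # e = i + 1
        j = (j + S[i]) % N        # d = j + x_e
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) % N])   # t = x_e + x_d (mod N)
    return bytes(out)

def rc4_crypt(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt by XOR-ing the data with the keystream."""
    keystream = rc4_prga(rc4_ksa(key), len(data))
    return bytes(d ^ z for d, z in zip(data, keystream))
```

Because encryption is an XOR with the keystream, applying `rc4_crypt` twice with the same key returns the original data.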
16.2.21 Remark (Attacks on RC4) RC4 has a huge internal state of 8×258 = 2064 bits. We observe
that in RC4, the state update function is invertible. If the size of the internal state is s in
bits (s = (N + 2)(log N ) = 2064 in RC4) and the next state update function is randomly
chosen, then the average cycle length is about $2^{s-1}$ [1082]. However, it is hard theoretically
to determine any randomness properties for RC4. Cryptanalysis of RC4 attracted a lot
of attention in the cryptographic community after it was made public in 1994. Numerous
significant weaknesses were discovered; notable weaknesses include weak initialization
vectors, classes of weak keys, patterns that appear twice the expected number of times (the
second byte bias), and biased distribution of RC4 initial permutation. Weaknesses in the
key scheduling algorithm in RC4 led to a practical attack on the security protocol WEP.
Currently, it has been proposed to use AES in WEP due to these weaknesses of RC4.
16.2.22 Remark We now introduce the WG stream cipher which was submitted to the eSTREAM
project in 2005 by Nawaz and Gong [989]. The cipher is based on WG (Welch-Gong)
transformations. WG cipher has desired randomness properties, like long periods, large
linear complexity, two level autocorrelation and ideal t-tuple distribution. It is resistant to
Time/Memory/Data tradeoff attacks, algebraic attacks, and correlation attacks. The cipher
can be implemented with a small amount of hardware [1835].
16.2.23 Definition A WG cipher can be regarded as a nonlinear filter generator over an exten-
sion field, filtered by a WG transformation. As shown in Figure 16.2.3, it consists of a
linear feedback shift register, followed by a WG permutation transform. The LFSR
is based on a primitive polynomial p of degree l over the finite field $F_{2^m}$, given by
$p(x) = \sum_{i=0}^{l} c_i x^i$, $c_i \in F_{2^m}$. The LFSR generates a maximal-length sequence (an
m-sequence) over $F_{2^m}$. This simple design generates a keystream whose period is $2^n - 1$,
where n = lm, and it is easy to analyze various cryptographic properties of the generated
keystream. The feedback signal Init is used only in the key initialization phase. In
the PSG running phase, the feedback is only from the LFSR. The output of the cipher is
one bit. We denote a WG cipher with an LFSR of l stages over $F_{2^m}$ as WG(m, l).
16.2.24 Remark The version of the WG submitted to eSTREAM [989] is denoted by W G(29, 11).
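[Figure 16.2.3: A diagram for WG ciphers. An LFSR with cells $a_{l-1}, a_{l-2}, \ldots, a_1, a_0$ and feedback coefficients $c_{l-1}, c_{l-2}, \ldots, c_1, c_0$ feeds the WG permutation WGperm (m bits wide); the signal Init returns the WGperm output to the LFSR feedback, and the trace Tr produces the 1-bit output.]
Update of LFSR:
$a_{k+l} = \sum_{i=0}^{l-1} c_i a_{i+k} + WGperm(a_{k+l-1})$, for $0 \le k < 2l$ (in KIA phase),
$a_{k+l} = \sum_{i=0}^{l-1} c_i a_{i+k}$, for $k \ge 2l$ (in PSG).
Output: $s_k = WG(a_{k+l-1})$, $k \ge 2l$.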
$WGperm(x) = t(x + 1) + 1$
$t(x) = x + x^{r_1} + x^{r_2} + x^{r_3} + x^{r_4}$
$WG(x) = Tr(WGperm(x))$
16.2.26 Remark Note that a WG transformation exists only if $m \not\equiv 0 \pmod 3$ (see [864]). In
practice, we consider a value of m to be a reasonable choice for a WG cipher where $m \not\equiv 0$
(mod 3) and either of the following holds: m is small enough to allow an efficient lookup
table implementation of the permutation (m ≤ 11), or m ≡ 2 (mod 3) and m has an
optimal normal basis for efficient implementation in hardware; see Sections 5.3 and 16.7.
The suitable values of m for 7 ≤ m ≤ 29 are 7, 8, 10, 11, 23, and 29.
16.2.27 Remark Here we compute exponents, i.e., ri ’s from [864] so that t is a permutation poly-
nomial. The exponents used in [2217] which are taken from [2294] are different. But W G
sequences are identical for both representations.
16.2.28 Theorem [1303] The linear span of the WG cipher can be determined by the following
formula
$LS = m \times \sum_{i \in I} l^{w(i)}$, where w(i) is the Hamming weight of i,
basis in the finite field $F_{2^{29}}$. The parameters for implementation are listed in
Table 2.
m  | l  | WG and Polynomials
29 | 11 | $WG(x) = Tr(t(x + 1) + 1)$ where
   |    | $t(x) = x + x^{r_1} + x^{r_2} + x^{r_3} + x^{r_4}$ and
   |    | $r_1 = 2^{10} + 1$, $r_2 = 2^{20} + 2^{10} + 1$,
   |    | $r_3 = 2^{20} - 2^{10} + 1$, $r_4 = 2^{20} + 2^{10} - 1$
3. Key Initialization Phase for W G(29, 11): An initial state of the LFSR contains
319 bits where each register holds 29 bits. For a 128-bit key and 128-bit IV
(initial vector), the rule for loading the LFSR is shown in Table 3. Here x||y =
(x0 , . . . , xr−1 , y0 , . . . , ys−1 ) is the concatenation of two vectors x = (x0 , · · · , xr−1 )
and y = (y0 , . . . , ys−1 ). Once the LFSR has been loaded with the key and IV,
the key stream generator is run for 22 clock cycles. This is the key-initialization
phase of the cipher operation. During this phase the 29 bit vector of the output
of the WG permutation is added to the feedback of the LFSR which is then used
to update the LFSR.
4. PSG phase: After running the KIA for 2l = 22 clock cycles, PSG starts to give
1 bit output for each clock cycle. At this phase, the feedback to LFSR from the
output of the WG permutation stops.
16.2.35 Remark We note that the WG stream cipher was not selected for Phase III of the eSTREAM
project since, although no attacks against WG were reported, the cipher is compromised
if the restriction of generating at most $2^{45}$ bits from a single key is relaxed. Also the hardware
implementation seems to be larger than desirable; see [990]. However, the linear span can
be increased up to at least $29 \times 11^{28} \approx 2^{101.722}$ using the decimation method as it is done
for WG7 in [1982]. A hardware implementation has recently been reported in [1835], which
shows different implementation methods and optimizations.
16.2.36 Definition A block cipher consists of a pair of encryption and decryption operators E
and D. For a fixed key K, E maps an n bit plaintext m = (m0 , . . . , mn−1 ) to an n bit
ciphertext c = (c0 , . . . , cn−1 ), namely encryption, and D maps the ciphertext back to
the plaintext, i.e., decryption. In other words,
$E : K \times F_2^n \to F_2^n$
$(D \circ E)(m) = m$
where m is an n-bit message, K is the set of possible keys and $F_2^n$ is the set consisting
of all n-bit vectors.
16.2.39 Definition (Feistel Structure) [2080] The Feistel Structure is a feedback shift register with
time varying feedback function. In other words, the round function is a time varying
feedback function. For example, DES is of a Feistel structure, the input Mi−1 to the
i-th round is divided into two equal halves Li−1 and Ri−1 , i.e., Mi−1 = Li−1 ||Ri−1 . The
output Mi = (Li , Ri ) is defined as follows
16.2.40 Remark Note that for the invertibility of the round function, f (·, ·) need not be invert-
ible. The security of the encryption algorithm depends on the design of f (·, ·) and the key
scheduling algorithm. The block cipher RC6 and many other block ciphers have this struc-
ture. Here we consider RC6 as an example of a block cipher based on the Feistel structure.
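The claim that f need not be invertible can be checked on a toy example. The sketch below assumes the standard Feistel round $L_i = R_{i-1}$, $R_i = L_{i-1} \oplus f(R_{i-1}, K_i)$; the 32-bit block size, the particular function `f`, and the round keys are hypothetical choices made only for illustration.

```python
HALF_BITS = 16
HALF_MASK = (1 << HALF_BITS) - 1

def f(half: int, round_key: int) -> int:
    """A deliberately non-injective round function: the low byte is discarded."""
    return ((half * half) ^ round_key) & 0xFF00

def feistel_encrypt(block: int, round_keys) -> int:
    left, right = block >> HALF_BITS, block & HALF_MASK
    for k in round_keys:
        # L_i = R_{i-1},  R_i = L_{i-1} XOR f(R_{i-1}, K_i)
        left, right = right, left ^ f(right, k)
    return (left << HALF_BITS) | right

def feistel_decrypt(block: int, round_keys) -> int:
    left, right = block >> HALF_BITS, block & HALF_MASK
    for k in reversed(round_keys):
        # Undo one round: R_{i-1} = L_i,  L_{i-1} = R_i XOR f(L_i, K_i)
        left, right = right ^ f(left, k), left
    return (left << HALF_BITS) | right

round_keys = [0x1234, 0xBEEF, 0x0F0F, 0xA5A5]
ciphertext = feistel_encrypt(0xDEADBEEF, round_keys)
plaintext = feistel_decrypt(ciphertext, round_keys)
```

Decryption only ever evaluates f in the forward direction, which is why the round is invertible regardless of f.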
16.2.41 Definition (Substitution-Permutation Network (SPN)) [2080] In an SPN, each round func-
tion consists of a few successive layers. The input to a substitution layer is divided into
small blocks of bits say blocks of eight bits each. An S-box is applied to each block.
Each S-box is a bijective map, so that entire substitution layer is also a bijective map.
The effect of a substitution layer is local in the sense that an output bit in a particular
position depends only on a few of the input bits in its nearby positions. This local effect
is compensated by having a permutation layer which permutes its input bits. The round
key is usually incorporated at the beginning or at the end of the round function.
16.2.42 Remark We consider AES Rijndael as an example of a block cipher based on the SPN
structure; see Subsection 16.2.6.
16.2.5 RC6
16.2.43 Remark This section is mainly from [2461]. RC6 is a symmetric key block cipher which
encrypts 128-bit plaintext blocks to 128-bit ciphertext blocks and supports key sizes of
128, 192, and 256 bits. It was designed by Rivest, Robshaw, Sidney, and Yin to meet the
requirements of the Advanced Encryption Standard (AES) competition [2461].
16.2.44 Remark In general RC6 is specified as RC6-w/r/l where the word size is w bits, encryption
consists of r rounds (generally r is 20), and l denotes the length of the encryption key in
bytes.
16.2.45 Definition The encryption process involves three types of operations. Let
$x = (x_0, \ldots, x_{w-1}) \in F_2^w$ and $y = (y_0, \ldots, y_{w-1}) \in F_2^w$ (note that x and y can be treated
16.2.46 Remark The block cipher RC6 has the following features [2461]. It is fast and simple and
the best attack on RC6 appears to be exhaustive key search.
16.2.47 Definition (Encryption and decryption) RC6 is of the Feistel structure. In detail, RC6 is a
feedback shift register with time varying feedback and four w-bit registers which contain
the input plaintext (a0 , a1 , a2 , a3 ) and the output ciphertext (ar , ar+1 , ar+2 , ar+3 ) at
the end of encryption. Round keys S0 , . . . , S2r+3 are obtained from the key schedule
algorithm, where each array element Si is of w bits (see Definition 16.2.49 and Figure
16.2.7). The state updating is done as follows:
where gi is defined in the for loop of the Algorithm Enc(); see Figure 16.2.7.
16.2.48 Remark From the encryption algorithm, it can be easily seen that the process is invert-
ible. The decryption algorithm is very similar to the encryption algorithm. It is a good
exercise to write the decryption algorithm from the encryption algorithm. The input is
(ar , ar+1 , ar+2 , ar+3 ), the round keys are in the reversed order (S2r+3 , . . . , S0 ), ≪ is re-
placed by ≫ and + is replaced by − in proper places of the encryption algorithm. See
[2461] for a detailed description of the decryption algorithm.
Algorithm Enc()
Output: Ciphertext $(a_r, a_{r+1}, a_{r+2}, a_{r+3})$.
Procedure:
  $a_1 = a_1 + S_0$
  $a_3 = a_3 + S_1$
  For i = 1 to r do
    $t = f(a_{(i-1)+1})$
    $u = f(a_{(i-1)+3})$
    $a_{i-1} = ((a_{i-1} \oplus t) \lll u) + S_{2i}$
    $a_{(i-1)+2} = ((a_{(i-1)+2} \oplus u) \lll t) + S_{2i+1}$
    The i-th state: $(a_i, a_{i+1}, a_{i+2}, a_{i+3}) = (a_{(i-1)+1}, a_{(i-1)+2}, a_{(i-1)+3}, a_{i-1})$
  End For
  $a_r = a_r + S_{2r+2}$
  $a_{r+2} = a_{r+2} + S_{2r+3}$
End Algorithm.
(Here $f(x) = x(2x + 1) \lll \lg w$)

Key schedule algorithm
Output: Round keys $S_0, \ldots, S_{2r+3}$.
Procedure:
  $S_0 = P_w$
  For i = 1 to 2r + 3 do
    $S_i = S_{i-1} + Q_w$
  End For
  A = B = i = j = 0
  $v = 3 \times \max\{c, 2r + 4\}$
  For s = 1 to v do
    $A = S_i = (S_i + A + B) \lll 3$
    $B = L_j = (L_j + A + B) \lll (A + B)$
    i = (i + 1) (mod (2r + 4))
    j = (j + 1) (mod c)
  End For
End Algorithm.

Figure 16.2.7 Encryption and key schedule algorithm for RC6 [2461].
16.2.49 Definition (Key schedule) The user supplies a key of l bytes, where 0 ≤ l ≤ 255. Sufficient
zero bytes are appended to give a key length equal to an integral number (say c) of
words, and it is stored in L0 , . . . , Lc−1 . From this key, 2r + 4 words are derived and
stored in the array S0 , . . . , S2r+3 . The constants P32 = B7E15163 and Q32 = 9E3779B9
(hexadecimal) are derived from the binary expansion of e − 2 (e is the base of natural
logarithm) and φ − 1 (φ is the Golden Ratio).
16.2.50 Remark Figure 16.2.7 gives a description of the key schedule algorithm.
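The two procedures of Figure 16.2.7, together with the decryption procedure outlined in Remark 16.2.48, can be sketched as follows for w = 32 and r = 20. This is a sketch under stated assumptions: loading key and block bytes into words in little-endian order, treating an empty key as one zero word, and the function names are choices made here, not details given in this section.

```python
import struct

W, ROUNDS = 32, 20
MASK = (1 << W) - 1
P32, Q32 = 0xB7E15163, 0x9E3779B9

def rotl(x, s):
    s %= W                      # only the low lg(w) bits of the amount are used
    return ((x << s) | (x >> (W - s))) & MASK

def rotr(x, s):
    s %= W
    return ((x >> s) | (x << (W - s))) & MASK

def key_schedule(key: bytes, r: int = ROUNDS) -> list:
    c = max(1, (len(key) + 3) // 4)              # number of key words (assumed 1 if empty)
    L = list(struct.unpack("<%dI" % c, key.ljust(4 * c, b"\0")))
    S = [P32]
    for _ in range(2 * r + 3):
        S.append((S[-1] + Q32) & MASK)
    A = B = i = j = 0
    for _ in range(3 * max(c, 2 * r + 4)):
        A = S[i] = rotl((S[i] + A + B) & MASK, 3)
        B = L[j] = rotl((L[j] + A + B) & MASK, A + B)
        i = (i + 1) % (2 * r + 4)
        j = (j + 1) % c
    return S

def f(x):
    return rotl((x * (2 * x + 1)) & MASK, 5)     # lg w = 5 for w = 32

def encrypt(block: bytes, S: list, r: int = ROUNDS) -> bytes:
    a = list(struct.unpack("<4I", block))
    a[1] = (a[1] + S[0]) & MASK
    a[3] = (a[3] + S[1]) & MASK
    for i in range(1, r + 1):
        t, u = f(a[1]), f(a[3])
        a[0] = (rotl(a[0] ^ t, u) + S[2 * i]) & MASK
        a[2] = (rotl(a[2] ^ u, t) + S[2 * i + 1]) & MASK
        a = a[1:] + a[:1]                        # shift the register by one word
    a[0] = (a[0] + S[2 * r + 2]) & MASK
    a[2] = (a[2] + S[2 * r + 3]) & MASK
    return struct.pack("<4I", *a)

def decrypt(block: bytes, S: list, r: int = ROUNDS) -> bytes:
    a = list(struct.unpack("<4I", block))
    a[2] = (a[2] - S[2 * r + 3]) & MASK
    a[0] = (a[0] - S[2 * r + 2]) & MASK
    for i in range(r, 0, -1):                    # round keys in reversed order
        a = a[-1:] + a[:-1]                      # undo the register shift
        t, u = f(a[1]), f(a[3])
        a[2] = rotr((a[2] - S[2 * i + 1]) & MASK, t) ^ u
        a[0] = rotr((a[0] - S[2 * i]) & MASK, u) ^ t
    a[3] = (a[3] - S[1]) & MASK
    a[1] = (a[1] - S[0]) & MASK
    return struct.pack("<4I", *a)
```

For example, with `S = key_schedule(bytes(16))`, the call `decrypt(encrypt(block, S), S)` should return any 16-byte `block` unchanged, which is the invertibility noted in Remark 16.2.48.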
16.2.51 Remark Some of the features of AES are as follows; see [762] for a more detailed description
of AES. There are some differences between Rijndael and AES. Rijndael provides for several
choices of block and key sizes. AES adopted only a subset of these parameter choices. Here
we are ignoring these differences.
1. There are three allowable block lengths: 128, 192, and 256 bits.
2. There are three allowable key lengths (independent of selected block length): 128,
192, and 256 bits.
3. The number of rounds is 10, 12, or 14, depending on the key length.
4. Each round consists of three functions, which are in four “layers” as
a. 8-bit inverse permutation (sub-byte transform),
b. 32-bit linear transformation (mix columns operation),
c. 128-bit permutation (shift rows operation) and
d. round key addition.
16.2.52 Definition (High level description of AES) A plaintext m of 128 bits is the initial state
which is represented as a four by four array of bytes (see Figure 16.2.8).
1. For a given plaintext m, the initial state is m. Perform an AddRoundKey op-
eration which is xor of the RoundKey with the initial State.
2. For each of the r − 1 rounds perform a SubByte operation on State using an
S-box; perform a ShiftRows on State; perform an operation called MixColumns
on State; and perform an AddRoundKey operation.
3. For the r-th round, perform SubByte; perform ShiftRows and perform Ad-
dRoundKey.
4. The final State y is the ciphertext.
[Figure 16.2.8: Structure of AES. 0-th round: the 128-bit message M is XOR-ed with K_0 (the 0-th round key). Repeated for r − 1 rounds: sixteen 8-bit S-boxes, Shift Rows and Mix Columns, XOR with K_i (the i-th round key). r-th round: sixteen 8-bit S-boxes and Shift Rows, producing the 128-bit ciphertext C.]
16.2.53 Remark (Algebraic structure of AES Rijndael) [762] Rijndael uses the finite field $F_{2^8}$ defined
by the irreducible polynomial $p(x) = x^8 + x^4 + x^3 + x + 1$. Let α be a root of p, i.e., p(α) = 0
in $F_{2^8}$. We use the classical polynomial representation, in which the elements of $F_{2^8}$ are
the polynomials in α of degree less than or equal to 7 with coefficients from
$F_2$. So we can identify an element of $F_{2^8}$ with an 8-bit vector.
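Arithmetic in this representation is easy to sketch: addition is XOR of the 8-bit vectors, and multiplication is polynomial multiplication reduced modulo p, whose bit pattern is 0x11B. The function names below are illustrative. The sketch also computes multiplicative orders; for this modulus the root α (byte 0x02) has order 51, while α + 1 (byte 0x03) has order 255 and so generates the multiplicative group.

```python
AES_MODULUS = 0x11B   # bit pattern of x^8 + x^4 + x^3 + x + 1

def gf_mul(a: int, b: int) -> int:
    """Multiply two elements of F_{2^8}, each given as an 8-bit integer."""
    result = 0
    while b:
        if b & 1:
            result ^= a            # addition in characteristic 2 is XOR
        a <<= 1                    # multiply a by alpha
        if a & 0x100:
            a ^= AES_MODULUS       # reduce modulo p(alpha)
        b >>= 1
    return result

def multiplicative_order(a: int) -> int:
    """Smallest k >= 1 with a^k = 1, for a nonzero element a."""
    k, x = 1, a
    while x != 1:
        x = gf_mul(x, a)
        k += 1
    return k

order_alpha = multiplicative_order(0x02)
order_alpha_plus_1 = multiplicative_order(0x03)
```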
16.2.54 Remark We introduce the following ring of matrices:
In other words, for each matrix in M4 (F28 ), the entries are taken from F28 , i.e., each element
of the matrix has 8-bit or one byte representation, and each row or column can be considered
as a 32-bit word.
16.2.55 Remark For 128-bit version of Rijndael block cipher, a message M of 128 bits is parsed as
16 bytes and then further parsed as a 4 by 4 matrix:
M = (m0 , m1 , m2 , m3 , m4 , m5 , m6 , m7 , m8 , m9 , m10 , m11 , m12 , m13 , m14 , m15 ),
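where $m_i \in F_{2^8}$, and the initial state $M_0 \in M_4(F_{2^8})$ is given as
$$M_0 = \begin{pmatrix} m_0 & m_4 & m_8 & m_{12} \\ m_1 & m_5 & m_9 & m_{13} \\ m_2 & m_6 & m_{10} & m_{14} \\ m_3 & m_7 & m_{11} & m_{15} \end{pmatrix}.$$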
16.2.56 Remark Three basic operators for AES are SubByte, ShiftRow, and MixColumn.
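16.2.58 Definition ShiftRow transform R and its inverse on a state X are given as follows
$$R(X) = \begin{pmatrix} x_{00} & x_{01} & x_{02} & x_{03} \\ x_{11} & x_{12} & x_{13} & x_{10} \\ x_{22} & x_{23} & x_{20} & x_{21} \\ x_{33} & x_{30} & x_{31} & x_{32} \end{pmatrix}, \quad R^{-1}(X) = \begin{pmatrix} x_{00} & x_{01} & x_{02} & x_{03} \\ x_{13} & x_{10} & x_{11} & x_{12} \\ x_{22} & x_{23} & x_{20} & x_{21} \\ x_{31} & x_{32} & x_{33} & x_{30} \end{pmatrix}.$$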
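16.2.59 Definition MixColumn transform L is a linear transform on $F_{2^8}^4$. Recall that α is a root
of p(x) in $F_{2^8}$. Given a state X, L(X) = LX where LX is the matrix multiplication of
L and X over $F_{2^8}$. The linear transform L and its inverse are given as follows
$$L = \begin{pmatrix} \alpha & 1+\alpha & 1 & 1 \\ 1 & \alpha & 1+\alpha & 1 \\ 1 & 1 & \alpha & 1+\alpha \\ 1+\alpha & 1 & 1 & \alpha \end{pmatrix}, \quad L^{-1} = \begin{pmatrix} \beta_0 & \beta_3 & \beta_2 & \beta_1 \\ \beta_1 & \beta_0 & \beta_3 & \beta_2 \\ \beta_2 & \beta_1 & \beta_0 & \beta_3 \\ \beta_3 & \beta_2 & \beta_1 & \beta_0 \end{pmatrix},$$
where $\beta_0 = \alpha^3 + \alpha^2 + \alpha$, $\beta_1 = \alpha^3 + 1$, $\beta_2 = \alpha^3 + \alpha^2 + 1$, and $\beta_3 = \alpha^3 + \alpha + 1$.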
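The relation between L and its inverse can be checked numerically. The sketch below encodes α as the byte 0x02, so that 1 + α is 0x03 and the $\beta_i$ become 0x0E, 0x09, 0x0D, and 0x0B; the helper names and the sample column are illustrative assumptions.

```python
AES_MODULUS = 0x11B   # x^8 + x^4 + x^3 + x + 1

def gf_mul(a: int, b: int) -> int:
    """Multiplication in F_{2^8} on 8-bit integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= AES_MODULUS
        b >>= 1
    return result

def mat_vec(M, v):
    """Matrix-vector product over F_{2^8}; sums are XORs."""
    out = []
    for row in M:
        acc = 0
        for m_ij, v_j in zip(row, v):
            acc ^= gf_mul(m_ij, v_j)
        out.append(acc)
    return out

def mat_mul(A, B):
    """Matrix product over F_{2^8}, computed column by column."""
    cols = [mat_vec(A, list(col)) for col in zip(*B)]
    return [list(row) for row in zip(*cols)]

A1, A2, A3 = 0x01, 0x02, 0x03               # 1, alpha, 1 + alpha
B0, B1, B2, B3 = 0x0E, 0x09, 0x0D, 0x0B     # beta_0, ..., beta_3

L = [[A2, A3, A1, A1],
     [A1, A2, A3, A1],
     [A1, A1, A2, A3],
     [A3, A1, A1, A2]]
L_INV = [[B0, B3, B2, B1],
         [B1, B0, B3, B2],
         [B2, B1, B0, B3],
         [B3, B2, B1, B0]]

identity = [[int(i == j) for j in range(4)] for i in range(4)]
product = mat_mul(L, L_INV)                 # should equal the identity matrix
column = [0xDB, 0x13, 0x53, 0x45]           # one column of a state X
mixed = mat_vec(L, column)                  # the corresponding column of L(X)
```

Applying `mat_vec(L_INV, mixed)` recovers `column`, since MixColumn acts on each column of the state independently.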
16.2.61 Remark The total number of round key bits is equal to the block length times (the number
of rounds plus 1). The 128-bit version needs 10 rounds. Thus 128 × 11 = 1408 bits, or 44 words, of round
key are needed, and the key schedule should expand a 128-bit key to round keys with a
total of 1408 key bits. Let $\{k_i\}_{i=0}^{43}$ be a sequence of words, $k_i \in F_{2^8}^4$, each of which consists of 4
bytes. Let $(k_0, k_1, k_2, k_3)$ be the 128-bit session key. The sequence $\{k_i\}$ is used as the round
keys. The expansion of the key is shown in Table 16.2.9.
Table 16.2.9 Key scheduling, where $k_i = (k_{i0}, k_{i1}, k_{i2}, k_{i3})$, i = 0, 1, . . . , 43 and $k_{ij} \in F_{2^8}$.
16.2.62 Definition (Rijndael encryption and decryption) For 128-bit version of Rijndael, a message
M of 128 bits is parsed as a 4 by 4 matrix. The number of rounds is equal to 10. The
process of computation of the cipher C (again written as a 4 by 4 matrix) is shown in
Table 16.2.10. When viewed as an FSM, the initial state of the Rijndael block cipher is
$M_0$, the final state is $M_{10}$ (which is the output, i.e., ciphertext) and the state update
function is a map $F_{2^8}^{16} \to F_{2^8}^{16}$ as shown in Table 16.2.10.
Encryption                                      | Decryption
$M_0 = M + K_0$, in $M_4(F_{2^8})$              | $C_0 = C + K_{10}$
$M_i = H(M_{i-1}) + K_i$, $1 \le i \le 9$       | $C_1 = G^{-1}(C_0) + K_9$
$M_{10} = G(M_9) + K_{10}$                      | $C_i = H^{-1}(C_{i-1}) + K_{10-i}$, $2 \le i \le 10$
The ciphertext is $C = M_{10}$.                 | The plaintext is $M = C_{10}$.
See Also
References Cited: [762, 864, 989, 990, 1082, 1303, 1317, 1835, 1982, 2001, 2080, 2217, 2294,
2461]
16.3.1 Remark Due to limited space only a few key areas more directly related to the theory of
finite fields are covered. For a more complete reference, readers should consult [894, 882].
This section grows out of [894] but with new materials from the last two years added.
16.3.2 Remark Multivariate public key cryptosystems are motivated by the need to develop new
cryptosystems that have the potential to resist future quantum computer attacks. In ad-
dition, multivariate public key cryptosystems are also motivated by the need to develop
efficient public key cryptosystems that could be used in small computing devices with lim-
ited computing and memory capacities like sensors, radio-frequency identification (RFID)
tags, and other similar small devices.
16.3.3 Remark The foundation of any public key cryptosystem is a class of “trapdoor one-way
functions.” The fundamental mathematical structure of such a class of functions determines
all the basic characteristics of a public key cryptosystem. In the case of multivariate (public-
key) cryptosystems (MPKCs), the trapdoor one-way function is usually in the form of a
multivariate quadratic polynomial map over a finite field.
16.3.4 Definition For a MPKC, the public key is, in general, given by a set of quadratic polyno-
mials:
$P(x_1, \ldots, x_n) = (p_1(x_1, \ldots, x_n), \ldots, p_m(x_1, \ldots, x_n))$,
where each $p_k$ is a (usually quadratic) nonlinear polynomial in $X = (x_1, \ldots, x_n)$:
$y_k = p_k(X) := \sum_i P_{ik} x_i + \sum_i Q_{ik} x_i^2 + \sum_{i>j} R_{ijk} x_i x_j + S_k$. (16.3.1)
16.3.5 Remark The evaluation of these polynomials corresponds to either the encryption procedure
or the verification procedure.
16.3.6 Remark Most of the constructions are quadratic constructions due to the consideration of
the efficiency of encryption determined by the key size.
16.3.7 Remark Inverting a multivariate quadratic map is generally equivalent to solving a set of
quadratic equations over a finite field, or the following multivariate quadratic (MQ) problem.
16.3.8 Definition (MQ problem) Solve a system p1 (X) = p2 (X) = · · · = pm (X) = 0, where each
pi is a quadratic polynomial in X = (x1 , . . . , xn ). All coefficients and variables are in
Fq .
contrast, the security of RSA-type cryptosystems relies on the hardness of integer factor-
ization and is based on number theory developed in the 17th and 18th centuries. Elliptic
curve cryptosystems employ the mathematical theory developed in the 19th century. This
is a remark from Whitfield Diffie at the RSA Europe conference in Paris in 2002. Algebraic
geometry, the mathematics that MPKCs depend on, was developed in the 20th century.
16.3.11 Remark This section is organized as follows: Subsection 16.3.1 provides a sketch of how
MPKCs work in general; Subsection 16.3.2 describes the known trapdoor constructions in
more detail; Subsection 16.3.3 describes the most important modes of attacks; the last
subsection is about future research directions in this area.
16.3.12 Remark After Diffie-Hellman [860], cryptographers proposed many trapdoor functions. The
earliest published proposal of MPKC schemes seemed to have arisen in Japan [2029, 2822,
2823] in the early 1980s. These papers were published in Japanese, and remained largely
unknown outside Japan.
16.3.13 Remark The first article in English describing a public key cryptosystem with more than
one independent variable may be the one from Ong et al. [2320], and the first use of more than
one equation is by Fell and Diffie [1049]. The earliest attempt bearing some resemblance to
today’s MPKCs (with 4 variables) seems to be [2029]. In 1988, the first MPKC in the current
form appeared in [2028], and the basic construction described below (Subsection 16.3.1.1)
has not really changed much since.
16.3.14 Definition A usual MPKC has a private map Q, which is the central map and it belongs
to a certain class of quadratic maps each of which can be efficiently inverted.
16.3.15 Definition The basic construction of the public key is derived via composition with two
affine maps S, T .
$P = T \circ Q \circ S : F_q^n \to F_q^m$,
where the maps S, T are affine (sometimes linear) invertible maps on $F_q^n$ and $F_q^m$,
respectively.
16.3.16 Remark The purpose of T and S is to hide the trap door Q. The key of a MPKC is indeed
the design of the central map.
16.3.17 Remark
The public key consists of the polynomials in P.
The secret key consists of the information in S, T , and Q.
To verify a signature or to encrypt a block, one simply computes
$Y = P(X)$.
To sign or to decrypt a block, one computes
$X = P^{-1}(Y) = S^{-1} \circ Q^{-1} \circ T^{-1}(Y)$,
which is computed via the composition factors in turn. Notice that by the inverse
of a map here, we mean finding one of possibly many pre-images, not necessarily
an inverse function in the strict mathematical sense.
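The composition P = T ∘ Q ∘ S and its inversion factor by factor can be sketched on a toy example over $F_7$ with n = m = 3. Everything specific here is a hypothetical choice made for illustration: the triangular central map Q, the matrices and offsets of S and T, and the function names. Triangular central maps of this kind are not secure; the sketch only shows the mechanics. It uses `pow(x, -1, p)`, available from Python 3.8.

```python
from itertools import product

P_MOD = 7   # the field F_q with q = 7

def mat_inv(M, p=P_MOD):
    """Gauss-Jordan inverse of a square matrix over F_p (M must be invertible)."""
    n = len(M)
    aug = [[x % p for x in row] + [int(i == j) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col])
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, p)
        aug[col] = [x * inv % p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                factor = aug[r][col]
                aug[r] = [(x - factor * y) % p for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def affine(M, v, x, p=P_MOD):
    """x -> M x + v over F_p."""
    return [(sum(m * xi for m, xi in zip(row, x)) + vi) % p for row, vi in zip(M, v)]

def affine_inverse(M_inv, v, y, p=P_MOD):
    """y -> M^{-1} (y - v) over F_p."""
    return affine(M_inv, [0] * len(y), [(yi - vi) % p for yi, vi in zip(y, v)], p)

# Secret affine maps S and T (hypothetical values; both matrices are invertible mod 7).
S_MAT, S_VEC = [[1, 2, 0], [0, 1, 3], [4, 0, 1]], [1, 0, 5]
T_MAT, T_VEC = [[2, 1, 0], [1, 1, 1], [0, 3, 1]], [3, 6, 2]
S_INV, T_INV = mat_inv(S_MAT), mat_inv(T_MAT)

def central_map(x, p=P_MOD):
    """Toy triangular quadratic map Q, easy to invert one coordinate at a time."""
    x1, x2, x3 = x
    return [x1 % p, (x2 + x1 * x1) % p, (x3 + x1 * x2) % p]

def central_map_inverse(y, p=P_MOD):
    y1, y2, y3 = y
    x1 = y1 % p
    x2 = (y2 - x1 * x1) % p
    x3 = (y3 - x1 * x2) % p
    return [x1, x2, x3]

def public_map(x):
    """Y = P(X) = T(Q(S(X))): used to encrypt or to verify."""
    return affine(T_MAT, T_VEC, central_map(affine(S_MAT, S_VEC, x)))

def secret_inverse(y):
    """X = S^{-1}(Q^{-1}(T^{-1}(Y))): used to decrypt or to sign."""
    return affine_inverse(S_INV, S_VEC, central_map_inverse(affine_inverse(T_INV, T_VEC, y)))

all_inputs = [list(x) for x in product(range(P_MOD), repeat=3)]
```

In this toy case Q is a bijection, so `secret_inverse(public_map(x))` returns `x` for every input; in general the inversion step only needs to return one pre-image.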
16.3.18 Remark The basics of MPKCs are provided below so that the reader has a basic sense
about how these schemes can work in practice:
Cipher block or message digest size: m elements of Fq ;
Plaintext block or signature size: n elements of Fq ;
Public key size: mn(n + 3)/2 elements of Fq ;
Secret key size: Usually $n^2 + m^2$ + [size of Q] elements of $F_q$;
Secret map time complexity: $(n^2 + m^2)$ $F_q$-multiplications, plus the time needed
to invert Q;
Public map time complexity: About $mn^2/2$ $F_q$-multiplications.
16.3.19 Remark In terms of computational complexity, MPKCs usually have strong advantages as
we shall see below. But a disadvantage with MPKCs is that their keys are large compared
to number-theory-based systems like RSA or ECC. For example, the public key size of
RSA-2048 is not much more than 2048 bits, but a current version of the Rainbow signature
scheme has n = 42, m = 24, q = 256, i.e., the size of the public key is 22,680 bytes.
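The Rainbow figure follows directly from the public key size formula mn(n + 3)/2 in Remark 16.3.18. A minimal sketch of the arithmetic (the function names are illustrative, and each field element is assumed to be stored in the smallest whole number of bits that can hold it):

```python
def mq_public_key_elements(n: int, m: int) -> int:
    """Number of F_q coefficients in the public key: m * n * (n + 3) / 2."""
    return m * n * (n + 3) // 2   # n * (n + 3) is always even

def mq_public_key_bytes(n: int, m: int, q: int) -> int:
    """Public key size in bytes, rounding the total bit count up to whole bytes."""
    bits_per_element = (q - 1).bit_length()
    return (mq_public_key_elements(n, m) * bits_per_element + 7) // 8

rainbow_bytes = mq_public_key_bytes(n=42, m=24, q=256)   # 24 * 42 * 45 / 2 elements of one byte each
rsa_bytes = 2048 // 8                                    # modulus of RSA-2048, for comparison
```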
16.3.20 Remark There are other alternative forms in which multivariate polynomials can be used
for public key cryptosystems, as we discuss next.
16.3.21 Definition The public key of an implicit form MPKC is a system of l equations:
P (W, Z) = P (w1 , . . . , wn , z1 , . . . , zm ) = (p1 (W, Z), . . . , pl (W, Z)) = (0, . . . , 0), (16.3.2)
2. for any given specific element $Y'$, we can easily solve the equation
3. Equation (16.3.3) is linear and Equation (16.3.4) is nonlinear but can be solved
efficiently.
16.3.23 Remark To verify a signature W with the digest Z, one checks that P(W, Z) = 0. If one
wants to use P to encrypt the plaintext W, one would solve P(W, Z) = (0, . . . , 0), and find
the ciphertext Z. To invert (i.e., to decrypt or to sign) Z, one first calculates $Y' = T^{-1}(Z)$,
then substitutes $Y'$ into Equation (16.3.4) and solves for X. The final plaintext or
signature is given by $W = S^{-1}(X)$.
16.3.24 Remark In an implicit-form MPKC, the public key consists of the l polynomial components
of P and the field structure of F. The secret key mainly consists of L, S, and T . Depending
on the case, the equation Q(X, Y ) = (0, . . . , 0) is either known or has parameters which are
a part of the secret key.
16.3.25 Remark The maps S, T , L serve to hide the equation Q(X, Y ) = 0, which otherwise could
be easily solved for any Y . Mixed schemes are relatively rare, one example being Patarin’s
Dragon [2365].
16.3.26 Remark The isomorphism of polynomials problem originated from trying to attack MPKCs
by finding the secret keys.
16.3.28 Remark The system based on the IP problem was first proposed by Patarin [2366], where
the verification process is performed by showing the equivalence (or isomorphism) of two
different maps. A simplified version is the isomorphism of polynomials with one secret
(IP1s) problem, where we only need to find the map S (if it exists), while the map T
is known to be the identity map. This problem is used to build identification schemes
[1044, 1266, 1914, 2371, 2387].
16.3.29 Remark Mathematically, this problem can be viewed as the classification of quadratic
maps from $\mathbb{F}_q^n$ to $\mathbb{F}_q^m$ under the action of the group
$GL_n(\mathbb{F}_q) \times GL_m(\mathbb{F}_q)$, namely the precise description of the orbit space
of all such quadratic maps under this action. This is a very hard mathematical problem about
which little is known, except in the case m = 1, which is the classification of
quadratic (or bilinear) forms, a problem that is very well understood.
16.3.30 Remark Since the vast majority of MPKCs are in standard form, we deal mainly with such
systems in the rest of this section.
16.3.31 Remark The first attempt to construct a multivariate signature [2320, 2321] utilizes a
quadratic equation in two variables,
\[ y \equiv x_1^2 + \alpha x_2^2 \pmod{n}, \tag{16.3.7} \]
where n = pq is an RSA modulus, a product of two large primes. The public key is essentially
the integer n and Equation (16.3.7). Since the security is supposed to be based on
the factorization of n, this system can really be viewed as a derivative of RSA, though it
did initiate the idea of multivariate cryptosystems. This system was broken by Pollard
and Schnorr in [2415], where they gave a probabilistic algorithm to solve Equation (16.3.7)
for any y without knowing the factors of n. Assuming the generalized Riemann hypothesis,
a solution can be found with a computational complexity of $O((\log n)^2 \log\log |k|)$
operations on $O(\log n)$-bit integers.
16.3.32 Remark Diffie and Fell [1049] tried to build a cryptosystem using the composition of
invertible linear maps and simple tame maps of the form
\[ T(x_1, x_2) = (x_1 + g(x_2), x_2), \]
where g is a polynomial. Tame maps, well known in algebraic geometry, are easily invertible
but hard to hide when composed with each other. The construction in [1049] used only two
variables and equations; not surprisingly, the authors concluded that it appeared very
difficult to build such a cryptosystem that is both secure and has a public key
of practical size.
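To illustrate why a single tame map is easy to invert, here is a minimal sketch over a small prime field. The modulus 101 and the polynomial g are arbitrary choices made for this example and do not come from [1049].

```python
P = 101  # illustrative prime; the maps act on F_101 x F_101


def g(x2: int) -> int:
    """An arbitrary polynomial g, chosen only for this example."""
    return (3 * x2**3 + 5 * x2 + 7) % P


def tame(x1: int, x2: int) -> tuple:
    """The tame map T(x1, x2) = (x1 + g(x2), x2)."""
    return ((x1 + g(x2)) % P, x2)


def tame_inverse(y1: int, y2: int) -> tuple:
    """Since x2 = y2 is known, x1 is recovered by subtracting g(y2)."""
    return ((y1 - g(y2)) % P, y2)


print(tame_inverse(*tame(12, 34)))
```

The inverse needs only one evaluation of g, regardless of the degree of g, which is exactly the triangular structure that is hard to hide.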
16.3.33 Remark An attempt to build a true multivariate (with four variables) public key cryp-
tosystem was also made by Matsumoto, Imai, Harashima, and Miyagawa [2029], where the
public keys are given by quadratic polynomials. However, it was soon broken [2312]. It
became clear that more than four variables and new mathematical ideas are needed
to make MPKCs work.
16.3.34 Remark The tame maps used in [1049] are a special case of the “triangular” or de Jonquières
maps from algebraic geometry.
16.3.36 Remark A de Jonquières map J can be efficiently inverted as long as the $g_i$ are not too
complicated. The invertible affine linear maps over $\mathbb{F}_q^n$ together with the de Jonquières
maps generate the family of tame transformations from algebraic geometry, which consists of all
transformations that are compositions of elements of these two types.
Tame transformations are elements of the group of automorphisms of the polynomial ring
$\mathbb{F}_q[x_1, \ldots, x_n]$. Elements of this automorphism group that are not tame are called
wild. Given a polynomial map, it is in general very difficult to decide whether or not the map is
tame, or even whether any wild map exists [2213], a question closely related to the famous
Jacobian conjecture. The latter question was settled in 2003, when [2613] proved that the Nagata
map is indeed wild.
16.3.37 Remark The first attempt in the English literature with a clear triangular form is the
birational permutations construction by Shamir [2606]. However, triangular constructions
were earlier pursued in Japan under the name “sequential solution type systems” [1440,
2822, 2823]. Their construction is actually even more general in the sense that they use
rational functions instead of just polynomials. These works in Japanese are not so well-
known.
16.3.38 Remark Triangular maps are extremely fast to evaluate and to invert if the $g_i$ are very
simple functions. However, they do have certain strong distinctive characteristics. On the small
end of a triangular system, so to speak, the variable $x_n$ is mapped to a simple function of
itself. On the bigger end, the variable $x_1$ appears only once, in a single equation. The other
equations involve successively more variables.
16.3.39 Proposition If we write the quadratic portion of the central polynomials yi = qi (X) as
bilinear forms, or take the symmetric matrix denoting the symmetric differential of the
central polynomials as in
16.3.41 Remark Triangular (and Oil-and-Vinegar, and variants thereof) systems are usually called
“single-field” or “small-field” approaches to MPKC design, in contrast to the approach
taken by Matsumoto and Imai in 1988 [2028]. In the “big-field” constructions, a totally new
type of mathematical construction, the central map is really a map in a larger field L, a
degree n extension of a finite field K. One builds an invertible map $\overline{Q} : L \to L$, and
picks a K-linear bijection $\phi : L \to K^n$. Then we have the following multivariate polynomial
map, which should presumably be quadratic in general:
\[ Q = \phi \circ \overline{Q} \circ \phi^{-1}, \tag{16.3.10} \]
and we “hide” this map Q by composing it on both sides with two invertible affine linear
maps S and T in $K^n$.
16.3.42 Definition Matsumoto and Imai built a scheme $C^*$ by choosing a field K of characteristic
2 and the map
\[ \overline{Q} : X \mapsto Y = X^{1+q^\alpha}, \tag{16.3.11} \]
where q is the number of elements in K, X is an element in L, and $\gcd(1+q^\alpha, q^n-1) = 1$.
16.3.45 Remark The map Q is always quadratic due to the linearity of the Frobenius map
$F_\alpha(X) = X^{q^\alpha}$.
16.3.46 Remark A significant algebraic implication of $C^*$ and Equation (16.3.11) is
$Y^{q^\alpha - 1} = X^{q^{2\alpha} - 1}$, or
\[ X Y^{q^\alpha} = X^{q^{2\alpha}} Y. \tag{16.3.13} \]
Patarin [2364] used this bilinear relation to cryptanalyze the original $C^*$ (see
Subsection 16.3.3.1). Though the original idea of $C^*$ failed, it has inspired many new designs.
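As a minimal sketch of the $C^*$ central map and of the bilinear relation (16.3.13), the following code works in the toy field L = GF(2^5) with q = 2 and α = 1. These parameters, the reduction polynomial x^5 + x^2 + 1, and all function names are assumptions made for illustration; the affine maps S and T are omitted.

```python
N = 5                  # toy extension degree n, so L = GF(2^5) and K = GF(2)
MODULUS = 0b100101     # x^5 + x^2 + 1, irreducible over GF(2)
ALPHA = 1              # toy value of alpha
ORDER = (1 << N) - 1   # q^n - 1 = 31
E = 1 + 2**ALPHA       # exponent 1 + q^alpha = 3, coprime to 31
H = pow(E, -1, ORDER)  # inverse exponent: E * H = 1 mod (q^n - 1); needs Python 3.8+


def gf_mul(a: int, b: int) -> int:
    """Multiply two elements of GF(2^N) represented as bit masks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:          # reduce as soon as the degree reaches N
            a ^= MODULUS
    return r


def gf_pow(a: int, e: int) -> int:
    """Square-and-multiply exponentiation in GF(2^N)."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


def central(x: int) -> int:
    """The monomial map X -> X^(1 + q^alpha)."""
    return gf_pow(x, E)


def central_inverse(y: int) -> int:
    """Inversion by raising to the inverse exponent H."""
    return gf_pow(y, H)


def patarin_relation_holds(x: int) -> bool:
    """Check X * Y^(q^alpha) == X^(q^(2 alpha)) * Y for Y = central(X)."""
    y = central(x)
    return gf_mul(x, gf_pow(y, 2**ALPHA)) == gf_mul(gf_pow(x, 2**(2 * ALPHA)), y)


print(H, all(patarin_relation_holds(x) for x in range(1 << N)))
```

Because the relation is linear in X once Y is fixed, every ciphertext yields linear equations on the plaintext, which is what Patarin's attack exploits.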
16.3.47 Definition An HFE (Hidden Field Equations) system, the most significant of the $C^*$
derivatives, is constructed by replacing $\overline{Q}$, the monomial map used by $C^*$, by the
extended Dembowski-Ostrom polynomial map:
\[ \overline{Q} : X \in L = \mathbb{F}_{q^n} \mapsto Y = \sum_{0 \le i \le j < r} a_{ij} X^{q^i + q^j} + \sum_{0 \le i < r} b_i X^{q^i} + c. \tag{16.3.14} \]
16.3.48 Remark This map is, in general, not one-to-one and we need additional structure to identify
the real inverse from one of a number of possible candidates for decryption.
16.3.49 Remark Inverting $\overline{Q}$ is equivalent to solving a univariate equation of a certain
degree in L. This task is well studied and straightforward to implement, using some version of the
Berlekamp (or Cantor-Zassenhaus) algorithm [230, 499] (see Section 11.4), but its cost depends very
much on the degree of the polynomial. Typically, the cost of this solution is
$O(n d^2 \log d + d^3)$, where d is the maximum degree of $\overline{Q}$.
16.3.50 Remark For practical applications, one might conclude right away that we should have as
small a d as possible, or as small an r as possible, since usually $d = 2q^r$ or $q^r + 1$. But the
situation is actually very subtle. Just as Equation (16.3.11) intrinsically means that
the $C^*$ map in some form has rank 2, which leads to Equation (16.3.13) and all the known
cryptanalysis of $C^*$-related systems, Equation (16.3.14) is fundamentally responsible for all
the algebraic properties of HFE.
16.3.51 Remark A critical fact is that the intrinsic rank of the map is bounded by r, and usually
achieves that value for randomly chosen parameters. This rank essentially determines the
complexity of current attacks [737, 1042]. For example, the HFE Challenge 1 solved by
Faugère and Joux [1042] has an intrinsic rank of 4.
16.3.52 Remark An HFE with a high d is unbroken, although it can be really slow to decrypt/invert.
Quartz [2369] probably sets a record for the slowest cryptographic algorithm when submitted
to NESSIE — on a Pentium III 500MHz, it took half a minute to do a signature, but has
been improved substantially since.
16.3.53 Remark Recent research progress in [878, 884, 890] shows that the HFE still has great
potential if we explore the cases using finite fields with odd characteristics.
16.3.54 Remark Schemes $C^*$ and HFE can each be modified by techniques mentioned later
(Plus-Minus, vinegar variables, and internal perturbation). Also related are the ℓIC system and
probabilistic big-field based MPKCs [1340].
16.3.55 Remark The Oil and Vinegar (OV) and later unbalanced Oil and Vinegar (UOV) schemes
[1737, 2367] are designed only for signatures. This construction is inspired by the idea of
linearization equations (Subsection 16.3.2.3). In some sense, this construction uses a method
where one transforms an attacking method into a designing method.
16.3.56 Definition Let v < n and m = o = n − v. The variables $x_1, \ldots, x_v$ are vinegar variables
and $x_{v+1}, \ldots, x_n$ oil variables. An oil-vinegar map $Q : K^n \to K^m$ is a map of the form
$Y = Q(X) = (q_1(X), \ldots, q_o(X))$, where
\[ q_l(X) = \sum_{i=1}^{v} \sum_{j=i}^{n} \alpha_{ij}^{(l)} x_i x_j, \quad l = 1, \ldots, o, \]
and all coefficients are randomly chosen from the base field K.
16.3.57 Remark In each $q_l$, there are no quadratic terms involving only oil variables, which
means the oil variables and vinegar variables are not fully mixed (like oil and vinegar in a salad
dressing); this is the origin of the name of the scheme.
16.3.58 Definition The public key map P for an Oil-Vinegar scheme is constructed as
P = Q ◦ S,
16.3.59 Remark The change of basis by the transformation S is a process to “mix” fully oil and
vinegar, so one cannot tell what are oil variables and what are the vinegar variables. With
OV and UOV constructions, there is no need to compose another affine map T .
16.3.60 Remark The original Oil and Vinegar signature scheme has m = o = v = n/2. When o < v,
it becomes the unbalanced Oil and Vinegar signature scheme.
16.3.61 Remark The public key for an OV or a UOV scheme is $P = (p_1, \ldots, p_o)$, the polynomial
components of P. The secret key consists of the linear map S and the map Q.
16.3.62 Remark Given a message Y = (y1 , . . . , yo ), to sign it, one needs to find a vector W =
(w1 , . . . , wn ) such that P(W ) = Y . With the secret key, this can be done efficiently. First,
one guesses values for each vinegar variable x1 , . . . , xv , and obtains a set of o linear equations
with the o oil variables xv+1 , . . . , xn . With high probability, it has a solution. If the linear
system does not have a solution, one may repeatedly assign random values to the vinegar
variables until one finds a pre-image of a given element in Ko . Then one applies S −1 .
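The signing procedure just described can be sketched as follows. This is a toy illustration only: the field F_31, the parameters v = 4 and o = 3, and all function names are assumptions, and the public map is evaluated as the composition Q ◦ S rather than as expanded public polynomials.

```python
import random

P_MOD = 31  # small prime, so K = F_31 (illustrative choice)


def solve_mod(A, b, p=P_MOD):
    """Solve the square system A x = b over F_p; return None if A is singular."""
    n = len(A)
    M = [[a % p for a in row] + [bi % p] for row, bi in zip(A, b)]
    for col in range(n):
        piv = next((r for r in range(col, n) if M[r][col]), None)
        if piv is None:
            return None
        M[col], M[piv] = M[piv], M[col]
        inv = pow(M[col][col], -1, p)
        M[col] = [(a * inv) % p for a in M[col]]
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col]
                M[r] = [(a - f * c) % p for a, c in zip(M[r], M[col])]
    return [row[n] for row in M]


def keygen(v, o, rng, p=P_MOD):
    """Random coefficients alpha[l][i][j] (i vinegar, j >= i) and an invertible S."""
    n = v + o
    alpha = [[[rng.randrange(p) if j >= i else 0 for j in range(n)]
              for i in range(v)] for _ in range(o)]
    while True:
        S = [[rng.randrange(p) for _ in range(n)] for _ in range(n)]
        if solve_mod(S, [0] * n, p) is not None:
            return alpha, S


def central(alpha, X, p=P_MOD):
    """Oil-vinegar map: every term contains at least one vinegar variable."""
    v, n = len(alpha[0]), len(X)
    return [sum(a[i][j] * X[i] * X[j] for i in range(v) for j in range(i, n)) % p
            for a in alpha]


def public(alpha, S, W, p=P_MOD):
    """P = Q o S, evaluated by composition."""
    X = [sum(s * w for s, w in zip(row, W)) % p for row in S]
    return central(alpha, X, p)


def sign(alpha, S, Y, rng, p=P_MOD):
    v, o = len(alpha[0]), len(alpha)
    n = v + o
    while True:
        vin = [rng.randrange(p) for _ in range(v)]  # guess the vinegar variables
        # With the vinegar values fixed, each q_l is affine in the oil variables.
        A = [[sum(a[i][j] * vin[i] for i in range(v)) % p for j in range(v, n)]
             for a in alpha]
        const = [sum(a[i][j] * vin[i] * vin[j] for i in range(v) for j in range(i, v)) % p
                 for a in alpha]
        oil = solve_mod(A, [(y - c) % p for y, c in zip(Y, const)], p)
        if oil is not None:                         # otherwise retry with new vinegar
            return solve_mod(S, vin + oil, p)       # W = S^{-1}(X)


rng = random.Random(2013)
alpha, S = keygen(4, 3, rng)
Y = [5, 17, 29]
W = sign(alpha, S, Y, rng)
print(public(alpha, S, W) == Y)
```

Verification is a single evaluation of the public map, while signing costs one small linear solve per vinegar guess.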
16.3.63 Remark To check if W is indeed a legitimate signature for Y , one only needs to get the
public map P and check if indeed P(W ) = Y .
16.3.64 Remark The algebraic property that is most significant in an unbalanced Oil-and-Vinegar
system is the absence of pure oil cross-terms. Equivalently, if we have a UOV polynomial,
then the quadratic part of each component $q_i$ in the central map from X to Y, when viewed
as a bilinear form using a matrix (see Equation (16.3.9)), looks like
\[
M_i := \begin{pmatrix}
\alpha_{11}^{(i)} & \cdots & \alpha_{1v}^{(i)} & \alpha_{1,v+1}^{(i)} & \cdots & \alpha_{1n}^{(i)} \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
\alpha_{v1}^{(i)} & \cdots & \alpha_{vv}^{(i)} & \alpha_{v,v+1}^{(i)} & \cdots & \alpha_{vn}^{(i)} \\
\alpha_{v+1,1}^{(i)} & \cdots & \alpha_{v+1,v}^{(i)} & 0 & \cdots & 0 \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
\alpha_{n1}^{(i)} & \cdots & \alpha_{nv}^{(i)} & 0 & \cdots & 0
\end{pmatrix}, \tag{16.3.15}
\]
or in the block form: $\begin{pmatrix} * & * \\ * & 0 \end{pmatrix}$.
16.3.65 Remark There have been different attempts to make UOV more efficient such as [1687,
1686], which were promptly broken [2998].
16.3.66 Remark A new way to make the UOV type construction more efficient is the Rainbow
construction by stacking several layers of Unbalanced Oil-Vinegar systems together for an
easily invertible central map [889].
16.3.67 Definition For a u-stage Rainbow with $0 < v_1 < v_2 < \cdots < v_{u+1} = n$, the
construction of the central map over any finite field is given by
\[ y_k = q_k(X) = \sum_{i=1}^{v_l} \sum_{j=i}^{n} \alpha_{ij}^{(k)} x_i x_j + \sum_{i < v_{l+1}} \beta_i^{(k)} x_i, \quad \text{if } v_l < k \le v_{l+1}. \tag{16.3.16} \]
16.3.68 Remark In the signing process, we first choose randomly the values for the vinegar variables
x1 , . . . , xv1 in the first layer, and solve for the oil variables xv1 +1 , . . . , xv2 . Then we use the
known values of xi in the second layer and find the values for the oil variables in the second
layer. We continue like this layer by layer until we have all the xi ’s.
16.3.69 Remark The components of Y in a Rainbow-type construction are typically written to have
indices v1 + 1, . . . , n. In the pure Rainbow scheme, S and T and the coefficients α and β
are totally randomly chosen. The essential structure of the Rainbow instance is determined
by 0 < v1 < v2 < · · · < vu+1 = n or the “Rainbow structure sequence” (v1 , o1 , o2 , . . . , ou ),
where oi := vi+1 − vi .
16.3.70 Remark The Rainbow construction is a special case of the UOV; however, the structure
of the system, consisting of $o_u$ equations with associated bilinear maps of the form
$\begin{pmatrix} * & * \\ * & 0 \end{pmatrix}$ following $m - o_u$ equations with associated bilinear
maps of the form $\begin{pmatrix} * & 0 \\ 0 & 0 \end{pmatrix}$, leads to
a different attack (see Subsection 16.3.3.10).
16.3.71 Remark Aside from attacks peculiar to the UOV and Rainbow systems, the Rainbow-type
constructions also share certain characteristics of triangular schemes, therefore there is a
need to account for rank-based attacks (Subsection 16.3.3.8), such as the two improved
attacks in [280, 895]. None of these attacks are considered essentially effective.
16.3.72 Remark If one wants to make the computation of the central map and its inverse fast,
another direct way is to make the Oil-Vinegar polynomials sparse, as in the case of the
TTS (Tame Transformation Signatures) schemes [604, 3023, 3024]. The TRMS [2931] of
Wang et al. is also a TTS instance. Due to the sparsity, there also exist certain extra
possibilities for linear algebra attacks and related vulnerabilities, principally UOV-type
vulnerabilities such as those described in [891].
16.3.73 Remark Minus and Plus are simple but useful ideas, earliest mentioned by Matsumoto,
Patarin, and Shamir (probably found independently [2370, 2606]).
16.3.74 Remark For the Minus method [2370], several (r) polynomials are removed from the public
keys. When inverting the public map, the legitimate users take random values for the missing
variables. Minus is very suitable for signature schemes without any performance loss, since
a document need not have a unique signature.
16.3.75 Remark For encryption, Minus causes significant slowdown, since the missing coordinates
must be searched. In theory the public map of an encryption method should be injective.
If we have to search through r variables in $\mathbb{F}_q$, we effectively have $q^{n+r}$ results,
only $q^n$ of which should represent valid ciphertexts, hence the expected number of guesses taken
per decryption is $q^r$. Thus, decryption is slowed by that same factor of $q^r$.
16.3.76 Remark Minus, i.e., removing some public equations, makes a $C^*$-based system much harder
to solve. SFLASH [67, 739, 2368], a $C^{*-}$ instance with $(q, n, r) = (2^7, 37, 11)$, was accepted
as a European security standard for low-cost smart cards by the New European Schemes
for Signatures, Integrity, and Encryption (NESSIE) [2220].
16.3.77 Remark In 2007, the SFLASH family of cryptosystems [924, 925] was defeated. The attack
uses the symmetry and the invariants of the differential of the public map P (Subsec-
tion 16.3.3.4) inspired by the attack on internal perturbation. Making a secure C ∗ -based
signature scheme may require a new modifier called Projection [880].
16.3.78 Remark Plus is the opposite of Minus: add random central equations to the original central
map, and this can be used to mask the high-end of the triangle system. For encryption
methods, this again does not affect performance much; for digital signatures there is a
slowdown as the extra variables again need to be guessed. Regardless, Plus-Minus variations
defend against attacks that are predicated on the rank of equations.
16.3.79 Remark Plus-Minus alone does not make triangular constructions secure. Paper [1339] dis-
cusses this in detail and concludes exactly the opposite: Triangle-Plus-Minus constructions
can be broken by very straightforward attacks using simple linear algebra. Therefore one
must use more elaborate variations [280, 895, 3023].
16.3.80 Remark Internal perturbation is a general method of improving the security of MPKCs
by adding some perturbation or controlled noise. Internal perturbation was first applied to
Matsumoto-Imai systems, which produced the variation in [876].
16.3.81 Remark Take $V = (v_1, \ldots, v_r)$ to be an r-tuple of random affine forms in the
variables X. Let $f = (f_1, \ldots, f_n)$ be a random n-tuple of quadratic functions in V. Let our
new Q be defined by
\[ X \mapsto Y = X^{q^\alpha + 1} + f(V(X)), \]
where the power operation assumes the vector space to represent a field. The number of
Patarin relations decreases quickly down to 0 as r increases. For every Y, we may find
$Q^{-1}(Y)$ by guessing V(X) = B, finding a candidate $X = (Y - f(B))^h$, where h is the inverse
of $q^\alpha + 1$ modulo $q^n - 1$, and checking the initial assumption that V(X) = B. Since we
may have to repeat the procedure $q^r$ times, we are almost
forced to let q = 2 and make r as small as possible.
16.3.82 Remark In this system, there are possible extraneous solutions just as in the system HFE.
Therefore, we must manufacture some redundancy in the form of a hash segment or check-
sum. The original perturbation system was broken [1094] via a surprising differential crypt-
analysis (cf. Sec. 16.3.3.4). Internal perturbation is usually coupled with the plus variation,
and this is one of the best multivariate encryption schemes.
16.3.83 Remark The idea of vinegar variables had been introduced earlier with UOV, and was
used as a defense in Quartz. The idea is to use an auxiliary variable that occupies only a
small subspace of the input space. It was pointed out [888] that internal perturbation is
almost exactly equal to both vinegar variables and projection, or fixing the input to an
affine subspace. We basically set one, two, or more variables of the public key to be zero to
create the new public key. In the case of signature schemes, each projected dimension will
slow down the signing process by a factor of q.
16.3.84 Remark Projection is a very useful and simple method. Since structural attacks
(Sec. 16.3.3.4) usually start by looking for an invariant or a symmetry, it is a good idea to
try to remove both. Restricting to a subspace of the original space breaks the symmetry.
Something like the Minus modifier destroys an invariant. Hence the use of projection by
itself prevents some attacks, such as [924, 925, 1095]. The differential attack against $C^*$
(and ℓIC) derivatives uses the structure of the big field L. Hence projection is expected to
prevent such an attack [880].
16.3.85 Remark MPKCs based only on triangular constructions were not pursued again until a
much more complex defense against rank attacks was proposed, with the tame transforma-
tion method (TTM) of Moh [2113].
16.3.86 Remark de Jonquières maps can be viewed either as upper triangular or lower triangular.
Moh [2113] suggested a construction where the central map Q is given by
Q = Ju ◦ Jl ◦ I(x1 , . . . , xn ), (16.3.17)
16.3.89 Definition A MinRank problem is to look for non-zero matrices with minimum rank in a
space of matrices.
16.3.90 Remark The MinRank problem is NP-hard in general but can be easy for special cases, in
particular, when the minimum rank is very low.
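To make the MinRank problem concrete, the following sketch solves a tiny instance over GF(2) by exhaustive search over all non-zero linear combinations. The three matrices and the function names are made up for illustration; exhaustive search is feasible only for very small instances and is not one of the attacks cited in this section.

```python
from itertools import product


def rank_gf2(rows):
    """Rank over GF(2) of a matrix whose rows are given as integer bit masks."""
    rows = list(rows)
    rank = 0
    for bit in reversed(range(max(rows, default=0).bit_length())):
        pivot = next((i for i in range(rank, len(rows)) if (rows[i] >> bit) & 1), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and (rows[i] >> bit) & 1:
                rows[i] ^= rows[rank]
        rank += 1
    return rank


def min_rank(matrices):
    """Return (rank, coefficients) of a non-zero combination of minimum rank."""
    best = None
    for coeffs in product([0, 1], repeat=len(matrices)):
        if not any(coeffs):
            continue
        combo = [0] * len(matrices[0])
        for c, mat in zip(coeffs, matrices):
            if c:
                combo = [r ^ s for r, s in zip(combo, mat)]
        r = rank_gf2(combo)
        if best is None or r < best[0]:
            best = (r, coeffs)
    return best


# Three 3x3 matrices over GF(2); each row is a bit mask.
M1 = [0b100, 0b010, 0b001]
M2 = [0b110, 0b010, 0b001]
M3 = [0b011, 0b101, 0b110]
best_rank, best_coeffs = min_rank([M1, M2, M3])
print(best_rank, best_coeffs)
```

Each matrix here has rank at least 2 on its own, yet a sum of two of them has rank 1, which is the kind of hidden low-rank element a rank attack looks for.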
16.3.91 Remark The idea of sequentially solvable equations (or stages) can also be used in
conjunction with other ideas. Some of the more notable attempts are from Wang, who wrote
about a series of schemes called “Tractable Rational Map Cryptosystems” (TRMC). TRMC
v1 is essentially no different from early TTM [603]. The central map of TRMC v2 [2929]
has a small random overdetermined block on one end and the rest of the variables are
determined in the triangular (tame) style. Versions 3 and 4 [2930, 2932] use a similar trick as
3IC [892].
16.3.92 Remark The TTM construction is a truly original and very intriguing idea. So far existing
constructions of the TTM cryptosystem and related schemes do not work for public-key
encryption [883, 886, 887, 2226]. Most of the schemes proposed are not presented in any
systematic way, and no explanation is yet given why and how they work. More sophistication
is needed and we suspect that to create a successful TTM-like scheme will probably require
deep insight from algebraic geometry.
16.3.93 Remark A TTS (tame transformation signature) scheme can be viewed as a similar but
simpler construction. This system is essentially the result of an application of the Minus
method in [2606] for a tame transformation. A few of them were suggested mainly by Chen
and Yang [604, 605]. These systems can also be defeated by the method used by Stern and
Vaudenay [722, 896].
16.3.94 Remark In $C^*$ and HFE, we use one big field $L = K^n$. In Rainbow/TTS or similar schemes,
each component is as small as the base field. We can use something in between, as seen in
MFE (Medium Field Encryption) and ℓIC (ℓ-Invertible Cycles). These two constructions
use a standard Cremona transformation from algebraic geometry.
16.3.95 Definition A standard Cremona transformation is defined as: L∗ := L\{0} for some field
L:
16.3.96 Proposition This transformation is a bijection for any field L, and inverts via
\[ X_1 := \sqrt{Y_1 Y_2 / Y_3}. \]
16.3.97 Remark MFE’s central map uses structure related to matrix multiplications to defend
against linearization relations, but it does not avoid all the problems, as can be seen in
Subsection 16.3.3.3.
16.3.98 Remark The ℓ-invertible cycle [892] also uses an intermediate field $L = K^k$ and extends
$C^*$ by using the following central map from $(L^*)^\ell$ to itself:
16.3.99 Remark This is much faster computationally than computing the inverse of $C^*$. But these
have so much in common with $C^*$ that we need the same variations. In other words, we
need to do $3\mathrm{IC}^{-}p$ (with minus and projection) and $2\mathrm{IC}^{+}i$ (with internal
perturbation and plus), paralleling $C^{*-}p$ and $C^{*+}i$ (also known as PMI+) [881].
16.3.100 Remark Initially all the MPKCs were constructed over fields of characteristic 2. But
the authors of [890] observed that it would be a very good idea to use fields of odd
characteristic. The rationale is that in the case of relatively large odd characteristic, the
field equations of the form $x_i^q - x_i = 0$ cannot be used effectively, which forces
an attacker to find all the solutions in the algebraic closure instead of the small field itself.
This would make the MPKCs much more secure against direct attacks by polynomial solvers.
Recent results on the degree of regularity [878, 884] further confirm this notion.
16.3.101 Remark There are also other new constructions using different new ideas. One interesting
one is a construction using Diophantine equations over certain function rings [1461]. Another
is the MQQ construction using quasi-groups, although it has been defeated [1285, 2115].
16.3.102 Remark To attack an MPKC directly as an MQ problem instance is usually not very
effective, and cryptanalysts generally try to attack MPKCs utilizing the structure of the
central map by either finding the private key directly, or finding extra polynomial relations
to enhance the polynomial solver.
16.3.103 Definition For a given MPKC, a linearization equation is a relation between the
components of the ciphertext Y and plaintext X of the following form:
\[ \sum_{i,j} a_{ij} x_i y_j + \sum_{i} b_i x_i + \sum_{j} c_j y_j + d = 0. \tag{16.3.20} \]
16.3.104 Remark The key property of these equations is that when substituted with the actual
values of Y , we get an affine (linear) relation between the xi ’s. In general, each equation
should effectively eliminate one variable from the system.
16.3.105 Remark The key and first example is the attack against C ∗ by Patarin [2364]. For any C ∗
public key, we can compute Y from X, and substitute enough (X, Y ) pairs and solve for
aij , bi , cj , and d. A basis for the solution space gives us all the linearization relations.
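The procedure in Remark 16.3.105 can be sketched on a toy $C^*$ instance. The code below assumes q = 2, n = 7, α = 1, the reduction polynomial x^7 + x + 1, and omits the affine maps S and T (an invertible affine change of variables does not change the form of Equation (16.3.20)). It evaluates all monomials of a linearization equation on every pair (X, Y) and computes a basis of the coefficient vectors that vanish on all of them. All names are illustrative.

```python
N = 7                      # toy extension degree: L = GF(2^7) over K = GF(2)
MODULUS = 0b10000011       # x^7 + x + 1, irreducible over GF(2)
ALPHA = 1                  # central map X -> X^(1 + 2^ALPHA); gcd(3, 2^7 - 1) = 1
NCOLS = N * N + 2 * N + 1  # unknowns a_ij, b_i, c_j, d


def gf_mul(a, b):
    """Multiply two elements of GF(2^N) represented as bit masks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= MODULUS
    return r


def central(x):
    """Toy C* map, standing in for the public map (S and T are omitted)."""
    y = x
    for _ in range(ALPHA):
        y = gf_mul(y, y)          # y = x^(2^ALPHA)
    return gf_mul(x, y)


def monomial_row(x, y):
    """Values of the monomials x_i*y_j, x_i, y_j, 1 at (x, y), packed into an int."""
    xb = [(x >> i) & 1 for i in range(N)]
    yb = [(y >> j) & 1 for j in range(N)]
    values = [a & b for a in xb for b in yb] + xb + yb + [1]
    return sum(bit << k for k, bit in enumerate(values))


def nullspace_gf2(rows, ncols):
    """Basis of all coefficient vectors c with <row, c> = 0 over GF(2) for every row."""
    pivots = {}                   # pivot column -> fully reduced row
    for row in rows:
        for col, prow in pivots.items():
            if (row >> col) & 1:
                row ^= prow
        if row:
            col = row.bit_length() - 1
            for c in pivots:      # keep earlier pivot rows reduced
                if (pivots[c] >> col) & 1:
                    pivots[c] ^= row
            pivots[col] = row
    basis = []
    for free in (c for c in range(ncols) if c not in pivots):
        vec = 1 << free
        for col, prow in pivots.items():
            if (prow >> free) & 1:
                vec |= 1 << col
        basis.append(vec)
    return basis


rows = [monomial_row(x, central(x)) for x in range(1 << N)]
relations = nullspace_gf2(rows, NCOLS)
print(len(relations))
```

Here every plaintext is enumerated, which is possible only because the example is tiny; in an actual attack one would substitute somewhat more random (X, Y) pairs than there are unknown coefficients.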
16.3.106 Remark If we are given any ciphertext, i.e., the values of yi , these n bilinear relations will
produce linear equations satisfied by components of the plaintext X, and in the case of C ∗ ,
it gives us enough linearly independent linear equations to help us to find the plaintext
with the help of the public equations. In similar systems like 3IC, for example, linearization
equations are also present in large numbers, which need to be eliminated to make the system
secure.
16.3.107 Remark If the number of linearization equations is high enough, we can defeat the system
efficiently. However, it is shown in [883, 886, 887] that even when the number of linearization
equations (or special form of bilinear relations) is not so large, their existence can be lethal.
16.3.108 Remark Ding and Schmidt [887] found that the low-rank central polynomials — often
rank 2 — in currently existing implementation schemes for the TTM cryptosystems make
it possible to extend the linearization method by Patarin [2364] to attack all current TTM
implementation schemes (Subsection 16.3.3.1).
16.3.109 Remark For the Ding-Schmidt attack [887], the number of linearization equations is not
that high, but they manage to eliminate the “lock polynomial” that defends a TTM instance
against a simple rank attack.
16.3.111 Remark This is the key to break the MFE systems. There are at least 8k linear dependencies
derived from the SOLE out of a total of 12k variables in an MFE system, which makes the
cryptanalyst’s task much easier. Paper [885] used another trick – the fact that squaring is
linear in a characteristic two field – to get it down to 2k remaining variables at most and
concluded that solving for the remaining variables is easy.
16.3.112 Remark The existence of linearization relations of higher degrees shows multivariate en-
cryption schemes designed in the triangular style are full of traps and to design a secure
system is very difficult without an intrinsically sophisticated algebraic structure.
16.3.113 Remark Structural attacks on MPKC systems to recover private keys (or equivalently useful
keys) are of two related types:
Invariants: invariants (mostly subspaces) that can be tracked.
Symmetries: transformations that leave certain structures invariant and hence can be
computed by a system of equations.
These two methods are closely related since invariants are defined according to symmetry.
Earlier designers were not yet fully aware of the importance of symmetry. We will present
the symmetry or invariants used in the new differential attacks on the C ∗ family of cryp-
tosystems as exemplified by the differential attacks developed by Stern and collaborators in
[1094].
16.3.114 Remark The cryptanalysis of the PMI (perturbed Matsumoto-Imai) [1094] was a true
novelty for a technique usually associated with symmetric key cryptography.
16.3.115 Remark Using the notation of a PMI system, we know that for a randomly chosen vector b,
the probability is $q^{-r}$ that it lies in the kernel K of the linear part of V. When that happens,
V(X + b) = V(X) for any X. Since $q^{-r}$ is not too small, if we can use this to distinguish
between a vector $b \in T^{-1}K$ (back-mapped into the original space) and $b \notin T^{-1}K$, we can
bypass the protection of the perturbation, find our bilinear relations and accomplish the
cryptanalysis.
16.3.116 Remark In [1094], Fouque, Granboulan, and Stern built a distinguisher using a test on the
kernel of the symmetric difference
We say that t(b) = 1 if dim kerW DP(b, W ) = 2gcd(n,α) − 1, and t(b) = 0 otherwise. If
b ∈ K, then t(b) = 1 with probability one, otherwise it is less than one. If gcd(n, α) > 1, it
is a nearly perfect distinguisher. If not, we can employ two other tricks. For one of them, we
observe K is a vector space, so $\Pr(t(b + b') = 0 \mid t(b') = 0)$ will be relatively high if b ∈ K
and relatively low otherwise.
16.3.117 Remark There is a surprisingly simple defense dating back to [2370] (which introduced
SFLASH). By using the “plus” variant, i.e., appending a random quadratic polynomial to
P, enough false positives are generated to overwhelm the distinguishing test of [1094]. The
extra equations also serve as a distinguisher when there are extraneous solutions. Basically,
the more “plus” equations, the less discriminating power of the above mentioned test. Based
on empirical results of Ding and Gower [881], when r = 6, a = 12 should be sufficient, and
a = 14 would be a rather conservative estimate for the amount of “plus” needed to mask
the PMI structure.
16.3.118 Remark The symmetry found in [1094] can be explained by considering the case of the C ∗
cryptosystem. The symmetric differential of any function G, defined formally just like in
Equation (16.3.21):
is bilinear and symmetric in its variables a and X. In the first version of this attack [925],
we look at the differential of the public map P, and look for skew-symmetric maps with
respect to this bilinear function, namely, the linear maps M such that
16.3.119 Remark The reason that this works is that the central map Q and the public key, which
encapsulates the vital information in the central map, unfortunately have very strong
symmetry in the sense that all the differentials from these maps share some common nontrivial
skew-symmetric map M. Since $Q(X) = X^{1+q^\alpha}$, its differential is
\[ DQ(a, X) = a^{q^\alpha} X + a X^{q^\alpha}. \]
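For completeness, this formula can be derived in one step. The derivation assumes that the symmetric differential is defined as DQ(a, X) = Q(X + a) − Q(X) − Q(a) + Q(0), the usual convention (the defining equation is not reproduced here), and uses the additivity of the Frobenius map $X \mapsto X^{q^\alpha}$.

```latex
\begin{align*}
Q(X + a) &= (X + a)\,(X + a)^{q^\alpha}
          = (X + a)\,\bigl(X^{q^\alpha} + a^{q^\alpha}\bigr) \\
         &= X^{1+q^\alpha} + a^{q^\alpha} X + a X^{q^\alpha} + a^{1+q^\alpha}, \\
DQ(a, X) &= Q(X + a) - Q(X) - Q(a) + Q(0)
          = a^{q^\alpha} X + a X^{q^\alpha}.
\end{align*}
```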
16.3.120 Theorem The maps M skew-symmetric with respect to this DQ(a, X) [925] are precisely
those induced from multiplication by some element ζ satisfying the condition
\[ \zeta^{q^\alpha} + \zeta = 0. \]
16.3.121 Remark This skew-symmetry survives under change of basis. It can be seen that the skew-
symmetry continues to hold even when we remove some components of P. In terms of the
public key, this means that if we write
\[ DP(c, W) := (c^T H_1 W, c^T H_2 W, \ldots, c^T H_m W) \]
16.3.123 Remark The second symmetry is multiplicative symmetry, which comes also from the
differential DP(c, W ) [924].
16.3.124 Proposition Let $\zeta$ be an element in the big field $L$. Then we have
\[ DQ(\zeta \cdot a, x) + DQ(a, \zeta \cdot x) = (\zeta^{q^\alpha} + \zeta)\, DQ(a, x). \]
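The multiplicative identity can be checked numerically on a toy instance. The following is a minimal sketch, assuming $q = 2$, the big field $L = \mathbb{F}_{2^7}$ represented modulo $x^7 + x + 1$, and $\alpha = 3$; the modulus, the value of $\alpha$ and all function names are illustrative choices, not parameters taken from the text.

```python
# Toy check of DQ(zeta*a, x) + DQ(a, zeta*x) = (zeta^(q^alpha) + zeta) * DQ(a, x)
# in L = GF(2^7) over K = GF(2), so q = 2. All parameters are assumptions.
N = 7
MOD = 0b10000011      # x^7 + x + 1, irreducible over GF(2)
ALPHA = 3             # central map Q(X) = X^(1 + q^ALPHA) = X^9

def gf_mul(a, b):
    """Multiply two elements of GF(2^N), encoded as integers below 2^N."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a >> N:        # degree reached N: reduce modulo MOD
            a ^= MOD
        b >>= 1
    return r

def gf_pow(a, e):
    """Square-and-multiply exponentiation in GF(2^N)."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

def frob(a):
    """The map a -> a^(q^ALPHA)."""
    return gf_pow(a, 2 ** ALPHA)

def Q(x):
    """Central map X^(1 + q^ALPHA)."""
    return gf_mul(x, frob(x))

def symmetric_differential(G, a, x):
    """G(x + a) - G(x) - G(a) + G(0); addition is XOR in characteristic 2."""
    return G(x ^ a) ^ G(x) ^ G(a) ^ G(0)

def DQ(a, x):
    """Closed form a^(q^ALPHA) x + a x^(q^ALPHA)."""
    return gf_mul(frob(a), x) ^ gf_mul(a, frob(x))

def identity_holds(zeta, a, x):
    lhs = DQ(gf_mul(zeta, a), x) ^ DQ(a, gf_mul(zeta, x))
    rhs = gf_mul(frob(zeta) ^ zeta, DQ(a, x))
    return lhs == rhs

# Elements zeta with zeta^(q^ALPHA) + zeta = 0, i.e. inducing skew-symmetric maps.
skew_zetas = [z for z in range(2 ** N) if frob(z) ^ z == 0]
```

With these toy parameters only the elements of the prime field satisfy the skew-symmetry condition; larger fields with $\gcd(\alpha, n) > 1$ give more room, which is one reason the choice of parameters matters for the attack.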
16.3.125 Theorem Let
\[ M_\zeta = M_S^{-1} \circ (X \mapsto \zeta X) \circ M_S \]
be the linear map in $K^n$ corresponding to multiplication by $\zeta$, then
\[ \mathrm{span}\{M_\zeta^T H_i + H_i M_\zeta : i = 1, \ldots, n\} = \mathrm{span}\{H_i : i = 1, \ldots, n\}, \]
i.e., the space spanned by the quadratic polynomials from the central map is invariant under
the skew-symmetric action.
16.3.126 Remark The public key of $C^{*-}$ inherits some of that symmetry. Note that not every
skew-symmetric action by a matrix $M_\zeta$ corresponding to an $L$-multiplication results
in $M_\zeta^T H_i + H_i M_\zeta$ being in the span of the public-key differential matrices, because
$S := \mathrm{span}\{H_i : i = 1, \ldots, n-r\}$ as compared to $\mathrm{span}\{H_i : i = 1, \ldots, n\}$ is missing $r$ of the
basis matrices. However, as the authors of [924] argued heuristically and backed up with
empirical evidence, if we just pick the first three $M_\zeta^T H_i + H_i M_\zeta$ matrices, or any three
random linear combinations of the form $\sum_{i=1}^{n-r} b_i (M_\zeta^T H_i + H_i M_\zeta)$, and demand that they fall
in $S$, then there is a good chance to find a nontrivial $M_\zeta$ corresponding to a multiplication
by $\zeta$, which can be used to break the $C^{*-}$ scheme.
16.3.127 Remark For a set of public keys from C ∗− , tests [924] show that the above strategy almost
surely eventually recovers the missing r equations and breaks the scheme. The only known
attempted defense is [880].
16.3.128 Remark Given a quadratic polynomial, we can always associate it with a symmetric matrix.
By a rank attack, we mean an attack using the rank of those matrices. There are two types of
rank attacks, attacks that specifically target high rank or low rank. Let Hi be the symmetric
matrix corresponding to the quadratic part of public polynomials.
16.3.129 Remark Since rank attack usually means attacking via finding low rank matrices, some
also call the high rank attack the dual rank attack. The high rank attack first appeared with
[722] where Coppersmith et al. defeated a triangular construction.
16.3.130 Remark The high rank attacks of Goubin-Courtois and Yang-Chen [1339, 3023] work for
“plus”-modified triangular systems; they are also easier to understand than the formulation in
[722]. Against UOV, we might do even better on this attack by using differentials [895].
16.3.131 Remark The first MinRank attack is the Goubin-Courtois version against TTM. Let $r$ be the
smallest rank in linear combinations of central equations, which without loss of generality
we take to be the first central equation itself in TTM. Goubin and Courtois outline how to
find the smallest ranked combination (and hence break the Triangle-Plus-Minus system) in
expected time $O(q^{\lceil m/n \rceil r} m^3)$. Yang and Chen have extended the effectiveness of this attack
[3023] related to the number of distinct kernels of the same rank.
16.3.132 Remark The defeat of the HFE Challenge 1 by Faugère and Joux [1042], a direct solution
of the 80 equations in 80 variables, is not the first serious attempt on HFE systems. That
credit goes to a rank-based attack by Kipnis and Shamir [1739]. The attack proceeds by
moving the problem back to the extension field, where all the underlying structure can be
seen. This is a very natural approach if we intend to exploit the design structure of HFE in
the attack. They transform the problem of finding the secret key into a problem of finding
the minimum rank of linear combinations of certain matrices, which is exactly r (as in
Subsection 16.3.2.3). This is the MinRank problem [467] and is in general exponential, but
can be easy if r is small.
16.3.133 Remark Kipnis and Shamir [1739] suggested using determinants of all (r + 1) × (r + 1)
sub-matrices to derive a huge assortment of equations to solve the problem. To solve this
system, they introduce an idea which they call relinearization, which led to the well-known
XL paper [740]. It has been argued that using a Lazard-Faugère solver on this system of
equations is as effective as the direct attack [737]; however, the situation
is still not very clear due to a recent observation in [1610].
16.3.134 Remark To forge a signature for a UOV scheme as in Subsection 16.3.2.4, one needs to
solve the equation P(W ) = Y .
16.3.135 Remark When o = v as with the original Oil-and-Vinegar, this is fairly easy due to the
attack by Kipnis and Shamir [1738]. The basic idea is to treat each component yi = pi (W )
of the public key P as a bilinear form. Namely, take their associated symmetric matrices
via the symmetric differential as follows:
$M_j^{-1} M_i O = O$;
$H_j^{-1} H_i (S^{-1} O) = S^{-1} O$.
16.3.138 Remark This proposition states that any $H_j^{-1} H_i$ has the common invariant subspace
$S^{-1} O$. Knowing $S^{-1} O$ is sufficient to find an equivalent form for $S$. Kipnis et al. [1737]
claim that the same argument works if $v < o$; even if $v > o$ it can be done in time directly
proportional to $q^{v-o}$, and hence $v - o$ cannot be too small. However, the situation regarding this
claim is also not very clear due to recent observations in [500]. When there are two or three
times more vinegar variables than oil variables the method appears to be secure, despite
the claims of [390].
16.3.3.11 Reconciliation
16.3.139 Remark We could also attempt to find a sequence of changes of basis that would lead to
the inversion of the public map as an improvement to a brute force attack [895]. In the case
of an attack on the UOV scheme with $o$ oil and $v = n - o$ vinegar variables, the attack
becomes a problem of solving $m$ equations in $v$ variables, which could be much easier than
solving $m$ equations in $n$ variables in a direct attack. The reconciliation attack fails with a
probability of approximately $\frac{1}{q-1}$.
16.3.140 Remark The reconciliation attack can be applied to Rainbow systems, which have a
multi-layer structure (Subsection 16.3.2.4). This attack works for all constructions with a UOV
construction in the final stage, including all Rainbow and TTS constructions. This affects
how the currently proposed parameters of Rainbow [895] are selected.
16.3.141 Remark To mount a direct attack, we try to solve the $m$ equations $P(W) = Z$ in the $n$
variables $w_1, \ldots, w_n$. If $m \geq n$, we are (over-)determined. If $m < n$, we are underdetermined.
For most cases ($n$ cannot be too large compared to $m$), we cannot do much more than to
fix values for $n - m$ variables randomly and continue with $m = n$ [738].
16.3.142 Remark Due to the NP-hardness of the MQ problem [1214], the difficulty of solving
“generic” or randomly chosen systems of nonlinear equations is generally conceded. But, it
is very hard to quantify exactly how hard it is to solve a non-generic system. Often many
techniques in algebraic cryptanalysis require solving a system of polynomial equations at the
end for more or less generic systems. So we must solve the system p1 = p2 = · · · = pm = 0,
where each pi is a quadratic polynomial. Coefficients and variables are in the field K = Fq .
16.3.143 Remark The standard methods for solving equations are Buchberger’s algorithm [437]
to compute a Gröbner basis, and its descendants investigated by Lazard’s group [1877].
Macaulay generalized Sylvester’s matrix to multivariate polynomials [1985]. The idea is to
construct a matrix whose rows are from the coefficients of the multiples of the polynomials
of the original system, the columns representing all the monomials up to a given degree.
Lazard [1877] observed that for a large enough degree, ordering the columns according to
a monomial ordering and performing usual row reduction on the matrix is equivalent to
Buchberger’s algorithm.
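The Macaulay matrix construction can be illustrated on a small example. The sketch below works over $\mathbb{F}_2$ and uses the field equations $x_i^2 = x_i$, so that all monomials are squarefree; this simplification, the planted-solution test system and all function names are assumptions made for illustration and are not taken from the text. Columns are ordered with the highest degree first, so that the echelon form eliminates high-degree monomials first.

```python
from itertools import combinations
import random

def monomials(n, max_deg):
    """Squarefree monomials in n variables of degree <= max_deg, highest degree first."""
    mons = []
    for d in range(max_deg, -1, -1):
        mons.extend(frozenset(c) for c in combinations(range(n), d))
    return mons

def times(poly, mon):
    """Multiply a polynomial (a set of monomials) by a monomial, using x_i^2 = x_i."""
    out = set()
    for term in poly:
        out ^= {term | mon}       # coefficients live in GF(2): equal terms cancel
    return out

def macaulay_rows(polys, n, deg):
    """Rows are the products mon * p of degree <= deg, encoded as bitmasks over the columns."""
    cols = monomials(n, deg)
    index = {mon: i for i, mon in enumerate(cols)}
    rows = []
    for p in polys:
        for mon in monomials(n, deg - 2):
            rows.append(sum(1 << index[t] for t in times(p, mon)))
    return cols, rows

def echelon(rows):
    """Gaussian elimination over GF(2); the pivot of a row is its leading (lowest-index) column."""
    pivots = {}
    for r in rows:
        while r:
            low = r & -r
            if low not in pivots:
                pivots[low] = r
                break
            r ^= pivots[low]
    return list(pivots.values())

def evaluate(row, cols, point):
    """Evaluate the polynomial encoded by a row at a 0/1 point."""
    value = 0
    for i, mon in enumerate(cols):
        if row >> i & 1:
            value ^= int(all(point[v] for v in mon))
    return value

def random_system(n, m, solution, rng):
    """m random quadratic polynomials over GF(2) that all vanish at the given solution."""
    quad = monomials(n, 2)[:-1]   # monomials of degree 1 or 2 (constant dropped)
    polys = []
    for _ in range(m):
        p = {mon for mon in quad if rng.random() < 0.5}
        if sum(all(solution[v] for v in mon) for mon in p) % 2:
            p.add(frozenset())    # fix the constant term so that p(solution) = 0
        polys.append(p)
    return polys

rng = random.Random(1)
solution = [1, 0, 1, 1]
polys = random_system(4, 6, solution, rng)
cols, rows = macaulay_rows(polys, 4, 3)
reduced = echelon(rows)
# Reduced rows whose monomials all have degree <= 1 are linear equations in the unknowns.
linear = [r for r in reduced
          if all(len(cols[i]) <= 1 for i in range(len(cols)) if r >> i & 1)]
```

Whether enough linear rows appear to pin down the solution depends on the degree bound and on the ratio of equations to variables; the sketch makes no claim about that threshold.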
16.3.144 Remark Faugère proposed an improved Gröbner bases algorithm called F4 [1040]. A later
version, F5 [1041] made headlines [1042] when it was used for solving HFE Challenge 1 in
2002, but we do not really know what the real F5 is, since no one else, as far as we know,
has repeated any of the many results claimed by F5 . A version of F4 is implemented in the
computer algebra system MAGMA [2798] and is publicly available.
16.3.145 Remark Lazard’s idea was rediscovered in 1999 by Courtois, Klimov, Patarin, and Shamir
[740] as XL. Courtois et al. proposed several adjuncts [736, 742, 743] to XL.
16.3.146 Remark Recently, based on the concept of mutants, much work has been done to improve the
XL algorithms, producing a family of mutant XL algorithms [438, 877, 879, 2114, 2116].
16.3.147 Remark For a system of equations $p_1(x_1, \ldots, x_n) = \cdots = p_m(x_1, \ldots, x_n) = 0$, if we look at
the ideal generated by the $p_i$, then each element $f$ of the ideal can be expressed in the form
\[ f = \sum_{i=1}^{m} f_i p_i \]
with polynomial coefficients $f_i$.
16.3.148 Definition For each such expression, we define the level of this expression to be
$\max\{\deg(f_i) + \deg(p_i) : i = 1, \ldots, m\}$. If $\deg(f)$ is lower than any of the levels of $f$,
then $f$ is a mutant.
16.3.149 Remark The key idea of mutants [877] is that we try to mathematically describe the
degeneration of the systems, which is critical for the solving process to work. Currently, the
best mutant algorithms beat all other algebraic solvers, including F4 .
16.3.150 Remark A key question for understanding the security of the MPKCs is the complexity
of those algebraic solvers. Using generating functions, there are heuristic estimates on the
complexity of solving generic systems [201, 3022]. These works are used to estimate the
complexity of algebraic attack on the HFE systems [1349] based on the concept of degree of
regularity. Recently new breakthroughs have led us to mathematically prove new estimates
for the degree of regularity of HFE [878, 884, 926], which provide theoretical support for
the apparent security benefits of using fields of odd characteristic.
16.3.151 Theorem Let $P$ be an HFE polynomial of degree $D$. If $\mathrm{Rank}(P) > 1$, the degree of regularity
of the associated HFE system is bounded by
\[ \frac{(q-1)\,\mathrm{Rank}(P)}{2} + 2. \]
In particular, this is less than or equal to
\[ \frac{(q-1)(\lfloor \log_q(D-1) \rfloor + 1)}{2} + 2. \]
If $\mathrm{Rank}(P) = 1$, then the degree of regularity is less than or equal to $q$. Here Rank is the
rank of the quadratic form associated to $P$.
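To get a feeling for the size of these bounds, the two formulas can be evaluated directly. The following is a small sketch; the function names are hypothetical and the sample parameters are only illustrative.

```python
from fractions import Fraction

def floor_log(q, x):
    """Largest integer e >= 0 with q**e <= x, for integers q >= 2 and x >= 1."""
    e = 0
    while q ** (e + 1) <= x:
        e += 1
    return e

def dreg_bound_from_rank(q, rank):
    """Bound of the theorem in terms of Rank(P)."""
    if rank > 1:
        return Fraction((q - 1) * rank, 2) + 2
    return Fraction(q)

def dreg_bound_from_degree(q, D):
    """Weaker bound of the theorem in terms of the degree D >= 2 of P."""
    return Fraction((q - 1) * (floor_log(q, D - 1) + 1), 2) + 2

# Sample evaluations: q = 2 with D = 96, and q = 31 with D = 31.
bound_binary = dreg_bound_from_degree(2, 96)
bound_odd = dreg_bound_from_degree(31, 31)
```

Exact rational arithmetic is used so that half-integer bounds are not rounded silently; the degree of regularity itself is an integer, so in practice one would take the floor.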
16.3.152 Remark The concepts of degree of regularity and mutant are also closely related. Mutants
can appear only at a degree equal to or higher than the degree of regularity [472].
16.3.153 Remark What really drives the development of the designs in MPKCs are indeed new
mathematical ideas that bring new mathematical structures and insights in the construction
of MPKCs. The mathematical ideas we have used are just some of the very basic ideas
developed in mathematics and there is great potential in advancing this idea further with
some of the more sophisticated mathematical constructions in algebraic geometry. One
particularly interesting problem would be to make the TTM cryptosystems [2113] work
with the establishment of some new systematic approach. This definitely demands some
deep insights and the usage of some intrinsically combinatorial structures from algebraic
geometry.
16.3.154 Remark Though a lot has been done in studying the efficiency of different attacks, we
still do not fully understand the full potential or the limitations of some of the attack
algorithms. We still need to understand both the theory and practice of how efficiently
general attack algorithms work and how to implement them efficiently. From the theoretical
point of view, to answer these problems, the foundation again lies in modern algebraic
geometry. One critical step would be to prove the maximum rank conjecture postulated
in [853], the theoretical basis used to estimate the complexity of the polynomial solving
algorithms, and the F4 algorithm. Another interesting problem is to mathematically prove
some of the commonly used complexity estimate formulas in [3022].
16.3.155 Remark MPKCs interact more and more with other topics like algebraic attacks. Algebraic
attacks are a very popular research topic in breaking symmetric block ciphers like AES [743]
and stream ciphers [128] and analyzing hash functions [2740]. The origin of such an idea is
from MPKCs, and in particular Patarin’s linearization equation attack method. New ideas
in MPKCs will have much broader applications in the area of algebraic attacks. The idea
of multivariate constructions was also applied to the symmetric constructions [281, 893].
Similar ideas may have further applications in designing stream ciphers and block ciphers.
The theory of functions on a space over a finite field (multivariate functions) will play an
increasingly important role in the unification of the research in all these related areas.
16.3.156 Remark Research in MPKCs has already developed new challenges that need new methods
and ideas. A mutually beneficial interaction between MPKCs and algebraic geometry [748]
will grow rapidly. MPKCs will provide excellent motivation and critical problems for the
development of the theory of functions over finite fields, and new mathematical tools and
insights are critical for MPKCs’ future development.
See Also
References Cited: [67, 128, 201, 230, 280, 281, 390, 437, 438, 467, 472, 499, 500, 603, 604,
605, 722, 736, 737, 738, 739, 740, 742, 743, 748, 853, 860, 876, 877, 878, 879, 880, 881, 882,
883, 884, 885, 886, 887, 888, 889, 890, 891, 892, 894, 893, 895, 896, 924, 925, 926, 1040,
1041, 1042, 1044, 1049, 1094, 1095, 1214, 1266, 1285, 1339, 1340, 1349, 1440, 1461, 1610,
1686, 1687, 1737, 1738, 1739, 1877, 1914, 1985, 2028, 2029, 2113, 2114, 2115, 2116, 2213,
2220, 2226, 2312, 2320, 2321, 2367, 2364, 2365, 2366, 2368, 2369, 2370, 2371, 2387, 2415,
2606, 2613, 2740, 2798, 2822, 2823, 2929, 2930, 2931, 2932, 2998, 3022, 3023, 3024]
16.4.1 Remark The Fq -rational points on an elliptic curve E defined over a finite field Fq form a
finite abelian group according to Subsection 12.2.2; its group order is close to q by Theo-
rem 12.2.46. This group can be used to implement the discrete logarithm based cryptosys-
tems introduced in Subsection 16.1.3.2, as first observed in [1771, 2102].
16.4.2 Remark For reasons of efficiency, elliptic curve cryptosystems are usually implemented
over prime fields Fp or fields F2m of characteristic two. Supersingular curves over fields
F3m of characteristic three have attracted some attention in the context of pairing based
cryptography, see Section 16.4.2.
16.4.3 Remark To resist generic attacks on the discrete logarithm problem, elliptic curve
cryptosystems are implemented in the prime order cyclic subgroup of maximal cardinality $n$
inside $E(\mathbb{F}_q)$. For representing group elements with the minimum number of bits, it is
desirable that the curve order itself be prime. Except for special cases (see Section 16.4.1.3 and
[2535, 2581, 2682]), only generic attacks are known on the elliptic curve discrete logarithm
problem (ECDLP), with a running time on the order of $\sqrt{n}$. A security level of $m$ bits,
corresponding to a symmetric-key cryptosystem (see Section 16.1.2) with $2^m$ keys, thus
requires an order $n$ of $2m$ bits. Extrapolating the theoretical subexponential complexity of
Remark 11.6.38 for factoring or the DLP in finite fields allows one to derive heuristic security
estimates for the corresponding public key cryptosystems of Sections 16.1.3.1 and 16.1.3.2.
Several studies have been carried out in the literature, taking added heuristics on
technological progress into account, see [1276]. They are summarized in the following table; the
figures for the factorization based RSA system essentially carry over to systems based on
discrete logarithms in finite fields; see Remark 11.6.38. The 80 bit security level is a historic
figure.
16.4.4 Remark Some cryptographic primitives (encryption, signatures, etc.) have been adapted
and standardized specifically for elliptic curves. Like other discrete logarithm based systems
(see Section 16.1.3.2), they require a setup of public domain parameters, a cyclic subgroup
$G$ of prime order $n$ of some curve $E(\mathbb{F}_q)$, with a fixed base point $P$ such that $G = \langle P \rangle$.
Moreover, the bit patterns representing elements of $\mathbb{F}_q$ and $E(\mathbb{F}_q)$ need to be agreed upon.
16.4.5 Example (Elliptic Curve Integrated Encryption Scheme, ECIES) This cryptosystem is es-
sentially the same as ElGamal’s, see Example 16.1.27; but the encryption of elements of
G is replaced by symmetric key encryption of arbitrary bit strings with a derived secret
key. So the scheme is hybrid, using symmetric key and public key elements. An additional
message authentication code (MAC) prevents alterations of the encrypted message during
transmission and authenticates its sender. (A MAC is essentially a hash function, see Re-
mark 16.1.23, depending additionally on a symmetric key, and can indeed be constructed
from hash functions; for more details, see [2080, Subsection 9.5.2].)
Besides the domain parameters for the elliptic curve group, the setup comprises a
symmetric key scheme with an encryption function $E_{k_1}$ and inverse decryption function $D_{k_1}$,
using keys $k_1$ of length $\ell_1$ bits; and a message authentication code $M_{k_2}$ using keys $k_2$ of
length $\ell_2$ bits. Party A has the private key $a \in [0, n-1]$ and the related public key $Q = aP$.
To encrypt a message $m \in \{0, 1\}^*$, party B selects a random integer $k \in [0, n-1]$,
computes $R = kP$, $S = kQ$ and $(k_1, k_2) = f(S)$, where $f : G \to \{0, 1\}^{\ell_1} \times \{0, 1\}^{\ell_2}$ is a key
derivation function (for instance, a cryptographic hash function; see Remark 16.1.23). He
computes $c_1 = E_{k_1}(m)$ and $c_2 = M_{k_2}(c_1)$; the ciphertext is $(R, c_1, c_2)$.
To decrypt such a ciphertext, party A recovers $S = aR$ and $(k_1, k_2) = f(S)$. If
$M_{k_2}(c_1) \neq c_2$, she rejects the ciphertext as invalid; otherwise, she obtains the clear text
as $m = D_{k_1}(c_1)$.
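The data flow of the scheme can be traced in a few lines of code. The sketch below is purely illustrative and rests on several assumptions that are not part of the text: a toy curve $y^2 = x^3 + 2x + 2$ over $\mathbb{F}_{17}$ with base point $(5, 1)$ of prime order 19, SHA-256 as key derivation function, a hash-based XOR keystream standing in for $E_{k_1}$ and $D_{k_1}$, and HMAC-SHA256 standing in for $M_{k_2}$. The ephemeral scalar is drawn from $[1, n-1]$ so that $S$ is never the point at infinity.

```python
import hashlib
import hmac
import secrets

# Toy domain parameters (assumption; far too small for any real use).
P_MOD, A = 17, 2            # E: y^2 = x^3 + 2x + 2 over F_17
BASE, N = (5, 1), 19        # base point of prime order 19

def add(P, Q):
    """Affine point addition; None stands for the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if P == Q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def mul(k, P):
    """Double-and-add scalar multiplication."""
    R = None
    while k:
        if k & 1:
            R = add(R, P)
        P = add(P, P)
        k >>= 1
    return R

def kdf(S):
    """Key derivation f(S) = (k1, k2): two 16-byte keys from SHA-256 of the point."""
    digest = hashlib.sha256(f"{S[0]},{S[1]}".encode()).digest()
    return digest[:16], digest[16:]

def stream_xor(key, data):
    """Stand-in for E_k1 and D_k1: XOR with a SHA-256 based keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)

def mac(key, data):
    """Stand-in for M_k2."""
    return hmac.new(key, data, hashlib.sha256).digest()

def encrypt(message, Q):
    k = secrets.randbelow(N - 1) + 1          # ephemeral scalar in [1, n-1]
    R, S = mul(k, BASE), mul(k, Q)
    k1, k2 = kdf(S)
    c1 = stream_xor(k1, message)
    return R, c1, mac(k2, c1)

def decrypt(a, ciphertext):
    R, c1, c2 = ciphertext
    k1, k2 = kdf(mul(a, R))                   # S = aR
    if not hmac.compare_digest(mac(k2, c1), c2):
        raise ValueError("invalid ciphertext")
    return stream_xor(k1, c1)
```

A round trip is `decrypt(a, encrypt(b"message", mul(a, BASE)))` for a private key `a` in $[1, n-1]$; modifying `c1` without the MAC key makes `decrypt` raise.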
16.4.6 Remark The scheme was first described in a generic discrete logarithm setting (and
in a slightly different form) in [221], and standardized under the name Elliptic Curve Aug-
mented Encryption Scheme in [109]. For arguments supporting its security under suitable
assumptions on the underlying primitives, see [221, 2683] and [313, Chapter III].
16.4.7 Example (Elliptic Curve Digital Signature Algorithm, ECDSA) The algorithm is a simple
transposition of the DSA of Section 16.1.3.3 to the elliptic curve setting.
Besides the domain parameters for the elliptic curve group, the setup comprises a hash
function $H : \{0, 1\}^* \to [0, n-1]$ and the reduction function $f : G \to [0, n-1]$, $(x, y) \mapsto x \pmod n$.
Party A has the private key $a \in [0, n-1]$ and the related public key $Q = aP$.
To sign a message $m$, party A randomly selects an integer $k \in [1, n-1]$, computes
$R = kP$, $r = f(R)$, $h = H(m)$, and $s \equiv k^{-1}(h + ar) \pmod n$. The signature is the pair
$(r, s)$.
To verify such a signature, party B computes $h = H(m)$, $w \equiv s^{-1} \pmod n$, $u_1 \equiv wh \pmod n$,
$u_2 \equiv wr \pmod n$, and $R = u_1 P + u_2 Q$. He accepts the signature as valid if and
only if $r = f(R)$.
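The signing and verification equations can be exercised on a toy group. The sketch below assumes the curve $y^2 = x^3 + 2x + 2$ over $\mathbb{F}_{17}$ with base point $(5, 1)$ of prime order 19 and SHA-256 reduced modulo $n$ as the hash $H$; these choices and all function names are illustrative. The retry when $r = 0$ or $s = 0$ and the range check in the verifier are customary additions that the description above does not spell out.

```python
import hashlib
import secrets

# Toy domain parameters (assumption; far too small for any real use).
P_MOD, A = 17, 2            # E: y^2 = x^3 + 2x + 2 over F_17
BASE, N = (5, 1), 19        # base point of prime order 19

def add(P, Q):
    """Affine point addition; None stands for the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if P == Q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def mul(k, P):
    """Double-and-add scalar multiplication."""
    R = None
    while k:
        if k & 1:
            R = add(R, P)
        P = add(P, P)
        k >>= 1
    return R

def hash_to_int(message):
    """H: bit strings -> [0, n-1], here SHA-256 reduced modulo n."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message, a):
    h = hash_to_int(message)
    while True:
        k = secrets.randbelow(N - 1) + 1      # k in [1, n-1]
        r = mul(k, BASE)[0] % N               # r = f(kP)
        s = pow(k, -1, N) * (h + a * r) % N
        if r != 0 and s != 0:                 # retry on degenerate values
            return r, s

def verify(message, signature, Q):
    r, s = signature
    if not (0 < r < N and 0 < s < N):
        return False
    h = hash_to_int(message)
    w = pow(s, -1, N)
    R = add(mul(w * h % N, BASE), mul(w * r % N, Q))
    return R is not None and R[0] % N == r
```

Since $f$ only looks at the $x$-coordinate, replacing $s$ by $-s \bmod n$ negates $R$ and leaves $r$ unchanged, so `verify` also accepts `(r, (-s) % N)`.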
16.4.8 Remark The scheme has been standardized in [108], see also [1133], [1567, Subsections
7.2.7–7.2.8], and [2292, Section 6]. For arguments supporting its security under suitable
assumptions on the underlying primitives, see [313, Chapter II] and [420]. The fact that the
function f depends only on the x-coordinate of its argument has raised doubts about the
security of the scheme [2712]; in particular, it implies weak malleability: From a signature
(r, s) on a given message, another signature (r, −s) on the same message may be obtained.
16.4.9 Remark A necessary condition for the security of an elliptic curve cryptosystem is that the
order of E(Fq ) be prime, or a prime multiplied by a small cofactor. Some special curves
for which this condition is easily tested have been suggested in the literature. These are
more and more deprecated in favor of random curves (see Section 16.4.1.4) in conventional
discrete logarithm settings, [458]. Supersingular and especially CM curves are still needed,
16.4.10 Example (Supersingular curves) The orders of supersingular elliptic curves are known by
Theorem 12.2.51: Over $\mathbb{F}_p$, the only occurring order is $p + 1$. Over $\mathbb{F}_{p^m}$ with $p \in \{2, 3\}$,
the orders $p^m + 1 - t$ with $t \in \{0, \pm p^{m/2}, \pm p^{(m+1)/2}, \pm 2p^{m/2}\}$ may occur depending on the
parity of $m$. The ECDLP on supersingular curves over $\mathbb{F}_{p^m}$ may be reduced to the DLP in
the multiplicative group of $\mathbb{F}_{p^2}$ for curves over $\mathbb{F}_p$; of $\mathbb{F}_{p^m}$, $\mathbb{F}_{p^{2m}}$, $\mathbb{F}_{p^{3m}}$ or $\mathbb{F}_{p^{4m}}$ for curves over $\mathbb{F}_{2^m}$;
and of $\mathbb{F}_{p^m}$, $\mathbb{F}_{p^{2m}}$, $\mathbb{F}_{p^{3m}}$ or $\mathbb{F}_{p^{6m}}$ for curves over $\mathbb{F}_{3^m}$. Thus, supersingular curves are deprecated
except for low security pairing based cryptosystems.
16.4.11 Example (Curves over extension fields) If E is defined over a finite field Fq with q small,
then |E(Fq )| can be obtained by exhaustively enumerating all points; and |E(Fqm )| is easily
computed by Remark 12.2.105. In particular, the case q = 2 has been suggested in the
literature. However, since E(Fqm ) contains the subgroup E(Fq ) (and further subgroups if
m is not prime), the group order cannot be prime anymore.
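Remark 12.2.105 is not reproduced here; the sketch below assumes it refers to the usual trace recurrence $s_0 = 2$, $s_1 = t$, $s_{m+1} = t s_m - q s_{m-1}$ with $t = q + 1 - |E(\mathbb{F}_q)|$ and $|E(\mathbb{F}_{q^m})| = q^m + 1 - s_m$. The function name and the sample curves over $\mathbb{F}_2$ are illustrative.

```python
def extension_orders(q, n1, m_max):
    """Return [|E(F_q)|, |E(F_{q^2})|, ..., |E(F_{q^m_max})|] given n1 = |E(F_q)|."""
    t = q + 1 - n1                  # trace of Frobenius over F_q
    s_prev, s_cur = 2, t            # s_0 and s_1
    orders = []
    for m in range(1, m_max + 1):
        orders.append(q ** m + 1 - s_cur)               # uses s_m
        s_prev, s_cur = s_cur, t * s_cur - q * s_prev   # advance to s_{m+1}
    return orders

# The two curves y^2 + xy = x^3 + a x^2 + 1 over F_2 have 4 points (a = 0)
# and 2 points (a = 1), as a direct enumeration of the points shows.
orders_a0 = extension_orders(2, 4, 163)
orders_a1 = extension_orders(2, 2, 163)
```

Every entry is divisible by the first one, which reflects the subgroup $E(\mathbb{F}_q)$ mentioned above: the best one can hope for is a small cofactor times a prime.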
16.4.12 Remark The existence of the additional Frobenius automorphism of order $m$, together with
the negation automorphism of order 2, may be used to speed up the generic algorithms of
Sections 11.6.6 and 11.6.7 by a factor of $\sqrt{2m}$ [1165, 2978], which reduces the effective
security level.
16.4.13 Remark (Weil descent) If E is defined over an extension field Fqm , then E(Fqm ) can be
embedded into A(Fq ), where A is an abelian variety of dimension m, called the Weil restric-
tion or restriction of scalars of E. There is reason to believe that the discrete logarithm
problem in A(Fq ) may be easier to solve than by a generic algorithm, relying on an ap-
proach of representing the group A(Fq ) by a set of generators (called the factor base) and
relations which are solved by linear algebra, cf. Section 11.6.8, leading to a potential attack
described first in [1104, Section 3.2]. Cases where A contains the Jacobian of a hyperelliptic
curve of genus close to m have been worked out for curves over fields of characteristic 2
in [1160, 1250], and fields of odd characteristic in [852]. So far, the attack has been made
effective for certain curves with prime m ≤ 7.
Another algorithm for discrete logarithms, working directly with curves over Fqm and
specially adapted factor bases, is described in [1247]; heuristically, it is faster than the
generic algorithms for m ≥ 3 fixed and q → ∞. Since it involves expensive Gröbner basis
computations, it has been made effective only for m ≤ 3.
Combinations of these approaches are also possible and have led to an attack on curves
of close to cryptographic size over Fp6 [1629]. Moreover, isogenies may be used to transport
the discrete logarithm problem from a seemingly secure curve to one that may be attacked
by Weil descent [1156].
It thus appears cautious to prefer for cryptographic applications curves over prime fields
Fp or, if even characteristic leads to significant performance improvements, fields F2m of
prime extension degree m.
1. Let $D < 0$, $D \equiv 0$ or $1 \pmod 4$, $p$ prime and $m$ minimal such that $4p^m = t^2 - v^2 D$
has a solution in integers $t$, $v$.
2. Compute the class polynomial $H_D \in \mathbb{Z}[X]$, the minimal polynomial of $j\left(\frac{D + \sqrt{D}}{2}\right)$,
where $j$ is the absolute elliptic modular invariant function.
3. $H_D$ splits completely over $\mathbb{F}_{p^m}$ (and no subfield), and its roots are the j-invariants
of the elliptic curves defined over $\mathbb{F}_{p^m}$ with complex multiplication by $\mathcal{O}_D$. For
each such j-invariant, one easily writes down a curve with $p^m + 1 - t$ points by
solving the expression of j in Definition 12.2.2 for the curve coefficients (up to
isomorphisms and twists, see Section 12.2.5, the solution is unique).
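Step 1 amounts to solving a norm equation, which for small $|D|$ can be done by a direct search over $v$. The following is a minimal sketch with a hypothetical function name; it returns only the solutions with $t, v \geq 0$ and leaves the choice of the minimal $m$ to the caller.

```python
from math import isqrt

def cm_norm_solutions(D, p, m=1):
    """Non-negative integer pairs (t, v) with 4 p^m = t^2 - v^2 D, for D < 0."""
    target = 4 * p ** m
    solutions = []
    v = 0
    while -D * v * v <= target:
        t_squared = target + D * v * v
        t = isqrt(t_squared)
        if t * t == t_squared:
            solutions.append((t, v))
        v += 1
    return solutions

# Example: D = -7 and p = 11 give 44 = 4^2 + 7 * 2^2, so t = 4 up to sign,
# and the CM curves over F_11 have 11 + 1 - 4 = 8 or 11 + 1 + 4 = 16 points.
example = cm_norm_solutions(-7, 11)
```

For realistic sizes one would use Cornacchia's algorithm instead of a linear search; the brute-force loop is only meant to make the equation concrete.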
16.4.15 Remark It is easy to see that a prime number of points is only possible for D ≡ 5 (mod 8).
16.4.16 Remark The degree of the class polynomial is the class number of $\mathcal{O}_D$, and its total bit
size is of the order of $O(|D|^{1+\epsilon})$ under GRH. Several quasi-linear algorithms of complexity
$O(|D|^{1+\epsilon})$ for computing class polynomials have been described in the literature, by floating
point approximations of its roots [978], lifting to a local field [744] or Chinese remaindering
[220]. Nevertheless, the algorithms are restricted to small values of $|D|$, while random curves
correspond to $|D|$ of the order of $q$, so that only a negligible fraction of curves may be reached
by the CM approach.
16.4.17 Remark While no attack on this particular fraction of curves has been devised so far,
random curves are generally preferred where possible; note, however, that pairing-based
cryptosystems require the use of either supersingular curves or ordinary curves obtained
with the CM approach; see Section 16.4.2.4.
16.4.18 Example (NIST curves) The USA standard [2292] suggests a prime field $\mathbb{F}_p$ and a
pseudorandom curve (assuming that the hash function SHA-1 is secure) of prime order over $\mathbb{F}_p$
for $p$ of 192, 224, 256, 384, and 521 bits. (The largest example is for the Mersenne prime
$p = 2^{521} - 1$.) For the binary fields $\mathbb{F}_{2^{163}}$, $\mathbb{F}_{2^{233}}$, $\mathbb{F}_{2^{283}}$, $\mathbb{F}_{2^{409}}$ and $\mathbb{F}_{2^{571}}$, a pseudorandom
curve (of order twice a prime) and a curve defined over $\mathbb{F}_2$ (of order twice or four times a
prime) are given. As recommended in Remark 16.4.13, the extension degrees are prime for
curves defined over $\mathbb{F}_2$.
16.4.19 Remark We note that the generic discrete logarithm algorithms of Subsections 11.6.6
and 11.6.7 allow for a trade-off between precomputations and the breaking of a given
discrete logarithm: In a group of size about $2^m$, a precomputation of $2^k$ group elements
yields additional logarithms in time $2^{m-k}$. As a precaution, one may thus wish to avoid
predetermined curves, especially at lower security levels.
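The trade-off is easy to observe with a baby-step giant-step search whose table size is a free parameter. The sketch below uses the multiplicative group modulo the prime 1019, generated by 2, as a stand-in for an elliptic curve group; this substitution and the function names are assumptions for illustration. The table size must not exceed the group order.

```python
def precompute(g, p, table_size):
    """Store the first table_size powers of g (the precomputation, reusable for any target)."""
    table, x = {}, 1
    for j in range(table_size):
        table[x] = j
        x = x * g % p
    return table

def discrete_log(h, g, p, order, table):
    """Return (log_g h, number of giant steps taken) using a precomputed table."""
    size = len(table)
    giant = pow(g, -size, p)        # g^(-size) mod p
    y, steps = h % p, 0
    while steps * size < order:
        if y in table:
            return steps * size + table[y], steps
        y = y * giant % p
        steps += 1
    return None, steps

P, G, ORDER = 1019, 2, 1018         # 2 generates the multiplicative group mod 1019
small_table = precompute(G, P, 8)
large_table = precompute(G, P, 64)
target = pow(G, 777, P)
log_small, steps_small = discrete_log(target, G, P, ORDER, small_table)
log_large, steps_large = discrete_log(target, G, P, ORDER, large_table)
```

Multiplying the table size by 8 divides the number of online steps by about 8, which mirrors the $2^k$ versus $2^{m-k}$ relation stated above.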
16.4.20 Remark Algorithms for counting points on random elliptic curves currently come in two
flavours. The first algorithm, SEA, is of polynomial complexity; for curves over extension
fields Fpm , there are a variety of algorithms using p-adic numbers, with a much better
polynomial exponent in m, but which are exponential in log p.
16.4.21 Algorithm (Schoof) In [2560], Schoof describes the first algorithm of complexity polynomial
in $\log q$ for counting the number of points on an arbitrary elliptic curve $E(\mathbb{F}_q)$. The algorithm
is deterministic and computes the trace of Frobenius $a_q$ of Definition 12.2.47 and thus the
zeta function of Section 12.2.10. Given a prime $\ell$ not dividing $q$, by Theorem 12.2.66 the
value of $a_q$ modulo $\ell$ can be determined by checking for all possible values whether the
numerator of the zeta function annihilates the $\ell$-torsion points. Chinese remaindering for
sufficiently many primes yields the exact value of $a_q$, which is bounded by Theorem 12.2.46.
The algorithm has a complexity of $O((\log q)^{5+\epsilon})$, due in part to the fact that the $\ell$-torsion
points generate an $\mathbb{F}_q$-algebra of dimension $O(\ell^2)$.
16.4.22 Algorithm (Schoof–Elkies–Atkin, SEA) Improvements are due to Atkin and Elkies [970].
When there is an $\mathbb{F}_q$-rational separable isogeny (see Definition 12.2.26) of degree $\ell$ from
$E(\mathbb{F}_q)$ to another curve, then the $\ell$-torsion points may be replaced by the kernel of the
isogeny, generating an algebra of dimension $O(\ell)$ over $\mathbb{F}_q$. By the complex multiplication
theory of Example 16.4.14, this happens when $\ell$ is coprime to the conductor of the ring of
endomorphisms $\mathcal{O}_D$ of $E$ and $\ell$ is not inert in the quadratic number field $\mathbb{Q}(\sqrt{D})$, which
holds for about half of the primes. The complexity of the algorithm becomes $O((\log q)^{4+\epsilon})$
[312, Chapter VII], [661, Section 17.2].
16.4.23 Remark The practical bottleneck of the algorithm used to be the computation of bivariate
modular polynomials, of size $O(\ell^{3+\epsilon})$, needed to derive isogenies of degree $\ell$. A quasi-linear
algorithm is described in [979]; eventually limited by space, it has been used for $\ell$ up to
around 10000. A more recent algorithm [2752] computes the polynomial, reduced modulo
the characteristic $p$ of $\mathbb{F}_q$ and instantiated in one variable by an element of $\mathbb{F}_q$, also in
time $O(\ell^{3+\epsilon})$, but in space $O(\ell(\ell + \log q))$; it has been used for $\ell$ up to about 100000.
Further building blocks of the SEA algorithm have also been optimized [366, 1251, 2098].
The current record is for a prime field $\mathbb{F}_p$ with $p$ having about 5000 decimal digits [2751].
16.4.24 Remark The SEA algorithm is implemented in several major computer algebra systems,
and random elliptic curves of cryptographic size with a prime number of points are easily
found, be it as domain parameters, be it in a setting where each user has his own elliptic
curve as part of his public key.
16.4.25 Algorithm (p-adic point counting) For an elliptic curve $E$ over an extension field $\mathbb{F}_{p^m}$,
Satoh [2533] introduced an algorithm computing its canonical lift to a curve $\hat{E}$ over $\mathbb{Q}_{p^m}$,
the unramified extension of degree $m$ of the p-adic numbers $\mathbb{Q}_p$. The curve $\hat{E}$ has the
same endomorphism ring $\mathcal{O}_D$ (Example 16.4.14) as $E$ and reduces modulo the maximal
ideal of $\mathbb{Q}_{p^m}$ to $E$. More precisely, an approximation to $\hat{E}$ may be computed by Newton
iterations on a function derived from the modular polynomial of level $p$, Algorithm 16.4.21,
at arbitrary p-adic precision. In a second step, the trace of the Frobenius map is computed
in this characteristic 0 setting by the action of its dual isogeny (the reduction of which is
separable) on a holomorphic differential; for this, the isogenies are computed explicitly. After
a precomputation of $O(p^{3+\epsilon})$ for the p-th modular polynomial (see Algorithm 16.4.21), the
complexity of the algorithm is $O(p^2 m^{3+\epsilon})$.
16.4.26 Remark Satoh’s algorithm is not immediately applicable in characteristic two. Mestre
suggests in [2088] to use arithmetic-geometric mean (AGM) iterations, a sequence of isogenies
of degree 2, to obtain the canonical lift and the trace of the Frobenius map, also in time
$O(m^{3+\epsilon})$.
16.4.27 Remark Later work concentrates on lowering the complexity in $m$: to quasi-quadratic for
finite fields $\mathbb{F}_{q^m}$ with a Gaussian normal basis [1904] or in the general case [1422]; or on
lowering the complexity in $p$: to quasi-linear [1248] or even quasi-square root [1429]. The
record in [1904] for a curve over $\mathbb{F}_{2^{100002}}$ goes beyond all practical cryptographic needs.
16.4.28 Remark For a more thorough account, see [313, Chapter VI] or [661, Section 17.3].
16.4.30 Definition Let $E$ be an elliptic curve defined over a finite field $\mathbb{F}_q$, and let $n$ be the largest
prime divisor of the cardinality of $E(\mathbb{F}_q)$. Assume that $n$ does not divide $q$. (This is
required in a cryptographic setting due to anomalous curves.) Then the embedding degree
of $E$ is the smallest integer $k$ such that $E(\mathbb{F}_{q^k})$ contains $E(\overline{\mathbb{F}}_q)[n]$, the $n^2$ points of
$n$-torsion of $E(\overline{\mathbb{F}}_q)$; see Theorem 12.2.60, i.e., $k$ is minimal such that $E(\mathbb{F}_{q^k})[n] = E(\overline{\mathbb{F}}_q)[n]$.
16.4.31 Theorem [183, Theorem 1] If $n$ does not divide $q - 1$, then the embedding degree is the
smallest integer $k$ such that $n$ divides $q^k - 1$.
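Under the hypothesis of the theorem, the embedding degree is simply the multiplicative order of $q$ modulo $n$ and can be computed by repeated multiplication. The following is a short sketch with a hypothetical function name; the sample parameters are toy values chosen for illustration.

```python
def embedding_degree(q, n):
    """Smallest k >= 1 with n | q^k - 1, assuming n is prime and divides neither q nor q - 1."""
    if q % n == 0 or (q - 1) % n == 0:
        raise ValueError("the theorem requires that n divides neither q nor q - 1")
    k, power = 1, q % n
    while power != 1:
        power = power * q % n
        k += 1
    return k

# A curve over F_59 with 60 = 59 + 1 points (the supersingular group order)
# and n = 5 has embedding degree 2, since 5 divides 59^2 - 1.
k_supersingular = embedding_degree(59, 5)
# A group of prime order 19 on a curve over F_17 gives k = 9.
k_toy = embedding_degree(17, 19)
```

For random curves of cryptographic size the order of $q$ modulo $n$ is typically of the same magnitude as $n$, which is why pairing-friendly curves have to be constructed specially.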
16.4.33 Remark In this setting, $G_1 = E(\mathbb{F}_q)[n]$ and $G_3$ are in fact fixed, while there are $n + 1$
possible choices for $G_2$; see Subsection 16.4.2.3. Diagonalizing the matrix of the Frobenius
endomorphism on $E(\overline{\mathbb{F}}_q)[n]$ by Theorem 12.2.66 yields a mathematically canonical choice
also for $G_2$, which is given by the following theorems.
16.4.34 Theorem G1 is the subgroup of E(Fqk )[n] generated by the points having eigenvalue 1
under the Frobenius endomorphism φq of Example 12.2.31. There is a unique subgroup
G2 ⊆ E(Fqk ) of order n generated by the points having eigenvalue q under the Frobenius
endomorphism.
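16.4.35 Theorem Let $\mathrm{Tr} : E(\overline{\mathbb{F}}_q) \to E(\overline{\mathbb{F}}_q)$, $P \mapsto \sum_{i=0}^{k-1} \phi_q^i(P)$, denote the trace endomorphism of
level $k$ on $E$. Then the endomorphisms $\mathrm{Tr}$ and $\pi_2 = \mathrm{id} - \phi_q$, restricted to $E(\mathbb{F}_{q^k})[n]$, yield
surjective group homomorphisms $\mathrm{Tr} : E(\mathbb{F}_{q^k})[n] \to G_1$ with kernel $G_2$ and $\pi_2 : E(\mathbb{F}_{q^k})[n] \to G_2$
with kernel $G_1$.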
16.4.36 Definition For a point $P$ on $E$ defined over some extension field $\mathbb{F}_{q^m}$ and an integer $r$, let
$f_{r,P}$ be the function with divisor $r(P) - (rP) - (r-1)(O)$ that is defined over $\mathbb{F}_{q^m}$ and
has leading coefficient 1 in $O$, see Definitions 12.1.21 and 12.1.23.
For finite points $R$ and $S \neq -R$, denote by $v_R = x - x(R)$ the line with divisor
$(R) + (-R) - 2(O)$ and by $\ell_{R,S} = (y - y(R)) - \lambda_{R,S}(x - x(R))$ the line with divisor
$(R) + (S) + (-R-S) - 3(O)$, where
\[ \lambda_{R,S} = \begin{cases} \dfrac{y(S) - y(R)}{x(S) - x(R)} & \text{if } R \neq S, \\[2ex] \dfrac{3x(R)^2 + 2a_2 x(R) + a_4 - a_1 y(R)}{2y(R) + a_1 x(R) + a_3} & \text{if } R = S. \end{cases} \]
16.4.38 Algorithm
Require: A point P on E and an integer r
Ensure: f_{r,P} = L/V, where L and V are given as products of lines
Compute an addition-negation chain r_1, . . . , r_s for r.
P_1 ← P, L_1 ← 1, V_1 ← 1
for i = 2, . . . , s do
    j ← j(i), k ← k(i)
    if r_i = −r_j then
        P_i ← −P_j
        L_i ← V_j
        V_i ← L_j v_{P_i}
    else
        P_i ← P_j + P_k
        L_i ← L_j L_k ℓ_{P_j,P_k}
        V_i ← V_j V_k v_{P_i}
    end if
end for
return L = L_s, V = V_s
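The loop of Algorithm 16.4.38 can be sketched in Python on a toy example. The sketch below makes several assumptions that are not part of the text: it uses the binary double-and-add chain as addition chain, the supersingular curve y^2 = x^3 + x over F_59 with n = 5 and k = 2, the distortion map (x, y) ↦ (−x, iy) with i^2 = −1, and it drops all vertical lines v, which take values in F_p here and disappear in the final exponentiation of the reduced Tate pairing. All function names are ours, and the parameters are far too small for any cryptographic use.

```python
# Toy parameters (assumptions): E: y^2 = x^3 + x over F_p, p = 59 = 3 (mod 4),
# so E is supersingular with p + 1 = 60 points; n = 5, embedding degree k = 2.
# F_{p^2} = F_p(i) with i^2 = -1; elements are pairs (a, b) meaning a + b*i.
p, n = 59, 5

def f2_mul(u, v):
    return ((u[0] * v[0] - u[1] * v[1]) % p, (u[0] * v[1] + u[1] * v[0]) % p)

def f2_pow(u, e):
    r = (1, 0)
    while e:
        if e & 1:
            r = f2_mul(r, u)
        u = f2_mul(u, u)
        e >>= 1
    return r

def slope(R, S):
    """lambda_{R,S} for a1 = a2 = a3 = 0 and a4 = 1 (R and S finite, S != -R)."""
    if R == S:
        return (3 * R[0] * R[0] + 1) * pow(2 * R[1] % p, -1, p) % p
    return (S[1] - R[1]) * pow((S[0] - R[0]) % p, -1, p) % p

def add(R, S):
    """Affine addition on E(F_p); None stands for the point at infinity O."""
    if R is None:
        return S
    if S is None:
        return R
    if R[0] == S[0] and (R[1] + S[1]) % p == 0:
        return None
    lam = slope(R, S)
    x = (lam * lam - R[0] - S[0]) % p
    return (x, (lam * (R[0] - x) - R[1]) % p)

def mul(m, R):
    Q = None
    while m:
        if m & 1:
            Q = add(Q, R)
        R = add(R, R)
        m >>= 1
    return Q

def line(R, S, Q):
    """Value of l_{R,S} at psi(Q) = (-x_Q, i*y_Q); vertical lines are replaced by 1."""
    if R[0] == S[0] and (R[1] + S[1]) % p == 0:
        return (1, 0)            # value would lie in F_p: removed by the final exponentiation
    lam = slope(R, S)
    xq = -Q[0] % p
    return ((-R[1] - lam * (xq - R[0])) % p, Q[1] % p)

def pairing(P, Q):
    """Reduced Tate pairing of P and psi(Q), computed with a double-and-add loop."""
    f, T = (1, 0), P
    for bit in bin(n)[3:]:       # bits of n after the leading one
        f = f2_mul(f2_mul(f, f), line(T, T, Q))
        T = add(T, T)
        if bit == "1":
            f = f2_mul(f, line(T, P, Q))
            T = add(T, P)
    return f2_pow(f, (p * p - 1) // n)

def point_of_order_n():
    for x in range(1, p):
        rhs = (x ** 3 + x) % p
        y = pow(rhs, (p + 1) // 4, p)    # square root candidate, valid since p = 3 (mod 4)
        if y * y % p == rhs:
            P = mul((p + 1) // n, (x, y))
            if P is not None:
                return P

P = point_of_order_n()
e = pairing(P, P)
print(P, e)
```

In this toy setting one can check bilinearity directly, for instance by comparing `pairing(mul(2, P), P)` with `f2_mul(e, e)`.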
16.4.39 Example The Weil pairing $e_n$ of Theorem 12.2.70 is a cryptographic pairing as long as
$G_2 \neq G_1$. If P, $Q \in E(\mathbb{F}_{q^k})[n]$, then
\[ e_n(P, Q) = (-1)^n \frac{f_{n,P}(Q)}{f_{n,Q}(P)}. \]
16.4.40 Example Assume that $E(\mathbb{F}_{q^k})$ does not contain a point of order $n^2$, or, equivalently, that $n^3$
does not divide $|E(\mathbb{F}_{q^k})|$. Then the map $E(\mathbb{F}_{q^k})[n] \to E(\mathbb{F}_{q^k})/nE(\mathbb{F}_{q^k})$, $Q \mapsto Q + nE(\mathbb{F}_{q^k})$,
is a group isomorphism, and the Tate pairing T of Theorem 12.2.75 yields a non-degenerate
pairing
\[ e'_T : E(\mathbb{F}_{q^k})[n] \times E(\mathbb{F}_{q^k})[n] \to \mathbb{F}_{q^k}^* / (\mathbb{F}_{q^k}^*)^n, \quad (P, Q) \mapsto T(P, Q + nE(\mathbb{F}_{q^k})). \]
Since $e'_T|_{G_1 \times G_1}$ takes values in $\mathbb{F}_q^* / (\mathbb{F}_q^*)^n = \{1\}$, the restriction $e'_T|_{G_1 \times G_2}$ is non-degenerate
for any $G_2 \neq G_1$.
The reduced Tate pairing
\[ e_T : G_1 \times G_2 \to G_3, \quad (P, Q) \mapsto T(P, Q + nE(\mathbb{F}_{q^k}))^{(q^k - 1)/n}, \]
16.4.41 Remark We observe that during the computation of the reduced Tate pairing by
Algorithm 16.4.38, all factors lying in a proper subfield of $\mathbb{F}_{q^k}$ may be omitted due to the final
exponentiation. In particular, if the x-coordinate of Q lies in a proper subfield, then all $v_{P_i}$ may be
dropped, a technique known as denominator elimination; see Remark 16.4.51.
16.4.43 Remark Since G1 is invariant under the Frobenius φq , but G2 is not, the endomorphisms
ψ and φq cannot commute. So the existence of ψ implies that E is supersingular; see
Theorems 12.2.82 and 12.2.87. Conversely, for supersingular curves, there are distortion
maps $\psi : G_1 \to G_2$ for any $G_2 \neq G_1$ [2869, Theorem 5].
16.4.44 Example Let E be a supersingular curve with distortion map ψ and $G_2 = \psi(G_1)$. Let
$e : G_1 \times G_2 \to G_3$ be a cryptographic pairing. Then
\[ G_1 \times G_1 \to G_3, \quad (P, Q) \mapsto e(P, \psi(Q)), \]
is a cryptographic pairing in which both arguments come from the same group $G_1$. This
setting is sometimes called a symmetric pairing in the literature, although it does not in
general satisfy e(P, Q) = e(Q, P); see also Section 16.4.2.3.
16.4.45 Remark Further work has produced a variety of pairings with a shorter loop in
Algorithm 16.4.38, that is, defined by some function $f_{r,P}$ with r < n. In general, this is obtained
by choosing special curves and restricting to the subgroups G1 and G2 of Theorem 16.4.34.
Since all involved groups are cyclic, such pairings are necessarily powers of the Tate pairing.
16.4.46 Example (Eta pairing) Let E be a supersingular curve with even k = 2a and distortion
map ψ as in Definition 16.4.42. Let T = t − 1, where t is the trace of the Frobenius map,
see Definition 12.2.47. Then $T \equiv q \pmod{n}$ and $n \mid (T^a + 1)$. Assume that $n^2 \nmid (T^a + 1)$.
Then the map
\[ G_1 \times G_1 \to G_3, \quad (P, Q) \mapsto f_{T,P}(\psi(Q))^{a T^{a-1} \frac{q^k - 1}{n}}, \]
is a cryptographic pairing. For a proof, see [203, Section 4] and [1494, Section III]. By
Example 16.4.62, only curves over fields of characteristic two or three may satisfy the assumptions
of the theorem. Notice that T is of order $\sqrt{q}$ by Theorem 12.2.46, so that in the best case
ρ ≈ 1 (see Definition 16.4.66) the loop length in Algorithm 16.4.38 is reduced by a factor
of about 2, while the final exponentiation becomes more expensive.
16.4.47 Example (Ate pairing) Let T = t − 1, where t is the trace of the Frobenius map, and assume
that $n^2 \nmid (T^k - 1)$. Then the map
\[ G_2 \times G_1 \to G_3, \quad (Q, P) \mapsto f_{T,Q}(P)^{\frac{q^k - 1}{n}}, \]
is a cryptographic pairing [1494]. Notice that the roles of G1 and G2 are inverted compared to
the reduced Tate pairing of Example 16.4.40. Thus, as a price to pay for the loop shortening
in Algorithm 16.4.38, the number of operations in G2 and thus in $\mathbb{F}_{q^k}$ increases.
16.4.48 Conjecture (Optimal ate pairing) A loop length of essentially $\frac{\log_2 n}{\phi(k)}$, where ϕ is Euler's
function, may be obtained for a pairing of the previous type, for instance via a product of
functions $f_{c_i,Q}(P)^{q^i \frac{q^k - 1}{n}}$ with $\sum \log_2 c_i$ of the desired magnitude; concrete instances have
been obtained using lattice reduction [1492, 2868].
16.4.49 Theorem Assume that E is defined over the field $\mathbb{F}_q$ of characteristic at least 5 and that
$d \in \{2, 3, 4, 6\}$ is such that $d \mid \gcd(k, \#\operatorname{Aut}(E))$. By Proposition 12.2.59, there is, besides E
itself and up to equivalence, precisely one twist $E'$ of degree d such that $n \mid \#E'(\mathbb{F}_{q^{k/d}})$. As
can be seen from Proposition 12.2.57, there is an isomorphism $\phi : E' \to E$ which is defined
over $\mathbb{F}_{q^d}$. The subgroup $G'_2$ of order n of $E'(\mathbb{F}_{q^{k/d}})$ satisfies $\phi(G'_2) = G_2$.
16.4.50 Remark Theorem 16.4.49 implies that in the presence of twists, elements of $G_2$ are more
compactly represented by elements of $G'_2$; in other words, any cryptographic pairing
$e : G_1 \times G_2 \to G_3$ yields an equivalent cryptographic pairing $e' : G_1 \times G'_2 \to G_3$,
$(P, Q') \mapsto e(P, \phi(Q'))$.
16.4.51 Remark Theorem 16.4.49 and the explicit form of ϕ given in Proposition 12.2.57 show that
the x-coordinates of elements in $G_2$ lie in $\mathbb{F}_{q^{k/2}}$ for d even and that the y-coordinates lie
in $\mathbb{F}_{q^{k/3}}$ when $3 \mid d$. This may allow for simplifications of Algorithm 16.4.38 in conjunction
with the final exponentiation; see Remark 16.4.41.
16.4.52 Example (Twisted ate pairing) Under the hypotheses of Theorem 16.4.49, let T = t − 1,
where t is the trace of the Frobenius map, and assume that $n^2 \nmid (T^k - 1)$. Then the map
\[ G_1 \times G_2 \to G_3, \quad (P, Q) \mapsto f_{T^{k/d},P}(Q)^{\frac{q^k - 1}{n}}, \]
is a cryptographic pairing [1494]. Here, the roles of G1 and G2 are again as in the reduced
Tate pairing of Example 16.4.40. However, compared to the ate pairing of Example 16.4.47,
the loop length in Algorithm 16.4.38 is increased by a factor of $\frac{k}{d}$. Unless t is smaller than
generically expected, the twisted ate pairing is in fact less efficient to compute than the
reduced Tate pairing.
16.4.53 Remark For the sake of giving security arguments for pairing based systems, the crypto-
logic literature has taken to distinguishing pairings according to the possibility of moving
efficiently between the groups G1 and G2 . For instance, if G1 = G2 , then the decisional
Diffie-Hellman problem is easy in G1 : Given P , aP , bP and R ∈ G1 , one has R = abP if
and only if e(P, R) = e(aP, bP ).
16.4.55 Remark We note that since G1 and G2 are cyclic of the same order n, they are trivially
isomorphic; but exhibiting an effective isomorphism may require computing discrete
logarithms. In general, an efficiently computable isomorphism will be given by an endomorphism
of the elliptic curve.
16.4.56 Example The pairing of Example 16.4.44 on supersingular curves with distortion map is of
type 1. Any pairing with G2 different from both G1 and the canonical subgroup of
Theorem 16.4.34 is of type 2: The isomorphism is given by the trace map Tr of Theorem 16.4.35.
To the best of our knowledge, pairings in which G2 is the canonical subgroup of
Theorem 16.4.34 are of type 3; at least the trace is trivial on this subgroup.
16.4.57 Remark The terminology type 4 has been used for pairings in which the second argument
comes from the full n-torsion group; in this case, G2 can be seen as the group generated by
this argument, which may vary with each use of the cryptographic primitive. As it is then
unlikely that G2 equals G1 or the canonical subgroup of Theorem 16.4.34, a type 4 pairing
essentially behaves as a type 2 pairing.
16.4.58 Remark Type 1 pairings, being restricted to supersingular curves, offer a very limited choice
of embedding degrees, see Example 16.4.62. Type 2 pairings are sometimes preferred in the
cryptographic literature since they appear to facilitate certain security arguments. On the
other hand, the existence of ψ implies that the decisional Diffie-Hellman problem is easy in
G2, and it is apparently not possible to hash into any subgroup G2 different from G1 and
from the canonical subgroup of Theorem 16.4.34; see Subsection 16.4.2.5. Recent work
introduces a heuristic construction to transform a cryptographic primitive in the type 2
setting, together with its security argument, into an equivalent type 3 primitive [597].
16.4.59 Remark Some cryptographic primitives have been formulated with a pairing on subgroups
of composite order n. More precisely, n is the product of two primes that are unknown to
the general public, but form part of the private key as in the RSA system of Section 16.1.3.1.
Such pairings can be realized either with supersingular curves [345, Section 2.1] or using
Algorithm 16.4.64; the former leads to a ρ-value (Definition 16.4.66) at least 2, the latter to
a ρ-value close to 2. There is a heuristic approach to transform such cryptosystems, together
with their security proofs, into the setting of prime order subgroups [1098].
16.4.61 Remark Thus to balance the difficulty of the discrete logarithm problems in the elliptic
curve groups G1 and G2 over $\mathbb{F}_q$ and $G_3 \subseteq \mathbb{F}_{q^k}^*$, the embedding degree k should be chosen
according to the security equivalences in Section 16.4.1.1. For instance, if one follows the
recommendations of [2684], for a system of 256 bit security one would choose $n \approx 2^{512}$ and
thus $q \approx 2^{512}$, and $k \approx \frac{15425}{512} \approx 30$. Since by Theorem 16.4.31 the embedding degree k
equals the order of q in $\mathbb{F}_n^*$, it will be close to n for random curves. Hence one needs special
constructions to obtain pairing-friendly curves, that is, curves with a prescribed, small value of k.
For a comprehensive survey, see [1101].
16.4.62 Example (Supersingular curves) As first noticed in [2079], the embedding degree is always
exceptionally small for supersingular curves. The following table gives the possible
cardinalities according to Theorem 12.2.51, the maximal size n of a cyclic subgroup by [2497]
and the embedding degree k with respect to n.
    $|E(\mathbb{F}_q)|$              n                          k
    $q + 1$                          $q + 1$                    2
    $q + 1 \pm \sqrt{q}$             $q + 1 \pm \sqrt{q}$       3
    $q + 1 \pm \sqrt{2q}$            $q + 1 \pm \sqrt{2q}$      4
    $q + 1 \pm \sqrt{3q}$            $q + 1 \pm \sqrt{3q}$      6
    $q + 1 \pm 2\sqrt{q}$            $\sqrt{q} \pm 1$           1
16.4.63 Remark All algorithms for finding ordinary pairing-friendly curves rely on complex multi-
plication constructions, cf. Example 16.4.14, and construct curves over prime fields only.
16.4.64 Algorithm A very general method is due to Cocks and Pinch [1101, Section 4.1]. It allows
fixing the desired group order n beforehand; choosing a low Hamming weight in the binary
decomposition of n or more generally a value of n with a short addition-subtraction chain
speeds up Algorithm 16.4.38.
Require: An integer k ≥ 2, a quadratic discriminant D < 0 and a prime n such that
    $k \mid (n - 1)$ and the Legendre symbol $\left(\frac{D}{n}\right) = 1$
Ensure: A prime p and an elliptic curve $E(\mathbb{F}_p)$ (with complex multiplication by $\mathcal{O}_D$)
    having a subgroup of order n and embedding degree k
repeat
    ζ ← an integer such that ζ modulo n is a primitive k-th root of unity in $\mathbb{F}_n^*$
    t ← ζ + 1
    v ← an integer such that $v \equiv \frac{t - 2}{\sqrt{D}} \pmod{n}$
    p ← $\frac{t^2 - v^2 D}{4}$
until p is an integer and prime
Then $p \equiv t - 1 \pmod{n}$
Construct the curve E over $\mathbb{F}_p$ with p + 1 − t points as in Example 16.4.14
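The parameter search of Algorithm 16.4.64 can be sketched in Python for toy sizes. This is a minimal illustration under our own assumptions: primality is tested by trial division, the square root of D modulo n is found by brute force, and instead of drawing new values of ζ the sketch fixes one ζ and varies the integer lifts t + i·n and v + j·n, which is one possible way of realizing the repeat loop. The curve construction by complex multiplication is not included, and the function names are ours.

```python
def is_prime(m):
    """Trial division; adequate only for the toy sizes used here."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def order_mod(a, n):
    """Multiplicative order of a modulo the prime n (a not divisible by n)."""
    k, x = 1, a % n
    while x != 1:
        x = x * a % n
        k += 1
    return k

def cocks_pinch(k, D, n, bound=20):
    """Return (p, t) with p prime, n | p + 1 - t and p of order k modulo n, or None."""
    assert is_prime(n) and (n - 1) % k == 0
    zeta = next(z for z in range(2, n) if order_mod(z, n) == k)   # primitive k-th root of unity
    s = next(r for r in range(1, n) if (r * r - D) % n == 0)      # square root of D modulo n
    t0 = zeta + 1
    v0 = (t0 - 2) * pow(s, -1, n) % n
    for i in range(bound):
        for j in range(bound):
            t, v = t0 + i * n, v0 + j * n
            numerator = t * t - D * v * v
            if numerator % 4 == 0 and is_prime(numerator // 4):
                return numerator // 4, t
    return None

p, t = cocks_pinch(6, -3, 13)
print(p, t, p + 1 - t)
```

With k = 6, D = −3 and n = 13, the first candidate already succeeds: ζ = 4, t = 5, v = 7 and p = (25 + 3 · 49)/4 = 43, so that the curve has p + 1 − t = 39 = 3 · 13 points.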
16.4.65 Remark Generically, in this construction t and v will be close to n, so that p will be close
to $n^2$. This motivates the following definition.
\[ \rho = \frac{\log p}{\log n}. \]
16.4.67 Remark By Theorem 12.2.46, the superior limit of ρ is at least 1 for p → ∞. Values of ρ
larger than 1 result in a loss of bandwidth when transmitting elements of G1 , which is a
log2 n-bit subgroup embedded into a ρ log2 n-bit group, and a less efficient arithmetic in the
elliptic curve. The security equivalences of Section 16.4.1.1 do not in fact fix the value of k,
but that of ρk; so different values of k may lead to comparable security levels.
16.4.68 Remark Further research has concentrated on finding families of pairing-friendly curves,
the parameters of which are given by values of polynomials.
16.4.69 Algorithm [413] The following is a direct transcription of Algorithm 16.4.64 to polynomials.
Require: An integer k ≥ 2 and a quadratic discriminant D < 0
Ensure: Polynomials p and $n \in \mathbb{Q}[x]$ such that if the values $p(x_0)$ and $n(x_0)$ are
    simultaneously prime integers, then there is an elliptic curve $E(\mathbb{F}_{p(x_0)})$ (with complex
    multiplication by $\mathcal{O}_D$) having a subgroup of order $n(x_0)$ and embedding degree k
n ← an irreducible polynomial in $\mathbb{Q}[x]$ such that the number field $K = \mathbb{Q}[x]/(n)$
    contains $\sqrt{D}$ and a primitive k-th root of unity
z ← a polynomial in $\mathbb{Q}[x]$ that reduces to a primitive k-th root of unity ζ in K
t ← z + 1
v ← a polynomial in $\mathbb{Q}[x]$ that reduces to the element $\frac{\zeta - 1}{\sqrt{D}}$ in K
s ← a polynomial in $\mathbb{Q}[x]$ that reduces to $\sqrt{D}$ in K
v ← $\frac{(z - 1)s}{D} \bmod n$
p ← $\frac{t^2 - D v^2}{4}$
16.4.70 Remark The polynomials p and n need not represent primes or even integers; choosing
small values of |D|, and n such that $z, s \in \mathbb{Z}[x]$ may help. Let d = deg(n) be the degree of
K. While it is always possible to choose n such that either z or v is of low degree (as low
as 1 if n is the minimal polynomial of the corresponding algebraic number), it is a priori not
clear whether both can be chosen of low degree. Generically, p is of degree 2(d − 1), and the
asymptotic ρ-value of the family is $2 - \frac{2}{d}$, a small improvement over Algorithm 16.4.64. In
many cases, however, actual ρ-values are much closer to 1, as demonstrated by the following
example.
16.4.71 Example [413, p. 137] Let k be odd, D = −4 and $K = \mathbb{Q}(\zeta, \sqrt{-1}) = \mathbb{Q}[x]/(\Phi_{4k}(x))$ where
$\Phi_{4k}(x) = \Phi_k(-x^2)$ is the 4k-th cyclotomic polynomial. Choose $\zeta(x) = -x^2$, $t(x) = -x^2 + 1$,
$s(x) = 2x^k$, $v(x) = \frac{1}{2}(x^{k+2} + x^k)$, $p(x) = \frac{1}{4}(x^{2k+4} + 2x^{2k+2} + x^{2k} + x^4 - 2x^2 + 1)$. The
polynomial p takes integral values in odd arguments and, conjecturally, represents primes
if it is irreducible (since p(1) = 1, there is no local obstruction to representing primes).
Asymptotically for p → ∞, $\rho \to \frac{k+2}{\phi(k)}$, and ρ → 1 if furthermore k → ∞ with a fixed
number of prime factors.
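The family of Example 16.4.71 can be sanity-checked numerically. The Python sketch below is restricted, as an assumption of ours, to odd prime k, for which $\Phi_k(y) = 1 + y + \cdots + y^{k-1}$ and hence $n(x) = \Phi_k(-x^2)$ is easy to write down. It checks that p(x) is integral at odd x, that $4p = t^2 - Dv^2$ with D = −4, and that n(x) divides p(x) + 1 − t(x); it does not search for arguments at which p and n are simultaneously prime.

```python
def family(k, x):
    """Values (n(x), t(x), p(x)) of the family for an odd prime k and an odd integer x."""
    n = sum((-x * x) ** i for i in range(k))      # Phi_{4k}(x) = Phi_k(-x^2) for prime k
    t = 1 - x * x
    four_p = x ** (2 * k + 4) + 2 * x ** (2 * k + 2) + x ** (2 * k) + x ** 4 - 2 * x ** 2 + 1
    assert four_p % 4 == 0                        # p takes integral values at odd arguments
    return n, t, four_p // 4

for k in (3, 5, 7):
    for x in range(3, 40, 2):
        n, t, p = family(k, x)
        v = (x ** (k + 2) + x ** k) // 2
        assert 4 * p == t * t + 4 * v * v         # p = (t^2 - D v^2)/4 with D = -4
        assert (p + 1 - t) % n == 0               # n(x) divides the curve cardinality

print(family(3, 3))
```

For k = 3 and x = 3 this gives n = 73 and p = 18241 = 17 · 1073, so this particular argument does not yield a prime field even though n is prime; other odd arguments have to be tried.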
16.4.72 Remark Similar results hold for even k, and for D = −3 since $\sqrt{-3} \in \mathbb{Q}[x]/(\Phi_3(x))$.
16.4.73 Remark The following table, taken from [1101], gives the current best values of $\rho = \frac{\deg p}{\deg n}$ for
polynomial families of pairing-friendly curves for k ≥ 4. (Smaller values of k may be obtained
for prime fields using supersingular curves; see Example 16.4.62.) For the constructions
behind each family, see [1101].
k deg p deg n ρ kρ k deg p deg n ρ kρ
4 2 2 1.00 4.0 28 16 12 1.33 37.3
5 14 8 1.75 8.8 29 60 56 1.07 31.1
6 2 2 1.00 6.0 30 12 8 1.50 45.0
7 16 12 1.33 9.3 31 64 60 1.07 33.1
8 10 8 1.25 10.0 32 34 32 1.06 34.0
9 8 6 1.33 12.0 33 24 20 1.20 39.6
10 4 4 1.00 10.0 34 36 32 1.12 38.2
11 24 20 1.20 13.2 35 72 48 1.50 52.5
12 4 4 1.00 12.0 36 14 12 1.17 42.0
13 28 24 1.17 15.2 37 76 72 1.06 39.1
14 16 12 1.33 18.7 38 40 36 1.11 42.2
15 12 8 1.50 22.5 39 28 24 1.17 45.5
16 10 8 1.25 20.0 40 22 16 1.38 55.0
17 36 32 1.12 19.1 41 84 80 1.05 43.0
18 8 6 1.33 24.0 42 16 12 1.33 56.0
19 40 36 1.11 21.1 43 88 84 1.05 45.0
20 22 16 1.38 27.5 44 46 40 1.15 50.6
21 16 12 1.33 28.0 45 32 24 1.33 60.0
22 26 20 1.30 28.6 46 50 44 1.14 52.3
23 48 44 1.09 25.1 47 96 92 1.04 49.0
24 10 8 1.25 30.0 48 18 16 1.12 54.0
25 52 40 1.30 32.5 49 100 84 1.19 58.3
26 28 24 1.17 30.3 50 52 40 1.30 65.0
27 20 18 1.11 30.0
Definition 16.4.54, the trace Tr : E(Fqk )[n] → G2 can be used to obtain a hash function
with values in G2. Alternatively, in the presence of twists as described in Theorem 16.4.49,
one may more efficiently hash into the subgroup $G'_2$ on the twisted curve, for which the
cofactor is smaller.
To hash into E(K) where $K = \mathbb{F}_q$ or $K = \mathbb{F}_{q^k}$, one may use a hash function $H : \{0, 1\}^* \to K$
to obtain the x-coordinate of a point. As not all elements of K occur as x-coordinates, one
may need several trials. A possibility is to concatenate the message m with a counter i,
denoted by m||i, and to increase the counter until H(m||i) is the x-coordinate of a point
on E. An additional hash bit may be used to determine one of the generically two points with
the given x-coordinate. The algorithm is deterministic and, if H is modeled as a random
function, it needs an expected number of two trials averaged over all input values. However,
for |K| → ∞, there is a doubly exponentially small fraction of the input values that will
take exponential time. Several recent results exhibit special cases in which polynomial time
hashing is possible uniformly for all input values.
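The counter-based procedure can be sketched as follows in Python. Everything concrete here is an assumption made for illustration: the prime $2^{127} - 1$ (chosen because it is congruent to 3 modulo 4, which makes square roots a single exponentiation), the toy coefficients A and B, the use of SHA-256 as H, the encoding of m||i, and the slightly biased reduction of the digest modulo the prime.

```python
import hashlib

P_MOD = 2**127 - 1      # a prime congruent to 3 modulo 4 (hypothetical toy field)
A, B = 2, 3             # toy curve y^2 = x^3 + A*x + B over F_{P_MOD}

def hash_to_curve(message: bytes):
    """Try-and-increment: return a point (x, y) on the toy curve and the final counter."""
    i = 0
    while True:
        digest = hashlib.sha256(message + b"||" + str(i).encode()).digest()
        h = int.from_bytes(digest, "big")
        x = (h >> 1) % P_MOD                     # candidate x-coordinate
        rhs = (pow(x, 3, P_MOD) + A * x + B) % P_MOD
        y = pow(rhs, (P_MOD + 1) // 4, P_MOD)    # a square root of rhs whenever one exists
        if y * y % P_MOD == rhs:
            if (y & 1) != (h & 1):               # extra hash bit selects one of the two points
                y = (-y) % P_MOD
            return (x, y), i
        i += 1

point, trials = hash_to_curve(b"hello")
print(point, trials)
```

The number of iterations depends on the message, which is the timing behaviour that the constant-time encodings of the following examples avoid.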
16.4.77 Example [312, Section 4.1] If q ≡ 2 (mod 3), then $E : y^2 = x^3 + 1$ is a supersingular curve
over $\mathbb{F}_q$ with q + 1 points and k = 2. Precisely, since third powering is a bijection on $\mathbb{F}_q$
with inverse $z \mapsto z^{1/3} = z^{(2q-1)/3}$, the map $\mathbb{F}_q \to E(\mathbb{F}_q) \setminus \{O\}$, $y \mapsto \left((y^2 - 1)^{(2q-1)/3}, y\right)$, is
a bijection.
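The bijection of Example 16.4.77 is easy to check exhaustively on a small field. The Python sketch below uses the toy prime q = 11 (our choice; any prime q ≡ 2 (mod 3) would do) and compares the image of the map with a brute-force enumeration of the affine points.

```python
q = 11                        # toy prime with q = 2 (mod 3)
e = (2 * q - 1) // 3          # z -> z^e inverts cubing on F_q

def point_from_y(y):
    """The point ((y^2 - 1)^(1/3), y) on y^2 = x^3 + 1."""
    return (pow((y * y - 1) % q, e, q), y)

points = [point_from_y(y) for y in range(q)]
affine = [(x, y) for x in range(q) for y in range(q) if (y * y - x**3 - 1) % q == 0]

print(sorted(points) == sorted(affine), len(affine) + 1)   # the "+ 1" accounts for O
```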
16.4.78 Example [2604] Let $E : y^2 = f(x) = x^3 + a_2 x^2 + a_4 x + a_6$ over $\mathbb{F}_q$ of characteristic at
least 3. There are explicit rational functions $u_1(t)$, $u_2(t)$, $u_3(t)$, and v(t) such that $v(t)^2 =
f(u_1(t^2)) f(u_2(t^2)) f(u_3(t^2))$ [2675]. So for any t there is at least one i(t) such that $f(u_{i(t)}(t^2))$
is a square in $\mathbb{F}_q$, which yields a map $\mathbb{F}_q \to E(\mathbb{F}_q)$, $t \mapsto \left(u_{i(t)}(t^2), f(u_{i(t)}(t^2))^{1/2}\right)$. In a
cryptographic context, we may assume that a non-square in $\mathbb{F}_q^*$ is part of the input, and
then the Tonelli-Shanks algorithm computes square roots in deterministic polynomial time;
see [660, Section 1.5.1] and [2813]. The argument is refined in [2604] to give a deterministic
procedure for computing points on the curve without knowing a non-square and to show
that at least $\frac{q-4}{8}$ different points may be reached. The case of characteristic two is also
handled.
16.4.79 Example [1566] Let $\mathbb{F}_q$ with q ≡ 2 (mod 3) be of characteristic at least 5, and let
$E : y^2 = x^3 + ax + b$ be an elliptic curve over $\mathbb{F}_q$. Let
\[ v(t) = \frac{3a - t^4}{6t} \quad \text{and} \quad x(t) = \left( v(t)^2 - b - \frac{t^6}{27} \right)^{1/3} + \frac{t^2}{3}. \]
Then $0 \mapsto O$, $0 \neq t \mapsto (x(t), t\,x(t) + v(t))$ is a map $\mathbb{F}_q \to E(\mathbb{F}_q)$ with image size at least $\frac{q}{4}$,
and conjecturally close to $\frac{5q}{8}$. A similar result holds for curves over $\mathbb{F}_{2^m}$ with odd m.
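The encoding of Example 16.4.79 can be tried out on a toy curve. In the Python sketch below, the field size q = 101 and the coefficients a = 1, b = 3 are our own illustrative choices (q is prime with q ≡ 2 (mod 3), and $4a^3 + 27b^2 \neq 0$ in $\mathbb{F}_q$); the cube root is computed as in Example 16.4.77.

```python
q, a, b = 101, 1, 3           # toy parameters: q prime, q = 2 (mod 3), curve non-singular

def inv(z):
    return pow(z % q, -1, q)

def cube_root(z):
    return pow(z % q, (2 * q - 1) // 3, q)

def encode(t):
    """Map t in F_q to a point of E: y^2 = x^3 + a*x + b; None stands for O."""
    if t % q == 0:
        return None
    v = (3 * a - t**4) * inv(6 * t) % q
    x = (cube_root(v * v - b - t**6 * inv(27)) + t * t * inv(3)) % q
    return (x, (t * x + v) % q)

image = {encode(t) for t in range(1, q)}
print(len(image), "distinct affine points reached out of", q - 1, "nonzero arguments")
```

Every value lands on the curve because no square root is needed, and since each affine point has at most four preimages (they are roots of a quartic in t), the image contains at least (q − 1)/4 points.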
16.4.80 Remark Alternative encodings for elliptic curves in Hessian form over Fq with q ≡ 2
(mod 3) and odd are given in [1039, 1672]; see also [745]. They have an image of proven
size about q/2.
See Also
References Cited: [108, 109, 183, 203, 220, 221, 312, 313, 345, 366, 413, 420, 458, 597,
660, 661, 744, 745, 852, 970, 978, 979, 1039, 1098, 1101, 1104, 1108, 1133, 1156, 1159, 1160,
1165, 1247, 1250, 1251, 1276, 1422, 1429, 1492, 1494, 1566, 1567, 1629, 1630, 1672, 1771,
1894, 1904, 2079, 2080, 2088, 2098, 2102, 2292, 2426, 2497, 2533, 2560, 2601, 2604, 2675,
2683, 2684, 2712, 2751, 2752, 2766, 2813, 2868, 2869, 2978]
16.5.1 Remark As stated in Definition 12.4.17, the set of $\mathbb{F}_q$-rational reduced divisors of degree
zero of a hyperelliptic curve C defined over a finite field $\mathbb{F}_q$ forms a finite abelian group, the
Picard group $\mathrm{Pic}^0_{\mathbb{F}_q}(C)$.
16.5.2 Remark The Picard group can be used as a substitute for the group of points on an el-
liptic curve to implement cryptosystems based on the difficulty of the discrete logarithm
problem [1772, 661]. For efficiency reasons, imaginary curves are usually preferred over real
curves since their group operation is slightly faster in practice. We note that for real hyperel-
liptic curves, the cryptosystems rely on the infrastructure discrete logarithm problem which
can be reduced to a discrete logarithm problem in the Picard group (see Definition 12.4.81
and Remark 12.4.82).
16.5.3 Remark For an imaginary hyperelliptic curve defined over the finite field $\mathbb{F}_q$, the Picard
group is isomorphic to the ideal class group of the curve (the quotient group of the ideals
in the polynomial ring of the curve modulo the principal ideals). For a curve of genus g,
this group has order close to $q^g$ (to be precise, the group order is between $(\sqrt{q} - 1)^{2g}$ and
$(\sqrt{q} + 1)^{2g}$, see Remark 12.4.62). In practice, Mumford's representation (Theorem 12.4.34)
is used to construct divisors (or rather the corresponding ideals).
16.5.4 Remark As is the case for groups obtained from elliptic curves, the fastest known attacks
on the discrete logarithm problem for hyperelliptic curves of genus two are generic attacks
which require $O(\sqrt{n})$ group operations to compute the discrete logarithm, where n is the
group order. Because $n \approx q^2$, solving the discrete logarithm problem requires O(q) group operations.
16.5.5 Remark Following the ideas of Harley [1421], efficient implementations of hyperelliptic
curve group operations are usually done via explicit formulae which replace the polynomial
operations of Cantor’s algorithm [497] with a sequence of field operations on the coefficients
of these polynomials [147, 1246, 1854].
16.5.6 Remark For efficiency reasons, for curves in odd characteristic the curve is usually assumed
to be reduced (via isomorphisms) to the form $y^2 = x^5 + f_3 x^3 + f_2 x^2 + f_1 x + f_0$. For curves
in characteristic two, the curve is usually assumed to be reduced to either $y^2 + h_1 xy =
x^5 + f_3 x^3 + f_2 x^2 + f_0$ (with $h_1 = 1$ in extensions of odd degree) or $y^2 + (x^2 + h_1 x + h_0)y =
x^5 + f_4 x^4 + f_1 x + f_0$.
16.5.7 Remark Genus two curves of the form $y^2 + y = f(x)$ over a field of characteristic two are
supersingular. The discrete logarithm in these curves is much easier to compute (via the
Weil or Tate pairings) than for other curves of genus two. For this reason, these curves are
avoided for cryptographic applications.
16.5.8 Remark For curves in characteristic two, the binary field structure can be used to easily
solve quadratic equations. It becomes possible to compute “halving” operations, giving
performance advantages over doubling operations in some cases [286, 287, 1038].
16.5.9 Remark As for elliptic curves, there are various coordinate systems to represent divisors,
and more precisely to represent reduced divisors of weight two (the most common case).
In particular, projective coordinates are sometimes used to obtain inversion-free
explicit formulae. Two basic types of projective coordinates are used in practice:
1. “standard” projective coordinates [2110, 661], $[U_1, U_0, V_1, V_0, Z]$, which
correspond to the Mumford representation (in affine coordinates) $[x^2 + (U_1/Z)x +
(U_0/Z), (V_1/Z)x + (V_0/Z)]$;
2. new projective coordinates [1853], $[U_1, U_0, V_1, V_0, Z_1, Z_2]$, which correspond
to the Mumford representation (in affine coordinates) $[x^2 + (U_1/Z_1^2)x +
(U_0/Z_1^2), (V_1/Z_1^3 Z_2)x + (V_0/Z_1^3 Z_2)]$.
16.5.10 Remark In some cases, extended (or redundant) coordinates are used since they can save
some field operations in further group operations. For example, the recent coordinates of
Lange [1852] use the coordinates $[U_1, U_0, V_1, V_0, Z, Z^2]$ for curves over fields of characteristic
two. Similarly, new coordinates are usually used in the form $[U_1, U_0, V_1, V_0, Z_1, Z_2, Z_1^2, Z_2^2]$
(in odd characteristic) or $[U_1, U_0, V_1, V_0, Z_1, Z_2, Z_1 Z_2, Z_1^2, Z_2^2, Z_1^2 Z_2]$ (in characteristic two).
16.5.11 Remark Like curves of genus two, hyperelliptic curves of genus three can be used to
construct cryptosystems based on the discrete logarithm problem. However, evaluating
the security of these cryptosystems is more complicated. There are two algorithms that
must be considered when evaluating the difficulty of the discrete logarithm problem on a
specific hyperelliptic curve: the index calculus algorithm and Smith's trigonal mapping to
non-hyperelliptic curves.
16.5.12 Remark Index calculus algorithms are used to map the discrete logarithm problem to
computing non-trivial solutions of a system of linear equations. The algorithm proceeds in
two steps: a relation search (to build the system) and a (sparse) linear algebra solver.
Under the right conditions, the overall complexity of the relation search and linear
algebra solver is lower than generic attacks. For curves of genus one and two, index calculus
attacks appear to be slower than generic attacks, but for genus three and higher, index
calculus does indeed reduce the cost of computing discrete logarithms.
16.5.13 Remark A number of variants of the index calculus relation search have been
proposed [1245, 1256, 2212, 2802]. Currently, the most efficient algorithms are those of Gaudry
et al. [1256] and Nagao [2212]. Both of these algorithms require the equivalent of $O(q^{4/3+\varepsilon})$
group operations to compute the discrete logarithm. It should be noted that index calculus
algorithms have the same (estimated) running time for all genus three curves over the field
$\mathbb{F}_q$.
16.5.14 Remark For curves of genus three over fields of odd characteristic, there exists a specialized
attack against the discrete logarithm problem. This attack is due to Smith [2688] and uses a
trigonal map to send the Picard group of a hyperelliptic curve of genus three to the Picard
group of a non-hyperelliptic curve (also of genus three).
As was shown by Diem [854, 856], it is easier to solve the discrete logarithm in the Picard
group of a non-hyperelliptic curve than a hyperelliptic one, the running time decreasing to
an equivalent of $O(q^{1+\varepsilon})$ group operations for genus three curves. If the trigonal map in
Smith’s attack is successful, the security of the curve is then no better than that of a genus
two curve (but with a higher cost for the group operation).
Smith’s attack depends heavily on the structure of the 2-torsion group of the curve (over
the algebraic closure of the field Fq ). More specifically, it depends on the extension degree
of the field Fqk in which each of the Weierstrass points of the curve is defined. As a result,
not all curves over Fq are vulnerable to Smith’s attack.
16.5.15 Example For curves of the form $y^2 = f(x)$ for which f(x) splits into linear factors, Smith's
attack has a $1 - 2^{-105}$ probability of success (i.e., roughly only 1 in $2^{105}$ curves will remain
safe from the attack).
However, the probability of success is much lower for other types of factorization of
f (x), and can even reach zero. For imaginary curves, the following factorization types of
f (x) (degrees of the irreducible factors of f (x)) are completely secure against Smith’s attack:
[7], [5, 2], [5, 1, 1], [4, 3], [3, 2, 2], [3, 2, 1, 1], [3, 1, 1, 1, 1]. For real curves, there is one more
factorization type of f (x) which is completely secure against the attack: [5, 3].
16.5.16 Remark There are a number of open problems coming from Smith’s algorithm. First of all,
it is not known how to adapt the trigonal map to curves over fields of characteristic two.
Secondly, the trigonal map exists only for curves of genus three, but similar ideas could be
important to the security of hyperelliptic curves of higher genus.
16.5.17 Remark As with genus two curves, the group operations are performed via explicit formulae
rather than Cantor's algorithm [149, 1383, 2211]. Once again, curve isomorphisms are used
to reduce the curve equation and improve efficiency. In odd characteristic the preferred form
of the equation is $y^2 = x^7 + f_5 x^5 + f_4 x^4 + f_3 x^3 + f_2 x^2 + f_1 x + f_0$. For curves in characteristic
two, similar reductions can be performed for each form of h(x).
Unlike elliptic curves and curves of genus two, genus three curves of the form
$y^2 + y = f(x)$ over $\mathbb{F}_{2^n}$ are never supersingular [2552] and can be used in cryptographic
applications. These curves allow for much faster doubling operations, giving efficiency
advantages.
16.5.18 Remark As for genus two curves, for genus three curves in characteristic two, the binary field
structure can be used to easily solve quadratic (and some quartic) equations. It becomes
possible to compute “halving” operations, giving performance advantages over doubling
operations, especially when the degree of h(x) increases [288].
16.5.19 Remark The highest genus that is sometimes considered for cryptographic applications is
four [149, 2381]. The running time for solving the discrete logarithm problem in a
hyperelliptic curve of genus four (using the algorithm of Gaudry et al. [1256]) is equivalent to
$O(q^{3/2+\varepsilon})$ group operations.
16.5.20 Remark For curves of small genus g, the cost of computing discrete logarithms grows as
$O(q^{2-2/g+\varepsilon})$ (for a fixed genus and an increasing field size) [1256, 2212]. However, the
complexity of the group operation is at least linear with respect to the genus of the curve
(in practice this growth is closer to quadratic), which limits the range of “small”
genera that are of interest for cryptosystems. At higher genera, the situation is even more
difficult for cryptosystems based on the discrete logarithm. This situation is discussed in
Section 16.6.10.
16.5.21 Remark The security arguments for hyperelliptic curves of genus two are very similar to
those of elliptic curves, with the only distinction that the group order is close to $q^2$ (rather
than q for elliptic curves). The required bit-sizes for the finite fields are therefore half of
what they are for elliptic curves with the same security level. Once again, the group order
should be a prime or a prime times a small cofactor. Similarly, the secret key should be of
size similar to the group order, which means that secret key sizes for hyperelliptic curves of
genus two are in fact identical to those of elliptic curves.
16.5.22 Remark The following table summarizes these arguments, comparing with symmetric
key cryptosystems (all sizes in bits). As in Remark 16.4.3, the 80 bit security level is now
considered a historic figure.
security level ECC field size genus two field size secret key size
80 160 80 160
112 224 112 224
128 256 128 256
192 384 192 384
256 512 256 512
16.5.23 Remark To choose the field size for genus three hyperelliptic curves, the first concern is
the possibility of an index calculus attack. To have m bits of security, we ask that
$q^{4/3} \approx 2^m$ (using the complexity in Remark 16.5.13), hence $\log_2 q \approx 3m/4$.
It may be noted that index calculus attacks do not obtain any significant speedups from
restrictions on the key size or factorizations of the group order. In some situations it may be
acceptable to allow the group order to factor into a prime with a (relatively) large cofactor,
as long as the largest prime factor is of such size that the subgroup attack of Pohlig and
Hellman [2406] cannot be effective. The cofactor could then take up to one quarter of the
bit size of the group order.
Having large subgroups can still pose a problem for the security of a cryptosystem unless
there is a mechanism in place to ensure that the group element is indeed in the desired subgroup.
Otherwise, an attacker could provide a group element which is outside of the (large-)prime-order
subgroup, and use a subgroup attack to obtain partial information about the scalar,
thus reducing the effective security level. Unfortunately, verifying in which subgroup a given
group element is located is quite expensive (of cost similar to the scalar multiplication
itself), and has a significant impact on the efficiency of the cryptosystem. For this reason,
it is usually preferred to use groups whose order is (close to) a prime, even though larger
cofactors may not directly decrease the security level.
16.5.24 Remark It is possible to choose keys which are significantly smaller than the group size
without necessarily weakening the cryptosystem. This can be of great importance for the
efficiency of the cryptosystem since the cost of the scalar multiplication (the cryptographic
primitive) is directly proportional to the number of bits of the secret key. However, if the
secret key is too small, it may become possible to compute the secret key using a generic
attack. The bit-size of the secret key must therefore be equivalent to that used for elliptic
curves at the same security level.
16.5.25 Remark For curves in odd characteristic, it is recommended to ask that f (x) factors into
one of the factorization types given in Example 16.5.15 (to ensure no trigonal mapping can
be used to mount a successful attack). This final condition (factorization of the defining
polynomial) can be seen as equivalent to the requirement that the group order be close to
a prime for protection against generic attacks (factorization of the group order).
16.5.26 Remark The following table gives bit sizes for the field of definition and the secret key
for genus three hyperelliptic curve cryptosystems, comparing with symmetric key sizes and
ECC sizes. If large cofactors are allowed in the group order, its largest prime factor should
be at least of the secret key size.
security level   ECC field size   genus three field size   secret key size
(bits)           (bits)           (bits)                   (bits)
80               160              60                       160
112              224              84                       224
128              256              96                       256
192              384              144                      384
256              512              192                      512
16.5.27 Definition A hyperelliptic Koblitz curve is a hyperelliptic curve defined over $\mathbb{F}_2$ that is
used over $\mathbb{F}_{2^n}$ for n prime (to avoid having too many subgroups).
16.5.28 Remark The following table lists all (isomorphism classes of) non-supersingular hyperel-
liptic Koblitz curves of genus 2 and their characteristic polynomials.
16.5.29 Remark Besides an easier computation of the group order (although computing the group
order of a random hyperelliptic curve over a binary field is quite efficient), the main advan-
tage of Koblitz curves comes from the Frobenius map over F2 .
16.5.30 Remark A similar table for genus three can be found in [1850].
16.5.31 Definition The map defined by $\tau(x, y) = (x^2, y^2)$ for points on the curve gives a degree n
endomorphism on the Picard group when applied to (the support of) divisors defined
over $\mathbb{F}_{2^n}$. The application of $\tau$ on the Mumford representation of a divisor consists of
squaring all the coefficients of the polynomials. A $\tau$-adic expansion of the scalar [2195]
then gives a reduction in the cost of the scalar multiplication (as for Koblitz elliptic
curves).
16.5.32 Remark Since the $\tau$ map has order n for divisors over $\mathbb{F}_{2^n}$, the security level of the curve
must be adjusted accordingly (generic attacks can be accelerated by a factor of $O(\sqrt{n})$).
16.5.33 Example The curve
\[ y^2 + (x^2 + x + 1)y = x^5 + x \]
is a genus two Koblitz curve with characteristic polynomial over $\mathbb{F}_2$ given by
$t^4 + 2t^3 + 3t^2 + 4t + 4 = 0$. Over the field $\mathbb{F}_{2^{113}}$, its Picard group has order
2 · 7 · 1583 · 476183 · 10218712550205474310417731984747447186313991554764219834409 (i.e.,
10553167646 times a prime). Taking into account the subgroups and the degree 113 en-
domorphism, this curve gives equivalent security to a random elliptic curve over a field of
186 bits.
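The 186-bit figure can be checked with a short back-of-the-envelope computation. The sketch below is an illustration only and rests on stated assumptions: a generic attack on a subgroup of prime order r costs about $\sqrt{r}$ group operations, the order-113 endomorphism speeds this up by a factor of $\sqrt{113}$ (Remark 16.5.32), and a random elliptic curve over a b-bit field costs about $2^{b/2}$. The function name is hypothetical.

```python
import math

# Largest prime factor of the Picard group order quoted in Example 16.5.33.
LARGEST_PRIME = 10218712550205474310417731984747447186313991554764219834409
ENDO_ORDER = 113  # order of the Frobenius map tau on divisors over F_{2^113}


def equivalent_ecc_bits(prime_factor: int, endo_order: int) -> float:
    """Field size (in bits) of an elliptic curve with the same generic-attack cost.

    Assumption: the attack costs about sqrt(prime_factor / endo_order), and a
    random elliptic curve over a b-bit field costs about 2**(b / 2), so that
    b = log2(prime_factor) - log2(endo_order).
    """
    return math.log2(prime_factor) - math.log2(endo_order)
```

Rounding `equivalent_ecc_bits(LARGEST_PRIME, ENDO_ORDER)` to the nearest integer gives the field size to compare with the figure quoted in the example.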
16.5.34 Definition A subfield curve is a hyperelliptic curve of genus g defined over a field $\mathbb{F}_q$, but
used as a curve over $\mathbb{F}_{q^n}$ [1850]. These curves are generalizations of Koblitz curves.
16.5.35 Remark There are two possible advantages of using subfield curves. First, computing the
group order is easier than for a general curve over the same field: it can be obtained from
knowledge of the characteristic polynomial of the curve over Fq [1850]. This is of particular
interest in odd characteristic since computing the group order is much more expensive for
these fields than for binary fields. Second, the field structure (of $\mathbb{F}_{q^n}$ as an extension of $\mathbb{F}_q$)
allows for more efficient arithmetic than for a prime field $\mathbb{F}_p$ with $p \approx q^n$.
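The first advantage can be made concrete. If $\chi(t)$ is the characteristic polynomial of Frobenius over $\mathbb{F}_q$ with roots $\alpha_i$, the order of the Picard group over $\mathbb{F}_{q^n}$ is $\prod_i (1 - \alpha_i^n)$, which equals $\det(I - M^n)$ for the companion matrix M of $\chi$. The following minimal sketch evaluates this with exact integer arithmetic; it uses the polynomial of Example 16.5.33 as input, and the helper names are illustrative rather than taken from [1850].

```python
def mat_mul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(m, e):
    """m**e by square-and-multiply."""
    n = len(m)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    while e:
        if e & 1:
            result = mat_mul(result, m)
        m = mat_mul(m, m)
        e >>= 1
    return result


def det(m):
    """Determinant by cofactor expansion (adequate for the 2g x 2g case)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def companion(coeffs):
    """Companion matrix of t^d + c_{d-1} t^{d-1} + ... + c_0, coeffs = [c_0, ..., c_{d-1}]."""
    d = len(coeffs)
    m = [[0] * d for _ in range(d)]
    for i in range(1, d):
        m[i][i - 1] = 1
    for i in range(d):
        m[i][d - 1] = -coeffs[i]
    return m


def picard_order(coeffs, n):
    """det(I - M^n), i.e. the product of (1 - alpha_i^n) over the roots of chi."""
    m = mat_pow(companion(coeffs), n)
    d = len(m)
    return det([[int(i == j) - m[i][j] for j in range(d)] for i in range(d)])


# chi(t) = t^4 + 2t^3 + 3t^2 + 4t + 4 from Example 16.5.33, lowest degree first.
CHI = [4, 3, 2, 1 * 2][:0] + [4, 4, 3, 2]
```

For n = 1 this returns $\chi(1) = 14$, the cofactor 2 · 7 that appears in the example; evaluating `picard_order(CHI, 113)` gives a number that can be compared with the factorization quoted there.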
16.5.36 Remark One of the main drawbacks of using curves which are defined over a subfield (in
this case of Fqn ) is that the Picard group contains at least one large subgroup, namely
the Picard group of the curve over the subfield Fq . This opens the way for attacks on the
discrete logarithm problem based on the algorithm of Pohlig and Hellman [2406]. To avoid
these attacks, it is necessary to implement some techniques to ensure the group elements
used are always in the larger (prime order) subgroup. Furthermore, these curves are at risk
of Weil descent based attacks, in particular the variant of Gaudry [1247].
16.5.37 Definition Given a curve C defined over $\mathbb{F}_q$, the trace-zero subvariety of the curve is the
quotient group $T = \mathrm{Pic}^0_{\mathbb{F}_{q^n}}(C)/\mathrm{Pic}^0_{\mathbb{F}_q}(C)$. This was proposed by Lange for genus two
and n = 3 [1851] as a method to construct a cryptographically viable group on a field
extension without having to deal with subgroups. If the order of the group T is prime
and close to $q^4$ ($q^{g(n-1)}$ in general) this group can be used for cryptosystems based on
the discrete logarithm problem.
16.5.38 Remark As with subfield curves, the group order of trace-zero subvarieties is computed
from the group order of the Picard group of the curve over Fq . Similarly, these curves take
advantage of the subfield structure to make the field arithmetic more efficient. Finally, the
group operations can also be tailored to work on T rather than on the general Picard group.
16.5.39 Remark Just as with any curve on a field extension, trace-zero subvarieties can be subjected
to Weil descent based attacks. For curves of genus greater than two and for n > 3 in genus
two, a Weil descent attack on the whole Picard group with the simplest form of index
calculus attack [1245] is sufficient to solve the discrete logarithm in time $O(q^2)$, significantly
faster than through generic attacks [855].
16.5.40 Remark For n = 3, Gaudry’s variant would allow the attack to handle the Picard group
as if it were the Picard group of a genus six curve over Fq , and the discrete logarithm
problem in the trace-zero subvariety can be lifted to a discrete logarithm problem in the
whole Picard group. This attack has running time $O(q^{5/3})$, instead of the desired $O(q^2)$ for
generic attacks on the group T .
Furthermore, it may be possible to adapt Gaudry’s variant to work directly on the
trace-zero subvariety (by selecting a different factor base, more appropriate to this context),
treating it as a genus four curve over Fq . If such an attack is possible, it would have running
time $O(q^{3/2})$, making it particularly effective. The construction of an appropriate factor
base is an open problem.
16.5.41 Remark Because of these attacks, field sizes for secure cryptosystems would have to be
increased to preserve the security level, which in turn decreases the efficiency of the field
arithmetic. As a result, groups coming from trace-zero subvarieties are now mostly of the-
oretical interest.
16.5.42 Remark Computing the Picard group order of hyperelliptic curves in even characteristic can
be done quite efficiently via p-adic (in this case 2-adic) algorithms. Currently, the fastest
algorithms are due to Satoh [2534] and Streng [2735]. These algorithms can handle in
reasonable time both curves of high genus and (more interesting for practical applications)
curves of small genus over a large field.
16.5.43 Remark Compared with the characteristic two case, computing the group order of curves in
odd characteristic is mostly an open problem. In curves over prime fields, most algorithms
are based on Pila’s [2394] extension of Schoof’s algorithm. For genus two curves (where such
works are concentrated), the fastest algorithms are due to Gaudry and Schost [1252, 1253],
and Gaudry, Kohel, and Smith for curves with complex multiplication [1254].
16.5.44 Remark At this time, finding a random hyperelliptic curve in characteristic two that meets the
security requirements – having a given genus and field size, with a Picard
group whose order is a large prime with a small co-factor – is considered quite practical.
For curves in odd characteristic, this is considerably more difficult. For genus two curves,
such a search is feasible although expensive ([1253] reports such a search at the 128-bit
security level taking over one million CPU-hours). For genus three (or higher), there are
currently no examples of practical searches for such curves.
16.5.45 Remark Since the Tate-Lichtenbaum pairing comes from the divisor class group structure of
elliptic curves, it naturally generalizes to hyperelliptic curves, although some care has to be
taken to properly adapt the definition as well as Miller’s algorithm; see Subsection 12.4.6. A
survey of the different techniques available for pairings on hyperelliptic curves can be found
in [1157]. Actual implementations of cryptosystems using hyperelliptic curve pairings are
much less common than elliptic curve ones, but [2309] demonstrates the potential interest
of hyperelliptic curves for pairing-based cryptography.
16.5.46 Remark For curves of genus two over prime fields, it is possible to construct curves with a
given group order using complex multiplication methods [2087, 2970, 2971]. Such curves may
be of interest for pairing based cryptosystems on hyperelliptic curves [1157]. For cryptosys-
tems based on the discrete logarithm problem, more random curves are usually preferred
for fear the structure coming from complex multiplication could eventually open the way
to new attacks that significantly decrease the difficulty of the discrete logarithm problem
in these curves (although no such attack is currently known).
See Also
References Cited: [1, 147, 149, 286, 287, 288, 313, 497, 661, 854, 855, 856, 1038, 1157, 1245,
1246, 1247, 1252, 1253, 1254, 1256, 1383, 1421, 1772, 1850, 1851, 1852, 1853, 1854, 2087,
2110, 2195, 2211, 2212, 2309, 2381, 2394, 2406, 2534, 2552, 2688, 2735, 2802, 2970, 2971]
16.6.1 Definitions
16.6.1 Definition An Abelian variety A over a finite field F is a smooth projective algebraic
variety defined over F on which there is an algebraic group operation, also defined over
F. In particular, the identity element O of the group is an F-rational point.
16.6.3 Definition An Abelian variety is simple over F if it contains no proper Abelian subvarieties
defined over F. It is absolutely simple if it is simple over the algebraic closure $\overline{F}$ of F.
16.6.4 Definition Two Abelian varieties A1 and A2 are isogenous if there is a surjective morphism
A1 −→ A2 with finite kernel.
16.6.2 Examples
16.6.11 Theorem Given a curve C defined over F, there is an Abelian variety J(C), the Jacobian
of C, which is also defined over F. It has the following properties:
1. The dimension of J(C) is the genus of C.
2. If g is the genus of C, there is a surjective map $C^g = C \times \cdots \times C \longrightarrow J(C)$.
16.6.12 Theorem The F points on J(C) are given by $D^0/P$ where $D^0$ is the Abelian group of
divisors of degree zero and P is the subgroup of divisors of degree zero which are divisors
of functions.
16.6.13 Remark Let A be an Abelian variety defined over an extension field $F'$ over F. Suppose
that $[F' : F] = r$ and that A is of dimension d. Then there is an Abelian variety $R_{F'/F} A$
(called the restriction of scalars of A) defined over F of dimension rd with the property that
$R_{F'/F} A(F) = A(F')$.
16.6.5 Endomorphisms
16.6.18 Definition Denote by $F_n$ the unique extension of F of degree n contained in $\overline{F}$. The zeta
function Z(A, T) of an Abelian variety A defined over a finite field F is the power series
\[ Z(A, T) = \exp\Big( \sum_{n \ge 1} |A(F_n)| \, T^n / n \Big). \]
16.6.19 Theorem There are polynomials $P_0(T), \ldots, P_{2d}(T)$ (where d = dim(A)) such that
\[ Z(A, T) = \prod_{i=0}^{2d} P_i(T)^{(-1)^{i+1}}. \]
16.6.27 Definition The Newton polygon of an Abelian variety A is the Newton polygon of $P_1(T)$.
In particular, if we write $P_1(T) = 1 + a_1 T + \cdots + a_{2d} T^{2d}$ and set $a_0 = 1$, then the Newton polygon is
the lower convex hull of the points $(j, \mathrm{ord}_p\, a_j)$ for $0 \le j \le 2d$. A slope of the Newton polygon
is $(b' - b)/(c - a)$ where $(a, b)$ and $(c, b')$ are two adjacent vertices. The length of the slope is $c - a$.
16.6.28 Proposition [2438, Theorem 9.1] The slopes of the Newton polygon are (counting multi-
plicities) the p-adic ordinals of the reciprocal roots of P1 (T ). More precisely, if λ is a slope
of length m, then precisely m of the numbers ordp πi are equal to λ.
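As an illustration of the definition and of Proposition 16.6.28, the following sketch computes the vertices, slopes, and lengths of a Newton polygon from the integer coefficients of $P_1(T)$. It assumes the polygon is the lower convex hull of the points $(j, \mathrm{ord}_p\, a_j)$ with $a_j \neq 0$, starting at (0, 0); the function names are hypothetical.

```python
from fractions import Fraction


def ord_p(a: int, p: int) -> int:
    """p-adic valuation of a nonzero integer a."""
    a = abs(a)
    v = 0
    while a % p == 0:
        a //= p
        v += 1
    return v


def newton_polygon(coeffs, p):
    """Vertices of the lower convex hull of (j, ord_p a_j), coeffs = [a_0, a_1, ...]."""
    points = [(j, ord_p(a, p)) for j, a in enumerate(coeffs) if a != 0]
    hull = []
    for pt in points:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # Drop hull[-1] if it lies on or above the segment hull[-2] -> pt.
            if (x2 - x1) * (pt[1] - y1) - (y2 - y1) * (pt[0] - x1) <= 0:
                hull.pop()
            else:
                break
        hull.append(pt)
    return hull


def slopes(vertices):
    """List of (slope, length) pairs for consecutive vertices."""
    return [(Fraction(y2 - y1, x2 - x1), x2 - x1)
            for (x1, y1), (x2, y2) in zip(vertices, vertices[1:])]
```

For example, with p = 2 the coefficients `[1, 2, 3, 4, 4]` give slopes 0 and 1, each of length 2, while `[1, 0, 2]` gives the single slope 1/2 of length 2. In both cases the lengths add up to the degree, as the proposition requires.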
16.6.29 Remark Given the condition that for every j, we have $\pi_j \pi_{j+d} = q$, at most d of the $\pi_j$ can
be prime to q (that is, have $\mathrm{ord}_p\, \pi_j = 0$).
16.6.32 Remark There are efficient algorithms to compute on elliptic curves as well as on Jacobians
of hyperelliptic curves. These have been described in earlier sections. There is an interesting
class of curves (called Ca,b curves) which generalizes these. We describe them below.
16.6.33 Definition Let a and b be positive integers. A $C_{a,b}$ curve over F is one that is defined by
an equation of the form
\[ F(x, y) = \alpha_{b,0} x^b + \alpha_{0,a} y^a + \sum_{ia + jb < ab} \alpha_{i,j} x^i y^j \]
where the coefficients $\alpha_{i,j}$ are elements of F and the leading coefficients $\alpha_{b,0}$ and $\alpha_{0,a}$
are nonzero.
16.6.34 Remark Such a curve has the property that it has exactly one F-rational point (Q say)
at infinity and the polar divisors of the functions x and y are aQ and bQ, respectively.
16.6.35 Example An elliptic curve is a $C_{2,3}$ curve and more generally, a hyperelliptic curve given
by
\[ y^2 = f(x) \]
with f a polynomial of degree 2g + 1 is a $C_{2,2g+1}$ curve.
16.6.36 Example A superelliptic curve, in other words one of the form
\[ y^a = f(x) \]
16.6.43 Remark Addition on the Jacobian of a general curve has been described by Volcheck but
this method does not seem to be practical over fields of cryptographic size.
16.6.44 Remark Arita, Miura, and Sekiguchi [123] have shown how to use a singular plane model of
a general curve to obtain an algorithm for addition on the Jacobian. However, the complexity
of this algorithm has not been analyzed.
16.6.45 Remark For cryptographic applications, we want the group A(F) to be “nearly” of prime
order. In particular, we want A to be simple (and perhaps even absolutely simple). This
rules out supersingular Abelian varieties as (by a result stated above) they are isogenous to
a power of a supersingular elliptic curve.
16.6.46 Definition Let A be an Abelian variety over the finite field F. Let P ∈ A(F) and Q an
element of the subgroup of A(F) generated by P . The discrete logarithm problem for
this subgroup is to determine an integer n so that Q = nP .
16.6.47 Remark In the case of hyperelliptic curves, we have the following result of Enge and Gaudry
based on an index calculus approach.
16.6.48 Theorem [980] The discrete logarithm problem on J(C) (where C is a hyperelliptic curve
of genus g defined over a finite field of q elements) is of complexity
\[ O\Big(\exp\big\{(\sqrt{2} + o(1)) \sqrt{(\log q^g)(\log\log q^g)}\big\}\Big). \]
16.6.49 Remark This result assumes that the group order is known and that the group itself can
be computed in polynomial time. Several authors have studied analogues of this result for
the Jacobian of a non-hyperelliptic curve. In particular, one has the following results.
16.6.50 Theorem [854] Fix positive integers d (the degree) and g ≤ (d − 1)(d − 2)/2 (the genus).
Denote by S(q) the set of all instances of the discrete logarithm problem in curves of
genus g over $\mathbb{F}_q$ represented by plane models of degree d. Then there is a subset $S_1(q)$
with $|S_1(q)|/|S(q)| \to 1$ (as $q \to \infty$) such that the instances in $S_1(q)$ can be solved in an
“expected” time of $O(q^{2 - \frac{2}{d-2}} (\log q)^A)$ for some integer A ≥ 1.
16.6.51 Remark The analysis in [854] involves several heuristic assumptions (including assumptions
on the group order and structure) and the power of the logarithm is not specified. The
result, in particular, applies to non-hyperelliptic curves of genus 3 (which by the canonical
embedding, may be represented by a plane quartic (d = 4)). This particular case is further
analyzed by Diem and Thomé [856].
16.6.52 Remark In the case studied by Diem and Thomé, the heuristic assumptions can be made
explicit in graph-theoretic terms. Choose a subset $\mathcal{F} \subset C(F)$ (called the factor base) of
cardinality $O(\sqrt{q})$ and let $\mathcal{L}$ denote the complement of $\mathcal{F}$ in C(F). A graph G is constructed
with vertices $\mathcal{L} \cup \{*\}$ where ∗ is a special root vertex. The edges in this graph are determined
as follows. For each pair $F_1, F_2 \in \mathcal{F}$, compute the line L through these points. Consider the
divisor $D = F_1 + F_2 + D_{1,2}$ representing the intersection of C with L. If P, Q are points in
the support of $D_{1,2}$, at least one of which is in $\mathcal{L}$, construct an edge joining P and Q (or P
and ∗ or Q and ∗).
16.6.53 Theorem [856] Suppose that C is a non-hyperelliptic curve of genus 3 with the property
that J(C)(F) is cyclic. Suppose also that the graph G constructed above has a tree rooted
at ∗ of depth $O((\log q)^2)$ and having at least $O(q^{5/6})$ elements. Then the discrete logarithm
problem in J(C) can be solved in $O(q (\log q)^A)$ steps.
16.6.54 Remark We say that the discrete logarithm problem for the pair (A, P ) consisting of the
Abelian variety A and the subgroup generated by a point P on A is difficult if it is compu-
tationally infeasible to solve the problem.
16.6.55 Remark Let A and P be as above and suppose that the discrete logarithm problem for
(A, P ) is difficult. Then we can perform a key exchange using the Diffie Hellman protocol
with the group G generated by P .
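The protocol itself is short. The sketch below runs a Diffie–Hellman exchange in the group generated by a point on a deliberately tiny elliptic curve (a one-dimensional Abelian variety); the curve $y^2 = x^3 + 2x + 2$ over $\mathbb{F}_{17}$, the base point (5, 1) of order 19, and all function names are illustrative choices and not parameters from the text. At this size the discrete logarithm problem is of course not difficult.

```python
import secrets

P_MOD, A_COEF = 17, 2      # toy curve y^2 = x^3 + 2x + 2 over F_17 (assumed example)
BASE, ORDER = (5, 1), 19   # base point P and the order of the group it generates


def add(p1, p2):
    """Add two points of the curve; None represents the identity O."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None  # p2 = -p1
    if p1 == p2:
        s = (3 * x1 * x1 + A_COEF) * pow(2 * y1, -1, P_MOD) % P_MOD
    else:
        s = (y2 - y1) * pow(x2 - x1, -1, P_MOD) % P_MOD
    x3 = (s * s - x1 - x2) % P_MOD
    return (x3, (s * (x1 - x3) - y1) % P_MOD)


def mul(n, point):
    """Scalar multiplication n * point by double-and-add."""
    result = None
    while n:
        if n & 1:
            result = add(result, point)
        point = add(point, point)
        n >>= 1
    return result


def diffie_hellman():
    """Return the shared points computed by the two parties."""
    a = secrets.randbelow(ORDER - 1) + 1   # secret of the first party
    b = secrets.randbelow(ORDER - 1) + 1   # secret of the second party
    pub_a, pub_b = mul(a, BASE), mul(b, BASE)
    return mul(a, pub_b), mul(b, pub_a)
```

Both values returned by `diffie_hellman()` are the same point abP, which the two parties can use to derive a common key.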
16.6.56 Remark In order for such a key exchange scheme to be useful and secure against known
attacks, we require:
1. arithmetic on A should be efficient;
2. the group order |A(F)| should be nearly prime.
16.6.57 Remark There is a vast literature on the discrete logarithm problem. The reader is referred
to the surveys [1495, 1586] for some results that are not mentioned here.
16.6.58 Remark The motivation for considering higher dimensional Abelian varieties as the basis of
a cryptographic scheme is based on the fact that if A has dimension d, the number of points
|A(F)| is of order $q^d$. Thus, we should expect the difficulty of the discrete logarithm problem to
be of the order $q^{d/2}$. In particular, a 2-dimensional Abelian variety has a discrete logarithm
problem of difficulty O(q) whereas an elliptic curve has a discrete logarithm problem of
difficulty $O(q^{1/2})$. This means that there is the possibility of achieving the same level of
security that an elliptic curve over a field of $2^{163}$ elements offers by using a two dimensional Abelian
variety over a field of $2^{82}$ elements. The fact that we can work over a field of smaller size may mean
that there is a reduction in the overhead of time and memory required to do computations.
Whether this is actually the case is a matter of current research.
16.6.59 Remark Frey [1104] suggested that it might be possible to use Weil descent to attack the
discrete logarithm problem on elliptic curves defined over Fpn where n is composite. This
idea was explicitly developed and analyzed by Gaudry, Hess, and Smart [1250] in the case
p = 2.
16.6.60 Definition Let $\ell$ and n be positive integers. Set $q = 2^\ell$, $k = \mathbb{F}_q$, and $K = \mathbb{F}_{q^n}$. Let E be a
non-supersingular elliptic curve defined over K by the equation
y 2 + xy = x3 + ax + b
16.6.67 Remark Besides the Weil pairing described above, Frey and Rück [1108] introduced the
Lichtenbaum-Tate pairing. Suppose that A is the Jacobian of the curve C defined over F.
If $\ell$ is a prime not dividing q (the cardinality of F) and k is the order of q modulo $\ell$, there
is a pairing
\[ A[\ell](F) \times A(F)/\ell A(F) \longrightarrow \mathbb{F}_{q^k}^{\times} / (\mathbb{F}_{q^k}^{\times})^{\ell}. \]
16.6.68 Theorem [1108] This pairing is non-degenerate. Using it, the discrete logarithm problem
in $A[\ell](F)$ can be reduced to the corresponding problem in $\mathbb{F}_{q^k}^{\times}$ in probabilistic polynomial
time in log q.
16.6.70 Remark If the embedding degree is small, then solving the discrete logarithm problem in
$\mathbb{F}_{q^k}^{\times}$ is practical. Several authors have found examples of Abelian varieties for which the
embedding degree is small.
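Taking the embedding degree to be the integer k of Remark 16.6.67, that is, the multiplicative order of q modulo the prime $\ell$, it can be computed directly for small parameters. The following is a naive sketch with a hypothetical function name; for primes of cryptographic size one would instead test the divisors of $\ell - 1$.

```python
from math import gcd


def embedding_degree(q: int, ell: int) -> int:
    """Smallest k >= 1 with q**k congruent to 1 modulo ell (naive search)."""
    if ell < 2 or gcd(q, ell) != 1:
        raise ValueError("ell must be at least 2 and coprime to q")
    k, power = 1, q % ell
    while power != 1:
        power = power * q % ell
        k += 1
    return k
```

For instance, `embedding_degree(2, 7)` is 3, since $2^3 = 8 \equiv 1 \pmod 7$, so a subgroup of order 7 over $\mathbb{F}_2$ would be mapped into $\mathbb{F}_{2^3}^{\times}$.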
16.6.71 Theorem [1097] Given a CM field K of degree 2g ≥ 4, a primitive CM-type Φ of K, a
positive integer k and a prime r ≡ 1 (mod k) that splits completely in K, there exists a
prime p and a simple, ordinary Abelian variety A defined over $\mathbb{F}_p$ with embedding degree k
with respect to r, and an element π ∈ K so that $|A(\mathbb{F}_p)| = N_{K/\mathbb{Q}}(\pi - 1)$.
16.6.72 Remark The construction of A is not explicit. Rather the polynomial P1 (T ) is constructed
explicitly and an appeal is made to the theorem of Honda and Tate. Moreover, heuristic
analysis suggests that P1 (T ) can be found in O(log r) steps.
16.6.73 Remark Galbraith, McKee, and Valenca [1158] produce examples of ordinary Abelian sur-
faces with embedding degrees 5, 10, and 12. Again, the construction appeals to the Honda-
Tate theorem.
16.6.74 Remark The embedding degree and security parameters in the case of supersingular Abelian
varieties are analyzed carefully by Rubin and Silverberg [2494].
See Also
References Cited: [123, 148, 294, 852, 854, 856, 980, 1097, 1104, 1108, 1158, 1250, 1495,
1582, 1585, 1586, 2078, 2438, 2494, 2783]
16.7.1 Remark Common choices for hardware are configurable semiconductor devices, such as field
programmable gate arrays (FPGAs) and application specific integrated circuits (ASICs).
FPGAs have a very low non-recurring engineering cost and can generally be re-configured
many times. On the other hand, ASICs require a high non-recurring engineering cost, but
their unit cost is relatively lower for large quantity production, and they are considered to be more
suitable for applications requiring high-speed, low area, or low power designs.
16.7.2 Remark An arithmetic algorithm for hardware can be evaluated based on one or more
of the following: (i) number of arithmetic operations over the underlying field (hence it is
essentially arithmetic complexity of the algorithm), (ii) amount of storage, e.g., flip-flops
for temporary storage and memory for pre-computed values, and (iii) number of accesses
to memories used.
16.7.3 Remark An arithmetic algorithm can be mapped onto different types of hardware archi-
tectures, e.g., serial, parallel, pipelined, systolic, etc. Two important figures of merit to
characterize the performance of an architecture are space and time complexities. Generally,
trade-offs between space and time are possible. Main components of a hardware architecture
include: logic gates, temporary storage (flip-flops, registers, etc.), and memories.
16.7.4 Definition The space complexity of an architecture is its number of logic gates (e.g., AND
and XOR) and the amount of storage (i.e., flip-flops and memories). For simplicity, only
two-input logic gates are assumed and interconnections are ignored.
16.7.5 Definition The critical path of an architecture is the path that represents the longest time
delay. We approximate the delay as the delay caused by the logic gates in the critical
path. In actual hardware realization, other factors such as interconnections contribute
to the delay.
16.7.6 Definition The time complexity of an architecture is the amount of time needed by the
architecture to complete the required arithmetic operation upon receiving any portion of
the input. For a fully bit-parallel architecture, the time complexity is simply the delay in
the critical path. For an architecture that requires multiple clock cycles for the arithmetic
operation, the time complexity is approximated as the product of the number of clock
cycles and the gate delays in the critical path.
16.7.7 Remark At the architectural level, various design components can be used, for example, a
two-input XOR gate for an addition over F2 and a two-input AND gate for a multiplication
over F2 . The time delay of a component is denoted as T along with a suitable subscript. For
example, we denote the time delay for a two-input XOR gate as TX , and that of a two-input
AND gate as TA .
16.7.8 Definition The elements of $\mathbb{F}_{2^n}$ can be represented as polynomials over $\mathbb{F}_2$ of degree n − 1
or less. Thus, for $a \in \mathbb{F}_{2^n}$ we can write $a = \sum_{i=0}^{n-1} a_i x^i$, where $a_i \in \mathbb{F}_2$ and 0 ≤ i ≤ n − 1.
The set $\{1, x, \ldots, x^{n-1}\}$ is a polynomial basis of $\mathbb{F}_{2^n}$ over $\mathbb{F}_2$; see Section 2.1.
16.7.9 Definition Let $a = \sum_{i=0}^{n-1} a_i x^i$ and $b = \sum_{i=0}^{n-1} b_i x^i$ be two elements of $\mathbb{F}_{2^n}$. Then the
addition of a and b is $a + b := \sum_{i=0}^{n-1} (a_i + b_i) x^i$, where $a_i + b_i$ is over $\mathbb{F}_2$.
16.7.10 Proposition Using n XOR gates in parallel, a + b can be computed with a delay of one $T_X$
only. On the other hand, a + b can be computed in bit serial fashion using only one XOR
gate and a delay of $nT_X$.
16.7.11 Definition Let a and b be two elements of F2n represented using polynomials as stated
above. Then the product of a and b is ab := a(x)b(x) (mod f (x)), where f is an irre-
ducible polynomial of degree n over F2 and defines the representation of the field.
16.7.12 Remark Various algorithms exist for computing ab. One class of algorithms involves the
following two steps: first the multiplication of the two n-term polynomials, a and b, is
performed, and then the resulting (2n − 1)-term polynomial is reduced modulo f . In the
straightforward or schoolbook method, the multiplication ab is performed by repeated shift-
and-add operations and requires $O(n^2)$ additions and multiplications over $\mathbb{F}_2$. A more effi-
cient method is known as the Karatsuba algorithm and it is based on the following.
16.7.13 Theorem [1684] Assume that n is even and the n-term polynomial a is split as $a_H x^{n/2} + a_L$,
where $a_H$ and $a_L$ are each (n/2)-term polynomials in x over $\mathbb{F}_2$. Similarly, b is split as
$b_H x^{n/2} + b_L$. Then
16.7.14 Remark We note that addition and subtraction are the same in fields of characteristic two.
In case n is not even, a zero coefficient can be padded at the higher degree end of a and b
allowing an even splitting. This technique can be applied recursively and the following can
be proven.
16.7.15 Lemma [2342] Assuming that n is a power of 2, then a recursive application of Theorem
16.7.13 for the multiplication of a and b requires no more than $n^{\log_2 3}$ multiplications and
$6n^{\log_2 3} - 8n + 2$ additions over $\mathbb{F}_2$.
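The recursion can be sketched in software. The code below assumes the standard Karatsuba identity for the split of Theorem 16.7.13, namely that the product is assembled from the three half-size products $a_L b_L$, $a_H b_H$ and $(a_L + a_H)(b_L + b_H)$. Polynomials over $\mathbb{F}_2$ are stored as Python integers (bit i holds the coefficient of $x^i$), and a counter records the single-bit multiplications (AND operations). The representation and the function names are choices made for this illustration.

```python
def schoolbook(a: int, b: int) -> int:
    """Carry-less (F_2[x]) product by shift-and-add, used as a reference."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def karatsuba(a: int, b: int, n: int, counter: list) -> int:
    """Product of two n-term polynomials over F_2, n a power of 2.

    counter[0] is incremented once per 1-bit multiplication.
    """
    if n == 1:
        counter[0] += 1
        return a & b
    h = n // 2
    mask = (1 << h) - 1
    a_lo, a_hi = a & mask, a >> h
    b_lo, b_hi = b & mask, b >> h
    low = karatsuba(a_lo, b_lo, h, counter)
    high = karatsuba(a_hi, b_hi, h, counter)
    mid = karatsuba(a_lo ^ a_hi, b_lo ^ b_hi, h, counter)
    # Subtraction equals addition (XOR) in characteristic two.
    return (high << n) ^ ((mid ^ low ^ high) << h) ^ low
```

For n = 8 the counter ends at 27, which equals $8^{\log_2 3} = 3^3$, in line with the multiplication count of the lemma.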
16.7.16 Corollary [1026, 2342] A fully bit parallel implementation of the Karatsuba algorithm for
the multiplication of two n-term $\mathbb{F}_2$ polynomials requires $6n^{\log_2 3} - 8n + 2$ XOR and $n^{\log_2 3}$
AND gates, and its time complexity is $(3\log_2 n - 1)T_X + T_A$.
16.7.17 Remark Since log2 3 ≈ 1.58 < 2, the Karatsuba algorithm, which first appeared in [1684]
for integer multiplication, has a subquadratic arithmetic complexity. In the context of poly-
nomial multiplication, a generalization of the Karatsuba algorithm can be found in [2963].
16.7.18 Remark The main rationale behind designing a fully parallel multiplier is to achieve a
higher speed. For multiplication of polynomials a and b, one way to achieve a time com-
plexity that is lower than that of the Karatsuba algorithm is to split the polynomials
according to the parity of the x’s exponent [1028], i.e., $a(x) = a_e(y) + x a_o(y)$, where $y = x^2$,
$a_e(y) = \sum_{i=0}^{n/2-1} a_{2i} y^i$ and $a_o(y) = \sum_{i=0}^{n/2-1} a_{2i+1} y^i$. Similarly, $b(x) = b_e(y) + x b_o(y)$. Then
the product a(x)b(x) is computed using the following Karatsuba-like formula
If n is a power of 2, then a recursive application of the above splitting can lead to a fully
bit parallel polynomial multiplication hardware that has the same space complexity as the
Karatsuba algorithm, but is about 30% faster: the main gate delays are $(2\log_2 n)T_X$ vs.
$(3\log_2 n - 1)T_X$.
16.7.19 Remark An improvement to the original F2 [x] Karatsuba algorithm in terms of space
complexity is given in [3064]. The main idea of this method can also be described using the
following refined Karatsuba identity [243, p. 325]:
This identity is in fact closer to Karatsuba’s original squaring identity ([1684, p. 595]) than
identity 16.7.1. Table 16.7.1 summarizes complexities of the Karatsuba algorithm and its
two variants.
Algorithm                        #AND              #XOR                             Gate delay
Karatsuba [32, 2342]             $n^{\log_2 3}$    $6n^{\log_2 3} - 8n + 2$         $(3\log_2 n - 1)T_X + T_A$
Overlap-free Karatsuba [1028]    $n^{\log_2 3}$    $6n^{\log_2 3} - 8n + 2$         $(2\log_2 n)T_X + T_A$
Refined Karatsuba [243, 3064]    $n^{\log_2 3}$    $5.5n^{\log_2 3} - 7n + 1.5$     $(3\log_2 n - 1)T_X + T_A$
16.7.20 Remark The Winograd short convolution algorithm may be viewed as a generalization
of the original Karatsuba-based algorithm [241, 2747, 2988]: while the original Karatsuba
algorithm performs evaluation and interpolation using linear factors $x - \infty$, $x$ and $x - 1$,
the Winograd method also uses nonlinear factors, i.e., irreducible polynomials of degrees
greater than 1. For $n = 3^i$ (i > 0), the algorithm yields an asymptotic time complexity of
$(4\log_3 n - 1)T_X + T_A \approx (2.52\log_2 n - 1)T_X + T_A$, which is better than that of the Karatsuba
algorithm. However, the Winograd method has a higher asymptotic space complexity than
the Karatsuba algorithm.
16.7.21 Remark Other subquadratic arithmetic complexity algorithms exist, e.g., [164, 243]. How-
ever, when mapped onto hardware architectures, they yield a higher time complexity, e.g.,
the delay of [243] is linear in n for a fully bit parallel implementation [574].
16.7.22 Theorem [3006] Let d be a binary polynomial of degree at most 2n − 2 (i.e., d could be the
product of two binary polynomials, each of degree at most n − 1). Let $W_f$ be the number
of nonzero coefficients of f of degree n. Then d (mod f) can be computed with at most
$(W_f - 1)(n - 1)$ bit operations.
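The bound can be observed on a small case. The sketch below reduces an unreduced product modulo f by cancelling the terms of degree 2n − 2 down to n one at a time, and counts $W_f - 1$ bit additions per cancelled term (the leading term itself is assumed to cancel at no cost, which is a counting convention adopted here and not necessarily that of [3006]). Polynomials are stored as integers with bit i holding the coefficient of $x^i$; the field $\mathbb{F}_{2^8}$ with $f = x^8 + x^4 + x^3 + x + 1$ is used purely as a familiar example.

```python
def poly_mul(a: int, b: int) -> int:
    """Unreduced product in F_2[x] by shift-and-add."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def reduce_mod(d: int, f: int, n: int):
    """Reduce d (degree <= 2n - 2) modulo f (degree n).

    Returns (remainder, number of counted bit additions).
    """
    weight = bin(f).count("1")          # W_f
    ops = 0
    for i in range(2 * n - 2, n - 1, -1):
        if (d >> i) & 1:
            d ^= f << (i - n)           # cancel the term x^i
            ops += weight - 1
    return d, ops


AES_POLY = 0x11B  # x^8 + x^4 + x^3 + x + 1, so n = 8 and W_f = 5
```

With these choices, `reduce_mod(poly_mul(0x57, 0x83), AES_POLY, 8)` returns the remainder 0xC1, and the counted operations never exceed $(W_f - 1)(n - 1) = 28$.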
16.7.23 Remark In addition to the class of algorithms mentioned in Remark 16.7.12 which performs
polynomial multiplication ab and reduction modulo f in two separate steps, another class
exists that combines these two steps. The latter class of algorithms is often used with
sequential architectures. To this end, we first look at the shift-and-reduce operation for
hardware.
16.7.24 Lemma Let a and f be as defined above. Then the i-th coordinate of xa modulo f is given
as follows:
\[ (xa)_i = \begin{cases} a_{n-1}, & i = 0, \\ a_{i-1} + f_i a_{n-1}, & 1 \le i \le n - 1. \end{cases} \qquad (16.7.2) \]
16.7.25 Remark Equation (16.7.2) can be realized using an n-stage linear feedback shift register
(LFSR) that has feedback connections corresponding to the coefficients of f . If the LFSR
is initialized with the coordinates of a then after one shift the LFSR will contain the
coordinates of xa modulo f ; see Section 10.2.
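A software model of one LFSR shift, and of a bit-serial multiplier built from it, is sketched below. It assumes the shift-and-reduce rule in the form $(xa)_0 = a_{n-1}$ and $(xa)_i = a_{i-1} + f_i a_{n-1}$ for $1 \le i \le n - 1$, with $f_0 = 1$, and a most-significant-bit-first schedule for the multiplier. The function names and the example field $\mathbb{F}_{2^8}$ with $f = x^8 + x^4 + x^3 + x + 1$ are illustrative assumptions.

```python
def lfsr_shift(a: int, f: int, n: int) -> int:
    """x * a mod f for deg a < n and deg f = n (bit i = coefficient of x^i)."""
    a <<= 1
    if (a >> n) & 1:
        a ^= f
    return a


def lfsr_shift_coords(a_bits, f_bits):
    """Same shift written coordinate by coordinate.

    a_bits = [a_0, ..., a_{n-1}], f_bits = [f_0, ..., f_{n-1}] with f_0 = 1.
    """
    n = len(a_bits)
    top = a_bits[n - 1]
    return [top] + [a_bits[i - 1] ^ (f_bits[i] & top) for i in range(1, n)]


def bit_serial_multiply(a: int, b: int, f: int, n: int) -> int:
    """a * b mod f, processing the bits of b from the most significant one."""
    c = 0
    for i in range(n - 1, -1, -1):
        c = lfsr_shift(c, f, n)     # c <- x * c mod f
        if (b >> i) & 1:
            c ^= a                  # c <- c + a
    return c


def to_bits(a: int, n: int):
    """Coefficient list [a_0, ..., a_{n-1}] of the integer-encoded polynomial a."""
    return [(a >> i) & 1 for i in range(n)]
```

The multiplier needs n shifts and at most n conditional additions, which corresponds to n clock cycles of a sequential architecture that interleaves multiplication and reduction.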
16.7.31 Definition Algorithm 16.7.28 can be modified so that in each iteration a digit is processed,
reducing the number of iterations to $\lceil n/d \rceil$, where d is the number of bits in each digit.
We refer to the modified algorithms as digit-level multiplication algorithms.
16.7.32 Remark Compared to bit-level algorithms (e.g., Algorithm 16.7.28), the number of arith-
metic operations in each iteration of a digit-level algorithm is higher and so is the space
complexity of the corresponding digit-serial architecture, typically by a factor of d. This is
because the x multiplication operation of Algorithm 16.7.28 is replaced by a multiplication
with $x^d$, and each AND operation by a multiplication of two polynomials of degree d − 1
over F2 . A number of digit-serial multiplier architectures have been proposed, see for ex-
ample, [107] and [2695]. For resource constrained applications, digit-serial multipliers may
offer suitable trade-offs between space and time requirements.
16.7.33 Remark For applications that demand high throughput (in terms of number of operations
per second), multipliers can be designed based on pipeline or systolic array architecture.
Examples of such multipliers include [2055, 3035]. Pipeline and systolic array multipliers
usually require extra flip-flops for storing intermediate results.
16.7.34 Remark Using a polynomial basis, squaring of $a \in \mathbb{F}_{2^n}$ is $a^2 = \sum_{i=0}^{n-1} a_i x^{2i} \pmod{f}$. Thus,
unlike a normal basis (see Subsection 16.7.4), the use of a polynomial basis to implement an
F2n squaring requires bit operations or logic gates. However, the number of gates required
can be quite low. For example, if the reduction polynomial f is a trinomial in some special
form, then a bit parallel squaring unit can be implemented with about n/2 XOR gates
[3007]. Like squaring, a square root in polynomial basis is also very efficient.
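The squaring formula can be modelled as two steps: spread the coefficients so that $a_i$ moves to position 2i (which costs no gates), then reduce modulo f. The sketch below does this for integer-encoded polynomials (bit i holds the coefficient of $x^i$) and includes a general multiplier only for cross-checking. The names and the example polynomial $f = x^8 + x^4 + x^3 + x + 1$, which is not a trinomial, are illustrative assumptions.

```python
def reduce_mod(d: int, f: int, n: int) -> int:
    """Reduce d (degree <= 2n - 2) modulo f (degree n)."""
    for i in range(2 * n - 2, n - 1, -1):
        if (d >> i) & 1:
            d ^= f << (i - n)
    return d


def square(a: int, f: int, n: int) -> int:
    """a^2 mod f in a polynomial basis: interleave zeros, then reduce."""
    spread = 0
    for i in range(n):
        if (a >> i) & 1:
            spread |= 1 << (2 * i)
    return reduce_mod(spread, f, n)


def multiply(a: int, b: int, f: int, n: int) -> int:
    """General product a * b mod f, used here only to cross-check square()."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        b >>= 1
    return reduce_mod(product, f, n)
```

Only the reduction step involves XOR gates, which is why the gate count of a squarer depends so strongly on the form of f.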
16.7.35 Remark Let a be a nonzero element of F2n and b be the multiplicative inverse of a.
Then in polynomial notation we have ab ≡ 1 (mod f ), where f is as defined earlier. Since
gcd(f, a) = 1, the extended Euclidean algorithm can be used to obtain b. By simply chang-
ing the initialization, the extended Euclidean algorithm can also be used for computing the
division b = c/a in F2n , which in polynomial notation can be written as ab ≡ c (mod f ).
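A minimal software sketch of this use of the extended Euclidean algorithm follows. It tracks only the cofactor of a, and computes the division c/a by initializing that cofactor with c instead of 1, as the remark describes. Polynomials over $\mathbb{F}_2$ are integers with bit i holding the coefficient of $x^i$; the function names and the example field are assumptions of the illustration.

```python
def degree(p: int) -> int:
    """Degree of an integer-encoded polynomial (-1 for the zero polynomial)."""
    return p.bit_length() - 1


def poly_divmod(a: int, b: int):
    """Quotient and remainder of a by b in F_2[x]; b must be nonzero."""
    q = 0
    while a and degree(a) >= degree(b):
        shift = degree(a) - degree(b)
        q ^= 1 << shift
        a ^= b << shift
    return q, a


def poly_mul(a: int, b: int) -> int:
    """Unreduced product in F_2[x]."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def divide(c: int, a: int, f: int) -> int:
    """Return b with a * b = c (mod f), assuming gcd(a, f) = 1."""
    if a == 0:
        raise ZeroDivisionError("division by the zero element")
    r0, r1 = f, a
    t0, t1 = 0, c            # invariant: r_i * c = t_i * a (mod f)
    while r1 != 1:
        if r1 == 0:
            raise ValueError("a is not invertible modulo f")
        q, r = poly_divmod(r0, r1)
        r0, r1 = r1, r
        t0, t1 = t1, t0 ^ poly_mul(q, t1)
    return poly_divmod(t1, f)[1]


def inverse(a: int, f: int) -> int:
    """Multiplicative inverse of a modulo f."""
    return divide(1, a, f)
```

For example, in $\mathbb{F}_{2^8}$ defined by $f = x^8 + x^4 + x^3 + x + 1$ (encoded as 0x11B), `inverse(0x53, 0x11B)` returns 0xCA.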
C = ZB, (16.7.3)
16.7.42 Remark The matrix Z is the Mastrovito matrix [2014]. Given $Z_{i-1}$, one can obtain $Z_i$ using
an LFSR operation and hence it requires $W_f - 2$ additions over $\mathbb{F}_2$. Thus, the formation
of matrix Z requires no more than $(n - 1)(W_f - 2)$ additions over $\mathbb{F}_2$ [1434]. For arbitrary
a and b, a straightforward approach to compute the matrix vector product ZB requires
$O(n^2)$ arithmetic operations over $\mathbb{F}_2$.
16.7.44 Lemma [1026] The sum of two n × n Toeplitz matrices is also an n × n Toeplitz matrix. If
the matrix entries belong to F2 , then the sum requires no more than 2n − 1 additions over
F2 .
16.7.45 Lemma [1026, 2988] Let $n = 2^i$ ($i > 0$), $T$ be an $n \times n$ Toeplitz matrix and $V$ be a column
vector over $\mathbb{F}_2$. Then the Toeplitz matrix-vector product (TMVP) $TV$ can be computed
with $n^{\log_2 3}$ multiplications and $5.5\,n^{\log_2 3} - 6n + 0.5$ additions over $\mathbb{F}_2$.
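16.7.46 Proposition [2988] The matrix $T$ and vector $V$ can be split as follows:
\[
T = \begin{pmatrix} T_1 & T_0 \\ T_2 & T_1 \end{pmatrix}
\quad\text{and}\quad
V = \begin{pmatrix} V_0 \\ V_1 \end{pmatrix},
\]
where $T_0$, $T_1$ and $T_2$ are $(n/2) \times (n/2)$ matrices and are individually in Toeplitz form, and
$V_0$ and $V_1$ are $(n/2) \times 1$ column vectors. Now the following noncommutative formula can
be used to compute the TMVP $TV$ recursively:
\[
TV = \begin{pmatrix} T_1 & T_0 \\ T_2 & T_1 \end{pmatrix}
\begin{pmatrix} V_0 \\ V_1 \end{pmatrix}
= \begin{pmatrix} P_0 + P_2 \\ P_1 + P_2 \end{pmatrix},
\tag{16.7.4}
\]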
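The recursion behind (16.7.4) can be sketched as follows. The definitions of $P_0$, $P_1$, $P_2$ do not survive in the text above, so the sketch assumes the choice $P_0 = (T_0 + T_1)V_1$, $P_1 = (T_1 + T_2)V_0$, $P_2 = T_1(V_0 + V_1)$, which satisfies the identity over $\mathbb{F}_2$. Storing a Toeplitz matrix as a vector `t` of length $2n - 1$ with $T[i][j] = t[i - j + n - 1]$ is also an assumption of this example.

```python
from itertools import product


def xor(u, v):
    """Entrywise addition over F_2."""
    return [a ^ b for a, b in zip(u, v)]


def tmvp(t, v):
    """Toeplitz matrix-vector product over F_2 with three half-size products.

    t has length 2n-1 and encodes T[i][j] = t[i - j + n - 1]; n is a power of 2.
    """
    n = len(v)
    if n == 1:
        return [t[0] & v[0]]
    h = n // 2
    t0, t1, t2 = t[:2 * h - 1], t[h:3 * h - 1], t[2 * h:]  # blocks T0, T1, T2
    v0, v1 = v[:h], v[h:]
    p0 = tmvp(xor(t0, t1), v1)   # (T0 + T1) V1
    p1 = tmvp(xor(t1, t2), v0)   # (T1 + T2) V0
    p2 = tmvp(t1, xor(v0, v1))   # T1 (V0 + V1)
    return xor(p0, p2) + xor(p1, p2)


def tmvp_naive(t, v):
    """Direct O(n^2) product, used only for comparison."""
    n = len(v)
    return [sum(t[i - j + n - 1] & v[j] for j in range(n)) % 2 for i in range(n)]


ok = all(tmvp(t, v) == tmvp_naive(t, v)
         for t in product((0, 1), repeat=7)
         for v in product((0, 1), repeat=4))
print(ok)
```

Each level replaces four half-size products by three, which is where the $n^{\log_2 3}$ multiplication count of Lemma 16.7.45 comes from.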
D = T B. (16.7.5)
16.7.48 Remark A class of transformation matrices U is given in [1434]. Special cases exist for which
T = U Z and C = U −1 D can be computed efficiently. In particular, when the reduction
polynomial is a trinomial, these transformations can be done by simple permutations. For
such special cases, the cost of multiplication of two elements of F2n is essentially the cost
of the multiplication of a Toeplitz matrix and a vector over F2 [1026].
16.7.49 Remark Let $a, b, c \in \mathbb{F}_{2^n}$ and $a \neq 0$. Then the division $b = c/a$ over $\mathbb{F}_{2^n}$ can be performed by
solving the matrix equation (16.7.3) over F2 for B, which is the column vector corresponding
to the coordinates of b [1434]. Hardware architectures are available for solving such matrix
equations [1433].
16.7.50 Remark The division b = c/a over F2n can also be performed by solving equation (16.7.5)
over F2 for B. Since the matrix in (16.7.5) is Toeplitz, algorithms for solving (16.7.5) are
more efficient than those for (16.7.3). The use of (16.7.5) however requires pre- and post-
processing. As stated in Remark 16.7.48, such processing can be as simple as a permutation
when f is a trinomial.
16.7.51 Remark Below we consider arithmetic in binary extension finite fields F2n using normal
bases. Definition and various properties of normal bases can be found in Section 5.2.
16.7.52 Remark Let $\gamma \in \mathbb{F}_{2^n}$ and $N = \{\gamma, \gamma^{2}, \gamma^{2^2}, \ldots, \gamma^{2^{n-1}}\}$ be a normal basis of $\mathbb{F}_{2^n}$ over $\mathbb{F}_2$. An
element $a \in \mathbb{F}_{2^n}$ can be represented as $a = a_0\gamma + a_1\gamma^{2} + \cdots + a_{n-1}\gamma^{2^{n-1}}$, where $a_i \in \mathbb{F}_2$
and $0 \le i \le n - 1$. It is easy to see that $a^2 = a_{n-1}\gamma + a_0\gamma^{2} + \cdots + a_{n-2}\gamma^{2^{n-1}}$. Thus the
coordinates of $a^2$ are obtained by a simple cyclic shift of the coordinates of $a$. In other
words, $a_i = (a^2)_{i+1} = (a^{2^j})_{i+j}$, where subscripts are evaluated modulo $n$.
16.7.53 Remark Let $a$ be a nonzero element of $\mathbb{F}_{2^n}$. Then we can write $a^{-1} = a^{2^n - 2} = a^{2(2^{n-1} - 1)}$.
Noting that $2^{n-1} - 1 = 1 + 2 + 2^2 + \cdots + 2^{n-2}$, the inverse can be computed with $O(n)$
multiplication and squaring operations. Using a normal basis, squaring does not require any
gates. For practical applications, this method however requires a high speed multiplier.
16.7.54 Remark Consider an integer $m > 1$. Denote $\tilde{m} = \lfloor m/2 \rfloor$ and $m_0 = m \pmod 2$. Then
$2^m - 1 = 2^{m_0}(2^{\tilde{m}} - 1)(2^{\tilde{m}} + 1) + m_0 = 2^{m_0}\{(2^{\tilde{m}} - 1)2^{\tilde{m}} + (2^{\tilde{m}} - 1)\} + m_0$. Thus, given $a^{2^{\tilde{m}} - 1}$,
one can compute $a^{2^m - 1}$ using $1 + m_0$ multiplications (disregarding other operations). By
recursively applying this technique, the inverse of a nonzero $a \in \mathbb{F}_{2^n}$ can be computed with
$\lfloor \log_2(n-1) \rfloor + W(n-1) - 1$ multiplications over $\mathbb{F}_{2^n}$, where $W(n-1)$ is the number of nonzero
bits in the binary representation of $n - 1$. Inversion algorithms incurring only this many
multiplications have been independently proposed by Itoh-Tsujii [1576] and Feng [1052],
and have been subsequently implemented in hardware for cryptographic applications, see
for example [1361, 2441, 2466, 2489].
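The recursion of Remark 16.7.54 can be sketched in software as below. For simplicity the field is realized in a polynomial basis (an assumption of this example, with $n = 8$ and $f = x^8 + x^4 + x^3 + x + 1$), so squarings are done by ordinary multiplication. They are not counted, mirroring the normal-basis setting in which squaring is free.

```python
N, F = 8, 0x11B  # assumed example field F_{2^8}


def gf_mul(a, b):
    """Shift-and-add multiplication in F_{2^N}."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> N) & 1:
            a ^= F
    return r


def sqr_k(a, k):
    """Apply k successive squarings (cyclic shifts in a normal basis)."""
    for _ in range(k):
        a = gf_mul(a, a)
    return a


def pow_2m_minus_1(a, m):
    """Return (a^(2^m - 1), number of general multiplications used)."""
    if m == 1:
        return a, 0
    half, odd = divmod(m, 2)
    b, count = pow_2m_minus_1(a, half)
    c = gf_mul(sqr_k(b, half), b)      # a^((2^half - 1) 2^half + (2^half - 1))
    count += 1
    if odd:                            # extra factor 2 and summand m_0 = 1
        c = gf_mul(sqr_k(c, 1), a)
        count += 1
    return c, count


def inverse(a):
    """a^(-1) = (a^(2^(N-1) - 1))^2 for nonzero a."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse")
    t, count = pow_2m_minus_1(a, N - 1)
    return sqr_k(t, 1), count


inv, mults = inverse(0x53)
expected = ((N - 1).bit_length() - 1) + bin(N - 1).count("1") - 1
print(gf_mul(0x53, inv) == 1, mults == expected)
```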
16.7.55 Lemma [2196] Let $a, b, c \in \mathbb{F}_{2^n}$ be represented with respect to a normal basis $N$ as defined
above and $c = ab$. Let $A = (a_0, a_1, \ldots, a_{n-1})^T$ denote the column vector representing the
coordinates of $a = a_0\gamma + a_1\gamma^{2} + \cdots + a_{n-1}\gamma^{2^{n-1}}$. Similarly, let $B$ be the column vector
corresponding to $b$. Then
\[
c = \left( \sum_{i=0}^{n-1} a_i \gamma^{2^i} \right) \left( \sum_{i=0}^{n-1} b_i \gamma^{2^i} \right) = A^T G B,
\]
where $G = [g_{i,j}]_{i,j=0}^{n-1}$ is an $n \times n$ matrix over $\mathbb{F}_{2^n}$ and $g_{i,j} = \gamma^{2^i + 2^j}$.
16.7.56 Remark Note that $G$ depends only on $N$ and is fixed or constant for a given basis. As
each entry of $G$ can be represented with respect to $N$, one can write $G = \sum_{k=0}^{n-1} G_k \gamma^{2^k}$,
where $G_k$ is an $n \times n$ matrix over $\mathbb{F}_2$. Using Lemma 16.7.55, for $0 \le k \le n - 1$ we can
write $c_k = A^T G_k B$. Consider $G_{n-1}$ and denote $A^T G_{n-1} B$ as $g(a, b)$. Since $c^{2^k} = (ab)^{2^k}$,
for $0 \le k \le n - 1$ we have $c_{n-1-k} = g(a^{2^k}, b^{2^k})$. This leads to the following algorithm for
multiplication using normal basis $N$.
16.7.57 Algorithm (Multiplication over $\mathbb{F}_{2^n}$ using a normal basis)
Input: $a, b \in \mathbb{F}_{2^n}$ represented with respect to a normal basis $N$
Output: $c = ab$
1. $s \leftarrow a$, $t \leftarrow b$
2. For $i$ from $0$ to $n - 1$ do
3.     $c_{n-1-i} \leftarrow g(s, t)$
4.     $s \leftarrow s^2$, $t \leftarrow t^2$
5. Return $c$
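Algorithm 16.7.57 can be exercised on a small field with the sketch below. Everything specific here is an assumption for illustration: the field $\mathbb{F}_{2^5}$ built from $x^5 + x^2 + 1$, the brute-force search for a normal element, and the lookup table used to convert to normal-basis coordinates (feasible only for tiny $n$).

```python
from itertools import product

N, F = 5, 0b100101  # assumed example: x^5 + x^2 + 1


def gf_mul(a, b):
    """Polynomial-basis multiplication, used to build and check the normal basis."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> N) & 1:
            a ^= F
    return r


def to_poly(coords, basis):
    """Map normal-basis coordinates (a_0, ..., a_{N-1}) to a polynomial bit mask."""
    x = 0
    for c, e in zip(coords, basis):
        if c:
            x ^= e
    return x


def find_normal_basis():
    """Search for gamma whose conjugates gamma^(2^i) span the field."""
    for g in range(1, 1 << N):
        basis = [g]
        for _ in range(N - 1):
            basis.append(gf_mul(basis[-1], basis[-1]))
        table = {to_poly(c, basis): c for c in product((0, 1), repeat=N)}
        if len(table) == 1 << N:
            return basis, table
    raise ValueError("no normal basis found")


BASIS, TO_NORMAL = find_normal_basis()
# G_LAST[i][j] = coordinate of gamma^(2^(N-1)) in gamma^(2^i + 2^j), i.e. G_{n-1}
G_LAST = [[TO_NORMAL[gf_mul(BASIS[i], BASIS[j])][N - 1] for j in range(N)]
          for i in range(N)]


def g(s, t):
    """Bilinear form A^T G_{n-1} B over F_2."""
    return sum(s[i] & G_LAST[i][j] & t[j] for i in range(N) for j in range(N)) % 2


def square(a):
    """Squaring is a cyclic shift: (a_0, ..., a_{n-1}) -> (a_{n-1}, a_0, ..., a_{n-2})."""
    return a[-1:] + a[:-1]


def nb_mul(a, b):
    """Algorithm 16.7.57 on coordinate tuples."""
    s, t, c = tuple(a), tuple(b), [0] * N
    for i in range(N):
        c[N - 1 - i] = g(s, t)
        s, t = square(s), square(t)
    return tuple(c)


a, b = (1, 0, 1, 1, 0), (0, 1, 1, 0, 1)
print(to_poly(nb_mul(a, b), BASIS) == gf_mul(to_poly(a, BASIS), to_poly(b, BASIS)))
```

The same matrix $G_{n-1}$ is reused in every iteration; only the operands are shifted, which is what makes the sequential hardware version compact.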
16.7.58 Remark The multiplication scheme described above is due to Massey and Omura [2012].
For an arbitrary normal basis, half of the entries of $G_{n-1}$ are expected to be nonzero. Thus,
up to $O(n^2)$ arithmetic operations over $\mathbb{F}_2$ are needed for each iteration of Algorithm 16.7.57
or $O(n^3)$ for the entire algorithm.
16.7.59 Remark The number of nonzero entries in $G_{n-1}$, which is denoted by $C_N$, determines the
complexity of the normal basis multiplier. It has been shown by Mullin et al. that the
number of nonzero entries in $G_{n-1}$ is at least $2n - 1$ [2196]. The normal bases that satisfy the equality
are known as optimal normal bases (ONB). There are two types of ONB (Types I and II).
Section 5.3 presents more details on ONB.
16.7.60 Remark If Algorithm 16.7.57 is mapped onto a sequential architecture that requires n
clock cycles, then its space complexity is $O(n^2)$ logic gates and the critical path has a gate
delay of $O(\log_2 n)$. The algorithm can be easily unrolled and mapped onto a fully parallel
architecture, especially since the squaring operation in $N$ does not require any logic gates.
The space and the time complexities for such a fully parallel realization are $O(n^3)$ gates
and $O(\log_2 n)$ gate delays, respectively.
16.7.61 Remark The Massey-Omura multiplication scheme has some redundancy in the sense that
there are common terms in the expressions of product coordinates ck , 0 ≤ k ≤ n − 1 [2451].
By removing the redundancy, a bit parallel multiplier (referred to as Reyhani-Hasan-1)
that offers reduced space complexity and is applicable to any arbitrary normal basis has
been reported in [2451]. By exchanging one AND gate for one XOR gate, the multiplier
(referred to as Reyhani-Hasan-2) of [2452] reduces the number of AND gates to n(n + 1)/2.
Because the complexity of subfield multiplication is higher than that of subfield addition,
this technique is shown to be quite effective for composite field multiplications [2452].
Table 16.7.2 Complexities of quadratic F2n general normal basis parallel multipliers.
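16.7.65 Proposition [1027] Using the identity $x^{n+1} = 1 = \sum_{j=1}^{n} x^j$, we have the following
decomposition of the matrix $Z = Z_1 + Z_2$:
\[
Z =
\begin{pmatrix}
0 & a_n & a_{n-1} & \cdots & a_3 & a_2 \\
a_1 & 0 & a_n & \cdots & a_4 & a_3 \\
a_2 & a_1 & 0 & \cdots & a_5 & a_4 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
a_{n-2} & a_{n-3} & a_{n-4} & \cdots & 0 & a_n \\
a_{n-1} & a_{n-2} & a_{n-3} & \cdots & a_1 & 0
\end{pmatrix}
+
\begin{pmatrix}
a_n & a_{n-1} & a_{n-2} & \cdots & a_2 & a_1 \\
a_n & a_{n-1} & a_{n-2} & \cdots & a_2 & a_1 \\
a_n & a_{n-1} & a_{n-2} & \cdots & a_2 & a_1 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
a_n & a_{n-1} & a_{n-2} & \cdots & a_2 & a_1 \\
a_n & a_{n-1} & a_{n-2} & \cdots & a_2 & a_1
\end{pmatrix}.
\tag{16.7.8}
\]
Therefore, MVP $ZB$ may be computed via $ZB = Z_1 B + Z_2 B$.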
16.7.66 Remark The straightforward computation of the TMVP Z1 B requires n(n − 1) AND gates
and n(n − 2) XOR gates. Clearly, computing Z2 B requires only n AND gates and n − 1
XOR gates since all the rows in Z2 are the same. However, this fact was not noticed in the
original Massey-Omura [2012] normal basis multiplier, where n AND gates and n × (n − 1)
XOR gates were used to compute MVP Z2 B [2926]. After removing the above redundancy
in $Z_2 B$, Hasan et al. presented a multiplier with the following complexities: $n(n-1) + n = n^2$
AND gates, $n(n-2) + (n-1) + n = n^2 - 1$ XOR gates, and a gate delay of $\lceil 1 + \log_2 n \rceil T_X + T_A$
[1439]. The structure of Sunar-Koç’s Type I optimal normal basis multiplier is based on their
polynomial basis multiplier [1776] and its space complexity is the same as that in [1439],
but its gate delay is $\lceil 2 + \log_2 n \rceil T_X + T_A$. Another design that has the same complexities as
the multiplier in [1439] is Reyhani-Hasan-1 multiplier [2451]. These multipliers all belong to
the class of quadratic parallel multipliers, and their complexities are summarized in Table
16.7.3.
Scheme                     #AND           #XOR                    Gate delay
Wang et al. [2926]         $n^2$          $2n^2 - 2n$             $\lceil 1 + \log_2 n \rceil T_X + T_A$
Hasan et al. [1439]        $n^2$          $n^2 - 1$               $\lceil 1 + \log_2 n \rceil T_X + T_A$
Sunar-Koç [1776]           $n^2$          $n^2 - 1$               $\lceil 2 + \log_2 n \rceil T_X + T_A$
Reyhani-Hasan-1 [2451]     $n^2$          $n^2 - 1$               $\lceil 1 + \log_2 n \rceil T_X + T_A$
Reyhani-Hasan-2 [2452]     $n(n+1)/2$     $1.5n^2 - 0.5n - 1$     $\lceil 1 + \log_2 n \rceil T_X + T_A$
Table 16.7.3 Complexities of quadratic $\mathbb{F}_{2^n}$ Type I optimal normal basis parallel multipliers.
16.7.67 Remark By exchanging one AND gate for one XOR gate, Reyhani-Hasan-2 Type I optimal
normal basis multiplier reduces the number of AND gates to n(n+1)/2 [2452], while keeping
the total number of AND and XOR gates unchanged. This technique is also used in the
Elia-Leone-2 Type II optimal normal basis parallel multiplier listed in Table 16.7.4 [966].
16.7.68 Proposition [1188] Let $x = y + y^{-1}$ generate a Type II optimal normal basis of $\mathbb{F}_{2^n}$ over $\mathbb{F}_2$,
where $y$ is a primitive $(2n + 1)$-st root of unity in $\mathbb{F}_{2^{2n}}$. Define $x_i = y^i + y^{-i}$ for $0 \le i \le n$;
then we have
\[
\{x^{2^0}, x^{2^1}, \ldots, x^{2^{n-1}}\} = \{x_1, x_2, \ldots, x_n\}. \tag{16.7.9}
\]
Therefore, $X = \{x_1, x_2, \ldots, x_n\}$ is also a basis of $\mathbb{F}_{2^n}$ over $\mathbb{F}_2$.
16.7.69 Proposition [1188] Given a field element $a$ represented with respect to the above two bases,
i.e., $a = \sum_{i=0}^{n-1} \hat{a}_i x^{2^i}$ and $a = \sum_{i=1}^{n} a_i x_i$, the coordinate transformation formula between
these two bases is given as follows:
\[
a_{s(2^i)} = \hat{a}_i, \tag{16.7.10}
\]
where $0 \le i \le n - 1$ and $s(j)$ is defined as the unique integer such that $0 \le s(j) \le n$ and
$j \equiv s(j) \pmod{2n + 1}$ or $j \equiv -s(j) \pmod{2n + 1}$.
16.7.70 Remark From (16.7.9) and (16.7.10), it follows that the basis conversion operation between
X and X̂ is simply a permutation and hence may be performed in hardware without using
any logic gates.
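16.7.71 Proposition [1884, 2750] The product $ab$ can be computed via an MVP $ZB$ using basis $X$,
and the matrix $Z$ can be decomposed as the summation of two matrices, i.e., $Z = Z_1 + Z_2$:
\[
Z =
\begin{pmatrix}
a_2 & a_3 & \cdots & a_n & a_n \\
a_3 & a_4 & \cdots & a_n & a_{n-1} \\
\vdots & \vdots & & \vdots & \vdots \\
a_n & a_n & \cdots & a_3 & a_2 \\
a_n & a_{n-1} & \cdots & a_2 & a_1
\end{pmatrix}
+
\begin{pmatrix}
0 & a_1 & \cdots & a_{n-2} & a_{n-1} \\
a_1 & 0 & \cdots & a_{n-3} & a_{n-2} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
a_{n-2} & a_{n-3} & \cdots & 0 & a_1 \\
a_{n-1} & a_{n-2} & \cdots & a_1 & 0
\end{pmatrix}.
\tag{16.7.11}
\]
Here, $Z_1$ is a Hankel matrix, i.e., entries at $(i, j)$ and $(i - 1, j + 1)$ are equal, and $Z_2$ is a
circulant matrix.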
16.7.72 Remark The straightforward computation of the above matrix-vector product ZB results
in the following quadratic parallel multipliers: Sunar-Koç multiplier [2750], Elia-Leone-1
multiplier [966] and Reyhani-Hasan-1 multiplier [2451]. Their gate counts and delay com-
plexities are equal, and are summarized in Table 16.7.4.
Table 16.7.4 Complexities of quadratic F2n Type II optimal normal basis parallel multipliers.
16.7.73 Remark After being converted from a Type I optimal normal basis into basis X using
(16.7.6), the computation of ab becomes a modular multiplication operation, which can be
divided into two steps. The first step, i.e., the polynomial multiplication operation step,
can be performed using the Karatsuba algorithm. This results in Leone’s subquadratic
multiplier [1902]. Fan-Hasan’s Type I optimal normal basis subquadratic multiplier is based
on (16.7.8). For Type II optimal normal basis multiplication, Fan and Hasan use the TMVP
formula (16.7.4) and the decomposition (16.7.11). By exploiting vector and matrix symmetry
properties that exist in the matrix vector expressions of Types I and II optimal normal basis
multiplications, Hasan et al. use the block recombination technique to design subquadratic
parallel multipliers in [1437]. Table 16.7.5 gives gate counts and gate delays for the above-
mentioned optimal normal basis parallel multipliers of subquadratic space complexity.
Table 16.7.5 Complexities of subquadratic F2n Types I and II optimal normal basis parallel
multipliers.
16.7.74 Remark For Type II optimal normal bases, multiplication over F2n can be expressed in
terms of one or more multiplications of n-term polynomials over F2 [1180, 1238]. Then one
can use any suitable method for polynomial multiplication such as the Karatsuba algorithm.
The scheme presented in [1180] uses two polynomial multiplications and that in [1238]
uses only one. Both schemes however incur computational overhead to express the Type II
optimal normal basis multiplication into polynomial multiplication(s). The overheads are
$O(n)$ for [1180] and $O(n \log_2 n)$ for [1238]. In [248], the computational overhead of [1238]
has been reduced by a factor of about two.
16.7.75 Remark For efficient hardware realization of field multiplication using polynomial bases,
examples of architectures that use low weight field defining polynomials, i.e., trinomials and
pentanomials, include [2453, 2465, 2749]. Other types of polynomials that have been used
in multiplier architectures include all-one, nearly all-one, and equally-spaced polynomials.
16.7.76 Remark The field F2n can be embedded into a cyclotomic ring. This embedding technique
leads to a redundant representation, i.e., each field element is represented using more than
n bits. Several multiplier architectures have been reported using such redundant represen-
tation [923, 2216, 3009]. The best scenario for redundant representation is when only one
extra bit is needed and it occurs where a Type I optimal normal basis exists, the conditions
for which are the same as those for an irreducible all-one polynomial.
16.7.77 Remark Optimal normal bases do not exist for every field, so that alternatives that allow
efficient squaring operations include the use of Gaussian normal bases and Dickson polyno-
mials for the representation of field elements. Examples of multiplier architectures that use
such representation include [1438, 1885].
16.7.78 Remark Besides polynomial and normal bases, hardware multiplier architectures have been
reported using bases like shifted polynomial, dual, and triangular; see for example [1025,
1060, 1435, 3008].
16.7.79 Remark The Montgomery multiplication algorithm [2132] and Residue Number System
(RNS) [2691] have been extensively studied for hardware implementations of arithmetic over
prime fields; see for example [614, 2221, 2325, 2541]. These schemes have also been applied
to arithmetic over binary extension fields [164, 1775, 2787]. For multiplication over F2n , the
main cost of a straightforward realization of the Montgomery algorithm is a multiplication
of two binary polynomials of degree n − 1 [1775], and the RNS based multiplication scheme
has been shown to have $O(n^{1.6})$ bit operations for a special form of reduction polynomials
[164].
16.7.80 Remark Some cryptographic systems use exponentiation $a^e$, where $a \in \mathbb{F}_{2^n}$ and $e$ is a
positive integer of up to $n$ bits. A straightforward method for exponentiation
is to use the well-known square-and-multiply algorithm [2080]. This method requires $\lfloor \log_2 e \rfloor$
squaring operations and $W(e) - 1$ multiplications over $\mathbb{F}_{2^n}$, where $W(e)$ is the number of
nonzero bits in the binary representation of e. Many improvements have been proposed
that require some re-coding of the exponent e and/or creation of look-up tables based on a
[2080].
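The operation counts quoted for square-and-multiply can be checked with a short sketch. The left-to-right bit scan, the function names, and the example field $\mathbb{F}_{2^8}$ with $f = x^8 + x^4 + x^3 + x + 1$ are assumptions made for this illustration.

```python
N, F = 8, 0x11B  # assumed example field


def gf_mul(a, b):
    """Shift-and-add multiplication in F_{2^N}."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> N) & 1:
            a ^= F
    return r


def gf_pow(a, e):
    """Left-to-right square-and-multiply; returns (a^e, squarings, multiplications)."""
    if e <= 0:
        raise ValueError("e must be a positive integer")
    bits = bin(e)[2:]
    result, squarings, mults = a, 0, 0
    for bit in bits[1:]:               # the leading 1 bit is consumed by result = a
        result = gf_mul(result, result)
        squarings += 1
        if bit == "1":
            result = gf_mul(result, a)
            mults += 1
    return result, squarings, mults


value, squarings, mults = gf_pow(0x03, 10)
print(squarings == (10).bit_length() - 1, mults == bin(10).count("1") - 1)
```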
References Cited: [32, 107, 164, 241, 243, 248, 574, 614, 923, 966, 1025, 1026, 1027, 1028,
1052, 1060, 1180, 1188, 1238, 1361, 1431, 1432, 1433, 1434, 1435, 1437, 1438, 1439, 1576,
1684, 1775, 1776, 1884, 1885, 1902, 2012, 2014, 2055, 2080, 2132, 2196, 2216, 2221, 2316,
2325, 2342, 2441, 2451, 2452, 2453, 2465, 2466, 2489, 2541, 2691, 2695, 2747, 2749, 2750,
2787, 2926, 2953, 2963, 2988, 3006, 3007, 3008, 3009, 3035, 3064]
17
Miscellaneous applications
dynamical systems described by polynomial functions over a finite field. This provides ac-
cess to the algorithmic theory of computational algebra and the theoretical foundation of
algebraic geometry, which help with all aspects of the modeling process. Polynomial dynam-
ical systems provide a unifying framework for many discrete modeling types. The algebraic
framework allows for efficient construction and analysis of discrete models.
17.1.3 Remark Two directed graphs are usually assigned to each such system.
17.1.4 Definition The dependency graph (or wiring diagram) $D(f)$ of $f$ has $n$ vertices $1, \ldots, n$,
corresponding to the variables $x_1, \ldots, x_n$ of $f$. There is a directed edge $i \to j$ if there
exists $x = (x_1, \ldots, x_i, \ldots, x_n) \in \mathbb{F}^n$ such that $f_j(x) \neq f_j(x_1, \ldots, x_i + 1, \ldots, x_n)$. That
is, $D(f)$ encodes the variable dependencies in $f$.
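For small systems the dependency graph can be computed directly from this definition by exhaustive search. The sketch below does so; the three-variable system over $\mathbb{F}_2$ is a made-up example and the function names are hypothetical.

```python
from itertools import product


def dependency_edges(f, n, p=2):
    """Return the edges (i, j) of D(f), with 1-based vertex labels.

    f maps a tuple in F_p^n to a tuple in F_p^n. An edge i -> j is recorded
    when changing x_i to x_i + 1 changes f_j for at least one state x.
    """
    edges = set()
    for x in product(range(p), repeat=n):
        fx = f(x)
        for i in range(n):
            y = list(x)
            y[i] = (y[i] + 1) % p
            fy = f(tuple(y))
            for j in range(n):
                if fx[j] != fy[j]:
                    edges.add((i + 1, j + 1))
    return edges


def toy(x):
    """Hypothetical system f = (x2*x3, x1, x1 + x3) over F_2."""
    x1, x2, x3 = x
    return ((x2 * x3) % 2, x1, (x1 + x3) % 2)


print(sorted(dependency_edges(toy, 3)))
```

The search visits all $p^n$ states, so it is only practical for small $n$.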
17.1.5 Definition The dynamics of $f$ is encoded by its phase space (or state space), denoted
by $S(f)$. It is the directed graph with vertex set $\mathbb{F}^n$ and a directed edge from $u$ to $v$ if
$f(u) = v$.
17.1.6 Definition For each $u \in \mathbb{F}^n$, the orbit of $u$ is the sequence $\{u, f(u), f^2(u), \ldots\}$, where $f^k$
means the $k$-th composition of $f$. The sequence $\{u, f(u), f^2(u), \ldots, f^{t-1}(u)\}$ is a limit cycle
of length $t$ and $u$ is a periodic point of period $t$ if $u = f^t(u)$ and $t$ is the smallest such
number. Since $\mathbb{F}^n$ is finite, every orbit must include a limit cycle.
17.1.7 Definition The point u is a fixed point (or steady state) if f (u) = u.
17.1.8 Lemma [748] (Limit cycle analysis) For a polynomial dynamical system $f = (f_1, \ldots, f_n) :
\mathbb{F}^n \to \mathbb{F}^n$, states in a limit cycle of length $t$ are elements of the algebraic variety
$V(f_1^t - x_1, \ldots, f_n^t - x_n)$, defined by the polynomials $f_1^t - x_1, \ldots, f_n^t - x_n$, but not of the
variety $V(f_1^s - x_1, \ldots, f_n^s - x_n)$ for any $s < t$. In particular, fixed points are the points in
the variety $V(f_1 - x_1, \ldots, f_n - x_n)$.
17.1.9 Definition A component of the phase space S(f ) consists of a limit cycle and all orbits of
f that contain it. Hence, the phase space is a disjoint union of components.
17.1.10 Definition A polynomial dynamical system can be iterated using different update sched-
ules:
• synchronous: all variables are updated at the same time;
• delays.
17.1.11 Remark Synchronous and asynchronous systems have the same fixed points.
17.1.12 Remark For visualization and analysis of PDS, the Web-based tool ADAM is available, see
Section 17.1.5 [1502]. PDS can be used to represent dynamic biological systems.
17.1.13 Example The lac(tose) operon is a functional unit of three genes, LacZ, LacY, and LacA,
transcribed together, responsible for the metabolism of lactose in the absence of glucose in
bacteria. In the presence of lactose this genetic machinery is disabled by a repressor pro-
tein. The genes in the lac operon encode several proteins involved in this process. Lactose
permease transports extracellular lactose into the cell, where the protein β-galactosidase
breaks the lactose down into glucose, galactose, and allolactose. The allolactose binds to
the repressor protein and deactivates it, which results in the transcription of the three lac
genes, resulting in the production of permease and β-galactosidase. This positive feedback
loop allows for a rapid increase of lactose when needed [1583]. The result is a bistable
dynamical system. This process can be modeled by a polynomial dynamical system. Each
variable can take on two states: 0 denotes the absence (or low concentration) of a substrate
or the inactive state of a variable, and 1 denotes presence or activity. As there are only
two states, the system is modeled over the finite field F2 . Permease (xP ) transports ex-
ternal lactose (xeL ) inside the cell, and the update function for intracellular lactose (xL )
is xL (t + 1) = xeL (t) AND xP (t), or as a polynomial over F2 , fL = xeL xP . Using xB for
β-galactosidase, xM for mRNA, xA for allolactose, the functionality of the lac operon can
be modeled by the polynomial dynamical system $f = (f_L, f_A, f_M, f_P, f_B) : \mathbb{F}_2^5 \to \mathbb{F}_2^5$:
fL = xeL xP ,
fA = xL xB ,
fM = xA ,
fP = xM ,
fB = xM .
Figure 17.1.1 denotes the dependency graph and phase space of the above model [1504].
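The phase space of this five-variable model is small enough to enumerate. The sketch below iterates the synchronous update from every state and collects the limit cycles. Treating external lactose $x_{eL}$ as a fixed parameter (0 or 1) is an assumption of this example, as are the function names.

```python
from itertools import product


def lac_step(x, ext_lactose):
    """One synchronous update of (L, A, M, P, B) over F_2."""
    L, A, M, P, B = x
    return ((ext_lactose * P) % 2,  # f_L = x_eL * x_P
            (L * B) % 2,            # f_A = x_L * x_B
            A,                      # f_M = x_A
            M,                      # f_P = x_M
            M)                      # f_B = x_M


def limit_cycles(step, n):
    """Follow every orbit until it repeats and collect the cycles reached."""
    cycles = set()
    for u in product((0, 1), repeat=n):
        seen = []
        while u not in seen:
            seen.append(u)
            u = step(u)
        cycles.add(frozenset(seen[seen.index(u):]))
    return cycles


for ext in (0, 1):
    found = limit_cycles(lambda x: lac_step(x, ext), 5)
    fixed = sorted(min(c) for c in found if len(c) == 1)
    print("external lactose =", ext, "fixed points:", fixed,
          "cycle lengths:", sorted(len(c) for c in found))
```

With external lactose present, the enumeration finds two steady states, the all-zero and the all-one state, which is the bistability mentioned in the example.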
17.1.14 Remark Continuous models of biological systems, such as ordinary or partial differential
equation models, rely on exact rate parameters, for which it is oftentimes impossible to
obtain exact measurements. When not enough information is available to build a quantita-
tive model, discrete models can give valuable insight about the qualitative behavior of the
system. Such models are state- and time-discrete. For example, in the most simple case, one
distinguishes only between two states, ON and OFF, or active and inactive, present and
absent, etc.
Figure 17.1.1 PDS for lac operon: state space in the absence (top) and presence (bottom left) of glucose,
and dependency graph (bottom right). Each 5-tuple represents the states of lactose, allolactose, mRNA,
permease, and β -galactosidase, (L, A, M, P, B).
17.1.15 Remark Model types include (probabilistic) Boolean networks, logical networks, Petri nets,
cellular automata, and agent-based (individual-based) models, to name the most commonly
found ones [343, 1859, 2215, 2512, 2617, 2704]. All these model types can be translated into
PDS [1505, 2864].
17.1.16 Remark In a Boolean network model every variable is either ON or OFF, and the state
of each variable at time t + 1 is determined by a Boolean expression that involves some or
all of the variables at time t. Boolean models were first introduced in 1969 by Kauffman
for gene regulatory networks, in which each gene (variable) is either expressed (ON) or not
expressed (OFF) at every time step [1715].
17.1.17 Algorithm For Boolean network models, where there are two states (TRUE and FALSE),
F = F2 , 0 denotes FALSE and 1 TRUE. Table 17.1.17 lists the Boolean expressions and the
corresponding polynomials. All Boolean expressions can be translated to polynomials using
this correspondence.
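The table referred to above is not reproduced here. As an illustration, the sketch below uses the standard correspondence over $\mathbb{F}_2$ ($x \wedge y \mapsto xy$, $x \vee y \mapsto x + y + xy$, $\neg x \mapsto x + 1$), which is assumed to match it, and checks it on a made-up rule.

```python
from itertools import product


# Polynomial counterparts over F_2 of the basic Boolean operators
def p_and(x, y):
    return (x * y) % 2


def p_or(x, y):
    return (x + y + x * y) % 2


def p_not(x):
    return (x + 1) % 2


# Hypothetical example rule: (x AND y) OR (NOT z)
def boolean_rule(x, y, z):
    return int((x and y) or (not z))


def polynomial_rule(x, y, z):
    return p_or(p_and(x, y), p_not(z))


print(all(boolean_rule(*v) == polynomial_rule(*v)
          for v in product((0, 1), repeat=3)))
```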
17.1.18 Remark Logical models are a generalization of Boolean models, in which variables can
take on more than two states, e.g., to represent the three states low, medium, and high
concentration of a substrate. The rules governing the temporal evolution are switch-like
logical rules for the different states of each variable. Updates in logical models are specified
via parameters rather than by a (Boolean) expression.
17.1.20 Example (Logical model of lambda phage [2431]) Lambda phage is a virus that injects its
DNA into a bacterial host. Once injected, it enters either a lytic cycle or a lysogenic cycle.
In the lytic cycle, its DNA is replicated, the host cell lyses, and new viruses are released. In
the lysogenic cycle, the virus DNA is copied into the host's DNA where it remains without
any apparent harm to the host. A number of bacterial and viral genes take part in the
decision process between lysis and lysogenisation. The core of the regulatory network that
controls the life cycle consists of two regulatory genes, cI and cro. Lysogeny is maintained
if cI proteins dominate, the lytic cycle if cro proteins dominate. CI inhibits cro, and vice
versa. At high concentrations, cro downregulates its own production, see Figure 17.1.3.
This regulatory network can be encoded in the following logical model: $V = \{c1, cro\}$,
$E = \{(c1, cro, 1), (cro, c1, 1), (cro, cro, 2)\}$, and $K = \{K_{c1}, K_{cro}\}$, where $K_{c1} : \{0, 1, 2\} \to
\{0, 1\}$ is defined as $K_{c1}(0) = 1$, $K_{c1}(1) = K_{c1}(2) = 0$ and $K_{cro} : \{0, 1\} \times \{0, 1, 2\} \to
\{0, 1, 2\}$ as $K_{cro}(0, 2) = K_{cro}(1, 2) = 1$, $K_{cro}(1, 0) = K_{cro}(1, 1) = 0$, $K_{cro}(0, 0) = 1$, and
$K_{cro}(0, 1) = 2$.
Figure 17.1.3 Logical model of lambda phage; blunt ended arrows indicate an inhibitory effect.
17.1.21 Algorithm [2864] Logical models are translated into a PDS by the following algorithm.
Let $(V, E, K)$ be a logical model as in Definition 17.1.19. Choose a prime number $p$
such that $p \ge m_i + 1$ for all $1 \le i \le n$ ($\{0, \ldots, m_i\}$ has $m_i + 1$ elements), and let
$\mathbb{F} = \mathbb{F}_p = \{0, 1, \ldots, p - 1\}$ be the field with $p$ elements. Note that we may consider the set $S$
to be a subset of $\mathbb{F}^n$. Consider a vertex $v_i$ and let $g_i$ be its coordinate function. Our goal is to
represent $g_i$ as a polynomial in terms of its inputs, say $x_{i_1}, \ldots, x_{i_r}$. That is, we need a polynomial
function defined on $\mathbb{F}^r$ with values in $\mathbb{F}$. Denote $a \wedge b = \min\{a, b\}$, using the natural
order on the set $\mathbb{F}$, viewed as integers. To extend the domain of $g_i$ from $\prod_{v_j \in I(i)} \{0, \ldots, m_{j,i}\}$
to $\mathbb{F}^r$, we define $g_i(x_{i_1}, \ldots, x_{i_r}) = g_i(x_{i_1} \wedge m_{i_1}, \ldots, x_{i_r} \wedge m_{i_r})$ for $(x_{i_1}, \ldots, x_{i_r}) \in \mathbb{F}^r$. The
polynomial form of $g_i : \mathbb{F}^r \to \mathbb{F}$ is then
\[
g_i(x) = \sum_{(c_{i_1}, \ldots, c_{i_r}) \in \mathbb{F}^r} g_i(c_{i_1}, \ldots, c_{i_r}) \prod_{v_j \in I(i)} \bigl(1 - (x_j - c_j)^{p-1}\bigr),
\]
where the right-hand side is computed modulo $p$; see also the Lagrange Interpolation Formula
(Theorem 2.1.131).
17.1.22 Example The logical model of the lambda phage presented in Example 17.1.20 corresponds
to the PDS over $\mathbb{F}_3$: $c1 = x_1$, $cro = x_2$, and the polynomials are
$f_1 = -x_2^2 + 1$,
$f_2 = -x_1^2 x_2^2 + x_1^2 x_2 + x_1^2 + x_2^2 - x_2 - 1$.
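The interpolation formula of Algorithm 17.1.21 can be sketched with a small dictionary-based polynomial representation (exponent tuple mapped to coefficient). The representation and all function names are assumptions of this example. The sketch checks only that the interpolated polynomial reproduces the table it is given. For $K_{c1}$ it returns $1 + 2x^2$, which equals $-x_2^2 + 1$ over $\mathbb{F}_3$.

```python
from itertools import product


def poly_add(f, g, p):
    out = dict(f)
    for e, c in g.items():
        out[e] = (out.get(e, 0) + c) % p
    return {e: c for e, c in out.items() if c}


def poly_mul(f, g, p):
    out = {}
    for ea, ca in f.items():
        for eb, cb in g.items():
            e = tuple(a + b for a, b in zip(ea, eb))
            out[e] = (out.get(e, 0) + ca * cb) % p
    return {e: c for e, c in out.items() if c}


def interpolate(table, r, p):
    """Sum over c of table[c] * prod_j (1 - (x_j - c_j)^(p-1)), modulo p."""
    zero = (0,) * r
    result = {}
    for c in product(range(p), repeat=r):
        term = {zero: table[c] % p}
        for j in range(r):
            xj = tuple(int(i == j) for i in range(r))
            lin = {xj: 1, zero: (-c[j]) % p}            # x_j - c_j
            power = {zero: 1}
            for _ in range(p - 1):
                power = poly_mul(power, lin, p)          # (x_j - c_j)^(p-1)
            neg = {e: (-k) % p for e, k in power.items()}
            term = poly_mul(term, poly_add({zero: 1}, neg, p), p)
        result = poly_add(result, term, p)
    return result


def evaluate(f, x, p):
    total = 0
    for e, c in f.items():
        m = c
        for xi, ei in zip(x, e):
            m = m * pow(xi, ei, p) % p
        total = (total + m) % p
    return total


p = 3
k_c1 = {(0,): 1, (1,): 0, (2,): 0}                       # K_c1 as a function of cro
print(interpolate(k_c1, 1, p))

k_cro = {(0, 0): 1, (0, 1): 2, (0, 2): 1, (1, 0): 0, (1, 1): 0, (1, 2): 1}
# extend the c1 argument from {0, 1} to F_3 via x -> min(x, 1)
table = {(a, b): k_cro[(min(a, 1), b)] for a in range(p) for b in range(p)}
g_cro = interpolate(table, 2, p)
print(all(evaluate(g_cro, x, p) == table[x] for x in table))
```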
17.1.23 Remark The logical model can be converted manually or with the software package ADAM
[1502]. Instead of analyzing the logical model, the corresponding PDS can be analyzed for
its dynamic features.
17.1.24 Remark Petri nets are bipartite graphs, consisting of places and transitions. Places can
be marked with tokens, usually representing concentration levels or number of molecules
present. Transitions fire in a non-deterministic way and move tokens between places. Petri
nets have been used extensively to model chemical reaction networks, where places represent
species and transitions interactions. Analysis of the Petri net can reveal which species are
persistent, i.e., which species can become extinct if all species are present at the initial time.
For the translation algorithm of Petri nets into polynomial dynamical systems, see [2864].
17.1.25 Remark Agent-based models (ABM) (or individual-based models) are computational models
consisting of individual agents, each agent having a set of rules that defines how it interacts
with other agents and the environment. Simulation is used to assess the evolution of the
system as a whole. Sophisticated agent-based models have been published that simulate
biological systems including tumor growth and the immune system [90, 976, 2000]. For
conversion of agent-based models into polynomial dynamical systems, see [1505].
17.1.28 Remark The minimal-sets algorithm [1597] constructs the set of all “minimal” wiring
diagrams such that a model that fits the experimental data exists for each wiring diagram.
Given $m$ observations stating that the inputs $s_i$ result in the state $t_i$ for a variable,
i.e., $(s_1, t_1), \ldots, (s_m, t_m)$, where $s_i \in \mathbb{F}^n$ and $t_i \in \mathbb{F}$, the minimal-sets algorithm identifies
all inclusion minimal sets $S \subseteq \{1, \ldots, n\}$ such that there exists a polynomial function
$f \in \mathbb{F}[\{x_i \mid i \in S\}]$ with $f(s_i) = t_i$. The algorithm is based on the fact that one can define
a simplicial complex ∆ associated to the experimental data, such that the face ideal for
the Alexander dual of ∆ is a square-free monomial ideal M , and the generating sets for
the minimal primes in the primary decomposition of the ideal M are exactly the desired
minimal sets.
17.1.29 Remark Typically, there are many models that fit experimental data, even when restricting
the model space to minimal models. The minimal sampling algorithm is used to sample the
subspace of minimal models; it returns a set of weighted functions per node, assigning
a higher weight to functions that are candidates for several monomial orders [868]. The
algorithm is based on the Gröbner fan of an ideal.
17.1.30 Remark The methods described in this section heavily rely on computer algebra systems,
e.g., Macaulay2 [1355]. A major advantage of translating other discrete modeling types
into polynomial systems is that efficient implementations of algorithms such as Gröbner
basis calculations or primary decomposition are already available, and can be used
independently of the underlying model type. On the other hand, an algorithm implemented
to analyze a Petri net cannot be re-used to analyze a logical model. As many biologists
are not familiar with computer algebra systems and the mathematical theory, software
packages have been developed that allow the construction and analysis of discrete models
using methods described in this section, without requiring understanding of the underlying
mathematics [866, 1502].
17.1.31 Remark Sections 17.1.6.1 to 17.1.6.4 describe specific classes of polynomial dynamical sys-
tems, and theorems that relate the structure of the PDS to its dynamics.
17.1.32 Remark Certain polynomial functions are very unlikely to represent an interaction in a bi-
ological system. For example, in a Boolean system, x + y, i.e., the exclusive OR, is unlikely
to represent an actual biological process. In addition, there are classes of functions that are
biologically more relevant than other functions. One such class consists of nested canalyz-
ing functions, named after the genetic concept of canalization, identified by the geneticist
Waddington in the 1940s. Networks consisting of nested canalyzing functions are robust
and stable [1714, 2887].
17.1.33 Definition A Boolean function f (x1 , . . . , xn ) is canalyzing if there exists an index i and a
Boolean value a for xi such that f (x1 , . . . , xi−1 , a, xi+1 , . . . , xn ) = b is constant. That is,
the variable xi , when given the canalyzing value a, determines the value of the function
f , regardless of the other inputs. The output value b is the canalyzed value.
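For functions of a few variables, the canalyzing property can be tested by brute force, as in the sketch below. The function name and the two test functions are illustrative choices, not taken from the text.

```python
from itertools import product


def canalyzing_inputs(f, n):
    """Return all triples (i, a, b) such that fixing x_i = a forces f = b.

    f takes n Boolean arguments (0 or 1); i is a 1-based variable index.
    """
    found = []
    for i in range(n):
        for a in (0, 1):
            values = {f(*x) for x in product((0, 1), repeat=n) if x[i] == a}
            if len(values) == 1:
                found.append((i + 1, a, values.pop()))
    return found


def and2(x, y):
    return x & y


def xor2(x, y):
    return x ^ y


print(canalyzing_inputs(and2, 2))  # AND: input 0 on either variable forces output 0
print(canalyzing_inputs(xor2, 2))  # XOR has no canalyzing variable
```

The empty result for XOR is in line with the earlier remark that $x + y$ is an unlikely model of a biological interaction.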
17.1.35 Remark Any Boolean function in $n$ variables is a map $f : \{0, 1\}^n \to \{0, 1\}$. Denote the set
of all such maps by $B_n$. For any Boolean function $f \in B_n$, there is a unique polynomial
$g \in \mathbb{F}_2[x_1, \ldots, x_n]$ such that $g(a_1, \ldots, a_n) = f(a_1, \ldots, a_n)$ for all $(a_1, \ldots, a_n) \in \mathbb{F}_2^n$ and such
that the degree of each variable appearing in $g$ is equal to 1. Namely,
\[
g(x_1, \ldots, x_n) = \sum_{(a_1, \ldots, a_n) \in \mathbb{F}_2^n} f(a_1, \ldots, a_n) \prod_{i=1}^{n} \bigl(1 - (x_i - a_i)\bigr).
\]
17.1.36 Definition Let $S$ be a non-empty set whose highest element is $r_S$. The completion of $S$,
which is denoted by $[r_S]$, is the set $[r_S] := \{1, 2, \ldots, r_S\}$. For $S = \emptyset$, let $[r_\emptyset] := \emptyset$.
17.1.38 Corollary The set of points in $\mathbb{F}_2^{2^n}$ corresponding to coefficient vectors of nested canalyzing
functions in the variable order $x_1, \ldots, x_n$, denoted by $V_{id}^{ncf}$, is given by
\[
V_{id}^{ncf} = \Bigl\{ (c_\emptyset, \ldots, c_{[n]}) \in \mathbb{F}_2^{2^n} : c_{[n]} = 1,\; c_S = c_{[r_S]} \prod_{i \in [r_S] \setminus S} c_{[n] \setminus \{i\}}, \text{ for } S \subseteq [n] \Bigr\}.
\]
17.1.39 Definition Let $\sigma$ be a permutation on the elements of the set $[n]$. We define a new order
relation $<_\sigma$ on the elements of $[n]$ as follows: $i <_\sigma j$ if and only if $\sigma^{-1}(i) < \sigma^{-1}(j)$.
Let $S$ be a nonempty subset of $[n]$, say $S = \{i_1, \ldots, i_t\}$. Let $r_S^\sigma :=
\max\{\sigma^{-1}(i_1), \ldots, \sigma^{-1}(i_t)\}$. The completion of $S$ with respect to the permutation $\sigma$,
denoted by $[r_S^\sigma]_\sigma$, is the set $[r_S^\sigma]_\sigma := \{\sigma(1), \ldots, \sigma(r_S^\sigma)\}$.
17.1.40 Corollary Let $f \in B_n$ and let $\sigma$ be a permutation of the set $[n]$. The polynomial $f$ is a nested
canalyzing function in the order $x_{\sigma(1)}, \ldots, x_{\sigma(n)}$, with input values $a_{\sigma(i)}$ and corresponding
output values $b_{\sigma(i)}$, $1 \le i \le n$, if and only if $c_{[n]} = 1$ and, for any subset $S \subseteq [n]$,
\[
c_S = c_{[r_S^\sigma]_\sigma} \prod_{w \in [r_S^\sigma]_\sigma \setminus S} c_{[n] \setminus \{w\}}.
\]
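17.1.41 Corollary Let $\sigma$ be a permutation on $[n]$. The set of points in $\mathbb{F}_2^{2^n}$ corresponding to nested
canalyzing functions in the variable order $x_{\sigma(1)},\dots,x_{\sigma(n)}$, denoted by $V_{\mathrm{id}}^{\sigma}$, is defined by
\[
V_{\mathrm{id}}^{\sigma} = \Bigl\{ (c_\emptyset,\dots,c_{[n]}) \in \mathbb{F}_2^{2^n} : c_{[n]} = 1,\ c_S = c_{[r_S^\sigma]_\sigma} \prod_{w \in [r_S^\sigma]_\sigma \setminus S} c_{[n]\setminus\{w\}},\ \text{for } S \subseteq [n] \Bigr\}.
\]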
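17.1.42 Corollary Let $f \in R$ and let $\sigma$ be a permutation of the elements of the set $[n]$. If $f$ is a nested
canalyzing function in the order $x_{\sigma(1)},\dots,x_{\sigma(n)}$, with input values $a_i$ and corresponding
output values $b_i$, $1 \le i \le n$, then
\[
\begin{aligned}
a_i &= c_{[n]\setminus\{\sigma(i)\}}, && \text{for } 1 \le i \le n-1,\\
b_1 &= c_\emptyset + c_{\sigma(1)}\, c_{[n]\setminus\{\sigma(1)\}},\\
b_{i+1} - b_i &= c_{[i+1]_\sigma}\, c_{[n]\setminus\{\sigma(i+1)\}} + c_{[i]_\sigma}, && \text{for } 1 \le i < n-1, \text{ and}\\
b_n - a_n &= b_{n-1} + c_{[n-1]_\sigma}.
\end{aligned}
\]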
17.1.43 Remark Thus, the family of nested canalyzing polynomials in a given number of variables
can be described as an algebraic variety defined by a collection of binomials. This description
has several useful implications.
17.1.44 Remark For given time course data and a wiring diagram, one can identify all nested
canalyzing functions (17.1.6.1) that fit this information. Nested canalyzing functions can
be parameterized by an ideal, and intersecting the variety of this ideal with the variety of
the ideal of all functions that fit the data results in the desired set of models [1503].
17.1.45 Remark For linear systems, i.e., all polynomials are linear functions, the dynamics of a
system can be determined completely from the structure of the polynomials [975, 2812].
17.1.46 Remark Conjunctive (respectively disjunctive) Boolean network models consist of functions
constructed using only the AND (respectively OR) operator. For conjunctive or disjunctive
networks with strongly connected dependency graph, the cycle structure is completely de-
termined by a formula that depends on the loop number, the greatest common divisor of the
lengths of the dependency graph’s simple (no repeated vertices) directed cycles. For general
Boolean conjunctive or disjunctive networks, an upper and lower bound for the number of
cycles of a particular length can be calculated [1598].
See Also
References Cited: [74, 90, 343, 748, 866, 867, 868, 975, 976, 1355, 1502, 1503, 1504, 1505,
1583, 1597, 1598, 1599, 1714, 1715, 1859, 1860, 2000, 2215, 2431, 2512, 2514, 2515, 2617,
2704, 2812, 2864, 2887, 3010]
In this chapter we mention some topics of the theory of finite fields related to quantum
information theory. However, we will not give any background on quantum information
theory and just refer to the monographs [1742, 2286, 2360]. For a more detailed treatment
of quantum algorithms for algebraic problems we refer to the survey [619].
17.2.1 Definition A maximal set of mutually unbiased bases, for short MUBs, is given by a set of
$n^2 + n$ vectors in $\mathbb{C}^n$ which are the elements of $n + 1$ orthonormal bases of $\mathbb{C}^n$:
\[
B_h = \{w_{h,1},\dots,w_{h,n}\}, \qquad h = 0,\dots,n.
\]
Hence,
\[
\langle w_{h,i}, w_{h,j} \rangle = \delta_{i,j},
\qquad\text{where}\quad
\delta_{i,j} = \begin{cases} 1, & i = j,\\ 0, & i \ne j, \end{cases}
\]
and the defining property is the mutual unbiasedness, given by
\[
|\langle w_{f,i}, w_{g,j} \rangle| = \frac{1}{\sqrt{n}} \qquad (17.2.1)
\]
for $0 \le f, g \le n$, $f \ne g$, and $1 \le i, j \le n$, where $\langle a, b \rangle = \sum_{u=1}^{n} a_u \overline{b_u}$ denotes the
standard inner product of two vectors $a = (a_1,\dots,a_n)$, $b = (b_1,\dots,b_n) \in \mathbb{C}^n$.
17.2.2 Remark Mutually unbiased bases were introduced by Schwinger [2572]. They have applica-
tions in quantum state determination [1577, 3005], quantum cryptography [2401], quantum
error-correcting codes [475, 476, 1337], and the mean king’s problem [981].
17.2.3 Theorem [1743, 3005] Let $n = p^r$ be the power of a prime $p > 2$ and $\psi$ be the additive
canonical character of $\mathbb{F}_n = \{\xi_1,\dots,\xi_n\}$. Then
\[
w_{h,k} = \frac{1}{\sqrt{n}} \bigl(\psi(\xi_h \xi_u^2 + \xi_k \xi_u)\bigr)_{u=1,\dots,n}, \qquad h, k = 1,\dots,n,
\]
and $w_{0,j} = (\delta_{j,u})_{u=1}^{n}$ is a maximal set of MUBs.
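For a prime n = p the field elements can be taken as 0, 1, ..., p − 1 and the additive canonical character is ψ(x) = exp(2πix/p), so the construction can be checked numerically. The Python sketch below does this under that assumption (the case r = 1 only); the function names `mub_bases` and `inner` are illustrative.

```python
import cmath
from itertools import product

def mub_bases(p):
    """Return p + 1 bases of C^p for an odd prime p.

    bases[0] is the standard basis; bases[h + 1] holds the vectors
    w_{h,k} = p^{-1/2} (psi(h u^2 + k u))_u for k in F_p, with psi(x) = exp(2 pi i x / p).
    """
    def psi(x):
        return cmath.exp(2j * cmath.pi * (x % p) / p)
    bases = [[[1.0 if u == j else 0.0 for u in range(p)] for j in range(p)]]
    for h in range(p):
        bases.append([[psi(h * u * u + k * u) / p ** 0.5 for u in range(p)]
                      for k in range(p)])
    return bases

def inner(a, b):
    """Standard inner product on C^n, conjugate-linear in the second argument."""
    return sum(x * complex(y).conjugate() for x, y in zip(a, b))

bases = mub_bases(5)
# Largest deviation of |<v, w>| from 1/sqrt(5) over all pairs taken from different bases.
max_dev = max(abs(abs(inner(v, w)) - 5 ** -0.5)
              for f, g in product(range(6), repeat=2) if f != g
              for v in bases[f] for w in bases[g])
```

The deviation `max_dev` should be at the level of floating-point rounding; the unbiasedness itself comes from the absolute value of a quadratic Gauss sum.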
17.2.4 Remark Maximal sets of $n + 1$ MUBs in dimension $n$ are known to exist only when
$n = p^r$ is a power of a prime $p$. For $n = 2^r$ a construction based on Galois
rings is given in [1743].
If we relax (17.2.1) to
\[
|\langle w_{f,i}, w_{g,j} \rangle| = O\bigl(n^{-1/2} (\log n)^{1/2}\bigr),
\]
we can construct sets of $n + 1$ orthonormal bases for any dimension $n$ [2656]. Elliptic curves
over finite fields can be used to construct sets of $n + 1$ orthonormal bases with
\[
|\langle w_{f,i}, w_{g,j} \rangle| = O\bigl(n^{-1/2}\bigr),
\]
which applies to almost all dimensions $n$ and, under some widely believed conjectures about
the gaps between primes, to all $n$.
17.2.5 Theorem [2656] Let $E$ be an elliptic curve over a finite field $\mathbb{F}_p$ of prime order $p > 3$ with
$n$ points. For $2 \le d \le n - 1$ denote by $A_d$ the set of bivariate polynomials over $E$ of degree
at most $d$ with $f(0, 0) = 0$. Let $X$ denote the character group of $E$. For $f \in \mathbb{F}_p[E]$ we define
the set
\[
B_f = \{v_{f,\chi} : \chi \in X\},
\]
where for a character $\chi \in X$, the vector $v_{f,\chi}$ is given by
\[
v_{f,\chi} = \frac{1}{\sqrt{n}} \bigl(\psi(f(P))\chi(P)\bigr)_{P \in E},
\]
where $\psi$ denotes the additive canonical character of $\mathbb{F}_p$.
For $2 \le d \le n - 1$ the standard basis and the $p^{d-1}$ sets $B_f = \{v_{f,\chi} : \chi \in X\}$, with
$f \in A_d$, are orthonormal and satisfy
\[
|\langle v_{f,\chi}, v_{g,\psi} \rangle| \le \frac{2d + (2d + 1)n^{-1/2}}{n^{1/2}},
\]
where $f, g \in A_d$, $f \ne g$, and $\chi, \psi \in X$.
17.2.6 Remark Maximal sets of MUBs and geometries over finite fields were used to define a
discrete analog of quantum mechanical phase space and the corresponding notion of Wigner
transform [1272, 1360]. The latter is a real valued function that uniquely characterizes a
quantum state and allows one to compute measurement statistics by performing summation
along lines of the underlying finite geometry.
17.2.7 Definition For a positive integer n let A = {vi = (vi1 , . . . , vin ) : i = 1, . . . , n2 } be a set of
n2 vectors in Cn . The set of n2 × n2 matrices
\[
E = \bigl\{ E_i = (v_{ij} \overline{v_{ik}} / n)_{j,k=1}^{n} : i = 1,\dots,n^2 \bigr\}
\]
dimensions and numerical evidence exists for dimensions up to 45, see [2450], it is a difficult
task to explicitly construct SIC-POVMs. There are no known infinite families and in fact
it is not even clear if there are SIC-POVMs for infinitely many n. However, ASIC-POVMs
can be constructed for any power of an odd prime.
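17.2.9 Theorem [1744] Let $q$ be a power of a prime $p \ge 3$ and $\psi$ denote the additive canonical
character of $\mathbb{F}_q$. Let
\[
v_{a,b} = q^{-1/2} \bigl(\psi(ax^2 + bx)\bigr)_{x \in \mathbb{F}_q} \in \mathbb{C}^q
\]
for all $(a, b) \in \mathbb{F}_q \times \mathbb{F}_q^*$ and $v_{a,0} = (\delta_{a,x})_{x \in \mathbb{F}_q}$ for all $a \in \mathbb{F}_q$. We define
$E_{a,b} = (v_{a,b,x} \overline{v_{a,b,y}})_{x,y \in \mathbb{F}_q}$ and
\[
G = \sum_{a \in \mathbb{F}_q} \sum_{b \in \mathbb{F}_q} E_{a,b}.
\]
Then the set $\{F_{a,b} : a, b \in \mathbb{F}_q\}$, with $F_{a,b} = G^{-1/2} E_{a,b} G^{-1/2}$, is an ASIC-POVM.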
17.2.10 Remark Recall that for a matrix A ∈ Cd×d the Hermitian conjugate is defined as
A† = (At )∗ , where At denotes transposition and ∗ denotes entry-wise complex conjugation.
For two matrices A, B we denote by A ⊗ B their (Kronecker) tensor product.
17.2.11 Definition (QECC characterization [1759]) Let C ≤ Cd be a subspace and let E ⊆ C d×d be
a set of error operators. Then C is a quantum error-correcting code for E if the projector
PC onto the code space C satisfies the identities PC E † F PC = αE,F PC for all E, F ∈ E
and for some αE,F ∈ C. In this case C can detect all errors in E.
17.2.12 Remark We next give a brief description of the stabilizer formalism which allows connecting
the problem of finding quantum error-correcting codes to the problem of finding isotropic
subspaces over finite fields with respect to a certain symplectic inner product.
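17.2.13 Definition Let $\mathbb{F}_q$ be the finite field of order $q = p^r$ where $p$ is prime. Denote by $\psi$
an additive character of $\mathbb{F}_q$. For $\alpha \in \mathbb{F}_q$ denote by $e_\alpha$ the corresponding standard
basis vector in $\mathbb{C}^q$, i.e., the vector that has 1 in position $\alpha$ and is 0 elsewhere. We
define the following unitary error operators:
\[
X_\alpha := \sum_{x \in \mathbb{F}_q} e_{x+\alpha} e_x^t \ \text{ for } \alpha \in \mathbb{F}_q
\qquad\text{and}\qquad
Z_\beta := \sum_{z \in \mathbb{F}_q} \psi(\beta z)\, e_z e_z^t \ \text{ for } \beta \in \mathbb{F}_q.
\]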
The weight of $E_{\gamma,\alpha,\beta}$ is the number of indices $i$ for which not both $\alpha_i$ and $\beta_i$ are zero.
The group $G_n$ of all $E_{\gamma,\alpha,\beta}$ has size $p q^{2n}$ and is the Pauli group.
17.2.16 Definition [140, 477, 1337] A stabilizer code is a quantum code C with parameters [[n, k, d]]
that is the joint eigenspace of a set of n − k commuting Pauli operators S1 , . . . , Sn−k ,
where 0 ≤ k ≤ n. The distance d is the weight of the smallest error that cannot be
detected by the code.
17.2.19 Definition Let $S$ be an Abelian subgroup of $G_n$ which has trivial intersection with
the center of $G_n$. Furthermore, let $\{g_1, g_2, \dots, g_\ell\}$ where $g_i = \psi(\gamma_i) X_{\alpha_i} Z_{\beta_i}$ with
$\gamma_i \in \{0,\dots,p-1\}$ and $(\alpha_i, \beta_i) \in \mathbb{F}_q^n \times \mathbb{F}_q^n$ be a minimal set of generators for $S$. Then a
stabilizer matrix of the corresponding stabilizer code $C$ is a generator matrix of the (classical)
additive code $C \subseteq \mathbb{F}_q^n \times \mathbb{F}_q^n$ generated by $(\alpha_i, \beta_i)$. The corresponding stabilizer
matrix is defined as
\[
\begin{pmatrix} \alpha_1 & \beta_1 \\ \vdots & \vdots \\ \alpha_\ell & \beta_\ell \end{pmatrix} \in \mathbb{F}_q^{\ell \times 2n}.
\]
17.2.20 Remark A special class of stabilizer codes are CSS codes [478, 2702]. These codes are
obtained from a pair of classical linear codes $C_1 = [n, k_1, d_1]_q$ and $C_2 = [n, k_2, d_2]_q$ over $\mathbb{F}_q$.
The codes have to satisfy the condition that $C_2^\perp \subseteq C_1$, where $C_2^\perp$ denotes the dual code of
$C_2$ with respect to the standard inner product on $\mathbb{F}_q^n$. The basis states of the code space $C$
are defined as
\[
c_w := \frac{1}{\sqrt{|C_2^\perp|}} \sum_{c \in C_2^\perp} e_{c+w}, \qquad w \in C_1.
\]
Two states $c_w$ and $c_{w'}$ are identical
if and only if $w - w' \in C_2^\perp$ and otherwise they are orthogonal. The dimension of the code
space $C$ is $q^k$, where $k = k_1 + k_2 - n$, and the minimum distance is $d \ge \min(d_1, d_2)$.
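As an illustration of the CSS condition, the sketch below takes C1 = C2 to be the binary [7, 4, 3] Hamming code, an example chosen here rather than taken from the text, and checks that its dual is contained in it. The matrix `H` and the helper name `dual_contained` are assumptions of this sketch.

```python
# Parity-check matrix of the binary [7, 4, 3] Hamming code; its rows span the dual code.
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]

def dual_contained(parity_checks):
    """True if every row of the parity-check matrix is itself a codeword.

    The rows span the dual code, so this tests the CSS condition for C_1 = C_2.
    """
    return all(sum(x * y for x, y in zip(r, s)) % 2 == 0
               for r in parity_checks for s in parity_checks)

n, k1, k2, d1, d2 = 7, 4, 4, 3, 3
css_ok = dual_contained(H)
k = k1 + k2 - n            # number of encoded qubits given by the formula in the remark
d_lower = min(d1, d2)      # lower bound on the minimum distance
```

With these inputs the remark yields parameters [[7, 1, d]] with d ≥ 3, which agrees with the well-known seven-qubit Steane code.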
17.2.21 Remark It can be shown [1338, 1353] that any stabilizer code can be encoded efficiently
by using quantum operations from the Clifford group. A generating set for this group is
\[
\sum_{y \in \mathbb{F}_q} e_{\gamma y} e_y^t \ \text{ for } \gamma \in \mathbb{F}_q \setminus \{0\}, \qquad
\frac{1}{\sqrt{q}} \sum_{x,z \in \mathbb{F}_q} \psi(xz)\, e_z e_x^t, \qquad\text{and}\qquad
\sum_{x,y \in \mathbb{F}_q} e_x e_x^t \otimes e_{x+y} e_y^t,
\]
where these gates can be applied to any (pair) of the $n$ qudits.
17.2.22 Remark Several constructions of families of quantum error-correcting codes based on finite
fields are known. Starting from classical Reed-Solomon codes (see Section 15.1) over $\mathbb{F}_{2^n}$,
in [1352] a construction of codes with parameters $[[n(2^n - 1), n(2^n - 1 - 2K), d \ge K + 1]]$
was given, where $K = 2^n - \delta$ and $\delta > (2^n - 1)/2 + 1$ is the designed distance of the classical
Reed-Solomon code $[2^n - 1, K, \delta]$ underlying the construction.
For a construction of nonbinary stabilizer codes using arbitrary finite fields see [1729].
17.2.23 Remark Various bounds on the parameters of quantum error-correcting codes are
known, see [476]. Similar to the classical Singleton bound, a quantum Singleton bound of
k + 2d ≤ n + 2 for any code with parameters [[n, k, d]] can be shown. Codes meeting this
bound with equality are quantum MDS codes and several constructions based on finite fields
Fq for large q are known [1351].
17.2.24 Remark Classical Goppa codes have been used to construct quantum error-correcting codes.
A construction in [2027] is based on towers of Garcia-Stichtenoth function fields (see Section
12.6) $F_i = \mathbb{F}_{q^2}(x_0, z_1, \dots, x_i, z_{i+1})$, defined by equations $z_i^q + z_i - x_{i-1}^{q+1} = 0$ and $x_i = z_i / x_{i-1}$,
$i = 0, 1, \dots$ These quantum codes have been shown to be asymptotically good, i.e., their
parameters $[[n_i, k_i, d_i]]$ satisfy $\lim_{i\to\infty} n_i = \infty$, $\liminf_{i\to\infty} k_i/n_i > 0$, and $\liminf_{i\to\infty} d_i/n_i > 0$, see
also [601].
17.2.25 Remark The webpage http://www.codetables.de provides tables of the best known quantum
error-correcting codes for small parameters. The codes are specified by their stabilizer
matrices and where applicable it is noted that a code is optimal, i.e., achieving the highest
possible d for fixed n and k or the highest possible k for fixed n and d. The table also
contains known bounds on the parameters and contains information about the construction
of the codes. Some of these tables and constructions have also been made available in the
computer algebra system Magma.
17.2.26 Remark For a more detailed survey on quantum error-correcting codes see [1054].
17.2.27 Theorem [2623] Let $(e_n)$ be a periodic sequence over $\mathbb{F}_p$ of (unknown) period $T$. If the
mapping $n \mapsto e_n$, $0 \le n < T$, is injective, $T$ can be recovered on a quantum computer in
polynomial time.
17.2.28 Remark No classical polynomial time algorithm is known for period finding.
If $g \in \mathbb{F}_p^*$ is an element of (unknown) order $T$, Shor’s algorithm [2623] determines $T$ in
quantum polynomial time.
Let $(f_n)$ be another sequence over $\mathbb{F}_p$ of period $T$ such that $n \mapsto f_n$ is injective and
the (unknown) pair of positive integers $(t_1, t_2)$ satisfies $e_{n+t_1} f_{n+t_2} = e_n f_n$ for all $n \ge 0$.
Then Shor’s algorithm also determines $(t_1, t_2)$. Let $a, b \in \mathbb{F}_p^*$ be such that the order of $b$
is $t$ and $a = b^x$ with $0 \le x < t$. Then $x$ is the discrete logarithm of $a$ to the base $b$. If
we choose $e_n = b^n$ and $f_n = a^{-n}$, we get $e_{n+x} f_{n+1} = b^{n+x} a^{-n-1} = e_n f_n$ for all $n$ and
we have $(t_1, t_2) = (x, 1)$. Hence, Shor’s algorithm finds the discrete logarithm in quantum
polynomial time.
However, Shor’s algorithm does not provide the period of $(e_n)$ if $n \mapsto e_n$ is not injective.
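The identity behind the reduction from discrete logarithms to the shift pair (x, 1) can be confirmed classically for small parameters. The sketch below only verifies the algebraic relation and does not implement any quantum step; the function name and the parameters p = 23, b = 5, x = 7 are illustrative choices.

```python
def shift_relation_holds(p, b, x, count=50):
    """Check e_{n+x} f_{n+1} = e_n f_n for e_n = b^n and f_n = a^{-n}, where a = b^x mod p."""
    a = pow(b, x, p)
    a_inv = pow(a, p - 2, p)            # inverse of a by Fermat's little theorem (p prime)

    def e(n):
        return pow(b, n, p)

    def f(n):
        return pow(a_inv, n, p)

    return all(e(n + x) * f(n + 1) % p == e(n) * f(n) % p for n in range(count))

relation_ok = shift_relation_holds(23, 5, 7)
```

The relation holds because b^(n+x) a^(-n-1) = b^n a^(-n) (b^x a^(-1)) and b^x = a.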
17.2.29 Theorem [1400] Given two periodic sequences $(e_n)$ and $(f_n)$ with periods $T$ and $t$, respectively,
we denote by $D(e_n, f_n)$ the number of integers $n \in [0, Tt - 1]$ with $e_n \ne f_n$.
For any constant $c > 0$, there is a quantum algorithm which computes in polynomial
time, with probability at least 3/4, the period of any sequence $(e_n)$ of period $T$ satisfying
\[
D(e_n, f_n) \ge \frac{Tt}{(\log T)^c}, \qquad (17.2.3)
\]
17.2.30 Remark In [2397], for binary sequences (en ) with moderately small autocorrelation (which
includes all cryptographically interesting sequences) it was proved that (17.2.3) is fulfilled
and the algorithm of Hales and Hallgren [1400] determines their period in quantum poly-
nomial time. In [2657] a more general problem was studied of determining the period of
a sequence over an unknown finite field Fp , given a black-box which returns only a few
most significant bits of the sequence elements. A moderately small autocorrelation is again
a sufficient condition such that the condition (17.2.3) is satisfied.
17.2.31 Theorem [2509] Any monic, square-free polynomial $f \in \mathbb{F}_p[X]$ of degree $d$ can be reconstructed
from an oracle $O_f : \mathbb{F}_p \to \{-1, 0, 1\}$ given by the Legendre symbol $O_f(a) = \left(\frac{f(a)}{p}\right)$
of $f(a)$ after $O(d)$ quantum queries with probability $1 + O(p^{-1})$.
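To make the oracle concrete, the following sketch simulates O_f classically using Euler's criterion for the Legendre symbol. It illustrates only the oracle and not the quantum reconstruction algorithm; the names `legendre_symbol` and `make_oracle` and the coefficient-list representation of f are assumptions of this sketch.

```python
def legendre_symbol(a, p):
    """Legendre symbol (a/p) for an odd prime p, computed by Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def make_oracle(coeffs, p):
    """Return O_f for f(X) = sum_i coeffs[i] * X^i over F_p."""
    def oracle(a):
        value = 0
        for c in reversed(coeffs):      # Horner evaluation of f(a) mod p
            value = (value * a + c) % p
        return legendre_symbol(value, p)
    return oracle

# Example: f(X) = X + 3 over F_7, an instance of the shifted Legendre symbol.
oracle = make_oracle([3, 1], 7)
values = [oracle(a) for a in range(7)]
```

Here `values` lists O_f(0), ..., O_f(6); the single zero marks the root of f.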
17.2.32 Remark The case f (X) = X + s with an unknown s ∈ Fp is a special case of the hidden
shift problem. In this case in [2840] an efficient quantum algorithm was presented that finds
s with one query to the oracle Of . This algorithm uses the fact that the Legendre symbol
is, essentially, equal to its discrete Fourier transform.
17.2.33 Remark The hidden shift problem has also been studied for functions other than the Leg-
endre symbol. In [2488] the hidden shift problem for Boolean functions was studied and it
was shown that for any bent function (see Section 9.3) $B : \mathbb{F}_2^n \to \mathbb{F}_2$ the hidden shift $s$ can
be found by making $O(n)$ queries to $O_B(x) = (-1)^{B(x)}$. If furthermore an efficient circuit is
known to implement the dual bent function $B^*$, then this can be reduced to a single query.
These results were recently extended to the case of random Boolean functions [1260] where
the shift s can be determined from solving a system of linear equations over F2 that is
obtained from a sampling procedure and to the case of functions that are close to quadratic
bent functions, where closeness is measured using the Gowers U3 norm [2487].
17.2.34 Remark Some additive character sums over a finite field (twisted Kloosterman sums) are
closely connected to quantum algorithms for finding hidden nonlinear structures over finite
fields [618].
17.2.35 Remark Schumacher and Westmoreland [2562] investigate a discrete version of quantum
mechanics called modal quantum computing based on a finite field instead of the complex
numbers. Some characteristics of actual quantum mechanics are retained including the no-
tions of superposition, interference, and entanglement.
17.2.36 Remark Exponential sums over finite fields are used to construct quantum finite automata
[88].
17.2.37 Remark Classical and quantum algorithms for solvability testing and finding integer solutions
$x, y$ of equations of the form $a g^x + b h^y = c$ over a finite field $\mathbb{F}_q$ are studied in
[2841].
See Also
References Cited: [88, 140, 475, 476, 477, 478, 567, 568, 601, 618, 619, 791, 981, 1054,
1142, 1143, 1260, 1272, 1337, 1338, 1351, 1352, 1353, 1360, 1400, 1577, 1729, 1742, 1743,
1744, 1759, 2027, 2138, 2286, 2360, 2397, 2401, 2450, 2487, 2488, 2509, 2562, 2572, 2623,
2656, 2657, 2702, 2840, 2841, 3005]
17.3.2 Definition Let $A = (a_j)$ be a sequence over $\mathbb{C}$ of length $n$. For integer $u$ satisfying
$0 \le u < n$, the aperiodic autocorrelation of $A$ at shift $u$ is $C_u(A) = \sum_{j=0}^{n-u-1} a_j \overline{a_{j+u}}$.
17.3.3 Remark The most important case, from a practical and historical viewpoint, occurs when
the sequence is binary, which means that its alphabet is {1, −1}. A binary sequence for
which all aperiodic autocorrelations at nonzero shifts are small in magnitude relative to the
sequence length is intrinsically suited for the separation of signals from noise, and therefore
has natural applications in digital communications, including radar, synchronization, and
steganography.
17.3.4 Remark The overall goal is to find binary sequences $A$ having the property that, for each
$u \ne 0$ independently, $|C_u(A)|$ takes its smallest possible value. An ideal binary sequence $A$
from this viewpoint therefore satisfies $|C_u(A)| = 0$ or $1$ for each $u \ne 0$, which is known as a
Barker sequence. The longest Barker sequence currently known has length 13 and there is
overwhelming evidence that no longer Barker sequence exists (see [1603] for a survey).
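As a small numerical illustration of the Barker property, the following Python sketch computes the aperiodic autocorrelations of the length-13 binary sequence + + + + + − − + + − + − +. The function name `aperiodic_autocorrelation` is chosen here for illustration, and the code assumes real-valued entries, so no complex conjugation is needed.

```python
def aperiodic_autocorrelation(a, u):
    """C_u(A) = sum_{j=0}^{n-u-1} a_j * a_{j+u} for a real (here +1/-1) sequence."""
    n = len(a)
    return sum(a[j] * a[j + u] for j in range(n - u))

# Length-13 Barker sequence, written with entries in {1, -1}.
BARKER_13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]

sidelobes = [aperiodic_autocorrelation(BARKER_13, u) for u in range(1, 13)]
is_barker = all(abs(c) <= 1 for c in sidelobes)
```

The list `sidelobes` alternates between 0 and 1, so every nonzero shift attains the smallest possible magnitude.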
nonzero c in E.
17.3.6 Theorem [2527] Every m-sequence $Y$ of length $n = 2^m - 1$ satisfies
$|C_u(Y)| < 1 + (2/\pi)\sqrt{n + 1}\,\log(4n/\pi)$ for each $u$ satisfying $0 < u < n$.
17.3.7 Remark The above theorem shows the existence of an infinite family of binary sequences
for which the magnitude of the aperiodic autocorrelation (at nonzero shifts) grows no faster
than order $\sqrt{n}\,\log n$. The only known improvement of this result uses probabilistic methods
and guarantees a growth rate of at most order $\sqrt{n \log n}$ [2135].
17.3.8 Remark For the following definition we need quadratic characters, see Section 6.1.
17.3.9 Definition For an odd prime $p$, define $\lambda : \mathbb{F}_p \to \{1, -1\}$ to be the quadratic character on
$\mathbb{F}_p^*$ and $\lambda(0) = 1$. For real $r$, the $r$-shifted Legendre sequence of length $p$ is the sequence
$(x_j)$ of length $p$ satisfying $x_j = \lambda(j + \lfloor rp \rfloor)$ for $0 \le j < p$.
17.3.10 Theorem [2037] For all real $r$, the $r$-shifted Legendre sequence $X$ of length $p$ satisfies
$|C_u(X)| < 1 + 18\sqrt{p}\,\log p$ for each $u$ satisfying $0 < u < p$.
17.3.11 Definition The merit factor of a binary sequence $A$ of length $n > 1$ is
\[
F(A) = \frac{n^2}{2 \sum_{u=1}^{n-1} [C_u(A)]^2}.
\]
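The merit factor is straightforward to evaluate for short sequences. The sketch below computes it for a given ±1 sequence and also builds an r-shifted Legendre sequence in the sense of Definition 17.3.9; the function names and the sample inputs (a length-13 Barker sequence, p = 11, r = 1/4) are illustrative choices.

```python
import math

def merit_factor(a):
    """F(A) = n^2 / (2 * sum_{u=1}^{n-1} C_u(A)^2) for a +1/-1 sequence A of length n > 1."""
    n = len(a)
    energy = sum(sum(a[j] * a[j + u] for j in range(n - u)) ** 2 for u in range(1, n))
    return n * n / (2 * energy)

def shifted_legendre(p, r):
    """r-shifted Legendre sequence x_j = lambda(j + floor(r p)), with lambda(0) = 1."""
    def lam(x):
        x %= p
        if x == 0:
            return 1
        # Euler's criterion decides whether x is a square modulo the odd prime p.
        return 1 if pow(x, (p - 1) // 2, p) == 1 else -1
    shift = math.floor(r * p)
    return [lam(j + shift) for j in range(p)]

barker_13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]
f_barker = merit_factor(barker_13)           # 169 / 12
legendre_11 = shifted_legendre(11, 0.25)
f_legendre = merit_factor(legendre_11)
```

For the Barker sequence the six nonzero sidelobes each contribute 1, which gives 169/12.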
17.3.14 Remark The largest asymptotic merit factor occurring in the above theorem is 6. However,
by modifying the construction as shown below, an asymptotic merit factor of approxi-
mately 6.34 can be achieved, which is the largest known asymptotic merit factor for binary
sequences.
17.3.15 Theorem [1604, 1605] Let $t = 1.0578\ldots$ be the middle root of $4x^3 - 30x + 27$ and write
$r = 3/4 - t/2$. Let $(x_j)$ be the $r$-shifted Legendre sequence of length $p$. Define $W_p$ to be the
sequence $(w_j)$ of length $\lfloor tp \rfloor$ given by $w_j = x_j$ for $0 \le j < p$ and $w_j = x_{j-p}$ for $p \le j < \lfloor tp \rfloor$.
Then $F(W_p) \to 6.3420\ldots$, which is the largest root of $29x^3 - 249x^2 + 417x - 27$.
17.3.16 Definition Let $A = (a_j)$ and $B = (b_j)$ be sequences over $\mathbb{C}$ of length $n$. For integer $u$
satisfying $0 \le u < n$, the aperiodic crosscorrelation of $A$ and $B$ at shift $u$ is
$C_u(A, B) = \sum_{j=0}^{n-u-1} a_j \overline{b_{j+u}}$ and the periodic crosscorrelation of $A$ and $B$ at shift $u$ is
$R_u(A, B) = \sum_{j=0}^{n-1} a_j \overline{b_{j+u}}$, where indices are reduced modulo $n$.
17.3.17 Definition A set $S$ of sequences has maximum aperiodic correlation $\theta$ if $|C_u(A, B)| \le \theta$
for all $A, B \in S$ when either $A \ne B$ or $u \ne 0$.
17.3.18 Remark [2554] Code-division multiple access (CDMA) is a technique that allows multiple
users to communicate over the same medium. For example, direct-sequence CDMA employs
a set S of sequences over C of the same length. Each user is assigned a sequence of S, and
encodes information by sending a modulated version of the assigned sequence in which each
sequence element is multiplied by a complex number drawn from some alphabet. In order to
allow synchronization at the receiver and to minimize interference between different users,
it is necessary to minimize the maximum aperiodic correlation of S.
17.3.19 Remark It is a notoriously difficult problem to design sequence sets with small maximum
aperiodic correlation directly. The usual approach is therefore to design sequence sets that
have good periodic crosscorrelation properties (see Section 10.3 for constructions using finite
fields), and then analyze their aperiodic crosscorrelation properties using either separate
methods or numerical computation.
17.3.20 Definition A binary Golay sequence pair is a pair of binary sequences A, B of equal
length n whose aperiodic autocorrelations satisfy Cu (A) + Cu (B) = 0 for 0 < u < n. A
binary Golay sequence is a member of a binary Golay sequence pair.
17.3.21 Remark Binary Golay sequence pairs were introduced to solve a problem in infrared mul-
tislit spectrometry [1293], and have since been used in many other digital information
processing applications such as optical time domain reflectometry [2218] and medical ul-
trasound [2300]. The defining aperiodic autocorrelation property can be exploited to allow
very efficient energy use when transmitting information, or to remove unwanted components
from received signals.
17.3.22 Theorem [967] If there exists a binary Golay sequence pair of length n and p is an odd
prime factor of n, then −1 is a square in Fp and so p ≡ 1 (mod 4).
17.3.23 Theorem [2832] If there exist binary Golay sequence pairs of length n1 and n2 , then there
exists a binary Golay sequence pair of length n1 n2 .
17.3.24 Remark Starting from “seed” binary Golay sequence pairs of length 2, 10, and 26, the
above theorem produces a binary Golay sequence pair of length $2^k 10^\ell 26^m$ for all non-negative
integers $k, \ell, m$. Once it is known that a binary Golay sequence of a particular
length exists, it is then important in some applications to find as many such sequences of
this length as possible.
17.3.25 Definition Let $A = ((-1)^{a_j})$ be a binary sequence of length $2^m$. The algebraic normal
form of $A$ is the unique function $f_A(x_1,\dots,x_m) : \mathbb{F}_2^m \to \mathbb{F}_2$ satisfying
17.3.26 Theorem [783] Let $\pi$ be a permutation of $\{1,\dots,m\}$ and let $e_0', e_0, e_1, \dots, e_m \in \mathbb{F}_2$. The
binary sequences $A$ and $B$ of length $2^m$ having algebraic normal form
\[
f_A(x_1,\dots,x_m) = \sum_{k=1}^{m-1} x_{\pi(k)} x_{\pi(k+1)} + \sum_{k=1}^{m} e_k x_{\pi(k)} + e_0
\]
and
\[
f_B(x_1,\dots,x_m) = \sum_{k=1}^{m-1} x_{\pi(k)} x_{\pi(k+1)} + \sum_{k=1}^{m} e_k x_{\pi(k)} + e_0' + x_{\pi(1)}
\]
form a binary Golay sequence pair.
17.3.27 Example Take $m = 3$, $(\pi(1), \pi(2), \pi(3)) = (2, 1, 3)$, and $(e_0', e_0, e_1, e_2, e_3) = (0, 1, 1, 0, 1)$.
Then $f_A(x_1, x_2, x_3) = x_2 x_1 + x_1 x_3 + x_2 + x_3 + 1$ is the algebraic normal form of
$A = (- + + - - - - -)$ (writing $+$ for $1$, and $-$ for $-1$), and $f_B(x_1, x_2, x_3) = x_2 x_1 + x_1 x_3 + x_3$
is the algebraic normal form of $B = (+ - + - + + - -)$. The sequences $A$ and $B$ form a
binary Golay sequence pair of length 8.
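The example can be reproduced numerically. The sketch below assumes the indexing convention j = x_1 2^(m−1) + ... + x_m 2^0 between sequence positions and inputs of the algebraic normal form; this convention is inferred from the listed sequences A and B, since it is not stated explicitly here, and the function names are illustrative.

```python
from itertools import product

def sequence_from_anf(f, m):
    """Sequence ((-1)^{f(x)}) of length 2^m, with x_1 the most significant bit of the index."""
    return [(-1) ** f(*x) for x in product((0, 1), repeat=m)]

def aperiodic_autocorrelation(a, u):
    """C_u(A) for a real-valued sequence A."""
    return sum(a[j] * a[j + u] for j in range(len(a) - u))

def f_A(x1, x2, x3):
    return (x2 * x1 + x1 * x3 + x2 + x3 + 1) % 2

def f_B(x1, x2, x3):
    return (x2 * x1 + x1 * x3 + x3) % 2

A = sequence_from_anf(f_A, 3)
B = sequence_from_anf(f_B, 3)
# Golay property: the autocorrelations of A and B cancel at every nonzero shift.
sums = [aperiodic_autocorrelation(A, u) + aperiodic_autocorrelation(B, u)
        for u in range(1, 8)]
```

All entries of `sums` are zero, and A and B agree with the sequences given in the example.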
17.3.28 Corollary For $m > 1$, there are at least $2^m m!$ binary Golay sequences of length $2^m$.
17.3.29 Remark The $2^m m!$ binary Golay sequences arising from the above theorem form $m!/2$
cosets of the first-order Reed-Muller code RM(1, m) in the second-order Reed-Muller code
RM(2, m); see Section 15.1. When these sequences are used in multicarrier transmission, the
Golay property tightly controls variations in the transmitted power while the code structure
allows powerful error correction [783].
17.3.30 Definition An $(n, w, \lambda)$ optical orthogonal code is a set $C$ of sequences over $\{0, 1\}$ of length
$n$ and Hamming weight $w$ such that the periodic crosscorrelation satisfies $R_u(A, B) \le \lambda$
for all $A, B \in C$ when either $A \ne B$ or $u \ne 0$.
17.3.31 Remark [639] Optical orthogonal codes are used in optical CDMA. Each user is assigned a
sequence of the code, and a 1 in the sequence corresponds to a time instant when the user
is allowed to transmit a light pulse. The correlation constraint enables self-synchronization
of the system and controls the interference between different users.
17.3.32 Definition An (n, w, λ) optical orthogonal code of size M is optimal if there is no (n, w, λ)
optical orthogonal code of size greater than M .
17.3.33 Remark Optical orthogonal codes are closely related to other combinatorial objects such
as constant-weight codes and cyclic difference families [639]. There are numerous construc-
tions of optimal optical orthogonal codes. Two important general constructions using finite
geometries are given here.
17.3.34 Construction [2319] Write $F = \mathbb{F}_q$ and $E = \mathbb{F}_{q^m}$, and let $\alpha$ be primitive in $E$. The points in
the affine geometry AG$(m, q)$ are the elements of $E$; see Section 14.3. A $d$-flat in AG$(m, q)$
is a translate of a $d$-dimensional subspace of $E$ over $F$. Then the lines in AG$(m, q)$ are
precisely the 1-flats. Two $d$-flats $A$ and $B$ are equivalent if there exists $a \in E^*$ such that
$A = \{ax : x \in B\}$. This partitions the $d$-flats into orbits. Let $C$ be a set containing exactly
one representative from each orbit of a $d$-flat that does not contain 0. With a $d$-flat $S \in C$
we associate a sequence over $\{0, 1\}$ of length $q^m - 1$ by placing a 1 at position $i$ precisely
when $\alpha^i \in S$.
17.3.35 Remark The $q$-binomial coefficient
$\begin{bmatrix} m \\ k \end{bmatrix}_q = \prod_{i=1}^{k} \frac{q^{m-i+1} - 1}{q^i - 1}$
equals the number of $k$-dimensional subspaces of $\mathbb{F}_{q^m}$ over $\mathbb{F}_q$; see Section 13.2.
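The product formula translates directly into exact integer arithmetic. The following sketch is a minimal illustration; the function name `q_binomial` and the convention of returning 0 when k lies outside the range 0..m are choices made here.

```python
def q_binomial(m, k, q):
    """q-binomial coefficient [m choose k]_q = prod_{i=1}^{k} (q^(m-i+1) - 1) / (q^i - 1)."""
    if k < 0 or k > m:
        return 0
    numerator = denominator = 1
    for i in range(1, k + 1):
        numerator *= q ** (m - i + 1) - 1
        denominator *= q ** i - 1
    # The quotient is always an integer, so floor division is exact.
    return numerator // denominator

lines_through_origin_f8 = q_binomial(3, 1, 2)   # 1-dimensional subspaces of F_8 over F_2
planes_f16 = q_binomial(4, 2, 2)                # 2-dimensional subspaces of F_16 over F_2
```

The second value, 35, is the number of 2-dimensional subspaces of a 4-dimensional space over F_2, which matches the 35 lines of PG(3, 2).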
17.3.36 Theorem [2319] The above code is a $(q^m - 1, q^d, q^{d-1})$ optical orthogonal code of size
$\begin{bmatrix} m-1 \\ d \end{bmatrix}_q$. If $d = 1$ or $d = m - 1$, the code is optimal.
17.3.37 Example Take $q = 2$, $m = 3$, $d = 1$, and let $\alpha$ be primitive in $\mathbb{F}_8$. There are 21
lines in AG$(3, 2)$ that do not contain 0. These are of the form $\{x, y\}$, where $x, y \in \mathbb{F}_8^*$
and $x \ne y$. A list of representatives of the orbits is $\{1, \alpha\}$, $\{1, \alpha^2\}$, $\{1, \alpha^3\}$ and
$\{(1100000), (1010000), (1001000)\}$ is an optimal $(7, 2, 1)$ optical orthogonal code.
17.3.38 Construction [639] Write $F = \mathbb{F}_q$, $E = \mathbb{F}_{q^{m+1}}$, and $n = \frac{q^{m+1} - 1}{q - 1}$. The points in the projective
geometry PG$(m, q)$ are the elements of $E^*/F^*$ (which is isomorphic to $\mathbb{Z}/n\mathbb{Z}$); see Section
14.4. Let $\alpha$ be primitive in $E$ and let $[a]$ denote the coset of $F^*$ in $E^*$ containing $a$. A
$d$-space in PG$(m, q)$ is the image under the mapping $x \mapsto [x]$ of the nonzero elements
of a $(d + 1)$-dimensional subspace of $E$ over $F$. Then the lines in PG$(m, q)$ are precisely
the 1-spaces. Two $d$-spaces $A$ and $B$ are equivalent if there exists $a \in E^*/F^*$ such that
$A = \{ax : x \in B\}$. This partitions the $d$-spaces into orbits, whose sizes divide $n$. Let $C$ be a
set containing exactly one representative from each orbit of size exactly $n$. With a $d$-space
$S \in C$ we associate a sequence over $\{0, 1\}$ of length $n$ by placing a 1 at position $i$ precisely
when $[\alpha^i] \in S$.
17.3.39 Theorem [639] The above code is a
$\left( \frac{q^{m+1} - 1}{q - 1}, \frac{q^{d+1} - 1}{q - 1}, \frac{q^{d} - 1}{q - 1} \right)$
optical orthogonal code. For $d = 1$, the code is optimal and has size
$\frac{q^m - 1}{q^2 - 1}$ for even $m$ and $\frac{q^m - q}{q^2 - 1}$ for odd $m$.
17.3.40 Example Take $q = 2$, $m = 3$, $d = 1$, and let $\alpha$ be the primitive element in $\mathbb{F}_{16}$ that satisfies
$\alpha^4 = \alpha + 1$. Then the 2-dimensional subspaces $\{0, 1, \alpha, \alpha^4\}$, $\{0, 1, \alpha^2, \alpha^8\}$, $\{0, 1, \alpha^5, \alpha^{10}\}$
of $\mathbb{F}_{16}$ map to the lines $\{[1], [\alpha], [\alpha^4]\}$, $\{[1], [\alpha^2], [\alpha^8]\}$, $\{[1], [\alpha^5], [\alpha^{10}]\}$ in PG$(3, 2)$. Their
orbits have size 15, 15, and 5, respectively, and exhaust all 35 lines in PG$(3, 2)$. Hence,
$\{(110010000000000), (101000001000000)\}$ is an optimal $(15, 3, 1)$ optical orthogonal code.
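The correlation constraint of Definition 17.3.30 can be verified by exhaustive search for codes of this size. The sketch below checks the codes listed in Examples 17.3.37 and 17.3.40; it tests only the (n, w, λ) property, not optimality, and the function names are illustrative.

```python
def periodic_crosscorrelation(a, b, u):
    """R_u(A, B) = sum_j a_j * b_{j+u}, with indices reduced modulo n."""
    n = len(a)
    return sum(a[j] * b[(j + u) % n] for j in range(n))

def is_optical_orthogonal_code(code, w, lam):
    """Check weight w and R_u(A, B) <= lam whenever A != B or u != 0."""
    words = [[int(ch) for ch in s] for s in code]
    n = len(words[0])
    for i, a in enumerate(words):
        if len(a) != n or sum(a) != w:
            return False
        for k, b in enumerate(words):
            for u in range(n):
                if i == k and u == 0:
                    continue                      # the trivial case A = B, u = 0 is excluded
                if periodic_crosscorrelation(a, b, u) > lam:
                    return False
    return True

code_7 = ["1100000", "1010000", "1001000"]
code_15 = ["110010000000000", "101000001000000"]
ok_7 = is_optical_orthogonal_code(code_7, 2, 1)
ok_15 = is_optical_orthogonal_code(code_15, 3, 1)
```

A set containing a codeword together with one of its cyclic shifts fails the check, since some crosscorrelation then equals the weight.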
17.3.41 Definition Let $A = (a_j)$ and $B = (b_j)$ be sequences of length $n$. For integer $u$ satisfying
$0 \le u < n$, the Hamming correlation between $A$ and $B$ at shift $u$ is
$H_u(A, B) = \sum_{j=0}^{n-1} h(a_j, b_{j+u})$, where $h(x, y) = 1$ if $x = y$ and $h(x, y) = 0$ otherwise and indices are
reduced modulo $n$.
17.3.43 Definition Let S be the set of all sequences of length n over some fixed alphabet, and
write
17.3.44 Remark There are numerous constructions of optimal families. Two constructions based
on finite fields are given here.
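17.3.45 Construction [1889] Let $k$ and $m$ be integers satisfying $1 \le k \le m$. Let $\alpha$ be primitive
in $\mathbb{F}_{p^m}$ and let $\beta$ be primitive in $\mathbb{F}_{p^k}$. Write $F = \mathbb{F}_p$ and $E = \mathbb{F}_{p^m}$. For $v \in \mathbb{F}_{p^k}$, define
the sequence $X_v = (x_j)$ of length $p^m - 1$ over $\mathbb{F}_{p^k}$ by $x_j = v + \sum_{\ell=0}^{k-1} \mathrm{Tr}_{E/F}(\alpha^{j+\ell})\, \beta^\ell$ for
$0 \le j < p^m - 1$. Call $\{X_v : v \in \mathbb{F}_{p^k}\}$ the Lempel-Greenberger family.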
17.3.52 Remark Rank distance codes were introduced independently in [803, 1149, 2485]. Such
codes find applications in correcting crisscross errors in arrays, for example in memory chip
arrays or magnetic tape recording [2485]. Further applications are given in the following
two subsections. Motivated by these applications, the goal is to find (m, n, d) rank distance
codes that have as many elements as possible.
17.3.53 Theorem [803] The size of an $(m, n, d)$ rank distance code over $\mathbb{F}_q$ is at most $q^{n(m-d+1)}$.
17.3.54 Definition An $(m, n, d)$ rank distance code over $\mathbb{F}_q$ of size $q^{n(m-d+1)}$ is a maximum rank
distance code.
17.3.55 Definition The rank distribution of a rank distance code C is (ai ), where ai is the number
of elements in C of rank i, and its distance distribution is (bi ), where bi = |{(X, Y ) ∈
C × C : rank(X − Y ) = i}|/|C|.
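17.3.56 Theorem [803] The rank distribution $(a_i)$ and the distance distribution $(b_i)$ of an $(m, n, d)$
maximum rank distance code over $\mathbb{F}_q$ satisfy $a_0 = b_0 = 1$ and
\[
a_i = b_i = \begin{bmatrix} m \\ i \end{bmatrix}_q \sum_{j=0}^{i-d} (-1)^j q^{\binom{j}{2}} \begin{bmatrix} i \\ j \end{bmatrix}_q \bigl(q^{n(i-j-d+1)} - 1\bigr) \quad\text{for } i \in \{1, 2, \dots, m\},
\]
where $\begin{bmatrix} m \\ k \end{bmatrix}_q$ is the $q$-binomial coefficient given in Remark 17.3.35.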
17.3.57 Theorem [803] Let $m, n, d$ be positive integers satisfying $d \le m \le n$. Write $F = \mathbb{F}_q$ and
$E = \mathbb{F}_{q^n}$, and let $\alpha$ be primitive in $E$. For $\lambda = (\lambda_0, \lambda_1, \dots, \lambda_{m-d}) \in E^{m-d+1}$, let $X_\lambda = (x_{ij})$
be the $m \times n$ matrix given by $x_{ij} = \mathrm{Tr}_{E/F}\bigl(\alpha^j \sum_{k=0}^{m-d} \lambda_k \alpha^{i q^k}\bigr)$ for $i \in \{1, 2, \dots, m\}$ and
$j \in \{1, 2, \dots, n\}$. Then $\{X_\lambda : \lambda \in E^{m-d+1}\}$ is an $(m, n, d)$ maximum rank distance code
over $\mathbb{F}_q$.
17.3.59 Remark Space-time codes are used in wireless digital communications to transmit a block
of n data symbols over m transmit antennas. The motivation is that, with careful code
design, the transmitter introduces spatial diversity and so can prevent signal loss caused by
shadowing [2782]. The usual goal is to design space-time codes whose diversity equals the
number of transmit antennas m.
17.3.60 Theorem [1963] The size of an $(m, n, d)$ space-time code over an alphabet of size $q$ is at
most $q^{n(m-d+1)}$.
17.3.62 Remark There are different approaches for constructing space-time codes. Algebraic con-
structions based on rank distance codes over finite fields are presented in the following. The
principal difficulty in this approach is to find some “rank-preserving” mapping from a finite
field to a finite subset of C.
17.3.63 Theorem [1964] Let $C$ be an $(m, n, d)$ maximum rank distance code over $\mathbb{F}_2$, where we
interpret the elements of $C$ to be in $\{0, 1\}$, as a subset of $\mathbb{Z}$. Let $h$ and $\ell$ be positive integers.
Write $\theta = e^{2\pi\sqrt{-1}/2^h}$, let $\eta$ be a nonzero element in $2\mathbb{Z}[\theta]$, and let $\kappa$ be a nonzero element
in $\mathbb{C}$. Define
\[
S = \Bigl\{ \kappa \sum_{u=0}^{\ell-1} \eta^u\, \theta^{\sum_{v=0}^{h-1} 2^v X_{u,v}} : X_{u,v} \in C \Bigr\},
\]
17.3.66 Definition Let K be a number field and R the ring of algebraic integers in K. Suppose
that R contains a prime ideal P such that R/P is isomorphic to Fq . Let A ⊂ C be a
set containing exactly one representative from each of the q elements of R/P . Define
φ : Fq → A to be the mapping induced by the isomorphism from Fq to R/P , and extend
φ to act element-wise on matrices over Fq .
17.3.67 Theorem [1740] For each matrix X over Fq , we have rank(φ(X)) ≥ rank(X).
17.3.68 Corollary [1740] Let C be an (m, n, d) maximum rank distance code over Fq . Then φ(C) is
an optimal (m, n, d) space-time code over A.
17.3.69 Remark The set $A$ is not uniquely defined for a given $K$, but from a practical viewpoint it
is advantageous to choose $A$ such that the energy $\sum_{a \in A} |a|^2$ is minimized.
17.3.70 Example Write i = √−1 and take K = Q[i], so that R = Z[i]. Denote the ideal aR by (a)
and let p be a prime satisfying p ≡ 1 (mod 4). Then (p) = (π)(π̄), where π = a + bi for
some a, b ∈ Z satisfying a^2 + b^2 = p, and R/(π) is isomorphic to F_p. Identify F_p with Z/pZ
and define φ : F_p → C by φ(x + pZ) = x − [x/π]π, where [ · ] rounds to the nearest element (in
the Euclidean metric) in Z[i]. Take A to be the image of φ. Then A minimizes the energy.
Specifically for p = 5, we have (5) = (2 + i)(2 − i) and A = {0, 1, −1, i, −i}.
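The mapping of Example 17.3.70 can be sketched in a few lines of Python. This is only an illustration: the function name gaussian_integer_map and the use of floating-point complex numbers for the rounding step are assumptions made here, not part of the original text.

```python
def gaussian_integer_map(p, a, b):
    """Return [phi(0), ..., phi(p-1)] for pi = a + b*i with a^2 + b^2 = p.

    phi(x) = x - [x / pi] * pi, where [.] rounds to the nearest Gaussian integer.
    """
    assert a * a + b * b == p, "pi = a + bi must have norm p"
    pi = complex(a, b)
    images = []
    for x in range(p):
        quotient = x / pi
        # Round real and imaginary parts separately to reach the nearest
        # element of Z[i] in the Euclidean metric.
        nearest = complex(round(quotient.real), round(quotient.imag))
        images.append(x - nearest * pi)
    return images


# The case p = 5, pi = 2 + i from the example.
A = gaussian_integer_map(5, 2, 1)
energy = sum(abs(z) ** 2 for z in A)
```

For p = 5 and π = 2 + i this reproduces the set A = {0, 1, −1, i, −i} stated in the example, with energy 4.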
d_S on P(F_q^m) is defined by d_S(U, V) = dim(U) + dim(V) − 2 dim(U ∩ V).
17.3.73 Remark Random linear network coding is a technique for communicating efficiently over
networks. The transmitting node injects a basis for a k-dimensional subspace of F_q^m into
the network. Each intermediate node relays a randomly chosen linear combination of its
incoming vectors. The receiving node waits until enough linearly independent vectors are
received and then tries to reconstruct the transmitted subspace. If the transmitted subspaces
are taken from an (m, k, 2d) constant-dimension code with d > 1, then the receiver can
reconstruct the transmitted space even if there are (a limited number of) lost or erroneously
inserted vectors. Specifically, a successful reconstruction is always possible if the transmitted
subspace U and the received subspace V satisfy d_S(U, V) < d [1800].
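The channel model of Remark 17.3.73 can be simulated over F_2 with a short Python sketch. Vectors are stored as integer bitmasks; the helper names (rank_gf2, relay, subspace_distance), the two-hop topology, and the sample basis are hypothetical choices made for illustration only.

```python
import random


def rank_gf2(rows):
    """Rank over F_2 of row vectors stored as integer bitmasks."""
    pivots = {}  # leading-bit position -> reduced row with that leading bit
    for r in rows:
        while r:
            top = r.bit_length() - 1
            if top not in pivots:
                pivots[top] = r
                break
            r ^= pivots[top]
    return len(pivots)


def relay(incoming, n_out, rng):
    """One intermediate node: emit n_out random F_2-combinations of its inputs."""
    outgoing = []
    for _ in range(n_out):
        combo = 0
        for v in incoming:
            if rng.getrandbits(1):
                combo ^= v
        outgoing.append(combo)
    return outgoing


def subspace_distance(U, V):
    """d_S(U, V) = dim U + dim V - 2 dim(U ∩ V) = 2 dim(U + V) - dim U - dim V."""
    return 2 * rank_gf2(U + V) - rank_gf2(U) - rank_gf2(V)


rng = random.Random(0)  # fixed seed, used only for reproducibility
basis = [0b100011, 0b010101, 0b001110]  # basis of a 3-dimensional subspace of F_2^6
received = relay(relay(basis, 4, rng), 4, rng)  # two hops, no errors inserted
corrupted = basis + [0b000001]  # full basis plus one erroneously inserted vector
```

Without errors the received vectors always span a subspace of the transmitted one, so d_S equals the number of dimensions lost in transit. In the corrupted case the received space is the transmitted space plus one extra dimension, giving d_S = 1.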
17.3.74 Definition [2666] The mapping Λ : F_q^{k×(m−k)} → P_k(F_q^m) takes X ∈ F_q^{k×(m−k)} to the
rowspace of [I X], where I is the identity matrix of order k.
17.3.75 Theorem [2666] We have d_S(Λ(X), Λ(Y)) = 2 rank(X − Y) for all X, Y ∈ F_q^{k×(m−k)}.
17.3.76 Corollary [2666] Let C be a (k, m − k, d) rank distance code over Fq . Then Λ(C) is an
(m, k, 2d) constant-dimension code over Fq .
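The identity in Theorem 17.3.75 can be checked exhaustively for small parameters. The Python sketch below does so over F_2 for k = 2 and m = 4, representing each row as an integer bitmask; the helper names lift, rank_gf2, and subspace_distance are assumptions introduced here, not notation from the text.

```python
from itertools import product


def rank_gf2(rows):
    """Rank over F_2 of row vectors stored as integer bitmasks."""
    pivots = {}  # leading-bit position -> reduced row with that leading bit
    for r in rows:
        while r:
            top = r.bit_length() - 1
            if top not in pivots:
                pivots[top] = r
                break
            r ^= pivots[top]
    return len(pivots)


def lift(X, k, width):
    """Rows of [I X] as bitmasks; X is a sequence of k integers, each `width` bits wide."""
    return [(1 << (k - 1 - i + width)) | X[i] for i in range(k)]


def subspace_distance(U, V):
    """d_S(U, V) = 2 dim(U + V) - dim U - dim V for generating sets U and V."""
    return 2 * rank_gf2(U + V) - rank_gf2(U) - rank_gf2(V)


k, width = 2, 2  # k x (m - k) matrices over F_2 with m = 4
matrices = list(product(range(1 << width), repeat=k))

# Compare d_S(Λ(X), Λ(Y)) with 2 rank(X - Y) for all 16 x 16 pairs;
# over F_2 the difference X - Y is the row-wise XOR.
identity_holds = all(
    subspace_distance(lift(X, k, width), lift(Y, k, width))
    == 2 * rank_gf2([x ^ y for x, y in zip(X, Y)])
    for X in matrices
    for Y in matrices
)
```

The check passes because stacking [I X] on [I Y] and subtracting the first block of rows from the second leaves [0 Y − X], so dim(U + V) = k + rank(X − Y).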
17.3.77 Remark If C is a (k, m − k, d) maximum rank distance code, the “lifted” code Λ(C) has size
q^{k(m−k)−(d−1) max{k,m−k}}. This code is almost optimal in the sense that every (m, k, 2d)
constant-dimension code over Fq has fewer than four times as many codewords as Λ(C)
[2666]. In some special cases Λ(C) can be augmented to give an optimal code, as shown
below.
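The size formula in Remark 17.3.77 is easy to evaluate numerically. The Python sketch below computes it and, for the special case d = 1, compares it with the total number of k-dimensional subspaces of F_q^m (the Gaussian binomial coefficient), which is the largest possible code size when only distinctness is required. The function names are hypothetical, and the Gaussian binomial is standard material brought in for the comparison rather than something stated in this remark.

```python
def lifted_mrd_size(q, m, k, d):
    """Size q^{k(m-k) - (d-1) max{k, m-k}} of a lifted (k, m-k, d) MRD code."""
    exponent = k * (m - k) - (d - 1) * max(k, m - k)
    # Same exponent written in the usual MRD form max * (min - d + 1).
    assert exponent == max(k, m - k) * (min(k, m - k) - d + 1)
    return q ** exponent


def gaussian_binomial(q, m, k):
    """Number of k-dimensional subspaces of F_q^m."""
    numerator = denominator = 1
    for i in range(k):
        numerator *= q ** (m - i) - 1
        denominator *= q ** (i + 1) - 1
    return numerator // denominator


size_d2 = lifted_mrd_size(2, 4, 2, 2)      # lifted code with subspace distance 4
size_d1 = lifted_mrd_size(2, 4, 2, 1)      # lifted code with subspace distance 2
all_subspaces = gaussian_binomial(2, 4, 2)  # every 2-dimensional subspace of F_2^4
```

For q = 2, m = 4, k = 2 and d = 1 the lifted code has 16 codewords while there are 35 subspaces in total, a ratio below four, consistent with the remark.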
17.3.78 Theorem [1998] Let k and m be positive integers satisfying m = rk for integer r. Let I and
Z be the identity and all-zero matrix over F_q of size k × k, respectively. Let Ω_0 ⊂ P_k(F_q^m)
contain the rowspace of [Z · · · Z I], and for ℓ ∈ {1, 2, . . . , r − 1}, let Ω_ℓ ⊂ P_k(F_q^m) be the set
of rowspaces corresponding to {[Z · · · Z I A] : A ∈ C_ℓ}, where C_ℓ is a (k, ℓk, k) maximum
See Also
References Cited: [635, 639, 783, 803, 967, 1149, 1293, 1523, 1603, 1604, 1605, 1607, 1740,
1800, 1889, 1963, 1964, 1998, 2004, 2037, 2135, 2218, 2300, 2319, 2485, 2527, 2554, 2666,
2782, 2832]
Bibliography
[18] L. M. Adleman and M.-D. Huang, Counting points on curves and Abelian varieties
over finite fields, J. Symbolic Comput. 32 (2001) 171–189. <490, 491>
[19] L. M. Adleman, C. Pomerance, and R. S. Rumely, On distinguishing prime numbers
from composite numbers, Ann. of Math., 2nd Ser. 117 (1983) 173–206. <347,
363>
[20] A. Adolphson and S. Sperber, p-adic estimates for multiplicative character sums,
Preprint, http://arxiv.org/abs/1103.5513. <483, 488>
[21] A. Adolphson and S. Sperber, p-adic estimates for exponential sums and the theorem
of Chevalley-Warning, Ann. Sci. École Norm. Sup., IVe Ser. 20 (1987) 545–556.
<199, 201, 210, 213, 480, 488>
[22] A. Adolphson and S. Sperber, On the degree of the L-function associated with an
exponential sum, Compositio Math. 68 (1988) 125–159. <169, 473, 476, 479>
[23] A. Adolphson and S. Sperber, Exponential sums and Newton polyhedra: cohomology
and estimates, Ann. of Math., 2nd Ser. 130 (1989) 367–406. <165, 169, 196,
201, 210, 213, 476, 479, 482, 488>
[24] A. Adolphson and S. Sperber, p-adic estimates for exponential sums, In p-adic
Analysis, volume 1454 of Lecture Notes in Math., 11–22, Springer, Berlin, 1990.
<213>
[25] A. Adolphson and S. Sperber, On twisted exponential sums, Math. Ann. 290 (1991)
713–726. <482, 488>
[26] A. Adolphson and S. Sperber, Twisted exponential sums and Newton polyhedra, J.
Reine Angew. Math. 443 (1993) 151–177. <482, 488>
[27] A. Adolphson and S. Sperber, On the zeta function of a complete intersection, Ann.
Sci. École Norm. Sup., IVe Ser. 29 (1996) 287–328. <196, 201, 481, 488>
[28] A. Adolphson and S. Sperber, Exponential sums on A^n. III, Manuscripta Math. 102
(2000) 429–446. <163, 169>
[29] A. Adolphson and S. Sperber, On the zeta function of a projective complete intersec-
tion, Illinois J. Math. 52 (2008) 389–417. <481, 488>
[30] A. Adolphson and S. Sperber, Exponential sums nondegenerate relative to a lattice,
Alg. Num. Th. 3 (2009) 881–906. <213>
[31] A. Adolphson and S. Sperber, On unit root formulas for toric exponential sums, Alg.
Num. Th. 6 (2012) 573–585. <482, 488>
[32] V. B. Afanasyev, Complexity of VLSI implementation of finite field arithmetic,
In Proc. II. Intern. Workshop on Algebraic and Combinatorial Coding Theory,
USSR, 6–7, 1990. <814, 823>
[33] S. Agou, Sur l’irréducibilité des polynômes à coefficients dans un corps fini, C. R.
Acad. Sci. Paris, Sér. A-B 272 (1971) A576–A577. <62, 66>
[34] S. Agou, Factorisation sur un corps fini F_{p^n} des polynômes composés f(X^s) lorsque
f(X) est un polynôme irréductible de F_{p^n}[X], L’ Enseignement Math., IIe Ser.
22 (1976) 305–312. <60, 62, 66>
[35] S. Agou, Factorisation sur un corps fini K des polynômes composés f(X^s) lorsque
f(X) est polynôme irréductible de K[X], C. R. Acad. Sci. Paris, Sér. A-B 282
(1976) Ai, A1067–A1068. <60, 66>
[36] S. Agou, Critères d’irréductibilité des polynômes composés à coefficients dans un
corps fini, Acta Arith. 30 (1976/77) 213–223. <60, 63, 66>
[37] S. Agou, Factorisation sur un corps fini F_{p^n} des polynômes composés f(X^{p^r} − aX)
lorsque f(X) est un polynôme irréductible de F_{p^n}(X), J. Number Theory 9
(1977) 229–239. <62, 63, 66, 70>
[38] S. Agou, Irréductibilité des polynômes f(X^{p^r} − aX) sur un corps fini F_{p^s}, J. Reine
Angew. Math. 292 (1977) 191–195. <60, 63, 66>
[39] S. Agou, Irréductibilité des polynômes f(X^{p^{2r}} − aX^{p^r} − bX) sur un corps fini F_{p^s}, J.
Number Theory 10 (1978) 64–69. <60, 63, 66, 70>
[40] S. Agou, Irréductibilité des polynômes f(X^{p^{2r}} − aX^{p^r} − bX) sur un corps fini F_{p^s}, J.
Number Theory 11 (1979) 20. <60, 63, 66, 70>
[41] S. Agou, Irréductibilité des polynômes f(Σ_{i=0}^{m} a_i X^{p^{ri}}) sur un corps fini F_{p^s}, Canad.
Math. Bull. 23 (1980) 207–212. <63, 66, 70>
[42] S. Agou, Sur la factorisation des polynômes f(X^{p^{2r}} − aX^{p^r} − bX) sur un corps fini
F_{p^s}, J. Number Theory 12 (1980) 447–459. <63, 66>
[43] M. Agrawal, N. Kayal, and N. Saxena, PRIMES is in P, Ann. of Math., 2nd Ser. 160
(2004) 781–793. <347, 363, 401, 404>
[44] S. Ahmad, Cycle structure of automorphisms of finite cyclic groups, J. Combin.
Theory 6 (1969) 370–374. <228, 229>
[45] O. Ahmadi, Self-reciprocal irreducible pentanomials over F_2, Des. Codes Cryptogr.
38 (2006) 395–397. <69, 70>
[46] O. Ahmadi, On the distribution of irreducible trinomials over F_3, Finite Fields Appl.
13 (2007) 659–664. <69, 70>
[47] O. Ahmadi, The trace spectra of polynomial bases for F_{2^n}, Appl. Algebra Engrg.
Comm. Comput. 18 (2007) 391–396. <107, 109>
[48] O. Ahmadi, Generalization of a theorem of Carlitz, Finite Fields Appl. 17 (2011)
473–480. <57, 59>
[49] O. Ahmadi and R. Granger, An efficient deterministic test for Kloosterman sum zeros,
2012, to appear in Math. Comp. <154, 161>
[50] O. Ahmadi, F. Luca, A. Ostafe, and I. E. Shparlinski, On stable quadratic polynomials,
Glasg. Math. J. 54 (2012) 359–369. <343, 344>
[51] O. Ahmadi and A. Menezes, On the number of trace-one elements in polynomial bases
for F_{2^n}, Des. Codes Cryptogr. 37 (2005) 493–507. <107, 109>
[52] O. Ahmadi and A. Menezes, Irreducible polynomials of maximum weight, Util. Math.
72 (2007) 111–123. <69, 70, 73>
[53] O. Ahmadi and I. E. Shparlinski, Bilinear character sums and sum-product problems
on elliptic curves, Proc. Edinb. Math. Soc., 2nd Ser. 53 (2010) 1–12. <188, 192>
[54] O. Ahmadi, I. E. Shparlinski, and J. F. Voloch, Multiplicative order of Gauss periods,
Int. J. Number Theory 6 (2010) 877–882. <99>
[55] O. Ahmadi and G. Vega, On the parity of the number of irreducible factors of self-
reciprocal polynomials over finite fields, Finite Fields Appl. 14 (2008) 124–131.
<72, 73>
[56] A. V. Aho, J. E. Hopcroft, and J. D. Ullman, The Design and Analysis of Computer Al-
gorithms, Addison-Wesley Publishing Co., Reading, Mass.-London-Amsterdam,
1975, Second printing, Addison-Wesley Series in Computer Science and Infor-
mation Processing. <359, 363>
[57] W. Aitken, On value sets of polynomials over a finite field, Finite Fields Appl. 4
(1998) 441–449. <235, 236>
[58] W. Aitken, M. D. Fried, and L. M. Holt, Davenport pairs over finite fields, Pacific J.
Math. 216 (2004) 1–38. <240>
[59] M. Ajtai, H. Iwaniec, J. Komlós, J. Pintz, and E. Szemerédi, Construction of a thin
set with small Fourier coefficients, Bull. London Math. Soc. 22 (1990) 583–590.
<184, 185>
[60] A. Akbary, S. Alaric, and Q. Wang, On some classes of permutation polynomials, Int.
J. Number Theory 4 (2008) 121–133. <223, 224, 229>
[61] A. Akbary, D. Ghioca, and Q. Wang, On permutation polynomials of prescribed
shape, Finite Fields Appl. 15 (2009) 195–206. <218, 219, 229>
[62] A. Akbary, D. Ghioca, and Q. Wang, On constructing permutations of finite fields,
Finite Fields Appl. 17 (2011) 51–67. <220, 221, 224, 225, 229>
[63] A. Akbary and Q. Wang, On some permutation polynomials over finite fields, Int. J.
Math. Math. Sci. 16 (2005) 2631–2640. <222, 229>
[64] A. Akbary and Q. Wang, A generalized Lucas sequence and permutation binomials,
Proc. Amer. Math. Soc. 134 (2006) 15–22. <218, 222, 223, 229>
[65] A. Akbary and Q. Wang, On polynomials of the form x^r f(x^{(q−1)/l}), Int. J. Math.
Math. Sci. (2007) Art. ID 23408, 7. <221, 222, 223, 229>
[66] S. Akiyama, On the pure Jacobi sums, Acta Arith. 75 (1996) 97–104. <146, 161>
[67] M.-L. Akkar, N. T. Courtois, R. Duteuil, and L. Goubin, A fast and secure imple-
mentation of Sflash, In Public Key Cryptography—PKC 2003, volume 2567 of
Lecture Notes in Comput. Sci., 267–278, Springer, Berlin, 2002. <773, 783>
[68] E. Aksoy, A. Çeşmelioğlu, W. Meidl, and A. Topuzoğlu, On the Carlitz rank of
permutation polynomials, Finite Fields Appl. 15 (2009) 428–440. <229>
[69] A. A. Albert, Symmetric and alternate matrices in an arbitrary field. I, Trans. Amer.
Math. Soc. 43 (1938) 386–436. <507, 510>
[70] A. A. Albert, Fundamental Concepts of Higher Algebra, University of Chicago Press,
Chicago, IL, 1958. <61, 62, 63, 66>
[71] A. A. Albert, Finite division algebras and finite planes, In Proc. Sympos. Appl.
Math., Vol. 10, 53–70, American Mathematical Society, Providence, RI, 1960.
<275, 278>
[72] A. A. Albert, Generalized twisted fields, Pacific J. Math. 11 (1961) 1–8. <276>
[73] A. A. Albert, Isotopy for generalized twisted fields, An. Acad. Brasil. Ci. 33 (1961)
265–275. <276>
[74] R. Albert and H. G. Othmer, The topology of the regulatory interactions predicts the
expression pattern of the segment polarity genes in drosophila melanogaster, J.
Theoret. Biol. 223 (2003) 1–18. <825, 834>
[75] W. R. Alford, A. Granville, and C. Pomerance, There are infinitely many Carmichael
numbers, Ann. of Math., 2nd Ser. 139 (1994) 703–722. <134, 138>
[76] N. Ali, Stabilité des polynômes, Acta Arith. 119 (2005) 53–63. <342, 344>
[77] B. Allombert, Explicit computation of isomorphisms between finite fields, Finite
Fields Appl. 8 (2002) 332–342. <348, 363>
[78] J.-P. Allouche and J. Shallit, Automatic Sequences: Theory, Applications, General-
izations, Cambridge University Press, Cambridge, 2003. <546>
[79] J.-P. Allouche and D. S. Thakur, Automata and transcendence of the Tate period in
finite characteristic, Proc. Amer. Math. Soc. 127 (1999) 1309–1312. <546>
[80] N. Alon, Eigenvalues and expanders, Combinatorica 6 (1986) 83–96. <646, 649, 658>
[81] N. Alon and F. R. K. Chung, Explicit construction of linear sized tolerant networks,
Discrete Math. 72 (1988) 15–19. <645, 658>
[82] N. Alon, Y. Kohayakawa, C. Mauduit, C. G. Moreira, and V. Rödl, Measures of
pseudorandomness for finite sequences: typical values, Proc. Lond. Math. Soc.,
3rd Ser. 95 (2007) 778–812. <182, 185>
[83] N. Alon and Y. Roichman, Random Cayley graphs and expanders, Random Structures
Algorithms 5 (1994) 271–284. <651, 658>
[84] C. Alonso, J. Gutierrez, and T. Recio, A rational function decomposition algorithm
by near-separated polynomials, J. Symbolic Comput. 19 (1995) 527–544. <300,
302>
[85] H. Aly, R. Marzouk, and W. Meidl, On the calculation of the linear complexity of
periodic sequences, In Finite Fields: Theory and Applications, volume 518 of
Contemp. Math., 11–22, Amer. Math. Soc., Providence, RI, 2010. <329, 336>
[86] H. Aly and W. Meidl, On the linear complexity and k-error linear complexity over
F_p of the d-ary Sidel'nikov sequence, IEEE Trans. Inform. Theory 53 (2007)
4755–4761. <334, 336>
[87] H. Aly and A. Winterhof, On the linear complexity profile of nonlinear congruen-
tial pseudorandom number generators with Dickson polynomials, Des. Codes
Cryptogr. 39 (2006) 155–162. <333, 336>
[88] A. Ambainis and N. Nahimovs, Improved constructions of quantum automata, The-
oret. Comput. Sci. 410 (2009) 1916–1922. <840, 841>
[89] P. R. Amestoy, T. A. Davis, and I. S. Duff, Algorithm 837: AMD, an approximate
minimum degree ordering algorithm, ACM Trans. Math. Software 30 (2004)
381–388. <532, 535>
[90] G. An, In silico experiments of existing and hypothetical cytokine-directed clinical
trials using agent-based modeling, Crit Care Med 32 (2004) 2050–2060. <831,
834>
[91] V. Anashin and A. Khrennikov, Applied Algebraic Dynamics, volume 49 of de Gruyter
Expositions in Mathematics, de Gruyter, Berlin, 2009. <337, 338, 344>
[92] H. E. Andersen and O. Geil, Evaluation codes from order domain theory, Finite
Fields Appl. 14 (2008) 92–123. <705, 712>
[93] B. A. Anderson and K. B. Gross, A partial starter construction, Congress. Numer.
21 (1978) 57–64. <615, 619>
[94] G. W. Anderson, t-motives, Duke Math. J. 53 (1986) 457–502. <545, 546>
[95] G. W. Anderson, Log-algebraicity of twisted A-harmonic series and special values of
L-series in characteristic p, J. Number Theory 60 (1996) 165–209. <541>
[96] G. W. Anderson, W. D. Brownawell, and M. A. Papanikolas, Determination of the
algebraic relations among special Γ-values in positive characteristic, Ann. of
Math., 2nd Ser. 160 (2004) 237–313. <546>
[97] G. W. Anderson and D. S. Thakur, Multizeta values for F_q[t], their period interpreta-
tion, and relations between them, Int. Math. Res. Not. IMRN (2009) 2038–2055.
<544, 546>
[98] I. Anderson, A hundred years of whist tournaments, J. Combin. Math. Combin.
Comput. 19 (1995) 129–150. <618, 619>
[99] I. Anderson, Combinatorial Designs and Tournaments, volume 6 of Oxford Lecture
Series in Mathematics and its Applications, The Clarendon Press, Oxford Uni-
versity Press, New York, 1997. <619>
[100] I. Anderson, Some cyclic and 1-rotational designs, In Surveys in Combinatorics,
volume 288 of London Math. Soc. Lecture Note Ser., 47–73, Cambridge Univ.
Press, Cambridge, 2001. <618, 619>
[101] I. Anderson and N. J. Finizio, Some new z-cyclic whist tournament designs, Discrete
Math. 293 (2005) 19–28. <618, 619>
[102] I. Anderson, N. J. Finizio, and P. A. Leonard, New product theorems for Z-cyclic
[199] I. Baoulina, On the number of solutions to certain diagonal equations over finite fields,
Int. J. Number Theory 6 (2010) 1–14. <208, 213>
[200] B. Barak, G. Kindler, R. Shaltiel, B. Sudakov, and A. Wigderson, Simulating in-
dependence: new constructions of condensers, Ramsey graphs, dispersers, and
extractors, J. ACM 57 (2010) Art. 20, 52. <191, 192>
[201] M. Bardet, J.-C. Faugère, and B. Salvy, On the complexity of Gröbner basis com-
putation of semi-regular overdetermined algebraic equations, In Proceedings of
the International Conference on Polynomial System Solving, 71–74, 2004. <782,
783>
[202] A. Barlotti, Un’estensione del teorema di Segre-Kustaanheimo, Boll. Un. Mat. Ital.
10 (1955) 498–506. <588, 589>
[203] P. S. L. M. Barreto, S. D. Galbraith, C. Ó’hÉigeartaigh, and M. Scott, Efficient pairing
computation on supersingular abelian varieties, Designs, Codes and Cryptography
42 (2007) 239–271. <791, 796>
[204] P. S. L. M. Barreto and J. F. Voloch, Efficient computation of roots in finite fields,
Des. Codes Cryptogr. 39 (2006) 275–280. <360, 363>
[205] S. Barwick and G. Ebert, Unitals in Projective Planes, Springer Monographs in
Mathematics. Springer, New York, 2008. <571, 574>
[206] S. G. Barwick and W.-A. Jackson, Geometric constructions of optimal linear perfect
hash families, Finite Fields Appl. 14 (2008) 1–13. <613, 619>
[207] S. G. Barwick, W.-A. Jackson, and C. T. Quinn, Optimal linear perfect hash families
with small parameters, J. Combin. Des. 12 (2004) 311–324. <613, 619>
[208] L. Batina, S. B. Örs, B. Preneel, and J. Vandewalle, Hardware architectures for public
key cryptography, Integration, the VLSI Journal 34 (2003) 1 – 64. <109>
[209] C. Batut, K. Belabas, D. Bernardi, H. Cohen, and M. Olivier, PARI/GP, version
2.5.0, http://pari.math.u-bordeaux.fr/, as viewed in July, 2012. <346, 363>
[210] A. Bauer, D. Vergnaud, and J.-C. Zapalowicz, Inferring sequences produced by non-
linear pseudorandom number generators using Coppersmith’s methods, In Proc.
15th Intern. Conf. on Practice and Theory in Public-Key Cryptography, PKC
2012, volume 7293 of Lecture Notes in Comput. Sci., 609–626, Springer, Berlin,
2012. <338, 344>
[211] L. D. Baumert, Cyclic Difference Sets, Lecture Notes in Mathematics, Vol. 182.
Springer-Verlag, Berlin, 1971. <31, 600, 602, 607>
[212] E. Bayer-Fluckiger and H. W. Lenstra, Jr., Forms in odd degree extensions and
self-dual normal bases, Amer. J. Math. 112 (1990) 359–373. <114, 116>
[213] J. T. Beard, Jr. and K. I. West, Factorization tables for x^n − 1 over GF(q), Math.
Comp. 28 (1974) 1167–1168. <62, 66>
[214] B. Beckermann and G. Labahn, Fraction-free computation of matrix rational inter-
polants and matrix GCDs, SIAM J. Matrix Anal. Appl. 22 (2000) 114–144. <534,
535>
[215] E. Bedford and K. Kim, Continuous families of rational surface automorphisms with
positive entropy, Math. Ann. 348 (2010) 667–688. <338, 344>
[216] E. Bedford and T. T. Truong, Degree complexity of birational maps related to matrix
inversion, Comm. Math. Phys. 298 (2010) 357–368. <337, 338, 344>
[217] P. Beelen and I. I. Bouw, Asymptotically good towers and differential equations,
Compos. Math. 141 (2005) 1405–1424. <464, 469>
[218] D. Behr, Searchable magic book contents, main site: http://archive.denisbehr.de,
http://archive.denisbehr.de/archive/route/entries.php?url=10,
[238] L. Bernardin, On bivariate Hensel lifting and its parallelization, In ISSAC ’98: Pro-
ceedings of the 1998 International Symposium on Symbolic and Algebraic Com-
putation, 96–100, New York, 1998, ACM. <385, 392>
[239] L. Bernardin and M. B. Monagan, Efficient multivariate factorization over finite
fields, In Applied Algebra, Algebraic Algorithms and Error-Correcting Codes,
volume 1255 of Lecture Notes in Comput. Sci., 15–28, Springer-Verlag, 1997.
<387, 392>
[240] B. C. Berndt, R. J. Evans, and K. S. Williams, Gauss and Jacobi Sums, Canadian
Mathematical Society Series of Monographs and Advanced Texts, John Wiley &
Sons Inc., New York, 1998. <31, 139, 140, 141, 142, 143, 145, 146, 147, 148, 149,
150, 151, 152, 156, 159, 160, 161, 173, 185, 206, 213>
[241] D. J. Bernstein, Multiplication for mathematicians, 2001, preprint available at
http://cr.yp.to/papers.html#m3. <814, 823>
[242] D. J. Bernstein, Pippenger’s exponentiation algorithm, 2002, preprint available at
http://cr.yp.to/papers/pippenger.pdf. <357, 363>
[243] D. J. Bernstein, Batch binary Edwards, In Advances in Cryptology—CRYPTO 2009,
volume 5677 of Lecture Notes in Comput. Sci., 317–336, Springer, Berlin, 2009.
<814, 823>
[244] D. J. Bernstein, P. Birkner, M. Joye, T. Lange, and C. Peters, Twisted Edwards curves,
In Progress in Cryptology—AFRICACRYPT 2008, volume 5023 of Lecture Notes
in Comput. Sci., 389–405, Springer, Berlin, 2008. <441, 446>
[245] D. J. Bernstein, J. Buchmann, and E. Dahmen, editors, Post-Quantum Cryptography,
Springer-Verlag, Berlin, 2009. <31, 749, 750>
[246] D. J. Bernstein and T. Lange, Explicit-formulas database,
http://hyperelliptic.org/EFD/. <443, 446>
[247] D. J. Bernstein and T. Lange, Faster addition and doubling on elliptic curves, In
Advances in Cryptology—ASIACRYPT 2007, volume 4833 of Lecture Notes in
Comput. Sci., 29–50, Springer, Berlin, 2007. <441, 442, 446>
[248] D. J. Bernstein and T. Lange, Type-II optimal polynomial bases, In Arithmetic of
Finite Fields, volume 6087 of Lecture Notes in Comput. Sci., 41–61, Springer,
Berlin, 2010. <127, 128, 822, 823>
[249] D. J. Bernstein and T. Lange, A complete set of addition laws for incomplete Edwards
curves, J. Number Theory 131 (2011) 858–872. <442, 446>
[250] D. J. Bernstein, T. Lange, and C. Peters, Attacking and defending the McEliece
cryptosystem, In Post-Quantum Cryptography, volume 5299 of Lecture Notes in
Comput. Sci., 31–46, Springer, Berlin, 2008. <750>
[251] C. Berrou, A. Glavieux, and P. Thitimajshima, Near Shannon limit error-correcting
coding and decoding: turbo-codes, In Proc. IEEE Int. Conf. on Commun., 1064–
1070, Geneva, Switzerland, 1993. <719, 727>
[252] C. Berrou, S. K. Y. Saouter, C. Douillard, and M. Jézéquel, Designing good permuta-
tions for turbo codes: towards a single model, In Proc. International Conference
on Communications, volume 1, 341–345, Paris, France, 2004. <726, 727>
[253] P. Berthelot, Cohomologie rigide et théorie de Dwork: le cas des sommes exponen-
tielles, Astérisque (1984) 3, 17–49, p-adic cohomology. <169>
[254] P. Berthelot, S. Bloch, and H. Esnault, On Witt vector cohomology for singular
varieties, Compos. Math. 143 (2007) 363–392. <200, 201>
[255] P. Berthelot and A. Ogus, Notes on Crystalline Cohomology, Princeton University
Press, Princeton, NJ, 1978. <481, 488>
[274] J. Bierbrauer, New semifields, PN and APN functions, Des. Codes Cryptogr. 54 (2010)
189–200. <282>
[275] J. Bierbrauer and Y. Edel, Theory of perpendicular arrays, J. Combin. Des. 2 (1994)
375–406. <612, 619>
[276] J. Bierbrauer, Y. Edel, and W. C. Schmid, Coding-theoretic constructions for (t, m, s)-
nets and ordered orthogonal arrays, J. Combin. Des. 10 (2002) 403–418. <622,
625, 630>
[277] J. Bierbrauer and G. M. Kyureghyan, Crooked binomials, Des. Codes Cryptogr. 46
(2008) 269–301. <259, 261>
[278] E. Biham and A. Shamir, Differential cryptanalysis of DES-like cryptosystems, J.
Cryptology 4 (1991) 3–72. <253, 261>
[279] M. Biliotti, V. Jha, and N. L. Johnson, Foundations of Translation Planes, volume 243
of Monographs and Textbooks in Pure and Applied Mathematics, Marcel Dekker
Inc., New York, 2001. <565, 574>
[280] O. Billet and H. Gilbert, Cryptanalysis of Rainbow, In Security and Cryptography
for Networks, volume 4116 of Lecture Notes in Comput. Sci., 336–347, Springer,
2006. <772, 773, 783>
[281] O. Billet, M. J. B. Robshaw, and T. Peyrin, On building hash functions from multi-
variate quadratic equations, In J. Pieprzyk, H. Ghodosi, and E. Dawson, editors,
ACISP, volume 4586 of Lecture Notes in Computer Science, 82–95, Springer,
2007. <783>
[282] Y. Bilu and N. Linial, Lifts, discrepancy and nearly optimal spectral gap, Combina-
torica 26 (2006) 495–519. <646, 656, 657, 658>
[283] G. Bini and F. Flamini, Finite Commutative Rings and their Applications, The
Kluwer International Series in Engineering and Computer Science, 680, Kluwer
Academic Publishers, Boston, MA, 2002. <28, 29, 31>
[284] B. J. Birch, How the number of points of an elliptic curve over a fixed prime field
varies, J. London Math. Soc., 2nd Ser. 43 (1968) 57–60. <430, 440>
[285] B. J. Birch and H. P. F. Swinnerton-Dyer, Note on a problem of Chowla, Acta Arith.
5 (1959) 417–423. <234, 236>
[286] P. Birkner, Efficient divisor class halving on genus two curves, In Thirteenth Inter-
national Workshop on Selected Areas in Cryptography, volume 4356 of Lecture
Notes in Comput. Sci., 317–326, Springer, Berlin, 2007. <797, 803>
[287] P. Birkner and N. Thériault, Faster halvings in genus 2, In Fifteenth International
Workshop on Selected Areas in Cryptography, volume 5381 of Lecture Notes in
Comput. Sci., 1–17, Springer, Berlin, 2009. <797, 803>
[288] P. Birkner and N. Thériault, Efficient halving for genus 3 curves over binary fields,
Advances in Mathematics of Communications 4 (2010) 23–47. <799, 803>
[289] A. Biró, On polynomials over prime fields taking only two values on the multiplicative
group, Finite Fields Appl. 6 (2000) 302–308. <233, 236>
[290] R. R. Bitmead and B. D. O. Anderson, Asymptotically fast solution of Toeplitz and
related systems of linear equations, Linear Algebra Appl. 34 (1980) 103–116.
<533, 535>
[291] R. Blache, First vertices for generic Newton polygons, and p-cyclic coverings of the
projective line, preprint available, http://arxiv.org/abs/0912.2051, 2009.
<484, 487, 488>
[292] R. Blache, Newton polygons for character sums and Poincaré series, Int. J. Number
Theory 7 (2011) 1519–1542. <485, 488>
[293] R. Blache, Valuations of exponential sums and the generic first slope of Artin-Schreier
curves, J. Number Theory 132 (2012) 2336–2352. <480, 484, 488>
[294] R. Blache, J.-P. Cherdieu, and J. Estrada Sarlabous, Some computational aspects of
Jacobians of curves in the family y^3 = γx^5 + δ over F_p, Finite Fields Appl. 13
(2007) 348–365. <807, 811>
[295] R. Blache and É. Férard, Newton stratification for polynomials: the open stratum, J.
Number Theory 123 (2007) 456–472. <484, 488>
[296] R. Blache, É. Férard, and H. J. Zhu, Hodge-Stickelberger polygons for L-functions of
exponential sums of P(x^s), Math. Res. Lett. 15 (2008) 1053–1071. <485, 488>
[297] S. R. Blackburn, A generalisation of the discrete Fourier transform: determining the
minimal polynomial of a periodic sequence, IEEE Trans. Inform. Theory 40
(1994) 1702–1704. <328, 336>
[298] S. R. Blackburn, T. Etzion, and K. G. Paterson, Permutation polynomials, de Bruijn
sequences, and linear complexity, J. Combin. Theory, Ser. A 76 (1996) 55–82.
<328, 329, 336>
[299] S. R. Blackburn, D. Gómez-Pérez, J. Gutierrez, and I. E. Shparlinski, Predicting the
inversive generator, In Cryptography and Coding, volume 2898 of Lecture Notes
in Comput. Sci., 264–275, Springer, Berlin, 2003. <338, 344>
[300] S. R. Blackburn, D. Gómez-Pérez, J. Gutierrez, and I. E. Shparlinski, Predicting
nonlinear pseudorandom number generators, Math. Comp. 74 (2005) 1471–1494.
<338, 344>
[301] S. R. Blackburn, D. Gómez-Pérez, J. Gutierrez, and I. E. Shparlinski, Reconstructing
noisy polynomial evaluation in residue rings, J. Algorithms 61 (2006) 47–59.
<338, 344>
[302] S. R. Blackburn and P. R. Wild, Optimal linear perfect hash families, J. Combin.
Theory, Ser. A 83 (1998) 233–250. <613, 619>
[303] R. E. Blahut, Transform techniques for error control codes, IBM J. Res. Develop. 23
(1979) 299–315. <328, 336>
[304] R. E. Blahut, Theory and Practice of Error Control Codes, Addison-Wesley Publishing
Company Advanced Book Program, Reading, MA, 1983. <31, 661, 663, 680, 688,
690, 692, 693, 703>
[305] I. F. Blake, editor, Algebraic Coding Theory: History and Development, Dowden
Hutchinson & Ross Inc., Stroudsburg, Pa., 1973, Benchmark Papers in Electrical
Engineering and Computer Science. <702, 703>
[306] I. F. Blake, R. Fuji-Hara, R. C. Mullin, and S. A. Vanstone, Computing logarithms in
finite fields of characteristic two, SIAM J. Algebraic Discrete Methods 5 (1984)
276–285. <370, 374>
[307] I. F. Blake, S. Gao, and R. J. Lambert, Construction and distribution problems for
irreducible trinomials over finite fields, In Applications of Finite Fields, volume 59
of Inst. Math. Appl. Conf. Ser. (New Ser.), 19–32, Oxford Univ. Press, New York,
1996. <73, 89, 90>
[308] I. F. Blake, S. Gao, and R. C. Mullin, Normal and self-dual normal bases from
factorization of cx^{q+1} + dx^q − ax − b, SIAM J. Discrete Math. 7 (1994) 499–512.
<118, 124, 128>
[309] I. F. Blake, S. Gao, and R. C. Mullin, Specific irreducible polynomials with linearly
independent roots over finite fields, Linear Algebra Appl. 253 (1997) 227–249.
<113, 116, 131, 138>
[310] I. F. Blake and T. Garefalakis, A transform property of Kloosterman sums, Discrete
[329] A. Blokhuis, R. Pellikaan, and T. Szőnyi, Blocking sets of almost Rédei type, J.
Combin. Theory, Ser. A 78 (1997) 141–150. <560, 563>
[330] A. Blokhuis, L. Storme, and T. Szőnyi, Lacunary polynomials, multiple blocking sets
and Baer subplanes, J. London Math. Soc., 2nd Ser. 60 (1999) 321–332. <562,
563>
[331] C. Blondeau, A. Canteaut, and P. Charpin, Differential properties of power functions,
Int. J. Inf. Coding Theory 1 (2010) 149–170. <261>
[332] A. W. Bluher, Explicit formulas for strong Davenport pairs, Acta Arith. 112 (2004)
397–403. <301, 302>
[333] A. W. Bluher, A Swan-like theorem, Finite Fields Appl. 12 (2006) 128–138. <69,
70>
[334] G. Böckle, An Eichler-Shimura isomorphism over function fields between Drinfeld
modular forms and cohomology classes of crystals, preprint (2002). <545, 546>
[335] G. Böckle, Global L-functions over function fields, Math. Ann. 323 (2002) 737–795.
<541, 546>
[336] A. Bodin, Number of irreducible polynomials in several variables over finite fields,
Amer. Math. Monthly 115 (2008) 653–660. <80, 85>
[337] A. Bodin, Generating series for irreducible polynomials over finite fields, Finite Fields
Appl. 16 (2010) 116–125. <80, 82, 85>
[338] A. Bodin, P. Dèbes, and S. Najib, Indecomposable polynomials and their spectrum,
Acta Arith. 139 (2009) 79–100. <83, 84, 85>
[339] E. Bombieri, On exponential sums in finite fields, Amer. J. Math. 88 (1966) 71–105.
<163, 169, 473, 476, 479>
[340] E. Bombieri, Counting points on curves over finite fields (d’après S. A. Stepanov), In
Séminaire Bourbaki, 25ème année (1972/1973), Exp. No. 430, 234–241. Lecture
Notes in Math., Vol. 383, Springer, Berlin, 1974. <477, 479>
[341] E. Bombieri, On exponential sums in finite fields. II, Invent. Math. 47 (1978) 29–39.
<164, 169, 302, 473, 476, 479>
[342] E. Bombieri and S. Sperber, On the estimation of certain exponential sums, Acta
Arith. 69 (1995) 329–358. <164, 169>
[343] D. Bonchev, S. Thomas, A. Apte, and L. B. Kier, Cellular automata modelling of
biomolecular networks dynamics, SAR and QSAR in Environmental Research
21 (2010) 77–102. <829, 834>
[344] D. Boneh and M. Franklin, Identity-based encryption from the Weil pairing, SIAM
J. Comput. 32 (2003) 586–615. <748, 750>
[345] D. Boneh, E.-J. Goh, and K. Nissim, Evaluating 2-DNF formulas on ciphertexts, In
Theory of cryptography, volume 3378 of Lecture Notes in Comput. Sci., 325–341,
Springer, Berlin, 2005. <793, 796>
[346] D. Boneh, B. Lynn, and H. Shacham, Short signatures from the Weil pairing, J.
Cryptology 17 (2004) 297–319. <748, 750>
[347] D. Boneh and R. Venkatesan, Rounding in lattices and its cryptographic applica-
tions, In Proceedings of the Eighth Annual ACM-SIAM Symposium on Discrete
Algorithms, 675–681, ACM, New York, 1997. <176, 185>
[348] D. Boneh and R. Venkatesan, Breaking RSA may not be equivalent to factoring
(extended abstract), In Advances in Cryptology—EUROCRYPT ’98, volume
1403 of Lecture Notes in Comput. Sci., 59–71, Springer, Berlin, 1998. <176,
185>
[349] T. J. Boothby and R. W. Bradshaw, Bitslicing and the method of four Russians over
larger finite fields, preprint available, http://arxiv.org/abs/0901.1413, 2009.
<522, 535>
[350] H. Borges, Frobenius non-classical curves of type g(y) = f(x), preprint, 2012. <233,
236>
[351] H. Borges and F. Conceição, On the characterization of minimal value set polynomials,
preprint, 2012. <233, 236>
[352] P. Borwein, K.-K. S. Choi, and J. Jedwab, Binary sequences with merit factor greater
than 6.34, IEEE Trans. Inform. Theory 50 (2004) 3234–3249. <323, 324>
[353] J. Bos and M. E. Kaihara, Playstation 3 computing breaks 2^60 barrier: 112-bit
prime ECDLP solved, online announcement, http://lacal.epfl.ch/112bit_prime,
2009. <400, 401>
[354] S. Bosch, U. Güntzer, and R. Remmert, Non-Archimedean Analysis: A Systematic
Approach to Rigid Analytic Geometry, volume 261 of Grundlehren der Mathe-
matischen Wissenschaften [Fundamental Principles of Mathematical Sciences],
Springer-Verlag, Berlin, 1984. <537, 546>
[355] R. C. Bose, On the application of the properties of Galois fields to the construction
of hyper-graeco-latin squares, Sankhyā 3 (1938) 323–338. <551, 554, 556>
[356] R. C. Bose, On the construction of balanced incomplete block designs, Ann. Eugenics
9 (1939) 353–399. <592, 599>
[357] R. C. Bose, On some connections between the design of experiments and information
theory, Bull. Inst. Internat. Statist. 38 (1961) 257–271. <631, 642>
[358] R. C. Bose and R. C. Burton, A characterization of flat spaces in a finite geometry and
the uniqueness of the Hamming and the MacDonald codes, J. Combin. Theory
1 (1966) 96–104. <560, 563>
[359] R. C. Bose and D. K. Ray-Chaudhuri, On a class of error correcting binary group
codes, Information and Control 3 (1960) 68–79. <678, 702, 703>
[360] W. Bosma, J. Cannon, and C. Playoust, The Magma algebra system I: The user
language, J. Symbolic Comput. 24 (1997) 235–265. <387, 392>
[361] W. Bosma, J. Cannon, and A. Steel, Lattices of compatibly embedded finite fields,
J. Symbolic Comput. 24 (1997) 351–369. <402, 404>
[362] W. Bosma and H. W. Lenstra, Jr., Complete systems of two addition laws for elliptic
curves, J. Number Theory 53 (1995) 229–240. <442, 446>
[363] A. Bostan, P. Flajolet, B. Salvy, and É. Schost, Fast computation of special resultants,
J. Symbolic Comput. 41 (2006) 1–29. <378, 379, 380>
[364] A. Bostan, C.-P. Jeannerod, and É. Schost, Solving structured linear systems with
large displacement rank, Theoret. Comput. Sci. 407 (2008) 155–181. <533, 535>
[365] A. Bostan, G. Lecerf, B. Salvy, É. Schost, and B. Wiebelt, Complexity issues in
bivariate polynomial factorization, In ISSAC 2004, 42–49, ACM, New York,
2004. <385, 392>
[366] A. Bostan, F. Morain, B. Salvy, and É. Schost, Fast algorithms for computing isogenies
between elliptic curves, Math. Comp. 77 (2008) 1755–1778. <788, 796>
[367] A. Böttcher and B. Silbermann, Introduction to Large Truncated Toeplitz Matrices,
Universitext. Springer-Verlag, New York, 1999. <507, 510>
[368] J. Bourgain, Estimates on exponential sums related to the Diffie-Hellman distribu-
tions, Geom. Funct. Anal. 15 (2005) 1–34. <183, 184, 185, 189, 192>
[369] J. Bourgain, Mordell’s exponential sum estimate revisited, J. Amer. Math. Soc. 18
870 Handbook of Finite Fields
equations over finite fields, Finite Fields Appl. 12 (2006) 681–692. <210, 213>
[502] W. Cao and Q. Sun, Factorization formulae on counting zeros of diagonal equations
over finite fields, Proc. Amer. Math. Soc. 135 (2007) 1283–1291. <210, 213>
[503] X. Cao, A note on the moments of Kloosterman sums, Appl. Algebra Engrg. Comm.
Comput. 20 (2009) 447–457. <154, 161>
[504] X. Cao and L. Hu, New methods for generating permutation polynomials over finite
fields, Finite Fields Appl. 17 (2011) 493–503. <216, 229>
[505] A. Capelli, Sulla riduttibilità delle equazioni algebriche I, Rend. Acad. Sci. Fis. Mat.
Napoli 3 (1897) 243–252. <60, 61, 66>
[506] A. Capelli, Sulla riduttibilità delle equazioni algebriche II, Rend. Acad. Sci. Fis. Mat.
Napoli 4 (1898) 243–252. <61, 66>
[507] A. Capelli, Sulla riduttibilità della funzione x^n − A in un campo qualunque di
razionalità, Math. Ann. 54 (1901) 602–603. <61, 66>
[508] M. Car, Le problème de Waring pour l’anneau des polynômes sur un corps fini, C.
R. Acad. Sci. Paris, Sér. A-B 273 (1971) A141–A144. <499, 500>
[509] M. Car, Factorisation dans F_q[X], C. R. Acad. Sci. Paris, Sér. I, Math. 294 (1982)
147–150. <368, 374>
[510] M. Car, Théorèmes de densité dans F_q[X], Acta Arith. 48 (1987) 145–165. <369,
371, 374>
[511] M. Car, Waring’s problem in function fields, Proc. London Math. Soc., 3rd Ser. 68
(1994) 1–30. <213>
[512] M. Car, Distribution des polynômes irréductibles dans F_q[T], Acta Arith. 88 (1999)
141–153. <74, 75, 76, 79>
[513] M. Car, New bounds on some parameters in the Waring problem for polynomials over
a finite field, In Finite Fields and Applications, volume 461 of Contemp. Math.,
59–77, Amer. Math. Soc., Providence, 2008. <499, 500>
[514] M. Car and L. Gallardo, Sums of cubes of polynomials, Acta Arith. 112 (2004) 41–50.
<499, 500>
[515] M. Car and L. Gallardo, Waring’s problem for polynomial biquadrates over a finite
field of odd characteristic, Funct. Approx. Comment. Math. 37 (2007) 39–50.
<213, 499, 500>
[516] P. Carbonne and T. Henocq, Décomposition de la jacobienne sur les corps finis, Bull.
Polish Acad. Sci. Math. 42 (1994) 207–215. <239, 240>
[517] J.-P. Cardinal, On a property of Cauchy-like matrices, C. R. Acad. Sci. Paris, Sér.
I, Math. 328 (1999) 1089–1093. <533, 535>
[518] I. Cardinali, O. Polverino, and R. Trombetti, Semifield planes of order q^4 with kernel
F_{q^2} and center F_q, European J. Combin. 27 (2006) 940–961. <276, 278>
[519] C. Carlet, A larger class of cryptographic Boolean functions via a study of the
Maiorana-McFarland construction, In Advances in Cryptology—CRYPTO 2002,
volume 2442 of Lecture Notes in Comput. Sci., 549–564, Springer, Berlin, 2002.
<249, 252>
[520] C. Carlet, On the coset weight divisibility and nonlinearity of resilient and correlation-
immune functions, In Sequences and Their Applications, Discrete Math. Theor.
Comput. Sci. (Lond.), 131–144, Springer, London, 2002. <247, 252>
[521] C. Carlet, On the secondary constructions of resilient and bent functions, In Coding,
Cryptography and Combinatorics, volume 23 of Progr. Comput. Sci. Appl. Logic,
3–28, Birkhäuser, Basel, 2004. <249, 252>
[522] C. Carlet, Recursive lower bounds on the nonlinearity profile of Boolean functions
and their applications, IEEE Trans. Inform. Theory 54 (2008) 1262–1272. <246,
252>
[523] C. Carlet, Boolean Functions for Cryptography and Error Correcting Codes (Chapter
8), In Y. Crama and P. L. Hammer, editors, Boolean Models and Methods in
Mathematics, Computer Science, and Engineering, 257–397, Cambridge Univer-
sity Press, 2010. <180, 185, 243, 244, 245, 247, 248, 249, 250, 252, 262, 268,
273>
[524] C. Carlet, Vectorial Boolean functions for cryptography, In Y. Crama and P. L. Ham-
mer, editors, Boolean Models and Methods in Mathematics, Computer Science,
and Engineering, 398–469, Cambridge University Press, 2010. <253, 254, 261,
273>
[525] C. Carlet, P. Charpin, and V. Zinoviev, Codes, bent functions and permutations
suitable for DES-like cryptosystems, Des. Codes Cryptogr. 15 (1998) 125–156.
<255, 257, 258, 259, 260, 261>
[526] C. Carlet, L. E. Danielsen, M. G. Parker, and P. Solé, Self-dual bent functions, Int.
J. Inf. Coding Theory 1 (2010) 384–399. <263, 273>
[527] C. Carlet and S. Dubuc, On generalized bent and q-ary perfect nonlinear functions,
In Finite Fields and Applications, 81–94, Springer, Berlin, 2001. <271, 273>
[528] C. Carlet and K. Feng, An infinite class of balanced functions with optimal algebraic
immunity, good immunity to fast algebraic attacks and good nonlinearity, In
Advances in Cryptology—ASIACRYPT 2008, volume 5350 of Lecture Notes in
Comput. Sci., 425–440, Springer, Berlin, 2008. <250, 252>
[529] C. Carlet and P. Gaborit, Hyper-bent functions and cyclic codes, J. Combin. Theory,
Ser. A 113 (2006) 466–482. <264, 273>
[530] C. Carlet and P. Guillot, A new representation of Boolean functions, In Applied Al-
gebra, Algebraic Algorithms and Error-Correcting Codes, volume 1719 of Lecture
Notes in Comput. Sci., 94–103, Springer, Berlin, 1999. <243, 252>
[531] C. Carlet, T. Helleseth, A. Kholosha, and S. Mesnager, On the dual of bent functions
with 2^r Niho exponents, In Proceedings of the 2011 IEEE International
Symposium on Information Theory, 657–661, IEEE, 2011. <268, 270, 273>
[532] C. Carlet and S. Mesnager, On Dillon’s class H of bent functions, Niho bent functions
and o-polynomials, J. Combin. Theory, Ser. A 118 (2011) 2392–2410. <268, 269,
270, 273>
[533] C. Carlet and A. Pott, editors, Sequences and Their Applications, volume 6338 of
Lecture Notes in Comput. Sci., Springer, Berlin, 2010. <31>
[534] C. Carlet and P. Sarkar, Spectral domain analysis of correlation immune and resilient
Boolean functions, Finite Fields Appl. 8 (2002) 120–130. <247, 252>
[535] C. Carlet and B. Sunar, editors, Arithmetic of Finite Fields, volume 4547 of Lecture
Notes in Comput. Sci., Springer, Berlin, 2007. <31>
[536] C. Carlet and J. L. Yucas, Piecewise constructions of bent and almost optimal Boolean
functions, Des. Codes Cryptogr. 37 (2005) 449–464. <206>
[537] L. Carlitz, The arithmetic of polynomials in a Galois field, Amer. J. Math. 54 (1932)
39–50. <80, 85, 364, 366, 374>
[538] L. Carlitz, Some applications of a theorem of Chevalley, Duke Math. J. 18 (1951)
811–819. <213>
[539] L. Carlitz, Primitive roots in a finite field, Trans. Amer. Math. Soc. 73 (1952) 373–382.
<137, 138>
[540] L. Carlitz, Some problems involving primitive roots in a finite field, Proc. Nat. Acad.
Sci. U.S.A. 38 (1952) 314–318; errata, 618. <115, 116>
[541] L. Carlitz, A theorem of Dickson on irreducible polynomials, Proc. Amer. Math. Soc.
3 (1952) 693–700. <54, 55, 59, 74, 79>
[542] L. Carlitz, Invariantive theory of equations in a finite field, Trans. Amer. Math. Soc.
75 (1953) 405–427. <230, 232>
[543] L. Carlitz, Permutations in a finite field, Proc. Amer. Math. Soc. 4 (1953) 538. <238,
240>
[544] L. Carlitz, Representations by quadratic forms in a finite field, Duke Math. J. 21
(1954) 123–137. <507, 510>
[545] L. Carlitz, Representations by skew forms in a finite field, Arch. Math. (Basel) 5
(1954) 19–31. <507, 510>
[546] L. Carlitz, Solvability of certain equations in a finite field, Quart. J. Math. Oxford,
2nd Ser. 7 (1956) 3–4. <210, 213>
[547] L. Carlitz, Some theorems on irreducible reciprocal polynomials over a finite field, J.
Reine Angew. Math. 227 (1967) 212–220. <57, 59, 286, 290>
[548] L. Carlitz, Kloosterman sums and finite field extensions, Acta Arith. 16 (1969/1970)
179–193. <156, 161>
[549] L. Carlitz, D. J. Lewis, W. H. Mills, and E. G. Straus, Polynomials over finite fields
with minimal value sets, Mathematika 8 (1961) 121–130. <213, 233, 236>
[550] L. Carlitz and S. Uchiyama, Bounds for exponential sums, Duke Math. J. 24 (1957)
37–41. <321, 324>
[551] L. Carlitz and C. Wells, The number of solutions of a special system of equations in
a finite field, Acta Arith. 12 (1966/1967) 77–84. <218, 229>
[552] R. Carls and D. Lubicz, A p-adic quasi-quadratic time point counting algorithm, Int.
Math. Res. Not. IMRN (2009) 698–735. <491>
[553] P. Cartier, Une nouvelle opération sur les formes différentielles, C. R. Acad. Sci.
Paris 244 (1957) 426–428. <486, 488>
[554] J. Cassaigne, C. Mauduit, and A. Sárközy, On finite pseudorandom binary sequences
VII: The measures of pseudorandomness, Acta Arith. 103 (2002) 97–118. <182,
185>
[555] R. Casse, Projective Geometry: an Introduction, Oxford University Press, Oxford,
2006. <564, 574>
[556] J. W. S. Cassels, Diophantine equations with special reference to elliptic curves, J.
London Math. Soc., 2nd Ser. 41 (1966) 193–291. <422, 440>
[557] J. W. S. Cassels, Lectures on Elliptic Curves, volume 24 of London Mathematical
Society Student Texts, Cambridge University Press, Cambridge, 1991. <31, 422,
440>
[558] G. Castagnoli, S. Bräuer, and M. Herrmann, Optimization of cyclic redundancy-
check codes with 24 and 32 parity bits, IEEE Transactions on Communications
41 (1993) 883–892. <634, 638, 639, 642>
[559] G. Castagnoli, J. Ganz, and P. Graber, Optimum cyclic redundancy-check codes with
16-bit redundancy, IEEE Transactions on Communications 38 (1990) 111–114.
<635, 637, 642>
[560] F. N. Castro and C. J. Moreno, Mixed exponential sums over finite fields, Proc. Amer.
Math. Soc. 128 (2000) 2529–2537. <168, 169>
[561] F. N. Castro, I. Rubio, P. Guan, and R. Figueroa, On systems of linear and diagonal
880 Handbook of Finite Fields
[601] H. Chen, S. Ling, and C. Xing, Asymptotically good quantum codes exceeding the
Ashikhmin-Litsyn-Tsfasman bound, IEEE Trans. Inform. Theory 47 (2001)
2055–2058. <839, 841>
[602] J. Chen and T. Wang, On the Goldbach problem, Acta Math. Sinica 32 (1989)
702–718. <497, 500>
[603] J.-M. Chen and T.-T. Moh, On the Goubin-Courtois attack on TTM, Cryptology
ePrint Archive, 2001, http://eprint.iacr.org/2001/072. <774, 775, 783>
[604] J.-M. Chen and B.-Y. Yang, A more secure and efficacious TTS signature scheme,
In Information Security and Cryptology—ICISC 2003, volume 2971 of Lecture
Notes in Comput. Sci., 320–338, Springer, Berlin, 2004. <772, 775, 783>
[605] J.-M. Chen, B.-Y. Yang, and B.-Y. Peng, Tame transformation signatures with
Topsy-Turvy Hashes, In IWAP’02, 1–8, 2002,
http://dsns.csie.nctu.edu.tw/iwap/proceedings/proceedings/sessionD/7.pdf.
<775, 783>
[606] K. Chen and L. Zhu, Existence of APAV(q, k) with q a prime power ≡ 3 (mod 4) and
k odd > 1, J. Combin. Des. 7 (1999) 57–68. <612, 619>
[607] L. Chen, W. Eberly, E. Kaltofen, B. D. Saunders, W. J. Turner, and G. Villard,
Efficient matrix preconditioners for black box linear algebra, Linear Algebra
Appl. 343/344 (2002) 119–146. <531, 532, 534, 535>
[608] Y. Chen, The Steiner system S(3, 6, 26), J. Geometry 2 (1972) 7–28. <589>
[609] Y. Q. Chen, A construction of difference sets, Des. Codes Cryptogr. 13 (1998) 247–250.
<605, 607>
[610] Q. Cheng, Primality proving via one round in ECPP and one iteration in AKS,
In Advances in Cryptology—CRYPTO 2003, volume 2729 of Lecture Notes in
Comput. Sci., 338–348, Springer, Berlin, 2003. <347, 363>
[611] Q. Cheng, Constructing finite field extensions with large order elements, SIAM J.
Discrete Math. 21 (2007) 726–730 (electronic). <98, 99>
[612] Q. Cheng, S. Gao, and D. Wan, Constructing high order elements through subspace
polynomials, In Proceedings of the Twenty-Third Annual ACM-SIAM Symposium
on Discrete Algorithms, 1457–1463, 2012. <99>
[613] J. H. Cheon, J. Hong, and M. Kim, Accelerating Pollard’s rho algorithm in finite
fields, Journal of Cryptology 25 (2012) 185–242. <397, 401>
[614] R. C. C. Cheung, S. Duquesne, J. Fan, N. Guillermin, I. Verbauwhede, and G. X.
Yao, FPGA implementation of pairings using residue number system and lazy
reduction, In Proceedings of the 2011 Workshop on Cryptographic Hardware and
Embedded Systems, 421–441, 2011. <822, 823>
[615] C. Chevalley, Démonstration d’une hypothèse de M. Artin, Abhand. Math. Sem.
Hamburg 11 (1936) 73–75. <207, 213>
[616] G. Chèze, Des méthodes symboliques-numériques et exactes pour la factorisation ab-
solue des polynômes en deux variables, PhD thesis, Université de Nice-Sophia
Antipolis (France), 2004. <387, 392>
[617] G. Chèze and G. Lecerf, Lifting and recombination techniques for absolute factoriza-
tion, J. Complexity 23 (2007) 380–420. <384, 392>
[618] A. M. Childs, L. J. Schulman, and U. V. Vazirani, Quantum algorithms for hidden
nonlinear structures, In Forty Eighth Annual IEEE Symposium on Foundations
of Computer Science, 395–404, 2007. <840, 841>
[619] A. M. Childs and W. van Dam, Quantum algorithms for algebraic problems, Rev.
Modern Phys. 82 (2010) 1–52. <835, 841>
[620] K. Chinen and T. Hiramatsu, Hyper-Kloosterman sums and their applications to the
coding theory, Appl. Algebra Engrg. Comm. Comput. 12 (2001) 381–390. <154,
161>
[621] A. Chistov, Polynomial time construction of a finite field, In Abstracts of Lectures at
Seventh All-Union Conference in Mathematical Logic, 196, Novosibirsk, USSR,
1984, In Russian. <379, 380>
[622] H. T. Choi and R. Evans, Congruences for sums of powers of Kloosterman sums, Int.
J. Number Theory 3 (2007) 105–117. <157, 161>
[623] B. C. Chong and K. M. Chan, On the existence of normalized room squares, Nanta
Math. 7 (1974) 8–17. <614, 619>
[624] W. S. Chou, Permutation Polynomials on Finite Fields and their Combinatorial
Applications, PhD thesis, Penn. State Univ., University Park, PA, 1990. <228,
230>
[625] W. S. Chou, The period lengths of inversive pseudorandom vector generations, Finite
Fields Appl. 1 (1995) 126–132. <229, 230>
[626] W.-S. Chou, The factorization of Dickson polynomials over finite fields, Finite Fields
Appl. 3 (1997) 84–96. <284, 290>
[627] W.-S. Chou and S. D. Cohen, Primitive elements with zero traces, Finite Fields Appl.
7 (2001) 125–141. <92, 95>
[628] W. S. Chou, J. Gómez-Calderón, and G. L. Mullen, Value sets of Dickson polynomials
over finite fields, J. Number Theory 30 (1988) 334–344. <233, 235, 236>
[629] W.-S. Chou, J. Gómez-Calderón, G. L. Mullen, D. Panario, and D. Thomson, Subfield
value sets of polynomials over finite fields, Funct. Approx. Comment. Math. 48
(2013) 147–165. <236>
[630] W.-S. Chou and G. L. Mullen, A note on value sets of polynomials over finite fields,
preprint, 2012. <234, 236>
[631] S. Chowla and H. J. Ryser, Combinatorial problems, Canadian J. Math. 2 (1950)
93–99. <600, 607>
[632] S. Chowla and H. Zassenhaus, Some conjectures concerning finite fields, Norske Vid.
Selsk. Forh. (Trondheim) 41 (1968) 34–35. <228, 230>
[633] M. Christopoulou, T. Garefalakis, D. Panario, and D. Thomson, The trace of an
optimal normal element and low complexity normal bases, Des. Codes Cryptogr.
49 (2008) 199–215. <119, 125, 128>
[634] M. Christopoulou, T. Garefalakis, D. Panario, and D. Thomson, Gauss periods as
constructions of low complexity normal bases, Des. Codes Cryptogr. 62 (2012)
43–62. <119, 121, 128>
[635] W. Chu and C. J. Colbourn, Optimal frequency-hopping sequences via cyclotomy,
IEEE Trans. Inform. Theory 51 (2005) 1139–1141. <846, 849>
[636] D. V. Chudnovsky and G. V. Chudnovsky, Sequences of numbers generated by addi-
tion in formal groups and new primality and factorization tests, Adv. in Appl.
Math. 7 (1986) 385–434. <441, 446>
[637] F. R. K. Chung, Diameters and eigenvalues, J. Amer. Math. Soc. 2 (1989) 187–196.
<645, 658>
[638] F. R. K. Chung, V. Faber, and T. A. Manteuffel, An upper bound on the diameter of
a graph from eigenvalues associated with its Laplacian, SIAM J. Discrete Math.
7 (1994) 443–457. <645, 658>
[639] F. R. K. Chung, J. A. Salehi, and V. K. Wei, Optical orthogonal codes: design,
analysis, and applications, IEEE Trans. Inform. Theory 35 (1989) 595–604.
884 Handbook of Finite Fields
[659] T. Cochrane and Z. Zheng, A survey on pure and mixed exponential sums modulo
prime powers, In Number Theory for the Millennium I, 273–300, A. K. Peters,
Natick, MA, 2002. <160, 161>
[660] H. Cohen, A Course in Computational Algebraic Number Theory, volume 138 of
Graduate Texts in Mathematics, Springer-Verlag, Berlin, 1993. <346, 347, 359,
362, 363, 404, 796>
[661] H. Cohen, G. Frey, R. Avanzi, C. Doche, T. Lange, K. Nguyen, and F. Vercauteren,
editors, Handbook of Elliptic and Hyperelliptic Curve Cryptography, Discrete
Mathematics and Its Applications. Chapman & Hall/CRC, Boca Raton, FL,
2006. <31, 346, 354, 355, 356, 357, 358, 359, 360, 363, 393, 401, 449, 450, 451,
452, 453, 454, 455, 456, 788, 796, 797, 798, 803>
[662] H. Cohen and H. W. Lenstra, Jr., Primality testing and Jacobi sums, Math. Comp.
42 (1984) 297–330. <347, 363>
[663] S. Cohen and H. Niederreiter, editors, Finite Fields and Applications, volume 233 of
London Mathematical Society Lecture Note Series, Cambridge, 1996. Cambridge
University Press. <31>
[664] S. D. Cohen, The distribution of irreducible polynomials in several indeterminates
over a finite field, Proc. Edinburgh Math. Soc., Ser. II 16 (1968/1969) 1–17. <81,
82, 85>
[665] S. D. Cohen, Further arithmetical functions in finite fields, Proc. Edinburgh Math.
Soc., Ser. II 16 (1968/1969) 349–363. <368, 374>
[666] S. D. Cohen, On irreducible polynomials of certain types in finite fields, Proc. Cam-
bridge Philos. Soc. 66 (1969) 335–344. <57, 58, 59, 60, 66>
[667] S. D. Cohen, The distribution of polynomials over finite fields, Acta Arith. 17 (1970)
255–271. <72, 73, 217, 230, 234, 236, 240>
[668] S. D. Cohen, Some arithmetical functions in finite fields, Glasgow Math. J. 11 (1970)
21–36. <80, 83, 85>
[669] S. D. Cohen, Uniform distribution of polynomials over finite fields, J. London Math.
Soc., 2nd Ser. 6 (1972) 93–102. <77, 79>
[670] S. D. Cohen, The values of a polynomial over a finite field, Glasgow Math. J. 14
(1973) 205–208. <368, 374>
[671] S. D. Cohen, The irreducibility of compositions of linear polynomials over a finite
field, Compos. Math. 47 (1982) 149–152. <60, 63, 66, 70>
[672] S. D. Cohen, The reducibility theorem for linearised polynomials over finite fields,
Bull. Austral. Math. Soc. 40 (1989) 407–412. <66, 70>
[673] S. D. Cohen, Windmill polynomials over fields of characteristic two, Monatsh. Math.
107 (1989) 291–301. <69, 70, 89, 90>
[674] S. D. Cohen, Exceptional polynomials and the reducibility of substitution polynomials,
Enseign. Math., IIe Ser. 36 (1990) 53–65. <237, 240>
[675] S. D. Cohen, Primitive elements and polynomials with arbitrary trace, Discrete Math.
83 (1990) 1–7. <92, 95>
[676] S. D. Cohen, Proof of a conjecture of Chowla and Zassenhaus on permutation poly-
nomials, Canad. Math. Bull. 33 (1990) 230–234. <228, 230>
[677] S. D. Cohen, Permutation polynomials and primitive permutation groups, Arch.
Math. (Basel) 57 (1991) 417–423. <218, 230>
[678] S. D. Cohen, The explicit construction of irreducible polynomials over finite fields,
Des. Codes Cryptogr. 2 (1992) 169–174. <60, 64, 65, 66, 286, 290>
[679] S. D. Cohen, Dickson polynomials of the second kind that are permutations, Canad.
J. Math. 46 (1994) 225–238. <226, 230>
[680] S. D. Cohen, Dickson permutations, In Number-Theoretic and Algebraic Methods in
Computer Science, 29–51, World Sci. Publ., River Edge, NJ, 1995. <226, 230>
[681] S. D. Cohen, Permutation group theory and permutation polynomials, In Algebras
and Combinatorics, 133–146, Springer, Singapore, 1999. <216, 230>
[682] S. D. Cohen, Gauss sums and a sieve for generators of Galois fields, Publ. Math.
Debrecen 56 (2000) 293–312. <89, 90, 92, 94, 95>
[683] S. D. Cohen, Kloosterman sums and primitive elements in Galois fields, Acta Arith.
94 (2000) 173–201. <93, 95>
[684] S. D. Cohen, Primitive polynomials over small fields, In Finite Fields and Applications,
volume 2948 of Lecture Notes in Comput. Sci., 197–214, Springer, Berlin, 2004.
<93, 95>
[685] S. D. Cohen, Explicit theorems on generator polynomials, Finite Fields Appl. 11
(2005) 337–357. <60, 64, 65, 66, 77, 79>
[686] S. D. Cohen, Primitive polynomials with a prescribed coefficient, Finite Fields Appl.
12 (2006) 425–491. <92, 95>
[687] S. D. Cohen, Primitive cubics and quartics with zero trace and prescribed norm,
Finite Fields Appl. 18 (2012) 1156–1168. <92>
[688] S. D. Cohen and M. D. Fried, Lenstra’s proof of the Carlitz-Wan conjecture on
exceptional polynomials: an elementary version, Finite Fields Appl. 1 (1995)
372–375. <218, 230, 238, 239, 240>
[689] S. D. Cohen and M. J. Ganley, Commutative semifields, two-dimensional over their
middle nuclei, J. Algebra 75 (1982) 373–385. <276, 277, 278, 282>
[690] S. D. Cohen and D. Hachenberger, Primitive normal bases with prescribed trace,
Appl. Algebra Engrg. Comm. Comput. 9 (1999) 383–403. <89, 90, 116>
[691] S. D. Cohen and D. Hachenberger, Primitivity, freeness, norm and trace, Discrete
Math. 214 (2000) 135–144. <92, 94, 95>
[692] S. D. Cohen and S. Huczynska, Primitive free quartics with specified norm and trace,
Acta Arith. 109 (2003) 359–385. <89, 90, 92, 94, 95, 116>
[693] S. D. Cohen and S. Huczynska, The primitive normal basis theorem—without a
computer, J. London Math. Soc., 2nd Ser. 67 (2003) 41–56. <93, 95>
[694] S. D. Cohen and S. Huczynska, The strong primitive normal basis theorem, Acta
Arith. 143 (2010) 299–332. <95, 116>
[695] S. D. Cohen and C. King, The three fixed coefficient primitive polynomial theorem,
JP J. Algebra Number Theory Appl. 4 (2004) 79–87. <93, 95>
[696] S. D. Cohen and R. W. Matthews, A class of exceptional polynomials, Trans. Amer.
Math. Soc. 345 (1994) 897–909. <239, 240, 301, 302>
[697] S. D. Cohen and R. W. Matthews, Exceptional polynomials over finite fields, Finite
Fields Appl. 1 (1995) 261–277. <239, 240>
[698] S. D. Cohen and D. Mills, Primitive polynomials with first and second coefficients
prescribed, Finite Fields Appl. 9 (2003) 334–350. <93, 95>
[699] S. D. Cohen, G. L. Mullen, and P. J.-S. Shiue, The difference between permutation
polynomials over finite fields, Proc. Amer. Math. Soc. 123 (1995) 2011–2015.
<228, 230>
[700] S. D. Cohen and M. Prešern, Primitive finite field elements with prescribed trace,
Southeast Asian Bull. Math. 29 (2005) 283–300. <92, 95>
[701] S. D. Cohen and M. Prešern, Primitive polynomials with prescribed second coefficient,
Glasgow Math. J. 48 (2006) 281–307. <92, 95>
[702] S. D. Cohen and M. Prešern, The Hansen-Mullen primitive conjecture: completion
of proof, In Number Theory and Polynomials, volume 352 of London Math. Soc.
Lecture Note Ser., 89–120, Cambridge Univ. Press, Cambridge, 2008. <92, 95>
[703] R. M. Cohn, Difference Algebra, Interscience Publishers John Wiley & Sons, New
York-London-Sydney, 1965. <238, 240>
[704] C. J. Colbourn, Covering arrays from cyclotomy, Des. Codes Cryptogr. 55 (2010)
201–219. <610, 611, 619>
[705] C. J. Colbourn, Covering arrays and hash families, In Information Security and Re-
lated Combinatorics, NATO Peace and Information Security, 99–136, IOS Press,
2011. <610, 619>
[706] C. J. Colbourn and J. H. Dinitz, editors, Handbook of Combinatorial Designs, Discrete
Mathematics and its Applications. Chapman & Hall/CRC, Boca Raton, FL,
second edition, 2007. <31, 317, 324, 550, 551, 554, 555, 556, 563, 572, 573, 574,
596, 599, 600, 607, 616, 617, 619>
[707] C. J. Colbourn and A. C. H. Ling, Linear hash families and forbidden configurations,
Des. Codes Cryptogr. 52 (2009) 25–55. <613, 619>
[708] C. J. Colbourn and A. Rosa, Triple Systems, Oxford Mathematical Monographs. The
Clarendon Press, Oxford University Press, New York, 1999. <590, 594, 599>
[709] G. E. Collins, Computing multiplicative inverses in GF(p), Math. Comp. 23 (1969)
197–200. <359, 363>
[710] G. E. Collins, Lecture notes on arithmetic algorithms, 1980, University of Wisconsin.
<359, 363>
[711] A. Commeine and I. Semaev, An algorithm to solve the discrete logarithm problem
with the number field sieve, In Public Key Cryptography—PKC 2006, volume
3958 of Lecture Notes in Comput. Sci., 174–190, Springer, Berlin, 2006. <399,
401>
[712] “Computer Algebra Group, University of Sydney,” Magma Computational Algebra
System, http://magma.maths.usyd.edu.au/magma, as viewed in July, 2012.
<33, 48, 49>
[713] A. Conflitti, On elements of high order in finite fields, In Cryptography and Compu-
tational Number Theory, volume 20 of Progr. Comput. Sci. Appl. Logic, 11–14,
Birkhäuser, Basel, 2001. <98, 99>
[714] K. Conrad, Jacobi sums and Stickelberger’s congruence, Enseign. Math., IIe Ser. 41
(1995) 141–153. <152, 161>
[715] K. Conrad, On Weil’s proof of the bound for Kloosterman sums, J. Number Theory
97 (2002) 439–446. <154, 155, 161>
[716] S. Contini and I. E. Shparlinski, On Stern’s attack against secret truncated linear
congruential generators, volume 3574 of Lecture Notes in Comput. Sci., 52–60,
Springer, Berlin, 2005. <338, 344>
[717] D. Coppersmith, Fast evaluation of logarithms in fields of characteristic two, IEEE
Trans. Inform. Theory 30 (1984) 587–594. <348, 363, 370, 374, 398, 401>
[718] D. Coppersmith, Solving linear equations over GF(2): block Lanczos algorithm, Linear
Algebra Appl. 192 (1993) 33–60. <400, 401, 534, 535>
[719] D. Coppersmith, Solving homogeneous linear equations over GF(2) via block Wiede-
mann algorithm, Math. Comp. 62 (1994) 333–350. <400, 401, 534, 535>
[720] D. Coppersmith, Rectangular matrix multiplication revisited, J. Complexity 13 (1997)
888 Handbook of Finite Fields
[740] N. T. Courtois, A. Klimov, J. Patarin, and A. Shamir, Efficient algorithms for solv-
ing overdefined systems of multivariate polynomial equations, In Advances in
Cryptology—EUROCRYPT 2000, volume 1807 of Lecture Notes in Comput. Sci.,
392–407, Springer, Berlin, 2000. <780, 782, 783>
[741] N. T. Courtois and W. Meier, Algebraic attacks on stream ciphers with linear feed-
back, In Advances in Cryptology—EUROCRYPT 2003, volume 2656 of Lecture
Notes in Comput. Sci., 345–359, Springer, Berlin, 2003. <248, 252>
[742] N. T. Courtois and J. Patarin, About the XL algorithm over GF(2), In Topics
in Cryptology—CT-RSA 2003, volume 2612 of Lecture Notes in Comput. Sci.,
141–157, Springer, Berlin, 2003. <782, 783>
[743] N. T. Courtois and J. Pieprzyk, Cryptanalysis of block ciphers with overdefined
systems of equations, In Advances in Cryptology—ASIACRYPT 2002, volume
2501 of Lecture Notes in Comput. Sci., 267–287, Springer, Berlin, 2002. <782,
783>
[744] J.-M. Couveignes and T. Henocq, Action of modular correspondences around CM
points, In Algorithmic number theory (Sydney, 2002), volume 2369 of Lecture
Notes in Comput. Sci., 234–243, Springer, Berlin, 2002. <787, 796>
[745] J.-M. Couveignes and J.-G. Kammerer, The geometry of flex tangents to a cubic curve
and its parameterizations, J. Symbolic Comput. 47 (2012) 266–281. <796>
[746] J.-M. Couveignes and R. Lercier, Elliptic periods for finite fields, Finite Fields Appl.
15 (2009) 1–22. <121, 122, 123, 128>
[747] J.-M. Couveignes and R. Lercier, Fast construction of irreducible polynomials over
finite fields, Israel Journal of Mathematics 194 (2013) 77–105. <378, 380>
[748] D. Cox, J. Little, and D. O’Shea, Ideals, Varieties, and Algorithms, Undergraduate
Texts in Mathematics. Springer, New York, third edition, 2007. <783, 826, 834>
[749] D. A. Cox, Galois Theory, Pure and Applied Mathematics (New York). Wiley-
Interscience, John Wiley & Sons, Hoboken, NJ, 2004. <3, 9, 11>
[750] R. Crandall, Method and apparatus for public key exchange in a cryptographic system,
United States Patent 5,159,632, Date: Oct. 27th 1992. <353, 363>
[751] R. Crandall and C. Pomerance, Prime Numbers, Springer, New York, second edition,
2005, A computational perspective. <346, 354, 355, 362, 363, 496, 500>
[752] R. M. Crew, Etale p-covers in characteristic p, Compositio Math. 52 (1984) 31–45.
<487, 488>
[753] H. S. Cronie and S. B. Korada, Lossless source coding with polar codes, In Proc.
IEEE Int. Symp. Information Theory (ISIT), 904–908, 2010. <739>
[754] E. Croot, Sums of the form 1/x_1^k + · · · + 1/x_n^k modulo a prime, Integers 4 (2004) A20,
6. <213>
[755] S. Crozier, J. Lodge, P. Guinand, and A. Hunt, Performance of turbo codes with
relative prime and golden interleaving strategies, In Proc. of the Sixth Inter-
national Mobile Satellite Conference (IMSC ’99), 268–275, Ottawa, Ontario,
Canada, 1999. <726, 727>
[756] C. Culbert and G. L. Ebert, Circle geometry and three-dimensional subregular trans-
lation planes, Innov. Incidence Geom. 1 (2005) 3–18. <568, 574>
[757] T. W. Cusick, Value sets of some polynomials over finite fields GF(2^{2m}), SIAM J.
Comput. 27 (1998) 120–131 (electronic). <235, 236>
[758] T. W. Cusick, Polynomials over base 2 finite fields with evenly distributed values,
Finite Fields Appl. 11 (2005) 278–291. <235, 236>
[759] T. W. Cusick, C. Ding, and A. Renvall, Stream Ciphers and Number Theory, volume 66
[799] P. Deligne and N. Katz, Groupes de Monodromie en Géométrie Algébrique. II, Lecture
Notes in Mathematics, Vol. 340. Springer-Verlag, Berlin, 1973. <165, 167, 169,
486, 488>
[800] P. Delsarte, An algebraic approach to the association schemes of coding theory, Philips
Res. Rep. Suppl. (1973) vi+97. <251, 252>
[801] P. Delsarte, Four fundamental parameters of a code and their combinatorial signif-
icance, Information and Control 23 (1973) 407–438. <631, 642, 663, 664, 665,
673, 691, 703>
[802] P. Delsarte, On subfield subcodes of modified Reed-Solomon codes, IEEE Trans.
Inform. Theory IT-21 (1975) 575–576. <668, 669, 684, 703>
[803] P. Delsarte, Bilinear forms over a finite field, with applications to coding theory, J.
Combin. Theory, Ser. A 25 (1978) 226–241. <846, 849>
[804] P. Delsarte and J.-M. Goethals, Alternating bilinear forms over GF(q), J. Combin.
Theory, Ser. A 19 (1975) 26–50. <702, 703>
[805] P. Delsarte, J.-M. Goethals, and F. J. MacWilliams, On generalized Reed-Muller
codes and their relatives, Information and Control 16 (1970) 403–442. <686,
703>
[806] P. Delsarte and V. I. Levenshtein, Association schemes and coding theory, IEEE
Trans. Inform. Theory 44 (1998) 2477–2504. <691, 703>
[807] P. Dembowski, Finite Geometries, Ergebnisse der Mathematik und ihrer Grenzge-
biete, Band 44. Springer-Verlag, Berlin, 1968. <28, 31, 274, 278, 564, 567, 574,
588, 589, 591, 599>
[808] P. Dembowski and T. G. Ostrom, Planes of order n with collineation groups of order
n^2, Math. Z. 103 (1968) 239–258. <254, 261, 279, 280, 281, 282>
[809] U. Dempwolff, Automorphisms and equivalence of bent functions and of difference sets
in elementary abelian 2-groups, Comm. Algebra 34 (2006) 1077–1131. <273>
[810] U. Dempwolff, Semifield planes of order 81, J. Geom. 89 (2008) 1–16. <275, 278>
[811] U. Dempwolff and M. Röder, On finite projective planes defined by planar monomials,
Innov. Incidence Geom. 4 (2006) 103–108. <279, 280, 282>
[812] J. Denef and F. Loeser, Weights of exponential sums, intersection cohomology, and
Newton polyhedra, Invent. Math. 106 (1991) 275–294. <165, 169, 196, 201, 476,
479>
[813] J. Denef and F. Loeser, Character sums associated to finite Coxeter groups, Trans.
Amer. Math. Soc. 350 (1998) 5047–5066. <146, 161>
[814] J. Denef and F. Loeser, Definable sets, motives and p-adic integrals, J. Amer. Math.
Soc. 14 (2001) 429–469. <302>
[815] J. Denef and F. Vercauteren, An extension of Kedlaya’s algorithm to Artin-Schreier
curves in characteristic 2, In Algorithmic Number Theory, volume 2369 of Lecture
Notes in Comput. Sci., 308–323, Springer, Berlin, 2002. <454, 456>
[816] J. Denef and F. Vercauteren, Counting points on Cab curves using Monsky-Washnitzer
cohomology, Finite Fields Appl. 12 (2006) 78–102. <491>
[817] J. Denef and F. Vercauteren, An extension of Kedlaya’s algorithm to hyperelliptic
curves in characteristic 2, J. Cryptology 19 (2006) 1–25. <454, 456, 491>
[818] J. Dénes and A. D. Keedwell, Latin Squares and Their Applications, Academic Press,
New York, 1974. <555, 556>
[819] J. Dénes and A. D. Keedwell, Latin Squares, volume 46 of Annals of Discrete Math-
ematics, North-Holland Publishing Co., Amsterdam, 1991. <31>
[820] R. H. F. Denniston, Some maximal arcs in finite projective planes, J. Combin. Theory
6 (1969) 317–319. <572, 574>
[821] R. H. F. Denniston, Uniqueness of the inversive plane of order 5, Manuscripta Math. 8
(1973) 11–19. <589>
[822] R. H. F. Denniston, Uniqueness of the inversive plane of order 7, Manuscripta Math.
8 (1973) 21–26. <589>
[823] J.-M. Deshouillers, G. Effinger, H. te Riele, and D. Zinoviev, A complete Vinogradov
3-primes theorem under the Riemann hypothesis, Electron. Res. Announc. Amer.
Math. Soc. 3 (1997) 99–104. <497, 500>
[824] M. Deuring, Galoissche Theorie und Darstellungstheorie, Math. Ann. 107 (1933)
140–144. <110, 116>
[825] M. Deuring, Die Typen der Multiplikatorenringe elliptischer Funktionenkörper, Abh.
Math. Sem. Hansischen Univ. 14 (1941) 197–272. <431, 440>
[826] M. Dewar, L. Moura, D. Panario, B. Stevens, and Q. Wang, Division of trinomials
by pentanomials and orthogonal arrays, Des. Codes Cryptogr. 45 (2007) 1–17.
<640, 641, 642>
[827] M. Dewar and D. Panario, Linear transformation shift registers, IEEE Trans. Inform.
Theory 49 (2003) 2047–2052. <70>
[828] M. Dewar and D. Panario, Mutual irreducibility of certain polynomials, In Finite
Fields and Applications, volume 2948 of Lecture Notes in Comput. Sci., 59–68,
Springer, Berlin, 2004. <70>
[829] J.-F. Dhem, Design of an Efficient Public Key Cryptographic Library for RISC-Based
Smart Cards, PhD thesis, Faculté des sciences appliquées, Laboratoire de
microélectronique, Université catholique de Louvain-la-Neuve, Belgique, 1998,
available at http://users.belgacom.net/dhem/these/index.html. <355, 363>
[830] A. Díaz and E. Kaltofen, FoxBox: a system for manipulating symbolic objects in
black box representation, In ISSAC ’98: Proceedings of the 1998 International
Symposium on Symbolic and Algebraic Computation, 30–37, 1998. <392>
[831] J. W. Di Paola, On minimum blocking coalitions in small projective plane games,
SIAM J. Appl. Math. 17 (1969) 378–392. <560, 563>
[832] P. Diaconis and R. Graham, Products of universal cycles, In E. D. Demaine, M. L.
Demaine, and T. Rodgers, editors, A Lifetime of Puzzles, 35–55, A. K. Peters
Ltd., Wellesley, MA, 2008. <632, 642>
[833] P. Diaconis and R. Graham, Magical Mathematics: The Mathematical Ideas that
Animate Great Magic Tricks, Princeton University Press, 2011. <632, 642>
[834] P. Diaconis and M. Shahshahani, Generating a random permutation with random
transpositions, Z. Wahrsch. Verw. Gebiete 57 (1981) 159–179. <651, 652, 658>
[835] J. Dick, Walsh spaces containing smooth functions and quasi-Monte Carlo rules of
arbitrary high order, SIAM J. Numer. Anal. 46 (2008) 1519–1553. <623, 628,
630>
[836] J. Dick, P. Kritzer, G. Leobacher, and F. Pillichshammer, Constructions of general
polynomial lattice rules based on the weighted star discrepancy, Finite Fields
Appl. 13 (2007) 1045–1070. <624, 630>
[837] J. Dick and H. Niederreiter, On the exact t-value of Niederreiter and Sobol’ sequences,
J. Complexity 24 (2008) 572–581. <628, 630>
[838] J. Dick and H. Niederreiter, Duality for digital sequences, J. Complexity 25 (2009)
406–414. <627, 628, 630>
[839] J. Dick and F. Pillichshammer, Digital Nets and Sequences: Discrepancy Theory and
[859] W. Diffie and M. E. Hellman, New directions in cryptography, IEEE Trans. Inform.
Theory IT-22 (1976) 644–654. <183, 185, 746>
[860] W. Diffie and M. E. Hellman, New directions in cryptography, In Secure Communi-
cations and Asymmetric Cryptosystems, volume 69 of AAAS Sel. Sympos. Ser.,
143–180, Westview, Boulder, CO, 1982. <765, 783>
[861] J. F. Dillon, Elementary Hadamard Difference-Sets, ProQuest LLC, Ann Arbor, MI,
1974, Thesis (Ph.D.)–University of Maryland, College Park. <265, 267, 268, 269,
271, 273>
[862] J. F. Dillon, Multiplicative difference sets via additive characters, Des. Codes Cryp-
togr. 17 (1999) 225–235. <239, 240, 260, 261, 602, 607>
[863] J. F. Dillon, Geometry, codes and difference sets: exceptional connections, In Codes
and Designs, volume 10 of Ohio State Univ. Math. Res. Inst. Publ., 73–85, de
Gruyter, Berlin, 2002. <239, 240, 260, 261>
[864] J. F. Dillon and H. Dobbertin, New cyclic difference sets with Singer parameters,
Finite Fields Appl. 10 (2004) 342–389. <239, 240, 260, 261, 268, 273, 319, 324,
602, 607, 755, 756, 763>
[865] J. F. Dillon and G. McGuire, Near bent functions on a hyperplane, Finite Fields
Appl. 14 (2008) 715–720. <310>
[866] E. Dimitrova, L. D. García-Puente, F. Hinkelmann, A. S. Jarrah, R. Laubenbacher,
B. Stigler, M. Stillman, and P. Vera-Licona, Polynome, Available at
http://polymath.vbi.vt.edu/polynome/, 2010. <832, 834>
[867] E. Dimitrova, L. D. García-Puente, F. Hinkelmann, A. S. Jarrah, R. Laubenbacher,
B. Stigler, M. Stillman, and P. Vera-Licona, Parameter estimation for Boolean
models of biological networks, Theoret. Comput. Sci. 412 (2011) 2816–2826.
<831, 834>
[868] E. S. Dimitrova, A. S. Jarrah, R. Laubenbacher, and B. Stigler, A Gröbner fan method
for biochemical network modeling, In ISSAC 2007, 122–126, ACM, New York,
2007. <831, 834>
[869] C. Ding, T. Helleseth, and H. Niederreiter, editors, Sequences and Their Applications,
Springer Series in Discrete Mathematics and Theoretical Computer Science, Lon-
don, 1999. Springer-Verlag London Ltd. <31, 32>
[870] C. Ding, D. Pei, and A. Salomaa, Chinese Remainder Theorem: Applications in
Computing, Coding, Cryptography, World Scientific Publishing Co. Inc., River
Edge, NJ, 1996. <229, 230>
[871] C. Ding, Z. Wang, and Q. Xiang, Skew Hadamard difference sets from the Ree-Tits
slice symplectic spreads in PG(3, 3^{2h+1}), J. Combin. Theory, Ser. A 114 (2007)
867–887. <229, 230, 281, 282, 604, 607>
[872] C. Ding, Q. Xiang, J. Yuan, and P. Yuan, Explicit classes of permutation polynomials
of F_{3^{3m}}, Sci. China, Ser. A 52 (2009) 639–647. <226, 230>
[873] C. Ding, G. Xiao, and W. Shan, The Stability Theory of Stream Ciphers, volume 561
of Lecture Notes in Comput. Sci., Springer-Verlag, Berlin, 1991. <325, 326, 329,
336>
[874] C. Ding and J. Yuan, A family of skew Hadamard difference sets, J. Combin. Theory,
Ser. A 113 (2006) 1526–1535. <229, 230, 279, 282, 604, 607>
[875] C. S. Ding, H. Niederreiter, and C. P. Xing, Some new codes from algebraic curves,
IEEE Trans. Inform. Theory 46 (2000) 2638–2642. <707, 712>
[876] J. Ding, A new variant of the Matsumoto-Imai cryptosystem through perturbation, In
Public Key Cryptography—PKC 2004, volume 2947 of Lecture Notes in Comput.
[928] W. Duke, On multiple Salié sums, Proc. Amer. Math. Soc. 114 (1992) 623–625. <155,
161>
[929] J.-G. Dumas, Q-adic transform revisited, In Proceedings of the 2008 International
Symposium on Symbolic and Algebraic Computation, 63–69, ACM, New York,
2008. <522, 523, 535>
[930] J.-G. Dumas, L. Fousse, and B. Salvy, Simultaneous modular reduction and Kronecker
substitution for small finite fields, J. Symbolic Comput. 46 (2011) 823–840. <351,
363, 523, 535>
[931] J.-G. Dumas, T. Gautier, M. Giesbrecht, P. Giorgi, B. Hovinen, E. Kaltofen, B. D.
Saunders, W. J. Turner, and G. Villard, LinBox: A generic library for exact
linear algebra, In A. M. Cohen, X.-S. Gao, and N. Takayama, editors, ICMS’2002,
Proceedings of the 2002 International Congress of Mathematical Software, 40–50,
World Scientific Pub., 2002. <521, 535>
[932] J.-G. Dumas, T. Gautier, and C. Pernet, Finite field linear algebra subroutines,
In Proceedings of the 2002 International Symposium on Symbolic and Algebraic
Computation, 63–74, ACM, New York, 2002. <523, 535>
[933] J.-G. Dumas, P. Giorgi, and C. Pernet, Dense linear algebra over word-size prime
fields: the FFLAS and FFPACK packages, ACM Trans. Math. Software 35 (2008)
Art. 19, 35. <351, 363, 523, 524, 535>
[934] J.-G. Dumas, C. Pernet, and Z. Wan, Efficient computation of the characteristic
polynomial, In ISSAC’05, 140–147, ACM, New York, 2005. <529, 535>
[935] J.-G. Dumas and G. Villard, Computing the rank of sparse matrices over finite
fields, In V. G. Ganzha, E. W. Mayr, and E. V. Vorozhtsov, editors, CASC
2002, Proceedings of the Fifth International Workshop on Computer Algebra in
Scientific Computing, 47–62. Technische Universität München, Germany, 2002.
<530, 531, 532, 534, 535>
[936] I. I. Dumer, Concatenated codes and their multilevel generalizations, In Handbook of
Coding Theory, Vol. I, II, 1911–1988, North-Holland, Amsterdam, 1998. <702,
703>
[937] A. Duran, B. Saunders, and Z. Wan, Hybrid algorithms for rank of sparse matrices,
In R. Mathias and H. Woerdeman, editors, SIAM Conference on Applied Linear
Algebra, 2003. <534, 535>
[938] I. Duursma and K.-H. Mak, On lower bounds for the Ihara constants A(2) and A(3),
arXiv:1102.4127v2 [math.NT] (2011). <463, 464, 469>
[939] P. F. Duvall and J. C. Mortick, Decimation of periodic sequences, SIAM J. Appl.
Math. 21 (1971) 367–372. <313, 317>
[940] B. Dwork, On the rationality of the zeta function of an algebraic variety, Amer. J.
Math. 82 (1960) 631–648. <163, 169, 479, 488>
[941] B. Dwork, p-adic cycles, Inst. Hautes Études Sci. Publ. Math. 37 (1969) 27–115.
<479, 488>
[942] B. Dwork, Bessel functions as p-adic functions of the argument, Duke Math. J. 41
(1974) 711–738. <479, 488>
[943] B. M. Dwork, On the zeta function of a hypersurface III, Ann. of Math., 2nd Ser. 83
(1966) 457–519. <302>
[944] W. Eberly, Black box Frobenius decompositions over small fields, In Proceedings
of the 2000 International Symposium on Symbolic and Algebraic Computation,
106–113, ACM, New York, 2000. <529, 530, 535>
[945] W. Eberly, Early termination over small fields, In Proceedings of the 2003 Interna-
[965] T. ElGamal, A public key cryptosystem and a signature scheme based on discrete
logarithms, IEEE Trans. Inform. Theory 31 (1985) 469–472. <746, 750>
[966] M. Elia and M. Leone, On the inherent space complexity of fast parallel multipliers
for GF(2^m), IEEE Trans. Comput. 51 (2002) 346–351. <820, 821, 823>
[967] S. Eliahou, M. Kervaire, and B. Saffari, On Golay polynomial pairs, Adv. in Appl.
Math. 12 (1991) 235–292. <843, 849>
[968] N. D. Elkies, The existence of infinitely many supersingular primes for every elliptic
curve over Q, Invent. Math. 89 (1987) 561–567. <438, 440>
[969] N. D. Elkies, Distribution of supersingular primes, Astérisque 198-200 (1991) 127–132.
<438, 440>
[970] N. D. Elkies, Elliptic and modular curves over finite fields and related computational
issues, In Computational Perspectives on Number Theory, volume 7 of AMS/IP
Stud. Adv. Math., 21–76, Amer. Math. Soc., Providence, RI, 1998. <787, 796>
[971] N. D. Elkies, Explicit modular towers, Proceedings of the Thirty Fifth Allerton
Conference on Communication, Control and Computing (1998) 23–32. <464,
469>
[972] N. D. Elkies, Explicit towers of Drinfeld modular curves, In European Congress of
Mathematics, Vol. II, volume 202 of Progr. Math., 189–198, Birkhäuser, Basel,
2001. <464, 469>
[973] N. D. Elkies, E. W. Howe, A. Kresch, B. Poonen, J. L. Wetherell, and M. E. Zieve,
Curves of every genus with many points. II. Asymptotically good families, Duke
Math. J. 122 (2004) 399–422. <463>
[974] W. Ellison, Waring’s problem, Amer. Math. Monthly 78 (1971) 10–36. <499, 500>
[975] B. Elspas, The theory of autonomous linear sequential networks, In Linear Sequential
Switching Circuits, 21–61, Holden-Day, San Francisco, Calif., 1965. <834>
[976] H. Enderling, M. Chaplain, and P. Hahnfeldt, Quantitative modeling of tumor dy-
namics and radiotherapy, Acta Biotheoretica 58 (2010) 341–353. <831, 834>
[977] A. Enge, Computing discrete logarithms in high-genus hyperelliptic Jacobians in
provably subexponential time, Math. Comp. 71 (2002) 729–742. <455, 456>
[978] A. Enge, The complexity of class polynomial computation via floating point approx-
imations, Math. Comp. 78 (2009) 1089–1107. <787, 796>
[979] A. Enge, Computing modular polynomials in quasi-linear time, Math. Comp. 78
(2009) 1809–1824. <788, 796>
[980] A. Enge and P. Gaudry, A general framework for subexponential discrete logarithm
algorithms, Acta Arith. 102 (2002) 83–103. <455, 456, 808, 811>
[981] B.-G. Englert and Y. Aharonov, The mean king’s problem: prime degrees of freedom,
Phys. Lett. A 284 (2001) 1–5. <835, 841>
[982] S. S. Erdem, T. Yanık, and Ç. K. Koç, Polynomial basis multiplication over GF(2^m),
Acta Appl. Math. 93 (2006) 33–55. <109>
[983] P. Erdős and P. Turán, On some problems of a statistical group-theory I, Z.
Wahrscheinlichkeitstheorie und Verw. Gebiete 4 (1965) 175–186. <373,
374>
[984] P. Erdős and P. Turán, On some problems of a statistical group-theory II, Acta Math.
Acad. Sci. Hungar. 18 (1967) 151–163. <373, 374>
[985] P. Erdős and P. Turán, On some problems of a statistical group-theory III, Acta
Math. Acad. Sci. Hungar. 18 (1967) 309–320. <373, 374>
[986] P. Erdős and P. Turán, On some problems of a statistical group-theory IV, Acta
[1006] R. Evans, Seventh power moments of Kloosterman sums, Israel J. Math. 175 (2010)
349–362. <158, 161>
[1007] R. Evans and J. Greene, Clausen’s theorem and hypergeometric functions over finite
fields, Finite Fields Appl. 15 (2009) 97–109. <146, 161>
[1008] R. Evans and J. Greene, Evaluations of hypergeometric functions over finite fields,
Hiroshima Math. J. 39 (2009) 217–235. <146, 161>
[1009] R. Evans, H. D. L. Hollmann, C. Krattenthaler, and Q. Xiang, Gauss sums, Jacobi
sums, and p-ranks of cyclic difference sets, J. Combin. Theory, Ser. A 87 (1999)
74–119. <152, 161>
[1010] R. J. Evans, Identities for products of Gauss sums over finite fields, Enseign. Math.,
IIe Ser. 27 (1981) 197–209 (1982). <146, 161>
[1011] R. J. Evans, Pure Gauss sums over finite fields, Mathematika 28 (1981) 239–248
(1982). <145, 161>
[1012] R. J. Evans, Period polynomials for generalized cyclotomic periods, Manuscripta
Math. 40 (1982) 217–243. <160, 161>
[1013] R. J. Evans, Character sum analogues of constant term identities for root systems,
Israel J. Math. 46 (1983) 189–196. <146, 161>
[1014] R. J. Evans, The evaluation of Selberg character sums, Enseign. Math., IIe Ser. 37
(1991) 235–248. <146, 161>
[1015] R. J. Evans, Selberg-Jack character sums of dimension 2, J. Number Theory 54 (1995)
1–11. <146, 161>
[1016] R. J. Evans, J. Greene, and H. Niederreiter, Linearized polynomials and permutation
polynomials of finite fields, Michigan Math. J. 39 (1992) 405–413. <228, 230>
[1017] S. A. Evdokimov, Efficient factorization of polynomials over finite fields and the
generalized Riemann hypothesis, Translation of Zapiski Nauchnykh Seminarov
Leningradskogo Otdeleniya Mat. Inst. V.A. Steklova Akad. Nauk SSSR (LOMI),
volume 176, 1989, 104–117. <379, 380>
[1018] S. A. Evdokimov, Faktorizatsiya razreshimogo mnogochlena nad konechnym polem i
Obobshchennaya Gipoteza Rimana, Zapiski Nauchnykh Seminarov Leningradskogo
Otdeleniya Mat. Inst. V.A. Steklova Akad. Nauk SSSR (LOMI) 176 (1989)
104–117, With English abstract, S. A. Evdokimov, Factoring a solvable polyno-
mial over a finite field and the Generalized Riemann Hypothesis. <381, 382>
[1019] S. A. Evdokimov, Factorization of polynomials over finite fields in subexponential time
under GRH, In Algorithmic Number Theory, First International Symposium,
ANTS-I, volume 877 of Lecture Notes in Comput. Sci., 209–219, Springer, Berlin,
1994. <381, 382>
[1020] G. Everest and T. Ward, Heights of Polynomials and Entropy in Algebraic Dynamics,
Universitext. Springer-Verlag London Ltd., London, 1999. <337, 338, 344>
[1021] J.-H. Evertse, Linear equations with unknowns from a multiplicative group whose
solutions lie in a small number of subspaces, Indag. Math. (New Ser.) 15 (2004)
347–355. <302>
[1022] C. Faber and G. van der Geer, Complete subvarieties of moduli spaces and the Prym
map, J. Reine Angew. Math. 573 (2004) 117–137. <486, 488>
[1023] C. C. Faith, Extensions of normal bases and completely basic fields, Trans. Amer.
Math. Soc. 85 (1957) 406–427. <129, 130, 138>
[1024] G. Faltings, Finiteness theorems for abelian varieties over number fields, In Arithmetic
Geometry, 9–27, Springer, New York, 1986. <433, 440>
[1025] H. Fan and Y. Dai, Fast bit-parallel GF(2^n) multiplier for all trinomials, IEEE Trans.
[1063] J. P. Fillmore and M. L. Marx, Linear recursive sequences, SIAM Rev. 10 (1968)
342–353. <312, 317>
[1064] N. J. Fine and I. N. Herstein, The probability that a matrix be nilpotent, Illinois J.
Math. 2 (1958) 499–504. <502, 510>
[1065] FIPS 180-3, Secure hash standard (SHS), Federal Information Processing Standards
Publication 180-3, National Institute of Standards and Technology, 2008. <745,
750>
[1066] FIPS 186-2, Digital signature standard, Federal Information Processing Standards
Publication 186-2, 2000, available at http://csrc.nist.gov. <353, 363>
[1067] FIPS 186-3, Digital signature standard (DSS), Federal Information Processing Stan-
dards Publication 186-3, National Institute of Standards and Technology, 2009.
<746, 750>
[1068] FIPS 46-3, Data encryption standard (DES), Federal Information Processing Stan-
dards Publication 46-3, National Institute of Standards and Technology, 1999.
<744, 750>
[1069] S. Fischer and W. Meier, Algebraic immunity of s-boxes and augmented functions,
In Proceedings of Fast Software Encryption 2007, volume 4593 of Lecture Notes
in Comput. Sci., 366–381, 2007. <248, 252>
[1070] S. D. Fisher, Classroom notes: matrices over a finite field, Amer. Math. Monthly 73
(1966) 639–641. <501, 510>
[1071] R. W. Fitzgerald, A characterization of primitive polynomials over finite fields, Finite
Fields Appl. 9 (2003) 117–121. <88, 90>
[1072] R. W. Fitzgerald, Highly degenerate quadratic forms over finite fields of characteristic
2, Finite Fields Appl. 11 (2005) 165–181. <205, 206>
[1073] R. W. Fitzgerald, Highly degenerate quadratic forms over F_2, Finite Fields Appl. 13
(2007) 778–792. <204, 206>
[1074] R. W. Fitzgerald, Invariants of trace forms over finite fields of characteristic 2, Finite
Fields Appl. 15 (2009) 261–275. <205, 206>
[1075] R. W. Fitzgerald, Trace forms over finite fields of characteristic 2 with prescribed
invariants, Finite Fields Appl. 15 (2009) 69–81. <204, 205, 206>
[1076] R. W. Fitzgerald and J. L. Yucas, Irreducible polynomials over GF(2) with three
prescribed coefficients, Finite Fields Appl. 9 (2003) 286–299. <56, 59, 79>
[1077] R. W. Fitzgerald and J. L. Yucas, Pencils of quadratic forms over finite fields, Discrete
Math. 283 (2004) 71–79. <206>
[1078] R. W. Fitzgerald and J. L. Yucas, Sums of Gauss sums and weights of irreducible
codes, Finite Fields Appl. 11 (2005) 89–110. <141, 161>
[1079] R. W. Fitzgerald and J. L. Yucas, Generalized reciprocals, factors of Dickson poly-
nomials and generalized cyclotomic polynomials over finite fields, Finite Fields
Appl. 13 (2007) 492–515. <285, 286, 287, 290>
[1080] P. Flajolet, X. Gourdon, and D. Panario, The complete analysis of a polynomial
factorization algorithm over finite fields, J. Algorithms 40 (2001) 37–81. <365,
368, 369, 373, 374, 382>
[1081] P. Flajolet and A. Odlyzko, Singularity analysis of generating functions, SIAM J.
Discrete Math. 3 (1990) 216–240. <366, 374>
[1082] P. Flajolet and A. M. Odlyzko, Random mapping statistics, In Advances in
cryptology—EUROCRYPT ’89 (Houthalen, 1989), volume 434 of Lecture Notes
in Comput. Sci., 329–354, Springer, Berlin, 1990. <754, 763>
[1102] Free Software Foundation, GNU Multiple Precision library, version 5.0.5, 2012,
available at http://gmplib.org/. <346, 355, 363>
[1103] G. Frei, The unpublished section eight: on the way to function fields over a finite field,
In The Shaping of Arithmetic After C. F. Gauss’s Disquisitiones Arithmeticae,
159–198, Berlin: Springer, 2007. <6>
[1104] G. Frey, Applications of arithmetical geometry to cryptographic constructions, In
D. Jungnickel and H. Niederreiter, editors, Finite Fields and Applications, 128–
161, Springer-Verlag, Berlin, 2001. <786, 796, 809, 811>
[1105] G. Frey and T. Lange, Varieties over special fields, In Handbook of Elliptic and
Hyperelliptic Curve Cryptography, Discrete Math. Appl., 87–113, Chapman &
Hall/CRC, Boca Raton, FL, 2006. <31, 32, 456>
[1106] G. Frey, M. Müller, and H.-G. Rück, The Tate pairing and the discrete logarithm
applied to elliptic curve cryptosystems, IEEE Trans. Inform. Theory 45 (1999)
1717–1719. <455, 456>
[1107] G. Frey, M. Perret, and H. Stichtenoth, On the different of abelian extensions of global
fields, In Coding Theory and Algebraic Geometry, volume 1518 of Lecture Notes
in Math., 26–32, Springer, Berlin, 1992. <468, 469>
[1108] G. Frey and H.-G. Rück, A remark concerning m-divisibility and the discrete log-
arithm problem in the divisor class group of curves, Math. Comp. 62 (1994)
865–874. <793, 796, 810, 811>
[1109] M. Fried, On a conjecture of Schur, Michigan Math. J. 17 (1970) 41–55. <228, 230,
239, 240, 284, 290, 293, 294, 302>
[1110] M. Fried, The field of definition of function fields and a problem in the reducibility of
polynomials in two variables, Illinois J. Math. 17 (1973) 128–146. <301, 302>
[1111] M. Fried, On a theorem of Ritt and related Diophantine problems, J. Reine Angew.
Math. 264 (1973) 40–55. <295, 302>
[1112] M. Fried, On a theorem of MacCluer, Acta Arith. 25 (1973/74) 121–126. <292, 293,
302>
[1113] M. Fried, On Hilbert’s irreducibility theorem, J. Number Theory 6 (1974) 211–231.
<294, 296, 300, 302>
[1114] M. Fried, Fields of definition of function fields and Hurwitz families—groups as Galois
groups, Comm. Algebra 5 (1977) 17–82. <292, 302>
[1115] M. Fried, Galois groups and complex multiplication, Trans. Amer. Math. Soc. 235
(1978) 141–163. <239, 240, 299, 300, 302>
[1116] M. Fried and R. Lidl, On Dickson polynomials and Rédei functions, In Contributions
to General Algebra, 5, 139–149, Hölder-Pichler-Tempsky, Vienna, 1987. <284,
290>
[1117] M. Fried and G. Sacerdote, Solving Diophantine problems over all residue class fields
of a number field and all finite fields, Ann. of Math., 2nd Ser. 104 (1976) 203–233.
<301, 302>
[1118] M. D. Fried, The place of exceptional covers among all Diophantine relations, Finite
Fields Appl. 11 (2005) 367–433. <292, 293, 294, 296, 297, 298, 299, 300, 301,
302>
[1119] M. D. Fried, Variables separated equations: Strikingly different roles for the branch cy-
cle lemma and the finite simple group classification, Science China Mathematics
55 (2012) 1–69. <293, 297, 301, 302>
[1120] M. D. Fried, R. Guralnick, and J. Saxl, Schur covers and Carlitz’s conjecture, Israel
J. Math. 82 (1993) 157–225. <218, 230, 237, 240, 294, 301, 302>
[1121] M. D. Fried and M. Jarden, Field Arithmetic, volume 11 of Ergebnisse der Math-
ematik und ihrer Grenzgebiete (3) [Results in Mathematics and Related Areas
(3)], Springer-Verlag, Berlin, 1986. <31, 32, 295, 300, 301, 302>
[1122] M. D. Fried and M. Jarden, Field Arithmetic, volume 11 of Ergebnisse der Mathematik
und ihrer Grenzgebiete. 3. Folge. A Series of Modern Surveys in Mathematics
[Results in Mathematics and Related Areas, 3rd Series. A Series of Modern Sur-
veys in Mathematics], Springer-Verlag, Berlin, third edition, 2008, Revised by
Jarden. <31, 32, 238, 240>
[1123] M. D. Fried and R. E. MacRae, On curves with separated variables, Math. Ann. 180
(1969) 220–226. <300, 302>
[1124] M. D. Fried and R. E. MacRae, On the invariance of chains of fields, Illinois J. Math.
13 (1969) 165–171. <294, 302>
[1125] E. Friedman and L. C. Washington, On the distribution of divisor class groups of
curves over a finite field, In Théorie des Nombres, 227–239, de Gruyter, Berlin,
1989. <450, 456>
[1126] J. Friedman, Some geometric aspects of graphs and their eigenfunctions, Duke Math.
J. 69 (1993) 487–525. <646, 647, 656, 658>
[1127] J. Friedman, A Proof of Alon’s Second Eigenvalue Conjecture and Related Problems,
Mem. Amer. Math. Soc. 195 (2008). <657, 658>
[1128] J. Friedman, R. Murty, and J.-P. Tillich, Spectral estimates for abelian Cayley graphs,
J. Combin. Theory, Ser. B 96 (2006) 111–121. <651, 658>
[1129] C. Friesen, A special case of Cohen-Lenstra heuristics in function fields, In Number
Theory, volume 19 of CRM Proc. Lecture Notes, 99–105, Amer. Math. Soc.,
Providence, RI, 1999. <450, 456>
[1130] C. Friesen, Class group frequencies of real quadratic function fields: the degree 4 case,
Math. Comp. 69 (2000) 1213–1228. <450, 456>
[1131] C. Friesen, Bounds for frequencies of class groups of real quadratic genus 1 function
fields, Acta Arith. 96 (2001) 313–331. <450, 456>
[1132] S. Frisch, When are weak permutation polynomials strong?, Finite Fields Appl. 1
(1995) 437–439. <232>
[1133] D. Fu and J. Solinas, IKE and IKEv2 authentication using the elliptic curve digital
signature algorithm (ECDSA), RFC 4754, Internet Engineering Task Force, 2007,
http://www.ietf.org/rfc/rfc4754.txt. <785, 796>
[1134] F.-W. Fu, H. Niederreiter, and F. Özbudak, On the joint linear complexity of linear
recurring multisequences, In Coding and Cryptology, volume 4 of Ser. Coding
Theory Cryptol., 125–142, World Sci. Publ., Hackensack, NJ, 2008. <325, 330,
331, 336>
[1135] F.-W. Fu, H. Niederreiter, and F. Özbudak, Joint linear complexity of arbitrary
multisequences consisting of linear recurring sequences, Finite Fields Appl. 15
(2009) 475–496. <328, 331, 336>
[1136] F.-W. Fu, H. Niederreiter, and F. Özbudak, Joint linear complexity of multisequences
consisting of linear recurring sequences, Cryptogr. Commun. 1 (2009) 3–29. <330,
331, 336>
[1137] F.-W. Fu, H. Niederreiter, and M. Su, The expectation and variance of the joint linear
complexity of random periodic multisequences, J. Complexity 21 (2005) 804–822.
<331, 336>
[1138] L. Fu, Weights of twisted exponential sums, Math. Z. 262 (2009) 449–472. <168,
169>
[1139] L. Fu and C. Liu, Equidistribution of Gauss sums and Kloosterman sums, Math. Z.
249 (2005) 269–281. <140, 161>
[1140] L. Fu and D. Wan, Moment L-functions, partial L-functions and partial exponential
sums, Math. Ann. 328 (2004) 193–228. <169, 198, 201>
[1141] L. Fu and D. Wan, Mirror congruence for rational points on Calabi-Yau varieties,
Asian J. Math. 10 (2006) 1–10. <200, 201>
[1142] C. A. Fuchs, On the quantumness of a Hilbert space, Quantum Inf. Comput. 4 (2004)
467–478. <836, 841>
[1143] C. A. Fuchs and M. Sasaki, Squeezing quantum information through a classical chan-
nel: measuring the “quantumness” of a set of quantum states, Quantum Inf.
Comput. 3 (2003) 377–404. <836, 841>
[1144] R. Fuhrmann, A. Garcia, and F. Torres, On maximal curves, J. Number Theory 67
(1997) 29–51. <461, 463>
[1145] R. Fuhrmann and F. Torres, The genus of curves over finite fields with many rational
points, Manuscripta Math. 89 (1996) 103–106. <461, 463>
[1146] R. Fuji-Hara, K. Momihara, and M. Yamada, Perfect difference systems of sets and
Jacobi sums, Discrete Math. 309 (2009) 3954–3961. <143, 161>
[1147] W. Fulton, Algebraic Curves, Advanced Book Classics. Addison-Wesley Publishing
Company Advanced Book Program, Redwood City, CA, 1989. <406, 421, 422>
[1148] M. Fürer, Fast integer multiplication, In Proceedings of the Thirty-ninth Annual
ACM Symposium on Theory of Computing, San Diego, California, USA, 57–66.
ACM, 2007, Preprint available at
http://www.cse.psu.edu/~furer/Papers/mult.pdf. <380, 382>
[1149] È. M. Gabidulin, Theory of codes with maximum rank distance, Problemy Peredachi
Informatsii 21 (1985) 3–16. <846, 849>
[1150] A. Gács, A remark on blocking sets of almost Rédei type, J. Geom. 60 (1997) 65–73.
<560, 563>
[1151] A. Gács, On a generalization of Rédei’s theorem, Combinatorica 23 (2003) 585–598.
<559, 563>
[1152] A. Gács, L. Lovász, and T. Szőnyi, Directions in AG(2, p^2), Innov. Incidence Geom.
6/7 (2007/08) 189–201. <559, 563>
[1153] A. Gács, P. Sziklai, and T. Szőnyi, Two remarks on blocking sets and nuclei in planes
of prime order, Des. Codes Cryptogr. 10 (1997) 29–39. <560, 563>
[1154] S. D. Galbraith, Supersingular curves in cryptography, In Advances in Cryptology—
ASIACRYPT 2001, volume 2248 of Lecture Notes in Comput. Sci., 495–513,
Springer, Berlin, 2001. <454, 456>
[1155] S. D. Galbraith, M. Harrison, and D. J. Mireles Morales, Efficient hyperelliptic arith-
metic using balanced representation for divisors, In Algorithmic Number Theory,
volume 5011 of Lecture Notes in Comput. Sci., 342–356, Springer, Berlin, 2008.
<451, 452, 456>
[1156] S. D. Galbraith, F. Hess, and N. P. Smart, Extending the GHS Weil descent attack, In
L. Knudsen, editor, Advances in Cryptology—EUROCRYPT 2002, volume 2332
of Lecture Notes in Comput. Sci., 29–44, Springer-Verlag, Berlin, 2002. <786,
796>
[1157] S. D. Galbraith, F. Hess, and F. Vercauteren, Hyperelliptic pairings, In Pairing-
Based Cryptography—Pairing 2007, volume 4575 of Lecture Notes in Comput.
Sci., 108–131, Springer, Berlin, 2007. <803>
[1158] S. D. Galbraith, J. F. McKee, and P. C. Valença, Ordinary abelian varieties having
[1217] J. von zur Gathen, Hensel and Newton methods in valuation rings, Math. Comp. 42
(1984) 637–661. <385, 392>
[1218] J. von zur Gathen, Irreducibility of multivariate polynomials, J. Comput. System Sci.
31 (1985) 225–264. <387, 390, 392>
[1219] J. von zur Gathen, Irreducible polynomials over finite fields, In Proc. Sixth Conf.
Foundations of Software Technology and Theoretical Computer Science, volume
241 of Springer Lecture Notes in Comput. Sci., 252–262, Delhi, India, 1986. <379,
380>
[1220] J. von zur Gathen, Factoring polynomials and primitive elements for special primes,
Theoretical Computer Science 52 (1987) 77–89. <381, 382>
[1221] J. von zur Gathen, Tests for permutation polynomials, SIAM J. Comput. 20 (1991)
591–602. <217, 230>
[1222] J. von zur Gathen, Values of polynomials over finite fields, Bull. Austral. Math. Soc.
43 (1991) 141–146. <235, 236, 238, 240>
[1223] J. von zur Gathen, Irreducible trinomials over finite fields, Math. Comp. 72 (2003)
1987–2000. <69, 70, 348, 363>
[1224] J. von zur Gathen, Counting decomposable multivariate polynomials, Appl. Algebra
Engrg. Comm. Comput. 22 (2011) 165–185. <83, 84, 85>
[1225] J. von zur Gathen and J. Gerhard, Arithmetic and factorization of polynomials over
F_2, Technical Report tr-rsfb-96-018, University of Paderborn, Germany, 1996,
43 pages. <382>
[1226] J. von zur Gathen and J. Gerhard, Polynomial factorization over F_2, Math. Comp.
71 (2002) 1677–1698. <369, 374, 381, 382>
[1227] J. von zur Gathen and J. Gerhard, Modern Computer Algebra, Cambridge University
Press, Cambridge, New York, Melbourne, second edition, 2003. <31, 32, 84, 85,
125, 126, 128, 346, 363, 376, 378, 380, 381, 382, 385, 387, 392>
[1228] J. von zur Gathen, J. L. Imaña, and Ç. K. Koç, editors, Arithmetic of Finite Fields,
volume 5130 of Lecture Notes in Comput. Sci., Springer, Berlin, 2008. <31, 32>
[1229] J. von zur Gathen and E. Kaltofen, Factoring multivariate polynomials over finite
fields, Math. Comp. 45 (1985) 251–261. <387, 392>
[1230] J. von zur Gathen and E. Kaltofen, Factoring sparse multivariate polynomials, J.
Comput. System Sci. 31 (1985) 265–287. <390, 391, 392>
[1231] J. von zur Gathen, M. Karpinski, and I. E. Shparlinski, Counting curves and their
projections, Comput. Complexity 6 (1996/97) 64–99. <489, 491>
[1232] J. von zur Gathen and M. Nöcker, Exponentiation in finite fields: Theory and practice,
In Applied Algebra, Algebraic Algorithms and Error-Correcting Codes, Twelfth
International Symposium AAECC-12, volume 1255 of Lecture Notes in Comput.
Sci., 88–113, Springer, 1997. <356, 363>
[1233] J. von zur Gathen and M. Nöcker, Polynomial and normal bases for finite fields, J.
Cryptology 18 (2005) 337–355. <73, 348, 353, 363>
[1234] J. von zur Gathen and D. Panario, Factoring polynomials over finite fields: a survey,
J. Symbolic Comput. 31 (2001) 3–17. <381, 382>
[1235] J. von zur Gathen, D. Panario, and B. Richmond, Interval partitions and polynomial
factorization, Algorithmica 63 (2012) 363–397. <369, 374, 382>
[1236] J. von zur Gathen and F. Pappalardi, Density estimates related to Gauss periods, In
Cryptography and Computational Number Theory, volume 20 of Progr. Comput.
Sci. Appl. Logic, 33–41, Birkhäuser, Basel, 2001. <120, 128>
[1237] J. von zur Gathen and G. Seroussi, Boolean circuits versus arithmetic circuits, Infor-
mation and Computation 91 (1991) 142–154. <381, 382>
[1238] J. von zur Gathen, M. A. Shokrollahi, and J. Shokrollahi, Efficient multiplication
using type 2 optimal normal bases, In Arithmetic of Finite Fields, volume 4547
of Lecture Notes in Comput. Sci., 55–68, Springer, Berlin, 2007. <127, 128, 821,
822, 823>
[1239] J. von zur Gathen and V. Shoup, Computing Frobenius maps and factoring polyno-
mials, Computational Complexity 2 (1992) 187–224. <369, 374, 376, 380, 381,
382>
[1240] J. von zur Gathen and I. E. Shparlinski, Orders of Gauss periods in finite fields, In
Algorithms and Computations, volume 1004 of Lecture Notes in Comput. Sci.,
208–215, Springer, Berlin, 1995. <125, 128>
[1241] J. von zur Gathen and I. E. Shparlinski, Orders of Gauss periods in finite fields, Appl.
Algebra Engrg. Comm. Comput. 9 (1998) 15–24. <98, 99>
[1242] J. von zur Gathen and I. E. Shparlinski, Constructing elements of large order in finite
fields, In Applied Algebra, Algebraic Algorithms and Error-Correcting Codes,
volume 1719 of Lecture Notes in Comput. Sci., 404–409, Springer, Berlin, 1999.
<99>
[1243] J. von zur Gathen and I. E. Shparlinski, Gauss periods in finite fields, In Finite Fields
and Applications, 162–177, Springer, Berlin, 2001. <98, 99>
[1244] J. von zur Gathen, A. Viola, and K. Ziegler, Counting reducible, powerful, and rel-
atively irreducible multivariate polynomials over finite fields, SIAM J. Discrete
Math., 27 (2013) 855–891. <84, 85>
[1245] P. Gaudry, An algorithm for solving the discrete log problem on hyperelliptic curves,
In Advances in Cryptology—EUROCRYPT 2000, volume 1807 of Lecture Notes
in Comput. Sci., 19–34, Springer, Berlin, 2000. <455, 456, 798, 802, 803>
[1246] P. Gaudry, Fast genus 2 arithmetic based on theta functions, Journal of Mathematical
Cryptology 1 (2007) 243–265. <797, 803>
[1247] P. Gaudry, Index calculus for abelian varieties of small dimension and the elliptic
curve discrete logarithm problem, J. Symbolic Comput. 44 (2009) 1690–1702.
<786, 796, 802, 803>
[1248] P. Gaudry and N. Gürel, Counting points in medium characteristic using Kedlaya’s
algorithm, Experiment. Math. 12 (2003) 395–402. <491, 788>
[1249] P. Gaudry and R. Harley, Counting points on hyperelliptic curves over finite fields,
In Algorithmic Number Theory, volume 1838 of Lecture Notes in Comput. Sci.,
313–332, Springer, Berlin, 2000. <454, 456>
[1250] P. Gaudry, F. Hess, and N. P. Smart, Constructive and destructive facets of Weil
descent on elliptic curves, Journal of Cryptology 15 (2002) 19–46. <786, 796,
809, 810, 811>
[1251] P. Gaudry and F. Morain, Fast algorithms for computing the eigenvalue in the Schoof–
Elkies–Atkin algorithm, In J.-G. Dumas, editor, Proceedings of the 2006 Inter-
national Symposium on Symbolic and Algebraic Computations—ISSAC MMVI,
109–115, ACM, New York, 2006. <788, 796>
[1252] P. Gaudry and É. Schost, Construction of secure random curves of genus 2 over prime
fields, In Advances in Cryptology—EUROCRYPT 2004, volume 3027 of Lecture
Notes in Comput. Sci., 239–256, Springer, Berlin, 2004. <803>
[1253] P. Gaudry and É. Schost, Genus 2 point counting over prime fields, J. Symbolic
Comput. 47 (2012) 368–400. <803>
[1254] P. Gaudry, B. A. Smith, and D. R. Kohel, Counting points on genus 2 curves with real
multiplication, In Advances in Cryptology—ASIACRYPT 2011, volume 7073 of
Lecture Notes in Comput. Sci., 504–519, Springer, Berlin, 2011. <803>
[1255] P. Gaudry and E. Thomé, MPFQ – A finite field library Release 1.0-rc3, 2010, available
at http://mpfq.gforge.inria.fr. <346, 363>
[1256] P. Gaudry, E. Thomé, N. Thériault, and C. Diem, A double large prime variation for
small genus hyperelliptic index calculus, Math. Comp. 76 (2007) 475–492. <455,
456, 798, 799, 803>
[1257] C. F. Gauss, Mathematical Diary. Original manuscript in Latin: Handschriften-
abteilung Niedersächsische Staats- und Universitätsbibliothek Göttingen, Cod.
Ms. Gauss Math. 48 Cim., English commented transl. by J. Gray, A commen-
tary on Gauss’s Mathematical Diary, 1796–1814, with an English translation.
Exposition. Math., 2:97–130, 1984. <6, 11>
[1258] C. F. Gauss, Disquisitiones Arithmeticae, Lipsiae: G. Fleischer, 1801, English transl.
A. A. Clarke. New Haven: Yale University Press, 1966. <4, 11>
[1259] C. F. Gauss, Werke, ed. Königliche Gesellschaft der Wissenschaften zu Göttingen,
vol. II, Höhere Arithmetik., Göttingen: Universitäts-Druckerei, 1863. <4, 5, 11>
[1260] D. Gavinsky, M. Rötteler, and J. Roland, Quantum algorithm for the Boolean hidden
shift problem, In Proceedings of the Seventeenth Annual International Computing
and Combinatorics Conference (COCOON’11), volume 6842 of Lecture Notes in
Comput. Sci., 158–167, Springer, 2011. <840, 841>
[1261] D. Gay and W. Vélez, On the degree of the splitting field of an irreducible binomial,
Pacific J. Math. 78 (1978) 117–120. <62, 66>
[1262] G. Ge and L. Zhu, Authentication perpendicular arrays APA_1(2, 5, v), J. Combin.
Des. 4 (1996) 365–375. <612, 619>
[1263] W. Geiselmann, Algebraische Algorithmenentwicklung am Beispiel der Arithmetik in
endlichen Körpern, PhD thesis, Universität Karlsruhe, 1992. <39, 42, 49, 123,
128>
[1264] W. Geiselmann and D. Gollmann, Duality and normal basis multiplication, In Cryp-
tography and Coding, III, volume 45 of Inst. Math. Appl. Conf. Ser. (New Ser.),
187–195, Oxford Univ. Press, New York, 1993. <39, 49>
[1265] W. Geiselmann and D. Gollmann, Self-dual bases in F_{q^n}, Des. Codes Cryptogr. 3
(1993) 333–345. <103, 109>
[1266] W. Geiselmann, W. Meier, and R. Steinwandt, An attack on the isomorphisms of
polynomials problem with one secret, Int. Journal of Information Security 2
(2003) 59–64. <767, 783>
[1267] E.-U. Gekeler, On the coefficients of Drinfeld modular forms, Invent. Math. 93 (1988)
667–700. <545, 546>
[1268] M. Genma, M. Mishima, and M. Jimbo, Cyclic resolvability of cyclic Steiner 2-designs,
J. Combin. Des. 5 (1997) 177–187. <594, 599>
[1269] S. R. Ghorpade, S. U. Hasan, and M. Kumari, Primitive polynomials, Singer cycles
and word-oriented linear feedback shift registers, Des. Codes Cryptogr. 58 (2011)
123–134. <502, 510>
[1270] S. R. Ghorpade and G. Lachaud, Étale cohomology, Lefschetz theorems and number
of points of singular varieties over finite fields, Mosc. Math. J. 2 (2002) 589–631.
<195, 201>
[1271] P. Gianni and B. Trager, Square-free algorithms in positive characteristic, Appl. Alg.
Eng. Comm. Comp. 7 (1996) 1–14. <384, 392>
[1293] M. Golay, Static multislit spectrometry and its application to the panoramic display
of infrared spectra, J. Opt. Soc. Amer. 41 (1951) 468–472. <843, 849>
[1294] R. Gold, Characteristic linear sequences and their coset functions, SIAM J. Appl.
Math. 14 (1966) 980–985. <313, 317>
[1295] R. Gold, Maximal recursive sequences with 3-valued recursive crosscorrelation func-
tions, IEEE Trans. Inform. Theory 14 (1968) 154–156. <227, 230, 268, 273, 320,
324>
[1296] D. M. Goldschmidt, Algebraic Functions and Projective Curves, volume 215 of Grad-
uate Texts in Mathematics, Springer-Verlag, New York, 2003. <406, 422>
[1297] C. Goldstein, N. Schappacher, and J. Schwermer, editors, The Shaping of Arithmetic
After C. F. Gauss’s Disquisitiones Arithmeticae, Springer, Berlin, 2007. <5,
11>
[1298] D. Gollmann, Design of Algorithms in Cryptography. (Algorithmenentwurf in
der Kryptographie.), Aspekte Komplexer Systeme 1, B.I. Wissenschaftsverlag,
Mannheim, 1994, viii+158 pages. <103, 105, 109>
[1299] F. Göloğlu, G. McGuire, and R. Moloney, Binary Kloosterman sums using Stickel-
berger’s theorem and the Gross-Koblitz formula, Acta Arith. 148 (2011) 269–279.
<154, 161>
[1300] S. W. Golomb, Shift Register Sequences, With portions co-authored by Lloyd R.
Welch, Richard M. Goldstein, and Alfred W. Hales. Holden-Day Inc., San Fran-
cisco, Calif., 1967. <70, 312, 317>
[1301] S. W. Golomb, Algebraic constructions for Costas arrays, J. Combin. Theory, Ser. A
37 (1984) 13–21. <608, 619>
[1302] S. W. Golomb, Periodic binary sequences: solved and unsolved problems, In Sequences,
Subsequences, and Consequences, volume 4893 of Lecture Notes in Comput. Sci.,
1–8, Springer, Berlin, 2007. <96, 97>
[1303] S. W. Golomb and G. Gong, Signal Design for Good Correlation: For Wireless Com-
munication, Cryptography, and Radar, Cambridge University Press, Cambridge,
2005. <31, 32, 172, 185, 243, 252, 317, 324, 603, 607, 756, 763>
[1304] S. W. Golomb and G. Gong, The status of Costas arrays, IEEE Trans. Inform.
Theory 53 (2007) 4260–4265. <608, 619>
[1305] S. W. Golomb and O. Moreno, On periodicity properties of Costas arrays and a
conjecture on permutation polynomials, IEEE Trans. Inform. Theory 42 (1996)
2252–2253. <229, 230>
[1306] S. W. Golomb, M. G. Parker, A. Pott, and A. Winterhof, editors, Sequences and
Their Applications, volume 5203 of Lecture Notes in Comput. Sci., Springer,
Berlin, 2008. <31, 32>
[1307] J. Gómez-Calderón, On the cardinality of value set of polynomials with coefficients
in a finite field, Proc. Japan Acad., Ser. A Math. Sci. 68 (1992) 338–340. <235,
236>
[1308] J. Gómez-Calderón and D. J. Madden, Polynomials with small value set over finite
fields, J. Number Theory 28 (1988) 167–188. <233, 236>
[1309] D. Gómez-Pérez, J. Gutierrez, and Á. Ibeas, Attacking the Pollard generator, IEEE
Trans. Inform. Theory 52 (2006) 5518–5523. <338, 344>
[1310] D. Gómez-Pérez and A. P. Nicolás, An estimate on the number of stable quadratic
polynomials, Finite Fields Appl. 16 (2010) 401–405. <178, 185, 342, 343, 344>
[1311] D. Gómez-Pérez, A. P. Nicolás, A. Ostafe, and D. Sadornil, Stable polynomials over
[1330] M. Goresky and A. Klapper, Arithmetic crosscorrelations of feedback with carry shift
register sequences, IEEE Trans. Inform. Theory 43 (1997) 1342–1345. <336>
[1331] M. Goresky and A. M. Klapper, Fibonacci and Galois representations of feedback-
with-carry shift registers, IEEE Trans. Inform. Theory 48 (2002) 2826–2836.
<336>
[1332] D. Goss, π-adic Eisenstein series for function fields, Compositio Math. 41 (1980) 3–38.
<545, 546>
[1333] D. Goss, Basic Structures of Function Field Arithmetic, volume 35 of Ergebnisse der
Mathematik und ihrer Grenzgebiete, Springer-Verlag, Berlin, 1996. <31, 32, 535,
536, 538, 541, 542, 543, 546>
[1334] D. Goss, Applications of non-Archimedean integration to the L-series of τ-sheaves,
J. Number Theory 110 (2005) 83–113. <542, 546>
[1335] D. Goss, ζ-phenomenology, In Noncommutative Geometry, Arithmetic, and Related
Topics: Proceedings of the Twenty-First Meeting of the Japan-U.S. Mathematics
Institute, The Johns Hopkins University Press, Baltimore, MD, 2011. <543, 546>
[1336] K. Goto and R. van de Geijn, High-performance implementation of the level-3 BLAS,
ACM Trans. Math. Software 35 (2009) Art. 4, 14. <523, 535>
[1337] D. Gottesman, Class of quantum error-correcting codes saturating the quantum Ham-
ming bound, Phys. Rev. A, 3rd Ser. 54 (1996) 1862–1868. <835, 838, 841>
[1338] D. Gottesman, Fault-tolerant quantum computation with higher-dimensional systems,
In Quantum Computing & Quantum Communications; First NASA International
Conference (QCQC’98), volume 1509 of Lecture Notes in Comput. Sci., 302–313,
Springer, 1998. <838, 841>
[1339] L. Goubin and N. T. Courtois, Cryptanalysis of the TTM cryptosystem, In Advances
in Cryptology—ASIACRYPT 2000, volume 1976 of Lecture Notes in Comput.
Sci., 44–57, Springer, Berlin, 2000. <769, 773, 774, 779, 783>
[1340] A. Gouget and J. Patarin, Probabilistic multivariate cryptography, In VIETCRYPT,
volume 4341 of Lecture Notes in Comput. Sci., 1–18, Springer, Berlin, 2006.
<770, 783>
[1341] X. Gourdon, Combinatoire, Algorithmique et Géométrie des Polynômes, PhD disser-
tation, École Polytechnique, 1996. <369, 374>
[1342] X. Gourdon, Largest component in random combinatorial structures, Discrete Math.
180 (1998) 185–209. <374>
[1343] P. Goutet, An explicit factorisation of the zeta functions of Dwork hypersurfaces,
Acta Arith. 144 (2010) 241–261. <141, 161>
[1344] P. Goutet, On the zeta function of a family of quintics, J. Number Theory 130 (2010)
478–492. <141, 161>
[1345] P. Goutet, Isotypic decomposition of the cohomology and factorization of the zeta
functions of Dwork hypersurfaces, Finite Fields Appl. 17 (2011) 113–137. <472,
479>
[1346] R. Gow and J. Sheekey, On primitive elements in finite semifields, Finite Fields Appl.
17 (2011) 194–204. <278>
[1347] W. T. Gowers, A new proof of Szemerédi’s theorem, Geom. Funct. Anal. 11 (2001)
465–588. <188, 192>
[1348] B. Grammaticos, R. G. Halburd, A. Ramani, and C.-M. Viallet, How to detect the
integrability of discrete systems, J. Phys. A 42 (2009) 454002, 30. <337, 344>
[1349] L. Granboulan, A. Joux, and J. Stern, Inverting HFE is quasipolynomial, In Advances
in Cryptology—CRYPTO 2006, volume 4117 of Lecture Notes in Comput. Sci.,
<233, 236>
[1368] R. M. Guralnick, Rational maps and images of rational points of curves over finite
fields, Irish Math. Soc. Bull. (2003) 71–95. <233, 236, 240>
[1369] R. M. Guralnick and P. Müller, Exceptional polynomials of affine type, J. Algebra
194 (1997) 429–454. <238, 239, 240>
[1370] R. M. Guralnick, P. Müller, and J. Saxl, The rational function analogue of a question
of Schur and exceptionality of permutation representations, Mem. Amer. Math.
Soc. 162 (2003) viii+79. <239, 240, 299, 300, 302>
[1371] R. M. Guralnick, P. Müller, and M. E. Zieve, Exceptional polynomials of affine type,
revisited, Preprint, 1999. <238, 240>
[1372] R. M. Guralnick, J. Rosenberg, and M. E. Zieve, A new family of exceptional poly-
nomials in characteristic two, Ann. of Math., 2nd Ser. 172 (2010) 1361–1390.
<237, 240>
[1373] R. M. Guralnick, T. J. Tucker, and M. E. Zieve, Exceptional covers and bijections on
rational points, Int. Math. Res. Not. IMRN (2007) Art. ID rnm004, 20. <239,
240>
[1374] R. M. Guralnick and M. E. Zieve, Polynomials with PSL(2) monodromy, Ann. of
Math., 2nd Ser. 172 (2010) 1315–1359. <237, 240>
[1375] V. Guruswami and A. C. Patthak, Correlated algebraic-geometric codes: improved
list decoding over bounded alphabets, Math. Comp. 77 (2008) 447–473. <705,
712>
[1376] V. Guruswami and A. Rudra, Limits to list decoding Reed-Solomon codes, IEEE
Trans. Inform. Theory 52 (2006) 3642–3649. <699, 703>
[1377] V. Guruswami and M. Sudan, Improved decoding of Reed-Solomon and algebraic-
geometry codes, IEEE Trans. Inform. Theory 45 (1999) 1757–1767. <699, 703>
[1378] F. G. Gustavson, Analysis of the Berlekamp-Massey linear feedback shift-register
synthesis algorithm, IBM J. Res. Develop. 20 (1976) 204–212. <329, 336>
[1379] J. Gutierrez and D. Gómez-Pérez, Iterations of multivariate polynomials and discrep-
ancy of pseudorandom numbers, In Applied Algebra, Algebraic Algorithms and
Error-Correcting Codes, volume 2227 of Lecture Notes in Comput. Sci., 192–199,
Springer, Berlin, 2001. <338, 340, 344>
[1380] J. Gutierrez and Á. Ibeas, Inferring sequences produced by a linear congruential
generator on elliptic curves missing high-order bits, Des. Codes Cryptogr. 45
(2007) 199–212. <338, 344>
[1381] J. Gutierrez and I. E. Shparlinski, Expansion of orbits of some dynamical systems
over finite fields, Bull. Aust. Math. Soc. 82 (2010) 232–239. <344>
[1382] J. Gutierrez, I. E. Shparlinski, and A. Winterhof, On the linear and nonlinear complex-
ity profile of nonlinear pseudorandom number-generators, IEEE Trans. Inform.
Theory 49 (2003) 60–64. <332, 333, 336>
[1383] C. Guyot, K. Kaveh, and V. M. Patankar, Explicit algorithm for the arithmetic on
the hyperelliptic Jacobians of genus 3, Journal of the Ramanujan Mathematical
Society 19 (2004) 75–115. <799, 803>
[1384] K. Gyarmati and A. Sárközy, Equations in finite fields with restricted solution sets I:
Character sums, Acta Math. Hungar. 118 (2008) 129–148. <184, 185>
[1385] K. Gyarmati and A. Sárközy, Equations in finite fields with restricted solution sets
II: Algebraic equations, Acta Math. Hungar. 119 (2008) 259–280. <184, 185>
[1386] D. Hachenberger, On completely free elements in finite fields, Des. Codes Cryptogr.
4 (1994) 129–143. <129, 138>
[1387] D. Hachenberger, Explicit iterative constructions of normal bases and completely free
elements in finite fields, Finite Fields Appl. 2 (1996) 1–20. <129, 138>
[1388] D. Hachenberger, Normal bases and completely free elements in prime power exten-
sions over finite fields, Finite Fields Appl. 2 (1996) 21–34. <129, 138>
[1389] D. Hachenberger, Finite Fields: Normal Bases and Completely Free Elements, The
Kluwer International Series in Engineering and Computer Science, 390. Kluwer
Academic Publishers, Boston, MA, 1997. <31, 32, 129, 130, 131, 132, 133, 134,
135, 136, 138>
[1390] D. Hachenberger, A decomposition theory for cyclotomic modules under the complete
point of view, J. Algebra 237 (2001) 470–486. <131, 132, 133, 134, 138>
[1391] D. Hachenberger, Primitive complete normal bases for regular extensions, Glasgow
Math. J. 43 (2001) 383–398. <95, 134, 135, 137, 138>
[1392] D. Hachenberger, Universal generators for primary closures of Galois fields, In Finite
Fields and Applications, 208–223, Springer, Berlin, 2001. <137, 138>
[1393] D. Hachenberger, Generators for primary closures of Galois fields, Finite Fields Appl.
9 (2003) 122–128. <137, 138>
[1394] D. Hachenberger, Primitive complete normal bases: existence in certain 2-power ex-
tensions and lower bounds, Discrete Math. 310 (2010) 3246–3250. <95, 137,
138>
[1395] D. Hachenberger, Primitive complete normal bases for regular extensions II: the
exceptional case, unpublished (2012). <137, 138>
[1396] D. Hachenberger, H. Niederreiter, and C. P. Xing, Function-field codes, Appl. Algebra
Engrg. Comm. Comput. 19 (2008) 201–211. <708, 712>
[1397] C. D. Haessig, L-functions of symmetric powers of cubic exponential sums, J. Reine
Angew. Math. 631 (2009) 1–57. <479, 488>
[1398] J. Hagenauer, E. Offer, and L. Papke, Iterative decoding of binary block and convo-
lutional codes, IEEE Trans. Inform. Theory 42 (1996) 429–445. <725, 727>
[1399] A. W. Hales and D. W. Newhart, Swan’s theorem for binary tetranomials, Finite
Fields Appl. 12 (2006) 301–311. <68, 70>
[1400] L. Hales and S. Hallgren, An improved quantum Fourier transform algorithm and
applications, In Forty First Annual Symposium on Foundations of Computer
Science, 515–525, IEEE Comput. Soc. Press, Los Alamitos, CA, 2000. <839,
841>
[1401] T. R. Halford, A. J. Grant, and K. M. Chugg, Which codes have 4-cycle-free Tanner
graphs?, IEEE Trans. Inform. Theory 52 (2006) 4219–4223. <718, 719>
[1402] C. Hall, L-functions of twisted Legendre curves, J. Number Theory 119 (2006) 128–
147. <496, 500>
[1403] J. I. Hall, On the order of Hall triple systems, J. Combin. Theory, Ser. A 29 (1980)
261–262. <611, 619>
[1404] M. Hall, Jr., Automorphisms of Steiner triple systems, IBM J. Res. Develop. 4 (1960)
460–472. <611, 619>
[1405] K. H. Ham and G. L. Mullen, Distribution of irreducible polynomials of small degrees
over finite fields, Math. Comp. 67 (1998) 337–341. <75, 79>
[1406] N. Hamilton and R. Mathon, More maximal arcs in Desarguesian projective planes
and their geometric structure, Adv. Geom. 3 (2003) 251–261. <572, 574>
[1407] N. Hamilton and R. Mathon, On the spectrum of non-Denniston maximal arcs in
PG(2, 2^h), European J. Combin. 25 (2004) 415–421. <572, 574>
[1408] R. W. Hamming, Error detecting and error correcting codes, Bell System Tech. J. 29
(1950) 147–160. <684, 702, 703>
[1409] A. R. Hammons, Jr., P. V. Kumar, A. R. Calderbank, N. J. A. Sloane, and P. Solé, The
Z_4-linearity of Kerdock, Preparata, Goethals, and related codes, IEEE Trans.
Inform. Theory 40 (1994) 301–319. <28, 29, 31, 251, 252, 273, 699, 701, 702,
703>
[1410] D.-G. Han, D. Choi, and H. Kim, Improved computation of square roots in specific
finite fields, IEEE Trans. Comput. 58 (2009) 188–196. <360, 363>
[1411] W. Han, The distribution of the coefficients of primitive polynomials over finite
fields, In Cryptography and Computational Number Theory, volume 20 of Progr.
Comput. Sci. Appl. Logic, 43–57, Birkhäuser, Basel, 2001. <93, 95>
[1412] W. B. Han, The coefficients of primitive polynomials over finite fields, Math. Comp.
65 (1996) 331–340. <93, 95>
[1413] D. Hankerson, A. Menezes, and S. Vanstone, Guide to Elliptic Curve Cryptography,
Springer Professional Computing. Springer-Verlag, New York, 2004. <31, 32, 49,
346, 353, 354, 356, 363, 455, 456>
[1414] J. P. Hansen and J. P. Pedersen, Automorphism groups of Ree type, Deligne-Lusztig
curves and function fields, J. Reine Angew. Math. 440 (1993) 99–109. <461,
463>
[1415] S. H. Hansen, Error-correcting codes from higher-dimensional varieties, Finite Fields
Appl. 7 (2001) 530–552. <705, 712>
[1416] T. Hansen and G. L. Mullen, Primitive polynomials over finite fields, Math. Comp.
59 (1992) 639–643, S47–S50. <75, 79, 88, 90, 92, 95, 97, 349, 363>
[1417] B. Hanson, D. Panario, and D. Thomson, Swan-like results for binomials and tri-
nomials over finite fields of odd characteristic, Des. Codes Cryptogr. 61 (2011)
273–283. <69, 70>
[1418] G. Hardy and E. Wright, An Introduction to the Theory of Numbers, Oxford University
Press, Oxford, 2008. <495, 500>
[1419] G. H. Hardy and J. E. Littlewood, Some problems of ‘Partitio Numerorum’; IV: The
singular series in Waring’s problem and the value of the number G(k), Math. Z.
12 (1922) 161–188. <498, 500>
[1420] G. H. Hardy and J. E. Littlewood, Some problems of ‘Partitio Numerorum’; III: On
the expression of a number as a sum of primes, Acta Math. 44 (1923) 1–70. <497,
500>
[1421] R. Harley, Fast arithmetic on genus two curves, Preprint, 2000. <797, 803>
[1422] R. Harley, Asymptotically optimal p-adic point-counting, 2002, Posting to the Number
Theory List, available at
http://listserv.nodak.edu/cgi-bin/wa.exe?A2=ind0212&L=NMBRTHRY&P=R1277. <788, 796>
[1423] N. V. Harrach and C. Mengyán, Minimal blocking sets in PG(2, q) arising from
a generalized construction of Megyesi, Innov. Incidence Geom. 6/7 (2007/08)
211–226. <559, 563>
[1424] D. Hart, A. Iosevich, and J. Solymosi, Sum-product estimates in finite fields via
Kloosterman sums, Int. Math. Res. Not. IMRN (2007) Art. ID rnm007, 14.
<186, 192>
[1425] D. Hart, L. Li, and C. Yen Shen, Fourier analysis and expanding phenomena in finite
fields, preprint available, http://arxiv.org/abs/0909.5471, 2009. <188, 192>
[1426] W. B. Hart et al., Fast Library for Number Theory (Version 2.2), available at
http://www.flintlib.org. <47, 49, 346, 351, 363>
<326, 336>
[1446] P. Hawkes and G. G. Rose, Rewriting variables: the complexity of fast algebraic
attacks on stream ciphers, In Advances in Cryptology—CRYPTO 2004, volume
3152 of Lecture Notes in Comput. Sci., 390–406, Springer, Berlin, 2004. <248,
252>
[1447] D. R. Hayes, The distribution of irreducibles in GF[q, x], Trans. Amer. Math. Soc.
117 (1965) 101–127. <74, 79, 495, 500>
[1448] D. R. Hayes, The expression of a polynomial as a sum of three irreducibles, Acta
Arith. 11 (1966) 461–488. <497, 500>
[1449] D. R. Hayes, Explicit class field theory for rational function fields, Trans. Amer.
Math. Soc. 189 (1974) 77–91. <538, 546>
[1450] D. R. Hayes, Explicit class field theory in global function fields, In Studies in Algebra
and Number Theory, volume 6 of Adv. in Math. Suppl. Stud., 173–217, Academic
Press, New York, 1979. <538, 546>
[1451] D. R. Hayes, A brief introduction to Drinfeld modules, In The Arithmetic of Function
Fields, volume 2 of Ohio State Univ. Math. Res. Inst. Publ., 1–32, de Gruyter,
Berlin, 1992. <535, 546>
[1452] L. S. Heath and N. A. Loehr, New algorithms for generating Conway polynomials
over finite fields, J. Symbolic Comput. 38 (2004) 1003–1024. <402, 404>
[1453] D. R. Heath-Brown, Arithmetic applications of Kloosterman sums, Nieuw Arch.
Wiskd. (5) 1 (2000) 380–384. <154, 161>
[1454] D. R. Heath-Brown and S. Konyagin, New bounds for Gauss sums derived from kth
powers, and for Heilbronn’s exponential sum, Q. J. Math. 51 (2000) 221–235.
<141, 161, 173, 176, 185>
[1455] D. R. Heath-Brown and S. J. Patterson, The distribution of Kummer sums at prime
arguments, J. Reine Angew. Math. 310 (1979) 111–130. <150, 161>
[1456] A. Hedayat, D. Raghavarao, and E. Seiden, Further contributions to the theory of
F-squares design, Ann. Statist. 3 (1975) 712–716. <553, 556>
[1457] A. S. Hedayat, N. J. A. Sloane, and J. Stufken, Orthogonal Arrays, Theory and
Applications, Springer Series in Statistics. Springer-Verlag, New York, 1999.
<610, 619, 631, 642>
[1458] A. Hefez, On the value sets of special polynomials over finite fields, Finite Fields
Appl. 2 (1996) 337–347. <235, 236>
[1459] L. Heffter, Ueber Tripelsysteme, Math. Ann. 49 (1897) 101–112. <591, 599>
[1460] H. Heilbronn, Lecture Notes on Additive Number Theory mod p, California Institute
of Technology (1964). <212, 213>
[1461] R. Heindl, New Directions in Multivariate Public Key Cryptography, PhD disser-
tation, Clemson University, 2009,
http://etd.lib.clemson.edu/documents/1247508584/. <776, 783>
[1462] J. Heintz and M. Sieveking, Absolute primality of polynomials is decidable in random
polynomial time in the number of variables, In Automata, Languages and Pro-
gramming, volume 115 of Lecture Notes in Comput. Sci., 16–28, Springer-Verlag,
1981. <387, 392>
[1463] H. A. Helfgott, Growth and generation in SL_2(Z/pZ), Ann. of Math., 2nd Ser. 167
(2008) 601–623. <191, 192>
[1464] H. A. Helfgott, Growth in SL_3(Z/pZ), J. Eur. Math. Soc. (JEMS) 13 (2011) 761–851.
<191, 192>
<382, 392>
[1617] J.-R. Joly, Équations et variétés algébriques sur un corps fini, Enseignement Math.,
IIe Ser. 19 (1973) 1–117. <206, 213>
[1618] R. Jones, Iterated Galois towers, their associated martingales, and the p-adic Man-
delbrot set, Compos. Math. 143 (2007) 1108–1126. <337, 342, 344>
[1619] R. Jones, The density of prime divisors in the arithmetic dynamics of quadratic
polynomials, J. Lond. Math. Soc., 2nd Ser. 78 (2008) 523–544. <337, 342, 344>
[1620] R. Jones and N. Boston, Settled polynomials over finite fields, Proc. Amer. Math.
Soc. 140 (2012) 1849–1863. <178, 185, 342, 344>
[1621] C. Jordan, Traité des Substitutions et des Équations Algébriques, Paris: Gauthier-
Villars, 1870. <10, 11>
[1622] H. F. Jordan and D. C. M. Wood, On the distribution of sums of successive bits of
shift-register sequences, IEEE Trans. Computers C-22 (1973) 400–408. <630,
642>
[1623] J.-P. Jouanolou, Théorèmes de Bertini et Applications, volume 42 of Progress in
Mathematics, Birkhäuser Boston, 1983. <387, 392>
[1624] A. Joux, A one round protocol for tripartite Diffie-Hellman, J. Cryptology 17 (2004)
263–276. <748, 750>
[1625] A. Joux, Discrete logarithms in GF(2^607) and GF(2^613), mailing list announcement,
https://listserv.nodak.edu/cgi-bin/wa.exe?A2=ind0509&L=NMBRTHRY&P=R1490&D=0&I=-3&T=0,
2005. <400, 401>
[1626] A. Joux and R. Lercier, The function field sieve is quite special, In Algorithmic Number
Theory, volume 2369 of Lecture Notes in Comput. Sci., 431–445, Springer, Berlin,
2002. <399, 401>
[1627] A. Joux and R. Lercier, Improvements to the general number field sieve for discrete
logarithms in prime fields: A comparison with the Gaussian integer method,
Math. Comp. 72 (2003) 953–967. <399, 401>
[1628] A. Joux and R. Lercier, The function field sieve in the medium prime case, In Advances
in Cryptology—EUROCRYPT 2006, volume 4004 of Lecture Notes in Comput.
Sci., 254–270, Springer, Berlin, 2006. <399, 401>
[1629] A. Joux and V. Vitse, Cover and decomposition index calculus on elliptic curves
made practical—Application to a seemingly secure curve over F_{p^6}, In Advances
in Cryptology—EUROCRYPT 2012, volume 7237 of Lecture Notes in Comput. Sci.,
9–26, 2011. <786, 796>
[1630] M. Joye, A. Miyaji, and A. Otsuka, editors, Pairing-Based Cryptography — Pairing
2010, volume 6487 of Lecture Notes in Comput. Sci., Springer-Verlag, Berlin,
2010. <788, 796>
[1631] D. Jungnickel, Finite Fields: Structure and Arithmetics, Bibliographisches Institut,
Mannheim, 1993. <13, 31, 32, 39, 49, 101, 103, 105, 106, 109, 113, 116, 117, 123,
128, 170, 183, 185, 325, 327, 328, 329, 336, 503, 504, 505, 506, 510>
[1632] D. Jungnickel, Trace-orthogonal normal bases, Discrete Appl. Math. 47 (1993) 233–
249. <123, 124, 128>
[1633] D. Jungnickel, T. Beth, and W. Geiselmann, A note on orthogonal circulant matrices
over finite fields, Arch. Math. (Basel) 62 (1994) 126–133. <506, 510>
[1634] D. Jungnickel and M. J. de Resmini, Another case of the prime power conjecture for
finite projective planes, Adv. Geom. 2 (2002) 215–218. <573, 574>
[1635] D. Jungnickel, A. J. Menezes, and S. A. Vanstone, On the number of self-dual bases
of GF(q^m) over GF(q), Proc. Amer. Math. Soc. 109 (1990) 23–29. <104, 109,
115, 116>
[1636] D. Jungnickel and H. Niederreiter, editors, Finite Fields and Applications, Springer-
Verlag, Berlin, 2001. <31, 32>
[1637] D. Jungnickel and S. A. Vanstone, On primitive polynomials over finite fields, J.
Algebra 124 (1989) 337–353. <92, 95>
[1638] J. Justesen, A class of constructive asymptotically good algebraic codes, IEEE Trans.
Inform. Theory IT-18 (1972) 652–656. <689, 702, 703>
[1639] J. Justesen and T. Høholdt, A Course in Error-Correcting Codes, EMS Textbooks
in Mathematics. European Mathematical Society (EMS), Zürich, 2004. <661,
703>
[1640] V. Kabanets and R. Impagliazzo, Derandomizing polynomial identity tests means
proving circuit lower bounds, Comput. Complexity 13 (2004) 1–46. <391, 392>
[1641] N. Kahale, Isoperimetric inequalities and eigenvalues, SIAM J. Discrete Math. 10
(1997) 30–40. <645, 658>
[1642] T. Kaida, S. Uehara, and K. Imamura, An algorithm for the k-error linear complexity
of sequences over GF(p^m) with period p^n, p a prime, Inform. and Comput. 151
(1999) 134–147. <329, 336>
[1643] T. Kailath, S. Y. Kung, and M. Morf, Displacement ranks of a matrix, Bull. Amer.
Math. Soc. (New Ser.) 1 (1979) 769–773. <532, 535>
[1644] B. S. Kaliski, Jr., The Montgomery inverse and its applications, IEEE Trans. on
Computers 44 (1995) 1064–1065. <360, 363>
[1645] E. Kaltofen, A polynomial reduction from multivariate to bivariate integral polyno-
mial factorization, In Proceedings of the Fourteenth Symposium on Theory of
Computing, 261–266, ACM, 1982. <387, 392>
[1646] E. Kaltofen, A polynomial-time reduction from bivariate to univariate integral polyno-
mial factorization, In Proc. Twenty Third Annual Symp. Foundations of Comp.
Sci., 57–64. IEEE, 1982. <381, 382, 387, 392>
[1647] E. Kaltofen, Effective Hilbert irreducibility, Information and Control 66 (1985) 123–
137. <387, 392>
[1648] E. Kaltofen, Fast parallel absolute irreducibility testing, J. Symbolic Comput. 1 (1985)
57–67. <387, 392>
[1649] E. Kaltofen, Sparse Hensel lifting, In Proceedings of EUROCAL ’85, Vol. 2, volume
204 of Lecture Notes in Comput. Sci., 4–17, Springer-Verlag, 1985. <387, 391,
392>
[1650] E. Kaltofen, Uniform closure properties of p-computable functions, In Proc. Eighteenth
Annual ACM Symp. Theory Comput., 330–337, 1986, also published as part of
[1652] and [1653]. <390, 391, 392>
[1651] E. Kaltofen, Deterministic irreducibility testing of polynomials over large finite fields,
J. Symbolic Comput. 4 (1987) 77–82. <387, 392>
[1652] E. Kaltofen, Greatest common divisors of polynomials given by straight-line programs,
J. ACM 35 (1988) 231–264. <936>
[1653] E. Kaltofen, Factorization of polynomials given by straight-line programs, In S. Mi-
cali, editor, Randomness and Computation, volume 5 of Advances in Computing
Research, 375–412, JAI Press Inc., Greenwich, Connecticut, 1989. <390, 391,
392, 936>
[1654] E. Kaltofen, Polynomial factorization 1982-1986, In D. V. Chudnovsky and R. D.
Jenks, editors, Computers in Mathematics, volume 125 of Lecture Notes in Pure
and Applied Mathematics, 285–309, Marcel Dekker, New York, NY, 1990. <387,
392>
[1655] E. Kaltofen, Polynomial factorization 1987-1991, In Proc. LATIN ’92, volume 583 of
Lecture Notes in Comput. Sci., 294–313, Springer-Verlag, 1992. <381, 382, 387,
392>
[1656] E. Kaltofen, Asymptotically fast solution of Toeplitz-like singular linear systems, In
Proceedings of the International Symposium on Symbolic and Algebraic Compu-
tation, ISSAC ’94, 297–304, ACM, New York, NY, USA, 1994. <533, 535>
[1657] E. Kaltofen, Analysis of Coppersmith’s block Wiedemann algorithm for the parallel
solution of sparse linear systems, Math. Comp. 64 (1995) 777–806. <534, 535>
[1658] E. Kaltofen, Effective Noether irreducibility forms and applications, J. Comput.
System Sci. 50 (1995) 274–295. <386, 392>
[1659] E. Kaltofen, Polynomial factorization: a success story, In ISSAC ’03: Proceedings of
the 2003 International Symposium on Symbolic and Algebraic Computation, 3–4,
ACM, 2003. <387, 392>
[1660] E. Kaltofen and P. Koiran, On the complexity of factoring bivariate supersparse
(lacunary) polynomials, In ISSAC ’05: Proceedings of the 2005 International
Symposium on Symbolic and Algebraic Computation, 208–215, ACM, 2005. <390,
392>
[1661] E. Kaltofen and P. Koiran, Finding small degree factors of multivariate supersparse
(lacunary) polynomials over algebraic number fields, In ISSAC ’06: Proceedings
of the 2006 International Symposium on Symbolic and Algebraic Computation,
162–168, ACM, 2006. <390, 392>
[1662] E. Kaltofen and W. Lee, Early termination in sparse interpolation algorithms, J.
Symbolic Comput. 36 (2003) 365–400. <392>
[1663] E. Kaltofen and A. Lobo, Factoring high-degree polynomials by the black box
Berlekamp algorithm, Technical report, Department of Computer Science, Rens-
selaer Polytechnic Institute, 1994. <381, 382>
[1664] E. Kaltofen and A. Lobo, Distributed matrix-free solution of large sparse linear
systems over finite fields, Algorithmica 24 (1999) 331–348. <530, 534, 535>
[1665] E. Kaltofen and V. Pan, Parallel solution of Toeplitz and Toeplitz-like linear systems
over fields of small positive characteristic, In First International Symposium on
Parallel Symbolic Computation—PASCO ’94, volume 5 of Lecture Notes Ser.
Comput., 225–233, World Sci. Publ., River Edge, NJ, 1994. <509, 510>
[1666] E. Kaltofen and B. D. Saunders, On Wiedemann’s method of solving sparse linear
systems, In Applied Algebra, Algebraic Algorithms and Error-Correcting Codes,
volume 539 of Lecture Notes in Comput. Sci., 29–38, Springer, Berlin, 1991.
<531, 533, 535>
[1667] E. Kaltofen and V. Shoup, Subquadratic-time factoring of polynomials over finite
fields, Math. Comp. 67 (1998) 1179–1197. <369, 374, 376, 380, 381, 382>
[1668] E. Kaltofen and B. Trager, Computing with polynomials given by black boxes for their
evaluations: Greatest common divisors, factorization, separation of numerators
and denominators, In Proc. Twenty Ninth Annual Symp. Foundations of Comp.
Sci., 296–305, 1988. <391, 392>
[1669] E. Kaltofen and B. Trager, Computing with polynomials given by black boxes for their
evaluations: Greatest common divisors, factorization, separation of numerators
and denominators, J. Symbolic Comput. 9 (1990) 301–320. <391, 392, 530, 535>
[1670] E. Kaltofen and G. Villard, On the complexity of computing determinants, Comput.
Complexity 13 (2004) 91–130. <390, 392, 535>
[1847] S. Lang and J. Tate (eds.), The Collected Papers of Emil Artin, Addison–Wesley
Publishing Co., Inc., Reading, Mass.-London, 1965. <72>
[1848] S. Lang and H. Trotter, Frobenius Distributions in GL_2-Extensions, volume 504 of
Lecture Notes in Mathematics, Springer-Verlag, Berlin, 1976. <32, 438, 440>
[1849] S. Lang and A. Weil, Number of points of varieties in finite fields, Amer. J. Math. 76
(1954) 819–827. <194, 201>
[1850] T. Lange, Efficient Arithmetic on Hyperelliptic Curves, PhD thesis, Universität
Gesamthochschule Essen, 2001. <801, 802, 803>
[1851] T. Lange, Trace zero subvariety for cryptosystems, Journal of the Ramanujan Math-
ematical Society 19 (2004) 15–33. <802, 803>
[1852] T. Lange, Arithmetic on binary genus 2 curves suitable for small devices, In Pro-
ceedings ECRYPT Workshop on RFID and Lightweight Crypto, 2005. <798,
803>
[1853] T. Lange, Formulae for arithmetic on genus 2 hyperelliptic curves, Appl. Algebra Eng.
Commun. Comput. 15 (2005) 295–328. <798, 803>
[1854] T. Lange and M. Stevens, Efficient doubling on genus two curves over binary fields,
In Eleventh International Workshop on Selected Areas in Cryptography, volume
3357 of Lecture Notes in Comput. Sci., 170–181, Springer, Berlin, 2004. <797,
803>
[1855] P. Langevin, Covering radius of RM(1, 9) in RM(3, 9), In Eurocode ’90, volume 514
of Lecture Notes in Comput. Sci., 51–59, Springer, Berlin, 1991. <245, 252>
[1856] P. Langevin and G. Leander, Monomial bent functions and Stickelberger’s theorem,
Finite Fields Appl. 14 (2008) 727–742. <268, 273>
[1857] V. Laohakosol and U. Pintoptang, A modification of Fitzgerald’s characterization of
primitive polynomials over a finite field, Finite Fields Appl. 14 (2008) 85–91.
<88, 90>
[1858] G. Larcher and H. Niederreiter, Generalized (t, s)-sequences, Kronecker-type se-
quences, and diophantine approximations of formal Laurent series, Trans. Amer.
Math. Soc. 347 (1995) 2051–2073. <626, 630>
[1859] R. Laubenbacher, A. Jarrah, H. Mortveit, and S. S. Ravi, Encyclopedia of Complex-
ity and System Science, chapter A Mathematical Foundation for Agent-Based
Computer Simulation, Springer Verlag, New York, 2009. <829, 834>
[1860] R. Laubenbacher and B. Stigler, A computational algebra approach to the reverse
engineering of gene regulatory networks, Journal of Theoretical Biology 229
(2004) 523–537. <337, 344, 831, 834>
[1861] A. G. B. Lauder, Computing zeta functions of Kummer curves via multiplicative
characters, Found. Comput. Math. 3 (2003) 273–295. <491>
[1862] A. G. B. Lauder, Counting solutions to equations in many variables over finite fields,
Found. Comput. Math. 4 (2004) 221–267. <490, 491>
[1863] A. G. B. Lauder, Deformation theory and the computation of zeta functions, Proc.
London Math. Soc., 3rd Ser. 88 (2004) 565–602. <490, 491>
[1864] A. G. B. Lauder and K. G. Paterson, Computing the error linear complexity spectrum
of a binary sequence of period 2^n, IEEE Trans. Inform. Theory 49 (2003) 273–
280. <329, 336>
[1865] A. G. B. Lauder and D. Wan, Computing zeta functions of Artin-Schreier curves over
finite fields. II, J. Complexity 20 (2004) 331–349. <454, 456, 491>
[1866] A. G. B. Lauder and D. Wan, Counting points on varieties over finite fields of small
characteristic, In Algorithmic Number Theory: Lattices, Number Fields, Curves
and Cryptography, volume 44 of Math. Sci. Res. Inst. Publ., 579–612, Cambridge
Univ. Press, Cambridge, 2008. <454, 456, 489, 490, 491>
[1867] G. Laumon, Majorations de sommes trigonométriques (d’après P. Deligne et N. Katz),
In The Euler-Poincaré characteristic, volume 83 of Astérisque, 221–258, Soc.
Math. France, Paris, 1981. <169>
[1868] G. Laumon, Transformation de Fourier, constantes d’équations fonctionnelles et con-
jecture de Weil, Inst. Hautes Études Sci. Publ. Math. 65 (1987) 131–210. <478,
479>
[1869] G. Laumon, Exponential sums and l-adic cohomology: a survey, Israel J. Math. 120
(2000) 225–257. <169>
[1870] M. Lavrauw, G. L. Mullen, S. Nikova, D. Panario, and L. Storme, editors, Proc.
Tenth International Conference on Finite Fields and Applications, volume 579
of Contemp. Math., American Mathematical Society, Providence, RI, 2012. <31>
[1871] M. Lavrauw and O. Polverino, Finite semifields, In L. Storme and J. De Beule, editors,
Current Research Topics in Galois Geometry, chapter 6, Nova Publishers, 2011.
<274, 278>
[1872] M. Lavrauw, L. Storme, and G. Van de Voorde, A proof of the linearity conjecture
for k-blocking sets in PG(n, p^3), p prime, J. Combin. Theory, Ser. A 118 (2011)
808–818. <561, 563>
[1873] K. M. Lawrence, A combinatorial characterization of (t, m, s)-nets in base b, J.
Combin. Des. 4 (1996) 275–293. <620, 630>
[1874] K. M. Lawrence, A. Mahalanabis, G. L. Mullen, and W. C. Schmid, Construction of
digital (t, m, s)-nets from linear codes, In Finite Fields and Applications, volume
233 of London Math. Soc. Lecture Note Ser., 189–208, Cambridge University
Press, Cambridge, 1996. <625, 630>
[1875] C. F. Laywine and G. L. Mullen, Discrete Mathematics using Latin Squares, Wiley-
Interscience Series in Discrete Mathematics and Optimization. John Wiley &
Sons Inc., New York, 1998. <31, 32, 551, 555, 556>
[1876] C. F. Laywine, G. L. Mullen, and G. Whittle, d-dimensional hypercubes and the
Euler and MacNeish conjectures, Monatsh. Math. 119 (1995) 223–238. <552,
553, 556>
[1877] D. Lazard, Gröbner bases, Gaussian elimination and resolution of systems of algebraic
equations, In Computer Algebra, volume 162 of Lecture Notes in Comput. Sci.,
146–156, Springer, Berlin, 1983. <781, 783>
[1878] G. Leander and A. Kholosha, Bent functions with 2^r Niho exponents, IEEE Trans.
Inform. Theory 52 (2006) 5529–5532. <268, 273>
[1879] N. G. Leander, Monomial bent functions, IEEE Trans. Inform. Theory 52 (2006)
738–743. <268, 271, 273>
[1880] G. Lecerf, Sharp precision in Hensel lifting for bivariate polynomial factorization,
Math. Comp. 75 (2006) 921–933. <385, 392>
[1881] G. Lecerf, Improved dense multivariate polynomial factorization algorithms, J. Sym-
bolic Comput. 42 (2007) 477–494. <386, 392>
[1882] G. Lecerf, Fast separable factorization and applications, Appl. Alg. Eng. Comm.
Comp. 19 (2008) 135–160. <383, 384, 392>
[1883] G. Lecerf, New recombination algorithms for bivariate polynomial factorization based
on Hensel lifting, Appl. Alg. Eng. Comm. Comp. 21 (2010) 151–176. <385, 392>
[1884] C. Lee and C. Chang, Low-complexity linear array multiplier for normal basis of
type-II, In Proc. IEEE International Conf. Multimedia and Expo, 1515–1518,
ematics, World Scientific Publishing Co. Inc., River Edge, NJ, 1996. <31, 32,
643, 650, 651, 658>
[1923] W.-C. Li, On negative eigenvalues of regular graphs, C. R. Acad. Sci. Paris, Sér. I,
Math. 333 (2001) 907–912. <647, 658>
[1924] W.-C. Li, Recent developments in automorphic forms and applications, In Number
Theory for the Millennium II, 331–354, A. K. Peters, Natick, MA, 2002. <643,
658>
[1925] W.-C. Li, Ramanujan hypergraphs, Geom. Funct. Anal. 14 (2004) 380–399. <648,
658>
[1926] W.-C. Li, Zeta functions in combinatorics and number theory, In Fourth International
Congress of Chinese Mathematicians, volume 48 of AMS/IP Stud. Adv. Math.,
351–366, Amer. Math. Soc., Providence, RI, 2010. <658>
[1927] W.-C. Li and P. Solé, Spectra of regular graphs and hypergraphs and orthogonal
polynomials, European J. Combin. 17 (1996) 461–477. <648, 658>
[1928] Y. Li, S. Ling, H. Niederreiter, H. Wang, C. Xing, and S. Zhang, editors, Coding
and Cryptology, volume 4 of Series on Coding Theory and Cryptology. World
Scientific Publishing Co. Pte. Ltd., Hackensack, NJ, 2008. <31, 32>
[1929] Y. Li and M. Wang, On EA-equivalence of certain permutations to power mappings,
Des. Codes Cryptogr. 58 (2011) 259–269. <226, 230>
[1930] Q. Liao and K. Feng, On the complexity of the normal bases via prime Gauss period
over finite fields, J. Syst. Sci. Complex. 22 (2009) 395–406. <125, 128>
[1931] Q. Liao and L. You, Low complexity of a class of normal bases over finite fields, Finite
Fields Appl. 17 (2011) 1–14. <119, 128>
[1932] Y. S. Liaw, More Z-cyclic Room squares, Ars Combin. 52 (1999) 228–238. <616,
619>
[1933] R. Lidl and G. L. Mullen, When does a polynomial over a finite field permute the
elements of the field?, Amer. Math. Monthly 95 (1988) 243–246. <216, 230>
[1934] R. Lidl and G. L. Mullen, Cycle structure of Dickson permutation polynomials, Math.
J. Okayama Univ. 33 (1991) 1–11. <228, 230>
[1935] R. Lidl and G. L. Mullen, When does a polynomial over a finite field permute the
elements of the field?, II, Amer. Math. Monthly 100 (1993) 71–74. <216, 217,
230>
[1936] R. Lidl, G. L. Mullen, and G. Turnwald, Dickson Polynomials, volume 65 of Pitman
Monographs and Surveys in Pure and Applied Mathematics, Longman Scientific
& Technical, Harlow, 1993. <31, 32, 230, 239, 240, 283, 289, 290, 294, 298, 302,
333, 336>
[1937] R. Lidl and H. Niederreiter, On orthogonal systems and permutation polynomials in
several variables, Acta Arith. 22 (1972/73) 257–265. <231, 232>
[1938] R. Lidl and H. Niederreiter, Introduction to Finite Fields and Their Applications,
Cambridge University Press, Cambridge, revised edition, 1994. <13, 31, 32, 71,
73, 312, 317, 393, 401>
[1939] R. Lidl and H. Niederreiter, Finite Fields, volume 20 of Encyclopedia of Mathematics
and its Applications, Cambridge University Press, Cambridge, second edition,
1997. <3, 11, 13, 24, 27, 31, 32, 37, 49, 60, 61, 62, 66, 70, 71, 73, 87, 90, 171,
173, 175, 179, 183, 185, 202, 206, 207, 209, 213, 215, 216, 217, 228, 230, 232, 235,
236, 238, 240, 253, 261, 283, 287, 290, 318, 324, 325, 327, 336, 349, 363, 365, 374,
377, 380, 393, 394, 401, 510>
[1940] R. Lidl and C. Wells, Chebychev polynomials in several variables, J. Reine Angew.
[1979] H. Lüneburg, Über projektive Ebenen, in denen jede Fahne von einer nicht-trivialen
Elation invariant gelassen wird, Abh. Math. Sem. Univ. Hamburg 29 (1965) 37–
76. <569, 574>
[1980] H. Lüneburg, Translation Planes, Springer-Verlag, Berlin, 1980. <566, 574>
[1981] J. Luo and K. Feng, On the weight distributions of two classes of cyclic codes, IEEE
Trans. Inform. Theory 54 (2008) 5332–5344. <205, 206>
[1982] Y. Luo, Q. Chai, G. Gong, and X. Lai, Wg-7, a lightweight stream cipher with
good cryptographic properties, In Proceedings of IEEE Global Communications
Conference (GLOBECOM’10), 2010. <758, 763>
[1983] K. Ma and J. von zur Gathen, The computational complexity of recognizing permu-
tation functions, Comput. Complexity 5 (1995) 76–97. <217, 230, 392>
[1984] K. Ma and J. von zur Gathen, Tests for permutation functions, Finite Fields Appl. 1
(1995) 31–56. <217, 230>
[1985] F. S. Macaulay, The Algebraic Theory of Modular Systems, Cambridge Mathematical
Library, Cambridge University Press, Cambridge, 1994. <781, 783>
[1986] C. R. MacCluer, On a conjecture of Davenport and Lewis concerning exceptional
polynomials, Acta Arith. 12 (1966/1967) 289–299. <293, 302>
[1987] D. J. C. MacKay, Good error-correcting codes based on very sparse matrices, IEEE
Trans. Inform. Theory 45 (1999) 399–431. <713, 715, 716, 719>
[1988] H. F. MacNeish, Euler squares, Ann. of Math., 2nd Ser. 23 (1922) 221–227. <552,
556>
[1989] F. J. MacWilliams, Orthogonal matrices over finite fields, Amer. Math. Monthly 76
(1969) 152–164. <505, 507, 510>
[1990] F. J. MacWilliams, Orthogonal circulant matrices over finite fields, and how to find
them, J. Combin. Theory, Ser. A 10 (1971) 1–17. <115, 116, 506, 510>
[1991] F. J. MacWilliams and N. J. A. Sloane, The Theory of Error-Correcting Codes. I,
North-Holland Mathematical Library, Vol. 16, North-Holland Publishing Co.,
Amsterdam, 1977. <31, 32, 179, 185, 205, 206, 259, 261, 266, 273, 586, 589, 661,
684, 691, 701, 703>
[1992] F. J. MacWilliams and N. J. A. Sloane, The Theory of Error-Correcting Codes. II,
North-Holland Mathematical Library, Vol. 16, North-Holland Publishing Co.,
Amsterdam, 1977. <243, 244, 245, 246, 251, 252>
[1993] H. Mahdavifar and A. Vardy, Achieving the secrecy capacity of wiretap channels using
polar codes, IEEE Trans. Inform. Theory 57 (2011) 6428–6443. <739>
[1994] S. Maitra, K. C. Gupta, and A. Venkateswarlu, Results on multiples of primitive
polynomials and their products over GF(2), Theoret. Comput. Sci. 341 (2005)
311–343. <96, 97, 634, 635, 642>
[1995] C. Malvenuto and F. Pappalardi, Enumerating permutation polynomials I: Permu-
tations with non-maximal degree, Finite Fields Appl. 8 (2002) 531–547. <220,
230>
[1996] C. Malvenuto and F. Pappalardi, Enumerating permutation polynomials II: k-cycles
with minimal degree, Finite Fields Appl. 10 (2004) 72–96. <220, 230>
[1997] C. Malvenuto and F. Pappalardi, Corrigendum to: “Enumerating permutation poly-
nomials I: Permutations with non-maximal degree” [Finite Fields Appl. 8 (2002),
no. 4, 531–547], Finite Fields Appl. 13 (2007) 171–174. <220, 230>
[1998] F. Manganiello, E. Gorla, and J. Rosenthal, Spread codes and spread decoding in
network coding, In Proc. Int. Symp. Inform. Theory, 881–885, 2008. <848,
849>
[1999] J. I. Manin, The Hasse-Witt matrix of an algebraic curve, Izv. Akad. Nauk SSSR
Ser. Mat. 25 (1961) 153–172. <486, 488>
[2000] Y. Mansury, M. Kimura, J. Lobo, and T. S. Deisboeck, Emerging patterns in tumor
systems: Simulating the dynamics of multicellular clusters with an agent-based
spatial agglomeration model, Journal of Theoretical Biology 219 (2002) 343–370.
<831, 834>
[2001] I. Mantin, Analysis of the Stream Cipher RC4, Master’s dissertation, The Weizmann
Institute of Science, Rehovot, 76100, Israel, 2001. <751, 753, 763>
[2002] Maplesoft™, Maplesoft - Technical Computing Software for Engineers, Mathematicians,
Scientists, Instructors and Students, version 16.01, http://www.maplesoft.com/,
as viewed in July 2012. <48, 49, 346, 363>
[2003] J. E. Marcos, Specific permutation polynomials over finite fields, Finite Fields Appl.
17 (2011) 105–112. <221, 224, 225, 230>
[2004] D. A. Marcus, Number Fields, Universitext, Springer-Verlag, New York, 1977. <847,
849>
[2005] G. A. Margulis, Explicit constructions of expanders, Problemy Peredači Informacii 9
(1973) 71–80. <649, 658>
[2006] G. A. Margulis, Explicit group-theoretic constructions of combinatorial schemes and
their applications in the construction of expanders and concentrators, Problemy
Peredachi Informatsii 24 (1988) 51–60. <653, 655, 658>
[2007] W. J. Martin and D. R. Stinson, A generalized Rao bound for ordered orthogonal
arrays and (t, m, s)-nets, Canad. Math. Bull. 42 (1999) 359–370. <621, 630>
[2008] W. J. Martin and D. R. Stinson, Association schemes for ordered orthogonal arrays
and (T, M, S)-nets, Canad. J. Math. 51 (1999) 326–346. <621, 630>
[2009] W. J. Martin and T. I. Visentin, A dual Plotkin bound for (T, M, S)-nets, IEEE
Trans. Inform. Theory 53 (2007) 411–415. <621, 630>
[2010] J. L. Massey, Threshold Decoding, Massachusetts Institute of Technology, Research
Laboratory of Electronics, Tech. Rep. 410, Cambridge, Mass., 1963. <696, 703>
[2011] J. L. Massey, Shift-register synthesis and BCH decoding, IEEE Trans. Inform. Theory
IT-15 (1969) 122–127. <246, 252, 314, 317, 325, 336, 350, 363, 693, 702, 703>
[2012] J. L. Massey and J. K. Omura, Computational methods and apparatus for finite
field arithmetic, US Patent No. 4,587,627, to OMNET Assoc., Sunnyvale CA,
Washington, D.C.: Patent and Trademark Office (1986). <818, 820, 823>
[2013] J. L. Massey and S. Serconek, Linear complexity of periodic sequences: a general
theory, In Advances in Cryptology—CRYPTO ’96, volume 1109 of Lecture Notes
in Comput. Sci., 358–371, Springer, Berlin, 1996. <308, 310, 328, 336>
[2014] E. D. Mastrovito, VLSI designs for multiplication over finite field GF(2^m), In Proc.
Sixth International Conference on Applied Algebra, Algebraic Algorithms and
Error-Correcting Codes (AAECC-6), 297–309, 1988. <816, 823>
[2015] A. M. Masuda, L. Moura, D. Panario, and D. Thomson, Low complexity normal
elements over finite fields of characteristic two, IEEE Trans. Comput. 57 (2008)
990–1001. <38, 39, 49, 117, 123, 128>
[2016] A. M. Masuda and D. Panario, Sequences of consecutive smooth polynomials over a
finite field, Proc. Amer. Math. Soc. 135 (2007) 1271–1277. <500>
[2017] A. M. Masuda and D. Panario, Tópicos de Corpos Finitos com Aplicações em
Criptografia e Teoria de Códigos, Publicações Matemáticas do IMPA. [IMPA
Mathematical Publications]. Instituto Nacional de Matemática Pura e Aplicada
(IMPA), Rio de Janeiro, 2007, 26º Colóquio Brasileiro de Matemática. [26th
[2057] W. Meidl, Reducing the calculation of the linear complexity of u2^v-periodic binary
sequences to Games-Chan algorithm, Des. Codes Cryptogr. 46 (2008) 57–65.
<329, 336>
[2058] W. Meidl and H. Niederreiter, Counting functions and expected values for the k-error
linear complexity, Finite Fields Appl. 8 (2002) 142–154. <331, 336>
[2059] W. Meidl and H. Niederreiter, Linear complexity, k-error linear complexity, and the
discrete Fourier transform, J. Complexity 18 (2002) 87–103. <331, 336>
[2060] W. Meidl and H. Niederreiter, On the expected value of the linear complexity and the
k-error linear complexity of periodic sequences, IEEE Trans. Inform. Theory 48
(2002) 2817–2825. <331, 336>
[2061] W. Meidl and H. Niederreiter, The expected value of the joint linear complexity of
periodic multisequences, J. Complexity 19 (2003) 61–72. <328, 331, 336>
[2062] W. Meidl and H. Niederreiter, Periodic sequences with maximal linear complexity
and large k-error linear complexity, Appl. Algebra Engrg. Comm. Comput. 14
(2003) 273–286. <331, 336>
[2063] W. Meidl, H. Niederreiter, and A. Venkateswarlu, Error linear complexity measures
for multisequences, J. Complexity 23 (2007) 169–192. <331, 336>
[2064] W. Meidl and F. Özbudak, Linear complexity over F_q and over F_{q^m} for linear recurring
sequences, Finite Fields Appl. 15 (2009) 110–124. <325, 336>
[2065] W. Meidl and A. Winterhof, Lower bounds on the linear complexity of the discrete
logarithm in finite fields, IEEE Trans. Inform. Theory 47 (2001) 2807–2811.
<333, 336>
[2066] W. Meidl and A. Winterhof, Linear complexity and polynomial degree of a func-
tion over a finite field, In Finite Fields with Applications to Coding Theory,
Cryptography and Related Areas, 229–238, Springer, Berlin, 2002. <329, 336>
[2067] W. Meidl and A. Winterhof, On the linear complexity profile of explicit nonlinear
pseudorandom numbers, Inform. Process. Lett. 85 (2003) 13–18. <332, 336>
[2068] W. Meidl and A. Winterhof, On the autocorrelation of cyclotomic generators, In
Finite Fields and Applications, volume 2948 of Lecture Notes in Comput. Sci.,
1–11, Springer, Berlin, 2004. <172, 185>
[2069] W. Meidl and A. Winterhof, On the linear complexity profile of some new explicit
inversive pseudorandom numbers, J. Complexity 20 (2004) 350–355. <332, 336>
[2070] W. Meidl and A. Winterhof, On the joint linear complexity profile of explicit inversive
multisequences, J. Complexity 21 (2005) 324–336. <332, 336>
[2071] W. Meidl and A. Winterhof, Some notes on the linear complexity of Sidel'nikov-
Lempel-Cohn-Eastman sequences, Des. Codes Cryptogr. 38 (2006) 159–178.
<334, 336>
[2072] W. Meidl and A. Winterhof, On the linear complexity profile of nonlinear congruential
pseudorandom number generators with Rédei functions, Finite Fields Appl. 13
(2007) 628–634. <333, 336>
[2073] W. Meier, E. Pasalic, and C. Carlet, Algebraic attacks and decomposition of Boolean
functions, In Advances in Cryptology—EUROCRYPT 2004, volume 3027 of Lecture
Notes in Comput. Sci., 474–491, Springer, Berlin, 2004. <247, 252>
[2074] W. Meier and O. Staffelbach, Fast correlation attacks on stream ciphers, In
D. Barstow, W. Brauer, P. Brinch Hansen, D. Gries, D. Luckham, C. Moler,
A. Pnueli, G. Seegmüller, J. Stoer, N. Wirth, and C. Günther, editors, Advances
in Cryptology—EUROCRYPT’88, volume 330 of Lecture Notes in Comput. Sci.,
301–314, Springer, Berlin, 1988. <630, 642>
[2075] W. Meier and O. Staffelbach, Fast correlation attacks on certain stream ciphers, J.
Cryptology 1 (1989) 159–176. <247, 252>
[2076] A. Menezes, Elliptic Curve Public Key Cryptosystems, The Kluwer International
Series in Engineering and Computer Science, 234. Kluwer Academic Publishers,
Boston, MA, 1993. <31, 32>
[2077] A. Menezes, I. Blake, X.-H. Gao, R. Mullin, S. Vanstone, and T. Yaghoobian, Ap-
plications of Finite Fields, The Springer International Series in Engineering and
Computer Science, Vol. 199, Springer, 1993. <13, 31, 32, 60, 61, 63, 64, 65, 66,
71, 73, 113, 116>
[2078] A. Menezes and M. Qu, Analysis of the Weil descent attack of Gaudry, Hess and
Smart, In Topics in Cryptology—CT-RSA 2001, volume 2020 of Lecture Notes
in Comput. Sci., 308–318, Springer, Berlin, 2001. <810, 811>
[2079] A. J. Menezes, T. Okamoto, and S. A. Vanstone, Reducing elliptic curve logarithms
to logarithms in a finite field, IEEE Trans. Inform. Theory 39 (1993) 1639–1646.
<439, 440, 793, 796>
[2080] A. J. Menezes, P. C. van Oorschot, and S. A. Vanstone, Handbook of Applied Cryp-
tography, CRC Press Series on Discrete Mathematics and its Applications. CRC
Press, Boca Raton, FL, 1997. <31, 32, 46, 49, 346, 347, 351, 352, 363, 393, 401,
750, 758, 763, 785, 796, 822, 823>
[2081] G. Menichetti, On a Kaplansky conjecture concerning three-dimensional division
algebras over a finite field, J. Algebra 47 (1977) 400–410. <277, 278>
[2082] G. Menichetti, n-dimensional algebras over a field with a cyclic extension of degree
n, Geom. Dedicata 63 (1996) 69–94. <277, 278>
[2083] P. Merkey and E. Posner, Optimum cyclic redundancy codes for noisy channels, IEEE
Trans. Inform. Theory 30 (1984) 865–867. <633, 635, 638, 639, 642>
[2084] “Mersenne Research, Inc.,” GIMPS Home, http://www.mersenne.org, as viewed in
July, 2012. <46, 49>
[2085] S. Mesnager, Improving the lower bound on the higher order nonlinearity of Boolean
functions with prescribed algebraic immunity, IEEE Trans. Inform. Theory 54
(2008) 3656–3662. <248, 252>
[2086] S. Mesnager, A new class of bent and hyper-bent Boolean functions in polynomial
forms, Des. Codes Cryptogr. 59 (2011) 265–279. <154, 161>
[2087] J.-F. Mestre, Construction des courbes de genre 2 à partir de leurs modules, Progress
in Mathematics 94 (1991) 313–334. <803>
[2088] J.-F. Mestre, Lettre adressée à Gaudry et Harley,
http://www.math.jussieu.fr/~mestre/lettreGaudryHarley.ps, 2000. <788, 796>
[2089] J.-F. Mestre, Algorithmes pour compter des points de courbes en petite caractéristique
et en petit genre, Available at http://www.math.jussieu.fr/~mestre/, 2002.
<454, 456>
[2090] P. Meyer, Eine Charakterisierung vollständig regulärer, abelscher Erweiterungen, Abh.
Math. Sem. Univ. Hamburg 68 (1998) 199–223. <129, 130, 138>
[2091] H. Meyn, On the construction of irreducible self-reciprocal polynomials over finite
fields, Appl. Algebra Engrg. Comm. Comput. 1 (1990) 43–53. <57, 59, 60, 61,
64, 66, 286, 290>
[2092] P. Michel, Some recent applications of Kloostermania, In Physics and Number Theory,
volume 10 of IRMA Lect. Math. Theor. Phys., 225–251, Eur. Math. Soc., Zürich,
2006. <156, 161>
[2093] T. Migler, K. E. Morrison, and M. Ogle, How much does a matrix of rank k weigh?,
[2151] O. Moreno and C. J. Moreno, Improvements of the Chevalley-Warning and the Ax-
Katz theorems, Amer. J. Math. 117 (1995) 241–244. <200, 201, 207, 210, 213,
481, 488>
[2152] O. Moreno and I. Rubio, Cyclic decomposition of monomial permutations, Congr.
Numer. 73 (1990) 147–158. <229, 230>
[2153] O. Moreno, K. W. Shum, F. N. Castro, and P. V. Kumar, Tight bounds for Chevalley-
Warning-Ax-Katz type estimates, with improved applications, Proc. London
Math. Soc., 3rd Ser. 88 (2004) 545–564. <481, 488>
[2154] M. Morf, Doubling algorithms for Toeplitz and related equations, In Proc. 1980
Int’l Conf. Acoustics Speech and Signal Processing, 954–959, Denver, Colo., 1980.
<533, 535>
[2155] I. H. Morgan, Construction of complete sets of mutually equiorthogonal frequency
hypercubes, Discrete Math. 186 (1998) 237–251. <554, 556>
[2156] I. H. Morgan and G. L. Mullen, Primitive normal polynomials over finite fields, Math.
Comp. 63 (1994) 759–765, S19–S23. <88, 90>
[2157] I. H. Morgan and G. L. Mullen, Completely normal primitive basis generators of finite
fields, Utilitas Math. 49 (1996) 21–43. <89, 90, 95, 137, 138>
[2158] I. H. Morgan, G. L. Mullen, and M. Živković, Almost weakly self-dual bases for finite
fields, Appl. Algebra Engrg. Comm. Comput. 8 (1997) 25–31. <89, 90, 107, 109>
[2159] J. P. Morgan, Nested designs, In Design and Analysis of Experiments, volume 13 of
Handbook of Statist., 939–976, North-Holland, Amsterdam, 1996. <595, 599>
[2160] M. Morgenstern, Existence and explicit constructions of q + 1 regular Ramanujan
graphs for every prime power q, J. Combin. Theory, Ser. B 62 (1994) 44–62.
<655, 658>
[2161] R. Mori and T. Tanaka, Performance and construction of polar codes on symmetric
binary-input memoryless channels, In Proc. IEEE Int. Symp. Information Theory
ISIT 2009, 1496–1500, 2009. <735, 738, 739>
[2162] R. Mori and T. Tanaka, Performance of polar codes with the construction using
density evolution, IEEE Comm. Letters 13 (2009) 519–521. <739>
[2163] R. Mori and T. Tanaka, Non-binary polar codes using Reed-Solomon codes and
algebraic geometry codes, In Proc. Information Theory Workshop, 1–5, 2010.
<739>
[2164] M. Morii and M. Kasahara, Generalized key-equation of remainder decoding algorithm
for Reed-Solomon codes, IEEE Trans. Inform. Theory 38 (1992) 1801–1807.
<695, 703>
[2165] B. Morlaye, Équations diagonales non homogènes sur un corps fini, C. R. Acad. Sci.
Paris, Sér. A-B 272 (1971) A1545–A1548. <208, 213>
[2166] K. E. Morrison, Integer sequences and matrices over finite fields, J. Integer Seq. 9
(2006) Article 06.2.1, 28 pp. <502, 510>
[2167] E. Mortenson, Modularity of a certain Calabi-Yau threefold and combinatorial con-
gruences, Ramanujan J. 11 (2006) 5–39. <141, 161>
[2168] M. J. Mossinghoff, Wieferich pairs and Barker sequences, Des. Codes Cryptogr. 53
(2009) 149–163. <604, 607>
[2169] C. Mulcahy, Card colm, Mathematical Association of America Online,
http://www.maa.org/columns/colm/cardcolm.html. <632, 642>
[2170] T. Mulders and A. Storjohann, Rational solutions of singular linear systems, In Pro-
ceedings of the 2000 International Symposium on Symbolic and Algebraic Com-
putation, 242–249, ACM, New York, 2000. <527, 535>
[2171] G. Mullen and H. Stevens, Polynomial functions (mod m), Acta Math. Hungar. 44
(1984) 237–241. <229, 230>
[2172] G. L. Mullen, Permutation polynomials in several variables over finite fields, Acta
Arith. 31 (1976) 107–111. <230, 232>
[2173] G. L. Mullen, Polynomial representation of complete sets of mutually orthogonal
frequency squares of prime power order, Discrete Math. 69 (1988) 79–84. <553,
556>
[2174] G. L. Mullen, Permutation polynomials and nonsingular feedback shift registers over
finite fields, IEEE Trans. Inform. Theory 35 (1989) 900–902. <231, 232>
[2175] G. L. Mullen, Dickson polynomials over finite fields, Adv. in Math. (China) 20 (1991)
24–32. <226, 230>
[2176] G. L. Mullen, Permutation polynomials over finite fields, In Finite Fields, Coding
Theory, and Advances in Communications and Computing, volume 141 of Lecture
Notes in Pure and Appl. Math., 131–151, Dekker, New York, 1993. <216, 217,
230>
[2177] G. L. Mullen, A candidate for the “next Fermat problem,” Math. Intelligencer 17
(1995) 18–22. <551, 556>
[2178] G. L. Mullen, Permutation polynomials: a matrix analogue of Schur’s conjecture and
a survey of recent results, Finite Fields Appl. 1 (1995) 242–258. <216, 228, 230>
[2179] G. L. Mullen and C. Mummert, Finite Fields and Applications, volume 41 of Student
Mathematical Library, American Mathematical Society, Providence, RI, 2007.
<13, 31, 32>
[2180] G. L. Mullen and D. Panario, Handbook of Finite Fields (Web resource),
http://www.crcpress.com/product/isbn/9781439873786, as viewed in November,
2012. <32, 49>
[2181] G. L. Mullen, D. Panario, and I. E. Shparlinski, editors, Finite Fields and Applications,
volume 461 of Contemp. Math., American Mathematical Society, Providence, RI,
2008. <31, 32>
[2182] G. L. Mullen, A. Poli, and H. Stichtenoth, editors, Finite Fields and Applications,
volume 2948 of Lecture Notes in Comput. Sci., Springer-Verlag, Berlin, 2004.
<31, 32>
[2183] G. L. Mullen and W. C. Schmid, An equivalence between (t, m, s)-nets and strongly
orthogonal hypercubes, J. Combin. Theory, Ser. A 76 (1996) 164–174. <620,
630>
[2184] G. L. Mullen and P. J.-S. Shiue, editors, Finite Fields, Coding Theory, and Advances
in Communications and Computing, volume 141 of Lecture Notes in Pure and
Applied Mathematics, Marcel Dekker Inc., New York, 1993. <31, 32>
[2185] G. L. Mullen and P. J.-S. Shiue, editors, Finite Fields: Theory, Applications, and
Algorithms, volume 168 of Contemp. Math., American Mathematical Society,
Providence, RI, 1994. <31, 32>
[2186] G. L. Mullen and I. E. Shparlinski, Open problems and conjectures in finite fields, In
Finite Fields and Applications, volume 233 of London Math. Soc. Lecture Note
Ser., 243–268, Cambridge Univ. Press, Cambridge, 1996. <73, 88, 89, 90, 96,
97>
[2187] G. L. Mullen, H. Stichtenoth, and H. Tapia-Recillas, editors, Finite Fields with Ap-
plications to Coding Theory, Cryptography and Related Areas, Springer-Verlag,
Berlin, 2002. <31, 32>
[2188] G. L. Mullen, D. Wan, and Q. Wang, Value sets of polynomials maps over finite fields,
[2208] M. R. Murty, Ramanujan graphs, J. Ramanujan Math. Soc. 18 (2003) 33–52. <643,
651, 658>
[2209] D. R. Musser, Multivariate polynomial factorization, J. Assoc. Comput. Mach. 22
(1975) 291–308. <385, 392>
[2210] M. Muzychuk, On Skew Hadamard difference sets, arXiv:1012.2089v1, 2010. <604,
607>
[2211] K.-i. Nagao, Improving group law algorithms for Jacobians of hyperelliptic curves,
In Proceedings of the Fourth International Symposium of Algorithmic Number
Theory— ANTS-IV, volume 1838 of Lecture Notes in Comput. Sci., 439–448,
Springer, Berlin, 2000. <799, 803>
[2212] K.-i. Nagao, Index calculus attack for Jacobian of hyperelliptic curves of small genus
using two large primes, Japan J. Indust. Appl. Math. 24 (2007) 289–305. <798,
799, 803>
[2213] M. Nagata, On Automorphism Group of k[x, y], Kinokuniya Book-Store Co. Ltd.,
Department of Mathematics, Kyoto University, Lectures in Mathematics, No. 5,
Tokyo, 1972. <768, 783>
[2214] S. Najib, Une généralisation de l’inégalité de Stein-Lorenzini, J. Algebra 292 (2005)
566–573. <83, 85>
[2215] A. Naldi, D. Thieffry, and C. Chaouiya, Decision diagrams for the representation and
analysis of logical models of genetic networks, In CMSB’07: Proceedings of the
2007 International Conference on Computational Methods in Systems Biology,
233–247, Springer-Verlag, Berlin, Heidelberg, 2007. <829, 834>
[2216] A. H. Namin, H. Wu, and M. Ahmadi, A new finite field multiplier using redundant
representation, IEEE Trans. Comput. 57 (2008) 716–720. <822, 823>
[2217] Y. Nawaz and G. Gong, The WG stream cipher, 2005, preprint available at
http://www.cacr.math.uwaterloo.ca/techreports/2005/cacr2005-15.pdf.
<751, 755, 756, 757, 763>
[2218] M. Nazarathy, S. Newton, R. Giffard, D. Moberly, F. Sischka, W. Trutna, Jr., and
S. Foster, Real-time long range complementary correlation optical time domain
reflectometer, IEEE J. Lightwave Technology 7 (1989) 24–38. <843, 849>
[2219] V. I. Nechaev, On the complexity of a deterministic algorithm for a discrete logarithm,
Mat. Zametki 55 (1994) 91–101, 189. <394, 401>
[2220] NESSIE: New European Schemes for Signatures, Integrity, and Encryption. Informa-
tion Society Technologies programme of the European commission (IST-1999-
12324), http://www.cryptonessie.org/. <773, 783>
[2221] J. C. Néto, A. F. Tenca, and W. V. Ruggiero, A parallel k-partition method to perform
Montgomery multiplication, In Proc. ASAP-2011, 251–254, 2011. <822, 823>
[2222] E. Netto, Zur Theorie der Tripelsysteme, Math. Ann. 42 (1893) 143–152. <590, 599>
[2223] P. M. Neumann, The Mathematical Writings of Evariste Galois, European Mathe-
matical Society, Zurich, 2011. <14, 32>
[2224] T. Neumann, Bent Functions, PhD thesis, Department of Mathematics, University
of Kaiserslautern, Germany, 2006. <273>
[2225] D. K. Nguyen and B. Schmidt, Fast computation of Gauss sums and resolution of the
root of unity ambiguity, Acta Arith. 140 (2009) 205–232. <141, 161>
[2226] X. Nie, L. Hu, J. Li, C. Updegrove, and J. Ding, Breaking a new instance of TTM
cryptosystems., In ACNS, volume 3989 of Lecture Notes in Comput. Sci., 210–
225, Springer, Berlin, 2006. <775, 783>
[2227] H. Niederreiter, Permutation polynomials in several variables over finite fields, Proc.
[2284] H. Niederreiter and C. P. Xing, Towers of global function fields with asymptotically
many rational places and an improvement on the Gilbert-Varshamov bound,
Math. Nachr. 195 (1998) 171–186. <711, 712>
[2285] H. Niederreiter, C. P. Xing, and K. Y. Lam, A new construction of algebraic-geometry
codes, Appl. Algebra Engrg. Comm. Comput. 9 (1999) 373–381. <705, 706, 712>
[2286] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information,
Cambridge University Press, Cambridge, 2000. <835, 841>
[2287] Y. Niho, Multi-Valued Cross-Correlation Functions Between Two Maximal Linear
Recursive Sequences, PhD thesis, Univ. Southern California, 1972. <261>
[2288] Y. Niitsuma, Counting points of the curve $y^2 = x^{12} + a$ over a finite field, Tokyo J.
Math. 31 (2008) 59–94. <149, 161>
[2289] A. Nilli, On the second eigenvalue of a graph, Discrete Math. 91 (1991) 207–210.
<646, 658>
[2290] A. Nilli, Tight estimates for eigenvalues of regular graphs, Electron. J. Combin. 11
(2004) Note 9, 4 pp. <646, 647, 658>
[2291] A. Nimbalker, T. K. Blankenship, B. Classon, T. E. Fuja, and D. J. Costello, Jr.,
Contention-free interleavers, In Proc. 2004 IEEE International Symposium on
Information Theory, 54, Chicago, IL, 2004. <726, 727>
[2292] NIST, Digital signature standard (DSS), Federal Information Processing Standards
Publication 186-3, National Institute of Standards and Technology, 2009. <785,
787, 796>
[2293] I. Niven, Fermat’s theorem for matrices, Duke Math. J. 15 (1948) 823–826. <501,
510>
[2294] J.-S. No, S. W. Golomb, G. Gong, H.-K. Lee, and P. Gaal, Binary pseudorandom
sequences of period $2^n - 1$ with ideal autocorrelation, IEEE Trans. Inform.
Theory 44 (1998) 814–817. <755, 756, 763>
[2295] J. S. No and P. V. Kumar, A new family of binary pseudorandom sequences hav-
ing optimal periodic correlation properties and large linear span, IEEE Trans.
Inform. Theory IT-35 (1989) 371–379. <321, 324>
[2296] W. Nöbauer, On the length of cycles of polynomial permutations, In Contributions
to General Algebra, 3, 265–274, Hölder-Pichler-Tempsky, Vienna, 1985. <229,
230>
[2297] E. Noether, Normalbasis bei Körpern ohne höhere Verzweigung, J. Reine Angew.
Math. 167 (1932) 147–152. <110, 116>
[2298] A. W. Nordstrom and J. P. Robinson, An optimum nonlinear code, Information and
Control 11 (1967) 613–616. <701, 702, 703>
[2299] M. Noro and K. Yokoyama, Yet another practical implementation of polynomial fac-
torization over finite fields, In ISSAC ’02: Proceedings of the 2002 International
Symposium on Symbolic and Algebraic Computation, 200–206, ACM, 2002. <387,
392>
[2300] A. Nowicki, W. Secomski, J. Litniewski, I. Trots, and P. A. Lewin, On the application
of signal compression using Golay’s codes sequences in ultrasonic diagnostic,
Arch. Acoustics 28 (2003) 313–324. <843, 849>
[2301] M. Nüsken and M. Ziegler, Fast multipoint evaluation of bivariate polynomials, In
Twelfth Annual European Symposium on Algorithms (ESA), volume 3221 of Lec-
ture Notes in Comput. Sci., 544–555, Springer, Berlin, 2004. <381, 382>
[2302] K. Nyberg, Perfect nonlinear S-boxes, In Advances in Cryptology—EUROCRYPT
’91, volume 547 of Lecture Notes in Comput. Sci., 378–386, Springer, Berlin,
[2339] A. M. Ostrowski, On the significance of the theory of convex polyhedra for formal
algebra, ACM SIGSAM Bull. 33 (1999) 5, Translated from [2338]. <388, 392>
[2340] P. Oswald and M. A. Shokrollahi, Capacity-achieving sequences for the erasure chan-
nel, IEEE Trans. Inform. Theory 48 (2002) 3017–3028. <729, 734>
[2341] F. Özbudak, On maximal curves and linearized permutation polynomials over finite
fields, J. Pure Appl. Algebra 162 (2001) 87–102. <239, 240>
[2342] C. Paar, A new architecture for a parallel finite field multiplier with low complexity
based on composite fields, IEEE Trans. Comput. 45 (1996) 856–861. <813, 814,
823>
[2343] L. J. Paige, Neofields, Duke Math. J. 16 (1949) 39–60. <28, 32>
[2344] R. Paley, On orthogonal matrices, J. Math. Phys., Mass. Inst. Techn. 12 (1933)
311–320. <170, 185>
[2345] R. E. A. C. Paley, On orthogonal matrices, J. Math. Phys 12 (1933) 311–320. <609,
619>
[2346] V. Y. Pan, Structured Matrices and Polynomials: Unified Superfast Algorithms,
Birkhäuser Boston Inc., Boston, MA, 2001. <532, 535>
[2347] D. Panario, What do random polynomials over finite fields look like?, In Finite
Fields and Applications, volume 2948 of Lecture Notes in Comput. Sci., 89–108,
Springer, Berlin, 2004. <365, 374>
[2348] D. Panario, X. Gourdon, and P. Flajolet, An analytic approach to smooth polynomials
over finite fields, In Algorithmic Number Theory, volume 1423 of Lecture Notes
in Comput. Sci., 226–236, Springer, Berlin, 1998. <399, 401>
[2349] D. Panario, B. Pittel, B. Richmond, and A. Viola, Analysis of Rabin’s irreducibility
test for polynomials over finite fields, Random Structures Algorithms 19 (2001)
525–551. <369, 374, 376, 380>
[2350] D. Panario and B. Richmond, Analysis of Ben-Or’s polynomial irreducibility test,
Random Structures Algorithms 13 (1998) 439–456. <369, 374, 377, 378, 380>
[2351] D. Panario and B. Richmond, Exact largest and smallest size of components, Algo-
rithmica 31 (2001) 413–432. <371, 374>
[2352] D. Panario and B. Richmond, Smallest components in decomposable structures: exp-
log class, Algorithmica 29 (2001) 205–226. <371, 374>
[2353] D. Panario, A. Sakzad, B. Stevens, and Q. Wang, Two new measures for permutations:
ambiguity and deficiency, IEEE Trans. Inform. Theory 57 (2011) 7648–7657.
<229, 230>
[2354] D. Panario, O. Sosnovski, B. Stevens, and Q. Wang, Divisibility of polynomials over
finite fields and combinatorial applications, Des. Codes Cryptogr. 63 (2012) 425–
445. <631, 641, 642>
[2355] D. Panario, B. Stevens, and Q. Wang, Ambiguity and deficiency in Costas arrays and
APN permutations, In LATIN 2010: Theoretical Informatics, volume 6034 of
Lecture Notes in Comput. Sci., 397–406, Springer, Berlin, 2010. <229, 230>
[2356] D. Panario and D. Thomson, Efficient pth root computations in finite fields of char-
acteristic p, Des. Codes Cryptogr. 50 (2009) 351–358. <37, 49, 71, 73>
[2357] D. Panario and A. Viola, Analysis of Rabin’s polynomial irreducibility test, In
LATIN’98: theoretical informatics (Campinas, 1998), volume 1380 of Lecture
Notes in Comput. Sci., 1–10, Springer, Berlin, 1998. <376, 380>
[2358] G. Panella, Caratterizzazione delle quadriche di uno spazio (tridimensionale) lineare
sopra un corpo finito, Boll. Un. Mat. Ital. Ser. III 10 (1955) 507–513. <588,
589>
[2359] Y. H. Park and J. B. Lee, Permutation polynomials and group permutation polyno-
mials, Bull. Austral. Math. Soc. 63 (2001) 67–74. <221, 230>
[2360] K. R. Parthasarathy, Quantum Computation, Quantum Error Correcting Codes and
Information Theory, Published for the Tata Institute of Fundamental Research,
Mumbai, 2006. <835, 841>
[2361] F. Parvaresh and A. Vardy, Correcting errors beyond the Guruswami-Sudan radius
in polynomial time, In Proceedings of the Forty Sixth Annual IEEE Symposium
on Foundations of Computer Science, 285–294, 2005. <699, 703>
[2362] E. Pasalic, On cryptographically significant mappings over GF($2^n$), In Arithmetic of
Finite Fields, volume 5130 of Lecture Notes in Comput. Sci., 189–204, Springer,
Berlin, 2008. <226, 230>
[2363] E. Pasalic and P. Charpin, Some results concerning cryptographically significant
mappings over GF($2^n$), Des. Codes Cryptogr. 57 (2010) 257–269. <226, 230>
[2364] J. Patarin, Cryptanalysis of the Matsumoto and Imai public key scheme of Eurocrypt
’88, In Advances in Cryptology—CRYPTO ’95, volume 963 of Lecture Notes in
Comput. Sci., 248–261, Springer, Berlin, 1995. <770, 776, 777, 783>
[2365] J. Patarin, Asymmetric cryptography with a hidden monomial and a candidate
algorithm for ≃ 64 bits asymmetric signatures, In Advances in Cryptology—CRYPTO
’96, volume 1109 of Lecture Notes in Comput. Sci., 45–60, Springer, Berlin, 1996.
<767, 783>
[2366] J. Patarin, Hidden Field Equations (HFE) and Isomorphisms of Polynomials (IP):
two new families of asymmetric algorithms, In Eurocrypt’96, volume 1070 of
Lecture Notes in Comput. Sci., 33–48, Springer, Berlin, 1996. <767, 783>
[2367] J. Patarin, The oil and vinegar signature scheme, 1997, Dagstuhl Workshop on
Cryptography. <770, 783>
[2368] J. Patarin, N. T. Courtois, and L. Goubin, FLASH, a fast multivariate signature
algorithm, In Topics in Cryptology—CT-RSA 2001, volume 2020 of Lecture
Notes in Comput. Sci., 298–307, Springer, Berlin, 2001. <773, 783>
[2369] J. Patarin, N. T. Courtois, and L. Goubin, QUARTZ, 128-bit long digital signa-
tures, In Topics in Cryptology—CT-RSA 2001, volume 2020 of Lecture Notes in
Comput. Sci., 282–297, Springer, Berlin, 2001. <770, 783>
[2370] J. Patarin, L. Goubin, and N. T. Courtois, $C^{*}_{-+}$ and HM: Variations around two
schemes of T. Matsumoto and H. Imai, In Asiacrypt’98, volume 1514, 35–49,
Springer, Berlin, 1998. <772, 773, 778, 783>
[2371] J. Patarin, L. Goubin, and N. T. Courtois, Improved algorithms for Isomorphisms
of Polynomials, In Eurocrypt’98, volume 1403, 184–200, Berlin, 1998, Springer.
<767, 783>
[2372] K. G. Paterson, Applications of exponential sums in communications theory, In
Cryptography and Coding, volume 1746 of Lecture Notes in Comput. Sci., 1–24,
Springer, Berlin, 1999. <179, 180, 185>
[2373] S. Paulus and H.-G. Rück, Real and imaginary quadratic representations of hyperel-
liptic function fields, Math. Comp. 68 (1999) 1233–1241. <448, 451, 452, 456>
[2374] S. E. Payne, Spreads, flocks, and generalized quadrangles, J. Geom. 33 (1988) 113–
128. <567, 574>
[2375] F. Pellarin, Values of certain L-series in positive characteristic, Ann. of Math., 2nd
Ser. 176 (2012) 2055–2093. <545, 546>
[2376] A. Pellet, Sur les fonctions irréducibles suivant un module premier, C.R. Acad. Sci.
Paris 93 (1881) 1065–1066. <60, 66>
[2377] A. Pellet, Sur les fonctions réduites suivant un module premier, Bull. Soc. Math.
France 17 (1889) 156–167. <63, 66>
[2378] A. E. Pellet, Sur les fonctions irréductibles suivant un module premier et une fonction
modulaire, C. R. Acad. Sci. Paris 70 (1870) 328–330. <62, 63, 66, 71, 73>
[2379] A. E. Pellet, Sur la décomposition d’une fonction entière en facteurs irréductibles
suivant un module premier, C. R. Acad. Sci. Paris 86 (1878) 1071–1072. <66,
67, 70>
[2380] R. Pellikaan, B.-Z. Shen, and G. J. M. van Wee, Which linear codes are algebraic-
geometric?, IEEE Trans. Inform. Theory 37 (1991) 583–602. <710, 712>
[2381] J. Pelzl, T. Wollinger, and C. Paar, High performance arithmetic for hyperelliptic
curve cryptosystems of genus two, Information Technology: Coding and Com-
puting (ITCC) 2 (2004) 513–517. <799, 803>
[2382] T. Penttila and G. F. Royle, Sets of type (m, n) in the affine and projective planes of
order nine, Des. Codes Cryptogr. 6 (1995) 229–245. <571, 574>
[2383] T. Penttila and B. Williams, Ovoids of parabolic spaces, Geom. Dedicata 82 (2000)
1–19. <282>
[2384] G. I. Perel'muter, Estimate of a sum along an algebraic curve, Mat. Zametki 5 (1969)
373–380. <168, 169>
[2385] S. Perlis, Normal bases of cyclic fields of prime-power degree, Duke Math J. 9 (1942)
507–517. <112, 116>
[2386] C. Pernet and A. Storjohann, Faster algorithms for the characteristic polynomial, In
ISSAC 2007, 307–314, ACM, New York, 2007. <529, 535>
[2387] L. Perret, A fast cryptanalysis of the isomorphism of polynomials with one secret
problem, In Advances in Cryptology—EUROCRYPT 2005, volume 3494 of Lec-
ture Notes in Comput. Sci., 354–370, Springer, Berlin, 2005. <767, 783>
[2388] O. Perron, Bemerkungen über die Verteilung der quadratischen Reste, Math. Z. 56
(1952) 122–130. <319, 324>
[2389] W. W. Peterson, Error-Correcting Codes, The M.I.T. Press, Cambridge, Mass., 1961.
<661, 674, 692, 703>
[2390] W. W. Peterson and E. J. Weldon, Jr., Error-Correcting Codes, The M.I.T. Press,
Cambridge, Mass., second edition, 1972. <316, 317, 661, 673, 674, 681, 686, 688,
692, 693, 696, 697, 703>
[2391] K. Petr, Über die Reduzibilität eines Polynoms mit ganzzahligen Koeffizienten nach
einem Primzahlmodul, Časopis pro pěstování matematiky a fysiky 66 (1937)
85–94. <375, 380, 381, 382>
[2392] E. Petterson, Über die Irreduzibilität ganzzahliger Polynome nach einem
Primzahlmodul, J. Reine Angew. Math. 175 (1936) 209–220. <60, 62, 66>
[2393] D. Pierce and M. J. Kallaher, A note on planar functions and their planes, Bull. Inst.
Combin. Appl. 42 (2004) 53–75. <279, 282>
[2394] J. Pila, Frobenius maps of abelian varieties and finding roots of unity in finite fields,
Math. Comp. 55 (1990) 745–763. <490, 491, 803>
[2395] A. Pincin, Bases for finite fields and a canonical decomposition for a normal basis
generator, Comm. Algebra 17 (1989) 1337–1352. <111, 116>
[2396] N. Pippenger, On the evaluation of powers and monomials, SIAM J. Comput. 9
(1980) 230–250. <357, 363>
[2397] F. Piroi and A. Winterhof, Quantum period reconstruction of binary sequences, In
Applied Algebra, Algebraic Algorithms and Error-Correcting Codes, volume 3857
of Lecture Notes in Comput. Sci., 60–67, Springer, Berlin, 2006. <839, 841>
[2398] G. Pirsic, J. Dick, and F. Pillichshammer, Cyclic digital nets, hyperplane nets, and
multivariate integration in Sobolev spaces, SIAM J. Numer. Anal. 44 (2006)
385–411. <624, 630>
[2399] N. L. Pitcher, Efficient Point-Counting on Genus-2 Hyperelliptic Curves, ProQuest
LLC, Ann Arbor, MI, 2009, Thesis (Ph.D.)–University of Illinois at Chicago.
<454, 456>
[2400] D. A. Plaisted, New NP-hard and NP-complete polynomial and integer divisibility
problems, Theoret. Comput. Sci. 13 (1984) 125–138. <390, 392>
[2401] M. Planat, H. C. Rosu, and S. Perrine, A survey of finite algebraic geometrical
structures underlying mutually unbiased quantum measurements, Found. Phys.
36 (2006) 1662–1680. <835, 841>
[2402] V. Pless, Q-codes, J. Combin. Theory, Ser. A 43 (1986) 258–276. <682, 703>
[2403] V. Pless, Duadic codes and generalizations, In Eurocode ’92, volume 339 of CISM
Courses and Lectures, 3–15, Springer, Vienna, 1993. <682, 703>
[2404] V. Pless, Introduction to the Theory of Error-Correcting Codes, Wiley-Interscience
Series in Discrete Mathematics and Optimization. John Wiley & Sons Inc., New
York, third edition, 1998. <31, 32>
[2405] V. S. Pless, W. C. Huffman, and R. A. Brualdi, editors, Handbook of Coding Theory.
Vol. I, II, North-Holland, Amsterdam, 1998. <31, 32, 661, 683, 690, 691, 703>
[2406] S. C. Pohlig and M. E. Hellman, An improved algorithm for computing logarithms
over GF (p) and its cryptographic significance, IEEE Trans. Inform. Theory 24
(1978) 106–110. <395, 401, 800, 802, 803>
[2407] L. Poinsot, Réflexions sur les principes fondamentaux de la théorie des nombres,
Journal de mathématiques pures et appliquées 10 (1845) 1–101. <71, 73>
[2408] P. Polito and O. Polverino, Linear blocking sets in PG(2, $q^4$), Australas. J. Combin.
26 (2002) 41–48. <561, 563>
[2409] P. Pollack, An explicit approach to hypothesis H for polynomials over a finite field,
In Anatomy of Integers, volume 46 of CRM Proc. Lecture Notes, 259–273, Amer.
Math. Soc., Providence, 2008. <496, 500>
[2410] P. Pollack, A polynomial analogue of the twin primes conjecture, Proc. Amer. Math.
Soc. 136 (2008) 3775–3784. <496, 500>
[2411] P. Pollack, Simultaneous prime specializations of polynomials over finite fields, Proc.
Lond. Math. Soc. 97 (2008) 545–567. <496, 500>
[2412] P. Pollack, Revisiting Gauss’s analogue of the prime number theorem for polynomials
over finite fields, Finite Fields Appl. 16 (2010) 290–299. <494, 500>
[2413] J. M. Pollard, Monte Carlo methods for index computation (mod p), Math. Comp.
32 (1978) 918–924. <397, 401, 746, 750>
[2414] J. M. Pollard, Kangaroos, Monopoly and discrete logarithms, J. Cryptology 13 (2000)
437–447. <397, 401>
[2415] J. M. Pollard and C.-P. Schnorr, An efficient solution of the congruence $x^2 + ky^2 = m$
(mod n), IEEE Trans. Inform. Theory 33 (1987) 702–709. <768, 783>
[2416] O. Polverino, Small minimal blocking sets and complete k-arcs in PG(2, $p^3$), Discrete
Math. 208/209 (1999) 469–476. <562, 563>
[2417] O. Polverino, Small blocking sets in PG(2, $p^3$), Des. Codes Cryptogr. 20 (2000) 319–
324. <561, 562, 563>
[2418] O. Polverino and L. Storme, Small minimal blocking sets in PG(2, $q^3$), European J.
[2501] A. Rudra, Limits to list decoding of random codes, IEEE Trans. Inform. Theory
IT-57 (2011) 1398–1408. <699, 703>
[2502] R. A. Rueppel, Analysis and Design of Stream Ciphers, Communications and Control
Engineering Series. Springer-Verlag, Berlin, 1986. <325, 329, 330, 336>
[2503] R. A. Rueppel, Stream ciphers, In Contemporary Cryptology, 65–134, IEEE, New
York, 1992. <325, 328, 336>
[2504] W. M. Ruppert, Reduzibilität ebener Kurven, J. Reine Angew. Math. 369 (1986)
167–191. <386, 387, 392>
[2505] W. M. Ruppert, Reducibility of polynomials f (x, y) modulo p, J. Number Theory 77
(1999) 62–70. <386, 392>
[2506] J. J. Rushanan, Topics in Integral Matrices and Abelian Group Codes: Generalized Q-
Codes, ProQuest LLC, Ann Arbor, MI, 1986, Thesis (Ph.D.)–California Institute
of Technology. <682, 703>
[2507] F. Ruskey, The Object Server Home Page (COS), http://theory.cs.uvic.ca, as
viewed in July 2012. <46, 49>
[2508] F. Ruskey, C. R. Miers, and J. Sawada, The number of irreducible polynomials and
Lyndon words with given trace, SIAM J. Discrete Math. 14 (2001) 240–245.
<54, 59>
[2509] A. Russell and I. E. Shparlinski, Classical and quantum function reconstruction via
character evaluation, J. Complexity 20 (2004) 404–422. <840, 841>
[2510] I. Z. Ruzsa, Essential components, Proc. London Math. Soc., 3rd Ser. 54 (1987)
38–56. <184, 185>
[2511] W. E. Ryan and S. Lin, Channel Codes: Classical and Modern, Cambridge University
Press, Cambridge, 2009. <661, 703>
[2512] A. Sackmann, M. Heiner, and I. Koch, Application of Petri net based analysis
techniques to signal transduction pathways, BMC Bioinformatics 7 (2006) 482. <829,
834>
[2513] H. R. Sadjadpour, N. J. A. Sloane, M. Salehi, and G. Nebe, Interleaver design for
turbo codes, IEEE J. Select. Areas Commun. 19 (2001) 831–837. <630, 634,
642, 726, 727>
[2514] J. Saez-Rodriguez, L. G. Alexopoulos, J. Epperlein, R. Samaga, D. A. Lauffenburger,
S. Klamt, and P. K. Sorger, Discrete logic modelling as a means to link protein
signalling networks with functional analysis of mammalian signal transduction,
Molecular Systems Biology 5:331 (2009). <825, 834>
[2515] O. Sahin, H. Frohlich, C. Lobke, U. Korf, S. Burmester, M. Majety, J. Mattern,
I. Schupp, C. Chaouiya, D. Thieffry, A. Poustka, S. Wiemann, T. Beissbarth,
and D. Arlt, Modeling ErbB receptor-regulated G1/S transition to find novel
targets for de novo trastuzumab resistance, BMC Systems Biology 3 (2009) 1.
<825, 834>
[2516] S. Sakata, n-dimensional Berlekamp-Massey algorithm for multiple arrays and con-
struction of multivariate polynomials with preassigned zeros, In Applied Algebra,
Algebraic Algorithms and Error-Correcting Codes, volume 357 of Lecture Notes
in Comput. Sci., 356–376, Springer, Berlin, 1989. <329, 336>
[2517] S. Sakata, Extension of the Berlekamp-Massey algorithm to N dimensions, Inform.
and Comput. 84 (1990) 207–239. <329, 336>
[2518] A. Sakzad, M.-R. Sadeghi, and D. Panario, Codes with girth 8 Tanner graph repre-
sentation, Des. Codes Cryptogr. 57 (2010) 71–81. <718, 719>
[2519] A. Sakzad, M. R. Sadeghi, and D. Panario, Cycle structure of permutation functions
over finite fields and their applications, Adv. Math. Commun. 6 (2012) 347–361.
<229, 230, 727>
[2520] A. Sălăgean, On the computation of the linear complexity and the k-error linear
complexity of binary sequences with period a power of two, IEEE Trans. Inform.
Theory 51 (2005) 1145–1150. <329, 336>
[2521] R. Sandler, The collineation groups of some finite projective planes, Portugal. Math.
21 (1962) 189–199. <276, 278>
[2522] P. Sarkar and S. Maitra, Nonlinearity bounds and constructions of resilient Boolean
functions, In Advances in cryptology—CRYPTO 2000 (Santa Barbara, CA),
volume 1880 of Lecture Notes in Comput. Sci., 515–532, Springer, Berlin, 2000.
<247, 252>
[2523] P. Sarnak, Some Applications of Modular Forms, volume 99 of Cambridge Tracts in
Mathematics, Cambridge University Press, Cambridge, 1990. <644, 658>
[2524] P. Sarnak, Kloosterman, quadratic forms and modular forms, Nieuw Arch. Wiskd. 1
(2000) 385–389. <154, 161>
[2525] P. Sarnak, What is . . . an expander?, Notices Amer. Math. Soc. 51 (2004) 762–763.
<643, 646, 658>
[2526] D. Sarwate and M. Pursley, Crosscorrelation properties of pseudorandom and related
sequences, Proceedings of the IEEE 68 (1980) 593–619. <317, 324>
[2527] D. V. Sarwate, An upper bound on the aperiodic autocorrelation function for a
maximal-length sequence, IEEE Trans. Inform. Theory 30 (1984) 685–687. <842,
849>
[2528] T. Sasaki, T. Saito, and T. Hilano, Analysis of approximate factorization algorithm.
I, Japan J. Indust. Appl. Math. 9 (1992) 351–368. <385, 392>
[2529] T. Sasaki and M. Sasaki, A unified method for multivariate polynomial factorizations,
Japan J. Indust. Appl. Math. 10 (1993) 21–39. <385, 392>
[2530] T. Sasaki, M. Suzuki, M. Kolář, and M. Sasaki, Approximate factorization of multi-
variate polynomials and absolute irreducibility testing, Japan J. Indust. Appl.
Math. 8 (1991) 357–375. <385, 392>
[2531] E. Sasoglu, E. Telatar, and E. Arikan, Polarization for arbitrary discrete memoryless
channels, preprint available, http://arxiv.org/abs/0908.0302, 2009. <739>
[2532] E. Sasoglu, E. Telatar, and E. Yeh, Polar codes for the two-user binary-input multiple-
access channel, In Proc. IEEE Information Theory Workshop (ITW), 1–5, 2010.
<739>
[2533] T. Satoh, The canonical lift of an ordinary elliptic curve over a finite field and its
point counting, J. Ramanujan Math. Soc. 15 (2000) 247–270. <491, 788, 796>
[2534] T. Satoh, Generating genus two hyperelliptic curves over large characteristic finite
fields, In Advances in Cryptology - EUROCRYPT 2009, volume 5479 of Lecture
Notes in Comput. Sci., 536–553, Springer, Berlin, 2009. <803>
[2535] T. Satoh and K. Araki, Fermat quotients and the polynomial time discrete log algo-
rithm for anomalous elliptic curves, Comment. Math. Univ. St. Paul. 47 (1998)
81–92. <440, 784>
[2536] E. Savas and Ç. K. Koç, The Montgomery modular inverse—revisited, IEEE Trans.
Comput. 49 (2000) 763–766. <360, 363>
[2537] A. Scheerhorn, Trace and norm-compatible extensions of finite fields, Appl. Algebra
Engrg. Comm. Comput. 3 (1992) 199–209. <130, 138>
[2538] A. Scheerhorn, Iterated constructions of normal bases over finite fields, In Finite
Fields: Theory, Applications, and Algorithms, volume 168 of Contemp. Math.,
309–325, Amer. Math. Soc., Providence, RI, 1994. <130, 138, 286, 290>
[2539] A. Scheerhorn, Dickson polynomials and completely normal elements over finite fields,
In Applications of Finite Fields, volume 59 of Inst. Math. Appl. Conf. Ser. (New
Ser.), 47–55, Oxford Univ. Press, New York, 1996. <138>
[2540] A. Scheerhorn, Dickson polynomials, completely normal polynomials and the cyclic
module structure of specific extensions of finite fields, Des. Codes Cryptogr. 9
(1996) 193–202. <138>
[2541] D. M. Schinianakis, A. P. Fournaris, H. E. Michail, A. P. Kakarountas, and
T. Stouraitis, An RNS implementation of an $\mathbb{F}_p$ elliptic curve point multiplier,
IEEE Transactions on Circuits and Systems I: Regular Papers 56 (2009) 1202–
1213. <822, 823>
[2542] A. Schinzel, Polynomials with Special Regard to Reducibility, volume 77 of Encyclo-
pedia of Mathematics and its Applications, Cambridge University Press, 2000.
<386, 392>
[2543] O. Schirokauer, The special function field sieve, SIAM J. Discrete Math. 16 (2002)
81–98. <399, 401>
[2544] O. Schirokauer, The impact of the number field sieve on the discrete logarithm problem
in finite fields, In Algorithmic Number Theory: Lattices, Number Fields, Curves
and Cryptography, volume 44 of Math. Sci. Res. Inst. Publ., 397–420, Cambridge
Univ. Press, Cambridge, 2008. <399, 401>
[2545] O. Schirokauer, The number field sieve for integers of low weight, Math. Comp. 79
(2010) 583–602. <399, 401>
[2546] B. Schmidt, Characters and Cyclotomic Fields in Finite Geometry, volume 1797 of
Lecture Notes in Mathematics, Springer-Verlag, Berlin, 2002. <600, 601, 607>
[2547] K. Schmidt, Dynamical Systems of Algebraic Origin, volume 128 of Progress in Math-
ematics, Birkhäuser Verlag, Basel, 1995. <337, 344>
[2548] W. M. Schmidt, Equations over Finite Fields. An Elementary Approach, Lecture
Notes in Mathematics, Vol. 536. Springer-Verlag, Berlin, 1976. <31, 32, 176,
185, 193, 194, 199, 201>
[2549] W. M. Schmidt, Construction and estimation of bases in function fields, J. Number
Theory 39 (1991) 181–224. <330, 336>
[2550] T. Schoen and I. Shkredov, Additive properties of multiplicative subgroups of F_p,
Quart. J. Math. 63 (2012) 713–722. <212, 213>
[2551] J. Scholten and H. J. Zhu, Families of supersingular curves in characteristic 2, Math.
Res. Lett. 9 (2002) 639–650. <487, 488>
[2552] J. Scholten and H. J. Zhu, Hyperelliptic curves in characteristic 2, Int. Math. Res.
Not. (2002) 905–917. <484, 487, 488, 799, 803>
[2553] J. Scholten and H. J. Zhu, Slope estimates of Artin-Schreier curves, Compositio Math.
137 (2003) 275–292. <484, 488>
[2554] R. A. Scholtz, The spread spectrum concept, IEEE Trans. Commun. COM-25 (1977)
748–755. <842, 845, 849>
[2555] R. A. Scholtz and L. R. Welch, GMW sequences, IEEE Trans. Inform. Theory 30
(1984) 548–553. <318, 324>
[2556] T. Schönemann, Grundzüge einer allgemeinen Theorie der höheren Congruenzen, deren
Modul eine reelle Primzahl ist, J. Reine Angew. Math. 31 (1845) 269–325. <9,
11>
[2557] A. Schönhage, Schnelle Berechnung von Kettenbruchentwicklungen, Acta Informatica 1 (1971)
139–144. <359, 363>
[2558] A. Schönhage, Schnelle Multiplikation von Polynomen über Körpern der Charakter-
istik 2, Acta Informatica 7 (1977) 395–398. <382>
[2559] A. Schönhage and V. Strassen, Schnelle Multiplikation grosser Zahlen, Computing
(Arch. Elektron. Rechnen) 7 (1971) 281–292. <355, 358, 363, 382>
[2560] R. Schoof, Elliptic curves over finite fields and the computation of square roots mod
p, Math. Comp. 44 (1985) 483–494. <490, 491, 787, 796>
[2561] R. Schoof, Algebraic curves over F_2 with many rational points, J. Number Theory 41
(1992) 6–14. <464, 469>
[2562] B. Schumacher and M. D. Westmoreland, Modal quantum theory, In QPL 2010, 7th
Workshop on Quantum Physics and Logic, 145–149, 2010. <840, 841>
[2563] I. Schur, Über den Zusammenhang zwischen einem Problem der Zahlentheorie und
einem Satz über algebraische Funktionen, S.-B. Preuss. Akad. Wiss. Phys.-Math.
Klasse (1923) 123–134. <239, 240>
[2564] I. Schur, Zur Theorie der einfach transitiven Permutationsgruppen, S.-B. Preuss. Akad.
Wiss. Phys.-Math. Klasse (1933) 598–623. <239, 240>
[2565] R. Schürer, A new lower bound on the t-parameter of (t, s)-sequences, In Monte Carlo
and Quasi-Monte Carlo Methods, 623–632, Springer-Verlag, Berlin, 2008. <626,
630>
[2566] M. P. Schützenberger, A non-existence theorem for an infinite family of symmetrical
block designs, Ann. Eugenics 14 (1949) 286–287. <600, 607>
[2567] Š. Schwarz, Contribution à la réductibilité des polynômes dans la théorie des congruences,
Věstník Královské české spol. nauk. (1939) 1–7. <375, 380>
[2568] Š. Schwarz, A contribution to the reducibility of binomial congruences (Slovak),
Časopis Pěst. Mat. Fys. 71 (1946) 21–31. <61, 62, 66>
[2569] Š. Schwarz, On the reducibility of binomial congruences and on the bound of the least
integer belonging to a given exponent mod p, Časopis Pěst. Mat. Fys. 74 (1949)
1–16. <62, 66>
[2570] Š. Schwarz, On the reducibility of polynomials over a finite field, Quart. J. Math.
Oxford 2 (1956) 110–124. <375, 380>
[2571] Š. Schwarz, On a class of polynomials over a finite field (Russian), Mat.-Fyz. Časopis.
Slovensk. Akad. 10 (1960) 68–80. <63, 66>
[2572] J. Schwinger, Unitary operator bases, Proc. Nat. Acad. Sci. U.S.A. 46 (1960) 570–579.
<835, 841>
[2573] M. Scott, Optimal irreducible polynomials for GF(2^m) arithmetic, In Software Performance
Enhancement for Encryption and Decryption (SPEED 2007), 2007, Available
online (July 2011) http://www.hyperelliptic.org/SPEED/start07.html.
<32, 49, 68, 70, 353, 363>
[2574] E. J. Scourfield, On ideals free of large prime factors, J. Théor. Nombres Bordeaux
16 (2004) 733–772. <399, 401>
[2575] B. Segre, Ovals in a finite projective plane, Canad. J. Math. 7 (1955) 414–416. <584,
589>
[2576] B. Segre, On complete caps and ovaloids in three-dimensional Galois spaces of characteristic
two, Acta Arith. 5 (1959) 315–332. <588, 589>
[2577] B. Segre, Introduction to Galois geometries, Atti Accad. Naz. Lincei Mem. Cl. Sci.
Fis. Mat. Natur. Sez. I Ser. XIII 8 (1967) 133–236. <584, 589>
[2578] G. E. Séguin, Low complexity normal bases for F_{2^{mn}}, Discrete Appl. Math. 28 (1990)
309–312. <119, 128>
[2579] E. S. Selmer, Linear Recurrence Relations over Finite Fields, University of Bergen,
Bergen (Norway), 1966. <312, 317>
[2580] I. Semaev, Construction of polynomials, irreducible over a finite field, with linearly
independent roots, Mat. Sbornik 135 (1988) 520–532, In Russian; English trans-
lation in Math. USSR-Sbornik, 63:507-519, 1989. <111, 116, 119, 128, 379, 380>
[2581] I. A. Semaev, Evaluation of discrete logarithms in a group of p-torsion points of an
elliptic curve in characteristic p, Math. Comp. 67 (1998) 353–356. <440, 784>
[2582] G. Seroussi, Table of low-weight binary irreducible polynomials, Technical Report
HP-98-135, Computer Systems Laboratory, Hewlett Packard, 1998. <33, 35, 49,
348, 363>
[2583] G. Seroussi and A. Lempel, Factorization of symmetric matrices and trace-orthogonal
bases in finite fields, SIAM J. Comput. 9 (1980) 758–767. <103, 109>
[2584] G. Seroussi and A. Lempel, On symmetric representations of finite fields, SIAM J.
Algebraic Discrete Methods 4 (1983) 14–21. <503, 504, 510>
[2585] J.-P. Serre, Géométrie algébrique et géométrie analytique, Ann. Inst. Fourier, Greno-
ble 6 (1955–1956) 1–42. <537, 546>
[2586] J.-P. Serre, Abelian l-adic Representations and Elliptic Curves, McGill University
lecture notes written with the collaboration of Willem Kuyk and John Labute.
W. A. Benjamin, Inc., New York-Amsterdam, 1968. <299, 300, 302>
[2587] J.-P. Serre, Propriétés galoisiennes des points d’ordre fini des courbes elliptiques,
Invent. Math. 15 (1972) 259–331. <300, 302>
[2588] J.-P. Serre, A Course in Arithmetic, volume 7 of Graduate Texts in Mathematics,
Springer-Verlag, New York, 1973. <29>
[2589] J.-P. Serre, Majorations de sommes exponentielles, In Journées Arithmétiques de
Caen, 111–126, Astérisque No. 41–42, Soc. Math. France, Paris, 1977. <169,
650, 658>
[2590] J.-P. Serre, Quelques applications du théorème de densité de Chebotarev, Inst. Hautes
Études Sci. Publ. Math. (1981) 323–401. <300, 302, 438, 440>
[2591] J.-P. Serre, Nombres de points des courbes algébriques sur F_q, In Seminar on Number
Theory, Exp. No. 22, 8, Univ. Bordeaux I, Talence, 1983. <459, 460, 463>
[2592] J.-P. Serre, Sur le nombre des points rationnels d’une courbe algébrique sur un corps
fini, C. R. Acad. Sci. Paris, Sér. I, Math. 296 (1983) 397–402. <459, 463, 464,
469>
[2593] J.-P. Serre, Quel est le nombre maximum de points rationnels que peut avoir une
courbe algébrique de genre g sur un corps fini?, Annuaire du Collège de France
84 (1984) 397–402. <461, 463>
[2594] J.-P. Serre, Répartition asymptotique des valeurs propres de l’opérateur de Hecke T_p,
J. Amer. Math. Soc. 10 (1997) 75–102. <647, 658>
[2595] J.-P. Serre, On a theorem of Jordan, Bull. Amer. Math. Soc. (New Ser.) 40 (2003)
429–440. <300, 302>
[2596] J.-A. Serret, Cours d’algèbre supérieure, Paris: Bachelier, 1849, Second ed. Paris:
Mallet-Bachelier, 1854. Third ed. Paris: Gauthier-Villars, 1866. <8, 11>
[2597] J. A. Serret, Mémoire sur la théorie des congruences suivant un module premier et
suivant une fonction modulaire irréductible, Mém. Acad. Sci., Inst. de France 1
(1866) 617–688. <60, 62, 66>
[2598] J. A. Serret, Détermination des fonctions entières irréductibles, suivant un module
premier, dans le cas où le degré est égal au module, J. Math. Pures Appl. 18
(1873) 301–304. <63, 66>
[2599] J. A. Serret, Sur les fonctions entières irréductibles, suivant un module premier, dans
le cas où le degré est une puissance du module, J. Math. Pures Appl. 18 (1873)
437–451. <63, 66>
[2600] J.-A. Serret, Cours d’Algèbre Supérieure. Tome I, Les Grands Classiques Gauthier-
Villars. [Gauthier-Villars Great Classics]. Éditions Jacques Gabay, Sceaux, 1992,
Reprint of the fourth (1877) edition. <60, 61, 62, 66, 71, 73, 381, 382>
[2601] H. Shacham and B. Waters, editors, Pairing-Based Cryptography — Pairing 2009,
volume 5671 of Lecture Notes in Comput. Sci., Springer-Verlag, Berlin, 2009.
<788, 796>
[2602] I. R. Shafarevich, Basic Algebraic Geometry 1: Varieties in Projective Space, Springer-
Verlag, second edition, 1994. <386, 392>
[2603] R. Shaheen and A. Winterhof, Permutations of finite fields for check digit systems,
Des. Codes Cryptogr. 57 (2010) 361–371. <229, 230>
[2604] A. Shallue and C. E. van de Woestijne, Construction of rational points on elliptic
curves over finite fields, In F. Hess, S. Pauli, and M. Pohst, editors, Algorithmic
Number Theory—ANTS-VII, volume 4076 of Lecture Notes in Comput. Sci.,
510–524, Springer-Verlag, Berlin, 2006. <796>
[2605] C. J. Shallue and I. M. Wanless, Permutation polynomials and orthomorphism poly-
nomials of degree six, preprint, 2012. <217, 230>
[2606] A. Shamir, Efficient signature schemes based on birational permutations, In Crypto,
volume 773 of Lecture Notes in Comput. Sci., 1–12, Springer, Berlin, 1993. <768,
772, 775, 783>
[2607] D. Shanks, Class number, a theory of factorization, and genera, In 1969 Number
Theory Institute, 415–440, Amer. Math. Soc., Providence, RI, 1971. <396, 401>
[2608] C. E. Shannon, A mathematical theory of communication, Bell System Tech. J. 27
(1948) 379–423, 623–656. <661, 684, 703, 726, 727>
[2609] C. E. Shannon, Communication theory of secrecy systems, Bell System Tech. J. 28
(1949) 656–715. <743, 750>
[2610] R. T. Sharifi, On norm residue symbols and conductors, J. Number Theory 86 (2001)
196–209. <146, 161>
[2611] J. T. Sheats, The Riemann hypothesis for the Goss zeta function for F_q[t], J. Number
Theory 71 (1998) 121–157. <544, 546>
[2612] G. B. Sherwood, S. S. Martirosyan, and C. J. Colbourn, Covering arrays of higher
strength from permutation vectors, J. Combin. Des. 14 (2006) 202–213. <610,
619>
[2613] I. P. Shestakov and U. U. Umirbaev, The Nagata automorphism is wild, Proc. Natl.
Acad. Sci. USA 100 (2003) 12561–12563. <768, 783>
[2614] G. Shimura and Y. Taniyama, Complex Multiplication of Abelian Varieties and its
Applications to Number Theory, volume 6 of Publications of the Mathematical
Society of Japan, The Mathematical Society of Japan, Tokyo, 1961. <299, 302>
[2615] K. Shiratani and M. Yamada, On rationality of Jacobi sums, Colloq. Math. 73 (1997)
251–260. <146, 161>
[2616] S. G. Shiva and P. Allard, A few useful details about a known technique for factoring
1 + X^{2^q−1}, IEEE Trans. Inform. Theory IT-16 (1970) 234–235. <62, 66>
[2617] I. Shmulevich, E. R. Dougherty, S. Kim, and W. Zhang, Probabilistic Boolean net-
works: a rule-based uncertainty model for gene regulatory networks, Bioinfor-
matics 18 (2002) 261–274. <829, 834>
[2618] M. A. Shokrollahi, New sequences of linear time erasure codes approaching the channel
two, Appl. Algebra Engrg. Comm. Comput. 14 (2004) 381–395. <93, 95>
[2659] V. M. Sidel’nikov, Some k-valued pseudo-random sequences and nearly equidistant
codes, Problemy Peredači Informacii 5 (1969) 16–22. <319, 324>
[2660] V. M. Sidel’nikov, On mutual correlation of sequences, Soviet Math. Dokl. 12 (1971)
197–201. <320, 324>
[2661] V. M. Sidel’nikov, On the cross correlation of sequences, Problemy Kibernet. (1971)
15–42. <321, 324>
[2662] V. M. Sidel’nikov, Estimates for the number of appearances of elements on an interval
of a recurrent sequence over a finite field, Discrete Math. Appl. 2 (1992) 473–481.
<316, 317>
[2663] T. Siegenthaler, Correlation-immunity of nonlinear combining functions for crypto-
graphic applications, IEEE Trans. Inform. Theory 30 (1984) 776–780. <247,
252>
[2664] M. Sieveking, An algorithm for division of powerseries, Computing 10 (1972) 153–156.
<380, 382>
[2665] D. Silva and F. R. Kschischang, Universal secure network coding via rank-metric
codes, IEEE Trans. Inform. Theory 57 (2011) 1124–1135. <121, 128>
[2666] D. Silva, F. R. Kschischang, and R. Kötter, A rank-metric approach to error control
in random network coding, IEEE Trans. Inform. Theory 54 (2008) 3951–3967.
<848, 849>
[2667] J. H. Silverman, Advanced Topics in the Arithmetic of Elliptic Curves, volume 151
of Graduate Texts in Mathematics, Springer-Verlag, New York, 1994. <31, 32,
422, 440>
[2668] J. H. Silverman, The Arithmetic of Dynamical Systems, volume 241 of Graduate Texts
in Mathematics, Springer, New York, 2007. <337, 338, 344>
[2669] J. H. Silverman, Variation of periods modulo p in arithmetic dynamics, New York J.
Math. 14 (2008) 601–616. <342, 344>
[2670] J. H. Silverman, The Arithmetic of Elliptic Curves, volume 106 of Graduate Texts in
Mathematics, Springer-Verlag, New York, second edition, 2009. <31, 32, 422,
423, 424, 425, 428, 429, 430, 431, 432, 433, 434, 435, 436, 437, 438, 439, 440>
[2671] J. H. Silverman, A survey of local and global pairings on elliptic curves and abelian va-
rieties, In Pairing-Based Cryptography (PAIRING 2010), volume 6478 of Lecture
Notes in Comput. Sci., 377–396, Springer, Berlin, 2010. <435, 440>
[2672] J. H. Silverman and J. Tate, Rational Points on Elliptic Curves, Undergraduate Texts
in Mathematics, Springer-Verlag, New York, 1992. <31, 32, 422, 440>
[2673] M. K. Simon, J. K. Omura, R. A. Scholtz, and B. K. Levitt, Spread Spectrum Com-
munications Handbook, McGraw-Hill, Inc., 1994. <317, 324>
[2674] J. Singer, A theorem in finite projective geometry and some applications to number
theory, Trans. Amer. Math. Soc. 43 (1938) 377–385. <601, 607>
[2675] M. Skalba, Points on elliptic curves over finite fields, Acta Arithmetica 117 (2005)
293–301. <796>
[2676] C. Small, Solution of Waring’s problem mod n, Amer. Math. Monthly 84 (1977)
356–359. <211, 213>
[2677] C. Small, Sums of powers in large finite fields, Proc. Amer. Math. Soc. 65 (1977)
35–36. <211, 213>
[2678] C. Small, Waring’s problem mod n, Amer. Math. Monthly 84 (1977) 12–25. <211,
213>
[2679] C. Small, Diagonal equations over large finite fields, Canad. J. Math. 36 (1984)
249–262. <207, 209, 213>
[2680] C. Small, Permutation binomials, Internat. J. Math. Math. Sci. 13 (1990) 337–342.
<223, 230>
[2681] C. Small, Arithmetic of Finite Fields, volume 148 of Monographs and Textbooks in
Pure and Applied Mathematics, Marcel Dekker Inc., New York, 1991. <31, 32,
206, 207, 213, 223, 230>
[2682] N. P. Smart, The discrete logarithm problem on elliptic curves of trace one, J.
Cryptology 12 (1999) 193–196. <440, 784>
[2683] N. P. Smart, The exact security of ECIES in the generic group model, In B. Honary,
editor, Cryptography and Coding, volume 2260 of Lecture Notes in Comput. Sci.,
73–84, Springer-Verlag, Berlin, 2001. <785, 796>
[2684] N. Smart (ed.), ECRYPT II yearly report on algorithms and keysizes (2009-2010),
Technical Report D.SPA.13, European Network of Excellence in Cryptology II,
2010. <784, 793, 796>
[2685] B. Smeets, The linear complexity profile and experimental results on a randomness
test of sequences over the field F_q, presented at IEEE Int. Symp. on Information
Theory 1988, June 19–24. <329, 330, 336>
[2686] B. Smeets and W. Chambers, Windmill generators: a generalization and an obser-
vation of how many there are, In Advances in Cryptology—EUROCRYPT’88,
volume 330 of Lecture Notes in Comput. Sci., 325–330, Springer, Berlin, 1988.
<69, 70>
[2687] M. H. M. Smid, Duadic codes, IEEE Trans. Inform. Theory 33 (1987) 432–433. <682,
703>
[2688] B. A. Smith, Isogenies and the discrete logarithm problem in Jacobians of genus 3
hyperelliptic curves, J. Cryptology 22 (2009) 505–529. <798, 803>
[2689] S. L. Snover, The Uniqueness of the Nordstrom-Robinson and the Golay Binary Codes,
ProQuest LLC, Ann Arbor, MI, 1973, Thesis (Ph.D.)–Michigan State University.
<701, 703>
[2690] I. M. Sobol’, Distribution of points in a cube and approximate evaluation of integrals
(Russian), Ž. Vyčisl. Mat. i Mat. Fiz. 7 (1967) 784–802. <620, 625, 628, 630>
[2691] M. Soderstrand, W. Jenkins, G. A. Jullien, and F. J. Taylor, Residue Number System
Arithmetic: Modern Applications in Digital Signal Processing, IEEE Press, 1986.
<822, 823>
[2692] P. Solé, A quaternary cyclic code, and a family of quadriphase sequences with low
correlation properties, In Coding Theory and Applications, volume 388 of Lecture
Notes in Comput. Sci., 193–201, Springer, New York, 1989. <322, 324>
[2693] J. A. Solinas, Generalized Mersenne numbers, Combinatorics and Optimization
Research Report CORR 99-39, University of Waterloo, 1999, available
at http://www.cacr.math.uwaterloo.ca/techreports/1999/corr99-39.ps.
<353, 363>
[2694] R. Solovay and V. Strassen, A fast Monte-Carlo test for primality, SIAM J. Comput.
6 (1977) 84–85. <346, 363, 381, 382>
[2695] L. Song and K. K. Parhi, Low-energy digit-serial/parallel finite field multipliers, The
Journal of VLSI Signal Processing 19 (1998) 149–166. <815, 823>
[2696] A. B. Sørensen, Projective Reed-Muller codes, IEEE Trans. Inform. Theory 37 (1991)
1567–1576. <687, 703>
[2697] K. W. Spackman, Simultaneous solutions to diagonal equations over finite fields, J.
[2717] B. Stigler, Polynomial dynamical systems in systems biology, In Modeling and Sim-
ulation of Biological Networks, volume 64 of Proc. Sympos. Appl. Math., 53–84,
Amer. Math. Soc., Providence, RI, 2007. <337, 344>
[2718] D. R. Stinson, On bit-serial multiplication and dual bases in GF(2^m), IEEE Trans.
Inform. Theory 37 (1991) 1733–1736. <107, 109>
[2719] D. R. Stinson, Combinatorial Designs: Constructions and Analysis, Springer-Verlag,
New York, 2004. <31, 32, 599, 619>
[2720] D. R. Stinson, Cryptography: Theory and Practice, Discrete Mathematics and its
Applications. Chapman & Hall/CRC, Boca Raton, FL, third edition, 2006. <31,
32, 393, 401, 750>
[2721] D. R. Stinson, R. Wei, and L. Zhu, New constructions for perfect hash families and
related structures using combinatorial designs and codes, J. Combin. Des. 8
(2000) 189–200. <613, 619>
[2722] K.-O. Stöhr and J. F. Voloch, Weierstrass points and curves over finite fields, Proc.
London Math. Soc., 3rd Ser. 52 (1986) 1–19. <462, 463>
[2723] T. Stoll, Complete decomposition of Dickson-type polynomials and related Diophan-
tine equations, J. Number Theory 128 (2008) 1157–1181. <288, 290>
[2724] R. Stong, The average order of a permutation, Electron. J. Combin. 5 (1998) Research
Paper 41, 6 pp. <373, 374>
[2725] T. Storer, Cyclotomy and Difference Sets, volume 2 of Lectures in Advanced Mathe-
matics, Markham Publishing Co., Chicago, IL, 1967. <603, 604, 607>
[2726] A. Storjohann, Deterministic computation of the Frobenius form (extended abstract),
In Forty Second IEEE Symposium on Foundations of Computer Science, 368–
377, IEEE Computer Soc., Los Alamitos, CA, 2001. <529, 535>
[2727] A. Storjohann and G. Villard, Algorithms for similarity transforms, Technical report,
Rhine Workshop on Computer Algebra, 2000. <529, 535>
[2728] A. J. Stothers, On the Complexity of Matrix Multiplication, PhD thesis, University
of Edinburgh, 2010. <521, 535>
[2729] W. W. Stothers, On permutation polynomials whose difference is linear, Glasgow
Math. J. 32 (1990) 165–171. <228, 230>
[2730] D. R. Stoutemyer, Which polynomial representation is best?, In Proceedings of the
1984 MACSYMA Users’ Conference, 221–243, 1984. <382, 392>
[2731] V. Strassen, Gaussian elimination is not optimal, Numer. Math. 13 (1969) 354–356.
<358, 363>
[2732] V. Strassen, Evaluation of rational functions, In R. E. Miller and J. W. Thatcher,
editors, Complexity of Computer Computations, Plenum Press, 1972. <381, 382>
[2733] V. Strassen, Vermeidung von Divisionen, J. Reine Angew. Math. 264 (1973) 182–202.
<390, 392>
[2734] V. Strassen, Algebraische Berechnungskomplexität, In Perspectives in Mathematics,
Anniversary of Oberwolfach 1984, 509–550, Birkhäuser Verlag, Basel, 1984. <381,
382>
[2735] M. Streng, Computing Igusa class polynomials, preprint available,
http://arxiv.org/abs/0903.4766, 2012. <803>
[2736] S. J. Suchower, Subfield permutation polynomials and orthogonal subfield systems in
finite fields, Acta Arith. 54 (1990) 307–315. <231, 232>
[2737] S. J. Suchower, Polynomial representations of complete sets of frequency hyperrectan-
gles with prime power dimensions, J. Combin. Theory, Ser. A 62 (1993) 46–65.
<554, 556>
[2738] B. Sudakov, E. Szemerédi, and V. H. Vu, On a question of Erdős and Moser, Duke
Math. J. 129 (2005) 129–155. <188, 192>
[2739] M. Sudan, Decoding of Reed Solomon codes beyond the error-correction bound, J.
Complexity 13 (1997) 180–193. <698, 699, 703>
[2740] M. Sugita, M. Kawazoe, and H. Imai, Gröbner basis based cryptanalysis of SHA-1,
Cryptology ePrint Archive, Report 2006/098, 2006, http://eprint.iacr.org/.
<783>
[2741] Y. Sugiyama, M. Kasahara, S. Hirasawa, and T. Namekawa, A method for solving key
equation for decoding Goppa codes, Information and Control 27 (1975) 87–99.
<694, 703>
[2742] J. Sun and O. Y. Takeshita, Interleavers for turbo codes using permutation polyno-
mials over integer rings, IEEE Trans. Inform. Theory 51 (2005) 101–119. <229,
230, 725, 726, 727>
[2743] Q. Sun, The number of solutions of certain diagonal equations over finite fields,
Sichuan Daxue Xuebao 34 (1997) 395–398. <209, 213>
[2744] Q. Sun and D. Q. Wan, On the solvability of the equation $\sum_{i=1}^{n} x_i/d_i \equiv 0 \pmod{1}$
and its application, Proc. Amer. Math. Soc. 100 (1987) 220–224. <208, 209,
213>
[2745] Q. Sun and D. Q. Wan, On the Diophantine equation $\sum_{i=1}^{n} x_i/d_i \equiv 0 \pmod{1}$, Proc.
Amer. Math. Soc. 112 (1991) 25–29. <209, 213>
[2746] Z.-W. Sun, On value sets of polynomials over a field, Finite Fields Appl. 14 (2008)
470–481. <211, 213, 235, 236>
[2747] B. Sunar, A generalized method for constructing subquadratic complexity GF(2^k)
multipliers, IEEE Trans. Comput. 53 (2004) 1097–1105. <814, 823>
[2748] B. Sunar, A Euclidean algorithm for normal bases, Acta Appl. Math. 93 (2006) 57–74.
<360, 363>
[2749] B. Sunar and Ç. K. Koç, Mastrovito multiplier for all trinomials, IEEE Trans.
Comput. 48 (1999) 522–527. <822, 823>
[2750] B. Sunar and Ç. K. Koç, An efficient optimal normal basis type II multiplier, IEEE
Trans. Comput. 50 (2001) 83–87. <821, 823>
[2751] A. V. Sutherland, Genus 1 point-counting record modulo a 5000+ digit prime, 2010,
Posting to the Number Theory List, http://listserv.nodak.edu/cgi-bin/wa.exe?A2=ind1007&L=nmbrthry&T=0&F=&S=&P=287.
<788, 796>
[2752] A. V. Sutherland, On the evaluation of modular polynomials, ArXiv 1202.3985v3, to
appear in the proceedings of the Tenth Algorithmic Number Theory Symposium
ANTS-X, 2012. <788, 796>
[2753] R. G. Swan, Factorization of polynomials over finite fields, Pacific J. Math. 12 (1962)
1099–1106. <34, 49, 67, 68, 70, 96, 97>
[2754] N. Szabo and R. I. Tanaka, Residue Arithmetic and its Application to Computer
Technology, McGraw-Hill, 1967. <352, 363>
[2755] P. Sziklai, On small blocking sets and their linearity, J. Combin. Theory, Ser. A 115
(2008) 1167–1182. <561, 563>
[2756] T. Szőnyi, On the number of directions determined by a set of points in an affine
Galois plane, J. Combin. Theory, Ser. A 74 (1996) 141–146. <558, 563>
[2757] T. Szőnyi, Blocking sets in Desarguesian affine and projective planes, Finite Fields
[2799] “The Mathworks Inc.” MATLAB - The Language of Technical Computing,
http://www.mathworks.com/products/matlab/, as viewed in July 2012. <48, 49>
[2800] “The OEIS Foundation Inc.” The on-line encyclopedia of integer
sequences™ (OEIS™), http://www.oeis.org, as viewed in July, 2012.
<46, 49>
[2801] “The PARI Group,” PARI/GP Development Center,
http://pari.math.u-bordeaux.fr/, as viewed in July, 2012. <47, 49>
[2802] N. Thériault, Index calculus attack for hyperelliptic curves of small genus, In Advances
in Cryptology—ASIACRYPT 2003, volume 2894 of Lecture Notes in Comput.
Sci., 75–92, Springer, Berlin, 2003. <455, 456, 798, 803>
[2803] J. J. Thomas, J. M. Keller, and G. N. Larsen, The calculation of multiplicative inverses
over GF(P) efficiently where P is a Mersenne prime, IEEE Trans. Comput. 35
(1986) 478–482. <360, 363>
[2804] E. Thomé, Fast computation of linear generators for matrix sequences and application
to the block Wiedemann algorithm, In Proceedings of the 2001 International
Symposium on Symbolic and Algebraic Computation, 323–331, ACM, New York,
2001. <535>
[2805] T. M. Thompson, From Error-Correcting Codes Through Sphere Packings to Simple
Groups, volume 21 of Carus Mathematical Monographs, Mathematical Associa-
tion of America, Washington, DC, 1983. <691, 703>
[2806] T. Tian and W. F. Qi, Primitive normal element and its inverse in finite fields, Acta
Math. Sinica (Chin. Ser.) 49 (2006) 657–668. <116>
[2807] T. Tian and W.-F. Qi, Typical primitive polynomials over integer residue rings, Finite
Fields Appl. 15 (2009) 796–807. <90>
[2808] A. Tietäväinen, On systems of linear and quadratic equations in finite fields, Ann.
Acad. Sci. Fenn. Ser. A I No. 382 (1965) 5. <210, 213>
[2809] A. Tietäväinen, On diagonal forms over finite fields, Ann. Univ. Turku. Ser. A I No.
118 (1968) 10. <207, 213>
[2810] A. Tietäväinen, On the nonexistence of perfect codes over finite fields, SIAM J. Appl.
Math. 24 (1973) 88–96. <672, 684, 703>
[2811] A. Tietäväinen, A short proof for the nonexistence of unknown perfect codes over
GF(q), q > 2, Ann. Acad. Sci. Fenn. Ser. A I (1974) 6. <672, 702, 703>
[2812] R. A. H. Toledo, Linear finite dynamical systems, Comm. Algebra 33 (2005) 2977–
2989. <834>
[2813] A. Tonelli, Bemerkung über die Auflösung quadratischer Congruenzen, Nachrichten
von der Königl. Gesellschaft der Wissenschaften und der Georg-Augusts-
Universität zu Göttingen (1891) 344–346. <796>
[2814] A. Topuzoğlu and A. Winterhof, Pseudorandom sequences, In Topics in Geometry,
Coding Theory and Cryptography, volume 6 of Algebr. Appl., 135–166, Springer,
Dordrecht, 2007. <337, 338, 344>
[2815] Á. Tóth, On the evaluation of Salié sums, Proc. Amer. Math. Soc. 133 (2005) 643–645.
<160, 161>
[2816] J. Tromp, L. Zhang, and Y. Zhao, Small weight bases for Hamming codes, In Comput-
ing and Combinatorics, volume 959 of Lecture Notes in Comput. Sci., 235–243,
Springer, Berlin, 1995. <89, 90>
[2817] T. T. Truong, Degree complexity of a family of birational maps. II. Exceptional cases,
Math. Phys. Anal. Geom. 12 (2009) 157–180. <338, 344>
[2818] B. Tsaban and U. Vishne, Efficient linear feedback shift registers with maximal period,
[2838] E. R. van Dam and D. Fon-Der-Flaass, Codes, graphs, and schemes from nonlinear
functions, European J. Combin. 24 (2003) 85–98. <259, 261>
[2839] E. R. van Dam and W. H. Haemers, Eigenvalues and the diameter of graphs, Linear
and Multilinear Algebra 39 (1995) 33–44. <645, 658>
[2840] W. van Dam, S. Hallgren, and L. Ip, Quantum algorithms for some hidden shift
problems, In Proceedings of the Fourteenth Annual ACM-SIAM Symposium on
Discrete Algorithms (Baltimore, MD, 2003), 489–498, ACM, New York, 2003.
<840, 841>
[2841] W. van Dam and I. E. Shparlinski, Classical and quantum algorithms for exponential
congruences, In Theory of Quantum Computation, Communication, and Cryp-
tography, volume 5106 of Lecture Notes in Comput. Sci., 1–10, Springer, Berlin,
2008. <840, 841>
[2842] G. van der Geer and M. van der Vlugt, Reed-Muller codes and supersingular curves.
I, Compositio Math. 84 (1992) 333–367. <487, 488>
[2843] G. van der Geer and M. van der Vlugt, On the existence of supersingular curves of
given genus, J. Reine Angew. Math. 458 (1995) 53–61. <486, 488>
[2844] G. van der Geer and M. van der Vlugt, Quadratic forms, generalized Hamming weights
of codes and curves with many points, J. Number Theory 59 (1996) 20–36. <205,
206>
[2845] G. van der Geer and M. van der Vlugt, An asymptotically good tower of curves over
the field with eight elements, Bull. London Math. Soc. 34 (2002) 291–300. <467,
469>
[2846] G. van der Geer and M. van der Vlugt, Tables of curves with many points, 2009,
http://www.science.uva.nl/~geer/tables-mathcomp21.pdf. <460, 463>
[2847] M. van der Put, A note on p-adic uniformization, Nederl. Akad. Wetensch. Indag.
Math. 49 (1987) 313–318. <544, 546>
[2848] B. L. van der Waerden, A History of Algebra: From al-Khwārizmī to Emmy Noether,
Springer-Verlag, Berlin, 1985. <3, 11>
[2849] J. H. van Lint, Introduction to Coding Theory, volume 86 of Graduate Texts in Math-
ematics, Springer-Verlag, Berlin, third edition, 1999. <31, 32, 586, 589, 661, 663,
668, 670, 671, 672, 673, 674, 683, 684, 685, 686, 689, 690, 703>
[2850] J. H. van Lint and A. Schrijver, Construction of strongly regular graphs, two-weight
codes and partial geometries by finite fields, Combinatorica 1 (1981) 63–73.
<617, 619>
[2851] J. H. van Lint and R. M. Wilson, A Course in Combinatorics, Cambridge University
Press, Cambridge, 1992. <31, 32, 609, 619>
[2852] P. C. van Oorschot and M. J. Wiener, Parallel collision search with cryptanalytic
applications, J. Cryptology 12 (1999) 1–28. <397, 401, 746, 750>
[2853] T. van Trung and S. Martirosyan, New constructions for IPP codes, Des. Codes
Cryptogr. 35 (2005) 227–239. <613, 619>
[2854] P. van Wamelen, New explicit multiplicative relations between Gauss sums, Int. J.
Number Theory 3 (2007) 275–292. <146, 161>
[2855] R. Varshamov, Estimate of the number of signals in error correcting codes, Dokl.
Akad. Nauk. SSSR 117 (1957) 739–741. <671, 702, 703>
[2856] R. Varshamov, A general method of synthesizing irreducible polynomials over Galois
fields, Soviet Math. Dokl. 29 (1984) 334–336. <379, 380>
[2857] R. R. Varshamov, A certain linear operator in a Galois field and its applications
(Russian), Studia Sci. Math. Hungar. 8 (1973) 5–19. <63, 66>
[2898] D. Wan, Rationality of partial zeta functions, Indag. Math. (New Ser.) 14 (2003)
285–292. <198, 201>
[2899] D. Wan, Variation of p-adic Newton polygons for L-functions of exponential sums,
Asian J. Math. 8 (2004) 427–471. <484, 485, 488>
[2900] D. Wan, Mirror symmetry for zeta functions, In Mirror Symmetry V, volume 38 of
AMS/IP Stud. Adv. Math., 159–184, Amer. Math. Soc., Providence, RI, 2006.
<196, 200, 201>
[2901] D. Wan, Algorithmic theory of zeta functions over finite fields, In Algorithmic Number
Theory: Lattices, Number Fields, Curves and Cryptography, volume 44 of Math.
Sci. Res. Inst. Publ., 551–578, Cambridge Univ. Press, Cambridge, 2008. <491>
[2902] D. Wan, Lectures on zeta functions over finite fields, In Higher-Dimensional Geometry
over Finite Fields, volume 16 of NATO Sci. Peace Secur. Ser. D Inf. Commun.
Secur., 244–268, IOS, Amsterdam, 2008. <193, 196, 201>
[2903] D. Wan, Modular counting of rational points over finite fields, Found. Comput. Math.
8 (2008) 597–605. <489, 491>
[2904] D. Q. Wan, On a problem of Niederreiter and Robinson about finite fields, J. Austral.
Math. Soc., Ser. A 41 (1986) 336–338. <228, 230>
[2905] D. Q. Wan, Permutation polynomials over finite fields, Acta Math. Sinica (New Ser.)
3 (1987) 1–5. <218, 223, 230>
[2906] D. Q. Wan, Zeros of diagonal equations over finite fields, Proc. Amer. Math. Soc. 103
(1988) 1049–1052. <209, 210, 213>
[2907] D. Q. Wan, An elementary proof of a theorem of Katz, Amer. J. Math. 111 (1989)
1–8. <199, 201>
[2908] D. Q. Wan, Permutation polynomials and resolution of singularities over finite fields,
Proc. Amer. Math. Soc. 110 (1990) 303–309. <218, 230>
[2909] D. Q. Wan, A generalization of the Carlitz conjecture, In Finite Fields, Coding Theory,
and Advances in Communications and Computing, volume 141 of Lecture Notes
in Pure and Appl. Math., 431–432, Dekker, New York, 1993. <218, 230>
[2910] D. Q. Wan, Newton polygons of zeta functions and L functions, Ann. of Math., 2nd
Ser. 137 (1993) 249–293. <483, 484, 488>
[2911] D. Q. Wan, A p-adic lifting lemma and its applications to permutation polynomials, In
Finite Fields, Coding Theory, and Advances in Communications and Computing,
volume 141 of Lecture Notes in Pure and Appl. Math., 209–216, Dekker, New
York, 1993. <217, 230, 233, 236>
[2912] D. Q. Wan, A classification conjecture about certain permutation polynomials, In
Finite Fields: Theory, Applications and Algorithms, volume 168 of Contemporary
Math., 401–402, American Mathematical Society, Providence, RI, 1994. <228,
230>
[2913] D. Q. Wan, Permutation binomials over finite fields, Acta Math. Sinica (New Ser.)
10 (1994) 30–35. <218, 223, 230>
[2914] D. Q. Wan, A Chevalley-Warning approach to p-adic estimates of character sums,
Proc. Amer. Math. Soc. 123 (1995) 45–54. <199, 201>
[2915] D. Q. Wan, Minimal polynomials and distinctness of Kloosterman sums, Finite Fields
Appl. 1 (1995) 189–203. <154, 161>
[2916] D. Q. Wan and R. Lidl, Permutation polynomials of the form $x^r f(x^{(q-1)/d})$ and their
group structure, Monatsh. Math. 112 (1991) 149–163. <221, 230>
[2917] D. Q. Wan, G. L. Mullen, and P. J.-S. Shiue, Erratum: “The number of permutation
polynomials of the form $f(x) + cx$ over a finite field”, Proc. Edinburgh Math.
<232>
[2958] S. Wei, G. Chen, and G. Xiao, A fast algorithm for determining the linear complexity
of periodic sequences, In Information Security and Cryptology, volume 3822 of
Lecture Notes in Comput. Sci., 202–209, Springer, Berlin, 2005. <329, 336>
[2959] S. Wei, G. Xiao, and Z. Chen, A fast algorithm for determining the linear complexity
of a binary sequence with period $2^n p^m$, Sci. China, Ser. F 44 (2001) 453–460.
<329, 336>
[2960] S. Wei, G. Xiao, and Z. Chen, A fast algorithm for determining the minimal polynomial
of a sequence with period $2p^n$ over GF(q), IEEE Trans. Inform. Theory 48
(2002) 2754–2758. <329, 336>
[2961] A. Weil, On some exponential sums, Proc. Nat. Acad. Sci. U. S. A. 34 (1948) 204–207.
<162, 169>
[2962] A. Weil, Sur les Courbes Algébriques et les Variétés qui s’en Déduisent, Actualités
Sci. Ind., no. 1041; Publ. Inst. Math. Univ. Strasbourg 7 (1945). Hermann et
Cie., Paris, 1948. <162, 169, 497, 500>
[2963] A. Weimerskirch and C. Paar, Generalizations of the Karatsuba algorithm for efficient
implementations, 2006, preprint available, http://eprint.iacr.org/2006/224.
<813, 823>
[2964] L. Welch, Lower bounds on the maximum cross correlation of signals, IEEE Trans.
Inform. Theory 20 (1974) 397–399. <320, 324>
[2965] L. R. Welch and E. R. Berlekamp, Error Correction for Algebraic Block Codes, U. S.
Patent 4,633,470 (1986). <695, 703>
[2966] E. J. Weldon, Jr., Euclidean geometry cyclic codes, In Combinatorial Mathematics
and its Applications, 377–387, Univ. North Carolina Press, Chapel Hill, N.C.,
1969. <689, 696, 703>
[2967] C. Wells, The degrees of permutation polynomials over finite fields, J. Combin. Theory
7 (1969) 49–55. <219, 220, 230>
[2968] A. Wells Jr., A polynomial form for logarithms modulo a prime, IEEE Trans. Inform.
Theory 30 (1984) 845–846. <396, 401>
[2969] G. P. Wene, On the multiplicative structure of finite division rings, Aequationes Math.
41 (1991) 222–233. <278>
[2970] A. Weng, Konstruktion kryptographisch geeigneter Kurven mit komplexer Multiplikation,
PhD thesis, Universität Gesamthochschule Essen, 2001. <803>
[2971] A. Weng, Constructing hyperelliptic curves of genus 2 suitable for cryptography,
Math. Comput. 72 (2003) 435–458. <803>
[2972] G. Weng, W. Qiu, Z. Wang, and Q. Xiang, Pseudo-Paley graphs and skew Hadamard
difference sets from presemifields, Des. Codes Cryptogr. 44 (2007) 49–62. <282>
[2973] G. Weng and X. Zeng, Further results on planar DO functions and commutative
semifields, Des. Codes Cryptogr. 63 (2012) 413–423. <282>
[2974] R. C. Whaley, A. Petitet, and J. J. Dongarra, Automated empirical optimizations
of software and the ATLAS project, Parallel Computing 27 (2001) 3–35. <523,
535>
[2975] A. L. Whiteman, An infinite family of Hadamard matrices of Williamson type, J.
Combin. Theory, Ser. A 14 (1973) 334–340. <610, 619>
[2976] D. H. Wiedemann, Solving sparse linear equations over finite fields, IEEE Trans.
Inform. Theory 32 (1986) 54–62. <350, 363, 399, 401, 530, 535>
[2977] D. Wiedermann, An iterated quadratic extension of GF(2), Fibonacci Quart. 26
235–245. <546>
[3041] J.-D. Yu, Variation of the unit root along the Dwork family of Calabi-Yau varieties,
Math. Ann. 343 (2009) 53–78. <479, 488>
[3042] J. Yuan, C. Carlet, and C. Ding, The weight distribution of a class of linear codes
from perfect nonlinear functions, IEEE Trans. Inform. Theory 52 (2006) 712–
717. <272, 273>
[3043] J. Yuan and C. Ding, Four classes of permutation polynomials of $\mathbb{F}_{2^m}$, Finite Fields
Appl. 13 (2007) 869–876. <226, 230>
[3044] J. Yuan, C. Ding, H. Wang, and J. Pieprzyk, Permutation polynomials of the form
$(x^p - x + \delta)^s + L(x)$, Finite Fields Appl. 14 (2008) 482–493. <226, 230>
[3045] P. Yuan, More explicit classes of permutation polynomials of $\mathbb{F}_{3^{3m}}$, Finite Fields
Appl. 16 (2010) 88–95. <226, 230>
[3046] P. Yuan and C. Ding, Permutation polynomials over finite fields from a powerful
lemma, Finite Fields Appl. 17 (2011) 560–574. <221, 224, 226, 230>
[3047] P. Yuan and X. Zeng, A note on linear permutation polynomials, Finite Fields Appl.
17 (2011) 488–491. <216, 230>
[3048] J. L. Yucas, Irreducible polynomials over finite fields with prescribed trace / prescribed
constant term, Finite Fields Appl. 12 (2006) 211–221. <54, 59>
[3049] J. L. Yucas and G. L. Mullen, Irreducible polynomials over GF(2) with prescribed
coefficients, Discrete Math. 274 (2004) 265–279. <55, 56, 59, 79>
[3050] J. L. Yucas and G. L. Mullen, Self-reciprocal irreducible polynomials over finite fields,
Des. Codes Cryptogr. 33 (2004) 275–281. <56, 59>
[3051] D. Y. Y. Yun, Fast algorithm for rational function integration, In B. Gilchrist,
editor, Information Processing 77—Proceedings of the IFIP Congress 77, 493–
498, North-Holland, Amsterdam, 1977. <381, 382>
[3052] H. Zassenhaus, On Hensel factorization I, J. Number Theory 1 (1969) 291–311. <385,
392>
[3053] H. Zassenhaus, Polynomial time factoring of integral polynomials, ACM SIGSAM
Bull. 15 (1981) 6–7. <387, 392>
[3054] G. Zeng, Y. Yang, W. Han, and S. Fan, Reducible polynomial over $\mathbb{F}_2$ constructed by
trinomial σ-LFSR, In Information Security and Cryptology, volume 5487 of Lecture
Notes in Comput. Sci., 192–200, Springer, Berlin, 2009. <70>
[3055] L. Zeng, L. Lan, Y. Y. Tai, S. Song, S. Lin, and K. Abdel-Ghaffar, Constructions
of nonbinary quasi-cyclic LDPC codes: a finite field approach, IEEE Trans.
Communications 56 (2008) 545–554. <718, 719>
[3056] X. Zeng, C. Carlet, J. Shan, and L. Hu, More balanced Boolean functions with optimal
algebraic immunity and good nonlinearity and resistance to fast algebraic attacks,
IEEE Trans. Inform. Theory 57 (2011) 6310–6320. <250, 252>
[3057] X. Zeng, X. Zhu, and L. Hu, Two new permutation polynomials with the form
$(x^{2^k} + x + \delta)^s + x$ over $\mathbb{F}_{2^n}$, Appl. Algebra Engrg. Comm. Comput. 21 (2010) 145–150.
<226, 230>
[3058] Z. Zha and L. Hu, Two classes of permutation polynomials over finite fields, Finite
Fields Appl. 18 (2012) 781–790. <221, 226, 230>
[3059] Z. Zha, G. M. Kyureghyan, and X. Wang, Perfect nonlinear binomials and their
semifields, Finite Fields Appl. 15 (2009) 125–133. <282>
[3060] Z. Zha and X. Wang, New families of perfect nonlinear polynomial functions, J.
Algebra 322 (2009) 3912–3918. <282>
walk, 644
    closed, 644
Walsh
    coefficient, 262
    transform, 262
Waring’s formula, 283
Waring’s number, 211
    existence, 211
Waring’s problem, 498
Wedderburn, 14
Weierstrass ℘-function, 150
Weierstrass equation, 423
    discriminant, 423
    j-invariant, 423
Mathematics
Poised to become the leading reference in the field, the Handbook of Finite Fields
is exclusively devoted to the theory and applications of finite fields. More than
80 international contributors compile state-of-the-art research in this definitive
handbook. Edited by two renowned researchers, the book uses a uniform style
and format throughout, and each chapter is self-contained and peer reviewed.
The first part of the book traces the history of finite fields through the eighteenth
and nineteenth centuries. The second part presents theoretical properties of finite
fields, covering polynomials, special functions, sequences, algorithms, curves,
and related computational aspects. The final part describes various mathematical
and practical applications of finite fields in combinatorics, algebraic coding theory,
cryptographic systems, biology, quantum information theory, engineering, and
other areas. The book provides a comprehensive index and easy access to over
3,000 references, enabling you to quickly locate up-to-date facts and results
regarding finite fields.
Features
• Gives a complete account of state-of-the-art theoretical and applied topics
in finite fields
• Describes numerous applications from the fields of computer science and
engineering
• Presents the history of finite fields and a brief summary of basic results
• Discusses theoretical properties of finite fields
• Covers applications in cryptography, coding theory, and combinatorics
• Includes many remarks to further explain the various results
• Contains more than 3,000 references, including citations to proofs of
important results
• Offers extensive tables of polynomials useful for computational issues, with
even larger tables available on the book’s CRC Press web page
About the Editors
Gary L. Mullen is a professor of mathematics at The Pennsylvania State University.
Daniel Panario is a professor of mathematics at Carleton University.