Foundations of Mathematics in Polymorphic Type Theory: M. Randall Holmes
about the theorem prover, see [15]; for a technical treatment of its mathematical foundations, see [16].

2. The theory of types

We present a version of the Theory of Types of Russell, as simplified by Ramsey. We will refer to the theory we develop here as TTU (the U is for “urelements”, as we weaken extensionality).

The intuitive idea behind the Theory of Types is as follows. We are given a collection of individuals. We further concern ourselves with classes of individuals, classes of classes of individuals, and so forth. Actually, we prefer at this point to talk of properties of individuals, properties of properties of individuals, and so forth. If we take an intensional view (speaking of properties rather than sets or classes) this will help to motivate our refusal to adopt an extensional criterion of identity between objects of positive type.

We begin to formalize this picture. We will develop TTU as a first-order theory with equality with sorts indexed by the natural numbers. The intuitive motivation is that type 0 is the type of individuals and type n + 1, for each concrete natural number n, is inhabited by properties of type n objects.

We supply countably many variables of each type. We do not encumber the variables with superscripts; we merely stipulate that each variable x has a unique type denoted by type(x). This will make our notation more readable. Further, we provide a map raise on variables with the properties that type(raise(x)) = type(x) + 1 and that raise(x) = raise(y) implies x = y. We may also denote raise(x) by x⁺. Intuitively, raise represents the operation of incrementing a (suppressed) type superscript.

It is important to note that the use of numerals for the types is strictly a convenience. The types could equally well be presented as concrete tokens ι, ι′, ι″ and so forth, and use of type superscripts would allow the map raise to be defined without any reference to arithmetic as well.

The only primitive non-logical predicate of TTU is membership, denoted by ∈. Note that logical considerations tell us that an atomic formula x = y is well-formed iff type(x) = type(y). We stipulate further that an atomic formula x ∈ y is well-formed iff type(x) + 1 = type(y). This squares with our intuitive interpretation of x ∈ y as “x has property y”; if x is of type n, we expect y to be of type n + 1.

Note that we do not claim in TTU that objects of different types are different objects (we do not claim that the types are disjoint collections). We do not allow the question to be asked as to whether variables of different types represent the same object! Similarly, we do not allow ourselves to address the question of whether a type n + 1 property could be attributed to a type m object (where m ≠ n).

We adopt an axiom scheme which asserts that any property of type n objects which we can express is actually realized by a type n + 1 object:

Axiom Scheme of Comprehension: For each formula φ, (∃A.(∀x.x ∈ A ≡ φ)). (A not free in φ.)

This is a scheme not only because of the infinite number of possible choices of formula φ, but also because of the infinite number of possible types for the variable x (the type of x determines the type of A).

Now we turn our attention to the issue of extensionality. It is possible to adopt strong extensionality in the Theory of Types (and Russell and Ramsey did this). We adopt an intensional viewpoint here which motivates us to be skeptical of extensionality; it may be that we will be accused of being disingenuous, since of course we know that the adoption of strong extensionality would bring us into collision with the unsolved problem of the consistency of NF. We will discuss this issue again below.

We denote the theory developed so far by TTU₀. We show how TTU₀ can interpret our target theory TTU, which has in addition a weak form of extensionality. This interpretation of extensionality is due to Marcel Crabbé in [3], where he used this approach to demonstrate that NF without any form of extensionality interprets NFU.

An intuitive approach to introducing extensionality is to point out that we can identify those type n + 1 objects which have the same elements. We define a relation to embody this idea:

Definition: We define x ~ y for any x, y of the same positive type as (∀z.z ∈ x ≡ z ∈ y). For x, y of type 0, we allow x ~ y to denote an unspecified equivalence relation (it may be equality, but we do not assume this).
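The formation rules of this section are simple enough to check mechanically. The following Python sketch is only an illustration and not part of the theory: the names `Var`, `level`, `raise_var`, `wf_eq` and `wf_mem` are hypothetical, and representing raise(x) by appending "+" to the variable's name is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """A typed variable; `level` plays the role of type(x) in the text."""
    name: str
    level: int


def raise_var(x: Var) -> Var:
    """Sketch of the map raise: the type goes up by one.

    Appending "+" to the name keeps the map injective on variables
    (an assumption of this illustration, not a rule of TTU).
    """
    return Var(x.name + "+", x.level + 1)


def wf_eq(x: Var, y: Var) -> bool:
    """x = y is well-formed iff type(x) = type(y)."""
    return x.level == y.level


def wf_mem(x: Var, y: Var) -> bool:
    """x ∈ y is well-formed iff type(x) + 1 = type(y)."""
    return x.level + 1 == y.level


x = Var("x", 0)
A = Var("A", 1)
print(wf_mem(x, A))                        # x ∈ A is well-formed
print(wf_mem(x, x))                        # x ∈ x is not
print(wf_mem(raise_var(x), raise_var(A)))  # raising both sides preserves well-formedness
```

Because `raise_var` shifts every type by the same amount, any atomic formula that passes these checks still passes them after all of its variables are raised.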
We would like to use ~ as our equality relation in each type. The difficulty which arises is that if ~ is not the equality relation already, there will be properties in any type n + 2 which do not respect the relation ~ on type n + 1. The solution to this problem which we adopt is to redefine the membership relation as well as the equality relation. We first introduce an auxiliary predicate:

Definition: Σ(x), read in English “x is a set”, is defined as (∀yz.y ~ z → (y ∈ x ≡ z ∈ x)). This makes sense for x in any positive type (it doesn’t make sense at type 0 because a type 0 variable cannot appear to the right of ∈).

We say that x is a set just in case the property represented by x respects the identity relation ~ of the appropriate type. This enables us to define a new membership relation:

Definition: x ∈new y is defined as x ∈ y ∧ Σ(y).

The new membership relation preserves the extensions of properties which respect the relations ~ in each type, and assigns no elements at all to the properties whose extensions do not respect the equivalence relations ~.

A property of any type which respects the relation ~ (if x ~ y then either both have the property or neither do) is assigned the same extension under ∈new that it had under ∈; a property of any type which does not respect the relation ~ is assigned the empty extension. If we restrict ourselves to formulas using the predicates Σ, ~, and ∈new, it is straightforward to verify that ~ has the logical properties required to interpret the equality relation (substitution of “equals” for “equals” in any context preserves truth value).

The theory of Σ, ~, and ∈new in TTU₀ is an interpretation in TTU₀ of the inessentially stronger theory TTU which we now describe, in which ~ interprets equality and ∈new interprets membership.

TTU, like TTU₀, is a first order theory with equality and membership. It has the same sequence of sorts indexed by the natural numbers, and the same formation rules for atomic formulas, plus the primitive atomic formula Σ(x) which is defined for x of any positive type. The axioms of TTU are

Sethood: x ∈ y → Σ(y)

Extensionality: (Σ(x) ∧ Σ(y)) → (x = y ≡ (∀z.z ∈ x ≡ z ∈ y))

Comprehension: For each formula φ, (∃A.Σ(A) ∧ (∀x.x ∈ A ≡ φ)). (A not free in φ).

Each of these “axioms” is actually a scheme because it comes in different versions for each appropriate type.

It is straightforward but tedious to verify that this theory is interpreted in TTU₀ in the way we have indicated. It is also easy to believe that this is the case, so we will not belabour the point!

An immediate advantage of TTU over TTU₀ is that we can define a canonical {x | φ} such that {x | φ} is a set and (a ∈ {x | φ}) ≡ φ[a/x] (where φ[a/x] is the result of replacing x with a in φ). Comprehension ensures that there is at least one candidate for the role of {x | φ} for any formula φ; Extensionality ensures that there is at most one such candidate which is a set. All the non-sets have the same (empty) extension but there is a unique empty set ∅ = {x | x ≠ x} in each type. In TTU₀ there is no way to choose a canonical object of a given extension (another way that one might do this, which would not require any extensionality assumptions, would be to adopt and apply the Axiom of Choice).

Of course, it is the case that we could have built a model of TT (the Theory of Types with strong extensionality x = y ≡ (∀z.z ∈ x ≡ z ∈ y)). This process would have been logically more complex, because it would be necessary to eliminate the properties which did not respect the new equivalence relations, rather than simply make their extensions empty. Two a priori reasons to prefer the approach we have taken present themselves: the first is that this approach does not actually eliminate any objects from our model of “intensional type theory” TTU₀ (though it does collapse some distinctions between them); the second is that the construction for a model of TT is intrinsically considerably more complicated. It could not be presented (as this one has) as a scheme of definitions of the same form at each type; the definition of the interpreted TT would in fact be a recursively defined scheme with the definition of the new relations at each type depending on the new relations on the preceding type. (It is worth noting that reasoning in the theory of types involving such recursion on the type structure cannot be reproduced in the system NFU we will develop.)

In any event, we temporarily adopt TTU as our
foundation for mathematics. In the next section, we will discuss the actual development of some mathematics in TTU (and the mathematical development will motivate us to adopt two additional axioms, the axioms of Infinity and Choice).

We review the intuitive picture behind TTU. We have type 0 (our featureless “individuals”) to start with; each type n + 1 consists of sets or classes of type n objects plus possibly some additional objects (which we will call urelements or atoms; these are not to be confused with the individuals of type 0).

3. The development of mathematics in TTU

In this section, we will develop some basic mathematics in TTU. The development will go in the same way as it would in TT (our weakening of extensionality does not affect the suitability of the system for foundations). It will also become clear in the course of this development why the type structure of TTU (or TT) can be regarded as annoying. This will help to motivate our eventual revision of the theory to obtain type-free foundations in NFU. The mathematical development will also motivate the addition of the axioms of Infinity and Choice.

We begin by defining the natural numbers. The intuitive idea is that we will define each concrete natural number n as the set of all sets with n elements. This must immediately be modified by type considerations: we will define n (of type 2) as the set of all type 1 sets (of type 0 individuals) with n elements. Then, of course, we need to verify that we can actually do this.

We define 0 as {∅}, the set whose only element is the (type 1) empty set. Recall from above that there may be many elements of type 1 with no elements (because of the weakening of extensionality in TTU), but there will be exactly one set with no elements.

For any (type 2) set A, we define A + 1 as {a ∪ {x} | a ∈ A ∧ x ∉ a}; A + 1 is defined as the set of all disjoint unions of elements of A with singletons. Observe that 0 + 1 will be the set 1 of all singletons, 1 + 1 will be the set 2 of all sets with two elements, 2 + 1 will be the set 3 of all sets with 3 elements, and so on through the concrete natural numbers.

We define N as {n | (∀𝒜.(0 ∈ 𝒜 ∧ (∀A.A ∈ 𝒜 → A + 1 ∈ 𝒜)) → n ∈ 𝒜)}. N is the (type 3) set of all (type 2) sets which belong to all (type 3) sets which contain 0 and are closed under our “successor” operation. This is a reasonable definition of the set of natural numbers.

Notice that this is an impredicative instance of comprehension: if one thinks of the instance of comprehension providing us with N as a definition of N, it is disturbing that N itself falls within the scope of the quantifier over 𝒜 (N is “defined” as the intersection of a collection of sets to which it itself belongs). We admit to no philosophical qualms about this (we do not think that instances of comprehension are definitions); there is a little more discussion of impredicativity later in the paper.

An annoying feature of this definition is that it must be repeated in exactly the same way if one wishes to “count” objects of types higher than 0. There is a type 4 set N which provides us with numerals for counting type 1 objects, and so forth in each higher type. It is not usual to have to have different numerals to count objects of different sorts, though this is true in some natural languages (e.g. Japanese).

We have not assumed an axiom of infinity at this point. If there were finitely many type 0 objects, we would discover that the type 1 set V = {x | x = x} would belong to some natural number. An axiom of infinity excluding this situation could look like this:

*Axiom of Infinity: (∀n ∈ N.V ∉ n)

It is straightforward to prove that the version of this axiom with V at type 1 implies the analogous assertions at each higher type. The star indicates that we do not actually adopt this as an axiom; it is a consequence of the Axiom of Ordered Pairs introduced below.

We now consider the definition of the ordered pair. The following standard definition (due to Kuratowski) could be used:

*Definition: 〈x, y〉, read “the ordered pair of x and y” is defined as {{x},{x, y}}.

This definition (which we have starred to indicate that we do not ultimately adopt it) has the technical disadvantage that the pair is two types higher than its projections. This has the odd effect that functions are three types higher than their values and arguments. We prefer to have a pair which is of the same type as its projections. If we were using TT (i.e., if we assumed strong extensionality) we would be able to define such a pair on all sufficiently high types (this is due to Quine
in [23]). In TTU it is not possible to define the type-level pair, but it is possible to assume that there is one as long as the Axiom of Infinity is assumed. The precise situation is as follows: if we assume the Axiom of Choice (as we will), it is possible to prove that there is a type level pair on each type (this is a consequence of the theorem κ² = κ of transfinite cardinal arithmetic); if we did not assume choice, it would still be possible to demonstrate that TTU with Infinity interprets TTU with a type-level pair (this development would exploit Quine’s definition of the pair for pure sets). Since we do assume choice, we see no reason to go into details.

We avoid the necessity of developing the theory of relations and functions with the Kuratowski pair far enough to prove κ² = κ by adopting the type-level pair as a primitive of our theory.

Here are the formal details:

We add predicates π₁ and π₂ to our formal language. x π₁ y and x π₂ y are well-formed iff type(x) = type(y), and satisfy the following

Axiom of Ordered Pairs: For each x, y (of any type) there is a unique object z (of the same type) such that z π₁ x and z π₂ y. We will denote this uniquely determined object by 〈x, y〉. Moreover, 〈x, y〉 uniquely determines x and y.

It is easy to show that the Axiom of Ordered Pairs implies the Axiom of Infinity. Once the ordered pair is introduced, we can develop the theory of functions and relations in the usual way. An advantage of the type-level pair is that relations are one type higher than the elements of their domains and ranges, and functions are one type higher than their arguments and values; if we used the Kuratowski pair, these displacements would be equal to 3. The Axiom of Choice can be introduced in a number of ways. A common form of the Axiom of Choice looks like this:

*Axiom of Choice: Let P be a collection of pairwise disjoint nonempty sets. Then there is a set C (of type one lower than the type of P) which contains exactly one element of each element of P.

The usual equivalences between forms of the Axiom of Choice are provable, including the equivalence with the assertion that all sets can be well-ordered. We prefer to use the latter form, and in the context of type theory it is sufficient to say that each type can be well-ordered. So we introduce a relation symbol ≤ and stipulate that x ≤ y is a well-formed atomic formula iff type(x) = type(y). Our official axiom is:

Axiom of Choice: ≤ is a well-ordering (of each type). (That is, ≤ is a linear order, and any nonempty set has a ≤-least element).

Two sets A and B are said to be equinumerous just in case there is a bijection between them (just as in the usual set theory). It is natural in type theory to define the cardinal number |A| as the set of all sets equinumerous with A. Notice that |A| is one type higher than A, and also that the finite cardinal numbers will coincide with the natural numbers already defined. Ordinal numbers are usually defined in type theory as equivalence classes of well-orderings under similarity.

We briefly review the paradoxes of naive set theory. The Russell paradox of the class {x | x ∉ x} cannot be reproduced in type theory because the formula x ∉ x cannot be well-formed (no matter what the type of x). The Cantor paradox of naive set theory applies the theorem |A| < |P(A)| (the cardinality of a set is strictly smaller than the cardinality of its power set) to the cardinality of the universe to obtain an absurd result; this doesn’t work because |A| < |P(A)| is ill-typed: the set A and the set P(A) of all subsets of A are at different types. To fix this, we introduce a

Definition: We define P₁(A) as the set of all one-element subsets of A.

We can prove |P₁(A)| < |P(A)| in TTU, from which we can prove |P₁(V)| < |P(V)|: the set of all one-element subsets of the universal set V of all elements of a particular type is smaller than the set of all subsets of V (both of these sets live in the next type above the type of V). The Burali-Forti paradox of the largest ordinal is again avoided by type considerations. The set of all ordinal numbers in a given type is well-ordered by the usual order on ordinal numbers, and so belongs to an ordinal number, but this ordinal is of a higher type than any of the ordinals in the collection of ordinals we started with.

Hereinafter we take TTU to include the axioms of Ordered Pairs and of Choice.
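The successor construction on natural numbers from earlier in this section can be tried out on a small finite stand-in for type 0. The Python sketch below is illustrative only: the three-element set `individuals` and the function names are assumptions made for the example, and ordinary Python `frozenset` values are fully extensional, so urelements are not modelled. It shows the situation that an axiom of infinity is meant to exclude, namely that with finitely many individuals the universe V belongs to some natural number.

```python
from itertools import combinations


def zero():
    """0 = {∅}: the set whose only element is the empty (type 1) set."""
    return frozenset({frozenset()})


def successor(A, individuals):
    """A + 1 = {a ∪ {x} | a ∈ A and x ∉ a}."""
    return frozenset(a | {x} for a in A for x in individuals if x not in a)


def numerals(individuals):
    """Iterate the successor from 0, stopping once the result is empty."""
    result = []
    n = zero()
    while n:
        result.append(n)
        n = successor(n, individuals)
    return result


def subsets_of_size(individuals, k):
    """All k-element subsets, used to compare against the k-th numeral."""
    return frozenset(frozenset(c) for c in combinations(individuals, k))


individuals = frozenset({"a", "b", "c"})  # hypothetical finite type 0
V = individuals                           # the type 1 universe over it
nums = numerals(individuals)

print(len(nums))     # 4: the numerals 0, 1, 2, 3 are nonempty
print(V in nums[3])  # True: V belongs to the natural number 3
```

In this finite setting each numeral k coincides with `subsets_of_size(individuals, k)`, matching the description of n as the set of all sets with n elements.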
4. Model theory of TTU in TTU

We will use TTU as our vehicle for metamathematics as well as for mathematics. It should be clear that TTU is adequate to prove standard results in model theory. In this section, we discuss the definition of the notion “model of TTU” inside TTU. Of course, it will not be possible to prove inside TTU that there is a model of TTU (any more than it is possible to prove inside ZFC that there is a model of ZFC!).

A model of TTU is determined by a sequence of sets Tᵢ implementing the types plus relations implementing membership between each pair of successive types, the projection relations on each type, and the well-orderings of each type. If we assumed that the Tᵢ’s were pairwise disjoint, we could represent the membership relation of the model by a single relation E, but we prefer not to make this assumption: thus, we provide a sequence of membership relations Eᵢ ⊆ Tᵢ × Tᵢ₊₁ coding membership of type i objects in type i + 1 objects in the object theory. Notice that the “types” Tᵢ are all sets of the same type n + 1 in the metatheory: the relations Eᵢ are at the same type as the Tᵢ’s, the elements of any type of the model are one type lower, and the sequences T and E are one type higher. We will implement a model as a structure 〈T, E, P₁, P₂, W〉, where T is the sequence of types, E is the sequence of membership relations, P₁ and P₂ are sequences of first and second projection relations for types of the model, and W is a sequence of well-orderings of types of the model. We may abuse notation by referring to a model as 〈T, E〉 when no reference to the other relations is needed.

We do not formally define satisfaction of sentences by the model (standard methods are readily adapted to type theory). Of course, there are additional conditions in the definition of the notion “model of TTU” which express the conditions that it satisfy each axiom of TTU! The finite axiomatization of NFU given below suggests a simple way to define “model of TTU” which would not even require a formal definition of satisfaction of sentences; it is straightforward to express satisfaction of the typed version of each of those axioms by a model without metamathematical finesse.

We do define a particular class of models of special interest: we say that a model 〈T, E〉 of TTU is natural just in case for each natural number i and subset A of Tᵢ there is an element a of Tᵢ₊₁ such that for all x ∈ Tᵢ, x ∈ A if and only if x Eᵢ a; in a natural model, every subset of each type of the model is coded by some element of the next higher type of the model.

As we remarked above, the incompleteness theorems show us that we cannot hope to prove the existence of a model of TTU (natural or otherwise) in TTU. However, we do know (by standard results of model theory) that if TTU is consistent, there will be a countable (and so of course not natural) model of TTU. The assumption that there is a natural model of TTU is presumably consistent with TTU, but strengthens the theory (in much the same way that the assumption that there is an inaccessible cardinal strengthens ZFC, though we are talking here about weaker theories).

If we define TTUₙ, for each concrete natural number n, as the fragment of TTU obtained by restricting our attention to types 0-(n – 1) of TTU, it turns out that we can prove the existence of natural models of TTUₙ for each concrete n, though of course not in a uniform manner. We sketch the construction of such a natural model of TTUₙ. The key idea, from which all the details of the model can be worked out, is that for any types i and j with i < j, there is a natural way to code type i objects into type j: an object x of type i can be coded by its (j – i)-fold iterated singleton. The natural model of TTUₙ will have as the elements of its types 0-(n – 1) the elements of the true types 0-(n – 1) as coded into type n – 1. Thus, the set Tᵢ for each i will be the type n set of all ((n – 1) – i)-fold singletons of objects of type i. The “membership relation” Eₙ₋₂ between the two highest types Tₙ₋₂ and Tₙ₋₁ will be the intersection of the subset relation with Tₙ₋₂ × Tₙ₋₁: if we have x of type n – 2 and y of type n – 1, x is coded by {x} and y is coded by itself, and x ∈ y iff {x} ⊆ y. In general, the relation Eᵢ for 0 ≤ i ≤ n – 2 will be the type n relation on iterated singletons in type n – 1 induced by the intersection of the subset relation with Tₙ × Tₙ₊₁. It is easy to see that this is a natural model, because collections of iterated singletons of type i objects correspond precisely to collections of type i objects.

5. Typical ambiguity in TTU

The Theory of Types was proposed before the development of Zermelo-Fraenkel set theory, but it was not adopted generally as a foundation for mathematics. The reasons for this have to do at least partially with the cumbersome nature of a notation cluttered with type
superscripts; in our development we have avoided this problem. A further reason, which we have not been able to avoid, is the high level of polymorphism in this theory.

Recall that when we defined the set N, we realized that we had succeeded in defining numerals for counting type 0 objects, but that we would need to define a different set N in a different type for the counting of objects of each type. The natural number 3 (in type 2) is the set of all three-element sets of type 0 objects, but there are further natural numbers 3 which are sets of all three element sets of type 1 objects, sets of all three element sets of type 186 objects, etc. Actually, we can’t really say that these are “further” numbers 3; the syntax of our language forbids us from comparing any two of these, or from considering the sequence of “numbers 3” except on the meta-level (it is easy to define the sequence of “numbers 3” in a model of TTU).

This is a particular case of a general phenomenon, known to Russell, called “typical ambiguity”. We introduce a useful

Definition: For any formula φ of the language of TTU, define φ⁺ as the formula which results from the replacement of each variable x occurring in φ with raise(x).

It should be evident that φ⁺ will itself be a well-formed formula. This notation allows us to describe the phenomena of typical ambiguity succinctly: for each object {x | φ} that we can define, there is a precisely analogous object {raise(x) | φ⁺} which we can define in the next higher type. The ambiguity is even more profound. It is easy to see that for each axiom φ (including instances of Ordered Pairs and of Choice), φ⁺ is also an axiom: from this it follows easily that for every theorem φ, φ⁺ is also a theorem.

The type structure of TTU looks rather like a hall of mirrors! The reduplication of objects and theorems in each type suggests that perhaps what is going on is that the types (which we have never assumed to be distinct) are actually all identical. This is the motivation for Quine’s original proposal (based in TT rather than TTU) of the theory “New Foundations”. The difficulty with Quine’s suggestion (which still has not been justified in its original form) is that there is no obvious reason to believe that this can be the case. In fact, if we think of the “natural” models of TT or TTU in Zermelo-style set theory this seems absurd: type n + 1 in a natural model will be (or include if there are urelements in the type) the power set of type n, and so cannot be the same as type n (because it must be strictly larger than type n by Cantor’s theorem). What happens in natural models inside TTU is technically a bit different, but the outcome is the same: a natural model of TTU in TTU will have types all of different cardinalities and so distinct.

Our answer to this objection will become plainer later, but for the moment we merely say that we have no particular reason to believe that our “ultimate” model of TTU is a natural model (in the sense that type n + 1 contains every arbitrary subcollection of type n). The intensional view of type theory that we suggested when motivating the weakening of extensionality also leaves us open to skepticism as to whether every arbitrary subcollection of type n is realized by a property of type n objects (an element of type n + 1). Certainly if we want to pursue the possibility of a hidden identity between the types, we must abandon the idea that type n + 1 contains an object representing each arbitrary subcollection of type n.

In any event, mathematical convenience strongly suggests that we would wish to collapse the type structure of TTU.

6. A road not taken: Collapsing TTU to Mac Lane set theory with atoms

It should be noted before we proceed to our official technique of collapsing the type structure that there is a natural way to collapse the type structure of TTU which leads to a Zermelo-style set theory (though not all the way to ZFC). This collapse works best in models of TT; there can be a technical obstruction to collapsing models of TTU with urelements, which is removed if we stipulate for each type n + 1 that there are at least as many urelements of type n + 1 as there are singletons of type n urelements (i.e., as we go up in type the number of urelements added at each type will not decrease). This given, we proceed as follows: we are initially given an injection (one-to-one map) f₀ from the singletons of type 0 objects to type 1 objects (the role of the singleton operation here is just to make it possible to type the map f₀ within TTU); for each natural number n, once we have defined fₙ, an injection from singletons of type n objects into type n + 1 objects, we define fₙ₊₁ as follows: for any set A in type n + 1, we define fₙ₊₁({A}) as {fₙ({a}) | a ∈ A} (the elementwise image
of P₁(A) under fₙ), and choose an arbitrary injection from singletons of type n + 1 urelements to type n + 2 urelements to serve as the restriction of fₙ₊₁ to urelements. This system of maps fₙ can be used to interpret an untyped set theory in TTU: each fₙ is used in the obvious way to identify the type n objects with a subset of the type n + 1 objects, and the identification maps are defined in such a way as to respect membership. The untyped set theory which results is Zermelo set theory with atoms, with the restriction that the axiom of separation only applies to formulas in which each quantifier is restricted to a set; this theory can also be called “Mac Lane set theory with atoms”. (For Mac Lane’s proposal of this theory, see [19]; for an excellent recent study of this theory, see [20]). The restriction on separation arises essentially because TTU never allows us to quantify over all types at once. The model of Mac Lane set theory obtained in this way might turn out to be a model of Zermelo set theory, but cannot turn out to be a model of ZFC, because the sequence of sets implementing types of the original model will not be a set and will provide a counterexample to replacement (if we get a model of ZFC, that implies that we had no urelements to start with, and under this condition the sequence of types of the original model is definable in the model of Mac Lane set theory; it may actually be possible to get a model of ZFA (ZFC weakened to allow atoms) by collapsing a model of TTU, because it is not clear that the sequence of types is definable in the general situation; in any case it is possible that the model obtained by collapsing a general model of TTU (even an extensional one) will contain an inner model of ZFC, if it happens to contain an inaccessible cardinal).

The technical details of the collapse described in the preceding paragraph are not important to our development, but the fact that it is possible to get to Zermelo-style foundations by collapsing types in TTU (more naturally from TT) should be reviewed in this context for comparison and contrast with the rather different collapse of types which leads to NFU. The logical complexity of the collapse sketched in the previous paragraph is about the same as the complexity of the interpretation of type theory with full extensionality in TTU₀ which we briefly sketched earlier; it clearly involves recursion on types.

7. Polymorphism used to collapse the type structure

If we wish to collapse the type structure in such a way as to exploit the polymorphism of TTU, we obtain quite different results from those of the construction of the previous section. It is not generally the case that the collapse defined in the previous section will identify a set {x | φ} with its analogue {x⁺ | φ⁺} in the next higher type. This will be true for finite well-founded sets, but not generally for any others. For example the “numbers 3” of a model of TTU will be collapsed to the sets [Vᵢ]³ in the model of Mac Lane set theory with atoms, where the set Vᵢ is the set coding type i and [X]³ represents the set of all three-element subsets of X. The sets [Vᵢ]³ are demonstrably distinct, and this is a typical situation.

To assume that such a collapse is possible has nontrivial logical content, expressible as an axiom scheme to be adjoined to TTU. To see this, consider the case of set abstracts {x | φ} where the formula φ happens not to contain any free occurrences of x, or any parameters (i.e., φ is a sentence). Any such set {x | φ} will be equal to the universe or the empty set depending on whether φ happens to be true or false. {x⁺ | φ⁺} will have the same property, and we certainly expect the universe in a type to be analogous to the universe in the next type, and similarly for the empty set. It follows that we must believe the following axiom scheme in TTU:

Ambiguity Scheme: For any sentence φ, φ ≡ φ⁺

The truth of this scheme is clearly necessary for a polymorphic collapse to succeed; it has been shown that it is in a sense also sufficient (see [27] and a later section of this paper (but we cheat slightly in our proof)).

We now define the theory NFU which results from the identification of the types in TTU. We should point out as we do this that we have not yet justified this maneuver; we can see what theory results from collapsing the type structure of TTU, but we cannot yet see intuitively or otherwise that this theory is legitimate.

NFU is a first-order theory with equality, membership, the sethood predicate Σ, projection relations π₁ and π₂ and a well-ordering ≤. Its axioms are Sethood, Extensionality (the weak form given above for TTU), Comprehension, Ordered Pairs, and Choice, as in TTU but with all distinctions of type ignored. The only one of these axioms which requires comment is Comprehension.
It may seem that the form of Comprehension which we obtain by dropping all indications of type from the Comprehension scheme of TTU is

*Axiom Scheme of Comprehension: For each formula φ (of the language of NFU), (∃A.Σ(A) ∧ (∀x.x ∈ A ≡ φ)). (A not free in φ.)

Any similarity between this definition and the well-formedness conditions for formulas of TTU is far from accidental. A stratified formula φ will be obtainable from a formula of the language of TTU by disregarding type distinctions; we can now present the

Axiom Scheme of Stratified Comprehension: For each stratified formula φ of the language of NFU, (∃A.Σ(A) ∧ (∀x.x ∈ A ≡ φ)). (A not free in φ.)

This is still a complex axiom scheme, and of course we still see the genetic relationship of NFU to type theory, but at least we have avoided the disaster of unstratified comprehension. But we are certainly open at this point to the accusation that we have perpetrated a “syntactical trick”!

(We note at this point that we ourselves question whether even Quine was guilty of a mere “syntactical trick”. We would describe Quine as guilty of being unduly hopeful that the identification of the types suggested by the typical ambiguity of TT represented a real phenomenon.)

Axiom of the Universal Set: V = {x | x = x} exists.

Axiom of Complements: A^c = {x | x ∉ A} exists.

Axiom of Boolean Unions: A ∪ B = {x | x ∈ A ∨ x ∈ B} exists.

Axiom of Set Union: ⋃A = {x | (∃y.x ∈ y ∧ y ∈ A)} exists.

Axiom of Singletons: {A} = {x | x = A} exists.

Axiom of Ordered Pairs: As above.

Axiom of Cartesian Products: A × B = {〈x, y〉 | x ∈ A ∧ y ∈ B} exists.
Axiom of Converses: R⁻¹ = {〈x, y〉 | 〈y, x〉 ∈ R} exists.

Axiom of Relative Products: R | S = {〈x, y〉 | (∃z.〈x, z〉 ∈ R ∧ 〈z, y〉 ∈ S)} exists.

Axiom of Domains: dom(R) = {x | (∃y.〈x, y〉 ∈ R)} exists.

Axiom of Singleton Images: R^ι = {〈{x}, {y}〉 | 〈x, y〉 ∈ R} exists.

Axiom of the Diagonal: The set Eq representing the equality relation, {〈x, x〉 | x = x}, exists.

Axiom of Projections: The sets P1 and P2 representing the projection relations, {〈〈x, y〉, x〉 | x = x} and {〈〈x, y〉, y〉 | x = x}, exist.

Axiom of Inclusion: The set Subset representing the subset relation, {〈x, y〉 | Σ(x) ∧ Σ(y) ∧ (∀z.z ∈ x → z ∈ y)}, exists.

Axiom of Choice: As above.

The first thing to notice about this axiom set is that the sets and operations on sets provided by the axioms are all intuitively reasonable (even the universal set and complements, if one does not have prior training in ZFC): they are the primitives of boolean and relation algebra, plus the domain operator needed to reduce binary relations to sets, the projection and subset relations, and the special set operations of singleton, singleton image, and set union. The axiom of the universal set is actually redundant; one can use any specific set introduced in the other axioms along with the boolean operations to construct V.

Our aim for the rest of this section is to prove the following

Meta-Theorem: For any stratified formula φ, {x | φ} exists.

Lemma 1: Suppose that φ is built up by logical operations from sentences of the forms u ∈ A and 〈u, v〉 ∈ R where A and R are parameters or constants. Then {x | φ} and {〈x, y〉 | φ} exist.

Proof of Lemma 1: We prove this by induction on the structure of φ. If φ is of the form ¬ψ, then we construct {x | φ} as the complement of {x | ψ}, and similarly for {〈x, y〉 | φ}. If φ is of the form ψ ∨ χ we construct {x | φ} as the boolean union of {x | ψ} and {x | χ}, and we prove the existence of {〈x, y〉 | φ} in the same way. If φ is of the form (∃z.ψ), we construct {x | φ} as the domain of {〈x, z〉 | ψ}.

The case that remains is the hard one. We construct {〈x, y〉 | (∃z.ψ)}. In order to do this, we expand our language to include terms obtained by possibly repeated application of the projection functions π1 and π2 to variables. We will indicate below how these are eliminated when we get to the stage of atomic formulas. Notice that any composition of projection operators can be constructed as a relation using the axiom of relative products. Once this expansion of our language is made, we define the set {〈x, y〉 | (∃z.ψ)} as the domain of {〈u, z〉 | ψ′}, where u is a new variable and ψ′ is obtained from ψ by replacing each occurrence of x with the term π1(u) and each occurrence of y with the term π2(u).

We may suppose without loss of generality that no other logical connective or quantifier occurs in φ.

At the stage of atomic formulas, we need to consider first “atomic formulas” of the forms π_a(x) ∈ B and 〈π_a(x), π_b(y)〉 ∈ C, where the operators π_a and π_b are compositions of projection operators (Eq may be used as the composition of zero projection operators (the identity map) if needed; also the variables x and y are not necessarily distinct). First note that any composition of projection operators is realized by a set relation by application of the axioms of projections and relative products. Thus, we can convert 〈π_a(x), π_b(y)〉 ∈ C to the form 〈x, y〉 ∈ (π_a | C | π_b⁻¹) and π_a(x) ∈ B to the form x ∈ dom(π_a ∩ (V × B)), producing atomic formulas without the new function symbols.

All that remains is to verify the theorem for atomic formulas of the forms listed above. In what follows, variables are distinct unless written with the same letter. {u | u ∈ A} is equal to A if A is a set and to the empty set otherwise. {〈u, v〉 | u ∈ A} is A × V; {〈v, u〉 | u ∈ A} is V × A. {v | u ∈ A} and {〈v, w〉 | u ∈ A} are either the universe (resp. universal relation) or the empty set, depending on whether u ∈ A.

{u | 〈u, v〉 ∈ R} is the domain of the intersection of R and V × {v} (we have intersections because we have unions and complements). {v | 〈u, v〉 ∈ R} is the domain of the converse (the range) of the intersection of R and {u} × V. {w | 〈u, v〉 ∈ R} is the universe or the empty set. {u | 〈u, u〉 ∈ R} is the domain of the intersection of R and Eq. {v | 〈u, u〉 ∈ R} is the universe or the empty set. {〈u, v〉 | 〈u, v〉 ∈ R} is the intersection of R and V × V. {〈v, u〉 | 〈u, v〉 ∈ R} is the converse of the intersection of R and V × V. {〈u, w〉 | 〈u, v〉 ∈ R} is the cartesian product of {u | 〈u, v〉 ∈ R} (already shown to exist) and V, and similar considerations apply to the three other cases in which the two pairs share one variable. If the two pairs share no variable, we have the universal relation or the empty set. {〈u, v〉 | 〈u, u〉 ∈ R} and {〈v, u〉 | 〈u, u〉 ∈ R} are cartesian products with V of {u | 〈u, u〉 ∈ R}, which has already been shown to exist.

The proof of Lemma 1 is complete. For the idea of exploiting the projection operators to make it possible to consider only sets and binary relations, we are indebted to Tarski and Givant ([28]).

Lemma 2: The set E = {〈{x}, y〉 | x ∈ y} exists.

Proof of Lemma 2: The domain of (V × V)^ι is the set of all singletons, which we will call 1. (1 × V) ∩ Subset = E. It is useful to note that this relation played an important role in the construction of natural models of TTU_n above.

Definition: Define V^0 as V; define V^(n+1) as the domain of (V^n × V^n)^ι. V^n will be the set of all n-fold iterated singletons. (This is not to be confused with the standard notation V_α for stage α of the cumulative hierarchy in ZFC, which will be used below.) For any relation R taken from the list (Eq, P1, P2, E), define R^0 as R and R^(n+1) as (R^n)^ι for each concrete n; R^n is the relation on n-fold iterated singletons induced by R.

Proof of Meta-Theorem: Let φ be an arbitrary stratified formula, and let type be a stratification of φ with an upper bound N on its range. We transform φ into a related formula φ′ which will have the form required by Lemma 1. φ′ is constructed by replacing each occurrence of x ∈ y with 〈x, y〉 ∈ E^(N − type(y)), replacing each occurrence of x = y with 〈x, y〉 ∈ Eq^(N − type(x)), where Eq is the set provided by the Axiom of the Diagonal, and replacing occurrences of x π_i y with 〈x, y〉 ∈ P_i^(N − type(x)) (where P_i stands for P1 or P2), and restricting each quantifier over a variable x to the set V^(N − type(x)). Occurrences of Σ(x) are best eliminated by replacement with (∃y.y ∈ x) ∨ x ∈ {∅}, then carrying out the translation of this as above.

The motivation behind this transformation is that each variable x in φ has different reference in φ′: x refers in φ′ to the (N − type(x))-fold singleton of its referent in φ. It is useful to note that this is essentially the same as the coding trick used in the construction of natural models of TTU_n above: the relations used to code membership at different (relative) types are the same as in that construction.

In any case, it is easy to check that the changes in relations and the restrictions of quantifiers are those which should be induced by the intended change of reference of variables, if the truth values of φ and φ′ are to be the same.

For any variable x, {x | φ′} exists by Lemma 1. This will be the set of (N − type(x))-fold singletons of objects x such that φ; apply the operation of set union (N − type(x)) times to this set to get {x | φ}.

The proof of the Meta-Theorem is complete.

The presentation of NFU in this section has the advantage that there is no meta-mathematics in the formulation of the comprehension axioms; each axiom tells us of the existence of concrete sets or operations on sets. It increases the sense that stratified comprehension may be a reasonable criterion for set existence, but it certainly does not provide firm support for this impression. After all, it is known that if we add strong extensionality (another intuitively appealing assumption) to the given axiom set it becomes inconsistent (because we would get NF + Choice, which was shown to be inconsistent by Specker in [26]).

Another practical reason to be interested in this section is that it suggests a clean way to define the notion “model of TTU”; the axiomatization given here, if typed, gives an axiomatization of TTU as well, and it is much easier to describe satisfaction of these axioms in a model of TTU in our metatheory than to describe satisfaction of the general comprehension scheme. We don’t carry out the details here, but it is useful to be aware of the issue.

Further, the details of this proof are important in developing and understanding the intuitive picture of set theory we develop below.
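
The Meta-Theorem is stated for stratified formulas, and its proof starts from a stratification with an upper bound N. As a rough illustration only, the Python sketch below searches for such a type assignment from the atomic subformulas of a formula. It assumes the usual reading of stratification (type(y) = type(x) + 1 for x ∈ y, and equal types for x = y and for x π_i y), which agrees with the exponents used in the proof above. The names stratify and singleton_depths and the encoding of atomic formulas as triples are hypothetical, and variables are assumed to have been renamed so that each name denotes a single variable.

```python
from collections import deque


def stratify(atoms):
    """Search for a stratification of a list of atomic formulas.

    Each atom is a triple (kind, x, y) with kind in {"in", "eq", "proj"}:
      ("in", x, y)   stands for x ∈ y    and requires type(y) = type(x) + 1
      ("eq", x, y)   stands for x = y    and requires type(y) = type(x)
      ("proj", x, y) stands for x π_i y  and requires type(y) = type(x)
    Returns a dict mapping variables to natural numbers, or None if the
    constraints cannot be satisfied (the formula is unstratified).
    """
    offset = {"in": 1, "eq": 0, "proj": 0}
    graph = {}
    for kind, x, y in atoms:
        d = offset[kind]
        graph.setdefault(x, []).append((y, d))
        graph.setdefault(y, []).append((x, -d))

    types = {}
    for start in graph:
        if start in types:
            continue
        # Propagate the constraints through one connected component.
        types[start] = 0
        component = [start]
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w, d in graph[v]:
                if w not in types:
                    types[w] = types[v] + d
                    component.append(w)
                    queue.append(w)
                elif types[w] != types[v] + d:
                    return None  # two constraints disagree
        # Shift the component so that its lowest type is 0.
        low = min(types[v] for v in component)
        for v in component:
            types[v] -= low
    return types


def singleton_depths(types):
    """For each variable x, return N - type(x), where N is the largest type.

    In the proof above this is the number of singleton brackets that
    the referent of x acquires in the transformed formula φ′.
    """
    N = max(types.values())
    return {v: N - t for v, t in types.items()}
```

For example, `stratify([("in", "x", "y")])` assigns type 0 to x and type 1 to y, while `stratify([("in", "x", "x")])` returns `None`, reflecting the fact that x ∈ x is not stratified.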
9. Sufficiency of the Ambiguity Scheme

In this section, we prove an important theorem of Specker.

Proof of Theorem: We use the well-orderings provided by the Axiom of Choice to construct a term model of TTU. Note that any sentence (∃x.φ) which is true has a witness (μx.φ) definable as the ≤-least element of {x | φ}. When (∃x.φ) is not true, we might as well define (μx.φ) as ∅. Take any model of the revised TTU: it is straightforward to show that the substructure consisting of the referents of all μ-terms is a model of TTU as well. (Another way of putting all this is that we are exploiting the fact that TTU has definable Skolem functions.)

A model of NFU is then obtained by identifying the analogous μ-terms of different types in the term model constructed in this way from a model of TTU + Ambiguity. There can be no conflict between statements true of analogous μ-terms at different types of TTU, by Ambiguity (the crucial point here is that all elements of the term model are definable!). The semantics of stratified sentences of NFU are obtained directly from the term model (or indeed from the original model of TTU + Ambiguity, which has the same theory), and the semantics of unstratified sentences are obtained from the term model in the natural way.

The proof is complete.

The proof of the sufficiency of ambiguity is much easier in TTU as we have presented it, because the presence of Choice gives us definable witnesses to every true existential statement (even with parameters, and thus definable Skolem functions). A similar technique will work for theories without Choice, but may appear to be cheating: it adds no strength to type theory without Choice to add a well-ordering of each type but exclude the well-ordering from instances of comprehension (the well-orderings can be interpreted as external well-orderings of each type in a countable model). If one extends the Ambiguity Scheme to formulas containing the external well-ordering, the argument goes just as it does here. It turns out that one does not need to cheat: the equiconsistency of type theories with ambiguity schemes and the corresponding NF-like systems is provable anyway, in the absence of definable Skolem functions (using saturated models); the curious reader can see the details in [27] or [5], p. 59.

In this section, we will give what is essentially Jensen’s proof that NFU is consistent, but in a form proposed by Maurice Boffa ([1]). We feel that this proof is not especially intuitively appealing, but it certainly does work.

It uses Ramsey’s theorem, which we now state:

Definition: For any set X, we define [X]^n as the set of all n-element subsets of X. Let P be a finite set of pairwise disjoint sets whose union is [X]^n (a finite partition of [X]^n). We say that H ⊆ X is a homogeneous set for P iff there is p ∈ P such that [H]^n ⊆ p; in other words, all n-element subsets of H fall in the same “compartment” of the partition P.

Theorem (Ramsey): For any infinite set X and finite partition P of [X]^n, there is an infinite homogeneous set for P.

This theorem can be proved in TTU (in various typed versions) in essentially the same way that it is proved in standard set theory.

The main result of this section is

Theorem: NFU is consistent iff TTU is consistent.

It is obvious that TTU is consistent if NFU is consistent; given a model of NFU with domain M and membership relation e (in some type in TTU, which we recall is our metatheory), we can define a model in the following way: T_i = M for each i and E_i = e for each i, and the sequences of projection operators and well-orderings will be constant sequences of the projection relations and well-ordering of the model of NFU. It is clear that the resulting structure will be a model of TTU, with the expected special feature that the types are identical (and so the membership relations between successive types are identical as well).

It is the converse, that the consistency of TTU implies the consistency of NFU, which is less obvious. Suppose we have a model 〈T, E, P1, P2, W〉 of TTU.
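
As a finite illustration of the notion of homogeneous set used in Ramsey's theorem as stated above, the following Python sketch enumerates homogeneous sets of a given size by brute force. It is only a toy: the theorem itself concerns infinite sets, and the function name homogeneous_sets and the representation of a partition by a colouring function (two subsets lie in the same compartment iff they receive the same colour) are assumptions made for this example.

```python
from itertools import combinations


def homogeneous_sets(X, n, colour, size):
    """Yield each size-element H ⊆ X that is homogeneous for the partition
    of the n-element subsets of X induced by the function colour."""
    for H in combinations(sorted(X), size):
        # Collect the compartments hit by the n-element subsets of H.
        colours = {colour(frozenset(s)) for s in combinations(H, n)}
        if len(colours) <= 1:
            yield set(H)


# Partition the pairs from {0, ..., 5} by the parity of their sum.
found = list(homogeneous_sets(range(6), 2, lambda s: sum(s) % 2, 3))
```

Here `found` consists of {0, 2, 4} and {1, 3, 5}: any three numbers include two of the same parity, whose sum is even, so a homogeneous triple must lie entirely in the even-sum compartment.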
The trick is to show that any increasing subsequence of types of this model can also be used to determine a model of TTU. Let f be a strictly increasing function from N to N. We define a model 〈T^f, E^f, P1^f, P2^f, W^f〉 of TTU. T^f_i (type i of the submodel) will be simply T_{f(i)}. The definition of E^f_i is a little more complex: we need to develop a notion of “membership” of type f(i) elements of our model in type f(i + 1) elements of our model: we know that f(i + 1) > f(i), but we do not necessarily have f(i + 1) = f(i) + 1. The solution is to use the coding of elements of type f(i) into type f(i + 1) − 1 as (f(i + 1) − f(i) − 1)-fold iterated singletons. For each x ∈ T_{f(i)}, we write ι^{f(i + 1) − f(i) − 1}(x) to represent the iterated singleton of x (in the sense of the model) in T_{f(i + 1) − 1} (note that this is x itself if f(i + 1) = f(i) + 1). We define E^f_i as the set of all ordered pairs 〈x, y〉 such that ι^{f(i + 1) − f(i) − 1}(x) E_{f(i + 1) − 1} y ∧ (∀z.z E_{f(i + 1) − 1} y → (∃w.z = ι^{f(i + 1) − f(i) − 1}(w))). The new model regards all sets of (f(i + 1) − f(i) − 1)-fold singletons (in the sense of the original model) in T_{f(i + 1)} as sets (it is easy to see that these correspond exactly to the sets of type f(i) objects in the original model) and treats all other elements of T_{f(i + 1)} as urelements. Notice that the definition of E^f_i depends only on the values of f(i) and f(i + 1); any model 〈T^g, E^g, P1^g, P2^g, W^g〉 of this kind which contains these two types as successive types will have the same membership relation between them. We define the sequences P1^f, P2^f, and W^f in the obvious way: e.g. W^f_i = W_{f(i)}. It is straightforward but tedious to show that this is in fact a model of TTU.

Now take any finite set of sentences S in the language of TTU. There will be a highest type n which occurs in any of these sentences. We can use S to define a finite partition of [N]^(n+1) as follows: the “compartments” of the partition into which an element A of [N]^(n+1) falls will be determined by the truth values of the sentences of S in models 〈T^f, E^f, P1^f, P2^f, W^f〉 where the image of {0, . . . , n} under f is A. Clearly there are no more than 2^|S| compartments. By Ramsey’s theorem, there is an infinite homogeneous set H for this partition. Let h be the strictly increasing map from N onto H. The model 〈T^h, E^h, P1^h, P2^h, W^h〉 will satisfy the scheme of typical ambiguity for sentences in S. By compactness, it follows that the full scheme of typical ambiguity is consistent (because any finite subset of the scheme is consistent). By Specker’s ambiguity theorem of the previous section, NFU is consistent.

It is a crucial feature of this construction that there is a natural way to define a “membership relation” between any two types i and j with i < j in a model of TTU. There is no obvious way to “skip types” this way in a model of TT (and it is demonstrably impossible to do this in a model of TT which satisfies Choice). See our [12] for a discussion of this.

11. The bootstrap to untyped foundations

At the moment, our metatheory is TTU and we have been reasoning in it about the object theories TTU and NFU. We now know (by dint of some hard reasoning, admittedly) that the consistency of TTU implies the consistency of NFU. Moreover, NFU is an extension of TTU: it extends TTU with the information that the types are all in fact one and the same domain.

On reflection (meta-meta-theoretic in nature!), we realize that we can extend our metatheory from TTU to NFU harmlessly. So (in at least a mathematical sense) we have arrived at foundations in NFU as desired.

12. The meaning of untyped foundations

At this point we leave the task of mathematical development of our foundations and commence explicit philosophical reflection on the foundations (though we will not eschew further mathematical development). We do think that a basically self-contained exposition of the mathematics is of value for supporting the necessary reflection, which is why we have provided it.

Nothing we have done should necessarily have dispelled the impression that the development of NFU is a syntactical trick. What proponents of ZFC have that we do not have (so far) is a nice intuitive picture of what is going on in their favored approach to foundations: one starts with the empty set and runs through a series of stages indexed by the ordinals, at each step constructing all collections of objects constructed before that stage. This picture motivates not only ZFC, but a hierarchy of extensions of ZFC which is in principle impossible to formalize completely! An inessential modification of the intuitive motivation for ZFC gives an intuitive motivation for ZFA as well (one can start with a set of atoms, or allow the addition of sets of atoms at each stage).

Moreover, the motivation behind TTU might seem to be a variation of the same thing. One takes an arbitrary collection of objects to be type 0, then one takes all collections of these objects (plus some junk)
to be type 1, all collections of type 1 objects (plus some junk) to be type 2, and so forth. As we have noted before, it is impossible to reconcile the idea of polymorphic collapse of a model of TTU with the idea that each type contains every subcollection of the previous type. To see this, consider the collection of all elements of a given type which do not belong to their polymorphic analogues in the next type; this would implement the Russell class! (This shows that the relation “x is the polymorphic analogue of y in type type(y) + 1” cannot be representable inside the model of TTU being collapsed polymorphically.)

As we have suggested at a couple of points in the mathematical development, this is not our understanding of TTU (and so we do not need to revise it when we make the transition to NFU). We suggested at the outset that we regard TTU as being fundamentally a theory of properties rather than a theory of classes (an intensional rather than an extensional theory). It is reasonable to suppose that some collections of an “arbitrary” nature may not be the extensions of any property of objects we recognize, and so may not be realized as sets. The extensional criterion of identity between sets may not be thought appropriate for a theory of properties, but it is certainly convenient for mathematical purposes, and we have shown how to arrange for it to hold by identifying properties with the same extension (in our interpretation of TTU in TTU_0 above).

The phenomenon of typical ambiguity in type theory that Russell and Quine noticed and from which Quine attempted to extrapolate is an intensional rather than an extensional feature of the theory; sets are being viewed as extensions of definable properties rather than “arbitrary” collections when the phenomenon of typical ambiguity is noticed.

NFU is an extensional theory in the sense that the criterion for identity between (instantiated) properties is extensional. It is not unreasonable even from an intensional and purely philosophical standpoint to maintain that properties which hold of exactly the same objects are the same (though this is not a commonly held view). Because NFU is extensional in this sense, it does make sense to refer to it as a theory of sets. But it is an intensional theory in the sense that the sets of NFU are considered as being extensions of properties rather than arbitrary collections (and we admit the possibility that some arbitrary collections may not be extensions of sets because they are not the extension of properties we can specify).

Convention: We will refer to completely arbitrary collections here as “classes” rather than “sets”. We will refer to the properties and relations whose extensions are sets as “natural” properties and relations.

It might be objected that our inclusion of choice in TTU is incompatible with regarding it as an “intensional” theory; it is often supposed that the purpose of the Axiom of Choice is precisely to allow the construction by “arbitrary” choices of sets which we cannot specify by a common property of their elements, making it incompatible with an “intensional” viewpoint.

This is not necessarily the case. The Axiom of Constructibility (V = L) of Gödel is a statement motivated entirely by intensional considerations. The axiom V = L is all about definability: every set in ZF + V = L is definable from finitely many ordinal parameters, so is in a natural sense the extension of a property. But the “intensional” theory ZF + V = L proves choice! Moreover, the objections to ZF + V = L which are usually expressed are “extensional” in character: most set theorists doubt that the theory with V = L captures the truth about “arbitrary” collections even of natural numbers!

The problem is to make intuitive sense of the stratification criterion of comprehension in NFU in a way that lets us see NFU as reflecting a view of the world with its own internal logic rather than a weird modification of the view of the world implicit in TTU. What is special about the properties with stratified definitions (i.e., why should we regard these properties as natural)?

The obvious shift from TTU to NFU is that the world of individuals with which we start (type 0) turns out to be the whole world! The properties and urelements of type 1 turn out to be identified (bijectively) with the individuals. Moreover, the details of the identification of type 1 with type 0 automatically handle all types at once; there is no opportunity to tinker so as to accommodate type 2 and higher types.

Speaking informally, what we have done is cause each natural property of individuals to be represented by an individual (the set with the extension determined by that property). We do not use all individuals to represent properties in this way; the ones we do not use are the urelements or atoms. This is analogous to a process with which we are all familiar: it resembles the process of assigning semantics to general terms (which might be regarded as common nouns, adjectives, or verbs) in a language. So we speak of the construction
of our model of the theory metaphorically as the construction of a kind of “language”.

If we have a preexisting notion of “natural property” which we are trying to represent in this way (we suppose that a natural notion of ordered pair is provided so that we can identify natural relations with certain natural properties in the usual way), one relation which we cannot expect to be natural is the relation which holds between x and y iff x has the natural property represented by y: the semantic relations of the language can be expected to be “arbitrary relations” from the standpoint of the original notion of natural property (and relation) we started with. If the notion of natural property (and relation) satisfies reasonable closure properties under logical operations, this is provable (we can construct the paradox of heterological adjectives).

The stratification criterion does not disappoint us too much here: the relation ∈ is not a set relation. But in the light of this consideration, the stratification criterion for comprehension and thus naturalness of properties looks uncomfortably strong: although ∈ is not a set relation, many properties and relations defined in terms of ∈ are required to be natural properties and relations by the stratification criterion.

Contemplation of the proof of the Meta-Theorem of finite axiomatization of NFU can be useful here. Lemma 1 expresses a closure property which we expect a reasonable notion of naturalness of properties to have: essentially, it ensures that properties and relations first-order definable in terms of natural properties and relations will be natural properties and relations. If we look at the proof of the Meta-Theorem, we see that the crucial fact is that inclusion (subsethood) is a natural relation (used to define the set relation E used to handle membership in the proof) and that natural relations on individuals induce natural relations on their singletons and vice versa (this is seen in the use of hierarchies of relations R^n induced by repeated application of singleton image in the definition of the modified formula φ′, and in the use of set union at the last step to collapse a set of iterated singletons of elements of the desired set to the desired set).

We suggest an intuitive interpretation of the subset relation which is not semantic in character at all: the relation of subset to set can very reasonably be understood to be the restriction to sets of the relation of part to whole. This is not a semantic relation, and it is quite reasonable to suppose that it is a natural relation. Because x ∈ y ≡ {x} ⊆ y, it is clear that all that is needed now is an account of the singleton construction.

Further, our interpretation of what is going on in the construction of sets also suggests a metaphor for the singleton relation: the singleton set {x} corresponds to the natural property of being x (the predicate of x-hood). We actually adopt a different metaphorical interpretation of {x} which might seem surprising: we will call {x} the “name” of x in the “language” we are constructing. In doing this, we are reversing a common maneuver for eliminating proper names a from formal logic: replacing proper names a with references to the predicate “being a” which applies only to a. Another way to see that it is reasonable to interpret {x} as a proper noun standing for x is to interpret sets as common nouns (as if we were to associate the word “man” with the set of men); the proper noun “Peter” might then be seen to be associated with the set consisting only of Peter.

So we propose to reduce set theory to semantics and mereology (the study of the relation of part to whole). Something very like our reduction of set theoretical primitives to the subset relation (understood as the relation of part to whole) and the singleton relation is found in the works of David Lewis (notably [18]). We came up with this idea independently of Lewis (though we benefited subsequently from study of his development) and we provide an explanation of the role of the singleton relation where he treats it as something mysterious. Lewis analyzes standard foundations in ZFC (where such an analysis is equally valuable) rather than Quine-style set theory.

It is not to be expected that the relation between objects and their names will be a natural relation (and indeed we will prove below that it is not); it shares in the “arbitrariness” of the whole procedure of assigning referents to symbols in a language.

Our identification of the relation of subset to set with the relation of part to whole on sets allows us to give an intuitive picture of the properties of this “language”. Singletons are disjoint (set-theoretically), so they are non-overlapping (mereologically). Singletons have no proper (nonempty) subsets which are sets, so they are “atomic” (at least among sets: they have no proper (and non-null) parts which are sets). We regard the empty set as the null part, about which we have no qualms (though we can fix this picture to satisfy such qualms by regarding the empty set as representing a particular atomic object and each singleton as the fusion of another
atomic object with the atomic object representing the empty set).

We introduce a basic notion of mereology:

Definition: The fusion of all objects with property P is defined as the uniquely determined object which contains all objects with property P as parts and is part of every object which contains all objects with property P.

We make the mereological assumption that the fusion of all individuals with any natural property P is an individual. For the record, we also assume that the relation of part to whole is reflexive, antisymmetric, and transitive (a weak partial order).

A set is then presented to us as being the fusion of the names of all its elements. If P is a natural property, the property “being the name of something with P” is also a natural property (see the next paragraph), and by our mereological assumption there is an individual which is the fusion of all individuals with the latter property. This individual is taken to represent the property P. The role of the pairwise non-overlapping names (singletons) here is crucial. If we tried to use the fusion of the objects with property P to represent P, we would not be able to recover P unambiguously from the object representing it. For example, the fusion of all human beings and the fusion of all human cells are the same object (mod quibbles about tissue cultures and such; the point should be clear, though).

The crucial further property which makes everything work (as we can see by examining the proof of the Meta-Theorem) is that every natural property of individuals corresponds to a natural property of names and vice versa (and the same for relations – the axioms of singleton image and set union do this work in the axiom set) – even though the semantic relation between objects and names is not a natural relation. We are not saying that an individual and its name have the same properties – we are saying that for each natural property P, “being the name of something that has P” and “being the referent of something that has P” are also natural properties, and similarly for relations. The “world” of names is a “model” of the whole world for natural properties and relations: the structure consisting of the names is a kind of microcosm.

We suggest to the reader that this picture of the construction of a kind of “language” to express natural properties can be taken to be intuitively appealing; the obvious question about it is whether it is logically possible to do it. The development in the paper shows that if it is possible to model the many-sorted TTU (for which we have good intuition) it is possible to model the more problematic NFU. It should not be surprising that it takes nontrivial mathematical reasoning to show that this is possible.

13. The intuitive picture summarized and the types revisited

We summarize our intuitive picture. We start with a world of individuals and a notion of natural property and relation. Natural properties and relations are closed under first-order definability. Among the natural relations are the relations of equality, projections, the relation of part to whole, and a well-ordering of the universe. We attempt to develop a language in which each natural property and relation will be represented by an individual. We choose a relation of name to referent with the property that names are non-null and pairwise non-overlapping and that the natural properties and relations on names correspond precisely to the natural properties and relations on the corresponding referents. We assume that the fusion of all individuals with any given natural property is an individual. We represent each natural property by the fusion of the names of the individuals of which it holds; this will be an individual by an application of the last two assumptions. Recapitulation of the proof of the Meta-Theorem under a suitable interpretation establishes that we have a model of NFU.

Note that this picture is no more complicated than the intuitive picture of the cumulative hierarchy of types (recall that the intuitive picture of the cumulative hierarchy involves an account of the notion of the power set and of general ordinals). The disadvantage it has is that it is less obvious that the picture can be realized (though it is not obvious to everyone that the cumulative hierarchy can be realized).

Notice also that this picture does not refer in any obvious way to types. Realization of this picture seems to be a reasonable thing to try in an untyped universe (or at least, one with only two types – a type of individuals and a type of natural properties). Yet we obtain an interpretation of full type theory. What is the meaning of the types if they are not to be taken (as they clearly cannot be taken here) to be disjoint sorts of object?
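
As a compact restatement of the mereological part of the picture summarized in this section, the fusion definition and the representation of properties by fusions of names can be written symbolically. The symbols ≤ (part of), Fu (fusion) and ι (name of) are notation assumed here purely for illustration and are not the paper's own; this is a sketch under those assumptions, not an official formalization.

```latex
% Notation assumed for illustration (not the paper's own):
%   x \le y          : x is part of y (reflexive, antisymmetric, transitive)
%   \iota(x)         : the name (singleton) of x
%   \mathrm{Fu}(P)   : the fusion of all objects with property P
\begin{align*}
F = \mathrm{Fu}(P) \;&\iff\; \forall x\,\bigl(P(x) \to x \le F\bigr)
   \;\wedge\; \forall z\,\Bigl(\forall x\,\bigl(P(x) \to x \le z\bigr) \to F \le z\Bigr) \\
\text{object representing } P \;&=\; \mathrm{Fu}(Q_P),
   \qquad Q_P(u) \;:\iff\; \exists x\,\bigl(P(x) \wedge u = \iota(x)\bigr) \\
x \in y \;&\iff\; \iota(x) \le y \qquad (y \text{ a set})
\end{align*}
```

If two objects F and F′ both satisfy the first condition for the same P, then each is part of the other, so antisymmetry of the part-whole relation gives F = F′; this is the sense in which the fusion is uniquely determined whenever it exists. The last line restates x ∈ y ≡ {x} ⊆ y with inclusion read as the part-whole relation restricted to sets.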
Once again, this is best discovered by looking at the proof of the Meta-Theorem. Specifically, we consider the relation of the formula φ′ in that proof to the formula φ. In order to obtain the formula φ′ which is entirely expressed in terms of natural relations from a stratified formula φ mentioning the non-natural relation ∈, we need to replace talk of each object x mentioned in the proof with talk of its (N – type(x))-fold singleton. In terms of our intuitive picture, we replace talk of x with talk of its (N – type(x))-fold name.

When we look at an object in our intuitive picture, we can use it in various ways. We can use it directly. We can use it to refer to that (if anything) of which it is the name. And, further, we can iterate reference: we can look at the referent of its referent, the referent of the referent of its referent, and so forth. What the analysis in the previous paragraph reveals is that this is the function of the relative types in stratified comprehension from our untyped viewpoint. A stratified formula is one in which each variable x is used to refer to objects at the same iterated level of reference ((type(x))-fold reference) wherever it appears – knowledge of the relations between the different levels of reference of the same object requires knowledge of the reference relation itself, which is not a natural relation (we cannot expect to represent it internally to our language). Stratification is seen to be a natural property to expect of definitions of properties using the reference relation – those definitions which do not try to “diagonalize” on the reference relation succeed in defining natural properties. The reason that we are able to define any sets at all in terms of the reference relation has to do with the parallelism between natural properties of objects and natural properties of their names. We are also helped in defining sets in terms of membership (∈) by the part that the natural relation inclusion (⊆) plays in our understanding of membership.

Moreover, types of individual variables are seen to be genuinely “relative” rather than absolute (if we increase the level of reference uniformly nothing changes, because of the perfect correspondence between properties of names and properties of referents).

The fact that there is a hierarchy of types is not surprising if we look at the fact that the world of “names” is postulated as a “microcosm” within the world of individuals: within the world of “names” as a model of the whole world, we would expect then to find a model of the world of “names” which can be viewed as an even smaller model of the whole world, and within that a yet smaller model of the world of “names”, and so forth. If we take the first n of these models and the natural “reference relations” among them, we get a structure corresponding exactly to the natural model of TTU_n we presented in an earlier section! The full infinite sequence of nested models cannot be a set (we’ll discuss this further in the next section).

14. Fitness for mathematical applications

NFU is fit for mathematical applications. An extended study of this can be found in my [13]; one can also look at Rosser’s [24]; the development there is in NF, but it is readily adapted to NFU (with a primitive type-level ordered pair replacing the definable Quine pair used in NF).

Features of the implementation of mathematics in NFU which might be found appealing or instructive are the use of Frege’s definition of the natural numbers (the number 3 is the set of all sets with three elements) and the Russell-Whitehead definitions of cardinal and ordinal number. It does not seem to be as widely known as it should be that these definitions do not in and of themselves lead to paradox. Of course, the fact that NFU has a universal set has the same instructive quality.

There are also features of the implementation of mathematics in NFU which may be found annoying or even pathological. Some of these can be discovered by looking at the resolutions of the paradoxes in NFU.

The Russell paradox is readily averted by the stratification criterion for comprehension: x ∉ x is not stratified, so we do not have to think about the embarrassing {x | x ∉ x}.

The Cantor paradox results from applying Cantor’s theorem |A| < |P(A)| (the cardinality of a set is strictly less than that of its power set) to the universal set. It is clear that |P(V)| ≤ |V|! The resolution lies in the fact that the form of Cantor’s theorem provable in TTU is |P_1(A)| < |P(A)|: the set of one-element subsets of A is strictly smaller in size than the set of all subsets of A. We believe this, but we also tend to think that |A| = |P_1(A)|. However, Cantor’s theorem tells us that |P_1(V)| < |P(V)| ≤ |V|; so it cannot be the case that |V| = |P_1(V)|. We see that the singleton map is not a set function (from this it follows that the relation of name to referent in our intuitive picture is not a natural relation; we suggested that this would be the case).

We will later find occasion to use this
Definition: Cantor’s theorem in its original form would hold if |A| = |P_1(A)| were true; we call sets A with this property cantorian sets. Of more interest to us are the strongly cantorian sets, which we define as those sets A such that the restriction of the singleton map to A is a set function.

The Burali-Forti paradox results from considering the order type of the natural order on the ordinals. The natural order on the (Russell-Whitehead) ordinal numbers is a set relation in NFU and in fact a well-ordering; thus it has an order type Ω. The Burali-Forti paradox results from applying the theorem of naive set theory (and ZFC) which asserts that the order type of the natural order on the ordinals less than an ordinal α is α. This would force Ω to sit past the end of the sequence of all ordinals, which is absurd! But this is not a theorem of TTU or NFU: the order type of the natural order on the ordinals less than α in TTU is an ordinal two types higher than α: if W is a well-ordering of type α, then (W^ι)^ι (the order on double singletons induced by W) has the order type of the natural order on the ordinals less than α. (We use the notation T(α) for the order type of W^ι, thus T^2(α) for the order type of (W^ι)^ι). Since the order type of the natural order on the ordinals less than Ω is obviously strictly less than Ω, which is the order type of all the ordinals, we have T^2(Ω) < Ω. T (and so T^2) respects the natural order on the ordinals, so repeated application of T^2 (or of T) to Ω produces an externally countable decreasing sequence of ordinals; this clearly cannot be a set, so T^2 (and thus T) is not a function. In some external sense, the ordinals of NFU are not well-ordered; this has been presented as a serious criticism of NF. The “sequence” T^i(Ω) of ordinals is a relic in NFU of the “hall of mirrors” in TTU; similar phenomena occur in the cardinal numbers.

The peculiar phenomena found in the resolutions of the paradoxes are certainly surprising to a student of standard set theory. However, our intuitive motivation for this style of set theory should already have suggested that some arbitrary collections of sets would be found not to be sets. The fact that the singleton map is not a function is quite natural in terms of our intuitive motivation. This allows V and P_1(V) (which are externally isomorphic) to have different cardinalities “internally” to NFU: in terms of our intuitive motivation, |V| is the number of individuals and |P_1(V)| is the number of names: it is entirely reasonable in terms of our intuitive picture that the realm of names turns out to be “smaller than” (though externally isomorphic to) the universe. The fact that the ordinals are not externally well-ordered is another example of the same phenomenon: the definition of a well-ordering provides that any nonempty subset of the domain of the well-ordering has a least element, not that any subclass of the domain has a least element. The existence of a countable proper class is quite arresting, but it is particularly arresting to us because our intuition is trained by a system of set theory motivated by “limitation of size”; we believe that small collections should be sets. This is not part of the motivation of NFU.

We report from our own experience that working in NFU does allow one to develop some intuition about the properties one can expect sets to have in this system; in fact, we believe that once one understands what is going on, one finds that NFU is not nearly as far from ZFC in its outlook on the mathematical universe as one might suppose. This is the subject of the next section.

15. Mutual reflections

In this section, we will discuss the relationships between the views of the mathematical universe embodied in TTU, NFU, and ZFC.

There is a nice interpretation of NFU in terms of the cumulative hierarchy of ZFC, which our goal of showing the autonomy of foundations in NFU did not allow us to present as our official consistency proof of NFU. If one takes a nonstandard model of (enough of) ZFC with an external isomorphism j between a rank V_{α+1} of the cumulative hierarchy with infinite index and a lower rank V_{j(α)+1} (this is usually obtained by providing an external automorphism of the entire model moving an ordinal downward, but an application below requires us to state the result with more generality) then one readily obtains an interpretation of NFU. The domain of the model of NFU will be the set V_α in the nonstandard model of set theory (here we use the notation V_α to stand for the rank indexed by α in the cumulative hierarchy of sets in ZFC; this should not be confused with the notation V^n for the set of n-fold singletons used above). The membership relation x ∈_new y of the model will be defined as j(x) ∈ y ∧ y ∈ V_{j(α)+1}. Projection relations are available because α is infinite, and we can suppose ourselves provided with a well-ordering because the ambient theory has choice. We do not present a proof here that this is a
model of NFU; the flavor of the proof is very similar to that of the proof of the Meta-Theorem above (the idea is to exhibit a transformation which eliminates the external function j from translations of stratified instances of comprehension, so that they can be seen to define sets in the underlying nonstandard model of set theory – see [5], pp. 68–69 for details). Work with this interpretation of NFU in ZFC (and similar theories) allows one to see that the viewpoint of NFU is not really profoundly different from that of ZFC.

It is also easy to interpret set theory in the style of Zermelo in NFU. The idea (due to Roland Hinnion in [10] – a treatment is found in [13]) is to interpret sets in Zermelo-style set theory as isomorphism classes of well-founded extensional relations with top elements: in other words, a set in the Zermelo-style set theory is represented by the isomorphism class of the membership relation restricted to its transitive closure (strictly speaking, the transitive closure of its singleton under the membership relation). The isomorphism classes of well-founded extensional relations make up a set, and there is a natural “membership” relation on these isomorphism classes (viewed as pictures of sets) which is also a set relation. If the whole domain of isomorphism classes is used to interpret set theory in this way, one always obtains a model of ZFC – Power Set in which Power Set is actually false (there is a largest cardinal). But if the ambient NFU is augmented with strong assumptions, restricting the domain of the interpreted set theory may yield a model of Zermelo set theory or ZFC. One sometimes restricts oneself to isomorphism classes of strongly cantorian well-founded extensional relations; in this case the domain of the interpreted Zermelo-style set theory is a proper class (the predicate “strongly cantorian” is unstratified), but there are technical advantages to doing this. It may be instructive to see that Zermelo-style set theory can be understood as the theory of isomorphism classes of well-founded extensional relations, independently of any consideration of NFU.

The interpretation of NFU in ZFC that we have described can be carried out naturally in Zermelo-style set theory as interpreted inside NFU! The point is that the model of Zermelo-style set theory one obtains inside NFU has an external endomorphism (isomorphism with a proper substructure of itself) which is definable in NFU, and can be used to define an external isomorphism between ranks of the cumulative hierarchy as above. For any well-founded extensional relation W with top, one can see that W^ι (the relation on singletons induced by W) is also a well-founded extensional relation with top, which is not necessarily isomorphic to W. If the isomorphism type of W is x, we denote the isomorphism type of W^ι by T(x). T is easily seen to be an (external) endomorphism of the isomorphism types. T can be proved to map some rank of the cumulative hierarchy in the interpreted Zermelo-style set theory onto a lower rank. We then construct an interpretation (not a set model; there is no problem with Gödel’s second incompleteness theorem here) of NFU in the set model of Zermelo-style set theory in the same way that NFU is interpreted in ZFC above. We think that it is definitely of interest that NFU reflects the strategy for construction of models of NFU in ZFC internally.

We look at the interpretation of TTU in NFU. We have observed above that TTU provides us with a natural model of TTU_n for each concrete natural number n. This interpretation gives us a natural model of TTU_n for each concrete natural number n in NFU (natural only in the sense that all subsets in NFU of any type are realized in the next higher type). Further, we can see by comparing the construction with terminology introduced during the proof of the Meta-Theorem that in these models type i is represented by V^{(n - i) - 1} (the set of ((n – i) – 1)-fold singletons) and membership of type i in type i + 1 is represented by E^{(n - i) - 1} (where E is the relation introduced in Lemma 2). These models are all substructures of the same proper class model of a “type theory” with a top type but no bottom type. Type theory with types indexed by negative integers (or all integers) is consistent by a compactness argument (any proof in type theory uses only finitely many types, so any proof of a contradiction in the theory of negative types would immediately yield a contradiction in standard type theory – this is due to Hao Wang in [29]). This structure is not a set in NFU (if it were, we would fall afoul of Gödel’s second incompleteness theorem, because the theory of this structure is as strong as TTU and so as NFU itself), and the sequences of its types and membership relations give new examples of countable proper classes. This is a better example than the descending sequence in the ordinals of how the “hall of mirrors” effect in TTU can be rediscovered in NFU. An elegant and surprising application of the properties of this structure in NFU, which is also applicable to NFU, is found in [21].

Finally, we compare NFU and ZFC foundations in the light of our intuitive motivation for set theory using
semantics and mereology, which is equally applicable to either theory.

The intuition behind the cumulative hierarchy looks perhaps a little less convincing (at least, one sees more clearly the enormity of the claim underlying this picture) when one presents it in terms of this interpretation of set theory. The thing which appears more remarkable in this interpretation is the construction of power sets at successor stages. Suppose one has constructed V_α. V_{α+1} is the set of all subsets of V_α; that is, it is the collection of all fusions of singletons of elements of V_α. Given the singletons, the construction of all the fusions representing sets of elements of V_α is nothing remarkable; one can quite reasonably adopt the view that any fusion of objects whatsoever is an object. But the construction of the singletons themselves should give us pause. V_α is a collection of objects which may overlap one another in very complicated ways: we implicitly claim that we can produce a collection of pairwise non-overlapping objects in one-to-one correspondence with any V_α. The intuition behind this must be that we have a truly inexhaustible supply of distinct atomic objects somewhere; this may not be an unreasonable assumption (there’s no reason to think it leads to paradox) but it should give us pause.

The intuition behind TTU might be mistakenly taken to be the same as that behind ZFC: this is because one might suppose that we are simply building V_{ω+ω} in ZFC. But the subtlety of the theory of types is that we make no assumptions about whether the objects at the next type are “new” or not. One does seem to be more or less forced to identify the finite well-founded sets in different types as being in some sense “the same” (or else one has to complicate the semantic relations in the intuitive picture), but what sorts of identifications may exist beyond that are open. One could suppose that each type n + 1 properly extends type n (as in the interpretation in V_{ω+ω}) but one could equally well (and with some philosophical justification) suppose that each type n + 1 is a proper subset of type n. Type 0 could be taken to be the universe at the outset – then one might suppose that the natural properties of the universe were coded in a small subset of the universe (in the sets of a model of NFU, say) and the natural properties of this subset of the universe were coded in a smaller subset still, etc.

The intuition behind NFU is seen to be quite daring (the nontrivial part is the idea that there is a domain of “names” natural properties on which reflect the natural properties on the universe precisely) but it has its own natural origin as a refinement of TTU, and a little mathematical work in TTU shows that it can be realized.

We believe that we have adequately established that NFU can be understood as an autonomous view of the world of mathematics.

16. Extensions

A fundamental (and philosophically appealing) characteristic of the foundational picture behind ZFC is that it is “self-extending”. If one understands the iterative hierarchy of ranks in ZFC, it seems entirely reasonable that there would be an inaccessible cardinal, and this intuition continues to support one (perhaps with increasing doubts) as stronger and stronger large cardinal axioms are considered.

Foundations in NFU have the same characteristic. Formal features of the theory suggest extensions of the theory which turn out to be consistent (on reasonable assumptions) and usually surprisingly strong.

Natural extensions of NFU tend to involve the notions of cantorian and strongly cantorian sets introduced above. We will review a few extended versions of NFU to get a flavor of what is going on.

The reason why “strongly cantorian” proves to be an important notion in formulating extensions of NFU has to do with the fact that stratification restrictions on the formation of sets {x | φ} can be avoided to some extent where variables restricted to strongly cantorian sets are involved. If A is a strongly cantorian set, then there is a function ι_A such that ι_A(x) = {x} for all x ∈ A, and also a function ι_A^{-1} such that ι_A^{-1}({x}) = x for all x ∈ A. It follows that we can lower the relative type of any variable x restricted to A by replacing it with the equivalent term ι_A^{-1}({x}), and raise its relative type by replacing it with the equivalent term ⋃ι_A(x) (this notationally compact formulation for type-raising only works if all elements of A are sets, but there is no essential difficulty in referring to “the element of ι_A(x)” in case x might be an urelement). Either of these operations can be iterated, so the type of a variable restricted to a strongly cantorian set can be reset to any desired value; the types of such variables can be ignored in determining stratification.

An extension of NF which recommended itself to Rosser in [24] and is equally appropriate for NFU is embodied in the following “obviously true”:
Axiom (Rosser): For each natural number n, {1, ..., n} has n elements.

Rosser called this the Axiom of Counting; it expresses our intuitive notion that counting can be effected by putting finite sets into one-to-one correspondence with initial segments of the natural numbers in a sensible way. It might seem that one could prove this as a theorem by induction on n, but it happens that the condition {1, ..., n} ∈ n is not stratified. It is sufficient on the basis of the discussion above to assume that N is strongly cantorian – then the set of all natural numbers satisfying the property in the Axiom of Counting would exist and would have to be all of N. It turns out that in fact the Axiom of Counting is equivalent to the assertion that N is strongly cantorian (and that there are models of NFU in which it does not hold).

The Axiom of Counting has surprisingly strong consequences in set theory. NFU proves the existence of ℵ_n for each concrete n, but does not prove the existence of ℵ_ω. NFU + Counting proves the existence of ℵ_{ℵ_n} for each n, which is a surprising gain in strength.

An innocent-seeming axiom proposed in [11] by C. Ward Henson in the context of NF (in a slightly different form) is the

Axiom of Cantorian Sets: Each cantorian set is strongly cantorian.

This implies Counting because NFU proves that N is cantorian. The axiom says that any set which is the same size as its image under the singleton map actually supports the restriction of the singleton map as a set function, which seems reasonable.

Theorem (Solovay): NFU + Axiom of Cantorian Sets has the exact consistency strength of ZFC + the scheme which asserts for each concrete natural number n that there is an n-Mahlo cardinal.

This is a surprising level of strength! We prove in [14] that NFU + Axiom of Cantorian Sets really does prove that there are n-Mahlo cardinals for each concrete n (Solovay showed that they existed in an appropriate version of the constructible universe).

Our intuitive picture tells us that a strongly cantorian set is a set on which we are given the relation of name to referent as a natural relation. We have seen that this allows the relative type of a variable restricted to this set to be ignored. It may suggest the stronger idea that “arbitrary” subclasses of strongly cantorian sets might be sets. We present an axiom along these lines.

Definition: A strongly cantorian ordinal is the order type of a strongly cantorian well-ordering (a strongly cantorian ordinal will not itself be a strongly cantorian set – it will be a very large set because it is an isomorphism class).

Axiom of Small Ordinals: For any formula φ there is a set A such that the elements of A which are strongly cantorian ordinals are precisely the strongly cantorian ordinals x such that φ; in other words, the intersection of any definable class with the strongly cantorian ordinals is the intersection of some set with the strongly cantorian ordinals.

Theorem (Holmes, closely following Solovay): The consistency strength of NFU + Counting + Small Ordinals is exactly that of ZFC – Power Set + “there is a weakly compact cardinal”.

An intuitively reasonable extension of NFU once again gives considerable consistency strength (note that this theory is stronger than ZFC, in spite of the omission of Power Set, because a weakly compact cardinal is inaccessible). For the proof, see Solovay’s [25] and my refinement in [14]. A refinement of this theory (described in [13]) gives the strength of second-order ZFC with a measure on the proper class ordinal (as proved in our pending [14]).

The Axiom of Small Ordinals suggests that the sequence of strongly cantorian ordinals is to be identified with the ordinals of ZFC. Comparison of the definition of the ordinals in ZFC with that in NFU is interesting. The class of von Neumann ordinals cannot be a set in any set theory (it is a paradoxical totality like the Russell class). One cannot even prove the existence of the von Neumann ω in NFU (its definition is unstratified). However, the existence of infinite von Neumann ordinals is consistent with NFU. Any von Neumann ordinal must be strongly cantorian, and it is easy (using permutation methods – see [5]) to convert any model of NFU to a model of NFU with a von Neumann ordinal corresponding to each strongly cantorian Russell-Whitehead ordinal. This suggests that it is the “limit” of the proper class of strongly cantorian ordinals rather
than a “big” set ordinal like Ω which corresponds (as it were) to Cantor’s Absolute.

The purpose of this section is to suggest that the approach to foundations based on NFU may have considerable mathematical power. The axioms adjoined to NFU here are natural statements to consider in the context of NFU (and its intuitive picture); one could consider systems like NFU + “there is an inaccessible cardinal”, but these would not witness the autonomy of NFU foundations, since “there is an inaccessible cardinal” is an axiom suggested by the ZFC viewpoint. Notice that NFU + Axiom of Cantorian Sets does prove that there is an inaccessible cardinal (and more).

17. Review of philosophical objections

In this section, we review common objections to NFU of a philosophical nature.

There are objections to NFU which apply to ZFC as much as they apply to NFU, and so are not directly relevant to our purpose here. Both theories assume the axiom of choice, for which we will not apologize. Both theories are classical rather than constructive. There has been a little work on intuitionistic NF (which is not known to be consistent – see for example [4]) and as far as we know none on intuitionistic NFU. Both theories are impredicative. In this connection it might be worth noting that if one restricts the comprehension scheme of TTU to prevent the occurrence of variables of higher type than that of the set being defined, and further assumes strong extensionality, it is known that the type structure of the resulting theory TTI can be collapsed to obtain an untyped theory NFI. A further refinement of the restriction on comprehension in TT, forbidding bound variables of the same type as the set being defined in instances of comprehension, gives a truly predicative type theory TTP which can be collapsed to an untyped theory NFP. The untyped theories are no stronger than the related predicative type theories (as long as Infinity is assumed in the typed theories: NFI is exactly as strong as second-order arithmetic, while NFP is weaker than first-order arithmetic but does prove Infinity), but it is not clear that they can really be viewed as predicative theories from a philosophical standpoint: they do admit self-membered sets, for example. These results are discussed in Marcel Crabbé’s paper [2].

The theory NFP is finitely axiomatized by the axiom set we give for NFU, modified by the addition of strong extensionality and the omission of Set Union. The proofs of Lemmas 1 and 2 do not use the set union axiom. The proof of the Meta-Theorem requires for success that N = type(x); parameters of type one higher than that of x (but not bound variables) are harmless; atomic sentences involving these parameters do not need to be translated because they are already in a form which can be handled by Lemma 1. The intuitive picture we gave for NFU works for NFP as well, with the modification that we assume that all properties of referents are reflected by properties of names, but we do not assume that all properties of names are reflected by properties of referents (strong extensionality corresponds to the further assumption that all individuals are fusions of names).

Some object to the urelements. Forster has gone so far in his otherwise excellent recent survey [6] of NF as to suggest that to adopt NFU as a foundation is to “betray set theory”. We fail to see any particular philosophical appeal in restricting oneself entirely to pure sets. Zermelo’s original formulation of set theory admitted atoms (and also admitted non-well-founded sets). The step from ZFA to ZF is technically easy – one can define the class of pure sets (the definition involves recursion on the membership relation) and restrict one’s attention to pure sets thereafter (and this has technical advantages). In NFU no such restriction to pure sets is possible. The definition of the class of pure sets is inherently unstratified – no construction involving recursion on the membership relation can be expected to succeed in NFU.

It is surprising that the existence of urelements is actually provable in NFU (using Choice). But it is no more than surprising; it doesn’t cause any fundamental problems for the user of the theory. The natural methods of constructing models of NFU in ZFC, TTU or NFU itself all involve the creation of urelements anyway. Of course Specker’s disproof of Choice in NF (logically equivalent to the disproof of strong extensionality in NFU with Choice) is a serious problem for one who supports foundations in NF – which we do not. (See [26] for this result, or [5], p. 50; it may also be instructive to look at my proof that there are atoms in [13], which makes use of the Axiom of Counting to simplify Specker’s argument).

Another objection to the urelements is that they are a large collection of apparently indistinguishable objects with no obvious function in the theory (except avoid-
FOUNDATIONS OF MATHEMATICS IN POLYMORPHIC TYPE THEORY 51
ance of contradiction). This is by contrast with ZFC, the types in models of type theory is an unnatural maneuver
standard model of which is “rigid” (all objects in a may be given pause by the fact that all infinite models
model of ZFC are in some sense distinguishable from of TTU3 (including models of TT3) satisfy the ambiguity
one another). We observe that a taste for rigid structures scheme (of course, we can only define φ ≡ φ+ for
is relatively recent. If we look at the early history of sentences φ which do not mention the top type). This
mathematics, we see arithmeticians studying a rigid implies that the theories NF3 and NFU3 in which only
structure, but geometers studying a homogeneous struc- those instances of the comprehension scheme are
ture: the points of Euclidean space are indistinguishable assumed which can be stratified with three types have
from one another. The implementation of Euclidean lots of models. NFU4 is equivalent to NFU (and also
space as R2 in ZFC is “rigid”, but it is not clear that NF4 = NF). (These are results of Grishin (see [8]), but
Euclidean space really comes with a preferred set of his papers are not very accessible: it is better to look at
Cartesian axes built in. The distinct elements [5], p. 65). There are nice consistency proofs for both
of type 0 in Russell’s theory of types have a similar NFI and NFU (due to Marcel Crabbé) which exploit the
homogeneous character. ambiguity of models with three types.
Philosophers should notice that there do seem to be Finally, there is the objection that attention to strat-
non-sets in the real world, and even that the world seems ification is somehow an unnatural constraint on math-
to support some homogeneity (as between points of ematical work. We think that in fact most mathematical
space and moments of time). constructions are naturally stratified, mod a universal
There is a large class of objections to phenomena in tendency to confuse objects with their singleton sets
NFU which reflect the fact that it is not motivated by (which is of course deplorable from the NFU stand-
the “limitation of size” doctrine. Because NFU is a set point). An occasional but not uncommon inconvenience
theory with a universal set, it is possible for a set (e.g., in mathematical constructions in NFU is that one finds
V ) to have a proper subclass (e.g., the Russell class). oneself explicitly using the singleton construction or
The fact that NFU further admits countable proper an operation like P1 or the T operations on cardinal or
classes is quite arresting. But the reasons that these ordinal numbers to fix a stratification problem. On the
proper classes are not sets are clear in terms of the intu- other hand, the most commonly used unstratified con-
itive picture of the theory we have presented (they struction, the construction of the von Neumann ordinals
involve recursion along the “arbitrary” relation of name (including natural numbers) is quite unnatural by com-
to referent). The failure of the ordinals of NFU to be parison with the Russell-Whitehead constructions used
externally well-ordered is another phenomenon of the in NFU (one feels this especially strongly when teaching
same kind (something like the Axiom of Small Ordinals it to students), though it is undeniably convenient.
can recover sensible behavior on the part of small We feel that this objection may have some merit:
ordinals, while our intuitive motivation suggests that the Zermelo style foundations may be slightly simpler on a
behavior of “big” ordinals can be expected to be odd). technical level than Quine-style foundations. But Quine-
All of these objections are answered at least to some style foundations are technically feasible (not just usable
extent by the observation that the underlying motiva- in principle). Our experience indicates that this approach
tion of NFU does not require (indeed forbids) us to to foundations can be used in practice.
assume that each arbitrary extension is realized by a set. In connection with the “naturalness” of the stratifi-
There are other set theories, notably the set theory IST cation criterion, it is natural to mention Forster’s elegant
used as a foundation for nonstandard analysis, which theorem to the effect that the stratified predicates are
admit (and even exploit) the possibility of proper exactly those sentences which are invariant under the
subclasses of sets. redefinition of the membership relation using “setlike”
Mathematicians seem particularly uncomfortable permutations in any extensional structure for the
with the idea that A and P1(A) can be of different sizes language of set theory, which strongly suggests that
in NFU. In our experience, this particular phenomenon stratification is a natural property for reasons not
is something one can get used to; one can develop an obviously related to those given here (we don’t consider
intuitive feel for the difference between set bijections it to be entirely unrelated). The details can be found in
and those that involve a shift of type. [5], p. 94.
Those who think that the identification of successive Our aim in this paper is not to suggest any change
in mathematical practice. ZFC foundations are quite satisfactory, in our opinion. But if one wants to understand what foundations are and what they do for us, it is useful to be aware that there are other possible approaches.

References

1. Boffa, M.: 1977, ‘The Consistency Problem for NF’, Journal of Symbolic Logic 42, 215–220.
2. Crabbé, Marcel: 1982, ‘On the Consistency of an Impredicative Subsystem of Quine’s NF’, Journal of Symbolic Logic 47, 131–136.
3. Crabbé, Marcel: 1992, ‘On NFU’, Notre Dame Journal of Formal Logic 33, 112–119.
4. Dzierzgowski, Daniel: 1993, ‘Le théorème d’ambiguïté et son extension à la logique intuitionniste’ (in French), doctoral thesis, Catholic University of Louvain.
5. Forster, T. E.: 1995, Set Theory with a Universal Set, an Exploration of an Untyped Universe, 2nd ed., Oxford Logic Guides, No. 31, OUP.
6. Forster, T. E.: 1997, ‘Quine’s NF, 60 Years On’, American Mathematical Monthly 104(9) (November), 838–845.
7. Foundations of Mathematics mailing list: Steve Simpson, moderator, home page with archive at http://www.math.psu.edu/simpson/fom/
8. Grishin, V. N.: 1969, ‘Consistency of a Fragment of Quine’s NF System’, Sov Math Dokl 10, 1387–1390.
9. Hailperin, T.: 1944, ‘A Set of Axioms for Logic’, Journal of Symbolic Logic 9, 1–19.
10. Hinnion, Roland: 1975, ‘Sur la théorie des ensembles de Quine’, Ph.D. thesis, ULB Brussels.
11. Henson, C. W.: 1973, ‘Type-raising Operations in NF’, Journal of Symbolic Logic 38, 59–68.
12. Holmes, M. Randall: 1995, ‘The Equivalence of NF-style Set Theories with “Tangled” Type Theories; The Construction of ω-models of predicative NF (and More)’, Journal of Symbolic Logic 60, 178–189.
13. Holmes, M. Randall: 1998, Elementary Set Theory with a Universal Set, vol. 10 of the Cahiers du Centre de logique, Academia, Louvain-la-Neuve (Belgium).
14. Holmes, M. Randall: ‘Strong Axioms of Infinity in NFU’, to appear in the Journal of Symbolic Logic.
15. Holmes, M. Randall and J. Alves-Foss: ‘The Watson Theorem Prover’, preprint, available at the Watson home page http://math.boisestate.edu/~holmes/proverpage.html
16. Holmes, M. Randall: 2000, ‘A Strong and Mechanizable Grand Logic’, to appear in J. Harrison and M. Aagaard, Theorem Proving in Higher Order Logics: 13th International Conference, TPHOLs 2000, Lecture Notes in Computer Science, vol. 1869, Springer-Verlag, pp. 283–300.
17. Jensen, Ronald Bjorn: 1969, ‘On the Consistency of a Slight (?) Modification of Quine’s “New Foundations”’, Synthese 19, 250–263.
18. Lewis, David: 1991, Parts of Classes, Basil Blackwell Ltd., Oxford.
19. Mac Lane, Saunders: 1986, Mathematics, Form and Function, Springer-Verlag.
20. Mathias, A. R. D.: ‘Notes on Mac Lane Set Theory’, to appear in Annals of Pure and Applied Logic.
21. Orey, S.: 1964, ‘New Foundations and the Axiom of Counting’, Duke Mathematical Journal 31, 655–660.
22. Quine, W. V. O.: 1937, ‘New Foundations for Mathematical Logic’, American Mathematical Monthly 44, 70–80.
23. Quine, W. V. O.: 1945, ‘On Ordered Pairs’, Journal of Symbolic Logic 10, 95–96.
24. Rosser, J. Barkley: 1973, Logic for Mathematicians, 2nd edition, Chelsea, New York.
25. Solovay, Robert: ‘The Consistency Strength of NFUB’, preprint, available through logic e-prints on the WWW.
26. Specker, E. P.: 1953, ‘The Axiom of Choice in Quine’s “New Foundations for Mathematical Logic”’, Proceedings of the National Academy of Sciences of the U.S.A. 39, 972–975.
27. Specker, E. P.: 1962, ‘Typical Ambiguity’, in E. Nagel (ed.), Logic, Methodology, and Philosophy of Science, Stanford.
28. Tarski, Alfred and Steven Givant: 1988, A Formalization of Set Theory Without Variables, American Mathematical Society, Providence.
29. Wang, Hao: 1952, ‘Negative Types’, Mind 61, 366–368.

Department of Mathematics,
Boise State University,
1910 University Drive,
Boise, ID 83725,
U.S.A.