Prog Prove
Prog Prove
Prog Prove
HO
b a s =
December 5, 2013
Contents
1 2
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Programming and Proving . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 2.1 Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 2.2 Types bool, nat and list . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 2.3 Type and Function Denitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 2.4 Induction Heuristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 2.5 Simplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 Logic and Proof Beyond Equality . . . . . . . . . . . . . . . . . . . . . . . . . . 3.1 Formulas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.2 Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.3 Proof Automation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.4 Single Step Proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.5 Inductive Denitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Isar: A Language for Structured Proofs . . . . . . . . . . . . . . . . . . . . 4.1 Isar by Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.2 Proof Patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.3 Streamlining Proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.4 Case Analysis and Induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 23 24 25 28 31 37 38 40 42 45
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
1 Introduction
Isabelle is a generic system for implementing logical formalisms, and Isabelle/HOL is the specialization of Isabelle for HOL, which abbreviates HigherOrder Logic. We introduce HOL step by step following the equation HOL = Functional Programming + Logic. We assume that the reader is used to logical and set theoretic notation and is familiar with the basic concepts of functional programming. Chapter 2 introduces HOL as a functional programming language and explains how to write simple inductive proofs of mostly equational properties of recursive functions. Chapter 3 introduces the rest of HOL: the language of formulas beyond equality, automatic proof tools, single step proofs, and inductive denitions, an essential specication construct. Chapter 4 introduces Isar, Isabelles language for writing structured proofs. This introduction to the core of Isabelle is intentionally concrete and example-based: we concentrate on examples that illustrate the typical cases; we do not explain the general case if it can be inferred from the examples. We cover the essentials (from a functional programming point of view) as quickly and compactly as possible. For a comprehensive treatment of all things Isabelle we recommend the Isabelle/Isar Reference Manual [6], which comes with the Isabelle distribution. The tutorial by Nipkow, Paulson and Wenzel [5] (in its updated version that comes with the Isabelle distribution) is still recommended for the wealth of examples and material, but its proof style is outdated. In particular it fails to cover the structured proof language Isar. If you want to apply what you have learned about Isabelle we recommend you donwload and read the book Concrete Semantics [4], a guided tour of the wonderful world of programming langage semantics formalised in Isabelle. In fact, Programming and Proving in Isabelle/HOL constitutes part I of Con-
1 Introduction
crete Semantics. The web pages for Concrete Semantics also provide a set of A L TEX-based slides for teaching Programming and Proving in Isabelle/HOL. Acknowledgements I wish to thank the following people for their comments on this document: Florian Haftmann, Ren Thiemann and Christian Sternagel.
This chapter introduces HOL as a functional programming language and shows how to prove properties of functional programs by induction.
2.1 Basics
2.1.1 Types, Terms and Formulas HOL is a typed logic whose type system resembles that of functional programming languages. Thus there are base types, in particular bool, the type of truth values, nat, the type of natural numbers (N), and int, the type of mathematical integers (Z). type constructors, in particular list, the type of lists, and set, the type of sets. Type constructors are written postx, e.g. nat list is the type of lists whose elements are natural numbers. function types, denoted by . type variables, denoted by a, b etc., just like in ML. Note that a b list means " a ( b list )", not ( a b ) list : postx type constructors have precedence over . Terms are formed as in functional programming by applying functions to arguments. If f is a function of type 1 2 and t is a term of type 1 then f t is a term of type 2 . We write t :: to mean that term t has type .
There are many predened inx symbols like + and . The name of the corresponding binary function is op +, not just +. That is, x + y is syntactic sugar for op + x y .
Terms may also contain -abstractions. For example, x . x is the identity function. Formulas are terms of type bool. There are the basic constants True and False and the usual logical connectives (in decreasing order of precedence): , , , . Equality is available in the form of the inx function = of type a a bool. It also works for formulas, where it means if and only if. Quantiers are written x . P and x . P. Isabelle automatically computes the type of each variable in a term. This is called type inference. Despite type inference, it is sometimes necessary to attach explicit type constraints (or type annotations) to a variable or term. The syntax is t :: as in m < (n ::nat ). Type constraints may be needed to disambiguate terms involving overloaded functions such as +, and . Finally there are the universal quantier and the implication =. They are part of the Isabelle framework, not the logic HOL. Logically, they agree with their HOL counterparts and , but operationally they behave differently. This will become clearer as we go along.
Right-arrows of all kinds always associate to the right. In particular, the formula A1 = A2 = A3 means A1 = (A2 = A3 ). The (Isabelle specic) notation [[ A1 ; . . .; An ]] = A is short for the iterated implication A1 = . . . = An = A. A1 ... An Sometimes we also employ inference rule notation: A
2.1.2 Theories Roughly speaking, a theory is a named collection of types, functions, and theorems, much like a module in a programming language. All the Isabelle text that you ever type needs to go into a theory. The general format of a theory T is theory T imports T 1 . . . T n begin denitions, theorems and proofs end
where T 1 . . . T n are the names of existing theories that T is based on. The T i are the direct parent theories of T. Everything dened in the parent theories (and their parents, recursively) is automatically visible. Each theory T must reside in a theory le named T .thy.
HOL contains a theory Main, the union of all the basic predened theories like arithmetic, lists, sets, etc. Unless you know what you are doing, always include Main as a direct or indirect parent of all your theories.
In addition to the theories that come with the Isabelle/HOL distribution (see http://isabelle.in.tum.de/library/HOL/) there is also the Archive of Formal Proofs at http://afp.sourceforge.net, a growing collection of Isabelle theories that everybody can contribute to. 2.1.3 Quotation Marks The textual denition of a theory follows a xed syntax with keywords like begin and datatype. Embedded in this syntax are the types and formulas of HOL. To distinguish the two levels, everything HOL-specic (terms and types) must be enclosed in quotation marks: ". . . ". To lessen this burden, quotation marks around a single identier can be dropped. When Isabelle prints a syntax error message, it refers to the HOL syntax as the inner syntax and the enclosing theory language as the outer syntax.
2.2.2 Type nat Natural numbers are another predened datatype: datatype nat = 0 | Suc nat All values of type nat are generated by the constructors 0 and Suc. Thus the values of type nat are 0, Suc 0, Suc (Suc 0) etc. There are many predened functions: +, , , etc. Here is how you could dene your own addition: fun add :: "nat nat nat" where "add 0 n = n" | "add (Suc m ) n = Suc (add m n )" And here is a proof of the fact that add m 0 = m : lemma add_02: "add m 0 = m" apply(induction m ) apply(auto ) done The lemma command starts the proof and gives the lemma a name, add_02. Properties of recursively dened functions need to be established by induction in most cases. Command apply(induction m ) instructs Isabelle to start a proof by induction on m. In response, it will show the following proof state: 1. add 0 0 = 0 2. m . add m 0 = m = add (Suc m ) 0 = Suc m The numbered lines are known as subgoals. The rst subgoal is the base case, the second one the induction step. The prex m . is Isabelles way of saying for an arbitrary but xed m . The = separates assumptions from the conclusion. The command apply(auto ) instructs Isabelle to try and prove all subgoals automatically, essentially by simplifying them. Because both subgoals are easy, Isabelle can do it. The base case add 0 0 = 0 holds by definition of add, and the induction step is almost as simple: add (Suc m ) 0 = Suc (add m 0) = Suc m using rst the denition of add and then the induction hypothesis. In summary, both subproofs rely on simplication with function denitions and the induction hypothesis. As a result of that nal done, Isabelle associates the lemma just proved with its name. You can now inspect the lemma with the command thm add_02 which displays add ?m 0 = ?m
The free variable m has been replaced by the unknown ?m. There is no logical dierence between the two but an operational one: unknowns can be instantiated, which is what you want after some lemma has been proved. Note that there is also a proof method induct, which behaves almost like induction ; the dierence is explained in Chapter 4.
Terminology: We use lemma, theorem and rule interchangeably for propositions that have been proved. Numerals (0, 1, 2, . . . ) and most of the standard arithmetic operations (+, , , , < etc) are overloaded: they are available not just for natural numbers but for other types as well. For example, given the goal x + 0 = x, there is nothing to indicate that you are talking about natural numbers. Hence Isabelle can only infer that x is of some arbitrary type where 0 and + exist. As a consequence, you will be unable to prove the goal. To alert you to such pitfalls, Isabelle ags numerals without a xed type in its output: x + (0:: a ) = x. In this particular example, you need to include an explicit type constraint, for example x +0 = (x ::nat ). If there is enough contextual information this may not be necessary: Suc x = x automatically implies x ::nat because Suc is not overloaded.
An Informal Proof Above we gave some terse informal explanation of the proof of add m 0 = m. A more detailed informal exposition of the lemma might look like this: Lemma add m 0 = m Proof by induction on m.
Case 0 (the base case): add 0 0 = 0 holds by denition of add. Case Suc m (the induction step): We assume add m 0 = m, the induction hypothesis (IH), and we need to show add (Suc m ) 0 = Suc m. The proof is as follows: add (Suc m ) 0 = Suc (add m 0) by denition of add = Suc m by IH
Throughout this book, IH will stand for induction hypothesis. We have now seen three proofs of add m 0 = 0: the Isabelle one, the terse four lines explaining the base case and the induction step, and just now a model of a traditional inductive proof. The three proofs dier in the level of detail given and the intended reader: the Isabelle proof is for the machine, the informal proofs are for humans. Although this book concentrates on Isabelle proofs, it is important to be able to rephrase those proofs as informal text comprehensible to a reader familiar with traditional mathematical proofs. Later on we will introduce an Isabelle proof language that is closer to traditional informal mathematical language and is often directly readable.
2.2.3 Type list Although lists are already predened, we dene our own copy just for demonstration purposes: datatype a list = Nil | Cons a " a list"
Type a list is the type of lists over elements of type a. Because a is a type variable, lists are in fact polymorphic: the elements of a list can be of arbitrary type (but must all be of the same type). Lists have two constructors: Nil, the empty list, and Cons, which puts an element (of type a ) in front of a list (of type a list ). Hence all lists are of the form Nil, or Cons x Nil, or Cons x (Cons y Nil ) etc. datatype requires no quotation marks on the left-hand side, but on the right-hand side each of the argument types of a constructor needs to be enclosed in quotation marks, unless it is just an identier (e.g. nat or a ).
We also dene two standard functions, append and reverse: fun app :: " a list a list a list" where "app Nil ys = ys" | "app (Cons x xs ) ys = Cons x (app xs ys )" fun rev :: " a list a list" where "rev Nil = Nil" | "rev (Cons x xs ) = app (rev xs ) (Cons x Nil )" By default, variables xs, ys and zs are of list type. Command value evaluates a term. For example, value "rev (Cons True (Cons False Nil ))" yields the result Cons False (Cons True Nil ). This works symbolically, too: value "rev (Cons a (Cons b Nil ))" yields Cons b (Cons a Nil ). Figure 2.1 shows the theory created so far. Because list, Nil, Cons etc are already predened, Isabelle prints qualied (long) names when executing this theory, for example, MyList .Nil instead of Nil. To suppress the qualied names you can insert the command declare [[names_short]]. This is not recommended in general but just for this unusual example. Structural Induction for Lists Just as for natural numbers, there is a proof principle of induction for lists. Induction over a list is essentially induction over the length of the list, al-
2.2 Types bool, nat and list theory MyList imports Main begin datatype a list = Nil | Cons a "a list" fun app :: "a list => a list => a list" where "app Nil ys = ys" | "app (Cons x xs) ys = Cons x (app xs ys)" fun rev :: "a list => a list" where "rev Nil = Nil" | "rev (Cons x xs) = app (rev xs) (Cons x Nil)" value "rev(Cons True (Cons False Nil))" end Fig. 2.1. A Theory of Lists
though the length remains implicit. To prove that some property P holds for all lists xs, i.e. P xs , you need to prove 1. the base case P Nil and 2. the inductive case P (Cons x xs ) under the assumption P xs, for some arbitrary but xed x and xs. This is often called structural induction. 2.2.4 The Proof Process We will now demonstrate the typical proof process, which involves the formulation and proof of auxiliary lemmas. Our goal is to show that reversing a list twice produces the original list. theorem rev_rev [simp ]: "rev (rev xs ) = xs" Commands theorem and lemma are interchangeable and merely indicate the importance we attach to a proposition. Via the bracketed attribute simp we also tell Isabelle to make the eventual theorem a simplication rule: future proofs involving simplication will replace occurrences of rev (rev xs ) by xs. The proof is by induction: apply(induction xs ) As explained above, we obtain two subgoals, namely the base case (Nil ) and the induction step (Cons ):
10
1. rev (rev Nil ) = Nil 2. a xs . rev (rev xs ) = xs = rev (rev (Cons a xs )) = Cons a xs Let us try to solve both goals automatically: apply(auto ) Subgoal 1 is proved, and disappears; the simplied version of subgoal 2 becomes the new subgoal 1: 1. a xs . rev (rev xs ) = xs = rev (app (rev xs ) (Cons a Nil )) = Cons a xs
In order to simplify this subgoal further, a lemma suggests itself. A First Lemma We insert the following lemma in front of the main theorem: lemma rev_app [simp ]: "rev (app xs ys ) = app (rev ys ) (rev xs )" There are two variables that we could induct on: xs and ys. Because app is dened by recursion on the rst argument, xs is the correct one: apply(induction xs ) This time not even the base case is solved automatically: apply(auto ) 1. rev ys = app (rev ys ) Nil A total of 2 subgoals ... Again, we need to abandon this proof attempt and prove another simple lemma rst. A Second Lemma We again try the canonical proof procedure: lemma app_Nil2 [simp ]: "app xs Nil = xs" apply(induction xs ) apply(auto ) done Thankfully, this worked. Now we can continue with our stuck proof attempt of the rst lemma: lemma rev_app [simp ]: "rev (app xs ys ) = app (rev ys ) (rev xs )"
11
apply(induction xs ) apply(auto ) We nd that this time auto solves the base case, but the induction step merely simplies to 1. a xs . rev (app xs ys ) = app (rev ys ) (rev xs ) = app (app (rev ys ) (rev xs )) (Cons a Nil ) = app (rev ys ) (app (rev xs ) (Cons a Nil ))
The missing lemma is associativity of app, which we insert in front of the failed lemma rev_app. Associativity of app The canonical proof procedure succeeds without further ado: lemma app_assoc [simp ]: "app (app xs ys ) zs = app xs (app ys zs )" apply(induction xs ) apply(auto ) done Finally the proofs of rev_app and rev_rev succeed, too. Another Informal Proof Here is the informal proof of associativity of app corresponding to the Isabelle proof above. Lemma app (app xs ys ) zs = app xs (app ys zs ) Proof by induction on xs.
Case Nil : app (app Nil ys ) zs = app ys zs = app Nil (app ys zs ) holds by denition of app. Case Cons x xs : We assume
app (app xs ys ) zs = app xs (app ys zs ) and we need to show app (app (Cons x xs ) ys ) zs = app (Cons x xs ) (app ys zs ). The proof is as follows: app (app (Cons x xs ) ys ) zs = app (Cons x (app xs ys )) zs = Cons x (app (app xs ys ) zs ) = Cons x (app xs (app ys zs )) = app (Cons x xs ) (app ys zs ) (IH)
by by by by
12
Didnt we say earlier that all proofs are by simplication? But in both cases, going from left to right, the last equality step is not a simplication at all! In the base case it is app ys zs = app Nil (app ys zs ). It appears almost mysterious because we suddenly complicate the term by appending Nil on the left. What is really going on is this: when proving some equality s = t , both s and t are simplied to some common term u. This heuristic for equality proofs works well for a functional programming context like ours. In the base case s is app (app Nil ys ) zs, t is app Nil (app ys zs ), and u is app ys zs. 2.2.5 Predened Lists Isabelles predened lists are the same as the ones above, but with more syntactic sugar:
There is also a large library of predened functions. The most important ones are the length function length :: a list nat (with the obvious denition), and the map function that applies a function to all elements of a list: fun map :: " ( a b ) a list b list" "map f [] = []" | "map f (x # xs ) = f x # map f xs" From now on lists are always the predened lists. 2.2.6 Exercises Exercise 2.1. Start from the denition of add given above. Prove it is associative (add (add m n ) p = add m (add n p )) and commutative (add m n = add n m ). Dene a recursive function double :: nat nat and prove that double m = add m m. Exercise 2.2. Dene a function count :: a a list nat that counts the number of occurrences of an element in a list. Prove count x xs length xs. Exercise 2.3. Dene a recursive function snoc :: a list a a list that appends an element to the end of a list. Do not use the predened append operator @. With the help of snoc dene a recursive function reverse :: a list a list that reverses a list. Do not use the predened function rev. Prove reverse (reverse xs ) = xs.
13
14
lemma "mirror (mirror t ) = t" apply(induction t ) yields 1. mirror (mirror Tip ) = Tip 2. t1 a t2. [[mirror (mirror t1) = t1; mirror (mirror t2) = t2]] = mirror (mirror (Node t1 a t2)) = Node t1 a t2 The induction step contains two induction hypotheses, one for each subtree. An application of auto nishes the proof. A very simple but also very useful datatype is the predened datatype a option = None | Some a Its sole purpose is to add a new element None to an existing type a. To make sure that None is distinct from all the elements of a, you wrap them up in Some and call the new type a option. A typical application is a lookup function on a list of key-value pairs, often called an association list: fun lookup :: " ( a b ) list a b option" where "lookup [] x = None" | "lookup ((a ,b ) # ps ) x = (if a = x then Some b else lookup ps x )" Note that 1 2 is the type of pairs, also written 1 2 . Pairs can be taken apart either by pattern matching (as above) or with the projection functions fst and snd : fst (a , b ) = a and snd (a , b ) = b. Tuples are simulated by pairs nested to the right: (a , b , c ) abbreviates (a , (b , c )) and 1 2 3 abbreviates 1 (2 3 ). 2.3.2 Denitions Non recursive functions can be dened as in the following example: denition sq :: "nat nat" where "sq n = n n" Such denitions do not allow pattern matching but only f x 1 . . . x n = t, where f does not occur in t. 2.3.3 Abbreviations Abbreviations are similar to denitions: abbreviation sq :: "nat nat" where "sq n n n"
15
The key dierence is that sq is only syntactic sugar: after parsing, sq t is replaced by t t , and before printing, every occurrence of u u is replaced by sq u . Internally, sq does not exist. This is the advantage of abbreviations over denitions: denitions need to be expanded explicitly (Section 2.5.5) whereas abbreviations are already expanded upon parsing. However, abbreviations should be introduced sparingly: if abused, they can lead to a confusing discrepancy between the internal and external view of a term. The ASCII representation of is == or \<equiv>. 2.3.4 Recursive Functions Recursive functions are dened with fun by pattern matching over datatype constructors. The order of equations matters. Just as in functional programming languages. However, all HOL functions must be total. This simplies the logicterms are always denedbut means that recursive functions must terminate. Otherwise one could dene a function f n = f n + 1 and conclude 0 = 1 by subtracting f n on both sides. Isabelles automatic termination checker requires that the arguments of recursive calls on the right-hand side must be strictly smaller than the arguments on the left-hand side. In the simplest case, this means that one xed argument position decreases in size with each recursive call. The size is measured as the number of constructors (excluding 0-ary ones, e.g. Nil ). Lexicographic combinations are also recognized. In more complicated situations, the user may have to prove termination by hand. For details see [2]. Functions dened with fun come with their own induction schema that mirrors the recursion schema and is derived from the termination order. For example, fun div2 :: "nat nat" where "div2 0 = 0" | "div2 (Suc 0) = Suc 0" | "div2 (Suc (Suc n )) = Suc (div2 n )" does not just dene div2 but also proves a customized induction rule: P0 P (Suc 0) n . P n = P (Suc (Suc n )) Pm This customized induction rule can simplify inductive proofs. For example, lemma "div2(n +n ) = n" apply(induction n rule : div2.induct ) yields the 3 subgoals
16
1. div2 (0 + 0) = 0 2. div2 (Suc 0 + Suc 0) = Suc 0 3. n . div2 (n + n ) = n = div2 (Suc (Suc n ) + Suc (Suc n )) = Suc (Suc n ) An application of auto nishes the proof. Had we used ordinary structural induction on n, the proof would have needed an additional case analysis in the induction step. The general case is often called computation induction, because the induction follows the (terminating!) computation. For every dening equation f (e ) = . . . f (r 1 ) . . . f (r k ) . . . where f (r i ), i =1. . .k, are all the recursive calls, the induction rule f .induct contains one premise of the form P (r 1 ) = . . . = P (r k ) = P (e ) If f :: 1 . . . n then f .induct is applied like this: apply(induction x 1 . . . x n rule : f .induct ) where typically there is a call f x 1 . . . x n in the goal. But note that the induction rule does not mention f at all, except in its name, and is applicable independently of f.
17
"itrev (x #xs ) ys = itrev xs (x #ys )" The behaviour of itrev is simple: it reverses its rst argument by stacking its elements onto the second argument, and it returns that second argument when the rst one becomes empty. Note that itrev is tail-recursive: it can be compiled into a loop, no stack is necessary for executing it. Naturally, we would like to show that itrev does indeed reverse its rst argument provided the second one is empty: lemma "itrev xs [] = rev xs" There is no choice as to the induction variable: apply(induction xs ) apply(auto ) Unfortunately, this attempt does not prove the induction step: 1. a xs . itrev xs [] = rev xs = itrev xs [a ] = rev xs @ [a ]
The induction hypothesis is too weak. The xed argument, [], prevents it from rewriting the conclusion. This example suggests a heuristic: Generalize goals for induction by replacing constants by variables. Of course one cannot do this navely: itrev xs ys = rev xs is just not true. The correct generalization is lemma "itrev xs ys = rev xs @ ys" If ys is replaced by [], the right-hand side simplies to rev xs, as required. In this instance it was easy to guess the right generalization. Other situations can require a good deal of creativity. Although we now have two variables, only xs is suitable for induction, and we repeat our proof attempt. Unfortunately, we are still not there: 1. a xs . itrev xs ys = rev xs @ ys = itrev xs (a # ys ) = rev xs @ a # ys
The induction hypothesis is still too weak, but this time it takes no intuition to generalize: the problem is that the ys in the induction hypothesis is xed, but the induction hypothesis needs to be applied with a # ys instead of ys. Hence we prove the theorem for all ys instead of a xed one. We can instruct induction to perform this generalization for us by adding arbitrary : ys. apply(induction xs arbitrary : ys ) The induction hypothesis in the induction step is now universally quantied over ys :
18
1. 2.
Thus the proof succeeds: apply auto done This leads to another heuristic for generalization: Generalize induction by generalizing all free variables (except the induction variable itself). Generalization is best performed with arbitrary : y 1 . . . y k . This heuristic prevents trivial failures like the one above. However, it should not be applied blindly. It is not always required, and the additional quantiers can complicate matters in some cases. The variables that need to be quantied are typically those that change in recursive calls.
2.5 Simplication
So far we have talked a lot about simplifying terms without explaining the concept. Simplication means
2.5 Simplication
19
(1)
(2)
(3)
(4)
Simplication is often also called rewriting and simplication rules rewrite rules. 2.5.1 Simplication Rules The attribute simp declares theorems to be simplication rules, which the simplier will use automatically. In addition, datatype and fun commands implicitly declare some simplication rules: datatype the distinctness and injectivity rules, fun the dening equations. Denitions are not declared as simplication rules automatically! Nearly any theorem can become a simplication rule. The simplier will try to transform it into an equation. For example, the theorem P is turned into P = False. Only equations that really simplify, like rev (rev xs ) = xs and xs @ [] = xs, should be declared as simplication rules. Equations that may be counterproductive as simplication rules should only be used in specic proof steps (see 2.5.4 below). Distributivity laws, for example, alter the structure of terms and can produce an exponential blow-up. 2.5.2 Conditional Simplication Rules Simplication rules can be conditional. Before applying such a rule, the simplier will rst try to prove the preconditions, again by simplication. For example, given the simplication rules p 0 = True p x = f x = g x, the term f 0 simplies to g 0 but f 1 does not simplify because p 1 is not provable. 2.5.3 Termination Simplication can run forever, for example if both f x = g x and g x = f x are simplication rules. It is the users responsibility not to include simplication rules that can lead to nontermination, either on their own or in combination with other simplication rules. The right-hand side of a simplication rule
20
should always be simpler than the left-hand sidein some sense. But since termination is undecidable, such a check cannot be automated completely and Isabelle makes little attempt to detect nontermination. When conditional simplication rules are applied, their preconditions are proved rst. Hence all preconditions need to be simpler than the left-hand side of the conclusion. For example n < m = (n < Suc m ) = True is suitable as a simplication rule: both n < m and True are simpler than n < Suc m . But Suc n < m = (n < m ) = True leads to nontermination: when trying to rewrite n < m to True one rst has to prove Suc n < m , which can be rewritten to True provided Suc (Suc n ) < m, ad innitum. 2.5.4 The simp Proof Method So far we have only used the proof method auto. Method simp is the key component of auto, but auto can do much more. In some cases, auto is overeager and modies the proof state too much. In such cases the more predictable simp method should be used. Given a goal 1. [[ P 1 ; . . .; P m ]] = C the command apply(simp add : th 1 . . . th n ) simplies the assumptions P i and the conclusion C using
all simplication rules, including the ones coming from datatype and fun, the additional lemmas th 1 . . . th n , and the assumptions.
In addition to or instead of add there is also del for removing simplication rules temporarily. Both are optional. Method auto can be modied similarly: apply(auto simp add : . . . simp del : . . .) Here the modiers are simp add and simp del instead of just add and del because auto does not just perform simplication. Note that simp acts only on subgoal 1, auto acts on all subgoals. There is also simp_all, which applies simp to all subgoals.
2.5 Simplication
21
2.5.5 Rewriting With Denitions Denitions introduced by the command denition can also be used as simplication rules, but by default they are not: the simplier does not expand them automatically. Denitions are intended for introducing abstract concepts and not merely as abbreviations. Of course, we need to expand the denition initially, but once we have proved enough abstract properties of the new constant, we can forget its original denition. This style makes proofs more robust: if the denition has to be changed, only the proofs of the abstract properties will be aected. The denition of a function f is a theorem named f_def and can be added to a call of simp just like any other theorem: apply(simp add : f_def ) In particular, let-expressions can be unfolded by making Let_def a simplication rule. 2.5.6 Case Splitting With simp Goals containing if-expressions are automatically split into two cases by simp using the rule P (if A then s else t ) = ((A P s ) ( A P t )) For example, simp can prove (A B ) = (if A then B else False ) because both A (A B ) = B and A (A B ) = False simplify to True. We can split case-expressions similarly. For nat the rule looks like this: P (case e of 0 a | Suc n b n ) = ((e = 0 P a ) ( n . e = Suc n P (b n ))) Case expressions are not split automatically by simp, but simp can be instructed to do so: apply(simp split : nat .split ) splits all case-expressions over natural numbers. For an arbitrary datatype t it is t .split instead of nat .split. Method auto can be modied in exactly the same way.
22
2.5.7 Exercises Exercise 2.4. Dene arithmetic expressions in one variable over integers (type int ) as a data type: datatype exp = Var | Const int | Add exp exp | Mult exp exp Dene a function eval :: exp int int such that eval e x evaluates e at the value x. A polynomial can be represented as a list of coecients, starting with the constant. For example, [4, 2, 1, 3] represents the polynomial 4 + 2x x2 + 3x3 . Dene a function evalp :: int list int int that evaluates a polynomial at the given value. Dene a function coes :: exp int list that transforms an expression into a polynomial. This may require auxiliary functions. Prove that coes preserves the value of the expression: evalp (coes e ) x = eval e x. Hint: simplifying with algebra_simps takes care of common algebraic properties of + and .
3.1 Formulas
The core syntax of formulas (form below) provides the standard logical constructs, in decreasing order of precedence: form ::= (form ) | True | False | term = term | form | form form | form form | form form | x . form | x . form Terms are the ones we have seen all along, built from constants, variables, function application and -abstraction, including all the syntactic sugar like inx symbols, if, case etc.
Remember that formulas are simply terms of type bool. Hence = also works for formulas. Beware that = has a higher precedence than the other logical operators. Hence s = t A means (s = t ) A, and A B = B A means A (B = B ) A. Logical equivalence can also be written with instead of =, where has the same low precedence as . Hence A B B A really means (A B ) (B A). Quantiers need to be enclosed in parentheses if they are nested within other constructs (just like if, case and let ).
The most frequent logical symbols and their ASCII representations are shown in Fig. 3.1. The rst column shows the symbols, the other columns ASCII representations. The \<...> form is always converted into the symbolic form by the Isabelle interfaces, the treatment of the other ASCII forms depends on the interface. The ASCII forms /\ and \/ are special in that they are merely keyboard shortcuts for the interface and not logical symbols by themselves.
24
3 Logic and Proof Beyond Equality = \<forall> \<exists> \<lambda> --> <-> /\ \/ \<not> \<noteq> ALL EX %
& | ~ ~=
The implication = is part of the Isabelle framework. It structures theorems and proof states, separating assumptions from conclusions. The implication is part of the logic HOL and can occur inside the formulas that make up the assumptions and conclusion. Theorems should be of the form [[ A1 ; . . .; An ]] = A, not A1 . . . An A. Both are logically equivalent but the rst one works better when using the theorem in further proofs.
3.2 Sets
Sets of elements of type a have type a set. They can be nite or innite. Sets come with the usual notation:
{}, {e 1 ,. . .,e n } e A, A B A B, A B, A B, A
(where A B and A are set dierence and complement) and much more. UNIV is the set of all elements of some type. Set comprehension is written {x . P } rather than {x | P }.
In {x . P } the x must be a variable. Set comprehension involving a proper term t must be written {t | x y . P }, where x y are those free variables in t that occur in P. This is just a shorthand for {v . x y . v = t P }, where v is a new variable. For example, {x + y |x . x A} is short for {v . x . v = x +y x A}.
Here are the ASCII representations of the mathematical symbols: \<in> : \<subseteq> <= \<union> Un \<inter> Int
Sets also allow bounded quantications x A. P and x A. P. For the more ambitious, there are also and :
25
A = {x . B A. x B }
A = { x . B A. x B } are \<Inter>
The ASCII forms of are \<Union> and Union, those of and Inter. There are also indexed unions and intersections: ( (
x A B x ) = {y . x A. y B x } x A B x ) = {y . x A. y B x }
The ASCII forms are UN x:A. B and INT x:A. B where x may occur in B. If A is UNIV you can just write UN x. B and INT x. B. Some other frequently useful functions on sets are the following: set :: a list a set nite :: a set bool card :: a set nat f A = {y . x A. y = f x } converts a list to the set of its elements is true i its argument is nite is the cardinality of a nite set and is 0 for all innite sets is the image of a function over a set
See [3] for the wealth of further predened functions in theory Main.
They show you were they got stuck, giving you an idea how to continue. They perform the obvious steps but are highly incomplete.
A proof method is complete if it can prove all true formulas. There is no complete proof method for HOL, not even in theory. Hence all our proof methods only dier in how incomplete they are.
26
A proof method that is still incomplete but tries harder than auto is fastforce. It either succeeds or fails, it acts on the rst subgoal only, and it can be modied just like auto, e.g. with simp add. Here is a typical example of what fastforce can do: lemma " [[ xs A. ys . xs = ys @ ys ; us A ]] = n . length us = n +n" by fastforce This lemma is out of reach for auto because of the quantiers. Even fastforce fails when the quantier structure becomes more complicated. In a few cases, its slow version force succeeds where fastforce fails. The method of choice for complex logical goals is blast. In the following example, T and A are two binary predicates. It is shown that if T is total, A is antisymmetric and T is a subset of A, then A is a subset of T : lemma " [[ x y . T x y T y x ; x y . A x y A y x x = y ; x y . T x y A x y ]] = x y . A x y T x y" by blast We leave it to the reader to gure out why this lemma is true. Method blast
is (in principle) a complete proof procedure for rst-order formulas, a fragment of HOL. In practice there is a search bound. does no rewriting and knows very little about equality. covers logic, sets and relations. either succeeds or fails.
Because of its strength in logic and sets and its weakness in equality reasoning, it complements the earlier proof methods. 3.3.1 Sledgehammer Command sledgehammer calls a number of external automatic theorem provers (ATPs) that run for up to 30 seconds searching for a proof. Some of these ATPs are part of the Isabelle installation, others are queried over the internet. If successful, a proof command is generated and can be inserted into your proof. The biggest win of sledgehammer is that it will take into account the whole lemma library and you do not need to feed in any lemma explicitly. For example, lemma " [[ xs @ ys = ys @ xs ; length xs = length ys ]] = xs = ys"
27
cannot be solved by any of the standard proof methods, but sledgehammer nds the following proof: by (metis append_eq_conv_conj ) We do not explain how the proof was found but what this command means. For a start, Isabelle does not trust external tools (and in particular not the translations from Isabelles logic to those tools!) and insists on a proof that it can check. This is what metis does. It is given a list of lemmas and tries to nd a proof just using those lemmas (and pure logic). In contrast to simp and friends that know a lot of lemmas already, using metis manually is tedious because one has to nd all the relevant lemmas rst. But that is precisely what sledgehammer does for us. In this case lemma append_eq_conv_conj alone suces: (xs @ ys = zs ) = (xs = take (length xs ) zs ys = drop (length xs ) zs ) We leave it to the reader to gure out why this lemma suces to prove the above lemma, even without any knowledge of what the functions take and drop do. Keep in mind that the variables in the two lemmas are independent of each other, despite the same names, and that you can substitute arbitrary values for the free variables in a lemma. Just as for the other proof methods we have seen, there is no guarantee that sledgehammer will nd a proof if it exists. Nor is sledgehammer superior to the other proof methods. They are incomparable. Therefore it is recommended to apply simp or auto before invoking sledgehammer on what is left. 3.3.2 Arithmetic By arithmetic formulas we mean formulas involving variables, numbers, +, , =, <, and the usual logical connectives , , , , . Strictly speaking, this is known as linear arithmetic because it does not involve multiplication, although multiplication with numbers, e.g. 2n is allowed. Such formulas can be proved by arith : lemma " [[ (a ::nat ) by arith x + b ; 2x < c ]] = 2a + 1 2b + c"
In fact, auto and simp can prove many linear arithmetic formulas already, like the one above, by calling a weak but fast version of arith. Hence it is usually not necessary to invoke arith explicitly. The above example involves natural numbers, but integers (type int ) and real numbers (type real ) are supported as well. As are a number of further operators like min and max. On nat and int, arith can even prove theorems with quantiers in them, but we will not enlarge on that here.
28
3.3.3 Trying Them All If you want to try all of the above automatic proof methods you simply type try You can also add specic simplication and introduction rules: try simp : . . . intro : . . . There is also a lightweight variant try0 that does not call sledgehammer.
to the proof state. We will now examine the details of this process. 3.4.1 Instantiating Unknowns We had briey mentioned earlier that after proving some theorem, Isabelle replaces all free variables x by so called unknowns ?x. We can see this clearly in rule conjI. These unknowns can later be instantiated explicitly or implicitly:
By hand, using of. The expression conjI [of "a =b" "False" ] instantiates the unknowns in conjI from left to right with the two formulas a =b and False, yielding the rule
a =b False a = b False In general, th [of string 1 . . . string n ] instantiates the unknowns in the theorem th from left to right with the terms string 1 to string n . By unication. Unication is the process of making two terms syntactically equal by suitable instantiations of unknowns. For example, unifying ?P ?Q with a = b False instantiates ?P with a = b and ?Q with False. We need not instantiate all unknowns. If we want to skip a particular one we can just write _ instead, for example conjI [of _ "False" ]. Unknowns can also be instantiated by name, for example conjI [where ?P = "a =b" and ?Q = "False" ].
29
3.4.2 Rule Application Rule application means applying a rule backwards to a proof state. For example, applying rule conjI to a proof state 1. . . . = A B results in two subgoals, one for each premise of conjI : 1. . . . = A 2. . . . = B In general, the application of a rule [[ A1 ; . . .; An ]] = A to a subgoal . . . = C proceeds in two steps: 1. Unify A and C, thus instantiating the unknowns in the rule. 2. Replace the subgoal C with n new subgoals A1 to An . This is the command to apply rule xyz : apply(rule xyz ) This is also called backchaining with rule xyz. 3.4.3 Introduction Rules Conjunction introduction (conjI ) is one example of a whole class of rules known as introduction rules. They explain under which premises some logical construct can be introduced. Here are some further useful introduction rules: x . ?P x ?P = ?Q ?P ?Q
impI
x . ?P x ?Q = ?P
allI
?P = ?Q
?P = ?Q
iI
These rules are part of the logical system of natural deduction (e.g. [1]). Although we intentionally de-emphasize the basic rules of logic in favour of automatic proof methods that allow you to take bigger steps, these rules are helpful in locating where and why automation fails. When applied backwards, these rules decompose the goal:
conjI and iI split the goal into two subgoals, impI moves the left-hand side of a HOL implication into the list of assumptions, and allI removes a by turning the quantied variable into a xed local variable of the subgoal.
30
Isabelle knows about these and a number of other introduction rules. The command apply rule automatically selects the appropriate rule for the current subgoal. You can also turn your own theorems into introduction rules by giving them the intro attribute, analogous to the simp attribute. In that case blast, fastforce and (to a limited extent) auto will automatically backchain with those theorems. The intro attribute should be used with care because it increases the search space and can lead to nontermination. Sometimes it is better to use it only in specic calls of blast and friends. For example, le_trans, transitivity of on type nat, is not an introduction rule by default because of the disastrous eect on the search space, but can be useful in specic situations: lemma " [[ (a ::nat ) b; b _ by(blast intro : le trans ) c; c d; d e ]] = a e"
Of course this is just an example and could be proved by arith, too. 3.4.4 Forward Proof Forward proof means deriving new theorems from old theorems. We have already seen a very simple form of forward proof: the of operator for instantiating unknowns in a theorem. The big brother of of is OF for applying one theorem to others. Given a theorem A = B called r and a theorem A called r , the theorem r [OF r ] is the result of applying r to r , where r should be viewed as a function taking a theorem A and returning B. More precisely, A and A are unied, thus instantiating the unknowns in B, and the result is the instantiated B. Of course, unication may also fail.
Application of rules to other rules operates in the forward direction: from the premises to the conclusion of the rule; application of rules to proof states operates in the backward direction, from the conclusion to the premises.
In general r can be of the form [[ A1 ; . . .; An ]] = A and there can be multiple argument theorems r 1 to r m (with m n ), in which case r [OF r 1 . . . r m ] is obtained by unifying and thus proving Ai with r i , i = 1. . .m. Here is an example, where re is the theorem ?t = ?t : thm conjI [OF re [of "a" ] re [of "b" ]] yields the theorem a = a b = b. The command thm merely displays the result. Forward reasoning also makes sense in connection with proof states. Therefore blast, fastforce and auto support a modier dest which instructs
31
the proof method to use certain rules in a forward fashion. If r is of the form A = B , the modier dest : r allows proof search to reason forward with r, i.e. to replace an assumption A , where A unies with A, with the correspondingly instantiated B. For example, Suc_leD is the theorem Suc m n = m n , which works well for forward reasoning: lemma "Suc (Suc (Suc a )) by(blast dest : Suc_leD ) b = a b"
In this particular example we could have backchained with Suc_leD, too, but because the premise is more complicated than the conclusion this can easily lead to nontermination. 3.4.5 Finding Theorems Command nd_theorems searches for specic theorems in the current theory. Search criteria include pattern matching on terms and on names. For details see the Isabelle/Isar Reference Manual [6].
To ease readability we will drop the question marks in front of unknowns from now on.
0 is even If n is even, so is n + 2.
The operative word inductive means that these are the only even numbers. In Isabelle we give the two rules the names ev0 and evSS and write inductive ev :: "nat bool" where ev0: "ev 0" | evSS : "ev n = ev (n + 2)" To get used to inductive denitions, we will rst prove a few properties of ev informally before we descend to the Isabelle level. How do we prove that some number is even, e.g. ev 4? Simply by combining the dening rules for ev : ev 0 = ev (0 + 2) = ev ((0 + 2) + 2) = ev 4
32
Rule Induction Showing that all even numbers have some property is more complicated. For example, let us prove that the inductive denition of even numbers agrees with the following recursive one: fun even :: "nat bool" where "even 0 = True" | "even (Suc 0) = False" | "even (Suc (Suc n )) = even n" We prove ev m = even m. That is, we assume ev m and by induction on the form of its derivation prove even m. There are two cases corresponding to the two rules for ev : Case ev0: ev m was derived by rule ev 0: = m = 0 = even m = even 0 = True Case evSS : ev m was derived by rule ev n = ev (n + 2): = m = n + 2 and by induction hypothesis even n = even m = even (n + 2) = even n = True What we have just seen is a special case of rule induction. Rule induction applies to propositions of this form ev n = P n That is, we want to prove a property P n for all even n. But if we assume ev n, then there must be some derivation of this assumption using the two dening rules for ev. That is, we must prove Case ev0: P 0 Case evSS : [[ev n ; P n ]] = P (n + 2) The corresponding rule is called ev .induct and looks like this: ev n P0 n . [[ev n ; P n ]] = P (n + 2) Pn The rst premise ev n enforces that this rule can only be applied in situations where we know that n is even. Note that in the induction step we may not just assume P n but also ev n , which is simply the premise of rule evSS. Here is an example where the local assumption ev n comes in handy: we prove ev m = ev (m 2) by induction on ev m. Case ev0 requires us to prove ev (0 2), which follows from ev 0 because 0 2 = 0 on type nat. In case evSS we have m = n + 2 and may assume ev n, which implies ev (m 2) because m 2 = (n + 2) 2 = n. We did not need the induction hypothesis at all for this proof, it is just
33
a case analysis of which rule was used, but having ev n at our disposal in case evSS was essential. This case analysis of rules is also called rule inversion and is discussed in more detail in Chapter 4. In Isabelle Let us now recast the above informal proofs in Isabelle. For a start, we use Suc terms instead of numerals in rule evSS : ev n = ev (Suc (Suc n )) This avoids the diculty of unifying n +2 with some numeral, which is not automatic. The simplest way to prove ev (Suc (Suc (Suc (Suc 0)))) is in a forward direction: evSS [OF evSS [OF ev0]] yields the theorem ev (Suc (Suc (Suc (Suc 0)))). Alternatively, you can also prove it as a lemma in backwards fashion. Although this is more verbose, it allows us to demonstrate how each rule application changes the proof state: lemma "ev (Suc (Suc (Suc (Suc 0))))" 1. ev (Suc (Suc (Suc (Suc 0)))) apply(rule evSS ) 1. ev (Suc (Suc 0)) apply(rule evSS ) 1. ev 0 apply(rule ev0) done Rule induction is applied by giving the induction rule explicitly via the rule : modier: lemma "ev m = even m" apply(induction rule : ev .induct ) by(simp_all ) Both cases are automatic. Note that if there are multiple assumptions of the form ev t, method induction will induct on the leftmost one. As a bonus, we also prove the remaining direction of the equivalence of ev and even :
34
lemma "even n = ev n" apply(induction n rule : even .induct ) This is a proof by computation induction on n (see Section 2.3.4) that sets up three subgoals corresponding to the three equations for even : 1. even 0 = ev 0 2. even (Suc 0) = ev (Suc 0) 3. n . [[even n = ev n ; even (Suc (Suc n ))]] = ev (Suc (Suc n )) The rst and third subgoals follow with ev0 and evSS, and the second subgoal is trivially true because even (Suc 0) is False : by (simp_all add : ev0 evSS ) The rules for ev make perfect simplication and introduction rules because their premises are always smaller than the conclusion. It makes sense to turn them into simplication and introduction rules permanently, to enhance proof automation: declare ev .intros [simp ,intro ] The rules of an inductive denition are not simplication rules by default because, in contrast to recursive functions, there is no termination requirement for inductive denitions. Inductive Versus Recursive We have seen two denitions of the notion of evenness, an inductive and a recursive one. Which one is better? Much of the time, the recursive one is more convenient: it allows us to do rewriting in the middle of terms, and it expresses both the positive information (which numbers are even) and the negative information (which numbers are not even) directly. An inductive denition only expresses the positive information directly. The negative information, for example, that 1 is not even, has to be proved from it (by induction or rule inversion). On the other hand, rule induction is tailor-made for proving ev n = P n because it only asks you to prove the positive cases. In the proof of even n = P n by computation induction via even .induct, we are also presented with the trivial negative cases. If you want the convenience of both rewriting and rule induction, you can make two denitions and show their equivalence (as above) or make one denition and prove additional properties from it, for example rule induction from computation induction. But many concepts do not admit a recursive denition at all because there is no datatype for the recursion (for example, the transitive closure of a relation), or the recursion would not terminate (for example, an interpreter
35
for a programming language). Even if there is a recursive denition, if we are only interested in the positive information, the inductive denition may be much simpler. 3.5.2 The Reexive Transitive Closure Evenness is really more conveniently expressed recursively than inductively. As a second and very typical example of an inductive denition we dene the reexive transitive closure. The reexive transitive closure, called star below, is a function that maps a binary predicate to another binary predicate: if r is of type bool then star r is again of type bool, and star r x y means that x and y are in the relation star r. Think r when you see star r, because star r is meant to be the reexive transitive closure. That is, star r x y is meant to be true if from x we can reach y in nitely many r steps. This concept is naturally dened inductively: inductive star :: " ( a a bool ) a a bool" for r where re : "star r x x" | step : "r x y = star r y z = star r x z" The base case re is reexivity: x = y. The step case step combines an r step (from x to y ) and a star r step (from y to z ) into a star r step (from x to z ). The for r in the header is merely a hint to Isabelle that r is a xed parameter of star, in contrast to the further parameters of star, which change. As a result, Isabelle generates a simpler induction rule. By denition star r is reexive. It is also transitive, but we need rule induction to prove that: lemma star_trans : "star r x y = star r y z = star r x z" apply(induction rule : star .induct ) The induction is over star r x y and we try to prove star r y z = star r x z , which we abbreviate by P x y. These are our two subgoals: 1. 2. x . star r x z = star r x z u x y. [[r u x ; star r x y ; star r y z = star r x z ; star r y z ]] = star r u z
The rst one is P x x, the result of case re, and it is trivial. apply(assumption ) Let us examine subgoal 2, case step. Assumptions r u x and star r x y are the premises of rule step. Assumption star r y z = star r x z is P x y , the
36
IH coming from star r x y. We have to prove P u y, which we do by assuming star r y z and proving star r u z. The proof itself is straightforward: from star r y z the IH leads to star r x z which, together with r u x, leads to star r u z via rule step : apply(metis step ) done
3.5.3 The General Case Inductive denitions have approximately the following general form: inductive I :: " bool" where followed by a sequence of (possibly named) rules of the form [[ I a 1 ; . . .; I a n ]] = I a separated by |. As usual, n can be 0. The corresponding rule induction principle I .induct applies to propositions of the form I x = P x where P may itself be a chain of implications.
Rule induction is always on the leftmost premise of the goal. Hence I x must be the rst premise.
Proving I x = P x by rule induction means proving for every rule of I that P is invariant: [[ I a 1 ; P a 1 ; . . .; I a n ; P a n ]] = P a The above format for inductive denitions is simplied in a number of respects. I can have any number of arguments and each rule can have additional premises not involving I, so-called side conditions. In rule inductions, these side-conditions appear as additional assumptions. The for clause seen in the denition of the reexive transitive closure merely simplies the form of the induction rule.
Apply-scripts are unreadable and hard to maintain. The language of choice for larger proofs is Isar. The two key features of Isar are:
It is structured, not linear. It is readable without running it because you need to state what you are proving at any given point.
Whereas apply-scripts are like assembly language programs, Isar proofs are like structured programs with comments. A typical Isar proof looks like this: proof assume "formula 0 " have "formula 1 " by simp . . . have "formula n " by blast show "formula n+1 " by . . . qed It proves formula 0 = formula n+1 (provided each proof step succeeds). The intermediate have statements are merely stepping stones on the way towards the show statement that proves the actual goal. In more detail, this is the Isar core syntax: proof = by method | proof [method ] step qed step = x variables | assume proposition | [from fact + ] (have | show) proposition proof proposition = [name :] "formula" fact = name | . . .
38
A proof can either be an atomic by with a single proof method which must nish o the statement being proved, for example auto. Or it can be a proof qed block of multiple steps. Such a block can optionally begin with a proof method that indicates how to start o the proof, e.g. (induction xs ). A step either assumes a proposition or states a proposition together with its proof. The optional from clause indicates which facts are to be used in the proof. Intermediate propositions are stated with have, the overall goal with show. A step can also introduce new local variables with x. Logically, x introduces -quantied variables, assume introduces the assumption of an implication (=) and have/show the conclusion. Propositions are optionally named formulas. These names can be referred to in later from clauses. In the simplest case, a fact is such a name. But facts can also be composed with OF and of as shown in 3.4.4hence the . . . in the above grammar. Note that assumptions, intermediate have statements and global lemmas all have the same status and are thus collectively referred to as facts. Fact names can stand for whole lists of facts. For example, if f is dened by command fun, f .simps refers to the whole list of recursion equations dening f. Individual facts can be selected by writing f .simps (2), whole sublists by f .simps (24).
39
meaningful names are hard to invent and are often not necessary. Both have steps are obvious. The second one introduces the diagonal set {x . x / f x }, the key idea in the proof. If you wonder why 2 directly implies False : from 2 it follows that (a / f a ) = (a f a ) . 4.1.1 this, then, hence and thus Labels should be avoided. They interrupt the ow of the reader who has to scan the context for the point where the label was introduced. Ideally, the proof is a linear ow, where the output of one step becomes the input of the next step, piping the previously proved fact into the next proof, just like in a UNIX pipe. In such cases the predened name this can be used to refer to the proposition proved in the previous step. This allows us to eliminate all labels from our proof (we suppress the lemma statement): proof assume "surj f" from this have " a . {x . x / f x } = f a" by(auto simp : surj_def ) from this show "False" by blast qed We have also taken the opportunity to compress the two have steps into one. To compact the text further, Isar has a few convenient abbreviations: then = from this thus = then show hence = then have With the help of these abbreviations the proof becomes proof assume "surj f" hence " a . {x . x / f x } = f a" by(auto simp : surj_def ) thus "False" by blast qed There are two further linguistic variations: (have|show) prop using facts = from facts (have|show) prop with facts = from facts this The using idiom de-emphasizes the used facts by moving them behind the proposition. 4.1.2 Structured Lemma Statements: xes, assumes, shows Lemmas can also be stated in a more structured fashion. To demonstrate this feature with Cantors theorem, we rephrase surj f a little:
40
lemma xes f :: " a a set" assumes s : "surj f" shows "False" The optional xes part allows you to state the types of variables up front rather than by decorating one of their occurrences in the formula with a type constraint. The key advantage of the structured format is the assumes part that allows you to name each assumption; multiple assumptions can be separated by and. The shows part gives the goal. The actual theorem that will come out of the proof is surj f = False, but during the proof the assumption surj f is available under the name s like any other fact. proof have " a . {x . x / f x } = f a" using s by(auto simp : surj_def ) thus "False" by blast qed In the have step the assumption surj f is now referenced by its name s. The duplication of surj f in the above proofs (once in the statement of the lemma, once in its proof) has been eliminated.
Note the dash after the proof command. It is the null method that does nothing to the goal. Leaving it out would ask Isabelle to try some suitable introduction rule on the goal False but there is no suitable introduction rule and proof would fail.
Stating a lemma with assumes-shows implicitly introduces the name assms that stands for the list of all assumptions. You can refer to individual assumptions by assms (1), assms (2) etc, thus obviating the need to name them individually.
41
show "R" proof cases assume "P" . . . show "R" . . . next assume " P" . . . show "R" . . . qed
have "P Q" . . . then show "R" proof assume "P" . . . show "R" . . . next assume "Q" . . . show "R" . . . qed
How to prove a logical equivalence: show "P Q" proof assume "P" . . . show "Q" . . . next assume "Q" . . . show "P" . . . qed Proofs by contradiction: show " P" proof assume "P" . . . show "False" . . . qed show "P" proof (rule ccontr ) assume " P" . . . show "False" . . . qed
The name ccontr stands for classical contradiction. How to prove quantied formulas: show " x . P (x )" proof x x . . . show "P (x )" . . . qed show " x . P (x )" proof . . . show "P (witness )" . . . qed
42
In the proof of x . P (x ), the step x x introduces a locally xed variable x into the subproof, the proverbial arbitrary but xed value. Instead of x we could have chosen any name in the subproof. In the proof of x . P (x ), witness is some arbitrary term for which we can prove that it satises P. How to reason forward from x . P (x ): have " x . P (x )" . . . then obtain x where p : "P (x )" by blast After the obtain step, x (we could have chosen any name) is a xed local variable, and p is the name of the fact P (x ). This pattern works for one or more x. As an example of the obtain command, here is the proof of Cantors theorem in more detail: lemma " surj (f :: a a set )" proof assume "surj f" hence " a . {x . x / f x } = f a" by(auto simp : surj_def ) then obtain a where " {x . x / f x } = f a" by blast hence "a / f a a f a" by blast thus "False" by blast qed Finally, how to prove set equality and subset relationship: show "A = B" proof show "A B" . . . next show "B A" . . . qed show "A B" proof x x assume "x A" . . . show "x B" . . . qed
43
show "formula 1 formula 2 " (is "?L ?R" ) proof assume "?L" . . . show "?R" . . . next assume "?R" . . . show "?L" . . . qed Instead of duplicating formula i in the text, we introduce the two abbreviations ?L and ?R by pattern matching. Pattern matching works wherever a formula is stated, in particular with have and lemma. The unknown ?thesis is implicitly matched against any goal stated by lemma or show. Here is a typical example: lemma "formula" proof . . . show ?thesis . . . qed Unknowns can also be instantiated with let commands let ?t = "some-big-term" Later proof steps can refer to ?t : have " . . . ?t . . . "
Names of facts are introduced with name : and refer to proved theorems. Unknowns ?X refer to terms or formulas.
Although abbreviations shorten the text, the reader needs to remember what they stand for. Similarly for names of facts. Names like 1, 2 and 3 are not helpful and should only be used in short proofs. For longer proofs, descriptive names are better. But look at this example: have x_gr_0: "x > 0" . . . from x_gr_0 . . . The name is longer than the fact it stands for! Short facts do not need names, one can refer to them easily by quoting them:
44
have "x > 0" . . . from x >0 . . . Note that the quotes around x >0 are back quotes. They refer to the fact not by name but by value. 4.3.2 moreover Sometimes one needs a number of facts to enable some deduction. Of course one can name these facts individually, as shown on the right, but one can also combine them with moreover, as shown on the left: have "P 1 " . . . moreover have "P 2 " . . . moreover . . . moreover have "P n " . . . ultimately have "P" . . . have lab 1 : "P 1 " . . . have lab 2 : "P 2 " . . . . . . have lab n : "P n " . . . from lab 1 lab 2 . . . have "P" . . .
The moreover version is no shorter but expresses the structure more clearly and avoids new names. 4.3.3 Raw Proof Blocks Sometimes one would like to prove some lemma locally within a proof. A lemma that shares the current context of assumptions but that has its own assumptions and is generalized over its locally xed variables at the end. This is what a raw proof block does: { x x 1 . . . x n assume A1 . . . Am . . . have B } proves [[ A1 ; . . . ; Am ]] = B where all x i have been replaced by unknowns ?x i .
The conclusion of a raw proof block is not indicated by show but is simply the nal have.
As an example we prove a simple fact about divisibility on integers. The denition of dvd is (b dvd a ) = ( k . a = b k ).
45
lemma xes a b :: int assumes "b dvd (a +b )" shows "b dvd a" proof { x k assume k : "a +b = b k" have " k . a = b k " proof show "a = b (k 1)" using k by(simp add : algebra_simps ) qed } then show ?thesis using assms by(auto simp add : dvd_def ) qed Note that the result of a raw proof block has no name. In this example it was directly piped (via then) into the nal proof, but it can also be named for later reference: you simply follow the block directly by a note command: note name = this This introduces a new name name that refers to this, the fact just proved, in this case the preceding block. In general, note introduces a new name for one or more facts. 4.3.4 Exercises Exercise 4.1. Give a readable, structured proof of the following lemma: lemma assumes T : " x y . T x y T y x" and A: " x y . A x y A y x x = y" and TA: " x y . T x y A x y" and "A x y" shows "T x y" Exercise 4.2. Give a readable, structured proof of the following lemma: lemma " ( ys zs . xs = ys @ zs length ys = length zs ) ( ys zs . xs = ys @ zs length ys = length zs + 1)" Hint: There are predened functions take :: nat a list a list and drop :: nat a list a list such that take k [x 1 ,. . .] = [x 1 ,. . .,x k ] and drop k [x 1 ,. . .] = [x k +1 ,. . .]. Let simp and especially sledgehammer nd and apply the relevant take and drop lemmas for you.
46
lemma "length (tl xs ) = length xs 1" proof (cases xs ) assume "xs = []" thus ?thesis by simp next x y ys assume "xs = y #ys" thus ?thesis by simp qed Function tl (tail) is dened by tl [] = [] and tl (x # xs ) = xs. Note that the result type of length is nat and 0 1 = 0. This proof pattern works for any term t whose type is a datatype. The goal has to be proved for each constructor C : x x 1 . . . x n assume "t = C x 1 . . . x n " Each case can be written in a more compact form by means of the case command: case (C x 1 . . . x n ) This is equivalent to the explicit x-assume line but also gives the assumption "t = C x 1 . . . x n " a name: C, like the constructor. Here is the case version of the proof above: proof (cases xs ) case Nil thus ?thesis by simp next case (Cons y ys ) thus ?thesis by simp qed Remember that Nil and Cons are the alphanumeric names for [] and #. The names of the assumptions are not used because they are directly piped (via thus) into the proof of the claim. 4.4.2 Structural Induction We illustrate structural induction with an example based on natural numbers: the sum ( ) of the rst n natural numbers ({0..n ::nat }) is equal to n (n + 1) div 2. Never mind the details, just focus on the pattern: lemma " {0..n ::nat } = n (n +1) div 2" proof (induction n ) show " {0..0::nat } = 0(0+1) div 2" by simp
47
next x n assume " {0..n ::nat } = n (n +1) div 2" thus " {0..Suc n } = Suc n (Suc n +1) div 2" by simp qed Except for the rewrite steps, everything is explicitly given. This makes the proof easily readable, but the duplication means it is tedious to write and maintain. Here is how pattern matching can completely avoid any duplication: lemma " {0..n ::nat } = n (n +1) div 2" (is "?P n" ) proof (induction n ) show "?P 0" by simp next x n assume "?P n" thus "?P (Suc n )" by simp qed The rst line introduces an abbreviation ?P n for the goal. Pattern matching ?P n with the goal instantiates ?P to the function n . {0..n } = n (n + 1) div 2. Now the proposition to be proved in the base case can be written as ?P 0, the induction hypothesis as ?P n, and the conclusion of the induction step as ?P (Suc n ). Induction also provides the case idiom that abbreviates the x-assume step. The above proof becomes proof (induction n ) case 0 show ?case by simp next case (Suc n ) thus ?case by simp qed The unknown ?case is set in each case to the required claim, i.e. ?P 0 and ?P (Suc n ) in the above proof, without requiring the user to dene a ?P. The general pattern for induction over nat is shown on the left-hand side:
48
show "P (n )" proof (induction n ) case 0 . . . show ?case . . . next case (Suc n ) . . . show ?case . . . qed
On the right side you can see what the case command on the left stands for. In case the goal is an implication, induction does one more thing: the proposition to be proved in each case is not the whole implication but only its conclusion; the premises of the implication are immediately made assumptions of that case. That is, if in the above proof we replace show "P (n )" by show "A(n ) = P (n )" then case 0 stands for assume 0: "A(0)" let ?case = "P (0)" and case (Suc n ) stands for x n assume Suc : "A(n ) = P (n )" "A(Suc n )" let ?case = "P (Suc n )" The list of assumptions Suc is actually subdivided into Suc .IH, the induction hypotheses (here A(n ) = P (n )) and Suc .prems, the premises of the goal being proved (here A(Suc n )). Induction works for any datatype. Proving a goal [[ A1 (x ); . . .; Ak (x ) ]] = P (x ) by induction on x generates a proof obligation for each constructor C of the datatype. The command case (C x 1 . . . x n ) performs the following steps: 1. x x 1 . . . x n 2. assume the induction hypotheses (calling them C .IH ) and the premises Ai (C x 1 . . . x n ) (calling them C .prems ) and calling the whole list C 3. let ?case = "P (C x 1 . . . x n )" 4.4.3 Rule Induction Recall the inductive and recursive denitions of even numbers in Section 3.5:
49
inductive ev :: "nat bool" where ev0: "ev 0" | evSS : "ev n = ev (Suc (Suc n ))" fun even :: "nat bool" where "even 0 = True" | "even (Suc 0) = False" | "even (Suc (Suc n )) = even n" We recast the proof of ev n = even n in Isar. The left column shows the actual proof text, the right column shows the implicit eect of the two case commands: lemma "ev n = even n" proof (induction rule : ev .induct ) case ev0 show ?case by simp next case evSS
x n assume evSS : "ev n" "even n" let ?case = "even (Suc (Suc n ))"
thus ?case by simp qed The proof resembles structural induction, but the induction rule is given explicitly and the names of the cases are the names of the rules in the inductive denition. Let us examine the two assumptions named evSS : ev n is the premise of rule evSS, which we may assume because we are in the case where that rule was used; even n is the induction hypothesis.
Because each case command introduces a list of assumptions named like the case name, which is the name of a rule of the inductive denition, those rules now need to be accessed with a qualied name, here ev .ev0 and ev .evSS
In the case evSS of the proof above we have pretended that the system xes a variable n. But unless the user provides the name n, the system will just invent its own name that cannot be referred to. In the above proof, we do not need to refer to it, hence we do not give it a specic name. In case one needs to refer to it one writes case (evSS m ) just like case (Suc n ) in earlier structural inductions. The name m is an arbitrary choice. As a result, case evSS is derived from a renamed version of
50
rule evSS : ev m = ev (Suc (Suc m )). Here is an example with a (contrived) intermediate step that refers to m : lemma "ev n = even n" proof (induction rule : ev .induct ) case ev0 show ?case by simp next case (evSS m ) have "even (Suc (Suc m )) = even m" by simp thus ?case using even m by blast qed In general, let I be a (for simplicity unary) inductively dened predicate and let the rules in the denition of I be called rule 1 , . . . , rule n . A proof by rule induction follows this pattern: show "I x = P x" proof (induction rule : I .induct ) case rule 1 . . . show ?case . . . next . . . next case rule n . . . show ?case . . . qed One can provide explicit variable names by writing case (rule i x 1 . . . x k ), thus renaming the rst k free variables in rule i to x 1 . . . x k , going through rule i from left to right. 4.4.4 Assumption Naming In any induction, case name sets up a list of assumptions also called name, which is subdivided into three parts: name .IH contains the induction hypotheses. name .hyps contains all the other hypotheses of this case in the induction rule. For rule inductions these are the hypotheses of rule name, for structural inductions these are empty. name .prems contains the (suitably instantiated) premises of the statement being proved, i.e. the Ai when proving [[ A1 ; . . .; An ]] = A.
51
Proof method induct diers from induction only in this naming policy: induct does not distinguish IH from hyps but subsumes IH under hyps.
More complicated inductive proofs than the ones we have seen so far often need to refer to specic assumptionsjust name or even name .prems and name .IH can be too unspecic. This is where the indexing of fact lists comes in handy, e.g. name .IH (2) or name .prems (12). 4.4.5 Rule Inversion Rule inversion is case analysis of which rule could have been used to derive some fact. The name rule inversion emphasizes that we are reasoning backwards: by which rules could some given fact have been proved? For the inductive denition of ev, rule inversion can be summarized like this: ev n = n = 0 ( k . n = Suc (Suc k ) ev k ) The realisation in Isabelle is a case analysis. A simple example is the proof that ev n = ev (n 2). We already went through the details informally in Section 3.5.1. This is the Isar proof: assume "ev n" from this have "ev (n 2)" proof cases case ev0 thus "ev (n 2)" by (simp add : ev .ev0) next case (evSS k ) thus "ev (n 2)" by (simp add : ev .evSS ) qed The key point here is that a case analysis over some inductively dened predicate is triggered by piping the given fact (here: from this ) into a proof by cases. Let us examine the assumptions available in each case. In case ev0 we have n = 0 and in case evSS we have n = Suc (Suc k ) and ev k. In each case the assumptions are available under the name of the case; there is no ne grained naming schema like for induction. Sometimes some rules could not have been used to derive the given fact because constructors clash. As an extreme example consider rule inversion applied to ev (Suc 0): neither rule ev0 nor rule evSS can yield ev (Suc 0) because Suc 0 unies neither with 0 nor with Suc (Suc n ). Impossible cases do not have to be proved. Hence we can prove anything from ev (Suc 0): assume "ev (Suc 0)" then have P by cases That is, ev (Suc 0) is simply not provable: lemma " ev (Suc 0)"
52
proof assume "ev (Suc 0)" then show False by cases qed Normally not all cases will be impossible. As a simple exercise, prove that ev (Suc (Suc (Suc 0))). 4.4.6 Advanced Rule Induction So far, rule induction was always applied to goals of the form I x y z = . . . where I is some inductively dened predicate and x, y, z are variables. In some rare situations one needs to deal with an assumption where not all arguments r, s, t are variables: lemma "I r s t = . . ." Applying the standard form of rule induction in such a situation will lead to strange and typically unproveable goals. We can easily reduce this situation to the standard one by introducing new variables x, y, z and reformulating the goal like this: lemma "I x y z = x = r = y = s = z = t = . . ." Standard rule induction will worke ne now, provided the free variables in r, s, t are generalized via arbitrary. However, induction can do the above transformation for us, behind the curtains, so we never need to see the expanded version of the lemma. This is what we need to write: lemma "I r s t = . . ." proof (induction "r" "s" "t" arbitrary : . . . rule : I .induct ) Just like for rule inversion, cases that are impossible because of constructor clashes will not show up at all. Here is a concrete example: lemma "ev (Suc m ) = ev m" proof (induction "Suc m" arbitrary : m rule : ev .induct ) x n assume IH : " m . n = Suc m = ev m" show " ev (Suc n )" proof contradition assume "ev (Suc n )" thus False proof cases rule inversion x k assume "n = Suc k" "ev k" thus False using IH by auto qed
53
Instead of the case and ?case magic we have spelled all formulas out. This is merely for greater clarity. We only need to deal with one case because the ev0 case is impossible. The form of the IH shows us that internally the lemma was expanded as explained above: ev x = x = Suc m = ev m . The goal ev (Suc n ) may suprise. The expanded version of the lemma would suggest that we have a x m assume Suc (Suc n ) = Suc m and need to show ev m. What happened is that Isabelle immediately simplied Suc (Suc n ) = Suc m to Suc n = m and could then eliminate m. Beware of such nice surprises with this advanced form of induction.
This advanced form of induction does not support the IH naming schema explained in Section 4.4.4: the induction hypotheses are instead found under the name hyps, like for the simpler induct method.
4.4.7 Exercises Exercise 4.3. Dene a recursive function elems :: a list a set and prove x elems xs = ys zs . xs = ys @ x # zs x / elems ys. Exercise 4.4. A context-free grammar can be seen as an inductive denition where each nonterminal A is an inductively dened predicate on lists of terminal symbols: A(w) mans that w is in the language generated by A. For example, the production S aSb can be viewed as the implication S w = S (a # w @ [b ]) where a and b are constructors of some datatype of terminal symbols: datatype tsymbs = a | b | . . . Dene the two grammars S T | | aSb | T aT b SS
References
1. Michael Huth and Mark Ryan. Logic in Computer Science. Cambridge University Press, 2004. 2. Alexander Krauss. Dening Recursive Functions in Isabelle/HOL. http:// isabelle.in.tum.de/doc/functions.pdf. 3. Tobias Nipkow. Whats in Main. http://isabelle.in.tum.de/doc/main.pdf. 4. Tobias Nipkow and Gerwin Klein. Concrete Semantics. A Proof Assistant Approach. http://www.in.tum.de/~nipkow/Concrete, 2013. 5. Tobias Nipkow, Lawrence Paulson, and Markus Wenzel. Isabelle/HOL A Proof Assistant for Higher-Order Logic, volume 2283 of Lect. Notes in Comp. Sci. Springer-Verlag, 2002. 6. Makarius Wenzel. The Isabelle/Isar Reference Manual. http://isabelle.in. tum.de/doc/isar-ref.pdf.