DrRacket Guide
Version 5.2.900.2
This guide is intended for programmers who are new to Racket or new to some part of Racket. It assumes programming experience, so if you are new to programming, consider instead reading How to Design Programs. If you want an especially quick introduction to Racket, start with Quick: An Introduction to Racket with Pictures. Chapter 2 provides a brief introduction to Racket. From Chapter 3 on, this guide dives into details, covering much of the Racket toolbox but leaving precise details to The Racket Reference and other reference manuals.
Contents

1 Welcome to Racket
  1.1 Interacting with Racket
  1.2 Definitions and Interactions
  1.3 Creating Executables
  1.4 A Note to Readers with Lisp/Scheme Experience
2 Racket Essentials
  2.1 Simple Values
  2.2 Simple Definitions and Expressions
    2.2.1 Definitions
    2.2.2 An Aside on Indenting Code
    2.2.3 Identifiers
    2.2.4 Function Calls (Procedure Applications)
    2.2.5 Conditionals with if, and, or, and cond
    2.2.6 Function Calls, Again
    2.2.7 Anonymous Functions with lambda
    2.2.8 Local Binding with define, let, and let*
  2.3 Lists, Iteration, and Recursion
    2.3.1 Predefined List Loops
    2.3.2 List Iteration from Scratch
    2.3.3 Tail Recursion
    2.3.4 Recursion versus Iteration
  2.4 Pairs, Lists, and Racket Syntax
    2.4.1 Quoting Pairs and Symbols with quote
    2.4.2
    2.4.3
3 Built-In Datatypes
  3.1 Booleans
  3.2 Numbers
  3.3 Characters
  3.4 Strings (Unicode)
  3.5 Bytes and Byte Strings
  3.6 Symbols
  3.7 Keywords
  3.8 Pairs and Lists
  3.9 Vectors
  3.10 Hash Tables
  3.11 Boxes
  3.12 Void and Undefined
4 Expressions and Definitions
  4.1 Notation
  4.2 Identifiers and Binding
  4.3 Function Calls (Procedure Applications)
    4.3.1 Evaluation Order and Arity
    4.3.2 Keyword Arguments
    4.3.3 The apply Function
  4.4
  4.5 Definitions: define
    4.5.1 Function Shorthand
    4.5.2 Curried Function Shorthand
    4.5.3 Multiple Values and define-values
    4.5.4 Internal Definitions
  4.6 Local Binding
    4.6.1 Parallel Binding: let
    4.6.2 Sequential Binding: let*
    4.6.3 Recursive Binding: letrec
    4.6.4 Named let
    4.6.5 Multiple Values: let-values, let*-values, letrec-values
  4.7 Conditionals
    4.7.1 Simple Branching: if
    4.7.2 Combining Tests: and and or
    4.7.3 Chaining Tests: cond
  4.8 Sequencing
    4.8.1 Effects Before: begin
    4.8.2 Effects After: begin0
    4.8.3 Effects If...: when and unless
  4.9 Assignment: set!
    4.9.1 Guidelines for Using Assignment
    4.9.2 Multiple Values: set!-values
  4.10 Quoting: quote and '
  4.11 Quasiquoting: quasiquote and `
  4.12 Simple Dispatch: case
  4.13 Dynamic Binding: parameterize
5 Programmer-Defined Datatypes
  5.1 Simple Structure Types: struct
  5.2 Copying and Update
  5.3 Structure Subtypes
  5.4 Opaque versus Transparent Structure Types
  5.5 Structure Comparisons
  5.6 Structure Type Generativity
  5.7 Prefab Structure Types
  5.8 More Structure Type Options
6 Modules
  6.1 Module Basics
    6.1.1 Organizing Modules
    6.1.2 Library Collections
    6.1.3 Adding Collections
  6.2 Module Syntax
    6.2.1 The module Form
    6.2.2 The #lang Shorthand
    6.2.3 Submodules
    6.2.4 Main and Test Submodules
  6.3
  Imports: require
  Exports: provide
  Assignment and Redefinition
7 Contracts
  7.1 Contracts and Boundaries
    7.1.1 Contract Violations
    7.1.2 Experimenting with Contracts and Modules
  7.2 Simple Contracts on Functions
    7.2.1 Styles of ->
    7.2.2
    7.2.3
    7.2.4
    7.2.5
  7.3 Contracts on Functions in General
    7.3.1 Optional Arguments
    7.3.2 Rest Arguments
    7.3.3 Keyword Arguments
    7.3.4 Optional Keyword Arguments
    7.3.5 Contracts for case-lambda
    7.3.6 Argument and Result Dependencies
    7.3.7 Checking State Changes
    7.3.8 Multiple Result Values
    7.3.9 Fixed but Statically Unknown Arities
  7.4
  7.5
    Guarantees for a Specific Value
    Guarantees for All Values
    Checking Properties of Data Structures
  7.6 Abstract Contracts using #:exists and #:∃
  7.7 Additional Examples
    7.7.1 A Customer-Manager Component
    7.7.2 A Parametric (Simple) Stack
    7.7.3 A Dictionary
    7.7.4 A Queue
  7.8 Gotchas
    7.8.1 Contracts and eq?
    7.8.2 Exists Contracts and Predicates
    7.8.3 Defining Recursive Contracts
    7.8.4 Mixing set! and contract-out
  Varieties of Ports
  Default Ports
  Reading and Writing Racket Data
  Datatypes and Serialization
  Bytes, Characters, and Encodings
  I/O Patterns
  9.3 Basic Assertions
  9.4 Characters and Character Classes
    9.4.1 Some Frequently Used Character Classes
    9.4.2 POSIX character classes
  9.5 Quantifiers
  9.6 Clusters
    9.6.1 Backreferences
    9.6.2 Non-capturing Clusters
    9.6.3 Cloisters
  9.7 Alternation
  9.8 Backtracking
  9.9 Looking Ahead and Behind
    9.9.1 Lookahead
    9.9.2 Lookbehind
  10.1 Exceptions
  10.2 Prompts and Aborts
  10.3 Continuations
11 Iterations and Comprehensions
  11.1 Sequence Constructors
  11.2 for and for*
  11.3 for/list and for*/list
  11.4 for/vector and for*/vector
  11.5 for/and and for/or
  11.6 for/first and for/last
  11.7 for/fold and for*/fold
  11.8 Multiple-Valued Sequences
  11.9 Iteration Performance
12 Pattern Matching
13 Classes and Objects
  13.1 Methods
  13.2 Initialization Arguments
  13.3 Internal and External Names
  13.4 Interfaces
  13.5 Final, Augment, and Inner
  13.6 Controlling the Scope of External Names
  13.7 Mixins
    13.7.1 Mixins and Interfaces
    13.7.2 The mixin Form
    13.7.3 Parameterized Mixins
  13.8 Traits
    13.8.1 Traits as Sets of Mixins
    13.8.2 Inherit and Super in Traits
    13.8.3 The trait Form
  13.9 Class Contracts
    13.9.1 External Class Contracts
    13.9.2 Internal Class Contracts
14 Units (Components)
  14.1 Signatures and Units
  14.2 Invoking Units
  14.3 Linking Units
  14.4 First-Class Units
  14.5 Whole-module Signatures and Units
  14.6 Contracts for Units
    14.6.1 Adding Contracts to Signatures
    14.6.2 Adding Contracts to Units
  14.7 unit versus module
15 Reflection and Dynamic Evaluation
  15.1 eval
    15.1.1 Local Scopes
    15.1.2 Namespaces
    15.1.3 Namespaces and Modules
  15.2 Manipulating Namespaces
    15.2.1 Creating and Installing Namespaces
    15.2.2 Sharing Data and Code Across Namespaces
  15.3 Scripting Evaluation and Using load
16 Macros
  16.1 Pattern-Based Macros
    16.1.1 define-syntax-rule
    16.1.2 Lexical Scope
    16.1.3 define-syntax and syntax-rules
    16.1.4 Matching Sequences
    16.1.5 Identifier Macros
    16.1.6 Macro-Generating Macros
    16.1.7 Extended Example: Call-by-Reference Functions
  16.2 General Macro Transformers
    16.2.1 Syntax Objects
    16.2.2 Mixing Patterns and Expressions: syntax-case
    16.2.3 with-syntax and generate-temporaries
    16.2.4 Compile and Run-Time Phases
    16.2.5 General Phase Levels
    16.2.6 Syntax Taints
17 Creating Languages
  17.1 Module Languages
    17.1.1 Implicit Form Bindings
    17.1.2 Using #lang s-exp
  17.2 Reader Extensions
    17.2.1 Source Locations
    17.2.2 Readtables
  17.3 Defining new #lang Languages
    17.3.1 Designating a #lang Language
    17.3.2 Using #lang reader
    17.3.3 Using #lang s-exp syntax/module-reader
    17.3.4 Installing a Language
    17.3.5 Source-Handling Configuration
    17.3.6 Module-Handling Configuration
18 Performance
  18.1 Performance in DrRacket
  18.2 The Bytecode and Just-in-Time (JIT) Compilers
  18.3 Modules and Performance
  18.4 Function-Call Optimizations
  18.5 Mutation and Performance
  18.6 letrec Performance
  18.7 Fixnum and Flonum Optimizations
  18.8 Unchecked, Unsafe Operations
  18.9 Memory Management
  18.10 Parallelism with Futures
  18.11 Parallelism with Places
  18.12 Distributed Places
19 Running and Creating Executables
  19.1 Running racket and gracket
    19.1.1 Interactive Mode
    19.1.2 Module Mode
    19.1.3 Load Mode
  19.2 Scripts
    19.2.1 Unix Scripts
    19.2.2 Windows Batch Files
  19.3 Creating Stand-Alone Executables
20 More Libraries
  20.2 The Web Server
  20.3 Using Foreign Libraries
  20.4 And More
21 Dialects of Racket and Scheme
  21.1 More Rackets
  21.2 Standards
    21.2.1 R5RS
    21.2.2 R6RS
  21.3 Teaching
22 Command-Line Tools and Your Editor of Choice
  22.1 Command-Line Tools
    22.1.1 Compilation and Configuration: raco
    22.1.2 Interactive evaluation: XREPL
    22.1.3 Bash completion
  22.2 Emacs
    22.2.1 Major Modes
    22.2.2 Minor Modes
  22.3 Vim
    22.3.1 Indentation
    22.3.2 Highlighting
    22.3.3 Structured Editing
    22.3.4 Scribble
    22.3.5 Miscellaneous
Index
1 Welcome to Racket
Depending on how you look at it, Racket is

• a programming language: a dialect of Lisp and a descendant of Scheme;
• a family of programming languages: variants of Racket, and more; or
• a set of tools for using a family of programming languages.

See §21 "Dialects of Racket and Scheme" for more information on other dialects of Lisp and how they relate to Racket.

Where there is no room for confusion, we use simply Racket.

Racket's main tools are

• racket, the core compiler, interpreter, and run-time system;
• DrRacket, the programming environment; and
• raco, a command-line tool for executing Racket commands that install packages, build libraries, and more.

Most likely, you'll want to explore the Racket language using DrRacket, especially at the beginning. If you prefer, you can also work with the command-line racket interpreter and your favorite text editor; see also §22 "Command-Line Tools and Your Editor of Choice". The rest of this guide presents the language mostly independent of your choice of editor.

If you're using DrRacket, you'll need to choose the proper language, because DrRacket accommodates many different variants of Racket, as well as other languages. Assuming that you've never used DrRacket before, start it up, type the line
#lang racket
in DrRacket's top text area, and then click the Run button that's above the text area. DrRacket then understands that you mean to work in the normal variant of Racket (as opposed to the smaller racket/base or many other possibilities).

If you've used DrRacket before with something other than a program that starts #lang, DrRacket will remember the last language that you used, instead of inferring the language from the #lang line. In that case, use the Language|Choose Language... menu item. In the dialog that appears, select the first item, which tells DrRacket to use the language that is declared in a source program via #lang. Put the #lang line above in the top text area, still.
1.1 Interacting with Racket
DrRacket's bottom text area and the racket command-line program (when started with no options) both act as a kind of calculator. You type a Racket expression, hit the Return key, and the answer is printed. In the terminology of Racket, this kind of calculator is called a read-eval-print loop or REPL. A number by itself is an expression, and the answer is just the number:
> 5
5
A string is also an expression that evaluates to itself. A string is written with double quotes at the start and end of the string:
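As a small sketch of these ideas, the module below can be run in DrRacket or with the racket command instead of typing each expression at the REPL. The particular strings are illustrative choices; substring is a built-in function that takes a string plus a start index and an end index.

```racket
#lang racket
;; A number and a string are both expressions that evaluate to themselves.
5
"Hello, world!"

;; substring extracts the characters from index 4 (inclusive)
;; up to index 7 (exclusive).
(substring "the boy out of the country" 4 7) ; => "boy"
```

Running the module prints each result in order, just as the REPL would after each expression.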
1.2 Definitions and Interactions
You can define your own functions that work like substring by using the define form, like this:
(define (extract str)
  (substring str 4 7))

> (extract "the boy out of the country")
"boy"
> (extract "the country out of the boy")
"cou"
Although you can evaluate the define form in the REPL, definitions are normally a part of a program that you want to keep and use later. So, in DrRacket, you'd normally put the definition in the top text area, called the definitions area, along with the #lang prefix.
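A minimal sketch of such a definitions area, assuming the file is saved as extract.rkt (the name used by the enter! example in this section):

```racket
#lang racket
;; Contents of the definitions area, or of the assumed file extract.rkt.
;; extract returns the characters at indices 4, 5, and 6 of its argument.
(define (extract str)
  (substring str 4 7))
```

After clicking Run in DrRacket, extract can be called from the interactions area.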
> (enter! "extract.rkt")
> (extract "the gal out of the city")
"gal"
The enter! form both loads the code and switches the evaluation context to the inside of the module, just like DrRacket's Run button.
,enter extract.rkt.
1.3 Creating Executables
If your file (or definitions area in DrRacket) contains

#lang racket

(define (extract str)
  (substring str 4 7))

(extract "the cat out of the bag")
then it is a complete program that prints "cat" when run. You can run the program within DrRacket or using enter! in racket, but if the program is saved in src-filename, you can also run it from a command line with
racket src-filename
To package the program as an executable, you have a few options:

• In DrRacket, you can select the Racket|Create Executable... menu item.

• From a command-line prompt, run raco exe src-filename, where src-filename contains the program. See §3 "raco exe: Creating Stand-Alone Executables" for more information.

• With Unix or Mac OS X, you can turn the program file into an executable script by inserting the line

  #! /usr/bin/env racket

  at the very beginning of the file. Also, change the file permissions to executable using chmod +x filename on the command line. The script works as long as racket is in the user's executable search path. Alternately, use a full path to racket after #! (with a space between #! and the path), in which case the user's executable search path does not matter.

See §19.2 "Scripts" for more information on script files.
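Putting the script option together, a complete script file might look like the sketch below. It assumes a Unix-like system with racket on the search path; the file name mentioned in the comment is a hypothetical choice.

```racket
#! /usr/bin/env racket
#lang racket
;; Hypothetical script file, for example saved as extract-script and
;; marked executable with chmod +x extract-script.
(define (extract str)
  (substring str 4 7))

;; The result of a module-level expression is printed when the script runs.
(extract "the cat out of the bag")
```

The #! line is treated as a comment by Racket, so the same file can still be opened in DrRacket or run with racket followed by the file name.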
1.4 A Note to Readers with Lisp/Scheme Experience
If you already know something about Scheme or Lisp, you might be tempted to put just
2 Racket Essentials
This chapter provides a quick introduction to Racket as background for the rest of the guide. Readers with some Racket experience can safely skip to §3 "Built-In Datatypes".
2.1 Simple Values
Racket values include numbers, booleans, strings, and byte strings. In DrRacket and documentation examples (when you read the documentation in color), value expressions are shown in green. Numbers are written in the usual way, including fractions and imaginary numbers:
1    1/2    1+2i

§3.2 "Numbers" (later in this guide) explains more about numbers.
Booleans are #t for true and #f for false. In conditionals, however, all non-#f values are treated as true. Strings are written between doublequotes. Within a string, backslash is an escaping character; for example, a backslash followed by a doublequote includes a literal doublequote in the string. Except for an unescaped doublequote or backslash, any Unicode character can appear in a string constant.
§3.4 "Strings (Unicode)" (later in this guide) explains more about strings.
> 1.0000
1.0
> "Bugs \u0022Figaro\u0022 Bunny"
"Bugs \"Figaro\" Bunny"
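The rules for booleans and string escapes can be checked with a short sketch like the one below. It uses the if form, which is introduced later in this chapter, and the sample strings are made up for illustration.

```racket
#lang racket
;; Any value other than #f counts as true in a conditional,
;; including 0 and the empty string.
(if 0 "true branch" "false branch")   ; => "true branch"
(if "" "true branch" "false branch")  ; => "true branch"
(if #f "true branch" "false branch")  ; => "false branch"

;; \" inside a string constant stands for one literal double quote,
;; so this string has 8 characters: s a y space " h i "
(string-length "say \"hi\"")          ; => 8
```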
2.2 Simple Definitions and Expressions

A program module is written as

#lang langname topform*
where a topform is either a definition or an expr. The REPL also evaluates topforms. In syntax specifications, text with a gray background, such as #lang, represents literal text. Whitespace must appear between such literals and nonterminals like id, except that whitespace is not required before or after (, ), [, or ]. A comment, which starts with ; and runs until the end of the line, is treated the same as whitespace. Following the usual conventions, * in a grammar means zero or more repetitions of the preceding element, + means one or more repetitions of the preceding element, and {} groups a sequence as an element for repetition.
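As an illustration of these rules, the following sketch is a small module made of a #lang line, a few topforms, and some comments. The name greeting and its value are invented for the example.

```racket
#lang racket
; A comment starts with ; and runs to the end of the line.
(define greeting "hello")   ; a topform that is a definition
(string-length greeting)    ; a topform that is an expression; prints 5

; No whitespace is needed before or after parentheses,
; so this reads the same as (+ (* 2 3) (- 5 1)).
(+(* 2 3)(- 5 1))           ; prints 10
```

The cramped spacing in the last expression is legal but hard to read; it is shown only to make the whitespace rule concrete.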
§1.3.8 "Reading Comments" in The Racket Reference provides more on different forms of comments.
2.2.1 Definitions

§4.5 "Definitions: define" (later in this guide) explains more about definitions.

A definition of the form

( define id expr )

binds id to the result of expr, while

( define ( id id* ) expr+ )
binds the first id to a function (also called a procedure) that takes arguments as named by the remaining ids. In the function case, the exprs are the body of the function. When the function is called, it returns the result of the last expr. Examples:
(define pie 3)

(define (piece str)
  (substring str 0 pie))

> pie
3
> (piece "key lime")
"key"
Under the hood, a function definition is really the same as a non-function definition, and a function name does not have to be used in a function call. A function is just another kind of value, though the printed form is necessarily less complete than the printed form of a number or string. Examples:
(define (bake flavor)
  (printf "pre-heating oven...\n")
  (string-append flavor " pie"))

> (bake "apple")
pre-heating oven...
"apple pie"
Racket programmers prefer to avoid side-effects, so a definition usually has just one expression in its body. It's important, though, to understand that multiple expressions are allowed in a definition body, because it explains why the following nobake function fails to include its argument in its result:
(define (nobake flavor)
  string-append flavor "jello")

> (nobake "green")
"jello"
Within nobake, there are no parentheses around string-append flavor "jello", so they are three separate expressions instead of one function-call expression. The expressions string-append and flavor are evaluated, but the results are never used. Instead, the result of the function is just the result of the final expression, "jello".
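For comparison, here is a sketch of both versions side by side. The name nobake-fixed is hypothetical and is used only so that the two definitions can live in one module.

```racket
#lang racket
;; The version from the text: three separate body expressions,
;; so only the last one, "jello", becomes the result.
(define (nobake flavor)
  string-append flavor "jello")

;; Hypothetical corrected variant: the added parentheses turn the same
;; three items into a single call to string-append.
(define (nobake-fixed flavor)
  (string-append flavor "jello"))

(nobake "green")        ; => "jello"
(nobake-fixed "green")  ; => "greenjello"
```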
2.2.2 An Aside on Indenting Code
Line breaks and indentation are not significant for parsing Racket programs, but most Racket programmers use a standard set of conventions to make code more readable. For example, the body of a definition is typically indented under the first line of the definition. Identifiers are written immediately after an open parenthesis with no extra space, and closing parentheses never go on their own line.

DrRacket automatically indents according to the standard style when you type Enter in a program or REPL expression. For example, if you hit Enter after typing (define (greet name), then DrRacket automatically inserts two spaces for the next line. If you change a region of code, you can select it in DrRacket and hit Tab, and DrRacket will re-indent the code (without inserting any line breaks). Editors like Emacs offer a Racket or Scheme mode with similar indentation support.
Re-indenting not only makes the code easier to read, it gives you extra feedback that your parentheses match in the way that you intended. For example, if you leave out a closing parenthesis after the last argument to a function, automatic indentation starts the next line under the first argument, instead of under the define keyword:
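The sketch below illustrates the effect. The function name halfbake and its string are illustrative; the mistaken version is kept inside comments so that the module still loads.

```racket
#lang racket
;; Mistaken version (commented out): the closing parenthesis after
;; flavor is missing, so automatic indentation lines the body up under
;; the first argument instead of two spaces in from define.
;;
;; (define (halfbake flavor
;;                   (string-append flavor " creme brulee")))

;; Intended version: with the parenthesis in place, the body is
;; indented two spaces under define.
(define (halfbake flavor)
  (string-append flavor " creme brulee"))

(halfbake "vanilla") ; => "vanilla creme brulee"
```

In the mistaken version, the unusual indentation is a visible hint that a parenthesis is missing before the code is ever run.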
2.2.3 Identifiers
§4.2 "Identifiers and Binding" (later in this guide) explains more about identifiers.
Racket's syntax for identifiers is especially liberal. Excluding the special characters
()[]{}",'`;#|\
and except for the sequences of characters that make number constants, almost any sequence of non-whitespace characters forms an id. For example, substring is an identifier. Also, string-append and a+b are identifiers, as opposed to arithmetic expressions. Here are several more examples:
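A quick way to convince yourself is to bind such names with define, as in this sketch. Apart from a+b, which the text mentions, the names are made up for illustration.

```racket
#lang racket
;; Each defined name below is a single identifier.
(define a+b 10)                     ; one name, not an addition of a and b
(define pass/fail "pass")
(define a-b-c+1-2-3 "still one identifier")

;; A ? at the end of a name is ordinary identifier syntax too.
(define (big? n)
  (> n 100))

(+ a+b 1)   ; => 11
(big? a+b)  ; => #f
```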
2.2.4 Function Calls (Procedure Applications)

We have already seen many function calls, which are called procedure applications in more traditional terminology. The syntax of a function call is
( id expr* )
where the number of exprs determines the number of arguments supplied to the function named by id.
§4.3 "Function Calls (Procedure Applications)" (later in this guide) explains more about function calls.
The racket language pre-defines many function identifiers, such as substring and string-append. More examples are below. In example Racket code throughout the documentation, uses of pre-defined names are hyperlinked to the reference manual. So, you can click on an identifier to get full details about its use.
> (string-append "rope" "twine" "yarn")
"ropetwineyarn"
> (substring "corduroys" 0 4)
"cord"
> (string-length "shoelace")
8
> (string? "Ceci n'est pas une string.")
#t
> (string? 1)
#f
> (sqrt 16)
4
> (sqrt -16)
0+4i
> (+ 1 2)
3
> (- 2 1)
1
> (< 2 1)
#f
> (>= 2 1)
#t
> (number? "c'est une number")  ; recognize numbers
#f
> (number? 1)
#t
> (equal? 6 "half dozen")  ; compare anything
#f
> (equal? 6 6)
#t
> (equal? "half dozen" "half dozen")
#t
2.2.5 Conditionals with if, and, or, and cond
The syntax of an if conditional is
( if expr expr expr )
The first expr is always evaluated. If it produces a non-#f value, then the second expr is evaluated for the result of the whole if expression; otherwise the third expr is evaluated for the result. Example:
> (if (> 2 3)
      "bigger"
      "smaller")
"smaller"
(define (reply s)
  (if (equal? "hello" (substring s 0 5))
      "hi!"
      "huh?"))
> (reply "hello racket")
"hi!"
> (reply "x:(.).xx")
"huh?"
Complex conditionals can be formed by nesting if expressions. For example, you could make the reply function work when given non-strings:
(define (reply s)
  (if (string? s)
      (if (equal? "hello" (substring s 0 5))
          "hi!"
          "huh?")
      "huh?"))
Instead of duplicating the "huh?" case, this function is better written as
(define (reply s)
  (if (if (string? s)
          (equal? "hello" (substring s 0 5))
          #f)
      "hi!"
      "huh?"))
but these kinds of nested ifs are difficult to read. Racket provides more readable shortcuts through the and and or forms, which work with any number of expressions:
§4.7.2 Combining Tests: and and or (later in this guide) explains more about and and or.
( and expr* )
( or expr* )
The and form short-circuits: it stops and returns #f when an expression produces #f, otherwise it keeps going. The or form similarly short-circuits when it encounters a true result. Examples:
(define (reply s)
  (if (and (string? s)
           (>= (string-length s) 5)
           (equal? "hello" (substring s 0 5)))
      "hi!"
      "huh?"))
> (reply "hello racket")
"hi!"
> (reply 17)
"huh?"
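The short-circuit behavior can also be observed directly. The following is a small sketch rather than part of the original examples; the helper lookup-or-default is hypothetical.

```racket
#lang racket

;; `and` yields its last value when no expression produces #f;
;; `or` yields the first non-#f value.
(and 1 2 3)        ; 3
(and 1 #f 3)       ; #f
(or #f "found" 3)  ; "found"

;; Short-circuiting: the call to `error` is never evaluated.
(or #t (error "not reached"))   ; #t
(and #f (error "not reached"))  ; #f

;; Hypothetical helper: fall back to a default when a lookup produces #f.
(define (lookup-or-default key table default)
  (or (hash-ref table key #f) default))

(lookup-or-default 'b (hash 'a 1) 0)  ; 0
```

Because any value other than #f counts as true, or is often used this way to supply a fallback value.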
Another common pattern of nested ifs involves a sequence of tests, each with its own result:
(define (reply-more s)
  (if (equal? "hello" (substring s 0 5))
      "hi!"
      (if (equal? "goodbye" (substring s 0 7))
          "bye!"
          (if (equal? "?" (substring s (- (string-length s) 1)))
              "I don't know"
              "huh?"))))
§4.7.3 Chaining Tests: cond (later in this guide) explains more about cond.
The shorthand for a sequence of tests is the cond form:
( cond {[ expr expr* ]}* )
A cond form contains a sequence of clauses between square brackets. In each clause, the first expr is a test expression. If it produces true, then the clause's remaining exprs are evaluated, and the last one in the clause provides the answer for the entire cond expression; the rest of the clauses are ignored. If the test expr produces #f, then the clause's remaining exprs are ignored, and evaluation continues with the next clause. The last clause can use else as a synonym for a #t test expression.
Using cond, the reply-more function can be more clearly written as follows:
(define (reply-more s)
  (cond
    [(equal? "hello" (substring s 0 5))
     "hi!"]
    [(equal? "goodbye" (substring s 0 7))
     "bye!"]
    [(equal? "?" (substring s (- (string-length s) 1)))
     "I don't know"]
    [else "huh?"]))
> (reply-more "hello racket")
"hi!"
> (reply-more "goodbye cruel world")
"bye!"
> (reply-more "what is your favorite color?")
"I don't know"
> (reply-more "mine is lime green")
"huh?"
The use of square brackets for cond clauses is a convention. In Racket, parentheses and square brackets are actually interchangeable, as long as ( is matched with ) and [ is matched with ]. Using square brackets in a few key places makes Racket code even more readable.
2.2.6 Function Calls, Again
§4.3 Function Calls (Procedure Applications) (later in this guide) explains more about function calls.
In our earlier grammar of function calls, we oversimplified. The actual syntax of a function call allows an arbitrary expression for the function, instead of just an id:
( expr expr* )
The first expr is often an id, such as string-append or +, but it can be anything that evaluates to a function. For example, it can be a conditional expression:
(define (double v)
  ((if (string? v) string-append +) v v))
> (double "mnah")
"mnahmnah"
> (double 5)
10
Syntactically, the first expression in a function call could even be a number, but that leads to an error, since a number is not a function.
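A small sketch of that error, assuming the default racket language: the application below is accepted syntactically, and the failure is caught at run time with with-handlers so that the program keeps going. The result symbol is an arbitrary choice for illustration.

```racket
#lang racket

;; (1 2 3) is syntactically a function call, but applying the number 1
;; raises a run-time error, which is caught here.
(define result
  (with-handlers ([exn:fail:contract? (lambda (e) 'not-a-procedure)])
    (1 2 3)))

result  ; 'not-a-procedure
```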
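2.2.7 Anonymous Functions with lambda
§4.4 Functions (Procedures): lambda (later in this guide) explains more about lambda.
Programming in Racket would be tedious if you had to name all of your numbers. Instead of writing (+ 1 2), you'd have to write (define a 1) and (define b 2) before writing (+ a b). Having to name all of your functions can be tedious, too. A lambda expression produces a function directly, without naming it. The lambda form is followed by identifiers for the function's arguments, and then the function's body expressions:
( lambda ( id* ) expr+ )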
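The examples that follow call a function named twice, whose definition does not appear in this section. A minimal sketch, assuming that twice applies its function argument two times (an assumption that is consistent with the results shown in those examples):

```racket
#lang racket

;; Assumed definition: apply f to v, then apply f again to the result.
(define (twice f v)
  (f (f v)))

(twice sqrt 16)                                     ; 2
(twice (lambda (s) (string-append s "!")) "hello")  ; "hello!!"
```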
> (twice (lambda (s) (string-append s "!"))
         "hello")
"hello!!"
> (twice (lambda (s) (string-append s "?!"))
         "hello")
"hello?!?!"
Another use of lambda is as a result for a function that generates functions:
(define (make-add-suffix s2)
  (lambda (s) (string-append s s2)))
> (twice (make-add-suffix "!") "hello")
"hello!!"
> (twice (make-add-suffix "?!") "hello")
"hello?!?!"
> (twice (make-add-suffix "...") "hello")
"hello......"
Racket is a lexically scoped language, which means that s2 in the function returned by make-add-suffix always refers to the argument for the call that created the function. In other words, the lambda-generated function remembers the right s2:
> (define louder (make-add-suffix "!"))
> (define less-sure (make-add-suffix "?"))
> (twice less-sure "really")
"really??"
> (twice louder "really")
"really!!"
We have so far referred to definitions of the form (define id expr) as "non-function definitions." This characterization is misleading, because the expr could be a lambda form, in which case the definition is equivalent to using the "function" definition form. For example, the following two definitions of louder are equivalent:
(define (louder s)
  (string-append s "!"))
(define louder
  (lambda (s)
    (string-append s "!")))
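2.2.8 Local Binding with define, let, and let*
§4.5.4 Internal Definitions (later in this guide) explains more about local (internal) definitions.
It's time to retract another simplification in our grammar of Racket. In the body of a function, definitions can appear before the body expressions:
( define ( id id* ) definition* expr+ )
( lambda ( id* ) definition* expr+ )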
Definitions at the start of a function body are local to the function body. Examples:
(define (converse s)
  (define (starts? s2) ; local to converse
    (define len2 (string-length s2))  ; local to starts?
    (and (>= (string-length s) len2)
         (equal? s2 (substring s 0 len2))))
  (cond
    [(starts? "hello") "hi!"]
    [(starts? "goodbye") "bye!"]
    [else "huh?"]))
> (converse "hello!")
"hi!"
> (converse "urp")
"huh?"
> starts? ; outside of converse, so...
reference to undefined identifier: starts?
Another way to create local bindings is the let form. An advantage of let is that it can be used in any expression position. Also, let binds many identifiers at once, instead of requiring a separate define for each identifier.
§4.5.4 Internal Definitions (later in this guide) explains more about let and let*.
( let ( {[ id expr ]}* ) expr+ )
Each binding clause is an id and an expr surrounded by square brackets, and the expressions after the clauses are the body of the let. In each clause, the id is bound to the result of the expr for use in the body.
(let ([x (random 4)]
      [o (random 4)])
  (cond
    [(> x o) "X wins"]
    [(> o x) "O wins"]
    [else "cat's game"]))
The bindings of a let form are available only in the body of the let, so the binding clauses cannot refer to each other. The let* form, in contrast, allows later clauses to use earlier bindings:
> (let* ([x (random 4)]
         [o (random 4)]
         [diff (number->string (abs (- x o)))])
    (cond
      [(> x o) (string-append "X wins by " diff)]
      [(> o x) (string-append "O wins by " diff)]
      [else "cat's game"]))
"cat's game"
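The difference in scope between let and let* can be seen with a deterministic sketch. The names x and y here are illustrative only.

```racket
#lang racket

(define x 10)

;; In `let`, the right-hand side for y is outside the scope of the new x,
;; so it refers to the outer x.
(let ([x 1]
      [y x])
  (list x y))   ; '(1 10)

;; In `let*`, each clause sees the bindings made before it.
(let* ([x 1]
       [y x])
  (list x y))   ; '(1 1)
```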
2.3 Lists, Iteration, and Recursion
Racket is a dialect of the language Lisp, whose name originally stood for "LISt Processor." The built-in list datatype remains a prominent feature of the language.
The list function takes any number of values and returns a list containing the values:
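For instance, a minimal sketch (the printed forms in the comments assume the default printing of the racket language):

```racket
#lang racket

(list "red" "green" "blue")          ; prints '("red" "green" "blue")
(list 1 2 3 4 5)                     ; prints '(1 2 3 4 5)
(length (list "hop" "skip" "jump"))  ; 3
```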
As you can see, a list result prints in the REPL as a quote ' and then a pair of parentheses wrapped around the printed form of the list elements. There's an opportunity for confusion here, because parentheses are used for both expressions, such as (list "red" "green" "blue"), and printed results, such as '("red" "green" "blue"). In addition to the quote, parentheses for results are printed in blue in the documentation and in DrRacket, whereas parentheses for expressions are brown.
Many predefined functions operate on lists. Here are a few examples:
> (list-ref (list "hop" "skip" "jump") 0)  ; extract by position
"hop"
> (list-ref (list "hop" "skip" "jump") 1)
"skip"
> (append (list "hop" "skip") (list "jump"))
'("hop" "skip" "jump")
> (reverse (list "hop" "skip" "jump"))
'("jump" "skip" "hop")
> (member "fall" (list "hop" "skip" "jump"))  ; check for an element
#f
2.3.1 Predefined List Loops
In addition to simple operations like append, Racket includes functions that iterate over the elements of a list. These iteration functions play a role similar to for in Java, Racket, and other languages. The body of a Racket iteration is packaged into a function to be applied to each element, so the lambda form becomes particularly handy in combination with iteration functions. Different list-iteration functions combine iteration results in different ways. The map function uses the per-element results to create a new list:
> (map sqrt (list 1 4 9 16))
'(1 2 3 4)
> (map (lambda (i)
         (string-append i "!"))
       (list "peanuts" "popcorn" "crackerjack"))
'("peanuts!" "popcorn!" "crackerjack!")
The andmap and ormap functions combine the results by anding or oring:
> (andmap string? (list "a" "b" "c"))
#t
> (andmap string? (list "a" "b" 6))
#f
> (ormap number? (list "a" "b" 6))
#t
The filter function keeps elements for which the body result is true, and discards elements for which it is #f:
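A brief sketch of filter with two predefined predicates:

```racket
#lang racket

(filter string? (list "a" "b" 6))     ; '("a" "b")
(filter positive? (list 1 -2 6 7 0))  ; '(1 6 7)
```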
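The map function can also accept multiple lists of the same length, in which case the given function must accept one argument for each list:
> (map (lambda (s n) (substring s 0 n))
       (list "peanuts" "popcorn" "crackerjack")
       (list 6 3 7))
'("peanut" "pop" "cracker")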
The foldl function generalizes some iteration functions. It uses the per-element function to both process an element and combine it with the "current" value, so the per-element function takes an extra argument for the current value, which it receives after the element. Also, a starting "current" value must be provided before the lists:
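A short sketch of foldl; the parameter names elem and v are illustrative. The second call shows that elements are combined from left to right, so consing onto an initially empty list reverses the input.

```racket
#lang racket

;; Sum of squares: v is the current value, starting at 0.
(foldl (lambda (elem v)
         (+ v (* elem elem)))
       0
       (list 1 2 3))              ; 14

;; Each element is consed onto the current value in turn.
(foldl cons empty (list 1 2 3))   ; '(3 2 1)
```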
2.3.2 List Iteration from Scratch
Although map and other iteration functions are predefined, they are not primitive in any interesting sense. You can write equivalent iterations using a handful of list primitives.
Since a Racket list is a linked list, the two core operations on a non-empty list are first: get the first thing in the list; and rest: get the rest of the list. Examples:
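A minimal sketch of the two operations:

```racket
#lang racket

(first (list 1 2 3))         ; 1
(rest (list 1 2 3))          ; '(2 3)
(first (rest (list 1 2 3)))  ; 2
```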
To add to the front of a list, use the cons function; the empty constant is the empty list:
> empty
'()
> (cons "head" empty)
'("head")
> (cons "dead" (cons "head" empty))
'("dead" "head")
To process a list, you need to be able to distinguish empty lists from non-empty lists, because first and rest work only on non-empty lists. The empty? function detects empty lists, and cons? detects non-empty lists:
> (empty? empty)
#t
> (empty? (cons "head" empty))
#f
> (cons? empty)
#f
> (cons? (cons "head" empty))
#t
With these pieces, you can write your own versions of the length function, map function, and more. Examples:
(define (my-length lst)
  (cond
    [(empty? lst) 0]
    [else (+ 1 (my-length (rest lst)))]))
> (my-length empty)
0
> (my-length (list "a" "b" "c"))
3
(define (my-map f lst)
  (cond
    [(empty? lst) empty]
    [else (cons (f (first lst))
                (my-map f (rest lst)))]))
> (my-map string-upcase (list "ready" "set" "go"))
'("READY" "SET" "GO")
If the derivation of the above definitions is mysterious to you, consider reading How to Design Programs. If you are merely suspicious of the use of recursive calls instead of a looping construct, then read on.
2.3.3 Tail Recursion
Both the my-length and my-map functions run in O(n) time for a list of length n, and they also use O(n) space. This is easy to see by imagining how (my-length (list "a" "b" "c")) must evaluate:
(my-length (list "a" "b" "c"))
= (+ 1 (my-length (list "b" "c")))
= (+ 1 (+ 1 (my-length (list "c"))))
= (+ 1 (+ 1 (+ 1 (my-length (list)))))
= (+ 1 (+ 1 (+ 1 0)))
= (+ 1 (+ 1 1))
= (+ 1 2)
= 3
For a list with n elements, evaluation will stack up n (+ 1 ...) additions, and then finally add them up when the list is exhausted.
You can avoid piling up additions by adding along the way. To accumulate a length this way, we need a function that takes both a list and the length of the list seen so far; the code below uses a local function iter that accumulates the length in an argument len:
(define (my-length lst)
  ; local function iter:
  (define (iter lst len)
    (cond
      [(empty? lst) len]
      [else (iter (rest lst) (+ len 1))]))
  ; body of my-length calls iter:
  (iter lst 0))
Now evaluation looks like this:
(my-length (list "a" "b" "c"))
= (iter (list "a" "b" "c") 0)
= (iter (list "b" "c") 1)
= (iter (list "c") 2)
= (iter (list) 3)
= 3
The revised my-length runs in constant space, just as the evaluation steps above suggest. That is, when the result of a function call, like (iter (list "b" "c") 1), is exactly the result of some other function call, like (iter (list "c") 2), then the first one doesn't have to wait around for the second one, because that takes up space for no good reason.
This evaluation behavior is sometimes called tail-call optimization, but it's not merely an "optimization" in Racket; it's a guarantee about the way the code will run. More precisely, an expression in tail position with respect to another expression does not take extra computation space over the other expression.
In the case of my-map, O(n) space complexity is reasonable, since it has to generate a result of size O(n). Nevertheless, you can reduce the constant factor by accumulating the result list. The only catch is that the accumulated list will be backwards, so you'll have to reverse it at the very end:
(define (my-map f lst)
  (define (iter lst backward-result)
    (cond
      [(empty? lst) (reverse backward-result)]
      [else (iter (rest lst)
                  (cons (f (first lst))
                        backward-result))]))
  (iter lst empty))
It turns out that if you write
Attempting to reduce a constant factor like this is usually not worthwhile, as discussed below.
2.3.4 Recursion versus Iteration
The my-length and my-map examples demonstrate that iteration is just a special case of recursion. In many languages, it's important to try to fit as many computations as possible into iteration form. Otherwise, performance will be bad, and moderately large inputs can lead to stack overflow. Similarly, in Racket, it is sometimes important to make sure that tail recursion is used to avoid O(n) space consumption when the computation is easily performed in constant space.
At the same time, recursion does not lead to particularly bad performance in Racket, and there is no such thing as stack overflow; you can run out of memory if a computation involves too much context, but exhausting memory typically requires orders of magnitude deeper recursion than would trigger a stack overflow in other languages. These considerations, combined with the fact that tail-recursive programs automatically run the same as a loop, lead Racket programmers to embrace recursive forms rather than avoid them.
Suppose, for example, that you want to remove consecutive duplicates from a list. While such a function can be written as a loop that remembers the previous element for each iteration, a Racket programmer would more likely just write the following:
(define (remove-dups l)
  (cond
    [(empty? l) empty]
    [(empty? (rest l)) l]
    [else
     (let ([i (first l)])
       (if (equal? i (first (rest l)))
           (remove-dups (rest l))
           (cons i (remove-dups (rest l)))))]))
> (remove-dups (list "a" "b" "b" "b" "c" "c"))
'("a" "b" "c")
In general, this function consumes O(n) space for an input list of length n, but that's fine, since it produces an O(n) result. If the input list happens to be mostly consecutive duplicates, then the resulting list can be much smaller than O(n), and remove-dups will also use much less than O(n) space! The reason is that when the function discards duplicates, it returns the result of a remove-dups call directly, so the tail-call "optimization" kicks in:
(remove-dups (list "a" "b" "b" "b" "b" "b"))
= (cons "a" (remove-dups (list "b" "b" "b" "b" "b")))
= (cons "a" (remove-dups (list "b" "b" "b" "b")))
= (cons "a" (remove-dups (list "b" "b" "b")))
= (cons "a" (remove-dups (list "b" "b")))
= (cons "a" (remove-dups (list "b")))
= (cons "a" (list "b"))
= (list "a" "b")
2.4 Pairs, Lists, and Racket Syntax
The cons function actually accepts any two values, not just a list for the second argument. When the second argument is not empty and not itself produced by cons, the result prints in a special way. The two values joined with cons are printed between parentheses, but with a dot (i.e., a period surrounded by whitespace) in between:
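A minimal sketch of that printing convention (the printed forms in the comments assume the default printing of the racket language):

```racket
#lang racket

(cons 1 2)               ; prints '(1 . 2)
(cons "banana" "split")  ; prints '("banana" . "split")
(cons 1 empty)           ; prints '(1), since the second argument is empty
(list? (cons 1 2))       ; #f: the result is a pair, but not a list
```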
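The car and cdr functions extract the first and second parts of any pair, and pair? recognizes pairs, including non-empty lists:
> (car (cons 1 2))
1
> (cdr (cons 1 2))
2
> (pair? empty)
#f
> (pair? (cons 1 2))
#t
> (pair? (list 1 2 3))
#t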
Racket's pair datatype and its relation to lists is essentially a historical curiosity, along with the dot notation for printing and the funny names car and cdr. Pairs are deeply wired into the culture, specification, and implementation of Racket, however, so they survive in the language.
You are perhaps most likely to encounter a non-list pair when making a mistake, such as accidentally reversing the arguments to cons:
Non-list pairs are used intentionally, sometimes. For example, the make-hash function takes a list of pairs, where the car of each pair is a key and the cdr is an arbitrary value.
The only thing more confusing to new Racketeers than non-list pairs is the printing convention for pairs where the second element is a pair, but is not a list:
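The three situations described above can be sketched as follows. The hash table contents are hypothetical, and the printed forms in the comments assume the default printing of the racket language.

```racket
#lang racket

;; Accidentally reversed arguments to cons produce a non-list pair:
(cons (list 2 3) 1)   ; prints '((2 3) . 1)
(cons 1 (list 2 3))   ; prints '(1 2 3)

;; Intentional non-list pairs: each pair is a key and a value.
(define h (make-hash (list (cons "apple" 1) (cons "pear" 2))))
(hash-ref h "pear")   ; 2

;; A pair whose second element is a pair, but not a list:
(cons 0 (cons 1 2))   ; prints '(0 1 . 2)
```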
2.4.1 Quoting Pairs and Symbols with quote
A list prints with a quote mark before it, but if an element of a list is itself a list, then no quote mark is printed for the inner list:
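For example, a one-line sketch with a nested list (printed form shown in the comment, assuming the default printing of the racket language):

```racket
#lang racket

;; Only the outermost list gets a quote mark when printed.
(list (list 1) (list 2 3) (list 4))   ; prints '((1) (2 3) (4))
```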
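The quote form lets you write a list as an expression in essentially the same way that the list prints:
> (quote ("red" "green" "blue"))
'("red" "green" "blue")
> (quote ((1) (2 3) (4)))
'((1) (2 3) (4))
> (quote ())
'()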
The quote form works with the dot notation, too, whether the quoted form is normalized by the dot-parenthesis elimination rule or not:
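A short sketch of quote combined with the dot notation:

```racket
#lang racket

(quote (1 . 2))        ; prints '(1 . 2)
(quote (0 . (1 . 2)))  ; prints '(0 1 . 2)

;; A fully dotted form that ends in () is just a list:
(equal? (quote (1 . (2 . (3 . ())))) (list 1 2 3))  ; #t
```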
Lists of any kind can be nested:
> (list (list 1 2 3) 5 (list "a" "b" "c"))
'((1 2 3) 5 ("a" "b" "c"))
> (quote ((1 2 3) 5 ("a" "b" "c")))
'((1 2 3) 5 ("a" "b" "c"))
If you wrap an identifier with quote, then you get output that looks like an identifier, but with a ' prefix:
> map
#<procedure:map>
> (quote map)
'map
> (symbol? (quote map))
#t
> (symbol? map)
#f
> (procedure? map)
#t
> (string->symbol "map")
'map
> (symbol->string (quote map))
"map"
In the same way that quote for a list automatically applies itself to nested lists, quote on a parenthesized sequence of identifiers automatically applies itself to the identifiers to create a list of symbols:
> (car (quote (road map)))
'road
> (symbol? (car (quote (road map))))
#t
When a symbol is inside a list that is printed with ', the ' on the symbol is omitted, since ' is doing the job already:
The quote form has no effect on a literal expression such as a number or string:
> (quote 42)
42
> (quote "on the record")
"on the record"
2.4.2 Abbreviating quote with '
As you may have guessed, you can abbreviate a use of quote by just putting ' in front of a form to quote:
> '(1 2 3)
'(1 2 3)
> 'road
'road
> '((1 2 3) road ("a" "b" "c"))
'((1 2 3) road ("a" "b" "c"))
In the documentation, ' within an expression is printed in green along with the form after it, since the combination is an expression that is a constant. In DrRacket, only the ' is colored green. DrRacket is more precisely correct, because the meaning of quote can vary depending on the context of an expression. In the documentation, however, we routinely assume that standard bindings are in scope, and so we paint quoted forms in green for extra clarity. A ' expands to a quote form in quite a literal way. You can see this if you put a ' in front of a form that has a ':
> (quote (quote road))
''road
> '(quote road)
''road
> ''road
''road
2.4.3 Lists and Racket Syntax
Now that you know the truth about pairs and lists, and now that you've seen quote, you're ready to understand the main way in which we have been simplifying Racket's true syntax.
The syntax of Racket is not defined directly in terms of character streams. Instead, the syntax is determined by two layers: a reader layer, which turns a sequence of characters into lists, symbols, and other constants; and an expander layer, which processes the lists, symbols, and other constants to parse them as an expression.
The rules for printing and reading go together. For example, a list is printed with parentheses, and reading a pair of parentheses produces a list. Similarly, a non-list pair is printed with the dot notation, and a dot on input effectively runs the dot-notation rules in reverse to obtain a pair.
One consequence of the read layer for expressions is that you can use the dot notation in expressions that are not quoted forms:
> (+ 1 . (2))
3
This works because (+ 1 . (2)) is just another way of writing (+ 1 2). It is practically never a good idea to write application expressions using this dot notation; it's just a consequence of the way Racket's syntax is defined.
Normally, . is allowed by the reader only with a parenthesized sequence, and only before the last element of the sequence. However, a pair of .s can also appear around a single element in a parenthesized sequence, as long as the element is not first or last. Such a pair triggers a reader conversion that moves the element between .s to the front of the list. The conversion enables a kind of general infix notation:
> (1 . < . 2)
#t
3 Built-In Datatypes
The previous chapter introduced some of Racket's built-in datatypes: numbers, booleans, strings, lists, and procedures. This section provides a more complete coverage of the built-in datatypes for simple forms of data.
3.1 Booleans
Racket has two distinguished constants to represent boolean values: #t for true and #f for false. Uppercase #T and #F are parsed as the same values, but the lowercase forms are preferred. The boolean? procedure recognizes the two boolean constants. In the result of a test expression for if, cond, and, or, etc., however, any value other than #f counts as true. Examples:
> (= 2 (+ 1 1))
#t
> (boolean? #t)
#t
> (boolean? #f)
#t
> (boolean? "no")
#f
> (if "no" 1 0)
1
3.2 Numbers
§1.3.3 Reading Numbers in The Racket Reference documents the fine points of the syntax of numbers.
A Racket number is either exact or inexact:
- An exact number is either
  - an arbitrarily large or small integer, such as 5, 99999999999999999, or -17;
  - a rational that is exactly the ratio of two arbitrarily small or large integers, such as 1/2, 99999999999999999/2, or -3/4; or
  - a complex number with exact real and imaginary parts (where the imaginary part is not zero), such as 1+2i or 1/2+3/4i.
- An inexact number is either
  - an IEEE floating-point representation of a number, such as 2.0 or 3.14e+87, where the IEEE infinities and not-a-number are written +inf.0, -inf.0, and +nan.0 (or -nan.0); or
  - a complex number with real and imaginary parts that are IEEE floating-point representations, such as 2.0+3.0i or -inf.0+nan.0i; as a special case, an inexact complex number can have an exact zero real part with an inexact imaginary part.
Inexact numbers print with a decimal point or exponent specifier, and exact numbers print as integers and fractions. The same conventions apply for reading number constants, but #e or #i can prefix a number to force its parsing as an exact or inexact number. The prefixes #b, #o, and #x specify binary, octal, and hexadecimal interpretation of digits. Examples:
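A few number constants that exercise these prefixes, as a minimal sketch:

```racket
#lang racket

0.5             ; 0.5   (inexact)
#e0.5           ; 1/2   (forced exact)
#i1/2           ; 0.5   (forced inexact)
#x03BB          ; 955   (hexadecimal)
#b101           ; 5     (binary)
#o17            ; 15    (octal)
(exact? #e0.5)  ; #t
```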
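Computations that involve an inexact number produce inexact results, although a computation that branches on a comparison of inexact numbers can still produce an exact result:
> (/ 1 2)
1/2
> (/ 1 2.0)
0.5
> (if (= 3.0 2.999) 1 2)
2
> (inexact->exact 0.1)
3602879701896397/36028797018963968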
Inexact results are also produced by procedures such as sqrt, log, and sin when an exact result would require representing real numbers that are not rational. Racket can represent only rational numbers and complex numbers with rational parts. Examples:
> (sin 0)  ; rational...
0
Computation with very large exact integers or with non-integer exact numbers can be much more expensive than computation with inexact numbers:
(define (sigma f a b)
  (if (= a b)
      0
      (+ (f a) (sigma f (+ a 1) b))))
> (time (round (sigma (lambda (x) (/ 1 x)) 1 2000)))
cpu time: 182 real time: 182 gc time: 37
8
> (time (round (sigma (lambda (x) (/ 1.0 x)) 1 2000)))
cpu time: 1 real time: 1 gc time: 0
8.0
The number categories integer, rational, real (always rational), and complex are defined in the usual way, and are recognized by the procedures integer?, rational?, real?, and complex?, in addition to the generic number?. A few mathematical procedures accept only real numbers, but most implement standard extensions to complex numbers. Examples:
> (integer? 5)
#t
> (complex? 5)
#t
> (integer? 5.0)
#t
> (integer? 1+2i)
#f
> (complex? 1+2i)
#t
> (complex? 1.0+2.0i)
#t
> (abs -5)
5
> (abs -5+2i)
abs: expects argument of type <real number>; given: -5+2i
> (sin -5+2i)
3.6076607742131563+1.0288031496599335i
The = procedure compares numbers for numerical equality. If it is given both inexact and exact numbers to compare, it essentially converts the inexact numbers to exact before comparing. The eqv? (and therefore equal?) procedure, in contrast, compares numbers considering both exactness and numerical equality. Examples:
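A minimal sketch of the difference between numerical equality and the exactness-sensitive comparisons:

```racket
#lang racket

(= 1 1.0)       ; #t: numerically equal
(eqv? 1 1.0)    ; #f: one is exact, the other inexact
(equal? 1 1.0)  ; #f: equal? follows eqv? on numbers
(eqv? 1.0 1.0)  ; #t
```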
> (= 1/2 0.5)
#t
> (= 1/10 0.1)
#f
> (inexact->exact 0.1)
3602879701896397/36028797018963968
§3.2 Numbers in The Racket Reference provides more on numbers and number procedures.
3.3 Characters
§1.3.13 Reading Characters in The Racket Reference documents the fine points of the syntax of characters.
A Racket character corresponds to a Unicode scalar value. Roughly, a scalar value is an unsigned integer whose representation fits into 21 bits, and that maps to some notion of a natural-language character or piece of a character. Technically, a scalar value is a simpler notion than the concept called a "character" in the Unicode standard, but it's an approximation that works well for many purposes. For example, any accented Roman letter can be represented as a scalar value, as can any common Chinese character.
Although each Racket character corresponds to an integer, the character datatype is separate from numbers. The char->integer and integer->char procedures convert between scalar-value numbers and the corresponding character.
A printable character normally prints as #\ followed by the represented character. An unprintable character normally prints as #\u followed by the scalar value as a hexadecimal number. A few characters are printed specially; for example, the space and linefeed characters print as #\space and #\newline, respectively. Examples:
> (integer->char 65)
#\A
> (char->integer #\A)
65
> #\λ
#\λ
> #\u03BB
#\λ
> (integer->char 17)
#\u0011
> (char->integer #\space)
32
The display procedure directly writes a character to the current output port (see §8 Input and Output), in contrast to the character-constant syntax used to print a character result. Examples:
Racket provides several classification and conversion procedures on characters. Examples:
> (char-alphabetic? #\A)
#t
> (char-numeric? #\0)
#t
> (char-whitespace? #\newline)
#t
> (char-downcase #\A)
#\a
> (char-upcase #\ß)
#\ß
The char=? procedure compares two or more characters, and char-ci=? compares characters ignoring case. The eqv? and equal? procedures behave the same as char=? on characters; use char=? when you want to more specifically declare that the values being compared are characters. Examples:
> (char=? #\a #\A)
#f
> (char-ci=? #\a #\A)
#t
> (eqv? #\a #\A)
#f
§3.5 Characters in The Racket Reference provides more on characters and character procedures.
3.4 Strings (Unicode)
§1.3.6 Reading Strings in The Racket Reference documents the fine points of the syntax of strings.
A string is a fixed-length array of characters. It prints using doublequotes, where doublequote and backslash characters within the string are escaped with backslashes. Other common string escapes are supported, including \n for a linefeed, \r for a carriage return, octal escapes using \ followed by up to three octal digits, and hexadecimal escapes with \u (up to four digits). Unprintable characters in a string are normally shown with \u when the string is printed.
The display procedure directly writes the characters of a string to the current output port (see §8 Input and Output), in contrast to the string-constant syntax used to print a string result. Examples:
> "Apple"
"Apple"
> "\u03BB"
"λ"
> (display "Apple")
Apple
> (display "a \"quoted\" thing")
a "quoted" thing
> (display "two\nlines")
two
lines
> (display "\u03BB")
λ
A string can be mutable or immutable; strings written directly as expressions are immutable, but most other strings are mutable. The make-string procedure creates a mutable string given a length and optional fill character. The string-ref procedure accesses a character from a string (with 0-based indexing); the string-set! procedure changes a character in a mutable string. Examples:
> (string-ref "Apple" 0)
#\A
> (define s (make-string 5 #\.))
> s
"....."
> (string-set! s 2 #\λ)
> s
"..λ.."
String ordering and case operations are generally locale-independent; that is, they work the same for all users. A few locale-dependent operations are provided that allow the way that strings are case-folded and sorted to depend on the end-user's locale. If you're sorting strings, for example, use string<? or string-ci<? if the sort result should be consistent across machines and users, but use string-locale<? or string-locale-ci<? if the sort is purely to order strings for an end user. Examples:
> (string<? "apple" "Banana")
#f
> (string-ci<? "apple" "Banana")
#t
> (string-upcase "Straße")
"STRASSE"
> (parameterize ([current-locale "C"])
    (string-locale-upcase "Straße"))
"STRAßE"
For working with plain ASCII, working with raw bytes, or encoding/decoding Unicode strings as bytes, use byte strings.
§3.3 Strings in The Racket Reference provides more on strings and string procedures.
3.5 Bytes and Byte Strings
A byte is an exact integer between 0 and 255, inclusive. The byte? predicate recognizes numbers that represent bytes. Examples:
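A minimal sketch of the byte? predicate at the boundaries of the range:

```racket
#lang racket

(byte? 0)    ; #t
(byte? 255)  ; #t
(byte? 256)  ; #f: out of range
(byte? -1)   ; #f: negative
(byte? 1.0)  ; #f: not an exact integer
```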
A byte string is similar to a string (see §3.4 Strings (Unicode)), but its content is a sequence of bytes instead of characters. Byte strings can be used in applications that process pure ASCII instead of Unicode text. The printed form of a byte string supports such uses in particular, because a byte string prints like the ASCII decoding of the byte string, but prefixed with a #. Unprintable ASCII characters or non-ASCII bytes in the byte string are written with octal notation. Examples:
> #"Apple"
#"Apple"
> (bytes-ref #"Apple" 0)
65
> (make-bytes 3 65)
#"AAA"
> (define b (make-bytes 2 0))
> b
#"\0\0"
> (bytes-set! b 0 1)
> (bytes-set! b 1 255)
> b
#"\1\377"
The display form of a byte string writes its raw bytes to the current output port (see §8 Input and Output). Technically, display of a normal (i.e., character) string prints the UTF-8 encoding of the string to the current output port, since output is ultimately defined in terms of bytes; display of a byte string, however, writes the raw bytes with no encoding. Along the same lines, when this documentation shows output, it technically shows the UTF-8-decoded form of the output. Examples:
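The relationship between a string and its UTF-8 encoding can be sketched without any output ports, using string->bytes/utf-8 and bytes->string/utf-8. The names s and b are illustrative.

```racket
#lang racket

(define s "\u03BB")                 ; a one-character string (Greek lambda)
(string-length s)                   ; 1

(define b (string->bytes/utf-8 s))  ; its UTF-8 encoding
b                                   ; #"\316\273"
(bytes-length b)                    ; 2

(equal? s (bytes->string/utf-8 b))  ; #t: decoding recovers the string
```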
§1.3.6 Reading Strings in The Racket Reference documents the fine points of the syntax of byte strings.
For explicitly converting between strings and byte strings, Racket supports three kinds of encoding directly: UTF-8, Latin-1, and the current locale's encoding. General facilities for byte-to-byte conversions (especially to and from UTF-8) fill the gap to support arbitrary string encodings. Examples:
> (bytes->string/utf-8 #"\316\273")
"λ"
> (bytes->string/latin-1 #"\316\273")
"Î»"
> (parameterize ([current-locale "C"])  ; C locale supports ASCII,
    (bytes->string/locale #"\316\273")) ; only, so...
bytes->string/locale: byte string is not a valid encoding for the current locale: #"\316\273"
> (let ([cvt (bytes-open-converter "cp1253" ; Greek code page
                                   "UTF-8")]
        [dest (make-bytes 2)])
    (bytes-convert cvt #"\353" 0 1 dest)
    (bytes-close-converter cvt)
    (bytes->string/utf-8 dest))
"λ"
§3.4 Byte Strings in The Racket Reference provides more on byte strings and byte-string procedures.
3.6 Symbols
A symbol is an atomic value that prints like an identifier preceded with '. An expression that starts with ' and continues with an identifier produces a symbol value. Examples:
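A minimal sketch of symbol values and their comparison:

```racket
#lang racket

'a                             ; prints 'a
(symbol? 'a)                   ; #t
(eq? 'a 'a)                    ; #t
(eq? 'a (string->symbol "a"))  ; #t: the same symbol, obtained from a string
```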
#t
> (eq? 'a 'b)
#f
> (eq? 'a 'A)
#f
> #ci'A
'a
Any string (i.e., any character sequence) can be supplied to string->symbol to obtain the corresponding symbol. For reader input, any character can appear directly in an identifier, except for whitespace and the following special characters:
()[]{}",'`;#|\
Actually, # is disallowed only at the beginning of a symbol, and then only if not followed by %; otherwise, # is allowed, too. Also, . by itself is not a symbol.
Whitespace or special characters can be included in an identifier by quoting them with | or \. These quoting mechanisms are used in the printed form of identifiers that contain special characters or that might otherwise look like numbers. Examples:
> (string->symbol "one, two")
'|one, two|
> (string->symbol "6")
'|6|
The write function prints a symbol without a ' prefix. The display form of a symbol is the same as the corresponding string. Examples:
> (write 'Apple)
Apple
> (display 'Apple)
Apple
> (write '|6|)
|6|
> (display '|6|)
6
1.3.2 Reading Symbols in The Racket Reference documents the fine points of the syntax of symbols.
The gensym and string->uninterned-symbol procedures generate fresh uninterned symbols that are not equal (according to eq?) to any previously interned or uninterned symbol. Uninterned symbols are useful as fresh tags that cannot be confused with any other value. Examples:
> (define s (gensym))
> s
'g42
> (eq? s 'g42)
#f
> (eq? 'a (string->uninterned-symbol "a"))
#f
3.7
Keywords
A keyword value is similar to a symbol (see 3.6 Symbols), but its printed form is prefixed with #:. Examples:
> (string->keyword "apple")
'#:apple
> '#:apple
'#:apple
> (eq? '#:apple (string->keyword "apple"))
#t
More precisely, a keyword is analogous to an identifier; in the same way that an identifier can be quoted to produce a symbol, a keyword can be quoted to produce a value. The same term keyword is used in both cases, but we sometimes use keyword value to refer more specifically to the result of a quote-keyword expression or of string->keyword. An unquoted keyword is not an expression, just as an unquoted identifier does not produce a symbol. Examples:
> not-a-symbol-expression
reference to undefined identifier: not-a-symbol-expression
> #:not-a-keyword-expression
eval:2:0: #%datum: keyword used as an expression in: #:not-a-keyword-expression
1.3.14 Reading Keywords in The Racket Reference documents the fine points of the syntax of keywords.
Despite their similarities, keywords are used in a different way than identifiers or symbols. Keywords are intended for use (unquoted) as special markers in argument lists and in certain syntactic forms. For run-time flags and enumerations, use symbols instead of keywords. The example below illustrates the distinct roles of keywords and symbols.
Examples:
> (define dir (find-system-path 'temp-dir)) ; not '#:temp-dir
> (with-output-to-file (build-path dir "stuff.txt")
    (lambda () (printf "example\n"))
    ; optional #:mode argument can be 'text or 'binary
    #:mode 'text
    ; optional #:exists argument can be 'replace, 'truncate, ...
    #:exists 'replace)
3.8
Pairs and Lists
A pair joins two arbitrary values. The cons procedure constructs pairs, and the car and cdr procedures extract the first and second elements of the pair, respectively. The pair? predicate recognizes pairs. Some pairs print by wrapping parentheses around the printed forms of the two pair elements, putting a ' at the beginning and a . between the elements. Examples:
> (cons 1 2)
'(1 . 2)
> (cons (cons 1 2) 3)
'((1 . 2) . 3)
> (car (cons 1 2))
1
> (cdr (cons 1 2))
2
> (pair? (cons 1 2))
#t
A list is a combination of pairs that creates a linked list. More precisely, a list is either the empty list null, or it is a pair whose first element is a list element and whose second element is a list. The list? predicate recognizes lists. The null? predicate recognizes the empty list. A list normally prints as a ' followed by a pair of parentheses wrapped around the list elements. Examples:
> (cons 0 (cons 1 (cons 2 null)))
'(0 1 2)
> (list? null)
#t
> (list? (cons 1 (cons 2 null)))
#t
> (list? (cons 1 2))
#f
A list or pair prints using list or cons when one of its elements cannot be written as a quoted value. For example, a value constructed with srcloc cannot be written using quote, and it prints using srcloc:
> (srcloc "file.rkt" 1 0 1 (+ 4 4))
(srcloc "file.rkt" 1 0 1 8)
> (list 'here (srcloc "file.rkt" 1 0 1 8) 'there)
(list 'here (srcloc "file.rkt" 1 0 1 8) 'there)
> (cons 1 (srcloc "file.rkt" 1 0 1 8))
(cons 1 (srcloc "file.rkt" 1 0 1 8))
> (cons 1 (cons 2 (srcloc "file.rkt" 1 0 1 8)))
(list* 1 2 (srcloc "file.rkt" 1 0 1 8))
As shown in the last example, list* is used to abbreviate a series of conses that cannot be abbreviated using list. The write and display functions print a pair or list without a leading ', cons, list, or list*. There is no difference between write and display for a pair or list, except as they apply to elements of the list: Examples:
> (write (cons 1 2))
(1 . 2)
> (display (cons 1 2))
(1 . 2)
> (write null)
()
> (display null)
()
> (write (list 1 2 "3"))
(1 2 "3")

> (map (lambda (i) (/ 1 i))
       '(1 2 3))
'(1 1/2 1/3)
> (andmap (lambda (i) (i . < . 3))
          '(1 2 3))
#f
> (ormap (lambda (i) (i . < . 3))
         '(1 2 3))
#t
> (filter (lambda (i) (i . < . 3))
          '(1 2 3))
'(1 2)
> (foldl (lambda (v i) (+ v i))
         10
         '(1 2 3))
16
> (for-each (lambda (i) (display i))
            '(1 2 3))
123
> (member "Keys"
          '("Florida" "Keys" "U.S.A."))
'("Keys" "U.S.A.")
> (assoc 'where
         '((when "3:30") (where "Florida") (who "Mickey")))
'(where "Florida")
Pairs are immutable (contrary to Lisp tradition), and pair? and list? recognize only immutable pairs and lists. The mcons procedure creates a mutable pair, which works with set-mcar! and set-mcdr!, as well as mcar and mcdr. A mutable pair prints using mcons, while write and display print mutable pairs with { and }: Examples:
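A minimal sketch of these procedures in use. The mpair? predicate is not mentioned above; it is included here on the assumption that a reader wants the mutable counterpart of pair?, and the name p is made up for the illustration:

```racket
#lang racket
(define p (mcons 1 2))   ; a mutable pair
(pair? p)                ; => #f, since pair? recognizes only immutable pairs
(mpair? p)               ; => #t
(set-mcar! p 0)          ; replace the first element in place
(mcar p)                 ; => 0
(mcdr p)                 ; => 2
(write p)                ; prints {0 . 2}
```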
3.9 Pairs and Lists in The Racket Reference provides more on pairs and lists.
3.9
Vectors
A vector is a fixed-length array of arbitrary values. Unlike a list, a vector supports constant-time access and update of its elements. A vector prints similar to a list, as a parenthesized sequence of its elements, but a vector is prefixed with # after ', or it uses vector if one of its elements cannot be expressed with quote. For a vector as an expression, an optional length can be supplied. Also, a vector as an expression implicitly quotes the forms for its content, which means that identifiers and parenthesized forms in a vector constant represent symbols and lists. Examples:
> #("a" "b" "c")
'#("a" "b" "c")
> #(name (that tune))
'#(name (that tune))
> (vector-ref #("a" "b" "c") 1)
"b"
> (vector-ref #(name (that tune)) 1)
'(that tune)
Like strings, a vector is either mutable or immutable, and vectors written directly as expressions are immutable. Vectors can be converted to lists and vice versa via list->vector and vector->list; such conversions are particularly useful in combination with predefined procedures on lists. When allocating extra lists seems too expensive, consider using looping forms like for/fold, which recognize vectors as well as lists. Example:
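A short sketch of both approaches: a round trip through a list so that map can be used, and a for/fold loop that walks a vector without an intermediate list. The names titled and total, and the sample data, are made up for this illustration:

```racket
#lang racket
;; Convert to a list, apply a list procedure, and convert back.
(define titled
  (list->vector
   (map string-titlecase
        (vector->list #("three" "blind" "mice")))))
titled   ; => '#("Three" "Blind" "Mice")

;; Sum the elements directly, with no list allocated along the way.
(define total
  (for/fold ([sum 0]) ([n (in-vector #(1 2 3))])
    (+ sum n)))
total    ; => 6
```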
1.3.9 Reading Vectors in The Racket Reference documents the fine points of the syntax of vectors.
3.10
Hash Tables
3.11 Vectors in The Racket Reference provides more on vectors and vector procedures.
A hash table implements a mapping from keys to values, where both keys and values can be arbitrary Scheme values, and access and update to the table are normally constant-time operations. Keys are compared using equal?, eqv?, or eq?, depending on whether the hash table is created with make-hash, make-hasheqv, or make-hasheq. Examples:
> (define ht (make-hash))
> (hash-set! ht "apple" '(red round))
> (hash-set! ht "banana" '(yellow long))
> (hash-ref ht "apple")
'(red round)
> (hash-ref ht "coconut")
hash-ref: no value found for key: "coconut"
> (hash-ref ht "coconut" "not there")
"not there"
The hash, hasheqv, and hasheq functions create immutable hash tables from an initial set of keys and values, in which each value is provided as an argument after its key. Immutable hash tables can be extended with hash-set, which produces a new immutable hash table in constant time. Examples:
> (define ht (hash "apple" 'red "banana" 'yellow))
> (hash-ref ht "apple")
'red
> (define ht2 (hash-set ht "coconut" 'brown))
> (hash-ref ht "coconut")
hash-ref: no value found for key: "coconut"
> (hash-ref ht2 "coconut")
'brown
A literal immutable hash table can be written as an expression by using #hash (for an equal?-based table), #hasheqv (for an eqv?-based table), or #hasheq (for an eq?-based table). A parenthesized sequence must immediately follow #hash, #hasheq, or #hasheqv, where each element is a dotted key-value pair. The #hash, etc. forms implicitly quote their key and value sub-forms. Examples:
> (define ht #hash(("apple" . red)
                   ("banana" . yellow)))
> (hash-ref ht "apple")
'red
Both mutable and immutable hash tables print like immutable hash tables, using a quoted #hash, #hasheqv, or #hasheq form if all keys and values can be expressed with quote or using hash, hasheq, or hasheqv otherwise: Examples:
> #hash(("apple" . red)
        ("banana" . yellow))
'#hash(("apple" . red) ("banana" . yellow))
> (hash 1 (srcloc "file.rkt" 1 0 1 (+ 4 4)))
(hash 1 (srcloc "file.rkt" 1 0 1 8))
1.3.11 Reading Hash Tables in The Racket Reference documents the fine points of the syntax of hash table literals.
A mutable hash table can optionally retain its keys weakly, so each mapping is retained only so long as the key is retained elsewhere. Examples:
> (define ht (make-weak-hasheq))
> (hash-set! ht (gensym) "can you see me?")
> (collect-garbage)
> (hash-count ht)
0
Beware that even a weak hash table retains its values strongly, as long as the corresponding key is accessible. This creates a catch-22 dependency when a value refers back to its key, so that the mapping is retained permanently. To break the cycle, map the key to an ephemeron that pairs the value with its key (in addition to the implicit pairing of the hash table). Examples:
> (define ht (make-weak-hasheq))
> (let ([g (gensym)])
    (hash-set! ht g (list g)))
> (collect-garbage)
> (hash-count ht)
1
> (define ht (make-weak-hasheq))
> (let ([g (gensym)])
    (hash-set! ht g (make-ephemeron g (list g))))
> (collect-garbage)
> (hash-count ht)
0
3.13 Hash Tables in The Racket Reference provides more on hash tables and hash-table procedures.
3.11
Boxes
A box is like a single-element vector. It can print as a quoted #& followed by the printed form of the boxed value. A #& form can also be used as an expression, but since the resulting box is constant, it has practically no use. Examples:
> (define b (box "apple"))
> b
'#&"apple"
> (unbox b)
"apple"
> (set-box! b '(banana boat))
> b
'#&(banana boat)
3.12
Void and Undefined
3.12 Boxes in The Racket Reference provides more on boxes and box procedures.
Some procedures or expression forms have no need for a result value. For example, the display procedure is called only for the side-effect of writing output. In such cases the result value is normally a special constant that prints as #<void>. When the result of an expression is simply #<void>, the REPL does not print anything.
The void procedure takes any number of arguments and returns #<void>. (That is, the identifier void is bound to a procedure that returns #<void>, instead of being bound directly to #<void>.) Examples:
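A minimal sketch of void in use. The void? predicate is not mentioned above; it is a standard racket procedure, used here only to make the result visible, and the name v is made up for the illustration:

```racket
#lang racket
(define v (void 1 2 3))   ; the arguments are ignored
(void? v)                 ; => #t
(list (void))             ; => '(#<void>)
(void? (display ""))      ; => #t, since display returns #<void>
```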
The 2 Racket Essentials chapter introduced some of Racket's syntactic forms: definitions, procedure applications, conditionals, and so on. This section provides more details on those forms, plus a few additional basic forms.
4.1
Notation
This chapter (and the rest of the documentation) uses a slightly different notation than the character-based grammars of the 2 Racket Essentials chapter. The grammar for a use of a syntactic form something is shown like this:
Some syntactic-form specifications refer to meta-variables that are not implicitly defined and not previously defined. Such meta-variables are defined after the main form, using a BNF-like format for alternatives:
4.2
Identifiers and Binding
The context of an expression determines the meaning of identifiers that appear in the expression. In particular, starting a module with the language racket, as in
#lang racket
means that, within the module, the identifiers described in this guide start with the meaning described here: cons refers to the function that creates a pair, car refers to the function that extracts the first element of a pair, and so on. Forms like define, lambda, and let associate a meaning with one or more identifiers; that is, they bind identifiers. The part of the program for which the binding applies is the scope of the binding. The set of bindings in effect for a given expression is the expression's environment. For example, in
#lang racket
(define f
  (lambda (x)
    (let ([y 5])
      (+ x y))))
(f 10)
the define is a binding of f, the lambda has a binding for x, and the let has a binding for y. The scope of the binding for f is the entire module; the scope of the x binding is (let ([y 5]) (+ x y)); and the scope of the y binding is just (+ x y). The environment of (+ x y) includes bindings for y, x, and f, as well as everything in racket.
A module-level define can bind only identifiers that are not already defined or required into the module. A local define or other binding forms, however, can give a new local binding for an identifier that already has a binding; such a binding shadows the existing binding. Examples:
(define f
  (lambda (append)
    (define cons (append "ugly" "confusing"))
    (let ([append 'this-was])
      (list append cons))))
> (f list)
'(this-was ("ugly" "confusing"))
Similarly, a module-level define can shadow a binding from the module's language. For example, (define cons 1) in a racket module shadows the cons that is provided by racket. Intentionally shadowing a language binding is rarely a good idea, especially for widely used bindings like cons, but shadowing relieves a programmer from having to avoid every obscure binding that is provided by a language. Even identifiers like define and lambda get their meanings from bindings, though they have transformer bindings (which means that they indicate syntactic forms) instead of value bindings. Since define has a transformer binding, the identifier define cannot be used by itself to get a value. However, the normal binding for define can be shadowed. Examples:
> define
eval:1:0: define: bad syntax in: define
> (let ([define 5]) define)
5
Again, shadowing standard bindings in this way is rarely a good idea, but the possibility is an inherent part of Racket's flexibility.
4.3
Function Calls (Procedure Applications)
4.3.1
Evaluation Order and Arity
A function call is evaluated by first evaluating the proc-expr and all arg-exprs in order (left to right). Then, if proc-expr produces a function that accepts as many arguments as supplied arg-exprs, the function is called. Otherwise, an exception is raised. Examples:
> (cons 1 null)
'(1)
> (+ 1 2 3)
6
> (cons 1 2 3)
cons: expects 2 arguments, given 3: 1 2 3
> (1 2 3)
procedure application: expected procedure, given: 1; arguments were: 2 3
Some functions, such as cons, accept a fixed number of arguments. Some functions, such as + or list, accept any number of arguments. Some functions accept a range of argument counts; for example substring accepts either two or three arguments. A function's arity is the number of arguments that it accepts.
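Arity can also be inspected at run time. The sketch below relies on procedure-arity and procedure-arity-includes?, which are not introduced in this section; they are standard racket procedures, used here only to illustrate the three cases just described:

```racket
#lang racket
;; cons accepts exactly two arguments.
(procedure-arity cons)                    ; => 2
;; substring accepts two or three arguments, but not four.
(procedure-arity-includes? substring 2)   ; => #t
(procedure-arity-includes? substring 3)   ; => #t
(procedure-arity-includes? substring 4)   ; => #f
;; + accepts any number of arguments, including none.
(procedure-arity-includes? + 0)           ; => #t
(procedure-arity-includes? + 100)         ; => #t
```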
4.3.2
Keyword Arguments
Some functions accept keyword arguments in addition to by-position arguments. For that case, an arg can be an arg-keyword arg-expr sequence instead of just an arg-expr:
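For instance, the standard sort function takes the list and the comparison by position and accepts an optional #:key argument by keyword. The following is a minimal sketch; the sample list and the name sorted are made up for the illustration:

```racket
#lang racket
;; The list and string<? are by-position arguments;
;; #:key string-downcase is a keyword argument.
(define sorted
  (sort '("pear" "Apple" "banana") string<? #:key string-downcase))
sorted   ; => '("Apple" "banana" "pear")
```

Because the keyword identifies the argument, the #:key pair could also be written before the by-position arguments with the same result.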
4.3.3 The apply Function
The syntax for function calls supports any number of arguments, but a specific call always specifies a fixed number of arguments. As a result, a function that takes a list of arguments cannot directly apply a function like + to all of the items in the list:
(define (avg lst) ; doesn't work...
  (/ (+ lst) (length lst)))
> (avg '(1 2 3))
+: expects argument of type <number>; given: (1 2 3)

(define (avg lst) ; doesn't always work...
  (/ (+ (list-ref lst 0) (list-ref lst 1) (list-ref lst 2))
     (length lst)))
> (avg '(1 2 3))
2
> (avg '(1 2))
list-ref: index 2 too large for list: (1 2)
The apply function offers a way around this restriction. It takes a function and a list of arguments, and it applies the function to the arguments:
(define (avg lst)
  (/ (apply + lst) (length lst)))
> (avg '(1 2 3))
2
> (avg '(1 2))
3/2
> (avg '(1 2 3 4))
5/2
As a convenience, the apply function accepts additional arguments between the function and the list. The additional arguments are effectively consed onto the argument list:
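A minimal sketch of the extra-argument form; the helper name anti-sum is chosen only for this illustration:

```racket
#lang racket
;; The 0 is consed onto lst, so this is the same as (- 0 elem ...).
(define (anti-sum lst)
  (apply - 0 lst))
(anti-sum '(1 2 3))       ; => -6

;; Several extra arguments may precede the final list.
(apply list 1 2 '(3 4))   ; => '(1 2 3 4)
```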
4.4
Functions (Procedures): lambda
A lambda expression creates a function. In the simplest case, a lambda expression has the form
> ((lambda (x) x)
   1)
1
> ((lambda (x y) (+ x y))
   1 2)
3
> ((lambda (x y) (+ x y))
   1)
#<procedure>: expects 2 arguments, given 1: 1
4.4.1 Declaring a Rest Argument
> ((lambda x x)
   1 2 3)
'(1 2 3)
> ((lambda x x))
'()
> ((lambda x (car x))
   1 2 3)
1
Functions with a rest-id often use apply to call another function that accepts any number of arguments. Examples:
4.3.3 The apply Function describes apply.
(define max-mag
  (lambda nums
    (apply max (map magnitude nums))))
> (max 1 -2 0)
1
> (max-mag 1 -2 0)
2
The lambda form also supports required arguments combined with a rest-id :
(define max-mag
  (lambda (num . nums)
    (apply max (map magnitude (cons num nums)))))
> (max-mag 1 -2 0)
2
> (max-mag)
procedure max-mag: expects at least 1 argument, given 0
A rest-id variable is sometimes called a rest argument, because it accepts the rest of the function arguments.
4.4.2 Declaring Optional Arguments
Instead of just an identifier, an argument (other than a rest argument) in a lambda form can be specified with an identifier and a default value:
(lambda gen-formals
  body ...+)
gen-formals = (arg ...)
            | rest-id
            | (arg ...+ . rest-id)
arg = arg-id
    | [arg-id default-expr]
An argument of the form [arg-id default-expr] is optional. When the argument is not supplied in an application, default-expr produces the default value. The default-expr can refer to any preceding arg-id , and every following arg-id must have a default as well. Examples:
(define greet
  (lambda (given [surname "Smith"])
    (string-append "Hello, " given " " surname)))
> (greet "John")
"Hello, John Smith"
> (greet "John" "Doe")
"Hello, John Doe"

(define greet
  (lambda (given [surname (if (equal? given "John")
                              "Doe"
                              "Smith")])
    (string-append "Hello, " given " " surname)))
> (greet "John")
"Hello, John Doe"
> (greet "Adam")
"Hello, Adam Smith"
4.4.3 Declaring Keyword Arguments
A lambda form can declare an argument to be passed by keyword, instead of position. Keyword arguments can be mixed with by-position arguments, and default-value expressions can be supplied for either kind of argument:
(lambda gen-formals
  body ...+)
gen-formals = (arg ...)
            | rest-id
            | (arg ...+ . rest-id)
arg = arg-id
    | [arg-id default-expr]
    | arg-keyword arg-id
    | arg-keyword [arg-id default-expr]
An argument specified as arg-keyword arg-id is supplied by an application using the same arg-keyword. The position of the keyword-identifier pair in the argument list does not matter for matching with arguments in an application, because it will be matched to an argument value by keyword instead of by position.
(define greet
  (lambda (given #:last surname)
    (string-append "Hello, " given " " surname)))
> (greet "John" #:last "Smith")
"Hello, John Smith"
> (greet #:last "Doe" "John")
"Hello, John Doe"
An arg-keyword [arg-id default-expr] argument specifies a keyword-based argument with a default value. Examples:
(define greet
  (lambda (#:hi [hi "Hello"] given #:last [surname "Smith"])
    (string-append hi ", " given " " surname)))
> (greet "John")
"Hello, John Smith"
> (greet "Karl" #:last "Marx")
"Hello, Karl Marx"
> (greet "John" #:hi "Howdy")
"Howdy, John Smith"
> (greet "Karl" #:last "Marx" #:hi "Guten Tag")
"Guten Tag, Karl Marx"
The lambda form does not directly support the creation of a function that accepts rest keywords. To construct a function that accepts all keyword arguments, use make-keyword-procedure. The function supplied to make-keyword-procedure receives keyword arguments through parallel lists in the first two (by-position) arguments, and then all by-position arguments from an application as the remaining by-position arguments. Examples:
(define (trace-wrap f)
  (make-keyword-procedure
   (lambda (kws kw-args . rest)
     (printf "Called with ~s ~s ~s\n" kws kw-args rest)
     (keyword-apply f kws kw-args rest))))
> ((trace-wrap greet) "John" #:hi "Howdy")
Called with (#:hi) ("Howdy") ("John")
"Howdy, John Smith"
4.4.4 Arity-Sensitive Functions: case-lambda
The case-lambda form creates a function that can have completely different behaviors depending on the number of arguments that are supplied. A case-lambda expression has the form
2.8 Procedure Expressions: lambda and case-lambda in The Racket Reference provides more on function expressions.
(case-lambda
  [formals body ...+]
  ...)
formals = (arg-id ...)
        | rest-id
        | (arg-id ...+ . rest-id)
where each [formals body ...+] is analogous to (lambda formals body ...+). Applying a function produced by case-lambda is like applying a lambda for the first case that matches the number of given arguments. Examples:
(define greet
  (case-lambda
    [(name) (string-append "Hello, " name)]
    [(given surname) (string-append "Hello, " given " " surname)]))
> (greet "John")
"Hello, John"
> (greet "John" "Smith")
"Hello, John Smith"
> (greet)
procedure greet: no clause matching 0 arguments
A case-lambda function cannot directly support optional or keyword arguments.
4.5
Definitions: define
(define id expr )
in which case id is bound to the result of expr. Examples:
(define salutation (list-ref '("Hi" "Hello") (random 2)))
> salutation
"Hello"
4.5.1 Function Shorthand
(define (greet name)
  (string-append salutation ", " name))
> (greet "John")
"Hello, John"

(define (greet first [surname "Smith"] #:hi [hi salutation])
  (string-append hi ", " first " " surname))
> (greet "John")
"Hello, John Smith"
> (greet "John" #:hi "Hey")
"Hey, John Smith"
> (greet "John" "Doe")
"Hello, John Doe"
The function shorthand via define also supports a rest argument (i.e., a final argument to collect extra arguments in a list):
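A minimal sketch of the shorthand with a rest argument; the function name avg and its argument name nums are chosen only for this illustration:

```racket
#lang racket
;; nums collects every argument of a call into a list.
(define (avg . nums)
  (/ (apply + nums) (length nums)))
(avg 1 2 3)   ; => 2
(avg 1 2)     ; => 3/2
```

Calling (avg) with no arguments would divide by zero, so a real definition would likely require at least one by-position argument before the rest argument.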
Consider the following make-add-suffix function that takes a string and returns another function that takes a string:
Although it's not common, the result of make-add-suffix could be called directly, like this:
(define ((make-add-suffix s2) s)
  (string-append s s2))
> ((make-add-suffix "!") "hello")
"hello!"
(define louder (make-add-suffix "!"))
(define less-sure (make-add-suffix "?"))
> (less-sure "really")
"really?"
> (louder "really")
"really!"
The full syntax of the function shorthand for define is as follows:
4.5.3 Multiple Values and define-values
A Racket expression normally produces a single result, but some expressions can produce multiple results. For example, quotient and remainder each produce a single value, but quotient/remainder produces the same two values at once:
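A short sketch of the three procedures just mentioned. The use of call-with-values with list as the receiver is an addition for this illustration, there only to gather both results into one inspectable value, and the name q+r is made up:

```racket
#lang racket
(quotient 13 3)             ; => 4
(remainder 13 3)            ; => 1
;; One call, two results: 4 and 1.
(quotient/remainder 13 3)
;; Collect both results into a list to inspect them as a single value.
(define q+r
  (call-with-values (lambda () (quotient/remainder 13 3)) list))
q+r                         ; => '(4 1)
```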
> (values 1 2 3)
1
2
3
(define (split-name name)
  (let ([parts (regexp-split " " name)])
    (if (= (length parts) 2)
        (values (list-ref parts 0) (list-ref parts 1))
        (error "not a <first> <last> name"))))
> (split-name "Adam Smith")
"Adam"
"Smith"
The define-values form binds multiple identifiers at once to multiple results produced from a single expression:
(define-values (given surname) (split-name "Adam Smith"))
> given
"Adam"
> surname
"Smith"
A define form (that is not a function shorthand) is equivalent to a define-values form with a single id .
2.14 Definitions: define, define-syntax, ... in The Racket Reference provides more on definitions.
4.5.4
Internal Definitions
When the grammar for a syntactic form specifies body, then the corresponding form can be either a definition or an expression. A definition as a body is an internal definition. Expressions and internal definitions in a body sequence can be mixed, as long as the last body is an expression. For example, the syntax of lambda is
; no definitions
(lambda (f)                ; one definition
  (define (log-it what)
    (printf "~a\n" what))
  (log-it "running")
  (f 0)
  (log-it "done"))

(lambda (f n)              ; two definitions
  (define (call n)
    (if (zero? n)
        (log-it "done")
        (begin
          (log-it "running")
          (f n)
          (call (- n 1)))))
  (define (log-it what)
    (printf "~a\n" what))
  (call n))
Internal definitions in a particular body sequence are mutually recursive; that is, any definition can refer to any other definition, as long as the reference isn't actually evaluated before its definition takes place. If a definition is referenced too early, the result is a special value #<undefined>. Examples:
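A minimal sketch of mutually recursive internal definitions; the names parity, ev?, and od? are made up for this illustration. Each helper refers to the other, which is fine because neither reference is evaluated until both definitions have taken place:

```racket
#lang racket
;; Assumes n is a non-negative integer.
(define (parity n)
  ;; ev? refers to od?, which is defined later in the same body.
  (define (ev? k) (if (zero? k) #t (od? (- k 1))))
  (define (od? k) (if (zero? k) #f (ev? (- k 1))))
  (if (ev? n) 'even 'odd))
(parity 10)   ; => 'even
(parity 7)    ; => 'odd
```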
4.6
Local Binding
1.2.3.7 Internal Definitions in The Racket Reference documents the fine points of internal definitions.
Although internal defines can be used for local binding, Racket provides three forms that give the programmer more control over bindings: let, let*, and letrec.
4.6.1 Parallel Binding: let
A let form binds a set of identifiers, each to the result of some expression, for use in the let body:
2.9 Local Binding: let, let*, letrec, ... in The Racket Reference also documents let.
> (let ([me "Bob"])
    me)
"Bob"
> (let ([me "Bob"]
        [myself "Robert"]
        [I "Bobby"])
    (list me myself I))
'("Bob" "Robert" "Bobby")
> (let ([me "Bob"]
        [me "Robert"])
    me)
eval:3:0: let: duplicate identifier at: me in: (let ((me "Bob") (me "Robert")) me)
The fact that an id's expr does not see its own binding is often useful for wrappers that must refer back to the old value:
> (let ([+ (lambda (x y)
             (if (string? x)
                 (string-append x y)
                 (+ x y)))]) ; use original +
    (list (+ 1 2)
          (+ "see" "saw")))
'(3 "seesaw")
Occasionally, the parallel nature of let bindings is convenient for swapping or rearranging a set of bindings:
> (let ([me "Tarzan"]
        [you "Jane"])
    (let ([me you]
          [you me])
      (list me you)))
'("Jane" "Tarzan")
The characterization of let bindings as parallel is not meant to imply concurrent evaluation. The exprs are evaluated in order, even though the bindings are delayed until all exprs are evaluated.
4.6.2 Sequential Binding: let*
2.9 Local Binding: let, let*, letrec, ... in The Racket Reference also documents let*.
> (let* ([x (list "Borroughs")]
         [y (cons "Rice" x)]
         [z (cons "Edgar" y)])
    (list x y z))
'(("Borroughs") ("Rice" "Borroughs") ("Edgar" "Rice" "Borroughs"))
> (let* ([name (list "Borroughs")]
         [name (cons "Rice" name)]
         [name (cons "Edgar" name)])
    name)
'("Edgar" "Rice" "Borroughs")
In other words, a let* form is equivalent to nested let forms, each with a single binding:
> (let ([name (list "Borroughs")])
    (let ([name (cons "Rice" name)])
      (let ([name (cons "Edgar" name)])
        name)))
'("Edgar" "Rice" "Borroughs")
4.6.3 Recursive Binding: letrec
2.9 Local Binding: let, let*, letrec, ... in The Racket Reference also documents letrec.
> (letrec ([swing
            (lambda (t)
              (if (eq? (car t) 'tarzan)
                  (cons 'vine
                        (cons 'tarzan (cddr t)))
                  (cons (car t)
                        (swing (cdr t)))))])
    (swing '(vine tarzan vine vine)))
'(vine vine tarzan vine)
> (letrec ([tarzan-in-tree?
            (lambda (name path)
              (or (equal? name "tarzan")
                  (and (directory-exists? path)
                       (tarzan-in-directory? path))))]
           [tarzan-in-directory?
            (lambda (dir)
              (ormap (lambda (elem)
                       (tarzan-in-tree? (path-element->string elem)
                                        (build-path dir elem)))
                     (directory-list dir)))])
    (tarzan-in-tree? "tmp" (find-system-path 'temp-dir)))
#f
While the exprs of a letrec form are typically lambda expressions, they can be any expression. The expressions are evaluated in order, and after each value is obtained, it is immediately associated with its corresponding id. If an id is referenced before its value is ready, the result is #<undefined>, just as for internal definitions.
A named let is an iteration and recursion form. It uses the same syntactic keyword let as for local binding, but an identifier after the let (instead of an immediate open parenthesis) triggers a different parsing.
(letrec ([proc-id (lambda (arg-id ...)
                    body ...+)])
  (proc-id init-expr ...))
That is, a named let binds a function identifier that is visible only in the function's body, and it implicitly calls the function with the values of some initial expressions. Examples:
(define (duplicate pos lst)
  (let dup ([i 0]
            [lst lst])
    (cond
      [(= i pos) (cons (car lst) lst)]
      [else (cons (car lst) (dup (+ i 1) (cdr lst)))])))
> (duplicate 1 (list "apple" "cheese burger!" "banana"))
'("apple" "cheese burger!" "cheese burger!" "banana")
4.6.5 Multiple Values: let-values, let*-values, letrec-values
In the same way that define-values binds multiple results in a definition (see 4.5.3 Multiple Values and define-values), let-values, let*-values, and letrec-values bind multiple results locally.
2.9 Local Binding: let, let*, letrec, ... in The Racket Reference also documents multiple-value binding forms.
Example:
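A minimal sketch using let-values with quotient/remainder, which produces two results at once; the names result, q, and r are chosen only for this illustration:

```racket
#lang racket
;; q and r are bound to the two results of a single expression.
(define result
  (let-values ([(q r) (quotient/remainder 14 3)])
    (list q r)))
result   ; => '(4 2)
```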
4.7
Conditionals
Most functions used for branching, such as < and string?, produce either #t or #f. Racket's branching forms, however, treat any value other than #f as true. We say "a true value" to mean any value other than #f. This convention for "true value" meshes well with protocols where #f can serve as failure or to indicate that an optional value is not supplied. (Beware of overusing this trick, and remember that an exception is usually a better mechanism to report failure.) For example, the member function serves double duty; it can be used to find the tail of a list that starts with a particular item, or it can be used to simply check whether an item is present in a list:
> (member "Groucho" '("Harpo" "Zeppo"))
#f
> (member "Groucho" '("Harpo" "Groucho" "Zeppo"))
'("Groucho" "Zeppo")
> (if (member "Groucho" '("Harpo" "Zeppo"))
      'yep
      'nope)
'nope
> (if (member "Groucho" '("Harpo" "Groucho" "Zeppo"))
      'yep
      'nope)
'yep
4.7.1 Simple Branching: if
In an if form,
2.12 Conditionals: if, cond, and, and or in The Racket Reference also documents if.
An if form must have both a then-expr and an else-expr; the latter is not optional. To perform (or skip) side-effects based on a test-expr, use when or unless, which we describe later in 4.8 Sequencing.
4.7.2 Combining Tests: and and or
Racket's and and or are syntactic forms, rather than functions. Unlike a function, the and and or forms can skip evaluation of later expressions if an earlier one determines the answer.
2.12 Conditionals: if, cond, and, and or in The Racket Reference also documents and and or.
> (define (got-milk? lst)
    (and (not (null? lst))
         (or (eq? 'milk (car lst))
             (got-milk? (cdr lst))))) ; recurs only if needed
> (got-milk? '(apple banana))
#f
> (got-milk? '(apple milk banana))
#t
If evaluation reaches the last expr of an and or or form, then the expr's value directly determines the and or or result. Therefore, the last expr is in tail position, which means that the above got-milk? function runs in constant space.
4.7.3 Chaining Tests: cond
The cond form chains a series of tests to select a result expression. To a first approximation, the syntax of cond is as follows:
2.12 Conditionals: if, cond, and, and or in The Racket Reference also documents cond.
> (cond
    [(= 2 3) (error "wrong!")]
    [(= 2 2) 'ok])
'ok
> (cond
    [(= 2 3) (error "wrong!")])
> (cond
    [(= 2 3) (error "wrong!")]
    [else 'ok])
'ok
(define (got-milk? lst)
  (cond
    [(null? lst) #f]
    [(eq? 'milk (car lst)) #t]
    [else (got-milk? (cdr lst))]))
> (got-milk? '(apple banana))
#f
> (got-milk? '(apple milk banana))
#t
The full syntax of cond includes two more kinds of clauses:
(cond cond-clause ...)
cond-clause = [test-expr then-expr ...+]
            | [else then-expr ...+]
            | [test-expr => proc-expr]
            | [test-expr]
The => variant captures the true result of its test-expr and passes it to the result of the proc-expr , which must be a function of one argument. Examples:
> (define (after-groucho lst)
    (cond
      [(member "Groucho" lst) => cdr]
      [else (error "not there")]))
> (after-groucho '("Harpo" "Groucho" "Zeppo"))
'("Zeppo")
> (after-groucho '("Harpo" "Zeppo"))
not there
A clause that includes only a test-expr is rarely used. It captures the true result of the test-expr , and simply returns the result for the whole cond expression.
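A minimal sketch of a test-only clause; the function name tail-from and the 'not-found result are made up for this illustration:

```racket
#lang racket
(define (tail-from item lst)
  (cond
    [(member item lst)]     ; test-only clause: its true value is the result
    [else 'not-found]))
(tail-from 2 '(1 2 3))   ; => '(2 3)
(tail-from 5 '(1 2 3))   ; => 'not-found
```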
4.8
Sequencing
Racket programmers prefer to write programs with as few side-effects as possible, since purely functional code is more easily tested and composed into larger programs. Interaction with the external environment, however, requires sequencing, such as when writing to a display, opening a graphical window, or manipulating a file on disk.
4.8.1 Effects Before: begin
2.15 Sequencing: begin, begin0, and begin-for-syntax in The Racket Reference also documents begin.
(define (print-triangle height)
  (if (zero? height)
      (void)
      (begin
        (display (make-string height #\*))
        (newline)
        (print-triangle (sub1 height)))))

(define (print-triangle height)
  (cond
    [(positive? height)
     (display (make-string height #\*))
     (newline)
     (print-triangle (sub1 height))]))
> (print-triangle 4)
****
***
**
*
The begin form is special at the top level, at module level, or as a body after only internal definitions. In those positions, instead of forming an expression, the content of begin is spliced into the surrounding context. Example:
> (let ([curly 0])
    (begin
      (define moe (+ 1 curly))
      (define larry (+ 1 moe)))
    (list larry curly moe))
'(2 0 1)
This splicing behavior is mainly useful for macros, as we discuss later in 16 Macros. Effects After: begin0
4.8.2
begin-for-syntax
85
(define (log-times thunk)
  (printf "Start: ~s\n" (current-inexact-milliseconds))
  (begin0
    (thunk)
    (printf "End..: ~s\n" (current-inexact-milliseconds))))
> (log-times (lambda () (sleep 0.1) 0))
Start: 1333960176779.379
End..: 1333960176879.436
0
> (log-times (lambda () (values 1 2)))
Start: 1333960176880.03
End..: 1333960176880.097
1
2
4.8.3 Effects If...: when and unless
The when form combines an if-style conditional with sequencing for the then clause and no else clause:
2.16 Guarded Evaluation: when and unless in The Racket Reference also documents when and unless.
Examples:
(define (enumerate lst)
  (if (null? (cdr lst))
      (printf "~a.\n" (car lst))
      (begin
        (printf "~a, " (car lst))
        (when (null? (cdr (cdr lst)))
          (printf "and "))
        (enumerate (cdr lst)))))
> (enumerate '("Larry" "Curly" "Moe"))
Larry, Curly, and Moe.
(define (print-triangle height)
  (unless (zero? height)
    (display (make-string height #\*))
    (newline)
    (print-triangle (sub1 height))))
> (print-triangle 4)
****
***
**
*
4.9 Assignment: set!
(set! id expr )
A set! expression evaluates expr and changes id (which must be bound in the enclosing environment) to the resulting value. The result of the set! expression itself is #<void>. Examples:
2.17 Assignment: set! and set!-values in The Racket Reference also documents set!.
(define greeted null)
(define (greet name)
  (set! greeted (cons name greeted))
  (string-append "Hello, " name))
> (greet "Athos")
"Hello, Athos"
> (greet "Porthos")
"Hello, Porthos"
> (greet "Aramis")
"Hello, Aramis"
> greeted
'("Aramis" "Porthos" "Athos")
(define (make-running-total)
  (let ([n 0])
    (lambda ()
      (set! n (+ n 1))
      n)))
(define win (make-running-total))
(define lose (make-running-total))
> (win)
1
> (win)
2
> (lose)
1
> (win)
3
4.9.1
Although using set! is sometimes appropriate, Racket style generally discourages the use of set!. The following guidelines may help explain when using set! is appropriate.
As in any modern language, assigning to a shared identifier is no substitute for passing an argument to a procedure or getting its result.
Really awful example:
(define name "unknown")
(define result "unknown")
(define (greet)
  (set! result (string-append "Hello, " name)))
Ok example:
(define (greet name)
  (string-append "Hello, " name))
> (greet "John")
"Hello, John"
> (greet "Anna")
"Hello, Anna"
A sequence of assignments to a local variable is far inferior to nested bindings. Bad example:
> (let ([tree 0])
    (set! tree (list tree 1 tree))
    (set! tree (list tree 2 tree))
    (set! tree (list tree 3 tree))
    tree)
'(((0 1 0) 2 (0 1 0)) 3 ((0 1 0) 2 (0 1 0)))
Ok example:
> (let* ([tree 0]
         [tree (list tree 1 tree)]
         [tree (list tree 2 tree)]
         [tree (list tree 3 tree)])
    tree)
'(((0 1 0) 2 (0 1 0)) 3 ((0 1 0) 2 (0 1 0)))
Using assignment to accumulate results from an iteration is bad style. Accumulating through a loop argument is better. Somewhat bad example:
(define (sum lst)
  (let ([s 0])
    (for-each (lambda (i) (set! s (+ i s)))
              lst)
    s))
> (sum '(1 2 3))
6
Ok example:
(define (sum lst)
  (let loop ([lst lst] [s 0])
    (if (null? lst)
        s
        (loop (cdr lst) (+ s (car lst))))))
> (sum '(1 2 3))
6
Better (use an existing function) example:
(define (sum lst)
  (apply + lst))
> (sum '(1 2 3))
6
Good (a general approach) example:
(define (sum lst)
  (for/fold ([s 0])
            ([i (in-list lst)])
    (+ s i)))
> (sum '(1 2 3))
6
For cases where stateful objects are necessary or appropriate, implementing the object's state with set! is fine.
Ok example:
(define next-number!
  (let ([n 0])
    (lambda ()
      (set! n (add1 n))
      n)))
> (next-number!)
1
> (next-number!)
2
> (next-number!)
3
All else being equal, a program that uses no assignments or mutation is always preferable to one that uses assignments or mutation. While side effects are to be avoided, however, they should be used if the resulting code is significantly more readable or if it implements a significantly better algorithm.
The use of mutable values, such as vectors and hash tables, raises fewer suspicions about the style of a program than using set! directly. Nevertheless, simply replacing set!s in a program with vector-set!s obviously does not improve the style of the program.
4.9.2 Multiple Values: set!-values
The set!-values form assigns to multiple variables at once, given an expression that produces an appropriate number of values:
2.17 Assignment: set! and set!-values in The Racket Reference also documents set!-values.
(define game
  (let ([w 0]
        [l 0])
    (lambda (win?)
      (if win?
          (set! w (+ w 1))
          (set! l (+ l 1)))
      (begin0
        (values w l)
        ; swap sides...
        (set!-values (w l) (values l w))))))
> (game #t)
1
0
> (game #t)
1
1
> (game #f)
1
2
4.10
2.3 Literals: quote and #%datum in The Racket Reference also documents quote.
(quote datum )
The syntax of a datum is technically specified as anything that the read function parses as a single element. The value of the quote form is the same value that read would produce given datum.
The datum can be a symbol, a boolean, a number, a (character or byte) string, a character, a keyword, an empty list, a pair (or list) containing more such values, a vector containing more such values, a hash table containing more such values, or a box containing another such value.
Examples:
> (quote apple)
'apple
> (quote #t)
#t
> (quote 42)
42
> (quote "hello")
"hello"
> (quote ())
'()
> (quote ((1 2 3) #("z" x) . the-end))
'((1 2 3) #("z" x) . the-end)
> (quote (1 2 . (3)))
'(1 2 3)
As the last example above shows, the datum does not have to match the normalized printed form of a value. A datum cannot be a printed representation that starts with #<, so it cannot be #<void>, #<undefined>, or a procedure.
The quote form is rarely used for a datum that is a boolean, number, or string by itself, since the printed forms of those values can already be used as constants. The quote form is more typically used for symbols and lists, which have other meanings (identifiers, function calls, etc.) when not quoted.
An expression
'datum
is a shorthand for
(quote datum )
and this shorthand is almost always used instead of quote. The shorthand applies even within the datum , so it can produce a list containing quote. Examples:
> 'apple
'apple
> '"hello"
"hello"
> '(1 2 3)
'(1 2 3)
> (display '(you can 'me))
(you can (quote me))
1.3.7 Reading Quotes in The Racket Reference provides more on the ' shorthand.
4.11
(quasiquote datum )
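When the datum contains no unquote forms, quasiquote produces the same value that quote would. The following is a minimal sketch of that baseline behavior, not an excerpt from the original text:

```racket
#lang racket

;; With no unquote inside, quasiquote behaves like quote.
(quasiquote (1 2 . (3)))                       ; => '(1 2 3)
(quasiquote ("red" green #(blue)))             ; a list with a string, a symbol, a vector

;; The two forms produce equal values for the same datum.
(equal? (quasiquote (a (b c) 4))
        (quote (a (b c) 4)))                   ; => #t
```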
However, for each (unquote expr ) that appears within the datum , the expr is evaluated to produce a value that takes the place of the unquote sub-form. Example:
> (define (deep n)
    (cond
      [(zero? n) 0]
      [else
       (quasiquote ((unquote n) (unquote (deep (- n 1)))))]))
> (deep 8)
'(8 (7 (6 (5 (4 (3 (2 (1 0))))))))
The quasiquote form can even be used to cheaply construct expressions programmatically. (Of course, 9 times out of 10, you should be using a macro to do this (the 10th time being when you're working through a textbook like PLAI).)
Examples:
> (define (build-exp n)
    (add-lets n (make-sum n)))
> (define (add-lets n body)
    (cond
      [(zero? n) body]
      [else
       (quasiquote
        (let ([(unquote (n->var n)) (unquote n)])
          (unquote (add-lets (- n 1) body))))]))
> (define (make-sum n)
    (cond
      [(= n 1) (n->var 1)]
      [else
       (quasiquote (+ (unquote (n->var n))
                      (unquote (make-sum (- n 1)))))]))
> (define (n->var n)
    (string->symbol (format "x~a" n)))
> (build-exp 3)
'(let ((x3 3)) (let ((x2 2)) (let ((x1 1)) (+ x3 (+ x2 x1)))))
The unquote-splicing form is similar to unquote, but its expr must produce a list, and the unquote-splicing form must appear in a context that produces either a list or a vector. As the name suggests, the resulting list is spliced into the context of its use. Example:
> (define (build-exp n)
    (add-lets
     n
     (quasiquote (+ (unquote-splicing
                     (build-list
                      n
                      (lambda (x) (n->var (+ x 1)))))))))
> (define (add-lets n body)
    (quasiquote
     (let (unquote
           (build-list
            n
            (lambda (n)
              (quasiquote
               [(unquote (n->var (+ n 1))) (unquote (+ n 1))]))))
       (unquote body))))
> (define (n->var n)
    (string->symbol (format "x~a" n)))
> (build-exp 3)
'(let ((x1 1) (x2 2) (x3 3)) (+ x1 x2 x3))
If a quasiquote form appears within an enclosing quasiquote form, then the inner quasiquote effectively cancels one layer of unquote and unquote-splicing forms, so that a second unquote or unquote-splicing is needed. Examples:
> (quasiquote (1 2 (quasiquote (unquote (+ 1 2)))))
'(1 2 (quasiquote (unquote (+ 1 2))))
> (quasiquote (1 2 (quasiquote (unquote (unquote (+ 1 2))))))
'(1 2 (quasiquote (unquote 3)))
> (quasiquote (1 2 (quasiquote ((unquote (+ 1 2)) (unquote (unquote (- 5 1)))))))
'(1 2 (quasiquote ((unquote (+ 1 2)) (unquote 4))))
The evaluations above will not actually print as shown. Instead, the shorthand form of quasiquote and unquote will be used: ` (i.e., a backquote) and , (i.e., a comma). The same shorthands can be used in expressions: Example:
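A short sketch of the shorthands in expressions follows. It also uses ,@ as the shorthand for unquote-splicing, which the surrounding text does not mention, and it checks each shorthand against its longhand spelling:

```racket
#lang racket

;; ` abbreviates quasiquote and , abbreviates unquote.
`(1 2 ,(+ 1 2))                                ; => '(1 2 3)

;; ,@ abbreviates unquote-splicing.
`(1 ,@(list 2 3) 4)                            ; => '(1 2 3 4)

;; The reader expands each shorthand to the longhand form, so the
;; two spellings produce equal values.
(equal? `(1 2 ,(+ 1 2))
        (quasiquote (1 2 (unquote (+ 1 2)))))  ; => #t

;; Nested quasiquote: only the doubly unquoted expression is evaluated.
(equal? `(1 2 `(,(+ 1 2) ,,(- 5 1)))
        '(1 2 (quasiquote ((unquote (+ 1 2)) (unquote 4)))))  ; => #t
```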
4.12
The case form dispatches to a clause by matching the result of an expression to the values for the clause:
> (let ([v (random 6)])
    (printf "~a\n" v)
    (case v
      [(0) 'zero]
      [(1) 'one]
      [(2) 'two]
      [(3 4 5) 'many]))
1
'one
The last clause of a case form can use else, just like cond:
Example:
> (case (random 6)
    [(0) 'zero]
    [(1) 'one]
    [(2) 'two]
    [else 'many])
'many
For more general pattern matching, use match, which is introduced in 12 Pattern Matching.
4.13
The parameterize form associates a new value with a parameter during the evaluation of body expressions:
For example, the error-print-width parameter controls how many characters of a value are printed in an error message:
> (parameterize ([error-print-width 5])
    (car (expt 10 1024)))
car: expects argument of type <pair>; given: 10...
> (parameterize ([error-print-width 10])
    (car (expt 10 1024)))
car: expects argument of type <pair>; given: 1000000...
More generally, parameters implement a kind of dynamic binding. The make-parameter function takes any value and returns a new parameter that is initialized to the given value. Applying the parameter as a function returns its current value:
The term "parameter" is sometimes used to refer to the arguments of a function, but "parameter" in Racket has the more specific meaning described here.
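The interactions in this section rely on a parameter named location whose value outside any parameterize is "here". A minimal sketch of such a definition, under that assumption:

```racket
#lang racket

;; Assumed definition: a parameter named location, initialized to "here".
(define location (make-parameter "here"))

;; Applying the parameter with no arguments returns its current value.
(location)               ; => "here"

;; A parameter is a procedure that also satisfies parameter?.
(parameter? location)    ; => #t
```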
> (parameterize ([location "there"])
    (location))
"there"
> (location)
"here"
> (parameterize ([location "in a house"])
    (list (location)
          (parameterize ([location "with a mouse"])
            (location))
          (location)))
'("in a house" "with a mouse" "in a house")
> (parameterize ([location "in a box"])
    (car (location)))
car: expects argument of type <pair>; given: "in a box"
> (location)
"here"
The parameterize form is not a binding form like let; each use of location above refers directly to the original definition. A parameterize form adjusts the value of a parameter during the whole time that the parameterize body is evaluated, even for uses of the parameter that are textually outside of the parameterize body:
> (define (would-you-could-you?)
    (and (not (equal? (location) "here"))
         (not (equal? (location) "there"))))
> (would-you-could-you?)
#f
> (parameterize ([location "on a bus"])
    (would-you-could-you?))
#t
If a use of a parameter is textually inside the body of a parameterize but not evaluated before the parameterize form produces a value, then the use does not see the value installed by the parameterize form:
> (let ([get (parameterize ([location "with a fox"])
               (lambda () (location)))])
    (get))
"here"
The current binding of a parameter can be adjusted imperatively by calling the parameter as a function with a value. If a parameterize has adjusted the value of the parameter, then directly applying the parameter procedure affects only the value associated with the active parameterize:
> (define (try-again! where)
    (location where))
> (location)
"here"
> (parameterize ([location "on a train"])
    (list (location)
          (begin (try-again! "in a boat")
                 (location))))
'("on a train" "in a boat")
> (location)
"here"
Using parameterize is generally preferable to updating a parameter value imperatively for much the same reasons that binding a fresh variable with let is preferable to using set! (see 4.9 Assignment: set!).
It may seem that variables and set! can solve many of the same problems that parameters solve. For example, lokation could be defined as a string, and set! could be used to adjust its value:
> (define lokation "here")
> (define (would-ya-could-ya?)
    (and (not (equal? lokation "here"))
         (not (equal? lokation "there"))))
> (set! lokation "on a bus")
> (would-ya-could-ya?)
#t
Parameters, however, offer several crucial advantages over set!:
• The parameterize form helps automatically reset the value of a parameter when control escapes due to an exception. Adding exception handlers and other forms to rewind a set! is relatively tedious.
• Parameters work nicely with tail calls (see 2.3.3 Tail Recursion). The last body in a parameterize form is in tail position with respect to the parameterize form.
• Parameters work properly with threads (see 10.1 Threads). The parameterize form adjusts the value of a parameter only for evaluation in the current thread, which avoids race conditions with other threads.
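The first advantage can be sketched as follows. The function risky-trip is hypothetical, and location is assumed to be a parameter initialized to "here". An exception escapes from the parameterize body, and the parameter has its original value afterward without any explicit cleanup code:

```racket
#lang racket

;; Assumed definition, matching the examples in this section.
(define location (make-parameter "here"))

;; risky-trip is a hypothetical function that raises an exception
;; while the parameter is temporarily set to a new value.
(define (risky-trip)
  (parameterize ([location "on a boat"])
    (error 'risky-trip "storm while ~a" (location))))

;; Catch the exception outside of the parameterize form.
(define outcome
  (with-handlers ([exn:fail? (lambda (e) 'escaped)])
    (risky-trip)))

outcome      ; => 'escaped
(location)   ; => "here"
```

Getting the same guarantee with a plain variable and set! would require saving the old value and restoring it in a handler or a dynamic-wind post thunk on every such path.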
5 Programmer-Defined Datatypes
4 Structures in The Racket Reference also documents structure types.
New datatypes are normally created with the struct form, which is the topic of this chapter. The class-based object system, which we defer to 13 Classes and Objects, offers an alternate mechanism for creating new datatypes, but even classes and objects are implemented in terms of structure types.
5.1
4.1 Defining Structure Types: struct in The Racket Reference also documents struct.
• struct:struct-id : a structure type descriptor, which is a value that represents the structure type as a first-class value (with #:super, as discussed later in 5.8 More Structure Type Options).
A struct form places no constraints on the kinds of values that can appear for fields in an instance of the structure type. For example, (posn "apple" #f) produces an instance of posn, even though "apple" and #f are not valid coordinates for the obvious uses of posn instances. Enforcing constraints on field values, such as requiring them to be numbers, is normally the job of a contract, as discussed later in 7 Contracts.
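To make the point about unconstrained fields concrete, here is a minimal sketch. It assumes that posn is declared with two fields named x and y, which is consistent with the accessors used elsewhere in this chapter:

```racket
#lang racket

;; Assumed declaration: a posn structure type with fields x and y.
(struct posn (x y))

;; The constructor accepts any values, not just numbers.
(define p (posn "apple" #f))

(posn? p)                   ; => #t
(posn-x p)                  ; => "apple"
(posn-y p)                  ; => #f

;; struct:posn is bound to the structure type descriptor.
(struct-type? struct:posn)  ; => #t
```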
5.2
The struct-copy form clones a structure and optionally updates specified fields in the clone. This process is sometimes called a functional update, because the result is a structure with updated field values, but the original structure is not modified.
> (define p1 (posn 1 2))
> (define p2 (struct-copy posn p1 [x 3]))
> (list (posn-x p2) (posn-y p2))
'(3 2)
> (list (posn-x p1) (posn-x p2))
'(1 3)
5.3 Structure Subtypes
An extended form of struct can be used to define a structure subtype, which is a structure type that extends an existing structure type:
The super-id must be a structure type name bound by struct (i.e., the name that cannot be used directly as an expression). Examples:
> (define p (3d-posn 1 2 3))
> p
#<3d-posn>
> (posn? p)
#t
> (posn-x p)
1
> (3d-posn-z p)
3
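The session above presumes that posn and 3d-posn are already declared. A minimal sketch of declarations that would support it, assuming posn has fields x and y and that 3d-posn adds a single field z:

```racket
#lang racket

;; Assumed declarations: posn with fields x and y, and a subtype
;; 3d-posn that names posn as its supertype and adds a field z.
(struct posn (x y))
(struct 3d-posn posn (z))

(define p (3d-posn 1 2 3))

(posn? p)                 ; => #t, a 3d-posn is also a posn
(posn-x p)                ; => 1, supertype accessors work on the subtype
(3d-posn-z p)             ; => 3
(3d-posn? (posn 1 2))     ; => #f, a plain posn is not a 3d-posn
```

Note that the subtype's constructor takes the supertype's fields first, followed by its own.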
5.4 Opaque versus Transparent Structure Types
An instance of a transparent structure type prints like a call to the constructor, so that it shows the structure's field values. A transparent structure type also allows reflective operations, such as struct? and struct-info, to be used on its instances (see 15 Reflection and Dynamic Evaluation).
Structure types are opaque by default, because opaque structure instances provide more encapsulation guarantees. That is, a library can use an opaque structure to encapsulate data, and clients of the library cannot manipulate the data in the structure except as allowed by the library.
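The difference can be observed with reflective operations. In this sketch, opaque-posn and clear-posn are hypothetical names chosen for illustration, and struct->vector is used in addition to struct? to show what is exposed:

```racket
#lang racket

;; Hypothetical structure types: one opaque (the default), one transparent.
(struct opaque-posn (x y))
(struct clear-posn (x y) #:transparent)

;; struct? reports whether any structure information is accessible.
(struct? (opaque-posn 1 2))          ; => #f
(struct? (clear-posn 1 2))           ; => #t

;; struct->vector exposes fields only for the transparent type; the
;; fields of the opaque instance are replaced by the symbol '...
(struct->vector (clear-posn 1 2))    ; => '#(struct:clear-posn 1 2)
(struct->vector (opaque-posn 1 2))   ; => '#(struct:opaque-posn ...)
```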
5.5 Structure Comparisons
A generic equal? comparison automatically recurs on the fields of a transparent structure type, but equal? defaults to mere instance identity for opaque structure types:
(struct glass (width height) #:transparent)
> (equal? (glass 1 2) (glass 1 2))
#t
(struct lead (width height))
> (define slab (lead 1 2))
> (equal? slab slab)
#t
> (equal? slab (lead 1 2))
#f
To support instance comparisons via equal? without making the structure type transparent, you can use the #:property keyword, prop:equal+hash, and then a list of three functions:
(struct lead (width height)
  #:property
  prop:equal+hash
  (list (lambda (a b equal?-recur)
          ; compare a and b
          (and (equal?-recur (lead-width a) (lead-width b))
               (equal?-recur (lead-height a) (lead-height b))))
        (lambda (a hash-recur)
          ; compute primary hash code of a
          (+ (hash-recur (lead-width a))
             (* 3 (hash-recur (lead-height a)))))
        (lambda (a hash2-recur)
          ; compute secondary hash code of a
          (+ (hash2-recur (lead-width a))
             (hash2-recur (lead-height a))))))
> (equal? (lead 1 2) (lead 1 2))
#t
The first function in the list implements the equal? test on two leads; the third argument to the function is used instead of equal? for recursive equality testing, so that data cycles can be handled correctly. The other two functions compute primary and secondary hash codes for use with hash tables:
> (define h (make-hash))
> (hash-set! h (lead 1 2) 3)
> (hash-ref h (lead 1 2))
3
> (hash-ref h (lead 2 1))
hash-ref: no value found for key: #<lead>
The first function provided with prop:equal+hash is not required to recursively compare the fields of the structure. For example, a structure type representing a set might implement equality by checking that the members of the set are the same, independent of the order of elements in the internal representation. Just take care that the hash functions produce the same value for any two structure instances that are supposed to be equivalent.
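As a sketch of order-independent equality, the hypothetical bag structure below stores a list of numbers and compares instances after sorting. Sorting is only one possible canonical form, chosen here because it keeps equality and both hash functions consistent with each other:

```racket
#lang racket

;; bag is a hypothetical structure type holding a list of numbers.
;; canonical puts the items in a fixed order, so that two bags with
;; the same members compare and hash the same way.
(define (canonical b)
  (sort (bag-items b) <))

(struct bag (items)
  #:property prop:equal+hash
  (list (lambda (a b equal?-recur)
          ; equal when the sorted member lists are equal
          (equal?-recur (canonical a) (canonical b)))
        (lambda (a hash-recur)
          ; primary hash code, based on the same canonical form
          (hash-recur (canonical a)))
        (lambda (a hash2-recur)
          ; secondary hash code, based on the same canonical form
          (hash2-recur (canonical a)))))

(equal? (bag '(1 2 3)) (bag '(3 1 2)))          ; => #t
(equal? (bag '(1 2 3)) (bag '(1 2)))            ; => #f

;; Equivalent bags find the same entry in an equal?-based hash table.
(hash-ref (hash (bag '(1 2 3)) 'found)
          (bag '(3 2 1))
          #f)                                   ; => 'found
```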
5.6
Each time that a struct form is evaluated, it generates a structure type that is distinct from all existing structure types, even if some other structure type has the same name and fields.
This generativity is useful for enforcing abstractions and implementing programs such as interpreters, but beware of placing a struct form in positions that are evaluated multiple times.
Examples:
(define (add-bigger-fish lst)
  (struct fish (size) #:transparent) ; new every time
  (cond
    [(null? lst) (list (fish 1))]
    [else (cons (fish (* 2 (fish-size (car lst))))
                lst)]))
> (add-bigger-fish null)
(list (fish 1))
> (add-bigger-fish (add-bigger-fish null))
fish-size: expects args of type <struct:fish>; given instance of a different <struct:fish>
(struct fish (size) #:transparent)
(define (add-bigger-fish lst)
  (cond
    [(null? lst) (list (fish 1))]
    [else (cons (fish (* 2 (fish-size (car lst))))
                lst)]))
> (add-bigger-fish (add-bigger-fish null))
(list (fish 2) (fish 1))
5.7 Prefab Structure Types
Although a transparent structure type prints in a way that shows its content, the printed form of the structure cannot be used in an expression to get the structure back, unlike the printed form of a number, string, symbol, or list.
A prefab ("previously fabricated") structure type is a built-in type that is known to the Racket printer and expression reader. Infinitely many such types exist, and they are indexed by name, field count, supertype, and other such details. The printed form of a prefab structure is similar to a vector, but it starts #s instead of just #, and the first element in the printed form is the prefab structure type's name.
The following examples show instances of the sprout prefab structure type that has one field. The first instance has a field value 'bean, and the second has field value 'alfalfa:
> '#s(sprout bean)
'#s(sprout bean)
> '#s(sprout alfalfa)
'#s(sprout alfalfa)
Like numbers and strings, prefab structures are self-quoting, so the quotes above are optional:
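A brief sketch of the self-quoting behavior follows. It also calls prefab-struct-key, which the text above does not mention, to show the key that identifies this prefab type:

```racket
#lang racket

;; A prefab literal evaluates to itself, with or without a quote.
#s(sprout bean)                                 ; => '#s(sprout bean)
(equal? #s(sprout bean) '#s(sprout bean))       ; => #t

;; prefab-struct-key returns the key of a prefab instance; for a simple
;; prefab type with no supertype or special fields, it is just the name.
(prefab-struct-key #s(sprout bean))             ; => 'sprout
```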
When you use the #:prefab keyword with struct, instead of generating a new structure type, you obtain bindings that work with the existing prefab structure type:
> (define lunch '#s(sprout bean))
> (struct sprout (kind) #:prefab)
> (sprout? lunch)
#t
> (sprout-kind lunch)
'bean
> (sprout 'garlic)
'#s(sprout garlic)
The field name kind above does not matter for finding the prefab structure type; only the name sprout and the number of fields matter. At the same time, the prefab structure type sprout with three fields is a different structure type than the one with a single field:
> (sprout? #s(sprout bean #f 17))
#f
> (struct sprout (kind yummy? count) #:prefab) ; redefine
> (sprout? #s(sprout bean #f 17))
#t
> (sprout? lunch)
#f
A prefab structure type can have another prefab structure type as its supertype, it can have mutable fields, and it can have auto fields. Variations in any of these dimensions correspond to different prefab structure types, and the printed form of the structure type's name encodes all of the relevant details.
> (struct building (rooms [location #:mutable]) #:prefab)
> (struct house building ([occupied #:auto]) #:prefab
    #:auto-value 'no)
> (house 5 'factory)
'#s((house (1 no) building 2 #(1)) 5 factory no)
Every prefab structure type is transparent, but even less abstract than a transparent type, because instances can be created without any access to a particular structure-type declaration or existing examples. Overall, the different options for structure types offer a spectrum of possibilities from more abstract to more convenient:
• Opaque (the default): Instances cannot be inspected or forged without access to the structure-type declaration. As discussed in the next section, constructor guards and properties can be attached to the structure type to further protect or to specialize the behavior of its instances.
• Transparent: Anyone can inspect or create an instance without access to the structure-type declaration, which means that the value printer can show the content of an instance. All instance creation passes through a constructor guard, however, so that the content of an instance can be controlled, and the behavior of instances can be specialized through properties. Since the structure type is generated by its definition, instances cannot be manufactured simply through the name of the structure type, and therefore cannot be generated automatically by the expression reader.
• Prefab: Anyone can inspect or create an instance at any time, without prior access to a structure-type declaration or an example instance. Consequently, the expression reader can manufacture instances directly. The instance cannot have a constructor guard or properties.
Since the expression reader can generate prefab instances, they are useful when convenient serialization is more important than abstraction. Opaque and transparent structures also can be serialized, however, if they are defined with define-serializable-struct as described in 8.4 Datatypes and Serialization.
5.8 More Structure Type Options
The full syntax of struct supports many options, both at the structure-type level and at the level of individual fields:
(struct struct-id maybe-super (field ...)
        struct-option ...)

maybe-super =
            | super-id

field = field-id
      | [field-id field-option ...]
A struct-option always starts with a keyword:
#:mutable
Causes all fields of the structure to be mutable, and introduces for each field-id a mutator set-struct-id-field-id! that sets the value of the corresponding field in an instance of the structure type.
Examples:
> (struct dot (x y) #:mutable)
(define d (dot 1 2))
> (dot-x d)
1
> (set-dot-x! d 10)
> (dot-x d)
10
The #:mutable option can also be used as a field-option, in which case it makes an individual field mutable.
Examples:
> (struct person (name [age #:mutable]))
(define friend (person "Barney" 5))
> (set-person-age! friend 6)
> (set-person-name! friend "Mary")
reference to undefined identifier: set-person-name!
#:transparent
Controls reflective access to structure instances, as discussed in a previous section, 5.4 Opaque versus Transparent Structure Types.
#:inspector inspector-expr
Generalizes #:transparent to support more controlled access to reflective operations.
#:prefab
Accesses a built-in structure type, as discussed in a previous section, 5.7 Prefab Structure Types.
#:auto-value auto-expr
Specifies a value to be used for all automatic fields in the structure type, where an automatic field is indicated by the #:auto field option. The constructor procedure does not accept arguments for automatic fields. Automatic fields are implicitly mutable (via reflective operations), but mutator functions are bound only if #:mutable is also specified.
Examples:
> (struct posn (x y [z #:auto])
    #:transparent
    #:auto-value 0)
> (posn 1 2)
(posn 1 2 0)
#:guard guard-expr
Specifies a constructor guard procedure to be called whenever an instance of the structure type is created. The guard takes as many arguments as non-automatic fields in the structure type, plus one more for the name of the instantiated type (in case a sub-type is instantiated, in which case it's best to report an error using the sub-type's name). The guard should return the same number of values as given, minus the name argument. The guard can raise an exception if one of the given arguments is unacceptable, or it can convert an argument.
Examples:
> (struct thing (name)
    #:transparent
    #:guard (lambda (name type-name)
              (cond
                [(string? name) name]
                [(symbol? name) (symbol->string name)]
                [else (error type-name
                             "bad name: ~e"
                             name)])))
> (thing "apple")
(thing "apple")
> (thing 'apple)
(thing "apple")
> (struct person thing (age)
    #:transparent
    #:guard (lambda (name age type-name)
              (if (negative? age)
                  (error type-name "bad age: ~e" age)
                  (values name age))))
> (person "John" 10)
(person "John" 10)
> (person "Mary" -1)
person: bad age: -1
> (person 10 10)
person: bad name: 10
> (struct greeter (name)
    #:property prop:procedure
    (lambda (self other)
      (string-append
       "Hi " other
       ", I'm " (greeter-name self))))
(define joe-greet (greeter "Joe"))
> (greeter-name joe-greet)
"Joe"
> (joe-greet "Mary")
"Hi Mary, I'm Joe"
> (joe-greet "John")
"Hi John, I'm Joe"
#:super super-expr
An alternative to supplying a super-id next to struct-id. Instead of the name of a structure type (which is not an expression), super-expr should produce a structure type descriptor value. An advantage of #:super is that structure type descriptors are values, so they can be passed to procedures.
Examples:
(define (raven-constructor super-type)
  (struct raven ()
    #:super super-type
    #:transparent
    #:property prop:procedure (lambda (self) 'nevermore))
  raven)
> (let ([r ((raven-constructor struct:posn) 1 2)])
    (list r (r)))
(list (raven 1 2) 'nevermore)
> (let ([r ((raven-constructor struct:thing) "apple")])
    (list r (r)))
(list (raven "apple") 'nevermore)
4 Structures in The Racket Reference provides more on structure types.
6 Modules
Modules let you organize Racket code into multiple files and reusable libraries.
6.1 Module Basics
Each Racket module typically resides in its own file. For example, suppose the file "cake.rkt" contains the following module:
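#lang racket

(provide print-cake)

; draws a cake with n candles
(define (print-cake n)
  (show "   ~a   " n #\.)
  (show " .-~a-. " n #\|)
  (show " | ~a | " n #\space)
  (show "---~a---" n #\-))

; helper used by print-cake: fills one row of the cake with n copies of ch
(define (show fmt n ch)
  (printf fmt (make-string n ch))
  (newline))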
"cake.rkt"
"random-cake.rkt"
The relative reference "cake.rkt" in the import (require "cake.rkt") works if the "cake.rkt" and "random-cake.rkt" modules are in the same directory. Unix-style relative paths are used for relative module references on all platforms, much like relative URLs in HTML pages.
6.1.1 Organizing Modules
The "cake.rkt" and "random-cake.rkt" example demonstrates the most common way to organize a program into modules: put all module files in a single directory (perhaps with subdirectories), and then have the modules reference each other through relative paths. A directory of modules can act as a project, since it can be moved around on the filesystem or copied to other machines, and relative paths preserve the connections among modules.
As another example, if you are building a candy-sorting program, you might have a main "sort.rkt" module that uses other modules to access a candy database and to control a sorting machine. If the candy-database module itself is organized into submodules that handle barcode and manufacturer information, then the database module could be "db/lookup.rkt" that uses helper modules "db/barcodes.rkt" and "db/makers.rkt". Similarly, the sorting-machine driver "machine/control.rkt" might use helper modules "machine/sensors.rkt" and "machine/actuators.rkt".
sort.rkt
db/
  lookup.rkt
  barcodes.rkt
  makers.rkt
machine/
  control.rkt
  sensors.rkt
  actuators.rkt
The "sort.rkt" module uses the relative paths "db/lookup.rkt" and "machine/control.rkt" to import from the database and machine-control libraries:
"sort.rkt"
The "db/lookup.rkt" module similarly uses paths relative to its own source to access the "db/barcodes.rkt" and "db/makers.rkt" modules:
"db/lookup.rkt"
"machine/control.rkt"
Racket tools all work automatically with relative paths. For example,
racket sort.rkt
on the command line runs the "sort.rkt" program and automatically loads and compiles required modules. With a large enough program, compilation from source can take too long, so use
6.1.2 Library Collections
A collection is a set of installed library modules. A module in a collection is referenced through an unquoted, suffixless path. For example, the following module refers to the "date.rkt" library that is part of the "racket" collection:
#lang racket

(require racket/date)

(printf "Today is ~s\n"
        (date->string (seconds->date (current-seconds))))
When you search the online Racket documentation, the search results indicate the module that provides each binding. Alternatively, if you reach a binding's documentation by clicking on hyperlinks, you can hover over the binding name to find out which modules provide it.
A module reference like racket/date looks like an identifier, but it is not treated in the same way as printf or date->string. Instead, when require sees a module reference that is unquoted, it converts the reference to a collection-based module path:
• First, if the unquoted path contains no /, then require automatically adds a "/main" to the reference. For example, (require slideshow) is equivalent to (require slideshow/main).
• Second, require implicitly adds a ".rkt" suffix to the path.
• Finally, require treats the path as relative to the installation location of the collection, instead of relative to the enclosing module's path.
The "racket" collection is located in a directory with the Racket installation. A user-specific directory can contain additional collections, and even more collection directories can be specified in configuration files or through the PLTCOLLECTS search path. Try running the following program to find out how your collection search path is configured:
#lang racket

(require setup/dirs)

(find-collects-dir)        ; main collection directory
(find-user-collects-dir)   ; user-specific collection directory
(get-collects-search-dirs) ; complete search path
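Returning to the conversion rules for unquoted module references, the following sketch spells out the same import in two ways. The lib form is not mentioned in the text above; it is used here only to make the implicit ".rkt" suffix and the collection-relative lookup visible:

```racket
#lang racket

;; Unquoted, collection-based reference: require adds the ".rkt" suffix
;; and resolves the path relative to the "racket" collection.
(require racket/date)

;; A longhand spelling of the same module, with the suffix written out.
;; Importing the same module twice is harmless.
(require (lib "racket/date.rkt"))

;; date->string comes from the racket/date library.
(date->string (seconds->date 0))
```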
6.1.3 Adding Collections
Looking back at the candy-sorting example of 6.1.1 Organizing Modules, suppose that modules in "db/" and "machine/" need a common set of helper functions. Helper functions could be put in a "utils/" directory, and modules in "db/" or "machine/" could access utility modules with relative paths that start "../utils/". As long as a set of modules work together in a single project, it's best to stick with relative paths. A programmer can follow relative-path references without knowing about your Racket configuration.
Some libraries are meant to be used across multiple projects, so that keeping the library source in a directory with its uses does not make sense. In that case, you have two options:
• Add the library to a new or existing collection. After the library is in a collection, it can be referenced with an unquoted path, just like libraries that are included with the Racket distribution.
• Add the library to a new or existing PLaneT package. Libraries in a PLaneT package are referenced with a path of the form (planet ....).
See PLaneT: Automatic Package Distribution for more information on PLaneT.
The simplest option is to add a new collection. You could add a new collection by placing les in the Racket installation or one of the directories reported by (get-collectssearch-dirs). Alternatively, you could add to the list of searched directories by setting the PLTCOLLECTS environment variable; if you set PLTCOLLECTS, include an empty path in by starting the value with a colon (Unix and Mac OS X) or semicolon (Windows) so that the original search paths are preserved. Finally, instead of using one of the default directories or setting PLTCOLLECTS, you can use raco link. The raco link command-line tool creates a link from a collection name to a directory for the collections modules. For example, suppose you have a directory "/usr/molly/bakery" that contains the "cake.rkt" module (from the beginning of this section) and other related modules. To make the modules available as a "bakery" collection, use
Instead of installing a single collection directory, the --root or -d flag for raco link can install a directory that contains collections, much like adding to PLTCOLLECTS.

See 2 raco link: Library Collection Links for more information on raco link.
6.2 Module Syntax
The #lang at the start of a module file begins a shorthand for a module form, much like ' is a shorthand for a quote form. Unlike ', the #lang shorthand does not work well in a REPL, in part because it must be terminated by an end-of-file, but also because the longhand expansion of #lang depends on the name of the enclosing file.
6.2.1 The module Form
The longhand form of a module declaration, which works in a REPL as well as a file, is
    (module cake racket
      (provide print-cake)

      (define (print-cake n)
        (show "   ~a   " n #\.)
        (show " .-~a-. " n #\|)
        (show " | ~a | " n #\space)
        (show "---~a---" n #\-))

      (define (show fmt n ch)
        (printf fmt (make-string n ch))
        (newline)))

    > (require 'cake)
    > (print-cake 3)
       ...
     .-|||-.
     |     |
    ---------

Declaring a module does not immediately evaluate the body definitions and expressions of the module. The module must be explicitly required at the top level to trigger evaluation. After evaluation is triggered once, later requires do not re-evaluate the module body. Examples:
    > (module hi racket
        (printf "Hello\n"))
    > (require 'hi)
    Hello
    > (require 'hi)
6.2.2 The #lang Shorthand
The body of a #lang shorthand has no specific syntax, because the syntax is determined by the language name that follows #lang. In the case of #lang racket, the syntax is
6.2.3 Submodules
A module form can be nested within a module, in which case the nested module form declares a submodule. Submodules can be referenced directly by the enclosing module using a quoted name. The following example prints "Tony" by importing tiger from the zoo submodule:
"park.rkt"

    #lang racket

    (module zoo racket
      (provide tiger)
      (define tiger "Tony"))

    (require 'zoo)

    tiger
Running a module does not necessarily run its submodules. In the above example, running "park.rkt" runs its submodule zoo only because the "park.rkt" module requires the zoo submodule. Otherwise, a module and each of its submodules can be run independently. Furthermore, if "park.rkt" is compiled to a bytecode file (via raco make), then the code for "park.rkt" or the code for zoo can be loaded independently.

Submodules can be nested within submodules, and a submodule can be referenced directly by a module other than its enclosing module by using a submodule path.

A module* form is similar to a nested module form:
One use of a submodule declared with module* and #f is to export additional bindings through a submodule that are not normally exported from the module:
"cake.rkt"

    #lang racket

    (provide print-cake)

    (define (print-cake n)
      (show "   ~a   " n #\.)
      (show " .-~a-. " n #\|)
      (show " | ~a | " n #\space)
      (show "---~a---" n #\-))

    (define (show fmt n ch)
      (printf fmt (make-string n ch))
      (newline))

    (module* extras #f
      (provide show))
In this revised "cake.rkt" module, show is not imported by a module that uses (require "cake.rkt"), since most clients of "cake.rkt" will not want the extra function. A module can require the extras submodule using (require (submod "cake.rkt" extras)) to access the otherwise hidden show function.
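As a rough single-file sketch of the same idea, the nested module cake below stands in for the file-based "cake.rkt" (an assumption made only so the example is self-contained), and the helper show-line is a hypothetical name introduced for illustration. The enclosing module reaches show only through the extras submodule path.

```racket
#lang racket

;; Stand-in for the file-based "cake.rkt" module (assumption: a nested
;; module is used here so that everything fits in one file).
(module cake racket
  (provide print-cake)

  (define (print-cake n)
    (show "   ~a   " n #\.)
    (show " .-~a-. " n #\|)
    (show " | ~a | " n #\space)
    (show "---~a---" n #\-))

  (define (show fmt n ch)
    (printf fmt (make-string n ch))
    (newline))

  ;; #f lets the submodule see all of cake's bindings, including show.
  (module* extras #f
    (provide show)))

;; An ordinary client sees only print-cake ...
(require 'cake)
;; ... and must name the extras submodule to get show.
(require (submod "." cake extras))

;; Hypothetical helper: capture what show prints as a string.
(define (show-line n)
  (with-output-to-string
    (lambda () (show "[~a]" n #\*))))
```

Without the second require, the reference to show in show-line would be unbound, which is the point of hiding it behind extras.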
6.2.4 Main and Test Submodules
The following variant of "cake.rkt" includes a main submodule that calls print-cake:
"cake.rkt"

    #lang racket

    (define (print-cake n)
      (show "   ~a   " n #\.)
      (show " .-~a-. " n #\|)
      (show " | ~a | " n #\space)
      (show "---~a---" n #\-))

    (define (show fmt n ch)
      (printf fmt (make-string n ch))
      (newline))

    (module* main #f
      (print-cake 10))
Running a module does not run its module*-defined submodules. Nevertheless, running the above module via racket or DrRacket prints a cake with 10 candles, because the main submodule is a special case.

When a module is provided as a program name to the racket executable or run directly within DrRacket, if the module has a main submodule, the main submodule is run after its enclosing module. Declaring a main submodule thus specifies extra actions to be performed when a module is run directly, instead of required as a library within a larger program.

A main submodule does not have to be declared with module*. If the main submodule does not need to use bindings from its enclosing module, it can be declared with module. More commonly, main is declared using module+:
"physics.rkt"

    #lang racket

    (module+ test
      (require rackunit)
      (define ε 1e-10))

    (provide drop
             to-energy)

    (define (drop t)
      (* 1/2 9.8 t t))

    (module+ test
      (check-= (drop 0) 0 ε)
      (check-= (drop 10) 490 ε))

    (define (to-energy m)
      (* m (expt 299792458.0 2)))

    (module+ test
      (check-= (to-energy 0) 0 ε)
      (check-= (to-energy 1) 9e+16 1e+15))
The same tests can be written with a single module* form:

"physics.rkt"

    #lang racket

    (provide drop
             to-energy)

    (define (drop t)
      (* 1/2 49/5 t t))

    (define (to-energy m)
      (* m (expt 299792458 2)))

    (module* test #f
      (require rackunit)
      (define ε 1e-10)
      (check-= (drop 0) 0 ε)
      (check-= (drop 10) 490 ε)
      (check-= (to-energy 0) 0 ε)
      (check-= (to-energy 1) 9e+16 1e+15))
Using module+ instead of module* allows tests to be interleaved with function definitions.

The combining behavior of module+ is also sometimes helpful for a main module. Even when combining is not needed, (module+ main ....) is preferred as more readable than (module* main #f ....).
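The combining behavior can be sketched with a small module in which both test and main are split into pieces. The function celsius->fahrenheit is a hypothetical example, not part of the guide's "physics.rkt"; each group of module+ forms with the same name is merged into one submodule.

```racket
#lang racket

(provide celsius->fahrenheit)

;; Hypothetical example function; exact arithmetic keeps the tests simple.
(define (celsius->fahrenheit c)
  (+ 32 (* 9/5 c)))

(module+ test
  (require rackunit)
  (check-equal? (celsius->fahrenheit 0) 32))

;; First piece of the main submodule.
(module+ main
  (printf "Boiling point of water: "))

;; Second piece of the test submodule; rackunit is already required above.
(module+ test
  (check-equal? (celsius->fahrenheit 100) 212))

;; Second piece of main; it runs after the first piece.
(module+ main
  (printf "~a F\n" (celsius->fahrenheit 100)))
```

Running the file directly with racket triggers the combined main submodule, while raco test runs the combined test submodule; requiring the file as a library runs neither.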
6.3 Module Paths
A module path is a reference to a module, as used with require or as the initial-module-path in a module form. It can be any of several forms:
(quote id)

A module path that is a quoted identifier refers to a non-file module declaration using the identifier. This form of module reference makes the most sense in a REPL. Examples:
    > (module m racket
        (provide color)
        (define color "blue"))
    > (module n racket
        (require 'm)
        (printf "my favorite color is ~a\n" color))
    > (require 'n)
    my favorite color is blue
rel-string
A string module path is a relative path using Unix-style conventions: / is the path separator, .. refers to the parent directory, and . refers to the same directory. The rel-string must not start or end with a path separator. If the path has no suffix, ".rkt" is added automatically.

The path is relative to the enclosing file, if any, or it is relative to the current directory. (More precisely, the path is relative to the value of (current-load-relative-directory), which is set while loading a file.)

6.1 Module Basics shows examples using relative paths.

If a relative path ends with a ".ss" suffix, it is converted to ".rkt". If the file that implements the referenced module actually ends in ".ss", the suffix will be changed back when attempting to load the file (but a ".rkt" suffix takes precedence). This two-way conversion provides compatibility with older versions of Racket.
id
A module path that is an unquoted identifier refers to an installed library. The id is constrained to contain only ASCII letters, ASCII numbers, +, -, _, and /, where / separates path elements within the identifier. The elements refer to collections and sub-collections, instead of directories and sub-directories.

An example of this form is racket/date. It refers to the module whose source is the "date.rkt" file in the "racket" collection, which is installed as part of Racket. The ".rkt" suffix is added automatically.
Another example of this form is racket, which is commonly used at the initial import. The path racket is shorthand for racket/main; when an id has no /, then /main is automatically added to the end. Thus, racket or racket/main refers to the module whose source is the "main.rkt" file in the "racket" collection. Examples:
    > (module m racket
        (require racket/date)
        (printf "Today is ~s\n"
                (date->string (seconds->date (current-seconds)))))
    > (require 'm)
    Today is "Monday, April 9th, 2012"
When the full path of a module ends with ".rkt", if no such file exists but one does exist with the ".ss" suffix, then the ".ss" suffix is substituted automatically. This transformation provides compatibility with older versions of Racket.
(lib rel-string)

Like an unquoted-identifier path, but expressed as a string instead of an identifier. Also, the rel-string can end with a file suffix, in which case ".rkt" is not automatically added.

Examples of this form include (lib "racket/date.rkt") and (lib "racket/date"), which are equivalent to racket/date. Other examples include (lib "racket"), (lib "racket/main"), and (lib "racket/main.rkt"), which are all equivalent to racket. Examples:
    > (module m (lib "racket")
        (require (lib "racket/date.rkt"))
        (printf "Today is ~s\n"
                (date->string (seconds->date (current-seconds)))))
    > (require 'm)
    Today is "Monday, April 9th, 2012"
(planet id)

Accesses a third-party library that is distributed through the PLaneT server. The library is downloaded the first time that it is needed, and then the local copy is used afterward.

The id encodes several pieces of information separated by a /: the package owner, then the package name with optional version information, and an optional path to a specific library within the package. Like id as shorthand for a lib path, a ".rkt" suffix is added automatically, and /main is used as the path if no sub-path element is supplied. Examples:
    > (module m (lib "racket")
        ; Use "schematics"'s "random.plt" 1.0, file "random.rkt":
        (require (planet schematics/random:1/random))
        (display (random-gaussian)))
    > (require 'm)
    0.9050686838895684
As with other forms, an implementation file ending with ".ss" can be substituted automatically if no implementation file ending with ".rkt" exists.
(planet package-string)

Like the symbol form of a planet, but using a string instead of an identifier. Also, the package-string can end with a file suffix, in which case ".rkt" is not added.

As with other forms, an ".ss" extension is converted to ".rkt", while an implementation file ending with ".ss" can be substituted automatically if no implementation file ending with ".rkt" exists.
(planet rel-string (user-string pkg-string vers ...))

    vers = nat
         | (nat nat)
         | (= nat)
         | (+ nat)
         | (- nat)
A more general form to access a library from the PLaneT server. In this general form, a PLaneT reference starts like a lib reference with a relative path, but the path is followed by information about the producer, package, and version of the library. The specified package is downloaded and installed on demand.

The vers forms specify a constraint on the acceptable version of the package, where a version number is a sequence of non-negative integers, and the constraints determine the allowable values for each element in the sequence. If no constraint is provided for a particular element, then any version is allowed; in particular, omitting all vers forms means that any version is acceptable. Specifying at least one vers is strongly recommended.

For a version constraint, a plain nat is the same as (+ nat), which matches nat or higher for the corresponding element of the version number. A (start-nat end-nat) matches any number in the range start-nat to end-nat, inclusive. A (= nat) matches only exactly nat. A (- nat) matches nat or lower. Examples:
    > (module m (lib "racket")
        (require (planet "random.rkt" ("schematics" "random.plt" 1 0)))
        (display (random-gaussian)))
    > (require 'm)
    0.9050686838895684
The automatic ".ss" and ".rkt" conversions apply as with other forms.
(file string)

Refers to a file, where string is a relative or absolute path using the current platform's conventions. This form is not portable, and it should not be used when a plain, portable rel-string suffices.

The automatic ".ss" and ".rkt" conversions apply as with other forms.
(submod base element ...+)

    base    = module-path
            | "."
            | ".."

    element = id
            | ".."
Refers to a submodule of base. The sequence of elements within submod specifies a path of submodule names to reach the final submodule.
Examples:
    > (module zoo racket
        (module monkey-house racket
          (provide monkey)
          (define monkey "Curious George")))
    > (require (submod 'zoo monkey-house))
    > monkey
    "Curious George"
Using "." as base within submod stands for the enclosing module. Using ".." as base is equivalent to using "." followed by an extra "..". When a path of the form (quote id) refers to a submodule, it is equivalent to (submod "." id).

Using ".." as an element cancels one submodule step, effectively referring to the enclosing module. For example, (submod "..") refers to the enclosing module of the submodule in which the path appears. Examples:
    > (module zoo racket
        (module monkey-house racket
          (provide monkey)
          (define monkey "Curious George"))
        (module crocodile-house racket
          (require (submod ".." monkey-house))
          (provide dinner)
          (define dinner monkey)))
    > (require (submod 'zoo crocodile-house))
    > dinner
    "Curious George"
6.4 Imports: require
The require form imports from another module. A require form can appear within a module, in which case it introduces bindings from the specified module into the importing module. A require form can also appear at the top level, in which case it both imports bindings and instantiates the specified module; that is, it evaluates the body definitions and expressions of the specified module, if they have not been evaluated already.

A single require can specify multiple imports at once:
module-path
In its simplest form, a require-spec is a module-path (as defined in the previous section, 6.3 Module Paths). In this case, the bindings introduced by require are determined by provide declarations within each module referenced by each module-path. Examples:
    > (module m racket
        (provide color)
        (define color "blue"))
    > (module n racket
        (provide size)
        (define size 17))
    > (require 'm 'n)
    > (list color size)
    '("blue" 17)

(only-in require-spec id-maybe-renamed ...)

    id-maybe-renamed = id
                     | [orig-id bind-id]
An only-in form limits the set of bindings that would be introduced by a base require-spec. Also, only-in optionally renames each binding that is preserved: in a [orig-id bind-id] form, the orig-id refers to a binding implied by require-spec, and bind-id is the name that will be bound in the importing context instead of orig-id. Examples:
    > (module m (lib "racket")
        (provide tastes-great? less-filling?)
        (define tastes-great? #t)
        (define less-filling? #t))
    > (require (only-in 'm tastes-great?))
    > tastes-great?
    #t
    > less-filling?
    reference to undefined identifier: less-filling?
    > (require (only-in 'm [less-filling? lite?]))
    > lite?
    #t

(except-in require-spec id ...)
This form is the complement of only-in: it excludes specific bindings from the set specified by require-spec.
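A minimal sketch of except-in, reusing the module m from the only-in example but written as a submodule of a #lang racket file (an assumption made so that the example runs as one file). Because less-filling? is excluded from the import, the enclosing module is free to define its own binding with that name.

```racket
#lang racket

(module m racket
  (provide tastes-great? less-filling?)
  (define tastes-great? #t)
  (define less-filling? #t))

;; Import everything that m provides except less-filling?.
(require (except-in 'm less-filling?))

;; Allowed only because less-filling? was not imported; defining a
;; name that is already imported by require is an error.
(define less-filling? 'defined-locally)
```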
6.5 Exports: provide
By default, all of a module's definitions are private to the module. The provide form specifies definitions to be made available where the module is required.
identifier
In its simplest form, a provide-spec indicates a binding within its module to be exported. The binding can be from either a local definition, or from an import.
(struct-out struct-id)
A struct-out form exports the bindings created by (struct struct-id ....).
See 5 Programmer-Defined Datatypes for information on define-struct.
(all-defined-out)
The all-defined-out shorthand exports all bindings that are defined within the exporting module (as opposed to imported).

Use of the all-defined-out shorthand is generally discouraged, because it makes less clear the actual exports for a module, and because Racket programmers get into the habit of thinking that definitions can be added freely to a module without affecting its public interface (which is not the case when all-defined-out is used).
(all-from-out module-path)

The all-from-out shorthand exports all bindings in the module that were imported using a require-spec that is based on module-path.

Although different module-paths could refer to the same file-based module, re-exporting with all-from-out is based specifically on the module-path reference, and not the module that is actually referenced.
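The following sketch combines struct-out and all-from-out. The module names shapes and geometry, the circle structure, and unit-circle are all hypothetical, chosen only for illustration. Note that the module path given to all-from-out is written exactly as it appears in the require.

```racket
#lang racket

;; Exports circle, circle?, and circle-radius through struct-out.
(module shapes racket
  (provide (struct-out circle))
  (struct circle (radius)))

;; Re-exports everything imported from shapes, plus one local binding.
(module geometry racket
  (require (submod ".." shapes))
  (provide (all-from-out (submod ".." shapes))
           unit-circle)
  (define unit-circle (circle 1)))

;; Requiring geometry alone is enough to use the circle bindings.
(require 'geometry)

(circle-radius unit-circle)
(circle? (circle 5))
```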
6.6 Assignment and Redefinition
The use of set! on variables defined within a module is limited to the body of the defining module. That is, a module is allowed to change the value of its own definitions, and such changes are visible to importing modules. However, an importing context is not allowed to change the value of an imported binding. Examples:
    > (module m racket
        (provide counter increment!)
        (define counter 0)
        (define (increment!)
          (set! counter (add1 counter))))
    > (require 'm)
    > counter
    0
    > (increment!)
    > counter
    1
    > (set! counter -1)
    set!: cannot mutate module-required identifier in: counter
As the above example illustrates, a module can always grant others the ability to change its exports by providing a mutator function, such as increment!. The prohibition on assignment of imported variables helps support modular reasoning about programs. For example, in the module,
    (module m racket
      (provide rx:fish fishy-string?)
      (define rx:fish #rx"fish")
      (define (fishy-string? s)
        ; regexp-match? takes the pattern first, then the input
        (regexp-match? rx:fish s)))
the function fishy-string? will always match strings that contain "fish", no matter how other modules use the rx:fish binding. For essentially the same reason that it helps programmers, the prohibition on assignment to imports also allows many programs to be executed more efficiently.

Along the same lines, when a module contains no set! of a particular identifier that is defined within the module, then the identifier is considered a constant that cannot be changed, not even by re-declaring the module.

Consequently, re-declaration of a module is not generally allowed. For file-based modules, simply changing the file does not lead to a re-declaration in any case, because file-based modules are loaded on demand, and the previously loaded declarations satisfy future requests. It is possible to use Racket's reflection support to re-declare a module, however, and non-file modules can be re-declared in the REPL; in such cases, the re-declaration may fail if it involves the re-definition of a previously constant binding.
    > (module m racket
        (define pie 3.141597))
    > (require 'm)
    > (module m racket
        (define pie 3))
    define-values: cannot re-define a constant: pie in module: m
For exploration and debugging purposes, the Racket reflective layer provides a compile-enforce-module-constants parameter to disable the enforcement of constants.
    > (compile-enforce-module-constants #f)
    > (module m2 racket
        (provide pie)
        (define pie 3.141597))
    > (require 'm2)
    > (module m2 racket
        (provide pie)
        (define pie 3))
    > (compile-enforce-module-constants #t)
    > pie
    3
7 Contracts
7 Contracts in The Racket Reference provides more on contracts.
7.1 Contracts and Boundaries
Like a contract between two business partners, a software contract is an agreement between two parties. The agreement specifies obligations and guarantees for each "product" (or value) that is handed from one party to the other.

A contract thus establishes a boundary between the two parties. Whenever a value crosses this boundary, the contract monitoring system performs contract checks, making sure the partners abide by the established contract.

In this spirit, Racket encourages contracts mainly at module boundaries. Specifically, programmers may attach contracts to provide clauses and thus impose constraints and promises on the use of exported values. For example, the export specification
    #lang racket/base
    (require racket/contract) ; now we can write contracts

    (provide (contract-out [amount positive?]))

    (define amount ...)
7.1.1 Contract Violations
All of the contracts and modules in this chapter (excluding those just following) are written using the standard #lang syntax for describing modules. Since modules serve as the boundary between parties in a contract, examples involve multiple modules.

To experiment with multiple modules within a single module or within DrRacket's definitions area, use the racket/load language. The contents of such a module can be other modules (and require statements), using the longhand parenthesized syntax for a module (see 6.2.1 The module Form). For example, try the example earlier in this section as follows:
    #lang racket/load

    (module m racket
      (provide (contract-out [amount (and/c number? positive?)]))
      (define amount 150))

    (module n racket
      (require 'm)
      (+ amount 10))

    (require 'n)
7.2 Simple Contracts on Functions
A mathematical function has a domain and a range. The domain indicates the kind of values that the function can accept as arguments, and the range indicates the kind of values that it produces. The conventional notation for describing a function with its domain and range is
f : A -> B
where A is the domain of the function and B is the range.

Functions in a programming language have domains and ranges, too, and a contract can ensure that a function receives only values in its domain and produces only values in its range. A -> creates such a contract for a function. The forms after a -> specify contracts for the domains and finally a contract for the range.

Here is a module that might represent a bank account:
    #lang racket

    (provide (contract-out
              [deposit (-> number? any)]
              [balance (-> number?)]))

    (define amount 0)
    (define (deposit a) (set! amount (+ amount a)))
    (define (balance) amount)
The module exports two functions:

- deposit, which accepts a number and returns some value that is not specified in the contract, and
- balance, which returns a number indicating the current balance of the account.

When a module exports a function, it establishes two channels of communication between itself as a "server" and the "client" module that imports the function. If the client module calls the function, it sends a value into the server module. Conversely, if such a function call ends and the function returns a value, the server module sends a value back to the client module. This client-server distinction is important, because when something goes wrong, one or the other of the parties is to blame.

If a client module were to apply deposit to 'millions, it would violate the contract. The contract-monitoring system would catch this violation and blame the client for breaking the contract with the above module. In contrast, if the balance function were to return 'broke, the contract-monitoring system would blame the server module.

A -> by itself is not a contract; it is a contract combinator, which combines other contracts to form a contract.

7.2.1 Styles of ->
If you are used to mathematical functions, you may prefer a contract arrow to appear between the domain and the range of a function, not at the beginning. If you have read How to Design Programs, you have seen this many times. Indeed, you may have seen contracts such as these in other people's code:
The any contract used for deposit matches any kind of result, and it can only be used in the range position of a function contract. Instead of any above, we could use the more specific contract void?, which says that the function will always return the (void) value. The void? contract, however, would require the contract monitoring system to check the return value every time the function is called, even though the client module can't do much with the value. In contrast, any tells the monitoring system not to check the return value; it tells a potential client that the server module makes no promises at all about the function's return value, even whether it is a single value or multiple values.
The any/c contract is similar to any, in that it makes no demands on a value. Unlike any, any/c indicates a single value, and it is suitable for use as an argument contract. Using any/c as a range contract imposes a check that the function produces a single value. That is,
7.2.3 Rolling Your Own Contracts
The deposit function adds the given number to the value of amount. While the function's contract prevents clients from applying it to non-numbers, the contract still allows them to apply the function to complex numbers, negative numbers, or inexact numbers, none of which sensibly represent amounts of money.

The contract system allows programmers to define their own contracts as functions:
    #lang racket

    (define (amount? a)
      (and (number? a) (integer? a) (exact? a) (>= a 0)))

    (provide (contract-out
              ; an amount is a natural number of cents
              ; is the given number an amount?
              [deposit (-> amount? any)]
              [amount? (-> any/c boolean?)]
              [balance (-> amount?)]))

    (define amount 0)
    (define (deposit a) (set! amount (+ amount a)))
    (define (balance) amount)
This module defines an amount? function and uses it as a contract within -> contracts. When a client calls the deposit function as exported with the contract (-> amount? any), it must supply an exact, nonnegative integer; otherwise the amount? function applied to the argument will return #f, which will cause the contract-monitoring system to blame the client. Similarly, the server module must provide an exact, nonnegative integer as the result of balance to remain blameless.

Of course, it makes no sense to restrict a channel of communication to values that the client doesn't understand. Therefore the module also exports the amount? predicate itself, with a contract saying that it accepts an arbitrary value and returns a boolean.

In this case, we could also have used natural-number/c in place of amount?, since it implies exactly the same check:
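A minimal sketch of the natural-number/c variant follows. It wraps the bank-account code in a submodule named account (a hypothetical name, used so that a client and the server fit in one file) and shows the enclosing module acting as the client. The flag rejected? is likewise only for illustration.

```racket
#lang racket

;; Server side: the account module from above, with natural-number/c
;; in place of the hand-written amount? predicate.
(module account racket
  (provide (contract-out
            [deposit (-> natural-number/c any)]
            [balance (-> natural-number/c)]))

  (define amount 0)
  (define (deposit a) (set! amount (+ amount a)))
  (define (balance) amount))

;; Client side: the enclosing module.
(require 'account)

(deposit 250)

;; A negative deposit violates the domain contract, so the contract
;; system raises an exception before deposit's body runs.
(define rejected?
  (with-handlers ([exn:fail:contract? (lambda (e) #t)])
    (deposit -10)
    #f))
```

Since the violating call never reaches the body of deposit, the balance is unaffected by it.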
    (define amount/c
      (and/c number? integer? exact? (or/c positive? zero?)))

    (provide (contract-out
              [deposit (-> amount/c any)]
              [balance (-> amount/c)]))
Other values also serve double duty as contracts. For example, if a function accepts a number or #f, (or/c number? #f) suffices. Similarly, the amount/c contract could have been written with a 0 in place of zero?. If you use a regular expression as a contract, the contract accepts strings and byte strings that match the regular expression.

Naturally, you can mix your own contract-implementing functions with combinators like and/c. Here is a module for creating strings from banking records:
    #lang racket

    (define (has-decimal? str)
      (define L (string-length str))
      (and (>= L 3)
           (char=? #\. (string-ref str (- L 3)))))

    (provide (contract-out
              ; convert a random number to a string
              [format-number (-> number? string?)]

              ; convert an amount into a string with a decimal
              ; point, as in an amount of US currency
              [format-nat (-> natural-number/c
                              (and/c string? has-decimal?))]))
The contract of the exported function format-number specifies that the function consumes a number and produces a string.

The contract of the exported function format-nat is more interesting than the one of format-number. It consumes only natural numbers. Its range contract promises a string that has a . in the third position from the right.

If we want to strengthen the promise of the range contract for format-nat so that it admits only strings with digits and a single dot, we could write it like this:
    #lang racket

    (define (digit-char? x)
      (member x '(#\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9 #\0)))

    (define (has-decimal? str)
      (define L (string-length str))
      (and (>= L 3)
           (char=? #\. (string-ref str (- L 3)))))

    (define (is-decimal-string? str)
      (define L (string-length str))
      (and (has-decimal? str)
           (andmap digit-char?
                   (string->list (substring str 0 (- L 3))))
           (andmap digit-char?
                   (string->list (substring str (- L 2) L)))))

    ....

    (provide (contract-out
              ....
              ; convert an amount (natural number) of cents
              ; into a dollar-based string
              [format-nat (-> natural-number/c
                              (and/c string?
                                     is-decimal-string?))]))
Alternately, in this case, we could use a regular expression as a contract:
    #lang racket

    (provide
     (contract-out
      ....
      ; convert an amount (natural number) of cents
      ; into a dollar-based string
      [format-nat (-> natural-number/c
                      (and/c string? #rx"[0-9]*\\.[0-9][0-9]"))]))
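The modules above show only the contracts. As a hedged sketch, here is one possible body for format-nat that satisfies the regular-expression contract. The implementation and the submodule name fmt are assumptions made for illustration; the guide does not show the function's definition.

```racket
#lang racket

(module fmt racket
  (provide
   (contract-out
    ; convert an amount (natural number) of cents
    ; into a dollar-based string
    [format-nat (-> natural-number/c
                    (and/c string? #rx"[0-9]*\\.[0-9][0-9]"))]))

  ;; Hypothetical implementation: split cents into whole dollars and
  ;; leftover cents, padding the cents to two digits.
  (define (format-nat cents)
    (define-values (dollars leftover) (quotient/remainder cents 100))
    (string-append (number->string dollars)
                   "."
                   (if (< leftover 10) "0" "")
                   (number->string leftover))))

(require 'fmt)

(format-nat 1234)
(format-nat 5)
```

If the body forgot the zero padding, a call such as (format-nat 5) would produce "0.5", and the range contract would blame the fmt module rather than its client.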
7.2.4 Contracts on Higher-order Functions
Function contracts are not just restricted to having simple predicates on their domains or ranges. Any of the contract combinators discussed here, including function contracts themselves, can be used as contracts on the arguments and results of a function. For example,
7.2.5 Contract Messages with "???"
You wrote your module. You added contracts. You put them into the interface so that client programmers have all the information from interfaces. It's a piece of art:
    #lang racket

    (provide
     (contract-out
      [deposit (-> (lambda (x)
                     (and (number? x) (integer? x) (>= x 0)))
                   any)]))

    (define this 0)
    (define (deposit a) ...)
Several clients used your module. Others used their modules in turn. And all of a sudden one of them sees this error message:

    bank-client broke the contract (-> ??? any) it had with myaccount on deposit; expected <???>, given: -10

Clearly, bank-client is a module that uses myaccount, but what is the ??? doing there? Wouldn't it be nice if we had a name for this class of data, much like we have string, number, and so on?

For this situation, Racket provides flat named contracts. The use of "contract" in this term shows that contracts are first-class values. The "flat" means that the collection of data is a subset of the built-in atomic classes of data; they are described by a predicate that consumes all Racket values and produces a boolean. The "named" part says what we want to do, which is to name the contract so that error messages become intelligible:
    #lang racket

    (define (amount? x) (and (number? x) (integer? x) (>= x 0)))
    (define amount (flat-named-contract 'amount amount?))

    (provide (contract-out [deposit (amount . -> . any)]))

    (define this 0)
    (define (deposit a) ...)
With this little change, the error message suddenly becomes quite readable:

    bank-client broke the contract (-> amount any) it had with myaccount on deposit; expected <amount>, given: -10
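To see the name show up in practice, the sketch below declares myaccount as a submodule and lets the enclosing module play the part of bank-client. The body of deposit and the variable message are assumptions for illustration, and the exact wording of the error message differs between Racket versions; the sketch relies only on the contract's name appearing somewhere in it.

```racket
#lang racket

(module myaccount racket
  (define (amount? x) (and (number? x) (integer? x) (>= x 0)))
  (define amount (flat-named-contract 'amount amount?))

  (provide (contract-out [deposit (amount . -> . any)]))

  (define this 0)
  ;; Hypothetical body; the guide elides it.
  (define (deposit a) (set! this (+ this a))))

(require 'myaccount)

;; Capture the text of the contract violation instead of aborting.
(define message
  (with-handlers ([exn:fail:contract? exn-message])
    (deposit -10)
    "no violation"))

(display message)
```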
7.3 Contracts on Functions in General
The -> contract constructor works for functions that take a fixed number of arguments and where the result contract is independent of the input arguments. To support other kinds of functions, Racket supplies additional contract constructors, notably ->* and ->i.
7.3.1 Optional Arguments
Take a look at this excerpt from a string-processing module, inspired by the Scheme cookbook:
    #lang racket

    (provide
     (contract-out
      ; pad the given str left and right with
      ; the (optional) char so that it is centered
      [string-pad-center (->* (string? natural-number/c)
                              (char?)
                              string?)]))

    (define (string-pad-center str width [pad #\space])
      (define field-width (min width (string-length str)))
      (define rmargin (ceiling (/ (- width field-width) 2)))
      (define lmargin (floor (/ (- width field-width) 2)))
      (string-append (build-string lmargin (λ (x) pad))
                     str
                     (build-string rmargin (λ (x) pad))))
The module exports string-pad-center, a function that creates a string of a given width with the given string in the center. The default fill character is #\space; if the client module wishes to use a different character, it may call string-pad-center with a third argument, a char, overriding the default.

The function definition uses optional arguments, which is appropriate for this kind of functionality. The interesting point here is the formulation of the contract for string-pad-center.

The contract combinator ->* demands several groups of contracts:

- The first one is a parenthesized group of contracts for all required arguments. In this example, we see two: string? and natural-number/c.
- The second one is a parenthesized group of contracts for all optional arguments: char?.
- The last one is a single contract: the result of the function.

Note that if a default value does not satisfy a contract, you won't get a contract error for this interface. If you can't trust yourself to get the initial value right, you need to communicate the initial value across a boundary.
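The three groups can be seen at work in a smaller sketch. The function greet and the submodule name greeting are hypothetical, chosen only to keep the example short; the enclosing module acts as the client.

```racket
#lang racket

(module greeting racket
  (provide
   (contract-out
    ;; required: one string; optional: one string; result: a string
    [greet (->* (string?) (string?) string?)]))

  (define (greet name [salutation "Hello"])
    (string-append salutation ", " name "!")))

(require 'greeting)

(greet "Molly")        ; uses the default salutation
(greet "Molly" "Hi")   ; overrides it

;; Supplying a non-string for the optional argument is a client error.
(define bad-optional-rejected?
  (with-handlers ([exn:fail:contract? (lambda (e) #t)])
    (greet "Molly" 42)
    #f))
```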
7.3.2 Rest Arguments
The max operator consumes at least one real number, but it accepts any number of additional arguments. You can write other such functions using a rest argument, such as in max-abs:
    (define (max-abs n . rst)
      (foldr (lambda (n m) (max (abs n) m)) (abs n) rst))
Describing this function through a contract requires a further extension of ->*: a #:rest keyword specifies a contract on a list of arguments after the required and optional arguments:
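A minimal sketch of such a contract for max-abs, assuming one required real number, no optional arguments, and a rest list of real numbers. The submodule name m is used only so that the example is self-contained.

```racket
#lang racket

(module m racket
  (provide
   (contract-out
    ;; (required args) (optional args) #:rest rest-list-contract result
    [max-abs (->* (real?) () #:rest (listof real?) real?)]))

  (define (max-abs n . rst)
    (foldr (lambda (n m) (max (abs n) m)) (abs n) rst)))

(require 'm)

(max-abs -7 3 5)
(max-abs 2)
```

The empty parentheses are needed because ->* always expects the optional-argument group before #:rest, even when there are no optional arguments.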
7.3.3 Keyword Arguments
It turns out that the -> contract constructor also contains support for keyword arguments. For example, consider this function, which creates a simple GUI and asks the user a yes-or-no question:
    (define (ask-yes-or-no-question question
                                    #:default answer
                                    #:title title
                                    #:width w
                                    #:height h)
      (define d (new dialog% [label title] [width w] [height h]))
      (define msg (new message% [label question] [parent d]))
      (define (yes) (set! answer #t) (send d show #f))
      (define (no) (set! answer #f) (send d show #f))
      (define yes-b (new button%
                         [label "Yes"] [parent d]
                         [callback (λ (x y) (yes))]
                         [style (if answer '(border) '())]))
      (define no-b (new button%
                        [label "No"] [parent d]
                        [callback (λ (x y) (no))]
                        [style (if answer '() '(border))]))
      (send d show #t)
      answer)
(provide (contract-out [ask-yes-or-no-question (-> string? #:default boolean? #:title string? #:width exact-integer? #:height exact-integer? boolean?)]))
The contract for ask-yes-or-no-question uses ->, and in the same way that lambda (or define-based functions) allows a keyword to precede a function's formal argument, -> allows a keyword to precede a function contract's argument contract. In this case, the contract says that ask-yes-or-no-question must receive four keyword arguments, one for each of the keywords #:default, #:title, #:width, and #:height. As in a function definition, the order of the keywords in -> relative to each other does not matter for clients of the function; only the relative order of argument contracts without keywords matters.
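Because the GUI example needs a display to run, here is a smaller sketch of the same idea without a GUI. The function make-banner, its keywords, and the submodule name banner are hypothetical and exist only for this illustration.

```racket
#lang racket

;; Sketch: mandatory keyword arguments in a -> contract.
;; `make-banner` and the submodule `banner` are hypothetical names.
(module banner racket
  ;; centers `text` in a field of width `w`, padding with `c`
  (define (make-banner text #:width w #:fill c)
    (define pad (max 0 (- w (string-length text))))
    (define left (quotient pad 2))
    (string-append (make-string left c)
                   text
                   (make-string (- pad left) c)))
  (provide
   (contract-out
    [make-banner (-> string?
                     #:width exact-nonnegative-integer?
                     #:fill char?
                     string?)])))

(require 'banner)

;; keyword order at the call site is free
(make-banner "hi" #:fill #\* #:width 6) ; => "**hi**"
```

Passing a string instead of a char for #:fill is reported as a contract violation blaming the caller.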
If you really want to ask a yes-or-no question via a GUI, you should use message-box/custom. For that matter, it's usually better to provide buttons with more specific answers than "yes" and "no."
7.3.4
Optional Keyword Arguments
Of course, many of the parameters in ask-yes-or-no-question (from the previous section) have reasonable defaults and should be made optional:
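(define (ask-yes-or-no-question question
                                #:default answer
                                #:title [title "Yes or No?"]
                                #:width [w 400]
                                #:height [h 200])
  ...)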
To specify this function's contract, we need to use ->* again. It supports keywords just as you might expect in both the optional and mandatory argument sections. In this case, we have the mandatory keyword #:default and optional keywords #:title, #:width, and #:height. So, we write the contract like this:
(provide (contract-out [ask-yes-or-no-question (->* (string? #:default boolean?) (#:title string? #:width exact-integer? #:height exact-integer?) boolean?)]))
That is, we put the mandatory keywords in the first section, and we put the optional ones in the second section.
7.3.5
Contracts for case-lambda
A function defined with case-lambda might impose different constraints on its arguments depending on how many are provided. For example, a report-cost function might convert either a pair of numbers or a string into a new string:
(define report-cost
  (case-lambda
    [(lo hi) (format "between $~a and $~a" lo hi)]
    [(desc) (format "~a of dollars" desc)]))

> (report-cost 5 8)
"between $5 and $8"
> (report-cost "millions")
"millions of dollars"
The contract for such a function is formed with the case-> combinator, which combines as many functional contracts as needed:
(provide
 (contract-out
  [report-cost
   (case->
    (integer? integer? . -> . string?)
    (string? . -> . string?))]))
7.3.6
Argument and Result Dependencies
(provide (contract-out [real-sqrt (->i ([argument (>=/c 1)]) [result (argument) (<=/c argument)])]))
The contract for the exported function real-sqrt uses the ->i rather than the ->* function contract. The "i" stands for an indy dependent contract, meaning that the contract for the function range depends on the value of the argument. The appearance of argument in the line for result's contract means that the result depends on the argument. In this particular case, the argument of real-sqrt is greater than or equal to 1, so a very basic correctness check is that the result is no larger than the argument.

In general, a dependent function contract looks just like the more general ->* contract, but with names added that can be used elsewhere in the contract.

Going back to the bank-account example, suppose that we generalize the module to support multiple accounts and that we also include a withdrawal operation. The improved bank-account module includes an account structure type and the following functions:
(provide (contract-out [balance (-> account? amount/c)] [withdraw (-> account? amount/c account?)] [deposit (-> account? amount/c account?)]))
Besides requiring that a client provide a valid amount for a withdrawal, however, the amount should be less than or equal to the specified account's balance, and the resulting account will have less money than it started with. Similarly, the module might promise that a deposit produces an account with money added to the account. The following implementation enforces those constraints and guarantees through contracts:
#lang racket

; section 1: the contract definitions
(struct account (balance))
(define amount/c natural-number/c)

; section 2: the exports
(provide
 (contract-out
  [create (amount/c . -> . account?)]
  [balance (account? . -> . amount/c)]
  [withdraw (->i ([acc account?]
                  [amt (acc) (and/c amount/c (<=/c (balance acc)))])
                 [result (acc amt)
                         (and/c account?
                                (lambda (res)
                                  (>= (balance res)
                                      (- (balance acc) amt))))])]
  [deposit (->i ([acc account?]
                 [amt amount/c])
                [result (acc amt)
                        (and/c account?
                               (lambda (res)
                                 (>= (balance res)
                                     (+ (balance acc) amt))))])]))

; section 3: the function definitions
(define balance account-balance)
(define (create amt) (account amt))
(define (withdraw a amt) (account (- (account-balance a) amt)))
(define (deposit a amt) (account (+ (account-balance a) amt)))
The contracts in section 2 provide typical type-like guarantees for create and balance. For withdraw and deposit, however, the contracts check and guarantee the more complicated constraints on balance and deposit. The contract on the second argument to withdraw uses (balance acc) to check whether the supplied withdrawal amount is small enough, where acc is the name given within ->i to the function's first argument. The contract on the result of withdraw uses both acc and amt to guarantee that no more than the requested amount was withdrawn. The contract on deposit similarly uses acc and amt in the result contract to guarantee that at least as much money as provided was deposited into the account.
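To see the dependent argument contract at work, the following sketch places a reduced version of the module (only create, balance, and withdraw) in a submodule and calls it from the enclosing module. The submodule name bank and the client-side names acct and overdraw-rejected? are assumptions made for this illustration.

```racket
#lang racket

;; Sketch: exercising the dependent contract on withdraw.
;; The submodule name `bank` is hypothetical.
(module bank racket
  (struct account (balance))
  (define amount/c natural-number/c)
  (define balance account-balance)
  (define (create amt) (account amt))
  (define (withdraw a amt) (account (- (account-balance a) amt)))
  (provide
   (contract-out
    [create (amount/c . -> . account?)]
    [balance (account? . -> . amount/c)]
    [withdraw (->i ([acc account?]
                    [amt (acc) (and/c amount/c (<=/c (balance acc)))])
                   [result (acc amt)
                           (and/c account?
                                  (lambda (res)
                                    (>= (balance res)
                                        (- (balance acc) amt))))])])))

(require 'bank)

(define acct (create 100))
(balance (withdraw acct 30)) ; => 70

;; Withdrawing more than the balance violates the contract on amt,
;; so the call never reaches the body of withdraw.
(define overdraw-rejected?
  (with-handlers ([exn:fail:contract? (lambda (e) #t)])
    (withdraw acct 500)
    #f))
overdraw-rejected? ; => #t
```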
As written above, when a contract check fails, the error message is not great. The following revision uses flat-named-contract within a helper function mk-account-contract to provide better error messages.
#lang racket

; section 1: the contract definitions
(struct account (balance))
(define amount/c natural-number/c)

(define msg> "account a with balance larger than ~a expected")
(define msg< "account a with balance less than ~a expected")

(define (mk-account-contract acc amt op msg)
  (define balance0 (balance acc))
  (define (ctr a)
    (and (account? a) (op balance0 (balance a))))
  (flat-named-contract (format msg balance0) ctr))

; section 2: the exports
(provide
 (contract-out
  [create (amount/c . -> . account?)]
  [balance (account? . -> . amount/c)]
  [withdraw (->i ([acc account?]
                  [amt (acc) (and/c amount/c (<=/c (balance acc)))])
                 [result (acc amt) (mk-account-contract acc amt >= msg>)])]
  [deposit (->i ([acc account?]
                 [amt amount/c])
                [result (acc amt) (mk-account-contract acc amt <= msg<)])]))

; section 3: the function definitions
(define balance account-balance)
(define (create amt) (account amt))
(define (withdraw a amt) (account (- (account-balance a) amt)))
(define (deposit a amt) (account (+ (account-balance a) amt)))
7.3.7
Checking State Changes
The ->i contract combinator can also ensure that a function only modifies state according to certain constraints. For example, consider this contract (it is slightly simplified from the function preferences:add-panel in the framework):
(->i ([parent (is-a?/c area-container-window<%>)])
     [_ (parent)
        (let ([old-children (send parent get-children)])
          (λ (child)
            (andmap eq?
                    (append old-children (list child))
                    (send parent get-children))))])
It says that the function accepts a single argument, named parent, and that parent must be an object matching the interface area-container-window<%>.

The range contract ensures that the function only modifies the children of parent by adding a new child to the end of the list. It accomplishes this by using the _ instead of a normal identifier, which tells the contract library that the range contract does not depend on the values of any of the results, and thus the contract library evaluates the expression following the _ when the function is called, instead of when it returns. Therefore the call to the get-children method happens before the function under the contract is called. When the function under contract returns, its result is passed in as child, and the contract ensures that the children after the function returns are the same as the children before the function was called, but with one more child, at the end of the list (the comparison appends child after old-children).

To see the difference in a toy example that focuses on this point, consider this program:
#lang racket (define x '()) (define (get-x) x) (define (f) (set! x (cons 'f x))) (provide (contract-out [f (->i () [_ (begin (set! x (cons 'ctc x)) any/c)])] [get-x (-> (listof symbol?))]))
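If you were to require this module and call f, then the result of get-x would be '(f ctc). In contrast, if the contract for f were
(->i () [res (begin (set! x (cons 'ctc x)) any/c)])
(only changing the underscore to res), then the result of get-x would be '(ctc f).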
7.3.8
Multiple Results
The function split consumes a list of chars and delivers the string that occurs before the first occurrence of #\newline (if any) and the rest of the list:
(define (split l) (define (split l w) (cond [(null? l) (values (list->string (reverse w)) '())] [(char=? #\newline (car l)) (values (list->string (reverse w)) (cdr l))] [else (split (cdr l) (cons (car l) w))])) (split l '()))
It is a typical multiple-value function, returning two values by traversing a single list. The contract for such a function can use the ordinary function arrow ->, since -> treats values specially when it appears as the last result:
(provide (contract-out [split (-> (listof char?) (values string? (listof char?)))]))
The contract for such a function can also be written using ->*:
(provide (contract-out [split (->* ((listof char?)) () (values string? (listof char?)))]))
As before, the contract for the argument with ->* is wrapped in an extra pair of parentheses (and must always be wrapped like that) and the empty pair of parentheses indicates that there are no optional arguments. The contracts for the results are inside values: a string and a list of characters.

Now, suppose that we also want to ensure that the first result of split is a prefix of the given word in list format. In that case, we need to use the ->i contract combinator:
(define (substring-of s)
  (flat-named-contract
   (format "substring of ~s" s)
   (lambda (s2)
     (and (string? s2)
          (<= (string-length s2) (string-length s))
          (equal? (substring s 0 (string-length s2)) s2)))))

(provide
 (contract-out
  [split (->i ([fl (listof char?)])
              (values [s (fl) (substring-of (list->string fl))]
                      [c (listof char?)]))]))
Like ->*, the ->i combinator uses a function over the argument to create the range contracts. Yes, it doesn't just return one contract but as many as the function produces values: one contract per value. In this case, the second contract is the same as before, ensuring that the second result is a list of chars. In contrast, the first contract strengthens the old one so that the result is a prefix of the given word.

This contract is expensive to check, of course. Here is a slightly cheaper version:
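(provide
 (contract-out
  [split (->i ([fl (listof char?)])
              ; string-len/c accepts only strings strictly shorter than its
              ; argument, so the bound is one more than the input length
              (values [s (fl) (string-len/c (add1 (length fl)))]
                      [c (listof char?)]))]))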
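The following sketch puts split and the cheaper dependent contract into a submodule and receives both results on the client side with define-values. The submodule name splitter, the helper name loop, and the client names line and remaining are assumptions for this illustration; the length bound uses add1 because string-len/c accepts only strings strictly shorter than its argument.

```racket
#lang racket

;; Sketch: a dependent contract on a function with two results.
;; The submodule name `splitter` is hypothetical.
(module splitter racket
  (define (split l)
    (define (loop l w)
      (cond
        [(null? l) (values (list->string (reverse w)) '())]
        [(char=? #\newline (car l))
         (values (list->string (reverse w)) (cdr l))]
        [else (loop (cdr l) (cons (car l) w))]))
    (loop l '()))
  (provide
   (contract-out
    [split (->i ([fl (listof char?)])
                (values [s (fl) (string-len/c (add1 (length fl)))]
                        [c (listof char?)]))])))

(require 'splitter)

(define-values (line remaining) (split (string->list "ab\ncd")))
line      ; => "ab"
remaining ; => '(#\c #\d)
```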
7.3.9 Fixed but Statically Unknown Arities
Imagine yourself writing a contract for a function that accepts some other function and a list of numbers and eventually applies the former to the latter. Unless the arity of the given function matches the length of the given list, your procedure is in trouble. Consider this n-step function:
; (number ... -> (union #f number?)) (listof number) -> void
(define (n-step proc inits)
  (let ([inc (apply proc inits)])
    (when inc
      (n-step proc (map (λ (x) (+ x inc)) inits)))))
The arguments of n-step are proc, a function whose results are either numbers or false, and a list inits. It applies proc to the list inits. As long as proc returns a number, n-step treats that number as an increment for each of the numbers in inits and recurs. When proc returns false, the loop stops. Here are two uses:
; nat -> nat
(define (f x)
  (printf "~s\n" x)
  (if (= x 0) #f -1))
(n-step f '(2))

; nat nat -> nat
(define (g x y)
  (define z (+ x y))
  (printf "~s\n" (list x y z))
  (if (= z 0) #f -1))
(n-step g '(1 1))
A contract for n-step must specify two aspects of proc's behavior: its arity must include the number of elements in inits, and it must return either a number or #f. The latter is easy, the former is difficult. At first glance, this appears to suggest a contract that assigns a variable-arity to proc:
(provide
 (contract-out
  [n-step
   (->i ([proc (inits)
               (and/c (unconstrained-domain->
                       (or/c false/c number?))
                      (λ (f) (procedure-arity-includes?
                              f
                              (length inits))))]
         [inits (listof number?)])
        ()
        any)]))
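A small sketch can show this contract accepting a call whose arity matches and rejecting one whose arity does not. The submodule name stepper, the client function count-down, and the counter calls are hypothetical; count-down mirrors f from above but counts its calls instead of printing.

```racket
#lang racket

;; Sketch: the arity of proc must match the length of inits.
;; The submodule name `stepper` is hypothetical.
(module stepper racket
  (define (n-step proc inits)
    (let ([inc (apply proc inits)])
      (when inc
        (n-step proc (map (lambda (x) (+ x inc)) inits)))))
  (provide
   (contract-out
    [n-step
     (->i ([proc (inits)
                 (and/c (unconstrained-domain-> (or/c false/c number?))
                        (lambda (f)
                          (procedure-arity-includes? f (length inits))))]
           [inits (listof number?)])
          ()
          any)])))

(require 'stepper)

;; hypothetical client function: steps down to 0, counting its calls
(define calls 0)
(define (count-down x)
  (set! calls (add1 calls))
  (if (= x 0) #f -1))

(n-step count-down '(2)) ; applies count-down to 2, then 1, then 0
calls                    ; => 3

;; A one-argument function with a two-element list is rejected.
(define arity-mismatch-rejected?
  (with-handlers ([exn:fail:contract? (lambda (e) #t)])
    (n-step count-down '(1 2))
    #f))
```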
7.4
Contracts: A Thorough Example
This section develops several different flavors of contracts for one and the same example: Racket's argmax function. According to its Racket documentation, the function consumes a procedure proc and a non-empty list of values, lst. It returns the first element in the list lst that maximizes the result of proc. The emphasis on first is ours. Examples:
> (argmax add1 (list 1 2 3)) 3 > (argmax sqrt (list 0.4 0.9 0.16)) 0.9 > (argmax second '((a 2) (b 3) (c 4) (d 1) (e 4))) '(c 4)
Here is the simplest possible contract for this function:
version 1
(provide (contract-out [argmax (-> (-> any/c real?) (and/c pair? list?) any/c)]))
This contract captures two essential conditions of the informal description of argmax:
- the given function must produce numbers that are comparable according to <. In particular, the contract (-> any/c number?) would not do, because number? also recognizes complex numbers in Racket.
- the given list must contain at least one item.
When combined with the name, the contract explains the behavior of argmax at the same level as an ML function type in a module signature (except for the non-empty list aspect).

Contracts may communicate significantly more than a type signature, however. Take a look at this second contract for argmax:
version 2
(provide (contract-out [argmax (->i ([f (-> any/c real?)] [lov (and/c pair? list?)]) () (r (f lov) (lambda (r) (define f@r (f r)) (for/and ([v lov]) (>= f@r (f v))))))]))
It is a dependent contract that names the two arguments and uses the names to impose a predicate on the result. This predicate computes (f r) where r is the result of argmax and then validates that this value is greater than or equal to all values of f on the items of lov. Is it possible that argmax could cheat by returning a random value that accidentally maximizes f over all elements of lov? With a contract, it is possible to rule out this possibility:
version 2 rev. a
(provide (contract-out [argmax (->i ([f (-> any/c real?)] [lov (and/c pair? list?)]) () (r (f lov) (lambda (r) (define f@r (f r)) (and (memq r lov) (for/and ([v lov]) (>= f@r (f v)))))))]))
The memq function ensures that r is intensionally equal to one of the members of lov. Of course, a moment's worth of reflection shows that it is impossible to make up such a value. Functions are opaque values in Racket and without applying a function, it is impossible to determine whether some random input value produces an output value or triggers some exception. So we ignore this possibility from here on.
Intensionally equal: that is, "pointer equality" for those who prefer to think at the hardware level.

Version 2 formulates the overall sentiment of argmax's documentation, but it fails to bring across that the result is the first element of the given list that maximizes the given function f. Here is a version that communicates this second aspect of the informal documentation:
version 3
(provide (contract-out [argmax (->i ([f (-> any/c real?)] [lov (and/c pair? list?)]) () (r (f lov) (lambda (r) (define f@r (f r)) (and (for/and ([v lov]) (>= f@r (f v))) (eq? (first (memf (lambda (v) (= (f v) f@r)) lov)) r)))))]))
That is, the memf function determines the first element of lov whose value under f is equal to r's value under f. If this element is intensionally equal to r, the result of argmax is correct.

This second refinement step introduces two problems. First, both conditions recompute the values of f for all elements of lov. Second, the contract is now quite difficult to read. Contracts should have a concise formulation that a client can comprehend with a simple scan. Let us eliminate the readability problem with two auxiliary functions that have reasonably meaningful names:
version 3 rev. a
(provide
 (contract-out
  [argmax
   (->i ([f (-> any/c real?)] [lov (and/c pair? list?)]) ()
        (r (f lov)
           (lambda (r)
             (define f@r (f r))
             (and (is-first-max? r f@r f lov)
                  (dominates-all f@r f lov)))))]))

; where

; f@r is greater or equal to all (f v) for v in lov
(define (dominates-all f@r f lov)
  (for/and ([v lov]) (>= f@r (f v))))

; r is eq? to the first element v of lov for which (= (f v) f@r)
(define (is-first-max? r f@r f lov)
  (eq? (first (memf (lambda (v) (= (f v) f@r)) lov)) r))
The names of the two predicates express their functionality and, in principle, render it unnecessary to read their definitions.

This step leaves us with the problem of the newly introduced inefficiency. To avoid the recomputation of (f v) for all v on lov, we change the contract so that it computes these values and reuses them as needed:
version 3 rev. b
(provide
 (contract-out
  [argmax
   (->i ([f (-> any/c real?)] [lov (and/c pair? list?)]) ()
        (r (f lov)
           (lambda (r)
             (define f@r (f r))
             (define flov (map f lov))
             (and (is-first-max? r f@r (map list lov flov))
                  (dominates-all f@r flov)))))]))

; where

; f@r is greater or equal to all f@v in flov
(define (dominates-all f@r flov)
  (for/and ([f@v flov]) (>= f@r f@v)))

; r is (first x) for the first x in lov+flov s.t. (= (second x) f@r)
(define (is-first-max? r f@r lov+flov)
  (define fst (first lov+flov))
  (if (= (second fst) f@r)
      (eq? (first fst) r)
      (is-first-max? r f@r (rest lov+flov))))
Now the predicate on the result once again computes the value of f for each element of lov only once.

Version 3 may still be too eager when it comes to calling f. While Racket's argmax always calls f no matter how many items lov contains, let us imagine for illustrative purposes that our own implementation first checks whether the list is a singleton. If so, the first element would be the only element of lov and in that case there would be no need to compute (f r). As a matter of fact, since f may diverge or raise an exception for some inputs, argmax should avoid calling f when possible. The following contract demonstrates how a higher-order dependent contract needs to be adjusted so as to avoid being over-eager:
The word "eager" comes from the literature on the linguistics of contracts.
#lang racket (define (argmax f lov) (if (empty? (rest lov)) (first lov) ...))
version 4
The argmax of Racket implicitly argues that it not only promises the first value that maximizes f over lov but also that f produces/produced a value for the result.
(provide
 (contract-out
  [argmax
   (->i ([f (-> any/c real?)] [lov (and/c pair? list?)]) ()
        (r (f lov)
           (lambda (r)
             (cond
               [(empty? (rest lov)) (eq? (first lov) r)]
               [else
                (define f@r (f r))
                (define flov (map f lov))
                (and (is-first-max? r f@r (map list lov flov))
                     (dominates-all f@r flov))]))))]))

; where

; f@r is greater or equal to all f@v in flov
(define (dominates-all f@r flov) ...)

; r is (first x) for the first x in lov+flov s.t. (= (second x) f@r)
(define (is-first-max? r f@r lov+flov) ...)
Note that such considerations don't apply to the world of first-order contracts. Only a higher-order (or lazy) language forces the programmer to express contracts with such precision.

The problem of diverging or exception-raising functions should alert the reader to the even more general problem of functions with side-effects. If the given function f has visible effects, say it logs its calls to a file, then the clients of argmax will be able to observe two sets of logs for each call to argmax. To be precise, if the list of values contains more than one element, the log will contain two calls of f per value on lov. If f is expensive to compute, doubling the calls imposes a high cost.

To avoid this cost and to signal problems with overly eager contracts, a contract system could record the i/o of contracted function arguments and use these hashtables in the dependency specification. This is a topic of on-going research in PLT. Stay tuned.
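The "first element" requirement of version 3 rev. b can be observed with a sketch that attaches the same contract to two functions: Racket's own argmax, re-exported under the hypothetical name checked-argmax, and a deliberately wrong variant, last-argmax, that returns the last maximizing element. The submodule name checked and the contract name first-max/c are likewise assumptions for this illustration.

```racket
#lang racket

;; Sketch: the version 3 rev. b contract catches an implementation
;; that returns a maximizing element other than the first one.
;; `checked`, `first-max/c`, `checked-argmax`, and `last-argmax`
;; are hypothetical names.
(module checked racket
  ; f@r is greater or equal to all f@v in flov
  (define (dominates-all f@r flov)
    (for/and ([f@v flov]) (>= f@r f@v)))

  ; r is (first x) for the first x in lov+flov s.t. (= (second x) f@r)
  (define (is-first-max? r f@r lov+flov)
    (define fst (first lov+flov))
    (if (= (second fst) f@r)
        (eq? (first fst) r)
        (is-first-max? r f@r (rest lov+flov))))

  (define first-max/c
    (->i ([f (-> any/c real?)] [lov (and/c pair? list?)]) ()
         (r (f lov)
            (lambda (r)
              (define f@r (f r))
              (define flov (map f lov))
              (and (is-first-max? r f@r (map list lov flov))
                   (dominates-all f@r flov))))))

  ; deliberately wrong: picks the LAST maximizing element
  (define (last-argmax f lov)
    (argmax f (reverse lov)))

  (provide
   (contract-out
    [rename argmax checked-argmax first-max/c]
    [last-argmax first-max/c])))

(require 'checked)

(define items '((a 2) (b 3) (c 4) (d 1) (e 4)))

(checked-argmax second items) ; => '(c 4)

;; '(e 4) also maximizes second, but it is not the first such element
(define wrong-element-rejected?
  (with-handlers ([exn:fail:contract? (lambda (e) #t)])
    (last-argmax second items)
    #f))
```

In the second call the violation blames the submodule, since it is the implementation and not the client that broke the promise.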
7.5
Contracts on Structures
Modules deal with structures in two ways. First they export struct definitions, i.e., the ability to create structs of a certain kind, to access their fields, to modify them, and to distinguish structs of this kind against every other kind of value in the world. Second, on occasion a module exports a specific struct and wishes to promise that its fields contain values of a certain kind. This section explains how to protect structs with contracts for both uses.
7.5.1
Guarantees for a Specific Value
If your module defines a variable to be a structure, then you can specify the structure's shape using struct/c:
#lang racket (require lang/posn) (define origin (make-posn 0 0)) (provide (contract-out [origin (struct/c posn zero? zero?)]))
In this example, the module imports a library for representing positions, which exports a posn structure. One of the posns it creates and exports stands for the origin, i.e., (0,0), of the grid.
See also vector/c and similar contract combinators for (flat) compound data.
7.5.2
Guarantees for All Values
The book How to Design Programs teaches that posns should contain only numbers in their two fields. With contracts we would enforce this informal data definition as follows:
#lang racket (struct posn (x y)) (provide (contract-out [struct posn ((x number?) (y number?))] [p-okay posn?] [p-sick posn?])) (define p-okay (posn 10 20)) (define p-sick (posn 'a 'b))
This module exports the entire structure definition: posn, posn?, posn-x, posn-y, set-posn-x!, and set-posn-y!. Each function enforces or promises that the two fields of a posn structure are numbers when the values flow across the module boundary. Thus, if a client calls posn on 10 and 'a, the contract system signals a contract violation.

The creation of p-sick inside of the posn module, however, does not violate the contracts. The function posn is used internally, so 'a and 'b don't cross the module boundary. Similarly, when p-sick crosses the boundary of posn, the contract promises a posn? and nothing else. In particular, this check does not require that the fields of p-sick are numbers.

The association of contract checking with module boundaries implies that p-okay and p-sick look alike from a client's perspective until the client extracts the pieces.
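A minimal sketch of such a client follows, with the posn module written as a submodule so that everything fits in one file. The submodule name posn-lib and the client name sick-detected? are assumptions for this illustration.

```racket
#lang racket

;; Sketch: p-okay and p-sick both pass posn?, but extracting a
;; field of p-sick violates the contract on the selector.
;; The submodule name `posn-lib` is hypothetical.
(module posn-lib racket
  (struct posn (x y))
  (define p-okay (posn 10 20))
  (define p-sick (posn 'a 'b))
  (provide
   (contract-out
    [struct posn ((x number?) (y number?))]
    [p-okay posn?]
    [p-sick posn?])))

(require 'posn-lib)

(posn? p-sick)  ; => #t
(posn-x p-okay) ; => 10

;; The selector promises a number, so this signals a violation
;; that blames the module that created p-sick.
(define sick-detected?
  (with-handlers ([exn:fail:contract? (lambda (e) #t)])
    (posn-x p-sick)
    #f))
sick-detected?  ; => #t
```

The violation surfaces only at the selector call, which is the first point at which a field value crosses the module boundary.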
7.5.3
Checking Properties of Data Structures
Contracts written using struct/c immediately check the fields of the data structure, but sometimes this can have disastrous effects on the performance of a program that does not, itself, inspect the entire data structure.

As an example, consider the binary search tree search algorithm. A binary search tree is like a binary tree, except that the numbers are organized in the tree to make searching the tree fast. In particular, for each interior node in the tree, all of the numbers in the left subtree are smaller than the number in the node, and all of the numbers in the right subtree are larger than the number in the node.

We can implement a search function in? that takes advantage of the structure of the binary search tree.
#lang racket

(struct node (val left right))

; determines if `n' is in the binary search tree `b',
; exploiting the binary search tree invariant
(define (in? n b)
  (cond
    [(null? b) #f]
    [else (cond
            [(= n (node-val b)) #t]
            [(< n (node-val b)) (in? n (node-left b))]
            [(> n (node-val b)) (in? n (node-right b))])]))

; a predicate that identifies binary search trees
(define (bst-between? b low high)
  (or (null? b)
      (and (<= low (node-val b) high)
           (bst-between? (node-left b) low (node-val b))
           (bst-between? (node-right b) (node-val b) high))))

(define (bst? b) (bst-between? b -inf.0 +inf.0))

(provide (struct node (val left right)))
(provide
 (contract-out
  [bst? (any/c . -> . boolean?)]
  [in? (number? bst? . -> . boolean?)]))
In a full binary search tree, this means that the in? function only has to explore a logarithmic number of nodes.

The contract on in? guarantees that its input is a binary search tree. But a little careful thought reveals that this contract defeats the purpose of the binary search tree algorithm. In particular, consider the inner cond in the in? function. This is where the in? function gets its speed: it avoids searching an entire subtree at each recursive call. Now compare that to the bst-between? function. In the case that it returns #t, it traverses the entire tree, meaning that the speedup of in? is lost.

In order to fix that, we can employ a new strategy for checking the binary search tree contract. In particular, if we only checked the contract on the nodes that in? looks at, we can still guarantee that the tree is at least partially well-formed, but without changing the complexity.

To do that, we need to use define-contract-struct in place of struct. Like struct (and more like define-struct), define-contract-struct defines a maker, predicate, and selectors for a new structure. Unlike define-struct, it also defines contract combinators, in this case node/c and node/dc. Also unlike define-struct, it does not allow mutators, making its structs always immutable.

The node/c function accepts a contract for each field of the struct and returns a contract on the struct. More interestingly, the syntactic form node/dc allows us to write dependent contracts, i.e., contracts where some of the contracts on the fields depend on the values of other fields. We can use this to define the binary search tree contract:
#lang racket

(define-contract-struct node (val left right))

; determines if `n' is in the binary search tree `b'
(define (in? n b) ... as before ...)

; bst-between/c : number number -> contract
; builds a contract for binary search trees
; whose values are between low and high
(define (bst-between/c low high)
  (or/c null?
        (node/dc [val (between/c low high)]
                 [left (val) (bst-between/c low val)]
                 [right (val) (bst-between/c val high)])))

(define bst/c (bst-between/c -inf.0 +inf.0))

(provide make-node node-left node-right node-val node?)
(provide
 (contract-out
  [bst/c contract?]
  [in? (number? bst/c . -> . boolean?)]))
In general, each use of node/dc must name the fields and then specify contracts for each field. In the above, the val field is a contract that accepts values between low and high. The left and right fields are dependent on the value of the val field, indicated by their second sub-expressions. Their contracts are built by recursive calls to the bst-between/c function. Taken together, this contract ensures the same thing that the bst-between? function checked in the original example, but here the checking only happens as in? explores the tree.

Although this contract improves the performance of in?, restoring it to the logarithmic behavior that the contract-less version had, it still imposes a fairly large constant overhead. So, the contract library also provides define-opt/c that brings down that constant factor by optimizing its body. Its shape is just like the define above. It expects its body to be a contract and then optimizes that contract.
(define-opt/c (bst-between/c low high) (or/c null? (node/dc [val (between/c low high)] [left (val) (bst-between/c low val)] [right (val) (bst-between/c val high)])))
7.6
Abstract Contracts using #:exists and #:∃
The contract system provides existential contracts that can protect abstractions, ensuring that clients of your module cannot depend on the precise representation choices you make for your data structures. The contract-out form allows you to write
#:∃ name-of-a-new-contract
as one of its clauses. This declaration introduces the variable name-of-a-new-contract, binding it to a new contract that hides information about the values it protects.
You can type #:exists instead of #:∃ if you cannot easily type unicode characters; in DrRacket, typing \exists followed by either alt-\ or control-\ (depending on your platform) will produce ∃.
As an example, consider this (simple) implementation of a queue datastructure:
#lang racket (define empty '()) (define (enq top queue) (append queue (list top))) (define (next queue) (car queue)) (define (deq queue) (cdr queue)) (define (empty? queue) (null? queue)) (provide (contract-out [empty (listof integer?)] [enq (-> integer? (listof integer?) (listof integer?))] [next (-> (listof integer?) integer?)] [deq (-> (listof integer?) (listof integer?))] [empty? (-> (listof integer?) boolean?)]))
This code implements a queue purely in terms of lists, meaning that clients of this data structure might use car and cdr directly on the data structure (perhaps accidentally) and thus any change in the representation (say to a more efficient representation that supports amortized constant time enqueue and dequeue operations) might break client code.

To ensure that the queue representation is abstract, we can use #:∃ in the contract-out expression, like this:
(provide
 (contract-out
  #:∃ queue
  [empty queue]
  [enq (-> integer? queue queue)]
  [next (-> queue integer?)]
  [deq (-> queue queue)]
  [empty? (-> queue boolean?)]))
Now, if clients of the data structure try to use car and cdr, they receive an error, rather than mucking about with the internals of the queues. See also 7.8.2 Exists Contracts and Predicates.
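The effect can be tried out with the following sketch, which puts the list-based queue in a submodule. It uses the ASCII spelling #:exists, and it renames empty and empty? to empty-queue and queue-empty? to avoid shadowing Racket's own bindings; those names, the submodule name queue-lib, and the client names are assumptions for this illustration.

```racket
#lang racket

;; Sketch: an existential contract hides the list representation.
;; `queue-lib`, `empty-queue`, and `queue-empty?` are hypothetical names.
(module queue-lib racket
  (define empty-queue '())
  (define (enq top queue) (append queue (list top)))
  (define (next queue) (car queue))
  (define (deq queue) (cdr queue))
  (define (queue-empty? queue) (null? queue))
  (provide
   (contract-out
    #:exists queue
    [empty-queue queue]
    [enq (-> integer? queue queue)]
    [next (-> queue integer?)]
    [deq (-> queue queue)]
    [queue-empty? (-> queue boolean?)])))

(require 'queue-lib)

(define q (enq 2 (enq 1 empty-queue)))
(next q)       ; => 1
(next (deq q)) ; => 2

;; Outside the module, q is an opaque value rather than a list,
;; so list operations on it fail.
(define car-rejected?
  (with-handlers ([exn:fail? (lambda (e) #t)])
    (car q)
    #f))
```

Note that deq must return the abstract queue contract as well; otherwise its result could not be passed back to next or deq.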
7.7
Additional Examples
This section illustrates the current state of Racket's contract implementation with a series of examples from Design by Contract, by Example [Mitchell02].

Mitchell and McKim's principles for design by contract (DbC) are derived from the 1970s-style algebraic specifications. The overall goal of DbC is to specify the constructors of an algebra in terms of its observers. While we reformulate Mitchell and McKim's terminology and we use a mostly applicative approach, we retain their terminology of "classes" and "objects":
- Separate queries from commands. A query returns a result but does not change the observable properties of an object. A command changes the visible properties of an object, but does not return a result. In an applicative implementation, a command typically returns a new object of the same class.
- Separate basic queries from derived queries. A derived query returns a result that is computable in terms of basic queries.
- For each derived query, write a post-condition contract that specifies the result in terms of the basic queries.
- For each command, write a post-condition contract that specifies the changes to the observable properties in terms of the basic queries.
- For each query and command, decide on a suitable pre-condition contract.

Each of the following sections corresponds to a chapter in Mitchell and McKim's book (but not all chapters show up here). We recommend that you read the contracts first (near the end of the first modules), then the implementation (in the first modules), and then the test module (at the end of each section).

Mitchell and McKim use Eiffel as the underlying programming language and employ a conventional imperative programming style. Our long-term goal is to transliterate their examples into applicative Racket, structure-oriented imperative Racket, and Racket's class system.

Note: To mimic Mitchell and McKim's informal notion of parametericity (parametric polymorphism), we use first-class contracts. At several places, this use of first-class contracts improves on Mitchell and McKim's design (see comments in interfaces).
7.7.1
A Customer-Manager Component
This first module contains some struct definitions in a separate module in order to better track bugs.
#lang racket ; data definitions (define id? symbol?) (define id-equal? eq?) (define-struct basic-customer (id name address) #:mutable) ; interface (provide (contract-out [id? (-> any/c boolean?)] [id-equal? (-> id? id? boolean?)] [struct basic-customer ((id id?) (name string?) (address string?))])) ; end of interface
This module contains the program that uses the above.
#lang racket
(require "1.rkt") ; the module just above

; implementation
; [listof (list basic-customer? secret-info)]
(define all '())
(define (find c)
  (define (has-c-as-key p)
    (id-equal? (basic-customer-id (car p)) c))
  (define x (filter has-c-as-key all))
  (if (pair? x) (car x) x))
(define (active? c)
  (pair? (find c)))
(define not-active? (compose not active? basic-customer-id))
(define count 0)
(define (add c)
  (set! all (cons (list c 'secret) all))
  (set! count (+ count 1)))
(define (name id)
  (define bc-with-id (find id))
  (basic-customer-name (car bc-with-id)))
(define (set-name id name)
  (define bc-with-id (find id))
  (set-basic-customer-name! (car bc-with-id) name))
(define c0 0)
; end of implementation

(provide
 (contract-out
  ; how many customers are in the db?
  [count natural-number/c]
  ; is the customer with this id active?
  [active? (-> id? boolean?)]
  ; what is the name of the customer with this id?
  [name (-> (and/c id? active?) string?)]
  ; change the name of the customer with this id
  [set-name (->d ([id id?] [nn string?])
                 ()
                 [result any/c] ; result contract
                 #:post-cond (string=? (name id) nn))]
  [add (->d ([bc (and/c basic-customer? not-active?)])
            ()
            ; A pre-post condition contract must use
            ; a side-effect to express this contract
            ; via post-conditions
            #:pre-cond (set! c0 count)
            [result any/c] ; result contract
            #:post-cond (> count c0))]))
The tests:
#lang racket
(require rackunit rackunit/text-ui "1.rkt" "1b.rkt")

(add (make-basic-customer 'mf "matthias" "brookstone"))
(add (make-basic-customer 'rf "robby" "beverly hills park"))
(add (make-basic-customer 'fl "matthew" "pepper clouds town"))
(add (make-basic-customer 'sk "shriram" "i city"))

(run-tests
 (test-suite
  "manager"
  (test-equal? "id lookup" "matthias" (name 'mf))
  (test-equal? "count" 4 count)
  (test-true "active?" (active? 'mf))
  (test-false "active? 2" (active? 'kk))
  (test-true "set name" (void? (set-name 'mf "matt")))))
7.7.2 A Parametric (Simple) Stack
#lang racket

; a contract utility
(define (eq/c x) (lambda (y) (eq? x y)))

(define-struct stack (list p? eq))

(define (initialize p? eq) (make-stack '() p? eq))
(define (push s x)
  (make-stack (cons x (stack-list s)) (stack-p? s) (stack-eq s)))
(define (item-at s i) (list-ref (reverse (stack-list s)) (- i 1)))
(define (count s) (length (stack-list s)))
(define (is-empty? s) (null? (stack-list s)))
(define not-empty? (compose not is-empty?))
(define (pop s)
  (make-stack (cdr (stack-list s)) (stack-p? s) (stack-eq s)))
(define (top s) (car (stack-list s)))

(provide
 (contract-out
  ; predicate
  [stack? (-> any/c boolean?)]

  ; primitive queries
  ; how many items are on the stack?
  [count (-> stack? natural-number/c)]

  ; which item is at the given position?
  [item-at
   (->d ([s stack?] [i (and/c positive? (<=/c (count s)))])
        ()
        [result (stack-p? s)])]

  ; derived queries
  ; is the stack empty?
  [is-empty?
   (->d ([s stack?])
        ()
        [result (eq/c (= (count s) 0))])]

  ; which item is at the top of the stack
  [top
   (->d ([s (and/c stack? not-empty?)])
        ()
        [t (stack-p? s)] ; a stack item, t is its name
        #:post-cond
        ([stack-eq s] t (item-at s (count s))))]

  ; creation
  [initialize
   (->d ([p contract?] [s (p p . -> . boolean?)])
        ()
        ; Mitchell and McKim use (= (count s) 0) here to express
        ; the post-condition in terms of a primitive query
        [result (and/c stack? is-empty?)])]

  ; commands
  ; add an item to the top of the stack
  [push
   (->d ([s stack?] [x (stack-p? s)])
        ()
        [sn stack?] ; result kind
        #:post-cond
        (and (= (+ (count s) 1) (count sn))
             ([stack-eq s] x (top sn))))]

  ; remove the item at the top of the stack
  [pop
   (->d ([s (and/c stack? not-empty?)])
        ()
        [sn stack?] ; result kind
        #:post-cond
        (= (- (count s) 1) (count sn)))]))
The tests:
#lang racket
(require rackunit rackunit/text-ui "2.rkt")

(define s0 (initialize (flat-contract integer?) =))
(define s2 (push (push s0 2) 1))

(run-tests
 (test-suite
  "stack"
  (test-true
   "empty"
   (is-empty? (initialize (flat-contract integer?) =)))
  (test-true "push" (stack? s2))
  (test-true
   "push exn"
   (with-handlers ([exn:fail:contract? (lambda _ #t)])
     (push (initialize (flat-contract integer?)) 'a)
     #f))
  (test-true "pop" (stack? (pop s2)))
  (test-equal? "top" (top s2) 1)
  (test-equal? "toppop" (top (pop s2)) 2)))
7.7.3 A Dictionary
#lang racket

; a shorthand for use below
(define-syntax ⇒
  (syntax-rules ()
    [(⇒ antecedent consequent) (if antecedent consequent #t)]))

; implementation
(define-struct dictionary (l value? eq?))
; the keys should probably be another parameter (exercise)

(define (initialize p eq) (make-dictionary '() p eq))
(define (put d k v)
  (make-dictionary (cons (cons k v) (dictionary-l d))
                   (dictionary-value? d)
                   (dictionary-eq? d)))
(define (rem d k)
  (make-dictionary
   (let loop ([l (dictionary-l d)])
     (cond
       [(null? l) l]
       [(eq? (caar l) k) (loop (cdr l))]
       [else (cons (car l) (loop (cdr l)))]))
   (dictionary-value? d)
   (dictionary-eq? d)))
(define (count d) (length (dictionary-l d)))
(define (value-for d k) (cdr (assq k (dictionary-l d))))
(define (has? d k) (pair? (assq k (dictionary-l d))))
(define (not-has? d) (lambda (k) (not (has? d k))))
; end of implementation

; interface
(provide
 (contract-out
  ; predicates
  [dictionary? (-> any/c boolean?)]
  ; basic queries
  ; how many items are in the dictionary?
  [count (-> dictionary? natural-number/c)]
  ; does the dictionary define key k?
  [has? (->d ([d dictionary?] [k symbol?])
             ()
             [result boolean?]
             #:post-cond
             ((zero? (count d)) . ⇒ . (not result)))]
  ; what is the value of key k in this dictionary?
  [value-for (->d ([d dictionary?]
                   [k (and/c symbol? (lambda (k) (has? d k)))])
                  ()
                  [result (dictionary-value? d)])]
  ; initialization
  ; post condition: for all k in symbol, (has? d k) is false.
  [initialize (->d ([p contract?] [eq (p p . -> . boolean?)])
                   ()
                   [result (and/c dictionary? (compose zero? count))])]
  ; commands
  ; Mitchell and McKim say that put shouldn't consume Void (null ptr)
  ; for v. We allow the client to specify a contract for all values
  ; via initialize. We could do the same via a key? parameter
  ; (exercise). add key k with value v to this dictionary
  [put (->d ([d dictionary?]
             [k (and/c symbol? (not-has? d))]
             [v (dictionary-value? d)])
            ()
            [result dictionary?]
            #:post-cond
            (and (has? result k)
                 (= (count d) (- (count result) 1))
                 ([dictionary-eq? d] (value-for result k) v)))]
  ; remove key k from this dictionary
  [rem (->d ([d dictionary?]
             [k (and/c symbol? (lambda (k) (has? d k)))])
            ()
            [result (and/c dictionary? not-has?)]
            #:post-cond
            (= (count d) (+ (count result) 1)))]))
; end of interface
The tests:
#lang racket
(require rackunit rackunit/text-ui "3.rkt")

(define d0 (initialize (flat-contract integer?) =))
(define d (put (put (put d0 'a 2) 'b 2) 'c 1))

(run-tests
 (test-suite
  "dictionaries"
  (test-equal? "value for" 2 (value-for d 'b))
  (test-false "has?" (has? (rem d 'b) 'b))
  (test-equal? "count" 3 (count d))))
7.7.4 A Queue
#lang racket

; Note: this queue doesn't implement the capacity restriction
; of Mitchell and McKim's queue but this is easy to add.

; a contract utility
(define (all-but-last l) (reverse (cdr (reverse l))))
(define (eq/c x) (lambda (y) (eq? x y)))

; implementation
(define-struct queue (list p? eq))

(define (initialize p? eq) (make-queue '() p? eq))
(define items queue-list)
(define (put q x)
  (make-queue (append (queue-list q) (list x))
              (queue-p? q)
              (queue-eq q)))
(define (count s) (length (queue-list s)))
(define (is-empty? s) (null? (queue-list s)))
(define not-empty? (compose not is-empty?))
(define (rem s)
  (make-queue (cdr (queue-list s))
              (queue-p? s)
              (queue-eq s)))
(define (head s) (car (queue-list s)))

; interface
(provide
 (contract-out
  ; predicate
  [queue? (-> any/c boolean?)]

  ; primitive queries
  ; Imagine providing this 'query' for the interface of the module
  ; only. Then in Racket there is no reason to have count or is-empty?
  ; around (other than providing it to clients). After all items is
  ; exactly as cheap as count.
  [items (->d ([q queue?]) () [result (listof (queue-p? q))])]

  ; derived queries
  [count (->d ([q queue?])
              ; We could express this second part of the post
              ; condition even if count were a module "attribute"
              ; in the language of Eiffel; indeed it would use the
              ; exact same syntax (minus the arrow and domain).
              ()
              [result (and/c natural-number/c
                             (=/c (length (items q))))])]

  [is-empty? (->d ([q queue?])
                  ()
                  [result (and/c boolean?
                                 (eq/c (null? (items q))))])]

  [head (->d ([q (and/c queue? (compose not is-empty?))])
             ()
             [result (and/c (queue-p? q)
                            (eq/c (car (items q))))])]

  ; creation
  [initialize (-> contract?
                  (contract? contract? . -> . boolean?)
                  (and/c queue? (compose null? items)))]

  ; commands
  [put (->d ([oldq queue?] [i (queue-p? oldq)])
            ()
            [result
             (and/c
              queue?
              (lambda (q)
                (define old-items (items oldq))
                (equal? (items q) (append old-items (list i)))))])]

  [rem (->d ([oldq (and/c queue? (compose not is-empty?))])
            ()
            [result
             (and/c queue?
                    (lambda (q)
                      (equal? (cdr (items oldq)) (items q))))])]))
; end of interface
The tests:
(define s (put (put (initialize (flat-contract integer?) =) 2) 1))

(run-tests
 (test-suite
  "queue"
  (test-true
   "empty"
   (is-empty? (initialize (flat-contract integer?) =)))
  (test-true "put" (queue? s))
  (test-equal? "count" 2 (count s))
  (test-true
   "put exn"
   (with-handlers ([exn:fail:contract? (lambda _ #t)])
     (put (initialize (flat-contract integer?)) 'a)
     #f))
  (test-true "remove" (queue? (rem s)))
  (test-equal? "head" 2 (head s))))
7.8 Gotchas
7.8.1 Contracts and eq?
As a general rule, adding a contract to a program should either leave the behavior of the program unchanged, or should signal a contract violation. And this is almost true for Racket contracts, with one exception: eq?. The eq? procedure is designed to be fast and does not provide much in the way of guarantees, except that if it returns true, it means that the two values behave identically in all respects. Internally, this is implemented as pointer equality at a low level, so it exposes information about how Racket is implemented (and how contracts are implemented). Contracts interact poorly with eq? because function contract checking is implemented internally as wrapper functions. For example, consider this module:
#lang racket

(define (make-adder x)
  (if (= 1 x)
      add1
      (lambda (y) (+ x y))))

(provide (contract-out
          [make-adder (-> number? (-> number? number?))]))
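It exports the make-adder function that is the usual curried addition function, except that it returns Racket's add1 when its input is 1. You might expect that (eq? (make-adder 1) (make-adder 1)) would return #t, but it does not, because each call wraps add1 in a fresh contract wrapper.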
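The interaction between eq? and the contract wrapper can be tried out in a single file. The following is a minimal sketch, assuming the module above is placed in a submodule with the hypothetical name adder; the names inc-a and inc-b are likewise only for illustration.

```racket
#lang racket

;; Hypothetical submodule name `adder`; its body repeats the module above.
(module adder racket
  (define (make-adder x)
    (if (= 1 x)
        add1
        (lambda (y) (+ x y))))
  (provide (contract-out
            [make-adder (-> number? (-> number? number?))])))

(require 'adder)

;; Two results for the same input; both are add1 behind a contract.
(define inc-a (make-adder 1))
(define inc-b (make-adder 1))

;; Both behave like add1 ...
(inc-a 41)
(inc-b 41)

;; ... but eq? compares the wrappers, not the underlying add1.
(eq? inc-a inc-b)
```

Comparing the results of calling the two functions, rather than the functions themselves, avoids depending on how the contract system wraps values.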
7.8.2
Much like the eq? example above, #:∃ contracts can change the behavior of a program. Specifically, the null? predicate (and many other predicates) returns #f for values protected by #:∃ contracts, and changing one of those contracts to any/c means that null? might now return #t instead, resulting in arbitrarily different behavior depending on how this boolean flows around in the program.
#lang racket/exists
To work around the above problem, the racket/exists library behaves just like racket, except that predicates signal errors when given values protected by #:∃ contracts. Moral: Do not use predicates on #:∃ contracts, but if you're not sure, use racket/exists to be safe.
7.8.3
When defining a self-referential contract, it is natural to use define. For example, one might try to write a contract on streams like this:
> (define stream/c
    (promise/c
     (or/c null?
           (cons/c number? stream/c))))
reference to undefined identifier: stream/c
Unfortunately, this does not work because the value of stream/c is needed before it is defined. Put another way, all of the combinators evaluate their arguments eagerly, even though the values that they accept do not.
Instead, use recursive-contract, which delays the reference to stream/c until the contract is first checked.
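A self-referential stream contract can be written with recursive-contract from racket/contract. The following is a minimal sketch; the sample stream ones and the blame labels 'producer and 'consumer are made up for illustration.

```racket
#lang racket

;; recursive-contract postpones looking up stream/c until the contract
;; is actually applied, so the definition can refer to itself.
(define stream/c
  (promise/c
   (or/c null?
         (cons/c number? (recursive-contract stream/c)))))

;; A hypothetical two-element stream, used only for illustration.
(define ones (delay (cons 1 (delay (cons 1 (delay '()))))))

;; Attach the contract by hand; the two symbols are made-up blame labels.
(define checked (contract stream/c ones 'producer 'consumer))

;; Forcing the promise checks the first cell against the contract.
(car (force checked))
```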
7.8.4
The contract library assumes that variables exported via contract-out are not assigned to, but does not enforce it. Accordingly, if you try to set! those variables, you may be surprised. Consider the following example:
> (module server racket
    (define (inc-x!) (set! x (+ x 1)))
    (define x 0)
    (provide (contract-out [inc-x! (-> void?)]
                           [x integer?])))
> (module client racket
    (require 'server)

    (define (print-latest) (printf "x is ~s\n" x))

    (print-latest)
    (inc-x!)
    (print-latest))
> (require 'client)
x is 0
x is 0
Both calls to print-latest print 0, even though the value of x has been incremented (and the change is visible inside the server module). To work around this, export accessor functions, rather than exporting the variable directly, like this:
#lang racket

(define (get-x) x)
(define (inc-x!) (set! x (+ x 1)))
(define x 0)
(provide (contract-out [inc-x! (-> void?)]
                       [get-x (-> integer?)]))
Moral: This is a bug that we will address in a future release.
A Racket port represents an input or output stream, such as a file, a terminal, a TCP connection, or an in-memory string. More specifically, an input port represents a stream from which a program can read data, and an output port represents a stream for writing data.
8.1 Varieties of Ports
Various functions create various kinds of ports. Here are a few examples:
Files: The open-output-file function opens a file for writing, and open-input-file opens a file for reading. Examples:
> (define out (open-output-file "data"))
> (display "hello" out)
> (close-output-port out)
> (define in (open-input-file "data"))
> (read-line in)
"hello"
> (close-input-port in)
If a file exists already, then open-output-file raises an exception by default. Supply an option like #:exists 'truncate or #:exists 'update to re-write or update the file. Examples:
> (define out (open-output-file "data" #:exists 'truncate))
> (display "howdy" out)
> (close-output-port out)
Instead of having to pair each open-input-file or open-output-file call with a matching close call, most Racket programmers use call-with-input-file and call-with-output-file, which take a function to call with the port; when the function returns, the port is closed. Examples:
> (call-with-output-file "data"
    #:exists 'truncate
    (lambda (out)
      (display "hello" out)))
> (call-with-input-file "data"
    (lambda (in)
      (read-line in)))
"hello"
Strings: The open-output-string function creates a port that accumulates data into a string, and get-output-string extracts the accumulated string. The open-input-string function creates a port to read from a string. Examples:
> (define p (open-output-string))
> (display "hello" p)
> (get-output-string p)
"hello"
> (read-line (open-input-string "goodbye\nfarewell"))
"goodbye"
TCP Connections: The tcp-connect function creates both an input port and an output port for the client side of a TCP communication. The tcp-listen function creates a server, which accepts connections via tcp-accept. Examples:
> (define server (tcp-listen 12345))
> (define-values (c-in c-out) (tcp-connect "localhost" 12345))
> (define-values (s-in s-out) (tcp-accept server))
> (display "hello\n" c-out)
> (close-output-port c-out)
> (read-line s-in)
"hello"
> (read-line s-in)
#<eof>
Process Pipes: The subprocess function runs a new process at the OS level and returns ports that correspond to the subprocess's stdin, stdout, and stderr. (The first three arguments can be certain kinds of existing ports to connect directly to the subprocess, instead of creating new ports.) Examples:
> (define-values (p stdout stdin stderr)
    (subprocess #f #f #f "/usr/bin/wc" "-w"))
> (display "a b c\n" stdin)
> (close-output-port stdin)
> (read-line stdout)
" 3"
> (close-input-port stdout)
> (close-input-port stderr)
Internal Pipes: The make-pipe function returns two ports that are ends of a pipe. This kind of pipe is internal to Racket, and not related to OS-level pipes for communicating between different processes. Examples:
> (define-values (in out) (make-pipe))
> (display "garbage" out)
> (close-output-port out)
> (read-line in)
"garbage"
8.2 Default Ports
For most simple I/O functions, the target port is an optional argument, and the default is the current input port or current output port. Furthermore, error messages are written to the current error port, which is an output port. The current-input-port, current-output-port, and current-error-port functions return the corresponding current ports. Examples:
> (let ([s (open-output-string)])
    (parameterize ([current-error-port s])
      (swing-hammer)
      (swing-hammer)
      (swing-hammer))
    (get-output-string s))
"Ouch!Ouch!Ouch!"
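The transcript above calls swing-hammer, whose definition does not appear here. The following is a minimal sketch, assuming swing-hammer simply writes "Ouch!" to the current error port; it also shows that current-error-port is a parameter, so parameterize can redirect it for the duration of a body. The name captured is hypothetical.

```racket
#lang racket

;; Assumed definition: this version simply writes "Ouch!" to whatever
;; the current error port is at the time of the call.
(define (swing-hammer)
  (display "Ouch!" (current-error-port)))

;; Redirect the error port into a string port while the body runs.
(define captured
  (let ([s (open-output-string)])
    (parameterize ([current-error-port s])
      (swing-hammer)
      (swing-hammer)
      (swing-hammer))
    (get-output-string s)))

captured
```

Outside the parameterize body, current-error-port reverts to its previous value, so later calls to swing-hammer write to the original error port again.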
8.3
As noted throughout §3 "Built-In Datatypes", Racket provides three ways to print an instance of a built-in value: print, which prints a value in the same way that it is printed for a REPL result; write, which prints a value in such a way that read on the output produces the value back; and display, which tends to reduce a value to just its character or byte content, at least for those datatypes that are primarily about characters or bytes; otherwise, it falls back to the same output as write.
> (print 1/2)
1/2
> (print #\x)
#\x
> (print "hello")
"hello"
> (print #"goodbye")
#"goodbye"
> (print '|pea pod|)
'|pea pod|
> (print '("i" pod))
'("i" pod)
> (print write)
#<procedure:write>

> (write 1/2)
1/2
> (write #\x)
#\x
> (write "hello")
"hello"
> (write #"goodbye")
#"goodbye"
> (write '|pea pod|)
|pea pod|
> (write '("i" pod))
("i" pod)
> (write write)
#<procedure:write>

> (display 1/2)
1/2
> (display #\x)
x
> (display "hello")
hello
> (display #"goodbye")
goodbye
> (display '|pea pod|)
pea pod
> (display '("i" pod))
(i pod)
> (display write)
#<procedure:write>
Overall, print corresponds to the expression layer of Racket syntax, write corresponds to the reader layer, and display roughly corresponds to the character layer. The printf function supports simple formatting of data and text. In the format string supplied to printf, ~a displays the next argument, ~s writes the next argument, and ~v prints the next argument. Examples:
(define (deliver who when what)
  (printf "Items ~a for shopper ~s: ~v" who when what))
> (deliver '("list") '("John") '("milk"))
Items (list) for shopper ("John"): '("milk")
After using write, as opposed to display or print, many forms of data can be read back in using read. The same values printed can also be parsed by read, but the result may have extra quote forms, since a printed form is meant to be read like an expression. Examples:
> (define-values (in out) (make-pipe))
> (write "hello" out)
> (read in)
"hello"
> (write '("alphabet" soup) out)
> (read in)
'("alphabet" soup)
> (write #hash((a . "apple") (b . "banana")) out)
> (read in)
'#hash((b . "banana") (a . "apple"))
> (print '("alphabet" soup) out)
> (read in)
''("alphabet" soup)
> (display '("alphabet" soup) out)
> (read in)
'(alphabet soup)
8.4
Prefab structure types (see §5.7 "Prefab Structure Types") automatically support serialization: they can be written to an output stream, and a copy can be read back in from an input stream:
> (define-values (in out) (make-pipe))
> (write #s(sprout bean) out)
> (read in)
'#s(sprout bean)
Other structure types created by struct, which offer more abstraction than prefab structure types, normally write either using #<....> notation (for opaque structure types) or using #(....) vector notation (for transparent structure types). In neither case can the result be read back in as an instance of the structure type:
> (struct posn (x y))
> (define-values (in out) (make-pipe))
> (write (posn 1 2) out)
> (read in)
UNKNOWN::0: read: bad syntax #<
> (struct posn (x y) #:transparent)
> (write (posn 1 2))
#(struct:posn 1 2)
> (define-values (in out) (make-pipe))
> (write (posn 1 2) out)
> (define v (read in))
> v
'#(struct:posn 1 2)
> (posn? v)
#f
> (vector? v)
#t
The serializable-struct form defines a structure type that can be serialized to a value that can be printed using write and restored via read. The serialized result can be deserialized to get back an instance of the original structure type. The serialization form and functions are provided by the racket/serialize library. Examples:
> (require racket/serialize)
> (serializable-struct posn (x y) #:transparent)
> (deserialize (serialize (posn 1 2)))
(posn 1 2)
> (write (serialize (posn 1 2)))
((3) 1 ((#f . deserialize-info:posn-v0)) 0 () () (0 1 2))
> (define-values (in out) (make-pipe))
> (write (serialize (posn 1 2)) out)
8.5
Functions like read-line, read, display, and write all work in terms of characters (which correspond to Unicode scalar values). Conceptually, they are implemented in terms of read-char and write-char. More primitively, ports read and write bytes, instead of characters. The functions read-byte and write-byte read and write raw bytes. Other functions, such as read-bytes-line, build on top of byte operations instead of character operations. In fact, the read-char and write-char functions are conceptually implemented in terms of read-byte and write-byte. When a single byte's value is less than 128, then it corresponds to an ASCII character. Any other byte is treated as part of a UTF-8 sequence, where UTF-8 is a particular standard way of encoding Unicode scalar values in bytes (which has the nice property that ASCII characters are encoded as themselves). Thus, a single read-char may call read-byte multiple times, and a single write-char may generate multiple output bytes. The read-char and write-char operations always use a UTF-8 encoding. If you have a text stream that uses a different encoding, or if you want to generate a text stream in a different encoding, use reencode-input-port or reencode-output-port. The reencode-input-port function converts an input stream from an encoding that you specify into a UTF-8 stream; that way, read-char sees UTF-8 encodings, even though the original used a different encoding. Beware, however, that read-byte also sees the re-encoded data, instead of the original byte stream.
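The relationship between characters and bytes can be observed directly on string and byte-string ports. The following is a small sketch; the variable names are chosen only for illustration.

```racket
#lang racket

;; The character λ (U+03BB) is encoded in UTF-8 as the two bytes 206 and 187.
(define in (open-input-string "λx"))
(define first-byte (read-byte in))   ; first byte of the encoding of λ
(define second-byte (read-byte in))  ; second byte of the encoding of λ
(define next-char (read-char in))    ; x is ASCII, so one byte, one character

;; Writing a single character can produce more than one byte.
(define out (open-output-bytes))
(write-char #\λ out)
(define encoded (get-output-bytes out))

(list first-byte second-byte next-char (bytes-length encoded))
```

Reading the same port with read-char from the start would instead consume both bytes of λ in a single call.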
8.6 I/O Patterns
If you want to process individual lines of a file, then you can use for with in-lines:
> (define (upcase-all in)
    (for ([l (in-lines in)])
      (display (string-upcase l))
      (newline)))
> (upcase-all (open-input-string
               (string-append
                "Hello, World!\n"
                "Can you hear me, now?")))
HELLO, WORLD!
CAN YOU HEAR ME, NOW?
If you want to determine whether "hello" appears in a file, then you could search separate lines, but it's even easier to simply apply a regular expression (see §9 "Regular Expressions") to the stream:
> (define (has-hello? in)
    (regexp-match? #rx"hello" in))
> (has-hello? (open-input-string "hello"))
#t
> (has-hello? (open-input-string "goodbye"))
#f
If you want to copy one port into another, use copy-port from racket/port, which efficiently transfers large blocks when lots of data is available, but also transfers small blocks immediately if that's all that is available:
> (define o (open-output-string))
> (copy-port (open-input-string "broom") o)
> (get-output-string o)
"broom"
9 Regular Expressions
This chapter is a modified version of [Sitaram05].
A regexp value encapsulates a pattern that is described by a string or byte string. The regexp matcher tries to match this pattern against (a portion of) another string or byte string, which we will call the text string, when you call functions like regexp-match. The text string is treated as raw text, and not as a pattern.
9.1
A string or byte string can be used directly as a regexp pattern, or it can be prefixed with #rx to form a literal regexp value. For example, #rx"abc" is a string-based regexp value, and #rx#"abc" is a byte string-based regexp value. Alternatively, a string or byte string can be prefixed with #px, as in #px"abc", for a slightly extended syntax of patterns within the string.
Most of the characters in a regexp pattern are meant to match occurrences of themselves in the text string. Thus, the pattern #rx"abc" matches a string that contains the characters a, b, and c in succession. Other characters act as metacharacters, and some character sequences act as metasequences. That is, they specify something other than their literal selves. For example, in the pattern #rx"a.c", the characters a and c stand for themselves, but the metacharacter . can match any character. Therefore, the pattern #rx"a.c" matches an a, any character, and c in succession.
If we need to match the character . itself, we can escape it by preceding it with a \. The character sequence \. is thus a metasequence, since it doesn't match itself but rather just the character . alone. So, to match a, ., and c in succession, we use the regexp pattern #rx"a\\.c"; the double \ is an artifact of Racket strings, not the regexp pattern itself.
The regexp function takes a string or byte string and produces a regexp value. Use regexp when you construct a pattern to be matched against multiple strings, since a pattern is compiled to a regexp value before it can be used in a match. The pregexp function is like regexp, but using the extended syntax. Regexp values written as literals with #rx or #px are compiled once and for all when they are read.
The regexp-quote function takes an arbitrary string and returns a string for a pattern that matches exactly the original string. In particular, characters in the input string that could serve as regexp metacharacters are escaped with a backslash, so that they safely match only themselves.
When we want a literal \ inside a Racket string or regexp literal, we must escape it so that it shows up in the string at all. Racket strings use \ as the escape character, so we end up with two \s: one Racket-string \ to escape the regexp \, which then escapes the .. Another character that would need escaping inside a Racket string is ".
The regexp-quote function is useful when building a composite regexp from a mix of regexp strings and verbatim strings.
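As a minimal sketch of building a composite regexp, the following combines a verbatim string with a regexp fragment. The helper name prefix-then-digits and the sample strings are hypothetical.

```racket
#lang racket

;; regexp-quote escapes metacharacters so that the result matches literally.
(regexp-quote "cons")   ; no metacharacters, so the string is unchanged
(regexp-quote "list?")  ; the ? is escaped with a backslash

;; Hypothetical helper: match a verbatim prefix followed by one or more digits.
(define (prefix-then-digits prefix)
  (regexp (string-append "^" (regexp-quote prefix) "[0-9]+$")))

(regexp-match? (prefix-then-digits "v1.") "v1.42")  ; the literal dot matches
(regexp-match? (prefix-then-digits "v1.") "v1x42")  ; x is not a literal dot
```

Without regexp-quote, the dot in "v1." would act as a metacharacter and the second match would also succeed.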
9.2
The regexp-match-positions function takes a regexp pattern and a text string, and it returns a match if the regexp matches (some part of) the text string, or #f if the regexp did not match the string. A successful match produces a list of index pairs. Examples:
> (regexp-match-positions #rx"brain" "bird")
#f
> (regexp-match-positions #rx"needle" "hay needle stack")
'((4 . 10))
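In the second example, the integers 4 and 10 identify the substring that was matched: 4 is the starting (inclusive) index, and 10 the ending (exclusive) index of the matching substring. Optional third and fourth arguments to regexp-match-positions restrict the match to a range of indices within the text string: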
> (regexp-match-positions
   #rx"needle"
   "his needle stack -- my needle stack -- her needle stack"
   20 39)
'((23 . 29))
Note that the returned indices are still reckoned relative to the full text string. The regexp-match function is like regexp-match-positions, but instead of returning index pairs, it returns the matching substrings:
> (regexp-match #rx"brain" "bird")
#f
> (regexp-match #rx"needle" "hay needle stack")
'("needle")
When regexp-match is used with a byte-string regexp, the result is a matching byte substring:
> (define-values (i o) (make-pipe))
> (write "hay needle stack" o)
> (close-output-port o)
> (regexp-match #rx#"needle" i)
'(#"needle")
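Byte strings and strings can also be mixed between the pattern and the text. The following sketch compares the three combinations on the same sample text; the names both-strings, byte-pattern, and byte-text are hypothetical.

```racket
#lang racket

;; String regexp on a string: the result contains strings.
(define both-strings (regexp-match #rx"needle" "hay needle stack"))

;; Byte-string regexp on a string: the result contains byte strings.
(define byte-pattern (regexp-match #rx#"needle" "hay needle stack"))

;; String regexp on a byte string: the result also contains byte strings.
(define byte-text (regexp-match #rx"needle" #"hay needle stack"))

(list both-strings byte-pattern byte-text)
```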
The regexp-match? function is like regexp-match-positions, but simply returns a boolean indicating whether the match succeeded:
> (regexp-match? #rx"brain" "bird")
#f
> (regexp-match? #rx"needle" "hay needle stack")
#t
The regexp-split function takes two arguments, a regexp pattern and a text string, and it returns a list of substrings of the text string; the pattern identifies the delimiter separating the substrings.
A byte-string regexp can be applied to a string, and a string regexp can be applied to a byte string. In both cases, the result is a byte string. Internally, all regexp matching is in terms of bytes, and a string regexp is expanded to a regexp that matches UTF-8 encodings of characters. For maximum efficiency, use byte-string matching instead of string, since matching bytes directly avoids UTF-8 encodings.
> (regexp-split #rx":" "/bin:/usr/bin:/usr/bin/X11:/usr/local/bin")
'("/bin" "/usr/bin" "/usr/bin/X11" "/usr/local/bin")
> (regexp-split #rx" " "pea soup")
'("pea" "soup")
If the first argument matches empty strings, then the list of all the single-character substrings is returned.
> (regexp-split #rx"" "smithereens")
'("" "s" "m" "i" "t" "h" "e" "r" "e" "e" "n" "s" "")
Thus, to identify one-or-more spaces as the delimiter, take care to use the regexp #rx" +", not #rx" *".
> (regexp-split #rx" +" "split pea soup")
'("split" "pea" "soup")
> (regexp-split #rx" *" "split pea soup")
'("" "s" "p" "l" "i" "t" "" "p" "e" "a" "" "s" "o" "u" "p" "")
The regexp-replace function replaces the matched portion of the text string by another string. The first argument is the pattern, the second the text string, and the third is either the string to be inserted or a procedure to convert matches to the insert string.
> (regexp-replace #rx"te" "liberte" "ty")
"liberty"
> (regexp-replace #rx"." "racket" string-upcase)
"Racket"
If the pattern doesn't occur in the text string, the returned string is identical to the text string. The regexp-replace* function replaces all matches in the text string by the insert string:
> (regexp-replace* #rx"te" "liberte egalite fraternite" "ty")
"liberty egality fratyrnity"
> (regexp-replace* #rx"[ds]" "drracket" string-upcase)
"Drracket"
9.3 Basic Assertions
The assertions ^ and $ identify the beginning and the end of the text string, respectively. They ensure that their adjoining regexps match at one or other end of the text string:
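A short sketch of both assertions follows; the sample strings and the names at-start-miss, at-start-hit, and at-end are chosen only for illustration.

```racket
#lang racket

;; ^ anchors the match at the beginning of the text string.
(define at-start-miss
  (regexp-match-positions #rx"^contact" "first contact"))
(define at-start-hit
  (regexp-match-positions #rx"^first" "first contact"))

;; $ anchors the match at the end, so only the last "laugh" qualifies.
(define at-end
  (regexp-match-positions #rx"laugh$" "laugh laugh laugh laugh"))

(list at-start-miss at-start-hit at-end)
```

The first match fails because "contact" does not occur at the start of the text string, even though it occurs later.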
9.4
Typically, a character in the regexp matches the same character in the text string. Sometimes it is necessary or convenient to use a regexp metasequence to refer to a single character. For example, the metasequence \. matches the period character. The metacharacter . matches any character (other than newline in multi-line mode; see §9.6.3 "Cloisters"):
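The difference between the metacharacter . and the metasequence \. can be sketched as follows; the sample strings and result names are hypothetical.

```racket
#lang racket

;; The metacharacter . matches any character, including a literal period.
(define dot-letter (regexp-match #rx"p.t" "pet"))
(define dot-period (regexp-match #rx"p.t" "p.t"))

;; The metasequence \. (written \\. in a Racket string) matches only a period.
(define escaped-letter (regexp-match #rx"p\\.t" "pet"))
(define escaped-period (regexp-match #rx"p\\.t" "p.t"))

(list dot-letter dot-period escaped-letter escaped-period)
```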
In #px syntax, some standard character classes can be conveniently represented as metasequences instead of as explicit bracketed expressions: \d matches a digit (the same as [0-9]); \s matches an ASCII whitespace character; and \w matches a character that could be part of a word. The upper-case versions of these metasequences stand for the inversions of the corresponding character classes: \D matches a non-digit, \S a non-whitespace character, and \W a non-word character. Remember to include a double backslash when putting these metasequences in a Racket string:
Following regexp custom, we identify word characters as [A-Za-z0-9_], although these are too restrictive for what a Racketeer might consider a word.
> (regexp-match #px"\\d\\d" "0 dear, 1 have 2 read catch 22 before 9")
'("22")
These character classes can be used inside a bracketed expression. For example, #px"[a-z\\d]" matches a lower-case letter or a digit.
9.4.2
A POSIX character class is a special metasequence of the form [:...:] that can be used only inside a bracketed expression in #px syntax. The POSIX classes supported are
[:alnum:] - ASCII letters and digits
[:alpha:] - ASCII letters
[:ascii:] - ASCII characters
[:blank:] - ASCII widthful whitespace: space and tab
[:cntrl:] - control characters: ASCII 0 to 32
[:digit:] - ASCII digits, same as \d
[:graph:] - ASCII characters that use ink
[:lower:] - ASCII lower-case letters
[:print:] - ASCII ink-users plus widthful whitespace
[:space:] - ASCII whitespace, same as \s
[:upper:] - ASCII upper-case letters
[:word:] - ASCII letters, digits, and _, same as \w
[:xdigit:] - ASCII hex digits
For example, #px"[[:alpha:]_]" matches a letter or underscore.
> (regexp-match #px"[[:alpha:]_]" "--x--")
'("x")
> (regexp-match #px"[[:alpha:]_]" "--_--")
'("_")
> (regexp-match #px"[[:alpha:]_]" "--:--")
#f
The POSIX class notation is valid only inside a bracketed expression. For instance, [:alpha:], when not inside a bracketed expression, will not be read as the letter class. Rather, it is (from previous principles) the character class containing the characters :, a, l, p, h.
9.5 Quantifiers
The quantifiers *, +, and ? match respectively: zero or more, one or more, and zero or one instances of the preceding subpattern.
> (regexp-match-positions #rx"c[ad]*r" "cadaddadddr")
'((0 . 11))
> (regexp-match-positions #rx"c[ad]*r" "cr")
'((0 . 2))
> (regexp-match-positions #rx"c[ad]+r" "cadaddadddr")
'((0 . 11))
> (regexp-match-positions #rx"c[ad]+r" "cr")
#f
> (regexp-match-positions #rx"c[ad]?r" "cadaddadddr")
#f
> (regexp-match-positions #rx"c[ad]?r" "cr")
'((0 . 2))
> (regexp-match-positions #rx"c[ad]?r" "car")
'((0 . 3))
In #px syntax, you can use braces to specify much finer-tuned quantification than is possible with *, +, ?: The quantifier {m} matches exactly m instances of the preceding subpattern; m must be a nonnegative integer. The quantifier {m,n} matches at least m and at most n instances. m and n are nonnegative integers with m less than or equal to n. You may omit either or both numbers, in which case m defaults to 0 and n to infinity. It is evident that + and ? are abbreviations for {1,} and {0,1} respectively, and * abbreviates {,}, which is the same as {0,}.
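> (regexp-match #px"[aeiou]{3}" "vacuous")
'("uou")
> (regexp-match #px"[aeiou]{3}" "evolve")
#f
> (regexp-match #px"[aeiou]{2,3}" "evolve")
#f
> (regexp-match #px"[aeiou]{2,3}" "zeugma")
'("eu")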
The quantifiers described so far are all greedy: they match the maximal number of instances that would still lead to an overall match for the full pattern.
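Greedy matching can be seen by comparing a quantifier with its non-greedy counterpart, which Racket writes by appending ? to the quantifier. The following is a small sketch; the sample text and the names greedy and non-greedy are hypothetical.

```racket
#lang racket

(define text "<tag1> <tag2> <tag3>")

;; Greedy: .* takes as much as it can while still letting > match at the end.
(define greedy (regexp-match #rx"<.*>" text))

;; Non-greedy: .*? takes as little as it can, stopping at the first >.
(define non-greedy (regexp-match #rx"<.*?>" text))

(list greedy non-greedy)
```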
9.6 Clusters
Clustering, that is, enclosure within parens (...), identifies the enclosed subpattern as a single entity. It causes the matcher to capture the submatch, or the portion of the string matching the subpattern, in addition to the overall match:
> (regexp-match #rx"([a-z]+) ([0-9]+), ([0-9]+)" "jan 1, 1970")
'("jan 1, 1970" "jan" "1" "1970")
Clustering also causes a following quantifier to treat the entire enclosed subpattern as an entity:
> (regexp-match #rx"(poo )*" "poo poo platter")
'("poo poo " "poo ")
The number of submatches returned is always equal to the number of subpatterns specified in the regexp, even if a particular subpattern happens to match more than one substring or no substring at all.
> (regexp-match #rx"([a-z ]+;)*" "lather; rinse; repeat;") '("lather; rinse; repeat;" " repeat;")
Here, the *-quantied subpattern matches three times, but it is the last submatch that is returned. It is also possible for a quantied subpattern to fail to match, even if the overall pattern matches. In such cases, the failing submatch is represented by #f
> (define date-re ; match `month year' or `month day, year'; ; subpattern matches day, if present #rx"([a-z]+) +([0-9]+,)? *([0-9]+)") > (regexp-match date-re "jan 1, 1970") '("jan 1, 1970" "jan" "1," "1970") > (regexp-match date-re "jan 1970") '("jan 1970" "jan" #f "1970")
9.6.1 Backreferences
Submatches can be used in the insert string argument of the procedures regexp-replace and regexp-replace*. The insert string can use \n as a backreference to refer back to the nth submatch, which is the substring that matched the nth subpattern. A \0 refers to the entire match, and it can also be specified as \&.
> (regexp-replace #rx"_(.+?)_"
    "the _nina_, the _pinta_, and the _santa maria_"
    "*\\1*")
"the *nina*, the _pinta_, and the _santa maria_"
> (regexp-replace* #rx"_(.+?)_"
    "the _nina_, the _pinta_, and the _santa maria_"
    "*\\1*")
"the *nina*, the *pinta*, and the *santa maria*"
> (regexp-replace #px"(\\S+) (\\S+) (\\S+)"
    "eat to live"
    "\\3 \\2 \\1")
"live to eat"
Use \\ in the insert string to specify a literal backslash. Also, \$ stands for an empty string, and is useful for separating a backreference \n from an immediately following number.
Backreferences can also be used within a #px pattern to refer back to an already matched subpattern in the pattern. \n stands for an exact repeat of the nth submatch. Note that \0, which is useful in an insert string, makes no sense within the regexp pattern, because the entire regexp has not matched yet, so there is nothing to refer back to.
> (regexp-match #px"([a-z]+) and \\1" "billions and billions")
'("billions and billions" "billions")
Note that the backreference is not simply a repeat of the previous subpattern. Rather it is a repeat of the particular substring already matched by the subpattern. In the above example, the backreference can only match billions. It will not match millions, even though the subpattern it harks back to, ([a-z]+), would have had no problem doing so. The following example uses a backreference to correct doubled words:
> (regexp-replace* #px"\\b(\\S+) \\1\\b"
    (string-append "now is the the time for all good men to "
                   "to come to the aid of of the party")
    "\\1")
"now is the time for all good men to come to the aid of the party"
9.6.2 Non-capturing Clusters
It is often required to specify a cluster (typically for quantification) but without triggering the capture of submatch information. Such clusters are called non-capturing. To create a non-capturing cluster, use (?: instead of ( as the cluster opener. In the following example, a non-capturing cluster eliminates the directory portion of a given Unix pathname, and a capturing cluster identifies the basename.
But don't parse paths with regexps. Use functions like split-path, instead.
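The example that this passage describes can be sketched as follows, assuming a pathname made only of lowercase letters and slashes. The names with-capture and without-capture are illustrative; the first definition is included only to show what the capturing version of the same cluster would add to the result.

```racket
#lang racket

;; Capturing cluster for the directory part: the last repetition ("bin/")
;; shows up as an extra submatch.
(define with-capture
  (regexp-match #rx"^([a-z]*/)*([a-z]+)$" "/usr/local/bin/racket"))

;; Non-capturing cluster (?: ...): only the basename is captured.
(define without-capture
  (regexp-match #rx"^(?:[a-z]*/)*([a-z]+)$" "/usr/local/bin/racket"))

with-capture     ; '("/usr/local/bin/racket" "bin/" "racket")
without-capture  ; '("/usr/local/bin/racket" "racket")
```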
The location between the ? and the : of a non-capturing cluster is called a cloister. You can put modifiers there that will cause the enclustered subpattern to be treated specially. The modifier i causes the subpattern to match case-insensitively:
The term cloister is a useful, if terminally cute, coinage from the abbots of Perl.
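A minimal sketch of the i modifier, using a sample word of our own choosing: the same pattern fails against a mixed-case string until it is wrapped in a (?i: cluster.

```racket
#lang racket

;; Without the modifier, letter case must match exactly.
(define case-sensitive (regexp-match #rx"hearth" "HeartH"))

;; With i in the cloister, the cluster matches regardless of case.
(define case-insensitive (regexp-match #rx"(?i:hearth)" "HeartH"))

case-sensitive    ; #f
case-insensitive  ; '("HeartH")
```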
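The modifier m causes the subpattern to match in multi mode, where . does not match a newline, ^ can match just after a newline, and $ can match just before a newline:
> (regexp-match #rx"." "\na\n")
'("\n")
> (regexp-match #rx"(?m:.)" "\na\n")
'("a")
> (regexp-match #rx"^A plan$" "A man\nA plan\nA canal")
#f
> (regexp-match #rx"(?m:^A plan$)" "A man\nA plan\nA canal")
'("A plan")
You can put more than one modifier in the cloister:
> (regexp-match #rx"(?mi:^A Plan$)" "a man\na plan\na canal")
'("a plan")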
A minus sign before a modifier inverts its meaning. Thus, you can use -i in a subcluster to overturn the case-insensitivities caused by an enclosing cluster.
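For instance, a regexp along the following lines turns case-insensitivity on for a whole cluster and back off for one part of it. This is a sketch; the name tex-re and the sample strings are our own.

```racket
#lang racket

;; Case-insensitive overall, except for the (?-i: ...) subcluster.
(define tex-re #rx"(?i:the (?-i:TeX)book)")

(regexp-match tex-re "The TeXbook")  ; '("The TeXbook")
(regexp-match tex-re "THE TeXBOOK")  ; '("THE TeXBOOK")
(regexp-match tex-re "the texbook")  ; #f
```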
The above regexp will allow any casing for "the" and "book", but it insists that "TeX" not be differently cased.
9.7 Alternation
You can specify a list of alternate subpatterns by separating them by |. The | separates subpatterns in the nearest enclosing cluster (or in the entire pattern string if there are no enclosing parens).
> (regexp-match #rx"f(ee|i|o|um)" "a small, final fee")
'("fi" "i")
> (regexp-replace* #rx"([yi])s(e[sdr]?|ing|ation)"
    (string-append
      "analyse an energising organisation"
      " pulsing with noisy organisms")
    "\\1z\\2")
"analyze an energizing organization pulsing with noisy organisms"
Note again that if you wish to use clustering merely to specify a list of alternate subpatterns but do not want the submatch, use (?: instead of (.
9.8 Backtracking
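We've already seen that greedy quantifiers match the maximal number of times, but the overriding priority is that the overall match succeed. When a greedy quantifier consumes so much that the rest of the pattern cannot match, the matcher backtracks: the quantifier gives back characters until the overall match succeeds or every possibility has been tried.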
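A small sketch of that give-back behavior, with variable names of our own: in each pattern a* would happily take all four characters, but it has to leave enough behind for the literal a's that follow. The clustered variants make the amount that a* kept visible as a submatch.

```racket
#lang racket

;; a* must leave one a for the final literal a.
(define whole (regexp-match #rx"a*a" "aaaa"))

;; The submatch shows that a* ended up with three of the four characters.
(define one-back (regexp-match #rx"(a*)a" "aaaa"))

;; With two literal a's after it, a* keeps only two.
(define two-back (regexp-match #rx"(a*)aa" "aaaa"))

whole     ; '("aaaa")
one-back  ; '("aaaa" "aaa")
two-back  ; '("aaaa" "aa")
```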
9.9 Looking Ahead and Behind
You can have assertions in your pattern that look ahead or behind to ensure that a subpattern does or does not occur. These "look around" assertions are specified by putting the subpattern checked for in a cluster whose leading characters are: ?= (for positive lookahead), ?! (negative lookahead), ?<= (positive lookbehind), ?<! (negative lookbehind). Note that the subpattern in the assertion does not generate a match in the final result; it merely allows or disallows the rest of the match.
9.9.1 Lookahead
Positive lookahead with ?= peeks ahead to ensure that its subpattern could match.
> (regexp-match-positions #rx"grey(?=hound)"
    "i left my grey socks at the greyhound")
'((28 . 32))
The regexp #rx"grey(?=hound)" matches grey, but only if it is followed by hound. Thus, the first grey in the text string is not matched.
Negative lookahead with ?! peeks ahead to ensure that its subpattern could not possibly match.
> (regexp-match-positions #rx"grey(?!hound)"
    "the gray greyhound ate the grey socks")
'((27 . 31))
The regexp #rx"grey(?!hound)" matches grey, but only if it is not followed by hound. Thus the grey just before socks is matched.
9.9.2 Lookbehind
Positive lookbehind with ?<= checks that its subpattern could match immediately to the left of the current position in the text string.
> (regexp-match-positions #rx"(?<=grey)hound"
    "the hound in the picture is not a greyhound")
'((38 . 43))
The regexp #rx"(?<=grey)hound" matches hound, but only if it is preceded by grey.
Negative lookbehind with ?<! checks that its subpattern could not possibly match immediately to the left.
> (regexp-match-positions #rx"(?<!grey)hound"
    "the greyhound in the picture is not a hound")
'((38 . 43))
The regexp #rx"(?<!grey)hound" matches hound, but only if it is not preceded by grey. Lookaheads and lookbehinds can be convenient when they are not confusing.
9.10 An Extended Example
Here's an extended example from Friedl's Mastering Regular Expressions, page 189, that covers many of the features described in this chapter. The problem is to fashion a regexp that will match any and only IP addresses or dotted quads: four numbers separated by three dots, with each number between 0 and 255.
First, we define a subregexp n0-255 that matches 0 through 255:
> (define n0-255
    (string-append
      "(?:"
      "\\d|"        ;   0 through   9
      "\\d\\d|"     ;  00 through  99
      "[01]\\d\\d|" ; 000 through 199
      "2[0-4]\\d|"  ; 200 through 249
      "25[0-5]"     ; 250 through 255
      ")"))
Note that n0-255 lists prefixes as preferred alternates, which is something we cautioned against in section 9.7, Alternation. However, since we intend to anchor this subregexp explicitly to force an overall match, the order of the alternates does not matter.
The first two alternates simply get all single- and double-digit numbers. Since 0-padding is allowed, we need to match both 1 and 01. We need to be careful when getting 3-digit numbers, since numbers above 255 must be excluded. So we fashion alternates to get 000 through 199, then 200 through 249, and finally 250 through 255.
An IP-address is a string that consists of four n0-255s with three dots separating them.
> (define ip-re1
    (string-append
      "^"        ; nothing before
      n0-255     ; the first n0-255,
      "(?:"      ; then the subpattern of
      "\\."      ; a dot followed by
      n0-255     ; an n0-255,
      ")"        ; which is
      "{3}"      ; repeated exactly 3 times
      "$"))      ; with nothing following
Let's try it out:
> (regexp-match (pregexp ip-re1) "1.2.3.4")
'("1.2.3.4")
> (regexp-match (pregexp ip-re1) "55.155.255.265")
#f
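which is fine, except that ip-re1 also accepts strings made up only of zeros and dots, such as "0.00.000.00", which are not valid IP addresses. Before matching ip-re1, we can look ahead to rule these out. We could use positive lookahead to ensure that there is a digit other than zero:
> (define ip-re
    (pregexp
      (string-append
        "(?=.*[1-9])" ; ensure there's a non-0 digit
        ip-re1)))
Or we could use negative lookahead to ensure that what's ahead isn't composed of only zeros and dots.
> (define ip-re
    (pregexp
      (string-append
        "(?![0.]*$)" ; not just zeros and dots
                     ; (note: . is not metachar inside [...])
        ip-re1)))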
The regexp ip-re will match all and only valid IP addresses.
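To see the pieces working together, here is a self-contained sketch that repeats the definitions above, using the negative-lookahead variant, and tries a few strings. The helper valid-ip? and the test strings are our own additions, and the sketch assumes regexp-match? is available, as it is in current Racket.

```racket
#lang racket

(define n0-255
  (string-append "(?:" "\\d|" "\\d\\d|" "[01]\\d\\d|" "2[0-4]\\d|" "25[0-5]" ")"))

(define ip-re1
  (string-append "^" n0-255 "(?:" "\\." n0-255 ")" "{3}" "$"))

(define ip-re
  (pregexp (string-append "(?![0.]*$)" ip-re1)))

;; Hypothetical helper: #t when s is a dotted quad that is not all zeros.
(define (valid-ip? s)
  (regexp-match? ip-re s))

(valid-ip? "127.0.0.1")                      ; #t
(valid-ip? "55.155.255.265")                 ; #f, 265 is out of range
(valid-ip? "0.00.000.00")                    ; #f, rejected by the lookahead
(regexp-match (pregexp ip-re1) "0.00.000.00") ; '("0.00.000.00"), accepted without it
```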
10 Exceptions and Control
Racket provides an especially rich set of control operations: not only operations for raising and catching exceptions, but also operations for grabbing and restoring portions of a computation.
10.1 Exceptions
Whenever a run-time error occurs, an exception is raised. Unless the exception is caught, it is handled by printing a message associated with the exception, and then escaping from the computation.
> (/ 1 0)
/: division by zero
> (car 17)
car: expects argument of type <pair>; given: 17
To catch an exception, use the with-handlers form:
> (with-handlers ([exn:fail:contract:divide-by-zero?
                   (lambda (exn) +inf.0)])
    (/ 1 0))
+inf.0
> (with-handlers ([exn:fail:contract:divide-by-zero?
                   (lambda (exn) +inf.0)])
    (car 17))
car: expects argument of type <pair>; given: 17
The error function is one way to raise your own exception. It packages an error message and other information into an exn:fail structure:
> (error "crash!")
crash!
> (with-handlers ([exn:fail? (lambda (exn) 'air-bag)])
    (error "crash!"))
'air-bag
The exn:fail:contract:divide-by-zero and exn:fail structure types are sub-types of the exn structure type. Exceptions raised by core forms and functions always raise an instance of exn or one of its sub-types, but an exception does not have to be represented by a structure. The raise function lets you raise any value as an exception:
> (raise 2)
uncaught exception: 2
> (with-handlers ([(lambda (v) (equal? v 2)) (lambda (v) 'two)])
    (raise 2))
'two
> (with-handlers ([(lambda (v) (equal? v 2)) (lambda (v) 'two)])
    (/ 1 0))
/: division by zero
Multiple predicate-exprs in a with-handlers form let you handle different kinds of exceptions in different ways. The predicates are tried in order, and if none of them match, then the exception is propagated to enclosing contexts.
> (define (always-fail n)
    (with-handlers ([even? (lambda (v) 'even)]
                    [positive? (lambda (v) 'positive)])
      (raise n)))
> (always-fail 2)
'even
> (always-fail 3)
'positive
> (always-fail -3)
uncaught exception: -3
> (with-handlers ([negative? (lambda (v) 'negative)])
    (always-fail -3))
'negative
Using (lambda (v) #t) as a predicate captures all exceptions, of course:
> (with-handlers ([(lambda (v) #t) (lambda (v) 'oops)])
    (car 17))
'oops
Capturing all exceptions is usually a bad idea, however. If the user types Ctl-C in a terminal window or clicks the Stop button in DrRacket to interrupt a computation, then normally the exn:break exception should not be caught. To catch only exceptions that represent errors, use exn:fail? as the predicate:
> (with-handlers ([exn:fail? (lambda (v) 'oops)])
    (car 17))
'oops
> (with-handlers ([exn:fail? (lambda (v) 'oops)])
    (break-thread (current-thread)) ; simulate Ctl-C
    (car 17))
user break
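The following sketch pulls these pieces together in one function. The helper safe-div is hypothetical and exists only for illustration. It lists a specific predicate before the more general exn:fail? and uses exn-message to recover the message that error packaged into the exception. Because predicates are tried in order, the division-by-zero handler wins even though exn:fail? would also accept that exception.

```racket
#lang racket

;; Hypothetical helper: divide a by b, handling two kinds of failure.
(define (safe-div a b)
  (with-handlers ([exn:fail:contract:divide-by-zero?
                   (lambda (exn) +inf.0)]          ; specific handler first
                  [exn:fail?
                   (lambda (exn) (exn-message exn))]) ; any other error
    (when (negative? b)
      (error "negative divisor"))
    (/ a b)))

(safe-div 6 3)   ; 2
(safe-div 1 0)   ; +inf.0
(safe-div 1 -1)  ; "negative divisor"
```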
10.2 Prompts and Aborts
When an exception is raised, control escapes out of an arbitrarily deep evaluation context to the point where the exception is caught, or all the way out if the exception is never caught:
> (define (escape v)
    (abort-current-continuation
      (default-continuation-prompt-tag)
      (lambda () v)))
> (+ 1 (+ 1 (+ 1 (+ 1 (+ 1 (+ 1 (escape 0)))))))
0
> (+ 1 (call-with-continuation-prompt
In escape above, the value v is wrapped in a procedure that is called after escaping to the enclosing prompt. Prompts and aborts look very much like exception handling and raising. Indeed, prompts and aborts are essentially a more primitive form of exceptions, and with-handlers and raise are implemented in terms of prompts and aborts. The power of the more primitive forms is related to the word continuation in the operator names, as we discuss in the next section.
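A complete, runnable sketch of an abort that is stopped by an explicit prompt might look like the following. The name inner-only is ours. The sketch relies on the default prompt handler, which calls the thunk passed to abort-current-continuation, so the prompt expression evaluates to 0 and only the addition outside the prompt still runs.

```racket
#lang racket

(define (escape v)
  (abort-current-continuation
   (default-continuation-prompt-tag)
   (lambda () v)))

;; The three additions inside the prompt are discarded by the abort;
;; the one outside the prompt is applied to the escaped value.
(define inner-only
  (+ 1
     (call-with-continuation-prompt
      (lambda ()
        (+ 1 (+ 1 (+ 1 (escape 0)))))
      (default-continuation-prompt-tag))))

inner-only  ; 1
```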
10.3 Continuations
A continuation is a value that encapsulates a piece of an expression context. The call-with-composable-continuation function captures the current continuation starting outside the current function call and running up to the nearest enclosing prompt. (Keep in mind that each REPL interaction is implicitly wrapped in a prompt.) For example, in
(+ 1 (+ 1 (+ 1 0)))
at the point where 0 is evaluated, the expression context includes three nested addition expressions. We can grab that context by changing 0 to grab the continuation before returning 0:
> (define saved-k #f)
> (define (save-it!)
    (call-with-composable-continuation
      (lambda (k) ; k is the captured continuation
        (set! saved-k k)
        0)))
> (+ 1 (+ 1 (+ 1 (save-it!))))
3
The continuation saved in saved-k encapsulates the program context (+ 1 (+ 1 (+ 1 ?))), where ? represents a place to plug in a result value, because that was the expression context when save-it! was called. The continuation is encapsulated so that it behaves like the function (lambda (v) (+ 1 (+ 1 (+ 1 v)))):
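A sketch of applying the saved continuation follows. Because this version is written as a module rather than typed at the REPL, it wraps the capturing expression in an explicit call-with-continuation-prompt so that the captured context is exactly the three additions; that wrapper and the names first-result, replay-0, replay-10, and composed are our own.

```racket
#lang racket

(define saved-k #f)

(define (save-it!)
  (call-with-composable-continuation
   (lambda (k) ; k is the captured continuation
     (set! saved-k k)
     0)))

;; The explicit prompt delimits what save-it! captures.
(define first-result
  (call-with-continuation-prompt
   (lambda () (+ 1 (+ 1 (+ 1 (save-it!)))))))

(define replay-0  (saved-k 0))            ; plug 0 into (+ 1 (+ 1 (+ 1 ?)))
(define replay-10 (saved-k 10))           ; plug 10 into the same context
(define composed  (saved-k (saved-k 0)))  ; continuations compose like functions

(list first-result replay-0 replay-10 composed)  ; '(3 3 13 6)
```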
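> (define (sum n)
    (if (zero? n)
        (save-it!)
        (+ n (sum (sub1 n)))))
> (sum 5)
15
After this call, the continuation in saved-k becomes (lambda (x) (+ 5 (+ 4 (+ 3 (+ 2 (+ 1 x)))))), because the captured continuation is determined dynamically by the calls in progress, not by the shape of the program text.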
11 Iterations and Comprehensions
The for family of syntactic forms supports iteration over sequences. Lists, vectors, strings, byte strings, input ports, and hash tables can all be used as sequences, and constructors like in-range offer even more kinds of sequences.
Variants of for accumulate iteration results in different ways, but they all have the same syntactic shape. Simplifying for now, the syntax of for is
(for ([id sequence-expr] ...) body ...+)
> (for ([i '(1 2 3)])
    (display i))
123
> (for ([i "abc"])
    (printf "~a..." i))
a...b...c...
> (for ([i 4])
    (display i))
0123
The for/list variant of for is more Racket-like. It accumulates body results into a list, instead of evaluating body only for side effects. In more technical terms, for/list implements a list comprehension. Examples:
> (for/list ([i '(1 2 3)])
    (* i i))
'(1 4 9)
> (for/list ([i "abc"])
    i)
'(#\a #\b #\c)
> (for/list ([i 4])
    i)
'(0 1 2 3)
The full syntax of for accommodates multiple sequences to iterate in parallel, and the for* variant nests the iterations instead of running them in parallel. More variants of for and for* accumulate body results in different ways. In all of these variants, predicates that prune iterations can be included along with bindings.
Before details on the variations of for, though, it's best to see the kinds of sequence generators that make interesting examples.
11.1 Sequence Constructors
The in-range function generates a sequence of numbers, given an optional starting number (which defaults to 0), a number before which the sequence ends, and an optional step (which defaults to 1). Using a non-negative integer k directly as a sequence is a shorthand for (in-range k). Examples:
> (for ([i 3])
    (display i))
012
> (for ([i (in-range 3)])
    (display i))
012
> (for ([i (in-range 1 4)])
    (display i))
123
> (for ([i (in-range 1 4 2)])
    (display i))
13
> (for ([i (in-range 4 1 -1)])
    (display i))
432
> (for ([i (in-range 1 4 1/2)])
    (printf " ~a " i))
 1  3/2  2  5/2  3  7/2
The in-naturals function is similar, except that the starting number must be an exact non-negative integer (which defaults to 0), the step is always 1, and there is no upper limit. A for loop using just in-naturals will never terminate unless a body expression raises an exception or otherwise escapes.
Example:
> (for ([i (in-naturals)])
    (if (= i 10)
        (error "too much!")
        (display i)))
0123456789
too much!
The stop-before and stop-after functions construct a new sequence given a sequence and a predicate. The new sequence is like the given sequence, but truncated either immediately before or immediately after the first element for which the predicate returns true. Example:
> (for ([i (stop-before "abc def" char-whitespace?)])
    (display i))
abc
Sequence constructors like in-list, in-vector and in-string simply make explicit the use of a list, vector, or string as a sequence. Along with in-range, these constructors raise an exception when given the wrong kind of value, and since they otherwise avoid a run-time dispatch to determine the sequence type, they enable more efficient code generation; see section 11.9, Iteration Performance, for more information. Examples:
> (for ([i (in-string "abc")])
    (display i))
abc
> (for ([i (in-string '(1 2 3))])
    (display i))
in-string: expected argument of type <string>; given: (1 2 3)
11.2 for and for*
(for (clause ...)
  body ...+)
clause = [id sequence-expr]
       | #:when boolean-expr
       | #:unless boolean-expr
When multiple [id sequence-expr] clauses are provided in a for form, the corresponding sequences are traversed in parallel:
> (for ([i (in-range 1 4)]
        [chapter '("Intro" "Details" "Conclusion")])
    (printf "Chapter ~a. ~a\n" i chapter))
Chapter 1. Intro
Chapter 2. Details
Chapter 3. Conclusion
With parallel sequences, the for expression stops iterating when any sequence ends. This behavior allows in-naturals, which creates an infinite sequence of numbers, to be used for indexing:
> (for ([i (in-naturals 1)]
        [chapter '("Intro" "Details" "Conclusion")])
    (printf "Chapter ~a. ~a\n" i chapter))
Chapter 1. Intro
Chapter 2. Details
Chapter 3. Conclusion
The for* form, which has the same syntax as for, nests multiple sequences instead of running them in parallel:
> (for* ([book '("Guide" "Reference")]
         [chapter '("Intro" "Details" "Conclusion")])
    (printf "~a ~a\n" book chapter))
Guide Intro
Guide Details
Guide Conclusion
Reference Intro
Reference Details
Reference Conclusion
Thus, for* is a shorthand for nested fors in the same way that let* is a shorthand for nested lets.
The #:when boolean-expr form of a clause is another shorthand. It allows the body expressions to evaluate only when the boolean-expr produces a true value:
> (for* ([book '("Guide" "Reference")]
         [chapter '("Intro" "Details" "Conclusion")]
         #:when (not (equal? chapter "Details")))
    (printf "~a ~a\n" book chapter))
Guide Intro
Guide Conclusion
Reference Intro
Reference Conclusion
A boolean-expr with #:when can refer to any of the preceding iteration bindings. In a for form, this scoping makes sense only if the test is nested in the iteration of the preceding bindings; thus, bindings separated by #:when are mutually nested, instead of in parallel, even with for.
> (for ([book '("Guide" "Reference" "Notes")]
        #:when (not (equal? book "Notes"))
        [i (in-naturals 1)]
        [chapter '("Intro" "Details" "Conclusion" "Index")]
        #:when (not (equal? chapter "Index")))
    (printf "~a Chapter ~a. ~a\n" book i chapter))
Guide Chapter 1. Intro
Guide Chapter 2. Details
Guide Chapter 3. Conclusion
Reference Chapter 1. Intro
Reference Chapter 2. Details
Reference Chapter 3. Conclusion
An #:unless clause is analogous to a #:when clause, but the body expressions evaluate only when the boolean-expr produces a false value.
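As a quick sketch of that equivalence, the two loops below keep the same chapters: one filters with #:unless, the other with #:when and a negated test. The names kept-unless and kept-when are ours, and for/list is used only so that the results can be compared as lists.

```racket
#lang racket

;; Skip an iteration when the test is true.
(define kept-unless
  (for/list ([chapter '("Intro" "Details" "Conclusion")]
             #:unless (equal? chapter "Details"))
    chapter))

;; The same filter written with #:when and a negated test.
(define kept-when
  (for/list ([chapter '("Intro" "Details" "Conclusion")]
             #:when (not (equal? chapter "Details")))
    chapter))

kept-unless  ; '("Intro" "Conclusion")
kept-when    ; '("Intro" "Conclusion")
```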
11.3 for/list and for*/list
The for/list form, which has the same syntax as for, evaluates the body expressions to obtain values that go into a newly constructed list:
> (for/list ([i (in-naturals 1)]
             [chapter '("Intro" "Details" "Conclusion")])
    (string-append (number->string i) ". " chapter))
'("1. Intro" "2. Details" "3. Conclusion")
A #:when clause in a for/list form prunes the result list along with evaluations of the body expressions:
> (for/list ([i (in-naturals 1)]
             [chapter '("Intro" "Details" "Conclusion")]
             #:when (odd? i))
    chapter)
'("Intro" "Conclusion")
This pruning behavior of #:when is more useful with for/list than for. Whereas a plain when form normally suffices with for, a when expression form in a for/list would cause the result list to contain #<void>s instead of omitting list elements.
The for*/list form is like for*, nesting multiple iterations:
> (for*/list ([book '("Guide" "Ref.")]
              [chapter '("Intro" "Details")])
    (string-append book " " chapter))
'("Guide Intro" "Guide Details" "Ref. Intro" "Ref. Details")
A for*/list form is not quite the same thing as nested for/list forms. Nested for/lists would produce a list of lists, instead of one flattened list. Much like #:when, then, the nesting of for*/list is more useful than the nesting of for*.
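The difference is easy to see side by side. In this sketch (the names flat and nested are ours), the same two sequences are iterated once with for*/list and once with two nested for/list forms.

```racket
#lang racket

;; for*/list: one flat list with an element per inner iteration.
(define flat
  (for*/list ([book '("Guide" "Ref.")]
              [chapter '("Intro" "Details")])
    (string-append book " " chapter)))

;; Nested for/list: the inner loop's list becomes one element of the outer list.
(define nested
  (for/list ([book '("Guide" "Ref.")])
    (for/list ([chapter '("Intro" "Details")])
      (string-append book " " chapter))))

flat    ; '("Guide Intro" "Guide Details" "Ref. Intro" "Ref. Details")
nested  ; '(("Guide Intro" "Guide Details") ("Ref. Intro" "Ref. Details"))
```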
11.4 for/vector and for*/vector
The for/vector form can be used with the same syntax as the for/list form, but the evaluated body expressions go into a newly-constructed vector instead of a list:
> (for/vector ([i (in-naturals 1)]
               [chapter '("Intro" "Details" "Conclusion")])
    (string-append (number->string i) ". " chapter))
'#("1. Intro" "2. Details" "3. Conclusion")
The for*/vector form behaves similarly, but the iterations are nested as in for*.
The for/vector and for*/vector forms also allow the length of the vector to be constructed to be supplied in advance. The resulting iteration can be performed more efficiently than plain for/vector or for*/vector:
> (let ([chapters '("Intro" "Details" "Conclusion")])
    (for/vector #:length (length chapters) ([i (in-naturals 1)]
                                            [chapter chapters])
      (string-append (number->string i) ". " chapter)))
'#("1. Intro" "2. Details" "3. Conclusion")
If a length is provided, the iteration stops when the vector is filled or the requested iterations are complete, whichever comes first. If the provided length exceeds the requested number of iterations, then the remaining slots in the vector are initialized to the default argument of make-vector.
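Both stopping conditions can be sketched with small numeric loops. The names padded and truncated are ours; the sketch assumes the default fill value of make-vector is 0, as it is in current Racket.

```racket
#lang racket

;; Length exceeds the iterations: the last two slots keep the default fill.
(define padded
  (for/vector #:length 5 ([i (in-range 3)])
    (* i i)))

;; Iterations exceed the length: the loop stops once the vector is full.
(define truncated
  (for/vector #:length 2 ([i (in-range 5)])
    (* i i)))

padded     ; '#(0 1 4 0 0)
truncated  ; '#(0 1)
```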
11.5 for/and and for/or
The for/and form combines iteration results with and, stopping as soon as it encounters #f:
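A minimal sketch of for/and, with variable names of our own. The last definition uses the companion for/or form, which is not described in the text above; it combines results with or and stops at the first true value.

```racket
#lang racket

;; Stops at "Details", the first chapter for which the body is #f.
(define all-intro?
  (for/and ([chapter '("Intro" "Details" "Conclusion")])
    (equal? chapter "Intro")))

;; Like and, for/and returns the last body result when none is #f.
(define last-value
  (for/and ([i '(1 2 3)])
    i))

;; With no iterations the result is #t, as with (and).
(define empty-and
  (for/and ([i '()])
    #f))

;; for/or returns the first true body result.
(define first-details
  (for/or ([chapter '("Intro" "Details" "Conclusion")])
    (and (equal? chapter "Details") chapter)))

(list all-intro? last-value empty-and first-details)  ; '(#f 3 #t "Details")
```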
11.6 for/first and for/last
The for/first form returns the result of the first time that the body expressions are evaluated, skipping further iterations. This form is most useful with a #:when clause.
> (for/first ([chapter '("Intro" "Details" "Conclusion" "Index")]
              #:when (not (equal? chapter "Intro")))
    chapter)
"Details"
If the body expressions are evaluated zero times, then the result is #f.
The for/last form runs all iterations, returning the value of the last iteration (or #f if no iterations are run):
> (for/last ([chapter '("Intro" "Details" "Conclusion" "Index")]
             #:when (not (equal? chapter "Index")))
    chapter)
"Conclusion"
As usual, the for*/first and for*/last forms provide the same facility with nested iterations:
> (for*/first ([book '("Guide" "Reference")]
               [chapter '("Intro" "Details" "Conclusion" "Index")]
               #:when (not (equal? chapter "Intro")))
    (list book chapter))
'("Guide" "Details")
> (for*/last ([book '("Guide" "Reference")]
              [chapter '("Intro" "Details" "Conclusion" "Index")]
              #:when (not (equal? chapter "Index")))
    (list book chapter))
'("Reference" "Conclusion")
11.7 for/fold and for*/fold
The for/fold form is a very general way to combine iteration results. Its syntax is slightly different from the syntax of for, because accumulation variables must be declared at the beginning:
> (for/fold ([len 0])
            ([chapter '("Intro" "Conclusion")])
    (+ len (string-length chapter)))
15
> (for/fold ([prev #f])
            ([i (in-naturals 1)]
             [chapter '("Intro" "Details" "Details" "Conclusion")]
             #:when (not (equal? chapter prev)))
    (printf "~a. ~a\n" i chapter)
    chapter)
1. Intro
2. Details
4. Conclusion
"Conclusion"
When multiple accum-ids are specified, then the last body must produce multiple values, one for each accum-id. The for/fold expression itself produces multiple values for the results. Example:
> (for/fold ([prev #f]
             [counter 1])
            ([chapter '("Intro" "Details" "Details" "Conclusion")]
             #:when (not (equal? chapter prev)))
    (printf "~a. ~a\n" counter chapter)
    (values chapter
            (add1 counter)))
1. Intro
2. Details
3. Conclusion
"Conclusion"
4
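Since a for/fold with several accumulators produces several values, one convenient way to use the results is to bind them with define-values. The sketch below, with names of our own, computes a total length and a count in a single pass.

```racket
#lang racket

;; Two accumulators, so the body returns two values on every iteration
;; and the whole expression produces two results.
(define-values (total-length chapter-count)
  (for/fold ([len 0]
             [count 0])
            ([chapter '("Intro" "Conclusion")])
    (values (+ len (string-length chapter))
            (add1 count))))

total-length   ; 15
chapter-count  ; 2
```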
11.8 Multiple-Valued Sequences
In the same way that a function or expression can produce multiple values, individual iterations of a sequence can produce multiple elements. For example, a hash table as a sequence generates two values for each iteration: a key and a value.
In the same way that let-values binds multiple results to multiple identifiers, for can bind multiple sequence elements to multiple iteration identifiers:
While let must be changed to let-values to bind multiple identifiers, for simply allows a parenthesized list of identifiers instead of a single identifier in any clause.
> (for ([(k v) #hash(("apple" . 1) ("banana" . 3))])
    (printf "~a count: ~a\n" k v))
apple count: 1
banana count: 3
This extension to multiple-value bindings works for all for variants. For example, for*/list nests iterations, builds a list, and also works with multiple-valued sequences:
> (for*/list ([(k v) #hash(("apple" . 1) ("banana" . 3))]
              [(i) (in-range v)])
    k)
'("apple" "banana" "banana" "banana")
11.9 Iteration Performance
Ideally, a for iteration should run as fast as a loop that you write by hand as a recursive-function invocation. A hand-written loop, however, is normally specific to a particular kind of data, such as lists. In that case, the hand-written loop uses selectors like car and cdr directly, instead of handling all forms of sequences and dispatching to an appropriate iterator.
The for forms can provide the performance of hand-written loops when enough information is apparent about the sequences to iterate. Specifically, the clause should have one of the following fast-clause forms:
fast-clause = [id fast-seq]
            | [(id) fast-seq]
            | [(id id) fast-indexed-seq]
            | [(id ...) fast-parallel-seq]
fast-seq = (in-range expr)
         | (in-range expr expr)
         | (in-range expr expr expr)
         | (in-naturals)
         | (in-naturals expr)
         | (in-list expr)
         | (in-vector expr)
         | (in-string expr)
         | (in-bytes expr)
         | (in-value expr)
         | (stop-before fast-seq predicate-expr)
         | (stop-after fast-seq predicate-expr)
fast-indexed-seq = (in-indexed fast-seq)
                 | (stop-before fast-indexed-seq predicate-expr)
                 | (stop-after fast-indexed-seq predicate-expr)
fast-parallel-seq = (in-parallel fast-seq ...)
                  | (stop-before fast-parallel-seq predicate-expr)
                  | (stop-after fast-parallel-seq predicate-expr)
Examples:
> (time (for ([i (in-range 100000)])
          (for ([elem (in-list '(a b c d e f g h))]) ; fast
            (void))))
cpu time: 4 real time: 3 gc time: 0
> (time (for ([i (in-range 100000)])
          (for ([elem '(a b c d e f g h)]) ; slower
            (void))))
cpu time: 37 real time: 38 gc time: 0
> (time (let ([seq (in-list '(a b c d e f g h))])
          (for ([i (in-range 100000)])
            (for ([elem seq]) ; slower
              (void)))))
cpu time: 83 real time: 83 gc time: 38
The grammars above are not complete, because the set of syntactic patterns that provide good performance is extensible, just like the set of sequence values. The documentation for a sequence constructor should indicate the performance benefits of using it directly in a for clause.
Section 2.18, "Iterations and Comprehensions: for, for/list, ...", in The Racket Reference provides more on iterations and comprehensions.
12 Pattern Matching
The match form supports pattern matching on arbitrary Racket values, as opposed to functions like regexp-match that compare regular expressions to byte and character sequences (see chapter 9, Regular Expressions).
> (match 2
    [1 'one]
    [2 'two]
    [3 'three])
'two
> (match #f
    [#t 'yes]
    [#f 'no])
'no
> (match "apple"
    ['apple 'symbol]
    ["apple" 'string]
    [#f 'boolean])
'string
Constructors like cons, list, and vector can be used to create patterns that match pairs, lists, and vectors:
> (match '(1 2)
    [(list 0 1) 'one]
    [(list 1 2) 'two])
'two
> (match '(1 . 2)
    [(list 1 2) 'list]
    [(cons 1 2) 'pair])
'pair
> (match #(1 2)
    [(list 1 2) 'list]
    [(vector 1 2) 'vector])
'vector
A constructor bound with struct also can be used as a pattern constructor:
> (struct shoe (size color))
> (struct hat (size style))
> (match (hat 23 'bowler)
    [(shoe 10 'white) "bottom"]
    [(hat 23 'bowler) "top"])
"top"
Unquoted, non-constructor identifiers in a pattern are pattern variables that are bound in the result expressions:
> (match '(1)
    [(list x) (+ x 1)]
    [(list x y) (+ x y)])
2
> (match '(1 2)
    [(list x) (+ x 1)]
    [(list x y) (+ x y)])
3
> (match (hat 23 'bowler)
    [(shoe sz col) sz]
    [(hat sz stl) sz])
23
An ellipsis, written ..., acts like a Kleene star within a list or vector pattern: the preceding sub-pattern can be used to match any number of times for any number of consecutive elements of the list or vector. If a sub-pattern followed by an ellipsis includes a pattern variable, the variable matches multiple times, and it is bound in the result expression to a list of matches:
> (match '(1 1 1)
    [(list 1 ...) 'ones]
    [else 'other])
'ones
> (match '(1 1 2)
    [(list 1 ...) 'ones]
    [else 'other])
'other
> (match '(1 2 3 4)
    [(list 1 x ... 4) x])
'(2 3)
> (match (list (hat 23 'bowler) (hat 22 'pork-pie))
    [(list (hat sz styl) ...) (apply + sz)])
45
Ellipses can be nested to match nested repetitions, and in that case, pattern variables can be bound to lists of lists of matches:
> (match '((! 1) (! 2 2) (! 3 3 3))
    [(list (list '! x ...) ...) x])
'((1) (2 2) (3 3 3))
The quasiquote form (see section 4.11, Quasiquoting: quasiquote and `, for more about it) can also be used to build patterns. While unquoted portions of a normal quasiquoted form mean regular Racket evaluation, here unquoted portions mean go back to regular pattern matching.
So, in the example below, the with expression is the pattern and it gets rewritten into the application expression, using quasiquote as a pattern in the first instance and quasiquote to build an expression in the second.
> (match `{with {x 1} {+ x 1}}
    [`{with {,id ,rhs} ,body}
     `{{lambda {,id} ,body} ,rhs}])
'((lambda (x) (+ x 1)) 1)
For information on many more pattern forms, see racket/match.
Forms like match-let and match-lambda support patterns in positions that otherwise must be identifiers. For example, match-let generalizes let to a destructuring bind:
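A short sketch of both forms, with names of our own (reversed and add-pair): match-let takes a pattern where let takes an identifier, and match-lambda builds a one-argument function whose clauses are match clauses.

```racket
#lang racket

;; The pattern (list x y z) destructures the list in the binding position.
(define reversed
  (match-let ([(list x y z) '(1 2 3)])
    (list z y x)))

;; A function that matches its single argument against a pair pattern.
(define add-pair
  (match-lambda
    [(cons a b) (+ a b)]))

reversed             ; '(3 2 1)
(add-pair '(1 . 2))  ; 3
```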
13 Classes and Objects
(define fish%
  (class object%
    (init size)                ; initialization argument
    (define current-size size) ; field
    (super-new)                ; superclass initialization
    (define/public (get-size)
      current-size)
    (define/public (grow amt)
      (set! current-size (+ amt current-size)))
    (define/public (eat other-fish)
      (grow (send other-fish get-size)))))
The size initialization argument must be supplied via a named argument when instantiating the class through the new form:
(define fish% (class object% (init size) ....))
(define charlie (new fish% [size 10]))
In the definition of fish%, current-size is a private field that starts out with the value of the size initialization argument. Initialization arguments like size are available only during class instantiation, so they cannot be referenced directly from a method. The current-size field, in contrast, is available to methods.
The (super-new) expression in fish% invokes the initialization of the superclass. In this case, the superclass is object%, which takes no initialization arguments and performs no work; super-new must be used, anyway, because a class must always invoke its superclass's initialization.
Initialization arguments, field declarations, and expressions such as (super-new) can appear in any order within a class, and they can be interleaved with method declarations. The relative order of expressions in the class determines the order of evaluation during instantiation. For example, if a field's initial value requires calling a method that works only after superclass initialization, then the field declaration must be placed after the super-new call. Ordering field and initialization declarations in this way helps avoid imperative assignment. The relative order of method declarations makes no difference for evaluation, because methods are fully defined before a class is instantiated.
13.1 Methods
Each of the three define/public declarations in fish% introduces a new method. The declaration uses the same syntax as a Racket function, but a method is not accessible as an independent function. A call to the grow method of a fish% object requires the send form:
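A self-contained sketch of such calls follows. It repeats the fish% definition described in this chapter so that it can run on its own; the name size-after-grow is ours.

```racket
#lang racket

(define fish%
  (class object%
    (init size)                ; initialization argument
    (define current-size size) ; field
    (super-new)                ; superclass initialization
    (define/public (get-size)
      current-size)
    (define/public (grow amt)
      (set! current-size (+ amt current-size)))
    (define/public (eat other-fish)
      (grow (send other-fish get-size)))))

(define charlie (new fish% [size 10]))

;; Methods are reached through send, not by calling grow as a function.
(send charlie grow 6)
(define size-after-grow (send charlie get-size))

size-after-grow  ; 16
```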
(define hungry-fish% (class fish% (super-new) (define/public (eat-more fish1 fish2) (send this eat fish1) (send this eat fish2))))
Alternately, the class can declare the existence of a method using inherit, which brings the method name into scope for a direct call: 224
(define hungry-fish% (class fish% (super-new) (inherit eat) (define/public (eat-more fish1 fish2) (eat fish1) (eat fish2))))
With the inherit declaration, if fish% had not provided an eat method, an error would be signaled in the evaluation of the class form for hungry-fish%. In contrast, with (send this ....), an error would not be signaled until the eat-more method is called and the send form is evaluated. For this reason, inherit is preferred. Another drawback of send is that it is less efcient than inherit. Invocation of a method via send involves nding a method in the target objects class at run time, making send comparable to an interface-based method call in Java. In contrast, inherit-based method invocations use an offset within the classs method table that is computed when the class is created. To achieve performance similar to inherit-based method calls when invoking a method from outside the methods class, the programmer must use the generic form, which produces a class- and method-specic generic method to be invoked with send-generic:
(define get-fish-size (generic fish% get-size))
> (send-generic charlie get-fish-size)
16
> (send-generic (new hungry-fish% [size 32]) get-fish-size)
32
> (send-generic (new object%) get-fish-size)
generic:get-size for class: fish%: expected argument of type <instance for class: fish%>; given: (object)
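The difference in when errors are detected can be observed directly. The following is a minimal sketch, not part of the original example set: fish% is a small stand-in for the class defined earlier in the chapter, lazy% is a hypothetical class, and the result names are made up for illustration.

```racket
#lang racket

;; Stand-in for the chapter's fish% class (an assumption of this sketch).
(define fish%
  (class object%
    (init size)
    (define current-size size)
    (super-new)
    (define/public (get-size) current-size)
    (define/public (grow amt) (set! current-size (+ amt current-size)))
    (define/public (eat other-fish) (grow (send other-fish get-size)))))

;; inherit is checked when the class form is evaluated: object% has no
;; eat method, so the class is never created.
(define inherit-result
  (with-handlers ([exn:fail? (lambda (e) 'error-at-class-creation)])
    (class object%
      (super-new)
      (inherit eat)
      (define/public (eat-more f1 f2) (eat f1) (eat f2)))
    'class-created))

;; send is checked only when the call itself is evaluated, so this
;; hypothetical class is created without complaint.
(define lazy%
  (class object%
    (super-new)
    (define/public (eat-more f1 f2)
      (send this eat f1)
      (send this eat f2))))

(define send-result
  (with-handlers ([exn:fail? (lambda (e) 'error-at-call)])
    (send (new lazy%) eat-more 'a 'b)
    'call-succeeded))

;; A generic method looks up get-size in fish% once, ahead of the calls.
(define get-fish-size (generic fish% get-size))
(define charlie (new fish% [size 16]))
(define generic-result (send-generic charlie get-fish-size))
```

Here inherit-result records that the failure happens at class creation, while send-result records that the failure is deferred until eat-more is called.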
Roughly speaking, the form translates the class and the external method name to a location in the class's method table. As illustrated by the last example, sending through a generic method checks that its argument is an instance of the generic's class.
Whether a method is called directly within a class, through a generic method, or through send, method overriding works in the usual way:
(define picky-fish%
  (class fish%
    (super-new)
    (define/override (grow amt)
      (super grow (* 3/4 amt)))))
(define daisy (new picky-fish% [size 20]))
> (send daisy eat charlie)
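To see the effect of the override, the example can be run end to end. This is a sketch under the assumption that fish% looks like the minimal stand-in below and that charlie has size 16, as in the generic example above.

```racket
#lang racket

;; Stand-in for the chapter's fish% class (an assumption of this sketch).
(define fish%
  (class object%
    (init size)
    (define current-size size)
    (super-new)
    (define/public (get-size) current-size)
    (define/public (grow amt) (set! current-size (+ amt current-size)))
    (define/public (eat other-fish) (grow (send other-fish get-size)))))

(define picky-fish%
  (class fish%
    (super-new)
    ;; Only three quarters of the offered amount is passed to fish%'s grow.
    (define/override (grow amt)
      (super grow (* 3/4 amt)))))

(define charlie (new fish% [size 16]))
(define daisy (new picky-fish% [size 20]))

;; eat is inherited from fish%, but its internal call to grow dispatches
;; to the overriding method in picky-fish%.
(send daisy eat charlie)
(define daisy-size (send daisy get-size)) ; 20 + 3/4 * 16
```

The inherited eat method reaches the overriding grow, so daisy grows by 12 rather than 16.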
13.2
Initialization Arguments
Since picky-fish% declares no initialization arguments, any initialization values supplied in (new picky-fish% ....) are propagated to the superclass initialization, i.e., to fish%. A subclass can supply additional initialization arguments for its superclass in a super-new call, and such initialization arguments take precedence over arguments supplied to new. For example, the following size-10-fish% class always generates fish of size 10:
(define size-10-fish% (class fish% (super-new [size 10])))
> (send (new size-10-fish%) get-size)
10
In the case of size-10-fish%, supplying a size initialization argument with new would result in an initialization error; because the size in super-new takes precedence, a size supplied to new would have no target declaration. An initialization argument is optional if the class form declares a default value. For example, the following default-10-fish% class accepts a size initialization argument, but its value defaults to 10 if no value is supplied on instantiation:
(define default-10-fish%
  (class fish%
    (init [size 10])
    (super-new [size size])))
> (new default-10-fish%)
(object:default-10-fish% ...)
> (new default-10-fish% [size 20])
(object:default-10-fish% ...)
In this example, the super-new call propagates its own size value as the size initialization argument to the superclass.
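The two classes can be compared side by side. The sketch below assumes the same minimal fish% stand-in used for illustration elsewhere in this chapter; the names sizes and extra-size-result are hypothetical.

```racket
#lang racket

;; Stand-in for the chapter's fish% class (an assumption of this sketch).
(define fish%
  (class object%
    (init size)
    (define current-size size)
    (super-new)
    (define/public (get-size) current-size)))

;; Always size 10: the size in super-new takes precedence.
(define size-10-fish% (class fish% (super-new [size 10])))

;; Size 10 unless the caller supplies another value.
(define default-10-fish%
  (class fish%
    (init [size 10])
    (super-new [size size])))

(define sizes
  (list (send (new size-10-fish%) get-size)
        (send (new default-10-fish%) get-size)
        (send (new default-10-fish% [size 20]) get-size)))

;; A size given to size-10-fish% has no target declaration, so
;; instantiation fails.
(define extra-size-result
  (with-handlers ([exn:fail? (lambda (e) 'initialization-error)])
    (new size-10-fish% [size 5])
    'created))
```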
13.3
Internal and External Names
The two uses of size in default-10-fish% expose the double life of class-member identifiers. When size is the first identifier of a bracketed pair in new or super-new, size is an external name that is symbolically matched to an initialization argument in a class. When size appears as an expression within default-10-fish%, size is an internal name that is lexically scoped. Similarly, a call to an inherited eat method uses eat as an internal name, whereas a send of eat uses eat as an external name.
The full syntax of the class form allows a programmer to specify distinct internal and external names for a class member. Since internal names are local, they can be renamed to avoid shadowing or conflicts. Such renaming is not frequently necessary, but workarounds in the absence of renaming can be especially cumbersome.
13.4
Interfaces
Interfaces are useful for checking that an object or a class implements a set of methods with a particular (implied) behavior. This use of interfaces is helpful even without a static type system (which is the main reason that Java has interfaces). An interface in Racket is created using the interface form, which merely declares the method names required to implement the interface. An interface can extend other interfaces, which means that implementations of the interface automatically implement the extended interfaces. To declare that a class implements an interface, the class* form is used instead of class:
(define fish-interface (interface () get-size grow eat))
(define fish% (class* object% (fish-interface) ....))
If the definition of fish% does not include get-size, grow, and eat methods, then an error is signaled in the evaluation of the class* form, because implementing the fish-interface interface requires those methods.
The is-a? predicate accepts an object as its first argument and either a class or interface as its second argument. When given a class, is-a? checks whether the object is an instance of that class or a derived class. When given an interface, is-a? checks whether the object's class implements the interface. In addition, the implementation? predicate checks whether a given class implements a given interface.
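The following sketch exercises is-a? and implementation? on a small class. The body of fish% is a minimal stand-in assumed for illustration, and the result names are hypothetical.

```racket
#lang racket

(define fish-interface (interface () get-size grow eat))

;; Stand-in body for fish% (an assumption of this sketch).
(define fish%
  (class* object% (fish-interface)
    (init size)
    (define current-size size)
    (super-new)
    (define/public (get-size) current-size)
    (define/public (grow amt) (set! current-size (+ amt current-size)))
    (define/public (eat other-fish) (grow (send other-fish get-size)))))

(define charlie (new fish% [size 10]))

;; The object comes first, then the class or interface.
(define checks
  (list (is-a? charlie fish%)
        (is-a? charlie fish-interface)
        (is-a? (new object%) fish-interface)
        (implementation? fish% fish-interface)))

;; A class* form that omits the required methods fails when evaluated.
(define missing-methods-result
  (with-handlers ([exn:fail? (lambda (e) 'error-at-class-creation)])
    (class* object% (fish-interface) (super-new))
    'created))
```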
13.5
Final, Augment, and Inner
As in Java, a method in a class form can be specified as final, which means that a subclass cannot override the method. A final method is declared using public-final or override-final, depending on whether the declaration is for a new method or an overriding implementation.
Between the extremes of allowing arbitrary overriding and disallowing overriding entirely, the class system also supports Beta-style augmentable methods [Goldberg04]. A method declared with pubment is like public, but the method cannot be overridden in subclasses; it can be augmented only. A pubment method must explicitly invoke an augmentation (if any) using inner; a subclass augments the method using augment, instead of override.
In general, a method can switch between augment and override modes in a class derivation. The augride method specification indicates an augmentation to a method where the augmentation is itself overrideable in subclasses (though the superclass's implementation cannot be overridden). Similarly, overment overrides a method and makes the overriding implementation augmentable.
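As a minimal sketch of these declarations, the following uses the define/ shorthands for pubment, augment, and public-final. The classes base%, derived%, and sealed% and the method names are hypothetical, chosen only for illustration.

```racket
#lang racket

;; describe is augmentable: the superclass stays in control and asks
;; for an augmentation with inner, using "" when there is none.
(define base%
  (class object%
    (super-new)
    (define/pubment (describe)
      (string-append "base" (inner "" describe)))))

;; A subclass augments the method instead of overriding it.
(define derived%
  (class base%
    (super-new)
    (define/augment (describe) "+derived")))

(define base-description (send (new base%) describe))
(define derived-description (send (new derived%) describe))

;; A final method cannot be overridden; the attempt fails when the
;; subclass is created.
(define sealed%
  (class object%
    (super-new)
    (define/public-final (id) 'sealed)))

(define override-final-result
  (with-handlers ([exn:fail? (lambda (e) 'error-at-class-creation)])
    (class sealed%
      (super-new)
      (define/override (id) 'changed))
    'created))
```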
13.6
Controlling the Scope of External Names
As noted in 13.3 Internal and External Names, class members have both internal and external names. A member definition binds an internal name locally, and this binding can be locally renamed. External names, in contrast, have global scope by default, and a member definition does not bind an external name. Instead, a member definition refers to an existing binding for an external name, where the member name is bound to a member key; a class ultimately maps member keys to methods, fields, and initialization arguments.
Recall the hungry-fish% class expression:
(define hungry-fish%
  (class fish% ....
    (inherit eat)
    (define/public (eat-more fish1 fish2)
      (eat fish1)
      (eat fish2))))
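During its evaluation, the hungry-fish% and fish% classes refer to the same global binding of eat. At run time, calls to eat in hungry-fish% are matched with the eat method in fish% through the shared method key that is bound to eat. The default binding for an external name is global, but a programmer can introduce an external-name binding with the define-member-name form: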
(define-member-name id member-key-expr )
In particular, by using (generate-member-key) as the member-key-expr, an external name can be localized for a particular scope, because the generated member key is inaccessible outside the scope. In other words, define-member-name gives an external name a kind of package-private scope, but generalized from packages to arbitrary binding scopes in Racket. For example, the following fish% and pond% classes cooperate via a get-depth method that is only accessible to the cooperating classes:
(define-values (fish% pond%) ; two mutually recursive classes
  (let ()
    (define-member-name get-depth (generate-member-key))
    (define fish%
      (class ....
        (define my-depth ....)
        (define my-pond ....)
        (define/public (dive amt)
          (set! my-depth
                (min (+ my-depth amt)
                     (send my-pond get-depth))))))
    (define pond%
      (class ....
        (define current-depth ....)
        (define/public (get-depth) current-depth)))
    (values fish% pond%)))
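The elided parts of this example can be filled in to obtain something runnable. In the sketch below, the pond initialization field, the starting depths, and the names dive-results and outside-result are assumptions made for illustration; they are not taken from the original example.

```racket
#lang racket

(define-values (fish% pond%)
  (let ()
    ;; Inside this scope, get-depth refers to a freshly generated key.
    (define-member-name get-depth (generate-member-key))
    (define fish%
      (class object%
        (init-field pond)          ; assumed: the pond is supplied to new
        (define my-depth 0)        ; assumed starting depth
        (super-new)
        (define/public (dive amt)
          (set! my-depth
                (min (+ my-depth amt)
                     (send pond get-depth)))
          my-depth)))
    (define pond%
      (class object%
        (super-new)
        (define current-depth 10)  ; assumed pond depth
        (define/public (get-depth) current-depth)))
    (values fish% pond%)))

(define a-pond (new pond%))
(define a-fish (new fish% [pond a-pond]))

;; fish% can call get-depth, so dives are capped at the pond's depth.
(define dive-results
  (list (send a-fish dive 4) (send a-fish dive 100)))

;; Outside the let, get-depth names the default global key, which
;; pond% does not map to any method.
(define outside-result
  (with-handlers ([exn:fail:object? (lambda (e) 'no-such-method)])
    (send a-pond get-depth)))
```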
External names are in a namespace that separates them from other Racket names. This separate namespace is implicitly used for the method name in send, for initialization-argument names in new, or for the external name in a member definition. The special form member-name-key provides access to the binding of an external name in an arbitrary expression position: (member-name-key id) produces the member-key binding of id in the current scope.
A member-key value is primarily used with a define-member-name form. Normally, then, (member-name-key id) captures the method key of id so that it can be communicated to a use of define-member-name in a different scope. This capability turns out to be useful for generalizing mixins, as discussed next.
13.7
Mixins
Since class is an expression form instead of a top-level declaration as in Smalltalk and Java, a class form can be nested inside any lexical scope, including lambda. The result is a mixin, i.e., a class extension that is parameterized with respect to its superclass. For example, we can parameterize the picky-fish% class over its superclass to define picky-mixin:
(define (picky-mixin %)
  (class % (super-new)
    (define/override (grow amt) (super grow (* 3/4 amt)))))
(define picky-fish% (picky-mixin fish%))
Many small differences between Smalltalk-style classes and Racket classes contribute to the effective use of mixins. In particular, the use of define/override makes explicit that picky-mixin expects a class with a grow method. If picky-mixin is applied to a class without a grow method, an error is signaled as soon as picky-mixin is applied. Similarly, a use of inherit enforces a method existence requirement when the mixin is applied:
(define (hungry-mixin %)
  (class % (super-new)
    (inherit eat)
    (define/public (eat-more fish1 fish2)
      (eat fish1)
      (eat fish2))))
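The advantage of mixins is that we can easily combine them to create new classes whose implementation sharing does not fit into a single-inheritance hierarchy, without the ambiguities associated with multiple inheritance. Equipped with picky-mixin and hungry-mixin, creating a class for a hungry, yet picky fish is straightforward: apply one mixin to the result of the other. Because the mixins declare no initialization arguments of their own, they can also extend an unrelated class such as person%, as long as it has suitable eat and grow methods: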
(define person%
  (class object%
    (init name age)
    ....
    (define/public (eat food) ....)
    (define/public (grow amt) ....)))
(define child% (hungry-mixin (picky-mixin person%)))
(define oliver (new child% [name "Oliver"] [age 6]))
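The same composition applied to fish% can be run as a whole. This is a sketch that assumes the minimal fish% stand-in below; the names picky-hungry-fish%, nemo, and nemo-size are chosen here for illustration.

```racket
#lang racket

;; Stand-in for the chapter's fish% class (an assumption of this sketch).
(define fish%
  (class object%
    (init size)
    (define current-size size)
    (super-new)
    (define/public (get-size) current-size)
    (define/public (grow amt) (set! current-size (+ amt current-size)))
    (define/public (eat other-fish) (grow (send other-fish get-size)))))

(define (picky-mixin %)
  (class % (super-new)
    (define/override (grow amt) (super grow (* 3/4 amt)))))

(define (hungry-mixin %)
  (class % (super-new)
    (inherit eat)
    (define/public (eat-more fish1 fish2)
      (eat fish1)
      (eat fish2))))

;; Both extensions stacked on fish%.
(define picky-hungry-fish% (hungry-mixin (picky-mixin fish%)))

(define nemo (new picky-hungry-fish% [size 20]))
(send nemo eat-more (new fish% [size 8]) (new fish% [size 4]))
(define nemo-size (send nemo get-size)) ; 20 + 3/4 * 8 + 3/4 * 4
```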
Finally, the use of external names for class members (instead of lexically scoped identifiers) makes mixin use convenient. Applying picky-mixin to person% works because the names eat and grow match, without any a priori declaration that eat and grow should be the same method in fish% and person%. This feature is a potential drawback when member names collide accidentally; some accidental collisions can be corrected by limiting the scope of external names, as discussed in 13.6 Controlling the Scope of External Names.
13.7.1
Mixins and Interfaces
Using implementation?, picky-mixin could require that its base class implements grower-interface, which could be implemented by both fish% and person%:
(define grower-interface (interface () grow))
(define (picky-mixin %)
  (unless (implementation? % grower-interface)
    (error "picky-mixin: not a grower-interface class"))
  (class % ....))
Another use of interfaces with a mixin is to tag classes generated by the mixin, so that instances of the mixin can be recognized. In other words, is-a? cannot work on a mixin represented as a function, but it can recognize an interface (somewhat like a specialization interface) that is consistently implemented by the mixin. For example, classes generated by picky-mixin could be tagged with picky-interface, enabling the is-picky? predicate:
(define picky-interface (interface ()))
(define (picky-mixin %)
  (unless (implementation? % grower-interface)
    (error "picky-mixin: not a grower-interface class"))
  (class* % (picky-interface) ....))
(define (is-picky? o)
  (is-a? o picky-interface))
13.7.2 The mixin Form
To codify the lambda-plus-class pattern for implementing mixins, including the use of interfaces for the domain and range of the mixin, the class system provides a mixin macro:
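As a sketch of how the macro might be used for the picky example, the first parenthesized group below lists the interfaces that the base class must implement and the second lists the interfaces that the resulting class implements. The body of fish% is a minimal stand-in, and the result names are hypothetical.

```racket
#lang racket

(define grower-interface (interface () grow))
(define picky-interface (interface ()))

;; Domain: classes implementing grower-interface.
;; Range: classes implementing picky-interface.
(define picky-mixin
  (mixin (grower-interface) (picky-interface)
    (super-new)
    (define/override (grow amt)
      (super grow (* 3/4 amt)))))

;; Stand-in fish% that declares grower-interface (an assumption).
(define fish%
  (class* object% (grower-interface)
    (init size)
    (define current-size size)
    (super-new)
    (define/public (get-size) current-size)
    (define/public (grow amt) (set! current-size (+ amt current-size)))))

(define picky-fish% (picky-mixin fish%))
(define daisy (new picky-fish% [size 20]))
(send daisy grow 8)
(define daisy-size (send daisy get-size)) ; 20 + 3/4 * 8
(define daisy-picky? (is-a? daisy picky-interface))

;; object% does not implement grower-interface, so the mixin rejects it.
(define bad-base-result
  (with-handlers ([exn:fail? (lambda (e) 'rejected)])
    (picky-mixin object%)
    'accepted))
```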
13.7.3
Parameterized Mixins
As noted in 13.6 Controlling the Scope of External Names, external names can be bound with define-member-name. This facility allows a mixin to be generalized with respect to the methods that it defines and uses. For example, we can parameterize hungry-mixin with respect to the external member key for eat:
(define (make-hungry-mixin eat-method-key)
  (define-member-name eat eat-method-key)
  (mixin () () (super-new)
    (inherit eat)
    (define/public (eat-more x y) (eat x) (eat y))))
To obtain a particular hungry-mixin, we must apply this function to a member key that refers to a suitable eat method, which we can obtain using member-name-key:
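A minimal sketch of such an application follows. It is written with a plain lambda and class rather than the mixin form, so that it does not depend on the consistency checks that mixin performs against its interface lists; muncher% and its munch method are hypothetical names introduced for illustration.

```racket
#lang racket

;; Same idea as make-hungry-mixin, written with lambda and class.
(define (make-hungry-mixin eat-method-key)
  ;; Within this body, the external name eat stands for whatever
  ;; member key the caller supplies.
  (define-member-name eat eat-method-key)
  (lambda (%)
    (class % (super-new)
      (inherit eat)
      (define/public (eat-more x y)
        (list (eat x) (eat y))))))

;; A hypothetical class whose eating method is called munch.
(define muncher%
  (class object%
    (super-new)
    (define/public (munch x) (list 'munched x))))

;; Supplying the key for munch makes eat-more call munch.
(define hungry-muncher%
  ((make-hungry-mixin (member-name-key munch)) muncher%))

(define eat-more-result
  (send (new hungry-muncher%) eat-more 'kelp 'plankton))
```

Passing (member-name-key eat) instead would recover the original hungry-mixin behavior for classes that have an eat method.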
13.8
Traits
A trait is similar to a mixin, in that it encapsulates a set of methods to be added to a class. A trait is different from a mixin in that its individual methods can be manipulated with trait operators such as trait-sum (merge the methods of two traits), trait-exclude (remove a method from a trait), and trait-alias (add a copy of a method with a new name; do not redirect any calls to the old name).
The practical difference between mixins and traits is that two traits can be combined, even if they include a common method and even if neither method can sensibly override the other. In that case, the programmer must explicitly resolve the collision, usually by aliasing methods, excluding methods, and merging a new trait that uses the aliases.
Suppose our fish% programmer wants to define two class extensions, spots and stripes, each of which includes a get-color method. The fish's spot color should not override the stripe color nor vice versa; instead, a spots+stripes-fish% should combine the two colors, which is not possible if spots and stripes are implemented as plain mixins. If, however, spots and stripes are implemented as traits, they can be combined. First, we alias get-color in each trait to a non-conflicting name. Second, the get-color methods are removed from both and the traits with only aliases are merged. Finally, the new trait is used to create a class that introduces its own get-color method based on the two aliases, producing the desired spots+stripes extension.
13.8.1
Traits as Sets of Mixins
One natural approach to implementing traits in Racket is as a set of mixins, with one mixin per trait method. For example, we might attempt to define the spots and stripes traits as follows, using association lists to represent sets:
(define spots-trait
  (list (cons 'get-color
              (lambda (%) (class % (super-new)
                            (define/public (get-color)
                              'black))))))
(define stripes-trait
  (list (cons 'get-color
              (lambda (%) (class % (super-new)
                            (define/public (get-color)
                              'red))))))
A set representation, such as the above, allows trait-sum and trait-exclude as simple manipulations; unfortunately, it does not support the trait-alias operator. Although a mixin can be duplicated in the association list, the mixin has a fixed method name, e.g., get-color, and mixins do not support a method-rename operation. To support trait-alias, we must parameterize the mixins over the external method name in the same way that eat was parameterized in 13.7.3 Parameterized Mixins.
To support the trait-alias operation, spots-trait should be represented as:
(define spots-trait
  (list (cons (member-name-key get-color)
              (lambda (get-color-key %)
                (define-member-name get-color get-color-key)
                (class % (super-new)
                  (define/public (get-color) 'black))))))
When the get-color method in spots-trait is aliased to get-trait-color and the get-color method is removed, the resulting trait is the same as
(list (cons (member-name-key get-trait-color)
            (lambda (get-color-key %)
              (define-member-name get-color get-color-key)
              (class % (super-new)
                (define/public (get-color) 'black)))))
To apply a trait T to a class C and obtain a derived class, we use ((trait->mixin T) C). The trait->mixin function supplies each mixin of T with the key for the mixin's method and a partial extension of C:
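One way to write trait->mixin for this association-list encoding is as a fold over the trait, as sketched below. The names aliased-spots-trait, spotted%, and aliased% are hypothetical and serve only to exercise the function.

```racket
#lang racket

(define spots-trait
  (list (cons (member-name-key get-color)
              (lambda (get-color-key %)
                (define-member-name get-color get-color-key)
                (class % (super-new)
                  (define/public (get-color) 'black))))))

;; The same mixin filed under a different key: the effect of aliasing
;; get-color to get-trait-color and then excluding get-color.
(define aliased-spots-trait
  (list (cons (member-name-key get-trait-color)
              (cdr (car spots-trait)))))

;; Each entry pairs a member key (car) with a mixin (cdr); the mixin is
;; given its key and the class built so far.
(define ((trait->mixin T) C)
  (foldr (lambda (m %) ((cdr m) (car m) %)) C T))

(define spotted% ((trait->mixin spots-trait) object%))
(define aliased% ((trait->mixin aliased-spots-trait) object%))

(define colors
  (list (send (new spotted%) get-color)
        (send (new aliased%) get-trait-color)))
```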
13.8.2
Inherit and Super in Traits
This first implementation of traits supports trait-alias, and it supports a trait method that calls itself, but it does not support trait methods that call each other. In particular, suppose that a spot-fish's market value depends on the color of its spots:
(define spots-trait
  (list (cons (member-name-key get-color) ....)
        (cons (member-name-key get-price)
              (lambda (get-price %) ....
                (class % ....
                  (define/public (get-price)
                    .... (get-color) ....))))))
In this case, the definition of spots-trait fails, because get-color is not in scope for the get-price mixin. Indeed, depending on the order of mixin application when the trait is applied to a class, the get-color method may not be available when the get-price mixin is applied to the class. Therefore adding an (inherit get-color) declaration to the get-price mixin does not solve the problem.
One solution is to require the use of (send this get-color) in methods such as get-price. This change works because send always delays the method lookup until the method call is evaluated. The delayed lookup is more expensive than a direct call, however. Worse, it also delays checking whether a get-color method even exists.
A second, effective, and efficient solution is to change the encoding of traits. Specifically, we represent each method as a pair of mixins: one that introduces the method and one that implements it. When a trait is applied to a class, all of the method-introducing mixins are applied first. Then the method-implementing mixins can use inherit to directly access any introduced method.
(define spots-trait
  (list (list (local-member-name-key get-color)
              (lambda (get-color get-price %) ....
                (class % ....
                  (define/public (get-color) (void))))
              (lambda (get-color get-price %) ....
                (class % ....
                  (define/override (get-color) 'black))))
        (list (local-member-name-key get-price)
              (lambda (get-price get-color %) ....
                (class % ....
                  (define/public (get-price) (void))))
              (lambda (get-color get-price %) ....
                (class % ....
                  (inherit get-color)
                  (define/override (get-price)
                    .... (get-color) ....))))))
With this trait encoding, trait-alias adds a new method with a new name, but it does not change any references to the old method.
13.8.3
The trait Form
The general-purpose trait pattern is clearly too complex for a programmer to use directly, but it is easily codified in a trait macro:
The ids in the optional inherit clause are available for direct reference in the method exprs, and they must be supplied either by other traits or the base class to which the trait is ultimately applied. Using this form in conjunction with trait operators such as trait-sum, trait-exclude, trait-alias, and trait->mixin, we can implement spots-trait and stripes-trait as desired.
(define spots-trait
  (trait
    (define/public (get-color) 'black)
    (define/public (get-price) ... (get-color) ...)))
(define stripes-trait
  (trait
    (define/public (get-color) 'red)))
(define spots+stripes-trait
  (trait-sum
    (trait-exclude (trait-alias spots-trait
                                get-color get-spots-color)
                   get-color)
    (trait-exclude (trait-alias stripes-trait
                                get-color get-stripes-color)
                   get-color)
    (trait
      (inherit get-spots-color get-stripes-color)
      (define/public (get-color)
        .... (get-spots-color) .... (get-stripes-color) ....))))
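With the elisions filled in, the combined trait can be applied to a class and tested. The sketch below uses the racket/trait library; it drops get-price for brevity, and combining the two colors into a list is an assumption made here, since the original leaves the combination unspecified. The names spots+stripes-fish% and combined-color are hypothetical.

```racket
#lang racket
(require racket/trait)

(define spots-trait
  (trait
    (define/public (get-color) 'black)))

(define stripes-trait
  (trait
    (define/public (get-color) 'red)))

(define spots+stripes-trait
  (trait-sum
    ;; Keep each color under a non-conflicting alias only.
    (trait-exclude (trait-alias spots-trait
                                get-color get-spots-color)
                   get-color)
    (trait-exclude (trait-alias stripes-trait
                                get-color get-stripes-color)
                   get-color)
    ;; Introduce a new get-color built from the two aliases.
    (trait
      (inherit get-spots-color get-stripes-color)
      (define/public (get-color)
        (list (get-spots-color) (get-stripes-color))))))

;; Applied to object% here; any base class would do for this sketch.
(define spots+stripes-fish%
  ((trait->mixin spots+stripes-trait) object%))

(define combined-color (send (new spots+stripes-fish%) get-color))
```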
13.9
Class Contracts
As classes are values, they can flow across contract boundaries, and we may wish to protect parts of a given class with contracts. For this, the class/c form is used. The class/c form has many subforms, which describe two types of contracts on fields and methods: those that affect uses via instantiated objects and those that affect subclasses.
13.9.1
External Class Contracts
In its simplest form, class/c protects the public fields and methods of objects instantiated from the contracted class. There is also an object/c form that can be used to similarly protect the public fields and methods of a particular object. Take the following definition of animal%, which uses a public field for its size attribute:
(define animal%
  (class object%
    (super-new)
    (field [size 10])
    (define/public (eat food)
      (set! size (+ size (get-field size food))))))
For any instantiated animal%, accessing the size field should return a positive number. Also, if the size field is set, it should be assigned a positive number. Finally, the eat method should receive an argument which is an object with a size field that contains a positive number. To ensure these conditions, we will define the animal% class with an appropriate contract:
(define positive/c (and/c number? positive?))
(define edible/c (object/c (field [size positive/c])))
(define/contract animal%
  (class/c (field [size positive/c])
           [eat (->m edible/c void?)])
  (class object%
    (super-new)
    (field [size 10])
    (define/public (eat food)
      (set! size (+ size (get-field size food))))))
Here we use ->m to describe the behavior of eat since we do not need to describe any requirements for the this parameter. Now that we have our contracted class, we can see that the contracts on both size and eat are enforced:
> (define bob (new animal%))
> (set-field! size bob 3)
> (get-field size bob)
3
> (set-field! size bob 'large)
animal%: contract violation, expected: positive/c, given: large
  contract from: (definition animal%), blaming: top-level
  contract: (class/c (eat (-> (object/c (field (size positive/c))) void?)) (field (size positive/c)))
> (define richie (new animal%))
> (send bob eat richie)
> (get-field size bob)
13
> (define rock (new object%))
> (send bob eat rock)
animal%: contract violation, no public field size
  contract from: (definition animal%), blaming: top-level
  contract: (class/c (eat (-> (object/c (field (size positive/c))) void?)) (field (size positive/c)))
  at: eval:22.0
> (define giant (new (class object% (super-new) (field [size 'large]))))
> (send bob eat giant)
animal%: contract violation, expected: positive/c, given: large
  contract from: (definition animal%), blaming: top-level
  contract: (class/c (eat (-> (object/c (field (size positive/c))) void?)) (field (size positive/c)))
  at: eval:22.0
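Outside the interactions window, the same checks can be written as a module that catches each violation instead of printing it. This is a sketch: the helper violation? and the result names are hypothetical, and it assumes that contract violations are raised as exn:fail:contract exceptions.

```racket
#lang racket

(define positive/c (and/c number? positive?))
(define edible/c (object/c (field [size positive/c])))

(define/contract animal%
  (class/c (field [size positive/c])
           [eat (->m edible/c void?)])
  (class object%
    (super-new)
    (field [size 10])
    (define/public (eat food)
      (set! size (+ size (get-field size food))))))

;; Hypothetical helper: runs a thunk and reports whether it raised a
;; contract violation.
(define (violation? thunk)
  (with-handlers ([exn:fail:contract? (lambda (e) #t)])
    (thunk)
    #f))

(define bob (new animal%))
(send bob eat (new animal%))          ; a well-behaved meal: 10 + 10
(define bob-size (get-field size bob))

;; A symbol is not a positive number.
(define bad-field?
  (violation? (lambda () (set-field! size bob 'large))))

;; A plain object has no size field, so it is not edible.
(define bad-food?
  (violation? (lambda () (send bob eat (new object%)))))
```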
There are two important caveats for external class contracts. First, external method contracts are only enforced when the target of dynamic dispatch is the method implementation of the contracted class, which lies within the contract boundary. Overriding that implementation, and thus changing the target of dynamic dispatch, will mean that the contract is no longer enforced for clients, since accessing the method no longer crosses the contract boundary. Unlike external method contracts, external field contracts are always enforced for clients of subclasses, since fields cannot be overridden or shadowed.
Second, these contracts do not restrict subclasses of animal% in any way. Fields and methods that are inherited and used by subclasses are not checked by these contracts, and uses of the superclass's methods via super are also unchecked. The following example illustrates both caveats:
(define large-animal%
  (class animal%
    (super-new)
    (inherit-field size)
    (set! size 'large)
    (define/override (eat food)
      (display "Nom nom nom") (newline))))
> (define elephant (new large-animal%))
> (send elephant eat (new object%))
Nom nom nom
> (get-field size elephant)
animal%: self-contract violation, expected: positive/c, given: large
  contract from: (definition animal%), blaming: (definition animal%)
  contract: (class/c (eat (-> (object/c (field (size positive/c))) void?)) (field (size positive/c)))
  at: eval:22.0
13.9.2
Internal Class Contracts
Notice that retrieving the size field from the object elephant blames animal% for the contract violation. This blame is correct, but unfair to the animal% class, as we have not yet provided it with a method for protecting itself from subclasses. To this end we add internal class contracts, which provide directives to subclasses for how they may access and override features of the superclass. This distinction between external and internal class contracts allows for weaker contracts within the class hierarchy, where invariants may be broken internally by subclasses but should be enforced for external uses via instantiated objects.
As a simple example of what kinds of protection are available, we provide an example aimed at the animal% class that uses all the applicable forms:
(class/c (field [size positive/c])
         (inherit-field [size positive/c])
         [eat (->m edible/c void?)]
         (inherit [eat (->m edible/c void?)])
         (super [eat (->m edible/c void?)])
         (override [eat (->m edible/c void?)]))
This class contract not only ensures that objects of class animal% are protected as before, but also ensures that subclasses of animal% only store appropriate values within the size field and use the implementation of eat from animal% appropriately. These contract forms only affect uses within the class hierarchy, and only for method calls that cross the contract boundary.
That means that inherit will only affect subclass uses of a method until a subclass overrides that method, and that override only affects calls from the superclass into a subclass's overriding implementation of that method. Since these only affect internal uses, the override form does not automatically enter subclasses into obligations when objects of those classes are used. Also, use of override only makes sense, and thus can only be used, for methods where no Beta-style augmentation has taken place. The following example shows this difference:
(define/contract sloppy-eater%
  (class/c [eat (->m edible/c edible/c)])
  (begin
    (define/contract glutton%
      (class/c (override [eat (->m edible/c void?)]))
      (class animal%
        (super-new)
        (inherit eat)
        (define/public (gulp food-list)
          (for ([f food-list])
            (eat f)))))
    (class glutton%
      (super-new)
      (inherit-field size)
      (define/override (eat f)
        (let ([food-size (get-field size f)])
          (set! size (/ food-size 2))
          (set-field! size f (/ food-size 2))
          f)))))
> (define pig (new sloppy-eater%))
> (define slop1 (new animal%))
> (define slop2 (new animal%))
> (define slop3 (new animal%))
> (send pig eat slop1)
(object:animal% ...)
> (get-field size slop1)
5
> (send pig gulp (list slop1 slop2 slop3))
glutton%: contract violation, expected: void?, given: (object:animal% ...)
  contract from: (definition glutton%), blaming: (definition sloppy-eater%)
  contract: (class/c (override (eat (-> (object/c (field (size positive/c))) void?))))
  at: eval:38.0
In addition to the internal class contract forms shown here, there are similar forms for Beta-style augmentable methods. The inner form describes to the subclass what is expected from augmentations of a given method. Both augment and augride tell the subclass that the given method is a method which has been augmented and that any calls to the method in the subclass will dynamically dispatch to the appropriate implementation in the superclass. Such calls will be checked according to the given contract. The two forms differ in that use of augment signifies that subclasses can augment the given method, whereas use of augride signifies that subclasses must override the current augmentation instead.
This means that not all forms can be used at the same time. Only one of the override, augment, and augride forms can be used for a given method, and none of these forms can be used if the given method has been finalized. In addition, super can be specified for a given method only if augride or override can be specified. Similarly, inner can be specified only if augment or augride can be specified.
14
Units (Components)
Units organize a program into separately compilable and reusable components. A unit resembles a procedure in that both are first-class values that are used for abstraction. While procedures abstract over values in expressions, units abstract over names in collections of definitions. Just as a procedure is called to evaluate its expressions given actual arguments for its formal parameters, a unit is invoked to evaluate its definitions given actual references for its imported variables. Unlike a procedure, however, a unit's imported variables can be partially linked with the exported variables of another unit prior to invocation. Linking merges multiple units together into a single compound unit. The compound unit itself imports variables that will be propagated to unresolved imported variables in the linked units, and re-exports some variables from the linked units for further linking.
14.1
Signatures and Units
The interface of a unit is described in terms of signatures. Each signature is defined (normally within a module) using define-signature. For example, the following signature, placed in a "toy-factory-sig.rkt" file, describes the exports of a component that implements a toy factory:
"toy-factory-sig.rkt"
#lang racket

(define-signature toy-factory^
  (build-toys  ; (integer? -> (listof toy?))
   repaint     ; (toy? symbol? -> toy?)
   toy?        ; (any/c -> boolean?)
   toy-color)) ; (toy? -> symbol?)

(provide toy-factory^)
An implementation of the toy-factory^ signature is written using define-unit with an export clause that names toy-factory^:
"simple-factory-unit.rkt"
#lang racket

(require "toy-factory-sig.rkt")

(define-unit simple-factory@
  (import)
  (export toy-factory^)

  (printf "Factory started.\n")

  (define-struct toy (color) #:transparent)

  (define (build-toys n)
    (for/list ([i (in-range n)])
      (make-toy 'blue)))

  (define (repaint t col)
    (make-toy col)))

(provide simple-factory@)
The toy-factory^ signature also could be referenced by a unit that needs a toy factory to implement something else. In that case, toy-factory^ would be named in an import clause. For example, a toy store would get toys from a toy factory. (Suppose, for the sake of an example with interesting features, that the store is willing to sell only toys in a particular color.)
"toy-store-sig.rkt"
#lang racket

(define-signature toy-store^
  (store-color     ; (-> symbol?)
   stock!          ; (integer? -> void?)
   get-inventory)) ; (-> (listof toy?))

(provide toy-store^)
"toy-store-unit.rkt"
#lang racket

(require "toy-store-sig.rkt"
         "toy-factory-sig.rkt")

(define-unit toy-store@
  (import toy-factory^)
  (export toy-store^)

  (define inventory null)

  (define (store-color) 'green)

  (define (maybe-repaint t)
    (if (eq? (toy-color t) (store-color))
        t
        (repaint t (store-color))))

  (define (stock! n)
    (set! inventory
          (append inventory
                  (map maybe-repaint
                       (build-toys n)))))

  (define (get-inventory) inventory))

(provide toy-store@)
Note that "toy-store-unit.rkt" imports "toy-factory-sig.rkt", but not "simple-factory-unit.rkt". Consequently, the toy-store@ unit relies only on the specification of a toy factory, not on a specific implementation.
14.2
Invoking Units
The simple-factory@ unit has no imports, so it can be invoked directly using invoke-unit:
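The following sketch gathers the signature and the unit into a single module so that it can be run on its own; with the files described above, a require of "simple-factory-unit.rkt" would take the place of the two definitions.

```racket
#lang racket

;; Signature and unit in one module, for a self-contained sketch.
(define-signature toy-factory^
  (build-toys repaint toy? toy-color))

(define-unit simple-factory@
  (import)
  (export toy-factory^)

  (printf "Factory started.\n")

  (define-struct toy (color) #:transparent)

  (define (build-toys n)
    (for/list ([i (in-range n)])
      (make-toy 'blue)))

  (define (repaint t col)
    (make-toy col)))

;; Runs the unit body, which prints the start-up message, but does not
;; bind build-toys or any other export in this module.
(invoke-unit simple-factory@)
```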
The invoke-unit form does not make the body definitions available, however, so we cannot build any toys with this factory. The define-values/invoke-unit form binds the identifiers of a signature to the values supplied by a unit (to be invoked) that implements the signature:
> (define-values/invoke-unit/infer simple-factory@)
Factory started.
> (build-toys 3)
(list (toy 'blue) (toy 'blue) (toy 'blue))
Since simple-factory@ exports the toy-factory^ signature, each identifier in toy-factory^ is defined by the define-values/invoke-unit/infer form. The /infer part of the form name indicates that the identifiers bound by the declaration are inferred from simple-factory@.
Now that the identifiers in toy-factory^ are defined, we can also invoke toy-store@, which imports toy-factory^ to produce toy-store^:
> (require "toy-store-unit.rkt")
> (define-values/invoke-unit/infer toy-store@)
> (get-inventory)
'()
> (stock! 2)
> (get-inventory)
(list (toy 'green) (toy 'green))
Again, the /infer part of define-values/invoke-unit/infer determines that toy-store@ imports toy-factory^, and so it supplies the top-level bindings that match the names in toy-factory^ as imports to toy-store@.
14.3
Linking Units
We can make our toy economy more efficient by having toy factories that cooperate with stores, creating toys that do not have to be repainted. Instead, the toys are always created using the store's color, which the factory gets by importing toy-store^:
"store-specific-factory-unit.rkt"
#lang racket

(require "toy-store-sig.rkt"
         "toy-factory-sig.rkt")

(define-unit store-specific-factory@
  (import toy-store^)
  (export toy-factory^)

  (define-struct toy () #:transparent)

  (define (toy-color t) (store-color))

  (define (build-toys n)
    (for/list ([i (in-range n)])
      (make-toy)))

  (define (repaint t col)
    (error "cannot repaint")))

(provide store-specific-factory@)
To invoke store-specific-factory@, we need toy-store^ bindings to supply to the unit. But to get toy-store^ bindings by invoking toy-store@, we will need a toy factory! The unit implementations are mutually dependent, and we cannot invoke either before the other.
The solution is to link the units together, and then we can invoke the combined units. The define-compound-unit/infer form links any number of units to form a combined unit. It can propagate imports and exports from the linked units, and it can satisfy each unit's imports using the exports of other linked units.
> (require "toy-factory-sig.rkt")
> (require "toy-store-sig.rkt")
> (require "store-specific-factory-unit.rkt")
> (define-compound-unit/infer toy-store+factory@
    (import)
    (export toy-factory^ toy-store^)
    (link store-specific-factory@
          toy-store@))
The overall result above is a unit toy-store+factory@ that exports both toy-factory^ and toy-store^. The connection between store-specific-factory@ and toy-store@ is inferred from the signatures that each imports and exports. This unit has no imports, so we can always invoke it:
> (define-values/invoke-unit/infer toy-store+factory@)
> (stock! 2)
> (get-inventory)
(list (toy) (toy))
> (map toy-color (get-inventory))
'(green green)
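The linking steps above are spread over several files. As a rough single-module sketch, the following puts both signatures and both units in one place. The contents of toy-store^ (store-color, stock!, get-inventory) are an assumption based on how the store is used in this chapter, and the body of toy-store@ is copied from the toy-store@-maker example with the color fixed to 'green.

```racket
#lang racket

;; Signatures. toy-store^'s member list is assumed from the names
;; that the store exports in the interactions above.
(define-signature toy-factory^ (build-toys repaint toy? toy-color))
(define-signature toy-store^ (store-color stock! get-inventory))

;; A factory that imports the store's color instead of painting toys.
(define-unit store-specific-factory@
  (import toy-store^)
  (export toy-factory^)
  (define-struct toy () #:transparent)
  (define (toy-color t) (store-color))
  (define (build-toys n)
    (for/list ([i (in-range n)])
      (make-toy)))
  (define (repaint t col)
    (error "cannot repaint")))

;; A store that imports a factory.
(define-unit toy-store@
  (import toy-factory^)
  (export toy-store^)
  (define inventory null)
  (define (store-color) 'green)
  (define (maybe-repaint t)
    (if (eq? (toy-color t) (store-color))
        t
        (repaint t (store-color))))
  (define (stock! n)
    (set! inventory
          (append inventory (map maybe-repaint (build-toys n)))))
  (define (get-inventory) inventory))

;; Link the mutually dependent units, then invoke the result.
(define-compound-unit/infer toy-store+factory@
  (import)
  (export toy-factory^ toy-store^)
  (link store-specific-factory@ toy-store@))

(define-values/invoke-unit/infer toy-store+factory@)

(stock! 2)
(define stocked-colors (map toy-color (get-inventory)))
stocked-colors
```

Neither unit could be invoked on its own here, since each imports the signature that the other exports; only the compound unit, which has no imports, can be invoked directly.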
14.4
First-Class Units
The define-unit form combines define with a unit form, similar to the way that (define (f x) ....) combines define followed by an identifier with an implicit lambda.

(define toy-store@
  (unit
    (import toy-factory^)
    (export toy-store^)

    (define inventory null)

    (define (store-color) 'green)
    ....))

A difference between this expansion and define-unit is that the imports and exports of toy-store@ cannot be inferred. That is, besides combining define and unit, define-unit attaches static information to the defined identifier so that its signature information is available statically to define-values/invoke-unit/infer and other forms.

Despite the drawback of losing static signature information, unit can be useful in combination with other forms that work with first-class values. For example, we could wrap a unit that creates a toy store in a lambda to supply the store's color:
"toy-store-maker.rkt"

#lang racket
(require "toy-store-sig.rkt"
         "toy-factory-sig.rkt")

(define toy-store@-maker
  (lambda (the-color)
    (unit
      (import toy-factory^)
      (export toy-store^)

      (define inventory null)

      (define (store-color) the-color)

      ; the rest is the same as before

      (define (maybe-repaint t)
        (if (eq? (toy-color t) (store-color))
            t
            (repaint t (store-color))))

      (define (stock! n)
        (set! inventory
              (append inventory
                      (map maybe-repaint
                           (build-toys n)))))

      (define (get-inventory) inventory))))

(provide toy-store@-maker)
To invoke a unit created by toy-store@-maker, we must use define-values/invoke-unit, instead of the /infer variant:

> (require "simple-factory-unit.rkt")
> (define-values/invoke-unit/infer simple-factory@)
Factory started.
> (require "toy-store-maker.rkt")
> (define-values/invoke-unit (toy-store@-maker 'purple)
    (import toy-factory^)
    (export toy-store^))
> (stock! 2)
> (get-inventory)
(list (toy 'purple) (toy 'purple))

In the define-values/invoke-unit form, the (import toy-factory^) line takes bindings from the current context that match the names in toy-factory^ (the ones that we created by invoking simple-factory@), and it supplies them as imports to toy-store@. The (export toy-store^) clause indicates that the unit produced by toy-store@-maker will export toy-store^, and the names from that signature are defined after invoking the unit.

To link a unit from toy-store@-maker, we can use the compound-unit form:
> (require "store-specific-factory-unit.rkt")
> (define toy-store+factory@
    (compound-unit
      (import)
      (export TF TS)
      (link [((TF : toy-factory^)) store-specific-factory@ TS]
            [((TS : toy-store^)) toy-store@ TF])))

This compound-unit form packs a lot of information into one place. The left-hand-side TF and TS in the link clause are binding identifiers. The identifier TF is essentially bound to the elements of toy-factory^ as implemented by store-specific-factory@. The identifier TS is similarly bound to the elements of toy-store^ as implemented by toy-store@. Meanwhile, the elements bound to TS are supplied as imports for store-specific-factory@, since TS follows store-specific-factory@. The elements bound to TF are similarly supplied to toy-store@. Finally, (export TF TS) indicates that the elements bound to TF and TS are exported from the compound unit.

The above compound-unit form uses store-specific-factory@ as a first-class unit, even though its information could be inferred. Every unit can be used as a first-class unit, in addition to its use in inference contexts. Also, various forms let a programmer bridge the gap between inferred and first-class worlds. For example, define-unit-binding binds a new identifier to the unit produced by an arbitrary expression; it statically associates signature information to the identifier, and it dynamically checks the signatures against the first-class unit produced by the expression.
14.5
In programs that use units, modules like "toy-factory-sig.rkt" and "simple-factory-unit.rkt" are common. The racket/signature and racket/unit module names can be used as languages to avoid much of the boilerplate module, signature, and unit declaration text.

For example, "toy-factory-sig.rkt" can be written as

#lang racket/signature

build-toys  ; (integer? -> (listof toy?))
repaint     ; (toy? symbol? -> toy?)
toy?        ; (any/c -> boolean?)
toy-color   ; (toy? -> symbol?)

The signature toy-factory^ is automatically provided from the module, inferred from the filename "toy-factory-sig.rkt" by replacing the "-sig.rkt" suffix with ^.

Similarly, the "simple-factory-unit.rkt" module can be written
#lang racket/unit

(require "toy-factory-sig.rkt")

(import)
(export toy-factory^)

(printf "Factory started.\n")

(define-struct toy (color) #:transparent)

(define (build-toys n)
  (for/list ([i (in-range n)])
    (make-toy 'blue)))

(define (repaint t col)
  (make-toy col))
The unit simple-factory@ is automatically provided from the module, inferred from the filename "simple-factory-unit.rkt" by replacing the "-unit.rkt" suffix with @.
14.6
There are a couple of ways of protecting units with contracts. One way is useful when writing new signatures, and the other handles the case when a unit must conform to an already existing signature.
14.6.1
When contracts are added to a signature, then all units which implement that signature are protected by those contracts. The following version of the toy-factory^ signature adds the contracts previously written in comments:
"contracted-toy-factory-sig.rkt"

#lang racket

(define-signature contracted-toy-factory^
  ((contracted
    [build-toys (-> integer? (listof toy?))]
    [repaint    (-> toy? symbol? toy?)]
    [toy?       (-> any/c boolean?)]
    [toy-color  (-> toy? symbol?)])))

(provide contracted-toy-factory^)
Now we take the previous implementation of simple-factory@ and implement this version of toy-factory^ instead:
"contracted-simple-factory-unit.rkt"
#lang racket
(require "contracted-toy-factory-sig.rkt")

(define-unit contracted-simple-factory@
  (import)
  (export contracted-toy-factory^)

  (printf "Factory started.\n")

  (define-struct toy (color) #:transparent)

  (define (build-toys n)
    (for/list ([i (in-range n)])
      (make-toy 'blue)))

  (define (repaint t col)
    (make-toy col)))

(provide contracted-simple-factory@)
As before, we can invoke our new unit and bind the exports so that we can use them. This time, however, misusing the exports causes the appropriate contract errors.
> (require "contracted-simple-factory-unit.rkt")
> (define-values/invoke-unit/infer contracted-simple-factory@)
Factory started.
> (build-toys 3)
(list (toy 'blue) (toy 'blue) (toy 'blue))
> (build-toys #f)
build-toys: contract violation, expected: integer?, given: #f
  contract from: (unit contracted-simple-factory@)
  blaming: top-level
  contract: (-> integer? (listof toy?))
  at: eval:34.0
> (repaint 3 'blue)
repaint: contract violation, expected: toy?, given: 3
  contract from: (unit contracted-simple-factory@)
  blaming: top-level
  contract: (-> toy? symbol? toy?)
  at: eval:34.0
14.6.2
However, sometimes we may have a unit that must conform to an already existing signature that is not contracted. In this case, we can create a unit contract with unit/c or use the define-unit/contract form, which defines a unit which has been wrapped with a unit contract.

For example, here's a version of toy-factory@ which still implements the regular toy-factory^, but whose exports have been protected with an appropriate unit contract.
"wrapped-simple-factory-unit.rkt"
(define-unit/contract wrapped-simple-factory@
  (import)
  (export (toy-factory^
           [build-toys (-> integer? (listof toy?))]
           [repaint    (-> toy? symbol? toy?)]
           [toy?       (-> any/c boolean?)]
           [toy-color  (-> toy? symbol?)]))

  (printf "Factory started.\n")

  (define-struct toy (color) #:transparent)

  (define (build-toys n)
    (for/list ([i (in-range n)])
      (make-toy 'blue)))

  (define (repaint t col)
    (make-toy col)))

(provide wrapped-simple-factory@)

> (require "wrapped-simple-factory-unit.rkt")
> (define-values/invoke-unit/infer wrapped-simple-factory@)
Factory started.
> (build-toys 3)
(list (toy 'blue) (toy 'blue) (toy 'blue))
> (build-toys #f)
wrapped-simple-factory@: contract violation, expected: integer?, given: #f
  contract from: (unit wrapped-simple-factory@)
  blaming: top-level
  contract:
    (unit/c
     (import)
     (export (toy-factory^
              (build-toys (-> integer? (listof toy?)))
              (repaint (-> toy? symbol? toy?))
              (toy? (-> any/c boolean?))
              (toy-color (-> toy? symbol?)))))
  at: <collects>/mzlib/unit.rkt
> (repaint 3 'blue)
wrapped-simple-factory@: contract violation, expected: toy?, given: 3
  contract from: (unit wrapped-simple-factory@)
  blaming: top-level
  contract:
    (unit/c
     (import)
     (export (toy-factory^
              (build-toys (-> integer? (listof toy?)))
              (repaint (-> toy? symbol? toy?))
              (toy? (-> any/c boolean?))
              (toy-color (-> toy? symbol?)))))
  at: <collects>/mzlib/unit.rkt
14.7
As a form for modularity, unit complements module:

• The module form is primarily for managing a universal namespace. For example, it allows a code fragment to refer specifically to the car operation from racket/base (the one that extracts the first element of an instance of the built-in pair datatype), as opposed to any number of other functions with the name car. In other words, the module construct lets you refer to the binding that you want.

• The unit form is for parameterizing a code fragment with respect to most any kind of run-time value. For example, it allows a code fragment to work with a car function that accepts a single argument, where the specific function is determined later by linking the fragment to another. In other words, the unit construct lets you refer to a binding that meets some specification.

The lambda and class forms, among others, also allow parameterization of code with respect to values that are chosen later. In principle, any of those could be implemented in terms of any of the others. In practice, each form offers certain conveniences (such as allowing overriding of methods or especially simple application to values) that make them suitable for different purposes.

The module form is more fundamental than the others, in a sense. After all, a program fragment cannot reliably refer to a lambda, class, or unit form without the namespace management provided by module. At the same time, because namespace management is closely related to separate expansion and compilation, module boundaries end up as separate-compilation boundaries in a way that prohibits mutual dependencies among fragments. For similar reasons, module does not separate interface from implementation.

Use unit when module by itself almost works, but when separately compiled pieces must refer to each other, or when you want a stronger separation between interface (i.e., the parts that need to be known at expansion and compilation time) and implementation (i.e., the run-time parts). More generally, use unit when you need to parameterize code over functions, datatypes, and classes, and when the parameterized code itself provides definitions to be linked with other parameterized code.
15
Racket is a dynamic language. It offers numerous facilities for loading, compiling, and even constructing new code at run time.
15.1 eval
The eval function takes a representation of an expression or definition (as a "quoted" form or syntax object) and evaluates it:

This example will not work within a module or in DrRacket's definitions window, but it will work in the interactions window, for reasons that are explained by the end of section 15.1.2, Namespaces.

> (define (eval-formula formula)
    (eval `(let ([x 2]
                 [y 3])
             ,formula)))
> (eval-formula '(+ x y))
5
> (eval-formula '(+ (* x y) y))
9

Of course, if we just wanted to evaluate expressions with given values for x and y, we do not need eval. A more direct approach is to use first-class functions:

> (define (apply-formula formula-proc)
    (formula-proc 2 3))
> (apply-formula (lambda (x y) (+ x y)))
5
> (apply-formula (lambda (x y) (+ (* x y) y)))
9

However, if expressions like (+ x y) and (+ (* x y) y) are read from a file supplied by a user, for example, then eval might be appropriate. Similarly, the REPL reads expressions that are typed by a user and uses eval to evaluate them.

Also, eval is often used directly or indirectly on whole modules. For example, a program might load a module on demand using dynamic-require, which is essentially a wrapper around eval to dynamically load the module code.
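The eval-formula example above relies on the REPL's namespace. As a minimal sketch of a variant that should also run inside a module, the following passes an explicit namespace created with make-base-namespace as the optional second argument to eval. The name ns is only illustrative.

```racket
#lang racket

;; A fresh namespace that already contains the racket/base bindings,
;; so that let, +, and * have meanings inside the evaluated forms.
(define ns (make-base-namespace))

;; Evaluate a quoted expression directly.
(define three (eval '(+ 1 2) ns))

;; Build an expression at run time, then evaluate it.
(define (eval-formula formula)
  (eval `(let ([x 2]
               [y 3])
           ,formula)
        ns))

(define five (eval-formula '(+ x y)))
(define nine (eval-formula '(+ (* x y) y)))

(list three five nine)
```

Supplying the namespace explicitly avoids depending on whatever the current namespace happens to be when the module runs.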
15.1.1
Local Scopes
The eval function cannot see local bindings in the context where it is called. For example, calling eval inside an unquoted let form to evaluate a formula does not make values visible for x and y:
> (define (broken-eval-formula formula)
    (let ([x 2]
          [y 3])
      (eval formula)))
> (broken-eval-formula '(+ x y))
reference to undefined identifier: x

The eval function cannot see the x and y bindings precisely because it is a function, and Racket is a lexically scoped language: the body of a function can refer only to bindings that are visible where the function is defined, not to local bindings at the place where it is called.
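One way to repair broken-eval-formula while keeping the local bindings is to evaluate the formula as the body of a lambda and then pass the local values as arguments. This is only a sketch: fixed-eval-formula and ns are hypothetical names, and the explicit namespace is there so that the code also runs inside a module.

```racket
#lang racket

(define ns (make-base-namespace))

;; eval still cannot see the local x and y, so the values are handed
;; over explicitly as arguments to the procedure that eval produces.
(define (fixed-eval-formula formula)
  (let ([x 2]
        [y 3])
    (define formula-proc
      (eval `(lambda (x y) ,formula) ns))
    (formula-proc x y)))

(fixed-eval-formula '(+ x y))
```

The x and y inside the evaluated lambda are bound by that lambda, not by the enclosing let, which is consistent with lexical scope.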
15.1.2
Namespaces
Since eval cannot see the bindings from the context where it is called, another mechanism is needed to determine dynamically available bindings. A namespace is a first-class value that encapsulates the bindings available for dynamic evaluation.

Some functions, such as eval, accept an optional namespace argument. More often, the namespace used by a dynamic operation is the current namespace as determined by the current-namespace parameter.

When eval is used in a REPL, the current namespace is the one that the REPL uses for evaluating expressions. That's why an interaction that defines x at the REPL and then evaluates (eval 'x) successfully accesses x.

Informally, the term namespace is sometimes used interchangeably with environment or scope. In Racket, the term namespace has the more specific, dynamic meaning given above, and it should not be confused with static lexical concepts.
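The role of the current-namespace parameter can be sketched inside a module by installing a namespace explicitly, which mimics what a REPL does for its own interactions. The namespace here comes from make-base-namespace, and the name repl-like-ns is hypothetical.

```racket
#lang racket

(define repl-like-ns (make-base-namespace))

;; Within the parameterize, eval without a namespace argument uses
;; repl-like-ns, so the definition of x and the later reference to x
;; both go through the same namespace.
(define result
  (parameterize ([current-namespace repl-like-ns])
    (eval '(define x 3))
    (eval 'x)))

result

;; The definition persists in the namespace after the parameterize.
(define again (eval 'x repl-like-ns))
again
```

The module's own definitions are not visible through repl-like-ns; only what was evaluated in or required into that namespace is.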
15.1.3
As with let bindings, lexical scope means that eval cannot automatically see the definitions of a module in which it is called. Unlike let bindings, however, Racket provides a way to reflect a module into a namespace.

The module->namespace function takes a quoted module path and produces a namespace for evaluating expressions and definitions as if they appeared in the module body:

> (module m racket/base
    (define x 11))
> (require 'm)
> (define ns (module->namespace ''m))
> (eval 'x ns)
11

The module->namespace function is mostly useful from outside a module, where the module's full name is known. Inside a module form, however, the full name of a module may not be known, because it may depend on where the module source is located when it is eventually loaded.

From within a module, use define-namespace-anchor to declare a reflection hook on the module, and use namespace-anchor->namespace to reel in the module's namespace:
The double quoting in ''m is because 'm is a module path that refers to an interactively declared module, and so ''m is the quoted form of the path.
#lang racket

(define-namespace-anchor a)
(define ns (namespace-anchor->namespace a))

(define x 1)
(define y 2)

(eval '(cons x y) ns) ; produces (1 . 2)
15.2
Manipulating Namespaces
A namespace encapsulates two pieces of information:

• A mapping from identifiers to bindings. For example, a namespace might map the identifier lambda to the lambda form. An "empty" namespace is one that maps every identifier to an uninitialized top-level variable.

• A mapping from module names to module declarations and instances.

The first mapping is used for evaluating expressions in a top-level context, as in (eval '(lambda (x) (+ x 1))). The second mapping is used, for example, by dynamic-require to locate a module. The call (eval '(require racket/base)) normally uses both pieces: the identifier mapping determines the binding of require; if it turns out to mean require, then the module mapping is used to locate the racket/base module.

From the perspective of the core Racket run-time system, all evaluation is reflective. Execution starts with an initial namespace that contains a few primitive modules, and that is further populated by loading files and modules as specified on the command line or as supplied in the REPL. Top-level require and define forms adjust the identifier mapping, and module declarations (typically loaded on demand for a require form) adjust the module mapping.
15.2.1
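The function make-empty-namespace creates a new, empty namespace. Since the namespace is truly empty, it cannot at first be used to evaluate any top-level expression, not even (require racket). In the following example, run-dsl instead creates a namespace with make-base-empty-namespace, imports the my-dsl module into it, and loads a file in that namespace: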
(define (run-dsl file)
  (parameterize ([current-namespace (make-base-empty-namespace)])
    (namespace-require 'my-dsl)
    (load file)))

Note that the parameterize of current-namespace does not affect the meaning of identifiers like namespace-require within the parameterize body. Those identifiers obtain their meaning from the enclosing context (probably a module). Only expressions that are dynamic with respect to this code, such as the content of loaded files, are affected by the parameterize.

Another subtle point in the above example is the use of (namespace-require 'my-dsl) instead of (eval '(require my-dsl)). The latter would not work, because eval needs to obtain a meaning for require in the namespace, and the namespace's identifier mapping is initially empty. The namespace-require function, in contrast, directly imports the given module into the current namespace. Starting with (namespace-require 'racket/base) would introduce a binding for require and make a subsequent (eval '(require my-dsl)) work. The above is better, not only because it is more compact, but also because it avoids introducing bindings that are not part of the domain-specific languages.
15.2.2
Modules not attached to a new namespace will be loaded and instantiated afresh if they are demanded by evaluation. For example, racket/base does not include racket/class, and loading racket/class again will create a distinct class datatype:
> (require racket/class)
> (class? object%)
#t
> (class?
   (parameterize ([current-namespace (make-base-empty-namespace)])
     (namespace-require 'racket/class) ; loads again
     (eval 'object%)))
#f

For cases when dynamically loaded code needs to share more code and data with its context, use the namespace-attach-module function. The first argument to namespace-attach-module is a source namespace from which to draw a module instance; in some cases, the current namespace is known to include the module that needs to be shared:

> (require racket/class)
> (class?
   (let ([ns (make-base-empty-namespace)])
     (namespace-attach-module (current-namespace)
                              'racket/class
                              ns)
     (parameterize ([current-namespace ns])
       (namespace-require 'racket/class) ; uses attached
       (eval 'object%))))
#t
Within a module, however, the combination of define-namespace-anchor and namespace-anchor->empty-namespace offers a more reliable method for obtaining a source namespace:
#lang racket/base
(require racket/class)

(define-namespace-anchor a)

(define (load-plug-in file)
  (let ([ns (make-base-empty-namespace)])
    (namespace-attach-module (namespace-anchor->empty-namespace a)
                             'racket/class
                             ns)
    (parameterize ([current-namespace ns])
      (dynamic-require file 'plug-in%))))

The anchor bound by define-namespace-anchor connects the run time of a module with the namespace in which a module is loaded (which might differ from the current namespace). In the above example, since the enclosing module requires racket/class, the namespace produced by namespace-anchor->empty-namespace certainly contains an instance of racket/class. Moreover, that instance is the same as the one imported into the module, so the class datatype is shared.
15.3
Historically, Lisp implementations did not offer module systems. Instead, large programs were built by essentially scripting the REPL to evaluate program fragments in a particular order. While REPL scripting turns out to be a bad way to structure programs and libraries, it is still sometimes a useful capability.

The load function runs a REPL script by reading S-expressions from a file, one by one, and passing them to eval. If a file "place.rkts" contains

(define city "Salt Lake City")
(define state "Utah")
(printf "~a, ~a\n" city state)
then it can be loaded in a REPL:
Describing a program via load interacts especially badly with macro-defined language extensions [Flatt02].

> (load "place.rkts")
Salt Lake City, Utah
> city
"Salt Lake City"
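Since load uses eval, however, a module that simply calls (load "here.rkts") in its body generally will not work, for the same reasons described in section 15.1.2, Namespaces. Unlike eval, load does not accept a namespace argument; to supply one, set the current-namespace parameter. The following module evaluates the expressions in "here.rkts" using the bindings of the enclosing module:

#lang racket

(define there "Utopia")

(define-namespace-anchor a)
(parameterize ([current-namespace (namespace-anchor->namespace a)])
  (load "here.rkts"))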
Still, if "here.rkts" defines any identifiers, the definitions cannot be directly (i.e., statically) referenced in the enclosing module.

The racket/load module language is different from racket or racket/base. A module using racket/load treats all of its content as dynamic, passing each form in the module body to eval (using a namespace that is initialized with racket). As a result, uses of eval and load in the module body see the same dynamic namespace as immediate body forms. For example, if "here.rkts" defines here and a go! function that sets here to the value of there, then running

#lang racket/load

(define there "Utopia")

(load "here.rkts")

(go!)
(printf "~a\n" here)

prints "Utopia".

Drawbacks of using racket/load include reduced error checking, tool support, and performance.
16
Macros
A macro is a syntactic form with an associated transformer that expands the original form into existing forms. To put it another way, a macro is an extension to the Racket compiler. Most of the syntactic forms of racket/base and racket are actually macros that expand into a small set of core constructs.

Like many languages, Racket provides pattern-based macros that make simple transformations easy to implement and reliable to use. Racket also supports arbitrary macro transformers that are implemented in Racket, or in a macro-extended variant of Racket.
16.1
Pattern-Based Macros
A pattern-based macro replaces any code that matches a pattern with an expansion that uses parts of the original syntax that match parts of the pattern.

16.1.1
define-syntax-rule

The simplest way to create a macro is to use define-syntax-rule. As a running example, consider a swap macro, which exchanges the values stored in two variables.

The macro is "un-Rackety" in the sense that it involves side effects on variables, but the point of macros is to let you add syntactic forms that some other language designer might not approve.

Macro pattern variables are similar to pattern variables for match. See section 12, Pattern Matching.

With the pattern (swap x y), in the use (swap first last) the pattern variable x matches first and y matches last, so that the expansion is (let ([tmp first]) (set! first last) (set! last tmp)).
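As a sketch of the running example, swap can be written with define-syntax-rule using the same pattern and template that the syntax-case version later in this chapter uses. The helper functions swap-demo and tmp-demo are hypothetical and exist only to exercise the macro.

```racket
#lang racket

;; Pattern: (swap x y).  Template: the let form below.
(define-syntax-rule (swap x y)
  (let ([tmp x])
    (set! x y)
    (set! y tmp)))

;; x matches first and y matches last in this use.
(define (swap-demo)
  (let ([first 1]
        [last 2])
    (swap first last)
    (list first last)))

;; A use whose own variable is named tmp, like the macro's temporary.
(define (tmp-demo)
  (let ([tmp 5]
        [other 6])
    (swap tmp other)
    (list tmp other)))

(swap-demo) ; '(2 1)
(tmp-demo)  ; '(6 5)
```

The second demo still swaps correctly even though the caller's variable shares a name with the temporary introduced by the template.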
Suppose that we use the swap macro to swap variables named tmp and other:
(let ([tmp 5]
      [other 6])
  (swap tmp other)
  (list tmp other))

The result of the above expression should be (6 5). The naive expansion of this use of swap, however, is

(let ([tmp 5]
      [other 6])
  (let ([tmp tmp])
    (set! tmp other)
    (set! other tmp))
  (list tmp other))

whose result is (5 6). The problem is that the naive expansion confuses the tmp in the context where swap is used with the tmp that is in the macro template.

Racket doesn't produce the naive expansion for the above use of swap. Instead, it produces

(let ([tmp 5]
      [other 6])
  (let ([tmp_1 tmp])
    (set! tmp other)
    (set! other tmp_1))
  (list tmp other))

with the correct result in (6 5). Similarly, in the example

(let ([set! 5]
      [other 6])
  (swap set! other)
  (list set! other))

the expansion is

(let ([set!_1 5]
      [other 6])
  (let ([tmp_1 set!_1])
    (set! set!_1 other)
    (set! other tmp_1))
  (list set!_1 other))
so that the local set! binding doesn't interfere with the assignments introduced by the macro template.

In other words, Racket's pattern-based macros automatically maintain lexical scope, so macro implementors can reason about variable reference in macros and macro uses in the same way as for functions and function calls.

16.1.3
define-syntax and syntax-rules

The define-syntax-rule form binds a macro that matches a single pattern, but Racket's macro system supports transformers that match multiple patterns starting with the same identifier. To write such macros, the programmer must use the more general define-syntax form along with the syntax-rules transformer form, which holds a sequence of [pattern template] clauses.
The define-syntax-rule form is itself a macro that expands into define-syntax with a syntax-rules form that contains only one pattern and template.

For example, suppose we would like a rotate macro that generalizes swap to work on either two or three identifiers, so that

(let ([red 1] [green 2] [blue 3])
  (rotate red green)      ; swaps
  (rotate red green blue) ; rotates left
  (list red green blue))

produces (1 3 2). We can implement rotate using syntax-rules:
(define-syntax rotate
  (syntax-rules ()
    [(rotate a b) (swap a b)]
    [(rotate a b c) (begin
                      (swap a b)
                      (swap b c))]))

The expression (rotate red green) matches the first pattern in the syntax-rules form, so it expands to (swap red green). The expression (rotate red green blue) matches the second pattern, so it expands to (begin (swap red green) (swap green blue)).
16.1.4
Matching Sequences
A better rotate macro would allow any number of identifiers, instead of just two or three. To match a use of rotate with any number of identifiers, we need a pattern form that has something like a Kleene star. In a Racket macro pattern, a star is written as ....

To implement rotate with ..., we need a base case to handle a single identifier, and an inductive case to handle more than one identifier:

(define-syntax rotate
  (syntax-rules ()
    [(rotate a) (void)]
    [(rotate a b c ...) (begin
                          (swap a b)
                          (rotate b c ...))]))

When a pattern variable like c is followed by ... in a pattern, then it must be followed by ... in a template, too. The pattern variable effectively matches a sequence of zero or more forms, and it is replaced in the template by the same sequence.

Both versions of rotate so far are a bit inefficient, since pairwise swapping keeps moving the value from the first variable into every variable in the sequence until it arrives at the last one. A more efficient rotate would move the first value directly to the last variable. We can use ... patterns to implement the more efficient variant using a helper macro:

(define-syntax rotate
  (syntax-rules ()
    [(rotate a c ...)
     (shift-to (c ... a) (a c ...))]))

(define-syntax shift-to
  (syntax-rules ()
    [(shift-to (from0 from ...) (to0 to ...))
     (let ([tmp from0])
       (set! to from) ...
       (set! to0 tmp))]))

In the shift-to macro, ... in the template follows (set! to from), which causes the (set! to from) expression to be duplicated as many times as necessary to use each identifier matched in the to and from sequences. (The number of to and from matches must be the same, otherwise the macro expansion fails with an error.)
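To make the shift-to expansion concrete, the following sketch repeats the two macros and traces one use in comments. The functions rotate-demo and rotate-two-demo are hypothetical helpers added only for illustration.

```racket
#lang racket

(define-syntax shift-to
  (syntax-rules ()
    [(shift-to (from0 from ...) (to0 to ...))
     (let ([tmp from0])
       (set! to from) ...
       (set! to0 tmp))]))

(define-syntax rotate
  (syntax-rules ()
    [(rotate a c ...)
     (shift-to (c ... a) (a c ...))]))

;; (rotate red green blue)
;; => (shift-to (green blue red) (red green blue))
;; => (let ([tmp green])
;;      (set! green blue)
;;      (set! blue red)
;;      (set! red tmp))
(define (rotate-demo)
  (let ([red 1] [green 2] [blue 3])
    (rotate red green blue)
    (list red green blue)))

;; With two identifiers, rotate behaves like swap.
(define (rotate-two-demo)
  (let ([red 1] [green 2])
    (rotate red green)
    (list red green)))

(rotate-demo)     ; '(2 3 1)
(rotate-two-demo) ; '(2 1)
```

Each variable is assigned exactly once, and only the first value passes through the temporary.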
16.1.5
Identifier Macros

Given our macro definitions, the swap or rotate identifiers must be used after an open parenthesis, otherwise a syntax error is reported. An identifier macro, in contrast, also works when it is used by itself without parentheses, and the syntax-id-rules form creates such a macro. The following clock macro expands a lone clock to (get-clock) and expands (set! clock e) to (put-clock! e):

(define-syntax clock
  (syntax-id-rules (set!)
    [(set! clock e) (put-clock! e)]
    [(clock a ...) ((get-clock) a ...)]
    [clock (get-clock)]))

(define-values (get-clock put-clock!)
  (let ([private-clock 0])
    (values (lambda () private-clock)
            (lambda (v) (set! private-clock v)))))

The (clock a ...) pattern is needed because, when an identifier macro is used after an open parenthesis, the macro transformer is given the whole form, like with a non-identifier macro. Put another way, the syntax-rules form is essentially a special case of the syntax-id-rules form with errors in the set! and lone-identifier cases.
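A short sketch of how the clock identifier macro behaves in use, with the definitions repeated so that the module is self-contained. The variables before and after are hypothetical and only record the observed values.

```racket
#lang racket

(define-values (get-clock put-clock!)
  (let ([private-clock 0])
    (values (lambda () private-clock)
            (lambda (v) (set! private-clock v)))))

(define-syntax clock
  (syntax-id-rules (set!)
    [(set! clock e) (put-clock! e)]
    [(clock a ...) ((get-clock) a ...)]
    [clock (get-clock)]))

;; A lone clock expands to (get-clock).
(define before clock)

;; (set! clock 5) expands to (put-clock! 5).
(set! clock 5)

;; clock in argument position is still a lone identifier.
(define after (+ clock 3))

(list before after) ; '(0 8)
```

From the point of view of code that uses it, clock looks like an ordinary variable, even though every access goes through get-clock and put-clock!.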
16.1.6
Macro-Generating Macros
Suppose that we have many identifiers like clock that we'd like to redirect to accessor and mutator functions like get-clock and put-clock!. We'd like to be able to just write (define-get/put-id clock get-clock put-clock!), which the following macro makes possible:

(define-syntax-rule (define-get/put-id id get put!)
  (define-syntax id
    (syntax-id-rules (set!)
      [(set! id e) (put! e)]
      [(id a (... ...)) ((get) a (... ...))]
      [id (get)])))

The define-get/put-id macro is a macro-generating macro. The only non-obvious part of its definition is the (... ...), which "quotes" ... so that it takes its usual role in the generated macro, instead of the generating macro.
16.1.7
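We can use pattern-matching macros to add a form to Racket for defining first-order call-by-reference functions. When a call-by-reference function body mutates its formal argument, the mutation applies to variables that are supplied as actual arguments in a call to the function.

For example, if define-cbr is like define except that it defines a call-by-reference function, then after (define-cbr (f a b) (swap a b)), a call (f x y) swaps the values of the variables x and y. One way to implement f is to generate a function that takes an accessor and a mutator for each argument,

(define (do-f get-a get-b put-a! put-b!)
  (define-get/put-id a get-a put-a!)
  (define-get/put-id b get-b put-b!)
  (swap a b))

and redirect a function call (f x y) to

(do-f (lambda () x)
      (lambda () y)
      (lambda (v) (set! x v))
      (lambda (v) (set! y v)))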
Clearly, then, define-cbr is a macro-generating macro, which binds f to a macro that expands to a call of do-f. That is, (define-cbr (f a b) (swap a b)) needs to generate the definition

(define-syntax f
  (syntax-rules ()
    [(id actual ...)
     (do-f (lambda () actual)
           ...
           (lambda (v)
             (set! actual v))
           ...)]))

At the same time, define-cbr needs to define do-f using the body of f. This second part is slightly more complex, so we defer most of it to a define-for-cbr helper macro, which lets us write define-cbr easily enough:

(define-syntax-rule (define-cbr (id arg ...) body)
  (begin
    (define-syntax id
      (syntax-rules ()
        [(id actual (... ...))
         (do-f (lambda () actual)
               (... ...)
               (lambda (v)
                 (set! actual v))
               (... ...))]))
    (define-for-cbr do-f (arg ...)
      () ; explained below...
      body)))

Our remaining task is to define define-for-cbr so that it converts (define-for-cbr do-f (a b) () (swap a b)) into the definition of do-f shown above. The macro handles one argument per expansion step, moving it from the argument list into the initially empty list that follows, paired with freshly introduced get and put names:
(define-syntax define-for-cbr
  (syntax-rules ()
    [(define-for-cbr do-f (id0 id ...)
       (gens ...) body)
     (define-for-cbr do-f (id ...)
       (gens ... (id0 get put)) body)]
    [(define-for-cbr do-f ()
       ((id get put) ...) body)
     (define (do-f get ... put ...)
       (define-get/put-id id get put) ...
       body)]))

Step-by-step, expansion proceeds as follows:

(define-for-cbr do-f (a b)
  () (swap a b))
=> (define-for-cbr do-f (b)
     ([a get_1 put_1]) (swap a b))
=> (define-for-cbr do-f ()
     ([a get_1 put_1] [b get_2 put_2]) (swap a b))
=> (define (do-f get_1 get_2 put_1 put_2)
     (define-get/put-id a get_1 put_1)
     (define-get/put-id b get_2 put_2)
     (swap a b))
The subscripts on get_1, get_2, put_1, and put_2 are inserted by the macro expander to preserve lexical scope, since the get generated by each iteration of define-for-cbr should not bind the get generated by a different iteration. In other words, we are essentially tricking the macro expander into generating fresh names for us, but the technique illustrates some of the surprising power of pattern-based macros with automatic lexical scope.

The last expression eventually expands to just

(define (do-f get_1 get_2 put_1 put_2)
  (let ([tmp (get_1)])
    (put_1 (get_2))
    (put_2 tmp)))

which implements the call-by-reference function f.

To summarize, then, we can add call-by-reference functions to Racket with just three small pattern-based macros: define-cbr, define-for-cbr, and define-get/put-id.
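Putting the pieces together, the following module is a sketch that collects the three macros from this section, plus swap, and then defines and calls a call-by-reference function. The function cbr-demo is a hypothetical helper added only to show the effect.

```racket
#lang racket

(define-syntax-rule (swap x y)
  (let ([tmp x])
    (set! x y)
    (set! y tmp)))

;; Redirects uses of id to the accessor get and the mutator put!.
(define-syntax-rule (define-get/put-id id get put!)
  (define-syntax id
    (syntax-id-rules (set!)
      [(set! id e) (put! e)]
      [(id a (... ...)) ((get) a (... ...))]
      [id (get)])))

;; Accumulates one fresh (id get put) triple per argument, then
;; emits the definition of do-f.
(define-syntax define-for-cbr
  (syntax-rules ()
    [(define-for-cbr do-f (id0 id ...)
       (gens ...) body)
     (define-for-cbr do-f (id ...)
       (gens ... (id0 get put)) body)]
    [(define-for-cbr do-f ()
       ((id get put) ...) body)
     (define (do-f get ... put ...)
       (define-get/put-id id get put) ...
       body)]))

;; Binds id to a macro that wraps each actual argument in an
;; accessor and a mutator before calling do-f.
(define-syntax-rule (define-cbr (id arg ...) body)
  (begin
    (define-syntax id
      (syntax-rules ()
        [(id actual (... ...))
         (do-f (lambda () actual)
               (... ...)
               (lambda (v)
                 (set! actual v))
               (... ...))]))
    (define-for-cbr do-f (arg ...)
      ()
      body)))

(define-cbr (f a b)
  (swap a b))

(define (cbr-demo)
  (let ([x 1] [y 2])
    (f x y)
    (list x y)))

(cbr-demo) ; '(2 1)
```

The call (f x y) changes the caller's x and y, which an ordinary function defined with define could not do.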
16.2
The define-syntax form creates a transformer binding for an identifier, which is a binding that can be used at compile time while expanding expressions to be evaluated at run time. The compile-time value associated with a transformer binding can be anything; if it is a procedure of one argument, then the binding is used as a macro, and the procedure is the macro transformer.

The syntax-rules and syntax-id-rules forms are macros that expand to procedure forms. For example, if you evaluate a syntax-rules form directly (instead of placing it on the right-hand side of a define-syntax form), the result is a procedure.
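A small sketch of that observation: the value of a syntax-rules form is an ordinary procedure, and applying it by hand to a syntax object performs the same rewriting that the expander would. The names flip-transformer and flip are hypothetical.

```racket
#lang racket

;; Evaluated as an expression, syntax-rules yields a procedure.
(define flip-transformer
  (syntax-rules ()
    [(_ a b) (list b a)]))

(define is-procedure (procedure? flip-transformer))

;; Apply the transformer manually to a syntax object and inspect the
;; resulting form as plain data.
(define expanded
  (syntax->datum (flip-transformer #'(flip 1 2))))

is-procedure ; #t
expanded     ; '(list 2 1)
```

Binding the same procedure with define-syntax would make flip usable as a macro; here it is only called like any other function.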
16.2.1
Syntax Objects
The input and output of a macro transformer (i.e., source and replacement forms) are represented as syntax objects. A syntax object contains symbols, lists, and constant values (such as numbers) that essentially correspond to the quoted form of the expression. For example, a representation of the expression (+ 1 2) contains the symbol '+ and the numbers 1 and 2, all in a list. In addition to this quoted content, a syntax object associates source-location and lexical-binding information with each part of the form. The source-location information is used when reporting syntax errors (for example), and the lexical-binding information allows the macro system to maintain lexical scope. To accommodate this extra information, the representation of the expression (+ 1 2) is not merely '(+ 1 2), but a packaging of '(+ 1 2) into a syntax object. To create a literal syntax object, use the syntax form:
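As a minimal sketch (the variable names are illustrative), the syntax form and its #' abbreviation can be used like this in a #lang racket module:

```racket
#lang racket
;; syntax and its #' abbreviation build the same kind of value.
(define long-form  (syntax (+ 1 2)))
(define short-form #'(+ 1 2))

(syntax? long-form)   ; #t
(syntax? short-form)  ; #t

;; Source-location accessors; the values depend on where this code lives.
(syntax-source short-form)
(syntax-line short-form)
```

A syntax object that wraps just a symbol is an identifier; identifier? and free-identifier=? operate on identifiers.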
> (identifier? #'car)
#t
> (identifier? #'(+ 1 2))
#f
> (free-identifier=? #'car #'cdr)
#f
> (free-identifier=? #'car #'car)
#t
> (require (only-in racket/base [car also-car]))
> (free-identifier=? #'car #'also-car)
#t
> (free-identifier=? #'car (let ([car 8]) #'car))
#f
The last example above, in particular, illustrates how syntax objects preserve lexical-context information. To see the lists, symbols, numbers, etc. within a syntax object, use syntax->datum:
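The following sketch shows syntax->datum next to syntax-e and datum->syntax, which the next section mentions. The names stx and parts are illustrative.

```racket
#lang racket
(define stx #'(+ 1 2))

;; syntax->datum strips all syntax wrappers recursively.
(syntax->datum stx)            ; '(+ 1 2)

;; syntax-e unwraps only one layer, leaving a list of syntax objects.
(define parts (syntax-e stx))
(andmap syntax? parts)         ; #t
(map syntax->datum parts)      ; '(+ 1 2)

;; datum->syntax goes the other way, copying lexical context from stx.
(syntax->datum (datum->syntax stx '(- 1 2))) ; '(- 1 2)
```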
16.2.2
The procedure generated by syntax-rules internally uses syntax-e to deconstruct the given syntax object, and it uses datum->syntax to construct the result. The syntax-rules form doesn't provide a way to escape from pattern-matching and template-construction mode into an arbitrary Racket expression. The syntax-case form lets you mix pattern matching, template construction, and arbitrary expressions: each clause pairs a pattern with an expr rather than with a template.
Within an expr, #' shifts into template-construction mode; if the expr of a clause starts with #', then we have something like a syntax-rules form:
> (syntax->datum
   (syntax-case #'(+ 1 2) ()
     [(op n1 n2) #'(- n1 n2)]))
'(- 1 2)
We could write the swap macro using syntax-case instead of define-syntax-rule or syntax-rules:
(define-syntax swap
  (lambda (stx)
    (syntax-case stx ()
      [(swap x y) #'(let ([tmp x])
                      (set! x y)
                      (set! y tmp))])))
One advantage of using syntax-case is that we can provide better error reporting for swap. For example, with the define-syntax-rule definition of swap, (swap x 2) produces a syntax error in terms of set!, because 2 is not an identifier. We can refine our syntax-case implementation of swap to explicitly check the sub-forms:
(define-syntax swap
  (lambda (stx)
    (syntax-case stx ()
      [(swap x y)
       (if (and (identifier? #'x)
                (identifier? #'y))
           #'(let ([tmp x])
               (set! x y)
               (set! y tmp))
           (raise-syntax-error #f
                               "not an identifier"
                               stx
                               (if (identifier? #'x)
                                   #'y
                                   #'x)))])))
With this definition, (swap x 2) provides a syntax error originating from swap instead of set!.
In the above definition of swap, #'x and #'y are templates, even though they are not used as the result of the macro transformer. This example illustrates how templates can be used to access pieces of the input syntax, in this case for checking the form of the pieces. Also, the match for #'x or #'y is used in the call to raise-syntax-error, so that the syntax-error message can point directly to the source location of the non-identifier.
16.2.3 with-syntax and generate-temporaries
Since syntax-case lets us compute with arbitrary Racket expressions, we can more simply solve a problem that we had in writing define-for-cbr (see 16.1.7 Extended Example: Call-by-Reference Functions), where we needed to generate a set of names based on a sequence id ...:
(define-syntax (define-for-cbr stx)
  (syntax-case stx ()
    [(_ do-f (id ...) body)
     ....
       #'(define (do-f get ... put ...)
           (define-get/put-id id get put) ...
           body) ....]))
This example uses (define-syntax (id arg) body ...+), which is equivalent to (define-syntax id (lambda (arg) body ...+)).
In place of the ....s above, we need to bind get ... and put ... to lists of generated identifiers. We cannot use let to bind get and put, because we need bindings that count as pattern variables, instead of normal local variables. The with-syntax form lets us bind pattern variables:
(define-syntax (define-for-cbr stx)
  (syntax-case stx ()
    [(_ do-f (id ...) body)
     (with-syntax ([(get ...) ....]
                   [(put ...) ....])
       #'(define (do-f get ... put ...)
           (define-get/put-id id get put) ...
           body))]))
Now we need an expression in place of .... that generates as many identifiers as there are id matches in the original pattern. Since this is a common task, Racket provides a helper function, generate-temporaries, that takes a sequence of identifiers and returns a sequence of generated identifiers:
(define-syntax (define-for-cbr stx)
  (syntax-case stx ()
    [(_ do-f (id ...) body)
     (with-syntax ([(get ...) (generate-temporaries #'(id ...))]
                   [(put ...) (generate-temporaries #'(id ...))])
       #'(define (do-f get ... put ...)
           (define-get/put-id id get put) ...
           body))]))
This way of generating identifiers is normally easier to think about than tricking the macro expander into generating names with purely pattern-based macros.
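To get a feel for generate-temporaries outside of a macro, it can be called directly at run time in a #lang racket module. This is a small sketch; the name temps is illustrative, and the printed names of the generated identifiers vary from run to run.

```racket
#lang racket
;; One fresh identifier is produced per element of the input sequence.
(define temps (generate-temporaries #'(a b c)))

(length temps)              ; 3
(andmap identifier? temps)  ; #t

;; The generated identifiers are distinct from each other as binders.
(bound-identifier=? (first temps) (second temps)) ; #f
```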
In general, the left-hand side of a with-syntax binding is a pattern, just like in syntax-case. In fact, a with-syntax form is just a syntax-case form turned partially inside-out.
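The correspondence can be sketched by writing the same match both ways. The definitions below are illustrative and assume a #lang racket module.

```racket
#lang racket
;; with-syntax: the pattern (a b) is on the left, the matched expression on the right.
(define with-syntax-result
  (with-syntax ([(a b) #'(1 2)])
    #'(b a)))

;; syntax-case: the matched expression comes first, then the pattern and the body.
(define syntax-case-result
  (syntax-case #'(1 2) ()
    [(a b) #'(b a)]))

(syntax->datum with-syntax-result)  ; '(2 1)
(syntax->datum syntax-case-result)  ; '(2 1)
```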
16.2.4
As sets of macros get more complicated, you might want to write your own helper functions, like generate-temporaries. For example, to provide good syntax-error messages, swap, rotate, and define-cbr all should check that certain sub-forms in the source form are identifiers. We could use a check-ids function to perform this checking everywhere:
(define-syntax (swap stx)
  (syntax-case stx ()
    [(swap x y) (begin
                  (check-ids stx #'(x y))
                  #'(let ([tmp x])
                      (set! x y)
                      (set! y tmp)))]))

(define-syntax (rotate stx)
  (syntax-case stx ()
    [(rotate a c ...)
     (begin
       (check-ids stx #'(a c ...))
       #'(shift-to (c ... a) (a c ...)))]))
The check-ids function can use the syntax->list function to convert a syntax-object wrapping a list into a list of syntax objects:
(define (check-ids stx forms)
  (for-each
   (lambda (form)
     (unless (identifier? form)
       (raise-syntax-error #f
                           "not an identifier"
                           stx
                           form)))
   (syntax->list forms)))
If you define swap and check-ids in this way, however, it doesn't work:
> (let ([a 1] [b 2]) (swap a b))
reference to undefined identifier: check-ids
The problem is that check-ids is defined as a run-time expression, but swap is trying to use it at compile time. In interactive mode, compile time and run time are interleaved, but they are not interleaved within the body of a module, and they are not interleaved across modules that are compiled ahead-of-time. To help make all of these modes treat code consistently, Racket separates the binding spaces for different phases.
To define a check-ids function that can be referenced at compile time, use begin-for-syntax:
(begin-for-syntax
  (define (check-ids stx forms)
    (for-each
     (lambda (form)
       (unless (identifier? form)
         (raise-syntax-error #f
                             "not an identifier"
                             stx
                             form)))
     (syntax->list forms))))
With this for-syntax definition, swap works:
> (let ([a 1] [b 2]) (swap a b) (list a b))
'(2 1)
> (swap a 1)
eval:7:0: swap: not an identifier at: 1 in: (swap a 1)
When organizing a program into modules, you may want to put helper functions in one module to be used by macros that reside in other modules. In that case, you can write the helper function using define:
"utils.rkt"
#lang racket

(provide check-ids)

(define (check-ids stx forms)
  (for-each
   (lambda (form)
     (unless (identifier? form)
       (raise-syntax-error #f
                           "not an identifier"
                           stx
                           form)))
   (syntax->list forms)))
Then, in the module that implements macros, import the helper function using (require (for-syntax "utils.rkt")) instead of (require "utils.rkt"):
#lang racket

(require (for-syntax "utils.rkt"))

(define-syntax (swap stx)
  (syntax-case stx ()
    [(swap x y) (begin
                  (check-ids stx #'(x y))
                  #'(let ([tmp x])
                      (set! x y)
                      (set! y tmp)))]))
Since modules are separately compiled and cannot have circular dependencies, the "utils.rkt" module's run-time body can be compiled before compiling the module that implements swap. Thus, the run-time definitions in "utils.rkt" can be used to implement swap, as long as they are explicitly shifted into compile time by (require (for-syntax ....)).
The racket module provides syntax-case, generate-temporaries, lambda, if, and more for use in both the run-time and compile-time phases. That is why we can use syntax-case in the racket REPL both directly and in the right-hand side of a define-syntax form.
The racket/base module, in contrast, exports those bindings only in the run-time phase. If you change the module above that defines swap so that it uses the racket/base language instead of racket, then it no longer works. Adding (require (for-syntax racket/base)) imports syntax-case and more into the compile-time phase, so that the module works again.
Suppose that define-syntax is used to define a local macro in the right-hand side of a define-syntax form. In that case, the right-hand side of the inner define-syntax is in the meta-compile phase level, also known as phase level 2. To import syntax-case into that phase level, you would have to use (require (for-syntax (for-syntax racket/base))) or, equivalently, (require (for-meta 2 racket/base)). For example,
#lang racket/base
(require ;; This provides the bindings for the definition
         ;; of shell-game.
         (for-syntax racket/base)

         ;; And this for the definition of
         ;; swap.
         (for-syntax (for-syntax racket/base)))

(define-syntax (shell-game stx)

  (define-syntax (swap stx)
    (syntax-case stx ()
      [(_ a b)
       #'(let ([tmp a])
           (set! a b)
           (set! b tmp))]))

  (syntax-case stx ()
    [(_ a b c)
     (let ([a #'a] [b #'b] [c #'c])
       (when (= 0 (random 2)) (swap a b))
       (when (= 0 (random 2)) (swap b c))
       (when (= 0 (random 2)) (swap a c))
       #`(list #,a #,b #,c))]))

(shell-game 3 4 5)
(shell-game 3 4 5)
(shell-game 3 4 5)
Negative phase levels also exist. If a macro uses a helper function that is imported for-syntax, and if the helper function returns syntax-object constants generated by syntax, then identifiers in the syntax will need bindings at phase level -1, also known as the template phase level, to have any binding at the run-time phase level relative to the module that defines the macro.
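The situation can be sketched with a helper submodule. The names helper, make-sum, and sum are hypothetical, and the submodule stands in for a separate file. The helper's template mentions +, so the helper imports racket/base with for-template to give + a binding at phase level -1 relative to itself, which becomes phase level 0 in the module that imports the helper with for-syntax.

```racket
#lang racket/base
(module helper racket/base
  ;; Bindings for the identifiers that appear in the template below:
  (require (for-template racket/base))
  (provide make-sum)
  ;; Returns a syntax object whose + must be bound where the
  ;; macro that calls this helper is used.
  (define (make-sum a b)
    #`(+ #,a #,b)))

(require (for-syntax racket/base 'helper))

(define-syntax (sum stx)
  (syntax-case stx ()
    [(_ a b) (make-sum #'a #'b)]))

(sum 1 2) ; 3
```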
16.2.5
A phase can be thought of as a way to separate computations in a pipeline of processes where one produces code that is used by the next. (E.g., a pipeline that consists of a preprocessor process, a compiler, and an assembler.)
Imagine starting two Racket processes for this purpose. If you ignore inter-process communication channels like sockets and files, the processes will have no way to share anything other than the text that is piped from the standard output of one process into the standard input of the other. Similarly, Racket effectively allows multiple invocations of a module to exist in the same process but separated by phase. Racket enforces separation of such phases, where different phases cannot communicate in any way other than via the protocol of macro expansion, where the output of one phase is the code used in the next.
Phases and Bindings
Every binding of an identifier exists in a particular phase. The link between a binding and its phase is represented by an integer phase level. Phase level 0 is the phase used for plain (or runtime) definitions, so
(define age 5)
adds a binding for age into phase level 0. The identifier age can be defined at a higher phase level using begin-for-syntax:
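As a sketch that uses the values discussed next, a single module can bind age at both phase levels. It assumes #lang racket, which makes define available inside begin-for-syntax.

```racket
#lang racket
(define age 3)       ; binding at phase level 0

(begin-for-syntax
  (define age 9))    ; separate binding at phase level 1

age ; 3, because this reference is at phase level 0
```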
The age binding at phase level 0 has a value of 3, and the age binding at phase level 1 has a value of 9. Syntax objects capture binding information as a first-class value. Thus,
#'age
is a syntax object that represents the age binding, but since there are two ages (one at phase level 0 and one at phase level 1), which one does it capture? In fact, Racket imbues #'age with lexical information for all phase levels, so the answer is that #'age captures both. The relevant binding of age captured by #'age is determined when #'age is eventually used. As an example, we bind #'age to a pattern variable so we can use it in a template, and then we evaluate the template:
We use eval here to demonstrate phases, but see 15 Reflection and Dynamic Evaluation for caveats about eval.
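The same experiment can be sketched inside a module without eval, by letting a macro return the template. This is an alternative formulation, and the names age-id and age-in-template are illustrative.

```racket
#lang racket
(define age 3)
(begin-for-syntax
  (define age 9))

;; Bind #'age to a pattern variable and use it in a template. The macro
;; use below is at phase level 0, so the identifier resolves to the
;; phase-0 binding of age.
(define-syntax (age-in-template stx)
  (with-syntax ([age-id #'age])
    #'age-id))

(age-in-template) ; 3
```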
The result is 3 because age is used at phase level 0. We can try again with the use of age inside begin-for-syntax:
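A module-level sketch of this second experiment follows, again avoiding eval. The macro names are illustrative. The first macro places the identifier inside begin-for-syntax, so the value is printed while the module is expanded. The second refers to age directly in the transformer body, which also runs at phase level 1, and returns the value so that it can be inspected at run time.

```racket
#lang racket
(define age 3)
(begin-for-syntax
  (define age 9))

;; The template puts the same identifier in a phase-1 context;
;; 9 is displayed during expansion of this module.
(define-syntax (show-age-at-phase-1 stx)
  (with-syntax ([age-id #'age])
    #'(begin-for-syntax
        (displayln age-id))))
(show-age-at-phase-1)

;; A plain reference to age in a transformer body is a phase-1 reference.
(define-syntax (age-at-phase-1 stx)
  (with-syntax ([v age])
    #'(quote v)))

(age-at-phase-1) ; 9
```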
In this case, the answer is 9, because we are using age at phase level 1 instead of 0 (i.e., begin-for-syntax evaluates its expressions at phase level 1). So, you can see that we started with the same syntax object, #'age, and we were able to use it in two different ways: at phase level 0 and at phase level 1.
A syntax object has a lexical context from the moment it first exists. A syntax object that is provided from a module retains its lexical context, and so it references bindings in the context of its source module, not the context of its use. The following example defines button at phase level 0 and binds it to 0, while see-button binds the syntax object for button in module a:
> (module a racket
    (define button 0)
    (provide (for-syntax see-button))
    ; Why not (define see-button #'button)? We explain later.
    (define-for-syntax see-button #'button))
> (module b racket
    (require 'a)
    (define button 8)
    (define-syntax (m stx)
      see-button)
    (m))
> (require 'b)
0
The result of the m macro is the value of see-button, which is #'button with the lexical context of the a module. Even though there is another button in b, the second button will not confuse Racket, because the lexical context of #'button (the value bound to see-button) is a.
Note that see-button is bound at phase level 1 by virtue of defining it with define-for-syntax. Phase level 1 is needed because m is a macro, so its body executes at one phase higher than the context of its definition. Since m is defined at phase level 0, its body is at phase level 1, so any bindings referenced by the body must be at phase level 1.
Phases and Modules
A phase level is a module-relative concept. When importing from another module via require, Racket lets us shift imported bindings to a phase level that is different from the original one:
; no phase shift
; phase shift by +1
; phase shift by -1
; phase shift by +5
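The four shifts listed above correspond to plain require, for-syntax, for-template, and for-meta. The sketch below uses a hypothetical submodule a in place of a separate file, and identifier-binding with an explicit phase level to check where x ends up bound.

```racket
#lang racket/base
(module a racket/base
  (provide x)
  (define x 0))

(require 'a)                 ; no phase shift
(require (for-syntax 'a))    ; phase shift by +1
(require (for-template 'a))  ; phase shift by -1
(require (for-meta 5 'a))    ; phase shift by +5

x ; 0, the phase-0 import

;; identifier-binding returns #f when there is no binding at the given phase.
(and (identifier-binding #'x 1) #t)  ; #t
(and (identifier-binding #'x 2) #t)  ; #f
```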
That is, using for-syntax in require means that all of the bindings from that module will have their phase levels increased by one. A binding that is defined at phase level 0 and imported with for-syntax becomes a phase-level 1 binding:
> (module c racket
    (define x 0) ; defined at phase level 0
    (provide x))
> (module d racket
    (require (for-syntax 'c))
    ; has a binding at phase level 1, not 0:
    #'x)
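One way to see the shifted binding in action is to refer to x from a transformer body, which runs at phase level 1. The sketch below declares c as a submodule, and the macro name x-at-phase-1 is illustrative.

```racket
#lang racket
(module c racket
  (define x 0) ; defined at phase level 0
  (provide x))

(require (for-syntax 'c)) ; x is now bound at phase level 1 here

;; The transformer body can refer to x because it runs at phase level 1.
(define-syntax (x-at-phase-1 stx)
  (with-syntax ([v x])
    #'(quote v)))

(x-at-phase-1) ; 0
;; A plain reference to x at phase level 0 in this module would be unbound.
```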
Let's see what happens if we try to create a binding for the #'button syntax object at phase level 0:
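A minimal sketch of such definitions in a #lang racket module follows. Instead of evaluating the syntax object, it uses identifier-binding to ask at which phase levels the identifier has a binding.

```racket
#lang racket
(define button 0)
(define see-button #'button) ; an ordinary phase-0 variable holding syntax

;; The identifier has a module-level binding at phase level 0 ...
(list? (identifier-binding see-button 0)) ; #t
;; ... and no binding at phase level 1.
(identifier-binding see-button 1)         ; #f
```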
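Now both button and see-button are defined at phase 0. The lexical context of #'button will know that there is a binding for button at phase 0. In fact, it seems like things are working just fine if we try to eval see-button at phase level 0. Using see-button in a macro is another matter. In the following modules, b imports a with for-syntax so that see-button is available at phase level 1, where the body of the macro m can refer to it: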
> (module a racket
    (define button 0)
    (define see-button #'button)
    (provide see-button))
> (module b racket
    (require (for-syntax 'a)) ; gets see-button at phase level 1
    (define-syntax (m stx)
      see-button)
    (m))
eval:1:0: compile: unbound identifier (and no #%top syntax transformer is bound) in: button
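Racket says that button is unbound now! When a is imported at phase level 1, both button and see-button are bound at phase level 1 in b. The macro m can therefore see see-button, and it returns #'button, but the use (m) is at phase level 0, where b has no binding for button.
Defining see-button with (define see-button #'button) is not inherently wrong, however. For example, m can sensibly use see-button if it places the identifier in a phase level 1 context using begin-for-syntax: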
> (module a racket
    (define button 0)
    (define see-button #'button)
    (provide see-button))
> (module b racket
    (require (for-syntax 'a))
    (define-syntax (m stx)
      (with-syntax ([x see-button])
        #'(begin-for-syntax
            (displayln x))))
    (m))
0
In this case, module b has both button and see-button bound at phase level 1. The expansion of the macro is (begin-for-syntax (displayln button)), which works because button is bound at phase level 1.
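Importing a at both phase level 0 and phase level 1 gives b bindings for button and see-button at each of the two levels. You might expect now that see-button in a macro would work, but it doesn't: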
> (module a racket
    (define button 0)
    (define see-button #'button)
    (provide see-button))
> (module b racket
    (require 'a
             (for-syntax 'a))
    (define-syntax (m stx)
      see-button)
    (m))
eval:1:0: compile: unbound identifier (and no #%top syntax transformer is bound) in: button
The see-button inside the m macro comes from the (for-syntax 'a) import. For the macro to work, there must be a button bound at phase 0, and there is such a binding implied by (require 'a). However, (require 'a) and (require (for-syntax 'a)) are different instantiations of the same module. The see-button at phase 1 only refers to the button at phase level 1, not the button bound at phase 0 from a different instantiation, even one from the same source module.
Mismatches like the one above can show up when a macro tries to match literal bindings using syntax-case or syntax-parse.
> (module x racket
    (require (for-syntax syntax/parse)
             (for-template racket/base))

    (provide (all-defined-out))

    (define button 0)
    (define (make) #'button)
    (define-syntax (process stx)
      (define-literal-set locals (button))
      (syntax-parse stx
        [(_ (n (~literal button))) #'#''ok])))
> (module y racket
    (require (for-meta 1 'x)
             (for-meta 2 'x racket/base))

    (begin-for-syntax
      (define-syntax (m stx)
        (with-syntax ([out (make)])
          #'(process (0 out)))))

    (define-syntax (p stx)
      (m))

    (p))
eval:1:0: process: expected the identifier button at: button in: (process (0 button))
In this example, make is being used in y at phase level 2, and it returns the #'button syntax object, which refers to button bound at phase level 0 inside x and at phase level 2 in y from (for-meta 2 'x). The process macro is imported at phase level 1 from (for-meta 1 'x), and it knows that button should be bound at phase level 1. When the syntax-parse is executed inside process, it is looking for button bound at phase level 1 but it sees only a phase level 2 binding and doesn't match.
To fix the example, we can provide make at phase level 1 relative to x, and then we import it at phase level 1 in y:
> (module x racket
    (require (for-syntax syntax/parse)
             (for-template racket/base))

    (provide (all-defined-out))

    (define button 0)

    (provide (for-syntax make))
    (define-for-syntax (make) #'button)
    (define-syntax (process stx)
      (define-literal-set locals (button))
      (syntax-parse stx
        [(_ (n (~literal button))) #'#''ok])))
> (module y racket
    (require (for-meta 1 'x)
             (for-meta 2 racket/base))

    (begin-for-syntax
      (define-syntax (m stx)
        (with-syntax ([out (make)])
          #'(process (0 out)))))

    (define-syntax (p stx)
      (m))

    (p))
> (require 'y)
'ok
16.2.6
Syntax Taints
A use of a macro can expand into a use of an identifier that is not exported from the module that binds the macro. In general, such an identifier must not be extracted from the expanded expression and used in a different context, because using the identifier in a different context may break invariants of the macro's module.
For example, the following module exports a macro go that expands to a use of unchecked-go:
"m.rkt"
#lang racket
(provide go)

(define (unchecked-go n x)
  ; to avoid disaster, n must be a number
  (+ n 17))

(define-syntax (go stx)
  (syntax-case stx ()
    [(_ x)
     #'(unchecked-go 8 x)]))
If the reference to unchecked-go is extracted from the expansion of (go 'a), then it might be inserted into a new expression, (unchecked-go #f 'a), leading to disaster. The datum->syntax procedure can be used similarly to construct references to an unexported identifier, even when no macro expansion includes a reference to the identifier.
To prevent such abuses of unexported identifiers, the go macro must explicitly protect its expansion by using syntax-protect:
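A sketch of the protected variant, based on the "m.rkt" module above, wraps the template in a call to syntax-protect. The final line is only a usage check and is not part of the module as described.

```racket
#lang racket
(provide go)

(define (unchecked-go n x)
  ; to avoid disaster, n must be a number
  (+ n 17))

(define-syntax (go stx)
  (syntax-case stx ()
    [(_ x)
     ;; Arm the expansion so that unchecked-go cannot be extracted and reused.
     (syntax-protect #'(unchecked-go 8 x))]))

(go 'a) ; 25, since ordinary uses of the whole expansion still work
```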
Just as the macro expander copies properties from a syntax transformer's input to its output (see 11.7 Syntax Object Properties), the expander copies dye packs from a transformer's input to its output. Building on the previous example,
"n.rkt"
#lang racket
(require "m.rkt")

(provide go-more)

(define y 'hello)

(define-syntax (go-more stx)
  (syntax-protect #'(go y)))
the expansion of (go-more) introduces a reference to the unexported y in (go y), and the expansion result is armed so that y cannot be extracted from the expansion. Even if go did not use syntax-protect for its result (perhaps because it does not need to protect unchecked-go after all), the dye pack on (go y) is propagated to the final expansion (unchecked-go 8 y). The macro expander uses syntax-rearm to propagate dye packs from a transformer's input to its output.
Tainting Modes
In some cases, a macro implementor intends to allow limited destructuring of a macro result without tainting the result. For example, given the following define-like-y macro,
"q.rkt"
#lang racket

(provide define-like-y)

(define y 'hello)

(define-syntax (define-like-y stx)
  (syntax-case stx ()
    [(_ id) (syntax-protect #'(define-values (id) y))]))

someone may use the macro in an internal definition:

(let ()
  (define-like-y x)
  x)
The implementor of the "q.rkt" module most likely intended to allow such uses of define-like-y. To convert an internal definition into a letrec binding, however, the define form produced by define-like-y must be deconstructed, which would normally taint both the binding x and the reference to y.
Instead, the internal use of define-like-y is allowed, because syntax-protect treats specially a syntax list that begins with define-values. In that case, instead of arming the overall expression, each individual element of the syntax list is armed, pushing dye packs further into the second element of the list so that they are attached to the defined identifiers. Thus, define-values, x, and y in the expansion result (define-values (x) y) are individually armed, and the definition can be deconstructed for conversion to letrec.
Just like syntax-protect, the expander rearms a transformer result that starts with define-values, by pushing dye packs into the list elements. As a result, define-like-y could have been implemented to produce (define id y), which uses define instead of define-values. In that case, the entire define form is at first armed with a dye pack, but as the define form is expanded to define-values, the dye pack is moved to the parts.
The macro expander treats syntax-list results starting with define-syntaxes in the same way that it treats results starting with define-values. Syntax-list results starting with begin are treated similarly, except that the second element of the syntax list is treated like all the other elements (i.e., the immediate element is armed, instead of its content). Furthermore, the macro expander applies this special handling recursively, in case a macro produces a begin form that contains nested define-values forms.
The default application of dye packs can be overridden by attaching a 'taint-mode property (see 11.7 Syntax Object Properties) to the result syntax object of a macro transformer. If the property value is 'opaque, then the syntax object is armed and not its parts. If the property value is 'transparent, then the syntax object's parts are armed. If the property value is 'transparent-binding, then the syntax object's parts and the subparts of the second part (as for define-values and define-syntaxes) are armed.
The 'transparent and 'transparent-binding modes trigger recursive property checking at the parts, so that armings can be pushed arbitrarily deep into a transformer's result.
Taints and Code Inspectors
Tools that are intended to be privileged (such as a debugging transformer) must disarm dye packs in expanded programs. Privilege is granted through code inspectors. Each dye pack records an inspector, and a syntax object can be disarmed using a sufficiently powerful inspector.
When a module is declared, the declaration captures the current value of the current-code-inspector parameter. The captured inspector is used when syntax-protect is applied by a macro transformer that is defined within the module. A tool can disarm the resulting syntax object by supplying syntax-disarm with an inspector that is the same or a super-inspector of the module's inspector. Untrusted code is ultimately run after setting current-code-inspector to a less powerful inspector (after trusted code, such as debugging tools, has been loaded).
With this arrangement, macro-generating macros require some care, since the generating macro may embed syntax objects in the generated macro that need to have the generating module's protection level, rather than the protection level of the module that contains the generated macro. To avoid this problem, use the module's declaration-time inspector, which is accessible as (variable-reference->module-declaration-inspector (#%variable-reference)), and use it to define a variant of syntax-protect.
For example, suppose that the go macro is implemented through a macro:
#lang racket
(provide def-go)

(define (unchecked-go n x)
  (+ n 17))

(define-syntax (def-go stx)
  (syntax-case stx ()
    [(_ go)
     (syntax-protect
      #'(define-syntax (go stx)
          (syntax-case stx ()
            [(_ x)
             (syntax-protect #'(unchecked-go 8 x))])))]))
When def-go is used inside another module to define go, and when the go-defining module is at a different protection level than the def-go-defining module, the generated macro's use of syntax-protect is not right. The use of unchecked-go should be protected at the level of the def-go-defining module, not the go-defining module.
The solution is to define and use go-syntax-protect, instead:
#lang racket
(provide def-go)

(define (unchecked-go n x)
  (+ n 17))

(define-for-syntax go-syntax-protect
  (let ([insp (variable-reference->module-declaration-inspector
               (#%variable-reference))])
    (lambda (stx) (syntax-arm stx insp))))

(define-syntax (def-go stx)
  (syntax-case stx ()
    [(_ go)
     (syntax-protect
      #'(define-syntax (go stx)
          (syntax-case stx ()
            [(_ x)
             (go-syntax-protect #'(unchecked-go 8 x))])))]))
Protected Exports
Sometimes, a module needs to export bindings to some modules (other modules that are at the same trust level as the exporting module) but prevent access from untrusted modules. Such exports should use the protect-out form in provide. For example, ffi/unsafe exports all of its unsafe bindings as protected in this sense.
Code inspectors, again, provide the mechanism for determining which modules are trusted and which are untrusted. When a module is declared, the value of current-code-inspector is associated to the module declaration. When a module is instantiated (i.e., when the body of the declaration is actually executed), a sub-inspector is created to guard the module's exports. Access to the module's protected exports requires a code inspector higher in the inspector hierarchy than the module's instantiation inspector; note that a module's declaration inspector is always higher than its instantiation inspector, so modules that are declared with the same code inspector can access each other's exports.
Syntax-object constants within a module, such as literal identifiers in a template, retain the inspector of their source module. In this way, a macro from a trusted module can be used within an untrusted module, and protected identifiers in the macro expansion still work, even though they ultimately appear in an untrusted module. Naturally, such identifiers should be armed, so that they cannot be extracted from the macro expansion and abused by untrusted code.
Compiled code from a ".zo" file is inherently untrustworthy, unfortunately, since it can be synthesized by means other than compile. When compiled code is written to a ".zo" file, syntax-object constants within the compiled code lose their inspectors. All syntax-object constants within compiled code acquire the enclosing module's declaration-time inspector when the code is loaded.
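As a small sketch of the provide side, the module below exports a binding with protect-out. The names vault and unsafe-peek are hypothetical. Because both modules here are declared under the same code inspector, the import is allowed; code declared under a weaker inspector would be refused access.

```racket
#lang racket
(module vault racket
  ;; Only modules at a sufficient trust level may use unsafe-peek.
  (provide (protect-out unsafe-peek))
  (define (unsafe-peek) 'secret))

(require 'vault)

(unsafe-peek) ; 'secret
```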
17
Creating Languages
The macro facilities defined in the preceding chapter let a programmer define syntactic extensions to a language, but a macro is limited in two ways: a macro cannot restrict the syntax available in its context or change the meaning of surrounding forms; and a macro can extend the syntax of a language only within the parameters of the language's lexical conventions, such as using parentheses to group the macro name with its subforms and using the core syntax of identifiers, keywords, and literals.
That is, a macro can only extend a language, and it can do so only at the expander layer. Racket offers additional facilities for defining a starting point of the expander layer, for extending the reader layer, for defining the starting point of the reader layer, and for packaging a reader and expander starting point into a conveniently named language.
The distinction between the reader and expander layer is introduced in 2.4.3 Lists and Racket Syntax.
17.1
Module Languages
When using the longhand module form for writing modules, the module path that is specified after the new module's name provides the initial imports for the module. Since the initial-import module determines even the most basic bindings that are available in a module's body, such as require, the initial import can be called a module language.
The most common module languages are racket or racket/base, but you can define your own module language by defining a suitable module. For example, using provide subforms like all-from-out, except-out, and rename-out, you can add, remove, or rename bindings from racket to produce a module language that is a variant of racket:
> (module raquet racket
    (provide (except-out (all-from-out racket) lambda)
             (rename-out [lambda function])))
> (module score 'raquet
    (map (function (points) (case points
                              [(0) "love"] [(1) "fifteen"]
                              [(2) "thirty"] [(3) "forty"]))
         (list 0 2)))
> (require 'score)
'("love" "thirty")
17.1.1
If you try to remove too much from racket in defining your own module language, then the resulting module will no longer work right as a module language:
> (module just-lambda racket
    (provide lambda))
> (module identity 'just-lambda
    (lambda (x) x))
eval:2:0: module: no #%module-begin binding in the module's language in: (module identity (quote just-lambda) (lambda (x) x))
The #%module-begin form is an implicit form that wraps the body of a module. It must be provided by a module that is to be used as a module language:
> (module just-lambda racket
    (provide lambda #%module-begin))
> (module identity 'just-lambda
    (lambda (x) x))
> (require 'identity)
#<procedure>
The other implicit forms provided by racket/base are #%app for function calls, #%datum for literals, and #%top for identifiers that have no binding:
> (module just-lambda racket
    (provide lambda #%module-begin
             ; ten needs these, too:
             #%app #%datum))
> (module ten 'just-lambda
    ((lambda (x) x) 10))
> (require 'ten)
10
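To make the implicit forms visible, the following sketch writes #%app and #%datum out explicitly in a racket/base module. Each line behaves like the shorthand shown in its comment.

```racket
#lang racket/base
(#%app + 1 2)                          ; same as (+ 1 2), produces 3
(#%datum . 10)                         ; same as the literal 10
(#%app (lambda (x) x) (#%datum . 10))  ; same as ((lambda (x) x) 10), produces 10
```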
Implicit forms such as #%app can be used explicitly in a module, but they exist mainly to allow a module language to restrict or change the meaning of implicit uses. For example, a lambda-calculus module language might restrict functions to a single argument, restrict function calls to supply a single argument, restrict the module body to a single expression, disallow literals, and treat unbound identifiers as uninterpreted symbols:
> (module lambda-calculus racket
    (provide (rename-out [1-arg-lambda lambda]
                         [1-arg-app #%app]
                         [1-form-module-begin #%module-begin]
                         [no-literals #%datum]
                         [unbound-as-quoted #%top]))
    (define-syntax-rule (1-arg-lambda (x) expr)
      (lambda (x) expr))
    (define-syntax-rule (1-arg-app e1 e2)
      (#%app e1 e2))
    (define-syntax-rule (1-form-module-begin e)
      (#%module-begin e))
    (define-syntax (no-literals stx)
      (raise-syntax-error #f "no" stx))
    (define-syntax-rule (unbound-as-quoted . id)
      'id))
> (module ok 'lambda-calculus
    ((lambda (x) (x z))
     (lambda (y) y)))
> (require 'ok)
'z
> (module not-ok 'lambda-calculus
    (lambda (x y) x))
eval:4:0: lambda: use does not match pattern: (lambda (x) expr) in: (lambda (x y) x)
> (module not-ok 'lambda-calculus
    (lambda (x) x)
    (lambda (y) (y y)))
eval:5:0: #%module-begin: use does not match pattern: (#%module-begin e) in: (#%module-begin (lambda (x) x) (lambda (y) (y y)))
> (module not-ok 'lambda-calculus
    (lambda (x) (x x x)))
eval:6:0: #%app: use does not match pattern: (#%app e1 e2) in: (#%app x x x)
> (module not-ok 'lambda-calculus
    10)
eval:7:0: #%datum: no in: (#%datum . 10)
Module languages rarely redefine #%app, #%datum, and #%top, but redefining #%module-begin is more frequently useful. For example, when using modules to construct descriptions of HTML pages where a description is exported from the module as page, an alternate #%module-begin can help eliminate provide and quasiquoting boilerplate, as in "html.rkt":
"html.rkt"
#lang racket
(require racket/date)

(provide (except-out (all-from-out racket)
                     #%module-begin)
         (rename-out [module-begin #%module-begin])
         now)

(define-syntax-rule (module-begin expr ...)
  (#%module-begin
   (define page `(html expr ...))
   (provide page)))

(define (now)
  (parameterize ([date-display-format 'iso-8601])
    (date->string (seconds->date (current-seconds)))))
Using the "html.rkt" module language, a simple web page can be described without having to explicitly define or export page and starting in quasiquoted mode instead of expression mode:
> (module lady-with-the-spinning-head "html.rkt"
    (title "Queen of Diamonds")
    (p "Updated: " ,(now)))
> (require 'lady-with-the-spinning-head)
> page
'(html (title "Queen of Diamonds") (p "Updated: " "2012-04-09"))
17.1.2 Using #lang s-exp
Implementing a language at the level of #lang is more complex than declaring a single module, because #lang lets programmers control several different facets of a language. The s-exp language, however, acts as a kind of meta-language for using a module language with the #lang shorthand:
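#lang s-exp module-name
form ...

is the same as

(module name module-name
  form ...)

where name is derived from the source file containing the #lang program.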
17.2 Reader Extensions
The reader layer of the Racket language can be extended through the #reader form. A reader extension is implemented as a module that is named after #reader. The module exports functions that parse raw characters into a form to be consumed by the expander layer. The syntax of #reader is
#reader module-path reader-specific
where module-path names a module that provides read and read-syntax functions. The reader-specific part is a sequence of characters that is parsed as determined by the read and read-syntax functions from module-path. For example, suppose that file "five.rkt" contains
"five.rkt"
#lang racket/base

(provide read read-syntax)

(define (read in) (list (read-string 5 in)))
(define (read-syntax src in) (list (read-string 5 in)))
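Then, the program

#lang racket/base
'(1 #reader"five.rkt"234567 8)

is equivalent to

#lang racket/base
'(1 ("23456") 7 8)

because the read and read-syntax functions of "five.rkt" both read five characters from the input stream and put them into a string and then a list.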
The difference between read and read-syntax is that read is meant to be used for data while read-syntax is meant to be used to parse programs. More precisely, the read function will be used when the enclosing stream is being parsed by the Racket read, and read-syntax is used when the enclosing stream is being parsed by the Racket read-syntax function. Nothing requires read and read-syntax to parse input in the same way, but making them different would confuse programmers and tools.
The read-syntax function can return the same kind of value as read, but it should normally return a syntax object that connects the parsed expression with source locations. Unlike the "five.rkt" example, the read-syntax function is typically implemented directly to produce syntax objects, and then read can use read-syntax and strip away syntax object wrappers to produce a raw result.
The following "arith.rkt" module implements a reader to parse simple infix arithmetic expressions into Racket forms. For example, 1*2+3 parses into the Racket form (+ (* 1 2) 3). The supported operators are +, -, *, and /, while operands can be unsigned integers or single-letter variables. The implementation uses port-next-location to obtain the current source location, and it uses datum->syntax to turn raw values into syntax objects.
"arith.rkt"
#lang racket
(require syntax/readerr)

(provide read read-syntax)

(define (read in)
  (syntax->datum (read-syntax #f in)))

(define (read-syntax src in)
  (skip-whitespace in)
  (read-arith src in))

(define (skip-whitespace in)
  (regexp-match #px"^\\s*" in))

(define (read-arith src in)
  (define-values (line col pos) (port-next-location in))
  (define expr-match
    (regexp-match
     ; Match an operand followed by any number of
     ; operator-operand sequences, and prohibit an
     ; additional operator from following immediately:
     #px"^([a-z]|[0-9]+)(?:[-+*/]([a-z]|[0-9]+))*(?![-+*/])"
     in))

  (define (to-syntax v delta span-str)
    (datum->syntax #f v (make-srcloc delta span-str)))
  (define (make-srcloc delta span-str)
    (and line
         (vector src line (+ col delta) (+ pos delta)
                 (string-length span-str))))

  (define (parse-expr s delta)
    (match (or (regexp-match #rx"^(.*?)([+-])(.*)$" s)
               (regexp-match #rx"^(.*?)([*/])(.*)$" s))
      [(list _ a-str op-str b-str)
       (define a-len (string-length a-str))
       (define a (parse-expr a-str delta))
       (define b (parse-expr b-str (+ delta 1 a-len)))
       (define op (to-syntax (string->symbol op-str)
                             (+ delta a-len) op-str))
       (to-syntax (list op a b) delta s)]
      [else (to-syntax (or (string->number s)
                           (string->symbol s))
                       delta s)]))

  (unless expr-match
    (raise-read-error "bad arithmetic syntax"
                      src line col pos
                      (and pos (- (file-position in) pos))))
  (parse-expr (bytes->string/utf-8 (car expr-match)) 0))
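The way "arith.rkt" attaches source locations can be tried in isolation. The following is a minimal sketch, not part of "arith.rkt"; the source name "example.txt" and the positions are made up for illustration. It builds a syntax object with datum->syntax and a five-element location vector, then reads the location back and strips the wrapper with syntax->datum:

```racket
#lang racket/base

;; Hypothetical source name, chosen only for illustration.
(define src "example.txt")

;; A location vector has the shape #(source line column position span).
(define stx
  (datum->syntax #f '(+ 1 2) (vector src 1 0 1 7)))

(syntax-source stx)   ; the source name given above
(syntax-line stx)     ; line taken from the vector
(syntax-column stx)   ; column taken from the vector
(syntax-span stx)     ; span taken from the vector
(syntax->datum stx)   ; the raw list, without the syntax wrapper
```

A read function defined as (syntax->datum (read-syntax #f in)) relies on exactly this last step.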
If the "arith.rkt" reader is used in an expression position, then its parse result will be treated as a Racket expression. If it is used in a quoted form, however, then it just produces a number or a list:
> (let #reader"arith.rkt" 1*2+3 8)
repl:1:27: let: bad syntax (not an identifier and expression for a binding) at: + in: (let (+ (* 1 2) 3) 8)
17.2.2 Readtables
A reader extension's ability to parse input characters in an arbitrary way can be powerful, but many cases of lexical extension call for a less general but more composable approach. In much the same way that the expander level of Racket syntax can be extended through macros, the reader level of Racket syntax can be composably extended through a readtable.
The Racket reader is a recursive-descent parser, and the readtable maps characters to parsing handlers. For example, the default readtable maps ( to a handler that recursively parses subforms until it finds a ). The current-readtable parameter determines the readtable that is used by read or read-syntax. Rather than parsing raw characters directly, a reader extension can install an extended readtable and then chain to read or read-syntax.
The make-readtable function constructs a new readtable as an extension of an existing one. It accepts a sequence of specifications in terms of a character, a type of mapping for the character, and (for certain types of mappings) a parsing procedure. For example, to extend the readtable so that $ can be used to start and end infix expressions, implement a read-dollar function and use:
(make-readtable (current-readtable)
                #\$ 'terminating-macro read-dollar)
The "dollar.rkt" module below defines read-dollar in terms of the read and read-syntax functions provided by "arith.rkt", and it combines read-dollar with new read and read-syntax functions that install the readtable and chain to Racket's read or read-syntax:
"dollar.rkt"
#lang racket
(require syntax/readerr
         (prefix-in arith: "arith.rkt"))

(provide (rename-out [$-read read]
                     [$-read-syntax read-syntax]))

(define ($-read in)
  (parameterize ([current-readtable (make-$-readtable)])
    (read in)))

(define ($-read-syntax src in)
  (parameterize ([current-readtable (make-$-readtable)])
    (read-syntax src in)))

(define (make-$-readtable)
  (make-readtable (current-readtable)
                  #\$ 'terminating-macro read-dollar))

(define read-dollar
  (case-lambda
   [(ch in)
    (check-$-after (arith:read in) in (object-name in))]
   [(ch in src line col pos)
    (check-$-after (arith:read-syntax src in) in src)]))

(define (check-$-after val in src)
  (regexp-match #px"^\\s*" in) ; skip whitespace
  (let ([ch (peek-char in)])
    (unless (equal? ch #\$) (bad-ending ch src in))
    (read-char in))
  val)

(define (bad-ending ch src in)
  (let-values ([(line col pos) (port-next-location in)])
    ((if (eof-object? ch)
         raise-read-eof-error ; input ended before the closing `$'
         raise-read-error)    ; some other character appeared instead
     "expected a closing `$'"
     src line col pos
     (if (eof-object? ch) 0 1))))
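The readtable mechanism can also be exercised without "arith.rkt". The sketch below is an illustration only; the names read-wrapped and read-with-dollar and the dollar tag are invented here. It maps $ to a handler that reads the next datum with read/recursive (or read-syntax/recursive) and wraps it in a list, using the same two-arity case-lambda shape as read-dollar:

```racket
#lang racket/base

;; Handler for `$`: read the datum that follows and wrap it as (dollar <datum>).
;; The two-argument case is used in `read` mode, the six-argument case
;; in `read-syntax` mode.
(define read-wrapped
  (case-lambda
    [(ch in)
     (list 'dollar (read/recursive in))]
    [(ch in src line col pos)
     (datum->syntax #f (list 'dollar (read-syntax/recursive src in)))]))

;; Extend whatever readtable is current with the `$` mapping.
(define (make-wrapping-readtable)
  (make-readtable (current-readtable)
                  #\$ 'terminating-macro read-wrapped))

;; Read one datum from a string with the extended readtable installed.
(define (read-with-dollar str)
  (parameterize ([current-readtable (make-wrapping-readtable)])
    (read (open-input-string str))))

(read-with-dollar "(1 $x 2)")
```

Because the handler is installed through a readtable, the surrounding parentheses and numbers are still parsed by the ordinary Racket reader; only the $ character is intercepted.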
With this reader extension, a single #reader can be used at the beginning of an expression to enable multiple uses of $ that switch to infix arithmetic:

> #reader"dollar.rkt" (let ([a $1*2+3$] [b $5/6$]) $a+b$)
35/6
17.3 Defining new #lang Languages
When loading a module as a source program that starts

#lang language

the language determines the way that the rest of the module is parsed at the reader level. The reader-level parse must produce a module form as a syntax object. As always, the second sub-form after module specifies the module language that controls the meaning of the module's body forms. Thus, a language specified after #lang controls both the reader-level and expander-level parsing of a module.
17.3.1 Designating a Language
The syntax of a language intentionally overlaps with the syntax of a module path as used in require or as a module language, so that names like racket, racket/base, slideshow, or scribble/manual can be used both as #lang languages and as module paths. At the same time, the syntax of language is far more restricted than a module path, because only a-z, A-Z, 0-9, / (not at the start or end), _, -, and + are allowed in a language name. These restrictions keep the syntax of #lang as simple as possible. Keeping the syntax of #lang simple, in turn, is important because the syntax is inherently inflexible and non-extensible; the #lang protocol allows a language to refine and define syntax in a practically unconstrained way, but the #lang protocol itself must remain fixed so that various different tools can boot into the extended world.
Fortunately, the #lang protocol provides a natural way to refer to languages in ways other than the rigid language syntax: by defining a language that implements its own nested protocol. We have already seen one example (in 17.1.2 Using #lang s-exp): the s-exp language allows a programmer to specify a module language using the general module path syntax. Meanwhile, s-exp takes care of the reader-level responsibilities of a #lang language. Unlike racket, s-exp cannot be used as a module path with require.
Although the syntax of language for #lang overlaps with the syntax of module paths, a language is not used directly as a module path. Instead, a language is suffixed with /lang/reader to obtain a module path, and the resulting module supplies read and read-syntax functions using a protocol that is similar to the one for #reader.
A consequence of the way that a #lang language is turned into a module path is that the language must be installed in a collection, similar to the way that "racket" or "slideshow" are collections that are distributed with Racket.
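The naming rule and the /lang/reader suffix can be sketched as a pair of small functions. This is an illustration of the rules as stated in the text, not the code that Racket itself uses; the function names lang-name-ok? and lang->reader-module-path are invented for the example.

```racket
#lang racket/base

;; Checks only the character restrictions described above:
;; a-z, A-Z, 0-9, /, _, -, and + are allowed, and / may not
;; appear at the start or end of the name.
(define (lang-name-ok? name)
  (and (regexp-match? #rx"^[-a-zA-Z0-9_+/]+$" name)
       (not (regexp-match? #rx"^/" name))
       (not (regexp-match? #rx"/$" name))))

;; Returns the module path obtained by adding the /lang/reader suffix,
;; or #f if the name is not acceptable.
(define (lang->reader-module-path name)
  (and (lang-name-ok? name)
       (string->symbol (string-append name "/lang/reader"))))

(lang->reader-module-path "scribble/base")
(lang->reader-module-path "/racket")
```

Under these assumptions, the first call yields the symbol scribble/base/lang/reader, and the second yields #f because the name starts with a slash.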
Again, however, there's an escape from this restriction: the reader language lets you specify a reader-level implementation of a language using a general module path.
17.3.2 Using #lang reader
The reader language for #lang is similar to s-exp, in that it acts as a kind of meta-language. Whereas s-exp lets a programmer specify a module language at the expander layer of parsing, reader lets a programmer specify a language at the reader level.
A #lang reader must be followed by a module path, and the specified module must provide two functions: read and read-syntax. The protocol is the same as for a #reader implementation, but for #lang, the read and read-syntax functions must produce a module form that is based on the rest of the input file for the module.
The following "literal.rkt" module implements a language that treats its entire body as literal text and exports the text as a data string:
"literal.rkt"
#lang racket
(require syntax/strip-context)

(provide (rename-out [literal-read read]
                     [literal-read-syntax read-syntax]))

(define (literal-read in)
  (syntax->datum
   (literal-read-syntax #f in)))

(define (literal-read-syntax src in)
  (with-syntax ([str (port->string in)])
    (strip-context
     #'(module anything racket
         (provide data)
         (define data 'str)))))
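To see what such a reader produces without creating any files, its read-syntax function can be applied to a string port directly. The sketch below repeats literal-read-syntax in a self-contained module and converts the result to a plain datum; the input string and the name produced are arbitrary choices for illustration.

```racket
#lang racket
(require syntax/strip-context)

;; Same shape as the reader function in "literal.rkt":
;; the whole remaining input becomes one quoted string.
(define (literal-read-syntax src in)
  (with-syntax ([str (port->string in)])
    (strip-context
     #'(module anything racket
         (provide data)
         (define data 'str)))))

;; Feed the reader a string port instead of a file.
(define produced
  (syntax->datum
   (literal-read-syntax #f (open-input-string "Hello\n"))))

produced
```

The result is a module form whose body defines data as the quoted input text, which is the form that the #lang reader protocol expects a read-syntax function to return.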
The "literal.rkt" language uses strip-context on the generated module expression, because a read-syntax function should return a syntax object with no lexical context. Also, the "literal.rkt" language creates a module named anything, which is an arbitrary choice; the language is intended to be used in a file, and the longhand module name is ignored when it appears in a required file. The "literal.rkt" language can be used in a module "tuvalu.rkt":
"tuvalu.rkt"
Parsing a module body is usually not as trivial as in "literal.rkt". A more typical module parser must iterate to parse multiple forms for a module body. A language is also more likely to extend Racket syntax, perhaps through a readtable, instead of replacing Racket syntax completely.
17.3.3 Using #lang s-exp syntax/module-reader
The syntax/module-reader module language abstracts over common parts of a language implementation to simplify the creation of new languages. In its most basic form, a language implemented with syntax/module-reader simply specifies the module language to be used for the language, in which case the reader layer of the language is the same as Racket. For example, with
"raquet-mlang.rkt"
#lang racket
(provide (except-out (all-from-out racket) lambda)
         (rename-out [lambda function]))

and

"raquet.rkt"
#lang s-exp syntax/module-reader
"raquet-mlang.rkt"

then

#lang reader "raquet.rkt"
(define identity (function (x) x))
(provide identity)
implements and exports the identity function, since "raquet-mlang.rkt" exports lambda as function.
The syntax/module-reader language accepts many optional specifications to adjust other features of the language. For example, an alternate read and read-syntax for parsing the language can be specified with #:read and #:read-syntax, respectively. The following "dollar-racket.rkt" language uses "dollar.rkt" (see 17.2.2 Readtables) to build a language that is like racket but with a $ escape to simple infix arithmetic:
"dollar-racket.rkt"
#lang s-exp syntax/module-reader
racket
#:read $-read
#:read-syntax $-read-syntax

(require (prefix-in $- "dollar.rkt"))
The require form appears at the end of the module, because all of the keyword-tagged optional specifications for syntax/module-reader must appear before any helper imports or definitions.
The following module uses "dollar-racket.rkt" to implement a cost function using a $ escape:
"store.rkt"
#lang reader "dollar-racket.rkt"

(provide cost)

; Cost of `n' $1 rackets with 7% sales
; tax and shipping-and-handling fee `h':
(define (cost n h)
  $n*107/100+h$)
17.3.4 Installing a Language
So far, we have used the reader meta-language to access languages like "literal.rkt" and "dollar-racket.rkt". If you want to use something like #lang literal directly, then you must move "literal.rkt" into a Racket collection named "literal".
There are two ways to create the "literal" collection (see also 6.1.3 Adding Collections):
You can create a directory either in the main Racket installation or in a user-specific directory. Use find-collects-dir or find-user-collects-dir from setup/dirs to find the directory:
.... (the main installation or the user's space)
 |- "collects"
      |- "literal"
           |- "lang"
                |- "reader.rkt"
Alternatively, move "literal.rkt" to "literal/lang/reader.rkt" for any directory name "literal". Then, in the directory that contains "literal", use the command line
  raco link literal
to register the "literal" directory as the "literal" collection.
See raco: Racket Command-Line Tools for more information on using raco.
See PLaneT: Automatic Package Distribution for more information about PLaneT packages.
17.3.5 Source-Handling Configuration
The Racket distribution includes a Scribble language for writing prose documents, where Scribble extends the normal Racket to better support text. Here is an example Scribble document:
#lang scribble/base

@(define (get-name) "Self-Describing Document")

@title[(get-name)]

The title of this document is ``@(get-name).''
If you put that program in DrRacket's definitions area and click Run, then nothing much appears to happen. The scribble/base language just binds and exports doc as a description of a document, similar to the way that "literal.rkt" exports a string as data.
Simply opening a module with the language scribble/base in DrRacket, however, causes a Scribble HTML button to appear. Furthermore, DrRacket knows how to colorize Scribble syntax by coloring green those parts of the document that correspond to literal text. The language name scribble/base is not hard-wired into DrRacket. Instead, the implementation of the scribble/base language provides button and syntax-coloring information in response to a query from DrRacket.
For security reasons, only languages that have been specifically installed by a user can respond to language-information queries. If you have installed the literal language as described in 17.3.4 Installing a Language, then you can adjust "literal/lang/reader.rkt" so that DrRacket treats the content of a module in the literal language as plain text instead of (erroneously) as Racket syntax:
"literal/lang/reader.rkt"
#lang racket
(require syntax/strip-context)

(provide (rename-out [literal-read read]
                     [literal-read-syntax read-syntax])
         get-info)

(define (literal-read in)
  (syntax->datum
   (literal-read-syntax #f in)))

(define (literal-read-syntax src in)
  (with-syntax ([str (port->string in)])
    (strip-context
     #'(module anything racket
         (provide data)
         (define data 'str)))))

(define (get-info in mod line col pos)
  (lambda (key default)
    (case key
      [(color-lexer)
       (dynamic-require 'syntax-color/default-lexer
                        'default-lexer)]
      [else default])))
This revised literal implementation provides a get-info function. The get-info function will be applied to the source input stream and location information, in case query results should depend on the content of the module after the language name (which is not the case for literal). The result of get-info is a function of two arguments. The first argument is always a symbol, indicating the kind of information that a tool requests from the language; the second argument is the default result to be returned if the language does not recognize the query or has no information for it.
After DrRacket obtains the result of get-info for a language, it calls the function with a 'color-lexer query; the result should be a function that implements syntax-coloring parsing on an input stream. For literal, the syntax-color/default-lexer module provides a default-lexer syntax-coloring parser that is suitable for plain text, so literal loads and returns that parser in response to a 'color-lexer query.
The set of symbols that a programming tool uses for queries is entirely between the tool and the languages that choose to cooperate with it. For example, in addition to 'color-lexer, DrRacket uses a 'drracket:toolbar-buttons query to determine which buttons should be available in the toolbar to operate on modules using the language.
The syntax/module-reader language lets you specify get-info handling through a #:info optional specification. The protocol for an #:info function is slightly different from the raw get-info protocol; the revised protocol allows syntax/module-reader the possibility of handling future language-information queries automatically.
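The query protocol itself is just a function of a key and a default, so it can be sketched without DrRacket. In the following illustration the symbol 'plain-text-lexer is a stand-in for a real lexer function, and the arguments passed to get-info are placeholders; a tool would supply the actual input port and source location.

```racket
#lang racket/base

;; A get-info function in the raw protocol: it receives the input port
;; and location information, and returns a (key default) -> result function.
;; 'plain-text-lexer is a stand-in value, not a real lexer.
(define (get-info in mod line col pos)
  (lambda (key default)
    (case key
      [(color-lexer) 'plain-text-lexer]
      [else default])))

;; Placeholder arguments; this sketch ignores all of them.
(define info (get-info (open-input-string "") #f #f #f #f))

(info 'color-lexer #f)                 ; recognized query
(info 'drracket:toolbar-buttons '())   ; unrecognized: default comes back
```

Returning the caller's default for unknown keys is what lets tools introduce new queries without breaking languages that predate them.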
17.3.6 Module-Handling Configuration
Suppose that the file "death-list-5.rkt" contains
"death-list-5.rkt"
#lang racket
(list "O-Ren Ishii"
      "Vernita Green"
      "Budd"
      "Elle Driver"
      "Bill")
If you require "death-list-5.rkt" directly, then it prints the list in the usual Racket result format:
> (require "death-list-5.rkt")
'("O-Ren Ishii" "Vernita Green" "Budd" "Elle Driver" "Bill")
However, if "death-list-5.rkt" is required by a "kiddo.rkt" that is implemented with scheme instead of racket:
"kiddo.rkt"
#lang scheme
(require "death-list-5.rkt")
then, if you run the "kiddo.rkt" file in DrRacket or if you run it directly with racket, "kiddo.rkt" causes "death-list-5.rkt" to print its list in traditional Scheme format, without the leading quote:
("O-Ren Ishii" "Vernita Green" "Budd" "Elle Driver" "Bill")
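The two output styles can be reproduced in one program by adjusting a printing parameter. This sketch is only meant to show the visible difference; it uses print-as-expression, and it is not a claim about how the scheme language's run-time configuration is actually implemented. The shortened list and the helper printed-form are invented for the example.

```racket
#lang racket/base
(require racket/port)

;; A shortened list, for illustration.
(define names '("O-Ren Ishii" "Bill"))

;; Capture what `print` writes for a value as a string.
(define (printed-form v)
  (with-output-to-string (lambda () (print v))))

;; Default Racket style: printed as an expression, with a leading quote.
(define racket-style (printed-form names))

;; With expression-style printing turned off, `print` behaves like `write`.
(define write-style
  (parameterize ([print-as-expression #f])
    (printed-form names)))

(display racket-style) (newline)
(display write-style) (newline)
```

The point of the section is that such settings are chosen by the language of the main module, which is why the same required module can print differently in different programs.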
.... (the main installation or the user's space)
 |- "collects"
      |- "literal"
           |- "lang"
           |    |- "reader.rkt"
           |- "language-info.rkt" (new)
           |- "runtime-config.rkt" (new)
           |- "show.rkt" (new)
The "literal/language-info.rkt" module provides reflective information about the language of modules written in the literal language. The name of this module is not special; it will be connected to the literal language through a change to "literal/lang/reader.rkt".
The "literal/runtime-config.rkt" module will be identified by "literal/language-info.rkt" as the run-time configuration code for a main module that uses the literal language.
The "literal/show.rkt" module will provide a show function to be applied to the string content of a literal module. The run-time configuration action in "literal/runtime-config.rkt" will instruct show to print the strings that it is given, but only when a module using the literal language is run directly.
Multiple modules are needed to implement the printing change, because the different modules must run at different times. For example, the code needed to parse a literal module is not needed after the module has been compiled, while the run-time configuration code is needed only when the module is run as the main module of a program. Similarly, when creating a stand-alone executable with raco exe, the main module (in compiled form) must be queried for its run-time configuration, but the module and its configuration action should not run until the executable is started. By using different modules for these different tasks, we avoid loading code at times when it is not needed.
The three new files are connected to the literal language by changes to "literal/lang/reader.rkt":
• The module form generated by the read-syntax function must import the literal/show module and call its show function.
• The module form must be annotated with a 'module-language syntax property, whose value points to a get-language-info function exported by a literal/language-info module. The get-language-info function will be responsible for reporting the literal/runtime-config as the run-time configuration action of the language.
The 'module-language syntax property value is a vector that contains a module (in this case literal/language-info), a symbol for one of the module's exports (get-language-info in this case), and a data value (which is not needed in this case). The data component allows information to be propagated from the source to the module's language information.
These changes are implemented in the following revised "literal/lang/reader.rkt":
"literal/lang/reader.rkt"
#lang racket
(require syntax/strip-context)

(provide (rename-out [literal-read read]
                     [literal-read-syntax read-syntax])
         get-info)

(define (literal-read in)
  (syntax->datum
   (literal-read-syntax #f in)))

(define (literal-read-syntax src in)
  (with-syntax ([str (port->string in)])
    (syntax-property
     (strip-context
      #'(module anything racket
          (require literal/show)
          (provide data)
          (define data 'str)
          (show data)))
     'module-language
     '#(literal/language-info get-language-info #f))))

(define (get-info in mod line col pos)
  (lambda (key default)
    (case key
      [(color-lexer)
       (dynamic-require 'syntax-color/default-lexer
                        'default-lexer)]
      [else default])))
When a module form with a 'module-language property is compiled, the property value is preserved with the compiled module, and it is accessible via reflective functions like module->language-info. When racket or DrRacket runs a module, it uses module->language-info to obtain a vector that contains a module name, export name, and data value. The result of the function applied to the data should be another function that answers queries, much like the get-info function in a language reader.
For literal, "literal/language-info.rkt" is implemented as:
"literal/language-info.rkt"
#lang racket

(provide get-language-info)

(define (get-language-info data)
  (lambda (key default)
    (case key
      [(configure-runtime)
       '(#(literal/runtime-config configure #f))]
      [else default])))
The function returned by get-language-info answers a 'configure-runtime query with a list of yet more vectors, where each vector contains a module name, an exported name, and a data value. For the literal language, the run-time configuration action implemented in "literal/runtime-config.rkt" is to enable printing of strings that are sent to show:
"literal/runtime-config.rkt"
#lang racket
(require "show.rkt")

(provide configure)

(define (configure data)
  (show-enabled #t))
Finally, the "literal/show.rkt" module must provide the show-enabled parameter and show function:
"literal/show.rkt"
#lang racket

(provide show show-enabled)

(define show-enabled (make-parameter #f))

(define (show v)
  (when (show-enabled)
    (display v)))
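The gating behavior of show can be checked on its own. The sketch below copies the two definitions and captures output with with-output-to-string; it enables printing with parameterize purely for demonstration, whereas the configure function sets the parameter directly with (show-enabled #t). The names quiet and loud are invented for the example.

```racket
#lang racket/base
(require racket/port)

;; Same definitions as in the show module.
(define show-enabled (make-parameter #f))

(define (show v)
  (when (show-enabled)
    (display v)))

;; With the default setting, `show` prints nothing.
(define quiet
  (with-output-to-string (lambda () (show "hello"))))

;; With the parameter enabled, `show` displays its argument.
(define loud
  (with-output-to-string
    (lambda ()
      (parameterize ([show-enabled #t])
        (show "hello")))))
```

Here quiet is the empty string and loud is "hello", which mirrors the difference between requiring a literal module and running it as the main module.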
With all of the pieces for literal in place, try running the following variant of "tuvalu.rkt" directly and through a require from another module:
"tuvalu.rkt"
When using syntax/module-reader to implement a language, specify a module's language information through the #:language-info optional specification. The value provided through #:language-info is attached to a module form directly as a syntax property.
18 Performance
Alan Perlis famously quipped "Lisp programmers know the value of everything and the cost of nothing." A Racket programmer knows, for example, that a lambda anywhere in a program produces a value that is closed over its lexical environment, but how much does allocating that value cost? While most programmers have a reasonable grasp of the cost of various operations and data structures at the machine level, the gap between the Racket language model and the underlying computing machinery can be quite large.
In this chapter, we narrow the gap by explaining details of the Racket compiler and run-time system and how they affect the run-time and memory performance of Racket code.
18.1 Performance in DrRacket
By default, DrRacket instruments programs for debugging, and debugging instrumentation can significantly degrade performance for some programs. Even when debugging is disabled through the Choose Language... dialog's Show Details panel, the Preserve stacktrace checkbox is clicked by default, which also affects performance. Disabling debugging and stacktrace preservation provides performance results that are more consistent with running in plain racket.
Even so, DrRacket and programs developed within DrRacket use the same Racket virtual machine, so garbage collection times (see 18.9 Memory Management) may be longer in DrRacket than when a program is run by itself, and DrRacket threads may impede execution of program threads. For the most reliable timing results for a program, run in plain racket instead of in the DrRacket development environment. Non-interactive mode should be used instead of the REPL to benefit from the module system. See 18.3 Modules and Performance for details.
18.2 The Bytecode and Just-in-Time (JIT) Compilers
Every definition or expression to be evaluated by Racket is compiled to an internal bytecode format. In interactive mode, this compilation occurs automatically and on-the-fly. Tools like raco make and raco setup marshal compiled bytecode to a file, so that you do not have to compile from source every time that you run a program. (Most of the time required to compile a file is actually in macro expansion; generating bytecode from fully expanded code is relatively fast.) See 22.1.1 Compilation and Configuration: raco for more information on generating bytecode files.
The bytecode compiler applies all standard optimizations, such as constant propagation, constant folding, inlining, and dead-code elimination. For example, in an environment where + has its usual binding, the expression
(let ([x 1]
      [y (lambda () 4)])
  (+ 1 (y)))
is compiled the same as the constant 5.
On some platforms, bytecode is further compiled to native code via a just-in-time or JIT compiler. The JIT compiler substantially speeds programs that execute tight loops, arithmetic on small integers, and arithmetic on inexact real numbers. Currently, JIT compilation is supported for x86, x86_64 (a.k.a. AMD64), and 32-bit PowerPC processors. The JIT compiler can be disabled via the eval-jit-enabled parameter or the --no-jit/-j command-line flag for racket.
The JIT compiler works incrementally as functions are applied, but the JIT compiler makes only limited use of run-time information when compiling procedures, since the code for a given module body or lambda abstraction is compiled only once. The JIT's granularity of compilation is a single procedure body, not counting the bodies of any lexically nested procedures. The overhead for JIT compilation is normally so small that it is difficult to detect.
18.3 Modules and Performance
The module system aids optimization by helping to ensure that identifiers have the usual bindings. That is, the + provided by racket/base can be recognized by the compiler and inlined, which is especially important for JIT-compiled code. In contrast, in a traditional interactive Scheme system, the top-level + binding might be redefined, so the compiler cannot assume a fixed + binding (unless special flags or declarations are used to compensate for the lack of a module system).
Even in the top-level environment, importing with require enables some inlining optimizations. Although a + definition at the top level might shadow an imported +, the shadowing definition applies only to expressions evaluated later.
Within a module, inlining and constant-propagation optimizations take additional advantage of the fact that definitions within a module cannot be mutated when no set! is visible at compile time. Such optimizations are unavailable in the top-level environment. Although this optimization within modules is important for performance, it hinders some forms of interactive development and exploration. The compile-enforce-module-constants parameter disables the JIT compiler's assumptions about module definitions when interactive exploration is more important. See 6.6 Assignment and Redefinition for more information.
The compiler may inline functions or propagate constants across module boundaries. To avoid generating too much code in the case of function inlining, the compiler is conservative when choosing candidates for cross-module inlining; see 18.4 Function-Call Optimizations for information on providing inlining hints to the compiler.
The later section 18.6 letrec Performance provides some additional caveats concerning inlining of module bindings.
18.4 Function-Call Optimizations
When the compiler detects a function call to an immediately visible function, it generates more efficient code than for a generic call, especially for tail calls. For example, given the program
(letrec ([odd (lambda (x)
                (if (zero? x)
                    #f
                    (even (sub1 x))))]
         [even (lambda (x)
                 (if (zero? x)
                     #t
                     (odd (sub1 x))))])
  (odd 40000000))
the compiler can detect the odd-even loop and produce code that runs much faster via loop unrolling and related optimizations.
Within a module form, defined variables are lexically scoped like letrec bindings, and definitions within a module therefore permit call optimizations, so
(define (odd x) ....)
(define (even x) ....)
within a module would perform the same as the letrec version.
18.5 Mutation and Performance
Using set! to mutate a variable can lead to bad performance. For example, the microbenchmark
#lang racket/base

(define (subtract-one x)
  (set! x (sub1 x))
  x)

(time
  (let loop ([n 4000000])
    (if (zero? n)
        'done
        (loop (subtract-one n)))))
runs much more slowly than the equivalent
#lang racket/base

(define (subtract-one x)
  (sub1 x))

(time
  (let loop ([n 4000000])
    (if (zero? n)
        'done
        (loop (subtract-one n)))))
In the first variant, a new location is allocated for x on every iteration, leading to poor performance. A more clever compiler could unravel the use of set! in the first example, but since mutation is discouraged (see 4.9.1 Guidelines for Using Assignment), the compiler's effort is spent elsewhere.
More significantly, mutation can obscure bindings where inlining and constant-propagation might otherwise apply. For example, in
(let ([minus1 #f])
  (set! minus1 sub1)
  (let loop ([n 4000000])
    (if (zero? n)
        'done
        (loop (minus1 n)))))
the set! obscures the fact that minus1 is just another name for the built-in sub1.
18.6 letrec Performance
When letrec is used to bind only procedures and literals, then the compiler can treat the bindings in an optimal manner, compiling uses of the bindings efficiently. When other kinds of bindings are mixed with procedures, the compiler may be less able to determine the control flow. For example,
(letrec ([loop (lambda (x)
                 (if (zero? x)
                     'done
                     (loop (next x))))]
         [junk (display loop)]
         [next (lambda (x) (sub1 x))])
  (loop 40000000))
likely compiles to less efficient code than
(letrec ([loop (lambda (x)
                 (if (zero? x)
                     'done
                     (loop (next x))))]
         [next (lambda (x) (sub1 x))])
  (loop 40000000))
In the first case, the compiler likely does not know that display does not call loop. If it did, then loop might refer to next before the binding is available.
This caveat about letrec also applies to definitions of functions and constants as internal definitions or in modules. A definition sequence in a module body is analogous to a sequence of letrec bindings, and non-constant expressions in a module body can interfere with the optimization of references to later bindings.
18.7 Fixnum and Flonum Optimizations
A fixnum is a small exact integer. In this case, "small" depends on the platform. For a 32-bit machine, numbers that can be expressed in 30 bits plus a sign bit are represented as fixnums. On a 64-bit machine, 62 bits plus a sign bit are available.
A flonum is used to represent any inexact real number. They correspond to 64-bit IEEE floating-point numbers on all platforms.
Inlined fixnum and flonum arithmetic operations are among the most important advantages of the JIT compiler. For example, when + is applied to two arguments, the generated machine code tests whether the two arguments are fixnums, and if so, it uses the machine's instruction to add the numbers (and check for overflow). If the two numbers are not fixnums, then it checks whether both are flonums; in that case, the machine's floating-point operations are used directly. For functions that take any number of arguments, such as +, inlining works for two or more arguments (except for -, whose one-argument case is also inlined) when the arguments are either all fixnums or all flonums.
Flonums are typically boxed, which means that memory is allocated to hold every result of a flonum computation. Fortunately, the generational garbage collector (described later in 18.9 Memory Management) makes allocation for short-lived results reasonably cheap. Fixnums, in contrast, are never boxed, so they are typically cheap to use.
The racket/flonum library provides flonum-specific operations, and combinations of flonum operations allow the JIT compiler to generate code that avoids boxing and unboxing intermediate results. Besides results within immediate combinations, flonum-specific results that are bound with let and consumed by a later flonum-specific operation are unboxed within temporary storage. Finally, the compiler can detect some flonum-valued loop accumulators and avoid boxing of the accumulator. The bytecode decompiler (see 9 raco decompile: Decompiling Bytecode) annotates combinations where the JIT can avoid boxes with #%flonum, #%as-flonum, and #%from-flonum.
The racket/unsafe/ops library provides unchecked fixnum- and flonum-specific operations. Unchecked flonum-specific operations allow unboxing, and sometimes they allow the compiler to reorder expressions to improve performance. See also 18.8 Unchecked, Unsafe Operations, especially the warnings about unsafety.
See §18.10 "Parallelism with Futures" for an example use of flonum-specific operations.
Unboxing of local bindings and accumulators is not supported by the JIT for PowerPC.
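As a rough illustration of the flonum-specific operations described above, the following sketch computes a Euclidean norm using only racket/flonum operations, so that the intermediate products and the loop accumulator are candidates for unboxing. The helper name fl-norm is hypothetical, and whether the JIT actually avoids boxing here depends on the platform and the Racket version.

```racket
#lang racket/base
(require racket/flonum)

;; Hypothetical helper: Euclidean norm of a list of flonums.
;; Every arithmetic step uses a flonum-specific operation, and
;; acc is a flonum-valued loop accumulator.
(define (fl-norm xs)
  (let loop ([xs xs] [acc 0.0])
    (if (null? xs)
        (flsqrt acc)
        (let ([x (car xs)])
          (loop (cdr xs) (fl+ acc (fl* x x)))))))

(fl-norm (list 3.0 4.0)) ; => 5.0
```

The flonum-specific operations raise an error when given a non-flonum, so exact inputs would first need conversion with ->fl.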
18.8
Unchecked, Unsafe Operations
The racket/unsafe/ops library provides functions that are like other functions in racket/base, but they assume (instead of checking) that provided arguments are of the right type. For example, unsafe-vector-ref accesses an element from a vector without checking that its first argument is actually a vector and without checking that the given index is in bounds. For tight loops that use these functions, avoiding checks can sometimes speed the computation, though the benefits vary for different unchecked functions and different contexts.

Beware that, as "unsafe" in the library and function names suggest, misusing the exports of racket/unsafe/ops can lead to crashes or memory corruption.
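One way to limit the risk is to check the assumptions once, outside the loop, and use unchecked operations only inside it. The following is a minimal sketch of that pattern; the helper name vector-sum/unchecked is hypothetical, and any speedup over the checked versions is not guaranteed.

```racket
#lang racket/base
(require racket/unsafe/ops)

;; Hypothetical helper: sums the elements of a vector of numbers.
;; The vector? check happens once, up front. Inside the loop, i is
;; always a fixnum in [0, len), so the unchecked index operations
;; are used only where their assumptions hold.
(define (vector-sum/unchecked v)
  (unless (vector? v)
    (error 'vector-sum/unchecked "expected a vector, got ~e" v))
  (let ([len (vector-length v)])
    (let loop ([i 0] [acc 0])
      (if (unsafe-fx= i len)
          acc
          (loop (unsafe-fx+ i 1)
                (+ acc (unsafe-vector-ref v i)))))))

(vector-sum/unchecked (vector 1 2 3 4)) ; => 10
```

The accumulation itself uses the ordinary checked +, since the elements are not known to be fixnums.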
18.9
Memory Management
The Racket implementation is available in two variants: 3m and CGC. The 3m variant uses a modern, generational garbage collector that makes allocation relatively cheap for short-lived objects. The CGC variant uses a conservative garbage collector which facilitates interaction with C code at the expense of both precision and speed for Racket memory management. The 3m variant is the standard one.

Although memory allocation is reasonably cheap, avoiding allocation altogether is normally faster. One particular place where allocation can be avoided sometimes is in closures, which are the run-time representation of functions that contain free variables. For example,
    (let loop ([n 40000000] [prev-thunk (lambda () #f)])
      (if (zero? n)
          (prev-thunk)
          (loop (sub1 n)
                (lambda () n))))
allocates a closure on every iteration, since (lambda () n) effectively saves n. The compiler can eliminate many closures automatically. For example, in
    (let loop ([n 40000000] [prev-val #f])
      (let ([prev-thunk (lambda () n)])
        (if (zero? n)
            prev-val
            (loop (sub1 n) (prev-thunk)))))
no closure is ever allocated for prev-thunk, because its only application is visible, and so it is inlined. Similarly, in
    (let n-loop ([n 400000])
      (if (zero? n)
          'done
          (let m-loop ([m 100])
            (if (zero? m)
                (n-loop (sub1 n))
                (m-loop (sub1 m))))))
then the expansion of the let form to implement m-loop involves a closure over n, but the compiler automatically converts the closure to pass itself n as an argument instead.
18.10
Parallelism with Futures
The racket/future library provides support for performance improvement through parallelism with the future and touch functions. The level of parallelism available from those constructs, however, is limited by several factors, and the current implementation is best suited to numerical tasks.
Other functions, such as thread, support the creation of reliably concurrent tasks. However, threads never run truly in parallel, even if the hardware and operating system support parallelism.
As a starting example, the any-double? function below takes a list of numbers and determines whether any number in the list has a double that is also in the list:
    (define (any-double? l)
      (for/or ([i (in-list l)])
        (for/or ([i2 (in-list l)])
          (= i2 (* 2 i)))))
This function runs in quadratic time, so it can take a long time (on the order of a second) on large lists like l1 and l2:
    (define l1 (for/list ([i (in-range 5000)])
                 (+ (* 2 i) 1)))
    (define l2 (for/list ([i (in-range 5000)])
                 (- (* 2 i) 1)))
    (or (any-double? l1)
        (any-double? l2))
The best way to speed up any-double? is to use a different algorithm. However, on a machine that offers at least two processing units, the example above can run in about half the time using future and touch:
    (let ([f (future (lambda () (any-double? l2)))])
      (or (any-double? l1)
          (touch f)))
The future f runs (any-double? l2) in parallel to (any-double? l1), and the result for (any-double? l2) becomes available about the same time that it is demanded by (touch f).

Futures run in parallel as long as they can do so safely, but the notion of "safe for parallelism" is inherently tied to the system implementation. The distinction between "safe" and "unsafe" operations may be far from apparent at the level of a Racket program.

Consider the following core of a Mandelbrot-set computation:
    (define (mandelbrot iterations x y n)
      (let ([ci (- (/ (* 2.0 y) n) 1.0)]
            [cr (- (/ (* 2.0 x) n) 1.5)])
        (let loop ([i 0] [zr 0.0] [zi 0.0])
          (if (> i iterations)
              i
              (let ([zrq (* zr zr)]
                    [ziq (* zi zi)])
                (cond
                  [(> (+ zrq ziq) 4.0) i]
                  [else (loop (add1 i)
                              (+ (- zrq ziq) cr)
                              (+ (* 2.0 zr zi) ci))]))))))
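The expressions (mandelbrot 10000000 62 500 1000) and (mandelbrot 10000000 62 501 1000) each take a while to produce an answer. Computing them both, of course, takes twice as long. Unfortunately, attempting to run the two computations in parallel with future does not improve performance: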
    (let ([f (future (lambda () (mandelbrot 10000000 62 501 1000)))])
      (list (mandelbrot 10000000 62 500 1000)
            (touch f)))
One problem is that the * and / operations in the first two lines of mandelbrot involve a mixture of exact and inexact real numbers. Such mixtures typically trigger a slow path in execution, and the general slow path is not safe for parallelism. Consequently, the future created in this example is almost immediately suspended, and it cannot resume until touch is called.

Changing the first two lines of mandelbrot addresses that first problem:
    (define (mandelbrot iterations x y n)
      (let ([ci (- (/ (* 2.0 (->fl y)) (->fl n)) 1.0)]
            [cr (- (/ (* 2.0 (->fl x)) (->fl n)) 1.5)])
        ....))
With that change, mandelbrot computations can run in parallel. Nevertheless, performance still does not improve. The problem is that almost every arithmetic operation in this example produces an inexact number whose storage must be allocated. Especially frequent allocation triggers communication between parallel tasks that defeats any performance improvement.

By using flonum-specific operations (see §18.7 "Fixnum and Flonum Optimizations"), we can rewrite mandelbrot to use much less allocation:
    (define (mandelbrot iterations x y n)
      (let ([ci (fl- (fl/ (* 2.0 (->fl y)) (->fl n)) 1.0)]
            [cr (fl- (fl/ (* 2.0 (->fl x)) (->fl n)) 1.5)])
        (let loop ([i 0] [zr 0.0] [zi 0.0])
          (if (> i iterations)
              i
              (let ([zrq (fl* zr zr)]
                    [ziq (fl* zi zi)])
                (cond
                  [(fl> (fl+ zrq ziq) 4.0) i]
                  [else (loop (add1 i)
                              (fl+ (fl- zrq ziq) cr)
                              (fl+ (fl* 2.0 (fl* zr zi)) ci))]))))))
This conversion can speed mandelbrot by a factor of 8, even in sequential mode, but avoiding allocation also allows mandelbrot to run usefully faster in parallel.

As a general guideline, any operation that is inlined by the JIT compiler runs safely in parallel, while other operations that are not inlined (including all operations if the JIT compiler is disabled) are considered unsafe. The raco decompile tool annotates operations that can be inlined by the compiler (see §9 "raco decompile: Decompiling Bytecode"), so the decompiler can be used to help predict parallel performance.

To more directly report what is happening in a program that uses future and touch, operations are logged when they suspend a computation or synchronize with the main computation. For example, running the original mandelbrot in a future produces the following output in the 'debug log level:
    future 1, process 1: BLOCKING on process 0; time: ....
    ....
    future 1, process 0: HANDLING: *; time: ....
The messages indicate which internal future-running task became blocked on an unsafe operation, the time it blocked (in terms of current-inexact-milliseconds), and the operation that caused the computation to block. The first revision to mandelbrot avoids suspending at *, but produces many log entries of the form
Setting the PLTSTDERR environment variable to debug makes these log messages appear on standard error.
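Pulling the guidelines of this section together, the following sketch starts one future per input and then touches each one to collect the results. The function names fl-sum-below and parallel-sums are hypothetical; the workload uses only flonum-specific operations, in the style of the revised mandelbrot, on the assumption that these are inlined by the JIT and therefore do not block the futures.

```racket
#lang racket/base
(require racket/future racket/flonum)

;; Hypothetical workload: adds up 0.0, 1.0, 2.0, ... for all values
;; below n, using only flonum-specific operations.
(define (fl-sum-below n)
  (let loop ([i 0.0] [acc 0.0])
    (if (fl>= i n)
        acc
        (loop (fl+ i 1.0) (fl+ acc i)))))

;; Starts one future per input, then touches each future in order
;; to collect the results as a list.
(define (parallel-sums ns)
  (define fs
    (for/list ([n (in-list ns)])
      (future (lambda () (fl-sum-below n)))))
  (for/list ([f (in-list fs)])
    (touch f)))

(parallel-sums (list 4.0 5.0)) ; => '(6.0 10.0)
```

The results are the same whether or not the futures actually run in parallel; only the running time differs.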
18.11
Parallelism with Places
The racket/place library provides support for performance improvement through parallelism with the place form. The place form creates a place, which is effectively a new Racket instance that can run in parallel to other places, including the initial place. The full power of the Racket language is available at each place, but places can communicate only through message passing (using the place-channel-put and place-channel-get functions on a limited set of values), which helps ensure the safety and independence of parallel computations.

As a starting example, the Racket program below uses a place to determine whether any number in the list has a double that is also in the list:
    #lang racket

    (provide main)

    (define (any-double? l)
      (for/or ([i (in-list l)])
        (for/or ([i2 (in-list l)])
          (= i2 (* 2 i)))))

    (define (main)
      (define p
        (place ch
          (define l (place-channel-get ch))
          (define l-double? (any-double? l))
          (place-channel-put ch l-double?)))

      (place-channel-put p (list 1 2 4 8))

      (place-channel-get p))
The identifier ch after place is bound to a place channel. The remaining body expressions within the place form are evaluated in a new place, and the body expressions use ch to communicate with the place that spawned the new place.

In the body of the place form above, the new place receives a list of numbers over ch and binds the list to l. It then calls any-double? on the list and binds the result to l-double?. The final body expression sends the l-double? result back to the original place over ch.

In DrRacket, after saving and running the above program, evaluate (main) in the interactions window to create the new place. Alternatively, save the program as "double.rkt" and run from a command line with

    racket -tm double.rkt

where the -t flag tells racket to load the double.rkt module, the -m flag calls the exported main function, and -tm combines the two flags.
When using places inside DrRacket, the module containing place code must be saved to a file before it will execute.
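The place form has two subtle features. First, it lifts the place body to an anonymous, module-level function. This lifting means that any binding referenced by the place body must be available in the module's top level. Second, the place form dynamic-requires the enclosing module in a newly created place. As part of the dynamic-require, the current module body is evaluated in the new place. The consequence of this second feature is that place should not appear immediately in a module or in a function that is called in a module's top level; otherwise, invoking the module will invoke the same module in a new place, and so on, triggering a cascade of place creations that will soon exhaust memory.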
    #lang racket

    (provide main)

    ; Don't do this!
    (define p (place ch (place-channel-get ch)))

    (define (indirect-place-invocation)
      (define p2 (place ch (place-channel-get ch))))

    ; Don't do this, either!
    (indirect-place-invocation)
18.12
Distributed Places
The racket/place/distributed library provides support for distributed programming.

The example below demonstrates how to launch a remote Racket node instance, launch remote places on the new remote node instance, and start an event loop that monitors the remote node instance.

The example code can also be found in "racket/distributed/examples/named/master.rkt".

The spawn-remote-racket-node primitive connects to "localhost" and starts a racloud node there that listens on port 6344 for further instructions. The handle to the new racloud node is assigned to the remote-node variable. Localhost is used so that the example can be run using only a single machine. However, localhost can be replaced by any host with ssh public-key access and racket. The supervise-named-dynamic-place-at form creates a new place on the remote-node. The new place will be identified in the future by its name symbol 'tuple-server. A place descriptor is expected to be returned by dynamically requiring 'make-tuple-server from the tuple-path module and invoking 'make-tuple-server.

The code for the tuple-server place exists in the file "tuple.rkt". The "tuple.rkt" file contains the use of the define-named-remote-server form, which defines an RPC server suitable for invocation by supervise-named-dynamic-place-at.

The define-named-remote-server form takes an identifier and a list of custom expressions as its arguments. From the identifier, a place-thunk function is created by prepending the make- prefix; in this case, make-tuple-server. The make-tuple-server identifier is the "compute-instance-place-function-name" given to the supervise-named-dynamic-place-at form above.

    #lang racket/base
    (require racket/place/distributed
             racket/class
             racket/place
             racket/runtime-path
             "bank.rkt"
             "tuple.rkt")
    (define-runtime-path bank-path "bank.rkt")
    (define-runtime-path tuple-path "tuple.rkt")

    (provide main)

    (define (main)
      (define remote-node (spawn-remote-racket-node
                            "localhost"
                            #:listen-port 6344))
      (define tuple-place (supervise-named-dynamic-place-at
                            remote-node
                            'tuple-server
                            tuple-path
                            'make-tuple-server))
      (define bank-place (supervise-dynamic-place-at
                           remote-node bank-path
                           'make-bank))

      (message-router
        remote-node
        (after-seconds 4
          (displayln (bank-new-account bank-place 'user0))
          (displayln (bank-add bank-place 'user0 10))
          (displayln (bank-removeM bank-place 'user0 5)))

        (after-seconds 2
          (define c (connect-to-named-place remote-node
                                            'tuple-server))
          (define d (connect-to-named-place remote-node
                                            'tuple-server))
          (tuple-server-hello c)
          (tuple-server-hello d)
          (displayln (tuple-server-set c "user0" 100))
          (displayln (tuple-server-set d "user2" 200))
          (displayln (tuple-server-get c "user0"))
          (displayln (tuple-server-get d "user2"))
          (displayln (tuple-server-get d "user0"))
          (displayln (tuple-server-get c "user2")))
        (after-seconds 8
          (node-send-exit remote-node))
        (after-seconds 10
          (exit 0))))

Figure 1: examples/named/master.rkt

    #lang racket/base
    (require racket/match
             racket/place/define-remote-server)

    (define-named-remote-server tuple-server
      (define-state h (make-hash))
      (define-rpc (set k v)
        (hash-set! h k v)
        v)
      (define-rpc (get k)
        (hash-ref h k #f))
      (define-cast (hello)
        (printf "Hello from define-cast\n")
        (flush-output)))

Figure 2: examples/named/tuple.rkt

The define-state custom form translates into a simple define form, which is closed over by define-rpc forms.

The define-rpc form is expanded into two parts. The first part is the client stub that calls the RPC function. The client function name is formed by concatenating the define-named-remote-server identifier, tuple-server, with the RPC function name set to form tuple-server-set. The RPC client functions take a destination argument, which is a remote-connection% descriptor, and then the RPC function arguments. The RPC client function sends the RPC function name, set, and the RPC arguments to the destination by calling an internal function named-place-channel-put. The RPC client then calls named-place-channel-get to wait for the RPC response.

The second expansion part of define-rpc is the server implementation of the RPC call. The server is implemented by a match expression inside the make-tuple-server function. The match clause for tuple-server-set matches on messages beginning with the 'set symbol. The server executes the RPC call with the communicated arguments and sends the result back to the RPC client.

The define-cast form is similar to the define-rpc form, except there is no reply message from the server to the client.
    '(begin
       (require racket/place racket/match)
       (define/provide (tuple-server-set dest k v)
         (named-place-channel-put dest (list 'set k v))
         (named-place-channel-get dest))
       (define/provide (tuple-server-get dest k)
         (named-place-channel-put dest (list 'get k))
         (named-place-channel-get dest))
       (define/provide (tuple-server-hello dest)
         (named-place-channel-put dest (list 'hello)))
       (define/provide (make-tuple-server)
         (place ch
           (let ()
             (define h (make-hash))
             (let loop ()
               (define msg (place-channel-get ch))
               (define (log-to-parent-real msg #:severity (severity 'info))
                 (place-channel-put ch (log-message severity msg)))
               (syntax-parameterize
                 ((log-to-parent (make-rename-transformer #'log-to-parent-real)))
                 (match msg
                   ((list (list 'set k v) src)
                    (define result (let () (hash-set! h k v) v))
                    (place-channel-put src result)
                    (loop))
                   ((list (list 'get k) src)
                    (define result (let () (hash-ref h k #f)))
                    (place-channel-put src result)
                    (loop))
                   ((list (list 'hello) src)
                    (define result
                      (let ()
                        (printf "Hello from define-cast\n")
                        (flush-output)))
                    (loop))))
               loop))))
       (void))

Figure 3: Expansion of define-named-remote-server
19
Running and Creating Executables
While developing programs, many Racket programmers use the DrRacket programming environment. To run a program without the development environment, use racket (for console-based programs) or gracket (for GUI programs). This chapter mainly explains how to run racket and gracket.
19.1
Running racket and gracket
The gracket executable is the same as racket, but with small adjustments to behave as a GUI application rather than a console application. For example, gracket by default runs in interactive mode with a GUI window instead of a console prompt. GUI applications can be run with plain racket, however. Depending on command-line arguments, racket or gracket runs in interactive mode, module mode, or load mode.
19.1.1
Interactive Mode
When racket is run with no command-line arguments (other than configuration options, like -j), then it starts a REPL with a > prompt:
For enhancing your REPL experience, see xrepl; for information on GNU Readline support, see readline.
racket -l racket/base -i
starts a REPL using a much smaller initial language (that loads much faster). Beware that most modules do not provide the basic syntax of Racket, including function-call syntax and require. For example,
racket -l racket/date -i
produces a REPL that fails for every expression, because racket/date provides only a few functions, and not the #%top-interaction and #%app bindings that are needed to evaluate top-level function calls in the REPL.

If a module-requiring flag appears after -i/--repl instead of before it, then the module is required after racket/init to augment the initial environment. For example,
racket -i -l racket/date
starts a useful REPL with racket/date available in addition to the exports of racket.
19.1.2
Module Mode
If a file argument is supplied to racket before any command-line switch (other than configuration options), then the file is required as a module, and (unless -i/--repl is specified), no REPL is started. For example,
racket hello.rkt
requires the "hello.rkt" module and then exits. Any argument after the file name, flag or otherwise, is preserved as a command-line argument for use by the required module via current-command-line-arguments.

If command-line flags are used, then the -u or --require-script flag can be used to explicitly require a file as a module. The -t or --require flag is similar, except that additional command-line flags are processed by racket, instead of preserved for the required module. For example,
racket -l raco
is the same as running the raco executable with no arguments, since the raco module is the executable's main module.
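To illustrate how a required module sees the arguments that racket preserves for it, here is a minimal sketch of what a "hello.rkt" module might contain. The module body and the greeting function are hypothetical; only current-command-line-arguments comes from the description above.

```racket
#lang racket/base

;; Hypothetical contents for "hello.rkt": greets the first
;; command-line argument, or the world when none is given.
;; args is a vector of strings, as returned by
;; current-command-line-arguments.
(define (greeting args)
  (if (zero? (vector-length args))
      "Hello, world!"
      (format "Hello, ~a!" (vector-ref args 0))))

(displayln (greeting (current-command-line-arguments)))
```

Under these assumptions, running racket hello.rkt Racket prints Hello, Racket!, because the argument after the file name is passed through to the module rather than parsed by racket.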
Note that if you wanted to pass command-line flags to raco above, you would need to protect the flags with a --, so that racket doesn't try to parse them itself:
19.1.3
Load Mode
The -f or --load flag supports loading top-level expressions in a file directly, as opposed to expressions within a module file. This evaluation is like starting a REPL and typing the expressions directly, except that the results are not printed. For example,
    racket -f hi.rkts

loads "hi.rkts" and exits. Note that load mode is generally a bad idea, for the reasons explained in §1.4 "A Note to Readers with Lisp/Scheme Experience"; using module mode is typically better.
The -e or --eval flag accepts an expression to evaluate directly. Unlike file loading, the result of the expression is printed, as in a REPL. For example,
racket -e '(current-seconds)'
prints the number of seconds since January 1, 1970.

For file loading and expression evaluation, the top-level environment is created in the same way as for interactive mode: racket/init is required unless another module is specified first. For example,
19.2
Scripts
Racket files can be turned into executable scripts on Unix and Mac OS X. On Windows, a compatibility layer like Cygwin supports the same kind of scripts, or scripts can be implemented as batch files.
19.2.1
Unix Scripts
In a Unix environment (including Linux and Mac OS X), a Racket file can be turned into an executable script using the shell's #! convention. The first two characters of the file must be #!; the next character must be either a space or /, and the remainder of the first line must be a command to execute the script. For some platforms, the total length of the first line is restricted to 32 characters, and sometimes the space is required.
The simplest script format uses an absolute path to a racket executable followed by a module declaration. For example, if racket is installed in "/usr/local/bin", then a file containing the following text acts as a "hello world" script:
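A minimal sketch of such a script, assuming racket is installed in "/usr/local/bin" as stated above, could look like this. The name message is illustrative only.

```racket
#! /usr/local/bin/racket
#lang racket/base

;; The #! line is a comment to Racket, so the rest of the file is an
;; ordinary module. The name message is chosen for illustration.
(define message "Hello, world!")
(displayln message)
```

After the file is made executable (for example, with chmod +x), it can be run directly from a shell.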
Use #lang racket/base instead of #lang racket to produce scripts with a faster startup time.
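    #! /usr/bin/env racket
    #lang racket

    ; verbose? is a parameter that the -v flag switches to #t
    (define verbose? (make-parameter #f))

    ; command-line parses the flags and returns the one required argument
    (define greeting
      (command-line
       #:once-each
       [("-v") "Verbose mode" (verbose? #t)]
       #:args
       (str) str))

    (printf "~a~a\n"
            greeting
            (if (verbose?) " to you, too!" ""))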
Try running the above script with the --help flag to see what command-line arguments are allowed by the script.

An even more general trampoline uses /bin/sh plus some lines that are comments in one language and expressions in the other. This trampoline is more complicated, but it provides more control over command-line arguments to racket:
    #! /bin/sh
    #|
    exec racket -cu "$0" ${1+"$@"}
    |#
    #lang racket/base
    (printf "This script started slowly, because the use of\n")
    (printf "bytecode files has been disabled via -c.\n")
    (printf "Given arguments: ~s\n"
            (current-command-line-arguments))
Note that #! starts a line comment in Racket, and #|...|# forms a block comment. Meanwhile, # also starts a shell-script comment, while exec racket aborts the shell script to start racket. That way, the script file turns out to be valid input to both /bin/sh and racket.
19.2.2
Windows Batch Files
A similar trick can be used to write Racket code in Windows .bat batch files:
19.3
Creating Stand-Alone Executables
For information on creating and distributing executables, see §3 "raco exe: Creating Stand-Alone Executables" and §4 "raco distribute: Sharing Stand-Alone Executables" in raco: Racket Command-Line Tools.
20
More Libraries
This guide covers only the Racket language and libraries that are documented in The Racket Reference. The Racket distribution includes many additional libraries.
20.1
Graphics and GUIs
Racket provides many libraries for graphics and graphical user interfaces (GUIs):

• The racket/draw library provides basic drawing tools, including drawing contexts such as bitmaps and PostScript files. See The Racket Drawing Toolkit for more information.

• The racket/gui library provides GUI widgets such as windows, buttons, checkboxes, and text fields. The library also includes a sophisticated and extensible text editor. See The Racket Graphical Interface Toolkit for more information.

• The slideshow/pict library provides a more functional abstraction layer over racket/draw. This layer is especially useful for creating slide presentations with Slideshow, but it is also useful for creating images for Scribble documents or other drawing tasks. Pictures created with the slideshow/pict library can be rendered to any drawing context. See Slideshow: Figure and Presentation Tools for more information.

• The 2htdp/image library is similar to slideshow/pict. It is more streamlined for pedagogical use, but also slightly more specific to screen and bitmap drawing. See 2htdp/image for more information.

• The sgl library provides OpenGL for 3-D graphics. The context for rendering OpenGL can be a window or bitmap created with racket/gui. See GL: 3-D Graphics for more information.
20.2
The Web Server
Web Applications in Racket describes the Racket web server, which supports servlets implemented in Racket.
20.3
Using Foreign Libraries
The Racket Foreign Interface describes tools for using Racket to access libraries that are normally used by C programs.
20.4
And More
Racket Documentation lists documentation for many other installed libraries. Run raco docs to find documentation for libraries that are installed on your system and specific to your user account.

PLaneT offers even more downloadable packages contributed by Racketeers.
21
Dialects of Racket and Scheme
We use "Racket" to refer to a specific dialect of the Lisp language, and one that is based on the Scheme branch of the Lisp family. Despite Racket's similarity to Scheme, the #lang prefix on modules is a particular feature of Racket, and programs that start with #lang are unlikely to run in other implementations of Scheme. At the same time, programs that do not start with #lang do not work with the default mode of most Racket tools.

"Racket" is not, however, the only dialect of Lisp that is supported by Racket tools. On the contrary, Racket tools are designed to support multiple dialects of Lisp and even multiple languages, which allows the Racket tool suite to serve multiple communities. Racket also gives programmers and researchers the tools they need to explore and create new languages.
21.1
More Rackets
"Racket" is more of an idea about programming languages than a language in the usual sense. Macros can extend a base language (as described in §16 "Macros"), and alternate parsers can construct an entirely new language from the ground up (as described in §17 "Creating Languages").

The #lang line that starts a Racket module declares the base language of the module. By "Racket," we usually mean #lang followed by the base language racket or racket/base (of which racket is an extension). The Racket distribution provides additional languages, including the following:

• typed/racket: like racket, but statically typed; see The Typed Racket Guide

• lazy: like racket/base, but avoids evaluating an expression until its value is needed; see Lazy Racket

• frtime: changes evaluation in an even more radical way to support reactive programming; see FrTime: A Language for Reactive Programs

• scribble/base: a language, which looks more like Latex than Racket, for writing documentation; see Scribble: The Racket Documentation Tool

Each of these languages is used by starting a module with the language name after #lang. For example, the source of this document starts with #lang scribble/base.

Furthermore, Racket users can define their own languages, as discussed in §17 "Creating Languages". Typically, a language name maps to its implementation through a module path by adding /lang/reader; for example, the language name scribble/base is expanded to scribble/base/lang/reader, which is the module that implements the surface-syntax parser. Some language names act as language loaders; for example, #lang planet planet-path downloads, installs, and uses a language via PLaneT.
21.2
Standards
21.2.1
R5RS
R5RS stands for The Revised^5 Report on the Algorithmic Language Scheme, and it is currently the most widely implemented Scheme standard.

Racket tools in their default modes do not conform to R5RS, mainly because Racket tools generally expect modules, and R5RS does not define a module system. Typical single-file R5RS programs can be converted to Racket programs by prefixing them with #lang r5rs, but other Scheme systems do not recognize #lang r5rs. The plt-r5rs executable (see §2 "plt-r5rs") more directly conforms to the R5RS standard.

Aside from the module system, the syntactic forms and functions of R5RS and Racket differ. Only simple R5RS programs become Racket programs when prefixed with #lang racket, and relatively few Racket programs become R5RS programs when a #lang line is removed. Also, when mixing "R5RS modules" with Racket modules, beware that R5RS pairs correspond to Racket mutable pairs (as constructed with mcons).

See R5RS: Legacy Scheme for more information about running R5RS programs with Racket.
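The distinction between Racket pairs and mutable pairs can be seen directly in racket/base. The following sketch uses only cons, mcons, and their standard accessors and predicates; the variable names are illustrative.

```racket
#lang racket/base

;; Pairs built with cons are immutable in Racket; mcons builds the
;; mutable pairs that correspond to R5RS pairs.
(define p (cons 1 2))
(define mp (mcons 1 2))

(set-mcar! mp 10) ; allowed, because mp is a mutable pair

(mcar mp)   ; => 10
(mpair? mp) ; => #t
(pair? mp)  ; => #f, a mutable pair is not a Racket pair
(pair? p)   ; => #t
```

Because the two kinds of pairs are distinct, a list built from R5RS pairs is not accepted by Racket functions that expect an ordinary list.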
21.2.2
R6RS
R6RS stands for The Revised^6 Report on the Algorithmic Language Scheme, which extends R5RS with a module system that is similar to the Racket module system.

When an R6RS library or top-level program is prefixed with #!r6rs (which is valid R6RS syntax), then it can also be used as a Racket program. This works because #! in Racket is treated as a shorthand for #lang followed by a space, so #!r6rs selects the r6rs module language. As with R5RS, however, beware that the syntactic forms and functions of R6RS differ from Racket, and R6RS pairs are mutable pairs.

See R6RS: Scheme for more information about running R6RS programs with Racket.
21.3
Teaching
The How to Design Programs textbook relies on pedagogic variants of Racket that smooth the introduction of programming concepts for new programmers. The languages are documented in How to Design Programs Languages.
The How to Design Programs languages are typically not used with #lang prefixes, but are instead used within DrRacket by selecting the language from the Choose Language... dialog.
22
Command-Line Tools and Your Editor of Choice
Although DrRacket is the easiest way for most people to start with Racket, many Racketeers prefer command-line tools and other text editors. The Racket distribution includes several command-line tools, and popular editors include or support packages to make them work well with Racket.
22.1
Command-Line Tools
Racket provides, as part of its standard distribution, a number of command-line tools that can make racketeering more pleasant.
22.1.1
Compilation and Configuration: raco
The raco (short for "Racket command") program provides a command-line interface to many additional tools for compiling Racket programs and maintaining a Racket installation.

raco make compiles Racket source to bytecode. For example, if you have a program "take-over-world.rkt" and you'd like to compile it to bytecode, along with all of its dependencies, so that it loads more quickly, then run

    raco make take-over-world.rkt
22.1.2
Interactive evaluation: XREPL
The Racket distribution includes XREPL (eXtended REPL), which provides everything you expect from a modern interactive environment. For example, XREPL provides an ,enter command to have a REPL that runs in the context of a given module, and an ,edit command to invoke your editor (as specified by the EDITOR environment variable) on the file you entered. A ,drracket command makes it easy to use your favorite editor to write code, and still have DrRacket at hand to try things out.

For more information about XREPL, see XREPL: eXtended REPL.
22.1.3
Bash completion
Shell auto-completion for bash is available in "collects/meta/contrib/completion/racket-completion.bash". To enable it, just run the file from your .bashrc.

The "meta" collection is only available in the Racket Full distribution. The completion script is also available online.
22.2
Emacs
Emacs has long been a favorite among Lispers and Schemers, and is popular among Racketeers as well.
22.2.1
Major Modes
Quack is an extension of Emacs's scheme-mode that provides enhanced support for Racket, including highlighting and indentation of Racket-specific forms, and documentation integration. Quack is included in the Debian and Ubuntu repositories as part of the emacs-goodies-el package. A Gentoo port is also available (under the name app-emacs/quack).

Geiser provides a programming environment where the editor is tightly integrated with the Racket REPL. Programmers accustomed to environments such as Slime or Squeak should feel at home using Geiser. Geiser requires GNU Emacs 23.2 or better. Quack and Geiser can be used together, and complement each other nicely. More information is available in the Geiser manual. Debian and Ubuntu packages for Geiser are available under the name geiser.
Emacs ships with a major mode for Scheme, scheme-mode, that, while not as featureful as the above options, works reasonably well for editing Racket code. However, this mode does not provide support for Racket-specific forms.

No Racket program is complete without documentation. Scribble support for Emacs is available with Neil Van Dyke's Scribble Mode. In addition, texinfo-mode (included with GNU Emacs) and plain text modes work well when editing Scribble documents. The Racket major modes above are not really suited to this task, given how different Scribble's syntax is from Racket's.
22.2.2
Minor Modes
Paredit is a minor mode for pseudo-structurally editing programs in Lisp-like languages. In addition to providing high-level S-expression editing commands, it prevents you from accidentally unbalancing parentheses. Debian and Ubuntu packages for Paredit are available under the name paredit-el.

Alex Shinn's scheme-complete provides intelligent, context-sensitive code completion. It also integrates with Emacs's eldoc mode to provide live documentation in the minibuffer. While this mode was designed for R5RS, it can still be useful for Racket development. The tool is unaware of large portions of the Racket standard library, and there may be some discrepancies in the live documentation in cases where Scheme and Racket have diverged.

The RainbowDelimiters mode colors parentheses and other delimiters according to their nesting depth. Coloring by nesting depth makes it easier to know, at a glance, which parentheses match.

ParenFace lets you choose in which face (font, color, etc.) parentheses should be displayed. Choosing an alternate face makes it possible to tone down parentheses.
22.3 Vim
Many distributions of Vim ship with support for Scheme, which will mostly work for Racket. You can enable filetype detection of Racket files as Scheme with the following:
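A minimal sketch of such a setting, assuming the same autocmd pattern as the Racket-specific variant in this section and mapping both file extensions to Vim's scheme filetype:

```vim
" Assumption: treat .rkt and .rktl files as Scheme so that Vim's
" bundled Scheme support (syntax, indentation) applies to them.
if has("autocmd")
  au BufReadPost *.rkt,*.rktl set filetype=scheme
endif
```

The has("autocmd") guard simply skips the setting on Vim builds that lack autocommand support.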
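The vim-racket plugin enables auto-detection, indentation, and syntax highlighting specifically for Racket files. Using the plugin is the easiest method, but if you would like to roll your own settings or override settings from the plugin, add something like the following to your ".vimrc" file: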
    if has("autocmd")
      au BufReadPost *.rkt,*.rktl set filetype=racket
      au filetype racket set lisp
      au filetype racket set autoindent
    endif
However, if you take this path you may need to do more work when installing plugins, because many Lisp-related plugins and scripts for vim are not aware of Racket. You can also set these conditional commands in a "scheme.vim" or "racket.vim" file in the "ftplugin" subdirectory of your vim folder.

Most installations of vim will automatically have useful defaults enabled, but if your installation does not, you will want to set at least the following in your ".vimrc" file:

    " Syntax highlighting
    syntax on

    " These lines make vim load various plugins
    filetype on
    filetype indent on
    filetype plugin on

    " No tabs!
    set expandtab
22.3.1 Indentation
You can enable indentation for Racket by setting both the lisp and autoindent options in Vim. However, the indentation is limited and not as complete as what you can get in Emacs. You can also use Dorai Sitaram's scmindent for better indentation of Racket code. The instructions on how to use the indenter are available on the website.

If you use the built-in indenter, you can customize it by setting how to indent certain keywords. The vim-racket plugin mentioned above sets some default keywords for you. You can add keywords yourself in your ".vimrc" file like this:
    " By default vim will indent arguments after the function name
    " but sometimes you want to only indent by 2 spaces similar to
    " how DrRacket indents define. Set the `lispwords' variable to
    " add function names that should have this type of indenting.
    set lispwords+=public-method,override-method,private-method,syntax
22.3.2 Highlighting
The Rainbow Parenthesis script for vim can be useful for more visible parenthesis matching. Syntax highlighting for Scheme is shipped with vim on many platforms, and it works for the most part with Racket. The vim-racket script provides good default highlighting settings for you.
22.3.3 Structured Editing
The Slimv plugin has a paredit mode that works like paredit in Emacs. However, the plugin is not aware of Racket. You can either set vim to treat Racket files as Scheme files or modify the paredit script to load on ".rkt" files.
22.3.4 Scribble
Vim support for writing Scribble documents is provided by the scribble.vim plugin.
22.3.5 Miscellaneous
If you are installing many vim plugins (not necessarily specific to Racket), we recommend using a plugin that will make loading other plugins easier. Pathogen is one plugin that does this; using it, you can install new plugins by extracting them to subdirectories in the "bundle" folder of your Vim installation.
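As a minimal sketch of that setup, assuming Pathogen itself is already installed where Vim can autoload it (usually the "autoload" folder of your Vim installation), the following ".vimrc" line asks Pathogen to add each subdirectory of "bundle" to Vim's runtime path:

```vim
" Assumes pathogen.vim is installed where Vim can autoload it.
" Adds every plugin directory under "bundle" to 'runtimepath'.
execute pathogen#infect()
```

This line is typically placed before the syntax and filetype settings in ".vimrc", so that the bundled plugins are already on the runtime path when those settings take effect.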
Bibliography
[Goldberg04] David Goldberg, Robert Bruce Findler, and Matthew Flatt, "Super and Inner: Together at Last!," Object-Oriented Programming, Languages, Systems, and Applications, 2004. http://www.cs.utah.edu/plt/publications/oopsla04-gff.pdf

[Flatt02] Matthew Flatt, "Composable and Compilable Macros: You Want it When?," International Conference on Functional Programming, 2002.

[Flatt06] Matthew Flatt, Robert Bruce Findler, and Matthias Felleisen, "Scheme with Classes, Mixins, and Traits (invited tutorial)," Asian Symposium on Programming Languages and Systems, 2006. http://www.cs.utah.edu/plt/publications/aplas06-fff.pdf

[Mitchell02] Richard Mitchell and Jim McKim, Design by Contract, by Example. 2002.

[Sitaram05] Dorai Sitaram, "pregexp: Portable Regular Expressions for Scheme and Common Lisp." 2002. http://www.ccs.neu.edu/home/dorai/pregexp/pregexp.html