CD PPTS 2
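[Figure: the lexical analyzer reads the source program and returns a token to the parser on each getNextToken request; both use the symbol table, and the parser's output goes to semantic analysis]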
Example
Token        Informal description               Sample lexemes
if           Characters i, f                    if
else         Characters e, l, s, e              else
comparison   < or > or <= or >= or == or !=     <=, !=
number       Any numeric constant               3.14159, 0, 6.02e23
Input buffering
[Figure: input buffer with two pointers, token beginning and lookahead pointer]
switch (*lookahead++)
{
case eof :
    if (lookahead is at end of first buffer)
    {
        reload second buffer;
        lookahead = beginning of second buffer;
    }
    else if (lookahead is at end of second buffer)
    {
        reload first buffer;
        lookahead = beginning of first buffer;
    }
    else /* eof within a buffer marks the end of input */
        terminate lexical analysis;
    break;
cases for the other characters;
}
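The buffer-pair scheme above can be sketched in Python. This is only an illustration under stated assumptions: the class name BufferPair, the method next_char, the buffer size n, and the choice of "\0" as the sentinel are all hypothetical, and the token-beginning pointer is left out for brevity.

```python
import io

EOF = "\0"  # sentinel; assumed never to occur in the source text


class BufferPair:
    """Two buffers of n characters, each followed by one sentinel slot."""

    def __init__(self, stream, n=4096):
        self.stream = stream
        self.n = n
        # Layout: [first buffer: n chars][EOF][second buffer: n chars][EOF]
        self.buf = [EOF] * (2 * n + 2)
        self._reload(0)
        self.lookahead = 0

    def _reload(self, half):
        start = half * (self.n + 1)
        chunk = self.stream.read(self.n)
        for i, ch in enumerate(chunk):
            self.buf[start + i] = ch
        # A short read leaves a sentinel inside the buffer: end of input.
        self.buf[start + len(chunk)] = EOF

    def next_char(self):
        """Return the next character, or None at end of input."""
        ch = self.buf[self.lookahead]
        self.lookahead += 1
        if ch != EOF:
            return ch                            # common case: a single test
        if self.lookahead == self.n + 1:         # end of first buffer
            self._reload(1)                      # lookahead already points at second buffer
            return self.next_char()
        if self.lookahead == 2 * self.n + 2:     # end of second buffer
            self._reload(0)
            self.lookahead = 0
            return self.next_char()
        self.lookahead -= 1                      # eof within a buffer: end of input
        return None
```

For example, BufferPair(io.StringIO("abcdefghij"), n=4) yields the ten characters in order and then None, crossing both buffer boundaries on the way. The point of the sentinel is that the common case costs one comparison per character instead of two.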
Specification of tokens
Regular expressions are used to formalize the specification of tokens
Regular expressions are a means of specifying regular languages
Example:
Identifier = letter (letter | digit)*
Keyword = begin | end | if | then | else
Constant = digit+
Relop = < | <= | = | <> | > | >=
Each regular expression is a pattern specifying the form of strings
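As an illustration, the four definitions can be written as patterns for Python's re module and combined into a small scanner. The names TOKEN_SPEC, MASTER and tokenize are hypothetical, as are the "skip" pattern for whitespace and the restriction of letter and digit to ASCII; none of them come from the slides.

```python
import re

# Patterns transcribed from the definitions above. Keywords come first
# because every keyword also matches the identifier pattern.
TOKEN_SPEC = [
    ("keyword",    r"(?:begin|end|if|then|else)\b"),
    ("identifier", r"[A-Za-z][A-Za-z0-9]*"),
    ("constant",   r"[0-9]+"),
    ("relop",      r"<=|<>|>=|<|=|>"),   # longer operators listed first
    ("skip",       r"\s+"),              # assumption: whitespace separates tokens
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (token, lexeme) pairs for the whole input."""
    pos = 0
    tokens = []
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "skip":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

With these patterns, tokenize("if x1 <= 10") gives keyword "if", identifier "x1", relop "<=" and constant "10", while a lexeme such as "ifx" is reported as an identifier because the keyword pattern requires a word boundary.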
Tokens recognized: keywords, identifiers, constants, relational operators
State 10 :
    C = getchar()
    if letter(C) or digit(C) then goto State 10
    else if delimiter(C) then goto State 11
    else fail()
delimiter() : procedure that returns true whenever C is a character that could follow an identifier
State 11 :
    retract()
    return (id, install())
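The two states can be simulated directly. The sketch below makes several assumptions that are not on the slide: an earlier state has already consumed the first letter, the symbol table is a plain dict from lexeme to index, and the delimiter set is an arbitrary illustrative choice. The names is_delimiter and scan_identifier are hypothetical.

```python
def is_delimiter(c):
    # Assumption: end of input, whitespace, or a few operator/punctuation characters.
    return c == "" or c.isspace() or c in "()+-*/<>=;,"


def scan_identifier(text, pos, symbol_table):
    """Simulate states 10 and 11.

    `pos` indexes the character just after the first letter, which an
    earlier state is assumed to have consumed. Returns (token, new_pos).
    """
    lexeme_start = pos - 1
    while True:                                    # state 10
        c = text[pos] if pos < len(text) else ""   # getchar(); "" means end of input
        pos += 1
        if c.isalpha() or c.isdigit():             # letter(C) or digit(C)
            continue                               # stay in state 10
        if is_delimiter(c):
            break                                  # go to state 11
        raise ValueError(f"unexpected character {c!r}")   # fail()
    pos -= 1                                       # state 11: retract()
    lexeme = text[lexeme_start:pos]
    # install(): enter the lexeme if it is new, and return its table index
    index = symbol_table.setdefault(lexeme, len(symbol_table))
    return ("id", index), pos
```

The retract step matters because the delimiter that ends the identifier belongs to the next token: after scanning "count1 = 5" the returned position points at the space, not past it.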
[Figure: transition graph of an NFA recognizing the language (a | b)* abb, with start state 0]
The set of states = {0, 1, 2, 3}
Input symbols = {a, b}
Start state is 0, accepting state is 3
The language defined by an NFA is the set of strings it accepts
Transition Function
Transition function can be implemented as a transition table.
State   Input symbol
        a        b
0       {0,1}    {0}
1       --       {2}
2       --       {3}
3       --       --
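A transition table of this kind maps naturally onto a dictionary, and the NFA can then be simulated by tracking the set of states it could be in after each input symbol. The names DELTA, START, ACCEPTING and accepts below are hypothetical; the entries are those of the table for (a | b)* abb, with the "--" cells simply left out.

```python
# (state, input symbol) -> set of next states; a missing key means no move.
DELTA = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "b"): {3},
}
START = 0
ACCEPTING = {3}


def accepts(s):
    """Return True if some path from the start state spelling s ends in an accepting state."""
    current = {START}
    for ch in s:
        # Follow every possible move from every state the NFA could be in.
        current = {t for q in current for t in DELTA.get((q, ch), set())}
    return bool(current & ACCEPTING)
```

Under this sketch accepts("aabb") and accepts("babb") hold, while accepts("ab") and accepts("abba") do not, which matches the language (a | b)* abb.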
Converting an RE to an Automaton
We can convert an RE to an NFA
Inductive construction
Start with a simple basis, and use it to build more complex parts of the NFA
RE to NFA
Basis:
R = a
R = ε
Induction:
R = S + T
R = ST
R = S*
RE to ε-NFA Example
[Figure: ε-NFA for (ab+a)* built in stages from the NFAs for a, b, ab, and ab+a]
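The inductive construction can be sketched as a small program that builds an ε-NFA for expressions such as (ab+a)* and then simulates it. This is one possible variant, not necessarily the exact construction drawn on the slides: concatenation is joined with an ε edge, input is assumed to be well formed, and the names Builder, build, closure and matches are hypothetical.

```python
from itertools import count

EPS = None  # label used for epsilon moves


class Builder:
    """Each method returns a (start, final) pair for the sub-automaton it builds."""

    def __init__(self):
        self.ids = count()
        self.edges = []                      # (source, label, target)

    def symbol(self, a):                     # basis: R = a
        s, f = next(self.ids), next(self.ids)
        self.edges.append((s, a, f))
        return s, f

    def union(self, x, y):                   # R = S + T
        s, f = next(self.ids), next(self.ids)
        for start, final in (x, y):
            self.edges += [(s, EPS, start), (final, EPS, f)]
        return s, f

    def concat(self, x, y):                  # R = ST
        self.edges.append((x[1], EPS, y[0]))
        return x[0], y[1]

    def star(self, x):                       # R = S*
        s, f = next(self.ids), next(self.ids)
        self.edges += [(s, EPS, x[0]), (x[1], EPS, f), (s, EPS, f), (x[1], EPS, x[0])]
        return s, f


def build(regex):
    """Recursive-descent parse: '+' is union, juxtaposition is concatenation."""
    b, pos = Builder(), 0

    def peek():
        return regex[pos] if pos < len(regex) else ""

    def expr():
        nonlocal pos
        left = term()
        while peek() == "+":
            pos += 1
            left = b.union(left, term())
        return left

    def term():
        left = factor()
        while peek() and peek() not in "+)":
            left = b.concat(left, factor())
        return left

    def factor():
        nonlocal pos
        if peek() == "(":
            pos += 1
            node = expr()
            pos += 1                         # skip ")"
        else:
            node = b.symbol(peek())
            pos += 1
        while peek() == "*":
            pos += 1
            node = b.star(node)
        return node

    start, final = expr()
    return start, final, b.edges


def closure(states, edges):
    """All states reachable from `states` using epsilon moves only."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for src, label, dst in edges:
            if src == q and label is EPS and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def matches(regex, text):
    start, final, edges = build(regex)
    current = closure({start}, edges)
    for ch in text:
        moved = {dst for src, label, dst in edges if src in current and label == ch}
        current = closure(moved, edges)
    return final in current
```

For (ab+a)* the builder creates the automata for a, b, ab, ab+a and finally the starred expression, in that order. The resulting automaton accepts the empty string, "aba" and "aab", and rejects "abb" and "b".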