18 Code Gen


Principles of Compiler Design

Code Generation

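[Figure: phases of a compiler. Source program -> Lexical Analysis -> token stream -> Syntax Analysis -> abstract syntax tree -> Semantic Analysis -> intermediate code -> Code Generation -> target program. The analysis phases form the front end (language specific); code generation is the back end.]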
Code generation and Instruction Selection
[Figure: input -> front end -> intermediate code generator -> code generator -> output, with the phases sharing the symbol table.]
Requirements
• output code must be correct
• output code must be of high quality
• code generator should run efficiently
Design of code generator: Issues
• Input: Intermediate representation with symbol
table
– assume that input has been validated by the front end

• Target programs :
– absolute machine language
fast for small programs
– relocatable machine code
requires linker and loader
– assembly code
requires assembler, linker, and loader

More Issues…
• Instruction selection
– Uniformity
– Completeness
– Instruction speed, power consumption
• Register allocation
– Instructions with register operands are
faster
– keep long-lived values and loop counters in registers
– temporary locations
– even/odd register pairs
• Evaluation order
Instruction Selection
• straightforward code if efficiency is not an issue

    a = b + c      Mov b, R0
                   Add c, R0
                   Mov R0, a
    d = a + e      Mov a, R0      (can be eliminated)
                   Add e, R0
                   Mov R0, d

    a = a + 1      Mov a, R0      or simply: Inc a
                   Add #1, R0
                   Mov R0, a
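
The elimination noted above can be sketched as a tiny peephole pass. This is only an illustration under assumptions: instructions are modelled as hypothetical `(opcode, source, destination)` tuples, and the pass is applied inside a single basic block so that no jump can land on the removed instruction.

```python
def drop_redundant_moves(code):
    """Drop a Mov that only copies back the value moved by the previous Mov."""
    out = []
    for instr in code:
        op, src, dst = instr
        if out and op == "Mov":
            p_op, p_src, p_dst = out[-1]
            # "Mov R0, a" followed by "Mov a, R0": the second one is redundant.
            if p_op == "Mov" and p_src == dst and p_dst == src:
                continue
        out.append(instr)
    return out

# Naive code for a = b + c; d = a + e on the two-address machine.
naive = [
    ("Mov", "b", "R0"), ("Add", "c", "R0"), ("Mov", "R0", "a"),
    ("Mov", "a", "R0"), ("Add", "e", "R0"), ("Mov", "R0", "d"),
]
optimized = drop_redundant_moves(naive)
for op, src, dst in optimized:
    print(f"{op} {src}, {dst}")
```

The pass looks only at adjacent pairs, so it removes the reload of `a` but would miss redundancy separated by other instructions.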

Example Target Machine
• Byte addressable with 4 bytes per word
• n registers R0, R1, ..., Rn-1
• Two address instructions of the form
opcode source, destination
• Usual opcodes like move, add, sub etc.
• Addressing modes
    MODE                FORM     ADDRESS
    absolute            M        M
    register            R        R
    index               c(R)     c + content(R)
    indirect register   *R       content(R)
    indirect index      *c(R)    content(c + content(R))
    literal             #c       c
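
The ADDRESS column can be read as a small function of the register file and memory. The sketch below assumes a hypothetical model in which `regs` maps register names to contents and `mem` maps addresses to contents; the function name and parameters are illustrative only.

```python
def operand_address(mode, regs, mem, c=None, r=None):
    """Return the ADDRESS column entry for one operand.

    c is the constant or absolute address M, r is a register name.
    """
    if mode == "absolute":
        return c                    # M
    if mode == "register":
        return r                    # the register itself
    if mode == "index":
        return c + regs[r]          # c + content(R)
    if mode == "indirect register":
        return regs[r]              # content(R)
    if mode == "indirect index":
        return mem[c + regs[r]]     # content(c + content(R))
    if mode == "literal":
        return c                    # the constant c itself
    raise ValueError(f"unknown mode: {mode}")

regs = {"R1": 100}
mem = {104: 500}
print(operand_address("index", regs, mem, c=4, r="R1"))           # 4(R1)
print(operand_address("indirect register", regs, mem, r="R1"))    # *R1
print(operand_address("indirect index", regs, mem, c=4, r="R1"))  # *4(R1)
```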
Flow Graph
• Graph representation of three address
code
• Useful for understanding code generation
(and for optimization)
• Nodes represent computation
• Edges represent flow of control

Basic blocks
• (maximal) sequence of consecutive
statements in which flow of control enters at
the beginning and leaves at the end
Algorithm to identify basic blocks
• determine leader
– first statement is a leader
– any target of a goto statement is a leader
– any statement that follows a goto statement is a
leader
• for each leader its basic block consists of the
leader and all statements up to, but not including, the next leader
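
The leader rules above can be sketched as follows. The representation is an assumption made for illustration: statements are plain strings numbered from 1, and jumps are written `goto (N)` or `if ... goto (N)` with N a statement number.

```python
import re

def find_leaders(stmts):
    """Return the sorted statement numbers (1-based) that are leaders."""
    if not stmts:
        return []
    leaders = {1}                                   # first statement
    for i, s in enumerate(stmts, start=1):
        m = re.search(r"goto \(?(\d+)\)?", s)
        if m:
            leaders.add(int(m.group(1)))            # target of a goto
            if i + 1 <= len(stmts):
                leaders.add(i + 1)                  # statement following a goto
    return sorted(leaders)

def basic_blocks(stmts):
    """Each block runs from a leader up to, but not including, the next leader."""
    leaders = find_leaders(stmts)
    bounds = leaders + [len(stmts) + 1]
    return [list(range(bounds[k], bounds[k + 1])) for k in range(len(leaders))]

code = [
    "i = 1",
    "t1 = 4 * i",
    "prod = prod + t1",
    "i = i + 1",
    "if i <= 20 goto (2)",
    "x = prod",
]
print(find_leaders(code))
print(basic_blocks(code))
```

Here statement 2 is a leader because it is a jump target, and statement 6 because it follows the conditional jump.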
Flow graphs
• add control flow information to basic
blocks
• nodes are the basic blocks
• there is a directed edge from B1 to B2 if B2
can follow B1 in some execution sequence
– there is a jump from the last statement of B1
to the first statement of B2
– B2 follows B1 in natural order of execution
• initial node: block with first statement as
leader
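
The two edge rules can be sketched as below. The input format is assumed for illustration: `stmts` is a list of statement strings numbered from 1 with jumps written `goto (N)`, and `blocks` is a list of basic blocks, each a list of statement numbers.

```python
import re

def flow_edges(stmts, blocks):
    """Return sorted (from_block, to_block) index pairs of the flow graph."""
    start_of = {b[0]: k for k, b in enumerate(blocks)}  # leader -> block index
    edges = set()
    for k, b in enumerate(blocks):
        last = stmts[b[-1] - 1]
        m = re.search(r"goto \(?(\d+)\)?", last)
        if m:
            # Jump from the last statement of this block to a leader.
            edges.add((k, start_of[int(m.group(1))]))
        unconditional = m is not None and not last.startswith("if")
        if not unconditional and k + 1 < len(blocks):
            # The next block follows in the natural order of execution.
            edges.add((k, k + 1))
    return sorted(edges)

stmts = [
    "i = 1",
    "t1 = 4 * i",
    "prod = prod + t1",
    "i = i + 1",
    "if i <= 20 goto (2)",
    "x = prod",
]
blocks = [[1], [2, 3, 4, 5], [6]]
edges = flow_edges(stmts, blocks)
print(edges)
```

Block 0 is the initial node; the loop body (block 1) has an edge to itself and a fall-through edge to block 2.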
Next use information
• for register and temporary allocation
• remove variables from registers if not
used
• statement X = Y op Z
defines X and uses Y and Z
• scan each basic block backwards
• assume all temporaries are dead on
exit and all user variables are live on
exit
Computing next use information
Suppose we are scanning
i : X := Y op Z
in backward scan
1. attach to statement i, information in symbol
table about X, Y, Z
2. set X to “not live” and “no next use” in symbol
table
3. set Y and Z to be “live” and next use as i in
symbol table
Example
1: t1 = a * a
2: t2 = a * b
3: t3 = 2 * t2
4: t4 = t1 + t3
5: t5 = b * b
6: t6 = t4 + t5
7: X = t6
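
The backward scan can be tried on this block with a short sketch. The encoding is an assumption for illustration: each statement is a hypothetical `(target, operands)` pair, the symbol table maps a name to `(live, next_use)`, and `a`, `b`, `X` are taken to be the user variables that are live on exit.

```python
def next_use(block, live_on_exit):
    """Return {statement number: {name: (live, next_use)}} for one basic block."""
    table = {}                                   # symbol table during the scan

    def lookup(name):
        # Before any update: user variables are live on exit, temporaries dead.
        return table.get(name, (name in live_on_exit, None))

    info = {}
    for i in range(len(block), 0, -1):           # scan backwards
        x, operands = block[i - 1]
        names = [n for n in operands if isinstance(n, str)]
        info[i] = {n: lookup(n) for n in [x] + names}   # step 1: attach
        table[x] = (False, None)                        # step 2: X not live
        for n in names:                                 # step 3: operands live,
            table[n] = (True, i)                        #         next use is i
    return info

block = [
    ("t1", ("a", "a")),
    ("t2", ("a", "b")),
    ("t3", (2, "t2")),
    ("t4", ("t1", "t3")),
    ("t5", ("b", "b")),
    ("t6", ("t4", "t5")),
    ("X", ("t6",)),
]
info = next_use(block, live_on_exit={"a", "b", "X"})
for i in sorted(info):
    print(i, info[i])
```

Step 2 is done before step 3 so that a statement such as `x = x + 1` leaves `x` live with its next use at that statement.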
Example (backward scan)
7: no temporary is live
6: t6:use(7), t4 t5 not live
5: t5:use(6)
4: t4:use(6), t1 t3 not live
3: t3:use(4), t2 not live
2: t2:use(3)
1: t1:use(4)

Symbol Table
    t1   dead   Use in 4
    t2   dead   Use in 3
    t3   dead   Use in 4
    t4   dead   Use in 6
    t5   dead   Use in 6
    t6   dead   Use in 7
Example …
[Figure: live ranges of the temporaries over statements 1 to 7: t1 from 1 to 4, t2 from 2 to 3, t3 from 3 to 4, t4 from 4 to 6, t5 from 5 to 6, t6 from 6 to 7.]

Reusing a temporary once its old value is dead, the block needs only t1 and t2:
1: t1 = a * a
2: t2 = a * b
3: t2 = 2 * t2
4: t1 = t1 + t2
5: t2 = b * b
6: t1 = t1 + t2
7: X = t1
Code Generator
• consider each statement
• remember if operands are in registers
• Register descriptor
– Keep track of what is currently in each register.
– Initially all the registers are empty
• Address descriptor
– Keep track of location where current value of
the name can be found at runtime
– The location might be a register, stack,
memory address or a set of those
Code Generation Algorithm
for each X = Y op Z do
• invoke a function getreg to
determine location L where X must
be stored. Usually L is a register.
• Consult address descriptor of Y to
determine Y'. Prefer a register for Y'.
If value of Y not already in L generate
Mov Y', L

• Generate
op Z', L
where Z' is a current location of Z. Again prefer a register for Z'. Update
address descriptor of X to indicate X is in L.
• If L is a register, update its descriptor to
indicate that it contains X and remove X
from all other register descriptors.
• If current value of Y and/or Z have no next
use and are dead on exit from block and
are in registers, change register descriptor
to indicate that they no longer contain Y
and/or Z.
Function getreg
1. If Y is in register (that holds no other values)
and Y is not live and has no next use after
X = Y op Z
then return register of Y for L.
2. Failing (1) return an empty register
3. Failing (2) if X has a next use in the block or
op requires register then get a register R,
store its content into M (by Mov R, M) and
use it.
4. else select memory location X as L
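
The algorithm and getreg can be put together in a compact sketch. Several things here are assumptions made for illustration, not part of the slides: statements are hypothetical `(x, op, y, z)` tuples, registers are named `R0`, `R1`, ..., a variable's memory location is denoted by its own name, the spill victim in case 3 is chosen arbitrarily, case 4 (computing directly into memory) is omitted, and names that are live on exit are stored back at the end of the block.

```python
def generate(block, live_on_exit, nregs=2):
    """block: list of (x, op, y, z) meaning x = y op z. Returns a list of instructions."""
    regs = {f"R{k}": set() for k in range(nregs)}  # register descriptor
    addr = {}                                      # address descriptor
    code = []

    def locs(name):
        # A name not seen yet is assumed to live in its own memory location.
        return addr.setdefault(name, {name})

    def best(name):
        for loc in locs(name):                     # prefer a register copy
            if loc in regs:
                return loc
        return name

    def needed_after(name, i):
        # Used after statement i before being redefined, or live on exit.
        for x, _, y, z in block[i + 1:]:
            if name in (y, z):
                return True
            if name == x:
                return False
        return name in live_on_exit

    def getreg(i, y):
        for r, held in regs.items():               # case 1: reuse Y's register
            if held == {y} and not needed_after(y, i):
                return r
        for r, held in regs.items():               # case 2: an empty register
            if not held:
                return r
        r = next(iter(regs))                       # case 3: spill an arbitrary register
        for v in list(regs[r]):
            if v not in addr[v]:                   # no memory copy yet
                code.append(f"mov {r},{v}")
                addr[v].add(v)
            addr[v].discard(r)
        regs[r] = set()
        return r

    for i, (x, op, y, z) in enumerate(block):
        L = getreg(i, y)
        if L not in locs(y):
            code.append(f"mov {best(y)},{L}")
        code.append(f"{op} {best(z)},{L}")
        for held in regs.values():                 # X is no longer anywhere else
            held.discard(x)
        for v in regs[L]:                          # L no longer holds old values
            addr[v].discard(L)
        regs[L] = {x}
        addr[x] = {L}
        for n in (y, z):                           # free registers of dead operands
            if n != x and not needed_after(n, i):
                for r, held in regs.items():
                    if n in held:
                        held.discard(n)
                        addr[n].discard(r)

    for n in sorted(live_on_exit):                 # store live names held only in registers
        if n in addr and n not in addr[n]:
            code.append(f"mov {best(n)},{n}")
            addr[n].add(n)
    return code

block = [
    ("t1", "sub", "a", "b"),
    ("t2", "sub", "a", "c"),
    ("t3", "add", "t1", "t2"),
    ("d", "add", "t3", "t2"),
]
code = generate(block, live_on_exit={"a", "b", "c", "d"})
print("\n".join(code))
```

With two registers this block needs no spill: `t1` and `t3` die exactly when their register is wanted for the next result, so case 1 of getreg reuses R0 each time.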
Example
Block:
    t1 = a - b
    t2 = a - c
    t3 = t1 + t2
    d = t3 + t2

    Stmt            code          reg desc          addr desc
    t1 = a - b      mov a,R0
                    sub b,R0      R0 contains t1    t1 in R0
    t2 = a - c      mov a,R1      R0 contains t1    t1 in R0
                    sub c,R1      R1 contains t2    t2 in R1
    t3 = t1 + t2    add R1,R0     R0 contains t3    t3 in R0
                                  R1 contains t2    t2 in R1
    d = t3 + t2     add R1,R0     R0 contains d     d in R0
                    mov R0,d                        d in R0 and memory
DAG representation of basic blocks
• useful data structures for implementing
transformations on basic blocks
• gives a picture of how value computed by a
statement is used in subsequent statements
• good way of determining common sub-
expressions
• A dag for a basic block has following labels on the
nodes
– leaves are labeled by unique identifiers, either variable
names or constants
– interior nodes are labeled by an operator symbol
– nodes are also optionally given a sequence of
identifiers for labels
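
One common way to build such a DAG is to reuse an existing node whenever the same operator is applied to the same children. The sketch below assumes hypothetical `(x, op, y, z)` tuples for `x = y op z`, writes array indexing as the operator `[]`, and ignores complications such as assignments to array elements.

```python
def build_dag(block):
    """Return (nodes, attached): node list and identifiers attached to each node."""
    nodes = []       # node id -> (label, left child id, right child id)
    index = {}       # same triple -> node id, used to find existing nodes
    current = {}     # identifier -> node id holding its current value
    attached = {}    # node id -> list of identifiers labelling that node

    def node_for(name):
        if name not in current:                 # first use: create a leaf
            key = ("leaf", name, None)
            if key not in index:
                index[key] = len(nodes)
                nodes.append(key)
            current[name] = index[key]
        return current[name]

    for x, op, y, z in block:
        key = (op, node_for(y), node_for(z))
        if key not in index:                    # new interior node
            index[key] = len(nodes)
            nodes.append(key)
        n = index[key]                          # otherwise: common subexpression
        for ids in attached.values():           # x no longer names its old node
            if x in ids:
                ids.remove(x)
        attached.setdefault(n, []).append(x)
        current[x] = n
    return nodes, attached

block = [
    ("t1", "*", 4, "i"),
    ("t2", "[]", "a", "t1"),
    ("t3", "*", 4, "i"),
    ("t4", "[]", "b", "t3"),
]
nodes, attached = build_dag(block)
for n, node in enumerate(nodes):
    print(n, node, attached.get(n, []))
```

The node for `4 * i` ends up labelled with both `t1` and `t3`, which is how the common subexpression shows up in the DAG.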
DAG representation: example
1. t1 := 4 * i
2. t2 := a[t1]
3. t3 := 4 * i
4. t4 := b[t3]
5. t5 := t2 * t4
6. t6 := prod + t5
7. prod := t6
8. t7 := i + 1
9. i := t7
10. if i <= 20 goto (1)

[Figure: DAG for this block. Leaves: prod0, a, b, 4, i0, 1, 20. Interior nodes with attached identifiers: * (t1, t3); [] (t2); [] (t4); * (t5); + (t6, prod); + (t7, i); <= (1).]
Code Generation from DAG
    Original                 From the DAG
    S1 = 4 * i               S1 = 4 * i
    S2 = addr(A)-4           S2 = addr(A)-4
    S3 = S2[S1]              S3 = S2[S1]
    S4 = 4 * i
    S5 = addr(B)-4           S5 = addr(B)-4
    S6 = S5[S4]              S6 = S5[S1]
    S7 = S3 * S6             S7 = S3 * S6
    S8 = prod + S7           prod = prod + S7
    prod = S8
    S9 = i + 1               i = i + 1
    i = S9
    if i <= 20 goto (1)      if i <= 20 goto (1)
Rearranging order of the code
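• Consider the following basic block

    t1 = a + b
    t2 = c + d
    t3 = e - t2
    X = t1 - t3

and its DAG

[Figure: DAG of the block. Root - (X) with children + (t1) and - (t3); t1 has leaves a and b; t3 has leaf e and child + (t2); t2 has leaves c and d.]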
Rearranging order …
Code for the block in its original order (assuming only two registers are available):

    MOV a, R0
    ADD b, R0
    MOV c, R1
    ADD d, R1
    MOV R0, t1      (register spilling)
    MOV e, R0
    SUB R1, R0
    MOV t1, R1      (register reloading)
    SUB R0, R1
    MOV R1, X

Rearranging the code as

    t2 = c + d
    t3 = e - t2
    t1 = a + b
    X = t1 - t3

gives

    MOV c, R0
    ADD d, R0
    MOV e, R1
    SUB R0, R1
    MOV a, R0
    ADD b, R0
    SUB R1, R0
    MOV R0, X
