RTL Design
Recall
Chapter 2: Combinational Logic Design
First step: Capture behavior (using equation or truth table)
Remaining steps: Convert to circuit
Example of how to create a high-level state machine to describe desired processor behavior
Laser-based distance measurement: pulse laser, measure time T to sense reflection
Laser light travels at speed of light, 3*10^8 m/sec
Distance is thus D = (T sec * 3*10^8 m/sec) / 2
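As a quick check of the distance formula, here is a minimal Python sketch. The function name distance_m and the 1-microsecond sample time are illustrative assumptions, not part of the original example.

```python
SPEED_OF_LIGHT_M_PER_S = 3e8  # approximate value used in the slides


def distance_m(round_trip_s: float) -> float:
    """Distance to the object in meters.

    The pulse travels to the object and back, so the one-way
    distance is half of (time * speed).
    """
    return round_trip_s * SPEED_OF_LIGHT_M_PER_S / 2


# Hypothetical sample: reflection sensed 1 microsecond after the pulse,
# which corresponds to roughly 150 m.
print(distance_m(1e-6))
```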
Inputs/outputs
B: bit input, from button, to begin measurement
L: bit output, activates laser
S: bit input, senses laser reflection
D: 16-bit output, displays computed distance
Step 1: Create high-level state machine
Begin by declaring inputs and outputs
Create initial state, name it S0
Initialize laser to off (L=0)
Initialize displayed distance to 0 (D=0)
[State diagram: S0 (L=0, D=0) followed by S1; transition labeled B (button pressed). Ports: B from button, L to laser, D (16 bits) to display, S from sensor]
Add another state, call it S1, that waits for a button press
B' (button not pressed): stay in S1, keep waiting
B (button pressed): go to a new state S2
Add a state S2 that turns on the laser (L=1)
Then turn off laser (L=0) in a state S3
Q: What to do next? A: Start timer, wait to sense reflection
Stay in S3 until reflection is sensed (S)
To measure time, count cycles for which we are in S3
To count, declare local register Dctr
Increment Dctr each cycle in S3
Initialize Dctr to 0 in S1 (S2 would have been O.K. too)
Inputs: B, S (1 bit each)
Outputs: L (bit), D (16 bits)
Local Registers: Dctr (16 bits)
[State diagram: S0 (L=0, D=0), S1 (Dctr = 0), S2 (L=1), S3, S4; D (16 bits) to display, S from sensor]
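The state machine built up so far can be sketched as a small Python simulation. This is an illustrative model, not the slides' own notation: the function name measure, the reflection_delay_cycles parameter, and the assumption that the button is already pressed are all hypothetical. The S4 action (D = Dctr/2) follows the "load D reg with Dctr/2" note that accompanies the controller later in these slides.

```python
def measure(reflection_delay_cycles: int) -> int:
    """Simulate one measurement of the high-level state machine.

    reflection_delay_cycles (hypothetical test input): number of clock
    cycles spent in S3 before the sensor input S becomes 1.
    Returns the value written to output D.
    """
    state = "S0"
    L, D, Dctr = 0, 0, 0
    B = 1              # assumption: button is already pressed
    cycles_in_s3 = 0

    while True:
        if state == "S0":        # initialize outputs
            L, D = 0, 0
            state = "S1"
        elif state == "S1":      # clear counter, wait for button
            Dctr = 0
            state = "S2" if B else "S1"
        elif state == "S2":      # laser on for one cycle
            L = 1
            state = "S3"
        elif state == "S3":      # laser off, count cycles until reflection
            L = 0
            Dctr = (Dctr + 1) & 0xFFFF   # Dctr is a 16-bit register
            cycles_in_s3 += 1
            S = 1 if cycles_in_s3 >= reflection_delay_cycles else 0
            state = "S4" if S else "S3"
        else:                    # S4: compute the displayed distance
            D = Dctr // 2
            return D


print(measure(10))  # 10 cycles counted in S3, so D = 5
```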
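[Figure: high-level state machine (S0: L=0, D=0; S1: Dctr = 0; S2: L=1; S3; S4) beside its datapath: control inputs Dreg_clr, Dreg_ld, Dctr_clr, Dctr_cnt; Dctr, a 16-bit up-counter (clear, count, 16-bit output Q); a register (clear, load) driving the 16-bit output D]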
Laser-based distance measurer example
Easy: just connect all control signals between controller and datapath
[Figure: datapath with Dctr, a 16-bit up-counter (clear, count, output Q), and Dreg, a 16-bit register (clear, load, input I, output Q) driving the 16-bit output D]
FSM has same structure as high-level state machine
Inputs: B, S
Outputs: L, Dreg_clr, Dreg_ld, Dctr_clr, Dctr_cnt
Inputs/outputs all bits now
Replace data operations by bit operations using datapath
[Figure: controller FSM with states S0 to S4; inputs B, S]
S4: L=0, Dreg_clr = 0, Dreg_ld = 1, Dctr_clr = 0, Dctr_cnt = 0 (load Dreg with Dctr/2; stop counting)
Step 4
[Figure: controller (input B from button; outputs L, Dreg_clr, Dreg_ld, Dctr_clr, Dctr_cnt) connected to datapath (Dctr: 16-bit up-counter with clear and count; a shift right by 1 (>>1); 16-bit output D to display); 300 MHz clock; inputs B, S]
S4: Dreg_ld = 1, Dctr_cnt = 0 (load Dreg with Dctr/2; stop counting)
Implement FSM as state register and logic (Ch3) to complete the design
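A note on the 300 MHz clock: one cycle lasts 1/(3*10^8) sec, during which light travels about 1 m, so Dctr counts round-trip meters and Dctr/2 would be the one-way distance in meters. The controller/datapath split can be sketched in Python as below. Only the S4 control values are stated in the slides; the rows for S0 to S3, the priority of clear over count/load, and the S4-to-S1 transition are assumptions inferred from the high-level state machine. The names Datapath, CONTROL, next_state, and run are hypothetical.

```python
class Datapath:
    """Dctr (16-bit up-counter) feeding Dreg (16-bit register) through >>1."""

    def __init__(self):
        self.dctr = 0
        self.dreg = 0

    def clock(self, Dreg_clr, Dreg_ld, Dctr_clr, Dctr_cnt):
        # Both registers update on the same clock edge, so Dreg sees the
        # counter value from before this edge, shifted right by 1 (Dctr/2).
        shifted = self.dctr >> 1
        if Dctr_clr:                 # assumption: clear has priority
            self.dctr = 0
        elif Dctr_cnt:
            self.dctr = (self.dctr + 1) & 0xFFFF
        if Dreg_clr:
            self.dreg = 0
        elif Dreg_ld:
            self.dreg = shifted


# Controller outputs per state. Only the S4 row is given in the slides;
# S0 to S3 are inferred from the high-level state machine.
CONTROL = {
    "S0": dict(L=0, Dreg_clr=1, Dreg_ld=0, Dctr_clr=0, Dctr_cnt=0),
    "S1": dict(L=0, Dreg_clr=0, Dreg_ld=0, Dctr_clr=1, Dctr_cnt=0),
    "S2": dict(L=1, Dreg_clr=0, Dreg_ld=0, Dctr_clr=0, Dctr_cnt=0),
    "S3": dict(L=0, Dreg_clr=0, Dreg_ld=0, Dctr_clr=0, Dctr_cnt=1),
    "S4": dict(L=0, Dreg_clr=0, Dreg_ld=1, Dctr_clr=0, Dctr_cnt=0),
}


def next_state(state, B, S):
    if state == "S0":
        return "S1"
    if state == "S1":
        return "S2" if B else "S1"
    if state == "S2":
        return "S3"
    if state == "S3":
        return "S4" if S else "S3"
    return "S1"  # assumption: after S4, wait for the button again


def run(reflection_delay_cycles: int) -> int:
    """Clock the controller and datapath until S4 has loaded Dreg."""
    dp, state, cycles_in_s3 = Datapath(), "S0", 0
    while True:
        signals = CONTROL[state]
        if state == "S3":
            cycles_in_s3 += 1
        S = int(cycles_in_s3 >= reflection_delay_cycles)
        dp.clock(**{k: v for k, v in signals.items() if k != "L"})
        if state == "S4":
            return dp.dreg
        state = next_state(state, B=1, S=S)


print(run(10))  # Dctr reaches 10, Dreg loads 10 >> 1 = 5
```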
Differences
[Figure: (a) digitized frame 1 (1 Mbyte) and digitized frame 2 (1 Mbyte); (b) digitized frame 1 (1 Mbyte) and difference of 2 from 1 (0.01 Mbyte)]
Video is a series of frames (e.g., 30 per second)
Most frames similar to previous frame
Compression idea: just send difference from previous frame
Assume each pixel is represented as 1 byte (actually, a color picture might have 3 bytes per pixel, for intensity of red, green, and blue components of pixel)
Need to quickly determine whether two frames are similar enough to just send difference for second frame
Compare corresponding 16x16 blocks
Treat 16x16 block as 256-byte array
Compute the absolute value of the difference of each array item
Sum those differences: if above a threshold, send complete frame for second frame; if below, can use difference method (using another technique, not described)
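The block comparison described above can be sketched in a few lines of Python. The function names and the sample threshold are illustrative assumptions; the text does not say what happens when the sum exactly equals the threshold, so this sketch treats that case as "similar enough".

```python
def sad(block_a, block_b):
    """Sum of absolute differences of two equal-length byte arrays."""
    if len(block_a) != len(block_b):
        raise ValueError("blocks must have the same number of pixels")
    return sum(abs(a - b) for a, b in zip(block_a, block_b))


def send_complete_frame(block_a, block_b, threshold):
    """True if the blocks differ too much to use the difference method."""
    return sad(block_a, block_b) > threshold


# Two hypothetical 16x16 blocks, each flattened to a 256-byte array
frame1_block = [10] * 256
frame2_block = [12] * 256

print(sad(frame1_block, frame2_block))                        # 512
print(send_complete_frame(frame1_block, frame2_block, 1000))  # False
```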
Inputs: A, B (256 byte memory); go (bit)
Outputs: sad (32 bits)
Local registers: sum, sad_reg (32 bits); i (9 bits)
S0: wait for go (stay on !go, move to S1 on go)
S1: initialize sum and index (sum = 0, i = 0)
S2: check if done (i>=256): go to S3 on (i<256), to S4 on !(i<256)
S3: add difference to sum, increment index
S4: done, write to output sad_reg (sad_reg = sum)
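The five states listed above can be stepped through in software to see how the loop over the 256 array items unfolds. This is a rough sketch: the function name sad_hlsm and the assumption that go is already 1 are hypothetical, and the 32-bit mask only mirrors the declared width of sum.

```python
def sad_hlsm(A, B):
    """Step through states S0 to S4 for two 256-byte arrays A and B."""
    state = "S0"
    go = 1                      # assumption: go is asserted from the start
    total, i, sad_reg = 0, 0, 0  # 'total' plays the role of register sum

    while True:
        if state == "S0":       # wait for go
            state = "S1" if go else "S0"
        elif state == "S1":     # initialize sum and index
            total, i = 0, 0
            state = "S2"
        elif state == "S2":     # check if done
            state = "S3" if i < 256 else "S4"
        elif state == "S3":     # add difference to sum, increment index
            total = (total + abs(A[i] - B[i])) & 0xFFFFFFFF  # 32-bit register
            i += 1
            state = "S2"
        else:                   # S4: write result to the output register
            sad_reg = total
            return sad_reg


A = list(range(256))   # hypothetical pixel values 0..255
B = [0] * 256
print(sad_hlsm(A, B))  # 0 + 1 + ... + 255 = 32640
```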
[Figure: SAD controller (S0 to S4) and datapath; control signals i_clr, sum_ld, sum_clr; 8-bit data inputs feeding an abs unit; 32-bit sum register; 32-bit sad_reg register driving output sad]
[Figure: digital filter with 12-bit input X and 12-bit output Y]
Filter should remove such noise in its output Y
Simple filter: Output average of last N values
Small N: less filtering
Large N: more filtering, but less sharp output
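The "average of last N values" filter can be sketched as follows. The sample values are made up for illustration, and two details are assumptions the text does not specify: integer division is used for the average, and while fewer than N samples have arrived the average is taken over the samples seen so far.

```python
from collections import deque


def moving_average(samples, n):
    """Return the running average of the last n samples (integer result)."""
    window = deque(maxlen=n)   # oldest sample drops out automatically
    outputs = []
    for x in samples:
        window.append(x)
        outputs.append(sum(window) // len(window))
    return outputs


# Hypothetical input with one noisy spike (240) among values near 180
x = [180, 180, 240, 180, 180]

print(moving_average(x, 2))  # [180, 180, 210, 210, 180]
print(moving_average(x, 4))  # [180, 180, 200, 195, 195]
```

With the larger window the spike is spread over more outputs and its peak is lower, which matches the trade-off noted above.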
RTL design
Step 1: Create high-level state machine
But there really is none! Data dominated indeed.
Go straight to step 2
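[Figure: filter datapath built from multipliers (*) and adders (+), with constants including c0 and c2 and an output register yreg driving Y; sample values 180, 181, 180]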
No controller needed
Extreme data-dominated example
(Example of an extreme control-dominated design: an FSM, with no datapath)
100-tap filter, following design on previous slide, would have about a 34-gate delay: 1 multiplier and 7 adders on longest path
Software
100-tap filter: 100 multiplications, 100 additions
Say 2 instructions per multiplication, 2 per addition
Say 10-gate delay per instruction
(100*2 + 100*2)*10 = 4000 gate delays
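The two delay estimates can be reproduced with a short calculation. The software figures come straight from the text. For the circuit, the per-component delays are not stated here, so the sketch assumes a 20-gate multiplier and a 2-gate adder, values chosen because they reproduce the quoted 34-gate figure. The 7 adders on the longest path correspond to the depth of a balanced adder tree summing 100 products.

```python
MULT_DELAY = 20    # assumption: gate delays through one multiplier
ADDER_DELAY = 2    # assumption: gate delays through one adder


def circuit_delay(taps: int) -> int:
    """Longest path: one multiplier, then a balanced tree of adders."""
    adder_levels = (taps - 1).bit_length()   # ceil(log2(taps)); 7 for 100
    return MULT_DELAY + adder_levels * ADDER_DELAY


def software_delay(taps: int) -> int:
    """2 instructions per multiply, 2 per add, 10 gate delays each."""
    instructions = taps * 2 + taps * 2
    return instructions * 10


print(circuit_delay(100))    # 34
print(software_delay(100))   # 4000
print(software_delay(100) / circuit_delay(100))  # over 100 times slower
```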