NetFPGA Tutorial

CS 838: NetFPGA Tutorial

Theophilus Benson
Outline
Background: What is the NetFPGA?
Life cycle of a packet through a NetFPGA
Demo
What is the NetFPGA?

Networking software running on a standard PC (CPU, memory, PCI)

A hardware accelerator built with a Field Programmable Gate Array (FPGA, memory) driving Gigabit network links (4 x 1GE)
NetFPGA Router Function
4 Gigabit Ethernet ports
Fully programmable FPGA hardware
Open-source FPGA hardware -- Verilog base design
Open-source software -- Linux user level
Drivers in C and C++
NetFPGA Platform
Major Components
Interfaces
4 Gigabit Ethernet Ports
PCI Host Interface

Memories
36Mbits Static RAM
512Mbits DDR2 Dynamic RAM

FPGA Resources
Block RAMs
Configurable Logic Blocks (CLBs)
Memory Mapped Registers
NetFPGA: Router Design
Pipeline of modules
FIFO queues between each module
Inter-module communication
CTRL: Sent on ctrl bus (8 bits)
Metadata about the data being sent
DATA: Sent on data bus (64 bits)
RDY: Signifies ready to receive packet (1 bit)
WR: Signifies packet being sent (1 bit)
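
The rdy/wr handshake between two pipeline modules can be sketched in software. The following is a minimal behavioral model in Python, not the reference Verilog: the class name `ModuleInput`, the `depth` parameter, and the rule that rdy drops exactly when the FIFO is full are assumptions made for illustration.

```python
from collections import deque


class ModuleInput:
    """Input side of one pipeline module: a small FIFO guarded by rdy/wr."""

    def __init__(self, depth=4):
        self.depth = depth          # hypothetical FIFO depth, in words
        self.fifo = deque()

    @property
    def rdy(self):
        # RDY (1 bit): high while the FIFO can take another word
        return len(self.fifo) < self.depth

    def clock(self, wr, ctrl=0, data=0):
        """Model one clock edge; return True if a word was transferred."""
        if wr and self.rdy:
            # CTRL is 8 bits wide, DATA is 64 bits wide
            self.fifo.append((ctrl & 0xFF, data & 0xFFFFFFFFFFFFFFFF))
            return True
        return False

    def pop(self):
        """The module consumes the oldest word, which frees one slot."""
        return self.fifo.popleft()
```

A word moves only on a clock edge where the sender asserts wr and the receiver shows rdy; once the FIFO is full, further writes are refused until the module pops a word.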
NetFPGA

Software: Linux user-level processes
Hardware: Verilog on NetFPGA PCI board (FPGA Modules 1, FPGA Modules 2)
Example: An IP Router on NetFPGA

Software (Linux user-level processes): Management & CLI, Routing Protocols, Exception Processing, Routing Table
Hardware (Verilog on NetFPGA PCI board): Forwarding Table, Switching
Life of a Packet through the hardware

192.168.1.x 192.168.2.y
port0 port2
Router Stages
Rx queues: MAC RxQ and CPU RxQ for each of the 4 ports (8 in total)
Input Arbiter
Output Port Lookup
Output Queues
Tx queues: MAC TxQ and CPU TxQ for each of the 4 ports (8 in total)
Inter-module Communication
Using Module Headers:
Module headers contain information such as packet length, input port, output port

Ctrl Word (8 bits)   Data Word (64 bits)
x                    Module Hdr
y                    Last Module Hdr
0                    Eth Hdr
0                    IP Hdr
0
0x10                 Last word of packet
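
To make the ctrl/data framing concrete, here is a small Python sketch that turns a byte string into a list of (ctrl, data) words with one module header in front. The header field positions and the one-hot encoding of the last valid byte are assumptions chosen for illustration (they happen to give 0x10 when the last word carries 4 valid bytes); they are not taken from the slides.

```python
IOQ_HDR_CTRL = 0xFF  # ctrl value the slides show for the Rx-queue module header


def frame_packet(payload, input_port):
    """Split a packet into 64-bit words, prefixed by one module header."""
    if not payload:
        raise ValueError("empty packet")
    # Hypothetical header layout: byte length in bits [15:0], input port in [31:16]
    header = (len(payload) & 0xFFFF) | ((input_port & 0xFFFF) << 16)
    words = [(IOQ_HDR_CTRL, header)]
    for off in range(0, len(payload), 8):
        chunk = payload[off:off + 8]
        last = off + 8 >= len(payload)
        # ctrl is 0 inside the packet; on the last word it is non-zero and
        # (by assumption) one-hot encodes the position of the last valid byte
        ctrl = (0x80 >> (len(chunk) - 1)) if last else 0
        data = int.from_bytes(chunk.ljust(8, b"\x00"), "big")
        words.append((ctrl, data))
    return words
```

For a 12-byte payload this yields three words: the header with ctrl 0xff, one full data word with ctrl 0, and a final word with ctrl 0x10.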
Inter-module Communication
[Diagram: adjacent modules connected by the data, ctrl, wr and rdy signals]
MAC Rx Queue

Ctrl   Data
0xff   Pkt length, input port = 0
0      Eth Hdr: Dst MAC = port 0, Ethertype = IP
0      IP Hdr: IP Dst: 192.168.2.3, TTL: 64, Csum: 0x3ab4
0      Data
Input Arbiter
[Diagram: packets (Pkt) from the Rx queues entering the Input Arbiter]
Output Port Lookup
1- Check input port matches Dst MAC
2- Check TTL, checksum
3- Lookup next hop IP & output port (LPM)
4- Lookup next hop MAC address (ARP)
5- Add output port module
6- Modify MAC Dst and Src addresses
7- Decrement TTL and update checksum

Ctrl   Data (before -> after)
0x04   output port = 4 (added)
0xff   Pkt length, input port = 0
0      EthHdr: Dst MAC = 0 -> nextHop, Src MAC = x -> port 4, Ethertype = IP
0      IP Hdr: IP Dst: 192.168.2.3, TTL: 64 -> 63, Csum: 0x3ab4 -> 0x3ac2
0      Data
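
The longest-prefix-match lookup and the TTL/checksum update in the steps above can be illustrated in software. This is a rough Python sketch under stated assumptions, not the hardware implementation: the table format and the helper names `lpm_lookup`, `ipv4_checksum`, `ones_add` and `decrement_ttl` are hypothetical, the lookup is a linear scan rather than a hardware table, and the checksum is patched with the standard incremental one's-complement update.

```python
import ipaddress


def lpm_lookup(table, dst):
    """Return (next_hop, out_port) of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, next_hop, port in table:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, next_hop, port)
    return None if best is None else (best[1], best[2])


def ipv4_checksum(header):
    """Full one's-complement checksum over the header, skipping the checksum field."""
    total = 0
    for i in range(0, len(header), 2):
        if i != 10:  # bytes 10-11 hold the checksum itself
            total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ones_add(a, b):
    """16-bit one's-complement addition with end-around carry."""
    s = a + b
    return (s & 0xFFFF) + (s >> 16)


def decrement_ttl(header):
    """Return a copy with TTL - 1 and an incrementally patched checksum.

    The caller is expected to have checked TTL > 1 first (step 2).
    """
    hdr = bytearray(header)
    old_word = (hdr[8] << 8) | hdr[9]   # TTL (high byte), protocol (low byte)
    hdr[8] -= 1
    new_word = (hdr[8] << 8) | hdr[9]
    old_csum = (hdr[10] << 8) | hdr[11]
    # csum' = ~(~csum + ~old_word + new_word), all in one's-complement arithmetic
    s = ones_add(~old_csum & 0xFFFF, ~old_word & 0xFFFF)
    s = ones_add(s, new_word)
    new_csum = ~s & 0xFFFF
    hdr[10], hdr[11] = new_csum >> 8, new_csum & 0xFF
    return bytes(hdr)
```

Because only one 16-bit word of the header changes, the checksum can be patched from the old and new values of that word without re-summing the whole header, which is why the update is cheap enough to do per packet.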
Output Queues
Output queues OQ0 .. OQ7; the packet (output port = 4) goes to OQ4
MAC Tx Queue

Ctrl   Data
0x04   output port = 4
0xff   Pkt length, input port = 0
0      EthHdr: Dst MAC = nextHop, Src MAC = port 4, Ethertype = IP
0      IP Hdr: IP Dst: 192.168.2.3, TTL: 63, Csum: 0x3ac2
0      Data
NetFPGA-Host Interaction
Linux driver interfaces with hardware
Packet interface via standard Linux network stack
Register reads/writes via ioctl system call (with convenience wrapper functions)
readReg(nf2device *dev, int address, unsigned *rd_data)
writeReg(nf2device *dev, int address, unsigned *wr_data)

e.g.:
readReg(&nf2, OQ_NUM_PKTS_STORED_0, &val);
NetFPGA-Host Interaction
Register access
1. Software makes ioctl call on network socket. ioctl passed to driver.
2. Driver performs PCI memory read/write (over the PCI Bus)
NetFPGA-Host Interaction
Packet transfers shown using DMA interface
Alternative: use programmed IO to transfer packets via register reads/writes
slower but eliminates the need to deal with network sockets
DEMO: Life of a Packet through the hardware

192.168.1.x 192.168.2.y
port0 port2
Programming the FPGA with your code
nf2_download NF2/bitfiles/reference_router.bit
Mirror the Linux ARP table
./NF2/projects/router_kit/sw/rkd
Helpful tool
./NFlib/C/router/cli
Shows forwarding tables {arp table, ip table}
Allows modifying the tables
Useful Links
NetFPGA Website
NetFPGA Wiki
NetFPGA Guide
Walkthrough the Reference Designs
The Verilog Golden Reference Guide
Questions
Verilog
Hardware Description Languages
Concurrent
By default, Verilog statements are evaluated concurrently
Express fine-grain parallelism
Allows gate-level parallelism
Provides Precise Description
Eliminates ambiguity about operation
Synthesizable
Generates hardware from description
Verilog Data Types
reg [7:0] A;              // 8-bit register, bit 7 is the MSB
                          // (Preferred bit order for NetFPGA)
reg [0:15] B;             // 16-bit register, bit 0 is the MSB

B = {A[7:0], A[7:0]};     // Assignment by concatenation; part-selects must
                          // follow the declared bit order of A

reg [31:0] Mem [0:1023];  // 1K-word memory of 32-bit words

integer Count;            // simple signed 32-bit integer
integer K[1:64];          // an array of 64 integers
time Start, Stop;         // two 64-bit time variables
From: CSCI 320 Computer Architecture
Handbook on Verilog HDL, by Dr. Daniel C. Hyde:
http://eesun.free.fr/DOC/VERILOG/verilog-manual.html
Signal Multiplexers
Two input multiplexer (using if / else)
reg y;
always @*
  if (select)
    y = a;
  else
    y = b;

Two input multiplexer (using ternary operator ?:)
wire t = (select ? a : b);

From: http://eesun.free.fr/DOC/VERILOG/synvlg.html
Larger Multiplexers
Three input multiplexer

reg s;
always @*
  begin
    case (select2)
      2'b00: s = a;
      2'b01: s = b;
      default: s = c;
    endcase
  end

From: http://eesun.free.fr/DOC/VERILOG/synvlg.html
Synchronous Storage Elements
Values change at times governed by clock
[Diagram: D flip-flop with inputs Din and Clock and output Dout; the clock input to the circuit switches between 0 and 1 over t=0, t=1, t=2]
Clock Transition
Clock Event
Example: Rising edge
Flip/Flop
Transfers value from Din to Dout on clock event
[Timing diagram: Din = A, B, C; Dout = S0, A, B]
Finite State Machines
Inputs (X) and the current state S(t) feed the combinational logic
Next state: S(t+1) = F(X, S(t))
Outputs (Z) = G(X, S(t)) [Mealy] -or- G(S(t)) [Moore]
State storage: D flip-flops (D, Q) hold S(t)
Copyright 2001, John W. Lockwood, All Rights Reserved
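
The difference between the Mealy and Moore output rules can be seen in a short behavioral model. The Python sketch below is illustrative only: `run_fsm` and the rising-edge-detector example are hypothetical and not part of the original slides.

```python
def run_fsm(next_state, output, inputs, state, mealy):
    """Step a state machine over a list of inputs, one input per clock."""
    outputs = []
    for x in inputs:
        if mealy:
            outputs.append(output(x, state))   # Mealy: Z depends on X and S(t)
        else:
            outputs.append(output(state))      # Moore: Z depends on S(t) only
        state = next_state(x, state)           # S(t+1) computed from X and S(t)
    return outputs


# Example machine: the state remembers the previous 1-bit input.
def remember_input(x, s):
    return x


def mealy_rising_edge(x, s):
    return int(x == 1 and s == 0)   # reacts in the same cycle as the input


def moore_delayed(s):
    return s                        # sees the input one cycle later
```

With the input sequence 0, 1, 1, 0, 1 and initial state 0, the Mealy output flags each rising edge in the cycle it occurs, while the Moore output simply replays the input one clock later.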
Synthesizable Verilog : Delay Flip/Flops
D-type flip flop
reg q;
always @ (posedge clk)
  q <= d;

D-type flip flop with data enable
reg q;
always @ (posedge clk)
  if (enable)
    q <= d;

From: http://eesun.free.fr/DOC/VERILOG/synvlg.html
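
For readers more comfortable with software, the behavior of the two flip-flops can be mimicked with a tiny Python model. This is only a sketch of the clocked behavior, with hypothetical names `DFlipFlop` and `trace`; it does not model timing or synthesis.

```python
class DFlipFlop:
    """Behavioral model of a D flip-flop with an optional data enable."""

    def __init__(self, initial="S0"):
        self.q = initial            # value held before the first clock event

    def posedge(self, d, enable=True):
        """Rising clock edge: capture d only when enable is high."""
        if enable:
            self.q = d
        return self.q


def trace(ff, din_values):
    """Record Dout during each cycle, with Din captured at the cycle's end."""
    seen = []
    for d in din_values:
        seen.append(ff.q)           # Dout still shows the previously stored value
        ff.posedge(d)               # the edge transfers Din to Dout
    return seen
```

Feeding Din = A, B, C into a flip-flop that starts at S0 gives Dout = S0, A, B, i.e. the output lags the input by one clock event; with enable low the stored value is simply kept.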
More on NetFPGA System
NetFPGA System
User Space: CAD Tools, Monitor Software, Web & Video Server, Browser & Video Client
Linux Kernel: Packet Forwarding Table
PCI: NetFPGA Router Hardware, 4 x VI, 4 x GE (nf2c0 .. 3)
PCI-e: NIC, 2 x GE (eth1 .. 2)
NetFPGA System Implementation
NetFPGA Blocks
Virtex-2 Pro FPGA
4.5MB ZBT SRAM
64MB DDR2 DRAM
PCI Host Interface
4 Gigabit Ethernet ports

Intranet Test Ports
Dual or Quad Gigabit Ethernets on PCI-e

Internet
Gigabit Ethernet on Motherboard

Processor
Dual-Core CPU

Operating System
Linux CentOS 4.4
NetFPGA Lab Setup

Server: CPU x2, CAD Tools, NetFPGA Control SW
PCI-e Dual NIC (eth1 .. 2)
  GE Eth2 : Server (link to Client)
  GE Eth1 : Local host
PCI NetFPGA Router Hardware
  GE Nf2c3 : Adj. Server
  GE Nf2c2 : Local Host
  GE Nf2c1 : Adjacent
  GE Nf2c0 : Adjacent
Internet
Exception Path
Exception Packet
Example: TTL = 0 or TTL = 1
Packet has to be sent to the CPU, which will generate an ICMP packet as a response
Difference starts at the Output Port Lookup stage
Exception Packet Path
Software: nf2c0 nf2c1 nf2c2 nf2c3, ioctl
PCI Bus
nf2_reg_grp
CPU RxQ and CPU TxQ (x4)
NetFPGA user data path
MAC TxQ and MAC RxQ (x4)
Ethernet
Output Port Lookup
1- Check input port matches Dst MAC
2- Check TTL, checksum: EXCEPTION!
3- Add output port module

Ctrl   Data
0x04   output port = 1
0xff   Pkt length, input port = 0
0      EthHdr: Dst MAC = 0, Src MAC = x, Ethertype = IP
0      IP Hdr: IP Dst: 192.168.2.3, TTL: 1, Csum: 0x3ab4
0      Data
Output Queues
Output queues OQ0 .. OQ7; the packet (output port = 1) goes to OQ1
CPU Tx Queue

Ctrl   Data
0x04   output port = 1
0xff   Pkt length, input port = 0
0      EthHdr: Dst MAC = 0, Src MAC = x, Ethertype = IP
0      IP Hdr: IP Dst: 192.168.2.3, TTL: 1, Csum: 0x3ab4
0      Data
ICMP Packet
For the ICMP packet, the packet arrives at the CPU Rx Queue from the PCI Bus
It follows the same path as a packet from the MAC until the Output Port Lookup.
The OPL module, seeing that the packet is from CPU Rx Queue 1, sets the output port directly to 0.
The packet then continues on the same path as the non-exception packet to the Output Queues and then MAC Tx queue 0.
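
The three cases the Output Port Lookup distinguishes (normal forwarding, exception to the CPU, and a packet coming back from the CPU) can be summarized in a few lines of Python. The queue numbering used here is an assumption inferred from the numbers on the slides (even index = MAC queue, the next odd index = the CPU queue of the same port); `choose_output_port` and `is_cpu_queue` are hypothetical names.

```python
def is_cpu_queue(port):
    """Assumed numbering: 0, 2, 4, 6 are MAC queues; 1, 3, 5, 7 are CPU queues."""
    return port % 2 == 1


def choose_output_port(input_port, ttl, lpm_port):
    """Pick the output queue index for one packet.

    lpm_port is the result of the normal forwarding lookup (unused on the
    exception and CPU-originated paths).
    """
    if is_cpu_queue(input_port):
        # Packet generated by software (e.g. the ICMP reply):
        # send it straight out of the MAC paired with this CPU queue.
        return input_port - 1
    if ttl <= 1:
        # Exception: hand the packet to the CPU queue of the arrival port.
        return input_port + 1
    return lpm_port
```

With these assumptions, a packet arriving on input port 0 with TTL 1 goes to output port 1 (the CPU), the ICMP reply entering on CPU queue 1 leaves on output port 0, and a packet with TTL 64 takes whatever port the lookup returned.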
ICMP Packet Path
Software: nf2c0 nf2c1 nf2c2 nf2c3, ioctl
PCI Bus
nf2_reg_grp
CPU RxQ and CPU TxQ (x4)
NetFPGA user data path
MAC TxQ and MAC RxQ (x4)
Ethernet
NetFPGA-Host Interaction
NetFPGA to host packet transfer
1. Packet arrives; forwarding table sends it to CPU queue
2. Interrupt notifies driver of packet arrival (over the PCI Bus)
3. Driver sets up and initiates DMA transfer
NetFPGA-Host Interaction
NetFPGA to host packet transfer (cont)
4. NetFPGA transfers packet via DMA (over the PCI Bus)
5. Interrupt signals completion of DMA
6. Driver passes packet to network stack
NetFPGA-Host Interaction
Host to NetFPGA packet transfers
1. Software sends packet via network sockets. Packet delivered to driver.
2. Driver sets up and initiates DMA transfer (over the PCI Bus)
3. Interrupt signals completion of DMA
