Floating Point DSPs by Bhaskar

Part - 7
Floating point DSP Architecture

(TMS320C3X)
Key features of TMS320C3X floating point processor
• High-performance CMOS 32-bit floating-point processor
• Maximum performance: 60 MFLOPS (floating-point operations) and 30 MIPS (fixed-point operations)
• CPU – 40-bit floating point / 32-bit fixed point
• ALU – 40-bit floating point / 32-bit fixed point
• No accumulators; eight general-purpose registers (R0-R7), each 40 bits wide
• 32-bit barrel shifter
• Two ARAUs, with 8 auxiliary registers (ARs) and two index registers (IR0 & IR1)
• Modified Harvard architecture
  Program bus (PB) – PAB: 24 bit, PDB: 32 bit
  Data bus (DB) – DAB1 & DAB2: each 24 bit, DDB: 32 bit
  Peripheral bus / DMA bus – DMA AB: 24 bit, DMA DB: 32 bit
Dr. M. Bhaskar, Professor, ECE, NIT, Trichy-15
Key features of TMS320C3X floating point processor cont…
• Memory: 2^24 = 16M x 32-bit words
• 4K x 32-bit on-chip ROM
• Two blocks of 1K x 32-bit DARAM (2K x 32-bit DARAM in total)
• 64-word (64 x 32-bit) program cache
• The `C3X family of processors: `C30, `C31 and `C32
• The architecture of the processors is the same, but the speed of operation of each processor in the family is different
• On-chip peripherals
  Clock generator
  Timer
  Serial port
  Parallel port
  DMA controller – with on-chip DMA bus
TMS320C3X family
Device | Frequency (MHz) / cycle time (ns) | On-chip memory (words) | Off-chip memory (words) | Peripherals
`C30 (5V) | 27/75, 33/60, 40/50, 50/40 | RAM = 2K, ROM = 4K, Cache = 64 | 16M x 32, 8K x 32 | Serial ports = 2, DMA channels = 1, Timers = 2
`C31 (5V) | 27/75, 33/60, 40/50, 50/40, 60/33 | RAM = 2K, ROM = boot loader, Cache = 64 | 16M x 32 | Serial ports = 1, DMA channels = 1, Timers = 2
`LC31 (3.3V) | 33/60, 40/50 | RAM = 2K, ROM = boot loader, Cache = 64 | 16M x 32 | Serial ports = 1, DMA channels = 1, Timers = 2
`C32 (5V) | 40/50, 50/40, 60/33 | RAM = 512, ROM = boot loader, Cache = 64 | 16M x 32/16/8 | Serial ports = 1, DMA channels = 2, Timers = 2
Floating point CPU `C3X
• Floating-point/integer multiplier
The multiplier performs single-cycle multiplications on 32-bit floating-point and 24-bit integer values; the results are 40-bit and 32-bit respectively.
• Arithmetic logic unit (ALU)
The ALU performs single-cycle operations on 32-bit integer, 32-bit logical, and 40-bit floating-point data (inputs: 24-bit integer and 32-bit floating point).
• 32-bit barrel shifter
The barrel shifter shifts up to 32 bits left or right in a single cycle.
• Internal buses (CPU1/CPU2 and REG1/REG2)
Two 32-bit CPU buses (CPU1, CPU2) and two 40-bit register file buses (REG1, REG2).
• Auxiliary register arithmetic units (ARAUs)
Two auxiliary register arithmetic units (ARAU0 and ARAU1) can generate two addresses in a single cycle. They support addressing with displacements, index registers (IR0 and IR1), and circular and bit-reversed addressing.
• CPU register file
`C3X CPU Register file
The `C3x provides 28 registers in a multiport register file that is tightly coupled to the CPU.
• Extended-precision registers (R7-R0) – 40 bit (all other registers are 32 bit)
  Registers R7-R0 can be operated upon by the multiplier and ALU and can be used as general-purpose registers.
• Auxiliary registers (AR7-AR0)
• Index registers (IR1 and IR0)
• Block size register (BK)
  The auxiliary, index and block size registers are used for the indirect addressing mode.
• Data page pointer (DP)
  Used for the direct addressing mode
• System stack pointer (SP)
  Used for stack management
• Status register (ST)
  Used for system functions
• CPU/DMA interrupt-enable register (IE)
• CPU interrupt flag register (IF)
  Used for interrupts
• I/O flag register (IOF)
  Used for I/O activity
• Repeat start-address register (RS)
• Repeat end-address register (RE)
• Repeat count register (RC)
  Used for repeat operations
`C3X Memory and Buses
• The total memory space is 16M (million) 32-bit words.
• Program, data, and I/O space are contained within this 16M-word address space (unified memory space).
• Tables, coefficients, program code, or data can be stored in either RAM or ROM.
On-chip memory
• 4K words (32-bit) of ROM
• Two blocks of 1K words (32-bit) of DARAM
• The remaining space is external RAM
• 64 words (32-bit) of program cache
Internal Buses
• All address buses are 24 bits wide and all data buses are 32 bits wide
• Program buses: PADDR and PDATA
• Data buses: DADDR1, DADDR2, and DDATA
• DMA buses: DMAADDR and DMADATA
• External bus: external address bus and external data bus

`C3X Instruction/Program cache
• The instruction cache is useful if the device repeatedly accesses a block of code in a time-critical section.
• The cache stores a section of code/instructions.
• The cache operates automatically with no user intervention.
• The cache reduces the number of off-chip accesses and frees the external bus from program fetches so that it can be used by the DMA.
The instruction cache contains 64 x 32-bit words of RAM. It is divided into two 32-word segments (Segment 0 and Segment 1). A 19-bit segment start address (SSA) register is associated with each segment. For each word in the cache, there is a corresponding single-bit present (P) flag.
Cache Control Bits
Cache Clear Bit (CC), Cache Enable Bit (CE) and Cache Freeze Bit (CF).
Cache Algorithm
Cache hit – the cache contains the instruction word (IW); the IW is read from the cache and the LRU stack is updated.
Cache miss – the cache does not contain the IW. There are two types of cache miss:
Sub-segment miss – an SSA register matches, but the P flag is not set: the IW is copied from memory to the cache, the P flag is set and the LRU stack is updated.
Segment miss – neither SSA register matches: the SSA register of the least recently used segment is loaded with the 19 MSBs of the program address (PA), the IW is copied from memory to the cache, its P flag is set and the LRU stack is updated.
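The hit / sub-segment miss / segment miss cases can be sketched as a small Python model. This is an illustration only, not the hardware implementation: the class and method names are hypothetical, the 5 LSBs of the 24-bit program address are assumed to select the word within a segment, and clearing all P flags of a replaced segment is an assumption labeled in the code.

```python
class InstructionCache:
    """Toy model of a 64-word cache split into two 32-word segments."""

    def __init__(self):
        self.ssa = [None, None]                           # 19-bit segment start addresses
        self.present = [[False] * 32 for _ in range(2)]   # one P flag per cached word
        self.lru = [0, 1]                                 # lru[0] is the most recently used segment

    def _touch(self, seg):
        # Move the segment to the top of the LRU stack.
        self.lru.remove(seg)
        self.lru.insert(0, seg)

    def fetch(self, pa):
        """Classify an instruction fetch from 24-bit program address pa."""
        tag, word = pa >> 5, pa & 0x1F                    # 19 MSBs and 5 LSBs of the address
        for seg in (0, 1):
            if self.ssa[seg] == tag:
                kind = "hit" if self.present[seg][word] else "subsegment miss"
                self.present[seg][word] = True            # word is now cached
                self._touch(seg)
                return kind
        seg = self.lru[-1]                                # replace the least recently used segment
        self.ssa[seg] = tag
        self.present[seg] = [False] * 32                  # assumption: old P flags are cleared
        self.present[seg][word] = True
        self._touch(seg)
        return "segment miss"
```

Fetching the same address twice yields a segment miss followed by a hit; fetching its neighbour in the same 32-word segment yields a sub-segment miss.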
Floating point processor – Data formats
Signed Integer Formats
Short-integer format – 16 bit
It is a 16-bit 2s-complement integer format. This format can be sign-extended to 32 bits.
The range of an integer si in the short-integer format is −2^15 ≤ si ≤ 2^15 − 1.
Single-precision integer format – 32 bit
It is a 32-bit integer represented in 2s-complement notation.
The range of an integer sp in the single-precision integer format is −2^31 ≤ sp ≤ 2^31 − 1.
Unsigned-Integer Formats
Short unsigned-integer format – 16 bit
It is a 16-bit integer format. This format can be zero-filled to 32 bits.
The range of an integer si in the short unsigned-integer format is 0 ≤ si ≤ 2^16 − 1.

Floating point processor – Data formats cont…
Single-precision unsigned integer format – 32 bit
The number is represented as a 32-bit value.
The range of an integer sp in the single-precision unsigned integer format is 0 ≤ sp ≤ 2^32 − 1.
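As a quick check of the integer formats above, the following Python sketch computes the four ranges and shows sign extension and zero fill of a 16-bit value. The function and dictionary names are hypothetical and chosen only for this illustration.

```python
def sign_extend16(x):
    """Interpret the low 16 bits of x as a 2s-complement short integer."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def zero_fill16(x):
    """Interpret the low 16 bits of x as a short unsigned integer."""
    return x & 0xFFFF


# (minimum, maximum) for each integer format
RANGES = {
    "short integer": (-2**15, 2**15 - 1),
    "single-precision integer": (-2**31, 2**31 - 1),
    "short unsigned integer": (0, 2**16 - 1),
    "single-precision unsigned integer": (0, 2**32 - 1),
}
```

The same 16-bit pattern FFFFh is −1 when sign-extended and 65535 when zero-filled.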
Floating-Point Formats
Floating-point formats consist of an exponent field (e) and a mantissa field (man). The mantissa field has a single-bit sign field (s) and a fraction field (f).
The exponent field is a 2s-complement number that determines the power of 2 by which the mantissa is multiplied.
The mantissa is a normalized 2s-complement number. In the normalized form, the most significant non-sign bit is implied rather than stored.

Short floating-point format
The floating-point number is represented by a 2s-complement 4-bit exponent field (e) and a 2s-complement 12-bit mantissa field (man) with an implied most significant non-sign bit (16 bits in total).

Single-Precision Floating-Point Format
The floating-point number is represented by an 8-bit exponent field (e) and a 2s-complement 24-bit mantissa field (man) with an implied most significant non-sign bit (32 bits in total).

Extended-Precision Floating-Point Format
The floating-point number is represented by an 8-bit exponent field (e) and a 32-bit mantissa field (man) with an implied most significant non-sign bit (40 bits in total).
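To make the single-precision format concrete, here is a minimal Python decoder. It assumes the bit layout used in the TI documentation (exponent in bits 31-24, sign in bit 23, fraction in bits 22-0) and the convention that the most negative exponent (−128) encodes zero. Neither detail is stated on the slide, and the function name is hypothetical.

```python
def decode_c3x_single(word):
    """Decode a 32-bit word, assuming the layout e[31:24], s[23], f[22:0]."""
    e = (word >> 24) & 0xFF
    if e & 0x80:
        e -= 0x100                    # exponent is a 2s-complement number
    s = (word >> 23) & 1
    f = word & 0x7FFFFF
    if e == -128:
        return 0.0                    # assumption: reserved exponent represents zero
    # The implied non-sign bit is the complement of the sign bit:
    # the mantissa reads 01.f when positive and 10.f when negative.
    mantissa = (-2.0 if s else 1.0) + f / 2**23
    return mantissa * 2.0**e
```

Under these assumptions the all-zero word decodes to 1.0 (mantissa 01.0, exponent 0), which is one visible difference from IEEE 754.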
Floating-Point to Integer Conversion (FIX Instruction)
The FIX instruction converts an extended-precision floating-point number to a single-precision integer in a single cycle.
Integer to Floating-Point Conversion (FLOAT Instruction)
The FLOAT instruction converts single-precision integers to extended-precision floating-point numbers.
`C3X Instruction word format
The 16 MSBs form the opcode field; the 16 LSBs form the operand field.

Floating point processor Instructions
Floating point processors support floating-point operations as well as fixed-point operations.
Floating-point operand instructions
Syntax: ADDF src, dst – Add floating-point operand (dst + src → dst)
SUBF, MPYF, LDF, STF, ABSF, NEGF, PUSHF, POPF
Fixed-point operand instructions
Syntax: ADDI src, dst – Add fixed-point operand (dst + src → dst)
SUBI, MPYI, AND, OR, XOR, NEGI, LDI, STI, ABSI, ASH, LSH, PUSH, POP
Three-operand instructions (both floating-point and fixed-point)
Syntax (floating-point instruction):
ADDF3 src2, src1, dst – Add floating-point three operand (src1 + src2 → dst)
SUBF3, MPYF3
Syntax (fixed-point instruction):
ADDI3 src2, src1, dst – Add fixed-point three operand (src1 + src2 → dst)
SUBI3, MPYI3, AND3, OR3, XOR3, LSH3, ASH3
Floating point processor Addressing Modes
• Register Addressing
• Immediate Addressing
• Direct Addressing
• Indirect Addressing
  • General indirect addressing
  • Indexed addressing
  • Circular addressing mode
  • Bit-reversed addressing mode
• PC-Relative Addressing

Floating point processor Addressing Modes cont…
Register Addressing
A CPU register contains the operand; both the source and the destination operand can be registers.
Syntax : ADDF R1,R2 – the floating-point contents of R1 and R2 are added, result in R2
SUBI R2,R1 – the fixed-point content of R2 is subtracted from R1, result in R1
MPYI3 R1,R2,R3 – the fixed-point contents of R1 and R2 are multiplied, result in R3
Immediate Addressing
A 16-bit operand uses the short-immediate addressing mode; an operand wider than 16 bits, up to 24 bits, uses the long-immediate addressing mode, which is available in instructions such as BR and CALL. No symbol is required for the operand.
Syntax : ADDI 1000h,R2 – immediate operand 1000h is added to R2, result in R2 (short immediate)
BR 11111h – branch to the 24-bit address 11111h (long immediate)
Direct Addressing
The `C3X memory space is divided into 256 pages of 64K words each (64K x 32 bit).
The data address is formed by concatenating the eight LSBs of the data-page pointer (DP) with the 16 LSBs of the instruction word.
The symbol @ specifies a direct operand. Assume DP = 20h.
Syntax : ADDF @1200h,R2 – the floating-point content of page 32, location 1200h is added to register R2, result in R2
ADDI @0200h,R3 – the fixed-point content of page 32, location 0200h is added to register R3, result in R3
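The address formation for direct addressing can be checked with a few lines of Python. This is a sketch of the concatenation described above; the function name is hypothetical.

```python
def direct_address(dp, offset16):
    """Concatenate the 8 LSBs of DP with the 16 LSBs of the instruction word."""
    return ((dp & 0xFF) << 16) | (offset16 & 0xFFFF)


# DP = 20h and operand @1200h, as in the ADDF example
address = direct_address(0x20, 0x1200)
page = address >> 16          # upper 8 bits select one of the 256 pages
```

With DP = 20h the operand @1200h resolves to the 24-bit address 201200h, that is, location 1200h in page 32.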
Indirect Addressing
Indirect addressing mode instruction format
• 16 MSBs: opcode
• 16 LSBs: operand for the indirect addressing mode
  • Bits 15-11 (5 bits), MOD: address modification field
  • Bits 10-8 (3 bits), ARn: auxiliary register field
  • Bits 7-0 (8 bits), disp: address displacement field
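The field boundaries listed above can be illustrated by unpacking a 16-bit operand in Python. The function name is hypothetical, and the MOD value used in the example is arbitrary: the mapping from MOD codes to modification types is not given here and is not assumed.

```python
def decode_indirect_operand(operand16):
    """Split the 16-bit indirect operand into its MOD, ARn and disp fields."""
    return {
        "mod": (operand16 >> 11) & 0x1F,   # bits 15-11: address modification
        "arn": (operand16 >> 8) & 0x7,     # bits 10-8: auxiliary register number
        "disp": operand16 & 0xFF,          # bits 7-0: unsigned displacement
    }


# Example operand built from arbitrary field values: MOD = 00010b, AR2, disp = 6
operand = (0b00010 << 11) | (2 << 8) | 6
fields = decode_indirect_operand(operand)
```

Because disp is an 8-bit field, the largest displacement that can be encoded directly in the instruction is 255.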
General Indirect addressing mode
The current content of ARn is used as the data memory address (dma). If the address displacement fits the 8-bit displacement field (0 to 255 dma locations, added to or subtracted from ARn), the general indirect addressing mode is used.
Syntax:
LDI 1200h,AR2 – load the address into AR2
ADDI *AR2(0),R0 – access the data using the content of AR2 as dma with zero displacement
ADDI *+AR2(6),R0 – access the data using the content of AR2 + disp (1206h) as dma

(Figure: general indirect addressing mode)
Indexed addressing mode

• The current content of ARn is used as the data memory address (dma).
• If the address displacement exceeds 255 dma locations, the indexed addressing mode is used.
• The content of an index register specifies the displacement value.
• Either IR0 or IR1 can be used for indexed addressing.

(Figure: indexed addressing mode)
Circular Addressing
Many DSP algorithms, such as convolution and correlation, require a circular buffer in memory.
In convolution and correlation, the circular buffer acts as a sliding window that contains the most recent data to process.
As new data is brought in, it overwrites the oldest data, and the pointer to the data advances through the buffer in counter-clockwise fashion.
When the pointer reaches the end of the buffer, the device sets the pointer back to the beginning of the buffer.
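The sliding-window behaviour can be sketched in Python with a wrap-around pointer. This models the idea only: it uses a plain modulo operation and does not model the BK register or any buffer alignment rules of the hardware. The class and function names are hypothetical.

```python
class CircularBuffer:
    """Fixed-size buffer whose write pointer wraps back to the start."""

    def __init__(self, size):
        self.data = [0.0] * size
        self.ptr = 0                       # position of the oldest sample

    def push(self, sample):
        self.data[self.ptr] = sample       # the newest sample overwrites the oldest
        self.ptr = (self.ptr + 1) % len(self.data)

    def window(self):
        """Return the samples ordered from oldest to newest."""
        return self.data[self.ptr:] + self.data[:self.ptr]


def fir_step(buf, coeffs, sample):
    """One output of an FIR filter (convolution) over the sliding window."""
    buf.push(sample)
    newest_first = buf.window()[::-1]
    return sum(c * x for c, x in zip(coeffs, newest_first))
```

Feeding samples 1, 2, 3, 4 into a three-entry buffer leaves the storage as [4, 2, 3]: the fourth sample has wrapped around and replaced the first.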
(Figure: circular buffer hardware)
(Figure: syntax for a circular buffer of 256 locations)
(Figure: syntax for a circular buffer of more than 256 locations, using an index register)
(Figure: syntax for bit-reversed addressing mode)

Parallel addressing mode

Some of the ’C3x instructions can occur in pairs that are executed in parallel:
• Parallel arithmetic with store instructions
• Parallel load instructions
• Parallel multiply and add/subtract instructions

Status register (ST)
The status register (ST) contains global information about the state of the CPU.
Operations usually set the condition flags of the status register according to whether the result is 0, negative, etc.
C    Carry flag
V    Overflow flag
Z    Zero flag
N    Negative flag
UF   Floating-point underflow flag
LV   Latched overflow flag
LUF  Latched floating-point underflow flag
OVM  Overflow mode flag
RM   Repeat mode flag
CE   Cache enable
CF   Cache freeze
CC   Cache clear
GIE  Global interrupt-enable
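How the C, V, Z and N flags follow from a result can be illustrated with a generic 32-bit integer addition in Python. This is a textbook model under the usual 2s-complement definitions, not a verified description of the ’C3x (it ignores, for example, the effect of the OVM bit). The function name is hypothetical.

```python
def add32_flags(a, b):
    """Add two 32-bit values and derive the C, V, Z and N condition flags."""
    a &= 0xFFFFFFFF
    b &= 0xFFFFFFFF
    total = a + b
    result = total & 0xFFFFFFFF
    sign = 0x80000000
    flags = {
        "C": total > 0xFFFFFFFF,                            # carry out of bit 31
        "V": bool(~(a ^ b) & (a ^ result) & sign),          # same-sign inputs, different-sign result
        "Z": result == 0,                                   # result is zero
        "N": bool(result & sign),                           # result is negative
    }
    return result, flags
```

For example, 7FFFFFFFh + 1 sets V and N (signed overflow into the sign bit), while FFFFFFFFh + 1 sets C and Z.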

Conditional operations
Logical Conditions in `C3X

The ’C3x provides 20 logical conditions that can be used in the conditional field of any of the conditional instructions.
a) U – Unconditional
b) Based on a compare with the destination register
   Unsigned compares                 Signed compares
   LO – Lower than                   LT – Less than
   LS – Lower than or same as        LE – Less than or equal to
   HI – Higher than                  GT – Greater than
   HS – Higher than or same as       GE – Greater than or equal to
   EQ – Equal to                     EQ – Equal to
   NE – Not equal to                 NE – Not equal to
c) Based on flags
   NN – Non-negative     NUF – No underflow          NLUF – No latched floating-point underflow
   N – Negative          UF – Underflow              LUF – Latched floating-point underflow
   NZ – Nonzero          NC – No carry               ZUF – Zero or floating-point underflow
   Z – Zero              C – Carry
   NV – No overflow      NLV – No latched overflow
   V – Overflow          LV – Latched overflow

Conditional operations cont…
Unconditional instructions
BR, CALL and RET
Conditional instructions
Bcond, CALLcond, RETcond, LDFcond, LDIcond, TRAPcond
Delayed branch instructions
BRD, BcondD
Decrement-and-branch conditional instructions
The specified auxiliary register is decremented, and a branch is performed if the condition is true and the specified auxiliary register is greater than or equal to 0.
DBcond, DBcondD
No multi-conditional instructions
No execute-conditional instruction

Interrupts
The ’C3x supports multiple internal and external interrupts, which can be used for a variety of applications.
(Figure: interrupt locations)
The interrupts INT0 – TINT1 can be used either by the CPU or by the DMA controller.

Interrupts cont…
CPU/DMA Interrupt Enable register (IE)
Interrupt Flag register (IF)
Interrupt-trap table pointer (ITTP): allows the relocation of the interrupt and trap vector tables.
Generation of interrupt or trap vector address
The ITTP bit field dictates the starting location (base) of the interrupt-trap vector table. This base address is formed by left-shifting the value of the ITTP bit field by eight bits. The shifted value is called the effective base address and is referenced as EA[ITTP].
The location of an interrupt or trap vector is the sum of the effective base address (EA[ITTP]) and the offset of that vector in the interrupt-trap vector table.
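The vector address computation can be written out in Python as a sketch. The function name is hypothetical, and the ITTP value and offset in the example are arbitrary illustrative numbers rather than entries from the device's vector table. Masking the result to 24 bits is an assumption based on the 24-bit address bus.

```python
def vector_address(ittp, offset):
    """Return EA[ITTP] + offset, where EA[ITTP] is ITTP shifted left by 8 bits."""
    effective_base = (ittp << 8) & 0xFFFFFF    # assumption: addresses are 24 bits wide
    return (effective_base + offset) & 0xFFFFFF


# Illustrative values only: ITTP = 8800h, vector offset = 01h
address = vector_address(0x8800, 0x01)
```

Because of the eight-bit shift, the vector table can only start on a 256-word boundary.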
Repeat Operations
The repeat modes of the ’C3x can implement zero-overhead looping.

Repeat Instructions
RPTS (repeat a single instruction)
RPTS fetches a single instruction once and then repeats its execution a number of times.
Since the instruction is fetched only once, bus traffic is minimized.
RPTB (repeat a block of code)
RPTB repeats execution of a block of code a specified number of times.
Repeat-Mode Registers
RS Repeat start-address register.
Holds the address of the first instruction of the block of code to be repeated.
RE Repeat end-address register.
Holds the address of the last instruction of the block of code to be repeated.
RC Repeat-counter register.
Contains 1 less than the number of times the block remains to be repeated.
For example, to execute a block n times, load n − 1 into RC.
Repeat-Mode Control Bits
RM bit.
The repeat-mode (RM) flag bit in the status register specifies whether the processor is
running in the repeat mode
S bit.
Single instruction repeat bit. The S bit is internal to the processor and cannot be
programmed
Repeat Operations cont…
Repeat mode operation

• The repeat modes compare the contents of the RE register (repeat end-address register) with the PC after the execution of each instruction.
• If they match and the repeat counter (RC) is nonnegative, the RC is decremented.
• The PC is loaded with the repeat start address (RS), and processing continues.
Repeat a single instruction
Syntax:      RPTS 9                – repeat the next instruction 10 times
             ADDI *AR2++(10),R1
Repeat a block of code
Syntax:      LDI 9,RC              – load the repeat count of the block into the repeat-count register
             RPTB loop             – repeat the block 10 times
             Ins 1
             Ins 2
             -
             -
      loop   Ins N
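The RS/RE/RC bookkeeping behind RPTB can be sketched in Python. This is a simplified behavioural model: list indices stand in for program addresses, and the exact point at which the hardware tests and decrements RC is not modeled. The function name is hypothetical.

```python
def run_rptb(block, n_times):
    """Execute the callables in block n_times using RS, RE and RC."""
    rs, re = 0, len(block) - 1       # repeat start and end "addresses"
    rc = n_times - 1                 # RC is loaded with one less than the repeat count
    pc = rs
    executed = 0
    while True:
        block[pc]()                  # execute the instruction at PC
        executed += 1
        if pc == re:                 # PC matches the repeat end address
            rc -= 1
            if rc < 0:
                break                # counter exhausted: leave the repeat mode
            pc = rs                  # jump back without a branch instruction
        else:
            pc += 1
    return executed
```

Loading RC with 9 runs the block 10 times, which matches the LDI 9,RC example, and no branch instruction is spent on closing the loop.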
Repeat-Mode Restrictions
Rule 1: The last instruction in the block cannot be a Bcond, BR, DBcond, CALL,
CALLcond, TRAPcond, RET cond, IDLE, RPTB, or RPTS
Rule 2: None of the last four instructions from the bottom of the block can be a BcondD,
BRD, or DBcondD
Power down modes
The low-power control instruction group consists of three instructions that affect the low-power modes.
IDLE and IDLE2
These low-power idle instructions put the device into an extremely low-power mode.
LOPOWER
The device continues to execute instructions, but at the reduced rate of the CLKIN frequency divided by 16.
MAXSPEED
The restore-clock-to-regular-speed (MAXSPEED) instruction causes the resumption of full-speed operation.

Memory map
• The ’C3x accesses a total memory space of 16M (million) 32-bit words of program, data, and I/O space and allows tables, coefficients, program code, or data to be stored in either RAM or ROM.
• The on-chip ROM is configured through one external pin (MC/MP):
  0 – Microprocessor mode
  1 – Microcomputer mode
• In microprocessor mode, the 4K on-chip ROM is not mapped into the ’C3x memory map; in microcomputer mode it is mapped.

Pipeline operation
Pipeline phases
Fetch – Fetches the instruction words from memory and updates the program counter (PC).
Decode – Decodes the instruction word and performs address generation. The decode unit also controls modification of the ARn registers in the indirect addressing mode, and of the stack pointer when a PUSH to or POP from the stack occurs.
Read – If required, reads the operands from memory.
Execute – If required, reads the operands from the register file, performs the necessary operation, and writes results to the register file. If required, results of previous operations are written to memory.
DMA – Direct memory access activity through the DMA bus
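The overlap of the four CPU phases can be visualized with a short Python sketch that builds the stage occupancy per cycle for an ideal pipeline. It assumes one instruction enters per cycle and that no conflicts occur. The names are hypothetical, and DMA activity is left out.

```python
STAGES = ["Fetch", "Decode", "Read", "Execute"]


def pipeline_table(n_instructions):
    """Return, for each cycle, which instruction occupies which stage."""
    n_cycles = n_instructions + len(STAGES) - 1
    table = []
    for cycle in range(n_cycles):
        row = {}
        for depth, stage in enumerate(STAGES):
            i = cycle - depth                  # instruction index in this stage
            if 0 <= i < n_instructions:
                row[stage] = f"I{i + 1}"
        table.append(row)
    return table


for cycle, row in enumerate(pipeline_table(3), start=1):
    print(cycle, row)
```

With three instructions the model needs six cycles in total; once the pipeline is full, one instruction completes its execute phase per cycle.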
Pipeline Conflicts
Pipeline conflicts in the ’C3x can be grouped into the following categories:
Branch conflicts
Branch conflicts involve most of those instructions or operations that read and/or modify
the PC.
Register conflicts
Register conflicts involve delays that can occur when reading from, or writing to,
registers that are used for address generation.
Memory conflicts
Memory conflicts occur when the internal units of the ’C3x compete for memory
resources.
On-chip peripherals
• Clock generator (same as `C54X)
• Timer
• Serial port
• Parallel port
• DMA controller
Timers
The ’C3x has two 32-bit general-purpose timer modules.
The timer modules can be used to signal the ’C3x or the external world at specified intervals, or to count external events.
Timer pins
Each timer has one associated pin, the timer clock (TCLK) pin.
This pin can be used as a general-purpose I/O signal, as a timer output, or as an input for an external timer clock.
Timer Registers
Each timer has three memory-mapped registers:
Global-control register – determines the operating mode of the timer
Period register – specifies the timer’s signaling frequency
Counter register – contains the current value of the incrementing counter

Timer
Timer Global-Control Register
FUNC    Controls the function of TCLK. FUNC = 0: TCLK is a general-purpose digital I/O port. FUNC = 1: TCLK is configured as a timer pin.
I/O     Input/output pin. I/O = 0: TCLK is an input pin. I/O = 1: TCLK is configured as an output pin.
DATOUT  Data output
DATIN   Data input
GO      Resets and starts the timer counter.
HLD     Counter hold signal
        GO  HLD  Result
        0   0    All timer operations are held. No reset is performed (reset value).
        0   1    Timer proceeds from state before write.
        1   0    All timer operations are held, including zeroing of the counter. The GO bit is not cleared until the timer is taken out of hold.
        1   1    Timer resets and starts.
C/P     Clock/pulse mode control. C/P = 0: pulse mode. C/P = 1: clock mode.
CLKSRC  Clock source. CLKSRC = 0: external clock. CLKSRC = 1: internal clock.
INV     Inverter control bit
TSTAT   Timer status bit – indicates the status of the timer

Timer
(Figure: timer block diagram and timer timing waveforms)
f(pulse mode) = f(timer clock) / period register
f(clock mode) = f(timer clock) / (2 x period register)
Timer modes of operation
Pulse mode and Clock mode
In both modes, an internal clock source f(timer clock) has a frequency of f(H1)/2, and an externally generated clock source f(timer clock) can have a maximum frequency of f(H1)/2.6.
In pulse mode (C/P = 0), the width of the pulse is 1/f(H1).
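The two frequency formulas can be evaluated with a small Python helper. It assumes the internal clock source, so f(timer clock) = f(H1)/2. The function name and the example H1 frequency are hypothetical.

```python
def timer_output_frequency(f_h1, period, clock_mode):
    """Timer signaling frequency in Hz for the internal clock source."""
    f_timer_clock = f_h1 / 2                        # internal source: f(H1)/2
    divisor = 2 * period if clock_mode else period  # clock mode halves the rate
    return f_timer_clock / divisor


# Illustrative numbers only: f(H1) = 20 MHz, period register = 1000
f_pulse = timer_output_frequency(20e6, 1000, clock_mode=False)
f_clock = timer_output_frequency(20e6, 1000, clock_mode=True)
```

For the same period register value, clock mode produces half the frequency of pulse mode.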
`C3X Serial port
• The ’C30 has two totally independent bidirectional serial ports. Both serial ports are identical, and there is a complementary set of control registers in each one.
• Each serial port can be configured to transfer 8, 16, 24, or 32 bits of data per word simultaneously in both directions.
• The clock for each serial port can originate either internally, through the serial port timer, or externally, through a supplied clock.
Serial port pins
CLKR – Receive clock signal & its complement
CLKX – Transmit clock signal & its complement
FSR – Receive frame synchronization signal & its complement
FSX – Transmit frame synchronization signal & its complement
DR – Receive serial data & its complement
DX – Transmit serial data & its complement
Serial port registers
• Each serial port has eight (8) memory-mapped registers (MMREGs):
GCR – Global-control register
PCR – Port-control registers: two control registers for the six serial I/O pins
TCR – Timer registers: three receive/transmit timer registers (control, counter and period)
DXR – Data-transmit register
DRR – Data-receive register
• In addition, each serial port contains two shift registers:
XSR – Data-transmit shift register
RSR – Data-receive shift register
`C3X Serial port cont…
(Figure: serial port block diagram)
• The global-control register controls the global functions of the serial port and determines the serial-port operating mode.
• Two port-control registers control the functions of the six serial port pins.
• The transmit buffer contains the next complete word to be transmitted.
• The receive buffer contains the last complete word received.
• Three additional registers are associated with the transmit/receive sections of the serial-port timer.

Serial port register formats
(Figures: serial-port global-control register; FSX/DX/CLKX port-control register; FSR/DR/CLKR port-control register; receive/transmit timer-control register; receive/transmit timer-counter register; receive/transmit timer-period register)

Serial port interface
• The ’C3x resets the analog interface circuit (AIC) through the external pin XF0.
• It also generates the master clock for the AIC through the timer 0 output pin, TCLK0.
• In turn, the AIC generates the CLKR0 and CLKX0 shift clocks as well as the FSR0 and FSX0 frame synchronization signals.

Serial Analog-to-Digital (A/D) and Digital-to-Analog (D/A) Interface

• The DSP102 A/D is interfaced to the ’C3x serial-port receive side.
• The DSP202 D/A is interfaced to the transmit side.
• The A/Ds and D/As are hard-wired to run in cascade mode.
• In this mode, when the ’C3x initiates a convert command to the A/D via the TCLK0 pin, both analog inputs are converted into two 16-bit words, which are concatenated to form one 32-bit word.
1. The A/D signals the ’C3x via the A/D’s SYNC signal (connected to the FSR0 pin) that serial data is to be transmitted.
2. The 32-bit word is then serially transmitted, MSB first, out of the SOUTA serial pin of the DSP102 to the DR0 pin of the ’C3x serial port.
3. The ’C3x is programmed to drive the analog interface bit clock from the CLKX0 pin of the ’C3x.
4. The bit clock drives the XCLK input of both the A/D and the D/A.
5. The ’C3x transmit clock also acts as the input clock on the receive side of the ’C3x serial port.
6. Since the receive clock is synchronous to the internal clock of the ’C3x, the receive clock can run at full speed (that is, f(H1)/2).

`C3X Parallel port/External port
Parallel port pins (memory interface)
STRB – Primary interface access strobe
R/W – Memory read (high) / write (low)
HOLD – Hold external memory interface
HOLDA – Hold acknowledge for external memory interface
RDY – Indicates that the external primary interface is ready to be accessed
A(23−0) – Primary address bus. When the address lines are not in the high-impedance state caused by the HOLD signal, they hold the address of the last external primary bus access.
D(31−0) – Primary data bus. These signals go to high impedance between write accesses.

Basics of DMA access
1. A DMA operation transfers a block of data, for example from external memory to on-chip memory, without passing it through the accumulator.
2. One DMA access reads data from the source address location and writes it to the destination address location. It consists of one read and one write activity.
3. A DMA access needs two addresses: a source address and a destination address.
4. A count register is needed to count the number of data words transferred.
DMA Signals
BR – Bus request signal, externally driven low in hold mode to indicate a request for DMA access.
HOLD – External request for control of the address, data, and control lines.
HOLDA – Indication to external circuitry that the memory address, data, and control lines are in high impedance, allowing external access.
IAQ – Acknowledges the BR request for access while HOLDA is low.
Apart from the above signals, the normal parallel port signals are used:
R/W – Read/write signal; indicates the data bus direction for DMA reads (high) and DMA writes (low).
A(15–0) – Address lines
D(15–0) – Data lines
STRB – When IAQ and HOLDA are low, STRB selects the memory access and determines its duration.
`C3X DMA Controller
1. The DMA controller is a programmable peripheral that transfers blocks of data to any location in the memory map without interfering with CPU operation.
2. The ’C3x can interface to slow external memories and peripherals without reducing throughput to the CPU.
3. The ’C3x DMA controller features are:
  • Transfers to and from anywhere in the processor’s memory map, e.g. to and from on-chip memory, off-chip memory, and on-chip serial ports.
  • Concurrent CPU and DMA controller operation, with DMA transfers at the same rate as the CPU (supported by separate internal DMA address and data buses).
  • Source- and destination-address registers with auto increment/decrement.
  • Synchronization of data transfers via external and internal interrupts.
DMA Registers
DMA control register: contains the status and mode information about the associated DMA channel
Source-address register: contains the memory address of the data to be read
Destination-address register: contains the memory address where the data is written
Transfer-counter register: contains the block size to move
`C3X DMA Controller cont…
DMA Registers Initialization
1. The source-address register of a DMA channel is loaded with the address of the memory location to read from.
2. The destination-address register of the same DMA channel is loaded with the address of the memory location to write to.
3. The transfer counter is loaded with the number of words to be transferred.
4. The DMA channel control register is loaded with the appropriate modes to synchronize the DMA controller reads and writes with interrupts.
DMA Start
1. The DMA controller is started through the DMA START field in the DMA channel control register.
Word Transfers
1. The DMA channel reads a word from the address held in the source-address register and writes it to a temporary register within the DMA channel.
2. After a read by the DMA channel, the source-address register is incremented, decremented, or left unchanged.
3. After the read operation completes, the DMA channel writes the temporary register value to the address pointed to by the destination-address register.
4. After the destination address has been fetched, the transfer-counter register is decremented and the destination-address register is incremented, decremented, or left unchanged.
5. The transfer counter is thus decremented once for every data write. The block transfer terminates when the transfer counter reaches zero and the write of the last transfer is completed.
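The initialization and word-transfer steps can be sketched as a Python loop over a list that stands in for memory. This is a behavioural illustration under simplifying assumptions: it ignores bus arbitration and interrupt synchronization, and the function and parameter names are hypothetical.

```python
def dma_block_transfer(memory, src, dst, count, src_step=1, dst_step=1):
    """Copy count words inside memory; steps are +1, -1 or 0 per transfer."""
    while count > 0:
        temp = memory[src]       # read the word into the channel's temporary register
        src += src_step          # source address: increment, decrement or hold
        memory[dst] = temp       # write the temporary register to the destination
        dst += dst_step          # destination address: increment, decrement or hold
        count -= 1               # one decrement of the transfer counter per write
    return src, dst, count


# Copy 4 words from address 0 to address 8 in a 16-word memory
memory = list(range(8)) + [0] * 8
final_src, final_dst, remaining = dma_block_transfer(memory, 0, 8, 4)
```

Setting dst_step to 0 models a transfer into a single fixed location, such as a peripheral data register, while the source address keeps advancing.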
`C3X DMA Controller cont…
1. The DMA destination-address and source-address registers are 24-bit registers.
2. Their contents specify the destination and source addresses.
3. These registers are incremented, decremented, or left unchanged at the end of the corresponding memory access.

End of Part-7
