ARM
1
What Is ARM?
• Advanced RISC Machine
2
ARM Powered Products
3
Features
• Architectural simplicity
which allows
• Very small implementations
which result in
• Very low power consumption
4
The History of ARM
• Developed at Acorn Computers Limited,
of Cambridge, England,
between 1983 and 1985
• Problems with CISC:
• Slower than memory parts
• Many clock cycles per instruction
5
The History of ARM (2)
• Solution – the Berkeley RISC I:
• Competitive
• Easy to develop (less than a year)
• Cheap
• Pointing the way to the future
6
ARM Architecture
• Typical RISC architecture:
• Large uniform register file
• Load/store architecture
• Simple addressing modes
• Uniform and fixed-length instruction fields
7
ARM Architecture (2)
• Enhancements:
• Each instruction controls the ALU and shifter
• Auto-increment
and auto-decrement addressing modes
• Multiple Load/Store
• Conditional execution
8
ARM Architecture (3)
• Results:
• High performance
• Low code size
• Low power consumption
• Low silicon area
9
Pipeline Organization
• Increases speed –
most instructions executed in single cycle
• Versions:
• 3-stage (ARM7TDMI and earlier)
• 5-stage (ARM8, ARM9TDMI)
• 6-stage (ARM10TDMI)
10
Pipeline Organization (2)
• 3-stage pipeline: Fetch – Decode - Execute
• Three-cycle latency,
one instruction per cycle throughput
instruction   cycle:  t        t+1      t+2      t+3      t+4
i                     Fetch    Decode   Execute
i+1                            Fetch    Decode   Execute
i+2                                     Fetch    Decode   Execute
11
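The timing in the diagram can be reproduced with a small model. This is only a sketch of an ideal pipeline with no stalls; the names `schedule`, `total_cycles` and `render` are made up for illustration.

```python
STAGES = ["Fetch", "Decode", "Execute"]

def schedule(num_instructions, stages=STAGES):
    """Map each instruction index to {cycle offset: stage name}."""
    table = {}
    for i in range(num_instructions):
        # Instruction i enters the first stage at cycle i and moves on one stage per cycle.
        table[i] = {i + s: name for s, name in enumerate(stages)}
    return table

def total_cycles(num_instructions, stages=STAGES):
    # Fill the pipeline once, then complete one instruction per cycle.
    return len(stages) + num_instructions - 1

def render(num_instructions, stages=STAGES):
    """Return one text row per instruction, one column per cycle."""
    rows = []
    width = total_cycles(num_instructions, stages)
    for i, row in schedule(num_instructions, stages).items():
        cells = [row.get(c, "") for c in range(width)]
        rows.append(f"i+{i:<3}" + "".join(f"{cell:<9}" for cell in cells).rstrip())
    return rows
```

Calling `render(3)` gives three staggered rows like the diagram: each instruction needs three cycles from fetch to completion, yet three instructions finish in five cycles rather than nine.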
Pipeline Organization (3)
• 5-stage pipeline:
• Reduces work per cycle => allows higher clock frequency
• Separates data and instruction memory => reduction of CPI (average number of clock Cycles Per Instruction)
• Stages:
• Fetch
• Decode
• Execute
• Buffer/data
• Write-back
12
Pipeline Organization (4)
• Pipeline flushed and refilled on branch,
causing execution to slow down
• Special features in instruction set
eliminate small jumps in code
to obtain the best flow through pipeline
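A rough way to see the cost of a flush is to count cycles. The sketch below assumes an ideal pipeline in which every taken branch throws away the instructions in the stages behind Execute, so the penalty is (stages - 1) cycles. That penalty, and the function names, are assumptions for illustration, not figures for a specific ARM core.

```python
def cycles_with_branches(num_instructions, taken_branches, stages=3):
    """Estimate total cycles when each taken branch flushes the pipeline."""
    refill_penalty = stages - 1  # assumption: earlier stages are discarded and refilled
    fill_and_run = stages + (num_instructions - 1)
    return fill_and_run + taken_branches * refill_penalty

def effective_cpi(num_instructions, taken_branches, stages=3):
    """Average clock cycles per instruction under the same assumptions."""
    return cycles_with_branches(num_instructions, taken_branches, stages) / num_instructions
```

With 100 instructions and no taken branches the model gives 102 cycles; with 10 taken branches it gives 122, which is why replacing short jumps with conditionally executed instructions helps.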
13
Operating Modes
• Seven operating modes:
• User
• Privileged:
• System (version 4 and above)
• Exception modes:
• FIQ
• IRQ
• Abort
• Undefined
• Supervisor
14
Operating Modes (2)
User mode:
• Normal program execution mode
• System resources unavailable
Exception modes:
• Entered upon exception
• Full access to system resources
15
Exceptions
Exception               Mode         Priority   IV Address
Reset                   Supervisor   1          0x00000000
Undefined instruction   Undefined    6          0x00000004
Software interrupt      Supervisor   6          0x00000008
Prefetch Abort          Abort        5          0x0000000C
Data Abort              Abort        2          0x00000010
Interrupt               IRQ          4          0x00000018
Fast interrupt          FIQ          3          0x0000001C
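The table can be read as a lookup structure. The sketch below copies its rows into a dictionary and picks the exception to service first when several are pending (priority 1 is highest). The helper names are hypothetical.

```python
# name: (mode entered, priority, vector address), copied from the table
EXCEPTIONS = {
    "Reset":                 ("Supervisor", 1, 0x00000000),
    "Undefined instruction": ("Undefined",  6, 0x00000004),
    "Software interrupt":    ("Supervisor", 6, 0x00000008),
    "Prefetch Abort":        ("Abort",      5, 0x0000000C),
    "Data Abort":            ("Abort",      2, 0x00000010),
    "Interrupt":             ("IRQ",        4, 0x00000018),
    "Fast interrupt":        ("FIQ",        3, 0x0000001C),
}

def highest_priority(pending):
    """Return the pending exception with the smallest priority number."""
    return min(pending, key=lambda name: EXCEPTIONS[name][1])

def vector_address(name):
    """Return the address the PC is loaded with for this exception."""
    return EXCEPTIONS[name][2]
```

For example, if an IRQ, an FIQ and a Data Abort are pending together, `highest_priority` returns "Data Abort" and `vector_address` gives 0x10.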
17
ARM Registers (2)
• Special roles:
• Hardware
• R14 – Link Register (LR):
optionally holds return address
for branch instructions
• R15 – Program Counter (PC)
• Software
• R13 - Stack Pointer (SP)
18
ARM Registers (3)
• Current Program Status Register (CPSR)
• Saved Program Status Register (SPSR)
• On exception, entering mod mode:
• (PC + 4) → LR (R14_mod)
• CPSR → SPSR_mod
• IV address → PC
• R13, R14 replaced by R13_mod, R14_mod
• In case of FIQ mode R8 – R12 also replaced
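The entry sequence can be modelled in a few lines. This is a simplified sketch: the `Cpu` class, its field names and the three-entry vector subset are illustrative assumptions, and details such as the CPSR mode bits and interrupt masking are left out.

```python
VECTORS = {"svc": 0x08, "irq": 0x18, "fiq": 0x1C}  # subset of the vector table

class Cpu:
    def __init__(self):
        self.pc = 0
        self.cpsr = 0
        self.mode = "usr"
        # Banked copies of R14 and one SPSR per exception mode.
        self.lr = {"usr": 0, "svc": 0, "irq": 0, "fiq": 0}
        self.spsr = {"svc": 0, "irq": 0, "fiq": 0}

    def enter_exception(self, mode):
        self.lr[mode] = self.pc + 4   # (PC + 4) -> LR of the new mode
        self.spsr[mode] = self.cpsr   # CPSR -> SPSR_mod
        self.mode = mode              # R13/R14 now refer to the banked copies
        self.pc = VECTORS[mode]       # IV address -> PC
```

After `enter_exception("irq")` from address 0x8000, the model holds 0x8004 in the IRQ link register, the old CPSR in SPSR_irq, and 0x18 in the PC, while the user-mode link register is untouched.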
19
ARM Registers (4)
System & User   FIQ        Supervisor   Abort      IRQ        Undefined
R0              R0         R0           R0         R0         R0
R1              R1         R1           R1         R1         R1
R2              R2         R2           R2         R2         R2
R3              R3         R3           R3         R3         R3
R4              R4         R4           R4         R4         R4
R5              R5         R5           R5         R5         R5
R6              R6         R6           R6         R6         R6
R7              R7         R7           R7         R7         R7
R8              R8_fiq     R8           R8         R8         R8
R9              R9_fiq     R9           R9         R9         R9
R10             R10_fiq    R10          R10        R10        R10
R11             R11_fiq    R11          R11        R11        R11
R12             R12_fiq    R12          R12        R12        R12
R13             R13_fiq    R13_svc      R13_abt    R13_irq    R13_und
R14             R14_fiq    R14_svc      R14_abt    R14_irq    R14_und
R15 (PC)        R15 (PC)   R15 (PC)     R15 (PC)   R15 (PC)   R15 (PC)
CPSR            CPSR       CPSR         CPSR       CPSR       CPSR
-               SPSR_fiq   SPSR_svc     SPSR_abt   SPSR_irq   SPSR_und
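Register banking amounts to a lookup from (mode, register number) to a physical register. The sketch below encodes the ARM rule that FIQ mode has its own R8 to R14 and the other exception modes have their own R13 and R14. The mode abbreviations and function name are chosen for illustration.

```python
MODES = ["usr", "sys", "fiq", "svc", "abt", "irq", "und"]

# Register numbers that have a private copy in each mode.
BANKED = {
    "fiq": set(range(8, 15)),  # R8 to R14
    "svc": {13, 14},
    "abt": {13, 14},
    "irq": {13, 14},
    "und": {13, 14},
}

def physical_register(mode, n):
    """Return the name of the physical register seen as Rn in the given mode."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    if not 0 <= n <= 15:
        raise ValueError(f"no such register: R{n}")
    if n in BANKED.get(mode, ()):
        return f"R{n}_{mode}"
    return f"R{n}"  # shared with User and System mode
```

Enumerating every mode and register number yields 31 distinct general-purpose registers: 16 shared, 7 private to FIQ, and 2 each for the four remaining exception modes.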
20
Instruction Set
• Two instruction sets:
• ARM
• Standard 32-bit instruction set
• THUMB
• 16-bit compressed form
• Code density better than most CISC
• Dynamic decompression in pipeline
21
ARM Instruction Set
• Features:
• Load/Store architecture
• 3-address data processing instructions
• Conditional execution
• Load/Store multiple registers
• Shift & ALU operation in single clock cycle
22
ARM Instruction Set (2)
• Conditional execution:
• Each data processing instruction
postfixed by condition code
• Result – smooth flow of instructions through pipeline
• 16 condition codes:
EQ  equal       MI  negative            HI  unsigned higher           GT  signed greater than
NE  not equal   PL  positive or zero    LS  unsigned lower or same    LE  signed less than or equal
• By default, data processing instructions do not affect the condition code flags but the flags
can be optionally set by using “S”. CMP does not need “S”.
loop
…
SUBS r1,r1,#1    ; decrement r1 and set flags
BNE loop         ; if Z flag clear then branch
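The countdown loop can be traced with a small model of SUBS. The sketch only tracks the N and Z flags, and the helper names are assumptions for illustration.

```python
MASK = 0xFFFFFFFF  # registers are 32 bits wide

def subs(rn, imm):
    """Model SUBS: subtract, wrap to 32 bits, and report the N and Z flags."""
    result = (rn - imm) & MASK
    flags = {"N": bool(result >> 31), "Z": result == 0}
    return result, flags

def count_down(r1):
    """Run the loop body until SUBS sets Z; return the number of passes."""
    iterations = 0
    while True:
        iterations += 1          # stands in for the loop body
        r1, flags = subs(r1, 1)  # SUBS r1,r1,#1
        if flags["Z"]:           # BNE loop falls through once Z is set
            break
    return iterations
```

Starting with r1 = 5 the body runs five times. Without the "S" suffix the subtraction would leave the flags alone and BNE would test stale flag values.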
24
Conditional execution examples
• Without conditional execution: 5 instructions, 5 words, 5 or 6 cycles
• With conditional execution: 3 instructions, 3 words, 3 cycles
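To make the comparison concrete, the sketch below evaluates eight of the condition codes from the CPSR flags and then models a three-instruction conditional sequence. The sequence (CMP r0,#0 / ADDEQ r1,r1,#1 / ADDNE r2,r2,#1) is a hypothetical example, not taken from the slide, and the function names are illustrative.

```python
MASK = 0xFFFFFFFF

# Condition code -> predicate over the N, Z, C, V flags.
CONDITIONS = {
    "EQ": lambda f: f["Z"],
    "NE": lambda f: not f["Z"],
    "MI": lambda f: f["N"],
    "PL": lambda f: not f["N"],
    "HI": lambda f: f["C"] and not f["Z"],
    "LS": lambda f: (not f["C"]) or f["Z"],
    "GT": lambda f: (not f["Z"]) and f["N"] == f["V"],
    "LE": lambda f: f["Z"] or f["N"] != f["V"],
}

def cmp_flags(a, b):
    """Flags produced by CMP a, b, which computes a - b on 32-bit values."""
    a &= MASK
    b &= MASK
    result = (a - b) & MASK
    sign_a, sign_b, sign_r = a >> 31, b >> 31, result >> 31
    return {
        "N": bool(sign_r),
        "Z": result == 0,
        "C": a >= b,  # carry set means no borrow
        "V": sign_a != sign_b and sign_r != sign_a,
    }

def select_increment(r0, r1, r2):
    """if (r0 == 0) r1++; else r2++; with no branch instructions."""
    flags = cmp_flags(r0, 0)        # CMP   r0,#0
    if CONDITIONS["EQ"](flags):     # ADDEQ r1,r1,#1
        r1 = (r1 + 1) & MASK
    if CONDITIONS["NE"](flags):     # ADDNE r2,r2,#1
        r2 = (r2 + 1) & MASK
    return r1, r2
```

All three instructions are always fetched and exactly one of the two additions takes effect, so the pipeline is never flushed.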
25
ARM Instruction Set (3)
• Data processing instructions
• Data transfer instructions
• Block transfer instructions
• Branching instructions
• Multiply instructions
• Software interrupt instructions
26
Data Processing Instructions
• Arithmetic and logical operations
• 3-address format:
• Two 32-bit operands
(op1 is register, op2 is register or immediate)
• 32-bit result placed in a register
• Barrel shifter for op2 allows full 32-bit shift
within instruction cycle
27
Data Processing Instructions (2)
• Arithmetic operations:
• ADD, ADC, SUB, SBC, RSB, RSC
• Bit-wise logical operations:
• AND, EOR, ORR, BIC
• Register movement operations:
• MOV, MVN
• Comparison operations:
• TST, TEQ, CMP, CMN
28
Data Processing Instructions (3)
Conditional codes
+
Data processing instructions
+
Barrel shifter
=
Powerful tools for efficiently coded programs
29
Data Processing Instructions (4)
e.g.:
if (z==1) R1=R2+(R3*4)
compiles to
ADDEQS R1,R2,R3, LSL #2
( SINGLE INSTRUCTION ! )
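What that one instruction does (written ADDEQS in classic ARM assembler syntax) can be spelled out step by step. The sketch below is a behavioural model under simple assumptions: 32-bit unsigned register values, a hypothetical function name, and flags returned as a dictionary.

```python
MASK = 0xFFFFFFFF

def add_eq_s_lsl(z_flag, r1, r2, r3, shift):
    """Model ADDEQS R1,R2,R3,LSL #shift. Returns (new R1, new flags or None)."""
    if not z_flag:
        return r1, None  # EQ fails: the instruction does nothing, flags unchanged
    op2 = (r3 << shift) & MASK  # barrel shifter applied to the second operand
    full = r2 + op2
    result = full & MASK
    sign_a, sign_b, sign_r = r2 >> 31, op2 >> 31, result >> 31
    flags = {
        "N": bool(sign_r),
        "Z": result == 0,
        "C": full > MASK,  # carry out of bit 31
        "V": sign_a == sign_b and sign_r != sign_a,
    }
    return result, flags  # the "S" suffix writes the flags back
```

With Z set, R2 = 10 and R3 = 3, the call `add_eq_s_lsl(True, 0, 10, 3, 2)` yields 22, that is 10 + 3*4; with Z clear, R1 keeps its old value.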
30
Data Transfer Instructions
• Load/store instructions
• Used to move signed and unsigned
Word, Half Word and Byte to and from registers
• Can be used to load PC
(if target address is beyond branch instruction range)
33
Modifying the Status Registers
• Only indirectly
• MRS moves contents from CPSR/SPSR to selected GPR
• MSR moves contents from selected GPR to CPSR/SPSR
• Only in privileged modes
34
Multiply Instructions
• Integer multiplication (32-bit result)
• Long integer multiplication (64-bit result)
• Built in Multiply Accumulate Unit (MAC)
• Multiply and accumulate instructions add product to
running total
35
Multiply Instructions (2)
• Instructions:
36
Software Interrupt
• SWI instruction
• Forces CPU into supervisor mode
• Usage: SWI #n
• Encoding: Cond (bits 31 – 28), Opcode (bits 27 – 24), Ordinal (bits 23 – 0)
• Used for making OS calls
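The three fields pack into one 32-bit word. The sketch below assumes the ARM-state encoding, where the opcode field of SWI is 1111 and the "always" condition is 1110. The function names are illustrative.

```python
COND_AL = 0xE     # "always" condition
SWI_OPCODE = 0xF  # bits 27-24 of a SWI instruction

def encode_swi(ordinal, cond=COND_AL):
    """Build the 32-bit word for SWI #ordinal."""
    if not 0 <= ordinal < (1 << 24):
        raise ValueError("ordinal must fit in 24 bits")
    return (cond << 28) | (SWI_OPCODE << 24) | ordinal

def decode_swi(word):
    """Split a SWI word back into its three fields."""
    return {
        "cond": (word >> 28) & 0xF,
        "opcode": (word >> 24) & 0xF,
        "ordinal": word & 0xFFFFFF,
    }
```

An OS handler typically reads the ordinal back out of the instruction word to decide which service was requested, which is what `decode_swi` imitates.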
37
Branching Instructions
• Branch (B):
jumps forwards/backwards up to 32 MB
• Branch link (BL): same + saves (PC+4) in LR
• Suitable for function call/return
• Condition codes for conditional branches
38
Branching Instructions (2)
• Branch exchange (BX) and Branch link exchange (BLX): same as B/BL + exchange instruction set (ARM ↔ THUMB)
• Only way to swap sets
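With BX the target register itself says which instruction set to use: bit 0 of the address selects THUMB (1) or ARM (0) and is not part of the address that is fetched. A minimal sketch with a hypothetical function name:

```python
def bx(target):
    """Model BX Rm: return (new PC, instruction set) for a 32-bit target value."""
    state = "THUMB" if target & 1 else "ARM"
    pc = target & 0xFFFFFFFE  # bit 0 is a state flag, not an address bit
    return pc, state
```

So jumping to a Thumb routine at 0x8000 means loading a register with 0x8001 and executing BX on it; returning to ARM code uses an address with bit 0 clear.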
39
Thumb Instruction Set
• Compressed form of ARM
• Instructions stored as 16-bit,
• Decompressed into ARM instructions and
• Executed
• Lower performance (ARM 40% faster)
• Higher density (THUMB saves 30% space)
• Optimal: “interworking” (combining the two sets), supported by compilers
40
THUMB Instruction Set (2)
• More traditional:
• No condition codes
• Two-address data processing instructions
• Access to R8 – R15 restricted to
• MOV, ADD, CMP
• PUSH/POP for stack manipulation
• Descending stack (SP hardwired to R13)
41
THUMB Instruction Set (3)
• No MSR and MRS, must change to ARM to modify CPSR (change using BX or BLX)
• ARM entered automatically after RESET or entering
exception mode
• SWI number limited to 8 bits (0 – 255)
42