Data Movement Instructions

Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1of 22

Chapter 4: Data Movement Instructions

(Summary)

• This chapter concentrates on the data movement instructions.


• The data movement instructions include MOV, MOVSX, MOVZX, PUSH, POP, BSWAP,
XCHG, XLAT, IN, OUT, LEA, LDS, LES, LFS, LGS, LSS, LAHF, SAHF.
• String instructions: MOVS, LODS, STOS, INS, and OUTS.

MOV Revisited
• In this chapter, the MOV instruction introduces machine language instructions available with
various addressing modes and instructions.
• It may be necessary to interpret machine language programs generated by an assembler.

Machine Language
• Native binary code microprocessor uses as its instructions to control its operation.
–instructions vary in length from 1 to 13 bytes
• Over 100,000 variations of machine language instructions.
–there is no complete list of these variations
• Some bits in a machine language instruction are given; remaining bits are determined for each
variation of the instruction.

The Opcode
• Selects the operation (addition, subtraction, etc.,) performed by the microprocessor.
–either 1 or 2 bytes long for most instructions
• The figure illustrates the general form of the first opcode byte of many instructions.
–first 6 bits of the first byte are the binary opcode
–remaining 2 bits indicate the direction (D) of the data flow, and indicate whether the data are a
byte or a word (W)
This figure shows Byte 2 of many machine language instructions, showing the position of the MOD,
REG, and R/M fields.

MOD Field
• Specifies addressing mode (MOD) and whether a displacement is present with the selected type.
–If MOD field contains an 11, it selects the register addressing mode
–Register addressing specifies a register instead of a memory location, using the R/M field
• If the MOD field contains a 00, 01, or 10, the R/M field selects one of the data memory
addressing modes.
• All 8-bit displacements are sign-extended into 16-bit displacements when the processor executes
the instruction.
–if the 8-bit displacement is 00H–7FH (positive), it is sign-extended to 0000H–007FH before
adding to the offset address
–if the 8-bit displacement is 80H–FFH (negative), it is sign-extended to FF80H–FFFFH
• Some assembler programs do not use the 8bit displacements and in place default to all 16-bit
displacements.
Register Assignments
• Suppose a 2-byte instruction, 8BECH, appears in a machine language program.
– neither a 67H (operand address-size override prefix) nor a 66H (register-size override prefix)
appears as the first byte, thus the first byte is the opcode
• In 16-bit mode, this instruction is converted to binary and placed in the instruction format of
bytes 1 and 2, as illustrated in Figure 4–4.
This figure shows the 8BEC instruction placed into bytes 1 and 2 formats. This instruction is a MOV BP,
SP.
–D and W bits are a logic 1, so a word moves into the destination register specified in the REG
field
–REG field contains 101, indicating register BP, so the MOV instruction moves data into register
BP

R/M Memory Addressing


• If the MOD field contains a 00, 01, or 10, the R/M field takes on a new meaning.
• Figure illustrates the machine language version of the 16-bit instruction MOV DL, [DI] or
instruction (8AI5H).
• This instruction is 2 bytes long and has an opcode 100010, D=1 (to REG from R/M), W=0 (byte),
MOD=00 (no displacement), REG=010 (DL), and R/M=101 ([DI]).
This figure shows A MOV DL, [DI] instruction converted to its machine

language form.

–If the instruction changes to MOV DL, [DI+1], the MOD field changes to 01 for 8-bit
displacement
–first 2 bytes of the instruction remain the same
–instruction now becomes 8A55H instead of 8A15H

Special Addressing Mode


• A special addressing mode occurs when memory data are referenced by only the displacement
mode of addressing for 16-bit instructions.
• Examples are the MOV [1000H],DL and MOV NUMB,DL instructions.
–first instruction moves contents of register DL into data segment memory location 1000H
–second moves register DL into symbolic data segment memory location NUMB
• When an instruction has only a displacement, MOD field is always 00; R/M field always 110.
–You cannot actually use addressing mode [BP] without a displacement in machine language
• If the individual translating this symbolic instruction into machine language does not know about
the special addressing mode, the instruction would incorrectly translate to a MOV [BP], DL instruction.
This figure shows the MOV [1000H],DI instruction uses the special addressing mode.

This figure shows the MOV [BP], DL instruction converted to binary machine language.
32-Bit Addressing Modes
• Found in 80386 and above.
–by running in 32-bit instruction mode or
–In 16-bit mode by using address-size prefix 67H
• A scaled-index byte indicates additional forms of scaled-index addressing.
–mainly used when two registers are added to specify the memory address in an instruction
• A scaled-index instruction has 215 (32K) possible combinations.
• Over 32,000 variations of the MOV instruction alone in the 80386 - Core2 microprocessors.
• Figure 4–8 shows the format of the scaled index byte as selected by a value of 100 in the R/M
field of an instruction when the 80386 and above use a 32-bit address.
• The leftmost 2 bits select a scaling factor (multiplier) of 1x, 2x, 4x, 8x.
• Scaled-index addressing can also use a single register multiplied by a scaling factor.

This figure shows the scaled-index byte.


– the index and base fields both
contain register numbers
An Immediate Instruction
• An example of a 16-bit instruction using immediate addressing.
–MOV WORD PTR [BX+1000H] ,1234H moves a 1234H into a word-sized memory location
addressed by sum of 1000H, BX, and DS x 10H
• 6-byte instruction
–2 bytes for the opcode; 2 bytes are the data of
1234H; 2 bytes are the displacement of 1000H
• The following figure shows the binary bit pattern for each byte of this instruction.
This figure shows a MOV WORD PTR, [BX=1000H] 1234H instruction converted to binary machine
language.

• This instruction, in symbolic form, includes WORD PTR.


–directive indicates to the assembler that the instruction uses a word-sized memory pointer
• If the instruction moves a byte of immediate data, BYTE PTR replaces WORD PTR.
–if a doubleword of immediate data, the DWORD PTR directive replaces BYTE PTR
• Instructions referring to memory through a pointer do not need the BYTE PTR, WORD PTR, or
DWORD PTR directives.

Segment MOV Instructions


• If contents of a segment register are moved by MOV, PUSH, or POP instructions, a special-bits
(REG field) select the segment register.
–the opcode for this type of MOV instruction is different for the prior MOV instructions
–an immediate segment register MOV is not available in the instruction set
• To load a segment register with immediate data, first load another register with the data and move
it to a segment register.
This figure shows a MOV BX, CS instruction converted to binary machine language.

• The figure shows a MOV BX, CS instruction converted to binary.


• Segment registers can be moved between any 16bit register or 16-bit memory location.
• A program written in symbolic assembly language (assembly language) is rarely assembled by
hand into binary machine language.
• An assembler program converts symbolic assembly language into machine language.

The 64-Bit Mode for the Pentium 4 and Core2


• In 64-bit mode, a prefix called REX (register extension) is added.
–encoded as a 40H–4FH, follows other prefixes; placed immediately before the opcode
• Purpose is to modify reg and r/m fields in the second byte of the instruction.
–REX is needed to be able to address registers R8 through R15
• This figure illustrates the structure and application of REX to the second byte of the opcode.
• The reg field can only contain register assignments as in other modes of operation
• The r/m field contains either a register or memory assignment.
• This figure shows the scaled-index byte with the REX prefix for more complex addressing modes
and also for using a scaling factor in the 64-bit mode of operation.

PUSH/POP
• Important instructions that store and retrieve data from the LIFO (last-in, first-out) stack memory.
• Six forms of the PUSH and POP instructions:
–register, memory, immediate
–segment register, flags, all registers
• The PUSH and POP immediate & PUSHA and POPA (all registers) available 80286 - Core2.
• Register addressing allows contents of any 16bit register to transfer to & from the stack.
• Memory-addressing PUSH and POP instructions store contents of a 16- or 32 bit memory
location on the stack or stack data into a memory location.
• Immediate addressing allows immediate data to be pushed onto the stack, but not popped off the
stack.
• Segment register addressing allows contents of any segment register to be pushed onto the stack
or removed from the stack.
–ES may be pushed, but data from the stack may never be popped into ES
• The flags may be pushed or popped from that stack.
–contents of all registers may be pushed or popped

PUSH
• Always transfers 2 bytes of data to the stack;
–80386 and above transfer 2 or 4 bytes
• PUSHA instruction copies contents of the internal register set, except the segment registers, to the
stack.
• PUSHA (push all) instruction copies the registers to the stack in the following order:
AX, CX, DX, BX, SP, BP, SI, and DI.
• PUSHF (push flags) instruction copies the contents of the flag register to the stack.
• PUSHAD and POPAD instructions push and pop the contents of the 32-bit register set in 80386 -
Pentium 4.
–PUSHA and POPA instructions do not function in the 64-bit mode of operation for the Pentium
4

This figure shows the effect of the PUSH AX instruction on ESP and stack memory locations 37FFH and
37FEH. This instruction is shown at the point after execution.
• PUSHA instruction pushes all the internal 16bit registers onto the stack, illustrated in 4–14.
–requires 16 bytes of stack memory space to store all eight 16-bit registers
• After all registers are pushed, the contents of the SP register are decremented by 16.
• PUSHA is very useful when the entire register set of 80286 and above must be saved.
• PUSHAD instruction places 32-bit register set on the stack in 80386 - Core2.
–PUSHAD requires 32 bytes of stack storage
This figure shows the operation of the PUSHA instruction, showing the location and order of stack data.

POP
• Performs the inverse operation of PUSH.
• POP removes data from the stack and places it in a target 16-bit register, segment register, or a
16-bit memory location.
–not available as an immediate POP
• POPF (pop flags) removes a 16-bit number from the stack and places it in the flag register;
–POPFD removes a 32-bit number from the stack and places it into the extended flag register
• POPA (pop all) removes 16 bytes of data from the stack and places them into the following
registers, in the order shown: DI, SI, BP, SP, BX, DX, CX, and AX.
–reverse order from placement on the stack by PUSHA instruction, causing the same data to
return to the same registers
• Figure 4–15 shows how the POP BX instruction removes data from the stack and places them
into register BX.
This figure shows the POP BX instruction, showing how data are removed from the stack. This
instruction is shown after execution.

Initializing the Stack


• When the stack area is initialized, load both the stack segment (SS) register and the stack pointer
(SP) register.
• Figure 4–16 shows how this value causes data to be pushed onto the top of the stack segment
with a PUSH CX instruction.
• All segments are cyclic in nature
–the top location of a segment is contiguous with the bottom location of the segment
This figure shows the PUSH CX instruction, showing the cyclical nature of the stack segment. This
instruction is shown just before execution, to illustrate that the stack bottom is contiguous to the top.
• Assembly language stack segment setup:
–first statement identifies start of the segment
–last statement identifies end of the stack segment
• Assembler and linker programs place correct stack segment address in SS and the length of the
segment (top of the stack) into SP.
• There is no need to load these registers in your program.
–unless you wish to change the initial values for some reason
• If the stack is not specified, a warning will appear when the program is linked.
• Memory section is located in the program segment prefix (PSP), appended to the beginning of
each program file.
• If you use more memory for the stack, you will erase information in the PSP.
–information critical to the operation of your program and the computer
• Error often causes the program to crash.

LEA
• Loads a 16- or 32-bit register with the offset address of the data specified by the operand.
• Earlier examples presented by using the OFFSET directive.
–OFFSET performs same function as LEA instruction if the operand is a displacement
• LEA and MOV with OFFSET instructions are both the same length (3 bytes).
• Why is LEA instruction available if OFFSET accomplishes the same task?
–OFFSET functions with, and is more efficient than LEA instruction, for simple operands such as
LIST
–Microprocessor takes longer to execute the LEA BX, LIST instruction the MOV BX, OFFSET
LIST
• The MOV BX, OFFSET LIST instruction is actually assembled as a move immediate instruction
and is more efficient.

LOAD EFFECTIVE ADDRESS


• LEA instruction loads any 16-bit register with the offset address
–determined by the addressing mode selected
• LDS and LES load a 16-bit register with offset address retrieved from a memory location
–then load either DS or ES with a segment address retrieved from memory
• In 80386 and above, LFS, LGS, and LSS are added to the instruction set.
LDS, LES, LFS, LGS, and LSS
• Load any 16- or 32-bit register with an offset address, and the DS, ES, FS, GS, or SS segment
register with a segment address.
–instructions use any memory-addressing modes to access a 32-bit or 48-bit memory section that
contain both segment and offset address
• The next figure illustrates an example LDS BX,[DI] instruction.
This figure shows the LDS BX, [DI] instruction loads register BX from addresses 11000H and 11001H
and register DS from locations 11002H and 11003H. This instruction is shown at the point just before DS
changes to 3000H and BX changes to 127AH.

• This instruction transfers the 32-bit number, addressed by DI in the data segment, into the BX
and DS registers.
• LDS, LES, LFS, LGS, and LSS instructions obtain a new far address from memory.
–offset address appears first, followed by the segment address
• This format is used for storing all 32-bit memory addresses.
• A far address can be stored in memory by the assembler.
• The most useful of the load instructions is the LSS instruction.
–after executing some dummy instructions, the old stack area is reactivated by loading both SS
and SP with the LSS instruction
• CLI (disable interrupt) and STI (enable interrupt) instructions must be included to disable
interrupts.

STRING DATA TRANSFERS


• Five string data transfer instructions: LODS, STOS, MOVS, INS, and OUTS.
• Each allows data transfers as a single byte, word, or doubleword.
• Before the string instructions are presented, the operation of the D flag-bit (direction), DI, and SI
must be understood as they apply to the string instructions.

The Direction Flag


• The direction flag (D, located in the flag register) selects the auto-increment or the auto-
decrement operation for the DI and SI registers during string operations.
–used only with the string instructions
• The CLD instruction clears the D flag and the STD instruction sets it .
–CLD instruction selects the auto-increment mode and STD selects the auto-decrement mode

DI and SI
• During execution of string instruction, memory accesses occur through DI and SI registers.
–DI offset address accesses data in the extra segment for all string instructions that use it
–SI offset address accesses data by default in the data segment
• Operating in 32-bit mode EDI and ESI registers are used in place of DI and SI.
–this allows string using any memory location in the entire 4G-byte protected mode address space

LODS
• Loads AL, AX, or EAX with data at segment offset address indexed by the SI register.
• A 1 is added to or subtracted from SI for a byte-sized LODS
• A 2 is added or subtracted for a word-sized LODS.
• A 4 is added or subtracted for a double word-sized LODS.
• The next figure shows the LODSW instruction.
This figure shows the operation of the LODSW instruction if DS=1000H, D=0,11000H,
11001H = A0. This instruction is shown after AX is loaded from memory, but before SI increments by 2.

STOS
• Stores AL, AX, or EAX at the extra segment memory location addressed by the DI register.
• STOSB (stores a byte) stores the byte in AL at the extra segment memory location addressed by
DI.
• STOSW (stores a word) stores AX in the memory location addressed by DI.
• After the byte (AL), word (AX), or doubleword (EAX) is stored, contents of DI increment or
decrement.

STOS with a REP


• The repeat prefix (REP) is added to any string data transfer instruction except LODS.
– REP prefix causes CX to decrement by 1 each time the string instruction executes; after CX
decrements, the string instruction repeats
• If CX reaches a value of 0, the instruction terminates and the program continues.
• If CX is loaded with 100 and a REP STOSB instruction executes, the microprocessor
automatically repeats the STOSB 100 times.

MOVS
• Transfers a byte, word, or doubleword a data segment addressed by SI to extra segment location
addressed by SI.
– pointers are incremented or decremented, as dictated by the direction flag
• Only the source operand (SI), located in the data segment may be overridden so another segment
may be used.
• The destination operand (DI) must always be located in the extra segment.

INS
• Transfers a byte, word, or doubleword of data from an I/O device into the extra segment memory
location addressed by the DI register.
–I/O address is contained in the DX register
• Useful for inputting a block of data from an external I/O device directly into the memory.
• One application transfers data from a disk drive to memory.
–disk drives are often considered and interfaced as I/O devices in a computer system
• Three basic forms of the INS.
• INSB inputs data from an 8-bit I/O device and stores it in a memory location indexed by SI.
• INSW instruction inputs 16-bit I/O data and stores it in a word-sized memory location.
• INSD instruction inputs a doubleword.
• These instructions can be repeated using the
REP prefix
–allows an entire block of input data to be stored in the memory from an I/O device

OUTS
• Transfers a byte, word, or doubleword of data from the data segment memory location address by
SI to an I/O device.
–I/O device addressed by the DX register as with the INS instruction
• In the 64-bit mode for Pentium 4 and Core2, there is no 64-bit output
–but the address in RSI is 64 bits wide

MISCELLANEOUS DATA TRANSFER INSTRUCTIONS


• Used in programs, data transfer instructions detailed in this section are XCHG, LAHF, SAHF, XLAT,
IN, OUT, BSWAP, MOVSX, MOVZX, and CMOV.

XCHG
• Exchanges contents of a register with any other register or memory location.
– cannot exchange segment registers or memory-to-memory data
• Exchanges are byte-, word-, or doubleword and use any addressing mode except immediate
addressing.
• XCHG using the 16-bit AX register with another 16-bit register, is most efficient exchange.

LAHF and SAHF


• Seldom used bridge instructions.
• LAHF instruction transfers the rightmost 8 bits of the flag register into the AH register.
• SAHF instruction transfers the AH register into the rightmost 8 bits of the flag register.
• SAHF instruction may find some application with the numeric coprocessor.
• As legacy instructions, they do not function in the 64-bit mode and are invalid instructions.

XLAT
• Converts the contents of the AL register into a number stored in a memory table.
–performs the direct table lookup technique often used to convert one code to another
• An XLAT instruction first adds the contents of AL to BX to form a memory address within the
data segment.
–copies the contents of this address into AL
–only instruction adding an 8-bit to a 16-bit number
This figure shows the operation of the XLAT instruction at the point just before 6DH is loaded into AL.

IN and OUT
• IN & OUT instructions perform I/O operations.
• Contents of AL, AX, or EAX are transferred only between I/O device and microprocessor.
–an IN instruction transfers data from an external I/O device into AL, AX, or EAX
–an OUT transfers data from AL, AX, or EAX to an external I/O device
• Only the 80386 and above contain EAX
• Often, instructions are stored in ROM.
–a fixed-port instruction stored in ROM has its port number permanently fixed because of the
nature of read-only memory
• A fixed-port address stored in RAM can be modified, but such a modification does not conform
to good programming practices.
• The port address appears on the address bus during an I/O operation.
• Two forms of I/O device (port) addressing:
• Fixed-port addressing allows data transfer between AL, AX, or EAX using an 8-bit I/O port
address.
–port number follows the instruction’s opcode
• Variable-port addressing allows data transfers between AL, AX, or EAX and a 16-bit port
address.
–the I/O port number is stored in register DX, which can be changed (varied) during the
execution of a program.
This figure shows the signals found in the microprocessor-based system for an OUT 19H,AX instruction.
BSWAP
• Takes the contents of any 32-bit register and swaps the first byte with the fourth, and the second
with the third.
– BSWAP (byte swap) is available only in 80486– Pentium 4 microprocessors
• This instruction is used to convert data between the big and little endian forms.
• In 64-bit operation for the Pentium 4, all 8 bytes in the selected operand are swapped.

CMOV
• Many variations of the CMOV instruction.
–these move the data only if the condition is true
• CMOVZ instruction moves data only if the result from some prior instruction was a zero.
–destination is limited to only a 16- or 32-bit register, but the source can be a 16- or 32-bit
register or memory location
• Because this is a new instruction, you cannot use it with the assembler unless the .686 switch is
added to the program

SEGMENT OVERRIDE PREFIX


• May be added to almost any instruction in any memory-addressing mode
–allows the programmer to deviate from the default segment
–only instructions that cannot be prefixed are jump and call instructions using the code segment
register for address generation
–Additional byte appended to the front of an instruction to select alternate segment register

ASSEMBLER DETAIL
• The assembler can be used two ways:
–with models unique to a particular assembler
–with full-segment definitions that allow complete control over the assembly process and are
universal to all assemblers
• In most cases, the inline assembler found in Visual is used for developing assembly code for use
in a program
–occasions require separate assembly modules using the assembler

Directives
• Indicate how an operand or section of a program is to be processed by the assembler.
– some generate and store information in the memory; others do not
• The DB (define byte) directive stores bytes of data in the memory.
• BYTE PTR indicates the size of the data referenced by a pointer or index register.
• Complex sections of assembly code are still written using MASM.

Storing Data in a Memory Segment


• DB (define byte), DW (define word), and DD (define doubleword) are most often used with
MASM to define and store memory data.
• If a numeric coprocessor executes software in the system, the DQ (define quadword) and DT
(define ten bytes) directives are also common.
• These directives label a memory location with a symbolic name and indicate its size.
• Memory is reserved for use in the future by using a question mark (?) as an operand for a DB,
DW, or DD directive.
– when? is used in place of a numeric or ASCII value, the assembler sets aside a location
and does not initialize it to any specific value
• It is important that word-sized data are placed at word boundaries and doubleword-sized data are
placed at doubleword boundaries.
–if not, the microprocessor spends additional time accessing these data types
ASSUME, EQU, and ORG
• Equate directive (EQU) equates a numeric, ASCII, or label to another label.
– equates make a program clearer and simplify debugging
• The THIS directive always appears as THIS
BYTE, THIS WORD, THIS DWORD, or THIS QWORD.
• The assembler can only assign either a byte, word, or doubleword address to a label.
• The ORG (origin) statement changes the starting offset address of the data in the data segment to
location 300H.
• At times, the origin of data or the code must be assigned to an absolute offset address with the
ORG statement.
• ASSUME tells the assembler what names have been chosen for the code, data, extra, and stack
segments.

PROC and ENDP


• Indicate start and end of a procedure (subroutine).
–they force structure because the procedure is clearly defined
• If structure is to be violated for whatever reason, use the CALLF, CALLN, RETF, and RETN
instructions.
• Both the PROC and ENDP directives require a label to indicate the name of the procedure.
• The PROC directive, which indicates the start of a procedure, must also be followed with a
NEAR or FAR.
–A NEAR procedure is one that resides in the same code segment as the program, often
considered to be local
–A FAR procedure may reside at any location in the memory system, considered global
• The term global denotes a procedure that can be used by any program.
• Local defines a procedure that is only used by the current program.

Memory Organization
• The assembler uses two basic formats for developing software:
– one method uses models; the other uses full segment definitions
• Memory models are unique to MASM.
• The models are easier to use for simple tasks.
• The full-segment definitions offer better control over the assembly language task and are
recommended for complex programs.
Models
• There are many models available to the MASM assembler, ranging from tiny to huge.
• Special directives such as @DATA are used to identify various segments.
• Models are important with both Microsoft Visual and Borland development systems if assembly
language is included with programs.

Full-Segment Definitions
• Full-segment definitions are also used with the Borland and Microsoft environments for
procedures developed in assembly language.
• The names of the segments in this program can be changed to any name.
• Always include the group name ‘DATA’, so the Microsoft program CodeView can be used to
symbolically debug this software.
• To access CodeView, type CV, followed by the file name at the DOS command line; if operating
from Programmer’s WorkBench, select Debug under the Run menu.
• If the group name is not placed in a program, CodeView can still be used to debug a program, but
the program will not be debugged in symbolic form.

You might also like