x86-64 Learning 1 - Introduction & Data Formats & Information Accessing & Arithmetic Logical Operation

Made by Mike_Zhang




1 Introduction

Machine-Level Language:

  • ISA: Instruction Set Architecture, defining:
    • the Processor State;
    • the format of the instructions;
    • the effect of each instruction on the state;
  • Use Virtual Addresses as memory addresses;

Compiler: transforms programs into elementary instructions (machine code in binary).

Assembly code: a textual form very close to machine code, but more readable because it is not in binary format.


Processor state that is visible at the machine level but normally hidden from the C programmer:

  • Program Counter: PC, %rip in x86-64, holds the address of the NEXT instruction to execute;
  • Register File: 16 named locations of 64 bits each, storing addresses and integer data. They are used to:
      1. track critical parts of the program state;
      2. store temporary data, i.e., arguments, local variables, and return values;
  • Condition Codes: hold status information about the most recently executed arithmetic or logical instruction; they are used to implement conditional changes in control or data flow, e.g., if and while;
  • Vector Registers: each can hold one or more integer or floating-point values.

Program Memory stores:

  • program machine-code;
  • operating system information;
  • run-time stack for calls and returns;
  • blocks of memory allocated by the user (e.g., with malloc).

2 Data Formats

  • word: 16-bit (2-byte) data type;
  • double word: 32-bit (4-byte) data type;
  • quad word: 64-bit (8-byte) data type;
  • char *: pointer, 8-byte quad word.

Size of C data types in x86-64 (CS: APP)
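
As a rough illustration of how operand sizes relate to instruction suffixes, the sketch below maps a size in bytes to the suffix letter used by integer instructions such as movb, movw, movl, and movq. The helper name `suffix_for_size` is made up for this note, and applying it to `sizeof` results assumes a typical 64-bit Linux target where `int` is 4 bytes and pointers are 8 bytes.

```c
#include <stddef.h>

/* Hypothetical helper: map an operand size in bytes to the suffix used
 * by x86-64 integer instructions. Returns '?' for sizes that have no
 * integer suffix. */
char suffix_for_size(size_t bytes)
{
    switch (bytes) {
    case 1:  return 'b';  /* byte */
    case 2:  return 'w';  /* word */
    case 4:  return 'l';  /* double word ("long word") */
    case 8:  return 'q';  /* quad word */
    default: return '?';
    }
}
```

On such a target, `suffix_for_size(sizeof(int))` gives 'l' and `suffix_for_size(sizeof(char *))` gives 'q'.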

3 Accessing Information


Integer Register (CS: APP)

An x86-64 CPU has a set of 16 64-bit general-purpose registers, which store integer data and pointers (addresses):

  • The 8086 started with eight 16-bit registers (in the RED box): %ax to %bp;
  • IA32 extended them to eight 32-bit registers (in the BLUE box): %eax to %ebp;
  • x86-64 extended them to eight 64-bit registers (in the BLACK box): %rax to %rbp, and added eight new 64-bit registers (in the DOTTED-LINE box): %r8 to %r15;

Different registers have different functions:

  • %rsp has a specific function: it is the stack pointer, which indicates the end position of the run-time stack;
  • The other 15 registers have more flexible uses.

Instructions can operate on data of different sizes stored in the low-order bytes of the 16 registers:

  • 8-bit instruction: can access the least significant 1 byte;
  • 16-bit instruction: can access the least significant 2 bytes;
  • 32-bit instruction: can access the least significant 4 bytes;
  • 64-bit instruction: can access the entire register;
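
The low-order access described above can be mimicked in C by converting a 64-bit value to a narrower unsigned type, which keeps only the least significant bytes. This is only a sketch of the idea; the function names are invented for illustration, and the register names in the comments refer to %rax as an example.

```c
#include <stdint.h>

/* Each function returns the part of a 64-bit register value that an
 * instruction of the given operand size would see. Converting to a
 * narrower unsigned type keeps only the low-order bytes. */
uint8_t  low_byte(uint64_t reg)  { return (uint8_t)reg;  }  /* e.g. %al  */
uint16_t low_word(uint64_t reg)  { return (uint16_t)reg; }  /* e.g. %ax  */
uint32_t low_dword(uint64_t reg) { return (uint32_t)reg; }  /* e.g. %eax */
```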

3.1 Operand Specifiers


Operand Forms (CS: APP)

Three Types:

  • Immediate:
    • a constant value, written as $ followed by an integer in standard C notation;
  • Register:
    • the contents of a register; the operand size of the instruction determines how many low-order bits of the register are used (8, 16, 32, or 64 bits);
    • The notation $r_a$ denotes an arbitrary register $a$, and $R[r_a]$ denotes its value, viewing the set of registers as an array $R$ indexed by register identifiers;
  • Memory:
    • Access memory location based on the computed address - effective address;
    • $M_b[Addr]$: reference to the $b$-byte value in memory starting at address $Addr$;
    • $Imm(r_b,r_i,s)$: the most general form:
      • $Imm$: immediate offset;
      • $r_b$: base register, 64-bit;
      • $r_i$: index register, 64-bit;
      • $s$: scale factor, must be 1,2,4, or 8;
      • effective address $=Imm+R[r_b]+R[r_i]\cdot s$;
      • The value is $M[Imm+R[r_b]+R[r_i]\cdot s]$;
    • These complex addressing modes are useful for referencing array and structure elements.
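
The effective-address formula can be checked with a small C function. This is a minimal sketch: `effective_address` is a made-up name, the register values are passed in directly as `base` and `index`, and the scale is not validated against the allowed set {1, 2, 4, 8}.

```c
#include <stdint.h>

/* Effective address for the general form Imm(rb, ri, s):
 *     Imm + R[rb] + R[ri] * s
 * base and index stand for the register values R[rb] and R[ri].
 * Unsigned arithmetic wraps modulo 2^64. */
uint64_t effective_address(int64_t imm, uint64_t base, uint64_t index,
                           unsigned scale)
{
    return (uint64_t)imm + base + index * (uint64_t)scale;
}
```

For instance, with a base value of 0x100 and an index value of 0x3, an operand of the form 9(rb,ri) yields 0x10C, and (rb,ri,4) also yields 0x10C.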

[Example]


Practice Problem 3.1 (CS: APP)

Solution to Problem 3.1 (CS: APP)

3.2 Data Movement Instructions

3.2.1 MOV Instructions

Copy data from a source location to a destination location, without transformation.


Simple data movement instructions (CS: APP)

Source(S):

  • value of immediate;
  • value in register;
  • value in memory.

Destination(D):

  • register;
  • memory address.

Copying from one memory location to another:
This cannot be done with a single instruction. First load the source value into a register, then write the register value to the destination memory location.

movabsq:

  • S: 64-bit immediate value;
  • D: must be a register.

For register operand:

  • The size of the register must match the last character of the instruction (b, w, l, q);
  • A MOV instruction only updates the specific register bytes indicated by the destination operand;
  • The exception is movl with a register destination, which also sets the high-order 4 bytes of the register to 0;
  • (This follows the x86-64 convention that any instruction generating a 32-bit value for a register also sets the high-order 4 bytes of that register to 0.)
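
The update rules for register destinations can be modeled in C as follows. This is a sketch under the rules listed above, not a description of any real API: `mov_to_reg` is a hypothetical helper that takes the old register value, the value being moved, and the operand size in bytes.

```c
#include <stdint.h>

/* Model of a MOV with a register destination: reg is the old 64-bit
 * register value, val the value being moved, and size the operand
 * size in bytes (1 = movb, 2 = movw, 4 = movl, 8 = movq). */
uint64_t mov_to_reg(uint64_t reg, uint64_t val, unsigned size)
{
    switch (size) {
    case 1:  return (reg & ~0xFFULL)   | (val & 0xFFULL);   /* low byte only */
    case 2:  return (reg & ~0xFFFFULL) | (val & 0xFFFFULL); /* low 2 bytes only */
    case 4:  return val & 0xFFFFFFFFULL;  /* movl: upper 4 bytes zeroed */
    case 8:  return val;                  /* whole register replaced */
    default: return reg;                  /* unsupported size: unchanged */
    }
}
```

Starting from 0x0011223344556677 and moving -1, the model gives 0x00112233445566FF for movb, 0x001122334455FFFF for movw, 0x00000000FFFFFFFF for movl, and 0xFFFFFFFFFFFFFFFF for movq.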

[Example]


Example of simple data movement instructions (CS: APP)

3.2.2 MOVZ Instructions

Copy a small source value to a larger destination, fill the remaining bytes in destination with zeros.


Zero-extending data movement instructions (CS: APP)
  • S: register, memory;
  • D: register;
  • The last two characters of the instruction name give the size of the source and the size of the destination;
  • size of destination $>$ size of source;
  • There is NO movzlq instruction; the same effect is achieved by movl with a register destination, which fills the upper 4 bytes with zeros.

3.2.3 MOVS Instructions

Copy a small source value to a larger destination, fill the remaining bytes in destination by sign extension (copy the most significant bit).


Sign-extending data movement instructions (CS: APP)
  • S: register, memory;
  • D: register;
  • The last two characters of the instruction name give the size of the source and the size of the destination;
  • size of destination $>$ size of source;
  • cltq: takes no operands; it uses %eax as the source and %rax as the destination of a sign extension, so it has the same effect as movslq %eax, %rax.
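
To contrast zero extension with sign extension, here is a small C sketch that builds the 64-bit result explicitly from the source bits. The function names are invented for this note; the instruction names in the comments only indicate which behavior each function imitates.

```c
#include <stdint.h>

/* movzbq-style: widen a byte to 64 bits, upper bytes filled with zeros. */
uint64_t zero_extend_byte(uint8_t b)
{
    return (uint64_t)b;
}

/* movsbq-style: widen a byte to 64 bits, upper bytes filled with
 * copies of the byte's most significant bit (bit 7). */
uint64_t sign_extend_byte(uint8_t b)
{
    return (b & 0x80) ? ((uint64_t)b | ~0xFFULL) : (uint64_t)b;
}

/* cltq-style: sign-extend a 32-bit value (%eax) to 64 bits (%rax). */
uint64_t sign_extend_dword(uint32_t d)
{
    return (d & 0x80000000u) ? ((uint64_t)d | ~0xFFFFFFFFULL) : (uint64_t)d;
}
```

With the byte 0xAA as input, zero extension gives 0x00000000000000AA while sign extension gives 0xFFFFFFFFFFFFFFAA.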

[Example]


Example of data movement instructions (CS: APP)

Solution of data movement instructions (CS: APP)

[Example]


Example of data movement instructions (CS: APP)

Solution of data movement instructions (CS: APP)

3.3 Push & Pop Instructions


Push and pop instructions (CS: APP)

Illustration of stack operation (CS: APP)
  • pushq %rbp is equivalent to:
      subq $8, %rsp        # Decrement stack pointer
      movq %rbp, (%rsp)    # Store %rbp on stack
  • popq %rax is equivalent to:
      movq (%rsp), %rax    # Read %rax from stack
      addq $8, %rsp        # Increment stack pointer

The stack is contained in the same memory as the program code and other program data, so arbitrary positions within the stack can be accessed using the standard memory addressing methods;
e.g., movq 8(%rsp), %rdx copies the second quad word from the top of the stack to %rdx.
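
The push and pop behavior can be sketched as a toy model in C. Everything here is hypothetical scaffolding: a small byte array stands in for stack memory, `rsp` is an offset into that array rather than a real address, and no bounds checking is done.

```c
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64

/* Toy machine: mem stands in for stack memory, rsp is an offset into it. */
typedef struct {
    uint8_t  mem[STACK_BYTES];
    uint64_t rsp;
} Machine;

void machine_init(Machine *m)
{
    memset(m->mem, 0, sizeof m->mem);
    m->rsp = STACK_BYTES;          /* empty stack: rsp just past the end */
}

/* pushq: decrement the stack pointer by 8, then store the value there. */
void pushq(Machine *m, uint64_t val)
{
    m->rsp -= 8;
    memcpy(&m->mem[m->rsp], &val, 8);
}

/* popq: read the value at the stack pointer, then increment it by 8. */
uint64_t popq(Machine *m)
{
    uint64_t val;
    memcpy(&val, &m->mem[m->rsp], 8);
    m->rsp += 8;
    return val;
}

/* Like movq offset(%rsp), D: read a quad word at an offset from the top. */
uint64_t peekq(const Machine *m, uint64_t offset)
{
    uint64_t val;
    memcpy(&val, &m->mem[m->rsp + offset], 8);
    return val;
}
```

After pushing 0x123 and then 0x456, `peekq(&m, 8)` reads 0x123 (the second quad word from the top) without changing `rsp`, and two pops return 0x456 and then 0x123.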


4 Arithmetic & Logical Operations


Integer arithmetic operations (CS: APP)

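4.1 leaq Instructions

  • leaq is the load effective address instruction;
  • it copies the effective address of the source operand into the destination, which must be a register;
  • it does NOT access memory, it only computes the address;
  • its effect is written $\&S$, using the C address operator: it generates a pointer to the location $S$;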
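
Because the address computation is plain arithmetic, compilers often use this instruction for expressions of the shape x + k*y that have nothing to do with memory. The C function below is a small illustration; the assembly in the trailing comment is one possible translation assuming x, y, and z arrive in %rdi, %rsi, and %rdx, and actual compiler output depends on the compiler and its options.

```c
/* Arithmetic of the shape x + k*y (k in {1, 2, 4, 8}) matches the
 * addressing form (rb, ri, s), so it can be computed by the address
 * unit without touching memory. */
long scale(long x, long y, long z)
{
    long t = x + 4 * y + 12 * z;
    return t;
}
/* One possible translation (x in %rdi, y in %rsi, z in %rdx):
 *     leaq (%rdi,%rsi,4), %rax    # x + 4*y
 *     leaq (%rdx,%rdx,2), %rdx    # z + 2*z = 3*z
 *     leaq (%rax,%rdx,4), %rax    # (x + 4*y) + 4*(3*z)
 */
```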

[Example]


Example of leaq Instructions (CS: APP)

Example of leaq Instructions (CS: APP)

4.2 Unary & Binary Instructions


Integer arithmetic operations (Unary Instructions) (CS: APP)

Unary Instructions:

  • Operand can be register or memory location.


Integer arithmetic operations (Binary Instructions) (CS: APP)

Binary Instruction:

  • S: immediate value, register, memory location;
  • D: register, memory location;
  • S, D can NOT both be memory;
  • Source operand first, Destination second;
  • OP S, D computes D = D OP S;
  • subq %rax, %rdx: %rdx = %rdx - %rax (subtract %rax from %rdx).
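
The operand order is easy to get backwards, so here is a tiny C model of it. The functions are illustrative only: the destination is passed by pointer because it is both an input and the place where the result is written.

```c
#include <stdint.h>

/* Model of a two-operand instruction OP S, D: the destination is both
 * the second input and the location that receives the result. */
void subq(uint64_t src, uint64_t *dst) { *dst = *dst - src; }  /* subq S, D */
void addq(uint64_t src, uint64_t *dst) { *dst = *dst + src; }  /* addq S, D */
```

With rax = 3 and rdx = 10, calling `subq(rax, &rdx)` leaves rdx = 7 and rax unchanged.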

[Example]


Example of Unary & Binary Instructions (CS: APP)

Example of Unary & Binary Instructions (CS: APP)

4.3 Shift Instructions


Integer arithmetic operations (Shift Instructions) (CS: APP)

The first operand is the shift amount and the second is the value to shift. The destination can be a register or a memory location.

Shift amount (2 ways):

  • immediate value: k
  • single-byte register %cl:
    • operand width: w bits (i.e., 8, 16, 32, or 64);
    • shift amount: the value of the low-order m bits of %cl, where $2^m=w$, i.e., $m=\log_2 w$;
      • e.g., 8-bit: the value of the lower 3 bits of %cl;
      • 64-bit: the value of the lower 6 bits of %cl;
    • Example: %cl = 0xFF = 1111 1111:
      • salb: 8-bit, shift amount is the lower 3 bits = 111 = 7;
      • salw: 16-bit, shift amount is the lower 4 bits = 1111 = 15;
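
The rule for taking the shift amount from %cl, as stated in these notes (following CS: APP), can be written as a one-line mask. The helper `shift_count` is hypothetical and assumes w is a power of two. Note that Intel's instruction reference describes the count as masked to 5 bits for 8-, 16-, and 32-bit operands (6 bits for 64-bit operands), so for salb and salw real hardware may behave differently from this model.

```c
/* Shift count taken from %cl for a w-bit operand, following the rule
 * in the text: only the low-order m = log2(w) bits are used.
 * w is assumed to be 8, 16, 32, or 64, so the mask of the low m bits
 * is w - 1. */
unsigned shift_count(unsigned char cl, unsigned w)
{
    return cl & (w - 1);
}
```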

Left Shift:

  • SAL: arithmetic left shift;
  • SHL: logical left shift;
  • Same effect, fill right with zero;

Right Shift:

  • SAR: arithmetic right shift, fills the left with copies of the sign bit;
  • SHR: logical right shift, fills the left with zeros;
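
The difference between the two right shifts shows up only when the sign bit is set. The sketch below works on raw 64-bit patterns so that every step is well defined in standard C; the names `shr64` and `sar64` are made up, and k is assumed to be in the range 0 to 63.

```c
#include <stdint.h>

/* SHR-style logical right shift: vacated high bits are filled with zeros. */
uint64_t shr64(uint64_t x, unsigned k)
{
    return x >> k;
}

/* SAR-style arithmetic right shift on the same bit pattern: vacated
 * high bits are filled with copies of the original sign bit (bit 63). */
uint64_t sar64(uint64_t x, unsigned k)
{
    if (x >> 63)              /* sign bit set: shift the complement, */
        return ~(~x >> k);    /* so the zeros shifted in become ones */
    return x >> k;
}
```

For the pattern 0xFFFFFFFFFFFFFFF0 (the value -16 in two's complement) and k = 2, the logical shift gives 0x3FFFFFFFFFFFFFFC while the arithmetic shift gives 0xFFFFFFFFFFFFFFFC (the value -4).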

[Example]


Example of Shift Instructions (CS: APP)

Example of Shift Instructions (CS: APP)

4.4 Special Arithmetic Instructions


Special arithmetic operations (CS: APP)

Multiply:

  • The two-operand imulq generates a 64-bit product from two 64-bit operands;
  • The one-operand forms described here compute the full 128-bit product of two 64-bit values;
  • imulq: signed (two's complement) multiply;
  • mulq: unsigned multiply;
  • One argument must be in register %rax;
  • The other is given as the source operand S;
  • The product is stored in %rdx (high-order 64 bits) and %rax (low-order 64 bits).
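
The split of the full product into two 64-bit halves can be sketched in C using a 128-bit integer type. This assumes a compiler that provides `unsigned __int128` (a GCC and Clang extension, not standard C); the function names are invented for this note and cover only the unsigned case.

```c
#include <stdint.h>

/* unsigned __int128 is a GCC/Clang extension, not standard C. */
typedef unsigned __int128 uint128_t;

/* High 64 bits of the full unsigned product: what mulq leaves in %rdx. */
uint64_t mul_high(uint64_t x, uint64_t y)
{
    return (uint64_t)(((uint128_t)x * y) >> 64);
}

/* Low 64 bits of the full unsigned product: what mulq leaves in %rax. */
uint64_t mul_low(uint64_t x, uint64_t y)
{
    return (uint64_t)((uint128_t)x * y);
}
```

For example, 0xFFFFFFFFFFFFFFFF times 2 does not fit in 64 bits: the high half is 1 and the low half is 0xFFFFFFFFFFFFFFFE.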

[Example]


Example of Multiply (CS: APP)

Example of Multiply (CS: APP)

Division:

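  • Single-operand instructions: idivq (signed divide) and divq (unsigned divide);
  • Dividend: 128 bits, with the high-order 64 bits in %rdx and the low-order 64 bits in %rax;
  • Divisor: given as the operand S;
  • The quotient is stored in %rax;
  • The remainder is stored in %rdx.

cqto:

  • Takes NO operands;
  • Sign-extends %rax into %rdx: every bit of %rdx becomes a copy of the sign bit of %rax;
  • This converts the quad word in %rax to an oct word (16 bytes) held in %rdx:%rax.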
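
A small C model of these two steps, for the common case where a 64-bit signed value is first widened and then divided. The function names are hypothetical; the division functions simply rely on C's own signed division, which truncates toward zero in the same way.

```c
#include <stdint.h>

/* cqto: the value that ends up in %rdx, i.e. 64 copies of the sign
 * bit of %rax. */
uint64_t cqto_high(int64_t rax)
{
    return rax < 0 ? UINT64_MAX : 0;
}

/* Signed division as in C: the quotient truncates toward zero and the
 * remainder takes the sign of the dividend. y must be nonzero, and
 * x = INT64_MIN with y = -1 overflows. */
int64_t div_quotient(int64_t x, int64_t y)  { return x / y; }  /* goes to %rax */
int64_t div_remainder(int64_t x, int64_t y) { return x % y; }  /* goes to %rdx */
```

Dividing -7 by 2 gives quotient -3 and remainder -1, and the high half produced for -7 is all ones.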

References

R. E. Bryant and D. R. O'Hallaron, Computer Systems: A Programmer's Perspective, Third edition. Boston: Pearson, 2016.


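Final Words

I will keep studying x86-64 and keep updating these notes.
Finally, I hope we can all discuss, share, and point out problems together. Thank you!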


Original article. Please credit the source when reposting.

x86-64 Learning 1 - Introduction & Data Formats & Information Accessing & Arithmetic Logical Operation
https://ultrafish.io/post/x86-64-learning-1/
Author: Mike_Zhang
Posted on: February 6, 2022