Y86-64 Study Notes 1 - State & Instruction & Basic Encoding

Made by Mike_Zhang


Related Computer System articles:
Signed binary number representation
Floating point numbers representation
UltraFish Plus - Signed binary number convertor
UltraFish Plus - Floating Point Numbers Representation Convertor
UltraFish Plus - Multiple Bases Unsigned Integer Convertor
Y86-64 Study Notes 1 - State & Instruction & Basic Encoding
Y86-64 Study Notes 2 - Y86-64 SEQ Stages
x86-64 Study Notes 1 - Introduction & Data Formats & Information Accessing & Arithmetic Logical Operation
x86-64 Study Notes 2 - Control


1 Accessible & Modifiable State

Y86-64 programmer-visible state

The programmer can access and modify this processor state.
It is similar to the x86-64 state, but simpler, and the instruction encoding is less compact.


Y86-64 programmer-visible state(CS: APP)

Comparing with the register part of x86-64:


x86-64 Integer registers(CS: APP)

1.1 Y86-64 Program Registers

  • 15 program registers;
  • No %r15, to simplify the encoding;
  • Each register holds one 64-bit (8-byte) word;
  • %rsp for Stack Pointer, NO fixed meaning or value for others.

Y86-64 Program Registers(CS: APP)

1.2 Condition Codes

  • CC: Condition Code;
  • 3 single-bit codes;
  • ZF, SF, OF;
  • ZF: Zero Flag. The most recent operation yielded zero.
  • SF: Sign Flag. The most recent operation yielded a negative value.
  • OF: Overflow Flag. The most recent operation caused a two’s-complement overflow—either negative or positive.
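To make the three flags concrete, here is a minimal Python sketch of how they could be derived for a 64-bit addition. The function name add_with_cc is made up for illustration; it is not part of Y86-64 or any library.

```python
MASK = (1 << 64) - 1   # keep values within 64 bits
SIGN = 1 << 63         # the sign bit of a 64-bit word

def add_with_cc(a, b):
    """Add two 64-bit words; return (result, ZF, SF, OF). Illustrative only."""
    a &= MASK
    b &= MASK
    r = (a + b) & MASK
    zf = r == 0                 # result is zero
    sf = bool(r & SIGN)         # result is negative (sign bit set)
    # Overflow: both operands have the same sign, but the result's sign differs.
    of = bool(~(a ^ b) & (a ^ r) & SIGN)
    return r, zf, sf, of
```

For example, adding 1 to the largest positive value (2^63 - 1) wraps to a negative result, so SF and OF are both set while ZF is clear.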

1.3 Program Counter

  • PC: Program Counter;
  • Holds the address of the instruction currently being executed.

1.4 Memory

  • Virtual memory;
  • Memory operands use only a base register and a displacement; there is NO index register or scale factor as in x86-64.

1.5 Program State

  • Stat: Program State;
  • The overall state of program execution;
  • Normal operation or exception.

2 Y86-64 Instructions


Y86-64 Instructions Set(CS: APP)
  • A subset of x86-64 instruction set;
  • 8-byte integer operations;
  • fewer address modes;
  • smaller set of operations;
  • Each instruction consists of:
    • 1-byte instruction specifier (e.g., 0|0 for halt), split as op|fn:
      • 4-bit operation code (op), and
      • 4-bit function code (fn) that selects a particular function;
    • (possibly) 1-byte register specifier (e.g., rA|rB, F|rB);
    • (possibly) 8-byte constant word (e.g., V, D, Dest);
  • 4 instruction lengths: 1 byte, 2 bytes, 9 bytes, and 10 bytes;
    • 1-byte: halt, nop, ret (instruction specifier only);
    • 2-byte: rrmovq rA, rB; OPq rA, rB; cmovXX rA, rB; pushq rA; popq rA (instruction specifier + register specifier);
    • 9-byte: jXX Dest; call Dest (instruction specifier + constant word);
    • 10-byte: irmovq V, rB; rmmovq rA, D(rB); mrmovq D(rB), rA (instruction specifier + register specifier + constant word);
  • Encodings are written as hexadecimal values.
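The length rule above can be sketched in a few lines of Python. This is only an illustration: the helper name instr_length is made up, and the op values (0 halt, 1 nop, 2 rrmovq/cmovXX, 3 irmovq, 4 rmmovq, 5 mrmovq, 6 OPq, 7 jXX, 8 call, 9 ret, 0xA pushq, 0xB popq) are assumed to match the CS: APP instruction-set figure.

```python
# op values that are followed by a register specifier byte
NEEDS_REGS = {0x2, 0x3, 0x4, 0x5, 0x6, 0xA, 0xB}
# op values that are followed by an 8-byte constant word
NEEDS_CONST = {0x3, 0x4, 0x5, 0x7, 0x8}

def instr_length(first_byte):
    """Return the instruction length in bytes, given its specifier byte."""
    op = first_byte >> 4                     # high 4 bits: operation code
    length = 1                               # the specifier byte itself
    if op in NEEDS_REGS:
        length += 1                          # register specifier byte
    if op in NEEDS_CONST:
        length += 8                          # constant word
    return length
```

With these tables, a specifier byte of 0x30 (irmovq) gives 10 bytes and 0x70 (jmp) gives 9 bytes, matching the four lengths listed above.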

2.1 movq Instructions


Y86-64 movq instructions(CS: APP)
  • subset of x86-64 movq instruction set;
  • Indicating the movement: source $\to$ destination;
    • Source: first character - immediate(i), register(r), memory(m);
    • Destination: second character - register(r), memory(m);
  • NO memory(m) location $\rightarrow$ another memory(m) location;
  • NO immediate(i) data $\rightarrow$ memory(m);

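Data moves from the first operand to the second: for irmovq and rrmovq, V or rA $\to$ rB; for rmmovq, rA $\to$ memory at D(rB).
For mrmovq D(rB), rA the register order in the assembly is swapped, so the move is memory at D(rB) $\to$ rA.

  • The immediate value (V) and the displacement (D) are each an 8-byte constant word.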

  • memory reference: only base and displacement(e.g., D(rB), rB for base, D for displacement), NO second index register or scale;


x86-64 Operand forms(CS: APP) (Y86-64's in red box)
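
As a minimal sketch of base-plus-displacement addressing, the address of a memory operand D(rB) is the value of the base register plus D, wrapped to 64 bits. The function name effective_address is made up for illustration.

```python
MASK = (1 << 64) - 1   # addresses are 64-bit values

def effective_address(base_value, displacement):
    """Address of D(rB): value held in rB plus the signed displacement D."""
    return (base_value + displacement) & MASK

# -3(%rbx) with %rbx holding 15 refers to address 12
print(effective_address(15, -3))
```

There is no index or scale term to add, which is the whole difference from the general x86-64 form.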

2.2 OPq Integer Operation Instructions


Y86-64 OPq instructions(CS: APP)

x86-64 Integer arithmetic operations(CS: APP) (Y86-64's in red box)
  • subset of x86-64 Integer arithmetic operations;
  • 2-byte instruction;
  • 4 instructions: addq, subq, andq, xorq;
  • Only operate on register data, NOT on memory data;
  • Sets the three condition codes ZF, SF, and OF;
  • Function code for fn:

Function codes for Y86-64 instruction set(CS: APP)(Operation part)

2.3 jXX Jump Instructions


Y86-64 jXX instructions(CS: APP)

x86-64 Jump instructions(CS: APP) (Y86-64's in red box)
  • subset of x86-64 Jump instructions;
  • 9-byte instruction;
  • 7 instructions: jmp, je, jne, jg, jge, jl, jle;
  • The branch is taken or not according to the condition codes (CC);
  • Function code for fn:

Function codes for Y86-64 instruction set(CS: APP)(Branches part)
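
The link between the function code and the condition codes can be sketched as below. This is an illustration, not the book's implementation: the name cond_holds is made up, and the fn numbering (0 jmp, 1 jle, 2 jl, 3 je, 4 jne, 5 jge, 6 jg) is assumed to follow the CS: APP function-code figure.

```python
def cond_holds(fn, zf, sf, of):
    """Return True if the branch with function code fn is taken."""
    less = sf != of                          # signed "less than": SF xor OF
    table = {
        0: True,                             # jmp: always taken
        1: less or zf,                       # jle
        2: less,                             # jl
        3: zf,                               # je
        4: not zf,                           # jne
        5: not less,                         # jge
        6: (not less) and (not zf),          # jg
    }
    return table[fn]
```

The cmovXX instructions use the same conditions with the same fn values, with fn 0 being the unconditional rrmovq.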

2.4 cmovXX Conditional Move Instructions


Y86-64 cmovXX instructions(CS: APP)

x86-64 Conditional move instructions(CS: APP) (Y86-64's in red box)
  • subset of x86-64 Conditional move instructions;
  • 2-byte instruction;
  • 6 instructions: cmove, cmovne, cmovg, cmovge, cmovl, cmovle (7 function codes when the unconditional rrmovq is counted);
  • same format as the register-register move rrmovq;
  • move occurs only if condition satisfied.
  • Function code for fn:

Function codes for Y86-64 instruction set(CS: APP)(Moves part)

2.5 call Instructions


Y86-64 call instructions(CS: APP)
  • 9-byte instruction;
  • First, push the return address onto the stack. The return address is the address of the instruction immediately after the call instruction;
  • Second, jump to Dest by setting the PC to the destination address.

ret instruction:


Y86-64 ret instructions(CS: APP)
  • 1-byte instruction;
  • The instruction pops the return address from the stack, then sets the PC to that address.
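
The two steps of call, and the matching ret, can be sketched with a toy machine state. Everything here is a simplifying assumption for illustration: the state is a plain dict, memory maps an address directly to an 8-byte word, and the names do_call and do_ret are made up.

```python
def do_call(state, dest):
    """state['pc'] is the address of the call instruction itself."""
    return_addr = state["pc"] + 9            # call is a 9-byte instruction
    state["rsp"] -= 8                        # the stack grows toward lower addresses
    state["mem"][state["rsp"]] = return_addr # push the return address
    state["pc"] = dest                       # jump to Dest

def do_ret(state):
    state["pc"] = state["mem"][state["rsp"]] # pop the return address into the PC
    state["rsp"] += 8

state = {"pc": 0x100, "rsp": 0x200, "mem": {}}
do_call(state, 0x150)
do_ret(state)
print(hex(state["pc"]))                      # address right after the call
```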

2.6 pushq & popq Instructions


Y86-64 pushq & popq instructions(CS: APP)
  • 2-byte instruction;
  • same behavior as the x86-64 push and pop instructions:

x86-64 Push and pop instructions(CS: APP)

2.7 halt Instructions


Y86-64 halt instruction(CS: APP)
  • Stops instruction execution;
  • 1-byte instruction.

2.8 nop Instructions


Y86-64 nop instruction(CS: APP)
  • Do nothing;
  • 1-byte instruction.

3 Encoding

3.1 Instruction Specifier

As mentioned in Section 2, every instruction begins with an instruction specifier, which is its first byte.
It is split into two 4-bit parts: the operation code and the function code.

  • Operation codes range from 0 to 0xB.
  • Function codes take distinct values only for the integer operation, branch, and move instructions; they are 0 for all other instructions:

Function codes for Y86-64 instruction set(CS: APP)
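
Splitting the specifier byte into its two halves is a shift and a mask. A small sketch, with the made-up helper name split_specifier:

```python
def split_specifier(byte):
    """Return (op, fn): the high and low 4 bits of the first instruction byte."""
    op = byte >> 4      # high nibble: operation code
    fn = byte & 0xF     # low nibble: function code
    return op, fn

print(split_specifier(0x60))   # an OPq instruction with function code 0
```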

3.2 Register Identifier

In Y86-64, some instructions, such as rrmovq, take register operands. These program registers must also be encoded.

  • Each program register has a register identifier (ID), ranging from 0 to 0xE.

Y86-64 program register identifiers(CS: APP)
  • ID 0xF does not name a real register; it indicates "no register".
  • Some instructions do not require a register specifier byte;
  • Some instructions use only one register, such as irmovq; the unused half of the register specifier byte is set to 0xF, which simplifies the implementation.
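
A sketch of how the register specifier byte could be built. The ID table is assumed to match the CS: APP register-identifier figure (the IDs for %rbx and %rcx agree with the worked example in Section 3.4); the names REG_ID, NO_REG, and reg_byte are made up for illustration.

```python
REG_ID = {
    "rax": 0x0, "rcx": 0x1, "rdx": 0x2, "rbx": 0x3,
    "rsp": 0x4, "rbp": 0x5, "rsi": 0x6, "rdi": 0x7,
    "r8": 0x8, "r9": 0x9, "r10": 0xA, "r11": 0xB,
    "r12": 0xC, "r13": 0xD, "r14": 0xE,
}
NO_REG = 0xF   # placeholder for an unused register field

def reg_byte(ra, rb):
    """Pack rA into the high nibble and rB into the low nibble.

    Pass None for a field the instruction does not use.
    """
    hi = NO_REG if ra is None else REG_ID[ra]
    lo = NO_REG if rb is None else REG_ID[rb]
    return (hi << 4) | lo

print(hex(reg_byte("rbx", "rcx")))   # both fields used, as in rrmovq
print(hex(reg_byte(None, "rbx")))    # rA unused, as in irmovq
```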

3.3 Constant Word Encoding

3 types of 8-byte constant word:

  • immediate data(V);
  • displacement for address specifier(D);
  • destination for address specifier(Dest);

For the destination address in branch and call instructions, the destination is an absolute address, NOT a PC-relative address as in x86-64.

  • All constant integers use little-endian encoding, which means the order of the bytes is reversed when encoding (the bits within each byte are unchanged).

Little-endian means the least significant (right-most) byte appears first;
For a 4-byte example, 0x 0A 0B 0C 0D $\to$ 0D 0C 0B 0A (byte order reversed)
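
In Python the same byte reversal can be checked with the built-in int.to_bytes. The helper name encode_const is made up; negative values are encoded in two's complement.

```python
def encode_const(value):
    """Encode an integer as an 8-byte little-endian constant word."""
    return value.to_bytes(8, byteorder="little", signed=value < 0)

print(encode_const(0x0A0B0C0D).hex(" "))   # least significant byte comes first
print(encode_const(-3).hex(" "))           # two's complement of a negative value
```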

3.4 Example

To encode:

.pos 0x100  # Start code at address 0x100
irmovq $15,%rbx
rrmovq %rbx,%rcx
loop:
rmmovq %rcx,-3(%rbx)
addq %rbx,%rcx
jmp loop

Practice Problem 4.1 (CS: APP)


Solution steps:

        .pos 0x100  # Start code at address 0x100
0x100 irmovq $15,%rbx # 10-byte instruction, next address +0xa (10)
0x10a rrmovq %rbx,%rcx # 2-byte instruction, next address +2
(0x10c) loop:
0x10c rmmovq %rcx,-3(%rbx) # 10-byte instruction, next address +0xa (10)
0x116 addq %rbx,%rcx # 2-byte instruction, next address +2
0x118 jmp loop # 9-byte instruction

The start address of each instruction depends on the lengths of all the instructions before it.
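
The address assignment is a running sum of instruction lengths. A minimal sketch, where assign_addresses and the program list are made up for illustration and the lengths are the ones noted in the listing:

```python
def assign_addresses(start, instructions):
    """instructions: list of (text, length_in_bytes); returns [(address, text), ...]."""
    placed = []
    addr = start
    for text, length in instructions:
        placed.append((addr, text))
        addr += length                 # the next instruction starts right after this one
    return placed

program = [
    ("irmovq $15,%rbx", 10),
    ("rrmovq %rbx,%rcx", 2),
    ("rmmovq %rcx,-3(%rbx)", 10),
    ("addq %rbx,%rcx", 2),
    ("jmp loop", 9),
]

for addr, text in assign_addresses(0x100, program):
    print(f"{addr:#x}  {text}")
```

The label loop takes the address of the instruction that follows it, here the rmmovq.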


0x100       irmovq $15,%rbx 

3|0|F|rB|V:

rB: %rbx $\to$ 0x3

V: $15 $\to$ 0x 00 00 00 00 00 00 00 0f $\to$ 0x 0f 00 00 00 00 00 00 00

0x100       30 f3 0f 00 00 00 00 00 00 00  # irmovq $15,%rbx 

0x10a       rrmovq %rbx,%rcx

2|0|rA|rB:

rA: %rbx $\to$ 3

rB: %rcx $\to$ 1

0x10a       20 31  # rrmovq %rbx,%rcx

0x10c       rmmovq %rcx,-3(%rbx)

4|0|rA|rB|D

rA: %rcx $\to$ 1

rB: %rbx $\to$ 3

D: -3 $\to$ 0000 0000 … 0011 $\to$ 1111 1111 … 1101 (2’s complement) $\to$ 0x ff ff ff ff ff ff ff fd $\to$ fd ff ff ff ff ff ff ff

0x10c       40 13 fd ff ff ff ff ff ff ff  # rmmovq %rcx,-3(%rbx)

0x116       addq %rbx,%rcx

6|0|rA|rB

rA: %rbx $\to$ 3

rB: %rcx $\to$ 1

0x116       60 31  # addq %rbx,%rcx

0x118       jmp loop

7|0|Dest

Dest: loop $\to$ 0x10c $\to$ 0x 00 00 00 00 00 00 01 0c $\to$ 0c 01 00 00 00 00 00 00

0x118       70 0c 01 00 00 00 00 00 00  # jmp loop

Answer:

0x100       # .pos 0x100
0x100 30 f3 0f 00 00 00 00 00 00 00 # irmovq $15,%rbx
0x10a 20 31 # rrmovq %rbx,%rcx
0x10c # loop:
0x10c 40 13 fd ff ff ff ff ff ff ff # rmmovq %rcx,-3(%rbx)
0x116 60 31 # addq %rbx,%rcx
0x118 70 0c 01 00 00 00 00 00 00 # jmp loop
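
As a cross-check of the answer, the five instructions can be encoded by a tiny Python sketch. It is deliberately limited to the instructions in this example; all helper names are made up, and the op values and register IDs are the ones used in the steps above.

```python
REG = {"rax": 0, "rcx": 1, "rdx": 2, "rbx": 3}   # only the IDs needed here
NO_REG = 0xF

def const(v):
    """8-byte little-endian constant word (two's complement if negative)."""
    return v.to_bytes(8, "little", signed=v < 0)

def regs(ra, rb):
    """Register specifier byte rA|rB."""
    return (ra << 4) | rb

def irmovq(v, rb):
    return bytes([0x30, regs(NO_REG, REG[rb])]) + const(v)

def rrmovq(ra, rb):
    return bytes([0x20, regs(REG[ra], REG[rb])])

def rmmovq(ra, d, rb):
    return bytes([0x40, regs(REG[ra], REG[rb])]) + const(d)

def addq(ra, rb):
    return bytes([0x60, regs(REG[ra], REG[rb])])

def jmp(dest):
    return bytes([0x70]) + const(dest)

def assemble(start=0x100):
    code = irmovq(15, "rbx") + rrmovq("rbx", "rcx")
    loop = start + len(code)             # address of the label "loop"
    code += rmmovq("rcx", -3, "rbx") + addq("rbx", "rcx") + jmp(loop)
    return loop, code

loop, code = assemble()
print(hex(loop), code.hex(" "))
```

The printed bytes can be compared line by line with the answer listing.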

References

R. E. Bryant and D. R. O'Hallaron, Computer Systems: A Programmer's Perspective, 3rd ed. Boston: Pearson, 2016.


Final Words

I will keep studying Y86-64 and updating these notes.
Finally, I hope we can all discuss, share, and point out problems together. Thanks!


Original article. Please credit the source when reposting.
Made by Mike_Zhang





Y86-64 Study Notes 1 - State & Instruction & Basic Encoding
https://ultrafish.io/post/Y86-64-learning-1/
Author
Mike_Zhang
Posted on
January 4, 2022
Licensed under