-
Notifications
You must be signed in to change notification settings - Fork 3.3k
LibDisassembly: RISC-V disassembly 4-3/4-4: Disassemble priviledged instructions and add some initial tests #26154
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
These were the wrong way around.
This does not include all instructions for S- and M-mode from the priviledged spec, but these are the common ones that we will definitely need (and use ourselves)
This was prettier to add in one go instead of many small sections :^)
These tests make sure that every single non-trivial permutation of every RV64G instruction diassembles correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks nice! I didn't read the entire test code, but I think it should be fine.
if (raw_parts.funct3 != 0) | ||
return parse_csr(raw_parts); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
funct3 == 0b100 aren't CSR instructions.
case 0b000000000000'00000'000'00000: | ||
return make<EnvironmentCall>(); | ||
case 0b000000000001'00000'000'00000: | ||
return make<EnvironmentBreak>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ECALL and EBREAK are part of RV32I, but I guess it's fine to put it here as well.
I'm not even sure which extensions SRET/MRET/WFI belong to officially. I think recently RISC-V specs started to adopt the extension names Ss and Sm for S-mode and M-mode (e.g. in RVA23). Those might not be official names yet though, idk.
make<MoveIntegerToFloat>(FloatWidth::Double, 20_x, 19_f), | ||
}; | ||
|
||
constexpr Array all_instructions_machine_code = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I would use constexpr auto all_instructions_machine_code = to_array<u8>({
instead of manually casting the first element.
0xa3, 0x80, 0xce, 0x01, 0x23, 0x91, 0xef, 0x01, 0x23, 0x22, 0x11, 0x00, 0x93, 0x01, | ||
0x52, 0x00, 0x93, 0x22, 0x73, 0x00, 0x93, 0x33, 0x94, 0x00, 0x93, 0x44, | ||
0xb5, 0x00, 0x93, 0x65, 0xd6, 0x00, 0x93, 0x76, 0xf7, 0x00, 0x93, 0x17, | ||
0x18, 0x01, 0x93, 0x58, 0x39, 0x01, 0x93, 0x59, 0x5a, 0x41, 0xb3, 0x01, | ||
0x52, 0x00, 0x33, 0x8b, 0x8b, 0x41, 0xb3, 0x17, 0x18, 0x01, 0xb3, 0x22, | ||
0x73, 0x00, 0xb3, 0x33, 0x94, 0x00, 0xb3, 0x44, 0xb5, 0x00, 0xb3, 0x58, | ||
0x39, 0x01, 0xb3, 0x59, 0x5a, 0x41, 0xb3, 0x65, 0xd6, 0x00, 0xb3, 0x76, | ||
0xf7, 0x00, 0x0f, 0x00, 0x10, 0x0b, 0x0f, 0x00, 0x90, 0x04, 0x0f, 0x00, | ||
0x30, 0x83, 0x73, 0x00, 0x00, 0x00, 0x73, 0x00, 0x10, 0x00, 0x83, 0x60, 0x41, 0x00, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like you manually replaced 2 C instructions with their non-compressed versions. This means the instructions to generate this array are wrong. The gcc invocation should include -march=rv64imafd_zicsr_zifencei
. That would also fix this weird formatting.
# NOTE: GCC *really* doesn't want us to just specify a raw offset in jumps and branches, so this is what we have to do instead. | ||
jal x3, _start # -8 | ||
jalr x4, 2(x5) | ||
beq x6, x7, _start # -16 | ||
bne x8, x9, _start # -20 | ||
blt x10, x11, _start # -24 | ||
bge x12, x13, _start # -28 | ||
bltu x14, x15, _start # -32 | ||
bgeu x16, x17, _start # -36 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
jal x3, .-8
should work. .
means the current address in most assemblers, so this creates a relative offset.
Manually diffing our disasm output with objdump output, I found these two things (aside from formatting differences): Input assembly: We seem to incorrectly format JALR (the JumpAndLinkRegister class itself seems to have correct values). The other thing is that we format |
Very good catches. I hope I can get to this today. |
Finally we can test this code. Only includes the test without compressed instructions obviously.
Includes common system instruction disassembly, which completes RV64G.
Also included:
@spholz