Skip to content

Conversation

kleinesfilmroellchen
Copy link
Collaborator

Finally we can test this code. Only includes the test without compressed instructions obviously.

Includes common system instruction disassembly, which completes RV64G.

Also included:

  • Basic plumbing to actually run the disassembling code, including printing compressed instructions as unknown temporarily.
  • Fix to floating-point conversion decode swaparoo

@spholz

This does not include all instructions for S- and M-mode from the
priviledged spec, but these are the common ones that we will definitely
need (and use ourselves)
This was prettier to add in one go instead of many small sections :^)
These tests make sure that every single non-trivial permutation of every
RV64G instruction diassembles correctly.
@github-actions github-actions bot added the 👀 pr-needs-review PR needs review from a maintainer or community member label Aug 24, 2025
Copy link
Member

@spholz spholz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks nice! I didn't read the entire test code, but I think it should be fine.

Comment on lines +16 to +17
if (raw_parts.funct3 != 0)
return parse_csr(raw_parts);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

funct3 == 0b100 aren't CSR instructions.

Comment on lines +21 to +24
case 0b000000000000'00000'000'00000:
return make<EnvironmentCall>();
case 0b000000000001'00000'000'00000:
return make<EnvironmentBreak>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ECALL and EBREAK are part of RV32I, but I guess it's fine to put it here as well.

I'm not even sure which extensions SRET/MRET/WFI belong to officially. I think recently RISC-V specs started to adopt the extension names Ss and Sm for S-mode and M-mode (e.g. in RVA23). Those might not be official names yet though, idk.

make<MoveIntegerToFloat>(FloatWidth::Double, 20_x, 19_f),
};

constexpr Array all_instructions_machine_code = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I would use constexpr auto all_instructions_machine_code = to_array<u8>({ instead of manually casting the first element.

Comment on lines +253 to +261
0xa3, 0x80, 0xce, 0x01, 0x23, 0x91, 0xef, 0x01, 0x23, 0x22, 0x11, 0x00, 0x93, 0x01,
0x52, 0x00, 0x93, 0x22, 0x73, 0x00, 0x93, 0x33, 0x94, 0x00, 0x93, 0x44,
0xb5, 0x00, 0x93, 0x65, 0xd6, 0x00, 0x93, 0x76, 0xf7, 0x00, 0x93, 0x17,
0x18, 0x01, 0x93, 0x58, 0x39, 0x01, 0x93, 0x59, 0x5a, 0x41, 0xb3, 0x01,
0x52, 0x00, 0x33, 0x8b, 0x8b, 0x41, 0xb3, 0x17, 0x18, 0x01, 0xb3, 0x22,
0x73, 0x00, 0xb3, 0x33, 0x94, 0x00, 0xb3, 0x44, 0xb5, 0x00, 0xb3, 0x58,
0x39, 0x01, 0xb3, 0x59, 0x5a, 0x41, 0xb3, 0x65, 0xd6, 0x00, 0xb3, 0x76,
0xf7, 0x00, 0x0f, 0x00, 0x10, 0x0b, 0x0f, 0x00, 0x90, 0x04, 0x0f, 0x00,
0x30, 0x83, 0x73, 0x00, 0x00, 0x00, 0x73, 0x00, 0x10, 0x00, 0x83, 0x60, 0x41, 0x00,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like you manually replaced 2 C instructions with their non-compressed versions. This means the instructions to generate this array are wrong. The gcc invocation should include -march=rv64imafd_zicsr_zifencei. That would also fix this weird formatting.

Comment on lines +18 to +26
# NOTE: GCC *really* doesn't want us to just specify a raw offset in jumps and branches, so this is what we have to do instead.
jal x3, _start # -8
jalr x4, 2(x5)
beq x6, x7, _start # -16
bne x8, x9, _start # -20
blt x10, x11, _start # -24
bge x12, x13, _start # -28
bltu x14, x15, _start # -32
bgeu x16, x17, _start # -36
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jal x3, .-8 should work. . means the current address in most assemblers, so this creates a relative offset.

@spholz
Copy link
Member

spholz commented Aug 25, 2025

Manually diffing our disasm output with objdump output, I found these two things (aside from formatting differences):

Input assembly: jalr x4, 2(x5)
We disassemble it to: jalr tp, t0, 0x20a <_PROCEDURE_LINKAGE_TABLE_+0xa>
objdump: jalr tp, 2(t0)

We seem to incorrectly format JALR (the JumpAndLinkRegister class itself seems to have correct values).

The other thing is that we format fence.tso as fence.tso rw, rw. The rw, rw should be unnecessary, since fence.tso always has this predecessor and successor set.

@spholz spholz added ⏳ pr-waiting-for-author PR is blocked by feedback / code changes from the author and removed 👀 pr-needs-review PR needs review from a maintainer or community member labels Aug 30, 2025
@kleinesfilmroellchen
Copy link
Collaborator Author

Very good catches. I hope I can get to this today.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
⏳ pr-waiting-for-author PR is blocked by feedback / code changes from the author
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants