In the latest post I summarized the last year because I wanted to talk about what I’m doing now. Just now I realized that almost half of 2021 is already gone, so following the breadcrumbs up to this day could be a difficult task. That’s why I won’t give you more context than this: RISC-V is a deep, deep hole.
I told you I was researching programming languages, and that made me research a little bit about ISAs. That’s how I started reading about RISC-V, and I realized learning about it was a great idea for many reasons: it’s a new thing and as an R&D engineer I should keep up to date, and the book I chose is really good [1] and gives a great description of the design decisions behind RISC-V.
From that, and I don’t really know how, I started taking part in the effort to
port Guix to RISC-V. One of the things I’m working on right now is the
port of the machine code generation library that Guile uses, called
Lightening, to RISC-V, and that’s what I’m talking about today.
Lightening is a lightweight fork of GNU Lightning, a machine code generation library that can be used for many things that need to abstract away the target CPU, like JIT compilers.
The design of GNU Lightning is easy to understand. It exposes a set of instructions inspired by RISC machines; you use those, and the library maps them to actual machine instructions on the target CPU and returns a pointer to the generated function. Simple stuff.
The code is not that easy to understand: it makes pretty aggressive and clever use of C macros that I’m not that used to reading, so it is a little bit hard for me.
I could try to explain the reasons behind the fork, but the guy who did it, who is also the maintainer of Guile, explains it much better than I could. But at least I can summarize: Lightening is simpler and it fits better what Guile needs for its JIT compiler.
So Lightening is basically simpler but the idea is the same. But how do you port a library like that to another architecture?
The idea is kind of simple, but we need to talk about the basics first.
Lightening (and GNU Lightning too, but we are going to specifically talk about
Lightening from here) emulates a fake RISC machine with its functions. It
provides instructions such as addr and so on. Basically, all those are C functions
you call, but they actually look like assembly. Look at a random example
taken from the project:
jit_begin(j, arena_base, arena_size);
size_t align = jit_enter_jit_abi(j, 0, 0, 0);
jit_load_args_2(j, jit_operand_gpr (JIT_OPERAND_ABI_WORD, JIT_R0),
                jit_operand_gpr (JIT_OPERAND_ABI_WORD, JIT_R1));
jit_addr(j, JIT_R0, JIT_R0, JIT_R1);
jit_leave_jit_abi(j, 0, 0, align);
size_t size = 0;
void* ret = jit_end(j, &size);
int (*f)(int, int) = ret;
ASSERT(f(42, 69) == 111);
Basically, you can see we get the f function from the calls to the library,
which include the preparation of the arguments (jit_load_args_2)
and the actual body of the function: jit_addr. The word addr comes from
“add” and “registers”, so you can understand what it does: it adds the contents of
two registers and stores the result in another register.
The registers have understandable names like JIT_R0 and JIT_R1, which are
basically the register number (the R comes from “register”).
So, if you check the jit_addr line you can understand it’s adding the
contents of register 0 and register 1 and storing the result in
register 0 (the first register argument is the destination).
That’s pretty similar to RISC-V’s add instruction, isn’t it?
Well, it’s basically the same thing. The only problem is that we have to emit
the machine code associated with the add, not just write it down as text,
and we also need to declare which are the registers of
our actual machine.
Thankfully, the library already has all the machinery to do all that. There
are functions that emit the code for us, and we can also define some
mappings, like JIT_R0 to the RISC-V a0 register, and so on.
We just need to make new files for RISC-V, define the mappings and add a little bit of glue around.
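To make the idea more concrete, here is a minimal sketch of what "emitting an add" boils down to. Every name in it (rv_rtype, rv_add, emit_u32, SKETCH_R0, SKETCH_R1) is made up for illustration and is not Lightening's actual API; the register mapping is only an assumption. The bit layout is the standard RISC-V R-type format.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration only; Lightening's real RISC-V
   backend may be organized differently. */
enum rv_reg { RV_A0 = 10, RV_A1 = 11 };  /* a0 and a1 are x10 and x11 */

/* Assumed mapping from the virtual registers to real ones. */
#define SKETCH_R0 RV_A0
#define SKETCH_R1 RV_A1

/* Pack the fields of an R-type instruction into one 32-bit word. */
static uint32_t rv_rtype(uint32_t funct7, uint32_t rs2, uint32_t rs1,
                         uint32_t funct3, uint32_t rd, uint32_t opcode)
{
  assert(rd < 32 && rs1 < 32 && rs2 < 32);
  assert(funct7 < 128 && funct3 < 8 && opcode < 128);
  return (funct7 << 25) | (rs2 << 20) | (rs1 << 15)
       | (funct3 << 12) | (rd << 7) | opcode;
}

/* add rd, rs1, rs2: opcode 0x33, funct3 0, funct7 0. */
static uint32_t rv_add(uint32_t rd, uint32_t rs1, uint32_t rs2)
{
  return rv_rtype(0x00, rs2, rs1, 0x0, rd, 0x33);
}

/* Store one instruction in the code buffer (little-endian) and return
   the position right after it. */
static uint8_t *emit_u32(uint8_t *pc, uint32_t insn)
{
  pc[0] = (uint8_t)(insn & 0xff);
  pc[1] = (uint8_t)((insn >> 8) & 0xff);
  pc[2] = (uint8_t)((insn >> 16) & 0xff);
  pc[3] = (uint8_t)((insn >> 24) & 0xff);
  return pc + 4;
}
```

With that mapping, rv_add(SKETCH_R0, SKETCH_R0, SKETCH_R1) produces the word for add a0,a0,a1, and emit_u32 writes it byte by byte into the buffer that will later be called as a function.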
All that sounds simple and easy (on purpose), but it’s not that easy.
Some instructions that Lightening provides don’t have a simple mapping to RISC-V and we need to play around with them.
There’s an interesting example: movi (move immediate to register).
Loading an immediate to a register is something that sounds extremely simple,
but it’s more complex than it looks. The RISC-V assembly has a
pseudoinstruction for that, called li (load immediate), that can be literally
mapped to movi. The main problem is that pseudoinstructions don’t really
exist in the hardware.
You all know there are CISC and RISC machines. CISC machines were a way to make compilers simpler, pushing that complexity to the hardware. RISC machines are the other way around.
RISC hardware tends to be simple and have few instructions; the compiler is the one that has to do the dirty work, trying to make the programmer’s life better.
Pseudoinstructions are a case of that. The programmer only wants to load a constant to a register but real life can be very depressing. When you want to load an immediate you don’t want to think about the size of it, if it fits a register you are fine, aren’t you?
Pseudoinstructions are expanded to actual instructions by the assembler, so you don’t need to worry about those details. In fact, RISC-V doesn’t really have move instructions, they are all pseudoinstructions that are expanded to something like:
addi destination, source, 0
Which means “add 0 to source and store the result in destination”.
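As a small sketch of that expansion, this is how a move could be encoded as an addi with a zero immediate. The helper names rv_addi and rv_mv are hypothetical; the bit layout is the standard RISC-V I-type format (12-bit immediate, opcode 0x13 and funct3 0 for addi).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: encode addi rd, rs1, imm12 (I-type).
   The immediate is a signed 12-bit value. */
static uint32_t rv_addi(uint32_t rd, uint32_t rs1, int32_t imm12)
{
  assert(rd < 32 && rs1 < 32);
  assert(imm12 >= -2048 && imm12 <= 2047);
  return (((uint32_t)imm12 & 0xfff) << 20) | (rs1 << 15)
       | (0x0u << 12) | (rd << 7) | 0x13u;
}

/* There is no real move instruction: mv rd, rs is just addi rd, rs, 0. */
static uint32_t rv_mv(uint32_t rd, uint32_t rs)
{
  return rv_addi(rd, rs, 0);
}
```

For example, rv_mv(10, 10) gives the word for mv a0,a0 (a0 is x10), which is exactly the same word as addi a0,a0,0.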
The li pseudoinstruction is a very interesting case, because the expansion is
kind of complex; it’s not just a one-to-one conversion.
In RISC-V all the instructions are 32 bits wide (or 16 if you take into account the compressed instruction extension) and the registers are 32 bits wide in RV32 and 64 bits wide in RV64. You see the problem, right? No 32-bit instruction is able to load a full register at once, because that would mean that all the bits available for the instruction (or more!) would be needed to store the immediate.
Depending on the size of the immediate you want to load, li
can be expanded to just one instruction (addi), two (lui and addi) or, if
you are in RV64, to a series of eight instructions (lui and addi followed by
slli and addi pairs). There are also sign extensions in the
middle that make the whole process even funnier.
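To show where the sign extension bites, here is a minimal sketch of the two-instruction case only: splitting a 32-bit immediate into the 20 bits that go in lui and the 12 bits that go in addi. The names li_parts, split_imm32 and rebuild_imm32 are made up for this example and it is not how Lightening necessarily does it. The RV64 case needs more care and is not covered here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the two pieces of a 32-bit immediate. */
struct li_parts {
  uint32_t hi20;  /* goes in the 20-bit immediate field of lui */
  int32_t lo12;   /* signed 12-bit immediate for addi */
};

static struct li_parts split_imm32(int32_t imm)
{
  struct li_parts p;
  /* addi sign-extends its immediate, so take the low 12 bits as a
     signed number in the range [-2048, 2047]. */
  p.lo12 = (int32_t)(((uint32_t)imm & 0xfff) ^ 0x800) - 0x800;
  assert(p.lo12 >= -2048 && p.lo12 <= 2047);
  /* Compensate in the upper part: when lo12 is negative, lui has to
     load one more than the plain upper 20 bits. */
  p.hi20 = ((uint32_t)imm - (uint32_t)p.lo12) >> 12;
  return p;
}

/* What lui followed by addi computes, in 32-bit arithmetic. */
static uint32_t rebuild_imm32(struct li_parts p)
{
  return (p.hi20 << 12) + (uint32_t)p.lo12;
}
```

With 0x12345678 the split is the obvious one (0x12345 and 0x678), but with 0x12345fff the low part becomes -1 and the upper part has to be bumped to 0x12346. If the upper part comes out as zero, the lui can be skipped and a single addi is enough.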
Of course, as we are generating the machine code, we can’t rely on an assembler to do the dirty work for us: we need to expand everything ourselves.
So, something that looked extremely simple, the implementation of an obvious instruction, can get really messy, so we need a reasonable way to check if we did the expansions correctly.
And we haven’t even talked about those instructions that don’t have a clear mapping to the machine!
Don’t worry: we won’t. I just wanted to point out the need for proper tools for this task.
The debugging process is not as complex as I thought it was going to be, but my setup is a little bit of a mess, basically because I’m on Guix, which doesn’t have proper support for RISC-V, so I can’t really test on my machine (if there’s a way please let me know!).
I’m using an external Debian Sid machine (see acknowledgements below) for it.
I basically followed the Debian tutorial for cross compilation environments and Qemu and everything is perfectly set for the task.
Next: how to debug the code?
I’m using Qemu as a target for GDB, so I can run a binary on Qemu like this:
qemu-riscv64-static -g 1234 test-riscv-movi
Now I can attach GDB to that port and disassemble the *f function that was
returned from Lightening to see if the expansion is correct:
GNU gdb (Debian 10.1-2) 10.1.90.20210103-git
For help, type "help".
Type "apropos word" to search for commands related to "word".
(gdb) file lightening/tests/test-riscv-movi
Reading symbols from lightening/tests/test-riscv-movi...
(gdb) target remote :1234
Remote debugging using :1234
0x0000000000010538 in _start ()
(gdb) break movi.c:15
Breakpoint 1 at 0x1d956: file movi.c, line 15.
Breakpoint 1, run_test (j=0x82e90, arena_base=0x4000801000
"\023\001\201\377#0\021", arena_size=4096) at movi.c:15
15 ASSERT(f() == 0xa500a500);
(gdb) disassemble *f,+100
Dump of assembler code from 0x4000801000 to 0x4000801064:
0x0000004000801000: addi sp,sp,-8
0x0000004000801004: sd ra,0(sp)
0x0000004000801008: lui a0,0x0
0x000000400080100c: slli a0,a0,0x20
0x0000004000801010: srli a0,a0,0x21
0x0000004000801014: mv a0,a0
0x0000004000801018: slli a0,a0,0xb
0x000000400080101c: addi a0,a0,660 # 0x294
0x0000004000801020: slli a0,a0,0xb
0x0000004000801024: addi a0,a0,20
0x0000004000801028: slli a0,a0,0xb
0x000000400080102c: addi a0,a0,1280
0x0000004000801030: ld ra,0(sp)
0x0000004000801034: addi sp,sp,8
0x0000004000801038: mv a0,a0
Of course, I can debug the library code normally, but the generated code has to be checked like this, because there’s no debug symbol associated with it and GDB is lost in there.
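Reading a dump like that by eye is error prone, so it can help to replay the arithmetic. The following sketch simply re-executes the instructions between the prologue and the epilogue of the dump on a plain 64-bit variable standing in for a0. The helper names are made up for the example, and it only models the few instructions that appear in the dump.

```c
#include <assert.h>
#include <stdint.h>

/* Tiny models of the instructions in the dump, on a 64-bit register. */
static uint64_t slli(uint64_t v, unsigned sh) { assert(sh < 64); return v << sh; }
static uint64_t srli(uint64_t v, unsigned sh) { assert(sh < 64); return v >> sh; }
static uint64_t addi(uint64_t v, int imm)
{
  assert(imm >= -2048 && imm <= 2047);  /* 12-bit signed immediate */
  return v + (uint64_t)(int64_t)imm;
}

/* Replays the movi expansion from the disassembly, line by line. */
static uint64_t replay_movi_expansion(void)
{
  uint64_t a0 = 0;        /* lui  a0,0x0      */
  a0 = slli(a0, 0x20);    /* slli a0,a0,0x20  */
  a0 = srli(a0, 0x21);    /* srli a0,a0,0x21  */
  a0 = addi(a0, 0);       /* mv   a0,a0       */
  a0 = slli(a0, 0xb);     /* slli a0,a0,0xb   */
  a0 = addi(a0, 660);     /* addi a0,a0,660   */
  a0 = slli(a0, 0xb);     /* slli a0,a0,0xb   */
  a0 = addi(a0, 20);      /* addi a0,a0,20    */
  a0 = slli(a0, 0xb);     /* slli a0,a0,0xb   */
  a0 = addi(a0, 1280);    /* addi a0,a0,1280  */
  return a0;
}
```

The result is 0xa500a500, the same value the test asserts on line 15 of movi.c, so the expansion in the dump loads the right constant.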
Important stuff. Take notes.
It’s weird to have acknowledgments in a random blog post like this one, but I have to thank my friend Fanta for preparing a Debian machine I can use for all this.
I’d also like to thank Andy Wingo for the disassembly trick you just read. Yeah, there was no chance I would have discovered that by myself!
All the process can be followed in the gitlab of the project where I added a Merge Request. Feel free to comment and propose changes.
There’s still plenty of work to do. I only implemented the basics of the ALU and some configuration of the RISC-V context, like the registers and all that, but I’d say the project is heading in the right direction.
I don’t know if I’m going to be able to spend as much time as I want on it, but I’m surely going to keep adding new instructions and eventually try to wrap my head around how jumps are implemented.
It’s going to be a lot of fun, that’s for sure.