You probably already know that I spent more than a year having fun with RISC-V and software bootstrapping from source.
GCC is probably the most used compiler collection, period. With GCC we can compile the world and have a proper distribution directly from source, but who compiles the compiler? [1]
Well, someone has to.
Bootstrapping a compiler with a long history like GCC for a new architecture like RISC-V involves some complications, starting with the fact that the first version of GCC that supports RISC-V needs a C++98-capable compiler in order to build. C++98 is a really complex standard, so there’s no way we can bootstrap a C++98 compiler for RISC-V at the moment. The easiest way we can think of at this point is to use an older version of GCC for that, one of those that can build C++98 programs but only require a C compiler to build. Older versions of GCC, of course, don’t have RISC-V support so… We need a backport [2].
So that’s what I’m doing right now. I’m taking an old version of GCC that only depends on C89 and is able to compile C++98 code and I’m porting it to RISC-V so we can build newer GCCs with it.
Needing only C to compile it is a huge improvement, because there are Tiny C Compilers out there that can compile C to RISC-V, and those are written in simple C that we can bootstrap with simpler tools of a more civilized world.
- C++98 is too complex, but C89 is fine.
- GCC is the problem and also the solution.
What about GNU Mes?
When we [3] started this effort we wanted to prepare GNU Mes, a small C compiler that is able to compile a Tiny C Compiler, to work with RISC-V so we could start working on this bootstrap process from the bottom.
Some random events, like someone else working on that part, made us rethink our strategy so we decided to start from the top and try to combine both efforts at the end. We share the same goal: full source bootstrap for RISC-V.
Tiny C Compilers?
There are many small C compilers out there that are written in simple C and are able to compile an old GCC that is written in C. Our favorite is TinyCC (Tiny C Compiler).
GNU Mes is able to build a patched version of TinyCC, which already supports RISC-V (RV64 only), and we can use that TinyCC to compile the GCC version I’m backporting.
We’d probably need to patch some things in both projects to make everything work smoothly but that’s also included in the project plan.
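To make the order of the chain explicit, here is a minimal sketch that models the stages as a dependency graph and sorts them. The stage names are informal labels taken from this post, not real package names or versions, and whatever GNU Mes itself needs underneath is left out.

```python
from graphlib import TopologicalSorter

# Each stage maps to the stages that must already exist before it can be built.
# Labels are informal and illustrative; they are not package names or versions.
BOOTSTRAP_DEPS = {
    "GNU Mes": set(),  # its own prerequisites are out of scope here
    "patched TinyCC (RV64)": {"GNU Mes"},
    "old GCC with RISC-V backport": {"patched TinyCC (RV64)"},
    "first GCC with RISC-V support": {"old GCC with RISC-V backport"},
}


def build_order(deps):
    """Return the stages so that every stage comes after its prerequisites."""
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    for step, stage in enumerate(build_order(BOOTSTRAP_DEPS), start=1):
        print(f"{step}. {stage}")
```

The graph is a straight line for now, but keeping it as a graph makes it easy to add side dependencies such as binutils later.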
Binutils is also a problem, mostly because GCC, as we will talk about in the future, does not compile to binary directly. GCC generates assembly code and coordinates calls to as and ld (the GNU Assembler and Linker) to generate the final binaries. Thankfully, TinyCC can act as an assembler and a linker, and there’s also the chance to compile a modern binutils version because it is written in C.
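As a rough illustration of that division of labour, the sketch below only builds the three command lines a driver would run for one C file; it does not execute them. The helper name `driver_steps` and the file names are made up for the example, and the link line is simplified: a real link also needs startup files and a C library, which the GCC driver normally adds on its own.

```python
from pathlib import Path


def driver_steps(source, output="a.out"):
    """Return the compile, assemble and link commands for one C source file."""
    src = Path(source)
    asm = src.with_suffix(".s")  # assembly text produced by the compiler
    obj = src.with_suffix(".o")  # object file produced by the assembler
    return [
        ["gcc", "-S", str(src), "-o", str(asm)],  # C -> assembly
        ["as", str(asm), "-o", str(obj)],         # assembly -> object file
        ["ld", str(obj), "-o", output],           # object file -> binary (simplified)
    ]


if __name__ == "__main__":
    for command in driver_steps("hello.c", "hello"):
        print(" ".join(command))
```

Any tool that can stand in for the second and third steps, as TinyCC can, removes the need for binutils at that stage of the bootstrap.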
In any case, binary file generation and support must be taken into account, because GCC is not the only actor in this film and RISC-V has some weird things in the assembly and the binaries that have to be supported correctly.
This is a very interesting project, where I need to dig into BIG stuff, which is cool, but it also has a huge level of uncertainty, which scares the hell out of me. I hope everything goes well…
In any case, I’ll share all I learn here on the blog and I’ll keep you all posted with the news we have.
PS: Big up to NLnet / NGI-Assure for the money.
[1] wHo wATcHes tHE wAtchMEN?
[2] Insert “Back to the Future” music here.
[3] “We” means I shared my thoughts and plans with other people who have a much better understanding of this than myself.
[4] Or even hire me for some freelance IT stuff 🤓