In our previous blogpost in the series we described how GNU Mes’ scheme interpreter handles evaluation and how fast it was. There were some details that weren’t really obvious, though: why does it really do all that magic to emulate function calls? Is it just to make it look like scheme?
In this blog post we’ll explain how does the garbage collector work in GNU Mes,
try to take a new look to the eval_apply
function from the perspective of the
garbage collector, and propose an improvement that would enable a different
style of programming.
GNU Mes’ garbage collection
GNU Mes’ scheme interpreter has a very simple approach to memory management. It
allocates a big chunk of memory when the interpreter is run and that is used
for every scheme value. All scheme values are represented with the same
struct
, called struct scm
, and every make_
whatever function returns a
pointer to an struct scm
bump-allocated in that big chunk of memory which
we’ll call arena.
The bump-allocation is very easy to do, and doesn’t require much computation. It’s just in increase in the pointer to the first free element in the arena.
If we just had this, our scheme interpreter would just eat memory and never release or reuse the space we allocated for the scheme values we are not using anymore. This is a possible approach, really, but GNU Mes does provide garbage collection so the memory is reused and we don’t run out of our precious resource.
Mes uses a stop-the-world copying garbage collector. Similar to the one described in the “Structure and Interpretation of Computer Programs” (aka SICP) book or the one described in This Wikipedia entry about Cheney’s garbage collection algorithm based on a paper from 1970. It starts in few scheme values and copies them from the current arena to some other piece of memory, one after the other. It also check their internal values like elements in a list, a vector or so on, and copies those too, adjusting the pointers that refer to them. When a scheme value is copied, its memory representation is tagged with a broken heart (this appears in SICP), something that identifies that value as already processed, and the pointer to the new location replaces its value. Now, it can check if an object that is being copied points to a value that has been already copied, and in that case, it only needs to update the pointer of its reference.
Once all the objects and its references have been processed, what we have is a new memory space that contains only the objects in use, one after the other, with no empty spaces in between. This is why this process is called also a compacting garbage collector.
In Mes, once the new piece of memory is set up, it is moved back to the
original position, element by element (remember all elements are the same
size). In a process that in Mes is called flip
. The goal of the flip is to
keep the memory footprint low, only requiring a maximum memory of the twice of
the current memory usage (and oftentimes way less) which, potentially, is
smaller than having two fixed semi-spaces as this kind of garbage collectors
often have.
But now two things are left to explain: when is this whole process run and how it is kickstarted.
The garbage collector is run during the interpreter execution. When the
interpreter’s programmer decides to (by adding a call to the garbage collection
function, gc_check
). This function will check if the memory usage is large
enough so the garbage collecting is worth the effort and, in affirmative case,
then run the garbage collector process.
The first thing the garbage collection process does is copy a group of objects that will kickstart the copying of the rest of the live objects, meaning those objects that the running scheme program still has access to. Those objects are called roots, and they are fundamental to understand how this whole thing can be improved. The roots, more than being a great hip hop band, are easy to deduce what they can be in the interpreter: the current environment, the stack, the known symbols, and such.
Currently in Mes the roots are fixed, meaning they are hardcoded in the code, like so:
struct scm *new_cell_nil = g_free;
struct scm *s;
for (s = cell_nil; s < g_symbol_max; s = s + M2_CELL_SIZE)
gc_copy (s);
g_symbols = gc_copy (g_symbols);
g_macros = gc_copy (g_macros);
g_ports = gc_copy (g_ports);
scm_hash_table_type = gc_copy (scm_hash_table_type);
scm_variable_type = gc_copy (scm_variable_type);
M0 = gc_copy (M0);
M1 = gc_copy (M1);
long i;
for (i = g_stack; i < STACK_SIZE; i = i + 1)
copy_stack (i, gc_copy (g_stack_array[i]));
gc_loop (new_cell_nil);
The issues
This kind of copying garbage collectors have a really two very obvious problems that affect the way we write our C code.
Consider this code:
struct scm *val = make_number (10); // Makes a scheme value = 10
gc_check (); // May garbage collect
val->value; // What do we have there?
If we make a number like in the example and then garbage collect, the number will be allocated in the arena, but during garbage collection it won’t have anything pointing to it so it would be cleaned from the memory. Interestingly enough, it won’t probably produce a segfault or any kind of error like that, because it will try to access to the arena, and that’s valid memory, but it would still explode in very dirty ways afterwards, maybe after some other garbage collection loops. Nasty stuff.
Also, even if you keep the variable from being collected, it would be moved by the garbage collector, which will give you a very low chance of the value falling in the same spot, so your pointer would be outdated, and you wouldn’t have any way to know where the variable is now.
Do you want an easy solution?
First, register that val
as another root in the garbage collector, so it’s
not lost in each loop. The current garbage collector in Mes needs to hardcode
the roots, so if this example were just a local variable for a function, you’d
need to make it global for the garbage collector to find it. Of course, you
shouldn’t do this for every single variable in your code, but if you make sure
you only need a couple of temporaries you could call them R1
and R2
and use
those all the time for everything1.
That would leave you with the moving problem. After a garbage collector run,
your variables R1
and R2
could potentially point to a memory location where
they are not anymore, because of the copying.
At this point if you were really inspired, you could implement a stack of those
lets call them “registers” (R1
, R2
…), register it as a root, and there
you would have emulated a function call system that respects your garbage
collector. Oh! But isn’t that what we described in the previous blog post?
Implementing a stack that knows where to push and where to pop (those global
registers) would make every single piece of your code keep your pointers to the
variables coherent in the C world. You would just need to refer to R1
or
whatever, and it would always point to the proper memory location after the pop
is run.
But of course this is a little bit of a Henry Ford situation here: “everybody
will have a car of the color they like as long as it is black”. So you can
have all the variables you want as long as they are R0
, R1
…
One solution
Consider this code:
struct scm *
macro_expand_program (struct scm *program)
{
struct scm *cursor = program;
gc_preserve (&program);
for (cursor = program; cursor != cell_nil; cursor = cursor->cdr)
{
gc_preserve (&cursor);
cursor->car = macro_expand_expr (cursor->car); // May GC
gc_release (1); // cursor
}
gc_release (1); // program
return program;
}
In this piece of code we can see how macro_expand_expr
could try to garbage
collect and make us lose references to our local variables. By calling
gc_preserve
we register them as roots, and we can release them later, when we
don’t need them anymore. We send the address to the pointer to the garbage
collector so it’s able to fix the pointers when the copying is done.
This looks boring and boilerplatey, but we would only need it for those variables that would cross the function barrier to functions that we know that may garbage collect. The rest of the cases we can just use an ordinary C variable without preservation.
The improvements I see here is we can call the variables whatever we want, use
as many as we want and actually use C function calls without all the goto
and
push_cc
ceremony we described in the previous article, which is also
extremely confusing to me.
Also, we could get rid of all the other global roots pushing them first and never releasing them.
How is this thing implemented
I have to admit it’s not the greatest solution in the world but it works well enough if you are careful.
First, I added a new scheme value type for roots, that can store a reference to an scheme value and the address of the pointer to update. It just covers the simplest case.
The gc_preserve
function adds a root to a global list of preserved roots. The
gc_release
function removes the top n
elements from the list of preserved roots.
With that ready, the garbage collector only needs to handle that global list as
any other hardcoded root we mentioned before, but after the gc_flip
the
pointers stored in the roots themselves are updated. This means the pointers in
our C code would be updated when we exit the garbage collection operation,
keeping our C pointers consistent with the memory.
So, that leaves us with this copy step:
@@ -647,6 +695,8 @@ gc_ ()
gc_copy (s);
g_symbols = gc_copy (g_symbols);
+ if (g_preserved != NULL)
+ g_preserved = gc_copy (g_preserved);
g_macros = gc_copy (g_macros);
g_ports = gc_copy (g_ports);
scm_hash_table_type = gc_copy (scm_hash_table_type);
And this flip step:
@@ -464,6 +503,15 @@ gc_flip ()
long dist = g_news - g_cells;
g_free = g_free - dist;
g_symbols = g_symbols - dist;
+ if (g_preserved != NULL)
+ {
+ g_preserved = g_preserved - dist;
+ /* Update the preserved pointers */
+ struct scm *s;
+ for (s = g_preserved; s != cell_nil; s = s->cdr)
+ *(s->car->ptr) = s->car->car;
+ }
+
g_macros = g_macros - dist;
g_ports = g_ports - dist;
scm_hash_table_type = scm_hash_table_type - dist;
As my roots are copied and flipped the values they point at are also copied and flipped. That way I only need to update the pointers using the new address of the pointed objects.
And that’s it.
The only thing missing is to use the gc_preserve
and gc_release
properly.
It’s easy to misuse them and overrelease or overpreserve.
I’m writing a macro expander using this and happens to be somewhat comfortable,
at least way more comfortable than the goto and push_cc
style. I was having a
very hard time with that one.
We’ll see how it goes.
Cheers!
-
Oh! Yes! I know what you are thinking. Trust me. ↩