Ekaitz's tech blog:
I make stuff at ElenQ Technology and I talk about it

ELF format — why not?

From the series: Bootstrapping GCC in RISC-V

In the previous post of the series we introduced GCC and how it generates assembly code, and we left a question unanswered: “Why is learning about ELF interesting if GCC generates assembly?”. In this post we are going to answer that question (not interesting) and maybe understand the very basics of the ELF file format (more interesting).

What’s ELF

ELF is a file format with two main goals:

  • Represent an executable file
  • Represent a linkable file

Apart from that, ELF can also represent core dumps, but if you think about it, all of the possible options have something in common: they represent the contents of memory. We can simply say ELF is a file format that acts as a picture of the state of the memory. In the case of executables, the state is loaded from the file, while in the case of core dumps the state is obtained from memory and dumped to a file.

Linkable files are those files that can be combined with others to generate executables or shared objects, so they can also fit that definition because they are going to end up in the memory anyway.

For efficiency reasons, the ELF format has two separate views of the same contents:

  • The Linking view is based on sections and needs a section header.
  • The Executable view is based on segments and needs a program header.

ELF header

The ELF header is the only thing that has a fixed position in the file: the very beginning. It has information that identifies the file, the machine, the endianness and that sort of thing, but it also says where the section header and the program header are located, how big their entries are and how many entries they have.

It’s not that interesting, honestly. The most important thing is that it points to the descriptions of both views (the headers) so we can check them.
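
As a rough illustration of what a tool reads first, here is a minimal Python sketch that unpacks a 64-bit little-endian ELF header with the standard struct module. The field names follow the usual Elf64_Ehdr naming. The sample header is built by hand with made-up values (only the machine number 243, EM_RISCV, and the 64-byte header size are real constants), so the block runs without any file.

```python
import struct

# Elf64_Ehdr, little-endian: 16 bytes of e_ident followed by 13 fields.
EHDR64 = struct.Struct("<16sHHIQQQIHHHHHH")
FIELDS = ("e_ident", "e_type", "e_machine", "e_version", "e_entry",
          "e_phoff", "e_shoff", "e_flags", "e_ehsize", "e_phentsize",
          "e_phnum", "e_shentsize", "e_shnum", "e_shstrndx")

def parse_elf64_header(data):
    """Return the header fields of a 64-bit little-endian ELF as a dict."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")
    return dict(zip(FIELDS, EHDR64.unpack_from(data, 0)))

# Hypothetical header of a relocatable object: offsets and counts are made up.
ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
fake = EHDR64.pack(ident, 1, 243, 1, 0, 0, 0x300, 0, 64, 0, 0, 64, 11, 10)

header = parse_elf64_header(fake)
print("section headers at", hex(header["e_shoff"]),
      "count", header["e_shnum"], "entry size", header["e_shentsize"])
```

With a real object you would pass the file contents instead, for example parse_elf64_header(open("b.o", "rb").read()).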

Linking view

Based on sections, the linking view is the most detailed view of the file and it defines how the file should be linked with others in order to create an executable file.

Sections, the basic unit of the linking view, are consecutive sequences of bytes that do not overlap.

There are different types of sections according to their possible contents and meaning. The most interesting are:

  • SYMTAB and DYNSYM hold a symbol table. DYNSYM is for dynamic linking symbols, while SYMTAB is normally used for static linking but may contain both.
  • STRTAB holds a string table.
  • RELA contains relocation entries with addends and REL contains relocation entries without addends.
  • NOTE contains auxiliary information about the file.
  • HASH contains a symbol hash table, necessary for dynamic linking.
  • DYNAMIC holds dynamic linking information.

Each section also has a name, an address if it is supposed to appear in the memory of a running process, an offset that says where in the file the section’s contents are, a size, and some extra data fields. All together they form a section header entry.

The section header entries are all located where the ELF header says, one after the other (like a C array of structures), so the programs just need to access that position in the file and read all the headers in a row. The contents of the sections are located throughout the file, where the section headers point.
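
The "C array of structures" idea can be sketched in Python like this. The layout is the standard 64-byte Elf64_Shdr entry; the sample data is hand-built with made-up values, and the three arguments are the numbers the ELF header provides (e_shoff, e_shentsize, e_shnum).

```python
import struct

# Elf64_Shdr, little-endian, 64 bytes per entry.
SHDR64 = struct.Struct("<IIQQQQIIQQ")
SH_FIELDS = ("sh_name", "sh_type", "sh_flags", "sh_addr", "sh_offset",
             "sh_size", "sh_link", "sh_info", "sh_addralign", "sh_entsize")

def read_section_headers(data, e_shoff, e_shentsize, e_shnum):
    """Read e_shnum consecutive entries starting at file offset e_shoff."""
    headers = []
    for i in range(e_shnum):
        start = e_shoff + i * e_shentsize
        headers.append(dict(zip(SH_FIELDS, SHDR64.unpack_from(data, start))))
    return headers

# Made-up file: 16 filler bytes, then a NULL entry and a PROGBITS entry.
blob = (bytes(16)
        + SHDR64.pack(0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
        + SHDR64.pack(1, 1, 0x6, 0, 0x40, 0x34, 0, 0, 4, 0))

headers = read_section_headers(blob, e_shoff=16, e_shentsize=64, e_shnum=2)
for h in headers:
    print(h["sh_type"], hex(h["sh_offset"]), hex(h["sh_size"]))
```

Note that sh_name is not text: it is an index into a string table.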

String section

The string section (STRTAB) is one of the simplest. It contains strings used by the rest of the file, such as section and symbol names. It’s simply a set of null-terminated strings, written one after the other (it also starts with a null character, but whatever).

Anywhere in the file where we are supposed to get a string, what we actually get is an index that points to the first position in this section to read from. We read from there until we reach a null character. For example, in the following string section:

    \0 h e l l o \0 n a m e \0

If the name field of a section says 1, the actual name of the section is hello, and if it says 7 it is name. Also, if it says 9 it is me: two names can share a suffix, and this trick is allowed.
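
The lookup above fits in a few lines of Python. This is only a sketch; read_string is a name made up for the example.

```python
def read_string(strtab, index):
    """Return the null-terminated string that starts at `index`."""
    end = strtab.index(b"\0", index)
    return strtab[index:end].decode("ascii")

# The same string section as in the example above.
strtab = b"\0hello\0name\0"

for index in (1, 7, 9):
    print(index, read_string(strtab, index))
```

Index 0 always yields the empty string, which is why the section starts with a null character.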

Symbol table

The symbol table contains information needed to locate and relocate a program’s symbolic definitions and references. The symbol table is an array of symbol entries, each defined by a name (an index into a string table), a value, a size, some extra info, the index of the section header they relate to (shndx) and some other stuff.

The info field encodes the symbol’s type (OBJECT for data, FUNC for function…) and its binding attributes, which define the linking visibility and behavior of the symbol (local vs global…).

The value can be interpreted in several ways too, depending on the type of the symbol you are dealing with. But that’s not really relevant for us at the moment.
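
To make the symbol entry a bit more concrete, here is a small Python sketch that unpacks one 64-bit symbol entry (the standard 24-byte Elf64_Sym layout) and splits the info byte into binding (high 4 bits) and type (low 4 bits). The sample entry is hand-built with made-up name and section indexes.

```python
import struct

# Elf64_Sym, little-endian: name, info, other, shndx, value, size.
SYM64 = struct.Struct("<IBBHQQ")

BINDINGS = {0: "LOCAL", 1: "GLOBAL", 2: "WEAK"}
TYPES = {0: "NOTYPE", 1: "OBJECT", 2: "FUNC", 3: "SECTION", 4: "FILE"}

def parse_symbol(data, offset=0):
    """Decode one symbol table entry into a dict."""
    name, info, other, shndx, value, size = SYM64.unpack_from(data, offset)
    return {
        "name_index": name,            # index into a string table
        "binding": BINDINGS.get(info >> 4, "?"),
        "type": TYPES.get(info & 0xF, "?"),
        "shndx": shndx,
        "value": value,
        "size": size,
    }

# Made-up entry that looks like a global function of 0x34 bytes.
raw = SYM64.pack(21, (1 << 4) | 2, 0, 1, 0, 0x34)
symbol = parse_symbol(raw)
print(symbol["binding"], symbol["type"], hex(symbol["size"]))
```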

Relocation

According to the ELF documentation I got from somewhere I don’t really remember:

The relocation is the process of connecting symbolic references with symbolic definitions.

I hope it’s more explanatory for you than it is for me, because I don’t have a clue of what that is supposed to mean. Wikipedia does a much better job with the specifics:

Relocation is the process of assigning load addresses for position-dependent code and data of a program and adjusting the code and data to reflect the assigned addresses.

If this doesn’t really help, you have a really good example later, but we can basically say that it’s a way to adjust the code to point to the correct addresses, at link time, at load time, or even at execution time.

ELF files have, as we said, sections that let us define relocations. These point to some parts of the file and tell the linker or the loader that those positions of the file must be reprocessed.

There are two types of relocation sections, and in both of them the section is an array of entries where each entry represents one relocation. In the simple one (REL) each relocation only contains an offset and an info word, which also includes the type of relocation to apply. The more complex one (RELA) is mostly the same, but it adds an addend: a constant value to use in the calculation of the relocation.

The calculation of the final addresses is specific to the ISA and the relocation type, because processors have different instruction formats and different ways to pack addresses in instructions. RISC-V has no way to pack a full address inside a single instruction, while x86 does, so their instructions have to be patched in different ways.
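
A relocation entry with addend can be decoded with a sketch like the following. It assumes the 64-bit layout (Elf64_Rela), where the info word packs the symbol index in its high 32 bits and the relocation type in its low 32 bits. The three type numbers are, to my knowledge, the ones the RISC-V psABI assigns; the sample entry, including the symbol index 8, is made up.

```python
import struct

# Elf64_Rela: offset, info, addend (signed). Elf64_Rel lacks the addend.
RELA64 = struct.Struct("<QQq")
REL64 = struct.Struct("<QQ")

# A few RISC-V relocation type numbers (assumed from the psABI).
RISCV_TYPES = {23: "R_RISCV_PCREL_HI20",
               24: "R_RISCV_PCREL_LO12_I",
               51: "R_RISCV_RELAX"}

def parse_rela(data, offset=0):
    """Decode one RELA entry into a dict."""
    r_offset, r_info, r_addend = RELA64.unpack_from(data, offset)
    rtype = r_info & 0xFFFFFFFF
    return {
        "offset": r_offset,              # where to patch, inside the section
        "symbol_index": r_info >> 32,    # which symbol table entry to use
        "type": RISCV_TYPES.get(rtype, rtype),
        "addend": r_addend,
    }

# Made-up entry: patch offset 0xc using symbol number 8, no addend.
raw = RELA64.pack(0xC, (8 << 32) | 23, 0)
reloc = parse_rela(raw)
print(hex(reloc["offset"]), reloc["type"], reloc["symbol_index"])
```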

Special sections

Some sections get special treatment according to their name, normally the ones that start with a dot. You might have found these in the past in assembly files, defined like .data (for data), .rodata (for read-only data) or .text (for code).

These are interesting to keep in mind because they appear the same way they do in assembly, and we are going to disassemble some of them and play around with them.

Other special sections like .got or .dynamic don’t appear in assembly but they have a strong meaning in the resulting file. We are not going to deal with those today because we want to finish this post someday. If you need to deal with them, I recommend reading ELF’s documentation on special sections and the loading process.

Executable view

The executable view is another way to access the same contents, but with a different perspective. It’s based on segments rather than sections. Segments are also pieces of the file, as sections are, but segments can contain one or more sections.

Like in the linking view, the base units (sections for the linking view, segments for the executable view) are described in a header. The header of the executable view is called the program header and it is, like the section header, a bunch of structures piled together, each describing one of the segments.

The program header describes the position and size in the file of each of the segments, but also some important information about them: how they are supposed to be loaded in memory and where (virtual address and physical address), the type of the segment, and a few more details.

The most interesting segment types are the following:

  • LOAD is used for loadable segments. The other fields of the entry describe the position and the size this segment will have in memory.
  • DYNAMIC is for segments that hold dynamic linking information. It has to contain the .dynamic section.
  • INTERP gives the location and size of a null-terminated path name to invoke as an interpreter. Interpreter in this context usually means a dynamic linker: it is called instead of loading this file directly, and it is the one that loads the parts of the file it needs.

You can see how segments are interesting for loading the file in the memory, that is, they are mostly interesting for executable files or shared objects.
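
As a sketch of what a loader looks at, the following Python reads program header entries (the standard 56-byte Elf64_Phdr layout) and prints the type and permissions of each segment. The two sample entries are made up; in the second one the memory size is larger than the file size, which is how zero-filled data such as .bss is represented.

```python
import struct

# Elf64_Phdr, little-endian, 56 bytes per entry.
PHDR64 = struct.Struct("<IIQQQQQQ")
PH_FIELDS = ("p_type", "p_flags", "p_offset", "p_vaddr", "p_paddr",
             "p_filesz", "p_memsz", "p_align")

PT_NAMES = {1: "LOAD", 2: "DYNAMIC", 3: "INTERP"}
PF_X, PF_W, PF_R = 1, 2, 4

def flags_to_str(flags):
    """Render segment permissions in the familiar rwx style."""
    return (("R" if flags & PF_R else "-")
            + ("W" if flags & PF_W else "-")
            + ("X" if flags & PF_X else "-"))

def read_program_headers(data, e_phoff, e_phentsize, e_phnum):
    """Read e_phnum consecutive entries starting at file offset e_phoff."""
    return [dict(zip(PH_FIELDS, PHDR64.unpack_from(data, e_phoff + i * e_phentsize)))
            for i in range(e_phnum)]

# Made-up program header with a code segment and a data segment.
blob = (PHDR64.pack(1, PF_R | PF_X, 0, 0, 0, 0x700, 0x700, 0x1000)
        + PHDR64.pack(1, PF_R | PF_W, 0xE00, 0x1E00, 0x1E00, 0x250, 0x260, 0x1000))

segments = read_program_headers(blob, e_phoff=0, e_phentsize=56, e_phnum=2)
for s in segments:
    print(PT_NAMES.get(s["p_type"], s["p_type"]), flags_to_str(s["p_flags"]),
          hex(s["p_vaddr"]), hex(s["p_filesz"]), hex(s["p_memsz"]))
```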

Segments vs Sections

If you want to have a clear idea about the difference between segments and sections, you can consider a file with multiple sections: .text, .rodata and .data.

A file that contains those sections can be understood from a linking perspective as a file that has some code (.text), read-only data (.rodata) and read-write data (.data). Each of those parts must be managed in a different way by the linker, but the reality is that the program loader doesn’t really care about some of the differences between them.

The code and the read-only data can be loaded in memory in the same way, with read and execute permission but no write permission, so the executable view can put both sections in the same segment and make the loader’s life easier.

Also, the linker doesn’t really care about how the memory is loaded, so the section header does not hold that information. It does care about the purpose of each section though, as it needs to put them together in order during linking. On the other hand, the loader is not really interested in the purpose of the contents of the file but only in what to do with them, so the program header only has that information.
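
The idea can be shown with a toy model: group sections into segments looking only at what the loader cares about. This is not what a real linker does (real linkers follow a linker script and can be stricter about permissions); it only mirrors the simplification described above. The flag values are the standard SHF_WRITE, SHF_ALLOC and SHF_EXECINSTR bits, and the section list is made up.

```python
SHF_WRITE, SHF_ALLOC, SHF_EXECINSTR = 0x1, 0x2, 0x4

# (name, flags) pairs as a section header would describe them.
sections = [
    (".text", SHF_ALLOC | SHF_EXECINSTR),
    (".rodata", SHF_ALLOC),
    (".data", SHF_ALLOC | SHF_WRITE),
    (".bss", SHF_ALLOC | SHF_WRITE),
    (".comment", 0),   # not loaded at run time
]

def group_into_segments(sections):
    """Toy grouping: one non-writable segment and one writable segment."""
    segments = {}
    for name, flags in sections:
        if not flags & SHF_ALLOC:
            continue   # the loader never sees sections that are not allocated
        key = "RW" if flags & SHF_WRITE else "RX"
        segments.setdefault(key, []).append(name)
    return segments

print(group_into_segments(sections))
```

Five sections become two segments, and .comment does not appear in the executable view at all.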

So, why do we need to learn it?

We don’t really need to learn it very deeply, just learn how it works in a high-level way and make sure we are able to read it with the tools we have available. The good news for you is that if the reasons I give you are not good enough it doesn’t really matter, because you already learned [1]. Continue reading and you’ll realize how much you understand now.

First, let me tell you a personal story. I have previous experience working with assembly, but only in small devices that have two memories, one for data and another for code (Harvard architecture). In those small devices you often don’t really need to think about how the code and the data are mapped to memory because your programs are small and the separation is clear. Computers are a different thing, and I have had issues understanding this whole assembly thing.

Computers store both code and data in the same memory, the main memory (Von Neumann architecture), and they normally have memory segmentation, paging, memory management units and all that kind of stuff, because there are many processes running and they want to separate one from the other. That forces us to think about how the code and the data are mapped to the memory. Modern operating systems also use dynamic linkers, which are not available in small devices, and we need to be able to deal with that amount of complexity.

ELF allows us to do all of that, because it was born for it. ELF is a distillation of many of the ideas from System V Unix, which include exactly all I mentioned. It’s a great way to understand how memory, linking and processes work in a modern operating system. This is why you need to learn it, at least a little. It makes you a cultivated person, which is always good [2].

The specifics

As I’m sure you are not totally satisfied with the answer of being a cultivated person [3], let me go for some specifics.

In this project GCC is not the only software we are dealing with: GNU Binutils and TinyCC are part of the party too, and I need to make them fit together in the best way possible. In those I need to make sure the relocations, formats and other things work properly, following the RISC-V ABI specification for ELF. That might be a point of failure, so being prepared, at least at a high level, is interesting.

Of course, we need to analyze GCC’s output too, and in order to do that we need to make sure we know what it means. We already saw that some ELF sections are directly mentioned in the assembly, so ELF is a good way to understand their meaning. They are really an OS-related thing and ELF only reflects it, but learning them from the ELF perspective probably makes the path easier.

Relocations are a huge point in all this mess, because they are machine specific (instructions are too, but those I expect us to know already), and they are something I didn’t need to research in all the RISC-V adventures I had last year. I have to do it sometime.

In general, there are many sharp edges where we can get hurt, so it’s better if we wear gloves.

Tools

For all this process there are a couple of tools that were designed to help. GNU Binutils has many of them but we are going to focus on two, as they are more than enough for many use cases: objdump and readelf.

The example below uses both of them to analyze a piece of code and its compilation result. As you’ll see, the main problem they have is their output: it’s not always clear, the formatting is a little bit chaotic, it’s not obvious how to read it right and it’s really hard to use it from other programs.

There is a really cool tool you should investigate though, called GNU Poke, that is designed specifically to fight against those issues. I recommend you take a look at it.

Example

Starting from a very simple C file we can follow a really interesting process and understand some of the ELF internals:

long global_symbol;

int main() {
  return global_symbol != 0;
}

We compile it to assembly with:

$ riscv64-linux-gnu-gcc -S b.c -O0

These are the contents of the assembly file:

        .file   "b.c"
        .option pic
        .text
        .globl  global_symbol
        .bss
        .align  3
        .type   global_symbol, @object
        .size   global_symbol, 8
global_symbol:
        .zero   8
        .text
        .align  1
        .globl  main
        .type   main, @function
main:
        addi    sp,sp,-16
        sd      s0,8(sp)
        addi    s0,sp,16
        lla     a5,global_symbol
        ld      a5,0(a5)
        snez    a5,a5
        andi    a5,a5,0xff
        sext.w  a5,a5
        mv      a0,a5
        ld      s0,8(sp)
        addi    sp,sp,16
        jr      ra
        .size   main, .-main
        .ident  "GCC: (Debian 10.2.1-6) 10.2.1 20210110"
        .section        .note.GNU-stack,"",@progbits

Assemble the file with as:

$ riscv64-linux-gnu-as b.s -o b.o

And this is what we get in b.o. The .text section contains the following:

$ riscv64-linux-gnu-objdump --disassemble b.o

b.o:     file format elf64-littleriscv


Disassembly of section .text:

0000000000000000 <main>:
   0:   ff010113        addi    sp,sp,-16
   4:   00813423        sd      s0,8(sp)
   8:   01010413        addi    s0,sp,16
   c:   00000797        auipc   a5,0x0
  10:   00078793        mv      a5,a5
  14:   0007b783        ld      a5,0(a5) # c <main+0xc>
  18:   00f037b3        snez    a5,a5
  1c:   0ff7f793        andi    a5,a5,255
  20:   0007879b        sext.w  a5,a5
  24:   00078513        mv      a0,a5
  28:   00813403        ld      s0,8(sp)
  2c:   01010113        addi    sp,sp,16
  30:   00008067        ret

Relocations

There are some relocations!

$ riscv64-linux-gnu-objdump b.o -r

b.o:     file format elf64-littleriscv

RELOCATION RECORDS FOR [.text]:
OFFSET           TYPE                  VALUE
000000000000000c R_RISCV_PCREL_HI20    global_symbol
000000000000000c R_RISCV_RELAX         *ABS*
0000000000000010 R_RISCV_PCREL_LO12_I  .L0
0000000000000010 R_RISCV_RELAX         *ABS*

But in order to understand those relocations properly we need to check the value of the symbols too:

$ riscv64-linux-gnu-objdump -t b.o

b.o:     file format elf64-littleriscv

SYMBOL TABLE:
0000000000000000 l    df *ABS*            0000000000000000 b.c
0000000000000000 l    d  .text            0000000000000000 .text
0000000000000000 l    d  .data            0000000000000000 .data
0000000000000000 l    d  .bss             0000000000000000 .bss
0000000000000000 l    d  .note.GNU-stack  0000000000000000 .note.GNU-stack
000000000000000c l       .text            0000000000000000 .L0 
0000000000000000 l    d  .comment         0000000000000000 .comment
0000000000000000 g     O .bss             0000000000000008 global_symbol
0000000000000000 g     F .text            0000000000000034 main

If you pay attention to the offsets of those relocations (0x0c and 0x10), they exactly match the instructions auipc a5, 0x0 and mv a5, a5, and those are expanded from the lla a5, global_symbol (load local address) pseudoinstruction in the assembly.

The mv is not really a mv. mv is a pseudoinstruction too, one that expands to addi a5, a5, 0. objdump is playing with us: it makes the opposite conversion so the output reads better, but in fact it is tricking us.
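
We can check that by decoding the instruction word ourselves. This Python sketch splits a 32-bit word using the I-type field layout from the RISC-V base ISA (opcode, rd, funct3, rs1 and a 12-bit signed immediate); the function names are made up for the example.

```python
def decode_i_type(word):
    """Split a 32-bit RISC-V I-type instruction word into its fields."""
    imm = word >> 20
    if imm & 0x800:          # the 12-bit immediate is signed
        imm -= 0x1000
    return {
        "opcode": word & 0x7F,
        "rd": (word >> 7) & 0x1F,
        "funct3": (word >> 12) & 0x7,
        "rs1": (word >> 15) & 0x1F,
        "imm": imm,
    }

def is_addi(fields):
    """addi uses opcode 0x13 (OP-IMM) with funct3 = 0."""
    return fields["opcode"] == 0x13 and fields["funct3"] == 0

# The word objdump printed as "mv a5,a5" (x15 is a5).
fields = decode_i_type(0x00078793)
print(is_addi(fields), fields)

# The first instruction of main, "addi sp,sp,-16" (x2 is sp).
print(decode_i_type(0xFF010113))
```

The first word is an addi with rd = rs1 = 15 and a zero immediate, which is exactly what objdump chose to display as mv.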

The auipc + addi couple appears pretty often in RISC-V, because it’s the way it has to build addresses in a register. The first instruction, auipc, adds the high part of an immediate to the program counter and stores the result in a register; the addi then adds another immediate, in this case the low part, to that register. That is, they make an x[reg] = pc + immediate operation in two steps: x[reg] = pc + hi20(immediate) followed by x[reg] = x[reg] + lo12(immediate).

As we have relocations in both auipc and addi, their 0 values (the immediates) are going to be overwritten with something else at link time, and that’s where RISC-V has something to say. All the relocations we can see are RISC-V specific, and you can read about them in the RISC-V ABI Specification.
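
There is one subtlety in that split: addi sign-extends its 12-bit immediate, so the low part can be negative and the high part has to be rounded to compensate. A minimal Python sketch of the split (the function name and the sample offsets are made up):

```python
def split_pcrel(offset):
    """Split a PC-relative offset into the auipc and addi immediates."""
    hi20 = (offset + 0x800) >> 12     # round, because lo12 is signed
    lo12 = offset - (hi20 << 12)      # always in the range -2048..2047
    return hi20, lo12

for offset in (0x12345, 0x12FFF, -4):
    hi20, lo12 = split_pcrel(offset)
    # What the two instructions compute together, relative to pc.
    rebuilt = (hi20 << 12) + lo12
    print(hex(offset), hi20, lo12, rebuilt == offset)
```

For 0x12FFF the low part becomes -1 and the high part is bumped to 0x13, which is why simply cutting the number at bit 12 would not work.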

In our case we have some really simple ones, the easiest to understand (what a coincidence, huh?):

R_RISCV_PCREL_HI20: High 20 bits of 32-bit PC-relative reference, %pcrel_hi(symbol). The formula is: S+A-P [but only obtains the highest 20 bits].

R_RISCV_PCREL_LO12_I: Low 12 bits of a 32-bit PC-relative, %pcrel_lo(address of %pcrel_hi), the addend must be 0. The formula is: S-P [but it only obtains the lowest 12 bits].

Both the HI20 and the LO12 have a similar formula. This is the meaning of the elements of the formula:

  • S: Address of the symbol
  • A: Addend of the relocation
  • P: Position of the relocation

If you match their formulas with what we just said about how auipc + addi couples work, you can easily understand the formulas and their meaning. We are not going to do it, do something yourself!

The other relocation:

R_RISCV_RELAX: Instruction can be relaxed, paired with a normal relocation at the same address.

Is an addition our example doesn’t take advantage of, but it could. R_RISCV_RELAX basically means that the linker is allowed to simplify the instructions covered by the relocation it is paired with when they are not all needed. And when does that happen? Easy: when we can get global_symbol’s address with only one of the instructions, the other one can be removed from the program.

Relocation resolution

If we link the file and generate an executable, we can see the final value those zeroes get.

$ riscv64-linux-gnu-gcc b.o -o b.out

We link it like this because ld needs a lot of arguments and we don’t want to set them all by hand, but you can do it with ld if you feel like it.

$ riscv64-linux-gnu-objdump --disassemble b.out
...
00000000000005e4 <main>:
 5e4:   ff010113        addi    sp,sp,-16
 5e8:   00813423        sd      s0,8(sp)
 5ec:   01010413        addi    s0,sp,16
 5f0:   00002797        auipc   a5,0x2
 5f4:   a6878793        addi    a5,a5,-1432 # 2058 <global_symbol>
 5f8:   0007b783        ld      a5,0(a5)
 5fc:   00f037b3        snez    a5,a5
 600:   0ff7f793        andi    a5,a5,255
 604:   0007879b        sext.w  a5,a5
 608:   00078513        mv      a0,a5
 60c:   00813403        ld      s0,8(sp)
 610:   01010113        addi    sp,sp,16
 614:   00008067        ret
...

There you see the relocations were resolved by the linker (at 0x5f0 and 0x5f4) and the final values have been filled in. objdump is intelligent enough to tell us where those instructions are pointing (it says 2058 <global_symbol>). Just to make sure, we can search the symbol table for global_symbol:

$ riscv64-linux-gnu-objdump -t b.out | grep global_symbol
0000000000002058 g     O .bss   0000000000000008              global_symbol
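
Going in the opposite direction is easy: knowing the final address of the symbol, we can redo what the linker did to the two instructions and compare with the disassembly. In this Python sketch S and P are taken from the outputs above, the addend is 0, and the unrelocated instruction words are the ones from b.o. Note that the LO12 relocation refers to .L0, the address of the auipc, so both immediates are computed from the same P.

```python
S = 0x2058            # address of global_symbol in b.out
P = 0x5F0             # address of the auipc in b.out
A = 0                 # addend

offset = S + A - P
hi20 = (offset + 0x800) >> 12      # rounded, because the addi immediate is signed
lo12 = offset - (hi20 << 12)

# Instruction words from b.o, with zero immediates.
AUIPC_A5 = 0x00000797
ADDI_A5_A5 = 0x00078793

# auipc keeps its immediate in bits 31..12, addi in bits 31..20.
auipc = AUIPC_A5 | ((hi20 & 0xFFFFF) << 12)
addi = ADDI_A5_A5 | ((lo12 & 0xFFF) << 20)

print(hex(offset), hi20, lo12)
print(f"{auipc:08x} {addi:08x}")
print(hex(P + (hi20 << 12) + lo12))   # the address the pair computes
```

The two words match the 00002797 and a6878793 shown in the disassembly of b.out, and the pair computes 0x2058.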

NOTE: We could try to calculate the address of global_symbol as the linker did, but it’s a little bit complicated because we also linked the file with the standard library and the startup (crt) files, so the executable has more code than what we had in the assembly file. If you want to see that, you can look at the rest of the output of the command, or even try with --disassemble-all and calculate the symbol address by hand. Good luck.

More sections

If you want to review some simple things, like a string section, you can use readelf for that. The -p flag (equivalent to --string-dump=) displays the contents of the section as strings. You can read the .comment section that way:

$ riscv64-linux-gnu-readelf -p .comment b.o

String dump of section '.comment':
  [     1]  GCC: (Debian 10.2.1-6) 10.2.1 20210110

This is what the compiler inserted with .ident in the assembly file. We have it in the binary too.

In other distros the output is a little bit different. Look at the output we have in Guix:

String dump of section '.comment':
  [     1]  GCC: (GNU) 11.2.0
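
What readelf -p does can be approximated in a few lines of Python. The section contents below are an assumption: a leading null character followed by the identification string, which is consistent with the offset 1 shown in the output. The real tool applies more filtering than this sketch.

```python
def string_dump(section):
    """Return (offset, string) pairs for every non-empty string in a section."""
    found = []
    pos = 0
    while pos < len(section):
        end = section.find(b"\0", pos)
        if end == -1:                 # last string is not terminated
            end = len(section)
        if end > pos:
            found.append((pos, section[pos:end].decode("ascii", "replace")))
        pos = end + 1
    return found

# Assumed bytes of the .comment section of b.o.
comment = b"\0GCC: (Debian 10.2.1-6) 10.2.1 20210110\0"

for offset, text in string_dump(comment):
    print(f"  [{offset:6x}]  {text}")
```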

Conclusion

So this whole thing was just to explain that ELF files are some kind of dual files that have two different goals at the same time. The executable view is kind of a picture of the memory state that can be used for loading that state in memory, while the linking view just describes how the different parts of the contents relate to each other and has tons of funny tricks to make the files relocatable, position independent and that kind of thing. Cool.

There are still many areas of ELF we didn’t talk about, but I consider this introduction more than enough. Having a simple understanding of how the file is organized and what kind of information it has is probably enough for the things we are going to need.

The proposed example shows that with the knowledge obtained from this short introduction we can dig a little bit into the files that result from a compilation and analyze their internals. That’s mostly the work I’ll need to do when I start combining compilers in a pipeline of death and destruction.

If I ever need to dig deeper into something, I will.

Anyway, I’m still unsure if I answered the question we left in the previous post [4]:

Why is learning about ELF interesting if GCC generates assembly?

Did I?


  1. Ha! Gotcha! 

  2. It also makes you understand the complexities of the system so you can criticize it. Changing the world requires learning about it first. 

  3. For those that really are. That’s the good attitude in life. High five. You can read the whole section still, it has interesting points I think. 

  4. It was a good cliffhanger, though.