How C Compilers Work Part 4: Linker

Now we are at the point where we have produced one or more object files and we want to create an executable. On GNU/Linux systems, this job is done by ld, the GNU linker.

As seen in the previous part, the compiler always works on one file at a time, so every time it needs to access a symbol (function or variable) defined somewhere else, it leaves a reference in the object file. The first job of the linker is to resolve these references, checking that each one matches a definition.
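The difference between a declaration (which only produces a reference) and a definition (which the linker uses to satisfy it) can be sketched in a single file. The names ticket_counter, next_ticket and take_two_tickets are made up for this illustration; in a real project the definitions at the bottom would live in a separate source file.

```c
/* Declarations: this is what a header would normally provide.
   They give the compiler names and types, nothing more. */
extern int ticket_counter;
int next_ticket(void);

/* This function compiles knowing only the declarations above.
   If the definitions lived in another file, its object code would
   carry references for the linker to resolve. */
int take_two_tickets(void)
{
    next_ticket();
    return next_ticket();
}

/* Definitions: kept here only so the sketch is self-contained.
   Assume they would normally sit in another .c file. */
int ticket_counter = 40;

int next_ticket(void)
{
    return ++ticket_counter;
}
```

Moving the two definitions to another file and forgetting to link it is enough to turn this into a linker error, even though the compiler accepts the code.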

Once this step has completed successfully, it's time to produce the executable. To do this, all the object files are split into their basic components, which are reassembled according to the ELF format. For example, the machine code goes into the .text section, constant data such as string literals into .rodata, the symbols into the symbol table (.symtab) and their names into the string table (.strtab).

Figure: ELF layout
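To get a feel for the format, here is a minimal sketch that reads the ELF header from an already opened file and reports how many section headers it declares. It assumes a 64-bit ELF file and the <elf.h> header shipped with glibc; the function name elf64_section_count is made up for this example.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Returns the number of section headers declared in the ELF header,
   or -1 if the stream does not start with a 64-bit ELF header.
   Assumption: files with a very large number of sections, which
   store the count elsewhere, are not handled. */
int elf64_section_count(FILE *f)
{
    Elf64_Ehdr header;

    if (fread(&header, 1, sizeof header, f) != sizeof header)
        return -1;                      /* too short to be an ELF file */
    if (memcmp(header.e_ident, ELFMAG, SELFMAG) != 0)
        return -1;                      /* missing the "\177ELF" magic */
    if (header.e_ident[EI_CLASS] != ELFCLASS64)
        return -1;                      /* not a 64-bit ELF file */

    return header.e_shnum;
}
```

Opening an executable or an object file with fopen(path, "rb") and passing the stream to this function should be enough to try it out.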

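The same happens to the object files pulled in from static libraries (which are just sets of object files packed in an archive). If the flag -static has been specified, the linker uses static libraries in place of shared objects, so all the library code ends up in the executable.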

In the case of dynamic linking (the default, and by far the most common case), the linker adds:

  • a section called dynamic symbol table (.dynsym), describing the symbols that take part in dynamic linking, and
  • a section called simply .dynamic that, among other things, records the names of the shared objects needed at runtime.

When the program is executed, the dynamic linker maps into the process image the shared objects listed in the .dynamic section (but this is a different story).
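One way to see the result on Linux is to look at the memory map the kernel exposes in /proc/self/maps. The sketch below counts the executable mappings backed by a file whose name contains ".so". Both the function name count_shared_object_mappings and the ".so" heuristic are assumptions made for this example, not something the dynamic linker itself relies on.

```c
#include <stdio.h>
#include <string.h>

/* Counts the executable mappings whose backing file name contains ".so".
   Expects lines in the /proc/<pid>/maps layout:
   address perms offset dev inode pathname */
int count_shared_object_mappings(FILE *maps)
{
    char line[1024];
    int count = 0;

    while (fgets(line, sizeof line, maps) != NULL) {
        char perms[5] = "";
        char path[512] = "";

        /* Skip address, offset, dev and inode; keep perms and pathname. */
        int fields = sscanf(line, "%*s %4s %*s %*s %*s %511s", perms, path);

        /* Anonymous mappings have no pathname, so only one field is read. */
        if (fields == 2 && perms[2] == 'x' && strstr(path, ".so") != NULL)
            count++;
    }
    return count;
}
```

Calling it on a stream opened with fopen("/proc/self/maps", "r") from a dynamically linked program should report at least the C library and the dynamic linker.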

Troubleshooting

"Undefined reference" is the most common error that you can get. It means that the definition of a function or a variable that has only been declared (for example as extern) has not been found in any of the files being linked. The most common causes are a typo or a missing library.

When working with multiple toolchains or different versions of a shared object, the linker may report undefined references even if everything seems correct. This happens because the symbol as declared (usually by a .h file) does not match what's inside the shared object being linked. In other words, the shared object the linker is using is not consistent with the header file that has been included. A solution is to check the search paths of the headers and of the libraries (flags -I and -L in gcc).

Another sneaky error may appear when a program is executed on a system different from the one where it was built. The message usually shown is the rather misleading "No such file or directory". The message is nevertheless correct: a file is missing (or it's in an unexpected location), only it is not the program itself. The missing file is a dynamically linked shared object or the dynamic linker itself (the program interpreter, whose path readelf -l reports). To check which shared objects are needed, use readelf.

$ readelf -d <process_name>

The entries tagged NEEDED, near the top of the output, show the names of the required shared objects. Now you only have to make sure they are present on your system. If you can find them but still get the same error, try to specify additional search paths for them, for example through the LD_LIBRARY_PATH environment variable.
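For the curious, the NEEDED entries can also be read directly from the file. The following is a rough sketch of that part of what readelf -d reports, not a replacement for it: it assumes a 64-bit ELF file with intact section headers and the <elf.h> header from glibc, and the names read_chunk and print_needed are made up for this example.

```c
#include <elf.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reads `size` bytes at file offset `off` into a new buffer that is
   always NUL-terminated; returns NULL on failure. */
static char *read_chunk(FILE *f, long off, size_t size)
{
    char *buf = malloc(size + 1);
    if (buf == NULL)
        return NULL;
    if (fseek(f, off, SEEK_SET) != 0 || fread(buf, 1, size, f) != size) {
        free(buf);
        return NULL;
    }
    buf[size] = '\0';
    return buf;
}

/* Prints the DT_NEEDED entries of a 64-bit ELF file and returns how
   many were found, or -1 if the file cannot be read or parsed. */
int print_needed(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    int count = -1;
    Elf64_Ehdr eh;
    Elf64_Shdr *sh = NULL;
    char *dyn = NULL, *str = NULL;

    if (fread(&eh, 1, sizeof eh, f) != sizeof eh ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        eh.e_shentsize != sizeof(Elf64_Shdr))
        goto done;

    /* Load the whole section header table. */
    sh = (Elf64_Shdr *)read_chunk(f, (long)eh.e_shoff,
                                  (size_t)eh.e_shnum * sizeof *sh);
    if (sh == NULL)
        goto done;

    count = 0;
    for (unsigned i = 0; i < eh.e_shnum; i++) {
        if (sh[i].sh_type != SHT_DYNAMIC || sh[i].sh_link >= eh.e_shnum)
            continue;

        /* sh_link of .dynamic points at its string table section. */
        const Elf64_Shdr *strsec = &sh[sh[i].sh_link];
        dyn = read_chunk(f, (long)sh[i].sh_offset, (size_t)sh[i].sh_size);
        str = read_chunk(f, (long)strsec->sh_offset, (size_t)strsec->sh_size);
        if (dyn == NULL || str == NULL) {
            count = -1;
            break;
        }

        const Elf64_Dyn *d = (const Elf64_Dyn *)dyn;
        size_t n = sh[i].sh_size / sizeof *d;
        for (size_t k = 0; k < n && d[k].d_tag != DT_NULL; k++) {
            if (d[k].d_tag == DT_NEEDED && d[k].d_un.d_val < strsec->sh_size) {
                printf("NEEDED: %s\n", str + d[k].d_un.d_val);
                count++;
            }
        }
        break;
    }

done:
    free(dyn);
    free(str);
    free(sh);
    fclose(f);
    return count;
}
```

A statically linked executable has no .dynamic section, so the function simply returns 0 for it.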


Image by Surueña, taken from Wikimedia Commons, licensed under the Creative Commons Attribution-Share Alike 3.0 Unported, 2.5 Generic, 2.0 Generic and 1.0 Generic licenses.

Luca Sommacal

Italian developer (mainly in C for embedded platforms), Linux learner, addicted to rock music, history, science and a few other things.
