As mentioned in the previous post, modern compilers do not run preprocessing as a separate phase; it is performed together with compilation. Nevertheless, understanding the role of the preprocessor is really helpful. The first thing to say is that it basically only understands lines that start with the hash character (#).

In a typical program, those lines specify header files to include and constants or macros to substitute in the rest of the file. Another frequently used feature is conditional compilation (#if, #ifdef, etc.), which enables some parts of the code only if a condition is met at compile time. Here the -D flag of GCC is really useful, because it defines a macro directly from the command line.
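
As a minimal sketch of conditional compilation driven by -D, consider the fragment below. The macro names DEBUG_BUILD and MAX_RETRIES and the two helper functions are hypothetical, chosen only for illustration.

```c
/* DEBUG_BUILD and MAX_RETRIES are hypothetical names used for illustration. */

/* Pick a build label depending on whether DEBUG_BUILD is defined. */
#ifdef DEBUG_BUILD
#define BUILD_MODE "debug"
#else
#define BUILD_MODE "release"
#endif

/* Provide a default that can be overridden from the command line. */
#ifndef MAX_RETRIES
#define MAX_RETRIES 3
#endif

const char *build_mode(void)
{
        return BUILD_MODE;
}

int max_retries(void)
{
        return MAX_RETRIES;
}
```

Compiled without extra flags, build_mode() returns "release" and max_retries() returns 3. Adding -DDEBUG_BUILD -DMAX_RETRIES=5 to the GCC command line should switch them to "debug" and 5 without touching the source file.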

A peculiar one is the #pragma directive, used to issue commands directly to the compiler, in some cases to enable vendor-specific options. Other commonly used directives are #warning and #error; they force the compiler to emit a warning or an error in special situations (usually depending on which other files are or are not included in the project).
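
The fragment below is a small sketch of these three directives working together. USE_LEGACY_API and bytes_to_bits() are hypothetical names; the pragma shown is specific to GCC (Clang accepts it too), and #warning was for a long time a compiler extension before being standardized in C23.

```c
#include <limits.h>

/* Stop the build early if a basic assumption of the code does not hold. */
#if CHAR_BIT != 8
#error "This code assumes 8-bit bytes"
#endif

/* USE_LEGACY_API is a hypothetical macro; the warning fires only if it is defined. */
#ifdef USE_LEGACY_API
#warning "USE_LEGACY_API is defined: the legacy code path will be built"
#endif

/* Vendor-specific pragma: ask GCC not to warn about the unused parameter below. */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-parameter"
int bytes_to_bits(int bytes, void *unused_context)
{
        return bytes * CHAR_BIT;
}
#pragma GCC diagnostic pop
```

On a platform with 8-bit bytes and without -DUSE_LEGACY_API, this should compile silently; defining the macro produces the warning, and a platform with a different CHAR_BIT stops at the #error line.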

An Example

Now let's see what a preprocessor does. Look at this simple program:

#include <stdio.h>
#include <string.h>

#define ARGS    1
#define TEST    "test_arg"

/* Main function */
int main (int argc, char **argv)
{
        if (argc != ARGS + 1) {
                fprintf(stderr, "Error! Expected %d param\n", ARGS);
                return 1;
        }

        if (strcmp(argv[1], TEST) != 0) {
                fprintf(stderr, "Error! Expected %s\n", TEST);
                return 2;
        }

        fputs("OK!\n", stdout);
        return 0;
}

Now if you compile it with:

$ gcc -Wall -E -o main_pp.c main.c

you'll get another C file named main_pp.c as a result (the -E flag tells GCC to stop after preprocessing). If you don't have a compiler available, you can look at it here. Pretty big, isn't it?

What you should notice is that the #include and #define directives have been processed and the comment has been removed. This obviously helps the programmer, but almost all the work done by the preprocessor can also be done by hand. In other words, the preprocessor is not indispensable. If you compile the following piece of code, you'll notice no difference in program execution compared to the original one.

typedef struct foo FILE;
/* Declared, not defined: the C library provides the actual objects */
extern FILE *stdout;
extern FILE *stderr;

int fprintf(FILE*, const char*, ...);
int fputs(const char*, FILE*);
int strcmp(const char*, const char *);

int main (int argc, char **argv)
{
        if (argc != 1 + 1) {
                fprintf(stderr, "Error! Expected %d param\n", 1);
                return 1;
        }

        if (strcmp(argv[1], "test_arg") != 0) {
                fprintf(stderr, "Error! Expected %s\n", "test_arg");
                return 2;
        }

        fputs("OK!\n", stdout);
        return 0;
}

How is this possible? How can struct foo be a FILE? And what about the other functions? For the answer, you'll have to wait for the next two chapters of this series.

Macros

Where the preprocessor becomes very handy for a programmer is in the ability to create a sort of flexible function called a macro. Let's see how powerful macros can be with a trivial example.

#define MIN(X, Y)   ((X) < (Y)) ? (X) : (Y)

/* Main function */
int main (void)
{
	int a = 1;
	int b = 2;
	float c = 3.4;
	float d = 5.6;

	int x = MIN(a, b);
	int y = MIN(c, d);

	return x + y;
}

In the above code, the macro MIN() yields the smaller of the two values passed. It does not care whether we pass integers or floating-point numbers as arguments; we could even mix them. Now let's see what the preprocessor does when we run it with:

$ gcc -Wall -E -o macro_pp.c macro.c

The result below should come as no surprise. The preprocessor does exactly what it is supposed to do: it textually replaces every use of a defined macro with its definition.


# 1 "macro.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 31 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 32 "<command-line>" 2
# 1 "macro.c"




int main (void)
{
 int a = 1;
 int b = 2;
 float c = 3.4;
 float d = 5.6;

 int x = ((a) < (b)) ? (a) : (b);
 int y = ((c) < (d)) ? (c) : (d);

 return x + y;
}

Once the substitution is applied, it becomes clear why the type of the arguments doesn't really matter to the macro: it is expanded as plain text before any type checking takes place.
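
To make this concrete, here is a short sketch that reuses the MIN() definition from macro.c with three different argument types. The wrapper function names are hypothetical and exist only to give each expansion a typed context.

```c
/* Same definition as in macro.c above. */
#define MIN(X, Y)   ((X) < (Y)) ? (X) : (Y)

/* Hypothetical wrappers: each one expands the same text with different types. */
int smaller_int(int a, int b)
{
        return MIN(a, b);
}

double smaller_double(double a, double b)
{
        return MIN(a, b);
}

/* Mixed arguments: the int operand is converted to double by the ?: operator. */
double smaller_mixed(int a, double b)
{
        return MIN(a, b);
}
```

The compiler checks types only after expansion, so each function body is simply a conditional expression over whatever types its parameters have.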

Macro Drawbacks

Macros can be helpful to avoid code duplication and sometimes to increase the readability of the code, but you have to be careful with them.

First of all, consider that code duplication isn't really avoided with macros. You write the code only once, but the compiler compiles the same code every time the macro is used, which can result in a bigger executable. In some cases, avoiding macros lets the compiler optimize your program better.
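
For comparison, a small function can do the same job for a fixed type. The sketch below uses a hypothetical min_int() declared static inline; whether the compiler inlines it at each call or emits one shared copy is its own decision, a freedom that a macro does not leave it.

```c
/* min_int is a hypothetical name; unlike MIN(), it works for int only. */
static inline int min_int(int x, int y)
{
        return (x < y) ? x : y;
}

/* Hypothetical caller: three uses of min_int, one definition to compile. */
int smallest_of_four(int a, int b, int c, int d)
{
        return min_int(min_int(a, b), min_int(c, d));
}
```

The price is the loss of type flexibility: a separate function would be needed for float or double arguments.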

Another issue is related to the clarity of the code. The high-level algorithm may be easier to follow when you use macros, but when you dig into the details, the code becomes hard to understand.

In addition, debugging a program that heavily relies on macros could become very tough, especially if some of them contain instructions that change the execution flow, such as break, continue, etc.
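
Two classic pitfalls make the need for care concrete. The sketch below reuses MIN() exactly as defined in macro.c; SAFE_MIN, next_value() and the other function names are hypothetical additions for illustration.

```c
/* Same definition as in macro.c: note there are no outer parentheses. */
#define MIN(X, Y)       ((X) < (Y)) ? (X) : (Y)

/* Hypothetical variant with the whole expansion parenthesized. */
#define SAFE_MIN(X, Y)  (((X) < (Y)) ? (X) : (Y))

/* Pitfall 1: operator precedence.
 * Expands to 10 + ((a) < (b)) ? (a) : (b), which the compiler reads as
 * (10 + (a < b)) ? a : b, so the result is a, not 10 + a. */
int precedence_surprise(int a, int b)
{
        return 10 + MIN(a, b);
}

int precedence_fixed(int a, int b)
{
        return 10 + SAFE_MIN(a, b);
}

/* Pitfall 2: an argument with side effects may be evaluated twice. */
static int calls;

static int next_value(void)
{
        calls++;
        return 1;
}

int count_evaluations(void)
{
        int m;

        calls = 0;
        m = SAFE_MIN(next_value(), 2);  /* next_value() appears twice after expansion */
        (void)m;
        return calls;
}
```

With a = 1 and b = 2, precedence_surprise() returns 1 while precedence_fixed() returns 11, and count_evaluations() reports 2 calls for what looks like a single call in the source. Some compilers flag the first case with a parentheses warning.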

Troubleshooting

Usually preprocessor errors are easily understandable. For example:

failed.c:1:21: fatal error: missing.h

means that the header file missing.h does not exist or is not in the include path. Another self-explanatory error is the following:

failed.c:3:0: error: unterminated #ifdef

which reminds us that an #endif is missing.
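
When a header is genuinely optional, the first error can be avoided by testing for the file before including it. The sketch below relies on __has_include, which is available in recent GCC and Clang releases and is standardized in C23. HAVE_MISSING_H and have_missing_header() are hypothetical names, and the sketch assumes that no file called missing.h is on the include path. It also shows every #if, #ifdef and #ifndef paired with its own #endif.

```c
/* Only include missing.h if the compiler can actually find it. */
#ifdef __has_include
#  if __has_include("missing.h")
#    include "missing.h"
#    define HAVE_MISSING_H 1
#  endif
#endif

/* Fall back to 0 when the header was not found or cannot be probed. */
#ifndef HAVE_MISSING_H
#  define HAVE_MISSING_H 0
#endif

int have_missing_header(void)
{
        return HAVE_MISSING_H;
}
```

Indenting nested directives after the hash, as done here, makes a forgotten #endif much easier to spot.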

References

  • If you want to play with the above examples, source files are here.
  • A full explanation of the GCC preprocessor can be found at this page.
  • The idea for the second example has been taken from this blog post.


Image from Pixabay by Irfan Ahmad licensed under the Pixabay License.

Post last updated on 2022/05/25.