eighty characters per line

May 4, 2008

The Macro Preprocessor

Filed under: Pedagogy | Chad Waters @ 7:42 pm

1. Introduction

To help explain programming paradigms to students, schools often center introductory programming courses on a particular language, most commonly either the C family or Java and its derivatives, which, for all intents and purposes, include C#, J#, and other similar languages. This allows students to learn by example: they prove competency in a particular concept by implementing it.

No comment is being made on this practice of teaching by implementation. It is a common practice that is proven to work, and there is no harm in doing students a double service by introducing them to programming and to a particular language in one fell swoop. The point to be made here concerns the slapdash, and often incomplete, means of teaching the C family.

2. The Macro Preprocessor

2.1. Parameter Resolution

Undoubtedly the most dangerous, and certainly the most confusing, of all of the C remnants is the macro preprocessor. It has all of the functionality of your garden variety copy-and-paste, without any of that messy common sense! Consider the following canonical example:

#define MAX(a, b) ((a) > (b) ? (a) : (b))

A call to the MAX macro will evaluate to the larger of the two values. To understand why this is a potentially problematic macro, let us consider how the macro preprocessor works. References to the MAX macro are translated directly into the associated expression, and the arguments passed to the macro are substituted textually for the parameters in the definition. Why is this a problem? Consider:

int z = MAX(++m, ++n);

Assume that m has been initialized to 10 and n to 20. Since both are pre-incremented, one would expect m to be 11, n to be 21, and z to be 21 after the computation. This is not the case, and the reason can be seen when the macro is fully expanded:

int z = ((++m) > (++n) ? (++m) : (++n));

The actual result will be 22, with n now holding 22 and m holding 11. The preprocessor substitutes the argument text at every occurrence of the parameter rather than evaluating the argument once, so ++n is evaluated twice: once in the comparison and once more when it is selected as the result. Unless this was an intended side effect, it is dangerous and likely to be difficult to debug.
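The double evaluation can be checked directly. The following is a minimal sketch using the values assumed above (m is 10, n is 20); the struct max_result and the function max_with_macro are hypothetical names introduced only for this illustration.

```c
#include <assert.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical helper type for reporting the final values. */
struct max_result { int z, m, n; };

static struct max_result max_with_macro(void)
{
    int m = 10, n = 20;
    struct max_result r;

    /* Expands to ((++m) > (++n) ? (++m) : (++n)).
       The comparison increments both variables; because 11 > 21 is false,
       the third operand is evaluated and increments n a second time. */
    r.z = MAX(++m, ++n);
    r.m = m;
    r.n = n;

    /* z and n are both 22, not the 21 one might expect; m is 11. */
    assert(r.z == 22 && r.m == 11 && r.n == 22);
    return r;
}
```

Calling max_with_macro() from main is enough to run the check. Had m been the larger value, it would have been the one incremented twice.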

2.2. Macros as Functions

It is simplest to think of macros as functions. In the case above, disregarding the issue with parameter resolution, the macro acted as a function, returning the larger of the two values. This view of macros can be dangerous and, again, confusing. Consider the following macro:

#define SWAP(a, b) a ^= b; b ^= a; a ^= b;

This is what is known as the XOR swap. At first glance, the definition gives the appearance of a function. After the following code segment completes, x and y will have swapped values. Remember that the macro definition operates as a rudimentary copy and paste.

SWAP(x, y); // becomes
// x ^= y; y ^= x; x ^= y;

This does exactly what is expected. Consider, however, the case of the conditional swap in the next code segment. Notice that the macro definition is a free-standing set of three operations. The macro preprocessor does the following:

if (x != y) SWAP(x, y); // becomes
// if (x != y) x ^= y; y ^= x; x ^= y;

Notice that the conditional statement applies only to the first of the three operations. The second and third operations proceed regardless of the condition. This will smash the values in x and y: when x equals y, for example, y ends up as 0. This can be avoided, however, by enclosing the original definition in braces so that the three statements form a single compound statement. This can fix many macro problems, unless the macro needs to return something, as in the case of the MAX macro.
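The conditional swap and the braced fix can be compared side by side. This is a sketch, not a definitive recipe: SWAP is the definition from above, while SWAP_BRACED, SWAP_STMT, and the helper functions are hypothetical names introduced here. SWAP_STMT uses the widely seen do/while (0) idiom, which is not part of the discussion above; it is included because a braced macro followed by a semicolon cannot be placed directly before an else.

```c
#include <assert.h>

/* The definition from the post: three free-standing statements. */
#define SWAP(a, b) a ^= b; b ^= a; a ^= b;

/* Braced variant (hypothetical name): one compound statement. */
#define SWAP_BRACED(a, b) { (a) ^= (b); (b) ^= (a); (a) ^= (b); }

/* do/while (0) variant (hypothetical name): a single statement that
   still expects the caller's semicolon, so it can precede an else. */
#define SWAP_STMT(a, b) do { (a) ^= (b); (b) ^= (a); (a) ^= (b); } while (0)

/* Returns y after a conditional swap with the unbraced macro. */
static int y_after_unbraced(int x, int y)
{
    if (x != y) SWAP(x, y);   /* only the first x ^= y is guarded */
    return y;
}

/* Returns y after a conditional swap with the braced macro. */
static int y_after_braced(int x, int y)
{
    if (x != y) SWAP_BRACED(x, y);
    return y;
}

/* Swaps *x and *y when they differ; returns 1 if a swap happened. */
static int swap_if_different(int *x, int *y)
{
    if (*x != *y)
        SWAP_STMT(*x, *y);
    else
        return 0;
    return 1;
}

/* Self-check collecting the cases discussed above. */
static void check_swaps(void)
{
    int a = 3, b = 5;
    int swapped;

    assert(y_after_unbraced(5, 5) == 0);  /* equal inputs: y is clobbered */
    assert(y_after_braced(5, 5) == 5);    /* all three steps are guarded */

    swapped = swap_if_different(&a, &b);
    assert(swapped == 1);
    assert(a == 5 && b == 3);
}
```

Calling check_swaps() from main exercises all three definitions. Neither fix helps a macro such as MAX, since a statement, unlike an expression, cannot produce a value.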

3. Conclusion

As of GCC 3.4.5, most of the inane issues, such as a comment on the same line as a macro definition, are resolved by the compiler, but many still live on. Stroustrup explains that C++ programmers tend to regard the use of macros suspiciously, as a “lesser evil,” while C programmers find the use of macros natural and elegant [1]. What is important is the use of safe, type-neutral code, relying on inline functions where efficiency is a concern.
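As a sketch of the inline-function alternative, the MAX example can be rewritten as follows. The names max_int, inline_result, and max_with_inline are hypothetical, and the function is assumed to handle int only; it gives up the type neutrality of the macro, which in C would take one function per type and in C++ a template.

```c
#include <assert.h>

/* Hypothetical int-only replacement for the MAX macro. */
static inline int max_int(int a, int b)
{
    return a > b ? a : b;
}

/* Hypothetical helper type for reporting the final values. */
struct inline_result { int z, m, n; };

static struct inline_result max_with_inline(void)
{
    int m = 10, n = 20;
    struct inline_result r;

    /* Each argument expression is evaluated exactly once,
       before the function body runs. */
    r.z = max_int(++m, ++n);
    r.m = m;
    r.n = n;

    /* The result now matches the naive expectation. */
    assert(r.z == 21 && r.m == 11 && r.n == 21);
    return r;
}
```

Because the arguments are ordinary function arguments, side effects such as the pre-increment happen once regardless of which value is returned.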

4. References

  1. Stroustrup, Bjarne. 2002. C and C++: Siblings. From The C/C++ Users Journal. http://www.research.att.com/~bs/siblings_short.pdf.
