eighty characters per line

May 4, 2008

The Macro Preprocessor

Filed under: Pedagogy | Chad Waters @ 7:42 pm

1. Introduction

To help explain programming paradigms to students, schools often build introductory programming courses around a particular language, most commonly either the C family or Java and its derivatives, which, for all intents and purposes, include C#, J#, and other similar languages. This allows students to learn by example: they are asked to prove competency in a concept by implementing it.

No comment is being made here on the practice of teaching by implementation. It is a common practice that is proven to work, and there is no harm in doing students a double service by introducing them to programming and to a particular language in one fell swoop. The point to be made concerns the slapdash, and often incomplete, means of teaching the C family.

2. The Macro Preprocessor

2.1. Parameter Resolution

Undoubtedly the most dangerous, and certainly the most confusing, of all of the C remnants is the macro preprocessor. It has all of the functionality of your garden variety copy-and-paste, without any of that messy common sense! Consider the following canonical example:

#define MAX(a, b) ((a) > (b) ? (a) : (b))

A call to the MAX macro will return the larger of the two values. To understand why this macro is potentially problematic, consider how the macro preprocessor works. Each reference to the MAX macro is replaced with the associated expression, and the arguments passed to the macro are substituted textually into that expression. Why is this a problem? Consider:

int z = MAX(++m, ++n);

Assume that m has been initialized to 10 and n to 20. Since both are pre-incremented, one would expect m to be 11, n to be 21, and the value of z after computation to be 21. This is not the case, and the reason can be seen when the macro is fully expanded.

int z = ((++m) > (++n) ? (++m) : (++n));

The actual result will be 22, with n now holding 22 and m holding 11. Each parameter appears twice in the macro definition, and the argument text is pasted in at every appearance. The comparison increments both m and n, and because n is the larger, the selected branch increments n a second time. Unless this was an intended side effect, it is dangerous and likely to be a troublesome case to debug.
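For anyone who wants to watch this happen, here is a minimal sketch of the example as a compilable function. The function name demo_max_side_effects is my own invention for illustration; the macro and the starting values are the ones used above. The assertions state the values I would expect after the expansion runs.

```c
#include <assert.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical helper, not part of any library: runs the example
   from the text and returns the resulting value of z. */
int demo_max_side_effects(void)
{
    int m = 10;
    int n = 20;

    /* Expands to ((++m) > (++n) ? (++m) : (++n)).
       The comparison increments m and n once each; the comparison is
       false, so the (++n) branch runs and increments n again. */
    int z = MAX(++m, ++n);

    assert(m == 11);   /* incremented once, in the comparison */
    assert(n == 22);   /* incremented twice */
    return z;          /* 22, not the 21 one might expect */
}
```

Calling demo_max_side_effects() from main should return 22 without tripping either assertion.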

2.2. Macros as Functions

It is simplest to think of macros as functions. In the case above, disregarding the issue with parameter resolution, the macro acted as a function, returning the larger of the two values. This view of macros can be dangerous, and, again, confusing. Consider the following macro:

#define SWAP(a, b) a ^= b; b ^= a; a ^= b;

This is what is known as the XOR swap. At first glance, this definition gives the appearance of a function. After the following code segment completes, x and y will have swapped values, naturally. Remember that the macro definition operates as a rudimentary copy and paste.

SWAP(x, y); // becomes
// x ^= y; y ^= x; x ^= y;;  (the second ; is the one written after the call)

This does exactly what is expected. Consider, however, the case of the conditional swap in the next code segment. Notice that the macro definition is a free-standing set of three operations. The macro preprocessor does the following:

if (x != y) SWAP(x, y); // becomes
// if (x != y) x ^= y; y ^= x; x ^= y;;  (the second ; is the one written after the call)

Notice that the condition applies only to the first of the three operations. The second and third proceed regardless. When x and y are equal, the guarded x ^= y is skipped, y ^= x then sets y to zero, and the final x ^= y leaves x unchanged, so the values are smashed rather than left alone. This can be avoided by enclosing the original definition in braces so that the three operations form a single block. Braces fix many macro problems, unless the macro needs to return something, as in the case of the MAX macro.
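Here is a rough sketch of the difference, for comparison. SWAP_BARE is the definition from above under a new name, and SWAP_BRACED is the braces fix just described. SWAP_SAFE is my own addition and not something discussed above: it uses the common do/while(0) idiom, which behaves like the braced version but also allows an else to follow the call. All of the macro and function names here are made up for illustration.

```c
#include <assert.h>

/* The original definition from the text: three free-standing statements. */
#define SWAP_BARE(a, b) a ^= b; b ^= a; a ^= b;

/* The fix described in the text: braces turn the three into one block. */
#define SWAP_BRACED(a, b) { (a) ^= (b); (b) ^= (a); (a) ^= (b); }

/* Assumption, not from the text: the do/while(0) idiom, which expands
   to a single statement that is completed by the caller's semicolon. */
#define SWAP_SAFE(a, b) do { (a) ^= (b); (b) ^= (a); (a) ^= (b); } while (0)

/* Each helper conditionally swaps its local copies and returns y. */
int y_after_bare(int x, int y)
{
    if (x != y) SWAP_BARE(x, y);   /* only "x ^= y;" is guarded */
    return y;
}

int y_after_braced(int x, int y)
{
    if (x != y) SWAP_BRACED(x, y);
    return y;
}

int y_after_safe(int x, int y, int *skipped)
{
    *skipped = 0;
    if (x != y)
        SWAP_SAFE(x, y);
    else
        *skipped = 1;   /* with SWAP_BRACED, the "};" before else would not compile */
    return y;
}

/* Hypothetical self-check summarizing the expected behavior. */
void check_conditional_swaps(void)
{
    int skipped;
    assert(y_after_bare(5, 5) == 0);     /* y is zeroed although nothing should happen */
    assert(y_after_bare(3, 7) == 3);     /* unequal values still swap */
    assert(y_after_braced(5, 5) == 5);   /* the whole block is skipped */
    assert(y_after_braced(3, 7) == 3);
    assert(y_after_safe(5, 5, &skipped) == 5 && skipped == 1);
    assert(y_after_safe(3, 7, &skipped) == 3 && skipped == 0);
}
```

None of these variants helps when both arguments name the same object, since XOR-ing a variable with itself yields zero. That is a property of the XOR swap rather than of the macro.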

3. Conclusion

As of GCC 3.4.5, most of the inane issues, such as a comment on the same line as a macro definition, are handled by the compiler, but many others still live on. Stroustrup explains that C++ programmers tend to regard macros with suspicion, as a “lesser evil,” while C programmers find them natural and elegant [1]. What is important is the use of safe, type-neutral code, relying on inline functions where efficiency is a concern.
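As a small sketch of the inline alternative, here is the MAX example from section 2.1 rewritten as a C99 inline function. The names max_int and demo_inline_max are hypothetical, and this version handles only int, so it gives up the type neutrality of the macro. What it gains is that each argument is evaluated exactly once, before the call.

```c
#include <assert.h>

/* Hypothetical int-only replacement for the MAX macro.
   Arguments are evaluated once, at the call site, so side
   effects such as ++m happen exactly one time. */
static inline int max_int(int a, int b)
{
    return a > b ? a : b;
}

/* Hypothetical helper: repeats the example from section 2.1. */
int demo_inline_max(void)
{
    int m = 10;
    int n = 20;

    int z = max_int(++m, ++n);

    assert(m == 11);   /* incremented once */
    assert(n == 21);   /* incremented once */
    return z;          /* 21, the value originally expected */
}
```

With the same starting values as before, demo_inline_max() should return 21 and leave n at 21 rather than 22.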

4. References

  1. Stroustrup, Bjarne. 2002. C and C++: Siblings. From The C/C++ Users Journal. http://www.research.att.com/~bs/siblings_short.pdf.

April 13, 2008

Java as a First Language

Filed under: Pedagogy | Chad Waters @ 5:49 am

1. The State of Computer Science

These days, enrollment in computer science programs at most institutions is suffering: it is down 49% since 2001/2002 [1]. This can possibly be attributed to the brainwashing that secondary school systems usually put their students through, impressing upon them that computer science is a “dime-a-dozen” field and that the market will be flooded. Furthermore, rumors that the College Board was planning to stop administering the Computer Science Advanced Placement test, fueled in part by a fallacious article in the Washington Post, were greatly exaggerated [2]. With numbers dwindling as they are, computer science programs are attempting to find new and better ways to improve not only their enrollment rate but also their retention rate. So what approach are programs taking to entice new students? Make programming fun.

2. Java to the Rescue

There is no argument that Java is a useful language to know. TIOBE, which evaluates programming languages based on search engine results for both web pages and message boards, ranks Java the number one programming language in terms of popularity, with a 20.529% share as of April 2008 [3]. Java presents an approachable syntax with a good deal of flexibility and the potential for rapid prototyping. Even criticism of Java’s virtual machine is simple to dismiss: GCJ, the GNU Compiler for Java, compiles to native code. Sure, perhaps it lacks full AWT support, but work is being done, and it is quickly improving. So then what could there possibly be to complain about?

3. Java as a Starting Language

Java is a very common starting language for students. It makes programming fun for those who are unfamiliar with traditional syntax. It hides lower-level functionality to allow the programmer to think about the bigger picture. That said, much of this lower-level functionality is important to learn. Dewar and Schonberg, of AdaCore Inc., put it best:

What we observed at New York University is that the Java programming courses did not prepare our students for the first course in systems, much less for more advanced ones. Students found it hard to write programs that did not have a graphic interface, had no feeling for the relationship between the source program and what the hardware would actually do, and (most damaging) did not understand the semantics of pointers at all, which made the use of C in systems programming very challenging. [emphasis added] [4]

So, what is the solution? Do we continue to teach complicated languages, to the chagrin of entering students, and lower our retention rates significantly? Or do we continue to produce students who possess a skill set analogous to placing a square peg in a square hole, fitting a basic solution to a basic problem, but being unable to actually write code?

4. Finding the Solution

To prevent universities from continuing to produce a generation of replaceable developers, pains need to be taken to teach theory and paradigms, not languages with robust packages that coddle programmers. Languages, from C++ to Common Lisp to Ada, should be learned, to produce students better able to handle projects ranging in size from personal programming to large-scale code delegation. Predictions? No longer will scope and adherence to specification be the biggest evil to a programmer. Instead, evil will be known as working with colleagues who have no understanding of language fundamentals.

The storm’s a-comin’!

5. References

  1. Vegso, Jay. March 2008. Enrollments and Degree Production at US CS Departments Drop Further in 2006/2007. http://www.cra.org/wp/index.php?p=139.
  2. ACM. April 2008. AP Computer Science is NOT Going Away. http://usacm.acm.org/usacm/weblog/index.php?p=593.
  3. TIOBE. April 2008. TIOBE Programming Community Index. http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html.
  4. Dewar, Robert B.K., and Schonberg, Edmond. January 2008. Computer Science Education: Where Are the Software Engineers of Tomorrow? http://www.stsc.hill.af.mil/CrossTalk/2008/01/0801DewarSchonberg.html.
