PART 3:

INSIDE THE SMALL C COMPILER

Much of the appeal of the Small C compiler lies in the fact that it holds no secrets. Everything is visible to those who want to know. This part of the book opens the compiler to reveal its inner workings. While not necessary for using the compiler, knowledge of what the compiler does and how it does it can improve our appreciation of the language and our use of it.

This material is in places hard to grasp. However, I have tried to simplify it as much as possible, especially in the most difficult areas. Numerous figures, tables, and special listings are provided to help clarify the concepts.

As a prerequisite to this study, you should be familiar with the Small C language as described in Part 1. In addition, you should have at least a rudimentary knowledge of assembly language programming for the 8086 processor. If you are weak in this area, you will probably find that you can manage by relying on the explanations found in the text. Nevertheless, the going will be easier if you have some prior knowledge of the 8086 instruction set.

Complete listings of the compiler are contained in Appendix C. You will need to study these listings as you read the explanations of the compiler's logic. Appendix H has been provided to help you find the compiler's functions. It lists the functions alphabetically with each function's source file, page in the text, and page in Appendix C.

In the difficult parts of the expression analyzer, pseudocode listings are provided. These stand midway between the text and the actual compiler listings. The text explains the pseudocode. After that, you can easily relate the pseudocode to the compiler listings, since they correspond almost line for line.

I am sure that you will find your exploration of the Small C compiler a rewarding experience. No doubt you will gain a feeling of satisfaction and self-confidence at having discovered the secrets of a real compiler.

Organization of the Compiler

As indicated in Figure P3-1, the Small C compiler is essentially a parser with subordinate front end and back end functions. The front end reads, preprocesses, and scans source code for the parser while the back end expresses the outcome in assembly language. Basically, the front end is the input side of the parser and the back end is the output side.

Figure P3-1: Organization of the Small C Compiler
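To make this division of labor concrete, here is a minimal sketch of a parser that sits between a front end and a back end. It is a toy written for illustration only and is not taken from the Small C source. Every name in it (peek, scan_number, emit, parse_sum, compile) is hypothetical, and the grammar it accepts, a sum of decimal constants, is an assumption chosen to keep the example short.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical toy, not the Small C source: a parser in the middle,
   fed by a front end and writing through a back end. */

static const char *src;   /* front end: remaining source text   */
static char out[256];     /* back end: generated assembly text  */

/* Front end: skip blanks, return the next character unconsumed. */
static int peek(void) {
    while (*src == ' ') ++src;
    return (unsigned char)*src;
}

/* Front end: scan an unsigned decimal constant. */
static int scan_number(void) {
    int n = 0;
    peek();                                   /* skip leading blanks */
    while (isdigit((unsigned char)*src))
        n = n * 10 + (*src++ - '0');
    return n;
}

/* Back end: append one line of assembly text to the output. */
static void emit(const char *op, int value) {
    char line[32];
    sprintf(line, "%s AX,%d\n", op, value);
    assert(strlen(out) + strlen(line) < sizeof out);
    strcat(out, line);
}

/* Parser: sum := number { '+' number } */
static void parse_sum(void) {
    emit("MOV", scan_number());
    while (peek() == '+') {
        ++src;                                /* consume the '+' */
        emit("ADD", scan_number());
    }
}

/* Driver: compile one expression and return the generated text. */
const char *compile(const char *text) {
    src = text;
    out[0] = '\0';
    parse_sum();
    return out;
}
```

Calling compile("1 + 2") yields the two lines MOV AX,1 and ADD AX,2. Notice that parse_sum() never touches the source text or the output buffer directly. It asks the front end for input and hands its decisions to the back end, which is the relationship the figure is meant to convey.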

It would be nice to simply start at the top and work our way down through the underlying functions. But we would certainly bog down in the middle as we became overwhelmed with new functions while trying to remember where we were in the parsing process. So instead, we shall first move from the bottom up and then from the top down, connecting in the middle. This approach requires some patience at first because the overall picture does not form until the last stage. However, it does make the top down stage much easier since we encounter only functions that are familiar to us.

First, before looking at the compiler itself, the routines in the CALL module of the library are examined. We do this since these routines are called frequently by the generated code, and so a knowledge of them is essential to an understanding of the output of the compiler.

Next, samples of the code generated by the compiler are compared to the source statements that produce them. Comparing the compiler's input and output gives a feel for what the compiler has to do and provides a basis for understanding the back end of the compiler.

After that, the data structures of the compiler are examined. These include the symbol tables (global and local), the switch table, the while queue, the literal pool, the staging buffer (used for optimizing expressions), and the macro buffers (used with #define commands).
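As a foretaste of the kind of structure involved, the following sketch shows one simple way a pair of symbol tables might be organized and searched. It is an assumption made for illustration, not the layout the Small C compiler actually uses. The entry fields, the table sizes, and the function names (add_symbol, find_in, find_symbol) are all hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical layout for illustration; the real tables differ. */
#define MAXSYM  16        /* assumed capacity of each table     */
#define NAMELEN 9         /* 8 significant characters plus null */

struct symbol {
    char name[NAMELEN];
    int  offset;          /* e.g. a stack or data offset */
};

struct table {
    struct symbol entry[MAXSYM];
    int count;
};

static struct table globals, locals;

/* Add a name to a table; returns the new entry, or NULL if full. */
struct symbol *add_symbol(struct table *t, const char *name, int offset) {
    struct symbol *s;
    assert(t != NULL && name != NULL);
    if (t->count >= MAXSYM) return NULL;
    s = &t->entry[t->count++];
    strncpy(s->name, name, NAMELEN - 1);
    s->name[NAMELEN - 1] = '\0';
    s->offset = offset;
    return s;
}

/* Linear search of one table, newest entry first. */
struct symbol *find_in(struct table *t, const char *name) {
    int i;
    for (i = t->count - 1; i >= 0; --i)
        if (strncmp(t->entry[i].name, name, NAMELEN - 1) == 0)
            return &t->entry[i];
    return NULL;
}

/* Search the local table first so that locals hide globals. */
struct symbol *find_symbol(const char *name) {
    struct symbol *s = find_in(&locals, name);
    return s ? s : find_in(&globals, name);
}
```

The point of keeping two tables is visible in find_symbol(): a name declared inside a function is found before a global of the same name, and the local table can be emptied at the end of each function without disturbing the globals.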

Then, the back end of the compiler is explored to learn the functions that directly produce the output. Next, to complete the bottom up phase, the front end of the compiler is studied.

With the preliminaries out of the way, we are finally ready to tie it all together by studying main() and the parsing functions. In so doing, we move from the top down, through the heart of the compiler.

Since expression analysis is such a large part of parsing, and since it is easily separated from the rest of the compiler, we cover it in a separate chapter. Likewise, the code optimizer is treated in a chapter of its own.

Finally, since Small C is meant to be experimented with, suggestions for further development are given. These projects, which range in difficulty from very easy to very hard, are a rich source of ideas for student assignments in courses on compiler construction.
