GENERATED CODE
As seen in Chapter 16, the compiler can be invoked in such a way that we can enter source statements from the keyboard and observe on the screen the code which they generate. Therefore, any of these examples can be verified easily.
Before proceeding, we need to establish some basic concepts underlying the compiler's view of the CPU. The compiler sees the CPU as a pair of 16 bit registers--primary and secondary--and a stack pointer.
The primary register is the recipient of expression values. When a unary operation is performed, the operand is placed in the primary register, and the operation is performed. When a binary operation is performed, the left-hand operand is evaluated first. Then, when it is seen that a binary operator follows, the left operand is pushed onto the stack while the right-hand operand is evaluated--also in the primary register. After that, the left operand is popped into the secondary register and the operation is performed with the result going to the primary register. If possible, the optimizer eliminates this use of the stack by moving the left-hand operand directly to the secondary register (if it is not needed for evaluating the right operand) or generating the left operand directly in the secondary register (at the point where the pop would be placed).
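The unoptimized sequence for a binary operation can be sketched in C. This is only a model written for illustration: the Machine structure and the function names are hypothetical and do not appear in the compiler.

```c
#include <stdint.h>

/* Hypothetical model of the compiler's view of the CPU: a primary
   register, a secondary register, and a small stack. */
typedef struct {
    int16_t primary;     /* receives every expression value            */
    int16_t secondary;   /* holds the left operand of a binary operator */
    int16_t stack[16];
    int sp;              /* number of words currently on the stack     */
} Machine;

static void push_primary(Machine *m)  { m->stack[m->sp++] = m->primary; }
static void pop_secondary(Machine *m) { m->secondary = m->stack[--m->sp]; }

/* Evaluate left - right without the optimizer's shortcuts. */
int16_t eval_subtract(Machine *m, int16_t left, int16_t right)
{
    m->primary = left;       /* left operand is evaluated first           */
    push_primary(m);         /* saved while the right operand is computed */
    m->primary = right;      /* right operand, also in the primary        */
    pop_secondary(m);        /* left operand returns in the secondary     */
    m->primary = (int16_t)(m->secondary - m->primary);
    return m->primary;       /* result ends up in the primary register    */
}
```

Subtraction is used because the order of the operands matters, which makes it easy to see that the left operand really does end up in the secondary register.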
As it employs the 8086 CPU (Appendix B), the compiler uses the AX register for the primary register and BX for the secondary register. Both of these registers consist of a pair of 8 bit registers. AX, for example, consists of AH (the high-order byte) and AL (the low-order byte).
In the following examples, the compiler's code optimizer was allowed to operate as usual. So the assembly code you see is the normal, optimized code. You may find it instructive to try the -NO switch on these examples.
Constants
The examples in Listing 19-1 show two functions, each containing constant expressions as stand-alone statements. Each statement generates a
    MOV AX,...
instruction that moves the constant value to the primary register. Notice also that a constant expression like
    123+321
is evaluated by the compiler and only the result is loaded. Since the expression yields a constant, it is evaluated at compile time rather than run time. The advantage should be obvious.
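The idea of evaluating a constant expression at compile time can be sketched as follows. The Expr structure and fold_constant() are hypothetical and handle only addition; they are not the compiler's own data structures or routines.

```c
/* Hypothetical expression node: a constant, a sum of two nodes, or
   (when both links are null) something unknown until run time. */
typedef struct Expr {
    int is_constant;
    int value;                   /* valid when is_constant is nonzero */
    const struct Expr *left;
    const struct Expr *right;
} Expr;

/* Returns 1 and stores the value if the whole expression is constant;
   returns 0 if any part of it must wait for run time. */
int fold_constant(const Expr *e, int *result)
{
    int l, r;
    if (e->is_constant) { *result = e->value; return 1; }
    if (e->left == 0 || e->right == 0) return 0;
    if (fold_constant(e->left, &l) && fold_constant(e->right, &r)) {
        *result = l + r;         /* done once, by the compiler */
        return 1;
    }
    return 0;
}
```

For a tree representing 123+321, fold_constant() reports the single value 444, which is all that a MOV AX,... instruction would need to load.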
Next, notice the handling of character strings. Each string generates
    MOV AX,OFFSET _m+n
where m is the number of a compiler generated label and n is an offset from the label to the beginning of the string. The operator OFFSET tells the assembler to place the offset part of the address in AX rather than the operand at that address. Since Small C uses a small memory model, the offset is effectively the full address. Thus, the presence of a character string produces in the program the address of the specified string. All of the strings in a function are dumped into the data segment, one after the other, when the compiler finds the end of the function. This literal pool for the function is preceded by a single compiler generated label. The underscore character is acceptable in labels. Its presence before the label number makes the assembler see the number as part of a label name rather than a number. Notice that each string is terminated with a null byte in the standard C fashion.
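How the offset n arises can be sketched with a small pool that collects strings one after the other. The names pool, pool_used, and pool_add() are hypothetical, as is the pool size; the sketch only mirrors the behavior described above.

```c
#include <string.h>

#define POOL_SIZE 256

static char pool[POOL_SIZE];   /* the literal pool for one function */
static int  pool_used = 0;     /* bytes used so far                 */

/* Appends a string and its null terminator to the pool. Returns the
   offset n of the string from the pool's label, or -1 if it does not fit. */
int pool_add(const char *s)
{
    int len = (int)strlen(s) + 1;      /* include the null byte */
    int offset = pool_used;
    if (pool_used + len > POOL_SIZE) return -1;
    memcpy(pool + pool_used, s, (size_t)len);
    pool_used += len;
    return offset;
}
```

Adding "abc" and then "de" yields offsets 0 and 4, because the first string occupies three characters plus its terminating null byte.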
Directives of the form
    name SEGMENT PUBLIC
tell the assembler to assemble the following code into the named segment. Small C works with just two segments--CODE and DATA. The assembler is switched between these segments by closing the current segment with
    name ENDS
and opening the other one with another segment directive. The result is a collection of segment fragments bearing the names CODE and DATA. The linker combines the fragments with the same name into a single segment in the EXE file. The
    ASSUME CS:CODE, SS:DATA, DS:DATA
directives tell the assembler what to expect in the segment registers at execution time. It needs this information in order to correctly generate offset values for memory references. The start up routine in the CALL module of the library sees to it that the segment registers contain these values.
Finally, in the second function, notice that character constants result in just their numeric values.
Global Definitions
The examples in Listing 19-2 illustrate the code generated when global objects are defined. Integer definitions are on the left and equivalent character definitions are in corresponding positions on the right. Each global object is first declared to the assembler as an entry point. This is done with the
    PUBLIC name
directive. As you can see, the compiler prefixes each name with a single underscore character which serves to avoid clashes with assembler reserved words.
DW (define word) directives define integers, and DB (define byte) directives do the same for characters. If no initial value is specified the
    n DUP (0)
syntax is used to allocate n occurrences of the value zero. (Recall that uninitialized globals are guaranteed to start with initial values of zero.) However, if initial values are given, then the individual values are listed, each resulting in an occurrence of an object. Notice that when fewer initial values than objects are given, then n DUP (0) is used to define the uninitialized trailing objects.
Note: The names of the objects in these examples were chosen to indicate the types of objects they are. For example, gi is a global integer, gca is a global character array, and gip is a global integer pointer. The examples which follow use the same naming conventions, except that l means local, a means argument, and e means external.
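A few global definitions in this naming style are sketched below. The name gia2 and the helper trailing_zero_count() are hypothetical; they are included only to show how many trailing objects remain for n DUP (0) when fewer initial values than objects are given.

```c
int  gi;                 /* global integer: no initializer, so it starts at zero */
char gc;                 /* global character                                     */
int  gia[10];            /* global integer array                                 */
char gca[10];            /* global character array                               */
int *gip;                /* global integer pointer                               */
int  gia2[5] = {1, 2};   /* hypothetical: two values given for five objects      */

/* Number of trailing objects left for n DUP (0) to define. */
int trailing_zero_count(int objects, int initial_values)
{
    return objects - initial_values;
}
```

For gia2, two values would be listed individually and the remaining three objects would be zero filled.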
The examples in Listing 19-3 illustrate the code generated when global objects are referenced. Again, integer and character examples are separated into left- and right-hand columns. Each example is written as a very simple expression statement--just the reference in question. This isolates the references for the sake of illustration. Of course the same code is generated when these references occur in more complicated expressions.
First of all, compare the references to integers and characters. In the first case, gi is obtained by moving a word from its place in memory as indicated by the label _GI. The assembler knows to move a word instead of a byte because the destination is a 16 bit register. In contrast, the reference to gc moves only a byte to AL. It also executes a CBW to convert from a byte in AL to a word in AX by means of sign extension.
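The effect of CBW can be modeled in a few lines of C. The function name cbw() is chosen here only for illustration; the sketch copies bit 7 of AL into every bit of AH, which is what sign extension means.

```c
#include <stdint.h>

/* Models CBW: AL is kept, and AH is filled with copies of AL's sign bit. */
uint16_t cbw(uint8_t al)
{
    uint8_t ah = (al & 0x80) ? 0xFF : 0x00;
    return (uint16_t)((ah << 8) | al);
}
```

A character such as 'A' (41 hex) becomes 0041 hex, whereas a byte with its high bit set, such as FF hex, becomes FFFF hex and so keeps its negative value as a word.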
Next look at the effect of placing an address operator (&) in front of these references. Since this calls for the address of the object, the OFFSET operator is given to the assembler so that the value of the label itself is moved to AX rather than the object at that address. Notice here that there is no difference between the code for an integer address and a character address.
When a subscript of value zero is used, the compiler simply skips the subscript arithmetic altogether and fetches the object at the array address. Thus gia[0] obtains the first integer in gia just as gi obtains the integer gi. Now, notice that the indirection operator (*) applied to an array name has the same effect as a zero subscript--it obtains the object at the designated address.
The next two examples in Listing 19-3 illustrate ordinary array subscripting. The reference gia[5] refers to the sixth integer in gia by specifying the source operand in a MOV instruction as _GIA+10. The assembler evaluates this expression by adding ten to the address of gia to determine the location of the desired integer. Since this is done at assembly time rather than run time, it has no effect on program performance. This is possible only because the subscript is a constant. Had it been a variable or a more complex expression, then more code would have been generated which would have to be evaluated at run time. Comparing the integer to the character reference, we see that the subscript value of the integer reference has been doubled since integers occupy two bytes each. Also, a CBW instruction promotes the character to an integer.
The next examples illustrate the equivalence of subscripting and writing the address arithmetic directly. Notice that the same code is generated by gia[5]; and *(gia+5);.
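Both points, the equivalence of the two notations and the scaling of the subscript, can be checked with a short sketch. Here int16_t stands in for Small C's two-byte integer, and byte_offset() is a hypothetical helper, not a compiler routine.

```c
#include <stddef.h>
#include <stdint.h>

/* int16_t stands in for Small C's two-byte int. */
static int16_t gia[10] = {0, 11, 22, 33, 44, 55, 66, 77, 88, 99};
static char    gca[10] = "abcdefghi";

/* Byte offset added to the array label for a constant subscript. */
size_t byte_offset(size_t subscript, size_t element_size)
{
    return subscript * element_size;
}
```

With these definitions gia[5] and *(gia+5) name the same object, byte_offset(5, sizeof gia[0]) is 10, and byte_offset(5, sizeof gca[0]) is 5.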
The remaining examples in Listing 19-3 serve to compare pointer references to the previous array references. At the source level, these are conceptually the same--unadorned references yield an address, both may be subscripted, and both may have address arithmetic performed on them. However, they differ fundamentally in that an array name is not an lvalue, since it represents a constant address, whereas a pointer name is an lvalue since it identifies a piece of memory which can be changed. Thus, an unadorned pointer name like gip produces an address by fetching the word at _GIP instead of its address. Since a Small C pointer is always two bytes long, regardless of whether it refers to characters or integers, the integer pointer reference generates the same code as the character pointer reference. They both fetch the contents of the pointer which is assumed to be an address. Of course, it is the programmer's responsibility to see that the pointer does in fact contain the correct address value.
Placing an indirection operator (*) before a pointer fetches the object pointed to. Thus *gip first fetches the contents of the pointer into the secondary register BX, from which it can serve as a base address. Then, by means of
    MOV AX,[BX]
it moves the word pointed to by BX into AX. The brackets can be read as "the contents of the memory location." This must be a two step operation, because the value of the pointer is a variable. It may have a different value with each execution of the reference. Comparing the integer and character examples, we see that whereas they load the pointer value the same way, the character is fetched into AL and then promoted to an integer.
Finally, the last examples in Listing 19-3 illustrate the code generated by subscripting pointers and performing address arithmetic on them. Note that these are equivalent to array references except that the pointer's value provides the base address whereas the array's name is itself a constant address.
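The difference between the two kinds of base address can be seen in a short sketch. The functions fetch_via_pointer() and advance_pointer() are hypothetical names, and int16_t again stands in for Small C's two-byte integer.

```c
#include <stdint.h>

static int16_t gia[10] = {0, 10, 20, 30, 40, 50, 60, 70, 80, 90};
static int16_t *gip = gia;    /* the pointer's value supplies the base address */

/* Subscripting the pointer: the base is whatever gip currently holds. */
int16_t fetch_via_pointer(int index)
{
    return gip[index];
}

/* Unlike the array name, the pointer can be given a new value. */
void advance_pointer(int count)
{
    gip = gip + count;
}
```

While gip holds the address of gia, gip[5] and gia[5] are the same object. After advance_pointer(2), the same reference gip[5] reaches gia[7], which is why the pointer's value must be fetched at run time.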
External Declarations and References
The examples in Listing 19-4 illustrate the code generated when objects are declared external. In that case, there is no definition of the objects. They are simply declared external to the assembler by means of EXTRN directives.
The function in Listing 19-6 illustrates how local objects are both declared and referenced. The function proceeds from the top of the left column to the bottom of the right column. As before, integer references are on the left and equivalent character references are on the right.
First, notice that on entry to the function, the base pointer (BP) is saved on the stack and the new stack pointer value (SP) is moved to BP as the base of the stack frame for this function call. Had there been arguments passed to this function they would be located at positive displacements from BP, beneath the saved value of BP and the return address. Local objects are allocated on top of the stack, and so are accessed by negative displacements from BP.
Rather than decrement SP separately for each local object, the compiler defers until the first executable statement is found. At that point a single adjustment to SP allocates all locals for the current block. In this example, we define a character, a character array of 10 elements, and a character pointer--a total of 13 bytes. Then we define an integer, an integer array of 10 elements, and an integer pointer--a total of 24 bytes. These numbers combined account for the negative adjustment to SP in the instruction
    ADD SP,-37
In its symbol table, the compiler keeps track of the displacement from BP to each variable. Decrementing SP simply ensures that the stack space claimed by these locals will not be used for other purposes; it reserves the space.
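The arithmetic behind the adjustment can be written out as a sketch. The enumeration and function names are hypothetical; the sizes are those the text gives for Small C on the 8086 (one byte per character, two bytes per integer or pointer).

```c
/* Object sizes as Small C sees them on the 8086. */
enum { CHAR_SIZE = 1, INT_SIZE = 2, POINTER_SIZE = 2 };

/* A character, a 10-element character array, and a character pointer. */
int char_locals_size(void)
{
    return CHAR_SIZE + 10 * CHAR_SIZE + POINTER_SIZE;
}

/* An integer, a 10-element integer array, and an integer pointer. */
int int_locals_size(void)
{
    return INT_SIZE + 10 * INT_SIZE + POINTER_SIZE;
}

/* The single value added to SP to allocate all of the locals at once. */
int sp_adjustment(void)
{
    return -(char_locals_size() + int_locals_size());
}
```

The two groups come to 13 and 24 bytes, so sp_adjustment() yields -37.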
Next, notice that all local references involve a source operand of the form
    -n[BP]
where n is the displacement. When an address is needed, the load effective address instruction LEA is used. This instruction loads the address of the source operand rather than the operand itself.
Function Declarations and Calls
The example in Listing 19-7 shows the code generated when a function with arguments is declared and when the arguments passed to it are referenced. Again, integer references are on the left and character references are on the right.
In this example, observe that while arguments are referenced like locals, there are some differences. First, and most importantly, there is no allocation of arguments in the called function. Instead, the arguments are pushed onto the stack at the point of the function call.
The function assumes that they are already on the stack when it receives control. If the wrong number or type of arguments are passed, the function forges ahead anyway, just as if all of the required arguments were present.
Another difference is that arguments are on the stack beneath the base address in BP rather than above it. As we said above, BP points to its original (saved) value on the stack. Beneath that is the return address which was placed on the stack by the call instruction. And then come the arguments. Since they are pushed onto the stack in the order of their appearance in the function call, the last argument is found immediately below the return address, at BP+4. The next to last argument is at BP+6, and so on. You should recall that arrays cannot be passed. When an array name is given as an argument, the array's address is passed. Furthermore, since such an address exists in memory and can be changed, it is actually a pointer. This fact can be used to advantage within the function.
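The displacement of any argument follows directly from this layout. The sketch below assumes that every argument occupies one two-byte word on the stack, as the displacements BP+4 and BP+6 suggest; argument_displacement() is a hypothetical helper.

```c
/* Displacement from BP of the argument at `position` (0 = first written)
   when `count` two-byte arguments are pushed in order of appearance. */
int argument_displacement(int position, int count)
{
    int saved_bp = 2;                         /* word that BP points to    */
    int return_address = 2;                   /* pushed by the call        */
    int pushed_after = count - 1 - position;  /* arguments nearer to BP    */
    return saved_bp + return_address + 2 * pushed_after;
}
```

For a call with three arguments, the third is at BP+4, the second at BP+6, and the first at BP+8.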
The last statement of the function is
    return (gi1);
which fetches the global integer gi1 into AX as the return value, and generates the return sequence
    POP BP
    RET
If local variables had been declared, the pop would have been preceded by an adjustment to SP to deallocate the locals and return SP to the saved value of BP.
If the return statement were removed, the return sequence would still be generated automatically. In that case, however, no return value would be established. Whatever happened to be in AX (actually *acp) would be the return value.
    POP BX
    XCHG AX,BX
    PUSH BX
swaps the argument in AX with the function address on the stack. This leaves the function address in AX. However, if another argument follows, it gets pushed onto the stack again. That way it floats on top of the stack as arguments are processed. Then when all of the arguments have been processed, the address is not pushed onto the stack, but remains in AX. As usual, a count of the number of arguments is loaded into CL. Finally, the function is called with
    CALL AX
which calls the function pointed to by AX. As usual, on return, the arguments are deallocated by incrementing SP.
It would be easy to go on with examples of interesting expressions. For the sake of brevity, however, only a representative sample of the various operators and one example of a fairly complex expression are illustrated. Also, for simplicity, only global objects are referenced. You can infer from the previous examples the effects of referencing locals and arguments.
The examples in Listings 19-10 and 19-11 show the effects of unary operators. First is the logical NOT operator, which is implemented as a call to __LNEG() in the CALL module. This routine logically negates the value in AX and leaves the result in AX.
    MOV BX,_GIP
    MOV AX,[BX]
in which the first move obtains the pointer value in BX, and the second move uses it as the address from which the object is moved to AX. In the second case, that object is itself used as the address of the sought object. Therefore, the instructions
    MOV BX,AX
    MOV AX,[BX]
are added to the sequence. These move the first object to BX from which it points to the final object which is loaded into AX.
The third case illustrates that there is no limit to the levels of indirection that can be applied.
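Repeated indirection can be modeled as a loop in which each level uses the value just fetched as the next address. This is a simplified sketch: the memory array is hypothetical and holds one word per address, which is not how 8086 memory is organized, and fetch_indirect() is not a compiler routine.

```c
#include <stdint.h>

#define MEMORY_WORDS 16
static uint16_t memory[MEMORY_WORDS];   /* simplified: one word per address */

/* Applies `levels` indirections, starting from the value held in a pointer. */
uint16_t fetch_indirect(uint16_t pointer_value, int levels)
{
    uint16_t ax = pointer_value;
    while (levels-- > 0) {
        uint16_t bx = ax;                    /* like MOV BX,AX             */
        if (bx >= MEMORY_WORDS) return 0;    /* outside this toy memory    */
        ax = memory[bx];                     /* like MOV AX,[BX]           */
    }
    return ax;
}
```

Each extra level simply repeats the same pair of moves, so nothing in the scheme limits how many levels can be applied.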
Applied to the array element gia[5], the address operator first loads the address of the array, then offsets it to the sixth element. This could be optimized to
    MOV AX,OFFSET _GIA+10
in which the addition is performed at assembly time rather than run time. This is left as an exercise.
Note: Since the three subexpressions are tested against zero (false), in-line code is generated rather than a call to one of the comparison routines in the CALL module. In each case, a bitwise OR of AX is performed on itself; this leaves the register unchanged, but sets the CPU's flags. The zero flag is tested by JNE $+5 which jumps around the following JMP _8 if the flag is not set (AX is not zero). The expression $+5 tells the assembler to jump 5 bytes ahead of the current location; and, since the jump instructions occupy 2 and 3 bytes respectively, this targets the next instruction. It may seem wasteful to use two jump instructions back to back when a simple JE _8 would do the job. However, the 8086 CPU does not support conditional jumps to arbitrary addresses. Its conditional jumps are relative to the instruction pointer (IP) and only an 8-bit signed displacement is used. This limits the range to -128 to +127 bytes. Since the subexpressions could be any size, we must use an unconditional jump (which has no such limit) to reach _8. Notice that the instruction at _8 is redundant. Chapter 28 suggests improvements to this code.
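The two numbers in this note can be checked with a small sketch. The names below are hypothetical; the instruction lengths and the range are the ones just stated.

```c
/* Lengths in bytes of the two jump instructions, as given in the note. */
enum { JNE_LENGTH = 2, JMP_LENGTH = 3 };

/* Distance from the JNE to the instruction after the JMP: the 5 in $+5. */
int skip_distance(void)
{
    return JNE_LENGTH + JMP_LENGTH;
}

/* Nonzero if a displacement fits the 8-bit signed field of a conditional jump. */
int fits_short_jump(long displacement)
{
    return displacement >= -128 && displacement <= 127;
}
```

A displacement of 127 fits, but 128 does not, and the compiler cannot know in advance how far away _8 will be. That is why the conditional jump is used only to hop over an unconditional one.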
The logical OR operator is exactly opposite; the first term to yield true terminates the process with a value of one (true).
Notice that the subexpressions operated on do not have to contain relational operators. The last term gc, for instance, yields its own value which is taken for true if it is not zero, and false otherwise.
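The early termination of both operators can be observed by counting how many terms are actually evaluated. The counter and the helper term() are hypothetical additions made only so that the evaluations can be counted.

```c
static int evaluations = 0;

/* Returns its argument unchanged, noting that it was evaluated. */
static int term(int value)
{
    evaluations++;
    return value;
}

/* Logical AND: the first false term ends the evaluation with zero. */
int all_true(int a, int b, int c)
{
    return term(a) && term(b) && term(c);
}

/* Logical OR: the first true term ends the evaluation with one. */
int any_true(int a, int b, int c)
{
    return term(a) || term(b) || term(c);
}
```

Calling all_true(3, 0, 7) yields zero after evaluating only two terms, and any_true(0, 'x', 0) yields one after two terms, even though 'x' is not the result of a relational operator.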
The += assignment is a bit more involved. Notice that it has the same effect as the expression
    gi = gi + 5
Here gi is fetched, 5 is added to it, and the result is moved back to gi. Since the result remains in AX, it also becomes the value of the += operator.
It then compares that value to 5, yielding either true (one) or false (zero). This in turn multiplies 'y' (ASCII value 121) for the value to assign to gi. So gi is set to 121 if the function returns 5, otherwise zero. In either case, gc is set to the value returned by func().
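The behavior just described can be reproduced with the sketch below. The statement in evaluate() is a reconstruction from the description, not necessarily the exact expression in the listing, and func_result is a hypothetical variable that lets us choose what func() returns.

```c
static int  gi;
static char gc;
static int  func_result = 5;    /* hypothetical: the value func() returns */

static int func(void)
{
    return func_result;
}

/* Assign func()'s value to gc, compare it to 5, and multiply the
   resulting one or zero by 'y' to obtain the value for gi. */
void evaluate(void)
{
    gi = 'y' * ((gc = (char)func()) == 5);
}
```

When func() returns 5, gi receives 121 (assuming the ASCII character set); for any other return value it receives zero. In both cases gc holds whatever func() returned.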
Following are samples of the code generated for each of the statements known to the compiler. First, a simple if statement (Listing 19-21) is presented. In this example, the expression being tested is only a variable. It is loaded into AX where it is tested for true or false. The same in-line logic as we saw in Listing 19-18 is used. Only if gi is not zero is 5 assigned to it.
    gi = 5;
is executed if gi yields true and the second statement
    gi = 10;
is executed if it yields false. One and only one of these statements is executed. Notice that the first controlled statement is terminated by a jump around the second one.
The while statement has a particularly simple structure in assembly language, as the example in Listing 19-25 shows. In this example, the control variable gi is decremented with each iteration and checked for true or false. If it is false (zero) then a jump to the terminal label is performed. Otherwise, the function call is performed and a jump back to the top is executed. The loop continues until the tested expression becomes false.
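The jump structure can be mimicked in C with labels and goto statements. The loop is assumed here to be written as while (--gi) func(); since the listing is not reproduced, the exact form of the test is a guess. The label names and the call counter are likewise hypothetical.

```c
static int gi;
static int calls;

static void func(void)
{
    calls++;
}

/* The loop as it might be written in C (assumed form). */
int run_while(int start)
{
    gi = start;
    calls = 0;
    while (--gi) func();
    return calls;
}

/* The same loop laid out the way the generated code is described. */
int run_lowered(int start)
{
    gi = start;
    calls = 0;
top:
    if (--gi == 0) goto done;   /* false: jump to the terminal label */
    func();                     /* otherwise perform the call        */
    goto top;                   /* and jump back to the top          */
done:
    return calls;
}
```

Both versions call func() the same number of times for any positive starting value, which shows that the while statement needs nothing more than one test, one conditional exit, and one jump back.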
    while (1) ...
However, the code is less efficient because of the extra jumping around.
This finishes our overview of the code generated by the Small C compiler. Many other situations could have been presented, but the combinations are endless. These should suffice to illustrate the task performed by the compiler, and to make the compiler's output understandable.
Of course, you can test the compiler further yourself. The simplest way is to invoke it without command-line arguments. Then enter source statements from the keyboard and watch on the screen what the compiler generates. Whenever there is doubt as to what the compiler does with some particular statement, just ask it.