EXPRESSIONS

Expression lists are evaluated from left to right with the right-most expression determining the value of the list. Before looking at the kinds of expressions we can write in C, we will first consider the process of evaluating expressions and some general properties of operators.

**Operator Properties**

**
** The basic problem in evaluating expressions is deciding which parts of an expression are to be associated with which operators. To eliminate ambiguity, operators are given three properties: *operand count*, *precedence*, and *associativity*.

Operand count refers to the classification of operators as unary, binary, or ternary according to whether they operate on one, two, or three operands. The unary minus sign, for instance, reverses the sign of the following operand, whereas the binary minus sign subtracts one operand from another.

**
**

a + b * cwould be evaluated by first taking the product of

Parentheses may be used to alter the default precedence or to make it more explicit. They must always be used in matching pairs. Evaluation of parenthesized expressions begins with the innermost parentheses and proceeds outward, with each parenthesized group yielding a single operand. Within each pair of parentheses, evaluation follows the order dictated by operator precedence and subordinate parentheses. Writing the above example as

(a + b) * coverrides the default precedence, so that the addition is performed before the multiplication. On the other hand,

a + (b * c)simply makes the default precedence explicit. Table 9-1 lists the Small C operators in descending order of precedence, going from top to bottom in the left column and then in the right. All operators listed together have the same precedence. Except as grouped by parentheses, they are evaluated as they are encountered.

The last property, associativity, determines whether evaluation of a succession of operators at a given precedence level proceeds from left to right or from right to left. This is also called *grouping* because the evaluation of each operator/operand(s) group yields a single value which then becomes an operand for the next operator. Of course, the result of the last operation becomes the value of the whole expression. Grouping is shown in Table 9-1 by arrows in the upper right corner
of each precedence box.

Since addition groups left to right, the expression

a + b + cis evaluated by first adding

**Operands**

**
**Operands that are not the immediate result of the application of an operator are called *primary* operands, or just *primaries*. Examples are constants, variable names, and so on. Although we treat function calling and subscripting as operations, the values they produce are still considered primaries. We might consider these as operations that help fetch primary values, rather than operations that determine values.

As Table 9-1 indicates, a pair of parentheses are considered an operator when they designate a function call; but when they direct the evaluation precedence they do not function as operators. Instead, they direct the expression analyzer to treat the enclosed expression as a simple operand. For this reason, parenthesized expressions are considered to be primaries. When we study the Small C expression analyzer (Chapter 26) we shall see that precedence setting parentheses are handled in the same function that deals with primary objects. Small C accepts 11 kinds of primary operand. They are:

- Numeric Constants. Numeric constants (decimal, hexadecimal, or octal) are stored internally as 16-bit values. Ordinarily, they are treated as signed integers; however, if they are written without a sign and are greater than 32767(decimal), Small C treats them as unsigned integers.
- Character Constants. Chapter 3 described how character constants are in fact stored internally as 16-bit values. Single-character constants always have zeroes in the high-order byte, making them positive. Double-character constants, however, could have the sign bit set if the left-most character is written as an octal escape sequence. Character constants of either type are treated as signed quantities.
- String Constants. A string constant, in an expression, yields the 16-bit address of the string. Being an address, it is treated as an unsigned quantity.
- Integer Variables. Integer variables are stored as 16-bit values. They enter into operations either as two's complement signed or as unsigned values depending on their declarations. Elements of integer arrays are treated as integer variables.
- Character Variables. As indicated in Chapter 4, character variables are automatically promoted to integers, either by sign extension or zero padding depending on whether they are signed or unsigned. They are treated as two's complement signed or unsigned quantities depending on their declarations. When values are assigned to character variables, the high-order byte is truncated. Elements of character arrays are treated as character variables.
- Integer Array Names. The unsubscripted name of an integer array yields the address of the array. Being an address it is treated as an unsigned quantity. Furthermore, values added to, or subtracted from, such an address are doubled so the effect of the operation will be to offset the address by the specified number of integers. Also, when the difference of two integer addresses is taken, the result is divided by two so the answer will reflect the number of integers between the addresses.
- Character Array Names. The unsubscripted name of a character array yields the address of the array. Being an address it is treated as an unsigned quantity. These addresses cause no scaling of values added to or subtracted from them since characters occupy one byte each. Likewise, the difference of two character addresses is not scaled because it already reflects the number of characters between them.
- Integer Pointers. An integer pointer yields the address of an integer. Being an address it is treated as an unsigned quantity. Furthermore, values added to, or subtracted from, such an address are doubled so the effect of the operation will be to offset the address by the specified number of integers. Also, when the difference of two integer addresses is taken, the result is divided by two so the answer will reflect the number of integers between the addresses.
- Character Pointers. An character pointer yields the address of an character. Being an address it is treated as an unsigned quantity. Character addresses cause no scaling of values added to or subtracted from them since characters occupy one byte each. Likewise, the difference of two character addresses is not scaled because it already reflects the number of characters between them.
- Function Calls. Functions in Small C are always considered to yield signed integer values. A function name followed by parentheses generates a call to the function which yields the value returned by the function. If an undefined name is found in an expression, it is implicitly declared to be a function name, and is treated as such.
- Function Names. A function name without trailing parentheses yields the address of the named function. If an undefined name is encountered in an expression, it is implicitly declared to be a function name, and is treated as such.

All other operations are the same whether operating on signed or unsigned quantities.*multiply

/divide

%modulo

>greater than>=greater than or equal<less than

<=less than or equal

**Constant and Variable Expressions**

**
**When the Small C compiler encounters an expression, it does one of two things. If the expression contains only constant (numeric or character) primaries, the result is a constant, so the compiler evaluates the expression and generates the result. However, if the expression contains variables or addresses (e.g., string constants), the result cannot be determined at compile time, so the compiler generates code that will perform the evaluation at run time. As we shall see in Chapter 26,
the same expression analyzer performs both tasks. It assumes it is dealing with a variable expression, and so generates run-time code in a staging buffer. However, at each step along the way, it knows whether or not it has a constant value. If the entire expression yields a constant, then the code in the buffer is replaced by code that yields the final value directly. This compile-time evaluation of constant expressions means that we can write constants in complex ways, that reveal their constituent parts, without
incurring any run-time overhead. This technique can add clarity to a program's logic by documenting the derivation of "magic" numbers.

**Operators**

**
**Every operator in a Small C expression yields a 16-bit value. Normally, these are numeric values representing numbers, character codes, addresses, or whatever. Some operators, however, produce *Boolean* (logical) values. There are only two such values, *true* and *false*. When an operand is tested for its logical value, zero is interpreted as false and any non-zero value is considered true. C places no limit on the kinds of values that may be tested logically. It
matters not whether a value is a number, an address, or a Boolean value, zero means false and non-zero means true.

There are four operators which interpret operands as Boolean values--the *logical AND* (**&&**), the *logical OR* (**||**), the *logical NOT* (**!**), and the *conditional operator* **(?:**). The first three of these also yield Boolean values. Boolean values produced by operators are always set to one for true and zero for false.

There are three bitwise operators--*AND* (**&**), *inclusive OR* (**|**), and *exclusive OR* (**^**)--which perform logical operations on the individual bits of their operands and reflect the results in the bits of the values they yield. These operators test corresponding bits of two operands to determine the corresponding bit of the result. The same operation is performed simultaneously on all 16 bits of the operands to produce a 16-bit result.

We now look at the individual operators in the order of precedence (high to low) as shown in Table 9-1.

**() Function Call**

**
**Parentheses, possibly containing a list of argument expressions, may follow a primary expression. Such parentheses are taken as an operator causing the function designated by the primary expression to be called.

Normally the primary expression is just a function name. Sometimes it is a pointer reference like (***fp**). It might even be a subscripted pointer reference, as in** ****fpa[5]**. Since Small C does not perform type checking, **fpa** can be an ordinary integer array containing function addresses. When a function name is used, the function is called directly by referencing its label. In other cases, the primary expression is evaluated to produce the function's address in
the primary register. The function is then called indirectly through the primary register.

The argument expressions are evaluated from left to right and passed to the called function. The value yielded by this operator is the value returned by the function. In Small C it will always be interpreted as a signed integer.

**[] Subscript**

**
**A pair of square brackets following a primary expression is considered to contain a subscript into an array of objects. The subscript can be any legal expression, and its value will be interpreted as an integer. The primary expression should yield an address. If it produces a character address, then the subscript will be added to it. The resulting address is then used to obtain a character. If the primary expression produces an integer address, then the subscript will be doubled before
being added to it. The resulting address is then used to obtain an integer. As we can easily verify, these operations are consistent with the concept of subscripting from zero. The value yielded by this operation is the value of the object referenced.

The primary expression is usually an array name, but pointer names are also frequently used. Since any expression surrounded by parentheses is a primary expression, we can perform arithmetic on array and pointer names before applying the subscript.

**! Logical NOT**

**
**This operator yields the logical negation of the operand to its right. If the operand is false it yields true (one). And if it is true it yields false (zero).

**~ One's Complement**

**
**This operator complements the bits of the operand to its right. Each bit containing one becomes zero and each bit containing zero becomes one.

**++ Increment**

**
**This operator increments an operand by one. If it precedes the operand, it yields the incremented value. However, if it follows the operand, it yields the original value in the expression, although it leaves the operand itself incremented. If the operand is an address, it increments to the next character or integer depending on the type of the address. The operand must be an lvalue.

**-- Decrement**

**
**This operator decrements an operand by one. If it precedes the operand, it yields the decremented value. However, if it follows the operand, it yields the original value in the expression, although it leaves the operand itself decremented. If the operand is an address, it decrements to the prior character or integer depending on the type of the address. The operand must be an *lvalue*.

**- Unary Minus**

**
**The unary minus operator reverses the sign of the operand on its right. This is accomplished by taking the two's complement of the operand. This operator is distinguished from the binary minus by its context; in this case there is no operand on the left.

*** Indirection**

**
**When this operator is written to the left of an address it changes the meaning from "address of" to "object at." In other words, it indirectly obtains the object pointed to by its operand. An integer address, yields an integer and a character address yields a character.

**& Address**

**
**The address operator yields the address of the operand to its right. In Small C, the address operator does not group; it can only be applied directly to variable, pointer, and subscripted pointer and array names.

**sizeof()**

**
**To promote the writing of more portable programs, the C language incorporates an unusual operator that yields the size in bytes of a specific object or of objects of a given type. This operator has the form** ****sizeof(...)**** **where the ellipsis stands for a defined object or a data type. Although it looks like a function call, this is really an operator. It has the effect of returning an integer constant which is the size of the named object or of objects of the specified type. If, f
or example, we have the declaration int **ia[10];** then **sizeof(ia)** will yield the constant 20--the number of bytes occupied by the array.

If a data type, instead of a specific object, is given then it is written as a declaration statement without an identifier. For example, **sizeof(unsigned int)** or just **sizeof(unsigned)** yields two, since Small C integers occupy two bytes. Likewise, **sizeof(char *)** yields two since that is the size of a character pointer. Small C does not accept the array modifier** [ ]**.

This operator should be used whenever writing code that depends on the size of variables of a given type, or of a specific object. It is legal anywhere a constant is appropriate. And, since it is evaluated at compile time, there is no run-time overhead associated with its use.

*** Multiplication**

**
**The multiplication operator yields the product of the operands on its left and right. Although the operation produces a 32-bit result, only the lower 16 bits are retained.

**/ Division**

**
**The division operator yields the quotient of the left operand divided by the right operand.

**% Modulo**

**
**The modulo operator yields the remainder of the left operand divided by the right operand.

**+ Addition**

**
**The addition operator performs an algebraic addition of the two adjoining operands. If one of the operands is an address, the other is interpreted as a character or integer offset, depending on the type of the address. In that case, if the address points to integers, the non-address operand is doubled before the operation is performed. The value generated by an address offset addition is an address of the same type.

**- Subtraction**

**
**The subtraction operator subtracts the right operand from the left operand. If one of the operands is an address, the other operand is interpreted as a character or integer offset, depending on the type of address. In that case, if the address points to integers, the non-address operand is doubled before the operation is performed. The value generated by an address offset subtraction is an address of the same type.

If both operands are addresses of the same type, the result is adjusted to represent the number of objects lying between them. In the case of integer addresses, the result is divided by two. In the case of character addresses, no adjustment is necessary. The result is an integer, not an address. The final result is likely to be useless under any of the following conditions: (1) both addresses are not of the same type, (2) the first address is smaller than the second, or (3) both addresses are not in the same array.

**<< Shift Left**

**
**This operator shifts the left operand left the number of bit positions indicated by the right operand. Zeroes are inserted into the low end of the result. For each position shifted, the left operand is effectively multiplied by two. Thus, **x << 2** multiplies **x** by four. Since shifting is much faster than multiplying, this is the preferred way to multiply by powers of two.

**>> Shift Right**

**
**This operator shifts the left operand right the number of bit positions indicated by the right operand. The sign bit retains its value and is replicated into the next lower bit with each shift. Thus the sign of the left operand is preserved. For each position shifted, the left operand is effectively divided by two. Thus,** ****x >> 2**** **divides **x** by four. Since shifting is much faster than dividing, this is the preferred way to divide by powers of two.

**< Less Than**

**
**This operator yields true if the left operand is less than the right operand.

**<= Less Than or Equal**

**
**This operator yields true if the left operand is less than or equal to the right operand.

**> Greater Than**

**
**This operator yields true if the left operand is greater than the right operand.

**>= Greater Than or Equal**

**
**This operator yields true if the left operand is greater than or equal to the right operand.

**== Equal**

**
**This operator yields true if the two operands are equal.

**!= Not Equal**

**
**This operator yields true if the two operands are not equal.

**
**This operator yields the bitwise AND of the adjoining operands. If both corresponding bits are one, the corresponding bit of the result is set to one; however, if either is zero, the resulting bit is reset to zero. This operator associates left to right.

**
**This operator yields the bitwise exclusive OR of the adjoining operands. If either, but not both, corresponding bits are one, the corresponding bit of the result is set to one; on the other hand, if both bits are the same (zero or one) the resulting bit is set to zero. This operation is exclusive in the sense that it excludes from the OR the case where both corresponding bits are ones. This operator associates left to right.

**
**This operator yields the bitwise OR of adjacent operands. If either corresponding bit is one, the corresponding bit of the result is set to one; however, if both are zero, the resulting bit is zero. It is inclusive in the sense that it includes in the OR the case where both corresponding bits are ones. This operator associates left to right.

**
**This operator yields the logical AND of the adjoining operands. If both operands are true it yields true (one); otherwise, it yields false (zero). This operator associates left to right.

If an expression contains a series of these operators at the same precedence level, they are evaluated left to right until the first false operand is reached. At that point the outcome is known, so false is generated for the entire series, and the right-most operands are not evaluated. This is a standard feature of the C language, so it can be used to advantage without fear of portability problems. For greater efficiency, compound tests should be written with the most frequently false operands first; in that case the trailing operands will be evaluated only rarely. This feature, together with the fact that the operands can be expressions of any complexity (including function calls), makes it possible to write very compact code. For example,

func1() && func2() && func3() ;is equivalent to

if(func1()) { if(func2()) { func3() ; } }

Also, trailing operands may safely reference variables (e.g., subscripts) which have been verified in preceding tests. For example,

if(ptr && *ptr == '-') ...;performs

**
**This operator yields the logical OR of the adjoining operands. If either is true it yields true, (one); otherwise, it yields false (zero). This operator associates left to right.

If the expression contains a series of these operators, the operands are evaluated left to right until the first true operand is reached. At that point the outcome is known, so true is generated for the entire series, and the right-most operands are not evaluated. This is a standard feature of the C language, so it can be used to advantage without fear of portability problems. For greater efficiency, compound tests should be written with the most frequently true operands first; in that case the trailing operands will be evaluated only rarely. This feature, together with the fact that the operands can be expressions of any complexity (including function calls), makes it possible to write very compact code. For example,

func1() || func2() || func3() ;is equivalent to

if(func1()) ; else if(func2()) ; else func3() ;

**
** This unusual operator tests the Boolean value of one expression to determine which of two other expressions to evaluate. Since it is the only operator that requires three operands it is sometimes called the *ternary* operator. It consists of two characters--a question mark and a colon--which separate the three expressions. The general form is

This sequence may appear anywhere an expression (or subexpression) is acceptable.Expression1?Expression2:Expression3

(y ? x/y : x) + 5If

Unless they are enclosed in parentheses, there are restrictions on the operators that can be used in the three expressions. The first expression can contain only operators that are higher in precedence than the conditional operator; that is, assignment and conditional operators are disallowed. The second and third expressions cannot contain assignment operators, but may have conditional operators or any operators of higher precedence. Considering these restrictions and the fact that this operator associates from right to left, the expression

a ? b : c ? d : eis equivalent to

a ? b : (c ? d : e)and

a ? b ? c : d : eis equivalent to

a ? (b ? c : d) : eObviously, whenever more than one conditional operator is used in combination, it is a good idea to use parentheses even if they are not strictly needed.

Since this operator must yield a single value, and that value comes from either of two expressions which could yield different data types, some further restrictions must apply to the second and third expressions. Otherwise, the compiler might not be able to assign a data type to the result. In other words, it is impossible to determine the attributes of the result at compile time if they are allowed to be determined at run time. Therefore, the last two expressions must observe the following additional restrictions:

- Both expressions yield numeric constants. In this case, because the two possible constant values are assumed to be different (otherwise, there would be no point selecting between them), the result is considered to be an integer variable.
- Either one or the other yields a numeric constant. In these cases, the numeric constant (often zero) is considered to be an exceptional value of the same type as the other expression. Therefore, the result is given the attributes of the non-constant expression.
- Both expressions yield addresses of the same data type. In this case, both expressions have the same properties, and naturally the result is the same.
- Both yield integer values. Likewise, in this case, the expression properties match and, therefore, determine the resulting attributes.

The left (receiving) operand must be an lvalue; that is, it must be an object which occupies space in memory and is therefore capable of being changed. An unadorned array name, **array** for instance, will not do; whereas, ***array** and **array[x]** are lvalues. Other operands that do not qualify as lvalues are calculated values which are not prefixed by the indirection operator, and names which are prefixed by the address operator. Conversely, the only allowable *lvalues*
in Small C are (1) variable names, (2) pointer names, (3) subscripted pointer names, (4) subscripted array names, and (5) subexpressions that are prefixed by the indirection operator.

If the receiving object is a character, it receives only the low-order byte of the value being assigned.

All but one of the assignment operators have the form **?=** where **?** stands for a specific non-assignment operator. These operators provide a shorthand way of writing expressions of the form

a = a ? bThus,

a += bis equivalent to

a = a + band

a <<= bis equivalent to

a = a << band so on. These operators produce more efficient code since the receiving operand is evaluated only once.

UNIX C originally implemented these operators with the equal sign leading instead of trailing. The result was that some of them (**=+**,** =-**, **=***, and **=&**) were ambiguous. Small C always takes these as a pair of separate operators. To avoid portability problems, always put a space between these character pairs.

Since assignment is a part of expression evaluation, traditional assignment statements are really just stand alone expressions. And, since all operators yield values which may be used in the further process of expression evaluation, any number of assignment operators may appear in an expression. Most common are multiple assignments like

a = b = c = 5Since these operators group right to left,

a = (b = (c = 5))Expressions like

val[i] = i = 5deserve special consideration. Notice that the variable

The assignment operators are:

**= Assignment**

This operator assigns the value of the right operand to the left operand.

**+= Add and Assign**

This operator assigns the sum of the left and right operands to the left operand.

**-= Subtract and Assign**

This operator subtracts the right operand from the left operand and assigns the result to the left operand.

***= Multiply and Assign**

This operator assigns the product of the left and right operands to the left operand.

**/= Divide and Assign **

This operator divides the left operand by the right operand and assigns the quotient to the left operand.

**%= Modulo and Assign**

This operator divides the left operand by the right operand and assigns the remainder to the left operand.

**&= Bitwise AND and Assign**

This operator assigns to the left operand the bitwise AND of the left and right operands.

**|= Bitwise Inclusive OR and Assign**

This operator assigns to the left operand the bitwise inclusive OR of the left and right operands.

**^= Bitwise Exclusive OR and Assign**

This operator assigns to the left operand the bitwise exclusive OR of the left and right operands.

**<<= Left Shift and Assign**

This operator shifts the left operand left the number of bits indicated by the right operand. The result is assigned to the left operand.

**>>= Right Shift and Assign**

This operator shifts the left operand right the number of bits indicated by the right operand. The result is assigned to the left operand.