HP OpenVMS Systems

C Programming Language
HP C

Language Reference Manual

Numeric Escape Sequences

The compiler treats every character as an integer value, so any character in the source code can be represented by its numeric equivalent. This is called a numeric escape sequence. The character is written as a backslash (\) followed by the character's octal or hexadecimal integer equivalent from the current character set (see Appendix C for the ASCII equivalence tables). For example, using the ASCII character set, the character A can be represented as \101 (the octal equivalent) or \x41 (the hexadecimal equivalent). A leading 0 is not necessary in the octal example because octal is the default in numeric escape sequences. A lowercase x following the backslash indicates a hexadecimal representation. For example, \x5A is equivalent to the character Z.

An example of numeric escape sequences follows:

#define NUL '\0'   /*  Defines logical null character   */ 
char x[] = {'\110','\145','\154','\154','\157','\41','\0'}; 
                    /*  Initializes x with "Hello!"  */ 

An octal escape sequence extends for at most three octal digits, or up to the first character that is not an octal digit, whichever comes first. Therefore, the string "\089" is interpreted as four characters: \0, 8, 9, and \0.
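
As a brief illustration of these rules, the following sketch checks both spellings of A and the size of "\089". The function names are hypothetical, and the comparisons with 'A' assume an ASCII-compatible execution character set.

```c
#include <stddef.h>

/* Returns 1 if the octal and hexadecimal escapes both denote 'A'.
   Assumption: the execution character set is ASCII-compatible. */
int escapes_denote_A(void)
{
    return '\101' == 'A' && '\x41' == 'A';
}

/* "\089" stores \0, '8', '9', and the terminating \0. */
size_t octal_example_size(void)
{
    static const char s[] = "\089";
    return sizeof s;
}
```

On an ASCII system, escapes_denote_A() returns 1 and octal_example_size() returns 4.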

With hexadecimal escape sequences, there is no limit to the number of characters in the escape sequence, but the result is not defined if the hexadecimal value exceeds the largest value representable by the unsigned char type for a normal character constant, or the largest value representable by the wchar_t type for a wide-character constant. For example, '\x777' is illegal.

In addition, hexadecimal escape sequences with more than three characters provoke a warning if the error-checking compiler option is used.

String concatenation can be used to specify a hexadecimal digit following a hexadecimal escape sequence. In the following example, a is initialized to the same value in both cases:

char a[] = "\xff" "f"; 
char a[] = {'\xff', 'f', '\0'}; 
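
The equivalence can be checked with a small sketch such as the following. The array and function names are hypothetical; the two arrays stand in for the two declarations of a shown above.

```c
#include <stddef.h>

/* The escape sequence ends at the boundary of the first literal. */
static const char joined[] = "\xff" "f";
static const char spelled[] = {'\xff', 'f', '\0'};

/* Returns 1 if both arrays have the same size and the same contents. */
int arrays_match(void)
{
    size_t i;

    if (sizeof joined != sizeof spelled)
        return 0;
    for (i = 0; i < sizeof joined; i++)
        if (joined[i] != spelled[i])
            return 0;
    return 1;
}

/* Size of the concatenated literal, including the terminating \0. */
size_t joined_size(void)
{
    return sizeof joined;
}
```

Written without the concatenation, "\xfff" would instead be read as a single escape sequence with three hexadecimal digits.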

Using numeric escape sequences can result in a nonportable program if the executing machine uses a different character set. Another threat to portability exists if arithmetic operations are performed on the integer character values, because multiple-character constants (such as 'ABC') can be represented differently on different machines.

1.9.4 Enumeration Constants

An enumerated type specifies one or more enumeration constants to define allowable values for the enumerated type. Enumeration constants have the type signed int, except in the compiler's RELAXED mode, in which other types are allowed. See Section 3.6 for details on the declaration and use of enumerated types.
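
A minimal sketch of this rule follows, assuming a standard-conforming compilation mode rather than RELAXED mode. The enumerated type color and the function names are hypothetical.

```c
enum color { RED, GREEN = 5, BLUE };   /* hypothetical enumerated type */

/* Returns 1 if an enumeration constant occupies the same storage as an int. */
int constant_has_int_size(void)
{
    return sizeof RED == sizeof(int);
}

/* BLUE follows GREEN in the list, so its value is GREEN + 1. */
int blue_value(void)
{
    return BLUE;
}
```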

1.10 Header Files

Header files are text files included in a source file during compilation. To include a header file in a compilation, the #include preprocessor directive must be used in the source file. See Chapter 8 for more information on this directive. The entire header file, regardless of content, is substituted for the #include preprocessor directive.

A header file can contain other #include preprocessor directives to include another file. You can nest #include directives to any depth.

Header files can include any legal C source code. They are most often used to include external variable declarations, macro definitions, type definitions, and function declarations. Groups of logically related functions are commonly declared together in a header file, such as the C library input and output functions listed in the stdio.h header file. Header files traditionally have a .h suffix (stdio.h, for example).

The names of header files must not include the ', \, ", or /* characters, because the effect of these punctuation characters in a header name is undefined.

When referenced in a program, header names are surrounded by angle brackets or double quotation marks, as shown in the following example:

#include <math.h>   /* or  */ 
#include "local.h" 

Chapter 8 explains the difference between the two formats. The algorithm the compiler uses for finding the named files is discussed in Section B.37. Chapter 9 describes the library routines in each of the ANSI standard header files.

1.11 Limits

The ANSI C standard suggests several environmental limits on the use of the C language. These limits are an effort to define minimal standards for a conforming implementation of a C compiler. For example, the number of significant characters in an identifier is implementation-defined, with a minimum set required by the ANSI C standard.

The standard also includes several numerical limits that restrict the characteristics of integral and floating-point types. For the most part, these limits will not affect your use of the C language or compiler. However, for unusually large or unusually constructed programs, certain limits can be reached. The ANSI standard contains a list of minimum limits, and your platform-specific HP C documentation contains the actual limits used in HP C.

1.11.1 Translation Limits

As intended by the ANSI C standard, the HP C implementation avoids imposing many of the translation limits, allowing applications more flexibility. The HP C limits are:

  • A maximum of 32,767 significant characters in an internal identifier or macro name (a warning message is issued if this limit is exceeded)
  • A maximum of 1023 significant characters in an external identifier on Tru64 UNIX systems
  • A maximum of 31 significant characters in an external identifier for OpenVMS VAX platforms (a warning message is issued if this limit is exceeded and the identifier is truncated)
  • A maximum of 253 function arguments/formal parameters on OpenVMS systems; a maximum of 1023 function arguments/formal parameters on Tru64 UNIX systems
  • A maximum of 1012 bytes in any one function argument, and a maximum of 1012 bytes in a function argument list on OpenVMS systems
  • A maximum of 32,767 characters in a logical source line
  • A maximum of 32,767 characters in a physical source line
  • A maximum of 32,767 bytes in the representation of a string literal (this limit does not apply to string literals formed as a result of concatenation)

1.11.2 Numerical Limits

Numerical limits define the sizes and characteristics of integral and floating-point types. Numerical limits are described in the limits.h and float.h header files. The limits are:

  • Each character of type char is represented in 8 bits.
  • Each character of type wchar_t is represented in 32 bits.
  • The machine representation and set of possible values for the char type is the same as for the signed char type. A compiler command-line option changes this equivalence to unsigned char.
  • On OpenVMS systems, the machine representation and set of possible values for the int and signed int types are the same as for the long int type.
  • On OpenVMS systems, the machine representation and set of possible values for the unsigned int type are the same as for the unsigned long int type.
  • On Tru64 UNIX systems, the long int and unsigned long int types are 64 bits, while int and unsigned int are 32 bits.
  • The machine representation and set of possible values for the long double type is the same as for the double type.
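
The characteristics listed above can be queried through the macros in limits.h and float.h. The following sketch uses hypothetical function names and only standard macros; the values it reports depend on the platform and compiler options in use.

```c
#include <limits.h>
#include <float.h>

/* Number of bits in a char; the standard requires at least 8. */
int bits_per_char(void)
{
    return CHAR_BIT;
}

/* Returns 1 if plain char has the same range as signed char. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Returns 1 if int and long int have the same range,
   as described above for OpenVMS systems. */
int int_matches_long(void)
{
    return INT_MAX == LONG_MAX && INT_MIN == LONG_MIN;
}

/* Returns 1 if long double offers the same decimal precision as double. */
int long_double_matches_double(void)
{
    return LDBL_DIG == DBL_DIG;
}
```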

1.11.3 Character Display

Characters from the executable character set are output to the active position on the screen or in a file. The active position is defined by the ANSI C standard as the spot where the next output character will appear. After a character is output, the active position advances to the next position on the current line (to the left or right).

The HP C compiler moves the active position from left to right across an output line.

Chapter 2
Basic Concepts

The C language was initially designed as a small, portable programming language used to implement an operating system. In its history, C has evolved into a powerful tool for writing all types of programs, and includes mechanisms to achieve most programming goals. C offers:

  • A standard set of lexical elements
  • A wide variety of types for data objects, including:
    • Integer and floating-point constants and variables
    • Pointers to data locations in memory and the ability to do pointer arithmetic
    • Arrays of identically typed data
    • Structures and unions with members of different data types
  • The ability to group independent code blocks into named functions
  • A large set of operators used to form expressions, including bit-wise operators
  • A simple method of declaring data objects and functions
  • Several preprocessor directives to expand the functionality of the language
  • Numerous library functions to handle many common programming tasks
  • A high degree of portability

To help you take full advantage of C's features, the following sections provide a guide to the basic concepts of the language.

These sections represent an expanded glossary of selected C terms and basic concepts. Understanding these concepts will provide a good foundation for a working knowledge of C, and will help show the relationship of these concepts to more complex ones in the language.

2.1 Blocks

A block in C is a section of code surrounded by braces { }. Understanding the definition of a block is very important to understanding many other C concepts, such as scope, visibility, and external or internal declarations.

The following example shows two blocks, one defined inside the other:

int main (void) 
{              /*  This brace marks the beginning of the outer block  */ 
   int x = 1;  /*  Initialized so that the test below is well defined */ 
   if (x!=0) 
   {           /*  This brace marks the beginning of the inner block  */ 
      x++;     /*  x = x++ would modify x twice in one expression     */ 
      return x; 
   }           /*  This brace marks the end of the inner block        */ 
   return 0; 
}              /*  This brace marks the end of the outer block        */ 

A block is also a form of compound statement, that is, a set of related C statements enclosed in braces. Declarations of objects used in the program can appear anywhere within a block and affect the object's scope and visibility. Section 2.3 discusses scope; Section 2.4 discusses visibility.

2.2 Compilation Units

A compilation unit is C source code that is compiled and treated as one logical unit. The compilation unit is usually one or more entire files, but can also be a selected portion of a file if, for example, the #ifdef preprocessor directive is used to select specific code sections. Declarations and definitions within a compilation unit determine the scope of functions and data objects.

Files included by using the #include preprocessor directive become part of the compilation unit. Source lines skipped because of the conditional inclusion preprocessor directives are not included in the compilation unit.

Compilation units are important in determining the scope of identifiers, and in determining the linkage of identifiers to other internal and external identifiers. Section 2.3 discusses scope. Section 2.8 discusses linkage.

A compilation unit can refer to data or functions in other compilation units in the following ways:

  • A function in one compilation unit can call a function in a different compilation unit.
  • Data objects can be assigned external linkage so that other compilation units have access to them (see Section 2.8).

Programs composed of more than one compilation unit can be separately compiled, and later linked to produce the executable program. A legal C compilation unit consists of at least one external declaration, as defined in Section 4.3.

A translation unit with no declarations is accepted with a compiler warning in all modes except for the strict ANSI standard mode.

2.3 Scope

The scope of an identifier is the range of the program in which the declared identifier has meaning. An identifier has meaning if it is recognized by the compiler. Scope is determined by the location of the identifier's declaration. Trying to access an identifier outside of its scope results in an error. Every declaration has one of four kinds of scope:

  • File
  • Block
  • Function
  • Function prototype (a declaration including only the function's parameter types)

An enumeration constant's scope begins at the defining enumerator in an enumerator list. The scope of a statement label includes the entire function body. The scope of any other type of identifier begins at the identifier itself in the identifier's declaration. See the following sections for information on when an identifier's scope ends.

2.3.1 File Scope

An identifier whose declaration is located outside any block or function parameter list has file scope. An identifier with file scope is visible from the declaration of the identifier to the end of the compilation unit, unless hidden by an inner block declaration. In the following example, the identifier off has file scope:

int off = 5;     /*  Declares (and defines) the integer 
                        identifier off.                           */ 
int main (void) 
{ 
   int on;       /*  Declares the integer identifier on.          */ 
   on = off + 1; /*  Uses off, declared outside the function 
                     block of main.  This point of the 
                     program is still within the 
                     active scope of off.                         */ 
   if (on<=100) 
   { 
     int off = 0;/*  This declaration of off creates a new object 
                     that hides the former object of the same name.  
                     The scope of the new off lasts through the 
                     end of the if block.                         */ 
     off = off + on; 
     return off; 
   } 
   return 0; 
} 

2.3.2 Block Scope

An identifier appearing within a block or in a parameter list of a function definition has block scope and is visible within the block, unless hidden by an inner block declaration.

Block scope begins at the identifier declaration and ends at the closing brace (}) completing the block. In the following example, the identifier red has block scope and blue has file scope:

int blue = 5;                /*  blue: file scope            */ 
int main (void) 
{ 
    int x = 0 , y = 0;       /*  x and y: block scope        */ 
    int red = 10;            /*  red: block scope            */ 
    x = red + blue; 
    return 0; 
} 
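
To show where block scope ends, the following sketch nests a second block that declares its own red. The function name scope_demo is hypothetical; the identifiers red and blue follow the example above.

```c
int blue = 5;                    /* blue: file scope */

/* Returns 610: the inner block contributes 1 + 5 = 6, and the outer
   red (10) is visible again once the inner block has closed. */
int scope_demo(void)
{
    int red = 10;                /* outer red: block scope of the function body */
    int result;

    {
        int red = 1;             /* inner red hides the outer red ... */
        result = red + blue;     /* ... so this adds 1 + 5 */
    }                            /* ... until this closing brace */

    return result * 100 + red;   /* red refers to the outer object again */
}
```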

2.3.3 Function Scope

Only statement labels have function scope (see Chapter 7). An identifier with function scope is unique throughout the function in which it is declared. Labeled statements are used as targets for goto statements and are implicitly declared by their syntax, which is the label followed by a colon (:) and a statement. For example:

int func1(int x, int y, int z) 
{ 
  label:  x += (y + z);   /*  label has function scope        */ 
  if (x > 1) goto label; 
} 
int func2(int a, int b, int c) 
{ 
  if (a > 1) goto label; /*  illegal jump to undefined label */ 
} 

See Section 7.1 for more information on statement labels.
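
Because a label belongs to the function that contains it, two functions can use the same label name without conflict. The following sketch illustrates this with two hypothetical functions, count_down and count_up, that each declare a label named again.

```c
/* Counts down from n with goto; returns the number of steps taken. */
int count_down(int n)
{
    int steps = 0;
again:                       /* again has function scope in count_down */
    if (n > 0) {
        n--;
        steps++;
        goto again;
    }
    return steps;
}

/* Reusing the label name is legal here, because each function
   has its own set of labels. */
int count_up(int limit)
{
    int i = 0;
again:                       /* a different label from the one above */
    if (i < limit) {
        i++;
        goto again;
    }
    return i;
}
```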

2.3.4 Function Prototype Scope

An identifier that appears within a function prototype's list of parameter declarations has function prototype scope. The scope of such an identifier begins at the identifier's declaration and terminates at the end of the function prototype declaration list. For example:

int students ( int david, int susan, int mary, int john ); 

In this example, the identifiers (david, susan, mary, and john) have scope beginning at their declarations and ending at the closing parenthesis. The type of the function students is "function returning int with four int parameters." In effect, these identifiers are merely placeholders for the actual parameter names to be used after the function is defined.
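
The placeholder role of these names can be seen when the function is later defined with different parameter names. In the following sketch, the body of students (a simple sum) is an assumption made for illustration.

```c
/* Prototype: the parameter names go out of scope at the closing parenthesis. */
int students(int david, int susan, int mary, int john);

/* Definition: the parameter names may differ from the prototype,
   because only the parameter types must agree. */
int students(int a, int b, int c, int d)
{
    return a + b + c + d;
}
```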

2.4 Visibility

An identifier is visible only within a certain region of the program. An identifier has visibility over its entire scope, unless a subsequent declaration of the same identifier in an enclosed block overrides, or hides, the previous declaration. Visibility affects the ability to access a data object or other identifier, because an identifier can be used only where it is visible.

Once an identifier is used for a specific purpose, it cannot be used for another purpose within the same scope, unless the second use of the identifier is in a different name space. Section 2.15 describes the name space restrictions. For example, declaring two different data objects with the same identifier is illegal within the same scope.

When the scope of one of two identical identifiers is contained within the other (nested), the identifier with inner scope remains visible, while the identifier with wider scope becomes hidden for the duration of the inner identifier's scope.

In the following example, the identifier number is used twice: once as an integer variable and once as a floating-point variable. For the duration of the function main, the integer number is hidden by the floating-point number.

#include <math.h> 
int number;        /*  number is declared as an integer variable  */ 
int main (void) 
{ 
 float x; 
 float number = 4.0f; /*  This declaration of number occurs in an inner 
                       block, and "hides" the outer declaration.  
                       The inner declaration creates a new object */ 
 x = sqrt (number);/*  x receives a floating-point value          */ 
 return 0; 
} 

2.5 Side Effects and Sequence Points

The actual order in which expressions are evaluated is not specified for most of the operators in C. Because this sequence of evaluation is determined within the compiler depending on context, some unexpected results may occur when using certain operators. These unexpected results are caused by side effects.

Any operation that affects an operand's storage has a side effect. Side effects can be deliberately induced by the programmer to produce a desired result; in fact, the assignment operator depends on the side effect of altered storage to do its job. C guarantees that all side effects of a given expression will be completed by the next sequence point in the program. Sequence points are checkpoints in the program at which the compiler ensures that operations in an expression are concluded.

The most important sequence point is the semicolon marking the end of a statement. All expressions and their side effects are completely evaluated when the semicolon is reached. Other sequence points are as follows:

  • expr1, expr2 (the comma operator)
  • expr1 && expr2 (the logical AND operator)
  • expr1 || expr2 (the logical OR operator)
  • expr1 ? expr2 : expr3 (the conditional operator)

These operators do guarantee the order, or sequence, of evaluation (expr1, expr2, and expr3 are expressions). For each of these operators, the evaluation of expression expr1 is guaranteed to occur before the evaluation of expression expr2 (or expr3, in the case of the conditional expression).
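
Programs commonly rely on this guarantee. The following sketch uses two hypothetical functions to show the logical AND operator protecting a division, and the comma operator completing a side effect before its right operand is evaluated.

```c
/* && evaluates its left operand first and skips the right operand when
   the left operand is zero, so the division never sees a zero divisor. */
int quotient_exceeds_one(int a, int b)
{
    return b != 0 && a / b > 1;
}

/* The comma operator completes i++ before the right operand reads i. */
int comma_demo(int i)
{
    return (i++, i * 2);
}
```

For example, quotient_exceeds_one(10, 0) returns 0 without dividing, and comma_demo(3) returns 8.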

Relying on the execution order of side effects, when none is guaranteed, is a risky practice because results are inconsistent and not portable. Undesirable side effects usually occur when the same data object is used in two or more places in the same expression, where at least one use produces a side effect. For example, the following code fragment produces inconsistent results because the order of evaluation of operands to the assignment operator is undefined.

int x[4] = { 0, 0, 0, 0 }; 
int i = 1; 
x[i] = i++; 

If the increment of i occurs before the subscript is evaluated, the value of x[2] is 1. If the subscript is evaluated first, the value of x[1] is 1.
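
One way to avoid the problem is to place each side effect in its own statement, so that a sequence point separates them. The following sketch assumes the intended behavior is to store first and increment afterward; the function name is hypothetical.

```c
/* Stores the current index value in x[*i], then increments the index.
   The two side effects are in separate statements, so the result
   does not depend on the compiler's evaluation order. */
void store_then_increment(int x[], int *i)
{
    x[*i] = *i;
    (*i)++;
}
```

Called with the array and the index from the fragment above, this leaves x[1] equal to 1 and i equal to 2.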

A function call also has side effects. In the following example, the order in which f1(y) and f2(z) are called is undefined:

#include <stdio.h> 
int y = 0; 
int z = 0; 
int x = 0; 
int f1(int s) 
{ 
     printf ("Now in f1\n"); 
     y += 7;        /*  Storage of y affected   */ 
     return y; 
} 
int f2(int t) 
{ 
     printf ("Now in f2\n"); 
     z += 3;        /*  Storage of z affected   */ 
     return z; 
} 
int main (void) 
{ 
     x = f1(y) + f2(z);     /*  Undefined calling order   */ 
     return 0; 
} 

The printf functions can be executed in any order even though the value of x will always be 10.
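
If the calling order matters, each call can be placed in its own statement. The following sketch is a variation on the example above; it replaces the printf calls with a hypothetical call log (call_log, record, and call_order are assumptions) so that the order can be inspected.

```c
#include <stddef.h>

static char call_log[8];     /* records the order of the calls */
static size_t log_len = 0;
static int y = 0;
static int z = 0;

/* Appends one character to the log, keeping it null-terminated. */
static void record(char tag)
{
    if (log_len + 1 < sizeof call_log) {
        call_log[log_len++] = tag;
        call_log[log_len] = '\0';
    }
}

static int f1(int s) { (void)s; record('1'); y += 7; return y; }
static int f2(int t) { (void)t; record('2'); z += 3; return z; }

/* Each call is a full expression in its own declaration, so a sequence
   point separates them and f1 always runs before f2. */
int ordered_sum(void)
{
    int first = f1(y);
    int second = f2(z);
    return first + second;
}

/* Returns the recorded order of the calls as a string. */
const char *call_order(void)
{
    return call_log;
}
```

After one call to ordered_sum(), the sum is 10 and call_order() returns "12".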
