HP OpenVMS Systems Documentation

HP Fortran for OpenVMS
User Manual


5.8.5 Controlling the Inlining of Procedures

To specify the types of procedures to be inlined, use the /OPTIMIZE=INLINE=keyword qualifier. Also, compile multiple source files together and specify an adequate optimization level, such as /OPTIMIZE=LEVEL=4.

If you omit /OPTIMIZE=INLINE=keyword, the optimization level specified by /OPTIMIZE=LEVEL=n determines the types of procedures that are inlined.

The /OPTIMIZE=INLINE=keyword keywords are as follows:

  • NONE (same as /OPTIMIZE=NOINLINE) inlines statement functions but not other procedures. This type of inlining occurs if you specify /OPTIMIZE=LEVEL=0, LEVEL=1, LEVEL=2, or LEVEL=3 and omit INLINE=keyword.
  • MANUAL (same as NONE) inlines statement functions but not other procedures. This type of inlining occurs if you specify /OPTIMIZE=LEVEL=2 or LEVEL=3 and omit INLINE=keyword.
  • In addition to inlining statement functions, SIZE inlines any procedures that the HP Fortran optimizer expects will improve run-time performance with no likely significant increase in program size.
  • In addition to inlining statement functions, SPEED inlines any procedures that the HP Fortran optimizer expects will improve run-time performance with a likely significant increase in program size. This type of inlining occurs if you specify /OPTIMIZE=LEVEL=4 or LEVEL=5 and omit /OPTIMIZE=INLINE=keyword.
  • ALL inlines every call that can possibly be inlined while generating correct code, including the following:
    • Statement functions (always inlined).
    • Any procedures that HP Fortran expects will improve run-time performance with a likely significant increase in program size.
    • Any other procedures that can possibly be inlined and generate correct code. Certain recursive routines are not inlined to prevent infinite expansion.

For information on the inlining of other procedures (inlined at optimization level /OPTIMIZE=LEVEL=4 or higher), see Section 5.7.5.2.

Maximizing the types of procedures that are inlined usually improves run-time performance, but compile-time memory usage and the size of the executable program may increase.

To determine whether using /OPTIMIZE=INLINE=ALL benefits your particular program, time program execution for the same program compiled with and without /OPTIMIZE=INLINE=ALL.
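The timing comparison described above can be sketched with a small test program. This is an illustrative harness, not part of the manual: the program name, the file name INLINE_TIMING.F90, the procedure POLY, and the exact command lines in the comments are assumptions. It times a loop that calls a small procedure, which is a typical inlining candidate, using the standard CPU_TIME intrinsic.

```fortran
! Hypothetical timing harness (names and file name are assumptions).
! Assumed build commands for the two runs being compared:
!   $ FORTRAN /OPTIMIZE=(LEVEL=4,INLINE=ALL) INLINE_TIMING.F90
!   $ FORTRAN /OPTIMIZE=LEVEL=4 INLINE_TIMING.F90
program inline_timing
  implicit none
  integer, parameter :: n = 5000000
  integer :: i
  real :: t_start, t_end
  double precision :: total

  total = 0.0d0
  call cpu_time(t_start)
  do i = 1, n
     ! The call below is what the optimizer may or may not inline.
     total = total + poly(dble(i) * 1.0d-6)
  end do
  call cpu_time(t_end)

  ! Printing the result keeps the loop from being optimized away.
  print *, 'Result      :', total
  print *, 'CPU seconds :', t_end - t_start

contains

  ! A small procedure called from a hot loop.
  function poly(x) result(y)
    double precision, intent(in) :: x
    double precision :: y
    y = ((2.0d0 * x + 3.0d0) * x - 1.0d0) * x + 0.5d0
  end function poly

end program inline_timing
```

Run each build several times and compare the reported CPU seconds; a single run on a busy system can be misleading.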

5.8.6 Requesting Optimized Code for a Specific Processor Generation (Alpha only)

You can specify the processor generation for which optimized code is tuned by using the /OPTIMIZE=TUNE=keyword qualifier (Alpha only). Regardless of the specified keyword, the generated code will run correctly on all implementations of the Alpha architecture. Tuning for a specific implementation can improve run-time performance, but code tuned for one target may run slower on another.

Specifying the /OPTIMIZE=TUNE=keyword keyword that matches the target processor generation usually improves run-time performance slightly. Unless you request software pipelining, the run-time performance difference from using the wrong keyword (such as using /OPTIMIZE=TUNE=EV4 for an EV5 processor) is usually less than 5%. When you use software pipelining (/OPTIMIZE=LEVEL=5) with /OPTIMIZE=TUNE=keyword, the difference can be more than 5%.

The combination of the /OPTIMIZE=TUNE=keyword keyword and the processor generation on which the program runs does not affect the correctness of program results.

The /OPTIMIZE=TUNE=keyword keywords are as follows:

  • GENERIC generates and schedules code that will execute well for all types of Alpha processor generations. This provides generally efficient code for those applications that will be run on systems using all types of processor generations (an alternative to providing multiple versions of the application compiled for each processor generation type).
  • HOST generates and schedules code optimized for the type of processor generation in use on the system being used for compilation.
  • EV4 generates and schedules code optimized for the EV4 (21064) processor generation.
  • EV5 generates and schedules code optimized for the EV5 (21164) processor generation. This processor generation is faster than EV4.
  • EV56 generates and schedules code optimized for some 21164 Alpha architecture implementations that use the BWX (Byte/Word manipulation) instruction extensions of the Alpha architecture.
  • PCA56 generates and schedules code optimized for the 21164PC Alpha architecture implementation, which uses the BWX (Byte/Word manipulation) and MAX (Multimedia) instruction extensions.
  • EV6 generates and schedules code for the 21264 chip implementation that uses the following extensions to the base Alpha instruction set: BWX (Byte/Word manipulation) and MAX (Multimedia) instructions, square root and floating-point convert instructions, and count instructions.
  • EV67 generates and schedules code optimized for the EV67 processor generation. This processor generation is faster than EV4, EV5, EV56, PCA56, and EV6.

If you omit /OPTIMIZE=TUNE=keyword, HOST is used if /FAST is specified; otherwise, GENERIC is used.

5.8.7 Requesting Generated Code for a Specific Processor Generation (Alpha only)

You can specify the types of instructions that will be generated for the program unit being compiled by using the /ARCHITECTURE qualifier. Unlike the /OPTIMIZE=TUNE=keyword (Alpha only) option that helps with proper instruction scheduling, the /ARCHITECTURE qualifier specifies the type of Alpha chip instructions that can be used.

Programs compiled with the /ARCHITECTURE=GENERIC option (default) run on all Alpha processors without instruction emulation overhead.

For example, if you specify /ARCHITECTURE=EV6, the code generated will run very fast on EV6 systems, but may run slower on older Alpha processor generations. Because instructions used for the EV6 chip may be present in the program's generated code, code generated for an EV6 system may slow program execution on older Alpha processors when EV6 instructions are emulated by the OpenVMS Alpha Version 7.1 (or later) instruction emulator.

This instruction emulator allows new instructions, not implemented on the host processor chip, to execute and produce correct results. Applications using emulated instructions will run correctly, but may incur significant software emulation overhead at run time.

The keywords used by /ARCHITECTURE=keyword are the same as those used by /OPTIMIZE=TUNE=keyword. If you omit /ARCHITECTURE=keyword, HOST is used if /FAST is specified; otherwise, GENERIC is used. For more information on the /ARCHITECTURE qualifier, see Section 2.3.6.

5.8.8 Arithmetic Reordering Optimizations

If you use the /ASSUME=NOACCURACY_SENSITIVE qualifier, HP Fortran may reorder code (based on algebraic identities) to improve performance. For example, the following expressions are mathematically equivalent but may not compute the same value using finite precision arithmetic:


X = (A + B) + C

X = A + (B + C)

The results can be slightly different from the default (ACCURACY_SENSITIVE) because of the way intermediate results are rounded. However, the NOACCURACY_SENSITIVE results are not categorically less accurate than those gained by the default. In fact, dot product summations using NOACCURACY_SENSITIVE can produce more accurate results than those using ACCURACY_SENSITIVE.
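The rounding effect can be seen in a short program. This is an illustrative sketch with values chosen for the purpose (they are not from the manual): A and B are large and cancel, while C is small relative to them.

```fortran
! Illustrative only: the values are chosen so that rounding is visible
! in single precision.
program reassociation_demo
  implicit none
  real :: a, b, c, x1, x2

  a =  1.0e8
  b = -1.0e8
  c =  1.0

  x1 = (a + b) + c     ! A and B cancel first, so C survives
  x2 = a + (b + c)     ! C can be rounded away when added to the much larger B

  print *, '(A + B) + C =', x1
  print *, 'A + (B + C) =', x2
end program reassociation_demo
```

With a 24-bit single-precision significand, the first form typically yields 1.0 and the second 0.0. Under the default (ACCURACY_SENSITIVE), the compiler evaluates each expression as written; with NOACCURACY_SENSITIVE it may choose either ordering.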

The effect of /ASSUME=NOACCURACY_SENSITIVE is important when HP Fortran hoists divide operations out of a loop. If NOACCURACY_SENSITIVE is in effect, the unoptimized loop becomes the optimized loop:

Unoptimized Code          Optimized Code

                          T = 1/V
DO I=1,N                  DO I=1,N
   .                         .
   .                         .
   .                         .
   B(I) = A(I)/V             B(I) = A(I)*T
END DO                    END DO

The transformation in the optimized loop increases performance significantly, and loses little or no accuracy. However, it does have the potential for raising overflow or underflow arithmetic exceptions.
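To gauge how much the hoisted form differs from the original on your own data, you can compute both forms explicitly and compare them. The following is a minimal sketch; the array size, the divisor, and all names are assumptions made for illustration.

```fortran
! Illustrative comparison of A(I)/V with A(I)*T, where T = 1/V.
program divide_hoist_demo
  implicit none
  integer, parameter :: n = 1000
  integer :: i, ndiff
  real :: a(n), b_div(n), b_mul(n), v, t

  v = 3.0
  do i = 1, n
     a(i) = real(i)
  end do

  ! Original form: one divide per iteration.
  do i = 1, n
     b_div(i) = a(i) / v
  end do

  ! Hoisted form: one divide before the loop, one multiply per iteration.
  t = 1.0 / v
  do i = 1, n
     b_mul(i) = a(i) * t
  end do

  ! Count the elements whose two results are not bit-for-bit equal.
  ndiff = 0
  do i = 1, n
     if (b_div(i) /= b_mul(i)) ndiff = ndiff + 1
  end do
  print *, 'Elements that differ:', ndiff, ' of', n
end program divide_hoist_demo
```

Compile this check with the default (ACCURACY_SENSITIVE) setting. Otherwise the compiler may apply the same transformation to the first loop and the two results will trivially agree. Also note that if V is very small, computing 1/V can overflow even when every A(I)/V is representable.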

5.8.9 Dummy Aliasing Assumption

Some programs compiled with HP Fortran (or Compaq Fortran 77) might have results that differ from the results of other Fortran compilers. Such programs might be aliasing dummy arguments to each other or to a variable in a common block or shared through use association, and at least one variable access is a store. Alternatively, they may be calling a user-defined procedure with actual arguments that do not match the procedure's dummy arguments in order, number, or type.

This program behavior is prohibited in programs conforming to the Fortran 90 and Fortran 95 standards, but not by HP Fortran. Other versions of Fortran allow dummy aliases and check for them to ensure correct results. However, HP Fortran assumes that no dummy aliasing will occur, and it can ignore potential data dependencies from this source in favor of faster execution.

The HP Fortran default is safe for programs conforming to the Fortran 90 and Fortran 95 standards. It will improve performance of these programs, because the standard prohibits such programs from passing overlapped variables or arrays as actual arguments if either is assigned in the execution of the program unit.

The /ASSUME=DUMMY_ALIASES qualifier allows dummy aliasing. It ensures correct results by assuming the exact order of the references to dummy and common variables is required. Program units taking advantage of this behavior can produce inaccurate results if compiled with /ASSUME=NODUMMY_ALIASES.

Example 5-3 is taken from the DAXPY routine in the Fortran-77 version of the Basic Linear Algebra Subroutines (BLAS).

Example 5-3 Using the /ASSUME=DUMMY_ALIASES Qualifier

      SUBROUTINE DAXPY(N,DA,DX,INCX,DY,INCY)

!     Constant times a vector plus a vector.
!     uses unrolled loops for increments equal to 1.

      DOUBLE PRECISION DX(1), DY(1), DA
      INTEGER I,INCX,INCY,IX,IY,M,MP1,N
!
      IF (N.LE.0) RETURN
      IF (DA.EQ.0.0) RETURN
      IF (INCX.EQ.1.AND.INCY.EQ.1) GOTO 20

!     Code for unequal increments or equal increments
!     not equal to 1.
      .
      .
      .
      RETURN
!     Code for both increments equal to 1.
!     Clean-up loop

 20   M = MOD(N,4)
      IF (M.EQ.0) GOTO 40
      DO I=1,M
          DY(I) = DY(I) + DA*DX(I)
      END DO
      IF (N.LT.4) RETURN
 40   MP1 = M + 1
      DO I = MP1, N, 4
          DY(I) = DY(I) + DA*DX(I)
          DY(I + 1) = DY(I + 1) + DA*DX(I + 1)
          DY(I + 2) = DY(I + 2) + DA*DX(I + 2)
          DY(I + 3) = DY(I + 3) + DA*DX(I + 3)
      END DO
      RETURN
      END SUBROUTINE

The second DO loop contains assignments to DY. If DY is overlapped with DA, any of the assignments to DY might give DA a new value, and this overlap would affect the results. If this overlap is desired, then DA must be fetched from memory each time it is referenced. The repetitious fetching of DA degrades performance.

Linking Routines with Opposite Settings

You can link routines compiled with the /ASSUME=DUMMY_ALIASES qualifier to routines compiled with /ASSUME=NODUMMY_ALIASES. For example, if only one routine is called with dummy aliases, you can use /ASSUME=DUMMY_ALIASES when compiling that routine, and compile all the other routines with /ASSUME=NODUMMY_ALIASES to gain the performance value of that qualifier.

Programs calling DAXPY with DA overlapping DY do not conform to the FORTRAN-77, Fortran 90, and Fortran 95 standards. However, they are supported if /ASSUME=DUMMY_ALIASES was used to compile the DAXPY routine.
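The kind of call discussed above can be reduced to a few lines. The following sketch is deliberately nonconforming and is shown only to illustrate the hazard; AXPY3 is a hypothetical three-element stand-in for DAXPY, not the BLAS routine itself.

```fortran
! Deliberately nonconforming: the actual argument for DA is Y(1),
! which the routine also stores to through DY.
program alias_demo
  implicit none
  double precision :: x(3), y(3)

  x = (/ 1.0d0, 1.0d0, 1.0d0 /)
  y = (/ 2.0d0, 0.0d0, 0.0d0 /)

  call axpy3(y(1), x, y)      ! DA overlaps DY(1)
  print *, y
end program alias_demo

! Hypothetical stand-in for DAXPY with a fixed length of 3.
subroutine axpy3(da, dx, dy)
  implicit none
  double precision :: da, dx(3), dy(3)
  integer :: i

  do i = 1, 3
     dy(i) = dy(i) + da*dx(i)   ! the store to DY(1) also changes DA
  end do
end subroutine axpy3
```

If DA is fetched from memory on every reference, the first assignment changes DA from 2 to 4 and the program would print 4, 4, 4. If the compiler keeps DA in a register, as it may when no dummy aliasing is assumed, it would print 4, 2, 2. Which result you get depends on the compiler and the optimizations applied.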

5.9 Compiler Directives Related to Performance

Certain compiler source directives (cDEC$ prefix) can be used in place of some performance-related compiler options and provide more control of certain optimizations, as discussed in the following sections:

5.9.1, Using the cDEC$ OPTIONS Directive
5.9.2, Using the cDEC$ UNROLL Directive to Control Loop Unrolling
5.9.3, Using the cDEC$ IVDEP Directive to Control Certain Loop Optimizations

5.9.1 Using the cDEC$ OPTIONS Directive

The cDEC$ OPTIONS directive allows source code control of the alignment of fields in record structures and data items in common blocks. The fields and data items can be naturally aligned (for performance reasons) or they can be packed together on arbitrary byte boundaries.

Using this directive is an alternative to the compiler option /[NO]ALIGNMENT, which affects the alignment of all fields in record structures and data items in common blocks in the current program unit.

For more information:

See the description of the OPTIONS directive in the HP Fortran for OpenVMS Language Reference Manual.

5.9.2 Using the cDEC$ UNROLL Directive to Control Loop Unrolling

The cDEC$ UNROLL directive allows you to specify the number of times certain counted DO loops will be unrolled. Place the cDEC$ UNROLL directive immediately before the DO loop whose unrolling you want to control.

Using this directive for a specific loop overrides the value specified by the compiler option /OPTIMIZE=UNROLL=n for that loop. The value specified by /OPTIMIZE=UNROLL=n determines how many times all loops not controlled by their own cDEC$ UNROLL directives are unrolled.
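A minimal sketch of the directive's placement follows. The count form `(4)` and the free-form `!DEC$` prefix are assumptions to be checked against the Language Reference Manual; the program and array names are hypothetical. Compilers that do not recognize the directive treat the line as a comment.

```fortran
! Illustrative placement of an unroll directive before a counted DO loop.
program unroll_demo
  implicit none
  integer, parameter :: n = 1024
  integer :: i
  real :: a(n), b(n)

  b = 1.0

  ! Request that the following loop be unrolled four times (assumed syntax).
!DEC$ UNROLL (4)
  do i = 1, n
     a(i) = 2.0 * b(i)
  end do

  print *, sum(a)
end program unroll_demo
```

The directive changes only how the loop is compiled, not what it computes, so the program's result is the same with or without it.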

For more information:

See the description of the UNROLL directive in the HP Fortran for OpenVMS Language Reference Manual.

5.9.3 Using the cDEC$ IVDEP Directive to Control Certain Loop Optimizations

The cDEC$ IVDEP directive allows you to help control certain optimizations related to dependence analysis in a DO loop. Place the cDEC$ IVDEP directive immediately before the DO loop whose optimizations you want to influence. Not all DO loops should use this directive.

The cDEC$ IVDEP directive tells the optimizer to begin dependence analysis by assuming all dependences occur in the same forward direction as their appearance in the normal scalar execution order. This contrasts with normal compiler behavior, which is for the dependence analysis to make no initial assumptions about the direction of a dependence.
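A sketch of a loop where such a hint might apply is shown here. It is an illustration under stated assumptions, not an example from the manual: the names are hypothetical, and the free-form `!DEC$` prefix should be checked against the Language Reference Manual. The offset K is a dummy argument, so the compiler cannot determine at compile time whether V(J + K) lies ahead of or behind V(J).

```fortran
! Illustrative use of IVDEP on a loop whose dependence direction
! depends on a run-time value.
program ivdep_demo
  implicit none
  integer, parameter :: n = 8
  integer :: i
  real :: a(n + 2)

  do i = 1, n + 2
     a(i) = real(i)
  end do

  call shift_add(a, n, 2)     ! the caller guarantees K >= 0
  print *, a(1:n)

contains

  subroutine shift_add(v, m, k)
    integer, intent(in) :: m, k
    real, intent(inout) :: v(m + k)
    integer :: j

    ! Each iteration loads V(J + K) and then stores V(J). With K >= 0,
    ! the load of an element always precedes any later store to it.
!DEC$ IVDEP
    do j = 1, m
       v(j) = v(j + k) + 1.0
    end do
  end subroutine shift_add

end program ivdep_demo
```

The directive is a promise by the programmer. If SHIFT_ADD were ever called with a negative K, the assumption would not hold and the optimized loop could produce different results from the scalar loop.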

For more information:

See the description of the IVDEP directive in the HP Fortran for OpenVMS Language Reference Manual.


Chapter 6
HP Fortran Input/Output

6.1 Overview

This chapter describes input/output (I/O) as implemented for HP Fortran. It also provides information about HP Fortran I/O in relation to the OpenVMS Record Management Services (RMS) and Run-Time Library (RTL).

HP Fortran assumes that all unformatted data files will be in the same native little endian numeric formats used in memory. If you need to read or write unformatted numeric data (on disk) that has a different numeric format than that used in memory, see Chapter 9.

You can use HP Fortran I/O statements to communicate between processes on either the same computer or different computers.

For More Information:

  • On specifying the native floating-point format used in memory, see Section 2.3.22.
  • On supported data types, see Chapter 8.
  • On using various HP and non-HP data formats for unformatted files, see Chapter 9.
  • On interprocess communication, see Chapter 13.
  • On porting Compaq Fortran 77 for OpenVMS VAX data, see Appendix B.
  • On performing I/O to the same unit with HP Fortran and Compaq Fortran 77 object files, see Appendix B.
  • On using indexed files, see Chapter 12.

6.2 Logical I/O Units

In HP Fortran, a logical unit is a channel through which data transfer occurs between the program and a device or file. You identify each logical unit with a logical unit number, which can be any nonnegative integer from 0 to a maximum value of 2,147,483,647 (2**31-1).

For example, the following READ statement uses logical unit number 2:


READ (2,100) I,X,Y

This READ statement specifies that data is to be entered from the device or file corresponding to logical unit 2, in the format specified by the FORMAT statement labeled 100.
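For context, the READ statement can be placed in a complete program. The following is a minimal sketch: the scratch file, the data values, and the edit descriptors chosen for FORMAT statement 100 are assumptions made so that the example is self-contained.

```fortran
! Illustrative round trip through logical unit 2.
program unit2_demo
  implicit none
  integer :: i
  real :: x, y

  ! Connect logical unit 2 to a temporary file.
  open (unit=2, status='SCRATCH', form='FORMATTED')

  ! Write one record, then reposition to the start of the file.
  write (2,100) 7, 1.5, 2.5
  rewind (2)

  ! Read the record back using the same FORMAT statement.
  read (2,100) i, x, y
  close (2)

  print *, i, x, y

100 format (i5, 2f10.3)
end program unit2_demo
```

The same unit number appears in the OPEN, WRITE, REWIND, READ, and CLOSE statements; that number is what ties them to one file.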

When opening a file, use the UNIT specifier to indicate the unit number. You can use the LIB$GET_LUN library routine to return a logical unit number not currently in use by your program. If you intend to use LIB$GET_LUN, avoid using logical unit numbers (UNIT) 100 to 119 (reserved for LIB$GET_LUN).

HP Fortran programs are inherently device-independent. The association between the logical unit number and the physical file can occur at run time. Instead of changing the logical unit numbers specified in the source program, you can change this association at run time to match the needs of the program and the available resources. For example, before running the program, a command procedure can set the appropriate logical name or allow the terminal user to type a directory, file name, or both.

Use the same logical unit number specified in the OPEN statement for other I/O statements to be applied to the opened file, such as READ and WRITE.

The OPEN statement connects a unit number with an external file and allows you to explicitly specify file attributes and run-time options using OPEN statement specifiers (all files except internal files are called external files).

ACCEPT, TYPE, and PRINT statements do not refer explicitly to a logical unit (a file or device) from which or to which data is to be transferred; they refer implicitly to a default preconnected logical unit. The ACCEPT statement is normally preconnected to the default input device, and the TYPE and PRINT statements are normally preconnected to the default output device. These defaults can be overridden with appropriate logical name assignments (see Section 6.6.1.2).

READ, WRITE, and REWRITE statements refer explicitly to a specified logical unit from which or to which data is to be transferred. However, to use a preconnected device for READ (SYS$INPUT) and WRITE (SYS$OUTPUT), specify the unit number as an asterisk (*).

Certain unit numbers are preconnected to OpenVMS standard devices. Unit number 5 is associated with SYS$INPUT and unit 6 with SYS$OUTPUT. At run time, if units 5 and 6 are specified by a record I/O statement (such as READ or WRITE) without having been explicitly opened by an OPEN statement, HP Fortran implicitly opens units 5 and 6 and associates them with their respective operating system standard I/O files.
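The three ways of reaching the default output device described above can be compared side by side. This is a minimal sketch with a hypothetical program name; none of the statements is preceded by an OPEN statement.

```fortran
! Illustrative use of the preconnected default output unit.
program default_units_demo
  implicit none

  ! PRINT names no unit; it uses the default output device.
  print *, 'PRINT statement'

  ! An asterisk as the unit selects the preconnected default output device.
  write (*,*) 'WRITE with unit *'

  ! Unit 6 has not been opened explicitly; it is opened implicitly.
  write (6,*) 'WRITE with unit 6'
end program default_units_demo
```

On OpenVMS, with no overriding logical name assignments, all three lines are expected to go to SYS$OUTPUT.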

For More Information:

On the OPEN statement and preconnected files, see Section 6.6.

6.3 Types of I/O Statements

Table 6-1 lists the HP Fortran I/O statements.

Table 6-1 Summary of I/O Statements

Category and Statement Name    Description

File Connection
  OPEN                         Connects a unit number with an external file and specifies file connection characteristics.
  CLOSE                        Disconnects a unit number from an external file.
File Inquiry
  INQUIRE                      Returns information about a named file, a connection to a unit, or the length of an output item list.
Record Position
  BACKSPACE                    Moves the record position to the beginning of the previous record (sequential access only).
  ENDFILE                      Writes an end-of-file marker after the current record (sequential access only).
  REWIND                       Sets the record position to the beginning of the file (sequential access only).
Record Input
  READ                         Transfers data from an external file record or an internal file to internal storage.
Record Output
  WRITE                        Transfers data from internal storage to an external file record or to an internal file.
  PRINT                        Transfers data from internal storage to SYS$OUTPUT (standard output device). Unlike WRITE, PRINT only provides formatted sequential output and does not specify a unit number.
HP Fortran Extensions
  ACCEPT                       Reads input from SYS$INPUT. Unlike READ, ACCEPT only provides formatted sequential input and does not specify a unit number.
  DELETE                       Marks a record at the current record position in a relative file as deleted (direct access only).
  REWRITE                      Transfers data from internal storage to an external file record at the current record position. Certain restrictions apply.
  UNLOCK                       Releases a lock held on the current record when file sharing was requested when the file was opened (see Section 6.9.2).
  TYPE                         Writes record output to SYS$OUTPUT (same as PRINT).
  DEFINE FILE                  Specifies file characteristics for a direct access relative file and connects the unit number to the file (like an OPEN statement). Provided for compatibility with compilers older than FORTRAN-77.
  FIND                         Changes the record position in a direct access file. Provided for compatibility with compilers older than FORTRAN-77.

In addition to the READ, WRITE, REWRITE, TYPE, and PRINT statements, other I/O record-related statements are limited to a specific file organization. For instance:

  • The DELETE statement only applies to relative and indexed files.
  • The BACKSPACE and REWIND statements only apply to sequential files open for sequential access.
  • The ENDFILE statement only applies to certain types of sequential files open for sequential access.

The file-related statements (OPEN, INQUIRE, and CLOSE) apply to any relative or sequential file.
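The record-positioning statements can be tried on a small sequential file. The following sketch uses a scratch file on unit 10 with three one-field records; the unit number, record layout, and names are assumptions made for illustration.

```fortran
! Illustrative use of REWIND and BACKSPACE on a sequential file.
program position_demo
  implicit none
  integer :: k, val

  open (unit=10, status='SCRATCH', form='FORMATTED', access='SEQUENTIAL')

  ! Write three records containing 10, 20, and 30.
  do k = 1, 3
     write (10,'(I4)') k * 10
  end do

  ! REWIND positions the file at its first record.
  rewind (10)
  read (10,'(I4)') val
  print *, 'After REWIND   :', val

  ! Read the second record, step back over it, and read it again.
  read (10,'(I4)') val
  backspace (10)
  read (10,'(I4)') val
  print *, 'After BACKSPACE:', val

  close (10)
end program position_demo
```

The first value printed is the first record (10), and the second is the second record (20), read again after BACKSPACE moved the position back by one record.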
