
HP OpenVMS Systems Documentation


HP OpenVMS MACRO Compiler
Porting and User's Guide



2.11.5 Alignment Considerations for Atomicity

When preserving atomicity, the compiler must assume that the data being modified is aligned. An update of a field that spans a quadword boundary cannot occur atomically, because it would require two read-modify-write sequences.

On OpenVMS Alpha systems, software can fix up a normal unaligned load or store instruction, but it cannot do so for an unaligned LDx_L or STx_C instruction. An LDx_L or STx_C instruction to an unaligned address therefore generates a fatal reserved operand fault.

On OpenVMS I64 systems, software likewise cannot handle an unaligned address in the compare-exchange (cmpxchg) instruction, so the instruction generates an exception at run time.

On OpenVMS Alpha systems, when /PRESERVE=ATOMICITY (or .PRESERVE ATOMICITY) is specified, an INCL (R1) instruction generates LDL_L and STL_C instructions, so the address in R1 must be longword aligned.

Assume the following instruction:


INCW (R1)

For this instruction, the compiler generates a code sequence such as the following on OpenVMS Alpha systems:


        BIC     R1,#^B0110,R28   ; Compute Aligned Address
Retry:  LDQ_L   R24,(R28)        ; Load the QW with the data
        EXTWL   R24,R1,R23       ; Extract out the Word
        ADDL    R23,#1,R23       ; Increment the Word
        INSWL   R23,R1,R23       ; Correctly position the Word
        MSKWL   R24,R1,R24       ; Zero the spot for the Word
        BIS     R23,R24,R23      ; Combine Original and New word
        STQ_C   R23,(R28)        ; Conditionally store result
        BEQ     R23,fail         ; Branch ahead on failure
        .
        .
        .
fail:   BR      Retry

Note that the first BIC instruction uses #^B0110, not #^B0111. This is to ensure that the word does not cross a quadword boundary, which would result in an incomplete memory update. If the address in R1 is not pointing to an aligned word, bit 0 will be set and the bit will not be cleared by the BIC instruction. The Load Quadword Locked instruction (LDQ_L) will then generate a fatal reserved operand fault.

An INCB instruction uses #^B0111 to generate the aligned address since all bytes are aligned.
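
The effect of the two BIC masks can be checked with a small simulation. The following Python sketch is not compiler output. It models the Alpha sequence on plain integers, and the names locked_address, would_fault, and incw_in_quadword are hypothetical helpers introduced only for illustration.

```python
MASK64 = (1 << 64) - 1

def locked_address(addr: int, operand_size: int) -> int:
    """Model the BIC step: clear low address bits before LDQ_L.

    operand_size is 2 for a word (mask #^B0110) or 1 for a byte (mask #^B0111).
    """
    mask = 0b0110 if operand_size == 2 else 0b0111
    return addr & ~mask

def would_fault(addr: int, operand_size: int) -> bool:
    """LDQ_L needs a quadword-aligned address; any low bit left set faults."""
    return locked_address(addr, operand_size) & 0b111 != 0

def incw_in_quadword(quadword: int, addr: int) -> int:
    """Model EXTWL, ADDL, INSWL, MSKWL, and BIS for an aligned word."""
    shift = (addr & 0b111) * 8                        # byte offset in the quadword
    word = (quadword >> shift) & 0xFFFF               # EXTWL: extract the word
    word = (word + 1) & 0xFFFF                        # ADDL; INSWL keeps 16 bits
    cleared = quadword & ~(0xFFFF << shift) & MASK64  # MSKWL: zero the word's slot
    return cleared | (word << shift)                  # INSWL + BIS: merge new word
```

In this model, an odd address such as 0x1003 keeps bit 0 set after the word mask is applied, which corresponds to the faulting case described above. A word at byte offset 6 still fits entirely within its quadword, and incrementing a word that holds FFFF wraps to 0000 without disturbing the neighboring words.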

For the INCW (R1) instruction, the compiler generates a code sequence such as the following on OpenVMS I64 systems:


$L5:    ld2           r19 = [r9]
        mov.m         apccv = r19
        mov           r18 = r19
        sxt2          r19 = r19
        adds          r19 = 1, r19
        cmpxchg2.acq  r19, [r9] = r19
        cmp.eq        pr0, pr8 = r18, r19
(pr8)   br.cond.dpnt.few  $L5
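
The retry logic of this sequence can be sketched in Python. This is only a model under stated assumptions: the Cell class and its compare_exchange method are hypothetical stand-ins for the memory location and the cmpxchg2 instruction, and a lock imitates the atomicity that the hardware provides.

```python
import threading

class Cell:
    """Hypothetical 16-bit memory cell with a compare-exchange operation."""

    def __init__(self, value: int = 0) -> None:
        self._value = value & 0xFFFF
        self._lock = threading.Lock()   # stands in for hardware atomicity

    def load(self) -> int:
        return self._value

    def compare_exchange(self, expected: int, new: int) -> int:
        """Store new only if the cell still holds expected; return the old value."""
        with self._lock:
            old = self._value
            if old == expected:
                self._value = new & 0xFFFF
            return old

def atomic_incw(cell: Cell) -> None:
    """Increment the cell, retrying if another thread changed it in between."""
    while True:
        seen = cell.load()                            # ld2
        new = (seen + 1) & 0xFFFF                     # adds, truncated to a word
        if cell.compare_exchange(seen, new) == seen:  # cmpxchg2, then cmp.eq
            return                                    # success; else loop again

def run_threads(cell: Cell, threads: int, per_thread: int) -> int:
    """Increment the cell from several threads and return the final value."""
    def worker() -> None:
        for _ in range(per_thread):
            atomic_incw(cell)
    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return cell.load()
```

As in the generated code, the loop repeats whenever the value returned by the compare-exchange differs from the value originally loaded, so no increment is lost even when several threads update the same cell.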

2.11.6 Interlocked Instructions and Atomicity

The compiler's methods of preserving atomicity have an interesting side effect in compiled VAX MACRO code.

On OpenVMS VAX systems, only the interlocked instructions will work correctly to synchronize access to shared data in multiprocessor systems. On OpenVMS Alpha multiprocessing systems, the code resulting from a compilation of modify instructions (with atomicity preserved) and interlocked instructions would both work correctly, because the LDx_L and STx_C instructions that the compiler generates for both sets of instructions operate correctly across multiple processors. Likewise, on OpenVMS I64 systems, the compare-exchange (cmpxchg) instruction provides interlocking across processors.

Because this compiler side effect is specific to OpenVMS Alpha and OpenVMS I64 systems and does not port back to OpenVMS VAX systems, you should avoid relying on it when porting VAX MACRO code to OpenVMS Alpha or OpenVMS I64 if you intend to run the code on both systems.

However, interlocked instructions must still be used if the memory modification is being used as an interlock for other instructions for which atomicity is not preserved. This is because the Alpha and Itanium architectures do not guarantee strict write ordering.

For example, consider the following VAX MACRO code sequence:


.PRESERVE ATOMICITY
INCL (R1)
.NOPRESERVE ATOMICITY
MOVL (R2),R3

This code sequence will generate the following Alpha code sequence:


Retry:  LDL_L   R28,(R1)
        ADDL    R28,#1,R28
        STL_C   R28,(R1)
        BEQ     R28, fail
        LDL     R3, (R2)
         .
         .
         .
fail:   BR      Retry

Because of the data prefetching of the Alpha and Itanium architectures, the data from (R2) may be read before the store to (R1) is processed. If the INCL (R1) instruction is being used as a lock to prevent the data at (R2) from being accessed before the lock is set, the read of (R2) may occur before the increment of (R1) and thus is not protected.

The VAX interlocked instructions generate Alpha MB (memory barrier) or Itanium mf (memory fence) instructions before and after the interlocked sequence. This prevents memory loads from being moved across the interlocked instruction.

On OpenVMS I64, the code sequence would be similar to the following:


$L7:    ld4     r16 = [r9]
        mov.m   apccv = r16
        mov     r15 = r16
        sxt4    r16 = r16
        adds    r16 = 1, r16
        cmpxchg4.acq r16, [r9] = r16
        cmp.eq  pr0, pr10 = r15, r16
(pr10)  br.cond.dpnt.few $L7
        ld4     r3 = [r28]
        sxt4    r3 = r3

Consider the following code sequence:


ADAWI     #1,(R1)
MOVL      (R2),R3

This code sequence will generate the following Alpha code sequence:


        MB
Retry:  LDL_L   R28,(R1)
        ADDL    R28,#1,R28
        STL_C   R28,(R1)
        BEQ     R28, Fail
        MB
        LDL     R3, (R2)
         .
         .
         .
Fail:   BR      Retry

On OpenVMS I64, a code sequence similar to the following would be generated:


        mf
$L8:    ld2     r23 = [r9]
        mov.m   apccv = r23
        adds    r24 = 1, r23
        cmpxchg2.acq r14, [r9] = r24
        cmp.eq  pr0, pr11 = r23, r14
(pr11)  br.cond.dpnt.few $L8
        mf
        ld4     r3 = [r28]
        sxt4    r3 = r3

The MB or mf instructions cause all memory operations before the MB or mf instruction to complete before any memory operations after the MB or mf instruction are allowed to begin.

2.12 Compiling and Linking

The compiler requires the following files:

  • SYS$LIBRARY:STARLET.MLB
    This is a macro library that defines the compiler directives. When you compile your code, the compiler automatically checks STARLET.MLB for definitions of compiler directives.
  • SYS$LIBRARY:STARLET.OLB
    This is an object library containing emulation routines and other routines used by the compiler. When you link your code, the linker links against STARLET.OLB to resolve undefined symbols.

For information about compiler qualifiers, see Appendix A.

2.12.1 Line Numbering in Listing File

The macro expansion line numbering scheme in the listing file is Xnn/mmm, where Xnn shows the nesting depth and mmm is the line number relative to the outermost macro.

Example 2-1 shows an OpenVMS I64 listing file. The source portion of an OpenVMS Alpha listing file is essentially the same.

Example 2-1 Example of Line Numbering in an OpenVMS I64 Listing File

                      00000000       1 ;
                      00000000       2 ; This is the Itanium (previously called "IA-64") version of
                      00000000       3 ; ARCH_DEFS.MAR, which contains architectural definitions for
                      00000000       4 ; compiling VMS sources for VAX, Alpha, and I64 systems.
                      00000000       5 ;
                      00000000       6 ; Note: VAX, VAXPAGE, and IA64 should be left undefined,
                      00000000       7 ;       a lot of code checks for whether a symbol is
                      00000000       8 ;       defined (e.g. .IF DF VAX) vs. whether the value
                      00000000       9 ;       is of a expected value (e.g. .IF NE VAX).
                      00000000      10 ;
                      00000000      11 ;VAX   = 0
                      00000000      12 ;EVAX  = 0
                      00000000      13 ;ALPHA = 0
00000001              00000000      14 IA64   = 1
                      00000000      15 ;
                      00000000      16 ;VAXPAGE = 0
00000001              00000000      17 BIGPAGE  = 1
                      00000000      18 ;
00000020              00000000      19 ADDRESSBITS = 32
                      00000000      20  .TITLE ug_ex_listing /line numbering in the listing file/
                      00000000      21 ;
                      00000000      22  .MACRO test1
                      00000000      23  clrl r1
                      00000000      24  clrl r2
                      00000000      25  tstl 48(sp)  ; generate uplevel stack error
                      00000000      26  clrl r3
                      00000000      27  .ENDM test1
                      00000000      28  .MACRO test2
                      00000000      29  clrl r4
                      00000000      30  clrl r5
                      00000000      31  test1
                      00000000      32  clrl r6
                      00000000      33  .ENDM test2
                      00000000      34
                      00000000      35 foo: .jsb_entry
                      00000000      56  .show expansions
                      00000000      57  clrl r0
                      00000011      58  test2
     1.......
%IMAC-E-UPLEVSTK, (1) up-level stack reference in routine FOO

         X01/001      00000002   clrl r4
         X01/002      00000004   clrl r5
         X01/003      00000006   test1
         X02/004      00000006   clrl r1
         X02/005      00000008   clrl r2
         X02/006      0000000A   tstl 48(sp)  ; generate uplevel stack error
         X02/007      0000000D   clrl r3
         X02/008      0000000F
         X01/009      0000000F   clrl r6
         X01/010      00000011
                      00000011      59  rsb
                      00000012      60  .noshow expansions
                      00000012      61
                      00000012      62  .END
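
The Xnn/mmm tags in a listing such as Example 2-1 can be decoded mechanically. The following Python sketch is one way to do so. The function name parse_expansion_tag is hypothetical, and the pattern assumes that tags always have the form shown in the example.

```python
import re
from typing import Tuple

_TAG = re.compile(r"^X(\d+)/(\d+)$")

def parse_expansion_tag(tag: str) -> Tuple[int, int]:
    """Split a tag such as 'X02/006' into (nesting depth, relative line)."""
    match = _TAG.match(tag.strip())
    if match is None:
        raise ValueError(f"not an expansion tag: {tag!r}")
    return int(match.group(1)), int(match.group(2))
```

For the tag X02/006 in Example 2-1, this yields a nesting depth of 2 (the expansion of test1 inside test2) and line 6 relative to the start of the outermost macro, test2.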

2.13 Debugging

The compiler provides full debugger support. The debug session for compiled VAX MACRO code is similar to that for assembled VAX MACRO code. However, there are some important differences that are described in this section. For a complete description of debugging, see the HP OpenVMS Debugger Manual.

2.13.1 Code Relocation

One major difference is that the code is compiled rather than assembled. On an OpenVMS VAX system, each VAX MACRO instruction is a single machine instruction. On an OpenVMS Alpha or OpenVMS I64 system, each VAX MACRO instruction may be compiled into many Alpha or Itanium machine instructions. A major side effect of this difference is the relocation and rescheduling of code if you do not specify /NOOPTIMIZE in your compile command.

By default, several optimizations are performed that cause the movement of generated code across source boundaries (see Section 1.2, Section 4.3, and Appendix A). For most code modules, debugging is simplified if you compile with /NOOPTIMIZE, which prevents this relocation from happening. After you have debugged your code, you can recompile without /NOOPTIMIZE to improve performance.

2.13.2 Symbolic Variables for Routine Arguments

Another major difference between debugging compiled code and debugging assembled code is a new concept to VAX MACRO, the definition of symbolic variables for examining routine arguments. On OpenVMS VAX systems, when you are debugging a routine and want to examine the arguments, you typically do something like the following:


DBG> EXAMINE @AP        ; to see the argument count
DBG> EXAMINE @AP+4      ; to examine the first arg

or


DBG> EXAMINE @AP        ; to see arg count
DBG> EXAMINE .+4:.+20   ; to see first 5 args

On OpenVMS Alpha and OpenVMS I64 systems, the arguments do not reside in a vector in memory as they do on OpenVMS VAX systems. Furthermore, there is no AP register on OpenVMS Alpha and OpenVMS I64 systems. If you type EXAMINE @AP when debugging VAX MACRO compiled code, the debugger reports that AP is an undefined symbol.

In the compiled code, the arguments can reside in some combination of:

  • Registers
  • On the stack above the routine's stack frame
  • In the stack frame, if the argument list was homed (see Section 2.4) or if there are calls out of the routine that would require the register arguments to be saved

The compiler does not require that you figure out where the arguments are by reading the generated code. Instead, it provides $ARGn symbols that point to the correct argument locations. The $ARG0 symbol is the same as @AP+0 is on VAX systems, that is, the argument count. The $ARG1 symbol is the first argument, $ARG2 is the second argument, and so forth. These symbols are defined in .CALL_ENTRY and .JSB_ENTRY directives, but not in .EXCEPTION_ENTRY directives.

2.13.3 Locating Arguments Without $ARGn Symbols

There may be additional arguments in your code for which the compiler did not generate a $ARGn symbol. The number of $ARGn symbols defined for a .CALL_ENTRY routine is the maximum number detected by the compiler (either by automatic detection or as specified by MAX_ARGS). For a .JSB_ENTRY routine, since the arguments are homed in the caller's stack frame and the compiler cannot detect the actual number, it always creates eight $ARGn symbols.

In most cases, you can easily find any additional arguments, but in some cases you cannot.

2.13.3.1 Additional Arguments That Are Easy to Locate

You can easily find additional arguments if:

  • The argument list is not homed, and $ARGn symbols are defined to $ARG7 or higher on OpenVMS Alpha and $ARG9 or higher on OpenVMS I64. If the argument list is not homed, the $ARGn symbols $ARG7 and above on OpenVMS Alpha and $ARG9 and above on OpenVMS I64 always point into the list of parameters passed as quadwords on the stack. Subsequent arguments will be in quadwords following the last defined $ARGn symbol.
  • The argument list is homed, and you want to examine an argument that is less than or equal to the maximum number detected by the compiler (either by automatic detection or as specified by MAX_ARGS). If the argument list is homed, $ARGn symbols always point into the homed argument list. Subsequent arguments will be in longwords following the last defined $ARGn symbol.

For example, you can examine arguments beyond the eighth argument in a JSB routine (where the argument list must be homed in the caller), as follows:


DBG> EX $ARG8  ; highest defined $ARGn
.
.
.
DBG> EX .+4  ; next arg is in next longword
.
.
.
DBG> EX .+4  ; and so on

This example assumes that the caller detected at least 10 arguments when homing the argument list.

To find arguments beyond the last $ARGn symbol in a routine that did not home the arguments, proceed exactly as in the previous example except substitute EX .+8 for EX .+4.
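
The stepping rule in the two preceding paragraphs amounts to simple arithmetic. The following Python sketch computes the byte offset to add to the address of the last defined $ARGn symbol. The function name extra_arg_offset and its parameters are hypothetical, and the result is meaningful only in the easy-to-locate cases described in this section.

```python
def extra_arg_offset(last_defined: int, wanted: int, homed: bool) -> int:
    """Byte offset of argument `wanted` from the last defined $ARGn symbol.

    Homed argument lists hold longwords (4 bytes per argument);
    arguments passed on the stack are quadwords (8 bytes per argument).
    """
    if wanted <= last_defined:
        raise ValueError("argument already has a $ARGn symbol")
    stride = 4 if homed else 8
    return (wanted - last_defined) * stride
```

For the JSB example above, reaching the tenth argument from $ARG8 in a homed list requires an offset of 8 bytes, which matches the two EX .+4 steps.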

2.13.3.2 Additional Arguments That Are Not Easy to Locate

You cannot easily find additional arguments if:

  • The argument list is not homed, and $ARGn symbols are defined only as high as $ARG6 on OpenVMS Alpha and $ARG8 on OpenVMS I64. In this case, the existing $ARGn symbols will either point to registers or to quadword locations in the stack frame. In both cases, subsequent arguments cannot be examined by looking at quadword locations beyond the defined $ARGn symbols.
  • The argument list is homed, and you want to examine arguments beyond the number detected by the compiler. The $ARGn symbols point to the longwords that are stored in the homed argument list. The compiler only moves as many arguments as it can detect into this list. Examining longwords beyond the last argument that was homed will show unrelated stack contents rather than arguments.

The only way to find the additional arguments in these cases is to examine the compiled machine code to determine where the arguments reside. Both of these problems are eliminated if MAX_ARGS is specified to cover the highest-numbered argument that you want to examine.

2.13.4 Using VAX and Alpha Register Names on OpenVMS I64

For convenience, the MACRO compiler on OpenVMS I64 defines symbols named R0, R1, ... R31 to refer to the Itanium registers where those Alpha register values reside. You can still use the debugger's names %R0, %R1, ... %R31 to refer to registers by the native machine's numbering.

2.13.5 Debugging Code with Packed Decimal Data

Keep this information in mind when debugging compiled VAX MACRO code with packed decimal data on an OpenVMS Alpha or OpenVMS I64 system:

  • When using the EXAMINE command to examine a location that was declared with a .PACKED directive, the debugger automatically displays the value as a packed decimal data type.
  • You can deposit packed decimal data. The syntax is the same as it is on VAX.

2.13.6 Debugging Code with Floating-Point Data

Keep this information in mind when debugging compiled VAX MACRO code with floating-point data on an OpenVMS Alpha or OpenVMS I64 system:

  • You can use the EXAMINE/FLOAT command to examine an Alpha or Itanium integer register for a floating-point value.
    Even though there is a set of registers for floating-point operations on OpenVMS Alpha and OpenVMS I64 systems, those registers are not used by compiled VAX MACRO code that contains floating-point operations. Only the integer registers are used.
    Floating-point operations in compiled VAX MACRO code are performed by emulation routines that operate outside the compiler. Therefore, performing VAX MACRO floating-point operations on, say, R7, has no effect on floating-point Register 7.
  • When using the EXAMINE command to examine a location that was declared with a .FLOAT directive or other floating-point storage directives, the debugger automatically displays the value as floating-point data.
  • When using the EXAMINE command to examine the G_FLOAT data type, the debugger does not use the contents of two registers to build the value for VAX data.
    Consider the following example:


    EXAMINE/G_FLOAT   R4
    

    In this example, the lower longwords of R4 and R5 are not used to build the value as is the case on VAX. Instead, the quadword contents of R4 are used.
    The code the compiler generates for D_FLOAT and G_FLOAT operations preserves the VAX format of the data in the low longwords of two consecutive registers. Therefore, using EXAMINE/G_FLOAT on either of these two registers will not give the true floating-point value, and issuing DEPOSIT/G_FLOAT to one of these registers will not give the desired results. You can manually combine the two halves of such a value, however. For example, assume you executed the following instruction:


    MOVG    DATA, R6
    

    You could then read the G_FLOAT value which now resides in R6 and R7 with a sequence like the following:


    DBG> EX R6
    .MAIN.\%LINE 100\%R6:   0FFFFFFFF D8E640D1
    DBG> EX R7
    .MAIN.\%LINE 100\%R7:   00000000 2F1B24DD
    DBG> DEP R0 = 2F1B24DDD8E640D1
    DBG> EX/G_FLOAT R0
    .MAIN.\%LINE 100\%R0:   4568.89900000000
    
  • You can deposit floating-point data in an Alpha or Itanium integer register with the DEPOSIT command. The syntax is the same as it is on a VAX system.
  • H_FLOAT is unsupported.
  • On OpenVMS I64 systems, incoming parameters are in R32 through R39, not in R16 through R21. Outgoing parameters are in higher numbered registers chosen by the compiler.
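
The manual combination of register halves shown in the G_FLOAT item above can be sketched in Python. This is an illustration rather than debugger behavior: the function names combine_halves and decode_g_float are hypothetical, and the decoding assumes the standard VAX G_floating layout (sign in bit 15, an excess-1024 exponent in bits 14:4, a hidden leading fraction bit, and the least significant fraction word in bits 63:48).

```python
def combine_halves(low_reg: int, high_reg: int) -> int:
    """Build the quadword from the low longwords of two consecutive registers."""
    return ((high_reg & 0xFFFFFFFF) << 32) | (low_reg & 0xFFFFFFFF)

def decode_g_float(quadword: int) -> float:
    """Decode a VAX G_FLOAT value held as a 64-bit integer (assumed layout)."""
    # Split into 16-bit words; word 0 (bits 15:0) carries sign and exponent.
    w0, w1, w2, w3 = ((quadword >> (16 * i)) & 0xFFFF for i in range(4))
    sign = -1.0 if w0 & 0x8000 else 1.0
    exponent = (w0 >> 4) & 0x7FF                 # excess-1024 exponent
    if exponent == 0:
        if sign < 0:
            raise ValueError("reserved operand")  # sign set with zero exponent
        return 0.0
    # Fraction bits run from word 0 (most significant) to word 3 (least).
    fraction = ((w0 & 0xF) << 48) | (w1 << 32) | (w2 << 16) | w3
    mantissa = (1 << 52) | fraction              # restore the hidden bit
    # The value is 0.1f (binary) times 2**(exponent - 1024).
    return sign * mantissa * 2.0 ** (exponent - 1024 - 53)
```

Applied to the register contents in the example (R6 holding FFFFFFFF D8E640D1 and R7 holding 00000000 2F1B24DD), combine_halves produces 2F1B24DDD8E640D1, the value deposited into R0, and decode_g_float of that quadword is approximately 4568.899, in agreement with the debugger output.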

