HP OpenVMS Systems Documentation

Porting Applications from HP OpenVMS Alpha to HP OpenVMS Industry Standard 64 for Integrity Servers


C.1.8 System-Supplied Allocation and Deallocation Routines

The system supplies a number of standard allocation routines. The routines adhere to the KP allocation API and can be used in place of a user-written routine if they meet the needs of the application. The following allocation routines are available:

  • EXE$KP_ALLOC_MEM_STACK (kernel mode, S0/S1 space)
  • EXE$KP_ALLOC_MEM_STACK_USER (user mode, P0 space)
  • EXE$KP_ALLOC_RSE_STACK (caller's mode, S2 space)
  • EXE$KP_ALLOC_RSE_STACK_P2 (caller's mode, P2 space)

The following deallocation routines are supplied. All deallocation routines take a single argument, the address of the KPB.

  • EXE$KP_DEALLOCATE_KPB
    • Only KPBs allocated by EXE$KP_ALLOCATE_KPB
    • Deallocates stacks and KPB
  • EXE$KP_DEALLOC_MEM_STACK
  • EXE$KP_DEALLOC_MEM_STACK_USER
  • EXE$KP_DEALLOC_RSE_STACK
  • EXE$KP_DEALLOC_RSE_STACK_P2

C.1.9 End Routine

The end routine is called by the KP services when EXE$KP_END is called, either explicitly or by reaching the end of the KP routine. Since EXE$KP_USER_ALLOC_KPB allows the specification of arbitrary allocation routines, the end routine must either cache the KPB for future use by the application or call the necessary deallocation routines for the stacks and KPB.

The end routine is called with two parameters as follows:


void end_routine (KPB_PQ KPB, int status)

  • KPB---64-bit address of a previously allocated KPB. Passed by 64-bit reference.
  • STATUS---optional status value. If EXE$KP_END was called explicitly, the status value is that supplied as the second argument to EXE$KP_END. If the second argument is omitted, the value SS$_NORMAL is supplied. If the KP routine terminated by returning, the value supplied is the return value of the KP routine.
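
As an illustration of the caching option, the following C sketch shows an end routine that keeps finished KPBs on a small free list and releases them only when the list is full. It is not OpenVMS code: kpb_t stands in for the real KPB structure, and release_kpb, get_kpb, KPB_CACHE_MAX, and the last_status field are hypothetical names introduced for this example. In a real application, release_kpb would call the deallocation routines that match the allocation routines used for the stacks and the KPB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the real KPB (assumption: the real type comes from KPBDEF). */
typedef struct kpb {
    struct kpb *next_free;   /* hypothetical link used only by this cache   */
    int last_status;         /* hypothetical field: status seen at the end  */
} kpb_t;

#define KPB_CACHE_MAX 4      /* assumed cache depth */

static kpb_t *kpb_cache_head = NULL;
static int kpb_cache_count = 0;
static int kpb_released = 0; /* counts KPBs handed back to the allocator */

/* Hypothetical stand-in for the stack and KPB deallocation calls. */
static void release_kpb(kpb_t *kpb)
{
    free(kpb);
    kpb_released++;
}

/* End routine in the two-parameter shape described above: (KPB, status). */
void end_routine(kpb_t *kpb, int status)
{
    assert(kpb != NULL);
    kpb->last_status = status;
    if (kpb_cache_count < KPB_CACHE_MAX) {
        /* Cache the KPB for future use by the application. */
        kpb->next_free = kpb_cache_head;
        kpb_cache_head = kpb;
        kpb_cache_count++;
    } else {
        /* Cache is full: give the KPB back. */
        release_kpb(kpb);
    }
}

/* Reuse a cached KPB if one is available; otherwise allocate a fresh one. */
kpb_t *get_kpb(void)
{
    kpb_t *kpb = kpb_cache_head;
    if (kpb != NULL) {
        kpb_cache_head = kpb->next_free;
        kpb_cache_count--;
        kpb->next_free = NULL;
        return kpb;
    }
    return (kpb_t *)calloc(1, sizeof(kpb_t));
}
```

The free list is last-in, first-out, so the most recently finished KPB is the first one reused. The sketch has no locking; an application whose KP routines end asynchronously would need to synchronize access to the list.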

C.2 KP Control Routines

Once the KPB and stacks are allocated, the four routines that determine the state of a KP routine can be used.

C.2.1 Overview

A KP routine is initiated by a call to EXE$KP_START. While running, the KP routine may elect to give up control by calling EXE$KP_STALL_GENERAL. A stalled KP routine can be resumed by calling EXE$KP_RESTART. The KP routine is terminated either by an explicit call to EXE$KP_END or by returning from the routine. In the latter case, the KP services call EXE$KP_END implicitly.

When a KP routine starts, the current thread of execution state is saved onto the current stack, the KP stack is loaded, and the KP routine is called.

When the KP routine stalls, the KP routine's context is saved onto the KP stack, the stack is switched to the original stack, and the main routine's context is loaded from that stack. The result is that a stall causes the original routine to return from the last call to EXE$KP_START or to EXE$KP_RESTART.

When the KP routine is restarted, the current context is saved onto the current stack, the stack is switched to the KP routine's stack and the KP routine's context is restored from the stack. The KP routine returns from the last call to EXE$KP_STALL_GENERAL.

The stall/resume sequence can occur zero or more times. The KP routine is not required ever to stall. A stalled routine cannot stall again until it has been resumed. A running KP routine cannot be restarted until it has stalled. The full checking version of SYSTEM_PRIMITIVES.EXE enforces these rules. Failure to follow these rules can result in a KP_INCONSTATE bugcheck, depending on the mode in which the KP routine is running.

When the KP routine terminates, either by explicitly calling EXE$KP_END or by returning to the caller, no current context is saved, the stack is switched, and the original thread context is restored. At this point, if the DEALLOCATE_AT_END flag is set (kernel mode only) or if an end routine address has been supplied, the appropriate action takes place. The original thread returns from the call that started or restarted the KP routine.

Figure C-1 shows the overall code flow.

Figure C-1 KP Routine Execution


Note

While the main thread of execution is depicted as a continuous stream in Figure C-1, the actual thread of execution may include asynchronous components. The application is required to perform all necessary synchronization and to maintain the necessary scope of the KPB.

C.2.2 Routine Descriptions

This section describes the four KP control routines.

C.2.2.1 EXE$KP_START

Syntax:


status = EXE$KP_START(kpb, routine, reg-mask)

  • kpb---Address of a previously allocated and initialized KPB. Passed by 32-bit reference.
  • routine---KP routine address
  • reg-mask---Register save mask. Longword. Passed by value. This argument is read on Alpha only. The I64 interface supports only the OpenVMS Calling Standard for register preservation. The constant KPREG$K_HLL_REG_MASK, defined in KPBDEF, can be used for calls from higher-level languages on Alpha to specify the calling standard register mask.

This routine suspends the current thread of execution, swaps to the new stacks, and calls the specified routine. The KP routine is called with a single argument---the 32-bit address of the supplied KPB. The KPB must be invalid and inactive.

C.2.2.2 EXE$KP_STALL_GENERAL

Syntax:


status = EXE$KP_STALL_GENERAL(kpb)

  • kpb---Address of the KPB passed at the start of this routine. Passed by 32-bit reference.

This routine stalls the current thread of execution, saving context onto the KP stack, and returns to the most recent call that started or restarted this routine. The KPB must be valid and active.

The return status from this routine is supplied by the routine that restarts this procedure.

C.2.2.3 EXE$KP_RESTART

Syntax:


EXE$KP_RESTART(kpb [, thread_status])

  • kpb---Address of the KPB last used to stall this routine. Passed by 32-bit reference.
  • thread_status---Status value to be supplied as the return value for the call to EXE$KP_STALL_GENERAL that last stalled the KP routine. Passed by value. Optional. If this parameter is omitted, SS$_NORMAL is returned.

This routine causes the stalled KP routine to restart by returning from the last call to EXE$KP_STALL_GENERAL. Note that this may be a completely asynchronous operation with respect to the original thread of execution that started the KP routine. The KPB must be valid and inactive.

C.2.2.4 EXE$KP_END

Syntax:


status = EXE$KP_END(kpb [, status])

  • kpb---Address of the KPB last used to start or restart this routine. Passed by 32-bit reference.
  • status---Status value to be supplied to the end routine, if any was specified in the KPB. Passed by value. Optional. If this parameter is omitted, SS$_NORMAL is supplied.

This routine terminates the KP routine, returning control to the last thread of execution that started or restarted the KP routine. The KPB must be valid and active. The KPB is marked as invalid and inactive and cannot be used in subsequent calls to EXE$KP_RESTART or EXE$KP_STALL_GENERAL without first calling EXE$KP_START to start another KP routine.

Instead of calling EXE$KP_END, the KP routine can return to the caller. Returning to the caller causes the KP code to call EXE$KP_END automatically. In that case, the return status from the KP procedure is used for the status argument.
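
The KPB state requirements stated for the four routines can be summarized as a small state model. The C sketch below is a portable illustration, not the KP implementation: kp_state, kp_result, and the kp_* functions are hypothetical names, and the states that follow EXE$KP_START and EXE$KP_STALL_GENERAL are inferred from the preconditions of the routines that may legally come next.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { KP_OK = 0, KP_BAD_STATE = -1 } kp_result;

/* Model of the two KPB state flags referred to in the routine descriptions. */
typedef struct {
    bool valid;   /* a KP routine has been started and has not yet ended */
    bool active;  /* the KP routine is currently running                  */
} kp_state;

/* Check the required state, then move to the next state. */
static kp_result kp_transition(kp_state *s,
                               bool need_valid, bool need_active,
                               bool next_valid, bool next_active)
{
    assert(s != NULL);
    if (s->valid != need_valid || s->active != need_active)
        return KP_BAD_STATE;
    s->valid = next_valid;
    s->active = next_active;
    return KP_OK;
}

/* START: KPB must be invalid and inactive. */
kp_result kp_start(kp_state *s)   { return kp_transition(s, false, false, true,  true);  }

/* STALL: KPB must be valid and active. */
kp_result kp_stall(kp_state *s)   { return kp_transition(s, true,  true,  true,  false); }

/* RESTART: KPB must be valid and inactive. */
kp_result kp_restart(kp_state *s) { return kp_transition(s, true,  false, true,  true);  }

/* END: KPB must be valid and active; it becomes invalid and inactive. */
kp_result kp_end(kp_state *s)     { return kp_transition(s, true,  true,  false, false); }
```

The model returns KP_BAD_STATE for an illegal call so that the rules can be exercised safely. It says nothing about how the real services react beyond what is stated in Section C.2.1.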

C.3 Design Considerations

  • KP routines run in a single process. The KP services by themselves offer no parallelism benefits on multiple-CPU systems.
  • All calls to EXE$KP_STALL_GENERAL, EXE$KP_RESTART and EXE$KP_END must occur in the same mode as the call to EXE$KP_START. Mode changes can occur between calls, but all calls must be made at the same mode.
  • Multiple KPBs can be valid at the same time.
  • Multiple KPBs can be active at the same time. This implies that a KP routine has started or restarted another KPB. Except in the simplest cases, code flow rapidly becomes exceedingly complex as the number of active KPBs increases. In such cases, a work queue model with a dispatcher often removes the need for multiple active KPBs.
  • Allocating and deallocating stacks is expensive enough in system time and resources that applications should consider caching KPBs to some degree rather than allocating a new one for each use.
  • Applications using KP services in process context in kernel mode must lock the stacks into the working set.
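
The work queue model mentioned in the list above can be sketched in portable C as a fixed-size ring buffer drained by a single dispatcher. This is only an illustration of the pattern under stated assumptions: work_queue, work_item, work_enqueue, work_dispatch, add_one, and QUEUE_CAPACITY are hypothetical names, and in a real application each work function would typically restart one stalled KP routine rather than increment a counter.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAPACITY 8            /* assumed queue size */

typedef void (*work_fn)(void *arg); /* one unit of work */

typedef struct {
    work_fn fn;
    void *arg;
} work_item;

typedef struct {
    work_item items[QUEUE_CAPACITY];
    size_t head;                    /* index of the oldest queued item */
    size_t count;                   /* number of queued items          */
} work_queue;

/* Queue one item; returns 0 on success or -1 if the queue is full. */
int work_enqueue(work_queue *q, work_fn fn, void *arg)
{
    assert(q != NULL && fn != NULL);
    if (q->count == QUEUE_CAPACITY)
        return -1;
    size_t tail = (q->head + q->count) % QUEUE_CAPACITY;
    q->items[tail].fn = fn;
    q->items[tail].arg = arg;
    q->count++;
    return 0;
}

/* Run queued items in FIFO order, one at a time; returns how many ran. */
size_t work_dispatch(work_queue *q)
{
    assert(q != NULL);
    size_t ran = 0;
    while (q->count > 0) {
        work_item item = q->items[q->head];
        q->head = (q->head + 1) % QUEUE_CAPACITY;
        q->count--;
        item.fn(item.arg);          /* only one unit of work is active */
        ran++;
    }
    return ran;
}

/* Example work function used to exercise the queue. */
void add_one(void *arg)
{
    ++*(int *)arg;
}
```

Because the dispatcher removes an item before running it, a work function may enqueue further work without disturbing the item being run. The sketch is single-threaded; asynchronous producers would need synchronization around the queue.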


Glossary


ABI: Application binary interface. A general term encompassing all of the rules and conventions that programs must follow to run under a particular operating system. For example, the OpenVMS I64 ABI includes the calling standard, register usage, interrupt handling, and other related topics.

According to Intel's Itanium Software Conventions and Run-Time Architecture Guide, "an ABI is composed of an API, system-specific conventions, a hardware description, and a run-time architecture."

ACPI: Advanced configuration and power interface. ACPI provides interfaces for power state management, system device identification, and hardware configuration.

  • ACPI specification:


    http://www.acpi.info/spec.htm
    
  • Intel's ACPI web site:


    http://developer.intel.com/technology/iapc/acpi/
    

ALAT: Advanced load address table. A lookup table on Itanium processors that tracks speculative memory accesses and allows the CPU to handle conflicts.

AML: ACPI machine language. Pseudocode for a virtual machine supported by an ACPI-compatible operating system and in which ACPI control methods and objects are written. See also ACPI.

application registers: 128 special-purpose registers used for various functions, labeled ar0 to ar127. For example, ar17 is the register-backing store pointer.

Aries: HP emulator/translator for HP Precision Architecture (PA) application code. Unlike VEST (the VAX-to-Alpha translator), Aries emulates only at run-time and does not produce a translated image. When the image exits, all the translated code is discarded and must be regenerated each time the program runs.

ASL: ACPI source language. The programming language for AML. ASL source code is compiled into AML images.

ASM: Address space match, part of virtual memory management.

branch registers: Itanium 64-bit registers, br0 to br7, used to specify the target addresses for indirect branches. The branch registers streamline call/return branching.

bundle: The basic instruction format for the Itanium architecture. A bundle is a 128-bit structure consisting of three 41-bit instructions, plus a 5-bit template.

CISC: Complex instruction set computing. Includes numerous variable-length machine instructions that are more complex than RISC. See also RISC and VLIW.

COW: Copy-on-write. Refers to memory that is mapped by multiple processes provided each process does not attempt to write to it. If a process writes to a page, the page is copied to a process-private location and remapped, leaving the original shared page intact. COW is commonly used by the loader for memory-mapped shared libraries.

CVF: Compaq Visual Fortran.

DSDT: Differentiated system description table. Part of the ACPI subsystem. The DSDT contains the Differentiated Definition Block, which supplies the implementation and configuration information about the base system.

DVD: Digital versatile disk.

DWARF: Debugging and traceback information (embedded in ELF).

EFI: Extensible firmware interface. EFI's purpose is to initialize the hardware and start booting an operating system.

For more information, see Intel's EFI Page:


http://developer.intel.com/technology/efi/index.htm

ELF: Executable and linkable format (ELF) defines the file format for object files, shared libraries, and executable images on Itanium processors. ELF is not specific to the Itanium architecture; it has been used for years on various UNIX systems.

For more information, see the ELF specification from SCO:


http://stage.caldera.com/developer/gabi/

EPIC: Explicitly parallel instruction computing. A term coined by Intel to describe the Itanium architecture (comparable to CISC or RISC). EPIC provides architecturally visible access to parallelism in the CPU. For example, the Intel Itanium processor family defines data speculation and register renaming as part of the architecture, rather than hiding it in the chip implementation as was done with Alpha. EPIC is similar to the very long instruction word (VLIW) computing architecture.

FADT: Fixed ACPI description table. A table that contains the ACPI hardware register block implementation and configuration details, as well as the physical address of the DSDT.

FAT: File allocation table. The disk volume structure used by various Microsoft operating systems and by the Itanium EFI operating system loader. There are various FAT variants, including FAT8, FAT12, FAT16, and FAT32. These differ in total permitted volume size and various other changes.

fills: See spills and fills.

FIT: Firmware interface table.

GEM: The backend code generator used by most OpenVMS Alpha and I64 compilers.

general registers: Itanium processors have 128 64-bit general registers, numbered as r0 to r127 (sometimes called gr0 to gr127). However, only 32 registers are normally used directly; the other 96 are used for argument passing and to store local variables for the currently executing procedure. Registers r0 to r31 are static, but registers r32 and up are generally managed by the register stack engine (RSE) and might be renamed across procedure calls.

global pointer (GP): The base address of the current global data segment. The base address allows the use of compact, GP-relative addresses in machine code. Because the largest immediate value in an Itanium instruction is 22 bits wide, the global data segment is 4 MB in length. General register 1 holds the current GP.

HP OpenVMS Migration Software for Alpha to Integrity Servers: A tool that translates Alpha executable images to I64 executable images.

HWPCB: Hardware process control block. The portion of the process context that is stored and retrieved by the hardware when a process is rescheduled.

HWRPB: Hardware restart parameter block. These data structures are used to communicate system-level configuration information between the console and OpenVMS both on system startups and system restarts.

IA: Intel architecture.

IA-32: An Intel 32-bit CISC microprocessor architecture. Examples of implementations include the Pentium®, Xeon™, and 80386 series.

IA-64: An old name for the Itanium processor family architecture. The term IA-64 is no longer officially used. However, it still appears in older documentation, and IA64 still occurs in some code.

Integrity: The brand name of the HP server product line based on the Itanium architecture. For example, the rx2600 server is sold as the HP Integrity rx2600.

ILP: Instruction level parallelism. The ability to execute multiple instructions in parallel during the same cycle.

instruction group: A sequence of one or more instructions, delimited by explicit stops or taken branches. Instructions within an instruction group must have no RAW or WAW dependencies. Instruction groups can cross bundle boundaries.

Intel® Itanium®: The name of both the 64-bit architecture (the Itanium processor family) and the first chip to implement the architecture (the Itanium processor).

Intel® Itanium® 2: The second generation of the Itanium processor family. The rx2600 is an Itanium 2 system. For more information, refer to the following web site:


http://www.cpus.hp.com/technical_references/ia64.shtml

IPINT: Inter-processor interrupt. Also simply called IPI.

IPF: Itanium processor family.

IPMI: Intelligent platform management interface, a set of APIs used to manage servers. Functions include monitoring environmental data such as temperature, voltage, fans, and chassis switches, as well as alerting, remote console and control, and asset tracking. See the Intel web site for more details:


http://www.intel.com/design/servers/ipmi/

IVT: Interrupt vector table.

Jacket routines: Interface routines that transform the arguments from one calling standard to another.

MAS: Multiple address space. In an MAS operating system, all processes on the system have at least a partially private virtual address space. OpenVMS and Tru64 UNIX are both MAS operating systems. See also SAS.

MBR: Master boot record. The contents of the first disk block and the core data structure of a FAT32 partitionable disk.

NaT: Not a thing. An extra bit associated with various registers that indicates whether or not the register contents are valid. Used for speculative execution. For example, a read from a nonexistent virtual address would result in NaT. The NaT bit propagates as execution progresses so that any operation involving an NaT results in an NaT.

NMI: Non-maskable interrupt.

NUE: The Linux Native User Environment, a set of development software used in conjunction with the Ski Itanium emulator.

PA: Hewlett-Packard (HP) Precision Architecture, a computer architecture also known as PA-RISC. HP is porting its UNIX from PA to the Itanium architecture.

PAL (Alpha): Privileged architecture library. PAL is part of the Alpha architecture. It is a software mechanism for performing low-level operations such as interrupt handling, TLB management, and atomic operations that were implemented on VAX as microcode. On Itanium systems, the same functionality is provided as part of the operating system. Whenever possible, the BLISS, C, and Macro compilers for OpenVMS I64 convert CALL_PAL macros to the equivalent operating system calls for backward compatibility. Not all Alpha PAL operations are implemented on I64; in some cases, programs that call PALcode directly might need to change.

PAL (Itanium): Processor abstraction layer. Part of the Itanium architecture that is implemented in firmware. Provides a consistent interface for processor-specific functions such as hardware errors or system initialization. Generally speaking, SAL isolates the operating system from platform-specific implementation differences, while PAL isolates the operating system (and SAL) from processor-specific differences.

PIC: Position-independent code. Machine code that uses only relative address references. This strategy allows the code to be loaded to any memory location without modification.

POSSE: Pre-operating system startup environment. POSSE is the standard firmware user interface on Itanium® 2 servers. It is an HP value-added component that is available only on HP systems.

PPN: Physical page number.

PTE: Page table entry.

predication: The conditional execution of instructions based on a set of special 1-bit registers called predicates. Most I64 instructions include a predicate number. When the corresponding predicate register is true (1), the instruction is executed. When it is false (0), the instruction is treated as a "no-op."

RAW: Read-after-write violation. A type of data dependency between two instructions in one instruction group. The later instruction reads data from the same register to which the earlier instruction wrote.

RID: Region identifier.

RISC: Reduced instruction set computing. Fixed-size machine instructions that are intended to be simpler machine instructions than CISC. See also CISC and VLIW.

RSE: Register stack engine. An on-chip entity that handles register renaming and spills and fills across procedure calls.

rx2600: An HP server with up to two Itanium 2 CPUs. The rx2600 is the first platform for customer use to run the OpenVMS Itanium port.

SAL: System abstraction layer. Part of the Itanium architecture, implemented in firmware. Provides a consistent interface for platform-specific functions such as hardware errors or system initialization. Generally speaking, SAL isolates the operating system from platform-specific implementation differences, while PAL isolates SAL and the operating system from processor-specific differences.

SAS: Single address space. In an SAS operating system, all processes share the same address space. See also MAS.

SCI: System control interrupt. A system interrupt used by hardware to notify the operating system about ACPI events.

Ski: A free Itanium emulator for Linux. Ski emulates the Intel Itanium Processor Family architecture but not any particular implementation.

speculation: The process of moving memory accesses earlier in the code to hide memory access latencies. If it turns out that the memory access was invalid or unnecessary, it can easily be discarded. The Itanium processor instructions ld.a, ld.s, chk.a, and chk.s are used for speculation.

spills and fills: Operations that move registers between the register backing store in memory and the rotating portion of the Itanium register file, which is used as a stack. If the register stack overflows, registers are saved to the register backing store. A spill flushes one or more registers out to the backing store; a fill restores registers from the backing store. Spills and fills are handled automatically by the register stack engine (RSE) on the CPU without program intervention.

SRM console: Firmware for Alpha systems that includes boot support, PALcode, and platform-specific functionality used by OpenVMS, UNIX, and Linux. The chevron prompt (>>>) that you see when you power on any Alpha system is part of the SRM console. The console behavior is defined in the Alpha System Reference Manual (SRM).

The Itanium architecture does not have an equivalent to the SRM console. Instead, the SRM functionality is split between the PAL/SAL firmware, the EFI boot software, and the operating system.

stop: The end of an instruction group, designated by a double semicolon (;;) in assembly listings.

Superdome: The high-end HP server product line. Superdome servers support about 64 CPUs, either PA-RISC or Itanium. Eventually, all new Superdome systems will be Itanium based.

SWIS: SoftWare interrupt support. The set of services that implement the interrupt model used by OpenVMS. This includes processor IPL, ASTs, mode changes, and software interrupts. On Alpha, this support was implemented in PALcode.

template: On Itanium processors, templates define the combinations of instructions that can be placed into a single bundle. For example, a bundle with the MFI template must have, in this order, a memory instruction (M), a floating-point instruction (F), and an integer instruction (I). Templates also define the location of stops. 24 templates are defined in the Itanium architecture, but there are only 12 combinations of letters. For each combination, there are two template versions: one with a stop at the end, and one without a stop. Some templates also contain stops within the bundle.

TLB: Translation lookaside buffer. An on-chip cache that stores recently used mappings between real and virtual memory. The Itanium architecture has separate TLBs for instructions and data.

USB: Universal Serial Bus. A connector for I/O devices such as keyboards, mouses, printers, cameras, external disks, and so forth.

VEST: A tool that translates VAX executable images to Alpha executable images. Also known as DECmigrate.

VHPT: Virtual hash page table.

VIAL: OpenVMS to Itanium abstraction layer. VIAL is the equivalent of the OpenVMS SRM console PALcode layer.

VLIW: Very long instruction word computing architecture. Exceedingly complex instructions reminiscent of the microcode engine found deep within many VAX processors. See also CISC, RISC, and EPIC.

VPN: Virtual page number.

VRN: Virtual region number.

WAW: Write-after-write violation. A type of data dependency between two instructions in one instruction group. Both of the instructions write to the same destination register.

WWID: Worldwide ID. A globally unique Fibre Channel device identifier analogous to an Ethernet station address.

