
HP OpenVMS Systems Documentation


OpenVMS Programming Concepts Manual



15.3.2 The Performance and Coverage Analyzer---PCA

The PCA allows you to detect and fix performance problems. Because unaligned data handling can significantly increase overhead, PCA can collect and present information on unaligned data exceptions. The PCA commands that collect and display unaligned data exceptions are:

  • SET UNALIGNED_DATA
  • PLOT/UNALIGNED_DATA PROGRAM BY LINE

PCA can also display the collected data either by the PC of the fault or by the virtual address of the unaligned data.

15.3.3 System Services (Alpha Only)

On Alpha systems, eight system services help you locate unaligned data. The first three establish temporary, image-level reporting; the next two provide process-permanent reporting; and the last three provide systemwide alignment fault tracking. The symbols used in calling all eight of these system services are defined by $AFRDEF in the OpenVMS Alpha MACRO-32 library, SYS$LIBRARY:STARLET.MLB. You can also call these system services from C by including <afrdef.h>.

The first three system services can be used together; they report on the currently executing image. They are as follows (a short example follows the list):

  • SYS$START_ALIGN_FAULT_REPORT. This service enables unaligned data exception reporting for the current image. You can use either a buffered or an exception method of reporting, but you can enable only one method at a time.
    • Buffered method. This method requires that the buffer address and size be specified. You use the SYS$GET_ALIGN_FAULT_DATA service to retrieve buffered alignment data under program control.
    • Exception method. This method requires no buffer. Unaligned data exceptions are signaled to the image, at which point a user-written condition handler takes whatever action is desired. If no user-written handler is set up, then an informational exception message is broadcast for each unaligned data trap, and the program continues to execute.
  • SYS$STOP_ALIGN_FAULT_REPORT. This service cancels unaligned data exception reporting for the current image if it was previously enabled. If you do not explicitly call this routine, reporting is disabled by the operating system's image rundown logic.
  • SYS$GET_ALIGN_FAULT_DATA. This service retrieves the accumulated, buffered alignment data when using the buffered collection method.
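
The following C program is a minimal sketch of the buffered method. It assumes the argument order report_method, report_buffer, buffer_length for SYS$START_ALIGN_FAULT_REPORT; the order buffer, buffer_size, return_size for SYS$GET_ALIGN_FAULT_DATA; and the constant name AFR$C_BUFFERED. Verify these assumptions against <afrdef.h> and the OpenVMS System Services Reference Manual before relying on them.

    #include <afrdef.h>      /* alignment fault reporting symbols (AFR$...) */
    #include <starlet.h>     /* sys$start_align_fault_report, and so on */
    #include <stdio.h>

    #define AFR_BUFSIZE 4096

    static char report_buffer[AFR_BUFSIZE];  /* OS accumulates fault records here */
    static char local_copy[AFR_BUFSIZE];     /* caller's buffer for the retrieved data */

    int main(void)
    {
        char raw[16];
        int *unaligned = (int *)(raw + 1);   /* deliberately misaligned pointer */
        int status, return_size = 0;

        /* Enable buffered reporting of unaligned data exceptions for the
           currently executing image (assumed constant AFR$C_BUFFERED). */
        status = sys$start_align_fault_report(AFR$C_BUFFERED, report_buffer,
                                              AFR_BUFSIZE);
        if (!(status & 1))
            return status;

        /* Cause an unaligned reference; the compiler may optimize this away,
           so treat it as illustration only. */
        *unaligned = 42;

        /* Retrieve the accumulated alignment fault data under program control. */
        status = sys$get_align_fault_data(local_copy, AFR_BUFSIZE, &return_size);
        if (status & 1)
            printf("Collected %d bytes of alignment fault data\n", return_size);

        /* Image rundown would disable reporting automatically; this is explicit. */
        sys$stop_align_fault_report();
        return 1;
    }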

You can use two of the eight system services to report unaligned data exceptions for the current process. The two services are as follows:

  • SYS$PERM_REPORT_ALIGN_FAULT. This service enables unaligned data exception reporting for the process. Once enabled, reporting remains in effect for the process until you explicitly disable it, and the SS$_ALIGN condition is signaled for every unaligned data exception while the process is active. By default, if no user-written exception handler handles the condition, an informational message is displayed for each unaligned data exception.
    This service provides a convenient way of running a number of images without modifying the code in each image, and also of recording the unaligned data exception behavior of each image.
  • SYS$PERM_DIS_ALIGN_FAULT_REPORT. This service disables unaligned data exception reporting for the process.
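
For example, the following sketch enables process-permanent reporting, performs an unaligned reference, and then disables reporting again. It assumes that both services take no arguments, as suggested by their descriptions above; check the prototypes in <starlet.h> on your system.

    #include <starlet.h>     /* sys$perm_report_align_fault, sys$perm_dis_align_fault_report */

    int main(void)
    {
        char raw[16];
        int *unaligned = (int *)(raw + 1);   /* deliberately misaligned pointer */
        int status;

        /* Enable reporting for the whole process; it stays in effect across
           image activations until explicitly disabled. */
        status = sys$perm_report_align_fault();
        if (!(status & 1))
            return status;

        /* Every unaligned reference from now on signals SS$_ALIGN; with no
           user-written condition handler, an informational message is displayed. */
        *unaligned = 42;

        /* Explicitly disable process-permanent reporting. */
        sys$perm_dis_align_fault_report();
        return 1;
    }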

The three system services that allow you to track systemwide alignment faults are as follows:

  • SYS$INIT_SYS_ALIGN_FAULT_REPORT. This service initializes systemwide alignment fault reporting.
  • SYS$STOP_SYS_ALIGN_FAULT_REPORT. This service disables systemwide alignment fault reporting.
  • SYS$GET_SYS_ALIGN_FAULT_DATA. This service obtains data from the system alignment fault buffer.

These services require CMKRNL privilege. They can report alignment faults for all modes and all addresses. You can also set up masks to report only certain types of alignment faults; for example, you can get reports on only kernel-mode faults, only faults at a user PC, or only faults on data in system space.
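
The following sketch outlines the systemwide flow. The argument lists shown here (a match table, a buffer size, and flags for SYS$INIT_SYS_ALIGN_FAULT_REPORT; a buffer, its size, and a return-size longword for SYS$GET_SYS_ALIGN_FAULT_DATA) are assumptions, as are the literal values passed; the caller must hold CMKRNL privilege, and the exact prototypes and AFR$ mask symbols should be taken from <afrdef.h> and the OpenVMS System Services Reference Manual.

    #include <afrdef.h>      /* AFR$ mask and flag symbols */
    #include <starlet.h>
    #include <stdio.h>

    #define SYS_AFR_BUFSIZE 8192

    static char sys_buffer[SYS_AFR_BUFSIZE];

    int main(void)           /* requires CMKRNL privilege */
    {
        int status, return_size = 0;

        /* Start systemwide tracking.  Passing no match table (0) is assumed
           here to mean "report all modes and all addresses"; the buffer size
           and flags values are illustrative. */
        status = sys$init_sys_align_fault_report(0, SYS_AFR_BUFSIZE, 0);
        if (!(status & 1))
            return status;

        /* ... let the system run for a while ... */

        /* Copy the accumulated fault records into a local buffer. */
        status = sys$get_sys_align_fault_data(sys_buffer, SYS_AFR_BUFSIZE,
                                              &return_size);
        if (status & 1)
            printf("Collected %d bytes of systemwide alignment fault data\n",
                   return_size);

        /* Turn systemwide tracking off again. */
        sys$stop_sys_align_fault_report();
        return 1;
    }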


Chapter 16
Memory Management with VLM Features

OpenVMS Alpha very large memory (VLM) features for memory management provide extended support for database, data warehouse, and other very large database (VLDB) products. The VLM features enable database products and data warehousing applications to realize increased capacity and performance gains.

By using the extended VLM features, application programs can create large in-memory global data caches that do not require an increase in process quotas. These large memory-resident global sections can be mapped with shared global pages to dramatically reduce the system overhead required to map large amounts of memory.

This chapter describes the following OpenVMS Alpha memory management VLM features:

  • Memory-resident global sections
  • Fast I/O and buffer objects for global sections
  • Shared page tables
  • Expandable global page table
  • Reserved memory registry

To see an example program that demonstrates many of these VLM features, refer to Appendix C.

16.1 Overview of VLM Features

Memory-resident global sections allow a database server to keep larger amounts of hot data cached in physical memory. The database server then accesses the data directly from physical memory without performing I/O read operations from the database files on disk. With faster access to the data in physical memory, runtime performance increases dramatically.

Fast I/O reduces CPU costs per I/O request, which increases the performance of database operations. Fast I/O requires data to be locked in memory through buffer objects. In prior versions of OpenVMS Alpha, buffer objects could be created only for process-private virtual address space. As of OpenVMS Alpha 7.2, buffer objects can be created for global pages, including pages in memory-resident sections.

Shared page tables allow that same database server to reduce the amount of physical memory consumed within the system. Because multiple server processes share the same physical page tables that map the large database cache, an OpenVMS Alpha system can support more server processes. This increases overall system capacity and decreases response time to client requests.

Shared page tables dramatically reduce the database server startup time because server processes can map memory-resident global sections hundreds of times faster than traditional global sections. With a multiple gigabyte global database cache, the server startup performance gains can be significant.

The system parameters GBLPAGES and GBLPAGFIL are dynamic parameters. Users with the CMKRNL privilege can now change these parameter values on a running system. Increasing the value of the GBLPAGES parameter allows the global page table to expand, on demand, up to the new maximum size.

The Reserved Memory Registry supports memory-resident global sections and shared page tables. Through its interface within the SYSMAN utility, the Reserved Memory Registry allows an OpenVMS system to be configured with large amounts of memory set aside for use within memory-resident sections or other privileged code. The Reserved Memory Registry also allows an OpenVMS system to be properly tuned through AUTOGEN, thus accounting for the pre-allocated reserved memory. For information about using the reserved memory registry, see the OpenVMS System Manager's Manual.

16.2 Memory-Resident Global Sections

Memory-resident global sections are non-file-backed global sections. This means that the pages within a memory-resident global section are not backed by the pagefile or by any other file on disk. Thus, no pagefile quota is charged to any process or charged to the system. When a process maps to a memory-resident global section and references the pages, working set list entries are not created for the pages. No working set quota is charged to the process.

Pages within a memory-resident global demand zero (DZRO) section initially have zero contents.

Creating a memory-resident global DZRO section is performed by calling either the SYS$CREATE_GDZRO system service or the SYS$CRMPSC_GDZRO_64 system service.

Mapping to a memory-resident global DZRO section is performed by calling either the SYS$CRMPSC_GDZRO_64 system service or the SYS$MGBLSC_64 system service.

To create a memory-resident global section, the process must have been granted the VMS$MEM_RESIDENT_USER rights identifier. Mapping to a memory-resident global section does not require this rights identifier.
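
The following fragment is a minimal sketch of creating such a section with SYS$CREATE_GDZRO. The argument order shown (section name, ident, protection, length, access mode, flags) and the zero values passed for ident, protection, and flags are assumptions chosen for illustration; the section name and size are hypothetical, and the caller must hold the VMS$MEM_RESIDENT_USER rights identifier. Check the prototype in <starlet.h> and the flag definitions in <secdef.h> before use.

    #include <starlet.h>
    #include <descrip.h>     /* $DESCRIPTOR */
    #include <psldef.h>      /* PSL$C_USER */
    #include <stdio.h>

    int main(void)
    {
        $DESCRIPTOR(gs_name, "VLM_DEMO_CACHE");      /* hypothetical section name */
        unsigned __int64 length = 64 * 1024 * 1024;  /* 64 MB; a multiple of the page size */
        int status;

        /* Create a memory-resident global DZRO section.  With the fault
           option, physical pages are allocated only as they are referenced.
           The ident, protection, and flags arguments are placeholders. */
        status = sys$create_gdzro(&gs_name, 0, 0, length, PSL$C_USER, 0);
        if (!(status & 1))
            printf("SYS$CREATE_GDZRO failed, status = %08X\n", status);
        return status;
    }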

Two options are available when creating a memory-resident global DZRO section:

  • Fault option: allocate pages only when virtual addresses are referenced.
  • Allocate option: allocate all pages when the section is created.

Fault option

To use the fault option, it is recommended, but not required, that the pages within the memory-resident global section be deducted from the system's fluid page count through the Reserved Memory Registry.

Using the Reserved Memory Registry ensures that AUTOGEN tunes the system properly to exclude memory-resident global section pages in its calculation of the system's fluid page count. AUTOGEN sizes the system pagefile, number of processes, and working set maximum size based on the system's fluid page count.

If the memory-resident global section has not been registered through the Reserved Memory Registry, the system service call fails if there are not enough fluid pages left in the system to accommodate the memory-resident global section.

If the memory-resident global section has been registered through the Reserved Memory Registry, the system service call fails if the size of the global section exceeds the size of reserved memory and there are not enough fluid pages left in the system to accommodate the additional pages.

If memory has been reserved using the Reserved Memory Registry, that memory must be used for the global section named in the SYSMAN command. To return the memory to the system, SYSMAN can be run to free the reserved memory, thus returning the pages back into the system's count of fluid pages.

If the name of the memory-resident global section is not known at boot time, or if a large amount of memory is to be configured out of the system's pool of fluid memory, entries in the Reserved Memory Registry can be added and the system can be retuned with AUTOGEN. After the system re-boots, the reserved memory can be freed for use by any application in the system with the VMS$MEM_RESIDENT_USER rights identifier. This technique increases the availability of fluid memory for use within memory-resident global sections without committing to which applications or named global sections will receive the reserved memory.

Allocate option

To use the allocate option, the memory must be pre-allocated during system initialization in an attempt to ensure that contiguous, aligned physical pages are available. OpenVMS attempts to use granularity hints, so that in many or even most cases the pre-allocated pages of a memory-resident section are physically contiguous. However, on systems that support resource affinity domains (RADs), for example, OpenVMS intentionally tries to "stripe" memory across all RADs unless told to use only a single RAD. Granularity hints can be used when mapping to the memory-resident global section if the virtual alignment of the mapping falls on an even 8-page, 64-page, or 512-page boundary. (With a system page size of 8 KB, these granularity-hint virtual alignments correspond to 64-KB, 512-KB, and 4-MB boundaries.) The maximum granularity hint on Alpha covers 512 pages; with 8-KB pages, this is 4 MB. If the section you request is at or below this limit, there is an excellent chance that it will be physically contiguous whenever contiguous memory is available; currently, however, contiguousness is not guaranteed to application software. OpenVMS chooses a virtual alignment that is optimal for granularity hints if the flag SEC$M_EXPREG is set on the call to one of the mapping system services, such as SYS$MGBLSC.
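
As an illustration, a mapping call that lets the system choose a granularity-hint-friendly virtual alignment might look like the sketch below. The SYS$MGBLSC_64 argument order shown (name, ident, region, section offset, length, access mode, flags, return address, return length), the VA$C_P2 region constant, and the use of a zero length to map the entire section are all assumptions; verify them against <starlet.h>, <vadef.h>, and the OpenVMS System Services Reference Manual.

    #include <starlet.h>
    #include <descrip.h>     /* $DESCRIPTOR */
    #include <secdef.h>      /* SEC$M_EXPREG */
    #include <psldef.h>      /* PSL$C_USER */
    #include <vadef.h>       /* VA$C_P2 (assumed region constant) */
    #include <gen64def.h>    /* struct _generic_64 */
    #include <stdio.h>

    int main(void)
    {
        $DESCRIPTOR(gs_name, "VLM_DEMO_CACHE");  /* hypothetical section name */
        struct _generic_64 region_id;
        void *va = 0;
        unsigned __int64 mapped_length = 0;
        int status;

        region_id.gen64$q_quadword = VA$C_P2;    /* map into 64-bit P2 space */

        /* SEC$M_EXPREG lets OpenVMS choose the virtual address, so it can
           pick an alignment that allows granularity hints to be used. */
        status = sys$mgblsc_64(&gs_name, 0, &region_id, 0, 0 /* whole section (assumed) */,
                               PSL$C_USER, SEC$M_EXPREG, &va, &mapped_length);
        if (status & 1)
            printf("Section mapped at %p\n", va);
        return status;
    }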

Sufficiently contiguous, aligned PFNs are reserved using the Reserved Memory Registry. These pages are allocated during system initialization, based on the description of the reserved memory. The memory-resident global section size must be less than or equal to the size of the reserved memory; otherwise, an error is returned from the system service call.

If memory has been reserved using the Reserved Memory Registry, that memory must be used for the global section named in the SYSMAN command. To return the memory to the system, SYSMAN can be run to free the pre-reserved memory. Once the pre-reserved memory has been freed, the allocate option can no longer be used to create the memory-resident global section.

16.3 Fast I/O and Buffer Objects for Global Sections

As of OpenVMS Alpha 7.2, VLM applications can use Fast I/O for memory shared by processes through global sections. In prior versions of OpenVMS Alpha, buffer objects could be created only for process-private virtual address space. Fast I/O requires data to be locked in memory through buffer objects. Database applications where multiple processes share a large cache can now create buffer objects for the following types of global sections:

  • Page file-backed global sections
  • Disk file-backed global sections
  • Memory-resident global sections

Buffer objects enable Fast I/O system services, which can be used to read and write very large amounts of shared data to and from I/O devices at an increased rate. By reducing the CPU cost per I/O request, Fast I/O increases performance for I/O operations.

Fast I/O improves the ability of VLM applications, such as database servers, to handle larger capacities and higher data throughput rates.

16.3.1 Comparison of $QIO and Fast I/O

The $QIO system service must ensure that a specified memory range exists and is accessible for the duration of each direct I/O request. Validating that the buffer exists and is accessible is done in an operation called probing. Making sure that the buffer cannot be deleted and that the access protection is not changed while the I/O is still active is achieved by locking the memory pages for I/O and by unlocking them at I/O completion.

The probing and locking/unlocking of pages for I/O are costly operations. Having to do this work for each I/O can consume a significant percentage of CPU capacity. The advantage of the $QIO method is that memory is locked only for the duration of a single I/O and can otherwise be paged.

Fast I/O must still ensure that the buffer is available, but if many I/O requests are performed from the same memory cache, performance can increase if the cache is probed and locked only once---instead of for each I/O. OpenVMS must then ensure only that the memory access is unchanged between multiple I/Os. Fast I/O uses buffer objects to achieve this goal. Fast I/O gains additional performance advantages by pre-allocating some system resources and by streamlining the I/O flow in general.
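
In outline, an application sets up once, performs many I/Os, and cleans up once. The sketch below shows only that shape: the Fast I/O services $IO_SETUP, $IO_PERFORM, and $IO_CLEANUP are the documented entry points, but the argument lists, types, and the reuse of one buffer object for both the data buffer and the IOSA are simplifying assumptions, so consult the OpenVMS I/O User's Reference Manual for the real prototypes.

    #include <starlet.h>
    #include <iodef.h>       /* IO$_READVBLK */
    #include <gen64def.h>    /* struct _generic_64 */

    /* Assumed to exist already:
       buffer_handle - from SYS$CREATE_BUFOBJ_64 (see Section 16.3.4)
       chan          - channel assigned to the disk with SYS$ASSIGN
       iosa, data    - I/O status area and data buffer inside that buffer object */

    int read_many(struct _generic_64 *buffer_handle, unsigned short chan,
                  void *iosa, void *data, int datalen, int count)
    {
        unsigned __int64 fandle;   /* identifies the set-up operation */
        int status, i;

        /* One-time setup: probe and lock happen here, not per I/O. */
        status = sys$io_setup(IO$_READVBLK, buffer_handle, buffer_handle,
                              0 /* no AST */, 0 /* flags */, &fandle);
        if (!(status & 1))
            return status;

        /* Many lightweight I/O requests against the pre-validated buffers. */
        for (i = 0; i < count; i++) {
            status = sys$io_perform(fandle, chan, iosa, data, datalen,
                                    i + 1 /* starting virtual block, illustrative */);
            if (!(status & 1))
                break;
        }

        /* One-time teardown releases the resources reserved by $IO_SETUP. */
        sys$io_cleanup(fandle);
        return status;
    }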

16.3.2 Overview of Locking Buffers

Before the I/O subsystem can move any data into a user buffer, either by moving data from system space in a buffered I/O, or by allowing a direct I/O operation, it must ensure that the user buffer actually exists and is accessible.

For buffered I/O, this is usually achieved by assuming the context of the process requesting the I/O and probing the target buffer. For most QIO requests, this happens at IPL 2 (IPL$_ASTDEL), which ensures that no AST can execute between the buffer probing and the moving of the data. The buffer is not deleted until the whole operation has completed. IPL 2 also allows the normal paging mechanisms to work while the data is copied.

For direct I/O, this is usually achieved by locking the target pages for I/O. This makes the pages that make up the buffer ineligible for paging or swapping. From there on the I/O subsystem identifies the buffer by the page frame numbers, the byte offset within the first page, and the length of the I/O request.

This method allows for maximum flexibility because the process can continue to page and can even be swapped out of the balance set while the I/O is still outstanding or active. No pages need to be locked for buffered I/O, and for direct I/O, most of the process pages can still be paged or swapped. However, this flexibility comes at a price: all pages involved in an I/O must be probed or locked and unlocked for every single I/O. For applications with high I/O rates, the operating system can spend a significant amount of time on these operations.

Buffer objects can help avoid much of this overhead.

16.3.3 Overview of Buffer Objects

A buffer object is a process entity that is associated with a virtual address range within a process. When the buffer object is created, all pages in this address range are locked in memory. These pages cannot be freed until the buffer object has been deleted. The Fast I/O environment uses this feature by locking the buffer object itself during $IO_SETUP. This prevents the buffer object and its associated pages from being deleted. The buffer object is unlocked during $IO_CLEANUP. This replaces the costly probe, lock, and unlock operations with simple checks validating that the I/O buffer does not exceed the buffer object. The trade-off is that the pages associated with the buffer object are permanently locked in memory. An application may need more physical memory but it can then execute faster.

To control this type of access to the system's memory, a user must hold the VMS$BUFFER_OBJECT_USER identifier, and the system allows only a certain number of pages to be locked for use in buffer objects. This number is controlled by the dynamic SYSGEN parameter MAXBOBMEM.

A second buffer object property allows Fast I/O to perform several I/O-related tasks entirely from system context at high IPL, without having to assume process context. When a buffer object is created, the system maps by default a section of system space (S2) to process pages associated with the buffer object. This system space window is protected to allow read and write access only from kernel mode. Because all of system space is equally accessible from within any context, it is now possible to avoid the still expensive context switch to assume the original user's process context.

The convenience of having system space access to buffer object pages comes at a price. For example, even though S2 space usually measures several gigabytes, this may still be insufficient when many processes must share several gigabytes of database cache for Fast I/O. In such an environment, all or most I/O to or from the cache buffers is direct I/O, and the system space mapping is not needed.

As of OpenVMS Version 7.2, buffer objects can be created with or without an associated system space window. Resources used by buffer objects are charged as follows:

  • Physical pages are charged against MAXBOBMEM unless the page belongs to a memory-resident section, or the page is already associated with another buffer object.
  • By default, system space window pages are charged against MAXBOBS2. They are charged against MAXBOBS0S1 if CBO$_SVA_32 is specified.
  • If CBO$_NOSVA is set, no system space window is created, and only MAXBOBMEM is charged as appropriate.

For more information about using Fast I/O features, see the OpenVMS I/O User's Reference Manual.

16.3.4 Creating and Using Buffer Objects

When creating and using buffer objects, you must be aware of the following (a short creation sketch follows the list):

  • Buffer objects can be associated only with process space (P0, P1, or P2) pages.
  • PFN-mapped pages cannot be associated with buffer objects.
  • The special type of buffer object without associated system space can be used only to describe Fast I/O data buffers. The IOSA must always be associated with a full buffer object with system space.
  • Some Fast I/O operations are not fully optimized if the data buffer is associated with a buffer object without system space. Copying of data at the completion of buffered I/O or disk-read I/O through the VIOC cache may happen at IPL 8 in system context for full buffer objects. However, it must happen in process context for buffer objects without system space. If your application implements its own caching, Compaq recommends bypassing the VIOC for disk I/O by setting the IO$M_NOVCACHE function code modifier. Fast I/O recognizes this condition and uses the maximum level of optimization regardless of the type of buffer object.
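
The fragment below is a sketch of creating and deleting a buffer object for a process-space cache. It assumes the SYS$CREATE_BUFOBJ_64 argument order start address, length, access mode, flags, return address, return length, buffer handle, and it passes flags of 0 to request the default system space window (CBO$_NOSVA or CBO$_SVA_32 would select the alternatives described above). The caller must hold the VMS$BUFFER_OBJECT_USER identifier; verify the prototypes in <starlet.h> before use.

    #include <starlet.h>
    #include <psldef.h>      /* PSL$C_USER */
    #include <gen64def.h>    /* struct _generic_64 */
    #include <stdio.h>

    #define CACHE_BYTES (4 * 1024 * 1024)

    static char cache[CACHE_BYTES];      /* process-space (P0) data cache */

    int main(void)
    {
        struct _generic_64 buffer_handle;
        void *ret_va = 0;
        unsigned __int64 ret_len = 0;
        int status;

        /* Lock the cache pages in memory and obtain a buffer object handle.
           Flags 0 requests the default behavior (a system space window in S2). */
        status = sys$create_bufobj_64(cache, CACHE_BYTES, PSL$C_USER, 0,
                                      &ret_va, &ret_len, &buffer_handle);
        if (!(status & 1)) {
            printf("SYS$CREATE_BUFOBJ_64 failed, status = %08X\n", status);
            return status;
        }

        /* ... issue Fast I/O against the cache using buffer_handle ... */

        /* Unlock the pages when the cache is no longer needed. */
        sys$delete_bufobj(&buffer_handle);
        return 1;
    }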

16.4 Shared Page Tables

Shared page tables enable two or more processes to map to the same physical pages without each process incurring the overhead of page table construction, page file accounting, and working set quota accounting. Internally, shared page tables are treated as a special type of global section and are specifically used to map pages that are part of a memory-resident global section. The special global section that provides page table sharing is called a shared page table section. Shared page table sections themselves are memory resident.

Shared page tables are created and propagated to multiple processes by a cooperating set of system services. No special privileges or rights identifiers are required for a process or application to use shared page tables. The VMS$MEM_RESIDENT_USER rights identifier is required only to create a memory-resident global section. Processes that do not have this identifier can benefit from shared page tables (as long as certain mapping criteria prevail).

Similar to memory reserved for memory-resident global sections, memory for shared page tables must be deducted from the system's set of fluid pages. The Reserved Memory Registry allows for this deduction when a memory-resident global section is registered.

16.4.1 Memory Requirements for Private Page Tables

Table 16-1 highlights the physical memory requirements for private page tables and shared page tables that map to various sizes of global sections by various numbers of processes. This table illustrates the amount of physical memory saved systemwide through the use of shared page tables. For example, when 100 processes map to a 1 GB global section, 99 MB of physical memory are saved by mapping to the global section with shared page tables.

Overall system performance benefits from this physical memory savings because of the reduced contention for the physical memory system resource. Individual processes benefit from the reduction of working set usage by page table pages, thus allowing a process to keep more private code and data in physical memory.

Table 16-1 Page Table Size Requirements
Number of                        Size of Global Section
Mapping          8MB             1GB             8GB             1TB
Processes     PPT     SHPT    PPT     SHPT    PPT     SHPT    PPT     SHPT
1             8KB     8KB     1MB     1MB     8MB     8MB     1GB     1GB
10            80KB    8KB     10MB    1MB     80MB    8MB     10GB    1GB
100           800KB   8KB     100MB   1MB     800MB   8MB     100GB   1GB
1000          8MB     8KB     1GB     1MB     8GB     8MB     1TB     1GB

Key
PPT = Private Page Tables
SHPT = Shared Page Tables

