HP OpenVMS Systems Documentation

Guide to OpenVMS File Applications

3.2.1.3.2 Establishing Auto Extend Default Quantities

This section describes how to establish the auto extend default quantities for an RMS file.

Area and File Default Extension Quantities

You can establish a file-specific default, called the file default extension quantity, for all file organizations. In the case of an indexed file with multiple areas, you can also establish a separate area default extension quantity for each area of the indexed file.

The following list describes the methods for establishing file default extension quantities, and, where applicable, area default extension quantities:

- The recommended method is to use the Edit/FDL utility to permanently establish file and area default extension quantities when you create or convert a file. The Edit/FDL utility calculates these quantities using your responses to the script questions, and it assigns the file default extension quantity using the FILE EXTENSION attribute. For indexed files with multiple areas, the Edit/FDL utility assigns a default extension quantity to each area using the AREA EXTENSION attribute. A subsequent $CREATE service or use of the CONVERT utility using an FDL with these EXTENSION attributes permanently sets these defaults. For a description of how the Edit/FDL utility calculates default extension quantities, see Appendix A.
- One alternative to using the Edit/FDL utility is to permanently establish the file and area default extension quantities by specifying the appropriate values in the FAB (or XABALL) used as input to the $CREATE service.
  The FAB$W_DEQ field defines the file default extension quantity. For indexed files with multiple areas, individual area XABALLs (with the XAB$B_AID field and the related XAB$W_DEQ field set appropriately) establish area default extension quantities.
  If you use this method, you can determine the default extension quantities using file and area-specific information such as the average record size, the number of records to be added to the file during a given period of time, the number of records per bucket, and the bucket size.
  When both a FAB and a XABALL are present on the opening or creation of an RMS file, the XABALL fields override equivalent FAB fields. If the XABALL is present, then the file default extension quantity is set from the XAB$W_DEQ, overriding any value in the FAB$W_DEQ field. In the case of an indexed file with multiple areas where multiple XABALLs might exist, the file default extension quantity is set to the default extension quantity for Area 0.
  A single allocation XAB (XABALL) can also be specified on the creation of a relative or sequential file. However, there is no separate area default extension quantity maintained for these files. The XABALL is used in this case to establish the file default extension quantity.
- After a file has been created, specifying the file default extension quantities (FAB$W_DEQ) on input to a $OPEN establishes a temporary file default extension quantity that overrides any permanent setting that might exist. This temporary default is used when you access the file until the file is closed.
  Note that the area default extension quantities for an indexed file specified on a $CREATE cannot be changed over the lifetime of the file nor can they be overridden at run time.
- Once a file has been created, you can change or establish the permanent file default extension quantity by using the DCL command SET FILE/EXTENSION=n, where n is the extension quantity in disk blocks for the file. The next time the file is opened, it uses the new default quantity.
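
The sizing inputs mentioned for the FAB/XABALL method (records to be added during a given period, records per bucket, bucket size) can be combined into a rough estimate. The Python sketch below is illustrative only: the function name, the round-up-to-whole-buckets rule, and the 65535 cap (assuming the 16-bit word implied by the $W_ field naming) are assumptions, not the calculation that the Edit/FDL utility performs.

```python
import math

# Assumed upper bound for a 16-bit word field such as FAB$W_DEQ.
MAX_WORD_VALUE = 65535

def estimate_extension_blocks(records_added, records_per_bucket, bucket_size_blocks):
    """Rough default extension quantity in disk blocks (hypothetical helper).

    records_added      -- records expected to be added during the period
    records_per_bucket -- records that fit in one bucket
    bucket_size_blocks -- bucket size in 512-byte disk blocks
    """
    if records_added < 0 or records_per_bucket <= 0 or bucket_size_blocks <= 0:
        raise ValueError("inputs must be positive (records_added may be zero)")
    # Round up to whole buckets, then convert buckets to blocks.
    buckets_needed = math.ceil(records_added / records_per_bucket)
    return min(buckets_needed * bucket_size_blocks, MAX_WORD_VALUE)

# 1000 new records, 12 records per bucket, 4-block buckets.
print(estimate_extension_blocks(1000, 12, 4))  # 84 buckets * 4 blocks = 336
```

A value obtained this way could then be supplied as n in SET FILE/EXTENSION=n or stored in FAB$W_DEQ before the $CREATE.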

In addition to the file and area default extension quantities, there are process, system, and volume default extension quantities.

Process Default Extension Quantity

The process default extension quantity is established by the DCL command SET RMS_DEFAULT/EXTEND_QUANTITY=n, where n is the extension quantity. This default applies only to the process issuing this DCL command and remains in effect only until the process is deleted.

System Default Extension Quantity

The system default extension quantity is established by the SET RMS_DEFAULT/SYSTEM/EXTEND_QUANTITY=n command. Note that you need the CMKRNL privilege to use the /SYSTEM qualifier. This default applies to all processes on a node in the cluster. When you use this DCL command to establish the system default extension quantity, it remains in effect until the node is rebooted.

You can also establish the system default extension quantity in a temporary or permanent fashion by appropriately setting the SYSGEN system parameter RMS_EXTEND_SIZE.

Volume Default Extension Quantity

The volume default extension quantity can be permanently established when the volume is initialized with the INITIALIZE/EXTENSION=n command. This default quantity is used whenever the volume is mounted. To permanently change the volume default extension quantity, you can use the SET VOLUME/EXTENSION=n command on a mounted disk. To temporarily establish a volume default extension quantity or temporarily override the permanent volume default extension quantity, use the MOUNT/EXTENSION=n command. The new default is in effect until the volume is dismounted. Unlike the other default quantities described here, which default to zero if not specified, the volume default extension quantity defaults to 5.

3.2.1.3.3 Placement and Contiguity of Extends

In addition to specifying the size of an extend, you can specify other characteristics that affect the placement and contiguity of the extend.

When an application extends a file by calling the $EXTEND service, an Allocation XAB (XABALL) can be used to place an extend on a particular disk block or disk cylinder. If no allocation XAB is present on the $EXTEND and the FAB contiguity options (described later in this section) are not selected, RMS automatically places the extend near the last allocated disk block in the file. If the file being extended in this fashion is an indexed file opened for record I/O access, RMS adds the new disk space as near as possible to the last allocated disk block in the area being extended. This technique groups disk blocks belonging to the same area of the indexed file.

When RMS automatically extends a file, the application cannot control placement; however, RMS uses placement controls in one of the following ways, depending on how the file is organized:

- When automatically extending an indexed file, RMS uses placement control to allocate the new disk space as close as possible to the last allocated disk block of the indexed file area being extended.
- When automatically extending a relative file, RMS uses placement control to allocate the new disk space as close as possible to the last allocated disk block of the file.
- No placement control is used when RMS automatically extends a sequential file or any file organization accessed for block I/O.

An extend is considered contiguous if all the disk blocks of the extend are physically adjacent on the disk. There are two types of contiguous extend requests that can be made. The first, called a contiguous request, returns an error if contiguous disk blocks cannot be found to satisfy the request. The second, called a contiguous best try request, attempts to find contiguous disk blocks for the request. If it does not find sufficient contiguous space, it extends the file and does not return an error. The contiguity options can be input to an $EXTEND service in the FAB (FAB$V_CBT, FAB$V_CTG) or in the Allocation XAB (XAB$V_CBT, XAB$V_CTG). The Allocation XAB settings override any FAB settings.
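
The difference between the two request types can be illustrated with a toy allocator. This Python sketch is a simplified model, not the file system's allocation algorithm: the free-space list, the function name, and the largest-fragment-first fallback are all assumptions made for the example.

```python
def allocate(free_extents, blocks_wanted, mode):
    """Toy model of a contiguous (CTG) or contiguous best try (CBT) extend.

    free_extents  -- list of (start_block, length) free runs on the disk
    blocks_wanted -- size of the extend in blocks
    mode          -- "CTG" or "CBT"
    Returns the list of (start_block, length) pieces that make up the extend.
    """
    # Both modes first look for a single free run that is large enough.
    for start, length in free_extents:
        if length >= blocks_wanted:
            return [(start, blocks_wanted)]
    if mode == "CTG":
        # A contiguous request returns an error rather than fragmenting.
        raise RuntimeError("no contiguous space for request")
    # Best try: satisfy the request from several runs, largest first (assumed policy).
    pieces, remaining = [], blocks_wanted
    for start, length in sorted(free_extents, key=lambda e: e[1], reverse=True):
        if remaining == 0:
            break
        take = min(length, remaining)
        pieces.append((start, take))
        remaining -= take
    if remaining:
        raise RuntimeError("not enough free space on the volume")
    return pieces

free = [(100, 10), (200, 30), (400, 20)]
print(allocate(free, 25, "CTG"))  # [(200, 25)]
print(allocate(free, 45, "CBT"))  # [(200, 30), (400, 15)]
```

With the same free list, a 45-block request in "CTG" mode raises an error, which mirrors the failure a contiguous request reports on a fragmented disk.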

When RMS automatically extends a file, the application can only indirectly control contiguity by setting the FAB or XABALL contiguity bits on input to the $CREATE service. Once set on file creation, these options are available for subsequent extends done automatically by RMS.

Note that setting the FAB$V_CTG bit could cause an extend to fail on a sufficiently fragmented disk. Note, too, that the FAB$V_CBT option is disabled after several failures to allocate contiguous disk space, to avoid the expensive overhead of contiguous best try processing on a badly fragmented disk.

3.2.1.4 Truncating a File

Only RMS sequential disk files that have been opened for write access (FAB$V_PUT, FAB$V_UPD, FAB$V_DEL or FAB$V_TRN) can be truncated. This applies to unshared and shared sequential files.

Two types of truncation can occur on RMS sequential files: RMS truncation and Ancillary Control Procedure (ACP) truncation.

RMS truncation involves resetting the end-of-file (EOF) pointer back to a previous position (possibly the beginning) of a sequential file to reuse the allocated space in a file. RMS truncation is described in the OpenVMS Record Management Services Reference Manual under the $TRUNCATE service.

ACP truncation occurs when RMS closes a sequential file and requests that the ACP deallocate all disk blocks allocated beyond the EOF of the file. The primary use of ACP truncation is to conserve disk space. The remainder of this section deals with ACP truncation.

You can also use ACP truncation in conjunction with large extend sizes to reduce disk fragmentation. If a file is growing slowly over time, the application can allocate the largest possible extend, and when finished, it can use ACP truncation to deallocate any unused space at the end of the sequential file. However, if a sequential file is continually growing, excessive ACP truncation can lead to an increase in disk fragmentation resulting in more CPU and I/O overhead.

ACP truncation can be requested directly by way of the programming interface by setting the FAB$V_TEF bit on input to the $OPEN, $CREATE, or $CLOSE service. The ACP truncation occurs on the close of the sequential file. Note that ACP truncation can occur on shared as well as unshared sequential files. If there are shared readers of the file, ACP truncation is postponed until the last reader of the file closes the file. If there are other writers of a shared sequential file, then ACP truncation requests are ignored. However, the ACP truncation request of the last writer to close the file will be honored.

ACP truncation of a sequential file can be automatically requested by RMS if an auto extend has been done during this file access and no file default extend quantity exists to be used for the auto extend. Using ACP truncation in this instance avoids wasting space when auto extending with a less precise extend quantity default, such as the system default extend quantity.

3.2.1.5 Units of I/O

Another file design consideration is to reduce the number of times that RMS must transfer data from disk to memory by making the I/O units as large as possible. Each time RMS fetches data from the disk, it stores the data in an I/O memory buffer whose capacity is equal to the size of one I/O unit. A larger I/O unit makes more records immediately accessible to your program from the I/O buffers.

In general, using larger units of I/O for disk transfers improves performance, as long as the data does not have to be swapped out too frequently. However, as the total space used for I/O buffers in the system approaches a point that results in excessive paging and swapping, increasing I/O unit size degrades system performance.

RMS performs I/O operations using one of the following I/O unit types:

- Blocks
- Multiblocks
- Buckets

A block is the basic unit of disk I/O, and it consists of 512 contiguous bytes. Even if your program requests less than a block of data, RMS automatically transfers an entire block.

The other I/O units (multiblocks and buckets) are made up of block multiples. A multiblock is an I/O unit that includes up to 127 blocks but whose use is restricted to sequential files. See Section 3.3.2 for details. A bucket is the I/O unit for relative and indexed files and it may include up to 63 blocks. See Section 3.4 and Section 3.5 for details.
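
The block arithmetic implied by these limits is easy to check. The Python sketch below uses only the figures given in this section (512-byte blocks, up to 127 blocks per multiblock, up to 63 blocks per bucket); the constant and function names are hypothetical.

```python
BLOCK_SIZE_BYTES = 512        # basic unit of disk I/O
MAX_MULTIBLOCK_BLOCKS = 127   # sequential files only
MAX_BUCKET_BLOCKS = 63        # relative and indexed files

def blocks_for_bytes(byte_count):
    """Whole blocks transferred for a request of byte_count bytes."""
    # Even a request smaller than a block moves one entire block.
    return -(-byte_count // BLOCK_SIZE_BYTES)  # ceiling division

def io_unit_bytes(blocks):
    """Size in bytes of an I/O unit (and of its buffer) of the given block count."""
    return blocks * BLOCK_SIZE_BYTES

print(blocks_for_bytes(100))                 # 1
print(blocks_for_bytes(513))                 # 2
print(io_unit_bytes(MAX_MULTIBLOCK_BLOCKS))  # 65024
print(io_unit_bytes(MAX_BUCKET_BLOCKS))      # 32256
```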

3.2.1.6 Multiple Areas for Indexed Files

For indexed files, another design strategy is to separate the file into multiple areas. Each area has its own extension size, initial allocation size, contiguity options, and bucket size. You can minimize access times by precisely positioning each area on a specific volume, cylinder, or block.

For instance, you can place the data area on one volume of a volume set and place the index area on another volume. If your application is I/O bound, this strategy could increase its throughput. You can also derive data bucket contiguity by allocating extra space for future extensions of the data area.

3.2.1.7 Bucket Fill Factor for Indexed Files

Any attempt to insert a record into a filled bucket results in a bucket split. When a bucket split occurs, RMS tries to keep half of the records (including the new record, if applicable) in the original bucket and moves the remaining records to a newly created bucket.

Excessive bucket splitting can degrade system performance through wasted space, excessive processing overhead, and file fragmentation. For example, each record that moves to a new bucket must maintain a forward pointer in the original bucket. The forward pointer indicates the record's new location. At the new bucket, the record must maintain a backward pointer to its original bucket. RMS uses the backward pointer to update the forward pointer in the original bucket if the record later moves to yet another bucket.

When a program attempts to access a record either by alternate key or by RFA, it must first go to the bucket where the record originally resided, read the pointer to the record's current bucket residence, and then access the record.

To avoid bucket splits, you should permit buckets to be only partially filled when records are initially loaded. This provides your application with space to make additional random inserts without overfilling the affected bucket.
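
The effect of a partial initial load on later random inserts can be seen in a small simulation. This Python sketch is a toy model under stated assumptions: bucket capacity is counted in records rather than bytes, a split divides the records at the midpoint, and all function names are hypothetical. It does not reproduce RMS bucket internals or pointer maintenance.

```python
import bisect

def load(keys, capacity, fill_percent):
    """Initial load: place sorted keys into buckets filled to fill_percent."""
    per_bucket = max(1, capacity * fill_percent // 100)
    keys = sorted(keys)
    return [keys[i:i + per_bucket] for i in range(0, len(keys), per_bucket)]

def insert(buckets, key, capacity):
    """Insert key into the bucket covering it; return True if a split occurred."""
    if not buckets:
        buckets.append([key])
        return False
    # Pick the first bucket whose highest key is >= key, else the last bucket.
    highs = [bucket[-1] for bucket in buckets]
    idx = min(bisect.bisect_left(highs, key), len(buckets) - 1)
    bucket = buckets[idx]
    bisect.insort(bucket, key)
    if len(bucket) <= capacity:
        return False
    # Overfull: keep half of the records, move the rest to a new bucket.
    mid = len(bucket) // 2
    buckets[idx:idx + 1] = [bucket[:mid], bucket[mid:]]
    return True

def count_splits(initial_keys, new_keys, capacity, fill_percent):
    """Number of bucket splits caused by inserting new_keys after the load."""
    buckets = load(initial_keys, capacity, fill_percent)
    return sum(insert(buckets, key, capacity) for key in new_keys)

initial = range(0, 80, 10)       # eight records loaded in key order
inserts = [5, 25, 45, 65]        # later random inserts
print(count_splits(initial, inserts, capacity=4, fill_percent=100))  # 2
print(count_splits(initial, inserts, capacity=4, fill_percent=50))   # 0
```

In this model the fully loaded file splits twice for four inserts, while the half-filled file absorbs the same inserts without splitting, at the cost of twice as many buckets after the initial load.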

Section 3.5.2.2 describes fill factors in more detail.

3.2.2 Processing Options

Five processing options can be used to improve I/O operations: two file-processing options and three record-processing options. The file-processing options include the deferred-write option and the global buffer option. The global buffer option may be used with all three file organizations, but the deferred-write option is restricted to use with relative and indexed files.

The record-processing options include the multiple buffer option, the read-ahead option and the write-behind option. The multiple buffer option may be used with all three file organizations, but the read-ahead option and the write-behind option may be used only with sequential files.

This section summarizes the options. Section 3.3 through Section 3.5 describe the options in the context of tuning files. For additional information about buffering, see Chapter 7.

3.2.2.1 Multiple Buffers

When a program accesses a data file, it transfers the file from disk into memory using I/O units of blocks, multiblocks, or buckets. The I/O units are subsequently placed in memory I/O buffers sized to be compatible with the I/O units.

If you do not have enough buffers, excessive I/O transfers may degrade the performance of your application. On the other hand, if you have too many buffers, performance may degrade because of an overly large working set. As a rule, however, increasing the size and number of buffers can improve performance if the data read into the buffers will soon be processed and if your working set can efficiently maintain the buffers. In fact, changing the size and number of buffers is the quickest way to improve the performance of your application when you are processing localized data.

The optimum number of buffers depends on the organization and use of your data files. The recommended way to determine the optimum number of buffers for your application is to use the Edit/FDL utility.

The number of I/O buffers is a run-time parameter you set with the FDL editor by adding the CONNECT secondary attribute MULTIBUFFER_COUNT to the definition file. (See Chapter 9.) With a low-level language, you can set the value directly into the RAB$B_MBF field of the record access block, or you can set the count using the XAB$_MULTIBUFFER_COUNT XABITM if you want to specify more than 127 buffers.

Alternatively, the number of buffers may be specified for a process using the DCL command SET RMS_DEFAULT/BUFFER_COUNT=n, where the variable n represents the desired number of buffers. With this command, you may set distinct values for your sequential, relative, and indexed files using the appropriate file organization qualifier. If you omit the file organization qualifier, sequential organization is assumed. To examine the current settings for the process and system default multibuffer count, use the DCL command SHOW RMS_DEFAULT. If none of the above methods is used, RMS uses the system-wide default value established by the system manager. If the system-wide default is either omitted or is set to 0, RMS uses a value of 1 for sequential and relative files and a value of 2 for indexed files.
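
The fallback chain described in this paragraph can be summarized as a short lookup. The Python sketch below is a model of the text, not RMS code: the function and parameter names are hypothetical, and it assumes that a value of 0 means "not specified" at every level and that a count set by the program takes priority over the process default.

```python
# Values RMS uses when no default is set anywhere, as stated in this section.
FALLBACK_COUNT = {"sequential": 1, "relative": 1, "indexed": 2}

def effective_buffer_count(organization, program_count=0,
                           process_default=0, system_default=0):
    """Return the buffer count that would apply to a file of this organization.

    program_count   -- value set by the program (for example, in RAB$B_MBF)
    process_default -- SET RMS_DEFAULT/BUFFER_COUNT=n value for the organization
    system_default  -- system-wide default set by the system manager
    """
    for value in (program_count, process_default, system_default):
        if value:  # assumption: 0 means "not specified"
            return value
    return FALLBACK_COUNT[organization]

print(effective_buffer_count("indexed"))                        # 2
print(effective_buffer_count("sequential", system_default=3))   # 3
print(effective_buffer_count("relative", process_default=4))    # 4
```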

For more details about using multiple buffers with sequential files, see Section 3.3.3. For more details about using multiple buffers with relative files, see Section 3.4.2. For more details about using multiple buffers with indexed files, see Section 3.5.2.3.

Chapter 7 describes the use of multiple buffers in the context of shared files.

3.2.2.2 Deferred-Write Processing

One way to improve performance through minimized I/O is to use the deferred-write option to keep data in memory as long as practicable. However, you must determine if this added performance benefit is worth the increased risk of losing data if the system crashes before a buffer is transferred to disk.

With indexed files and relative files, you may use the deferred-write option to defer writing modified buckets to disk until the buffer is needed for another purpose or until the file is closed.

Typically, the largest gains in performance come from using the deferred-write option with sequential access. Retrieving and modifying records one after the other permits you to access all of the records from one bucket while the bucket is in memory.

You may also improve performance by using the deferred-write option to prevent writing a shared sequential file to disk on each modification. However, this increases the risk of losing data if the system crashes before the full buffer is transferred to disk.

Note that nonshared sequential files behave as if the deferred-write option is always specified, because a buffer is only written to disk after it is completely filled.

Deferred-write processing is a default run-time option for some high-level languages and can be specified by using clauses in other languages. You can activate this option through FDL by adding the FILE attribute DEFERRED_WRITE. From a low-level language, you can activate the deferred-write option by setting the FAB$V_DFW bit in the FAB$L_FOP field.

3.2.2.3 Global Buffers

If several processes are to share a file, you may want to provide the file with global buffers, that is, I/O buffers that two or more processes can access. With global buffers, processes may access file information without allocating dedicated buffers. If you do not allocate dedicated buffers, you can conserve buffer space and buffer management overhead. You gain this benefit at the cost of additional system resources, as described in the OpenVMS Record Management Services Reference Manual.

When you create a file, you can assign the desired number of global buffers by using the FDL editor to set the value for the FILE secondary attribute GLOBAL_BUFFER_COUNT. From a low-level language, you can optionally set the value directly into the FAB$W_GBC field. Alternatively, you may use the DCL command SET FILE/GLOBAL_BUFFERS to set the global buffer count.

Global buffers are not used directly to retain modified information when the deferred-write option is enabled. If a global buffer is modified and the deferred-write option is enabled, the contents of the global buffer are copied to a process local buffer before other processes are allowed to access the global buffer contents. Subsequent use of the modified buffer by the process that deferred the writeback refers to the process local buffer while it remains in the process local cache. Reference to the global buffer by another process causes the contents of the process local buffer to be written back to disk.

If a global buffer is modified and the deferred-write option is not enabled, then the contents are written out to disk from the global buffer. Therefore, using global buffers along with the deferred-write option may cause a slight increase in processing overhead if in fact no further references to the modified buffer occur before it is flushed from the cache anyway. For that reason, you may want to disable the deferred-write option for processes that do not reaccess buffers after records have been written to them.

Section 3.3, Section 3.4, and Section 3.5 discuss the use of global buffers in tuning the various file types.

3.2.2.4 Read-Ahead and Write-Behind Processing

The operation of sequentially organized files can be improved by implementing read-ahead and write-behind processing. These features improve performance by permitting record processing and I/O operation to occur simultaneously. The read-ahead and write-behind features are default run-time attributes in some languages, but they must be explicitly specified in others.

You implement read-ahead and write-behind processing by using two buffers. The processing program uses one buffer, and the I/O subsystem uses the other. In read-ahead processing, the program reads data from one buffer as the second buffer inputs data from the disk. In write-behind processing, one buffer accepts output data from the program, while the second buffer outputs program data to a disk.
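
The overlap between processing and I/O can be sketched outside RMS with a helper thread. The Python example below is an analogy, not RMS behavior: the function name is hypothetical, and a one-slot queue stands in for the second buffer. It only approximates the two-buffer scheme, because the helper thread can hold one additional unit while it waits for the slot to free up.

```python
import io
import queue
import threading

def read_ahead(stream, unit_size):
    """Yield I/O units from stream while a helper thread fetches the next one."""
    handoff = queue.Queue(maxsize=1)  # one prefetched unit waits here

    def fetch():
        while True:
            unit = stream.read(unit_size)
            handoff.put(unit)         # blocks until the consumer takes the previous unit
            if not unit:              # an empty read marks end of file
                return

    threading.Thread(target=fetch, daemon=True).start()
    while True:
        unit = handoff.get()          # the program's buffer
        if not unit:
            return
        yield unit                    # process this unit while the next is being read

data = bytes(range(10))
units = list(read_ahead(io.BytesIO(data), unit_size=4))
print([len(unit) for unit in units])  # [4, 4, 2]
```

Write-behind is the mirror image: the program fills one buffer while a helper drains the other to disk.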

The next section provides additional information about read-ahead and write-behind processing.
