HP OpenVMS Systems Documentation

OpenVMS Programming Concepts Manual


30.4 Default Transactions

A default transaction TID is maintained for each process. Some DECdtm services act on the default transaction if no transaction is explicitly specified in the call. The default transaction of a process has two states:

  • Set: The process has a default transaction.
  • Clear: The process does not have a default transaction.

The default transaction is cleared during the processing that occurs when the transaction commits or aborts.

Some operations ($START_TRANS, $START_BRANCH) that set the default transaction of a process fail if the default transaction of the process is not already clear. There is one exception: if the default transaction is still set only because its commit or abort processing is in progress, such operations update the default transaction without error.
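The set and clear rules above can be sketched as a small state model. This is a Python illustration only, not a call to any DECdtm service. The class and method names (Process, start_trans, begin_end, finish_end) are hypothetical and chosen for this sketch.

```python
class DefaultTransactionError(Exception):
    """Raised when a default transaction is already set and still active."""


class Process:
    """Toy model of the per-process default transaction slot."""

    def __init__(self):
        self.default_tid = None      # None models the "clear" state
        self.ending = False          # True while commit/abort processing runs

    def start_trans(self, tid):
        # Starting fails if a default transaction is set and still active.
        if self.default_tid is not None and not self.ending:
            raise DefaultTransactionError("default transaction already set")
        # Clear, or set but already ending: the new TID replaces the old one.
        self.default_tid = tid
        self.ending = False

    def begin_end(self):
        # Commit or abort processing has started; the slot is not yet clear.
        self.ending = True

    def finish_end(self, tid):
        # Clear the slot only if it still refers to the transaction that ended.
        if self.default_tid == tid:
            self.default_tid = None
            self.ending = False
```

In this model, a second start_trans call raises an error while the first transaction is active, but succeeds once begin_end has been called for it.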

The default transaction TID is read by the $GET_DEFAULT_TRANS service.

Some RMs check if a default transaction has been started by the application. If there is none, the requested operation is performed as a single atomic operation. Do not use unsynchronized branches with such RMs. The problem is that a transaction might be aborted asynchronously (by another branch) before the branch calls the RM in question. The RM would then perform the operation separately instead of joining the transaction and then receiving an abort notification. This problem cannot occur with a synchronized branch because the default transaction TID is not cleared until $END_BRANCH is called.

30.4.1 Multithreaded Applications

Because the default transaction TID is per-process, not per-thread, it is preferable to use explicit TIDs in multithreaded processes.

However, you must use the default transaction with RMs that do not provide an interface that allows the AP to specify the TID. In this case, use the $SET_DEFAULT_TRANS service to set the appropriate TID in each thread. Take care to serialize each sequence of operations that sets and uses the default transaction.
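The serialization requirement can be illustrated with a lock held across the whole set-then-use sequence. The following Python sketch models the per-process default TID as a module variable. The function legacy_rm_write stands in for an RM that cannot accept an explicit TID. All names are hypothetical.

```python
import threading

_default_tid = None                  # stands in for the per-process default TID
_default_lock = threading.Lock()     # serializes every set-then-use sequence


def legacy_rm_write(log, value):
    """Hypothetical RM call that can only see the process default TID."""
    log.append((_default_tid, value))


def write_in_transaction(tid, log, value):
    global _default_tid
    with _default_lock:
        _default_tid = tid           # models $SET_DEFAULT_TRANS
        legacy_rm_write(log, value)  # RM picks up the TID set just above


def run_demo():
    log = []
    threads = [
        threading.Thread(target=write_in_transaction, args=(f"T{i}", log, i))
        for i in range(8)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

Because the lock covers both the assignment and the RM call, each value is always recorded against the TID of the thread that wrote it. Without the lock, another thread could change the default TID between the two steps.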

30.5 Resource Manager Interface

A resource manager provides transaction operations on one or more resources. The RM must have the following characteristics:

  • It should implement transactions with the ACID properties on the resources it manages. This is not a precondition for using DECdtm. For example, some RMs compromise on isolation for improved performance; but unless this characteristic is observed, distributed transactions constructed with DECdtm will not have the ACID properties expected by most applications. Section 30.5.6 describes where volatile (nondurable) resources are used.
  • It must be able to participate in the two-phase commit protocol. This means that it must be able to store the state of a transaction on disk in phase 1 and subsequently commit or roll back the changes as requested in phase 2.
  • It must respond correctly to DECdtm events in the event handler declared by $DECLARE_RM.
  • On recovery from an RM or node failure it must call DECdtm to determine the state of each transaction that was in phase 2 at the time of the failure. It must then commit or roll back the transaction as determined by DECdtm.

DECdtm recognizes two components of an RM:

  • RM instance (RMI) for each process that makes RM-related calls to DECdtm.
  • RM participant for each transaction in which an RM instance takes part.

The RMI and its RM participants share a single event handler, but each participant may have a different name and context. The name is used to find relevant transactions on recovery. The context is a handle, opaque to DECdtm, which is passed to the event handler and may be used to address RM-specific data.

An RM uses the following DECdtm services during normal execution of transactions:

$DECLARE_RM    Creates an RM instance in the current process.
$JOIN_RM       Adds an RM participant to a transaction.
$ACK_EVENT     Acknowledges an event reported to an RMI or RM participant.
$FORGET_RM     Deletes an RMI from the current process.

An RM uses the following DECdtm services during recovery from an RM or system failure:

$GETDTI    Gets distributed transaction information. Used to get information about the state of transactions.
$SETDTI    Sets distributed transaction information. Used to remove RM participants from a transaction.

30.5.1 Creating RM Instances and Participants

You can create an RMI by calling $DECLARE_RM. This specifies an event handler for the RM in the process and returns the RM_ID that is needed to add participants to transactions.

The RM can add an RM participant as follows:

  • The RM may call $JOIN_RM on the first operation for a new TID.
  • The RM may request transaction start events (DDTM$M_EV_TRANS_START). It calls $ACK_EVENT to join every transaction. If an RM participant finds that it takes no part in a transaction, it can vote SS$_FORGET in phase 1.

In either case, the RM specifies a participant name, which is used as a key to retrieve transaction state information on recovery from an RM or system failure. The participant name has the following characteristics:

  • It must have an RM or facility prefix that is unique to the RM.
  • Typically it includes an RM-specific name for a group of resources that are recovered as a unit, such as a database or volume.
  • It may also include an RM log version (see Section 30.5.5).

You can design an RM to be used either with or without DECdtm. In the latter case, the RM may perform a single request as a transaction without calling DECdtm. Such RMs must take care when using $GET_DEFAULT_TRANS. A status of SS$_NOCURTID indicates that either no transaction has started, or that a transaction started and then aborted before the RM was called. Therefore, the RM interface must provide some way for an AP to specify whether requests are for DECdtm transactions or not, for example, by using an interface function, or by setting a mode switch with a logical name. Do not decide if a DECdtm transaction is required just by checking $GET_DEFAULT_TRANS for a TID. The RM should return an error (for example, SS$_ABORTED) if the AP requires a DECdtm transaction and there is no current TID.
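The rule against inferring the mode from the default TID can be sketched as follows. This Python fragment is a model under stated assumptions. The requires_dtm flag represents whatever explicit switch the RM interface offers, and the join and run_standalone callbacks are hypothetical placeholders for the RM's own logic.

```python
SS_NORMAL = "SS$_NORMAL"
SS_ABORTED = "SS$_ABORTED"


def handle_request(requires_dtm, default_tid, join, run_standalone):
    """Decide how an RM runs one request.

    requires_dtm -- explicit mode chosen by the AP (hypothetical parameter)
    default_tid  -- result of a default-TID lookup, or None for SS$_NOCURTID
    """
    if requires_dtm:
        if default_tid is None:
            # No TID may mean "started, then aborted": refuse, do not go solo.
            return SS_ABORTED
        join(default_tid)
        return SS_NORMAL
    # The AP explicitly asked for non-DECdtm operation.
    run_standalone()
    return SS_NORMAL
```

The key design point is that a missing TID never causes a silent fallback to a standalone operation. Only the explicit flag selects that path.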

30.5.2 Reporting an Event Notification

The DECdtm transaction manager reports events to an RMI, and to the RM participants associated with it, using asynchronous system traps (ASTs). The ASTs are executed in the access mode specified in the call to $DECLARE_RM that created that RMI.

The DECdtm transaction manager creates an event report block, and passes its address to the AST routine in the parameter of the AST. Each event report block contains the following:

  • The identifier of the event report.
  • A code that describes the event.
  • The identifier (TID) of the transaction.
  • The name of the RM participant or RMI.
  • The context of the RM participant or RMI.
  • Other data that depend on the type of the event.

Table 30-1 describes the fields in an event report block, in alphabetical order.

Table 30-1 Fields in an Event Report Block

Symbol                 Description
DDTM$A_TID_PTR         Address of the identifier (TID) of the transaction.
DDTM$L_ABORT_REASON    Abort reason code (longword). See Appendix B for a list of possible values. Present only in abort event reports.
DDTM$L_EVENT_TYPE      A code that identifies the event (longword). The possible values are listed after this table.
DDTM$L_REPORT_ID       Event report identifier (unsigned longword).
DDTM$L_RM_CONTEXT      The context of the RM participant or RMI to which the event report is being delivered (unsigned longword).
DDTM$Q_PART_NAME       The name of the RM participant or RMI to which the event report is being delivered (descriptor).
DDTM$Q_TX_CLASS        The transaction class of the transaction (descriptor).

Possible values of DDTM$L_EVENT_TYPE:

Symbol                       Event
DDTM$K_ABORT                 Abort
DDTM$K_COMMIT                Commit
DDTM$K_PREPARE               Prepare
DDTM$K_ONE_PHASE_COMMIT      One-phase commit
DDTM$K_STARTED_DEFAULT       Default transaction started
DDTM$K_STARTED_NONDEFAULT    Nondefault transaction started

Each event report must be acknowledged by calling $ACK_EVENT, specifying the identifier of the report. This acknowledgment need not come from AST context.

The DECdtm transaction manager delivers only one event report at a time to each RM participant. For example, if a prepare event report has been delivered to an RM participant, and the transaction is aborted while the RM participant is doing its prepare processing, then the DECdtm transaction manager does not deliver an abort event report to that RM participant until it has acknowledged the prepare event report by a call to $ACK_EVENT. Note that the DECdtm transaction manager may deliver multiple reports to an RMI.

After acknowledging the event report, the RMI or RM participant should no longer access the event report block.
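The one-report-at-a-time rule for an RM participant can be modeled with a queue that releases the next report only after the previous one has been acknowledged. This Python sketch is illustrative and does not reflect how DECdtm is implemented. The ParticipantChannel class and its methods are hypothetical.

```python
from collections import deque


class ParticipantChannel:
    """Delivers at most one unacknowledged event report at a time."""

    def __init__(self, handler):
        self.handler = handler       # called with (report_id, event)
        self.pending = deque()       # reports waiting for the previous ack
        self.outstanding = None      # report_id awaiting acknowledgment
        self.next_id = 1

    def report(self, event):
        self.pending.append((self.next_id, event))
        self.next_id += 1
        self._deliver()

    def ack(self, report_id):
        if report_id != self.outstanding:
            raise ValueError("unknown or already acknowledged report")
        self.outstanding = None
        self._deliver()

    def _deliver(self):
        if self.outstanding is None and self.pending:
            report_id, event = self.pending.popleft()
            self.outstanding = report_id
            self.handler(report_id, event)
```

If a prepare report is outstanding when an abort is queued, the handler sees the abort only after it acknowledges the prepare.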

30.5.3 Responding to Events

The primary requirement of an RM participant is that it should respond to the following DECdtm events by calling $ACK_EVENT.

DDTM$K_PREPARE:

Delivered at the start of phase 1. Normally, the participant saves on disk information needed to commit or abort the transaction, and responds with SS$_PREPARED.

If the participant has not updated any resources during the transaction, it may respond with SS$_FORGET. The participant should then release any locks on its resources. This optimization eliminates an unnecessary commit or abort event.

If the participant had an error while the transaction was active, or is unable to save information to disk, it responds with SS$_VETO. The participant may then abort its transaction and release any locks on its resources.

DDTM$K_ONE_PHASE_COMMIT:

Delivered as an alternative to DDTM$K_PREPARE if there is a single participant and it is in the process that started the transaction.

The participant may commit the transaction and respond with SS$_NORMAL. This optimization eliminates the need for DECdtm to log information and to deliver a commit event.

The participant may respond with SS$_PREPARED to request a regular two-phase commit, or with SS$_VETO to abort the transaction.

DDTM$K_COMMIT:

Delivered when all participants have voted SS$_PREPARED in phase 1.

Normally, the participant commits the transaction and responds with SS$_FORGET. This allows DECdtm to discard the transaction from its log. The participant may then release any locks on its resources.

Alternatively, the participant may respond with SS$_REMEMBER. This is used if the RM encounters an error while committing the transaction. DECdtm retains information about the transaction in its log. The RM must commit the transaction later, as a recovery operation.

DDTM$K_ABORT:

Delivered after $ABORT_TRANS has been called on any node, or when one or more of the participants have responded with SS$_VETO in phase 1.

Table 30-2 shows the abort reason codes.

Table 30-2 Abort Reason Codes

Symbolic Name          Description
DDTM$_ABORTED          Application aborted the transaction without giving a reason.
DDTM$_COMM_FAIL        Transaction aborted because a communications link failed.
DDTM$_INTEGRITY        Transaction aborted because a resource manager integrity constraint check failed.
DDTM$_LOG_FAIL         Transaction aborted because an attempt to write to the transaction log failed.
DDTM$_ORPHAN_BRANCH    Transaction aborted because it had an unauthorized branch.
DDTM$_PART_SERIAL      Transaction aborted because a resource manager serialization check failed.
DDTM$_PART_TIMEOUT     Transaction aborted because a resource manager timeout expired.
DDTM$_SEG_FAIL         Transaction aborted because a process or image terminated.
DDTM$_SERIALIZATION    Transaction aborted because a serialization check failed.
DDTM$_SYNC_FAIL        Transaction aborted because a branch had been authorized for it but had not been added to it.
DDTM$_TIMEOUT          Transaction aborted because its timeout expired.
DDTM$_UNKNOWN          Transaction aborted for an unknown reason.
DDTM$_VETOED           Transaction aborted because a resource manager was unable to commit it.

The participant must abort the transaction and respond with SS$_FORGET. It may then release any locks on its resources.

The previous descriptions suggest that a participant drops locks after calling $ACK_EVENT. It could equally well drop locks immediately before calling $ACK_EVENT.
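The responses described for each event can be summarized as a decision function. The following Python sketch models only the choice of status value. The event names are shortened strings, the parameters are hypothetical, and the one-phase case omits the option of replying SS$_PREPARED to request a regular two-phase commit.

```python
def choose_ack(event, updated=True, failed=False, commit_ok=True):
    """Pick the status a participant passes to $ACK_EVENT (toy model)."""
    if event == "PREPARE":
        if failed:
            return "SS$_VETO"        # earlier error, or state not saved to disk
        if not updated:
            return "SS$_FORGET"      # read-only: no commit or abort event needed
        return "SS$_PREPARED"
    if event == "ONE_PHASE_COMMIT":
        return "SS$_VETO" if failed else "SS$_NORMAL"
    if event == "COMMIT":
        # SS$_REMEMBER keeps the transaction in the DECdtm log for recovery.
        return "SS$_FORGET" if commit_ok else "SS$_REMEMBER"
    if event == "ABORT":
        return "SS$_FORGET"
    raise ValueError(f"unexpected event: {event}")
```

For example, a read-only participant calls choose_ack("PREPARE", updated=False) and drops out of the transaction with SS$_FORGET.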

To ensure isolation between transactions (distributed or otherwise), RMs set locks on all resources that are either read or updated, and observe a two-phase lock protocol. This specifies that a transaction must be divided into a phase when locks may be acquired and a following phase when locks may be released. When any lock is released, no further locks may be acquired. An RM may gain a useful improvement in concurrency by releasing locks on non-updated resources at the end of the active phase, before the transaction is saved on disk.

To obey the two-phase lock protocol for distributed transactions, an RM participant must hold all locks until the start of phase 1. In other words, it must wait for the other participants to complete their active phases of the transaction.

(This is not an absolute requirement by DECdtm. Some RMs allow an application to request reduced isolation between transactions, to get higher concurrency. But if an RM releases locks on non-updated resources before phase 1, distributed transactions constructed with DECdtm will not have the isolation property expected by most applications.)
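The two-phase lock protocol reduces to a simple check on the order of lock operations: no acquire may follow any release. The Python sketch below is a generic illustration and is independent of any particular RM or lock manager. The operation format is an assumption made for this example.

```python
def obeys_two_phase_locking(ops):
    """ops is a sequence of ("acquire" | "release", resource) pairs."""
    released_any = False
    for action, _resource in ops:
        if action == "release":
            released_any = True
        elif action == "acquire" and released_any:
            return False    # acquired a lock after the release phase began
    return True
```

A history that acquires A and B and then releases both passes the check. A history that releases A and then acquires B does not.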

30.5.4 Aborting a Transaction

If an RM detects an error during a transaction, it may return an error status to the AP and allow the AP to decide whether to abort the transaction. For some errors, the RM may decide to veto the transaction when it receives a request to prepare.

However, an RM should not call $ABORT_TRANS itself. A synchronized branch is terminated by $ABORT_TRANS and the decision to terminate the branch should be taken by the AP that started it, not by an RM that it called.

DECdtm has no control over the execution of APs. Therefore, an RM must be prepared to receive and reject application requests for a transaction after calling $ABORT_TRANS, and after DECdtm has signaled the start of phase 1. Under rare conditions, an RM may be asked to vote despite calling $ABORT_TRANS.

30.5.5 Performing Recovery

An RM may fail at any time, or the process or node on which it is running may fail. When the RM is restarted, it must clean up the on-disk state of any transaction that was running at the time of the failure. Typically, this is done by maintaining an RM-specific log of operations. On recovery, the RM examines the log to find updates that must be undone (for transactions that are being aborted) or redone (for transactions that are being committed). The RM cannot resume normal operation until it has either reacquired locks for in-progress transactions, or completed or aborted them appropriately.

Logging is a common technique because it performs well, but other methods may be suitable for specific RMs. The key point is that the RM must store sufficient information on disk so that it can abort or complete in-progress transactions following an RM or node restart.

If the RM failed before voting, the RM can assume that the transaction is to be aborted, because the RM never voted to commit the transaction.

If the RM failed after voting, it must determine the outcome of the transaction from DECdtm. This is done using the $GETDTI system service. The RM may query the outcome of a specific transaction, using a TID stored in its own log. Alternatively, it may select all transactions using a prefix of the RM participant names.

Two features allow the RM to match its log against the DECdtm log. This is desirable because, for instance, the wrong log might be used if either log has been incorrectly restored from backup following a disk failure. Following are the two features:

  • $DECLARE_RM returns the ID of the DECdtm log on the local node. The RM should save this ID with its own log, and check the value in a call to $GETDTI. This check will fail if either the wrong TM log or the wrong RM log is used.
  • The backup sequence number for the RM log may be encoded as a suffix to the RM participant name. On recovery, a $GETDTI scan may be used to check if the DECdtm log records participants with more recent backup sequence numbers than expected. This would indicate that an out-of-date RM log is being recovered.
    This check is recommended for RMs that use per-resource logs (rather than a single per-system log), where the risk of an old log being restored is significant.
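The backup sequence number check can be sketched as follows. The name layout used here (facility prefix, a dollar sign, a resource name, a number sign, and a zero-padded sequence number) is purely an assumption for illustration. DECdtm does not prescribe it, and the list of scanned names stands in for the results of a $GETDTI scan.

```python
def participant_name(facility, resource, backup_seq):
    """Build a name such as 'MYRM$PAYROLL_DB#0007' (layout is an assumption)."""
    return f"{facility}${resource}#{backup_seq:04d}"


def newest_logged_seq(scanned_names, facility, resource):
    """Highest backup sequence number that a scan found for one resource."""
    prefix = f"{facility}${resource}#"
    seqs = [int(n[len(prefix):]) for n in scanned_names if n.startswith(prefix)]
    return max(seqs, default=None)


def rm_log_is_out_of_date(scanned_names, facility, resource, rm_log_seq):
    """True if the scan shows a newer sequence number than the RM log has."""
    newest = newest_logged_seq(scanned_names, facility, resource)
    return newest is not None and newest > rm_log_seq
```

If the scan returns a participant with sequence number 9 while the RM log being recovered carries sequence number 8, the check reports that the RM log is out of date.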

Two transaction states allow the RM to take action: DTI$K_COMMITTED and DTI$K_ABORTED. The RM may specify that $GETDTI does not complete until a selected transaction has one of these two states.

Alternatively, other states may be returned if the final state of a transaction has not been resolved yet, perhaps because the DECdtm log is unavailable, or DECdtm is still waiting for votes from other RMs or TMs. This allows the RM to continue recovery for other transactions, to take locks for the outstanding unrecovered transactions, and then to resume normal operation.

When an RM has committed or aborted a transaction, it must allow DECdtm to remove the transaction from its log. This is done using the DTI$K_DELETE_RM_NAME function of $SETDTI.

DECdtm implements a presumed-abort optimization. This removes the need for DECdtm to log abort decisions. Therefore, if a query for a TID returns SS$_NOSUCHTID, or the TID is missing from the results of a wildcard query, the RM must assume that the transaction has aborted. There is no need to call $SETDTI in this case.
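The recovery decisions, including presumed abort, can be summarized in a small function. In this Python sketch a dictionary stands in for the answers that $GETDTI would give, and a missing key models SS$_NOSUCHTID. The action names and the state strings are hypothetical.

```python
def recovery_action(tid, dtm_states):
    """Return (action, remove_name) for one transaction found in the RM log.

    dtm_states maps TID -> state string; a missing TID models SS$_NOSUCHTID.
    """
    state = dtm_states.get(tid)
    if state is None:
        return ("undo", False)       # presumed abort: nothing to remove
    if state == "COMMITTED":
        return ("redo", True)        # then DTI$K_DELETE_RM_NAME via $SETDTI
    if state == "ABORTED":
        return ("undo", True)
    return ("hold_locks", False)     # unresolved: keep locks, retry later
```

The second element of the result indicates whether the RM still needs to remove its participant name from the DECdtm log after acting.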

DECdtm writes the removal of a transaction from its log lazily, not at the moment the RM forgets the transaction. This means that following a system failure, the DECdtm log may hold commit records for transactions that the RM has forgotten. To prevent such records from eventually filling the log, the RM must occasionally perform recovery by the wildcard scan method, instead of querying specific transactions, and remove its association from any committed transaction that is unknown to the RM.
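The cleanup step of a wildcard scan amounts to a set difference. In this Python sketch, scan is a list of (TID, state) pairs standing in for the results of a wildcard $GETDTI query, and rm_known_tids is the set of TIDs still present in the RM log. Both are assumptions made for illustration.

```python
def forgotten_commits(scan, rm_known_tids):
    """TIDs that DECdtm still records as committed but the RM no longer knows."""
    return [tid for tid, state in scan
            if state == "COMMITTED" and tid not in rm_known_tids]
```

Each TID returned is a candidate for the DTI$K_DELETE_RM_NAME function of $SETDTI.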

30.5.6 Volatile Resource Manager

An RM may be declared as volatile in $DECLARE_RM if it manages resources that do not need to survive an RM or node failure, such as the following:

  • Managing a cache of information that is transactionally consistent, but that can be regenerated from information held by another nonvolatile RM.
  • Implementing a scratchpad for communication between APs during a series of transactions. Changes to the scratchpad should be undone on transaction abort, but the scratchpad does not need to be reconstructed following a system failure.
  • Monitoring transaction start, commit, and abort events for performance information or perhaps to clean up volatile state, without managing a real resource.

Declaring an RM as volatile removes the need for DECdtm to log information about RM participants. By definition, the RM does not need to perform recovery after a failure, and does not call $GETDTI.

30.5.7 Modifying the DECdtm Log

On recovery, RMs are expected to wait until each transaction state can be resolved as committed or aborted. During this time, they may be unavailable for new operations, or they may hold locks that block the normal functioning of applications.

When you use DECdtm within an OpenVMS Cluster, any node can access the DECdtm log for recovery, provided that the log is configured on a clustered disk. However, if the log is on a failed node outside the cluster, if communication to the node has failed, or if the disk holding the log has failed, applications may be blocked indefinitely.

In this scenario, you may prefer to intervene manually rather than to tolerate an unavailable system. The DTI$K_MODIFY_STATE function of $SETDTI allows you to change the state of an in-doubt transaction in a DECdtm log. The DTI$K_DELETE_TRANSACTION function allows you to remove a transaction from a DECdtm log.

You can make these changes using the Log Manager Control Program (LMCP) REPAIR command rather than calling $SETDTI directly. Intervention of this type is for emergency use and is likely to break the consistency of distributed resources. You may need to perform application-specific updates to resources to restore consistency.

30.5.8 Transaction Class

An AP may specify a transaction class parameter to $START_TRANS or $ADD_BRANCH. This is passed as a string to the RM event handler. The mechanism is provided so that an RM may monitor transaction activity for suitably labeled transactions or branches. Its use is optional.

