pthreads programming considerations?

I am designing an application in C++ and I would like to use DECthreads
(pthread_ interface).  After looking through the Guide to DECthreads, I have
a few questions:
1. Are there any limitations on the use of system services from within my
multithreaded app?
2. Can I issue $QIO calls that block?  If I do issue a blocking call, does
it block only the calling thread or does it block the entire process (i.e.,
if I block on a read from a mailbox, do a scanf(), or issue a UCX select())?
Does the thread scheduling algorithm affect the behaviour?
3. If (2) above doesn't block the entire process, how does VMS implement this
and does that implementation produce side effects or other problems I need to
be aware of?
4. If I FORCEX() a process, do all of the thread cleanup routines get called?

Most of these questions have two sets of answers, depending on whether the
VMS kernel support for threads has been enabled or not.  The kernel support
is available on OpenVMS Alpha as of V7.0; the behavior on prior versions of
OpenVMS Alpha and on OpenVMS VAX is the same as the behavior on OpenVMS Alpha
V7 with the support disabled.  On OpenVMS Alpha V7 the support is enabled by
using a linker qualifier (/THREADS_ENABLE) which controls whether "upcalls"
are enabled and whether multiple kernel execution contexts are available to
the process.  The default is that they are both disabled; note that the
qualifier is only meaningful when linking executable images, not shareable
images.  (There is also a SYSGEN parameter, MULTITHREAD, which can be used to
control the maximum number of execution contexts per process; setting it to
zero disables upcalls and allows no additional execution contexts to any
process on the system.  By default, this parameter is set to the number of
processors on the system.)

In general, there are no limitations on the use of system services from
within a multithreaded application.  However, it's helpful to understand the
implications of calling any specific service as well as its interaction with
other threads.  The most important ramification is the effect of calling a
blocking service, as explained below.  There are other issues for the
application developer to consider as well, such as ensuring exclusive access
to output buffers, the use of event flags and the issues around sharing them
between threads, and other more specific problems (such as the fact that
taking out locks in different threads can cause the VMS Lock Manager to
incorrectly detect deadlocks).
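
As an illustration of "ensuring exclusive access to output buffers", here is
a minimal sketch in which several threads append to one shared buffer under a
mutex.  The names (g_output_buffer, writer, run_writers) are hypothetical and
chosen only for this example; the sketch uses the standard POSIX pthread_
calls rather than anything specific to DECthreads.

```cpp
#include <cassert>
#include <pthread.h>
#include <string>
#include <vector>

// Hypothetical shared output buffer, guarded by one mutex.
static pthread_mutex_t g_buffer_lock = PTHREAD_MUTEX_INITIALIZER;
static std::vector<std::string> g_output_buffer;

struct WriterArgs {
    int id;     // which writer thread this is
    int count;  // how many records it appends
};

// Thread start routine: appends records, holding the mutex for each append.
static void* writer(void* arg) {
    assert(arg != nullptr);
    WriterArgs* a = static_cast<WriterArgs*>(arg);
    for (int i = 0; i < a->count; ++i) {
        pthread_mutex_lock(&g_buffer_lock);
        g_output_buffer.push_back("thread " + std::to_string(a->id));
        pthread_mutex_unlock(&g_buffer_lock);
    }
    return nullptr;
}

// Starts nthreads writers, joins them, and returns the record count.
int run_writers(int nthreads, int per_thread) {
    assert(nthreads >= 0 && per_thread >= 0);
    g_output_buffer.clear();
    std::vector<pthread_t> threads(nthreads);
    std::vector<WriterArgs> args(nthreads);
    int created = 0;
    for (int i = 0; i < nthreads; ++i) {
        args[i].id = i;
        args[i].count = per_thread;
        if (pthread_create(&threads[i], nullptr, writer, &args[i]) != 0)
            break;  // stop on failure; join only what was created
        ++created;
    }
    for (int i = 0; i < created; ++i)
        pthread_join(threads[i], nullptr);
    return static_cast<int>(g_output_buffer.size());
}
```

Calling run_writers(4, 100) should leave exactly 400 records in the buffer;
without the mutex, concurrent push_back calls would be a data race.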

You certainly can issue blocking system service calls from within a thread.
The effects on the calling thread as well as the other threads in the process
depend on whether the process has upcalls enabled or not.

With upcalls enabled, the VMS Exec notifies DECthreads that the executing
thread has blocked in a system service call.  (This is done by having the
Exec call "up" into a routine in the DECthreads scheduler, hence the term
"upcall".)  This allows DECthreads to deschedule the current thread and
immediately schedule another thread to run in its place.  When the block is
concluded, the Exec will issue another upcall to notify DECthreads that the
thread is now eligible to execute again, and DECthreads will schedule it at
the next point which is appropriate, based on the thread's scheduling policy
and priority.
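
The behavior an application sees with upcalls enabled (a blocking call stalls
only the calling thread) can be sketched as follows.  This is an illustration
under stated assumptions, not VMS code: a POSIX pipe and read() stand in for a
mailbox read via $QIOW, and the names ReaderState, blocking_reader and
blocking_read_demo are hypothetical.

```cpp
#include <cassert>
#include <pthread.h>
#include <unistd.h>

struct ReaderState {
    int  fd;        // read end of the pipe (stand-in for a mailbox channel)
    char received;  // byte delivered by the blocking read, or '\0' on error
};

// Thread start routine: blocks in read() until a byte arrives.
static void* blocking_reader(void* arg) {
    assert(arg != nullptr);
    ReaderState* st = static_cast<ReaderState*>(arg);
    if (read(st->fd, &st->received, 1) != 1)
        st->received = '\0';
    return nullptr;
}

// Returns the byte the reader received, or '\0' if anything went wrong.
char blocking_read_demo() {
    int fds[2];
    if (pipe(fds) != 0)
        return '\0';
    ReaderState st = { fds[0], '\0' };
    pthread_t reader;
    if (pthread_create(&reader, nullptr, blocking_reader, &st) != 0) {
        close(fds[0]);
        close(fds[1]);
        return '\0';
    }
    // The reader is (or soon will be) blocked; this thread keeps working.
    long work = 0;
    for (int i = 0; i < 1000; ++i)
        work += i;
    // Complete the reader's I/O, then wait for it to finish.
    char byte = 'x';
    ssize_t written = write(fds[1], &byte, 1);
    pthread_join(reader, nullptr);
    close(fds[0]);
    close(fds[1]);
    return (written == 1 && work == 499500) ? st.received : '\0';
}
```

The calling thread completes its loop and performs the write even though the
reader thread is blocked, which is the property the upcall mechanism is
described as providing.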

Without upcalls, DECthreads has no way of determining that the current thread
is no longer actually executing when it is blocked in a system call.  As a
result, DECthreads will continue to schedule the blocked thread to "run",
according to the thread's scheduling policy and priority.  During the times
when the thread is scheduled, the process as a whole (i.e., all threads) will
be blocked.  However, with upcalls disabled, DECthreads uses a repeating
real-time timer AST to effect timeslicing among threads.  Thus, the process
only remains blocked for the remainder of the thread's quantum (or until it
is preempted by another thread), at which point DECthreads will regain
control and schedule another thread to run, if appropriate.  However, if
there is no reason for DECthreads to schedule another thread, e.g., because
the current thread has a priority which is higher than all other threads or
because the thread has a FIFO scheduling policy, then the thread will
continue to "run", and no other thread will get a chance to execute until the
blocking system service completes.

This mechanism (i.e., the one used in the absence of upcalls) works fairly
well for end-user applications, especially those which avoid heavy use of
priorities and FIFO scheduling.  These are typically cases in which periodic
short-term blocks don't pose a problem.  However, the mechanism doesn't work
out as well in performance-critical applications or in certain specialized
coding situations.  The issues in the former should be obvious. If an
application maintains a hundred network connections with a thread blocked on
a read on each connection, then the latencies between I/O completion and
resumption of execution could be in the tens of seconds!  Likewise, if a
thread which is executing in such a process reaches the end of its quantum,
it would be a similar period of time before it was assigned to run again.
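
The "tens of seconds" estimate follows from simple arithmetic: without
upcalls, each blocked thread that gets scheduled can stall the process for up
to one quantum, so a ready thread may wait behind every other blocked thread
in turn.  The sketch below is only a rough model.  The function name is
hypothetical, and the 200 ms quantum in the usage note is an assumed value
for illustration; the actual DECthreads quantum is not stated here.

```cpp
#include <cassert>

// Rough upper bound on how long a ready thread waits when every other
// thread is blocked in a system service and each one consumes a full
// quantum before the timeslice AST lets the scheduler move on.
long worst_case_wait_ms(long blocked_threads, long quantum_ms) {
    assert(blocked_threads >= 0 && quantum_ms >= 0);
    return blocked_threads * quantum_ms;
}
```

With 100 connection threads, a given thread may wait behind the other 99;
worst_case_wait_ms(99, 200) gives 19800 ms, i.e., roughly 20 seconds under
the assumed quantum.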

The other problems arise from the fact that the timeslicing depends on a
User-mode AST.  That is, if the system service blocks in a more privileged
mode, such as Exec- or Kernel-mode, then it is not subject to interruption by
the timeslice AST.  (Most services, such as $QIOW, do their waiting in
"caller's mode" -- i.e., User-mode in the typical case -- but a very few do
block in inner mode.)  Furthermore, if an application, run-time library
routine, or system service disables ASTs, it likewise will block the entire
process for the duration of the blocking system service call.  (Again,
basically no Digital-supplied system services disable ASTs, and almost none
of the run-time library routines do, but it's not uncommon for application
code to do so, and there have been several reports of routines for socket I/O
blocking the process with ASTs disabled.)

For these and similar reasons, with V7.0 VMS provides kernel support for
threads on Alpha, including upcalls.  With the addition of upcalls, the Exec
can notify DECthreads immediately when a thread is blocked or no longer
blocked in a system service call, and this frees us from the issues above.
Furthermore, it allows DECthreads to use a timeslice interval which is based
on CPU-time instead of real-time, which requires less overhead and produces
a much fairer distribution of execution on a busy system.  It also frees
DECthreads to a large extent from problems resulting from the application
disabling ASTs.

In addition, it provides for much more robust operation of services like
$EXIT and $FORCEX in a multithreaded process.  Exit handling presents special
challenges in a multithreaded process, because the exit handler routines may
need to synchronize with other threads in order to gain access to resources
which need to be cleaned up.  Without upcalls, a call to $EXIT made from
within a thread might invoke a routine which blocks waiting for a resource
already held by the calling thread, resulting in a deadlock situation which
makes the process hang instead of exiting.  $FORCEX is even more
trouble-prone, since in this case the exit handlers are invoked
asynchronously in an arbitrary thread, so there is no possibility of
preparation, such as releasing resources, prior to the invocation of the exit
handler routines.  However, with upcalls enabled, exit handler routines are
executed in a separate thread created expressly for the purpose; a call to
$EXIT or $FORCEX simply makes this thread eligible to run.  Thus, if an exit
handler routine requires a mutex, the executing thread is free to block
without risking a deadlock.  Furthermore, when a thread calls $EXIT with
upcalls enabled, it results in an implicit call to pthread_exit(), which
unwinds the calling thread's stack allowing any frames with TRY blocks,
pthread cleanup handlers, or VMS condition handlers to clean up and release
any resources which the thread is holding (to allow the exit handlers to
proceed without deadlocking).
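
The stack unwinding described above can be illustrated with a pthread cleanup
handler.  In this minimal sketch the thread calls pthread_exit() directly
(standing in for the implicit call that $EXIT is described as making with
upcalls enabled), and the cleanup handler releases the mutex the thread
holds.  The names g_resource, release_resource, worker and run_cleanup_demo
are hypothetical.

```cpp
#include <cassert>
#include <pthread.h>

static pthread_mutex_t g_resource = PTHREAD_MUTEX_INITIALIZER;
static bool g_cleanup_ran = false;

// Cleanup handler: releases the mutex the terminating thread still holds.
static void release_resource(void* arg) {
    assert(arg != nullptr);
    g_cleanup_ran = true;
    pthread_mutex_unlock(static_cast<pthread_mutex_t*>(arg));
}

static void* worker(void*) {
    pthread_mutex_lock(&g_resource);
    pthread_cleanup_push(release_resource, &g_resource);
    // Terminating here unwinds the stack and runs release_resource().
    pthread_exit(nullptr);
    // Never reached, but push and pop must pair in the same scope.
    pthread_cleanup_pop(1);
    return nullptr;
}

// Returns true if the handler ran and the mutex was left unlocked.
bool run_cleanup_demo() {
    g_cleanup_ran = false;
    pthread_t t;
    if (pthread_create(&t, nullptr, worker, nullptr) != 0)
        return false;
    pthread_join(t, nullptr);
    // If the handler released the mutex, another thread can now take it.
    bool lockable = (pthread_mutex_trylock(&g_resource) == 0);
    if (lockable)
        pthread_mutex_unlock(&g_resource);
    return g_cleanup_ran && lockable;
}
```

Because the handler releases the mutex during the unwind, a later thread
(such as one running exit handlers) can acquire it without deadlocking.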

Independent of upcalls, it is important that the application developer
provide for an orderly application shutdown.  Just as a thread must not be
allowed to call a subsystem before that subsystem is initialized, it is up to
the application developer to ensure that no threads call a subsystem after it
has been shut down by its exit handler.  (The best example of this is calls
to printf() during or after the execution of the C RTL's exit handler that
flushes and closes all open stdio files.)  One approach is for each subsystem
to declare an exit handler during its initialization which shuts down and
joins with all the threads that the subsystem created:  since the exit
handlers are run in the reverse order from that in which they were declared,
this ensures that each thread can clean up properly, including calling other
subsystems, and that when a given subsystem is shut down all of its consumers
have already been shut down.
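
The approach just described can be sketched as follows.  This is a model
under stated assumptions rather than VMS code: a plain vector
(g_exit_handlers) stands in for the system's exit-handler list, which on VMS
would be populated by declaring real exit handlers, and the Subsystem type
and function names are hypothetical.  Each subsystem registers itself during
initialization; shutdown runs in reverse order and joins with the
subsystem's thread.

```cpp
#include <cassert>
#include <pthread.h>
#include <string>
#include <vector>

struct Subsystem {
    std::string     name;
    pthread_t       worker;
    pthread_mutex_t lock;
    pthread_cond_t  wake;
    bool            stop;
};

static std::vector<Subsystem*> g_exit_handlers;  // stand-in handler list
static std::string g_shutdown_log;               // records shutdown order

// Worker thread: idles until its subsystem is told to stop.
static void* subsystem_worker(void* arg) {
    Subsystem* s = static_cast<Subsystem*>(arg);
    pthread_mutex_lock(&s->lock);
    while (!s->stop)
        pthread_cond_wait(&s->wake, &s->lock);
    pthread_mutex_unlock(&s->lock);
    return nullptr;
}

// Initializes the subsystem and declares its "exit handler".
static bool subsystem_init(Subsystem* s, const std::string& name) {
    assert(s != nullptr);
    s->name = name;
    s->stop = false;
    pthread_mutex_init(&s->lock, nullptr);
    pthread_cond_init(&s->wake, nullptr);
    if (pthread_create(&s->worker, nullptr, subsystem_worker, s) != 0) {
        pthread_cond_destroy(&s->wake);
        pthread_mutex_destroy(&s->lock);
        return false;
    }
    g_exit_handlers.push_back(s);
    return true;
}

// Exit handler body: stop the worker, join with it, then tear down.
static void subsystem_shutdown(Subsystem* s) {
    pthread_mutex_lock(&s->lock);
    s->stop = true;
    pthread_cond_signal(&s->wake);
    pthread_mutex_unlock(&s->lock);
    pthread_join(s->worker, nullptr);
    pthread_cond_destroy(&s->wake);
    pthread_mutex_destroy(&s->lock);
    if (!g_shutdown_log.empty())
        g_shutdown_log += ",";
    g_shutdown_log += s->name;
}

// Runs handlers in the reverse of the order in which they were declared.
static void run_exit_handlers() {
    while (!g_exit_handlers.empty()) {
        Subsystem* s = g_exit_handlers.back();
        g_exit_handlers.pop_back();
        subsystem_shutdown(s);
    }
}

// The logger is initialized first, so it must be shut down last.
std::string demo_shutdown_order() {
    g_shutdown_log.clear();
    Subsystem logger, network;
    bool ok = subsystem_init(&logger, "logger") &&
              subsystem_init(&network, "network");
    run_exit_handlers();
    return ok ? g_shutdown_log : std::string();
}
```

Here the network subsystem (a consumer of the logger) is stopped and joined
first, so the logger is still available while the network thread cleans up.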

Note that thread clean-up is only done in threads which actually terminate
before the process is run down (rundown happens after the last exit handler
routine completes).  At process rundown, all unterminated threads and the
rest of P0-space memory are summarily destroyed.  Thus, if an application is
depending on thread clean-up to maintain external invariants, it must ensure
that it has an exit handler which joins with the critical threads, to ensure
that they have been properly shut down prior to process rundown.

answer written or last revised on ( 24-JUL-1998 )

