HP OpenVMS Systems

ask the wizard

sys$getjpi, condition (error) codes, and ASTs?


The Question is:

 
I have seen two $GETJPI peculiarities, and would
appreciate any illumination you might be able to
provide... Please note that both issues seem to be
VMS features since they have been reproduced and/or
observed on a number of different VMS versions and
hardware platforms.
 
1.
SYS$GETJPI, when using the wildcard process lookup
capability (pidadr pointing to a longword that holds -1),
will sometimes return an unexpected SS$_NONEXPR during a
"scan". If the lookup
loop is coded to exit on any returned errors during the
scan, this will cause it to abort prematurely. This
situation has been observed on VAX only so far, and has
also been known to return SS$_BADPARAM in the middle of
the scan. In both cases, the wildcard context appears to
be intact and can be used to continue the scan.
Circumstantial evidence seems to implicate an
SMISERVER subprocess as the origin of these errors.
 
This can be reproduced using (just as an outline):
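   int ret;
   while (1) {
     int neg1 = -1;                      /* fresh wildcard context per scan */
     while ((ret = getjpi(&neg1)) & 1)   /* low bit set: success, continue */
        ;
     if (ret != SS$_NOMOREPROC)          /* anything but normal end of scan */
        die();
   }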
And then from another machine, use SYSMAN to repeatedly
execute commands on the target node... use two different
SYSMAN sessions with two different usernames and load 'em
up using small rapidly completing commands... we've been
testing with 'do spawn/out=nl: show sys/noproc'.
 
The jpi flags used are NO_TARGET_INSWAP | IGNORE_TARGET_STATUS.
The only data item collected is the PID.
 
How should a scan loop be coded such that it properly
stops a scan for "fatal" errors, and appropriately
ignores these seemingly "bogus" errors?
 
2.
Why is it that SYS$GETJPI can return an error from the
SYS$GETJPI() call itself, *AND* deliver the specified
completion AST? This seems odd when compared with
the SYS$QIO() concept of a success return indicating that
the operation was successfully initiated and the AST will
be delivered as opposed to a SYS$QIO() failure indicating
the operation was NOT initiated and the AST should not
be delivered? This would seem to indicate a misunderstanding
on my part. I would appreciate comments on how to determine
which system services that use completion ASTs conform to
the $GETJPI paradigm, and which to $QIO...
 
 
Thank you!

The Answer is :

 
  Without seeing the actual code and the specifics of the failure, the
  Wizard is not in a position to offer specific suggestions, nor to
  comment on the behaviour of the completion AST with the various
  error severities.  (Processes do, however, come and go with some
  regularity on OpenVMS systems, so a process can disappear in the
  middle of a wildcard scan.)
 
  SS$_BADPARAM is generally a fatal error -- see the use of the "FATAL"
  severity in the message: "%SYSTEM-F-BADPARAM, bad parameter value"
 
  SS$_NONEXPR is generally a recoverable error, hence the use of the
  "WARNING" severity: "%SYSTEM-W-NONEXPR, nonexistent process".
 
  In the case of sys$getjpi() or sys$getjpiw(), the Wizard would attempt
  to continue the loop on a success, informational, or warning status
  code, and would take other steps to recover from errors and fatal errors.
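 
  The policy above can be sketched as follows.  This is a minimal,
  portable illustration and not the Wizard's code: the ST_* and SEV_*
  names are local stand-ins for the SS$_* and STS$K_* symbols (the
  numeric values are illustrative and should be taken from <ssdef.h>
  and <stsdef.h> on a real system), and scan(), classify() and
  replay_next() are hypothetical helpers.  Because SS$_NOMOREPROC itself
  has WARNING severity, it is tested first, so that the normal end of
  the scan is not mistaken for a recoverable warning.

```c
#include <stddef.h>

/* Stand-ins for the STS$K_* severities and a few SS$_* codes (assumed). */
enum { SEV_WARNING = 0, SEV_SUCCESS = 1, SEV_ERROR = 2,
       SEV_INFO = 3, SEV_SEVERE = 4 };
enum { ST_NORMAL = 1, ST_BADPARAM = 20,
       ST_NONEXPR = 2280, ST_NOMOREPROC = 2472 };

typedef enum { SCAN_CONTINUE, SCAN_DONE, SCAN_ABORT } scan_action;

/* Decide what the scan loop should do with one returned status. */
scan_action classify(unsigned int status)
{
    if (status == ST_NOMOREPROC)       /* normal end of the wildcard scan */
        return SCAN_DONE;
    switch (status & 7u) {             /* severity: low three bits */
    case SEV_SUCCESS:
    case SEV_INFO:
    case SEV_WARNING:                  /* e.g. process vanished mid-scan */
        return SCAN_CONTINUE;
    default:                           /* error or severe: leave the loop */
        return SCAN_ABORT;
    }
}

/* Hypothetical driver: `next` stands in for a wrapper around the
   wildcard sys$getjpiw() call that reuses one context between calls.
   Returns the status that ended the scan; *seen counts successes. */
unsigned int scan(unsigned int (*next)(void *ctx), void *ctx, int *seen)
{
    for (;;) {
        unsigned int st = next(ctx);
        if (classify(st) != SCAN_CONTINUE)
            return st;
        if (st & 1u)                   /* count only successful lookups */
            (*seen)++;
    }
}

/* Test double: replays a fixed list of statuses instead of calling VMS. */
struct replay { const unsigned int *codes; size_t pos; };

unsigned int replay_next(void *ctx)
{
    struct replay *r = ctx;
    return r->codes[r->pos++];
}
```

  With this arrangement a stray SS$_NONEXPR in the middle of a scan is
  skipped, while a status of ERROR or FATAL severity leaves the loop and
  is handed back to the caller, which can then decide how to recover.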
 
  Care must be taken with any asynchronous programming: to correctly
  resynchronize upon completion, to correctly allocate the IOSB data
  structure and any variables that are potentially volatile over the
  lifetime of the outstanding call, and to synchronize any variables
  shared between the mainline and the AST.
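 
  As a rough illustration of these points, the sketch below keeps
  everything an outstanding request touches in one long-lived object and
  lets the mainline read the IOSB only after the completion routine has
  run.  It is a portable simulation, not OpenVMS code: fake_service(),
  completion_ast() and struct request are hypothetical, and the IOSB
  layout (final status in the first longword, second longword reserved)
  is an assumption to be checked against the $GETJPI documentation.

```c
#include <signal.h>

/* Assumed IOSB layout: final completion status, then a reserved longword. */
struct iosb { unsigned int status; unsigned int reserved; };

/* All storage used by the outstanding call lives in one object that must
   outlive the call (static or heap, not a stack frame that may return
   before the completion routine fires). */
struct request {
    volatile struct iosb iosb;
    volatile sig_atomic_t done;    /* written by the AST, read by mainline */
    unsigned int pid;              /* output buffer for the data item */
};

/* Plays the role of the completion AST; it only flags completion. */
void completion_ast(struct request *rq)
{
    rq->done = 1;
}

/* Hypothetical stand-in for the system service: stores the final status
   in the IOSB and then delivers the completion routine. */
unsigned int fake_service(struct request *rq, unsigned int final_status)
{
    rq->done = 0;
    rq->iosb.status = 0;           /* zero while the request is pending */
    rq->iosb.status = final_status;
    completion_ast(rq);
    return 1;                      /* request was accepted */
}

/* Mainline check: trust the IOSB only once completion has been flagged. */
int request_succeeded(const struct request *rq)
{
    return rq->done && (rq->iosb.status & 1u);
}
```

  The completion routine deliberately does nothing but set the flag, so
  the only data shared between the two contexts are the flag and the
  IOSB, both of which are declared volatile.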
 
  The severity of a status code can be determined using the macros
  and constants that are located in the C header file <stsdef.h>.
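 
  For illustration, the severity occupies the low three bits of a
  condition value, and the low bit alone distinguishes success from
  failure.  The sketch below decodes it in portable C; the SEV_* names
  are local stand-ins for the STS$K_* constants, and severity_of() and
  severity_letter() are hypothetical helpers.  On OpenVMS the constants
  and macros from <stsdef.h> should be used instead.

```c
/* Local stand-ins for the STS$K_* severity constants (assumed values). */
enum severity {
    SEV_WARNING = 0, SEV_SUCCESS = 1, SEV_ERROR = 2,
    SEV_INFO = 3, SEV_SEVERE = 4
};

/* The severity is the low three bits of a condition value. */
enum severity severity_of(unsigned int status)
{
    return (enum severity)(status & 0x7u);
}

/* Letter shown in the message prefix, e.g. the W in %SYSTEM-W-NONEXPR. */
char severity_letter(unsigned int status)
{
    switch (severity_of(status)) {
    case SEV_WARNING: return 'W';
    case SEV_SUCCESS: return 'S';
    case SEV_ERROR:   return 'E';
    case SEV_INFO:    return 'I';
    case SEV_SEVERE:  return 'F';
    default:          return '?';   /* values 5 to 7 are not decoded here */
    }
}
```

  Assuming the usual values of 2280 for SS$_NONEXPR and 20 for
  SS$_BADPARAM, severity_letter() yields W and F respectively, which
  matches the message texts quoted above.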
 

  
     
     answer written or last revised on ( 7-AUG-1998 )