HP OpenVMS Systems: Ask the Wizard
The Question is:
I have seen two $GETJPI peculiarities, and would
appreciate any illumination you might be able to
provide... Please note that both issues seem to be
VMS features since they have been reproduced and/or
observed on a number of different VMS versions and
hardware platforms.
1.
SYS$GETJPI, when using the wildcard process lookup
capability (pidadr == &-1), will sometimes return
an unexpected SS$_NONEXPR during a "scan". If the lookup
loop is coded to exit on any returned errors during the
scan, this will cause it to abort prematurely. This
situation has been observed on VAX only so far, and has
also been known to return SS$_BADPARAM in the middle of
the scan. In both cases, the wildcard context appears to
be intact and can be used to continue the scan.
Circumstantial evidence seems to implicate an
SMISERVER subprocess as the origin of these errors.
This can be reproduced using (just as an outline):
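while (1) {
    int neg1 = -1;                     /* wildcard PID; restarts the scan */
    while ((ret=getjpi(&neg1)) & 1)    /* low bit set means success */
        ;
    if (ret!=SS$_NOMOREPROC)           /* anything else ended the scan early */
        die();
}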
And then from another machine, use SYSMAN to repeatedly
execute commands on the target node... use two different
SYSMAN sessions with two different usernames and load 'em
up using small rapidly completing commands... we've been
testing with 'do spawn/out=nl: show sys/noproc'.
The jpi flags used are NO_TARGET_INSWAP | IGNORE_TARGET_STATUS.
The only data item collected is the PID.
How should a scan loop be coded such that it properly
stops a scan for "fatal" errors, and appropriately
ignores these seemingly "bogus" errors?
2.
Why is it that SYS$GETJPI can return an error from the
SYS$GETJPI() call itself, *AND* deliver the specified
completion AST? This seems odd when compared with
the SYS$QIO() concept, where a success return indicates that
the operation was successfully initiated and the AST will
be delivered, while a SYS$QIO() failure indicates that
the operation was NOT initiated and the AST will not
be delivered. This would seem to indicate a misunderstanding
on my part. I would appreciate comments on how to determine
which system services that use completion ASTs conform to
the $GETJPI paradigm, and which to $QIO...
Thank you!
The Answer is:
Without seeing the actual code and the specifics of the failure, the Wizard is not in a position to provide specific suggestions, nor to comment on the behaviour and operation of the completion AST with various error severities. (Processes do, however, come and go with some regularity on OpenVMS systems.)

SS$_BADPARAM is generally a fatal error; note the "F" (fatal) severity in the message "%SYSTEM-F-BADPARAM, bad parameter value". SS$_NONEXPR is generally a recoverable error, hence the "W" (warning) severity in "%SYSTEM-W-NONEXPR, nonexistent process".

In the case of sys$getjpi() or sys$getjpiw(), the Wizard would attempt to continue the loop after a success, informational, or warning status code, and would take other steps to recover from error and fatal statuses.

Care must be taken with any asynchronous programming to correctly resynchronize upon completion. Care is also needed around the correct allocation of the IOSB data structure and of any variables potentially volatile over the lifetime of the outstanding call, and around the synchronization of any variables shared between the mainline and the AST.

The severities can be distinguished using the macros and constants defined in the C header file <stsdef.h>.
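The severity rule described above can be sketched in portable C as follows. This is only an illustration under stated assumptions, not the Wizard's code: the names SEV_*, ST_*, classify() and run_scan() are hypothetical, the severity constants mirror what <stsdef.h> provides as STS$M_SEVERITY and STS$K_*, and the status values are believed to match SS$_NORMAL, SS$_BADPARAM, SS$_NONEXPR and SS$_NOMOREPROC but should be taken from <ssdef.h> on a real system. An array of statuses stands in for repeated wildcard $GETJPI calls.

```c
/* Severity field of a condition value: the low three bits.
 * Assumption: these mirror STS$M_SEVERITY and STS$K_* from <stsdef.h>;
 * they are redefined under hypothetical names so the sketch builds anywhere. */
enum {
    SEV_MASK    = 0x7,
    SEV_WARNING = 0,
    SEV_SUCCESS = 1,
    SEV_ERROR   = 2,
    SEV_INFO    = 3,
    SEV_SEVERE  = 4
};

/* Illustrative status values (assumed equal to the SS$_ codes named in
 * the comments; on OpenVMS include <ssdef.h> instead of hardcoding). */
enum {
    ST_NORMAL     = 1,     /* SS$_NORMAL     */
    ST_BADPARAM   = 20,    /* SS$_BADPARAM   */
    ST_NONEXPR    = 2280,  /* SS$_NONEXPR    */
    ST_NOMOREPROC = 2472   /* SS$_NOMOREPROC */
};

typedef enum { SCAN_CONTINUE, SCAN_DONE, SCAN_ABORT } scan_action;

/* Decide what the scan loop should do with one returned status. */
static scan_action classify(unsigned int status)
{
    if (status == ST_NOMOREPROC)
        return SCAN_DONE;               /* normal end of a wildcard scan */

    switch (status & SEV_MASK) {
    case SEV_SUCCESS:
    case SEV_INFO:
    case SEV_WARNING:
        return SCAN_CONTINUE;           /* e.g. a process that just exited */
    default:
        return SCAN_ABORT;              /* error or severe (fatal) status */
    }
}

/* Walk a list of statuses as a stand-in for successive $GETJPI calls.
 * Counts the successful lookups in *found and reports why the scan ended;
 * SCAN_CONTINUE is returned if the list runs out before a terminal status. */
static scan_action run_scan(const unsigned int *statuses, int count, int *found)
{
    *found = 0;
    for (int i = 0; i < count; i++) {
        scan_action action = classify(statuses[i]);
        if (action != SCAN_CONTINUE)
            return action;
        if (statuses[i] & 1)            /* low bit set: data was returned */
            (*found)++;
    }
    return SCAN_CONTINUE;
}
```

With this rule a mid-scan SS$_NONEXPR is skipped and the scan carries on, while SS$_BADPARAM still aborts it because its severity is fatal. Whether a mid-scan SS$_BADPARAM should instead be retried is a separate decision that this sketch does not attempt to make.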