Dear All
     
     My original question was a general one about how async IO differs from 
     sync IO, along with a more specific one as to whether it is a viable 
     alternative to using multiple database writers with Oracle v7.
     
     Many thanks for the authoritative replies from David 
     (sxdjd_at_java.sois.alaska.edu), John Haskall and Lars Bro which I have 
     included in full below.
     
     On the Oracle front, there seemed to be some doubt as to whether async 
     IO was supported with file systems rather than raw partitions.  Oracle 
     assured us it would work with DU v4, so we tried it and it does!  We 
     have 4.0B with patch kit #6, Oracle 7.3.4.0.0 and data on AdvFS.
     
     Many thanks again
     
     Nick.
     
     Async IO info follows:
     
     =============================================================
     From Lars Bro
     
     1) With synchronous I/O from a file system, you allocate a buffer in 
     the user process, then trap into the operating system. If the data you 
     want is already present in operating system buffers, you will just 
     copy out into your buffer what you need. Otherwise, the executing 
     thread will order the data from the device driver and go to sleep. 
     When the device driver receives the interrupt from the device, it will 
     place the data read into the operating system buffers and wake up your 
     thread.
     
     2) Much the same goes on during asynchronous I/O. You make another trap 
     into the operating system that just orders the read from the device 
     driver. The calling thread returns to user space immediately. When the 
     data arrives in the kernel buffer, it is copied out into the buffer 
     in the user process and the thread is notified by a signal.
     
     3) If you do "raw" I/O instead of filesystem I/O, you trap into the 
     kernel and immediately tell the device driver to fetch the blocks you 
     want. However, in the block header, you put the address of your user 
     buffer instead of a normal kernel cache buffer. This requires that 
     your user buffer does not move around (paging). I think that the 
     kernel locks the buffer into memory while the I/O is ongoing.
     
     
     !!! Please note that you might either have a process with MULTIPLE 
     THREADS and synchronous I/O or a process with a SINGLE THREAD and 
     asynchronous I/O. The latter is the 'original' Unix way of thinking 
     and what I guess Oracle still does.
     
     
     The kernel has to keep a record of each outstanding asynchronous I/O 
     call since it must know what thread to notify.
     
     The following limits apply to asynchronous I/O:
     
     aio-max-num                # of outstanding I/Os in database 
     aio-max-percent            % of memory to use for async I/O database
     aio-task-max-num           # of outstanding I/Os per process
     sigqueue-max-num           # of outstanding notifications per process
     aio-max-retry              # of times to retry locking a user buffer
     aio-retry-scale            how to increase time between retries
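
     The attributes above are Digital UNIX kernel tunables. From inside a 
     program, the portable view is the set of POSIX limits that sysconf 
     reports. The sketch below prints three of them; pairing them with the 
     DU attributes is only a loose analogy and an assumption here, and the 
     helper names are hypothetical.

```c
#include <stdio.h>
#include <unistd.h>

/* Print one limit. sysconf() returns -1 when the system reports no fixed
 * limit (or does not support the option), so that case is shown separately. */
static void show_limit(const char *label, int name)
{
    long value = sysconf(name);

    if (value == -1)
        printf("%-20s no fixed limit reported\n", label);
    else
        printf("%-20s %ld\n", label, value);
}

/* Hypothetical helper: print the POSIX limits related to async I/O.
 * Returns the number of limits queried. */
int print_aio_limits(void)
{
    show_limit("_SC_AIO_MAX", _SC_AIO_MAX);               /* outstanding aio requests */
    show_limit("_SC_AIO_LISTIO_MAX", _SC_AIO_LISTIO_MAX); /* requests per lio_listio call */
    show_limit("_SC_SIGQUEUE_MAX", _SC_SIGQUEUE_MAX);     /* queued signals per process */
    return 3;
}
```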
     
     
     Yours,
     Lars Bro (lbro_at_dscc.dk)
     
     =============================================================
     From John Haskall
     
     Standard read/write are synchronous -- the function call does not 
     return until the I/O is done, i.e. your process waits around for it 
     to complete (while it could perhaps be doing something else).  
     Basically, this means any process pretty much has a max of 1 I/O in 
     progress at once. Async IO lets you do something else (including 
     queueing another async IO) while your I/O completes.
     
     You can find out about async io under these man pages:
     
     aio_cancel, aio_error, aio_fsync, aio_read, aio_return, aio_suspend, 
     aio_write, lio_listio
     
     Basically you call aio_read or aio_write instead of read or write. You 
     can arrange to have a signal sent when an aio completes. You can check 
     the status of an aio (either at your convenience or after getting the 
     signal) with aio_error -- if it says EINPROGRESS it hasn't finished; if 
     it says some other error, the read/write failed.  If it says '0' (OK), 
     then you can call aio_return to get the bytes transferred and return 
     the aio control block to the pool for reuse.
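
     The sequence described above might look roughly like this in C. This 
     is a minimal sketch rather than production code: the helper name 
     read_async_polling is made up for the example, and the loop simply 
     spins where a real program would do other work between checks.

```c
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to len bytes from the start of path using
 * aio_read, polling aio_error until the request is no longer in progress.
 * Returns the number of bytes read, or -1 on failure. */
ssize_t read_async_polling(const char *path, void *buf, size_t len)
{
    struct aiocb cb;
    ssize_t n;
    int fd;

    fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    /* The control block describes the request: which file, where, how much. */
    memset(&cb, 0, sizeof cb);
    cb.aio_fildes = fd;
    cb.aio_buf = buf;
    cb.aio_nbytes = len;
    cb.aio_offset = 0;

    if (aio_read(&cb) == -1) {      /* request could not be queued */
        close(fd);
        return -1;
    }

    /* aio_read has already returned; check on the request at our convenience. */
    while (aio_error(&cb) == EINPROGRESS) {
        /* a real program would do useful work here instead of spinning */
    }

    /* Collect the final status exactly once: the byte count on success,
     * or -1 if the read failed. */
    n = aio_return(&cb);

    close(fd);
    return n;
}
```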
     
     aio_suspend says wait until some aio completes.
     aio_cancel does what you think.
     aio_fsync makes sure all I/O previous to it is on disk.
     lio_listio lets you do a bunch of aio_reads/writes at once.
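
     For illustration, lio_listio and aio_suspend can be combined as in the 
     sketch below, which queues two writes in one call and then sleeps 
     until both are done. The helper name write_two_blocks and the fixed 
     count of two requests are assumptions chosen to keep the example short.

```c
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write two len-byte blocks, a then b, back to back
 * into path with a single lio_listio call. Returns total bytes or -1. */
ssize_t write_two_blocks(const char *path, const char *a, const char *b,
                         size_t len)
{
    struct aiocb cbs[2];
    struct aiocb *list[2] = { &cbs[0], &cbs[1] };
    const struct aiocb *pending[2];
    ssize_t total = 0;
    int fd, i, busy, failed = 0;

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return -1;

    memset(cbs, 0, sizeof cbs);
    for (i = 0; i < 2; i++) {
        cbs[i].aio_fildes = fd;
        cbs[i].aio_buf = (void *)(i == 0 ? a : b);
        cbs[i].aio_nbytes = len;
        cbs[i].aio_offset = (off_t)i * (off_t)len;
        cbs[i].aio_lio_opcode = LIO_WRITE;
    }

    /* Queue both requests at once. On EAGAIN or EIO some requests may still
     * have been queued, so only other errors are treated as "nothing queued". */
    if (lio_listio(LIO_NOWAIT, list, 2, NULL) == -1 &&
        errno != EAGAIN && errno != EIO) {
        close(fd);
        return -1;
    }

    /* Sleep in aio_suspend until no request is still in progress. */
    for (;;) {
        busy = 0;
        for (i = 0; i < 2; i++) {
            pending[i] = (aio_error(&cbs[i]) == EINPROGRESS) ? &cbs[i] : NULL;
            if (pending[i] != NULL)
                busy++;
        }
        if (busy == 0)
            break;
        aio_suspend(pending, 2, NULL);  /* NULL entries are ignored */
    }

    /* Collect each result exactly once. */
    for (i = 0; i < 2; i++) {
        ssize_t n = aio_return(&cbs[i]);
        if (n < 0)
            failed = 1;
        else
            total += n;
    }

    close(fd);
    return failed ? -1 : total;
}
```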
     
     John
     
     =============================================================
Received on Mon Jun 01 1998 - 16:50:40 NZST