My problem has now been solved, thanks in large part to advice from
this list and Digital support.
Special thanks to George Michaelson <ggm_at_dstc.edu.au>, who had a
very similar problem with the same kind of hardware (AS800, KZPAA and
TZ88 drives); in fact, his solution was the one I used. It turned out
that it was the KZPAA that didn't get along with the TZ887 (possibly
in conjunction with the AS800).
The KZPAA has now been replaced by an ISP1040-based KZPBA-CA card and
everything just works great.
All replies are included at the bottom of this post.
My original post:
> I recently migrated all my NFS server functions from an AlphaServer
> 800, leaving only some internal disks and an external TZ887 loader.
> Prior to this operation, the machine had been doing local and network
> backups with Digital NSR 4.4 on the very DLT loader mentioned above
> without any problems whatsoever. The idea is to use this machine
> exclusively as a NetWorker server.
> 
> The problem is that this setup, after being stripped down, no longer
> functions properly when it comes to running backups. The loader
> device gets recognized properly and can be manipulated from within
> NetWorker as far as loading, labeling and mounting tapes is
> concerned, but trouble arrives as soon as more data than just a tape
> label has to be written, for instance when actually trying to run a
> backup.
> 
> What happens is that a few MB (or sometimes just a few KB) get
> written to tape and then all writing stops and everything hangs for a
> couple of minutes until NetWorker reports an I/O error and proceeds to
> load the next tape, which suffers from the same behaviour. It seems as
> if the SCSI bus or driver just hangs.
> 
> I have tried several combinations of cables, terminations, SCSI
> controllers and tapes, and I even went so far as to connect a
> standalone DLT4000 drive, which showed the same symptoms.
> 
> The machine was previously configured as follows:
> 
> AS800---+-- Internal ISP1020 -- RZ28 - RC1CB - RZ1CB - Fujitsu MAB3091
> 	|
> 	+-- External KZPAA (NCR810) -- TZ887 -- RZ29B -- RZ29B -- RZ29B
> 	|
> 	+-- External KZPAA (NCR810) -- RZ29B -- RZ29B -- RZ29B
> 	|
> 	+-- External KZPAA (NCR810) -- RZ29B -- RZ29B -- RZ29B
> 
> All external RZ29s were in StorageWorks shelves. All disks except the
> RZ28 (the system disk) were used in LSM stripes. Yes, I know that the
> NCR810 sucks big time. Now, it looks like:
> 
> AS800---+-- Internal ISP1020 -- RZ28 - RC1CB - RZ1CB
> 	|
> 	+-- External KZPAA (NCR810) -- TZ887 -- Terminator
> 	|
> 	+-- External KZPAA (NCR810) -- Terminator
> 	|
> 	+-- External KZPAA (NCR810) -- Terminator
> 
> The operating system is Digital Unix 4.0D, patch kit #1. This is the
> uerf log from the I/O error business:
> 
> ********************************* ENTRY 20. *********************************
> 
> ----- EVENT INFORMATION -----
> 
> EVENT CLASS                             ERROR EVENT 
> OS EVENT TYPE                  199.     CAM SCSI 
> SEQUENCE NUMBER                  2.
> OPERATING SYSTEM                        DEC OSF/1 
> OCCURRED/LOGGED ON                      Fri Aug 28 19:35:24 1998
> OCCURRED ON SYSTEM                      dylan 
> SYSTEM ID                 x0007001B
> SYSTYPE                   x00000000
> 
> ----- UNIT INFORMATION -----
> 
> CLASS                         x0022     DEC SIM 
> SUBSYSTEM                     x0000     DISK 
> BUS #                         x0002
>                               x00A8     LUN x0
>                                         TARGET x5
> 
> ********************************* ENTRY 21. *********************************
> 
> ----- EVENT INFORMATION -----
> 
> EVENT CLASS                             ERROR EVENT 
> OS EVENT TYPE                  199.     CAM SCSI 
> SEQUENCE NUMBER                  3.
> OPERATING SYSTEM                        DEC OSF/1 
> OCCURRED/LOGGED ON                      Fri Aug 28 19:35:24 1998
> OCCURRED ON SYSTEM                      dylan 
> SYSTEM ID                 x0007001B
> SYSTYPE                   x00000000
> 
> ----- UNIT INFORMATION -----
> 
> CLASS                         x0022     DEC SIM 
> SUBSYSTEM                     x0000     DISK 
> BUS #                         x0002
>                               x00A8     LUN x0
>                                         TARGET x5
> 
> ********************************* ENTRY 22. *********************************
> 
> ----- EVENT INFORMATION -----
> 
> EVENT CLASS                             ERROR EVENT 
> OS EVENT TYPE                  199.     CAM SCSI 
> SEQUENCE NUMBER                  4.
> OPERATING SYSTEM                        DEC OSF/1 
> OCCURRED/LOGGED ON                      Fri Aug 28 19:35:24 1998
> OCCURRED ON SYSTEM                      dylan 
> SYSTEM ID                 x0007001B
> SYSTYPE                   x00000000
> 
> ----- UNIT INFORMATION -----
> 
> CLASS                         x0001     TAPE 
> SUBSYSTEM                     x0000     DISK 
> BUS #                         x0002
>                               x00A8     LUN x0
>                                         TARGET x5
> 
> 
> Any suggestions?
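
For anyone wading through similar uerf output, here is a rough Python
sketch (not part of the original exchange) that tallies CAM SCSI
entries per bus/target/LUN. It assumes the layout shown in the excerpt
above, optionally with "> " quote prefixes; the names parse_uerf and
count_by_device are made up for this illustration, and other uerf
versions or report modes may well format things differently.

```python
import re
from collections import Counter

# Trimmed sample in the same layout as the quoted log above.
SAMPLE = """\
> ********************************* ENTRY 20. *********************************
> OS EVENT TYPE                  199.     CAM SCSI
> CLASS                         x0022     DEC SIM
> BUS #                         x0002
>                               x00A8     LUN x0
>                                         TARGET x5
> ********************************* ENTRY 22. *********************************
> OS EVENT TYPE                  199.     CAM SCSI
> CLASS                         x0001     TAPE
> BUS #                         x0002
>                               x00A8     LUN x0
>                                         TARGET x5
"""

ENTRY_RE = re.compile(r"\*+ ENTRY\s+(\d+)\. \*+")
FIELD_RES = {
    # Anchored at line start so "EVENT CLASS" is not picked up.
    "class": re.compile(r"^CLASS\s+x[0-9A-Fa-f]+\s+(.+?)\s*$"),
    "bus": re.compile(r"^BUS #\s+x([0-9A-Fa-f]+)"),
    "lun": re.compile(r"LUN x([0-9A-Fa-f]+)"),
    "target": re.compile(r"TARGET x([0-9A-Fa-f]+)"),
}


def parse_uerf(text):
    """Return one dict per ENTRY block with class, bus, target and lun."""
    entries = []
    current = None
    for raw in text.splitlines():
        # Drop an optional mail quote prefix and surrounding whitespace.
        line = re.sub(r"^>\s?", "", raw).strip()
        match = ENTRY_RE.search(line)
        if match:
            current = {"entry": int(match.group(1))}
            entries.append(current)
            continue
        if current is None:
            continue
        for name, rx in FIELD_RES.items():
            found = rx.search(line)
            if found:
                value = found.group(1)
                # Hex fields become integers; the class name stays text.
                current[name] = value if name == "class" else int(value, 16)
    return entries


def count_by_device(entries):
    """Count entries per (bus, target, lun) tuple."""
    return Counter(
        (e.get("bus"), e.get("target"), e.get("lun")) for e in entries
    )


if __name__ == "__main__":
    for device, count in count_by_device(parse_uerf(SAMPLE)).items():
        print(device, count)
```

If every error lands on the same bus and target, as in my log, that
at least narrows the search to one controller and one device.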
---
>From: alan_at_nabeth.cxo.dec.com
>Subject: Re: Mysterious SCSI problems (ADDENDUM)
>To: Erik Persson <erik_at_ikp.liu.se>
>Date: Fri, 28 Aug 98 13:33:42 -0600
>
>
>	 They're SCSI protocol errors.  Check the cables, connections
>	 and termination again.
---
>From: George Michaelson <ggm_at_dstc.edu.au>
>Subject: tape failure
>To: erik_at_ikp.liu.se
>Date: Mon, 31 Aug 1998 11:09:22 +1000 (EST)
>
>
>we never got it to work. We forced DEC to swap the KZP-AA for a DA or
>similar, and upgraded to fast/wide tapedrives.
>
>There is a firmware level problem and/or a data rate mismatch which
>DEC can't fix and seem to have trouble admitting to.
>
>so you need fast/wide/differential scsi on the tape.
>
>-george
---
>From: George Michaelson <ggm_at_dstc.edu.au>
>Subject: Re: tape failure
>To: Erik Persson <erik_at_ikp.liu.se>
>Date: Tue, 01 Sep 1998 08:54:38 +1000
>
>
>there were substantial differences. We were on an 800 5/333 with
>hardware RAID in a BA enclosure, matched 9Gb spindles. the tapes
>were TZ87 and then TZ88 on the KZPAA.
>
>we found that dumps worked very very intermittently, and then failed
>almost trivially. any tape operation streaming data flooded things.
>
>We were loaned a differential scsi controller and a single TZ89 which
>worked, and then we persuaded DEC (who specced the original system)
>to replace the TZ87's with a pair of TZ88's on fast/wide and a
>suitable controller. the problem went away.
---
-- 
Erik Persson, System Manager            e-mail: erik_at_ikp.liu.se
Dept. of Mech. Engineering              Voice: +46 13 28 2464
University of Linköping, Sweden		Fax:   +46 13 21 2717
Received on Tue Sep 01 1998 - 15:01:05 NZST