Thanks to Frederik (frpa01_at_shb.se), who was the only one who managed 
to identify the problem.  Well done.  The problem was rather serious.
Here follows a description of my problem and then the solution:
I have a customer with an Alpha 8200 100MBit Card (tu2).  It has been 
very simply configured.  It is basically just a dumb host on the 
network - and it should only speak when it gets spoken to.  This 
system has approx. 200 users spread geographically across the 
country.  However, every few days one particular site loses its 
connection to the 8200 (running Sybase): neither ping nor telnet 
works.  When things are working correctly the output of netstat -r 
is as follows:
Routing tables
Destination      Gateway            Flags     Refs     Use  Interface
Netmasks:
Inet             255.0.0.0          
Inet             255.255.255.0      
Route Tree for Protocol Family 2:
default          gateway            UG         53  1236620  tu2
localhost        localhost          UH          1        0  lo0
196.6.175        gateway            UG          5     8011  tu2
200.1.1          dec8200            U          85   355411  tu2
Note:  Neither routed nor gated is running (not necessary; I think 
they would just complicate matters).
A copy of /etc/routes is as follows:
default  200.1.1.254   #gateway
When that remote site loses its network connection the output of 
netstat -r is as follows:
Routing tables
Destination      Gateway            Flags     Refs     Use  Interface
Netmasks:
Inet             255.255.255.0      
Route Tree for Protocol Family 2:
default          gateway            UG         67   677265  tu2
localhost        localhost          UH          1        0  lo0
196.6.175        firewall           UGD         2     2805  tu2
# the above line is the weird entry suggesting that the traffic has 
# been dynamically redirected through this firewall
COMroute         gateway            UGH         0       44  tu2
200.1.1          dec8200            U         101    92532  tu2
What I did after this was remove all traces and entries of the 
firewall from the /etc/hosts and any other file that might have had 
reference to it.
I must mention that this system is not on the internet and should not 
use DNS at all.  However, someone did run (before this problem 
happened) #bindsetup on this system and attempted to set this system as a 
client on the DNS.  Since then management have decided otherwise, and all traces 
of bind have been removed (I hope).
Anyway, we solve the problem on the fly by adding a route to that specific 
network segment, i.e. # route add -net xxx.xxx.xxx gateway xxx.xxx.xxx.xxx
Sometimes we also have to restart the network, i.e. # rcinet restart, 
for it to take effect.
The situation now is that the Windows NT DNS and Firewall 
administrators point fingers at the 8200 and say that there is 
probably still a switch which points to the DNS to do name resolving, 
which I think is a load of B-S. They say this is what is causing the 
redirection of the network traffic via the firewall instead of the 
default router/gateway.  I in turn point fingers at the NT DNS and I 
asked them to remove all DNS records from their NT DNS and firewall 
to do with the 8200.  Whether this will solve the problem I don't know.  
What are your views on this? 
What factors could possibly be causing this redirection of traffic from the default 
router to the Firewall?
Is it possible for DNS config. information to still be sitting on the 
8200?  Typically what files should I search for that could affect 
this?
Here is another weird sample of netstat -r (after removing the 
firewall 200.1.1.14 from /etc/hosts):
Routing tables
Destination      Gateway            Flags     Refs     Use  Interface
Netmasks:
Inet             255.0.0.0          
Inet             255.255.255.0      
Route Tree for Protocol Family 2:
default          gateway            UG         52  1147204  tu2
localhost        localhost          UH          1        0  lo0
196.6.175        200.1.1.14         UGM         5    45439  tu2
200.1.1          dec8200            U         128  4075124  tu2
Regards
Paulo
Frederik's reply, which was exactly the solution to my problem:
==========================================
On my network there is, like on yours, a default gateway and
a firewall. In the gateway there is/was a default route, with a very
high cost/metric/preference, pointing to the firewall. This means
that if the gateway gets a request for a connection to a network that
is not within its own routing tables, i.e. a network outside your
own LAN/WAN, it sends the traffic through the firewall (the network
is 'out there').
Every once in a while there is a glitch on the line to a remote site.
The gateway updates its routing tables, or gets a request for this
off-line network and finds it unreachable, and sends a redirect to
the server saying 'use the firewall' to get to this network.
I did a quick-and-dirty workaround to solve this problem, a shell 
script that removes all routes not wanted. I let cron run this every
5 minutes. As I said, ugly but it works.
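The cleanup script itself was not posted. A minimal sketch of the idea might
look like the following, assuming that unwanted routes are the ones whose
Flags column in netstat -r contains D (created by a redirect) or M (modified
by a redirect), as in the UGD and UGM entries shown earlier. The function
names are made up for illustration, and the route delete syntax is an
assumption modelled on the route add -net form used above, so the sketch only
prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: print (do not run) a delete command for each redirected route.
# Both functions read `netstat -r` style output on standard input.

# Emit "destination gateway flags" for every route whose Flags column
# contains D (created by a redirect) or M (modified by a redirect).
redirected_routes() {
    awk 'NF >= 6 && $3 ~ /^U/ && $3 ~ /[DM]/ { print $1, $2, $3 }'
}

# Turn each redirected route into a `route delete` command line.
# ASSUMPTION: the delete syntax mirrors `route add -net`; check route(8)
# on your own system before executing any of the output.
delete_commands() {
    redirected_routes | while read dest gw flags; do
        case "$flags" in
            *H*) kind=-host ;;   # host route
            *)   kind=-net ;;    # network route
        esac
        echo "route delete $kind $dest $gw"
    done
}
```

With the functions loaded, netstat -rn | delete_commands lists what would be
removed (the -n option keeps addresses numeric, which avoids depending on
/etc/hosts). Once the output has been checked by hand it could be piped to sh
and the whole thing run from cron, for example with a minute field of
0,5,10,15,20,25,30,35,40,45,50,55 and a script path of your own choosing.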
Later on we did a redesign of our router network and removed the
default route within the gateway. Problem solved permanently.
So, what it all boiled down to was routing (OSPF) timers and
default routes in the WAN.
BTW, I don't think DNS has anything to do with it. Just to make 
sure, check /etc/svc.conf and make sure the entry for hosts=
doesn't include bind and that /etc/resolv.conf is empty or that
the nameserver entry points to where you want it to. Best of all,
remove /etc/resolv.conf entirely.
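The two checks above can be sketched as small shell functions. This is only
an illustration: the function names are invented here, and the sketch assumes
the usual one-entry-per-line layout (hosts=local,bind in svc.conf, nameserver
lines in resolv.conf).

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# Succeeds (exit 0) if the hosts= line of an svc.conf-style file lists bind.
hosts_uses_bind() {
    grep '^hosts=' "$1" | grep bind >/dev/null
}

# Succeeds if the given resolv.conf-style file is missing or contains
# no nameserver entries.
no_nameservers() {
    if [ -f "$1" ] && grep '^nameserver' "$1" >/dev/null; then
        return 1
    fi
    return 0
}
```

For example, hosts_uses_bind /etc/svc.conf && echo "bind still in use"
reports a leftover bind entry, and no_nameservers /etc/resolv.conf || echo
"nameserver still configured" does the same for the resolver file.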
===========================================
To add to what Frederik said:
We managed to solve the problem by configuring the remote router (at 
the client side) to direct traffic only via the router/gateway at 
the 8200 (server) side.
Thanks once again.
Paulo
Received on Fri Feb 28 1997 - 17:25:22 NZDT