Assigning Domain Server Breaks rlogin

herot · May 5, 2012, 9:29pm

Most of my Unix servers do not have access to the internet. We have a test box that I want to use to receive all root email from the other Unix boxes locally. I then want the test box to .forward all these emails over the internet to me. I can give the test box a DNS server and it can access the internet just fine. However, when I do so, rlogin (which previously worked fine) stops working. It gives some sort of "ssword mismatch" problem. When I remove the entries from resolv.conf and netsvc.conf it starts working again... What is going on here?

bakunin · May 7, 2012, 12:34am

I suppose you have DNS in your local network and its answers conflict with the DNS database on the internet. Remove this conflict and it should work.

Just as an aside: you might want to reconsider using rlogin (along with telnet, ftp and other unencrypted protocols) because of its lack of security. Use ssh (sftp/scp) instead.

I hope this helps.

bakunin

herot · May 7, 2012, 7:32am

The only way the Unix boxes know names is through the hosts files. The servers are on an isolated network (and don't have DNS servers specified in basic network setup), and the backup server that I am trying to send all the mail through is dual-homed on that network and on the other "internet network". Is there an easy way to tell if we have a DNS server on our network? I do not think we do. I think that everything is Windows workgroup and hosts files... Also, could you provide a more in-depth explanation of the DNS conflict you speak of? Thanks.

bakunin · May 7, 2012, 8:16am

With respect to your problem, hosts files and DNS amount to the same thing:

Host name resolution (regardless of whether it is done with an /etc/hosts file, DNS or whatever) basically means specifying a hostname and getting a distinct IP address back. If your local host name resolution answers for host A with a certain IP address and your dual-homed server gets a different answer, then it might not be able to address the other hosts correctly any more.
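To make such a mismatch concrete, here is a minimal Python sketch that compares the answers of two resolution sources. The hosts-file text, the host names and the `dns_answers` mapping are hypothetical sample data, not taken from any machine in this thread; in practice the second mapping would have to be filled from real name server queries.

```python
def parse_hosts(text):
    """Map each hostname or alias in hosts-file text to its IP address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), ip)  # first entry for a name wins
    return table


def find_conflicts(hosts_table, dns_answers):
    """Return names that both sources know but resolve to different addresses."""
    conflicts = {}
    for name, ip in hosts_table.items():
        other = dns_answers.get(name)
        if other is not None and other != ip:
            conflicts[name] = (ip, other)
    return conflicts


# Hypothetical sample data, for illustration only.
SAMPLE_HOSTS = """
192.0.0.22   hosta
192.0.0.60   hostb   # dual-homed
"""
SAMPLE_DNS = {"hosta": "203.0.113.7", "hostb": "192.0.0.60"}

if __name__ == "__main__":
    print(find_conflicts(parse_hosts(SAMPLE_HOSTS), SAMPLE_DNS))
```

Any name reported here is one where the order in which the sources are consulted decides which address a program ends up with.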

Still, this is speculation and in fact i (we, the readers here) don't know enough of your situation to efficiently help you. Time to correct this.

Please tell us about your setup: how the hosts communicate, which OS version (i suppose AIX, because you post here), etc. Is IP forwarding switched on or off on the multihomed system? What are the relevant parts of your /etc/hosts file? What are the contents of /etc/resolv.conf and /etc/netsvc.conf? Is netcd (the network caching daemon) active or not?

I hope this helps.

bakunin

herot · May 7, 2012, 9:52am

Ok. I will try to give a more detailed description of my network.

(I am aware of the public ip addressing problem. I will fix this someday soon). I inherited this network.

We have a 192.0.0.0 network. It has no gateway to the internet.
We have a 192.0.10.0 network. It has 2 gateways to the internet (one for users and one for servers, to keep the bandwidth separate).

There are a few machines on the 192.0.0.0 network (also dual-homed to 192.0.10.0) that can access the internet. They don't route packets between 192.0.10.0 and 192.0.0.0 (you can ping their 0.0 interface but not their 10.0 interface). There aren't many machines on the 0.0 network and they are all servers of some sort. The servers I am focusing on right now are 2 SCO boxes and 2 AIX boxes. None of these servers have a netcd process running.

SCOally! -->#cat /etc/resolv.conf <-ip address = 192.0.0.20
nameserver 192.0.0.20 <- server has no netsvc.conf
nameserver 192.0.0.22
hostresorder local bind
search nesdi.com

SCOissy! -->#cat /etc/resolv.conf <-ip address = 192.0.0.22
nameserver 192.0.0.2(powered off) <- server has no netsvc.conf
nameserver 192.0.0.1(no purpose now)
hostresorder local bind
search nesdi.com

AIXbddy# <-NO resolv.conf file. <-ip address = 192.0.0.60
<-ip address = 192.0.10.160
<-netsvc.conf default

AIXbackup# <-NO resolv.conf file. <-ip address = 192.0.0.55
<-ip address = 192.0.10.155
<-netsvc.conf default

All the above servers have each other specified in /etc/hosts. None of the above servers have anything set up to point to a DNS server.
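For anyone who wants to check files like the ones listed above, here is a minimal Python sketch that pulls the nameserver, search and hostresorder entries out of resolv.conf text. The sample text is copied from the first listing with the "<-" annotations removed; the function name `parse_resolv_conf` is made up for this example, and the parser only handles the three keywords shown here.

```python
def parse_resolv_conf(text):
    """Collect nameserver, search and hostresorder entries from resolv.conf text."""
    settings = {"nameserver": [], "search": [], "hostresorder": []}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # ignore comments and blank lines
        if not line:
            continue
        keyword, *values = line.split()
        if keyword in settings:
            settings[keyword].extend(values)
    return settings


# Copied from the first listing above, annotations removed.
SAMPLE = """\
nameserver 192.0.0.20
nameserver 192.0.0.22
hostresorder local bind
search nesdi.com
"""

if __name__ == "__main__":
    print(parse_resolv_conf(SAMPLE))
```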

Now, on the 10.0 network I only use numerical addresses to talk to hosts. They all have names, but they don't resolve. I assume this is because there is no DNS server on that network. Every machine's DNS setting points to the same address as its gateway: for the servers that is a Linksys WRT54G (dd-wrt), and for the PCs it is an Untangle server. There is no domain server for the PCs. It's a 70-computer workgroup.

Let me know what I'm leaving out. Thanks again.

bakunin · May 7, 2012, 11:07am

First off, the IP network you use looks fishy: the "usual" setup is to have a private network and routing to the internet shut off. Then, via a proxy server in a DMZ, selected systems are allowed to access the internet. To hide the (not-routable) private addresses from the internet usually NAT is used.

This works because several ranges of the IP address space are set aside and defined as a) not being routable and b) used for private purposes. This means that the normal property of an IP address, being unique worldwide, does not hold for these addresses. Everybody can use them (instead of having to register them with the IANA), but in return you cannot access the internet with them.

The address ranges in question are (see RFC 1597 or RFC 1918, "Address Allocation for Private Internets"):

10 (-> one class-A net)
172.16 - 172.31 (-> 16 class-B nets)
192.168.0 - 192.168.255 (-> 256 class-C nets)
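As a quick check, Python's standard ipaddress module can test whether an address falls inside these three ranges. This is only a sketch: the function name `is_rfc1918` is made up for illustration, the ranges are written in CIDR notation, and the sample addresses are the ones mentioned earlier in this thread plus two private ones for comparison.

```python
import ipaddress

# The three private ranges listed above, written in CIDR notation.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),       # one class-A net
    ipaddress.ip_network("172.16.0.0/12"),    # 172.16 - 172.31
    ipaddress.ip_network("192.168.0.0/16"),   # 192.168.0 - 192.168.255
]


def is_rfc1918(address):
    """Return True if the IPv4 address lies in one of the RFC 1918 ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in RFC1918_NETWORKS)


if __name__ == "__main__":
    for addr in ("192.0.0.60", "192.0.10.160", "192.168.0.60", "10.0.0.60"):
        print(addr, is_rfc1918(addr))
```

The explicit list is used on purpose instead of the module's `is_private` attribute, which covers more reserved ranges than the three named here.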

I presume you (metaphorical - maybe your predecessor admin) wanted to set up a private network, but mixed up addresses. Right now you are using official internet addresses, probably without having them registered, so they may duplicate addresses in use elsewhere. This works well as long as there is absolutely no connection to the internet, but once there is (and you say that there is now) this will lead to errors galore.

I still cannot tell you why your specific error message showed up, but i suggest that you correct the most obvious error first, which will definitely prevent successful operation anyway.

I hope this helps.

bakunin

herot · May 7, 2012, 11:17am

We do use NAT. There is no route to the internet from the 0.0 network. There is NAT between the 10.0 network and the internet via the 2 gateways (Untangle, dd-wrt). I am very aware of the addressing problem, as I stated in my last post ("I am aware of the public ip addressing problem. I will fix this someday soon. I inherited this network."). I plan to change all the 192's to 10's but it is a huge project and I have a lot of planning to do first. For now, NAT is keeping us from having any problems from that.

:wall:

Wouldn't traceroute show something weird if this were causing us problems?

$ traceroute ibby                                                       
trying to get source for ibby                                             
source should be 192.0.0.60                                                 
traceroute to ibby (192.0.0.22) from 192.0.0.60 (192.0.0.60), 30 hops max 
outgoing MTU = 1500                                                         
 1  ibby (192.0.0.22)  1 ms  0 ms  0 ms

bakunin · May 7, 2012, 11:44am

Sorry, you are right. Somehow i overlooked that, my bad. Still, i think that it causes problems even with NAT, because duplicate addresses are duplicate addresses. NAT only works the way it does because there is a distinction between outside (only public addresses) and inside (only private addresses). If you have the same address inside and outside i am not sure if this will work at all, at least not for systems with access to the outside.

I hope this helps.

bakunin

herot · May 7, 2012, 11:56am

Yes, BUT the addresses we are using are not currently allocated in the outer world.

vjm · May 9, 2012, 2:36am

The query is about your system not resolving the proper FQDN, and hence you are facing an issue with rlogin. For this you have to enter your local hosts in the /etc/hosts file, and you can mention your DNS server in resolv.conf. Any OS first checks the hosts file and then goes to the DNS servers.
Hope this resolves your query.

Regards,

vjm

bakunin · May 9, 2012, 6:30am

Sorry, but this is outright wrong.

In fact this is what the file /etc/netsvc.conf is for on AIX (on the SCO boxes, the hostresorder line in /etc/resolv.conf): to determine the precedence the various means of name resolution (DNS, hosts files, NIS, etc.) take. This behavior is in fact common to every OS (with the possible exception of Windows, which i don't know) because it is the compulsory modus operandi of the gethostbyname() library function mandated by several RFCs (1034, 1035, 2065, 2308 just to name a few).
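As an illustration of how such a precedence can be read on AIX, here is a minimal Python sketch that extracts the hosts line from /etc/netsvc.conf-style text. The fallback used when no hosts line is present (DNS, then NIS, then /etc/hosts) is assumed here to match the AIX default and should be verified against the documentation for the installed release; the function name `lookup_order` is made up for this example.

```python
# Assumption: with no "hosts" line, AIX tries DNS, then NIS, then /etc/hosts.
DEFAULT_ORDER = ["bind", "nis", "local"]


def lookup_order(netsvc_text):
    """Return the resolver order from the 'hosts' line, or the assumed default."""
    for line in netsvc_text.splitlines():
        line = line.split("#", 1)[0].strip()  # ignore comments and blank lines
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip().lower() != "hosts":
            continue
        # Keep only the service name; drop a trailing "=auth" style suffix.
        order = [item.split("=", 1)[0].strip().lower() for item in value.split(",")]
        return [item for item in order if item]
    return list(DEFAULT_ORDER)


if __name__ == "__main__":
    print(lookup_order("hosts = local, bind\n"))  # hosts file first, then DNS
    print(lookup_order(""))                       # falls back to the assumed default
```

If the assumed default holds, adding a name server without also setting the hosts line would move /etc/hosts to the end of the queue, which is worth ruling out before deeper debugging.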

@herot:

I'm sorry not to be of any more help, but i think without some in-depth debugging you won't find the culprit here. It might be a good idea to install tcpdump (if you haven't already) and start analysing the (attempted) traffic the respective machines are encountering.

I hope this helps.

bakunin