Hi all,
I'm using an HP ProLiant server with dual NICs and Debian 5 (Lenny) as its OS. I have used link bonding on it for several years with no problems.
Today the interface went down (only one interface is connected to the switch now). I restarted the networking service (/etc/init.d/networking restart) and everything seemed to be OK, but after a while it went down again and needed another restart.
Here is my /etc/network/interfaces:
auto lo
iface lo inet loopback
auto bond0
iface bond0 inet static
#It's not my real IP address
address X.Y.Z.3
netmask 255.255.254.0
#It's not my real MAC address
hwaddress ether A:B:C:D:E:F
#It's not my real gateway address
gateway X.Y.Z.1
up ifenslave bond0 eth0 eth1
down ifenslave -d bond0 eth0 eth1
auto bond0:0
iface bond0:0 inet static
address 192.168.168.3
netmask 255.255.255.0
I checked /var/log/messages and /var/log/syslog but didn't find any errors.
Can someone suggest a solution to the problem?
Thanks in advance.
How many systems are connected to your network?
A lot! At times more than 400 PCs and servers.
I have 2 other servers which work fine, but 2 of my servers have this problem.
If nothing has changed in your network or software, then I would suspect the NICs are starting to fail.
I have 2 servers with this problem, so I don't think the problem is with my NICs.
In fact, I added a "broadcast" line to my configuration and then removed it, but the problem remained.
I would take fpmurphy's suggestion and swap them out with known-good ones the next time they go down. I would also check your logs to see if anything shows up there.
I disconnected the cable from the eth0 interface and everything became OK! But I don't think it is a hardware problem, because:
1- My 2 servers hit this problem at the same time! Interface eth0 became faulty on both of them. I have other servers without any problem.
2- Link bonding didn't detect the faulty interface, so it didn't fail over to the healthy interface.
This is the content of my /etc/modprobe.d/aliases file:
options bonding mode=1 miimon=100
Hmm. Interesting. I still think it could be a hardware problem, but I could be wrong.
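On the second point above (bonding not failing over), one thing worth checking is what the bonding driver itself reports in /proc/net/bonding/bond0. With miimon=100 the driver polls each slave's link state, so an interface that still has link but no longer passes traffic would most likely keep showing as up. Below is a minimal Python sketch for reading that status. The function names parse_bond_status and read_bond_status are made up for illustration, and the sample text only shows the general layout of the file; it is not output from the servers in this thread.

```python
# Illustrative sample only: the general layout of /proc/net/bonding/bond0 for
# mode=1 (active-backup) with miimon=100. This is NOT output from the servers
# discussed in this thread.
SAMPLE_STATUS = """\
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 100

Slave Interface: eth0
MII Status: up
Link Failure Count: 0

Slave Interface: eth1
MII Status: down
Link Failure Count: 3
"""


def parse_bond_status(text):
    """Return (active_slave, {slave: {"mii": str, "failures": int}}).

    Hypothetical helper: reads "Key: value" lines and groups the
    per-slave fields under the most recent "Slave Interface" line.
    """
    active = None
    slaves = {}
    current = None  # name of the slave section being read, if any
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without a "Key: value" pair
        key, value = key.strip(), value.strip()
        if key == "Currently Active Slave":
            active = value
        elif key == "Slave Interface":
            current = value
            slaves[current] = {"mii": None, "failures": 0}
        elif current is not None and key == "MII Status":
            slaves[current]["mii"] = value
        elif current is not None and key == "Link Failure Count":
            slaves[current]["failures"] = int(value)
    return active, slaves


def read_bond_status(path="/proc/net/bonding/bond0"):
    """Parse the live status file; the path assumes the bond is named bond0."""
    with open(path) as f:
        return parse_bond_status(f.read())


if __name__ == "__main__":
    active, slaves = parse_bond_status(SAMPLE_STATUS)
    print("active slave:", active)
    for name, info in sorted(slaves.items()):
        print(name, "mii:", info["mii"], "failures:", info["failures"])
```

On an affected server you would call read_bond_status() instead of parsing the sample. If eth0 is still listed as the active slave with MII status "up" while traffic is failing, that would suggest link monitoring alone cannot see this kind of fault, which would fit with bonding not switching to eth1.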