Solaris IP Multipathing (IPMP) Help

Hello All,

I work for a health care company at a local trauma hospital. I maintain a Picture Archiving and Communication System (PACS). Basically, any medical images (X-ray, CT, MRI, mammo, etc.) are stored digitally on the servers for viewing and dictation from diagnostic stations. I took over this job about 8 months ago. Since this is a major hub hospital, I have made a huge effort to pinpoint and eliminate single points of failure. I have scripted nightly system backups of the major (core) systems and tested the disaster recovery and high availability setups. Any problems I have found, I have corrected promptly.

My last single point of failure turns out to be network connectivity. Each of my core systems is connected to the hospital's network by one (1) Ethernet connection. This introduces single points of failure along the entire network path: the systems could be brought down by a switch or switch-port failure.

So..... I decided to research Solaris IPMP. Since "most" of my core servers run Solaris 9, I figured this would be a good solution, especially since I can guarantee each network connection goes to its own switch.

My problem, however, is that my company's software is licensed by MAC address. Most Sun boxes use a single per-machine MAC address rather than the typical per-port configuration. From my research, IPMP requires each interface to have a unique MAC address, so Solaris must be configured at the firmware level to use a different MAC address on each NIC. Is there a way to fail over not only the IP address/netmask but also the MAC address?
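For what it's worth, on SPARC boxes that per-machine behavior is controlled by the local-mac-address? OBP variable, which can be checked and changed from Solaris with eeprom (a sketch; run as root, and a reboot is needed for it to take effect):

```shell
# Show the current setting; "false" means all ports share the
# system-wide MAC from the IDPROM:
eeprom 'local-mac-address?'
# Give each NIC its own factory-assigned MAC
# (quoted because of the ? glob character):
eeprom 'local-mac-address?=true'
```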

Basically, each system has two (2) NICs. I want the primary NIC to hold the only public address and the licensed MAC address. The secondary interface will hold a private address on a separate subnet with a separate MAC address. When a failure is detected, the primary NIC will fall back to its own unique private IP and MAC address; the secondary NIC will then take over the primary's IP and MAC. This should provide continuous uptime without invalidating the system's license.

Has anyone done anything similar? Any crib notes, ideas, sources of info or thoughts?

You can, and should, use IPMP. Here's how we set up IPMP on our systems:

bash-3.00# ls -l /etc/hostname*
-rw-r--r--   1 root     root         147 Dec 19 05:49 /etc/hostname.e1000g0
-rw-r--r--   1 root     root         103 Dec 19 05:01 /etc/hostname.e1000g4
bash-3.00# for i in /etc/hostname.e1000g*; do
> echo $i
> cat $i
> done
/etc/hostname.e1000g0
servername-e1000g0-ipmp netmask 255.255.255.0 broadcast + group mnb -failover deprecated up
addif servername netmask 255.255.255.0 up
/etc/hostname.e1000g4
servername-e1000g4-ipmp netmask 255.255.255.0 broadcast + group mnb -failover deprecated up

You can call your group anything you want; just make sure you add servername and servername-e1000g0/g4 (the names will, of course, depend on your driver and your hardware) to the /etc/hosts file.
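For example, the matching /etc/hosts entries could look like this (the addresses here are made-up placeholders, not our real ones):

```shell
# /etc/hosts (placeholder addresses)
192.168.10.10   servername               # data address -- fails over
192.168.10.11   servername-e1000g0-ipmp  # test address on e1000g0
192.168.10.12   servername-e1000g4-ipmp  # test address on e1000g4
```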

So we use three IP addresses, with the actual server IP as a virtual IP that fails over to the other NIC in case of a network failure on the primary interface (e1000g0 in this case).
You can test your IPMP using the if_mpadm command. Check the man page for details.
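A quick sketch of such a test, using the interface names from the example above:

```shell
# Offline the primary interface; IPMP should move its addresses
# over to e1000g4:
if_mpadm -d e1000g0
ifconfig -a        # confirm the data address now sits on e1000g4
# Reattach the primary; the addresses fail back to e1000g0:
if_mpadm -r e1000g0
```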

IPMP is great, but beware that the "resilience" it offers can be an illusion.

  • If you have a system with a quad Ethernet (e.g. qfe) card, then you don't get much resilience by using IPMP on two ports of the same card. You do get protection against accidentally pulling out a single LAN cable, but you are still susceptible to failure of that card, and to an accident involving both cables.
  • If your LAN cables run to the same patch panel, then you are susceptible to any problems with that.
  • If your cables attach to the same network switch, then you are dependent upon that switch.
  • If your cables all go to the back of the cabinet, then you are susceptible to an accident there. (Some systems have card cages at the front and back of the machine, to allow completely independent routing of the cables.)
  • Etc, all the way along the chain.

Oh, and if you're doing it for LAN interfaces, then you need to do the same for disks (SCSI and/or FC), and mains cables too!
Basically, you have to consider your whole environment and eliminate all SPOFs (Single Points Of Failure).

Welcome to the world of managing datacentre computers!:slight_smile:

Yes, I should have mentioned the whole "path" along which failures can happen...

How we do it is:

  1. Cables come to network panels from different switches that are housed in different cabinets.
  2. Network cables from the patch panels plug in to NICs which are on different cards - housed in different PCI slots (or one onboard and one in a PCI slot).

Hi blowtorch, prowla--

Thanks for the prompt input! I plan on using IPMP. As I said before, I can guarantee that each NIC is plugged into a separate switch and each switch has a dedicated patch panel. They both plug into the same router, but that is less likely to be a problem since it is a high-end Cisco that is completely modular.

I forgot to mention in my post why it is so important that the NIC's MAC address fails over too. My company's software is licensed by MAC address, so I need to move the MAC address on failover to avoid invalidating the software's license. I have done extensive research on IPMP but have yet to find a solution to my dilemma.

Although this system requires high availability, I have no problem with a delay in the failover. The users would only see a slight hiccup instead of downtime, me being dispatched, bringing up the second Ethernet interface, and requesting a new license (which could take nearly a day).

My thought is to configure IPMP so that the primary NIC holds the primary IP and the licensed MAC address. The secondary NIC will use an IP (unknown to everyone) on a separate subnet with a separate MAC address. On failure of the primary NIC, it will fail over to a unique IP/MAC on the private subnet, and the secondary NIC will then take on the public IP/netmask and the licensed MAC address. On recovery everything should fail back to the original configuration (if possible--I haven't seen info on this).
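Roughly, the swap I have in mind would look something like this (every interface name, MAC, and address below is a made-up placeholder, and this isn't something IPMP does for you; it would have to be driven by a separate failover script):

```shell
# Sketch only -- placeholder values throughout, not part of IPMP.
LICMAC=8:0:20:aa:bb:cc     # the licensed MAC (placeholder)
SPAREMAC=8:0:20:dd:ee:ff   # a spare, unused MAC (placeholder)

# Demote the failed primary to the private subnet:
ifconfig e1000g0 ether $SPAREMAC
ifconfig e1000g0 inet 10.0.0.10 netmask 255.255.255.0 up

# Promote the secondary to the licensed MAC and public address:
ifconfig e1000g4 down
ifconfig e1000g4 ether $LICMAC
ifconfig e1000g4 inet 192.168.10.10 netmask 255.255.255.0 up
```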

Any ideas how something like this can be done? I've yet to find any documentation/examples on this topic.... I'm sure very few people require this type of failover.

Anyways, I know this one is probably a stumper... again thanks for your input

prowla-- I have considered all (well, most) points of failure. We had JBOD disks serving medical images; these have been replaced with dual-path, dual-HBA, SAN-attached RAID 5 disks. I also tested our SAN-attached 17TB long-term archive's failover procedure and found that Veritas DMP was not operating as expected: basically, the mounted filesystem just went :p.... I was able to find the error in the configuration and resolved it. I'm sure there are more single points of failure. I just haven't found or thought of them yet! :slight_smile:

I don't know if the MAC address can fail over via IPMP. I am not even sure if you can change the MAC address of a NIC, but if you can, it will have to be via a separate script.

Hi blowtorch,

Sorry for the late reply....

I have not tested it, but from my research you can set the MAC address on the interfaces of a SPARC box. Here's what I dug up really quickly with Google:

How do I change a MAC address?

I figured since you can re-assign the MAC address via ifconfig and IPMP is configured with ifconfig that it may be possible.
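For example (untested on my part; the MAC value here is just an illustration):

```shell
# Change a NIC's MAC address at run time on Solaris (example value):
ifconfig e1000g0 ether 8:0:20:aa:bb:cc
# The change does not survive a reboot, so it would have to be
# reapplied from a startup or failover script.
```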

Well, I have never done this, so cannot really comment on whether it will work well or not. Perhaps reborg (or someone with more knowledge about this than me) can comment?

What will happen if I replace the following line with 'standby'?

servername-e1000g4-ipmp netmask 255.255.255.0 broadcast + group mnb -failover deprecated up

i.e. instead of the above, use:

servername-e1000g4-ipmp netmask 255.255.255.0 broadcast + group mnb -failover standby up

in /etc/hostname.e1000g4, while /etc/hostname.e1000g0 remains as in the quote.

The relationship between the 2 is explained here:
IPMP

Thanks. While I was reading, I did not understand this:

IPv4 test addresses should not be placed in the DNS and NIS name service tables

What is the reason for the above? And if I am not wrong, the test addresses are servername-e1000g4-ipmp*, right? (in the above example)

What will happen if we add A records for these test IPs/hostnames in DNS? Well, of course they will get resolved, but what harm will it do?

Thanks