New to Solaris IPMP (conversion from Linux)

Hi all,

I have been reading examples of how to set up IPMP and how it differs from EtherChannel. However, I am still unsure of how it really works, and I hope the gurus here can shed some light on the questions below while I lab it up for my own tests ->

q1) for IPMP, there is no such thing as a bond0 / bonded interface, right?

q2) do all interfaces in an IPMP group have their own MAC address?

q3) for interfaces configured as active/active in an IPMP group, packets sent out by the individual interfaces will have their own source IP and MAC addresses, right?
-- in short, there is no sharing of MAC address / IP address across the 2 physical interfaces, right?

q4) if the 2 interfaces have their own IP and source MAC, how does load sharing work across the 2 interfaces? If I am sending a file or some packets over a TCP session, will the packets be round-robined across the interfaces?
-- if so, wouldn't there be some sequencing issue or firewall session issue?

q5) with regards to q4), how does an app or the OS select which of the active/active interfaces to use? For a single transaction / session, will it always stick to a particular interface?
I am not very good at networking, but I don't think sending a file across a network to a destination using 2 different source IPs will work?

Looking forward to hearing your advice.

P.S. I am on Solaris 10

Regards,
Noob:confused::confused:

The following is my experience with Solaris 10 (things change in Solaris 11).
There is dladm (the data-link administration tool) that can create LACP link aggregations (non-LACP mode works only in rare conditions). It has strange defaults, e.g. LACP timer=short (which is reliable only if both links are connected to one LAN switch). dladm creates interface names aggr1, aggr2, ... with one MAC address and hides the bonded interfaces, so applications don't see them. This aggregate type is always all-active.
Example:

# ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000 
aggr1: flags=1001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FIXEDMTU> mtu 1500 index 2
        inet 47.11.12.13 netmask ffffff00 broadcast 47.11.12.255
        ether 18:a9:5:4e:47:11 
# dladm show-aggr
key: 1 (0x0001) policy: L3      address: 18:a9:5:4e:47:11 (auto)
           device       address                 speed           duplex  link    state
           bge0        18:a9:5:4e:47:11          1000  Mbps    full    up      attached
           bge1        18:a9:5:4e:47:12          1000  Mbps    full    up      attached
# dladm show-aggr -L
key: 1 (0x0001) policy: L3      address: 18:a9:5:4e:47:11 (auto)
                LACP mode: passive      LACP timer: long
    device    activity timeout aggregatable sync  coll dist defaulted expired
    bge0      passive  long    yes          no    no   no   yes       no     
    bge1      passive  long    yes          yes   yes  yes  no        no 

passive means the LAN switch initiates the LACP dialogue.
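If the switch side is also configured passive, neither side starts the dialogue; in that case switch the host side to active. A sketch using the key 1 from the example above (I believe modify-aggr accepts the same -l/-T options as create-aggr):

# dladm modify-aggr -l active -T long 1
# dladm show-aggr -L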
--
For completeness I show you the alternative.
One can create an IPMP group; it does not hide the interfaces, and each interface keeps its individual MAC address. De facto this only works with active/standby. (There may be an exotic LAN switch configuration that allows active/active.)
Test addresses (that do periodic line checks) must be explicitly added.
Example with preferred active bge0 interface (bge1 is standby), and no test address (link-detection-based):

# ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000 
bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 47.11.12.13 netmask ffffff00 broadcast 47.11.12.255
        groupname prod
        ether 18:a9:5:4e:47:11 
bge0:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
bge1: flags=69000842<BROADCAST,RUNNING,MULTICAST,IPv4,NOFAILOVER,STANDBY,INACTIVE> mtu 0 index 3
        inet 0.0.0.0 netmask 0 
        groupname prod
        ether 18:a9:5:4e:47:12 
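The matching boot-time configuration for this example would look roughly like this (a sketch; adapt the address, netmask and group name to your site):

# cat /etc/hostname.bge0
47.11.12.13 netmask + broadcast + group prod up
# cat /etc/hostname.bge1
group prod standby up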

Start reading here (for Solaris 10):
https://docs.oracle.com/cd/E26505_01/html/E27061/mpoverview.html#scrolltoc

Hi DukeNuke2,

I have read that documentation, but it doesn't say how the load spreading is done.

Will they be using the same MAC address / IP address for the multiple interfaces?

How does the system choose the interfaces for doing the load sharing?
Will a single file be split over 2 interfaces for sending out?

Regards,
Alan

IPMP is for redundancy, not for load sharing!


If you use LACP with 4 interfaces, you have 4G or 40G bandwidth.

Solaris has numerous ways of achieving bandwidth and redundancy, depending on the release (most of this will work on 10).

I will try to write about as many as I can remember off the top of my head.

For the sake of argument, let's say you have 2 cards with 2 interfaces each (net0 to net3).

They are all connected to the same physical switch.

  • This is a requirement for LACP. After configuration on the Solaris host you will have one interface with bandwidth x 4 (4G with a 1G switch/cards, 40G with a 10G switch/cards); see the sketch after the diagram below.
  • The LACP interface (aggr) is a single logical link from the switch and host side. You can configure the balancing algorithm on the switch for that group (the choice is switch dependent).
  • The SWITCH is a single point of failure.

    SWITCH --> net0/net1/net2/net3 --> aggr0 [40G/4G]
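For this case, creating the host side might look roughly like this (a sketch in Solaris 11 syntax to match the netX names; the address is made up):

    # dladm create-aggr -L active -l net0 -l net1 -l net2 -l net3 aggr0
    # ipadm create-ip aggr0
    # ipadm create-addr -T static -a 192.168.7.51/24 aggr0/v4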

One card is connected to the first switch, while the other is connected to a second switch (or combinations):

  • You can configure LACP between two interfaces on the same switch (so now you have two aggr interfaces) and then use IPMP between the two; see the sketch after the diagram below.
  • By configuring IPMP (ipadm set-ifprop ..) you can configure an active/active or active/passive setup on the host.
  • Another option is to use transitive probing (the default on newer Solaris) or test addresses for failure detection.
  • The LACP switch-side balancing algorithm mentioned in the first case still applies for those two interfaces, since you have two logical interfaces (aggr0/1) under the IPMP group (ipmp0).


    SWITCH0 --> net0/net3 --> aggr0
    ---------------------------------------------> ipmp0 [2G/20G] [Active/Active, Active/Passive host configured, BW is limited to the speed of 1 interface in IPMP group]
    SWITCH1 --> net2/net1 --> aggr1
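A sketch of this layout (again Solaris 11 syntax; names and address are assumptions):

    # dladm create-aggr -L active -l net0 -l net3 aggr0
    # dladm create-aggr -L active -l net2 -l net1 aggr1
    # ipadm create-ip aggr0
    # ipadm create-ip aggr1
    # ipadm create-ipmp -i aggr0 -i aggr1 ipmp0
    # ipadm create-addr -T static -a 192.168.7.51/24 ipmp0/v4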

IPMP groups between any number of switches and card ports.

  • BW is limited to the speed of one interface.
  • For instance, you can combine 10G / 1G interfaces in one IPMP group, making the 10G interface active and the 1G one passive (for redundancy); a sketch follows below.
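Marking the slower link standby might look like this (hypothetical net names; the interfaces must already be plumbed with ipadm create-ip):

    # ipadm create-ipmp -i net0 -i net4 ipmp0
    # ipadm set-ifprop -p standby=on -m ip net4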

DLMP aggregation is similar to LACP feature-wise regarding bandwidth, but it can span multiple switches inside your network (no same-switch requirement).

  • Everything is done on the host; the switches are not used for the actual aggregation.
  • This method is easier to configure and administer (one interface for all operations); by using VLAN tagging, VNICs and flowadm(1M) you can do basically whatever you want regarding shaping and securing your traffic (L2/L3/L4) on the host. A rough sketch follows below.
  • I would recommend this method, but it is a Solaris 11 feature.
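A rough sketch of that stack (Solaris 11; all names, the address and the flow parameters are assumptions):

    # dladm create-aggr -m dlmp -l net0 -l net1 -l net2 -l net3 aggr0
    # dladm create-vnic -l aggr0 vnic0
    # ipadm create-ip vnic0
    # ipadm create-addr -T static -a 192.168.7.51/24 vnic0/v4
    # flowadm add-flow -l vnic0 -a transport=tcp,local_port=443 -p maxbw=2G httpsflow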

Hope that helps.
Regards
Peasant.


Hi Peasant,

Thanks for the reply and the detailed response.
However, what if we are talking about pure IPMP, without layering it on top of LACP?

How does load sharing work on 2 active interfaces in an IPMP group?
Will a single file transfer to 1 destination be load balanced between the 2 interfaces?

Regards,
Noob

---------- Post updated at 04:26 AM ---------- Previous update was at 04:24 AM ----------

Hi MadeInGermany,

Why do we need an exotic LAN setup for active/active to work in IPMP?

Regards,
Noob

---------- Post updated at 04:27 AM ---------- Previous update was at 04:26 AM ----------

Hi DukeNuke2,

I hope I am not mistaken, but the document states that IPMP can be used for load spreading and can be set up as active/active, which is the reason for my confusion ;(

Regards,
Noob

Hi DukeNuke2,

Thanks for the quote. That's precisely what I am asking about. How does an IPMP group load spread its traffic?

If I send a file, or establish some TCP handshake or session, do the packets get spread across the 2 interfaces, which use 2 different source MACs and IPs? Wouldn't that be an issue?

Regards,
Noob

As stated before, the purpose of IPMP is to increase redundancy. Usually this is an asymmetric approach, active/standby.
Note that even in active/active, IPMP spreads only outbound traffic, and only per connection: each connection is bound to one interface and one source address, so a single TCP session is never split across the links.
Even if active/active with IPMP is advertised - I have only seen broken ones.

For symmetric load spreading, dynamic link aggregation was developed; it also provides redundancy. It needs LACP. While a non-LACP mode is advertised - I have only seen a broken one.

In post#2 I have given configuration examples that WORK.


Well, I'm using active-passive with test addresses on 11.1 :slight_smile:

I have tested IPMP over 2 x LACP (active-passive), which works fine.

Other stuff did not pass physical tests (unplugging or port shut) on the current patchset I'm running. Stuff like transitive probing and active/active IPMP.

Stuff gets fixed though, but having to run tests like that for every update ..... not so enterprise :D.

DLMP - didn't try it, but I've heard that it works in newer releases.

Regards
Peasant.

Dear all,

Thanks for the valuable feedback. It does seem like active/active might not be the common option/setup despite the "active/active" functionality being advertised.

Lastly, can I check: is there any command we can use to check whether our setup is active/active or active/standby?

In the ifconfig -a output, I do not see any "STANDBY" wording for the interfaces.

Regards,
Noob

Please post your ifconfig -a output (IP and MAC addresses X'd out).

Hi MadeInGermany,

Please see output as below ->

-bash-3.2$ cat /etc/hostname.igb0
universe group IPMP
-bash-3.2$ cat /etc/hostname.igb1
universe-igb1 group IPMP

--/etc/hosts
192.168.7.51    universe loghost universe.planet.com
192.168.7.52    universe-igb1


-bash-3.2$ cat  /etc/default/mpathd
#
#pragma ident   "@(#)mpathd.dfl 1.2     00/07/17 SMI"
#
# Time taken by mpathd to detect a NIC failure in ms. The minimum time
# that can be specified is 100 ms.
#
FAILURE_DETECTION_TIME=10000
#
# Failback is enabled by default. To disable failback turn off this option
#
FAILBACK=yes
#
# By default only interfaces configured as part of multipathing groups
# are tracked. Turn off this option to track all network interfaces
# on the system
#
TRACK_INTERFACES_ONLY_WITH_GROUPS=yes



igb0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 192.168.7.51 netmask ffffff00 broadcast 192.168.7.255
        groupname IPMP
        ether b0:99:28:98:82:18
igb0:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 192.168.7.50 netmask ffffff00 broadcast 192.168.7.255
igb1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        inet 192.168.7.52 netmask ffffff00 broadcast 192.168.7.255
        groupname IPMP
        ether b0:99:28:98:82:19

Thanks.
Regards,
Noob

I think you have active/active; check with

netstat -i

It can lead to hiccups - sudden delay or packet loss.
For active/standby, have /etc/hostname.igb1 contain

group IPMP standby

and no second IP address.
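If you want to test the standby flag at runtime, something like this should work (a sketch, untested here; the file edit above is what makes it persistent across reboots):

# ifconfig igb1 standby
# ifconfig -a | grep -i standby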


In case you want to go for a symmetric active/active aggregate with LACP (this is not called IPMP by Sun and Oracle):

  1. enable LACP on the LAN switch ports
  2. create an executable script on the local disk:
#!/bin/sh

echo "Creating LACP setup"
/usr/sbin/ifconfig bge0 down
/usr/sbin/ifconfig bge0 unplumb
/usr/sbin/ifconfig bge1 unplumb
# create aggregation key 1 over bge0 and bge1: L3 hash policy, LACP passive
/usr/sbin/dladm create-aggr -P L3 -l passive -d bge0 -d bge1 1
/usr/sbin/ifconfig aggr1 plumb

# carry the IP configuration over to aggr1 for the next boot
mv /etc/hostname.bge0 /etc/hostname.aggr1
# optionally bring the address up now (assumes a one-line hostname file):
#/usr/sbin/ifconfig aggr1 `cat /etc/hostname.aggr1` up

The script is for bge0 and bge1; adjust the names to match your NIC driver.
Run the script from a console (e.g. an ILO board - not over ssh!)
See also Link Aggregation Overview - Sun Quad Port GbE PCIe 2.0 ExpressModule, UTP
Note they advertise -l active but usually it should be -l passive because the LAN switch is the active part.
Because the aggregated links are hidden, netstat -i 1 shows the sum only, while the per-link packets are shown with dladm show-aggr -s -i 1.


Hi MadeInGermany,

Greatly appreciate the examples.
I am trying to set up active/active IPMP and do a packet sniff to see the actual IPs and MAC addresses being sent out of both interfaces... but sadly, Solaris doesn't seem to have Wireshark :frowning:

Regards,
Noobb

Solaris has something better than Wireshark... It is called snoop.
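For example (device and file names are just examples, taken from your earlier output):

# snoop -d igb0 -o /tmp/igb0.cap
# snoop -i /tmp/igb0.cap -V

Wireshark on another machine can read the resulting snoop capture file, too.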


Hi DukeNuke2,

Is there any way I can capture both interfaces at the same time with snoop?
:confused::confused::confused: Been trying, but it doesn't seem like it.

Look forward to your advice.

Regards,
Noob

Never tried that... You can open multiple snoop sessions and capture the output in multiple terminal sessions...
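For example, two background captures from one shell (a sketch; -q suppresses the packet count display while writing to a file):

# snoop -q -d igb0 -o /tmp/igb0.cap &
# snoop -q -d igb1 -o /tmp/igb1.cap &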

1 Like