Hi,
I have a Solaris 10 server with link aggregation configured as follows:
# dladm show-aggr
key: 1 (0x0001) policy: L4 address: 3c:d9:2b:ee:a8:a (auto)
device address speed duplex link state
bnx1 3c:d9:2b:ee:a8:a 1000 Mbps full up attached
igb2 f4:ce:46:a7:eb:92 1000 Mbps full up attached
key: 2 (0x0002) policy: L4 address: 3c:d9:2b:ee:a8:8 (auto)
device address speed duplex link state
bnx0 3c:d9:2b:ee:a8:8 1000 Mbps full up attached
igb3 f4:ce:46:a7:eb:93 1000 Mbps full up attached
# dladm show-link
bnx0 type: non-vlan mtu: 1500 device: bnx0
bnx1 type: non-vlan mtu: 1500 device: bnx1
bnx2 type: non-vlan mtu: 1500 device: bnx2
bnx3 type: non-vlan mtu: 1500 device: bnx3
igb0 type: non-vlan mtu: 1500 device: igb0
igb1 type: non-vlan mtu: 1500 device: igb1
igb2 type: non-vlan mtu: 1500 device: igb2
igb3 type: non-vlan mtu: 1500 device: igb3
aggr1 type: non-vlan mtu: 1500 aggregation: key 1
aggr2 type: non-vlan mtu: 1500 aggregation: key 2
aggr150002 type: vlan 150 mtu: 1500 aggregation: key 2
aggr50001 type: vlan 50 mtu: 1500 aggregation: key 1
aggr55001 type: vlan 55 mtu: 1500 aggregation: key 1
aggr60001 type: vlan 60 mtu: 1500 aggregation: key 1
aggr62001 type: vlan 62 mtu: 1500 aggregation: key 1
aggr64001 type: vlan 64 mtu: 1500 aggregation: key 1
aggr66001 type: vlan 66 mtu: 1500 aggregation: key 1
aggr81001 type: vlan 81 mtu: 1500 aggregation: key 1
# dladm show-dev
bnx0 link: up speed: 1000 Mbps duplex: full
bnx1 link: up speed: 1000 Mbps duplex: full
bnx2 link: unknown speed: 0 Mbps duplex: unknown
bnx3 link: unknown speed: 0 Mbps duplex: unknown
igb0 link: unknown speed: 0 Mbps duplex: half
igb1 link: unknown speed: 0 Mbps duplex: half
igb2 link: up speed: 1000 Mbps duplex: full
igb3 link: up speed: 1000 Mbps duplex: full
#
The switches are going to be replaced one at a time, so the links on one side will go down first, then the other. Before that activity, is there any way to check or test that the server will keep working if one side goes down? Something similar to what if_mpadm -d does for IPMP.
Thanks
The only way I know of to properly test it is to either physically pull a cable (or disconnect it logically if it's a virtual server) or to down the network interface card. Obviously you would down the physical card that supports one path of the aggregated link. You should be able to get statistics about the aggregated link to show you what is in use. A path should die and then recover when you turn it back on.
Of course, this introduces risk if it doesn't work, so always make sure you have a way to re-enable it all quickly.
Kind regards,
Robin
Thanks Robin,
That's what I thought. I will plan it out.
dladm show-aggr
Check that "speed" and "duplex" are correct and that "link" shows "up".
If you use LACP (I think you must use LACP; everything else I have seen was faulty), then check with
dladm show-aggr -L
The output of a working LACP looks like this:
device activity timeout aggregatable sync coll dist defaulted expired
xyz0 passive long yes yes yes yes no no
xyz1 passive long yes yes yes yes no no
In Solaris 10 the LACP timer (the "timeout" column) defaults to "short". You should change it to "long" unless told otherwise by the vendor of the connected LAN switches.
"passive" or "active" does not matter (the connected LAN switch must be "active").
"policy" does not matter.
The "sync" "coll" and "dist" must all be "yes" (otherwise it does not cooperate with the connected LAN switch).
That's all theory.
In practice: pull a cable and check connectivity.
In my output, the timeout is shown as short.
What is the effect if I do not change it to long?
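The sync/coll/dist check described above can be scripted. The following is only a minimal sketch: the function name check_lacp is made up for illustration, and it assumes the nine-column layout of "dladm show-aggr -L" exactly as printed in this thread (sync, coll and dist in columns 5 to 7). The sample input is the fail-over output posted further down. On Solaris 10 you may need nawk or /usr/xpg4/bin/awk in place of awk.

```shell
#!/bin/sh
# check_lacp (hypothetical helper): read `dladm show-aggr -L` output on stdin
# and flag every port whose sync, coll or dist column is not "yes".
# Assumes the 9-column port layout shown in this thread.
check_lacp() {
    awk '
        BEGIN { bad = 0 }
        # Port lines have 9 fields; skip the column-header line.
        NF == 9 && $1 != "device" {
            if ($5 == "yes" && $6 == "yes" && $7 == "yes") {
                printf "%s OK\n", $1
            } else {
                printf "%s DEGRADED sync=%s coll=%s dist=%s\n", $1, $5, $6, $7
                bad = 1
            }
        }
        # Exit status 1 if any port was flagged, 0 otherwise.
        END { exit bad }
    '
}

# Demo with output copied from this thread (bnx1 cable pulled).
check_lacp <<'EOF' || echo "at least one port is not fully in sync"
key: 1 (0x0001) policy: L4 address: 3c:d9:2b:f9:20:5e (auto)
LACP mode: active LACP timer: short
device activity timeout aggregatable sync coll dist defaulted expired
bnx1 active short yes no no no yes no
igb2 active short yes yes yes yes no no
EOF
```

On a live server the input would come from the command itself, for example: dladm show-aggr -L | check_lacp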
# dladm show-aggr -L
key: 1 (0x0001) policy: L4 address: 3c:d9:2b:ee:a8:a (auto)
LACP mode: active LACP timer: short
device activity timeout aggregatable sync coll dist defaulted expired
bnx1 active short yes yes yes yes no no
igb2 active short yes yes yes yes no no
key: 2 (0x0002) policy: L4 address: 3c:d9:2b:ee:a8:8 (auto)
LACP mode: active LACP timer: short
device activity timeout aggregatable sync coll dist defaulted expired
bnx0 active short yes yes yes yes no no
igb3 active short yes yes yes yes no no
#
Ask your LAN switch vendor!
Most vendors support "short" if both NICs are connected to one LAN switch, but not if connected to two different LAN switches.
In the worst case a fail-over does not work. Do the "pull the cable" test at least!
Thanks.
I was able to find a server on which I could do the "pull the cable" test.
The output below shows that the bnx1 cable was pulled out. Since the aggregation is configured, igb2 took over the traffic and all links stayed up. But the first line still shows "address: 3c:d9:2b:f9:20:5e (auto)", and this MAC address belongs to bnx1.
Since igb2 is carrying all the traffic now and its MAC is f4:ce:46:a7:df:ba, shouldn't that be the address shown as (auto)? Sorry, I am confused about this.
# dladm show-aggr
key: 1 (0x0001) policy: L4 address: 3c:d9:2b:f9:20:5e (auto)
device address speed duplex link state
bnx1 3c:d9:2b:f9:20:5e 0 Mbps half down standby
igb2 f4:ce:46:a7:df:ba 1000 Mbps full up attached
key: 2 (0x0002) policy: L4 address: 3c:d9:2b:f9:20:5c (auto)
device address speed duplex link state
bnx0 3c:d9:2b:f9:20:5c 1000 Mbps full up attached
igb3 f4:ce:46:a7:df:bb 1000 Mbps full up attached
#
# dladm show-aggr -L
key: 1 (0x0001) policy: L4 address: 3c:d9:2b:f9:20:5e (auto)
LACP mode: active LACP timer: short
device activity timeout aggregatable sync coll dist defaulted expired
bnx1 active short yes no no no yes no
igb2 active short yes yes yes yes no no
key: 2 (0x0002) policy: L4 address: 3c:d9:2b:f9:20:5c (auto)
LACP mode: active LACP timer: short
device activity timeout aggregatable sync coll dist defaulted expired
bnx0 active short yes yes yes yes no no
igb3 active short yes yes yes yes no no
#
I don't know what the fail-over case is supposed to look like in detail.
The main point is: connectivity is still there, and once you put the cable back, the normal redundant state comes back.
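To confirm that the redundant state really is back (or to confirm both sides are healthy before the switch work starts), the plain "dladm show-aggr" output can be summarised per aggregation. This is a rough sketch only: aggr_summary is a made-up name, and the parsing assumes the line layout shown in this thread (port lines have "Mbps" as the fourth field, followed by duplex, link and state). On Solaris 10 you may need nawk or /usr/xpg4/bin/awk in place of awk.

```shell
#!/bin/sh
# aggr_summary (hypothetical helper): read `dladm show-aggr` output on stdin
# and print, per aggregation key, how many ports are link-up and attached.
# Assumes port lines look like: device address speed Mbps duplex link state
aggr_summary() {
    awk '
        # "key: N ..." starts a new aggregation.
        $1 == "key:" {
            key = $2
            if (!(key in total)) { order[++n] = key; total[key] = 0; up[key] = 0 }
        }
        # Port line: 7 fields with "Mbps" in column 4.
        NF == 7 && $4 == "Mbps" {
            total[key]++
            if ($6 == "up" && $7 == "attached") up[key]++
        }
        END {
            for (i = 1; i <= n; i++) {
                k = order[i]
                status = (up[k] == total[k]) ? "ok" : (up[k] > 0 ? "degraded" : "down")
                printf "key %s: %d of %d ports attached (%s)\n", k, up[k], total[k], status
            }
        }
    '
}

# Demo with the fail-over output posted above (bnx1 cable pulled).
aggr_summary <<'EOF'
key: 1 (0x0001) policy: L4 address: 3c:d9:2b:f9:20:5e (auto)
device address speed duplex link state
bnx1 3c:d9:2b:f9:20:5e 0 Mbps half down standby
igb2 f4:ce:46:a7:df:ba 1000 Mbps full up attached
key: 2 (0x0002) policy: L4 address: 3c:d9:2b:f9:20:5c (auto)
device address speed duplex link state
bnx0 3c:d9:2b:f9:20:5c 1000 Mbps full up attached
igb3 f4:ce:46:a7:df:bb 1000 Mbps full up attached
EOF
```

For the sample above this reports key 1 as degraded (1 of 2 ports) and key 2 as ok (2 of 2). Running dladm show-aggr | aggr_summary before and after each cable pull gives a quick record of whether every aggregation returned to its full port count.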