Changing single path NIC to a teamed connection in same subnet

Dear all,

I have a remote CentOS7 server that has two network cards. Each card has four ports and port one of card one was defined with the IP address assigned to the server. So far, so good and it's been working for over a year. We have now got cables sorted out so there are four paths available (two from each card, via two switches) to give us resilience. After that, I have built a logical teamed device with the three new paths and it's all working just fine until I try to turn off the old un-teamed/single path eno1. I think it's all down to routing and I'm stuck.

Output from ip add show (removed unused ports for clarity):-

2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether b8:83:13:55:10:e0 brd ff:ff:ff:ff:ff:ff
    inet 10.102.16.11/24 brd 10.102.16.255 scope global noprefixroute eno1
       valid_lft forever preferred_lft forever
3: eno2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master team0 state UP group default qlen 1000
    link/ether 98:f2:b3:1d:3e:04 brd ff:ff:ff:ff:ff:ff
6: eno5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master team0 state UP group default qlen 1000
    link/ether 98:f2:b3:1d:3e:04 brd ff:ff:ff:ff:ff:ff
7: eno6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master team0 state UP group default qlen 1000
    link/ether 98:f2:b3:1d:3e:04 brd ff:ff:ff:ff:ff:ff
22: team0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 98:f2:b3:1d:3e:04 brd ff:ff:ff:ff:ff:ff
    inet 10.102.16.13/24 brd 10.102.16.255 scope global noprefixroute team0
       valid_lft forever preferred_lft forever

All seems well and I have changed the DNS to refer to the new IP address for the team0 device. I can SSH to the new IP address and verify with netstat that I am connected to the new IP. Output from netstat -nr and route -n :-

# netstat -nr
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         10.102.16.254   0.0.0.0         UG        0 0          0 eno1
0.0.0.0         10.102.16.254   0.0.0.0         UG        0 0          0 team0
10.102.16.0     0.0.0.0         255.255.255.0   U         0 0          0 eno1
10.102.16.0     0.0.0.0         255.255.255.0   U         0 0          0 team0

# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.102.16.254   0.0.0.0         UG    0      0        0 eno1
0.0.0.0         10.102.16.254   0.0.0.0         UG    350    0        0 team0
10.102.16.0     0.0.0.0         255.255.255.0   U     100    0        0 eno1
10.102.16.0     0.0.0.0         255.255.255.0   U     350    0        0 team0
#

If I try to disable eno1 I get no network response until I turn it back on again via the console. It's the same if I delete the default route with route delete default gw 10.102.16.254 dev eno1

I just don't get it. There appears (to me) to be a valid route from device team0 so what am I missing? :confused:

I do see this oddity though:-

# arp -a
? (10.102.16.254) at <incomplete> on team0
? (10.102.16.254) at 00:22:bd:fa:09:ff [ether] on eno1

For completeness, my server reports the below:-

# uname -a
Linux my_host 3.10.0-957.27.2.el7.x86_64 #1 SMP Mon Jul 29 17:46:05 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

After I retire eno1 as a discrete connection, it will be added into the team to give the proper resilience.

I am quite expecting to be shown that I am a fool. I'd prefer that than being left hanging. :rolleyes:

Many thanks, in advance,
Robin

What happens when you look at the arp tables, before and after?

On Ethernet, as you know, delivery on the LAN is done by MAC address; so when we change an IP address, we generally need to flush and rebuild the ARP table (on both ends).

If we change an IP address and do not flush the ARP table, there will often be a stale IP address <---> MAC address mapping in the ARP table, and this can cause LAN routing snafus.
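
As a rough sketch of that check (not a definitive procedure), the small helper below picks the unresolved entries out of arp -a style output. The function name incomplete_neighbours is made up for illustration, the sample lines are the two posted earlier in the thread, and the commented commands at the end assume iproute2 is installed and that you run them as root.

```shell
#!/bin/sh
# Hypothetical helper: read `arp -a` style output on stdin and print
# "IP interface" for every entry that has not resolved to a MAC address.
incomplete_neighbours() {
    awk '$4 == "<incomplete>" { gsub(/[()]/, "", $2); print $2, $NF }'
}

# Sample copied from the arp -a output posted earlier in this thread.
sample='? (10.102.16.254) at <incomplete> on team0
? (10.102.16.254) at 00:22:bd:fa:09:ff [ether] on eno1'

printf '%s\n' "$sample" | incomplete_neighbours

# On the live host the equivalent would be something like:
#   arp -a | incomplete_neighbours
#   ip neigh show dev team0       # current neighbour (ARP) entries on team0
#   ip neigh flush dev team0      # drop them so they are learned again
```

An entry that stays incomplete after a flush suggests the gateway is not answering ARP requests on that interface, which points away from a stale cache and towards the switch side.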

PS: I'm writing this based on knowledge from decades ago, off the top of my head, and have not done any research on arp in a long time.

Are you using NetworkManager and nmcli, or something else?

Regards
Peasant.

Creation was with nmcli, yes, although I think we have the NetworkManager service turned off. The course I went on suggested it doesn't work :frowning:

This might actually be a network team thing, as in the switches are not happy seeing the same MAC address & IP address on multiple ports. I will keep looking at that and post an update of how we get on.

Robin

This appears to have been a conflict between the server and the network switches all along. We redefined the connections as LACP balanced and later the network team did the same to the switch ports. This broke everything because the default route was still going out on the old card, whose port the network team had already configured to be in the team. I removed the default route from that port and hey-presto, I got on to the new IP address properly, with data being returned so a proper connection could be established.
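
To illustrate the default-route point, here is a minimal sketch that reads route -n style output and reports which default route has the lowest metric, since that is the one the kernel prefers. The helper name preferred_default is hypothetical, the sample is the table posted earlier in the thread, and the commented commands at the end are the iproute2 equivalents, which I am assuming are available on the host.

```shell
#!/bin/sh
# Hypothetical helper: read `route -n` style output on stdin and print the
# interface of the default route (0.0.0.0/0) with the lowest metric.
preferred_default() {
    awk '$1 == "0.0.0.0" && $3 == "0.0.0.0" {
             if (best == "" || $5 + 0 < min) { min = $5 + 0; best = $8 }
         }
         END { print best }'
}

# Sample copied from the route -n output posted earlier in this thread.
sample='Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.102.16.254   0.0.0.0         UG    0      0        0 eno1
0.0.0.0         10.102.16.254   0.0.0.0         UG    350    0        0 team0
10.102.16.0     0.0.0.0         255.255.255.0   U     100    0        0 eno1
10.102.16.0     0.0.0.0         255.255.255.0   U     350    0        0 team0'

printf '%s\n' "$sample" | preferred_default    # prints: eno1

# On the live host:
#   ip route show default                            # list default routes and metrics
#   ip route get 10.102.16.254                       # which interface the kernel would use
#   ip route del default via 10.102.16.254 dev eno1  # remove the old default route
```

With the table as posted, outbound traffic prefers eno1 (metric 0) over team0 (metric 350), even for connections that arrived on the team0 address.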

I have then reconfigured the now redundant link to be part of the team and restarted the network services. I now have multiple LACP balanced active links.

For anyone else who may find this thread, I have two cards with four ports, so eight possible eno interfaces. Only 1, 2, 5 & 6 are cabled, so the full commands I used to bond them all together are:-

nmcli con add type team con-name team0 ifname team0 config '{"runner": {"name": "loadbalance", "tx_hash": ["eth", "ipv4", "ipv6"], "tx_balancer": {"name": "basic"} } }'
nmcli con mod team0 ipv4.addresses '10.102.16.13/24' ipv4.method manual
nmcli con add type team-slave ifname eno2 con-name team0-eno2 master team0
nmcli con add type team-slave ifname eno5 con-name team0-eno5 master team0
nmcli con add type team-slave ifname eno6 con-name team0-eno6 master team0

A new MAC address is created for the team and each bonded interface gets the same MAC address.
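
One way to confirm the shared MAC address is to list each interface next to its link/ether value. The sketch below does that for ip address show style output; iface_macs is a made-up helper name and the sample is a cut-down copy of the output posted at the top of the thread.

```shell
#!/bin/sh
# Hypothetical helper: read `ip address show` style output on stdin and
# print one "interface MAC" pair per line.
iface_macs() {
    awk '/^[0-9]+: /        { name = $2; sub(/:$/, "", name) }
         $1 == "link/ether" { print name, $2 }'
}

# Cut-down sample from the output posted at the top of this thread.
sample='2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether b8:83:13:55:10:e0 brd ff:ff:ff:ff:ff:ff
3: eno2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master team0 state UP group default qlen 1000
    link/ether 98:f2:b3:1d:3e:04 brd ff:ff:ff:ff:ff:ff
22: team0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 98:f2:b3:1d:3e:04 brd ff:ff:ff:ff:ff:ff'

printf '%s\n' "$sample" | iface_macs

# On the live host:
#   ip address show | iface_macs
```

In the sample, eno2 and team0 report the same address while the un-teamed eno1 keeps its own.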

After the original connection was redundant, I added it to the group with this command and this edit:-

nmcli con add type team-slave ifname eno1 con-name team0-eno1 master team0
sed -i 's/ONBOOT="yes"/ONBOOT="no"/'  /etc/sysconfig/network-scripts/ifcfg-eno1
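
Since the port commands all follow the same pattern, they could be generated from a list of interface names. This is only a dry-run sketch: the function name team_port_cmds is my own invention, and it prints the nmcli commands rather than running them so they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) one nmcli command per port.
# Usage: team_port_cmds TEAM PORT [PORT...]
team_port_cmds() {
    team=$1
    shift
    for port in "$@"; do
        printf 'nmcli con add type team-slave ifname %s con-name %s-%s master %s\n' \
            "$port" "$team" "$port" "$team"
    done
}

# The four cabled ports from this thread.
team_port_cmds team0 eno1 eno2 eno5 eno6
```

Once the printed lines look right, piping the output to sh as root would execute them.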

Verification commands:

ip ad s
teamdctl team0 state                        (and variations from this)
nmcli -p
ethtool -S eno1                             (or eno2, eno5 & eno6)

I hope that this is useful to someone, but at least I have it documented for myself too! :wink:

Kind regards,
Robin

Moderator comments were removed during original forum migration.