Eth0 Limitations

Hi,

I have noticed some performance issues on my RHEL5 server, but memory and CPU utilization on the box are fine.

I have a 1 Gb full-duplex eth0 card and I suspect it may be causing the problem. My eth0 settings are as follows:

Settings for eth0:
        Supported ports: [ TP ]
        Supported link modes:   1000baseT/Full
                                10000baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  1000baseT/Full
                                10000baseT/Full
        Advertised auto-negotiation: No
        Speed: 1000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: g
        Wake-on: d
        Link detected: yes

...and my current Rx & Tx figures are as follows:

# ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 9C:8E:99:31:34:A0
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:75042475 errors:0 dropped:0 overruns:0 frame:0
          TX packets:105451412 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:3906089934 (3.6 GiB)  TX bytes:269789061 (257.2 MiB)

Can anybody tell me what the limitations of this card are? As a percentage, is it currently under stress, or is it running normally and well within its limits?

If all looks well I'll examine the app in much more detail, but I really need to rule this out first.

R,
D.

I think you need to post the difference between two samples taken over a reasonable period of time.
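For example, something along these lines (interval and count are purely illustrative) gives per-second rates instead of lifetime counters:

# sample eth0 every 10 seconds, 6 times, reporting per-second rates
sar -n DEV 10 6 | grep eth0

# or take two raw counter snapshots a minute apart and compare them by hand
grep eth0 /proc/net/dev; sleep 60; grep eth0 /proc/net/dev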

Depending on what network kit this server is plugged into, auto-negotiation should be avoided. It usually needs turning off on both the server and the LAN switch port, and likewise anywhere network components are cascaded.
It looks like your server is not advertising auto-negotiation.
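If you do end up forcing it, ethtool is the tool (interface name and values below are only examples, and whatever you set must match the switch port exactly or you'll get a duplex mismatch):

# show what the server side has negotiated
ethtool eth0

# example of forcing speed/duplex with auto-negotiation off (some drivers/PHYs refuse forced gigabit)
ethtool -s eth0 autoneg off speed 1000 duplex full

# and to go back to auto-negotiation later
ethtool -s eth0 autoneg on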

Duffs,

You are confused. For one, we'd need to know what card it is and which driver it uses in order to tell you what "limitations" it has.

What are the problems you have had with performance? Do you realize that even a 1Gbit card really only moves around 100 megabytes per second (1000 Mbit/s divided by 8 is 125 MB/s raw, and framing and protocol overhead eat into that), assuming your switch can even handle that?

Are you configuring the switch and card as forced full duplex or as auto-negotiate? Most Gigabit and faster connections recommend auto-negotiation for optimal performance.

Without knowing what your issue is, we cannot help you. Does the app send lots of small packets or fewer large ones? Do you do interrupt coalescence? Are you using Nagle's algorithm?
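For example (interface name assumed, values purely illustrative), ethtool will at least show what the driver is doing about coalescing:

# show the current interrupt coalescing settings
ethtool -c eth0

# example of adjusting them; not every driver supports every knob
ethtool -C eth0 rx-usecs 50 rx-frames 32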

Have you tuned your kernel parameters?
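By way of illustration only (these values are common starting points for testing, not a recommendation for your box), the usual suspects look like this:

# show the current socket buffer limits
sysctl net.core.rmem_max net.core.wmem_max net.ipv4.tcp_rmem net.ipv4.tcp_wmem

# example of raising them temporarily; make it permanent in /etc/sysctl.conf if it helps
sysctl -w net.core.rmem_max=4194304
sysctl -w net.core.wmem_max=4194304
sysctl -w net.ipv4.tcp_rmem="4096 87380 4194304"
sysctl -w net.ipv4.tcp_wmem="4096 65536 4194304"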

The metrics you have shown only tell us how many packets have hit the wire, with no time frame or anything else.

Let me ask you. How many gallons should it take for me to get to work by driving?

I have not told you how far it is, how fast I drive, what car it is, how well tuned it is, or how well inflated my tires are.


For gigabit, auto-negotiation is almost always recommended. Performance deteriorates when you try to force speed and duplex.

Mark,

Ok, it uses the following card and driver:

# lspci | grep -i eth
0c:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5715S Gigabit Ethernet (rev a3)
 
# ethtool -i eth0
driver: be2net

Packet information - Output of sar -n DEV:

             IFACE   rxpck/s   txpck/s    rxbyt/s    txbyt/s   rxcmp/s   txcmp/s  rxmcst/s
10:00:01 AM   eth0    273.90    389.90   96997.29   82105.24      0.00      0.00      0.00
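If I'm reading that right, and assuming the link really is 1Gb/s (125,000,000 bytes/s), that sample is a tiny fraction of the wire:

# rxbyt/s plus txbyt/s as a percentage of a 1Gb/s link
echo "96997.29 82105.24" | awk '{printf "%.3f%%\n", ($1+$2)/125000000*100}'
# gives roughly 0.14%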

The switch has auto-negotiation enabled.

Nangle's algorithm - now you're taking me back to my college days. And how exactly do you apply/configure Nangle's algorithm in redhat? - I'm curious now.

R,
D.

Nagle's (not Nangle's) algorithm is what the TCP_NODELAY socket option disables.
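It's a per-socket option that the application sets, not something you configure system-wide in Red Hat. If you want to see whether your app already touches it, something like this (the PID is a placeholder) will show the setsockopt calls as they happen:

# attach to the running app and watch for socket options; look for TCP_NODELAY
strace -f -e trace=setsockopt -p <pid_of_app>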

However, simply showing how many packets have come and gone shows NOTHING about the performance.

Have you tried using a tool to actually gauge performance and throughput?

Also, how come you are not using the bnx2 driver?

It actually uses the following card:

Emulex Corporation OneConnect 10Gb NIC (be3)

...hence the be2net driver.

Yes, I have used relic but, as mentioned, CPU and memory are fine.

OK, I think it's time to get down to the nitty-gritty of JBoss application tuning and performance - thanks for your help.

R,
D.

"0c:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5715S Gigabit Ethernet (rev a3)" shows that it is a Broadcom.

Here's the issue.

STOP. You are going about this in a totally chicken minus head scenario.

You are ignoring the advice given to you. I'm going to do this exactly one last time before I let you float in the water you keep raising around yourself.

What tools have you used, if any, to test the NETWORK - not the memory, not the CPU, etc.?

Your first post mentions a 1000Mb/s connection. Is your switch 10GbE? Your MTU of 1500 will hold you back, and so will your txqueuelen.
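Both are easy to change if everything in the path supports it - and that's a big if; the values below are only examples:

# raise the transmit queue length
ifconfig eth0 txqueuelen 10000

# jumbo frames; only if the switch and every other device in the path accept the larger MTU
ifconfig eth0 mtu 9000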

You are looking everywhere but where you've been directed.

I appreciate your effort but you are incorrect; driver and NIC compatibility is not the issue.

# lspci | grep -i eth
02:00.0 Ethernet controller: Emulex Corporation OneConnect 10Gb NIC (be3) (rev 01)
02:00.1 Ethernet controller: Emulex Corporation OneConnect 10Gb NIC (be3) (rev 01)
0c:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5715S Gigabit Ethernet (rev a3)
0c:04.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5715S Gigabit Ethernet (rev a3)
0e:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5715S Gigabit Ethernet (rev a3)
0e:04.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5715S Gigabit Ethernet (rev a3)
[root@lnprodapp02 ~]# ethtool -i eth0
driver: be2net
version: 4.0.100r
firmware-version: 3.102.517.701
bus-info: 0000:02:00.0

I use mostly iptraf & sar to check the network.

The Cisco switch is 1Gb.

R,
D.

I did not say the driver was incompatible, but rather you gave me erroneous information.

sar is not going to give you throughput metrics, and neither will iptraf - they only show the traffic that happens to be passing, not what the link can actually do. You would need something like iperf or ttcp.
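A minimal sketch, assuming you have a second box on the same network with iperf installed (host and port are placeholders):

# on a DIFFERENT machine, start the server end
iperf -s -p 5001

# on this server, run the client end for 30 seconds, reporting every 5
iperf -c <remote_host_ip> -p 5001 -t 30 -i 5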

Using a txqueuelen of 1000 and an MTU of 1500 will further limit you, but since you don't seem to know, or want to know, what your actual throughput and latency are, I'll leave it to you to keep tuning the app blindly.

You keep saying there are issues, but you give no numbers showing what those issues are. To put it into a bad car analogy:

Why do you care what the RPMs of the motor are if you have no idea how fast you are going?

sar provides historical info on the total number of bytes received and transmitted per second, and iptraf gathers byte counts and interface stats. I need that to trace and compare the byte counts around the time the app crashed. Not sure if iperf provides historical data?

However, it looks like a useful tool all the same. I gathered some output from the following command.

Can you explain, then, how I can work out my maximum throughput from this? I don't want to drive my car too fast and break any speed limits in a restricted zone.

# iperf -c <host_ip> -d -p 8080
------------------------------------------------------------
Server listening on TCP port 8080
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to <host_IP>, TCP port 8080
TCP window size: 49.3 KByte (default)
------------------------------------------------------------
[  5] local <host_ip> port 40000 connected with <host_ip> port 8080
[  4] local <host_ip> port 8080 connected with <host_ip> port 40000
[ ID] Interval       Transfer     Bandwidth
[  5]  0.0-10.0 sec  18.3 GBytes  15.7 Gbits/sec
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  18.3 GBytes  15.7 Gbits/sec


Did you connect the client from the same machine?

That will not work if you do, because according to this, you are breaking physics.

No, so in that case I am not in breach of the laws of physics.

The reason I ask is that you are taking a 1Gbit device to 15.7Gbit. A 1Gbit NIC cannot move 15.7Gbits/sec, so that traffic never actually went over the wire.