FreeBSD Network Performance in ESXi


Host:
Dell PowerEdge R720
ESXi 6.5
Broadcom 57810 10 Gb NIC

Guest:
FreeBSD 11.2

I first noticed that Samba performance was poor. Transfers were far slower than expected, and there were "pauses" during which nothing seemed to transfer.

Running iperf3 from another 10 Gb system against the FreeBSD guest gave results like this:

# iperf3 -c 10.0.0.1
Connecting to host 10.0.0.1, port 5201
[  5] local 10.0.0.2 port 28895 connected to 10.0.0.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   225 MBytes  1.89 Gbits/sec   85    380 KBytes       
[  5]   1.00-2.00   sec   469 MBytes  3.94 Gbits/sec  908    734 KBytes       
[  5]   2.00-3.00   sec   517 MBytes  4.34 Gbits/sec    5   1.23 MBytes       
[  5]   3.00-4.00   sec   513 MBytes  4.31 Gbits/sec  858   1.28 MBytes       
[  5]   4.00-5.00   sec   647 MBytes  5.43 Gbits/sec  918   1.40 MBytes       
[  5]   5.00-6.00   sec   644 MBytes  5.40 Gbits/sec  917   1.51 MBytes       
[  5]   6.00-7.00   sec   633 MBytes  5.31 Gbits/sec  1023   1.47 MBytes       
[  5]   7.00-8.00   sec   634 MBytes  5.32 Gbits/sec  979   1.50 MBytes       
[  5]   8.00-9.00   sec   637 MBytes  5.34 Gbits/sec  1921    857 KBytes       
[  5]   9.00-10.00  sec   639 MBytes  5.36 Gbits/sec  835   1.10 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  5.43 GBytes  4.66 Gbits/sec  8449             sender
[  5]   0.00-10.02  sec  5.43 GBytes  4.65 Gbits/sec                  receiver

iperf Done.

Ouch.

The "Retr" column is what stands out. Those are TCP retransmissions: 8449 of them in 10 seconds, with throughput averaging only 4.66 Gbits/sec on a 10 Gb link. Something isn't working right.
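
To compare runs without eyeballing the whole table, the retransmission total can be pulled out of the summary line. This is a minimal sketch, not part of my original troubleshooting: the function name iperf_retr is made up for illustration, and it assumes the TCP summary layout shown above, where the line ends in "sender" and the field just before it is the Retr count.

```shell
# Hypothetical helper: read iperf3 client output on stdin and print the
# total retransmission count from the "sender" summary line.
# Assumes the TCP summary layout shown above (Retr is the field just
# before the trailing word "sender").
iperf_retr() {
    awk '$NF == "sender" { print $(NF-1) }'
}

# Example use against the FreeBSD guest:
#   iperf3 -c 10.0.0.1 | iperf_retr
```

Fed the first run above, this would print 8449.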

I found guides that mentioned turning off TCP Segmentation Offload (TSO) and tuning various TCP settings via sysctl, but none of that worked for me.
Long story short, I needed to enable Large Receive Offload (LRO). It is off by default on the guest's interface, and the only advice I had seen was to turn it off.

FreeBSD enables LRO by default when it detects the Broadcom 57810 10 Gb NIC on bare metal, but not when running on ESXi (since the guest cannot see the actual NIC hardware).

I updated my /etc/rc.conf file by adding "lro" to the end of the interface setting, like this:

# port 1 / 10 Gbps / VMXNET3
ifconfig_vmx0="inet 10.0.0.1 netmask 255.255.255.0 lro"

Then I restarted the interface:

service netif restart vmx0
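
To confirm the flag took effect, the "options=" line printed by ifconfig can be checked for LRO. The following is a rough sketch under some assumptions: lro_enabled is a hypothetical helper name, and it relies on ifconfig listing enabled features as a comma-separated list between angle brackets on the "options=" line. It deliberately ignores the "capabilities=" line that "ifconfig -m" prints, since that line lists what the driver supports rather than what is turned on.

```shell
# Hypothetical helper: read ifconfig output on stdin and succeed (exit 0)
# only if LRO appears in the enabled "options=<...>" list.
# The "capabilities=" line (shown by "ifconfig -m") is ignored on purpose.
lro_enabled() {
    awk '
        $1 ~ /^options=/ {
            list = $1
            sub(/^[^<]*</, "", list)    # drop everything up to the "<"
            sub(/>.*$/, "", list)       # drop the ">" and anything after it
            n = split(list, flags, ",")
            for (i = 1; i <= n; i++)
                if (flags[i] == "LRO") found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}

# Example use on the guest:
#   ifconfig vmx0 | lro_enabled && echo "LRO is on" || echo "LRO is off"
```

LRO can also be toggled on a running interface with "ifconfig vmx0 lro" and "ifconfig vmx0 -lro", which is handy for a quick before-and-after iperf3 comparison without touching /etc/rc.conf.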

On my next run, iPerf looked like this:

# iperf3 -c 10.0.0.1
Connecting to host 10.0.0.1, port 5201
[  5] local 10.0.0.2 port 60560 connected to 10.0.0.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   946 MBytes  7.94 Gbits/sec    0    499 KBytes       
[  5]   1.00-2.00   sec  1.10 GBytes  9.42 Gbits/sec    0    932 KBytes       
[  5]   2.00-3.00   sec  1.10 GBytes  9.42 Gbits/sec    0   1.25 MBytes       
[  5]   3.00-4.00   sec  1.10 GBytes  9.41 Gbits/sec    0   1.51 MBytes       
[  5]   4.00-5.00   sec  1.10 GBytes  9.42 Gbits/sec    0   1.75 MBytes       
[  5]   5.00-6.00   sec  1.10 GBytes  9.41 Gbits/sec    0   1.76 MBytes       
[  5]   6.00-7.00   sec  1.10 GBytes  9.41 Gbits/sec    0   1.76 MBytes       
[  5]   7.00-8.00   sec  1.10 GBytes  9.41 Gbits/sec    0   1.77 MBytes       
[  5]   8.00-9.00   sec  1.09 GBytes  9.40 Gbits/sec    0   1.77 MBytes       
[  5]   9.00-10.00  sec  1.10 GBytes  9.41 Gbits/sec    0   1.77 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  10.8 GBytes  9.26 Gbits/sec    0             sender
[  5]   0.00-10.01  sec  10.8 GBytes  9.26 Gbits/sec                  receiver

iperf Done.

Much better.

Now I'm getting 500-700 MB/sec in Samba. (It seems like the limiting factor in Samba's performance is the 6 Gbps HBA used for my storage.)
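
A quick unit conversion shows why the HBA is the likely suspect. This is only a back-of-the-envelope sketch: gbit_to_mbyte is a made-up helper name, it uses decimal units (1 Gbit = 1000 Mbit), and it takes the 6 Gbps figure at face value as a raw line rate, ignoring encoding and protocol overhead.

```shell
# Hypothetical helper: convert a rate in Gbits/sec to MBytes/sec
# (decimal units, raw line rate, no encoding or protocol overhead).
gbit_to_mbyte() {
    awk -v g="$1" 'BEGIN { printf "%.1f\n", g * 1000 / 8 }'
}

gbit_to_mbyte 6       # storage link: prints 750.0
gbit_to_mbyte 9.26    # network, from the iperf3 run above: prints 1157.5
```

The network can now carry roughly 1150 MB/sec, while 6 Gbps tops out at 750 MB/sec before overhead, so 500-700 MB/sec in Samba is about what I would expect if storage is the bottleneck.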