Slow upload speed for VMware virtual machines working via pfSense
Solution 1:
I can confirm the same scenario: we run pfSense on VMware, and upload bandwidth was painfully slow while download was just fine. For us, it happened ONLY when the pfSense VM and the guest VMs were on the same host. When the pfSense VM and the guest VMs were on different hosts, the problem went away. Disabling the offloads on the pfSense VMs (checking the boxes ON) instantly fixed the problem. I am not sure whether it affects only VMXNET 3 NICs, but that is how our pfSense VMs are configured. I hope this helps others, as this is not documented anywhere. I will try to get pfSense to update the VMware configuration page on their site.
Solution 2:
I solved the issue by disabling "Hardware Large Receive Offloading" in the pfSense settings (System / Advanced / Networking | Network Interfaces).
There is a checkbox "Disable hardware large receive offload", and I set it to "Checked" (ON).
The description of this option says:
Checking this option will disable hardware large receive offloading (LRO). This offloading is broken in some hardware drivers, and may impact performance with some specific NICs. This will take effect after a machine reboot or re-configure of each interface.
The other options are unchecked, so the "Network Interfaces" section now looks like this:
[ ] Disable hardware checksum offload
[ ] Disable hardware TCP segmentation offload
[✓] Disable hardware large receive offload
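Since the option only takes effect after a reboot or interface re-configure, it can be worth checking that the offloads are really off. Below is a minimal Python sketch, assuming the usual FreeBSD `ifconfig` output where capabilities are listed as `options=<hex><FLAG,FLAG,...>`. The interface name `vmx0`, the hex value, and the helper names are illustrative assumptions, not taken from a real system.

```python
import re

def offload_flags(ifconfig_text):
    """Return the capability flags from the first 'options=' line found."""
    match = re.search(r"options=[0-9a-fA-F]+<([^>]*)>", ifconfig_text)
    if not match:
        return set()
    return set(match.group(1).split(","))

def offloads_still_enabled(ifconfig_text):
    """List the LRO/TSO flags that are still active, sorted by name."""
    flags = offload_flags(ifconfig_text)
    return sorted(f for f in flags if f == "LRO" or f.startswith("TSO"))

# Hypothetical sample output; paste the real output of `ifconfig <iface>` here.
SAMPLE = (
    "vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500\n"
    "\toptions=60039b<RXCSUM,TXCSUM,VLAN_MTU,TSO4,TSO6,LRO>\n"
)

if __name__ == "__main__":
    print(offloads_still_enabled(SAMPLE))
```

An empty list would mean that neither LRO nor TSO shows up in the interface options.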
According to HP documentation, the network adapters on Gen8/Gen9 (model 331, based on the Broadcom BCM5719 chipset) support standard TCP/IP offloading techniques, including:
- TCP/IP, UDP checksum offload (TCO) (moves the TCP and IP checksum offloading from the CPU to the network adapter).
- Large send offload (LSO) or TCP segmentation offload (TSO) (allows the TCP segmentation to be handled by the adapter rather than the CPU).
This is what pfSense writes about these features:
The settings for Hardware TCP Segmentation Offload (TSO) and Hardware Large Receive Offload (LRO) under System > Advanced on the Networking tab default to checked (disabled) for good reason. Nearly all hardware/drivers have issues with these settings, and they can lead to throughput issues. Ensure the options are checked. Sometimes disabling via sysctl is also necessary.
In fact, the problem was not a hardware/driver issue but a misconfiguration. LRO and TSO should never be enabled on a router. These options may be enabled only if pfSense is configured as an end-point (e.g. a DNS server).
Let me quote from the FreeBSD bug tracker entry:
From my testing this is not a bug and everything is working as designed. I am seeing a large decrease in performance when LRO is turned on and using pfSense as a gateway. This is due to the originating packets having the IP DF (don’t fragment) flag set which then gets combined into larger packets via LRO. When this (larger) packet needs to be fragmented to match the other NIC the FreeBSD kernel sees the DF flag, drops the packet, and then sends back an ICMP “unreachable - need to frag” message to the sender. The reason it works at all is due to other traffic which disallows the LRO to occur and some packets get forwarded. One test I did was turning LRO on and using scp to put a file onto the pfSense appliance which resulted in good performance (not seeing the same drop in performance). I would be interested if you 1) see good performance with LRO turned on and scp a large file to the appliance and 2) see ICMP "need to frag" with LRO turned on and scp to a machine on the remote side. Since the pfSense appliance is being used as a gateway you should leave LRO turned off.
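To illustrate the mechanism described in the quote, here is a simplified Python sketch of the forwarding decision. It is a toy model, not FreeBSD kernel code: the function names, the 40-byte header size, and the 1500-byte MTU are assumptions chosen for the example.

```python
HEADER_LEN = 40  # assumed IPv4 + TCP header size without options

def lro_merge(segment_lens):
    """Model LRO: payloads of consecutive segments merged under one header."""
    payload = sum(length - HEADER_LEN for length in segment_lens)
    return HEADER_LEN + payload

def forward(packet_len, df_set, egress_mtu):
    """Decide what a router does with a packet leaving via egress_mtu."""
    if packet_len <= egress_mtu:
        return "forward"
    if df_set:
        # Too big and fragmentation is forbidden: drop and notify the sender.
        return "drop, send ICMP need-to-frag"
    return "fragment"

if __name__ == "__main__":
    segments = [1500, 1500, 1500]  # three full-size segments with DF set
    # LRO off: each segment fits the egress MTU and is forwarded as-is.
    print([forward(s, True, 1500) for s in segments])
    # LRO on: the merged packet exceeds the MTU and DF forbids fragmenting.
    print(forward(lro_merge(segments), True, 1500))
```

In this model the same traffic is forwarded cleanly with LRO off and dropped with LRO on. That matches the quoted observation that scp to the appliance itself (where nothing needs forwarding) performs well, while traffic routed through it does not.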