How to set the don't fragment (DF) flag on a socket?
You do it with the setsockopt() call, using the IP_DONTFRAG option on systems that provide it (FreeBSD, for example):
int val = 1;
setsockopt(sd, IPPROTO_IP, IP_DONTFRAG, &val, sizeof(val));
Here's a page explaining this in further detail.
For Linux, it appears you have to use the IP_MTU_DISCOVER option with the value IP_PMTUDISC_DO (or IP_PMTUDISC_DONT to turn it off):
int val = IP_PMTUDISC_DO;
setsockopt(sd, IPPROTO_IP, IP_MTU_DISCOVER, &val, sizeof(val));
I haven't tested this; I only looked through the header files and did a quick web search, so you'll need to test it yourself.
As to whether there's another way the DF flag could be set:
I find nowhere in my program where the "force DF flag" is set, yet tcpdump suggests it is. Is there any other way this could get set?
From this excellent page here:
IP_MTU_DISCOVER:
Sets or receives the Path MTU Discovery setting for a socket. When enabled, Linux will perform Path MTU Discovery as defined in RFC 1191 on this socket. The don't fragment flag is set on all outgoing datagrams. The system-wide default is controlled by the ip_no_pmtu_disc sysctl for SOCK_STREAM sockets, and disabled on all others. For non SOCK_STREAM sockets it is the user's responsibility to packetize the data in MTU sized chunks and to do the retransmits if necessary. The kernel will reject packets that are bigger than the known path MTU if this flag is set (with EMSGSIZE).
It looks to me like the system-wide default can be set using sysctl. On my system, sysctl ip_no_pmtu_disc returns "error: "ip_no_pmtu_disc" is an unknown key", most likely because the key has to be given by its full name, net.ipv4.ip_no_pmtu_disc. Other than that, I'm not aware of anything else (apart from setsockopt(), as previously mentioned) that can affect the setting.
If you are working in userland and want to bypass the kernel network stack, building your own packets and headers and handing them to a custom kernel module, there is a better option than setsockopt().
You can actually set the DF flag just like any other field of struct iphdr, defined in linux/ip.h. The 3-bit IP flags are in fact part of the frag_off (Fragment Offset) member of the structure.
When you think about it, it makes sense to group those two things, as the flags are fragmentation related. The section of RFC 791 describing the IP header structure states that the Fragment Offset is 13 bits long and that there are three 1-bit flags. The frag_off member is of type __be16, which can hold 13 + 3 bits.
Long story short, here's a solution:
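struct iphdr ip = {0}; /* start from a zeroed header */
ip.frag_off |= htons(IP_DF); /* frag_off is big-endian, so convert host to network order */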
This sets exactly the DF bit, using the IP_DF mask that exists for that purpose.
IP_DF is defined in net/ip.h (kernel headers, of course), whereas struct iphdr is defined in linux/ip.h.
I agree with paxdiablo's answer.
- setsockopt(sockfd, IPPROTO_IP, IP_MTU_DISCOVER, &val, sizeof(val)) where val is one of:
#define IP_PMTUDISC_DONT 0 /* Never send DF frames. */
#define IP_PMTUDISC_WANT 1 /* Use per route hints. */
#define IP_PMTUDISC_DO 2 /* Always DF. */
#define IP_PMTUDISC_PROBE 3 /* Ignore dst pmtu. */
- ip_no_pmtu_disc in kernel source:
if (ipv4_config.no_pmtu_disc)
    inet->pmtudisc = IP_PMTUDISC_DONT;
else
    inet->pmtudisc = IP_PMTUDISC_WANT;