How to get maximum supported MTU size for interface?
Different interfaces and different machines appear to have different MTU limits; trying to set a value above the limit results in an error:
Error: mtu greater than device maximum.
I’m trying to find a way to check whether a NIC supports a specific MTU size without trying to set it first; actually, I want to find the theoretical maximum MTU on all interfaces on all my servers. I’ve inspected all features of ethtool, looked in /sys/class/net, etc., but all I can find is the current MTU value. Is there a way to see how high the MTU can be on an interface without trying it?
I don’t know of any canonical way to find out what you want, but you should ask yourself why you want to change the MTU. In 9 out of 10 cases, your default MTU is just fine. There is also something to be said about jumbo frames, as seen here: archive.nanog.org/sites/default/files/…
I can answer this question pretty easily: we are running network appliances that ask for as large an MTU as we can provide on their cluster fabric network, but it must be the same MTU on all servers. So I want to see how high I can go without breaking stuff in the process (so, no trial and error).
@GeorgeShuklin I’ve edited the question to aid future readers, feel free to revert the edit if you think it is incorrect.
2 Answers
Amazingly, I found that ip reports this information if asked for details with the -d option (ip -d link list):
21: enxa44cc8aa52bd: mtu 1500 qdisc fq_codel state DOWN mode DEFAULT group default qlen 1000
    link/ether a4:4c:c8:aa:52:bd brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 9194 addrgenmode none numtxqueues 1 numrxqueues 1 gso_max_size 16354 gso_max_segs 65535
The minmtu and maxmtu fields are the answer.
Note that the reduced ip command included in busybox does not support the -d option to show minmtu and maxmtu.
I would presume, though, that it should be ip -d link show. Although list works, I don’t see it mentioned in man ip-link or in the zsh autocompletions when pressing TAB. I would guess using list is deprecated.
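To collect the value on many servers, the text output can be filtered. The following is a minimal sketch, assuming the iproute2 text layout shown in the answer above (an "N: name:" header line, with "maxmtu VALUE" somewhere in the details); list_maxmtu is a hypothetical helper name, not part of iproute2.

```shell
#!/bin/sh
# list_maxmtu (hypothetical helper): read `ip -d link show` text on stdin
# and print one "<interface> <maxmtu>" pair per interface.
# Assumes the layout shown above; other iproute2 versions may differ.
list_maxmtu() {
    awk '
        # Header line, e.g. "21: enxa44cc8aa52bd: ..."; remember the name.
        /^[0-9]+: / { name = $2; sub(/:$/, "", name); sub(/@.*/, "", name) }
        {
            # Look for the "maxmtu" keyword and print the value after it.
            for (i = 1; i < NF; i++)
                if ($i == "maxmtu") print name, $(i + 1)
        }
    '
}

# Usage on a live system (needs the full iproute2 ip, not the busybox one):
#   ip -d link show | list_maxmtu
```

Interfaces whose driver does not report a maximum simply produce no line.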
You can send a packet of a specific size with ping (ping -M do -s SIZE IP, as in the example below), IP being the local IP of the interface you wish to check.
Note that this method adds 28 bytes of headers (20 bytes of IPv4 header plus 8 bytes of ICMP header) on top of the size you specify.
Just keep increasing the size (in the ping command) until you get a Message too long error or similar.
Current MTU setting and IP:
[root@centos7 ~]# ip l | grep ens37 | awk '{print $4, $5}'
mtu 1500
[root@dev-worker1 ~]# ip addr show ens37 | grep "inet " | awk '{print $2}'
10.10.10.10/24
Sending a packet that is larger than the current MTU setting but is still accepted:
[root@centos7 ~]# ping -M do -s 8972 10.10.10.10
PING 10.10.10.10 (10.10.10.10) 8972(9000) bytes of data.
8980 bytes from 10.10.10.10: icmp_seq=1 ttl=64 time=0.103 ms
8980 bytes from 10.10.10.10: icmp_seq=2 ttl=64 time=0.067 ms
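Sending one that is too large. Some distros may actually tell you a maximum via this method (here 65507 is the largest ICMP payload an IPv4 packet can carry, 65535 - 28, rather than a NIC limit). E.g. CentOS 7: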
[root@centos7 ~]# ping -M do -s 118972 10.10.10.10
Error: packet size 118972 is too large. Maximum is 65507
Once done, you can set it to the maximum, if that’s what you desire, using ip link.
- Clarified I’m referring to pinging a local IP and provided example.
- I do not know for sure that some distros will output the actual limit, as my testing environment interfaces have max capabilities of 65535 bytes.
How to find the correct MTU and MRU of your link
In the previous post, I talked about Network IP Fragmentation, what it is and why it’s needed (you are advised to read it before continuing). I also covered the so-called PMTUD black hole effect.
Fixing a PMTUD Black hole is a multistep process, and it starts with finding the correct MTU/MRU of your link.
Now as I’ve discussed, every path can have its own unique MTU/MRU value, but we are usually interested in the max value that is dictated by your ISP.
When you send a packet, it always routes through your ISP. Because of different protocols in place and their overheads (mostly layer 2 ones), it is common for your ISP to force MTU/MRU of less than 1500 bytes on your link.
If a packet exceeds these values, your ISP is required to send the appropriate ICMP messages either back to you (for the MTU), or to the server sending the data (for the MRU). These messages give the corresponding hosts a chance to adapt themselves to the link.
If your ISP decides not to send the required ICMP messages (or they get lost in transit for some reason), all sorts of issues could arise. To solve that, the first step is to manually determine your link’s MTU/MRU values.
ICMP packets
The best way to find your link’s MTU/MRU is to send ICMP packets (or more precisely, pings) to another host.
To be able to interpret the results, we first need to have an understanding of an ICMP packet’s structure.
Each layer 3 PDU consists of different parts. Let’s take a look at a typical IPv4 ICMP packet:
It has 20 bytes of IPv4 header at the top, followed by 8 bytes of ICMP header, and finally the data, or payload.
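The header sizes above give a simple conversion between a ping payload size and the size of the packet on the wire. A minimal sketch, assuming an IPv4 header without options; payload_for_mtu and mtu_for_payload are hypothetical helper names used only for illustration:

```shell
#!/bin/sh
IPV4_HEADER=20   # bytes; assumes an IPv4 header without options
ICMP_HEADER=8    # bytes; ICMP echo header

# Largest ping payload that fits into a packet of the given total size.
payload_for_mtu() { echo $(( $1 - IPV4_HEADER - ICMP_HEADER )); }

# Total packet size produced by a ping with the given payload size.
mtu_for_payload() { echo $(( $1 + IPV4_HEADER + ICMP_HEADER )); }

payload_for_mtu 1500   # prints 1472
mtu_for_payload 1464   # prints 1492
```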
Procedure
If we send some packets to a remote host with the DF flag set, and those packets exceed the maximum packet size of our link, not only should they never reach the remote host, but we should also receive an ICMP message from a router along the way (likely our own ISP) informing us.
The easiest way to do so is by using ICMP ping. The ping command is available on pretty much every platform you can think of.
To summarize: we set the DF flag on our ICMP packet and send a big enough ping payload; once it exceeds the maximum packet size of our link, we observe the results.
Constructing the ping command
We first try to send a 1500-byte ping packet to a remote server. I will explain why shortly.
Linux
For this, we’re going to use the iputils package, which is most likely already installed in your distro [1].
Open the terminal and issue this command:
ping -c 4 -s 1472 -M do 8.8.4.4
The arguments are pretty simple:
Argument | Description |
---|---|
-c 4 | Number of pings |
-s 1472 | Size of the payload. Remember that each ICMP payload has 28 bytes of overhead (1472 + 28 = 1500) |
-M do | Path MTU Discovery strategy. The do option makes ping set the Don’t Fragment flag (not confusing at all) |
8.8.4.4 | The remote host we’re sending the packets to (in this case, one of Google’s public DNS servers) |
Windows
Open cmd.exe and issue this command:
ping -l 1472 -f 8.8.4.4
Again, arguments are very simple:
Argument | Description |
---|---|
-l 1472 | Size of the payload (just like above) |
-f | Set the Don’t Fragment flag |
8.8.4.4 | Sorry Google! Your DNS servers are just too awesome |
Identifying the correct MTU
Now if you do get a ping reply with the above commands, it means that at least for the path between you and Google’s DNS servers, your MTU is 1500 [2] and you should not have any MTU (and likely MRU) problems.
If you suspect you are having MTU issues on your link, the first step is to reproduce them. So try those commands again with another IP address (or ideally the IP address you have issues with) until you do not get a pong from the other end.
On a healthy link with MTU of less than 1500 bytes, you should see a response like this:
Linux:
PING 8.8.4.4 (8.8.4.4) 1472(1500) bytes of data.
From 10.11.12.13 icmp_seq=1 Frag needed and DF set (mtu = 1492)
ping: local error: Message too long, mtu=1492
ping: local error: Message too long, mtu=1492
ping: local error: Message too long, mtu=1492
--- 8.8.4.4 ping statistics ---
4 packets transmitted, 0 received, +4 errors, 100% packet loss, time 3051ms

Windows:
Pinging 8.8.4.4 with 1472 bytes of data:
Reply from 10.11.12.13: Packet needs to be fragmented but DF set.
Packet needs to be fragmented but DF set.
Packet needs to be fragmented but DF set.
Packet needs to be fragmented but DF set.
Ping statistics for 8.8.4.4:
Packets: Sent = 4, Received = 1, Lost = 3 (75% loss),
As you can see, a hop on the path responded that it cannot pass our packet. We also get the hop’s IP address and, as a bonus, Linux ping also shows the MTU.
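When scripting this check, the MTU that Linux ping reports can be pulled out of its output. A minimal sketch, assuming the two message formats shown in the sample output ("mtu = 1492" and "mtu=1492"); reported_mtu is a hypothetical helper name and other ping versions may word the message differently:

```shell
#!/bin/sh
# reported_mtu (hypothetical helper): read Linux ping output on stdin and
# print the first MTU value it mentions, or nothing if none is reported.
reported_mtu() {
    sed -n 's/.*mtu *= *\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Usage on a live system (error lines may go to stderr, hence 2>&1):
#   ping -c 4 -s 1472 -M do 8.8.4.4 2>&1 | reported_mtu
```

An empty result means no hop reported an MTU, which is also what a PMTUD black hole looks like.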
More likely than not, the said hop is either the next immediate router on your path (i.e., your home router) or your ISP.
You can then adjust the payload size (in this case 1492 - 28 = 1464) and retry. You do that until you get a response from the other end.
That, however, is how it should work; if it always worked like this, you wouldn’t need to find the MTU manually.
If you instead get nothing but plain ping timeouts every time, then you might indeed have a PMTUD black hole on your path. Finding the MTU at this point is as easy as adjusting ping’s payload.
To summarize: you reduce the ping’s payload until you start getting replies. You could start by halving its size (which would most likely get you a reply) and then fine-tune it from there [3].
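The halve-and-fine-tune procedure is essentially a binary search over the payload size. A minimal sketch for Linux with iputils ping, assuming that every payload up to some limit gets a reply and every larger one does not; find_max_payload, ping_probe and the TARGET variable are hypothetical names introduced here for illustration:

```shell
#!/bin/sh
# ping_probe SIZE (hypothetical helper): succeeds if a single DF ping with
# SIZE bytes of payload to $TARGET gets a reply within 2 seconds.
ping_probe() {
    ping -c 1 -W 2 -M do -s "$1" "$TARGET" >/dev/null 2>&1
}

# find_max_payload LOW HIGH PROBE: binary search for the largest payload in
# [LOW, HIGH] for which the command PROBE succeeds. Assumes LOW itself works.
find_max_payload() {
    lo=$1; hi=$2; probe=$3
    while [ "$lo" -lt "$hi" ]; do
        mid=$(( (lo + hi + 1) / 2 ))    # round up so the loop always shrinks
        if "$probe" "$mid"; then
            lo=$mid                     # mid got a reply: answer is >= mid
        else
            hi=$(( mid - 1 ))           # mid failed: answer is below mid
        fi
    done
    echo "$lo"
}

# Usage on a live system:
#   TARGET=8.8.4.4
#   payload=$(find_max_payload 0 1472 ping_probe)
#   echo "MTU: $(( payload + 28 ))"
```

Passing the probe as an argument keeps the search independent of ping, so the same loop works with a different probe (for example, one run from the remote end when looking for the MRU).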
Identifying the correct MRU
You usually shouldn’t need to adjust this. It’s not your problem but the next hop’s [4].
There is a certain twist in finding your MRU:
Setting the DF flag on your ping packet does not automatically mean the reply will also have the DF flag set (in fact, my testing shows it doesn’t). And even if you could force that, you would still have trouble finding out whether PMTUD in fact works for your link’s MRU. That is, unless you control the other end as well.
To summarize: your best bet for finding the correct MRU of your link is to ping your host from a remote location (making it the MTU of that remote location). If you don’t have access to a remote host, search the web for online ping services and use those instead.
Caveats and pitfalls
You should be aware that there are some situations in which you might not get the expected results. Some of them are as follows:
- Your ISP might silently remove the DF flags: This is really a bad practice but some ISPs opt for this as a way of solving MTU issues altogether. On such connections, once a packet reaches the ISP, the DF flag is removed.
- Some remote hosts might send truncated replies: To protect their network, some remote hosts truncate the ping reply instead of echoing the same payload, making the reply somewhat invalid. While the client can usually handle this correctly for a single ping packet, the situation could get complicated once packets get fragmented. Best to use a host known not to do that (like Google’s DNS servers).
- A firewall might be blocking your ping request/response: If that’s the case and you are sure it’s not being done on your end, you are pretty much out of luck using ping. One way would be constructing a TCP packet to send to a remote host (this is somewhat more complex). Another way is to just run Wireshark and observe the normal traffic for a couple of minutes to make an educated guess about your link’s MTU/MRU.
- You might be behind a transparent proxy: That means you think you have a direct connection but you really don’t. Most of your traffic goes through what’s called a transparent proxy. Even if you do get a ping response from a remote host, that does not necessarily reflect your real MTU/MRU to the transparent proxy.
- Your link MTU might change over time: This is rather unusual for home networks but I’ve seen this on mobile networks. In such cases the all-time low MTU of your link is your only reliable MTU.
- A firewall might block fragmented ping requests/responses: Yes, no kidding, I’ve seen this too. Whether it was accidental and the result of a connection tracking issue, or on purpose to possibly discourage the use of ping payloads to transfer real data (e.g., to bypass firewalls), it effectively blocks pings as soon as they get fragmented. So basically you may not have any MTU issue at all and yet, ping results would suggest otherwise. This one is really unfair to troubleshoot!
- The NIC on your host might be set to use an MTU value other than 1500: Especially on Windows, this may cause lots of weird results. Whether it’s set via the adapter’s settings or the netsh command, it can influence your result.
- Some low-level network drivers might affect the result: Again, especially on Windows, this can easily happen with security software such as Kaspersky’s NDIS filter.
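Regarding the NIC caveat: on Linux, the MTU currently configured on each interface can be read from /sys/class/net before testing. A minimal sketch; list_current_mtu is a hypothetical helper name, and the optional directory argument exists only so the function can be pointed at a different tree:

```shell
#!/bin/sh
# list_current_mtu [DIR] (hypothetical helper): print "<interface> <mtu>"
# for every interface directory under DIR (default: /sys/class/net).
list_current_mtu() {
    root=${1:-/sys/class/net}
    for f in "$root"/*/mtu; do
        [ -r "$f" ] || continue          # skip if the glob matched nothing
        dir=${f%/mtu}                    # strip the trailing "/mtu"
        printf '%s %s\n' "${dir##*/}" "$(cat "$f")"
    done
}

# Usage on a live Linux system:
#   list_current_mtu
```

An interface that shows something other than 1500 here will cap the ping test at that value, whatever the path allows.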
1. Note that some ping implementations, like the BusyBox one, are not suitable for this since they lack the required parameters.
2. Or possibly even more, but that is uncommon. It could also mean your ISP is being really naughty; more on that later.
3. If you never get a reply and you are sure your internet connection is working, then a firewall in the path might be blocking it. I will briefly introduce another approach at the end of the article.
4. Refer to the PMTUD discussion for more info.