«Stale NFS file handle» after reboot
I should say that there was no problem with the shared folder from the client side. However, after rebooting both the server and the client, I see this message. Is there any way to fix that?
4 Answers
The order of reboots is important. Rebooting the server after the clients can result in this situation. The stale NFS handle indicates that the client has a file open, but the server no longer recognizes the file handle. In some cases, NFS will clean up its data structures after a timeout. In other cases, you will need to clean the NFS data structures yourself and restart NFS afterwards. Where these structures are located is somewhat OS-dependent.
Try restarting NFS first on the server and then on the clients. This may clear the file handles.
Rebooting NFS servers with files opened from other servers is not recommended. This is especially problematic if the open file has been deleted on the server. The server may keep the file open until it is rebooted, but the reboot will remove the in-memory file handle on the server side. Then the client will no longer be able to open the file.
Determining which mounts have been used from the server is difficult and unreliable. The showmount -a option may show some active mounts, but may not report all of them. Locked files are easier to identify, but this requires locking to be enabled and relies on the client software to lock the files.
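As a rough sketch of how the showmount -a listing could be summarised on the server: the function below reads that output on standard input and prints one "export client" pair per line. The name clients_by_export is made up for illustration, and the sketch assumes the usual client:/path line format, so IPv6 client addresses are not handled. As noted above, the listing itself may be incomplete.

```shell
# Sketch only: summarise `showmount -a` output (read from stdin).
# Assumes each data line looks like "client:/exported/path"; the header
# line printed by showmount contains no ":/" and is therefore skipped.
clients_by_export() {
    awk '{
        i = index($0, ":/")
        if (i > 0) print substr($0, i + 1), substr($0, 1, i - 1)
    }' | sort
}

# Typical use on the NFS server (not run here):
#   showmount -a | clients_by_export
```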
You can use lsof on the clients to identify the processes which have files open on the mounts.
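A minimal sketch of that lsof check, assuming lsof is installed on the client and prints its default column layout (COMMAND first, PID second). The function names are hypothetical. The -N option restricts the listing to NFS files.

```shell
# Sketch only: reduce lsof's default output (read from stdin) to unique
# "PID COMMAND" pairs, skipping the header line.
summarise_lsof() {
    awk 'NR > 1 { print $2, $1 }' | sort -u
}

# Hypothetical wrapper: list the processes holding NFS files open.
nfs_open_processes() {
    lsof -N 2>/dev/null | summarise_lsof
}

# Typical use on a client (not run here):
#   nfs_open_processes
```

Be aware that lsof itself may block if the server behind a hard mount is unreachable.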
I use the hard and intr mount options on my NFS mounts. The hard option causes IO to be retried indefinitely. The intr option allows processes to be killed if they are waiting on NFS IO to complete.
Using hard,intr is good advice. However, note that NFS doubles the timeout with each retry, so you had best set timeo=1 and retrans=5 or so. Note that this will put heavy strain on your NFS server after an NFS restart. Try not to restart your NFS service so often 😉
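To see which of these options a client is actually using, something like the following sketch could help. It assumes the Linux /proc/mounts layout (device, mount point, type, comma-separated options). The function name nfs_mount_options is made up for illustration. The timeo value is reported in tenths of a second.

```shell
# Sketch only: print "mountpoint hard|soft timeo retrans" for every NFS
# mount found in a /proc/mounts-style file (default: /proc/mounts).
nfs_mount_options() {
    awk '$3 ~ /^nfs4?$/ {
        mode = "unknown"; timeo = "-"; retrans = "-"
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++) {
            if (opts[i] == "hard" || opts[i] == "soft") mode = opts[i]
            else if (opts[i] ~ /^timeo=/)   timeo = substr(opts[i], 7)
            else if (opts[i] ~ /^retrans=/) retrans = substr(opts[i], 9)
        }
        print $2, mode, timeo, retrans
    }' "${1:-/proc/mounts}"
}

# Typical use on a client (not run here):
#   nfs_mount_options
```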
Your answer is correct. I also found another simple solution: on the node that reports the stale NFS handle, just umount the folder and mount it again.
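A sketch of that umount-and-remount step, assuming the mount point is listed in /etc/fstab so that mount can be called with the path alone. The helper names run and remount_nfs and the DRY_RUN variable are hypothetical. Setting DRY_RUN prints the commands instead of executing them.

```shell
# Sketch only: run a command, or just print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Force a lazy unmount of the stale mount point, then mount it again.
# Assumes the mount point has an /etc/fstab entry.
remount_nfs() {
    run umount -f -l "$1" && run mount "$1"
}

# Typical use as root on the affected client (not run here):
#   remount_nfs /path/to/mountpoint
```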
The root cause for problems of this type is usually that the rpc.statd service fails to resolve the IP address of the other partner by the uname -n name it used before the reboot. Yes, even if you mount by IP address, the names must be resolvable, because the NFS lock protocol uses names internally. Please see this answer of mine in another NFS question.
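One way to check this, sketched under the assumption that getent is available (it is part of glibc on most Linux systems): the hypothetical function check_resolvable reports, for each name given, whether the system resolver can look it up.

```shell
# Sketch only: report whether each given host name resolves through the
# system resolver (hosts file, DNS, ...). Returns 1 if any name fails.
check_resolvable() {
    status=0
    for name in "$@"; do
        if getent hosts "$name" > /dev/null 2>&1; then
            echo "ok $name"
        else
            echo "UNRESOLVED $name"
            status=1
        fi
    done
    return "$status"
}

# Typical use (not run here): on each machine, check its own uname -n name
# and the name the partner machine reports with uname -n, for example
#   check_resolvable "$(uname -n)" name-of-the-other-machine
```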
    #!/bin/bash
    # Purpose:
    #   Detect stale NFS file handles and unmount the affected mount points.
    # Script created: July 29, 2015 by Birgit Ducarroz
    # Last modification: --
    #

    # Detect stale file handles and write the affected paths into a file.
    # df reports them as: df: ‘/mount/point’: Stale file handle
    # NOTE: the awk program was lost from the post; printing field 2
    # (the quoted path) is an assumption based on the sed clean-up below.
    df 2>&1 | grep 'Stale file handle' | awk '{print $2}' > NFS_stales.txt

    # Remove the : ‘ and ’ characters from the output
    sed -r -i 's/://' NFS_stales.txt && sed -r -i 's/‘//' NFS_stales.txt && sed -r -i 's/’//' NFS_stales.txt

    # Not used: replace space by a new line
    # stales=`cat NFS_stales.txt && sed -r -i ':a;N;$!ba;s/ /\n /g' NFS_stales.txt`

    # Read the NFS_stales.txt output file line by line, then unmount stale by stale.
    # IFS='' (or IFS=) prevents leading/trailing whitespace from being trimmed.
    # -r prevents backslash escapes from being interpreted.
    # || [[ -n $line ]] prevents the last line from being ignored if it doesn't
    # end with a \n (since read returns a non-zero exit code when it encounters EOF).
    while IFS='' read -r line || [[ -n "$line" ]]; do
        echo "Unmounting due to NFS Stale file handle: $line"
        umount -fl "$line"
    done < "NFS_stales.txt"
    #EOF
In the meantime, I found that the above script does not work on all servers. Here is an update:
    #!/bin/bash
    # Purpose:
    #   Detect stale NFS file handles and unmount the affected mount points.
    # Script created: July 29, 2015 by Birgit Ducarroz
    # Last modification: 23.12.2020 /bdu
    #

    MYMAIL="my.mail@something.com"
    THIS_HOST=`hostname`

    # Detect stale file handles and write the affected paths into a file.
    # NOTE: the awk program was lost from the post; printing field 2
    # (the quoted path) is an assumption based on the sed clean-up below.
    df 2>&1 | grep 'Stale' | awk '{print $2}' > NFS_stales.txt
    sleep 8

    # Remove : and quote characters from the output
    sed -r -i 's/://' NFS_stales.txt && sed -r -i 's/‘//' NFS_stales.txt && sed -r -i 's/’//' NFS_stales.txt && sed -r -i 's/`//' NFS_stales.txt && sed -r -i "s/'//" NFS_stales.txt

    # Not used: replace space by a new line
    # stales=`cat NFS_stales.txt && sed -r -i ':a;N;$!ba;s/ /\n /g' NFS_stales.txt`

    # Read the NFS_stales.txt output file line by line, then unmount stale by stale.
    # IFS='' (or IFS=) prevents leading/trailing whitespace from being trimmed.
    # -r prevents backslash escapes from being interpreted.
    # || [[ -n $line ]] prevents the last line from being ignored if it doesn't
    # end with a \n (since read returns a non-zero exit code when it encounters EOF).
    while IFS='' read -r line || [[ -n "$line" ]]; do
        message=`echo "Unmounting due to NFS Stale file handle: $line"`
        echo "$message"
        echo "$message" | mail -s "$THIS_HOST: NFS Stale Handle unmounted" "$MYMAIL"
        # NOTE: the post is cut off after the mail command. The unmount and the
        # end of the loop below are assumed to match the earlier version.
        umount -fl "$line"
    done < "NFS_stales.txt"
How can I resolve a stale nfs handle?
I once shut down my home server while my desktop was still connected to it via NFS. Afterwards I kept getting a "stale NFS handle" warning when entering my home directory, which caused issues with some programs that looked in those folders. How do I resolve this issue without restarting my machine? I am running Debian Squeeze/Wheezy.
3 Answers
Force unmount the local mount
/etc/init.d/nfs-common restart
Try this shell script. It works well for me:
    #!/bin/bash
    # Purpose:
    #   Detect stale NFS file handles and unmount the affected mount points.
    # Script created: July 29, 2015 by Birgit Ducarroz
    # Last modification: --
    #

    # Detect stale file handles and write the affected paths into a file.
    # df reports them as: df: ‘/mount/point’: Stale file handle
    # NOTE: the awk program was lost from the post; printing field 2
    # (the quoted path) is an assumption based on the sed clean-up below.
    df 2>&1 | grep 'Stale file handle' | awk '{print $2}' > NFS_stales.txt

    # Remove the : ‘ and ’ characters from the output
    sed -r -i 's/://' NFS_stales.txt && sed -r -i 's/‘//' NFS_stales.txt && sed -r -i 's/’//' NFS_stales.txt

    # Not used: replace space by a new line
    # stales=`cat NFS_stales.txt && sed -r -i ':a;N;$!ba;s/ /\n /g' NFS_stales.txt`

    # Read the NFS_stales.txt output file line by line, then unmount stale by stale.
    # IFS='' (or IFS=) prevents leading/trailing whitespace from being trimmed.
    # -r prevents backslash escapes from being interpreted.
    # || [[ -n $line ]] prevents the last line from being ignored if it doesn't
    # end with a \n (since read returns a non-zero exit code when it encounters EOF).
    while IFS='' read -r line || [[ -n "$line" ]]; do
        echo "Unmounting due to NFS Stale file handle: $line"
        umount -fl "$line"
    done < "NFS_stales.txt"
    #EOF
mount.nfs: Stale file handle error - cannot umount
Are you positive the /export/registry-gitlab-prod-data-vol directory exists and has the correct permissions?
Try adding -v to the mount -t command. Also check dmesg and /var/log/messages, which may show additional information. Have you tried rebooting your machine?
5 Answers
A mount -t nfs fails with Stale file handle if the server has some stale exports entries for that client.
Example scenario: this might happen when the server reboots without the client umounting the nfs volumes first. When the server is back and the client then umounts and tries to mount the nfs volume the server might respond with:
mount.nfs: Stale file handle
You can check for this by looking at /proc/fs/nfs/exports or /proc/fs/nfsd/exports. If there is an entry for the client, it might be a stale one.
You can fix this by explicitly un-exporting and re-exporting the relevant exports on the server. For example, to do this with all exports:
    # exportfs -ua
    # cat /proc/fs/nfs/exports
    # exportfs -a
After this, the client's mount -t nfs ... should succeed.
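To check whether the server still holds an entry for one particular client, a sketch like the following could be used. It assumes the /proc/fs/nfsd/exports layout of comment lines starting with # followed by lines of the form path client(flags). The function name exports_for_client is made up for illustration.

```shell
# Sketch only: print the export paths that the kernel export table lists
# for one client. The client string must match the table entry exactly
# (host name, IP address or network, as it appears before the "(").
exports_for_client() {
    awk -v c="$1" '
        /^#/ { next }                          # skip header comments
        index($2, c "(") == 1 { print $1 }     # client field starts with "c("
    ' "${2:-/proc/fs/nfsd/exports}"
}

# Typical use as root on the server (not run here):
#   exports_for_client 192.0.2.10
```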
Note that mount yielding ESTALE is quite different from some other system call (like open/readdir/unlink/chdir ...) returning ESTALE. It is the export being stale versus a file handle being stale. A stale file handle easily happens with NFS (e.g. a client has a file handle but the file got deleted on the server).
Just want to add that this worked for me when the server is running Manjaro Linux and the client is running Ubuntu. Furthermore, the conditions that led to this are that I moved a drive from one server to another, unmounting the device from the clients before the move, but one client ended up giving me the "stale file handle" error when I tried to mount the nfs share from the new server. Executing the commands in this answer fixed it.
The error, ESTALE, was originally introduced to handle the situation where a file handle, which NFS uses to uniquely identify a file on the server, no longer refers to a valid file on the server. This can happen when the file is removed on the server, either by an application on the server, some other client accessing the server, or sometimes even by another mounted file system from the same client. The NFS server also returns this error when the file resides upon a file system which is no longer exported. Additionally, some NFS servers even change the file handle when a file is renamed, although this practice is discouraged.
This error occurs even if a file or directory, with the same name, is recreated on the server without the client being aware of it. The file handle refers to a specific instance of a file and deleting the file and then recreating it creates a new instance of the file.
The error, ESTALE, is usually seen when cached directory information is used to convert a pathname to a dentry/inode pair. The information is discovered to be out of date or stale when a subsequent operation is sent to the NFS server. This can easily happen in system calls such as stat(2) when the pathname is converted to a dentry/inode pair using cached information, but then a subsequent GETATTR call to the server discovers that the file handle is no longer valid.
This error can also occur when a change is made on the server in between looking up different components of the pathname to be looked up or between a successful lookup and a subsequent operation.
Original link about ESTALE: ESTALE LWN.
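On a client, the stat(2) case described above can be probed from the shell. This is only a sketch: the function name is_stale is hypothetical, and it relies on stat printing the C-locale error text, which is "Stale file handle" on current glibc and "Stale NFS file handle" on older versions.

```shell
# Sketch only: succeed (return 0) if stat on the path fails with ESTALE.
# LC_ALL=C forces the English error message so it can be matched.
is_stale() {
    err=$(LC_ALL=C stat -- "$1" 2>&1 > /dev/null) && return 1
    case $err in
        *"Stale file handle"* | *"Stale NFS file handle"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Typical use on a client (not run here):
#   is_stale /path/to/mountpoint && echo "stale"
```

If the server behind a hard mount is unreachable, stat may block instead of returning an error.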
I suggest that you check the files and directories on the NFS server, or ask the administrator of the NFS server to do so.
Some old pagecache, inode, or dentry cache entries may still exist on the NFS server. Clear them as follows:
    # To free pagecache
    echo 1 > /proc/sys/vm/drop_caches
    # To free dentries and inodes
    echo 2 > /proc/sys/vm/drop_caches
    # To free pagecache, dentries and inodes
    echo 3 > /proc/sys/vm/drop_caches