how to `tail` the latest file in a directory
The close is only for moving to superuser or serverfault. The question will live there, and more people that might be interested will find it.
The real problem here is finding the most recently updated file in the directory, and I believe that has already been answered (either here or on Super User, I can’t recall).
12 Answers
If you’re worried about filenames with spaces,
But what happens when your latest file has spaces or special characters? Use $() instead of backticks and quote your subshell to avoid this problem.
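As a rough illustration of that advice, the sketch below uses a quoted $() substitution so that a name containing spaces reaches tail as a single argument. The temporary directory and the file names are made up for the demo, and ls -t appears only because this comment is about that style of command.

```shell
# Hypothetical demo files; the names are assumptions for illustration only.
demo_dir=$(mktemp -d)
printf 'old\n' > "$demo_dir/plain.log"
printf 'new\n' > "$demo_dir/name with spaces.log"
touch -t 200001010000 "$demo_dir/plain.log"   # backdate so it is clearly older

# $() nests and quotes more predictably than backticks.
newest=$(cd "$demo_dir" && ls -t | head -n 1)

# The double quotes keep "name with spaces.log" as one argument to tail.
last_line=$(tail -n 1 -- "$demo_dir/$newest")
printf '%s\n' "$last_line"

rm -rf -- "$demo_dir"
```

This still parses ls output, so it is only reasonable for names without newlines.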
Well, it depends on what you’re doing, really. A solution that always works everywhere, for all possible filenames, is very nice, but in a constrained situation (log files, for example, with known non-weird names) it might be unnecessary.
Do not parse the output of ls! Parsing the output of ls is difficult and unreliable.
If you must do this I recommend using find. Originally I had here a simple example merely to give you the gist of the solution, but since this answer seems somewhat popular I decided to revise this to provide a version that is safe to copy/paste and use with all inputs. Are you sitting comfortably? We’ll start with a oneliner that will give you the latest file in the current directory:
tail -- "$(find . -maxdepth 1 -type f -printf '%T@.%p\0' | sort -znr -t. -k1,2 | while IFS= read -r -d '' record ; do printf '%s' "$record" | cut -d. -f3- ; break ; done)"
Not quite a oneliner now, is it? Here it is again as a shell function and formatted for easier reading:
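latest-file-in-directory () {
    # List regular files in the given directory (default: the current one),
    # newest first, and print only the path from the first record.
    find "${1:-.}" -maxdepth 1 -type f -printf '%T@.%p\0' | \
        sort -znr -t. -k1,2 | \
        while IFS= read -r -d '' record ; do
            printf '%s' "$record" | cut -d. -f3-
            break
        done
}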
And now that as a oneliner:
tail -- "$(latest-file-in-directory)"
If all else fails you can include the above function in your .bashrc and consider the problem solved, with one caveat. If you just wanted to get the job done you need not read further.
The caveat is that a file name ending in one or more newlines will still not be passed to tail correctly. Working around this problem is complicated, and I consider it sufficient that such a malicious file name results in the relatively safe behavior of a "No such file" error rather than anything more dangerous.
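The caveat comes from the shell itself: command substitution strips every trailing newline from what it captures. A minimal sketch, using a made-up name that ends in two newlines:

```shell
# Build a hypothetical name ending in two newlines. The trailing "x" protects
# the newlines from being stripped, and is then removed again.
original=$(printf 'file\n\nx')
original=${original%x}

# Passing the name through $() loses the trailing newlines.
captured=$(printf '%s' "$original")

printf 'original: %s characters, captured: %s characters\n' "${#original}" "${#captured}"
```

The captured value is the shorter name "file", which is why tail reports a missing file instead of opening the real one.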
Juicy details
For the curious this is the tedious explanation of how it works, why it’s safe and why other methods probably aren’t.
Danger, Will Robinson
First of all, the only byte that is safe to delimit file paths is null because it is the only byte universally forbidden in file paths on Unix systems. It is important when handling any list of file paths to only use null as a delimiter and, when handing even a single file path from one program to another, to do so in a manner which will not choke on arbitrary bytes. There are many seemingly-correct ways to solve this and other problems which fail by assuming (even accidentally) that file names will not have either new lines or spaces in them. Neither assumption is safe.
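A small sketch of why newline-delimited lists are unsafe, assuming a find that supports -print0 (GNU and BSD find do). The directory and file names are invented for the demo.

```shell
# Two hypothetical files, one of which has a newline inside its name.
demo_dir=$(mktemp -d)
touch "$demo_dir/one"
touch "$demo_dir/two
lines"

# Counting newline-delimited records overcounts the files...
by_newline=$(find "$demo_dir" -type f | wc -l | tr -d ' ')

# ...while counting null delimiters gives the true number of paths.
by_null=$(find "$demo_dir" -type f -print0 | tr -cd '\0' | wc -c | tr -d ' ')

printf 'newline count: %s, null count: %s\n' "$by_newline" "$by_null"
rm -rf -- "$demo_dir"
```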
For today’s purposes step one is to get a null-delimited list of files out of find. This is pretty easy if you have a find supporting -print0 such as GNU’s:
find . -print0
But this list still does not tell us which one is newest, so we need to include that information. I choose to use find’s -printf switch which lets me specify what data appears in the output. Not all versions of find support -printf (it is not standard) but GNU find does. If you find yourself without -printf you will need to rely on -exec stat {} \; at which point you must give up all hope of portability as stat is not standard either. For now I’m going to move on assuming you have GNU tools.
Here I am asking for printf format %T@ which is the modification time in seconds since the beginning of the Unix epoch followed by a period and then followed by a number indicating fractions of a second. I add to this another period and then %p (which is the full path to the file) before ending with a null byte.
find . -maxdepth 1 \! -type d -printf '%T@.%p\0'
It may go without saying, but for the sake of completeness: -maxdepth 1 prevents find from listing the contents of subdirectories, and \! -type d skips directories, which you are unlikely to want to tail. So far I have files in the current directory with modification time information, so now I need to sort by that modification time.
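To see what these records look like, here is a sketch that assumes GNU find (for -printf) and uses a throwaway directory with invented names. The null delimiters are turned into newlines only so the result can be printed.

```shell
# Hypothetical layout: one file at the top level, one inside a subdirectory.
demo_dir=$(mktemp -d)
touch "$demo_dir/sample.log"
mkdir "$demo_dir/subdir"
touch "$demo_dir/subdir/nested.log"

# Each record is <seconds>.<fraction>.<path>, terminated by a null byte.
records=$(find "$demo_dir" -maxdepth 1 \! -type d -printf '%T@.%p\0' | tr '\0' '\n')

printf '%s\n' "$records"
rm -rf -- "$demo_dir"
```

Only sample.log should be listed: the subdirectory is skipped by \! -type d and its contents are out of reach of -maxdepth 1.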
Getting it in the right order
By default sort expects its input to be newline-delimited records. If you have GNU sort you can ask it to expect null-delimited records instead by using the -z switch; for standard sort there is no solution. I am only interested in sorting by the first two numbers (seconds and fractions of a second) and don’t want to sort by the actual file name so I tell sort two things: First, that it should consider the period ( . ) a field delimiter and second that it should only use the first and second fields when considering how to sort the records.
| sort -znr -t. -k1,2
First of all I am bundling three short options that take no value together (-znr is just a concise way of saying -z -n -r). After that -t . (the space is optional) tells sort the field delimiter character and -k 1,2 specifies the field numbers: first and second (sort counts fields from one, not zero). Remember that a sample record for the current directory would look like:
1000000000.0000000000../some-file-name
This means sort will look at first 1000000000 and then 0000000000 when ordering this record. The -n option tells sort to use numeric comparison when comparing these values, because both values are numbers. This may not be important since the numbers are of fixed length but it does no harm.
The other switch given to sort is -r for "reverse." By default the output of a numeric sort will be lowest numbers first; -r changes it so that it lists the lowest numbers last and the highest numbers first. Since these numbers are timestamps, higher will mean newer, and this puts the newest record at the beginning of the list.
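The sort step can be tried in isolation. This sketch assumes GNU sort (for -z) and feeds it three hand-written records in the same shape as the find output; the timestamps and names are made up.

```shell
# Three hypothetical records: <seconds>.<fraction>.<path>, null-terminated.
sorted=$(printf '%s\0' \
    '100.5000000000./old' \
    '300.2500000000./newest file' \
    '200.7500000000./middle' |
  LC_ALL=C sort -znr -t. -k1,2 | tr '\0' '\n')

# The newest record (largest timestamp) comes first.
first=$(printf '%s\n' "$sorted" | head -n 1)
printf '%s\n' "$sorted"
```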
Just the important bits
As the list of file paths emerges from sort it now has the desired answer right at the top. What remains is to find a way to discard the other records and to strip the timestamp. Unfortunately, at the time of writing even GNU head and tail did not accept switches to make them operate on null-delimited input (newer GNU coreutils releases add a -z option to both). Instead I use a while loop as a kind of poor man’s head.
| while IFS= read -r -d '' record
First I set IFS to the empty string so that the record is not subjected to word splitting (which would otherwise strip leading and trailing whitespace). Next I tell read two things: do not interpret escape sequences in the input (-r) and the input is delimited with a null byte (-d); here the empty string '' is used to indicate "no delimiter", that is, delimited by null. Each record will be read into the variable record so that each time the while loop iterates it has a single timestamp and a single file name. Note that -d is not part of the standard read; it is an extension provided by Bash (among other shells), so if you have only a standard read this technique will not work and you have little recourse.
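A minimal sketch of the loop acting as a null-aware head. Bash is invoked explicitly because read -d is not available in every sh; the two records are invented, and the first deliberately contains a newline.

```shell
# Two hypothetical null-terminated records; the first spans two lines.
first=$(printf '%s\0' 'first record
with a newline' 'second record' |
  bash -c 'while IFS= read -r -d "" record; do printf "%s" "$record"; break; done')

# Only the first record is printed, newline and all.
printf '%s\n' "$first"
```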
We know that the record variable has three parts to it, all delimited by period characters. Using the cut utility it is possible to extract a portion of them.
printf '%s' "$record" | cut -d. -f3-
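The cut step is easy to check on its own. In this sketch the record is hand-written, and the file name contains extra periods to show that -f3- keeps everything from the third field onward.

```shell
# A hypothetical record: seconds, fraction, then the path ./some-file.name.log
record='1000000000.0000000000../some-file.name.log'

# Fields 1 and 2 are the timestamp; field 3 onward is the path, dots included.
path=$(printf '%s' "$record" | cut -d. -f3-)

printf '%s\n' "$path"
```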
Having printed the newest file path there’s no need to keep going: break exits the loop without letting it move on to the second file path.
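For comparison, here is a sketch of the same pipeline without the loop. It assumes a GNU coreutils release recent enough for head and cut to accept -z, which is not what this answer was written against; the directory and file names are made up.

```shell
# Hypothetical demo files; the older one is backdated.
demo_dir=$(mktemp -d)
printf 'old\n' > "$demo_dir/old.log"
printf 'new\n' > "$demo_dir/new file.log"
touch -t 200001010000 "$demo_dir/old.log"

# head -z keeps the first null-terminated record, cut -z strips the timestamp,
# and tr removes the trailing null before the shell captures the path.
newest=$(find "$demo_dir" -maxdepth 1 -type f -printf '%T@.%p\0' |
  LC_ALL=C sort -znr -t. -k1,2 | head -z -n 1 | cut -z -d. -f3- | tr -d '\0')

printf '%s\n' "$newest"
rm -rf -- "$demo_dir"
```

The trailing-newline caveat described earlier applies here as well, because the result still passes through command substitution.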
The only thing that remains is running tail on the file path returned by this pipeline. You may have noticed in my example that I did this by enclosing the pipeline in a subshell; what you may not have noticed is that I enclosed the subshell in double quotes. This is important because, even with all of this effort to be safe for any file name, an unquoted subshell expansion could still break things at the last step. A more detailed explanation is available if you’re interested. The second important but easily overlooked aspect of the invocation of tail is that I provided the option -- to it before expanding the file name. This instructs tail that no more options are being specified and that everything following is a file name, which makes it safe to handle file names that begin with -.
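Both points can be seen with one awkward name. The sketch below uses an invented file whose name starts with a dash and contains spaces; without the quotes the name would be split into words, and without -- tail would try to read it as options.

```shell
# A hypothetical file name that looks like an option and contains spaces.
demo_dir=$(mktemp -d)
name='-n name with spaces'
printf 'hello\n' > "$demo_dir/$name"

# Run in a subshell so the caller's working directory is left alone.
content=$(cd "$demo_dir" && tail -- "$name")

printf '%s\n' "$content"
rm -rf -- "$demo_dir"
```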
Find latest files
How do I find out the most recently accessed file in a given directory? I can use the find command to list out all files modified/accessed in last n minutes. But here in my case, I’m not sure when the last file was modified/accessed? All that I need is to list all the files which were accessed/modified very recently among all other sub-files or sub-directories, sorted by their access/modified times, for example. Is that possible?
Your question is unclear. Are you saying you want to take the list of files from find and sort them by date?
7 Answers
To print the last 3 accessed files (sorted from the last accessed file to the third last accessed file):
find . -type f -exec stat -c '%X %n' {} \; | sort -nr | awk 'NR==1,NR==3 {print $2}'
To print the last 3 modified files (sorted from the last modified file to the third last modified file):
find . -type f -exec stat -c '%Y %n' {} \; | sort -nr | awk 'NR==1,NR==3 {print $2}'
- find . -type f -exec stat -c '%X %n' {} \; : prints the last access time followed by the file’s path for each file in the current directory hierarchy;
- find . -type f -exec stat -c '%Y %n' {} \; : prints the last modification time followed by the file’s path for each file in the current directory hierarchy;
- sort -nr : sorts in reverse numerical order;
- awk 'NR==1,NR==3 {print $2}' : prints the second field of the first, second and third lines.
You can change the number of files shown by changing 3 to the desired number of files in awk 'NR==1,NR==3 {print $2}'.
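The awk range pattern can be tried without touching the file system. In this sketch the four input lines are hand-written stand-ins for the sorted stat output.

```shell
# Hypothetical "timestamp path" lines, already sorted newest first.
top3=$(printf '%s\n' '40 ./d' '30 ./c' '20 ./b' '10 ./a' |
  awk 'NR==1,NR==3 {print $2}')

# Lines 1 through 3 are selected, and only their second field is printed.
printf '%s\n' "$top3"
```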
% touch file1
% touch file2
% touch file3
% find . -type f -exec stat -c '%X %n' {} \; | sort -nr | awk 'NR==1,NR==3 {print $2}'
./file3
./file2
./file1
% find . -type f -exec stat -c '%Y %n' {} \; | sort -nr | awk 'NR==1,NR==3 {print $2}'
./file3
./file2
./file1
% cat file1
% find . -type f -exec stat -c '%X %n' {} \; | sort -nr | awk 'NR==1,NR==3 {print $2}'
./file1
./file3
./file2
% find . -type f -exec stat -c '%Y %n' {} \; | sort -nr | awk 'NR==1,NR==3 {print $2}'
./file3
./file2
./file1
% touch file2
% find . -type f -exec stat -c '%X %n' {} \; | sort -nr | awk 'NR==1,NR==3 {print $2}'
./file2
./file1
./file3
% find . -type f -exec stat -c '%Y %n' {} \; | sort -nr | awk 'NR==1,NR==3 {print $2}'
./file2
./file3
./file1
@SHW Not sure what you mean. The files are sorted based on the number of seconds passed from January 1st 1970.
To help other users: if you want to scan a directory other than the current one, replace the . directly after find with that directory. It took me a couple of minutes to understand it.
These commands break for filenames that contain whitespace. The culprit is awk, and as I don’t know how to use it, simply replacing the awk command with | head -n 3 (to keep the first three results) does the trick. If you still want to remove the timestamps, chain it with | cut -d' ' -f2-
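The suggestion in this comment can be sketched as follows, assuming GNU stat (for -c) and using invented file names that contain spaces. Names containing newlines would still break it, since the records are newline-delimited.

```shell
# Hypothetical files with spaces in their names and distinct modification times.
demo_dir=$(mktemp -d)
touch -t 200001010000 "$demo_dir/oldest file"
touch -t 201001010000 "$demo_dir/middle file"
touch -t 202001010000 "$demo_dir/newest file"

# head keeps the first two records; cut drops the timestamp but keeps
# everything after the first space, so spaces in names survive.
recent=$(find "$demo_dir" -type f -exec stat -c '%Y %n' {} \; |
  sort -nr | head -n 2 | cut -d' ' -f2-)

printf '%s\n' "$recent"
rm -rf -- "$demo_dir"
```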
You could use the recursive switch ( -R ) to ls along with the sort by time switch ( -t ) and the reverse sort switch ( -r ) to list out all the files in a directory tree. This will not sort all the files by their access/modify dates across sub-directories, but will sort them by this date within each sub-directory independently.
Using a command such as this: ls -ltrR .
Example
$ ls -ltrR .
total 759720
-rw-r-----@ 1 sammingolelli staff 2514441 Mar 31 2015 restfulapi-120704053212-phpapp01.pdf
-rw-r-----@ 1 sammingolelli staff 567808 Apr 7 2015 USGCB-Windows-Settings.xls
-rw-r-----@ 1 sammingolelli staff 180736 Apr 7 2015 USGCB-RHEL5-Desktop-Settings-Version-1.2.5.0.xls
-rw-r-----@ 1 sammingolelli staff 6474 Apr 8 2015 tap_kp_mavericks.txt

./kerberos:
total 5464
-rw-r-----@ 1 sammingolelli staff 37317 Oct 2 13:03 Set_up_Kerberos_instruction_d8.docx
-rw-r-----@ 1 sammingolelli staff 2753195 Oct 13 13:49 Keberos configuration with AD 01_09_2014.pdf

./homestarrunner:
total 10624
-rw-rw-rw-@ 1 sammingolelli staff 319422 May 10 2000 error_hs.wav
-rw-rw-rw-@ 1 sammingolelli staff 53499 Jun 8 2001 sb_duck.mp3
-rw-rw-rw-@ 1 sammingolelli staff 199254 Mar 11 2002 email_sb.wav
-rw-rw-rw-@ 1 sammingolelli staff 39288 Mar 25 2002 bubs_dontutalk.mp3
-rw-rw-rw-@ 1 sammingolelli staff 75432 May 6 2002 trash_sb.wav
-rw-rw-rw-@ 1 sammingolelli staff 298946 Dec 1 2002 error_sb.wav
-rw-rw-rw-@ 1 sammingolelli staff 298686 Dec 1 2002 startup_hs.wav
-rw-rw-rw-@ 1 sammingolelli staff 90279 Dec 1 2002 sb_meedlymee.mp3
-rw-rw-rw-@ 1 sammingolelli staff 73561 Dec 1 2002 sb_dubdeuce.mp3
-rw-rw-rw-@ 1 sammingolelli staff 193097 Dec 1 2002 sb_pizza.mp3
-rw-rw-rw-@ 1 sammingolelli staff 30093 Dec 1 2002 sb_stiny.mp3
-rw-rw-rw-@ 1 sammingolelli staff 61858 Dec 1 2002 ss_sadflying.mp3
-rw-rw-rw-@ 1 sammingolelli staff 150142 Dec 1 2002 email_hs.wav
-rw-rw-rw-@ 1 sammingolelli staff 68545 Dec 1 2002 bubs_grabbinbutt.mp3
-rw-rw-rw-@ 1 sammingolelli staff 61022 Dec 1 2002 cz_jeorghb.mp3
-rw-rw-rw-@ 1 sammingolelli staff 40124 Dec 1 2002 marzy_nasty.mp3
-rw-rw-rw-@ 1 sammingolelli staff 224116 Dec 1 2002 shutdown_sb.wav
-rw-rw-rw-@ 1 sammingolelli staff 260546 Dec 1 2002 shutdown_hs.wav
-rw-rw-rw-@ 1 sammingolelli staff 57686 Dec 1 2002 trash_hs.wav