Find all large files in linux

How to find the largest file in a directory and its subdirectories?

We’re just starting a UNIX class and are learning a variety of Bash commands. Our assignment involves performing various commands on a directory that has a number of folders under it as well. I know how to list and count all the regular files from the root folder using:

But I’d like to know where to go from there in order to find the largest file in the whole directory. I’ve seen somethings regarding a du command, but we haven’t learned that, so in the repertoire of things we’ve learned I assume we need to somehow connect it to the ls -t command. And pardon me if my ‘lingo’ isn’t correct, I’m still getting used to it!

If you know of a command, but aren’t sure how to use it then try typing in man followed by the command you are interested in. Up will pop a nice manual entry for that command (press q to get back to command line).

17 Answers 17

If you want to find and print the top 10 largest files names (not directories) in a particular directory and its sub directories

$ find . -type f -printf ‘%s %p\n’|sort -nr|head

To restrict the search to the present directory use «-maxdepth 1» with find.

$ find . -maxdepth 1 -printf ‘%s %p\n’|sort -nr|head

And to print the top 10 largest «files and directories»:

$ du -a . | sort -nr | head

** Use «head -n X» instead of the only «head» above to print the top X largest files (in all the above examples)

The first solution didn’t work on OS X for me, so i ended up using a quick hack to filter out the directories from the third solution: du -am . | sort -nr | grep ‘\..*\.’ | head . The m is to display file size in megabytes and used grep to show lines with at least two dots, the first is in the ./ in the path, the second is in the file extension, e.g. .mov .

To find the top 25 files in the current directory and its subdirectories:

find . -type f -exec ls -al <> \; | sort -nr -k5 | head -n 25

This will output the top 25 files by sorting based on the size of the files via the «sort -nr -k5» piped command.

Same but with human-readable file sizes:

find . -type f -exec ls -alh <> \; | sort -hr -k5 | head -n 25

find . -type f | xargs ls -lS | head -n 1 
-rw-r--r-- 1 nneonneo staff 9274991 Apr 11 02:29 ./devel/misc/test.out 

If you just want the filename:

find . -type f | xargs ls -1S | head -n 1 

This avoids using awk and allows you to use whatever flags you want in ls .

Caveat. Because xargs tries to avoid building overlong command lines, this might fail if you run it on a directory with a lot of files because ls ends up executing more than once. It’s not an insurmountable problem (you can collect the head -n 1 output from each ls invocation, and run ls -S again, looping until you have a single file), but it does mar this approach somewhat.

This finds the biggest files in only the first batch xargs has executed. To fix it add sorting: find . -type f -print0 | xargs -0 ls -lS | sort -rk 5 | head -n 10 . Worked on OSX for me.

Читайте также:  Linux установка пакета определенной версии

There is no simple command available to find out the largest files/directories on a Linux/UNIX/BSD filesystem. However, combination of following three commands (using pipes) you can easily find out list of largest files:

# du -a /var | sort -n -r | head -n 10 

If you want more human readable output try:

$ cd /path/to/some/var $ du -hsx * | sort -rh | head -10 
  • Var is the directory you wan to search
  • du command -h option : display sizes in human readable format (e.g., 1K, 234M, 2G).
  • du command -s option : show only a total for each argument (summary).
  • du command -x option : skip directories on different file systems.
  • sort command -r option : reverse the result of comparisons.
  • sort command -h option : compare human readable numbers. This is GNU sort specific option only.
  • head command -10 OR -n 10 option : show the first 10 lines.

I like the 2nd command better but on osx, no -h option for sort version installed. Should be for mac: du -hsx * | sort -rn | head -10

This lists files recursively if they’re normal files, sorts by the 7th field (which is size in my find output; check yours), and shows just the first file.

find . -type f -ls | sort +7 | head -1 

The first option to find is the start path for the recursive search. A -type of f searches for normal files. Note that if you try to parse this as a filename, you may fail if the filename contains spaces, newlines or other special characters. The options to sort also vary by operating system. I’m using FreeBSD.

A «better» but more complex and heavier solution would be to have find traverse the directories, but perhaps use stat to get the details about the file, then perhaps use awk to find the largest size. Note that the output of stat also depends on your operating system.

What is the +7 arg meant to be doing? On my machine sort just complains that it can’t find a file called +7 .

@Dunes — As I said, check the man page for sort on your system. I’m using OS X 10.4 at the moment, where usage derives from FreeBSD’s sort: sort [-cmus] [-t separator] [-o output-file] [-T tempdir] [-bdfiMnr] [+POS1 [-POS2]] [-k POS1[,POS2]] [file. ] . Note the +POS [-POS2] . This works in current versions of FreeBSD too.

Seems you have a different sort program to me. This is the man page for my sort program — linux.die.net/man/1/sort For this to work on my machine you would need to explicitly use the -k arg eg. sort -k 7 . edit: by OSX 10.5 the man page for sort seems to have changed to the version I have.

@Dunes — It’s all GNU sort, but different versions. The [+POS1] [-POS2] notation is just an older one. As far as I can tell, this notation is still supported by modern GNU sort, though now that I look, it seems to have been dropped from the sort man page after around version 5.1. You can see it in the man page for sort for FreeBSD 4.11. I guess I haven’t read sort’s man page since before FreeBSD 5.0 was released!

This will find the largest file or folder in your present working directory:

ls -S /path/to/folder | head -1 

To find the largest file in all sub-directories:

find /path/to/folder -type f -exec ls -s <> \; | sort -nr | awk 'NR==1 < $1=""; sub(/^ /, ""); print >' 

I think the default behaviour of ls is to list files in columns (ie. several entries per line), so the first doesn’t exactly find just the largest file. With regards to your second command it only found the largest file in the given directory and not its subdirectories.

Читайте также:  What is my default shell in linux

@Dunes: You are correct, the first command could find directories, but not because of default behavior of ls . In my testing, the -S flag will list one file per line. I have corrected the second command. Hopefully now it’s full-proof. Thank-you.

find . -type f -ls|sort -nr -k7|awk 'NR==1' #formatted 
find . -type f -ls | sort -nrk7 | head -1 #unformatted 

because anything else posted here didn’t work. This will find the largest file in $PWD and subdirectories.

Try the following one-liner (display top-20 biggest files):

ls -1Rs | sed -e "s/^ *//" | grep "^5" | sort -nr | head -n20 
ls -1Rhs | sed -e "s/^ *//" | grep "^2" | sort -hr | head -n20 

Works fine under Linux/BSD/OSX in comparison to other answers, as find’s -printf option doesn’t exist on OSX/BSD and stat has different parameters depending on OS. However the second command to work on OSX/BSD properly (as sort doesn’t have -h ), install sort from coreutils or remove -h from ls and use sort -nr instead.

So these aliases are useful to have in your rc files:

alias big='du -ah . | sort -rh | head -20' alias big-files='ls -1Rhs | sed -e "s/^ *//" | grep "^9" | sort -hr | head -n20' 

This only shows the filename without the path, so doesn’t really help to actually find the biggest file.

find /your/path -printf "%k %p\n" | sort -g -k 1,1 | awk ' 500000) print $1/1024 "MB" " " $2 >' |tail -n 1 

This will print the largest file name and size and more than 500M. You can move the if($1 > 500000) ,and it will print the largest file in the directory.

du -aS /PATH/TO/folder | sort -rn | head -2 | tail -1

du -aS /PATH/TO/folder | sort -rn | awk ‘NR==2’

To list the larger file in a folder

ls -sh /pathFolder | sort -rh | head -n 1 

The output of ls -sh is a sized s and human h understandable view of the file size number.

You could use ls -shS /pathFolder | head -n 1 . The bigger S from ls already order the list from the larger files to the smaller ones but the first result its the sum of all files in that folder. So if you want just to list the bigger file, one file, you need to head -n 2 and check at the «second line result» or use the first example with ls sort head .

This command works for me,

find /path/to/dir -type f -exec du -h ‘<>‘ + | sort -hr | head -10

Lists Top 10 files ordered by size in human-readable mode.

This script simplifies finding largest files for further action. I keep it in my ~/bin directory, and put ~/bin in my $PATH.

#!/usr/bin/env bash # scriptname: above # author: Jonathan D. Lettvin, 201401220235 # This finds files of size >= $1 (format $[K|M|G|T], default 10G) # using a reliable version-independent bash hash to relax find's -size syntax. # Specifying size using 'T' for Terabytes is supported. # Output size has units (K|M|G|T) in the left hand output column. # Example: # ubuntu12.04$ above 1T # 128T /proc/core # http://stackoverflow.com/questions/1494178/how-to-define-hash-tables-in-bash # Inspiration for hasch: thanks Adam Katz, Oct 18 2012 00:39 function hasch() < local hasch=`echo "$1" | cksum`; echo "$"; > function usage() < echo "Usage: $0 ["; exit 1; > function arg1() < # Translate single arg (if present) into format usable by find. count=10; units=G; # Default find -size argument to 10G. size=$$ if [ -n "$1" ]; then for P in TT tT GG gG MM mM Kk kk; do xlat[`hasch $`]="$"; done units=$`]>; count=$ test -n "$units" || usage test -x $(echo "$count" | sed s/4//g) || usage if [ "$units" == "T" ]; then units="G"; let count=$count*1024; fi size=$$ fi > function main() < sudo \ find / -type f -size +$size -exec ls -lh <>\; 2>/dev/null | \ awk '< N=$5; fn=$9; for(i=10;i<=NF;i++);print N " " fn >' > arg1 $1 main $size 

Источник

Linux Command , how to find files by size larger than x?

I’m trying to find files with size larger than «x» , ex: 32 bytes But what I found in ls —help was only ls -S , that just sort by size and not satisfied my demand I’m new to Linux , I’ve tried Google but I don’t know what keywords should I use to find answer , could somebody give me advice , thank you !

Your question is not about programing but about linux command that would be better in serverfault.com. If you want to ask something about shell script please read this first. stackoverflow.com/help/mcve

4 Answers 4

Try to use the power of find utility.

Find files greater than 32 bytes:

Of course you could also ask for files smaller than 32 bytes:

And files with exact 32 bytes:

For files with 32 bytes or more:

Or not smaller than 32 bytes (the same as above, but written another way):

And files between 32 and 64 bytes:

And you can also apply other common suffixes:

If you get find: illegal option — i error, specify the folder to search in after find . Add a dot to find files in the current folder: find . -size +32k -size -64M

You can change 32 for the size you want. In this case, I used your example.

Both answers you two provided me are excellent 😀 But I think I will give @rslemos the best answer tag since he’s «1 minute earier» , but still thanks to you !

Explanation: Use unix command find Using -size oprator

The find utility recursively descends the directory tree for each path listed, evaluating an expression (composed of the ‘primaries’ and ‘operands’) in terms of each file in the tree.

Solution: According to the documentation

-size n[ckMGTP] True if the file's size, rounded up, in 512-byte blocks is n. If n is fol- lowed by a c, than the primary is true if the file's size is n bytes (charac- ters). Similarly if n is followed by a scale indicator than the file's size is compared to n scaled as: k kilobytes (1024 bytes) M megabytes (1024 kilobytes) G gigabytes (1024 megabytes) T terabytes (1024 gigabytes) P petabytes (1024 terabytes) 

Usage: perform find operation with -size flag and threshold measurement arguments: more than(+)/less than(-), number(n) and measurement type (k/M/G/T/P).

Formula: find -size [+/-]

1.Greater Than — Find all files in my current directory (.) that greater than 500 kilobyte

2.Less Than — Find all files in my current directory (.) that less than 100 megabyte.

3.Range — Find specific file (test) in my current directory (.) that greater than 500 kilobyte less than 100 megabyte (500k-1000k)

find . -name "test" -size +500k -size -100M 

4.Complex Range with Regex Find all .mkv files in all filesystem (root /) that are greater than 1 gigabyte and created this month, and present info of them.

find / -name "*.mkv" -size +1G -type f -ctime -4w | xargs ls -la 

Источник

Оцените статью
Adblock
detector