How to list the size of each file and directory and sort by descending size in Bash?
I found that there is no easy way to get the size of a directory in Bash. I want that when I type ls —
What exactly do you mean by the «size» of a directory? The number of files under it (recursively or not)? The sum of the sizes of the files under it (recursively or not)? The disk size of the directory itself? (A directory is implemented as a special file containing file names and other information.)
@KeithThompson @KitHo du command estimates file space usage so you cannot use it if you want to get the exact size.
@ztank1013: Depending on what you mean by «the exact size», du (at least the GNU coreutils version) probably has an option to provide the information.
12 Answers
Simply navigate to the directory and run the following command:
Or add -h for human-readable sizes and -r to print bigger directories/files first:
du -a -h --max-depth=1 | sort -hr
du -h requires sort -h too, to ensure that, say 981M sorts before 1.3G ; with sort -n only the numbers would be taken into account and they’d be the wrong way round.
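The difference between the two sort modes can be seen on a few sample values. This is a minimal sketch, assuming GNU sort (which provides -h); the sizes are made-up sample data, not output from a real du run:

```shell
# Sample sizes in "du -h" style (made-up values for illustration).
sizes=$(printf '%s\n' '981M' '1.3G' '12K')

# Numeric sort looks only at the leading number, so 1.3G sorts as the smallest.
numeric_order=$(printf '%s\n' "$sizes" | sort -n)

# Human-numeric sort takes the unit suffix into account.
human_order=$(printf '%s\n' "$sizes" | sort -h)

printf 'sort -n:\n%s\n\nsort -h:\n%s\n' "$numeric_order" "$human_order"
```

With sort -n the 1.3G entry ends up first (as if it were the smallest), while sort -h places it last, after 981M.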
This doesn’t list the size of the individual files within the current directory, only the size of its subdirectories and the total size of the current directory. How would you include individual files in the output as well (to answer OP’s question)?
@ErikTrautman to list the files also you need to add -a and use --all instead of --max-depth=1 like so du -a -h --all | sort -h
Apparently the --max-depth option is not in Mac OS X’s version of the du command. You can use the following instead.
Unfortunately this does not show the files, but only the folder sizes. -a does not work with -d either.
To show files and folders, I combined 2 commands: l -hp | grep -v / && du -h -d 1 , which shows the normal file size from ls for files, but uses du for directories.
(this will not show hidden files, i.e. dotfiles)
Use du -sm for MB units etc. I always use
because the total line ( -c ) will end up at the bottom for obvious reasons 🙂
PS:
- See comments for handling dotfiles
- I frequently use e.g. ‘du -smc /home// | sort -n |tail’ to get a feel of where exactly the large bits are sitting
du --max-depth=1|sort -n or find . -mindepth 1 -maxdepth 1|xargs du -s|sort -n for including dotfiles too.
@arnaud576875 find . -mindepth 1 -maxdepth 1 -print0 | xargs -0 du -s | sort -n if some of the found paths could contain spaces.
This is a great variant to get a human readable view of the biggest: sudo du -smch * | sort -h | tail
Command
Output
3,5M asdf.6000.gz
3,4M asdf.4000.gz
3,2M asdf.2000.gz
2,5M xyz.PT.gz
136K xyz.6000.gz
116K xyz.6000p.gz
88K test.4000.gz
76K test.4000p.gz
44K test.2000.gz
8,0K desc.common.tcl
8,0K wer.2000p.gz
8,0K wer.2000.gz
4,0K ttree.3
Explanation
- du displays «disk usage»
- h is for «human readable» (both, in sort and in du)
- max-depth=0 means du will not show sizes of subfolders (remove that if you want to show all sizes of every file in every sub-, subsub-, ... folder)
- r is for «reverse» (biggest file first)
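Going by the explanation above, the command was presumably of roughly the following shape. This is a reconstruction under that assumption rather than the answer's original text; it relies on GNU du and GNU sort, and the function name list_by_size is a hypothetical wrapper added for illustration:

```shell
# list_by_size is a hypothetical helper name, not from the original answer.
# Prints one line per entry of the given directory (default: current one),
# biggest first. Assumes GNU du (--max-depth) and GNU sort (-h).
list_by_size() {
    (
        cd -- "${1:-.}" || exit 1
        # --max-depth=0: one total per argument, no lines for subfolders.
        du -h --max-depth=0 -- * | sort -hr
    )
}
```

For example, list_by_size ~/Downloads would list the entries of that directory with the largest one on top. The subshell keeps the cd from affecting the calling shell.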
ncdu
When I came to this question, I wanted to clean up my file system. The command line tool ncdu is way better suited to this task.
Just type ncdu [path] in the command line. After a few seconds for analyzing the path, you will see something like this:
$ ncdu 1.11 ~ Use the arrow keys to navigate, press ? for help
--- / ---------------------------------------------------------
.  96,1 GiB [##########] /home
.  17,7 GiB [#         ] /usr
.   4,5 GiB [          ] /var
    1,1 GiB [          ] /lib
  732,1 MiB [          ] /opt
. 275,6 MiB [          ] /boot
  198,0 MiB [          ] /storage
. 153,5 MiB [          ] /run
.  16,6 MiB [          ] /etc
   13,5 MiB [          ] /bin
   11,3 MiB [          ] /sbin
.   8,8 MiB [          ] /tmp
.   2,2 MiB [          ] /dev
!  16,0 KiB [          ] /lost+found
    8,0 KiB [          ] /media
    8,0 KiB [          ] /snap
    4,0 KiB [          ] /lib64
e   4,0 KiB [          ] /srv
!   4,0 KiB [          ] /root
e   4,0 KiB [          ] /mnt
e   4,0 KiB [          ] /cdrom
.   0,0   B [          ] /proc
.   0,0   B [          ] /sys
@   0,0   B [          ]  initrd.img.old
@   0,0   B [          ]  initrd.img
@   0,0   B [          ]  vmlinuz.old
@   0,0   B [          ]  vmlinuz
Delete the currently highlighted element with d , exit with CTRL + c
ls -S sorts by size. Then, to show the size too, ls -lS gives a long ( -l ), sorted by size ( -S ) display. I usually add -h too, to make things easier to read, so, ls -lhS .
Ah, sorry, that was not clear from your post. You want du , seems someone has posted it. @sehe: Depends on your definition of real — it is showing the amount of space the directory is using to store itself. (It’s just not also adding in the size of the subentries.) It’s not a random number, and it’s not always 4KiB.
find . -mindepth 1 -maxdepth 1 -type d | parallel du -s | sort -n
I think I might have figured out what you want to do. This will give a sorted list of all the files and all the directories, sorted by file size and size of the content in the directories.
(find . -depth 1 -type f -exec ls -s {} \;; find . -depth 1 -type d -exec du -s {} \;) | sort -n
[enhanced version]
This is going to be much faster and more precise than the initial version below, and it will output the sum of the sizes of all files under the current directory:
echo `find . -type f -exec stat -c %s {} \; | tr '\n' '+' | sed 's/+$//g'` | bc
The stat -c %s command on a file returns its size in bytes. The tr command is used here to overcome xargs limitations (apparently piping to xargs splits the results over several lines, breaking the logic of my command). Hence tr takes care of replacing each line feed with a + (plus) sign. sed has the sole purpose of removing the last + sign from the resulting string, to avoid complaints from the final bc (basic calculator) command that, as usual, does the math.
Performance: I tested it on several directories and on up to ~150.000 files (the current number of files on my Fedora 15 box), with what I believe is an amazing result:
# time echo `find / -type f -exec stat -c %s {} \; | tr '\n' '+' | sed 's/+$//g'` | bc
12671767700
real 2m19.164s
user 0m2.039s
sys 0m14.850s
Just in case you want to make a comparison with the du -sb / command, it will output an estimated disk usage in bytes ( -b option)
As I was expecting it is a little larger than my command calculation because the du utility returns allocated space of each file and not the actual consumed space.
[initial version]
You cannot use du command if you need to know the exact sum size of your folder because (as per man page citation) du estimates file space usage. Hence it will lead you to a wrong result, an approximation (maybe close to the sum size but most likely greater than the actual size you are looking for).
I think there might be different ways to answer your question but this is mine:
ls -l $(find . -type f | xargs) | cut -d" " -f5 | xargs | sed 's/\ /+/g'| bc
It finds all files under the . directory (replace . with whatever directory you like), hidden files included, and (using xargs ) outputs their names on a single line, then produces a detailed list using ls -l . This (sometimes) huge output is piped to the cut command, and only the fifth field ( -f5 ), which is the file size in bytes, is taken and piped to xargs again, which produces a single line of sizes separated by blanks. Now some sed magic takes place, replacing each blank with a plus ( + ) sign, and finally bc (basic calculator) does the math.
It might need additional tuning, and ls may complain that the argument list is too long.
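The argument-list limit can be sidestepped by letting find batch the file names itself. The following is a sketch of that variation, not the original author's command; it assumes GNU stat (for -c %s), and the function name sum_file_bytes is hypothetical:

```shell
# sum_file_bytes is a hypothetical helper name used for illustration.
# Sums the sizes (in bytes) of all regular files below a directory.
# Assumes GNU stat; BSD stat uses a different format option.
sum_file_bytes() {
    # "-exec ... {} +" hands many file names to each stat call without
    # exceeding the argument-list limit; awk adds up the sizes.
    find "${1:-.}" -type f -exec stat -c %s {} + |
        awk '{ total += $1 } END { printf "%.0f\n", total }'
}
```

Because the file names never pass through the shell's word splitting, names containing spaces are handled as well.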
How to get the summarized sizes of directories and their subdirectories?
Let’s say I want to get the size of each directory of a Linux file system. When I use ls -la I don’t really get the summarized size of the folders. If I use df I get the size of each mounted file system but that also doesn’t help me. And with du I get the size of each subdirectory and the summary of the whole file system. But I want to have only the summarized size of each directory within the ROOT folder of the file system. Is there any command to achieve that?
The --total flag was helpful for me. E.g. du -sh --total applications/* . askubuntu.com/a/465436/48214
9 Answers
This does what you’re looking for:
- -s to give only the total for each command line argument.
- -h for human-readable suffixes like M for megabytes and G for gigabytes (optional).
- /* simply expands to all directories (and files) in / . Note: dotfiles are not included; run shopt -s dotglob to include those too.
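Put together, the options described above suggest a command of the form du -sh /* . The sketch below wraps that in a Bash function so that the dotglob setting stays local to the call; the name dir_totals is hypothetical:

```shell
# dir_totals is a hypothetical helper name used for illustration.
# Prints one summarized, human-readable size per entry of a directory
# (default: /), including dotfiles. Requires Bash because of shopt.
dir_totals() {
    (
        # The subshell keeps dotglob from leaking into the calling shell.
        shopt -s dotglob
        dir=${1:-/}
        du -sh -- "${dir%/}"/*
    )
}
```

Calling dir_totals with no argument covers the root directory, where du may print permission errors for directories the current user cannot read.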
Also useful is sorting by size:
If you have dot-directories in the root directory, you can use shopt -s dotglob to include them in the count.
It’s very useful, because it’s simple and you can put whatever path you want in place of /* , e.g. ./ for the current directory or ./* for each item in the current directory.
@c1phr If your sort doesn’t have -h , you need to leave it off from du as well, otherwise the sorting will mix up kilo/mega/gigabytes. du -s /* | sort -nr .
I often need to find the biggest directories, so to get a sorted list containing the 20 biggest dirs I do this:
du -m /some/path | sort -nr | head -n 20
In this case the sizes will be reported in megabytes.
@Xedecima the problem with using h is the sort doesn’t know how to handle different sizes. For example 268K is sorted higher than 255M, and both are sorted higher than 2.7G
The -h (human readable) argument on the ‘sort’ command should properly read these values. Just like du’s -h flag exports them. Depending on what you’re running I’m guessing.
I like to use Ncdu for that. You can use the cursor to navigate and drill down through the directory structure, and it works really well.
The existing answers are very helpful, maybe some beginner (like me) will find this helpful as well.
- Very basic loop, but for me this was a good start for some other size related operations:
for each in $(ls) ; do du -hs "$each" ; done
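Parsing the output of ls breaks on names containing spaces, so a glob is usually the safer way to write the same loop. A minimal variant of the loop above; the function name each_size is hypothetical:

```shell
# each_size is a hypothetical helper name used for illustration.
# Same idea as the loop above, but iterating over a glob instead of $(ls),
# so names with spaces or newlines stay intact.
each_size() {
    local entry
    for entry in "${1:-.}"/*; do
        # Skip the unexpanded pattern when the directory is empty.
        [ -e "$entry" ] || continue
        du -hs -- "$entry"
    done
}
```

Like the original loop, this skips dotfiles unless dotglob is enabled.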
The following du invocation should work on BSD systems:
Right portable option combination on BSD/*NIX is du -sk /* . I hate the -k stuff soooo much. Linux’ -h totally rocks.
This isn’t easy. The du command either shows files and folders (default) or just the sizes of all items which you specify on the command line (option -s ).
To get the largest items (files and folders), sorted, with human readable sizes on Linux:
This will bury you in a ton of small files. You can get rid of them with --threshold (1 MB in my example):
The advantage of this command is that it includes hidden dot folders (folders which start with . ).
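A pipeline along the lines described above might look like the following. This is a sketch, assuming GNU du (where --threshold is available in reasonably recent coreutils releases) and GNU sort; the function name largest_items and its defaults are hypothetical:

```shell
# largest_items is a hypothetical helper name used for illustration.
# Lists the largest files and folders below a directory, biggest first.
#   $1 directory (default: .)
#   $2 minimum size to show (default: 1M)
#   $3 number of lines to print (default: 20)
largest_items() {
    # -a includes files as well as folders; --threshold hides smaller entries.
    du -ah --threshold="${2:-1M}" -- "${1:-.}" | sort -rh | head -n "${3:-20}"
}
```

For instance, largest_items /var 100M 10 would show at most ten entries of 100 MB or more below /var.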
If you really just want the folders, you need to use find but this can be very, very slow since du will have to scan many folders several times:
find . -type d -print0 | sort -z | xargs --null -I '{}' du -sh '{}' | sort -h
@podarok It’s available on OpenSUSE 13.2 Linux. Try to find a more recent version of your distribution or compile a more recent version of the package yourself.
Caching might have been a bad term. I was thinking of something like what is done in this post superuser.com/a/597173/121352 where we scan the disk’s contents once into a mapping and then continue using data from that mapping rather than hitting the disk again.
You might also want to check out xdiskusage. It will give you the same information, but shown graphically, plus it allows you to drill down (very useful). There are other similar utilities for KDE and even Windows.
Be aware that you can’t compare directories with du on different systems/machines without making sure that both use the same filesystem block size. This matters if, for example, you rsync some files from a Linux machine to a NAS and want to compare the synced directory yourself. You might get different results from du because of the different block sizes.
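One way to reduce the influence of the block size is to compare apparent sizes instead of allocated blocks. A sketch assuming GNU du, which offers --apparent-size; both function names are hypothetical:

```shell
# Both function names are hypothetical helpers for illustration; GNU du assumed.

# Bytes allocated on disk (depends on the filesystem's block size).
allocated_bytes() {
    du -s --block-size=1 -- "$1" | cut -f1
}

# Apparent size in bytes, i.e. the length of the data rather than the
# space reserved for it.
apparent_bytes() {
    du -s --apparent-size --block-size=1 -- "$1" | cut -f1
}
```

Running both functions on the same path on each machine shows whether a mismatch comes from the data itself or only from how the filesystem allocates space.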
You could use ls in conjunction with awk :
The output of ls is piped to awk , which then processes the data. The standard delimiter is a space. The sum variable tot is initialised to zero; the following statement is executed for each line output by ls . It merely increments tot by the size. $5 stands for the fifth column (output by ls ). At the end we divide by (1024*1024) to get the sum in megabytes.
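Based on that description, the command can be sketched as follows. The variable name tot comes from the text, while the function name total_mb and the exact output format are assumptions:

```shell
# total_mb is a hypothetical helper name; the output format is an assumption.
# Sums the fifth column of "ls -l" (size in bytes) and prints megabytes.
# Note: a subdirectory contributes only its own entry size, not its contents.
total_mb() {
    ls -l -- "${1:-.}" |
        awk 'BEGIN { tot = 0 } { tot += $5 } END { printf "%.3f MB\n", tot / (1024 * 1024) }'
}
```

The "total" line that ls -l prints first has no fifth column, so it adds nothing to the sum.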
If you would convert this into a script or function (.bashrc) you can also use it to get the size of certain subsets of directories, according to filetypes.
If you want system-wide information, kdirstat may come in handy!