File size in the Linux command line

Total size of the contents of all the files in a directory [closed]


When I use ls or du, I get the amount of disk space each file is occupying. I need the sum total of all the data in files and subdirectories I would get if I opened each file and counted the bytes. Bonus points if I can get this without opening each file and counting.

ls actually shows the number of bytes in each file, not the amount of disk space. Is this sufficient for your needs?

Note that du can't be used to answer this question. It shows the amount of disk space the directory occupies (the files' data plus the size of auxiliary file system metadata). The du output can even be smaller than the total size of all files; this may happen if the file system stores data compressed on disk or if hard links are used. Correct answers are based on ls and find. See the answers by Nelson and by bytepan here, or this answer: unix.stackexchange.com/a/471061/152606


If you want the 'apparent size' (that is, the number of bytes in each file), not the size taken up by files on the disk, use the -b or --bytes option (if you have a Linux system with GNU coreutils):
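The command this answer introduces seems to have been lost from the page. Going by the comments that quote it (du -sb), a minimal sketch might look like the following. It assumes GNU du, and the function name apparent_total is only an illustration. As other answers on this page point out, the figure also includes the size of the directory entries themselves.

```shell
# Apparent size, in bytes, of everything under a directory (GNU du).
# -s prints one summary line; -b is short for --apparent-size --block-size=1.
apparent_total() {
    du -sb -- "${1:-.}" | cut -f1
}

apparent_total .
```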

Is there an easy way to show the “apparent size” in human-readable format? When using du -shb (as suggested by this answer), the -b setting seems to override the -h setting.

@Arkady I have tried your solution on CentOS and Ubuntu, and there is a small error. You want "du -sbh". The "-h" flag must come last.

Optionally, add the h option for more user-friendly output:
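The command for this step is also missing from the page. Based on the comment above, it was presumably du -sbh. The sketch below shows that form next to an equivalent long-option spelling that does not depend on option order. It assumes GNU du, and human_apparent is a hypothetical helper name.

```shell
# Human-readable apparent size. -h has to come after -b, because -b also
# sets the block size to 1 byte and the last block-size option wins.
du -sbh .

# Same result with the long option, regardless of where -h appears.
human_apparent() {
    du -sh --apparent-size -- "${1:-.}"
}

human_apparent .
```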

@lynxoid: You can install the GNU version with brew: brew install coreutils. It will be available as the command gdu.

Does not work. ls -> file.gz hardlink-to-file.gz; stat -c %s file.gz -> 9657212; stat -c %s hardlink-to-file.gz -> 9657212; du -sb -> 9661308. It's definitely not the total size of the content but the size the directory takes up on the disk.

This is simple and does not work. It prints the space that directory takes up on the disk, not the total size of the content that could be calculated by opening each file and counting the bytes.

ls -lAR | grep -v '^d' | awk '{total += $5} END {print "Total:", total}'

grep -v '^d' will exclude the directories.

Isolated to a specific file type (in this case, PNG) and expressed in MB for more readability: ls -lR | grep '.png$' | awk '{total += $5} END {print "Total:", total/1024/1024, "MB"}'

It is a correct answer. Unlike du this solution really counts the total size of all the data in files as if they were opened one by one and their bytes were counted. But yes, adding the -A parameter is required to count hidden files as well.
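To make the point about -A concrete, here is a small sketch that sums the size column of ls -l output with and without hidden files. The helper name ls_total and the demo file names are made up for illustration, and parsing ls output is fragile in general (unusual file names or owner names can shift the columns).

```shell
# Sum field 5 (size in bytes) of ls -lR output, skipping directory lines.
# Usage: ls_total DIR [extra ls flags...]
ls_total() {
    dir=$1
    shift
    ls -lR "$@" -- "$dir" | grep -v '^d' | awk '{ total += $5 } END { print total + 0 }'
}

demo=$(mktemp -d)
head -c 100 /dev/zero > "$demo/visible"
head -c 50  /dev/zero > "$demo/.hidden"

ls_total "$demo"       # counts only "visible"
ls_total "$demo" -A    # counts ".hidden" as well
```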


stat's "%s" format gives you the actual number of bytes in a file.

 find . -type f | xargs stat --format=%s | awk '{s+=$1} END {print s}'

Preferably use "find . -type f -print0 | xargs -0 ..." to avoid problems with certain file names (containing spaces, quotes, etc.).
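A short sketch of why the NUL-delimited form matters, using a throwaway directory and a file whose name contains a quote and spaces. The helper name safe_total is hypothetical; GNU find, xargs and stat are assumed.

```shell
# Total bytes of regular files, passing names NUL-delimited so that
# spaces, quotes and newlines in file names cannot confuse xargs.
safe_total() {
    find "${1:-.}" -type f -print0 | xargs -0 -r stat --format=%s |
        awk '{ s += $1 } END { print s + 0 }'
}

demo=$(mktemp -d)
printf 'abc' > "$demo/it's a file"

# Newline-delimited names: xargs tries to interpret the quote and gives up.
find "$demo" -type f | xargs stat --format=%s 2>&1 || true

# NUL-delimited names are handled correctly.
safe_total "$demo"
```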

(Note that stat works with sparse files, reporting the large nominal size of the file and not the smaller blocks used on disk like du reports.)
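The sparse-file behaviour can be checked with a quick experiment. This is only a sketch, assuming GNU truncate and stat; how many blocks actually get allocated depends on the filesystem, so no particular value is claimed for that part.

```shell
demo=$(mktemp -d)

# Create a 1 MiB file that is one big hole: no data is written.
truncate -s 1M "$demo/sparse"

# %s: apparent size in bytes, %b: blocks allocated, %B: bytes per block.
stat --format='apparent=%s bytes, allocated=%b blocks of %B bytes' "$demo/sparse"
```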

Unlike many other answers here which erroneously use the du utility, this answer is correct. It is very similar to the answer here: unix.stackexchange.com/a/471061/152606. But I would use ! -type d instead of -type f to count symlinks as well (the size of the symlink itself, usually a few bytes, not the size of the file it points to).
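The difference between -type f and ! -type d can be seen on a tiny example. The function names below are invented for this sketch, and GNU find, xargs and stat are assumed. On Linux a symlink's own size is the length of the target path it stores.

```shell
# Regular files only.
total_files() {
    find "$1" -type f -print0 | xargs -0 -r stat --format=%s |
        awk '{ s += $1 } END { print s + 0 }'
}

# Everything except directories; stat does not follow symlinks by default,
# so each symlink contributes its own size, not that of its target.
total_non_dirs() {
    find "$1" ! -type d -print0 | xargs -0 -r stat --format=%s |
        awk '{ s += $1 } END { print s + 0 }'
}

demo=$(mktemp -d)
printf '0123456789' > "$demo/data"   # 10 bytes of content
ln -s data "$demo/link"              # symlink storing the 4-character path "data"

total_files "$demo"
total_non_dirs "$demo"
```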

If you use BusyBox's "du" on an embedded system, you cannot get an exact byte count from it; it only reports sizes in kilobytes.

BusyBox v1.4.1 (2007-11-30 20:37:49 EST) multi-call binary

Usage: du [-aHLdclsxhmk] [FILE]...

Summarize disk space used for each FILE and/or directory.
Disk space is printed in units of 1024 bytes.

Options:
    -a      Show sizes of files in addition to directories
    -H      Follow symbolic links that are FILE command line args
    -L      Follow all symbolic links encountered
    -d N    Limit output to directories (and files with -a) of depth < N
    -c      Output a grand total
    -l      Count sizes many times if hard linked
    -s      Display only a total for each argument
    -x      Skip directories on different filesystems
    -h      Print sizes in human readable format (e.g., 1K 243M 2G )
    -m      Print sizes in megabytes
    -k      Print sizes in kilobytes(default)

c:> dir /s c:\directory\you\want

and the penultimate line will tell you how many bytes the files take up.

I know this reads all files and directories, but works faster in some situations.

When a folder is created, many Linux filesystems allocate 4096 bytes to store some metadata about the directory itself. This space is increased by a multiple of 4096 bytes as the directory grows.

The du command (with or without the -b option) takes this space into account, as you can see by typing:

you will have a result of 4096 bytes for an empty dir. So, if you put 2 files of 10000 bytes inside the dir, the total amount given by du -sb would be 24096 bytes.

If you read the question carefully, this is not what was asked. The questioner asked for:

the sum total of all the data in files and subdirectories I would get if I opened each file and counted the bytes

that in the example above should be 20000 bytes, not 24096.
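The arithmetic above can be reproduced in a scratch directory. This is a sketch assuming GNU find and du; the exact du figure depends on how the filesystem sizes directories (4096 bytes on many Linux filesystems, as described above), so only the file-content total is fixed.

```shell
demo=$(mktemp -d)
head -c 10000 /dev/zero > "$demo/a"
head -c 10000 /dev/zero > "$demo/b"

# Bytes of file content only: 2 x 10000.
content_bytes=$(find "$demo" -type f -printf '%s\n' | awk '{ s += $1 } END { print s + 0 }')

# du -sb adds the size of the directory itself on top of the file contents.
du_bytes=$(du -sb -- "$demo" | cut -f1)

echo "content: $content_bytes  du -sb: $du_bytes"
```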

So the correct answer, IMHO, could be a blend of Nelson's answer and hlovdal's suggestion to handle filenames containing spaces:

find . -type f -print0 | xargs -0 stat --format=%s | awk '{s+=$1} END {print s}'

There are at least three ways to get the "sum total of all the data in files and subdirectories" in bytes that work in both Linux/Unix and Git Bash for Windows, listed below in order from fastest to slowest on average. For your reference, they were executed at the root of a fairly deep file system ( docroot in a Magento 2 Enterprise installation comprising 71,158 files in 30,027 directories).

$ time find -type f -printf '%s\n' | awk '{ total += $1 }; END { print total" bytes" }'
748660546 bytes

real    0m0.221s
user    0m0.068s
sys     0m0.160s

$ time echo `find -type f -print0 | xargs -0 stat --format=%s | awk '{total+=$1} END {print total}'` bytes
748660546 bytes

real    0m0.256s
user    0m0.164s
sys     0m0.196s

$ time echo `find -type f -exec du -bc {} + | grep -P "\ttotal$" | cut -f1 | awk '{ total += $1 }; END { print total }'` bytes
748660546 bytes

real    0m0.553s
user    0m0.308s
sys     0m0.416s

These two also work, but they rely on commands that don't exist on Git Bash for Windows:

$ time echo `find -type f -printf "%s + " | dc -e0 -f- -ep` bytes
748660546 bytes

real    0m0.233s
user    0m0.116s
sys     0m0.176s

$ time echo `find -type f -printf '%s\n' | paste -sd+ | bc` bytes
748660546 bytes

real    0m0.242s
user    0m0.104s
sys     0m0.152s

If you only want the total for the current directory, then add -maxdepth 1 to find.
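As a sketch of the -maxdepth 1 variant, the function below sums only the files directly inside a directory. The name total_top_level and the demo files are illustrative, and GNU find is assumed for -printf.

```shell
# Sum the sizes of regular files directly inside DIR, ignoring subdirectories.
# -maxdepth is placed before -type because find expects it ahead of tests.
total_top_level() {
    find "${1:-.}" -maxdepth 1 -type f -printf '%s\n' |
        awk '{ total += $1 } END { print total + 0 }'
}

demo=$(mktemp -d)
mkdir "$demo/sub"
printf 'hello'   > "$demo/top"        # 5 bytes, counted
printf 'ignored' > "$demo/sub/deep"   # 7 bytes, below the depth limit

total_top_level "$demo"
```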


Note that some of the suggested solutions don't return accurate results, so I would stick with the solutions above instead.

$ du -sbh
832M .
$ ls -lR | grep -v '^d' | awk '{total += $5} END {print "Total:", total}'
Total: 583772525
$ find . -type f | xargs stat --format=%s | awk '{s+=$1} END {print s}'
xargs: unmatched single quote; by default quotes are special to xargs unless you use the -0 option
4390471
$ ls -l| grep -v '^d'| awk '{total = total + $5} END {print "Total", total}'
Total 968133

Regarding Git Bash for Windows: in the case of Cygwin, dc is part of the bc package, so you need to install bc to get dc.

du is handy, but find is useful if you want to calculate the size of only some files (for example, filtering by extension). Also note that find itself can print the size of each file in bytes. To calculate a total size, we can pipe its output to the dc command in the following manner:

find . -type f -printf "%s + " | dc -e0 -f- -ep 

Here find generates a sequence of commands for dc like 123 + 456 + 11 + . The complete program, however, should look like 0 123 + 456 + 11 + p (remember that dc uses postfix notation).

So, to get the completed program we need to put 0 on the stack before executing the sequence from stdin, and print the top number after executing (the p command at the end). We achieve it via dc options:

  1. -e0 is shorthand for -e '0', which puts 0 on the stack,
  2. -f- reads and executes commands from stdin (here, those generated by find),
  3. -ep prints the result ( -e 'p' ).

To print the size in MiB like 284.06 MiB we can use -e '2 k 1024 / 1024 / n [ MiB] p' in point 3 instead (most spaces are optional).
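For the "some files only" case mentioned at the start of this answer, here is a sketch that filters by extension and prints MiB. It swaps dc for awk, which is not what the answer uses but tends to be available where dc is not. The function name png_total_mib and the *.png pattern are just examples; GNU find is assumed for -printf.

```shell
# Total size of *.png files under DIR, printed in MiB with two decimals.
# LC_ALL=C keeps the decimal separator a dot regardless of locale.
png_total_mib() {
    find "${1:-.}" -type f -name '*.png' -printf '%s\n' |
        LC_ALL=C awk '{ total += $1 } END { printf "%.2f MiB\n", total / 1048576 }'
}

png_total_mib .
```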


How to get file and directory size in Linux

In Linux, you can use command-line tools like ls, stat, and du to display information about files and directories, such as their sizes. While ls and stat provide general information, du is a specialized tool for displaying size-related details.

Use du to identify large files or folders on your system, which can help you free up storage by deleting unnecessary items. However, du isn't intended for viewing overall disk usage information.

Steps to check file and folder sizes in Linux:

$ du Documents/random.txt
16      Documents/random.txt

$ du -h Documents/random.txt
16K     Documents/random.txt

Size will automatically be displayed in K (Kilobytes), M (Megabytes), G (Gigabytes) or T (Terabytes) unit.

$ du -h Documents/
21M     Documents/Finance
4.0K    Documents/Secret/Empty
40K     Documents/Secret
21M     Documents/

$ du -h --max-depth=1 Documents/
21M     Documents/Finance
40K     Documents/Secret
21M     Documents/

$ du -hc Documents/
21M     Documents/Finance
4.0K    Documents/Secret/Empty
40K     Documents/Secret
21M     Documents/
21M     total

$ du -hs Documents/
21M     Documents/

$ sudo du -hs /var/cache/
[sudo] password for user:
117M    /var/cache/

$ sudo du -hs /var/cache/*
6.2M    /var/cache/apparmor
16M     /var/cache/app-info
75M     /var/cache/apt
6.1M    /var/cache/cracklib
32K     /var/cache/cups
5.2M    /var/cache/debconf
40K     /var/cache/dictionaries-common
2.7M    /var/cache/fontconfig
2.1M    /var/cache/fwupd
0       /var/cache/fwupdmgr
60K     /var/cache/ldconfig
2.1M    /var/cache/man
8.0K    /var/cache/PackageKit
8.0K    /var/cache/private
4.0K    /var/cache/realmd
2.2M    /var/cache/snapd
$ du --help
Usage: du [OPTION]... [FILE]...
  or:  du [OPTION]... --files0-from=F
Summarize disk usage of the set of FILEs, recursively for directories.

Mandatory arguments to long options are mandatory for short options too.
  -0, --null            end each output line with NUL, not newline
  -a, --all             write counts for all files, not just directories
      --apparent-size   print apparent sizes, rather than disk usage; although
                          the apparent size is usually smaller, it may be
                          larger due to holes in ('sparse') files, internal
                          fragmentation, indirect blocks, and the like
  -B, --block-size=SIZE  scale sizes by SIZE before printing them; e.g.,
                           '-BM' prints sizes in units of 1,048,576 bytes;
                           see SIZE format below
  -b, --bytes           equivalent to '--apparent-size --block-size=1'
  -c, --total           produce a grand total
  -D, --dereference-args  dereference only symlinks that are listed on the
                          command line
  -d, --max-depth=N     print the total for a directory (or file, with --all)
                          only if it is N or fewer levels below the command
                          line argument;  --max-depth=0 is the same as
                          --summarize
      --files0-from=F   summarize disk usage of the
                          NUL-terminated file names specified in file F;
                          if F is -, then read names from standard input
  -H                    equivalent to --dereference-args (-D)
  -h, --human-readable  print sizes in human readable format (e.g., 1K 234M 2G)
      --inodes          list inode usage information instead of block usage
  -k                    like --block-size=1K
  -L, --dereference     dereference all symbolic links
  -l, --count-links     count sizes many times if hard linked
  -m                    like --block-size=1M
  -P, --no-dereference  don't follow any symbolic links (this is the default)
  -S, --separate-dirs   for directories do not include size of subdirectories
      --si              like -h, but use powers of 1000 not 1024
  -s, --summarize       display only a total for each argument
  -t, --threshold=SIZE  exclude entries smaller than SIZE if positive,
                          or entries greater than SIZE if negative
      --time            show time of the last modification of any file in the
                          directory, or any of its subdirectories
      --time=WORD       show time as WORD instead of modification time:
                          atime, access, use, ctime or status
      --time-style=STYLE  show times using STYLE, which can be:
                            full-iso, long-iso, iso, or +FORMAT;
                            FORMAT is interpreted like in 'date'
  -X, --exclude-from=FILE  exclude files that match any pattern in FILE
      --exclude=PATTERN    exclude files that match PATTERN
  -x, --one-file-system    skip directories on different file systems
      --help     display this help and exit
      --version  output version information and exit

Display values are in units of the first available SIZE from --block-size,
and the DU_BLOCK_SIZE, BLOCK_SIZE and BLOCKSIZE environment variables.
Otherwise, units default to 1024 bytes (or 512 if POSIXLY_CORRECT is set).

The SIZE argument is an integer and optional unit (example: 10K is 10*1024).
Units are K,M,G,T,P,E,Z,Y (powers of 1024) or KB,MB,... (powers of 1000).
Binary prefixes can be used, too: KiB=K, MiB=M, and so on.

GNU coreutils online help:
Full documentation
or available locally via: info '(coreutils) du invocation'
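Since the article suggests using du to identify large folders, the options above can be combined with sort to rank entries by size. The following is a sketch rather than part of the original steps; largest_entries is a hypothetical helper, and GNU du is assumed for --max-depth.

```shell
# List the largest entries one level below DIR, biggest first.
# -k forces 1 KiB units so that sort -rn can compare plain numbers.
# The first line is DIR itself, since its total includes everything below it.
largest_entries() {
    du -k --max-depth=1 -- "${1:-.}" | sort -rn | head -n "${2:-10}"
}

largest_entries . 5
```

Swapping -k for -h and sort -rn for sort -rh gives human-readable sizes on systems where sort supports -h.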

Author: Mohd Shakir Zakaria
Mohd Shakir Zakaria, a proficient cloud architect, is deeply rooted in development, entrepreneurship, and open-source advocacy. As the founder of Simplified Guide, he combines these passions to help others navigate the intricate world of computing. His expertise simplifies complex tech concepts, making them accessible to everyone.
