Calculate size file linux

Contents

How can I get the size of a file in a bash script?
13 Answers
Calculate size of files in shell
15 Answers

How can I get the size of a file in a bash script?

How can I get the size of a file in a bash script? How do I assign this to a bash variable so I can use it later?

In case this is a (very narrow) XY problem: if all you need is to test whether the file has a nonzero size, bash has the conditional expression -s , so you can simply write if [ -s file ]; then echo "file has nonzero size"; fi

13 Answers

Your best bet if on a GNU system:

#!/bin/bash
FILENAME=/home/heiko/dummy/packages.txt
FILESIZE=$(stat -c%s "$FILENAME")
echo "Size of $FILENAME = $FILESIZE bytes."

NOTE: see @chbrown’s answer for how to use stat in terminal on Mac OS X.

@haunted85 stat is the most straightforward way, assuming you’re using Linux or Cygwin ( stat isn’t standard). wc -c as suggested by Eugéne is portable.

@woohoo Your prompt overwrites the output. man stat says that --printf omits the trailing newline. Use --format or -c to see the output. Gain more insight by comparing stat --printf="%s" file.any | xxd - to stat -c "%s" file.any | xxd -

file_size_kb=`du -k "$filename" | cut -f1`

The problem with using stat is that it is a GNU (Linux) extension. du -k and cut -f1 are specified by POSIX and are therefore portable to any Unix system.

Solaris, for example, ships with bash but not with stat . So this is not entirely hypothetical.

ls has a similar problem in that the exact format of the output is not specified, so parsing its output cannot be done portably. du -h is also a GNU extension.

Stick to portable constructs where possible, and you will make somebody’s life easier in the future. Maybe your own.

du doesn’t give the size of the file, it gives an indication of how much space the file uses, which is subtly different (usually the size reported by du is the size of the file rounded up to the nearest number of blocks, where a block is typically 512B or 1kB or 4kB).

@fralau: The OP wants to "assign this to a bash variable so they can use it later", so it is much more likely they want an actual numeric value, not a human-readable approximation. Also, -h is a GNU extension; it is not standard.

Using du with the --apparent-size flag will return a more precise size (as stated in the man page: print apparent sizes, rather than disk usage; although the apparent size is usually smaller, it may be larger due to holes in ('sparse') files, internal fragmentation, indirect blocks, and the like).
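To make the difference between file size and disk usage concrete, here is a minimal sketch that creates a file consisting only of a hole and compares the two numbers. The temporary file and the 1 MiB size are arbitrary choices for illustration, and whether the file is actually stored sparsely depends on the filesystem.

```shell
#!/bin/sh
# Create a 1 MiB file that contains only a hole (no data is written).
sparse_file=$(mktemp)
dd if=/dev/zero of="$sparse_file" bs=1 count=0 seek=1048576 2>/dev/null

# Apparent size: the number of bytes that can be read from the file.
apparent_bytes=$(($(wc -c < "$sparse_file") + 0))

# Disk usage in 1024-byte units, as reported by du -k.
used_kb=$(du -k "$sparse_file" | cut -f1)

echo "apparent size: $apparent_bytes bytes"
echo "disk usage:    $used_kb KiB"

rm -f "$sparse_file"
```

On a filesystem that supports sparse files, the disk usage will typically be far below 1024 KiB even though the apparent size is exactly 1 MiB.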

You could also use the «word count» command ( wc ):

The problem with wc is that it’ll add the filename and indent the output. For example:

$ wc -c somefile.txt
1160 somefile.txt

If you would like to avoid chaining a full interpreted language or stream editor just to get a file size count, just redirect the input from the file so that wc never sees the filename:

This last form can be used with command substitution to easily grab the value you were seeking as a shell variable, as mentioned by Gilles below.
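A minimal sketch of the redirected form and of capturing its output in a variable. The temporary file is only a stand-in for your own file, and the variable names are chosen for illustration.

```shell
#!/bin/sh
# Stand-in file with a known size (12 bytes, no trailing newline).
filename=$(mktemp)
printf 'hello world!' > "$filename"

# With a file argument, wc prints the count followed by the file name.
wc -c "$filename"

# With redirection, wc reads standard input and never sees the name.
wc -c < "$filename"

# Command substitution stores the count for later use.
size=$(wc -c < "$filename")
echo "Size of $filename = $size bytes."

rm -f "$filename"
```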

Just one more point: I just tested it and wc -c < file seems to be very fast, at least on OS X. I'm guessing that wc has the brains to try to stat the file if only -c is specified.

@EdwardFalk: GNU wc -c uses fstat , but then seeks to second-last block of the file and reads the last up-to st_blksize bytes. Apparently this is because files in Linux’s /proc and /sys for example have stat sizes that are only approximate, and wc wants to report the actual size, not the stat-reported size. I guess it would be weird for wc -c to report a different size than wc , but it’s not ideal to read data from the file if it’s a normal disk file, and it’s not in memory. Or worse, near-line tape storage.

It seems like printf still sees the indentation, e.g. printf "Size: $size" -> size: <4 spaces>54339 . On the other hand echo ignores the whitespace. Any way to make it consistent?

@keithpjolley: By calling fstat . Try running strace wc -c

BSD’s (macOS’s) stat has a different format argument flag, and different field specifiers. From man stat(1) :

-f format : Display information using the specified format. See the FORMATS section for a description of valid formats.
... (the FORMATS section) ...
z : The size of file in bytes.

NOTE: see @b01’s answer for how to use the stat command on GNU/Linux systems. 🙂
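Since the GNU and BSD variants of stat take different flags, one way to cope is to try each in turn and fall back to wc. The following is only a sketch under that assumption; file_size is a hypothetical helper name, and -L is added so that all three branches follow symlinks the same way.

```shell
#!/bin/sh
# file_size FILE: print the size of FILE in bytes.
# Tries GNU stat (-c %s), then BSD stat (-f %z), then falls back to wc -c.
file_size() {
    stat -L -c %s -- "$1" 2>/dev/null ||
    stat -L -f %z -- "$1" 2>/dev/null ||
    echo "$(($(wc -c < "$1") + 0))"
}

# Demo on a stand-in file holding 5 bytes.
sample=$(mktemp)
printf '12345' > "$sample"
sample_size=$(file_size "$sample")
echo "Size of $sample = $sample_size bytes."
rm -f "$sample"
```

The stat branches only need search permission on the directory, while the wc fallback needs read permission on the file itself.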

Depends what you mean by size.

Reading the file with wc -c , as in size=$(wc -c < "$file") , will give you the number of bytes that can be read from the file. In other words, it’s the size of the contents of the file. It will, however, read the contents of the file (except, in most wc implementations, when the file is a regular file or a symlink to a regular file, as an optimisation). That may have side effects. For instance, for a named pipe, what has been read can no longer be read again, and for things like /dev/zero or /dev/random , which are of infinite size, it’s going to take a while. That also means you need read permission to the file, and the last access timestamp of the file may be updated.

That’s standard and portable, however note that some wc implementations may include leading blanks in that output. One way to get rid of them is to use:

or to avoid an error about an empty arithmetic expression in dash or yash when wc produces no output (like when the file can’t be opened):
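Both variants rely on arithmetic expansion to normalise whatever wc prints. A sketch, using a temporary file as a stand-in and a deliberately nonexistent path to show the empty-output case (both names are made up for the demo):

```shell
#!/bin/sh
file=$(mktemp)
printf 'abc\n' > "$file"

# Arithmetic expansion discards any leading blanks in wc's output.
size=$(($(wc -c < "$file")))

# Adding "+ 0" keeps the expression valid when wc produces no output,
# here because the file cannot be opened (the error message is discarded).
safe_size=$(($( { wc -c < "$file.does-not-exist"; } 2>/dev/null ) + 0))

echo "size=$size safe_size=$safe_size"
rm -f "$file"
```

Note that the second form reports 0 for an unreadable file, so check for the error separately if you need to tell that apart from an empty file.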

ksh93 has wc builtin (provided you enable it, you can also invoke it as command /opt/ast/bin/wc ) which makes it the most efficient for regular files in that shell.

Various systems have a command called stat that’s an interface to the stat() or lstat() system calls.

Those report information found in the inode, including the st_size attribute. For regular files, that’s the size of the content: how much data could be read from it in the absence of error (this is what most wc -c implementations use in their optimisation). For symlinks, that’s the size in bytes of the target path. For named pipes, depending on the system, it’s either 0 or the number of bytes currently in the pipe buffer. The same goes for block devices where, depending on the system, you get 0 or the size in bytes of the underlying storage.

You don’t need read permission to the file to get that information, only search permission to the directory it is linked to.

By chronological¹ order, there is:

zsh stat builtin (also available as zstat):

stat -L +size -- $file # st_size of file
stat +size -- $file # after symlink resolution

GNU stat:

stat -c %s -- "$file" # st_size of file
stat -Lc %s -- "$file" # after symlink resolution

BSD stat:

stat -f %z -- "$file" # st_size of file
stat -Lf %z -- "$file" # after symlink resolution
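To see the difference between the two variants of each command, the sketch below builds a symlink in a scratch directory and prints both sizes. It assumes GNU stat (the -c option); the directory and file names are invented for the demo.

```shell
#!/bin/sh
# Assumes GNU stat; adapt the flags for the zsh builtin or BSD stat.
workdir=$(mktemp -d)
printf 'some file content\n' > "$workdir/target.txt"
ln -s target.txt "$workdir/link"

# lstat(): size of the symlink itself, i.e. the length of "target.txt".
link_size=$(stat -c %s -- "$workdir/link")

# stat(): size of the file the symlink points to.
target_size=$(stat -Lc %s -- "$workdir/link")

echo "symlink: $link_size bytes, target: $target_size bytes"
rm -rf "$workdir"
```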

Or you can use the stat() / lstat() function of some scripting language like perl :

perl -le 'print((lstat shift)[7])' -- "$file"

AIX also has an istat command which will dump all the stat() (not lstat() , so won’t work on symlinks) information and which you could post-process with, for example:

LC_ALL=C istat "$file" | awk 'NR == 4 '

(size after symlink resolution)

Long before GNU introduced its stat command, the same could be achieved with GNU find command with its -printf predicate (already in 1991):

find -- "$file" -prune -printf '%s\n' # st_size of file
find -L -- "$file" -prune -printf '%s\n' # after symlink resolution

One issue though is that this doesn’t work if $file starts with - or is a find predicate (like ! , ( , ...).

Since version 4.9, that can be worked around by passing the file path through its stdin rather than as an argument with:

printf '%s\0' "$file" | find -files0-from - -prune -printf '%s\n'

The standard command to get the stat() / lstat() information is ls .

In the output of ls -dn , the size is the 5th field; add -L for the same after symlink resolution. That doesn’t work for device files though, where the 5th field is the device major number instead of the size.
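The ls approach can be sketched as follows. This is an illustration rather than a canonical recipe: the temporary file is a stand-in, and the parsing assumes the usual long-listing layout in which the size is the fifth whitespace-separated field.

```shell
#!/bin/sh
file=$(mktemp)
printf '1234567' > "$file"

# -d lists the file itself even if it is a directory; -n prints numeric
# user and group IDs, so owner and group are each a single field.
size=$(LC_ALL=C ls -dn -- "$file" | awk '{print $5; exit}')

echo "$size"
rm -f "$file"
```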

For block devices, systems where stat() returns 0 for st_size , usually have other APIs to report the size of the block device. For instance, Linux has the BLKGETSIZE64 ioctl() , and most Linux distributions now ship with a blockdev command that can make use of it:

blockdev --getsize64 -- "$device_file"

However, you need read permission to the device file for that. It’s usually possible to derive the size by other means. For instance (still on Linux):

lsblk -bdno size -- "$device_file"

Should work except for empty devices.

An approach that works for all seekable files (so includes regular files, most block devices and some character devices) is to open the file and seek to the end:

perl -le 'seek STDIN, 0, 2 or die "seek: $!"; print tell STDIN' < "$file"

For named pipes, we've seen that some systems (AIX, Solaris, HP/UX at least) make the amount of data in the pipe buffer available in stat() 's st_size . Some (like Linux or FreeBSD) don't.

On Linux at least, you can use the FIONREAD ioctl() after having opened the pipe (in read+write mode to avoid it hanging):

fuser -s -- "$fifo_file" && perl -le 'require "sys/ioctl.ph"; ioctl(STDIN, &FIONREAD, $n) or die$!; print unpack "L", $n' <> "$fifo_file"

However note that while it doesn't read the content of the pipe, the mere opening of the named pipe here can still have side effects. We're using fuser to check first that some process already has the pipe open to alleviate that but that's not foolproof as fuser may not be able to check all processes.

Now, so far we've only been considering the size of the primary data associated with the files. That doesn't take into account the size of the metadata and all the supporting infrastructure needed to store that file.

Another inode attribute returned by stat() is st_blocks . That's the number of 512 byte (1024 on HP/UX) blocks that is used to store the file's data (and sometimes some of its metadata like the extended attributes on ext4 filesystems on Linux). That doesn't include the inode itself, or the entries in the directories the file is linked to.

Size and disk usage are not necessarily tightly related: compression, sparseness, (sometimes) some metadata, and extra infrastructure such as indirect blocks in some filesystems all influence the latter.

That's typically what du uses to report disk usage. Most of the commands listed above will be able to get you that information.

POSIXLY_CORRECT=1 ls -sd -- "$file" | awk '{print $1; exit}'
POSIXLY_CORRECT=1 du -s -- "$file" (not for directories, where that would include the disk usage of the files within)
GNU find: find -- "$file" -printf '%b\n'
zsh: zstat -L +block -- $file
GNU stat: stat -c %b -- "$file"
BSD stat: stat -f %b -- "$file"
perl -le 'print((lstat shift)[12])' -- "$file"
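As a rough illustration of turning st_blocks into a byte count, the sketch below multiplies the block count by the block unit. It assumes GNU stat, where %b is the number of allocated blocks and %B is the size in bytes of each block counted by %b; the variable names are made up.

```shell
#!/bin/sh
# Assumes GNU stat (%b and %B format specifiers).
file=$(mktemp)
printf 'x' > "$file"

blocks=$(stat -c %b -- "$file")       # st_blocks
block_unit=$(stat -c %B -- "$file")   # bytes per st_blocks unit
disk_usage_bytes=$((blocks * block_unit))
apparent_bytes=$(stat -c %s -- "$file")

echo "content: $apparent_bytes bytes, allocated: $disk_usage_bytes bytes"
rm -f "$file"
```

The allocated figure depends on the filesystem and may even be 0 right after the write if allocation is delayed or the data is stored inline.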

¹ Strictly speaking, early versions of UNIX in the 70s, from v1 to v4 had a stat command. It was just dumping information from the inode and didn't take options. It apparently disappeared in v5 (1974) presumably because it was redundant with ls -l .


Calculate size of files in shell

I'm trying to calculate the total size in bytes of all files (in a directory tree) matching a filename pattern just using the shell. This is what I have so far:

find -name *.undo -exec stat -c%s {} \; | awk '{total += $1} END {print total}'

Is there an easier way to do this? I feel like there should be a simple du or find switch that does this for me but I can't find one. To be clear I want to total files matching a pattern anywhere under a directory tree which means

15 Answers

find . -name "*.undo" -ls | awk '{total += $7} END {print total}'

On my system the size of the file is the seventh field in the find -ls output. If your find … -ls output is different, adjust.

This version should be efficient: it relies on the file size already recorded in the file's metadata and on the built-in -ls action of find , so it avoids creating a process per file and never reads file contents.

I would add "-type f" to the find command to prevent an incorrect total if there are directories matching the "*.undo" glob.

Note that if you need several patterns to match, you will have to wrap the whole expression in escaped parentheses, otherwise the -ls will apply only to the last pattern. For instance, if you want to match all jpeg and png files (trusting filenames), you would use find . \( -iname "*.jpg" -o -iname "*.jpeg" -o -iname "*.png" \) -ls | awk '{total += $7} END {print total}' ( -iname is for case-insensitive search; also, note the space between the expression and the escaped parentheses).
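An alternative sketch that avoids depending on the column layout of -ls : GNU find can print the size directly with -printf '%s\n' , and -type f keeps matching directories out of the total. The scratch directory tree below is made up for the demo, and -printf is a GNU extension.

```shell
#!/bin/sh
# Assumes GNU find (for -printf).
tree=$(mktemp -d)
mkdir -p "$tree/sub/dir.undo"          # a directory that also matches the glob
printf 'abc' > "$tree/a.undo"          # 3 bytes
printf 'hello' > "$tree/sub/b.undo"    # 5 bytes
printf 'ignored' > "$tree/sub/c.txt"   # does not match the pattern

# Print one size per matching regular file, then sum the column.
# "total + 0" makes awk print 0 rather than an empty line when nothing matches.
total=$(find "$tree" -type f -name '*.undo' -printf '%s\n' |
        awk '{total += $1} END {print total + 0}')

echo "$total"
rm -rf "$tree"
```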
