Linux script if file size

How can I get the size of a file in a bash script?

How can I get the size of a file in a bash script? How do I assign this to a bash variable so I can use it later?

In case of a (very narrow) XY problem, this is neat: if all you need is to test that the file has a nonzero size, bash has a conditional expression -s, so you may simply test whether a file has nonzero length with if [ -s file ]; then echo "file has nonzero size"; fi

13 Answers

Your best bet if on a GNU system:

#!/bin/bash
FILENAME=/home/heiko/dummy/packages.txt
FILESIZE=$(stat -c%s "$FILENAME")
echo "Size of $FILENAME = $FILESIZE bytes."
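Once the size is in a variable it can be used in a numeric test. A minimal sketch, assuming GNU stat; the scratch file, the variable names and the 1024-byte threshold are made up for illustration:

```shell
#!/bin/bash
# Sketch only: assumes GNU stat (-c %s). File and threshold are illustrative.
file=$(mktemp)
head -c 2048 /dev/zero > "$file"       # create a 2048-byte sample file

filesize=$(stat -c %s -- "$file")      # size in bytes
threshold=1024                         # hypothetical limit in bytes

if [ "$filesize" -gt "$threshold" ]; then
    verdict="larger"
else
    verdict="not larger"
fi
echo "$file is $filesize bytes ($verdict than $threshold)"
rm -f -- "$file"
```

Note that [ ... -gt ... ] compares integers only, which is fine here because stat reports a whole number of bytes.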

NOTE: see @chbrown’s answer for how to use stat in terminal on Mac OS X.

@haunted85 stat is the most straightforward way, assuming you’re using Linux or Cygwin ( stat isn’t standard). wc -c as suggested by Eugéne is portable.

@woohoo Your prompt overwrites the output. man stat says that --printf omits the trailing newline. Use --format or -c to see the output. Gain more insight by comparing stat --printf="%s" file.any | xxd - to stat -c "%s" file.any | xxd -

file_size_kb=`du -k "$filename" | cut -f1` 

The problem with using stat is that it is a GNU (Linux) extension. du -k and cut -f1 are specified by POSIX and are therefore portable to any Unix system.

Solaris, for example, ships with bash but not with stat . So this is not entirely hypothetical.

ls has a similar problem in that the exact format of the output is not specified, so parsing its output cannot be done portably. du -h is also a GNU extension.

Stick to portable constructs where possible, and you will make somebody’s life easier in the future. Maybe your own.

du doesn’t give the size of the file, it gives an indication of how much space the file uses, which is subtly different (usually the size reported by du is the size of the file rounded up to the nearest number of blocks, where a block is typically 512B or 1kB or 4kB).

@fralau: The OP wants to "assign this to a bash variable so they can use it later", so it is much more likely they want an actual numeric value, not a human-readable approximation. Also, -h is a GNU extension; it is not standard.

Using du with the --apparent-size flag will return a more precise size (as stated in the man page: "print apparent sizes, rather than disk usage; although the apparent size is usually smaller, it may be larger due to holes in ('sparse') files, internal fragmentation, indirect blocks, and the like").
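The difference between content size and disk usage is easiest to see with a sparse file. A rough sketch, assuming a filesystem that supports holes and a dd that extends the output file when given count=0 with seek (GNU and BSD dd do); the file name comes from mktemp and is only for the demo:

```shell
#!/bin/bash
# Sketch: compare the size of the contents with the space actually allocated.
sparse=$(mktemp)

# Seek 1 MiB into the output without writing any data, leaving a hole
# (on filesystems that support sparse files).
dd if=/dev/zero of="$sparse" bs=1 count=0 seek=1048576 2>/dev/null

content_bytes=$(( $(wc -c < "$sparse") ))   # bytes that can be read
usage_kb=$(du -k "$sparse" | cut -f1)       # KiB allocated on disk

echo "content: $content_bytes bytes, disk usage: $usage_kb KiB"
rm -f -- "$sparse"
```

On a filesystem with sparse-file support the disk usage figure should be far below 1024 KiB even though the content size is exactly 1 MiB.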

You could also use the «word count» command ( wc ):

The problem with wc is that it’ll add the filename and indent the output. For example:

$ wc -c somefile.txt
    1160 somefile.txt

If you would like to avoid chaining a full interpreted language or stream editor just to get a file size count, just redirect the input from the file so that wc never sees the filename:

wc -c < "$filename"

This last form can be used with command substitution to easily grab the value you were seeking as a shell variable, as mentioned by Gilles below.
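For instance, a minimal sketch (the scratch file and the variable names are placeholders):

```shell
#!/bin/bash
# Sketch: sample is a scratch file holding exactly 11 bytes.
sample=$(mktemp)
printf 'hello world' > "$sample"

# wc reads from stdin here, so it prints only the count
# (some implementations pad it with leading blanks).
size=$(wc -c < "$sample")
echo "Size: $size bytes"
rm -f -- "$sample"
```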

Just one more point: I just tested it and wc -c < file seems to be very fast, at least on OS X. I'm guessing that wc has the brains to try to stat the file if only -c is specified.


@EdwardFalk: GNU wc -c uses fstat, but then seeks to the second-last block of the file and reads the last up-to st_blksize bytes. Apparently this is because files in Linux's /proc and /sys, for example, have stat sizes that are only approximate, and wc wants to report the actual size, not the stat-reported size. I guess it would be weird for wc -c to report a different size than wc, but it's not ideal to read data from the file if it's a normal disk file and it's not in memory. Or worse, near-line tape storage.

It seems like printf still sees the indentation, e.g. printf "Size: $size" -> size: <4 spaces>54339. On the other hand, echo ignores the whitespace. Any way to make it consistent?

@keithpjolley: By calling fstat . Try running strace wc -c

BSD’s (macOS’s) stat has a different format argument flag, and different field specifiers. From man stat(1) :

  • -f format : Display information using the specified format. See the FORMATS section for a description of valid formats.
  • ... the FORMATS section ...
  • z : The size of file in bytes.
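Putting the two stat dialects side by side, here is a sketch of a small wrapper. file_size is a hypothetical helper name, not a standard command; it tries the GNU form first and falls back to the BSD/macOS form:

```shell
#!/bin/bash
# file_size is a hypothetical helper, not part of any system.
# GNU stat understands -c %s; BSD/macOS stat understands -f %z.
file_size() {
    stat -c %s -- "$1" 2>/dev/null || stat -f %z -- "$1"
}

sample=$(mktemp)
printf 'abc' > "$sample"
bytes=$(file_size "$sample")
echo "$sample: $bytes bytes"
rm -f -- "$sample"
```

One caveat with this approach: on GNU systems -f means "file system status", so if the first call fails for another reason (say, the file does not exist) the fallback prints an unrelated error. Checking that the file exists beforehand avoids that.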

NOTE: see @b01’s answer for how to use the stat command on GNU/Linux systems. 🙂

Depends what you mean by size.

wc -c < "$file"

will give you the number of bytes that can be read from the file. In other words, it's the size of the contents of the file. It will however read the contents of the file (except, as an optimisation in most wc implementations, when the file is a regular file or a symlink to a regular file). That may have side effects. For instance, for a named pipe, what has been read can no longer be read again, and for things like /dev/zero or /dev/random, which are of infinite size, it's going to take a while. That also means you need read permission to the file, and the last access timestamp of the file may be updated.

That’s standard and portable, however note that some wc implementations may include leading blanks in that output. One way to get rid of them is to use:

or to avoid an error about an empty arithmetic expression in dash or yash when wc produces no output (like when the file can’t be opened):
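Both variants rely on shell arithmetic expansion. A sketch of what they can look like (the sample file, the missing path and the variable names are illustrative):

```shell
#!/bin/sh
sample=$(mktemp)
printf '12345' > "$sample"
missing="$sample.missing"          # a path that does not exist

# Arithmetic expansion evaluates the number, so leading blanks disappear.
size=$(( $(wc -c < "$sample") ))

# Adding 0 keeps the expression non-empty when wc prints nothing,
# for example because the file could not be opened.
size_or_zero=$(( $(wc -c 2>/dev/null < "$missing") + 0 ))

echo "size=$size size_or_zero=$size_or_zero"
rm -f -- "$sample"
```

With the second form a file that cannot be opened yields 0, so check readability separately if you need to tell that case apart from an empty file.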

ksh93 has wc builtin (provided you enable it, you can also invoke it as command /opt/ast/bin/wc ) which makes it the most efficient for regular files in that shell.

Various systems have a command called stat that’s an interface to the stat() or lstat() system calls.

Those report information found in the inode. One piece of that information is the st_size attribute. For regular files, that's the size of the content, i.e. how much data could be read from it in the absence of error (that's what most wc -c implementations use in their optimisation). For symlinks, that's the size in bytes of the target path. For named pipes, depending on the system, it's either 0 or the number of bytes currently in the pipe buffer. The same goes for block devices where, depending on the system, you get 0 or the size in bytes of the underlying storage.

You don’t need read permission to the file to get that information, only search permission to the directory it is linked to.

By chronological¹ order, there is:

zsh's stat builtin (zstat):

stat -L +size -- $file # st_size of file
stat +size -- $file # after symlink resolution

GNU stat:

stat -c %s -- "$file" # st_size of file
stat -Lc %s -- "$file" # after symlink resolution

BSD stat:

stat -f %z -- "$file" # st_size of file
stat -Lf %z -- "$file" # after symlink resolution

Or you can use the stat() / lstat() function of some scripting language like perl :

perl -le 'print((lstat shift)[7])' -- "$file" 

AIX also has an istat command which will dump all the stat() (not lstat() , so won’t work on symlinks) information and which you could post-process with, for example:

LC_ALL=C istat "$file" | awk 'NR == 4 ' 

(size after symlink resolution)


Long before GNU introduced its stat command, the same could be achieved with the GNU find command and its -printf predicate (already in 1991):

find -- "$file" -prune -printf '%s\n' # st_size of file
find -L -- "$file" -prune -printf '%s\n' # after symlink resolution

One issue though is that it doesn't work if $file starts with - or is a find predicate (like !, (, ...).

Since version 4.9, that can be worked around by passing the file path through its stdin rather than as an argument with:

printf '%s\0' "$file" | find -files0-from - -prune -printf '%s\n' 

The standard command to get the stat() / lstat() information is ls .

and add -L for the same after symlink resolution. That doesn't work for device files though, where the 5th field is the device major number instead of the size.
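One way to write this, as a sketch: the exact options here are an assumption, chosen so that the size lands in the 5th field as described (the scratch file and variable name are placeholders):

```shell
#!/bin/sh
sample=$(mktemp)
printf '1234567' > "$sample"

# -d: list the entry itself rather than a directory's contents.
# -n: long format with numeric owner and group, so the size is the
#     5th blank-separated field for non-device files.
size=$(LC_ALL=C ls -dn -- "$sample" | awk '{print $5; exit}')
echo "$size"
rm -f -- "$sample"
```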

For block devices, systems where stat() returns 0 for st_size , usually have other APIs to report the size of the block device. For instance, Linux has the BLKGETSIZE64 ioctl() , and most Linux distributions now ship with a blockdev command that can make use of it:

blockdev --getsize64 -- "$device_file" 

However, you need read permission to the device file for that. It’s usually possible to derive the size by other means. For instance (still on Linux):

lsblk -bdno size -- "$device_file" 

Should work except for empty devices.

An approach that works for all seekable files (so includes regular files, most block devices and some character devices) is to open the file and seek to the end:

    With zsh (after loading the zsh/system module):

    With perl:

perl -le 'seek STDIN, 0, 2 or die "seek: $!"; print tell STDIN' < "$file"

For named pipes, we've seen that some systems (AIX, Solaris, HP/UX at least) make the amount of data in the pipe buffer available in stat() 's st_size . Some (like Linux or FreeBSD) don't.

On Linux at least, you can use the FIONREAD ioctl() after having opened the pipe (in read+write mode to avoid it hanging):

fuser -s -- "$fifo_file" && perl -le 'require "sys/ioctl.ph"; ioctl(STDIN, &FIONREAD, $n) or die$!; print unpack "L", $n' <> "$fifo_file" 

However note that while it doesn't read the content of the pipe, the mere opening of the named pipe here can still have side effects. We're using fuser to check first that some process already has the pipe open to alleviate that but that's not foolproof as fuser may not be able to check all processes.

Now, so far we've only been considering the size of the primary data associated with the files. That doesn't take into account the size of the metadata and all the supporting infrastructure needed to store that file.

Another inode attribute returned by stat() is st_blocks . That's the number of 512 byte (1024 on HP/UX) blocks that is used to store the file's data (and sometimes some of its metadata like the extended attributes on ext4 filesystems on Linux). That doesn't include the inode itself, or the entries in the directories the file is linked to.


Size and disk usage are not necessarily tightly related, as compression, sparseness, (sometimes) some metadata and extra infrastructure like indirect blocks in some filesystems all have an influence on the latter.

That's typically what du uses to report disk usage. Most of the commands listed above will be able to get you that information.

  • POSIXLY_CORRECT=1 ls -sd -- "$file" | awk '{print $1; exit}'
  • POSIXLY_CORRECT=1 du -s -- "$file" (not for directories where that would include the disk usage of the files within).
  • GNU find -- "$file" -printf '%b\n'
  • zstat -L +block -- $file
  • GNU stat -c %b -- "$file"
  • BSD stat -f %b -- "$file"
  • perl -le 'print((lstat shift)[12])' -- "$file"
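To turn the block count into bytes, multiply it by the block unit. A sketch assuming GNU stat, where %b is the block count and %B is the size in bytes of each such block (the scratch file and variable names are illustrative):

```shell
#!/bin/bash
# Sketch assuming GNU stat: %s = st_size, %b = st_blocks,
# %B = size in bytes of each block counted by %b.
sample=$(mktemp)
head -c 10000 /dev/zero > "$sample"

size=$(stat -c %s -- "$sample")
blocks=$(stat -c %b -- "$sample")
block_size=$(stat -c %B -- "$sample")
allocated=$(( blocks * block_size ))

echo "size: $size bytes, allocated: $allocated bytes"
rm -f -- "$sample"
```

The allocated figure depends on the filesystem and may lag behind recent writes, so treat it as an indication and not an exact count.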

¹ Strictly speaking, early versions of UNIX in the 70s, from v1 to v4 had a stat command. It was just dumping information from the inode and didn't take options. It apparently disappeared in v5 (1974) presumably because it was redundant with ls -l .


Bash if file sizes are greater than 1kb

I have created a vi file and I want to check the files in my home directory to see their size. If the size of a regular file is greater than 1kb, I want to back it up as a compressed file with a .bak extension. I have started with the command du -h --max-depth=0 * | sort -r, which lists the files like this:

10K archive.tar
1.0K activity48
1.0K activity47
1.0K activity46
1.0K activity45
1.0k activity44
1.0K activity43
1.0K activity42
1.0K activity41
1.0K activity40
1.0K activity39
1.0K activity38

These are some of the files listed, but my thought is that I need to cut field 1 and somehow create an if statement that compares the field, something like if [ $x -ge 1.0 ] ; do something. Any thoughts on how I should go about the problem?

2 Answers

find . -maxdepth 1 -type f -size +1k -exec gzip -k -S .bak '{}' \;

I'd probably not use a custom extension for the compressed file, though; that's just asking for future confusion.

find searches a directory ( . in this case) for files that pass a filter. Complex filters can be constructed; in this relatively simple case, several primitive filters are chained to select

  • files that are no more than one level deep into . (i.e., subdirectories are not searched),
  • that are regular files,
  • that are larger than 1 KiB (more than 1024 bytes; a file of exactly 1024 bytes does not match -size +1k), and
  • for which gzip -k -S .bak filename exits with a status code of 0.

The -exec filter is special in that it is considered an action (other actions include -delete and -print ). If a filter does not contain an action, an implicit -print action is appended to the filter so that the names of all files that fit the filter are printed. Since our filter contains an action, that does not happen.

In any case, we're not really interested in the result of the -exec filter in this case, just in its side effect of running the specified command. It is useful to know that -exec is also a filter, however, in case you want to chain commands. For example, if you wanted to copy the backup files to another folder after packing them, you could write

find . -maxdepth 1 -type f -size +1k -exec gzip -k -S .bak '{}' \; -exec cp '{}.bak' /some/where/else/ \;

Then cp filename.bak /some/where/else/ would be executed only if gzip -k -S .bak filename returned with an exit status of 0 (that is, if it indicated successful completion).
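If you would rather have the explicit if statement the question asks about, a plain loop works too. This is only a sketch: the scratch directory and file names are made up, and gzip -c with a redirect is used here in place of gzip -k -S .bak to get a compressed copy with a .bak suffix while keeping the original:

```shell
#!/bin/bash
# Sketch with made-up files in a scratch directory.
dir=$(mktemp -d)
head -c 500  /dev/zero > "$dir/small"
head -c 5000 /dev/zero > "$dir/big"

for f in "$dir"/*; do
    [ -f "$f" ] || continue               # skip anything that is not a regular file
    size=$(( $(wc -c < "$f") ))           # size in bytes
    if [ "$size" -gt 1024 ]; then
        gzip -c -- "$f" > "$f.bak"        # compressed copy, original kept
    fi
done

backups=$(( $(find "$dir" -name '*.bak' | wc -l) ))
echo "backups created: $backups"
rm -rf -- "$dir"
```

Comparing whole bytes avoids the problem in the question of feeding a value like 1.0 to [ ... -ge ... ], which only handles integers.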
