Get file count in Linux

How to count number of files in each directory?

Are you looking for a way to count the number of files in each of the sub-directories directly under ./ ?

How is this an off-topic question? I would like to see the close-voters' comments with a reason! If this is off-topic, then where does it belong? Super User? I don't think so.

voted to reopen it. There may be other answers that could be useful in many situations (including script programming, which is the reason I reached this question).

21 Answers

This prints the file count per directory for the current directory level:

du -a | cut -d/ -f2 | sort | uniq -c | sort -nr 

By far the best (and most elegant) solution if one wants to list the number of files in top level directories recursively.

This has two problems: It counts one file per directory more than there actually is, and it gives a useless line containing the size of the current directory as "1 size". Both can be fixed with du -a | sed '/.*\.\/.*\/.*/!d' | cut -d/ -f2 | sort | uniq -c . Add | sort -nr to sort by the count instead of the directory name.

I’d like to point out that this works in OSX, too. (Just copy-pasting Linux advice into an OSX shell usually doesn’t work.)

It fetches unneeded sizes via du -a . A better way is to use the find command, but the main idea is exactly the same 🙂

Assuming you have GNU find, let it find the directories and let bash do the rest:

find . -type d -print0 | while read -d '' -r dir; do
    files=("$dir"/*)
    printf "%5d files in directory %s\n" "${#files[@]}" "$dir"
done

It's just a slightly different version of the above (hint: it's sorted by name and the output is CSV): for x in $(find . -maxdepth 1 -type d | sort); do y=$(find $x | wc -l); echo $x,$y; done

Great one! Putting it on a single line (so it's comfortable for direct usage in the shell): find . -type d -print0 | while read -d '' -r dir; do files=("$dir"/*); printf "%5d files in directory %s\n" "${#files[@]}" "$dir"; done

I needed to get the number of all files (recursive count) in each subdirectory. This modification gives you that: find . -maxdepth 1 -type d -print0 | while read -d '' -r dir; do num=$(find $dir -ls | wc -l); printf "%5d files in directory %s\n" "$num" "$dir"; done

@Kory The following will do it: find . -maxdepth 1 -type d -print0 | while read -d '' -r dir; do num=$(find "$dir" -ls | wc -l); printf "%5d files in directory %s\n" "$num" "$dir"; done | sort -rn -k1

@OmidS Great one-liner, but $dir should be inside quotes in your first comment to correctly handle dir names with whitespace: find . -maxdepth 1 -type d -print0 | while read -d '' -r dir; do num=$(find "$dir" -ls | wc -l); printf "%5d files in directory %s\n" "$num" "$dir"; done

find . -type f | cut -d/ -f2 | sort | uniq -c 
  • find . -type f to find all items of the type file , in current folder and subfolders
  • cut -d/ -f2 to cut out the second path component, i.e. the top-level folder each file lives in
  • sort to sort the list of foldernames
  • uniq -c to return the number of times each foldername has been counted

Perfect. And it can be extended to count over subdirectories by replacing the field specifier with a list of field specifiers. E.g.: find . -type f | cut -d/ -f2,3 | sort | uniq -c

You could arrange to find all the files, remove the file names, leaving you a line containing just the directory name for each file, and then count the number of times each directory appears:

find . -type f | sed 's%/[^/]*$%%' | sort | uniq -c 

The only gotcha in this is if you have any file names or directory names containing a newline character, which is fairly unlikely. If you really have to worry about newlines in file names or directory names, I suggest you find them, and fix them so they don’t contain newlines (and quietly persuade the guilty party of the error of their ways).

If you’re interested in the count of the files in each sub-directory of the current directory, counting any files in any sub-directories along with the files in the immediate sub-directory, then I’d adapt the sed command to print only the top-level directory:

find . -type f | sed -e 's%^\(\./[^/]*/\).*$%\1%' -e 's%^\.\/[^/]*$%./%' | sort | uniq -c 

The first pattern captures the start of the name, the dot, the slash, the name up to the next slash and that slash, and replaces the line with just the captured part, so, for example, ./subdir/nested/file becomes ./subdir/.

The second replace captures the files directly in the current directory; they don't have a slash at the end, and those are replaced by ./ . The sort and count then work on just the directory names.
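The behaviour of the two sed patterns can be demonstrated on a small throwaway tree (the directory and file names below are made up for illustration):

```shell
# Demo of the top-level-directory counting pipeline on a tiny tree
cd "$(mktemp -d)"
mkdir -p a/b c
touch a/1 a/b/2 c/3 4

find . -type f | sed -e 's%^\(\./[^/]*/\).*$%\1%' -e 's%^\.\/[^/]*$%./%' | sort | uniq -c
# prints (uniq -c pads the counts):
#       1 ./        <- the file "4" directly in the current directory
#       2 ./a/      <- a/1 and a/b/2, both attributed to top-level "a"
#       1 ./c/
```

Note how a/b/2 is counted under ./a/ rather than ./a/b/, which is exactly the "top-level directory only" behaviour described above.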


How can I get a count of files in a directory using the command line?

I have a directory with a large number of files. I don’t see a ls switch to provide the count. Is there some command line magic to get a count of files?

tree . | tail or tree -a . | tail to include hidden files/dirs, tree is recursive if that’s what you want.

@CodyChan : It should be tail -n 1 , and even then the count would also include the entries in subdirectories.

20 Answers 20

Using a broad definition of "file"

(note that it doesn’t count hidden files and assumes that file names don’t contain newline characters).
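The command this answer opens with appears to have been lost in the scrape; given the note about hidden files and the follow-up comment explaining wc -l, it is presumably the classic pairing of ls with a line count:

```shell
# Count entries in the current directory (hidden files are NOT included,
# and names containing newlines are miscounted)
cd "$(mktemp -d)"
touch a b .hidden
ls | wc -l    # prints 2
```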

To include hidden files (except . and .. ) and avoid problems with newline characters, the canonical way is:

find . ! -name . -prune -print | grep -c / 
find .//. ! -name . -print | grep -c // 

wc is a "word count" program. The -l switch causes it to count lines. In this case, it's counting the lines in the output from ls . This is always the way I was taught to get a file count for a given directory, too.

that doesn’t get everything in a directory — you’ve missed dot files, and collect a couple extra lines, too. An empty directory will still return 1 line. And if you call ls -la , you will get three lines in the directory. You want ls -lA | wc -l to skip the . and .. entries. You’ll still be off-by-one, however.

A corrected approach, that would not double count files with newlines in the name, would be this: ls -q | wc -l — though note that hidden files will still not be counted by this approach, and that directories will be counted.

For narrow definition of file:

 find . -maxdepth 1 -type f | wc -l 

And you can of course omit the -maxdepth 1 for counting files recursively (or adjust it for desired max search depth).

Читайте также:  Tmp folder in linux

A corrected approach, that would not double count files with newlines in the name, would be this: find -maxdepth 1 -type f -printf "\n" | wc -l

I have found du --inodes useful, but I'm not sure which version of du it requires. It should be substantially faster than alternative approaches using find and wc .

On Ubuntu 17.10, the following works:

du --inodes        # all files and subdirectories
du --inodes -s     # summary
du --inodes -d 2   # depth 2 at most

Combine with | sort -nr to sort descending by number of containing inodes.

Thanks for sharing! I searched for "count" in the du man page, as in "I want to count the files", but it's not documented with that word. Any answer using wc -l will be wrong when any name contains a newline character.

$ ls --help | grep -- ' -1'
  -1                         list one file per line
$ wc --help | grep -- ' -l'
  -l, --lines                print the newline counts

@Dennis that’s interesting I didn’t know that an application could tell its output was going to a pipe.

I +’ed this version since it is more explicit. Though, yes ls does use -1 if it’s piped (try it: ls | cat), I find the -1 syntax more explicit.

In my tests it was significantly faster to also provide the -f option to avoid ls sorting the filenames. Unfortunately you still get the wrong answer if your filenames contain newlines.

Probably the most complete answer using the ls / wc pair is

if you want to count dot files, and

  • -A is to count dot files, but omit . and .. .
  • -q makes ls replace nongraphic characters, specifically the newline character, with ? , making the output one line per file

To get one-line output from ls in a terminal (i.e. without piping it into wc ), the -1 option has to be added.

(behaviour of ls tested with coreutils 8.23)
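The command itself was lost in the scrape, but from the options described above (and the comment below that quotes it verbatim), it is presumably:

```shell
# Count all entries including dot files, with newline-safe output
cd "$(mktemp -d)"
touch a b .hidden
ls -Aq | wc -l    # prints 3: .hidden is counted, . and .. are not
```

Note that this still counts directories along with regular files, a limitation one of the comments below picks up on.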

As you said, -1 is not needed. As to "it handles newlines in filenames sensibly with console output", this is because of the -q switch (which you should use instead of -b because it's portable), which "forces each instance of non-printable filename characters and <tab> characters to be written as the ( '?' ) character. Implementations may provide this option by default if the output is to a terminal device." So e.g. ls -Aq | wc -l to count all files/dirs, or ls -qp | grep -c / to count only non-hidden dirs, etc.

Currently includes directories in its file count. To be most complete we need an easy way to omit those when needed.

@JoshHabdas It says «probably». 😉 I think the way to omit directories would be to use don_crissti’s suggestion with a slight twist: ls -qp | grep -vc / . Actually, you can use ls -q | grep -vc / to count all (non-hidden) files, and adding -p makes it match only regular files.

If you know the current directory contains at least one non-hidden file:

This is obviously generalizable to any glob.
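The command itself is elided in the scrape; from the surrounding discussion (a later comment calls it "the set * method" and mentions overwriting the positional parameters), it is presumably the set builtin: the glob expands to the non-hidden names, set loads them into the positional parameters, and $# is the count.

```shell
# Count non-hidden files via the positional parameters (no external command)
cd "$(mktemp -d)"
touch one two three
set -- *
echo "$#"    # prints 3
```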

In a script, this has the sometimes unfortunate side effect of overwriting the positional parameters. You can work around that by using a subshell or with a function (Bourne/POSIX version) like:

count_words () { eval 'shift; '"$1"'=$#'; }
count_words number_of_files *
echo "There are $number_of_files non-dot files in the current directory"

An alternative solution is $(ls -d -- * | wc -l) . If the glob is * , the command can be shortened to $(ls | wc -l) . Parsing the output of ls always makes me uneasy, but here it should work as long as your file names don't contain newlines, or your ls escapes them. And $(ls -d -- * 2>/dev/null | wc -l) has the advantage of handling the case of a non-matching glob gracefully (i.e., it returns 0 in that case, whereas the set * method requires fiddly testing if the glob might be empty).


If file names may contain newline characters, an alternative is to use $(ls -d ./* | grep -c /) .

Any of those solutions that rely on passing the expansion of a glob to ls may fail with a argument list too long error if there are a lot of matching files.


Find the number of files in a directory

Is there any method in Linux to calculate the number of files in a directory (that is, immediate children) in O(1) (independently of the number of files) without having to list the directory first? If not O(1), is there a reasonably efficient way? I’m searching for an alternative to ls | wc -l .

ls | wc -l will cause ls to do an opendir(), readdir() and probably a stat() on all the files. This will generally be at least O(n).

Yeah, correct, my fault. I was thinking of O(1) and O(n) as the same, although I should know better.

8 Answers

readdir is not as expensive as you may think. The knack is to avoid stat'ing each file, and (optionally) sorting the output of ls.

/bin/ls -1U avoids aliases in your shell, doesn't sort the output, and lists one file per line (not strictly necessary when piping the output into wc).

The original question can be rephrased as "does the data structure of a directory store a count of the number of entries?", to which the answer is no. There isn't a more efficient way of counting files than readdir(2)/getdents(2).

One can get the number of subdirectories of a given directory without traversing the whole list by stat'ing (stat(1) or stat(2)) the given directory and observing the number of links to that directory. A given directory with N child directories will have a link count of N+2: one link for the ".." entry of each subdirectory, plus two for the "." and ".." entries of the given directory itself.
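A sketch of that link-count trick (GNU coreutils stat assumed; BSD stat spells the same query stat -f %l):

```shell
# Number of immediate subdirectories = hard-link count of the directory minus 2
cd "$(mktemp -d)"
mkdir x y z
touch regular_file         # regular files do not affect a directory's link count
links=$(stat -c %h .)      # hard-link count of "."
echo $((links - 2))        # 3 on classic filesystems (ext4, xfs, tmpfs)
# Caveat: some filesystems (notably btrfs) always report st_nlink 1 for
# directories, so this trick is filesystem-dependent.
```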

However one cannot get the number of all files (whether regular files or subdirectories) without traversing the whole list — that is correct.

The "/bin/ls -1U" command will not get all entries, however. It will get only those directory entries that do not start with the dot (.) character. For example, it would not count the ".profile" file found in many login $HOME directories.

One can use either the "/bin/ls -f" command or the "/bin/ls -Ua" command to avoid the sort and get all entries.

Perhaps unfortunately for your purposes, either the "/bin/ls -f" command or the "/bin/ls -Ua" command will also count the "." and ".." entries that are in each directory. You will have to subtract 2 from the count to avoid counting these two entries, as in the following:

expr `/bin/ls -f | wc -l` - 2 # Those are back ticks, not single quotes. 

The --format=single-column (-1) option is not necessary on the "/bin/ls -Ua" command when piping the "ls" output into "wc", as in this case. The "ls" command will automatically write its output in a single column if the output is not a terminal.

