Recursively count all the files in a directory [duplicate]
I have a really deep directory tree on my Linux box. I would like to count all of the files in that path, including all of the subdirectories. For instance, given this directory tree:
/home/blue
/home/red
/home/dir/green
/home/dir/yellow
/home/otherDir/
If I pass in /home , I would like for it to return four files. Or, bonus points if it returns four files and two directories. Basically, I want the equivalent of right-clicking a folder on Windows and selecting properties and seeing how many files/folders are contained in that folder. How can I most easily do this? I have a solution involving a Python script I wrote, but why isn’t this as easy as running ls | wc or similar?
Not exactly what you’re looking for, but to get a very quick grand total, if your locate database is up to date: locate /some/path | wc -l (or on my Mac: locate -c /some/path ). But: this will also count files in /this/other/path/with/some/path , and will count the folders themselves.
By the way, this is a different, but closely related problem (counting all the directories on a drive) and solution: superuser.com/questions/129088/…
5 Answers
Explanation of find . -type f | wc -l :
find . -type f finds all files ( -type f ) in this directory ( . ) and in all subdirectories; the filenames are then printed to standard output, one per line.
This is then piped ( | ) into wc (word count); the -l option tells wc to count only the lines of its input.
Together they count all your files.
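As a quick self-check, here is the command run against a small throwaway tree (the paths are made up for this demo):

```shell
# Build a tiny temporary tree: 3 regular files spread across 2 directory levels.
tmp=$(mktemp -d)
mkdir -p "$tmp/dir/sub"
touch "$tmp/a" "$tmp/dir/b" "$tmp/dir/sub/c"

# -type f restricts the listing to regular files; wc -l counts the lines.
find "$tmp" -type f | wc -l    # the count here is 3

rm -rf "$tmp"
```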
The answers above already answer the question, but I’ll add that if you use find without arguments (except for the folder where you want the search to happen) as in:
the search goes much faster, almost instantaneously, or at least it does for me. This is because the -type clause has to run a stat() system call on each name to check its type; omitting it avoids doing so.
This differs in that it returns the count of files plus folders instead of only files, but at least for me it's enough, since I mostly use it to find which folders have huge amounts of files that take forever to copy and compress. Counting folders still lets me find the folders with the most files; I need speed more than precision.
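A quick sketch of that difference on a throwaway tree (paths invented for the demo): plain find lists every entry, including the starting directory itself, while the -type f version lists only regular files:

```shell
# Tree: root dir, 1 subdirectory, 2 files.
tmp=$(mktemp -d)
mkdir -p "$tmp/dir"
touch "$tmp/a" "$tmp/dir/b"

find "$tmp" | wc -l            # 4: the root dir, 1 subdir, 2 files
find "$tmp" -type f | wc -l    # 2: files only

rm -rf "$tmp"
```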
How to count number of files in each directory?
Are you looking for a way to count the number of files in each of the sub-directories directly under ./ ?
How is this an off-topic question? I would like to see the close-voters' comments with a reason! If this is off-topic, then where does it belong? Super User? I don't think so..
Voted to reopen it. There may be other answers that could be useful in many situations (including script programming, which is the reason I reached this question).
21 Answers
This prints the file count per directory for the current directory level:
du -a | cut -d/ -f2 | sort | uniq -c | sort -nr
By far the best (and most elegant) solution if one wants to list the number of files in top level directories recursively.
This has two problems: it counts one file per directory more than there actually are, and it gives a useless line containing the size of the current directory as "1 size". Both can be fixed with du -a | sed '/.*\.\/.*\/.*/!d' | cut -d/ -f2 | sort | uniq -c . Add | sort -nr to sort by the count instead of the directory name.
I’d like to point out that this works in OSX, too. (Just copy-pasting Linux advice into an OSX shell usually doesn’t work.)
It fetches unneeded sizes via du -a . A better way is to use the find command, but the main idea is exactly the same 🙂
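A hedged sketch of that suggestion: the same per-top-level-directory tally done with find instead of du , so no sizes are computed. Like the du version, this counts directories as well as files ( -mindepth 1 skips the . entry itself; note -mindepth is a GNU/BSD extension):

```shell
# Throwaway tree: "big" has 3 files, "small" has 1.
tmp=$(mktemp -d)
mkdir -p "$tmp/big" "$tmp/small"
touch "$tmp/big/1" "$tmp/big/2" "$tmp/big/3" "$tmp/small/1"
cd "$tmp"

# Tally every entry by its top-level directory, busiest first.
find . -mindepth 1 | cut -d/ -f2 | sort | uniq -c | sort -nr
# big: 4 entries (the dir itself + 3 files); small: 2 entries
```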
Assuming you have GNU find, let it find the directories and let bash do the rest:
find . -type d -print0 | while read -d '' -r dir; do files=("$dir"/*); printf "%5d files in directory %s\n" "${#files[@]}" "$dir"; done
It's just a slightly different version from the above, so (hint: it's sorted by name and the output is CSV): for x in $(find . -maxdepth 1 -type d | sort); do y=$(find "$x" | wc -l); echo "$x,$y"; done
Great one! Putting it into a single line (so it's comfortable for direct usage in the shell): find . -type d -print0 | while read -d '' -r dir; do files=("$dir"/*); printf "%5d files in directory %s\n" "${#files[@]}" "$dir"; done
I needed to get the number of all files (recursive count) in each subdirectory. This modification gives you that: find . -maxdepth 1 -type d -print0 | while read -d '' -r dir; do num=$(find $dir -ls | wc -l); printf "%5d files in directory %s\n" "$num" "$dir"; done
@Kory The following will do it: find . -maxdepth 1 -type d -print0 | while read -d '' -r dir; do num=$(find "$dir" -ls | wc -l); printf "%5d files in directory %s\n" "$num" "$dir"; done | sort -rn -k1
@OmidS Great one-liner, but $dir should be inside quotes in your first comment to correctly handle dir names with whitespace: find . -maxdepth 1 -type d -print0 | while read -d '' -r dir; do num=$(find "$dir" -ls | wc -l); printf "%5d files in directory %s\n" "$num" "$dir"; done
find . -type f | cut -d/ -f2 | sort | uniq -c
- find . -type f to find all items of the type file , in current folder and subfolders
- cut -d/ -f2 to extract the second path component, i.e. the top-level folder each file sits in
- sort to sort the list of foldernames
- uniq -c to return the number of times each foldername has been counted
Perfect. And can be extended to count over subdirectories by replacing the field specifiers with a list of field specifiers. E.g,: find . -type f | cut -d/ -f2,3 | sort | uniq -c
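A small sketch of the pipeline on a throwaway tree; note one quirk: a file sitting directly in . shows up under its own name, since its second path component is the filename itself:

```shell
# One subdirectory with 2 files, plus 1 file at the top level.
tmp=$(mktemp -d)
mkdir -p "$tmp/docs"
touch "$tmp/top.txt" "$tmp/docs/a" "$tmp/docs/b"
cd "$tmp"

# Count files grouped by their second path component.
find . -type f | cut -d/ -f2 | sort | uniq -c
# "docs" gets 2; the top-level file appears as "1 top.txt"
```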
You could arrange to find all the files, remove the file names, leaving you a line containing just the directory name for each file, and then count the number of times each directory appears:
find . -type f | sed 's%/[^/]*$%%' | sort | uniq -c
The only gotcha in this is if you have any file names or directory names containing a newline character, which is fairly unlikely. If you really have to worry about newlines in file names or directory names, I suggest you find them, and fix them so they don’t contain newlines (and quietly persuade the guilty party of the error of their ways).
If you’re interested in the count of the files in each sub-directory of the current directory, counting any files in any sub-directories along with the files in the immediate sub-directory, then I’d adapt the sed command to print only the top-level directory:
find . -type f | sed -e 's%^\(\./[^/]*/\).*$%\1%' -e 's%^\.\/[^/]*$%./%' | sort | uniq -c
The first pattern captures the start of the name: the dot, the slash, the name up to the next slash, and that slash, and it replaces the line with just the captured part (the top-level directory).
The second replacement catches files directly in the current directory; their names don't contain another slash, and those are replaced by ./ . The sort and uniq -c then count how many times each directory name appears.
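Sketching the adapted pipeline on a small made-up tree: files under ./a/ at any depth collapse to ./a/ , and the top-level file collapses to ./ :

```shell
# 1 top-level file, 2 files under "a" at different depths.
tmp=$(mktemp -d)
mkdir -p "$tmp/a/deep"
touch "$tmp/top" "$tmp/a/f1" "$tmp/a/deep/f2"
cd "$tmp"

# Reduce each filename to its top-level directory, then count.
find . -type f | sed -e 's%^\(\./[^/]*/\).*$%\1%' -e 's%^\.\/[^/]*$%./%' | sort | uniq -c
# prints: 1 ./  and  2 ./a/
```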
How do I count all the files recursively through directories
I know du -sh /* will give me the space used by the directories off of root, but in this case I want the number of files, not the size.
I think that "how many files are in subdirectories in there subdirectories" is a confusing construction. If you state more clearly what you want, you might get an answer that fits the bill.
@Steven feel free to rewrite it. I thought my example of du -sh /* made it pretty clear how I wanted the count to work. same thing, just count the files not the bytes.
As you mention inode usage, I don’t understand whether you want to count the number of files or the number of used inodes. The two are different when hard links are present in the filesystem. Most, if not all, answers give the number of files. Don’t use them on an Apple Time Machine backup disk.
@mouviciel this isn't being used on a backup disk, and yes, I suppose they might be different, but in the environment I'm in there are very few hard links; technically I just need to get a feel for it and figure out where someone is burning up their inode quota.
11 Answers
find . -maxdepth 1 -type d | while read -r dir; do printf "%s:\t" "$dir"; find "$dir" -type f | wc -l; done
Thanks to Gilles and xenoterracide for safety/compatibility fixes.
The first part: find . -maxdepth 1 -type d will return a list of all directories in the current working directory. (Warning: -maxdepth is a GNU extension and might not be present in non-GNU versions of find .) This is piped into the while loop.
The second part: while read -r dir; do begins a while loop. As long as the pipe coming into the while is open (which is until the entire list of directories has been sent), the read command will place the next line into the variable dir , and the loop body runs again.
The third part: printf "%s:\t" "$dir" will print the string in $dir (which holds one of the directory names) followed by a colon and a tab (but not a newline).
The fourth part: find "$dir" -type f makes a list of all the files inside the directory whose name is held in $dir . This list is sent to wc .
The fifth part: wc -l counts the number of lines that are sent into its standard input.
The final part: done simply ends the while loop.
So we get a list of all the directories in the current directory. For each of those directories, we generate a list of all the files in it so that we can count them all using wc -l . The result will look like:
./dir1: 234
./dir2: 11
./dir3: 2199
.
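Putting the whole loop together on a throwaway tree (names invented for the demo); note that the . entry counts every file recursively, including those already counted under the subdirectories:

```shell
# Two subdirectories with 2 files and 1 file respectively.
tmp=$(mktemp -d)
mkdir -p "$tmp/dir1" "$tmp/dir2"
touch "$tmp/dir1/a" "$tmp/dir1/b" "$tmp/dir2/c"
cd "$tmp"

# For each directory at depth <= 1 (including "."), print its name
# followed by its recursive file count.
find . -maxdepth 1 -type d | while read -r dir; do
    printf "%s:\t" "$dir"
    find "$dir" -type f | wc -l
done
```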