Counting open files per process
I’m working on an application that monitors processes’ resource usage and gives a periodic report on Linux, but I’ve run into a problem extracting the open files count per process. Listing all open files and grouping them by PID to count them takes quite a while. How can I get the open files count for each process in Linux?
5 Answers
Have a look at the /proc/ file system:
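For a single process, you can count the entries in its fd directory (PID 1234 below is just a placeholder):
ls /proc/1234/fd/ | wc -l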
To do this for all processes, use this:
cd /proc
for pid in [0-9]*
do
    echo "PID = $pid with $(ls /proc/$pid/fd/ | wc -l) file descriptors"
done
As a one-liner (filter by appending | grep -v "0 FDs"):
for pid in /proc/[0-9]*; do printf "PID %6d has %4d FDs\n" $(basename $pid) $(ls $pid/fd | wc -l); done
As a one-liner including the command name, sorted by file descriptor count in descending order (limit the results by appending | head -10 ):
for pid in /proc/[0-9]*; do p=$(basename $pid); printf "%4d FDs for PID %6d; command=%s\n" $(ls $pid/fd | wc -l) $p "$(ps -p $p -o comm=)"; done | sort -nr
Credit to @Boban for this addendum:
You can pipe the output of the script above into the following script to see the ten processes (and their names) which have the most file descriptors open:
...
done | sort -rn -k5 | head | while read -r _ _ pid _ fdcount _
do
    command=$(ps -o cmd -p "$pid" -hc)
    printf "pid = %5d with %4d fds: %s\n" "$pid" "$fdcount" "$command"
done
Here’s another approach to list the top-ten processes with the most open fds; it’s probably less readable, so I don’t put it in front:
find /proc -maxdepth 1 -type d -name '[0-9]*' \
     -exec bash -c "ls {}/fd/ | wc -l | tr '\n' ' '" \; \
     -printf "fds (PID = %P), command: " \
     -exec bash -c "tr '\0' ' ' < {}/cmdline" \; \
     -exec echo \; | sort -rn | head
Of course, you will need to have root permissions to do that for many of the processes. Their file descriptors are kind of private, you know 😉
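For example, to count the descriptors of a process you don’t own (PID 1 here is just an illustration):
sudo ls /proc/1/fd/ | wc -l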
/proc/$pid/fd lists descriptor files, which is slightly different from "open files", since it can include memory maps and other unusual file objects.
This extends the answer and turns pids into command names: for pid in [0-9]*; do echo "PID = $pid with $(ls /proc/$pid/fd/ 2>/dev/null | wc -l) file descriptors"; done | sort -rn -k5 | head | while read -r line; do pid=$(echo $line | awk '{print $3}'); echo "$line ($(ps -o comm= -p $pid))"; done
Yeah, well. Instead of parsing the original output and then calling ps again for each process to find out its command, it might make more sense to use /proc/$pid/cmdline in the first loop. While technically it is still possible for a process to disappear between the evaluation of [0-9]* and the scanning of its directory, this is less likely.
Executing command=$(ps -o cmd -p "$pid" -hc) gave me Warning: bad syntax, perhaps a bogus '-'. It worked when run as command=$(ps -o cmd -p "$pid" hc).
ps aux | sed 1d | awk '{print "fd_count=$(lsof -p " $2 " | wc -l) && echo " $2 " $fd_count"}' | xargs -I {} bash -c {}
For Fedora, it gives:
lsof: WARNING: can't stat() fuse.gvfsd-fuse file system /run/user/1000/gvfs
Output information may be incomplete.
lsof: no pwd entry for UID 65535
I used this to find the top file-handle-consuming processes for a given user (username) where I don’t have lsof or root access:
for pid in `ps -o pid -u username` ; do echo "$(ls /proc/$pid/fd/ 2>/dev/null | wc -l ) for PID: $pid" ; done | sort -n | tail
How can I take the open files count for each process in Linux?
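With Procpath, a query along the following lines collects the stat and fd metrics for every process (the exact invocation is a sketch inferred from the record command shown further down, not a verbatim one):
procpath query -f stat,fd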
This counts the file descriptors of every process if you’re running it from root (e.g. prefixing the command with sudo -E env PATH=$PATH); otherwise it’ll only return file descriptor counts for processes whose /proc/<pid>/fd you may list. This will give you a big JSON document/tree whose nodes look something like:
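An illustrative node (the values are made up, and the exact set of fd type keys is my assumption, not verbatim output) could look like:
{"stat": {"pid": 2468, ...}, "fd": {"anon": 3, "dir": 0, "chr": 1, "blk": 0, "reg": 14, "fifo": 1, "lnk": 0, "sock": 6}, "children": [...]}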
The content of the fd dictionary is counts per file descriptor type. The most interesting ones are probably reg (regular files) and sock (sockets); see the procfile.Fd description or man fstat for more details.
I’m the author of Procpath, a tool that provides a nicer interface to procfs for process analysis. You can record a process tree’s procfs stats (in a SQLite database) and plot any of them later. For instance, this is how my Firefox process tree (root PID 2468) looks with regard to open file descriptor count (the sum of all types):
procpath --logging-level ERROR record -f stat,fd -i 1 -d ff_fd.sqlite \
    '$..children[?(@.stat.pid == 2468)]'
# Ctrl+C
procpath plot -q fd -d ff_fd.sqlite -f ff_df.svg
If I’m interested in only a particular type of open file descriptors (say, sockets) I can plot it like this:
procpath plot --custom-value-expr fd_sock -d ff_fd.sqlite -f ff_df.svg
Why is the number of open files limited in Linux?
My question is: Why is there a limit on open files in Linux?
Well, process limits and file limits are important so things like fork bombs don’t break a server/computer for all users, only the user that does it and only temporarily. Otherwise, someone on a shared server could set off a forkbomb and completely knock it down for all users, not just themselves.
@Rob, a fork bomb doesn’t have anything to do with it since the file limit is per process and each time you fork it does not open a new file handle.
3 Answers
The reason is that the operating system needs memory to manage each open file, and memory is a limited resource — especially on embedded systems.
As the root user you can change the maximum open files count per process (via ulimit -n) and per system (e.g. echo 800000 > /proc/sys/fs/file-max).
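For example (the values are arbitrary; writing to file-max requires root):
ulimit -n                             # show the per-process limit of the current shell
ulimit -n 4096                        # raise it for this shell and its children
cat /proc/sys/fs/file-max             # show the system-wide limit
echo 800000 > /proc/sys/fs/file-max   # raise it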
There is also a security reason: if there were no limits, userland software would be able to create files endlessly until the server goes down.
@Coren The limits discussed here are only for the number of open file handles. As a program can also close file handles, it could still create as many files as it wants, and as big as it wants, until all available disk space is full. To prevent this, you can use disk quotas or separate partitions. You are right in the sense that one aspect of security is preventing resource exhaustion, and for this there are limits.
@jofel Thanks. I guess that open file handles are represented by instances of struct file, and the size of this struct is quite small (on the order of bytes), so can I set /proc/sys/fs/file-max to quite a big value, as long as memory is not used up?
@xanpeng I am not a kernel expert, but as far as I can see, the default for file-max seems to be RAM size divided by 10k. Since the real memory used per file handle should be much smaller (the size of struct file plus some driver-dependent memory), this seems quite a conservative limit.
The max you can set it to is 2^63-1: echo 9223372036854775807 > /proc/sys/fs/file-max . Don’t know why Linux is using signed integers.
Please note that lsof | wc -l sums up a lot of duplicated entries (forked processes can share file handles etc). That number could be much higher than the limit set in /proc/sys/fs/file-max .
To get the current number of open files from the Linux kernel’s point of view, do this:
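cat /proc/sys/fs/file-nr
The three fields are the number of allocated file handles, the number of allocated but unused handles, and the maximum number of file handles.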
Example: This server has 40096 out of max 65536 open files, although lsof reports a much larger number:
# cat /proc/sys/fs/file-max
65536
# cat /proc/sys/fs/file-nr
40096   0       65536
# lsof | wc -l
521504
As lsof will report many files twice or more, such as /dev/null, you can try a best guess with: lsof | awk '{print $NF}' | sort -u | wc -l
you may use lsof | awk '!a[$NF]++' | wc -l
Very old question, but I have been looking into these settings on my server, and lsof | wc -l gives 40,000 while file-nr says 2300. Is that discrepancy normal?
I think it’s largely for historical reasons.
A Unix file descriptor is a small int value, returned by functions like open and creat , and passed to read , write , close , and so forth.
At least in early versions of Unix, a file descriptor was simply an index into a fixed-size per-process array of structures, where each structure contains information about an open file. If I recall correctly, some early systems limited the size of this table to 20 or so.
More modern systems have higher limits, but have kept the same general scheme, largely out of inertia.
20 was the Solaris limit for C language FILE data structures. The file handle count was always larger.
@Lothar: Interesting. I wonder why the limits would differ. Given the fileno and fdopen functions I’d expect them to be nearly interchangeable.
A Unix file is more than just the file handle (int) returned. There are disk buffers, and a file control block that defines the current file offset, file owner, permissions, inode, etc.
@ChuckCottrill: Yes, of course. But most of that information has to be stored whether a file is accessed via an int descriptor or a FILE* . If you have more than 20 files open via open() , would fdopen() fail?