Linux: find out what process is using all the RAM?

Before actually asking, just to be clear: yes, I know about disk cache, and no, that is not my case 🙂 Sorry for the preamble 🙂 I’m using CentOS 5. Every application on the system is swapping heavily, and the system is very slow. When I run free -m, here is what I get:

                 total       used       free     shared    buffers     cached
    Mem:          3952       3929         22          0          1         18
    -/+ buffers/cache:       3909         42
    Swap:        16383         46      16337

So, I actually have only 42 MB to use! As far as I understand, the -/+ buffers/cache line doesn’t count the disk cache, so I really do have only 42 MB, right? I thought I might be wrong, so I tried switching off disk caching, and it had no effect: the picture remained the same. So I decided to find out what is using all my RAM, and I used top for that. But apparently it reports that no process is using my RAM. The only process of note in my top output is MySQL, but it is using 0.1% of RAM and 400 MB of swap. I get the same picture when I run other services or applications: everything goes to swap, and top shows that memory is not used (0.1% maximum for any process).

    top - 15:09:00 up  2:09,  2 users,  load average: 0.02, 0.16, 0.11
    Tasks: 112 total,   1 running, 111 sleeping,   0 stopped,   0 zombie
    Cpu(s):  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Mem:   4046868k total,  4001368k used,    45500k free,      748k buffers
    Swap: 16777208k total,    68840k used, 16708368k free,    16632k cached

      PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  SWAP COMMAND
     3214 ntp       15   0 23412 5044 3916 S  0.0  0.1   0:00.00  17m ntpd
     2319 root       5 -10 12648 4460 3184 S  0.0  0.1   0:00.00 8188 iscsid
     2168 root      RT   0 22120 3692 2848 S  0.0  0.1   0:00.00  17m multipathd
     5113 mysql     18   0  474m 2356  856 S  0.0  0.1   0:00.11 472m mysqld
     4106 root      34  19  251m 1944 1360 S  0.0  0.0   0:00.11 249m yum-updatesd
     4109 root      15   0 90152 1904 1772 S  0.0  0.0   0:00.18  86m sshd
     5175 root      15   0 90156 1896 1772 S  0.0  0.0   0:00.02  86m sshd

A restart doesn’t help and, by the way, is very slow, which I wouldn’t normally expect on this machine (4 cores, 4 GB RAM, RAID1). So, given all that, I’m pretty sure it is not the disk cache that is using the RAM, because normally the cache would have shrunk and let other processes use the RAM rather than go to swap. So, finally, the question: does anyone have any ideas on how to find out what process is actually using the memory so heavily?
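
One way to check the observation that no process accounts for the used memory is to add up the resident set size of every process and compare the total with the kernel's own counters. This is only a minimal sketch, assuming a procps `ps` and a `/proc/meminfo` that exposes the fields named below; the helper name `rss_total_kib` and the choice of fields are illustrative, not part of the original question.

```shell
# Sum the resident set size (RSS, reported by ps in KiB) of all processes.
# Shared pages are counted once per process, so this overestimates.
rss_total_kib() {
    ps -eo rss= | awk '{ sum += $1 } END { print sum + 0 }'
}

echo "Sum of process RSS: $(rss_total_kib) KiB"

# Kernel-side counters that per-process tools such as top do not attribute
# to any process (slab caches, page tables, reserved huge pages).
grep -E '^(MemTotal|MemFree|Slab|PageTables|HugePages_Total|Hugepagesize):' /proc/meminfo
```

If even this overestimated sum is far below MemTotal minus MemFree, the memory is probably held somewhere that top cannot show per process, and sorting the process list will not find it.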


Linux process memory usage: How to sort ‘ps’ command output

Linux ps command FAQ: Can you share some examples of how to sort the ps command?

Sure. In this article I’ll show how to sort the Linux ps command output, without using the Linux sort command.

The `ps --sort` option

Before I get started, it’s important to note that the Linux ps command supports a --sort argument, and that argument takes a number of key values, and those keys indicate how you want to sort the ps output.

Here’s a quick look at the --sort information from the ps command man page:

    --sort spec
        specify sorting order. Sorting syntax is
        [+|-]key[,[+|-]key[,...]] Choose a multi-letter key from the
        STANDARD FORMAT SPECIFIERS section. The "+" is optional since
        default direction is increasing numerical or lexicographic
        order. Identical to k. For example:
        ps jax --sort=uid,-ppid,+pid
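
To make the multi-key syntax concrete, here is a small sketch that sorts by user name first and, within each user, by resident memory from high to low. The column selection and the `head -n 15` cut-off are my own choices for illustration, and it assumes a procps `ps` that accepts the `user` and `rss` keys for sorting.

```shell
# Primary key: user name, ascending (the "+" is implied).
# Secondary key: resident set size, descending (note the "-").
ps -eo user,pid,rss,comm --sort=user,-rss | head -n 15
```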

Sort Linux `ps` output by memory (RAM), from high to low

Given that little piece of background information, here’s how we can sort the ps command output by memory usage:
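
The command itself did not survive in this copy. Judging by the column layout of the sample output (the `ps aux` format) and by the later remark about taking the "-" sign off the rss sort argument, it was presumably the following. Treat the exact base options (`aux`) as an assumption.

```shell
# Sort by resident set size, largest first (the "-" means descending).
ps aux --sort -rss
```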

That ps command gives me this output:

    USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
    mysql     2897  0.0  1.7 136700 17952 ?        Sl   Oct21   0:00 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysq
    root      2483  0.0  1.0  43540 11368 ?        Ssl  Oct21   0:00 /usr/bin/python -E /usr/sbin/setroubleshootd
    root      3124  0.0  0.9  25816 10332 ?        SN   Oct21   0:00 /usr/bin/python -tt /usr/sbin/yum-updatesd
    root      2406  0.0  0.9  11572 10004 ?        Ss   Oct21   0:00 /usr/sbin/restorecond
    root      2928  0.0  0.6  17648  7120 ?        Ss   Oct21   0:00 /usr/local/apache2/bin/httpd -k start
    nobody    2949  0.0  0.6  17648  6492 ?        S    Oct21   0:00 /usr/local/apache2/bin/httpd -k start
    nobody    2950  0.0  0.6  17648  6492 ?        S    Oct21   0:00 /usr/local/apache2/bin/httpd -k start
    nobody    2951  0.0  0.6  17648  6492 ?        S    Oct21   0:00 /usr/local/apache2/bin/httpd -k start
    nobody    2952  0.0  0.6  17648  6492 ?        S    Oct21   0:00 /usr/local/apache2/bin/httpd -k start
    nobody    2953  0.0  0.6  17648  6492 ?        S    Oct21   0:00 /usr/local/apache2/bin/httpd -k start
    68        3115  0.0  0.3   5920  3912 ?        Ss   Oct21   0:01 hald
    root     18453  0.0  0.2  10140  2884 ?        Ss   11:09   0:00 sshd: root@pts/0
    root      2801  0.0  0.2  10020  2328 ?        Ss   Oct21   0:00 cupsd
    root      2959  0.0  0.1   9072  1876 ?        Ss   Oct21   0:00 sendmail: accepting connections
    root       475  0.0  0.1   3004  1600 ?        S

As you can see, this prints the ps output with the largest RSS size at the top of the output. (There are also many more lines than this, I just trimmed the output.)

To reverse this output and show the largest RSS value at the bottom of the ps command output, just take the "-" sign off the rss sort argument, like this:

How to sort `ps` output by pid

To sort the output of the ps command by pid, we'd issue one of the following two commands. First, to sort by pid, in order from highest PID to lowest, we'd use this ps command:
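
The command is missing here. Assuming the same `ps aux` form as the memory example and the `pid` key from the man page's STANDARD FORMAT SPECIFIERS, sorting from highest PID to lowest would presumably be:

```shell
# Sort by process ID, highest first.
ps aux --sort -pid
```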

And to sort by pid, from low to high, again we remove the "-" from our argument:
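
This command is also absent from this copy. Under the same assumption about the `ps aux` base invocation, the ascending form would presumably be:

```shell
# Sort by process ID, lowest first.
ps aux --sort pid
```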

GNU `ps` command sorting specifiers

There are many, many more ways to sort ps command output, and you can find all of them in the Linux ps command man page. I've trimmed down some of the output from the ps man page to show what I think the most important sort keys are. Here is that information, along with a little introductory information:

    STANDARD FORMAT SPECIFIERS

    Here are the different keywords that may be used to control the output
    format (e.g. with option -o) or to sort the selected processes with the
    GNU-style --sort option.

    For example: ps -eo pid,user,args --sort user

    This version of ps tries to recognize most of the keywords used in other
    implementations of ps.

    The following user-defined format specifiers may contain spaces: args,
    cmd, comm, command, fname, ucmd, ucomm, lstart, bsdstart, start.

    Some keywords may not be available for sorting.

    CODE      HEADER   DESCRIPTION

    %cpu      %CPU     cpu utilization of the process in "##.#" format.
                       Currently, it is the CPU time used divided by the time
                       the process has been running (cputime/realtime ratio),
                       expressed as a percentage. It will not add up to 100%
                       unless you are lucky. (alias pcpu).

    %mem      %MEM     ratio of the process’s resident set size to the
                       physical memory on the machine, expressed as a
                       percentage. (alias pmem).

    bsdstart  START    time the command started. If the process was started
                       less than 24 hours ago, the output format is " HH:MM",
                       else it is "mmm dd" (where mmm is the three letters of
                       the month).

    bsdtime   TIME     accumulated cpu time, user + system. The display format
                       is usually "MMM:SS", but can be shifted to the right if
                       the process used more than 999 minutes of cpu time.

    c         C        processor utilization. Currently, this is the integer
                       value of the percent usage over the lifetime of the
                       process. (see %cpu).

    comm      COMMAND  command name (only the executable name). Modifications
                       to the command name will not be shown. A process marked
                       <defunct> is partly dead, waiting to be fully destroyed
                       by its parent. The output in this column may contain
                       spaces. (alias ucmd, ucomm). See also the args format
                       keyword, the -f option, and the c option.
                       When specified last, this column will extend to the
                       edge of the display. If ps can not determine display
                       width, as when output is redirected (piped) into a file
                       or another command, the output width is undefined. (it
                       may be 80, unlimited, determined by the TERM variable,
                       and so on) The COLUMNS environment variable or --cols
                       option may be used to exactly determine the width in
                       this case. The w or -w option may be also be used to
                       adjust width.

    command   COMMAND  see args. (alias args, cmd).

    cp        CP       per-mill (tenths of a percent) CPU usage. (see %cpu).

    cputime   TIME     cumulative CPU time, "[dd-]hh:mm:ss" format.
                       (alias time).

    egroup    EGROUP   effective group ID of the process. This will be the
                       textual group ID, if it can be obtained and the field
                       width permits, or a decimal representation otherwise.
                       (alias group).

    etime     ELAPSED  elapsed time since the process was started, in the form
                       [[dd-]hh:]mm:ss.

    euid      EUID     effective user ID. (alias uid).

    euser     EUSER    effective user name. This will be the textual user ID,
                       if it can be obtained and the field width permits, or a
                       decimal representation otherwise. The n option can be
                       used to force the decimal representation.
                       (alias uname, user).

    gid       GID      see egid. (alias egid).

    lstart    STARTED  time the command started.

    ni        NI       nice value. This ranges from 19 (nicest) to -20 (not
                       nice to others), see nice(1). (alias nice).

    pcpu      %CPU     see %cpu. (alias %cpu).

    pgid      PGID     process group ID or, equivalently, the process ID of
                       the process group leader. (alias pgrp).

    pid       PID      process ID number of the process.

    pmem      %MEM     see %mem. (alias %mem).

    ppid      PPID     parent process ID.

    rss       RSS      resident set size, the non-swapped physical memory that
                       a task has used (in kiloBytes). (alias rssize, rsz).

    ruid      RUID     real user ID.

    size      SZ       approximate amount of swap space that would be required
                       if the process were to dirty all writable pages and
                       then be swapped out. This number is very rough!

    start     STARTED  time the command started. If the process was started
                       less than 24 hours ago, the output format is
                       "HH:MM:SS", else it is " mmm dd" (where mmm is a
                       three-letter month name).

    sz        SZ       size in physical pages of the core image of the
                       process. This includes text, data, and stack space.
                       Device mappings are currently excluded; this is subject
                       to change. See vsz and rss.

    time      TIME     cumulative CPU time, "[dd-]hh:mm:ss" format.
                       (alias cputime).

    tname     TTY      controlling tty (terminal). (alias tt, tty).

    vsz       VSZ      virtual memory size of the process in KiB (1024-byte
                       units). Device mappings are currently excluded; this is
                       subject to change. (alias vsize).

As you can see, there are a lot of sorting options with the ps command, even though I've trimmed down this list significantly.


How to find which processes are taking all the memory?

Under Linux, simply press M in top to sort by physical memory usage (RES column). Under *BSD, run top -o res or top -o size. But htop is a lot nicer and doesn't even consume more memory than top (however, it's not part of the basic toolset, so you might not have it installed).

@Steven How can we group processes with the same parent? Basically, firefox shows up multiple times, maybe because it spawns multiple child processes. Is it possible to get the combined memory usage?
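
One rough way to get a combined figure is to add up RSS per command name with awk. This is only a sketch: it groups by command name rather than by parent process, the helper name `sum_rss_by_command` is made up for this example, and because shared pages are counted once per process, the totals are an upper bound rather than an exact figure.

```shell
# Reads "rss command-name" lines on stdin and prints "total_kib command-name",
# largest total first. Command names containing spaces are kept whole.
sum_rss_by_command() {
    awk '{
        rss = $1
        $1 = ""                 # drop the RSS field, leaving the name
        sub(/^ +/, "")          # trim the separator left behind
        total[$0] += rss
    }
    END { for (name in total) print total[name], name }' |
        sort -k1,1nr
}

# The ten command names with the largest combined RSS (in KiB).
ps -eo rss=,comm= | sum_rss_by_command | head -n 10
```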

If you have it installed, I like htop: once it is launched, you can press F6, then the down arrow (to MEM%), then Enter to sort by memory.

In Solaris the command you would need is:

This will list all processes in order of descending process image size. Note that the latter is based on memory committed to the process by the OS, not its resident physical memory usage.

There are supposedly versions of "top" available for Solaris, but these are not part of the standard installation.

Once top starts, press F to switch to the sort field screen. Choose one of the fields listed by pressing the key listed on the left; you probably want N for MEM%

If you want MEM%, pressing 'M' does the same as described above. Pressing 'c' adds command-line parameters to the process list, which may be informative for your problem.

This command will identify the top memory consuming processes:

 ps -A --sort -rss -o pid,pmem:40,cmd:500 | head -n 6 | tr -s " " ";" 

Doesn't work on Solaris 9:

    ps: illegal option -- -
    ps: ort is an invalid non-numeric argument for -s option
    ps: illegal option -- r
    ps: s is an invalid non-numeric argument for -s option
    ps: unknown output format: -o pmem:40
    ps: unknown output format: -o cmd:500

One nice alternative to top is htop. Check it out; it is much more user-friendly than regular top.

Globally: it is always advisable to feed historical logs into a log analysis tool such as Splunk or ELK, so that you can use its query language to look up PIDs and their CPU and memory usage easily.

AT SERVER/OS LEVEL: From inside top you can try the following:

 Press SHIFT+M to sort the processes by memory usage, in descending order.

Or, from the shell:

 $ ps -eo pid,ppid,cmd,%mem,%cpu --sort=-%mem | head -n 11

This gives the top 10 processes by memory usage (head -n 11 keeps the header line plus ten processes). You can also use the vmstat utility to see RAM usage as it is right now, though not its history.


