Linux detect open files

Quick way to know if a file is open on Linux?

Is there a quick way (i.e. one that minimizes time-to-answer) to find out if a file is open on Linux? Let’s say I have a process that writes a ton of files in a directory and another process which reads those files once they are finished writing. Can the latter process know whether a file is still being written to by the former process? A Python-based solution would be ideal, if possible. Note: I understand I could be using a FIFO / queue-based solution, but I am looking for something else.

Since you want to observe changes in the file system, inotify could be the answer. I have not tested it myself, but this thread may be related to your problem: ubuntuforums.org/showthread.php?t=663950

Nice question, lots of good different answers. With a little more information it may be easier to pick a front runner solution.

I’ve got tens of thousands of files generated at each stage by process #1 and I’d like a quick way to pick out the files that have been processed and hand those off to process #2.

Yes, I have full control of the environment. I have used the "tmp file & rename" option in the past but I was looking for alternatives.

I would suggest that "tmpfile & rename" is the most resilient method. If process #1 crashes or runs out of space, you do not have any half-complete "result files". You could combine this method with inotify to avoid the need to poll the directory.

11 Answers

You can of course use the inotify feature of Linux, but it is safer to avoid the situation altogether: let the writing process create files under a name (say, data.tmp) that the reading process will always ignore. When the writer finishes, it renames the file to the name the reader expects (say, data.dat). Because a rename is atomic within a single filesystem, the reader never sees a half-written file under the final name.
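The temp-file-and-rename approach described above could be sketched roughly as follows. This is a minimal illustration, not the answerer's code: the function names write_then_publish and finished_files and the .tmp suffix convention are assumptions made for the example.

```python
import os
import tempfile


def write_then_publish(directory, final_name, data):
    """Write data to a temp file, then rename it to its final name."""
    # Create the temp file in the target directory so the rename stays on
    # one filesystem (a rename is only atomic within a single filesystem).
    fd, tmp_path = tempfile.mkstemp(suffix=".tmp", dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before publishing
        final_path = os.path.join(directory, final_name)
        os.replace(tmp_path, final_path)  # atomic rename on POSIX
    except BaseException:
        # Remove the partial temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path


def finished_files(directory):
    """Return the names a reader may pick up: everything not ending in .tmp."""
    return sorted(n for n in os.listdir(directory) if not n.endswith(".tmp"))
```

The reader only ever calls finished_files, so it never needs to ask whether a file is still open.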

If you know the PID of the writing process, on Linux you can simply query /proc/<PID>/fd/ and see whether one of the links found there points to one of your files.

What you would do is scan that directory, recording the fact that fd 5 (say) points to /var/data/whatever/file1.log, and store each file pointed to in an array.

At that point if a filename is in the array, the process has it in use.

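import os

# Here I use PID = 31824
path = "/proc/%d/fd" % 31824
whatever = "/var/data/whatever/file1.log"  # the file we are interested in

# Resolve every descriptor link of the process to the file it points at.
openfiles = [os.readlink("%s/%s" % (path, fname)) for fname in os.listdir(path)]

if whatever in openfiles:
    # whatever is used by pid 31824.
    print("%s is in use by pid 31824" % whatever)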

lsof | grep filename immediately comes to mind.

You have a variety of options available:

  • Inotify is a feature that allows you to watch for file operations
  • Writing process renames files when finished writing
  • The program fuser will let you query whether a file is in use
  • Knowing the PID of the writer may let you query /proc/PID/fd for open file descriptors.
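As a rough illustration of the fuser option from the list above, the sketch below wraps the command in Python. It relies on fuser exiting with status 0 when at least one process is accessing the file. The function name file_in_use is an assumption for the example, and without root privileges fuser generally reports only your own processes.

```python
import shutil
import subprocess


def file_in_use(path):
    """Return True/False according to fuser, or None if fuser is missing."""
    if shutil.which("fuser") is None:
        return None  # fuser (usually from the psmisc package) is not installed
    # fuser exits with 0 when at least one process accesses the file and
    # with a non-zero status otherwise (including on errors).
    result = subprocess.run(
        ["fuser", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

Spawning one fuser process per file is slow for tens of thousands of files, so this is better suited to spot checks than to bulk scans.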

If you can change the ‘first’ process logic, the easy solution would be to write data to a temp file and rename the file once all the data is written.

This is a solution using inotify. You will get a notification for every file in the directory being closed after a writing operation.

import os
import pyinotify


def Monitor(path):
    class PClose(pyinotify.ProcessEvent):
        # Handles the IN_CLOSE family of events, which includes IN_CLOSE_WRITE.
        def process_IN_CLOSE(self, event):
            f = os.path.join(event.path, event.name) if event.name else event.path
            print('close event: ' + f)

    wm = pyinotify.WatchManager()
    notifier = pyinotify.Notifier(wm, PClose())
    # Only report files that are closed after being opened for writing.
    wm.add_watch(path, pyinotify.IN_CLOSE_WRITE)

    try:
        while True:
            notifier.process_events()
            if notifier.check_events():
                notifier.read_events()
    except KeyboardInterrupt:
        notifier.stop()
        return


if __name__ == '__main__':
    path = "."
    Monitor(path)

However, since you control the process writing the files, I’d vote for a different solution involving some kind of communication between the processes.


How to Check Open Files in Linux

You may have come across the saying, “Everything is a file in Linux.” Although this is not entirely true, there is some truth to it.

In Linux and Unix-like systems, everything is like a file. That means the resources in the Unix system get assigned a file descriptor, including storage devices, network sockets, processes, etc.

A file descriptor is a non-negative integer, unique within a process, that identifies an open file or another input/output resource. It is the handle through which the process asks the kernel to access that resource. Think of it as a gateway to the kernel’s abstraction of hardware resources.
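To make the idea concrete, here is a small Python sketch that opens a scratch file, prints the descriptor number, and asks the kernel what that number refers to. It assumes a Linux system with /proc mounted. The helper name fd_target is made up for this illustration.

```python
import os
import tempfile


def fd_target(fd):
    """Return the path that /proc/self/fd/<fd> points at (Linux only)."""
    return os.readlink("/proc/self/fd/%d" % fd)


# Create a scratch file, then open it with the low-level os.open call, which
# returns the raw descriptor rather than a Python file object.
handle, scratch_path = tempfile.mkstemp()
os.close(handle)
fd = os.open(scratch_path, os.O_RDONLY)

print(fd)             # a small non-negative integer
print(fd_target(fd))  # the path the kernel associates with that descriptor

os.close(fd)
os.unlink(scratch_path)
```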

A full treatment of file descriptors is beyond the scope of this tutorial.

That means that Unix and Unix-like systems such as Linux use such files heavily. As a Linux power user, being able to see which files are open, and which processes and users are using them, is incredibly useful.

This tutorial will focus on ways to view the files open and which process or user is responsible.

Pre-Requisites

Before we begin, ensure that you have:

If you have these, let us get started:

LSOF Utility

Created by Victor A Abell, List open files, or lsof for short, is a command-line utility that allows us to view the open files and the processes or users who opened them.

The lsof utility is available in major Linux distributions; however, it may not be installed by default, in which case you need to install it manually.

How to Install lsof on Debian/Ubuntu

To install it on Debian, use the command:

sudo apt-get install lsof -y

How to Install on RHEL/CentOS

To install on RHEL and CentOS, use the command:

sudo yum install lsof

How to Install on Arch

On Arch, call the package manager using the command:

sudo pacman -S lsof

How to Install on Fedora

On Fedora, use the command:

sudo dnf install lsof

Once you have the lsof utility installed and updated, we can begin using it.

Basic lsof Usage

To use the lsof tool, enter the command:

Once you execute the above command, lsof will dump a lot of information as shown below:

The above output shows all the files opened by the processes. The output has various columns, each representing specific information about the file.

  • COMMAND – shows the name of the process that is using the file.
  • PID – shows the process identifier of the process using the file.
  • TID – shows the task (thread) ID of the process.
  • TASKCMD – shows the name of the task command.
  • USER – shows the owner of the process.
  • FD – shows the file descriptor number, or one of the following labels describing how the process uses the file:
      • cwd – current working directory
      • mem – memory-mapped file
      • pd – parent directory
      • jld – jail directory
      • ltx – shared library text
      • rtd – root directory
      • txt – program code and data
      • tr – kernel trace file
      • err – file descriptor information error
      • mmp – memory-mapped device
  • TYPE – shows the type of node associated with the file, such as:
      • unix – Unix domain socket
      • DIR – directory
      • REG – regular file
      • CHR – character special file
      • LINK – symbolic link file
      • BLK – block special file
      • inet – Internet domain socket
      • FIFO – named pipe (First In First Out file)
      • PIPE – pipe
  • DEVICE – shows the device numbers, separated by commas, for a character special, block special, regular, directory, or NFS file.
  • SIZE/OFF – shows the size of the file or the file offset in bytes.
  • NODE – shows the node number of a local file, or the Internet protocol type (for example, TCP) for a network file.
  • NAME – shows the name of the mount point and file system on which the file is located.

Note: Please refer to the lsof manual for detailed information on the columns.

How to Show Processes that Opened a File

Lsof provides us with options that help us filter the output to show only the processes that opened a specific file.

For example, to see the processes that opened the file /bin/bash, use the command:

lsof /bin/bash

This will give you an output as shown below:

COMMAND   PID   USER  FD  TYPE DEVICE SIZE/OFF   NODE NAME
ksmtuned 1025   root txt   REG  253,0  1150704 428303 /usr/bin/bash
bash     2968 centos txt   REG  253,0  1150704 428303 /usr/bin/bash
bash     3075 centos txt   REG  253,0  1150704 428303 /usr/bin/bash
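If you want to consume this kind of output from a script, a simple whitespace split is often enough for rows like the ones above. The sketch below is only an illustration: the parse_lsof helper is hypothetical and assumes exactly the nine default columns with no empty fields. Real lsof output can leave columns blank or add columns, so lsof’s field output mode (-F) is the more robust choice for serious parsing.

```python
LSOF_COLUMNS = ["COMMAND", "PID", "USER", "FD", "TYPE",
                "DEVICE", "SIZE/OFF", "NODE", "NAME"]


def parse_lsof(text):
    """Turn default-format lsof output into a list of dicts keyed by column."""
    rows = []
    for line in text.splitlines():
        # Split at most 8 times so a NAME containing spaces stays in one piece.
        fields = line.strip().split(None, len(LSOF_COLUMNS) - 1)
        if len(fields) != len(LSOF_COLUMNS) or fields[0] == "COMMAND":
            continue  # skip the header, blank lines, and short rows
        rows.append(dict(zip(LSOF_COLUMNS, fields)))
    return rows


sample = """\
COMMAND   PID   USER  FD  TYPE DEVICE SIZE/OFF   NODE NAME
ksmtuned 1025   root txt   REG  253,0  1150704 428303 /usr/bin/bash
bash     2968 centos txt   REG  253,0  1150704 428303 /usr/bin/bash
bash     3075 centos txt   REG  253,0  1150704 428303 /usr/bin/bash
"""

# Collect the distinct PIDs that have the file open.
pids = sorted({int(row["PID"]) for row in parse_lsof(sample)})
print(pids)  # [1025, 2968, 3075]
```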

How to Show Files Opened by a Specific User

We can also filter the output to show the files opened by a specific user. We do this by using the -u flag followed by the username as:

This will give you an output as shown below:

How to Show Files Opened by a Specific Process

Suppose we want to view all the files opened by a specific process? For this, we can use the PID of the process to filter the output.

For example, the command below shows the files opened by a single process.

This will give you only the files opened by that process, as shown:

How to Show Files Opened in a Directory

To get the files opened in a specific directory, we can pass the +D option followed by the directory path.

For example, to list the open files in the /etc directory:

lsof +D /etc

Below is the output for this:

How to Show Network Connections

Because network sockets are also represented as files, lsof can list network files such as TCP connections.


This will give you the TCP connections in the system.

You can also filter by the specific port using the command shown below:

This will give you the output as shown below:

How to Continuously Show Files

Lsof provides us with a mode to loop the output every few seconds. This allows you to monitor the files opened by a process or user continuously.

This option, however, requires you to terminate the process manually.

For example, the command below continuously monitors the files opened on port 22:

As you can see, in the third loop, lsof catches the established connection to the server on SSH.

Conclusion

Lsof is an incredibly useful utility. It allows you to monitor critical files as well as the users and processes opening them. This can be incredibly useful when troubleshooting or looking for malicious activity on the system.

As shown in this tutorial, using various examples and methods, you can combine the functionality provided by the lsof tool for custom monitoring.

Thank you for reading and sharing! I hope you learned something new!

About the author

John Otieno



How to determine, whether a file is open?

My code needs to go through the files in a directory, picking only those that are currently open (for writing) in any other process on the system. The ideal solution would apply to all Unixes, but I’ll settle for Linux-only. The program is written in Python, but I can add a custom C function if I have to; I just need to know what API is available for this. One suggestion I found was to go through all file descriptors under Linux’s /proc, resolving their links to see whether they point at the file of interest. But that seems rather heavy. I know, for example, that opening a file increases its reference count: the filesystem will not deallocate the blocks of an open file, even if it is deleted, until it is closed (the feature relied upon by tmpfile(3)). Perhaps a user process can get access to these records in the kernel?

Yeah, lsof and fuser both scan /proc. But that yields more information than I need: I don’t care which processes have the file open, I just want to know whether any such process exists. Perhaps this information can be obtained more cheaply than by rescanning /proc?

The advantage of scanning /proc is that it is a virtual file system generated by the kernel on demand, not a physical file system stored on disk. That gives /proc a huge performance advantage over opening and reading an on-disk directory, even just to find the names.

The advantage of scanning /proc is that it is the only way to get the information without modifying the kernel.
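A /proc scan that stops at the first match, and optionally checks for write access, might look like the sketch below. It is a Linux-only illustration under a few assumptions: the function names are invented for the example, the write check reads the octal flags line from /proc/<pid>/fdinfo/<fd>, and processes whose descriptors you lack permission to inspect are silently skipped, so without root the answer covers only processes you own.

```python
import os


def _has_write_access(pid, fd):
    """Check the open flags in /proc/<pid>/fdinfo/<fd> for write access."""
    with open("/proc/%s/fdinfo/%s" % (pid, fd)) as info:
        for line in info:
            if line.startswith("flags:"):
                flags = int(line.split()[1], 8)  # the value is printed in octal
                return (flags & os.O_ACCMODE) != os.O_RDONLY
    return False


def is_open_anywhere(path, writers_only=False):
    """Return True if any visible process holds path open (Linux only)."""
    target = os.path.realpath(path)
    for pid in os.listdir("/proc"):
        if not pid.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        fd_dir = "/proc/%s/fd" % pid
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # the process exited, or we lack permission to look
        for fd in fds:
            try:
                if os.readlink("%s/%s" % (fd_dir, fd)) != target:
                    continue
                if not writers_only or _has_write_access(pid, fd):
                    return True  # stop at the first hit
            except OSError:
                continue  # the descriptor was closed while we were scanning
    return False
```

To filter a whole directory, it is cheaper to walk /proc once and collect every link target into a set than to call this function once per file.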

