Linux create lock file

Contents

Linux lockfile explained, how to use them the easy or hard way
How lockfiles work
Empty file
Lockfile with PID
Lockfiles the “hard” way
Lockfiles the easy way
4 thoughts on “ Linux lockfile explained, how to use them the easy or hard way ”
SYNOPSIS
DESCRIPTION
lockfile_create
lockfile_touch
lockfile_check
lockfile_remove
RETURN VALUES
ALGORITHM
REMOTE FILE SYSTEMS AND THE KERNEL ATTRIBUTE CACHE
PERMISSIONS
FILES
AUTHOR
SEE ALSO

Linux lockfile explained, how to use them the easy or hard way

You may have experienced this before: you create a cron job to change some data every X hours or minutes, and one day the job takes longer than usual and cron spawns another instance before the first one has finished.
This can result in data corruption or in the deletion of data that should not have been deleted, depending on what the cron job is set up to do.
To prevent bad things from happening, a good rule of thumb is to always use a lockfile.
A lockfile is a small file that takes up virtually no space (the actual size depends on your filesystem). Depending on how the lockfile is managed, it contains a PID, a timestamp, or nothing at all.

How lockfiles work

There are multiple ways to write a lockfile. I’ll explain the basics here, in two different ways.

Empty file

The first and simplest way is to make your script or program check whether a file exists at the very beginning. Let’s say the filename is /var/lock/myscript.lock.
If the lockfile exists, the script just exits, since the existence of the lockfile suggests that the script is already running. If the lockfile does not exist, the script creates it and continues with what it has to do.
When the script is done doing its job, the lockfile has to be deleted before the script exits.
That’s basically it: the lockfile is just a file indicating that the script is already running. However, this method with just an empty file has one property that is both a big problem and an advantage.
If the script fails and exits before it gets to delete the lockfile, the script will never run again until you go in and delete the lockfile manually. The same happens if your server reboots or crashes while the script is running.
That is not necessarily bad and can be useful in some cases. Sometimes your script makes changes that cannot simply be restarted if it was interrupted before finishing. In that case this type of lockfile is a must, because the script will not start again on its own until you delete the lockfile manually.

Lockfile with PID

Let’s say your script is named myscript.sh and the lockfile is located at /var/lock/myscript.lock.
If the lockfile exists, your script reads it to see if it has any content. If it finds data in the file, the script assumes it is a PID (process ID: every process gets an ID, which is just a number starting from 1 and incremented by 1 for every process spawned) and checks whether a process with that ID is running.
If no process with the PID from the lockfile is found, or the lockfile does not exist at all, the script creates the lockfile with its own PID as the content. Nothing else, just the PID.
This way you do not have to delete the lockfile when done. In case of a script or system crash the lockfile will still be there, but that does not matter: the process with the PID from the file is no longer running, so the next run will not find it during the check at the beginning, and will therefore write its new PID into the lockfile and continue with its job.
I have used this method in multiple scripts. It has some downsides, for example a different process may be spawned with the same ID (PIDs are reused once the maximum is reached), but I have never run into this problem.


Lockfiles the “hard” way

I call it the hard way because it requires you to add some code to your script. It’s not really hard, but it’s not as easy as the easy solution further down in this post, and it helps people who are new to lockfiles understand how they work.
Adding the following code at the top of a bash script will:

Create the lockfile if it does not already exist
Read the data from the lockfile
Check if a process with the PID matching the data from the lockfile is running
If no process with that PID is running, write the current PID to the lockfile
However, if a process with the PID from the lockfile is running, just exit the script

Here is the code for a bash script with comments:

# Variable to hold the location of the lockfile
lf=/var/lock/myscript.lock

# Create an empty lockfile if none exists
touch "$lf"

# Read the content of the lockfile into a variable
read -r lastPID < "$lf"

# If lastPID is not empty and a process with that PID exists, exit the script
[ -n "$lastPID" ] && [ -d "/proc/$lastPID" ] && exit

# Write the PID of the currently running script to the lockfile
echo $$ > "$lf"

# Your code goes here and does its job from this point on.
# No further lockfile-related code is needed.

And here is the code to use just an empty lockfile:

# Variable to hold the location of the lockfile
lf=/var/lock/myscript.lock

# Check if the lockfile exists, exit if it does
[ -f "$lf" ] && exit

# Create the lockfile
touch "$lf"

# Your script has to do its job here

# At the very end of the script, or before it exits, delete the lockfile
rm "$lf"
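
Both snippets above check for the lockfile and create it in two separate steps, so two copies of the script started at the same moment could both pass the check. One common way to close that gap, shown here only as a sketch and not as part of the original scripts, is the shell's noclobber option, which makes the redirection fail if the file already exists. The function name acquire_lock is made up for this example.

```shell
# acquire_lock LOCKFILE (hypothetical helper, not from the original post)
# Creates LOCKFILE only if it does not exist yet. With noclobber set,
# the ">" redirection refuses to overwrite an existing file, so the
# existence check and the creation happen in one step.
acquire_lock() {
    ( set -o noclobber; : > "$1" ) 2>/dev/null
}

# Possible use at the top of a script:
#   lf=/var/lock/myscript.lock
#   acquire_lock "$lf" || exit 1
#   # ... the script does its job here ...
#   rm "$lf"
```

As with the empty-file variant above, the lockfile stays behind if the script dies before the final rm, which is exactly the behaviour described earlier.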

Lockfiles the easy way

Above I showed you how a lockfile works, and the “hard” way to manage them.
Now let’s look at the easy way. This method, however, requires you to install a tiny program that is in the official repositories.
The program is called “flock”.
To install flock run the following command:
Debian:

Once installed, it’s really easy to use with the following syntax:

/usr/bin/flock -n /path/to/lockfile.lock /path/to/myscript.sh

The -n option makes flock exit immediately if the script is already running. Without -n, flock waits until the first process is done.
That’s it, flock handles the rest for you. Just run the script with flock in front of it every time, and you will be safe from the script accidentally running multiple times in parallel.
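
flock can also be used from inside a script by locking a file descriptor instead of wrapping the whole command line. The sketch below assumes the flock utility is installed; the function name run_locked, the choice of descriptor 9, and the exit code 75 are arbitrary picks for this example.

```shell
# run_locked LOCKFILE COMMAND [ARGS...]   (hypothetical helper)
# Runs COMMAND while holding an exclusive lock on LOCKFILE.
# Returns 75 (a code chosen for this sketch) if the lock is already held.
run_locked() {
    local lockfile
    lockfile=$1
    shift
    (
        # Descriptor 9 is opened on the lockfile by the redirection below.
        # -n makes flock give up at once instead of waiting for the lock.
        flock -n 9 || exit 75
        "$@"
    ) 9>"$lockfile"
}

# Possible use:
#   run_locked /path/to/lockfile.lock /path/to/myscript.sh
```

The lock is tied to the open descriptor, so it goes away when the subshell ends, even if the command inside fails.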

4 thoughts on “ Linux lockfile explained, how to use them the easy or hard way ”

Nicolaas Hyatt, June 13, 2016 at 6:33 am:
Process ID numbers start at 1 and increase, but they have an upper limit depending on the system and configuration. You need to take this into account and add a test to see if the process that has the ID is the same as the script that created it.
That way if another process is using the same PID your script won’t prematurely terminate.
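
The test suggested in this comment could look roughly like the following sketch. It assumes a Linux /proc filesystem, where /proc/PID/comm holds the command name of a process (truncated by the kernel to 15 characters). The function name lock_held_by and its parameters are invented for this example.

```shell
# lock_held_by LOCKFILE NAME   (hypothetical helper)
# Succeeds only if LOCKFILE contains the PID of a running process whose
# command name equals NAME. Any other case is treated as "not held".
lock_held_by() {
    local pid comm
    pid=$(head -n 1 "$1" 2>/dev/null)
    # Reject empty or non-numeric content.
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ -r "/proc/$pid/comm" ] || return 1
    read -r comm < "/proc/$pid/comm"
    [ "$comm" = "$2" ]
}

# Possible use in place of the plain /proc check:
#   lock_held_by /var/lock/myscript.lock myscript.sh && exit
```

Note that the command name depends on how the script was started: run directly through its shebang line it is usually the script's file name, while "bash myscript.sh" shows up as bash, so the expected name has to match your setup.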


SYNOPSIS

int lockfile_create( const char *lockfile, int retrycnt, int flags [, struct lockargs args ] );
int lockfile_remove( const char *lockfile );
int lockfile_touch( const char *lockfile );
int lockfile_check( const char *lockfile, int flags );

DESCRIPTION

Functions to handle lockfiles in an NFS safe way.

lockfile_create

The lockfile_create function creates a lockfile in an NFS safe way.

If flags is set to L_PID or L_PPID then lockfile_create will not only check for an existing lockfile, but it will read the contents as well to see if it contains a process id in ASCII. If so, the lockfile is only valid if that process still exists. Otherwise, a lockfile older than 5 minutes is considered to be stale.

When creating a lockfile, if L_PID is set in flags, then the current process’ PID will be written to the lockfile. Sometimes it can be useful to use the parent’s PID instead (for example, the dotlockfile command uses that). In such cases you can use the L_PPID flag instead.

lockfile_touch

If the lockfile is on a shared filesystem, it might have been created by a process on a remote host. So the L_PID or L_PPID method of deciding if a lockfile is still valid or stale is incorrect and must not be used. If you are holding a lock longer than 5 minutes, a call to lockfile_create by another process will consider the lock stale and remove it. To prevent this, call lockfile_touch to refresh the lockfile at a regular interval (every minute or so).
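
The age-based rule can be imitated in a shell script. The sketch below is not how the library is implemented; it only illustrates the idea of "stale after 5 minutes unless refreshed", assuming GNU stat and a date command that supports +%s. The function names lock_is_stale and refresh_lock are made up for this example.

```shell
# lock_is_stale LOCKFILE [MAX_AGE_SECONDS]   (hypothetical helper)
# Succeeds if LOCKFILE was last modified more than MAX_AGE_SECONDS ago
# (default 300, i.e. the 5 minutes mentioned above).
lock_is_stale() {
    local now mtime max
    max=${2:-300}
    now=$(date +%s) || return 1
    # %Y prints the modification time in seconds since the epoch (GNU stat).
    mtime=$(stat -c %Y "$1" 2>/dev/null) || return 1
    [ $((now - mtime)) -gt "$max" ]
}

# refresh_lock LOCKFILE   (hypothetical helper)
# Updates the modification time so the lock does not look stale.
refresh_lock() {
    touch "$1"
}
```

A long-running job would call refresh_lock about once a minute, mirroring the advice given for lockfile_touch.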

lockfile_check

This function checks if a valid lockfile is already present without trying to create a new lockfile.

lockfile_remove

This function removes the lockfile.

RETURN VALUES

lockfile_create returns one of the following status codes:

 
#define L_SUCCESS   0   /* Lockfile created */
#define L_TMPLOCK   2   /* Error creating tmp lockfile */
#define L_TMPWRITE  3   /* Can't write pid into tmp lockfile */
#define L_MAXTRYS   4   /* Failed after max. number of attempts */
#define L_ERROR     5   /* Unknown error; check errno */
#define L_ORPHANED  7   /* Called with L_PPID but parent is gone */
#define L_RMSTALE   8   /* Failed to remove stale lockfile */

lockfile_check returns 0 if a valid lockfile is present. If no lockfile or no valid lockfile is present, -1 is returned.

lockfile_touch and lockfile_remove return 0 on success. On failure -1 is returned and errno is set appropriately. It is not an error to lockfile_remove() a non-existing lockfile.

ALGORITHM

The algorithm that is used to create a lockfile in an atomic way, even over NFS, is as follows:

1. A unique file is created. In printf format, the name of the file is .lk%05d%x%s. The first argument (%05d) is the current process id. The second argument (%x) consists of the 4 minor bits of the value returned by time(2). The last argument is the system hostname.
2. Then the lockfile is created using link(2). The return value of link is ignored.
3. Now the lockfile is stat()ed. If the stat fails, we go to step 6.
4. The stat value of the lockfile is compared with that of the temporary file. If they are the same, we have the lock. The temporary file is deleted and a value of 0 (success) is returned to the caller.
5. A check is made to see if the existing lockfile is a valid one. If it isn’t valid, the stale lockfile is deleted.
6. Before retrying, we sleep for n seconds. n is initially 5 seconds, but after every retry 5 extra seconds is added up to a maximum of 60 seconds (an incremental backoff). Then we go to step 2 up to retries times.
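
The core of steps 1 to 4 can be tried out from a shell. The following is a simplified sketch of a single attempt, not the library's code: it uses a shorter temporary name than the .lk%05d%x%s pattern, and it leaves out the stale-lock check and the retry loop with backoff (steps 5 and 6). The function name try_link_lock is invented for this example.

```shell
# try_link_lock LOCKFILE   (hypothetical helper)
# One attempt at the link-based scheme. Returns 0 if we now own
# LOCKFILE, 1 otherwise.
try_link_lock() {
    local lockfile tmp rc
    lockfile=$1
    rc=1
    # Step 1: a file whose name is unique to this process and host
    # (simplified naming, assumed for this sketch).
    tmp="$(dirname "$lockfile")/.lk$$.$(uname -n)"
    : > "$tmp" || return 1
    # Step 2: hard-link it to the lock name; the result is ignored.
    ln "$tmp" "$lockfile" 2>/dev/null
    # Steps 3-4: we hold the lock only if both names are the same file.
    if [ "$tmp" -ef "$lockfile" ]; then
        rc=0
    fi
    rm -f "$tmp"
    return "$rc"
}
```

Checking the result by comparing the two names, instead of trusting the exit status of ln, mirrors the reason the algorithm ignores the return value of link.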

REMOTE FILE SYSTEMS AND THE KERNEL ATTRIBUTE CACHE

These functions do not lock a file; they generate a lockfile. However, in a lot of cases, such as Unix mailboxes, all concerned programs accessing the mailboxes agree on the fact that the presence of <filename>.lock means that <filename> is locked.

If you are using lockfile_create to create a lock on a file that resides on a remote server, and you already have that file open, you need to flush the NFS attribute cache after locking. This is needed to prevent the following scenario:

o open /var/mail/USERNAME
o attributes, such as size, inode, etc are now cached in the kernel!
o meanwhile, another remote system appends data to /var/mail/USERNAME
o grab lock using lockfile_create()
o seek to end of file
o write data

Now the end of the file really isn’t the end of the file — the kernel cached the attributes on open, and st_size is not the end of the file anymore. So after locking the file, you need to tell the kernel to flush the NFS file attribute cache.

The only portable way to do this is the POSIX fcntl() file locking primitives — locking a file using fcntl() has the fortunate side-effect of invalidating the NFS file attribute cache of the kernel.

lockfile_create() cannot do this for you for two reasons. One, it just creates a lockfile; it doesn’t know which file you are actually trying to lock! Two, even if it could deduce the file you’re locking from the filename, by just opening and closing it, it would invalidate any existing POSIX locks the program might already have on that file (yes, POSIX locking semantics are insane!).

So basically what you need to do is something like this:

 
fd = open("/var/mail/USER", O_RDWR);  /* lockf() needs a writable descriptor */
/* ... program code ... */
lockfile_create("/var/mail/USER.lock", x, y);
/* Invalidate NFS attribute cache using POSIX locks */
if (lockf(fd, F_TLOCK, 0) == 0)
    lockf(fd, F_ULOCK, 0);

You have to be careful with this if you’re putting this in an existing program that might already be using fcntl(), flock() or lockf() locking: you might invalidate existing locks.

There is also a non-portable way. A lot of NFS operations return the updated attributes — and the Linux kernel actually uses these to update the attribute cache. One of these operations is chmod(2).

So stat()ing a file and then chmod()ing it to st.st_mode will not actually change the file, nor will it interfere with any locks on the file, but it will invalidate the attribute cache. The same can be done from a shell script.
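
One way to get a mode-preserving chmod from a shell is the symbolic mode u=u, which copies the owner's permission bits onto themselves. This is a sketch under the assumption that your chmod implementation still issues the underlying chmod call when the mode does not change; the function name refresh_attr_cache is made up for this example.

```shell
# refresh_attr_cache FILE   (hypothetical helper)
# Runs a chmod that leaves the file mode exactly as it was.
refresh_attr_cache() {
    # "u=u" sets the owner's permissions to the owner's current
    # permissions, so no bit of the mode is altered.
    chmod u=u "$1"
}

# Possible use after taking the lock:
#   refresh_attr_cache /var/mail/USER
```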

PERMISSIONS

If you are on a system that has a mail spool directory that is only writable by a special group (usually "mail") you cannot create a lockfile directly in the mailspool directory without special permissions.

lockfile_create and lockfile_remove check if the lockfile ends in $USERNAME.lock, and if the directory the lockfile is in is writable by group "mail". If so, an external set group-id mail executable (dotlockfile(1)) is spawned to do the actual locking / unlocking.