- How can I remove the first line of a text file using bash/sed script?
- 19 Answers
- Remove the First Line of a Text File in Linux
- Method 1: Using the head Command
- Example
- Method 2: Using the sed Command
- Example
- Method 3: Using the tail Command
- Example
- Method 4: Using the awk command
- Example
- Performance
- Conclusion
- Delete First line of a file
- 11 Answers
How can I remove the first line of a text file using bash/sed script?
I need to repeatedly remove the first line from a huge text file using a bash script. Right now I am using sed -i -e "1d" $FILE, but it takes around a minute to do the deletion. Is there a more efficient way to accomplish this?
tail is MUCH SLOWER than sed. tail needs 13.5s, sed needs 0.85s. My file has ~1M lines, ~100MB. MacBook Air 2013 with SSD.
19 Answers
-n x : Just print the last x lines. tail -n 5 would give you the last 5 lines of the input. The + sign inverts the argument and makes tail print everything but the first x-1 lines. tail -n +1 would print the whole file, tail -n +2 everything but the first line, etc.
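A minimal sketch of the three forms described above, run against a throwaway three-line file. The file name sample.txt and the variable names are illustrative only.

```shell
# Build a small sample file in a scratch directory (names are illustrative).
tmpdir=$(mktemp -d)
printf 'line 1\nline 2\nline 3\n' > "$tmpdir/sample.txt"

# -n 2  : the last two lines
last_two=$(tail -n 2 "$tmpdir/sample.txt")
# -n +1 : start at line 1, i.e. the whole file
whole=$(tail -n +1 "$tmpdir/sample.txt")
# -n +2 : start at line 2, i.e. everything but the first line
rest=$(tail -n +2 "$tmpdir/sample.txt")

printf '%s\n' "$rest"
```

Here last_two and rest happen to hold the same text, because the file has only three lines.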
GNU tail is much faster than sed . tail is also available on BSD and the -n +2 flag is consistent across both tools. Check the FreeBSD or OS X man pages for more.
The BSD version can be much slower than sed , though. I wonder how they managed that; tail should just read a file line by line while sed does pretty complex operations involving interpreting a script, applying regular expressions and the like.
Note: You may be tempted to use
# THIS WILL GIVE YOU AN EMPTY FILE!
tail -n +2 "$FILE" > "$FILE"
but this will give you an empty file. The reason is that the redirection ( > ) happens before tail is invoked by the shell:
- Shell truncates file $FILE
- Shell creates a new process for tail
- Shell redirects stdout of the tail process to $FILE
- tail reads from the now empty $FILE
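The sequence above can be reproduced safely on a scratch file. This is only a demonstration of the pitfall, and the file name demo.txt is made up for the example.

```shell
tmpdir=$(mktemp -d)
FILE="$tmpdir/demo.txt"
printf 'line 1\nline 2\nline 3\n' > "$FILE"

# The shell opens "$FILE" for writing, truncating it, before tail starts,
# so tail finds nothing left to read.
tail -n +2 "$FILE" > "$FILE"

# Count the bytes that survived; tr strips the padding some wc versions add.
size_after=$(wc -c < "$FILE" | tr -d ' ')
echo "bytes left: $size_after"
```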
If you want to remove the first line inside the file, you should use:
tail -n +2 "$FILE" > "$FILE.tmp" && mv "$FILE.tmp" "$FILE"
The && will make sure that the file doesn’t get overwritten when there is a problem.
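If this runs repeatedly from a script, a fixed "$FILE.tmp" name can collide with a leftover from an earlier failed run. One possible variation, assuming mktemp is available, is sketched below. The function name remove_first_line is hypothetical, not a standard tool.

```shell
# remove_first_line FILE : hypothetical helper wrapping the tail + mv idiom.
remove_first_line() {
    file=$1
    # Create the temp file next to the target so mv stays on one filesystem.
    tmp=$(mktemp "${file}.XXXXXX") || return 1
    if tail -n +2 "$file" > "$tmp"; then
        mv "$tmp" "$file"
    else
        # tail failed: leave the original untouched and clean up.
        rm -f "$tmp"
        return 1
    fi
}

# Usage on a scratch file.
workdir=$(mktemp -d)
printf 'header\nrow 1\nrow 2\n' > "$workdir/data.txt"
remove_first_line "$workdir/data.txt"
```

Note that mv replaces the file with a new one, so the result carries the permissions mktemp gave the temp file rather than those of the original.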
According to this ss64.com/bash/tail.html the typical buffer defaults to 32k when using BSD ‘tail’ with the -r option. Maybe there’s a buffer setting somewhere in the system? Or -n is a 32-bit signed number?
@Eddie: user869097 said it doesn’t work when a single line is 15Mb or more. As long as the lines are shorter, tail will work for any file size.
@Dreampuf: sed has an internal buffer for the current line while tail can get away by just remembering the offset of the N last newline characters (note that I didn’t actually look at the sources).
I was going to concur with @JonaChristopherSahnwaldt — tail is much, much slower than the sed variant, by an order of magnitude. I’m testing it on a file of 500,000K lines (no more than 50 chars per line). However, I then realized I was using the FreeBSD version of tail (which comes with OS X by default). When I switched to GNU tail, the tail call was 10 times faster than the sed call (and the GNU sed call, too). AaronDigulla is correct here, if you’re using GNU.
You can use sed's -i option to update the file without using the '>' operator. The following command will delete the first line from the file and save it to the file (uses a temp file behind the scenes):
sed -i '1d' filename
Just to remember, Mac requires a suffix to be provided when using sed with in-place edits. So run the above with -i.bak
This version is really much more readable, and more universal, than tail -n +2 . Not sure why it isn’t the top answer.
Works on Ubuntu (GNU) but for OS X (BSD) I had to change it to sed -i '' '1d' filename . Per stackoverflow.com/questions/16745988/…
For those who are on SunOS which is non-GNU, the following code will help:
@ValerioBozz It’s kinda weird revisiting this comment after almost a decade lol. I don’t even remember it. But I was just pointing out that this answer is for SunOS which was last released in 1998. Very few if any use it
You can easily do this with:
cat filename | sed 1d > filename_without_first_line
on the command line; or to remove the first line of a file permanently, use the in-place mode of sed with the -i flag (sed -i 1d filename).
The -i option technically takes an argument specifying the file suffix to use when making a backup of the file (e.g. sed -i.bak 1d filename creates a copy called filename.bak of the original file with the first line intact). While GNU sed lets you specify -i without an argument to skip the backup, BSD sed, as found on macOS, requires an empty string argument as a separate shell word (e.g. sed -i '' ... ).
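For a script that has to run on both GNU and BSD sed, one workable compromise is to always pass a suffix attached directly to -i and delete the backup afterwards. A small sketch on a scratch file (data.txt is an illustrative name):

```shell
tmpdir=$(mktemp -d)
printf 'header\nrow 1\nrow 2\n' > "$tmpdir/data.txt"

# Suffix attached directly to -i: accepted by both GNU and BSD sed.
sed -i.bak '1d' "$tmpdir/data.txt"

# The untouched original is kept as data.txt.bak.
backup_first=$(head -n 1 "$tmpdir/data.txt.bak")

# Delete the backup once the result looks right.
rm -f "$tmpdir/data.txt.bak"
```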
The sponge util avoids the need for juggling a temp file:
tail -n +2 "$FILE" | sponge "$FILE"
sponge is indeed much cleaner and more robust than the accepted solution ( tail -n +2 "$FILE" > "$FILE.tmp" && mv "$FILE.tmp" "$FILE" )
This is the only solution that worked for me to change a system file (on a Debian docker image). Other solutions failed due to "Device or resource busy" error when attempting to write the file.
@OrangeDog, So long as the file system can store it, sponge will soak it up, since it uses a /tmp file as an intermediate step, which is then used to replace the original afterward.
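sponge comes from the moreutils package and is not installed everywhere. A rough sketch of a wrapper that uses it when present and otherwise falls back to a temp file is shown below. The function name drop_first_line is made up, and the fallback is an assumption about what is acceptable, not a reimplementation of sponge.

```shell
# drop_first_line FILE : hypothetical wrapper, prefers sponge when available.
drop_first_line() {
    file=$1
    if command -v sponge >/dev/null 2>&1; then
        tail -n +2 "$file" | sponge "$file"
    else
        tmp=$(mktemp "${file}.XXXXXX") || return 1
        # Copy the result back into the existing file instead of renaming
        # over it, so the original file itself is rewritten in place.
        tail -n +2 "$file" > "$tmp" && cat "$tmp" > "$file"
        status=$?
        rm -f "$tmp"
        return "$status"
    fi
}
```

The fallback is not atomic: a reader that opens the file while cat is writing may see partial content.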
No, that’s about as efficient as you’re going to get. You could write a C program which could do the job a little faster (less startup time and processing arguments) but it will probably tend towards the same speed as sed as files get large (and I assume they’re large if it’s taking a minute).
But your question suffers from the same problem as so many others in that it pre-supposes the solution. If you were to tell us in detail what you're trying to do rather than how, we may be able to suggest a better option.
For example, if this is a file A that some other program B processes, one solution would be to not strip off the first line, but modify program B to process it differently.
Let’s say all your programs append to this file A and program B currently reads and processes the first line before deleting it.
You could re-engineer program B so that it didn’t try to delete the first line but maintains a persistent (probably file-based) offset into the file A so that, next time it runs, it could seek to that offset, process the line there, and update the offset.
Then, at a quiet time (midnight?), it could do special processing of file A to delete all lines currently processed and set the offset back to 0.
It will certainly be faster for a program to open and seek a file rather than open and rewrite. This discussion assumes you have control over program B, of course. I don’t know if that’s the case but there may be other possible solutions if you provide further information.
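To make the offset idea concrete, here is a rough shell sketch. Everything in it is an assumption for illustration: the function names, the separate offset file, and the rule that an empty line means "nothing left to process". A real program B would more likely do this in its own language with a proper seek.

```shell
# process_next_line QUEUE OFFSET_FILE : handle one line without rewriting QUEUE.
process_next_line() {
    queue=$1
    offset_file=$2
    offset=0
    [ -f "$offset_file" ] && offset=$(cat "$offset_file")

    # tail -c +N starts at byte N (1-based), skipping bytes already consumed.
    line=$(tail -c +"$((offset + 1))" "$queue" | head -n 1)
    [ -n "$line" ] || return 1   # nothing left (this sketch also stops at an empty line)

    printf 'processing: %s\n' "$line"

    # Advance past the line and its newline, counting bytes rather than characters.
    len=$(printf '%s\n' "$line" | wc -c)
    echo $((offset + len)) > "$offset_file"
}

# compact_queue QUEUE OFFSET_FILE : the quiet-time cleanup, one rewrite per batch.
compact_queue() {
    queue=$1
    offset_file=$2
    offset=0
    [ -f "$offset_file" ] && offset=$(cat "$offset_file")
    tmp=$(mktemp "${queue}.XXXXXX") || return 1
    tail -c +"$((offset + 1))" "$queue" > "$tmp" \
        && mv "$tmp" "$queue" \
        && echo 0 > "$offset_file"
}
```

Each call to process_next_line writes only the small offset file. The large file is rewritten only when compact_queue runs.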
Remove the First Line of a Text File in Linux
There are several ways to remove the first line of a text file in Linux. In this article, we will go over four commands that come up for this task: head, sed, tail, and awk.
Method 1: Using the head Command
The head command is a Linux utility that displays the first few lines of a text file. Its -n option specifies how many lines to display. With GNU head, a negative count such as -n -1 prints everything except the last line, so head removes lines from the end of a file, not from the beginning. It cannot drop the first line on its own and is shown here only for contrast with the methods that follow.
Example
To remove the last line of a text file called file.txt, we can use the following command −
$ head -n -1 file.txt > newfile.txt
This command will create a new file called newfile.txt that contains all the lines of file.txt except the last line.
Method 2: Using the sed Command
The sed command is a Linux utility that is used to perform text transformations on an input file. It can also be used to remove the first line of a text file. The sed command can be used to delete a specified line in a text file by using the d command.
Example
To remove the first line of a text file called file.txt, we can use the following command −
$ sed '1d' file.txt > newfile.txt
This command will create a new file called newfile.txt that contains all the lines of file.txt except the first line.
Method 3: Using the tail Command
The tail command is a Linux utility that is used to display the last few lines of a text file. It can also be used to remove the first line of a text file by giving the -n option a count with a leading + sign. A count of +K tells tail to start its output at line K of the file. By specifying -n +2, we print everything from the second line onward, which removes the first line.
Example
To remove the first line of a text file called file.txt, we can use the following command −
$ tail -n +2 file.txt > newfile.txt
This command will create a new file called newfile.txt that contains all the lines of file.txt except the first line.
Method 4: Using the awk command
Another method to remove the first line of a text file in Linux is by using the awk command. The awk command is a powerful text processing tool that can be used for a variety of tasks, including removing specific lines from a text file.
Example
To remove the first line of a text file called file.txt, we can use the following command −
$ awk 'NR>1' file.txt > newfile.txt
This command uses the NR variable, which holds the number of records (lines) read so far, to specify that all lines except for the first one should be printed. The output is then redirected to a new file called newfile.txt, which contains all the lines of file.txt except the first line.
Alternatively, we can also use the following command −
$ awk 'FNR>1' file.txt > newfile.txt
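This command uses the FNR variable, which holds the record number within the current input file. For a single file it works the same way as the previous example, but FNR restarts at 1 for each file, so with several input files the first line of every file is skipped.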
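The difference only shows up with more than one input file. A small sketch with two made-up CSV files that each start with a header line:

```shell
tmpdir=$(mktemp -d)
printf 'id,name\n1,ann\n' > "$tmpdir/a.csv"
printf 'id,name\n2,bob\n' > "$tmpdir/b.csv"

# NR counts across all inputs: only the very first line overall is dropped,
# so the header of b.csv survives.
nr_out=$(awk 'NR>1' "$tmpdir/a.csv" "$tmpdir/b.csv")

# FNR restarts at 1 for each file: the header of every file is dropped.
fnr_out=$(awk 'FNR>1' "$tmpdir/a.csv" "$tmpdir/b.csv")

printf 'NR>1:\n%s\n\nFNR>1:\n%s\n' "$nr_out" "$fnr_out"
```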
Performance
When it comes to performance, the head, sed, tail, and awk commands all have their own strengths and weaknesses.
The head and tail commands are often faster than sed and awk when working with large files because they only count lines and copy bytes, without interpreting a script or splitting input into fields. This varies between implementations, however. They are also limited in their functionality: head can only drop lines from the end of a file and tail can only drop lines from the beginning.
On the other hand, the sed and awk commands are more powerful and versatile. They can be used to perform various text transformations, but they may be slower than the head and tail commands when working with large files.
In general, the choice of command will depend on the specific requirements of the task at hand. If the task is simply to remove the first line of a file, then the tail command will likely be the most efficient option. However, if more advanced text transformations are needed, then the sed or awk commands may be more appropriate.
It is important to note that the performance of these commands also depends on the size of the file and the performance of the computer that is being used.
Conclusion
In conclusion, there are several ways to remove the first line of a text file in Linux. The tail, sed, and awk commands can all accomplish this task, while head is suited to trimming lines from the end of a file. Each method has its own syntax and options, so it's important to choose the one that best fits your needs.
Delete First line of a file
How can I delete the first line of a file and keep the changes? I tried this but it erases the whole content of the file.
11 Answers
An alternative very lightweight option is just to ‘tail’ everything but the first line (this can be an easy way to remove file headers generally):
# -n +2 : start at line 2 of the file.
tail -n +2 file.txt > file.stdout
Following @Evan Teitelman, you can:
tail -n +2 file.txt | sponge file.txt
To avoid a temporary file. Another option might be:
echo "$(tail -n +2 file.txt)" > file.txt
And so forth. Testing last one:
[user@work ~]$ cat file.txt
line 1
line 2
line 3
line 4
line 5
[user@work ~]$ echo "$(tail -n +2 file.txt)" > file.txt
[user@work ~]$ cat file.txt
line 2
line 3
line 4
line 5
[user@work ~]$
Oops we lost a newline (per @1_CR comment below), try instead:
printf "%s\n\n" "$(tail -n +2 file.txt)" > file.txt
[user@work ~]$ cat file.txt
line 1
line 2
line 3
line 4
line 5
[user@work ~]$ printf '%s\n\n' "$(tail -n +2 file.txt)" > file.txt
[user@work ~]$ cat file.txt
line 2
line 3
line 4
line 5
[user@work ~]$
printf '%s\n\n' "$(sed '1d' file.txt)" > file.txt
echo -e "$(sed '1d' file.txt)\n" > file.txt
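The newline behaviour comes from command substitution: $(...) strips every trailing newline from the captured text, and echo or printf then adds back a fixed number. A small sketch on a scratch file whose last line is blank (the variable names are illustrative):

```shell
tmpdir=$(mktemp -d)
f="$tmpdir/file.txt"

# A file whose last line is blank: "line 1", "line 2", then an empty line.
printf 'line 1\nline 2\n\n' > "$f"
echo "$(tail -n +2 "$f")" > "$f"
echo_bytes=$(wc -c < "$f")     # "line 2\n" is 7 bytes: the blank line is gone

printf 'line 1\nline 2\n\n' > "$f"
printf '%s\n\n' "$(tail -n +2 "$f")" > "$f"
printf_bytes=$(wc -c < "$f")   # "line 2\n\n" is 8 bytes: the blank line is back
```

The '%s\n\n' form always ends the file with a blank line, whether or not the original had one. For a file that ends with a single newline, '%s\n' reproduces it exactly. Both forms hold the whole file in memory, which may matter for very large files.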
I’ve just tried it on my Fedora system and the output is above. You are correct — thanks for pointing that out.
The reason file.txt is empty after that command is the order in which the shell does things. The first thing that happens with that line is the redirection. The file "file.txt" is opened and truncated to 0 bytes. After that the sed command runs, but at that point the file is already empty.
There are a few options, most involve writing to a temporary file.
sed '1d' file.txt > tmpfile; mv tmpfile file.txt   # POSIX
sed -i '1d' file.txt                               # GNU sed only, creates a temporary file
perl -pi -e '$_ = undef if $. == 1' file.txt       # also creates a temporary file; -p must be its own flag, since -ip is parsed as -i with backup suffix "p"