- How to remove the lines which appear on file B from another file A?
- 5 Ways to Empty or Delete a Large File Content in Linux
- 1. Empty File Content by Redirecting to Null
- 2. Empty File Using ‘true’ Command Redirection
- 3. Empty File Using cat/cp/dd utilities with /dev/null
- 4. Empty File Using echo Command
- 5. Empty File Using truncate Command
- How to Delete all Text in a File Using Vi/Vim Editor
- How can I delete all lines in a file using vi?
How to remove the lines which appear on file B from another file A?
I have a large file A (consisting of emails), one line for each mail. I also have another file B that contains another set of mails. Which command would I use to remove all the addresses that appear in file B from file A? So, if file A contained:
Now I know this is a question that might have been asked more often, but I only found one command online that gave me an error with a bad delimiter. Any help would be much appreciated! Somebody will surely come up with a clever one-liner, but I’m not the shell expert.
Most of the answers here are for sorted files, and the most obvious one is missing, which of course isn’t your fault, but that makes the other one more generally useful.
12 Answers
If the files are sorted (they are in your example):
-23 suppresses the lines that are in both files, or only in file 2. If the files are not sorted, pipe them through sort first.
comm -23 file1 file2 > file3 will output contents in file1 not in file2, to file3. And then mv file3 file1 would finally clear redundant contents in file1.
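As a rough sketch of the comm approach for files that are not already sorted, the following sorts both inputs first and then replaces the original. The file names A and B and the sample addresses are made up for illustration, and note that the result comes out in sorted order rather than the original order.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data: A is the full list, B holds addresses to remove.
printf '%s\n' carol@example.com alice@example.com bob@example.com > A
printf '%s\n' bob@example.com > B

# comm needs both inputs sorted the same way, so sort them first.
LC_ALL=C sort A > A.sorted
LC_ALL=C sort B > B.sorted

# -2 drops lines unique to B, -3 drops lines common to both,
# leaving only the lines unique to A.
LC_ALL=C comm -23 A.sorted B.sorted > A.new

# Replace the original only after the new file has been written.
mv A.new A
cat A
```

Writing to a separate file and renaming it afterwards avoids reading and truncating A in the same command.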
@TheArchetypalPaul I figured it out. It was line-endings. It’s always line-endings in Linux 🙂 I edited and sorted both files on my Windows desktop, but for some reason the line-endings were saved differently. Dos2unix helped.
cat <<EOF > A
b
1
a
0
01
b
1
EOF

cat <<EOF > B
0
1
EOF

grep -Fvxf B A
- -F : use literal strings instead of the default BRE
- -x : only consider matches that match the entire line
- -v : print non-matching
- -f file : take patterns from the given file
This method is slower on pre-sorted files than other methods, since it is more general. If speed matters as well, see: Fast way of finding lines in one file that are not in another?
Here’s a quick bash automation for in-line operation:
remove-lines() (
  remove_lines="$1"
  all_lines="$2"
  tmp_file="$(mktemp)"
  grep -Fvxf "$remove_lines" "$all_lines" > "$tmp_file"
  mv "$tmp_file" "$all_lines"
)
remove-lines lines-to-remove remove-from-this-file
This solution doesn’t require sorted inputs. You have to provide fileB first.
awk 'NR==FNR{a[$0];next} !($0 in a)' fileB fileA
How does it work?
NR==FNR idiom is for storing the first file in an associative array as keys for a later «contains» test.
NR==FNR is checking whether we’re scanning the first file, where the global line counter (NR) equals the current file line counter (FNR).
a[$0] adds the current line to the associative array as key, note that this behaves like a set, where there won’t be any duplicate values (keys)
!($0 in a) we’re now in the next file(s), in is a contains test, here it’s checking whether current line is in the set we populated in the first step from the first file, ! negates the condition. What is missing here is the action, which by default is {print} and is usually not written explicitly.
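To make the two phases concrete, here is a small sketch with the default action written out explicitly. The sample data and the array name seen are only for illustration.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data.
printf '%s\n' b 1 a 0 01 b 1 > fileA
printf '%s\n' 0 1 > fileB

# While reading fileB (NR==FNR), record each line as an array key and skip on.
# While reading fileA, print only the lines that were never recorded.
awk 'NR==FNR { seen[$0]; next } !($0 in seen) { print }' fileB fileA > result
cat result
```

One caveat with this idiom: if fileB is empty, NR==FNR stays true while fileA is read, so nothing is printed at all.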
Note that this can now be used to remove blacklisted words.
$ awk '...' badwords allwords > goodwords
with a slight change it can clean multiple lists and create cleaned versions.
$ awk 'NR==FNR{a[$0];next} !($0 in a){print > (FILENAME".clean")}' bad file1 file2 file3 ...
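A minimal sketch of the multi-file variant, assuming a blacklist file named bad and two hypothetical word lists. Each input file gets its own output named after it with a .clean suffix.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data: one blacklist and two lists to clean.
printf '%s\n' spam junk > bad
printf '%s\n' hello spam world > file1
printf '%s\n' junk news > file2

# The first file fills the set; every later file is filtered into
# its own "<name>.clean" output. The parentheses keep the output
# file name expression unambiguous across awk implementations.
awk 'NR==FNR { bad[$0]; next } !($0 in bad) { print > (FILENAME ".clean") }' bad file1 file2

cat file1.clean file2.clean
```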
5 Ways to Empty or Delete a Large File Content in Linux
Occasionally, while dealing with files in the Linux terminal, you may want to clear the content of a file without opening it in a command-line editor. How can this be achieved? In this article, we will go through several different ways of emptying file content with the help of some useful commands.
Caution: Before we proceed to looking at the various ways, note that because in Linux everything is a file, you must always make sure that the file(s) you are emptying are not important user or system files. Clearing the content of a critical system or configuration file could lead to a fatal application/system error or failure.
With that said, below are means of clearing file content from the command line.
Important: For the purpose of this article, we’ve used file access.log in the following examples.
1. Empty File Content by Redirecting to Null
The easiest way to empty or blank a file's content is to use shell redirection to send null (a non-existent object) to the file, as below:
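A minimal sketch of this redirection, run against a throwaway copy of access.log in a scratch directory (the sample content is made up). It assumes a shell such as bash or sh; some interactive shells, zsh for example, may wait for input on a bare redirection.

```shell
# Work in a scratch directory so no real log file is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample content.
printf 'line one\nline two\n' > access.log
wc -c < access.log    # size before

# A redirection with no command: the shell opens the file for
# writing, truncates it, and writes nothing.
> access.log

wc -c < access.log    # size after
```

The file itself is kept, along with its permissions and owner; only its content is gone.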
2. Empty File Using ‘true’ Command Redirection
The : symbol is a shell built-in command that is essentially equivalent to the true command and can be used as a no-op (no operation).
Another method is therefore to redirect the output of the : or true built-in command to the file, like so:
# : > access.log
OR
# true > access.log
3. Empty File Using cat/cp/dd utilities with /dev/null
In Linux, the null device is used for discarding the unwanted output streams of a process, or as a convenient empty file for input streams. This is normally done through the redirection mechanism.
The /dev/null device file is therefore a special file that discards any input sent to it, and reading from it yields the same result as reading an empty file.
You can therefore empty the contents of a file by running cat on /dev/null and redirecting its (empty) output to the file:
Next, we will use the cp command to blank a file's content as shown.
Finally, the dd command can do the same. In the following command, if means the input file and of refers to the output file.
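The three /dev/null variants can be sketched side by side as follows. The file names a.log, b.log and c.log are placeholders used only so that each command gets its own file.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample files, one per method.
for f in a.log b.log c.log; do
    printf 'some log data\n' > "$f"
done

# cat reads nothing from /dev/null; the redirection truncates a.log.
cat /dev/null > a.log

# cp copies the (empty) content of /dev/null over b.log.
cp /dev/null b.log

# dd reads from if= and writes to of=; its transfer statistics
# go to stderr, which is silenced here.
dd if=/dev/null of=c.log 2>/dev/null

wc -c a.log b.log c.log
```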
4. Empty File Using echo Command
Here, you can use an echo command with an empty string and redirect it to the file as follows:
# echo "" > access.log
OR
# echo > access.log
Note: Keep in mind that an empty string is not the same as null. A string is already an object, even when it is empty, while null simply means the non-existence of an object.
For this reason, when you redirect the output of the echo command above into the file and view the file contents using the cat command, it prints an empty line: the file contains a single newline character and is one byte long.
To send null output to the file, use the -n flag, which tells echo not to output the trailing newline that produces the empty line in the previous command.
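The difference can be checked with a short sketch like the one below. The file names are placeholders, and it assumes an echo that understands -n (as the bash and dash built-ins do); printf '' is included as an alternative that does not depend on echo's option handling.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Plain echo writes a single newline, so the file is not truly empty.
echo > newline.log

# With -n the trailing newline is suppressed and nothing is written.
echo -n > empty.log

# printf with an empty format string also writes nothing.
printf '' > empty2.log

wc -c newline.log empty.log empty2.log
```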
5. Empty File Using truncate Command
The truncate command helps to shrink or extend the size of a file to a defined size.
You can employ it with the -s option that specifies the file size. To empty a file content, use a size of 0 (zero) as in the next command:
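A minimal sketch using truncate from GNU coreutils, on throwaway files in a scratch directory. The second file, padded.log, is a hypothetical name added only to show that the same option can also extend a file.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample content.
printf 'some log data\n' > access.log
printf 'abc\n' > padded.log

# Shrink to zero bytes: the file is kept but its content is gone.
truncate -s 0 access.log

# Extend to 100 bytes: the added space reads back as NUL bytes.
truncate -s 100 padded.log

wc -c access.log padded.log
```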
That’s it for now. In this article we have covered multiple methods of clearing or emptying file content using simple command-line utilities and the shell redirection mechanism.
These are probably not the only practical ways of doing this, so you can also tell us about any other methods not mentioned in this guide via the feedback section below.
How to Delete all Text in a File Using Vi/Vim Editor
Vim is a great tool for editing text or configuration files in Linux. One of the lesser-known Vim tricks is clearing or deleting all text or lines in a file. Although this is not a frequently used operation, it’s good practice to know it.
In this article, we will describe steps on how to delete, remove or clear all text in a file using a Vim editor in different vim modes.
The first option is to remove, clear or delete all the lines in a file in normal mode (note that Vim starts in “normal” mode by default). Immediately after opening a file, type gg to move the cursor to the first line of the file, assuming it is not already there. Then type dG to delete all the lines or text in it.
If Vim is in another mode, for example insert mode, you can return to normal mode by pressing Esc or <C-[> (Ctrl-[).
Alternatively, you can also clear all lines or text in Vi/Vim in command mode by running the following command.
In this article, we have explained how to clear or delete all lines or text in a file using Vi/Vim editor. Remember to share your thoughts with us or ask questions using the comment form below.
How can I delete all lines in a file using vi?
How can I delete all lines in a file using vi? At the moment I do that using something like this to remove all lines in a file:
How can I delete all lines using vi ? Note: Using dd is not a good option. There can be many lines.
echo | test.txt is not a valid command, unless test.txt is an executable script. I’m guessing you mean echo >test.txt instead?
12 Answers
In vi, type:
:1,$d
The : introduces a command (and moves the cursor to the bottom).
The 1,$ is an indication of which lines the following command ( d ) should work on. In this case the range from line one to the last line (indicated by $ , so you don’t need to know the number of lines in the document).
The final d stands for delete the indicated lines.
There is a shorter form ( :%d ) but I find myself never using it. The :1,$d can be more easily «adapted» to e.g. :4,$-2d leaving only the first 3 and last 2 lines, deleting the rest.
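To try the range form without opening an interactive session, something like the following sketch can be used. It assumes an ex implementation (for example the one shipped with Vim) is on the PATH, and numbers.txt is just a made-up ten-line sample file.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample file with ten numbered lines.
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 > numbers.txt

# Assumption: an ex implementation such as Vim's is installed.
if command -v ex >/dev/null 2>&1; then
    # -s runs ex in silent (batch) mode; commands are read from stdin.
    # 4,$-2d deletes from line 4 up to two lines above the last line,
    # so only the first 3 and the last 2 lines remain.
    ex -s numbers.txt <<'EOF'
4,$-2d
wq
EOF
    cat numbers.txt
fi
```

Swapping the range for % (or 1,$) in the same script would delete every line instead.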
Even with :%d and Enter, it’s way too many keystrokes. I am trying to get into vim, but without vim, Ctrl+A and the Del key are much simpler than what vim provides for this. Just two keystrokes (Ctrl+A, Del), unlike :%d, which takes four including Enter.
What is the problem with dd?
You need to distinguish between dd the vi command (which the OP meant) and dd the utility, which you give an example of. Also, > test.txt may not work as expected in non-bash shells (e.g. zsh).
I’d recommend that you just do this (should work in any POSIX-compliant shell):
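One form that fits this description, shown here only as a sketch and assuming the file is called test.txt as in the question, uses the : null utility so that the redirection always has a command in front of it:

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample content.
printf 'first\nsecond\n' > test.txt

# ":" is the POSIX null utility: it does nothing and writes nothing,
# so the redirection simply truncates test.txt to zero bytes.
: > test.txt

wc -c < test.txt
```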
If you really want to do it with vi, you can do:
G represents last line. If you are on the first line ( gg ), dG tells vi to remove all the lines from current line (first line) to the last line. So, you do it in one shot.
Your shell syntax is dependent on configuration for some shells. Both bash and zsh (zsh by default) interactive shells can wait for input on STDIN after receiving that command and an additional
@Caleb: Like I said, in any POSIX-compliant shell. 🙂 Neither of those two situations is POSIX-compliant.
I’m a lazy dude, and I like to keep it simple. ggdG is five keystrokes including Shift
gg goes to the first line in the file, d is the start of the delete verb and G is the movement to go to the bottom of the file. Verbosely, it’s go to the beginning of the file and delete everything until the end of the file.
If your cursor is on the first line (if not, type: gg or 1G ), then you can just use dG . It will delete all lines from the current line to the end of file. So to make sure that you’ll delete all the lines from the file, you may mix both together, which would be: ggdG (while in command mode).
Or %d in Ex mode, command-line example: vim +%d foo.bar .
+1 for being more ergonomic than :1,$d. Not that our fingers aren’t wired for typing colon all the time now, anyway 😉
I find this ( ggdG ) the easiest method in vim. The reason this answer isn’t upvoted as others is that gg is non-existent in pure vi?
There are of course other similar variations if you’re not already at the top of the file, such as 1GdG or Gd1G . But if you’re using vim , then ggdG is the easiest to type.
Go to the beginning of the file and press d G .
- gg jumps to the start of the current editing file
- V (capitalized v) will select the current line. In this case the first line of the current editing file
- G (capitalized g) will jump to the end of the file. In this case, since I selected the first line, G will select the whole text in this file.
Then you can simply press d or x to delete all the lines.
If you use d vertically, it automatically applies linewise. dl deletes a character to the right, dj deletes a line down, for example.
Note that in your question, echo > test.txt creates a file with a single line break in it, not an empty file.
From the shell, consider using echo -n > test.txt or : > test.txt .
While I’d generally use a vi editing command (I use ggdG ), you can also call out to the shell with a reference to the current file like so:
:!:>%
It’s nearly as concise as ggdG , but harder to type, and you also have to confirm that you want to reload the modified file, so I don’t particularly recommend it in this case, but knowing how to use shell commands from vi like this is useful.
- : initiate a vi command
- ! initiate a shell command
- : this is a shell builtin command with empty output
- > redirect the output
- % vi substitutes this with the name of the current file
The suggested :1,$d is also a good one of course, and just while I’m at it there’s also 1GdG