- Understanding grep and pipes in linux
- 1 Answer 1
- Examples
- How to grep for pipe |
- Pipe output to use as the search specification for grep on Linux
- 9 Answers 9
- How do I grep for multiple patterns with pattern having a pipe character?
- 13 Answers 13
Understanding grep and pipes in linux
Now the output of cat file.txt is the content of the file, which is foo World. Does this become the input of grep? Am I correct? If so, I thought grep required a file path as a string?
In the grep(1) man page (type man grep ), you’ll see “grep [options] PATTERN [FILE...]”. The fact that “FILE” is followed by “...” means that there can be multiple file arguments; the fact that it is in square brackets ( [ … ] ) means that the file argument is optional (i.e., there doesn’t have to be any). Keep on reading and you’ll see “Grep searches the named input FILEs (or standard input if no files are named, or the file name - is given) ….” So: if no files are specified, grep searches the standard input.
1 Answer 1
Most commands can take their input either from a file that they open themselves, or as a stream of data that's passed to them via STDIN.
When the output of cat file.txt is sent to another command through a pipe ( | ), whatever the command on the left side writes to its STDOUT is fed to the STDIN of the command on the right side of the pipe.
If the data is not being passed STDOUT -> STDIN via a pipe, then commands can receive it by opening files whose names are passed as command line arguments.
Examples
Output from cat file is sent via STDOUT to grep's STDIN via the pipe.
Processing the file as a command line argument.
Processing the contents of the file via STDIN directly.
Here I’m demonstrating that the contents of file can be directed to grep via STDIN above.
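The commands that went with these three examples are missing from this copy, so here is a minimal sketch of what they might look like. The scratch file, its contents, and the variable names are assumptions made for illustration only.

```shell
# Scratch file for the demonstration (name and contents are assumptions).
demo_dir=$(mktemp -d)
printf 'foo\nWorld\n' > "$demo_dir/file.txt"

# 1. cat writes the file to STDOUT; the pipe connects that to grep's STDIN.
via_pipe=$(cat "$demo_dir/file.txt" | grep foo)

# 2. grep opens the file itself because its name is given as an argument.
via_arg=$(grep foo "$demo_dir/file.txt")

# 3. The shell opens the file and attaches it to grep's STDIN: no pipe, no cat.
via_redirect=$(grep foo < "$demo_dir/file.txt")

printf '%s\n' "$via_pipe" "$via_arg" "$via_redirect"
```

In all three cases grep sees the same bytes; only the way they reach it differs.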
How to grep for pipe |
If you are using GNU grep you can do this with the or operator ( | ), which should be escaped (preceded by a backslash \ ). So to find lines containing either a pipe or a greater-than sign, include them literally with the or operator:
\| is not a standard BRE operator, though it works with GNU grep, which is the grep found in most operating systems built around a Linux kernel.
Yes, the alternation operator is the only real addition of EREs over BREs, the rest ( + , ? ) being syntactic sugar for \{1,\} and \{0,1\} . (On the other hand, EREs lose back references ( \(.\)\1 ), which are a BRE-only feature.)
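As a rough illustration of the two spellings, the sketch below looks for lines containing either a pipe or a greater-than sign. The sample file is made up, and the first command assumes GNU grep, since \| is an extension in basic regular expressions.

```shell
# Made-up sample data: one line with a pipe, one with >, one with neither.
sample=$(mktemp)
printf 'a | b\nc > d\nplain\n' > "$sample"

# GNU extension: in a BRE a bare | is literal and \| means "or".
bre_hits=$(grep '|\|>' "$sample")

# ERE: a bare | means "or", so the literal pipe is the one that gets the backslash.
ere_hits=$(grep -E '\||>' "$sample")

printf '%s\n' "$ere_hits"
```

The single quotes matter in both commands: without them the shell would treat | as a pipeline and > as a redirection before grep ever ran.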
Using bracket expression to match either of the wanted characters:
The correct way to accomplish it is using -e flag which is specified by POSIX. E.g:
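The commands for these last two suggestions are not preserved here. A sketch of how they could look, on a made-up sample file, assuming the goal is still to match lines containing a pipe or a greater-than sign:

```shell
# Made-up sample data: one line with a pipe, one with >, one with neither.
sample=$(mktemp)
printf 'a | b\nc > d\nplain\n' > "$sample"

# Inside [...] both characters are ordinary, so nothing needs escaping.
bracket_hits=$(grep '[|>]' "$sample")

# One -e per pattern; a line is selected if it matches any of them.
multi_e_hits=$(grep -e '|' -e '>' "$sample")

printf '%s\n' "$bracket_hits"
```

Neither form relies on the \| extension, so both should behave the same on any POSIX grep.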
Site design / logo © 2023 Stack Exchange Inc; user contributions licensed under CC BY-SA . rev 2023.7.12.43529
Pipe output to use as the search specification for grep on Linux
I want the output of the first grep as the search term for the second grep. The above command is treating the output of the first grep as the file name for the second grep. I tried using the -e option for the second grep, but it does not work either.
9 Answers 9
You need to use xargs's -i switch:
grep ... | xargs -ifoo grep foo file_in_which_to_search
This takes the option after -i ( foo in this case) and replaces every occurrence of it in the command with the output of the first grep . Note that GNU xargs documents -i as deprecated; -I foo is the standardized spelling.
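A small sketch of the same idea using -I, with made-up file names and contents. Here the first grep simply drops comment lines, so its output is the list of search terms:

```shell
# Hypothetical files: a list of search terms and a file to search.
work=$(mktemp -d)
printf '# terms\napple\n' > "$work/terms.txt"
printf 'apple pie\ncherry tart\n' > "$work/menu.txt"

# xargs runs the second grep once per input line, replacing {} with that line.
# -e keeps a term that starts with a hyphen from being read as an option.
found=$(grep -v '^#' "$work/terms.txt" |
        xargs -I {} grep -e {} "$work/menu.txt")

printf '%s\n' "$found"
```

Because xargs starts one grep per line of input, this gets slow with many terms, and xargs still applies its own quote and backslash processing to each line.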
grep `grep ...` file_in_which_to_search
grep ... | fgrep -f - file1 file2 ...
@Nathan, it uses the output of the first grep as the input file of patterns for fgrep. "-f -" is the normal way of telling a unix program to use standard input as the input file if it normally takes an input file. Traditionally fgrep was the only grep that took a file as input, although it’s possible that GNU has modified the other greps.
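To make the "-f -" approach concrete, here is a sketch with made-up files. It uses grep -F rather than fgrep, and assumes a grep that accepts - as the pattern file (GNU grep does):

```shell
# Hypothetical files: the wanted terms and the file to search.
work=$(mktemp -d)
printf 'apple\nkiwi\n' > "$work/wanted.txt"
printf 'apple pie\ncherry tart\nkiwi jam\n' > "$work/menu.txt"

# The first grep emits one pattern per line (here: every non-empty line).
# -f - makes the second grep read its patterns from standard input,
# and -F treats each of them as a fixed string rather than a regex.
matches=$(grep . "$work/wanted.txt" | grep -F -f - "$work/menu.txt")

printf '%s\n' "$matches"
```

Unlike the xargs variant, this runs the second grep only once, however many patterns the first one produces.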
If using Bash then you can use backticks:
The -e flag and the double quotes are there to ensure that any output from the initial grep that starts with a hyphen isn’t then interpreted as an option to the second grep .
Note that the double quoting trick (which also ensures that the output from grep is treated as a single parameter) works in Bash and other Bourne-style shells. It doesn’t appear to work with (t)csh.
Note also that backticks are the standard way to get the output from one program into the parameter list of another. Not all programs have a convenient way to read parameters from stdin the way that (f)grep does.
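The command itself is missing above, so here is a sketch of the quoting being described. It uses $(...) in place of backticks (the two are equivalent here), and the files are invented so that the inner grep prints a term beginning with a hyphen:

```shell
# Hypothetical files: query.txt holds the search term "-v".
work=$(mktemp -d)
printf -- '-v\n' > "$work/query.txt"
printf 'use -v for verbose\nquiet mode\n' > "$work/manual.txt"

# The inner grep prints "-v". The double quotes keep it a single argument,
# and -e marks it as a pattern so the outer grep does not read it as an option.
result=$(grep -e "$(grep '^-' "$work/query.txt")" "$work/manual.txt")

printf '%s\n' "$result"
```

Without -e, the outer grep would see -v as its "invert match" option and take the file name as the pattern.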
How do I grep for multiple patterns with pattern having a pipe character?
but the shell interprets the | as a pipe and complains when bar isn’t an executable. How can I grep for multiple patterns in the same set of files?
13 Answers 13
First, you need to protect the pattern from expansion by the shell. The easiest way to do that is to put single quotes around it. Single quotes prevent expansion of anything between them (including backslashes); the only thing you can’t do then is have single quotes in the pattern.
(Also note the -- end-of-options marker, which stops some grep implementations, including GNU grep, from treating a file called -foo-.txt, for instance (which the shell could produce when expanding *.txt ), as an option, even though it follows a non-option argument here.)
If you do need a single quote, you can write it as '\'' (end string literal, literal quote, open string literal).
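A sketch of these quoting points in a scratch directory. The file names and contents are invented, including one deliberately awkward name that starts with a hyphen:

```shell
# Scratch directory with two made-up files.
work=$(mktemp -d)
cd "$work" || exit 1
printf "it's foo*\n" > notes.txt
printf 'foo\n' > ./-foo-.txt

# Single quotes keep the shell from expanding * inside the pattern; -- stops
# a file name such as -foo-.txt (from the *.txt glob) being parsed as options.
star_hits=$(grep -- 'fo*' *.txt)

# A literal single quote: close the string, add \', reopen the string.
quote_hits=$(grep -- 'it'\''s' *.txt)

printf '%s\n' "$quote_hits"
```

The unquoted *.txt is still expanded by the shell, which is what we want for the file list; only the pattern is protected.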
Second, grep supports at least¹ two syntaxes for patterns. The old, default syntax (basic regular expressions) doesn’t support the alternation ( | ) operator, though some versions have it as an extension, but written with a backslash.
The portable way is to use the newer syntax, extended regular expressions. You need to pass the -E option to grep to select it (formerly that was done with the egrep separate command²)
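For instance, with a made-up file, a minimal sketch of the extended-syntax form might be:

```shell
# Hypothetical sample file.
work=$(mktemp -d)
printf 'foo one\nbar two\nbaz three\n' > "$work/a.txt"

# -E selects extended regular expressions, where an unescaped | is alternation.
# The single quotes keep the shell from treating | as a pipe.
either=$(grep -E 'foo|bar' "$work/a.txt")

printf '%s\n' "$either"
```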
Another possibility when you’re just looking for any of several patterns (as opposed to building a complex pattern using disjunction) is to pass multiple patterns to grep . You can do this by preceding each pattern with the -e option.
Or put patterns on several lines:
Or store those patterns in a file, one per line and run
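The commands for these variants are not shown in this copy. A sketch of all three, using invented file names, could look like this:

```shell
# Hypothetical sample file.
work=$(mktemp -d)
printf 'foo one\nbar two\nbaz three\n' > "$work/a.txt"

# One -e per pattern.
with_e=$(grep -e 'foo' -e 'bar' "$work/a.txt")

# A newline inside the quoted pattern separates the alternatives.
with_newline=$(grep 'foo
bar' "$work/a.txt")

# Patterns stored one per line in a file and read with -f.
printf 'foo\nbar\n' > "$work/patterns.txt"
with_file=$(grep -f "$work/patterns.txt" "$work/a.txt")

printf '%s\n' "$with_e"
```

All three select the same lines; none of them needs -E, because no alternation operator appears inside a pattern.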
Note that if *.txt expands to a single file, grep won’t prefix matching lines with its name like it does when there is more than one file. To work around that, with some grep implementations like GNU grep , you can use the -H option, or with any implementation, you can pass /dev/null as an extra argument.
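A quick sketch of the /dev/null workaround, with a made-up single file standing in for the *.txt expansion:

```shell
# Hypothetical directory containing just one file.
work=$(mktemp -d)
printf 'foo one\n' > "$work/only.txt"

# With a single file operand, grep prints just the matching line.
bare=$(grep 'foo' "$work/only.txt")

# Adding /dev/null gives grep two operands, so it prefixes the file name.
# /dev/null is empty and can never contribute a match of its own.
named=$(grep 'foo' "$work/only.txt" /dev/null)

printf '%s\n%s\n' "$bare" "$named"
```

Where it is available, grep -H 'foo' "$work/only.txt" gives the same prefixed output without the extra operand.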
¹ some grep implementations support even more like perl-compatible ones with -P , or augmented ones with -X , -K for ksh wildcards.
² while egrep has been deprecated by POSIX and is sometimes no longer found on some systems, on some other systems like Solaris when the POSIX or GNU utilities have not been installed, then egrep is your only option as its /bin/grep supports none of -e , -f , -E , \| or multi-line patterns
As a sidenote — when the patterns are fixed, you should really get into the habit of fgrep or grep -F , for small patterns the difference will be negligible but as they get longer, the benefits start to show.
@TC1 Whether grep -F has an actual performance benefit depends on the grep implementation: some of them apply the same algorithm anyway, so that -F makes a difference only to the time spent parsing the pattern and not to the time searching. GNU grep isn’t faster with -F , for example (it also has a bug that makes grep -F slower in multibyte locales — the same constant pattern with grep is actually significantly faster!). On the other hand BusyBox grep does benefit a lot from -F on large files.
Perhaps it should be mentioned that for more complicated patterns where alternation is only to be for a part of the regular expression, it can be grouped with "\(" and "\)" (the escaping is for the default "basic regular expressions") (?).
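A sketch of grouping so that alternation applies to only part of the pattern. The word list is invented, and the basic-syntax variant assumes a grep that supports \| (such as GNU grep):

```shell
# Hypothetical word list.
work=$(mktemp -d)
printf 'foobar\nfoobaz\nfooqux\n' > "$work/words.txt"

# BRE: group with \( \); the \| inside the group is a GNU extension.
bre_group=$(grep 'foo\(bar\|baz\)' "$work/words.txt")

# ERE: the same pattern with unescaped ( | ).
ere_group=$(grep -E 'foo(bar|baz)' "$work/words.txt")

printf '%s\n' "$ere_group"
```

Without the group, foo|bar|baz under -E would match any line containing bar or baz on its own, not just those following foo.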
Note that egrep predates grep -E . It is not GNU specific (it certainly has nothing to do with Linux). Actually, you’ll still find systems like Solaris where the default grep still doesn’t support -E .
grep "foo\|bar" *.txt
grep -E "foo|bar" *.txt
selectively citing the man page of gnu-grep:
-E, --extended-regexp
    Interpret PATTERN as an extended regular expression (ERE, see below). (-E is specified by POSIX.)

Matching Control
-e PATTERN, --regexp=PATTERN
    Use PATTERN as the pattern. This can be used to specify multiple search patterns, or to protect a pattern beginning with a hyphen (-). (-e is specified by POSIX.)
grep understands two different versions of regular expression syntax: “basic” and “extended.” In GNU grep, there is no difference in available functionality using either syntax. In other implementations, basic regular expressions are less powerful. The following description applies to extended regular expressions; differences for basic regular expressions are summarized afterwards.
In the beginning I didn’t read further, so I didn’t recognize the subtle differences:
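Basic vs Extended Regular Expressions
In basic regular expressions the meta-characters ?, +, {, |, (, and ) lose their special meaning; instead use the backslashed versions \?, \+, \{, \|, \(, and \).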
I always used egrep and, needlessly, parentheses, because I learned from examples. Now I learned something new. 🙂