Always include first line in grep
I often grep CSV files with column names on the first line. Therefore, I want the output of grep to always include the first line (to get the column names) as well as any lines matching the grep pattern. What is the best way to do this?
sed:
sed '1p;/pattern/!d' input.txt
awk:
awk 'NR==1 || /pattern/' input.txt
grep1:
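The function body did not survive in this copy of the answer. Judging from the comments further down (it relies on awk -v, takes the file name as $2, and reports "filename is empty" through the ${var:?} expansion), it was probably something along these lines. Treat the exact wording as a reconstruction, not the original:

```shell
# Reconstructed sketch, not the original text of the answer.
# $1 = pattern (an awk regular expression), $2 = file name.
# ${N:?message} aborts with "message" when the argument is missing or empty.
grep1() {
    awk -v pattern="${1:?pattern is empty}" \
        'NR == 1 || $0 ~ pattern' "${2:?filename is empty}"
}
```

Usage would be grep1 pattern input.txt .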
Note the sed version shows the first line twice if that line also matches the pattern. Use 1b; instead of 1p; to avoid this.
I like that sed can be used, I’d just never suggest its quirky syntax to a newbie. The awk version is easier to explain, and the sed one is notably harder to get right. The instant errors with :? in the function are pretty cool, though.
To print multiple lines from the beginning: sed '1,3p;/pattern/!d' or awk 'NR<=3 || /pattern/'
How could grep1 be tweaked so that kubectl get pods -n foo | grep1 bar works instead of returning filename is empty ?
How would one modify the grep1 version to take a third argument (defaulting to 1) which is the number of header lines it should include? Use case: Some commands output two or more header lines, e.g. netstat or systemctl status . It would be useful to be able to pipe the output of those commands to such a version of grep1 .
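One possible answer to both of these comments, as a sketch only: let the file name default to "-" (which awk treats as standard input) and pass the header count to awk as a second variable. The argument order and the variable name headers are assumptions made for illustration, not part of the original answer.

```shell
# Hypothetical variant of grep1.
# $1 = pattern, $2 = file ("-" or omitted means standard input),
# $3 = number of header lines to keep (default 1).
grep1() {
    awk -v pattern="${1:?pattern is empty}" -v headers="${3:-1}" \
        'NR <= headers || $0 ~ pattern' "${2:--}"
}
```

With this, kubectl get pods -n foo | grep1 bar reads from the pipe, and some_command | grep1 LISTEN - 2 keeps two header lines.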
You could include an alternate pattern that matches one of the column names. If a column was called COL then this would work:
$ grep -E 'COL|pattern' file.csv
It might also match an unintended line later in the file, if you didn’t have strict control over the contents of the first line.
grep doesn’t really have a concept of line number, but awk does, so here’s an example that outputs lines containing "Incoming", plus the first line, whatever it is:
awk 'NR == 1 || /Incoming/' foo.csv
You could make a script (a bit excessive, but). I made a file, grep+1, and put this in it:
#!/bin/sh
pattern="$1" ; shift
exec awk 'NR == 1 || /'"$pattern"'/' "$@"
edit: removed the "{print}", which is awk’s default action.
Howdy, Alex. I like the script idea, but might modify it slightly to actually use grep instead of awk such that grep’s other command-line arguments could be used: read; printf '%s\n' "$REPLY"; grep "$@" . Main gotcha with this approach is that if the args include filename(s), one would need to parse them out for local handling.
Right, Charles. A good solution would preserve all of grep’s options and support reading both from stdin and from filenames. However, the question sounds more like it is looking for a one-liner, even a broken one like another answer offered, which uses $2 for the filename (so it only works on exactly 1 file) but trades that for error feedback.
Glenn: on the "-v", the asker didn’t include his Unix variant. Do all versions of grep have -v ? 🙂
@AlexNorth-Keys, that’s awk -v , and yes, that functionality is included in the One True Awk and is in POSIX. And of course, grep -v is in POSIX too. 🙂
You can use sed instead of grep to do this:
sed -n -e '1p' -e '/pattern/p' file.csv
This will print the first line twice, however, if it happens to contain the pattern.
-n tells sed not to print each line by default.
-e '1p' prints the first line.
-e '/pattern/p' prints each line that matches the pattern.
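The duplicate header mentioned above can be avoided by deleting line 1 right after printing it, so the pattern test never sees it. This is a sketch that is not part of the original answer; the function name is made up and pattern is a placeholder regex.

```shell
# Line 1: print it (1p), then delete it (1d) so the next expression is skipped.
# Other lines: printed only if they match. With no file arguments, sed reads stdin.
sed_header_grep() {
    sed -n -e '1p' -e '1d' -e '/pattern/p' "$@"
}
```

Keeping each command in its own -e avoids the brace-grouping syntax, which differs slightly between sed implementations.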
$ cat data.csv | (read line; echo "$line"; grep SEARCH_TERM)
$ echo "title\nvalue1\nvalue2\nvalue3" | (read line; echo "$line"; grep value2)
FWIW, 10 years later, in 2022, it seems like a plain echo doesn’t work. (My answer to another question on Super User references this answer.) But using printf or echo -e works nowadays; I wonder if older versions of Bash defaulted to interpreting escaped characters like \n ?
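Whether echo expands \n depends on the shell and its settings, so a version of the example that sidesteps the question is to build the test input with printf. The sketch below uses the same sample data; the function name header_grep is made up for illustration.

```shell
# Pass the first line through untouched, then let grep filter the rest.
# IFS= and -r keep leading whitespace and backslashes in the header intact.
header_grep() {
    IFS= read -r line
    printf '%s\n' "$line"
    grep "$@"
}

# printf interprets \n in its format string in every POSIX shell.
printf 'title\nvalue1\nvalue2\nvalue3\n' | header_grep value2
```

This prints title followed by value2 .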
This is a very general solution, for example if you want to sort a file while keeping the first line in place. Basically, "pass the first line through as-is, then do whatever I want ( awk / grep / sort /whatever) on the rest of the data."
Try this in a script, perhaps calling it keepfirstline (don’t forget chmod +x keepfirstline and to put it in your PATH ):
#!/bin/bash
IFS='' read -r JUST1LIINE
printf "%s\n" "$JUST1LIINE"
exec "$@"
It can be used as follows:
cat your.data.csv | keepfirstline grep SearchTerm > results.with.header.csv
or perhaps, if you want to filter with awk
cat your.data.csv | keepfirstline awk '$1 < 3' >results.with.header.csv
I often like to sort a file while keeping the header in the first line:
cat your.data.csv | keepfirstline sort
keepfirstline executes the command it’s given ( grep SearchTerm ), but only after reading and printing the first line.
grep --color=yes should force grep to be colourful no matter what. You might find that grep --color is sufficient in many cases.
Do you think that read reads just the first line from stdin , not more? I.e. reads it character by character without buffering? I do not think so. I think that the command invoked by exec would get incomplete input remaining on stdin because bash will read more than just the first line.
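This is easy to check empirically. When standard input is a pipe, the shell's read builtin takes input one byte at a time precisely so that it does not consume data meant for the next command. The sketch below (sample data and function name made up) reads one line and then hands the same standard input to cat; if read had buffered ahead, cat would print fewer than two lines.

```shell
check_read() {
    IFS= read -r first              # consumes exactly one line
    printf 'header: %s\n' "$first"
    cat                             # receives everything after that line
}

printf 'h\na\nb\n' | check_read
```

The output is header: h followed by a and b , so nothing after the first line is lost.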
Incomplete. You would also want to combine the output into a single stream (execute inside a group or subshell), and you’d want to ensure that the grep doesn’t accidentally match the first line of the file as well.
yes, of course, but u can just do head -1
So, I posted a completely different short answer above a while back.
However, for those pining for a command that looks like grep in terms of taking all the same options (although this script requires you to use the long options if an optarg is involved), and can cope with weird characters in filenames, etc, etc.. have fun pulling this apart.
Essentially it’s a grep that always emits the first line. If you think a file with no matching lines should skip emitting that first (header) line, well, that’s left as an exercise for the reader. I saved it as grep+1 .
#!/bin/bash
# grep+1 [<option>...] [<regex>] [<file>...]
# Emits the first line of each input and ignores it otherwise.
# For grep options that have optargs, only the --forms will work here.

declare -a files options
regex_seen=false
regex=
double_dash_seen=false

for arg in "$@" ; do
    is_file_or_rx=true
    case "$arg" in
        --) # the first bare "--" ends option parsing; pass it on to grep
            if ! $double_dash_seen ; then
                double_dash_seen=true
                is_file_or_rx=false
            fi ;;
        -*) is_file_or_rx=$double_dash_seen ;;
    esac
    if $is_file_or_rx ; then
        if ! $regex_seen ; then
            regex="$arg"
            regex_seen=true
        else
            files[${#files[@]}]="$arg" # append the value
        fi
    else
        options[${#options[@]}]="$arg" # append the value
    fi
done

# We could either open files all at once in the shell and pass the handles into
# one grep call, but that would limit how many we can process to the fd limit.
# So instead, here's the simpler approach with a series of grep calls

if $regex_seen ; then
    if [ ${#files[@]} -gt 0 ] ; then
        for file in "${files[@]}" ; do
            head -n 1 "$file"
            tail -n +2 "$file" | grep --label="$file" "${options[@]}" "$regex"
        done
    else
        # stdin: pass the first line through, then filter the rest
        IFS= read -r first_line && printf '%s\n' "$first_line"
        grep "${options[@]}" "$regex"
    fi
else
    grep "${options[@]}" # probably --help
fi
#--eof
How to print the first line using grep command?
@steeldriver you need to make it an answer as it is the way to do it with grep, strangely selected tool for the job.
grep is not the best tool for printing the first line of a file. If you simply meant that you wanted to print the first line matched with grep , or if you have some specific use for grep , please let us know what that is. If we had more context, perhaps we could give an answer that would better help you and the community.
@MelBurslan In file.txt I have three lines, for example: This is a new file. Second line: The name of the file is newFile Third line : I have not created the new line. So by using the grep command, how can I print the first line only? Also, which command will help me to print both the first and the last line? Can you guys please tell me the commands for both questions?
If any of the existing answers solves your problem, please consider accepting it via the checkmark. Thank you!
Although it’s an unconventional application of grep, you can do it in GNU grep using
grep -m1 "" file.txt
It works because the empty expression matches anything, while -m1 causes grep to exit after the first match
-m NUM, --max-count=NUM Stop reading a file after NUM matching lines.
This is not something grep does. The name "grep" itself is an acronym for "globally search a regular expression and print", which is what the ed command g/re/p does (for a given regular expression re ).
ed is an interactive line editor from 1969, but it’s most likely installed on your system today nonetheless (it’s a standard POSIX tool). We got grep from ed , and it can be seen as a shortcut or alias for a specific functionality of ed . We also got sed from it, which is "stream- ed ", i.e. a (non-interactive) stream editor.
With sed , you would print the first line using
sed -n 1p file.txt
The 1p string is a tiny sed "script" that prints ( p ) the line corresponding to the given address ( 1 , the first line). The editing command 1p would (no surprise) do the same thing in the ed editor by the way.
The -n suppresses the output of anything not explicitly printed by the script, so all we get is the first line of the file file.txt .
Alternatively,
sed 1q file.txt
This prints all lines of the file, but quits ( q ) at line 1 (after printing it). This is exactly equivalent to head -n 1 file.txt .
In the POSIX standard (in the rationale section for the head command) it says (paraphrasing) that head -n N is much the same as sed 'Nq' , i.e. "print every line, but quit at line N ". The reason head was included in the standard at all was due to symmetry with tail and backwards compatibility with existing Unix implementations.
This is the most illuminating answer, the first answer is a weird hack. I knew there was something weird, thank you for showing me the history of grep and ed. It all makes sense now.
Unless the first line has a unique string you cannot do this using only grep. head -n 1 file.txt would work in its place.
If you want to only print out the first line if it matches a pattern then pipe head into grep
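As a sketch of that last suggestion: head limits the input to the first line and grep then decides whether to print it. The function name is made up for illustration, and the pattern and file name are whatever you pass in.

```shell
# $1 = pattern, $2 = file.
# Prints the first line of the file only if it matches; otherwise prints nothing.
first_line_if_match() {
    head -n 1 "$2" | grep -e "$1"
}
```

For the three-line file.txt from the question, first_line_if_match 'new file' file.txt would print the first line, because only that line is ever shown to grep .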
Yet Another Unconventional Use of Grep — a Schwartzian Transform that goes through several gyrations to number the lines, then uses grep to look for the line number, then strip the line number back off:
function grep1() (
  nlines=$(wc -l < "$1")
  nlw=$(printf "%d" "$nlines" | wc -c)
  nl -d '\n' -ba -n ln -w "$nlw" -s ' ' "$1" | grep '^1 ' | sed 's/^1 *//'
)
function greplast() (
  nlines=$(wc -l < "$1")
  nlines=$((nlines + 0))
  nlw=$(printf "%d" "$nlines" | wc -c)
  nl -d '\n' -ba -n ln -w "$nlw" -s ' ' "$1" | grep "^$nlines " | sed "s/^$nlines *//"
)
I'm putting this Answer here as an example of the idea that just because you can do something in (grep or bash or . etc), doesn't mean that you should -- there's probably a better tool for the job. sed ( sed 1q or sed -n 1p ) and head ( head -n 1 ) are the more appropriate tools for printing the first line from a file. For printing the last line of a file, you have tail -n 1 or sed -n '$p' . Not only are these tools a single command (instead of 3+ in the above functions), they are also much clearer for future readers -- perhaps yourself! -- of the scripts they're in. While I am not one of the (currently 3) down-voters of your question, it's likely that your insistence on an arbitrary tool for the job (without any supporting reasons) is the reason for the downvotes. It's extremely unlikely that a system that has grep does not also have head , tail , and sed .
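For the follow-up question about printing both the first and the last line, the sed addresses named above combine in a single command. A minimal sketch (first_and_last is a made-up helper name):

```shell
# -n suppresses the default output; 1p prints line 1 and $p prints the last line.
# Caveat: for a one-line file, line 1 is also the last line, so it appears twice.
first_and_last() {
    sed -n -e '1p' -e '$p' "$1"
}
```

Called as first_and_last file.txt , this reads the file once and prints only those two lines.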
How can I use grep to search only on the first line of files for a specific string?
It's probably easiest to write a perl 1-liner to do this, otherwise it would have to be a hacky combination of head and grep.
@peterh Rather than asking the question again, it is better to flag the current question for migration there.
With awk
or if your awk version doesn't support nextfile (thanks to Stéphane Chazelas for the suggestion) :
will read only the first line before switching to next file, and print it only if it matches "pattern" .
Advantages are that one can fine-tune both the field on which to search the pattern for (using e.g. $2 to search on the second field only) and the output (e.g. $3 to print the third field, or FILENAME , or even mix).
Note that with the FNR ("current input record number", i.e. line number) version you can fine-tune further the line(s) on which you want to grep : FNR==3 for the third line, FNR
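The two awk commands this answer refers to are missing from this copy. The sketch below shows one way each variant could look, based only on the description above; the function names, the output format, and the literal pattern are assumptions, not the original commands.

```shell
# Variant using nextfile: test the first record, then skip the rest of the file.
first_line_grep_nextfile() {
    awk '/pattern/ { print FILENAME ": " $0 } { nextfile }' "$@"
}

# Variant without nextfile: FNR restarts at 1 for every input file.
# awk still reads the remaining lines of each file; it just ignores them.
first_line_grep() {
    awk 'FNR == 1 && /pattern/ { print FILENAME ": " $0 }' "$@"
}
```

Either would be called as first_line_grep mydir/* . Replacing /pattern/ with $2 ~ /pattern/ restricts the test to the second field, and changing FNR == 1 to FNR == 3 moves the test to the third line.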
With head , keeping filenames
head -n1 -v mydir/files*|grep -B1 pattern
-v option of head will print filenames, and option -B1 of grep will print the previous line of matching lines — that is, the filenames. If you only need the filenames you can pipe it further to grep :
head -n1 -v mydir/*|grep -B1 pattern|grep '==>'
As noticed by don_crissti in comments, beware of filenames matching the pattern themselves, though…