Linux find patterns in files

How can I search for a multiline pattern in a file?

I needed to find all the files that contained a specific string pattern. The first solution that comes to mind is using find piped with xargs grep:

find . -iname '*.py' | xargs grep -e 'YOUR_PATTERN' 

But if I need to find patterns that span more than one line, I’m stuck, because vanilla grep can’t find multiline patterns.

@rogerdpack When marking questions as duplicates, the age of a question is a tertiary concern, after the amount and quality of answers and the quality of the question.

13 Answers

awk '/Start pattern/,/End pattern/' filename 

You can show the line numbers of the matches with awk '/Start pattern/,/End pattern/ {print NR ": " $0}' filename. You can make it prettier by giving the line numbers a fixed width: awk '/Start pattern/,/End pattern/ {printf "%4d: %s\n", NR, $0}' filename.

This seems to work nicely on a single file; however, what if I would like to search within multiple files?
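
For several files, one possible approach is to hand all of them to the same awk range pattern and prefix each line with awk's built-in FILENAME and FNR (the per-file line number). This is only a sketch; a.txt and b.txt are hypothetical sample files created in a scratch directory.

```shell
# Scratch directory with two hypothetical sample files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'x\nStart pattern\nmiddle\nEnd pattern\ny\n' > a.txt
printf 'Start pattern\nEnd pattern\nz\n' > b.txt

# Print every Start..End block, prefixed with file name and per-file line number.
result=$(awk '/Start pattern/,/End pattern/ { print FILENAME ":" FNR ": " $0 }' a.txt b.txt)
printf '%s\n' "$result"
```

The same awk program can be driven by find, for example find . -name '*.py' -exec awk '...' {} +. One caveat: a range that is opened in one file but never closed continues into the next file.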

Here is an example using GNU grep:

grep -Pzo '_name.*\n.*_description' 

-z / --null-data Treat the input as a set of lines, each terminated by a zero byte (the ASCII NUL character) instead of a newline.

This has the effect of treating the whole file as one large line. See the -z description in grep’s manual and also common question no. 14 on the usage page of grep’s manual.

I wasn’t able to use grep for multiline search without the flags -z, so that it doesn’t split the input into single lines, and -o, to print only the matched part.

I found that -o caused it to not print anything, but -l worked to get a list of files (my command was grep -rzl pattern *; -rzo didn’t work).

I recommend 'grep -Pazo' instead of '-Pzo' for non-ASCII files. It’s better because the -z switch on non-ASCII files may trigger grep’s "binary data" behaviour, which changes the return values. The switch '-a | --text' prevents that.
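
To tie the flags from this discussion together, here is a minimal sketch, assuming GNU grep built with PCRE support. The files model.py and plain.py are hypothetical and only exist to give the pattern something to match.

```shell
# Scratch directory with two hypothetical Python files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf "_name = 'x'\n_description = 'y'\n" > model.py   # should match
printf "_name = 'x'\nother = 1\n" > plain.py            # should not match

# -z reads each file as one NUL-terminated record, so \n may appear in the pattern;
# -l lists only the names of the files that match.
matching_files=$(find . -name '*.py' -print0 | xargs -0 grep -Pzl '_name.*\n.*_description')
printf '%s\n' "$matching_files"

# -o prints only the matched text, terminated by NUL; tr turns the NUL into a newline.
# -a forces text mode, as recommended in the comment above.
matched_text=$(grep -Pazo '_name.*\n.*_description' model.py | tr '\0' '\n')
printf '%s\n' "$matched_text"
```

Piping through tr is one way to deal with the trailing NUL that -z adds to each match.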

So I discovered pcregrep which stands for Perl Compatible Regular Expressions GREP.

The -M option makes it possible to search for patterns that span line boundaries.

For example, you need to find files where the '_name' variable is followed on the next line by the '_description' variable:

find . -iname '*.py' | xargs pcregrep -M '_name.*\n.*_description' 

Tip: you need to include the line break character in your pattern. Depending on your platform, it could be '\n', '\r', '\r\n', and so on.

As mentioned by halka below, "you can also persuade the dot wildcard to match newlines if you add (?s) to your regular expression". Then use grep with Perl regex by adding -P: find . -exec grep -nHP '(?s)SELECT.{1,60}FROM.table_name' '{}' \; Note that grep still reads its input line by line, so this only matches across newlines when -z is also given.

pcregrep: line 1 of file /dev/fd/63 is too long for the internal buffer when acting on a simple text file like <(cat file.txt | tr '\0' '\n') .

grep -P also uses libpcre, but is much more widely installed. To find a complete title section of an html document, even if it spans multiple lines, you can use this:
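
A minimal sketch of such a command, not necessarily the one this answer had in mind, assuming GNU grep with PCRE support. The file page.html is a hypothetical sample.

```shell
# Hypothetical HTML file whose title element spans three lines.
workdir=$(mktemp -d)
printf '<html><head><title>\nMy page\n</title></head></html>\n' > "$workdir/page.html"

# (?s) lets "." match newlines and .*? keeps the match non-greedy;
# -z makes grep read the whole file as one record so the pattern can cross lines;
# -o prints only the match, and tr converts the trailing NUL to a newline.
title=$(grep -Pzo '(?s)<title>.*?</title>' "$workdir/page.html" | tr '\0' '\n')
printf '%s\n' "$title"
```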

Since the PCRE project implements the Perl regular expression syntax, use the Perl documentation for reference.

I didn’t know grep had this option. Probably because of this: This is highly experimental and grep -P may warn of unimplemented features.; that’s under CentOS 7. Under Fedora 29: This is experimental and grep -P may warn of unimplemented features. Of course in BSD grep it’s not there at all. Would be nice if it wasn’t so experimental but it’s nice to be reminded of it — little though I’m likely to use it.

Works with grep -Pzo (though it adds a trailing NUL char, see some of the other answers). grep -P is common in "linux" but not BSD.

What’s the best way to find a string/regex match in files recursively? (UNIX)

I have had to do this several times, usually when trying to find in what files a variable or a function is used. I remember using xargs with grep in the past to do this, but I am wondering if there are any easier ways.

8 Answers

Replace . with whatever directory you want to search from.
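
The command this answer refers to is not shown here. Judging from the comments that follow, it was a recursive grep, so here is a minimal sketch assuming a grep that supports -r. The src directory and its files are hypothetical.

```shell
# Hypothetical source tree in a scratch directory.
workdir=$(mktemp -d)
mkdir -p "$workdir/src/lib"
printf 'int counter = 0;\n' > "$workdir/src/main.c"
printf 'counter++;\nreturn;\n' > "$workdir/src/lib/util.c"
cd "$workdir" || exit 1

# Search every file below src for the regex; each hit is printed as file:line-text.
# sort only makes the order of the output predictable.
hits=$(grep -r 'counter' src | sort)
printf '%s\n' "$hits"
```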

Since you ask for "the best" way, I think you should not have accepted this answer, but the one by Chas. This is not the best answer because -r is not portable. The best answer is the one that’s portable, deals with funny characters in file names, and creates only a handful of processes in the worst case. This would be Chas’ find . -type f -print0 | xargs -0 grep pattern. Completely POSIX and as bullet-proof as it gets.

@Jens: Since you like to be pedantic, 1. xargs -0 wasn’t portable for a very long time, and 2. Chas’s, not Chas’ (people’s names are never considered plural).

@ChrisJester-Young owl.english.purdue.edu/owl/resource/621/01 educate yourself my friend, "James’ hat is also acceptable."

The portable method* of doing this is

find . -type f -print0 | xargs -0 grep pattern 

-print0 tells find to use ASCII nuls as the separator and -0 tells xargs the same thing. If you don’t use them you will get errors on files and directories that contain spaces in their names.

* as opposed to grep -r, grep -R, or grep --recursive, which only work on some machines.
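
To see why the NUL separator matters, here is a small sketch with a directory and a file whose names contain spaces. All names are hypothetical and created in a scratch directory.

```shell
# Scratch tree with spaces in both a directory name and a file name.
workdir=$(mktemp -d)
mkdir -p "$workdir/my docs"
printf 'needle here\n' > "$workdir/my docs/notes one.txt"
printf 'nothing\n' > "$workdir/plain.txt"
cd "$workdir" || exit 1

# NUL-separated names reach grep intact; -l prints only the names of matching files.
found=$(find . -type f -print0 | xargs -0 grep -l 'needle')
printf '%s\n' "$found"
```

Without -print0 and -0, xargs would split "./my docs/notes one.txt" at the spaces and grep would be asked to open files that do not exist.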

@FelipeAlvarez Just quoting fixes most of the problem: grep "f.*o". The only thing that should expand in a string is stuff following a $, so escape the $ if necessary, but I can’t think of a valid pattern for grep that would have stuff following a $ (grep works on lines and $ means end of line).

What happens to shell special characters after the shell removes your quotes and passes them to xargs? How will that affect grep after that?

The command executed by xargs doesn’t receive shell expansion. It is a straight-up fork / exec deal. You can see this with echo "foo*" '$foo' | xargs perl -le 'print "[$_]" for @ARGV'.
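
The same point can be checked without Perl. The sketch below uses the external printf utility instead, and the files foo1 and foo2 are hypothetical, created only so that an unquoted foo* would have something to expand to.

```shell
# Scratch directory with files that an unquoted foo* would match.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch foo1 foo2

# The quoted words travel through the pipe as plain text. xargs hands them to
# printf as literal arguments, with no globbing and no variable expansion.
args=$(echo 'foo*' '$HOME' | xargs printf '[%s]\n')
printf '%s\n' "$args"
```

The output shows foo* and $HOME unchanged, one per line, rather than the file names or the home directory.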

This is one of the cases for which I’ve started using ack (http://petdance.com/ack/) in lieu of grep. From the site, you can get instructions to install it as a Perl CPAN component, or you can get a self-contained version that can be installed without dealing with dependencies.

Besides the fact that it defaults to recursive searching, it allows you to use Perl-strength regular expressions, use regex’s to choose files to search, etc. It has an impressive list of options. I recommend visiting the site and checking it out. I’ve found it extremely easy to use, and there are tips for integrating it with vi(m), emacs, and even TextMate if you use that.

If you’re looking for a string match, use

which is faster than using grep. More about the subject here: http://www.mkssoftware.com/docs/man1/grep.1.asp
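
The command itself is not shown above. Assuming the answer refers to fixed-string matching with fgrep, that is grep -F, the sketch below shows how it differs from a regex search. The file names.txt is a hypothetical sample.

```shell
# Hypothetical file: one line with a literal dot, one without.
workdir=$(mktemp -d)
printf 'a.c\nabc\n' > "$workdir/names.txt"

# As a regex, "." matches any character, so both lines match.
regex_count=$(grep -c 'a.c' "$workdir/names.txt")
# With -F the pattern is a literal string, so only the line containing "a.c" matches.
fixed_count=$(grep -cF 'a.c' "$workdir/names.txt")
printf 'regex: %s, fixed: %s\n' "$regex_count" "$fixed_count"
```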

grep -r if you’re using GNU grep, which comes with most Linux distros.

On most UNIXes it’s not installed by default so try this instead:

find . -type f | xargs grep regex

If you use the zsh shell you can use

This can run out of steam if there are too many matching files.

The canonical way though is to use find with exec.

find . -name '*.java' -exec grep REGEX {} \;
find . -type f -exec grep REGEX {} \;

The '-type f' bit restricts the match to regular files, so directories and other non-file entries are skipped.
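
A variation worth knowing, offered as a sketch rather than as part of the original answer: ending -exec with {} + instead of {} \; passes many file names to a single grep invocation. The Java files below are hypothetical samples.

```shell
# Scratch tree with two Java files and one unrelated text file.
workdir=$(mktemp -d)
mkdir -p "$workdir/pkg"
printf 'class Foo {}\n' > "$workdir/pkg/Foo.java"
printf 'class Bar {}\n' > "$workdir/Bar.java"
printf 'class Foo in text\n' > "$workdir/notes.txt"
cd "$workdir" || exit 1

# "{} +" batches file names into one grep call, while "{} \;" starts one grep per file.
# /dev/null is an extra operand so grep prints the file name even for a single file.
matches=$(find . -name '*.java' -exec grep 'Foo' /dev/null {} +)
printf '%s\n' "$matches"
```

With {} \; and a single file operand, grep prints matching lines without the file name, which makes the results hard to trace back.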

How to find lines containing a string in linux [closed]

I have a file in Linux and I would like to display the lines that contain a specific string in that file. How can I do this?

5 Answers

The usual way to do this is with grep , which uses a regex pattern to match lines:

grep 'pattern' file

Each line that matches the pattern will be output. If you want to search for fixed strings only, use grep -F 'pattern' file. fgrep is shorthand for grep -F.

In addition, use grep -rn 'string' /path/ if you want to search a directory recursively; the output shows the file name and line number of each match.

Besides grep, you can also use other utilities such as awk or sed.

Here are a few examples. Let’s say you want to search for the string "is" in the file named GPL.

Your sample file

$ cat -n GPL
     1  The GNU General Public License is a free, copyleft license for
     2  The licenses for most software and other practical works are designed
     3  the GNU General Public License is intended to guarantee your freedom to
     4  GNU General Public License for most of our software;

$ grep is GPL
The GNU General Public License is a free, copyleft license for
the GNU General Public License is intended to guarantee your freedom to

$ awk /is/ GPL
The GNU General Public License is a free, copyleft license for
the GNU General Public License is intended to guarantee your freedom to

$ sed -n '/is/p' GPL
The GNU General Public License is a free, copyleft license for
the GNU General Public License is intended to guarantee your freedom to
