Using regex in find command for multiple file types [duplicate]
to search for a string in all files ending with .c, .C, .h, .H, .cc and .CC listed in all subdirectories. But since this includes two commands this feels inefficient. How can I use a single regex pattern to find .c,.C,.h,.H,.cc and .CC files? I am using bash on a Linux machine.
4 Answers
You can use the boolean OR argument:
find . -name '*.[ch]' -o -name '*.[CH]' -o -name '*.cc' -o -name '*.CC'
The above searches the current directory and all sub-directories for files that end in .c, .h, .C, .H, .cc or .CC.
This doesn't work if you want to add -exec. For example, in find -name '*.log' -o -name '*.txt' -exec cat '{}' \; the -exec applies only to the last -name test, because the implicit AND binds more tightly than -o.
@OutputLogic Using find's grouping operators \( and \) addresses this, e.g. find \( -name "*.log" -o -name "*.txt" \) -exec cat {} \; should mean that -exec works as expected.
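The effect of grouping can be checked on a scratch directory. This is only a sketch: the file names a.log and b.txt and their contents are made up for the demo, and GNU find is assumed.

```shell
# Scratch directory with hypothetical files, used only for this demo.
demo=$(mktemp -d)
printf 'log line\n' > "$demo/a.log"
printf 'txt line\n' > "$demo/b.txt"

# Without grouping, -exec is attached only to the second -name test.
ungrouped=$(find "$demo" -name '*.log' -o -name '*.txt' -exec cat {} \;)

# With \( ... \), -exec runs for files matching either pattern.
grouped=$(find "$demo" \( -name '*.log' -o -name '*.txt' \) -exec cat {} \; \
    | LC_ALL=C sort | paste -sd ' ' -)

printf 'ungrouped: %s\n' "$ungrouped"
printf 'grouped:   %s\n' "$grouped"
rm -rf "$demo"
```

The parentheses are escaped so that the shell passes them to find instead of interpreting them itself.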
find . -iregex '.*\.\(c\|cc\|h\)' -exec grep -nHr "$1" {} +
-iregex for case-insensitive regex pattern.
(c|cc|h) (nasty escapes not shown) matches c, cc, or h extensions
find -regextype "posix-extended" -iregex '.*\.(c|cc|h)' -exec grep -nHr "$1" {} +
This will find .Cc and .cC extensions too. You have been warned.
find -regextype posix-extended -regex '.+\.(h|H|c|C)$'
I wish I could use -iregex, but -iregex would also find .Cc and .cC. If I could, the command would look like this, just a bit shorter.
find -regextype posix-extended -iregex '.+\.(h|H|c)$'
find . -regex '.*\.\([chCH]\|cc\|CC\)'
will find all files with names ending in .c,.C,.h,.H,.cc and .CC and does not find any that end in .hc, .cC, or .Cc. In the regex, the first few characters match through the last period in a name, and the parenthesized alternatives match any of the single characters c, h, C, or H, or either of cc or CC.
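A quick way to check that claim is to run the pattern against a handful of throwaway files. The names below are hypothetical, and the sketch assumes GNU find with its default (emacs-style) regex syntax on a case-sensitive file system.

```shell
# Throwaway files covering wanted and unwanted extensions (names are made up).
demo=$(mktemp -d)
for name in a.c b.C c.h d.H e.cc f.CC g.cC h.Cc i.hc j.txt; do
    : > "$demo/$name"
done

# The regex must match the whole path, hence the leading .*
matches=$(cd "$demo" && find . -regex '.*\.\([chCH]\|cc\|CC\)' \
    | LC_ALL=C sort | paste -sd ' ' -)

printf '%s\n' "$matches"
# prints: ./a.c ./b.C ./c.h ./d.H ./e.cc ./f.CC
rm -rf "$demo"
```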
Note that find's -regex and -iregex switches are analogous to -name and -iname, but they take regular expressions, which allow | for alternative matches, and they match against the whole path rather than just the file name. Like -iname, -iregex is case-insensitive.
The (non-functional) form
find . -name '*.[cCHh][cC]?$'
given in a previous answer doesn't list any names on my Linux system with GNU find 4.4.2, because -name takes a shell glob rather than a regex, so ? and $ do not mean what they would in a regex. Another problem with '*.[cCHh][cC]?$' as a regex is that it would match names like abc.Cc and xyz.hc, which are not in the set of .c, .C, .h, .H, .cc and .CC files that you want.
Using Find Command With Regex
Enable the beast mode of the find command by using regex for your search.
The find command is a powerhouse for searching files based on a number of criteria.
Its searches become far more flexible once you describe the file names with a regular expression (regex).
But before jumping to the examples part, it is crucial to know some basic regex tokens and syntax.
Quick Introduction to Regex Tokens
Tokens are nothing but special characters to search for specified patterns.
So let’s have a look at some of the most basic and widely used tokens which I’ll be using with the find command:
Token | Description |
---|---|
Period (.) | Matches any single character (except a newline). So a.b matches strings such as acb, aeb and abb, but not accb or ab |
Backslash (\) | Removes the special meaning of the character that follows it. A bare . matches any character, but a\.b matches only the literal string a.b |
Asterisk (*) | Known as a repeater symbol: the preceding character may occur 0 or more times. So ca*t matches ct, cat, caat, etc. |
Square brackets ([]) | Matches any single character listed inside the brackets. So a[bc]d matches abd or acd, but not abcd |
Caret (^) | Generally used to anchor the match at the start of the string, but inside square brackets ([^ ]) it negates the set. So a[^bc]d matches aed and azd, but not abd or acd |
Now, let's have a look at the basic syntax of using find with regex:
find [path] -regex [regular_expression]
- [path] is where you want to search files.
- [regular_expression] is where you use tokens to express the file pattern you are looking for.
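One detail worth keeping in mind with this syntax is that GNU find tests the regular expression against the whole path it prints, not just the file name. A minimal sketch, assuming GNU find and a made-up file called notes.txt:

```shell
# Scratch directory with one hypothetical file.
demo=$(mktemp -d)
: > "$demo/notes.txt"

# The path find sees is ./notes.txt, so a bare file name matches nothing.
bare=$(cd "$demo" && find . -regex 'notes\.txt')

# Prefixing .*/ lets the pattern cover the leading directory part.
full=$(cd "$demo" && find . -regex '.*/notes\.txt')

printf 'bare: [%s]\nfull: [%s]\n' "$bare" "$full"
rm -rf "$demo"
```

This is why the patterns in the examples start with either \./ or .* before the file name.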
Now it’s time for me to share some examples of how you can use find with regex.
Practical Examples of find command with regex
I am going to start with the most common scenario where a user only knows the first few characters of a file and wants to know where it is.
Search Files based on Initial Characters in the Current Directory
Currently, my file system looks like this:
And I want to search for files that start with Fo or Fr so my command will be:
Here, -type f restricts the search to regular files, \.\/ anchors the pattern at the current directory, and F[or] matches file names starting with Fo or Fr.
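A command that fits this description might look like the sketch below. The directory layout (Foo.txt, Fred.sh, Bar.txt and a Sub directory) is invented for illustration, and GNU find is assumed.

```shell
# Hypothetical layout: two matching files, one non-matching, one nested.
demo=$(mktemp -d)
mkdir "$demo/Sub"
: > "$demo/Foo.txt"
: > "$demo/Fred.sh"
: > "$demo/Bar.txt"
: > "$demo/Sub/Found.txt"

# \./ pins the match to the start of the path, F[or] picks Fo or Fr.
current=$(cd "$demo" && find ./ -type f -regex '\./F[or].*' \
    | LC_ALL=C sort | paste -sd ' ' -)

printf '%s\n' "$current"
# prints: ./Foo.txt ./Fred.sh
rm -rf "$demo"
```

Sub/Found.txt is not listed because its path starts with ./S rather than ./F.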
But what if you want to run a command or program on each result? This can be done using the find command with -exec:
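As a rough illustration, the sketch below runs cat on every match. The file names and contents are hypothetical, and any other command could take the place of cat.

```shell
# Hypothetical files with known contents.
demo=$(mktemp -d)
printf 'one\n' > "$demo/Foo.txt"
printf 'two\n' > "$demo/Fred.txt"
printf 'three\n' > "$demo/Bar.txt"

# {} is replaced by each matching path; \; marks the end of the command.
contents=$(cd "$demo" && find ./ -type f -regex '\./F[or].*' -exec cat {} \; \
    | LC_ALL=C sort | paste -sd ' ' -)

printf '%s\n' "$contents"
# prints: one two
rm -rf "$demo"
```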
Search Files in Sub Directory
The above example only applied to the current directory and did not show some files that followed the same naming pattern.
So I’ll be using the same naming pattern F[or] to find files in the subdirectory:
Seems too complex, right? Let me break it down for you.
Here, [^/]*\/ matches a single directory name (any run of characters other than a slash) followed by a slash, so a match must sit exactly one level below the current directory, which rules out files in the current directory itself.
At the end, I've replaced the period (.) with [^/] so that no slash can follow the file name, which keeps the search from descending below the first level of subdirectories.
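Putting those pieces together, a command of this shape could look like the following sketch. The layout is invented for the demo and GNU find is assumed.

```shell
# Hypothetical layout with matches at three different depths.
demo=$(mktemp -d)
mkdir -p "$demo/Sub/Deep"
: > "$demo/Foo.txt"
: > "$demo/Sub/Found.txt"
: > "$demo/Sub/Bar.txt"
: > "$demo/Sub/Deep/Frog.txt"

# \./      start at the current directory
# [^/]*/   exactly one directory level
# F[or]    name starts with Fo or Fr
# [^/]*    rest of the name, with no further slash allowed
one_level=$(cd "$demo" && find ./ -type f -regex '\./[^/]*/F[or][^/]*' \
    | LC_ALL=C sort | paste -sd ' ' -)

printf '%s\n' "$one_level"
# prints: ./Sub/Found.txt
rm -rf "$demo"
```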
Search Files through regex patterns in every Subdirectory
Searching a single level of subdirectories looked quite complex, right? Well, this one is going to be the easiest!
Well, two asterisks and that is it! Let me show you how:
And if you are curious about how it worked, it's because the asterisks at the beginning and the end let the pattern match any run of characters, slashes included, so it can match at any depth.
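One way this could be written is sketched below, on an invented layout and assuming GNU find. It is a slightly stricter variant: the trailing part uses [^/]* instead of .* so that only the file name itself, not a directory along the way, has to start with Fo or Fr.

```shell
# Hypothetical layout with matches at three different depths.
demo=$(mktemp -d)
mkdir -p "$demo/Sub/Deep"
: > "$demo/Foo.txt"
: > "$demo/Bar.txt"
: > "$demo/Sub/Found.txt"
: > "$demo/Sub/Deep/Frog.txt"

# .*/ swallows any number of leading directories.
any_depth=$(cd "$demo" && find ./ -type f -regex '.*/F[or][^/]*' \
    | LC_ALL=C sort | paste -sd ' ' -)

printf '%s\n' "$any_depth"
# prints: ./Foo.txt ./Sub/Deep/Frog.txt ./Sub/Found.txt
rm -rf "$demo"
```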
Search Files based on Extension
First, let me share the general syntax of how you are supposed to search files based on their extensions:
find ./ -type f -regex ".*\.[fileextension]"
So let’s suppose I want to find all the text files (having a .txt extension) and that can be done quite easily by the given command:
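Filled in for .txt, the general syntax might look like this sketch. The file names are hypothetical and GNU find is assumed.

```shell
# Hypothetical files: two .txt files, one script, one backup.
demo=$(mktemp -d)
mkdir "$demo/Sub"
: > "$demo/a.txt"
: > "$demo/Sub/b.txt"
: > "$demo/c.sh"
: > "$demo/notes.txt.bak"

# \. is a literal dot; the regex must reach the end of the path,
# so notes.txt.bak is not matched.
txt_files=$(cd "$demo" && find ./ -type f -regex '.*\.txt' \
    | LC_ALL=C sort | paste -sd ' ' -)

printf '%s\n' "$txt_files"
# prints: ./Sub/b.txt ./a.txt
rm -rf "$demo"
```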
Search Files based on Filename and Extension
This is my personal favorite implementation of regex with find as you can search files based on first letters and their extensions making it quite convenient.
First, let’s have a look at the syntax:
find ./ -type f -regex '\.\/[Filename].*\.[extension]'
Let’s make it a bit practical. So I’m in a scenario where I only know the first few letters of the file (started with Fo or Fr) and its extension (.sh):
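For that scenario, the syntax could be filled in roughly as follows. The file names are invented for the demo and GNU find is assumed.

```shell
# Hypothetical files: only two combine the right prefix and extension.
demo=$(mktemp -d)
: > "$demo/Foo.sh"
: > "$demo/Fred.sh"
: > "$demo/Foo.txt"
: > "$demo/Bar.sh"

# \./F[or] fixes the first letters, .* covers the middle, \.sh the extension.
scripts=$(cd "$demo" && find ./ -type f -regex '\./F[or].*\.sh' \
    | LC_ALL=C sort | paste -sd ' ' -)

printf '%s\n' "$scripts"
# prints: ./Foo.sh ./Fred.sh
rm -rf "$demo"
```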
Final Words
From finding files modified in the last n minutes to executing scripts over results with -exec, find is one of the most extensive commands, offering more than 50 options.
This guide explained yet another way to use the find command, taking you one step further in your Linux journey.
Find files using regular expressions in Ubuntu
I have a task where I have to find different files with conditions that I think require regular expressions. For example: find files that begin with three lowercase letters and whose last letter is not an 'i'. I'm searching for the best way to find those files. I could do
ls [a-z][a-z][a-z]*[azertyuopqsdfghjklmwxcvbn]
That is one way, but there is a negation operator for character classes, so you could do [^aeiou0-9[:punct:]] (if you really just want lowercase characters). Actually [ b-d. ] isn't bad, because it lists explicitly what you do want to match, and you don't have to guess about what you might be missing with [:punct:] and some of the other "shortcut" terms. There are other named classes, but I don't want to have to try and find you a reference 😉 (I'm recovering from an operation). Good luck.
@tripleee: Thanks for the reminder. I think in the back of my mind ! was the csh negation, so I try to block those years from my memory 😉. Nice answer! Good luck to all.
2 Answers
$ ls
abci  ASds  dferasfds  dsfa998  ilkj323  retk232i
$ find -regextype egrep -regex '.*/[a-z]{3}.*[^i]$'
./dferasfds
./dsfa998
./ilkj323
- .*/ is needed to match the beginning part of the file path
- [a-z]{3} three lowercase letters
- .* any other characters
- [^i]$ not ending with i
However, this particular case seems possible without regex:
$ find -name '[a-z][a-z][a-z]*[!i]'
./dferasfds
./dsfa998
./ilkj323
These are globs, not regular expressions. You don’t need regular expressions for this.
printf '%s\n' [a-z][a-z][a-z]*[!i] [a-z][a-z][a-hj-z]
The second pattern covers file names which are three characters long; it is unclear from your requirements whether these should be included. (If no matching file exists, the shell will emit a warning message, but that’s harmless.) If not, just use the first pattern.
(I’m using printf mainly to illustrate that the shell does the actual work here, and ls is not necessary to expand a glob pattern.)
If you really do require regular expressions, find -regex 'pattern' is your friend. By default, find will traverse subdirectories; you can avoid this with -maxdepth 1.
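The difference -maxdepth 1 makes can be sketched as follows, assuming GNU find (for -regextype and -maxdepth) and a few made-up file names. The pattern is one possible reading of "three lowercase letters, not ending in i".

```shell
# Hypothetical files: one match at the top level, one in a subdirectory.
demo=$(mktemp -d)
mkdir "$demo/sub"
: > "$demo/abcd"
: > "$demo/abci"
: > "$demo/sub/wxyz"

pattern='.*/[a-z]{3}.*[^i]'

# Default behaviour: descend into subdirectories.
all=$(cd "$demo" && find . -type f -regextype posix-extended -regex "$pattern" \
    | LC_ALL=C sort | paste -sd ' ' -)

# -maxdepth 1 keeps the search in the starting directory.
top=$(cd "$demo" && find . -maxdepth 1 -type f -regextype posix-extended -regex "$pattern" \
    | LC_ALL=C sort | paste -sd ' ' -)

printf 'all: %s\ntop: %s\n' "$all" "$top"
rm -rf "$demo"
```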
Maybe also look at Bash’s extended globbing for an in-between option.
How to use regex with find command?
I have some images named with a generated uuid1 string. For example 81397018-b84a-11e0-9d2a-001b77dc0bed.jpg. I want to find all these images using the "find" command:
9 Answers
find . -regextype sed -regex ".*/[a-f0-9\-]\{36\}\.jpg"
Note that you need to specify .*/ in the beginning because find matches the whole path.
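If a stricter match is wanted, the 8-4-4-4-12 layout of the UUID can be spelled out. This is only a sketch of an alternative, assuming GNU find and the posix-extended regex type; the file names mirror the ones in the question.

```shell
# Files modelled on the question: one UUID name at two depths, one other .jpg.
demo=$(mktemp -d)
mkdir "$demo/test"
: > "$demo/foo-111.jpg"
: > "$demo/81397018-b84a-11e0-9d2a-001b77dc0bed.jpg"
: > "$demo/test/81397018-b84a-11e0-9d2a-001b77dc0bed.jpg"

# 8 hex digits, three groups of 4, then 12, separated by hyphens.
uuid_re='.*/[a-f0-9]{8}(-[a-f0-9]{4}){3}-[a-f0-9]{12}\.jpg'

uuid_jpgs=$(cd "$demo" && find . -regextype posix-extended -regex "$uuid_re" \
    | LC_ALL=C sort | paste -sd ' ' -)

printf '%s\n' "$uuid_jpgs"
rm -rf "$demo"
```

Compared with a plain 36-character class, this rejects names that have the right length but misplaced hyphens.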
susam@nifty:~/so$ find . -name "*.jpg"
./foo-111.jpg
./test/81397018-b84a-11e0-9d2a-001b77dc0bed.jpg
./81397018-b84a-11e0-9d2a-001b77dc0bed.jpg
susam@nifty:~/so$
susam@nifty:~/so$ find . -regextype sed -regex ".*/[a-f0-9\-]\{36\}\.jpg"
./test/81397018-b84a-11e0-9d2a-001b77dc0bed.jpg
./81397018-b84a-11e0-9d2a-001b77dc0bed.jpg
$ find --version
find (GNU findutils) 4.4.2
Copyright (C) 2007 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Eric B. Decker, James Youngman, and Kevin Dalley.
Built using GNU gnulib version e5573b1bad88bfabcda181b9e0125fb0c52b7d3b
Features enabled: D_TYPE O_NOFOLLOW(enabled) LEAF_OPTIMISATION FTS() CBO(level=0)
susam@nifty:~/so$
susam@nifty:~/so$ find . -regextype foo -regex ".*/[a-f0-9\-]\{36\}\.jpg"
find: Unknown regular expression type `foo'; valid types are `findutils-default', `awk', `egrep', `ed', `emacs', `gnu-awk', `grep', `posix-awk', `posix-basic', `posix-egrep', `posix-extended', `posix-minimal-basic', `sed'.