- Linux Find Command With Regular Expressions
- 1. Introduction
- 2. Regular Expressions Primer
- 2.1. Main Regex Tokens and Examples
- 3. Command Description
- 3.1. Using -regex
- 3.2. Using -iregex
- 3.3. Using -regextype
- 4. Comparison With Bash Globbing
- 5. Conclusion
- linux search file based on file name pattern [closed]
- 3 Answers
Linux Find Command With Regular Expressions
1. Introduction
In this tutorial, we’ll talk about the use of the command find with regular expressions (regex). We’ll look at how to specify the regular expression to further refine the results of the search.
2. Regular Expressions Primer
Before showing how to use regular expressions with find, let’s start with what they are and how they are constructed.
A regular expression (regex for short) is a sequence of characters that specifies a search pattern. Combined with find, regexes allow a more precise search with a shorter command.
There are different types and formats of regular expressions. The concepts explained below are consistent across them. However, more advanced features require knowing which regex type is in use, because the types differ. The regex flavors accepted by the find command are covered in section 3.3.
2.1. Main Regex Tokens and Examples
Although regex is sometimes considered daunting, it improves searches and makes working on the command line more productive. Even basic knowledge is enough to benefit from it.
As a quick introduction, these are the most common regex tokens:
- Period (.): matches any single character (except a newline): q.e matches the strings qwe, qre, and qee but not the strings qe or qwwe
- Asterisk (*): matches zero or more occurrences of the preceding character or regular expression: qw*e matches the strings qe, qwe, and qwwe but not the string qre
- Backslash (\): escapes special characters, for example, to search for a literal period: q\.e matches the string q.e but not the strings qre, qee, qe, or qwwe
- Square brackets ([string]): match any single character from the string within the brackets: q[we]r matches the strings qwr and qer but not the strings qr, qwer, or qwewer
- Caret (^): as the first character within square brackets, negates their content (outside brackets, it anchors the match to the beginning of a line): q[^we]r matches the strings qar and qsr but not the strings qwr or qer
Two tokens frequently used together are .*, which, based on the previous discussion, match zero or more occurrences of any character except a newline, that is, any string without a newline.
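The token examples above can be checked quickly from the command line. The following is a minimal sketch that uses grep -x rather than find (our own choice for illustration), because -x accepts a line only when the pattern covers all of it. The helper name matches and the list of candidate strings are assumptions made for this example.

```shell
#!/bin/sh
# Candidate strings borrowed from the examples above, one per line.
candidates='qe
qwe
qwwe
qre
qee
q.e
qwr
qer
qar'

# matches PATTERN: print, on one line, every candidate that PATTERN matches in full.
# grep -x only accepts a line when the pattern covers the whole line.
matches() {
    printf '%s\n' "$candidates" | grep -x -e "$1" | paste -sd' ' -
}

period=$(matches 'q.e')        # any single character between q and e
asterisk=$(matches 'qw*e')     # zero or more w between q and e
backslash=$(matches 'q\.e')    # a literal period between q and e
brackets=$(matches 'q[we]r')   # w or e between q and r
negated=$(matches 'q[^we]r')   # anything but w or e between q and r

# .* matches every candidate, so the count equals the number of lines.
everything=$(printf '%s\n' "$candidates" | grep -c -x -e '.*')

printf '%-9s %s\n' 'q.e' "$period" 'qw*e' "$asterisk" 'q\.e' "$backslash" \
    'q[we]r' "$brackets" 'q[^we]r' "$negated" '.*' "$everything lines"
```

Adding more strings to the list is an easy way to test a pattern before handing it to find.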
3. Command Description
The use of the command find can be split into two components, a path and a search expression:
find [path] [expression]
The path is the directory where the search starts. The expression selects the files and can also include actions to perform on the files that meet the search criteria. Within the expression, find offers three options related to regular expressions. We present them now with some use case examples, all of which use the following mockup directory:
$ tree ./
./
├── a0
├── a0.sh
├── A0.sh
├── a1
├── a1.sh
├── A1.sh
├── a2
├── ca
├── cb
├── cc
└── folder
    ├── a0
    ├── a1
    └── a0folder
        ├── a0
        └── a1

2 directories, 13 files
3.1. Using -regex
The first option is -regex together with the regular expression:
find [path] -regex [regular_expression]
With this command, find searches the path and returns the files whose path matches the regular_expression. The pattern has to match the whole path, as find prints it, including the starting directory, and not just the file name. This means that when searching from the current directory, the regular_expression should start with \.\/ (the backslash escapes the period, which is otherwise a special character; escaping the slash is harmless but not required).
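To see how the starting directory becomes part of the text being matched, we can try a minimal sketch in a throwaway directory. It assumes GNU find; the directory created with mktemp and the variable names are ours, chosen only for this illustration.

```shell
#!/bin/sh
# Work in a scratch directory so nothing else on disk is touched.
dir=$(mktemp -d) && cd "$dir" || exit 1
mkdir folder
touch a0 folder/a0

# Starting from ".", every path that find tests begins with "./".
from_dot=$(find . -type f -regex '\./a0')

# Starting from "folder", the paths begin with "folder/" instead.
from_folder=$(find folder -type f -regex 'folder/a0')

# A pattern that names only the file matches nothing: the whole path must match.
name_only=$(find . -type f -regex 'a0')

printf '%s\n' "from .:      $from_dot" \
              "from folder: $from_folder" \
              "name only:   $name_only"
```

The last search comes back empty, which is the most common surprise when moving from -name to -regex.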
The following command finds the files (with the -type f flag) that are in the current directory (\.\/), that start with the letter a followed by either a 0 or a 1:
$ find ./ -type f -regex '\.\/a[01].*'
./a1
./a0
./a1.sh
./a0.sh
File a2 is not returned because the letter a is not followed by a 0 or a 1. We can also search in the first-level subdirectories instead of the current directory with the command:
$ find ./ -type f -regex '\.\/[^/]*\/a[01][^/]*'
./folder/a1
./folder/a0
Two differences exist between the last two regexes. First, the tokens [^/]*\/ match any string that doesn’t contain a slash ([^/]*) followed by one slash (\/), immediately before the filename that starts with the letter a. Second, we replaced the trailing .* with [^/]* so that no more slashes can appear once the filename has begun.
The files in the subdirectories don’t fulfill the regex: between the first slash (current directory) and the slash immediately followed by the letter a there are extra slashes for the subdirectory (for example ./folder/a0folder/a0).
Finally, to include all files in all subdirectories, we can use:
$ find ./ -type f -regex '.*a[01].*'
./folder/a0folder
./folder/a0folder/a0
./folder/a0folder/a1
./folder/a0
./folder/a1
./a0
./a1
./a0.sh
./a1.sh
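The three depth levels discussed in this section can be compared side by side. The sketch below rebuilds the mockup directory in a temporary location and assumes GNU find on a case-sensitive file system. The helper list is hypothetical. The last pattern is a slightly stricter variant of the one above: it requires a0 or a1 at the start of the filename rather than anywhere in the path.

```shell
#!/bin/sh
# Recreate the mockup directory in a scratch location.
dir=$(mktemp -d) && cd "$dir" || exit 1
mkdir -p folder/a0folder
touch a0 a0.sh A0.sh a1 a1.sh A1.sh a2 ca cb cc \
      folder/a0 folder/a1 folder/a0folder/a0 folder/a0folder/a1

# list REGEX: print the matching regular files on one line, in a stable order.
list() {
    find . -type f -regex "$1" | LC_ALL=C sort | paste -sd' ' -
}

top=$(list '\./a[01][^/]*')           # current directory only
first=$(list '\./[^/]*/a[01][^/]*')   # exactly one directory below
any=$(list '.*/a[01][^/]*')           # any depth

printf '%s\n' "top:   $top" "first: $first" "any:   $any"
```

Using [^/]* instead of .* is what keeps each pattern from spilling over into deeper directories.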
3.2. Using -iregex
The second option is -iregex:
find [path] -iregex [regular_expression]
The command performs the same search as with the -regex option but ignores letter case in the pattern. As a mnemonic, the i in -iregex stands for case-insensitive.
If we modify one of the commands from before to find only the files with a dot (by including [.]), the output looks like this:
$ find ./ -type f -regex '\.\/a[01][.].*'
./a0.sh
./a1.sh
The results with the -iregex flag instead of the -regex flag include the files with the capital letter A as well:
$ find ./ -type f -iregex '\.\/a[01][.].*'
./a0.sh
./A1.sh
./A0.sh
./a1.sh
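The difference between the two options can be reproduced with a short script. This is a sketch that assumes GNU find and a case-sensitive file system (otherwise a0.sh and A0.sh would be the same file). The scratch directory and variable names are our own.

```shell
#!/bin/sh
dir=$(mktemp -d) && cd "$dir" || exit 1
touch a0 a0.sh A0.sh a1 a1.sh A1.sh a2

# Same pattern, first case-sensitive, then case-insensitive.
pattern='\./a[01][.].*'
sensitive=$(find . -type f -regex "$pattern" | LC_ALL=C sort | paste -sd' ' -)
insensitive=$(find . -type f -iregex "$pattern" | LC_ALL=C sort | paste -sd' ' -)

printf '%s\n' "-regex:  $sensitive" "-iregex: $insensitive"
```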
3.3. Using -regextype
Finally, the option -regextype selects the type of regular expression:
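find [path] -regextype [regex_type] -regex [regular_expression]
The -regextype option only affects the -regex and -iregex tests that come after it on the command line, so it has to precede them. Different regex types are available for the command find, including emacs (the default), posix-basic, posix-extended, posix-egrep, posix-awk, and sed.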
The tokens defined before are compatible with all these types of regex. However, more advanced search queries may produce different results under different regex types. There is a comprehensive GNU webpage dedicated to detailing the different syntaxes.
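Alternation is one place where the regex type visibly changes the result. The sketch below assumes GNU find (the -regextype option is a GNU extension) and compares the default emacs syntax with posix-extended. The files and variable names are chosen only for illustration.

```shell
#!/bin/sh
dir=$(mktemp -d) && cd "$dir" || exit 1
touch a0 a1 a2 ca cb cc

# posix-extended: ( ) group and | alternates without backslashes.
# -regextype is placed before -regex so that it applies to it.
extended=$(find . -regextype posix-extended -type f \
    -regex '\./(a[0-9]|c[a-c])' | LC_ALL=C sort | paste -sd' ' -)

# Default (emacs) syntax: the same text treats ( | ) as literal characters,
# so no path matches.
emacs_plain=$(find . -type f \
    -regex '\./(a[0-9]|c[a-c])' | LC_ALL=C sort | paste -sd' ' -)

# Default (emacs) syntax needs \( \| \) for grouping and alternation.
emacs_escaped=$(find . -type f \
    -regex '\./\(a[0-9]\|c[a-c]\)' | LC_ALL=C sort | paste -sd' ' -)

printf '%s\n' "posix-extended: $extended" \
              "emacs, plain:   $emacs_plain" \
              "emacs, escaped: $emacs_escaped"
```

When a pattern written for another tool silently returns nothing, checking the regex type is a good first step.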
4. Comparison With Bash Globbing
Anyone who has used Linux for even a short while has come across bash globbing in commands like ls. Let’s consider the following command:
$ ls *.png
It lists all the files with the .png extension. Meanwhile, the command:
$ ls M*.png
lists all the files that have the .png extension and start with the letter M. This is bash globbing in action: filename expansion. The -name test of the find command uses the same pattern syntax.
Even though they look similar, bash globbing and regular expressions use different syntax, which can cause confusion. We discuss two of the most relevant differences. First, a period (.) represents a literal period in bash globbing but any single character in regex. This first command shows the bash globbing approach:
$ find ./ -type f -name 'a*.sh'
./a0.sh
./a1.sh
To obtain the same result, we can use the following regex find command:
$ find ./ -type f -regex '\.\/a.*\.sh'
./a0.sh
./a1.sh
Another difference between bash globbing and regular expressions is the asterisk (*): it represents zero or more of any character in bash globbing, but in regex, it represents zero or more occurrences of the preceding character. Thus, similar-looking commands behave differently depending on whether the pattern is a glob or a regex. When we employ bash globbing, the following command returns all files starting with c:
$ find ./ -type f -name 'c*'
./cb
./cc
./ca
However, a similar-looking regex returns only the files whose names consist solely of the letter c:
$ find ./ -type f -regex '\.\/c*'
./cc
We should keep in mind these differences when searching in a directory to use either bash globbing or regular expressions to our advantage.
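The asterisk difference is easy to verify. The following sketch, assuming GNU find, runs the glob, the look-alike regex, and a regex that we believe reproduces the glob’s behavior. The translated pattern is our own suggestion.

```shell
#!/bin/sh
dir=$(mktemp -d) && cd "$dir" || exit 1
touch a0 ca cb cc

# Glob: * stands for any run of characters, so every name starting with c matches.
glob=$(find . -type f -name 'c*' | LC_ALL=C sort | paste -sd' ' -)

# Regex: * repeats the preceding c, so only names made up of c characters match.
lookalike=$(find . -type f -regex '\./c*' | LC_ALL=C sort | paste -sd' ' -)

# Regex counterpart of the glob: a filename starting with c, at any depth.
translated=$(find . -type f -regex '.*/c[^/]*' | LC_ALL=C sort | paste -sd' ' -)

printf '%s\n' "glob c*:         $glob" \
              "regex ./c*:      $lookalike" \
              "regex .*/c[^/]*: $translated"
```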
5. Conclusion
In this tutorial, we described how to apply some basic regular expressions to further refine the output of the find command and ease the search for files within our directories.
linux search file based on file name pattern [closed]
I want to search for a series of files based on their names. Given the files in my directory, I only want to find the ones whose names do not contain _bak.
Your question is far too vague to give a good answer. Are you trying to search the contents of the files? The filenames themselves? Are you just trying to get filenames that don’t end in _bak ?
Also, you should really post your directory contents in plain text, not an image. See the formatting help.
I’m not certain what the confusion is about. The OP has tagged this as linux unix and he’s asking to pick out the files that don’t have _bak at the end. Am I missing something? (Although the formatting thing is true. You should really use plain text.)
The question seems very clear to me: the pattern starts with «ei469390ONL00». It looks like Windows, not Linux, so dir.
3 Answers
If you’re wanting to limit your search to the current directory, use:
find . -maxdepth 1 -mindepth 1 ! -name '*_bak'
If you want to search recursively in all subdirectories, remove the -maxdepth 1.
Edit in response to OP’s comment
To get files that begin with ei and do not end with _bak use:
find . -type f -name 'ei*' -a ! -name '*_bak'
Note that you must put the pattern in single or double quotes. This won’t work: find . -name *_bak because the shell replaces *_bak with the matching filenames in the current directory before find runs, hence the unknown primary or operator error you typically get.
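Here is a small sketch of both commands and of what the shell does to an unquoted pattern. The filenames are hypothetical, since the real ones from the question are not available, and the variable names are only for illustration.

```shell
#!/bin/sh
dir=$(mktemp -d) && cd "$dir" || exit 1
# Hypothetical filenames standing in for the ones in the question.
touch ei_report ei_report_bak ei_data ei_data_bak notes notes_bak

# Current directory only: everything that does not end in _bak.
top=$(find . -mindepth 1 -maxdepth 1 ! -name '*_bak' | LC_ALL=C sort | paste -sd' ' -)

# Files that begin with ei and do not end in _bak.
kept=$(find . -type f -name 'ei*' ! -name '*_bak' | LC_ALL=C sort | paste -sd' ' -)

# Unquoted, the shell expands the glob before find ever sees it.
# Counting the resulting words shows how many arguments find would receive.
set -- *_bak
expanded=$#

printf '%s\n' "top level: $top" "ei, no _bak: $kept" \
              "unquoted *_bak expands to $expanded arguments"
```

With three arguments in place of one pattern, find cannot parse the expression, which is why the quotes matter.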