Linux find files by regex

How to use regex with find command?

I have some images named with a generated uuid1 string, for example 81397018-b84a-11e0-9d2a-001b77dc0bed.jpg. I want to find all of these images using the find command.

9 Answers

find . -regextype sed -regex ".*/[a-f0-9\-]\{36\}\.jpg"

Note that you need to specify .*/ in the beginning because find matches the whole path.

susam@nifty:~/so$ find . -name "*.jpg"
./foo-111.jpg
./test/81397018-b84a-11e0-9d2a-001b77dc0bed.jpg
./81397018-b84a-11e0-9d2a-001b77dc0bed.jpg
susam@nifty:~/so$
susam@nifty:~/so$ find . -regextype sed -regex ".*/[a-f0-9\-]\{36\}\.jpg"
./test/81397018-b84a-11e0-9d2a-001b77dc0bed.jpg
./81397018-b84a-11e0-9d2a-001b77dc0bed.jpg

$ find --version
find (GNU findutils) 4.4.2
Copyright (C) 2007 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Eric B. Decker, James Youngman, and Kevin Dalley.
Built using GNU gnulib version e5573b1bad88bfabcda181b9e0125fb0c52b7d3b
Features enabled: D_TYPE O_NOFOLLOW(enabled) LEAF_OPTIMISATION FTS() CBO(level=0)
susam@nifty:~/so$
susam@nifty:~/so$ find . -regextype foo -regex ".*/[a-f0-9\-]\{36\}\.jpg"
find: Unknown regular expression type `foo'; valid types are `findutils-default', `awk', `egrep', `ed', `emacs', `gnu-awk', `grep', `posix-awk', `posix-basic', `posix-egrep', `posix-extended', `posix-minimal-basic', `sed'.


Using Find Command With Regex

Enable the beast mode of the find command by using regex for your search.

The find command is a powerhouse for searching files based on a number of criteria.

Pairing it with regular expressions (regex) makes those searches far more precise.

But before jumping to the examples part, it is crucial to know some basic regex tokens and syntax.

Quick Introduction to Regex Tokens

Tokens are special characters that describe the pattern you want to search for.

So let’s have a look at some of the most basic and widely used tokens which I’ll be using with the find command:

Token | Description
Period (.) | Matches any single character (except a newline). So a.b matches strings such as acb, aeb and abb, but not accb or ab.
Backslash (\) | Removes the special meaning of the character that follows it. A bare . matches any character, but a\.b matches only the literal string a.b.
Asterisk (*) | Known as a repeater symbol: the preceding character may occur 0 or more times. So ca*t matches ct, cat, caat, etc.
Square brackets ([]) | Matches any one of the characters listed inside the brackets. So a[bc]d matches abd or acd, but not abcd.
Caret (^) | Generally used to mark the starting point of a match, but inside square brackets ([^ ]) it negates the set. So a[^bc]d matches aed and azd, but not abd or acd.

Now, let’s have a look at the basic syntax of using find with regex:

find [path] -regex [regular_expression]
  • [path] is where you want to search for files.
  • [regular_expression] is where you use tokens to express the file pattern you are looking for. The pattern is matched against the whole path that find prints, not just the file name.
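Because the expression is compared against the whole path, a bare file name does not match on its own. The following is a minimal sketch of that behaviour, assuming GNU find; the temporary directory and the file name notes.txt are made up for illustration.

```shell
# Hypothetical sandbox: the directory and file name are illustrative only.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch notes.txt

# No match: find compares the pattern against "./notes.txt", not "notes.txt".
no_prefix=$(find . -type f -regex 'notes\.txt')

# Match: the leading .*/ covers the "./" part of the path.
with_prefix=$(find . -type f -regex '.*/notes\.txt')

echo "without prefix: '$no_prefix'"
echo "with prefix:    '$with_prefix'"
```

The first search prints nothing, while the second one prints ./notes.txt.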

Now it’s time for me to share some examples of how you can use find with regex.

Practical Examples of find command with regex

I am going to start with the most common scenario where a user only knows the first few characters of a file and wants to know where it is.

Search Files based on Initial Characters in the Current Directory

Currently, my file system looks like this:

[Screenshot: Linux filesystem]

And I want to search for files that start with Fo or Fr so my command will be:

[Screenshot: finding files by filename using the find command with regex]

Here, -type f restricts the search to regular files, \.\/ anchors the match to the current directory, and F[or] matches file names starting with Fo or Fr.
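The exact command appeared only in the screenshot, so what follows is a reconstruction from the description rather than the author’s verbatim command. It assumes GNU find with its default regex type, and every file name in it is hypothetical.

```shell
# Hypothetical sandbox: all names below are made up for illustration.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir sub
touch Fo.txt Fr.sh Fa.txt sub/Fo_nested.txt

# \.\/ pins the match to the start of the path, F[or] accepts Fo or Fr,
# and the trailing .* accepts the rest of the name.
matches=$(find ./ -type f -regex '\.\/F[or].*' | LC_ALL=C sort)
echo "$matches"
```

Only ./Fo.txt and ./Fr.sh are listed: Fa.txt fails the F[or] test, and the file inside sub does not start with ./F.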

But what if you want to run a command or program on each result? This can be done by combining find with the -exec action.
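As a rough sketch of how a regex search and -exec fit together (GNU find assumed; the file names and contents are invented for the example), the command below runs grep -l on every match and keeps only the files that contain the word two:

```shell
# Hypothetical sandbox: file names and contents are illustrative only.
demo=$(mktemp -d)
cd "$demo" || exit 1
printf 'one\ntwo\n' > Fo.txt
printf 'three\n' > Fr.txt

# -exec runs the command once per match; {} is replaced by the matched path
# and \; marks the end of the command.
found=$(find ./ -type f -regex '\.\/F[or].*' -exec grep -l 'two' {} \;)
echo "$found"
```

Both files match the regex, but grep -l prints only ./Fo.txt because it is the only one containing that word.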

Search Files in Sub Directory

The above example only applied to the current directory and did not show some files that followed the same naming pattern.

So I’ll be using the same naming pattern F[or] to find files in the subdirectory:

[Screenshot: searching files using find with regex in subdirectories]

Seems too complex, right? Let me break it down for you.

Here, [^/]*\/ matches a single directory name: a run of characters that contains no forward slashes, followed by one slash. Requiring that directory component eliminates files that sit directly in the current directory.

At the end, I’ve replaced the period (.) with [^/] so the search does not go deeper than the first subdirectory: no slashes may appear after the start of the filename.
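The screenshot with the command is not available, so this is a sketch assembled from the two fragments explained above, not necessarily the author’s exact command. It assumes GNU find, and the directory layout is hypothetical.

```shell
# Hypothetical sandbox: all names below are made up for illustration.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir -p sub/deeper
touch Fo_top.txt sub/Fo_one.txt sub/Fr_two.sh sub/deeper/Fo_three.txt

# \.\/      start of the path
# [^/]*\/   exactly one directory level
# F[or]     file name starts with Fo or Fr
# [^/]*     rest of the name, with no further slashes allowed
matches=$(find ./ -type f -regex '\.\/[^/]*\/F[or][^/]*' | LC_ALL=C sort)
echo "$matches"
```

Only the two files directly inside sub are reported; the top-level file and the one in sub/deeper are skipped.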

Search Files through regex patterns in every Subdirectory

Searching a single subdirectory level seemed quite complex, right? Well, this is going to be the easiest one!

Two asterisks and that is it! Let me show you how:

[Screenshot: searching files using the find command with regex inside every subdirectory]

If you are curious about how it worked: I used .* at both the beginning and the end of the pattern, so any directory prefix and any trailing characters are accepted.
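Since the command itself was only in the screenshot, here is one plausible form of it, offered as an assumption rather than the original. It relies on GNU find, and the directory tree is hypothetical.

```shell
# Hypothetical sandbox: all names below are made up for illustration.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir -p a/b
touch Fo_top.txt a/Fr_mid.sh a/b/Fo_deep.txt a/b/other.txt

# The leading .* absorbs any number of directory levels; the trailing .*
# absorbs the rest of the path after Fo or Fr.
matches=$(find ./ -type f -regex '.*\/F[or].*' | LC_ALL=C sort)
echo "$matches"
```

One caveat of this form: because the trailing .* may also match slashes, files inside a directory whose name starts with Fo or Fr would be reported too. Ending the pattern with [^/]* instead keeps the match on the file name only.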

Search Files based on Extension

First, let me share the general syntax of how you are supposed to search files based on their extensions:

find ./ -type f -regex ".*\.[fileextension]"

Suppose I want to find all the text files (those with a .txt extension). That can be done quite easily with the following command:

[Screenshot: searching files by file extension with the find command]
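Filling the general syntax in for .txt gives something like the sketch below. GNU find is assumed and the file names are hypothetical.

```shell
# Hypothetical sandbox: all names below are made up for illustration.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir docs
touch a.txt docs/b.txt c.sh txt_notes.md

# .* accepts any path; \.txt requires a literal ".txt" at the very end,
# because -regex must match the whole path.
matches=$(find ./ -type f -regex '.*\.txt' | LC_ALL=C sort)
echo "$matches"
```

The file txt_notes.md is not listed: it contains txt, but the path does not end in .txt.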

Search Files based on Filename and Extension

This is my personal favorite implementation of regex with find as you can search files based on first letters and their extensions making it quite convenient.


First, let’s have a look at the syntax:

find ./ -type f -regex '\.\/[Filename].*\.[extension]'

Let’s make it a bit more practical. Say I only know the first few letters of the file name (it starts with Fo or Fr) and its extension (.sh):

[Screenshot: searching files by filename and extension]
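Substituting F[or] and sh into the syntax shown earlier gives roughly the following. This is a sketch assuming GNU find, with invented file names.

```shell
# Hypothetical sandbox: all names below are made up for illustration.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch Fo_build.sh Fr_deploy.sh Fo_notes.txt Fa_other.sh

# Name must start with Fo or Fr and the path must end in ".sh".
matches=$(find ./ -type f -regex '\.\/F[or].*\.sh' | LC_ALL=C sort)
echo "$matches"
```

Fo_notes.txt is rejected for its extension and Fa_other.sh for its second letter, leaving ./Fo_build.sh and ./Fr_deploy.sh.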

Final Words

From finding files modified in the last n minutes to executing scripts over results with exec, find is one of the most extensive commands, offering more than 50 options.

This guide explained yet another way to use the find command, taking you one step further in your Linux journey.

While this guide was kept simple, if you still have any doubts, let me know in the comments.


Find files using regular expressions in Ubuntu

I have a task where I have to find different files with conditions that I think require regular expressions. For example: find files that begin with 3 lowercase letters and where the last letter is not an ‘i’. I’m searching for the best way to find those files. I could do

ls [a-z][a-z][a-z]*[azertyuopqsdfghjklmwxcvbn] 

That is one way, but there is a negation operator for char classes, so you could do [^aeiou0-9[:punct:]] (if you really just want lowercase chars). Actually [ b-d. ] isn’t bad, because it lists explicitly what you do want to match, and you don’t have to guess about what you might be missing with [:punct:] and some of the other «shortcut» terms. There are other [:named-classes:] but I don’t want to have to try and find you a reference 😉 (I’m recovering from an operation). Good luck.

@tripleee : Thanks for the reminder. I think in the back of my mind ! was the csh negation, so I try to block those years from my memory 😉 . Nice answer! Good luck to all.

2 Answers

$ ls
abci  ASds  dferasfds  dsfa998  ilkj323  retk232i
$ find -regextype egrep -regex '.*/[a-z]{3}.*[^i]$'
./dferasfds
./dsfa998
./ilkj323
  • .*/ is needed to match the beginning part of the file path
  • [a-z]{3} three lowercase letters
  • .* any other characters
  • [^i]$ not ending with i

However, this particular case seems possible without regex:

$ find -name '[a-z][a-z][a-z]*[!i]'
./dferasfds
./dsfa998
./ilkj323

These are globs, not regular expressions. You don’t need regular expressions for this.

printf '%s\n' [a-z][a-z][a-z]*[!i] [a-z][a-z][a-hj-z] 

The second pattern covers file names which are three characters long; it is unclear from your requirements whether these should be included. (If no matching file exists, the shell will emit a warning message, but that’s harmless.) If not, just use the first pattern.

(I’m using printf mainly to illustrate that the shell does the actual work here, and ls is not necessary to expand a glob pattern.)

If you really do require regular expressions, find -regex 'pattern' is your friend. By default, find will traverse subdirectories; you can avoid this with -maxdepth 1 .
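A small sketch of the difference -maxdepth 1 makes, assuming GNU find; the file and directory names are hypothetical:

```shell
# Hypothetical sandbox: all names below are made up for illustration.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir sub
touch abcd sub/wxyz

# Without -maxdepth, find descends into sub as well.
all_levels=$(find . -type f -regex '.*/[a-z]*' | LC_ALL=C sort)

# -maxdepth 1 limits the search to the starting directory itself.
top_only=$(find . -maxdepth 1 -type f -regex '.*/[a-z]*')

echo "all levels: $all_levels"
echo "top only:   $top_only"
```

Placing -maxdepth before the other tests avoids the warning GNU find prints when a global option follows them.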

Maybe also look at Bash’s extended globbing for an in-between option.


How can I find all files in a folder that contain a match of a regular expression in the file name?

I’d like to find all of the files in my home folder on Linux (Ubuntu, in this case) that contain a match of a particular regular expression. Is there a simple Unix command that I can use in order to do this? For example, I’d like to find all of the files in my home folder with names that contain a match of the following regex (here, using JavaScript-style notation): ((R|r)eading(T|t)est(D|d)ata)


Are you looking for files whose content matches a regexp (from your post body), or files whose name matches a regexp (from your post title)?

I entered the above command on Ubuntu and got the following output: ls: cannot access *[Rr]eading[Tt]est[Dd]ata*: No such file or directory even though there is a file on my system that matches the regex.

I tried entering ls -l *.js* and got the same output: No such file or directory . I expected to see a list of every file that ended in .js, but it didn’t work as expected.

4 Answers

Find’s -name option supports file globbing. It also supports a limited set of regex-like options like limited square-bracket expressions, but for actual regex matches, use -regex .

If you’re looking for a match in the contents of a file, use grep -r as Craig suggested.

If you want to match the filename, then use find with its -regex option:

find . -type f -regex '.*[Rr]eading[Tt]est[Dd]ata.*' -print 

Note the shift in regex: each (R|r) group has become the bracket expression [Rr], because find doesn’t portably support parenthesized groups with alternation in its regex. If you happen to be on a Linux system, GNU find supports a -regextype option that gives you more control:

find . -regextype posix-extended -regex '.*((R|r)eading(T|t)est(D|d)ata).*' -print 

Note that if all you’re looking for is case matching, -iregex or even -iname may be sufficient. If you’re using bash as your shell, Gilles’ globstar solution should work too.
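To illustrate the case-insensitive route, here is a sketch assuming GNU find, with made-up file names. Note that it is broader than the original expression: it accepts any mix of upper and lower case, not just a capital at the start of each word.

```shell
# Hypothetical sandbox: all names below are made up for illustration.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch myReadingTestData.js readingtestdata.txt READINGTESTDATA.log unrelated.txt

# -iregex: case-insensitive regex against the whole path.
by_iregex=$(find . -type f -iregex '.*readingtestdata.*' | LC_ALL=C sort)

# -iname: case-insensitive glob against the file name only.
by_iname=$(find . -type f -iname '*readingtestdata*' | LC_ALL=C sort)

echo "$by_iregex"
```

Both searches return the same three files here, including the all-caps one that the stricter [Rr]eading[Tt]est[Dd]ata pattern would not match.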

Shells have wildcard characters that differ from the usual regexp syntaxes: ? to match any single character, * to match any number of characters, and [abc] to match any single character among a , b or c . The following command shows all files whose name matches the extended regular expression¹ ((R|r)eading(T|t)est(D|d)ata) in the current directory:
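The command itself is missing from the text, but the glob it used is visible in the error message quoted in the comments above: *[Rr]eading[Tt]est[Dd]ata*. A minimal sketch of that pattern at work follows; the file names are hypothetical, and printf stands in for ls so the shell’s expansion is easy to see.

```shell
# Hypothetical sandbox: all names below are made up for illustration.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch myReadingTestData.js readingtestdata.txt other.txt

# The shell expands the glob before the command runs; each [Xx] bracket
# plays the role of an (X|x) group in the regular expression.
matches=$(printf '%s\n' *[Rr]eading[Tt]est[Dd]ata*)
echo "$matches"
```

If nothing in the current directory matches, the pattern is passed through unexpanded, which would explain the “No such file or directory” message reported above.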

If you want to find files in subdirectories as well, then first run shopt -s globstar (you can put this command in your ~/.bashrc ). This turns on the ** pattern to match any level of subdirectories:
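A sketch of the globstar variant, assuming bash 4 or later (older versions have no globstar option); the directory tree is hypothetical:

```shell
#!/usr/bin/env bash
# Requires bash 4+; globstar is a bash-specific option.
shopt -s globstar

# Hypothetical sandbox: all names below are made up for illustration.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir -p a/b
touch ReadingTestData.js a/b/readingTestdata.txt a/other.txt

# **/ matches zero or more directory levels, so both the top-level file
# and the nested one are picked up.
matches=( **/*[Rr]eading[Tt]est[Dd]ata* )
printf '%s\n' "${matches[@]}"
```

The order of the results depends on the locale’s collation rules, so scripts should not rely on it.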

Shell wildcard characters are not as powerful as regular expressions. For example, there is no or ( | ) operator. You can get the power of regular expressions, but with a different syntax for historical reasons. Add shopt -s extglob to your .bashrc , then you can use @(foo|bar) to match foo or bar (like foo|bar in an ERE), *(pattern) to match any number of occurrences of pattern (like (pattern)* in an ERE), +(pattern) to match one or more occurrences, ?(pattern) to match zero or one occurrence, and !(pattern) to match anything except pattern (no ERE equivalent).
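Two of these operators in action, as a sketch assuming bash with hypothetical file names. The shopt line has to run before the shell parses any line that contains an extended pattern, which is why it sits at the top of the script.

```shell
#!/usr/bin/env bash
# extglob is bash-specific and must be enabled before the patterns are parsed.
shopt -s extglob

# Hypothetical sandbox: all names below are made up for illustration.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch foo.txt bar.txt baz.txt

# @(foo|bar) matches exactly one of the alternatives.
either=( @(foo|bar).txt )

# !(foo|bar) matches anything except the alternatives.
neither=( !(foo|bar).txt )

echo "either:  ${either[*]}"
echo "neither: ${neither[*]}"
```

The first pattern expands to bar.txt and foo.txt, the second to baz.txt.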

¹ “Extended regular expression” (ERE for short) is the Unix name for the regex flavor closest to the syntax that JavaScript uses; the expression above is written the same way in both.
