Linux find exclude file

How to find all files except a specified file

In the most simple case you may use the following (when the first subword is the static CentOS):

  • [BDV] — a character class to ensure the second subword starts with one of the specified characters

or the same with negation:
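A minimal sketch, assuming repo files named like CentOS-Base.repo, CentOS-Debuginfo.repo, CentOS-Media.repo and CentOS-Vault.repo (the exact filenames are an assumption):

ls CentOS-[BDV]*.repo    # second subword starts with B, D or V
ls CentOS-[^M]*.repo     # second subword starts with anything but M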

If you want to ignore all filenames that contain the M character, with the GNU implementation of ls (as typically found on CentOS), use the -I (--ignore) option:

-I, --ignore=PATTERN
do not list implied entries matching shell PATTERN

To ignore entries containing the word Media:
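A sketch with GNU ls (the exact pattern is an assumption reconstructed from the description):

ls -I '*Media*'
ls --ignore='*Media*'    # same thing, long-option form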

Those patterns need to be passed verbatim to ls, so they must be quoted (otherwise, the shell would treat them as globs to expand).

The patterns are greedy: the first * matches everything except the last letter, the [^M] matches the last letter (since none of the names ends with M), and the trailing * matches the empty string. So they all match. And even if something ended with M, it would still match, provided there was something different from M somewhere: e.g. if you had a file called OOM, the first star would match the first O, the [^M] would match the second O and the trailing star would match the M.
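For instance, a hypothetical attempt (assuming files like those above plus OOM):

ls -I '*[^M]*'    # lists nothing: every name matches the ignore pattern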

The easiest way is to use find. Do:

find . -maxdepth 1 -type f ! -name "CentOS-Media.repo" 

Here "f" means search for regular files only (excludes symlinks to regular files though; with GNU find, use -xtype f instead to include them). If you want to search for directories, pass "d" instead.

(-maxdepth, while initially a GNU extension, is now quite common. If your find doesn't support it, you can replace -maxdepth 1 with the standard ! -name . -prune.)
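A sketch of the portable form (same effect as the command above, assuming a POSIX find):

find . ! -name . -prune -type f ! -name "CentOS-Media.repo"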

See the find man page for more awesome features.

One option is to use find with the -not -name flags, i.e. find . -not -name CentOS-Media.repo. If you don't want to recurse down the directory structure, add the -maxdepth 1 flag.
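Combining the two, a sketch:

find . -maxdepth 1 -not -name CentOS-Media.repo

(Note this still prints . and any directories; add -type f if you only want regular files.)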

Alternatively, one may write the following (which is much more complex, but I forgot about the -not flag and posted this answer originally, so I will not delete this part):

find . -print0 | grep --invert-match -z "CentOS-Media.repo$" | tr '\0' '\n' 

You need to force find to separate filenames with a null byte, so that newlines in filenames won't break anything. Luckily, grep supports this kind of separator with the -z flag. You may want to revert to the typical separation (i.e. null byte -> newline) with tr '\0' '\n'.



How to ignore certain filenames using "find"?
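The question opens with a command along these lines (a reconstruction; the exact invocation is an assumption based on the description below):

find . -exec grep 'SearchString' {} /dev/null \;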

which searches the contents of all of the files at and below the current directory for the specified SearchString. As a developer, this has come in handy at times. Due to my current project, and the structure of my codebase, however, I'd like to make this BASH command even more advanced by not searching any files that are in or below a directory that contains ".svn", or any files that end with ".html". The man page for find kind of confused me, though. I tried using -prune, and it gave me strange behavior. In an attempt to skip only the .html pages (to start), I tried:

find . -wholename './*.html' -prune -exec grep 'SearchString' {} /dev/null \; 

and did not get the behavior I was hoping for. I think I might be missing the point of -prune. Could you guys help me out? Thanks

@emanuele Hi, welcome to SuperUser (and the Stack Exchange network). This is a question I asked, and that was answered, 2 1/2 years ago. Typically, if you would like to add an answer to the question, please do so by scrolling to the bottom and answering there, instead of in a comment. Since this question already has an accepted answer (the one with the green checkmark), it's unlikely that your answer is going to get much attention, however. FYI.

Hi, it is not an answer to your question. It is only a tip, since you stated in the preamble that you use find to search inside files.

FWIW, -name '*.*' does not find all files: only those with a . in their name (the use of *.* is typically a DOS-ism, whereas in Unix, you normally use just * for that). To really match them all, just remove the argument altogether: find . -exec ... Or if you want to only apply grep to files (and skip directories) then do find . -type f -exec ...

5 Answers

You can use the negate (!) feature of find to not match files with specific names:

find . ! -name '*.html' ! -path '*.svn*' -exec grep 'SearchString' {} /dev/null \; 

So if the name ends in .html or contains .svn anywhere in the path, it will not match, and so the exec will not be executed.

@Paul The desired effect is to exclude "files that are in or below a directory that contains .svn", so path (or wholename, but path is more portable) is more accurate than name for the answer. The questioner doesn't appear to have any files with .svn in the name.


I've had the same issue for a long time, and there are several solutions which can be applicable in different situations:

  • ack-grep is a sort of "developer's grep" which by default skips version control directories and temporary files. The man page explains how to search only specific file types and how to define your own.
  • grep's own --exclude and --exclude-dir options can be used very easily to skip file globs and single directories (no globbing for directories, unfortunately); see the sketch after this list.
  • find . \( -type d -name '.svn' -o -type f -name '*.html' \) -prune -o -print0 | xargs -0 grep 'SearchString' should work, but the above options are probably less of a hassle in the long run.
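For the grep option, a sketch (assuming GNU grep; the pattern and paths are illustrative):

grep -r 'SearchString' . --exclude='*.html' --exclude-dir=.svn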

The following find command does prune directories whose names contain .svn. Although it does not descend into such a directory, the pruned path name is itself printed (-name '*.svn' is the cause!).

You can filter out the directory names via grep -d skip, which silently skips such input "directory names".

With GNU grep, you can use -H instead of /dev/null. As a slight side issue: \+ can be much faster than \;, e.g. for 1 million one-line files, using \; took 4m20s, while using \+ took only 1.2s.
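A sketch combining those two tips (assuming GNU grep):

find . -type f ! -name '*.html' ! -path '*.svn*' -exec grep -H 'SearchString' {} +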

The following method uses xargs instead of -exec, and assumes there are no newlines \n in any of your file names. As used here, xargs is much the same as find's \+.

xargs can pass file-names which contain consecutive spaces by changing the input delimiter to '\n' with the -d option.

This excludes directories whose names contain .svn and greps only files which don't end with .html.

find . \( -name '*.svn*' -prune -o ! -name '*.html' \) | xargs -d '\n' grep -Hd skip 'SearchString' 


Exclude list of files from find

If I have a list of filenames in a text file that I want to exclude when I run find, how can I do that? For example, I want to do something like:

find /dir -name "*.gz" -exclude_from skip_files 

and get all the .gz files in /dir except for the files listed in skip_files. But find has no -exclude_from flag. How can I skip all the files in skip_files ?

7 Answers

I don't think find has an option like this; you could build a command using printf and your exclude list:

find /dir -name "*.gz" $(printf "! -name %s " $(cat skip_files)) 

Which is the same as doing:

find /dir -name "*.gz" ! -name first_skip ! -name second_skip ... etc 

Alternatively you can pipe from find into grep :

find /dir -name "*.gz" | grep -vFf skip_files 

Does this work for inclusions as well? I just tested and I got nothing: included_paths=("./.aws/*" "./.bash_env.m4") && find . -type f -name '*.m4' \( -path "$" $(printf " -or -path '%s'" "$") \). When I directly put in the paths into the find command, it does work.


This is what I usually do to remove some files from the result (in this case I looked for all text files but wasn't interested in a bunch of Valgrind memcheck reports we have here and there):

find . -type f -name '*.txt' ! -name '*mem*.txt' 

Example for if you need to ignore multiple filenames / patterns: find . -type f ! -name '*.foo' ! -name '*.bar' ...

find /dir \( -name "*.gz" ! -name skip_file1 ! -name skip_file2 ... and so on \) 
find /var/www/test/ -type f \( -iname "*.*" ! -iname "*.php" ! -iname "*.jpg" ! -iname "*.png" \) 

The above command gives a list of all files excluding files with the .php, .jpg and .png extensions. This command works for me in PuTTY.

PuTTY is a remote terminal (usually using SSH) — whether it works will very much depend on what you're SSH'ing into.

Josh Jolly's grep solution works, but has O(N**2) complexity, making it too slow for long lists. If the lists are sorted first (O(N*log(N)) complexity), you can use comm, which has O(N) complexity:

find /dir -name '*.gz' | sort > everything_sorted
sort skip_files > skip_files_sorted
comm -23 everything_sorted skip_files_sorted | xargs ... etc 

man your computer's comm for details.

This solution will go through all files (not exactly excluding them from the find command itself), but will produce an output skipping files from a list of exclusions. I found that useful while running a time-consuming command (find /dir -exec md5sum {} \;).

  1. You can create a shell script to handle the skipping logic and run commands on the files found (make it executable with chmod, replace echo with other commands):

 $ cat skip_file.sh
 #!/bin/bash
 found=$(grep "^$1$" files_to_skip.txt)
 if [ -z "$found" ]; then
     # run your command
     echo $1
 fi

  2. Create a file with the list of files to skip named files_to_skip.txt (in the dir you are running from).
  3. Then run find using it:

 find /dir -name "*.gz" -exec ./skip_file.sh {} \; 


This will fail if any of the filenames has a space in it that is unquoted (which it must be if sharing the exclusion list with another utility that expects it that way).

