How to find binary files in a directory?
I need to find the binary files in a directory. I want to do this with file, and after that I will check the results with grep. But my problem is that I have no idea what a binary file is. What does the file command output for binary files, and what should I check for with grep?
What kind of «binary» files are you talking about here? Do you have an appropriate «binary» file on your system anywhere? What does file say about it?
I don’t know what kind of binary files, because my homework doesn’t define it. It only says: write a shell script using the grep command (and others) to find the binary files in a directory and write their permissions. So I don’t know anything about the type of the binary files.
That seems under-specific to me. I’d ask for clarification. Though given the suggestion to use grep I’m going to guess it means «contains a NUL byte».
All files are binary. «Binary» means you don’t know the actual format of the file or it is not important in the context. Some files are text files. A text file is one where the entire file can be decoded into a text string with a specific character encoding. All files can be decoded using several different character encodings. It is only valid to do so if you know the file is text and use the character encoding that was used to write it.
10 Answers
This finds all non-text based, binary, and empty files.
Edit
Solution with only grep (from Mehrdad’s comment):
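The grep-only command itself is not shown in the text above. A minimal sketch of what such a command could look like, assuming GNU grep (for -r) and a hypothetical wrapper name list_non_text:

```shell
# Sketch only: list_non_text is a hypothetical name, and GNU grep is assumed.
# -r  recurse into the directory
# -I  treat binary files as if they contained no match
# -L  print only the names of files without a match
list_non_text() {
    grep -rIL . "${1:-.}"
}

# Usage: list_non_text /some/dir
```

Like the find variant, this also lists empty files, because an empty file has no line that matches the pattern.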
Original answer
This does not require any other tool except find and grep :
find . -type f -exec grep -IL . "{}" \;
-I tells grep to treat binary files as if they contained no match
-L prints only the names of files that contain no match
Edit 2
This finds all non-empty binary files:
find . -type f ! -size 0 -exec grep -IL . "{}" \;
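Since the question also asks for the permissions of the binary files, the command above can be extended. This is a rough sketch, not part of the original answer: binary_perms is a hypothetical name, GNU grep is assumed, and file names containing newlines are not handled.

```shell
# Sketch only: print the permission string and name of every non-empty
# file that grep does not consider text.
binary_perms() {
    find "${1:-.}" -type f ! -size 0 -exec grep -IL . {} + |
    while IFS= read -r f; do
        # The first field of 'ls -ld' is the mode string, e.g. -rw-r--r--
        mode=$(ls -ld -- "$f" | cut -d' ' -f1)
        printf '%s %s\n' "$mode" "$f"
    done
}

# Usage: binary_perms /some/dir
```

Using {} + instead of {} \; passes many files to one grep process, which avoids one fork per file.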
It looks like you’re right. However, it’s quite some time ago that I looked into this, so I don’t remember why I put the find there. Without the additional fork it’s also way faster!
Maybe the files you think are ‘non-binary’ are empty? Those show up, too (as they are not text, I guess)
Just have to mention Perl‘s -T test for text files, and its opposite -B for binary files.
$ find . -type f | perl -lne 'print if -B'
will print out any binary files it sees. Use -T if you want the opposite: text files.
It’s not totally foolproof as it only looks in the first 1,000 characters or so, but it’s better than some of the ad-hoc methods suggested here. See man perlfunc for the whole rundown. Here is a summary:
The «-T» and «-B» switches work as follows. The first block or so of the file is examined to see if it is valid UTF-8 that includes non-ASCII characters. If so, it’s a «-T» file. Otherwise, that same portion of the file is examined for odd characters such as strange control codes or characters with the high bit set. If more than a third of the characters are strange, it’s a «-B» file; otherwise it’s a «-T» file. Also, any file containing a zero byte in the examined portion is considered a binary file.
In these modern times (2020 is practically the 3rd decade of the 21st century after all), I think the correct question is how do I find all the non-utf-8 files? Utf-8 being the modern equivalent of a text file.
utf-8 encoding of text with non-ascii code points will introduce non-ascii bytes (i.e., bytes with the most significant bit set). Now, not all sequences of such bytes form valid utf-8 sequences.
isutf8 from the moreutils package is what you need.
$ isutf8 -l /bin/*
/bin/[
/bin/acyclic
/bin/addr2line
/bin/animate
/bin/applydeltarpm
/bin/apropos
⋮
$ file $(isutf8 -l /bin/*)
/bin/[: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=4d70c2142fc672d8a69d033ecb6693ec15b1e6fb, for GNU/Linux 3.2.0, stripped
/bin/acyclic: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=d428ea52eb0e8aaf7faf30914710d8fbabe6ca28, for GNU/Linux 3.2.0, stripped
/bin/addr2line: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=797f42bc4f8fb754a49b816b82d6b40804626567, for GNU/Linux 3.2.0, stripped
/bin/animate: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=36ab46e69c1bfea433382ffc9bbd9708365dac2b, for GNU/Linux 3.2.0, stripped
/bin/applydeltarpm: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=a1fddcbeec9266e698782596f2dfd1b4f3e0b974, for GNU/Linux 3.2.0, stripped
/bin/apropos: symbolic link to whatis
⋮
You may wish to invert the test and get all the text files. Use -i :
$ isutf8 -il /bin/*
/bin/alias
/bin/bashbug
/bin/bashbug-64
/bin/bg
⋮
$ file -L $(isutf8 -il /bin/*)
/bin/alias: a /usr/bin/sh script, ASCII text executable
/bin/bashbug: a /usr/bin/sh - script, ASCII text executable, with very long lines
/bin/bashbug-64: a /usr/bin/sh - script, ASCII text executable, with very long lines
/bin/bg: a /usr/bin/sh script, ASCII text executable
⋮
Yeah, it reads the whole file, but it’s pretty speedy, and if you want accuracy…
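Where moreutils is not installed, a similar check can be approximated with iconv. This is a swapped-in technique, not what isutf8 does internally: the sketch assumes that iconv exits with a non-zero status on an invalid input sequence (glibc iconv does), and non_utf8_files is a hypothetical name.

```shell
# Sketch only: list files that do not decode as UTF-8.
# Converting UTF-8 to UTF-8 changes nothing, but fails on invalid sequences.
non_utf8_files() {
    find "${1:-.}" -type f -exec sh -c '
        for f do
            iconv -f UTF-8 -t UTF-8 "$f" >/dev/null 2>&1 || printf "%s\n" "$f"
        done' sh {} +
}

# Usage: non_utf8_files /bin
```

Like isutf8, this reads every file in full.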
How do I find where Linux application binary files exist?
Where does Linux (e.g., CentOS, RHEL) store the binary files for a given application? How can I find it out for an application?
It’s unclear whether you mean «executable binary file» (the actual command, which in fact may not be a compiled binary at all) or «binary data file». What is the issue that you are currently having? Are you, for example, trying to find the executable associated with a particular package?
5 Answers
You can use whereis for this task.
$ whereis python3
python3: /usr/bin/python3.5m /usr/bin/python3 /usr/bin/python3.5 /usr/lib/python3 /usr/lib/python3.5 /etc/python3 /usr/local/lib/python3.5 /usr/share/python3 /usr/share/man/man1/python3.1.gz
And while whereis checks the standard path, you can also use which to check the actual path (though it won’t find «shadowed» duplicates).
To find the binary itself, another method is type .
$ type python3
python3 is /usr/bin/python3
In terms of location for configuration files and data files, the best place to determine that is often the relevant man page.
$ man python3 | grep -A10 FILES
FILES AND DIRECTORIES
       These are subject to difference depending on local installation conventions;
       ${prefix} and ${exec_prefix} are installation-dependent and should be
       interpreted as for GNU software; they may be the same. The default for
       both is /usr/local.

       ${exec_prefix}/bin/python
              Recommended location of the interpreter.

       ${prefix}/lib/python<version>
       ${exec_prefix}/lib/python<version>
You might try using which, for example which zoom.
The path it prints is likely a symbolic link, so to find the actual path:
ls -la $(which zoom)
lrwxrwxrwx 1 root root 22 Apr 24 07:19 /usr/bin/zoom -> /opt/zoom/ZoomLauncher
In case that gives another symlink (it doesn’t in the case of zoom), you can use namei to traverse the chain.
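As an alternative to walking the chain step by step, readlink -f resolves all links at once. A minimal sketch, assuming a readlink that supports -f (GNU coreutils does); real_path_of is a hypothetical name.

```shell
# Sketch only: print the final target of a command found via PATH.
real_path_of() {
    # command -v prints the path of an external command; for aliases and
    # shell built-ins it prints something else, which this sketch ignores.
    p=$(command -v "$1") || return 1
    readlink -f "$p"
}

# Usage: real_path_of zoom
```

namei shows every intermediate link, while readlink -f shows only the end of the chain.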
In general, refer to the Filesystem Hierarchy Standard (FHS) for this:
A direct (and simplified) answer to your question is:
- binary files:
- /usr/bin for executables
- /usr/lib for shared objects (libraries)
- /etc for configuration files
- /var for read-write stuff
- /usr/share for architecture-independent, read-only data (like images)
If you are using a Debian-based system, you can find what files are deployed by a specific application (package) with dpkg -L (or dpkg-query -L):
$ dpkg-query -L xclip
/usr/bin/xclip
/usr/share/doc/xclip/README
/usr/share/doc/xclip/changelog.Debian.gz
/usr/share/doc/xclip/changelog.gz
/usr/share/doc/xclip/copyright
/usr/share/man/man1/xclip.1.gz
In this (abbreviated) case, we see package xclip has a binary in /usr/bin , some changelog and copyright stuff in /usr/share/doc/xclip and a man page in /usr/share/man/man1 .
Other distros have other tools for the same thing.
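On RPM-based systems such as CentOS and RHEL, the counterpart of dpkg-query -L is rpm -ql. A small sketch that picks whichever tool is available; list_package_files is a hypothetical name, not an existing command.

```shell
# Sketch only: list the files installed by a package.
list_package_files() {
    if command -v dpkg-query >/dev/null 2>&1; then
        dpkg-query -L "$1"      # Debian, Ubuntu and derivatives
    elif command -v rpm >/dev/null 2>&1; then
        rpm -ql "$1"            # CentOS, RHEL, Fedora
    else
        echo "no supported package tool found" >&2
        return 2
    fi
}

# Usage: list_package_files xclip
```

For the reverse lookup, from a file to the package that owns it, dpkg -S /path/to/file and rpm -qf /path/to/file serve the same purpose.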
Shellscript what-about
I use the shellscript what-about in order to show some basic information of executable programs, that are available via PATH ,
- where it is located
- what package it belongs to or can be installed from
- what kind of program it is (binary executable code, shellscript, shell built-in, alias, link, ...)
This bash shellscript uses the tool dpkg , that belongs to Debian and Ubuntu and can be used also in Linux distros developed from those two distros. If you want to see the whole content of a [debian] program package, you can use dpkg or emacs according to this link to AskUbuntu.
If you want to find the package a program belongs to in some other distro (CentOS and RHEL were mentioned in the original question), you must replace dpkg with the corresponding tool.
#!/bin/bash
# Assumption: the ${...} expansions, the usage argument and the {} in find
# were missing from the text and are reconstructed from context.

LANG=C
inversvid="\0033[7m"
resetvid="\0033[0m"

if [ $# -ne 1 ]
then
 echo "Usage: ${0##*/} <program-name>"
 echo "Will try to find corresponding package"
 echo "and tell what kind of program it is"
 exit 1
fi
command="$1"

# a line of dashes as wide as the terminal
str=;for ((i=1;i<=$(tput cols);i++)) do str="-$str";done
tmp="$command"
first=true
curdir="$(pwd)"
tmq=$(which "$command")
tdr="${tmq%/*}"    # directory part of the path (assumed)
tex="${tmq##*/}"   # file name part of the path (assumed)
if test -d "$tdr"; then cd "$tdr"; fi
#echo "cwd='$(pwd)' ################# d"

# follow the chain while 'ls -l' reports a symbolic link (mode starts with l)
while $first || [ "${tmp:0:1}" == "l" ]
do
 first=false
 tmp=${tmp##*\ }   # last field: the command or the link target (assumed)
 tmq="$tmp"
 tmp=$(ls -l "$(which "$tmp")" 2>/dev/null)
 tdr="${tmq%/*}"
 tex="${tmq##*/}"
 if test -d "$tdr"; then cd "$tdr"; fi
# echo "cwd='$(pwd)' ################# d"
 if [ "$tmp" == "" ]
 then
  tmp=$(ls -l "$tex" 2>/dev/null)
  tmp=${tmp/\ \.\//\ $(pwd)\/}   # make a ./ link target absolute (assumed)
  if [ "$tmp" == "" ]
  then
   echo "$command is not in PATH"
#   package=$(bash -ic "$command -v 2>&1")
#   echo "package=$package XXXXX 0"
   bash -ic "alias '$command' > /dev/null 2>&1" > /dev/null 2>&1
   if [ $? -ne 0 ]
   then
    echo 'looking for package ...'
    package=$(bash -ic "$command -v 2>&1"| sed -e '0,/with:/d'| grep -v '^$')
   else
    echo 'alias, hence not looking for package'
   fi
#   echo "package=$package XXXXX 1"
   if [ "$package" != "" ]
   then
    echo "$str"
    echo "package: [to get command '$1']"
    echo -e "${inversvid}${package}${resetvid}"
   fi
  else
   echo "$tmp"
  fi
 else
  echo "$tmp"
 fi
done
tmp=${tmp##*\ }
if [ "$tmp" != "" ]
then
 echo "$str"
 program="$tex"
 program="$(pwd)/$tex"
 file "$program"
 if [ "$program" == "/usr/bin/snap" ]
 then
  echo "$str"
  echo "/usr/bin/snap run $command # run $command "
  sprog=$(find /snap/"$command" -type f -iname "$command" \
   -exec file {} \; 2>/dev/null | sort | tail -n1)
  echo -e "${inversvid}file: $sprog$resetvid"
  echo "/usr/bin/snap list $command # list $command"
  slist="$(/usr/bin/snap list "$command")"
  echo -e "${inversvid}$slist$resetvid"
 else
  package=$(dpkg -S "$program")
  if [ "$package" == "" ]
  then
   package=$(dpkg -S "$tex" | grep -e " /bin/$tex$" -e " /sbin/$tex$")
   if [ "$package" != "" ]
   then
    ls -l /bin /sbin
   fi
  fi
  if [ "$package" != "" ]
  then
   echo "$str"
   echo " package: /path/program [for command '$1']"
   echo -e "${inversvid} $package ${resetvid}"
  fi
 fi
fi
echo "$str"
#alias=$(grep "alias $command=" "$HOME/.bashrc")
alias=$(bash -ic "alias '$command' 2>/dev/null"| grep "$command")
if [ "$alias" != "" ]
then
 echo "$alias"
fi
type=$(type "$command" 2>/dev/null)
if [ "$type" != "" ]
then
 echo "type: $type"
elif [ "$alias" == "" ]
then
 echo "type: $command: not found"
fi
cd "$curdir"
This shellscript can find a program that is ‘behind’ a link or a series of links.
See also this link to AskUbuntu, where there are some demo examples.
How to find all binary executables recursively within a directory?
All executable files are listed (excluding directories), including executable script files (like script.sh). What I want to do is list only binary executable files.
4 Answers
You might try the file utility. According to the manpage:
The magic tests are used to check for files with data in particular fixed formats. The canonical example of this is a binary executable (compiled program) a.out file, whose format is defined in <elf.h>, <a.out.h> and possibly <exec.h> in the standard include directory.
You might have to play around with the regular expression but something like:
$ find -type f -executable -exec file -i '{}' \; | grep 'x-executable; charset=binary'
file has lots of options, so you might want to take a closer look at the man page. I used the first option I found that seemed to produce easy-to-grep output.
I’d say use find -type f -executable -exec sh -c "file -i '{}' | grep -q 'x-executable; charset=binary'" \; -print. It will only give you files (and thus can be passed to the next command he wants to run)
serverfault.com/a/584595/211551 solution finds files that are NOT marked executable but are executable.
On OS X, you can install GNU find with brew install findutils or sudo port install findutils and then you can run an invocation like this to a similar effect: gfind . -type f -executable -exec file '{}' \; | grep -i execut
Here’s a way to exclude scripts, i.e., files whose first two characters are #! :
find -type f -executable -exec sh -c 'test "$(head -c 2 "$1")" != "#!"' sh {} \; -print
For some kinds of files, it’s not clear whether you want them classified as scripts or binary, for example bytecode files. Depending on how things are set up, these may or may not start with #! . If these matter to you, you’ll have to make the inner shell script more complex. For example, here’s how you might include ELF binaries and Mono executables and Objective Caml bytecode programs but not other kinds of executables like shell scripts or perl scripts or JVM bytecode programs:
find -type f -executable -exec sh -c '
    case "$(head -n 1 "$1")" in
      ?ELF*) exit 0;;
      MZ*) exit 0;;
      \#!*/ocamlrun*) exit 0;;
    esac
    exit 1
' sh {} \; -print
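If only ELF binaries matter, the check can also be made on the raw magic bytes instead of the first text line. A minimal sketch, assuming GNU find for -executable; elf_executables is a hypothetical name.

```shell
# Sketch only: list executable regular files that start with the ELF
# magic bytes (0x7f followed by "ELF").
elf_executables() {
    find "${1:-.}" -type f -executable -exec sh -c '
        for f do
            # od -c renders the four bytes as "177 E L F"; strip the spacing
            magic=$(head -c 4 "$f" | od -An -c | tr -d " \n")
            if [ "$magic" = "177ELF" ]; then
                printf "%s\n" "$f"
            fi
        done' sh {} +
}

# Usage: elf_executables /usr/local/bin
```

Reading four bytes avoids feeding a whole binary "line" with NUL bytes into a command substitution, and {} + starts one shell for many files rather than one per file.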