Find file in zip files linux

Is it possible in unix to search inside zip files

I have hundreds of directories, and within those I have a few zip files. Those zip files contain images named abc.jpg. The zip files may be in any folder or subfolder, so it's difficult to extract them all in one place. I just want to collect those image files. Is this possible?

You can use zip -sf foo.zip | grep abc.jpg to determine whether an archive contains abc.jpg; that should help. I don’t have time to figure out the complete command now, but I’ll try later if nobody else has answered.

5 Answers

I once needed something similar to find class files in a bunch of zip files. Here it is:

#!/bin/bash
function process() {
  while read -r line; do
    if [[ "$line" =~ ^Archive:[[:space:]]*(.*) ]] ; then
      # Remember the archive whose listing follows.
      ar="${BASH_REMATCH[1]}"
      #echo "$ar"
    else
      if [[ "$line" =~ [[:space:]]*([^ ]*abc\.jpg)$ ]] ; then
        echo "${ar}: ${BASH_REMATCH[1]}"
      fi
    fi
  done
}
find . -iname '*.zip' -exec unzip -l '{}' \; | process

Now you only need to add one line to extract the files and maybe move them. I’m not sure exactly what you want to do, so I’ll leave that to you.
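
One way that extraction step could look, as a rough sketch rather than the answerer's own code: read the "archive: member" lines printed by process and stream each member out with unzip -p. The names dest_name and extract_matches are made up for illustration, and the archive path is folded into the output name on the assumption that every archive holds a file with the same name.

```sh
# Hypothetical helper: build a collision-free output name from the archive
# path and the member path, e.g. ./a/b.zip + img/abc.jpg -> a_b.zip__img_abc.jpg
dest_name() {
    printf '%s__%s\n' "${1#./}" "$2" | tr '/' '_'
}

# Read "archive: member" lines on stdin and extract each member into the
# directory given as $1. The ( ) body keeps the variables local.
extract_matches() (
    dest=$1
    mkdir -p "$dest" || exit 1
    while IFS= read -r line; do
        archive=${line%%: *}    # text before the first ": "
        member=${line#*: }      # text after the first ": "
        # unzip treats the member name as a wildcard pattern, so names
        # containing * ? or [ would need escaping (not handled here).
        unzip -p "$archive" "$member" </dev/null \
            > "$dest/$(dest_name "$archive" "$member")"
    done
)
```

It would be chained onto the pipeline above, for example: find . -iname '*.zip' -exec unzip -l '{}' \; | process | extract_matches /destination/directory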

If your unix variant supports FUSE (Linux, *BSD, OSX, Solaris all do), mount AVFS to access archives transparently. The command mountavfs creates a view of the whole filesystem, rooted at ~/.avfs , in which archive files have an associated directory that contains the directories and files in the archive. For example, if you have foo.zip in the current directory, then the following command is roughly equivalent to unzip -l foo.zip :

mountavfs   # needs to be done once and for all
find ~/.avfs$PWD/foo.zip\# -ls

So, to loop over all images contained in a zip file under the current directory and copy them to /destination/directory (with a prompt in case of clash):

find ~/.avfs"$PWD" -name '*.zip' -exec sh -c '
    # \{\} keeps the outer find from substituting the inner placeholder
    find "$0#" -name "*.jpg" -exec cp -ip \{\} "$1" \;
' {} /destination/directory \;

In zsh, the same copy can be written with a single glob:

cp -ip ~/.avfs$PWD/**/*.zip(e\''reply=($REPLY\#/**/*.jpg(N))'\') /destination/directory

Deconstruction: ~/.avfs$PWD/**/*.zip expands to the AVFS view of the zip files under the current directory. The glob qualifier e is used to modify the output of the glob: …/*.zip(e\''REPLY=$REPLY\#'\') would just append a # to each match. reply=($REPLY\#/**/*.jpg(N)) transforms each match into the array of .jpg files in the .zip# directory.

Some reasons not to use FUSE if you don’t have to: portability (some OSes don’t have FUSE) and maintainability (not everybody knows FUSE).

I assume you have a new version of Bash, so you should be able to use this:

shopt -s globstar
for path in topdir/**/*.zip
do
  unzip "$path" '*abc.jpg'
done

Similar to Kim's answer but slightly modified. Just use sed:

find . -name '*.zip' -exec unzip -l '{}' \; | sed -n -e '/^Archive:/h' -e '/abc\.jpg$/{x;p;x;p;}'

Let’s do this! Tragically, existing answers are deficient in various obvious ways – including those both here and at a popular duplicate.

The accepted answer, for example, is Bash-specific (that’s bad) and hardcodes the desired search pattern into a one-off 10-line shell function (that’s even badder). The next most upvoted answer leverages FUSE-based pseudo-filesystems (that’s patently insane). Likewise, the most upvoted answer at the aforementioned duplicate yields ambiguous, non-human-readable output (just... ugh).

I am Jack’s wizened disapproval.

Working Code or It Didn’t Happen

A new contender has entered the ring:

# str find_in_zip(str regex, str zip_filename1, ...)
#
# Find all paths contained in any zip-formatted archives with the passed
# filenames such that the relative pathnames of these paths in these
# archives match the passed extended regular expression.
function find_in_zip() {
    (( $# >= 2 )) || {
        echo 'Expected one extended regular expression and one or more zip filenames.' 1>&2
        return 1
    }
    # Localize and remove the passed regex from the argument list.
    local regex="${1}" zip_filename
    shift
    # For each passed zip filename...
    for zip_filename in "${@}"; do
        # Print the name of this filename for disambiguation.
        echo "${zip_filename}:"
        # Print all paths in this file matching this regex.
        command unzip -l "${zip_filename}" |
            command grep --extended-regexp --color=always "${regex}"
    # Page the above output for readability.
    done | less --RAW-CONTROL-CHARS
}

For usability, this function is called with the exact same signature as grep . Namely, this function first accepts the regular expression to be searched for and then a variadic sequence of one or more zip filenames.

Likewise, this function has been tested under both Bash and zsh. Add the above code to either ~/.bashrc or ~/.zshrc and great zipfile glory shall be yours, ideally with set -e enabled for sanity and strictness.

Examples or It Didn’t Happen

To demonstrate, let’s find the set of all classes embedded in I2P JAR files installed under Gentoo Linux whose names begin with exactly seven uppercase characters followed by one lowercase character – just ’cause:

$ find_in_zip '/[A-Z]{7}[a-z]' /usr/share/i2p/lib/*.jar
/usr/share/i2p/lib/addressbook.jar:
/usr/share/i2p/lib/BOB.jar:
/usr/share/i2p/lib/commons-el.jar:
/usr/share/i2p/lib/desktopgui.jar:
/usr/share/i2p/lib/i2p.jar:
      568  01-16-2020 00:20   freenet/support/CPUInformation/AMDCPUInfo.class
      236  01-16-2020 00:20   freenet/support/CPUInformation/VIACPUInfo.class
/usr/share/i2p/lib/i2psnark.jar:
/usr/share/i2p/lib/i2ptunnel.jar:
/usr/share/i2p/lib/jasper-compiler.jar:
/usr/share/i2p/lib/jasper-runtime.jar:
/usr/share/i2p/lib/jetty-continuation.jar:
/usr/share/i2p/lib/jetty-deploy.jar:
/usr/share/i2p/lib/jetty-http.jar:
/usr/share/i2p/lib/jetty-i2p.jar:
/usr/share/i2p/lib/jetty-io.jar:
/usr/share/i2p/lib/jetty-java5-threadpool.jar:
/usr/share/i2p/lib/jetty-rewrite-handler.jar:
/usr/share/i2p/lib/jetty-security.jar:
/usr/share/i2p/lib/jetty-servlet.jar:
/usr/share/i2p/lib/jetty-servlets.jar:
/usr/share/i2p/lib/jetty-sslengine.jar:
/usr/share/i2p/lib/jetty-start.jar:
/usr/share/i2p/lib/jetty-util.jar:
/usr/share/i2p/lib/jetty-webapp.jar:
/usr/share/i2p/lib/jetty-xml.jar:
/usr/share/i2p/lib/jstl.jar:
/usr/share/i2p/lib/mstreaming.jar:
/usr/share/i2p/lib/org.mortbay.jetty.jar:
/usr/share/i2p/lib/org.mortbay.jmx.jar:
/usr/share/i2p/lib/routerconsole.jar:
/usr/share/i2p/lib/router.jar:
     5598  01-16-2020 00:20   org/cybergarage/upnp/ssdp/HTTPMUSocket.class
/usr/share/i2p/lib/sam.jar:
/usr/share/i2p/lib/standard.jar:
/usr/share/i2p/lib/streaming.jar:
/usr/share/i2p/lib/systray.jar:

You... probably wouldn’t want to do that by hand.

Find and search inside all compressed files

I’d like to scan my hard drive for all compressed file collections like zip, gzip, bzip, and others, and have their contents searched for certain file types (such as images). Antivirus programs do it, so I believe there should be a way.

@Rinzwind that will search within the files of the archive, not within the list of files. It will find files containing foo but not foo.png .

2 Answers

The simplest approach would be to list the contents of the archive and look for files of the relevant extension. For example, with a zip file:

$ zip -sf foo.zip | grep -iE '\.png$|\.jpg$'
  file1.jpg
  file1.png
  file2.jpg
  file2.png

The -sf option tells zip to list the files contained in an archive. Then, the grep will look for a .png or .jpg that are at the end of the line ( $ ). The -E enables extended regular expressions, so we can use | as OR and the -i makes the matching case insensitive.
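
When the list of extensions grows, the pattern can be generated rather than typed out. A small sketch, where ext_regex is a hypothetical helper name and not part of zip or grep:

```sh
# Hypothetical helper: turn the arguments "png jpg" into the extended
# regular expression \.(png|jpg)$
ext_regex() {
    # "$*" joins the arguments with the first character of IFS; the
    # subshell keeps the IFS change from leaking into the caller.
    (IFS='|'; printf '\\.(%s)$\n' "$*")
}
```

It would be used as: zip -sf foo.zip | grep -iE "$(ext_regex png jpg jpeg gif)"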

However, each archive tool has a different command to list the contents. I’ve written a script that can deal with most of the more popular ones. If you save that script as list_compressed.sh , you could then run:

list_compressed.sh | grep -iE '\.png$|\.jpg$|\.jpeg$|\.gif$|\.tif$|\.tiff$' 

That would show you the most common image types. Note that this approach assumes that the file type can be determined by the file’s extension. It will not find image files that don’t have an extension and it will not recognize files with the wrong extension. There is no way to deal with that without actually extracting the files from the archive and running file on each of them.
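
For zip archives, the extract-and-run-file route does not have to write anything to disk: unzip -p streams a member to standard output and file can read from a pipe. A minimal sketch under those assumptions; is_image_mime and images_by_content are illustrative names, and every member is fully decompressed, so this is slow on large archives:

```sh
# Succeeds if the MIME type passed as $1 is an image type.
is_image_mime() {
    case "$1" in
        image/*) return 0 ;;
        *)       return 1 ;;
    esac
}

# Print the members of zip archive $1 whose content looks like an image,
# whatever their extension says.
images_by_content() {
    unzip -Z1 "$1" | while IFS= read -r member; do
        case "$member" in */) continue ;; esac    # skip directory entries
        # Member names are treated as wildcard patterns by unzip; names
        # containing * ? or [ are not handled in this sketch.
        mime=$(unzip -p "$1" "$member" </dev/null | file -b --mime-type -)
        if is_image_mime "$mime"; then
            printf '%s\n' "$member"
        fi
    done
}
```

Usage would be: images_by_content foo.zip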

If you want to find all archives that contain image files on your hard drive, combine the above with find :

find / \( -name '*.gz' -o -name '*.tgz' -o -name '*.zip' \) -print0 |
  while IFS= read -r -d '' arch; do
    list_compressed.sh "$arch" | grep -qiE '\.png$|\.jpg$|\.jpeg$|\.gif$|\.tif$|\.tiff$' &&
      echo "$arch contains image(s)"
  done

The find command searches for all .gz, .tgz or .zip files (you can add as many extensions as you like) and passes each one to my script. The -q suppresses grep’s normal output, so grep itself prints nothing. The && echo prints the archive’s name only if the grep was successful.
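
The list_compressed.sh script itself is not reproduced here. As a rough idea of what such a dispatcher might do, and not the author's actual script, the following sketch picks a listing command by file extension; the function name list_archive and the set of supported extensions are assumptions:

```sh
# Hypothetical stand-in for list_compressed.sh: print the member names of
# one archive, choosing the listing command from the file extension.
list_archive() {
    case "$1" in
        *.zip|*.jar)      unzip -Z1 "$1" ;;
        *.tar.gz|*.tgz)   tar -tzf "$1" ;;
        *.tar.bz2|*.tbz2) tar -tjf "$1" ;;
        *.tar)            tar -tf "$1" ;;
        # A plain .gz holds a single file; report its name without .gz.
        *.gz)             basename "$1" .gz ;;
        *)  echo "unsupported archive type: $1" >&2
            return 2 ;;
    esac
}
```

The order of the cases matters: *.tar.gz has to be tested before the more general *.gz.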

Finding a file within recursive directory of zip files

What I have tried
1. I know that I can list files recursively pretty easily:

2. I know that I can list files inside zip archives:

How do I find a specific file within a directory structure of zip files?

3 Answers

You can omit using find for single-level (or recursive in bash 4 with globstar ) searches of .zip files using a for loop approach:

for i in *.zip; do grep -iq "mylostfile" < <( unzip -l "$i" ) && echo "$i"; done

For recursive searching in bash 4:

shopt -s globstar
for i in **/*.zip; do grep -iq "mylostfile" < <( unzip -l "$i" ) && echo "$i"; done

You can use xargs to process the output of find or you can do something like the following:

find . -type f -name '*zip' -exec sh -c 'unzip -l "$1" | grep -q myLostfile' sh {} \; -print

which starts searching in . for files that match *zip, then runs unzip -l on each one and searches the listing for your filename. If the filename is found, it prints the name of the zip file that matched.

Usage: unzip [-Z] [-opts[modifiers]] file[.zip] [list] [-x xlist] [-d exdir]
  Default action is to extract files in list, except those in xlist, to exdir;
  file[.zip] may be a wildcard.  -Z => ZipInfo mode ("unzip -Z" for usage).
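
Grepping the unzip -l listing matches substrings, so myLostfile.bak or a directory called myLostfile would also count as a hit. If an exact file name is wanted, something along these lines might do; has_member and zips_containing are illustrative names, and unzip -Z1 is used because it prints bare member paths, one per line:

```sh
# Succeeds if any member path on stdin has exactly the basename $1.
has_member() {
    # The "" concatenation forces a string comparison in awk.
    awk -F/ -v name="$1" '($NF "") == (name "") { found = 1 }
                          END { exit !found }'
}

# Print every zip file under directory $2 (default: .) that contains a
# file named $1. File names containing newlines are not handled.
zips_containing() {
    find "${2:-.}" -type f -name '*.zip' | while IFS= read -r zip; do
        unzip -Z1 "$zip" 2>/dev/null </dev/null | has_member "$1" &&
            printf '%s\n' "$zip"
    done
}
```

For example: zips_containing myLostfile .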

Some have suggested to use ugrep to search zip files and tarballs. To find the zip files that contain a mylostfile file, specify it as a -g glob pattern like so:

With the empty regex pattern '', this recursively searches all files under the working directory, including any zip, tar, and cpio/pax archives, for mylostfile. If you only want to search the zip files located in the working directory:

ugrep -z -l -g'myLostfile' '' *.zip 

Zipgrep – How to search through zip files in Linux

Zipgrep is a handy tool for searching through a zip archive for a specified pattern. It is just a shell script that leverages unzip and egrep, and it processes the given expressions just as egrep does.
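
To make the wrapper idea concrete, here is a much simplified sketch of the same mechanism. It is not the real zipgrep script: mini_zipgrep and prefix_lines are made-up names, and options such as -x or grep flags are not handled.

```sh
# Prefix every input line with "<name>:", the way grep labels matches
# when it searches several files.
prefix_lines() {
    awk -v name="$1" '{ print name ":" $0 }'
}

# Simplified stand-in for zipgrep: search the contents of every member of
# archive $2 for the extended regular expression $1.
mini_zipgrep() {
    unzip -Z1 "$2" | while IFS= read -r member; do
        case "$member" in */) continue ;; esac    # skip directory entries
        unzip -p "$2" "$member" </dev/null |
            grep -E -- "$1" |
            prefix_lines "$member"
    done
}
```

It takes its arguments in the same order as zipgrep, pattern first and archive second, for example: mini_zipgrep "Hacking" library.zip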

Zipgrep – Usage

Since it’s just a wrapper script for unzip and grep, its input and output handling is very similar to that of egrep. Zipgrep searches for text strings inside the files contained in the archive, not in the filenames that the zip archive contains. Note that the command syntax is: search pattern + archive name + optional list of filenames to search.

For example, suppose you have a file called “h-books” with the following list:

  • Metasploit: The penetration tester’s guide
  • Hacking: The Art of Exploitation, 2nd Edition
  • The Basics of Hacking and Penetration Testing
  • CEH Certified Ethical Hackers All-in-one Exam Guide
  • Black Hat Python: Python Programming for Hackers and pen-testers

And you also have another file called “l-books” with the following list:

  • How Linux Works: What Every Superuser Should Know by Brian Ward
  • The Linux Programming Interface: A Linux and UNIX System Programming Handbook by Michael Kerrisk
  • Unix and Linux System Administration Handbook by Evi Nemeth
  • Linux in a Nutshell: A Desktop Quick Reference
  • The Linux Command Line – A Complete Introduction by William E. Shotts

Now these two files have been compressed using the zip format into a file called “library.zip”. You can use the zipgrep command to find patterns within all the files within the zip file.

For example, if you wanted to search for all the occurrences of “Hacking”, you would use the following command:

$ zipgrep "Hacking" library.zip
library/h-books:Hacking: The Art of Exploitation, 2nd Edition
library/h-books:The Basics of Hacking and Penetration Testing

As you can see, you can use any expression with zipgrep that you would use with grep or egrep. This makes zipgrep very handy: searching inside zip files is much easier than decompressing, searching and then compressing again.

If you only want to search specific files within the zip archive, you can list those file names after the archive name, as in the command shown below:

zipgrep "Linux" library.zip library/l-books

If you want to exclude certain files from the search, you can use the following command as an example:

zipgrep "Program" library.zip -x library/h-books

This searches all files within library.zip except the excluded file “h-books”.
