Nested for loop
With a combination of libarchive's bsdtar and GNU tar, you can list the contents of those nested archives without having to extract them on disk:
for f in *.zip; do
  bsdtar -cf - --include='*.zip' "@$f" | tar -xf - --to-command='bsdtar tvf -'
done
GNU tar can pipe members of archives to commands upon extraction with --to-command, but only supports tar archive formats.
bsdtar supports all sorts of archive formats besides tar (including zip) and, AFAIK, doesn't have an equivalent of GNU tar's --to-command, but it can convert archive formats on the fly.
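When several nested archives are listed in a row, it helps to see which listing belongs to which member. GNU tar runs the --to-command string through the shell and exports the current member's name in the TAR_FILENAME environment variable, so the pipeline above can print a header per nested archive. A minimal sketch, assuming `tar` is GNU tar and `bsdtar` is libarchive's; `list_nested` is a hypothetical function name, not part of either tool:

```shell
# list_nested is a hypothetical helper name, not part of bsdtar or GNU tar.
# Assumes `tar` is GNU tar and `bsdtar` is libarchive's bsdtar.
# Usage: list_nested ./*.zip
list_nested() {
  for f in "$@"; do
    printf '### %s\n' "$f"
    # Convert the outer zip into a tar stream that keeps only its *.zip
    # members, then have GNU tar pipe each member to bsdtar for listing.
    # GNU tar sets TAR_FILENAME to the member name for the command.
    bsdtar -cf - --include='*.zip' "@$f" |
      tar -xf - --to-command='printf "== %s\n" "$TAR_FILENAME"; bsdtar -tvf -'
  done
}
```

Passing the archives as ./*.zip rather than *.zip keeps a file name starting with a dash from being mistaken for an option.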
You can't do it without actually unzipping the top files in a sub-folder.
Something like this:
set -e
for f in *.zip
do
    n=$(basename -- "${f}" .zip)
    mkdir -- "${n}"
    cd -- "${n}"
    unzip ../"${f}"
    for p in *.zip
    do
        unzip -l -- "${p}"
    done
    cd ..
    rm -rf -- "${n}"
done
You should probably verify whether ${n} already exists and, if so, generate an error. You could also use a temporary filename for the sub-directory:
dir=$(mktemp -d zip-files.XXXXXX)
Then do cd "${dir}" and rm -rf "${dir}" once done.
Updates:
The set -e is used to make sure that the script stops if something goes wrong. In particular, if the mkdir -- "${n}" fails, the cd -- "${n}" will fail too, so the cd .. would take you to the wrong directory level, and that's where the rm -rf -- "${n}" becomes dangerous.
Another way to make the cd .. statement safer is to memorize that directory before the for loop and use that path like so:
topdir=$(pwd)
for ...
do
    ...
    cd "$topdir" # instead of `cd ..`
    ...
done
That way the rm -rf -- "${n}" will only operate in $topdir.
The use of the temporary directory will also make things a lot safer since that way whatever the filenames in the top zip file, the directory creation/removal will work as expected.
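Putting the two suggestions together (a mktemp directory and an absolute path to return to) could look like the sketch below. It assumes unzip and a mktemp that supports -d are installed; `list_nested_zips` is a hypothetical function name, and explicit `|| return` checks stand in for set -e so the function can also be sourced into an interactive shell:

```shell
# list_nested_zips is a hypothetical helper name.
# Assumes unzip and a mktemp that supports -d are installed.
list_nested_zips() {
    topdir=$(pwd) || return
    for f in ./*.zip
    do
        [ -e "$f" ] || continue          # no top-level zip files at all
        # Unique scratch directory, so names inside the archive cannot collide.
        dir=$(mktemp -d "$topdir/zip-files.XXXXXX") || return
        # Stop before any further step if the directory cannot be entered.
        cd -- "$dir" || return
        # On extraction failure, go back, clean up, and report the error.
        unzip -q "$topdir/$f" || { cd -- "$topdir"; rm -rf -- "$dir"; return 1; }
        for p in ./*.zip
        do
            [ -e "$p" ] || continue      # this archive holds no nested zips
            unzip -l "$p"
        done
        cd -- "$topdir" || return        # absolute path instead of `cd ..`
        rm -rf -- "$dir"
    done
}
```

Run it from the directory that holds the top-level zip files. Because rm -rf only ever receives the absolute path that mktemp returned, a failed cd cannot redirect it to another directory.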
If GNU Parallel is installed:
extract_list() {
    mkdir "$1"
    cd "$1"
    unzip ../"$1".zip
    parallel unzip -l ::: *.zip
    cd ..
    rm -rf "$1"
}
export -f extract_list
parallel extract_list {.} ::: *.zip