Recursive bash script to collect information about each file in a directory structure
While find solutions are simple and powerful, I decided to create a more complicated solution based on this interesting function, which I saw a few days ago.
- More explanations and two other scripts based on the current one are provided here.
1. Create an executable script file called walk in /usr/local/bin, so that it is accessible as a shell command:
sudo touch /usr/local/bin/walk
sudo chmod +x /usr/local/bin/walk
sudo nano /usr/local/bin/walk
- Copy the script content below and paste it into nano: Shift+Insert to paste; Ctrl+O and Enter to save; Ctrl+X to exit.
2. The content of the script walk is:
#!/bin/bash
# Colourise the output
RED='\033[0;31m' # Red
GRE='\033[0;32m' # Green
YEL='\033[1;33m' # Yellow
NCL='\033[0m' # No Color
file_specification() {
    FILE_NAME="$(basename "${entry}")"
    DIR="$(dirname "${entry}")"
    NAME="${FILE_NAME%.*}"
    EXT="${FILE_NAME##*.}"
    SIZE="$(du -sh "${entry}" | cut -f1)"
    printf "%*s${GRE}%s${NCL}\n" $((indent+4)) '' "${entry}"
    printf "%*s\tFile name:\t${YEL}%s${NCL}\n" $((indent+4)) '' "$FILE_NAME"
    printf "%*s\tDirectory:\t${YEL}%s${NCL}\n" $((indent+4)) '' "$DIR"
    printf "%*s\tName only:\t${YEL}%s${NCL}\n" $((indent+4)) '' "$NAME"
    printf "%*s\tExtension:\t${YEL}%s${NCL}\n" $((indent+4)) '' "$EXT"
    printf "%*s\tFile size:\t${YEL}%s${NCL}\n" $((indent+4)) '' "$SIZE"
}
walk() {
    local indent="${2:-0}"
    printf "\n%*s${RED}%s${NCL}\n\n" "$indent" '' "$1"
    # If the entry is a file do some operations
    for entry in "$1"/*; do [[ -f "$entry" ]] && file_specification; done
    # If the entry is a directory call walk() == create recursion
    for entry in "$1"/*; do [[ -d "$entry" ]] && walk "$entry" $((indent+4)); done
}
# If the path is empty use the current directory, otherwise cd into it (this converts relative to absolute);
# abort if cd fails, so that walk() is never called with an empty path; Exec walk()
[[ -n "${1}" ]] && { cd "${1}" || exit 1; }
ABS_PATH="${PWD}"
walk "${ABS_PATH}"
echo
3. Explanation:
The main mechanism of the walk() function is pretty well described by Zanna in her answer, so I will describe only the new part. Within the walk() function I've added this loop:
for entry in "$1"/*; do [[ -f "$entry" ]] && file_specification; done
This means that the function file_specification() is executed for each $entry that is a file.
The function file_specification() has two parts. The first part gets data related to the file: name, path, size, etc. The second part outputs the data in a well-formatted form, using the printf command. If you want to tweak the script, you should read about this command, for example in this article.
The function file_specification() is a good place to put any specific command that should be executed for each file. Use this format:
command "${entry}"
Or you can save the output of the command in a variable, and then printf this variable:
MY_VAR="$(command "${entry}")"
printf "%*s\tFile size:\t${YEL}%s${NCL}\n" $((indent+4)) '' "$MY_VAR"
Or printf the output of the command directly:
printf "%*s\tFile size:\t${YEL}%s${NCL}\n" $((indent+4)) '' "$(command "${entry}")"
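As an illustration of this pattern, here is a minimal, self-contained sketch that plugs one concrete command into the same printf format. The helper name line_count and the choice of wc -l are assumptions made for the example and are not part of the original script; entry and indent are set by hand here, whereas in walk they come from the loop and from walk() itself.

```bash
#!/bin/bash
# Sketch only: line_count is a hypothetical helper, not part of the walk script.
YEL='\033[1;33m' # Yellow
NCL='\033[0m'    # No Color

line_count() {
    # Count the lines of the file named by the global $entry;
    # tr strips the padding that some wc implementations add.
    local lines
    lines="$(wc -l < "${entry}" | tr -d ' ')"
    printf "%*s\tLine count:\t${YEL}%s${NCL}\n" $((indent+4)) '' "$lines"
}

# Stand-alone demonstration with a temporary three-line file
indent=0
entry="$(mktemp)"
printf 'one\ntwo\nthree\n' > "$entry"
line_count
rm -f "$entry"
```

Inside the real script, the body of line_count would simply become two more lines at the end of file_specification().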
The section at the beginning, called Colourise the output, initialises a few variables that are used within the printf command to colourise the output. You can find more about this here.
At the bottom of the script there is an additional condition that deals with absolute and relative paths.
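The indentation in every printf call of the script relies on the %*s conversion: the * takes the field width from the next argument, and the empty string '' is then padded to that width with spaces. A minimal sketch of the mechanism follows; indent_line is a hypothetical name used only for this illustration.

```bash
#!/bin/bash
# Sketch only: indent_line is a hypothetical helper illustrating %*s.
indent_line() {
    # $1 = number of spaces, $2 = text to print after the padding
    printf "%*s%s\n" "$1" '' "$2"
}

indent_line 0 "top level"
indent_line 4 "first sub-directory"
indent_line 8 "second sub-directory"
```

This is why walk() only has to pass $((indent+4)) to the next level: the deeper the recursion, the wider the empty field in front of each line.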
4. Examples of usage:
To run walk for the current directory:
walk      # No argument is needed,
walk ./   # but you can also use this format
To run walk for any child directory:
walk <directory name>
walk ./<directory name>
walk <directory name>/<sub directory>
To run walk for any other directory:
walk /full/path/to/<directory name>
To create a text file based on the walk output:
walk > output.file
To create an output file without colour codes (source):
walk | sed -r "s/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[mGK]//g" > output.file
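An alternative to stripping the codes afterwards is not to emit them at all when the output is redirected. The sketch below assumes that colours are wanted only when standard output is a terminal, which it checks with the -t test. The names set_colours and show are hypothetical, and this is not how the original script behaves.

```bash
#!/bin/bash
# Sketch only: set_colours and show are hypothetical helpers.
set_colours() {
    if [ -t 1 ]; then          # stdout is a terminal: use colours
        RED='\033[0;31m'
        NCL='\033[0m'
    else                       # stdout is a file or a pipe: plain text
        RED=''
        NCL=''
    fi
}

show() {
    set_colours
    printf "${RED}%s${NCL}\n" "$1"
}

show "/path/to/directory"
```

With this approach a redirect such as > output.file would produce plain text without the sed step, at the cost of losing colours when piping into a pager.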
I'm slightly perplexed as to why nobody has posted it yet, but bash does indeed have recursive capabilities if you enable the globstar option and use the ** glob. As such, you can write an (almost) pure bash script that uses recursive globbing like this:
#!/usr/bin/env bash
shopt -s globstar
for i in ./**/*
do
    if [ -f "$i" ];
    then
        printf "Path: %s\n" "${i%/*}" # shortest suffix removal
        printf "Filename: %s\n" "${i##*/}" # longest prefix removal
        printf "Extension: %s\n" "${i##*.}"
        printf "Filesize: %s\n" "$(du -b "$i" | awk '{print $1}')"
        # some other command can go here
        printf "\n\n"
    fi
done
Notice that here we use parameter expansion to get the parts of the filename we want, and we're not relying on external commands except for getting the file size with du and cleaning the output with awk.
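One caveat of ${i##*.} is that it only removes text up to the last dot anywhere in the string. When the file name itself contains no dot, the result is not an extension: ./dir/README gives /dir/README, and a string with no dot at all is returned unchanged. A minimal sketch of a guard that still uses only parameter expansion follows; file_ext is a hypothetical helper, and treating dotfiles such as .bashrc as having no extension is an assumption.

```bash
#!/bin/bash
# Sketch only: file_ext is a hypothetical helper.
file_ext() {
    local name="${1##*/}"                    # drop the directory part
    case "$name" in
        ?*.*) printf '%s\n' "${name##*.}" ;; # a dot after at least one character
        *)    printf '\n' ;;                 # no extension (or a dotfile)
    esac
}

file_ext "./glibc/glibc-2.23/benchtests/sprintf-source.c"
file_ext "./dir/README"
file_ext "./archive.tar.gz"
```

In the loop above, the Extension line could then print "$(file_ext "$i")" instead of "${i##*.}".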
And as it traverses your directory tree, your output should look something like this:
Path: ./glibc/glibc-2.23/benchtests
Filename: sprintf-source.c
Extension: c
Filesize: 326
Standard rules of script usage apply: make sure it is executable with chmod +x ./myscript.sh and run it from the current directory via ./myscript.sh, or place it in ~/bin and run source ~/.profile.
You can use find to do the job:
find /path/ -type f -exec ls -alh {} \;
This will help you if you just want to list all files with their sizes.
-exec allows you to execute a custom command or script for each file.
\; ends the command and runs it once per file; use + instead if you want many file names to be appended to a single invocation of the command.
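The difference between the two terminators can be seen by counting how many times the command runs. The sketch below uses a throwaway directory created with mktemp and echo as the command; runs_per_file and runs_batched are hypothetical names.

```bash
#!/bin/bash
# Sketch only: runs_per_file and runs_batched are hypothetical helpers.
runs_per_file() {
    # \; starts one echo per file, so there is one output line per file
    find "$1" -type f -exec echo {} \; | wc -l | tr -d ' '
}

runs_batched() {
    # + appends the file names to as few echo invocations as possible
    find "$1" -type f -exec echo {} + | wc -l | tr -d ' '
}

tmp="$(mktemp -d)"
touch "$tmp/a" "$tmp/b" "$tmp/c"
runs_per_file "$tmp"
runs_batched "$tmp"
rm -rf "$tmp"
```

With three files this reports 3 invocations for \; and 1 for +, which is why + is usually faster on large trees when the command accepts several file names.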