What is the difference between double-quoting and not double-quoting an array in Bash?
There is absolutely no problem with quoting an array expansion.
And, of course, there is no problem with not quoting it either, as long as you know and accept the consequences. Any unquoted expansion is subject to word splitting and globbing. In your code, the unquoted ${filelist[…]} is subject to removal of IFS characters (and to splitting, if the string contains any <space>, <tab>, or <newline>).
That is what leaving the expansion unquoted does: it removes the trailing <newline>.
What creates this problem is that you are using readarray without removing the trailing delimiter from each array element. Doing that keeps a trailing <newline>, which shows up in the error message.
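A minimal sketch of the effect, assuming Bash 4 or later and the default IFS. The input comes from printf rather than ls so the result is predictable, and the array names lines, quoted, and unquoted are made up for this illustration:

```shell
#!/usr/bin/env bash
# Read two lines WITHOUT -t, so each element keeps its trailing newline.
readarray lines < <(printf '%s\n' 'file1' 'file2')

# Quoted expansion: the element is passed through untouched.
quoted=( "${lines[0]}" )
# Unquoted expansion: word splitting strips the trailing newline.
unquoted=( ${lines[0]} )

printf 'quoted length:   %d\n' "${#quoted[0]}"    # 6: "file1" plus newline
printf 'unquoted length: %d\n' "${#unquoted[0]}"  # 5: "file1"
```

The one-character difference in length is the newline that ends up in the error message when the expansion is quoted.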
What you could have used is:
readarray -t filelist < <(ls -A)
The -t option removes the trailing newline from each line read, so no array element ends in one:
-t Remove a trailing delim (default newline) from each line read.
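As a quick check of what -t does, here is a small sketch that again uses printf as a stand-in for ls. The array name names and the sample file names are hypothetical:

```shell
#!/usr/bin/env bash
# With -t, each element holds exactly the text of one line, without the newline.
readarray -t names < <(printf '%s\n' 'file1' 'my file')

# Angle brackets make any stray whitespace around a name visible.
printf '<%s>\n' "${names[@]}"
```

Both names should appear tightly wrapped in the brackets, and the name containing a space stays a single element as long as the expansion is quoted.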
But your code has some additional issues.
There is no need to declare or empty the array filelist. readarray does that by default. It does need to be done in some other cases.
There is no need to parse the output of ls; in fact, that is a bad idea. The easiest way to get a list of files in an array is simply:
filelist=( ./* )
And, to make it even better, it would be a good idea to avoid directories:
for file in ./*; do [[ -f $file ]] && filelist+=( "$file" ); done
In the loop, the value of the variable $file is what should be used:
for file in "${filelist[@]}"; do sha256sum "$file" | head -c 64; done
Unless you use
for file in "${!filelist[@]}"; do
which will list the keys of the array.
The whole list could be processed with just one call to sha256sum:
sha256sum "${filelist[@]}" | cut -c -64
The improved script is:
filelist=() # declare filelist as an array and empty it.
for file in ./*; do
if [[ -f $file ]]; then
filelist+=( "$file" )
fi
done
declare -r filelist # declare filelist as readonly.
sha256sum "${filelist[@]}" | cut -c -64
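To try this approach without touching real files, the sketch below runs the same loop in a throwaway directory. It assumes GNU sha256sum and mktemp are available; the directory contents, the workdir variable, and the hashes array are invented for the demonstration, and the readonly declaration is left out to keep it short:

```shell
#!/usr/bin/env bash
# Build a scratch directory with two regular files and one subdirectory.
workdir=$(mktemp -d)
printf 'hello\n' > "$workdir/plain.txt"
printf 'world\n' > "$workdir/with space.txt"
mkdir "$workdir/subdir"

# Collect regular files only; the subdirectory is skipped by the -f test.
filelist=()
for file in "$workdir"/*; do
    if [[ -f $file ]]; then
        filelist+=( "$file" )
    fi
done

# One sha256sum call for all files; keep the first 64 characters of each line.
readarray -t hashes < <(sha256sum "${filelist[@]}" | cut -c -64)
printf '%s\n' "${hashes[@]}"

rm -rf "$workdir"
```

Because the array is expanded inside double quotes, the file name containing a space reaches sha256sum as a single argument.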
I'm not worried about word splitting in this case
Well, in fact, you're relying on it to remove the trailing newline from array entries!
Bash's readarray (mapfile) leaves the delimiters in by default. Neither the man page nor the command-line help seems to say that explicitly, but there is an option to remove the delimiter, so by implication the default is to keep it:
-t Remove a trailing delim (default newline) from each line read.
So, the actual string in the array is file1[newline].
Without quotes, word splitting removes trailing whitespace, fixing the newline. But if you had filenames with spaces in them, word splitting would mess them up, as usual. Double quoting the array prevents that. To answer your first question: the best practice is to double-quote; here we just have an unwanted extra newline.
(Double quoting an array or $@ is the slightly confusing exceptional case where a double-quoted string results in multiple words, one for each array element.)
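The difference is easy to see by counting arguments. This is only a sketch: the helper count_args and the sample file names are hypothetical, and the default IFS is assumed:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print how many arguments it received.
count_args() { echo "$#"; }

files=( 'my file.txt' 'other.txt' )

# Quoted: one word per array element.
quoted_count=$(count_args "${files[@]}")
# Unquoted: 'my file.txt' is split at the space into two words.
unquoted_count=$(count_args ${files[@]})

printf 'quoted:   %s\n' "$quoted_count"    # 2
printf 'unquoted: %s\n' "$unquoted_count"  # 3
```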
You also have ${filelist[$file]} in the sha256sum command line. That won't work: file already contains the value received from the array, not the index.
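To illustrate the distinction between values and indices, here is a short sketch with a made-up two-element array. Iterating over "${!filelist[@]}" yields the indices, and only then is an index lookup such as ${filelist[$key]} appropriate:

```shell
#!/usr/bin/env bash
# Sample array; the file names are placeholders.
filelist=( 'a.txt' 'b c.txt' )

# Iterating over the values: $file is already the file name.
values=()
for file in "${filelist[@]}"; do
    values+=( "$file" )
done

# Iterating over the keys: $key is an index, so look the value up explicitly.
keys=()
for key in "${!filelist[@]}"; do
    keys+=( "$key" )
    printf '%s -> %s\n' "$key" "${filelist[$key]}"
done
```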
As a minimal modification, this might work:
declare -a filelist
readarray -t filelist < <(ls -A)
readonly filelist
for file in "${filelist[@]}"; do
sha256sum "$file" | head -c 64
done
(I don't think the explicit declare is actually necessary either.)
The issue above has nothing to do with ls per se. You'd get the same issue if you had filenames stored in a file, one per line, and used readarray/mapfile to read them without the -t option. (Or if you read the output of find, but in that case, you might be able to use find -exec instead.)
Of course, this is a useless use of ls, and some versions of ls might mangle your filenames on output. (I don't think GNU ls does that when outputting to a pipe.)
In Bash, you could instead fill the array with a glob:
shopt -s dotglob
filelist=(*)
for file in "${filelist[@]}"; do ...
Or just run the loop on the glob without storing to an array:
shopt -s dotglob
for file in *; do ...
Note that you do need shopt -s dotglob to get * to match dotfiles, and that's shell-dependent.
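A small sketch of the dotglob behaviour, run in a scratch directory so it does not depend on the current directory's contents. It assumes Bash and mktemp; the file names and the array names visible_only and with_dotfiles are made up:

```shell
#!/usr/bin/env bash
# Scratch directory with one dotfile and one ordinary file.
orig_dir=$PWD
tmpdir=$(mktemp -d)
touch "$tmpdir/.hidden" "$tmpdir/visible"
cd "$tmpdir" || exit 1

shopt -u dotglob
visible_only=( * )     # dotfiles are skipped by default
shopt -s dotglob
with_dotfiles=( * )    # dotfiles are now matched; . and .. are still excluded

printf 'without dotglob: %d\n' "${#visible_only[@]}"   # 1
printf 'with dotglob:    %d\n' "${#with_dotfiles[@]}"  # 2

cd "$orig_dir" || exit 1
rm -rf "$tmpdir"
```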
Part of the problem, based on your code snippet, may be that you're parsing the output of ls. This is dangerous, fraught with issues, and best avoided.
Rather than
declare -a filelist
readarray filelist < <(ls -A)
readonly filelist
for file in "${filelist[@]}"; do
it is much simpler (and safer!) to:
for file in *; do
In this case:
for file in *; do
sha256sum "${file}" | head -c 64
done
readarray, as you are invoking it, is also helpfully keeping the literal data passed into it, including the newlines. So when you echo the quoted value, the newline is preserved. When you do not quote it, the shell treats the newline as inter-token whitespace and discards it. This is also why sha256sum is failing. If you have a file called foo, readarray stores a value of foo\n, which does not correspond to a file. Unquoting this "fixes" the problem by accidentally throwing out part of your variable's value.
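The mismatch can be reproduced with a file test instead of sha256sum. This is a sketch under the assumption that Bash and mktemp are available; the variable names raw, trimmed, raw_result, and trimmed_result are invented for the example:

```shell
#!/usr/bin/env bash
# Scratch directory containing a single file named foo.
tmpdir=$(mktemp -d)
touch "$tmpdir/foo"

readarray raw < <(printf 'foo\n')         # element is "foo" plus a newline
readarray -t trimmed < <(printf 'foo\n')  # element is exactly "foo"

# Inside [[ ]] no word splitting happens, so the newline stays in the path.
if [[ -e $tmpdir/${raw[0]} ]]; then raw_result=found; else raw_result=missing; fi
if [[ -e $tmpdir/${trimmed[0]} ]]; then trimmed_result=found; else trimmed_result=missing; fi

printf 'with newline: %s\n' "$raw_result"      # missing
printf 'with -t:      %s\n' "$trimmed_result"  # found

rm -rf "$tmpdir"
```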