Extract missing paths from bash array of paths

You've already discovered the immediate reason your code didn't do what you expected: ls reports errors on stderr (as POSIX specifies), and a pipe carries only stdout. You therefore got a mixture of normal output (which passed through your sed statements unchanged) and error output (which bypassed them entirely). I do not know why your ls output changed between calls; redirecting stdout to /dev/null should remove all "normal" output (the existing paths). The fix for this is not to shove stderr into stdout, though.
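The split between the two streams can be seen in a minimal sketch. The function `emit` and its messages are made up for illustration (a stand-in for ls, not its real output), and the second capture is there purely to show where each stream goes, not as a recommended fix:

```shell
# Hypothetical stand-in for ls: one line on stdout, one on stderr.
emit() {
  echo "existing-path"
  echo "emit: cannot access 'missing-path'" >&2
}

# Only stdout travels through the pipe, so sed sees just the first line.
piped=$(emit 2>/dev/null | sed 's/^/seen by sed: /')

# Point stderr at the capture first, then discard stdout:
# what remains is the error text alone.
errors=$(emit 2>&1 >/dev/null)

printf '%s\n' "$piped" "$errors"
```

Without the `2>/dev/null` in the first command, the error line would still appear on the terminal, but never inside `$piped`.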

Post-processing the output from ls is a dangerous idea if you want a reliable script. One good article on the topic is "Why you shouldn't parse the output of ls(1)", available on the wooledge.org site. An in-depth Q/A at the Unix & Linux site goes into some of the issues: "Why not parse ls (and what to do instead)?". The upshot is that UNIX filenames can contain almost any character, including spaces, tabs, newlines, single quotes, double quotes, backslashes, etc! For some quick examples, consider directories by these names, all of which are perfectly legal:

"No such file" (mkdir "No such file")
"ls: cannot access 'foo': No such file or directory" (mkdir "ls: cannot access 'foo': No such file or directory")
"directory
with
embedded
newlines" (mkdir $'directory\nwith\nembedded\nnewlines')

The first is an innocent directory that is wrongfully captured (from stdout) by the grep. The second is also wrongfully captured, but then further mangled into a completely different path -- which may or may not exist! -- by the sed statements. The third is one example of what happens when you pass the output of ls into line-oriented programs; if the directory doesn't exist, ls will say so on more than one line, which is probably how you ended up with two separate sed statements!
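The newline case is easy to reproduce in a scratch directory. This is a minimal sketch, assuming bash (for `$'...'` quoting and arrays) and an ls that writes names literally when its output is a pipe, which is the usual behaviour; the variable names are made up for illustration:

```shell
# Scratch directory so the experiment touches nothing else.
tmp=$(mktemp -d)
mkdir "$tmp"/$'directory\nwith\nembedded\nnewlines'

# A line-oriented reader sees one "entry" per line of ls output.
ls_lines=$(ls "$tmp" | wc -l)
ls_lines=$((ls_lines))   # drop any padding some wc implementations add

# A glob expands to exactly one array element per real directory entry.
entries=("$tmp"/*)
glob_count=${#entries[@]}

echo "lines from ls: $ls_lines, entries from glob: $glob_count"
rm -rf "$tmp"
```

One directory shows up as several lines to anything reading ls line by line, while the glob still counts it once.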

To distinguish "good paths" -- ones that exist and are readable -- from "bad paths", I would suggest looping over the array and building new arrays of each.

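# Start from empty arrays so earlier runs in the same shell don't leak in.
goodpaths=()
badpaths=()

for p in "${paths[@]}"
do
  # -r is true only if the path exists and is readable.
  if [ -r "$p" ]
  then
    goodpaths+=("$p")
  else
    badpaths+=("$p")
  fi
done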

You can then do whatever you like with each set:

printf 'Good path: -->%s<--\n' "${goodpaths[@]}"
echo
printf 'Bad path: -->%s<--\n' "${badpaths[@]}"
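For an end-to-end run, here is a minimal sketch against a scratch directory. The sample paths are made up for illustration, and it swaps in `-e` (exists at all) for `-r`, on the assumption that "missing" is what you care about rather than "unreadable". It also guards each report, because printf emits its format once even when the array is empty:

```shell
# Scratch directory with one real entry; the other path is deliberately absent.
tmp=$(mktemp -d)
mkdir "$tmp/No such file"

paths=("$tmp/No such file" "$tmp/missing dir")
goodpaths=()
badpaths=()

for p in "${paths[@]}"
do
  # -e asks only "does it exist?"; use -r instead to also require readability.
  if [ -e "$p" ]
  then
    goodpaths+=("$p")
  else
    badpaths+=("$p")
  fi
done

# Skip the report for an empty array to avoid a stray "--><--" line.
if [ "${#goodpaths[@]}" -gt 0 ]; then
  printf 'Good path: -->%s<--\n' "${goodpaths[@]}"
fi
if [ "${#badpaths[@]}" -gt 0 ]; then
  printf 'Bad path: -->%s<--\n' "${badpaths[@]}"
fi

rm -rf "$tmp"
```

The directory literally named "No such file" lands in goodpaths, which is exactly the case that grepping the ls output got wrong.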
