How to make xargs handle spaces and special chars from cat?
Use -d '\n' with your xargs command:
cat file | xargs -d '\n' -l1 mkdir
From manpage:
-d delim
Input items are terminated by the specified character. Quotes and backslash are not special; every
character in the input is taken literally. Disables the end-of-file string, which is treated like any
other argument. This can be used when the input consists of simply newline-separated items, although
it is almost always better to design your program to use --null where this is possible. The specified
delimiter may be a single character, a C-style character escape such as \n, or an octal or hexadecimal
escape code. Octal and hexadecimal escape codes are understood as for the printf command. Multibyte
characters are not supported.
Example output:
$ ls
file
$ cat file
Long Name One (001)
Long Name Two (201)
Long Name Three (123)
$ cat file | xargs -d '\n' -l1 mkdir
$ ls -1
file
Long Name One (001)
Long Name Three (123)
Long Name Two (201)
If your xargs implementation supports the -0 option:
tr '\n' '\0' <file | xargs -0 -l1 mkdir
POSIXly:
while IFS= read -r file; do
  mkdir -p -- "$file"
done <file
(Note that using a while loop to process text is generally considered bad practice in shell scripts.)
xargs expects a very special input format in which arguments are delimited by blanks or newlines (sometimes other forms of vertical whitespace, sometimes dependent on the current locale), and in which single quotes, double quotes and backslashes can be used to escape them (but in a different way from shell quoting).
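As a rough illustration of that input format, the sketch below feeds xargs one line containing a double-quoted part and a backslash-escaped blank, then prints each resulting argument in angle brackets. The sample text and the sh -c wrapper are made up for the demonstration.

```shell
# One input line: a double-quoted part and a backslash-escaped blank.
# xargs strips the quoting and hands the results to the inline script.
out=$(printf '%s\n' '"a b" c\ d' |
  xargs sh -c 'printf "<%s>\n" "$@"' sh)
printf '%s\n' "$out"
```

The line is split into two arguments, a b and c d, rather than four blank-separated words.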
-l1 does not pass each line of input as a single argument to mkdir. It runs one mkdir invocation per line of input, but the words on that line are still split into separate arguments to mkdir.
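A small sketch of that behaviour, using -L 1 (the standard spelling of -l1) and an inline script that prints its argument count in place of mkdir. The sample lines are hypothetical.

```shell
# Two input lines: the first has three blank-separated words, the second one.
# The inline script prints how many arguments each invocation received.
out=$(printf '%s\n' 'Long Name One' 'Two' |
  xargs -L 1 sh -c 'echo "$#"' sh)
printf '%s\n' "$out"
```

There is one invocation per line, but the first one receives three arguments, not one.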
The GNU implementation of xargs added a -0 option decades ago to accept NUL-delimited input. That's the most obvious way to separate words that are going to end up being arguments to a command, because the NUL character happens to be the only character that cannot occur in a command argument or file name (your chosen list format, which puts one file per line, can't represent all possible file names, as it doesn't allow a newline in a file name).
That -0 has been copied by several other xargs implementations, but not all.
With those you can do:
<file tr '\n' '\0' | xargs -0 mkdir -p --
That will call mkdir as few times as possible, with as many arguments as possible.
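To see the batching without creating directories, here is a minimal sketch that swaps mkdir for an inline script reporting what it received. It assumes an xargs that supports -0, and the names are taken from the sample file shown earlier.

```shell
# Convert newline-separated names to NUL-separated ones, then let xargs
# pass them all to a single invocation of the inline script.
out=$(printf '%s\n' 'Long Name One (001)' 'Long Name Two (201)' |
  tr '\n' '\0' |
  xargs -0 sh -c 'echo "$# arguments"; printf "<%s>\n" "$@"' sh)
printf '%s\n' "$out"
```

Both names arrive intact, blanks and parentheses included, in one call.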
But note that if file is empty, mkdir will still be run, and it will report a usage error because of the missing argument. GNU xargs added a -r option for that, which has been copied by a few other implementations.
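The difference can be checked with a harmless command. This sketch assumes GNU xargs (some other implementations already skip the command on empty input) and uses echo as a stand-in for mkdir.

```shell
# Empty input: without -r, GNU xargs still runs the command once.
without_r=$(xargs echo ran </dev/null)
# With -r, the command is not run at all when there is no input.
with_r=$(xargs -r echo ran </dev/null)
printf 'without -r: [%s]\nwith -r: [%s]\n' "$without_r" "$with_r"
```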
GNU xargs also added (later) a -d option to specify arbitrary delimiters, but I don't think any other implementation copied it. With GNU xargs, the best way is:
xargs -rd '\n' -a file mkdir -p --
Because the file is passed with -a (also a GNU extension) instead of on stdin, mkdir's stdin is preserved.
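A sketch of what "preserved" means here, assuming GNU xargs. The list is a temporary file created for the demonstration, and cat stands in for a command that reads its stdin.

```shell
# Hypothetical one-line list, stored in a temporary file.
list=$(mktemp)
printf '%s\n' 'some item' > "$list"
# xargs reads the items from the file given to -a, so the command it runs
# inherits the pipe as stdin. The inline script ignores its argument and
# copies that stdin to stdout.
out=$(echo 'from stdin' | xargs -rd '\n' -a "$list" sh -c 'cat' sh)
rm -f "$list"
printf '%s\n' "$out"
```

Had the list been piped to xargs instead, the command's stdin would have been redirected from /dev/null by GNU xargs.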
POSIXly, you'd need to post-process the input to put it in the format expected by xargs. You could do it, for instance, with:
<file sed 's/"/"\\""/g; s/^/"/; s/$/"/' | xargs mkdir -p --
Here we enclose each line in double quotes and escape each " as "\"" before feeding it to xargs.
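As a quick check of that quoting scheme, the sketch below runs a made-up line containing double quotes through the same sed expression and confirms that xargs turns it back into a single, unchanged argument. The variable names are illustrative only.

```shell
# Hypothetical input line containing blanks and double quotes.
line='Say "hi" (001)'
# Same sed expression as above: wrap the line in double quotes and
# rewrite each embedded " as "\"" (close quote, escaped quote, reopen).
quoted=$(printf '%s\n' "$line" | sed 's/"/"\\""/g; s/^/"/; s/$/"/')
# Let xargs parse the quoted form and print the argument it produces.
back=$(printf '%s\n' "$quoted" | xargs sh -c 'printf "%s\n" "$@"' sh)
printf 'quoted: %s\nparsed: %s\n' "$quoted" "$back"
```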
But beware of possible limitations:
- the error when the file is empty already mentioned above
- it may fail with some implementations (including of sed) if the content of file is not valid text in the current locale. If file contains file names encoded in more than one charset, or in a charset different from the locale's, you can set the locale to C, which should help.
- some xargs implementations have ridiculously low limits on the maximum length of an argument (can be as low as 255 bytes).
To work around the error on empty input, you can write:
<file sed 's/"/"\\""/g; s/^/"/; s/$/"/' |
  xargs sh -c '[ "$#" -eq 0 ] || exec mkdir -p -- "$@"' sh