Reading character by character with bash read
You need to remove whitespace characters from the $IFS parameter for read to stop skipping leading and trailing ones (with -n1, the whitespace character, if any, would be both leading and trailing, so skipped):
while IFS= read -rn1 a; do printf %s "$a"; done
But even then bash's read will skip newline characters, which you can work around with:
while IFS= read -rn1 a; do printf %s "${a:-$'\n'}"; done
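As a rough illustration of the two fixes above, the loop can be wrapped in a function and fed input containing blanks, a tab, a backslash and newlines. The name copy_chars is made up for this sketch and is not part of any standard tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original answer): copy stdin to stdout
# one character at a time with read -n1.
copy_chars() {
    local a
    # IFS= keeps blanks, -r keeps backslashes, -n1 reads one character.
    while IFS= read -rn1 a; do
        # Newline is still the line delimiter here, so an empty $a means
        # a newline was consumed: put it back.
        printf %s "${a:-$'\n'}"
    done
}

printf ' a b\n\tc\\d\n' | copy_chars
```

If the copy works as intended, the output is identical to the input. Dropping IFS= from the loop should make the spaces and the tab disappear, which is the skipping described above.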
Though you could use IFS= read -d '' -rn1 instead, or even better IFS= read -N1 (added in 4.1, copied from ksh93 (added in o)), which is the command to read one character.
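A minimal sketch of the -N1 variant, assuming bash 4.1 or later. The function name show_chars is invented for this example, and -r is added so that backslashes are read literally. Each character is printed shell-quoted with printf %q, so blanks and newlines stay visible.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each input character on its own line,
# quoted with printf %q. Assumes bash 4.1 or later for read -N.
show_chars() {
    local c
    # -N1 returns exactly one character; newline is not treated as a
    # delimiter, so no ${c:-...} workaround is needed.
    while IFS= read -rN1 c; do
        printf '%q\n' "$c"
    done
}

printf 'a b\n' | show_chars
```

With the four-character input above, the loop should run four times, the last iteration receiving the newline itself rather than an empty string.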
Note that bash's read can't cope with NUL characters. And ksh93 has the same issues as bash.
With zsh:
while read -ku0 a; do print -rn -- "$a"; done
(zsh can cope with NUL characters).
Note that those read -k/n/N read a number of characters, not bytes. So for multibyte characters, they may have to read multiple bytes until a full character is read. If the input contains invalid characters, you may end up with a variable that contains a sequence of bytes that doesn't form valid characters and which the shell may end up counting as several characters. For instance in a UTF-8 locale:
$ printf '\375\200\200\200\200ABC' | bash -c '
IFS= read -rN1 a; echo "${#a}"'
6
That \375 would introduce a 6-byte UTF-8 character. However, the 6th byte (A) above is invalid for a UTF-8 character. You still end up with \375\200\200\200\200A in $a, which bash counts as 6 characters, though the first 5 are not really characters, just 5 bytes not forming part of any character.
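For valid input, the character/byte distinction can be observed with a sketch like the one below. The helper byte_len is hypothetical; it relies on the common bash idiom of switching to the C locale inside a function so that ${#1} counts bytes instead of characters.

```shell
#!/usr/bin/env bash
# Hypothetical helper: length of a string in bytes rather than characters.
byte_len() {
    local LC_ALL=C   # in the C locale every byte counts as one character
    printf '%s\n' "${#1}"
}

# \303\251 is the UTF-8 encoding of é (two bytes, one character).
printf '\303\251x' | while IFS= read -rN1 a; do
    printf 'chars=%s bytes=%s\n' "${#a}" "$(byte_len "$a")"
done
```

In a UTF-8 locale the first iteration should report one character made of two bytes. In the C locale the same input would instead be read as three single-byte characters.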
This is a simple example using cut, a for loop & wc:
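# cut works line by line, so newlines are turned into spaces first
file=$(tr '\n' ' ' < /etc/passwd)
# count characters (-m), not bytes (-c)
chars=$(printf %s "$file" | wc -m)
for ((i = 1; i <= chars; i++)); do   # cut numbers positions from 1, not 0
    printf '%s\n' "$file" | cut -c "$i"
done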
KISS, isn't it?