Read a file by bytes in BASH
Did you try xxd
? It gives hex dump directly, as you want..
For your case, the command would be:
xxd -c 1 /path/to/input_file | while read offset hex char; do
#Do something with $hex
done
Note: extract the char from hex, rather than while read line. This is required because read will not capture white space properly.
Full rewrite: september 2019!
A lot shorter and simplier than previous versions! (Something faster, but not so much)
Yes , bash can read and write binary:
Syntax:
LANG=C IFS= read -r -d '' -n 1 foo
will populate $foo
with 1 binary byte. Unfortunately, as bash strings cannot hold null bytes ($\0
), reading one byte once is required.
But for the value of byte read, I've missed this in man bash
(have a look at 2016 post, at bottom of this):
printf [-v var] format [arguments] ... Arguments to non-string format specifiers are treated as C constants, except that ..., and if the leading character is a single or double quote, the value is the ASCII value of the following character.
So:
read8() {
local _r8_var=${1:-OUTBIN} _r8_car LANG=C IFS=
read -r -d '' -n 1 _r8_car
printf -v $_r8_var %d "'"$_r8_car
}
Will populate submitted variable name (default to $OUTBIN
) with decimal ascii value of first byte from STDIN
read16() {
local _r16_var=${1:-OUTBIN} _r16_lb _r16_hb
read8 _r16_lb &&
read8 _r16_hb
printf -v $_r16_var %d $(( _r16_hb<<8 | _r16_lb ))
}
Will populate submitted variable name (default to $OUTBIN
) with decimal value of first 16 bits word from STDIN...
Of course, for switching Endianness, you have to switch:
read8 _r16_hb &&
read8 _r16_lb
And so on:
# Usage:
# read[8|16|32|64] [varname] < binaryStdInput
read8() { local _r8_var=${1:-OUTBIN} _r8_car LANG=C IFS=
read -r -d '' -n 1 _r8_car
printf -v $_r8_var %d "'"$_r8_car ;}
read16() { local _r16_var=${1:-OUTBIN} _r16_lb _r16_hb
read8 _r16_lb && read8 _r16_hb
printf -v $_r16_var %d $(( _r16_hb<<8 | _r16_lb )) ;}
read32() { local _r32_var=${1:-OUTBIN} _r32_lw _r32_hw
read16 _r32_lw && read16 _r32_hw
printf -v $_r32_var %d $(( _r32_hw<<16| _r32_lw )) ;}
read64() { local _r64_var=${1:-OUTBIN} _r64_ll _r64_hl
read32 _r64_ll && read32 _r64_hl
printf -v $_r64_var %d $(( _r64_hl<<32| _r64_ll )) ;}
So you could source
this, then if your /dev/sda
is gpt
partitioned,
read totsize < <(blockdev --getsz /dev/sda)
read64 gptbackup < <(dd if=/dev/sda bs=8 skip=68 count=1 2>/dev/null)
echo $((totsize-gptbackup))
1
Answer could be 1
(1st GPT is at sector 1, one sector is 512 bytes. GPT Backup location is at byte 32. With bs=8
512 -> 64 + 32 -> 4 = 544 -> 68 blocks to skip... See GUID Partition Table at Wikipedia).
Quick small write function...
write () {
local i=$((${2:-64}/8)) o= v r
r=$((i-1))
for ((;i--;)) {
printf -vv '\%03o' $(( ($1>>8*(0${3+-1}?i:r-i))&255 ))
o+=$v
}
printf "$o"
}
This function default to 64 bits, little endian.
Usage: write <integer> [bits:64|32|16|8] [switchto big endian]
- With two parameter, second parameter must be one of
8
,16
,32
or64
, to be bit length of generated output. - With any dummy 3th parameter, (even empty string), function will switch to big endian.
.
read64 foo < <(write -12345);echo $foo
-12345
...
First post 2015...
Upgrade for adding specific bash version (with bashisms)
With new version of printf
built-in, you could do a lot without having to fork ($(...)
) making so your script a lot faster.
First let see (by using seq
and sed
) how to parse hd output:
echo ;sed <(seq -f %02g 0 $(( COLUMNS-1 )) ) -ne '
/0$/{s/^\(.*\)0$/\o0337\o033[A\1\o03380/;H;};
/[1-9]$/{s/^.*\(.\)/\1/;H};
${x;s/\n//g;p}';hd < <(echo Hello good world!)
0 1 2 3 4 5 6 7
012345678901234567890123456789012345678901234567890123456789012345678901234567
00000000 48 65 6c 6c 6f 20 67 6f 6f 64 20 77 6f 72 6c 64 |Hello good world|
00000010 21 0a |!.|
00000012
Were hexadecimal part begin at col 10 and end at col 56, spaced by 3 chars and having one extra space at col 34.
So parsing this could by done by:
while read line ;do
for x in ${line:10:48};do
printf -v x \\%o 0x$x
printf $x
done
done < <( ls -l --color | hd )
Old original post
Edit 2 for Hexadecimal, you could use hd
echo Hello world | hd
00000000 48 65 6c 6c 6f 20 77 6f 72 6c 64 0a |Hello world.|
or od
echo Hello world | od -t x1 -t c
0000000 48 65 6c 6c 6f 20 77 6f 72 6c 64 0a
H e l l o w o r l d \n
shortly
while IFS= read -r -n1 car;do [ "$car" ] && echo -n "$car" || echo ; done
try them:
while IFS= read -rn1 c;do [ "$c" ]&&echo -n "$c"||echo;done < <(ls -l --color)
Explain:
while IFS= read -rn1 car # unset InputFieldSeparator so read every chars
do [ "$car" ] && # Test if there is ``something''?
echo -n "$car" || # then echo them
echo # Else, there is an end-of-line, so print one
done
Edit; Question was edited: need hex values!?
od -An -t x1 | while read line;do for char in $line;do echo $char;done ;done
Demo:
od -An -t x1 < <(ls -l --color ) | # Translate binary to 1 byte hex
while read line;do # Read line of HEX pairs
for char in $line;do # For each pair
printf "\x$char" # Print translate HEX to binary
done
done
Demo 2: We have both hex and binary
od -An -t x1 < <(ls -l --color ) | # Translate binary to 1 byte hex
while read line;do # Read line of HEX pairs
for char in $line;do # For each pair
bin="$(printf "\x$char")" # translate HEX to binary
dec=$(printf "%d" 0x$char) # translate to decimal
[ $dec -lt 32 ] || # if caracter not printable
( [ $dec -gt 128 ] && # change bin to a single dot.
[ $dec -lt 160 ] ) && bin="."
str="$str$bin"
echo -n $char \ # Print HEX value and a space
((i++)) # count printed values
if [ $i -gt 15 ] ;then
i=0
echo " - $str"
str=""
fi
done
done
New post on september 2016:
This could be usefull on very specific cases, ( I've used them to manualy copy GPT partitions between two disk, at low level, without having /usr
mounted...)
Yes, bash could read binary!
... but only one byte, by one... (because `char(0)' couldn't be correctly read, the only way of reading them correctly is to consider end-of-file, where if no caracter is read and end of file not reached, then character read is a char(0)).
This is more a proof of concept than a relly usefull tool: there is a pure bash version of hd
(hexdump).
This use recent bashisms, under bash v4.3
or higher.
#!/bin/bash
printf -v ascii \\%o {32..126}
printf -v ascii "$ascii"
printf -v cntrl %-20sE abtnvfr
values=()
todisplay=
address=0
printf -v fmt8 %8s
fmt8=${fmt8// / %02x}
while LANG=C IFS= read -r -d '' -n 1 char ;do
if [ "$char" ] ;then
printf -v char "%q" "$char"
((${#char}==1)) && todisplay+=$char || todisplay+=.
case ${#char} in
1|2 ) char=${ascii%$char*};values+=($((${#char}+32)));;
7 ) char=${char#*\'\\};values+=($((8#${char%\'})));;
5 ) char=${char#*\'\\};char=${cntrl%${char%\'}*};
values+=($((${#char}+7)));;
* ) echo >&2 ERROR: $char;;
esac
else
values+=(0)
fi
if [ ${#values[@]} -gt 15 ] ;then
printf "%08x $fmt8 $fmt8 |%s|\n" $address ${values[@]} "$todisplay"
((address+=16))
values=() todisplay=
fi
done
if [ "$values" ] ;then
((${#values[@]}>8))&&fmt="$fmt8 ${fmt8:0:(${#values[@]}%8)*5}"||
fmt="${fmt8:0:${#values[@]}*5}"
printf "%08x $fmt%$((
50-${#values[@]}*3-(${#values[@]}>8?1:0)
))s |%s|\n" $address ${values[@]} ''""'' "$todisplay"
fi
printf "%08x (%d chars read.)\n" $((address+${#values[@]})){,}
You could try/use this, but don't try to compare performances!
time hd < <(seq 1 10000|gzip)|wc
1415 25480 111711
real 0m0.020s
user 0m0.008s
sys 0m0.000s
time ./hex.sh < <(seq 1 10000|gzip)|wc
1415 25452 111669
real 0m2.636s
user 0m2.496s
sys 0m0.048s
same job: 20ms for hd
vs 2000ms for my bash script
.
... but if you wanna read 4 bytes in a file header or even a sector address in an hard drive, this could do the job...