convert text file of bits to binary file
Adding the -r option (reverse mode) to xxd -b does not actually work as intended, because xxd simply does not support combining these two flags (it ignores -b if both are given). Instead, you have to convert the bits to hex yourself first. For example, like this:
( echo 'obase=16;ibase=2'; sed -Ee 's/[01]{4}/;\0/g' instructions.txt ) | bc | xxd -r -p > instructions.bin
Full explanation:
- The part inside the parentheses creates a bc script. It first sets the output base to hexadecimal (16) and then the input base to binary (2); obase has to come first, because once ibase is 2 the 16 would no longer be read as a decimal number. After that, the sed command prints the contents of instructions.txt with a semicolon in front of each group of 4 bits, which corresponds to 1 hex digit. The result is piped into bc.
- The semicolon is a command separator in bc, so all the script does is print every input integer back out (after base conversion).
- The output of bc is a sequence of hex digits, which can be converted to a file with the usual xxd -r -p.
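The same nibble-by-nibble idea can be sketched in Python, for readers who want to see the conversion spelled out without bc. This is only an illustrative sketch, not part of the pipeline above: the function name bits_to_bytes_via_hex is made up for this example, and it assumes the total number of bits is a multiple of 8.

```python
def bits_to_bytes_via_hex(text: str) -> bytes:
    """Convert a string of '0'/'1' characters (whitespace ignored) to raw bytes.

    Mirrors the shell pipeline: each group of 4 bits becomes one hex digit,
    and the hex string is then decoded to bytes (the job of `xxd -r -p`).
    """
    bits = "".join(text.split())  # drop newlines and other whitespace
    if len(bits) % 8 != 0:
        raise ValueError("number of bits must be a multiple of 8")
    # One hex digit per 4-bit group, like bc with ibase=2 and obase=16.
    hex_digits = "".join(
        format(int(bits[i:i + 4], 2), "x") for i in range(0, len(bits), 4)
    )
    return bytes.fromhex(hex_digits)


sample = "00000000000000000000000000010011\n00000010110100010010000010000011\n"
print(bits_to_bytes_via_hex(sample).hex())  # 0000001302d12083
```

The printed hex string matches the first eight bytes of the hexdump output shown for instructions.bin.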
Output:
$ hexdump -Cv instructions.bin
00000000 00 00 00 13 02 d1 20 83 00 73 02 b3 00 73 04 33 |...... ..s...s.3|
00000010 00 73 64 b3 00 00 00 13 |.sd.....|
00000018
$ xxd -b -c4 instructions.bin
00000000: 00000000 00000000 00000000 00010011 ....
00000004: 00000010 11010001 00100000 10000011 .. .
00000008: 00000000 01110011 00000010 10110011 .s..
0000000c: 00000000 01110011 00000100 00110011 .s.3
00000010: 00000000 01110011 01100100 10110011 .sd.
00000014: 00000000 00000000 00000000 00010011 ....
one-liner to convert 32-bit strings of ones and zeros into the corresponding binary:
$ perl -ne 'print pack("B32", $_)' < instructions.txt > instructions.bin
what it does:
- perl -ne will iterate through each line of the input file provided on STDIN (instructions.txt)
- pack("B32", $_) will take a string of 32 bits ($_, which we just read from STDIN) and convert it to a binary value (you could alternatively use "b32" if you wanted ascending bit order inside each byte instead of descending bit order; see perldoc -f pack for more details)
- print will then output that converted value to STDOUT, which we then redirect to our binary file instructions.bin
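To make the difference between the two bit orders concrete, here is a rough Python analogue of the two pack templates. It is a sketch under assumptions: the function names pack_msb_first and pack_lsb_first are invented for this illustration, and the input is expected to be a bit string whose length is a multiple of 8, with the trailing newline already stripped.

```python
def pack_msb_first(bits: str) -> bytes:
    """Rough analogue of pack("B32", ...): first bit of each byte is the most significant."""
    return int(bits, 2).to_bytes(len(bits) // 8, "big")


def pack_lsb_first(bits: str) -> bytes:
    """Rough analogue of pack("b32", ...): first bit of each byte is the least significant."""
    # Reverse the bits inside each 8-bit group, then pack as before.
    flipped = "".join(bits[i:i + 8][::-1] for i in range(0, len(bits), 8))
    return pack_msb_first(flipped)


word = "00000010110100010010000010000011"
print(pack_msb_first(word).hex())  # 02d12083
print(pack_lsb_first(word).hex())  # 408b04c1
```

Only the descending order ("B32") reproduces the bytes 02 d1 20 83 seen in the expected output.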
verify:
$ hexdump -Cv instructions.bin
00000000 00 00 00 13 02 d1 20 83 00 73 02 b3 00 73 04 33 |...... ..s...s.3|
00000010 00 73 64 b3 00 00 00 13 |.sd.....|
00000018
$ xxd -b -c4 instructions.bin
00000000: 00000000 00000000 00000000 00010011 ....
00000004: 00000010 11010001 00100000 10000011 .. .
00000008: 00000000 01110011 00000010 10110011 .s..
0000000c: 00000000 01110011 00000100 00110011 .s.3
00000010: 00000000 01110011 01100100 10110011 .sd.
00000014: 00000000 00000000 00000000 00010011 ....
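The check done by eye with xxd -b -c4 can also be scripted. The following is a minimal sketch, assuming 4 bytes per line as in the listing above; bytes_to_bit_lines and matches_source are hypothetical helper names, not existing tools.

```python
from pathlib import Path


def bytes_to_bit_lines(data: bytes, width: int = 4) -> list[str]:
    """Render data as lines of `width` bytes each, every byte as 8 binary digits."""
    return [
        "".join(format(byte, "08b") for byte in data[i:i + width])
        for i in range(0, len(data), width)
    ]


def matches_source(txt_path: str, bin_path: str) -> bool:
    """True if the binary file decodes back to the bit strings in the text file."""
    expected = Path(txt_path).read_text().split()
    return bytes_to_bit_lines(Path(bin_path).read_bytes()) == expected


print(bytes_to_bit_lines(bytes.fromhex("0000001302d12083")))
# ['00000000000000000000000000010011', '00000010110100010010000010000011']
```

Calling matches_source("instructions.txt", "instructions.bin") should return True when the conversion round-trips.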
My original answer was incorrect - xxd cannot accept either -p or -r with -b ...
Given that the other answers are workable, and in the interest of "another way", how about the following:
Input
$ cat instructions.txt
00000000000000000000000000010011
00000010110100010010000010000011
00000000011100110000001010110011
00000000011100110000010000110011
00000000011100110110010010110011
00000000000000000000000000010011
Output
$ hexdump -Cv < instructions.bin
00000000 00 00 00 13 02 d1 20 83 00 73 02 b3 00 73 04 33 |...... ..s...s.3|
00000010 00 73 64 b3 00 00 00 13 |.sd.....|
00000018
Bash pipeline:
cat instructions.txt \
    | tr -d $'\n' \
    | while read -N 4 nibble; do
        printf '%x' "$((2#${nibble}))"
      done \
    | xxd -r -p \
    > instructions.bin
- cat - unnecessary, but used for clarity
- tr -d $'\n' - remove all newlines from the input
- read -N 4 nibble - read exactly 4 characters into the nibble variable
- printf '%x' "$((2#${nibble}))" - convert the nibble from binary to 1 hex character
  - $((2#...)) - convert the given value from base 2 (binary) to base 10 (decimal)
  - printf '%x' - format the given value from base 10 (decimal) to base 16 (hexadecimal)
- xxd -r -p - reverse (-r) a plain dump (-p) - from hexadecimal to raw binary
Python:
python3 << EOF > instructions.bin
import sys
d = '$(cat instructions.txt | tr -d $'\n')'
# Write raw bytes: print() would append a newline and encode values above 0x7f as UTF-8
sys.stdout.buffer.write(bytes(int(d[i:i+8], 2) for i in range(0, len(d), 8)))
EOF
- An unquoted heredoc (<< EOF) is used to get content into the Python code
  - This is not efficient if the input becomes large
- cat and tr - used to get a clean (one-line) input
- range(0, len(d), 8) - get a list of numbers from 0 to the end of the string d, stepping 8 characters at a time
- int(d[i:i+8], 2) - convert the current slice (d[i:i+8]) from binary to an integer
- bytes(...) - turn the sequence of integers into raw bytes
- sys.stdout.buffer.write(...) - write the bytes unmodified; under Python 3, print(''.join(chr(...))) would append a newline and encode values above 0x7f as UTF-8, so the file would not match the expected output
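Since embedding the whole input in a heredoc does not scale, one possible alternative is a small standalone script that processes the input line by line. This is a sketch rather than a drop-in replacement: the function name convert and the script name bits2bin.py are hypothetical, and it assumes every non-blank line holds a bit string whose length is a multiple of 8.

```python
import io
import sys


def convert(src, dst) -> int:
    """Copy bit strings from text stream `src` to binary stream `dst`.

    Returns the number of bytes written.
    """
    written = 0
    for line in src:
        bits = line.strip()
        if not bits:
            continue  # skip blank lines
        if len(bits) % 8 != 0:
            raise ValueError(f"line length {len(bits)} is not a multiple of 8")
        written += dst.write(int(bits, 2).to_bytes(len(bits) // 8, "big"))
    return written


# In-memory demonstration with the first input line.
demo = io.BytesIO()
convert(io.StringIO("00000000000000000000000000010011\n"), demo)
print(demo.getvalue().hex())  # 00000013

if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python3 bits2bin.py instructions.txt instructions.bin
    with open(sys.argv[1]) as src, open(sys.argv[2], "wb") as dst:
        convert(src, dst)
```

Because only one line is held in memory at a time, the input size is no longer limited by what fits into a shell command substitution.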