Using grep to search for hex strings in a file
grep has a -P switch that enables Perl regexp syntax, and Perl regexps can match individual bytes using the \x.. syntax.
So you can look for a given hex string in a file with: grep -aP "\xdf" <file>
But the output won't be very useful, since grep prints the raw binary lines that match; it is usually better to run a regexp over the hexdump output instead.
grep -P can still be useful, however, just to find files matching a given binary pattern, or to run a binary query for a pattern that actually occurs in text (see for example "How to regexp CJK ideographs (in utf-8)").
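As a minimal sketch of the file-finding use case, assuming GNU grep built with PCRE support (-P). The scratch directory and file names are made up for illustration, and LC_ALL=C is set so that \xdf is treated as a single byte rather than a UTF-8 character:

```shell
# Hypothetical scratch directory with two sample files.
workdir=$(mktemp -d)
printf 'abc\337def\n' > "$workdir/has_df.bin"   # \337 is octal for byte 0xDF
printf 'abcdef\n'     > "$workdir/no_df.bin"

# -r: recurse, -l: print only the names of matching files,
# -a: treat binary data as text, -P: Perl regexp, so \xdf means byte 0xDF.
found=$(LC_ALL=C grep -rlaP '\xdf' "$workdir")
echo "$found"
```

Only the file containing the 0xDF byte should be listed; no matching lines are printed, so the unreadable binary output is avoided.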
We tried several things before arriving at an acceptable solution:
xxd -u /usr/bin/xxd | grep 'DF'
00017b0: 4010 8D05 0DFF FF0A 0300 53E3 0610 A003 @.........S.....
root# grep -ibH "df" /usr/bin/xxd
Binary file /usr/bin/xxd matches
xxd -u /usr/bin/xxd | grep -H 'DF'
(standard input):00017b0: 4010 8D05 0DFF FF0A 0300 53E3 0610 A003 @.........S.....
We then found we could get usable results with
xxd -u /usr/bin/xxd > /tmp/xxd.hex ; grep -H 'DF' /tmp/xxd.hex
Note that a simple search target like 'DF' will also incorrectly match hex digits that span a byte boundary, e.g.
xxd -u /usr/bin/xxd | grep 'DF'
00017b0: 4010 8D05 0DFF FF0A 0300 53E3 0610 A003 @.........S.....
--------------------^^
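So we use an ORed regexp to search for ' DF' OR 'DF ' (the search target preceded or followed by a space char). This works because xxd prints two bytes per space-separated group, so a byte-aligned DF always touches a space.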
The final result seems to be
xxd -u DumpFile > DumpFile.hex
grep -E ' DF|DF ' DumpFile.hex
0001020: 0089 0424 8D95 D8F5 FFFF 89F0 E8DF F6FF ...$............
-----------------------------------------^^
0001220: 0C24 E871 0B00 0083 F8FF 89C3 0F84 DF03 .$.q............
--------------------------------------------^^
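Another way to avoid the byte-boundary problem is to put one byte per line and match whole lines. The sketch below uses od rather than xxd (a substitution, not what was used above), because od -tx1 prints every byte as its own space-separated field. The sample file is made up for illustration:

```shell
sample=$(mktemp)
printf '\015\377\101\337\102' > "$sample"   # bytes 0D FF 41 DF 42 (octal escapes)

# od -An -v -tx1: no offset column, no duplicate suppression, one hex pair per byte.
# tr and sed put one byte on each line; grep -n -x matches whole lines only
# and prefixes each hit with its line number.
hits=$(od -An -v -tx1 "$sample" | tr -s ' ' '\n' | sed '/^$/d' | grep -n -x 'df')
echo "$hits"
```

Each hit is printed as line:byte, and the byte offset is the line number minus one. The 0D FF pair at the start of the sample does not match, because no line ever holds more than one byte.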
There's also a pretty handy tool called binwalk, written in Python, which supports binary pattern matching (and quite a lot more besides). Here's how you would search for a binary string; the output gives the offset in decimal and hex (from the docs):
$ binwalk -R "\x00\x01\x02\x03\x04" firmware.bin
DECIMAL HEX DESCRIPTION
--------------------------------------------------------------------------
377654 0x5C336 Raw string signature
This seems to work for me:
LANG=C grep --only-matching --byte-offset --binary --text --perl-regexp "<\x-hex pattern>" <file>
short form:
LANG=C grep -obUaP "<\x-hex pattern>" <file>
Example:
LANG=C grep -obUaP "\x01\x02" /bin/grep
Output (cygwin binary):
153: <\x01\x02>
33210: <\x01\x02>
53453: <\x01\x02>
You can then grep this output again to extract the offsets, but don't forget to use the binary options (-Ua) again, since the output still contains the raw matched bytes.
Note: LANG=C is needed to avoid UTF-8 encoding issues.
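The offset extraction could look roughly like this, assuming GNU grep with -P support. The sample file is made up for illustration. LC_ALL=C is used instead of LANG=C only because LC_ALL takes precedence when it is already set in the environment:

```shell
sample=$(mktemp)
printf 'AB\001\002CD\001\002' > "$sample"   # \001\002 sits at byte offsets 2 and 6

# The first grep prints "offset:<raw bytes>" for every match.
# The second grep keeps only the leading digits of each line; it needs -a too,
# because its input still contains the raw matched bytes.
offsets=$(LC_ALL=C grep -obUaP '\x01\x02' "$sample" | LC_ALL=C grep -oa '^[0-9][0-9]*')
echo "$offsets"
```

This prints one decimal offset per line, which is easier to feed into other tools than the mixed text and binary output of the first grep.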