Find only GUIDs in file - Bash
With the GNU implementation of grep (or compatible):
<your-file grep -Ewo '[[:xdigit:]]{8}(-[[:xdigit:]]{4}){3}-[[:xdigit:]]{12}' |
while IFS= read -r guid; do
your-action "$guid"
sleep 5
done
That finds those GUIDs wherever they are in the input, provided they are neither preceded nor followed by word characters.
GNU grep has a -o option that prints the non-empty matches of the regular expression.
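As a small illustration of -o, here is a sketch that feeds grep a single line containing two GUIDs. The log line and both GUID values are made up for the example; each match should come out on its own line, with the surrounding text dropped.

```shell
# Hypothetical log line carrying two GUIDs plus surrounding text.
line='moved 123e4567-e89b-12d3-a456-426614174000 to 00112233-4455-6677-8899-aabbccddeeff (ok)'

# -o prints each match on its own output line instead of the whole input line.
matches=$(printf '%s\n' "$line" |
  grep -Eo '[[:xdigit:]]{8}(-[[:xdigit:]]{4}){3}-[[:xdigit:]]{12}')

printf '%s\n' "$matches"
```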
-w is another non-standard extension, coming (I believe) from SysV, that matches on whole words only. It matches only if the matched text starts at a transition from a non-word to a word character and ends at a transition from a word to a non-word character (where word characters are alphanumerics or underscore). That's to guard against matching inside things like:
aaaaaaaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaaaaaaaaaaa
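To see what -w buys you, the following sketch runs the same expression with and without it. The sample input is hypothetical: one well-formed GUID embedded in punctuation, plus the over-long string from above.

```shell
# Hypothetical sample input: one real GUID and one over-long look-alike.
sample='id=123e4567-e89b-12d3-a456-426614174000;
aaaaaaaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaaaaaaaaaaa'

re='[[:xdigit:]]{8}(-[[:xdigit:]]{4}){3}-[[:xdigit:]]{12}'

# With -w: only the GUID delimited by non-word characters is reported.
with_w=$(printf '%s\n' "$sample" | grep -Ewo "$re")

# Without -w: a GUID-shaped slice of the over-long string is reported as well.
without_w=$(printf '%s\n' "$sample" | grep -Eo "$re")

printf 'with -w:\n%s\n\nwithout -w:\n%s\n' "$with_w" "$without_w"
```

Note that = and ; are not word characters, so the first GUID still counts as a whole word even though it is not surrounded by whitespace.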
The rest is standard POSIX syntax. Note that [[:xdigit:]] matches on ABCDEF as well. You can replace it with [0123456789abcdef] if you want to match only lower case GUIDs.
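A possible way to write the lower-case-only variant without repeating the bracket expression three times is to keep it in a shell variable. This is only a sketch: the variable name h and the sample GUIDs are made up for the example.

```shell
# Hypothetical sample: the same GUID in lower and in upper case.
sample='123e4567-e89b-12d3-a456-426614174000
123E4567-E89B-12D3-A456-426614174000'

# Lower-case hex digits only, spelled out so no locale-dependent range is involved.
h='[0123456789abcdef]'

lower_only=$(printf '%s\n' "$sample" |
  grep -Ewo "${h}{8}(-${h}{4}){3}-${h}{12}")

either_case=$(printf '%s\n' "$sample" |
  grep -Ewo '[[:xdigit:]]{8}(-[[:xdigit:]]{4}){3}-[[:xdigit:]]{12}')

printf 'lower only:\n%s\n\neither case:\n%s\n' "$lower_only" "$either_case"
```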
While I love Regular Expressions, I prefer to avoid over-specifying. For this particular data set (known data format, one GUID per line, plus header and footer), I'd just strip out the header/footers:
$ grep -Ev 'GUIDs|--|rows|^$' guids.txt |
while IFS= read -r guid; do
some_command "$guid"
sleep 5
done
Alternatively, I'd grep out the lines I want, but also keep the regexp as simple as possible for the current data set:
grep -E '^[0-9a-f-]{36}$'
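The two approaches can be compared side by side on a small sample. The input below is a guess at the layout described above (a header, a separator, one GUID per line, a blank line and a row-count footer); the exact header and footer text are assumptions, as are the GUID values.

```shell
# Hypothetical input in the assumed layout.
sample='GUIDs
-----
123e4567-e89b-12d3-a456-426614174000
00112233-4455-6677-8899-aabbccddeeff

(2 rows affected)'

# Approach 1: drop the header, separator, footer and blank lines.
by_exclusion=$(printf '%s\n' "$sample" | grep -Ev 'GUIDs|--|rows|^$')

# Approach 2: keep only 36-character lines of lower-case hex digits and hyphens.
by_inclusion=$(printf '%s\n' "$sample" | grep -E '^[0-9a-f-]{36}$')

printf 'by exclusion:\n%s\n\nby inclusion:\n%s\n' "$by_exclusion" "$by_inclusion"
```

On this sample both give the same two lines. One thing to watch for with the second pattern: a separator line made of exactly 36 hyphens would also satisfy it, so it relies on the separator having a different length.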