How to extract a value from a string using regex and a shell?

It seems that you are asking multiple things. To answer them:

  • Yes, it is OK to extract data from a string using regular expressions; that's what they're there for.
  • You say you get errors: which ones, and which shell tool are you using?
  • You can extract the numbers by catching them in capturing parentheses:

    .*?(\d+) rofl.*

    and using $1 to get the number out (.*? and .* cover the rest of the line before and after the match; the first one is lazy so that it does not swallow the leading digits of the number)

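With sed as an example, the idea becomes this, replacing each line of a file with only the matching number. sed writes the capture as \1 (or \2, and so on) instead of $1, needs -E for unescaped parentheses and +, and supports neither \d nor lazy quantifiers, so the prefix is matched as "start of line, or anything ending in a non-digit":

sed -E 's/(^|.*[^0-9])([0-9]+) rofl.*/\2/' inputFileName > outputFileName

or:

echo "12 BBQ ,45 rofl, 89 lol" | sed -E 's/(^|.*[^0-9])([0-9]+) rofl.*/\2/'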
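The capture-group approach can be wrapped in a small shell function. What follows is a minimal sketch, assuming a sed that accepts -E (current GNU and BSD versions do); the function names number_before and greedy_number_before are made up for illustration. The second function shows why a bare greedy .* directly in front of the group is risky: it takes every digit but the last.

```shell
# Hypothetical helper: print the number that directly precedes a keyword.
# Assumes sed supports -E and that the keyword contains no regex
# metacharacters or slashes.
number_before() {
    keyword=$1
    # (^|.*[^0-9]) makes the capture start at the first digit of the number.
    # -n plus the p flag prints only the lines where a substitution happened.
    sed -n -E "s/(^|.*[^0-9])([0-9]+) ${keyword}.*/\\2/p"
}

# Greedy pitfall: .* grabs as much as it can and leaves one digit for the group.
greedy_number_before() {
    sed -n -E "s/.*([0-9]+) $1.*/\\1/p"
}

echo '12 BBQ ,45 rofl, 89 lol' | number_before rofl          # prints 45
echo '12 BBQ ,45 rofl, 89 lol' | greedy_number_before rofl   # prints 5
```

Lines without a match produce no output here, whereas a plain s/// without -n would pass them through unchanged.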

Yes, regex can certainly be used to extract part of a string. Unfortunately, different flavours of *nix and different tools use slightly different regex variants.

This sed command should work on most flavours (tested on OS X and Red Hat):

echo '12 BBQ ,45 rofl, 89 lol' | sed 's/^.*,\([0-9][0-9]*\).*$/\1/g'
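To illustrate how the variants differ, here is a sketch of one extraction written in both basic and extended syntax. It anchors on the word rofl instead of the comma, which is my own variation and not part of the command above, and it assumes the number is preceded by at least one non-digit character on the line.

```shell
line='12 BBQ ,45 rofl, 89 lol'

# Basic regular expressions (sed's default): groups are written \( \) and
# "one or more digits" is spelled [0-9][0-9]* because + is not available.
bre=$(printf '%s\n' "$line" | sed 's/.*[^0-9]\([0-9][0-9]*\) rofl.*/\1/')

# Extended regular expressions (-E): plain ( ) and + work.
ere=$(printf '%s\n' "$line" | sed -E 's/.*[^0-9]([0-9]+) rofl.*/\1/')

echo "BRE: $bre"   # prints BRE: 45
echo "ERE: $ere"   # prints ERE: 45
```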

You can do this with GNU grep's Perl mode:

echo "12 BBQ ,45 rofl, 89 lol" | grep -P '\d+(?= rofl)' -o
echo "12 BBQ ,45 rofl, 89 lol" | grep --perl-regexp '\d+(?= rofl)' --only-matching

-P and --perl-regexp mean Perl-style regular expression. -o and --only-matching mean to output only the matching text. The lookahead (?= rofl) requires " rofl" to follow the digits but keeps it, including the space, out of the output.
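Where -P is not available (the BSD grep shipped with macOS does not offer it, as far as I know), the same result can be approximated with two plain extended-regex passes. This is only a sketch: number_before_grep is a hypothetical name, and it assumes that grep supports -o and that the keyword contains no digits.

```shell
# Hypothetical helper: emulate the lookahead with two grep -oE passes.
# Pass 1 keeps "<digits> <keyword>"; pass 2 keeps only the digits.
number_before_grep() {
    grep -oE "[0-9]+ $1" | grep -oE '[0-9]+'
}

echo '12 BBQ ,45 rofl, 89 lol' | number_before_grep rofl   # prints 45
```

Each occurrence on a line is printed on its own output line, just as with grep -o in the Perl-mode version.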


Using ripgrep's replace option, it is possible to change the output to a capture group:

rg --only-matching --replace '$1' '(\d+) rofl'
  • --only-matching or -o outputs only the part that matches instead of the whole line.
  • --replace '$1' or -r replaces the output by the first capture group.
