Correct regex not working in grep

The regex itself is fine, but grep was not given the flag it needs to understand it. By default grep uses basic regular expressions (BRE), and with the -E flag it uses extended regular expressions (ERE). What you have, a lookaround assertion (here, a lookbehind), is available only in the PCRE regex flavor, which GNU grep supports through its -P flag.

Assuming you want to extract only the string that follows prefix, also add the -o flag, which tells grep to print only the matching portion of each line:

grep -oP '(?<=prefix).*$' <<< prefixSTRING
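To see why the flag matters, here is a minimal sketch comparing the default BRE mode with -P on the same input. It assumes GNU grep built with PCRE support; the variable names are only for illustration. In BRE mode the characters `(?<=` are ordinary literals, so the pattern matches only a line that really contains that text.

```shell
# Assumes GNU grep with PCRE support (-P); variable names are illustrative.
line='prefixSTRING'

# Default BRE: "(?<=prefix)" is read as literal characters, so nothing matches.
bre_out=$(printf '%s\n' "$line" | grep -o '(?<=prefix).*$' || true)

# PCRE: the lookbehind is understood and only the text after "prefix" is printed.
pcre_out=$(printf '%s\n' "$line" | grep -oP '(?<=prefix).*$')

# BRE again, on a line that literally contains the characters "(?<=prefix)".
literal_out=$(printf '%s\n' '(?<=prefix)STRING' | grep -o '(?<=prefix).*$')

printf 'BRE:     [%s]\n' "$bre_out"
printf 'PCRE:    [%s]\n' "$pcre_out"
printf 'literal: [%s]\n' "$literal_out"
```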

There is also a grep variant built on the PCRE library, pcregrep, which uses PCRE syntax by default, so you can simply run:

pcregrep -o '(?<=prefix).*$' <<< prefixSTRING

The various regex flavors, and the tools that implement each of them, are explained in detail in Giles' wonderful answer.

Regular expressions come in many different flavours. What you are showing is a Perl-like regular expression (PCRE, "Perl Compatible Regular Expression").

grep does POSIX regular expressions. These are basic regular expressions (BRE) and extended regular expressions (ERE, if grep is used with the -E option). See the manual for re_format or regex or whatever similar manual your grep manual refers to on your system, or the POSIX standard texts that I just linked to.

If you use GNU grep, you can use Perl-like regular expressions through its GNU-specific -P option.

Also note that grep returns lines by default, not substrings from lines. Again, with GNU grep (and some other grep implementations), you may use the -o option to get only the bit(s) that matches the given expression from each line.

Note that both -P and -o are non-standard extensions to the POSIX specification of grep.
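The difference between line output and -o output can be sketched as follows. The sample input is made up for illustration, and -o is assumed to be available (it is in GNU and BSD grep, but not required by POSIX). Without a lookbehind, the printed match still includes prefix itself, which is why the question reached for -P.

```shell
# Hypothetical sample input: two lines contain "prefix", one does not.
input='see prefixONE here
nothing to see
prefixTWO'

# Default behaviour: every line containing a match is printed in full.
whole_lines=$(printf '%s\n' "$input" | grep 'prefix')

# With -o (non-POSIX): only the matched text is printed, one match per line.
only_matches=$(printf '%s\n' "$input" | grep -o 'prefix[[:upper:]]*')

printf 'lines:\n%s\n\n' "$whole_lines"
printf 'matches:\n%s\n' "$only_matches"
```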

If you are not using GNU grep, then you may use sed instead to get the bit between the string prefix and the end of the line:

sed -n 's/.*prefix\(.*\)/\1/p' file

This prints only the lines to which sed manages to apply the given substitution. The substitution replaces the whole line that matches the expression (which is a BRE) with the piece of it that occurs after the string prefix.
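As a small sketch of the role of -n and the p flag, the same substitution can be run with and without them on some made-up input (the sample lines and variable names are assumptions for illustration):

```shell
# Hypothetical sample input: the middle line has no "prefix".
input='prefixONE
unrelated line
some prefixTWO'

# -n suppresses automatic printing; the p flag prints only substituted lines.
extracted=$(printf '%s\n' "$input" | sed -n 's/.*prefix\(.*\)/\1/p')

# Without -n and p, every line is printed, whether it was changed or not.
unfiltered=$(printf '%s\n' "$input" | sed 's/.*prefix\(.*\)/\1/')

printf 'with -n and p:\n%s\n\n' "$extracted"
printf 'without:\n%s\n' "$unfiltered"
```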

Note that if there are several instances of prefix on a line, the sed variation would return the string after the last one, while the GNU grep variation would return the string after the first one (which would include the other instances of prefix).
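The difference is easiest to see on a line with two occurrences of prefix. This is a sketch on a made-up input line; the grep half assumes GNU grep with PCRE support.

```shell
# Hypothetical input with two occurrences of "prefix".
line='AprefixBprefixC'

# sed: the leading .* is greedy, so it swallows everything up to the LAST "prefix".
after_last=$(printf '%s\n' "$line" | sed -n 's/.*prefix\(.*\)/\1/p')

# GNU grep -P: the leftmost match wins, so output starts after the FIRST "prefix".
after_first=$(printf '%s\n' "$line" | grep -oP '(?<=prefix).*$')

printf 'sed:  %s\n' "$after_last"
printf 'grep: %s\n' "$after_first"
```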

The sed solution would be portable to all Unix-like systems.

As the other answers have stated, grep does not use a regex flavour with lookbehinds (by default with GNU grep, or not at all with other versions).

If you find yourself unable to use GNU grep or pcregrep, you can use perl if you have it.

The command line equivalent with perl would be:

perl -lne 'print $& if /(?<=prefix).*$/' <<< prefixSTRING

You put the desired regex between the slashes. As you are using Perl, this uses Perl's regex flavour.

Tags: Grep, Regular Expression
