Regular expression to grab word before a certain character R Perl

If you use (\S+)\s*&\s*(\S+) then the words on both sides of & will be captured. The \s* parts allow for optional whitespace around the ampersand.

You need to double-up the backslashes in an R string, and use the regexec and regmatches functions to apply the pattern and extract the matched substrings.

string  <- "...something something word1 & word2 something..."
pattern <- "(\\S+)\\s*&\\s*(\\S+)"
match   <- regexec(pattern, string)
words   <- regmatches(string, match)

Now words is a one-element list holding a three-item character vector: the whole matched string followed by the first and second capture groups. So words[[1]][2] is word1 and words[[1]][3] is word2.
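Since regexec and regmatches are vectorised over the input, the same pattern can be applied to several strings at once. The following is only a sketch: the sample strings and the names strings, matches and before are made up for illustration. Strings without an ampersand come back as a zero-length vector, so the helper maps those to NA.

```r
# Hypothetical sample input; the second string has no ampersand.
strings <- c("alpha & beta", "no ampersand here", "x&y")
pattern <- "(\\S+)\\s*&\\s*(\\S+)"

# One list element per input string; character(0) where nothing matched.
matches <- regmatches(strings, regexec(pattern, strings))

# Take the first capture group (the word before &), or NA when there was no match.
before <- vapply(
  matches,
  function(m) if (length(m) >= 2) m[2] else NA_character_,
  character(1),
  USE.NAMES = FALSE
)
```

Here before[1] is "alpha", before[2] is NA and before[3] is "x".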


(?<=&)(\w*)(?=&)

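This matches a run of word characters (possibly empty) that sits directly between two & symbols. It uses a positive lookbehind and a positive lookahead, so the & symbols themselves are not part of the match. In R, lookarounds are only available with perl = TRUE.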
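A rough sketch of how the lookaround pattern might be used from R. The input strings are invented for illustration, and the second pattern (a lookahead-only variant that grabs the word before a single &) is an assumption about what the question is after, not part of the answer above.

```r
# Lookbehind and lookahead need the PCRE engine, hence perl = TRUE.
between_pattern <- "(?<=&)(\\w*)(?=&)"
between_input   <- "a&word&b"   # hypothetical input
between <- regmatches(between_input,
                      regexpr(between_pattern, between_input, perl = TRUE))

# Assumed variant: word characters followed by optional whitespace and an &.
before_pattern <- "\\w+(?=\\s*&)"
before_input   <- "something word1 & word2"   # hypothetical input
before_amp <- regmatches(before_input,
                         regexpr(before_pattern, before_input, perl = TRUE))
```

With these inputs, between is "word" and before_amp is "word1". Because regexpr returns only the first match, gregexpr would be the function to reach for if every occurrence is wanted.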


\b(.*?)\b&

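The text before & is captured in group 1. The .*? is a reluctant (lazy) match between two word boundaries, and the second boundary must be immediately followed by &. Two caveats: because \b has to sit directly before &, the pattern only matches when a word character comes right before the ampersand (word1&word2, but not word1 & word2), and because the match starts at the earliest word boundary, group 1 can span several words.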
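A small sketch of how this pattern behaves in R, using perl = TRUE and made-up input strings. The helper name first_group is hypothetical; it returns group 1, or NA when the pattern does not match.

```r
pattern <- "\\b(.*?)\\b&"

# Hypothetical helper: first capture group of the first match, or NA.
first_group <- function(x) {
  m <- regmatches(x, regexec(pattern, x, perl = TRUE))[[1]]
  if (length(m) >= 2) m[2] else NA_character_
}

tight  <- first_group("word1&word2")    # word character directly before &
spaced <- first_group("word1 & word2")  # whitespace before &, so no boundary there
multi  <- first_group("foo bar&baz")    # match starts at the first boundary
```

Here tight is "word1", spaced is NA, and multi is "foo bar", which shows the capture reaching back past the last word.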

Tags: Regex, Perl, R