Regex for rectangle brackets in R
You should enable perl = TRUE, then you can use Perl-like syntax, which is more straightforward (IMHO):
gsub("[\\[\\]$]","",mystring, perl = TRUE)
Or you may use "smart placement" by putting ] at the start of the bracket expression ([ is not special inside it, so there is no need to escape [ there):
gsub("[][$]","",mystring)
Result:
[1] "abcde"
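To try both variants side by side, here is a minimal sketch. The value of mystring is an assumption (the original input is not shown here); it was chosen only so that it contains [, ] and $.

```r
# Hypothetical input: the real value of mystring is not shown in the post.
mystring <- "a[b]c$de"

# PCRE engine: [ and ] can be escaped inside the character class.
res_perl <- gsub("[\\[\\]$]", "", mystring, perl = TRUE)

# Default TRE engine: ] comes first so it is taken literally;
# [ and $ need no escaping inside the bracket expression.
res_tre <- gsub("[][$]", "", mystring)

print(res_perl)
print(res_tre)
```

With this sample input, both calls should print the same string.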
More details
The [...] construct is a POSIX regex construct that the TRE regex engine treats as a bracket expression. TRE is the engine used by default in base R regex functions ((g)sub, grep(l), (g)regexpr) when they are called without perl=TRUE. Bracket expressions, unlike character classes in NFA regex engines, do not support escape sequences, i.e. the \ char is treated as a literal backslash char inside them.
Thus, [\[\]] in a TRE regex matches a \ or [ char (via the [\[\] part, which is effectively equal to [\[]) followed by a ]. So it matches the \] or [] substrings. For example, gsub("[\\[\\]]", "", "[]\\]ab]") outputs ab] because [] and \] are matched and removed.
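The difference between the two engines can be checked by running the same pattern on the same input with and without perl = TRUE. This is only an illustrative sketch; the variable names are mine.

```r
# The input consists of the characters: [ ] \ ] a b ]
x <- "[]\\]ab]"

# TRE (default): the pattern is read as the set {\, [} followed by a literal ],
# so only the two-character substrings "[]" and "\]" are removed.
tre_out <- gsub("[\\[\\]]", "", x)

# PCRE: the pattern is a character class matching a single [ or ],
# so every bracket is removed and the backslash stays.
pcre_out <- gsub("[\\[\\]]", "", x, perl = TRUE)

cat(tre_out, "\n")   # ab]
cat(pcre_out, "\n")  # \ab
```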
Note that the terms POSIX bracket expressions and NFA character classes are used here in the same sense as at https://www.regular-expressions.info. This is not quite standard terminology, but there is a need to differentiate between the two.
I would sidestep the [ab] syntax and use (a|b). Besides working, it may also be more readable:
gsub("(\\[|\\]|\\$)","",mystring)
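One possible advantage of the alternation form is that the escapes sit outside any bracket expression, so the pattern should behave the same with and without perl = TRUE. A small sketch, again assuming a made-up value for mystring:

```r
# Hypothetical input: the real value of mystring is not shown in the post.
mystring <- "a[b]c$de"

# Each alternative is an escaped literal: [ or ] or $
alt_pattern <- "(\\[|\\]|\\$)"

res_default <- gsub(alt_pattern, "", mystring)               # TRE engine
res_pcre    <- gsub(alt_pattern, "", mystring, perl = TRUE)  # PCRE engine

print(res_default)
print(res_pcre)
```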