A Regex to remove digits except for words starting with #
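How about capturing the part you want to keep and replacing the unwanted part with nothing? The pattern matches either a #-word (captured in group 1) or a run of digits (not captured), and the replacement \1 puts back only the captured text, which is empty for the digits.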
gsub("(#\\S+)|\\d+","\\1",x)
See demo at regex101 or R demo at tio.run (I have no experience with R)
My answer assumes that there is always whitespace between tokens, as in #foo bar #baz2. If you have something like #foo1,bar2:#baz3 4, use \w (word character) instead of \S (non-whitespace).
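To make the difference between the two patterns concrete, here is a small sketch that runs both on sample input. The variable names (x, z, kept_ws, kept_word) are just for illustration, and the expected results in the comments follow from how the alternation works rather than from the linked demos.

```r
# Sample input where tokens are separated by whitespace.
x <- "table9 dolv5e #10n #dec10 #nov8e 23 hello"

# "#" followed by non-whitespace is captured in group 1 and written back;
# any other digit run matches the second alternative, leaves group 1 empty,
# and is therefore deleted.
kept_ws <- gsub("(#\\S+)|\\d+", "\\1", x)
print(kept_ws)
# "table dolve #10n #dec10 #nov8e  hello"  (two spaces where "23" was removed)

# Sample input without whitespace between tokens: \w stops at "," and ":",
# so only the "#..." words themselves are protected.
z <- "#foo1,bar2:#baz3 4"
kept_word <- gsub("(#\\w+)|\\d+", "\\1", z)
print(kept_word)
# "#foo1,bar:#baz3 "  (trailing space left over from the removed "4")
```

With \S on the second input, everything from #foo1 up to the next space would be kept as one token, so bar2 would keep its digit.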
You could split the string on spaces, remove digits from tokens if they don't start with '#' and paste back:
x <- "table9 dolv5e #10n #dec10 #nov8e 23 hello"
y <- unlist(strsplit(x, ' '))
paste(ifelse(startsWith(y, '#'), y, gsub('\\d+', '', y)), collapse = ' ')
# output
[1] "table dolve #10n #dec10 #nov8e  hello"
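A token made up entirely of digits, like 23, becomes an empty string after the substitution, which can leave a doubled space once the tokens are pasted back together. If that matters, one option is to drop empty tokens before pasting. The sketch below wraps the same split-and-paste idea in a hypothetical helper, strip_digits (the name is made up here), and assumes a single input string with tokens separated by single spaces.

```r
# Hypothetical helper, not part of the answer above: split on spaces,
# strip all digit runs from tokens that do not start with "#",
# drop tokens that end up empty, and paste the rest back together.
strip_digits <- function(x) {
  tokens <- unlist(strsplit(x, " ", fixed = TRUE))
  cleaned <- ifelse(startsWith(tokens, "#"), tokens, gsub("\\d+", "", tokens))
  paste(cleaned[nzchar(cleaned)], collapse = " ")
}

result <- strip_digits("table9 dolv5e #10n #dec10 #nov8e 23 hello")
print(result)
# "table dolve #10n #dec10 #nov8e hello"
```

Using gsub rather than sub matters for tokens with more than one digit run, such as a1b2, where sub would only remove the first run.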