Regular Expression to match #hashtag but not #hashtag; (with semicolon)
/(#(?:[^\x00-\x7F]|\w)+)/g
Starts with #, then at least one (+) ANCII symbols ([^\x00-\x7F], range excluding non-ANCII symbols) or word symbol (\w).
This one should cover cases including ANCII symbols like "#їжак".
This is the best practice.
(#+[a-zA-Z0-9(_)]{1,})
How about the following:
\B(\#[a-zA-Z]+\b)(?!;)
Regex Demo
- \B -> Not a word boundary
- (#[a-zA-Z]+\b) -> Capturing Group beginning with # followed by any number of a-z or A-Z with a word boundary at the end
- (?!;) -> Not followed by ;
You can use a negative lookahead reegex:
/(?<=[\s>]|^)#(\w*[A-Za-z_]+\w*)\b(?!;)/
\b
- word boundary ensures that we are at end of word(?!;)
- asserts that we don't have semi-colon at next position
RegEx Demo