Regular Expression to match #hashtag but not #hashtag; (with semicolon)

/(#(?:[^\x00-\x7F]|\w)+)/g

Starts with #, then at least one (+) ANCII symbols ([^\x00-\x7F], range excluding non-ANCII symbols) or word symbol (\w).

This one should cover cases including ANCII symbols like "#їжак".


This is the best practice.

(#+[a-zA-Z0-9(_)]{1,})

How about the following:

\B(\#[a-zA-Z]+\b)(?!;)

Regex Demo

  • \B -> Not a word boundary
  • (#[a-zA-Z]+\b) -> Capturing Group beginning with # followed by any number of a-z or A-Z with a word boundary at the end
  • (?!;) -> Not followed by ;

You can use a negative lookahead reegex:

/(?<=[\s>]|^)#(\w*[A-Za-z_]+\w*)\b(?!;)/
  • \b - word boundary ensures that we are at end of word
  • (?!;) - asserts that we don't have semi-colon at next position

RegEx Demo

Tags:

Regex

Hashtag