Regex - Should hyphens be escaped?

Correct on all fronts. Outside of a character class (that's what the "square brackets" are called) the hyphen has no special meaning, and within a character class, you can place a hyphen as the first or last character in the range (e.g. [-a-z] or [0-9-]), OR escape it (e.g. [a-z\-0-9]) in order to add "hyphen" to your class.

It's more common to find a hyphen placed first or last within a character class, but by no means will you be lynched by hordes of furious neckbeards for choosing to escape it instead.

(Actually... my experience has been that a lot of regex is employed by folks who don't fully grok the syntax. In these cases, you'll typically see everything escaped (e.g. [a-z\%\$\#\@\!\-\_]) simply because the engineer doesn't know what's "special" and what's not... so they "play it safe" and obfuscate the expression with loads of excessive backslashes. You'll be doing yourself, your contemporaries, and your posterity a huge favor by taking the time to really understand regex syntax before using it.)

Great question!


Typically you would always put the hyphen first in the [] match section. EG, to match any alphanumeric character including hyphens (written the long way), you would use [-a-zA-Z0-9]


Outside of character classes, it is conventional not to escape hyphens. If I saw an escaped hyphen outside of a character class, that would suggest to me that it was written by someone who was not very comfortable with regexes.

Inside character classes, I don't think one way is conventional over the other; in my experience, it usually seems to be to put either first or last, as in [-._:] or [._:-], to avoid the backslash; but I've also often seen it escaped instead, as in [._\-:], and I wouldn't call that unconventional.

Tags:

Regex