Commenting Regular Expressions

Unfortunately, JavaScript doesn't have a verbose mode for regular expression literals like some other languages do. You may find this interesting, though.

In lieu of any external libraries, your best bet is just to use a normal string and comment that:

var r = new RegExp(
    '('      + // start capture
    '[0-9]+' + // match one or more digits
    ')'        // end capture
);
r.test('9'); // true
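
One caveat with the string approach is that every regex backslash has to be doubled inside an ordinary string literal. A minimal sketch, assuming a made-up decimal-number pattern (the name decimalRe is hypothetical, not from the answer above):

```javascript
// Hypothetical example: the pattern and the name decimalRe are illustrative only.
var decimalRe = new RegExp(
    '^'    + // start of input
    '\\d+' + // integer part (note the doubled backslash)
    '\\.'  + // literal decimal point
    '\\d+' + // fractional part
    '$'      // end of input
);
console.log(decimalRe.test('3.14')); // true
console.log(decimalRe.test('3x14')); // false
```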

While JavaScript doesn't natively support multi-line and commented regular expressions, it's easy enough to construct something that accomplishes the same thing: use a function that takes in a (multi-line, commented) string and returns a regular expression built from that string, sans comments and newlines.

The following snippet imitates the behavior of other flavors' x ("extended") flag, which ignores all whitespace characters in a pattern as well as comments, which are denoted with #:

function makeExtendedRegExp(inputPatternStr, flags) {
  // Remove everything between the first unescaped `#` and the end of a line
  // and then remove all unescaped whitespace
  const cleanedPatternStr = inputPatternStr
    .replace(/(^|[^\\])#.*/g, '$1')
    .replace(/(^|[^\\])\s+/g, '$1');
  return new RegExp(cleanedPatternStr, flags);
}

// The following switches the first word with the second word:
const input = 'foo bar baz';
const pattern = makeExtendedRegExp(String.raw`
  ^       # match the beginning of the line
  (\w+)   # 1st capture group: match one or more word characters
  \s      # match a whitespace character
  (\w+)   # 2nd capture group: match one or more word characters
`);
console.log(input.replace(pattern, '$2 $1'));

Ordinarily, to represent a backslash in a JavaScript string, each literal backslash must be escaped with another backslash, e.g. str = 'abc\\def'. Regular expressions tend to use many backslashes, and the doubling makes the pattern much harder to read. When writing such a pattern as a string, it's a good idea to use a String.raw template literal, in which a single typed backslash represents a literal backslash with no additional escaping.
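
To make the difference concrete, here is a small sketch comparing the two spellings. The variable names are illustrative only.

```javascript
// Illustrative names; both strings hold the same regex source text.
const escaped = '(\\w+)\\s(\\w+)';    // ordinary string: every backslash doubled
const raw = String.raw`(\w+)\s(\w+)`; // raw template literal: backslashes typed once
console.log(escaped === raw);                 // true
console.log(new RegExp(raw).test('foo bar')); // true
```

Keep in mind that ${...} is still interpolated inside a String.raw template, so a pattern containing that sequence needs extra care.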

Just like with the standard x modifier, to match an actual # in the string, escape it first, e.g.:

foo\#bar     # comments go here

// this function is exactly the same as the one in the first snippet

function makeExtendedRegExp(inputPatternStr, flags) {
  // Remove everything between the first unescaped `#` and the end of a line
  // and then remove all unescaped whitespace
  const cleanedPatternStr = inputPatternStr
    .replace(/(^|[^\\])#.*/g, '$1')
    .replace(/(^|[^\\])\s+/g, '$1');
  return new RegExp(cleanedPatternStr, flags);
}

// The following switches the first word with the second word:
const input = 'foo#bar baz';
const pattern = makeExtendedRegExp(String.raw`
  ^       # match the beginning of the line
  (\w+)   # 1st capture group: match one or more word characters
  \#      # match a hash character
  (\w+)   # 2nd capture group: match one or more word characters
`);
console.log(input.replace(pattern, '$2 $1'));

Note that to match a literal space character (and not just any whitespace character) while using the x flag in any environment (including the above), you have to escape the space with a \ first, e.g.:

^(\S+)\ (\S+)   # capture the first two words
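
As a sketch of how this plays out with the makeExtendedRegExp helper from the snippets above (repeated here so the example runs on its own; the name spacePattern and the input strings are made up):

```javascript
// Same helper as in the snippets above, repeated so this example is self-contained.
function makeExtendedRegExp(inputPatternStr, flags) {
  const cleanedPatternStr = inputPatternStr
    .replace(/(^|[^\\])#.*/g, '$1')   // drop unescaped # comments
    .replace(/(^|[^\\])\s+/g, '$1');  // drop unescaped whitespace
  return new RegExp(cleanedPatternStr, flags);
}

// The escaped space survives the whitespace stripping; everything else is removed.
const spacePattern = makeExtendedRegExp(String.raw`
  ^(\S+)\ (\S+)   # capture the first two words
`);
console.log(spacePattern.source);                          // ^(\S+)\ (\S+)
console.log('foo bar baz'.replace(spacePattern, '$2 $1')); // bar foo baz
console.log(spacePattern.test('foo\tbar'));                // false: a tab is not a space
```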

If you want to frequently match space characters, this can get a bit tedious and make the pattern harder to read, similar to how double-escaping backslashes isn't very desirable. One possible (non-standard) modification to permit unescaped space characters would be to only strip out spaces at the beginning and end of a line, and spaces before a # comment:

function makeExtendedRegExp(inputPatternStr, flags) {
  // Remove the first unescaped `#`, any preceding unescaped spaces, and everything that follows
  // and then remove leading and trailing whitespace on each line, including linebreaks
  const cleanedPatternStr = inputPatternStr
    .replace(/(^|[^\\]) *#.*/g, '$1')
    .replace(/^\s+|\s+$|\n/gm, '');
  return new RegExp(cleanedPatternStr, flags);
}

// The following switches the first word with the second word:
const input = 'foo bar baz';
const pattern = makeExtendedRegExp(String.raw`
  ^             # match the beginning of the line
  (\w+) (\w+)   # capture the first two words
`);
console.log(input.replace(pattern, '$2 $1'));

In several other languages (notably Perl), there's the special x flag. When set, the regexp ignores any whitespace and comments inside of it. Sadly, JavaScript regexps do not support the x flag.

Lacking the syntax, the only way to improve readability is convention. Mine is to put a comment before the tricky regular expression, containing the pattern written out as if the x flag were available. Example:

/*
  \+?     #optional + sign
  (\d*)   #the integer part
  (       #begin decimal portion
     \.   #decimal point
     \d+  #decimal part
  )       #end decimal portion
 */
var re = /\+?(\d*)(\.\d+)/;
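
To see which part of the commented layout each capture group corresponds to, a quick sketch with made-up inputs (re is the literal from the example above; m is an illustrative name):

```javascript
// Inputs are made up for illustration; `re` is the literal from the example above.
var re = /\+?(\d*)(\.\d+)/;
var m = '+12.50'.match(re);
console.log(m[0]); // +12.50  (whole match, including the optional sign)
console.log(m[1]); // 12      (integer part)
console.log(m[2]); // .50     (decimal portion)
console.log(re.test('12')); // false: the decimal portion is required
```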

For more complex examples, you can see what I've done with the technique here and here.