How to find indices of groups in JavaScript regular expressions match?

You can't directly get the index of a match group. What you have to do is first put every character in a match group, even the ones you don't care about:

var m= /(s+)(.*?)(l)([^l]*?)(o+)/.exec('this is hello to you');

Now you've got the whole match in parts:

['s is hello', 's', ' is hel', 'l', '', 'o']

So you can add up the lengths of the strings before your group to get the offset from the match index to the group index:

function indexOfGroup(match, n) {
    var ix= match.index;
    for (var i= 1; i<n; i++)
        ix+= match[i].length;
    return ix;
}

console.log(indexOfGroup(m, 3)); // 11

I wrote a simple (well the initialization got a bit bloated) javascript object to solve this problem on a project I've been working on recently. It works the same way as the accepted answer but generates the new regexp and pulls out the data you requested automatically.

var exp = new MultiRegExp(/(firstBit\w+)this text is ignored(optionalBit)?/i);
var value = exp.exec("firstbitWithMorethis text is ignored");

value = {0: {index: 0, text: 'firstbitWithMore'},
         1: null};

Git Repo: My MultiRegExp. Hope this helps someone out there.

edit Aug, 2015:

Try me: MultiRegExp Live.


Another javascript class which is also able to parse nested groups is available under: https://github.com/valorize/MultiRegExp2

Usage:

let regex = /a(?: )bc(def(ghi)xyz)/g;
let regex2 = new MultiRegExp2(regex);

let matches = regex2.execForAllGroups('ababa bcdefghixyzXXXX'));

Will output:
[ { match: 'defghixyz', start: 8, end: 17 },
  { match: 'ghi', start: 11, end: 14 } ]