Get the index of the nth occurrence of a string?

You really could use the regular expression /((s).*?){n}/ to search for n-th occurrence of substring s.

In C# it might look like this:

public static class StringExtender
{
    public static int NthIndexOf(this string target, string value, int n)
    {
        Match m = Regex.Match(target, "((" + Regex.Escape(value) + ").*?){" + n + "}");

        if (m.Success)
            return m.Groups[2].Captures[n - 1].Index;
        else
            return -1;
    }
}

Note: I have added Regex.Escape to original solution to allow searching characters which have special meaning to regex engine.


That's basically what you need to do - or at least, it's the easiest solution. All you'd be "wasting" is the cost of n method invocations - you won't actually be checking any case twice, if you think about it. (IndexOf will return as soon as it finds the match, and you'll keep going from where it left off.)

Here is the recursive implementation (of the above idea) as an extension method, mimicing the format of the framework method(s):

public static int IndexOfNth(this string input,
                             string value, int startIndex, int nth)
{
    if (nth < 1)
        throw new NotSupportedException("Param 'nth' must be greater than 0!");
    if (nth == 1)
        return input.IndexOf(value, startIndex);
    var idx = input.IndexOf(value, startIndex);
    if (idx == -1)
        return -1;
    return input.IndexOfNth(value, idx + 1, --nth);
}

Also, here are some (MBUnit) unit tests that might help you (to prove it is correct):

using System;
using MbUnit.Framework;

namespace IndexOfNthTest
{
    [TestFixture]
    public class Tests
    {
        //has 4 instances of the 
        private const string Input = "TestTest";
        private const string Token = "Test";

        /* Test for 0th index */

        [Test]
        public void TestZero()
        {
            Assert.Throws<NotSupportedException>(
                () => Input.IndexOfNth(Token, 0, 0));
        }

        /* Test the two standard cases (1st and 2nd) */

        [Test]
        public void TestFirst()
        {
            Assert.AreEqual(0, Input.IndexOfNth("Test", 0, 1));
        }

        [Test]
        public void TestSecond()
        {
            Assert.AreEqual(4, Input.IndexOfNth("Test", 0, 2));
        }

        /* Test the 'out of bounds' case */

        [Test]
        public void TestThird()
        {
            Assert.AreEqual(-1, Input.IndexOfNth("Test", 0, 3));
        }

        /* Test the offset case (in and out of bounds) */

        [Test]
        public void TestFirstWithOneOffset()
        {
            Assert.AreEqual(4, Input.IndexOfNth("Test", 4, 1));
        }

        [Test]
        public void TestFirstWithTwoOffsets()
        {
            Assert.AreEqual(-1, Input.IndexOfNth("Test", 8, 1));
        }
    }
}

private int IndexOfOccurence(string s, string match, int occurence)
{
    int i = 1;
    int index = 0;

    while (i <= occurence && (index = s.IndexOf(match, index + 1)) != -1)
    {
        if (i == occurence)
            return index;

        i++;
    }

    return -1;
}

or in C# with extension methods

public static int IndexOfOccurence(this string s, string match, int occurence)
{
    int i = 1;
    int index = 0;

    while (i <= occurence && (index = s.IndexOf(match, index + 1)) != -1)
    {
        if (i == occurence)
            return index;

        i++;
    }

    return -1;
}

That's basically what you need to do - or at least, it's the easiest solution. All you'd be "wasting" is the cost of n method invocations - you won't actually be checking any case twice, if you think about it. (IndexOf will return as soon as it finds the match, and you'll keep going from where it left off.)