Contains is faster than StartsWith?
I figured it out. It's because StartsWith is culture-sensitive, while Contains is not. That inherently means StartsWith has to do more work.
FWIW, here are my results on Mono with the (corrected) benchmark below:
1988.7906ms using Contains
10174.1019ms using StartsWith
I'd be glad to see people's results on Microsoft's .NET runtime, but my main point is that, done correctly (and assuming similar optimizations), I think StartsWith has to be slower:
    using System;
    using System.Diagnostics;

    public class ContainsStartsWith
    {
        public static void Main()
        {
            string str = "Hello there";
            Stopwatch s = new Stopwatch();

            // Time 10 million ordinal Contains calls.
            s.Start();
            for (int i = 0; i < 10000000; i++)
            {
                str.Contains("H");
            }
            s.Stop();
            Console.WriteLine("{0}ms using Contains", s.Elapsed.TotalMilliseconds);

            // Reset the stopwatch, then time 10 million culture-sensitive StartsWith calls.
            s.Reset();
            s.Start();
            for (int i = 0; i < 10000000; i++)
            {
                str.StartsWith("H");
            }
            s.Stop();
            Console.WriteLine("{0}ms using StartsWith", s.Elapsed.TotalMilliseconds);
        }
    }
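To separate the cost of culture sensitivity from everything else, one could also time the StartsWith overload that takes StringComparison.Ordinal. The following is only a sketch: the TimeMs helper and the OrdinalBenchmark class are names made up for illustration, the delegate call adds the same overhead to every measurement, and no particular timings are claimed.

```csharp
using System;
using System.Diagnostics;

public static class OrdinalBenchmark
{
    // Hypothetical helper: runs `body` `iterations` times and
    // returns the elapsed wall-clock time in milliseconds.
    public static double TimeMs(Func<bool> body, int iterations)
    {
        bool sink = false;
        Stopwatch s = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            // Fold the result into `sink` so the call cannot be discarded.
            sink ^= body();
        }
        s.Stop();
        GC.KeepAlive(sink);
        return s.Elapsed.TotalMilliseconds;
    }

    public static void Main()
    {
        string str = "Hello there";
        const int n = 10000000;

        Console.WriteLine("{0}ms using Contains",
            TimeMs(() => str.Contains("H"), n));
        Console.WriteLine("{0}ms using StartsWith (current culture)",
            TimeMs(() => str.StartsWith("H"), n));
        Console.WriteLine("{0}ms using StartsWith (ordinal)",
            TimeMs(() => str.StartsWith("H", StringComparison.Ordinal), n));
    }
}
```

If the explanation above is right, the ordinal StartsWith figure should land much closer to Contains than the culture-sensitive one does, but that is something to measure on your own runtime rather than assume.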
Try using Stopwatch to measure the speed instead of comparing DateTime values.
See: Stopwatch vs. using System.DateTime.Now for timing events
I think the key is in the documentation for the two methods; note the comparison type each one uses:
Contains: This method performs an ordinal (case-sensitive and culture-insensitive) comparison.
StartsWith: This method performs a word (case-sensitive and culture-sensitive) comparison using the current culture.
What matters is the ordinal comparison, which amounts to:
An ordinal sort compares strings based on the numeric value of each Char object in the string. An ordinal comparison is automatically case-sensitive because the lowercase and uppercase versions of a character have different code points. However, if case is not important in your application, you can specify an ordinal comparison that ignores case. This is equivalent to converting the string to uppercase using the invariant culture and then performing an ordinal comparison on the result.
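As a small illustration of the quoted description, the sketch below compares strings by char value only. The OrdinalDemo class and its method names are made up for this example; the framework calls used are string.CompareOrdinal and string.Equals with a StringComparison argument.

```csharp
using System;

public static class OrdinalDemo
{
    // True when `a` sorts after `b` by raw char values.
    public static bool SortsAfterOrdinal(string a, string b)
    {
        return string.CompareOrdinal(a, b) > 0;
    }

    // Ordinal equality, optionally ignoring case.
    public static bool EqualsOrdinal(string a, string b, bool ignoreCase)
    {
        StringComparison mode = ignoreCase
            ? StringComparison.OrdinalIgnoreCase
            : StringComparison.Ordinal;
        return string.Equals(a, b, mode);
    }

    public static void Main()
    {
        // 'a' is U+0061 and 'B' is U+0042, so "a" sorts after "B" ordinally.
        Console.WriteLine(SortsAfterOrdinal("a", "B"));              // True
        // Different code points, so not equal under a plain ordinal comparison.
        Console.WriteLine(EqualsOrdinal("hello", "HELLO", false));   // False
        // Equal once case is ignored.
        Console.WriteLine(EqualsOrdinal("hello", "HELLO", true));    // True
    }
}
```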
References:
http://msdn.microsoft.com/en-us/library/system.string.aspx
http://msdn.microsoft.com/en-us/library/dy85x1sa.aspx
http://msdn.microsoft.com/en-us/library/baketfxw.aspx
Using Reflector you can see the code for the two:
    public bool Contains(string value)
    {
        return (this.IndexOf(value, StringComparison.Ordinal) >= 0);
    }

    public bool StartsWith(string value, bool ignoreCase, CultureInfo culture)
    {
        if (value == null)
        {
            throw new ArgumentNullException("value");
        }
        if (this == value)
        {
            return true;
        }
        CultureInfo info = (culture == null) ? CultureInfo.CurrentCulture : culture;
        return info.CompareInfo.IsPrefix(this, value,
            ignoreCase ? CompareOptions.IgnoreCase : CompareOptions.None);
    }
StartsWith and Contains behave completely differently when it comes to culture-sensitive issues.
In particular, StartsWith returning true does NOT imply Contains returning true. You should replace one of them with the other only if you really know what you are doing.
    using System;

    class Program
    {
        static void Main()
        {
            var x = "A";
            var y = "A\u0640";
            Console.WriteLine(x.StartsWith(y)); // True
            Console.WriteLine(x.Contains(y)); // False
        }
    }
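If what you actually want is a char-by-char prefix test that agrees with Contains, one option is to pass StringComparison.Ordinal explicitly. The sketch below assumes that is the intent; the PrefixCheck class and StartsWithOrdinal helper are hypothetical names used only here.

```csharp
using System;

public static class PrefixCheck
{
    // Ordinal prefix test: compares raw char values, ignoring culture rules.
    public static bool StartsWithOrdinal(string text, string prefix)
    {
        return text.StartsWith(prefix, StringComparison.Ordinal);
    }

    public static void Main()
    {
        var x = "A";
        var y = "A\u0640";
        // y is longer than x, so it cannot be an ordinal prefix of x.
        Console.WriteLine(StartsWithOrdinal(x, y)); // False
        Console.WriteLine(x.Contains(y));           // False
    }
}
```

With both checks ordinal, a true prefix result does imply a true Contains result, because an ordinal prefix is literally a substring at index 0.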