What's faster: Regex or string operations?
It depends
Although string manipulation will usually be somewhat faster, the actual performance heavily depends on a number of factors, including:
- How many times you parse the regex
- How cleverly you write your string code
- Whether the regex is precompiled
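To make the first and last factors concrete, here is a minimal sketch that contrasts constructing a `Regex` on every call with reusing one instance built with `RegexOptions.Compiled`. The class and method names are made up for illustration, and the pattern is simply the one used in the benchmark further down; how much the difference matters depends on your runtime and workload.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexReuseSketch
{
    // Built once and reused. RegexOptions.Compiled trades a higher
    // construction cost for faster matching afterwards.
    private static readonly Regex Cached = new Regex("<.*?>", RegexOptions.Compiled);

    // Constructs a new Regex object on every call.
    public static int CountNewRegexPerCall(string input)
    {
        return new Regex("<.*?>").Matches(input).Count;
    }

    // Reuses the single cached instance.
    public static int CountCached(string input)
    {
        return Cached.Matches(input).Count;
    }

    public static void Main()
    {
        const string sample = "<a>text<b>more";
        Console.WriteLine(CountNewRegexPerCall(sample)); // prints 2
        Console.WriteLine(CountCached(sample));          // prints 2
    }
}
```

Both methods return the same result; only the place where the regex object gets built differs.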
As the regex gets more complicated, it takes much more effort and complexity to write equivalent string manipulation code that performs well.
String operations will always be faster than regular expression operations, unless, of course, you write the string operations in an inefficient way.
A regular expression has to be parsed first, and code then has to be generated that performs the operation using string operations. At best, the regular expression ends up doing what an optimal sequence of string manipulations would do.
Regular expressions are not used because they can do anything faster than plain string operations. They are used because they can do very complicated operations with little code and with reasonably small overhead.
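As a small illustration of "complicated operations with little code", the sketch below pulls dates of the form `yyyy-mm-dd` out of free text. The pattern, class name and sample sentence are my own assumptions, not taken from the answers above. A hand-written version would need explicit loops that check digit counts, separators and word boundaries.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class DateExtractionSketch
{
    // Four digits, a dash, two digits, a dash, two digits,
    // with a word boundary on each side.
    private static readonly Regex IsoDate = new Regex(@"\b\d{4}-\d{2}-\d{2}\b");

    public static List<string> FindDates(string input)
    {
        var dates = new List<string>();
        foreach (Match m in IsoDate.Matches(input))
        {
            dates.Add(m.Value);
        }
        return dates;
    }

    public static void Main()
    {
        // Prints 2019-03-07 and 2019-04-15, one per line.
        foreach (string d in FindDates("Released 2019-03-07, patched 2019-04-15."))
        {
            Console.WriteLine(d);
        }
    }
}
```

Note that the pattern only checks the shape of the text; it does not validate that the month or day is in range.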
I've done some benchmarks with two functions called FunctionOne (string operations) and FunctionTwo (Regex). They should both get all matches between '<' and '>'.
benchmark #1:
- times called: 1'000'000
- input: 80 characters
- duration (string operations // FunctionOne): 1.12 sec
- duration (regex operations // FunctionTwo): 1.88 sec
benchmark #2:
- times called: 1'000'000
- input: 2000 characters
- duration (string operations): 27.69 sec
- duration (regex operations): 41.436 sec
Conclusion: String operations will almost always beat regular expressions if they are programmed efficiently. But the more complex the task gets, the harder it becomes for string operations to keep up, not only in performance but also in maintainability.
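For anyone who wants to reproduce numbers like these, the following is one possible way to time repeated calls with `Stopwatch`. It is not necessarily how the figures above were measured; `TimeCalls` is a hypothetical helper and the sample input is made up.

```csharp
using System;
using System.Diagnostics;

public static class BenchmarkSketch
{
    // Hypothetical helper: runs `action` `iterations` times
    // and returns the total elapsed time.
    public static TimeSpan TimeCalls(Action action, int iterations)
    {
        action(); // one warm-up call so JIT compilation is not timed

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        sw.Stop();
        return sw.Elapsed;
    }

    public static void Main()
    {
        string input = "<a>some text<b>more text";
        int sink = 0; // accumulates results so the work has an observable effect

        TimeSpan elapsed = TimeCalls(() => sink += input.IndexOf('>'), 1000000);

        Console.WriteLine("IndexOf: " + elapsed.TotalSeconds + " sec (sink = " + sink + ")");
    }
}
```

To compare the two approaches, pass a lambda that calls each function with the same input and the same iteration count, and run the comparison in a release build.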
Code FunctionOne
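    // requires: using System.Collections.Generic; using System.Text;
    // Collects the text between each '<' and the following '>' by scanning the input once.
    private void FunctionOne(string input) {
        var matches = new List<string>();
        var match = new StringBuilder();
        Boolean startRecording = false;
        foreach (char c in input) {
            if (c.Equals('<')) {
                startRecording = true;
                continue;
            }
            // Only close a match that was actually opened by a '<'.
            if (c.Equals('>') && startRecording) {
                matches.Add(match.ToString());
                match = new StringBuilder();
                startRecording = false;
            }
            if (startRecording) {
                match.Append(c);
            }
        }
    }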
Code FunctionTwo
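    // requires: using System.Collections.Generic; using System.Text.RegularExpressions;
    // Constructed once, outside the function, so it is not rebuilt on every call.
    Regex regx = new Regex("<.*?>");
    private void FunctionTwo(string input) {
        Match m = regx.Match(input);
        var results = new List<string>();
        while (m.Success) {
            // m.Value includes the surrounding '<' and '>'.
            results.Add(m.Value);
            m = m.NextMatch();
        }
    }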
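One thing to keep in mind when comparing the two: FunctionOne collects the text between the brackets, while `m.Value` in FunctionTwo includes the brackets themselves. Below is a sketch of a variant in which both approaches return the same strings, using `IndexOf`/`Substring` on one side and a capture group on the other. The class and method names are hypothetical, it has not been benchmarked, and it assumes there is no line break between a '<' and its '>' (by default `.` does not match a newline).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class BetweenBracketsSketch
{
    // The capture group holds only the text between '<' and '>'.
    private static readonly Regex Bracketed = new Regex("<(.*?)>");

    public static List<string> WithIndexOf(string input)
    {
        var results = new List<string>();
        int open = input.IndexOf('<');
        while (open >= 0)
        {
            int close = input.IndexOf('>', open + 1);
            if (close < 0)
            {
                break; // unterminated '<': nothing more to collect
            }
            results.Add(input.Substring(open + 1, close - open - 1));
            open = input.IndexOf('<', close + 1);
        }
        return results;
    }

    public static List<string> WithRegex(string input)
    {
        var results = new List<string>();
        for (Match m = Bracketed.Match(input); m.Success; m = m.NextMatch())
        {
            results.Add(m.Groups[1].Value);
        }
        return results;
    }

    public static void Main()
    {
        const string sample = "x<a>y<bc>z<";
        Console.WriteLine(string.Join(",", WithIndexOf(sample))); // prints a,bc
        Console.WriteLine(string.Join(",", WithRegex(sample)));   // prints a,bc
    }
}
```

Letting `IndexOf` jump straight to the next bracket avoids appending to a `StringBuilder` one character at a time, which is the kind of "programmed efficiently" detail the conclusion refers to.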