How to find a position of a substring within a string with fuzzy match
Here's a recursive implementation that works, but might not be fast enough. The worst case scenario is when a match can't be found, and all but the last char in "What" gets matched at every index in Where. In that case the algorithm will make Length(What)-1 + Tolerance comparasions for each char in Where, plus one recursive call per Tolerance. Since both Tolerance and the length of What are constnats, I'd say the algorithm is O(n). It's performance will degrade linearly with the length of both "What" and "Where".
function BrouteFindFirst(What, Where:string; Tolerance:Integer; out AtIndex, OfLength:Integer):Boolean;
var i:Integer;
aLen:Integer;
WhatLen, WhereLen:Integer;
function BrouteCompare(wherePos, whatPos, Tolerance:Integer; out Len:Integer):Boolean;
var aLen:Integer;
aRecursiveLen:Integer;
begin
// Skip perfect match characters
aLen := 0;
while (whatPos <= WhatLen) and (wherePos <= WhereLen) and (What[whatPos] = Where[wherePos]) do
begin
Inc(aLen);
Inc(wherePos);
Inc(whatPos);
end;
// Did we find a match?
if (whatPos > WhatLen) then
begin
Result := True;
Len := aLen;
end
else if Tolerance = 0 then
Result := False // No match and no more "wild cards"
else
begin
// We'll make an recursive call to BrouteCompare, allowing for some tolerance in the string
// matching algorithm.
Dec(Tolerance); // use up one "wildcard"
Inc(whatPos); // consider the current char matched
if BrouteCompare(wherePos, whatPos, Tolerance, aRecursiveLen) then
begin
Len := aLen + aRecursiveLen;
Result := True;
end
else if BrouteCompare(wherePos + 1, whatPos, Tolerance, aRecursiveLen) then
begin
Len := aLen + aRecursiveLen;
Result := True;
end
else
Result := False; // no luck!
end;
end;
begin
WhatLen := Length(What);
WhereLen := Length(Where);
for i:=1 to Length(Where) do
begin
if BrouteCompare(i, 1, Tolerance, aLen) then
begin
AtIndex := i;
OfLength := aLen;
Result := True;
Exit;
end;
end;
// No match found!
Result := False;
end;
I've used the following code to test the function:
procedure TForm18.Button1Click(Sender: TObject);
var AtIndex, OfLength:Integer;
begin
if BrouteFindFirst(Edit2.Text, Edit1.Text, ComboBox1.ItemIndex, AtIndex, OfLength) then
Label3.Caption := 'Found @' + IntToStr(AtIndex) + ', of length ' + IntToStr(OfLength)
else
Label3.Caption := 'Not found';
end;
For case:
String: Where is our caat if any?
Substring: your cat
Tolerance: 2
Result: match on character 10
it shows a match on character 9, of length 6. For the other two examples it gives the expected result.