agrep: only return best match(es)
The agrep package uses Levenshtein Distances to match strings. The package RecordLinkage has a C function to calculate the Levenshtein Distance, which can be used directly to speed up your computation. Here is a reworked ClosestMatch
function that is around 10x faster
library(RecordLinkage)
ClosestMatch2 = function(string, stringVector){
distance = levenshteinSim(string, stringVector);
stringVector[distance == max(distance)]
}
RecordLinkage package was removed from CRAN, use stringdist instead:
library(stringdist)
ClosestMatch2 = function(string, stringVector){
stringVector[amatch(string, stringVector, maxDist=Inf)]
}