Write a function that compares two strings and returns a third string containing only the letters that appear in both
For an alternative solution, you can view the strings as enumerables and use Intersect()
like this:
public static string Common(string first, string second)
{
return new string((first.Intersect(second)).ToArray());
}
That's fine for a first approach, but you can make a few improvements, and there's a small error.
- If
b
contains a character ina
that's already inc
, you'll repeat it. - To avoid repeats, you might consider using a
Set
to store the characters, since aSet
won't have repeats. - Assembling strings with
+=
concatenation is usually inefficient; consider using aStringBuilder
or an analogous string-assembly class. - Your variable names aren't very descriptive.
- If
a
orb
are empty, you don't have to do any work at all! Just return an empty string.
You can think about some more sophisticated improvements, too, by imagining how your algorithm scales if you started to use huge strings. For example, one approach might be that if one string is much longer than the other, you can sort the longer one and remove duplicates. Then you can do a binary search on the characters of the shorter string very quickly.
To improve on John Feminella's last suggestion, quicker than a binary search (for any sufficiently long string) would be a lookup in a hashset; or a lookup in a 256-element array of booleans, if those are ASCII or UTF8 characters instead of UTF16 characters.
- Instantiate an empty hashset, or an empty array of booleans; name this variable
found
. - For each char in the first string, either add the char to the hashset (but beware of duplicate characters in the first string),
or set the corresponding element in the
found
array to true; this will take linear O(n) time. - For each char in the second string, test whether the char exists in the hashset or whether the corresponding element in the 'found' array is true: if found then add the character to the return string, and also remove the character from the hashset or the clear the boolean element in the array, so that it won't be found again (to beware of duplicate characters in the second string); this will take linear O(n) time.