Best way to merge multiple STL containers, removing duplicate elements?
For an unordered lists, your set trick is probably one of the best. It each insert should be O(log n), with N inserts required, and traversing will be O(n), giving you O(N*log n). The other option is to run std::sort on each list individually and then walk through them in parallel using std::set_union, which removes duplicates for you. This will also be O(n*log n), so if you're worried about performance, you'll have to profile. If you're not, do whichever makes more sense to you.
Edit:
set_union
will only work if there are no duplicates in the original lists, otherwise you'll have to go with sort
, merge
, unique
and erase
. The big O performance is still the same, with the same caveats about profiling.
template <typename container>
container unique_merge(container c1, container c2)
{
std::sort(c1.begin(), c1.end());
std::sort(c2.begin(), c2.end());
container mergeTarget;
std::merge(c1.begin(), c1.end(), c2.begin(), c2.end(),
std::insert_iterator(mergeTarget, mergeTarget.end())
);
std::erase(
std::unique(mergeTarget.begin(), mergeTarget.end()),
mergeTarget.end()
);
return mergeTarget;
}
You are going to need to either sort (either explicitly, or implicitly via a sorted container like set).
There is a common idiom using std::sort/std::unique/std::erase to get unique elements in a container.
So create a container with the contents of c1, append the contents of c2, then sort, move unique elements to the end, and erase them. Something like this:
container c(c1.begin(), c1.end());
c.insert(c.end(), c2.begin(), c2.end());
c.erase(std::unique(c.begin(), c.end()), c.end());
Use the std::set_union algorithm from the STL. You'll need to sort your input lists first though -- or create copies of your input lists, sort them, then use std::set_union.