Efficient way to filter out elements from std::vector
Yes, you can use std::remove_copy_if, e.g.
std::remove_copy_if(
all_items.begin(),
all_items.end(),
std::back_inserter(filter_items),
[&bad_ids](const mystruct& item) { return std::find(bad_ids.begin(), bad_ids.end(), item.id) != bad_ids.end(); });
Or you can use std::remove_if and erase the bad elements from the vector directly, e.g.
all_items.erase(
std::remove_if(
all_items.begin(),
all_items.end(),
[&bad_ids](const mystruct& item) { return std::find(bad_ids.begin(), bad_ids.end(), item.id) != bad_ids.end(); }),
all_items.end());
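For reference, here is a minimal sketch that wraps both variants above into functions, assuming the mystruct type from the question has an int id and a std::string name. The helper names is_bad, filtered_copy and erase_bad are hypothetical, chosen only for illustration.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Assumed shape of the question's struct.
struct mystruct {
    int id;
    std::string name;
};

// True when item.id appears in bad_ids (linear search, as in the snippets above).
inline bool is_bad(const std::vector<int>& bad_ids, const mystruct& item) {
    return std::find(bad_ids.begin(), bad_ids.end(), item.id) != bad_ids.end();
}

// Variant 1: leave the input untouched and copy the good items out.
std::vector<mystruct> filtered_copy(const std::vector<mystruct>& all_items,
                                    const std::vector<int>& bad_ids) {
    std::vector<mystruct> filter_items;
    std::remove_copy_if(all_items.begin(), all_items.end(),
                        std::back_inserter(filter_items),
                        [&bad_ids](const mystruct& item) { return is_bad(bad_ids, item); });
    return filter_items;
}

// Variant 2: erase-remove idiom, modifies the vector in place.
void erase_bad(std::vector<mystruct>& all_items, const std::vector<int>& bad_ids) {
    all_items.erase(
        std::remove_if(all_items.begin(), all_items.end(),
                       [&bad_ids](const mystruct& item) { return is_bad(bad_ids, item); }),
        all_items.end());
}
```

Both variants do a linear search of bad_ids for every item, so the cost grows with the product of the two sizes. If bad_ids is large, storing it in a std::set or std::unordered_set is probably worth considering.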
Expanding on @songyuanyao's correct answer: it never hurts to keep a small library of container helpers to make code more expressive.
#include <iostream>
#include <string>    // std::string
#include <utility>   // std::forward
#include <vector>
#include <algorithm>
struct mystruct {
int id;
std::string name;
};
template<class T, class A, class Pred>
std::vector<T, A> copy_unless(std::vector<T, A> container, Pred&& pred)
{
container.erase(std::remove_if(container.begin(), container.end(),
std::forward<Pred>(pred)),
container.end());
return container;
}
template<class Container, class Pred>
bool any_match(Container&& container, Pred&& pred)
{
return std::find_if(container.begin(), container.end(), pred) != container.end();
}
int main()
{
std::vector<mystruct> all_items = {{151, "test1"}, {154, "test4"}, {152, "test2"}, {151, "test1"}, {151, "test1"}, {153, "test3"}};
std::vector<int> bad_ids = {151, 152};
auto is_bad = [&bad_ids](mystruct const& item)
{
auto match_id = [&item](int id){ return item.id == id; };
return any_match(bad_ids, match_id);
};
auto filter_items = copy_unless(all_items, is_bad);
for (auto&& f : filter_items) {
std::cout << "Good item: " << f.id << std::endl;
}
}
I'm sure I remember a library like this in Boost, but for the life of me I can't remember which one it is.
I'd suggest Boost Range:
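#include <boost/range/adaptors.hpp>       // boost::adaptors::filtered
#include <boost/range/iterator_range.hpp> // boost::copy_range
#include <iostream>
#include <set>
#include <string>
#include <vector>
using boost::adaptors::filtered;
// mystruct and myvec as assumed from the question
struct mystruct { int id; std::string name; };
using myvec = std::vector<mystruct>;
int main() {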
myvec all_items = { { 151, "test1" }, { 154, "test4" }, { 152, "test2" },
{ 151, "test1" }, { 151, "test1" }, { 153, "test3" } };
auto is_good = [bad_ids = std::set<int> { 151, 152 }](mystruct const& v) {
return bad_ids.end() == bad_ids.find(v.id);
};
// just filter on the fly:
for (auto& f : all_items | filtered(is_good)) {
std::cout << "Good item: " << f.id << std::endl;
}
// actually copy:
auto filter_items = boost::copy_range<myvec>(all_items | filtered(is_good));
}
Prints:
Good item: 154
Good item: 153
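If Boost is not available, the "actually copy" step can be approximated with the standard library alone. This is only a sketch under that assumption, not the Boost Range approach itself: it uses std::copy_if with the same std::set lookup, and keep_good is a hypothetical helper name.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Assumed shape of the question's struct.
struct mystruct {
    int id;
    std::string name;
};

// Hypothetical helper: copies every item whose id is NOT in bad_ids.
std::vector<mystruct> keep_good(const std::vector<mystruct>& all_items,
                                const std::set<int>& bad_ids) {
    std::vector<mystruct> filter_items;
    std::copy_if(all_items.begin(), all_items.end(),
                 std::back_inserter(filter_items),
                 [&bad_ids](const mystruct& v) {
                     // std::set::find is logarithmic in the number of bad ids
                     return bad_ids.find(v.id) == bad_ids.end();
                 });
    return filter_items;
}
```

Unlike the filtered adaptor, this always materializes a new vector, so it does not cover the "filter on the fly" case.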
Improving...
You could improve style by factoring things out a little:
Assuming you have a utility like contains:
template <typename... Arg, typename V> bool contains(std::set<Arg...> const &set, V const &v) {
return set.end() != set.find(v);
}
template <typename... Arg, typename V> bool contains(std::vector<Arg...> const &vec, V const &v) {
return vec.end() != std::find(vec.begin(), vec.end(), v);
}
Then it becomes more readable:
auto is_good = [&bad_ids](auto& v) { return !contains(bad_ids, v.id); };
for (auto& f : all_items | filtered(is_good)) {
std::cout << "Good item: " << f.id << std::endl;
}
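To see how the two contains overloads get picked, here is a small Boost-free sketch. The contains definitions are repeated from above so the block stands alone; count_good is a hypothetical helper added only for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <vector>

// Overload for std::set: uses the member find (logarithmic).
template <typename... Arg, typename V>
bool contains(std::set<Arg...> const& set, V const& v) {
    return set.end() != set.find(v);
}

// Overload for std::vector: falls back to a linear std::find.
template <typename... Arg, typename V>
bool contains(std::vector<Arg...> const& vec, V const& v) {
    return vec.end() != std::find(vec.begin(), vec.end(), v);
}

// Hypothetical helper: counts the ids that are not in bad_ids.
// Bad may be std::set<int> or std::vector<int>; overload resolution
// selects the matching contains for either one.
template <typename Bad>
std::size_t count_good(std::vector<int> const& ids, Bad const& bad_ids) {
    return static_cast<std::size_t>(
        std::count_if(ids.begin(), ids.end(),
                      [&bad_ids](int id) { return !contains(bad_ids, id); }));
}
```

The calling code stays the same if bad_ids later changes from a vector to a set; only the lookup cost changes.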
Now, I feel like the whole bad_ids list could probably also be dynamic. But if it weren't, you could be more "in-place" using Phoenix:
Peak Hipster:
for (auto& f : all_items | filtered(!contains_(std::set<int> { 151, 152 }, arg1->*&mystruct::id))) {
std::cout << "Good item: " << f.id << std::endl;
}
I know. That's pushing it for no good reason, but hey. Just showing :)