Using The [] Operator Efficiently With C++ unordered_map
Firstly could someone clarify whether in C++ the use of the [] operator in conjunction with an unordered_map for lookups wraps a call to the Find() method, or is using the [] operator quicker than Find()?
There is no rule for that. An implementation of []
could use find()
, it could perform the lookup by itself or it could delegate the lookup to some private method that is also used by find()
internally.
There is also no guarantee on which one is faster. find()
involves an overhead in constructing and returning an iterator, whereas []
will probably be slower if the key does not exist, as it inserts a new value in this case.
(...) is there a way (perhaps by use of pointers or something) that I might only perform one look up in any case (...)
If the key is not in the map, []
will insert a new default-constructed value, and return a reference. Therefore, you could store that reference to save the second lookup:
int& stored_val = map[key]; // Note the reference
if (stored_val) return stored_val;
// Use the reference to save a second lookup.
stored_val = value;
return value;
operator[]
will insert an entry for you with a default-constructed value, if one isn't already there. It is equivalent to, but will probably be implemented more efficiently than:
iterator iter = map.find(key);
if(iter == map.end())
{
iter = map.insert(value_type(key, int())).first;
}
return *iter;
operator[]
can be quicker than doing the work manually with find()
and insert()
, because it can save having to re-hash the key.
One way you can work around having multiple lookups in your code is to take a reference to the value:
int &stored_val = map[key];
// return the corresponding value if we find the key in the map - ie != 0
if (stored_val) return stored_val;
// if not in map
stored_val = value;
return value;
Note that if the value doesn't exist in the map, operator[]
will default-construct and insert one. So while this will avoid multiple lookups, it might actually be slower if used with a type that is slower to default-construct + assign than to copy- or move-construct.
With int
though, which cheaply default-constructs to 0, you might be able to treat 0 as a magic number meaning empty. This looks like it might be the case in your example.
If you have no such magic number, you've got two options. What you should use depends on how expensive it is for you to compute the value.
First, when hashing the key is cheap but computing the value is expensive, find()
may be the best option. This will hash twice but only compute the value when needed:
iterator iter = map.find(key);
// return the corresponding value if we find the key in the map
if(iter != map.end()) return *iter;
// if not in map
map.insert(value_type(key, value));
return value;
But if you've got the value already, you can do it very efficiently -- perhaps slighty more efficiently than using a reference + magic number as above:
pair<iterator,bool> iter = map.insert(value_type(key, value));
return *iter.first;
If the bool returned by map.insert(value_type)
is true, the item was inserted. Otherwise, it already existed and no modifications were made. The iterator returned points to the inserted or existing value in the map. For your simple example, this may be the best option.
You can both check if an element exists, and insert a new element if it does not exist, with the special insert
function that returns a pair<iterator, bool>
in which the boolean value tells you if the value has been actually inserted. For example, the code here:
unordered_map<char, int> mymap;
pair<unordered_map<char,int>::iterator,bool> ret;
// first insert function version (single parameter):;
mymap.insert ( pair<char,int>('z',200) );
ret=mymap.insert (pair<char,int>('z',500) );
if (ret.second==false)
{
cout << "element 'z' already existed";
cout << " with a value of " << ret.first->second << endl;
}
The code here inserts the pair <'z',200>
into the map if it does not exist. It returns the iterator where it is inserted if the value of the second element of the pair returned is true, or, it returns the iterator where the element actually was, if the second element of the pair is false.