Standard Template Library II
unordered_map - 2020
While a map is an ordered sequence of pairs (key, value) in which we can look up a value based on a key, an unordered_map is an unordered sequence of pairs (key, value).
Unordered map is an associative container that stores elements formed by the combination of a key value and a mapped value, and which allows for fast retrieval of individual elements based on their keys.
To find an element from a vector, we needs to examine all elements from the beginning, and in its worst case, to the end. The average cost is O(N).
For a map, to find an element, we should start from the root of the balanced tree (red-black tree), and in its worst case, to the leaf. The average cost of the look-up is O(logN).
For integers and strings, we can have better look-up than map's tree search. With a given key, we can compute an index into a vector. The index is called hash value. The container that uses the technique is called hash table. Usually, the number of possible keys is far larger than the number of slots available from the hash table. So, some of the indices can point to vectors that have lots of elements. However, the main advantage of a hash table is that on the average the cost of a look-up is O(1), and this is clearly a significant advantage for large maps.
Including hash tables (unordered associative containers) in the C++11 standard library is one of the most required features. Although hash tables are less efficient than a balanced tree in the worst case (in the presence of many collisions), they perform better in many real applications.
Collisions are managed only through linear chaining because the committee didn't consider it to be opportune to standardize solutions of open addressing that introduce quite a lot of intrinsic problems (above all when erasure of elements is admitted). To avoid name clashes with non-standard libraries that developed their own hash table implementations, the prefix unordered was used instead of hash.
As discussed earlier, the unordered_map is implemented using a hash table, and the map is implemented using a balanced binary tree, and vector is implemented using an array.
Stroustrup, in his book, "Programming Principles and Practice Using C++", talked about the rule of thumb of which container we should choose:
- Use vector unless you have a good reason not to.
- Use map if you need to look up based on a value (and if your key type has a reasonable and efficient less-than operation).
- Use unordered_map if you need to do a lot of lookup in a large map and you don't need an ordered traversal (and if you can find a good hash function for your key type).
template < class Key, // unordered_map::key_type class T, // unordered_map::mapped_type class Hash = hash<Key>, // unordered_map::hasher class Pred = equal_to<Key>, // unordered_map::key_equal class Alloc = allocator< pair<const Key,T> > // unordered_map::allocator_type > class unordered_map;
The usage of unodered_map is not different from the way of using map:
/* m.cpp */ #include <iostream> #include <map> #include <unordered_map> #include <string> using namespace std; int main() { map<unsigned long, string> employer; unordered_map<unsigned long, unsigned> salary; employer[2014021701] = "Celine Dion"; employer[2014021702] = "Whitney Houston"; for(auto e: employer) cout << "name: " << e.second << "\t id: " << e.first << endl; unsigned total_payroll = 0; salary[2014021702] = 654321; salary[2014021701] = 123456; for(auto s: salary) total_payroll += s.second; cout << "total payrolls $" << total_payroll << endl; return 0; }
Output:
$ g++ -std=c++11 -o m m.cpp $ ./m name: Celine Dion id: 2014021701 name: Whitney Houston id: 2014021702 total payrolls $777777
Ph.D. / Golden Gate Ave, San Francisco / Seoul National Univ / Carnegie Mellon / UC Berkeley / DevOps / Deep Learning / Visualization