Standard Template Library (Intro, Vector, List) - 2020
The Standard Template Library (STL) is the heart of the C++ standard library. There is no official definition of the STL, however, generally accepted definition may be this:
The STL is the parts of C++ Standard Library what work with iterators.
So, it includes the containers, part of the iostream libraries, function objects, and algorithms.
The STL is an example of generic programming. While the object-oriented programming concentrates on the data aspect of programming, generic programming concentrates on algorithms.
A goal of generic programming is to write code that is independent of data types. Templates are the C++ tools for creating generic program. Templates let us define a function or class in terms of a generic type. The STL goes further by providing a generic representation of algorithms.
STL has three key components - containers (popular templatized data structures), iterators and algorithms.
- Containers
Containers manage collection of element. To meet different needs, the STL provides different kinds of containers.
- Sequential containers
These are ordered collections in which every element has a certain position. This position depends on the time and place of the insertion, but it is independent of the value of the element. These include vector, deque, and list. - Associative containers
These are sorted collections in which the actual position of an element depends on its value due to a sorting criterion. These include set, multiset, map, and multimap.
- Sequential containers
- Iterators
Iterators are objects that can navigate over elements. An iterator represents a certain position in a container. Iterators are divided into five groups, based on the operations they support.
- Input iterators
These are read-only iterators where each iterated location may be read only once. The most common manifestation of input iterator is istream_iterators. - Output iterators
These are write-only iterators where each iterated location may be read only once. The most common manifestation of output iterator is ostream_iterator. - Forward iterators
These have the capabilities of both input and output iterators, but they can read or write a single location repeatedly. They don't support operator--, so they can move only forward. - Bidirectional iterators
These are just like forward iterators, except they can go backward as well as forward. The standard associative containers all offer bidirectional iterators. - Random access iterators
These do everything bidirectional iterators do, but they also offer iterator arithmetic, i.e., the ability to jump forward or backward in a single step. vector, string, and deque each provide random access iterators.
- Input iterators
- Algorithms
The STL provides several standard algorithms for the processing of elements of collections. They can search, sort, modify, or simply use the element for different purpose. Algorithms use iterators. So, an algorithm has to be written only once to work with arbitrary containers because the iterator interface for iterators is common for all container types.
A vector manages its elements in a dynamic array. It enables random access, which means we can access each element directly with index.
Appending and removing elements at the end of the array is very fast. But inserting an element in the middle or at the beginning of the array takes time because all the following elements should be moved to make room for it while maintaining the order.
Double-ended-queue (deque) is a dynamic array that is implemented so that it can grow in both directions. So, inserting element at the end and at the beginning is fast. Inserting and elements in the middle, however, takes time because element must be moved.
A list is implemented as a doubly linked list of element. In other words, each element in a list has its own segment of memory and refers to its predecessor and its successor. Lists do not provide random access. General access to an arbitrary element takes linear time and this is a lot worse than vectors and deques.
The advantage of a list is that the insertion or removal of an element is fast at any position. Only the links must be changed. This implies that moving an element in the middle of a list is very fast compared with moving an element in a vector or a deque.
#include <vector> int main () { std::vector<int> vec1; // empty vector of ints std::vector<int> vec2 (3); // 3 ints std::vector<int> vec2 (3,10); // 3 ints with value 10 std::vector<int> vec3 (vec2.begin(),vec2.end()); // iterating via vec2 std::vector<int> vec4 (vec3); // a copy of vec3 int myInt[] = {1,2,3}; // construct from arrays: std::vector<int> vec5 (myInt, myInt + sizeof(myInt) / sizeof(int) ); return 0; }
#include <vector> #include <iostream> #include <string> int main() { std::vector<std::string> Scientist; Scientist.push_back("James Maxwell"); Scientist.push_back("Edwin Hubble"); Scientist.push_back("Charles Augustin de Coulomb"); Scientist.push_back("Louis Pasteur"); std::cout << "Now, we have " << Scientist.size() << " scientists.\n"; Scientist.pop_back(); std::cout << "Now, we have " << Scientist.size() << " scientists.\n"; std::vector<std::string>::iterator iter; for (iter = Scientist.begin(); iter != Scientist.end(); ++iter) std::cout << *iter << std::endl; Scientist.clear(); if(Scientist.empty()) std::cout << "Nothing in the list\n"; else std::cout << "You have something in the list\n"; return 0; }
Output is:
Now, we have 4 scientists. Now, we have 3 scientists. James Maxwell Edwin Hubble Charles Augustin de Coulomb Nothing in the list
The push_back() adds a new element to the end of a vector. As a result, Scientist[0] is now equal to "James Maxwell", and Scientist[3] is now equal to "Louis Pasteur".
The pop_back() removes the last element of a vector and reduces the vector size by one. In the example, the size of "Scientist" is reduced from 4 to 3.
The clear() member function removes all of the elements of a vector and sets its size to 0.
The empty() returns "true" if the vector is empty, otherwise, it returns "false."
We can declare iterator as following: container type
vector<string>::iterator iter;Iterators are values that identify an element in a container. We can access/change the value of the element using iterator.
We can also declare another iterator:
vector<string>::const_iterator iter;
This constant iterator is just like a regular iterator except that we can think of a constant iterator as providing read-only access. The iterator itself, however, can change. In other words, we can move iter all around the vector. But we can't change the value of any of the elements through iter.
In the above example, we assign the return value of Scientist.begin() to iter. The begin() returns an iterator that refers to a container's first element. So, the statement assigns an iterator that refers to the first element of Scientist to iter. Then, while looping through the elements, we test the return value of Scientist.end() against iter to make sure the two are equal.
The end() returns an iterator one past the last element in a container. This means the loop will continue until iter has looped through all of the elements in Scientist. In the action statement in the loop, ++iter, increments iter, which traverse it to the next element in the vector. Depending upon the iterator, we can perform other mathematical operations on iterators to traverse them around a container.
In the body of the loop, we send *iter to cout. By placing the dereference operator(*) in front of iter, we display the value of the element to which the iterator refers.
We can insert an item, for example, an the beginning:
Scientist.insert(Scientist.begin(),"Leonardo da Vinci");not or we can remove an item from the list, an element not at the end but from the middle:
Scientist.erase(Scientist.begin() + 2);
Vectors grow dynamically, and every vector has a specific size. When we add a new element to a vector, the computer reallocates memory and may even copy all of the vector elements into this new memory, and this can cause a performance hit.
The capacity() returns the capacity of a vector (the number of elements that a vector can hold before a program must allocate more memory for it). So, a vector's capacity is not the same thing as its size which is the number of elements a vector currently holds. In short, capacity() is the size of the container and the size() is the currently filled level. The capacity() is always equal to or larger than the size. The difference between them is the number of elements that we can add to the vector before the array under the hood needs to be reallocated.
Before we look into the reserve() we need to know what's happening whenever a vector needs more space. It's doing similar to realloc operation. New memory allocation, copy from the old to the new, destruct old objects, deallocate old memory, invalidation of iterators. It's expensive!
The reserve() increases the capacity of a vector to the number supplied as an argument. The reserve() gives us control over when a reallocation of additional memory occurs.
Scientist.reserve(20); // reserve memory for 20 additional elementsBy using reserve() to keep a vector's capacity large enough for our purposes, we can delay memory reallocation.
According to Scott Meyers, the following code requires 2-18 reallocations:
vector<int> v; for(int i = 0; i < 1000; ++i) v.push_back(i);
So, he suggested we should use resever() to reduce the costs:
vector<int> v; v.reserve(1000); for(int i = 0; i < 1000; ++i) v.push_back(i);
Adding or removing an element from the end of a vector using push_back() or pop_back() is extremely efficient. Adding or removing an element at any other element in a vector, however, can require more work because we may have to move multiple elements to accommodate the insertion or deletion. With large vectors this can cause a performance hit.
Here is another example of vector of vector, 3x2, matrix initialization:
#include <iostream> #include <vector> using namespace std; #define ROW 3 #define COL 2 int main() { // vector with ROW rows, each row has COL columns with initial value of 99 vector<vector<int> > v2D(ROW, vector<int>(COL,99)); for(int i = 0; i < ROW; ++i) { for(int j = 0; j < COL; ++j) { cout << v2D[i][j] << " "; } cout << endl; } return 0; }
Output is:
99 99 99 99 99 99
Actually, similar initialization is used for a C++ Starter Package of Google AI Ants 2011:
/* struct for representing a square in the grid. */ struct Square { bool isVisible, isWater, isHill, isFood; int ant, hillPlayer; ... Square() { isVisible = isWater = isHill = isFood = 0; ant = hillPlayer = -1; }; ... }; ... std::vector<std::vector<Square> > grid; ... grid = <vector<vector<Square> >(rows, vector<Square>(cols, Square())); ... std::vector<std::vector<bool> > visited(rows, std::vector<bool>(cols, 0));
We can make a matrix with vector:
#include <iostream> #include <vector> int main() { // 3 x 2 matrix std::vector<std::vector<int> > myVecA(3,std::vector<int>(2) ) ; myVecA[0][0] = 0; myVecA[0][1] = 1; myVecA[1][0] = 10; myVecA[1][1] = 11; myVecA[2][0] = 20; myVecA[2][1] = 21; for(int i = 0;i < 3 ; i++) { for(int j = 0; j < 2; j++) { std::cout << myVecA[i][j] << " " ; } std::cout << std::endl; } std::cout << std::endl; std::vector<std::vector<int> > myVecB; for (int i = 0; i < 3; i++) { std::vector<int> row; // Create an empty row for(int j = 0; j < 2; j++) { row.push_back(i+j); // Add an element (column) to the row } myVecB.push_back(row); // Add the row to the main vector } for(int i = 0;i < 3 ; i++) { for(int j = 0; j < 2; j++) { std::cout << myVecB[i][j] << " " ; } std::cout << std::endl; } std::cout << std::endl; std::vector<std::vector<int> > myVecC; for (int i = 0; i < 3; i++) { myVecC.push_back(std::vector<int>()); // Add an empty row } int dummy = 0; for (int i = 0; i < 3; i++) { for (std::vector<std::vector<int> >::iterator it = myVecC.begin(); it != myVecC.end(); ++it) { it->push_back(i*100 + dummy++); // Add column to all rows } } for(int i = 0;i < 3 ; i++) { for(int j = 0; j < 2; j++) { std::cout << myVecC[i][j] << " " ; } std::cout << std::endl; } std::cout << std::endl; return 0; }
Output from the run is:
0 1 10 11 20 21 0 1 1 2 2 3 0 103 1 104 2 105
Here is an example of multi-dimensional vectors used to calculate matrix multiplication:
#include <vector> using namespace std; void mat_mul(const vector<vector<int> >&a;, int row_a, const vector<vector<int> >&b;, int row_b, vector<vector<int> >&c;) { for(int i = 0; i < row_a; i++) for(int j = 0; j < 4; j++) for(int k = 0; k < row_b; k++) c[i][j] += a[i][k]*b[k][j]; } int main() { vector<vector<int> > a(2, vector<int>(3)); vector<vector<int> > b(3, vector<int>(4)); vector<vector<int> > c(2, vector<int>(4)); a[0][0] = 6; a[0][1] = 5; a[0][2] = -2; a[1][0] = -1; a[1][1] = -4; a[1][2] = 0; b[0][0] = 6; b[0][1] = 1; b[0][2] = 7; b[0][3] = 3; b[1][0] = 2; b[1][1] = 4; b[1][2] = -1; b[1][3] = 5; b[2][0] = 8; b[2][1] = 9; b[2][2] = 10; b[2][3] = 11; int row_a = 2; int row_b = 3; mat_mul(a,row_a,b,row_b, c); /* c = { 30 8 17 21 -14 -17 -3 -23 } */ return 0; }
The vector class in this section is not the stl's vector, but implementation wise, it should be similar, and we get some insight how it works.
When we change sizes, the most basic operation is vector::reserve(). This operation adds space for new elements, and its code looks like this;
class vector { int sz; // current number of elements double *elem; // address of first element int space; // current number of elements + free_space public: void reserve(int); void resize(int); int capacity() const; void push_back(double); }; void vector::reserve(int amount) { if(amount <= space) return; // we have enough space double *pd = new double[amount]; // allocate new space for (int i = 0; i < sz; ++i) pd[i] = elem[i]; // copy old element delete[] elem; // deallocate old space elem = pd; space = amount; }
We are not initializing the elements of the reserved space, and push_back() will do it. Here, we are just reserving space. So, we can get the information about the amount of available space by using:
int vector::capacity() const {return space;}
If v is a vector, then the space for our new vector element when we use with out push_back() will be:
v.capacity()-v.size()
That's the currently available size for new elements without reallocation.
If we utilize the reserve(), our resize() fairly straightforward: make the vector have the requested size of elements and initialize each new element with 0.0.
void vector::resize(int requested_size) { reserve(requested_size); for(int i = sz; i < requested_size; ++i) elem[i] = 0; sz = requested_size; }
So, the push_back() looks like this:
void vector::push_back(double d) { if(space == 0) reserve(10); else if (sz == space) reserve(2*space); // we need more space, so double the size elem[sz] = d; // put d at the end ++sz; // increase the number of element, sz }
vector& vector::operator=(const vector& v) { if(this == &v;) return; if(v.sz <= space) { for(int i = 0; i < sz; ++i) elem[i] = v.elem[i]; sz = v.sz; return *this; } double *pd = new double[v.sz]; for(int i = 0; i < sz; ++i) pd[i] = v.elem[i]; delete[] elem; elem = pd; space = sz = v.sz; return *this; }
The operator=() implemented above may be the simplest:
- Allocate memory for a copy.
- Copy the elements/
- Free the old memory allocation.
- Update sz, elem, space.
The code below implements queue using vector. It's a contrived one. The vector contains objects which have only one integer member.
#include <iostream> #include <vector> using namespace std; typedef struct object { int data; } Object; class queue { public: explicit queue(int); ~queue(); void pop(); void push(int); int peek(){return qv.front()->data;} private: vector<Object*> qv; }; queue::queue(int n) { Object *ptr = new Object; ptr->data = n; qv.push_back(ptr); } void queue::push(int n) { Object *ptr = new Object; ptr->data = n; qv.push_back(ptr); } void queue::pop() { vector<Object*>::iterator st = qv.begin(); delete *st; qv.erase(st); } queue::~queue() { vector<Object*>::iterator it; for(it = qv.begin(); it != qv.end(); ++it) { delete *it; } qv.erase(qv.begin(), qv.end()); } int main() { queue *q = new queue(10); q->push(20); q->push(30); q->push(40); q->push(50); q->pop(); q->pop(); cout << q->peek(); delete q; return 0; }
- vector
- Contiguous memory.
- Pre-allocates space for future elements, so extra space may be required.
- Unlike a list where additional space for a pointer is needed, each element only requires the space for the element type itself.
- Can re-allocate memory for the entire vector at any time that we add an element.
- Insertions at the end are constant, but insertions elsewhere are a costly O(n).
- Erasing an element at the end of the vector is constant time, but for the other locations it's O(n).
- We can randomly access its elements.
- Iterators, pointers, and references are invalidated if we add or remove elements to or from the vector.
vector
So, after the *it == 5, this may crash.::iterator it = v.begin(); for(it = v.begin(); it != v.end(); ++it) { if(*it == 5) v.erase(it); } - We can easily get at the underlying array if we need an array of the elements:
vector<int> v; for(int i = 0; i < 10; ++i) v.push_back(i); int *a = &v;[0];
vector is very similar to the array, so the following line is true:&v;[i] == &v;[0] + i;
- list
- Non-contiguous memory.
- No pre-allocated memory. The memory overhead for the list itself is constant.
- Each element requires extra space for the node which holds the element, including pointers to the next and previous elements in the list.
- Never has to re-allocate memory for the whole list just because we add an element.
- Insertions and erasures are cheap no matter where in the list they occur.
- It's cheap to combine lists with splicing.
- We cannot randomly access elements, so getting at a particular element in the list can be expensive.
- Iterators remain valid even when we add or remove elements from the list.
- If we need an array of the elements, we'll have to create a new one and add them all to it, since there is no underlying array.
The following code examples shows how the vector works with iterator and algorithms such as find() and for_each(). It also demonstrates the characteristics of string class as part of STL.
Coding itself is almost self explanatory:
// Palindrome: "A man a plan a canal Panama" // Using vector, this code shows the given input is a palindrome // It starts with an incomplete input, // to demonstrate some other operations of vector. #include <iostream> #include <vector> #include <algorithm> #include <string> #include <iterator> using namespace std; // print out all element separated by spaces template <class T> void showAll(T& v) { copy(v.begin(),v.end(), ostream_iterator<string>(cout," ")); cout << endl; } // find the s1 and put s2 before s1 template <class T> void insertMyWay(T& v, string s1, string s2) { T::iterator pos; for(pos = v.begin(); pos != v.end(); ++pos) { if(*pos == s1) { v.insert(pos,s2); break; } } } // reverse characters of a string input void reverseCharacters(string& s) { reverse(s.begin(),s.end()); } int main() { const int SIZE = 7; // create an empty vector vector<string> word; // reserve memory for 7 elements word.reserve(SIZE); // append word.push_back("a"); word.push_back("man"); word.push_back("plan"); word.push_back("canal"); word.push_back("Panama"); // display all elements showAll(word); // insert a string - we iterate the vector by ourselves insertMyWay(word,"plan","a"); showAll(word); // insert a string - using STL algorithm find() word.insert(find(word.begin(),word.end(),"canal"),"a"); showAll(word); // reverse the element - using STL algorithm reverse() reverse(word.begin(),word.end()); showAll(word); // pass each element(string) of vector // then, reverseCharacters() will reverse the characters for_each(word.begin(),word.end(),reverseCharacters); showAll(word); return 0; }
The output is:
a man plan canal Panama a man a plan canal Panama a man a plan a canal Panama Panama canal a plan a man a amanaP lanac a nalp a nam a
We could have done more to the output such as put spaces between appropriate words and convert characters to lower/upper case. Yet, it demonstrates enough use of several features that STL provides.
In the following Chapters, we'll look at Associative Containers, iterators, and algorithms in detail.
It's a little bit tricky to erase elements from the list within a loop. Here is a sample:
#include <iostream> #include <list> using namespace std; int main() { listmyList; int i; for (i = 0;i < 10;++i) { myList.push_back(i); } for each(int m in myList) { cout << m <<","; } cout << "\n"; list ::iterator it = myList.begin(); while(it != myList.end()) { if(*it % 2 == 0) it = myList.erase(it); ++it; // needed for VC } for(it = myList.begin(); it != myList.end(); ++it) { cout << *it <<","; } return 0; }
The iterator returned from the line:
it = myList.erase(it);should points to the next element, Visual C++ seems it an additional step to increment the iterator:
++it;But gcc can do the job without the additional line.
All STL containers define the types. In this list, we use X to represent a container type, such as vector<double>, and T for the type stored in the container, such as double.
- X::value_type -
T, the element type.
T is the type we get when the iterator is dereferenced.
For sets and multisets, it is constant.
For maps and multimaps, it is pair<const key-type, mapped-type>. - X::reference - T&.
- X::const_reference - const T&.
- X::iterator - iterator type pointing to T, behaves like T*.
- X::const_iterator - iterator type pointing to const T, behaves like const T*.
- X::difference_type - signed integral type used to represent the distance from one iterator to another.
- X::size_type - unsigned integral type size_type can represent size of data objects, such as number of elements, and subscripts.
Ph.D. / Golden Gate Ave, San Francisco / Seoul National Univ / Carnegie Mellon / UC Berkeley / DevOps / Deep Learning / Visualization