C++11/C++14 8. Race Conditions - 2020
"If you're an experienced C++ programmer and are anything like me, you initially
approached C++11 thinking, "Yes, yes, I get it. It's C++, only more so." But as you
learned more, you were surprised by the scope of the changes. auto declarations,
range-based for loops, lambda expressions, and rvalue references change the face of
C++, to say nothing of the new concurrency features. And then there are the
idiomatic changes. 0 and typedefs are out, nullptr and alias declarations are in.
Enums should now be scoped. Smart pointers are now preferable to built-in ones.
Moving objects is normally better than copying them."
- Effective Modern C++ by Scott Meyers
Just because we're using a mutex or other mechanisms to protect shared data, we're not necessarily protected from race conditions. We still have to ensure that the data is protected properly. The following two sections show that we may trigger race conditions depending on our interface design.
The code below is an example of a stack implemented on top of a vector. We have two threads, each popping an element off the stack in turn.
// t9.cpp
#include <iostream>
#include <thread>
#include <mutex>
#include <string>
#include <vector>

using namespace std;

mutex myMutex;

class stack
{
public:
    stack() {}
    ~stack() {}
    void pop();
    int top() { return data.back(); }
    void push(int);
    void print();
    int getSize() { return data.size(); }

private:
    vector<int> data;
};

void stack::pop()
{
    lock_guard<mutex> guard(myMutex);
    data.erase(data.end()-1);
}

void stack::push(int n)
{
    lock_guard<mutex> guard(myMutex);
    data.push_back(n);
}

void stack::print()
{
    cout << "initial stack : ";
    for(int item : data) cout << item << " ";
    cout << endl;
}

void process(int val, string s)
{
    lock_guard<mutex> guard(myMutex);
    cout << s << " : " << val << endl;
}

void thread_function(stack& st, string s)
{
    // top() and pop() are two separate calls:
    // another thread can run in between them
    int val = st.top();
    st.pop();
    process(val, s);
}

int main()
{
    stack st;
    for (int i = 0; i < 10; i++) st.push(i);
    st.print();

    // the two threads take turns popping and processing one element each
    while(true) {
        if(st.getSize() > 0) {
            thread t1(&thread_function, ref(st), string("thread1"));
            t1.join();
        }
        else break;

        if(st.getSize() > 0) {
            thread t2(&thread_function, ref(st), string("thread2"));
            t2.join();
        }
        else break;
    }
    return 0;
}
Output:
$ g++ t9.cpp -o t9 -std=c++11 -pthread
$ ./t9
initial stack : 0 1 2 3 4 5 6 7 8 9
thread1 : 9
thread2 : 8
thread1 : 7
thread2 : 6
thread1 : 5
thread2 : 4
thread1 : 3
thread2 : 2
thread1 : 1
thread2 : 0
The code guards against the two threads accessing the shared data at the same time via
lock_guard<mutex> guard(myMutex);
It appears the code is thread-safe. However, as shown in the table below, we can still have a race condition depending on the order of execution. For example, the element "6" may be processed twice while the element "5" is skipped, because the second st.pop(), issued by thread2, removes "5" after thread1 has already removed "6".
Stack contents at this point (top to bottom): 6 5 4 3 2 1 0
Thread 1 | Thread 2 |
---|---|
val = st.top(); // val = 6 | |
 | val = st.top(); // val = 6 |
st.pop(); // removes 6 | |
 | st.pop(); // removes 5 |
process(val, s); // processes 6 | |
 | process(val, s); // processes 6 again; 5 is never processed |
Though the output from this run looks fine, the code has a design flaw that may trigger a data race. In other words, the interface design of the code is not thread-safe.
One fix is to merge the top() and pop() calls into a single call protected by the mutex (the in-class declaration must change from void pop(); to int pop(); accordingly):
int stack::pop()
{
    lock_guard<mutex> guard(myMutex);
    int val = data.back();
    data.erase(data.end()-1);
    return val;
}

void thread_function(stack& st, string s)
{
    int val = st.pop();
    process(val, s);
}
Suppose we need code that handles a doubly linked list.
In order for a thread to safely delete a node for a doubly linked list,
we need to ensure that we're preventing concurrent accesses to three nodes:
the node being deleted and the nodes on either side. If we
protect accesses to the pointers of each node individually, we'd be no better off
than with code that used no mutexes, because the race condition could still happen.
It's not the individual nodes that need protecting for the individual steps but the
whole data structure, for the whole delete operation. The easiest solution in this case
is to have a single mutex that protects the entire list.
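As a minimal sketch of that idea (not taken from the tutorial; the type and function names below are assumptions made only for illustration), the delete operation can take one list-wide mutex so that the node being removed and both of its neighbors are updated as a single protected step:

#include <mutex>

struct Node
{
    int value;
    Node* prev;
    Node* next;
};

class DoublyLinkedList
{
public:
    void push_front(int value)
    {
        std::lock_guard<std::mutex> guard(listMutex);
        Node* n = new Node{value, nullptr, head};
        if (head) head->prev = n;
        head = n;
    }

    // The whole unlink-and-delete sequence runs under the single list-wide mutex,
    // so no other thread can observe the list while the three nodes are being touched.
    void remove(Node* n)
    {
        std::lock_guard<std::mutex> guard(listMutex);
        if (n->prev) n->prev->next = n->next;
        else         head = n->next;
        if (n->next) n->next->prev = n->prev;
        delete n;
    }

private:
    std::mutex listMutex;   // one mutex protecting the entire list
    Node* head = nullptr;
};

The coarse-grained lock trades concurrency for simplicity: only one thread can touch the list at a time, which is exactly the property described above.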
Just because individual operations on the list are safe, we're not out of the woods
yet. We can still get race conditions, even with a really simple interface. Consider a
stack data structure like the std::stack container adapter shown in the example below.
Aside from the constructors and swap(),
there are only five things we can do to a std::stack:
- push() a new element onto the stack
- pop() an element off the stack
- read the top() element
- check whether it's empty()
- and read the number of elements, the size() of the stack.
If we change top() so that it returns a copy rather than a reference and protect the internal data with a mutex, this interface is still inherently subject to race conditions. This problem is not unique to a mutex-based implementation; it's an interface problem, so the race conditions would still occur with a lock-free implementation.
#include <mutex>
#include <deque>
using namespace std;

template<typename T, typename Container = std::deque<T> >
class stack
{
public:
    explicit stack(const Container&);
    explicit stack(Container&& = Container());
    template <typename Alloc> explicit stack(const Alloc&);
    template <typename Alloc> stack(const Container&, const Alloc&);
    template <typename Alloc> stack(Container&&, const Alloc&);
    template <typename Alloc> stack(stack&&, const Alloc&);

    // not reliable
    bool empty() const;
    // not reliable
    size_t size() const;

    T& top();
    T const& top() const;
    void push(T const&);
    void push(T&&);
    void pop();
    void swap(stack&&);
};
The problem here is that the results of empty() and size() can't be relied on. Although they might be correct at the time of the call, once they've returned, other threads are free to access the stack and might push() new elements onto or pop() the existing ones off of the stack before the thread that called empty() or size() could use that information.
In particular, if the stack instance is not shared, it's safe to check empty() and then call top() to access the top element if the stack is not empty, as in the code below:
stack<int> s;
if(!s.empty())
{
    int const value = s.top();
    s.pop();
    do_task(value);
}
Note that calling top() on an empty stack is undefined behavior. With a shared stack object, this call sequence (empty() -> top() -> pop()) is no longer safe, because there might be a call to pop() from another thread that removes the last element in between the call to empty() and the call to top().
So, this is a classic race condition, and the use of a mutex internally to protect the stack contents doesn't prevent it. That's because it's a consequence of the interface rather than a consequence of not protecting the underlying data properly.
What's the solution?
Well, this problem happens as a consequence of the design
of the interface, so the solution is to change the interface.
What changes need to be made?
In the simplest case, we could just
declare that top() will throw an exception if there aren't any elements in the stack
when it's called. Though this directly addresses this issue, it makes for more cumbersome
programming, because now we need to be able to catch an exception, even if
the call to empty() returned false. This essentially makes the call to empty() completely
redundant.
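As a rough sketch of that option (the exception type and class name below are illustrative assumptions, not part of std::stack), top() can take the lock, check for emptiness itself, and throw:

#include <mutex>
#include <stdexcept>
#include <vector>

struct empty_stack : std::runtime_error
{
    empty_stack() : std::runtime_error("stack is empty") {}
};

class guarded_stack
{
public:
    void push(int n)
    {
        std::lock_guard<std::mutex> guard(m);
        data.push_back(n);
    }

    // Check and read happen under the same lock; an empty stack raises
    // empty_stack instead of producing undefined behavior.
    int top()
    {
        std::lock_guard<std::mutex> guard(m);
        if (data.empty()) throw empty_stack();
        return data.back();
    }

private:
    std::mutex m;
    std::vector<int> data;
};

The caller then has to be ready to catch empty_stack even when its own empty() check just returned false, which is exactly the cumbersomeness described above.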
If we look closely at the previous snippet, there's also potential for another race condition but this time between the call to top() and the call to pop(). Consider two threads running the previous snippet of code and both referencing the same stack object, s. This isn't an unusual situation; when using threads for performance, it's quite common to have several threads running the same code on different data, and a shared stack object is ideal for dividing work between them. Suppose that initially the stack has two elements, so we don't have to worry about the race between empty() and top() on either thread, and consider the potential execution patterns.
If the stack is protected by a mutex internally, only one thread can be running a stack member function at any one time, so the calls get nicely interleaved, while the calls to do_task() can run concurrently. One possible execution is shown in table below.
Thread 1 | Thread 2 |
---|---|
if(!s.empty()) | |
 | if(!s.empty()) |
int const value=s.top(); | |
 | int const value=s.top(); |
s.pop(); | |
do_task(value); | s.pop(); |
 | do_task(value); |
As we can see, if these are all the threads running, there's nothing in between the two calls to top() to modify the stack, so both threads will see the same value. Not only that, but there are no calls to top() between the calls to pop(). Consequently, one of the two values on the stack is discarded without ever having been read, whereas the other is processed twice. This is yet another race condition and far more insidious than the undefined behavior of the empty()/top() race; there's never anything obviously wrong going on, and the consequences of the bug are likely far removed from the cause, although they obviously depend on exactly what do_task() really does.
This calls for a more radical change to the interface, one that combines the calls to top() and pop() under the protection of the mutex. A combined call can lead to issues if the copy constructor for the objects on the stack can throw an exception. This problem was dealt with fairly comprehensively from an exception-safety point of view by Herb Sutter but the potential for race conditions brings something new to the mix.
For those of us who aren't aware of the issue, consider a stack<vector<int>>. Now, a vector is a dynamically sized container, so when we copy a vector the library has to allocate some more memory from the heap in order to copy the contents. If the system is heavily loaded, or there are significant resource constraints, this memory allocation can fail, so the copy constructor for vector might throw a std::bad_alloc exception. This is especially likely if the vector contains a lot of elements. If the pop() function was defined to return the value popped, as well as remove it from the stack, we have a potential problem: the value being popped is returned to the caller only after the stack has been modified, but the process of copying the data to return to the caller might throw an exception. If this happens, the data just popped is lost; it has been removed from the stack, but the copy was unsuccessful!
The designers of the std::stack interface helpfully split the operation in two:
- get the top element (top())
- remove it from the stack (pop())
So, if we can't safely copy the data, it stays on the stack. If the problem was lack of heap memory, maybe the application can free some memory and try again.
Unfortunately, it's precisely this split that we're trying to avoid in eliminating the race condition!
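One widely used way out, sketched here only under assumed names, is to combine the check, the copy, and the removal in a single locked member function that copies the top element into a caller-supplied reference before removing it, so a throwing copy cannot lose data and no separate empty()/top()/pop() sequence is exposed:

#include <mutex>
#include <stack>
#include <utility>

template <typename T>
class threadsafe_stack_sketch
{
public:
    void push(T value)
    {
        std::lock_guard<std::mutex> guard(m);
        data.push(std::move(value));
    }

    // Combined empty()+top()+pop(): check, copy, and remove under one lock.
    // If the copy assignment into 'out' throws, the element stays on the stack.
    bool try_pop(T& out)
    {
        std::lock_guard<std::mutex> guard(m);
        if (data.empty()) return false;
        out = data.top();
        data.pop();
        return true;
    }

private:
    std::mutex m;
    std::stack<T> data;
};

A caller then writes if (s.try_pop(value)) do_task(value); and there is no window between checking, reading, and removing in which another thread can interfere.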
What to do next?
Please visit another tutorial for more detail: MultiThreading Programming with C++11 - Part B (Sharing Data - mutex, and race conditions, and deadlock).