Friday, October 27, 2023

Standard Library Synchronization Objects

Overview
Prior to C++11, neither the C++ memory model nor the standard library supported multithreading. On Microsoft platforms, the Win32 libraries were used to accomplish this. C++11 added threading support to both the language and the standard library.

Details
The following discusses these new features in detail.

Classification
Standard library concurrency objects can be broadly classified into five categories.
Name | Description
Execution Utility Classes | Provide a mechanism to store results and exceptions so that they can be retrieved asynchronously later.
Execution Classes | Execute code asynchronously; the results are stored in Execution Utility Classes.
Synchronization Classes | Conditionally block and unblock execution of code in a thread, thus providing synchronization between Execution Classes. Usage ranges from protecting a shared resource to preventing race conditions.
Synchronization Utility Classes | Provide wrappers for Synchronization Classes that lock and unlock them through their own construction and destruction.
Lock free programming | Provides interlocked atomic operations using the atomic classes.

The following sections describe the synchronization objects in each category.

Execution Utilities Classes
These classes store the result or exception to be retrieved asynchronously.
The following table describes each of them.
Name | Description
promise | A class template that stores the result or exception of a thread function in a shared state.
future | Accesses the shared state of a promise object, which contains the result or an exception.
shared_future | Similar to future, except that multiple threads are allowed to wait for the same shared state.

Execution Classes
These classes can execute code asynchronously.
The following table describes each of them.
Name | Description
async | Comparable to the Win32 thread pool - runs a task asynchronously, possibly on a thread taken from a thread pool (implementation-dependent).
thread | Comparable to a Win32 thread - an independent unit of execution.
this_thread | Provides functions that access the current thread.
packaged_task | Similar to async, except that it can be launched at a later time.

async
async is the standard library counterpart of the Win32 ThreadPool discussed earlier. It is a function template with two overloads.
template <class Fn, class... Args>  
future<typename result_of<Fn(Args...)>::type>    
async (launch policy, Fn&& fn, Args&&... args);

template <class Fn, class... Args>  
future<typename result_of<Fn(Args...)>::type>
async (Fn&& fn, Args&&... args);
The first overload accepts a launch policy (launch::async, launch::deferred, or both combined), a function object for the callback, and its arguments.
The second overload accepts only the function object and its arguments. The launch policy launch::async | launch::deferred is implied, so the implementation chooses between the two.
Both return a future object that contains the result or exception.
A thread can wait for the result by calling wait() on the future object and retrieve it by calling get().
The launch policy decides how the callback is run. With launch::async, the callback runs asynchronously, as if on a separate thread, which may come from a thread pool.
With launch::deferred, the callback runs on the thread that calls get() or wait() on the future object. In other words, running the callback is deferred until get() or wait() is called. Until then, wait_for() and wait_until() return future_status::deferred without running the callback.

Example 5 demonstrates the functionality of async, as shown in its console output.

thread
thread is the standard library counterpart of the Win32 Thread discussed earlier. The constructor accepts a function object for the callback, followed by its arguments. The other members are described below.

Properties/Methods | Description
joinable() | checks whether the thread is joinable, i.e. potentially running in a parallel context
get_id() | returns the id of the thread
native_handle() | returns the underlying implementation-defined thread handle
hardware_concurrency() | static; returns the number of concurrent threads supported by the implementation
join() | waits for the thread to finish its execution. This is applicable only if the thread is not detached.
detach() | permits the thread to execute independently of the thread object

A thread object can be used in two ways.
In the first method, shown below, the caller creates the thread and waits until the thread function returns. The behavior is similar to set_value_at_thread_exit(): the caller has to wait even if the result was set in the promise object before the thread function completed.
//main thread
promise<int> p;
thread {doubler,10, ref(p)}.join();
//different thread
auto result =  p.get_future().get();


The second method, shown below, is the most flexible: get() returns as soon as the return value is set in the promise object. Note that the promise object must outlive the detached thread.
//main thread
promise<int> p;
thread {doubler,10, ref(p)}.detach();
//different thread
auto result =  p.get_future().get();
Example 6 demonstrates various features of the thread class, as shown in its console output.
In Example 7, the main thread creates a thread object to print numbers serially, as shown in its console output.

packaged_task
packaged_task is similar to async, except that it can be launched at a later time, like a lambda function, using the function call operator. By default, it runs on the thread that calls it. To run asynchronously, it must be explicitly run from a separate thread.
Methods | Description
get_future() | gets the future object
valid() | checks whether the object has a valid shared state
reset() | resets the state, abandoning any stored results of previous executions
make_ready_at_thread_exit() | executes the function, but makes the result ready only when the current thread exits

Example 8 demonstrates the functionality of packaged_task, as shown in its console output.

this_thread namespace
Provides functions that access the current thread.

Functions | Description
get_id() | returns the id of the current thread
yield() | yields to other threads
sleep_until() | sleeps until a time point
sleep_for() | sleeps for a time span

Example 9 demonstrates the functionality of the this_thread APIs, as shown in its console output.

Synchronization Classes
These classes provide synchronization between Execution Classes.
The following table describes each of them.
Name | Description
mutex | Synchronizes access to a shared resource from multiple threads.
timed_mutex | Same as mutex, but can also try to lock for a duration or until a time point.
recursive_mutex | Same as mutex, except that the owning thread can call lock() more than once, recursively.
recursive_timed_mutex | A recursive and timed mutex.
condition_variable | Comparable to the Win32 condition variable - designed to address producer/consumer scenarios.
call_once | Comparable to Win32 InitOnce - one-time initialization of variables that are used in multiple threads.

mutex
Synchronizes access to a shared resource from multiple threads. The following are the commonly used methods.
Methods | Description
lock() | locks the mutex, blocks if the mutex is not available
unlock() | unlocks the mutex
try_lock() | tries to lock the mutex, returns immediately if the mutex is not available
native_handle() | returns the underlying implementation-defined native handle object

Example 10 demonstrates the functionality of the mutex, as shown in its console output.
The main thread spawns two worker threads to print numbers serially. Synchronization is provided by a mutex.

timed_mutex
Similar to a mutex. It also adds timed variants of try_lock(), as described below.
Methods | Description
lock() | locks the mutex, blocks if the mutex is not available
unlock() | unlocks the mutex
try_lock() | tries to lock the mutex, returns immediately if the mutex is not available
native_handle() | returns the underlying implementation-defined native handle object
try_lock_for() | tries to lock the mutex, waiting for up to a time span
try_lock_until() | tries to lock the mutex, waiting until a time point

Example 11 demonstrates the functionality of the timed_mutex, as shown in its console output. Synchronization is provided by a timed_mutex.
The main thread owns the mutex, spawns a worker thread, sleeps for 3 seconds and unlocks the mutex. The worker thread waits up to 5 seconds to own the mutex, then prints a message to the console.

recursive_mutex
The owning thread of a mutex must not call lock() on it again: the behavior is undefined and the thread will typically hang.
recursive_mutex is the same as mutex, except that the owning thread can call lock() more than once. It must also call unlock() an equal number of times.

Methods | Description
lock() | locks the mutex, blocks if the mutex is not available
unlock() | unlocks the mutex
try_lock() | tries to lock the mutex, returns immediately if the mutex is not available
native_handle() | returns the underlying implementation-defined native handle object

Example 12 demonstrates the functionality of the recursive_mutex, as shown in its console output. Synchronization is provided by a recursive_mutex.
The program defines three methods: create_table_withoutdata(), insert_data() and create_table_withdata(). All of these methods lock the mutex and release it after their work.
create_table_withdata() internally calls insert_data(), so the mutex is locked twice and unlocked twice.

recursive_timed_mutex
Combines the functionality of recursive_mutex and timed_mutex.
Methods | Description
lock() | locks the mutex, blocks if the mutex is not available
unlock() | unlocks the mutex
try_lock() | tries to lock the mutex, returns immediately if the mutex is not available
native_handle() | returns the underlying implementation-defined native handle object
try_lock_for() | tries to lock the mutex, waiting for up to a time span
try_lock_until() | tries to lock the mutex, waiting until a time point

condition_variable
condition_variable removes the need for polling and idling in producer-consumer scenarios.
To work, a condition_variable requires a mutex and a unique_lock, and usually a boolean variable to guard against spurious wake-ups.
std::mutex cvmtx;
std::condition_variable cv;
bool condition=false;

The following depicts a typical scenario; multiple variations are possible.
One or more consumer threads wait on the condition_variable with a callable object that checks whether the condition is true, as below.
void consumer()
{
	std::unique_lock<std::mutex> lk(cvmtx);
	cv.wait(lk,[](){return condition;});
}

The producer signals the waiting consumer thread(s) with a notify_one() or notify_all() call, as below.
This wakes up the waiting consumer thread(s).
void producer()
{
	std::lock_guard<std::mutex> lk(cvmtx);
	condition = true;
	cv.notify_one();
}

The following lists the important APIs.
Methods | Description
notify_one() | notifies one waiting thread
notify_all() | notifies all waiting threads
wait() | blocks the current thread until the condition variable is awakened
wait_for() | blocks the current thread until the condition variable is awakened or the specified timeout duration has elapsed
wait_until() | blocks the current thread until the condition variable is awakened or the specified time point has been reached
native_handle() | returns the native handle

Example 18 demonstrates the functionality of condition_variable using thread, as shown in its console output. Three chat clients, in three threads, wait on a condition variable for a message from the chat server.

call_once
Sometimes an application needs to initialize global variables or an instance of a class only once before it is accessed by multiple threads. This is usually done using a singleton with a lock. Alternatively, once_flag can be used. It involves declaring a once_flag variable and performing the initialization through the call_once() API, as shown below. call_once() accepts a callable object. call_once() can be called from multiple threads, but the callable is executed only once. If the callable throws an exception in one thread, the exception propagates to that caller and the next call_once() call on the same flag tries again.
int balance=0;
std::once_flag balance_flag;
std::call_once(balance_flag, []()
{ 
	balance=1000;
});

Example 16 demonstrates the functionality of once_flag using thread, as shown in its console output.

The program defines three bank accounts, sam_acct, rob_acct and steve_acct, and tries to transfer money from sam_acct to rob_acct and steve_acct using multiple threads. once_flag is used to initialize the balances. The first call to call_once() fails because of an exception and the second succeeds. The third call does not invoke the callable.

Example 17 demonstrates the functionality of once_flag using async, as shown in its console output. The functionality is the same as in Example 16.

Synchronization Utility Classes
These classes provide wrapper for Synchronization classes.
The following table describes each of them.
Name | Description
lock_guard | An RAII-style object for owning a mutex for the duration of a scoped block.
unique_lock | Similar to lock_guard, with extended functionality allowing deferred locking, time-constrained attempts at locking, recursive locking, transfer of lock ownership, and use with condition variables.
lock | Enables locking multiple mutexes without deadlocking.

lock_guard
lock_guard is an RAII-style object for owning a mutex for the duration of a scoped block.
When a lock_guard object is created, it attempts to take ownership of the mutex it is given. When control leaves the scope in which the lock_guard object was created, the lock_guard is destructed and the mutex is released.

The following code shows the use of lock_guard. The commented code shows how the mutex is locked and unlocked without lock_guard, whereas a single instantiation of lock_guard removes the need for explicit lock() and unlock() calls.
mutex m;

//{
//	m.lock();
//	cout << "inside";
//	m.unlock();
//}

{
	lock_guard<mutex> lg(m);
	cout << "inside";
}

unique_lock
unique_lock is similar to lock_guard, with extended functionality allowing deferred locking, time-constrained attempts at locking, recursive locking, transfer of lock ownership, and use with condition variables.
The constructor accepts an additional parameter that decides the locking mode, as described below:
Type | Description
defer_lock | do not acquire ownership of the mutex
try_to_lock | try to acquire ownership of the mutex without blocking
adopt_lock | assume the calling thread already has ownership of the mutex
See the lock function discussed below for usage.

The following are the commonly used methods:
Methods | Description
lock() | locks the associated mutex, blocking until it is available
try_lock() | tries to lock the associated mutex without blocking
try_lock_for() | attempts to lock the associated timed_mutex, waiting for up to a time span
try_lock_until() | attempts to lock the associated timed_mutex, waiting until a time point
unlock() | unlocks the associated mutex
release() | disassociates the associated mutex without unlocking it
mutex() | returns a pointer to the associated mutex
owns_lock() | tests whether the lock owns its associated mutex
operator bool() | tests whether the lock owns its associated mutex

Example 13 demonstrates the functionality of unique_lock, as shown in its console output.

lock
lock() enables locking multiple mutexes without deadlocking. When the call succeeds, the thread owns the mutexes. The mutexes must also be unlocked after use.
This can be done by using adopt_lock, as below.
mutex m, m2;
 
lock(m, m2);
unique_lock<mutex> lk(m, adopt_lock);
unique_lock<mutex> lk2(m2, adopt_lock);

Or it can be done by using defer_lock, as below. Note that the lock() call receives the unique_lock objects here rather than the mutexes.
mutex m, m2;
 
unique_lock<mutex> lk(m, defer_lock);
unique_lock<mutex> lk2(m2, defer_lock);
lock(lk, lk2);

Example 14 demonstrates the functionality of lock, as shown in its console output. Synchronization is provided by a mutex and lock.

A restaurant has 2 tables (T, T2) and 2 waiters (W, W2) with a service time of 3 seconds. Initially the main thread spawns four threads, one for each pairing of table and waiter, i.e., TW, TW2, T2W, T2W2. Each table and each waiter is protected by a mutex. The assignment algorithm waits on both the table and the waiter mutexes to make an assignment.

Example 15 demonstrates the functionality of unique_lock, as shown in its console output. Synchronization is provided by a unique_lock and mutex.
The program tries to transfer money from sam_acct to rob_acct and steve_acct.

Lock free programming
The synchronization mechanisms discussed earlier use objects provided by the operating system. Lock-free programming instead uses the atomic classes, which provide interlocked atomic operations.
The following table describes each of them.
Name | Description
atomic<> | Comparable to the Win32 Interlocked functions - provides a large range of atomic operations.
atomic_flag | An atomic boolean flag - guaranteed to be lock-free.

atomic<>
Types derived from atomic<> guarantee thread-safe access to and update of scalar data types such as booleans, integers and pointers. They are declared as below.

std::atomic<int> i{5};
std::atomic_int j{10};

The following are the commonly used members. In addition, it has overloaded operators for addition and subtraction that internally use the fetch_xxx() APIs.
Methods | Description
is_lock_free() | checks if the atomic object is lock-free
store() | atomically replaces the value of the atomic object with a non-atomic argument
load() | atomically obtains the value of the atomic object
exchange() | atomically replaces the value of the atomic object and obtains the value held previously
compare_exchange_weak(), compare_exchange_strong() | atomically compares the value of the atomic object with a non-atomic argument and performs an atomic exchange if equal or an atomic load if not
fetch_add() | atomically adds the argument to the value stored in the atomic object and obtains the value held previously
fetch_sub() | atomically subtracts the argument from the value stored in the atomic object and obtains the value held previously
fetch_and() | atomically performs bitwise AND between the argument and the value of the atomic object and obtains the value held previously
fetch_or() | atomically performs bitwise OR between the argument and the value of the atomic object and obtains the value held previously
fetch_xor() | atomically performs bitwise XOR between the argument and the value of the atomic object and obtains the value held previously

The load() and store() APIs also accept an additional parameter that specifies the memory order. It must be one of the following; load() does not accept memory_order_release or memory_order_acq_rel, and store() does not accept memory_order_consume, memory_order_acquire or memory_order_acq_rel.
Order | Description
memory_order_relaxed | There are no synchronization or ordering constraints imposed on other reads or writes; only this operation's atomicity is guaranteed.
memory_order_consume | A load operation with this memory order performs a consume operation on the affected memory location: no reads or writes in the current thread dependent on the value currently loaded can be reordered before this load. Writes to data-dependent variables in other threads that release the same atomic variable are visible in the current thread. On most platforms, this affects compiler optimizations only.
memory_order_acquire | A load operation with this memory order performs the acquire operation on the affected memory location: no reads or writes in the current thread can be reordered before this load. All writes in other threads that release the same atomic variable are visible in the current thread.
memory_order_release | A store operation with this memory order performs the release operation: no reads or writes in the current thread can be reordered after this store. All writes in the current thread are visible in other threads that acquire the same atomic variable, and writes that carry a dependency into the atomic variable become visible in other threads that consume the same atomic.
memory_order_acq_rel | A read-modify-write operation with this memory order is both an acquire operation and a release operation. No memory reads or writes in the current thread can be reordered before the load, nor after the store. All writes in other threads that release the same atomic variable are visible before the modification, and the modification is visible in other threads that acquire the same atomic variable.
memory_order_seq_cst | A load operation with this memory order performs an acquire operation, a store performs a release operation, and read-modify-write performs both an acquire operation and a release operation, plus a single total order exists in which all threads observe all modifications in the same order.

Example 20 demonstrates the functionality of atomic<>, as shown in its console output.
It defines a critical_section based on atomic<bool> that can be used to synchronize threads like a mutex. It also defines a stack that can be accessed and updated by multiple threads.

Example 21 demonstrates the functionality of atomic<>, as shown in its console output.
The main thread spawns two worker threads to print numbers serially. Synchronization is provided by an atomic<int>.

atomic_flag 
atomic_flag is an atomic boolean type. Unlike all specializations of atomic<>, it is guaranteed to be lock-free. It also does not provide load() or store() operations.

atomic_flag supports the following operations.
Methods | Description
clear() | atomically sets the flag to false
test_and_set() | atomically sets the flag to true and obtains its previous value

Example 22 demonstrates the functionality of atomic_flag, as shown in its console output.
It defines a critical_section based on atomic<bool> that can be used to synchronize threads like a mutex. It also defines a stack that can be accessed and updated by multiple threads.

Summary of Examples
stdlib Synchronization Objects
The source code of all the stdlib examples is available on GitHub and Wandbox.
On Wandbox and GDBOnline, the examples can be viewed, edited, built and run.
Name | Synchronization Object | Github | Wandbox
Example 1 | promise - functionality | source, output | source+output
Example 2 | promise - usage | source, output | source+output
Example 3 | future - functionality | source, output | source+output
Example 4 | shared future - usage | source, output | source+output
Example 5 | async - functionality | source, output | source+output
Example 6 | thread - functionality | source, output | source+output
Example 7 | thread - usage | source, output | source+output
Example 8 | packaged_task - functionality | source, output | source+output
Example 9 | this_thread - functionality | source, output | source+output
Example 10 | mutex - usage | source, output | source+output
Example 11 | timed_mutex - usage | source, output | source+output
Example 12 | recursive_mutex - usage | source, output | source+output
Example 13 | unique_lock - functionality | source, output | source+output
Example 14 | lock - (usage) restaurant | source, output | source+output
Example 15 | lock - (usage) bank | source, output | source+output
Example 16 | call_once - (usage) async | source, output | source+output
Example 17 | call_once - (usage) thread | source, output | source+output
Example 18 | conditional_variable - usage | source, output | source+output
Example 19 | conditional_variable - functionality | source, output | source+output
Example 20 | atomic<> - functionality | source, output | source+output
Example 21 | atomic<> - usage | source, output | source+output
Example 22 | atomic_flag - functionality | source, output | source+output
  
