
C++ Threads for beginners

Hi guys and girls ;). Hope you are doing well. I am going to keep this blog short and crisp. To make the best use of it, I recommend modifying the example code yourself and running it on your machine.

You can fork my repo to access code used in this blog:

First things first, the prerequisites include :

  1. Basic Understanding of C++ language.

  2. Functors in C++. It’s a short read in case you are not familiar with them.

  3. Lambda expressions in C++. Go through a few code samples to get the hang of them.

The flow of the blog would be :

  1. What are threads and why to use them?

  2. First C++ Thread.

  3. Multithreading in C++.

  4. Resource Sharing and Locks.

What are threads and why to use them?

Wikipedia defines thread as “In computer science, a thread of execution is the smallest sequence of programmed instructions that can be managed independently by a scheduler, which is typically a part of the operating system.”

Apart from the main thread, you can create your own threads to execute multiple instructions concurrently to save time.

Point to note: all the threads of a process share the same resources, so it is important to synchronize their use correctly. By resources I mean the code section, the data section, and OS resources (like open files and signals).

First C++ Thread

The thread class (std::thread) is defined in the <thread> header file, which we include with “#include <thread>”.

Creating a thread is simple. We pass a function pointer or a callable object (a functor or a lambda expression), containing the code to be executed by the thread, to the constructor of the thread object. Have a look at the code; I have used a function pointer.

Passing function pointer to thread

thread::join() makes the calling thread wait until the thread it is called on has finished executing. In this case the main thread waits for thread ‘t’ to complete. If many concurrent tasks are going on, we can synchronize the workflow using join().

thread::detach() makes the two threads (the creator and the created one) independent of each other, i.e. both execute independently. Suppose we don’t want to use join() (i.e. we don’t want to wait for the thread we created because we no longer care about its flow of execution); in that case we call thread::detach(). Once detach() has been called, we can no longer join() the thread. It’s always better to check whether a thread is joinable using thread::joinable() before calling join(); calling join() on a thread that is not joinable throws std::system_error.

Detached Thread

As can be seen, the main function didn’t wait for the thread to complete its execution. Since both threads (main and t) run concurrently, the order of execution can change from run to run. It’s more like the threads race for resources (in this case “cout”). Let’s see what happens if we comment out detach().



As expected, join() was called, so the thread finished its execution first and only after that was “Main Thread” printed. This was just an example with one thread, and if you observe it, the flow is mostly sequential. In real life we could have multiple threads running concurrently, with the main thread waiting for each of them depending on certain states.

It is important to call either detach() or join() on every thread. If neither has been called by the time the thread object is destroyed, its destructor calls std::terminate() and the program is terminated.

Multithreading in C++

There’s a hardware limit on the number of threads that can truly run at the same time. More threads can be created, but the number that can run in parallel is limited by the number of logical processors in your system. To find it out on Windows you can open Task Manager → Performance → CPU → Logical Processors. It can also be found programmatically using thread::hardware_concurrency(), which returns a hint and may return 0 if the value cannot be determined.

Number of logical processors of CPU as shown in Task Manager


Output: 4 (for my system)

An example of multi-threading

Multi Threading


We created a vector of worker threads and pushed in threads that were created using lambda functions. We created thread::hardware_concurrency() threads in total.

At the end, we looped over the worker threads using for_each(); the lambda expression takes a reference to each worker thread and calls join() on it. The output, as can be seen, is random and haphazard and different for every run. Any guesses why?




Because the threads are running concurrently and cout is a resource shared by all of them, they write to it at the same time and their output gets interleaved. The threads race for the resource, which is commonly known as a ‘race condition’. To prevent simultaneous access to a resource we use locks (coming up in the section on resource sharing and locks).

Have a look at the following scenario:

The task is to print the string “Thread is running” 200 times using a varying number of threads. The following code takes the number of threads to use as input, performs the task and prints the execution time of the program.

For the code to work properly, enter a number from 1 to 200 that is a divisor of 200.

Print the string “Thread is running” 200 times using different number of threads


Results are interesting :

numThreads : 1 ==> Execution time : 0.046 sec
numThreads : 2 ==> Execution time : 0.062 sec
numThreads : 4 ==> Execution time : 0.031 sec
numThreads : 10 ==> Execution time : 0.077 sec
numThreads : 50 ==> Execution time : 0.063 sec
numThreads : 200 ==> Execution time : 0.155 sec

Having more threads in no way guarantees a lower execution time. Using too few threads does not fully use the CPU, which means you could do better in terms of execution time.

On the other hand, when there are many threads they fight over the CPU and sometimes end up taking more time due to overhead such as frequent thread switching (and many other factors).

So, choosing the number of threads that lets your program finish in minimal time is basically a tradeoff.

So what’s the optimal number of threads one should use? There’s no definite answer to that, but a good first try is to keep the thread count around the number of logical processors of your CPU, as given by hardware_concurrency().

Resource Sharing and Locks

Suppose I decide to display the multiplication tables for the numbers 1, 10, 20 and 40, and I use multiple threads to do it (one thread for each number). The output looks something like this:

Random output using multi-threads


The output doesn’t make sense, right?

This is because the threads are using cout simultaneously while printing. If we are able to block simultaneous usage of cout, our problem would be solved. For exactly this reason we use locks.

Webopedia describes mutex as : In computer programming, a mutual exclusion object (mutex) is a program object that allows multiple program threads to share the same resource, such as file access, but not simultaneously.

After adding a lock, my table function would look something like this:

Added lock() and unlock()


Now that’s a decent output.

Here ‘mtx’ is an object of type std::mutex. When a thread calls mtx.lock() and the mutex is free, the thread acquires the lock; otherwise the mutex is held by another thread, and the caller waits until the lock is released. The lock is released by mtx.unlock().

But there’s a problem. What if an exception is thrown between lock() and unlock()? The thread leaves the function without ever calling unlock(). All the other threads waiting for mtx would keep waiting indefinitely, and main() would be waiting for those threads to complete. Basically, all the threads (except one) would be waiting forever, and the program would reach a state commonly known as deadlock.