Parallelism, Concurrency, and Threads
As computers evolved, they needed to process data faster and to respond to several user-triggered actions at once within a reasonable time frame. When raising the physical performance of a single processor or finding faster algorithms was no longer feasible, both processors and programs were designed to support parallel processing.
As a result, multi-core processors emerged, allowing multiple instructions to be executed in parallel. Many programming languages come with standard libraries that take advantage of this, implementing APIs to make it easier for programmers to represent parallel work programmatically.
We will delve into a few classes used for parallel and concurrent programming. This is not an introduction to parallel computing; for that, consider taking a specialized course. We will focus on how to work with threads and tasks in an object-oriented manner.
Parallelism and Concurrency
A distinction must be made between parallelism and concurrency. In both cases, multiple instruction streams are in progress over the same period, their relative order is indeterminate, and one stream may have to wait for the others to complete. With parallelism, the streams execute at the same instant on different processor cores. With concurrency, the streams may run in parallel on multiple cores or be interleaved on a single core in an arbitrary order. Concurrency is therefore the more general concept, and parallelism is one way of realizing it.
For parallel programming, C# provides the Thread class, which abstracts a thread of execution. Within a process, a thread of execution is identified by its stack pointer and its instruction pointer, that is, where the thread is on its own stack and which instruction it is executing. The operating system schedules threads much as it schedules processes, but the two are not the same: each thread has its own stack, while all threads of a process share that process's virtual address space, whereas separate processes do not share theirs.
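To make the shared address space concrete, here is a minimal sketch in which two threads update the same variable. The names SharedMemoryDemo and Run are hypothetical and chosen only for illustration; Interlocked.Increment is used so that the concurrent updates are not lost.

```csharp
using System;
using System.Threading;

public static class SharedMemoryDemo
{
    // A static field lives in the process's memory, so every thread
    // of the process sees the same variable.
    private static int counter;

    // Hypothetical helper: two threads each increment the shared counter.
    public static int Run(int incrementsPerThread)
    {
        counter = 0;

        ThreadStart work = () =>
        {
            for (int i = 0; i < incrementsPerThread; i++)
            {
                // Atomic increment: a plain counter++ could lose updates
                // when both threads modify the field at the same time.
                Interlocked.Increment(ref counter);
            }
        };

        var first = new Thread(work);
        var second = new Thread(work);
        first.Start();
        second.Start();

        // Wait for both threads before reading the final value.
        first.Join();
        second.Join();
        return counter;
    }
}
```

Calling SharedMemoryDemo.Run(100000) returns 200000, because both threads wrote to the same field. Two separate processes running the same loop would each end up with their own counter.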
Thread Class
The simplest way to create a thread is to use the Thread class. Its constructor takes a function that returns nothing and is either parameterless or accepts a single object parameter. After creating the Thread instance, two methods need to be called. The first is Start(), which starts the thread of execution and invokes the function on it, concurrently with the current thread. The second is Join(), called from the thread that created it, which synchronizes the new instruction flow back into the original one: the calling thread blocks until the other thread has finished. Threads can be created this way using lambda functions, which pass data in and return results through captured objects.
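The Start() and Join() sequence described above can be sketched as follows. ThreadDemo, SumOnThread, and LengthOnThread are hypothetical names used only for this illustration. The first method uses a parameterless lambda, the second the single-object-parameter form.

```csharp
using System;
using System.Threading;

public static class ThreadDemo
{
    // Hypothetical helper: sums 1..n on a separate thread.
    public static long SumOnThread(int n)
    {
        // Captured by the lambda: written by the worker, read after Join().
        long result = 0;

        var worker = new Thread(() =>
        {
            long sum = 0;
            for (int i = 1; i <= n; i++)
            {
                sum += i;
            }
            result = sum;
        });

        worker.Start(); // begins running the lambda on the new thread
        worker.Join();  // blocks the caller until the worker has finished
        return result;
    }

    // Hypothetical helper: passes data through the single object parameter.
    public static int LengthOnThread(string text)
    {
        int length = -1;

        ParameterizedThreadStart body = state =>
        {
            // The argument arrives as object and must be cast back.
            length = state is string s ? s.Length : -1;
        };

        var worker = new Thread(body);
        worker.Start(text); // the argument given here is passed to the delegate
        worker.Join();
        return length;
    }
}
```

For example, ThreadDemo.SumOnThread(100) returns 5050 and ThreadDemo.LengthOnThread("hello") returns 5. Reading the captured variable only after Join() is what makes the result safe to use without further synchronization.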
Task Class
In general, it's preferable not to share data between threads unless absolutely necessary, and most applications do not use the shared data model extensively. Instead, they favor an asynchronous, functional programming style in which each unit of work (task) operates on its own data within a thread. In many applications, shared data mostly comes from other systems that are already optimized for parallel and concurrent access, such as databases, which ensure data consistency (even if only eventually). For this reason, many multi-threaded applications are designed to use the shared data model as little as possible, which simplifies application logic, improves code quality, and reduces the likelihood of errors.
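As a minimal sketch of this style, the example below starts one task per input with Task.Run and collects the results with Task.WhenAll. TaskDemo, SumOfSquares, and RunAllAsync are hypothetical names; the point is only that each task works on its own argument and returns a value instead of writing to shared state.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class TaskDemo
{
    // Hypothetical unit of work: it reads only its own argument
    // and touches no shared state.
    private static long SumOfSquares(int n)
    {
        long sum = 0;
        for (int i = 1; i <= n; i++)
        {
            sum += (long)i * i;
        }
        return sum;
    }

    // Starts one task per input and waits asynchronously for all of them.
    public static async Task<long[]> RunAllAsync(int[] inputs)
    {
        Task<long>[] tasks = inputs
            .Select(n => Task.Run(() => SumOfSquares(n)))
            .ToArray();

        // The results come back in the same order as the tasks.
        return await Task.WhenAll(tasks);
    }
}
```

Awaiting TaskDemo.RunAllAsync(new[] { 3, 4, 10 }) yields the array 14, 30, 385. No locking is needed here because the tasks never touch the same data.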
Bonus - Producer-Consumer
A classic problem in parallel programming is the producer-consumer problem. It can be described as follows: there are two types of threads, producers and consumers, which communicate through a buffer that holds a limited number of elements. Producers add values to the buffer, while consumers remove them. Access to the buffer must be exclusive to maintain its integrity. Producers must wait while the buffer is full, and consumers must wait while the buffer is empty.
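One possible way to meet these requirements is sketched below, using lock for exclusive access and Monitor.Wait with Monitor.PulseAll for the waiting conditions. BoundedBuffer, ProducerConsumerDemo, and the capacity of 4 are assumptions made for this illustration, not a prescribed solution.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical bounded buffer guarded by a single lock object.
public sealed class BoundedBuffer<T>
{
    private readonly Queue<T> items = new Queue<T>();
    private readonly object gate = new object();
    private readonly int capacity;

    public BoundedBuffer(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        this.capacity = capacity;
    }

    public void Add(T item)
    {
        lock (gate) // exclusive access to the queue
        {
            // Producers wait while the buffer is full.
            while (items.Count == capacity) Monitor.Wait(gate);
            items.Enqueue(item);
            Monitor.PulseAll(gate); // wake any waiting consumers
        }
    }

    public T Take()
    {
        lock (gate)
        {
            // Consumers wait while the buffer is empty.
            while (items.Count == 0) Monitor.Wait(gate);
            T item = items.Dequeue();
            Monitor.PulseAll(gate); // wake any waiting producers
            return item;
        }
    }
}

public static class ProducerConsumerDemo
{
    // Starts `pairs` producers and `pairs` consumers; each producer adds
    // 1..itemsPerProducer and each consumer removes itemsPerProducer values.
    public static long Run(int pairs, int itemsPerProducer)
    {
        var buffer = new BoundedBuffer<int>(capacity: 4);
        long total = 0;
        var threads = new List<Thread>();

        for (int p = 0; p < pairs; p++)
        {
            threads.Add(new Thread(() =>
            {
                for (int i = 1; i <= itemsPerProducer; i++) buffer.Add(i);
            }));
            threads.Add(new Thread(() =>
            {
                for (int i = 0; i < itemsPerProducer; i++)
                    Interlocked.Add(ref total, buffer.Take());
            }));
        }

        foreach (Thread t in threads) t.Start();
        foreach (Thread t in threads) t.Join();
        return total; // sum of every value that passed through the buffer
    }
}
```

ProducerConsumerDemo.Run(2, 100) returns 10100, since each of the two producers contributes 1 + 2 + ... + 100 = 5050. The wait conditions sit in while loops rather than if statements because a woken thread must recheck the buffer state after it reacquires the lock. The standard library also offers BlockingCollection<T> in System.Collections.Concurrent, which supports a bounded capacity and may be a more practical choice than a hand-written buffer.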