Primitive User-Mode and Kernel-Mode Constructs
In this chapter, I explain the primitive thread synchronization constructs. By primitive, I mean the simplest constructs that are available to use in your code. There are two kinds of primitive constructs: user-mode and kernel-mode. Whenever possible, you should use the primitive user-mode constructs, because they are significantly faster than the kernel-mode constructs: they use special CPU instructions to coordinate threads. This means that the coordination is occurring in hardware (which is what makes it fast). But this also means that the Windows operating system never detects that a thread is blocked on a primitive user-mode construct. Because a thread pool thread blocked on a user-mode primitive construct is never considered blocked, the thread pool will not create a new thread to replace the temporarily blocked thread. In addition, these CPU instructions block the thread for an incredibly short period of time.
Wow! All of this sounds great, doesn’t it? And it is great, which is why I recommend using these constructs as much as possible. However, there is a downside: only the Windows operating system kernel can stop a thread from running so that it is not wasting CPU time. A thread running in user mode can be preempted by the system, but the thread will be scheduled again as soon as possible. So, a thread that wants to acquire some resource, but can’t get it, spins in user mode. This potentially wastes a lot of CPU time, which would be better spent performing other work or even just letting the CPU go idle to conserve power.
1 Specifically, the field that both members access is marked as volatile, a concept that will be discussed later in this chapter.
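To make the spinning described above concrete, here is a minimal sketch of a lock built only from a user-mode primitive. The type SketchSpinLock and the demo class are hypothetical names invented for this illustration; only Interlocked.Exchange and Volatile.Write are real framework calls. Notice that a thread that cannot get the lock never leaves the while loop, so it stays on a CPU the whole time it waits.

```csharp
using System;
using System.Threading;

// Hypothetical type for illustration only; not a framework class.
internal sealed class SketchSpinLock {
    private int m_resourceInUse; // 0 = free, 1 = in use

    public void Enter() {
        // Atomically set the flag to 1. If the old value was 0, this thread
        // now owns the lock; otherwise, keep trying.
        while (Interlocked.Exchange(ref m_resourceInUse, 1) != 0) {
            // Spinning in user mode: the thread is running, not blocked,
            // so Windows never sees it as waiting.
        }
    }

    public void Leave() {
        // Mark the resource as free so a spinning thread can take it.
        Volatile.Write(ref m_resourceInUse, 0);
    }
}

internal static class SpinDemo {
    // Four threads each increment a shared counter 100,000 times under the lock.
    public static int Run() {
        var spinLock = new SketchSpinLock();
        int total = 0;
        var threads = new Thread[4];
        for (int i = 0; i < threads.Length; i++) {
            threads[i] = new Thread(() => {
                for (int n = 0; n < 100000; n++) {
                    spinLock.Enter();
                    total++;          // protected by the lock
                    spinLock.Leave();
                }
            });
            threads[i].Start();
        }
        foreach (Thread t in threads) t.Join();
        return total;
    }

    public static void Main() {
        Console.WriteLine(Run()); // expected: 400000
    }
}
```

The lock is held only for a single increment here, which is the kind of very short critical section where spinning is reasonable. If the owner held the lock for a long time, every waiting thread would burn CPU for that entire period.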
This brings us to the primitive kernel-mode constructs. The kernel-mode constructs are provided by the Windows operating system itself. As such, they require that your application’s threads call functions implemented in the operating system kernel. Having threads transition from user mode to kernel mode and back incurs a big performance hit, which is why kernel-mode constructs should be avoided.2 However, they do have a positive feature—when a thread uses a kernel-mode construct to acquire a resource that another thread has, Windows blocks the thread so that it is no longer wasting CPU time. Then, when the resource becomes available, Windows resumes the thread, allowing it to access the resource.
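As a rough sketch of the kernel-mode behavior just described, the following lock is built on AutoResetEvent, one of the CLR wrappers over a Windows kernel object. The type SketchWaitLock and the demo class are hypothetical names made up for this example. Every Enter and Leave call transitions into the kernel, even when no other thread wants the lock, which is the performance hit mentioned above; in exchange, a thread that cannot get the lock is truly blocked and consumes no CPU time.

```csharp
using System;
using System.Threading;

// Hypothetical type for illustration only; not a framework class.
internal sealed class SketchWaitLock : IDisposable {
    // true = initially signaled, meaning the lock starts out free.
    private readonly AutoResetEvent m_available = new AutoResetEvent(true);

    public void Enter() {
        // Kernel transition: if the event is not signaled, Windows blocks
        // this thread until another thread calls Set.
        m_available.WaitOne();
    }

    public void Leave() {
        // Kernel transition: signal the event so one waiting thread resumes.
        m_available.Set();
    }

    public void Dispose() { m_available.Dispose(); }
}

internal static class WaitLockDemo {
    // Four threads each increment a shared counter 10,000 times under the lock.
    public static int Run() {
        int total = 0;
        using (var waitLock = new SketchWaitLock()) {
            var threads = new Thread[4];
            for (int i = 0; i < threads.Length; i++) {
                threads[i] = new Thread(() => {
                    for (int n = 0; n < 10000; n++) {
                        waitLock.Enter();
                        total++;          // protected by the lock
                        waitLock.Leave();
                    }
                });
                threads[i].Start();
            }
            foreach (Thread t in threads) t.Join();
        }
        return total;
    }

    public static void Main() {
        Console.WriteLine(Run()); // expected: 40000
    }
}
```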
A thread waiting on a construct might block forever if the thread currently holding the construct never releases it. If the construct is a user-mode construct, the thread is running on a CPU forever, and we call this a livelock. If the construct is a kernel-mode construct, the thread is blocked forever, and we call this a deadlock. Both of these are bad, but of the two, a deadlock is always preferable to a livelock, because a livelock wastes both CPU time and memory (the thread’s stack, etc.), whereas a deadlock wastes only memory.3
In an ideal world, we’d like to have constructs that take the best of both worlds. That is, we’d like a construct that is fast and non-blocking (like the user-mode constructs) when there is no contention. But when there is contention for the construct, we’d like the waiting thread to be blocked by the operating system kernel. Constructs that work like this do exist; I call them hybrid constructs, and I will discuss them in Chapter 30. It is very common for applications to use the hybrid constructs, because in most applications, it is rare for two or more threads to attempt to access the same data at the same time. A hybrid construct keeps your application running fast most of the time, and occasionally it runs slowly to block the thread. The slowness usually doesn’t matter at this point, because your thread is going to be blocked anyway.
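To give a feel for the idea before Chapter 30 covers it properly, here is one possible sketch of a hybrid lock, assuming a simple design that pairs an Interlocked counter with an AutoResetEvent. The type SketchHybridLock and its members are hypothetical names for this illustration, not the constructs discussed later. When there is no contention, Enter and Leave touch only the counter and never enter the kernel; the event is used only when a second thread shows up.

```csharp
using System;
using System.Threading;

// Hypothetical type for illustration only; not a framework class.
internal sealed class SketchHybridLock : IDisposable {
    private int m_waiters; // threads that own or want the lock
    private readonly AutoResetEvent m_waiterLock = new AutoResetEvent(false);

    public void Enter() {
        // Fast path: this is the only thread, so it owns the lock
        // without any kernel transition.
        if (Interlocked.Increment(ref m_waiters) == 1) return;

        // Contention: let Windows block this thread until the owner leaves.
        m_waiterLock.WaitOne();
    }

    public void Leave() {
        // Fast path: no other thread is waiting, so there is nothing to wake.
        if (Interlocked.Decrement(ref m_waiters) == 0) return;

        // Contention: wake exactly one waiting thread, which becomes the owner.
        m_waiterLock.Set();
    }

    public void Dispose() { m_waiterLock.Dispose(); }
}

internal static class HybridDemo {
    // Four threads each increment a shared counter 50,000 times under the lock.
    public static int Run() {
        int total = 0;
        using (var hybridLock = new SketchHybridLock()) {
            var threads = new Thread[4];
            for (int i = 0; i < threads.Length; i++) {
                threads[i] = new Thread(() => {
                    for (int n = 0; n < 50000; n++) {
                        hybridLock.Enter();
                        total++;          // protected by the lock
                        hybridLock.Leave();
                    }
                });
                threads[i].Start();
            }
            foreach (Thread t in threads) t.Join();
        }
        return total;
    }

    public static void Main() {
        Console.WriteLine(Run()); // expected: 200000
    }
}
```

The demo deliberately creates heavy contention, which is the slow case for a hybrid construct. In an application where threads rarely collide, almost every call would take the fast path.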
Many of the CLR’s thread synchronization constructs are really just object-oriented class wrappers around Win32 thread synchronization constructs. After all, CLR threads are Windows threads, which means that Windows schedules and controls the synchronization of threads. Windows thread synchronization constructs have been around since 1992, and a ton of material has been written about them.4 Therefore, I give them only cursory treatment in this chapter.
2 I’ll show a program that measures the performance later in this chapter, at the end of the “Event Constructs” section.
3 I say that the memory allocated for the thread is wasted because the memory is not being used in a productive manner if the thread is not making forward progress.
4 In fact, Christophe Nasarre’s and my book, Windows via C/C++, Fifth Edition (Microsoft Press, 2007), has several chapters devoted to this subject.