Kernel-Mode Constructs

Windows offers several kernel-mode constructs for synchronizing threads. The kernel-mode constructs are much slower than the user-mode constructs. This is because they require coordination from the Windows operating system itself. Also, each method call on a kernel object causes the calling thread to transition from managed code to native user-mode code to native kernel-mode code and then return all the way back. These transitions require a lot of CPU time and, if performed frequently, can adversely affect the overall performance of your application.

However, the kernel-mode constructs offer some benefits over the primitive user-mode constructs, such as:

■ When a kernel-mode construct detects contention on a resource, Windows blocks the losing thread so that it is not spinning on a CPU, wasting processor resources.

■ Kernel-mode constructs can synchronize native and managed threads with each other.

■ Kernel-mode constructs can synchronize threads running in different processes on the same machine.

■ Kernel-mode constructs can have security applied to them to prevent unauthorized accounts from accessing them.

■ A thread can block until all kernel-mode constructs in a set are available or until any one kernel-mode construct in a set has become available.

■ A thread can block on a kernel-mode construct specifying a timeout value; if the thread can’t have access to the resource it wants in the specified amount of time, then the thread is unblocked and can perform other tasks.

The two primitive kernel-mode thread synchronization constructs are events and semaphores. Other kernel-mode constructs, such as mutex, are built on top of the two primitive constructs. For more information about the Windows kernel-mode constructs, see the book, Windows via C/C++, Fifth Edition (Microsoft Press, 2007) by myself and Christophe Nasarre.

The System.Threading namespace offers an abstract base class called WaitHandle. The WaitHandle class is a simple class whose sole purpose is to wrap a Windows kernel object handle. The FCL provides several classes derived from WaitHandle. All classes are defined in the System.Threading namespace. The class hierarchy looks like this.

WaitHandle
   EventWaitHandle
      AutoResetEvent
      ManualResetEvent
   Semaphore
   Mutex

Internally, the WaitHandle base class has a SafeWaitHandle field that holds a Win32 kernel object handle. This field is initialized when a concrete WaitHandle-derived class is constructed. In addition, the WaitHandle class publicly exposes methods that are inherited by all the derived classes. Every method called on a kernel-mode construct represents a full memory fence. WaitHandle’s interesting public methods are shown in the following code (some overloads for some methods are not shown).

public abstract class WaitHandle : MarshalByRefObject, IDisposable {
   // WaitOne internally calls the Win32 WaitForSingleObjectEx function.
   public virtual Boolean WaitOne();
   public virtual Boolean WaitOne(Int32 millisecondsTimeout);
   public virtual Boolean WaitOne(TimeSpan timeout);

   // WaitAll internally calls the Win32 WaitForMultipleObjectsEx function
   public static Boolean WaitAll(WaitHandle[] waitHandles);
   public static Boolean WaitAll(WaitHandle[] waitHandles, Int32 millisecondsTimeout);
   public static Boolean WaitAll(WaitHandle[] waitHandles, TimeSpan timeout);

   // WaitAny internally calls the Win32 WaitForMultipleObjectsEx function
   public static Int32 WaitAny(WaitHandle[] waitHandles);
   public static Int32 WaitAny(WaitHandle[] waitHandles, Int32 millisecondsTimeout);
   public static Int32 WaitAny(WaitHandle[] waitHandles, TimeSpan timeout);

   public const Int32 WaitTimeout = 258; // Returned from WaitAny if a timeout occurs

   // Dispose internally calls the Win32 CloseHandle function - DON'T CALL THIS.
   public void Dispose();
}

There are a few things to note about these methods:

■ You call WaitHandle’s WaitOne method to have the calling thread wait for the underlying kernel object to become signaled. Internally, this method calls the Win32 WaitForSingleObjectEx function. The returned Boolean is true if the object became signaled or false if a timeout occurs.

■ You call WaitHandle’s static WaitAll method to have the calling thread wait for all the kernel objects specified in the WaitHandle[] to become signaled. The returned Boolean is true if all of the objects became signaled or false if a timeout occurs. Internally, this method calls the Win32 WaitForMultipleObjectsEx function, passing TRUE for the bWaitAll parameter.

■ You call WaitHandle’s static WaitAny method to have the calling thread wait for any one of the kernel objects specified in the WaitHandle[] to become signaled. The returned Int32 is the index of the array element corresponding to the kernel object that became signaled, or WaitHandle.WaitTimeout if no object became signaled while waiting. Internally, this method calls the Win32 WaitForMultipleObjectsEx function, passing FALSE for the bWaitAll parameter.

■ The array that you pass to the WaitAny and WaitAll methods must contain no more than 64 elements or else the methods throw a System.NotSupportedException.

■ You call WaitHandle’s Dispose method to close the underlying kernel object handle. Internally, this method calls the Win32 CloseHandle function. You can only call Dispose explicitly in your code if you know for a fact that no other threads are using the kernel object. This puts a lot of burden on you as you write your code and test it. So, I would strongly discourage you from calling Dispose; instead, just let the garbage collector (GC) do the cleanup. The GC knows when no threads are using the object anymore, and then it will get rid of it. In a way, the GC is doing thread synchronization for you automatically!
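The return values described in the list above can be observed with a short sketch. The class and method names here (WaitHandleSketch and its members) are hypothetical, chosen only for illustration, and the 50-millisecond timeouts are arbitrary. The events are local to each method and no other thread ever sees them, so disposing them with a using statement is safe in this particular case.

```csharp
using System;
using System.Threading;

// Hypothetical helper class; not part of the FCL.
public static class WaitHandleSketch {
    // WaitOne returns false when the timeout expires before the object is signaled.
    public static bool WaitOneTimesOut() {
        using (var evt = new ManualResetEvent(false)) {
            return evt.WaitOne(50);
        }
    }

    // Only the second event is signaled, so WaitAny returns its array index.
    public static int WaitAnyIndex() {
        using (var first = new ManualResetEvent(false))
        using (var second = new ManualResetEvent(false)) {
            second.Set();
            return WaitHandle.WaitAny(new WaitHandle[] { first, second }, 50);
        }
    }

    // Nothing is signaled, so WaitAny returns WaitHandle.WaitTimeout.
    public static int WaitAnyTimeout() {
        using (var first = new ManualResetEvent(false))
        using (var second = new ManualResetEvent(false)) {
            return WaitHandle.WaitAny(new WaitHandle[] { first, second }, 50);
        }
    }

    // WaitAll returns true only if every handle in the array is signaled.
    public static bool WaitAllResult(bool signalBoth) {
        using (var first = new ManualResetEvent(false))
        using (var second = new ManualResetEvent(false)) {
            first.Set();
            if (signalBoth) second.Set();
            return WaitHandle.WaitAll(new WaitHandle[] { first, second }, 50);
        }
    }
}
```

Calling WaitAllResult(false) times out because one of the two events stays unsignaled, whereas WaitAllResult(true) succeeds immediately.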

The versions of WaitOne and WaitAll that do not accept a timeout parameter should be prototyped as having a void return type, not Boolean. These methods can only ever return true because the implied timeout is infinite (System.Threading.Timeout.Infinite). When you call any of these methods, you do not need to check the return value.

As already mentioned, the AutoResetEvent, ManualResetEvent, Semaphore, and Mutex classes are all derived from WaitHandle, so they inherit WaitHandle’s methods and their behavior. However, these classes introduce additional methods of their own, and I’ll address those now.

First, the constructors for all of these classes internally call the Win32 CreateEvent (passing FALSE for the bManualReset parameter) or CreateEvent (passing TRUE for the bManualReset parameter), CreateSemaphore, or CreateMutex functions. The handle value returned from all of these calls is saved in a private SafeWaitHandle field defined inside the WaitHandle base class.

Second, the EventWaitHandle, Semaphore, and Mutex classes all offer static OpenExisting methods, which internally call the Win32 OpenEvent, OpenSemaphore, or OpenMutex functions, passing a String argument that identifies an existing named kernel object. The handle value returned from all of these functions is saved in a newly constructed object that is returned from the OpenExisting method. If no kernel object exists with the specified name, a WaitHandleCannotBeOpenedException is thrown.
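The OpenExisting behavior can be sketched as follows. This is a minimal illustration, not a recommended pattern: the class name OpenExistingSketch and the generated object name are hypothetical, and a named Mutex is used on the assumption that it is the most widely supported named kernel object; on Windows the same shape should apply to EventWaitHandle and Semaphore.

```csharp
using System;
using System.Threading;

// Hypothetical helper class; not part of the FCL.
public static class OpenExistingSketch {
    // Returns true if a named mutex with this name can currently be opened.
    public static bool Exists(string name) {
        try {
            using (Mutex.OpenExisting(name)) {
                return true;
            }
        } catch (WaitHandleCannotBeOpenedException) {
            // No kernel object with this name exists.
            return false;
        }
    }

    // Checks the name before creating the object and again while it exists.
    public static bool Demo() {
        // A unique name so that the sketch does not collide with anything else.
        string name = "OpenExistingSketch-" + Guid.NewGuid().ToString("N");

        bool existedBefore = Exists(name);
        bool createdNew;
        using (new Mutex(false, name, out createdNew)) {
            bool existsNow = Exists(name);
            return !existedBefore && createdNew && existsNow;
        }
    }
}
```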

A common usage of the kernel-mode constructs is to create the kind of application that allows only one instance of itself to execute at any given time. Examples of single-instance applications are Microsoft Outlook, Windows Live Messenger, Windows Media Player, and Windows Media Center. Here is how to implement a single-instance application.

using System;
using System.Threading;

public static class Program {
   public static void Main() {
      Boolean createdNew;

      // Try to create a kernel object with the specified name
      using (new Semaphore(0, 1, "SomeUniqueStringIdentifyingMyApp", out createdNew)) {
         if (createdNew) {
            // This thread created the kernel object so no other instance of this
            // application must be running. Run the rest of the application here...
         } else {
            // This thread opened an existing kernel object with the same string name;
            // another instance of this application must be running now.
            // There is nothing to do in here, let's just return from Main to terminate
            // this second instance of the application.
         }
      }
   }
}

In this code, I am using a Semaphore, but it would work just as well if I had used an EventWaitHandle or a Mutex because I’m not actually using the thread synchronization behavior that the object offers. However, I am taking advantage of some thread synchronization behavior that the kernel offers when creating any kind of kernel object. Let me explain how the preceding code works. Let’s say that two instances of this process are started at exactly the same time. Each process will have its own thread, and both threads will attempt to create a Semaphore with the same string name (“SomeUniqueStringIdentifyingMyApp,” in my example). The Windows kernel ensures that only one thread actually creates a kernel object with the specified name; the thread that created the object will have its createdNew variable set to true.

For the second thread, Windows will see that a kernel object with the specified name already exists; the second thread does not get to create another kernel object with the same name, although if this thread continues to run, it can access the same kernel object as the first process’s thread. This is how threads in different processes can communicate with each other via a single kernel object. However, in this example, the second process’s thread sees that its createdNew variable is set to false. This thread now knows that another instance of this process is running, and the second instance of the process exits immediately.

Event Constructs

Events are simply Boolean variables maintained by the kernel. A thread waiting on an event blocks when the event is false and unblocks when the event is true. There are two kinds of events. When an auto-reset event is true, it wakes up just one blocked thread, because the kernel automatically resets the event back to false after unblocking the first thread. When a manual-reset event is true, it unblocks all threads waiting for it because the kernel does not automatically reset the event back to false; your code must manually reset the event back to false. The classes related to events look like this.

public class EventWaitHandle : WaitHandle {
   public Boolean Set();    // Sets Boolean to true; always returns true
   public Boolean Reset();  // Sets Boolean to false; always returns true
}

public sealed class AutoResetEvent : EventWaitHandle {
   public AutoResetEvent(Boolean initialState);
}

public sealed class ManualResetEvent : EventWaitHandle {
   public ManualResetEvent(Boolean initialState);
}
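The difference between the two kinds of events shows up even on a single thread. In the sketch below (EventBehaviorSketch and PollTwice are hypothetical names), the event is set once and then polled twice with a timeout of 0, which tests the event's state without blocking.

```csharp
using System;
using System.Threading;

// Hypothetical helper class; not part of the FCL.
public static class EventBehaviorSketch {
    // Sets the event once, then polls it twice without blocking.
    // Element 0 is the result of the first poll, element 1 the second.
    public static bool[] PollTwice(EventWaitHandle evt) {
        evt.Set();
        bool first = evt.WaitOne(0);   // Succeeds for both kinds of event
        bool second = evt.WaitOne(0);  // Succeeds only if the event is still true
        return new[] { first, second };
    }
}
```

Passing a new AutoResetEvent(false) yields true then false, because the first successful wait resets the event. Passing a new ManualResetEvent(false) yields true both times, because the event stays true until Reset is called.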

Using an auto-reset event, we can easily create a thread synchronization lock whose behavior is similar to the SimpleSpinLock class I showed earlier.

internal sealed class SimpleWaitLock : IDisposable {
   private readonly AutoResetEvent m_available;

   public SimpleWaitLock() {
      m_available = new AutoResetEvent(true); // Initially free
   }

   public void Enter() {
      // Block in kernel until resource available
      m_available.WaitOne();
   }

   public void Leave() {
      // Let another thread access the resource
      m_available.Set();
   }

   public void Dispose() { m_available.Dispose(); }
}

You would use this SimpleWaitLock exactly the same way that you’d use the SimpleSpinLock. In fact, the external behavior is exactly the same; however, the performance of the two locks is radically different. When there is no contention on the lock, the SimpleWaitLock is much slower than the SimpleSpinLock, because every call to SimpleWaitLock’s Enter and Leave methods forces the calling thread to transition from managed code to the kernel and back, which is bad. But when there is contention, the losing thread is blocked by the kernel and is not spinning and wasting CPU cycles, which is good. Note also that constructing the AutoResetEvent object and calling Dispose on it also causes managed-to-kernel transitions, affecting performance negatively. These calls usually happen rarely, so they are not something to be too concerned about.
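To see the mutual exclusion at work under contention, here is a minimal sketch that uses an AutoResetEvent directly in the Enter/Leave pattern described above. The class and method names (EventLockSketch, CountWithLock) and the thread counts are assumptions made for illustration.

```csharp
using System;
using System.Threading;

// Hypothetical helper class; not part of the FCL.
public static class EventLockSketch {
    // Several threads increment one shared counter; the auto-reset event
    // lets only one thread at a time into the protected region.
    public static int CountWithLock(int threadCount, int incrementsPerThread) {
        int counter = 0;
        using (var available = new AutoResetEvent(true)) {  // true = initially free
            var threads = new Thread[threadCount];
            for (int t = 0; t < threadCount; t++) {
                threads[t] = new Thread(() => {
                    for (int i = 0; i < incrementsPerThread; i++) {
                        available.WaitOne();  // Enter: block in kernel until free
                        counter++;            // Protected region
                        available.Set();      // Leave: wake one waiting thread
                    }
                });
                threads[t].Start();
            }
            // All threads finish before the event is disposed.
            foreach (Thread thread in threads) thread.Join();
        }
        return counter;
    }
}
```

With 4 threads and 1,000 increments each, the method returns 4,000; without the WaitOne and Set calls, some increments could be lost.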

To give you a better feel for the performance differences, I wrote the following code.

using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;
using System.Threading;

public static class Program {
   public static void Main() {
      Int32 x = 0;
      const Int32 iterations = 10000000; // 10 million

      // How long does it take to increment x 10 million times?
      Stopwatch sw = Stopwatch.StartNew();
      for (Int32 i = 0; i < iterations; i++) {
         x++;
      }
      Console.WriteLine("Incrementing x: {0:N0}", sw.ElapsedMilliseconds);

      // How long does it take to increment x 10 million times
      // adding the overhead of calling a method that does nothing?
      sw.Restart();
      for (Int32 i = 0; i < iterations; i++) {
         M(); x++; M();
      }
      Console.WriteLine("Incrementing x in M: {0:N0}", sw.ElapsedMilliseconds);

      // How long does it take to increment x 10 million times
      // adding the overhead of calling an uncontended SpinLock?
      SpinLock sl = new SpinLock(false);
      sw.Restart();
      for (Int32 i = 0; i < iterations; i++) {
         Boolean taken = false; sl.Enter(ref taken); x++; sl.Exit();
      }
      Console.WriteLine("Incrementing x in SpinLock: {0:N0}", sw.ElapsedMilliseconds);

      // How long does it take to increment x 10 million times
      // adding the overhead of calling an uncontended SimpleWaitLock?
      using (SimpleWaitLock swl = new SimpleWaitLock()) {
         sw.Restart();
         for (Int32 i = 0; i < iterations; i++) {
            swl.Enter(); x++; swl.Leave();
         }
         Console.WriteLine("Incrementing x in SimpleWaitLock: {0:N0}", sw.ElapsedMilliseconds);
      }
   }

   [MethodImpl(MethodImplOptions.NoInlining)]
   private static void M() { /* This method does nothing but return */ }
}

When I run the preceding code, I get the following output.

Incrementing x: 8                         Fastest
Incrementing x in M: 69                   ~9x slower
Incrementing x in SpinLock: 164           ~21x slower
Incrementing x in SimpleWaitLock: 8,854   ~1,107x slower

As you can clearly see, just incrementing x took only 8 milliseconds. To call empty methods before and after incrementing x made the operation take nine times longer! Then, executing code in a method that uses a user-mode construct caused the code to run 21 (164 / 8) times slower. But now, see how much slower the program ran using a kernel-mode construct: 1,107 (8,854 / 8) times slower! So, if you can avoid thread synchronization, you should. If you need thread synchronization, then try to use the user-mode constructs. Always try to avoid the kernel-mode constructs.

Semaphore Constructs

Semaphores are simply Int32 variables maintained by the kernel. A thread waiting on a semaphore blocks when the semaphore is 0 and unblocks when the semaphore is greater than 0. When a thread waiting on a semaphore unblocks, the kernel automatically subtracts 1 from the semaphore’s count. Semaphores also have a maximum Int32 value associated with them, and the current count is never allowed to go over the maximum count. Here is what the Semaphore class looks like.

public sealed class Semaphore : WaitHandle {
   public Semaphore(Int32 initialCount, Int32 maximumCount);
   public Int32 Release();                    // Calls Release(1); returns previous count
   public Int32 Release(Int32 releaseCount);  // Returns previous count
}

So now let me summarize how these three kernel-mode primitives behave:

■ When multiple threads are waiting on an auto-reset event, setting the event causes only one thread to become unblocked.

■ When multiple threads are waiting on a manual-reset event, setting the event causes all threads to become unblocked.

■ When multiple threads are waiting on a semaphore, releasing the semaphore causes releaseCount threads to become unblocked (where releaseCount is the argument passed to Semaphore’s Release method).

Therefore, an auto-reset event behaves very similarly to a semaphore whose maximum count is 1. The difference between the two is that Set can be called multiple times consecutively on an auto-reset event, and still only one thread will be unblocked, whereas calling Release multiple times consecutively on a semaphore keeps incrementing its internal count, which could unblock many threads. By the way, if you call Release on a semaphore too many times, causing its count to exceed its maximum count, then Release will throw a SemaphoreFullException.
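The contrast between repeated Set calls and repeated Release calls can be checked on a single thread. The following is a sketch under assumed names (ReleaseVersusSetSketch and its methods are hypothetical); a timeout of 0 is used so that each wait tests the object's state without blocking.

```csharp
using System;
using System.Threading;

// Hypothetical helper class; not part of the FCL.
public static class ReleaseVersusSetSketch {
    // Three consecutive Set calls still satisfy only one wait.
    public static int WaitsSatisfiedAfterThreeSets() {
        using (var evt = new AutoResetEvent(false)) {
            evt.Set(); evt.Set(); evt.Set();
            int satisfied = 0;
            while (evt.WaitOne(0)) satisfied++;
            return satisfied;
        }
    }

    // Three consecutive Release calls raise the count to 3,
    // so three waits are satisfied.
    public static int WaitsSatisfiedAfterThreeReleases() {
        using (var sem = new Semaphore(0, 3)) {
            sem.Release(); sem.Release(); sem.Release();
            int satisfied = 0;
            while (sem.WaitOne(0)) satisfied++;
            return satisfied;
        }
    }

    // Releasing a semaphore that is already at its maximum count throws.
    public static bool ReleasePastMaximumThrows() {
        using (var sem = new Semaphore(1, 1)) {
            try {
                sem.Release();
                return false;
            } catch (SemaphoreFullException) {
                return true;
            }
        }
    }
}
```

The first method returns 1 and the second returns 3, which is the behavioral difference described above.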

Using a semaphore, we can re-implement the SimpleWaitLock as follows, so that it gives multiple threads concurrent access to a resource (which is not necessarily a safe thing to do unless all threads access the resource in a read-only fashion).

public sealed class SimpleWaitLock : IDisposable {
   private readonly Semaphore m_available;

   public SimpleWaitLock(Int32 maxConcurrent) {
      m_available = new Semaphore(maxConcurrent, maxConcurrent);
   }

   public void Enter() {
      // Block in kernel until resource available
      m_available.WaitOne();
   }

   public void Leave() {
      // Let another thread access the resource
      m_available.Release(1);
   }

   public void Dispose() { m_available.Close(); }
}
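As a rough illustration of the concurrency limit, the sketch below uses a Semaphore directly and records the largest number of threads that were ever inside the guarded region at once. The names (SemaphoreGateSketch, PeakConcurrency) and the 20-millisecond sleep that stands in for real work are assumptions for the example.

```csharp
using System;
using System.Threading;

// Hypothetical helper class; not part of the FCL.
public static class SemaphoreGateSketch {
    // Starts threadCount threads that all pass through a semaphore
    // admitting at most maxConcurrent of them at a time, and returns
    // the highest number of threads observed inside simultaneously.
    public static int PeakConcurrency(int threadCount, int maxConcurrent) {
        int inside = 0, peak = 0;
        object peakGuard = new object();
        using (var gate = new Semaphore(maxConcurrent, maxConcurrent)) {
            var threads = new Thread[threadCount];
            for (int t = 0; t < threadCount; t++) {
                threads[t] = new Thread(() => {
                    gate.WaitOne();                              // Enter
                    int now = Interlocked.Increment(ref inside);
                    lock (peakGuard) { if (now > peak) peak = now; }
                    Thread.Sleep(20);                            // Simulated work
                    Interlocked.Decrement(ref inside);
                    gate.Release();                              // Leave
                });
                threads[t].Start();
            }
            // All threads finish before the semaphore is disposed.
            foreach (Thread thread in threads) thread.Join();
        }
        return peak;
    }
}
```

With 8 threads and maxConcurrent set to 3, the returned peak never exceeds 3, although the exact value depends on scheduling.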

Mutex Constructs

A Mutex represents a mutually exclusive lock. It works similarly to an AutoResetEvent or a Semaphore with a count of 1 because all three constructs release only one waiting thread at a time. The following shows what the Mutex class looks like.

public sealed class Mutex : WaitHandle {
   public Mutex();
   public void ReleaseMutex();
}

Mutexes have some additional logic in them, which makes them more complex than the other constructs. First, a Mutex object records which thread obtained it by querying the calling thread’s Int32 ID. When a thread calls ReleaseMutex, the Mutex makes sure that the calling thread is the same thread that obtained the Mutex. If the calling thread is not the thread that obtained the Mutex, then the Mutex object’s state is unaltered and ReleaseMutex throws a System.ApplicationException. Also, if a thread owning a Mutex terminates for any reason, then some thread waiting on the Mutex will be awakened by having a System.Threading.AbandonedMutexException thrown. Usually, this exception will go unhandled, terminating the whole process. This is good because a thread acquired the Mutex and it is likely that the thread terminated before it finished updating the data that the Mutex was protecting. If a thread catches AbandonedMutexException, then it could attempt to access the corrupt data, leading to unpredictable results and security problems.

Second, Mutex objects maintain a recursion count indicating how many times the owning thread owns the Mutex. If a thread currently owns a Mutex and then that thread waits on the Mutex again, the recursion count is incremented and the thread is allowed to continue running. When that thread calls ReleaseMutex, the recursion count is decremented. Only when the recursion count becomes 0 can another thread become the owner of the Mutex.
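Both pieces of extra logic, thread ownership and the recursion count, can be observed with a small sketch. The class and method names (MutexOwnershipSketch and its members) are hypothetical; a second thread probes the Mutex with a timeout of 0 so that it never blocks.

```csharp
using System;
using System.Threading;

// Hypothetical helper class; not part of the FCL.
public static class MutexOwnershipSketch {
    // Tries to acquire the mutex from a different thread without blocking.
    private static bool OtherThreadCanAcquire(Mutex m) {
        bool acquired = false;
        var t = new Thread(() => {
            acquired = m.WaitOne(0);
            if (acquired) m.ReleaseMutex();
        });
        t.Start();
        t.Join();
        return acquired;
    }

    // Element 0: could another thread acquire after one of two releases?
    // Element 1: could another thread acquire after both releases?
    public static bool[] RecursionDemo() {
        using (var m = new Mutex()) {
            m.WaitOne();        // Recursion count = 1
            m.WaitOne();        // Recursion count = 2; does not block
            m.ReleaseMutex();   // Recursion count = 1; still owned
            bool afterOneRelease = OtherThreadCanAcquire(m);
            m.ReleaseMutex();   // Recursion count = 0; now free
            bool afterTwoReleases = OtherThreadCanAcquire(m);
            return new[] { afterOneRelease, afterTwoReleases };
        }
    }

    // A thread that does not own the mutex cannot release it.
    public static bool NonOwnerReleaseThrows() {
        using (var m = new Mutex()) {
            try {
                m.ReleaseMutex();
                return false;
            } catch (ApplicationException) {
                return true;
            }
        }
    }
}
```

RecursionDemo yields false then true: the other thread is kept out until the owning thread has released the Mutex as many times as it acquired it.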

Most people do not like this additional logic. The problem is that these “features” have a cost associated with them. The Mutex object needs more memory to hold the additional thread ID and recursion count information. And, more importantly, the Mutex code has to maintain this information, which makes the lock slower. If an application needs or wants these additional features, then the application code could have done this itself; the code doesn’t have to be built into the Mutex object. For this reason, a lot of people avoid using Mutex objects.

Usually a recursive lock is needed when a method takes a lock and then calls another method that also requires the lock, as the following code demonstrates.

internal class SomeClass : IDisposable {
   private readonly Mutex m_lock = new Mutex();

   public void Method1() {
      m_lock.WaitOne();
      // Do whatever...
      Method2(); // Method2 recursively acquires the lock
      m_lock.ReleaseMutex();
   }

   public void Method2() {
      m_lock.WaitOne();
      // Do whatever...
      m_lock.ReleaseMutex();
   }

   public void Dispose() { m_lock.Dispose(); }
}

In the preceding code, code that uses a SomeClass object could call Method1, which acquires the Mutex, performs some thread-safe operation, and then calls Method2, which also performs some thread-safe operation. Because Mutex objects support recursion, the thread will acquire the lock twice and then release it twice before another thread can own the Mutex. If SomeClass had used an AutoResetEvent instead of a Mutex, then the thread would block when it called Method2’s WaitOne method.

If you need a recursive lock, then you could create one easily by using an AutoResetEvent.

internal sealed class RecursiveAutoResetEvent : IDisposable {
   private AutoResetEvent m_lock = new AutoResetEvent(true);
   private Int32 m_owningThreadId = 0;
   private Int32 m_recursionCount = 0;

   public void Enter() {
      // Obtain the calling thread's unique Int32 ID
      Int32 currentThreadId = Thread.CurrentThread.ManagedThreadId;

      // If the calling thread owns the lock, increment the recursion count
      if (m_owningThreadId == currentThreadId) {
         m_recursionCount++;
         return;
      }

      // The calling thread doesn't own the lock, wait for it
      m_lock.WaitOne();

      // The calling thread now owns the lock, initialize the owning thread ID & recursion count
      m_owningThreadId = currentThreadId;
      m_recursionCount = 1;
   }

   public void Leave() {
      // If the calling thread doesn't own the lock, we have an error
      if (m_owningThreadId != Thread.CurrentThread.ManagedThreadId)
         throw new InvalidOperationException();

      // Subtract 1 from the recursion count
      if (--m_recursionCount == 0) {
         // If the recursion count is 0, then no thread owns the lock
         m_owningThreadId = 0;
         m_lock.Set(); // Wake up 1 waiting thread (if any)
      }
   }

   public void Dispose() { m_lock.Dispose(); }
}

Although the behavior of the RecursiveAutoResetEvent class is identical to that of the Mutex class, a RecursiveAutoResetEvent object will have far superior performance when a thread tries to acquire the lock recursively, because all the code that is required to track thread ownership and recursion is now in managed code. A thread has to transition into the Windows kernel only when first acquiring the AutoResetEvent or when finally relinquishing it to another thread.


Date: 2016-03-03; view: 1054
