IL and Verification

IL is stack-based, which means that all of its instructions push operands onto an execution stack and pop results off the stack. Because IL offers no instructions to manipulate registers, it is easy for people to create new languages and compilers that produce code targeting the CLR.

IL instructions are also typeless. For example, IL offers an add instruction that adds the last two operands pushed on the stack. There are no separate 32-bit and 64-bit versions of the add instruc- tion. When the add instruction executes, it determines the types of the operands on the stack and performs the appropriate operation.

In my opinion, the biggest benefit of IL isn’t that it abstracts away the underlying CPU. The biggest benefit IL provides is application robustness and security. While compiling IL into native CPU instruc- tions, the CLR performs a process called verification. Verification examines the high-level IL code and ensures that everything the code does is safe. For example, verification checks that every method is called with the correct number of parameters, that each parameter passed to every method is of the correct type, that every method’s return value is used properly, that every method has a return state- ment, and so on. The managed module’s metadata includes all of the method and type information used by the verification process.

In Windows, each process has its own virtual address space. Separate address spaces are neces- sary because you can’t trust an application’s code. It is entirely possible (and unfortunately, all too common) that an application will read from or write to an invalid memory address. By placing each Windows process in a separate address space, you gain robustness and stability; one process can’t adversely affect another process.

By verifying the managed code, however, you know that the code doesn’t improperly access memory and can’t adversely affect another application’s code. This means that you can run multiple managed applications in a single Windows virtual address space.

Because Windows processes require a lot of operating system resources, having many of them can hurt performance and limit available resources. Reducing the number of processes by running multiple applications in a single operating system process can improve performance, require fewer resources, and be just as robust as if each application had its own process. This is another benefit of managed code as compared to unmanaged code.

The CLR does, in fact, offer the ability to execute multiple managed applications in a single operat- ing system process. Each managed application executes in an AppDomain. By default, every managed EXE file will run in its own separate address space that has just one AppDomain. However, a process hosting the CLR (such as Internet Information Services [IIS] or Microsoft SQL Server) can decide to run AppDomains in a single operating system process. I’ll devote part of Chapter 22, “CLR Hosting and AppDomains,” to a discussion of AppDomains.

Unsafe Code

By default, Microsoft’s C# compiler produces safe code. Safe code is code that is verifiably safe. How- ever, Microsoft’s C# compiler allows developers to write unsafe code. Unsafe code is allowed to work directly with memory addresses and can manipulate bytes at these addresses. This is a very power- ful feature and is typically useful when interoperating with unmanaged code or when you want to improve the performance of a time-critical algorithm.

However, using unsafe code introduces a significant risk: unsafe code can corrupt data structures and exploit or even open up security vulnerabilities. For this reason, the C# compiler requires that all methods that contain unsafe code be marked with the unsafe keyword. In addition, the C# compiler requires you to compile the source code by using the /unsafe compiler switch.

When the JIT compiler attempts to compile an unsafe method, it checks to see if the assembly containing the method has been granted the System.Security.Permissions.Security Permis sion with the System.Security.Permissions.SecurityPermissionFlag’s SkipVerification flag set. If this flag is set, the JIT compiler will compile the unsafe code and allow it to execute. The CLR is trusting this code and is hoping the direct address and byte manipulations do not cause any harm. If the flag is not set, the JIT compiler throws either a System.InvalidProgramException or a System.Security.VerificationException, preventing the method from executing. In fact, the whole application will probably terminate at this point, but at least no harm can be done.

Microsoft supplies a utility called PEVerify.exe, which examines all of an assembly’s methods and notifies you of any methods that contain unsafe code. You may want to consider running PEVerify.exe on assemblies that you are referencing; this will let you know if there may be problems running your application via the intranet or Internet.

You should be aware that verification requires access to the metadata contained in any dependent assemblies. So when you use PEVerify to check an assembly, it must be able to locate and load all referenced assemblies. Because PEVerify uses the CLR to locate the dependent assemblies, the as- semblies are located using the same binding and probing rules that would normally be used when ex- ecuting the assembly. I’ll discuss these binding and probing rules in Chapter 2 and Chapter 3, “Shared Assemblies and Strongly Named Assemblies.”

IL and Protecting Your Intellectual Property

Some people are concerned that IL doesn’t offer enough intellectual property protection for their algorithms. In other words, they think that you could build a managed module and that someone else could use a tool, such as an IL Disassembler, to easily reverse engineer exactly what your application’s code does.

Yes, it’s true that IL code is higher-level than most other assembly languages, and, in general, reverse engineering IL code is relatively simple. However, when implementing server-side code (such as a web service, web form, or stored procedure), your assembly resides on your server.

Because no one outside of your company can access the assembly, no one outside of your com- pany can use any tool to see the IL—your intellectual property is completely safe.

If you’re concerned about any of the assemblies you do distribute, you can obtain an obfus- cator utility from a third-party vendor. These utilities scramble the names of all of the private symbols in your assembly’s metadata. It will be difficult for someone to unscramble the names and understand the purpose of each method. Note that these obfuscators can provide only a little protection because the IL must be available at some point for the CLR to JIT compile it.

If you don’t feel that an obfuscator offers the kind of intellectual property protection you desire, you can consider implementing your more sensitive algorithms in some unmanaged module that will contain native CPU instructions instead of IL and metadata. Then you can use the CLR’s interoperability features (assuming that you have ample permissions) to communicate between the managed and unmanaged portions of your application. Of course, this assumes that you’re not worried about people reverse engineering the native CPU instructions in your unmanaged code.

Date: 2016-03-03; view: 625

<== previous page	\|	next page ==>
Nbsp; Executing Your Assembly’s Code	\|	Nbsp; The Native Code Generator Tool: NGen.exe

doclecture.net - lectures - 2014-2025 year. Copyright infringement or personal data (1.213 sec.)