2. What logical phases of instruction execution may an individual instruction be partitioned into?
3. Why is the execution of an individual instruction broken down into a sequence of separate phases?
4. Why are the logical phases of instruction execution not necessarily distinct in the sense that they must be executed sequentially?
5. How is it possible to organize a simple pipeline with four stages?
6. What problems occur in realizing the most common approach used on almost all RISC processors?
7. What are the peculiarities of instruction scheduling?
8. How is the fundamental idea realized in program loops to cause a straight-line sequence of instructions to be executed repeatedly?
9. What are the two aspects of the problem that occurs in an instruction pipeline when a branch is encountered in an executing program?
10. What is the essence of the delayed branch (or delayed jump) technique which is used on RISC machines for dealing with the branch problem?
11. What is the structure of the MIPS computer?
12. What is the feature that distinguishes the MIPS from other RISC processors?
13. How does the five-stage pipeline of the MIPS work?
Interrupts
An alternative means for coordinating the activities of the CPU with those of I/O devices is the use of interrupts. The CPU may temporarily suspend the execution of instructions. The processor stays in this state until the external device is ready for data transfer. At this point, the device alerts the CPU by activating one of the control lines, which we shall refer to as the interrupt-request (INTR) line. This signals the CPU to proceed with the execution of the data transfer instruction. Since the computer is no longer required to continuously check the status of external devices, the waiting period can be utilized to perform other useful functions. Indeed, by using interrupts, such waiting periods can ideally be eliminated [11], [13], [42], [59], [61], [62].
Example. Consider a task which requires some computations to be performed and the results to be printed on a line printer. This is followed by more computations and output, and so on. Let the program consist of two routines, COMPUTE and PRINT. Assume that COMPUTE produces n lines of output. These are printed by the PRINT routine. The required task may be performed by repeatedly executing the COMPUTE routine, then the PRINT routine. The printer accepts one line of text at a time. Hence, the PRINT routine must send one line of text, wait for it to be printed, then send the next line, until all the results have been printed. The disadvantage of this simple approach is that the CPU spends a considerable amount of time waiting for the printer to become ready. It is possible to overlap the printing and computation processes, that is, to execute the COMPUTE routine while printing is in progress, resulting in a higher overall speed of execution. This may be achieved as follows. First, the COMPUTE routine is executed to produce the first n lines of output. Then, the PRINT routine is executed to send the first line of text to the printer. At this point, instead of waiting for this line to be printed, the PRINT routine may be temporarily suspended. This makes it possible to continue execution of the COMPUTE routine. Whenever the printer becomes ready, it alerts the CPU by sending an interrupt-request signal. This causes the CPU to interrupt the execution of the COMPUTE routine and transfer control to the PRINT routine. The PRINT routine sends the second line to the printer. Then, it returns control to the interrupted COMPUTE routine, which resumes execution at the point of interruption. This process continues until all n lines have been printed. The PRINT routine will be restarted whenever the next set of n lines is available for printing. 
If COMPUTE takes about the same amount of time to generate n lines as the time required to print them, then it is likely that the next set of n lines will be available immediately.
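The overlap described above can be sketched as a toy model (all names here are illustrative, not part of any real system): COMPUTE fills a buffer of n lines, and each time the printer signals readiness — modeled as a callback standing in for the interrupt — the PRINT routine sends one more line instead of the CPU busy-waiting.

```python
def run_overlapped(n):
    buffer = [f"line {i}" for i in range(n)]   # output of COMPUTE
    printed = []
    pending = list(buffer)                     # lines still to be printed
    compute_steps = 0

    def printer_interrupt():                   # invoked when printer is ready
        if pending:
            printed.append(pending.pop(0))     # PRINT sends the next line

    printer_interrupt()                        # PRINT sends the first line
    while pending or compute_steps < n:
        if compute_steps < n:
            compute_steps += 1                 # COMPUTE continues meanwhile
        printer_interrupt()                    # printer ready again: "interrupt"
    return printed, compute_steps
```

In this sketch the computation steps and the per-line printing interleave, which is exactly the gain over the simple scheme in which the CPU idles between lines.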
The above example is intended to introduce the concept of interrupts. The routine which is executed in response to an interrupt request is called the interrupt-service routine. In the above case, this is the PRINT routine. Interrupt servicing bears considerable resemblance to subroutine calls. Assume that the interrupt arrives during execution of instruction i in Fig. 3.8. The CPU first
completes execution of instruction i. It should then load the program counter (PC) with the address of the first instruction of the interrupt-service routine. Let us assume, for the time being, that this address is hardwired in the CPU. After execution of the interrupt-service routine, the CPU should come back to instruction i+1. Therefore, when an interrupt is received, the current contents of the PC, which point at instruction i+1 in the case of Fig. 3.8, should be put in temporary storage. A return-from-interrupt instruction at the end of the interrupt-service routine causes the CPU to reload the PC from that temporary storage location. Thus, execution resumes at instruction i+1. In most computers, the return address is saved on the processor stack. We should note that as part of handling interrupts, the CPU must inform the device that its request has been recognized, so that it may remove its interrupt-request signal. This may be accomplished by means of a special control signal on the bus. An interrupt-acknowledge signal, used in some of the interrupt schemes to be discussed later, may serve this function. A common alternative is to have the transfer of data between the CPU and the I/O device interface accomplish the same purpose. The execution of an instruction in the interrupt-service routine that either reads from or writes into the data register in the device interface implicitly informs the device that its interrupt request has been recognized. So far, treatment of an interrupt-service routine is very similar to that of a subroutine. An important departure from this similarity should be noted at this point. A subroutine performs a function required by the main program by which it is called. However, the interrupt-service routine may not have anything in common with the program being executed at the time the interrupt is received. In fact, the two programs often belong to different users.
Therefore, before starting execution of the interrupt-service routine, the CPU should save, along with the contents of the PC, any information that may affect execution after return to the original program. Namely, the CPU should save the status word which includes the condition codes and any other status indicators at the time of interruption. Upon return from the interrupt-service routine, the CPU reloads the status word from its temporary storage location. This enables the original program to resume execution without being affected in any way by the occurrence of the interrupt, except, of course, for the time delay.
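The entry and return sequence just described can be sketched as follows. This is a minimal model, not any real instruction set: the CPU pushes the status word (PS) and the PC (pointing at instruction i+1) on the processor stack, enters the service routine, and a return-from-interrupt pops them back so the interrupted program resumes unaffected.

```python
class CPU:
    def __init__(self):
        self.pc = 0
        self.ps = 0b0001              # condition codes / status indicators
        self.stack = []

    def accept_interrupt(self, isr_address):
        self.stack.append(self.ps)    # save the status word
        self.stack.append(self.pc)    # save the return address (i+1)
        self.pc = isr_address         # enter the interrupt-service routine

    def return_from_interrupt(self):
        self.pc = self.stack.pop()    # resume at instruction i+1
        self.ps = self.stack.pop()    # condition codes restored unchanged
```

Note the ordering: whatever the service routine did to the condition codes, the pop of the saved PS puts the original program's status back before execution resumes.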
The contents of CPU registers, other than the program counter and the processor status register, may or may not be saved automatically by the interrupt-handling mechanism. Obviously, if the CPU saves register contents before entering the interrupt-service routine, it must also restore them before returning to the interrupted program. The process of saving and restoring registers involves a number of memory transfers. These transfers represent a time overhead that is associated with every interrupt accepted by the CPU. In a computer which does not automatically save all register contents following an interrupt, the interrupt-service routine should save the contents of any CPU register that it needs to use. The saved data should be restored to their respective registers before returning to the interrupted program. In order to minimize the interrupt overhead, some computers provide two types of interrupts. One saves all register contents, and the other does not. A particular I/O device may use either type, depending upon its response time requirements.
Another interesting approach is to provide duplicate sets of CPU registers. Then a different set of registers can be used in servicing interrupt requests. The above discussion shows that an interrupt is more than a simple mechanism for coordinating I/O transfers. In a general sense, interrupts enable transfer of control from one program to another to be initiated by an event external to the computer. Execution of the interrupted program resumes after completion of execution of the interrupt-service routine. As such, the concept of interrupts is useful in operating systems and in many control applications where processing of certain routines has to be accurately timed relative to external events. The latter application is generally referred to as real-time processing.
Interrupt Handling. The facilities available in a computer should be sufficient to enable the programmer to have complete control over the events that take place during program execution. The arrival of an interrupt request from an external device causes the CPU to suspend the execution of one program and start the execution of another. Since interrupts can arrive at any time, they may alter the sequence of events envisaged by the programmer. Hence, they should be carefully controlled. A fundamental facility found in all computers is the ability to enable and disable the occurrence of program interruptions as desired. We will examine this in some detail below. Since proper handling of interrupts requires close cooperation between the software and the hardware, no attempt will be made to separate these two aspects in the discussion.
Enabling and Disabling Interrupts. There are many situations in which the CPU should ignore interrupt requests. For example, in the case of the COMPUTE-PRINT program of Fig. 3.8, an interrupt request from the printer should be accepted only if there are output lines to be printed. After printing the last line of a set of n lines, interrupts should be disabled until another set becomes available for printing. In another case, it may be necessary to guarantee that a particular sequence of instructions is executed to the end without interruption, since the interrupt-service routine may change some of the data used by the instructions in question. For these reasons, some means for enabling and disabling interrupts must be made available to the programmer. A simple way is to provide machine instructions, such as Interrupt-enable and Interrupt-disable, which perform these functions.
Let us consider in some detail the specific case of a single interrupt request from one device. When a device activates the interrupt-request signal, it keeps this signal activated until it learns that the CPU has accepted its request. This means that the interrupt-request signal will be active during execution of the interrupt-service routine, perhaps until an instruction is reached which accesses the device in question. It is essential to ensure that this active request signal does not cause a second interruption during this period. An erroneous interpretation of a single interrupt as multiple requests would cause the system to enter an infinite loop from which it could not recover. Several mechanisms are available to alleviate this problem. We will describe three simple possibilities here. Other schemes that involve more than one interrupting device will be presented later. The first possibility is to have the CPU hardware ignore the interrupt-request line until the execution of the first instruction of the interrupt-service routine has been completed. Thus, using an Interrupt-disable instruction as the first instruction in the interrupt-service routine, the programmer can ensure that no further interruptions will occur until an Interrupt-enable instruction is executed. Typically, this will be the last instruction in the interrupt-service routine before the Return-from-interrupt instruction. Again the CPU must guarantee that execution of the Return-from-interrupt instruction is completed before further interruption can occur.
Another option, commonly encountered in practice, is to have the CPU automatically disable interrupts before starting the execution of the interrupt-service routine. That is, after saving the contents of the PC and the processor status (PS) on the stack, the CPU automatically performs the equivalent of executing an Interrupt-disable instruction. It is often the case that one bit in the PS register indicates whether interrupts are enabled or disabled. The CPU sets this bit to disable interrupts. Then it starts execution of the interrupt-service routine. Similarly, the CPU may automatically enable interrupts when a Return-from-interrupt instruction is executed. This is one of the results of the restoration of the contents of the PS register from the stack.
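The automatic-disable option can be sketched with an interrupt-enable bit in the PS word. The bit position and values here are purely illustrative: accepting an interrupt saves the PS and clears the enable bit; return-from-interrupt restores the saved PS, which re-enables interrupts as a side effect.

```python
IE_BIT = 0x80   # hypothetical interrupt-enable bit in the status word

def accept(ps, stack):
    stack.append(ps)             # old PS saved on the stack
    return ps & ~IE_BIT          # interrupts now automatically disabled

def return_from_interrupt(stack):
    return stack.pop()           # restored PS carries the old IE bit back
```

No explicit Interrupt-enable instruction is needed at the end of the service routine: restoring the PS does the job, exactly as the text describes.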
The third approach that can be used is to arrange the interrupt-handling circuit in the CPU so that it responds only to the leading edge of the interrupt-request signal. Obviously, only one such transition will be seen by the CPU for every request generated by the device. A proper understanding of the sequence of events involved in servicing interrupts is essential, both for the hardware designer and for the programmer. Before proceeding to study more complex aspects of interrupts, let us summarize the sequence of events involved in handling an interrupt request from a single device:
1. The device raises an interrupt request.
2. The CPU interrupts the program being executed at the time.
3. Interrupts are disabled.
4. The device is informed that its request has been recognized, and in response, it deactivates the interrupt-request signal.
5. The action requested by the interrupt is performed.
6. Interrupts are enabled.
7. Execution of the interrupted program is resumed.
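The leading-edge scheme mentioned above (the third approach) lends itself to a small sketch: the CPU logic reacts only to a 0-to-1 transition of INTR, so a request line held active throughout the service routine triggers exactly one interrupt. The sampling model below is an illustration, not a description of any particular processor's circuit.

```python
def count_interrupts(samples):
    """samples: INTR level seen on successive clock cycles (0 or 1)."""
    taken = 0
    previous = 0
    for level in samples:
        if level == 1 and previous == 0:   # respond to the leading edge only
            taken += 1
        previous = level
    return taken
```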
Let us now consider the situation where a number of devices capable of initiating interrupts are connected to the CPU. Since these devices are operationally independent, there is no definite order in which they will generate interrupts. For example, device X may request an interrupt while an interrupt caused by device Y is being serviced, or all devices may request interrupts at exactly the same time.
The means by which the above problems are resolved vary considerably from one machine to another. The approach taken in any machine is an important consideration in determining its suitability for a given application.
Vectored Interrupts. A vectored interrupt is an interrupt for which the address to which control is transferred is determined by the cause of the exception. In order to reduce the overhead involved in the polling process, a device requesting an interrupt may identify itself directly to the CPU. Then, the CPU can immediately start executing the required interrupt-service routine. The term vectored interrupts refers to all interrupt-handling schemes based on this approach. In a computer which has multiple interrupt-request lines, vectored interrupts may be implemented by simply associating a unique starting address with each line. Alternatively, a device requesting an interrupt may identify itself by sending a special code to the CPU over the I/O bus. This is a more powerful technique, since it enables identification of individual devices that may share a single interrupt-request line. The code supplied by the device may then be chosen to represent the starting address of the interrupt-service routine for that device. In some cases, especially in smaller machines, only a few bits of the address are supplied, with the remainder of the address being fixed. This minimizes the number of bits that need to be transmitted by the I/O device, thus simplifying the design of its interface. Note, however, that it limits the number of devices that can be automatically identified by the CPU. For example, if 4 bits are supplied by the device, only 16 distinct codes, representing 16 different devices, can be recognized by the CPU. It is possible to assign each code to a group of devices. When a given code is received, the CPU can identify the device causing the interrupt by polling the members of the group represented by that code.
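The partial-code idea — the device supplies only a few bits and the CPU fixes the rest of the address — can be sketched as follows. The base address and entry size are assumptions chosen for illustration.

```python
VECTOR_BASE = 0x0040          # fixed high-order part of the address (assumed)
ENTRY_SIZE = 4                # bytes per vector entry (assumed)

def vector_address(device_code):
    # With 4 bits from the device, at most 16 distinct devices are possible.
    assert 0 <= device_code < 16
    return VECTOR_BASE + device_code * ENTRY_SIZE
```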
The above arrangement implies that the interrupt-service routine for a given device must always start at the same location. The programmer may gain some flexibility by storing in this location an instruction which causes a jump or a branch to the appropriate routine. In some machines, this is done automatically by the interrupt-handling mechanism. The CPU uses the code received from the interrupting device as an indirect specification of the starting address of the interrupt-service routine. That is, this code is interpreted as an address of a memory location which contains the required starting address. The contents of this location, which comprise a new value for the PC, are referred to as the interrupt vector. In many machines, the interrupt vector also includes a new value for the processor status register. For example, in the case of the PDP-11 computer, the CPU receives a 16-bit address from the device requesting an interrupt at the time the request is accepted. This is the address of a two-word interrupt vector associated with the device. After saving the current contents of the PC and the PS on the processor stack, the CPU loads the first word of this vector into the PC and the second word into the PS. The ability to change the contents of the PS at the time the interrupt-service routine is entered provides the programmer with considerable flexibility in changing the priority of the CPU or disabling further interrupts, as will be explained later. In general, this is a useful facility, found in many computers.
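The two-word vector lookup described for the PDP-11 can be sketched as below. The memory contents and addresses are invented for illustration; the +2 offset reflects byte addressing of 16-bit words.

```python
def take_vectored_interrupt(memory, vector_addr, pc, ps, stack):
    stack.append(ps)
    stack.append(pc)                      # old context saved on the processor stack
    new_pc = memory[vector_addr]          # first word of the vector -> PC
    new_ps = memory[vector_addr + 2]      # second word of the vector -> PS
    return new_pc, new_ps
```

Because the second word replaces the PS, the service routine starts with whatever CPU priority and interrupt-enable state the programmer stored in the vector, which is the flexibility the text points out.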
Some modifications to the hardware are required to support the vectored-interrupt feature. The key modification follows from the realization that the CPU may not respond immediately when it receives an interrupt request. The minimum delay in the CPU response is dictated by the requirement to complete execution of the current instruction. Further delays may occur because of an earlier execution of an Interrupt-disable instruction or because the interrupt in question has lower priority than that of the program currently being executed. Since the CPU may require the use of the I/O bus during this delay, the interrupting device should not be allowed to put data on the I/O bus until the CPU is ready to receive it. The necessary coordination can be achieved through the use of another control signal that may be termed interrupt acknowledge (INTA). As soon as the CPU is ready to service the interrupt, it sets the INTA line to on. This, in turn, causes the device interface to place the interrupt-vector address on the data lines, and to turn off the INTR signal. The CPU uses the address supplied by the device interface to determine the new values of the PC and the PS and, hence, to start executing the appropriate interrupt-service routine.
Interrupt Nesting. The simple arrangement described above, in which interrupts remain disabled during the execution of an interrupt-service routine, is often used when several devices are involved; in this case execution of a given interrupt-service routine, once started, will always continue to completion before a second interrupt request is accepted by the CPU. Interrupt-service routines are typically short, and a possible delay in responding to the second request is acceptable for most simple devices.
For some devices, a long delay in responding to an interrupt request may lead to erroneous operation. Consider, for example, a computer which keeps track of the time of day. This can be implemented by using an I/O device, usually called a real-time clock, which sends interrupt requests to the CPU at regular intervals. For each of these requests, the CPU executes a short interrupt-service routine which increments a set of counters in the memory to obtain time in seconds, minutes, etc. Proper operation of this subsystem requires that the delay in responding to an interrupt request from the real-time clock be small in comparison with the interval between two successive requests. In order to ensure that this requirement is satisfied in the presence of other interrupting devices, it may be necessary that the CPU accept an interrupt request from the real-time clock during the execution of an interrupt-service routine for another device. The example of the real-time clock suggests that I/O devices should be organized in a hierarchical priority structure. An interrupt request from a high-priority device should be accepted even while the CPU is servicing another request from a lower-priority device.
A multiple-level priority organization means that during execution of an interrupt-service routine, interrupt requests will be accepted from some devices but not from others, depending upon the device priority. In order to facilitate implementation of this scheme, it is useful to assign a priority level to the CPU which can be changed under program control. The priority level of the CPU is in fact the priority of the program that is currently being executed. The CPU will accept interrupts only from devices having priorities higher than its own. At the time the execution of an interrupt-service routine for some device is started, the priority of the CPU should be set to correspond to that of the device. This, in effect, disables interrupts from devices at the same level of priority or lower. However, interrupt requests from higher-priority devices will continue to be accepted. The CPU priority is usually incorporated as a part of the processor status word, thus making it program controlled.
From the hardware point of view, a multiple-priority scheme can be implemented easily by using separate interrupt-request and interrupt-acknowledge lines for each device. Such an arrangement is shown in Fig. 3.9. Each of the interrupt-request lines is assigned a different priority level. A priority circuit in the CPU arbitrates interrupt requests received over these lines.
A request is accepted only if it has a higher priority level than that currently assigned to the CPU.
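The arbitration rule — among pending requests, take the highest-priority one, and only if it is strictly higher than the current CPU priority — can be sketched as follows (the device names and levels are made up for illustration):

```python
def arbitrate(pending, cpu_priority):
    """pending: dict mapping device name -> priority level (higher wins)."""
    eligible = {d: p for d, p in pending.items() if p > cpu_priority}
    if not eligible:
        return None                         # no request is accepted
    return max(eligible, key=eligible.get)  # highest-priority request wins
```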
Simultaneous Requests.Let us now consider the problem of simultaneous arrivals of interrupt requests from two or more devices. The CPU should have some means of arbitration by which only one request is serviced and the others are either delayed or ignored. In the presence of a priority scheme such as that of Fig. 3.9 the solution to this problem is straightforward. The CPU simply accepts the request having the highest priority. However when several devices share the use of one interrupt-request line, some other mechanism has to be implemented to assign relative priority to these devices. In the polling scheme, priority is automatically implemented by the order in which devices are polled. Therefore, no special treatment is required to accommodate situations where simultaneous interrupt requests may occur.
In the case of vectored interrupts, the priority of any device is usually determined by the way in which it is connected to the CPU. The most common method is the daisy-chain arrangement shown in Fig. 3.10,a. The interrupt-request line (INTR) is common to all devices. However, the interrupt-acknowledge line (INTA) is connected in a daisy-chain fashion as shown. When one or more devices issue an interrupt request, the INTR line is activated. The CPU responds, after some delay, by setting the INTA line to 1. This signal is received by device 1. Device 1 passes the signal on to the next device only if it does not require any service. If device 1 has a pending request for interrupt, it blocks the acknowledgment signal INTA and proceeds to put its interrupt vector on the data lines. Therefore, the daisy-chain arrangement results in the device that is electrically closest to the CPU having the highest priority. The second device along the chain has second highest priority, and so on.
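The daisy-chain propagation of INTA can be sketched in a few lines: the acknowledge enters at the device closest to the CPU, and each device passes it on only if it has no pending request, so the closest requester wins.

```python
def daisy_chain_grant(requests):
    """requests[i] is True if device i wants service; i = 0 is closest to the CPU."""
    for i, wants in enumerate(requests):
        if wants:
            return i          # this device blocks INTA and supplies its vector
    return None               # acknowledge propagated past all devices: no requester
```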
The scheme of Fig. 3.10,a has the advantage that it requires considerably fewer wires than the individual connections of Fig. 3.9. The main advantage of the latter scheme is that it makes it possible for the CPU to accept interrupt requests from some devices, but not from others, depending upon their priorities. The two schemes may be combined to produce the more general structure of Fig. 3.10,b. This organization is used in many computer systems. We should note that the general organization of Fig. 3.10,b makes it possible for a device to be connected to several priority levels. Thus, at any given time, it can request an interrupt at the priority level consistent with the urgency of the function being performed. This approach offers additional flexibility. However, it requires complex control circuitry in the device interface.
Selective Interrupt Masking. In most machines, a selective interrupt enable-disable facility that is independent of the device priority is provided. This is accomplished through the use of an "interrupt-enable" flip-flop associated with each device or group of devices. When set to 1 by the CPU, this flip-flop allows an interrupt request issued by the corresponding device to reach the CPU. Otherwise, the interrupt is masked and does not reach the CPU until the flip-flop is set to 1. Physically, the interrupt-enable flip-flop may be incorporated in the device interface, usually as one of the bits in the status register. In machines using the organization of Fig. 3.10,b, an interrupt-enable flip-flop may also be provided for each line. The collection of these flip-flops forms an interrupt mask. It precedes the priority arbitration circuit, as shown in Fig. 3.11. In this case, the interrupt mask constitutes an internal CPU register. Its contents can be easily changed at any time under program control.
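The mask-then-arbitrate ordering of Fig. 3.11 can be sketched with bit masks, one bit per request line (the convention that bit i corresponds to priority level i is an assumption for this sketch):

```python
def masked_arbitrate(request_bits, mask_bits):
    survivors = request_bits & mask_bits   # masked requests never reach the arbiter
    if survivors == 0:
        return None                        # nothing eligible for service
    return survivors.bit_length() - 1      # index of highest set bit = highest level
```

Because the mask is applied before arbitration, a masked high-priority line cannot suppress a lower-priority request, which is the point of placing the mask register ahead of the priority circuit.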