Fetching a Word from Memory and Storing a Word into Memory. Information is stored in locations identified by their RAM addresses. To fetch a word of information from memory, the CPU has to specify the address of the memory location where this information is stored and request a Read operation. This applies whether the information to be fetched represents a new instruction in a program or a word of data (operand) specified by an instruction. Thus, to perform a memory fetch, the CPU transfers the address of the required information word to the memory address register (MAR). As shown in Fig. 1.3, the MAR is connected to the address lines of the memory bus. Hence, the address of the required word is transferred to the main memory. Meanwhile, the CPU uses the control lines of the memory bus to indicate that a Read operation is required. Normally, after issuing this request the CPU waits until it receives an answer from the memory, informing it that the requested function has been completed. This is accomplished through the use of another control signal on the memory bus, which will be referred to as Memory-Function-Completed (MFC).
The memory sets this signal to a 1 to indicate that the contents of the specified location in the memory have been read and are available on the data lines of the memory bus. We will assume that as soon as the MFC signal is set to 1, the information on the data lines is loaded into MDR and is thus available for use inside the CPU. This completes the memory fetch operation. As an example, assume that the address of the memory location to be accessed is in register R1 and that the memory data is to be loaded into register R2. This is achieved by the following sequence of operations:
1. MAR ← [R1]
2. Read
3. Wait for the MFC signal
4. R2 ← [MDR]
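The four steps can be traced with a small simulation. This is only a sketch: the `Memory` class, its `latency` parameter (ticks until MFC is raised), and the dictionary of registers are hypothetical names introduced here for illustration, not part of the machine described in the text.

```python
class Memory:
    """Toy main memory with an MFC handshake (illustrative model only)."""

    def __init__(self, contents, latency=3):
        self.cells = dict(contents)   # address -> stored word
        self.latency = latency        # assumed ticks needed per access
        self.mfc = 0                  # Memory-Function-Completed signal
        self.data_lines = None
        self._pending = None
        self._ticks_left = 0

    def read(self, address):
        """Accept a Read request; MFC stays 0 until the word is ready."""
        self.mfc = 0
        self._pending = address
        self._ticks_left = self.latency

    def tick(self):
        """Advance the memory by one time unit."""
        if self._pending is None:
            return
        self._ticks_left -= 1
        if self._ticks_left == 0:
            self.data_lines = self.cells[self._pending]
            self._pending = None
            self.mfc = 1              # word is now on the data lines


def fetch_word(regs, memory):
    """Load the word addressed by R1 into R2; return ticks spent waiting."""
    regs["MAR"] = regs["R1"]          # step 1: address goes to MAR
    memory.read(regs["MAR"])          # step 2: Read
    waited = 0
    while memory.mfc == 0:            # step 3: wait for MFC
        memory.tick()
        waited += 1
    regs["MDR"] = memory.data_lines   # data lines are loaded into MDR
    regs["R2"] = regs["MDR"]          # step 4: MDR contents go to R2
    return waited


regs = {"R1": 0x40, "R2": 0, "MAR": 0, "MDR": 0}
memory = Memory({0x40: 1234}, latency=3)
ticks_waited = fetch_word(regs, memory)
```

With the assumed latency of 3 ticks, the loop in step 3 runs three times before MFC becomes 1, after which R2 holds the word stored at the address in R1.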
The duration of step 3 depends upon the speed of the memory used. Usually, the time required to read a word from the memory is longer than the time required to perform any single operation within the CPU. Therefore, the overall execution time of an instruction can be decreased if the sequence of operations is organized such that a useful function is performed within the CPU while waiting for the memory to respond. Obviously, only functions that do not require the use of MDR or MAR can be carried out during this time. Such a situation arises during the fetch phase. As we will see shortly, the PC can be incremented while waiting for the Read operation to be completed.
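The saving from overlapping can be illustrated with a tick count. The sketch below assumes, purely for illustration, that incrementing the PC takes one tick, that the PC advances by 1 per instruction, and that the memory needs `latency` ticks to respond; `fetch_instruction` and `program` are hypothetical names.

```python
def fetch_instruction(pc, program, latency, overlap):
    """Fetch one instruction; return (instruction, new_pc, ticks_used)."""
    mar = pc                      # address goes to MAR, then Read is issued
    ticks = 0
    new_pc = pc
    incremented = False
    for _ in range(latency):      # each pass is one tick spent waiting for MFC
        if overlap and not incremented:
            new_pc = pc + 1       # uses neither MAR nor MDR, so it can run now
            incremented = True
        ticks += 1
    mdr = program[mar]            # MFC = 1: the word is loaded into MDR
    if not incremented:
        new_pc = pc + 1           # increment only after the memory responds
        ticks += 1
    return mdr, new_pc, ticks


program = {0: "Add", 1: "Move"}
sequential = fetch_instruction(0, program, latency=4, overlap=False)
overlapped = fetch_instruction(0, program, latency=4, overlap=True)
```

Under these assumptions the sequential fetch takes 5 ticks and the overlapped fetch takes 4, because the PC update is hidden inside the memory wait.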
The transfer mechanism where one device initiates the transfer (Read request) and waits until the other device responds (MFC signal) is referred to as an asynchronous transfer. It can be easily seen that this mechanism enables transfer of data between two independent devices that have different speeds of operation. An alternative scheme found in some computers uses synchronous transfers. In this case, one of the control lines of the bus carries pulses from a continuously running clock of a fixed frequency. These pulses provide common timing signals to the CPU and the main memory. A memory operation is completed during every clock period. Furthermore, the instants at which the address is placed on the address lines and the data is loaded into MDR are fixed relative to the clock pulses. The synchronous bus scheme leads to a simpler implementation. However, it cannot accommodate devices of widely varying speed, except by reducing the speed of all devices to that of the slowest one. In the remainder of the discussion of the operation of the CPU, we will assume that an asynchronous memory bus is used.
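The speed trade-off between the two schemes can be made concrete with a rough timing model. This is a simplified sketch: the device names and latencies are invented for illustration, and the model ignores the handshake overhead that an asynchronous bus adds in practice.

```python
def async_total(latencies, accesses):
    """Asynchronous bus: each access lasts as long as its own device needs."""
    return sum(latencies[device] for device in accesses)


def sync_total(latencies, accesses):
    """Synchronous bus: the clock period must suit the slowest device,
    and every access occupies one full clock period."""
    period = max(latencies.values())
    return period * len(accesses)


# Hypothetical devices and access times (arbitrary time units).
latencies = {"fast_ram": 2, "slow_rom": 5}
accesses = ["fast_ram", "fast_ram", "fast_ram", "slow_rom"]

async_time = async_total(latencies, accesses)   # 2 + 2 + 2 + 5
sync_time = sync_total(latencies, accesses)     # 4 accesses at period 5
```

In this model the asynchronous bus needs 11 time units and the synchronous bus 20, since every access on the synchronous bus is slowed to the pace of the slowest device.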
The procedure for writing a word into a given memory location is similar to that for reading from memory. The only exception is that the data word to be written is loaded into the MDR before the Write command is issued. If we assume that the data word to be stored in the memory is in R2 and that the memory address is in R1, the Write operation requires the following sequence:
1. MAR ← [R1]
2. MDR ← [R2]
3. Write
4. Wait for MFC
It is interesting to note that steps 1 and 2 are independent. Therefore they can be carried out in any order. In fact, steps 1 and 2 can be carried out simultaneously, if this is allowed by the architecture, that is, if the two transfers do not use the same data path. Of course, this would not be possible in the single-bus organization of Fig. 1.3. Note also that, as in the case of the Read operation, the wait period in step 4 may be overlapped with other operations, provided that such operations do not involve registers MDR or MAR.
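The Write sequence can be sketched in the same spirit. The `Memory` class, its `latency` parameter, and the register dictionary below are hypothetical and exist only to make the four steps executable.

```python
class Memory:
    """Toy main memory with an MFC handshake (illustrative model only)."""

    def __init__(self, latency=2):
        self.cells = {}               # address -> stored word
        self.latency = latency        # assumed ticks needed per access
        self.mfc = 0
        self._pending = None
        self._ticks_left = 0

    def write(self, address, word):
        """Accept a Write request; MFC stays 0 until the word is stored."""
        self.mfc = 0
        self._pending = (address, word)
        self._ticks_left = self.latency

    def tick(self):
        """Advance the memory by one time unit."""
        if self._pending is None:
            return
        self._ticks_left -= 1
        if self._ticks_left == 0:
            address, word = self._pending
            self.cells[address] = word
            self._pending = None
            self.mfc = 1


def store_word(regs, memory):
    """Store the word in R2 at the address held in R1."""
    regs["MAR"] = regs["R1"]                  # step 1 (independent of step 2)
    regs["MDR"] = regs["R2"]                  # step 2 (could precede step 1)
    memory.write(regs["MAR"], regs["MDR"])    # step 3: Write
    while memory.mfc == 0:                    # step 4: wait for MFC
        memory.tick()


regs = {"R1": 0x10, "R2": 99, "MAR": 0, "MDR": 0}
memory = Memory()
store_word(regs, memory)
```

Swapping the first two assignments in `store_word` leaves the result unchanged, which mirrors the observation that steps 1 and 2 are independent.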
Register Transfers and Performing an Arithmetic/Logic Operation. To enable data transfer between various blocks connected to the common bus in Fig. 1.3, input and output gating must be provided. This is represented symbolically in Fig. 1.4. The input and output gates for register Ri are controlled by the signals Riin and Riout, respectively. Thus, when Riin is set to 1, the data available on the common bus is loaded into Ri. Similarly, when Riout is set to 1, the contents of register Ri are placed on the bus. While Riout is equal to 0, the bus can be used for transferring data from other registers. For example, to transfer the contents of register R1 to register R4, the following actions are needed:
• Enable the output gate of register R1 by setting R1out to 1. This places the contents of R1 on the CPU bus.
• Enable the input gate of register R4 by setting R4in to 1. This loads data from the CPU bus into register R4.
This data transfer can be represented symbolically as R1out, R4in.
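The gating rule can be modelled in a few lines. In this sketch, `control_step` is a hypothetical helper that takes the set of control signals asserted during one step; signal names follow the Riout/Riin pattern used above.

```python
def control_step(registers, signals):
    """Apply one control step on a single shared bus.

    Signals ending in 'out' gate a register onto the bus; signals ending
    in 'in' load a register from the bus. Returns the value on the bus.
    """
    drivers = [s[:-3] for s in signals if s.endswith("out")]
    loaders = [s[:-2] for s in signals if s.endswith("in")]
    if len(drivers) != 1:
        # Only one register output can meaningfully drive the bus at a time.
        raise ValueError("exactly one register output must be enabled")
    bus = registers[drivers[0]]
    for name in loaders:
        registers[name] = bus
    return bus


registers = {"R1": 7, "R4": 0}
bus_value = control_step(registers, ["R1out", "R4in"])
```

After the step, R4 holds a copy of the contents of R1, and R1 itself is unchanged.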
A basic operation in all digital computers is the addition or subtraction of two numbers. Such operations are provided at the machine instruction level. They are implemented, along with basic logic functions such as AND, OR, NOT, and EXCLUSIVE-OR, in the arithmetic and logic unit (ALU) subsystem of the CPU. These functions are normally performed by combinational logic circuitry. The operands are presented to the ALU as the outputs of two CPU registers, possibly via a bus. The result is usually routed to another CPU register after an amount of time that permits the combinational logic to complete the computations.
When performing an arithmetic or logic operation, it should be remembered that the ALU itself is a combinational circuit that has no internal storage. Therefore, to perform an addition, for example, the two numbers to be added should be made available at the two inputs of the ALU simultaneously. Register Y, in Fig. 1.3, is provided for this purpose. It is used to hold one of the two numbers while the other number is gated to the bus. The result is stored temporarily in register Z. Therefore, the sequence of operations to add the contents of register R1 to register R2 and store the result in register R3 should be as follows:
1. R1out, Yin
2. R2out, Add, Zin
3. Zout, R3in
In step 2 of this sequence the contents of register R2 are gated to the bus, hence to input B of the ALU which is connected directly to the bus. The contents of register Y are always available at input A. The function performed by the ALU depends upon the signals applied to the ALU control lines. In this case, the Add line is set to 1, causing the output of the ALU to be the sum of the two numbers at A and B. This sum is loaded into register Z, since its input gate is enabled (Zin). In step 3, the contents of register Z are transferred to the destination register R3. Obviously, this last transfer cannot be carried out during step 2, since only one register output can be meaningfully connected to the bus at any given time.
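The three-step addition can be traced with a small model of the single-bus datapath. This is a sketch under simplifying assumptions: `run_step` is a hypothetical helper, register values are unbounded Python integers (so word width and overflow are ignored), and Add is the only ALU function modelled.

```python
def run_step(regs, signals):
    """Apply one control step to a toy single-bus datapath."""
    signals = set(signals)
    drivers = [s[:-3] for s in signals if s.endswith("out")]
    if len(drivers) != 1:
        raise ValueError("exactly one register output may drive the bus")
    bus = regs[drivers[0]]
    # Input A of the ALU is register Y; input B is the bus itself.
    alu_out = regs["Y"] + bus if "Add" in signals else None
    for s in signals:
        if not s.endswith("in"):
            continue
        name = s[:-2]
        if name == "Z":
            if alu_out is None:
                raise ValueError("Zin needs an ALU function such as Add")
            regs["Z"] = alu_out       # Z is loaded from the ALU output
        else:
            regs[name] = bus          # other registers load from the bus


regs = {"R1": 5, "R2": 7, "R3": 0, "Y": 0, "Z": 0}
sequence = [
    ["R1out", "Yin"],         # step 1: first operand into Y
    ["R2out", "Add", "Zin"],  # step 2: second operand on bus, sum into Z
    ["Zout", "R3in"],         # step 3: result into R3
]
for step in sequence:
    run_step(regs, step)
```

Trying to merge steps 2 and 3 into one step would enable both R2out and Zout, which `run_step` rejects for the same reason given in the text: only one register output can drive the bus at a time.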
Multiply and divide operations are more complex than addition or subtraction. These operations are usually included in the basic instruction set. However, their execution times may be significantly longer than those of other instructions such as Add and Move. This is because they are implemented as a sequence of addition and subtraction steps through the ALU, controlled by a microprogram. Of course, as long as Add and Subtract are available as machine instructions, both multiply and divide operations can be supplied as software routines. These routines basically implement multiplication as a sequence of adds and shifts, and division as a sequence of subtracts and shifts, as will be explained. Compared with arithmetic operations, logic operations are simple from the combinational circuit viewpoint. They require only independent Boolean operations on individual bit positions of the operands, whereas arithmetic operations require carry or borrow signals to propagate laterally between bit positions.
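As a rough illustration of such routines, the sketch below multiplies by shift-and-add and divides by shift-and-subtract (the restoring method). It assumes unsigned operands that fit in `bits` bits; the function names and the default width of 8 are choices made here, and the formulation presented elsewhere in the text may differ in detail.

```python
def multiply(multiplicand, multiplier, bits=8):
    """Unsigned multiplication as a sequence of adds and shifts."""
    product = 0
    for _ in range(bits):
        if multiplier & 1:            # low multiplier bit selects an add
            product += multiplicand
        multiplicand <<= 1            # next partial product has double weight
        multiplier >>= 1              # move on to the next multiplier bit
    return product


def divide(dividend, divisor, bits=8):
    """Unsigned division as a sequence of subtracts and shifts.

    Returns (quotient, remainder).
    """
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    quotient = 0
    remainder = 0
    for i in range(bits - 1, -1, -1):
        # Shift the next dividend bit into the partial remainder.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        quotient <<= 1
        if remainder >= divisor:      # subtract only when it will not go negative
            remainder -= divisor
            quotient |= 1
    return quotient, remainder


product = multiply(13, 11)
quotient, remainder = divide(100, 7)
```

Each loop iteration corresponds to one add-or-subtract step plus a shift, which is why these operations take many more steps than a single Add.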