. (2 pts. per part) | |
Referring
to figure 5.17 (textbook p.358), for a single-cycle MIPS
implementation: Locate the unit marked "shift left 2". |
Give an example of an
instruction whose correct processing relies on that unit.
For an instruction such as the one you gave just above, what
prevents the unit from carrying out the shift anyway? |
a branch or jump
anything except a branch or jump
Nothing! [The shift takes place, but the new address that is ultimately generated never makes it to the PC.] |
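The point of the last answer can be sketched in a few lines of Python. This is only an illustrative model, not the textbook's circuit: the function names are made up for the example, and `take_branch` stands in for the mux select (Branch AND Zero in the figure). The shift and the add happen for every instruction; only the mux decides whether the result reaches the PC.

```python
def sign_extend16(imm16: int) -> int:
    """Interpret a 16-bit field as a signed two's-complement integer."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def next_pc(pc: int, imm16: int, take_branch: bool) -> int:
    """Model of the PC-update path (hypothetical helper for illustration)."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    # The "shift left 2" unit and the branch adder always produce a value,
    # whatever the current instruction is.
    branch_target = (pc_plus_4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF
    # Only the mux select decides whether that value is written to the PC.
    return branch_target if take_branch else pc_plus_4
```

For a non-branch instruction `take_branch` is false, so the computed target is simply discarded.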
. | |
In
the single-cycle implementation, an instruction is supposed to be
fully processed within just one clock cycle. Looking at the
schematic (fig. 5.17), we can see that signals have to propagate
through a number of components before the final result is
achieved. Let's focus on the Data Memory. As the
cycle begins, that unit has some signals already present on its inputs. |
(1 pt. per) List all the
inputs of the Data Memory and state briefly what their purpose is.
|
[There are four inputs to list, including the two control signals.]
Timing of the Data Memory's own operation: since this is single-cycle, it is not correct to say that the control unit delays setting the read or write control value until the correct inputs are present. All the control signals are set at the beginning of the cycle and never changed. The whole circuit must be timed so that the Data Memory does not perform any write operations until it is safe to do so. |
. | |
Figure
5.33 of the textbook (p. 383) shows the datapath and control
lines of a multi-cycle MIPS implementation. |
(1/2 pt. ea., 7 points total)
Writing directly on that sheet of the handout, show the required value
for each control line during the Fetch cycle. If any value is a
"don't care", indicate it as X. ("Each control line" means each
of the lines coming out of the oval marked Control).
(3 pts.) Comparing the single-cycle (fig. 5.17) with the multi-cycle (fig. 5.33), the latter has an Instruction Register on its datapath, not present in the single-cycle version. Explain this addition (i.e., why is it needed now when it wasn't before?)
|
[There are 13
signals. This was scored as .5 points per signal, rather than 1
pt. each as originally indicated, and the total was rounded up.
The biggest mistake was forgetting that the PC is incremented during
the fetch cycle, which involves about 5 of the control signals.
The second biggest mistake was setting RegWrite to X and not to
0. This is necessary because if RegWrite were X, it could
sometimes be at 1 and potentially write bogus data to the register
file, corrupting registers that we could use later. That is, it
changes architectural state, and this should never be a don't care
condition. It wouldn't break the particular fetch cycle of the
current instruction, but it could break subsequent instructions and
the correctness of the whole program.]
The reason? In the single-cycle version, there were two memories, one for instructions and one for data, and the PC was not incremented until the end of the cycle. In the multi-cycle case, there is only one memory, which is used for data as well as instruction fetch. Thus, the fetched instruction needs to be saved, since the memory is going to be reused.
It is not enough to say "the instruction is needed even after the fetch stage in multi-cycle". This is simply saying that the data in the instruction word is needed for the entire length of instruction execution, which is also true for the single-cycle case. That doesn't explain why there isn't also an IR for the single-cycle design. |
. (2 pts. ea.) | |
On
a handout is an unlabeled finite-state diagram for a multi-cycle
implementation of some machine. For each of the following, answer as best you can; if no answer is possible, explain what other information would be needed in order to give a good answer. |
What is the lower bound for
the CPI for this machine (i.e., the theoretical smallest value)? What is the upper bound for the CPI for this machine (i.e. the theoretical largest value)?
|
Lower bound: 3. (You can get this by following the shortest path on the
diagram, using the top path from the third state!) Many people
missed the path on the top and ended up with a CPI of 6 here, following
the next shortest route.
Upper bound: 7, following the longest path.
The average CPI cannot be determined without knowing the mix of
instructions: we need to know the % of instructions following each
possible path on the diagram. |
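As a rough sketch of the reasoning above: the bounds come from the shortest and longest paths through the state diagram, while the average needs the instruction mix. The path names and the mix below are hypothetical (the handout diagram is unlabeled); only the lengths 3, 6, and 7 come from the answer.

```python
def cpi_bounds(path_lengths):
    """Lower and upper CPI bounds: shortest and longest path, in cycles."""
    lengths = list(path_lengths.values())
    return min(lengths), max(lengths)


def average_cpi(path_lengths, mix):
    """Weighted average CPI; `mix` maps each path to its fraction of instructions."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("instruction mix fractions must sum to 1")
    return sum(path_lengths[name] * fraction for name, fraction in mix.items())


# Hypothetical labels for three of the paths on the diagram.
paths = {"short": 3, "medium": 6, "long": 7}
# Hypothetical instruction mix; the exam does not provide one.
mix = {"short": 0.5, "medium": 0.3, "long": 0.2}

low, high = cpi_bounds(paths)
avg = average_cpi(paths, mix)
```

Whatever the mix, the average always lies between the two bounds.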
. (2 pts. ea.) | |
On
the handout there is a very high-level view of a 5-stage MIPS
pipeline. |
Explain the vertical bar named MEM/WB. |
MEM/WB: It's a set
of pipeline registers storing all the necessary information
coming from the memory stage to be used to execute the WB part of
the instruction. This includes memory output, ALU output,
destination register, and all the control signals. You didn't
have to list all the contents, but you also couldn't just list one of
these and say that this is what the pipeline register is used for.
MEM exceptions: page fault, illegal address
WB exceptions: none
The value meant here was the destination register index, determined in ID (by selecting from rt or rd) and passed through to the WB stage, where it is fed to the register file to decide which register to write to. Note that for this to work, you need an instruction that actually writes to a register! That means add, lw, sll, etc. are all fine. For some reason, many people put down sw. This is wrong, because sw writes to memory and not to the register file! Thus, it never uses the destination register field and doesn't work as an example here. Other examples that don't work include branches and jumps, for the same reason.
It gets passed along in the pipeline registers in the sequence ID/EX -> EX/MEM -> MEM/WB |
. | |
Consider
the following sequence of instructions:
add $s0, $t0, $t2
sub $t2, $s0, $t3
Assume the MIPS five-stage pipeline. |
(2 pts.) Identify the
hazards, if any.
|
There is a data hazard: the sub instruction reads register $s0,
which the add instruction writes.
Forward the ALU value held in the EX/MEM pipeline register to
the ALU input in the EX stage. |
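The forwarding decision for this pair can be sketched as below. This is a minimal model, assuming the usual textbook-style condition for the EX/MEM case and a 2-bit mux select where `0b10` means "take the value from EX/MEM"; the function name and the encoding are assumptions for illustration. Register numbers follow the standard MIPS convention ($t0 = 8, $t3 = 11, $s0 = 16).

```python
FROM_REGISTER_FILE = 0b00  # assumed encoding: no forwarding
FROM_EX_MEM = 0b10         # assumed encoding: forward the prior ALU result


def forward_select(ex_mem_reg_write: bool, ex_mem_rd: int, id_ex_src: int) -> int:
    """Mux select for one ALU input of the instruction currently in EX."""
    # Forward only if the older instruction writes a register, that register
    # is not $zero, and it matches the source register being read.
    if ex_mem_reg_write and ex_mem_rd != 0 and ex_mem_rd == id_ex_src:
        return FROM_EX_MEM
    return FROM_REGISTER_FILE


S0, T3 = 16, 11
# add $s0, $t0, $t2 is in MEM (rd = $s0); sub $t2, $s0, $t3 is in EX.
select_rs = forward_select(True, S0, S0)  # sub's first operand, $s0
select_rt = forward_select(True, S0, T3)  # sub's second operand, $t3
```

Only the first ALU input of the sub needs the forwarded value; the second still comes from the register file.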
. (3 pts.) | |
Assuming the MIPS five-stage pipeline: give a sequence of two consecutive instructions that have a data hazard that cannot be resolved by forwarding alone. | |
lw $t0, 0($sp) |
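Why a load is the problem case can be sketched as a hazard check. The loaded value only exists at the end of MEM, but an immediately following instruction that reads the same register needs it at the start of its EX, one cycle too early for forwarding alone, so the pipeline must also stall. The function below is an illustrative model with made-up names; the follower instructions in the example are hypothetical and are not the exam's answer.

```python
def must_stall(id_ex_mem_read: bool, id_ex_rt: int, if_id_rs: int, if_id_rt: int) -> bool:
    """Load-use check: the instruction in EX is a load whose destination (rt)
    is a source register of the instruction currently being decoded."""
    return bool(id_ex_mem_read) and id_ex_rt in (if_id_rs, if_id_rt)


T0, T1, T2 = 8, 9, 10
# lw $t0, 0($sp) followed by a hypothetical reader of $t0: one bubble needed.
stall_dependent = must_stall(True, T0, T0, T1)
# lw $t0, 0($sp) followed by a hypothetical instruction reading $t1 and $t2.
stall_independent = must_stall(True, T0, T1, T2)
```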
. 6 pts. | |
You
work for a company that makes MIPS-compatible CPU chips. A
new type of on-chip memory technology has become available, which could
serve for the Data/Instruction memory of your CPU. It operates
20% faster than the memory currently used. |
Determine the potential effect on clock cycle time for each of the two designs listed below. If you cannot calculate the impact directly, explain what the effect would depend on and what you would need to know to calculate it exactly. |
single-cycle design: The cycle time is determined by the time taken for signals to propagate through the longest path in the circuit. This path would be for the load instruction, which accesses memory twice: once to fetch the load instruction from memory (this is done for all instructions) and a second time to actually load the value from memory to put into a register. Both of these accesses will take 20% less time, but the reduction in cycle time is less than 20% because the rest of the circuit still requires time for signals to propagate through. The faster memory may make the load instruction not be the longest path anymore, in which case the cycle time would depend on the new longest path. You would need to know the speeds of the other functional units to calculate the cycle time exactly.
pipelined design: In a pipeline all stages take the same time (one clock cycle), which is the time taken for the slowest operation performed in a stage. If this operation involves accessing memory (i.e. IF or MEM stages), then a 20% faster memory can result in a 20% shorter cycle time, unless there is now a slower operation, in which case the cycle time improvement would be less than 20% (to calculate the exact value would require knowing the time taken for the new slowest operation). |
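A small numeric sketch of both cases follows. All delays are hypothetical (the question gives none), "20% faster" is read as 20% less access time, as in the answer above, and the path model is deliberately simplified to memory, register file, and ALU delays only.

```python
def single_cycle_time(mem: int, alu: int, reg: int) -> int:
    """Cycle time = delay of the longest instruction path (simplified model)."""
    paths = {
        "lw": mem + reg + alu + mem + reg,  # fetch, read regs, ALU, data access, write back
        "sw": mem + reg + alu + mem,
        "r-type": mem + reg + alu + reg,
        "beq": mem + reg + alu,
    }
    return max(paths.values())


def pipelined_cycle_time(mem: int, alu: int, reg: int) -> int:
    """Cycle time = delay of the slowest stage (IF, ID, EX, MEM, WB)."""
    return max(mem, reg, alu, mem, reg)


# Hypothetical delays in picoseconds.
OLD_MEM, NEW_MEM, ALU, REG = 200, 160, 180, 100

single_old = single_cycle_time(OLD_MEM, ALU, REG)
single_new = single_cycle_time(NEW_MEM, ALU, REG)
pipe_old = pipelined_cycle_time(OLD_MEM, ALU, REG)
pipe_new = pipelined_cycle_time(NEW_MEM, ALU, REG)
```

With these made-up numbers the single-cycle time drops from 780 to 700 ps (about 10%), and the pipelined cycle drops from 200 to 180 ps (10%) because the ALU stage becomes the new bottleneck.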