Keio University
Academic Year 2009, Fall Semester
Computer Architecture
Lecture 7, December 1: Processors: Basics of Pipelining
Outline of This Lecture
- Follow-up on Assembly Programming
- Follow-up on Arithmetic
- Stages of Instruction Execution
- Pipelining
- Final Thoughts
- Homework
Follow-up on Assembly Programming
How did you fare on the assembly programming?
Follow-up on Arithmetic
Two weeks ago I meant to point out the following:
Stages of Instruction Execution
This model of how an instruction is executed is tilted slightly toward
the MIPS architecture, which grew out of Hennessy's work at Stanford
(Patterson led the closely related RISC project at Berkeley). However,
the actions in any CPU would be similar.
- Instruction Fetch cycle (IF)
Fetch the current instruction from memory, using the program
counter (PC) as the address, then add 4 to the PC and store the result
(actually, in MIPS, store the tentative new PC into an internal
register called NPC, Next PC).
- Instruction Decode/register fetch cycle (ID)
Determine which instruction we are holding, fetch the register
values (two, always, in this instruction set), compare the two
registers and set the EQUAL flag if equal.
- Execution/effective address cycle (EX)
Depending on the instruction type:
- Memory reference: The ALU adds the base register and the
offset to form the effective address.
- Register-Register ALU instruction: perform the operation
(e.g., addition, multiplication, logic operation) on the
register values fetched by the ID stage.
- Register-Immediate ALU instruction: perform the
operation on the first register value read and the immediate value
in the instruction.
- Memory access (MEM)
If the instruction is a LOAD or a STORE, do the
appropriate thing, otherwise do nothing. (In MIPS, update the PC
using either NPC or the output of the ALU operation.)
- Write-Back cycle (WB)
If the instruction was LOAD, write the value fetched from
memory into the matching register; if it was an ALU operation,
write the result to the register.
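The five stages above can be sketched as a tiny, unpipelined simulator. This is only an illustration under stated assumptions, not the real MIPS datapath: the `Instr` and `CPU` classes, the field names, and the five-opcode subset (`add`, `addi`, `lw`, `sw`, `beq`) are hypothetical simplifications, and each instruction passes through all five stages before the next one starts.

```python
from dataclasses import dataclass, field


@dataclass
class Instr:
    op: str       # one of "add", "addi", "lw", "sw", "beq" (hypothetical subset)
    rs: int = 0   # first source register (base register for lw/sw)
    rt: int = 0   # second source register (value to store for sw)
    rd: int = 0   # destination register for add, addi and lw
    imm: int = 0  # immediate, address offset, or branch offset in bytes


@dataclass
class CPU:
    imem: dict                                            # byte address -> Instr
    regs: list = field(default_factory=lambda: [0] * 32)  # register file
    dmem: dict = field(default_factory=dict)              # address -> value
    pc: int = 0


def step(cpu):
    """Run one instruction through IF, ID, EX, MEM and WB, in that order."""
    # IF: fetch using the PC and compute the tentative next PC (NPC).
    instr = cpu.imem[cpu.pc]
    npc = cpu.pc + 4

    # ID: read both source registers and compare them.
    a, b = cpu.regs[instr.rs], cpu.regs[instr.rt]
    equal = (a == b)

    # EX: ALU operation, effective address, or branch target.
    if instr.op == "add":
        alu = a + b
    elif instr.op in ("addi", "lw", "sw"):
        alu = a + instr.imm
    elif instr.op == "beq":
        alu = npc + instr.imm
    else:
        raise ValueError(f"unknown opcode: {instr.op}")

    # MEM: access data memory, then update the PC from NPC or the ALU output.
    loaded = None
    if instr.op == "lw":
        loaded = cpu.dmem.get(alu, 0)
    elif instr.op == "sw":
        cpu.dmem[alu] = b
    cpu.pc = alu if (instr.op == "beq" and equal) else npc

    # WB: write the loaded value or the ALU result to the register file.
    if instr.op == "lw":
        result = loaded
    elif instr.op in ("add", "addi"):
        result = alu
    else:
        result = None
    if result is not None and instr.rd != 0:  # register 0 stays zero
        cpu.regs[instr.rd] = result


def run(cpu, max_steps=1000):
    """Step until the PC leaves the program; return the instruction count."""
    steps = 0
    while cpu.pc in cpu.imem and steps < max_steps:
        step(cpu)
        steps += 1
    return steps


program = {
    0:  Instr("addi", rs=0, rd=1, imm=5),   # r1 = 5
    4:  Instr("addi", rs=0, rd=2, imm=7),   # r2 = 7
    8:  Instr("add", rs=1, rt=2, rd=3),     # r3 = r1 + r2
    12: Instr("sw", rs=0, rt=3, imm=100),   # mem[100] = r3
    16: Instr("lw", rs=0, rd=4, imm=100),   # r4 = mem[100]
}
cpu = CPU(imem=program)
executed = run(cpu)
print(executed, cpu.regs[3], cpu.dmem[100], cpu.regs[4], cpu.pc)
```

Keeping each stage as a separate commented step in `step` mirrors the list above; a pipelined version would instead keep the intermediate values (`npc`, `a`, `b`, `alu`, `loaded`) in latches between stages.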
The MIPS Pipeline
Pipeline Hazards
Sometimes conflicts occur between the different stages of the
pipeline. Such a condition is called a pipeline hazard.
There are three types of hazards:
- Structural hazard: when there is only a single resource,
such as a single port to access main memory, and two instructions
try to use it at the same time, a structural hazard is hit and one
must wait on the other.
- Data hazard: Sometimes one instruction will attempt to use
the result of an earlier ALU operation (addition, etc.) before that
operation is complete. It must wait, sometimes more than one clock cycle.
- Control hazard: In the simplest pipeline implementations,
every time a branch occurs, one or more in-progress instructions
must be aborted without changing the state of the system, and new
instructions must be fetched.
Hazards result in pipeline stalls or pipeline
bubbles.
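To see how a data hazard turns into stall cycles, here is a minimal sketch that schedules a short instruction sequence on the five-stage pipeline. It rests on several assumptions that are not stated in these notes: there is no forwarding, a register written in WB can be read by an ID stage in the same clock cycle, and structural and control hazards are ignored. The function name `schedule` and the `(dest, sources)` tuple format are hypothetical.

```python
def schedule(program):
    """Return (id_cycles, stall_cycles, total_cycles) for a five-stage pipeline.

    program is a list of (dest, sources) tuples, where dest is a register
    name or None and sources is a list of register names.
    Assumes no forwarding: a value is usable only from its writer's WB cycle.
    """
    if not program:
        return [], 0, 0

    ready = {}     # register -> cycle in which its newest value is written back
    prev_id = 1    # so the first instruction decodes in cycle 2 (IF in cycle 1)
    id_cycles = []

    for dest, sources in program:
        # In-order issue: decode no earlier than one cycle after the previous
        # instruction, and no earlier than the WB cycle of every source.
        id_cycle = max([prev_id + 1] + [ready.get(r, 0) for r in sources])
        id_cycles.append(id_cycle)
        if dest is not None:
            ready[dest] = id_cycle + 3   # ID -> EX -> MEM -> WB
        prev_id = id_cycle

    total_cycles = id_cycles[-1] + 3             # last instruction's WB cycle
    ideal_cycles = len(program) + 4              # fill time plus one per instruction
    return id_cycles, total_cycles - ideal_cycles, total_cycles


example = [
    ("r1", ["r2", "r3"]),   # add r1, r2, r3
    ("r4", ["r1", "r5"]),   # sub r4, r1, r5   (needs r1 from the add)
    ("r6", ["r2", "r3"]),   # and r6, r2, r3   (independent, but waits in order)
]
id_cycles, stalls, total = schedule(example)
print(id_cycles, stalls, total)   # prints: [2, 5, 6] 2 9
```

The dependent `sub` cannot finish decoding until cycle 5, when the `add` writes back, so two bubbles enter the pipeline and the three instructions take 9 cycles instead of the ideal 7.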
Final Thoughts
The five-stage pipeline we have discussed is far from the only way to
divide the work in a pipeline. The Intel Prescott microprocessor
(Feb. 2004) had a 31-stage pipeline! Filling that pipeline
takes many clock cycles, so every branch is a problem.
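A rough way to see why depth makes branches costly is to fold the refill penalty into the average cycles per instruction. The sketch below is a back-of-the-envelope model, not a description of any real processor: the function name `effective_cpi`, the 20% branch fraction, and the 2-cycle and 20-cycle penalties are all illustrative assumptions, and by default every branch is assumed to pay the full penalty, as in the simplest pipeline implementations.

```python
def effective_cpi(branch_fraction, penalty_cycles, penalized_rate=1.0, base_cpi=1.0):
    """Average CPI when a fraction of instructions are branches.

    branch_fraction: fraction of executed instructions that are branches.
    penalty_cycles:  cycles lost refilling the pipeline after a branch.
    penalized_rate:  fraction of branches that actually pay the penalty
                     (1.0 models a pipeline with no branch prediction).
    """
    return base_cpi + branch_fraction * penalized_rate * penalty_cycles


# Illustrative numbers only: same program, shallow versus deep pipeline.
shallow = effective_cpi(branch_fraction=0.2, penalty_cycles=2)
deep = effective_cpi(branch_fraction=0.2, penalty_cycles=20)
print(f"shallow: {shallow:.2f} CPI, deep: {deep:.2f} CPI")
```

With these assumed numbers the deep pipeline needs several times as many cycles per instruction, which is why deep designs depend on keeping `penalized_rate` low.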
The most famous pipeline of all:
Homework
This week's homework (submit via SFS):
- Take your assembly-language matrix multiplication program and
count the following:
- Floating-point additions
- Floating-point multiplications
- Integer additions/subtractions
- Branches
- The number of instructions between branch instructions
- Calculate the ideal throughput, assuming one instruction per clock cycle.
- Find and describe a real-world pipeline. Include:
- The number of stages
- Functionality of each stage
- Interlocking between stages
- Any hazards
- How balance in execution time is maintained
- Pipeline hazards equate to arrows flowing right to left on the
figure above. Identify the arrows on the diagram above by type and
indicate the maximum delay that the hazard can cause.
- The three pipeline programs we "executed" during class today are
linked to below. Calculate the following for each:
- The number of instructions that must be executed. Don't
forget to account for the loop in program 3. (n.b.: the #-28 in
the branch is decimal!)
- The number of clock cycles the entire program takes,
accounting for data and control hazards.
- The average clock cycles per instruction (CPI) for the
program.
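For the cycle and CPI questions, one common convention is to count the cycles needed to fill the pipeline, plus one cycle per instruction, plus any stall cycles. The helper below is a sketch of that convention only; the name `pipeline_cpi` and the sample numbers are hypothetical, and the instruction and stall counts themselves still have to come from tracing each program by hand.

```python
def pipeline_cpi(instructions, stall_cycles, stages=5):
    """Return (total_cycles, cpi) for a program on a simple in-order pipeline.

    instructions: number of instructions actually executed (loops unrolled).
    stall_cycles: total bubbles caused by data and control hazards.
    stages:       pipeline depth; the first instruction needs `stages` cycles,
                  and each later one finishes one cycle after its predecessor.
    """
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    total_cycles = (stages - 1) + instructions + stall_cycles
    return total_cycles, total_cycles / instructions


# Made-up example: 100 executed instructions with 20 stall cycles.
cycles, cpi = pipeline_cpi(instructions=100, stall_cycles=20)
print(cycles, cpi)   # prints: 124 1.24
```

As the instruction count grows, the fill time matters less and the CPI approaches 1 plus the average number of stall cycles per instruction.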
Next Lecture
Next lectures, on December 8 and December 12:
Lecture 8, December 8: Memory: Caching and Memory Hierarchy
Lecture 9, December 12 (Saturday!): Memory: Virtual Memory
Readings for next time:
- Follow-up from this lecture:
- H-P: Appendix A.1 and A.2
- P-H:
- For next time:
- H-P: Appendix C.1, C.2 and C.3
- P-H:
Additional Information