Keio University
Fall Semester, Academic Year 2009

Computer Architecture

Fall Semester 2009, Tuesdays, 3rd period
Course code: 35010 / 2 credits
Category:
Location: SFC
Format: Lecture
Instructor: Rodney Van Meter
E-mail: rdv@sfc.keio.ac.jp

Lecture 2, October 6:
Fundamentals of Computer Design

Picture(s) of the Day

Picture of a Sony PS3

Picture of a 45nm Cell processor
Cell processor architecture
Picture of IBM's Cell BE-based Roadrunner supercomputer

Outline of This Lecture

System Diagram

Wikipedia motherboard block diagram

Instructions: the Basic Idea

Computers execute instructions. These are usually generated by a compiler, a piece of software that translates human-readable (usually ASCII) source code into computer-readable binary.

For example:

LOAD	R1, A
ADD	R1, R3
STORE	R1, A
This example shows three instructions, which are executed sequentially. The first instruction LOADs a value from memory location A into register R1 (we will come back shortly to how the address of that value is determined). The second instruction ADDs the contents of register R3 to register R1, leaving the sum in R1. The third instruction STOREs the result back into the original memory location.
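
The three-instruction sequence can be traced with a small simulator. This is a minimal sketch, not a model of any real instruction set: the `run` function, the dictionary-based registers and memory, and the initial values (A = 10, R3 = 5) are all assumptions made for illustration.

```python
def run(program, registers, memory):
    """Execute a list of (opcode, first, second) tuples sequentially.

    Hypothetical toy machine: registers and memory are plain dicts,
    and only LOAD, ADD, and STORE are understood.
    """
    for opcode, first, second in program:
        if opcode == "LOAD":      # register <- memory[address]
            registers[first] = memory[second]
        elif opcode == "ADD":     # register <- register + register
            registers[first] = registers[first] + registers[second]
        elif opcode == "STORE":   # memory[address] <- register
            memory[second] = registers[first]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return registers, memory


# The example from the text: A = A + R3, computed through R1.
program = [
    ("LOAD", "R1", "A"),
    ("ADD", "R1", "R3"),
    ("STORE", "R1", "A"),
]
registers = {"R1": 0, "R3": 5}   # assumed initial register contents
memory = {"A": 10}               # assumed initial memory contents

run(program, registers, memory)
print(memory["A"])  # prints 15
```

Note that the arithmetic never touches memory directly: the value has to pass through a register, which is why the sequence needs both a LOAD and a STORE.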

Depending on the instruction, the data may be one of several sizes (using common modern terminology):

CPU: the Central Processing Unit

This is a somewhat abstract picture:

CPU block diagram
Put simply, a CPU consists of the following functional parts:

Memory: Registers, Stacks, and Heaps

It is the job of the compiler to decide how to use the registers, stack, and heap most efficiently. Note that these functions apply both to user programs (applications) and to the operating system kernel.

Quantitative Principles of Design

Last time, we talked about Hennessy & Patterson's Five Principles:

  1. Take Advantage of Parallelism
  2. Principle of Locality
  3. Focus on the Common Case
  4. Amdahl's Law
  5. The Processor Performance Equation
I would add one imperative to this list: Achieve Balance.

Take Advantage of Parallelism

Parallelism can be exploited in several ways: by using multiple processors on different parts of the problem, by using multiple functional units (floating point units, disk drives, etc.), or by pipelining. Pipelining divides each computer instruction into several parts and executes the parts of different instructions at the same time in different parts of the CPU.
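
The benefit of pipelining can be estimated with a simple cycle count. The sketch below assumes an idealized pipeline in which every stage takes exactly one clock cycle and there are no stalls or hazards; the function names and the example values (100 instructions, 5 stages) are illustrative choices, not figures from the lecture.

```python
def unpipelined_cycles(num_instructions, num_stages):
    """Each instruction runs all of its stages before the next one starts."""
    return num_instructions * num_stages


def pipelined_cycles(num_instructions, num_stages):
    """Ideal pipeline: the first instruction takes num_stages cycles to
    complete, and one more instruction completes in every cycle after that."""
    if num_instructions == 0:
        return 0
    return num_stages + (num_instructions - 1)


n, k = 100, 5  # assumed example values
print(unpipelined_cycles(n, k))  # prints 500
print(pipelined_cycles(n, k))    # prints 104
print(round(unpipelined_cycles(n, k) / pipelined_cycles(n, k), 2))  # prints 4.81
```

Under these assumptions the speedup approaches the number of stages as the instruction count grows, but never quite reaches it, because the pipeline first has to fill.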

Principle of Locality

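Programs tend to reuse data and instructions that they have used recently. There are two forms of locality: temporal (an item that was accessed recently is likely to be accessed again soon) and spatial (items stored near a recently accessed item are likely to be accessed soon). Locality is what allows a cache memory to work.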
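
The effect of locality on a cache can be seen in a toy simulation. This is a minimal sketch of a direct-mapped cache; the `hit_rate` function, the cache geometry (64 lines of 16 bytes), and the three access patterns are assumptions chosen for illustration, not a description of any real processor.

```python
def hit_rate(addresses, num_lines=64, line_size=16):
    """Fraction of accesses that hit in a toy direct-mapped cache."""
    tags = [None] * num_lines          # which memory block each line holds
    hits = 0
    for address in addresses:
        block = address // line_size   # memory block containing this byte
        index = block % num_lines      # the one cache line it may occupy
        if tags[index] == block:
            hits += 1
        else:
            tags[index] = block        # miss: fetch the block, evict the old one
    return hits / len(addresses)


sequential = list(range(4096))              # spatial locality: adjacent bytes
strided = [i * 16 for i in range(4096)]     # a new block on every access
repeated = list(range(0, 1024, 16)) * 64    # temporal locality: 64 blocks reused

print(hit_rate(sequential))  # prints 0.9375
print(hit_rate(strided))     # prints 0.0
print(hit_rate(repeated))    # prints 0.984375
```

The sequential pattern misses once per 16-byte line and then hits 15 times, the strided pattern never touches a block twice, and the repeated pattern misses only on its first pass.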

Focus on the Common Case

The things that are done a lot should be fast; the things that are rare may be slow.

Amdahl's Law

Amdahl's Law tells us how much improvement is possible by making the common case fast, or by parallelizing part of the algorithm. In the example below, 3/5 of the algorithm can be parallelized, so applying three times as much hardware to the problem reduces the running time only from five time units to three, a speedup of 5/3 rather than 3.

Example of Amdahl's Law, parallel and serial portions.
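
The 3/5 example can be checked numerically. The sketch below uses the usual statement of Amdahl's Law, speedup = 1 / ((1 - f) + f / n), where f is the parallelizable fraction and n is the number of processors; the function name `amdahl_speedup` is just an illustrative choice.

```python
from fractions import Fraction


def amdahl_speedup(parallel_fraction, num_processors):
    """Overall speedup when only parallel_fraction of the work is sped up."""
    serial_fraction = 1 - parallel_fraction
    return 1 / (serial_fraction + parallel_fraction / num_processors)


# 3/5 of the work is parallelizable and we apply three processors.
speedup = amdahl_speedup(Fraction(3, 5), 3)
print(speedup)      # prints 5/3
print(5 / speedup)  # prints 3  (five time units shrink to three)
```

However many processors are added, the serial 2/5 remains, so the speedup in this example can never exceed 5/2.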

Some problems, most famously graphics, are "embarrassingly parallel": extracting parallelism is trivial, and performance is determined primarily by input/output bandwidth and the number of processing elements available. More generally, the achievable parallelism is determined by the dependency graph. Creating that graph, and scheduling operations so as to maximize parallelism while enforcing correctness, is generally the shared responsibility of the hardware architecture and the compiler.

Dependency graph for the above figure.
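
One way to see how a dependency graph limits parallelism is to compute its critical path, the longest chain of dependent tasks, which bounds the running time no matter how much hardware is available. The sketch below is illustrative only: the task names and the graph shape (one serial step, three independent steps, one final serial step, each taking one time unit) are assumptions chosen to match the five-unit example, since the figure itself is not reproduced here.

```python
from functools import lru_cache


def critical_path_length(durations, depends_on):
    """Length of the longest dependency chain in an acyclic task graph."""

    @lru_cache(maxsize=None)
    def finish_time(task):
        # A task can start only after all of its prerequisites have finished.
        start = max((finish_time(d) for d in depends_on.get(task, ())), default=0)
        return start + durations[task]

    return max(finish_time(task) for task in durations)


# Hypothetical graph: start -> {p1, p2, p3} -> end, one time unit per task.
durations = {"start": 1, "p1": 1, "p2": 1, "p3": 1, "end": 1}
depends_on = {
    "p1": ["start"],
    "p2": ["start"],
    "p3": ["start"],
    "end": ["p1", "p2", "p3"],
}

print(sum(durations.values()))                      # prints 5 (one processor)
print(critical_path_length(durations, depends_on))  # prints 3 (enough processors)
```

With this assumed graph, three processors already reach the critical-path bound of three time units, so a fourth processor would not help.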

The Processor Performance Equation

$$\text{CPU time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Clock cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Clock cycle}}$$
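
As a worked example of the equation, the sketch below computes CPU time from an instruction count, an average number of clock cycles per instruction, and a clock rate. The function name and the numbers (2 billion instructions, 1.5 cycles per instruction, a 3 GHz clock) are made-up values for illustration only.

```python
def cpu_time_seconds(instruction_count, cycles_per_instruction, clock_rate_hz):
    """CPU time = instructions x (cycles / instruction) x (seconds / cycle)."""
    total_cycles = instruction_count * cycles_per_instruction
    # Seconds per clock cycle is 1 / clock rate, so multiplying by it
    # is the same as dividing by the clock rate.
    return total_cycles / clock_rate_hz


# Hypothetical program: 2e9 instructions, 1.5 cycles each, on a 3 GHz clock.
print(cpu_time_seconds(2_000_000_000, 1.5, 3_000_000_000))  # prints 1.0
```

The three factors can be traded against one another: halving the cycles per instruction helps exactly as much as doubling the clock rate or halving the instruction count.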

Homework

This week's homework (submit via SFS):

  1. Take your "hello, world" program from last time, compile it to assembly code, and submit the assembly code. Also, answer the following questions:
    1. Where does your program start?
    2. How many instructions are in the main body of the program (between that starting point and the exit or return)?
  2. For the parallelism exercise for the lasagna recipe we did in class,
    1. Draw a diagram similar to the Amdahl's Law diagram above.
    2. Estimate the total time with one cook, one stove burner, and one oven.
    3. Estimate the total time with two cooks, one stove burner, and one oven.
  3. Read the text for next week.

Next Lecture

Next lecture:

Lecture 3, October 13: Processors: Arithmetic

Readings for next time:

Additional Information