Keio University
Academic Year 2007, Fall Semester
Computer Architecture
Lecture 10, January 7: Systems: Chip Multiprocessors http://www.realworldtech.com/page.cfm?ArticleID=rwt090406012516&p=3
Outline of This Lecture
- Review: What's Important?
- Chip Multiprocessors
- Dual-Core Intel and AMD processors
- Intel's 80-core processor
- Sun Niagara
- Cell
- Homework
Review: What's Important?
I have written the final exam. As a review, the following concepts are important:
- Digital arithmetic (including floating point)
- Amdahl's Law
- Moore's Law
- processor performance equation
- caching
- pipelining
Additional topics:
- instruction sets: CISC v. RISC, load-store v. memory-memory
- basic multiprocessor architecture
- MIMD, SIMD
- shared memory v. distributed memory
- multiprocessor networks
- I/O systems
- synchronization
- virtual memory
Review: Amdahl's Law
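As a quick refresher, Amdahl's Law bounds the overall speedup when a fraction p of a program's work can be spread over n processors: speedup = 1 / ((1 - p) + p / n). The following is a minimal Python sketch of that formula. The function name and the sample fractions are illustrative choices, not taken from the lecture.

```python
def amdahl_speedup(parallel_fraction: float, n_processors: int) -> float:
    """Overall speedup when `parallel_fraction` of the work scales over n processors."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_processors < 1:
        raise ValueError("n_processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


if __name__ == "__main__":
    # Illustrative parallel fractions, evaluated for an 80-core chip.
    for p in (0.5, 0.9, 0.99):
        print(f"p = {p:.2f}: speedup on 80 cores = {amdahl_speedup(p, 80):.2f}x")
```

Even with 99% of the work parallelizable, 80 cores give a speedup of only about 45x, which is why the serial fraction matters so much for chip multiprocessors.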
Intel's 80-Core Processor
Last year, Intel announced a demonstration 80-core, single-chip
multiprocessor capable of 1 teraFLOPS (10^12 32-bit floating point
operations per second).
A photomicrograph of the chip, and the basic floor plan:
Block diagram of a single processing element (PE). Note the many
read/write ports on the register file. With enough ports, instructions
in different pipeline stages do not contend for the register file, so
it is not a source of structural hazards.
The pipeline is 8 stages:
The chip, mounted on a board:
Note that each PE has a 3KB instruction memory, and a 2KB data
memory. Data can be transferred to the network only from registers,
not from memory. Their existing demonstration doesn't include a
larger memory, but plans are for 3D packaging, with the memory chip
stacked directly on top of the processor chip.
Programming this puppy requires a lot of very careful work to schedule
the operations, including transfers to other processors on the
network. Obviously, in its current form it is a message-passing, or
distributed-memory, multiprocessor.
The network is a 2D mesh.
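To make the mesh concrete, here is a small Python sketch of how PEs in a 2D mesh can be addressed, which PEs are direct neighbors, and how many hops a message needs. Several things here are assumptions for illustration only: the 8 x 10 arrangement of the 80 PEs, the row-major numbering, and the simple hop-count model (distance along rows plus distance along columns). The lecture itself only states that the network is a 2D mesh.

```python
ROWS, COLS = 8, 10  # assumed layout for 80 PEs; not stated in the lecture


def coords(pe: int) -> tuple[int, int]:
    """Map a PE number (row-major numbering, an assumption) to (row, column)."""
    if not 0 <= pe < ROWS * COLS:
        raise ValueError("PE number out of range")
    return divmod(pe, COLS)


def neighbors(pe: int) -> list[int]:
    """PEs directly connected to `pe`; a mesh has no wraparound links."""
    row, col = coords(pe)
    result = []
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < ROWS and 0 <= c < COLS:
            result.append(r * COLS + c)
    return sorted(result)


def hops(src: int, dst: int) -> int:
    """Minimum number of link traversals between two PEs (Manhattan distance)."""
    r1, c1 = coords(src)
    r2, c2 = coords(dst)
    return abs(r1 - r2) + abs(c1 - c2)


if __name__ == "__main__":
    print("neighbors of corner PE 0:", neighbors(0))
    print("hops from PE 0 to PE 79:", hops(0, 79))
```

Corner PEs have only two neighbors and interior PEs have four, so the distance between communicating PEs is one of the things a programmer has to schedule around.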
Sun Niagara
Contrast the above architecture with Sun's Niagara.
Sony/Toshiba/IBM Cell Processor
Certainly the most famous chip multiprocessor at the moment is the
Cell, used in the Sony PS3.
Homework
This week's homework (submit via email):
Intel says their 80-core processor runs at 1TFLOPS when the clock
speed is 3.13GHz. Determine:
- FLOPS per processor, assuming all 80 cores are working
- Average floating point ops/clock cycle
- If each pipeline stall costs the full 8 cycles, what percentage
of instructions can stall and still meet that performance?
- If each pipeline stall costs only one clock cycle, what percentage
of instructions can stall and still meet that performance?
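One possible way to set up the arithmetic is sketched below in Python. The first two quantities follow directly from the numbers given above. The stall questions need a model, and the one used here is an assumption: every instruction normally takes one cycle, a stalling instruction costs an extra `stall_penalty_cycles`, and the core can complete at most `peak_per_cycle` floating point operations per cycle when nothing stalls. The peak value of 4 used in the example is hypothetical and is not stated in the lecture. Substitute whatever peak rate you can justify.

```python
TOTAL_FLOPS = 1e12   # 1 TFLOPS, from the problem statement
CORES = 80           # from the problem statement
CLOCK_HZ = 3.13e9    # 3.13 GHz, from the problem statement

flops_per_core = TOTAL_FLOPS / CORES                  # FLOPS each core must sustain
flops_per_cycle_per_core = flops_per_core / CLOCK_HZ  # average FP ops per clock cycle


def max_stall_fraction(peak_per_cycle: float,
                       required_per_cycle: float,
                       stall_penalty_cycles: float) -> float:
    """Largest fraction of instructions that may stall and still hit the target.

    Assumed model: CPI = 1 + f * penalty, and sustained throughput is
    peak_per_cycle / CPI. Solving peak / CPI >= required for f gives
    f <= (peak / required - 1) / penalty.
    """
    if peak_per_cycle <= required_per_cycle:
        return 0.0  # no slack at all: nothing may stall
    fraction = (peak_per_cycle / required_per_cycle - 1.0) / stall_penalty_cycles
    return min(1.0, fraction)


if __name__ == "__main__":
    print(f"FLOPS per core: {flops_per_core:.3e}")
    print(f"FP ops per cycle per core: {flops_per_cycle_per_core:.3f}")
    assumed_peak = 4.0  # hypothetical peak FP ops per cycle; not given in the lecture
    for penalty in (8, 1):
        f = max_stall_fraction(assumed_peak, flops_per_cycle_per_core, penalty)
        print(f"penalty {penalty} cycle(s): up to {100 * f:.3f}% of instructions may stall")
```

The smaller the gap between the assumed peak rate and the required average rate, the fewer stalls the chip can tolerate, and a longer stall penalty shrinks that budget proportionally.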
Next Lecture
Next week, we will have the first of two lectures on I/O systems.
Next lecture:
Lecture 11, January 18 (n.b.: Friday!): Basics of I/O and
Storage Systems and Designing for Networks
Additional Information