Keio University
Fall Semester, Academic Year 2015

Computer Architecture

Fall Semester 2015, Mondays, 3rd period
Course code: 35010 / 2 credits
Category:
Location: SFC
Format: Lecture
Instructor: Rodney Van Meter
E-mail: rdv@sfc.keio.ac.jp

Lecture 5, June 25: Instruction Sets and the Data Path, Simulation

Outline of This Lecture

The rest of the semester
A quick look at system structure
Instruction Sets
- The basic idea
- Basic parts of a CPU
- Memory types and uses
- Instruction set classes
- Types of instructions
- Memory addressing
- What an instruction looks like
Using the MIPS Simulator
Homework

The Rest of the Semester

Remaining lectures:

5. MIPS assembly
   HW: MIPS matmul
6. pipelining
   HW: none
7. memory, cache
   HW: cache
8. VM
9. Data parallelism: OpenMP, CUDA in-class exercise
   HW: OpenMP, submit CUDA
10. Distributed-memory parallel systems: Interconnects, Fugaku
11. I/O Systems, Error Correction, RAID
   HW: last year's exam
12. Putting it all together: a look back at the x86 and MIPS assembly

-----

Homeworks for Computer Architecture 2020

1. a. Calculate array addresses by hand
   b. Multiply two matrices by hand
   c. Multiply in pseudocode
3. Matrix multiply in assembler (MIPS)
4. Cache
   a. bitfields
   b. just simple calculations on efficiency
5. Parallel
   a. CUDA matrix multiply (cookbook on colab, just submit as proof you did it)
   b. OpenMP parallel (same as previous years)
6. Test (last year's final exam as a homework)

*** Final deadline for all homeworks is August 1, 2020! ***

Instruction Sets

CPUs execute instructions (命令)
The data for those instructions comes from memory (stack or heap) or registers
An instruction includes an opcode
Instructions may be ALU, data movement, or control flow instructions

A Quick Look at System Structure

We have seen this diagram before:

You already know that data is stored in memory, and the actual computation is done by the CPU. Starting today, we are looking at the inside of that CPU: what work it performs, and how, in order to complete a computation.

Instructions: the Basic Idea

Computers execute instructions. Those instructions are usually produced by a compiler, a piece of software that translates human-readable (usually ASCII) source code into computer-readable binary. For example:

LOAD	R1, A
ADD	R1, R3
STORE	R1, A

This example shows three instructions, to be executed sequentially. The first instruction LOADs a value into register R1 from memory (we will come back to how the value that is loaded into R1 is found in a minute). The second instruction ADDs the contents of register R3 into register R1, then the third instruction STOREs the result into the original memory location.
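
The effect of the three instructions can be sketched in Python as plain assignments. The starting values (10 in memory location A, 5 in R3) are made up for illustration, and the dictionaries are only a stand-in for real memory and registers.

```python
# Illustrative starting state (assumed values, not from the lecture).
memory = {"A": 10}            # the memory location named A
regs = {"R1": 0, "R3": 5}     # two CPU registers

regs["R1"] = memory["A"]                # LOAD  R1, A   (memory -> register)
regs["R1"] = regs["R1"] + regs["R3"]    # ADD   R1, R3  (register + register)
memory["A"] = regs["R1"]                # STORE R1, A   (register -> memory)

print(memory["A"])  # prints 15
```

Note that the addition itself never touches memory: the value is first brought into a register, modified there, and then written back.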

CPU: the Central Processing Unit

This figure is a little bit on the abstract side, but:

Briefly, the CPU has the following functional components:

The instruction fetcher reads instructions from memory.
The instruction decoder decides what type of instruction is being executed and whether data must be fetched from memory.
The memory interface reads data from and writes data to memory on behalf of the instructions.
The registers are the memory inside the CPU (on-chip memory).
The ALU, or Arithmetic and Logic Unit, actually executes, as the name says, arithmetic and logical instructions.
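
As a rough sketch of how these parts cooperate, the toy model below gives each component one small Python function and runs a fetch-decode-execute loop. The instruction format (a text string), the HALT instruction, and all names and values are assumptions made for this illustration; a real CPU does this in hardware on binary instructions.

```python
# Toy CPU model: every name and value here is hypothetical.
instruction_memory = ["LOAD R1 A", "ADD R1 R3", "STORE R1 A", "HALT"]
data_memory = {"A": 10}
registers = {"R1": 0, "R3": 5}

def fetch(pc):
    """Instruction fetcher: read the instruction stored at address pc."""
    return instruction_memory[pc]

def decode(instruction):
    """Instruction decoder: split the instruction into opcode and operands."""
    opcode, *operands = instruction.split()
    return opcode, operands

def memory_read(name):
    """Memory interface: read one value from data memory."""
    return data_memory[name]

def memory_write(name, value):
    """Memory interface: write one value to data memory."""
    data_memory[name] = value

def alu_add(a, b):
    """ALU: perform the arithmetic."""
    return a + b

def run():
    """Fetch, decode, and execute until HALT; return the (pc, opcode) trace."""
    pc = 0
    trace = []
    while True:
        opcode, operands = decode(fetch(pc))
        trace.append((pc, opcode))
        if opcode == "HALT":
            return trace
        if opcode == "LOAD":
            registers[operands[0]] = memory_read(operands[1])
        elif opcode == "ADD":
            registers[operands[0]] = alu_add(registers[operands[0]],
                                             registers[operands[1]])
        elif opcode == "STORE":
            memory_write(operands[1], registers[operands[0]])
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        pc += 1  # advance the program counter to the next instruction

trace = run()
print(trace)
print(data_memory)
```

The variable pc plays the role of the program counter: it tells the fetcher which instruction to read next.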

Memory: Registers, Stacks, and Heaps

Registers: Special memory inside the CPU chip. There are typically only a few registers in a CPU. They are fast, but expensive.
Main Memory: Random Access Memory (RAM) is the largest amount of memory in your system; you may have 512 megabytes or more in your laptop. RAM is typically used in several ways:
- Stack: Also called a push-down stack, this area of memory is used to keep values used as local variables by functions in the program.
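- Heap: Memory for data that outlives a single function call: dynamically allocated data (e.g., from malloc) and, in this simplified picture, the program's global variables (which strictly live in a separate static data area).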
- Binary/text segment: The program itself.

It is the job of the compiler to decide how to use the registers, stack, and heap most efficiently. Note that these uses of memory apply both to user programs (applications) and to the operating system kernel.

Types of Instructions

It's easy to think of an algorithm in terms of the arithmetic that must be performed. In fact, you could argue that the only important work is that arithmetic. However, you should be aware that much of the work actually done by the CPU is not that arithmetic directly, but instead is various kinds of supporting work to enable that arithmetic: moving data around, and deciding what work should be done next. We can categorize the most common instructions used in an algorithm into three groups:

ALU
- Integer arithmetic
- Bitfield and logical operations
- Floating point arithmetic (often handled in a separate part of the processor known as an FPU, or floating point unit)
Data load/store, stack push/pop
Control flow
- Unconditional branch
- Conditional branch
- Function call/return

Besides these three groups, there are instructions that control the state of the processor itself, turning on and off various features of the CPU, some of which we will talk about when we talk about virtual memory. Additional instructions include those necessary to support operating system calls and device I/O, such as interrupts.
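
To make the point about supporting work concrete, here is a small sketch that tallies instructions by group for a loop that sums an array. The four-instruction loop body and the mnemonic BNE are assumptions for an imaginary load-store machine, not output from a real compiler.

```python
from collections import Counter

# Hypothetical mapping of generic mnemonics to the three groups.
CATEGORY = {
    "ADD": "ALU",
    "LOAD": "data movement",
    "STORE": "data movement",
    "BNE": "control flow",
}

# One iteration of "sum = sum + array[i]" on an imaginary load-store machine.
LOOP_BODY = [
    "LOAD",  # bring array[i] into a register
    "ADD",   # sum = sum + array[i]  (the arithmetic we actually care about)
    "ADD",   # advance the index or pointer
    "BNE",   # branch back to the top unless finished
]

def tally(n_iterations):
    """Count executed instructions per group for n_iterations of the loop."""
    counts = Counter()
    for _ in range(n_iterations):
        for op in LOOP_BODY:
            counts[CATEGORY[op]] += 1
    counts[CATEGORY["STORE"]] += 1  # write the final sum back to memory once
    return counts

counts = tally(100)
total = sum(counts.values())
print(counts)
print(total)  # prints 401
```

In this made-up trace only 100 of the 401 executed instructions are the additions the algorithm is about; the rest move data, update the index, or decide what to do next.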

Classes of Instruction Sets

There are many different ways to build a complete instruction set for a CPU. The broadest classification is to divide them based on how the operands for an arithmetic operation are brought into the ALU. Can they come directly from memory, or do they have to be used from registers? A common taxonomy is:

Stack architecture
Accumulator architecture
General-purpose register architecture
- Register-memory
- Load-store

The diagram below shows how data flows in the CPU, depending on the class of instruction set. (TOS = Top of Stack)

[Figure: differences in data flow for instruction set classes]
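
As a hedged illustration of the four classes, the sketch below computes C = A + B once per class. The instruction sequences in the comments follow the usual textbook presentation of this comparison; the mnemonics, register names, and Python modeling are assumptions, not the syntax of any real machine.

```python
def run_stack(mem):
    """Stack: operands are implicit, taken from the top of the stack (TOS)."""
    stack = []
    stack.append(mem["A"])                    # PUSH A
    stack.append(mem["B"])                    # PUSH B
    stack.append(stack.pop() + stack.pop())   # ADD   (pops two, pushes sum)
    mem["C"] = stack.pop()                    # POP C
    return mem["C"]

def run_accumulator(mem):
    """Accumulator: one implicit register holds an operand and the result."""
    acc = mem["A"]          # LOAD A
    acc = acc + mem["B"]    # ADD B
    mem["C"] = acc          # STORE C
    return mem["C"]

def run_register_memory(mem):
    """Register-memory: an ALU operand may come directly from memory."""
    r = {}
    r["R1"] = mem["A"]              # LOAD R1, A
    r["R3"] = r["R1"] + mem["B"]    # ADD R3, R1, B
    mem["C"] = r["R3"]              # STORE R3, C
    return mem["C"]

def run_load_store(mem):
    """Load-store: ALU operands must all be in registers."""
    r = {}
    r["R1"] = mem["A"]              # LOAD R1, A
    r["R2"] = mem["B"]              # LOAD R2, B
    r["R3"] = r["R1"] + r["R2"]     # ADD R3, R1, R2
    mem["C"] = r["R3"]              # STORE R3, C
    return mem["C"]

machines = (run_stack, run_accumulator, run_register_memory, run_load_store)
results = {m.__name__: m({"A": 2, "B": 3}) for m in machines}
print(results)
```

All four produce the same answer; what differs is where the ALU's operands come from and how many instructions are needed to get them there.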

Memory Addressing

Each operand of an instruction must be fetched before the instruction can be executed. Data may come from

Immediate or literal data (limited to less than a full word)
Registers
Register indirect
Register displacement

(There are other addressing modes, as well, which we will not discuss.)

Mode	Example	Meaning
Immediate	ADD R4, #3	Regs[R4] ← Regs[R4] + 3
Register	ADD R4, R3	Regs[R4] ← Regs[R4] + Regs[R3]
Register Indirect	ADD R4, (R1)	Regs[R4] ← Regs[R4] + Mem[Regs[R1]]
Displacement	ADD R4, 100(R1)	Regs[R4] ← Regs[R4] + Mem[100+Regs[R1]]
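
The four modes in the table can be sketched as one operand-fetching function. The register and memory contents below are made-up illustration values, and the function and mode names are hypothetical.

```python
# Illustrative machine state (assumed values).
regs = {"R1": 200, "R3": 7, "R4": 10}
mem = {200: 40, 300: 50}   # address -> contents

def fetch_operand(mode, reg=None, value=0):
    """Return the second operand of ADD according to the addressing mode."""
    if mode == "immediate":        # the value is inside the instruction
        return value
    if mode == "register":         # Regs[reg]
        return regs[reg]
    if mode == "indirect":         # Mem[Regs[reg]]
        return mem[regs[reg]]
    if mode == "displacement":     # Mem[value + Regs[reg]]
        return mem[value + regs[reg]]
    raise ValueError(f"unknown addressing mode: {mode}")

def add_to_r4(mode, reg=None, value=0):
    """Compute what ADD R4, <operand> would leave in R4 (R4 is not modified)."""
    return regs["R4"] + fetch_operand(mode, reg, value)

print(add_to_r4("immediate", value=3))              # ADD R4, #3       -> 13
print(add_to_r4("register", reg="R3"))              # ADD R4, R3       -> 17
print(add_to_r4("indirect", reg="R1"))              # ADD R4, (R1)     -> 50
print(add_to_r4("displacement", reg="R1", value=100))  # ADD R4, 100(R1) -> 60
```

Displacement addressing is what makes array and structure accesses cheap: the register holds a base address and the constant selects an element relative to it.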

Depending on the instruction, the data may be one of several sizes (using common modern terminology):

A byte (8 bits, today)
A half word (16 bits)
A word (32 bits)
A double word (64 bits)

What an Instruction Looks Like

An instruction must contain the following:

An opcode, the operation code that identifies the instruction.
Address type for zero or more arguments (sometimes, implicit in the instruction).
Addresses (or immediate data) for zero or more arguments.

Some architectures always use the same number of arguments, others use variable numbers. In some architectures, the addressing information and address are always the same length; in others they are variable.

In general, arithmetic instructions are either two-address or three-address. Two-address operations modify one of the operands, e.g.

ADD R1, R3	; R1 = R1 + R3

whereas three-address operations specify a separate result register, e.g.

ADD R1, R2, R3	; R3 = R1 + R2

(n.b.: in some assembly languages, the target is specified first; in others, it is specified last.)
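
A minimal sketch of the practical difference, using the operand order of the two examples above (result last in the three-address form). The function names and register values are hypothetical.

```python
def add_two_address(regs, dst, src):
    """ADD dst, src : dst = dst + src (the old value of dst is lost)."""
    regs[dst] = regs[dst] + regs[src]

def add_three_address(regs, src1, src2, dst):
    """ADD src1, src2, dst : dst = src1 + src2 (both sources survive)."""
    regs[dst] = regs[src1] + regs[src2]

two = {"R1": 4, "R3": 6}
add_two_address(two, "R1", "R3")            # ADD R1, R3
print(two)                                  # R1 now holds 10; the 4 is gone

three = {"R1": 4, "R2": 6, "R3": 0}
add_three_address(three, "R1", "R2", "R3")  # ADD R1, R2, R3
print(three)                                # R1 and R2 unchanged, R3 holds 10
```

With two-address instructions, a program that still needs the old value of R1 must copy it elsewhere first, which costs an extra data movement instruction.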

The MIPS architecture, which grew out of Professor Hennessy's research at Stanford and is used as the teaching example in editions of the Patterson and Hennessy textbooks, is relatively easy to understand. Its instructions are always 32 bits long, of which 6 bits are the opcode (giving a maximum of 64 opcodes). rs and rt are the source and target register fields, respectively. (Those fields are five bits each; how many registers can the architecture support?) Instructions take one of three forms, commonly called R-type, I-type, and J-type.
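
As a sketch of what the fixed 32-bit layout means in practice, the functions below pull the fields out of an instruction word with shifts and masks. The layouts are the standard MIPS32 register (R), immediate (I), and jump (J) formats; the field names rd, shamt, funct, immediate, and target come from the MIPS documentation rather than from these notes, and the function names are hypothetical.

```python
def decode_r_type(word):
    """Register form: opcode(6) rs(5) rt(5) rd(5) shamt(5) funct(6)."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
    }

def decode_i_type(word):
    """Immediate form: opcode(6) rs(5) rt(5) immediate(16)."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "immediate": word & 0xFFFF,  # raw 16 bits, not sign-extended here
    }

def decode_j_type(word):
    """Jump form: opcode(6) target(26)."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "target": word & 0x03FFFFFF,
    }

add_fields = decode_r_type(0x02324020)    # add  $t0, $s1, $s2
addi_fields = decode_i_type(0x22280064)   # addi $t0, $s1, 100
jump_fields = decode_j_type(0x08000010)   # j with word target 0x10

print(add_fields)
print(addi_fields)
print(jump_fields)
```

Because every instruction is the same length and the opcode is always in the same place, the decoder can start extracting fields before it even knows which instruction it is looking at.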

Using the MIPS Simulator

Note: The MIPS simulator is available on SourceForge. The original spim page is still on the web.

You will need vecadd.s, a short MIPS assembly program.

Here are screenshots from the spim (Linux), xspim (Linux), PCspim (Windows), and spim and QtSpim (Mac) versions of the tool.

[Figure: PCspim screenshot, with output console (Windows)]

Things to note in the images above, as well as your own execution runs:

Watch the advance of the PC (program counter) as you step through the program.
By convention, most of the registers have two names, Rxx and a more mnemonic one indicating the use.
Each floating point register holds a single-precision value; alternatively, a pair of them (e.g., F0 and F1) is used together as one double-precision FP register.
Notice especially the precision and accuracy of the FP numbers in the output, compared to the values we intended! This is an important fact!
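
The same effect can be reproduced outside the simulator. The sketch below uses Python's standard struct module to round a value to single precision and back; the value 0.1 is just an illustrative choice, not taken from vecadd.s.

```python
import struct

def to_single(x):
    """Round a Python float (double precision) to single precision and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

intended = 0.1
stored = to_single(intended)

print(f"{intended:.20f}")   # the double-precision approximation of 0.1
print(f"{stored:.20f}")     # the single-precision approximation of 0.1
print(stored == intended)   # prints False
```

Neither value is exactly 0.1, because 0.1 has no finite binary representation; single precision simply keeps fewer bits of the approximation.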

Homework

As in SFC-SFS.

Next Lecture

Next week, we will continue with the discussion of processor architecture, getting into the fun stuff: pipelining!

Next lecture:

Lecture 6: Processors: Basics of Pipelining

Readings for next time:

P-H:
H-P: Appendix A.1 and A.2
