Let's go back to last week and look at some of the history we didn't have time to finish.
Recall that mainframes went from bare metal, to batch systems, to unprotected multiprogramming, to protected multi-user systems. PC operating systems followed the same pattern, and embedded OSes for devices such as PDAs are following the same pattern.
Finally, remember that we discussed the light cone of information and its impact on systems, where we must consider that information is always distributed (and therefore out of date), and that systems are definitely concurrent.
I asked you to read the Levin and Redell paper. That paper will help you learn how to understand a computer system, and it speaks to the larger issues of how to analyze existing research and how to present your own. Thinking about these issues will help when you write your master's thesis.
Everyone always wants to describe what they did, rather than why they did it. A good paper must describe why, and, ultimately, whether or not what you did worked, and how you know.
When you present a project, or an idea, to me, I expect the following:
You will find a slightly updated version of the questions from Levin and Redell here.
Very briefly, here are several kinds of operating systems. This course concentrates on general-purpose PC operating systems, but many other kinds of OS exist in the world. Most of their essential features are much the same, and new OSes do not differ greatly from their predecessors.
The two photos above demonstrate two critical concepts in operating systems:
We will see these two concepts again in later lectures.
In the mid-twentieth century, there was much excitement over the design of digital computers. Early important names include John von Neumann, Konrad Zuse, J. Presper Eckert, John William Mauchly, and Maurice Wilkes; I encourage you to look up those names and study their contributions. The term "von Neumann architecture" is generally used to refer to a computer that can store a program as a form of data, rather than in the wiring of the computer itself, though it is generally acknowledged that all of the above contributed to the basic ideas. Without this innovation, the field of computer software would be very dry!
By the mid-1950s, computers made of transistors and capable of storing their programs were becoming common. At first, each program controlled the entire computer; it was like reformatting your hard disk each time you wanted to run a different program -- except, of course, this was before the invention of the hard disk! Programs were on decks of punch cards instead. Individuals used the computer for a single program for hours at a time. Each program had to know how to control all of the hardware of the entire computer that it intended to use. (The first hard disk, the IBM 305 RAMAC, held five megabytes and was introduced in 1956.)
At the same time, the first high-level languages and compilers, of which FORTRAN is the most prominent example, were being developed. During this time, batch system operating systems were created. The computer was run by operators, a specialized profession of people who did not necessarily write programs, but knew how to run a program on the computer. They took card decks from programmers, fed them to the computer, and returned the original program and data card decks and the output data card deck to the programmers.
On April 7, 1964, IBM introduced the System/360. It was a watershed event in computing history, but for our purposes there are two major innovations:
Around 1960, it was realized that more than one user could be seated at a terminal at the same time, and that each user spends most of his time thinking, which would allow the computer to be idle. Fernando J. Corbató led the development of the Compatible Time Sharing System (CTSS), but Corbató credits John McCarthy with the original idea. Corby went on to lead the development of Multics, which would ultimately lead to the development of Unix. With these systems, and others such as some of the operating systems from Digital Equipment Corporation (DEC), especially VMS, the idea of computers that could do many things at once and support many users at the same time became common.
In the late 1970s, personal computers began to appear; their history followed the same sequence as above, only faster. Initially a personal computer could run only a single program. Later, systems could run more than one program, but with no hardware protection for the separate programs, crashes were common as programs interfered with one another. Finally, more "modern" PC operating systems adopted the widely-accepted notions of a kernel and common libraries that allowed programs to more efficiently and safely share a computer. Programs in these systems are run in a process. Today's PC operating systems, such as Linux, Microsoft Windows, and MacOS, are heavily derived from the systems mentioned above. They also commonly ship with a GUI and many utilities, including shells, editors, compilers, and graphics tools, that are not strictly part of the OS itself.
More recently, a similar evolution has happened in the embedded device world; OSes for cell phones, for example, didn't really exist initially. Firmware was hand-coded from the ground up. Later, OSes, even with simple forms of multitasking, arrived (Symbian and PalmOS are probably the two most prominent). Nowadays, with iOS and Android, full-featured, Unix-like system services are available with sophisticated development tools and GUI toolkits.
System calls define the interface between the OS and user-land programs. The basic system calls fall into the following categories:
On Unix, and on OSes strongly influenced by it, the last category is normally handled as file I/O. However, several system calls are usually needed to perform file I/O; for example, to check whether a network connection opened successfully.
We discussed resource management, data movement and naming as critical functions of an OS. These system calls are an application's means of requesting that the OS perform one of these functions.
System calls are different from library functions. Many OS environments (whether the library is or is not a part of the OS itself is arguable) provide libraries of functions, often standardized, that application programmers may wish to use. What's the difference? Library functions start by running in user space, though they may also make system calls on behalf of the user process. Library functions perform actions like string formatting, calculating math functions, etc. System calls generally involve access to things that must be protected: disk drives, files on those disk drives, process control structures, etc.
Last week we discussed naming as a critical function of an OS. Humans use a readable form of the name of a system call, such as write(). However, the operating system itself does not actually use the human-readable names. In this case, the C compiler uses header files as a means to translate the human-readable name into a machine-readable one.
Here is part of the list in a Linux 2.6.19 kernel:
[rdv@2 ~]$ more /usr/include/asm/unistd.h
#ifndef _ASM_I386_UNISTD_H_
#define _ASM_I386_UNISTD_H_

/*
 * This file contains the system call numbers.
 */

#define __NR_restart_syscall 0
#define __NR_exit            1
#define __NR_fork            2
#define __NR_read            3
#define __NR_write           4
#define __NR_open            5
#define __NR_close           6
...
#define __NR_move_pages      317
#define __NR_getcpu          318
#define __NR_epoll_pwait     319

#endif /* _ASM_I386_UNISTD_H_ */

...that's it. In Linux, there are 319 system calls that do everything.
The execution of a system call occurs in several phases:
On the application side, a call into a library may or may not end up invoking a system call. Compilation handles this boundary very carefully.
Note that this same essential structure is followed for making calls to remote servers, as well as to local system services. Again, our principles of distributed and concurrent actions and information (the light cone) apply.
Most system calls are synchronous; your application program stops until the OS completes the call and returns (or decides that it cannot complete, in which case an error is returned).
Looking in a little more detail at the setuid system call:
_syscall1(int,setuid,uid_t,uid);

which will expand to:

_setuid:
	subl $4,%esp
	pushl %ebx
	movzwl 12(%esp),%eax
	movl %eax,4(%esp)
	movl $23,%eax
	movl 4(%esp),%ebx
	int $0x80
	movl %eax,%edx
	testl %edx,%edx
	jge L2
	negl %edx
	movl %edx,_errno
	movl $-1,%eax
	popl %ebx
	addl $4,%esp
	ret
L2:
	movl %edx,%eax
	popl %ebx
	addl $4,%esp
	ret

(This code is a little old, but illustrates the necessary points.)
The number of lines of C code in a Linux distribution:

Wow! Printed out, that would be some 100,000 pages! The first Unix kernel was about 5,000 lines, and today's Linux kernel is 17,000 files! So much for Linux honoring the fundamental principle of KISS: keep it simple, stupid.

More than half of the code (and a third of the files) exists to support the enormous variety of devices, most of which are never used on any one system.
The Linux kernel hackers are generally reasonable about maintaining comments, so consider these numbers to be high by almost a factor of two. Note also that this does not include any of the following:
The original Unix, likewise, did not include any of the following:

In a system at this scale, well-defined interfaces that hide the underlying machinery are critically important. And, following Lampson's advice to plan to throw the first implementation away, nearly the entire virtual memory subsystem of Linux has been rewritten (as, like Windows, it often is). In this course, we will discuss how such systems are engineered.
That gives you seven to eight weeks to actually implement your project. I would expect that your project will take 30-40 hours total, including writing and debugging code, taking data, analyzing the data, and writing up a report of the results. This is actually not very much time for a project, so they must be sized appropriately. (I think I said 40-60 last week, but that's too high.)
Your grade on your project will be 60% of your total grade, split 10% for the mid-term progress review, and 50% for the final evaluation. The things I will look for are those detailed in the Levin and Redell paper. Because many of your projects will involve performance measurements, I also expect data with error bars and carefully designed experiments. One great book on the topic is Jain, The Art of Computer Systems Performance Analysis, but there are probably also good books available in Japanese.
Some people have asked what language they must write their program(s) in. I don't care what language you use; I care what you learn about the operating system. In order to learn about the OS, a low-overhead, predictable, compiled language is probably preferable. C would be the obvious choice. Interpreted scripting languages are probably bad choices.
Likewise, there is no requirement to perform your project on a particular operating system. Class lectures will focus on principles highlighted by Unix and Linux examples, as above. The importance of Unix in the history of operating systems cannot be overstated, and Linux and MacOS are (arguably) its most vibrant current implementations; students must have some familiarity with the basic ideas of Unix. However, if your OS of choice is Windows, you will learn a great deal by comparing concepts from lectures and the book with what you see on your Windows machine. One obvious advantage of Linux is the easy availability of source code, turning "black box" experiments into "white box" ones.
The first step in either research or development is to identify a problem. Most of these projects will help you carefully characterize a system problem that you might want to attack more thoroughly in research later. It's okay by me if this is related to your work in your lab for your thesis.
The ideal project for this class is probably a performance measurement of the system. Examples include:
I will ask you not just what happened, but why it happened. In most cases, you should be able to point to some kernel source code, or a conference paper, design document, book, or web site that supports your understanding of why the system behaves the way it does. For most of these projects, I will also ask you to predict what will happen as technology continues to improve: modest improvements in clock speed and disk speed, significant increases in number of processors and disk capacity.
There will be readings from the textbook assigned almost every week. Those are for your own benefit; you are not required to keep notes on them. There will also be five papers (in English) that are assigned reading, for which you are required to post a summary and comments with your homework. They will be assigned through the semester, but FYI, a summary:
#include <unistd.h> int main() { char *buf = "123"; write(1, buf, 3); return 0; }(Hint: if you are using Linux, some of the information you need to complete this exercise is above in the lecture notes.)
Lecture 3, April 27: Processes and Threads
Follow-up readings for this week: