I recommend the CACM version; it's more up to date and probably better written.
Last week we saw basic priority scheduling, using the example of VxWorks on Mars.
The simplest approach of all is first come, first served (FCFS). In FCFS, jobs are simply executed in the order in which they arrive. This approach has the advantage of being fair; all jobs get the processing they need in a relatively predictable time.
Better still, in some ways, is Shortest Job First (SJF). SJF is provably optimal for minimizing wait time among a fixed set of jobs. However, to maintain fairness, one has to be careful about continuing to allow new jobs to join the processing queue ahead of older, longer jobs. Moreover, actually determining which jobs will be short is often a manual process, and an error-prone one at that. When I was a VMS systems administrator, we achieved an equivalent effect by having a high-priority batch queue and a low-priority batch queue. The high-priority one was used only rarely, when someone suddenly needed a particular job done quickly, and usually for shorter jobs than the low-priority batch queue.
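To make the optimality claim concrete, here is a minimal sketch (example and burst times are mine, not from the lecture) comparing average waiting time under FCFS and SJF for jobs that all arrive at time 0:

```python
# Sketch: FCFS vs. SJF average waiting time for jobs arriving together.
# (Job lengths are hypothetical; "wait" counts time before a job starts.)

def average_wait(burst_times):
    """Average time each job spends waiting before it begins running."""
    wait = 0      # time elapsed before the next job starts
    total = 0     # sum of all jobs' waiting times
    for burst in burst_times:
        total += wait    # this job waited for everything ahead of it
        wait += burst
    return total / len(burst_times)

jobs = [8, 4, 1]                   # arrival order
fcfs = average_wait(jobs)          # run in arrival order -> avg 20/3
sjf = average_wait(sorted(jobs))   # run shortest first   -> avg 2.0
print(fcfs, sjf)
```

Sorting the same jobs shortest-first cuts the average wait from about 6.67 to 2 time units, because short jobs no longer queue behind long ones.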
If the CPU is the only interesting resource, FCFS does well. But in reality, computers are complex machines with multiple resources that we would like to keep busy, and different jobs have different characteristics. What if one job would like to do a lot of disk I/O, and another is only using the CPU? We call these I/O-bound and CPU-bound jobs, respectively. FCFS would have the disk busy for the first one, then the CPU busy for the second one. Is there a way we can keep both busy at the same time, and improve overall throughput?
The time the next instruction takes to execute depends on the current state of the machine. What chunks of main memory are already stored in the cache? What is the disk head position? (We will study disk scheduling more when we get to file systems.)
You're already familiar with multitasking operating systems; no self-respecting OS today allows one program to use all of the resources until it completes and only then picks the next one. Instead, they all use a quantum of time: when the currently running process uses up a certain amount of time, its quantum is said to expire, and the CPU scheduler is invoked. The CPU scheduler may choose to keep running the same process, or may choose another process to run. This basic approach achieves two major goals: it allows us to balance I/O-bound and CPU-bound jobs, and it allows the computer to be responsive, giving the appearance that it is paying attention to your job.
This basic concept of a multiprogrammed system was developed for mainframe hardware with multiple terminals attached to the same computer; fifty people or more might be using the same machine. As we discussed in the first lecture, the concept was pioneered by the Compatible Time Sharing System (CTSS), created at MIT by Fernando Corbató and his collaborators and students.
In this environment, it makes sense to give some priority to interactive jobs, so that human time is not wasted. Batch jobs still run, but at a lower priority than interactive ones. But how do you pick among multiple interactive jobs? The simplest approach is round-robin scheduling, in which each job simply executes for its quantum; when the quantum expires, the next job in the list is taken and the current one is sent to the back of the list. It is important to select an appropriate quantum: too short, and context-switch overhead dominates; too long, and interactive response suffers.
In round-robin scheduling, if we have five compute-bound tasks, they will execute in the order
ABCDEABCDEABCDE
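That interleaving can be sketched in a few lines of Python (a toy model, assuming equal quanta, no I/O, and no arrivals or departures):

```python
# Toy round-robin scheduler: each task runs for one quantum,
# then goes to the back of the queue.

from collections import deque

def round_robin(tasks, quanta):
    """Return the execution order of compute-bound tasks over `quanta` slices."""
    queue = deque(tasks)
    order = []
    for _ in range(quanta):
        task = queue.popleft()   # quantum expires: take the next task...
        order.append(task)
        queue.append(task)       # ...and send the current one to the back
    return "".join(order)

print(round_robin("ABCDE", 15))  # → ABCDEABCDEABCDE
```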
We have already seen the basic idea of priority scheduling a couple of times. Usually, priority scheduling and round-robin scheduling are combined, and the priority scheduling is strict. If any process of a higher priority is ready to run, no lower-priority process gets the CPU. If batch jobs are given lower priority than those run from interactive terminals, this has the disadvantage of making it attractive for users to run their compute-bound jobs in a terminal window, rather than submitting them to a batch queue.
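The strict combination can be sketched as follows; the queue structure and process names here are my own illustration, not any particular OS's implementation:

```python
# Sketch: strict priority scheduling with round-robin within each level.

from collections import deque

def pick_next(ready_queues):
    """ready_queues maps priority (higher number = more urgent) to a deque
    of ready processes. Returns the next process to run, or None if idle."""
    for priority in sorted(ready_queues, reverse=True):
        queue = ready_queues[priority]
        if queue:                    # highest non-empty level wins:
            proc = queue.popleft()   # no lower-priority process gets the CPU
            queue.append(proc)       # round-robin within the level
            return proc
    return None

queues = {2: deque(["interactive1", "interactive2"]), 1: deque(["batch1"])}
print(pick_next(queues))  # → interactive1
```

As long as any level-2 process is ready, batch1 never runs, which is exactly the starvation that tempts users to run compute-bound jobs from a terminal instead of the batch queue.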
To guarantee that batch jobs make at least some progress, it is also possible to divide the CPU up so that, say, 80 percent of the CPU goes to high-priority jobs and 20 percent goes to low-priority jobs. In practice, this is rarely necessary.
Given such a split, should the resulting schedule be
A1B1C1D1A1B1C1D1 or
ABCD1ABCD1?
Fairness has an actual mathematical definition, once you have decided what you are attempting to measure. This definition is from Raj Jain:
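The formula itself is Jain's fairness index (reproduced here from memory; the sketch is mine). For per-job allocations x_1..x_n it is (Σx_i)² / (n·Σx_i²), ranging from 1/n (one job gets everything) up to 1 (perfectly equal):

```python
# Jain's fairness index: f(x) = (sum x_i)^2 / (n * sum x_i^2)

def jain_index(allocations):
    n = len(allocations)
    total = sum(allocations)
    return total * total / (n * sum(x * x for x in allocations))

print(jain_index([1, 1, 1, 1]))  # → 1.0  (perfectly fair)
print(jain_index([4, 0, 0, 0]))  # → 0.25 (one job gets all the CPU)
```

Note that you must first decide what the x_i measure: CPU share, throughput, response time, and so on.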
This field was heavily researched in the 1980s, and, given the rapid rise of multicore systems, it will no doubt remain important in commodity operating systems for the next several years, especially where thread scheduling interacts with CPU scheduling.
At the other end, one important experiment is in multithreaded architectures, in which the CPU has enough hardware to support more than one thread, under limited circumstances. The most extreme form of this was the Tera Computer, which had hardware support for 128 threads and always switched threads on every clock cycle. This approach allowed the machine to hide the latency to memory, and work without a cache. It also meant that the overall throughput for the system was poor unless a large number of processes or threads were ready to execute all of the time.
We should have come to this earlier, but it didn't fit into the flow above. One important class of scheduling algorithms is deadline scheduling algorithms for realtime systems.
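One classic member of that family is Earliest Deadline First (EDF): always dispatch the ready task whose deadline is nearest. A toy sketch (task names and deadlines are mine, and this ignores preemption and periodic re-release):

```python
# Toy EDF dispatcher: repeatedly run the ready task with the nearest deadline.

import heapq

def edf_order(tasks):
    """tasks: list of (name, deadline) pairs. Returns the names in the
    order a simple non-preemptive EDF scheduler would dispatch them."""
    heap = [(deadline, name) for name, deadline in tasks]
    heapq.heapify(heap)   # min-heap keyed on deadline
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]

# Deadlines at t = 30, 10, 20: EDF runs the tightest deadline first.
print(edf_order([("logger", 30), ("sensor", 10), ("control", 20)]))
# → ['sensor', 'control', 'logger']
```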
None, just work on your project.
Lecture 6, May 18: Memory Management and Virtual Memory
Next week we will also talk about performance measurement.
Readings for next week and follow-up for this week: