Quoting from the first lecture:
Your grade will be determined as follows:
This year, heavier emphasis will be placed on the term project than on homeworks. You should expect your project to take 40-60 hours. There will be five homework assignments over the course of the semester, and readings almost every week. At least one of the programming assignments will involve parallel programming.
Summarizing the currently-assigned work:
Project:
* Proposal: (overdue, get them in this week!)
* final report & interview due: July 17-27

Programming:
HW1/L1. hello, world
HW2/L2. system call & assembly
HW3/L3. fork() + Berkeley parallel programming exercise (important!)
L4. (none)
L5. (none)
L6. (none)
L7. (none)
HW6/L8. study socket definition
L9. (none)
L10. TBD
L11. TBD
L12. TBD
L13. TBD
L14. TBD

Required Reading:
HW1/L1. Levin & Redell (reading & writing papers)
HW2/L2. Lampson, hints (system design)
L3. (none)
HW4+5/L4. Scherer, scalable sync queues
L5. (none)
HW?/L6. Raymond, Cathedral & Bazaar (software development philosophy)
L7. (none)
L8. (none)
L9. (none)
L10. (none)
L11. TBD
L12. TBD
L13. TBD
L14. your choice of recent SOSP paper (e.g., a "Best Paper" would be good)

Recommended Textbook Reading:
(various things)

Recommended Paper & Website Reading:
L1. (none)
L2. (none)
L3. (none)
L4. (none)
L5. Larus, spending Moore's dividend
L7. several things on page replacement
L8. several things on the Morris worm, Ken Thompson's hack, Yoshifuji on Linux IPv6
L9. parallel debugging from SOSP 2009
L10. Sweeney, XFS
L11. TBD
L12. TBD
L13. TBD
L14. TBD
The most common model of a file today is the simple byte stream, but originally file I/O involved a much more hardware-oriented view, more sophisticated database-oriented system services, or both. There have been numerous types of basic files:
Regular files represent non-volatile data stored on disk. Using the standard APIs, however, the data may not be committed to disk when the write() call completes; the data may still be buffered in the system somewhere (e.g., in a buffer cache in kernel memory). Again our concept of relativity comes into play: the information we have has not yet flowed to its final resting place. Completing close() does not by itself guarantee that the data is on disk. For that guarantee, call fsync() on the file descriptor and wait for it to return; sync() requests a flush of everything in the system, but POSIX allows it to return before the writes have actually finished. Note that data does not necessarily all land on disk in the order in which you wrote it! If you care, you should fsync() every time you need guaranteed ordering of the writes. Also note that the semantics allow close() or fsync() to fail even after the write() has succeeded!
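As a rough illustration of the buffering issue, here is a minimal Python sketch that writes a buffer and returns only after fsync() has reported success. The function name durable_write and the file name record.dat are made up for this example; os.open(), os.write(), os.fsync(), and os.close() are thin wrappers over the corresponding system calls.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Write data to path and return only after fsync() reports success.

    durable_write is a hypothetical helper name used for illustration.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while len(view) > 0:
            # write() may accept fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Ask the kernel to push this file's buffered data to the device.
        # An OSError here means the data may NOT be on disk, even though
        # every write() above succeeded.
        os.fsync(fd)
    finally:
        # close() can also raise; it is not a substitute for fsync().
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "record.dat")
        durable_write(target, b"first record\n")
        with open(target, "rb") as f:
            print(f.read())
```

To enforce ordering between two writes, one would call a helper like this for the first write before issuing the second. On some systems a newly created file's directory entry may also need an fsync() on the containing directory; this sketch does not do that.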
Directories can contain other directories, known as subdirectories, creating a hierarchical namespace, or a directory tree. A complete path name may look like /home/rdv/keio/file1.
In Unix terminology, the normal mapping from name to file is a hard link. A regular file can have more than one hard link, or more than one name. A file with more than one hard link is not really deleted until the last link is deleted. Files can also have soft links, which are just name to name mappings that are held in the file system, but which do not participate in the actual management of the file. If the file is deleted but the soft link is not, the soft link is referred to as a dangling reference. One reason for the existence of soft links is to allow linking to directories without violating the requirement that each directory has a single parent. Another is to allow linking across partition boundaries or mount points.
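The link behavior described above can be observed directly. The following Python sketch assumes a Unix-like system and a file system that supports both hard and symbolic links; the function name link_demo and the file names are invented for the example.

```python
import os
import tempfile


def link_demo(workdir: str) -> dict:
    """Create a file with a hard link and a soft link, then delete the original name."""
    original = os.path.join(workdir, "file1")
    hard = os.path.join(workdir, "file1_hard")
    soft = os.path.join(workdir, "file1_soft")

    with open(original, "w") as f:
        f.write("hello\n")

    os.link(original, hard)     # second name for the same file
    os.symlink(original, soft)  # name-to-name mapping only

    results = {
        "links_after_hard_link": os.stat(original).st_nlink,
        "same_inode": os.stat(original).st_ino == os.stat(hard).st_ino,
    }

    # Remove the first name. The file itself survives because one hard
    # link remains.
    os.unlink(original)
    with open(hard) as f:
        results["data_via_hard_link"] = f.read()
    results["links_after_unlink"] = os.stat(hard).st_nlink

    # The soft link still exists as a name, but its target name is gone:
    # it is now a dangling reference.
    results["soft_link_is_link"] = os.path.islink(soft)
    results["soft_link_target_exists"] = os.path.exists(soft)
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        for key, value in link_demo(workdir).items():
            print(key, "=", repr(value))
```

Note that os.path.exists() follows soft links, which is why it reports the dangling link as missing even though os.path.islink() still sees the link itself.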
In a Unix system, there is a single root to the directory tree. Applications and users only rarely have to know on which disk their data is stored. System managers can expand parts of the directory tree by mounting other file systems in any place in the tree.
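Although applications rarely need to know it, a program can find out which mounted file system a path lives on. The sketch below walks up the directory tree until it reaches a mount point; find_mount_point is a hypothetical helper name, and the approach is just one simple way to do this using os.path.ismount().

```python
import os


def find_mount_point(path: str) -> str:
    """Return the nearest ancestor of path that is a mount point."""
    path = os.path.realpath(path)  # resolve soft links first
    while not os.path.ismount(path):
        path = os.path.dirname(path)
    return path  # the root of the tree is always a mount point


if __name__ == "__main__":
    here = os.getcwd()
    print(here, "is on the file system mounted at", find_mount_point(here))
    # st_dev identifies the device holding the file system.
    print("device id:", os.stat(here).st_dev)
```

Two paths whose os.stat() results carry the same st_dev value are on the same file system, which is also the condition under which a hard link between them can be created.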
In many other operating systems, the devices are explicitly named. On Windows, they have names such as C:\RDV\KEIO\file1, where the colon separates the device and the directory. On VMS, it could be SYS$HOME:[rdv.keio]file1.
Note that a name for a file is non-volatile, but not permanent; files can be renamed by users and applications, or the system manager may change the mount point and hence the full path name to a file. This behavior creates problems for long-term tracking of data, and numerous research systems (including Plan 9) have attempted to address this need.
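A small Python sketch can show why a path name is a poor long-term identifier: after a rename within one file system, the name has changed while the underlying file (identified here by its inode number) has not. The helper name rename_keeps_identity and the file names are assumptions made for the example, and the inode comparison assumes a Unix-style file system.

```python
import os
import tempfile


def rename_keeps_identity(workdir: str) -> bool:
    """Rename a file and report whether it is still the same underlying file."""
    old_name = os.path.join(workdir, "draft.txt")
    new_name = os.path.join(workdir, "final.txt")

    with open(old_name, "w") as f:
        f.write("same data, different name\n")

    inode_before = os.stat(old_name).st_ino
    os.rename(old_name, new_name)  # only the directory entry changes
    inode_after = os.stat(new_name).st_ino

    # The old name no longer resolves, so anything that recorded it is stale.
    return inode_before == inode_after and not os.path.exists(old_name)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        print("same file after rename:", rename_keeps_identity(workdir))
```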
Some file systems support case-sensitive file names, others do not. You have probably also noticed that sometimes non-ASCII file names are not printed properly. Most file systems originally assumed ASCII file names, and non-ASCII names are a problem because the character sets are not self-describing. NTFS solves this problem by storing all names in Unicode.
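Permissions, on Unix and its descendants, consist of read, write, and execute, each granted separately to the file's owner, its group, and everyone else. Execute is used both for programs and for directories; on a directory, it grants permission to search (traverse) the directory rather than to run it.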
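The permission bits can be inspected and changed from a program. The following Python sketch assumes a Unix-like system; the helper names permission_string and owner_can_search are invented for illustration, while os.stat(), os.chmod(), and the stat module constants are standard.

```python
import os
import stat
import tempfile


def permission_string(path: str) -> str:
    """Return the mode in the familiar 'ls -l' form, e.g. '-rw-r-----'."""
    return stat.filemode(os.stat(path).st_mode)


def owner_can_search(path: str) -> bool:
    """True if path is a directory whose owner execute (search) bit is set."""
    mode = os.stat(path).st_mode
    return stat.S_ISDIR(mode) and bool(mode & stat.S_IXUSR)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        data_file = os.path.join(workdir, "file1")
        with open(data_file, "w") as f:
            f.write("data\n")
        # Owner: read+write, group: read, others: nothing.
        os.chmod(data_file, 0o640)
        print(permission_string(data_file))

        subdir = os.path.join(workdir, "keio")
        os.mkdir(subdir)
        # On a directory, the x bit means "may search/traverse".
        os.chmod(subdir, 0o750)
        print(permission_string(subdir), owner_can_search(subdir))
```

The octal notation groups the nine bits into three digits, one each for owner, group, and others, so 0o640 reads as rw- r-- ---.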
An important, independent concept is that of an Access Control List, or ACL. An ACL matches specific permissions to either specific users, or to users holding a particular identifier or access token, which may be encrypted, and may be transferable from one user or process to another.
Way back at the beginning, I mentioned file forks but didn't discuss them. Forks were originally developed for the Macintosh, to hold file icons. NTFS has a similar feature called file streams. These forks really blur the boundary between system metadata and user data, and typically are not preserved when files move between systems.