Skip to content

Files, Disk Management


OSTEP: Chapters 36, 37

File: a named collection of bits stored on disk. From the OS' standpoint, the file consists of a bunch of blocks stored on the device. Programmer may actually see a different interface (bytes or records), but this does not matter to the file system (just pack bytes into blocks, unpack them again on reading).

Common addressing patterns:

  • Sequential: information is processed in order, one piece after the other. This is by far the most common mode: e.g. editor writes out new file, compiler compiles it, etc.

  • Random Access: can address any record in the file directly without passing through its predecessors. E.g. the data set for demand paging, also databases.

  • Keyed: search for records with particular values, e.g. hash table, associative database, dictionary. Usually not provided by operating system. TLB is one example of a keyed search.

Modern file systems must address four general problems:

  • Disk Management: efficient use of disk space, fast access to files, sharing of space between several users.

  • Naming: how do users select files?

  • Protection: all users are not equal.

  • Reliability: information must last safely for long periods of time.

Disk Management: how should the disk sectors be used to represent the blocks of a file? The structure used to describe which sectors represent a file is called the file descriptor.

Contiguous allocation:

allocate files like segmented memory (give each disk sector a number from 0 up). Keep a free list of unused areas of the disk. When creating a file, make the user specify its length, allocate all the space at once. Descriptor contains location and size.

Contiguous Disk Allocation

  • Advantages: easy access, both sequential and random. Simple. Few seeks.

  • Drawbacks: horrible fragmentation will preclude large files, hard to predict needs. With interleaved user requests, still cannot eliminate all seeks.

Linked Files:

In the file descriptor, just keep pointer to first block. In each block of file keep pointer to next block. Can also keep a linked list of free blocks for the free list.

Linked Disk Allocation

  • Advantages: Files can be extended, no fragmentation problems. Sequential Access is easy: just chase links. No extra space for the free list.

  • Drawbacks: Random access is virtually impossible. Lots of seeking, even in sequential access. Pointer to next block occupies same block as data.

  • Example: FAT (MSDOS) file system.

Array of Block Pointers (Extents):

Allocate an array to hold pointers to all the blocks, but do not allocate the blocks. Then fill in the pointers dynamically using a free list.

Extent Disk Allocation

  • Advantages: Both sequential and random access are easy.

  • Drawbacks: still have to set maximum file size, and there will be lots of seeks.

DOS FAT File File System:

A single File Allocation Table (FAT) that combines free list info and file allocation info. In file descriptor, keep pointer to first block. A FAT table entry contains either (1) the block number of the next block in the file, (2) a distinguished "end of file" (eof) value, or (3) a distinguished "free" value.

DOS File Allocation Table (FAT)

  • Advantages/Disadvantages: similar to those mentioned above for linked file.

None of these is a very good solution: what is the answer? First, and MOST IMPORTANT: understand the application. How are file systems used?

  • Most files are small.

  • Much of the disk is allocated to large files.

  • Many of the I/O's are made to large files.

Thus, the per-file cost must be low, but the performance of large files must be good. Must allow reasonably fast random access, extensibility.


Copyright © 2013, 2018 Barton P. Miller

Non-University of Wisconsin students and teachers are welcome to print these notes their personal use. Further reproduction requires permission of the author.