File system
A file system is a means of organizing data that is expected to be retained after a program terminates. It provides procedures to store, retrieve and update data, and manages the available space on the device(s) that contain it. A file system organizes data in an efficient manner and is tuned to the specific characteristics of the device. There is usually a tight coupling between the operating system and the file system. Some filesystems provide mechanisms to control access to the data and metadata. Ensuring reliability is a major responsibility of a filesystem. Some filesystems provide a means for multiple programs to update data in the same file at nearly the same time.
Without a filesystem programs would not be able to access data by file name or directory and would need to be able to directly access data regions on a storage device.
File systems are used on data storage devices such as hard disk drives, floppy disks, optical discs, or flash memory storage devices to maintain the physical location of the computer files. They may provide access to data on a file server by acting as clients for a network protocol (e.g. NFS, SMB, or 9P clients), or they may be virtual and exist only as an access method for virtual data (e.g. procfs). This is distinguished from a directory service and registry.
and directories, and keeping track of which areas of the media belong to which file and which are not being used. For example, in Apple DOS of the early 1980s, 256-byte sectors on a 140-kilobyte floppy disk used a track/sector map.
This results in unused space when a file is not an exact multiple of the allocation unit, sometimes referred to as slack space. For a 512-byte allocation, the average unused space is about 256 bytes. For 64 KB clusters, the average unused space is about 32 KB. The size of the allocation unit is chosen when the file system is created. Choosing the allocation size based on the average size of the files expected to be in the filesystem can minimize the amount of unusable space. Frequently the default allocation provides reasonable usage. If it can be anticipated that a file system will contain mostly small files, a small cluster size should be chosen. Choosing an allocation size that is too small results in excessive overhead if the file system will contain mostly very large files.
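The slack-space arithmetic above can be checked with a short sketch. It assumes that file sizes are spread evenly over one allocation unit, which is a simplification rather than a property of any real workload; the function names are illustrative only.

```python
def slack_bytes(file_size: int, unit: int) -> int:
    """Bytes allocated but unused when file_size is rounded up to whole units."""
    allocated = (file_size + unit - 1) // unit * unit
    return allocated - file_size


def average_slack(unit: int) -> float:
    """Average slack over one full cycle of file sizes (1..unit bytes)."""
    return sum(slack_bytes(size, unit) for size in range(1, unit + 1)) / unit


print(slack_bytes(1000, 512))   # 24: two 512-byte units hold a 1000-byte file
print(average_slack(512))       # 255.5, i.e. roughly half an allocation unit
```

Under this assumption the average slack is always just under half the allocation unit, which is why larger clusters waste more space on small files.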
File system fragmentation occurs when unused space or single files are not contiguous. As a filesystem is used, files are created, modified and deleted. When a file is created the filesystem allocates space for the data. Some filesystems permit or require specifying an initial space allocation and subsequent incremental allocations as the file grows. As files are deleted, the space they were allocated is eventually considered available for use by other files. This creates alternating used and unused areas of various sizes. This is free space fragmentation. When a file is created and there is not an area of contiguous space available for its initial allocation, the space must be assigned in fragments. When a file is modified such that it becomes larger, it may exceed the space initially allocated to it; another allocation must then be assigned elsewhere and the file becomes fragmented.
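The sequence described above (create, delete, then create a larger file) can be illustrated with a toy model. This is a sketch only: the "disk" is a Python list of block owners and the allocator simply takes the first free blocks it finds, which is an assumed policy and not that of any particular file system.

```python
def allocate(disk, name, nblocks):
    """Give `name` the first `nblocks` free blocks, contiguous or not."""
    free = [i for i, owner in enumerate(disk) if owner is None]
    if len(free) < nblocks:
        raise OSError("no space left on toy disk")
    for i in free[:nblocks]:
        disk[i] = name


def delete(disk, name):
    """Mark every block owned by `name` as free again."""
    for i, owner in enumerate(disk):
        if owner == name:
            disk[i] = None


def extents(disk, name):
    """Count contiguous runs of blocks owned by `name`."""
    runs, prev = 0, None
    for owner in disk:
        if owner == name and prev != name:
            runs += 1
        prev = owner
    return runs


disk = [None] * 12
allocate(disk, "a", 3)
allocate(disk, "b", 3)
allocate(disk, "c", 3)
delete(disk, "b")            # leaves a 3-block hole between a and c
allocate(disk, "d", 5)       # needs 5 blocks: 3 from the hole, 2 after c
print(extents(disk, "d"))    # 2: file d is stored in two fragments
```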
A file system need not make use of a storage device; it can be used to organize and represent access to any data, whether it is stored or dynamically generated (e.g. procfs).
Most file system interface utilities place restrictions on the characters permitted in the filename restricting some special characters to provide a syntax to indicate a device, device type, directory prefix or file type. These are typically not file system restrictions and utilities may provide a means to refer to files with embedded special characters such as enclosing the entire filename within quotes ("). Avoiding using special characters makes it easier for users to refer to files.
Some filesystem utilities, editors and compilers treat prefixes and suffixes in a special way. These are usually merely conventions and not implemented within the filesystem.
or an inode in a Unix-like file system. Directory structures may be flat (i.e. linear), or allow hierarchies where directories may contain subdirectories. The first file system to support arbitrary hierarchies of directories was the file system in the Multics operating system. The native file systems of Unix-like systems also support arbitrary directory hierarchies, as do, for example, Apple's Hierarchical File System and its successor HFS+ in classic Mac OS (HFS+ is still used in Mac OS X), the FAT file system in MS-DOS 2.0 and later and Microsoft Windows, the NTFS file system in the Windows NT family of operating systems, and the ODS-2 and higher levels of the Files-11 file system in OpenVMS.
of the data contained in a file may be stored as the number of blocks allocated for the file or as a byte count. The time that the file was last modified may be stored as the file's timestamp. File systems might store the file creation time, the time it was last accessed, the time the file's metadata was changed, or the time the file was last backed up. Other information can include the file's device type (e.g. block, character, socket, subdirectory, etc.), its owner user ID and group ID, and its access permission settings (e.g. whether the file is read-only, executable, etc.).
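Much of this metadata can be inspected from a program. The following sketch uses Python's standard `os.stat` call; the helper name `file_metadata` and the choice of fields are illustrative, and the values reported for owner and group are not meaningful on every platform.

```python
import os
import stat
import tempfile
import time


def file_metadata(path):
    """Collect a few of the attributes a file system keeps for `path`."""
    info = os.stat(path)
    return {
        "size": info.st_size,                        # length in bytes
        "modified": time.ctime(info.st_mtime),       # last modification time
        "owner": (info.st_uid, info.st_gid),         # user ID and group ID
        "permissions": stat.filemode(info.st_mode),  # e.g. '-rw-------'
        "is_regular": stat.S_ISREG(info.st_mode),    # file type check
    }


# Demonstration on a throwaway file.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello")
for key, value in file_metadata(f.name).items():
    print(f"{key:12} {value}")
os.remove(f.name)
```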
Additional attributes can be associated on file systems, such as NTFS, XFS, ext2/ext3, some versions of UFS, and HFS+, using extended file attributes. Some file systems provide for user-defined attributes such as the author of the document, the character encoding of a document or the size of an image.
Some file systems allow for different data collections to be associated with one file name. These separate collections may be referred to as streams or forks. Apple has long used a forked file system on the Macintosh, and Microsoft supports streams in NTFS. Some file systems maintain multiple past revisions of a file under a single file name; the filename by itself retrieves the most recent version, while prior saved versions can be accessed using a special naming convention such as "filename;4" or "filename(-4)" to access the version four saves ago.
Directory utilities create, rename and delete directory entries and alter metadata associated with a directory. They may include a means to create additional links to a directory (hard links in Unix), rename parent links (".." in Unix-like OS), and create bidirectional links to files.
File utilities create, list, copy, move and delete files, and alter metadata. They may be able to truncate data, truncate or extend space allocation, append to, move, and modify files in-place. Depending on the underlying structure of the filesystem, they may provide a mechanism to prepend to, or truncate from, the beginning of a file, insert entries into the middle of a file or delete entries from a file.
Also in this category are utilities to free space for deleted files if the filesystem provides an undelete function.
Some filesystems defer reorganization of free space, secure erasing of free space and rebuilding of hierarchical structures. They provide utilities to perform these functions at times of minimal activity. Included in this category is the defragmentation utility.
Some of the most important features of file system utilities involve supervisory activities which may involve bypassing ownership or direct access to the underlying device. These include high performance backup and recovery, data replication and reorganization of various data structures and allocation tables within the filesystem.
s, or capabilities. The need for filesystem utilities to be able to access the data at the media level to reorganize the structures and provide efficient backup usually means that these are only effective for polite users but are not effective against intruders.
See also password cracking.
Methods for encrypting file data are sometimes included in the filesystem. This is very effective since there is no need for filesystem utilities to know the encryption seed to effectively manage the data. The risks of relying on encryption include the fact that an attacker can copy the data and use brute force to decrypt the data. Losing the seed means losing the data.
See also filesystem-level encryption, Encrypting File System.
Other failures which the filesystem must deal with include media failures or loss of connection to remote systems.
In the event of an operating system failure or "soft" power failure, special routines in the filesystem must be invoked similar to when an individual program fails.
The filesystem must also be able to correct damaged structures. These may occur as a result of an operating system failure for which the OS was unable to notify the file system, power failure or reset.
The filesystem must also record events to allow analysis of systemic issues as well as problems with specific files or directories.
Some filesystems accept data for storage as a stream of bytes which are collected and stored in a manner efficient for the media. When a program retrieves the data it specifies the size of a memory buffer and the file system transfers data from the media to the buffer. Sometimes a runtime library routine may allow the user program to define a record based on a library call specifying a length. When the user program reads the data the library retrieves data via the file system and returns a record.
Some filesystems allow the specification of a fixed record length which is used for all writes and reads. This facilitates updating records, because the position of any record can be computed from its number.
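Why a fixed record length facilitates updates can be shown with a small sketch: record n always starts at byte n times the record length, so it can be overwritten without moving its neighbours. The 20-byte layout below (a 4-byte number followed by a 16-byte name) is a hypothetical example, and an in-memory buffer stands in for a file.

```python
import io
import struct

# Hypothetical record layout: 4-byte unsigned id + 16-byte name field.
RECORD = struct.Struct("<I16s")


def write_record(f, index, rec_id, name):
    """Write one record at the offset implied by its index."""
    f.seek(index * RECORD.size)
    f.write(RECORD.pack(rec_id, name.encode("ascii")))


def read_record(f, index):
    """Read one record back and strip the padding from the name field."""
    f.seek(index * RECORD.size)
    rec_id, raw = RECORD.unpack(f.read(RECORD.size))
    return rec_id, raw.rstrip(b"\0").decode("ascii")


f = io.BytesIO()
for i, name in enumerate(["alpha", "beta", "gamma"]):
    write_record(f, i, i + 1, name)
write_record(f, 1, 2, "BETA")    # update record 1 in place
print(read_record(f, 1))         # (2, 'BETA')
print(read_record(f, 2))         # (3, 'gamma'), untouched by the update
```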
An identification for each record, also known as a key, makes for a more sophisticated file system. The user program can read, write and update records without regard to their location. This requires complicated management of blocks of media, usually separating key blocks and data blocks. Very efficient algorithms can be developed with a pyramid structure for locating records.
Another approach is to partition the disk so that several filesystems with different attributes can be used. One filesystem, for use as browser cache, might be configured with a small allocation size. This has the additional advantage of keeping the frantic activity of creating and deleting files typical of browser activity in a narrow area of the disk and not interfering with allocations of other files. A similar partition might be created for email. Another partition and filesystem might be created for the storage of audio or video files with a relatively large allocation. One of the filesystems may normally be set read-only and only periodically be set writable.
Having multiple filesystems on a single system has the additional benefit that in the event of a corruption of a single partition, the remaining filesystems will frequently still be intact. This includes virus destruction of the system partition or even a system that will not boot. Filesystem utilities which require dedicated access can effectively be completed piecemeal; in addition, defragmentation may be more effective. Several system maintenance utilities such as virus scans and backups can also be processed in segments. For example, it is not necessary to back up the filesystem containing videos along with all the other files if none have been added since the last backup.
rates (see Moore's law), so after a few years file systems have kept reaching design limitations that require computer users to repeatedly move to a newer system with ever greater capacity.
File system complexity typically varies proportionally with the available storage capacity. The file systems of early 1980s home computers with 50 KB to 512 KB of storage would not be a reasonable choice for modern storage systems with hundreds of gigabytes of capacity. Likewise, modern file systems would not be a reasonable choice for these early systems, since the complexity of modern file system structures would consume most or all of the very limited capacity of the early storage systems.
(FAT12, FAT16, FAT32, exFAT), NTFS, HFS and HFS+, HPFS, UFS, ext2, ext3, ext4, btrfs, ISO 9660, Files-11, Veritas File System, VMFS, ZFS, ReiserFS and UDF. Some disk file systems are journaling file systems or versioning file systems.
and Universal Disk Format (UDF) are two common formats that target Compact Discs, DVDs and Blu-ray discs. Mount Rainier is an extension to UDF supported by Linux 2.6 series and Windows Vista that facilitates rewriting to DVDs.
devices. Frequently a disk file system can use a flash memory device as the underlying storage media but it is much better to use a filesystem specifically designed for a flash device.
In a disk file system there is typically a master file directory, and a map of used and free data regions. Any file additions, changes, or removals require updating the directory and the used/free maps. Random access to data regions is measured in milliseconds so this system works well for disks.
Tape requires linear motion to wind and unwind potentially very long reels of media. This tape motion may take several seconds to several minutes to move the read/write head from one end of the tape to the other.
Consequently, a master file directory and usage map can be extremely slow and inefficient with tape. Writing typically involves reading the block usage map to find free blocks for writing, updating the usage map and directory to add the data, and then advancing the tape to write the data in the correct spot. Each additional file write requires updating the map and directory and writing the data, which may take several seconds to occur for each file.
Tape file systems instead typically allow for the file directory to be spread across the tape intermixed with the data, referred to as streaming, so that time-consuming and repeated tape motions are not required to write new data.
However, a side effect of this design is that reading the file directory of a tape usually requires scanning the entire tape to read all the scattered directory entries. Most data archiving software that works with tape storage will store a local copy of the tape catalog on a disk file system, so that adding files to a tape can be done quickly without having to rescan the tape media. The local tape catalog copy is usually discarded if not used for a specified period of time, at which point the tape must be re-scanned if it is to be used in the future.
IBM has developed a file system for tape called the Linear Tape File System. The IBM implementation of this file system has been released as the open-source IBM Linear Tape File System — Single Drive Edition (LTFS—SDE) product. The Linear Tape File System uses a separate partition on the tape to record the index meta-data, thereby avoiding the problems associated with scattering directory entries across the entire tape.
Because of the time it can take to format a tape, typically tapes are pre-formatted so that the tape user does not need to spend time preparing each new tape for use. All that is usually necessary is to write an identifying media label to the tape before use, and even this can be automatically written by software when a new tape is used for the first time.
IBM DB2 for i (http://www-03.ibm.com/systems/i/software/db2/index.html), formerly known as DB2/400 and DB2 for i5/OS, is a database file system that is part of the object-based IBM i operating system (http://www.ibm.com/developerworks/ibmi/newto/), formerly known as OS/400 and i5/OS. It incorporates a single-level store, runs on IBM Power Systems (formerly known as AS/400 and iSeries), and was designed by Frank G. Soltis, IBM's former chief scientist for IBM i. From around 1978 to 1988, Soltis and his team at IBM Rochester designed and applied technologies such as the database file system, which others such as Microsoft later failed to deliver (http://www.theregister.co.uk/2002/01/28/xp_successor_longhorn_goes_sql/). These technologies are informally known as 'Fortress Rochester'; in a few basic aspects they extended early mainframe technologies, but in many ways they were more advanced from a technology perspective.
Some other projects that aren't "pure" database file systems but that use some aspects of a database file system:
, the entire system may be left in an unusable state.
Transaction processing introduces the isolation guarantee, which states that operations within a transaction are hidden from other threads on the system until the transaction commits, and that interfering operations on the system will be properly serialized with the transaction.
Transactions also provide the atomicity guarantee, that operations inside of a transaction are either all committed, or the transaction can be aborted and the system discards all of its partial results. This means that if there is a crash or power failure, after recovery, the stored state will be consistent. Either the software will be completely installed or the failed installation will be completely rolled back, but an unusable partial install will not be left on the system.
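On systems without file system transactions, programs often approximate atomicity for a single file by writing to a temporary file and then renaming it over the original. The sketch below illustrates that idiom with Python's standard library; it is not a transactional file system, it covers one file only, and the function name `atomic_write` is illustrative.

```python
import os
import tempfile


def atomic_write(path, data: bytes):
    """Replace `path` so readers see the old or the new content, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)   # same directory, same file system
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())    # push the new bytes to the device first
        os.replace(tmp, path)       # single rename step: the "commit"
    except BaseException:
        os.unlink(tmp)              # "abort": discard the partial result
        raise
```

If the program fails before the rename, the original file is untouched and only the temporary file is discarded, which mirrors the all-or-nothing behaviour described above on a much smaller scale.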
Windows, beginning with Vista, added transaction support to NTFS, abbreviated TxF. TxF is the only commercial implementation of a transactional file system, as transactional file systems are difficult to implement correctly in practice. There are a number of research prototypes of transactional file systems for UNIX systems, including the Valor file system, Amino, LFS, and a transactional ext3 file system on the TxOS kernel, as well as transactional file systems targeting embedded systems, such as TFFS.
Ensuring consistency across multiple file system operations is difficult, if not impossible, without file system transactions. File locking can be used as a concurrency control mechanism for individual files, but it typically does not protect the directory structure or file metadata. For instance, file locking cannot prevent TOCTTOU race conditions on symbolic links. File locking also cannot automatically roll back a failed operation, such as a software upgrade; this requires atomicity.
Journaling file systems are one technique used to introduce transaction-level consistency to file system structures. Journal transactions are not exposed to programs as part of the OS API; they are only used internally to ensure consistency at the granularity of a single system call.
Data backup systems typically do not provide support for direct backup of data stored in a transactional manner, which makes recovery of reliable and consistent data sets difficult. Most backup software simply notes what files have changed since a certain time, regardless of the transactional state shared across multiple files in the overall dataset. As a workaround, some database systems simply produce an archived state file containing all data up to that point, and the backup software only backs that up and does not interact directly with the active transactional databases at all. Recovery requires separate recreation of the database from the state file, after the file has been restored by the backup software.
, SMB protocols, and file-system-like clients for FTP and WebDAV.
from Red Hat, GPFS from IBM, and SFS from DataPlow.
operating systems, but devices are given file names in some non-Unix-like operating systems as well.
systems include devfs and, in Linux 2.6 systems, udev. In non-Unix-like systems, such as TOPS-10 and other operating systems influenced by it, where the full filename or pathname of a file can include a device prefix, devices other than those containing file systems are referred to by a device prefix specifying the device, without anything following it.
. Disk and digital tape devices were too expensive for hobbyists. An inexpensive basic data storage system was devised that used common audio cassette tape.
When the system needed to write data, the user was notified to press "RECORD" on the cassette recorder, then press "RETURN" on the keyboard to notify the system that the cassette recorder was recording. The system wrote a sound to provide time synchronization, then sounds that encoded a prefix, the data, a checksum and a suffix. When the system needed to read data, the user was instructed to press "PLAY" on the cassette recorder. The system would listen to the sounds on the tape, waiting until a burst of sound could be recognized as the synchronization. The system would then interpret subsequent sounds as data. When the data read was complete, the system would notify the user to press "STOP" on the cassette recorder. It was primitive, but it worked (a lot of the time). Data was stored sequentially in an unnamed format. Multiple sets of data could be written and located by fast-forwarding the tape and observing the tape counter to find the approximate start of the next data region on the tape. The user might have to listen to the sounds to find the right spot to begin playing the next data region. Some implementations even included audible sounds interspersed with the data.
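The prefix, data, checksum and suffix framing described above can be sketched in a few lines. Everything specific here is an assumption made for illustration: the marker bytes, the one-byte length field and the sum-modulo-256 checksum are hypothetical and do not reproduce the format of any particular home computer.

```python
PREFIX = b"\x16\x16"   # hypothetical synchronization/prefix bytes
SUFFIX = b"\x04"       # hypothetical end-of-record marker


def checksum(data: bytes) -> int:
    """Simple additive checksum, assumed for this sketch."""
    return sum(data) % 256


def encode(data: bytes) -> bytes:
    """Frame up to 255 bytes as prefix + length + data + checksum + suffix."""
    return PREFIX + bytes([len(data)]) + data + bytes([checksum(data)]) + SUFFIX


def decode(stream: bytes) -> bytes:
    """Skip leading noise, then read one frame and verify its checksum."""
    start = stream.index(PREFIX) + len(PREFIX)
    length = stream[start]
    data = stream[start + 1:start + 1 + length]
    stored = stream[start + 1 + length]
    if checksum(data) != stored:
        raise ValueError("checksum mismatch: the tape must be read again")
    return data


tape = b"\x00\x7f" + encode(b"HELLO")   # two bytes of noise before the frame
print(decode(tape))                      # b'HELLO'
```

A failed checksum corresponds to the "a lot of the time" caveat: the reader can detect a bad read, but recovering means rewinding and trying again.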
.
When floppy disk media was first available this type of filesystem was adequate due to the relatively small amount of data space available. The Apple Macintosh featured a flat file system, the Macintosh File System (MFS). It was unusual in that the file management program (Macintosh Finder) created the illusion of a partially hierarchical filing system on top of MFS. This structure required every file to have a unique name, even if it appeared to be in a separate folder.
While simple, flat file systems become awkward as the number of files grows, and they make it difficult to organize data into related groups of files.
A recent addition to the flat file system family is Amazon's S3, a remote storage service, which is intentionally simplistic to allow users the ability to customize how their data is stored. The only constructs are buckets (imagine a disk drive of unlimited size) and objects (similar, but not identical to the standard concept of a file). Advanced file management is allowed by being able to use nearly any character (including '/') in the object's name, and the ability to select subsets of the bucket's content based on identical prefixes.
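How a flat namespace can imitate folders through name prefixes is easy to show without any network service. In the sketch below a plain Python dictionary stands in for a bucket; it does not use or reproduce the S3 API, and the object names are made up for illustration.

```python
# A flat "bucket": every object lives at the top level under its full name.
bucket = {
    "photos/2011/cat.jpg": b"...",
    "photos/2012/dog.jpg": b"...",
    "notes.txt": b"...",
}


def list_objects(bucket, prefix=""):
    """Return the names of all objects whose name begins with `prefix`."""
    return sorted(name for name in bucket if name.startswith(prefix))


print(list_objects(bucket, "photos/"))       # both photo objects
print(list_objects(bucket, "photos/2012/"))  # ['photos/2012/dog.jpg']
```

The '/' characters carry no meaning to the store itself; the apparent hierarchy exists only in how clients choose and filter names.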
s include support for more than one file system. Sometimes the OS and the filesystem are so tightly interwoven it is difficult to separate out filesystem functions.
There needs to be an interface provided by the operating system software between the user and the file system. This interface can be textual (such as provided by a command line interface, such as the Unix shell, or OpenVMS DCL) or graphical (such as provided by a graphical user interface, such as file browsers). If graphical, the metaphor of the folder, containing documents, other files, and nested folders is often used (see also: directory and folder).
operating systems create a virtual file system, which makes all the files on all the devices appear to exist in a single hierarchy. This means, in those systems, there is one root directory, and every file existing on the system is located under it somewhere. Unix-like systems can use a RAM disk or network shared resource as their root directory.
Unix-like systems assign a device name to each device, but this is not how the files on that device are accessed. Instead, to gain access to files on another device, the operating system must first be informed where in the directory tree those files should appear. This process is called mounting a file system. For example, to access the files on a CD-ROM, one must tell the operating system "Take the file system from this CD-ROM and make it appear under such-and-such directory". The directory given to the operating system is called the mount point; it might, for example, be /media. The /media directory exists on many Unix systems (as specified in the Filesystem Hierarchy Standard) and is intended specifically for use as a mount point for removable media such as CDs, DVDs, USB drives or floppy disks. It may be empty, or it may contain subdirectories for mounting individual devices. Generally, only the administrator (i.e. root user) may authorize the mounting of file systems.
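The effect of mounting can be modelled as a lookup table from mount points to devices, where a path is served by the longest mount point that contains it. The sketch below is a toy model of that idea only; the device names and the `resolve` helper are hypothetical and it does not call any operating system mount facility.

```python
import posixpath


def resolve(mounts, path):
    """Return (device, path inside that device) using the longest matching mount point."""
    path = posixpath.normpath(path)
    best = max(
        (mp for mp in mounts
         if path == mp or path.startswith(mp.rstrip("/") + "/")),
        key=len,
    )
    inner = posixpath.relpath(path, best)
    return mounts[best], "/" if inner == "." else "/" + inner


# Hypothetical mount table: a root file system plus a CD-ROM under /media.
mounts = {"/": "disk0", "/media/cdrom": "cdrom0"}

print(resolve(mounts, "/media/cdrom/docs/readme.txt"))  # ('cdrom0', '/docs/readme.txt')
print(resolve(mounts, "/etc/fstab"))                    # ('disk0', '/etc/fstab')
```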
Unix-like operating systems often include software and tools that assist in the mounting process and provide it with new functionality. Some of these strategies have been coined "auto-mounting" as a reflection of their purpose.
supports many different file systems, but common choices for the system disk include the ext* family (such as ext2, ext3 and ext4), XFS, JFS, ReiserFS and btrfs.
The Solaris operating system in earlier releases defaulted to (non-journaled or non-logging) UFS for bootable and supplementary file systems. Solaris defaulted to, supported, and extended UFS.
Support for other file systems and significant enhancements were added over time, including Veritas Software Corp. (journaling) VxFS, Sun Microsystems (clustering) QFS, Sun Microsystems (journaling) UFS, and Sun Microsystems (open source, poolable, 128-bit, compressible, and error-correcting) ZFS.
Kernel extensions were added to Solaris to allow for bootable Veritas VxFS operation. Logging or journaling was added to UFS in Sun's Solaris 7. Releases of Solaris 10, Solaris Express, OpenSolaris, and other open source variants of the Solaris operating system later supported bootable ZFS.
Logical Volume Management allows for spanning a file system across multiple devices for the purpose of adding redundancy, capacity, and/or throughput. Legacy environments in Solaris may use Solaris Volume Manager (formerly known as Solstice DiskSuite). Multiple operating systems (including Solaris) may use Veritas Volume Manager. Modern Solaris-based operating systems eclipse the need for volume management by using virtual storage pools in ZFS.
uses a file system that it inherited from classic Mac OS called HFS Plus, sometimes called Mac OS Extended. HFS Plus is a metadata-rich and case-preserving file system. Due to the Unix roots of Mac OS X, Unix permissions were added to HFS Plus. Later versions of HFS Plus added journaling to prevent corruption of the file system structure and introduced a number of optimizations to the allocation algorithms in an attempt to defragment files automatically without requiring an external defragmenter.
Filenames can be up to 255 characters. HFS Plus uses Unicode to store filenames. On Mac OS X, the filetype can come from the type code, stored in the file's metadata, or from the filename.
HFS Plus has three kinds of links: Unix-style hard links, Unix-style symbolic links and aliases. Aliases are designed to maintain a link to their original file even if they are moved or renamed; they are not interpreted by the file system itself, but by the File Manager code in userland.
Mac OS X also supports the UFS file system, derived from the BSD Unix Fast File System via NeXTSTEP. However, as of Mac OS X 10.5 (Leopard), Mac OS X can no longer be installed on a UFS volume, nor can a pre-Leopard system installed on a UFS volume be upgraded to Leopard.
Newer versions of Mac OS X are capable of reading and writing to the legacy FAT file systems (FAT16 and FAT32). They are capable of reading NTFS file systems. Writing to NTFS is supported only on Mac OS X 10.6 (Snow Leopard) and later, and only after a non-trivial system setting change; third-party software exists that automates this. Third-party software is still necessary to write to the NTFS file system on Mac OS X versions prior to 10.6 (Snow Leopard).
treats everything as a file, and accessed as a file would be (i.e., no ioctl or mmap)
networking, graphics, debugging, authentication, capabilities, encryption, and other services are accessed via I/O operations on file descriptors.
The 9P protocol removes the difference between local and remote files
These file systems are organized with the help of private, per-process namespaces, allowing each process to have a different view of the many file systems that provide resources in a distributed system.
The Inferno operating system shares these concepts with Plan 9.
and NTFS file systems.
Windows uses a drive letter abstraction at the user level to distinguish one disk or partition from another. For example, the path C:\WINDOWS represents a directory WINDOWS on the partition represented by the letter C. The C drive is most commonly used for the primary hard disk partition, on which Windows is usually installed and from which it boots. This "tradition" has become so firmly ingrained that bugs arose in older applications which assumed that the operating system was installed on drive C. The use of drive letters, and the tradition of using "C" as the drive letter for the primary hard disk partition, can be traced to MS-DOS, where the letters A and B were reserved for up to two floppy disk drives. This in turn derived from CP/M in the 1970s, and ultimately from IBM's CP/CMS of 1967.
Network drives may also be mapped to drive letters.
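The drive-letter convention can be examined with Python's standard `pathlib` module, whose `PureWindowsPath` class parses Windows-style paths on any platform without touching a real disk. The paths below are examples only.

```python
from pathlib import PureWindowsPath

local = PureWindowsPath(r"C:\WINDOWS\system32")
print(local.drive)    # C:
print(local.parts)    # ('C:\\', 'WINDOWS', 'system32')

# A network share addressed directly, before any drive letter is mapped to it.
share = PureWindowsPath(r"\\server\share\report.txt")
print(share.drive)    # \\server\share
```

Mapping a network drive amounts to giving such a share a drive letter, so that programs written to expect paths like C:\WINDOWS can address it in the same way.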
(FAT) filing system, supported by all versions of Microsoft Windows, was an evolution of that used in Microsoft's earlier operating system (MS-DOS, which in turn was based on 86-DOS). FAT ultimately traces its roots back to the short-lived M-DOS project and Standalone disk BASIC before it. Over the years various features have been added to it, inspired by similar features found on file systems used by operating systems such as Unix.
Older versions of the FAT file system (FAT12 and FAT16) had file name length limits, a limit on the number of entries in the root directory of the file system, and restrictions on the maximum size of FAT-formatted disks or partitions. Specifically, FAT12 and FAT16 had a limit of 8 characters for the file name, and 3 characters for the extension (such as .exe). This is commonly referred to as the 8.3 filename limit. VFAT, which was an extension to FAT12 and FAT16 introduced in Windows NT 3.5 and subsequently included in Windows 95, allowed long file names (LFN).
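The 8.3 limit can be expressed as a simple length check. The sketch below tests only the lengths stated above (a base name of up to 8 characters and an extension of up to 3); it deliberately ignores the rules about which characters are permitted, and the function name is illustrative.

```python
def fits_8_3(filename: str) -> bool:
    """True if the name has a 1-8 character base and an extension of at most 3.

    Only lengths are checked; character-set rules are not modeled.
    """
    base, _, ext = filename.partition(".")
    if "." in ext:                 # more than one dot cannot fit the 8.3 form
        return False
    return 1 <= len(base) <= 8 and len(ext) <= 3


for name in ["AUTOEXEC.BAT", "README.TXT", "longfilename.txt", "index.html"]:
    print(name, fits_8_3(name))
```

The first two names fit the limit; the third has a base name that is too long and the fourth has a four-character extension, which is the kind of name that long file name support was introduced to allow.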
FAT32 also addressed many of the limits in FAT12 and FAT16, but remains limited compared to NTFS.
exFAT (also known as FAT64) is the newest iteration of FAT, with certain advantages over NTFS with regard to file system overhead. exFAT is only compatible with newer Windows systems, such as Windows Server 2003, Windows Vista, Windows Server 2008 and Windows 7; more recently, support has been added for Windows XP.
, introduced with the Windows NT operating system, allowed ACL-based permission control. Other features also supported by NTFS include hard links, multiple file streams, attribute indexing, quota tracking, sparse files, encryption, compression, and reparse points (directories working as mount-points for other file systems, symlinks, junctions, remote storage links), though not all these features are well-documented.
, and converted back until the undo information is deleted. These conversions are possible due to using the same format for the file data itself, and relocating the metadata into empty space, in some cases using sparse file support.
For example, to migrate a FAT32 filesystem to an ext2 filesystem, first create a new ext2 filesystem, then copy the data to it, then delete the FAT32 filesystem.
An alternative, when there is not sufficient space to retain the original filesystem until the new one is created, is to use a work area (such as removable media). This takes longer, but a backup of the data is a useful side effect.
Filesystems also have a limit on the length of an individual filename.
Copying files with long names or located in paths of significant depth from one filesystem to another may cause undesirable results. This depends on how the utility doing the copying handles the discrepancy. See also pathmunge
Space management
File systems allocate space in a granular manner, usually multiple physical units on the device. The file system is responsible for organizing files and directories, and keeping track of which areas of the media belong to which file and which are not being used. For example, in Apple DOS of the early 1980s, 256-byte sectors on 140 kilobyte floppy disks used a track/sector map.
This results in unused space when a file is not an exact multiple of the allocation unit, sometimes referred to as slack space. For a 512-byte allocation, the average unused space is 255 bytes. For 64 KB clusters, the average unused space is 32 KB. The size of the allocation unit is chosen when the file system is created. Choosing the allocation size based on the average size of the files expected to be in the filesystem can minimize the amount of unusable space. Frequently the default allocation may provide reasonable usage. If it can be anticipated that a file system will contain mostly small files, a small cluster size should be chosen. Choosing an allocation size that is too small results in excessive overhead if the file system will contain mostly very large files.
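The slack-space figures can be reproduced with a few lines of Python. This is a sketch under one stated assumption: file sizes are spread uniformly over one allocation unit. The function names are illustrative.

```python
def slack_bytes(file_size, unit):
    """Bytes allocated to a file but not used by its data."""
    allocated = -(-file_size // unit) * unit  # round up to a whole unit
    return allocated - file_size


def average_slack(unit):
    """Average slack over file sizes 1..unit (uniform sizes assumed)."""
    return sum(slack_bytes(size, unit) for size in range(1, unit + 1)) / unit
```

Under that assumption `average_slack(512)` gives 255.5 bytes and `average_slack(64 * 1024)` gives 32767.5 bytes, that is, roughly half an allocation unit per file.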
File system fragmentation occurs when unused space or single files are not contiguous. As a filesystem is used, files are created, modified and deleted. When a file is created the filesystem allocates space for the data. Some filesystems permit or require specifying an initial space allocation and subsequent incremental allocations as the file grows. As files are deleted, the space they were allocated is eventually considered available for use by other files. This creates alternating used and unused areas of various sizes. This is free space fragmentation. When a file is created and there is not an area of contiguous space available for its initial allocation, the space must be assigned in fragments. When a file is modified such that it becomes larger, it may exceed the space initially allocated to it; another allocation must then be assigned elsewhere, and the file becomes fragmented.
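The sequence described above can be reproduced with a toy model. The sketch below treats a disk as a Python list of blocks and uses a deliberately naive allocator that claims the lowest-numbered free blocks; real filesystems use more sophisticated strategies, and all names here are illustrative.

```python
def allocate(disk, name, nblocks):
    """Claim the first nblocks free blocks, contiguous or not."""
    free = [i for i, owner in enumerate(disk) if owner is None]
    if len(free) < nblocks:
        raise ValueError("not enough free space")
    for i in free[:nblocks]:
        disk[i] = name
    return free[:nblocks]


def delete(disk, name):
    """Mark every block owned by name as free again."""
    for i, owner in enumerate(disk):
        if owner == name:
            disk[i] = None


def fragments(disk, name):
    """Count the contiguous runs of blocks owned by name."""
    runs, previous = 0, None
    for owner in disk:
        if owner == name and previous != name:
            runs += 1
        previous = owner
    return runs


def demo():
    disk = [None] * 10
    allocate(disk, "A", 3)
    allocate(disk, "B", 3)
    allocate(disk, "C", 2)
    delete(disk, "B")       # leaves a 3-block hole between A and C
    allocate(disk, "D", 4)  # fills the hole, then spills past C
    return disk
```

After `demo()`, file D occupies the hole left by B plus one block after C, so `fragments(disk, "D")` reports two separate runs.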
A file system need not make use of a storage device; it can be used to organize and represent access to any data, whether it is stored or dynamically generated (e.g. procfs).
File names
A file name (or filename) is used to reference the storage location in the filesystem. Most filesystems have restrictions on the length of the filename. Some filesystems have case-insensitive filenames.
Most file system interface utilities place restrictions on the characters permitted in the filename, reserving some special characters to provide a syntax to indicate a device, device type, directory prefix or file type. These are typically not file system restrictions, and utilities may provide a means to refer to files with embedded special characters, such as enclosing the entire filename within quotes ("). Avoiding special characters makes it easier for users to refer to files.
Some filesystem utilities, editors and compilers treat prefixes and suffixes in a special way. These are usually merely conventions and not implemented within the filesystem.
Directories
File systems typically have directories (sometimes called folders) which allow the user to group files. This may be implemented by connecting the file name to an index in a table of contents or an inode in a Unix-like file system. Directory structures may be flat (i.e. linear), or allow hierarchies where directories may contain subdirectories. The first file system to support arbitrary hierarchies of directories was the file system in the Multics operating system. The native file systems of Unix-like systems also support arbitrary directory hierarchies, as do, for example, Apple's Hierarchical File System and its successor HFS+ in classic Mac OS (HFS+ is still used in Mac OS X), the FAT file system in MS-DOS 2.0 and later and Microsoft Windows, the NTFS file system in the Windows NT family of operating systems, and the ODS-2 and higher levels of the Files-11 file system in OpenVMS.
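The idea of a directory as a mapping from names to entries in an index can be sketched in Python. The class below is a toy in-memory model loosely inspired by the inode approach; it is not the on-disk layout of any real file system, and every name in it is hypothetical.

```python
class ToyFS:
    """Toy model: each directory maps names to inode numbers."""

    def __init__(self):
        # Inode 0 is the root directory (an arbitrary choice for this sketch).
        self.inodes = {0: {"type": "dir", "entries": {}}}
        self.next_inode = 1

    def lookup(self, path):
        """Walk the path one component at a time; return the inode number."""
        inode = 0
        for part in [p for p in path.split("/") if p]:
            node = self.inodes[inode]
            if node["type"] != "dir":
                raise NotADirectoryError(path)
            inode = node["entries"][part]  # KeyError if the name is absent
        return inode

    def _add(self, path, node):
        parent_path, _, name = path.rstrip("/").rpartition("/")
        parent = self.inodes[self.lookup(parent_path)]
        if parent["type"] != "dir":
            raise NotADirectoryError(parent_path)
        if name in parent["entries"]:
            raise FileExistsError(path)
        number = self.next_inode
        self.next_inode += 1
        self.inodes[number] = node
        parent["entries"][name] = number
        return number

    def mkdir(self, path):
        return self._add(path, {"type": "dir", "entries": {}})

    def create(self, path, data=b""):
        return self._add(path, {"type": "file", "data": data})
```

Because a directory is itself just another entry in the table, subdirectories can nest to arbitrary depth; a flat file system would correspond to allowing files only in the root directory.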
Metadata
Other bookkeeping information is typically associated with each file within a file system. The length of the data contained in a file may be stored as the number of blocks allocated for the file or as a byte count. The time that the file was last modified may be stored as the file's timestamp. File systems might store the file creation time, the time it was last accessed, the time the file's metadata was changed, or the time the file was last backed up. Other information can include the file's device type (e.g. block, character, socket, subdirectory, etc.), its owner user ID and group ID, and its access permission settings (e.g. whether the file is read-only, executable, etc.).
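Much of this metadata can be inspected from a program. The sketch below uses Python's standard `os.stat` call; the `describe` helper and its dictionary keys are names chosen for this example, and which fields carry meaningful values depends on the operating system and file system.

```python
import os
import stat
import tempfile


def describe(path):
    """Collect a few pieces of per-file metadata reported by the OS."""
    st = os.stat(path)
    return {
        "size_bytes": st.st_size,              # byte count of the data
        "modified": st.st_mtime,               # last modification time
        "accessed": st.st_atime,               # last access time
        "owner_uid": st.st_uid,                # owner user ID
        "group_gid": st.st_gid,                # group ID
        "is_directory": stat.S_ISDIR(st.st_mode),
        "permissions": stat.filemode(st.st_mode),  # e.g. '-rw-r--r--'
    }


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"hello")
        tmp.flush()
        print(describe(tmp.name))
```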
Additional attributes can be associated on file systems, such as NTFS, XFS, ext2/ext3, some versions of UFS, and HFS+, using extended file attributes. Some file systems provide for user-defined attributes such as the author of the document, the character encoding of a document or the size of an image.
Some file systems allow for different data collections to be associated with one file name. These separate collections may be referred to as streams or forks. Apple has long used a forked file system on the Macintosh, and Microsoft supports streams in NTFS. Some file systems maintain multiple past revisions of a file under a single file name; the filename by itself retrieves the most recent version, while prior saved versions can be accessed using a special naming convention such as "filename;4" or "filename(-4)" to access the version four saves ago.
Utilities
File systems include utilities to initialize, alter parameters of and remove an instance of the filesystem. Some include the ability to extend or truncate the space allocated to the file system.
Directory utilities create, rename and delete directory entries and alter metadata associated with a directory. They may include a means to create additional links to a directory (hard links in Unix), rename parent links (".." in Unix-like OS), and create bidirectional links to files.
File utilities create, list, copy, move and delete files, and alter metadata. They may be able to truncate data, truncate or extend space allocation, append to, move, and modify files in-place. Depending on the underlying structure of the filesystem, they may provide a mechanism to prepend to, or truncate from, the beginning of a file, insert entries into the middle of a file or delete entries from a file.
Also in this category are utilities to free space for deleted files if the filesystem provides an undelete function.
Some filesystems defer reorganization of free space, secure erasing of free space and rebuilding of hierarchical structures. They provide utilities to perform these functions at times of minimal activity. Included in this category is the infamous defragmentation utility.
Some of the most important features of file system utilities involve supervisory activities which may involve bypassing ownership or direct access to the underlying device. These include high performance backup and recovery, data replication and reorganization of various data structures and allocation tables within the filesystem.
Restricting and permitting access
There are several mechanisms used by file systems to control access to data. Usually the intent is to prevent reading or modifying files by a user or group of users. Another reason is to ensure data is modified in a controlled way, so access may be restricted to a specific program. Examples include passwords stored in the metadata of the file or elsewhere and file permissions in the form of permission bits, access control lists, or capabilities. The need for filesystem utilities to be able to access the data at the media level to reorganize the structures and provide efficient backup usually means that these are only effective for polite users but are not effective against intruders.
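Permission bits are the simplest of these mechanisms to illustrate. The following Python sketch models the classic Unix owner/group/other check using constants from the standard `stat` module. It is a simplified model written for this example: it ignores access control lists, capabilities and the superuser, and the function name and parameters are hypothetical.

```python
import stat

# Bit masks for each kind of access, in (owner, group, other) order.
_BITS = {
    "r": (stat.S_IRUSR, stat.S_IRGRP, stat.S_IROTH),
    "w": (stat.S_IWUSR, stat.S_IWGRP, stat.S_IWOTH),
    "x": (stat.S_IXUSR, stat.S_IXGRP, stat.S_IXOTH),
}


def may_access(mode, file_uid, file_gid, uid, gids, want):
    """Decide whether user uid (member of groups gids) gets access want.

    want is 'r', 'w' or 'x'. Exactly one class applies: owner first,
    then group, then everyone else.
    """
    owner_bit, group_bit, other_bit = _BITS[want]
    if uid == file_uid:
        return bool(mode & owner_bit)
    if file_gid in gids:
        return bool(mode & group_bit)
    return bool(mode & other_bit)
```

With mode `0o640`, for example, the owner may read and write, members of the file's group may only read, and everyone else is refused. As the paragraph above notes, such checks only bind programs that go through the file system; they do nothing against someone reading the media directly.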
See also password cracking.
Methods for encrypting file data are sometimes included in the filesystem. This is very effective since there is no need for filesystem utilities to know the encryption seed to effectively manage the data. The risks of relying on encryption include the fact that an attacker can copy the data and use brute force to decrypt the data. Losing the seed means losing the data.
See also filesystem-level encryption, Encrypting File System.
Maintaining integrity
One of the filesystem's significant responsibilities is to ensure that, regardless of the actions by programs accessing the data, the structure remains consistent. This includes actions taken if a program modifying data terminates abnormally or neglects to inform the filesystem that it has completed its activities. This may include updating the metadata, the directory entry and handling any data that was buffered but not yet updated on the physical storage media.
Other failures which the filesystem must deal with include media failures or loss of connection to remote systems.
In the event of an operating system failure or "soft" power failure, special routines in the filesystem must be invoked similar to when an individual program fails.
The filesystem must also be able to correct damaged structures. These may occur as a result of an operating system failure for which the OS was unable to notify the file system, power failure or reset.
The filesystem must also record events to allow analysis of systemic issues as well as problems with specific files or directories.
User data
The most important purpose of a filesystem is to manage user data. This includes storing, retrieving and updating data.
Some filesystems accept data for storage as a stream of bytes which are collected and stored in a manner efficient for the media. When a program retrieves the data it specifies the size of a memory buffer and the file system transfers data from the media to the buffer. Sometimes a runtime library routine may allow the user program to define a record based on a library call specifying a length. When the user program reads the data the library retrieves data via the file system and returns a record.
Some filesystems allow the specification of a fixed record length which is used for all writes and reads. This facilitates updating records.
An identification for each record, also known as a key, makes for a more sophisticated file system. The user program can read, write and update records without regard to their location. This requires complicated management of blocks of media, usually separating key blocks and data blocks. Very efficient algorithms can be developed with a pyramid structure for locating records.
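Fixed-length records are easy to update because the position of record n is simply n times the record length. The Python sketch below shows this on top of an ordinary byte stream, using the standard `struct` module. The 20-byte layout (a 4-byte key followed by a 16-byte name) is an assumption made for this example, not a format used by any particular file system.

```python
import io
import struct

# Assumed layout: little-endian 4-byte unsigned key + 16-byte name field.
RECORD = struct.Struct("<I16s")


def write_record(f, index, key, name):
    """Write (or overwrite) the record at position index."""
    f.seek(index * RECORD.size)
    # The 16s field is padded with zero bytes (or truncated) to fit.
    f.write(RECORD.pack(key, name.encode("utf-8")))


def read_record(f, index):
    """Read the record at position index without scanning earlier ones."""
    f.seek(index * RECORD.size)
    key, raw = RECORD.unpack(f.read(RECORD.size))
    return key, raw.rstrip(b"\0").decode("utf-8")


if __name__ == "__main__":
    f = io.BytesIO()  # stands in for a file opened in binary mode
    write_record(f, 0, 1, "alice")
    write_record(f, 1, 2, "bob")
    write_record(f, 0, 1, "carol")  # update in place
    print(read_record(f, 0), read_record(f, 1))
```

Keyed access goes one step further: instead of computing the offset from a record number, an index structure maps each key to the location of its record.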
Using a filesystem
Utilities, language-specific run-time libraries and user programs use file system APIs to make requests of the file system. These include data transfer, positioning, updating metadata, managing directories, managing access specifications and removal.
Multiple filesystems within a single system
Frequently retail systems are configured with a single filesystem occupying the entire hard disk.
Another approach is to partition the disk so that several filesystems with different attributes can be used. One filesystem, for use as browser cache, might be configured with a small allocation size. This has the additional advantage of keeping the frantic activity of creating and deleting files typical of browser activity in a narrow area of the disk and not interfering with allocations of other files. A similar partition might be created for email. Another partition and filesystem might be created for the storage of audio or video files with a relatively large allocation size. One of the filesystems may normally be set read-only and only periodically be set writable.
Having multiple filesystems on a single system has the additional benefit that, in the event of corruption of a single partition, the remaining filesystems will frequently still be intact. This includes virus destruction of the system partition or even a system that will not boot. Filesystem utilities which require dedicated access can effectively be completed piecemeal; in addition, defragmentation may be more effective. Several system maintenance utilities such as virus scans and backups can also be processed in segments. For example, it is not necessary to back up the filesystem containing videos along with all the other files if none have been added since the last backup.
Design limitations
All file systems have some functional limit that defines the maximum storable data capacity within that system. These functional limits are a best-guess effort by the designer to determine how large the storage systems will be right now, and how large storage systems are likely to become in the future. Disk storage has continued to increase at near exponential rates (see Moore's law), so after a few years file systems have kept reaching design limitations that require computer users to repeatedly move to a newer system with ever greater capacity.
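One common source of such limits is the width of the fields used to address allocation units. The arithmetic can be sketched in Python as follows; the parameter values in the examples are hypothetical and are not taken from the specification of any particular file system.

```python
def max_volume_bytes(address_bits, block_size):
    """Largest volume addressable with address_bits-wide block numbers."""
    return (2 ** address_bits) * block_size


# Hypothetical designs:
SIXTEEN_BIT_32K = max_volume_bytes(16, 32 * 1024)  # 2 GiB
THIRTY_TWO_BIT_512 = max_volume_bytes(32, 512)     # 2 TiB
```

Each additional address bit doubles the ceiling, which is why a limit that looks generous when a file system is designed can be reached within a few hardware generations.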
File system complexity typically varies proportionally with the available storage capacity. The file systems of early 1980s home computers with 50 KB to 512 KB of storage would not be a reasonable choice for modern storage systems with hundreds of gigabytes of capacity. Likewise, modern file systems would not be a reasonable choice for these early systems, since the complexity of modern file system structures would consume most or all of the very limited capacity of the early storage systems.
Types of file systems
File system types can be classified into disk/tape file systems, network file systems and special purpose file systems.
Disk file systems
A disk file system takes advantages of the ability to randomly address data on a disk storage media in a short amount of time. Additional considerations include the speed of accessing data following that initially requested and the anticipation that the following data may also be requested. This permits multiple users (or processes) access to various data on the disk without regard to the sequential location of the data. Examples include FATFile Allocation Table
File Allocation Table is a computer file system architecture now widely used on many computer systems and most memory cards, such as those used with digital cameras. FAT file systems are commonly found on floppy disks, flash memory cards, digital cameras, and many other portable devices because of...
(FAT12, FAT16, FAT32, exFAT), NTFS, HFS and HFS+, HPFS, UFS, ext2, ext3, ext4, btrfs, ISO 9660, Files-11, Veritas File System, VMFS, ZFS, ReiserFS and UDF. Some disk file systems are journaling file systems or versioning file systems.
Optical discs
ISO 9660 and Universal Disk Format (UDF) are two common formats that target Compact Discs, DVDs and Blu-ray discs. Mount Rainier is an extension to UDF supported by Linux 2.6 series and Windows Vista that facilitates rewriting to DVDs.
Flash file systems
A flash file system considers the special abilities, performance and restrictions of flash memory devices. Frequently a disk file system can use a flash memory device as the underlying storage media but it is much better to use a filesystem specifically designed for a flash device.
Tape file systems
A tape file system is a file system and tape format designed to store files on tape in a self-describing form. Magnetic tapes are sequential storage media with significantly longer random data access times than disks, posing challenges to the creation and efficient management of a general-purpose file system.
In a disk file system there is typically a master file directory, and a map of used and free data regions. Any file additions, changes, or removals require updating the directory and the used/free maps. Random access to data regions is measured in milliseconds so this system works well for disks.
Tape requires linear motion to wind and unwind potentially very long reels of media. This tape motion may take several seconds to several minutes to move the read/write head from one end of the tape to the other.
Consequently, a master file directory and usage map can be extremely slow and inefficient with tape. Writing typically involves reading the block usage map to find free blocks for writing, updating the usage map and directory to add the data, and then advancing the tape to write the data in the correct spot. Each additional file write requires updating the map and directory and writing the data, which may take several seconds to occur for each file.
Tape file systems instead typically allow for the file directory to be spread across the tape intermixed with the data, referred to as streaming, so that time-consuming and repeated tape motions are not required to write new data.
However, a side effect of this design is that reading the file directory of a tape usually requires scanning the entire tape to read all the scattered directory entries. Most data archiving software that works with tape storage will store a local copy of the tape catalog on a disk file system, so that adding files to a tape can be done quickly without having to rescan the tape media. The local tape catalog copy is usually discarded if not used for a specified period of time, at which point the tape must be re-scanned if it is to be used in the future.
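The trade-off described above can be illustrated with a toy model. This is only a sketch: the record layout (tagged tuples in a Python list standing in for the tape) and the function names are assumptions made for illustration, not a real tape format.

```python
# Toy model of a "streaming" tape layout: each file is appended as a
# directory record immediately followed by its data record.
# The record layout and names here are illustrative assumptions.

def append_file(tape, name, data):
    """Append a file without seeking: directory record first, then the data."""
    tape.append(("DIR", name, len(data)))
    tape.append(("DATA", data))

def scan_catalog(tape):
    """Rebuild the catalog by reading the whole tape front to back."""
    catalog = {}
    for position, record in enumerate(tape):
        if record[0] == "DIR":
            _, name, size = record
            # The data record sits right after its directory record.
            catalog[name] = {"size": size, "data_position": position + 1}
    return catalog

tape = []
append_file(tape, "a.txt", b"hello")
append_file(tape, "b.txt", b"tape data")
catalog = scan_catalog(tape)
print(catalog["b.txt"])
```

Appending never revisits earlier records, but `scan_catalog` has to visit every record, which is why archiving software tends to keep a cached copy of the catalog on disk.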
IBM has developed a file system for tape called the Linear Tape File System. The IBM implementation of this file system has been released as the open-source IBM Linear Tape File System — Single Drive Edition (LTFS—SDE) product. The Linear Tape File System uses a separate partition on the tape to record the index meta-data, thereby avoiding the problems associated with scattering directory entries across the entire tape.
Tape formatting
Writing data to a tape is often a significantly time-consuming process that may take several hours. Similarly, completely erasing or formatting a tape can also take several hours. With many data tape technologies it is not necessary to format the tape before over-writing new data to the tape. This is due to the inherently destructive nature of overwriting data on sequential media.
Because of the time it can take to format a tape, typically tapes are pre-formatted so that the tape user does not need to spend time preparing each new tape for use. All that is usually necessary is to write an identifying media label to the tape before use, and even this can be automatically written by software when a new tape is used for the first time.
Database file systems
Another concept for file management is the idea of a database-based file system. Instead of, or in addition to, hierarchical structured management, files are identified by their characteristics, like type of file, topic, author, or similar rich metadata (http://www.theregister.co.uk/2002/03/29/windows_on_a_database_sliced/).
IBM DB2 for i (http://www-03.ibm.com/systems/i/software/db2/index.html), formerly known as DB2/400 and DB2 for i5/OS, is a database file system that is part of the object-based IBM i operating system (http://www.ibm.com/developerworks/ibmi/newto/), formerly known as OS/400 and i5/OS. It incorporates a single-level store and runs on IBM Power Systems (formerly known as AS/400 and iSeries). The system was designed by Frank G. Soltis, IBM's former chief scientist for IBM i. From around 1978 to 1988, Soltis and his team at IBM Rochester designed and applied technologies such as the database file system, which others, such as Microsoft, later failed to deliver (http://www.theregister.co.uk/2002/01/28/xp_successor_longhorn_goes_sql/). These technologies are informally known as 'Fortress Rochester'. In a few basic aspects they were extended from early mainframe technologies, but in many ways they were more advanced from a technology perspective.
Some other projects that aren't "pure" database file systems but that use some aspects of a database file system:
- Many Web content management systems (Web CMS) use a relational DBMS to store and retrieve files. For example, XHTML files are stored as XML or text fields and image files are stored as blob fields; SQL SELECT (with optional XPath) statements retrieve the files, and allow the use of sophisticated logic and richer information associations than "usual file systems".
- Very large file systems, embodied by applications like Apache Hadoop and Google File System, use some database file system concepts.
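The idea of locating files by their characteristics rather than by a path can be sketched with Python's built-in `sqlite3` module. The table layout, column names and helper functions below are assumptions chosen for illustration; they do not describe how any particular CMS or database file system is implemented.

```python
import sqlite3

# In-memory database standing in for a database-backed file store.
# Table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE files ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT, type TEXT, author TEXT, topic TEXT,"
    " content BLOB)"
)

def store(name, type_, author, topic, content):
    """Store file content as a blob alongside its descriptive metadata."""
    conn.execute(
        "INSERT INTO files (name, type, author, topic, content)"
        " VALUES (?, ?, ?, ?, ?)",
        (name, type_, author, topic, content),
    )

def find(**criteria):
    """Return names of files whose metadata matches every criterion."""
    allowed = {"name", "type", "author", "topic"}
    if not set(criteria) <= allowed:
        raise ValueError("unknown metadata field")
    where = " AND ".join(f"{column} = ?" for column in criteria) or "1"
    rows = conn.execute(
        f"SELECT name FROM files WHERE {where} ORDER BY name",
        tuple(criteria.values()),
    )
    return [row[0] for row in rows]

store("logo.png", "image", "alice", "branding", b"\x89PNG...")
store("index.xhtml", "page", "alice", "home", b"<html/>")
store("notes.txt", "text", "bob", "home", b"todo")

print(find(author="alice"))
print(find(topic="home", type="text"))
```

Here a file has no location in a hierarchy at all; the same file can be reached through any combination of its attributes.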
Transactional file systems
Some programs need to update multiple files "all at once". For example, a software installation may write program binaries, libraries, and configuration files. If the software installation fails, the program may be unusable. If the installation is upgrading a key system utility, such as the command shell, the entire system may be left in an unusable state.
Transaction processing introduces the isolation guarantee, which states that operations within a transaction are hidden from other threads on the system until the transaction commits, and that interfering operations on the system will be properly serialized with the transaction.
Transactions also provide the atomicity guarantee, that operations inside of a transaction are either all committed, or the transaction can be aborted and the system discards all of its partial results. This means that if there is a crash or power failure, after recovery, the stored state will be consistent. Either the software will be completely installed or the failed installation will be completely rolled back, but an unusable partial install will not be left on the system.
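The all-or-nothing outcome can be approximated at application level for the narrow case of installing into a fresh directory. The sketch below is not a transactional file system: it stages files in a temporary directory and relies on a rename within one file system being atomic. The function name and file names are hypothetical, and durability steps such as `fsync` are omitted.

```python
import os
import shutil
import tempfile

def install_all_or_nothing(target_dir, files):
    """Write every file into a staging directory, then rename it into place.

    `files` maps relative file names to bytes. If anything fails before the
    rename, the staging directory is removed and `target_dir` never appears.
    """
    parent = os.path.dirname(os.path.abspath(target_dir))
    staging = tempfile.mkdtemp(dir=parent, prefix=".staging-")
    try:
        for name, content in files.items():
            with open(os.path.join(staging, name), "wb") as handle:
                handle.write(content)
        # A rename within one file system is atomic on POSIX systems.
        os.rename(staging, target_dir)
    except BaseException:
        shutil.rmtree(staging, ignore_errors=True)
        raise

base = tempfile.mkdtemp()
good = os.path.join(base, "app-1.0")
install_all_or_nothing(good, {"app.bin": b"\x00", "app.conf": b"x=1\n"})

bad = os.path.join(base, "app-2.0")
try:
    # A str instead of bytes makes the second write fail part-way through.
    install_all_or_nothing(bad, {"app.bin": b"\x00", "app.conf": "oops"})
except TypeError:
    pass

print(sorted(os.listdir(base)))
```

The failed installation leaves no partial `app-2.0` directory behind. Upgrading files in place across several directories cannot be handled this way, which is the gap that file system transactions are meant to close.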
Windows, beginning with Vista, added transaction support to NTFS in a component called Transactional NTFS, abbreviated TxF. TxF is the only commercial implementation of a transactional file system, as transactional file systems are difficult to implement correctly in practice. There are a number of research prototypes of transactional file systems for UNIX systems, including the Valor file system, Amino, LFS, and a transactional ext3 file system on the TxOS kernel, as well as transactional file systems targeting embedded systems, such as TFFS.
Ensuring consistency across multiple file system operations is difficult, if not impossible, without file system transactions. File locking can be used as a concurrency control mechanism for individual files, but it typically does not protect the directory structure or file metadata. For instance, file locking cannot prevent TOCTTOU race conditions on symbolic links. File locking also cannot automatically roll back a failed operation, such as a software upgrade; this requires atomicity.
Journaling file systems are one technique used to introduce transaction-level consistency to file system structures. Journal transactions are not exposed to programs as part of the OS API; they are only used internally to ensure consistency at the granularity of a single system call.
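The principle behind a journal can be sketched with a toy write-ahead log over an in-memory "disk". The class, its method names and the entry format are assumptions for illustration only; real journaling file systems use on-disk formats and ordering rules that are far more involved.

```python
# Toy write-ahead journal over an in-memory "disk" (a dict of blocks).
# Structure and names are illustrative assumptions, not a real on-disk format.

class JournaledStore:
    def __init__(self):
        self.blocks = {}    # main file system structures
        self.journal = []   # log entries assumed to survive a crash

    def log(self, txid, updates):
        """Record the intended updates, then a commit marker."""
        for key, value in updates.items():
            self.journal.append(("write", txid, key, value))
        self.journal.append(("commit", txid))

    def log_without_commit(self, txid, updates):
        """Simulate a crash that happens before the commit marker is written."""
        for key, value in updates.items():
            self.journal.append(("write", txid, key, value))

    def recover(self):
        """Replay committed entries into the main structures; drop the rest."""
        committed = {entry[1] for entry in self.journal if entry[0] == "commit"}
        for entry in self.journal:
            if entry[0] == "write" and entry[1] in committed:
                _, _, key, value = entry
                self.blocks[key] = value
        self.journal.clear()

store = JournaledStore()
store.log(1, {"inode 7": "size=10", "bitmap": "block 42 used"})
store.log_without_commit(2, {"inode 8": "size=99"})
store.recover()
print(store.blocks)
```

Only the updates of the committed transaction reach the main structures; the interrupted one is discarded as a whole, so the structures are never left half-updated.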
Data backup systems typically do not provide support for direct backup of data stored in a transactional manner, which makes recovery of reliable and consistent data sets difficult. Most backup software simply notes what files have changed since a certain time, regardless of the transactional state shared across multiple files in the overall dataset. As a workaround, some database systems simply produce an archived state file containing all data up to that point, and the backup software only backs that up and does not interact directly with the active transactional databases at all. Recovery requires separate recreation of the database from the state file, after the file has been restored by the backup software.
Network file systems
A network file system is a file system that acts as a client for a remote file access protocol, providing access to files on a server. Examples of network file systems include clients for the NFS, AFS, and SMB protocols, and file-system-like clients for FTP and WebDAV.
Shared disk file systems
A shared disk file system is one in which a number of machines (usually servers) all have access to the same external disk subsystem (usually a SAN). The file system arbitrates access to that subsystem, preventing write collisions. Examples include GFS from Red Hat, GPFS from IBM, and SFS from DataPlow.
Special file systems
A special file system presents non-file elements of an operating system as files so they can be acted on using file system APIs. This is most commonly done in Unix-like operating systems, but devices are given file names in some non-Unix-like operating systems as well.
Device file systems
A device file system represents I/O devices and pseudo-devices as files, called device files. Examples in Unix-like systems include devfs and, in Linux 2.6 systems, udev. In non-Unix-like systems, such as TOPS-10 and other operating systems influenced by it, where the full filename or pathname of a file can include a device prefix, devices other than those containing file systems are referred to by a device prefix specifying the device, without anything following it.
Others
- In the Linux kernel, configfs and sysfs provide files that can be used to query the kernel for information and configure entities in the kernel.
- procfs maps processes and, on Linux, other operating system structures into a filespace.
Minimal filesystem / Audio-cassette storage
In the late 1970s hobbyists saw the development of the microcomputer. Disk and digital tape devices were too expensive for hobbyists. An inexpensive basic data storage system was devised that used common audio cassette tape.
When the system needed to write data, the user was notified to press "RECORD" on the cassette recorder, then press "RETURN" on the keyboard to notify the system that the cassette recorder was recording. The system wrote a sound to provide time synchronization, then sounds that encoded a prefix, the data, a checksum and a suffix. When the system needed to read data, the user was instructed to press "PLAY" on the cassette recorder. The system would listen to the sounds on the tape, waiting until a burst of sound could be recognized as the synchronization. The system would then interpret subsequent sounds as data. When the data read was complete, the system would notify the user to press "STOP" on the cassette recorder. It was primitive, but it worked (a lot of the time). Data was stored sequentially in an un-named format. Multiple sets of data could be written and located by fast-forwarding the tape and observing the tape counter to find the approximate start of the next data region on the tape. The user might have to listen to the sounds to find the right spot to begin playing the next data region. Some implementations even included audible sounds interspersed with the data.
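The record structure described above (synchronization, prefix, data, checksum, suffix) can be sketched at the byte level, leaving out the audio encoding entirely. The marker values, the two-byte length field and the additive checksum below are all assumptions for illustration; actual cassette formats differed from machine to machine.

```python
# Byte-level sketch of a cassette-style record. The marker values, the
# 2-byte length field and the additive checksum are assumptions.

SYNC = b"\x55" * 8      # leader used to find the start of a record
PREFIX = b"\x01"        # marks the start of the payload
SUFFIX = b"\x04"        # marks the end of the record

def encode_record(data):
    """Frame data as: sync, prefix, length, data, checksum, suffix."""
    checksum = sum(data) % 256
    length = len(data).to_bytes(2, "big")
    return SYNC + PREFIX + length + data + bytes([checksum]) + SUFFIX

def decode_record(stream):
    """Find the sync leader, then read and verify one record."""
    start = stream.find(SYNC + PREFIX)
    if start < 0:
        raise ValueError("no synchronization leader found")
    pos = start + len(SYNC) + len(PREFIX)
    length = int.from_bytes(stream[pos:pos + 2], "big")
    data = stream[pos + 2:pos + 2 + length]
    tail = stream[pos + 2 + length:pos + 4 + length]
    if len(data) != length or len(tail) != 2 or tail[1:] != SUFFIX:
        raise ValueError("truncated or malformed record")
    if tail[0] != sum(data) % 256:
        raise ValueError("checksum mismatch")
    return data

noise = b"\x00\x13\x37"  # stand-in for tape hiss before the record
tape = noise + encode_record(b"10 PRINT HELLO")
print(decode_record(tape))
```

The reader skips whatever precedes the leader, which mirrors the way the system waited for a recognizable burst of sound before interpreting anything as data.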
Flat file systems
In a flat file system, there are no subdirectories.
When floppy disk media was first available this type of filesystem was adequate due to the relatively small amount of data space available. The Apple Macintosh featured a flat file system, the Macintosh File System. It was unusual in that the file management program (Macintosh Finder) created the illusion of a partially hierarchical filing system on top of EMFS. This structure required every file to have a unique name, even if it appeared to be in a separate folder.
While simple, flat file systems become awkward as the number of files grows, and they make it difficult to organize data into related groups of files.
A recent addition to the flat file system family is Amazon's S3, a remote storage service, which is intentionally simplistic to allow users the ability to customize how their data is stored. The only constructs are buckets (imagine a disk drive of unlimited size) and objects (similar, but not identical to the standard concept of a file). Advanced file management is allowed by being able to use nearly any character (including '/') in the object's name, and the ability to select subsets of the bucket's content based on identical prefixes.
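Prefix selection over a flat namespace can be modelled with a plain dictionary. The sketch below does not use the Amazon S3 API; the bucket contents and the two helper functions are hypothetical and only illustrate how '/' in object names can emulate folders.

```python
# Flat key namespace: every object lives in one dict keyed by its full name.
# The bucket contents and function names are illustrative assumptions.

bucket = {
    "photos/2011/cat.jpg": b"...",
    "photos/2011/dog.jpg": b"...",
    "photos/2012/bird.jpg": b"...",
    "readme.txt": b"...",
}

def list_prefix(bucket, prefix):
    """Select the subset of keys that share a common prefix."""
    return sorted(key for key in bucket if key.startswith(prefix))

def list_children(bucket, prefix, delimiter="/"):
    """Emulate one directory level by grouping keys at the next delimiter."""
    children = set()
    for key in bucket:
        if key.startswith(prefix):
            rest = key[len(prefix):]
            head, sep, _ = rest.partition(delimiter)
            children.add(head + sep)
    return sorted(children)

print(list_prefix(bucket, "photos/2011/"))
print(list_children(bucket, "photos/"))
print(list_children(bucket, ""))
```

No directory objects exist anywhere in this model; the apparent hierarchy is computed from the names each time it is listed.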
File systems and operating systems
Many operating systems include support for more than one file system. Sometimes the OS and the filesystem are so tightly interwoven it is difficult to separate out filesystem functions.
There needs to be an interface provided by the operating system software between the user and the file system. This interface can be textual (such as provided by a command line interface, such as the Unix shell, or OpenVMS DCL) or graphical (such as provided by a graphical user interface, such as file browsers). If graphical, the metaphor of the folder, containing documents, other files, and nested folders is often used (see also: directory and folder).
Unix-like operating systems
Unix-like operating systems create a virtual file system, which makes all the files on all the devices appear to exist in a single hierarchy. This means, in those systems, there is one root directory, and every file existing on the system is located under it somewhere. Unix-like systems can use a RAM disk or network shared resource as their root directory.
Unix-like systems assign a device name to each device, but this is not how the files on that device are accessed. Instead, to gain access to files on another device, the operating system must first be informed where in the directory tree those files should appear. This process is called mounting a file system. For example, to access the files on a CD-ROM, one must tell the operating system "Take the file system from this CD-ROM and make it appear under such-and-such directory". The directory given to the operating system is called the mount point; it might, for example, be /media. The /media directory exists on many Unix systems (as specified in the Filesystem Hierarchy Standard) and is intended specifically for use as a mount point for removable media such as CDs, DVDs, USB drives or floppy disks. It may be empty, or it may contain subdirectories for mounting individual devices. Generally, only the administrator (i.e. root user) may authorize the mounting of file systems.
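How a single hierarchy is assembled from several devices can be sketched with a toy mount table that resolves a path to the device whose mount point is the longest matching prefix. The device names, mount points and function names are assumptions for illustration, and the sketch ignores everything a real kernel does beyond path resolution.

```python
import posixpath

# Mount table mapping mount points to device names. Device names and
# mount points below are illustrative assumptions.
mounts = {"/": "disk0"}

def mount(device, mount_point):
    """Make the file system on `device` appear under `mount_point`."""
    mounts[posixpath.normpath(mount_point)] = device

def resolve(path):
    """Return (device, path relative to that device's root)."""
    path = posixpath.normpath(path)
    best = "/"
    for point in mounts:
        inside = path == point or path.startswith(point.rstrip("/") + "/")
        if inside and len(point) > len(best):
            best = point
    relative = posixpath.relpath(path, best)
    return mounts[best], "/" if relative == "." else "/" + relative

mount("cdrom0", "/media/cdrom")
print(resolve("/home/user/notes.txt"))
print(resolve("/media/cdrom/docs/readme.txt"))
print(resolve("/media/cdromx/file"))
```

Paths under /media/cdrom are served by the CD-ROM, while everything else, including the similarly named /media/cdromx, still belongs to the root device.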
Unix-like operating systems often include software and tools that assist in the mounting process and provide it with new functionality. Some of these strategies have been coined "auto-mounting" as a reflection of their purpose.
- In many situations, file systems other than the root need to be available as soon as the operating system has booted. All Unix-like systems therefore provide a facility for mounting file systems at boot time. System administrators define these file systems in the configuration file fstab (vfstab in Solaris), which also indicates options and mount points.
- In some situations, there is no need to mount certain file systems at boot time, although their use may be desired thereafter. There are some utilities for Unix-like systems that allow the mounting of predefined file systems upon demand.
- Removable media have become very common with microcomputer platforms. They allow programs and data to be transferred between machines without a physical connection. Common examples include USB flash drives, CD-ROMs, and DVDs. Utilities have therefore been developed to detect the presence and availability of a medium and then mount that medium without any user intervention.
- Progressive Unix-like systems have also introduced a concept called supermounting; see, for example, the Linux supermount-ng project. For example, a floppy disk that has been supermounted can be physically removed from the system. Under normal circumstances, the disk should have been synchronized and then unmounted before its removal. Provided synchronization has occurred, a different disk can be inserted into the drive. The system automatically notices that the disk has changed and updates the mount point contents to reflect the new medium. Similar functionality is found on Windows machines.
- An automounterAutomounterAn automounter is any program or software facility which automatically mounts filesystems in response to access operations by user programs. An automounter system utility , when notified of file and directory access attempts under selectively monitored subdirectory trees, dynamically and...
will automatically mount a file system when a reference is made to the directory atop which it should be mounted. This is usually used for file systems on network servers, rather than relying on events such as the insertion of media, as would be appropriate for removable media.
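As an illustration of the boot-time configuration described above, the following minimal sketch parses fstab-style text into structured entries. It assumes the six-field layout used by Linux fstab files (device, mount point, type, options, dump flag, fsck pass), with the last two fields optional. The sample devices and paths, and the names FstabEntry, parse_fstab and mounted_at_boot, are hypothetical and chosen only for this example.

```python
from typing import NamedTuple


class FstabEntry(NamedTuple):
    device: str        # what to mount (device node, label, or remote export)
    mount_point: str   # directory atop which the file system is mounted
    fs_type: str       # file system type, e.g. ext4 or nfs
    options: list      # comma-separated mount options, split into a list
    dump: int          # backup flag used by dump
    fsck_pass: int     # boot-time check order (0 means skip)


def parse_fstab(text):
    """Parse fstab-style text, skipping blank lines and '#' comments."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            raise ValueError(f"malformed fstab line: {line!r}")
        # Assumption: the last two fields are optional and default to 0.
        dump = int(fields[4]) if len(fields) > 4 else 0
        fsck_pass = int(fields[5]) if len(fields) > 5 else 0
        entries.append(FstabEntry(fields[0], fields[1], fields[2],
                                  fields[3].split(","), dump, fsck_pass))
    return entries


def mounted_at_boot(entries):
    """Mount points of entries not marked 'noauto' (mounted only on request)."""
    return [e.mount_point for e in entries if "noauto" not in e.options]


# Hypothetical sample configuration, not taken from any real system.
SAMPLE = """\
# <device>      <mount point>  <type>  <options>         <dump> <pass>
/dev/sda1       /              ext4    defaults          0      1
/dev/sda2       /home          ext4    defaults,noatime  0      2
server:/export  /mnt/data      nfs     noauto,ro         0      0
"""

if __name__ == "__main__":
    for entry in parse_fstab(SAMPLE):
        print(entry)
    print(mounted_at_boot(parse_fstab(SAMPLE)))
```

The third sample entry carries the noauto option, so it corresponds to the second case in the list: a predefined file system that is not mounted at boot but can be mounted later on demand.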
Linux
Linux supports many different file systems, but common choices for the system disk include the ext* family (such as ext2, ext3 and ext4), XFS, JFS, ReiserFS and btrfs.
Solaris
The Sun Microsystems Solaris operating system in earlier releases defaulted to (non-journaled or non-logging) UFS for bootable and supplementary file systems. Solaris defaulted to, supported, and extended UFS.
Support for other file systems and significant enhancements were added over time, including Veritas Software Corp. (Journaling) VxFS, Sun Microsystems (Clustering) QFS, Sun Microsystems (Journaling) UFS, and Sun Microsystems (open source, poolable, 128 bit compressible, and error-correcting) ZFS.
Kernel extensions were added to Solaris to allow for bootable Veritas VxFS operation. Logging or Journaling was added to UFS in Sun's Solaris 7. Releases of Solaris 10, Solaris Express, OpenSolaris, and other open source variants of the Solaris operating system later supported bootable ZFS.
Logical Volume Management allows for spanning a file system across multiple devices for the purpose of adding redundancy, capacity, and/or throughput. Legacy environments in Solaris may use Solaris Volume Manager (formerly known as Solstice DiskSuite). Multiple operating systems (including Solaris) may use Veritas Volume Manager. Modern Solaris-based operating systems remove the need for separate volume management by using virtual storage pools in ZFS.
Mac OS X
Mac OS X uses a file system that it inherited from classic Mac OS called HFS Plus, sometimes called Mac OS Extended. HFS Plus is a metadata-rich and case-preserving file system. Due to the Unix roots of Mac OS X, Unix permissions were added to HFS Plus. Later versions of HFS Plus added journaling to prevent corruption of the file system structure and introduced a number of optimizations to the allocation algorithms in an attempt to defragment files automatically without requiring an external defragmenter.
Filenames can be up to 255 characters. HFS Plus uses Unicode to store filenames. On Mac OS X, the filetype can come from the type code, stored in the file's metadata, or from the filename.
HFS Plus has three kinds of links: Unix-style hard links, Unix-style symbolic links and aliases. Aliases are designed to maintain a link to their original file even if they are moved or renamed; they are not interpreted by the file system itself, but by the File Manager code in userland.
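The difference between the two Unix-style link kinds can be sketched with Python's standard os module. This is a minimal illustration assuming a POSIX-like system where the user may create both kinds of link; it does not model Mac OS aliases, which are handled above the file system. The function name link_demo and the file names are hypothetical.

```python
import os
import tempfile


def link_demo():
    """Create a file, a hard link and a symbolic link, then rename the file."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "original.txt")
        hard = os.path.join(tmp, "hard.txt")
        soft = os.path.join(tmp, "soft.txt")

        with open(original, "w") as f:
            f.write("data")

        os.link(original, hard)      # second directory entry for the same file
        os.symlink(original, soft)   # separate small file storing the target path

        same_inode = os.stat(original).st_ino == os.stat(hard).st_ino
        link_count = os.stat(original).st_nlink

        # Rename the original: the hard link still names the file, while the
        # symbolic link now refers to a path that no longer exists.
        os.rename(original, os.path.join(tmp, "renamed.txt"))

        with open(hard) as f:
            hard_content = f.read()
        symlink_resolves = os.path.exists(soft)   # follows the link
        symlink_present = os.path.islink(soft)    # inspects the link itself

    return same_inode, link_count, hard_content, symlink_resolves, symlink_present


if __name__ == "__main__":
    print(link_demo())
```

After the rename, the hard link still reaches the data, while the symbolic link is left dangling. An alias, as described above, is designed to keep working in that situation.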
Mac OS X also supports the UFS file system, derived from the BSD Unix Fast File System via NeXTSTEP. However, as of Mac OS X 10.5 (Leopard), Mac OS X can no longer be installed on a UFS volume, nor can a pre-Leopard system installed on a UFS volume be upgraded to Leopard.
Newer versions of Mac OS X are capable of reading and writing the legacy FAT file systems (FAT16 and FAT32). They are also capable of reading NTFS file systems. Writing to NTFS is supported only on Mac OS X 10.6 (Snow Leopard) and later, and only after a non-trivial system setting change; third-party software exists that automates this. On Mac OS X versions prior to 10.6, third-party software is still necessary to write to NTFS.
Plan 9
Plan 9 from Bell Labs treats everything as a file, accessed as a file would be (i.e., no ioctl or mmap): networking, graphics, debugging, authentication, capabilities, encryption, and other services are accessed via I/O operations on file descriptors. The 9P protocol removes the difference between local and remote files.
These file systems are organized with the help of private, per-process namespaces, allowing each process to have a different view of the many file systems that provide resources in a distributed system.
The Inferno operating system shares these concepts with Plan 9.
Microsoft Windows
Windows makes use of the FAT and NTFS file systems.
Windows uses a drive letter abstraction at the user level to distinguish one disk or partition from another. For example, the path C:\WINDOWS represents a directory WINDOWS on the partition represented by the letter C. The C drive is most commonly used for the primary hard disk partition, on which Windows is usually installed and from which it boots. This "tradition" has become so firmly ingrained that bugs arose in older applications that assumed the operating system was installed on drive C. The use of drive letters, and the tradition of using "C" as the drive letter for the primary hard disk partition, can be traced to MS-DOS, where the letters A and B were reserved for up to two floppy disk drives. This in turn derived from CP/M in the 1970s, and ultimately from IBM's CP/CMS of 1967.
Network drives may also be mapped to drive letters.
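The way a drive letter sits in front of the rest of a path can be sketched with Python's pathlib, whose PureWindowsPath class parses Windows-style paths on any platform without touching a real disk. The helper name split_windows_path is hypothetical, and the example only illustrates the path syntax, not how Windows assigns letters to partitions.

```python
from pathlib import PureWindowsPath


def split_windows_path(path_text):
    """Return the drive, root and components of a Windows-style path string."""
    p = PureWindowsPath(path_text)
    return p.drive, p.root, p.parts


if __name__ == "__main__":
    # The path from the text: directory WINDOWS on the partition lettered C.
    print(split_windows_path(r"C:\WINDOWS"))
    # A path with no drive letter is relative to whichever drive is current.
    print(split_windows_path("WINDOWS"))
```

Software that needs the system drive can take it from a parsed path or from the environment at run time, which avoids the hard-coded assumption about drive C mentioned above.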
FAT
The File Allocation Table (FAT) filing system, supported by all versions of Microsoft Windows, was an evolution of that used in Microsoft's earlier operating system (MS-DOS, which in turn was based on 86-DOS). FAT ultimately traces its roots back to the short-lived M-DOS project and Standalone disk BASIC before it. Over the years various features have been added to it, inspired by similar features found on file systems used by operating systems such as Unix.
Older versions of the FAT file system (FAT12 and FAT16) had file name length limits, a limit on the number of entries in the root directory of the file system, and restrictions on the maximum size of FAT-formatted disks or partitions. Specifically, FAT12 and FAT16 had a limit of 8 characters for the file name, and 3 characters for the extension (such as .exe). This is commonly referred to as the 8.3 filename limit. VFAT, an extension to FAT12 and FAT16 introduced in Windows NT 3.5 and subsequently included in Windows 95, allowed long file names (LFN).
FAT32 also addressed many of the limits in FAT12 and FAT16, but remains limited compared to NTFS.
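The 8.3 limit described above can be expressed as a simple length check. The sketch below is deliberately partial: it tests only the 8-character name and 3-character extension lengths and rejects names with more than one dot, and it ignores the rules about which characters are permitted. The function name fits_8_3 is hypothetical.

```python
def fits_8_3(name):
    """Return True if name has a 1-8 character base and a 0-3 character extension.

    Only lengths are checked; permitted-character rules are out of scope.
    """
    base, dot, ext = name.rpartition(".")
    if not dot:
        # No dot at all: the whole name is the base and there is no extension.
        base, ext = name, ""
    if "." in base:
        # More than one dot cannot be expressed as base plus extension.
        return False
    return 1 <= len(base) <= 8 and len(ext) <= 3


if __name__ == "__main__":
    for candidate in ["AUTOEXEC.BAT", "README", "longfilename.txt", "index.html"]:
        print(candidate, fits_8_3(candidate))
```

Names that fail this check are the ones that needed the long file name support introduced with VFAT.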
exFAT (also known as FAT64) is the newest iteration of FAT, with certain advantages over NTFS with regard to file system overhead. exFAT is only compatible with newer Windows systems, such as Windows Server 2003, Windows Vista, Windows Server 2008 and Windows 7; more recently, support has been added for Windows XP.
NTFS
NTFS, introduced with the Windows NT operating system, allowed ACL-based permission control. Other features also supported by NTFS include hard links, multiple file streams, attribute indexing, quota tracking, sparse files, encryption, compression, and reparse points (directories working as mount-points for other file systems, symlinks, junctions, remote storage links), though not all these features are well-documented.
Other file systems
- The Prospero File System is a file system based on the Virtual System Model. The system was created by Dr. B. Clifford Neuman of the Information Sciences Institute at the University of Southern California.
- RSRE FLEX file system - written in ALGOL 68
- The file system of the Michigan Terminal System (MTS) is interesting because: (i) it provides "line files" where record lengths and line numbers are associated as metadata with each record in the file, lines can be added, replaced, updated with the same or different length records, and deleted anywhere in the file without the need to read and rewrite the entire file; (ii) using program keys files may be shared or permitted to commands and programs in addition to users and groups; and (iii) there is a comprehensive file locking mechanism that protects both the file's data and its metadata.
Converting the type of a file system
It may be advantageous or necessary to have files in a different file system than the one in which they currently exist. Reasons include the need for more space than the limits of the current filesystem allow. The depth of path may need to be increased beyond the restrictions of the filesystem. There may be performance or reliability considerations. Providing access to another operating system that does not support the existing filesystem is another reason.
In-place conversion
In some cases conversion can be done in-place, although migrating the file system is more conservative, as it involves creating a copy of the data, and is recommended. On Windows, FAT and FAT32 file systems can be converted to NTFS via the convert.exe utility, but not the reverse. On Linux, ext2 can be converted to ext3 (and converted back), and ext3 can be converted to ext4 (but not back), and both ext3 and ext4 can be converted to btrfs, and converted back until the undo information is deleted. These conversions are possible due to using the same format for the file data itself, and relocating the metadata into empty space, in some cases using sparse file support.
Migrating to a different file system
Migration has the disadvantage of requiring additional space, although it may be faster. The best case is if there is unused space on the media that will contain the final file system. For example, to migrate a FAT32 filesystem to an ext2 filesystem, first create a new ext2 filesystem, then copy the data to it, then delete the FAT32 filesystem.
An alternative, when there is not sufficient space to retain the original filesystem until the new one is created, is to use a work area (such as removable media). This takes longer, but a backup of the data is a useful side effect.
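The copy-then-delete order of a migration can be sketched in Python with two ordinary directories standing in for the old and new file systems. This is only an illustration of the sequence: creating and mounting a real file system is outside its scope, and the names trees_match, migrate and demo are hypothetical. The sketch adds a verification step before deletion, which is a precaution assumed here rather than something the text prescribes.

```python
import filecmp
import os
import shutil
import tempfile


def trees_match(src_root, dst_root):
    """Check that every regular file under src_root has an identical copy."""
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(dst_root, rel, name)
            # shallow=False compares file contents, not just size and time.
            if not os.path.isfile(dst) or not filecmp.cmp(src, dst, shallow=False):
                return False
    return True


def migrate(src_root, dst_root):
    """Copy the data to the new location, verify it, then remove the original."""
    shutil.copytree(src_root, dst_root)   # dst_root must not exist yet
    if not trees_match(src_root, dst_root):
        raise RuntimeError("verification failed; original left intact")
    shutil.rmtree(src_root)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        old = os.path.join(tmp, "old_fs")   # stands in for the original file system
        new = os.path.join(tmp, "new_fs")   # stands in for the new file system
        os.makedirs(os.path.join(old, "docs"))
        with open(os.path.join(old, "docs", "note.txt"), "w") as f:
            f.write("hello")
        migrate(old, new)
        with open(os.path.join(new, "docs", "note.txt")) as f:
            content = f.read()
        return os.path.exists(old), content


if __name__ == "__main__":
    print(demo())
```

Deleting the original only after the copy has been checked is what makes migration the more conservative choice: until that point both copies exist, which is also why it needs the additional space.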
Long file paths and long file names
In hierarchical file systems, files are accessed by means of a path that is a branching list of directories containing the file. Different filesystems have different limits on the depth of the path. Filesystems also have a limit on the length of an individual filename.
Copying files with long names or located in paths of significant depth from one filesystem to another may cause undesirable results. This depends on how the utility doing the copying handles the discrepancy. See also pathmunge
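A copying utility could detect such discrepancies before it starts by checking each path against the limits of the destination. The sketch below is a hedged example: the function name check_path is hypothetical, the limits are supplied by the caller rather than looked up for any real file system, and lengths are counted in Python characters, whereas an actual file system may count bytes or other units.

```python
from pathlib import PurePosixPath


def check_path(path, max_name_len, max_depth):
    """List the ways a relative path exceeds the given destination limits."""
    parts = PurePosixPath(path).parts
    problems = []
    if len(parts) > max_depth:
        problems.append(f"depth {len(parts)} exceeds {max_depth}")
    for part in parts:
        if len(part) > max_name_len:
            problems.append(f"name of length {len(part)} exceeds {max_name_len}")
    return problems


if __name__ == "__main__":
    # Limits chosen for illustration only.
    print(check_path("docs/report.txt", max_name_len=255, max_depth=8))
    print(check_path("docs/" + "x" * 300, max_name_len=255, max_depth=8))
```

Reporting the problems up front lets the user rename or restructure files, instead of leaving the outcome to however the copying utility happens to handle the mismatch.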
See also
- List of default file systems
- List of file systems
- List of Unix programs
- Comparison of file systems
- Directory structure
- Disk sharing
- Distributed file system
- File manager
- File system fragmentation
- Filename extension
- Global filesystem
- Physical and logical storage
- Opera File System
- Storage efficiency
- Virtual file system