Disk formatting
Disk formatting is the process of preparing a hard disk drive or flexible disk medium for data storage. In some cases, the formatting operation may also create one or more new file systems. The formatting process that performs basic medium preparation is often referred to as "low-level formatting." The term "high-level formatting" most often refers to the process of generating a new file system. In certain operating systems (e.g., Microsoft Windows), the two processes are combined and the term "format" is understood to mean an operation in which a new disk medium is fully prepared to store files. Illustrated to the right are the prompts and diagnostics printed by MS-DOS's FORMAT.COM utility as a hard drive is being formatted.
As a general rule, formatting a disk is "destructive," in that existing data (if any) is lost during the process.
History
A "block", a contiguous number of bytes, is the unit of storage that is read from and written to a disk by a disk driver. The earliest disk drives had fixed block sizes (e.g., the block size of the IBM 350 disk storage unit of the late 1950s was 100 six-bit characters), but starting with the 1301, IBM marketed subsystems that featured variable block sizes: a particular track could have blocks of different sizes. The disk subsystems on the IBM System/360 expanded this concept in the form of Count Key Data (CKD) and later Extended Count Key Data (ECKD); however, variable block sizes in HDDs fell out of use in the 1990s. One of the last HDDs to support variable block size was the IBM 3390 Model 9, announced in May 1993.
Modern hard disk drives, such as Serial Attached SCSI (SAS) and Serial ATA (SATA) drives, appear at their interfaces as a contiguous set of fixed-size blocks, typically 512 bytes long, although the industry is in the process of changing to 4,096-byte logical blocks.
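As an illustration of what a contiguous set of fixed-size blocks implies for addressing, the following minimal Python sketch maps a byte offset to a logical block number and an offset within that block. The function name locate and the sample offset are hypothetical and chosen only for this example.

```python
def locate(byte_offset: int, block_size: int = 512) -> tuple[int, int]:
    """Return (logical block number, offset inside that block)."""
    if byte_offset < 0 or block_size <= 0:
        raise ValueError("offset must be >= 0 and block size must be > 0")
    # Integer division gives the block; the remainder is the position in it.
    return divmod(byte_offset, block_size)


if __name__ == "__main__":
    print(locate(10_000))        # addressing with 512-byte blocks
    print(locate(10_000, 4096))  # addressing with 4,096-byte blocks
```

The same byte falls in a different block number depending on the logical block size, which is one reason a change from 512-byte to 4,096-byte blocks is visible to software that addresses the drive by block.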
Floppy disks generally used only fixed block sizes, but these sizes were a function of the host's OS and its interaction with its controller, so that a particular type of media (e.g., 5¼-inch DSDD) would have different block sizes depending upon the host OS and controller.
Optical disks generally only use fixed block sizes.
Disk formatting process
Formatting a disk for use by an operating system and its applications involves three different steps.
- Low-level formatting (i.e., closest to the hardware) marks the surfaces of the disks with markers indicating the start of a recording block (typically today called sector markers) and other information to be used later, in normal operations, by the disk controller to read or write data. This is intended to be the permanent foundation of the disk, and is often completed at the factory.
- Partitioning creates data structures needed by the operating system. This level of formatting often includes checking for defective tracks or defective sectors.
- High-level formatting creates the file system format within the structure of the intermediate-level formatting (the partitioning). This formatting includes the data structures used by the OS to identify the contents of the logical drive or partition. This may occur during operating system installation, or when adding a new disk. Disk and distributed file systems may specify an optional boot block, and/or various volume and directory information for the operating system.
Low-level formatting of floppy disks
The low-level format of floppy disks (and early hard disks) is performed by the disk drive's controller.
Consider a standard 1.44 MB floppy disk. Low-level formatting of the floppy disk normally writes 18 sectors of 512 bytes to each of 160 tracks (80 on each side) of the floppy disk, providing 1,474,560 bytes of storage on the disk.
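The arithmetic behind that figure can be checked with a short Python sketch. The function name formatted_capacity is hypothetical; the geometry values are the ones quoted above.

```python
def formatted_capacity(sectors_per_track: int, tracks_per_side: int,
                       sides: int = 2, sector_bytes: int = 512) -> int:
    """User-data capacity in bytes, ignoring identifier fields, CRC bytes and gaps."""
    return sectors_per_track * tracks_per_side * sides * sector_bytes


if __name__ == "__main__":
    capacity = formatted_capacity(18, 80)
    print(capacity)                  # bytes of user data
    print(capacity // 1024)          # the same figure in units of 1,024 bytes
    print(capacity / (1000 * 1024))  # the "1.44 MB" marketing figure
```

The "1.44 MB" label mixes units: it corresponds to 1,440 units of 1,024 bytes, so it is neither 1.44 million bytes nor 1.44 × 1,048,576 bytes.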
Physical sectors are actually larger than 512 bytes, as in addition to the 512-byte data field they include a sector identifier field, CRC bytes (in some cases error correction bytes) and gaps between the fields. These additional bytes are not normally included in the quoted figure for overall storage capacity of the disk.
Different low-level formats can be used on the same media; for example, large records can be used to cut down on inter-record gap size.
Several freeware, shareware and free software programs (e.g., GParted, FDFORMAT, NFORMAT and 2M) allowed considerably more control over formatting, allowing the formatting of high-density 3.5" disks with a capacity up to 2 MB.
Techniques used include:
- head/track sector skew (moving the sector numbering forward at side change and track stepping to reduce mechanical delay),
- interleaving sectors (to minimize sector gap and thereby allowing the number of sectors per track to be increased),
- increasing the number of sectors per track (while a normal 1.44 MB format uses 18 sectors per track, it is possible to increase this to a maximum of 21), and
- increasing the number of tracks (most drives could tolerate extension to 82 tracks – though some could handle more, others could jam).
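The effect of the last two techniques on user-data capacity can be estimated with a small Python sketch. It assumes 512-byte sectors on two sides, as in the standard format; the layout labels are illustrative and do not name any particular tool's format.

```python
SECTOR_BYTES = 512
SIDES = 2

# (sectors per track, tracks per side); the labels are illustrative only.
layouts = {
    "standard": (18, 80),
    "more sectors": (21, 80),
    "more sectors and tracks": (21, 82),
}


def capacity(sectors_per_track: int, tracks_per_side: int) -> int:
    """User-data bytes for a two-sided disk with 512-byte sectors."""
    return sectors_per_track * tracks_per_side * SIDES * SECTOR_BYTES


if __name__ == "__main__":
    baseline = capacity(*layouts["standard"])
    for name, geometry in layouts.items():
        total = capacity(*geometry)
        gain = 100 * (total - baseline) / baseline
        print(f"{name}: {total:,} bytes ({gain:+.1f}% vs. standard)")
```

Under these assumptions, 21 sectors on 80 tracks per side give 1,720,320 bytes, and extending to 82 tracks per side gives 1,763,328 bytes.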
Linux supports a variety of sector sizes, and DOS and Windows support a large-record-size DMF floppy format.
Low-level formatting (LLF) of hard disks
Hard disk drives prior to the 1990s typically had a separate disk controller that defined how data was encoded on the media. With the media, the drive and/or the controller possibly procured from separate vendors, low-level formatting was a potential user activity. Separate procurement also had the potential of incompatibility between the separate components such that the subsystem would not reliably store data.
User-instigated low-level formatting (LLF) of hard disk drives was common for minicomputer and personal computer systems until the 1990s. IBM and other mainframe system vendors typically supplied their hard disk drives (or media in the case of removable media HDDs) with a low-level format. Typically this involved subdividing each track on the disk into one or more blocks which would contain the user data and associated control information. Different computers used different block sizes and IBM notably used variable block sizes, but the popularity of the IBM PC caused the industry to adopt a standard of 512 user data bytes per block by the middle 1980s.
Depending upon the system, low-level formatting was generally done by an operating system utility. IBM compatible PCs used the BIOS, which involved using the MS-DOS debug program to transfer control to a routine hidden at different addresses in different BIOSs. A low-level format function may also be called "erase" or "wipe" in different tools. For best results, it is highly recommended to use tools created by the hard disk's manufacturer.
Transition away from LLF
Starting in the late 1980s, driven by the volume of IBM compatible PCs, HDDs became routinely available pre-formatted with a compatible low-level format. At the same time, the industry moved from historical (dumb) bit serial interfaces to modern (intelligent) bit serial interfaces and word serial interfaces wherein the low-level format was performed at the factory.
Today, an end-user, in most cases, should never perform a low-level formatting of an IDE or ATA hard drive, and in fact it is often not possible to do so on modern hard drives outside of the factory.
Disk reinitialization
While it is generally impossible to perform a complete LLF on most modern hard drives (since the mid-1990s) outside the factory, the term "low-level format" is still used for what could be called the reinitialization of a hard drive to its factory configuration (and even these terms may be misunderstood). Reinitialization should include identifying (and sparing out if possible) any sectors which cannot be written to and read back from the drive correctly. The term has, however, been used by some to refer to only a portion of that process, in which every sector of the drive is written to, usually by writing a zero byte to every addressable location on the disk, sometimes called zero-filling.
The present ambiguity in the term low-level format seems to be due to both inconsistent documentation on web sites and the belief by many users that any process below a high-level (file system) format must be called a low-level format. Since much of the low level formatting process can today only be performed at the factory, various drive manufacturers describe reinitialization software as LLF utilities on their web sites. Since users generally have no way to determine the difference between a complete LLF and reinitialization (they simply observe running the software results in a hard disk that must be high-level formatted), both the misinformed user and mixed signals from various drive manufacturers have perpetuated this error. Note: Whatever possible misuse of such terms may exist (search hard drive manufacturers' web sites for all these terms), many sites do make such reinitialization utilities available (possibly as bootable floppy diskette or CD image files), to both overwrite every byte and check for damaged sectors on the hard disk.
One popular method for performing only the zero-fill operation on a hard disk is by writing zero-value bytes to the drive using the Unix dd utility with the /dev/zero stream as the input file and the drive itself or a specific partition as the output file.
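The zero-fill idea can be sketched in Python against an ordinary file, which stands in here for the drive or partition. This is a minimal illustration under that assumption, not a replacement for dd or a vendor utility; the function name zero_fill and the chunk size are hypothetical. Running such code against a real device node would destroy its contents.

```python
import os
import tempfile


def zero_fill(path: str, chunk_size: int = 1024 * 1024) -> int:
    """Overwrite every byte of an existing file with zeros; return bytes written."""
    zeros = bytes(chunk_size)  # a reusable buffer of zero-value bytes
    with open(path, "r+b") as target:
        size = target.seek(0, os.SEEK_END)  # current length of the target
        target.seek(0)
        remaining = size
        while remaining > 0:
            count = min(chunk_size, remaining)
            target.write(zeros[:count])
            remaining -= count
        target.flush()
        os.fsync(target.fileno())  # ask the OS to push the data to storage
    return size


if __name__ == "__main__":
    # Demonstrate on a throwaway file filled with 0xFF bytes.
    with tempfile.NamedTemporaryFile(delete=False) as scratch:
        scratch.write(b"\xff" * 5000)
    try:
        written = zero_fill(scratch.name, chunk_size=4096)
        with open(scratch.name, "rb") as check:
            print(written, check.read() == bytes(5000))
    finally:
        os.remove(scratch.name)
```

Writing in fixed-size chunks keeps memory use constant regardless of the size of the target, which is the same reason dd is usually given an explicit block size.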
Another method for SCSI disks may use the sg_format command to issue a low-level SCSI FORMAT UNIT command.
Partitioning
Partitioning is the process of writing information into blocks of a storage device or medium that allows access by an operating system. Some operating systems allow the device (or its medium) to appear as multiple devices; i.e., partitioned into multiple devices.
On PC and UNIX-based operating systems (such as BSD, GNU/Linux, Mac OS X) this is normally done with a partition editor, e.g., fdisk, LVM, parted. These operating systems support multiple partitions.
In current IBM mainframe OSs derived from S/360 OSs, this is done by the INIT command of the ICKDSF utility; these legacy OSs support only a single partition per device, called a volume. The ICKDSF functions include creating a volume label and writing a Record 0 on every track.
Floppy disks are not partitioned; however depending upon the OS they may require volume information in order to be accessed by the OS.
Partition editors and ICKDSF today do not handle low-level functions for HDDs and optical disk drives such as writing timing marks, and they cannot reinitialize a modern disk that has been degaussed or otherwise lost the factory formatting.
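As one concrete example of the data structures a PC partition editor writes, the following Python sketch reads the four primary entries of a classic master boot record (MBR) partition table. The MBR layout is an assumption introduced here for illustration, since the text above does not name a specific partitioning scheme; the function name parse_mbr and the sample sector are hypothetical, and the sector is built in memory rather than read from a disk.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446          # the partition table starts at byte 446
ENTRY_FORMAT = "<B3xB3xII"  # status, (CHS), type, (CHS), first LBA, sector count
ENTRY_SIZE = struct.calcsize(ENTRY_FORMAT)  # 16 bytes per entry
SIGNATURE = b"\x55\xaa"     # boot signature in the last two bytes


def parse_mbr(sector: bytes) -> list[dict]:
    """Return the non-empty primary partition entries of an MBR sector."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != SIGNATURE:
        raise ValueError("not a valid MBR sector")
    partitions = []
    for index in range(4):
        offset = TABLE_OFFSET + index * ENTRY_SIZE
        status, ptype, first_lba, count = struct.unpack_from(ENTRY_FORMAT, sector, offset)
        if ptype != 0:  # a type byte of 0 marks an unused entry
            partitions.append({
                "index": index,
                "bootable": status == 0x80,
                "type": ptype,
                "first_lba": first_lba,
                "sectors": count,
            })
    return partitions


if __name__ == "__main__":
    # Build a sample sector holding one bootable partition of 204,800 sectors.
    sample = bytearray(SECTOR_SIZE)
    struct.pack_into(ENTRY_FORMAT, sample, TABLE_OFFSET, 0x80, 0x83, 2048, 204800)
    sample[510:512] = SIGNATURE
    for entry in parse_mbr(bytes(sample)):
        print(entry, entry["sectors"] * 512, "bytes")
```

The sketch skips the cylinder-head-sector fields and reads only the logical block address and length of each entry.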
High-level formatting
High-level formatting is the process of setting up an empty file system on the disk and, for PCs, installing a boot sector. This is a fast operation, and is sometimes referred to as quick formatting.
The entire logical drive or partition may optionally be scanned for defects, which may take considerable time.
In the case of floppy disks, both high- and low-level formatting are customarily performed in one pass by the disk formatting software. In recent years, most floppies have shipped pre-formatted from the factory as DOS FAT12 floppies.
In current IBM mainframes derived from S/360, this may be done as part of allocating a file, by a utility specific to the file system or, in some older access methods, on the fly as new data are written.
Host protected area
The host protected area, sometimes referred to as the hidden protected area, is an area of a hard drive that is high-level formatted so that the area is not normally visible to its operating system (OS).
Reformatting
Reformatting is a high-level format performed on a functioning disk drive to free the contents of its medium. Reformatting differs between operating systems because what is actually done to existing data varies by OS. The most important aspect of the process is that it frees disk space for use by other data. To actually "erase" everything requires overwriting each block of data on the medium, something that many PC high-level formatting utilities do not do.
Reformatting often carries the implication that the operating system and all other software will be reinstalled after the format is complete. Rather than fixing an installation suffering from malfunction or security compromise, it is sometimes judged easier to simply reformat everything and start from scratch. Various colloquialisms exist for this process, such as "wipe and reload", "nuke and pave", and "reimage".
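The difference between freeing space and actually erasing data can be illustrated with a small simulation. The following Python sketch is only a toy model: the in-memory "disk", the four-sector metadata region, and the function names are all assumptions made for illustration and do not correspond to any real file system or formatting utility.

```python
SECTOR_SIZE = 512
METADATA_SECTORS = 4  # hypothetical size of the file system's metadata region


def make_disk(total_sectors):
    """Return a zero-filled in-memory stand-in for a disk."""
    return bytearray(total_sectors * SECTOR_SIZE)


def quick_format(disk):
    """Reset only the metadata region; data sectors are left untouched."""
    size = METADATA_SECTORS * SECTOR_SIZE
    disk[:size] = bytes(size)


def overwrite_format(disk, fill=0xF6):
    """Reset the metadata and also overwrite every data sector with `fill`."""
    quick_format(disk)
    start = METADATA_SECTORS * SECTOR_SIZE
    disk[start:] = bytes([fill]) * (len(disk) - start)


def data_survives(disk, needle):
    """Report whether the byte string `needle` can still be found on the disk."""
    return disk.find(needle) != -1


disk = make_disk(16)
secret = b"payroll records"
offset = 8 * SECTOR_SIZE  # somewhere in the data area
disk[offset:offset + len(secret)] = secret

quick_format(disk)
print(data_survives(disk, secret))  # True: only the metadata was reset

overwrite_format(disk)
print(data_survives(disk, secret))  # False: every data sector was overwritten
```

In this model a quick reformat leaves the old bytes in place for a recovery tool to find, while the overwriting variant removes them at the cost of touching every sector.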
DOS, OS/2 and Windows
Under MS-DOS, PC-DOS, OS/2 and Microsoft Windows, disk formatting can be performed by the format command. The format program usually asks for confirmation beforehand to prevent accidental removal of data, but some versions of DOS have an undocumented /AUTOTEST option; if used, the usual confirmation is skipped and the format begins right away. The WM/FormatC macro virus uses this command to format the C: drive as soon as a document is opened.
There is also the undocumented /U parameter that performs an unconditional format which under most circumstances overwrites the entire partition, preventing the recovery of data through software. Note, however, that the /U switch only works reliably with floppy diskettes (technically because, unless /Q is used, floppies are always low-level formatted in addition to being high-level formatted). Under certain circumstances with hard drive partitions, the /U switch merely prevents the creation of unformat information in the partition to be formatted while otherwise leaving the partition's contents entirely intact (still on disk but marked deleted). In such cases, the user's data remain ripe for recovery with specialist tools such as EnCase or disk editors. Reliance upon /U for secure overwriting of hard drive partitions is therefore inadvisable, and purpose-built tools such as DBAN should be considered instead.
Under OS/2, the /L parameter, which specifies a long format, causes format to overwrite the entire partition or logical drive. Doing so enhances the ability of CHKDSK to recover files.
Unix-like operating systems
High-level formatting of disks on these systems is traditionally done using the mkfs command. On Linux (and potentially other systems as well) mkfs is typically a wrapper around filesystem-specific commands which have the name mkfs.fsname, where fsname is the name of the filesystem with which to format the disk. Some filesystems which are not supported by certain implementations of mkfs have their own manipulation tools; for example, Ntfsprogs provides a format utility for the NTFS filesystem.
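The wrapper arrangement described above, in which a generic front end hands off to a helper named mkfs.fsname, can be sketched in a few lines of Python. This is a simplified illustration and not the actual mkfs implementation: the function names, the validation rule, and the placeholder device path /dev/sdX1 are assumptions, and the sketch only builds a command line without running anything.

```python
import shutil


def mkfs_helper_name(fsname):
    """Return the conventional helper name, e.g. 'ext4' -> 'mkfs.ext4'."""
    if not fsname or "/" in fsname or fsname.startswith("."):
        raise ValueError("invalid filesystem name: %r" % (fsname,))
    return "mkfs." + fsname


def build_mkfs_command(fsname, device, helper_lookup=shutil.which):
    """Locate the filesystem-specific helper and return the command to run.

    `helper_lookup` is injectable so the logic can be exercised without any
    real helper installed. Nothing is executed here.
    """
    helper = mkfs_helper_name(fsname)
    path = helper_lookup(helper)
    if path is None:
        raise FileNotFoundError("no %s helper found on PATH" % helper)
    return [path, device]


print(mkfs_helper_name("ext4"))  # mkfs.ext4

# Demonstration with a fake lookup, so no real helper or device is needed.
fake_lookup = lambda name: "/sbin/" + name
print(build_mkfs_command("vfat", "/dev/sdX1", helper_lookup=fake_lookup))
```

Resolving the helper by name is what lets support for an additional filesystem be added by installing one more mkfs.fsname program, without changing the front end.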
Some Unix and Unix-like operating systems have higher-level formatting tools, usually for the purpose of making disk formatting easier and/or allowing the user to partition the disk with the same tool. Examples include GNU Parted (and its various GUI frontends such as GParted and the KDE Partition Manager) and the Disk Utility application on Mac OS X.
Recovery of data from a formatted disk
As in file deletion by the operating system, data on a disk are not fully erased during every high-level format. Instead, the area on the disk containing the data is merely marked as available, and retains the old data until it is overwritten. If the disk is formatted with a different file system than the one which previously existed on the partition, some data may be overwritten that would not be if the same file system had been used. However, under some file systems (e.g., NTFS, but not FAT), the file indexes (such as $MFTs under NTFS, inodes under ext2/3, etc.) may not be written to the same exact locations. And if the partition size is increased, even FAT file systems will overwrite more data at the beginning of that new partition.
From the perspective of preventing the recovery of sensitive data through recovery tools, the data must either be completely overwritten (every sector) with random data before the format, or the format program itself must perform this overwriting, as the DOS FORMAT command did with floppy diskettes, filling every data sector with the byte value F6 in hex.
See also
- Data erasure
- Data recovery
- Data remanence
- Drive mapping
- List of default file systems