Self-Monitoring, Analysis, and Reporting Technology
S.M.A.R.T. is a monitoring system for computer hard disk drives to detect and report on various indicators of reliability, in the hope of anticipating failures.
When a failure is anticipated by S.M.A.R.T., the user may choose to replace the drive to avoid unexpected outage and data loss. The manufacturer may be able to use the S.M.A.R.T. data to discover where faults lie and prevent them from recurring in future drive designs.
To predict the TEC date, the drive tracks the rate at which the attribute changes. Note that TEC dates are only estimates; hard drives can and do fail much sooner or much later than the TEC date.
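As a rough illustration of projecting a date from an attribute's rate of change, the sketch below extrapolates linearly from two readings of a normalized value to the day it would reach its threshold. The function name `estimate_tec_date`, its parameters, and the choice of a straight-line model are assumptions made for this example; actual drive firmware and monitoring tools may use different methods.

```python
from datetime import date, timedelta
from typing import Optional


def estimate_tec_date(
    first_day: date,
    first_value: int,
    last_day: date,
    last_value: int,
    threshold: int,
) -> Optional[date]:
    """Linearly extrapolate when a normalized value would reach its threshold.

    Hypothetical helper: returns None when the value is not decreasing,
    because no date can be projected from a flat or improving trend.
    """
    elapsed_days = (last_day - first_day).days
    if elapsed_days <= 0:
        raise ValueError("last_day must be later than first_day")
    if last_value <= threshold:
        return last_day  # the threshold has already been reached
    drop = first_value - last_value  # normalized points lost so far
    if drop <= 0:
        return None
    remaining = last_value - threshold  # points left before the threshold
    # Integer ceiling division avoids floating-point rounding surprises.
    days_left = -(-(remaining * elapsed_days) // drop)
    return last_day + timedelta(days=days_left)


# Illustrative readings only: 100 on 1 January, 97 on 31 January, threshold 90.
print(estimate_tec_date(date(2024, 1, 1), 100, date(2024, 1, 31), 97, 90))
```

With these made-up readings the value loses 3 points in 30 days, so the remaining 7 points would take a further 70 days. As the text notes, such a projection is only an estimate.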
A drive may offer several types of self-test:
- Short: Checks the electrical and mechanical performance as well as the read performance of the disk. Electrical tests might include a test of buffer RAM, a read/write circuitry test, or a test of the read/write head elements. The mechanical test includes seeking and servo on data tracks. The test scans small parts of the drive's surface (the area is vendor-specific and there is a time limit on the test) and checks the list of pending sectors that may have read errors. (Usually under two minutes.)
- Long / Extended: A longer and more thorough version of the short self-test that scans the entire disk surface, with no time limit. (Tens of minutes, >1 GB per minute for modern drives.)
- Conveyance: Intended as a quick test to identify damage incurred while transporting the device from the drive manufacturer to the computer manufacturer. Only available on ATA drives. (Several minutes.)
- Selective: Some drives allow selective self-tests of just a part of the surface.
The self-test logs for SCSI and ATA drives are slightly different. It is possible for the long test to pass even if the short test fails.
Background
The purpose of S.M.A.R.T. is to warn a user of impending drive failure while there is still time to take action, such as copying the data to a replacement device.
Hard disk failures fall into one of two basic classes:
- Predictable failures: These failures result from slow processes such as mechanical wear and gradual degradation of storage surfaces. Monitoring can determine when such failures are becoming more likely.
- Unpredictable failures: These failures happen suddenly and without warning. They range from electronic components becoming defective to a sudden mechanical failure (perhaps due to improper handling).
Mechanical failures account for about 60% of all drive failures. While the eventual failure may be catastrophic, most mechanical failures result from gradual wear and there are usually certain indications that failure is imminent. These may include increased heat output, increased noise level, problems with reading and writing of data, or an increase in the number of damaged disk sectors.
Work at Google on over 100,000 drives has shown little predictive value of S.M.A.R.T. status as a whole, but suggests that certain sub-categories of information which some S.M.A.R.T. implementations track do correlate with actual failure rates: specifically, in the 60 days following the first scan error on a drive, the drive is, on average, 39 times more likely to fail than it would have been had no such error occurred. Furthermore, first errors in reallocations, offline reallocations and probational counts are strongly correlated to higher probabilities of failure.
PCTechGuide's page on S.M.A.R.T. (2003) comments that the technology has gone through three phases:
- "In its original incarnation SMART provided failure prediction by monitoring certain online hard drive activities. A subsequent version improved failure prediction by adding an automatic off-line read scan to monitor additional operations. The latest "SMART" technology not only monitors hard drive activities but adds failure prevention by attempting to detect and repair sector errors. Also, while earlier versions of the technology only monitored hard drive activity for data that was retrieved by the operating system, this latest SMART tests all data and all sectors of a drive by using "off-line data collection" to confirm the drive's health during periods of inactivity."
History and predecessors
An early hard disk monitoring technology was introduced by IBM in 1992 in its IBM 9337 Disk Arrays for AS/400 servers using IBM 0662 SCSI-2 disk drives. Later it was named Predictive Failure Analysis (PFA) technology. It measured several key device health parameters and evaluated them within the drive firmware. Communications between the physical unit and the monitoring software were limited to a binary result: namely, either "device is OK" or "drive is likely to fail soon".
Later, another variant, which was named IntelliSafe, was created by computer manufacturer Compaq and disk drive manufacturers Seagate, Quantum, and Conner. The disk drives would measure the disk's "health parameters", and the values would be transferred to the operating system and user-space monitoring software. Each disk drive vendor was free to decide which parameters were to be included for monitoring, and what their thresholds should be. The unification was at the protocol level with the host.
Compaq submitted its implementation to the Small Form Factor (SFF) committee for standardization in early 1995. It was supported by IBM, by Compaq's development partners Seagate, Quantum, and Conner, and by Western Digital, which did not have a failure prediction system at the time. The Committee chose IntelliSafe's approach, as it provided more flexibility. The resulting jointly developed standard was named S.M.A.R.T.
That SFF standard described a communication protocol for an ATA host to use and control monitoring and analysis in a hard disk drive, but did not specify any particular metrics or analysis methods. Later, "SMART" came to be understood (though without any formal specification) to refer to a variety of specific metrics and methods and to apply to protocols unrelated to ATA for communicating the same kinds of things.
Information Provided
The technical documentation for SMART is in the AT Attachment (ATA) standard.
The most basic information that SMART provides is the SMART status. It provides only two values: "threshold not exceeded" and "threshold exceeded". Often these are represented as "drive OK" or "drive fail" respectively. A "threshold exceeded" value is intended to indicate that there is a relatively high probability that the drive will not be able to honor its specification in the future: that is, the drive is "about to fail". The predicted failure may be catastrophic or may be something as subtle as the inability to write to certain sectors, or perhaps slower performance than the manufacturer's declared minimum.
The SMART status does not necessarily indicate the drive's past or present reliability. If a drive has already failed catastrophically, the SMART status may be inaccessible. Alternatively, if a drive has experienced problems in the past, but the sensors no longer detect such problems, the SMART status may, depending on the manufacturer's programming, suggest that the drive is now sound.
The inability to read some sectors is not always an indication that a drive is about to fail. One way that unreadable sectors may be created, even when the drive is functioning within specification, is through a sudden power failure while the drive is writing. Also, even if the physical disk is damaged at one location, such that a certain sector is unreadable, the disk may be able to use spare space to replace the bad area, so that the sector can be overwritten.
More detail on the health of the drive may be obtained by examining the SMART Attributes. SMART Attributes were included in some drafts of the ATA standard, but were removed before the standard became final. The meaning and interpretation of the attributes vary between manufacturers, and are sometimes treated as a trade secret by a manufacturer. Attributes are further discussed below.
Drives with SMART may optionally maintain a number of 'logs'. The error log records information about the most recent errors that the drive has reported back to the host computer. Examining this log may help one to determine whether computer problems are disk-related or caused by something else.
A drive that implements SMART may optionally implement a number of self-test or maintenance routines, and the results of the tests are kept in the self-test log. The self-test routines may be used to detect any unreadable sectors on the disk, so that they may be restored from back-up sources (for example, from other disks in a RAID). This helps to reduce the risk of incurring permanent loss of data.
Lack of common interpretation
Many motherboards display a warning message when a disk drive is approaching failure. Although an industry standard exists among most major hard drive manufacturers, there are some remaining issues and much proprietary "secret knowledge" held by individual manufacturers as to their specific approach. As a result, S.M.A.R.T. is not always implemented correctly on many computer platforms, due to the absence of industry-wide software and hardware standards for S.M.A.R.T. data interchange.
From a legal perspective, the term "S.M.A.R.T." refers only to a signaling method between internal disk drive electromechanical sensors and the host computer. Hence, a drive may be claimed by its manufacturers to implement S.M.A.R.T. even if it does not include, say, a temperature sensor, which the customer might reasonably expect to be present. Moreover, in the most extreme case, a disk manufacturer could, in theory, produce a drive which includes a sensor for just one physical attribute, and then legally advertise the product as "S.M.A.R.T. compatible".
Visibility to host systems
Depending on the type of interface being used, some S.M.A.R.T.-enabled motherboards and related software may not communicate with certain S.M.A.R.T.-capable drives. For example, few external drives connected via USB and FireWire correctly send S.M.A.R.T. data over those interfaces. With so many ways to connect a hard drive (SCSI, Fibre Channel, ATA, SATA, SAS, SSA, and so on), it is difficult to predict whether S.M.A.R.T. reports will function correctly in a given system.
Even with a hard drive and interface that implements the specification, the computer's operating system may not see the S.M.A.R.T. information because the drive and interface are encapsulated in a lower layer. For example, they may be part of a RAID subsystem in which the RAID controller sees the S.M.A.R.T.-capable drive, but the main computer sees only a logical volume generated by the RAID controller.
On the Windows platform, many programs designed to monitor and report S.M.A.R.T. information will function only under an administrator account. At present, S.M.A.R.T. is implemented individually by manufacturers, and while some aspects are standardized for compatibility, others are not.
Access
For a list of programs that can read S.M.A.R.T. data, see Comparison of S.M.A.R.T. tools.
ATA S.M.A.R.T. attributes
Each drive manufacturer defines a set of attributes, and sets threshold values beyond which attributes should not pass under normal operation. Each attribute has a raw value, whose meaning is entirely up to the drive manufacturer (but often corresponds to counts or a physical unit, such as degrees Celsius or seconds), a normalized value, which ranges from 1 to 253 (with 1 representing the worst case and 253 representing the best) and a worst value, which represents the lowest recorded normalized value. Depending on the manufacturer, a value of 100 or 200 will often be chosen as the initial normalized value.
Manufacturers that have implemented at least one S.M.A.R.T. attribute in various products include Samsung, Seagate, IBM (Hitachi), Fujitsu, Maxtor, Toshiba, Intel, Western Digital and ExcelStor Technology.
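The raw, normalized, worst and threshold values described above can be modelled with a small record type. The following is a minimal sketch, not any vendor's or tool's actual data layout: the class name `SmartAttribute`, the helper `smart_status`, the rule that a normalized value at or below its threshold counts as "exceeded", and the sample numbers are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class SmartAttribute:
    attr_id: int    # attribute ID, e.g. 5
    name: str       # human-readable attribute name
    value: int      # current normalized value (1 = worst, 253 = best)
    worst: int      # lowest normalized value recorded so far
    threshold: int  # vendor-defined limit for the normalized value
    raw: int        # vendor-specific raw value (a count, a temperature, ...)

    def __post_init__(self) -> None:
        # Normalized values are documented as ranging from 1 to 253.
        for label, number in (("value", self.value), ("worst", self.worst)):
            if not 1 <= number <= 253:
                raise ValueError(f"{label} must be between 1 and 253, got {number}")

    def threshold_exceeded(self) -> bool:
        # Assumption: reaching or falling below the threshold counts as exceeded.
        return self.value <= self.threshold


def smart_status(attributes: Iterable[SmartAttribute]) -> str:
    """Collapse the per-attribute checks into the two-valued SMART status."""
    if any(attr.threshold_exceeded() for attr in attributes):
        return "threshold exceeded"
    return "threshold not exceeded"


# Illustrative numbers only; real thresholds and raw values differ by vendor.
healthy = SmartAttribute(5, "Reallocated Sectors Count", 100, 100, 36, 0)
worn = SmartAttribute(5, "Reallocated Sectors Count", 30, 30, 36, 1500)
print(smart_status([healthy]))
print(smart_status([healthy, worn]))
```

Keeping the raw value as an opaque integer reflects the point that its meaning is left to the manufacturer; only the normalized value is compared against the threshold here.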
Known ATA S.M.A.R.T. attributes
The following chart lists some S.M.A.R.T. attributes and the typical meaning of their raw values. Normalized values are always mapped so that higher values are better (with only very rare exceptions such as the "Temperature" attribute on certain Seagate drives), but higher raw attribute values may be better or worse depending on the attribute and manufacturer. For example, the "Reallocated Sectors Count" attribute's normalized value decreases as the count of reallocated sectors increases. In this case, the attribute's raw value will often indicate the actual count of sectors that were reallocated, although vendors are in no way required to adhere to this convention. As manufacturers do not necessarily agree on precise attribute definitions and measurement units, the following list of attributes should be regarded as a general guide only.Higher raw value is better | |
Lower raw value is better | |
Critical: red colored row | Potential indicators of imminent electromechanical failure |
ID | Hex | Attribute name | Better | Description |
---|---|---|---|---|
01 | 0x01 | Read Error Rate | (Vendor specific raw value.) Stores data related to the rate of hardware read errors that occurred when reading data from a disk surface. The raw value has different structure for different vendors and is often not meaningful as a decimal number. | |
02 | 0x02 | Throughput Performance | Overall (general) throughput performance of a hard disk drive. If the value of this attribute is decreasing there is a high probability that there is a problem with the disk. | |
03 | 0x03 | Spin-Up Time | Average time of spindle spin up (from zero RPM to fully operational [millisecs]). | |
04 | 0x04 | Start/Stop Count | A tally of spindle start/stop cycles. The spindle turns on, and hence the count is increased, both when the hard disk is turned on after having before been turned entirely off (disconnected from power source) and when the hard disk returns from having previously been put to sleep mode. | |
05 | 0x05 | Reallocated Sectors Count | Count of reallocated sectors. When the hard drive finds a read/write/verification error, it marks that sector as "reallocated" and transfers data to a special reserved area (spare area). This process is also known as remapping, and reallocated sectors are called "remaps". The raw value normally represents a count of the bad sectors that have been found and remapped. Thus, the higher the attribute value, the more sectors the drive has had to reallocate. This allows a drive with bad sectors to continue operation; however, a drive which has had any reallocations at all is significantly more likely to fail in the near future. While primarily used as a metric of the life expectancy of the drive, this number also affects performance. As the count of reallocated sectors increases, the read/write speed tends to become worse because the drive head is forced to seek to the reserved area whenever a remap is accessed. A workaround which will preserve drive speed at the expense of capacity is to create a disk partition over the region which contains remaps and instruct the operating system Operating system An operating system is a set of programs that manage computer hardware resources and provide common services for application software. The operating system is the most important type of system software in a computer system... to not use that partition. |
|
06 | 0x06 | Read Channel Margin | Margin of a channel while reading data. The function of this attribute is not specified. | |
07 | 0x07 | Seek Error Rate | (Vendor specific raw value.) Rate of seek errors of the magnetic heads. If there is a partial failure in the mechanical positioning system, then seek errors will arise. Such a failure may be due to numerous factors, such as damage to a servo, or thermal widening of the hard disk. The raw value has different structure for different vendors and is often not meaningful as a decimal number. | |
08 | 0x08 | Seek Time Performance | Average performance of seek operations of the magnetic heads. If this attribute is decreasing, it is a sign of problems in the mechanical subsystem. | |
09 | 0x09 | Power-On Hours Power-On Hours Power-on hours is the length of time, in hours, that electrical power is applied to a device. It is primarily used in:* Mean time between failures , and* S.M.A.R.T., where it is one of the attributes.... (POH) |
Count of hours in power-on state. The raw value of this attribute shows total count of hours (or minutes, or seconds, depending on manufacturer) in power-on state. | |
10 | 0x0A | Spin Retry Count | Count of retry of spin start attempts. This attribute stores a total count of the spin start attempts to reach the fully operational speed (under the condition that the first attempt was unsuccessful). An increase of this attribute value is a sign of problems in the hard disk mechanical subsystem. | |
11 | 0x0B | Recalibration Retries or Calibration Retry Count | This attribute indicates the count that recalibration was requested (under the condition that the first attempt was unsuccessful). An increase of this attribute value is a sign of problems in the hard disk mechanical subsystem. | |
12 | 0x0C | Power Cycle Count | This attribute indicates the count of full hard disk power on/off cycles. | |
13 | 0x0D | Soft Read Error Rate | Uncorrected read errors reported to the operating system. | |
180 | 0xB4 | Unused Reserved Block Count Total | "Pre-Fail" Attribute used at least in HP devices. | |
183 | 0xB7 | SATA Downshift Error Count | Western Digital and Samsung attribute. | |
184 | 0xB8 | End-to-End error / IOEDC | This attribute is a part of Hewlett-Packard's SMART IV technology, as well as part of other vendors' IO Error Detection and Correction schemas, and it contains a count of parity errors which occur in the data path to the media via the drive's cache RAM. | |
185 | 0xB9 | Head Stability | Western Digital attribute. | |
186 | 0xBA | Induced Op-Vibration Detection | Western Digital attribute. | |
187 | 0xBB | Reported Uncorrectable Errors | The count of errors that could not be recovered using hardware ECC (see attribute 195). | |
188 | 0xBC | Command Timeout | The count of operations aborted due to HDD timeout. Normally this attribute value should be zero; a value far above zero most likely indicates a serious problem with the power supply or an oxidized data cable. | |
189 | 0xBD | High Fly Writes | HDD producers implement a Fly Height Monitor that attempts to provide additional protections for write operations by detecting when a recording head is flying outside its normal operating range. If an unsafe fly height condition is encountered, the write process is stopped, and the information is rewritten or reallocated to a safe region of the hard drive. This attribute indicates the count of these errors detected over the lifetime of the drive. This feature is implemented in most modern Seagate drives and some of Western Digital’s drives, beginning with the WD Enterprise WDE18300 and WDE9180 Ultra2 SCSI hard drives, and will be included on all future WD Enterprise products. | |
190 | 0xBE | Airflow Temperature (WDC) or Airflow Temperature Celsius (HP) | Airflow temperature on Western Digital HDs (same as temperature [0xC2], but the current value is 50 less for some models; marked as obsolete). | |
190 | 0xBE | Temperature Difference from 100 | Value is equal to (100−temp. °C), allowing manufacturer to set a minimum threshold which corresponds to a maximum temperature. | |
191 | 0xBF | G-sense Error Rate | The count of errors resulting from externally-induced shock & vibration. | |
192 | 0xC0 | Power-off Retract Count or Emergency Retract Cycle Count (Fujitsu) | Count of times the heads are loaded off the media. Heads can be unloaded without actually powering off. | |
193 | 0xC1 | Load Cycle Count or Load/Unload Cycle Count (Fujitsu) | Count of load/unload cycles into head landing zone position. The typical lifetime rating for laptop (2.5-in) hard drives is 300,000 to 600,000 load cycles. Some laptop drives are programmed to unload the heads whenever there has not been any activity for about five seconds. Many Linux installations write to the file system a few times a minute in the background. As a result, there may be 100 or more load cycles per hour, and the load cycle rating may be exceeded in less than a year. | |
194 | 0xC2 | Temperature or Temperature Celsius | Current internal temperature. | |
195 | 0xC3 | Hardware ECC Recovered | (Vendor specific raw value.) The raw value has different structure for different vendors and is often not meaningful as a decimal number. | |
196 | 0xC4 | Reallocation Event Count | Count of remap operations. The raw value of this attribute shows the total count of attempts to transfer data from reallocated sectors to a spare area. Both successful & unsuccessful attempts are counted. | |
197 | 0xC5 | Current Pending Sector Count | Count of "unstable" sectors (waiting to be remapped, because of read errors). If an unstable sector is subsequently read successfully, this value is decreased and the sector is not remapped. Read errors on a sector will not remap the sector (since it might be readable later); instead, the drive firmware remembers that the sector needs to be remapped, and remaps it the next time it's written. | |
198 | 0xC6 | Uncorrectable Sector Count or Offline Uncorrectable or Off-Line Scan Uncorrectable Sector Count | The total count of uncorrectable errors when reading/writing a sector. A rise in the value of this attribute indicates defects of the disk surface and/or problems in the mechanical subsystem. | |
199 | 0xC7 | UltraDMA CRC Error Count | The count of errors in data transfer via the interface cable as determined by ICRC (Interface Cyclic Redundancy Check). | |
200 | 0xC8 | Multi-Zone Error Rate | The count of errors found when writing a sector. The higher the value, the worse the disk's mechanical condition is. | |
200 | 0xC8 | Write Error Rate (Fujitsu) | The total count of errors when writing a sector. | |
201 | 0xC9 | Soft Read Error Rate or TA Counter Detected | Count of off-track errors. | |
202 | 0xCA | Data Address Mark errors or TA Counter Increased | Count of Data Address Mark errors (or vendor-specific). | |
203 | 0xCB | Run Out Cancel | Count of ECC errors | |
204 | 0xCC | Soft ECC Correction | Count of errors corrected by software ECC | |
205 | 0xCD | Thermal Asperity Rate (TAR) | Count of errors due to high temperature. | |
206 | 0xCE | Flying Height | Height of heads above the disk surface. A flying height that's too low increases the chances of a head crash while a flying height that's too high increases the chances of a read/write error. | |
207 | 0xCF | Spin High Current | Amount of surge current used to spin up the drive. | |
208 | 0xD0 | Spin Buzz | Count of buzz routines needed to spin up the drive due to insufficient power. | |
209 | 0xD1 | Offline Seek Performance | Drive’s seek performance during its internal tests. | |
210 | 0xD2 | ? | (found in a Maxtor 6B200M0 200GB and Maxtor 2R015H1 15GB disks) | |
211 | 0xD3 | Vibration During Write | Vibration During Write | |
212 | 0xD4 | Shock During Write | Shock During Write | |
220 | 0xDC | Disk Shift | Distance the disk has shifted relative to the spindle (usually due to shock or temperature). Unit of measure is unknown. | |
221 | 0xDD | G-Sense Error Rate | The count of errors resulting from externally-induced shock & vibration. | |
222 | 0xDE | Loaded Hours | Time spent operating under data load (movement of magnetic head armature) | |
223 | 0xDF | Load/Unload Retry Count | Count of times head changes position. | |
224 | 0xE0 | Load Friction | Resistance caused by friction in mechanical parts while operating. | |
225 | 0xE1 | Load/Unload Cycle Count | Total count of load cycles | |
226 | 0xE2 | Load 'In'-time | Total time of loading on the magnetic heads actuator (time not spent in parking area). | |
227 | 0xE3 | Torque Amplification Count | Count of attempts to compensate for platter speed variations | |
228 | 0xE4 | Power-Off Retract Cycle | The count of times the magnetic armature was retracted automatically as a result of cutting power. | |
230 | 0xE6 | GMR Head Amplitude | Amplitude of "thrashing" (distance of repetitive forward/reverse head motion) | |
231 | 0xE7 | Temperature | Drive Temperature | |
232 | 0xE8 | Endurance Remaining | Number of physical erase cycles completed on the drive as a percentage of the maximum physical erase cycles the drive is designed to endure | |
232 | 0xE8 | Available Reserved Space | Intel SSDs report the available reserved space as a percentage of the reserved space in a brand-new SSD. | |
233 | 0xE9 | Power-On Hours | Number of hours elapsed in the power-on state. | |
233 | 0xE9 | Media Wearout Indicator | Intel SSDs report a normalized value that starts at 100 (when the SSD is new) and declines to a minimum value of 1. It decreases as the NAND erase cycles increase from 0 to the maximum-rated cycles. | |
240 | 0xF0 | Head Flying Hours | Time while head is positioning | |
240 | 0xF0 | Transfer Error Rate (Fujitsu) | Count of times the link is reset during a data transfer. | |
241 | 0xF1 | Total LBAs Written | Total count of LBAs written | |
242 | 0xF2 | Total LBAs Read | Total count of LBAs read. Some S.M.A.R.T. utilities will report a negative number for the raw value since in reality it has 48 bits rather than 32. | |
250 | 0xFA | Read Error Retry Rate | Count of errors while reading from a disk | |
254 | 0xFE | Free Fall Protection | Count of "Free Fall Events" detected |
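A few entries in the table above involve simple arithmetic that is easy to get wrong when reading raw values by hand. The following Python sketch illustrates three of them: the "100 minus temperature" encoding of attribute 190, the load-cycle lifetime estimate from attribute 193, and how a 48-bit raw counter can show up as a negative number when a utility treats it as a signed 32-bit integer. The function names are hypothetical and the sample inputs are made up for illustration; real drives vary by vendor.

```python
# Hypothetical helpers for interpreting a few raw values from the table above.
# Function names and sample numbers are illustrative, not taken from any tool.


def temperature_from_attr_190(value: int) -> int:
    """Attribute 190 ("Temperature Difference from 100") stores 100 - temp (°C)."""
    return 100 - value


def hours_until_load_cycle_rating(rated_cycles: int, cycles_per_hour: float) -> float:
    """Powered-on hours needed to use up a load-cycle rating at a steady rate."""
    if cycles_per_hour <= 0:
        raise ValueError("cycles_per_hour must be positive")
    return rated_cycles / cycles_per_hour


def misread_as_int32(raw: int) -> int:
    """Show what a utility displays if it keeps only the low 32 bits of a
    counter and interprets them as a signed (two's complement) integer."""
    low = raw & 0xFFFFFFFF
    return low - (1 << 32) if low >= (1 << 31) else low


if __name__ == "__main__":
    # A stored value of 62 corresponds to 38 °C.
    print(temperature_from_attr_190(62))

    # 300,000 rated cycles at 100 cycles per hour: 3,000 hours, i.e. 125 days.
    hours = hours_until_load_cycle_rating(300_000, 100)
    print(hours, hours / 24)

    # A counter of 3,000,000,000 no longer fits in a signed 32-bit integer.
    print(misread_as_int32(3_000_000_000))
```

At the upper end of the rating (600,000 cycles) the same rate gives 6,000 hours, or 250 days of continuous operation, which is consistent with the "less than a year" remark in the Load Cycle Count row.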
Threshold Exceeds Condition
Threshold Exceeds Condition (TEC) is an estimated date when a critical drive statistic attribute will reach its threshold value. When Drive Health software reports a "Nearest T.E.C.", it should be regarded as a "Failure date". Sometimes, no date is given and the drive can be expected to work without errors. To predict the date, the drive tracks the rate at which the attribute changes. Note that TEC dates are only estimates; hard drives can and do fail much sooner or much later than the TEC date.
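As a rough illustration of how such a date might be derived, the Python sketch below extrapolates linearly from two observations of a normalized attribute value to the moment it would reach the threshold. This is an assumption about the simplest possible method, not a description of what any particular Drive Health product does; the function name `estimate_tec` and the sample readings are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional


def estimate_tec(t0: datetime, v0: float,
                 t1: datetime, v1: float,
                 threshold: float) -> Optional[datetime]:
    """Linearly extrapolate when a normalized attribute value reaches its threshold.

    (t0, v0) is the older sample and (t1, v1) the newer one. Returns None when
    the value is not falling, which mirrors the "no date is given" case.
    """
    if t1 <= t0:
        raise ValueError("samples must be in chronological order")
    if v1 <= threshold:
        return t1  # the threshold has already been reached
    rate = (v1 - v0) / (t1 - t0).total_seconds()  # change per second
    if rate >= 0:
        return None  # stable or improving: no date can be estimated
    seconds_left = (threshold - v1) / rate
    return t1 + timedelta(seconds=seconds_left)


if __name__ == "__main__":
    # Made-up readings: the value fell from 100 to 90 over ten days,
    # and the vendor threshold is assumed to be 50.
    tec = estimate_tec(datetime(2024, 1, 1), 100,
                       datetime(2024, 1, 11), 90,
                       threshold=50)
    print(tec)
```

A straight-line fit through two points is very sensitive to noise, which is one reason a TEC date should be read only as an estimate.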
Self-tests
SMART drives may offer a number of self-tests:
Short : Checks the electrical and mechanical performance as well as the read performance of the disk. Electrical tests might include a test of buffer RAM, a read/write circuitry test, or a test of the read/write head elements. Mechanical test includes seeking and servo on data tracks. Scans small parts of the drive's surface (area is vendor-specific and there is a time limit on the test). Checks the list of Pending sectors that may have read errors. (Usually under two minutes.)
Long / Extended : A longer and more thorough version of the short self-test, scans the entire disk surface, with no time limit. (Tens of minutes, >1 GB per minute for modern drives.)
Conveyance : Intended as a quick test to identify damage incurred during transporting of the device from the drive manufacturer to the computer manufacturer. Only available on ATA drives. (Several minutes.)
Selective : Some drives allow selective self-tests of just a part of the surface.
The self-test logs for SCSI and ATA drives are slightly different. It is possible for the long test to pass even if the short test fails.
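With the smartmontools package, these self-tests are started through the `smartctl` command and its `-t` option. The Python sketch below only builds the argument list for each test type and does not run anything; the helper name `smartctl_selftest_args` and the device path `/dev/sda` are assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def smartctl_selftest_args(device: str, test: str,
                           lba_range: Optional[Tuple[int, int]] = None) -> List[str]:
    """Build a smartctl argument list that would start the given self-test.

    test is one of "short", "long", "conveyance" or "selective"; a selective
    test additionally needs an inclusive (first_lba, last_lba) range.
    """
    if test == "selective":
        if lba_range is None:
            raise ValueError("a selective test needs an LBA range")
        first, last = lba_range
        if first < 0 or last < first:
            raise ValueError("invalid LBA range")
        test_arg = f"select,{first}-{last}"
    elif test in ("short", "long", "conveyance"):
        test_arg = test
    else:
        raise ValueError(f"unknown self-test type: {test}")
    return ["smartctl", "-t", test_arg, device]


if __name__ == "__main__":
    # /dev/sda is a placeholder device path.
    print(smartctl_selftest_args("/dev/sda", "short"))
    print(smartctl_selftest_args("/dev/sda", "selective", (0, 999999)))
```

The resulting list can be handed to `subprocess.run`, which usually requires administrator privileges. The test runs inside the drive in the background, and its result can be read afterwards from the self-test log with `smartctl -l selftest`.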
External links
- Out SMART Your Hard Drive Using the smartmontools program to monitor S.M.A.R.T. values
- How S.M.A.R.T. is your hard drive?
- How to predict hard disk failure (SMART Report) in Ubuntu Linux in just 3 clicks?