Check digit
A check digit is a form of redundancy check used for error detection, the decimal equivalent of a binary checksum. It consists of a single digit computed from the other digits in the message.
With a check digit, one can detect simple errors in the input of a series of digits, such as a single mistyped digit or some permutations of two successive digits.
Design
Check digit algorithms are generally designed to capture human transcription errors. In order of complexity, these include the following:
- single digit errors, such as 1 → 2
- transposition errors, such as 12 → 21
- twin errors, such as 11 → 22
- jump transposition errors, such as 132 → 231
- jump twin errors, such as 131 → 232
- phonetic errors, such as 60 → 16 ("sixty" to "sixteen")
In choosing a system, a high probability of catching errors is traded off against implementation difficulty; simple check digit systems are easily understood and implemented by humans but do not catch as many errors as complex ones, which require sophisticated programs to implement.
A desirable feature is that left-padding with zeros should not change the check digit. This allows variable-length numbers to be used and the length to be changed.
If there is a single check digit added to the original number, the system will not always capture multiple errors, such as two replacement errors (12 → 34), though typically double errors will be caught 90% of the time (both changes would need to change the output by offsetting amounts).
A very simple check digit method would be to take the sum of all digits (digital sum) modulo 10. This would catch any single-digit error, as such an error would always change the sum, but does not catch any transposition errors (switching two digits), as re-ordering does not change the sum.
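The behaviour of the plain digit-sum method can be seen in a minimal Python sketch. The function name digit_sum_check and the sample number 4871 are chosen here purely for illustration.

```python
def digit_sum_check(digits: str) -> int:
    """Return the sum of the decimal digits, modulo 10."""
    return sum(int(ch) for ch in digits) % 10


print(digit_sum_check("4871"))  # 4+8+7+1 = 20, so the result is 0
print(digit_sum_check("4971"))  # single-digit error (8 -> 9): result becomes 1
print(digit_sum_check("8471"))  # transposition (48 -> 84): result is still 0
```

The second call shows a substitution being caught, while the third shows a transposition slipping through because the sum is unchanged.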
A slightly more complex method is to take the weighted sum of the digits, modulo 10, with different weights for each number position.
To illustrate this, if the weights for a four-digit number were 5, 3, 2, 7 and the number to be coded was 4871, then one would take 5×4 + 3×8 + 2×7 + 7×1 = 65, i.e. 5 modulo 10, and the check digit would be 5, giving 48715.
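The weighted-sum calculation above can be reproduced with a short Python sketch. The helper names weighted_check_digit and append_check_digit are hypothetical, and the weights are simply those of the example.

```python
from typing import Sequence


def weighted_check_digit(digits: str, weights: Sequence[int]) -> int:
    """Weighted sum of the digits, reduced modulo 10."""
    if len(digits) != len(weights):
        raise ValueError("need exactly one weight per digit")
    return sum(w * int(d) for w, d in zip(weights, digits)) % 10


def append_check_digit(digits: str, weights: Sequence[int]) -> str:
    """Return the number with its check digit appended."""
    return digits + str(weighted_check_digit(digits, weights))


print(weighted_check_digit("4871", [5, 3, 2, 7]))  # 65 modulo 10 = 5
print(append_check_digit("4871", [5, 3, 2, 7]))    # 48715
```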
Systems with weights of 1, 3, 7, or 9, with the weights on neighboring numbers being different, are widely used: for example, 31 31 weights in UPC codes, 13 13 weights in EAN numbers (GS1 algorithm), and the 371 371 371 weights used in United States bank routing transit numbers. This system detects all single-digit errors and around 90% of transposition errors. 1, 3, 7, and 9 are used because they are coprime to 10, so changing any digit changes the check digit; using a coefficient that is divisible by 2 or 5 would lose information (because, for example, 5×0 ≡ 5×2 ≡ 5×4 ≡ 5×6 ≡ 5×8 ≡ 0 modulo 10) and thus not catch some single-digit errors. Using different weights on neighboring numbers means that most transpositions change the check digit; however, because all weights differ by an even number, this does not catch transpositions of two digits that differ by 5 (0 and 5, 1 and 6, 2 and 7, 3 and 8, 4 and 9), since the even weight difference multiplied by 5 yields a multiple of 10.
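Both claims can be checked by brute force. The following Python sketch assumes alternating 3, 1 weights counted from the left; the name check_31 is hypothetical. It lists which residues each weight can produce, then enumerates the adjacent transpositions that go undetected.

```python
from itertools import product


def check_31(digits: str) -> int:
    """Weighted sum modulo 10 with weights 3, 1, 3, 1, ... from the left."""
    return sum((3 if i % 2 == 0 else 1) * int(d)
               for i, d in enumerate(digits)) % 10


# Residues reachable by each candidate weight: weights coprime to 10 keep
# all ten digits distinct, while weight 5 collapses them to just 0 and 5.
residues = {w: sorted({(w * d) % 10 for d in range(10)}) for w in (1, 3, 5, 7, 9)}
print(residues[5])       # [0, 5]
print(len(residues[3]))  # 10

# Ordered pairs of different digits whose swap leaves the check unchanged.
missed = [(a, b) for a, b in product(range(10), repeat=2)
          if a != b and check_31(f"{a}{b}") == check_31(f"{b}{a}")]
print(missed)                 # exactly the pairs that differ by 5
print(1 - len(missed) / 90)   # fraction of adjacent transpositions caught
```

Ten of the 90 ordered pairs of distinct digits are missed, which is where the figure of around 90% comes from.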
The ISBN-10 code instead uses modulo 11, which is prime, and all the number positions have different weights (1 through 10). This system thus detects all single-digit substitution and transposition errors (including jump transpositions), but at the cost of the check digit possibly being 10, represented by "X". (An alternative is simply to avoid using the serial numbers which result in an "X" check digit.) ISBN-13 instead uses the GS1 algorithm used in EAN numbers.
More complicated algorithms include the Luhn algorithm (1954), which captures 98% of single-digit transposition errors (it does not detect 90 ↔ 09), while more sophisticated is the Verhoeff algorithm (1969), which catches all single-digit substitution and transposition errors, and many (but not all) more complex errors. Both these methods use a single check digit and will therefore fail to capture around 10% of more complex errors. To reduce this failure rate, it is necessary to use more than one check digit (for example, the modulo 97 check referred to below, which uses two check digits; for the algorithm, see International Bank Account Number) and/or to use a wider range of characters in the check digit, for example letters plus numbers.
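As an illustration of the Luhn scheme and its one blind spot, here is a minimal Python sketch. It follows the usual formulation, assumed here rather than taken from the text above: every second digit counting from the right, starting with the digit next to the check digit, is doubled, and 9 is subtracted from any result above 9. The function names and the test numbers are arbitrary.

```python
def luhn_checksum(number: str) -> int:
    """Luhn sum over all digits, check digit included, modulo 10."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:      # every second digit, counting from the right
            d *= 2
            if d > 9:       # same as adding the two digits of the product
                d -= 9
        total += d
    return total % 10


def luhn_is_valid(number: str) -> bool:
    """A number passes the check when the Luhn sum is 0 modulo 10."""
    return luhn_checksum(number) == 0


print(luhn_is_valid("12906"))  # True: 6 is the check digit for 1290
print(luhn_is_valid("21906"))  # False: the 12 -> 21 transposition is caught
print(luhn_is_valid("12096"))  # True: the 90 -> 09 transposition is missed
```

The swap of 9 and 0 goes unnoticed because both digits contribute the same amount whether or not they are doubled (9 doubles to 18, which reduces to 9, and 0 stays 0).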
UPC
The final digit of a Universal Product Code is a check digit computed as follows:
- Add the digits (up to but not including the check digit) in the odd-numbered positions (first, third, fifth, etc.) together and multiply by three.
- Add the digits (up to but not including the check digit) in the even-numbered positions (second, fourth, sixth, etc.) to the result.
- Take the remainder of the result divided by 10 (modulo operation) and subtract this from 10 to derive the check digit.
For instance, the UPC-A barcode for a box of tissues is "036000241457". The last digit is the check digit "7", and if the other numbers are correct then the check digit calculation must produce 7.
- Add the odd number digits: 0+6+0+2+1+5 = 14
- Multiply the result by 3: 14 × 3 = 42
- Add the even number digits: 3+0+0+4+4 = 11
- Add the two results together: 42 + 11 = 53
- To calculate the check digit, take the remainder of (53 / 10), which is also known as (53 modulo 10), and subtract from 10. Therefore, the check digit value is 7.
Another example: calculate the check digit for a food item with the code "01010101010".
- Add the odd number digits: 0+0+0+0+0+0 = 0
- Multiply the result by 3: 0 × 3 = 0
- Add the even number digits: 1+1+1+1+1 = 5
- Add the two results together: 0 + 5 = 5
- To calculate the check digit, take the remainder of (5 / 10), which is also known as (5 modulo 10), and subtract from 10, i.e. 10 - (5 modulo 10) = 5. Therefore, the check digit value is 5.
- If the remainder is 0, subtracting from 10 would give 10. In that case, use 0 as the check digit.
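The steps above can be sketched in Python as follows. The function name upc_check_digit is hypothetical, and the sketch assumes the input is the first 11 digits of a UPC-A code.

```python
def upc_check_digit(first11: str) -> int:
    """Check digit for the first 11 digits of a UPC-A code."""
    if len(first11) != 11 or not first11.isdigit():
        raise ValueError("expected exactly 11 digits")
    odd = sum(int(d) for d in first11[0::2])   # 1st, 3rd, 5th, ... positions
    even = sum(int(d) for d in first11[1::2])  # 2nd, 4th, 6th, ... positions
    remainder = (3 * odd + even) % 10
    # The outer modulo maps a remainder of 0 to check digit 0 rather than 10.
    return (10 - remainder) % 10


print(upc_check_digit("03600024145"))  # 7
print(upc_check_digit("01010101010"))  # 5
print(upc_check_digit("00000000000"))  # 0 (remainder 0 case)
```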
ISBN 10
The final character of a ten-digit International Standard Book Number is a check digit computed so that multiplying each digit by its position in the number (counting from the right) and taking the sum of these products modulo 11 is 0. The digit the farthest to the right (which is multiplied by 1) is the check digit, chosen to make the sum correct. It may need to have the value 10, which is represented as the letter X. For example, take the ISBN 0-201-53082-1. The sum of products is 0×10 + 2×9 + 0×8 + 1×7 + 5×6 + 3×5 + 0×4 + 8×3 + 2×2 + 1×1 = 99 ≡ 0 modulo 11. So the ISBN is valid.
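The validity rule can be sketched in Python in two equivalent ways: a direct weighted sum, and a multiplication-free variant that keeps a running total t of the digits seen so far and adds t to the sum after each digit. All function names are hypothetical, and the sketch assumes hyphens and spaces are the only separators.

```python
from typing import List


def isbn10_values(isbn: str) -> List[int]:
    """Convert an ISBN-10 string to ten values; a final X counts as 10."""
    chars = [c for c in isbn if c not in "- "]
    if len(chars) != 10:
        raise ValueError("expected 10 characters besides separators")
    values = []
    for i, c in enumerate(chars):
        if c in "Xx" and i == 9:
            values.append(10)
        elif c.isdigit():
            values.append(int(c))
        else:
            raise ValueError(f"unexpected character {c!r}")
    return values


def isbn10_is_valid(isbn: str) -> bool:
    """Weights run 10, 9, ..., 1 from left to right."""
    total = sum(w * v for w, v in zip(range(10, 0, -1), isbn10_values(isbn)))
    return total % 11 == 0


def isbn10_is_valid_running(isbn: str) -> bool:
    """Same test without multiplication, using two running totals."""
    t = total = 0
    for v in isbn10_values(isbn):
        t += v        # sum of the digits seen so far
        total += t    # the first digit ends up counted 10 times, the last once
    return total % 11 == 0


print(isbn10_is_valid("0-201-53082-1"))          # True (sum is 99)
print(isbn10_is_valid_running("0-201-53082-1"))  # True
print(isbn10_is_valid("0-021-53082-1"))          # False (transposed digits)
```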
While this may seem more complicated than the first scheme, it can be validated simply by adding all the products together then dividing by 11. The sum can be computed without any multiplications by initializing two variables, t and sum, to 0 and repeatedly performing t = t + digit; sum = sum + t; (which can be expressed in C as sum += t += digit;). If the final sum is a multiple of 11, the ISBN is valid.
ISBN 13
ISBN 13 (in use since January 2007) is equal to the EAN-13 code found underneath a book's barcode. Its check digit is generated the same way as the UPC except that the even digits are multiplied by 3 instead of the odd digits.
EAN (GLN, GTIN, EAN numbers administered by GS1)
EAN (European Article Number) check digits (administered by GS1) are calculated by summing the digits in the even-numbered positions, multiplying that sum by 3, and then adding the sum of the digits in the odd-numbered positions. The final digit of the result is subtracted from 10 to calculate the check digit (or left as is if already zero).
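For the 13-digit form, the rule can be sketched in Python as below. The function name ean13_check_digit is hypothetical, positions are assumed to be counted from the left starting at 1, and the 12-digit inputs are arbitrary strings used only to exercise the arithmetic.

```python
def ean13_check_digit(first12: str) -> int:
    """Check digit for the first 12 digits of an EAN-13 number."""
    if len(first12) != 12 or not first12.isdigit():
        raise ValueError("expected exactly 12 digits")
    even = sum(int(d) for d in first12[1::2])  # 2nd, 4th, 6th, ... positions
    odd = sum(int(d) for d in first12[0::2])   # 1st, 3rd, 5th, ... positions
    last_digit = (3 * even + odd) % 10
    # The outer modulo leaves a final digit of 0 as check digit 0.
    return (10 - last_digit) % 10


print(ean13_check_digit("400638133393"))  # 3*23 + 20 = 89, so the digit is 1
print(ean13_check_digit("978030640615"))  # 3*22 + 27 = 93, so the digit is 7
```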
A GS1 check digit calculator and detailed documentation are online at GS1's website.
Other examples of check digits
- The tenth digit of the National Provider Identifier for the US healthcare industry
- The Australian Tax File Number (based on modulo 11)
- The Guatemalan Tax Number (NIT - Número de Identificación Tributaria), based on modulo 11
- The North American CUSIP number
- The final (ninth) digit of the routing transit number, a bank code used in the United States
- The International SEDOL number
- The International Securities Identifying Number (ISIN)
- The International CAS registry number's final digit
- Modulo 10 check digits in credit card account numbers, calculated with the Luhn algorithm
- Also used in the Norwegian KID (customer identification number) numbers used in bank giros (credit transfer)
- The final character encoded in a magnetic stripe card is a computed longitudinal redundancy check
- The final digit of a POSTNET code
- The final digit of an ISSN code
- The final digit of a DUNS number (though this is scheduled to change, such that the final digit will be chosen freely in new allocations, rather than being a check digit)
- The Spanish fiscal identification number (número de identificación fiscal, NIF), based on modulo 23
- The ninth digit of a Vehicle Identification Number (VIN)
- The ninth digit of an Israeli Teudat Zehut (Identity Card) number
- The 13th digit of the Serbian and Former Yugoslav Unique Master Citizen Number (JMBG)
- The last check digit in the EAN/UPC serialisation of the Global Trade Item Number (GTIN). It applies to GTIN-8, GTIN-12, GTIN-13 and GTIN-14.
- The seventh character of a New Zealand NHI Number
- The last digit on a New Zealand locomotive's Traffic Monitoring System (TMS) number
- The last two digits of the 11-digit Turkish Identification Number
- The third and fourth digits in an International Bank Account Number (modulo 97 check)
- The ninth character in the 14-character EU cattle passport number (cycles from 1 to 7: see British Cattle Movement Service)
- The ninth digit in an Icelandic Kennitala (national ID number)
- Modulo 97 check digits in Belgian and Serbian bank account numbers
- Mayo Clinic patient identification numbers used in Arizona and Florida include a trailing check digit
Algorithms
Notable algorithms include:
- Luhn algorithm (1954)
- Verhoeff algorithm (1969)
External links
- Identification numbers and check digit schemes (a mathematical explanation of various check digit schemes)
- GS1 check digit calculator
- Check identity numbers by check digits.