Greylisting
Greylisting is a method of defending e-mail users against spam. A mail transfer agent (MTA) using greylisting will "temporarily reject" any email from a sender it does not recognize. If the mail is legitimate, the originating server will try again after a delay and, if sufficient time has elapsed, the email will be accepted. If the mail is from a spam sender, sending to many thousands of email addresses, it will probably not be retried.
How it works
Typically, a server employing greylisting will record three pieces of data, known as a "triplet", for each incoming mail message:
- The IP address of the connecting host
- The envelope sender address
- The envelope recipient address(es)
This triplet is checked against the mail server's internal database. If the triplet has not been seen before (within some configurable period), the email is greylisted for a short time (also configurable) and refused with a temporary rejection carrying an SMTP 4xx error code. The assumption is that, since temporary failures are defined in the SMTP-related RFCs, a legitimate server will try again to deliver the email.
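The triplet check described above can be sketched in a few lines of Python. This is a minimal illustration, not the behaviour of any particular greylisting package: the class name `Greylist`, the parameters `min_delay` and `max_age`, their default values, and the choice of reply code 451 are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

Triplet = Tuple[str, str, str]  # (client IP, envelope sender, envelope recipient)


@dataclass
class Greylist:
    """Hypothetical in-memory greylist; a real MTA would persist this state."""

    min_delay: float = 300.0   # assumed: seconds a new triplet must wait
    max_age: float = 86400.0   # assumed: seconds before a triplet is forgotten
    first_seen: Dict[Triplet, float] = field(default_factory=dict)

    def check(self, ip: str, sender: str, recipient: str, now: float) -> int:
        """Return 451 (temporary rejection) or 250 (accept) for one attempt."""
        key = (ip, sender, recipient)
        seen = self.first_seen.get(key)
        if seen is None or now - seen > self.max_age:
            # Unknown or expired triplet: record it and defer the message.
            self.first_seen[key] = now
            return 451
        if now - seen < self.min_delay:
            # The retry came back before the greylisting delay elapsed.
            return 451
        return 250
```

Calling `check` with the same triplet at `now=0` and again at `now=600` defers the first attempt and accepts the second. The same sender connecting from a different IP address is treated as a new triplet and deferred again.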
The temporary rejection can be issued at different stages of the SMTP dialogue, allowing an implementation to store more or less data about the incoming message. The trade-off is more work and bandwidth for more exact matching of retries with original messages. Rejecting a message after its content has been received allows the server to store a choice of headers and/or a hash of the message body.
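As a rough sketch of rejecting after the message content has been received, the following Python function derives a fingerprint from a choice of headers and a hash of the body. A retry could then be matched to the original message even if other details differ. The function name `message_fingerprint`, the default header selection, and the use of SHA-256 are illustrative assumptions, not something the text prescribes.

```python
import hashlib
from email import message_from_bytes
from typing import Sequence


def message_fingerprint(
    raw: bytes, headers: Sequence[str] = ("Message-ID", "Date")
) -> str:
    """Hash the chosen headers plus the raw body of a received message."""
    msg = message_from_bytes(raw)
    digest = hashlib.sha256()
    for name in headers:
        # A missing header hashes as the empty string.
        digest.update(str(msg.get(name, "")).encode("utf-8", "replace"))
        digest.update(b"\0")  # separator so adjacent fields cannot run together
    # Everything after the first blank line is treated as the body.
    _, _, body = raw.partition(b"\r\n\r\n")
    digest.update(body)
    return digest.hexdigest()
```

Headers outside the chosen set, such as `Received`, do not affect the fingerprint, so two delivery attempts that took different routes still match. The cost is that the server must accept the full message data before it can reject, which is the bandwidth trade-off mentioned above.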
Large companies with big pools of sending machines typically break RFCs and greylisting systems by returning greylisted mail to the sending pool of servers, but to be fully compliant the sending server and its corresponding unique IP address are solely responsible for the delivery of the message. The practice of returning a message to a pool for later delivery breaks this rule; it is generally discouraged and will cause mail delays. Greylisting can generally be overridden by a fully validated TLS connection with a matching certificate.
Because large senders often have a pool of machines that can send (and resend) email, some implementations treat IP addresses whose most-significant 24 bits (/24) are the same as equivalent, or in some cases use SPF records to determine the sending pool. Similarly, some e-mail systems use unique per-message return-paths, for example variable envelope return path (VERP) for mailing lists, Sender Rewriting Scheme for forwarded e-mail, Bounce Address Tag Validation for backscatter protection, etc. If an exact match on the sender address is required, every e-mail from such systems will be delayed. Some greylisting systems try to avoid this delay by eliminating the variable parts of the VERP, using only the sender domain and the beginning of the local-part of the sender address.
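The two relaxations just described, grouping addresses by /24 and ignoring the variable part of the sender address, might look like this in Python. Both helper names are hypothetical. The rule used to find "the beginning of the local-part" (cut at the first `+` or `=`) is only one plausible heuristic chosen for illustration, and the network helper assumes IPv4 addresses.

```python
import ipaddress
import re


def network_key(ip: str, prefix: int = 24) -> str:
    """Collapse an IPv4 address to its enclosing network, e.g. a /24."""
    return str(ipaddress.ip_network(f"{ip}/{prefix}", strict=False))


def sender_key(address: str) -> str:
    """Keep the domain and the start of the local-part (assumed heuristic)."""
    if "@" not in address:
        return address  # e.g. the empty envelope sender used for bounces
    local, _, domain = address.rpartition("@")
    # Assumption: per-message data follows the first '+' or '=' delimiter.
    stem = re.split(r"[+=]", local, maxsplit=1)[0]
    return f"{stem}@{domain.lower()}"
```

With these helpers, `192.0.2.5` and `192.0.2.200` map to the same key `192.0.2.0/24`. Two VERP senders such as `list-bounces+alice=example.net@lists.example.org` and `list-bounces+bob=example.com@lists.example.org` both reduce to `list-bounces@lists.example.org`.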
Why it works
Greylisting is effective because many mass email tools used by spammers will not bother to retry a failed delivery, so the spam is never delivered. A spam sender may retry with a different sender, and possibly a different message, because it has a queue of victims rather than the proper queue of messages that regular mail servers maintain.
In addition, if a spammer does retry a delivery after the waiting period has expired, any one of a number of automated spamtraps will have had a good chance of identifying the spam source and listing both the source and the particular message in their databases. Thus, these subsequent attempts are more likely to be detected as spam by other mechanisms than they were before the greylisting delay.
Advantages
The main advantage from the users' point of view is that greylisting requires no additional configuration from their end. If the server utilizing greylisting is configured appropriately, the end user will only notice a delay on the first message from a given sender, so long as the sending email server is identified as belonging to the same whitelisted group as earlier messages. If mail from the same sender is repeatedly greylisted, it may be worth contacting the mail system administrator with detailed headers of delayed mail.
From a mail administrator's point of view the benefit is twofold. First, greylisting takes minimal configuration to get up and running, with occasional modifications of any local whitelists. Second, rejecting email with a temporary 451 error (the actual error code is implementation dependent) is very cheap in system resources. Most spam filtering tools are very intensive users of CPU and memory. By stopping spam before it hits filtering processes, far fewer system resources are used. This allows more layers of spam filtering or higher throughput, since greylisting can easily be configured as a first line of defense with a heuristic filter such as SpamAssassin handling messages that go through.
Some greylisting packages support a SQL backend which allows for a distributed multiple-server frontend to be deployed with the same greylisting data on all frontends.
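To illustrate the idea of several frontends sharing one greylisting table, here is a small Python sketch that keeps the triplets in SQL. It uses the standard-library `sqlite3` module purely as a stand-in for whatever database server a real package would use. The table layout, the function names `open_store` and `check`, and the 300-second default delay are assumptions, and `INSERT OR IGNORE` is SQLite-specific syntax.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS greylist (
    ip TEXT NOT NULL,
    sender TEXT NOT NULL,
    recipient TEXT NOT NULL,
    first_seen REAL NOT NULL,
    PRIMARY KEY (ip, sender, recipient)
)
"""


def open_store(path: str) -> sqlite3.Connection:
    """Open the shared store, creating the (hypothetical) table if needed."""
    db = sqlite3.connect(path)
    db.execute(SCHEMA)
    db.commit()
    return db


def check(db: sqlite3.Connection, ip: str, sender: str, recipient: str,
          now: float, min_delay: float = 300.0) -> int:
    """Record the triplet if it is new, then return 451 (defer) or 250 (accept)."""
    db.execute(
        "INSERT OR IGNORE INTO greylist (ip, sender, recipient, first_seen) "
        "VALUES (?, ?, ?, ?)",
        (ip, sender, recipient, now),
    )
    row = db.execute(
        "SELECT first_seen FROM greylist "
        "WHERE ip = ? AND sender = ? AND recipient = ?",
        (ip, sender, recipient),
    ).fetchone()
    db.commit()
    return 250 if now - row[0] >= min_delay else 451
```

Each frontend would open its own connection to the same database, so a retry that lands on a different frontend still finds the triplet recorded by the first one.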
Disadvantages
The biggest disadvantage of greylisting is that it destroys the near-instantaneous nature of email that users have come to expect. Mail from unrecognised senders is typically delayed by about 15 minutes, and by up to four hours. A customer of a greylisting ISP cannot always rely on getting every email in a pre-determined amount of time.
However, the original specification for email states that it is neither a guaranteed nor an instantaneous delivery mechanism. This means that greylisting is a perfectly legitimate process and does not break any protocols or rules.
If mail from a particular frequent sender is sent from any of several mail servers, mail may be delayed unless the greylisting server recognises the different servers as belonging to the same whitelisted group.
On a technical level, some SMTP clients and SMTP servers acting as clients may interpret the temporary rejection as a permanent failure. Old clients conforming only to the obsolete specification (RFC 821) and ignoring its recommendations may give up on delivery after the first failed attempt—RFC 821 states that clients "should" retry messages rather than using the word "must". RFC 2119 dictates that "should" means recommended and to ignore at your own risk, and it is a violation of the current SMTP standard for the client to fail to retry. The current SMTP specification (RFC 5321) clearly states that "the SMTP client retains responsibility for delivery of that message" (section 4.2.5) and "mail that cannot be transmitted immediately MUST be queued and periodically retried by the sender." (section 4.5.4.1).
This problem can affect SMTP clients in unexpected ways. Most MTAs will queue and retry messages, but a small number do not. A similar concern exists for applications which act as SMTP clients and fail to incorporate any form of queueing for deferred SMTP mail. This can be mitigated on the sending side by configuring the application to use a local SMTP server as an outbound queue, instead of attempting direct delivery. For the server operator who uses greylisting, clients which are known to fail on temporary errors can be supported by whitelisting or exception lists.
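The distinction that such clients get wrong can be stated in a few lines of Python. This is only a sketch of the decision a sending application has to make from the SMTP reply code; the function name `disposition` and the three labels are made up for the example.

```python
def disposition(reply_code: int) -> str:
    """Map a final SMTP reply code to what a sending client should do next."""
    if 200 <= reply_code <= 299:
        return "delivered"
    if 400 <= reply_code <= 499:
        # Temporary failure (what greylisting returns): queue and retry later.
        return "retry"
    if 500 <= reply_code <= 599:
        # Permanent failure: report non-delivery to the original sender.
        return "bounce"
    raise ValueError(f"not a final SMTP reply code: {reply_code}")
```

A client that treats `disposition(451)` the same way as `disposition(550)` gives up on greylisted mail after the first attempt, which is the failure mode described above.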
Some MTAs, upon encountering the temporary failure message from a greylisting server on the first attempt, will send a warning message back to the original sender of the message. The warning message is not a bounce message, but it is often formatted similarly to one and reads like one. This practice often causes the sender to believe that the message has not been delivered, when in fact the message will be delivered successfully at a later time.
When a mail server is greylisted, the time between the initial rejection and the retransmission is variable. Some mail servers use a default of four hours, though most will retry sooner. Most open-source MTAs have retry rules set to attempt delivery after around fifteen minutes (attempt times in minutes: Sendmail default is 0, 15, ..., Exim default is 0, 15, ..., Postfix default is 0, 16.6, ..., Qmail default is 0, 6:40, 26:40, ..., Courier default is 0, 5, 10, 15, 30, 35, 40, 70, 75, 80, ...). Microsoft Exchange defaults to 0, 1, 2, 22, 42, 62, ..., and Message Systems Momentum defaults to 0, 20, 60, 100, 180, ...
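To see how a retry schedule interacts with the greylisting delay, the following Python sketch finds the first retry that falls after the delay has elapsed. The function name `delivery_delay` is hypothetical, and it assumes that every attempt comes from the same triplet and that the first attempt is always deferred.

```python
from typing import Optional, Sequence


def delivery_delay(schedule: Sequence[float], greylist_delay: float) -> Optional[float]:
    """Return the minutes from the first attempt to the first accepted retry.

    `schedule` lists attempt times in minutes, first attempt included.
    Returns None if no listed retry falls after the greylisting delay.
    """
    if not schedule:
        return None
    first, *retries = schedule
    for attempt_time in retries:
        if attempt_time - first >= greylist_delay:
            return attempt_time - first
    return None
```

Using only the attempt times listed above and an assumed greylisting delay of 5 minutes, a sender retrying at 0, 15 is accepted after 15 minutes. One retrying at 0, 1, 2, 22, 42, 62 has its attempts at 1 and 2 minutes deferred and is accepted after 22 minutes.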
Greylisting delays much of the mail from non-whitelisted mail servers—not just spam—until typical patterns of communication are recorded by the greylisting system. For best results, whitelisting should be used extensively. A static list of public servers worth being whitelisted can be found in the greylisting.org repository.
Also, legitimate mail might not get delivered if the retry does not arrive within the time window the greylisting software uses, or if the retry comes from a different IP address from the original attempt. When the source of an email is a server farm or goes out through an anti-spam mail relay service, it is likely that on the retry a server other than the original server will make the next attempt. Since the IP addresses will be different, the recipient's server will fail to recognize that the two attempts are related and refuse the latest connection as well. This can continue until the message ages out of the queue if the number of servers is large enough. Such server farming techniques can be construed as breaking RFCs detailed above since the original sending machine has absolved itself of the responsibility of mail delivery by tossing it back into the pool, which breaks the state of the mail delivery process. This problem can partially be bypassed by identifying and whitelisting such server farms in advance. However, it is not possible on a distributed network the size of the Internet to maintain a complete list of all such server farms.
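The effect of retrying from a pool can be illustrated with a small simulation in Python. It is a sketch under simplifying assumptions: the sender and recipient are fixed (so the triplet varies only by IP address), the pool hands each retry to the next server in round-robin order, and no /24 or SPF grouping is applied. The function name `attempts_until_accepted` and its parameters are made up for the example.

```python
from typing import Dict, Optional, Sequence


def attempts_until_accepted(
    pool_ips: Sequence[str],
    retry_interval: float,
    min_delay: float,
    max_attempts: int = 50,
) -> Optional[int]:
    """Count delivery attempts until one is accepted, or None if none is."""
    first_seen: Dict[str, float] = {}
    for attempt in range(max_attempts):
        now = attempt * retry_interval
        ip = pool_ips[attempt % len(pool_ips)]  # round-robin over the pool
        if ip not in first_seen:
            # New IP address means a new triplet: greylisting starts over.
            first_seen[ip] = now
            continue
        if now - first_seen[ip] >= min_delay:
            return attempt + 1
    return None
```

With a single sending address, a 15-minute retry interval and a 5-minute greylisting delay, the second attempt is accepted. With four addresses the message is only accepted on the fifth attempt, once the rotation returns to an address that has been seen before. If the pool is larger than the number of attempts the sender is willing to make, the message is never accepted.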
Greylisting can be a particular nuisance with websites that require an account to be created and the email address confirmed before they can be used. If the sending MTA of the site is poorly configured, greylisting may delay the initial email containing the signup confirmation link, thus introducing a waiting period even though the website may have attempted to send out the confirmation email immediately. Almost all stock-configured Sendmail MTAs (Sendmail being the most widely deployed MTA on the Internet) will retry after a few minutes, leading to typical delays of under 10 minutes in most cases (still dependent on the greylisting configuration).
Greylisting is particularly effective in many cases at weeding out misconfigured MTAs, and is gaining in popularity as a very effective anti-spam tool. It is likely that those MTAs that do not correctly handle greylisting will become less numerous as greylisting spreads.
In order for greylisting to work for a particular domain, all backup mail servers (as specified by lower-priority MX records for the domain) must implement the greylisting policy as well.
Also, if certain details of the sending vary and the receiving MTA is not programmed to notice this, a message may be greylisted eternally and never delivered.
Greylisting will cause longer delivery delays if the sender has a large infrastructure and sends from a different IP address when it retries. However, this technically breaks SMTP protocol rules, since delivery is the responsibility of the sending server and its associated IP address. "Tossing it back into a pool" for retry by a different server in the group breaks this continuity, and will quite correctly and legitimately restart the greylisting process, since delivery is being retried from a different server. Technically there is no reason to throw a message back into a pool for retry by a different server. If load balancing and capacity handling require multiple outbound servers, performing the load balancing before the message is inserted into the queue of a delivery server is a simple and obvious way round this: it allows the SMTP protocol to be adhered to and avoids the legitimate re-greylisting by any receiving MTA when the sender IP address changes. In this case it is clearly the sending system's infrastructure implementation that indirectly causes the delay, by virtue of its inability to maintain delivery retries from a single IP address.