Network congestion
In data networking and queueing theory, network congestion occurs when a link or node is carrying so much data that its quality of service deteriorates. Typical effects include queueing delay, packet loss or the blocking of new connections. A consequence of these latter two is that incremental increases in offered load lead either only to small increases in network throughput, or to an actual reduction in network throughput.
Network protocols which use aggressive retransmissions to compensate for packet loss tend to keep systems in a state of network congestion even after the initial load has been reduced to a level which would not normally have induced network congestion. Thus, networks using these protocols can exhibit two stable states under the same level of load. The stable state with low throughput is known as congestive collapse.
Modern networks use congestion control and network congestion avoidance techniques to try to avoid congestion collapse. These include: exponential backoff in protocols such as 802.11's CSMA/CA and the original Ethernet, window reduction in TCP, and fair queueing in devices such as routers. Another method to avoid the negative effects of network congestion is implementing priority schemes, so that some packets are transmitted with higher priority than others. Priority schemes do not solve network congestion by themselves, but they help to alleviate the effects of congestion for some services. An example of this is 802.1p. A third method to avoid network congestion is the explicit allocation of network resources to specific flows. One example of this is the use of Contention-Free Transmission Opportunities (CFTXOPs) in the ITU-T G.hn standard, which provides high-speed (up to 1 Gbit/s) local area networking over existing home wiring (power lines, phone lines and coaxial cables).
RFC 2914 addresses the subject of congestion control in detail.
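As a concrete illustration of the first of these techniques, here is a minimal sketch of truncated binary exponential backoff, the retry strategy used by the original Ethernet and, in modified form, by 802.11's CSMA/CA; the slot time and exponent cap below are illustrative values rather than figures from either standard.

    import random

    def backoff_delay(attempt, slot_time=51.2e-6, max_exponent=10):
        # After the n-th consecutive collision, wait a random number of
        # slots drawn uniformly from [0, 2^n - 1]; the exponent is capped
        # ("truncated") so the contention window stops growing.
        exponent = min(attempt, max_exponent)
        slots = random.randint(0, 2 ** exponent - 1)
        return slots * slot_time

    # The expected wait roughly doubles with each consecutive collision.
    for attempt in range(1, 6):
        print(attempt, backoff_delay(attempt))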
Network capacity
The fundamental problem is that all network resources are limited, including router processing time and link throughput.
However:
- Today's (2006) wireless LAN effective bandwidth throughput (15-100 Mbit/s) is easily filled by a single personal computer.
- Even on fast computer networks (e.g. 1 Gbit/s), the backbone can easily be congested by a few servers and client PCs.
- Because P2P scales very well, file transmissions by P2P will fill an uplink or some other network bottleneck, particularly when nearby peers are preferred over distant peers.
- Denial-of-service attacks by botnets are capable of filling even the largest Internet backbone network links (40 Gbit/s as of 2007), generating large-scale network congestion.
Congestive collapse
Congestive collapse (or congestion collapse) is a condition that a packet-switched computer network can reach when little or no useful communication is happening due to congestion. Congestion collapse generally occurs at choke points in the network, where the total incoming bandwidth to a node exceeds the outgoing bandwidth. Connection points between a local area network and a wide area network are the most likely choke points. A DSL modem is the most common small-network example, with between 10 and 1000 Mbit/s of incoming bandwidth and at most 8 Mbit/s of outgoing bandwidth.
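A back-of-the-envelope calculation shows how quickly such a rate mismatch overwhelms a queue; the 1 MB buffer below is an arbitrary example value.

    def buffer_fill_time(in_mbps, out_mbps, buffer_bytes):
        # Seconds until a queue of buffer_bytes overflows under a
        # sustained mismatch between arrival and departure rates.
        surplus_bits_per_s = (in_mbps - out_mbps) * 1e6
        return buffer_bytes * 8 / surplus_bits_per_s

    # A 100 Mbit/s LAN feeding an 8 Mbit/s uplink through a 1 MB buffer:
    print(buffer_fill_time(100, 8, 1_000_000))  # ~0.087 s until tail-drop begins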
When a network is in such a condition, it has settled (under overload) into a stable state where traffic demand is high but little useful throughput is available, packet delay and loss are high (caused by routers discarding packets because their output queues are too full), and general quality of service is extremely poor.
History
Congestion collapse was identified as a possible problem as far back as 1984 (RFC 896, dated 6 January). It was first observed on the early Internet in October 1986, when the NSFnet phase-I backbone dropped three orders of magnitude from its capacity of 32 kbit/s to 40 bit/s, and this continued to occur until end nodes started implementing Van Jacobson's congestion control between 1987 and 1988.
Cause
When more packets were sent than could be handled by intermediate routers, the intermediate routers discarded many packets, expecting the end points of the network to retransmit the information. However, early TCP implementations had poor retransmission behavior. When this packet loss occurred, the end points sent extra packets that repeated the information lost, doubling the data rate sent, exactly the opposite of what should be done during congestion. This pushed the entire network into a 'congestion collapse' in which most packets were lost and the resultant throughput was negligible.
Congestion control
Congestion control concerns controlling traffic entry into a telecommunications network, so as to avoid congestive collapse by attempting to avoid oversubscription of any of the processing or link capabilities of the intermediate nodes and networks, and by taking resource-reducing steps such as reducing the rate of sending packets. It should not be confused with flow control, which prevents the sender from overwhelming the receiver.
Theory of congestion control
The modern theory of congestion control was pioneered by Frank Kelly, who applied microeconomic theory and convex optimization theory to describe how individuals controlling their own rates can interact to achieve an "optimal" network-wide rate allocation.
Examples of "optimal" rate allocation are max-min fair allocation
Max-min fairness
In communication networks and multiplexing, a division of the bandwidth resources is said to be max-min fair when: firstly, the minimum data rate that a dataflow achieves is maximized; secondly, the second lowest data rate that a dataflow achieves is maximized, etc.In best-effort statistical...
and Kelly's suggestion of proportional fair allocation, although many others are possible.
The mathematical expression for optimal rate allocation is as follows. Let $x_i$ be the rate of flow $i$, $C_l$ be the capacity of link $l$, and $r_{li}$ be 1 if flow $i$ uses link $l$ and 0 otherwise. Let $x$, $C$ and $R$ be the corresponding vectors and matrix. Let $U(x)$ be an increasing, strictly concave function, called the utility, which measures how much benefit a user obtains by transmitting at rate $x$. The optimal rate allocation then satisfies

$$\max_x \sum_i U(x_i) \quad \text{such that} \quad Rx \le C.$$
The Lagrange dual of this problem decouples, so that each flow sets its own rate, based only on a "price" signalled by the network. Each link capacity imposes a constraint, which gives rise to a Lagrange multiplier $p_l$. The sum of these Lagrange multipliers, $y_i = \sum_l p_l r_{li}$, is the price to which flow $i$ responds.
Congestion control then becomes a distributed optimisation algorithm for solving the above problem. Many current congestion control algorithms can be modelled in this framework, with $p_l$ being either the loss probability or the queueing delay at link $l$.
A major weakness of this model is that it assumes all flows observe the same price, while sliding window flow control causes "burstiness" that leads different flows to observe different loss or delay at a given link.
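As a concrete illustration of the dual decomposition, the sketch below uses logarithmic (proportionally fair) utilities $U(x_i) = \log x_i$, for which the rate maximizing $U(x_i) - y_i x_i$ is simply $x_i = 1/y_i$; the two-link, three-flow topology, the capacities and the step size are invented for the example.

    import numpy as np

    # Routing matrix R: R[l, i] = 1 if flow i uses link l.
    # Flow 0 crosses both links; flows 1 and 2 use one link each.
    R = np.array([[1, 1, 0],
                  [1, 0, 1]], dtype=float)
    C = np.array([1.0, 2.0])   # link capacities
    p = np.ones(2)             # link prices (Lagrange multipliers)
    gamma = 0.1                # dual step size

    for _ in range(2000):
        y = R.T @ p            # total price seen by each flow
        x = 1.0 / y            # each flow maximizes log(x) - y*x on its own
        # Each link raises its price when over capacity, lowers it when under.
        p = np.maximum(p + gamma * (R @ x - C), 1e-6)

    print(x)  # approaches the proportionally fair rates, about [0.42, 0.58, 1.58]

Each update uses only local information: a link needs only its own load, and a flow needs only the sum of prices along its path, which is what makes the algorithm distributable.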
Classification of congestion control algorithms
There are many ways to classify congestion control algorithms:
- By the type and amount of feedback received from the network: loss; delay; single-bit or multi-bit explicit signals
- By incremental deployability on the current Internet: Only sender needs modification; sender and receiver need modification; only router needs modification; sender, receiver and routers need modification.
- By the aspect of performance it aims to improve: high bandwidth-delay product networks; lossy links; fairness; advantage to short flows; variable-rate links
- By the fairness criterion it uses: max-min, proportional, "minimum potential delay"
Avoidance
The prevention of network congestion and collapse requires two major components:
- A mechanism in routers to reorder or drop packets under overload,
- End-to-end flow control mechanisms designed into the end points which respond to congestion and behave appropriately.
The correct end point behaviour is usually still to repeat dropped information, but progressively slow the rate at which it is repeated. Provided all end points do this, the congestion lifts and good use of the network occurs, and the end points all get a fair share of the available bandwidth. Other strategies, such as slow-start, ensure that new connections don't overwhelm the router before congestion detection can kick in.
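The usual realisation of "repeat, but progressively slow down" is additive-increase/multiplicative-decrease (AIMD). The toy loop below, with an invented link capacity and parameters, shows the characteristic sawtooth: the rate climbs linearly until loss is detected, then is halved.

    def aimd(capacity=100.0, increase=1.0, decrease=0.5, steps=50):
        # One AIMD flow probing a fixed-capacity link: add a constant each
        # round trip without loss, multiply by `decrease` when loss occurs.
        rate, trace = 10.0, []
        for _ in range(steps):
            if rate > capacity:      # loss: the queue overflowed
                rate *= decrease     # multiplicative decrease
            else:
                rate += increase     # additive increase
            trace.append(rate)
        return trace

    print(aimd())  # sawtooth oscillating just around the capacity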
The most common router mechanisms used to prevent congestive collapses are fair queueing and other scheduling algorithms, and random early detection (RED), where packets are randomly dropped proactively, triggering the end points to slow transmission before congestion collapse actually occurs. Fair queueing is most useful in routers at choke points with a small number of connections passing through them. Larger routers must rely on RED.
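For concreteness, here is a sketch of one common fair-queueing variant, deficit round robin: each flow's queue earns a quantum of credit per round and may only dequeue packets it can pay for, so a flow with large packets cannot crowd out the others. The quantum and packet sizes are arbitrary.

    from collections import deque

    def deficit_round_robin(queues, quantum=500):
        # queues: one deque of packet sizes (bytes) per flow. Yields
        # (flow_index, packet_size) in the order packets would be sent.
        deficits = [0] * len(queues)
        while any(queues):
            for i, q in enumerate(queues):
                if not q:
                    deficits[i] = 0          # idle flows bank no credit
                    continue
                deficits[i] += quantum
                while q and q[0] <= deficits[i]:
                    deficits[i] -= q[0]
                    yield i, q.popleft()

    flows = [deque([1500, 1500, 1500]),      # bulk flow, large packets
             deque([100] * 10)]              # interactive flow, small packets
    print(list(deficit_round_robin(flows)))  # bytes, not packets, are shared fairly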
Some end-to-end protocols are better behaved under congested conditions than others. TCP is perhaps the best behaved. The first TCP implementations to handle congestion well were developed in 1984, but it was not until Van Jacobson's inclusion of an open source solution in the Berkeley Standard Distribution UNIX ("BSD") in 1988 that good TCP implementations became widespread.
UDP does not, in itself, have any congestion control mechanism. Protocols built atop UDP must handle congestion in their own way. Protocols atop UDP which transmit at a fixed rate, independent of congestion, can be troublesome. Real-time streaming protocols, including many Voice over IP protocols, have this property. Thus, special measures, such as quality-of-service routing, must be taken to keep packets from being dropped from streams.
In general, congestion in pure datagram networks must be kept to the periphery of the network, where the mechanisms described above can handle it. Congestion in the Internet backbone is very difficult to deal with. Fortunately, cheap fiber-optic lines have reduced costs in the Internet backbone, so the backbone can be provisioned with enough bandwidth to keep congestion at the periphery.
Practical network congestion avoidance
Implementations of connection-oriented protocols, such as the widely used TCP protocol, generally watch for packet errors, losses, or delays (see Quality of Service) in order to adjust the transmit speed. There are many different network congestion avoidance processes, since there are a number of different trade-offs available.
TCP/IP congestion avoidance
The TCP congestion avoidance algorithm is the primary basis for congestion control in the Internet. Problems occur when many concurrent TCP flows experience port queue buffer tail-drops; TCP's automatic congestion avoidance is then not enough. All flows that experience port queue buffer tail-drop will begin a TCP retrain at the same moment; this is called TCP global synchronization.
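A simplified simulation of the effect: several AIMD flows share one tail-drop queue, every flow sees loss in the same round, and so all of them back off together. The capacity and flow count are arbitrary.

    def tail_drop_sync(n_flows=4, capacity=100.0, rounds=60):
        # Tail-drop hits all flows sharing the overflowing queue at once,
        # so they halve simultaneously: TCP global synchronization.
        rates = [10.0 + 5.0 * i for i in range(n_flows)]  # staggered starts
        for _ in range(rounds):
            if sum(rates) > capacity:
                rates = [r * 0.5 for r in rates]   # synchronized halving
            else:
                rates = [r + 1.0 for r in rates]   # everyone probes upward
        return rates

    print(tail_drop_sync())  # the flows move in lockstep, leaving the link underused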
Purpose
"Recommendations on Queue Management and Congestion Avoidance in the Internet" (RFC 2309) states that:
- Fewer packets will be dropped with Active Queue Management (AQM).
- The link utilization will increase because less TCP global synchronization will occur.
- By keeping the average queue size small, queue management will reduce the delays and jitter seen by flows.
- The connection bandwidth will be more equally shared among connection oriented flows, even without flow-based RED or WRED.
Random early detection
One solution is to use random early detection (RED) on the network equipment's port queue buffers.
On network equipment ports with more than one queue buffer, weighted random early detection (WRED) could be used if available.
RED indirectly signals to sender and receiver by deleting some packets, e.g. when the average queue buffer length exceeds a lower threshold (e.g. 50% full), and deletes linearly or (better, according to the paper) cubically more packets up to a higher threshold (e.g. 100% full). The average queue buffer lengths are computed over one second at a time.
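A minimal sketch of the linear variant of this drop rule, using the 50% and 100% thresholds from the text and an exponentially weighted moving average standing in for the one-second average; the smoothing weight and maximum drop probability are illustrative, not the values from the original RED paper.

    import random

    class SimpleRED:
        # Linear RED: never drop below min_th, always drop above max_th,
        # drop with linearly rising probability in between, all computed
        # on a smoothed (EWMA) queue length rather than the instantaneous one.
        def __init__(self, capacity=100, min_th=0.5, max_th=1.0,
                     max_p=0.1, weight=0.02):
            self.capacity, self.min_th, self.max_th = capacity, min_th, max_th
            self.max_p, self.weight, self.avg = max_p, weight, 0.0

        def should_drop(self, queue_len):
            self.avg += self.weight * (queue_len - self.avg)
            fill = self.avg / self.capacity
            if fill < self.min_th:
                return False
            if fill >= self.max_th:
                return True
            p = self.max_p * (fill - self.min_th) / (self.max_th - self.min_th)
            return random.random() < p

    red = SimpleRED()
    print([red.should_drop(q) for q in range(0, 100, 10)])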
Robust random early detection (RRED)
The Robust Random Early Detection (RRED) algorithm was proposed to improve TCP throughput against Denial-of-Service (DoS) attacks, particularly Low-rate Denial-of-Service (LDoS) attacks. Experiments have confirmed that existing RED-like algorithms are notably vulnerable to LDoS attacks due to the oscillating TCP queue size caused by the attacks. The RRED algorithm can significantly improve the performance of TCP under LDoS attacks.
Flowbased-RED/WRED
Some network equipment is equipped with ports that can follow and measure each flow (flow-based RED/WRED) and is thereby able to signal a flow that uses too much bandwidth, according to some QoS policy. A policy could divide the bandwidth among all flows by some criteria.
IP ECN
Another approach is to use IP ECN (Explicit Congestion Notification). ECN is only used when the two hosts signal that they want to use it. With this method, an ECN bit is used to signal explicit congestion. This is better than the indirect packet-delete congestion notification performed by the RED/WRED algorithms, but it requires explicit support by both hosts to be effective.
Some outdated or buggy network equipment drops packets with the ECN bit set, rather than ignoring the bit. More information on the status of ECN, including the version required for Cisco IOS, is available from Sally Floyd, one of the authors of ECN.
When a router receives a packet marked as ECN-capable and anticipates congestion (using RED), it sets an ECN flag, notifying the sender of congestion. The sender should then decrease its transmission bandwidth, e.g. by decreasing its TCP window size (sending rate) or by other means.
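The two ECN bits occupy the low-order bits of the IP header's traffic-class (TOS) octet, as defined in RFC 3168. The sketch below shows the four codepoints and the router-side marking decision described above; it illustrates the logic only and is not a packet parser.

    # ECN codepoints (low two bits of the IP TOS/traffic-class octet):
    NOT_ECT = 0b00   # endpoint does not support ECN
    ECT_1   = 0b01   # ECN-capable transport
    ECT_0   = 0b10   # ECN-capable transport
    CE      = 0b11   # congestion experienced

    def mark_on_congestion(tos):
        # If RED would drop this packet but the sender declared ECN
        # capability, set CE and forward instead of dropping.
        ecn = tos & 0b11
        if ecn in (ECT_0, ECT_1):
            return tos | CE, False      # mark, don't drop
        return tos, True                # not ECN-capable: drop as usual

    print(mark_on_congestion(0b00000010))  # ECT(0) packet -> marked CE, kept
    print(mark_on_congestion(0b00000000))  # Not-ECT packet -> dropped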
Cisco AQM: Dynamic buffer limiting (DBL)
Cisco has taken a step further in their Catalyst 4000 series with engine IV and V, which can classify all flows as "aggressive" (bad) or "adaptive" (good). This ensures that no flows fill the port queues for a long time. DBL can utilize IP ECN instead of packet-delete signalling.
TCP Window Shaping
Congestion avoidance can also be achieved efficiently by reducing the amount of traffic flowing into a network. When an application requests a large file, graphic or web page, it usually advertises a "window" of between 32K and 64K. This results in the server sending a full window of data (assuming the file is larger than the window). When many applications simultaneously request downloads, this data can create a congestion point at an upstream provider by flooding the queue much faster than it can be emptied. By using a device to reduce the window advertisement, the remote servers send less data, reducing the congestion and allowing traffic to flow more freely. This technique can reduce congestion in a network by a factor of 40.
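On an end host, one way to shrink the advertised window is to reduce the socket receive buffer before connecting, since on common stacks the advertised TCP window cannot exceed it. A sketch using the standard sockets API; the host, port and 16 KB figure are arbitrary, and exact behaviour (such as Linux doubling the requested size) is platform-dependent.

    import socket

    # A smaller SO_RCVBUF caps the receive window the kernel advertises,
    # so the remote server sends data in smaller bursts.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 16 * 1024)
    s.connect(("example.com", 80))
    s.sendall(b"GET / HTTP/1.0\r\nHost: example.com\r\n\r\n")
    data = s.recv(4096)
    s.close()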
Radio links
The protocols that avoid congestive collapse are often based on the idea that data loss on the Internet is caused by congestion. This is true in nearly all cases; errors during transmission are rare on today's fiber-based Internet. However, this causes Wi-Fi, 3G or other networks with a radio layer to have poor throughput in some cases, since wireless networks are susceptible to data loss due to interference. TCP connections running over a radio-based physical layer see the data loss and tend to believe that congestion is occurring when it isn't, erroneously reducing the data rate sent.
Short-lived connections
The slow-start protocol performs badly for short-lived connections. Older web browsers would create many consecutive short-lived connections to the web server, opening and closing the connection for each file requested. This kept most connections in slow-start mode, which resulted in poor response time. To avoid this problem, modern browsers either open multiple connections simultaneously or reuse one connection for all files requested from a particular web server. However, initial performance can be poor, and many connections never get out of the slow-start regime, significantly increasing latency.
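A quick calculation shows why short transfers suffer: under slow start the congestion window roughly doubles each round trip, so a small file completes in a handful of RTTs at a low average rate. The segment size and initial window below are illustrative.

    def slow_start_rounds(file_bytes, mss=1460, init_cwnd=1):
        # Round trips needed to move file_bytes if the window (in
        # segments) doubles each RTT and never hits a loss or the
        # receiver's advertised limit.
        segments = -(-file_bytes // mss)   # ceiling division
        cwnd, rounds = init_cwnd, 0
        while segments > 0:
            segments -= cwnd
            cwnd *= 2
            rounds += 1
        return rounds

    # A 50 KB page is 35 segments, sent as 1+2+4+8+16+4 over 6 RTTs:
    # the whole transfer lives and dies inside slow start.
    print(slow_start_rounds(50_000))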
See also
- Bandwidth management
- Bufferbloat
- Cascading failure
- Choke exchange
- Erlang unit
- Explicit Congestion Notification
- Max-min fairness
- Quality of service
- Sorcerer's Apprentice Syndrome
- TCP congestion avoidance algorithm
- Teletraffic engineering
- Thrashing
- Traffic shaping
Books
- "Deploying IP and MPLS QoS for Multiservice Networks: Theory and Practice" by John Evans, Clarence Filsfils (Morgan Kaufmann, 2007, ISBN 0-12-370549-5)
External links
- Nagle, J. RFC 896: Congestion control in IP/TCP internetworks (1984)
- Floyd, S. RFC 2914: Congestion control principles (2000)
- Floyd, S. and K. Fall, Promoting the Use of End-to-End Congestion Control in the Internet (IEEE/ACM Transactions on Networking, August 1999)
- Sally Floyd, On the Evolution of End-to-end Congestion Control in the Internet: An Idiosyncratic View (IMA Workshop on Scaling Phenomena in Communication Networks, October 1999) (pdf format)
- Linktionary term: Queuing
- Pierre-Francois Quet, Sriram Chellappan, Arjan Durresi, Mukundan Sridharan, Hitay Ozbay, Raj Jain, "Guidelines for optimizing Multi-Level ECN, using fluid flow based TCP model"
- Sally Floyd, Ratul Mahajan, David Wetherall: RED-PD: RED with Preferential Dropping
- A Generic Simple RED Simulator for educational purposes by Mehmet Suzen
- Approaches to Congestion Control in Packet Networks
- Papers in Congestion Control
- Random Early Detection Homepage
- Explicit Congestion Notification Homepage
- TFRC Homepage
- AIMD-FC Homepage
- TCP congestion control simulation: Fast recovery