Self-similar process
Encyclopedia
Self-similar processes are types of stochastic processes that exhibit the phenomenon of self-similarity
. A self-similar phenomenon behaves the same when viewed at different degrees of magnification, or different scales on a dimension (space or time). Self-similar processes can sometimes be described using heavy-tailed distribution
s, also known as long-tailed distributions. Example of such processes include traffic processes such as packet inter-arrival times and burst lengths. Self-similar processes can exhibit long-range dependency
.
world. To achieve this goal,
understanding the characteristics of Internet traffic plays a more and more critical
role. Empirical studies of measured traffic traces have led to the wide recognition of
self-similarity in network traffic.
Self-similar Ethernet
traffic exhibits dependencies over a long range of time scales. This is to be contrasted with telephone traffic which is Poisson
in its arrival and departure process.
In traditional Poisson traffic, the short-term fluctuations would average out, and a graph covering a large amount of time would approach a constant value.
Heavy-tailed distributions have been observed in many natural phenomena including both physical and sociological phenomena. Mandelbrot
established the use of heavy-tailed distributions to model real-world fractal
phenomena, e.g. Stock markets, earthquakes, climate, and the weather.
Ethernet, WWW
, SS7, TCP
, FTP
, TELNET
and VBR
video (digitised video of the type that is transmitted over ATM
networks) traffic is self-similar.
Self-similarity in packetised data networks can be caused by the distribution of file sizes, human interactions and/ or Ethernet dynamics. Self-similar and long-range dependent characteristics in computer networks present a fundamentally different set of problems to people doing analysis and/or design of networks, and many of the previous assumptions upon which systems have been built are no longer valid in the presence of self-similarity.
with a memoryless
waiting-time distribution, used to model (among many things) traditional telephony networks, is briefly reviewed below.
Assuming pure-chance arrivals and pure-chance terminations leads to the following:
Self-similarity
In mathematics, a self-similar object is exactly or approximately similar to a part of itself . Many objects in the real world, such as coastlines, are statistically self-similar: parts of them show the same statistical properties at many scales...
. A self-similar phenomenon behaves the same when viewed at different degrees of magnification, or different scales on a dimension (space or time). Self-similar processes can sometimes be described using heavy-tailed distribution
Heavy-tailed distribution
In probability theory, heavy-tailed distributions are probability distributions whose tails are not exponentially bounded: that is, they have heavier tails than the exponential distribution...
s, also known as long-tailed distributions. Example of such processes include traffic processes such as packet inter-arrival times and burst lengths. Self-similar processes can exhibit long-range dependency
Long-range dependency
Long-range dependency is a phenomenon that may arise in the analysis of spatial or time series data. It relates to the rate of decay of statistical dependence, with the implication that this decays more slowly than an exponential decay, typically a power-like decay...
.
Overview
The design of robust and reliable networks and network services has become an increasingly challenging task in today's InternetInternet
The Internet is a global system of interconnected computer networks that use the standard Internet protocol suite to serve billions of users worldwide...
world. To achieve this goal,
understanding the characteristics of Internet traffic plays a more and more critical
role. Empirical studies of measured traffic traces have led to the wide recognition of
self-similarity in network traffic.
Self-similar Ethernet
Ethernet
Ethernet is a family of computer networking technologies for local area networks commercially introduced in 1980. Standardized in IEEE 802.3, Ethernet has largely replaced competing wired LAN technologies....
traffic exhibits dependencies over a long range of time scales. This is to be contrasted with telephone traffic which is Poisson
Poisson distribution
In probability theory and statistics, the Poisson distribution is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time and/or space if these events occur with a known average rate and independently of the time since...
in its arrival and departure process.
In traditional Poisson traffic, the short-term fluctuations would average out, and a graph covering a large amount of time would approach a constant value.
Heavy-tailed distributions have been observed in many natural phenomena including both physical and sociological phenomena. Mandelbrot
Mandelbrot
Mandelbrot, may refer to:* Benoît Mandelbrot , a mathematician associated with fractal geometry* Mandelbrot set, a fractal popularized by Benoît Mandelbrot* Mandelbrot Competition, a math competition- See also :...
established the use of heavy-tailed distributions to model real-world fractal
Fractal
A fractal has been defined as "a rough or fragmented geometric shape that can be split into parts, each of which is a reduced-size copy of the whole," a property called self-similarity...
phenomena, e.g. Stock markets, earthquakes, climate, and the weather.
Ethernet, WWW
World Wide Web
The World Wide Web is a system of interlinked hypertext documents accessed via the Internet...
, SS7, TCP
Transmission Control Protocol
The Transmission Control Protocol is one of the core protocols of the Internet Protocol Suite. TCP is one of the two original components of the suite, complementing the Internet Protocol , and therefore the entire suite is commonly referred to as TCP/IP...
, FTP
File Transfer Protocol
File Transfer Protocol is a standard network protocol used to transfer files from one host to another host over a TCP-based network, such as the Internet. FTP is built on a client-server architecture and utilizes separate control and data connections between the client and server...
, TELNET
TELNET
Telnet is a network protocol used on the Internet or local area networks to provide a bidirectional interactive text-oriented communications facility using a virtual terminal connection...
and VBR
VBR
VBR may refer to:* Variable bitrate, in telecommunications and computing, when the bitrate used in sound or video encoding is not held constant* Volume Boot Record, in computer disks, a type of boot sector that contains code for bootstrapping programs...
video (digitised video of the type that is transmitted over ATM
Asynchronous Transfer Mode
Asynchronous Transfer Mode is a standard switching technique designed to unify telecommunication and computer networks. It uses asynchronous time-division multiplexing, and it encodes data into small, fixed-sized cells. This differs from approaches such as the Internet Protocol or Ethernet that...
networks) traffic is self-similar.
Self-similarity in packetised data networks can be caused by the distribution of file sizes, human interactions and/ or Ethernet dynamics. Self-similar and long-range dependent characteristics in computer networks present a fundamentally different set of problems to people doing analysis and/or design of networks, and many of the previous assumptions upon which systems have been built are no longer valid in the presence of self-similarity.
The Poisson distribution
Before the heavy-tailed distribution is introduced mathematically, the Poisson processPoisson process
A Poisson process, named after the French mathematician Siméon-Denis Poisson , is a stochastic process in which events occur continuously and independently of one another...
with a memoryless
Memorylessness
In probability and statistics, memorylessness is a property of certain probability distributions: the exponential distributions of non-negative real numbers and the geometric distributions of non-negative integers....
waiting-time distribution, used to model (among many things) traditional telephony networks, is briefly reviewed below.
Assuming pure-chance arrivals and pure-chance terminations leads to the following:
- The number of call arrivals in a given time has a Poisson distribution, i.e.:
-
where a is the number of call arrivals in time T, and is the mean number of call arrivals in time T. For this reason, pure-chance traffic is also known as Poisson traffic.- The number of call departures in a given time, also has a Poisson distribution, i.e.:
-
where d is the number of call departures in time T and is the mean number of call departures in time T.- The intervals, T, between call arrivals and departures are intervals between independent, identically distributed random events. It can be shown that these intervals have a negative exponential distribution, i.e.:
-
where h is the mean holding time (MHT).
The heavy-tail distribution
A distribution is said to have a heavy tail if
This means that regardless of the distribution for small values of the random variable, if the asymptotic shape of the distribution is hyperbolic, it is heavy-tailed. The simplest heavy-tailed distribution is the Pareto distribution which is hyperbolic over its entire range.
Modelling self-similar traffic
Since (unlike traditional telephony traffic) packetised traffic exhibits self-similar or fractal characteristics, conventional traffic models do not apply to networks which carry self-similar traffic.
With the convergence of voice and data, the future multi-service network will be based on packetised traffic, and models which accurately reflect the nature of self-similar traffic will be required to develop, design and dimension future multi-service networks.
Previous analytic work done in Internet studies adopted assumptions such as exponentially-distributed packet inter-arrivals, and conclusions reached under such assumptions may be misleading or incorrect in the presence of heavy-tailed distributions.
Deriving mathematical models which accurately represent long-range dependent traffic is a fertile area of research.
Network performance
Network performance degrades gradually with increasing self-similarity. The more self-similar the traffic, the longer the queue size. The queue length distribution of self-similar traffic decays more slowly than with Poisson sources.
However, long-range dependence implies nothing about its short-term correlations which affect performance in small buffers. Additionally, aggregating streams of self-similar traffic typically intensifies the self-similarity ("burstiness") rather than smoothing it, compounding the problem.
Self-similar traffic exhibits the persistence of clusteringClusteringClustering can refer to the following:In demographics:* Clustering , the gathering of various populations based on factors such as ethnicity, economics or religion.In graph theory:...
which has a negative impact on network performance.- With Poisson traffic (found in conventional telephonyTelephonyIn telecommunications, telephony encompasses the general use of equipment to provide communication over distances, specifically by connecting telephones to each other....
networks), clustering occurs in the short term but smooths out over the long term. - With self-similar traffic, the bursty behaviour may itself be bursty, which exacerbates the clustering phenomena, and degrades network performance.
Many aspects of network quality of service depend on coping with traffic peaks that might cause network failures, such as- Cell/packet loss and queue overflow
- Violation of delay bounds e.g. in video
- Worst cases in statistical multiplexingMultiplexingThe multiplexed signal is transmitted over a communication channel, which may be a physical transmission medium. The multiplexing divides the capacity of the low-level communication channel into several higher-level logical channels, one for each message signal or data stream to be transferred...
Poisson processes are well-behaved because they are statelessStateless serverIn computing, a stateless protocol is a communications protocol that treats each request as an independent transaction that is unrelated to any previous request so that the communication consists of independent pairs of requests and responses...
, and peak loading is not sustained, so queues do not fill. With long-range order, peaks last longer and have greater impact: the equilibrium shifts for a while.
External links
- A site offering numerous links to articles written on the effect of self-similar traffic on network performance.
- http://dept.ee.wits.ac.za/~kennedy/elen5007/coursepack/course/selfsim.htm
- With Poisson traffic (found in conventional telephony