Memory bound function
Encyclopedia
Memory bound refers to a situation in which the time to complete a given computational problem
is decided primarily by the amount of available memory to hold data
. In other words, the limiting factor of solving a given problem is the memory access speed. The application of memory bound functions
could prove to be valuable in preventing spam, which has become a problem of epidemic proportions on the Internet
.
Memory functions use a dynamic programming
technique called memoization
in order to relieve the inefficiency of recursion
that might occur. It is based on the simple idea of calculating and storing solutions to subproblems so that the solutions can be reused later without recalculating the subproblems
again. The best known example that takes advantage of memorization is an algorithm
that computes the Fibonacci number
s. The following pseudocode
illustrates an algorithm that uses memoization, which runs in linear
CPU time:
Fibonacci (n)
{
for i = 0 to n-1
results[i] = -1 // -1 means undefined
return Fibonacci_Results (results, n);
}
Fibonacci_Results (results, n)
{
if (results[n] != -1) //check if it has already been solved before
return results[n]
if (n 0)
1)
val = 1
else
val = Fibonacci_Results(results, n -2 ) + Fibonacci_Results(results, n -1)
results[n] = val
return val
}
Compare the above to an algorithm that uses recursion, which runs in exponential CPU time:
Recursive_Fibonacci (n)
{
if (n 0)
1)
return 1
else
return Recursive_Fibonacci (n -1) + Recursive_Fibonacci (n -2)
}
While the recursive algorithm is simpler and more elegant than the algorithm that uses memoization, the latter has a significantly lower time complexity
than the former.
The term "memory bound function" has surfaced only recently and is used principally to describe a function that uses XOR and consists of a series of computations in which each computation depends on the previous computation. Whereas memory functions have long been an important actor in improving time complexity, memory bound functions have seen far fewer applications. Recently, however, scientists have proposed a method using memory bound functions as a means to discourage spammers from abusing resources, which could be a major breakthrough in that area.
and Moni Naor
published a paper titled Pricing via Processing or Combating Junk Mail, suggesting a possibility of using CPU bound
functions to deter abusers from sending spam. The scheme was based on the idea that computer users are much more likely to abuse a resource if the cost of abusing the resource is negligible: the underlying reason spam has become so rampant is that sending an e-mail
has minuscule cost for spammers.
Dwork and Naor proposed that spamming might be reduced by injecting an additional cost in the form of an expensive CPU computation: CPU bound functions would consume CPU resources at the sender's machine for each message, thus preventing huge amounts of spam from being sent in a short period.
The basic scheme that protects against abuses is as follows:
Let S be sender, R be recipient, and M be an e-mail. If R has agreed beforehand to receive e-mail from S, then M is transmitted in the usual way. Otherwise, S computes some function G(M) and sends (M, G(M)) to R. R checks if what it receives from S is of the form (M, G(M)). If yes, R accepts M. Otherwise, R rejects M. The figure on the right depicts cases in which there were no prior agreements:
The function G is selected such that the verification by R is relatively fast (taking a millisecond) and such that the computation by S is somewhat slow (involving at least several seconds). Therefore, S will be discouraged from sending M to multiple recipients with no prior agreements: the cost in terms of both time and computing resources of computing G repeatedly will become very prohibitive for a spammer who intends to send many millions of e-mails.
The major problem of using the above scheme is that fast CPUs compute much faster than slow CPUs. Further, higher-end computer systems also have sophisticated pipelines and other advantageous features that facilitate computations. As a result, a spammer with a state-of-the-art system will hardly be affected by such deterrence while a typical user with a mediocre system will be adversely affected. If a computation takes a few seconds on a new PC
, it may take a minute on an old PC, and several minutes on a PDA
, which might be a nuisance for users of old PCs, but probably unacceptable for users of PDAs. The disparity in client CPU speed constitutes one of the prominent roadblocks to widespread adoption of any scheme based on a CPU bound function. Therefore, researchers are concerned with finding functions that most computer systems will evaluate at about the same speed, so that high-end systems might evaluate these functions somewhat faster than low-end systems (2-10 times faster, but not 10-100 faster) as CPU disparities might imply. These ratios are “egalitarian” enough for the intended applications: the functions are effective in discouraging abuses and do not add a prohibitive delay on legitimate interactions, across a wide range of systems.
The new egalitarian approach is to rely on memory bound functions. As stated before, a memory bound function is a function whose computation time is dominated by the time spent accessing memory. A memory bound function accesses locations in a large region of memory in an unpredictable way, in such a way that using caches are not effective. In recent years, the speed of CPU has grown drastically, but there has been comparatively small progress in developing faster main memory. Since the ratio
s of memory latencies
of machines built in the last five years is typically no greater than two, and almost always less than four, the memory bound function will be egalitarian to most systems for the foreseeable future.
Computational problem
In theoretical computer science, a computational problem is a mathematical object representing a collection of questions that computers might want to solve. For example, the problem of factoring...
is decided primarily by the amount of available memory to hold data
Data
The term data refers to qualitative or quantitative attributes of a variable or set of variables. Data are typically the results of measurements and can be the basis of graphs, images, or observations of a set of variables. Data are often viewed as the lowest level of abstraction from which...
. In other words, the limiting factor of solving a given problem is the memory access speed. The application of memory bound functions
Subroutine
In computer science, a subroutine is a portion of code within a larger program that performs a specific task and is relatively independent of the remaining code....
could prove to be valuable in preventing spam, which has become a problem of epidemic proportions on the Internet
Internet
The Internet is a global system of interconnected computer networks that use the standard Internet protocol suite to serve billions of users worldwide...
.
Memory bound functions and memory functions
Memory bound functions and memory functions are related in that both involve extensive memory access, but a distinction exists between the two.Memory functions use a dynamic programming
Dynamic programming
In mathematics and computer science, dynamic programming is a method for solving complex problems by breaking them down into simpler subproblems. It is applicable to problems exhibiting the properties of overlapping subproblems which are only slightly smaller and optimal substructure...
technique called memoization
Memoization
In computing, memoization is an optimization technique used primarily to speed up computer programs by having function calls avoid repeating the calculation of results for previously processed inputs...
in order to relieve the inefficiency of recursion
Recursion
Recursion is the process of repeating items in a self-similar way. For instance, when the surfaces of two mirrors are exactly parallel with each other the nested images that occur are a form of infinite recursion. The term has a variety of meanings specific to a variety of disciplines ranging from...
that might occur. It is based on the simple idea of calculating and storing solutions to subproblems so that the solutions can be reused later without recalculating the subproblems
Optimal substructure
In computer science, a problem is said to have optimal substructure if an optimal solution can be constructed efficiently from optimal solutions to its subproblems...
again. The best known example that takes advantage of memorization is an algorithm
Algorithm
In mathematics and computer science, an algorithm is an effective method expressed as a finite list of well-defined instructions for calculating a function. Algorithms are used for calculation, data processing, and automated reasoning...
that computes the Fibonacci number
Fibonacci number
In mathematics, the Fibonacci numbers are the numbers in the following integer sequence:0,\;1,\;1,\;2,\;3,\;5,\;8,\;13,\;21,\;34,\;55,\;89,\;144,\; \ldots\; ....
s. The following pseudocode
Pseudocode
In computer science and numerical computation, pseudocode is a compact and informal high-level description of the operating principle of a computer program or other algorithm. It uses the structural conventions of a programming language, but is intended for human reading rather than machine reading...
illustrates an algorithm that uses memoization, which runs in linear
Linear
In mathematics, a linear map or function f is a function which satisfies the following two properties:* Additivity : f = f + f...
CPU time:
Fibonacci (n)
{
for i = 0 to n-1
results[i] = -1 // -1 means undefined
return Fibonacci_Results (results, n);
}
Fibonacci_Results (results, n)
{
if (results[n] != -1) //check if it has already been solved before
return results[n]
if (n
0)
val = 0
else if (n
1)val = 1
else
val = Fibonacci_Results(results, n -2 ) + Fibonacci_Results(results, n -1)
results[n] = val
return val
}
Compare the above to an algorithm that uses recursion, which runs in exponential CPU time:
Recursive_Fibonacci (n)
{
if (n
0)
return 0
else if ( n
1)return 1
else
return Recursive_Fibonacci (n -1) + Recursive_Fibonacci (n -2)
}
While the recursive algorithm is simpler and more elegant than the algorithm that uses memoization, the latter has a significantly lower time complexity
Time complexity
In computer science, the time complexity of an algorithm quantifies the amount of time taken by an algorithm to run as a function of the size of the input to the problem. The time complexity of an algorithm is commonly expressed using big O notation, which suppresses multiplicative constants and...
than the former.
The term "memory bound function" has surfaced only recently and is used principally to describe a function that uses XOR and consists of a series of computations in which each computation depends on the previous computation. Whereas memory functions have long been an important actor in improving time complexity, memory bound functions have seen far fewer applications. Recently, however, scientists have proposed a method using memory bound functions as a means to discourage spammers from abusing resources, which could be a major breakthrough in that area.
Using memory bound functions to prevent spam
In 1992, IBM research scientists Cynthia DworkCynthia Dwork
Cynthia Dwork is a distinguished scientist at Microsoft Research who works on distributed computing, cryptography, and e-mail spam prevention....
and Moni Naor
Moni Naor
Moni Naor is an Israeli computer scientist, currently a professor at the Weizmann Institute of Science. Naor received his Ph.D. in 1989 at the University of California, Berkeley. His adviser was Manuel Blum....
published a paper titled Pricing via Processing or Combating Junk Mail, suggesting a possibility of using CPU bound
CPU bound
In computer science, CPU bound is when the time for a computer to complete a task is determined principally by the speed of the central processor: processor utilization is high, perhaps at 100% usage for many seconds or minutes...
functions to deter abusers from sending spam. The scheme was based on the idea that computer users are much more likely to abuse a resource if the cost of abusing the resource is negligible: the underlying reason spam has become so rampant is that sending an e-mail
E-mail
Electronic mail, commonly known as email or e-mail, is a method of exchanging digital messages from an author to one or more recipients. Modern email operates across the Internet or other computer networks. Some early email systems required that the author and the recipient both be online at the...
has minuscule cost for spammers.
Dwork and Naor proposed that spamming might be reduced by injecting an additional cost in the form of an expensive CPU computation: CPU bound functions would consume CPU resources at the sender's machine for each message, thus preventing huge amounts of spam from being sent in a short period.
The basic scheme that protects against abuses is as follows:
Let S be sender, R be recipient, and M be an e-mail. If R has agreed beforehand to receive e-mail from S, then M is transmitted in the usual way. Otherwise, S computes some function G(M) and sends (M, G(M)) to R. R checks if what it receives from S is of the form (M, G(M)). If yes, R accepts M. Otherwise, R rejects M. The figure on the right depicts cases in which there were no prior agreements:
The function G is selected such that the verification by R is relatively fast (taking a millisecond) and such that the computation by S is somewhat slow (involving at least several seconds). Therefore, S will be discouraged from sending M to multiple recipients with no prior agreements: the cost in terms of both time and computing resources of computing G repeatedly will become very prohibitive for a spammer who intends to send many millions of e-mails.
The major problem of using the above scheme is that fast CPUs compute much faster than slow CPUs. Further, higher-end computer systems also have sophisticated pipelines and other advantageous features that facilitate computations. As a result, a spammer with a state-of-the-art system will hardly be affected by such deterrence while a typical user with a mediocre system will be adversely affected. If a computation takes a few seconds on a new PC
Personal computer
A personal computer is any general-purpose computer whose size, capabilities, and original sales price make it useful for individuals, and which is intended to be operated directly by an end-user with no intervening computer operator...
, it may take a minute on an old PC, and several minutes on a PDA
Personal digital assistant
A personal digital assistant , also known as a palmtop computer, or personal data assistant, is a mobile device that functions as a personal information manager. Current PDAs often have the ability to connect to the Internet...
, which might be a nuisance for users of old PCs, but probably unacceptable for users of PDAs. The disparity in client CPU speed constitutes one of the prominent roadblocks to widespread adoption of any scheme based on a CPU bound function. Therefore, researchers are concerned with finding functions that most computer systems will evaluate at about the same speed, so that high-end systems might evaluate these functions somewhat faster than low-end systems (2-10 times faster, but not 10-100 faster) as CPU disparities might imply. These ratios are “egalitarian” enough for the intended applications: the functions are effective in discouraging abuses and do not add a prohibitive delay on legitimate interactions, across a wide range of systems.
The new egalitarian approach is to rely on memory bound functions. As stated before, a memory bound function is a function whose computation time is dominated by the time spent accessing memory. A memory bound function accesses locations in a large region of memory in an unpredictable way, in such a way that using caches are not effective. In recent years, the speed of CPU has grown drastically, but there has been comparatively small progress in developing faster main memory. Since the ratio
Ratio
In mathematics, a ratio is a relationship between two numbers of the same kind , usually expressed as "a to b" or a:b, sometimes expressed arithmetically as a dimensionless quotient of the two which explicitly indicates how many times the first number contains the second In mathematics, a ratio is...
s of memory latencies
Latency (engineering)
Latency is a measure of time delay experienced in a system, the precise definition of which depends on the system and the time being measured. Latencies may have different meaning in different contexts.-Packet-switched networks:...
of machines built in the last five years is typically no greater than two, and almost always less than four, the memory bound function will be egalitarian to most systems for the foreseeable future.
See also
- CPU boundCPU boundIn computer science, CPU bound is when the time for a computer to complete a task is determined principally by the speed of the central processor: processor utilization is high, perhaps at 100% usage for many seconds or minutes...
- I/O boundIO boundIn computer science, I/O bound refers to a condition in which the time it takes to complete a computation is determined principally by the period of time spent waiting for input/output operations to be completed. This is the opposite of a task being CPU bound...
- MemoryMemoryIn psychology, memory is an organism's ability to store, retain, and recall information and experiences. Traditional studies of memory began in the fields of philosophy, including techniques of artificially enhancing memory....
- Dynamic programmingDynamic programmingIn mathematics and computer science, dynamic programming is a method for solving complex problems by breaking them down into simpler subproblems. It is applicable to problems exhibiting the properties of overlapping subproblems which are only slightly smaller and optimal substructure...
- MemoizationMemoizationIn computing, memoization is an optimization technique used primarily to speed up computer programs by having function calls avoid repeating the calculation of results for previously processed inputs...
- Spam
- RecursionRecursionRecursion is the process of repeating items in a self-similar way. For instance, when the surfaces of two mirrors are exactly parallel with each other the nested images that occur are a form of infinite recursion. The term has a variety of meanings specific to a variety of disciplines ranging from...
- Optimal substructureOptimal substructureIn computer science, a problem is said to have optimal substructure if an optimal solution can be constructed efficiently from optimal solutions to its subproblems...