Garbage (computer science)
Encyclopedia
Garbage, in the context of computer science
, refers to object
s, data
, or other regions of the memory of a computer system (or other system resources), which will not be used in any future computation by the system, or by a program running on it. As computer systems all have finite amounts of memory, it is frequently necessary to deallocate garbage and return it to the heap, or memory pool, so the underlying memory can be reused.
semantic garbage : Semantic garbage is any object or data which will never be accessed by a running program, for any combination of inputs to the program.
syntactic garbage: Syntactic garbage refers to objects or data within a program's memory space that are unreachable from the program's root set.
Note that syntactic garbage is a (usually strict) subset of semantic garbage as it is entirely possible for an object to hold a reference to another object without the latter object being used. Determination of the semantic garbage present in a program is generally undecidable
, but there are many algorithms for identifying syntactic garbage.
Objects and/or data which are not garbage are said to be live.
An example of the automatic removal of semantic garbage, by reference counting
garbage collection, can be produced using the Python
command-line interpreter
:
In this session, an object is created, its location in the memory is displayed, and the only reference to the object is then destroyed—there is no way to ever use the object again from this point on, as there are no references to it. This becomes apparent when we try to access the original reference:
As it is impossible to refer to the object, it has become useless: the object is garbage. Since Python uses garbage collection, it automatically deallocates the memory that was used for the object so that it may be used again:
Note that the Bar instance now resides at the memory location 0x54f30; at the same place as where our previous object, the Foo instance, was located. Since the Foo instance was destroyed, freeing up the memory used to contain it, the interpreter creates the Bar object at the same memory location as before, making good use of the available resources.
Computer science
Computer science or computing science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems...
, refers to object
Object (computer science)
In computer science, an object is any entity that can be manipulated by the commands of a programming language, such as a value, variable, function, or data structure...
s, data
Data
The term data refers to qualitative or quantitative attributes of a variable or set of variables. Data are typically the results of measurements and can be the basis of graphs, images, or observations of a set of variables. Data are often viewed as the lowest level of abstraction from which...
, or other regions of the memory of a computer system (or other system resources), which will not be used in any future computation by the system, or by a program running on it. As computer systems all have finite amounts of memory, it is frequently necessary to deallocate garbage and return it to the heap, or memory pool, so the underlying memory can be reused.
Classification
Garbage is generally classified into two types:semantic garbage : Semantic garbage is any object or data which will never be accessed by a running program, for any combination of inputs to the program.
syntactic garbage: Syntactic garbage refers to objects or data within a program's memory space that are unreachable from the program's root set.
Note that syntactic garbage is a (usually strict) subset of semantic garbage as it is entirely possible for an object to hold a reference to another object without the latter object being used. Determination of the semantic garbage present in a program is generally undecidable
Undecidable
Undecidable may refer to:In mathematics and logic* Undecidable problem - a decision problem which no algorithm can decide* "Undecidable" is sometimes used as a synonym of "independent", where a formula in mathematical logic is independent of a logical theory if neither that formula nor its negation...
, but there are many algorithms for identifying syntactic garbage.
Objects and/or data which are not garbage are said to be live.
Eliminating garbage
The problem of managing the deallocation of garbage is a well-known one in computer science. Several approaches are taken:- Many operating systems will reclaim the memory and resources used by a process or program when it terminates. Simple or short-lived programs which are designed to run in such environments can exit and allow the operating system to perform any necessary reclamation.
- In systems or programming languageProgramming languageA programming language is an artificial language designed to communicate instructions to a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine and/or to express algorithms precisely....
s with manual memory managementManual memory managementIn computer science, manual memory management refers to the usage of manual instructions by the programmer to identify and deallocate unused objects, or garbage. Up until the mid 1990s, the majority of programming languages used in industry supported manual memory management...
, the programmer must explicitly arrange for memory to be deallocated when it is no longer used. CC (programming language)C is a general-purpose computer programming language developed between 1969 and 1973 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system....
and C++C++C++ is a statically typed, free-form, multi-paradigm, compiled, general-purpose programming language. It is regarded as an intermediate-level language, as it comprises a combination of both high-level and low-level language features. It was developed by Bjarne Stroustrup starting in 1979 at Bell...
are two well-known languages which support this model. - Garbage collectionGarbage collection (computer science)In computer science, garbage collection is a form of automatic memory management. The garbage collector, or just collector, attempts to reclaim garbage, or memory occupied by objects that are no longer in use by the program...
uses various algorithms to automatically analyze the state of a program, identify garbage, and deallocate it without intervention by the programmer. Many modern programming languages such as JavaJava (programming language)Java is a programming language originally developed by James Gosling at Sun Microsystems and released in 1995 as a core component of Sun Microsystems' Java platform. The language derives much of its syntax from C and C++ but has a simpler object model and fewer low-level facilities...
and HaskellHaskell (programming language)Haskell is a standardized, general-purpose purely functional programming language, with non-strict semantics and strong static typing. It is named after logician Haskell Curry. In Haskell, "a function is a first-class citizen" of the programming language. As a functional programming language, the...
provide automated garbage collection. However, it is not a recent development, as it has also been used in older languages such as LISPLispA lisp is a speech impediment, historically also known as sigmatism. Stereotypically, people with a lisp are unable to pronounce sibilants , and replace them with interdentals , though there are actually several kinds of lisp...
. - There is ongoing research to type theoreticType theoryIn mathematics, logic and computer science, type theory is any of several formal systems that can serve as alternatives to naive set theory, or the study of such formalisms in general...
approaches (such as region inference) to identification and removal of garbage from a program. Note that no general type-theoretic solution to the problem has been developed.
An example of the automatic removal of semantic garbage, by reference counting
Reference counting
In computer science, reference counting is a technique of storing the number of references, pointers, or handles to a resource such as an object, block of memory, disk space or other resource...
garbage collection, can be produced using the Python
Python (programming language)
Python is a general-purpose, high-level programming language whose design philosophy emphasizes code readability. Python claims to "[combine] remarkable power with very clear syntax", and its standard library is large and comprehensive...
command-line interpreter
Interpreter (computing)
In computer science, an interpreter normally means a computer program that executes, i.e. performs, instructions written in a programming language...
:
In this session, an object is created, its location in the memory is displayed, and the only reference to the object is then destroyed—there is no way to ever use the object again from this point on, as there are no references to it. This becomes apparent when we try to access the original reference:
As it is impossible to refer to the object, it has become useless: the object is garbage. Since Python uses garbage collection, it automatically deallocates the memory that was used for the object so that it may be used again:
Note that the Bar instance now resides at the memory location 0x54f30; at the same place as where our previous object, the Foo instance, was located. Since the Foo instance was destroyed, freeing up the memory used to contain it, the interpreter creates the Bar object at the same memory location as before, making good use of the available resources.
External links
- Benjamin PierceBenjamin C. PierceBenjamin Crawford Pierce is an American professor of computer science at the University of Pennsylvania. Pierce joined Penn in 1998 from Indiana University and held research positions at the University of Cambridge and the University of Edinburgh. He received his Ph.D. from Carnegie Mellon...
(editor), Advanced Topics in Types and Programming Languages, MIT Press (2005), ISBN 0-262-16228-8 - Richard Jones and Rafael Lins, Garbage Collection: Algorithms for Automated Dynamic Memory Management, Wiley and Sons (1996), ISBN 0-471-94148-4