Never-Ending Language Learning
The Never-Ending Language Learning system (NELL) is a semantic machine learning system developed by a research team at Carnegie Mellon University and supported by grants from DARPA, Google, and the NSF, with portions of the system running on a supercomputing cluster provided by Yahoo!.
The goal of NELL and other semantic learning systems, such as IBM's Watson system, is to develop means of answering questions posed by users in natural language with no human intervention in the process. Oren Etzioni of the University of Washington lauded the system's "continuous learning, as if NELL is exercising curiosity on its own, with little human help".
By October 2010, NELL had doubled the number of relationships available in its knowledge base and had learned 440,000 new facts, with an accuracy of 87%. Team leader Tom M. Mitchell, chairman of the machine learning department at Carnegie Mellon, described how NELL "self-corrects when it has more information, as it learns more", though it does sometimes arrive at incorrect conclusions. Errors can accumulate: having wrongly deduced that Internet cookies were a kind of baked good, NELL went on to deduce from the phrases "I deleted my Internet cookies" and "I deleted my files" that "computer files" also belonged in the baked goods category. Clear errors like these are corrected every few weeks by members of the research team, and the system is then allowed to continue its learning process.
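The way one wrong belief can spread through shared phrasing can be illustrated with a toy sketch. This is not NELL's actual algorithm, which the article does not describe; the `propagate` function, the observation list, the "muffins" phrase and the "computer data" label are all hypothetical, chosen only to mirror the cookie example.

```python
from collections import defaultdict

# Hypothetical toy data: (noun phrase, context pattern) pairs, as if extracted from text.
OBSERVATIONS = [
    ("Internet cookies", "I deleted my _"),
    ("files", "I deleted my _"),
    ("muffins", "I baked some _"),
]


def propagate(seed_categories, observations):
    """One pass of label propagation: phrases that share a context pattern
    inherit the category of the already-labeled phrases in that pattern,
    provided those labeled phrases all agree."""
    by_pattern = defaultdict(set)
    for phrase, pattern in observations:
        by_pattern[pattern].add(phrase)

    learned = dict(seed_categories)
    for phrases in by_pattern.values():
        labels = {learned[p] for p in phrases if p in learned}
        if len(labels) == 1:
            label = next(iter(labels))
            for p in phrases:
                learned.setdefault(p, label)
    return learned


# An earlier mistake: "Internet cookies" was filed under baked goods.
beliefs = propagate({"Internet cookies": "baked good", "muffins": "baked good"}, OBSERVATIONS)
print(beliefs["files"])  # baked good

# After a human corrects the seed belief, the same evidence no longer misleads.
fixed = propagate({"Internet cookies": "computer data", "muffins": "baked good"}, OBSERVATIONS)
print(fixed["files"])  # computer data
```

In this sketch the evidence is identical in both runs. Only the corrected seed belief changes the outcome, which is why periodic human correction lets learning continue from a sound base.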
Process and goals
NELL was programmed by its developers to identify a basic set of fundamental semantic relationships between a few hundred predefined categories of data, such as cities, companies, emotions and sports teams. Since the beginning of 2010, the Carnegie Mellon research team has been running NELL around the clock, sifting through hundreds of millions of web pages looking for connections between the information it already knows and what it finds through its search process, making new connections in a manner intended to mimic the way humans learn new information. For example, on encountering the word pair "Pikes Peak", NELL would notice that both words are capitalized and deduce from the second word that it was the name of a mountain, and then build on the relationship of the words surrounding those two words to deduce other connections.
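The "Pikes Peak" example from the Process and goals section can be sketched as a simple heuristic: find runs of capitalized words and use the last word as a category cue. This is a minimal illustration under stated assumptions, not NELL's implementation; the `CATEGORY_CUES` table, the regular expression and the `guess_categories` function are hypothetical.

```python
import re

# Hypothetical cue words: the final word of a name hints at its category.
CATEGORY_CUES = {
    "Peak": "mountain",
    "Mountain": "mountain",
    "River": "river",
    "University": "university",
}

# Two or more consecutive capitalized words, e.g. "Pikes Peak".
CAPITALIZED_RUN = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b")


def guess_categories(sentence):
    """Return (phrase, category) pairs for capitalized phrases whose
    last word appears in the cue table."""
    guesses = []
    for match in CAPITALIZED_RUN.finditer(sentence):
        phrase = match.group(0)
        last_word = phrase.split()[-1]
        if last_word in CATEGORY_CUES:
            guesses.append((phrase, CATEGORY_CUES[last_word]))
    return guesses


print(guess_categories("We drove to the top of Pikes Peak last summer."))
# [('Pikes Peak', 'mountain')]
```

A real system would treat such a guess as a candidate to be confirmed by further evidence, since capitalization and a trailing cue word alone are easily misleading.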