Categorization
Categorization is the process in which ideas and objects are recognized, differentiated and understood. Categorization implies that objects are grouped into categories, usually for some specific purpose. Ideally, a category illuminates a relationship between the subjects and objects of knowledge. Categorization is fundamental in language, prediction, inference, decision making and all kinds of environmental interaction. It has also been suggested that categorization plays a major role in programming.
There are many categorization theories and techniques. In a broader historical view, however, three general approaches to categorization may be identified:
- Classical categorization
- Conceptual clustering
- Prototype theory
The classical view
Classical categorization comes to us first from Plato, who, in his Statesman dialogue, introduces the approach of grouping objects based on their similar properties. This approach was further explored and systematized by Aristotle in his Categories treatise, where he analyzes the differences between classes and objects. Aristotle also applied the classical categorization scheme intensively in his approach to the classification of living beings (which uses the technique of applying successive narrowing questions such as "Is it an animal or vegetable?", "How many feet does it have?", "Does it have fur or feathers?", "Can it fly?"), thereby establishing the basis for natural taxonomy.
The classical Aristotelian view claims that categories are discrete entities characterized by a set of properties which are shared by their members. In analytic philosophy, these properties are assumed to establish the conditions that are both necessary and sufficient to capture meaning.
According to the classical view, categories should be clearly defined, mutually exclusive and collectively exhaustive. This way, any entity of the given classification universe belongs unequivocally to one, and only one, of the proposed categories.
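The classical requirements can be illustrated with a small sketch. It assumes a toy universe in which each category is defined by a single predicate standing in for its necessary and sufficient conditions; the category names, the "covering" property and the sample entities are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict

# Hypothetical toy universe: each entity is a dict of observed properties.
Entity = Dict[str, object]

# Each predicate stands in for a category's necessary-and-sufficient
# conditions. These definitions are illustrative assumptions only.
CATEGORIES: Dict[str, Callable[[Entity], bool]] = {
    "bird": lambda e: e["covering"] == "feathers",
    "mammal": lambda e: e["covering"] == "fur",
    "other": lambda e: e["covering"] not in ("feathers", "fur"),
}


def classify(entity: Entity) -> str:
    """Return the one category whose defining conditions the entity meets."""
    matches = [name for name, test in CATEGORIES.items() if test(entity)]
    # Mutually exclusive: at most one match. Collectively exhaustive: at least one.
    if len(matches) != 1:
        raise ValueError(f"scheme is not exclusive and exhaustive: {matches}")
    return matches[0]


sparrow = {"covering": "feathers", "legs": 2}
cat = {"covering": "fur", "legs": 4}
trout = {"covering": "scales", "legs": 0}
```

Because the three predicates partition every possible value of "covering", each entity falls into exactly one category; a scheme with overlapping or missing conditions would raise an error instead.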
Conceptual clustering
Conceptual clustering is a modern variation of the classical approach, and derives from attempts to explain how knowledge is represented. In this approach, classes (clusters or entities) are generated by first formulating their conceptual descriptions and then classifying the entities according to the descriptions.
Conceptual clustering developed mainly during the 1980s, as a machine learning paradigm for unsupervised learning. It is distinguished from ordinary data clustering by generating a concept description for each generated category.
Categorization tasks in which category labels are provided to the learner for certain objects are referred to as supervised classification, supervised learning, or concept learning. Categorization tasks in which no labels are supplied are referred to as unsupervised classification, unsupervised learning, or data clustering. The task of supervised classification involves extracting information from the labeled examples that allows accurate prediction of class labels of future examples. This may involve the abstraction of a rule or concept relating observed object features to category labels, or it may not involve abstraction (e.g., exemplar models). The task of clustering involves recognizing inherent structure in a data set and grouping objects together by similarity into classes. It is thus a process of generating a classification structure.
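The contrast between the two tasks can be sketched in a few lines. The example below is only an illustration under simple assumptions: objects are two-dimensional points, the supervised learner is a one-nearest-neighbour exemplar model (no rule is abstracted), and the unsupervised learner is a greedy distance-threshold grouping rather than any particular published clustering algorithm. All function names and data are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def classify_by_exemplar(labeled: Sequence[Tuple[Point, str]], query: Point) -> str:
    """Supervised: copy the label of the closest stored example."""
    _, label = min(labeled, key=lambda pair: squared_distance(pair[0], query))
    return label


def cluster_by_threshold(points: Sequence[Point], max_gap: float) -> List[List[Point]]:
    """Unsupervised: add each point to the first cluster that already holds
    a point within max_gap of it; otherwise start a new cluster."""
    clusters: List[List[Point]] = []
    for p in points:
        for cluster in clusters:
            if any(squared_distance(p, q) <= max_gap ** 2 for q in cluster):
                cluster.append(p)
                break
        else:
            clusters.append([p])
    return clusters


labeled_examples = [((0.0, 0.0), "small"), ((10.0, 10.0), "large")]
unlabeled_points = [(0.0, 0.0), (0.5, 0.0), (10.0, 10.0), (10.0, 10.5)]
```

In the supervised case the labels "small" and "large" are given in advance and only need to be predicted for new points; in the unsupervised case the two groups emerge from the distances alone and carry no names.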
Conceptual clustering is closely related to fuzzy set theory, in which objects may belong to one or more groups, in varying degrees of fitness.
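Graded membership in several groups at once can be made concrete with a minimal sketch. It assumes triangular membership functions, which are one common choice among many; the three temperature sets and their breakpoints are hypothetical values picked for illustration.

```python
from typing import Dict, Tuple


def triangular(x: float, left: float, peak: float, right: float) -> float:
    """Degree of membership in [0, 1] for a triangular fuzzy set."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical fuzzy sets over temperature in degrees Celsius: (left, peak, right).
FUZZY_SETS: Dict[str, Tuple[float, float, float]] = {
    "cold": (-10.0, 0.0, 15.0),
    "warm": (5.0, 20.0, 30.0),
    "hot": (25.0, 35.0, 50.0),
}


def memberships(x: float) -> Dict[str, float]:
    """Return the degree to which x belongs to each fuzzy set."""
    return {name: triangular(x, *params) for name, params in FUZZY_SETS.items()}
```

With these assumed breakpoints, a temperature of 10 degrees belongs partly to "cold" and partly to "warm" and not at all to "hot", whereas a classical scheme would have to assign it to exactly one group.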
Prototype theory
Since the research by Eleanor Rosch and George Lakoff in the 1970s, categorization can also be viewed as the process of grouping things based on prototypes. On this view, the necessary and sufficient conditions of the classical approach are almost never met in categories of naturally occurring things. It has also been suggested that categorization based on prototypes is the basis for human development, and that this learning relies on learning about the world via embodiment.
A cognitive approach accepts that natural categories are graded (they tend to be fuzzy at their boundaries) and inconsistent in the status of their constituent members.
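One simple way to picture graded category membership is to score an item by how many features it shares with a prototype. The sketch below is an assumption-laden illustration, not a model taken from the prototype theory literature: the feature sets, the "bird" prototype and the overlap measure are all hypothetical.

```python
from typing import FrozenSet, Iterable

# Hypothetical prototype: the features of a "typical" bird.
BIRD_PROTOTYPE: FrozenSet[str] = frozenset({"feathers", "flies", "sings", "small"})


def typicality(features: Iterable[str], prototype: FrozenSet[str] = BIRD_PROTOTYPE) -> float:
    """Share of prototype features the item has; 1.0 means most typical."""
    return len(prototype & set(features)) / len(prototype)


robin = {"feathers", "flies", "sings", "small"}
penguin = {"feathers", "swims", "large"}
```

Under this measure both items count as birds to some degree, but the robin is a more central member than the penguin; no single feature acts as a necessary and sufficient condition.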
Systems of categories are not objectively "out there" in the world but are rooted in people's experience. Conceptual categories are not identical for different cultures, or indeed, for every individual in the same culture.
Categories form part of a hierarchical structure when applied to such subjects as taxonomy in biological classification: the higher level is the life-form level, the middle level is the generic or genus level, and the lower level is the species level. These levels can be distinguished by certain traits that put an item in its distinctive category. But even these can be arbitrary and are subject to revision.
Categories at the middle level are perceptually and conceptually the most salient. The generic level of a category tends to elicit the most responses and richest images and seems to be the psychologically basic level. Typical taxonomies in zoology, for example, exhibit categorization at the embodied level, with similarities leading to formulation of "higher" categories, and differences leading to differentiation within categories.
Miscategorization
Miscategorization can be a logical fallacy in which diverse and dissimilar objects, concepts, entities, etc. are grouped together based upon illogical common denominators, or common denominators that virtually any concept, object or entity has in common. A common way miscategorization occurs is through an over-categorization of concepts, objects or entities, followed by miscategorization based upon over-similar variables that virtually all things have in common.
See also
- Artificial neural network
- Category learning
- Categorical perception
- Classification in machine learning
- Family resemblance
- Fuzzy concept
- Information architecture
- Language acquisition
- Library classification
- Machine learning
- Multi-label classification
- Natural kind
- Ontology
- Pattern recognition
- Perceptual learning
- Semantics
- Socrates
- Sortal
- Symbol grounding
- Taxonomy