Correspondence analysis

Correspondence analysis (CA) is a multivariate statistical technique proposed by Hirschfeld and later developed by Jean-Paul Benzécri. It is conceptually similar to principal component analysis, but applies to categorical rather than continuous data. In a similar manner to principal component analysis, it provides a means of displaying or summarising a set of data in two-dimensional graphical form.

All data should be nonnegative and on the same scale for CA to be applicable, and the method treats rows and columns equivalently. It is traditionally applied to contingency tables: CA decomposes the chi-square statistic associated with the table into orthogonal factors. Because CA is a descriptive technique, it can be applied to tables whether or not the chi-square statistic is appropriate. Several variants of CA are available, including detrended correspondence analysis and canonical correspondence analysis. The extension of correspondence analysis to many categorical variables is called multiple correspondence analysis. An adaptation of correspondence analysis to the problem of discrimination based upon qualitative variables (i.e., the equivalent of discriminant analysis for qualitative data) is called discriminant correspondence analysis or barycentric discriminant analysis.
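
The decomposition of the chi-square statistic into orthogonal factors can be illustrated with a minimal sketch. It assumes the common formulation of simple CA as a singular value decomposition of the standardized residuals of the table; the function name `correspondence_analysis` and the 3x3 table of counts are hypothetical and chosen only for illustration, and every row and column is assumed to have a positive total.

```python
import numpy as np

def correspondence_analysis(table):
    """Simple CA of a two-way table of nonnegative counts via SVD.

    Illustrative sketch: assumes every row and column total is positive.
    """
    N = np.asarray(table, dtype=float)
    n = N.sum()
    P = N / n                       # correspondence matrix (relative frequencies)
    r = P.sum(axis=1)               # row masses
    c = P.sum(axis=0)               # column masses
    # Standardized residuals: departure from independence, scaled by the masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    # Principal coordinates of rows and columns (rows and columns treated alike)
    row_coords = (U * sv) / np.sqrt(r)[:, None]
    col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]
    inertia = sv ** 2               # principal inertia of each orthogonal factor
    return row_coords, col_coords, inertia, n

# Hypothetical contingency table, for illustration only
table = np.array([[20, 10, 5],
                  [10, 20, 10],
                  [5, 10, 30]])
row_coords, col_coords, inertia, n = correspondence_analysis(table)

# Pearson chi-square statistic computed directly from expected counts
expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / table.sum()
chi2 = ((table - expected) ** 2 / expected).sum()

# The principal inertias, multiplied by the grand total, add up to chi-square
print(np.isclose(n * inertia.sum(), chi2))
```

The first two columns of `row_coords` and `col_coords` are the coordinates that would typically be plotted to obtain the two-dimensional display described above.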

In the social sciences, correspondence analysis, and particularly its extension multiple correspondence analysis, was made known outside France through the French sociologist Pierre Bourdieu's application of it.

Implementations

  • Orange, a free data mining software suite, provides the module orngCA.
  • In the open source statistical package R, the packages ade4, ca, vegan, and FactoMineR (http://factominer.free.fr/) implement correspondence analysis and multiple correspondence analysis.
  • A MATLAB program (with a tutorial) for correspondence analysis is available at http://www.utdallas.edu/~herve/abdi-CorrespondenceAnalysisMatlabProgram.zip

External links

  • Greenacre, Michael (2008), La Práctica del Análisis de Correspondencias, BBVA Foundation, Madrid, Spanish translation of Correspondence Analysis in Practice, available for free download from BBVA Foundation publications

  • Greenacre, Michael (2010), Biplots in Practice, BBVA Foundation, Madrid, available for free download at multivariatestatistics.org

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.