Galton's problem
Galton’s problem, named after Sir Francis Galton, is the problem of drawing inferences from cross-cultural data, due to the statistical phenomenon now called autocorrelation. The problem is now recognized as a general one that applies to all nonexperimental studies and to experimental design as well. It is most simply described as the problem of external dependencies in making statistical estimates when the elements sampled are not statistically independent.
Asking two people in the same household whether they watch TV, for example, does not give you statistically independent answers. The sample size, n, for independent observations in this case is one, not two. Once proper adjustments are made that deal with external dependencies, then the axioms of probability theory concerning statistical independence will apply. These axioms are important for deriving measures of variance, for example, or tests of statistical significance.
Origin
In 1889, Galton was present when Sir Edward Tylor presented a paper at the Royal Anthropological Institute. Tylor had compiled information on institutions of marriage and descent for 350 cultures and examined the correlations between these institutions and measures of societal complexity. Tylor interpreted his results as indications of a general evolutionary sequence, in which institutions change focus from the maternal line to the paternal line as societies become increasingly complex. Galton disagreed, pointing out that similarity between cultures could be due to borrowing, could be due to common descent, or could be due to evolutionary development; he maintained that without controlling for borrowing and common descent one cannot make valid inferences regarding evolutionary development. Galton’s critique has become the eponymous Galton’s Problem (Stocking 1968: 175), as named by Raoul Naroll, who proposed the first statistical solutions.
By the early 20th century unilineal evolutionism was abandoned and along with it the drawing of direct inferences from correlations to evolutionary sequences. Galton's criticisms proved equally valid, however, for inferring functional relations from correlations. The problem of autocorrelation remained.
Solutions
Statistician William S. Gosset in 1914 developed methods of eliminating spurious correlation due to how position in time or space affects similarities. Today’s election polls have a similar problem: the closer the poll is to the election, the less independently individuals make up their minds, and the greater the unreliability of the polling results, especially the margin of error or confidence limits. The effective n of independent cases from their sample drops as the election nears. Statistical significance falls with lower effective sample size.
The problem pops up in sample surveys when sociologists want to reduce the travel time needed for their interviews, and hence divide their population into local clusters, sample the clusters randomly, and then sample again within the clusters. If they interview n people in clusters of size m, the effective sample size (efs) would have a lower limit of 1 + (n-1)/m if everyone in each cluster were identical. When there are only partial similarities within clusters, the m in this formula has to be lowered accordingly. A formula of this sort is 1 + d (n-1), where d is the intraclass correlation for the statistic in question. In general, estimates of the appropriate efs depend on the statistic estimated, for example the mean, chi-square, r, a regression coefficient, and their variances.
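The effect of within-cluster similarity on the effective sample size can be sketched numerically. The snippet below uses the widely cited Kish design effect, 1 + d(m - 1), and divides n by it; that exact form is an assumption brought in for illustration and is not the expression quoted above. The function names and the survey sizes are hypothetical.

```python
def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation for a cluster sample (Kish form, assumed here).

    cluster_size: average number of respondents per cluster (m)
    icc: intraclass correlation d for the statistic in question
    """
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return 1.0 + icc * (cluster_size - 1.0)


def effective_sample_size(n: int, cluster_size: float, icc: float) -> float:
    """Number of independent observations that n clustered ones are worth."""
    return n / design_effect(cluster_size, icc)


if __name__ == "__main__":
    # Hypothetical survey: 1000 interviews in clusters of 10.
    for icc in (0.0, 0.05, 0.5, 1.0):
        efs = effective_sample_size(1000, 10, icc)
        print(f"d = {icc:4.2f}  effective n = {efs:7.1f}")
```

With d = 0 the clustering costs nothing and the effective n equals n; with d = 1 every cluster counts as a single observation, so the effective n equals the number of clusters.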
For cross-cultural studies, Murdock and White estimated the size of patches of similarities in their sample of 186 societies. The four variables they tested – language, economy, political integration, and descent – had patches of similarities that varied from size three to size ten. A very crude rule of thumb might be to divide n by the square root of the similarity-patch size, so that the effective sample sizes are roughly 107 and 58 for patches of size three and ten, respectively. Again, statistical significance falls with lower effective sample size.
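The rule of thumb is simple enough to check by hand, but a short sketch makes the arithmetic explicit. The function name is hypothetical, and rounding down to a whole number of cases is an assumption made here to stay conservative.

```python
import math


def rule_of_thumb_effective_n(n: int, patch_size: float) -> float:
    """Crude effective sample size: n divided by sqrt(similarity-patch size)."""
    return n / math.sqrt(patch_size)


if __name__ == "__main__":
    n_societies = 186  # size of the Murdock and White sample
    for patch in (3, 10):
        efs = rule_of_thumb_effective_n(n_societies, patch)
        print(f"patch size {patch:2d}: effective n is about {math.floor(efs)}")
```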
In modern analyses, spatial lags have been modelled in order to estimate the influence of globalization on modern societies (Jahn 2006).
Spatial dependency or autocorrelation is a fundamental concept in geography. Methods developed by geographers that measure and control for spatial autocorrelation (e.g., Cliff and Ord 1973, 1981) do far more than reduce the effective n for tests of significance of a correlation. One example is the complicated hypothesis that “the presence of gambling in a society is directly proportional to the presence of a commercial money and to the presence of considerable socioeconomic differences and is inversely related to whether or not the society is a nomadic herding society.” Tests of this hypothesis in a sample of 60 societies failed to reject the null hypothesis. Autocorrelation analysis, however, showed a significant effect of socioeconomic differences.
How prevalent is autocorrelation among the variables studied in cross-cultural research? A test by Anthon Eff on 1700 variables in the cumulative database for the Standard Cross-Cultural Sample, published in World Cultures, measured Moran’s I for spatial autocorrelation (distance), linguistic autocorrelation (common descent), and autocorrelation in cultural complexity (mainline evolution). "The results suggest that ... it would be prudent to test for spatial and phylogenetic autocorrelation when conducting regression analyses with the Standard Cross-Cultural Sample."
The use of autocorrelation tests in exploratory data analysis is illustrated, showing how all variables in a given study can be evaluated for nonindependence of cases in terms of distance, language, and cultural complexity. The methods for estimating these autocorrelation effects are then explained and illustrated for ordinary least squares regression, again using Moran’s I as the significance measure of autocorrelation.
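To make the statistic concrete, here is a minimal sketch of Moran’s I for one variable and one weights matrix. The four-case chain of neighbours and the values are invented for illustration; in the studies described above the weights would instead encode distance, language relatedness, or similarity in cultural complexity. The sketch computes only the statistic; assessing its significance (for example by permutation) is not shown.

```python
from typing import Sequence


def morans_i(values: Sequence[float], weights: Sequence[Sequence[float]]) -> float:
    """Moran's I: (n / S0) * sum_ij w_ij * z_i * z_j / sum_i z_i**2,
    where z_i are deviations from the mean and S0 is the sum of all weights."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n)
    )
    denom = sum(zi * zi for zi in z)
    return (n / s0) * (cross / denom)


# Hypothetical weights: four cases in a row, each linked to its neighbours.
CHAIN = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

if __name__ == "__main__":
    print(morans_i([1, 2, 3, 4], CHAIN))    # neighbours are similar: positive
    print(morans_i([1, -1, 1, -1], CHAIN))  # neighbours alternate: negative
```

Values near zero suggest that neighbouring cases are no more alike than randomly paired ones, which is the situation naive significance tests assume.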
When autocorrelation is present, it can often be removed to get unbiased estimates of regression coefficients and their variances by constructing a respecified dependent variable that is "lagged" by weightings on the dependent variable at other locations, where the weights are the degree of relationship. This lagged dependent variable is endogenous, and estimation requires either two-stage least squares or maximum likelihood methods (Anselin 1988).
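The construction of the lagged variable can be sketched as follows. This shows only how a weighted average of the dependent variable at related locations is built; row-standardizing the weights is a common convention assumed here, and the weights and values are hypothetical. Estimating the coefficient on the lag by two-stage least squares or maximum likelihood is not shown.

```python
from typing import List, Sequence


def row_standardize(weights: Sequence[Sequence[float]]) -> List[List[float]]:
    """Scale each row of the weights matrix so that it sums to one."""
    out = []
    for row in weights:
        total = sum(row)
        out.append([w / total if total else 0.0 for w in row])
    return out


def spatial_lag(weights: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """For each case, the weighted average of y over its related cases (W y)."""
    w = row_standardize(weights)
    return [sum(wij * yj for wij, yj in zip(row, y)) for row in w]


if __name__ == "__main__":
    # Hypothetical degrees of relationship among four societies in a chain.
    relatedness = [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0],
    ]
    y = [10.0, 20.0, 30.0, 40.0]
    print(spatial_lag(relatedness, y))
```

The resulting vector would enter the regression as an additional right-hand-side variable; because it is built from the dependent variable itself, ordinary least squares is not appropriate for it, which is why the text calls for two-stage least squares or maximum likelihood.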
Opportunities
In anthropology, where Tylor's problem was first recognized by the statistician Galton in 1889, it is still not widely recognized that there are standard statistical adjustments for the problem of patches of similarity in observed cases, and opportunities for new discoveries using autocorrelation methods. Some cross-cultural researchers (see, e.g., Korotayev and de Munck 2003) have begun to realize that evidence of diffusion, historical origin, and other sources of similarity among related societies or individuals should be renamed Galton’s Opportunity and Galton’s Asset rather than Galton’s Problem. Researchers such as Mace and Pagel (1994) now routinely use longitudinal, cross-cultural, and regional variation analysis to analyze all the competing hypotheses: functional relationships, diffusion, common historical origin, multilineal evolution, co-adaptation with environment, and complex social interaction dynamics.
Controversies
Within anthropology, Galton's problem is often given as a reason to reject comparative studies altogether. Since the problem is a general one, common to the sciences and to statistical inference generally, this particular criticism of cross-cultural or comparative studies – and there are many – is one that, logically speaking, amounts to a rejection of science and statistics altogether. Any data collected and analyzed by ethnographers, for example, are equally subject to Galton's problem, understood in its most general sense. A critique of the anticomparative critique is not limited to statistical comparison, since it would apply as well to the analysis of text. That is, the analysis and use of text in argumentation is subject to critique as to the evidential basis of inference. Reliance purely on rhetoric is no protection against critique as to the validity of argument and its evidentiary basis.
There is little doubt, however, that the community of cross-cultural researchers has been remiss in ignoring Galton's problem. Expert investigation of this question shows results that "strongly suggest that the extensive reporting of naïve chi-square independence tests using cross-cultural data sets over the past several decades has led to incorrect rejection of null hypotheses at levels much higher than the expected 5% rate."
The investigator concludes that "Incorrect theories that have been ‘saved’ by naïve chi-square tests with comparative data may yet be more rigorously tested another day." Once again, the adjusted variance of a cluster sample is given as the variance multiplied by 1 + d(k+1), where k is the average size of a cluster, and a more complicated correction is given for the variance of contingency table correlations with r rows and c columns. Since the publication of this critique in 1993, and of others like it, more authors have begun to adopt corrections for Galton's problem, but the majority in the cross-cultural field have not. Consequently, a large proportion of published results that rely on naive significance tests and that adopt the p < .05 rather than the p < .005 standard are likely to be in error because they are more susceptible to type I error, which is to reject the null hypothesis when it is true.
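The inflation of type I error can be illustrated with a small simulation. The critique above concerns chi-square tests on contingency tables. The sketch below swaps in a simpler two-sided z-test on a mean, and it uses the extreme case in which every member of a cluster is identical (d = 1). All sizes, the seed, and the function names are hypothetical choices for illustration.

```python
import math
import random
from typing import List, Tuple


def z_test_rejects(values: List[float], critical: float = 1.96) -> bool:
    """Two-sided test of 'mean = 0' that treats all values as independent."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return abs(mean) / math.sqrt(variance / n) > critical


def simulate(n_clusters: int = 20, cluster_size: int = 5,
             trials: int = 2000, seed: int = 0) -> Tuple[float, float]:
    """Return (naive, cluster-level) rejection rates when the null is true."""
    rng = random.Random(seed)
    naive = adjusted = 0
    for _ in range(trials):
        cluster_values = [rng.gauss(0.0, 1.0) for _ in range(n_clusters)]
        # Each cluster member copies the cluster value, so cases are duplicated.
        sample = [v for v in cluster_values for _ in range(cluster_size)]
        naive += z_test_rejects(sample)             # pretends n = 100
        adjusted += z_test_rejects(cluster_values)  # uses the 20 independent cases
    return naive / trials, adjusted / trials


if __name__ == "__main__":
    naive_rate, adjusted_rate = simulate()
    print(f"naive rejection rate:         {naive_rate:.3f}")
    print(f"cluster-level rejection rate: {adjusted_rate:.3f}")
```

Under these assumptions the naive test should reject a true null hypothesis far more often than the nominal 5%, while the test on one value per cluster should stay close to it.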
Some cross-cultural researchers reject the seriousness of Galton's problem because, they argue, estimates of correlations and means may be unbiased even if autocorrelation, weak or strong, is present. Without investigating autocorrelation, however, they may still mis-estimate statistics dealing with relationships among variables. First, in regression analysis, for example, examining the patterns of autocorrelated residuals may give important clues to third factors that may affect the relationships among variables but that have not been included in the regression model. Second, if there are clusters of similar and related societies in the sample, measures of variance will be underestimated, leading to spurious statistical conclusions, for example, exaggerating the statistical significance of correlations. Third, the underestimation of variance makes it difficult to test for replication of results from two different samples, as the similarity of the results will more often be rejected.