Ecological fallacy
An ecological fallacy is a logical fallacy in the interpretation of statistical data in an ecological study, whereby inferences about the nature of specific individuals are based solely upon aggregate statistics collected for the group to which those individuals belong. This fallacy assumes that individual members of a group have the average characteristics of the group at large. However, statistics that accurately describe group characteristics do not necessarily apply to individuals within that group. One mathematical reason is that the variability of individuals is much greater than the variability of their mean.
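The claim that individuals vary more than their group mean can be made precise under a simplifying assumption that is not stated in the article: that the individual measurements X_1, ..., X_n are independent and share a common variance σ². Under that assumption, a short derivation gives the variance of the group mean:

```latex
\[
\operatorname{Var}(\bar{X})
  = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
  = \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(X_i)
  = \frac{\sigma^{2}}{n}
\]
```

In this idealized setting the standard deviation of an individual is √n times that of the mean, so a tightly estimated group average says little about how far any one member lies from it.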
Stereotypes, which assume that groups are homogeneous, are one form of ecological fallacy. For example, if a particular group of people is measured to have a lower average IQ than the general population, it is an error to assume that every member of that group has a lower IQ than the general population. In fact, any given individual from that group may have a lower than average IQ, average IQ, or above average IQ compared to the general population.
Examples
A study shows that people from City A score higher on college entry exams, on average, than people from City B. This does not mean that a randomly selected individual from A will usually score higher than a randomly selected individual from B, because the distribution of scores might be very different between the cities. Consider this synthetic example:
- City A: 80% of people got 40 points and 20% of them got 95 points. The average score is 51 points.
- City B: 50% of people got 45 points and 50% got 55 points. The average score is 50 points.
- If we pick one person at random from each city, there are 4 possible outcomes:
- A - 40, B - 45 (B wins, 40% probability)
- A - 40, B - 55 (B wins, 40% probability)
- A - 95, B - 45 (A wins, 10% probability)
- A - 95, B - 55 (A wins, 10% probability)
- Although City A has a higher average score, 80% of the time a random inhabitant of A will score lower than a random inhabitant of B.
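The arithmetic in the synthetic example can be checked with a short script. This is a minimal sketch; the function and variable names are illustrative and not part of the original example, and exact fractions are used so that the percentages come out without rounding error.

```python
from fractions import Fraction
from itertools import product

# Synthetic score distributions from the example: {score: share of population}.
city_a = {40: Fraction(80, 100), 95: Fraction(20, 100)}
city_b = {45: Fraction(50, 100), 55: Fraction(50, 100)}


def mean_score(dist):
    """Population average of a discrete score distribution."""
    return sum(score * share for score, share in dist.items())


def prob_first_beats_second(first, second):
    """Probability that a random person from `first` outscores one from `second`."""
    return sum(
        p * q
        for (x, p), (y, q) in product(first.items(), second.items())
        if x > y
    )


mean_a = mean_score(city_a)
mean_b = mean_score(city_b)
p_a_wins = prob_first_beats_second(city_a, city_b)
p_b_wins = prob_first_beats_second(city_b, city_a)

print(f"mean A = {mean_a}, mean B = {mean_b}")
print(f"P(A beats B) = {float(p_a_wins):.0%}, P(B beats A) = {float(p_b_wins):.0%}")
```

City A has the higher mean, yet the pairwise comparison favors City B, which is the gap between group-level and individual-level statements that the example illustrates.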
If a particular sports team is described as performing poorly, it would be fallacious to conclude that each player on that team performs poorly. Because the performance of the team depends on each player, one excellent player and two terrible players may average out to three poor players. This does not diminish the excellence of the one player.
In the United States presidential elections of 2000, 2004, and 2008, wealthier states (states with higher per capita incomes) tended to vote Democratic and poorer states tended to vote Republican. Yet wealthier voters tended to vote Republican and poorer voters tended to vote Democratic. For example, in 2004, the Republican candidate, George W. Bush, won the fifteen poorest states, and the Democratic candidate, John Kerry, won 9 of the 11 wealthiest states. Yet 62% of voters with annual incomes over $200,000 voted for Bush, whereas only 36% of voters with annual incomes of $15,000 or less did.
The ecological fallacy was discussed in a court challenge to the 2004 Washington gubernatorial election, in which a number of illegal voters were identified after the election; how they had voted was unknown, because the vote was by secret ballot. The challengers argued that illegal votes cast in the election would have followed the voting patterns of the precincts in which they had been cast, and thus adjustments should be made accordingly. An expert witness said this approach was like trying to figure out Ichiro Suzuki's batting average by looking at the batting average of the entire Seattle Mariners team, since the illegal votes were cast by an unrepresentative sample of each precinct's voters, and might be as different from the average voter in the precinct as Ichiro was from the rest of his team. The judge determined that the challengers' argument was an ecological fallacy, and rejected it.
Origin of concept
The term comes from a 1950 paper by William S. Robinson. For each of the 48 states and the District of Columbia in the US as of the 1930 census, he computed the literacy rate and the proportion of the population born outside the US. He showed that these two figures were associated with a positive correlation of 0.53; in other words, the greater the proportion of immigrants in a state, the higher its average literacy. However, when individuals were considered, the correlation was −0.11; that is, immigrants were on average less literate than native citizens. Robinson showed that the positive correlation at the level of state populations arose because immigrants tended to settle in states where the native population was more literate. He cautioned against deducing conclusions about individuals on the basis of population-level, or "ecological", data.
An early example of the ecological fallacy was Émile Durkheim's 1897 study of suicide in France, although this has been debated by some.
In 2011, it was found that Robinson's calculations of the ecological correlations were based on the wrong state-level data. The correlation of 0.53 mentioned above is in fact 0.46. The research note on this data glitch is published in the International Journal of Epidemiology (http://ije.oxfordjournals.org/content/early/2011/05/24/ije.dyr081.full%20). The data Robinson used and the corrections are available at http://www.ru.nl/mt/rob/downloads/ .
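The sign reversal Robinson described, a positive correlation across regions alongside a negative one across individuals, can be reproduced on a small invented data set. The sketch below does not use Robinson's census data: the three regions and their head counts are hypothetical, chosen so that foreign-born residents are less literate than natives within every region while regions with more foreign-born residents have more literate natives.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical regions (NOT Robinson's data), as head counts:
# (native literate, native illiterate, foreign-born literate, foreign-born illiterate)
regions = [
    (72, 18, 7, 3),   # 10% foreign-born; natives 80% literate, foreign-born 70%
    (63, 7, 24, 6),   # 30% foreign-born; natives 90% literate, foreign-born 80%
    (49, 1, 44, 6),   # 50% foreign-born; natives 98% literate, foreign-born 88%
]

# Individual level: one 0/1 pair (foreign-born?, literate?) per person.
foreign_born, literate = [], []
# Group level: one (share foreign-born, literacy rate) pair per region.
share_foreign, literacy_rate = [], []

for nat_lit, nat_illit, for_lit, for_illit in regions:
    total = nat_lit + nat_illit + for_lit + for_illit
    foreign_born += [0] * (nat_lit + nat_illit) + [1] * (for_lit + for_illit)
    literate += [1] * nat_lit + [0] * nat_illit + [1] * for_lit + [0] * for_illit
    share_foreign.append((for_lit + for_illit) / total)
    literacy_rate.append((nat_lit + for_lit) / total)

ecological_r = pearson(share_foreign, literacy_rate)
individual_r = pearson(foreign_born, literate)

print(f"ecological (region-level) correlation: {ecological_r:+.2f}")
print(f"individual-level correlation:          {individual_r:+.2f}")
```

With these invented counts the region-level correlation is strongly positive while the individual-level correlation is slightly negative, mirroring the direction of Robinson's result though not its magnitudes.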
Inverse error
The inverse of the ecological fallacy is the fallacy of composition, in which one infers that something is true of the whole from the fact that it is true of some part of the whole.
See also
- Ecological correlation
- Modifiable areal unit problem
- Prosecutor's fallacy
- Sampling (statistics)
- Simpson's paradox
- Statistical discrimination