Average treatment effects
The average treatment effect (ATE) is a measure used to compare treatments (or "interventions") in randomized experiments, evaluations of policy interventions, and medical trials. The ATE measures the average causal difference in outcomes under the treatment and under the control. In a randomized trial (i.e., an experiment), the average treatment effect can be estimated by comparing mean outcomes between treated and untreated units. However, the ATE is a causal estimand defined without reference to the study design or estimation procedure, and both observational and experimental designs may attempt to estimate an ATE in a variety of ways.
General definition
Originating from early statistical analysis in the fields of agriculture and medicine, the term "treatment" is now applied more generally to other fields of natural and social science, especially psychology, political science, and economics, for example in the evaluation of the impact of public policies. The nature of a treatment or outcome is relatively unimportant in the estimation of the ATE.
The expression "treatment effect" refers to the causal effect of a given treatment or policy (for example, the administering of a drug) on an outcome variable of interest (for example, the health of the patient). In the Neyman-Rubin "potential outcomes framework" of causality, a treatment effect is the difference in outcomes for an individual experimental unit under the treatment and under the control. This individual-level treatment effect is unobservable, however, because an individual unit can receive only the treatment or the control, but not both. The average treatment effect in a sample is therefore used as an estimate of the group-level average treatment effect in the population, which in turn averages the unobservable individual-level treatment effects.
For example, consider a setting where all units are unemployed individuals, and some experience a policy intervention (the treatment group), while others do not (the control group). The causal effect of interest is the impact a job search monitoring policy (the treatment) has on the length of an unemployment spell: on average, how much shorter would a person's unemployment be if they experienced the intervention? The ATE, in this case, is the difference in expected values (averages) of the length of unemployment between the treatment and control groups.
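The comparison of group averages described in the unemployment example can be sketched in a few lines of Python. This is only an illustration: the function name `difference_in_means`, the variable names, and the spell lengths are all made up, and the standard error uses the usual conservative sum of per-group variances, which assumes the two groups were formed by random assignment.

```python
import numpy as np

def difference_in_means(outcome, treated):
    """Return (estimate, standard error) of mean(treated) - mean(control)."""
    outcome = np.asarray(outcome, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    y_t, y_c = outcome[treated], outcome[~treated]
    estimate = y_t.mean() - y_c.mean()
    # Conservative variance estimate: sum of the variances of the two group means.
    se = np.sqrt(y_t.var(ddof=1) / y_t.size + y_c.var(ddof=1) / y_c.size)
    return estimate, se

# Hypothetical unemployment spells in weeks (invented numbers, not real data).
weeks = [20, 24, 18, 22, 30, 28, 26, 32]
monitored = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = received job search monitoring

ate_hat, se = difference_in_means(weeks, monitored)
print(f"estimated ATE: {ate_hat:.1f} weeks (SE {se:.2f})")
```

A negative estimate here would mean that monitored individuals had shorter spells on average. Whether that difference can be read causally depends on how the groups were formed, not on the arithmetic.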
Other aggregate measures widely used are the average treatment effect on the treated (ATET) and the local average treatment effect (LATE).
Formal definition
In order to define the ATE formally, we define two potential outcomes: $y_{0i}$ is the value of the outcome variable for individual $i$ if the individual is not treated, and $y_{1i}$ is the value of the outcome variable for individual $i$ if the individual is treated. For example, $y_{0i}$ is the health status of the individual if the drug under study is not administered, and $y_{1i}$ is the health status if the drug is administered.
The treatment effect for individual $i$ is given by $y_{1i} - y_{0i}$. In the general case, there is no reason to expect this effect to be constant across individuals.
Let $E[\cdot]$ denote the expectation operator for any given variable (that is, the average value of the variable across the whole population of interest). The average treatment effect is given by: $E[y_{1i} - y_{0i}]$.
If we could observe $y_{1i}$ and $y_{0i}$ for each individual in a large representative sample of the population, we could estimate the ATE simply by taking the average value of $y_{1i} - y_{0i}$ for the sample: $\frac{1}{N} \sum_{i=1}^{N} (y_{1i} - y_{0i})$ (where $N$ is the size of the sample).
The problem is that we cannot observe both $y_{1i}$ and $y_{0i}$ for each individual. In the drug example, we can only observe $y_{1i}$ for individuals who have received the drug and $y_{0i}$ for those who did not receive it; we do not observe $y_{0i}$ for treated individuals or $y_{1i}$ for untreated ones. This fact is the main problem faced by scientists in the evaluation of treatment effects and has given rise to a large body of estimation techniques.
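A small simulation may help make the missing-data problem concrete. The sketch below generates both potential outcomes for every unit, which is possible only in simulated data, and then reveals just one of them per unit. All names (`y0`, `y1`, `d_random`, `d_selected`) and all distribution parameters are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Both potential outcomes for every unit (never jointly observable in real data).
y0 = rng.normal(loc=50.0, scale=10.0, size=n)    # outcome if untreated
effect = rng.normal(loc=5.0, scale=2.0, size=n)  # individual effects differ across units
y1 = y0 + effect                                 # outcome if treated

# Infeasible estimator: average of the individual effects.
true_ate = np.mean(y1 - y0)

# Randomized assignment: treatment status is independent of (y0, y1).
d_random = rng.random(n) < 0.5
y_obs = np.where(d_random, y1, y0)               # only one outcome is revealed per unit
ate_random = y_obs[d_random].mean() - y_obs[~d_random].mean()

# Self-selection: units with a low untreated outcome are more likely to be treated.
d_selected = (y0 + rng.normal(scale=5.0, size=n)) < 50.0
y_obs_sel = np.where(d_selected, y1, y0)
naive_selected = y_obs_sel[d_selected].mean() - y_obs_sel[~d_selected].mean()

print(f"true ATE:                 {true_ate:.2f}")
print(f"randomized diff in means: {ate_random:.2f}")
print(f"self-selected diff:       {naive_selected:.2f}")
```

Under random assignment the difference in observed means should land close to the true ATE. Under self-selection the same comparison mixes the treatment effect with pre-existing differences in $y_0$ between the groups, which is why observational designs need additional assumptions or techniques.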
Estimation
Depending on the data and its underlying circumstances, many methods can be used to estimate the ATE. The most common ones are:
- natural experiments and the similar quasi-experiments,
- difference in differences (or, for short, diffs-in-diffs),
- the regression discontinuity design method,
- the matching method,
- methods based on the theory of local IVs (in a strict sense, regression discontinuity design belongs here as well).
Once a policy change occurs on a population, a regression can be run controlling for the treatment. In the usual two-group, two-period setup, the resulting equation would be
$y = \beta_0 + \delta_0 \, d2 + \beta_1 \, dT + \delta_1 \, (d2 \cdot dT) + u$
where $y$ is the response variable, $d2$ is an indicator for the second (post-change) period, $dT$ is an indicator for the treatment group, $u$ is an error term, and $\delta_1$ measures the effects of the policy change on the population.
The difference in differences equation would be
$\hat{\delta}_1 = (\bar{y}_{2,T} - \bar{y}_{1,T}) - (\bar{y}_{2,C} - \bar{y}_{1,C})$
where $T$ is the treatment group, $C$ is the control group, and the subscripts 1 and 2 denote the periods before and after the policy change. In this case $\hat{\delta}_1$ measures the effect of the treatment on the average outcome and is the estimate of the average treatment effect.
The diffs-in-diffs example illustrates the main problem of estimating treatment effects. Because we cannot observe the same individual as treated and untreated at the same time, we have to construct a counterfactual, here the change over time in the control group, in order to estimate the average treatment effect.
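As a rough illustration of the difference-in-differences idea, the sketch below simulates a two-group, two-period dataset and computes the estimate in two ways: from the four group-by-period means, and as the coefficient on the interaction term of an ordinary least squares regression. The variable names (`d_t`, `d_2`, `true_effect`) and all numbers are assumptions made for this example. The simulated data satisfy the parallel-trends assumption by construction, which real data may not.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4000

d_t = rng.integers(0, 2, size=n)  # 1 = treatment group T, 0 = control group C
d_2 = rng.integers(0, 2, size=n)  # 1 = period 2 (after the change), 0 = period 1
true_effect = 2.0                 # effect built into the simulated data

# Outcome: common time trend, fixed group difference, and a treatment effect
# that applies only to the treatment group in period 2.
y = (10.0 + 1.5 * d_2 + 3.0 * d_t + true_effect * d_2 * d_t
     + rng.normal(scale=1.0, size=n))

def cell_mean(period, group):
    """Mean outcome for one period/group cell."""
    return y[(d_2 == period) & (d_t == group)].mean()

# Change over time in T minus change over time in C.
did_means = (cell_mean(1, 1) - cell_mean(0, 1)) - (cell_mean(1, 0) - cell_mean(0, 0))

# The same quantity as the interaction coefficient in an OLS regression.
X = np.column_stack([np.ones(n), d_2, d_t, d_2 * d_t])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
did_ols = coef[3]

print(f"DiD from cell means: {did_means:.3f}")
print(f"DiD from regression: {did_ols:.3f}")
```

Because the regression with both indicators and their interaction is saturated, the two numbers agree up to floating-point error. The regression form is mainly convenient for adding covariates or computing standard errors.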
See also
- Average treatment effect on the treated
- Local average treatment effect
- Marginal treatment effect
- Matching method
- Local IV
- Set identification