Granger causality
The Granger causality test is a statistical hypothesis test for determining whether one time series is useful in forecasting another. Ordinarily, regressions reflect "mere" correlations, but Clive Granger, who won a Nobel Prize in Economics, argued that a certain set of tests can be interpreted as revealing something about causality.
A time series X is said to Granger-cause Y if it can be shown, usually through a series of t-tests and F-tests on lagged values of X (and with lagged values of Y also included), that those X values provide statistically significant information about future values of Y.
This technique has been adapted to neuroscience, although its usefulness in fMRI is contested.
Method
The test for Granger causality works by first regressing ΔY on lagged values of ΔY. (Here ΔY is the first difference of the variable Y, that is, Y minus its one-period-prior value. The regressions are performed in terms of ΔY rather than Y if Y is not stationary but ΔY is.) Once the set of significant lagged values of ΔY is found (via t-statistics or p-values), the regression is augmented with lagged values of ΔX. Any particular lagged value of ΔX is retained in the regression if (1) it is significant according to a t-test, and (2) it and the other lagged values of ΔX jointly add explanatory power to the model according to an F-test. The null hypothesis of no Granger causality is then retained if and only if no lagged values of ΔX have been retained in the regression.
The researcher is often looking for a clear story, such as X Granger-causes Y but not the other way around. In practice, however, it may be found that neither variable Granger-causes the other, or that each of the two variables Granger-causes the other.
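The procedure can be sketched in Python as a comparison of two least-squares fits. This is a simplified illustration, not a reference implementation: it uses a fixed lag order instead of selecting individual lags by their t-statistics, it relies on NumPy, the data are synthetic, and the helper names `lag_matrix`, `rss` and `granger_f` are hypothetical.

```python
import numpy as np

def lag_matrix(series, lags):
    """Columns hold the series lagged by 1..lags, aligned with series[lags:]."""
    n = len(series)
    return np.column_stack([series[lags - k : n - k] for k in range(1, lags + 1)])

def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def granger_f(x, y, lags):
    """F statistic for H0: lagged x adds nothing to an autoregression of y.

    Simplification (assumption): all lags 1..lags are kept in both models.
    Returns (F, numerator df, denominator df).
    """
    target = y[lags:]
    const = np.ones((len(target), 1))
    restricted = np.hstack([const, lag_matrix(y, lags)])        # own lags only
    unrestricted = np.hstack([restricted, lag_matrix(x, lags)])  # plus lags of x
    rss_r, rss_u = rss(restricted, target), rss(unrestricted, target)
    df_resid = len(target) - unrestricted.shape[1]
    f_value = ((rss_r - rss_u) / lags) / (rss_u / df_resid)
    return f_value, lags, df_resid

# Synthetic example: the change in Y depends on the previous change in X.
rng = np.random.default_rng(0)
n = 500
dx = rng.standard_normal(n)
dy = rng.standard_normal(n)
dy[1:] += 0.8 * dx[:-1]
X, Y = np.cumsum(dx), np.cumsum(dy)   # non-stationary levels (random walks)
dX, dY = np.diff(X), np.diff(Y)       # first differences, as in the text

f_xy, df1, df2 = granger_f(dX, dY, lags=2)   # does X help forecast Y?
f_yx, _, _ = granger_f(dY, dX, lags=2)       # does Y help forecast X?
print(f"X -> Y: F({df1}, {df2}) = {f_xy:.2f}")
print(f"Y -> X: F({df1}, {df2}) = {f_yx:.2f}")
```

Because the data were built so that only lagged changes in X feed into Y, the first statistic should be far larger than the second. A p-value would come from comparing each statistic with the F distribution on the stated degrees of freedom.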
Limitations
Despite its name, Granger causality is not sufficient to imply true causality. If both X and Y are driven by a common third process with different lags, the test may still reject the null hypothesis of no Granger causality, even though manipulating one of the variables would not change the other. Indeed, the Granger test is designed to handle pairs of variables, and may produce misleading results when the true relationship involves three or more variables. A similar test involving more variables can be applied with vector autoregression.
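The common-driver problem can be illustrated with a small simulation. In the sketch below, a hypothetical hidden series z drives x with a lag of one period and y with a lag of two, and x and y have no direct link. The pairwise test nevertheless produces a large F statistic, while a test that also conditions on lags of z does not. The helper names and the data-generating process are assumptions made for the example, and the conditional regression is only a stand-in for a full vector autoregression.

```python
import numpy as np

def lagged(series, lags):
    """Columns hold the series lagged by 1..lags, aligned with series[lags:]."""
    n = len(series)
    return np.column_stack([series[lags - k : n - k] for k in range(1, lags + 1)])

def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def f_stat(target, candidate, controls, lags):
    """F statistic for adding lags of `candidate` to a regression of `target`
    on a constant plus lags of every series in `controls`."""
    y = target[lags:]
    base = np.hstack([np.ones((len(y), 1))] + [lagged(c, lags) for c in controls])
    full = np.hstack([base, lagged(candidate, lags)])
    rss_r, rss_u = rss(base, y), rss(full, y)
    return ((rss_r - rss_u) / lags) / (rss_u / (len(y) - full.shape[1]))

rng = np.random.default_rng(1)
n = 2000
z_full = rng.standard_normal(n + 2)
z = z_full[2:]                                      # common driver z_t
x = z_full[1:-1] + 0.5 * rng.standard_normal(n)     # x_t = z_{t-1} + noise
y = z_full[:-2] + 0.5 * rng.standard_normal(n)      # y_t = z_{t-2} + noise

# Pairwise test: lags of x appear to help forecast y.
pairwise = f_stat(y, x, controls=[y], lags=2)
# Conditioning on lags of z as well removes the apparent effect.
conditional = f_stat(y, x, controls=[y, z], lags=2)
print(f"pairwise F = {pairwise:.1f}, conditional F = {conditional:.2f}")
```

The design choice here is to keep one function for both cases: the pairwise Granger test is simply the conditional test with the target's own lags as the only control.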
Mathematical statement
Let y and x be stationary time series. To test the null hypothesis that x does not Granger-cause y, one first finds the proper lagged values of y to include in a univariate autoregression of y:

$$y_t = a_0 + a_1 y_{t-1} + a_2 y_{t-2} + \cdots + a_m y_{t-m} + \text{error}_t.$$

Here $y_{t-i}$ is retained in the regression if and only if it has a significant t-statistic; m is the greatest lag length for which the lagged dependent variable is significant.
Next, the autoregression is augmented by including lagged values of x:

$$y_t = a_0 + a_1 y_{t-1} + a_2 y_{t-2} + \cdots + a_m y_{t-m} + b_p x_{t-p} + \cdots + b_q x_{t-q} + \text{error}_t.$$

One retains in this regression all lagged values of x that are individually significant according to their t-statistics, provided that collectively they add explanatory power to the regression according to an F-test (whose null hypothesis is that the lagged values of x jointly add no explanatory power). In the notation of the above augmented regression, p is the shortest, and q is the longest, lag length for which the lagged value of x is significant.
The null hypothesis that x does not Granger-cause y is not rejected if and only if no lagged values of x are retained in the regression.
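For reference, the joint F-test mentioned above is usually computed from the residual sums of squares of the two regressions. The statement below is a sketch under simplifying assumptions that are not spelled out in the text: every lag of y from 1 to m and every lag of x from p to q is kept, both regressions include an intercept, and T is the number of observations used in the fit. RSS_r denotes the residual sum of squares of the autoregression of y alone, and RSS_u that of the augmented regression.

```latex
F = \frac{(\mathrm{RSS}_r - \mathrm{RSS}_u)/k}{\mathrm{RSS}_u/(T - m - k - 1)},
\qquad k = q - p + 1 .
```

Under the null hypothesis and the usual regression assumptions, this statistic is compared with an F distribution with k and T - m - k - 1 degrees of freedom. A large value indicates that the lagged values of x jointly reduce the unexplained variation in y.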
Software implementation
Here is an example of output from the grangertest function in the lmtest package for R:
Granger causality test

Model 1: fii ~ Lags(fii, 1:5) + Lags(rM, 1:5)
Model 2: fii ~ Lags(fii, 1:5)
  Res.Df Df      F  Pr(>F)
1    629
2    634  5 2.5115 0.02896 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Granger causality test

Model 1: rM ~ Lags(rM, 1:5) + Lags(fii, 1:5)
Model 2: rM ~ Lags(rM, 1:5)
  Res.Df Df      F Pr(>F)
1    629
2    634  5 1.1804 0.3172
The first test compares Model 1 with Model 2 to check whether the lagged values of rM can be dropped from the regression that explains fii by its own lagged values. They cannot (p = 0.02896). The second test finds that the lagged values of fii can be dropped from the regression that explains rM by its own lagged values (p = 0.3172). From this, we conclude that rM Granger-causes fii but not the other way around.
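The reported p-values can be cross-checked from the F statistics and degrees of freedom in the output (5 and 629), assuming the usual F reference distribution. The sketch below uses only the Python standard library and a hypothetical helper `f_sf`, which evaluates the upper tail of the F distribution by numerical integration. Where SciPy is available, `scipy.stats.f.sf` computes the same quantity.

```python
import math

def f_sf(f_value, df_num, df_den, steps=20000):
    """Upper-tail probability P(F > f_value) for an F(df_num, df_den) variable.

    Uses the identity P(F > f) = I_x(df_den / 2, df_num / 2) with
    x = df_den / (df_den + df_num * f), where I_x is the regularized
    incomplete beta function, evaluated here by Simpson's rule.
    Assumes df_num > 2 and df_den > 2 so the integrand vanishes at 0 and 1,
    and an even number of steps.
    """
    a, b = df_den / 2.0, df_num / 2.0
    x = df_den / (df_den + df_num * f_value)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    def integrand(t):
        if t <= 0.0 or t >= 1.0:
            return 0.0
        log_value = (a - 1) * math.log(t) + (b - 1) * math.log1p(-t) - log_beta
        return math.exp(log_value)

    h = x / steps
    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0

# F statistics and degrees of freedom taken from the output above.
p_rm_to_fii = f_sf(2.5115, 5, 629)   # lags of rM in the fii equation
p_fii_to_rm = f_sf(1.1804, 5, 629)   # lags of fii in the rM equation
print(f"rM -> fii: p = {p_rm_to_fii:.5f}")
print(f"fii -> rM: p = {p_fii_to_rm:.4f}")
```

The two values should agree with the Pr(>F) column up to rounding of the printed F statistics.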
Extensions
A method for Granger causality that is not sensitive to deviations from the assumption that the error term is normally distributed has been developed by Hacker and Hatemi-J (2006). This method is especially useful in financial economics, since many financial variables are not normally distributed. Another extension was proposed by Pedro Antonio Valdes-Sosa, José Miguel Bornot-Sánchez, Mayrín Vega Hernández, Lester Melie-García, Agustín Lage-Castellano and Erick Cavales-Rodríguez, who evaluated a special extension of Granger causality that uses statistical parametric mapping (SPM) of an influence field for the analysis of effective brain connectivity.