Partial regression plot
In applied statistics, a partial regression plot attempts to show the effect of adding another variable to a model that already contains one or more independent variables. Partial regression plots are also referred to as added variable plots, adjusted variable plots, and individual coefficient plots.
When performing a linear regression with a single independent variable, a scatter plot of the response variable against the independent variable provides a good indication of the nature of the relationship. If there is more than one independent variable, things become more complicated. Although it can still be useful to generate scatter plots of the response variable against each of the independent variables, this does not take into account the effect of the other independent variables in the model.
Partial regression plots are formed by:
- Computing the residuals of regressing the response variable against the independent variables but omitting Xi
- Computing the residuals from regressing Xi against the remaining independent variables
- Plotting the residuals from the first step against the residuals from the second step.
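The construction can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the data are made up, the helper name `simple_residuals` is hypothetical, and the model is assumed to contain only one other predictor besides the variable of interest, so each auxiliary regression is a simple regression with an intercept.

```python
def simple_residuals(v, z):
    """Residuals from a least squares fit of v on z with an intercept."""
    n = len(v)
    mean_v = sum(v) / n
    mean_z = sum(z) / n
    szz = sum((zi - mean_z) ** 2 for zi in z)
    szv = sum((zi - mean_z) * (vi - mean_v) for zi, vi in zip(z, v))
    slope = szv / szz
    intercept = mean_v - slope * mean_z
    return [vi - (intercept + slope * zi) for zi, vi in zip(z, v)]

# Made-up data: response y, variable of interest x1, one other predictor x2.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [3.1, 3.9, 7.2, 7.8, 11.1, 11.9]

# Residuals of the response regressed on the other predictor (x1 omitted).
y_res = simple_residuals(y, x2)
# Residuals of x1 regressed on the other predictor.
x1_res = simple_residuals(x1, x2)
# The plot is the scatter of y_res (vertical axis) against x1_res (horizontal axis).
points = list(zip(x1_res, y_res))
```

The list `points` holds the coordinates that a plotting library would draw. With more than one remaining predictor, the two auxiliary fits become multiple regressions, but the three steps are unchanged.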
Velleman and Welsch (see References below) express this mathematically as:
Y.[i] versus Xi.[i]
where
- Y.[i] = residuals from regressing Y (the response variable) against all the independent variables except Xi
- Xi.[i] = residuals from regressing Xi against the remaining independent variables.
Velleman and Welsch list the following useful properties for this plot:
- The least squares linear fit to this plot has slope βi (the coefficient of Xi in the full model) and intercept zero.
- The residuals from the least squares linear fit to this plot are identical to the residuals from the least squares fit of the original model (Y against all the independent variables including Xi).
- The influences of individual data values on the estimation of a coefficient are easy to see in this plot.
- It is easy to see many kinds of failures of the model or violations of the underlying assumptions (nonlinearity, heteroscedasticity, unusual patterns).
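The first two properties can be checked numerically. The sketch below is illustrative only: the data are invented, and `solve` and `ols` are hypothetical helpers that fit least squares through the normal equations in plain Python (adequate for a small, well-conditioned example, not a recommendation for production use).

```python
import math

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols(columns, y):
    """Least squares fit of y on the given columns plus an intercept.

    Returns (coefficients, residuals); coefficients[0] is the intercept.
    """
    design = [[1.0] * len(y)] + [list(c) for c in columns]  # stored column by column
    gram = [[sum(u * v for u, v in zip(ci, cj)) for cj in design] for ci in design]
    moment = [sum(u * v for u, v in zip(ci, y)) for ci in design]
    beta = solve(gram, moment)
    fitted = [sum(b * col[k] for b, col in zip(beta, design)) for k in range(len(y))]
    return beta, [yk - fk for yk, fk in zip(y, fitted)]

# Invented data: x1 is the variable of interest, x2 and x3 are the other predictors.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0]
x3 = [1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0]
y = [2.3, 2.9, 5.8, 6.1, 9.4, 9.0, 12.7, 13.2]

beta_full, res_full = ols([x1, x2, x3], y)  # original model, including x1
_, y_res = ols([x2, x3], y)                 # response with x1 omitted
_, x1_res = ols([x2, x3], x1)               # x1 against the remaining predictors
(av_intercept, av_slope), av_res = ols([x1_res], y_res)  # line fitted to the plot

same_slope = math.isclose(av_slope, beta_full[1], abs_tol=1e-8)
zero_intercept = math.isclose(av_intercept, 0.0, abs_tol=1e-8)
same_residuals = all(math.isclose(a, b, abs_tol=1e-8) for a, b in zip(av_res, res_full))
print(same_slope, zero_intercept, same_residuals)
```

Each of the three checks should hold up to floating-point rounding, whatever data are used, provided the predictors are not perfectly collinear.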
Partial regression plots are widely discussed in the regression diagnostics literature (e.g., see the References section below), so their strengths and weaknesses are not covered in detail here.
Partial regression plots are related to, but distinct from, partial residual plots. Partial regression plots are most commonly used to identify data points with high leverage and influential data points that might not have high leverage. Partial residual plots are most commonly used to identify the nature of the relationship between Y and Xi (given the effect of the other independent variables in the model). Note that since the simple correlation between the two sets of residuals plotted is equal to the partial correlation between the response variable and Xi, partial regression plots will show the correct strength of the linear relationship between the response variable and Xi. This is not true for partial residual plots. On the other hand, for the partial regression plot, the x-axis is not Xi. This limits its usefulness in determining the need for a transformation (which is the primary purpose of the partial residual plot).
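The link to the partial correlation can be illustrated numerically. The following is a sketch under simplifying assumptions: the data are made up, there is a single other predictor x2, and the helper names (`centered`, `dot`, `corr`, `residuals_on`) are hypothetical. It compares the correlation of the two plotted residual sets with the standard first-order partial correlation formula computed from the pairwise correlations.

```python
import math

def centered(v):
    """Return v with its mean subtracted."""
    mean = sum(v) / len(v)
    return [a - mean for a in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def corr(u, v):
    """Pearson correlation of two equal-length sequences."""
    uc, vc = centered(u), centered(v)
    return dot(uc, vc) / math.sqrt(dot(uc, uc) * dot(vc, vc))

def residuals_on(v, z):
    """Residuals from regressing v on a single predictor z with an intercept."""
    vc, zc = centered(v), centered(z)
    slope = dot(vc, zc) / dot(zc, zc)
    return [a - slope * b for a, b in zip(vc, zc)]

# Made-up data: response y, variable of interest x1, one other predictor x2.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [3.1, 3.9, 7.2, 7.8, 11.1, 11.9]

# Correlation between the two residual sets shown in the partial regression plot.
r_plot = corr(residuals_on(y, x2), residuals_on(x1, x2))

# Partial correlation of y and x1 given x2, from the pairwise correlations.
r_y1, r_y2, r_12 = corr(y, x1), corr(y, x2), corr(x1, x2)
r_partial = (r_y1 - r_y2 * r_12) / math.sqrt((1 - r_y2 ** 2) * (1 - r_12 ** 2))

print(math.isclose(r_plot, r_partial, abs_tol=1e-9))
```

The two quantities agree up to rounding, which is why the scatter in a partial regression plot reflects the strength of the linear relationship after adjusting for the other variables.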
See also
- Partial residual plot
- Partial leverage plot
- Variance inflation factor for a multi-linear fit
- Scatterplot matrix