Parameter identification problem
The parameter identification problem is a problem which can occur in the estimation of multiple-equation econometric models where the equations have variables in common.
More generally, the term can be used to refer to any situation where a statistical model will invariably have more than one set of parameters which generate the same distribution of observations.
The standard example, with two equations
Consider a linear model for the supply and demand of some specific good. The quantity demanded varies inversely with the price: a higher price decreases demand. The quantity supplied varies directly with the price: a higher price makes supply more profitable.
Assume that we have data, say for several years, on both the price P and the traded quantity Q of this good. Unfortunately this is not enough to identify the two equations (demand and supply) using regression analysis on observations of Q and P: one cannot estimate both a downward slope and an upward slope with a single linear regression line involving only two variables. Additional variables can make it possible to identify the individual relations.
In the graph shown here, the supply (red line, upward sloping) depends on the price, while the demand (black lines, downward sloping) depends on the price and also on some additional variable Z. This Z might be income, with more income shifting the demand curve outwards. This is symbolically indicated with the values 1, 2 and 3 for Z.
Because the market clears (supply equals demand), the observations on quantity and price are like the three white dots in the graph: they trace out the supply curve. Hence the effect of Z on the demand makes it possible to identify the (positive) slope of the supply equation. The (negative) slope parameter of the demand cannot be identified in this case.
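The three-dot argument can be checked numerically. The following is a minimal sketch, assuming a linear supply curve with no shifter and a linear demand curve shifted by Z; all parameter values and names (`A_S`, `B_S`, `equilibrium`) are hypothetical and chosen only for illustration. It shows that a regression line through the equilibrium points returns the supply slope, and that a quite different demand curve produces exactly the same observations.

```python
import numpy as np

# Hypothetical parameter values, chosen only for illustration.
A_S, B_S = 1.0, 2.0  # supply: Q = A_S + B_S * P (no shifter)


def equilibrium(z, a_d, b_d, d):
    """Market-clearing (P, Q) when demand is Q = a_d + b_d * P + d * z."""
    p = (a_d - A_S + d * z) / (B_S - b_d)
    q = A_S + B_S * p
    return p, q


z_values = np.array([1.0, 2.0, 3.0])

# Observations generated by one particular demand curve.
p_obs, q_obs = equilibrium(z_values, a_d=10.0, b_d=-1.0, d=3.0)

# A straight line fitted through the observed (P, Q) points
# lies on the supply curve, so it recovers the supply parameters.
slope, intercept = np.polyfit(p_obs, q_obs, 1)
print("fitted slope:", slope, "fitted intercept:", intercept)

# A different demand curve (slope -4 instead of -1) yields the same data,
# so the demand slope cannot be recovered from these observations.
p_alt, q_alt = equilibrium(z_values, a_d=19.0, b_d=-4.0, d=6.0)
same = np.allclose(p_obs, p_alt) and np.allclose(q_obs, q_alt)
print("same observations from a different demand curve:", same)
```

The second demand curve is observationally equivalent to the first: no estimation technique applied to these (P, Q) points could tell the two apart.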
In other words, the parameters of an equation can be identified if it is known that some variable does not enter into that equation, while it does enter the other equation. In formulae, we might have:
- supply: $Q = a_S + b_S P + c X$
- demand: $Q = a_D + b_D P + d Z$
with positive $b_S$ and negative $b_D$. Here both equations are identified if c and d are nonzero. Then Z occurs in the demand but not in the supply, and X occurs in the supply but not in the demand.
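As an illustration of how the exclusion restrictions do the work, here is a minimal sketch. It assumes the structural equations are Q = a_S + b_S P + c X (supply) and Q = a_D + b_D P + d Z (demand), with hypothetical parameter values and no disturbance term. It uses the exactly identified instrumental-variables estimator, which is one standard way to exploit an excluded variable and is not a method named in the text: Z serves as the instrument for P in the supply equation, and X in the demand equation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x = rng.normal(size=n)  # supply shifter, excluded from demand
z = rng.normal(size=n)  # demand shifter, excluded from supply

# Hypothetical structural parameters (illustrative only).
a_s, b_s, c = 1.0, 2.0, 0.5    # supply: Q = a_s + b_s * P + c * X
a_d, b_d, d = 10.0, -1.0, 3.0  # demand: Q = a_d + b_d * P + d * Z

# Market clearing: for each observation solve the 2x2 system
#   Q - b_s * P = a_s + c * x
#   Q - b_d * P = a_d + d * z
coef = np.array([[1.0, -b_s], [1.0, -b_d]])
rhs = np.vstack([a_s + c * x, a_d + d * z])  # shape (2, n)
q, p = np.linalg.solve(coef, rhs)


def iv_estimate(y, regressors, instruments):
    """Exactly identified IV estimator: solve (W'R) beta = W'y."""
    return np.linalg.solve(instruments.T @ regressors, instruments.T @ y)


ones = np.ones(n)
# Supply: regressors (1, P, X); the excluded variable Z instruments P.
supply_hat = iv_estimate(q, np.column_stack([ones, p, x]),
                         np.column_stack([ones, z, x]))
# Demand: regressors (1, P, Z); the excluded variable X instruments P.
demand_hat = iv_estimate(q, np.column_stack([ones, p, z]),
                         np.column_stack([ones, x, z]))

print("supply (a_s, b_s, c):", supply_hat)
print("demand (a_d, b_d, d):", demand_hat)
```

If d were zero, Z would no longer move P independently of X, the matrix in the supply estimate would be singular, and the supply equation would not be identified; the same holds for the demand equation when c is zero.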
Note that this is the structural form of the model, showing the relations between Q and P. The reduced form, however, can be identified easily.
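As a worked sketch, suppose the structural equations take the linear form $Q = a_S + b_S P + c X$ (supply) and $Q = a_D + b_D P + d Z$ (demand); the intercepts $a_S$ and $a_D$ are assumed here for illustration. Setting supply equal to demand and solving for the endogenous variables P and Q gives the reduced form:

```latex
\begin{align*}
P &= \frac{a_D - a_S}{b_S - b_D} + \frac{d}{b_S - b_D}\,Z - \frac{c}{b_S - b_D}\,X, \\
Q &= \frac{b_S a_D - b_D a_S}{b_S - b_D} + \frac{b_S d}{b_S - b_D}\,Z - \frac{b_D c}{b_S - b_D}\,X.
\end{align*}
```

Each reduced-form equation expresses one endogenous variable in terms of exogenous variables only, so its coefficients can be estimated directly. Under these assumptions, the ratio of the Z coefficients (Q equation over P equation) equals $b_S$ when d is nonzero, and the ratio of the X coefficients equals $b_D$ when c is nonzero.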
Estimation methods and disturbances
"It is important to note that the problem is not one of the appropriateness of a particular estimation technique. In the situation described [without the Z variable], there clearly exists no way using any technique whatsoever in which the true demand (or supply) curve can be estimated. Nor, indeed, is the problem here one of statistical inference - of separating out the effects of random disturbance. There is no disturbance in this model [...] It is the logic of the supply-demand equilibrium itself which leads to the difficulty." (Fisher 1966, p. 5)
More equations
More generally, consider a linear system of M equations, with M > 1.
An equation cannot be identified from the data if fewer than M − 1 variables are excluded from that equation. This is a particular form of the order condition for identification. (The general form of the order condition also deals with restrictions other than exclusions.) The order condition is necessary but not sufficient for identification.
The rank condition is a necessary and sufficient condition for identification. In the case of only exclusion restrictions, it must "be possible to form at least one nonvanishing determinant of order M − 1 from the columns of A corresponding to the variables excluded a priori from that equation" (Fisher 1966, p. 40), where A is the matrix of coefficients of the equations. This is the generalization in matrix algebra of the requirement "while it does enter the other equation" mentioned above (in the line above the formulas).
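The two conditions can be checked mechanically for a small system. The sketch below assumes exclusion restrictions only, and a two-equation supply and demand system written over the variables (Q, P, X, Z) with hypothetical coefficient values; the function names and the boolean `excluded` mask are illustrative inventions. For equation i, the rank condition is taken to mean that the columns of A for the variables excluded from equation i have rank M − 1.

```python
import numpy as np


def order_condition(excluded, i):
    """Necessary: at least M - 1 variables are excluded from equation i."""
    m = excluded.shape[0]
    return int(excluded[i].sum()) >= m - 1


def rank_condition(a, excluded, i):
    """Necessary and sufficient (exclusions only): the columns of A for the
    variables excluded from equation i have rank M - 1."""
    m = a.shape[0]
    sub = a[:, excluded[i]]
    if sub.shape[1] == 0:  # nothing excluded, so no determinant can be formed
        return False
    return int(np.linalg.matrix_rank(sub)) == m - 1


def coefficient_matrix(b_s, c, b_d, d):
    """Rows: supply, demand. Columns: Q, P, X, Z. Each row is the equation
    moved to one side; the intercept appears in both and is omitted."""
    return np.array([[1.0, -b_s, -c, 0.0],
                     [1.0, -b_d, 0.0, -d]])


# A priori exclusions: Z is excluded from supply, X from demand.
excluded = np.array([[False, False, False, True],
                     [False, False, True, False]])

a_full = coefficient_matrix(b_s=2.0, c=0.5, b_d=-1.0, d=3.0)
a_weak = coefficient_matrix(b_s=2.0, c=0.5, b_d=-1.0, d=0.0)  # Z has no effect

for name, a in [("c, d nonzero", a_full), ("d = 0", a_weak)]:
    for i, eq in enumerate(["supply", "demand"]):
        print(name, eq,
              "order:", order_condition(excluded, i),
              "rank:", rank_condition(a, excluded, i))
```

In the second case the supply equation still satisfies the order condition (one variable is excluded), but the rank condition fails because the excluded variable Z does not actually enter the demand equation. This illustrates why the order condition is necessary but not sufficient.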
Related use of the term
In engineering language, the term "parameter identification" is used to indicate a more general subject, which is roughly the same as estimation in statistics.
See also
- Observational equivalence
- Identifiability
- System of linear equations
- Simultaneous equations
- Reduced form