Explain how you can use the general F test to test for a restriction on the parameters (restricted least squares). Illustrate with an example (pp. 267–273).
The F-Test Approach: Restricted Least Squares
The preceding t test is a kind of postmortem examination because we try to
find out whether the linear restriction is satisfied after estimating the “unrestricted”
regression. A direct approach would be to incorporate the restriction
(8.7.3) into the estimating procedure at the outset. In the present example,
this procedure can be done easily. From (8.7.3) we see that
β2 = 1 − β3 (8.7.5)
or
β3 = 1 − β2 (8.7.6)
Therefore, using either of these equalities, we can eliminate one of the β coefficients
in (8.7.2) and estimate the resulting equation. Thus, if we use
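(8.7.5), we can write the Cobb–Douglas production function as
ln Yi = β0 + (1 − β3) ln X2i + β3 ln X3i + ui
= β0 + ln X2i + β3(ln X3i − ln X2i) + ui (8.7.7)
or
(ln Yi − ln X2i) = β0 + β3(ln X3i − ln X2i) + ui
or
ln (Yi/X2i) = β0 + β3 ln (X3i/X2i) + ui (8.7.8)
where (Yi/X2i) = output/labor ratio and (X3i/X2i) = capital/labor ratio, quantities
of great economic importance.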
Notice how the original equation (8.7.2) is transformed. Once we estimate
β3 from (8.7.7) or (8.7.8), β2 can be easily estimated from the relation
(8.7.5). Needless to say, this procedure will guarantee that the sum of the
estimated coefficients of the two inputs will equal 1. The procedure outlined
in (8.7.7) or (8.7.8) is known as restricted least squares (RLS). This procedure
can be generalized to models containing any number of explanatory
variables and more than one linear equality restriction. The generalization
can be found in Theil. (See also general F testing below.)
How do we compare the unrestricted and restricted least-squares regressions?
In other words, how do we know that, say, the restriction (8.7.3) is
valid? This question can be answered by applying the F test as follows.
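The comparison rests on the residual sums of squares of the two regressions: F = [(RSS_R − RSS_UR)/m] / [RSS_UR/(n − k)], where m is the number of linear restrictions, k is the number of parameters in the unrestricted regression, and n is the number of observations. The sketch below illustrates the computation under stated assumptions: the data are synthetic (not the textbook's), (8.7.2) is taken to be the log-linear Cobb–Douglas function, and the helper names rss and restricted_f are hypothetical.

```python
import numpy as np

def rss(y, X):
    """Residual sum of squares from an OLS fit of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def restricted_f(rss_r, rss_ur, m, n, k):
    """F = [(RSS_R - RSS_UR) / m] / [RSS_UR / (n - k)]."""
    return ((rss_r - rss_ur) / m) / (rss_ur / (n - k))

# Synthetic data (an assumption, not the textbook data), generated so that
# the restriction beta2 + beta3 = 1 actually holds.
rng = np.random.default_rng(0)
n = 40
ln_x2 = rng.normal(3.0, 0.5, n)   # log labor input
ln_x3 = rng.normal(4.0, 0.5, n)   # log capital input
ln_y = 1.0 + 0.6 * ln_x2 + 0.4 * ln_x3 + rng.normal(0.0, 0.1, n)

# Unrestricted regression: ln Y on a constant, ln X2, and ln X3 (k = 3).
X_ur = np.column_stack([np.ones(n), ln_x2, ln_x3])
rss_ur = rss(ln_y, X_ur)

# Restricted regression: ln(Y/X2) on a constant and ln(X3/X2).
X_r = np.column_stack([np.ones(n), ln_x3 - ln_x2])
rss_r = rss(ln_y - ln_x2, X_r)

# One restriction (m = 1), so F has 1 and n - 3 degrees of freedom.
F = restricted_f(rss_r, rss_ur, m=1, n=n, k=3)
print(f"RSS_UR = {rss_ur:.4f}, RSS_R = {rss_r:.4f}, F(1, {n - 3}) = {F:.4f}")
```

If the computed F exceeds the critical F value at the chosen significance level, the restriction is rejected; otherwise the restricted regression is retained. RSS_R can never be smaller than RSS_UR, so F is nonnegative.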
10. Illustrate in the example of a savings function how you can use dummy variables as an alternative to the Chow test (pp. 306–310).
THE DUMMY VARIABLE ALTERNATIVE TO THE CHOW TEST
In Section 8.8 we discussed the Chow test to examine the structural stability
of a regression model. The example we discussed there related to the
relationship between savings and income in the United States over the
period 1970–1995. We divided the sample period into two, 1970–1981 and
1982–1995, and showed on the basis of the Chow test that there was a difference
in the regression of savings on income between the two periods.
However, we could not tell whether the difference in the two regressions
was because of differences in the intercept terms or the slope coefficients or
both. Very often this knowledge itself is very useful.
Referring to Eqs. (8.8.1) and (8.8.2), we see that there are four possibilities,
which we illustrate in Figure 9.3.
1. Both the intercept and the slope coefficients are the same in the two regressions.
This, the case of coincident regressions, is shown in Figure 9.3a.
2. Only the intercepts in the two regressions are different but the slopes
are the same. This is the case of parallel regressions, which is shown in
Figure 9.3b.
3. The intercepts in the two regressions are the same, but the slopes are
different. This is the situation of concurrent regressions (Figure 9.3c).
4. Both the intercepts and slopes in the two regressions are different.
This is the case of dissimilar regressions, which is shown in Figure 9.3d.
The multistep Chow test procedure discussed in Section 8.8, as noted earlier,
tells us only if two (or more) regressions are different without telling us
what is the source of the difference. The source of difference, if any, can be
pinned down by pooling all the observations (26 in all) and running just one
multiple regression as shown below:
Yt = α1 + α2Dt + β1Xt + β2(DtXt) + ut (9.5.1)
where Y = savings
X = income
t = time
D = 1 for observations in 1982–1995
= 0, otherwise (i.e., for observations in 1970–1981)
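In (9.5.1), α2 is the differential intercept and β2 the differential slope: the first-period regression has intercept α1 and slope β1, and the second-period regression has intercept (α1 + α2) and slope (β1 + β2). The sketch below checks this on synthetic data (an assumption; these are not the actual U.S. savings and income figures), and the helper name ols is hypothetical.

```python
import numpy as np

def ols(y, X):
    """OLS coefficients of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Synthetic stand-in for the savings-income data (assumption): 12 observations
# for 1970-1981 and 14 for 1982-1995, with a different line in each period.
rng = np.random.default_rng(1)
n1, n2 = 12, 14
X = np.linspace(1.0, 6.0, n1 + n2)            # "income"
D = np.r_[np.zeros(n1), np.ones(n2)]          # 1 for 1982-1995, 0 otherwise
Y = ((0.2 + 0.08 * X) * (1 - D)
     + (1.5 + 0.02 * X) * D
     + rng.normal(0.0, 0.05, n1 + n2))        # "savings"

# Pooled regression (9.5.1): Y on 1, D, X, and D*X.
a1, a2, b1, b2 = ols(Y, np.column_stack([np.ones_like(X), D, X, D * X]))

# Separate regressions for each subperiod, for comparison.
first = ols(Y[D == 0], np.column_stack([np.ones(n1), X[D == 0]]))
second = ols(Y[D == 1], np.column_stack([np.ones(n2), X[D == 1]]))

print("first period : pooled", (a1, b1), "separate", tuple(first))
print("second period: pooled", (a1 + a2, b1 + b2), "separate", tuple(second))
```

The pooled regression reproduces both subperiod regressions, so the t tests on α2 and β2 indicate whether the two regressions differ in the intercept, the slope, or both.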
11. Explain the use of a dummy variable in an interactive (or multiplicative) form. Illustrate in an example. (pp. 310–312).
Dummy variables are a flexible tool that can handle a variety of interesting
problems. To see this, consider the following model:
Yi = α1 + α2D2i + α3D3i + βXi + ui (9.6.1)
where Y = hourly wage in dollars
X = education (years of schooling)
D2 = 1 if female, 0 otherwise
D3 = 1 if nonwhite and non-Hispanic, 0 otherwise
In this model gender and race are qualitative regressors and education is
a quantitative regressor. Implicit in this model is the assumption that the
differential effect of the gender dummy D2 is constant across the two categories
of race and the differential effect of the race dummy D3 is also constant
across the two sexes. That is to say, if the mean salary is higher for
males than for females, this is so whether they are nonwhite/non-Hispanic
or not. Likewise, if, say, nonwhite/non-Hispanics have lower mean wages,
this is so whether they are females or males.
In many applications such an assumption may be untenable. A female
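nonwhite/non-Hispanic may earn lower wages than a male nonwhite/non-Hispanic.
In other words, there may be interaction between the two qualitative
variables D2 and D3. Therefore their effect on mean Y may not be simply
additive as in (9.6.1) but multiplicative as well, as in the following model:
Yi = α1 + α2D2i + α3D3i + α4(D2iD3i) + βXi + ui (9.6.2)
where the variables are as defined for model (9.6.1). From (9.6.2) we obtain
E(Yi | D2i = 1, D3i = 1, Xi) = (α1 + α2 + α3 + α4) + βXi (9.6.3)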
which is the mean hourly wage function for female nonwhite/non-Hispanic
workers. Observe that
α2 = differential effect of being a female
α3 = differential effect of being a nonwhite/non-Hispanic
α4 = differential effect of being a female nonwhite/non-Hispanic
which shows that the mean hourly wages of female nonwhite/non-Hispanics
are different (by α4) from the mean hourly wages of females or nonwhite/non-Hispanics.
If, for instance, all the three differential dummy coefficients are
negative, this would imply that female nonwhite/non-Hispanic workers earn
much lower mean hourly wages than female or nonwhite/non-Hispanic
workers as compared with the base category, which in the present example
is male white or Hispanic.
Now the reader can see how the interaction dummy (i.e., the product of
two qualitative or dummy variables) modifies the effect of the two attributes
considered individually (i.e., additively).
EXAMPLE 9.5
AVERAGE HOURLY EARNINGS IN RELATION TO EDUCATION, GENDER, AND RACE
Let us first present the regression results based on model (9.6.1). Using the data that were
used to estimate regression (9.3.1), we obtained the following results:
Ŷi = −0.2610 − 2.3606D2i − 1.7327D3i + 0.8028Xi
t = (−0.2357)** (−5.4873)* (−2.1803)* (9.9094)* (9.6.4)
R2 = 0.2032 n = 528
where * indicates p values less than 5 percent and ** indicates p values greater than 5 percent.
The reader can check that the differential intercept coefficients are statistically significant,
that they have the expected signs (why?), and that education has a strong positive effect on
hourly wage, an unsurprising finding.
As (9.6.4) shows, ceteris paribus, the average hourly earnings of females are lower by
about $2.36, and the average hourly earnings of nonwhite non-Hispanic workers are also
lower by about $1.73.
We now consider the results of model (9.6.2), which includes the interaction dummy.
Ŷi = −0.2610 − 2.3606D2i − 1.7327D3i + 2.1289D2iD3i + 0.8028Xi
t = (−0.2357)** (−5.4873)* (−2.1803)* (1.7420)** (9.9095)* (9.6.5)
R2 = 0.2032 n = 528
where * indicates p values less than 5 percent and ** indicates p values greater than 5 percent.
As you can see, the two additive dummies are still statistically significant, but the interactive
dummy is not at the conventional 5 percent level; the actual p value of the interaction
dummy is about the 8 percent level. If you think this is a low enough probability, then the results
of (9.6.5) can be interpreted as follows: Holding the level of education constant, if you
add the three dummy coefficients you will obtain: −1.964 (= −2.3606 − 1.7327 + 2.1289),
which means that mean hourly wages of nonwhite/non-Hispanic female workers are lower by
about $1.96, which is between the value of −2.3606 (gender difference alone) and −1.7327
(race difference alone).
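The arithmetic can be reproduced with a small sketch that plugs the coefficients reported in (9.6.5) into the fitted equation. The function name predicted_wage, the dictionary keys, and the choice of 12 years of schooling as the education level held constant are illustrative assumptions.

```python
# Coefficients as reported in (9.6.5); the key names are hypothetical labels.
COEF = {
    "const": -0.2610,
    "female": -2.3606,             # D2
    "nonwhite": -1.7327,           # D3
    "female_x_nonwhite": 2.1289,   # D2 * D3 interaction dummy
    "educ": 0.8028,                # X, years of schooling
}

def predicted_wage(female, nonwhite, educ, coef=COEF):
    """Fitted mean hourly wage from (9.6.5) for given dummy values and education."""
    return (coef["const"]
            + coef["female"] * female
            + coef["nonwhite"] * nonwhite
            + coef["female_x_nonwhite"] * female * nonwhite
            + coef["educ"] * educ)

EDUC = 12  # assumed education level, held constant across groups
base = predicted_wage(0, 0, EDUC)  # base category: male, white or Hispanic

for female, nonwhite, label in [(1, 0, "female only"),
                                (0, 1, "nonwhite/non-Hispanic only"),
                                (1, 1, "female nonwhite/non-Hispanic")]:
    diff = predicted_wage(female, nonwhite, EDUC) - base
    print(f"{label}: {diff:+.4f} relative to the base category")
```

Because education enters additively, the differentials do not depend on the value chosen for EDUC.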
The preceding example clearly reveals the role of interaction dummies
when two or more qualitative regressors are included in the model. It is
important to note that in the model (9.6.5) we are assuming that the rate of
increase of hourly earnings with respect to education (of about 80 cents per
additional year of schooling) remains constant across gender and race. But
this may not be the case. If you want to test for this, you will have to introduce
differential slope coefficients (see exercise 9.25).
12. Explain the use of dummy variables to seasonally adjust time series. Illustrate in an example. (pp. 312–317)
Many economic time series based on monthly or quarterly data exhibit
seasonal patterns (regular oscillatory movements). Examples are sales of
department stores at Christmas and other major holiday times, demand for
money (or cash balances) by households at holiday times, demand for ice
cream and soft drinks during summer, prices of crops right after harvesting
season, demand for air travel, etc. Often it is desirable to remove the seasonal
factor, or component, from a time series so that one can concentrate
on the other components, such as the trend. The process of removing
the seasonal component from a time series is known as deseasonalization
or seasonal adjustment, and the time series thus obtained is called the
deseasonalized, or seasonally adjusted, time series. Important economic
time series, such as the unemployment rate, the consumer price index (CPI),
the producer’s price index (PPI), and the index of industrial production, are
usually published in seasonally adjusted form.
There are several methods of deseasonalizing a time series, but we will
consider only one of these methods, namely, the method of dummy variables.
To illustrate how the dummy variables can be used to deseasonalize
economic time series, consider the data given in Table 9.3. This table gives
quarterly data for the years 1978–1995 on the sale of four major appliances,
dishwashers, garbage disposers, refrigerators, and washing machines, all
data in thousands of units. The table also gives data on durable goods expenditure
in billions of 1982 dollars.
To illustrate the dummy technique, we will consider only the sales of refrigerators
over the sample period. But first let us look at the data, which are
shown in Figure 9.4. This figure suggests that perhaps there is a seasonal
pattern in the data associated with the various quarters. To see if this is the
case, consider the following model:
Yt = α1D1t + α2D2t + α3D3t + α4D4t + ut (9.7.1)
where Yt = sales of refrigerators (in thousands) and the D’s are the dummies,
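taking a value of 1 in the relevant quarter and 0 otherwise. Note that (9.7.1) assigns a dummy to each of the four quarters and omits the intercept term, which avoids the dummy variable trap.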
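In a regression of the form (9.7.1), with one dummy per quarter and no intercept, each α coefficient equals the mean of Y in that quarter. The sketch below shows this on a synthetic quarterly series (an assumption; it is not the refrigerator data of Table 9.3). The last step, adding the overall mean of Y to the residuals to obtain a deseasonalized series, is one common way to complete the adjustment and is also an assumption here.

```python
import numpy as np

# Synthetic quarterly series for 18 years (assumption, not Table 9.3).
rng = np.random.default_rng(2)
n_years = 18
quarter = np.tile(np.arange(4), n_years)             # 0..3 stands for Q1..Q4
seasonal = np.array([-100.0, 50.0, 150.0, -100.0])   # made-up seasonal effects
y = 1200.0 + seasonal[quarter] + rng.normal(0.0, 40.0, quarter.size)

# Dummy matrix: column q is 1 in quarter q and 0 otherwise; no intercept column.
D = np.eye(4)[quarter]

# OLS of y on the four dummies, as in (9.7.1).
alpha, *_ = np.linalg.lstsq(D, y, rcond=None)

# Each alpha should coincide with the sample mean of y in that quarter.
quarter_means = np.array([y[quarter == q].mean() for q in range(4)])
print("alpha        :", np.round(alpha, 2))
print("quarter means:", np.round(quarter_means, 2))

# Deseasonalized series (assumed recipe): residuals plus the overall mean of y.
residuals = y - D @ alpha
deseasonalized = residuals + y.mean()
deseason_means = np.array([deseasonalized[quarter == q].mean() for q in range(4)])
print("quarter means after adjustment:", np.round(deseason_means, 2))
```

After the adjustment every quarter has the same mean, so the quarter-to-quarter differences captured by the dummies have been removed.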
13. What are the practical consequences of high multicollinearity? (pp. 350–355, but no proofs)
In cases of near or high multicollinearity, one is likely to encounter the following consequences:
1. Although BLUE, the OLS estimators have large variances and covariances, making precise estimation difficult.
2. Because of consequence 1, the confidence intervals tend to be much wider, leading to the acceptance of the “zero null hypothesis” (i.e., the true population coefficient is zero) more readily.
3. Also because of consequence 1, the t ratio of one or more coefficients tends to be statistically insignificant.
4. Although the t ratio of one or more coefficients is statistically insignificant, R2, the overall measure of goodness of fit, can be very high.
5. The OLS estimators and their standard errors can be sensitive to small changes in the data.
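Consequence 1 can be illustrated numerically. With two regressors whose correlation is r, the variance of each slope estimator is inflated by the factor 1/(1 − r²), often called the variance-inflating factor (VIF). The sketch below uses simulated data (an assumption) and the hypothetical helpers vif and slope_se to compare the standard error of the same coefficient under low and high collinearity.

```python
import numpy as np

def vif(r):
    """Variance-inflating factor for two regressors with correlation r."""
    return 1.0 / (1.0 - r ** 2)

def slope_se(r, x2, z, u):
    """Standard error of the OLS coefficient on x2 when the second regressor
    x3 is built to have population correlation r with x2."""
    x3 = r * x2 + np.sqrt(1.0 - r ** 2) * z
    X = np.column_stack([np.ones_like(x2), x2, x3])
    y = 1.0 + x2 + x3 + u
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (len(y) - X.shape[1])   # estimate of the error variance
    cov = s2 * np.linalg.inv(X.T @ X)            # estimated var-cov matrix of beta
    return float(np.sqrt(cov[1, 1]))

# Simulated data (assumption): the same draws are reused for both cases.
rng = np.random.default_rng(3)
n = 100
x2, z, u = rng.normal(size=(3, n))

se_low = slope_se(0.10, x2, z, u)
se_high = slope_se(0.99, x2, z, u)
print(f"VIF at r = 0.10: {vif(0.10):.2f}, se(beta2) = {se_low:.3f}")
print(f"VIF at r = 0.99: {vif(0.99):.2f}, se(beta2) = {se_high:.3f}")
```

The larger standard error under r = 0.99 translates directly into wider confidence intervals and smaller t ratios, which is what consequences 2 and 3 describe.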