In a parody of the consequences of multicollinearity, and in a tongue-in-cheek
manner, Goldberger cites exactly similar consequences of micronumerosity,
that is, analysis based on small sample size. The reader is
advised to read Goldberger’s analysis to see why he regards micronumerosity
as being as important as multicollinearity.
14. Give three ways to detect multicollinearity (1, 2, 6—Variance Inflation Factor only). Briefly explain. (pp. 359–363)
1. High R² but few significant t ratios. As noted, this is the “classic”
symptom of multicollinearity. If R² is high, say, in excess of 0.8, the F test in
most cases will reject the hypothesis that the partial slope coefficients are
simultaneously equal to zero, but the individual t tests will show that none or
very few of the partial slope coefficients are statistically different from zero.
This fact was clearly demonstrated by our consumption–income–wealth
example.
Although this diagnostic is sensible, its disadvantage is that “it is too
strong in the sense that multicollinearity is considered as harmful only
when all of the influences of the explanatory variables on Y cannot be disentangled.”
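To see this symptom concretely, the following is a minimal Python simulation sketch (hypothetical data, not the book's consumption–income–wealth series; the names income and wealth are illustrative). With two nearly collinear regressors, the fit typically shows a high R², a significant overall F statistic, and individually insignificant t ratios.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 25
income = rng.uniform(80, 260, n)                # hypothetical regressor
wealth = 10 * income + rng.normal(0, 5, n)      # nearly an exact multiple of income
y = 25 + 0.5 * income + 0.02 * wealth + rng.normal(0, 8, n)

X = sm.add_constant(np.column_stack([income, wealth]))
res = sm.OLS(y, X).fit()

print(f"R-squared : {res.rsquared:.3f}")            # usually close to 1
print(f"F p-value : {res.f_pvalue:.4f}")            # usually rejects beta2 = beta3 = 0 jointly
print("t p-values:", np.round(res.pvalues[1:], 3))  # often both individually insignificant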
2. High pair-wise correlations among regressors. Another suggested
rule of thumb is that if the pair-wise or zero-order correlation coefficient between
two regressors is high, say, in excess of 0.8, then multicollinearity is a
serious problem. The problem with this criterion is that, although high
zero-order correlations may suggest collinearity, it is not necessary that they
be high to have collinearity in any specific case. To put the matter somewhat
technically, high zero-order correlations are a sufficient but not a necessary
condition for the existence of multicollinearity because it can exist even
though the zero-order or simple correlations are comparatively low (say, less
than 0.50). To see this relationship, suppose we have a four-variable model:
Yi = β1 + β2X2i + β3X3i + β4X4i + ui
and suppose that
X4i = λ2X2i + λ3X3i
where λ2 and λ3 are constants, not both zero. Obviously, X4 is an exact linear
combination of X2 and X3, giving R² = 1, the coefficient of determination
in the regression of X4 on X2 and X3.
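As a quick numerical check of this point (purely hypothetical data), the sketch below builds X4 as an exact linear combination of two independent regressors; the auxiliary regression of X4 on X2 and X3 then returns R² = 1 even though the pairwise correlations of X4 with X2 and with X3 are each well below 1.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 50
X2 = rng.normal(size=n)
X3 = rng.normal(size=n)        # generated independently of X2
X4 = 0.5 * X2 + 0.5 * X3       # lambda2 = lambda3 = 0.5: exact linear combination

print(np.corrcoef([X2, X3, X4]).round(2))   # pairwise correlations of X4 are ~0.7, not 1

aux = sm.OLS(X4, sm.add_constant(np.column_stack([X2, X3]))).fit()
print(f"R² of X4 on X2, X3: {aux.rsquared:.4f}")    # 1.0000 up to rounding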
6. Tolerance and variance inflation factor. We have already introduced
TOL and VIF. As Rj², the coefficient of determination in the regression
of regressor Xj on the remaining regressors in the model, increases toward
unity, that is, as the collinearity of Xj with the other regressors increases,
VIF also increases and in the limit it can be infinite.
Some authors therefore use the VIF as an indicator of multicollinearity.
The larger the value of VIFj, the more “troublesome” or collinear the variable
Xj. As a rule of thumb, if the VIF of a variable exceeds 10, which will
happen if Rj² exceeds 0.90, that variable is said to be highly collinear.
Of course, one could use TOLj as a measure of multicollinearity in view
of its intimate connection with VIFj. The closer TOLj is to zero, the greater
the degree of collinearity of that variable with the other regressors. On the other hand, the closer TOLj is to 1, the greater the evidence that Xj is not
collinear with the other regressors.
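The following is one way the VIF/TOL diagnostic might be computed in practice, using statsmodels' variance_inflation_factor on simulated, hypothetical regressors (the names income and wealth are again illustrative). The function regresses each column of the design matrix on the others, so the constant column is included in the matrix but skipped in the loop.

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(1)
n = 40
income = rng.uniform(80, 260, n)
wealth = 10 * income + rng.normal(0, 20, n)     # strongly collinear with income
exog = sm.add_constant(np.column_stack([income, wealth]))

for j, name in enumerate(["income", "wealth"], start=1):   # column 0 is the constant
    vif = variance_inflation_factor(exog, j)
    tol = 1.0 / vif
    flag = "  <- exceeds the rule-of-thumb cutoff of 10" if vif > 10 else ""
    print(f"{name}: VIF = {vif:10.1f}, TOL = {tol:.4f}{flag}")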
VIF (or tolerance) as a measure of collinearity is not free of criticism. As
(10.5.4) shows, var(β̂j) depends on three factors: σ², Σxj², and VIFj. A high
VIF can be counterbalanced by a low σ² or a high Σxj². To put it differently,
a high VIF is neither necessary nor sufficient to get high variances and high
standard errors. Therefore, high multicollinearity, as measured by a high
VIF, may not necessarily cause high standard errors. In all this discussion,
the terms high and low are used in a relative sense.
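For reference, the decomposition alluded to above can be written out as follows (a standard restatement of (10.5.4), with xj denoting deviations of Xj from its sample mean), which makes explicit why a small σ² or a large Σxj² can offset a large VIFj:

var(β̂j) = (σ² / Σxj²) · VIFj = (σ² / Σxj²) · [1 / (1 − Rj²)]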
To conclude our discussion of detecting multicollinearity, we stress that
the various methods we have discussed are essentially in the nature of
“fishing expeditions,” for we cannot tell which of these methods will work in
any particular application. Alas, not much can be done about it, for multicollinearity
is specific to a given sample over which the researcher may not
have much control, especially if the data are nonexperimental in nature—
the usual fate of researchers in the social sciences.
Again as a parody of multicollinearity, Goldberger cites numerous ways of
detecting micronumerosity, such as developing critical values of the sample
size, n*, such that micronumerosity is a problem only if the actual sample
size, n, is smaller than n*. The point of Goldberger’s parody is to emphasize
that small sample size and lack of variability in the explanatory variables may
cause problems that are at least as serious as those due to multicollinearity.
15. Illustrate the nature of homoscedasticity and heteroscedasticity in two diagrams. (pp. 387–388).
16. Give three (out of seven) reasons for heteroscedasticity. Briefly explain. (pp. 389–393)
As noted in Chapter 3, one of the important assumptions of the classical
linear regression model is that the variance of each disturbance term ui,
conditional on the chosen values of the explanatory variables, is some constant
number equal to σ². This is the assumption of homoscedasticity, or
equal (homo) spread (scedasticity), that is, equal variance. Symbolically,

E(ui²) = σ², i = 1, 2, ..., n
Diagrammatically, in the two-variable regression model homoscedasticity
can be shown as in Figure 3.4, which, for convenience, is reproduced as
Figure 11.1. As Figure 11.1 shows, the conditional variance of Yi (which is
equal to that of ui), conditional upon the given Xi, remains the same regardless
of the values taken by the variable X.
In contrast, consider Figure 11.2, which shows that the conditional variance
of Yi increases as X increases. Here, the variances of Yi are not the
same. Hence, there is heteroscedasticity. Symbolically,

E(ui²) = σi²

Notice the subscript of σ², which reminds us that the conditional variances
of ui (= conditional variances of Yi) are no longer constant.
To make the difference between homoscedasticity and heteroscedasticity
clear, assume that in the two-variable model Yi = β1 + β2Xi + ui , Y represents
savings and X represents income. Figures 11.1 and 11.2 show that as
income increases, savings on the average also increase. But in Figure 11.1 the variance of savings remains the same at all levels of income, whereas in
Figure 11.2 it increases with income. It seems that in Figure 11.2 the higher-income
families on the average save more than the lower-income families,
but there is also more variability in their savings.
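A minimal simulated illustration of this savings–income story (purely hypothetical numbers) is sketched below: the error standard deviation is made proportional to income, so the sample variance of savings is visibly larger in the higher-income part of the sample.

import numpy as np

rng = np.random.default_rng(7)
n = 300
income = np.sort(rng.uniform(20, 200, n))
u = rng.normal(0, 0.05 * income)        # heteroscedastic: error sd grows with income
savings = 2 + 0.12 * income + u         # savings rise with income on average

# variance of savings within the low-, middle-, and high-income thirds of the sample
for label, grp in zip(["low", "middle", "high"], np.array_split(np.arange(n), 3)):
    print(f"{label:>6}-income third: var(savings) = {savings[grp].var():8.2f}")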
There are several reasons why the variances of ui may be variable, some
of which are as follows.
1. Following the error-learning models, as people learn, their errors of behavior
become smaller over time. In this case, σi² is expected to decrease. As
an example, consider Figure 11.3, which relates the number of typing errors
made in a given time period on a test to the hours put in typing practice. As
Figure 11.3 shows, as the number of hours of typing practice increases, the
average number of typing errors as well as their variances decreases.
2. As incomes grow, people have more discretionary income and hence
more scope for choice about the disposition of their income. Hence, σi² is
likely to increase with income. Thus in the regression of savings on income
one is likely to find σi² increasing with income (as in Figure 11.2) because
people have more choices about their savings behavior. Similarly, companies
with larger profits are generally expected to show greater variability
in their dividend policies than companies with lower profits. Also, growth-oriented
companies are likely to show more variability in their dividend
payout ratio than established companies.
3. As data collecting techniques improve, σi² is likely to decrease. Thus,
banks that have sophisticated data processing equipment are likely to commit fewer errors in the monthly or quarterly statements of their customers