Lecture 1


ECON 5360 Class Notes

Heteroscedasticity

1 Introduction

In this chapter, we focus on the problem of heteroscedasticity within the multiple linear regression model.

Throughout, we assume that all other classical assumptions are satisfied. Assume the model is

$$Y = X\beta + \varepsilon \tag{1}$$

where

$$E(\varepsilon\varepsilon') = \sigma^2\Omega = \begin{bmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_n^2 \end{bmatrix}. \tag{2}$$

Heteroscedasticity is a common occurrence in cross-sectional data. It can also occur in time series data (e.g., AutoRegressive Conditional Heteroscedasticity, ARCH).

2 Ordinary Least Squares

We now examine several results related to OLS when heteroscedasticity is present in the model.

2.1 Summary of Findings

1. $b = (X'X)^{-1}X'Y$ is unbiased and consistent.

2. $\mathrm{var}(b) = \sigma^2(X'X)^{-1}X'\Omega X(X'X)^{-1}$ is the correct formula.

3. $\mathrm{var}(b) = \sigma^2(X'X)^{-1}$ is the incorrect formula (the simulation sketch after this list illustrates points 2 and 3).

4. $b \overset{asy}{\sim} N\left(\beta, \tfrac{\sigma^2}{n}Q^{-1}\tilde{Q}Q^{-1}\right)$, where $\mathrm{plim}\,\tfrac{1}{n}(X'X) = Q$ and $\mathrm{plim}\,\tfrac{1}{n}(X'\Omega X) = \tilde{Q}$.
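The following numpy simulation is not part of the original notes; the data-generating process is made up for illustration. It shows points 2 and 3 in action: the OLS slope is centered on the truth, but the conventional variance formula misses the true sampling variance.

```python
# Monte Carlo sketch: OLS remains unbiased under heteroscedasticity,
# but s^2 (X'X)^{-1} misstates the sampling variance of b.
import numpy as np

rng = np.random.default_rng(0)
n, reps = 200, 2000
x = rng.uniform(1.0, 5.0, n)
X = np.column_stack([np.ones(n), x])   # constant plus one regressor
sd_i = x                               # error std dev grows with x

slopes, conv_vars = [], []
for _ in range(reps):
    y = 1.0 * x + sd_i * rng.standard_normal(n)   # true slope = 1
    b = np.linalg.solve(X.T @ X, X.T @ y)
    e = y - X @ b
    s2 = e @ e / (n - X.shape[1])
    conv_vars.append(s2 * np.linalg.inv(X.T @ X)[1, 1])
    slopes.append(b[1])

print("mean slope:", np.mean(slopes))                # ~1.0: unbiased
print("true sampling var:", np.var(slopes))          # Monte Carlo benchmark
print("avg conventional var:", np.mean(conv_vars))   # biased under (2)
```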

2.2 White’s Estimator of Var(b)


If we continue to use OLS, we need a good estimate of $\mathrm{var}(b) = \sigma^2(X'X)^{-1}X'\Omega X(X'X)^{-1}$. White (1980) suggests that if we don't know the form of $\Omega$, we can still find a consistent estimate of $\tfrac{1}{n}\sigma^2 X'\Omega X$; that is,

$$S_0 = \frac{1}{n}\sum_{i=1}^{n} e_i^2 x_i x_i'$$

will converge in probability to $\tfrac{1}{n}\sigma^2 X'\Omega X$, where the $e_i$ are the OLS residuals. Therefore, White's asymptotic estimate of $\mathrm{var}(b)$ is

$$\mathrm{est.asy.var}(b) = n\,(X'X)^{-1} S_0\, (X'X)^{-1}.$$

Davidson and MacKinnon have shown that White's estimator can be unreliable in small samples and have suggested appropriate modifications.
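A compact numpy version of this estimator might look as follows. This is a sketch, not the notes' Gauss code; the function name and interface are my own.

```python
# White (1980) heteroscedasticity-consistent covariance sketch.
# X: n x k regressor matrix (with constant); e: OLS residuals.
import numpy as np

def white_vcov(X, e):
    n = X.shape[0]
    XtX_inv = np.linalg.inv(X.T @ X)
    S0 = (X * (e ** 2)[:, None]).T @ X / n   # S0 = (1/n) sum_i e_i^2 x_i x_i'
    return n * XtX_inv @ S0 @ XtX_inv        # est.asy.var(b)

# usage: standard errors are np.sqrt(np.diag(white_vcov(X, e)))
```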

2.3 Gauss Example

In this application, we are interested in measuring the degree of technical inefficiency of rice farmers in the Ivory Coast. The data are both cross-sectional ($N = 154$ farmers) and time series ($T = 3$ years). The model is

$$\ln(1/TE) = \alpha + X\beta + Z\gamma + \varepsilon$$

where $TE$ represents technical efficiency (i.e., the ratio of actual production to the efficient level from a production frontier), $X$ is a set of managerial variables (e.g., years of experience, gender, age, education, etc.), and $Z$ is a set of exogenous variables (e.g., erosion, slope, weed density, pests, region dummies, year dummies, etc.). The main point of the exercise is to see whether technical inefficiency is related to the managerial characteristics of the rice farmers, once we have accounted for aspects of the production process outside their control. See Gauss example 1 for further details.

3 Testing for Heteroscedasticity

All the tests below are based on the OLS residuals. This makes sense, at least asymptotically, because $b \overset{p}{\longrightarrow} \beta$.

3.1 Graphical Test

As a first step, it may be useful to graph $e_i^2$ or $e_i$ against any variable suspected of being related to the heteroscedasticity. If you are unsure which variable is responsible, you can plot against $\hat{Y}_i = X_i b$, which is simply a weighted sum of all the $X$'s.
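As an illustration (simulated data, matplotlib assumed; this is not one of the Gauss examples), a plot of $e_i^2$ against the fitted values often shows a fan shape when the variance grows with a regressor:

```python
# Illustrative residual plot: squared OLS residuals against fitted values.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
x = rng.uniform(1, 5, 150)
y = 2 + x + x * rng.standard_normal(150)   # error variance grows with x
X = np.column_stack([np.ones_like(x), x])
b = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ b

plt.scatter(X @ b, e ** 2, s=10)
plt.xlabel("fitted values $\\hat{Y}_i$")
plt.ylabel("squared residuals $e_i^2$")
plt.title("Fan shape suggests heteroscedasticity")
plt.show()
```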

3.2 White’s Test

The advantage of White's test for heteroscedasticity (and similarly White's estimator of $\mathrm{var}(b)$) is that you do not need to know the specific form of $\Omega$. The null hypothesis is $H_0: \sigma_i^2 = \sigma^2\ \forall i$ and the alternative is that the null is false. The motivation for the test is that if the null is true, $s^2(X'X)^{-1}$ and White's estimator $n(X'X)^{-1}S_0(X'X)^{-1}$ are both consistent estimators of $\mathrm{var}(b)$, while if the null is false, the two estimates will diverge. The test procedure is:

Regress $e_i^2$ on all the crosses and squares of $X$. The test statistic is $W = nR^2 \overset{asy}{\sim} \chi^2(P-1)$, where $P$ is the total number of regressors, including the constant.
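A minimal sketch of this procedure in numpy/scipy (a hypothetical helper, not the notes' code): it builds the squares and cross products from the columns of $X$, including the constant, and reads the degrees of freedom off the rank of the auxiliary regressor matrix.

```python
# White's test sketch; scipy assumed for the chi-square p-value.
# X should contain a constant column; e are the OLS residuals.
import numpy as np
from scipy import stats

def white_test(X, e):
    n, k = X.shape
    cols = [X[:, i] * X[:, j] for i in range(k) for j in range(i, k)]
    Z = np.column_stack(cols)        # constant, levels, squares, crosses
    g = e ** 2
    zb = np.linalg.lstsq(Z, g, rcond=None)[0]
    resid = g - Z @ zb
    R2 = 1 - resid @ resid / ((g - g.mean()) @ (g - g.mean()))
    P = np.linalg.matrix_rank(Z)     # regressors net of collinear columns
    W = n * R2
    return W, stats.chi2.sf(W, P - 1)
```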

The disadvantage of the test is that, because it is so general, it can easily detect sorts of misspecification other than heteroscedasticity. Also, the test is nonconstructive, in the sense that once heteroscedasticity is found, the test does not provide guidance on how to find an optimal estimator.

3.3 Goldfeld-Quandt Test

The Goldfeld-Quandt test addresses the disadvantage of White's test. It is a more powerful test that assumes the sample can be divided into two groups – one with a low error variance and the other with a high error variance. The trick is to find the variable on which to sort the data. The hypotheses are

$$H_0: \sigma_i^2 = \sigma^2\ \forall i$$
$$H_A: \sigma_n^2 \ge \sigma_{n-1}^2 \ge \cdots \ge \sigma_1^2.$$

The test procedure (sketched in code after the list) is:

1. Order the observations in ascending order according to the size of the error variances.

2. Omit $r$ central observations (often $r = n/3$).

3. Run two separate regressions – the first $(n-r)/2$ observations and the last $(n-r)/2$ observations.

4. Form the statistic $F = \dfrac{e_1'e_1/(n_1-k)}{e_2'e_2/(n_2-k)} \sim F(n_1-k,\ n_2-k)$, which requires that $\varepsilon \sim N(0, \sigma^2\Omega)$.

5. Reject or fail to reject the null hypothesis.
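A sketch of the procedure (a hypothetical helper, scipy assumed for the p-value). Note one small deviation from the notation above: the larger-variance group is placed in the numerator, the usual one-sided convention.

```python
# Goldfeld-Quandt sketch: sort on one column, drop r middle rows,
# compare the two subsample residual variances with an F statistic.
import numpy as np
from scipy import stats

def goldfeld_quandt(X, y, sort_col, r):
    order = np.argsort(X[:, sort_col])
    Xs, ys = X[order], y[order]
    n, k = Xs.shape
    n1 = (n - r) // 2

    def ssr(Xg, yg):
        b = np.linalg.lstsq(Xg, yg, rcond=None)[0]
        e = yg - Xg @ b
        return e @ e, len(yg) - k

    ssr1, df1 = ssr(Xs[:n1], ys[:n1])          # low-variance group
    ssr2, df2 = ssr(Xs[n - n1:], ys[n - n1:])  # high-variance group
    F = (ssr2 / df2) / (ssr1 / df1)            # larger variance on top
    return F, stats.f.sf(F, df2, df1)
```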

3.4 Breusch-Pagan Test

One drawback of the Goldfeld-Quandt test is that you need to choose only one variable related to the heteroscedasticity. Often there are many candidates. The Breusch-Pagan test allows you to choose a vector, $z_i$, of variables causing the heteroscedasticity. The hypotheses are

$$H_0: \sigma_i^2 = \sigma^2\ \forall i$$
$$H_A: \sigma_i^2 = \sigma^2 f(\alpha_0 + \alpha' z_i).$$

The test statistic is

$$LM = \frac{g'Z(Z'Z)^{-1}Z'g}{2} \overset{asy}{\sim} \chi^2(P-1)$$

where $g_i = (e_i^2/\hat{\sigma}^2) - 1$ and $Z_i = (1, z_i)$. If $Z$ contains the regressors from White's test, then the two tests are algebraically equivalent.
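A direct transcription of the LM statistic into numpy (a sketch; scipy assumed for the p-value, and `z` here excludes the constant, which the function adds):

```python
# Breusch-Pagan LM sketch following the statistic above.
# e: OLS residuals; z: suspected variance drivers (no constant column).
import numpy as np
from scipy import stats

def breusch_pagan(e, z):
    n = len(e)
    g = e ** 2 / (e @ e / n) - 1           # g_i = e_i^2 / sigma_hat^2 - 1
    Z = np.column_stack([np.ones(n), z])   # Z_i = (1, z_i)
    ZtZ_inv = np.linalg.inv(Z.T @ Z)
    LM = g @ Z @ ZtZ_inv @ Z.T @ g / 2
    P = Z.shape[1]
    return LM, stats.chi2.sf(LM, P - 1)
```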

3.5 Gauss Example (cont.)

We now perform the three tests for heteroscedasticity using the Ivory Coast rice-farming data. The Goldfeld-Quandt test will not work because, after sorting, the smaller $X$ matrix is not of full rank. White's test will not work either because there are too many variables. See Gauss example 2 for the results from the Breusch-Pagan test.

4 Generalized Least Squares

4.1 $\Omega$ is Known

Assume that the variance-covariance matrix of the errors is known (apart from the scalar $\sigma^2$) and is given by (2). We learned that the efficient estimator is

$$\hat{\beta} = (X'\Omega^{-1}X)^{-1}(X'\Omega^{-1}Y) = (X'P'PX)^{-1}(X'P'PY)$$

where $P\Omega P' = I$ and

$$P = \begin{bmatrix} 1/\sigma_1 & 0 & \cdots & 0 \\ 0 & 1/\sigma_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1/\sigma_n \end{bmatrix}.$$

GLS can be interpreted as "weighted least squares" because the transformation matrix $P$ weights every observation by the inverse of its error standard deviation. Therefore, observations with the most inherent uncertainty get the smallest weight.

Example. Let the model be

$$Y_i = \beta X_i + \varepsilon_i$$

where

$$\sigma_i^2 = \sigma^2 X_i^2.$$

The GLS estimator is therefore

$$\hat{\beta} = \frac{\sum_i X_i^{-2} X_i Y_i}{\sum_i X_i^{-2} X_i^2} = \frac{1}{n}\sum_i \frac{Y_i}{X_i},$$

or the average y-x ratio.
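A quick numerical check of this example (made-up data): weighted least squares with weights $1/X_i^2$ reproduces the average of the ratios $Y_i/X_i$.

```python
# GLS as weighted least squares: with sigma_i^2 = sigma^2 X_i^2,
# the GLS slope equals the mean of Y_i / X_i.
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(1, 4, 500)
y = 2.0 * x + x * rng.standard_normal(500)    # error sd proportional to x

w = 1.0 / x ** 2                              # weights 1/sigma_i^2 (up to scale)
beta_gls = (w * x * y).sum() / (w * x * x).sum()
print(beta_gls, np.mean(y / x))               # identical, by the algebra above
```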

4.2 $\Omega$ is Unknown

There are too many $\sigma_i^2$ elements to estimate with a sample size equal to $n$. Therefore, we need to restrict $\sigma_i^2$ so that it is a function of a smaller number of parameters (e.g., $\sigma_i^2 = \sigma^2 X_i^2$ or $\sigma_i^2 = \sigma^2 f(\alpha' z_i)$).

4.2.1 Two-Step Estimation

Since $\Omega$ is unknown, we need to estimate it. Let's refer to

$$\hat{\beta}_{FGLS} = (X'\hat{\Omega}^{-1}X)^{-1}(X'\hat{\Omega}^{-1}Y)$$

as the feasible GLS estimator. Consider the following two-step procedure for calculating $\hat{\beta}_{FGLS}$:

1. Estimate the regression model $e_i^2 = f(\alpha' z_i) + v_i$. Use $\hat{\alpha}$ to obtain the estimates $\hat{\sigma}_i^2 = f(\hat{\alpha}' z_i)$.

2. Calculate $\hat{\beta}_{FGLS}$.

Provided $\hat{\alpha}$ is a consistent estimate of $\alpha$ in step #1, $\hat{\beta}_{FGLS}$ will be asymptotically efficient at step #2. It may be possible to iterate steps #1 and #2 further, but nothing is gained asymptotically. Sometimes it may be necessary to transform the regression model in step #1 (e.g., take natural logs of $\sigma_i^2 = \sigma^2\exp(\alpha' z_i)$).
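A sketch of the two-step procedure under the exponential skedastic function used later in these notes, $\sigma_i^2 = \sigma^2\exp(\alpha' z_i)$. The helper is my own, not the notes' Gauss code; step 1 runs in logs, as suggested above.

```python
# Two-step FGLS sketch for sigma_i^2 = sigma^2 exp(alpha'z_i).
# X: regressors; y: outcome; Z: skedastic variables (with constant).
import numpy as np

def fgls(X, y, Z):
    # step 0: OLS residuals
    b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ b_ols
    # step 1: estimate alpha from ln(e_i^2) = alpha'Z_i + v_i
    a = np.linalg.lstsq(Z, np.log(e ** 2), rcond=None)[0]
    sig2_hat = np.exp(Z @ a)                  # sigma_i^2 up to scale
    # step 2: GLS with the estimated weights 1/sigma_i^2
    w = 1.0 / sig2_hat
    XtWX = X.T @ (w[:, None] * X)
    XtWy = X.T @ (w * y)
    return np.linalg.solve(XtWX, XtWy)
```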

4.2.2 Maximum Likelihood Estimation

Write the heteroscedasticity generally as $\sigma_i^2 = \sigma^2 f_i(\alpha)$. The (normal) log likelihood function is

$$\ln L(\beta, \sigma^2, \alpha) = -\frac{n}{2}\left(\ln(2\pi) + \ln(\sigma^2)\right) - 0.5\sum_{i=1}^{n}\left[\ln f_i(\alpha) + \frac{(y_i - x_i'\beta)^2}{\sigma^2 f_i(\alpha)}\right].$$

The first-order conditions are

$$\frac{\partial \ln L}{\partial \beta} = \frac{1}{2\sigma^2}\sum_{i=1}^{n}\frac{2x_i y_i - 2x_i x_i'\beta}{f_i(\alpha)} = 0 \;\Longrightarrow\; \sum_{i=1}^{n}\frac{x_i\varepsilon_i}{f_i(\alpha)} = 0 \tag{3}$$

$$\frac{\partial \ln L}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^{n}\frac{\varepsilon_i^2}{f_i(\alpha)} = 0 \;\Longrightarrow\; \sigma^2 = \frac{1}{n}\sum_{i=1}^{n}\frac{\varepsilon_i^2}{f_i(\alpha)} \tag{4}$$

$$\frac{\partial \ln L}{\partial \alpha} = -\frac{1}{2}\sum_{i=1}^{n}\frac{g_i(\alpha)}{f_i(\alpha)} + \frac{1}{2\sigma^2}\sum_{i=1}^{n}\frac{\varepsilon_i^2\, g_i(\alpha)}{f_i(\alpha)^2} = 0 \tag{5}$$

where $g_i(\alpha) = \partial f_i(\alpha)/\partial\alpha$. Notice that equation (3) gives the normal equation for GLS. Solving equations (3) through (5) jointly for $\theta = \{\beta, \sigma^2, \alpha\}$ will produce the maximum likelihood estimates of the model. This can be accomplished in a couple of different ways.

1. Brute force. Use one of the nonlinear optimization algorithms (e.g., Newton-Raphson) to maximize the likelihood function.

2. Oberhofer and Kmenta two-step estimator. Start with a consistent estimator of $\beta$. Use that estimate to obtain estimates of $\sigma^2$ and $\alpha$. Iterate back and forth until convergence (a sketch follows this list).
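A rough sketch of the back-and-forth iteration for the exponential case $f_i(\alpha) = \exp(\alpha' z_i)$. This is my own implementation, not a transcription of the notes' Gauss code; in particular, updating $\alpha$ by regressing $\ln(e_i^2)$ on $Z$ is one convenient consistent choice, and the tolerance is illustrative.

```python
# Oberhofer-Kmenta style iteration sketch: alternate between the beta
# step (weighted LS) and the alpha step until the coefficients settle.
import numpy as np

def oberhofer_kmenta(X, y, Z, tol=1e-8, max_iter=100):
    b = np.linalg.lstsq(X, y, rcond=None)[0]      # consistent start (OLS)
    for _ in range(max_iter):
        e = y - X @ b
        a = np.linalg.lstsq(Z, np.log(e ** 2), rcond=None)[0]  # alpha step
        w = np.exp(-Z @ a)                        # w_i = 1 / f_i(alpha_hat)
        b_new = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
        converged = np.max(np.abs(b_new - b)) < tol
        b = b_new
        if converged:
            break
    sigma2 = np.mean(w * (y - X @ b) ** 2)        # equation (4)
    return b, a, sigma2
```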

The (efficient) asymptotic ML variance is given by the inverse of the information matrix,

$$\mathrm{asy.var}(\hat{\theta}_{ML}) = \left\{-E\left[\frac{\partial^2 \ln L}{\partial\theta\,\partial\theta'}\right]\right\}^{-1},$$

and is given as equation (11-21) in Greene. If this matrix is not working well in the nonlinear optimization algorithm or is not invertible, one could simply use the negative inverse Hessian (without expectations) or the outer product of the gradients (OPG).

4.3 Model Based Test for Heteroscedasticity

As a final note, rather than use the OLS residuals to test for heteroscedasticity, one could test the null hypothesis $H_0: \alpha = 0$ using one of the classical asymptotic tests. For example, the likelihood ratio test would use

$$LR = -2[\ln(L_R) - \ln(L_U)] \overset{asy}{\sim} \chi^2(J)$$

where $L_R$ is the likelihood value with homoscedasticity imposed (i.e., $\alpha = 0$) and $L_U$ is the likelihood value allowing for heteroscedasticity (i.e., $\alpha \ne 0$).
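Computing the statistic is mechanical once the two maximized log-likelihood values are in hand; a sketch (scipy assumed for the p-value; the log-likelihoods would come from the ML estimation of Section 4.2.2):

```python
# Likelihood ratio test sketch: compare the restricted (alpha = 0) and
# unrestricted maximized log-likelihoods; J is the length of alpha.
import numpy as np
from scipy import stats

def lr_test(loglik_restricted, loglik_unrestricted, J):
    LR = -2.0 * (loglik_restricted - loglik_unrestricted)
    return LR, stats.chi2.sf(LR, J)

# usage: LR, pval = lr_test(lnL_R, lnL_U, J=2)
```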

4.4 Gauss Application (cont.)

Using the Ivory Coast rice-farming example, we now calculate feasible GLS and ML estimates of $\beta$ and $\gamma$. The heteroscedasticity is assumed to follow $\sigma_i^2 = \sigma^2\exp(\alpha' z_i)$, where $z_i = (1, \mathrm{region1}_i, \mathrm{region2}_i)$. See Gauss example 3 for further details.
