Econometric Analysis of Panel Data: William Greene Department of Economics Stern School of Business
I also have a question about nonlinear GMM, which is more or less a nonlinear IV technique,
I suppose.
I am running a panel non-linear regression (non-linear in the parameters) and I have L
parameters and K exogenous variables with L>K.
In particular my model looks kind of like this: Y = b1*X^b2 + e, and so I am trying to
estimate the extra b2 coefficients that don't usually appear in a regression.
From what I am reading, to run nonlinear GMM I can use the K exogenous variables to
construct the orthogonality conditions, but what should I use for the extra b2 coefficients?
Just some more possible IVs (like lags) of the exogenous variables?
I agree that by adding more IVs you will get a more efficient estimation, but isn't it only the
case when you believe the IVs are truly uncorrelated with the error term?
So by adding more "instruments" you are more or less imposing more and more restrictive
assumptions about the model (which might not actually be true).
I am asking because I have not found sources comparing nonlinear GMM/IV to nonlinear
least squares. If there is no heteroscedasticity or serial correlation, which is more efficient
(gives tighter estimates)?
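A small simulation can put the two estimators in the question side by side. The sketch below is an illustration only, not part of the original exchange: it assumes E[e|X] = 0, so that any function of X is a valid instrument, and the instrument set (1, X, log X), the identity weighting matrix, the parameter values, and all variable names are assumptions chosen for the example.

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(1)
n = 2000
x = rng.uniform(1.0, 5.0, n)                 # exogenous regressor, kept positive
b_true = np.array([2.0, 0.7])                # hypothetical true (b1, b2)
y = b_true[0] * x ** b_true[1] + rng.normal(0.0, 0.5, n)

def resid(b):
    """Residuals e(b) = y - b1 * x^b2."""
    return y - b[0] * x ** b[1]

# Nonlinear least squares: minimize the sum of squared residuals.
b_nls = least_squares(resid, x0=[1.0, 1.0]).x

# GMM: sample moments g(b) = Z'e(b)/n, with Z built from functions of x.
Z = np.column_stack([np.ones(n), x, np.log(x)])

def moments(b):
    return Z.T @ resid(b) / n

# least_squares minimizes g(b)'g(b), i.e. GMM with an identity weighting matrix.
b_gmm = least_squares(moments, x0=[1.0, 1.0]).x

print("NLS:", b_nls)
print("GMM:", b_gmm)
```

Under these assumptions both estimators are consistent. The three moment conditions for two parameters show how extra functions of the exogenous variable can supply the additional orthogonality conditions.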
Linear model
2 step
ML Murphy & Topel
Binary choice application
Other models
Applications
y*=wage-reservation wage
d=labor force participation
Inefficient
Simple: exists in current software
Simple to understand and widely used
Efficient
Simple: exists in current software
Not so simple to understand: widely misunderstood
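Heckman's Model
$y_i^* = x_i'\beta + \varepsilon_i$
$d_i^* = z_i'\gamma + u_i$; $d_i = 1[d_i^* > 0]$ (probit)
$y_i = y_i^*$ if $d_i = 1$; not observed otherwise
$[\varepsilon_i, u_i] \sim$ Bivariate Normal$[0, 0, \sigma^2, \rho, 1]$
$E[y_i^* \mid x_i, d_i = 1] = x_i'\beta + E[\varepsilon_i \mid x_i, d_i = 1]$
$= x_i'\beta + E[\varepsilon_i \mid x_i, u_i > -z_i'\gamma]$
$= x_i'\beta + \rho\sigma\,\dfrac{\phi(z_i'\gamma)}{\Phi(z_i'\gamma)}$
$= x_i'\beta + \theta\lambda_i$, where $\theta = \rho\sigma$ and $\lambda_i = \phi(z_i'\gamma)/\Phi(z_i'\gamma)$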
Least squares is biased and inconsistent again. Left out variable
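Step 1: Estimation of $\gamma$ by probit. Now compute $\hat\lambda_i = \phi(z_i'\hat\gamma)/\Phi(z_i'\hat\gamma)$.
Step 2: Estimate the regression model with estimated regressor
$y_i^* = x_i'\beta + \varepsilon_i$
$y_i = y_i^*$ if $d_i = 1$; not observed otherwise
$E[y_i^* \mid x_i, d_i = 1] = x_i'\beta + E[\varepsilon_i \mid x_i, d_i = 1] = x_i'\beta + \theta\lambda_i$ (the "LAMBDA")
Linearly regress $y_i$ on $x_i$, $\hat\lambda_i$.
Step 2a: Fix standard errors (Murphy and Topel). Estimate
$\rho$ and $\sigma$ using $\hat\theta$ and $e'e/n$.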
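The two steps can be sketched on simulated data as follows. This is a minimal illustration under stated assumptions, not the course's software: the data generating values, the function names `probit_mle` and `heckman_two_step`, and the use of a hand-written probit likelihood are all hypothetical choices. The variance estimate uses the standard correction $\hat\sigma^2 = e'e/n + \bar\delta\,\hat\theta^2$ with $\delta_i = \lambda_i(\lambda_i + z_i'\gamma)$. The Murphy and Topel correction of the standard errors (Step 2a) is not implemented here.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def probit_mle(Z, d):
    """Maximum likelihood probit coefficients for binary d on Z."""
    q = 2.0 * d - 1.0                          # +1 if d = 1, -1 if d = 0
    nll = lambda g: -np.sum(norm.logcdf(q * (Z @ g)))
    return minimize(nll, np.zeros(Z.shape[1]), method="BFGS").x

def heckman_two_step(y, X, Z, d):
    gamma_hat = probit_mle(Z, d)               # Step 1: probit on the full sample
    index = Z @ gamma_hat
    lam = norm.pdf(index) / norm.cdf(index)    # estimated inverse Mills ratio
    s = d == 1                                 # selected observations only
    W = np.column_stack([X[s], lam[s]])        # Step 2: lambda_hat as extra regressor
    coef = np.linalg.lstsq(W, y[s], rcond=None)[0]
    e = y[s] - W @ coef
    theta_hat = coef[-1]
    delta_bar = np.mean(lam[s] * (lam[s] + index[s]))
    sigma_hat = np.sqrt(e @ e / s.sum() + delta_bar * theta_hat ** 2)
    return gamma_hat, coef[:-1], theta_hat, sigma_hat, theta_hat / sigma_hat

# Simulated data; all parameter values are assumptions for the illustration.
rng = np.random.default_rng(0)
n, rho, sigma = 5000, 0.5, 1.0
x1, z1 = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack([np.ones(n), x1])
Z = np.column_stack([np.ones(n), x1, z1])      # z1 is excluded from the regression
u = rng.normal(size=n)
eps = sigma * (rho * u + np.sqrt(1 - rho ** 2) * rng.normal(size=n))
d = (Z @ np.array([0.2, 0.5, 1.0]) + u > 0).astype(float)
y = X @ np.array([1.0, 0.5]) + eps             # only used where d == 1

gamma_hat, beta_hat, theta_hat, sigma_hat, rho_hat = heckman_two_step(y, X, Z, d)
print(beta_hat, theta_hat, sigma_hat, rho_hat)
```

The excluded variable `z1` in the selection equation is a design choice of the example; it keeps $\hat\lambda_i$ from being nearly collinear with $x_i$.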
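FIML Estimation
$\log L = \sum_{d_i=0} \log \Phi(-z_i'\gamma) + \sum_{d_i=1} \log\left[\dfrac{\exp\left(-\frac{(y_i - x_i'\beta)^2}{2\sigma^2}\right)}{\sigma\sqrt{2\pi}}\;\Phi\left(\dfrac{z_i'\gamma + (\rho/\sigma)(y_i - x_i'\beta)}{\sqrt{1-\rho^2}}\right)\right]$
Let $\eta = 1/\sigma$, $\delta = -\beta/\sigma$, $\tau = \rho/\sqrt{1-\rho^2}$
$\log L = \sum_{d_i=0} \log \Phi(-z_i'\gamma) + \sum_{d_i=1} \log\left[\dfrac{\eta}{\sqrt{2\pi}}\exp\left(-\tfrac{1}{2}(\eta y_i + x_i'\delta)^2\right)\Phi\left(\sqrt{1+\tau^2}\,z_i'\gamma + \tau(\eta y_i + x_i'\delta)\right)\right]$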
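The reparameterization can be checked numerically. The sketch below assumes the transformation $\eta = 1/\sigma$, $\delta = -\beta/\sigma$, $\tau = \rho/\sqrt{1-\rho^2}$ (the symbol names are assumptions made here for the illustration) and evaluates the selection log likelihood both ways on arbitrary simulated data. The two values should agree, since $\eta y_i + x_i'\delta = (y_i - x_i'\beta)/\sigma$ and $\sqrt{1+\tau^2} = 1/\sqrt{1-\rho^2}$.

```python
import numpy as np
from scipy.stats import norm

def loglik_original(beta, sigma, rho, gamma, y, X, Z, d):
    """Selection log likelihood in terms of (beta, sigma, rho, gamma)."""
    a = Z @ gamma
    e = y - X @ beta
    ll0 = norm.logcdf(-a)                                   # d = 0 terms
    ll1 = (norm.logpdf(e / sigma) - np.log(sigma)           # d = 1 terms
           + norm.logcdf((a + rho * e / sigma) / np.sqrt(1.0 - rho ** 2)))
    return np.sum(np.where(d == 1, ll1, ll0))

def loglik_reparam(delta, eta, tau, gamma, y, X, Z, d):
    """Same log likelihood in terms of (delta, eta, tau, gamma)."""
    a = Z @ gamma
    r = eta * y + X @ delta                                 # equals (y - x'beta)/sigma
    ll0 = norm.logcdf(-a)
    ll1 = (np.log(eta) + norm.logpdf(r)
           + norm.logcdf(np.sqrt(1.0 + tau ** 2) * a + tau * r))
    return np.sum(np.where(d == 1, ll1, ll0))

# Arbitrary data and parameter values, used only to compare the two forms.
rng = np.random.default_rng(2)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
Z = np.column_stack([np.ones(n), rng.normal(size=n)])
d = (rng.uniform(size=n) < 0.6).astype(float)
y = rng.normal(size=n)
beta, sigma, rho = np.array([1.0, 0.5]), 1.3, 0.4
gamma = np.array([0.2, 0.7])

ll_a = loglik_original(beta, sigma, rho, gamma, y, X, Z, d)
ll_b = loglik_reparam(-beta / sigma, 1.0 / sigma, rho / np.sqrt(1 - rho ** 2),
                      gamma, y, X, Z, d)
print(ll_a, ll_b)
```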
Classic Application
A (my) specification
N = 753
N1 = 428
LFP = f(age, age², family income, education, kids)
Wage = g(experience, experience², education, city)
Selection Equation
+---------------------------------------------+
| Binomial Probit Model                       |
| Dependent variable                 LFP      |
| Number of observations             753      |
| Log likelihood function       -490.8478     |
+---------------------------------------------+
+--------+--------------+----------------+--------+--------+----------+
|Variable| Coefficient  | Standard Error |b/St.Er.|P[|Z|>z]| Mean of X|
+--------+--------------+----------------+--------+--------+----------+
---------+Index function for probability
 Constant|   -4.15680692      1.40208596    -2.965   .0030
 AGE     |     .18539510       .06596666     2.810   .0049  42.5378486
 AGESQ   |    -.00242590       .00077354    -3.136   .0017  1874.54847
 FAMINC  |   .458045D-05     .420642D-05     1.089   .2762  23080.5950
 WE      |     .09818228       .02298412     4.272   .0000  12.2868526
 KIDS    |    -.44898674       .13091150    -3.430   .0006   .69588313
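$E[y_i^* \mid x_i, d_i = 1] = x_i'\beta + \delta + E[\varepsilon_i \mid x_i, d_i = 1]$
$= x_i'\beta + \delta + \theta\,\dfrac{\phi(z_i'\gamma)}{\Phi(z_i'\gamma)}$
$= x_i'\beta + \delta + \theta\lambda_i$
$E[y_i^* \mid x_i, d_i = 0] = x_i'\beta + E[\varepsilon_i \mid x_i, d_i = 0]$
$= x_i'\beta + \theta\left[\dfrac{-\phi(z_i'\gamma)}{1 - \Phi(z_i'\gamma)}\right]$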
Sample Selection
An approach modeled on Heckman's model
Regression Equation:
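$\mathrm{Prob}[y = j \mid x, u] = P(\lambda)$; $\lambda = \exp(x'\beta + \theta u)$
Selection Equation:
$d = 1[z'\gamma + \varepsilon > 0]$
(The usual probit)
$[u, \varepsilon] \sim N[0, 0, 1, 1, \rho]$ (Var[u] is absorbed in $\theta$)
Estimation:
Nonlinear Least Squares: [Terza (1998, see cite in text).]
$E[y \mid x, d = 1] = \exp(x'\beta + \theta^2/2)\,\dfrac{\Phi(z'\gamma + \rho\theta)}{\Phi(z'\gamma)}$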
FIML using Hermite quadrature: [Greene (Stern wp, 97-02, 1997)]
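$\log L = \sum_i \log \displaystyle\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \left\{\prod_{d_{it}=0} \Phi\left[-(z_{it}'\gamma + \bar z_i'\delta + v_i)\right]\right\}$
$\qquad \times \left\{\prod_{d_{it}=1} \Phi\left[\dfrac{z_{it}'\gamma + \bar z_i'\delta + v_i + (\rho/\sigma)\varepsilon_{it}}{\sqrt{1-\rho^2}}\right] \dfrac{1}{\sigma}\,\phi\left(\dfrac{\varepsilon_{it}}{\sigma}\right)\right\} \phi_2(v_i, w_i)\,dv_i\,dw_i$
$\varepsilon_{it} = y_{it} - x_{it}'\beta - \bar x_i'\pi - w_i$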
Practical Complications
The bivariate normal integration is actually the
product of two univariate normals, because in the
specification above, $v_i$ and $w_i$ are assumed to be
uncorrelated. Vella notes, however, "... given the
computational demands of estimating by maximum
likelihood induced by the requirement to evaluate
multiple integrals, we consider the applicability of
available simple, or two step procedures."
Simulation
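The first line in the log likelihood is of the form
$E_v[\prod_{d=0}\Phi(\cdot)]$ and the second line is of the form
$E_w[E_v[\Phi(\cdot)\phi(\cdot)/\sigma]]$. Using simulation instead, the
simulated likelihood is
$L_i^S = \dfrac{1}{R}\sum_{r=1}^{R} \left\{\prod_{d_{it}=0} \Phi\left[-(z_{it}'\gamma + \bar z_i'\delta + v_{i,r})\right]\right\}\left\{\prod_{d_{it}=1} \Phi\left[\dfrac{z_{it}'\gamma + \bar z_i'\delta + v_{i,r} + (\rho/\sigma)\varepsilon_{it,r}}{\sqrt{1-\rho^2}}\right]\dfrac{1}{\sigma}\,\phi\left(\dfrac{\varepsilon_{it,r}}{\sigma}\right)\right\}$
$\varepsilon_{it,r} = y_{it} - x_{it}'\beta - \bar x_i'\pi - w_{i,r}$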
Correlated Effects
Suppose that $w_i$ and $v_i$ are bivariate standard normal with
correlation $\rho_{vw}$. We can project $w_i$ on $v_i$ and write
$w_i = \rho_{vw} v_i + (1 - \rho_{vw}^2)^{1/2} h_i$
where $h_i$ has a standard normal distribution. To allow
the correlation, we now simply substitute this expression
for $w_i$ in the simulated (or original) log likelihood, and
add $\rho_{vw}$ to the list of parameters to be estimated. The
simulation is then over still independent normal variates,
$v_i$ and $h_i$.
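The projection can be checked with a few lines of simulation. This is a sketch for illustration; the correlation value 0.6, the number of draws, and the variable names are assumptions, not values from the lecture.

```python
import numpy as np

rng = np.random.default_rng(42)
R = 200_000
rho_vw = 0.6                                  # hypothetical correlation

v = rng.standard_normal(R)                    # independent standard normal draws
h = rng.standard_normal(R)

# Build w from the projection on v; w stays standard normal by construction.
w = rho_vw * v + np.sqrt(1.0 - rho_vw ** 2) * h

corr = np.corrcoef(v, w)[0, 1]
print("Var[w] ~", w.var(), " Corr[v, w] ~", corr)
```

Because only `v` and `h` are drawn, the same independent draws can be reused while `rho_vw` changes across iterations of the optimizer.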
Conditional Means
A Feasible Estimator
Estimation
Kyriazidou - Semiparametrics
Assume 2 periods
Estimate selection equation by FE logit
Use first differences and weighted least squares:
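$\hat\beta = \left[\sum_{i=1}^{N} d_{i1}d_{i2}\,\Delta x_i\,\Delta x_i'\,\hat\psi_i\right]^{-1}\left[\sum_{i=1}^{N} d_{i1}d_{i2}\,\Delta x_i\,\Delta y_i\,\hat\psi_i\right]$
$\hat\psi_i = \dfrac{1}{h}\,K\left(\dfrac{\Delta w_i'\hat\gamma}{h}\right)$, $K$ = kernel function.
Use with longer panels - any pairwise differences
Extensions based on pairwise differences by Rochina-Barrachina and Dustmann/Rochina-Barrachina (1999)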
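The kernel-weighted first-difference estimator can be sketched as below. This is an illustration under simplifying assumptions, not Kyriazidou's full procedure: the selection coefficient is treated as known rather than estimated by FE logit in a first step, a standard normal kernel and a fixed bandwidth `h = 0.3` are assumed, and the simulated design and the function name `kyriazidou_beta` are hypothetical.

```python
import numpy as np

def kyriazidou_beta(dy, dX, d_index, d1, d2, h):
    """Kernel-weighted first-difference least squares.

    dy, dX  : first differences of y and x (dX has shape n x k)
    d_index : difference in selection indices, (w_i1 - w_i2)'gamma
    d1, d2  : selection indicators for the two periods
    h       : bandwidth
    """
    k = np.exp(-0.5 * (d_index / h) ** 2) / np.sqrt(2.0 * np.pi)  # normal kernel
    psi = d1 * d2 * k / h               # weight is zero unless observed in both periods
    A = (dX * psi[:, None]).T @ dX
    b = (dX * psi[:, None]).T @ dy
    return np.linalg.solve(A, b)

# Simulated two-period panel with fixed effects and selection.
rng = np.random.default_rng(7)
n, beta = 20000, 1.5
alpha = rng.normal(size=n)              # fixed effect in the outcome equation
eta = rng.normal(size=n)                # fixed effect in the selection equation
w = rng.normal(size=(n, 2))
u = rng.normal(size=(n, 2))
x = w + alpha[:, None] + rng.normal(size=(n, 2))
eps = 0.5 * u + rng.normal(size=(n, 2)) # correlated with the selection error u
d = (w + eta[:, None] + u > 0).astype(float)
y = beta * x + alpha[:, None] + eps

dy = y[:, 0] - y[:, 1]
dX = (x[:, 0] - x[:, 1])[:, None]
d_index = w[:, 0] - w[:, 1]             # gamma treated as known (= 1) here

beta_hat = kyriazidou_beta(dy, dX, d_index, d[:, 0], d[:, 1], h=0.3)
print(beta_hat)
```

The weights concentrate on individuals whose selection index barely changes between periods, which is where the selection terms difference out along with the fixed effects.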
Bias Corrections
Postscript
Sample Selection
Boris Bravo-Ureta
University of Connecticut
Daniel Solis
University of Miami
William Greene
New York University
[Diagram: MARENA; Training & Financing; Natural, Human & Social Capital; Off-Farm Income; More Production and Productivity; More Farm Income; Sustainability]
Component II - Module 3
Component II - Module 3 focused on promoting investments in sustainable production
systems with a budget of US $7.6 million (Bravo-Ureta, 2009).
The major activities undertaken with beneficiaries were training in business
management and sustainable farming practices, and the provision of funds to
co-finance investment activities through local rural savings associations (cajas rurales).
Conclusions
Methods
A matched group of beneficiaries and control
farmers is determined using Propensity Score
Matching techniques to mitigate biases that
would stem from selection on observed
variables.
In addition, we deal with possible self-selection
arising from unobserved variables using a
selectivity correction model for
stochastic frontiers introduced by Greene (2010).
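$f(y_i \mid x_i) = \displaystyle\int_{|U_i|} \dfrac{\exp\left[-\tfrac{1}{2}(y_i - \alpha - x_i'\beta + \sigma_u|U_i|)^2/\sigma_v^2\right]}{\sigma_v\sqrt{2\pi}}\; p(|U_i|)\,d|U_i|$
$p(|U_i|) = \dfrac{2\exp\left[-\tfrac{1}{2}|U_i|^2\right]}{\sqrt{2\pi}}$, $|U_i| \ge 0$. (Half normal)
$f(y_i \mid x_i) \approx \dfrac{1}{R}\sum_{r=1}^{R} \dfrac{\exp\left[-\tfrac{1}{2}(y_i - \alpha - x_i'\beta + \sigma_u|U_{ir}|)^2/\sigma_v^2\right]}{\sigma_v\sqrt{2\pi}}$
$\log L_S(\alpha, \beta, \sigma_u, \sigma_v) = \sum_{i=1}^{N} \log \dfrac{1}{R}\sum_{r=1}^{R} \dfrac{\exp\left[-\tfrac{1}{2}(y_i - \alpha - x_i'\beta + \sigma_u|U_{ir}|)^2/\sigma_v^2\right]}{\sigma_v\sqrt{2\pi}}$
$f(y_i \mid x_i, d_i, z_i, |U_i|) = d_i\left[\dfrac{\exp\left(-\tfrac{1}{2}(y_i - \alpha - x_i'\beta + \sigma_u|U_i|)^2/\sigma_v^2\right)}{\sigma_v\sqrt{2\pi}}\;\Phi\left(\dfrac{\rho(y_i - \alpha - x_i'\beta + \sigma_u|U_i|)/\sigma_v + z_i'\gamma}{\sqrt{1-\rho^2}}\right)\right] + (1 - d_i)\,\Phi(-z_i'\gamma)$
$f(y_i \mid x_i, d_i, z_i) = \displaystyle\int_{|U_i|} f(y_i \mid x_i, d_i, z_i, |U_i|)\, f(|U_i|)\,d|U_i|$
$\log L_S(\alpha, \beta, \sigma_u, \sigma_v, \gamma, \rho) = \sum_{i=1}^{N} \log \dfrac{1}{R}\sum_{r=1}^{R}\left\{ d_i\left[\dfrac{\exp\left(-\tfrac{1}{2}(y_i - \alpha - x_i'\beta + \sigma_u|U_{ir}|)^2/\sigma_v^2\right)}{\sigma_v\sqrt{2\pi}}\;\Phi\left(\dfrac{\rho(y_i - \alpha - x_i'\beta + \sigma_u|U_{ir}|)/\sigma_v + z_i'\gamma}{\sqrt{1-\rho^2}}\right)\right] + (1 - d_i)\,\Phi(-z_i'\gamma)\right\}$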
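JLMS Estimator of $u_i$
$\hat f_{ir} = \dfrac{\exp\left(-\tfrac{1}{2}(y_i - \hat\alpha - x_i'\hat\beta + \hat\sigma_u|U_{ir}|)^2/\hat\sigma_v^2\right)}{\hat\sigma_v\sqrt{2\pi}}\;\Phi\left(\dfrac{\hat\rho(y_i - \hat\alpha - x_i'\hat\beta + \hat\sigma_u|U_{ir}|)/\hat\sigma_v + a_i}{\sqrt{1-\hat\rho^2}}\right)$
$\hat A_i = \dfrac{1}{R}\sum_{r=1}^{R} (\hat\sigma_u|U_{ir}|)\,\hat f_{ir}$, $\hat B_i = \dfrac{1}{R}\sum_{r=1}^{R} \hat f_{ir}$
$\hat u_i = \dfrac{\hat A_i}{\hat B_i}$ = Estimator of $E[u_i \mid \varepsilon_i]$
$= \sum_{r=1}^{R} \hat g_{ir}\,\hat\sigma_u|U_{ir}|$ where $\hat g_{ir} = \dfrac{\hat f_{ir}}{\sum_{r=1}^{R}\hat f_{ir}}$, $\sum_{r=1}^{R}\hat g_{ir} = 1$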
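The simulated estimator of $E[u_i \mid \varepsilon_i]$ is a weighted average of the draws $\sigma_u|U_{ir}|$. A minimal sketch for one observation follows. The function name `jlms_simulated`, the parameter values, and the selection index `a` are assumptions for the illustration. With $\rho = 0$ the selection term is constant across draws and cancels from the weights, so the result can be compared with the usual closed-form JLMS expression for the normal-half normal model, which the sketch does as a check.

```python
import numpy as np
from scipy.stats import norm

def jlms_simulated(eps, sigma_u, sigma_v, rho, a, absU):
    """Simulated estimator of E[u_i | eps_i] for a single observation.

    eps  : composed residual y_i - x_i'beta (= v_i - u_i)
    a    : selection index for observation i
    absU : array of |U_ir| draws (absolute values of standard normals)
    """
    r = (eps + sigma_u * absU) / sigma_v
    f = norm.pdf(r) / sigma_v * norm.cdf((rho * r + a) / np.sqrt(1.0 - rho ** 2))
    g = f / f.sum()                         # weights g_ir, which sum to one
    return np.sum(g * sigma_u * absU), g

rng = np.random.default_rng(3)
absU = np.abs(rng.standard_normal(200_000))
eps, sigma_u, sigma_v = -0.4, 0.8, 0.5      # hypothetical values

u_hat, g = jlms_simulated(eps, sigma_u, sigma_v, rho=0.0, a=0.3, absU=absU)

# Closed-form JLMS estimator without selection, for comparison.
s2 = sigma_u ** 2 + sigma_v ** 2
mu_star = -eps * sigma_u ** 2 / s2
s_star = sigma_u * sigma_v / np.sqrt(s2)
u_closed = mu_star + s_star * norm.pdf(mu_star / s_star) / norm.cdf(mu_star / s_star)
print(u_hat, u_closed)
```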
Variables Used
in the Analysis
Production
Participation
1 R
d 1 log r 1
i
R
T
t 0
v 2
Attrition
QOL Study
$\mathrm{Corr}[v_{i2} + \delta(\varepsilon_{i2} + u_i),\ \varepsilon_{i2} + u_i]$
Selection Model
Reduced form probit model for second period observation equation
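$z_{i2}^* = x_{i2}'(\delta\beta + \theta) + w_i'\alpha + (\delta\varepsilon_{i2} + \delta u_i + v_{i2})$
$= r_{i2}'\gamma + h_{i2}$
$z_{i2} = 1(z_{i2}^* > 0)$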
(ri2 )
(ri2 )
Maximum Likelihood
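$\log L_i = -\log\sqrt{2\pi} - \log\sigma - \dfrac{(y_{i1} - x_{i1}'\beta)^2}{2\sigma^2}$
$\quad + z_{i2}\left[-\log\sqrt{2\pi} - \log\sigma - \tfrac{1}{2}\log(1-\rho_{12}^2) - \dfrac{\left(y_{i2} - x_{i2}'\beta - \rho_{12}(y_{i1} - x_{i1}'\beta)\right)^2}{2\sigma^2(1-\rho_{12}^2)} + \log\Phi\left(\dfrac{r_{i2}'\gamma + (\rho_{23}/\sigma)(y_{i2} - x_{i2}'\beta)}{\sqrt{1-\rho_{23}^2}}\right)\right]$
$\quad + (1 - z_{i2})\log\Phi\left(-\dfrac{r_{i2}'\gamma + (\rho_{12}\rho_{23}/\sigma)(y_{i1} - x_{i1}'\beta)}{\sqrt{1-\rho_{12}^2\rho_{23}^2}}\right)$
where $\rho_{12} = \mathrm{Corr}[\varepsilon_{i1} + u_i,\ \varepsilon_{i2} + u_i]$ and $\rho_{23} = \mathrm{Corr}[h_{i2},\ \varepsilon_{i2} + u_i]$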
A Model of Attrition
Attrition Model
The main equation
$y_{i,t} = \beta_0 + x_{i,t}'\beta + \tilde\alpha_i + \varepsilon_{i,t}$, Random effects consumption function
$\tilde\alpha_i = \bar x_i'\pi + u_i$,
Mundlak device; $u_i$ uncorrelated with $X_i$
$y_{i,t} = \beta_0 + x_{i,t}'\beta + \bar x_i'\pi + u_i + \varepsilon_{i,t}$, Reduced form random effects model
The selection mechanism
$a_{it} = 1$[individual i asked to participate in period t]. Purely exogenous.
$a_{it}$ may depend on observables, but does not depend on unobservables
$r_{it} = 1$[individual i chooses to participate if asked]. Endogenous.
$r_{it}$ is the endogenous participation dummy variable
$a_{it} = 0 \Rightarrow r_{it} = 0$
$a_{it} = 1 \Rightarrow$ the selection mechanism operates
Selection Equation
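$r_{it} = 1[\gamma_0 + x_{i,t}'\gamma + \bar x_i'\theta + z_{i,t}'\delta + v_i + w_{i,t} > 0]$, all observed if $a_{it} = 1$
State dependence: $z$ may include $r_{i,t-1}$
Latent persistent unobserved heterogeneity: $\sigma_v^2 > 0$.
"Selection" arises if $\mathrm{Cov}[\varepsilon_{i,t}, w_{i,t}] \ne 0$ or $\mathrm{Cov}[u_i, v_i] \ne 0$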
Loss of efficiency
One can no longer distinguish between state
dependence and unobserved heterogeneity.
Ti
t 1
Pit,j f( j )d j
Attrition
Inverse probability weight: $W_{it} = d_{it}/P_{it}$
Weighted log likelihood
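A small simulation shows what the weights $W_{it} = d_{it}/P_{it}$ do. The sketch assumes attrition depends only on an observed variable `x` and that the retention probability is known; in practice it would be estimated. The logistic retention model, the parameter values, and the variable names are assumptions for the illustration. The weighted mean used here is the maximizer of a weighted normal log likelihood for a constant-only model, so it serves as the simplest case of a weighted log likelihood.

```python
import numpy as np

rng = np.random.default_rng(11)
n = 100_000
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)           # population mean of y is 1.0

# Retention probability depends on the observable x (assumed known here).
p = 1.0 / (1.0 + np.exp(-(0.5 + 1.0 * x)))
d = (rng.uniform(size=n) < p).astype(float)      # d = 1 if still in the sample

W = d / p                                        # inverse probability weights

naive_mean = y[d == 1].mean()                    # ignores attrition
ipw_mean = np.sum(W * y) / np.sum(W)             # weighted estimator

print("naive:", naive_mean, " weighted:", ipw_mean)
```

The unweighted mean over retained observations overstates the population mean because high-`x` individuals are more likely to stay; the weighted version corrects for this under the stated assumption that attrition is driven by observables only.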
Spatial Autocorrelation in a
Sample Selection Model
Flores-Lagunes, A. and Schnier, K., Sample selection and Spatial Dependence, Journal of Applied
Econometrics, 27, 2, 2012, pp. 173-204.
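$y_{i1}^* = \alpha_0 + x_{i1}'\alpha + u_{i1}$
$u_{i1} = \delta\sum_{j\ne i} c_{ij}u_{j1} + \varepsilon_{i1}$
$y_{i2}^* = \beta_0 + x_{i2}'\beta + u_{i2}$
$u_{i2} = \gamma\sum_{j\ne i} c_{ij}u_{j2} + \varepsilon_{i2}$
$\begin{pmatrix}\varepsilon_{i1}\\ \varepsilon_{i2}\end{pmatrix} \sim N\left[\begin{pmatrix}0\\ 0\end{pmatrix}, \begin{pmatrix}\sigma_1^2 & \sigma_{12}\\ \sigma_{12} & \sigma_2^2\end{pmatrix}\right]$, (?? $\sigma_1^2 = 1$ ??)
Observation Mechanism
$y_{i1} = 1[y_{i1}^* > 0]$ Probit Model
$y_{i2} = y_{i2}^*$ if $y_{i1} = 1$, unobserved otherwise.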
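$u_1 = \delta C u_1 + \varepsilon_1$
$C$ = Spatial weight matrix, $C_{ii} = 0$.
$u_1 = [I - \delta C]^{-1}\varepsilon_1 = \omega^{(1)}\varepsilon_1$, likewise for $u_2$
$y_{i1}^* = \alpha_0 + x_{i1}'\alpha + \sum_{j=1}^{N}\omega_{ij}^{(1)}(\delta)\,\varepsilon_{j1}$, $\mathrm{Var}[u_{i1}] = \sigma_1^2\sum_{j=1}^{N}\left(\omega_{ij}^{(1)}\right)^2$
$y_{i2}^* = \beta_0 + x_{i2}'\beta + \sum_{j=1}^{N}\omega_{ij}^{(2)}(\gamma)\,\varepsilon_{j2}$, $\mathrm{Var}[u_{i2}] = \sigma_2^2\sum_{j=1}^{N}\left(\omega_{ij}^{(2)}\right)^2$
$\mathrm{Cov}[u_{i1}, u_{i2}] = \sigma_{12}\sum_{j=1}^{N}\omega_{ij}^{(1)}\omega_{ij}^{(2)}$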
Spatial Weights
$c_{ij} = \dfrac{1}{d_{ij}^2}$, $d_{ij}$ = Euclidean distance
Band of 7 neighbors is used
Row standardized.
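A weight matrix of this kind can be built in a few lines. The sketch below is an illustration under assumptions: it reads "band of 7 neighbors" as the 7 nearest neighbors of each unit, uses random coordinates, and picks a hypothetical spatial parameter of 0.4; the function name `spatial_weights` is also an assumption. It then forms $[I - \delta C]^{-1}$ and the row sums of squares that scale the error variance.

```python
import numpy as np

def spatial_weights(coords, k=7):
    """Row-standardized inverse squared distance weights on the k nearest neighbors."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))       # Euclidean distances d_ij
    np.fill_diagonal(dist, np.inf)                # a unit is never its own neighbor
    C = np.zeros_like(dist)
    nearest = np.argsort(dist, axis=1)[:, :k]     # indices of the k closest units
    rows = np.arange(len(coords))[:, None]
    C[rows, nearest] = 1.0 / dist[rows, nearest] ** 2
    return C / C.sum(axis=1, keepdims=True)       # row standardization

rng = np.random.default_rng(5)
coords = rng.uniform(0.0, 10.0, size=(50, 2))     # hypothetical locations
C = spatial_weights(coords)

delta = 0.4                                       # hypothetical spatial parameter
Omega = np.linalg.inv(np.eye(50) - delta * C)     # [I - delta*C]^{-1}
var_scale = (Omega ** 2).sum(axis=1)              # sum_j omega_ij^2 for each i
print(C.sum(axis=1)[:3], var_scale[:3])
```

Because the rows of `C` sum to one and the parameter is below one in absolute value, the inverse exists. The row sums of squares differ across units, which is why the spatial model implies heteroscedastic errors.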
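$E[u_{i2} \mid y_{i1} = 1] = \dfrac{\sigma_{12}\sum_{j=1}^{N}\omega_{ij}^{(1)}\omega_{ij}^{(2)}}{\sqrt{\sigma_1^2\sum_{j=1}^{N}\left(\omega_{ij}^{(1)}\right)^2}} \times \dfrac{\phi(a_i)}{\Phi(a_i)}$, $\quad a_i = \dfrac{\alpha_0 + x_{i1}'\alpha}{\sqrt{\sigma_1^2\sum_{j=1}^{N}\left(\omega_{ij}^{(1)}\right)^2}}$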