ECON 2404 – Assignment 2

1 Estimating Production Function
Download the data "data-olley-pakes". There are 531 firms and 6 years of data.
Each row refers to one firm in one year. The variables are as follows: firm, year,
output, age, capital, labor, and investment. If a firm's values are zero in a given
year, the firm does not exist in that year: it has either exited or not yet
entered.

You can answer the questions using either STATA or MATLAB, although I
would suggest implementing the Olley and Pakes (OP) estimator in MATLAB
to get a better understanding of the method. If you are using STATA,
do not forget to change the zeros into dots so that STATA understands that the
variables are missing; otherwise it will treat the zeros as observations. Make
sure you put the data in logs before estimating the model.
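
For MATLAB users, the same preparation can be sketched as follows. This is only an illustration under my own assumptions: the column order follows the variable list above, a firm-year counts as absent when output, capital, and labor are all zero, and the tiny matrix and the names `vars`, `absent`, and `iv` are made up. Zero investment in an existing firm-year gives log(0) = -Inf, which is marked as missing here because the OP inversion needs positive investment.

```matlab
% Hypothetical layout: firm, year, output, age, capital, labor, investment.
% A small made-up matrix stands in for the real data file.
data = [1 1 10 3 5 2 1;
        1 2  0 0 0 0 0;    % firm 1 does not exist in year 2
        2 1 20 5 8 4 2;
        2 2 25 6 9 4 0];   % firm exists, but investment is zero

vars = data(:, 3:7);                        % output, age, capital, labor, investment

% Assumption: a firm-year is absent when output, capital and labor are all zero.
absent = all(vars(:, [1 3 4]) == 0, 2);
vars(absent, :) = NaN;                      % mark absent firm-years as missing

y  = log(vars(:, 1));                       % log output
k  = log(vars(:, 3));                       % log capital
l  = log(vars(:, 4));                       % log labor
iv = log(vars(:, 5));                       % log investment (-Inf where investment is 0)
iv(isinf(iv)) = NaN;                        % treat zero investment as missing
```

Rows with NaN can then be dropped with a logical index such as `~isnan(y)` before running any regression.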

1.1 Model
Assume the firms have a Cobb-Douglas production function (let's ignore the firm's
age in the exercise, but you can include it if you want to):
$$y_{it} = \beta_0 + \beta_l l_{it} + \beta_k k_{it} + \omega_{it} + \varepsilon_{it}, \tag{1}$$
where $y_{it}$ is the log of output; $l_{it}$ is the log of labor; $k_{it}$ is the log of capital; the
term $\omega_{it}$ represents "productivity shocks" that are observed or predictable by
the firms before making their input decisions at $t$; and $\varepsilon_{it}$ represents both i.i.d.
shocks to production that are not predicted by the firms and measurement
errors in the observed variables. The endogeneity problem in estimating (1)
comes from the correlation between the inputs and $\omega_{it}$.

1.2 Question

1. Estimate the production function using pooled OLS for both the unbalanced
and the balanced panel data. What do you find? Are the
estimates significant? Are they economically reasonable? Why would you
expect them to be biased?
2. Assume $\omega_{it} = \omega_i$, i.e., it is a time-invariant fixed effect. Estimate the
production function using the fixed-effect estimator. What do you find?
Are the estimates significant? Are they economically reasonable?
3. Assume $\omega_{it} = \rho\,\omega_{it-1} + \xi_{it}$, where $|\rho| < 1$ and $\xi_{it}$ is i.i.d. Add the
fixed effect $a_i$ to the unobservables. Estimate the production function using the
SYS GMM estimator proposed by Blundell and Bond. What do you find?
Are the estimates significant? Are they economically reasonable?
4. Return to the original model (1) and assume the model satisfies the OP
assumptions. Estimate the model using investment as a proxy for $\omega_{it}$.
(a) Estimate the first step of OP to get $\hat{\beta}_l$ and $\hat{\phi}_{it}$. Use a fourth order
polynomial series (just as OP did) for $\hat{\phi}_{it}$.
(b) Estimate the second step of OP ignoring the selection problem to get
$\hat{\beta}_k$. Use a fourth order polynomial to approximate the function $g(\cdot)$,
where
$$\omega_{it} = g(\omega_{it-1}) + \xi_{it},$$
and use the vector of instruments $(1, k_{it})'$.
(c) When compared with your previous results (OLS, FE, and SYS GMM),
does your new estimate from OP conform with the biases suggested
by the theoretical model?
(d) Calculate productivity for each firm as in Olley and Pakes
(1996), Section 5:
$$p_{it} = \exp\left(y_{it} - \hat{\beta}_l l_{it} - \hat{\beta}_k k_{it}\right).$$
Calculate aggregate productivity growth for the sample in each year
using output shares to aggregate firms ($s_{it} = y_{it} / \sum_{i=1}^{N_t} y_{it}$), and do
the same decomposition as they perform (equation 16). What do
you conclude in terms of reallocation of output shares versus plant-level growth?
(e) (EXTRA) You do not need to incorporate the selection problem into
OP, but you can if you want to. In this case, when you estimate
the survival probability function (i.e., the probability of staying in
the market) you may use a fourth order polynomial within a probit
model, $P_{it} = \Pr(\chi_{it+1} = 1 \mid i_{it}, k_{it})$. Then, in the last step (the
nonlinear least squares), also use a fourth order polynomial for $g(\cdot)$,
where now
$$\omega_{it} = g(\omega_{it-1}, P_{it}) + \xi_{it}.$$
Discuss your estimates compared with those obtained ignoring the
selection problem.
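
To make part (a) of question 4 concrete, here is a minimal MATLAB sketch of the first step. It is only one way to set it up, and it rests on my own assumptions: the data are made up and noise-free so the block runs by itself, and the names `iv`, `P`, `bl`, and `phi` are illustrative. The fourth order polynomial is taken to mean all terms $i_{it}^a k_{it}^b$ with $a + b \le 4$. With real data you would first drop the rows with missing values.

```matlab
% Made-up, noise-free stand-in data so the sketch runs on its own.
n  = 200;
t  = (1:n)';
k  = 1.0 + 0.5*sin(0.37*t);               % log capital (hypothetical)
iv = 0.5 + 0.4*cos(0.91*t);               % log investment (hypothetical)
l  = 0.8 + 0.3*sin(1.73*t + 0.5);         % log labor (hypothetical)
y  = 0.6*l + (0.2 + 0.3*k + 0.1*iv.*k);   % true beta_l = 0.6 by construction

% Fourth order polynomial in (iv, k): all terms iv^a * k^b with a + b <= 4.
P = [];
for a = 0:4
    for b = 0:(4 - a)
        P = [P, (iv.^a) .* (k.^b)];       % a = b = 0 gives the constant
    end
end

% First step: OLS of y on l and the polynomial terms.
X    = [l, P];
coef = X \ y;                             % least-squares solution
bl   = coef(1);                           % estimate of beta_l
phi  = P * coef(2:end);                   % estimate of phi_it (polynomial part of the fit)
```

Note that the constant and the capital coefficient are absorbed into `phi` at this stage, which is why $\hat{\beta}_k$ has to come from the second step.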
HINT 1: There are a couple of ways to handle the unbalanced panel in
MATLAB. One possibility is to first drop all missing data (all the lines with
zeros) and then estimate the model as usual. You need to be careful when
creating lagged variables. One possibility is to first shift the whole vector (e.g.,
create $y_{it-1}$ in MATLAB as "y_1 = [0; y(1:end-1)];" for a column vector y).
Then replace with zeros all lines in which year = 1 (so that the lagged variable
for a firm's first year is zero and not the previous firm's last observation).
Finally, drop the lines with zeros when estimating $\beta_k$ and $g(\cdot)$.
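
The shifting idea in HINT 1 can be sketched as below. This is a variation of my own rather than the exact recipe above: it marks invalid lags with NaN instead of zero (a log value can legitimately equal zero), and it checks that the previous row belongs to the same firm in the previous year, which also covers firms whose first observed year is not year 1. The small panel and the names `firm_1`, `year_1`, `valid`, and `keep` are made up, and the rows are assumed to be sorted by firm and then by year.

```matlab
% Made-up panel after dropping absent firm-years (sorted by firm, then year).
firm = [1; 1; 1; 2; 2; 3];
year = [1; 2; 3; 2; 3; 1];
y    = [2.0; 2.1; 2.3; 1.5; 1.6; 3.0];

% Shift every column down by one row.
y_1    = [NaN; y(1:end-1)];
firm_1 = [NaN; firm(1:end-1)];
year_1 = [NaN; year(1:end-1)];

% A shifted value is a valid lag only if the previous row is the
% same firm observed in the previous year.
valid = (firm_1 == firm) & (year_1 == year - 1);
y_1(~valid) = NaN;

keep = ~isnan(y_1);    % rows that can be used in the second step
```

Indexing every variable with `keep` (for example `y(keep)` and `y_1(keep)`) then gives aligned current and lagged observations.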
HINT 2: If you want to correct for selection, you need to estimate the
probit $P_{it} = \Pr(\chi_{it+1} = 1 \mid i_{it}, k_{it})$ before dropping the missing
variables in MATLAB. It may be easier to estimate $P_{it}$ in STATA, save the
results and then move to MATLAB.
OBS: Note that the usual standard errors in the second stage will be wrong
since they need correction for the error in the first stage estimators. This can
be done analytically (using Pakes and Olley (1995)), or it can be "bootstrapped".
Don't worry about the corrections needed.
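
Returning to part (d) of question 4, the aggregation and decomposition can be sketched for a single year as follows. As I read equation 16 of Olley and Pakes (1996), the share-weighted aggregate splits into the unweighted mean plus a share-productivity covariance term:

$$p_t = \sum_{i=1}^{N_t} s_{it}\, p_{it} = \bar{p}_t + \sum_{i=1}^{N_t} (s_{it} - \bar{s}_t)(p_{it} - \bar{p}_t).$$

The numbers and the names `Y`, `p_agg`, `p_bar`, and `cov_term` below are made up for illustration, and I am assuming the shares are computed from output in levels.

```matlab
% Made-up numbers for one year: output in levels and firm productivity p_it.
Y = [10; 30; 60];
p = [1.0; 1.2; 1.7];

s        = Y / sum(Y);                        % output shares, which sum to one
p_agg    = sum(s .* p);                       % share-weighted aggregate productivity
p_bar    = mean(p);                           % unweighted mean productivity
s_bar    = mean(s);                           % unweighted mean share (equals 1/N)
cov_term = sum((s - s_bar) .* (p - p_bar));   % share-productivity covariance term

% By construction, p_agg equals p_bar + cov_term up to rounding error.
gap = p_agg - (p_bar + cov_term);
```

Repeating this year by year and comparing the paths of `p_bar` and `cov_term` separates plant-level changes from the reallocation of output shares.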