ECON 2404 – Assignment 2

1 Estimating Production Function

Download the data “data-olley-pakes”. There are 531 firms and 6 years of data. Each row refers to one firm in one year. The variables are as follows: firm, year, output, age, capital, labor, and investment. If a firm's values are zero in a given year, the firm does not exist in that year: it has either already exited or not yet entered.

You can answer the questions using either STATA or MATLAB, although I suggest implementing the Olley and Pakes (OP) estimator in MATLAB to gain a better understanding of the method. If you are using STATA, do not forget to change the zeros into dots so that STATA understands that the variables are missing; otherwise it will treat the zeros as observations. Make sure you put the data in logs before estimating the model.
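The data preparation described above can be sketched in MATLAB as follows. This is only a minimal illustration under stated assumptions: the matrix `data` is a toy stand-in for the real file, its column order is assumed to match the variable list above, and the names `exists`, `X`, and `iv` are hypothetical.

```matlab
% Toy stand-in for the real file. Columns are ASSUMED to follow the order
% listed in the assignment: firm, year, output, age, capital, labor, investment.
data = [1 1 10 3 5 2 1;
        1 2 12 4 6 2 2;
        2 1  0 0 0 0 0;    % firm 2 has not entered yet in year 1
        2 2  8 1 4 3 1];

X = data(:, 3:7);          % output, age, capital, labor, investment
exists = data(:, 3) > 0;   % firm-years in which the firm is active
X(~exists, :) = NaN;       % mark non-existent firm-years as missing, not zero

% Take logs; NaN rows stay NaN. An active firm with zero investment would
% give -Inf here, which this sketch does not handle.
y  = log(X(:, 1));         % log output
k  = log(X(:, 3));         % log capital
l  = log(X(:, 4));         % log labor
iv = log(X(:, 5));         % log investment
```

Using NaN rather than 0 as the missing marker avoids confusing a missing value with a genuine log value of zero.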

1.1 Model

Assume the firms have a Cobb-Douglas production function (let's ignore firm age in the exercise, but you can include it if you want to):

$$y_{it} = \beta_0 + \beta_l l_{it} + \beta_k k_{it} + \omega_{it} + \varepsilon_{it}, \tag{1}$$

where $y_{it}$ is the log of output; $l_{it}$ is the log of labor; $k_{it}$ is the log of capital; the term $\omega_{it}$ represents “productivity shocks” that are observed or predictable by the firms before making their input decisions at $t$; and $\varepsilon_{it}$ represents both i.i.d. shocks to production that are not predicted by the firms and measurement errors in the observed variables. The endogeneity problem in estimating (1) comes from the correlation between the inputs and $\omega_{it}$.

1.2 Question

- Estimate the production function using pooled OLS for both the unbalanced and the balanced panel data. What do you find? Are the estimates significant? Are they economically reasonable? Why would you expect them to be biased?

- Assume $\omega_{it} = \omega_i$, i.e., it is a time-invariant fixed effect. Estimate the production function using the fixed-effect estimator. What do you find? Are the estimates significant? Are they economically reasonable?

- Assume $\omega_{it} = \rho\,\omega_{it-1} + \xi_{it}$, where $|\rho| < 1$ and $\xi_{it}$ is i.i.d. Add the fixed effect $a_i$ to the unobservables. Estimate the production function using the SYS GMM estimator proposed by Blundell and Bond. What do you find? Are the estimates significant? Are they economically reasonable?

- Return to the original model (1) and assume the model satisfies the OP assumptions. Estimate the model using investment as a proxy for $\omega_{it}$.

(a) Estimate the first step of OP to get $\hat{\beta}_l$ and $\hat{\phi}_{it}$. Use a fourth-order polynomial series (just as OP did) for $\hat{\phi}_{it}$.

(b) Estimate the second step of OP, ignoring the selection problem, to get $\hat{\beta}_k$. Use a fourth-order polynomial to approximate the function $g(\cdot)$, where $\omega_{it} = g(\omega_{it-1}) + \xi_{it}$, and use the vector of instruments $(1, k_{it})'$.

(c) When compared with your previous results (OLS, FE, and SYS GMM), does your new estimate from OP conform to the biases suggested by the theoretical model?

(d) Calculate productivity growth for each firm as in Olley and Pakes (1996), Section 5: $p_{it} = \exp(y_{it} - \hat{\beta}_l l_{it} - \hat{\beta}_k k_{it})$. Calculate aggregate productivity growth for the sample in each year using output shares to aggregate firms ($s_{it} = y_{it} / \sum_{i=1}^{N_t} y_{it}$), and do the same decomposition as they perform (equation 16). What do you conclude in terms of reallocation of output shares versus plant-level growth?

(e) (EXTRA) You do not need to incorporate the selection problem into OP, but you can if you want to. In this case, when you estimate the survival probability function (i.e., the probability of staying in the market) you may use a fourth-order polynomial within a probit model, $P_{it} = \Pr(\chi_{it+1} = 1 \mid i_{it}, k_{it})$. Then, in the last step (the nonlinear least squares), also use a fourth-order polynomial for $g(\cdot)$, where now $\omega_{it} = g(\omega_{it-1}, P_{it}) + \xi_{it}$. Discuss your estimates compared with those obtained ignoring the selection problem.
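For part (d), the aggregation and decomposition for a single year can be sketched as below. This is a hedged illustration, not the required solution: all numbers are made up, `bl_hat` and `bk_hat` are placeholders for your own OP estimates, and the shares are assumed here to be computed from output in levels (replace `Y` with whatever the share definition in (d) calls for). The decomposition is written as the unweighted mean plus a share-productivity covariance term, which is my reading of the form of OP's equation (16).

```matlab
% Hypothetical inputs for one year t (illustrative numbers only).
Y      = [10; 20; 30; 40];        % output used for shares (ASSUMED in levels)
y      = log(Y);                  % log output
l      = log([2; 3; 5; 6]);       % log labor
k      = log([4; 6; 9; 15]);      % log capital
bl_hat = 0.6;                     % placeholder for the OP labor estimate
bk_hat = 0.3;                     % placeholder for the OP capital estimate

p = exp(y - bl_hat*l - bk_hat*k); % firm-level productivity p_it
s = Y / sum(Y);                   % output shares s_it (sum to one)

p_agg  = sum(s .* p);             % share-weighted aggregate productivity
p_bar  = mean(p);                 % unweighted mean productivity
s_bar  = mean(s);                 % unweighted mean share (equals 1/N_t)
cov_sp = sum((s - s_bar) .* (p - p_bar));  % reallocation (covariance) term

% The decomposition is an identity: p_agg = p_bar + cov_sp.
gap = abs(p_agg - (p_bar + cov_sp));
```

Repeating this year by year and comparing the changes in `p_bar` and `cov_sp` separates plant-level productivity changes from the reallocation of output toward more productive firms.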

HINT 1: There are a couple of ways to handle the unbalanced panel in MATLAB. One possibility is to first drop all missing data (all the lines with zeros) and then estimate the model as usual. To create lagged variables you need to be careful. One possibility is to first shift the whole vector (e.g., create in MATLAB the $y_{it-1}$ as “y_1=[0; y(1:end-1)];”). Then, you replace with zeros all lines in which year = 1 (so that the lagged variable for a firm's first year is zero and not the previous firm's last observation). Finally, you drop the lines with zeros when estimating $\beta_k$ and $g(\cdot)$.
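The lag construction in HINT 1 can be sketched as follows. It is a minimal example on a toy panel, assuming the rows are sorted by firm and then by year and that every firm has a row for every year (with zeros where it does not exist); the variable names and values are illustrative only.

```matlab
% Toy panel sorted by firm, then year (illustrative values, not the real data).
firm = [1; 1; 1; 2; 2; 2];
year = [1; 2; 3; 1; 2; 3];
y    = [2.0; 2.1; 2.3; 0; 1.5; 1.7];   % 0 marks a firm-year that does not exist

y_1 = [0; y(1:end-1)];                 % shift the whole column down one row
y_1(year == 1) = 0;                    % a firm's first year has no lag

% Keep only rows where both the current and the lagged value exist.
keep = (y ~= 0) & (y_1 ~= 0);
y_used   = y(keep);
y_1_used = y_1(keep);
```

Without the `year == 1` step, row 4 (firm 2, year 1) would carry firm 1's last observation as its lag. With logged data, NaN is a safer missing marker than 0, since a log value can legitimately equal zero.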

HINT 2: If you want to correct for selection, you need to estimate the probit $P_{it} = \Pr(\chi_{it+1} = 1 \mid i_{it}, k_{it})$ before dropping the missing observations in MATLAB. It may be easier to estimate $P_{it}$ in STATA, save the results, and then move to MATLAB.

OBS: Note that the usual standard errors in the second stage will be wrong, since they need to be corrected for the estimation error in the first-stage estimators. This can be done analytically (using Pakes and Olley (1995)), or the standard errors can be “bootstrapped”. Don't worry about making these corrections.