ECON 2404 – Assignment 2

1 Estimating a Production Function
Download the data set "data-olley-pakes". There are 531 firms and 6 years of data.
Each row refers to one firm in one year. The variables are as follows: firm, year,
output, age, capital, labor, and investment. If a firm's values are zero in a given
year, the firm does not exist in that year: it has either already exited
or not yet entered.

You can answer the questions using either STATA or MATLAB, although I
suggest implementing the Olley and Pakes (OP) estimator in MATLAB to gain
a better understanding of the method. If you are using STATA,
do not forget to change the zeros into dots so that STATA understands that the
variables are missing; otherwise it will treat the zeros as observations. Make
sure you put the data in logs before estimating the model.
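
The data preparation described above (zeros treated as missing, then logs) can be sketched as follows. This is a minimal illustration in plain Python rather than STATA or MATLAB; the rows and the helper name `to_log_or_missing` are hypothetical, and the same logic carries over to either package.

```python
import math

def to_log_or_missing(value):
    """Return log(value), or None when the value is the zero placeholder."""
    # Zeros mark firm-years in which the firm is not in the market.
    if value <= 0:
        return None
    return math.log(value)

# Hypothetical rows: (firm, year, output, capital, labor)
rows = [
    (1, 1, 0.0, 0.0, 0.0),    # firm 1 has not entered yet
    (1, 2, 12.5, 30.0, 4.0),
    (1, 3, 14.0, 33.0, 5.0),
]

# Take logs variable by variable, keeping firm and year unchanged.
logged = [
    (firm, year) + tuple(to_log_or_missing(v) for v in (output, capital, labor))
    for firm, year, output, capital, labor in rows
]

# Unbalanced panel: keep only firm-years with observed values.
observed = [r for r in logged if None not in r[2:]]
```

Using an explicit missing marker (`None` here, a dot in STATA, `NaN` in MATLAB) avoids taking the log of zero and keeps non-existent firm-years out of the regressions.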

1.1 Model
Assume the firms have a Cobb-Douglas production function (let's ignore the firm's
age in this exercise, but you can include it if you want to):
$y_{it} = \beta_0 + \beta_l l_{it} + \beta_k k_{it} + \omega_{it} + \varepsilon_{it}$, (1)
where $y_{it}$ is the log of output; $l_{it}$ is the log of labor; $k_{it}$ is the log of capital; the
term $\omega_{it}$ represents "productivity shocks" that are observed or predictable by
the firms before making their input decisions at $t$; and $\varepsilon_{it}$ represents both i.i.d.
shocks to production that are not predicted by the firms and measurement
errors in the observed variables. The endogeneity problem in estimating (1)
comes from the correlation between the inputs and $\omega_{it}$.

1.2 Question

  1. Estimate the production function using pooled OLS for both the unbalanced
    and the balanced panel data. What do you find? Are the
    estimates significant? Are they economically reasonable? Why would you
    expect them to be biased?
  2. Assume $\omega_{it} = \omega_i$, i.e., it is a time-invariant fixed effect. Estimate the
    production function using the fixed-effect estimator. What do you find?
    Are the estimates significant? Are they economically reasonable?
  3. Assume $\omega_{it} = \rho \omega_{it-1} + \xi_{it}$, where $|\rho| < 1$ and $\xi_{it}$ is i.i.d. Add the
    fixed effect $a_i$ to the unobservables. Estimate the production function using the
    SYS GMM estimator proposed by Blundell and Bond. What do you find?
    Are the estimates significant? Are they economically reasonable?
  4. Return to the original model (1) and assume the model satisfies the OP
    assumptions. Estimate the model using investment as a proxy for $\omega_{it}$.
    (a) Estimate the first step of OP to get $\hat{\beta}_l$ and $\hat{\phi}_{it}$. Use a fourth-order
    polynomial series (just as OP did) for $\hat{\phi}_{it}$.
    (b) Estimate the second step of OP, ignoring the selection problem, to get
    $\hat{\beta}_k$. Use a fourth-order polynomial to approximate the function $g(\cdot)$,
    where
    $\omega_{it} = g(\omega_{it-1}) + \xi_{it}$,
    and use the vector of instruments $(1, k_{it})'$.
    (c) When compared with your previous results (OLS, FE, and SYS GMM),
    does your new estimate from OP conform with the biases suggested
    by the theoretical model?
    (d) Calculate productivity for each firm as in Olley and Pakes
    (1996), Section 5:
    $p_{it} = \exp(y_{it} - \hat{\beta}_l l_{it} - \hat{\beta}_k k_{it})$.
    Calculate aggregate productivity growth for the sample in each year
    using output shares to aggregate firms ($s_{it} = y_{it} / \sum_{i=1}^{N_t} y_{it}$), and do
    the same decomposition as they perform (equation 16). What do
    you conclude in terms of reallocation of output shares versus
    plant-level growth?
    (e) (EXTRA) You do not need to incorporate the selection problem into
    OP, but you can if you want to. In this case, when you estimate
    the survival probability function (i.e., the probability of staying in
    the market) you may use a fourth-order polynomial within a probit
    model, $P_{it} = \Pr(\chi_{it+1} = 1 \mid i_{it}, k_{it})$. Then, in the last step (the
    nonlinear least squares), also use a fourth-order polynomial for $g(\cdot)$,
    where now
    $\omega_{it} = g(\omega_{it-1}, P_{it}) + \xi_{it}$.
    Discuss your estimates compared with those obtained ignoring the
    selection problem.
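
For parts 4(a), 4(b) and 4(e), a fourth-order polynomial series in two variables can be built as in the sketch below. It assumes the series contains every monomial of total degree up to four, including cross terms; the function name `poly_terms` and the input values are hypothetical, and the sketch is written in plain Python for illustration only.

```python
def poly_terms(x, z, degree=4):
    """Return all monomials x**a * z**b with a + b <= degree, constant first."""
    terms = []
    for total in range(degree + 1):
        # For each total degree, move the exponent from x to z step by step.
        for a in range(total, -1, -1):
            terms.append(x ** a * z ** (total - a))
    return terms

# Hypothetical values of log investment and log capital for one firm-year.
row = poly_terms(2.0, 3.0)
n_terms = len(row)  # 15 regressors, including the constant
```

Stacking one such row per firm-year gives the regressor block for the first-step series in investment and capital; because the constant is already among the terms, no separate intercept is needed alongside it.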
    HINT 1: There are a couple of ways to handle the unbalanced panel in
    MATLAB. One possibility is to first drop all missing data (all the rows with
    zeros) and then estimate the model as usual. You need to be careful when
    creating lagged variables. One possibility is to first shift the whole vector (e.g.,
    for a column vector y, create $y_{it-1}$ in MATLAB as "y_1 = [0; y(1:end-1)];"). Then replace with
    zeros all rows that are a firm's first observation (year = 1 for firms present
    from the start), so that the lagged variable for the firm's
    first year is zero and not the previous firm's last observation. Finally,
    drop the rows with zeros when estimating $\beta_k$ and $g(\cdot)$.
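
The lag construction in HINT 1 can be sketched as follows. This is an illustrative plain-Python version, not the MATLAB code itself; it assumes the rows are sorted by firm and then by year, uses `None` instead of zero to flag a missing lag, and the function name `lag_within_firm` and the data are hypothetical.

```python
def lag_within_firm(firm_ids, years, values):
    """Lag `values` by one year within each firm; None where no lag exists."""
    lagged = []
    for pos in range(len(values)):
        # A valid lag needs the previous row to be the same firm, one year earlier.
        has_prev = (
            pos > 0
            and firm_ids[pos] == firm_ids[pos - 1]
            and years[pos] == years[pos - 1] + 1
        )
        lagged.append(values[pos - 1] if has_prev else None)
    return lagged

# Hypothetical unbalanced panel: firm 2 enters in year 3.
firm = [1, 1, 1, 2, 2]
year = [1, 2, 3, 3, 4]
y = [2.0, 2.1, 2.3, 1.5, 1.7]

y_lag = lag_within_firm(firm, year, y)
# y_lag is [None, 2.0, 2.1, None, 1.5]
```

Checking the firm identifier rather than the calendar year handles late entrants: firm 2's first row gets no lag even though its year is 3, so it never inherits firm 1's last observation.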
    HINT 2: If you want to correct for selection, you need to estimate the
    probit $P_{it} = \Pr(\chi_{it+1} = 1 \mid i_{it}, k_{it})$
    before dropping the missing observations in MATLAB. It may be easier to
    estimate $P_{it}$ in STATA, save the results, and then move to MATLAB.
    NOTE: The usual standard errors in the second stage will be wrong, since
    they need a correction for the estimation error in the first stage. This can be
    done analytically (using Pakes and Olley (1995)), or the standard errors can be
    "bootstrapped". You do not need to worry about these corrections.
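
For part 4(d), the aggregation and decomposition for a single year can be sketched as below. The sketch assumes that shares are computed from output in levels and that the decomposition takes the form "share-weighted productivity = unweighted mean productivity + sum of (share deviation × productivity deviation)"; the function name `op_decomposition` and the numbers are hypothetical.

```python
def op_decomposition(outputs, productivities):
    """Split share-weighted productivity into an unweighted mean and a covariance term."""
    n = len(outputs)
    total_output = sum(outputs)
    shares = [q / total_output for q in outputs]

    # Aggregate productivity: output-share-weighted average of firm productivity.
    weighted = sum(s * p for s, p in zip(shares, productivities))

    # Unweighted mean productivity and mean share (shares sum to one).
    mean_p = sum(productivities) / n
    mean_s = 1.0 / n

    # Covariance-type term: positive when more productive firms have larger shares.
    covariance = sum(
        (s - mean_s) * (p - mean_p) for s, p in zip(shares, productivities)
    )
    return weighted, mean_p, covariance

# Hypothetical year with three firms.
outputs = [10.0, 30.0, 60.0]
productivities = [1.0, 2.0, 3.0]
weighted, mean_p, covariance = op_decomposition(outputs, productivities)
```

Running this year by year and comparing the two components over time separates changes in plant-level productivity (the unweighted mean) from changes in how output is allocated across firms (the covariance term).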