Research Assistant - Exercise

Prof. Dickstein conducts research on health care markets. Some of his recent research projects include:

For a current project, Prof. Dickstein collected data for surgical procedures carried out in hospitals and “ambulatory surgery centers” for multiple regions/markets. The goal of the research is to analyze the market for hospital services and the incentives for entry of competing ambulatory centers.

We wish to study how competition evolved over time by comparing the number of procedures completed and total revenue earned in each market over time at facilities of each type—hospitals vs. ambulatory surgery centers.

This exercise uses three data files:

  1. data_exercise.csv is a dataset containing quarterly data for the years 1997 through 2012. This dataset contains one summary observation per facility per quarter.
  2. data_description.csv contains a description of the variables in data_exercise.csv.
  3. fac_exercise.csv is a dataset containing a facility-level “for-profit” status indicator.

Steps to complete:

  1. Download and unzip the files:
  2. Open data_exercise.csv. We are only interested in examining hospitals and ambulatory surgery centers. Do not include any facilities labeled as other types beyond these two categories. Report the mean, median, standard deviation, min value, and max value of: number of procedures completed, total revenue, and the mean charge per patient. Report these summary statistics for each year and for each of the two facility types.
  3. The data begin in Q1 of 1997. Firms that existed at that time are considered “incumbents”; firms that entered after 1997 Q1 are considered “entrants” in the year and quarter in which they began having positive revenues, and incumbents in subsequent years. Create three new variables in the dataset to capture these designations. First, create an indicator variable that equals 1 for the periods in which a facility is an “entrant”. Second, create an indicator variable that equals 1 for the periods in which we consider the firm an “incumbent”. Third, create a count variable that contains the running number of years for which the facility has existed. Document exactly how you create these variables.
  4. Now you need to merge in data on the characteristics of each facility, including its for-profit status. These data are saved in fac_exercise.csv. Merge this into the dataset you've created in (3) above.
  5. We now want to analyze the effect of competition on prices. To do so, we need a count of the number of facilities operating in each region (region_id) in each year and quarter. Create this count and merge it back into the dataset you created in (4) as a new variable, “num_facilities”.
  6. Test how competition affects prices. Run a regression of the natural log of per-patient charges on a constant and the “num_facilities” variable you created in (5) (useful Stata commands, if you are using Stata: gen, reg). Note: a regression simply finds the line of best fit through the data. Does competition have an effect on prices?
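
As a rough illustration of steps 2 through 6, here is a minimal Python sketch on a handful of made-up rows, using only the standard library. It is one possible reading of the instructions, not a reference solution. Only region_id and num_facilities are names taken from this write-up; every other field name (facility_id, fac_type, revenue, mean_charge, for_profit, entrant, incumbent, years_existing), the toy values, and the choices flagged as assumptions in the comments are hypothetical. The real variable names are in data_description.csv.

```python
import math
import statistics
from collections import defaultdict

# Toy stand-in for data_exercise.csv. Only region_id and num_facilities are
# names from the exercise; every other field name here is hypothetical.
FIELDS = ("facility_id", "fac_type", "region_id", "year", "quarter",
          "revenue", "mean_charge")
RAW = [
    ("A", "hospital", 1, 1997, 1, 500.0, 100.0),
    ("A", "hospital", 1, 1997, 2, 520.0, 98.0),
    ("A", "hospital", 1, 1998, 1, 510.0, 90.0),
    ("B", "asc", 1, 1997, 1, 0.0, 0.0),       # not yet operating
    ("B", "asc", 1, 1997, 2, 200.0, 80.0),    # first positive revenue
    ("B", "asc", 1, 1998, 1, 260.0, 78.0),
    ("C", "other", 1, 1997, 1, 300.0, 70.0),  # excluded facility type
]
rows = [dict(zip(FIELDS, r)) for r in RAW]

# Step 2: keep the two facility types, then summarize by year and type.
rows = [r for r in rows if r["fac_type"] in ("hospital", "asc")]

def summarize(values):
    """Mean, median, sample SD, min, and max of a list of numbers."""
    sd = statistics.stdev(values) if len(values) > 1 else float("nan")
    return {"mean": statistics.fmean(values),
            "median": statistics.median(values),
            "sd": sd, "min": min(values), "max": max(values)}

groups = defaultdict(list)
for r in rows:
    groups[(r["year"], r["fac_type"])].append(r["revenue"])
summary = {key: summarize(vals) for key, vals in groups.items()}

# Step 3: a facility's entry period is its first quarter with positive revenue.
FIRST_PERIOD = (1997, 1)
entry = {}
for r in sorted(rows, key=lambda r: (r["year"], r["quarter"])):
    if r["revenue"] > 0:
        entry.setdefault(r["facility_id"], (r["year"], r["quarter"]))

for r in rows:
    period = (r["year"], r["quarter"])
    first = entry.get(r["facility_id"])
    active = first is not None and period >= first
    # Assumed reading: entrant only in the entry quarter, incumbent afterwards.
    r["entrant"] = int(active and period == first and first > FIRST_PERIOD)
    r["incumbent"] = int(active and not r["entrant"])
    # Assumed reading: the count starts at 1 in the first observed active year.
    r["years_existing"] = r["year"] - first[0] + 1 if active else 0

# Step 4: left-merge a for-profit flag (lookup stands in for fac_exercise.csv).
for_profit = {"A": 0, "B": 1}
for r in rows:
    r["for_profit"] = for_profit.get(r["facility_id"])  # None if unmatched

# Step 5: count distinct operating facilities per region and quarter
# (assumption: "operating" means positive revenue), then merge back.
operating = defaultdict(set)
for r in rows:
    if r["revenue"] > 0:
        operating[(r["region_id"], r["year"], r["quarter"])].add(r["facility_id"])
for r in rows:
    key = (r["region_id"], r["year"], r["quarter"])
    r["num_facilities"] = len(operating[key])

# Step 6: OLS of log(mean_charge) on a constant and num_facilities.
sample = [r for r in rows if r["mean_charge"] > 0]  # log needs positive charges
x = [r["num_facilities"] for r in sample]
y = [math.log(r["mean_charge"]) for r in sample]
x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
slope = (sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
         / sum((a - x_bar) ** 2 for a in x))
intercept = y_bar - slope * x_bar
print(f"intercept={intercept:.3f}, slope={slope:.3f}")
```

Each assumption flagged in the comments is the kind of decision the instructions ask you to document: whether a facility counts as an entrant for one quarter or for its whole entry year, how to count years for facilities already present in 1997 Q1, and what “operating” means when revenue is zero. With the real data, a statistics package would also report standard errors for the slope, which this sketch omits.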

Please submit your script file (this is the name of the file containing your code) and your .log file (this file saves the output in a text file) to when you have finished your work.

Note that this write-up does not spell out exactly how to do each step. Be resourceful: look in the Stata/SAS/MATLAB/R/Python documentation, search Google, and so on. Please comment your code, and be sure to note anything you were not sure about and the solution you chose. We are looking for work that carefully executes the assignment, with clear documentation of the decisions made. If you can't figure something out, that is fine. Document your confusion, make a decision about how to proceed, and continue.