UK ENGLISH

The report consists of two parts. The first is your written analysis of two datasets. The datasets and the questions you should answer are outlined below. The second part of the report is the Stata code (do file) you used in order to carry out your analysis. The overall marking is based on the following principles:

• Correctness and completeness: Each question is worth a certain number of points. You will receive full points only if you answer the question correctly. Correctness and completeness will determine 70% of your report mark (and will thus count 35% towards your final mark on the module).

• Stata code: I am asking you to submit the do file you used to carry out the analysis. You can copy the content of your do file into a text file or Word file in order to submit it. I will mark the do file based on whether it is executable without errors (except for the path), whether it has a clear structure, and whether you use appropriate comments. The do file counts 15% towards your report mark (and 7.5% towards the overall mark for the module).

• Presentation and style: I will also assess your written report based on how it is presented. This means whether there is a clear structure, whether it is clear which question your writing refers to, and whether you explain key steps and decisions you made as part of your analysis. Think about it as preparing a report for your future boss. I am not looking for “art” but your report should be clear, easy to follow and informative about what you are doing. Presentation and style will count 15% towards the report mark and 7.5% towards your final mark.

• Student ID: Marking is anonymous but in order to identify you after marking, your student ID is required. If you fail to provide your student ID on the first page of the report you will lose 10 points (10%).

The report is expected to be about 3000 words (± 10%) long. This is a rough guide and I won’t count words unless it is extremely long (or short). Focus on answering the questions outlined below and keep an eye on the word count as a secondary issue.

1 Report

For your report you are required to analyse two datasets, described below.

1.1 Dataset 1: CEO salaries

You will find the relevant dataset on Moodle. Each student has an individual dataset, assigned according to the instructions on Moodle. Make sure you are analysing the dataset which has been assigned to you. The dataset contains data on CEOs, their individual characteristics and their company’s characteristics. The variables are defined as follows:

• salary: CEO salary measured in 1000s of 1990 USD

• age: age of the CEO in years

• college: indicator variable =1 if CEO attended college

• grad: indicator variable =1 if CEO attended graduate school (master’s or PhD)

• comten: company tenure; the number of years the CEO has been working for the company

• ceoten: CEO tenure; the number of years the CEO has been in the position of CEO

• sales: the value of the firm’s sales in million USD, 1990 dollars

• profits: the value of the firm’s profits in million USD, 1990 dollars

• mktval: (stock) market value of the company at the end of 1990, million USD

• profmarg: profit margin expressed as profits in % of sales (i.e. =profits/sales*100)

Use this dataset to investigate the effects of firm performance and personal characteristics on CEO salaries. Answer the following questions (total of 60 points):

1. [10 points]: Start with a model which relates salaries to firm sales and market value. Choose a log-log specification for both independent variables. Assess the overall model fit, interpret the meaning of the estimated coefficients and assess their statistical significance. Take the units of measurement into account when interpreting the results.

2. [10 points]: Add the profit variable to the model. Why is it not possible to add it in logarithmic form? What is your overall assessment about the explanatory power of these three firm performance indicators?

3. [10 points]: Add the CEO tenure variable to the model in part (2). What is the estimated percentage return for another year of CEO tenure, holding all other variables fixed?

4. [7 points]: Find the sample correlation coefficient between the logarithm of market value and profits. Is there a strong correlation between these variables? What implication does this have for the OLS estimators?

5. [13 points]: Report all 5 specifications in one regression table and compare the results. Which is your preferred model? Explain why. Are there other factors or other functional forms which should be taken into account? Explain. Are there any results which you did not expect or which are not in line with economic theory? Can you come up with a possible explanation for why you obtained these results?

6. [10 points]: Test for normality and homoskedasticity of the residuals for your preferred specification. Interpret and explain the results. What are the implications of these tests for how you should interpret the results?
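As a reminder of how coefficients are read in the functional forms used in questions 1 to 3, the following is a generic sketch rather than a model answer. The coefficient symbols are illustrative and do not appear in the questions themselves; profits enter in levels, as in question 2.

```latex
% Illustrative notation only: the \beta symbols are assumed, not given in the brief.
\begin{align*}
\ln(\text{salary}) &= \beta_0 + \beta_1 \ln(\text{sales}) + \beta_2 \ln(\text{mktval})
                      + \beta_3\,\text{profits} + \beta_4\,\text{ceoten} + u \\[4pt]
% Log-log term: the coefficient is an elasticity.
\beta_1 &= \frac{\partial \ln(\text{salary})}{\partial \ln(\text{sales})}
   \quad\Longrightarrow\quad \%\Delta\,\text{salary} \approx \beta_1 \cdot \%\Delta\,\text{sales} \\[4pt]
% Log-level term: approximate and exact percentage change.
\%\Delta\,\text{salary} &\approx 100\,\beta_4\,\Delta\text{ceoten},
   \qquad \text{exact: } 100\,\bigl(e^{\beta_4\,\Delta\text{ceoten}} - 1\bigr)
\end{align*}
```

The approximation and the exact figure are close when the coefficient is small. Rescaling a variable that enters in logs (for example, salary in 1000s of USD) changes only the intercept, whereas for a variable in levels a one-unit change means one unit of measurement (one million USD for profits).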

1.2 Dataset 2: Survey of Consumer Finances

You will find the relevant dataset on Moodle. Each student has an individual dataset. A Word file lists which student is assigned to which dataset. The dataset contains information on individual households and their balance sheets. The key variables are defined as follows:

• debt: total household debt in 2013 US Dollars

• debtlag2: total household debt in the previous year; measured in 2013 US Dollars

• income: total household income in 2013 US Dollars

• houses: value of primary residence in 2013 US Dollars

• age: age of the survey respondent (household representative)

• wgt: survey weight for each household

Use the dataset to investigate the effects of household characteristics on household indebtedness. Answer the following questions (total of 40 points):

1. [12 points]: Start with the following model

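$$\ln(\text{debt}_{i,t}) = \beta_0 + \beta_1 \ln(\text{debt}_{i,t-1}) + \beta_2 \ln(\text{income}_{i,t}) + \beta_3 \ln(\text{houses}_{i,t}) + \beta_4\,\text{age}_{i,t} + \beta_5\,\text{age}_{i,t}^2 + u_{i,t}$$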

and interpret the meaning of the estimated coefficients, assess the model fit and the statistical significance of the estimated coefficients. Use the weights (wgt) when estimating the model.

2. [12 points]: Re-estimate the model from part (1) but now restrict it to only include those households whose outstanding liabilities are positive. Discuss the main differences between these two models with respect to the parameters you obtained. Can you come up with an economic explanation for the difference?

3. [7 points]: The variable inheritance is an indicator variable which is equal to 1 for those households who inherited some assets in the past. Include it in your model and interpret the results.

4. [9 points]: The variable X720 contains the year in which the primary residence was bought. Generate a new variable containing the number of years since the primary residence was bought, include it in your regression model and interpret the results.

For additional information about the variables use the SCF handbooks. You can find the links and an explanation of how they work on Moodle.
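Because age enters the model in question 1 both linearly and squared, its effect is not captured by a single coefficient. A short worked derivation, using the coefficient names from that model and assuming β5 ≠ 0:

```latex
\begin{align*}
% Marginal effect of age on log debt depends on the level of age.
\frac{\partial \ln(\text{debt}_{i,t})}{\partial\,\text{age}_{i,t}}
   &= \beta_4 + 2\beta_5\,\text{age}_{i,t} \\[4pt]
% Age at which the marginal effect is zero (turning point).
\text{age}^{*} &= -\frac{\beta_4}{2\beta_5}
\end{align*}
```

Multiplied by 100, the first expression is the approximate percentage change in debt for one more year of age, evaluated at a given age. The turning point is a maximum if β5 is negative and a minimum if β5 is positive.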

2 Stata code

In addition to your report you are required to hand in the Stata code you used to carry out the analysis of the two datasets. The code should allow another person to reproduce all results reported in the analytical section (after adjusting the directory path). Include your code as an Appendix at the end of your report. The marking of the do file will be based on whether the code is able to reproduce your results (i.e. works correctly) and whether you have included appropriate comments which allow someone else to understand your code.
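One possible layout for such a do file is sketched below. It is a minimal skeleton, not a required template: the folder path, the file names `ceo_data.dta` and `scf_data.dta`, the log file name and the `version` number are all placeholders assumed for illustration.

```stata
* ------------------------------------------------------------
* Student ID: <your student ID>   (no name: marking is anonymous)
* Purpose   : reproduce all results in the written report
* ------------------------------------------------------------
version 15          // assumption: replace with the Stata version you use
clear all
set more off

* --- Path: the only line a marker should need to change -------
* Hypothetical folder; replace with your own.
global datadir "C:/coursework/data"

capture log close
log using "report_log.txt", text replace   // hypothetical log file name

* --- Dataset 1: CEO salaries ----------------------------------
use "$datadir/ceo_data.dta", clear          // hypothetical file name
describe
summarize

* Question 1: generate the variables you need, then estimate
* (one clearly labelled block of commands per question)

* --- Dataset 2: Survey of Consumer Finances -------------------
use "$datadir/scf_data.dta", clear          // hypothetical file name
describe
summarize

* Question 1: (commands for this question)

log close
```

Keeping the path in a single global means that another person only has to edit one line before running the file from top to bottom.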

Format of the report and general tips

• Include your student ID on the first page of the report and your do file! Don’t include your name as it prevents anonymous marking.

• You should think of the written report as a problem set, rather than an actual research report. That means you are not required to have an introduction, literature review, results section etc.

• Most importantly, clearly indicate which question each answer addresses and which results it is based on. A good way to do that is to include a regression table (if you have more than one, number the tables) and refer to the (numbered) columns of that table.

• Make sure you provide a proper interpretation of the regression results you produce. What do the coefficients mean? How should they be interpreted? Make sure that you are using the correct units when interpreting your results (i.e. Dollars, test scores, etc.). Use the example document which we discussed in lectures 8 and 9.

• Group work is encouraged. Discussing the results with your colleagues will greatly help you to understand the material. The crucial point is that despite this exchange you are required to deliver an independent piece of work. So you cannot copy the code or your written answers from your colleagues. If you are working in groups, you are required to clearly state that at the beginning of your report and provide the student ID numbers of the other group members.