Final Exam - section 2. Thursday, December 15. 2 hours, 30 minutes


Econometrics, ECON312
San Francisco State University
Michael Bar
Fall 2011

Final Exam - section 2
Thursday, December 15
2 hours, 30 minutes

Name:

Instructions
1. This is a closed book, closed notes exam.
2. No calculators of any kind are allowed.
3. Show all the calculations.
4. If you need more space, use the back of the page.
5. Fully label all graphs.

Good Luck

1. (35 points). Meder studies the factors affecting the individual demand for cigarettes. He collected a sample of 807 individuals, with the following variables:

cigs - number of cigarettes smoked per day
lcigpric - log of cigarette price
lincome - log of income
educ - years of schooling
restaurn - dummy variable (= 1 if smoking in restaurants is restricted in the state, 0 otherwise)
age - age in years
agesq - age squared ($agesq = age^2$)
black - dummy variable (= 1 if person is black, 0 otherwise)

Meder's Stata command and regression output are presented below.

. regress cigs lcigpric lincome educ restaurn age agesq black

      Source |       SS       df       MS              Number of obs =     807
-------------+------------------------------           F(  7,   799) =    6.38
       Model |  8029.43629     7  1147.06233           Prob > F      =  0.0000
    Residual |  143724.246   799  179.880158           R-squared     =  0.0529
-------------+------------------------------           Adj R-squared =  0.0446
       Total |  151753.683   806  188.280003           Root MSE      =  13.412

------------------------------------------------------------------------------
        cigs |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    lcigpric |  -.8508953   5.782322    -0.15   0.883    -12.20123    10.49944
     lincome |   .8690151   .7287642     1.19   0.233    -.5615034    2.299534
        educ |  -.5017533   .1671677    -3.00   0.003     -.829893   -.1736135
    restaurn |  -2.865621   1.117406    -2.56   0.011    -5.059019   -.6722235
         age |   .7745021   .1605158     4.83   0.000     .4594196    1.089585
       agesq |  -.0090686   .0017481    -5.19   0.000    -.0124999   -.0056373
       black |   .5592361   1.459461     0.38   0.702    -2.305595    3.424067
       _cons |  -3.241715   24.11391    -0.13   0.893    -50.57581    44.09238
------------------------------------------------------------------------------

a. Interpret the estimated coefficient on lincome.

$b_3/100 = 0.0087$ means that when income goes up by 1%, the demand for cigarettes is predicted to increase by 0.0087 cigarettes per day.
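The arithmetic behind the answer to part (a) can be checked with a short Python sketch. It assumes the level-log form of the regression (cigs in levels, income in logs); the variable names are illustrative and not part of the exam.

```python
import math

# Coefficient on lincome, copied from the regression output above.
b_lincome = 0.8690151

# Level-log rule of thumb: a 1% rise in income changes cigs by about b/100.
approx_change = b_lincome / 100

# Exact change implied by the model for a 1% rise in income: b * ln(1.01).
exact_change = b_lincome * math.log(1.01)

print(round(approx_change, 4), round(exact_change, 4))
```

The two numbers differ only slightly, which is why the b/100 approximation is the usual way to read a coefficient on a logged regressor.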

b. Based on Meder's results, what is the estimated income elasticity of demand for cigarettes for someone who smokes 1 cigarette per day, and for someone who smokes 10 cigarettes per day?

$\hat{\eta}_{cigs,income}\big|_{cigs=1} = \frac{b_3}{cigs} = \frac{0.87}{1} = 0.87$

$\hat{\eta}_{cigs,income}\big|_{cigs=10} = \frac{b_3}{cigs} = \frac{0.87}{10} = 0.087$

The lesson here is that for heavier smokers, income becomes a less important factor in their demand for cigarettes.

c. Interpret the estimated coefficient on restaurn.

In states that have restrictions on smoking in restaurants, the demand for cigarettes is 2.87 cigarettes per day lower than in states that do not have such restrictions, holding all other characteristics the same.

d. Suppose that Meder wants to test whether restricting smoking in restaurants is an effective policy for reducing smoking. Write the null and alternative hypotheses for this test.

$H_0: \beta_5 = 0$
$H_1: \beta_5 < 0$
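The elasticity calculation in part (b) follows from the level-log form: the derivative of cigs with respect to log income is $b_3$, so the elasticity is $b_3$ divided by the level of cigs. A minimal Python sketch, with a hypothetical helper name and the unrounded coefficient from the output:

```python
# Coefficient on lincome from the regression output (unrounded).
b_lincome = 0.8690151

def income_elasticity(cigs_per_day, b=b_lincome):
    """Income elasticity in a level-log model: (d cigs / d ln income) / cigs."""
    if cigs_per_day <= 0:
        raise ValueError("the formula b / cigs needs a positive cigs value")
    return b / cigs_per_day

for cigs in (1, 10):
    print(cigs, round(income_elasticity(cigs), 3))
```

Because the numerator is constant, the elasticity falls in proportion to the number of cigarettes smoked, which is the pattern the answer points out.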

e. Based on the reported p-values, what is your conclusion about the test in the last section? Explain your answer.

First, the sign of the estimated coefficient is negative ($b_5 = -2.87 < 0$), which is consistent with the alternative hypothesis. Second, the reported p-value of 0.011 is for the two-sided test, $H_0: \beta_5 = 0$ vs. $H_1: \beta_5 \neq 0$, so for the one-tailed test the relevant p-value is $0.011/2 = 0.0055 < 0.05$. Therefore, we reject the null hypothesis at the 0.05 significance level and conclude that restricting smoking in restaurants is effective in reducing smoking.

f. Interpret the estimated coefficient on black.

$b_8 = 0.56$ means that black individuals (African Americans) smoke 0.56 more cigarettes per day than individuals of other races, holding all other characteristics the same.

g. Suppose that you wish to test whether the number of cigarettes smoked per day is the same for black individuals as for other races. Write the null and alternative hypotheses for this test.

$H_0: \beta_8 = 0$
$H_1: \beta_8 \neq 0$

h. Based on the reported 95% confidence intervals, what is your conclusion about the test in the last section? Explain your answer.

The 95% confidence interval for $\beta_8$ is [-2.305595, 3.424067], which contains the value 0. Therefore, we fail to reject the null hypothesis $H_0: \beta_8 = 0$ at the 5% significance level. Recall that the confidence intervals reported by Stata contain all hypothesized values of the true parameter that are not rejected by the current sample. We conclude that there is no significant difference in smoking between black individuals and other races.
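The two decision rules used in parts (e) and (h) can be written down as a short Python sketch. The function names are hypothetical; the numbers are copied from the regression output.

```python
def one_sided_p_value(two_sided_p, estimate, alternative="less"):
    """Convert a two-sided p-value into a one-sided one.

    If the estimate has the sign predicted by the alternative, the one-sided
    p-value is half the two-sided one; otherwise it is 1 - p/2.
    """
    if alternative == "less":
        sign_matches = estimate < 0
    else:
        sign_matches = estimate > 0
    return two_sided_p / 2 if sign_matches else 1 - two_sided_p / 2

def rejects_zero(ci_low, ci_high):
    """Two-sided test via the confidence interval: reject H0 if 0 lies outside."""
    return not (ci_low <= 0 <= ci_high)

# Part (e): restaurn, H1 says the coefficient is negative.
p_restaurn = one_sided_p_value(0.011, -2.865621, alternative="less")

# Part (h): black, two-sided test at the 5% level using the 95% interval.
reject_black = rejects_zero(-2.305595, 3.424067)

print(p_restaurn, reject_black)
```

Halving the p-value is only valid when the estimate falls on the side of zero named by the alternative hypothesis, which is why the answer to part (e) checks the sign first.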

i. Suppose that you wish to test whether the effect of education on demand for cigarettes is the same for black individuals as for other races. How would you change the original model to allow for that test?

We would need to add another regressor: an interaction term educ*black. The coefficient on that regressor gives the difference between the effect of education on smoking for black individuals and the effect of education on smoking for other races.

j. Based on Meder's results, the demand for cigarettes is increasing in age. True/False, circle the correct answer and prove mathematically.

False. The effect of age on smoking is not monotone (cigs is a quadratic function of age, increasing up to a certain age and declining afterwards):

$\frac{\partial cigs}{\partial age} = b_6 + 2 b_7 \cdot age = 0.774 - 2 \cdot 0.009 \cdot age$

Remark. In fact, one can find the age at which smoking is at its maximum:

$\frac{\partial cigs}{\partial age} = b_6 + 2 b_7 \cdot age^* = 0 \;\Rightarrow\; age^* = -\frac{b_6}{2 b_7} = \frac{0.774}{2 \cdot 0.009} = 43$

After the age of 43 smoking declines, perhaps because people become more mature and start feeling the negative consequences of smoking on their health.
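The turning-point calculation in part (j) can be reproduced with the unrounded coefficients. This is a minimal Python sketch; the names are illustrative only.

```python
# Coefficients on age and agesq from the regression output.
b_age = 0.7745021
b_agesq = -0.0090686

def marginal_effect_of_age(age):
    """Derivative of predicted cigs with respect to age: b_age + 2 * b_agesq * age."""
    return b_age + 2 * b_agesq * age

# Age at which the derivative is zero, i.e. where predicted smoking peaks.
turning_point = -b_age / (2 * b_agesq)

print(round(turning_point, 1))
print(marginal_effect_of_age(20) > 0, marginal_effect_of_age(60) < 0)
```

With unrounded coefficients the peak is at about 42.7 years, which agrees with the value of 43 obtained from the rounded figures in the answer.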

2. (20 points). Seung studies the factors affecting the choice of high school students to go to college. She collected data on 1000 former high school students, with the following variables:

college - dummy variable (= 1 if the student enrolled in college, 0 otherwise)
grades - average high school grade in math, English and social studies
faminc - gross annual family income (in $1000)
famsiz - number of family members
parcoll - dummy variable (= 1 if the most educated parent graduated from college or had an advanced degree, 0 otherwise)
female - dummy variable (= 1 if a person is female, 0 otherwise)
black - dummy variable (= 1 if a person is black, 0 otherwise)

Seung's Stata commands and output are presented below.

. probit college grades faminc famsiz parcoll female black

Probit regression                                 Number of obs   =       1000
                                                  LR chi2(6)      =     226.42
                                                  Prob > chi2     =     0.0000
Log likelihood = -416.21967                       Pseudo R2       =     0.2138

------------------------------------------------------------------------------
     college |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      grades |   .2945521   .0274882    10.72   0.000     .2406761     .348428
      faminc |    .005393   .0018099     2.98   0.003     .0018457    .0089404
      famsiz |  -.0531059   .0374572    -1.42   0.156    -.1265207    .0203089
     parcoll |   .4765344   .1424817     3.34   0.001     .1972755    .7557933
      female |   .0237927   .1014679     0.23   0.815    -.1750806    .2226661
       black |   .6109028   .2176202     2.81   0.005      .184375    1.037431
       _cons |  -1.135516   .2250342    -5.05   0.000    -1.576574   -.6944567
------------------------------------------------------------------------------

. mfx

Marginal effects after probit
      y  = Pr(college) (predict)
         =  .84535426
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
  grades |   .0700821      .00618   11.35   0.000   .057978  .082186   6.46961
  faminc |   .0012832      .00043    3.02   0.003    .00045  .002116   51.3935
  famsiz |  -.0126354      .00889   -1.42   0.155  -.030052  .004781     4.206
parcoll* |   .1030921      .02745    3.76   0.000   .049284    .1569      .308
 female* |   .0056604      .02414    0.23   0.815  -.041661  .052981      .496
  black* |    .107392      .02648    4.06   0.000   .055491  .159293      .056
------------------------------------------------------------------------------
(*) dy/dx is for discrete change of dummy variable from 0 to 1

a. Interpret the estimated marginal effect of grades.

A one-point increase in the average high school grade increases the probability of attending college by 0.07 (i.e., by 7 percentage points), with all other characteristics at their sample mean values.

b. Interpret the estimated marginal effect of black.

The probability of attending college is higher by about 0.107 (i.e., by roughly 10 percentage points) for black than for non-black high school graduates, with all other characteristics at their sample mean values.

c. Between family income (faminc) and family size (famsiz), which one is the more important factor in determining the chances of attending college, based on the above estimates? Explain your answer.

Family income is more important because its effect on the probability of attending college is significant (p-value = 0.003 < 0.05), while family size has an insignificant effect on the probability of attending college (p-value = 0.155 > 0.05).

d. The main reason why logit and probit models are preferred to the linear probability model is (circle the correct answer):
i. Logit and probit are easier to estimate than the linear probability model.
ii. Logit and probit models allow for calculating the marginal effects on the predicted probability of an outcome.
iii. The fitted values in the probit and logit models are always between 0 and 1.
iv. The logit and probit estimators are BLUE (Best Linear Unbiased Estimators).

Answer: iii.
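The marginal effects of the continuous regressors in the mfx table can be recovered from the probit coefficients. For a probit, the marginal effect at the sample means is the standard normal density evaluated at the fitted index, multiplied by the coefficient. The Python sketch below assumes this standard formula and backs out the index from the reported predicted probability; the names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Predicted probability at the sample means, from the mfx output.
p_at_means = 0.84535426

# Fitted index x'b at the means is the inverse normal CDF of that probability.
index_at_means = std_normal.inv_cdf(p_at_means)

# Scale factor: the normal density at the index.
scale = std_normal.pdf(index_at_means)

# Probit coefficients on the continuous regressors, from the probit output.
coefs = {"grades": 0.2945521, "faminc": 0.005393, "famsiz": -0.0531059}
marginal_effects = {name: scale * b for name, b in coefs.items()}

for name, effect in marginal_effects.items():
    print(name, round(effect, 4))
```

The dummy variables (parcoll, female, black) are handled differently in the table: as its footnote says, their entries are discrete changes in the predicted probability when the dummy goes from 0 to 1, so the density-times-coefficient formula does not apply to them.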

3. (15 points). Suppose that, according to theory, the health of a nation's population is an important factor in determining the country's economic growth.

a. A researcher estimates a regression model with growth as the dependent variable, but she does not include health as one of the regressors. What are the likely consequences for the other estimates? Circle the correct answer.
i. The OLS estimators of the other coefficients are biased but consistent.
ii. The OLS estimators are unbiased but inconsistent.
iii. The OLS estimators are unbiased and consistent, but inefficient.
iv. The OLS estimators are biased and inconsistent.
v. The OLS estimators are biased and inefficient.

Answer: iv.

b. Suppose that the researcher realizes that health should be included as one of the regressors in the model, but unfortunately there is no data on the variable health. Propose a solution to this problem. Be specific.

The researcher can use a proxy for health, e.g. life expectancy.

c. In general, what is the most important source of guidance for model specification (i.e. determining which variable is dependent, and what the regressors are)?

Economic (or other) theory.

4. (5 points). A researcher estimates a regression model and finds that the estimated individual coefficients are not significant, but at the same time the overall fit of the model is good. What is the likely reason for her results? Circle the correct answer.
a. Multicollinearity.
b. Heteroscedasticity.
c. Omitted variable bias.
d. Serial correlation.

Answer: a.

5. (10 points). Suppose a researcher is using time series data in regression analysis.

a. Which problem is the researcher more likely to face? Circle the correct answer.
i. Heteroscedasticity.
ii. Multicollinearity.
iii. Autocorrelation.

Answer: iii.

b. Suppose the dependent variable and some of the regressors exhibit time trends. Briefly explain what problem is likely to arise in this research project, and provide a solution to it.

The problem is spurious regression (spurious meaning false or fake): the estimated coefficients do not measure the causal effect of the regressors on the dependent variable. Instead, the model picks up the effect of the time trend on the dependent variable. The simplest solution is to include time as a regressor.
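The spurious regression problem in question 5(b) can be illustrated with simulated data. The Python sketch below is an illustration under stated assumptions, not part of the exam: two series are generated that are unrelated except for sharing a linear time trend, and the trend slopes, noise level, sample size, and seed are all arbitrary choices. Including time as a regressor is implemented by detrending both series first, which gives the same coefficient on x as a regression of y on x and time (the Frisch-Waugh result).

```python
import random

def ols_slope(x, y):
    """Slope from a simple regression of y on x (with an intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def detrend(series, t):
    """Residuals from regressing a series on a linear time trend."""
    n = len(t)
    slope = ols_slope(t, series)
    intercept = sum(series) / n - slope * sum(t) / n
    return [s - intercept - slope * ti for s, ti in zip(series, t)]

random.seed(0)  # arbitrary seed, only for reproducibility
T = 200
t = list(range(T))

# x and y both trend upward but are otherwise independent of each other.
x = [0.3 * ti + random.gauss(0, 1) for ti in t]
y = [0.5 * ti + random.gauss(0, 1) for ti in t]

# Regression of y on x alone: the slope reflects the shared trend.
naive_slope = ols_slope(x, y)

# Coefficient on x once time is controlled for (via detrending).
controlled_slope = ols_slope(detrend(x, t), detrend(y, t))

print(round(naive_slope, 2), round(controlled_slope, 2))
```

In this setup the regression without time reports a large slope even though x has no effect on y, while the slope that controls for time is close to zero.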