EWMBA 296 (Fall 2015)


EWMBA 296 (Fall 2015), Section 7
GSI: Fenella Carpena
December 8, 2015

Agenda for Today

- Dummy Variables: Basic Concepts
- 3 Forms of Regressions With Dummy Variables
  - Case 1: SRM where the only X variable is a dummy
  - Case 2: MRM where one of the X variables is a dummy
  - Case 3: Interactions involving dummy variables
- Practice Problems/Final Exam Review
  - Practice Problem: Effect of iPad on College GPA
  - Practice Final Exam 1, Q7
  - Practice Problem: Lightbulbs

Note: Slides posted in bCourses; I may *not* be able to write section notes on dummy variables due to time constraints.

Dummy Variables: Basic Concepts

So far, we've been working with variables that have quantitative meaning. Examples: sales in thousands of dollars; size of a retail store in square feet; weight of a diamond.

But in some cases, we may want to incorporate qualitative variables in our regression. Examples: gender or race of a person, industry of a firm (manufacturing, retail, etc.), region in the US (south, north, west, etc.).

These qualitative variables cannot be measured on a numerical scale. How can we include such qualitative info in our regression?

We use a dummy variable (also called a binary variable). It is a variable that takes on only two values: 0 or 1.

Dummy Variables: Basic Concepts

Example: Suppose we are interested in comparing the election outcomes of Democratic and Republican candidates. We can define Dem to be a dummy variable equal to 1 if the candidate is a Democrat and 0 if the candidate is a Republican.

The same information can be captured by defining Rep to be 1 if the candidate is a Republican, and 0 otherwise. The choice of which category to assign to 1 is arbitrary.

In practice, the name of the dummy variable is not important for the regression (we can use any name we want). But it is often useful to choose a name that signifies which category turns on when the dummy = 1. Example: A variable name like party won't make it clear whether party = 1 means a Democrat or a Republican.
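As a small illustration of the coding described above, here is a minimal Python sketch. The list of party labels is made-up data, not from the course materials; only the names Dem and Rep come from the slide.

```python
# Hypothetical candidate records; these party labels are made-up illustration data.
parties = ["Democrat", "Republican", "Republican", "Democrat", "Democrat"]

# Dem = 1 if the candidate is a Democrat, 0 if the candidate is a Republican.
dem = [1 if p == "Democrat" else 0 for p in parties]

# Rep carries the same information with the roles of 0 and 1 swapped.
rep = [1 - d for d in dem]

print(dem)  # [1, 0, 0, 1, 1]
print(rep)  # [0, 1, 1, 0, 0]
```

Because Dem + Rep = 1 for every candidate, only one of the two dummies is needed to carry the party information.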

Dummy Variables: Basic Concepts

We often encounter 3 forms of regressions that include dummy variables. In the examples below, let salary be an employee's annual salary in dollars, MBA = 1 if the employee has an MBA and 0 otherwise, and exper be years of work experience.

Case 1: SRM where the only X variable is a dummy.
$\widehat{salary} = b_0 + b_1\,MBA$

Case 2: MRM where one of the X variables is a dummy.
$\widehat{salary} = b_0 + b_1\,MBA + b_2\,exper$

Case 3: Interactions involving dummy variables.
$\widehat{salary} = b_0 + b_1\,MBA + b_2\,exper + b_3\,(MBA \times exper)$

Important: The interpretation of the dummy variable's coefficient differs in each of the above 3 cases.

Case 1: SRM where the only X variable is a dummy

$\widehat{salary} = b_0 + b_1\,MBA$

(1) Regression if MBA = 0:
(2) Regression if MBA = 1:
(3) Interpretation of $b_0$:
(4) Interpretation of $b_0 + b_1$:
(5) Interpretation of $b_1$:
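One way to explore Case 1 numerically: in a simple regression on a single dummy, the least-squares intercept equals the mean of the group coded 0, and the slope equals the difference between the two group means. The sketch below checks this on made-up salaries; the data and the helper `ols_simple` are illustrative assumptions, not course material.

```python
from statistics import mean

# Made-up salaries (in dollars) purely for illustration; not from the course data.
mba    = [0, 0, 0, 1, 1]
salary = [50_000, 60_000, 70_000, 80_000, 100_000]

def ols_simple(x, y):
    """Return (b0, b1) for the least-squares line y-hat = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

b0, b1 = ols_simple(mba, salary)

# Group means computed directly, for comparison with the coefficients.
mean_no_mba = mean(s for m, s in zip(mba, salary) if m == 0)
mean_mba = mean(s for m, s in zip(mba, salary) if m == 1)

print(round(b0), mean_no_mba)             # intercept vs. mean salary without an MBA
print(round(b1), mean_mba - mean_no_mba)  # slope vs. difference in group means
```

In this toy data set, $b_0$ matches the mean salary of employees without an MBA, $b_0 + b_1$ matches the mean salary of employees with an MBA, and $b_1$ is the gap between the two.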

Case 2: MRM where one of the X variables is a dummy

$\widehat{salary} = b_0 + b_1\,MBA + b_2\,exper$

(1) Regression if MBA = 0:
(2) Regression if MBA = 1:
(3) Interpretation of $b_0$:
(4) Interpretation of $b_0 + b_1$:
(5) Interpretation of $b_1$:
(6) Interpretation of $b_2$:
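A sketch of the algebra for items (1) and (2), obtained by substituting MBA = 0 and MBA = 1 into the Case 2 equation. This is one way to work the slide, not the official section solution.

```latex
\begin{align*}
\text{MBA} = 0: &\quad \widehat{salary} = b_0 + b_2\,exper \\
\text{MBA} = 1: &\quad \widehat{salary} = (b_0 + b_1) + b_2\,exper
\end{align*}
```

The two lines have the same slope $b_2$ and differ only in their intercepts, so $b_1$ is the estimated salary gap between employees with and without an MBA at any fixed level of experience.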

Case 3: Interactions involving dummy variables

$\widehat{salary} = b_0 + b_1\,MBA + b_2\,exper + b_3\,(MBA \times exper)$

(1) Regression if MBA = 0:
(2) Regression if MBA = 1:
(3) Interpretation of $b_0$:
(4) Interpretation of $b_0 + b_1$:
(5) Interpretation of $b_1$:
(6) Interpretation of $b_2$:
(7) Interpretation of $b_2 + b_3$:
(8) Interpretation of $b_3$:
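A sketch of the algebra for items (1) and (2), again by substituting MBA = 0 and MBA = 1 into the Case 3 equation. It is offered as a worked illustration rather than the official section solution.

```latex
\begin{align*}
\text{MBA} = 0: &\quad \widehat{salary} = b_0 + b_2\,exper \\
\text{MBA} = 1: &\quad \widehat{salary} = (b_0 + b_1) + (b_2 + b_3)\,exper \\
\text{Difference}: &\quad b_1 + b_3\,exper
\end{align*}
```

With the interaction term, both the intercept and the slope can differ by MBA status: $b_1$ is the estimated gap at zero years of experience, and $b_3$ is the estimated difference in the return to one more year of experience.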

Practice Problem: Effect of iPad on College GPA

Suppose that the Haas Undergraduate Program Office wishes to examine the potential educational benefits of providing students with an iPad. Using data from 141 students, the table below summarizes the mean and SD of the variable collgpa (college GPA) for students with and without an iPad. Assume that all sample size conditions are met.

                Number   Mean    SD
Without iPad      85     2.989   0.321
With iPad         56     3.159   0.421

(a) Let $\bar{X}_1$ be the sample mean college GPA of students with an iPad, and $\bar{X}_0$ be the sample mean college GPA of students without an iPad. Calculate:
(i) $\bar{X}_1 - \bar{X}_0$
(ii) $se(\bar{X}_1 - \bar{X}_0)$
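The arithmetic for part (a) can be sketched in Python using the summary statistics in the table. The standard error below uses the unpooled (unequal-variance) two-sample formula, which is an assumption about the method intended in the course.

```python
import math

# Summary statistics from the table in the problem.
n0, mean0, sd0 = 85, 2.989, 0.321   # without iPad
n1, mean1, sd1 = 56, 3.159, 0.421   # with iPad

# (i) Difference in sample means.
diff = mean1 - mean0

# (ii) Standard error of the difference, assuming the unpooled formula
#      sqrt(s1^2 / n1 + s0^2 / n0).
se_diff = math.sqrt(sd1**2 / n1 + sd0**2 / n0)

print(round(diff, 3))     # 0.17
print(round(se_diff, 4))  # 0.0662
```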

Practice Problem: Effect of iPad on College GPA

(b) The program office does not plan to provide iPads to students unless there is evidence that having an iPad increases students' GPAs. What are the one-sided null and alternative hypotheses that the program office would like to test?
(c) Carry out the test from part (b) at the 1% level. Do you reject or fail to reject the null hypothesis?

                Number   Mean    SD
Without iPad      85     2.989   0.321
With iPad         56     3.159   0.421
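A minimal sketch of the test in part (c), assuming the hypotheses are $H_0: \mu_1 - \mu_0 \le 0$ versus $H_a: \mu_1 - \mu_0 > 0$, the unpooled standard error, and a normal approximation for the critical value (reasonable here because the sample size conditions are stated to hold). These choices are assumptions, not the official solution.

```python
import math
from statistics import NormalDist

# Summary statistics from the table in the problem.
diff = 3.159 - 2.989
se_diff = math.sqrt(0.421**2 / 56 + 0.321**2 / 85)  # unpooled standard error (assumed)

t_stat = diff / se_diff                 # test statistic under H0: mu1 - mu0 = 0
crit = NormalDist().inv_cdf(0.99)       # one-sided 1% critical value, normal approximation
p_value = 1 - NormalDist().cdf(t_stat)  # one-sided p-value

reject = t_stat > crit
print(round(t_stat, 2), round(crit, 3), reject)
```

Under these assumptions the test statistic is about 2.57, which exceeds the critical value of about 2.326, so the null hypothesis is rejected at the 1% level.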

Practice Problem: Effect of iPad on College GPA

(d) Can the difference in GPA between students with and without an iPad be attributed to iPad ownership alone? Explain why or why not.

                Number   Mean    SD
Without iPad      85     2.989   0.321
With iPad         56     3.159   0.421

Practice Problem: Effect of iPad on College GPA

(e) Suppose that an analyst at the program office instead estimated an SRM of college GPA on HasIPAD, a dummy variable equal to 1 if the student has an iPad, and 0 otherwise, obtaining the following regression results. Interpret each of the coefficients in the regression.

             Coefficients   Standard Error
Intercept       2.989           0.040
HasIPAD         0.170           0.063

Practice Problem: Effect of iPad on College GPA

(f) Consider again the null and alternative hypotheses from part (b).
(i) How can we use the regression coefficients in the SRM from part (e) to test the same null and alternative hypotheses as in part (b)?
(ii) Carry out the test using the regression coefficients at the 1% level (recall that the sample size is 141).
(iii) What do parts (f)(i) and (f)(ii) tell us about how we can use regressions to carry out a test for the difference of means across two groups?

             Coefficients   Standard Error
Intercept       2.989           0.040
HasIPAD         0.170           0.063
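A sketch of the regression-based test in part (f)(ii), assuming a one-sided test on the HasIPAD coefficient with a normal approximation for the critical value. The second half of the block is an added cross-check, not part of the problem: it shows that the pooled (equal-variance) two-sample standard error computed from the summary table reproduces the regression's reported 0.063.

```python
import math
from statistics import NormalDist

# HasIPAD coefficient and standard error from the regression table.
b1, se_b1 = 0.170, 0.063

t_stat = b1 / se_b1                # test statistic for H0: slope = 0
crit = NormalDist().inv_cdf(0.99)  # one-sided 1% critical value, normal approximation
reject = t_stat > crit
print(round(t_stat, 2), round(crit, 3), reject)  # t is about 2.70

# Cross-check (an addition for illustration): pooled-variance standard error
# of the difference in means, using the summary statistics from part (a).
n0, sd0, n1, sd1 = 85, 0.321, 56, 0.421
pooled_var = ((n0 - 1) * sd0**2 + (n1 - 1) * sd1**2) / (n0 + n1 - 2)
se_pooled = math.sqrt(pooled_var * (1 / n0 + 1 / n1))
print(round(se_pooled, 3))  # 0.063
```

The slope equals the difference in sample means (0.170), so testing whether the slope is positive is the same comparison as the two-sample test. The small gap between this standard error and the unpooled one arises because the regression treats the error variance as equal in both groups.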

Practice Problem: Effect of iPad on College GPA

(g) Now, suppose the following two explanatory variables were added to the SRM from part (e): skipped is the average number of lectures the student misses per week, and hsgpa is the student's high school GPA. The MRM regression results are shown below. Interpret each of the coefficients in this regression.

             Coefficients   Standard Error
Intercept       1.527           0.300
HasIPAD         0.129           0.057
skipped        -0.065           0.026
hsgpa           0.456           0.086
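To make the "holding other variables fixed" reading of the MRM concrete, the sketch below plugs the reported coefficients into a prediction function. The two students compared (one skipped lecture per week, high school GPA of 3.5) are hypothetical values chosen for illustration, and `predicted_gpa` is an illustrative helper name.

```python
def predicted_gpa(has_ipad, skipped, hsgpa):
    """Fitted college GPA from the MRM coefficients reported in the table."""
    return 1.527 + 0.129 * has_ipad - 0.065 * skipped + 0.456 * hsgpa

# Two hypothetical students identical in skipped and hsgpa, differing only in HasIPAD.
without_ipad = predicted_gpa(has_ipad=0, skipped=1, hsgpa=3.5)
with_ipad = predicted_gpa(has_ipad=1, skipped=1, hsgpa=3.5)

print(round(without_ipad, 3))              # 3.058
print(round(with_ipad, 3))                 # 3.187
print(round(with_ipad - without_ipad, 3))  # 0.129
```

Whatever values of skipped and hsgpa are chosen, the predicted gap between the two students is the HasIPAD coefficient, 0.129.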

Practice Problem: Effect of iPad on College GPA

(h) Why is the estimated GPA differential between students with and without an iPad obtained in the MRM in part (g) smaller than what is obtained in the difference in means from part (a)(i)?

Regression from part (g)

             Coefficients   Standard Error
Intercept       1.527           0.300
HasIPAD         0.129           0.057
skipped        -0.065           0.026
hsgpa           0.456           0.086

Sample means from part (a)

                Number   Mean    SD
Without iPad      85     2.989   0.321
With iPad         56     3.159   0.421

Practice Problem: Effect of iPad on College GPA

(i) How would you augment the model estimated in part (g) to allow the effect of having an iPad on college GPA to vary by gender?
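One possible specification for part (i), following the interaction form of Case 3. The dummy name Female (equal to 1 for female students and 0 otherwise) is a hypothetical label introduced here; it does not appear in the problem.

```latex
\begin{align*}
\widehat{collgpa} = {} & b_0 + b_1\,HasIPAD + b_2\,skipped + b_3\,hsgpa \\
                       & + b_4\,Female + b_5\,(HasIPAD \times Female)
\end{align*}
```

In this sketch the estimated effect of having an iPad is $b_1$ when Female = 0 and $b_1 + b_5$ when Female = 1, so $b_5$ measures how the effect differs by gender.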

Final Review: Practice Final Exam 1, Q7

Television and Risk of Death. A recent NYTimes article discussed a study about television and the risk of death.

[Figure: study results; not recoverable from the transcript]

(A) Based on the info reported in this figure, write out the regression equation(s) that you think the authors estimated to produce these results. Define the response variable and all the explanatory variables as completely as possible.

Final Review: Practice Final Exam 1, Q7

Television and Risk of Death. A recent NYTimes article discussed a study about television and the risk of death.

(B) The authors of the study concluded from these results that watching television increases the risk of death. Do you agree with the authors' interpretation? Explain.

Practice Problem: Lightbulbs

Suppose that a lightbulb manufacturing plant produces bulbs with a mean life of 2000 hours and a standard deviation of 200 hours. An inventor claims to have developed an improved process that produces bulbs with a longer mean life and the same standard deviation.

(a) Let µ denote the mean bulb life under the new process. What are the null and alternative hypotheses that the plant manager wishes to test?

Practice Problem: Lightbulbs

(b) The plant manager randomly selects a sample of 100 bulbs produced by the new process to test the inventor's claim. Now, suppose that the plant manager uses the following procedure: she will believe the inventor's claim if the sample mean life of the bulbs is > 2100 hours. Assume all SSCs hold.
(i) What is the probability that the plant manager makes a Type I error?
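A sketch of the Type I error calculation, assuming the null hypothesis is evaluated at µ = 2000 and that the sample mean is approximately normal with standard error 200/√100 = 20 (which the stated sample size conditions support). It is an illustration, not the official solution.

```python
import math
from statistics import NormalDist

mu0, sigma, n, threshold = 2000, 200, 100, 2100

se = sigma / math.sqrt(n)  # standard error of the sample mean: 20 hours

# Type I error: probability the sample mean exceeds 2100 when the true mean is 2000.
alpha = 1 - NormalDist(mu0, se).cdf(threshold)

print(se)     # 20.0
print(alpha)  # roughly 2.9e-07, i.e. a z-score of 5
```

The threshold sits 5 standard errors above the null mean, so a false rejection is extremely unlikely under this procedure.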

Practice Problem: Lightbulbs

(b) The plant manager randomly selects a sample of 100 bulbs produced by the new process to test the inventor's claim. Now, suppose that the plant manager uses the following procedure: she will believe the inventor's claim if the sample mean life of the bulbs is > 2100 hours. Assume all SSCs hold.
(ii) Suppose that the new process is in fact better and has a mean bulb life of 2150 hours. What is the probability that the plant manager's procedure correctly rejects $H_0$?
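A sketch of the power calculation, assuming the sample mean is approximately normal with mean 2150 and standard error 200/√100 = 20 under the stated alternative. It is an illustration, not the official solution.

```python
import math
from statistics import NormalDist

mu_true, sigma, n, threshold = 2150, 200, 100, 2100

se = sigma / math.sqrt(n)  # standard error of the sample mean: 20 hours

# Power: probability the sample mean exceeds 2100 when the true mean is 2150.
power = 1 - NormalDist(mu_true, se).cdf(threshold)

print(round(power, 4))  # 0.9938
```

The threshold is 2.5 standard errors below the true mean, so the procedure rejects $H_0$ about 99.4% of the time in this scenario.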

Practice Problem: Lightbulbs

(b) The plant manager randomly selects a sample of 100 bulbs produced by the new process to test the inventor's claim. Now, suppose that the plant manager uses the following procedure: she will believe the inventor's claim if the sample mean life of the bulbs is > 2100 hours. Assume all SSCs hold.
(iii) What action threshold should the plant manager use if she wants the probability of falsely rejecting the null hypothesis to be 5%?
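A sketch of the threshold calculation, assuming a one-sided test evaluated at µ = 2000 and a normal sampling distribution for the sample mean with standard error 200/√100 = 20. It is an illustration, not the official solution.

```python
import math
from statistics import NormalDist

mu0, sigma, n, alpha = 2000, 200, 100, 0.05

se = sigma / math.sqrt(n)  # standard error of the sample mean: 20 hours

# Threshold c such that P(sample mean > c | mu = 2000) = 5%,
# i.e. the 95th percentile of the sampling distribution under H0.
threshold = NormalDist(mu0, se).inv_cdf(1 - alpha)

print(round(threshold, 1))  # 2032.9
```

This is the null mean plus about 1.645 standard errors, far below the manager's original cutoff of 2100 hours.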