Name: emergency please discuss this with the exam proctor. 6. Vanderbilt s academic honor code applies.

Size: px
Start display at page:

Download "Name: emergency please discuss this with the exam proctor. 6. Vanderbilt s academic honor code applies."

Transcription

1 Name: Biostatistics 1 st year Comprehensive Examination: Applied in-class exam May 28 th, 2015: 9am to 1pm Instructions: 1. There are seven questions and 12 pages. 2. Read each question carefully. Answer to the best of your ability. 3. Be as specific as possible and write as clearly as possible. 4. This is an in- class examination; do not discuss any part of this exam with anyone while you are taking the exam. NO BOOKS, NO NOTES, NO INTERNET DEVICES, NO CALCULATORS, NO OUTSIDE ASSISTANCE. 5. You may leave the examination room to use the restroom or to step out into the hallway for a short breather. HOWEVER, YOU MUST LEAVE YOUR CELL PHONE AND ALL EXAM MATERIALS IN THE EXAMINATION ROOM. If there is an emergency please discuss this with the exam proctor. 6. Vanderbilt s academic honor code applies. Question Points Score Comments Total 256

2 1. These are True or False questions. Use a separate sheet of paper to indicate which option (True or False) you are choosing for each answer. Write a brief justification for each answer (1-3 sentences). A new blood pressure medication is tested against a placebo. The p- value testing the null hypothesis of no- effect of medication is a. True or False: It is more likely than not that the drug has no effect. A new blood pressure medication is tested against a placebo. The likelihood ratio comparing the hypotheses of a 5- mmhg difference in favor of the drug working versus the hypothesis of no difference between drug and placebo is b. True or False: It is more likely than not that the drug has no effect. A new blood pressure medication is tested against a placebo. The posterior probability that the mean difference between the drug and placebo is less than 0 is c. True or False: It is more likely than not that the drug has no effect. A new blood pressure medication is tested against a placebo in a randomized controlled trial with 10 subjects in each arm. The outcome measure is systolic blood pressure, in mmhg units, at 1 month after starting therapy. d. True or False: In this setting, a two- sample unequal variance t- test will be more efficient (more powerful) than a Wilcoxon- Mann- Whitney test. e. True or False: In this setting, a two- sample unequal variance t- test will be more efficient (more powerful) than a Z- test that uses pilot data to determine the assumed standard deviations. f. True or False: For any set of outcome data, there exists a prior distribution such that a 95% credible interval for the difference in mean systolic blood pressure will exclude 0. 1 of 12

3 2. Consider the following R code: Question 2 continued: # initialize variables reps <- 10^4 x <- rep( NA, reps ) y <- rep( NA, reps ) z <- rep( NA, reps ) # run loops for( i in 1:reps ){ a <- rnorm( n=2, mean=1, sd=1 ) b <- rnorm( n=2, mean=1, sd=1 ) c <- mean(a) - mean(b) d <- 2*pnorm( abs(c), lower.tail=f ) x[i] <- (d < 0.05 ) f <- rnorm( n=2, mean=2.96, sd=1 ) g <- mean(a) - mean(f) h <- 2*pnorm( abs(g), lower.tail=f ) y[i] <- (h < 0.05 ) k <- wilcox.test( a, f )$p.value z[i] <- (k < 0.05 ) } # summarize results x.mean <- mean(x) y.mean <- mean(y) z.mean <- mean(z) a. Describe the values f will take as explicitly as possible. b. Describe the values c will take as explicitly as possible. c. Make an educated guess for the value of x.mean. Explain your guess or explain why no reasonable guess can be made. d. Make an educated guess for the value of y.mean. Explain your guess or explain why no reasonable guess can be made. e. Make an educated guess for the value of z.mean. Explain your guess or explain why no reasonable guess can be made. f. As reps goes to infinity, x.mean will converge to some constant, say x.mu. What value of reps will ensure that x.mean is sufficiently close to x.mu? That is, find reps such that PP x.mean x.mu = 0.997? Write out a formula and simplify as much as possible; you do not have to solve this numerically. 2 of 12

4 3. A two- arm randomized controlled trial of a new anti- diabetic medication was tested against a placebo. HbA1c, glycated haemoglobin, was measured three months after randomly assigned therapy was begun. HbA1c is used to assess a patient s average blood sugar levels over a period of months. A table summarizing key data from this trial follows; STATA output for these data are on the following page. HbA1c Treatment N Mean Standard Deviation Drug Placebo a. Using standard notation, write out the null and alternative hypotheses for a two- sample equal variance t- test of HbA1c levels on drug and placebo. b. Write out a test statistic that can be used to test the hypothesis from part (a) and insert the appropriate numbers from the table above (do not solve it). c. Interpret the STATA output using a formal hypothesis test with a pre- specified size of 5%. Provide a correct interpretation that is also suitable for a non- statistician. d. Interpret the STATA output using a formal significance test with a 5% significance level. Provide a correct interpretation that is also suitable for a non- statistician. e. Interpret the STATA output using an approach other than classical testing. Provide a correct interpretation that is also suitable for a non- statistician. If your ideal statistics are not reported here, define the missing statistics and provide an example to illustrate how they would be interpreted. f. The sample standard deviations are very close in this example. What would be a potential advantage of using an equal- variance t- test in this case? g. Suppose the data on the placebo arm was replaced by historical data from ten million (10^7) patients. As a group, these patients were known to be highly representative of the population of interest in terms of the central tendency of HbA1c but not representative in terms of its dispersion. Propose a test statistic for comparing drug to placebo that makes use of this essentially infinite sample of placebo patients. h. Propose and justify the degrees of freedom for the test you suggest in part (g). 3 of 12

5 STATA Output for Question #3 Two-sample t test with unequal variances Obs Mean Std. Err. Std. Dev. [95% Conf. Interval] x y combined diff diff = mean(x) - mean(y) t = Ho: diff = 0 Satterthwaite's degrees of freedom = Ha: diff < 0 Ha: diff!= 0 Ha: diff > 0 Pr(T < t) = Pr( T > t ) = Pr(T > t) = of 12

6 4. A three arm randomized controlled trial for treating seasonal affective disorder (SAD) looked at the alleviation of SAD symptoms three weeks after beginning a therapy of cognitive- behavioral therapy (CBT), light therapy (LT), or placebo. A table summarizing key trial data follows; STATA output for these data are on the following page. Symptoms alleviated at three weeks? Therapy Yes No Cognitive- behavioral therapy Light therapy Placebo a. Using standard notation, write out the null and alternative hypotheses for comparing the effectiveness of therapy between two arms. Write down the standard large- sample Wald test statistic for testing the hypotheses. b. Use the STATA output to find an observed Z- statistic for comparing the proportion of alleviated symptoms between the cognitive- behavior therapy and placebo arms. Does this Z- statistic correspond to the test in part (a)? If not, explain why. c. Using standard notation, write out the null and alternative hypotheses that are associated with the standard Chi- square test for the table above. Comment on the role these hypotheses play in understanding the effectiveness of therapy over placebo. The analysis plan called for first using a 5% significance level for the omnibus test. Then, if the omnibus test rejects, all pairwise tests between arms would be performed at a 1.67% significance level. d. Use the STATA output to implement this plan and interpret the results. Explain your conclusions and translate them into practical advice for health care providers. If there is additional information you would have liked to see, explain why that information is important/useful. e. The p- values for two of three pairwise comparisons are less than the p- value for the omnibus test. However, the omnibus test uses all of the data and thus has a larger sample size than the pairwise tests. How is it possible that the omnibus test has a larger p- value? Explain your reasoning. f. Is the family- wise Type I Error rate for the three pairwise tests, as implemented in the analysis plan, less than 5%, equal to 5% or greater than 5%? Justify your answer. 5 of 12

7 STATA Output for Question #4 (Page 1 of 2). tabi \ \ 11 31, row chi2 col row 1 2 Total Total Pearson chi2(2) = Pr = cii 87 44, wald -- Binomial Wald --- Variable Obs Mean Std. Err. [95% Conf. Interval] cii 59 32, wald -- Binomial Wald --- Variable Obs Mean Std. Err. [95% Conf. Interval] cii 42 11, wald -- Binomial Wald --- Variable Obs Mean Std. Err. [95% Conf. Interval] of 12

8 STATA Output for Question #4 (Page 2 of 2). tabi \ 32 27, row chi2 col row 1 2 Total Total Pearson chi2(1) = Pr = tabi \ 11 31, row chi2 col row 1 2 Total Total Pearson chi2(1) = Pr = tabi \ 11 31, row chi2 col row 1 2 Total Total Pearson chi2(1) = Pr = of 12

9 5. For each of the following models, indicate whether it is a linear regression model, an intrinsically linear regression model, or neither of these. Justify your indication. [A model is intrinsically linear if it can be expressed in a linear form by a suitable transformation.] In each case, εε is a random error term. a. Model (a): YY = ββ + ββ XX + ββ log XX + ββ XX + εε b. Model (b): YY = εε exp ββ + ββ XX + ββ XX c. Model (c): YY = ββ + exp ββ XX + ββ XX + εε d. Model (d): YY = δδ XX / δδ εε + 1 XX + δδ e. Propose a computational solution that would allow you to compute an approximate 95% confidence interval for δδ + δδ ^2 from model (d). Provide enough detail to explain how to implement the procedure. [Code not necessary.] f. Suppose you wanted to compare model (a) and model (c) for model selection. Explain how to do this using a likelihood ratio test with a 5% significance level. Write down the computational steps needed to carry it this test. Provide enough detail to explain how to implement the procedure. [Code not necessary.] 8 of 12

10 6. National health, welfare, and education statistics for 210 places, mostly UN members, were collected. Measured social and health variables included fertility (number of children per woman), ppgdp (per capita gross domestic product in US dollars), and lifeexpf (female life expectancy in years). The results of the regression are shown below. log ffffffffffffffffff = ββ + ββ log pppppppppp + ββ llllllllllllllll + ee a. Provide a 95% confidence interval for the coefficient of lifeexpf and interpret both the interval and the coefficient. b. Provide a 95% confidence interval for the intercept and interpret both the interval and the coefficient. c. Suppose llllllllllllllll was re- centered at the population mean of llllllllllllllll and the regression refit. Which coefficients would change? Explain your reasoning. d. What is the correlation between the transformed response, log ffffffffffffffffff, and its fitted value? 9 of 12

11 STATA Output for Question #6 Source SS df MS Number of obs = F( 2, 196) = Model Prob > F = Residual R-squared = Adj R-squared = Total Root MSE = lfert Coef. Std. Err. t P> t [95% Conf. Interval] lppg lifeexpf _cons of 12

12 7. A business analyst studied one- way airfare (US dollars) and distance (miles) from city A to 17 other cities in the US. The focus was on modeling airfare as a function of distance. The first model fit to the data was FFFFFFFF = ββ + ββ DDDDDDDDDDDDDDDD + ee a. Based on the output for the model (shown on next page) the analyst concluded the following: The regression coefficient of the predictor variable, Distance, is highly statistically significant and the model explains 99.4% of the variability in the Y- variable, Fare. Thus the model is highly effective for both understanding the effects of Distance on Fare and for predicting future values of Fare given the value of the predictor variable, Distance. Critique this conclusion. b. Does the ordinary simple regression model appear to fit the data well? Model output and diagnostics are shown on the next page. Explain your answer. For example, if your answer is no, also describe in detail how the model can be improved. 11 of 12

13 STATA Output for Question #7 Source SS df MS Number of obs = F( 1, 15) = Model Prob > F = Residual R-squared = Adj R-squared = Total Root MSE = fare Coef. Std. Err. t P> t [95% Conf. Interval] distance _cons Plot left shows data and fitted regression line. Plot right shows the standardized residual plot for the simple regression. 12 of 12

Age (continuous) Gender (0=Male, 1=Female) SES (1=Low, 2=Medium, 3=High) Prior Victimization (0= Not Victimized, 1=Victimized)

Age (continuous) Gender (0=Male, 1=Female) SES (1=Low, 2=Medium, 3=High) Prior Victimization (0= Not Victimized, 1=Victimized) Criminal Justice Doctoral Comprehensive Exam Statistics August 2016 There are two questions on this exam. Be sure to answer both questions in the 3 and half hours to complete this exam. Read the instructions

More information

Notes for laboratory session 2

Notes for laboratory session 2 Notes for laboratory session 2 Preliminaries Consider the ordinary least-squares (OLS) regression of alcohol (alcohol) and plasma retinol (retplasm). We do this with STATA as follows:. reg retplasm alcohol

More information

Multiple Linear Regression Analysis

Multiple Linear Regression Analysis Revised July 2018 Multiple Linear Regression Analysis This set of notes shows how to use Stata in multiple regression analysis. It assumes that you have set Stata up on your computer (see the Getting Started

More information

Final Exam - section 2. Thursday, December hours, 30 minutes

Final Exam - section 2. Thursday, December hours, 30 minutes Econometrics, ECON312 San Francisco State University Michael Bar Fall 2011 Final Exam - section 2 Thursday, December 15 2 hours, 30 minutes Name: Instructions 1. This is closed book, closed notes exam.

More information

Cross-over trials. Martin Bland. Cross-over trials. Cross-over trials. Professor of Health Statistics University of York

Cross-over trials. Martin Bland. Cross-over trials. Cross-over trials. Professor of Health Statistics University of York Cross-over trials Martin Bland Professor of Health Statistics University of York http://martinbland.co.uk Cross-over trials Use the participant as their own control. Each participant gets more than one

More information

ANOVA. Thomas Elliott. January 29, 2013

ANOVA. Thomas Elliott. January 29, 2013 ANOVA Thomas Elliott January 29, 2013 ANOVA stands for analysis of variance and is one of the basic statistical tests we can use to find relationships between two or more variables. ANOVA compares the

More information

STA 3024 Spring 2013 EXAM 3 Test Form Code A UF ID #

STA 3024 Spring 2013 EXAM 3 Test Form Code A UF ID # STA 3024 Spring 2013 Name EXAM 3 Test Form Code A UF ID # Instructions: This exam contains 34 Multiple Choice questions. Each question is worth 3 points, for a total of 102 points (there are TWO bonus

More information

Midterm STAT-UB.0003 Regression and Forecasting Models. I will not lie, cheat or steal to gain an academic advantage, or tolerate those who do.

Midterm STAT-UB.0003 Regression and Forecasting Models. I will not lie, cheat or steal to gain an academic advantage, or tolerate those who do. Midterm STAT-UB.0003 Regression and Forecasting Models The exam is closed book and notes, with the following exception: you are allowed to bring one letter-sized page of notes into the exam (front and

More information

Biostatistics 2 nd year Comprehensive Examination. Due: May 31 st, 2013 by 5pm. Instructions:

Biostatistics 2 nd year Comprehensive Examination. Due: May 31 st, 2013 by 5pm. Instructions: Biostatistics 2 nd year Comprehensive Examination Due: May 31 st, 2013 by 5pm. Instructions: 1. The exam is divided into two parts. There are 6 questions in section I and 2 questions in section II. 2.

More information

Psych 5741/5751: Data Analysis University of Boulder Gary McClelland & Charles Judd. Exam #2, Spring 1992

Psych 5741/5751: Data Analysis University of Boulder Gary McClelland & Charles Judd. Exam #2, Spring 1992 Exam #2, Spring 1992 Question 1 A group of researchers from a neurobehavioral institute are interested in the relationships that have been found between the amount of cerebral blood flow (CB FLOW) to the

More information

MULTIPLE REGRESSION OF CPS DATA

MULTIPLE REGRESSION OF CPS DATA MULTIPLE REGRESSION OF CPS DATA A further inspection of the relationship between hourly wages and education level can show whether other factors, such as gender and work experience, influence wages. Linear

More information

Multiple Regression. James H. Steiger. Department of Psychology and Human Development Vanderbilt University

Multiple Regression. James H. Steiger. Department of Psychology and Human Development Vanderbilt University Multiple Regression James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) Multiple Regression 1 / 19 Multiple Regression 1 The Multiple

More information

Self-assessment test of prerequisite knowledge for Biostatistics III in R

Self-assessment test of prerequisite knowledge for Biostatistics III in R Self-assessment test of prerequisite knowledge for Biostatistics III in R Mark Clements, Karolinska Institutet 2017-10-31 Participants in the course Biostatistics III are expected to have prerequisite

More information

This tutorial presentation is prepared by. Mohammad Ehsanul Karim

This tutorial presentation is prepared by. Mohammad Ehsanul Karim STATA: The Red tutorial STATA: The Red tutorial This tutorial presentation is prepared by Mohammad Ehsanul Karim ehsan.karim@gmail.com STATA: The Red tutorial This tutorial presentation is prepared by

More information

m 11 m.1 > m 12 m.2 risk for smokers risk for nonsmokers

m 11 m.1 > m 12 m.2 risk for smokers risk for nonsmokers SOCY5061 RELATIVE RISKS, RELATIVE ODDS, LOGISTIC REGRESSION RELATIVE RISKS: Suppose we are interested in the association between lung cancer and smoking. Consider the following table for the whole population:

More information

SUMMER 2011 RE-EXAM PSYF11STAT - STATISTIK

SUMMER 2011 RE-EXAM PSYF11STAT - STATISTIK SUMMER 011 RE-EXAM PSYF11STAT - STATISTIK Full Name: Årskortnummer: Date: This exam is made up of three parts: Part 1 includes 30 multiple choice questions; Part includes 10 matching questions; and Part

More information

1. Objective: analyzing CD4 counts data using GEE marginal model and random effects model. Demonstrate the analysis using SAS and STATA.

1. Objective: analyzing CD4 counts data using GEE marginal model and random effects model. Demonstrate the analysis using SAS and STATA. LDA lab Feb, 6 th, 2002 1 1. Objective: analyzing CD4 counts data using GEE marginal model and random effects model. Demonstrate the analysis using SAS and STATA. 2. Scientific question: estimate the average

More information

Midterm Exam ANSWERS Categorical Data Analysis, CHL5407H

Midterm Exam ANSWERS Categorical Data Analysis, CHL5407H Midterm Exam ANSWERS Categorical Data Analysis, CHL5407H 1. Data from a survey of women s attitudes towards mammography are provided in Table 1. Women were classified by their experience with mammography

More information

ECON Introductory Econometrics Seminar 7

ECON Introductory Econometrics Seminar 7 ECON4150 - Introductory Econometrics Seminar 7 Stock and Watson EE11.2 April 28, 2015 Stock and Watson EE11.2 ECON4150 - Introductory Econometrics Seminar 7 April 28, 2015 1 / 25 E. 11.2 b clear set more

More information

Regression models, R solution day7

Regression models, R solution day7 Regression models, R solution day7 Exercise 1 In this exercise, we shall look at the differences in vitamin D status for women in 4 European countries Read and prepare the data: vit

More information

Business Statistics Probability

Business Statistics Probability Business Statistics The following was provided by Dr. Suzanne Delaney, and is a comprehensive review of Business Statistics. The workshop instructor will provide relevant examples during the Skills Assessment

More information

Name: Biostatistics 1 st year Comprehensive Examination: Applied Take Home exam. Due May 29 th, 2015 by 5pm. Late exams will not be accepted.

Name: Biostatistics 1 st year Comprehensive Examination: Applied Take Home exam. Due May 29 th, 2015 by 5pm. Late exams will not be accepted. Name: Biostatistics 1 st year Comprehensive Examination: Applied Take Home exam Due May 29 th, 2015 by 5pm. Late exams will not be accepted. Instructions: 1. There are 2 questions and 4 pages. Answer each

More information

Binary Diagnostic Tests Two Independent Samples

Binary Diagnostic Tests Two Independent Samples Chapter 537 Binary Diagnostic Tests Two Independent Samples Introduction An important task in diagnostic medicine is to measure the accuracy of two diagnostic tests. This can be done by comparing summary

More information

Basic statistics for public health and policy

Basic statistics for public health and policy PHM102 Basic statistics for public health and policy FORMATIVE ASSIGNMENT Write (preferably type*) answers to BOTH parts of this assignment. Show your working and give reasons for your answers. *See Hint

More information

MODEL I: DRINK REGRESSED ON GPA & MALE, WITHOUT CENTERING

MODEL I: DRINK REGRESSED ON GPA & MALE, WITHOUT CENTERING Interpreting Interaction Effects; Interaction Effects and Centering Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 20, 2015 Models with interaction effects

More information

Business Research Methods. Introduction to Data Analysis

Business Research Methods. Introduction to Data Analysis Business Research Methods Introduction to Data Analysis Data Analysis Process STAGES OF DATA ANALYSIS EDITING CODING DATA ENTRY ERROR CHECKING AND VERIFICATION DATA ANALYSIS Introduction Preparation of

More information

Introduction to regression

Introduction to regression Introduction to regression Regression describes how one variable (response) depends on another variable (explanatory variable). Response variable: variable of interest, measures the outcome of a study

More information

Regression Output: Table 5 (Random Effects OLS) Random-effects GLS regression Number of obs = 1806 Group variable (i): subject Number of groups = 70

Regression Output: Table 5 (Random Effects OLS) Random-effects GLS regression Number of obs = 1806 Group variable (i): subject Number of groups = 70 Regression Output: Table 5 (Random Effects OLS) Random-effects GLS regression Number of obs = 1806 R-sq: within = 0.1498 Obs per group: min = 18 between = 0.0205 avg = 25.8 overall = 0.0935 max = 28 Random

More information

Review and Wrap-up! ESP 178 Applied Research Methods Calvin Thigpen 3/14/17 Adapted from presentation by Prof. Susan Handy

Review and Wrap-up! ESP 178 Applied Research Methods Calvin Thigpen 3/14/17 Adapted from presentation by Prof. Susan Handy Review and Wrap-up! ESP 178 Applied Research Methods Calvin Thigpen 3/14/17 Adapted from presentation by Prof. Susan Handy Final Proposals Read instructions carefully! Check Canvas for our comments on

More information

NEUROBLASTOMA DATA -- TWO GROUPS -- QUANTITATIVE MEASURES 38 15:37 Saturday, January 25, 2003

NEUROBLASTOMA DATA -- TWO GROUPS -- QUANTITATIVE MEASURES 38 15:37 Saturday, January 25, 2003 NEUROBLASTOMA DATA -- TWO GROUPS -- QUANTITATIVE MEASURES 38 15:37 Saturday, January 25, 2003 Obs GROUP I DOPA LNDOPA 1 neurblst 1 48.000 1.68124 2 neurblst 1 133.000 2.12385 3 neurblst 1 34.000 1.53148

More information

Sociology Exam 3 Answer Key [Draft] May 9, 201 3

Sociology Exam 3 Answer Key [Draft] May 9, 201 3 Sociology 63993 Exam 3 Answer Key [Draft] May 9, 201 3 I. True-False. (20 points) Indicate whether the following statements are true or false. If false, briefly explain why. 1. Bivariate regressions are

More information

Use the above variables and any you might need to construct to specify the MODEL A/C comparisons you would use to ask the following questions.

Use the above variables and any you might need to construct to specify the MODEL A/C comparisons you would use to ask the following questions. Fall, 2002 Grad Stats Final Exam There are four questions on this exam, A through D, and each question has multiple sub-questions. Unless otherwise indicated, each sub-question is worth 3 points. Question

More information

Sociology 63993, Exam1 February 12, 2015 Richard Williams, University of Notre Dame,

Sociology 63993, Exam1 February 12, 2015 Richard Williams, University of Notre Dame, Sociology 63993, Exam1 February 12, 2015 Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ I. True-False. (20 points) Indicate whether the following statements are true or false.

More information

MULTIPLE LINEAR REGRESSION 24.1 INTRODUCTION AND OBJECTIVES OBJECTIVES

MULTIPLE LINEAR REGRESSION 24.1 INTRODUCTION AND OBJECTIVES OBJECTIVES 24 MULTIPLE LINEAR REGRESSION 24.1 INTRODUCTION AND OBJECTIVES In the previous chapter, simple linear regression was used when you have one independent variable and one dependent variable. This chapter

More information

An Introduction to Bayesian Statistics

An Introduction to Bayesian Statistics An Introduction to Bayesian Statistics Robert Weiss Department of Biostatistics UCLA Fielding School of Public Health robweiss@ucla.edu Sept 2015 Robert Weiss (UCLA) An Introduction to Bayesian Statistics

More information

Statistical analysis DIANA SAPLACAN 2017 * SLIDES ADAPTED BASED ON LECTURE NOTES BY ALMA LEORA CULEN

Statistical analysis DIANA SAPLACAN 2017 * SLIDES ADAPTED BASED ON LECTURE NOTES BY ALMA LEORA CULEN Statistical analysis DIANA SAPLACAN 2017 * SLIDES ADAPTED BASED ON LECTURE NOTES BY ALMA LEORA CULEN Vs. 2 Background 3 There are different types of research methods to study behaviour: Descriptive: observations,

More information

Statistical reports Regression, 2010

Statistical reports Regression, 2010 Statistical reports Regression, 2010 Niels Richard Hansen June 10, 2010 This document gives some guidelines on how to write a report on a statistical analysis. The document is organized into sections that

More information

Data Analysis in the Health Sciences. Final Exam 2010 EPIB 621

Data Analysis in the Health Sciences. Final Exam 2010 EPIB 621 Data Analysis in the Health Sciences Final Exam 2010 EPIB 621 Student s Name: Student s Number: INSTRUCTIONS This examination consists of 8 questions on 17 pages, including this one. Tables of the normal

More information

Modeling unobserved heterogeneity in Stata

Modeling unobserved heterogeneity in Stata Modeling unobserved heterogeneity in Stata Rafal Raciborski StataCorp LLC November 27, 2017 Rafal Raciborski (StataCorp) Modeling unobserved heterogeneity November 27, 2017 1 / 59 Plan of the talk Concepts

More information

Sensitivity, Specificity and Predictive Value [adapted from Altman and Bland BMJ.com]

Sensitivity, Specificity and Predictive Value [adapted from Altman and Bland BMJ.com] Sensitivity, Specificity and Predictive Value [adapted from Altman and Bland BMJ.com] The simplest diagnostic test is one where the results of an investigation, such as an x ray examination or biopsy,

More information

Dr. Kelly Bradley Final Exam Summer {2 points} Name

Dr. Kelly Bradley Final Exam Summer {2 points} Name {2 points} Name You MUST work alone no tutors; no help from classmates. Email me or see me with questions. You will receive a score of 0 if this rule is violated. This exam is being scored out of 00 points.

More information

SAS Data Setup: SPSS Data Setup: STATA Data Setup: Hoffman ICPSR Example 5 page 1

SAS Data Setup: SPSS Data Setup: STATA Data Setup: Hoffman ICPSR Example 5 page 1 Hoffman ICPSR Example 5 page 1 Example 5: Examining BP and WP Effects of Negative Mood Predicting Next-Morning Glucose (complete data, syntax, and output available for SAS, SPSS, and STATA electronically)

More information

Common Statistical Issues in Biomedical Research

Common Statistical Issues in Biomedical Research Common Statistical Issues in Biomedical Research Howard Cabral, Ph.D., M.P.H. Boston University CTSI Boston University School of Public Health Department of Biostatistics May 15, 2013 1 Overview of Basic

More information

Least likely observations in regression models for categorical outcomes

Least likely observations in regression models for categorical outcomes The Stata Journal (2002) 2, Number 3, pp. 296 300 Least likely observations in regression models for categorical outcomes Jeremy Freese University of Wisconsin Madison Abstract. This article presents a

More information

Psychology Research Process

Psychology Research Process Psychology Research Process Logical Processes Induction Observation/Association/Using Correlation Trying to assess, through observation of a large group/sample, what is associated with what? Examples:

More information

Analysis of Variance (ANOVA)

Analysis of Variance (ANOVA) Research Methods and Ethics in Psychology Week 4 Analysis of Variance (ANOVA) One Way Independent Groups ANOVA Brief revision of some important concepts To introduce the concept of familywise error rate.

More information

4. STATA output of the analysis

4. STATA output of the analysis Biostatistics(1.55) 1. Objective: analyzing epileptic seizures data using GEE marginal model in STATA.. Scientific question: Determine whether the treatment reduces the rate of epileptic seizures. 3. Dataset:

More information

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo Business Statistics The following was provided by Dr. Suzanne Delaney, and is a comprehensive review of Business Statistics. The workshop instructor will provide relevant examples during the Skills Assessment

More information

Threats and Analysis. Shawn Cole. Harvard Business School

Threats and Analysis. Shawn Cole. Harvard Business School Threats and Analysis Shawn Cole Harvard Business School Course Overview 1. What is Evaluation? 2. Outcomes, Impact, and Indicators 3. Why Randomize? 4. How to Randomize? 5. Sampling and Sample Size 6.

More information

Content. Basic Statistics and Data Analysis for Health Researchers from Foreign Countries. Research question. Example Newly diagnosed Type 2 Diabetes

Content. Basic Statistics and Data Analysis for Health Researchers from Foreign Countries. Research question. Example Newly diagnosed Type 2 Diabetes Content Quantifying association between continuous variables. Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma siersma@sund.ku.dk The Research Unit for General

More information

HS Exam 1 -- March 9, 2006

HS Exam 1 -- March 9, 2006 Please write your name on the back. Don t forget! Part A: Short answer, multiple choice, and true or false questions. No use of calculators, notes, lab workbooks, cell phones, neighbors, brain implants,

More information

One-Way ANOVAs t-test two statistically significant Type I error alpha null hypothesis dependant variable Independent variable three levels;

One-Way ANOVAs t-test two statistically significant Type I error alpha null hypothesis dependant variable Independent variable three levels; 1 One-Way ANOVAs We have already discussed the t-test. The t-test is used for comparing the means of two groups to determine if there is a statistically significant difference between them. The t-test

More information

Inferential Statistics

Inferential Statistics Inferential Statistics and t - tests ScWk 242 Session 9 Slides Inferential Statistics Ø Inferential statistics are used to test hypotheses about the relationship between the independent and the dependent

More information

THE STATSWHISPERER. Introduction to this Issue. Doing Your Data Analysis INSIDE THIS ISSUE

THE STATSWHISPERER. Introduction to this Issue. Doing Your Data Analysis INSIDE THIS ISSUE Spring 20 11, Volume 1, Issue 1 THE STATSWHISPERER The StatsWhisperer Newsletter is published by staff at StatsWhisperer. Visit us at: www.statswhisperer.com Introduction to this Issue The current issue

More information

STP 231 Example FINAL

STP 231 Example FINAL STP 231 Example FINAL Instructor: Ela Jackiewicz Honor Statement: I have neither given nor received information regarding this exam, and I will not do so until all exams have been graded and returned.

More information

Multivariate dose-response meta-analysis: an update on glst

Multivariate dose-response meta-analysis: an update on glst Multivariate dose-response meta-analysis: an update on glst Nicola Orsini Unit of Biostatistics Unit of Nutritional Epidemiology Institute of Environmental Medicine Karolinska Institutet http://www.imm.ki.se/biostatistics/

More information

Today: Binomial response variable with an explanatory variable on an ordinal (rank) scale.

Today: Binomial response variable with an explanatory variable on an ordinal (rank) scale. Model Based Statistics in Biology. Part V. The Generalized Linear Model. Single Explanatory Variable on an Ordinal Scale ReCap. Part I (Chapters 1,2,3,4), Part II (Ch 5, 6, 7) ReCap Part III (Ch 9, 10,

More information

Lecture 21. RNA-seq: Advanced analysis

Lecture 21. RNA-seq: Advanced analysis Lecture 21 RNA-seq: Advanced analysis Experimental design Introduction An experiment is a process or study that results in the collection of data. Statistical experiments are conducted in situations in

More information

Name MATH0021Final Exam REVIEW UPDATED 8/18. MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Name MATH0021Final Exam REVIEW UPDATED 8/18. MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Name MATH0021Final Exam REVIEW UPDATED 8/18 MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the equation. 1) 6x - (3x - 1) = 2 1) A) - 1 3 B)

More information

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. 1) 1) A) B) C) D)

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. 1) 1) A) B) C) D) Exam Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. 1) 1) A) B) C) D) Decide whether or not the conditions and assumptions for inference with

More information

CLINICAL RESEARCH METHODS VISP356. MODULE LEADER: PROF A TOMLINSON B.Sc./B.Sc.(HONS) OPTOMETRY

CLINICAL RESEARCH METHODS VISP356. MODULE LEADER: PROF A TOMLINSON B.Sc./B.Sc.(HONS) OPTOMETRY DIVISION OF VISION SCIENCES SESSION: 2005/2006 DIET: 1ST CLINICAL RESEARCH METHODS VISP356 LEVEL: MODULE LEADER: PROF A TOMLINSON B.Sc./B.Sc.(HONS) OPTOMETRY MAY 2006 DURATION: 2 HRS CANDIDATES SHOULD

More information

SPSS output for 420 midterm study

SPSS output for 420 midterm study Ψ Psy Midterm Part In lab (5 points total) Your professor decides that he wants to find out how much impact amount of study time has on the first midterm. He randomly assigns students to study for hours,

More information

MMI 409 Spring 2009 Final Examination Gordon Bleil. 1. Is there a difference in depression as a function of group and drug?

MMI 409 Spring 2009 Final Examination Gordon Bleil. 1. Is there a difference in depression as a function of group and drug? MMI 409 Spring 2009 Final Examination Gordon Bleil Table of Contents Research Scenario and General Assumptions Questions for Dataset (Questions are hyperlinked to detailed answers) 1. Is there a difference

More information

2. Scientific question: Determine whether there is a difference between boys and girls with respect to the distance and its change over time.

2. Scientific question: Determine whether there is a difference between boys and girls with respect to the distance and its change over time. LDA lab Feb, 11 th, 2002 1 1. Objective:analyzing dental data using ordinary least square (OLS) and Generalized Least Square(GLS) in STATA. 2. Scientific question: Determine whether there is a difference

More information

The Association Design and a Continuous Phenotype

The Association Design and a Continuous Phenotype PSYC 5102: Association Design & Continuous Phenotypes (4/4/07) 1 The Association Design and a Continuous Phenotype The purpose of this note is to demonstrate how to perform a population-based association

More information

Final Exam Version A

Final Exam Version A Final Exam Version A Open Book and Notes your 4-digit code: Staple the question sheets to your answers Write your name only once on the back of this sheet. Problem 1: (10 points) A popular method to isolate

More information

Lessons in biostatistics

Lessons in biostatistics Lessons in biostatistics The test of independence Mary L. McHugh Department of Nursing, School of Health and Human Services, National University, Aero Court, San Diego, California, USA Corresponding author:

More information

Basic Biostatistics. Chapter 1. Content

Basic Biostatistics. Chapter 1. Content Chapter 1 Basic Biostatistics Jamalludin Ab Rahman MD MPH Department of Community Medicine Kulliyyah of Medicine Content 2 Basic premises variables, level of measurements, probability distribution Descriptive

More information

Day 11: Measures of Association and ANOVA

Day 11: Measures of Association and ANOVA Day 11: Measures of Association and ANOVA Daniel J. Mallinson School of Public Affairs Penn State Harrisburg mallinson@psu.edu PADM-HADM 503 Mallinson Day 11 November 2, 2017 1 / 45 Road map Measures of

More information

Data Analysis Using Regression and Multilevel/Hierarchical Models

Data Analysis Using Regression and Multilevel/Hierarchical Models Data Analysis Using Regression and Multilevel/Hierarchical Models ANDREW GELMAN Columbia University JENNIFER HILL Columbia University CAMBRIDGE UNIVERSITY PRESS Contents List of examples V a 9 e xv " Preface

More information

Poisson regression. Dae-Jin Lee Basque Center for Applied Mathematics.

Poisson regression. Dae-Jin Lee Basque Center for Applied Mathematics. Dae-Jin Lee dlee@bcamath.org Basque Center for Applied Mathematics http://idaejin.github.io/bcam-courses/ D.-J. Lee (BCAM) Intro to GLM s with R GitHub: idaejin 1/40 Modeling count data Introduction Response

More information

Binary Diagnostic Tests Paired Samples

Binary Diagnostic Tests Paired Samples Chapter 536 Binary Diagnostic Tests Paired Samples Introduction An important task in diagnostic medicine is to measure the accuracy of two diagnostic tests. This can be done by comparing summary measures

More information

HZAU MULTIVARIATE HOMEWORK #2 MULTIPLE AND STEPWISE LINEAR REGRESSION

HZAU MULTIVARIATE HOMEWORK #2 MULTIPLE AND STEPWISE LINEAR REGRESSION HZAU MULTIVARIATE HOMEWORK #2 MULTIPLE AND STEPWISE LINEAR REGRESSION Using the malt quality dataset on the class s Web page: 1. Determine the simple linear correlation of extract with the remaining variables.

More information

CHAPTER TWO REGRESSION

CHAPTER TWO REGRESSION CHAPTER TWO REGRESSION 2.0 Introduction The second chapter, Regression analysis is an extension of correlation. The aim of the discussion of exercises is to enhance students capability to assess the effect

More information

University of New Mexico Hypothesis Testing-4 (Fall 2015) PH 538: Public Health Biostatistical Methods I (by Fares Qeadan)

University of New Mexico Hypothesis Testing-4 (Fall 2015) PH 538: Public Health Biostatistical Methods I (by Fares Qeadan) University of New Mexico Hypothesis Testing-4 (Fall 2015) PH 538: Public Health Biostatistical Methods I (by Fares Qeadan) hypothesis testing for association between two categorical variables (Chi-square

More information

Multiple Regression Analysis

Multiple Regression Analysis Multiple Regression Analysis Basic Concept: Extend the simple regression model to include additional explanatory variables: Y = β 0 + β1x1 + β2x2 +... + βp-1xp + ε p = (number of independent variables

More information

Regression Including the Interaction Between Quantitative Variables

Regression Including the Interaction Between Quantitative Variables Regression Including the Interaction Between Quantitative Variables The purpose of the study was to examine the inter-relationships among social skills, the complexity of the social situation, and performance

More information

What you should know before you collect data. BAE 815 (Fall 2017) Dr. Zifei Liu

What you should know before you collect data. BAE 815 (Fall 2017) Dr. Zifei Liu What you should know before you collect data BAE 815 (Fall 2017) Dr. Zifei Liu Zifeiliu@ksu.edu Types and levels of study Descriptive statistics Inferential statistics How to choose a statistical test

More information

STATISTICAL METHODS FOR DIAGNOSTIC TESTING: AN ILLUSTRATION USING A NEW METHOD FOR CANCER DETECTION XIN SUN. PhD, Kansas State University, 2012

STATISTICAL METHODS FOR DIAGNOSTIC TESTING: AN ILLUSTRATION USING A NEW METHOD FOR CANCER DETECTION XIN SUN. PhD, Kansas State University, 2012 STATISTICAL METHODS FOR DIAGNOSTIC TESTING: AN ILLUSTRATION USING A NEW METHOD FOR CANCER DETECTION by XIN SUN PhD, Kansas State University, 2012 A THESIS Submitted in partial fulfillment of the requirements

More information

CLINICAL RESEARCH METHODS VISP356. MODULE LEADER: PROF A TOMLINSON B.Sc./B.Sc.(HONS) OPTOMETRY

CLINICAL RESEARCH METHODS VISP356. MODULE LEADER: PROF A TOMLINSON B.Sc./B.Sc.(HONS) OPTOMETRY DIVISION OF VISION SCIENCES SESSION: 2006/2007 DIET: 1ST CLINICAL RESEARCH METHODS VISP356 LEVEL: MODULE LEADER: PROF A TOMLINSON B.Sc./B.Sc.(HONS) OPTOMETRY MAY 2007 DURATION: 2 HRS CANDIDATES SHOULD

More information

NORTH SOUTH UNIVERSITY TUTORIAL 2

NORTH SOUTH UNIVERSITY TUTORIAL 2 NORTH SOUTH UNIVERSITY TUTORIAL 2 AHMED HOSSAIN,PhD Data Management and Analysis AHMED HOSSAIN,PhD - Data Management and Analysis 1 Correlation Analysis INTRODUCTION In correlation analysis, we estimate

More information

Chapter 3: Examining Relationships

Chapter 3: Examining Relationships Name Date Per Key Vocabulary: response variable explanatory variable independent variable dependent variable scatterplot positive association negative association linear correlation r-value regression

More information

Find the slope of the line that goes through the given points. 1) (-9, -68) and (8, 51) 1)

Find the slope of the line that goes through the given points. 1) (-9, -68) and (8, 51) 1) Math 125 Semester Review Problems Name Find the slope of the line that goes through the given points. 1) (-9, -68) and (8, 51) 1) Solve the inequality. Graph the solution set, and state the solution set

More information

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo Please note the page numbers listed for the Lind book may vary by a page or two depending on which version of the textbook you have. Readings: Lind 1 11 (with emphasis on chapters 10, 11) Please note chapter

More information

(a) Perform a cost-benefit analysis of diabetes screening for this group. Does it favor screening?

(a) Perform a cost-benefit analysis of diabetes screening for this group. Does it favor screening? Cameron ECON 132 (Health Economics): SECOND MIDTERM EXAM (A) Winter 18 Answer all questions in the space provided on the exam. Total of 36 points (and worth 22.5% of final grade). Read each question carefully,

More information

Lecture 20: Chi Square

Lecture 20: Chi Square Statistics 20_chi.pdf Michael Hallstone, Ph.D. hallston@hawaii.edu Lecture 20: Chi Square Introduction Up until now, we done statistical test using means, but the assumptions for means have eliminated

More information

SPRING GROVE AREA SCHOOL DISTRICT. Course Description. Instructional Strategies, Learning Practices, Activities, and Experiences.

SPRING GROVE AREA SCHOOL DISTRICT. Course Description. Instructional Strategies, Learning Practices, Activities, and Experiences. SPRING GROVE AREA SCHOOL DISTRICT PLANNED COURSE OVERVIEW Course Title: Basic Introductory Statistics Grade Level(s): 11-12 Units of Credit: 1 Classification: Elective Length of Course: 30 cycles Periods

More information

Still important ideas

Still important ideas Readings: OpenStax - Chapters 1 11 + 13 & Appendix D & E (online) Plous - Chapters 2, 3, and 4 Chapter 2: Cognitive Dissonance, Chapter 3: Memory and Hindsight Bias, Chapter 4: Context Dependence Still

More information

Chapter 14. Inference for Regression Inference about the Model 14.1 Testing the Relationship Signi!cance Test Practice

Chapter 14. Inference for Regression Inference about the Model 14.1 Testing the Relationship Signi!cance Test Practice Chapter 14 Inference for Regression Our!nal topic of the year involves inference for the regression model. In Chapter 3 we learned how to!nd the Least Squares Regression Line for a set of bivariate data.

More information

Analysis and Interpretation of Data Part 1

Analysis and Interpretation of Data Part 1 Analysis and Interpretation of Data Part 1 DATA ANALYSIS: PRELIMINARY STEPS 1. Editing Field Edit Completeness Legibility Comprehensibility Consistency Uniformity Central Office Edit 2. Coding Specifying

More information

Problem #1 Neurological signs and symptoms of ciguatera poisoning as the start of treatment and 2.5 hours after treatment with mannitol.

Problem #1 Neurological signs and symptoms of ciguatera poisoning as the start of treatment and 2.5 hours after treatment with mannitol. Ho (null hypothesis) Ha (alternative hypothesis) Problem #1 Neurological signs and symptoms of ciguatera poisoning as the start of treatment and 2.5 hours after treatment with mannitol. Hypothesis: Ho:

More information

Study Guide for the Final Exam

Study Guide for the Final Exam Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

More information

Unit 1 Exploring and Understanding Data

Unit 1 Exploring and Understanding Data Unit 1 Exploring and Understanding Data Area Principle Bar Chart Boxplot Conditional Distribution Dotplot Empirical Rule Five Number Summary Frequency Distribution Frequency Polygon Histogram Interquartile

More information

Still important ideas

Still important ideas Readings: OpenStax - Chapters 1 13 & Appendix D & E (online) Plous Chapters 17 & 18 - Chapter 17: Social Influences - Chapter 18: Group Judgments and Decisions Still important ideas Contrast the measurement

More information

Student Performance Q&A:

Student Performance Q&A: Student Performance Q&A: 2009 AP Statistics Free-Response Questions The following comments on the 2009 free-response questions for AP Statistics were written by the Chief Reader, Christine Franklin of

More information

Math 261 Exam I Spring Name:

Math 261 Exam I Spring Name: Math 261 Exam I Spring 2008 Name: Instructions: Write your answers clearly on the exam paper. Show work to receive partial credit. Use the back of the page if you need more space but please note that you

More information

Things you need to know about the Normal Distribution. How to use your statistical calculator to calculate The mean The SD of a set of data points.

Things you need to know about the Normal Distribution. How to use your statistical calculator to calculate The mean The SD of a set of data points. Things you need to know about the Normal Distribution How to use your statistical calculator to calculate The mean The SD of a set of data points. The formula for the Variance (SD 2 ) The formula for the

More information

Sample Math 71B Final Exam #1. Answer Key

Sample Math 71B Final Exam #1. Answer Key Sample Math 71B Final Exam #1 Answer Key 1. (2 points) Graph the equation. Be sure to plot the points on the graph at. 2. Solve for. 3. Given that, find and simplify. 4. Suppose and a. (1 point) Find.

More information

PSYCHOLOGY 300B (A01) One-sample t test. n = d = ρ 1 ρ 0 δ = d (n 1) d

PSYCHOLOGY 300B (A01) One-sample t test. n = d = ρ 1 ρ 0 δ = d (n 1) d PSYCHOLOGY 300B (A01) Assignment 3 January 4, 019 σ M = σ N z = M µ σ M d = M 1 M s p d = µ 1 µ 0 σ M = µ +σ M (z) Independent-samples t test One-sample t test n = δ δ = d n d d = µ 1 µ σ δ = d n n = δ

More information

Between-Person and Within-Person Effects of Negative Mood Predicting Next-Morning Glucose COMPLETED VERSION

Between-Person and Within-Person Effects of Negative Mood Predicting Next-Morning Glucose COMPLETED VERSION Psyc 944 Example 9b page 1 Between-Person and Within-Person Effects of Negative Mood Predicting Next-Morning Glucose COMPLETED VERSION These data were simulated loosely based on real data reported in the

More information