Math 215, Lab 7: 5/23/2007
(1) Parametric versus Nonparametric Bootstrap.

Parametric Bootstrap (Davison and Hinkley, 1997): The data below are 12 times between failures of air-conditioning equipment in a Boeing 720 jet aircraft, for which we wish to estimate the underlying mean or its reciprocal, the failure rate:

3, 5, 7, 18, 43, 85, 91, 98, 100, 130, 230, 487

It makes sense to assume that these data come from an exponential distribution. The problem is that we do not know the rate of that distribution, so we cannot write down its pdf. However, we can rely on the fact that the sample average is an efficient estimator of the exponential mean. Consequently, we may consider an exponential distribution whose rate is the reciprocal of the sample average. Formally, this means:

1) Let $Y \sim \mathrm{Exp}(1/\mu)$, so that $f(y) = (1/\mu)\exp(-y/\mu)$ for $y > 0$.
2) Let $\hat{f} = f_{\hat{\mu}}$, that is, $\hat{f}(y) = (1/\bar{y})\exp(-y/\bar{y})$.

The bootstrap procedure is now straightforward. Obtain a sample $Y_1, \dots, Y_n$ from $\hat{f}$, then calculate the statistic of interest $T_{boot}$ using the sampled data, and finally repeat this procedure B times to get the distribution of $T_{boot}$.

(a) Perform a parametric bootstrap for the Boeing data to calculate the MSE for the mean, the MSE for the variance, a 95% confidence interval for the mean, and a 95% confidence interval for the variance of the exponential distribution of interest.

Nonparametric Bootstrap: This is essentially what we have been doing so far. That is, we assume that the underlying distribution from which the data are generated is unknown. Consequently, we model the data via their empirical distribution, which puts equal probability $n^{-1}$ on each sample value. This is equivalent to estimating the unknown cumulative distribution function (CDF) $F$ by the empirical cumulative distribution function (ECDF) $\hat{F}$, which is defined as the sample proportion:

$\hat{F}(y) = \#\{y_j \le y\} / n$

Note that the possible values of the ECDF are fixed: $(0, 1/n, 2/n, \dots, n/n)$. So in the nonparametric version, we sample with replacement from the original data set B times and proceed as before.

(b) Perform a nonparametric bootstrap for the Boeing data. Carry out calculations similar to part (a).
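The two resampling schemes described above can be sketched side by side. This is a minimal illustration, not a model solution: the number of replicates `B`, the seed, the helper names `boot.stats` and `summarise.boot`, and the use of percentile intervals are all assumptions that the lab does not specify. The MSE is taken around the value of the parameter under the fitted distribution (the fitted exponential in the parametric case, the ECDF in the nonparametric case).

```r
set.seed(215)                      # arbitrary seed, for reproducibility only

# The 12 failure times (hours) from Davison and Hinkley's air-conditioning example.
y <- c(3, 5, 7, 18, 43, 85, 91, 98, 100, 130, 230, 487)
n <- length(y)
B <- 2000                          # number of bootstrap replicates (assumed value)
mu.hat <- mean(y)                  # estimate of the exponential mean

# Hypothetical helper: draw B samples with the supplied generator and
# return a B x 2 matrix holding the mean and variance of each sample.
boot.stats <- function(draw) {
  t(replicate(B, {
    ys <- draw()
    c(mean = mean(ys), var = var(ys))
  }))
}

# Parametric: sample from the fitted exponential with rate 1 / mean(y).
par.boot <- boot.stats(function() rexp(n, rate = 1 / mu.hat))
# Nonparametric: sample with replacement from the data (i.e. from the ECDF).
np.boot <- boot.stats(function() sample(y, n, replace = TRUE))

# Hypothetical helper: bootstrap MSE around the target values and
# 95% percentile intervals for each statistic.
summarise.boot <- function(boot, target) {
  list(mse  = colMeans(sweep(boot, 2, target)^2),
       ci95 = apply(boot, 2, quantile, probs = c(0.025, 0.975)))
}

# Under the fitted exponential the mean is mu.hat and the variance is mu.hat^2.
par.res <- summarise.boot(par.boot, c(mu.hat, mu.hat^2))
# Under the ECDF the mean is mu.hat and the variance is the plug-in variance.
np.res <- summarise.boot(np.boot, c(mu.hat, mean((y - mu.hat)^2)))

par.res
np.res
```

Comparing `par.res` with `np.res` shows how much the exponential assumption changes the answers; the difference is usually most visible for the variance.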
(2) Consider the following data representing the populations, in thousands, of n = 20 large US cities in 1920 (x) and 1930 (y).

[Data table with columns 1920 (x) and 1930 (y); the values are not preserved in this transcription.]

(a) Perform a nonparametric bootstrap to obtain a distribution for Pearson's correlation coefficient between the 1920 and 1930 population data. Note in particular that you need to draw 20 pairs with replacement from the pairs of data, keeping each (x, y) pair together.

(b) Obtain a 90% confidence interval for the following statistic:

$Z = \frac{1}{2} \ln\left(\frac{1 + r}{1 - r}\right)$

(3) Consider the bootstrap procedure for obtaining the t-statistic for the slope of a regression line using its associated least-squares residuals, as explained on page 271. Note that in order to do this, you need to compute $e = y - \hat{\beta}_0 - \hat{\beta}_1 x$, or fit a separate regression with the $e_i$'s as the response values. Subsequently, you need to sample with replacement from the $e_i$'s to get B repetitions of the slope's t statistic.

(a) Use the data of the previous problem to create a 95% bootstrap confidence interval for $\beta_1$, the slope of the regression line.
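Both procedures can be sketched as follows. Because the city populations are not available in this transcription, the vectors `x` and `y` below are simulated stand-ins, not the real data; `B` and the seed are also assumptions. The first part resamples (x, y) pairs and uses a percentile interval for Z; the second part resamples residuals and forms a studentized (bootstrap-t) interval for the slope, which is one common reading of the procedure and may differ in detail from the textbook's version.

```r
set.seed(1920)                     # arbitrary seed
n <- 20
# Simulated stand-ins for the 1920 and 1930 populations (thousands); NOT the lab data.
x <- round(rlnorm(n, meanlog = 5, sdlog = 0.8))
y <- round(x * exp(rnorm(n, mean = 0.15, sd = 0.15)))
B <- 2000                          # number of bootstrap replicates (assumed value)

## Problem 2: pairs bootstrap for the correlation coefficient
r.boot <- replicate(B, {
  idx <- sample(n, n, replace = TRUE)   # resample city indices so pairs stay together
  cor(x[idx], y[idx])
})
z.boot <- atanh(r.boot)            # atanh(r) equals 0.5 * log((1 + r) / (1 - r))
z.ci90 <- quantile(z.boot, c(0.05, 0.95))   # 90% percentile interval for Z
r.ci90 <- tanh(z.ci90)                      # back-transformed interval for r

## Problem 3: residual bootstrap for the slope's t statistic
fit  <- lm(y ~ x)
e    <- resid(fit)
yhat <- fitted(fit)
b1   <- unname(coef(fit)[2])
se1  <- summary(fit)$coefficients[2, "Std. Error"]

t.boot <- replicate(B, {
  y.star <- yhat + sample(e, n, replace = TRUE)   # new responses from resampled residuals
  cf <- summary(lm(y.star ~ x))$coefficients
  (cf[2, "Estimate"] - b1) / cf[2, "Std. Error"]  # studentized slope
})
q <- unname(quantile(t.boot, c(0.025, 0.975)))
slope.ci95 <- c(lower = b1 - q[2] * se1, upper = b1 - q[1] * se1)

z.ci90
slope.ci95
```

Note that the upper quantile of the bootstrap t values gives the lower endpoint of the slope interval, and vice versa.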
Multiple Regression

Parametric Multiple Regression (Dalgaard, 2002). Dalgaard presents the analysis related to a study concerning lung function in patients with cystic fibrosis. Data are in the ISwR package, which can be downloaded from the web. After connecting to the web, all you need to do is type the following in R:

> cyst<-read.table("
> cyst<-as.matrix(cyst)

Then, each time you need to use the data sets included in this package in R, click on Packages and choose ISwR from the packages menu. Each data set will be available by using the command data (see Dalgaard, Appendix A and Section 9.1).

> cyst
age sex height weight bmp fev1 rv frc tlc pemax
[data rows not preserved in this transcription]
The description of the data set is also given on page 228 of the textbook (Appendix B):

The cystfibr data frame has 25 rows and 10 columns. It contains lung function data for cystic fibrosis patients (7-23 years old).

Format: This data frame contains the following columns:

age: a numeric vector. Age in years.
sex: a numeric vector code. 0: male, 1: female.
height: a numeric vector. Height (cm).
weight: a numeric vector. Weight (kg).
bmp: a numeric vector. Body mass (% of normal).
fev1: a numeric vector. Forced expiratory volume.
rv: a numeric vector. Residual volume.
frc: a numeric vector. Functional residual capacity.
tlc: a numeric vector. Total lung capacity.
pemax: a numeric vector. Maximum expiratory pressure.
Preliminary Exploratory Analysis

First of all, by typing:

> plot(cystfibr)

you can obtain a matrix of all pairwise scatterplots associated with the data set (figure 1). This is an extremely powerful tool for visualizing multivariate data. Secondly, by typing:

> attach(cystfibr)

the columns of cystfibr are placed on R's search path, so they can be referred to directly by name. This is convenient because, to see the column labeled height, instead of typing

> cystfibr[, "height"]

I only need to type:

> height
[Figure 1: scatterplot matrix with panels age, sex, height, weight, bmp, fev1, rv, frc, tlc, pemax.]

figure 1. This grid of scatterplots reflects the pairwise relationships between all variables of interest.
Also, you can get a matrix of all the pairwise correlations by typing:

> cor(cystfibr)

[Output: a 10 x 10 correlation matrix with rows and columns age, sex, height, weight, bmp, fev1, rv, frc, tlc, pemax; the numeric entries are not preserved in this transcription.]

Note that this enables us to browse through all the pairwise linear associations.
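The exploratory steps above can be tried on any data frame. The sketch below uses a small simulated data frame, `lung`, as a hypothetical stand-in for cystfibr (its columns and coefficients are invented for illustration). It also shows `with()` and `$` as alternatives to `attach()`, which avoid leaving a data frame on the search path.

```r
set.seed(7)                        # arbitrary seed
# Simulated stand-in for cystfibr with three of its column names; NOT the real data.
lung <- data.frame(height = rnorm(25, mean = 150, sd = 15))
lung$weight <- 0.6 * lung$height - 50 + rnorm(25, sd = 5)
lung$pemax  <- 50 + 1.5 * lung$weight + rnorm(25, sd = 20)

# Scatterplot matrix; only drawn in an interactive session.
if (interactive()) plot(lung)

# Three equivalent ways to get one column without attach().
h1 <- lung[, "height"]
h2 <- lung$height
h3 <- with(lung, height)

# Matrix of pairwise Pearson correlations.
R <- cor(lung)
round(R, 2)

# Correlation of each predictor with the response, strongest first.
r.pemax <- sort(R["pemax", colnames(R) != "pemax"], decreasing = TRUE)
r.pemax
```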
1 Multiple Regression Modeling

Here, the response variable is pemax. All the other variables are considered as the independent variables. One might be interested in starting with the most complete model. That is, we consider the additive model:

$\mathrm{pemax} = \beta_0 + \beta_1\,\mathrm{age} + \beta_2\,\mathrm{sex} + \beta_3\,\mathrm{height} + \beta_4\,\mathrm{weight} + \beta_5\,\mathrm{bmp} + \beta_6\,\mathrm{fev1} + \beta_7\,\mathrm{rv} + \beta_8\,\mathrm{frc} + \beta_9\,\mathrm{tlc}$  (1)

Here is the output for the full model:

> summary(lm(pemax~age+sex+height+weight+bmp+fev1+rv+frc+tlc))

Call:
lm(formula = pemax ~ age + sex + height + weight + bmp + fev1 + rv + frc + tlc)

Coefficients: (Intercept), age, sex, height, weight, bmp, fev1, rv, frc, tlc
[Residual quantiles, estimates, standard errors, t values, and p-values are not preserved in this transcription.]

Residual standard error on 15 degrees of freedom; F-statistic on 9 and 15 DF.

> test<-lm(pemax~age+sex+height+weight+bmp+fev1+rv+frc+tlc)
> step(test)

Start: pemax ~ age + sex + height + weight + bmp + fev1 + rv + frc + tlc
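Df, Sum of Sq, RSS, AIC (values not recovered); terms in the order printed:
- sex, - tlc, - height, - age, - frc, - fev1, - rv, <none>, - weight, - bmp

Step: pemax ~ age + height + weight + bmp + fev1 + rv + frc + tlc
Df, Sum of Sq, RSS, AIC (values not recovered); terms in the order printed:
- tlc, - height, - age, - frc, - rv, <none>, - weight, - bmp, - fev1

Step: pemax ~ age + height + weight + bmp + fev1 + rv + frc
Df, Sum of Sq, RSS, AIC (values not recovered); terms in the order printed:
- frc, - height, - age, - rv, <none>, - fev1, - weight, - bmp

Step: pemax ~ age + height + weight + bmp + fev1 + rv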
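Df, Sum of Sq, RSS, AIC (values not recovered); terms in the order printed:
- age, - height, - rv, <none>, - weight, - bmp, - fev1

Step: pemax ~ height + weight + bmp + fev1 + rv
Df, Sum of Sq, RSS, AIC (values not recovered); terms in the order printed:
- height, - rv, <none>, - weight, - bmp, - fev1

Step: pemax ~ weight + bmp + fev1 + rv
Df, Sum of Sq, RSS, AIC (values not recovered); terms in the order printed:
<none>, - rv, - bmp, - fev1, - weight

Call:
lm(formula = pemax ~ weight + bmp + fev1 + rv)

Coefficients: (Intercept), weight, bmp, fev1, rv (values not recovered)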
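The backward elimination traced above can be reproduced on any linear model. The sketch below uses simulated data (the data frame `d` and its coefficients are hypothetical) in which only `x1` and `x2` truly affect the response. It also checks the AIC that `step` reports for a linear model, which is n log(RSS/n) + 2p, where p is the number of estimated coefficients.

```r
set.seed(42)                       # arbitrary seed
n <- 25
# Simulated data, NOT cystfibr: y depends on x1 and x2 only.
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
d$y <- 2 + 3 * d$x1 - 2 * d$x2 + rnorm(n)

full <- lm(y ~ x1 + x2 + x3 + x4, data = d)

# One row per candidate deletion, as in each table of the step() trace.
drop1(full)

# Backward elimination by AIC; trace = 0 suppresses the step-by-step printout.
best <- step(full, direction = "backward", trace = 0)
formula(best)

# AIC as used by step() for lm fits: n * log(RSS / n) + 2 * (number of coefficients).
rss <- sum(resid(full)^2)
aic.by.hand <- n * log(rss / n) + 2 * length(coef(full))
aic.full <- extractAIC(full)[2]
aic.best <- extractAIC(best)[2]
c(by.hand = aic.by.hand, full = aic.full, best = aic.best)
```

At each step the term whose removal gives the lowest AIC is dropped, and the search stops when `<none>` has the lowest AIC, which is why `<none>` heads the last table in the trace.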
The Best Model

> summary(lm(formula = pemax ~ weight + bmp + fev1 + rv))

Call:
lm(formula = pemax ~ weight + bmp + fev1 + rv)

Coefficients (Estimate, Std. Error, t value, Pr(>|t|); numeric values not recovered):
(Intercept)
weight ***
bmp *
fev1 *
rv
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error on 20 degrees of freedom; F-statistic on 4 and 20 DF.

> anova(lm(formula = pemax ~ weight + bmp + fev1 + rv))

Analysis of Variance Table
Response: pemax
(Df, Sum Sq, Mean Sq, F value, Pr(>F); numeric values not recovered)
weight ***
bmp
fev1 *
rv
Residuals
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
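A Naive Bootstrap Analysis of the Best Model

# Assumed setup (not shown in the original): residuals and fitted values of the best model.
cyst.fit<-lm(pemax~weight + bmp + fev1 + rv)
cyst.resid<-resid(cyst.fit)
cyst.predict<-fitted(cyst.fit)

B<-1000
Tboot<-matrix(0,nrow=B,ncol=5)   # one row per replicate, one column per coefficient
for(i in 1:B)
{
  e.star<-sample(cyst.resid,25,replace=TRUE)   # resample the residuals with replacement
  y.star<-cyst.predict+e.star                  # bootstrap responses
  Tboot[i,]<-summary(lm(y.star~weight + bmp + fev1 + rv))$coefficients[,1]   # refitted estimates
  #Tcov<-summary(lm(y.star~weight + bmp + fev1 + rv))$cov
  print(i)
}
par(mfrow=c(2,3))
for(i in 1:5)
{
  hist(Tboot[,i])   # histogram of each coefficient's bootstrap distribution
}

> mean(Tboot[,1])
> sd(Tboot[,1])
> mean(Tboot[,2])
> sd(Tboot[,2])
> mean(Tboot[,3])
> sd(Tboot[,3])
> mean(Tboot[,4])
> sd(Tboot[,4])
> mean(Tboot[,5])
> sd(Tboot[,5])

[The printed means and standard deviations are not preserved in this transcription.]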
[Figure 2: five histograms, each titled "Histogram of Tboot[, i]", with Tboot[, i] on the horizontal axis and Frequency on the vertical axis.]

figure 2. The Distribution of the Bootstrap Parameter Estimates.
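One way to read a residual bootstrap like this is to set the bootstrap means and standard deviations next to the estimates and standard errors reported by `lm`. The sketch below does that on simulated data (the data frame `d`, its coefficients, and the seed are hypothetical, not the cystfibr values) and adds 95% percentile intervals, which the lab does not ask for explicitly.

```r
set.seed(2007)                     # arbitrary seed
n <- 25
# Simulated stand-in with two predictors; NOT the cystfibr data.
d <- data.frame(weight = rnorm(n, mean = 40, sd = 10),
                fev1   = rnorm(n, mean = 35, sd = 10))
d$pemax <- 60 + 1.2 * d$weight + 0.8 * d$fev1 + rnorm(n, sd = 20)

fit  <- lm(pemax ~ weight + fev1, data = d)
X    <- model.matrix(fit)          # design matrix, reused for every refit
e    <- resid(fit)
yhat <- fitted(fit)

B <- 1000
# Each replicate: resample residuals, rebuild the response, refit by least squares.
Tboot <- t(replicate(B, {
  y.star <- yhat + sample(e, n, replace = TRUE)
  lm.fit(X, y.star)$coefficients
}))

comparison <- cbind(estimate  = coef(fit),
                    lm.se     = summary(fit)$coefficients[, "Std. Error"],
                    boot.mean = colMeans(Tboot),
                    boot.sd   = apply(Tboot, 2, sd))
ci95 <- t(apply(Tboot, 2, quantile, probs = c(0.025, 0.975)))

round(comparison, 3)
round(ci95, 3)
```

The bootstrap standard deviations tend to come out slightly smaller than the `lm` standard errors, because raw residuals have a smaller variance than the underlying errors; this is one reason the analysis is called naive.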
[Figure: regression diagnostic plots. Panels: Residuals vs Fitted, Normal Q-Q, Scale-Location, Residuals vs Leverage.]
More informationChapter 11: Advanced Remedial Measures. Weighted Least Squares (WLS)
Chapter : Advanced Remedial Measures Weighted Least Squares (WLS) When the error variance appears nonconstant, a transformation (of Y and/or X) is a quick remedy. But it may not solve the problem, or it
More informationProblem #1 Neurological signs and symptoms of ciguatera poisoning as the start of treatment and 2.5 hours after treatment with mannitol.
Ho (null hypothesis) Ha (alternative hypothesis) Problem #1 Neurological signs and symptoms of ciguatera poisoning as the start of treatment and 2.5 hours after treatment with mannitol. Hypothesis: Ho:
More informationBiostatistics II
Biostatistics II 514-5509 Course Description: Modern multivariable statistical analysis based on the concept of generalized linear models. Includes linear, logistic, and Poisson regression, survival analysis,
More informationApplications. DSC 410/510 Multivariate Statistical Methods. Discriminating Two Groups. What is Discriminant Analysis
DSC 4/5 Multivariate Statistical Methods Applications DSC 4/5 Multivariate Statistical Methods Discriminant Analysis Identify the group to which an object or case (e.g. person, firm, product) belongs:
More informationHungry Mice. NP: Mice in this group ate as much as they pleased of a non-purified, standard diet for laboratory mice.
Hungry Mice When laboratory mice (and maybe other animals) are fed a nutritionally adequate but near-starvation diet, they may live longer on average than mice that eat a normal amount of food. In this
More informationOn Regression Analysis Using Bivariate Extreme Ranked Set Sampling
On Regression Analysis Using Bivariate Extreme Ranked Set Sampling Atsu S. S. Dorvlo atsu@squ.edu.om Walid Abu-Dayyeh walidan@squ.edu.om Obaid Alsaidy obaidalsaidy@gmail.com Abstract- Many forms of ranked
More informationUnderstandable Statistics
Understandable Statistics correlated to the Advanced Placement Program Course Description for Statistics Prepared for Alabama CC2 6/2003 2003 Understandable Statistics 2003 correlated to the Advanced Placement
More informationStatistical analysis DIANA SAPLACAN 2017 * SLIDES ADAPTED BASED ON LECTURE NOTES BY ALMA LEORA CULEN
Statistical analysis DIANA SAPLACAN 2017 * SLIDES ADAPTED BASED ON LECTURE NOTES BY ALMA LEORA CULEN Vs. 2 Background 3 There are different types of research methods to study behaviour: Descriptive: observations,
More informationOverview of Lecture. Survey Methods & Design in Psychology. Correlational statistics vs tests of differences between groups
Survey Methods & Design in Psychology Lecture 10 ANOVA (2007) Lecturer: James Neill Overview of Lecture Testing mean differences ANOVA models Interactions Follow-up tests Effect sizes Parametric Tests
More informationREGRESSION MODELLING IN PREDICTING MILK PRODUCTION DEPENDING ON DAIRY BOVINE LIVESTOCK
REGRESSION MODELLING IN PREDICTING MILK PRODUCTION DEPENDING ON DAIRY BOVINE LIVESTOCK Agatha POPESCU University of Agricultural Sciences and Veterinary Medicine Bucharest, 59 Marasti, District 1, 11464,
More informationMS&E 226: Small Data
MS&E 226: Small Data Lecture 10: Introduction to inference (v2) Ramesh Johari ramesh.johari@stanford.edu 1 / 17 What is inference? 2 / 17 Where did our data come from? Recall our sample is: Y, the vector
More informationStill important ideas
Readings: OpenStax - Chapters 1 13 & Appendix D & E (online) Plous Chapters 17 & 18 - Chapter 17: Social Influences - Chapter 18: Group Judgments and Decisions Still important ideas Contrast the measurement
More information