BOOTSTRAPPING CONFIDENCE LEVELS FOR HYPOTHESES ABOUT REGRESSION MODELS
17 December 2009
Michael Wood
University of Portsmouth Business School, SBS Department, Richmond Building, Portland Street, Portsmouth PO1 3DE, UK
Abstract

This paper shows how bootstrapping using a spreadsheet can be used to derive confidence levels for hypotheses about features of regression models such as their shape, and the location of optimum values. The data used as an example leads to a confidence level of 67% that the sample comes from a population which displays the hypothesized inverted U shape. There is no obvious and satisfactory alternative way of deriving this result, or an equivalent result. In particular, null hypothesis tests cannot provide adequate support for this type of hypothesis.

Keywords: Confidence, Regression models, Curvilinear models, Bootstrapping
Introduction

Glebbeek and Bax (2004) investigated the hypothesis that there is an inverted U-shape relationship between two variables, staff turnover and organizational performance, by setting up regression models with both staff turnover and staff turnover squared as independent variables. Their sample comprised 110 branches of an employment agency in the Netherlands. In all their models they found that the coefficient for the linear term was positive and the coefficient for the squared term was negative, which is consistent with their hypothesis. One of these models is shown in Figure 1 below. (This diagram is not in Glebbeek and Bax; Dr. Glebbeek, however, was kind enough to give me access to their data, which I have used to produce Figure 1.)

Figure 1. Predicted performance from quadratic model (after adjusting for values of three control variables). Axes: Performance against Turnover (% per year). The solid line is the prediction from the regression model; the scattered points are the data on which the regression model is based.

The question now arises of whether this demonstrates that a similar pattern would occur if the analysis were done with the whole population from which the sample was drawn. Can we be sure that this is a stable result, or might another sample from the same source show a different pattern? Conventionally this question is answered by testing two null hypotheses: the first being that the coefficient of the linear term is zero, and the second being that the coefficient of the squared term is zero. In the model represented by Figure 1, neither coefficient is significantly different from zero. The evidence provides some support for the inverted U-shape hypothesis, but it is difficult to combine the two significance levels into a single figure to indicate the strength of the support for this hypothesis.

Bootstrap methods provide a way out of this difficulty. The result from the bootstrap analysis below is that the data on which Figure 1 is based suggest a confidence level of 67% for the inverted U shape hypothesis. Using a confidence level in this way has a number of further advantages, which are explained after we have shown how the bootstrap method works. The method shown is implemented on an Excel spreadsheet (available on the web) which can easily be adapted to analyse different models. The approach is mentioned briefly in Wood (2009a); the present paper extends it and analyzes it in more detail.

Bootstrapping confidence levels for hypotheses

The idea of bootstrapping is very simple. Suppose we have a random sample of size n from a specified population, and we have worked out a statistic, s, based on this sample. Now imagine that the population comprises a large number of copies of the sample, say one million of them. If we now take a series of random samples from this imaginary population, and work out s for each of these, we can investigate how variable sample values of s are, and so derive sampling error statistics such as the standard error and confidence intervals. In practice, the easiest way of doing this is to take resamples with replacement from the original sample. This means we choose a member of the original sample at random, then replace it and choose again, until we have a sample of size n. As a result, some members of the original sample will appear in the resample more than once, and others not at all.
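Resampling with replacement, as just described, can be sketched in a few lines of Python. This is only an illustration of the principle, not the paper's spreadsheet: the sample values are made up, the statistic is a simple mean rather than a regression line, and the function names `resample` and `bootstrap_statistics` are hypothetical.

```python
import random
import statistics


def resample(sample, rng):
    """Draw len(sample) items from sample at random, with replacement."""
    return [rng.choice(sample) for _ in sample]


def bootstrap_statistics(sample, statistic, n_resamples, rng):
    """Compute the statistic on each of n_resamples bootstrap resamples."""
    return [statistic(resample(sample, rng)) for _ in range(n_resamples)]


# Made-up illustrative sample (not data from the paper).
rng = random.Random(0)  # fixed seed so the run is repeatable
sample = [12.1, 9.8, 15.3, 11.0, 13.7, 10.4, 14.2, 8.9, 12.8, 11.6]

values = bootstrap_statistics(sample, statistics.mean, 1000, rng)

# The spread of the resampled means estimates the standard error of the mean.
bootstrap_se = statistics.stdev(values)
print(f"sample mean = {statistics.mean(sample):.2f}")
print(f"bootstrap standard error = {bootstrap_se:.2f}")
```

Because each draw is made from the full original sample, a given observation can turn up several times in one resample or be missing from it altogether.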
This is equivalent to choosing an ordinary sample from the large constructed population because the large size of this population means that its composition is effectively unchanged by removing each member of a sample.

This principle has been widely used with a variety of statistics. Where conventional methods are possible, the answers obtained tend to be similar, but bootstrapping does have a number of advantages, including the fact that it can be used where there is no convenient standard method (see, for example, Lunneborg, 2000; Wood, 2005).

Figure 2. Predicted performance using a quadratic model (after adjusting for values of three control variables) from the data (bold) and three resamples (dotted lines). Axes: Performance against Staff turnover (% per year).

Applying this idea to our present problem, the statistic, s, is now a line on a graph. Figure 2 shows results from three resamples as well as the original sample. Each of the dotted lines in this figure is based on identical formulae to the solid line representing the real sample, but using the data from a resample rather than the original sample. Two of these resamples are obviously an inverted U shape; the third is not. These results come from an Excel spreadsheet available on the web. The Resample sheet of the spreadsheet allows users to press the Recalculate button (F9) and generate a new resample and line on the graph. These can be thought of as different simulated samples from the same source.

It is then a simple matter to produce more of these resamples and count up the number which are inverted U shapes. The conclusion was that 62 of 100 resamples gave an inverted U shape (with, obviously, the top of the U at a positive value of turnover), which suggests that the confidence level for this hypothesis, based on the data, should be put at 62%. For a more stable and reliable answer, we can use a larger number of resamples: 1000 resamples yielded a confidence level of 67%.

It is very easy to use this method to obtain confidence levels for other hypotheses. The frequency of occurrence of any feature of the resample graphs can easily be worked out. For example, we might want to know the location of the optimum staff turnover. The point estimate from the regression shown in Figure 1 is that the optimum performance occurs with a staff turnover of 6%. Examining 1000 resamples gives these confidence levels for three hypotheses:

  Confidence in hypothesis that the optimum is between 0% and 10% = 30%
  Confidence in hypothesis that the optimum is between 10% and 20% = 37%
  Confidence in hypothesis that the optimum is above 20% = 0%

Another hypothesis of interest to Glebbeek and Bax (2004) is that the relationship between performance and turnover is negative, this being the rival to the inverted U shape hypothesis. The top resample in Figure 2 illustrates the importance of defining this clearly: it shows a negative relationship for low turnover, but a positive relationship for higher turnover. If we define a negative relationship as one which is not an inverted U shape, and for which the predicted performance for Turnover = 25% is less than the prediction for Turnover = 0 (which makes the top resample in Figure 2 such a negative relationship), then the spreadsheet shows that

  Confidence in hypothesis that the relationship is negative = 33%

In fact, all 1000 resamples gave either an inverted U shape or a negative relationship in this sense.

Pros and cons of this method of analysis

The main advantage is that this method gives an answer for the degree of support for the inverted U shape hypothesis (a confidence level of 67%), which the conventional p values do not. Glebbeek and Bax (2004) cited p values for two coefficients, and it is unclear how these should be combined.
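The counting procedure behind these confidence levels can be sketched as follows. This is a rough Python illustration under stated assumptions, not the spreadsheet itself: the data are synthetic (generated in the code, not the Glebbeek and Bax data), the three control variables are omitted, and the helper names `fit_quadratic` and `classify` are hypothetical. The shape definitions follow the text: an inverted U needs a negative squared coefficient with its peak at a positive turnover, and a negative relationship is any other curve whose prediction at 25% turnover is below its prediction at 0.

```python
import random


def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x**2; returns (a, b, c)."""
    # Build the 3x3 normal equations as an augmented matrix.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def classify(a, b, c, x_low=0.0, x_high=25.0):
    """Label a fitted curve using the definitions given in the text."""
    if c < 0 and -b / (2 * c) > 0:
        return "inverted U"  # curve peaks at a positive turnover
    at_low = a + b * x_low + c * x_low ** 2
    at_high = a + b * x_high + c * x_high ** 2
    return "negative" if at_high < at_low else "other"


# Synthetic stand-in data: (turnover, performance) pairs with heavy noise.
rng = random.Random(1)
turnover = [rng.uniform(0, 35) for _ in range(110)]
data = [(x, 50 + 1.2 * x - 0.1 * x * x + rng.gauss(0, 40)) for x in turnover]

n_resamples = 1000
counts = {"inverted U": 0, "negative": 0, "other": 0}
optimum_0_to_10 = 0
for _ in range(n_resamples):
    res = [rng.choice(data) for _ in data]  # resample rows with replacement
    a, b, c = fit_quadratic([x for x, _ in res], [y for _, y in res])
    label = classify(a, b, c)
    counts[label] += 1
    if label == "inverted U" and 0 < -b / (2 * c) <= 10:
        optimum_0_to_10 += 1

# Each confidence level is simply the proportion of resamples with the feature.
confidence = {k: v / n_resamples for k, v in counts.items()}
print(confidence)
print("optimum between 0% and 10%:", optimum_0_to_10 / n_resamples)
```

Any other feature of the fitted curve can be counted in the same way by adding a further condition inside the loop.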
More fundamentally, it is difficult to see what null hypothesis could be tested to demonstrate an inverted U shape. A null hypothesis of no relationship between the two variables would not differentiate between the hypothesis that the relationship is linear and negative (the main competing hypothesis here) and the inverted U shape hypothesis.

The method has the further advantages of flexibility (it can easily be adapted to analyze the hypotheses about the optimum turnover above, for example) and transparency: users can literally see, by pressing the recalculate (F9) key, how variable different resamples are and so how variable real samples from the same source might have been. Figure 2 above demonstrates, graphically and clearly, the sampling error problem, and the derivation of the confidence levels from the spreadsheet is very straightforward.

Against this there are some issues about the interpretation and validity of the method.

Validity of bootstrapped confidence levels

There is an obvious logical problem with the description of the bootstrap method above: the imaginary population constructed from the sample is not the real population, so using it to make inferences about the accuracy of the sample as a guide to the real population obviously entails a few assumptions. Bootstrapping essentially models the process of sampling, so it will tell us about likely discrepancies between the real population and the sample, but, as Bayes' theorem reminds us, to make inferences about the real population we need to take account of the prior probabilities of the various possibilities. In practice, this is not feasible (except in simpler cases than this; see Wood, 2009b), so we will accept the bootstrap conjecture that we can use the bootstrap-world to learn about the real world (Lunneborg, 2000), but it is important to realise that the validity of the approach is not guaranteed.

However, experience shows that in normal, well-behaved situations the bootstrap approach gives similar results to standard approaches based on probability theory. In our present example, we can check the confidence intervals from the Excel Regression Tool against bootstrapped estimates.
For the data and model used for Figures 1 and 2, the Excel Regression Tool gives

  95% confidence interval for square coeff. is -230 to +57 (Regression Tool)

On the bootstrap spreadsheet, the square term is described as curvature because it measures whether the curve is a U shape (+) or an inverted U shape (-). Taking 1000 resamples and arranging them in order of curvature, the 95% confidence interval extends from the 2.5 percentile to the 97.5 percentile, which is

  95% confidence interval for square coeff. is -301 to +75 (Bootstrap)

This is 31% wider than the interval produced by the Regression Tool. Furthermore, the next two intervals produced by further sets of 1000 resamples were also wider than the Excel Regression Tool interval.

We can also compare the two methods for the linear model without the coefficient for the square term (Model 3 in Glebbeek and Bax, 2004). With the Regression Tool this gives

  95% confidence interval for the slope is -3060 to -495 (Regression Tool)

and the corresponding result from 1000 bootstrap samples is

  95% confidence interval for the slope is -3147 to -704 (Bootstrap)

In this case the bootstrap interval is 5% narrower (and the next bootstrap interval was 3% wider). The bootstrap confidence interval for the single variable model (Model 1 in Glebbeek and Bax, 2004) is similarly close to the Regression Tool estimate (a difference of less than 1% in the width of the intervals).

These results suggest that the confidence interval estimates for the linear models are very close, and not too far apart (31%) for the curvilinear model. It is obviously very difficult to be sure which estimate is the better, but the reasonably close agreement of the two methods should give us some confidence in the bootstrap method for assessing confidence for the inverted U shape hypothesis, for which we have no conventional method for comparison.

It is also very important to acknowledge that all these confidence intervals and levels presuppose a background model.
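The percentile intervals and the width comparisons quoted above can be illustrated with a short Python sketch. The function names `percentile_interval_95` and `width_change` are hypothetical, the simulated curvature values are made up, and the interval limits are taken from the text with the lower limits treated as negative, which is what the quoted 31% and 5% figures imply.

```python
import random
import statistics


def percentile_interval_95(values):
    """Return the 2.5th and 97.5th percentiles of the resampled values."""
    cuts = statistics.quantiles(values, n=40)  # 39 cut points, 2.5% apart
    return cuts[0], cuts[-1]


def width_change(reference, other):
    """Percentage by which `other` is wider (+) or narrower (-) than `reference`."""
    ref_width = reference[1] - reference[0]
    other_width = other[1] - other[0]
    return 100 * (other_width - ref_width) / ref_width


# Made-up stand-in for the curvature values from 1000 resamples.
rng = random.Random(0)
simulated = [rng.gauss(-100, 90) for _ in range(1000)]
low, high = percentile_interval_95(simulated)
print(f"95% percentile interval: {low:.0f} to {high:.0f}")

# Intervals quoted in the text: Regression Tool first, then bootstrap.
print(round(width_change((-230, 57), (-301, 75))))        # 31 (square term)
print(round(width_change((-3060, -495), (-3147, -704))))  # -5 (linear slope)
```

The widths are 287 and 376 for the square term, and 2565 and 2443 for the slope, which gives the 31% wider and 5% narrower figures.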
For the derivation of the confidence level for the inverted U shape hypothesis we used a quadratic model, as did Glebbeek and Bax (2004) for their analysis in terms of p values. Similarly, the linear model discussed above can be used to derive a confidence level for the regression slope being negative: the bootstrap confidence level based on 3000 resamples was 99.5%. (The p value given by the Regression Tool is 0.704%, which yields a confidence level of 99.6% using the method described in Wood, 2009b.) This confidence level is much higher than the 33% obtained above for a negative relationship based on a quadratic model, largely because the definition of a negative relationship above is more restrictive: it excludes, for example, the model based on the actual data in Figure 1 because this is an inverted U shape (but if a straight line were fitted it would obviously have a negative slope).

Conclusions

The bootstrap method outlined here gives a simple, direct and transparent method of assessing the confidence for hypotheses about various features of regression models. For example, the confidence in a hypothesis of an inverted U shape based on the data used was 67%. The method is implemented by a spreadsheet available on the web, and can easily be adapted to analyze other hypotheses and models. There is no satisfactory way to analyze the support for hypotheses like this using null hypothesis testing. The bootstrap method answers a question which cannot easily be answered by other means. It is also a transparent method based simply on simulating successive samples from the same source.

References

Glebbeek, A. C., & Bax, E. H. (2004). Is high employee turnover really harmful? An empirical test using company records. Academy of Management Journal, 47(2).
Lunneborg, C. E. (2000). Data analysis by resampling: concepts and applications. Pacific Grove, CA, USA: Duxbury.
Wood, M. (2005). Bootstrapped confidence intervals as an approach to statistical inference. Organizational Research Methods, 8(4).
Wood, M. (2009a). The use of statistical methods in management research: suggestions from a case study. arXiv: v1 [stat.AP].
Wood, M. (2009b). Liberating research from null hypotheses: confidence levels for substantive hypotheses instead of p values.
More informationStudent Performance Q&A:
Student Performance Q&A: 2009 AP Statistics Free-Response Questions The following comments on the 2009 free-response questions for AP Statistics were written by the Chief Reader, Christine Franklin of
More informationEXPERIMENTAL DESIGN Page 1 of 11. relationships between certain events in the environment and the occurrence of particular
EXPERIMENTAL DESIGN Page 1 of 11 I. Introduction to Experimentation 1. The experiment is the primary means by which we are able to establish cause-effect relationships between certain events in the environment
More informationChapter 12. The One- Sample
Chapter 12 The One- Sample z-test Objective We are going to learn to make decisions about a population parameter based on sample information. Lesson 12.1. Testing a Two- Tailed Hypothesis Example 1: Let's
More informationA Brief Introduction to Bayesian Statistics
A Brief Introduction to Statistics David Kaplan Department of Educational Psychology Methods for Social Policy Research and, Washington, DC 2017 1 / 37 The Reverend Thomas Bayes, 1701 1761 2 / 37 Pierre-Simon
More informationCHAPTER 3. Methodology
CHAPTER 3 Methodology The purpose of this chapter is to provide the research methodology which was designed to achieve the objectives of this study. It is important to select appropriate method to ensure
More information3.2 Least- Squares Regression
3.2 Least- Squares Regression Linear (straight- line) relationships between two quantitative variables are pretty common and easy to understand. Correlation measures the direction and strength of these
More informationTitle: A new statistical test for trends: establishing the properties of a test for repeated binomial observations on a set of items
Title: A new statistical test for trends: establishing the properties of a test for repeated binomial observations on a set of items Introduction Many studies of therapies with single subjects involve
More informationCRITERIA FOR USE. A GRAPHICAL EXPLANATION OF BI-VARIATE (2 VARIABLE) REGRESSION ANALYSISSys
Multiple Regression Analysis 1 CRITERIA FOR USE Multiple regression analysis is used to test the effects of n independent (predictor) variables on a single dependent (criterion) variable. Regression tests
More informationLAB ASSIGNMENT 4 INFERENCES FOR NUMERICAL DATA. Comparison of Cancer Survival*
LAB ASSIGNMENT 4 1 INFERENCES FOR NUMERICAL DATA In this lab assignment, you will analyze the data from a study to compare survival times of patients of both genders with different primary cancers. First,
More informationChapter 3 CORRELATION AND REGRESSION
CORRELATION AND REGRESSION TOPIC SLIDE Linear Regression Defined 2 Regression Equation 3 The Slope or b 4 The Y-Intercept or a 5 What Value of the Y-Variable Should be Predicted When r = 0? 7 The Regression
More informationAsignificant amount of information systems (IS) research involves hypothesizing and testing for interaction
Information Systems Research Vol. 18, No. 2, June 2007, pp. 211 227 issn 1047-7047 eissn 1526-5536 07 1802 0211 informs doi 10.1287/isre.1070.0123 2007 INFORMS Research Note Statistical Power in Analyzing
More informationCHILD HEALTH AND DEVELOPMENT STUDY
CHILD HEALTH AND DEVELOPMENT STUDY 9. Diagnostics In this section various diagnostic tools will be used to evaluate the adequacy of the regression model with the five independent variables developed in
More informationChapter 7: Descriptive Statistics
Chapter Overview Chapter 7 provides an introduction to basic strategies for describing groups statistically. Statistical concepts around normal distributions are discussed. The statistical procedures of
More information11/24/2017. Do not imply a cause-and-effect relationship
Correlational research is used to describe the relationship between two or more naturally occurring variables. Is age related to political conservativism? Are highly extraverted people less afraid of rejection
More informationYou must answer question 1.
Research Methods and Statistics Specialty Area Exam October 28, 2015 Part I: Statistics Committee: Richard Williams (Chair), Elizabeth McClintock, Sarah Mustillo You must answer question 1. 1. Suppose
More informationSample Math 71B Final Exam #1. Answer Key
Sample Math 71B Final Exam #1 Answer Key 1. (2 points) Graph the equation. Be sure to plot the points on the graph at. 2. Solve for. 3. Given that, find and simplify. 4. Suppose and a. (1 point) Find.
More informationContext of Best Subset Regression
Estimation of the Squared Cross-Validity Coefficient in the Context of Best Subset Regression Eugene Kennedy South Carolina Department of Education A monte carlo study was conducted to examine the performance
More informationRegression CHAPTER SIXTEEN NOTE TO INSTRUCTORS OUTLINE OF RESOURCES
CHAPTER SIXTEEN Regression NOTE TO INSTRUCTORS This chapter includes a number of complex concepts that may seem intimidating to students. Encourage students to focus on the big picture through some of
More informationReliability, validity, and all that jazz
Reliability, validity, and all that jazz Dylan Wiliam King s College London Introduction No measuring instrument is perfect. The most obvious problems relate to reliability. If we use a thermometer to
More informationChapter 8: Estimating with Confidence
Chapter 8: Estimating with Confidence Key Vocabulary: point estimator point estimate confidence interval margin of error interval confidence level random normal independent four step process level C confidence
More informationChapter 11: Advanced Remedial Measures. Weighted Least Squares (WLS)
Chapter : Advanced Remedial Measures Weighted Least Squares (WLS) When the error variance appears nonconstant, a transformation (of Y and/or X) is a quick remedy. But it may not solve the problem, or it
More informationNeuropsychology, in press. (Neuropsychology journal home page) American Psychological Association
Abnormality of test scores 1 Running head: Abnormality of Differences Neuropsychology, in press (Neuropsychology journal home page) American Psychological Association This article may not exactly replicate
More informationChapter 2 Norms and Basic Statistics for Testing MULTIPLE CHOICE
Chapter 2 Norms and Basic Statistics for Testing MULTIPLE CHOICE 1. When you assert that it is improbable that the mean intelligence test score of a particular group is 100, you are using. a. descriptive
More informationCorrelational analysis: Pearson s r CHAPTER OVERVIEW
168 Correlational analysis: 6 Pearson s r CHAPTER OVERVIEW In the first five chapters we have given you the basic building blocks that you will need to understand the statistical analyses presented in
More informationMTH 225: Introductory Statistics
Marshall University College of Science Mathematics Department MTH 225: Introductory Statistics Course catalog description Basic probability, descriptive statistics, fundamental statistical inference procedures
More informationAssessing Agreement Between Methods Of Clinical Measurement
University of York Department of Health Sciences Measuring Health and Disease Assessing Agreement Between Methods Of Clinical Measurement Based on Bland JM, Altman DG. (1986). Statistical methods for assessing
More informationCalories per oz. Price per oz Corn Wheat
Donald Wittman Lecture 1 Diet Problem Consider the following problem: Corn costs.6 cents an ounce and wheat costs 1 cent an ounce. Each ounce of corn has 10 units of vitamin A, 5 calories and 2 units of
More informationChapter 4. Navigating. Analysis. Data. through. Exploring Bivariate Data. Navigations Series. Grades 6 8. Important Mathematical Ideas.
Navigations Series Navigating through Analysis Data Grades 6 8 Chapter 4 Exploring Bivariate Data Important Mathematical Ideas Copyright 2009 by the National Council of Teachers of Mathematics, Inc. www.nctm.org.
More informationBayesian Tailored Testing and the Influence
Bayesian Tailored Testing and the Influence of Item Bank Characteristics Carl J. Jensema Gallaudet College Owen s (1969) Bayesian tailored testing method is introduced along with a brief review of its
More informationTheory Building and Hypothesis Testing. POLI 205 Doing Research in Politics. Theory. Building. Hypotheses. Testing. Fall 2015
and and Fall 2015 and The Road to Scientific Knowledge and Make your Theories Causal Think in terms of causality X causes Y Basis of causality Rules of the Road Time Ordering: The cause precedes the effect
More informationFOR TEACHERS ONLY. The University of the State of New York REGENTS HIGH SCHOOL EXAMINATION MATHEMATICS B
FOR TEACHERS ONLY The University of the State of New Yk REGENTS HIGH SCHOOL EXAMINATION MATHEMATICS B Thursday, June 15, 2006 1:15 to 4:15 p.m., only SCORING KEY Mechanics of Rating The following procedures
More informationSurvival Skills for Researchers. Study Design
Survival Skills for Researchers Study Design Typical Process in Research Design study Collect information Generate hypotheses Analyze & interpret findings Develop tentative new theories Purpose What is
More information